SQL question

KRFinch at dstsystems.com KRFinch at dstsystems.com
Fri Aug 8 22:04:02 CDT 2003


Getting the dusty old DBA hat out of mothballs:

1) It looks like you are having the thing go through the entire first table
comparing every record with all of the records in the second table.  So,
for every one of those 180 million records, you are comparing it against
every one of those hundred or so other values, meaning that the server is
having to do about 18 billion operations to do what you asked it to.  I
have yet to see a DB server that could hit anything close to a billion
queries an hour, even straight out of RAM, so this will obviously take more
than 18 hours.  I've seen less complex queries run for more than 18 hours
on tables with only around 3 million records, so I would guess that this
query you have here should take at least the better part of a week to run.

I think you would be better served running a single query first with only
one of the elements you want to delete, and see how long that takes to
rifle through your 180 million records.  That should give you a reasonable
gauge of how long things will take once you multiply that time by 100.
Once you run that single query, you should also take a look at the
performance characteristics of your database based on that run-though and
see what processes need more RAM to handle the query more efficiently.

2) That's engine-specific, although you could probably write your query
that way if you really wanted in any database engine.  You could probably
have it commit every so often through the use of conditional statements
triggered by the contents of the cells you are comparing.  I think it would
make for a long, ugly, complex, slow  query if you had to do it this way.

Thanks!

Kevin Finch
Network Administrator
DST Systems, Inc.
816/435-6039
krfinch at dstsystems.com

                                                                           
             "Brian Densmore"                                              
             <DensmoreB at ctbson                                             
             line.com>                                                  To 
             Sent by:                  "KCLUG (E-mail)" <kclug at kclug.org>  
             owner-kclug at marau                                          cc 
             der.illiana.net                                               
                                                                   Subject 
                                       SQL question                        
             08/08/2003 04:33                                              
             PM                                                            
                                                                           
                                                                           
                                                                           
                                                                           

Well although this is not Linux related it could easily be.
I have a somewhat large database (approx 180 million records)
that I need to delete certain "sets" from based on a foreign
key in a smaller table (approx 4 million records) and the
records I want to delete match a field in a third table
(very small < 100 records). I ran this delete query and it ran
for 18 hours and still didn't finish. We had to kill it
(because the server was swapping back and forth and came
to a standstill because we only have a 1/2 gig of RAM).
Anyway, I, have two questions:
1) why should it take 18 hours to run (the delete query should
have deleted about 1/3 of the data in the database),
2) is there anyway to have a single delete query statement
do periodic commits (as really dangerous as it is)?

basically what I have is this:

table bad_marks (uid,mark)
table messages  (indice,mark,...)
table msg_elements (id,indice,...)

where bad_marks.mark 1->M/1->0 messages.mark, uid = PK
      messages.indice 1->M msg_element.indice, message.indice = PK,
message.mark = indexed
      msg_element id = PK, indice = indexed
and there is no referential integrity turned on (yet).

Thanks,
Brian

"Three OS's from corporate-kings in their towers of glass,
Seven from valley-lords where orchards used to grow,
Nine from dotcoms doomed to die,
one from the dark lord Gates on his dark throne
In the Land of Redmond where the Shadows lie.
one OS to rule them all, one OS to find them,
one OS to bring them all and in the darkness bind them,
In the Land of Redmond where the Shadows lie."    john thrum




More information about the Kclug mailing list