Multiple small deletes


I have a script that loops over ~4 million person records and, for each one, executes multiple updates (~100) and a single delete statement (the updates and the delete all hit different tables). The problem I am facing is that the one delete statement accounts for about half the run time by itself. I understand that a delete has to maintain the indexes, but this still seems excessive. I am currently testing this script with one thread using dbms_parallel_execute, but I plan to multithread it.
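For reference, a minimal sketch of the dbms_parallel_execute setup I'm using; the task name, table name, and purge procedure are placeholders for my actual objects:

```sql
BEGIN
  -- Create a task and split the driving table into rowid-range chunks.
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => 'purge_task');
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => 'purge_task',
    table_owner => USER,
    table_name  => 'PEOPLE',
    by_row      => TRUE,
    chunk_size  => 10000);

  -- Run the per-chunk procedure; :start_id/:end_id are bound to each
  -- chunk's rowid range by the package.
  DBMS_PARALLEL_EXECUTE.RUN_TASK(
    task_name      => 'purge_task',
    sql_stmt       => 'BEGIN my_purge_proc(:start_id, :end_id); END;',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 1);  -- currently 1 thread; raise to multithread
END;
/
```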

I am executing a query similar to the following:

DELETE FROM table1 t1
WHERE (t1.key1, t1.key2) IN (SELECT t2.key1, t2.key2
                               FROM table2 t2
                              WHERE t2.parm1 = 1234
                                AND t2.parm2 = 5678);

Relevant facts:

  • Table2 (~30 million records) is ~10 times larger than table1 (~3 million records).
  • There is a primary key on table1(key1, key2).
  • There is a primary key on table2(key1, key2).
  • There is an index on table2(parm1, parm2).
  • I have disabled the foreign key constraint on table1(key1, key2) that references table2(key1, key2).
  • There are no other constraints on table1, but many more constraints on table2.
  • All triggers on table1 have been disabled.
  • The explain plan for this query shows a cost lower than that of many of my update statements (though I know the cost figure on its own doesn't account for much).
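For completeness, this is roughly how I disabled the constraint and triggers beforehand; the constraint and trigger names below are placeholders for my actual object names:

```sql
-- Disable the FK from table1 to table2 for the duration of the purge.
ALTER TABLE table1 DISABLE CONSTRAINT fk_table1_table2;

-- Disable each trigger on table1 (repeat per trigger).
ALTER TRIGGER trg_table1_biur DISABLE;
```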

Explain plan output:

DELETE STATEMENT                    6
NESTED LOOPS                        6
NESTED LOOPS                        6
INDEX               RANGE SCAN      3
INDEX               UNIQUE SCAN     1

I was wondering if there is any way to make this delete go faster. I tried a bulk delete, but it didn't seem to improve the run time. If there were a way to execute all the deletes first and maintain the index afterwards, I suspect it would run faster. A create-table-as-select rewrite is out of the picture, since I am looping over records (and evaluating multiple conditions) from another table to drive the delete.
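The bulk-delete variant I tried looked roughly like this (a sketch, assuming the key pairs are first collected into PL/SQL collections):

```sql
DECLARE
  TYPE t_key1 IS TABLE OF table1.key1%TYPE;
  TYPE t_key2 IS TABLE OF table1.key2%TYPE;
  l_key1 t_key1;
  l_key2 t_key2;
BEGIN
  -- Collect the matching key pairs from table2 in one pass.
  SELECT t2.key1, t2.key2
    BULK COLLECT INTO l_key1, l_key2
    FROM table2 t2
   WHERE t2.parm1 = 1234
     AND t2.parm2 = 5678;

  -- Delete all matches in a single bulk-bound statement.
  FORALL i IN 1 .. l_key1.COUNT
    DELETE FROM table1
     WHERE key1 = l_key1(i)
       AND key2 = l_key2(i);

  COMMIT;
END;
/
```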

Best Answer

I'd consider:

  1. If the table is partitioned: mark the index partitions covering the data that will be changed UNUSABLE, and rebuild them (probably REBUILD ... ONLINE) after the operation.
  2. Increase the memory at the session's disposal so the work areas fit entirely in memory and no I/O against the temporary tablespace is needed: set workarea_size_policy to MANUAL and raise sort_area_size and hash_area_size to the largest values you can spare for this operation.
  3. Run the entire operation with a PARALLEL hint (or PARALLEL_INDEX, if you decide to keep using some indexes after all).
  4. Switch all the accessed objects to NOLOGGING mode so that redo generation is minimized during the operation (and switch back to LOGGING right after it finishes, of course). Note that conventional DML still generates redo; NOLOGGING mainly benefits direct-path operations.
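A rough sketch of what steps 1–4 might look like in one session; the sizes, parallel degree, index and partition names are all assumptions to adapt to your environment:

```sql
-- Step 1 (only if table1 is partitioned, and only for non-unique
-- local indexes; names are placeholders):
ALTER INDEX table1_ix MODIFY PARTITION p_current UNUSABLE;

-- Step 2: manual workarea sizing for this session.
ALTER SESSION SET workarea_size_policy = MANUAL;
ALTER SESSION SET sort_area_size = 1073741824;  -- 1 GB, if you can spare it
ALTER SESSION SET hash_area_size = 1073741824;

-- Step 4: minimize redo on the target table.
ALTER TABLE table1 NOLOGGING;

-- Step 3: run the delete in parallel.
ALTER SESSION ENABLE PARALLEL DML;

DELETE /*+ PARALLEL(t1 8) */ FROM table1 t1
 WHERE (t1.key1, t1.key2) IN (SELECT t2.key1, t2.key2
                                FROM table2 t2
                               WHERE t2.parm1 = 1234
                                 AND t2.parm2 = 5678);
COMMIT;

-- Clean up: restore logging and rebuild the disabled index partition.
ALTER TABLE table1 LOGGING;
ALTER INDEX table1_ix REBUILD PARTITION p_current ONLINE;
```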