From: | Luca Tettamanti <kronos(dot)it(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | DELETE performance problem |
Date: | 2009-11-24 13:37:08 |
Message-ID: | 20091124133708.GA6235@nb-core2.darkstar.lan |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hello,
I've run in a severe performance problem with the following statement:
DELETE FROM t1 WHERE t1.annotation_id IN (
SELECT t2.annotation_id FROM t2)
t1 contains about 48M record (table size is 5.8GB), while t2 contains about 60M
record (total size 8.6GB). annotation_id is the PK in t1 but not in t2 (it's
not even unique, in fact there are duplicates - there are about 20M distinct
annotation_id in this table). There are no FKs on either tables.
I've killed the query after 14h(!) of runtime...
I've reproduced the problem using a only the ids (extracted from the full
tables) with the following schemas:
test2=# \d t1
Table "public.t1"
Column | Type | Modifiers
---------------+--------+-----------
annotation_id | bigint | not null
Indexes:
"t1_pkey" PRIMARY KEY, btree (annotation_id)
test2=# \d t2
Table "public.t2"
Column | Type | Modifiers
---------------+--------+-----------
annotation_id | bigint |
Indexes:
"t2_idx" btree (annotation_id)
The query above takes about 30 minutes to complete. The slowdown is not as
severe, but (IMHO) the behaviour is strange. On a win2k8 with 8.3.8 using
procexp I see the process churning the disk and using more memory until it hits
some limit (at about 1.8GB) then the IO slows down considerably. See this
screenshot[1].
This is exactly what happens with the full dataset.
This is the output of the explain:
test2=> explain analyze delete from t1 where annotation_id in (select annotation
_id from t2);
QUERY PLAN
--------------------------------------------------------------------------------
---------------------------------------------------------
Hash Join (cost=1035767.26..2158065.55 rows=181605 width=6) (actual time=64339
5.565..1832056.588 rows=26185953 loops=1)
Hash Cond: (t1.annotation_id = t2.annotation_id)
-> Seq Scan on t1 (cost=0.00..661734.12 rows=45874812 width=14) (actual tim
e=0.291..179119.487 rows=45874812 loops=1)
-> Hash (cost=1033497.20..1033497.20 rows=181605 width=8) (actual time=6433
93.742..643393.742 rows=26185953 loops=1)
-> HashAggregate (cost=1031681.15..1033497.20 rows=181605 width=8) (a
ctual time=571807.575..610178.552 rows=26185953 loops=1)
-> Seq Scan on t2 (cost=0.00..879289.12 rows=60956812 width=8)
(actual time=2460.595..480446.581 rows=60956812 loops=1)
Total runtime: 2271122.474 ms
(7 rows)
Time: 2274723,284 ms
An identital linux machine (with 8.4.1) shows the same issue; with strace I see
a lots of seeks:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
90.37 0.155484 15 10601 read
9.10 0.015649 5216 3 fadvise64
0.39 0.000668 0 5499 write
0.15 0.000253 0 10733 lseek
0.00 0.000000 0 3 open
0.00 0.000000 0 3 close
0.00 0.000000 0 3 semop
------ ----------- ----------- --------- --------- ----------------
100.00 0.172054 26845 total
(30s sample)
Before hitting the memory "limit" (AS on win2k8, unsure about Linux) the trace
is the following:
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 0.063862 0 321597 read
0.00 0.000000 0 3 lseek
0.00 0.000000 0 76 mmap
------ ----------- ----------- --------- --------- ----------------
100.00 0.063862 321676 total
The machines have 8 cores (2 Xeon E5320), 8GB of RAM. Postgres data directory
is on hardware (Dell PERC5) raid mirror, with the log on a separate array.
One machine is running linux 64bit (Debian/stable), the other win2k8 (32 bit).
shared_buffers = 512MB
work_mem = 512MB
maintenance_work_mem = 1GB
checkpoint_segments = 16
wal_buffers = 8MB
fsync = off # Just in case... usually it's enabled
effective_cache_size = 4096MB
(the machine with win2k8 is running with a smaller shared_buffers - 16MB)
Any idea on what's going wrong here?
thanks,
Luca
[1] http://img10.imageshack.us/i/psql2.png/
From | Date | Subject | |
---|---|---|---|
Next Message | Jerry Champlin | 2009-11-24 14:59:10 | Re: DELETE performance problem |
Previous Message | Pavel Stehule | 2009-11-24 12:42:33 | Re: Dynamic sql example |