From: | pbj(at)cmicdo(dot)com |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Performance of UPDATE SET = FROM vs UPDATE SET = (SELECT ...) |
Date: | 2014-11-03 16:33:58 |
Message-ID: | 1415032438.18065.YahooMailBasic@web161703.mail.bf1.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Why does the UPDATE SET = FROM choose a more poorly performing plan than
the UPDATE SET = (SELECT ...)? It seems to me that it is the same join.
I'm using 9.3.5.
CREATE TABLE orig
(
key1 VARCHAR(11) PRIMARY KEY,
time1 TIME
);
INSERT INTO orig (key1, time1)
SELECT
a::TEXT,
(((random()*100)::INT % 24)::TEXT || ':' ||
((random()*100)::INT % 60)::TEXT)::TIME
FROM generate_series(80000000000, 80002000000) a;
CREATE INDEX odx ON orig(key1);
CREATE TABLE second (LIKE orig);
INSERT INTO second (key1)
SELECT (80000000000+(((random()*1000000)::INT) % 1000000))::TEXT
FROM generate_series(1,400000);
EXPLAIN ANALYZE
UPDATE second SET time1 = orig.time1
FROM orig
WHERE second.key1 = orig.key1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Update on second (cost=69461.02..106082.02 rows=400000 width=32) (actual time=16033.023..16033.023 rows=0 loops=1)
-> Hash Join (cost=69461.02..106082.02 rows=400000 width=32) (actual time=7698.445..12992.039 rows=400000 loops=1)
Hash Cond: ((second.key1)::text = (orig.key1)::text)
-> Seq Scan on second (cost=0.00..12627.00 rows=400000 width=18) (actual time=49.820..791.397 rows=400000 loops=1)
-> Hash (cost=31765.01..31765.01 rows=2000001 width=26) (actual time=7648.540..7648.540 rows=2000001 loops=1)
Buckets: 4096 Batches: 128 Memory Usage: 717kB
-> Seq Scan on orig (cost=0.00..31765.01 rows=2000001 width=26) (actual time=0.014..3655.844 rows=2000001 loops=1)
Total runtime: 16033.193 ms
(8 rows)
UPDATE second SET time1 = NULL;
EXPLAIN ANALYZE
UPDATE second SET time1 = (SELECT orig.time1 FROM orig,second
WHERE orig.key1 = second.key1 LIMIT 1);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Update on second (cost=3.60..19078.19 rows=1279959 width=18) (actual time=4642.453..4642.453 rows=0 loops=1)
InitPlan 1 (returns $1)
-> Limit (cost=0.43..3.60 rows=1 width=8) (actual time=2.611..2.613 rows=1 loops=1)
-> Nested Loop (cost=0.43..4056331.83 rows=1279959 width=8) (actual time=2.606..2.606 rows=1 loops=1)
-> Seq Scan on second second_1 (cost=0.00..19074.59 rows=1279959 width=12) (actual time=2.487..2.487 rows=1 loops=1)
-> Index Scan using odx on orig (cost=0.43..3.14 rows=1 width=20) (actual time=0.098..0.098 rows=1 loops=1)
Index Cond: ((key1)::text = (second_1.key1)::text)
-> Seq Scan on second (cost=0.00..19074.59 rows=1279959 width=18) (actual time=6.420..817.739 rows=400000 loops=1)
Total runtime: 4642.561 ms
(9 rows)
From | Date | Subject | |
---|---|---|---|
Next Message | Sven Wegener | 2014-11-03 17:15:04 | COPY TO returning empty result with parallel ALTER TABLE |
Previous Message | Adrian Klaver | 2014-11-02 14:35:16 | Re: How to implent the CONVERT ( data_type [ ( length ) ] , expression ) function in PostgreSQL |