Performance of UPDATE SET = FROM vs UPDATE SET = (SELECT ...)

From: pbj(at)cmicdo(dot)com
To: pgsql-general(at)postgresql(dot)org
Subject: Performance of UPDATE SET = FROM vs UPDATE SET = (SELECT ...)
Date: 2014-11-03 16:33:58
Message-ID: 1415032438.18065.YahooMailBasic@web161703.mail.bf1.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Why does the UPDATE SET = FROM choose a more poorly performing plan than
the UPDATE SET = (SELECT ...)? It seems to me that it is the same join.

I'm using 9.3.5.

CREATE TABLE orig
(
key1 VARCHAR(11) PRIMARY KEY,
time1 TIME
);

INSERT INTO orig (key1, time1)
SELECT
a::TEXT,
(((random()*100)::INT % 24)::TEXT || ':' ||
((random()*100)::INT % 60)::TEXT)::TIME
FROM generate_series(80000000000, 80002000000) a;

CREATE INDEX odx ON orig(key1);

CREATE TABLE second (LIKE orig);

INSERT INTO second (key1)
SELECT (80000000000+(((random()*1000000)::INT) % 1000000))::TEXT
FROM generate_series(1,400000);

EXPLAIN ANALYZE
UPDATE second SET time1 = orig.time1
FROM orig
WHERE second.key1 = orig.key1;

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Update on second (cost=69461.02..106082.02 rows=400000 width=32) (actual time=16033.023..16033.023 rows=0 loops=1)
-> Hash Join (cost=69461.02..106082.02 rows=400000 width=32) (actual time=7698.445..12992.039 rows=400000 loops=1)
Hash Cond: ((second.key1)::text = (orig.key1)::text)
-> Seq Scan on second (cost=0.00..12627.00 rows=400000 width=18) (actual time=49.820..791.397 rows=400000 loops=1)
-> Hash (cost=31765.01..31765.01 rows=2000001 width=26) (actual time=7648.540..7648.540 rows=2000001 loops=1)
Buckets: 4096 Batches: 128 Memory Usage: 717kB
-> Seq Scan on orig (cost=0.00..31765.01 rows=2000001 width=26) (actual time=0.014..3655.844 rows=2000001 loops=1)
Total runtime: 16033.193 ms
(8 rows)

UPDATE second SET time1 = NULL;

EXPLAIN ANALYZE
UPDATE second SET time1 = (SELECT orig.time1 FROM orig,second
WHERE orig.key1 = second.key1 LIMIT 1);

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Update on second (cost=3.60..19078.19 rows=1279959 width=18) (actual time=4642.453..4642.453 rows=0 loops=1)
InitPlan 1 (returns $1)
-> Limit (cost=0.43..3.60 rows=1 width=8) (actual time=2.611..2.613 rows=1 loops=1)
-> Nested Loop (cost=0.43..4056331.83 rows=1279959 width=8) (actual time=2.606..2.606 rows=1 loops=1)
-> Seq Scan on second second_1 (cost=0.00..19074.59 rows=1279959 width=12) (actual time=2.487..2.487 rows=1 loops=1)
-> Index Scan using odx on orig (cost=0.43..3.14 rows=1 width=20) (actual time=0.098..0.098 rows=1 loops=1)
Index Cond: ((key1)::text = (second_1.key1)::text)
-> Seq Scan on second (cost=0.00..19074.59 rows=1279959 width=18) (actual time=6.420..817.739 rows=400000 loops=1)
Total runtime: 4642.561 ms
(9 rows)

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Sven Wegener 2014-11-03 17:15:04 COPY TO returning empty result with parallel ALTER TABLE
Previous Message Adrian Klaver 2014-11-02 14:35:16 Re: How to implent the CONVERT ( data_type [ ( length ) ] , expression ) function in PostgreSQL