Use ctid in where clause in update from statement

From: Dirk Mika <Dirk(dot)Mika(at)mikatiming(dot)de>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Use ctid in where clause in update from statement
Date: 2019-07-01 09:13:19
Message-ID: C8D480EC-AFF2-40CC-8BD3-754A1204011A@mikatiming.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,

I come from the Oracle world and am trying to convert some queries to PostgreSQL syntax. One of these queries is a MERGE statement, which I converted into an UPDATE SET FROM WHERE construct. In the original query I use the pseudo column ROWID to match a source row with a target row.
This is a simplified version of such a query:

MERGE INTO test_large d

USING (SELECT ROWID, test_large.*

FROM test_large

WHERE grp = 1) s

ON (d.ROWID = s.ROWID)

WHEN MATCHED

THEN

UPDATE SET d.grp = s.grp;

It has the following execution plan:

---------------------------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

---------------------------------------------------------------------------------------------

| 0 | MERGE STATEMENT | | 1000 | 16000 | 2006 (0)| 00:00:25 |

| 1 | MERGE | TEST_LARGE | | | | |

| 2 | VIEW | | | | | |

| 3 | NESTED LOOPS | | 1000 | 220K| 2006 (0)| 00:00:25 |

| 4 | TABLE ACCESS BY INDEX ROWID| TEST_LARGE | 1000 | 110K| 1006 (0)| 00:00:13 |

|* 5 | INDEX RANGE SCAN | IX_TL_GRP | 1000 | | 6 (0)| 00:00:01 |

| 6 | TABLE ACCESS BY USER ROWID | TEST_LARGE | 1 | 113 | 1 (0)| 00:00:01 |

---------------------------------------------------------------------------------------------

In the PostgreSQL version I use the column ctid for this. The above query becomes:

UPDATE test_large d
SET grp = s.grp
FROM (SELECT ctid, test_large.*
FROM test_large
WHERE grp = 1) s
WHERE d.ctid = s.ctid;

I noticed that the execution plan for such a statement seems to be suboptimal:

Update on test_large d (cost=155346.98..160367.08 rows=996 width=53)
-> Merge Join (cost=155346.98..160367.08 rows=996 width=53)
Merge Cond: (d.ctid = test_large.ctid)
-> Sort (cost=154330.57..156833.16 rows=1001033 width=43)
Sort Key: d.ctid
-> Seq Scan on test_large d (cost=0.00..19366.33 rows=1001033 width=43)
-> Sort (cost=1016.40..1018.89 rows=996 width=10)
Sort Key: test_large.ctid
-> Index Scan using ix_tl_grp on test_large (cost=0.42..966.80 rows=996 width=10)
Index Cond: (grp = 1)

I expected a tid scan to be used instead of a Seq scan. I did a VACUUM test_large to make sure statistics are valid.
I would suspect that a tid scan is even a bit faster than accessing via the primary key, since reading the index is not required.

I know I could change the WHERE clause to use the primary key, which results in the following plan:

Update on test_large d (cost=0.85..3344.12 rows=995 width=57)
-> Nested Loop (cost=0.85..3344.12 rows=995 width=57)
-> Index Scan using ix_tl_grp on test_large (cost=0.42..994.84 rows=995 width=14)
Index Cond: (grp = 1)
-> Index Scan using pk_test_large on test_large d (cost=0.42..2.36 rows=1 width=47)
Index Cond: (id = test_large.id)

So my question is more "Is a tid scan not possible / useful for an update" than "how do I accelerate this query".

BR
Dirk

--
Dirk Mika
Software Developer

[cid:image001_d0a5cf51-5dd3-4ff5-b428-1016a6d95a0f.png]

mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk(dot)mika(at)mikatiming(dot)de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika

[cid:CCh2U_f27a5ce6-8556-4db1-8aa4-fd610d6e9efe.jpg]<https://youtu.be/qfOFXrpSKLQ>

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Achilleas Mantzios 2019-07-01 09:27:16 Re: Use ctid in where clause in update from statement
Previous Message Achilleas Mantzios 2019-07-01 07:12:51 Re: iterate over partitions