From: | Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> |
---|---|
To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)2ndquadrant(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] MERGE SQL Statement for PG11 |
Date: | 2018-03-12 12:13:43 |
Message-ID: | CABOikdOTh6GqZ6QFpXwSqaJoCTtJY5iVbJuDunTN4ZvJg1mY2A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Mar 11, 2018 at 11:18 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> It sounds like we should try to thoroughly understand why these
> duplicates arose. Did you actually call EvalPlanQualSetPlan() for all
> subplans at the time?
>
>
The reason for duplicates or even wrong answers is quite simple. The way
UPDATE/DELETE currently works for partition table is that we expand the
inheritance tree for the parent result relation and then create a subplan
for each partition separately. This works fine, even when there exists a
FROM/USING clause in the UPDATE/DELETE statement because the final result
does not change irrespective of whether you first do a UNION ALL between
all partitions and then find the candidate rows or whether you find
candidate rows from individual partitions separately.
In case of MERGE though, since we are performing a RIGHT OUTER JOIN between
the result relation and the source relation, we may conclude that a
matching target row does not exist for a source row, whereas it actually
exists but in some other partition. For example,
CREATE TABLE target (key int, val text) PARTITION BY LIST ( key);
CREATE TABLE part1 PARTITION OF target FOR VALUES IN (1, 2, 3);
CREATE TABLE part2 PARTITION OF target FOR VALUES IN (4, 5, 6);
CREATE TABLE source (skey integer);
INSERT INTO source VALUES (1), (4), (7);
INSERT INTO part1 VALUES (1, 'v1'), (2, 'v2'), (3, 'v3');
INSERT INTO part2 VALUES (4, 'v4'), (5, 'v5'), (6, 'v6');
postgres=# SELECT * FROM target RIGHT OUTER JOIN source ON key = skey;
key | val | skey
-----+-----+------
1 | v1 | 1
4 | v4 | 4
| | 7
(3 rows)
This gives the right answer. But if we join individual partitions and then
do a UNION ALL,
postgres=# SELECT * FROM part1 RIGHT OUTER JOIN source ON key = skey UNION
ALL SELECT * FROM part2 RIGHT OUTER JOIN source ON key = skey;
key | val | skey
-----+-----+------
1 | v1 | 1
| | 4
| | 7
| | 1
4 | v4 | 4
| | 7
(6 rows)
This is what nodeModifyTable does and hence we end up getting duplicates or
even incorrectly declared NOT MATCHED rows, where as they are matched in a
different partition.
I don't think not calling EvalPlanQualSetPlan() on all subplans is a
problem because we really never execute those subplans. In fact. we should
fix that so that those subplans are not even initialised.
> As you know, there is an ON CONFLICT DO UPDATE + partitioning patch in
> the works from Alvaro. In your explanation about that approach that
> you cited, you wondered what the trouble might have been with ON
> CONFLICT + partitioning, and supposed that the issues were similar
> there. Are they? Has that turned up much?
>
>
Well, I initially thought that ON CONFLICT DO UPDATE on partition table may
have the same challenges, but that's probably not the case. For INSERT ON
CONFLICT it's still just an INSERT path, with some special handling for
UPDATEs. Currently, for partition or inherited table, UPDATEs/DELETEs go
via inheritance_planner() thus expanding inheritance for the result
relation where as INSERTs go via simple grouping_planner().
For MERGE, we do all three DMLs. That doesn't mean we could not
re-implement MERGE on the lines of INSERTs, but that would most likely mean
complete re-writing of the UPDATEs/DELETEs for partition/inheritance
tables. The challenges would just be the same in both cases.
Thanks,
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Ashutosh Bapat | 2018-03-12 12:37:07 | Re: [HACKERS] Partition-wise aggregation/grouping |
Previous Message | Badrul Chowdhury | 2018-03-12 11:55:45 | RE: Intermittent pg_ctl failures on Windows |