Re: Writable foreign tables: how to identify rows

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Writable foreign tables: how to identify rows
Date: 2013-03-13 15:15:15
Message-ID: 26552.1363187715@sss.pgh.pa.us
Lists: pgsql-hackers

Andres Freund <andres(at)2ndquadrant(dot)com> writes:
> Perhaps pgsql-fdw should make sure the update was performed *without*
> following the ctid chain to a new valid tuple?

I did think about these issues before committing the patch ;-)

The basic theory in PG's existing design is to postpone locking rows as
long as possible, which means that when we do finally lock a target row,
we have to check if it's changed since we scanned it, and that leads
into the whole EvalPlanQual mess. I considered trying to make FDWs
duplicate that behavior, but gave up on it. In the first place, it's
hard to see how you even define "did the row change" unless you have
something exactly like ctids (including forward update chains). And
in the second place, this would mandate yet another round trip to the
remote server for each row to be updated.
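
To make the "did the row change" question concrete, here is a toy Python model of a row store with forward update chains. It is only an illustration: ToyHeap, TupleVersion, and all of their methods are hypothetical names, and the real heap represents the forward link differently. The point is that a late-locking recheck needs both a stable row address and a way to walk from the scanned version to the current one.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Ctid = Tuple[int, int]  # (block number, offset), written "(0,1)" in SQL


@dataclass
class TupleVersion:
    ctid: Ctid
    data: dict
    # Set when an UPDATE supersedes this version; None means still current.
    # (Toy simplification, not the real on-disk representation.)
    next_ctid: Optional[Ctid] = None


class ToyHeap:
    """Hypothetical in-memory stand-in for a table with update chains."""

    def __init__(self) -> None:
        self.versions: Dict[Ctid, TupleVersion] = {}
        self._next_offset = 1

    def _store(self, data: dict) -> Ctid:
        ctid = (0, self._next_offset)
        self._next_offset += 1
        self.versions[ctid] = TupleVersion(ctid, dict(data))
        return ctid

    def insert(self, data: dict) -> Ctid:
        return self._store(data)

    def update(self, ctid: Ctid, data: dict) -> Ctid:
        # An update writes a new version and links the old one forward.
        new_ctid = self._store(data)
        self.versions[ctid].next_ctid = new_ctid
        return new_ctid

    def changed_since_scan(self, scanned_ctid: Ctid) -> bool:
        # The recheck a late-locking design must run before modifying a row.
        return self.versions[scanned_ctid].next_ctid is not None

    def latest_version(self, scanned_ctid: Ctid) -> TupleVersion:
        # Follow the forward chain to the version that is current now.
        version = self.versions[scanned_ctid]
        while version.next_ctid is not None:
            version = self.versions[version.next_ctid]
        return version
```

In this model a scan that remembered a ctid can later call changed_since_scan() and, if needed, latest_version() to re-evaluate its conditions against the new row. A remote source with no equivalent of the forward link has nothing to walk, and even one that has it would need a separate remote call per row to ask.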

In the patch as committed, the expectation (which is satisfied by
postgres_fdw) is that FDWs should lock rows that are candidates for
update/delete during the initial scan. This avoids an extra round trip
and justifies leaving EvalPlanQual out of the picture altogether.
The cost is that we may lock rows that we end up not updating, because
they fail locally-checked restriction or join conditions. I think on
balance that's a good trade-off.
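
A rough Python sketch of that trade-off follows. It is not the postgres_fdw source: build_remote_scan_sql and count_locked_and_updated are hypothetical helpers, and the exact SQL text a real wrapper sends may differ. It assumes the wrapper takes the lock by appending FOR UPDATE to the remote scan query and fetches ctid as the row identifier.

```python
from typing import Callable, Iterable, List, Sequence, Tuple


def build_remote_scan_sql(
    table: str,
    columns: Sequence[str],
    remote_quals: Sequence[str],
    lock_rows: bool,
) -> str:
    """Build the remote scan query; lock_rows adds FOR UPDATE (assumed approach)."""
    cols = ", ".join(list(columns) + ["ctid"])  # ctid identifies the row later
    sql = f"SELECT {cols} FROM {table}"
    if remote_quals:
        sql += " WHERE " + " AND ".join(f"({q})" for q in remote_quals)
    if lock_rows:
        sql += " FOR UPDATE"
    return sql


def count_locked_and_updated(
    scanned_rows: Iterable[dict],
    local_qual: Callable[[dict], bool],
) -> Tuple[int, int]:
    """Every scanned row is locked remotely; only rows passing the
    locally checked condition are actually updated."""
    locked: List[dict] = list(scanned_rows)
    updated = [row for row in locked if local_qual(row)]
    return len(locked), len(updated)


sql = build_remote_scan_sql("public.t", ["a", "b"], ["a > 10"], lock_rows=True)
locked, updated = count_locked_and_updated(
    [{"a": 11}, {"a": 12}, {"a": 13}],
    lambda row: row["a"] % 2 == 1,  # stands in for a qual that cannot be shipped
)
```

Here three rows come back from the remote scan and are locked, but only two pass the local condition, so one row is locked without being updated. That is the cost described above, paid in exchange for skipping a per-row recheck round trip.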

regards, tom lane
