From: | Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Logical Decoding and HeapTupleSatisfiesVacuum assumptions |
Date: | 2018-01-22 15:40:14 |
Message-ID: | 9ec8c948-dff3-b045-f950-000d39006184@2ndquadrant.com |
Lists: | pgsql-hackers |
Hi,
On 20/01/18 00:52, Robert Haas wrote:
> On Fri, Jan 19, 2018 at 5:19 PM, Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> Regarding the HOT issue - I have to admit I don't quite see why A2
>> wouldn't be reachable through the index, but that's likely due to my
>> limited knowledge of the HOT internals.
>
> The index entries only point to the root tuple in the HOT chain. Any
> subsequent entries can only be reached by following the CTID pointers
> (that's why they are called "Heap Only Tuples"). After T1 aborts,
> we're still OK because the CTID link isn't immediately cleared. But
> after T2 updates the tuple, it makes A1's CTID link point to A3,
> leaving no remaining link to A2.
>
> Although in most respects PostgreSQL treats commits and aborts
> surprisingly symmetrically, CTID links are an exception. When T2
> comes to A1, it sees that A1's xmax is T1 and checks the status of T1.
> If T1 is still in progress, it waits. If T1 has committed, it must
> either abort with a serialization error or update A2 instead under
> EvalPlanQual semantics, depending on the isolation level. If T1 has
> aborted, it assumes that the CTID field of A1 is garbage nobody cares
> about, adds A3 to the page, and makes A1 point to A3 instead of A2.
> No record of the A1->A2 link is kept anywhere *precisely because* A2
> can no longer be visible to anyone.
>
I think this is the only real problem from your list for logical
decoding catalog snapshots. But it's indeed quite a bad one. Is there
something preventing us from removing the assumption that the CTID of A1
is garbage nobody cares about? I guess turning off HOT for catalogs is
not an option :)
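
To make the failure mode concrete, here is a toy Python model of the sequence Robert describes. It is only a sketch and not PostgreSQL code: the `HeapTuple` and `Page` classes, the `hot_update` helper, and the transaction `T0` that creates A1 are all made up for illustration, and visibility checks are left out entirely. It shows only that the index reaches the chain root and that a second update of A1 overwrites the single CTID link.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HeapTuple:
    name: str
    xmin: str                    # transaction that created this version
    xmax: Optional[str] = None   # transaction that updated it, if any
    ctid: Optional[str] = None   # link to the next version in the chain


class Page:
    """Hypothetical single-page heap with one HOT chain (illustration only)."""

    def __init__(self) -> None:
        self.tuples: Dict[str, HeapTuple] = {}
        self.index_root: Optional[str] = None  # index points at the root only

    def insert(self, name: str, xid: str) -> None:
        self.tuples[name] = HeapTuple(name, xmin=xid)
        if self.index_root is None:
            self.index_root = name

    def hot_update(self, old: str, new: str, xid: str) -> None:
        # The old tuple has exactly one xmax and one ctid slot, so a later
        # update overwrites whatever an aborted updater left there.
        self.tuples[new] = HeapTuple(new, xmin=xid)
        self.tuples[old].xmax = xid
        self.tuples[old].ctid = new

    def reachable_from_index(self) -> List[str]:
        # Follow the CTID links starting from the root the index points to.
        chain: List[str] = []
        cur = self.index_root
        while cur is not None:
            chain.append(cur)
            cur = self.tuples[cur].ctid
        return chain


page = Page()
page.insert("A1", "T0")            # T0 is an assumed creator of A1
page.hot_update("A1", "A2", "T1")  # T1 updates A1, producing A2
before = page.reachable_from_index()

# T1 aborts here; nothing on the page changes, the A1->A2 link remains.
page.hot_update("A1", "A3", "T2")  # T2 sees T1 aborted and reuses A1's link
after = page.reachable_from_index()

print(before)  # chain while the A1->A2 link still exists
print(after)   # chain after T2 redirected A1 to A3
```

In this model A2 is still physically on the page after T2's update, but no walk that starts at the index entry can find it, which is exactly what would hurt a decoder still reading with T1's catalog snapshot.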

The general problem is that we have a couple of assumptions
(HeapTupleSatisfiesVacuum being one, what you wrote is another) about
tuples from aborted transactions not being read by anybody. But if we
want to add decoding of 2PC or transaction streaming, that's no longer
true, so I think we should try to remove that assumption (even if we do
it only for catalogs, since that's what we care about).

The other option would be to make sure 2PC decoding/tx streaming does
not read aborted transactions, but that would mean locking the
transaction every time we give control to the output plugin. Given that
the output plugin may do a network write, this would really mean locking
the transaction for an unbounded period of time. That does not strike me
as something we want to do; decoding should not interact with frontend
transaction management, definitely not this badly.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services