From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Nikhil Sontakke <nikhils(at)2ndquadrant(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, David Steele <david(at)pgmasters(dot)net>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>, Sokolov Yura <y(dot)sokolov(at)postgrespro(dot)ru>, Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] logical decoding of two-phase transactions |
Date: | 2018-07-20 14:58:36 |
Message-ID: | 20180720145836.gxwhbftuoyx5h4gc@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2018-07-20 12:13:19 +0530, Nikhil Sontakke wrote:
> Hi Andres,
>
>
> > So what if we, at the begin / end of cache miss handling, re-check if
> > the to-be-decoded transaction is still in-progress (or has
> > committed). And we throw an error if that happened. That error is then
> > caught in reorderbuffer, the in-progress-xact aborted callback is
> > called, and processing continues (there's a couple nontrivial details
> > here, but it should be doable).
> >
> > The biggest issue is what constitutes a "cache miss". It's fairly
> > trivial to do this for syscache / relcache, but that's not sufficient:
> > there's plenty cases where catalogs are accessed without going through
> > either. But as far as I can tell if we declared that all historic
> > accesses have to go through systable_beginscan* - which'd imo not be a
> > crazy restriction - we could put the checks at that layer.
> >
>
> Documenting that historic accesses go through systable_* APIs does
> seem reasonable. In our earlier discussions, we felt asking plugin
> writers to do anything along these lines was too onerous and
> cumbersome to expect.
But they don't really need to do that - in just about all cases access
"automatically" goes through systable_* or layers above. If you call
output functions, do syscache lookups, etc you're good.
> > That'd require that an index lookup can't crash if the corresponding
> > heap entry doesn't exist (etc), but that's something we need to handle
> > anyway. The issue that multiple separate catalog lookups need to be
> > coherent (say Robert's pg_class exists, but pg_attribute doesn't
> > example) is solved by virtue of the the pg_attribute lookups failing if
> > the transaction aborted.
> >
> > Am I missing something here?
> >
>
> Are you suggesting we have a:
>
> PG_TRY()
> {
> Catalog_Access();
> }
> PG_CATCH()
> {
> Abort_Handling();
> }
>
> here?
Not quite, no. Basically, in a simplified manner, the logical decoding
loop is like:
while (true)
record = readRecord()
logical = decodeRecord()
PG_TRY():
StartTransactionCommand();
switch (TypeOf(logical))
case INSERT:
insert_callback(logical);
break;
...
CommitTransactionCommand();
PG_CATCH():
AbortCurrentTransaction();
PG_RE_THROW();
what I'm proposing is that that various catalog access functions throw a
new class of error, something like "decoding aborted transactions". The
PG_CATCH() above would then not unconditionally re-throw, but set a flag
and continue iff that class of error was detected.
while (true)
if (in_progress_xact_abort_pending)
StartTransactionCommand();
in_progress_xact_abort_callback(made_up_record);
in_progress_xact_abort_pending = false;
CommitTransactionCommand();
record = readRecord()
logical = decodeRecord()
PG_TRY():
StartTransactionCommand();
switch (TypeOf(logical))
case INSERT:
insert_callback(logical);
break;
...
CommitTransactionCommand();
PG_CATCH():
AbortCurrentTransaction();
if (errclass == DECODING_ABORTED_XACT)
in_progress_xact_abort_pending = true;
continue;
else
PG_RE_THROW();
Now obviously that's just pseudo code with lotsa things missing, but I
think the basic idea should come through?
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2018-07-20 15:03:23 | Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket |
Previous Message | David Rowley | 2018-07-20 14:57:16 | Re: Making "COPY partitioned_table FROM" faster |