Re: Parallel scan with SubTransGetTopmostTransaction assert coredump

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Greg Nancarrow <gregn4422(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Maxim Orlov <m(dot)orlov(at)postgrespro(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, Pengchengliu <pengchengliu(at)tju(dot)edu(dot)cn>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Date: 2021-06-24 11:57:28
Message-ID: CALT9ZEFXVCjXOLonRe0GGV_B14i8txsNP3LDXo2-4QraSryCgQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> Have you found a case that the v2 patch breaks?
>
> The v2 patch does follow the analysis in the beginning of the
> discussion, which identified that in setting up parallel workers, a
> "later transaction snapshot" was taken than the one actually used in
> the statement execution, and that's what eventually leads to the
> observed Assertion failure.
> The original problem reporter stated: "So the root cause is the
> Parallel Workers process set the TransactionXmin with later
> transcation snapshot".
> Also, as far as I can see, it makes no sense to pass parallel workers
> both an active snapshot and a (later) transaction snapshot. In the
> leader, prior to execution and running parallel workers, a transaction
> snapshot is obtained and pushed as the active snapshot (note: only ONE
> snapshot at this point). It makes little sense to me to then obtain
> ANOTHER transaction snapshot when setting up parallel workers, and
> pass that (only) to the workers along with the original (earlier)
> active snapshot. (If there is a reason for passing the TWO snapshots
> to parallel workers, original code authors and/or snapshot experts
> should speak up now, and code comments should be added accordingly to
> explain the purpose and how it is MEANT to work)
> This is why the v2 patch updates the code to just pass one snapshot
> (the active snapshot) to the parallel workers, which gets restored in
> each worker as the transaction snapshot and set as the active snapshot
> (so is then consistent with the snapshot setup in the parallel
> leader).
>

I fully agree that _probably_ two snapshots are not needed for read-only
transactions. But I think we need to be very careful with everything
related to transaction integrity as there are different isolation levels
etc. The only fact I can not break something, seems like not an argument
solid enough.

Robert could you give us some help and insight into why there is a need for
the separate active and transaction snapshots in the parallel
infrastructure patch as per your commit 924bcf4f16d ?

Until then just a workaround like v3 patch seems safer for me (still I am
not a very big specialist in this).

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com <http://www.postgrespro.com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2021-06-24 12:03:27 Re: pgbench logging broken by time logic changes
Previous Message Guillaume Lelarge 2021-06-24 11:57:09 Weird use of parentheses in the manual