From: | Greg Nancarrow <gregn4422(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Maxim Orlov <m(dot)orlov(at)postgrespro(dot)ru>, Michael Paquier <michael(at)paquier(dot)xyz>, Pengchengliu <pengchengliu(at)tju(dot)edu(dot)cn>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Parallel scan with SubTransGetTopmostTransaction assert coredump |
Date: | 2021-08-11 12:32:15 |
Message-ID: | CAJcOf-dbeg0CfJgo03SXD1=qzM650DrTRJDKhtp_=zvDPoxxfw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Aug 10, 2021 at 12:35 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> Now there is evidently something wrong with this line of reasoning
> because the code is buggy and my proposed fix doesn't work, but I
> don't know what is wrong with it. You seem to think that it's crazy
> that we try to replicate the active snapshot to the active snapshot
> and the transaction snapshot to the transaction snapshot, but that
> did, and still does, seem really sane to me. The only part that now
> seems clearly wrong to me is that !IsolationUsesXactSnapshot() case
> takes an *extra* snapshot, but since fixing that didn't fix the bug,
> there's evidently more to the problem than that.
>
I traced through snapshot processing during a parallel SELECT, up to
the point of the existing GetTransactionSnapshot() and
GetCurrentSnapshot() calls in InitializeParallelDSM().
I'm seeing the following sequence of calls, numbered below:
PortalStart():
case PORTAL_ONE_SELECT:
(1) PushActiveSnapshot(GetTransactionSnapshot());
...
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
portal->sourceText,
(2) GetActiveSnapshot(),
InvalidSnapshot,
None_Receiver,
params,
portal->queryEnv,
0);
...
(3) PopActiveSnapshot();
PortalRunSelect():
(4) PushActiveSnapshot(queryDesc->snapshot);
ExecutorRun(queryDesc, direction, (uint64) count,
portal->run_once);
InitializeParallelDSM():
(5) Snapshot transaction_snapshot = GetTransactionSnapshot();
(6) Snapshot active_snapshot = GetActiveSnapshot();
nprocessed = queryDesc->estate->es_processed;
(7) PopActiveSnapshot();
The snapshot used in execution of the query is clearly the
ActiveSnapshot at the time of creating the QueryDesc [at (2)] which is
a copy of the TransactionSnapshot originally acquired [at (1)].
In InitializeParallelDSM() it acquires both the TransactionSnapshot
[at (5)] and the ActiveSnapshot [at (6)], to be serialized in the DSM
for the workers (each of which will deserialize and restore these).
But the problem I see is that the GetTransactionSnapshot() call [at
(5)] may acquire a new snapshot (i.e. a later snapshot than the
ActiveSnapshot used in the execution of the query), for example, if a
concurrent transaction has completed since GetTransactionSnapshot()
was last called [in (1)].
In this case, GetTransactionSnapshot() calls GetSnapshotDataReuse()
and it returns false, causing a new snapshot to be built by
GetTransactionSnapshot().
curXactCompletionCount = ShmemVariableCache->xactCompletionCount;
if (curXactCompletionCount != snapshot->snapXactCompletionCount)
return false;
When this TransactionSnapshot is restored in a worker process, it
accordingly sets TransactionXmin, and if we look back at the coredump
stacktrace and the Assert condition that failed in the worker, we see
that the xid was expected to be >= TransactionXmin, but the Assert
fired because the xid was < TransactionXmin.
Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
This is explained by the TransactionSnapshot being a later snapshot in
this case.
So this is why it seems to be wrong to call GetTransactionSnapshot()
in InitializeParallelDSM() and use a separate, potentially later,
snapshot than that used in the execution state for the query.
Regards,
Greg Nancarrow
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | Andrew Dunstan | 2021-08-11 12:38:43 | Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux? |
Previous Message | torikoshia | 2021-08-11 12:14:05 | Re: RFC: Logging plan of the running query |