From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Why do parallel scans require syncscans (but not really)? |
Date: | 2023-12-30 23:09:55 |
Message-ID: | 64913123-619b-ad4d-29d8-f8a31409d85a@enterprisedb.com |
Lists: | pgsql-hackers |
Hi,
While investigating something, I noticed it's not really possible to
disable syncscans for a particular parallel scan.
That is, with serial scans, we can do this:
    table_index_build_scan(heap, index, indexInfo, false, true,
                           callback, state, NULL);
where the first boolean argument (false) is allow_sync, i.e. synchronized
scans are disabled.
However, if this is tied to a parallel scan, that's not possible. If you
try doing this:
    table_index_build_scan(heap, index, indexInfo, false, true,
                           callback, state, parallel_scan);
it hits this code in heapam_index_build_range_scan():
    /*
     * Parallel index build.
     *
     * Parallel case never registers/unregisters own snapshot. Snapshot
     * is taken from parallel heap scan, and is SnapshotAny or an MVCC
     * snapshot, based on same criteria as serial case.
     */
    Assert(!IsBootstrapProcessingMode());
    Assert(allow_sync);
    snapshot = scan->rs_snapshot;
Sadly, there's no explanation why parallel scans do not allow disabling
sync scans just like serial scans - and it's not quite obvious to me.
Just removing the assert is not sufficient to make this work, though,
because a little later we call heap_setscanlimits(), which does:
    Assert(!scan->rs_inited); /* else too late to change */
    /* else rs_startblock is significant */
    Assert(!(scan->rs_base.rs_flags & SO_ALLOW_SYNC));
I don't quite understand what the comments for the asserts say, but
SO_ALLOW_SYNC seems to come from table_beginscan_parallel.
Perhaps the goal is to prevent mismatch between the parallel and serial
parts of the scans - OK, that probably makes sense. But I still don't
understand why table_beginscan_parallel forces SO_ALLOW_SYNC instead
of taking an allow_sync parameter too.
Is there a reason why all parallel plans should / must be synchronized?
Well, in fact they are not *required* to be, because if I set
    synchronize_seqscans=off
then initscan() removes the SO_ALLOW_SYNC flag ...
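To make the flag handling concrete, here is a toy model in C. It is not PostgreSQL source: the ToyScan struct, the toy_* functions and the TOY_SO_ALLOW_SYNC value are made up for illustration, and the model encodes only the behaviour described above (the parallel begin-scan sets the flag unconditionally, initscan() clears it when synchronize_seqscans is off, and heap_setscanlimits() asserts that the scan is not yet initialized and that the flag is clear).

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the scan-option bit; the value is arbitrary. */
#define TOY_SO_ALLOW_SYNC (1u << 0)

/* Hypothetical, stripped-down scan descriptor (not the real HeapScanDesc). */
typedef struct ToyScan
{
    uint32_t flags;     /* plays the role of rs_base.rs_flags */
    bool     inited;    /* plays the role of rs_inited */
} ToyScan;

/* Models table_beginscan_parallel as described above: flag always set. */
static ToyScan
toy_beginscan_parallel(void)
{
    ToyScan scan = { TOY_SO_ALLOW_SYNC, false };
    return scan;
}

/* Models the initscan() behaviour noted above: GUC off => flag removed. */
static void
toy_initscan(ToyScan *scan, bool synchronize_seqscans)
{
    if (!synchronize_seqscans)
        scan->flags &= ~TOY_SO_ALLOW_SYNC;
}

/* True when both conditions asserted by heap_setscanlimits() hold. */
static bool
toy_setscanlimits_ok(const ToyScan *scan)
{
    return !scan->inited && !(scan->flags & TOY_SO_ALLOW_SYNC);
}
```

In this model a parallel scan fails the heap_setscanlimits() check while the GUC is on and passes it once the GUC is off, which is the sense in which synchronization is forced by the API but not strictly required.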
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company