Re: Support tid range scan in parallel?

From: Cary Huang <cary(dot)huang(at)highgo(dot)ca>
To: "David Rowley" <dgrowleyml(at)gmail(dot)com>
Cc: "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Support tid range scan in parallel?
Date: 2024-05-08 22:23:55
Message-ID: 18f5a4e5977.11f1b68ca636787.1014021484376275876@highgo.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you very much for the test and review. Greatly appreciated!

> This now calls heap_setscanlimits() for the parallel version, it's
> just that table_block_parallelscan_nextpage() does nothing to obey
> those limits.

yes, you are absolutely right. Though heap_setscanlimits() is now called by
parallel tid range scan, table_block_parallelscan_nextpage() does nothing
to obey these limits, resulting in more blocks being inefficiently filtered out
by the optimization code you mentioned in heap_getnextslot_tidrange().

> I've not studied exactly how you'd get the rs_numblock information
> down to the parallel scan descriptor. But when you figure that out,
> just remember that you can't do the --scan->rs_numblocks from
> table_block_parallelscan_nextpage() as that's not parallel safe. You
> might be able to add an or condition to: "if (nallocated >=
> pbscan->phs_nblocks)" to make it "if (nallocated >=
> pbscan->phs_nblocks || nallocated >= pbscan->phs_numblocks)",
> although the field names don't seem very intuitive there. It would be
> nicer if the HeapScanDesc field was called rs_blocklimit rather than
> rs_numblocks. It's not for this patch to go messing with that,
> however.

rs_numblock was not passed down to the parallel scan context and
table_block_parallelscan_nextpage() did not seem to have a logic to limit
the block scan range set by heap_setscanlimits() in parallel scan. Also, I
noticed that the rs_startblock was also not passed to the parallel scan
context, which causes the parallel scan always start from 0 even when a
lower ctid bound is specified.

so I added a logic in heap_set_tidrange() to pass these 2 values to parallel
scan descriptor as "phs_startblock" and "phs_numblock". These will be
available in table_block_parallelscan_nextpage() in parallel scan.

I followed your recommendation and modified the condition to:

if (nallocated >= pbscan->phs_nblocks || (pbscan->phs_numblock != 0 &&
nallocated >= pbscan->phs_numblock))

so that the parallel tid range scan will stop when the upper scan limit is
reached. With these changes, I see that the number of buffer reads is
consistent between single and parallel ctid range scans. The v3 patch is
attached.

postgres=# explain (analyze, buffers) select count(*) from test where ctid >= '(0,1)' and ctid < '(11,0)';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Aggregate (cost=39.43..39.44 rows=1 width=8) (actual time=1.007..1.008 rows=1 loops=1)
Buffers: shared read=11
-> Tid Range Scan on test (cost=0.01..34.35 rows=2034 width=0) (actual time=0.076..0.639 rows=2035 loops=1)
TID Cond: ((ctid >= '(0,1)'::tid) AND (ctid < '(11,0)'::tid))
Buffers: shared read=11

postgres=# set max_parallel_workers_per_gather=2;
SET
postgres=# explain (analyze, buffers) select count(*) from test where ctid >= '(0,1)' and ctid < '(11,0)';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=16.45..16.46 rows=1 width=8) (actual time=14.329..16.840 rows=1 loops=1)
Buffers: shared hit=11
-> Gather (cost=16.43..16.44 rows=2 width=8) (actual time=3.197..16.814 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=11
-> Partial Aggregate (cost=16.43..16.44 rows=1 width=8) (actual time=0.705..0.706 rows=1 loops=3)
Buffers: shared hit=11
-> Parallel Tid Range Scan on test (cost=0.01..14.31 rows=848 width=0) (actual time=0.022..0.423 rows=678 loops=3)
TID Cond: ((ctid >= '(0,1)'::tid) AND (ctid < '(11,0)'::tid))
Buffers: shared hit=11

postgres=# set max_parallel_workers_per_gather=3;
SET
postgres=# explain (analyze, buffers) select count(*) from test where ctid >= '(0,1)' and ctid < '(11,0)';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=12.74..12.75 rows=1 width=8) (actual time=16.793..19.053 rows=1 loops=1)
Buffers: shared hit=11
-> Gather (cost=12.72..12.73 rows=3 width=8) (actual time=2.827..19.012 rows=4 loops=1)
Workers Planned: 3
Workers Launched: 3
Buffers: shared hit=11
-> Partial Aggregate (cost=12.72..12.73 rows=1 width=8) (actual time=0.563..0.565 rows=1 loops=4)
Buffers: shared hit=11
-> Parallel Tid Range Scan on test (cost=0.01..11.08 rows=656 width=0) (actual time=0.018..0.338 rows=509 loops=4)
TID Cond: ((ctid >= '(0,1)'::tid) AND (ctid < '(11,0)'::tid))
Buffers: shared hit=11

thank you!

Cary Huang
-------------
HighGo Software Inc. (Canada)
cary(dot)huang(at)highgo(dot)ca
www.highgo.ca

Attachment Content-Type Size
v3-0001-add-parallel-tid-rangescan.patch application/octet-stream 18.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-05-08 22:40:33 Re: Revert: Remove useless self-joins *and* -DREALLOCATE_BITMAPSETS make server crash, regress test fail.
Previous Message Tom Lane 2024-05-08 22:16:19 Re: Revert: Remove useless self-joins *and* -DREALLOCATE_BITMAPSETS make server crash, regress test fail.