From: | Cary Huang <cary(dot)huang(at)highgo(dot)ca> |
---|---|
To: | "Cary Huang" <cary(dot)huang(at)highgo(dot)ca> |
Cc: | "David Rowley" <dgrowleyml(at)gmail(dot)com>, "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Support tid range scan in parallel? |
Date: | 2024-05-03 18:55:01 |
Message-ID: | 18f3fcf4f30.12c03d898462092.1918273472989699761@highgo.ca |
Lists: | pgsql-hackers |
Hello
> -- parallel plan
> postgres=# set max_parallel_workers_per_gather=2;
> SET
> postgres=# select count(*) from t where ctid >= '(0,0)' and ctid < '(10,0)';
> count
> -------
> 0
> (1 row)
>
> I've not really looked into why, but I see you're not calling
> heap_setscanlimits() in parallel mode. You need to somehow restrict
> the block range of the scan to the range specified in the quals. You
> might need to do more work to make the scan limits work with parallel
> scans.
I found that the very first select count(*) using a parallel tid range scan
returns the correct result, but subsequent identical queries return 0, as you
observed. This is because "pscan->phs_syncscan" is set to true in
ExecTidRangeScanInitializeDSM(), inherited from the parallel seq scan case.
With syncscan enabled, table_block_parallelscan_nextpage() resumes from the
block where the previous tid range scan ended, instead of the correct start
block that should be scanned. Since a serial tid range scan does not set
SO_ALLOW_SYNC, I figure syncscan should be disabled in the parallel case as
well. With that change, it becomes safe to call heap_setscanlimits() in the
parallel case, so I added that call back to heap_set_tidrange() for both
serial and parallel scans.
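In outline, the change looks roughly like this (a simplified, pseudocode-level sketch of the idea rather than the literal patch hunks; the field and function names are from the existing tree, the surrounding context is abbreviated):

```c
/* In ExecTidRangeScanInitializeDSM() (sketch): after initializing the
 * parallel scan descriptor, force synchronized scanning off, matching
 * the serial case, which never sets SO_ALLOW_SYNC.  Each (re)scan then
 * starts at the block implied by the TID qual instead of wherever a
 * previous scan happened to leave off. */
pscan->phs_syncscan = false;

/* In heap_set_tidrange() (sketch): with syncscan off it is safe to
 * restrict the scan to the block range derived from the TID bounds in
 * both the serial and the parallel case. */
heap_setscanlimits(sscan, startBlk, numBlks);
```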
> 2. There's a 4 line comment you've added to cost_tidrangescan() which
> is just a copy and paste from cost_seqscan(). If you look at the
> seqscan costing, the comment is true in that scenario, but not true in
> where you've pasted it. The I/O cost is all tied in to run_cost.
Thanks for pointing that out; I have removed the incorrect comment.
> 3. Calling TidRangeQualFromRestrictInfoList() once for the serial path
> and again for the partial path isn't great. It would be good to just
> call that function once and use the result for both path types.
Good point. I moved the creation of the tid range scan partial path into
create_tidscan_paths(), which already calls TidRangeQualFromRestrictInfoList()
for the serial path, so the resulting tidrangequals can simply be reused when
it is appropriate to consider a parallel tid range scan.
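Schematically, the restructuring looks like this (a hedged sketch, not the literal patch; create_tidrangescan_path() and TidRangeQualFromRestrictInfoList() exist in tidpath.c, while the partial-path branch shown is the new part):

```c
/* In create_tidscan_paths() (sketch): derive the TID range quals from
 * the restriction list exactly once ... */
List *tidrangequals = TidRangeQualFromRestrictInfoList(rel->baserestrictinfo,
                                                       rel);

if (tidrangequals != NIL)
{
    /* ... build the serial TID range scan path from them ... */
    add_path(rel, (Path *) create_tidrangescan_path(root, rel,
                                                    tidrangequals,
                                                    required_outer));

    /* ... and reuse the same quals for the partial path when a
     * parallel TID range scan is worth considering. */
    if (rel->consider_parallel && required_outer == NULL)
        /* add partial TID range scan path using tidrangequals */ ;
}
```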
> 4. create_tidrangescan_subpaths() seems like a weird name for a
> function. That seems to imply that scans have subpaths. Scans are
> always leaf paths and have no subpaths.
I removed this oddly named function; it is no longer needed because its logic
has moved into create_tidscan_paths(), where it can reuse tidrangequals.
> It would be good to see the EXPLAIN (ANALYZE, BUFFERS) with SET
> track_io_timing = 1;
Attached is the v2 patch, which should address the issues above. Below are the
EXPLAIN outputs with track_io_timing = 1 from my environment. In general, the
parallel tid range scan shows higher I/O timings but a shorter overall
execution time.
SET track_io_timing = 1;
===serial tid rangescan===
EXPLAIN (ANALYZE, BUFFERS) select a from test where ctid >= '(0,0)' and ctid < '(216216,40)';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Tid Range Scan on test (cost=0.01..490815.59 rows=27459559 width=4) (actual time=0.072..10143.770 rows=39999999 loops=1)
TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
Buffers: shared hit=298 read=215919 written=12972
I/O Timings: shared read=440.277 write=58.525
Planning:
Buffers: shared hit=2
Planning Time: 0.289 ms
Execution Time: 12497.081 ms
(8 rows)
set parallel_setup_cost=0;
set parallel_tuple_cost=0;
set min_parallel_table_scan_size=0;
set max_parallel_workers_per_gather=2;
===parallel tid rangescan===
EXPLAIN (ANALYZE, BUFFERS) select a from test where ctid >= '(0,0)' and ctid < '(216216,40)';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.01..256758.88 rows=40000130 width=4) (actual time=0.878..7083.705 rows=39999999 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared read=216217
I/O Timings: shared read=1224.153
-> Parallel Tid Range Scan on test (cost=0.01..256758.88 rows=16666721 width=4) (actual time=0.256..3980.770 rows=13333333 loops=3)
TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
Buffers: shared read=216217
I/O Timings: shared read=1224.153
Planning Time: 0.258 ms
Execution Time: 9731.800 ms
(11 rows)
===serial tid rangescan with aggregate===
set max_parallel_workers_per_gather=0;
EXPLAIN (ANALYZE, BUFFERS) select count(a) from test where ctid >= '(0,0)' and ctid < '(216216,40)';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=716221.63..716221.64 rows=1 width=8) (actual time=12931.695..12931.696 rows=1 loops=1)
Buffers: shared read=216217
I/O Timings: shared read=599.331
-> Tid Range Scan on test (cost=0.01..616221.31 rows=40000130 width=4) (actual time=0.079..6800.482 rows=39999999 loops=1)
TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
Buffers: shared read=216217
I/O Timings: shared read=599.331
Planning:
Buffers: shared hit=1 read=2
I/O Timings: shared read=0.124
Planning Time: 0.917 ms
Execution Time: 12932.348 ms
(12 rows)
===parallel tid rangescan with aggregate===
set max_parallel_workers_per_gather=2;
EXPLAIN (ANALYZE, BUFFERS) select count(a) from test where ctid >= '(0,0)' and ctid < '(216216,40)';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=298425.70..298425.71 rows=1 width=8) (actual time=4842.512..4847.863 rows=1 loops=1)
Buffers: shared read=216217
I/O Timings: shared read=1155.321
-> Gather (cost=298425.68..298425.69 rows=2 width=8) (actual time=4842.020..4847.851 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared read=216217
I/O Timings: shared read=1155.321
-> Partial Aggregate (cost=298425.68..298425.69 rows=1 width=8) (actual time=4824.730..4824.731 rows=1 loops=3)
Buffers: shared read=216217
I/O Timings: shared read=1155.321
-> Parallel Tid Range Scan on test (cost=0.01..256758.88 rows=16666721 width=4) (actual time=0.098..2614.108 rows=13333333 loops=3)
TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
Buffers: shared read=216217
I/O Timings: shared read=1155.321
Planning:
Buffers: shared read=3
I/O Timings: shared read=3.323
Planning Time: 4.124 ms
Execution Time: 4847.992 ms
(20 rows)
Cary Huang
-------------
HighGo Software Inc. (Canada)
cary(dot)huang(at)highgo(dot)ca
www.highgo.ca
Attachment | Content-Type | Size |
---|---|---|
v2-0001-add-parallel-tid-rangescan.patch | application/octet-stream | 16.4 KB |