From: | Cary Huang <cary(dot)huang(at)highgo(dot)ca> |
---|---|
To: | "Cary Huang" <cary(dot)huang(at)highgo(dot)ca> |
Cc: | "David Rowley" <dgrowleyml(at)gmail(dot)com>, "Pgsql Hackers" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Support tid range scan in parallel? |
Date: | 2024-05-03 18:55:01 |
Message-ID: | 18f3fcf4f30.12c03d898462092.1918273472989699761@highgo.ca |
Lists: | pgsql-hackers |
Hello
> -- parallel plan
> postgres=# set max_parallel_workers_per_gather=2;
> SET
> postgres=# select count(*) from t where ctid >= '(0,0)' and ctid < '(10,0)';
> count
> -------
> 0
> (1 row)
>
> I've not really looked into why, but I see you're not calling
> heap_setscanlimits() in parallel mode. You need to somehow restrict
> the block range of the scan to the range specified in the quals. You
> might need to do more work to make the scan limits work with parallel
> scans.
I found that the very first select count(*) using a parallel tid range scan
returns the correct result, but subsequent identical queries return 0, as you
observed. This is because "pscan->phs_syncscan" is set to true in
ExecTidRangeScanInitializeDSM(), inherited from the parallel seq scan case.
With syncscan enabled, table_block_parallelscan_nextpage() resumes from the
block where the previous tid range scan ended, instead of the correct start
block that should be scanned. Since a serial tid range scan does not set
SO_ALLOW_SYNC, I figure syncscan should be disabled in the parallel case as
well. With that change, it becomes safe to call heap_setscanlimits() in the
parallel case, so I added that call back to heap_set_tidrange() for both
serial and parallel scans.
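In outline, the change looks roughly like this (a simplified, pseudocode-level sketch of the idea rather than the literal patch hunks; the field and function names are from the existing tree, the surrounding context is abbreviated):

```c
/* In ExecTidRangeScanInitializeDSM() (sketch): after initializing the
 * parallel scan descriptor, force synchronized scanning off, matching
 * the serial case, which never sets SO_ALLOW_SYNC.  Each (re)scan then
 * starts at the block implied by the TID qual instead of wherever a
 * previous scan happened to leave off. */
pscan->phs_syncscan = false;

/* In heap_set_tidrange() (sketch): with syncscan off it is safe to
 * restrict the scan to the block range derived from the TID bounds in
 * both the serial and the parallel case. */
heap_setscanlimits(sscan, startBlk, numBlks);
```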
> 2. There's a 4 line comment you've added to cost_tidrangescan() which
> is just a copy and paste from cost_seqscan(). If you look at the
> seqscan costing, the comment is true in that scenario, but not true in
> where you've pasted it. The I/O cost is all tied in to run_cost.
Thanks for pointing that out; I have removed the incorrect comment.
> 3. Calling TidRangeQualFromRestrictInfoList() once for the serial path
> and again for the partial path isn't great. It would be good to just
> call that function once and use the result for both path types.
Good point. I moved the creation of the tid range scan partial path into
create_tidscan_paths(), which already calls TidRangeQualFromRestrictInfoList()
for the serial path, so the resulting tidrangequals can simply be reused when
it is appropriate to consider a parallel tid range scan.
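Schematically, the restructuring looks like this (a hedged sketch, not the literal patch; create_tidrangescan_path() and TidRangeQualFromRestrictInfoList() exist in tidpath.c, while the partial-path branch shown is the new part):

```c
/* In create_tidscan_paths() (sketch): derive the TID range quals from
 * the restriction list exactly once ... */
List *tidrangequals = TidRangeQualFromRestrictInfoList(rel->baserestrictinfo,
                                                       rel);

if (tidrangequals != NIL)
{
    /* ... build the serial TID range scan path from them ... */
    add_path(rel, (Path *) create_tidrangescan_path(root, rel,
                                                    tidrangequals,
                                                    required_outer));

    /* ... and reuse the same quals for the partial path when a
     * parallel TID range scan is worth considering. */
    if (rel->consider_parallel && required_outer == NULL)
        /* add partial TID range scan path using tidrangequals */ ;
}
```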
> 4. create_tidrangescan_subpaths() seems like a weird name for a
> function. That seems to imply that scans have subpaths. Scans are
> always leaf paths and have no subpaths.
I removed this oddly named function; it is no longer needed because its logic
has moved into create_tidscan_paths(), where it can reuse tidrangequals.
> It would be good to see the EXPLAIN (ANALYZE, BUFFERS) with SET
> track_io_timing = 1;
Attached is the v2 patch, which should address the issues above. Below are the
EXPLAIN outputs with track_io_timing = 1 from my environment. In general, the
parallel tid range scan shows higher I/O timings but a shorter overall
execution time.
SET track_io_timing = 1;
===serial tid rangescan===
EXPLAIN (ANALYZE, BUFFERS) select a from test where ctid >= '(0,0)' and ctid < '(216216,40)';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Tid Range Scan on test (cost=0.01..490815.59 rows=27459559 width=4) (actual time=0.072..10143.770 rows=39999999 loops=1)
TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
Buffers: shared hit=298 read=215919 written=12972
I/O Timings: shared read=440.277 write=58.525
Planning:
Buffers: shared hit=2
Planning Time: 0.289 ms
Execution Time: 12497.081 ms
(8 rows)
set parallel_setup_cost=0;
set parallel_tuple_cost=0;
set min_parallel_table_scan_size=0;
set max_parallel_workers_per_gather=2;
===parallel tid rangescan===
EXPLAIN (ANALYZE, BUFFERS) select a from test where ctid >= '(0,0)' and ctid < '(216216,40)';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.01..256758.88 rows=40000130 width=4) (actual time=0.878..7083.705 rows=39999999 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared read=216217
I/O Timings: shared read=1224.153
-> Parallel Tid Range Scan on test (cost=0.01..256758.88 rows=16666721 width=4) (actual time=0.256..3980.770 rows=13333333 loops=3)
TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
Buffers: shared read=216217
I/O Timings: shared read=1224.153
Planning Time: 0.258 ms
Execution Time: 9731.800 ms
(11 rows)
===serial tid rangescan with aggregate===
set max_parallel_workers_per_gather=0;
EXPLAIN (ANALYZE, BUFFERS) select count(a) from test where ctid >= '(0,0)' and ctid < '(216216,40)';
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=716221.63..716221.64 rows=1 width=8) (actual time=12931.695..12931.696 rows=1 loops=1)
Buffers: shared read=216217
I/O Timings: shared read=599.331
-> Tid Range Scan on test (cost=0.01..616221.31 rows=40000130 width=4) (actual time=0.079..6800.482 rows=39999999 loops=1)
TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
Buffers: shared read=216217
I/O Timings: shared read=599.331
Planning:
Buffers: shared hit=1 read=2
I/O Timings: shared read=0.124
Planning Time: 0.917 ms
Execution Time: 12932.348 ms
(12 rows)
===parallel tid rangescan with aggregate===
set max_parallel_workers_per_gather=2;
EXPLAIN (ANALYZE, BUFFERS) select count(a) from test where ctid >= '(0,0)' and ctid < '(216216,40)';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=298425.70..298425.71 rows=1 width=8) (actual time=4842.512..4847.863 rows=1 loops=1)
Buffers: shared read=216217
I/O Timings: shared read=1155.321
-> Gather (cost=298425.68..298425.69 rows=2 width=8) (actual time=4842.020..4847.851 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared read=216217
I/O Timings: shared read=1155.321
-> Partial Aggregate (cost=298425.68..298425.69 rows=1 width=8) (actual time=4824.730..4824.731 rows=1 loops=3)
Buffers: shared read=216217
I/O Timings: shared read=1155.321
-> Parallel Tid Range Scan on test (cost=0.01..256758.88 rows=16666721 width=4) (actual time=0.098..2614.108 rows=13333333 loops=3)
TID Cond: ((ctid >= '(0,0)'::tid) AND (ctid < '(216216,40)'::tid))
Buffers: shared read=216217
I/O Timings: shared read=1155.321
Planning:
Buffers: shared read=3
I/O Timings: shared read=3.323
Planning Time: 4.124 ms
Execution Time: 4847.992 ms
(20 rows)
Cary Huang
-------------
HighGo Software Inc. (Canada)
cary(dot)huang(at)highgo(dot)ca
www.highgo.ca
Attachment | Content-Type | Size |
---|---|---|
v2-0001-add-parallel-tid-rangescan.patch | application/octet-stream | 16.4 KB |