Re: How to solve my slow disk i/o throughput during index scan

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: FREYBURGER Simon "(SNCF" VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD "MANAGEMENT)" <simon(dot)freyburger(at)sncf(dot)fr>, "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: How to solve my slow disk i/o throughput during index scan
Date: 2024-07-05 02:04:31
Message-ID: d07e4b85-0572-453a-93c8-2ca8b34aa662@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 7/4/24 22:23, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE
TGV / DM RMP YIELD MANAGEMENT) wrote:
> Hello,
>
> Thank you, splitting in β€œOR” query definitely enables bitmap heap scans,
> and thus parallelized read to disk πŸ˜ƒ! I though did not understand your
> second point, what is parallel append, and how to enable it ?
Just for example:

DROP TABLE IF EXISTS t CASCADE;
CREATE TABLE t (id int not null, payload text) PARTITION BY RANGE (id);
CREATE TABLE p1 PARTITION OF t FOR VALUES FROM (0) TO (1000);
CREATE TABLE p2 PARTITION OF t FOR VALUES FROM (1000) TO (2000);
CREATE TABLE p3 PARTITION OF t FOR VALUES FROM (2000) TO (3000);
CREATE TABLE p4 PARTITION OF t FOR VALUES FROM (3000) TO (4000);
INSERT INTO t SELECT x % 4000, repeat('a',128) || x FROM
generate_series(1,1E5) AS x;
ANALYZE t;

SET enable_parallel_append = on;
SET parallel_setup_cost = 0.00001;
SET parallel_tuple_cost = 0.00001;
SET max_parallel_workers_per_gather = 8;
SET min_parallel_table_scan_size = 0;
SET min_parallel_index_scan_size = 0;

EXPLAIN (COSTS OFF)
SELECT t.id, t.payload FROM t WHERE t.id % 2 = 0
GROUP BY t.id, t.payload;

Group
Group Key: t.id, t.payload
-> Gather Merge
Workers Planned: 6
-> Sort
Sort Key: t.id, t.payload
-> Parallel Append
-> Parallel Seq Scan on p1 t_1
Filter: ((id % 2) = 0)
-> Parallel Seq Scan on p2 t_2
Filter: ((id % 2) = 0)
-> Parallel Seq Scan on p3 t_3
Filter: ((id % 2) = 0)
-> Parallel Seq Scan on p4 t_4
Filter: ((id % 2) = 0)

Here the table is scanned in parallel. It also works with IndexScan.

--
regards, Andrei Lepikhov

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David Rowley 2024-07-05 02:15:42 Re: Hash Right join and seq scan
Previous Message James Pang 2024-07-05 00:50:18 Re: Hash Right join and seq scan