Re: Need recommendation on PARALLEL INDEX SCAN and PARALLEL APPEND

From: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
To: Vivekk P <vivekkp(at)zeta(dot)tech>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Need recommendation on PARALLEL INDEX SCAN and PARALLEL APPEND
Date: 2021-10-26 07:18:15
Message-ID: CAM+6J94CiRSnK10i9b2dX9wbeOzZA-M+-EGyd-LE3N+1Wg4k8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, 26 Oct 2021 at 11:39, Vivekk P <vivekkp(at)zeta(dot)tech> wrote:

> Hi Team,
>
> Please have a look on the below problem statement and suggest us if there
> are any ways to make the planner pick PARALLEL INDEX SCAN and PARALLEL
> APPEND
>
>
> 1. We have tried fine-tuning the below parameters with all possible values
> to get the expected results but got no luck,
>
> Parameter setting unit
> enable_parallel_append on
> enable_parallel_hash on
> force_parallel_mode off
> max_parallel_maintenance_workers 2
> max_parallel_workers 8
> max_parallel_workers_per_gather 2
> min_parallel_index_scan_size 64 8kB
> min_parallel_table_scan_size 1024 8kB
> parallel_leader_participation on
> parallel_setup_cost 1000
> parallel_tuple_cost 0.1
> effective_cache_size 4GB
> shared_buffers 128MB
> work_mem 4MB
>
>

I am pretty sure there will be questions on why you want to do that or why
you think this would solve any problem.

anyways,
This is just to force a parallel run, but do not do this in production.
The way we try this here, is to trick the optimizer by saying there is no
cost of making use of parallel setup, so this is always the best
path.(which is wrong, but..)

postgres=# select name,setting from pg_settings where name like '%para%';
name | setting
-----------------------------------+---------
enable_parallel_append | on
enable_parallel_hash | on
force_parallel_mode | off
log_parameter_max_length | -1
log_parameter_max_length_on_error | 0
max_parallel_maintenance_workers | 2
max_parallel_workers | 8
max_parallel_workers_per_gather | 2
min_parallel_index_scan_size | 64
min_parallel_table_scan_size | 1024
parallel_leader_participation | on
parallel_setup_cost | 1000
parallel_tuple_cost | 0.1
ssl_dh_params_file |
(14 rows)

postgres=# set force_parallel_mode to on;
SET
postgres=# set parallel_setup_cost to 0;
SET
postgres=# set parallel_tuple_cost to 0;
SET
postgres=# explain analyze select * from t where id > 0;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.00..3.76 rows=80 width=12) (actual time=2.900..5.996
rows=80 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Append (cost=0.00..3.76 rows=34 width=12) (actual
time=0.002..0.009 rows=27 loops=3)
-> Parallel Seq Scan on t2 t_2 (cost=0.00..1.23 rows=18
width=12) (actual time=0.005..0.009 rows=31 loops=1)
Filter: (id > 0)
-> Parallel Seq Scan on t1 t_1 (cost=0.00..1.21 rows=17
width=12) (actual time=0.004..0.006 rows=29 loops=1)
Filter: (id > 0)
-> Parallel Seq Scan on t3 t_3 (cost=0.00..1.15 rows=12
width=12) (actual time=0.001..0.003 rows=20 loops=1)
Filter: (id > 0)
Planning Time: 0.568 ms
Execution Time: 6.022 ms
(12 rows)

postgres=# set seq_page_cost to 100000; --- since we do not want seq scan
but index scan
SET
postgres=# explain analyze select * from t where id > 0;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=0.14..37.65 rows=80 width=12) (actual time=0.232..5.326
rows=80 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Append (cost=0.14..37.65 rows=34 width=12) (actual
time=0.007..0.020 rows=27 loops=3)
-> Parallel Index Only Scan using t2_ts_id_idx on t2 t_2
(cost=0.14..12.55 rows=18 width=12) (actual time=0.005..0.012 rows=31
loops=1)
Index Cond: (id > 0)
Heap Fetches: 31
-> Parallel Index Only Scan using t1_ts_id_idx on t1 t_1
(cost=0.14..12.53 rows=17 width=12) (actual time=0.007..0.013 rows=29
loops=1)
Index Cond: (id > 0)
Heap Fetches: 29
-> Parallel Index Only Scan using t3_ts_id_idx on t3 t_3
(cost=0.14..12.41 rows=12 width=12) (actual time=0.019..0.025 rows=20
loops=1)
Index Cond: (id > 0)
Heap Fetches: 20
Planning Time: 0.095 ms
Execution Time: 5.351 ms
(15 rows)

Again, do not do this in production. This is only for debugging purposes
using 0 cost.
You can try looking at pg_hint_plan (osdn.jp)
<https://pghintplan.osdn.jp/pg_hint_plan.html> if you want to force a
plan.
Also, be ready to answer, why do you want to do this or what makes you
think the parallel option will work magic.

also with TB sized dbs, pls ensure your disk io/latency etc are not a
problem.
maybe also bump memory and tune accordingly, to absorb disk io.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Ron 2021-10-26 09:05:53 Re: How to copy rows into same table efficiently
Previous Message Mitar 2021-10-26 07:05:06 Determining if a table really changed in a trigger