Re: Question of Parallel Hash Join on TPC-H Benchmark

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Question of Parallel Hash Join on TPC-H Benchmark
Date: 2024-10-10 14:40:43
Message-ID: 20f9671d-6f40-429b-a749-a2f53086e0dd@vondra.me
Lists: pgsql-bugs

The planner likely does not believe it'd be a win. It's not just about
parallel safety - starting parallel workers and copying the data between
processes is not free, and the planner tries to not do parallel stuff
unnecessarily. I'd bet 1GB is fairly small for parallelism.

You can try setting parallel_setup_cost and parallel_tuple_cost to 0,
and reducing min_parallel_{table,index}_scan_size. That should force a
parallel plan, and you'll see if parallelism is helpful.
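To make that concrete, here is a minimal session-level sketch of the settings mentioned above (the GUC names are real PostgreSQL settings; the zero values are only meant to force the planner's hand while testing, not for production use):

```sql
-- Pretend parallelism is free, so the planner stops avoiding it
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;

-- Let even small tables and indexes qualify for parallel scans
SET min_parallel_table_scan_size = 0;
SET min_parallel_index_scan_size = 0;

-- Then re-run EXPLAIN ANALYZE on the query and compare the timings
```

Since these SETs are session-local, a RESET ALL (or simply reconnecting) restores the defaults afterwards.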

regards

On 10/10/24 16:16, Ba Jinsheng wrote:
> Forgot to mention:
>
> TPC-H data scale: 1 GB
> PostgreSQL version: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=a94d5b3728dd3e5fd4adb25350712eb785bb9ab6
>
>
> ------------------------------------------------------------------------
> *From:* Ba Jinsheng <bajinsheng(at)u(dot)nus(dot)edu>
> *Sent:* Thursday, October 10, 2024 4:11 PM
> *To:* pgsql-bugs(at)lists(dot)postgresql(dot)org <pgsql-bugs(at)lists(dot)postgresql(dot)org>
> *Subject:* Question of Parallel Hash Join on TPC-H Benchmark
>
> Hi everyone,
>
> Consider query 2 of the TPC-H benchmark:
>
> explain analyze
> select s_acctbal, s_name, n_name, p_partkey, p_mfgr, s_address, s_phone, s_comment
> from PART, SUPPLIER, PARTSUPP, NATION, REGION
> where p_partkey = ps_partkey
>   and s_suppkey = ps_suppkey
>   and p_size = 30
>   and p_type like '%STEEL'
>   and s_nationkey = n_nationkey
>   and n_regionkey = r_regionkey
>   and r_name = 'ASIA'
>   and ps_supplycost = (
>     select min(ps_supplycost)
>     from PARTSUPP, SUPPLIER, NATION, REGION
>     where p_partkey = ps_partkey
>       and s_suppkey = ps_suppkey
>       and s_nationkey = n_nationkey
>       and n_regionkey = r_regionkey
>       and r_name = 'ASIA'
>   )
> order by s_acctbal desc, n_name, s_name, p_partkey
> limit 100;
>
> I ran it on PostgreSQL and got this query plan:
>
>                                                                         QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=66839.16..66839.17 rows=1 width=195) (actual time=267.054..267.114 rows=100 loops=1)
>    ->  Sort  (cost=66839.16..66839.17 rows=1 width=195) (actual time=267.053..267.108 rows=100 loops=1)
>          Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
>          Sort Method: top-N heapsort  Memory: 71kB
>          *->  Hash Join  (cost=37874.96..66839.15 rows=1 width=195) (actual time=227.273..266.816 rows=485 loops=1)*
>                Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND ((SubPlan 1) = partsupp.ps_supplycost))
>                ->  Gather  (cost=1000.00..6458.40 rows=804 width=30) (actual time=0.567..0.733 rows=826 loops=1)
>                      Workers Planned: 2
>                      Workers Launched: 2
>                      ->  Parallel Seq Scan on part  (cost=0.00..5378.00 rows=335 width=30) (actual time=0.156..15.960 rows=275 loops=3)
>                            Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
>                            Rows Removed by Filter: 66391
>                ->  Hash  (cost=30567.96..30567.96 rows=160000 width=175) (actual time=225.816..225.820 rows=160240 loops=1)
>                      Buckets: 65536  Batches: 8  Memory Usage: 4648kB
>                      ->  Hash Join  (cost=407.96..30567.96 rows=160000 width=175) (actual time=5.200..162.555 rows=160240 loops=1)
>                            Hash Cond: (partsupp.ps_suppkey = supplier.s_suppkey)
>                            ->  Seq Scan on partsupp  (cost=0.00..25560.00 rows=800000 width=14) (actual time=0.016..59.835 rows=800000 loops=1)
>                            ->  Hash  (cost=382.96..382.96 rows=2000 width=169) (actual time=5.169..5.172 rows=2003 loops=1)
>                                  Buckets: 2048  Batches: 1  Memory Usage: 413kB
>                                  ->  Hash Join  (cost=2.46..382.96 rows=2000 width=169) (actual time=0.108..4.336 rows=2003 loops=1)
>                                        Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
>                                        ->  Seq Scan on supplier  (cost=0.00..323.00 rows=10000 width=144) (actual time=0.013..2.478 rows=10000 loops=1)
>                                        ->  Hash  (cost=2.40..2.40 rows=5 width=33) (actual time=0.069..0.071 rows=5 loops=1)
>                                              Buckets: 1024  Batches: 1  Memory Usage: 9kB
>                                              ->  Hash Join  (cost=1.07..2.40 rows=5 width=33) (actual time=0.060..0.066 rows=5 loops=1)
>                                                    Hash Cond: (nation.n_regionkey = region.r_regionkey)
>                                                    ->  Seq Scan on nation  (cost=0.00..1.25 rows=25 width=37) (actual time=0.011..0.013 rows=25 loops=1)
>                                                    ->  Hash  (cost=1.06..1.06 rows=1 width=4) (actual time=0.029..0.029 rows=1 loops=1)
>                                                          Buckets: 1024  Batches: 1  Memory Usage: 9kB
>                                                          ->  Seq Scan on region  (cost=0.00..1.06 rows=1 width=4) (actual time=0.014..0.015 rows=1 loops=1)
>                                                                Filter: (r_name = 'ASIA'::bpchar)
>                                                                Rows Removed by Filter: 4
>                SubPlan 1
>                  ->  Aggregate  (cost=48.70..48.71 rows=1 width=32) (actual time=0.019..0.019 rows=1 loops=1311)
>                        ->  Nested Loop  (cost=0.85..48.70 rows=1 width=6) (actual time=0.014..0.018 rows=1 loops=1311)
>                              Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
>                              Rows Removed by Join Filter: 3
>                              ->  Seq Scan on region region_1  (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1311)
>                                    Filter: (r_name = 'ASIA'::bpchar)
>                                    Rows Removed by Filter: 4
>                              ->  Nested Loop  (cost=0.85..47.58 rows=4 width=10) (actual time=0.010..0.017 rows=4 loops=1311)
>                                    ->  Nested Loop  (cost=0.71..46.96 rows=4 width=10) (actual time=0.009..0.013 rows=4 loops=1311)
>                                          ->  Index Scan using partsupp_pkey on partsupp partsupp_1  (cost=0.42..13.75 rows=4 width=10) (actual time=0.007..0.008 rows=4 loops=1311)
>                                                Index Cond: (ps_partkey = part.p_partkey)
>                                          ->  Index Scan using supplier_pkey on supplier supplier_1  (cost=0.29..8.30 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=5244)
>                                                Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
>                                    ->  Index Scan using nation_pkey on nation nation_1  (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=5244)
>                                          Index Cond: (n_nationkey = supplier_1.s_nationkey)
> (50 rows)
>
>
> I am wondering why the Hash Join in the fifth line is not executed in parallel. According to the documentation at https://www.postgresql.org/docs/current/parallel-safety.html, I understand it should be parallel, and we can get a more efficient query plan as follows:
>
>                                                                         QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=42019.25..42019.25 rows=1 width=195) (actual time=134.698..142.258 rows=100 loops=1)
>    ->  Sort  (cost=42019.25..42019.25 rows=1 width=195) (actual time=134.697..142.252 rows=100 loops=1)
>          Sort Key: supplier.s_acctbal DESC, nation.n_name, supplier.s_name, part.p_partkey
>          Sort Method: top-N heapsort  Memory: 69kB
>          ->  Gather  (cost=26845.97..42019.24 rows=1 width=195) (actual time=116.843..142.014 rows=485 loops=1)
>                Workers Planned: 2
>                Workers Launched: 2
>                *->  Parallel Hash Join  (cost=25845.97..41019.14 rows=1 width=195) (actual time=120.427..130.569 rows=162 loops=3)*
>                      Hash Cond: ((part.p_partkey = partsupp.ps_partkey) AND ((SubPlan 1) = partsupp.ps_supplycost))
>                      ->  Parallel Seq Scan on part  (cost=0.00..5378.00 rows=335 width=30) (actual time=0.102..9.739 rows=275 loops=3)
>                            Filter: (((p_type)::text ~~ '%STEEL'::text) AND (p_size = 30))
>                            Rows Removed by Filter: 66391
>                      ->  Parallel Hash  (cost=23217.97..23217.97 rows=66667 width=175) (actual time=97.833..97.837 rows=53413 loops=3)
>                            Buckets: 65536  Batches: 8  Memory Usage: 4800kB
>                            ->  Hash Join  (cost=407.96..23217.97 rows=66667 width=175) (actual time=4.596..69.280 rows=53413 loops=3)
>                                  Hash Cond: (partsupp.ps_suppkey = supplier.s_suppkey)
>                                  ->  Parallel Seq Scan on partsupp  (cost=0.00..20893.33 rows=333333 width=14) (actual time=0.039..26.690 rows=266667 loops=3)
>                                  ->  Hash  (cost=382.96..382.96 rows=2000 width=169) (actual time=4.543..4.545 rows=2003 loops=3)
>                                        Buckets: 2048  Batches: 1  Memory Usage: 413kB
>                                        ->  Hash Join  (cost=2.46..382.96 rows=2000 width=169) (actual time=0.106..3.658 rows=2003 loops=3)
>                                              Hash Cond: (supplier.s_nationkey = nation.n_nationkey)
>                                              ->  Seq Scan on supplier  (cost=0.00..323.00 rows=10000 width=144) (actual time=0.015..1.071 rows=10000 loops=3)
>                                              ->  Hash  (cost=2.40..2.40 rows=5 width=33) (actual time=0.070..0.072 rows=5 loops=3)
>                                                    Buckets: 1024  Batches: 1  Memory Usage: 9kB
>                                                    ->  Hash Join  (cost=1.07..2.40 rows=5 width=33) (actual time=0.061..0.067 rows=5 loops=3)
>                                                          Hash Cond: (nation.n_regionkey = region.r_regionkey)
>                                                          ->  Seq Scan on nation  (cost=0.00..1.25 rows=25 width=37) (actual time=0.009..0.011 rows=25 loops=3)
>                                                          ->  Hash  (cost=1.06..1.06 rows=1 width=4) (actual time=0.031..0.031 rows=1 loops=3)
>                                                                Buckets: 1024  Batches: 1  Memory Usage: 9kB
>                                                                ->  Seq Scan on region  (cost=0.00..1.06 rows=1 width=4) (actual time=0.014..0.016 rows=1 loops=3)
>                                                                      Filter: (r_name = 'ASIA'::bpchar)
>                                                                      Rows Removed by Filter: 4
>                      SubPlan 1
>                        ->  Aggregate  (cost=48.70..48.71 rows=1 width=32) (actual time=0.017..0.017 rows=1 loops=3810)
>                              ->  Nested Loop  (cost=0.85..48.70 rows=1 width=6) (actual time=0.012..0.016 rows=1 loops=3810)
>                                    Join Filter: (region_1.r_regionkey = nation_1.n_regionkey)
>                                    Rows Removed by Join Filter: 3
>                                    ->  Seq Scan on region region_1  (cost=0.00..1.06 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=3810)
>                                          Filter: (r_name = 'ASIA'::bpchar)
>                                          Rows Removed by Filter: 4
>                                    ->  Nested Loop  (cost=0.85..47.58 rows=4 width=10) (actual time=0.006..0.015 rows=4 loops=3810)
>                                          ->  Nested Loop  (cost=0.71..46.96 rows=4 width=10) (actual time=0.005..0.011 rows=4 loops=3810)
>                                                ->  Index Scan using partsupp_pkey on partsupp partsupp_1  (cost=0.42..13.75 rows=4 width=10) (actual time=0.003..0.004 rows=4 loops=3810)
>                                                      Index Cond: (ps_partkey = part.p_partkey)
>                                                ->  Index Scan using supplier_pkey on supplier supplier_1  (cost=0.29..8.30 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=15240)
>                                                      Index Cond: (s_suppkey = partsupp_1.ps_suppkey)
>                                          ->  Index Scan using nation_pkey on nation nation_1  (cost=0.14..0.16 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=15240)
>                                                Index Cond: (n_nationkey = supplier_1.s_nationkey)
> (50 rows)
>
>
>
> Best regards,
>
> Jinsheng Ba
>
>  
>
> Notice: This email is generated from the account of an NUS alumnus.
> Contents, views, and opinions therein are solely those of the sender.

--
Tomas Vondra
