Re: Possible regression with gather merge.

From: Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>
To: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Possible regression with gather merge.
Date: 2017-03-22 13:43:24
Message-ID: CAD__Ouge0T0nvhTUKLNGrvN7+py=N1THxajvWf0wovqJ5F7k7A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 22, 2017 at 1:09 PM, Rushabh Lathia
<rushabh(dot)lathia(at)gmail(dot)com> wrote:
> In the code, gather merge considers the limit for the sort in
> create_ordered_paths(), which is wrong. In create_ordered_paths(), GM
> should not consider the limit while doing the costing for the sort node.
>
> The attached patch fixes the bug.
Thanks, Rushabh. Now I see the non-parallel plan being picked by default:

create table test as (select id, (random()*10000)::int as v1, random() as v2 from generate_series(1,10000000) id);

postgres=# explain analyze select * from test order by v1, v2 limit 10;
                                                           QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=370146.98..370147.00 rows=10 width=16) (actual time=1169.685..1169.686 rows=10 loops=1)
   ->  Sort  (cost=370146.98..395146.63 rows=9999860 width=16) (actual time=1169.683..1169.684 rows=10 loops=1)
         Sort Key: v1, v2
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Seq Scan on test  (cost=0.00..154053.60 rows=9999860 width=16) (actual time=0.016..590.176 rows=10000000 loops=1)
 Planning time: 0.070 ms
 Execution time: 1169.706 ms
(7 rows)
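For intuition, here is a rough sketch (a hypothetical simplified comparison-count model, not the actual cost_sort() code) of why charging the limit against the sort misprices it so badly: a bounded top-N heapsort needs roughly N * log2(2K) comparisons, while the full sort each GM worker actually performs needs roughly N * log2(N).

```python
import math

def sort_comparisons(tuples, limit=None):
    # Simplified comparison-count model (hypothetical, not cost_sort()):
    # a bounded top-N heapsort does about N * log2(2*K) comparisons,
    # while a full sort does about N * log2(N).
    if limit is not None and limit < tuples:
        return tuples * math.log2(2 * limit)
    return tuples * math.log2(tuples)

full = sort_comparisons(10_000_000)            # what GM's workers really do
with_limit = sort_comparisons(10_000_000, 10)  # what charging the limit would cost
# Charging the limited cost makes the sort under Gather Merge look several
# times cheaper than it is, which is the mispricing the patch removes.
```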

Another accidental finding: setting max_parallel_workers_per_gather to a
value higher than the default results in more parallel workers, but the
query runs slower than the plan with fewer workers.

postgres=# set max_parallel_workers_per_gather = default;
SET
postgres=# explain analyze select * from test order by v1, v2 limit 10000000;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=697263.86..1669540.28 rows=8333216 width=16) (actual time=2212.437..8207.161 rows=10000000 loops=1)
   ->  Gather Merge  (cost=697263.86..1669540.28 rows=8333216 width=16) (actual time=2212.436..7600.478 rows=10000000 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Sort  (cost=696263.84..706680.36 rows=4166608 width=16) (actual time=2173.756..3105.512 rows=3333333 loops=3)
               Sort Key: v1, v2
               Sort Method: external merge  Disk: 86648kB
               ->  Parallel Seq Scan on test  (cost=0.00..95721.08 rows=4166608 width=16) (actual time=0.030..240.486 rows=3333333 loops=3)
 Planning time: 0.096 ms
 Execution time: 8537.214 ms
(10 rows)

postgres=# set max_parallel_workers_per_gather = 10;
SET
postgres=# explain analyze select * from test order by v1, v2 limit 10000000;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=431168.44..1628498.09 rows=9999860 width=16) (actual time=1525.120..13273.805 rows=10000000 loops=1)
   ->  Gather Merge  (cost=431168.44..1628498.09 rows=9999860 width=16) (actual time=1525.119..12650.621 rows=10000000 loops=1)
         Workers Planned: 4
         Workers Launched: 4
         ->  Sort  (cost=430168.39..436418.30 rows=2499965 width=16) (actual time=1472.799..2133.571 rows=2000000 loops=5)
               Sort Key: v1, v2
               Sort Method: external merge  Disk: 50336kB
               ->  Parallel Seq Scan on test  (cost=0.00..79054.65 rows=2499965 width=16) (actual time=0.047..201.405 rows=2000000 loops=5)
 Planning time: 0.077 ms
 Execution time: 13622.319 ms
(10 rows)
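One possible intuition for the slowdown (a hedged toy model, not an analysis of the real executor): each extra worker shrinks the per-worker sort, but every tuple must still pass through the leader's heap merge, whose per-tuple cost grows with log2(workers). If the per-tuple merge/transfer constant is large relative to in-worker comparison cost (an assumption here, given the tuple queues involved), adding workers can make the whole plan slower:

```python
import math

def gather_merge_model(tuples, workers, sort_c=1.0, merge_c=1.0):
    # Hypothetical wall-clock model: workers sort tuples/workers rows each
    # in parallel (N/W * log2(N/W)), then the leader serially merges W
    # sorted streams, paying ~log2(W) per returned tuple.
    per_worker = tuples / workers
    sort_cost = sort_c * per_worker * math.log2(max(per_worker, 2))
    merge_cost = merge_c * tuples * math.log2(max(workers, 2))
    return sort_cost + merge_cost

# With an expensive merge step (merge_c >> sort_c), five workers model
# out slower than two, matching the observed 13.6s vs 8.5s pattern.
two = gather_merge_model(10_000_000, 2, merge_c=10.0)
five = gather_merge_model(10_000_000, 5, merge_c=10.0)
```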

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
