Re: WIP: Upper planner pathification

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: WIP: Upper planner pathification
Date: 2016-03-13 01:03:54
Message-ID: 20160313010354.v7gygdse34vylp5a@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2016-03-12 12:22:01 -0500, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On 2016-03-10 23:38:14 -0500, Tom Lane wrote:
> >> I'll do it ... just send me the list.
>
> > After exporting make_agg, make_limit, make_sort_from_sortclauses and
> > making some trivial adjustments due to the pull_var_clause changes
> > change, Citus' tests pass on 9.6, bar some noise.
>
> OK, done.

Thanks.

> > Pathification made
> > some plans switch from hash-agg to sort-agg, and the other way round;
> > but that's obviously ok.
>
> I wonder whether that's pathification per se.

If you're interested enough, I've uploaded a dump of the schema relevant
table to http://anarazel.de/t/lineitem_95_96_plan.dump.gz

the affected query is (after ANALYZE lineitem_102009)

EXPLAIN SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, count(l_extendedprice) AS avg, array_agg(l_orderkey) AS array_agg
FROM lineitem_102009 lineitem
WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 5500) AND (l_orderkey < 9500)) GROUP BY l_quantity;

=# SELECT version();
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ version │
├──────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by gcc-5.real (Debian 5.3.1-10) 5.3.1 20160224, 64-bit │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(1 row)

=# SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, count(l_extendedprice) AS avg, array_agg(l_orderkey) AS array_agg FROM lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 5500) AND (l_orderkey < 9500)) GROUP BY l_quantity;
┌────────────┬───────┬──────────┬─────┬────────────────────────────────────────────────┐
│ l_quantity │ count │ avg │ avg │ array_agg │
├────────────┼───────┼──────────┼─────┼────────────────────────────────────────────────┤
│ 1.00 │ 9 │ 13044.06 │ 9 │ {8997,9026,9158,9184,9220,9222,9348,9383,9476} │
│ 4.00 │ 7 │ 40868.84 │ 7 │ {9091,9120,9281,9347,9382,9440,9473} │
│ 2.00 │ 8 │ 26072.02 │ 8 │ {9030,9058,9123,9124,9188,9344,9441,9476} │
│ 3.00 │ 9 │ 39925.32 │ 9 │ {9124,9157,9184,9223,9254,9349,9414,9475,9477} │
└────────────┴───────┴──────────┴─────┴────────────────────────────────────────────────┘
(4 rows)

Time: 0.906 ms
=# EXPLAIN SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, count(l_extendedprice) AS avg, array_agg(l_orderkey) AS array_agg FROM lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 5500) AND (l_orderkey < 9500)) GROUP BY l_quantity;
┌────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├────────────────────────────────────────────────────────────────────────────────────────────┤
│ HashAggregate (cost=137.91..137.93 rows=1 width=21) │
│ Group Key: l_quantity │
│ -> Bitmap Heap Scan on lineitem_102009 lineitem (cost=13.07..137.44 rows=38 width=21) │
│ Recheck Cond: ((l_orderkey > 5500) AND (l_orderkey < 9500)) │
│ Filter: (l_quantity < '5'::numeric) │
│ -> Bitmap Index Scan on lineitem_pkey_102009 (cost=0.00..13.06 rows=478 width=0) │
│ Index Cond: ((l_orderkey > 5500) AND (l_orderkey < 9500)) │
└────────────────────────────────────────────────────────────────────────────────────────

vs.

=# SELECT version();
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│ version
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│ PostgreSQL 9.6devel on x86_64-pc-linux-gnu, compiled by gcc-6.real (Debian 6-20160228-1) 6.0.0 20160228 (experimental) [trunk revision
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
(1 row)

=# SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, count(l_extendedprice) AS avg, array_agg(l_orderkey) AS array_agg FROM lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 5500) AND (l_orderkey < 9500)) GROUP BY l_quantity;
┌────────────┬───────┬──────────┬─────┬────────────────────────────────────────────────┐
│ l_quantity │ count │ avg │ avg │ array_agg │
├────────────┼───────┼──────────┼─────┼────────────────────────────────────────────────┤
│ 1.00 │ 9 │ 13044.06 │ 9 │ {9476,9158,9184,9383,9026,9220,9222,8997,9348} │
│ 2.00 │ 8 │ 26072.02 │ 8 │ {9124,9344,9441,9123,9476,9030,9058,9188} │
│ 3.00 │ 9 │ 39925.32 │ 9 │ {9477,9124,9157,9184,9223,9254,9349,9414,9475} │
│ 4.00 │ 7 │ 40868.84 │ 7 │ {9440,9347,9473,9091,9281,9382,9120} │
└────────────┴───────┴──────────┴─────┴────────────────────────────────────────────────┘
(4 rows)

=# EXPLAIN SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, count(l_extendedprice) AS avg, array_agg(l_orderkey) AS array_agg FROM lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 5500) AND (l_orderkey < 9500)) GROUP BY l_quantity;
┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├──────────────────────────────────────────────────────────────────────────────────────────────────┤
│ GroupAggregate (cost=138.43..139.02 rows=1 width=85) │
│ Group Key: l_quantity │
│ -> Sort (cost=138.43..138.53 rows=38 width=21) │
│ Sort Key: l_quantity │
│ -> Bitmap Heap Scan on lineitem_102009 lineitem (cost=13.07..137.44 rows=38 width=21) │
│ Recheck Cond: ((l_orderkey > 5500) AND (l_orderkey < 9500)) │
│ Filter: (l_quantity < '5'::numeric) │
│ -> Bitmap Index Scan on lineitem_pkey_102009 (cost=0.00..13.06 rows=478 width=0) │
│ Index Cond: ((l_orderkey > 5500) AND (l_orderkey < 9500)) │
└──────────────────────────────────────────────────────────────────────────────────────────────────┘
(9 rows)

As you probably can guess, what made me notice this wsa the difference
in the array_agg output.

> Of the three core regression test EXPLAINs that changed in the
> pathification commit, two actually were a case of finding better
> plans. The other one was a random-looking swap between two plans with
> near-identical costs. When I looked into it, I found that the reason
> the planner liked the new plan better was that it was parallel-safe;
> add_path() saw the costs as fuzzily equal and allowed parallel-safe to
> be the determining factor in the choice. The old code hadn't done
> that because the hard-wired cost comparisons in grouping_planner()
> never took parallel-safety into account. But I'd call that a
> parallelism change, not a pathification change; it would certainly
> have appeared to be that if the patches had gone in in the opposite
> order.

I've not yet looked deep enough to determine the root cause; I did
however notice that set enable_sort = false; yields a cheaper plan than
the default one, within the fuzz range (137.91..137.93 vs 138.43..139.02).

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Cramer 2016-03-13 01:07:08 building on windows using VC 2008
Previous Message Peter Geoghegan 2016-03-12 23:55:10 Re: Refactoring speculative insertion with unique indexes a little