Re: pg16: XX000: could not find pathkey item to sort

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pg16: XX000: could not find pathkey item to sort
Date: 2023-10-03 07:16:07
Message-ID: CAApHDvo7RzcQYw-gnkZr6QCijCqf8vJLkJ4XFk-KawvyAw109Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 3 Oct 2023 at 09:11, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I'm concerned that this patch will be too much overhead when creating
> paths when a PathKey's EquivalenceClass has a large number of members
> from partitioned tables.

I just tried out the patch to see how much it affects the performance
of the planner. I think we need to find a better way to strip off the
pathkeys for the columns that have been aggregated.

Setup:
create table lp (a int, b int) partition by list (a);
select 'create table lp'||x||' partition of lp for values in('||x||')'
from generate_series(0,999)x;
\gexec

\pset pager off
set enable_partitionwise_aggregate=1;

Benchmark query:
explain (summary on) select a,count(*) from lp group by a;

Master:
Planning Time: 23.945 ms
Planning Time: 23.887 ms
Planning Time: 23.927 ms

perf top:
7.39% libc.so.6 [.] __memmove_avx_unaligned_erms
6.98% [kernel] [k] clear_page_rep
5.69% postgres [.] bms_is_subset
5.07% postgres [.] fetch_upper_rel
4.41% postgres [.] bms_equal

Patched:
Planning Time: 41.410 ms
Planning Time: 41.474 ms
Planning Time: 41.488 ms

perf top:
19.02% postgres [.] bms_is_subset
6.91% postgres [.] find_ec_member_matching_expr
5.93% libc.so.6 [.] __memmove_avx_unaligned_erms
5.55% [kernel] [k] clear_page_rep
4.07% postgres [.] fetch_upper_rel
3.46% postgres [.] bms_equal

> I wondered if we should instead just check
> if the subpath's pathkeys match root->group_pathkeys and if they do
> set the AggPath's pathkeys to list_copy_head(subpath->pathkeys,
> root->num_groupby_pathkeys), that'll be much cheaper, but it just
> feels a bit too much like a special case.

I tried this approach (patch attached) and it does perform better than
the other patch:

create_agg_path_fix2.patch:
Planning Time: 24.357 ms
Planning Time: 24.293 ms
Planning Time: 24.259 ms

7.45% libc.so.6 [.] __memmove_avx_unaligned_erms
6.90% [kernel] [k] clear_page_rep
5.56% postgres [.] bms_is_subset
5.38% postgres [.] bms_equal

I wonder if the attached patch is too much of a special case fix. I
guess from the lack of complaints previously that there are no other
cases where we could possibly have pathkeys that belong to columns
that are aggregated. I've not gone to much effort to see if I can
craft a case that hits this without the ORDER BY/DISTINCT aggregate
optimisation, however.

David

Attachment Content-Type Size
create_agg_path_fix2.patch application/octet-stream 935 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-10-03 07:20:45 Re: Incorrect handling of OOM in WAL replay leading to data loss
Previous Message Michael Paquier 2023-10-03 07:08:49 Re: pg_stat_get_activity(): integer overflow due to (int) * (int) for MemoryContextAllocHuge()