Re: 回复: An implementation of multi-key sort

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Konstantin Knizhnik <knizhnik(at)garret(dot)ru>
Cc: Yao Wang <yao-yw(dot)wang(at)broadcom(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Wang Yao <yaowangm(at)outlook(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Hongxu Ma <hongxu(dot)ma(at)broadcom(dot)com>
Subject: Re: 回复: An implementation of multi-key sort
Date: 2024-07-09 18:58:23
Message-ID: CA+TgmoY=uUQxKZwOkMJgS0ufMSiB9a78OaViLEwz336S5Rbh8Q@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jul 7, 2024 at 2:32 AM Konstantin Knizhnik <knizhnik(at)garret(dot)ru> wrote:
> If mksort really provides an advantage only when there are a lot of
> duplicates (for prefix keys?), and with a small fraction of duplicates
> there is even some (small) regression,
> then IMHO taking into account the planner's information about the
> estimated number of distinct values seems to be really important.

I don't think we can rely on the planner's n_distinct estimates for
this at all. That information tends to be massively unreliable when we
have it at all. If we rely on it for good performance, it will be easy
to find cases where it's wrong and performance is bad.

--
Robert Haas
EDB: http://www.enterprisedb.com
