Re: Why don't we consider explicit Incremental Sort?

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why don't we consider explicit Incremental Sort?
Date: 2024-10-29 04:43:26
Message-ID: 238e8b13-89ca-4a7e-921f-66b955e3f924@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 10/10/24 09:18, Richard Guo wrote:
> On Sun, Sep 22, 2024 at 1:38 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I've pushed this patch after tweaking this part in the commit message.
> Thank you both for your reviews.
My apologies for the late review, but IMO there are some minor weaknesses.
Working on the improvement of the cost_sort model [1], adding into the
model the number of columns to be sorted and the number of comparisons
to be made (through the stadistinct of the first column), I found out
that one test doesn't like IncrementalSort in aggregates.sql:

'Utilise the ordering of merge join to avoid a Sort operation'

After discovering a little bit, I realised that although the idea that
the IncrementalSort is better than the plain Sort is generally correct,
the assumption that it is better all the time seems wrong.
For example, IncrementalSort, following an index, can choose a sort
order with a low level of distinct values of the first columns, causing
more comparisons - remember, we still sort tuples inside a group; or
using an index on multiple columns having a need in just the first
column and additional Sort on many others from the heap, etc.

Of course, I provide highly skewed cases and Sort + SeqScan basically
will resolve the issue. But I think you may discover both possible sort
ways; don't heuristically give a chance only to IncrementalSort. At
least, IndexScan can beat SeqScan because of the index clause selectivity.

[1]
https://www.postgresql.org/message-id/8742aaa8-9519-4a1f-91bd-364aec65f5cf%40gmail.com

--
regards, Andrei Lepikhov

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2024-10-29 05:06:00 Re: protocol-level wait-for-LSN
Previous Message Amit Kapila 2024-10-29 04:23:46 Re: Pgoutput not capturing the generated columns