Re: Why don't we consider explicit Incremental Sort?

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Why don't we consider explicit Incremental Sort?
Date: 2024-09-20 08:48:11
Message-ID: CAMbWs48GAVYiNnLePk51FGeYJ=o2cOo-tuLKzqcLkGrScA+75g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Sep 15, 2024 at 6:01 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> Hmmm ... I wasn't involved in that discussion and I haven't studied the
> thread now, but this seems a bit weird to me. If the presorted keys have
> low cardinality, can't the full sort be faster?
>
> Maybe it's not possible with the costing changes (removing the
> penalization), in which case it would be fine to not consider the full
> sort, because we'd just throw it away immediately. But if the full sort
> could be costed as cheaper, I'd say that might be wrong.

IIUC, we initially applied a 1.5x pessimism factor to the cost
estimates of incremental sort in an effort to avoid performance
regressions in cases with a large skew in the number of rows within
the presorted groups. OTOH, this also restricted our ability to
leverage incremental sort when it would be preferable.

I agree with you that sometimes the definition of 'regression' can
depend on when the alternative plans are introduced. Imagine if we
initially did not have the 1.5x pessimism factor and introduced it
later, it would be treated as a 'regression' because some queries that
could benefit from incremental sort would then have to resort to full
sort.

In commit 4a29eabd1, we removed the 1.5x pessimism factor to allow
incremental sort to have a fairer chance at being chosen against a
full sort. With this change, the cost model now tends to favor
incremental sort as being cheaper than full sort in the presence of
presorted keys, making it unnecessary to consider full sort in such
cases, because, as you mentioned, we'd just throw the full sort path
away immediately. So 4a29eabd1 also modified the logic so that if
there are presorted keys, we only create an incremental sort path.
As for the potential performance regressions caused by these changes,
4a29eabd1 opted to use enable_incremental_sort as an escape hatch.

I think the same theory applies to mergejoins too. We can leverage
incremental sort if it is enabled and there are presorted keys,
assuming that it is always faster than full sort, and use the GUC as
an escape hatch in the worst case.

So here is the v2 patch, which is almost the same as v1, but includes
changes in test cases and comments, and also includes a commit
message. I'll put it in the commitfest.

Thanks
Richard

Attachment Content-Type Size
v2-0001-Consider-explicit-incremental-sort-for-mergejoins.patch application/octet-stream 14.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2024-09-20 09:31:24 Re: not null constraints, again
Previous Message jian he 2024-09-20 08:27:28 Re: not null constraints, again