Re: BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Alexander Alexander <alexander(dot)berezin3000(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18477: A specific SQL query with "ORDER BY ... NULLS FIRST" is performing poorly if an ordering column is n
Date: 2024-06-06 05:04:36
Message-ID: CAApHDvqdSj8Z1tWMZRrs15a9vz6XGiDhEGqr3rEWFEthWT-tvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Thu, 6 Jun 2024 at 15:48, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Yes, we could possibly set things up so that an index with nulls last
> is considered to match a query that specifies NULLS FIRST if we know
> that the column is not-nullable. But I refuse to believe that this
> would be a good use of either development effort or planner cycles.
> AFAICS the problem is purely self-inflicted damage: why is the user
> specifying NULLS FIRST if he knows the column is not-null?

I do agree with you when I think about this for the limited scope that
you've described here. However, I think you've constrained your
thinking of the usefulness of such an optimisation far more narrowly
than it could be applied.

While I've not given this much thought, it did occur to me that if you
had two nullable columns which are in the same EquivalanceClass, which
was generated by a strict equality clause. If these two columns are
not in the same table, what's the reason, assuming the join condition
is strict that we couldn't perform a Merge Join using presorted
results from a NULLS FIRST index on one side of the join and a NULLS
LAST on the other side?

If the user has some genuine reason for creating a NULLS FIRST index
for some other query, then it might be nice if we were able to use
that index for Merge Joins instead of them having to create another
index to speed up the join query.

I've no plans to go and do anything to improve this situation, I just
didn't want this thread to be left in a state that would put off
anyone from making improvements in this area. Having the ability to
better reuse existing indexes is a worthy goal, at least in my book.

If an EquivalenceClass had some ability to track non-nullness of its
members based on strict OpExprs, then we probably could optimise these
sorts of queries better. I don't know how that would work with
canonical PathKeys, but the EquivalenceClass must at least exist
before a PathKey can, so maybe PathKey.pk_nulls_first can be left
false when the EquivalenceClass states that no nulls can exist after
evaluation of the OpExprs that allowed the EquivalenceClass to exist.
There may be some hazards there I've not thought about, but it seems
like considering those would be part of the work of anyone working on
a patch to improve this area.

David

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Bertrand Drouvot 2024-06-06 06:05:11 Re: error "can only drop stats once" brings down database
Previous Message Michael Paquier 2024-06-06 04:44:26 Re: error "can only drop stats once" brings down database