Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: "Anton A(dot) Melnikov" <a(dot)melnikov(at)postgrespro(dot)ru>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, benoit <benoit(at)hopsandfork(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
Date: 2024-08-07 14:59:11
Message-ID: CAH2-WzmWuo5J3+KGh_QpZLhVQAdJzb7VDwJxFgqROhg4vp-dfg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 31, 2024 at 12:47 AM Anton A. Melnikov
<a(dot)melnikov(at)postgrespro(dot)ru> wrote:
> From the 5bf748b86bc commit message:
>
> > There is a theoretical risk that removing restrictions on SAOP index
> > paths from the planner will break compatibility with amcanorder-based
> > index AMs maintained as extensions. Such an index AM could have the
> > same limitations around ordered SAOP scans as nbtree had up until now.
> > Adding a pro forma incompatibility item about the issue to the Postgres
> > 17 release notes seems like a good idea.

Here you've quoted the commit message's description of the risks
around breaking third party index AMs maintained as extensions. FWIW
that seems like a rather different problem to the one that you ran
into with your KNN nbtree patch.

> Seems, this commit broke our posted knn_btree patch. [1]
> If the point from which ORDER BY goes by distance is greater than the elements of ScalarArrayOp,
> then knn_btree algorithm will give only the first tuple. It sorts the elements of ScalarArrayOp
> in descending order and starts searching from smaller to larger
> and always expects that for each element of ScalarArrayOp there will be a separate scan.
> And now it does not work. Reproduction is described in [2].
>
> Seems it is impossible to solve this problem only from the knn-btree patch side.
> Could you advise any ways how to deal with this. Would be very grateful.

Well, you're actually patching nbtree. Your patch isn't upstream of
the nbtree code. You're going to have to reconcile your design with
the design for advancing arrays that is primarily implemented by
_bt_advance_array_keys.

I'm not entirely sure what the best way to go about doing that is --
that would require a lot of context about the KNN patch. It wouldn't
be particularly hard to selectively reenable the old Postgres 16 SAOP
behavior, though. You could conditionally force "goto new_prim_scan"
whenever the KNN patch's mechanism was in use.

That kind of approach might have unintended consequences elsewhere,
though. For example it could break things like mark/restore processing
by merge joins, which assumes that the scan's current array keys can
always be reconstructed after restoring a marked position by resetting
the array's to their first elements (or final elements if it's a
backwards scan). On the other hand, I think that you already shouldn't
expect anything like that to work with your patch. After all, it
changes the order of the tuples returned by the scan, which is already
bound to break certain assumptions made by "regular" index scans.

--
Peter Geoghegan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2024-08-07 14:59:23 Re: Interrupts vs signals
Previous Message Vitaly Davydov 2024-08-07 14:37:39 Re: Fsync (flush) all inserted WAL records