Re: Adding skip scan (including MDAM style range skip scan) to nbtree

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Date: 2024-07-06 00:44:50
Message-ID: CAH2-Wzk6Ah4YZgA+AW7fZMRzT=RqHL9SO=r4nzaWapPKr1OXnw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 5, 2024 at 7:04 AM Aleksander Alekseev
<aleksander(at)timescale(dot)com> wrote:
> Test2 with "char" doesn't seem to benefit from the patch anymore
> (pretty sure it did in v1). It always chooses Parallel Seq Scans even
> if I change the condition to `WHERE n > 999_995_000` or `WHERE n =
> 999_997_362`. Is it an expected behavior?

The "char" opclass's skip support routine was totally broken in v1, so
its performance isn't really relevant. In any case v2 didn't make any
changes to the costing, so I'd expect it to use exactly the same query
plan as v1.

> The query uses Index Scan, however the performance is worse than with
> Seq Scan chosen before the patch. It doesn't matter if I choose '>' or
> '=' condition.

That's because the index has a leading/skipped column of type
"numeric", which isn't a supported type just yet (a supported B-Tree
opclass, actually).

The optimization is effective if you create the expression index with
a cast to integer:

CREATE INDEX test4_idx ON test4 USING btree(((extract(year from d))::int4),n);

This performs much better. Now I see "DEBUG: skipping 1 index
attributes" when I run the query "EXPLAIN (ANALYZE, BUFFERS) SELECT
COUNT(*) FROM test4 WHERE n > 900_000_000", which indicates that the
optimization has in fact been used as expected. There are far fewer
buffers hit with this version of your test4, which also indicates that
the optimization has been effective.

Note that the original numeric expression index test4 showed "DEBUG:
skipping 0 index attributes" when the test query ran, which indicated
that the optimization couldn't be used. I suggest that you look out
for that, by running "set client_min_messages to debug2;" from psql
when testing the patch.

> It runs fast and choses Index Only Scan. But then I discovered that
> without the patch Postgres also uses Index Only Scan for this query. I
> didn't know it could do this - what is the name of this technique?

It is a full index scan. These have been possible for many years now
(possibly well over 20 years).

Arguably, the numeric case that didn't use the optimization (your
test4) should have been costed as a full index scan, but it wasn't --
that's why you didn't get a faster sequential scan, which would have
made a little bit more sense. In general, the costing changes in the
patch are very rough.

That said, this particular problem (the test4 numeric issue) should be
fixed by inventing a way for nbtree to use skip scan with types that
lack skip support. It's not primarily a problem with the costing. At
least not in my mind.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2024-07-06 00:57:58 Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Previous Message Thomas Munro 2024-07-06 00:23:26 Clang function pointer type warnings in v14, v15