Re: Window partial fetch optimization

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Levi Aul <levi(at)leviaul(dot)com>
Cc: Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Window partial fetch optimization
Date: 2022-05-04 21:56:27
Message-ID: CAMkU=1yLqf6TxMRoVX0xJhPBpR1bH5N85eutvQokaKXa=CGzwQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, May 3, 2022 at 2:11 PM Levi Aul <levi(at)leviaul(dot)com> wrote:

> I have a “temporal table” — a table where there are multiple “versions” of
> entities, with each version having a distinct timestamp:
> CREATE TABLE contract_balance_updates (
> block_id bigint NOT NULL,
> block_signed_at timestamp(0) without time zone NOT NULL,
> contract_address bytea NOT NULL,
> holder_address bytea NOT NULL,
> start_block_height bigint NOT NULL,
> balance numeric NOT NULL
> ) PARTITION BY RANGE (block_signed_at);
>
> -- one for each partition (applied by pg_partman from a template)
> CREATE UNIQUE INDEX contract_balance_updates_pkey
> ON contract_balance_updates(
> holder_address bytea_ops,
> contract_address bytea_ops,
> start_block_height int8_ops DESC
> );
>

How does pg_partman deal with the fact that a unique index on a partitioned
table must contain the partitioning key?

It should be noted that your 3 queries don't return the same thing. The
last one returns columns holder_address, contract_address, and balance,
while the first returns all columns in the table. If you were to make the
first query return just the three columns holder_address, contract_address,
and balance and build a suitable index, then you could get it to use an
index-only scan. This should be similar to (but probably faster than) your
3rd query, without all the kerfuffle of extra scans and dummy syntax. The
index needed would be:

(holder_address bytea_ops, contract_address bytea_ops, start_block_height,
balance);

Note that in theory it could do a better job of using the index you already
have. It could compute the row_number using only the data available in the
index, then go fetch the table tuple for just the rows which pass the
row_number filter. But it just isn't smart enough to do that. (By
separating the WHERE clause from the select list into different queries,
that is essentially what your third query is tricking it into doing)

Cheers,

Jeff

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message André Hänsel 2022-05-04 23:15:43 Why is there a Sort after an Index Only Scan?
Previous Message David Rowley 2022-05-03 22:12:51 Re: Window partial fetch optimization