Re: Avoid full GIN index scan when possible

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Marc Cousin <cousinmarc(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: Avoid full GIN index scan when possible
Date: 2019-12-25 05:25:38
Message-ID: CAPpHfdsXN5iq3rUwjuwtp_GRFmhq+1QAtgiywGM_6YM9PdGdMg@mail.gmail.com
Lists: pgsql-hackers

On Sat, Nov 23, 2019 at 2:39 AM Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru> wrote:
> Attached 8th version of the patches.

I've read this thread and decided to rewrite the patch in a way that I
find simpler and clearer. Attached is a draft patch written from
scratch, except for the regression tests, which were copied "as is".
It is based on the discussion in this thread as well as my own ideas.
It works as follows.

1) A new GinScanKey->excludeOnly flag is introduced. This flag means
that the scan key might be satisfied even if none of its entries match
the row. So, such scan keys are useful only for an additional check of
results returned by other keys. That is, an excludeOnly scan key is
designed for exclusion of already obtained results.
2) Initially no hidden scan entries are appended to
GIN_SEARCH_MODE_ALL scan keys. They are appended after getting
statistics about search modes applied to particular attributes.
3) We append only one GIN_CAT_EMPTY_QUERY scan entry when all scan
keys are GIN_SEARCH_MODE_ALL. If there is at least one normal scan key,
no GIN_CAT_EMPTY_QUERY is appended.
4) No hidden entries are appended to a GIN_SEARCH_MODE_ALL scan key if
there are normal scan keys for the same column. Otherwise, a
GIN_CAT_NULL_KEY hidden entry is appended.
5) GIN_SEARCH_MODE_ALL scan keys, which don't have GIN_CAT_EMPTY_QUERY
hidden entry, are marked with excludeOnly flag. So, they are used to
filter results of other scan keys.
6) If a GIN_CAT_NULL_KEY hidden entry is found, then the scan key
doesn't match, regardless of the result of the consistent function call.
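
To make rules 2) to 5) concrete, here is a minimal standalone sketch of
how hidden entries and the excludeOnly flag could be assigned. It is not
code from the attached patch: SketchScanKey, the MODE_* and HIDDEN_*
enums, and both functions are hypothetical stand-ins for the real GIN
structures. It also assumes that the key carrying the
GIN_CAT_EMPTY_QUERY entry gets no additional GIN_CAT_NULL_KEY entry,
which the description above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real PostgreSQL structs. */
typedef enum { MODE_DEFAULT, MODE_ALL } SketchSearchMode;
typedef enum { HIDDEN_NONE, HIDDEN_NULL_KEY, HIDDEN_EMPTY_QUERY } SketchHidden;

typedef struct
{
    int              attnum;      /* indexed column the key applies to */
    SketchSearchMode mode;        /* MODE_ALL models GIN_SEARCH_MODE_ALL */
    SketchHidden     hidden;      /* output: hidden entry chosen for the key */
    bool             excludeOnly; /* output: key only filters other results */
} SketchScanKey;

/* Is there a normal (non-ALL) scan key for the given column? */
bool
column_has_normal_key(const SketchScanKey *keys, size_t nkeys, int attnum)
{
    for (size_t i = 0; i < nkeys; i++)
        if (keys[i].mode != MODE_ALL && keys[i].attnum == attnum)
            return true;
    return false;
}

void
assign_hidden_entries(SketchScanKey *keys, size_t nkeys)
{
    bool any_normal = false;
    bool empty_query_added = false;

    assert(keys != NULL || nkeys == 0);

    /* Rule 2: gather search-mode statistics before adding hidden entries. */
    for (size_t i = 0; i < nkeys; i++)
        if (keys[i].mode != MODE_ALL)
            any_normal = true;

    for (size_t i = 0; i < nkeys; i++)
    {
        SketchScanKey *key = &keys[i];

        key->hidden = HIDDEN_NONE;
        key->excludeOnly = false;
        if (key->mode != MODE_ALL)
            continue;           /* normal keys are left untouched */

        if (!any_normal && !empty_query_added)
        {
            /* Rule 3: one EMPTY_QUERY entry, only if every key is ALL. */
            key->hidden = HIDDEN_EMPTY_QUERY;
            empty_query_added = true;
            continue;
        }

        /* Rule 5: ALL keys without EMPTY_QUERY only filter other keys. */
        key->excludeOnly = true;

        /* Rule 4: NULL_KEY entry unless the column has a normal key. */
        if (!column_has_normal_key(keys, nkeys, key->attnum))
            key->hidden = HIDDEN_NULL_KEY;
    }
}
```

Under these assumptions, two GIN_SEARCH_MODE_ALL keys on different
columns end up as one key driving the scan through GIN_CAT_EMPTY_QUERY
and one excludeOnly key with a GIN_CAT_NULL_KEY entry. If a normal key
exists on column 1, an ALL key on column 1 becomes excludeOnly with no
hidden entry at all.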

Therefore, the attached patch removes unnecessary GIN_CAT_EMPTY_QUERY
scan entries without losing the positive effect of filtering in
GIN_SEARCH_MODE_ALL scan keys.

The patch requires further polishing, including comments, minor
refactoring, etc. I'm going to continue working on this.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-Avoid-GIN-full-scan-for-empty-ALL-keys-v09.patch application/octet-stream 27.1 KB
