Re: Feature request for adoptive indexes

From: Hayk Manukyan <manukyantt(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Feature request for adoptive indexes
Date: 2021-10-26 06:49:31
Message-ID: CAF+kZOFhNJ5NwvOQksGrCKEeMuWTxg+LEF4mYwSDtYB3J-UkBA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

ok. here is the deal if I have the following index with 6 column

CREATE INDEX ON job_nlp_year_scan (job, nlp, year, scan_id, issue_flag,
sequence);

I need to specify all 6 columns in where clause in order to fully use this
index.
It will not be efficient in cases when I have 4 condition in where clause
also I should follow the order of columns.
In case of INCLUDE the 3 columns just will be in index but will not be
structured as index so it will have affect only if In select I will have
that 6 columns nothing more.

In my case I have table with ~15 columns
In my application I have to do a lot of queries with following where
clauses

1. where job = <something> and nlp = <something> and year = <something>
and SCAN_ID = <something>
2. where job = <something> and nlp = <something> and year = <something>
and ISSUE_FLAG = <something>
3. where job = <something> and nlp = <something> and year = <something>
and SEQUENCE = <something>

I don't want to index just on job, nlp, year because for each job, nlp,
year I have approximately 5000-7000 rows ,
overall table have ~50m rows so it is partitioned by job as well. So if I
build 3 separate indexes it will be huge resource.
So I am thinking of having one index which will be job, nlp, year and the
4-th layer will be other columns not just included but also in B-tree
structure.
To visualize it will be something like this:
[image: image.png]
The red part is ordinary index with nested b-trees ant the yellow part is
adaptive part so depends on
where clause optimizer can decide which direction (leaf, b-tree whatever)
to chose.
In this case I will have one index and will manage red part only once for
all three cases.
Those it make sense ?
If you need more discussion we can have short call I will try to explain
you in more detailed way.

best regards

пн, 25 окт. 2021 г. в 19:33, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>:

> Hi,
>
> On 10/25/21 16:07, Hayk Manukyan wrote:
> > Hi everyone. I want to do some feature request regarding indexes, as far
> as
> > I know this kind of functionality doesn't exists in Postgres. Here is my
> > problem :
> > I need to create following indexes:
> > Create index job_nlp_year_scan on ingest_scans_stageing
> > (`job`,`nlp`,`year`,`scan_id`);
> > Create index job_nlp_year_issue_flag on ingest_scans_stageing
> > (`job`,`nlp`,`year`,`issue_flag`);
> > Create index job_nlp_year_sequence on ingest_scans_stageing
> > (`job`,`nlp`,`year`,`sequence`);
> > As you can see the first 3 columns are the same (job, nlp, year). so if I
> > create 3 different indexes db should manage same job_nlp_year structure 3
> > times.
> > The Data Structure that I think which can be efficient in this kind of
> > scenarios is to have 'Adaptive Index' which will be something like
> > Create index job_nlp_year on ingest_scans_stageing
> > (`job`,`nlp`,`year`,(`issue_flag`,`scan_id`, `sequence`));
> > And depend on query it will use or job_nlp_year_scan or
> > job_nlp_year_issue_flag , or job_nlp_year_sequence ( job, nlp, year and
> one
> > of ( `issue_flag` , `scan_id` , `sequence` )
> > For more description please feel free to refer me
>
> It's not very clear what exactly would the "adaptive index" do, except
> that it'd have all three columns. Clearly, the three columns can't be
> considered for ordering etc. but need to be in the index somehow. So why
> wouldn't it be enough to either to create an index with all six columns?
>
> CREATE INDEX ON job_nlp_year_scan (job, nlp, year, scan_id, issue_flag,
> sequence);
>
> or possibly with the columns just "included" in the index:
>
> CREATE INDEX ON job_nlp_year_scan (job, nlp, year) INCLUDE (scan_id,
> issue_flag, sequence);
>
> If this does not work, you either need to explain more clearly what
> exactly the adaptive indexes does, or show queries that can't benefit
> from these existing features.
>
>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Nancarrow 2021-10-26 07:26:20 Re: row filtering for logical replication
Previous Message Ronan Dunklau 2021-10-26 06:27:47 Re: pg_receivewal starting position