From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Rajeev rastogi <rajeev(dot)rastogi(at)huawei(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Index scan optimization |
Date: | 2014-09-22 07:05:17 |
Message-ID: | 541FCA2D.5090508@vmware.com |
Lists: | pgsql-hackers |
On 09/22/2014 07:47 AM, Rajeev rastogi wrote:
> I have observed scope for a considerable performance improvement in index scans with a very minor code change.
> Consider the below schema:
>
> create table tbl2(id1 int, id2 varchar(10), id3 int);
> create index idx2 on tbl2(id2, id3);
>
> Query:
> select count(*) from tbl2 where id2>'a' and id3>990000;
>
> As per the current design, it takes the following steps to retrieve index tuples:
>
> 1. Find the scan start position by searching the BTree for the first position that satisfies the first key condition, i.e. id2>'a'.
>
> 2. Fetch each tuple, starting from the position found in step 1.
>
> 3. For each tuple, check all scan key conditions; in our example, both scan key conditions are checked.
>
> 4. If the conditions match, return the tuple; otherwise the scan stops.
>
> The problem is that the first scan key condition has already been matched to find the scan start position (step 1), so any subsequent tuple will obviously also match the first scan key condition (as the records are sorted).
> So checking the first scan key condition again in step 3 seems redundant.
>
> So my proposal is to skip the check of the first scan key condition for each tuple.
The same happens in a single-column case. If you have a query like
"SELECT * FROM tbl2 where id2 > 'a'", once you've found the start
position of the scan, you know that all the rows that follow match too.
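
To make the idea concrete, here is a rough sketch in Python. It is not the nbtree code, only a toy model: the index is a sorted list of (id2, id3) pairs, and the function names (find_start, scan_recheck_all, scan_skip_first_key) are made up for illustration. It counts how many key comparisons each variant performs per scan.

```python
from bisect import bisect_right


def find_start(index, id2_low):
    """Step 1: locate the first entry whose leading column is > id2_low."""
    leading = [row[0] for row in index]  # leading column only; index is sorted
    return bisect_right(leading, id2_low)


def scan_recheck_all(index, id2_low, id3_low):
    """Behaviour as described in the proposal: every scan key is checked per tuple."""
    matches = comparisons = 0
    for id2, id3 in index[find_start(index, id2_low):]:
        comparisons += 2  # both conditions are evaluated for each tuple
        if id2 > id2_low and id3 > id3_low:
            matches += 1
    return matches, comparisons


def scan_skip_first_key(index, id2_low, id3_low):
    """Proposed variant: the start position already guarantees id2 > id2_low."""
    matches = comparisons = 0
    for _id2, id3 in index[find_start(index, id2_low):]:
        comparisons += 1  # only the second condition is evaluated
        if id3 > id3_low:
            matches += 1
    return matches, comparisons


if __name__ == "__main__":
    # Toy data: two entries that fail id2 > 'a', then 1000 that pass it.
    index = sorted([("a", 1), ("a", 999)] +
                   [(f"k{i % 50:02d}", i) for i in range(1000)])
    print(scan_recheck_all(index, "a", 990))
    print(scan_skip_first_key(index, "a", 990))
```

Both variants return the same match count; the second one just does half the comparisons in this two-key toy case. In the single-column case the per-tuple check would disappear entirely.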
> I would like to submit the patch for this improvement.
> Please provide your feedback. Also let me know if I am missing something.
Yeah, sounds like a good idea. This scenario might not arise very often,
but it should be cheap to check, so I doubt it will add any measurable
overhead to the cases where the optimization doesn't help.
- Heikki