From: | Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Andrey Borodin <amborodin(at)acm(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com> |
Subject: | Re: WIP: Covering + unique indexes. |
Date: | 2016-09-26 15:17:41 |
Message-ID: | 374642a5-b84e-faab-5d5c-18e558bde195@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
24.09.2016 15:36, Amit Kapila:
> On Wed, Sep 21, 2016 at 6:51 PM, Anastasia Lubennikova
> <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>> 20.09.2016 08:21, Amit Kapila:
>>
>> On Tue, Sep 6, 2016 at 10:18 PM, Anastasia Lubennikova
>> <a(dot)lubennikova(at)postgrespro(dot)ru> wrote:
>>
>> 28.08.2016 09:13, Amit Kapila:
>>
>>
>> The problem seems really tricky, but the answer is simple.
>> We store included columns unordered. It was mentioned somewhere in
>> this thread.
>>
> Is there any fundamental problem in storing them in ordered way? I
> mean to say, you need to anyway store all the column values on leaf
> page, so why can't we find the exact location for the complete key.
> Basically use truncated key to reach to leaf level and then use the
> complete key to find the exact location to store the key. I might be
> missing some thing here, but if we can store them in ordered fashion,
> we can use them even for queries containing ORDER BY (where ORDER BY
> contains included columns).
>
I'd say that the reason for not using included columns in any
operations which require comparisons, is that they don't have opclass.
Let's go back to the example of points.
This data type don't have any opclass for B-tree, because of fundamental
reasons.
And we can not apply _bt_compare() and others to this attribute, so
we don't include it to scan key.
create table t (i int, i2 int, p point);
create index idx1 on (i) including (i2);
create index idx2 on (i) including (p);
create index idx3 on (i) including (i2, p);
create index idx4 on (i) including (p, i2);
You can keep tuples ordered in idx1, but not for idx2, partially ordered
for idx3, but not for idx4.
At the very beginning of this thread [1], I suggested to use opclass,
where possible.
Exactly the same idea, you're thinking about. But after short
discussion, we came
to conclusion that it would require many additional checks and will be
too complicated,
at least for the initial patch.
>> Let me give you an example:
>>
>> create table t (i int, p point);
>> create index on (i) including (p);
>> "point" data type doesn't have any opclass for btree.
>> Should we insert (0, '(0,2)') before (0, '(1,1)') or after?
>> We have no idea what is the "correct order" for this attribute.
>> So the answer is "it doesn't matter". When searching in index,
>> we know that only key attrs are ordered, so only them can be used
>> in scankey. Other columns are filtered after retrieving data.
>>
>> explain select i,p from t where i =0 and p <@ circle '((0,0),2)';
>> QUERY PLAN
>> -------------------------------------------------------------------
>> Index Only Scan using idx on t (cost=0.14..4.20 rows=1 width=20)
>> Index Cond: (i = 0)
>> Filter: (p <@ '<(0,0),2>'::circle)
>>
> I think here reason for using Filter is that because we don't keep
> included columns in scan keys, can't we think of having them in scan
> keys, but use only key columns in scan key to reach till leaf level
> and then use complete scan key at leaf level.
>> What should I add to README (or to documentation),
>> to make it more understandable?
>>
> May be add the data representation like only leaf pages contains all
> the columns and how the scan works. I think you can see if you can
> extend "Notes About Data Representation" and or "Other Things That Are
> Handy to Know" sections in existing README.
Ok, I'll write it in a few days.
[1] https://www.postgresql.org/message-id/55F84DF4.5030207@postgrespro.ru
--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From | Date | Subject | |
---|---|---|---|
Next Message | dbyzaa@163.com | 2016-09-26 15:39:11 | temporary table vs array performance |
Previous Message | Stephen Frost | 2016-09-26 14:48:51 | Re: Showing parallel status in \df+ |