| From: | Heikki Linnakangas <heikki(at)enterprisedb(dot)com> | 
|---|---|
| To: | Hannu Krosing <hannu(at)skype(dot)net> | 
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: [PATCHES] Bitmapscan changes | 
| Date: | 2007-03-15 09:47:53 | 
| Message-ID: | 45F91649.10900@enterprisedb.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers pgsql-patches | 
Hannu Krosing wrote:
> Ühel kenal päeval, K, 2007-03-14 kell 10:22, kirjutas Heikki
> Linnakangas:
>> Clustered indexes have roughly the same performance effect and use cases 
>> as clustered indexes on MS SQL Server, and Index-Organized-Tables on 
>> Oracle, but the way I've implemented them is significantly different. On 
>> other DBMSs, the index and heap are combined to a single b-tree 
>> structure. The way I've implemented them is less invasive, there's no 
>> changes to the heap for example, and it doesn't require moving live tuples.
> 
> Do you keep visibility info in the index ?
No.
> If there is no visibility data in index, then I can't see, how it gets
> the same performance effect as Index-Organized-Tables, as lot of random
> heap access is still needed.
Let me illustrate the effect in the best case, with a table that 
consists of just the key:
Normal b-tree:
Root -> leaf -> heap
aaa -> aaa -> aaa
        bbb -> bbb
        ccc -> ccc
ddd -> ddd -> ddd
        eee -> eee
        fff -> fff
ggg -> ggg -> ggg
        hhh -> hhh
        iii -> iii
Clustered b-tree:
Root -> heap
aaa -> aaa
        bbb
        ccc
ddd -> ddd
        eee
        fff
ggg -> ggg
        hhh
        iii
The index is much smaller, one level shallower in the best case. A 
smaller index means that more of it fits in cache. If you're doing 
random access through the index, that means that you need to do less I/O 
because you don't need to fetch so many index pages. You need to access 
the heap anyway for the visibility information, as you pointed out, but 
the savings are coming from having to do less index I/O.
How close to the best case do you get in practice? It depends on your 
schema, narrow tables or tables with wide keys gain the most, and on the 
clusteredness of the table.
-- 
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Heikki Linnakangas | 2007-03-15 09:53:32 | Re: [PATCHES] Bitmapscan changes | 
| Previous Message | Peter Eisentraut | 2007-03-15 08:11:52 | SoC ECPG Enhancements | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Heikki Linnakangas | 2007-03-15 09:53:32 | Re: [PATCHES] Bitmapscan changes | 
| Previous Message | Simon Riggs | 2007-03-15 08:27:22 | Re: Synchronized Scan WIP patch |