Re: [PATCHES] Bitmapscan changes

From: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] Bitmapscan changes
Date: 2007-03-15 09:47:53
Message-ID: 45F91649.10900@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Hannu Krosing wrote:
> Ühel kenal päeval, K, 2007-03-14 kell 10:22, kirjutas Heikki
> Linnakangas:
>> Clustered indexes have roughly the same performance effect and use cases
>> as clustered indexes on MS SQL Server, and Index-Organized-Tables on
>> Oracle, but the way I've implemented them is significantly different. On
>> other DBMSs, the index and heap are combined to a single b-tree
>> structure. The way I've implemented them is less invasive, there's no
>> changes to the heap for example, and it doesn't require moving live tuples.
>
> Do you keep visibility info in the index ?

No.

> If there is no visibility data in index, then I can't see, how it gets
> the same performance effect as Index-Organized-Tables, as lot of random
> heap access is still needed.

Let me illustrate the effect in the best case, with a table that
consists of just the key:

Normal b-tree:

Root -> leaf -> heap

aaa -> aaa -> aaa
bbb -> bbb
ccc -> ccc
ddd -> ddd -> ddd
eee -> eee
fff -> fff
ggg -> ggg -> ggg
hhh -> hhh
iii -> iii

Clustered b-tree:

Root -> heap

aaa -> aaa
bbb
ccc
ddd -> ddd
eee
fff
ggg -> ggg
hhh
iii

The index is much smaller, one level shallower in the best case. A
smaller index means that more of it fits in cache. If you're doing
random access through the index, that means that you need to do less I/O
because you don't need to fetch so many index pages. You need to access
the heap anyway for the visibility information, as you pointed out, but
the savings are coming from having to do less index I/O.

How close to the best case do you get in practice? It depends on your
schema, narrow tables or tables with wide keys gain the most, and on the
clusteredness of the table.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2007-03-15 09:53:32 Re: [PATCHES] Bitmapscan changes
Previous Message Peter Eisentraut 2007-03-15 08:11:52 SoC ECPG Enhancements

Browse pgsql-patches by date

  From Date Subject
Next Message Heikki Linnakangas 2007-03-15 09:53:32 Re: [PATCHES] Bitmapscan changes
Previous Message Simon Riggs 2007-03-15 08:27:22 Re: Synchronized Scan WIP patch