From: | Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> |
---|---|
To: | pgsql-patches(at)postgresql(dot)org |
Cc: | Jie Zhang <jzhang(at)greenplum(dot)com> |
Subject: | On-disk bitmap index implementation |
Date: | 2006-12-04 13:18:10 |
Message-ID: | Pine.LNX.4.58.0612042347490.18986@linuxworld.com.au |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-patches |
Hi all,
Attached is a patch implementing bitmap indexes. It includes major
enhancements on the patch submitted during feature freeze for 8.2 here[1].
In particular: much better integration with the existing bitmap scan code
with the internals of the bitmap streaming pushed down into the AM and
hidden from the executor code; completely new index creation algorithm
which reduced creation time by 20-75% depending on the data; modifications
to the encoding mechanism to suit the integration with bitmap index scans;
work on memory management; lots of code rewriting; range query support.
The code is also much cleaner now.
There are still some things Jie and I have not gotten to yet:
o Improving VACUUM support -- currently, VACUUM FULL means REINDEX for
bitmaps. Heikki Linnakangas offered to work on this. Heikki, are you
still interested?
o Determine if we need to provide anything for rm_startup, rm_cleanup,
rm_safe_restartpoint RmgrData function pointers.
o Test WAL replay more thoroughly.
o I pulled a nice optimisation out of the bitmap scan OR case where a
higher level plan could push down a bitmap and have all the child scans
just OR their data into that inside the AM. I need to get that back in.
o I need to look at tidying up the bitmap stream memory usage insider the
executor. We leak memory from ExecBitmapIndexReScan(), for example.
o Really should add some more detailed docs about why bitmap indexes are
cool.
o Look into adding an AM option such that the user can determine word size
at index creation time. For higher-cardinality data (above 1000 distinct
values), 16 bit word sizes can really help with performance. Although
the word size is not just assumed to be a certain size across the code,
macros are used extensively to interact with the word size. Making it
different for each index might be a little messy.
Comments please!
Gavin
[1] http://archives.postgresql.org/pgsql-patches/2006-09/msg00216.php
Attachment | Content-Type | Size |
---|---|---|
bitmap-4.diff | text/plain | 318.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Oleg Bartunov | 2006-12-04 13:21:08 | Re: 8.2.0 pdf |
Previous Message | Devrim GUNDUZ | 2006-12-04 09:46:10 | 8.2.0 pdf |