From: | Daniel Bausch <bausch(at)dvs(dot)tu-darmstadt(dot)de> |
---|---|
To: | "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, Simon Riggs <simon(at)2ndQuadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | State of the on-disk bitmap index |
Date: | 2012-08-16 10:12:38 |
Message-ID: | 502CC796.9020709@dvs.tu-darmstadt.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello Jonah, Simon, and the hackers,
I am going to implement a simple kind of "encoded bitmap indexes" (EBI).
That is an index type where the bitmap columns may not only contain
only a single '1' in the set of bits belonging to a tuple. Instead, an
additional mapping table translates the distinct values of the table
column into a unique encoding. To select for a given value all bitmap
columns must be compared instead of only one. Queries that match
multiple different values (like IN lists or range queries) simplify to
less than the full set of bitmaps that needs to be compared because of
boolean logic. The total number of bitmaps required to represent unique
encodings for all different values is ceil(ld(n)), where n is the number
of distinct values. Compared to normal bitmap indexes this solves the
problem of high-cardinality columns. It is targetet at data warehousing
scenarios with insert only data.
The respective scientific paper can be found at
http://www.dvs.tu-darmstadt.de/publications/pdf/ebi_a4.pdf
I thought, it could be a good idea to base my work on the long proposed
on-disk bitmap index implementation. Regarding to the wiki, you, Jonah
and Simon, were the last devs that touched this thing. Unfortunately I
could not find the patch representing your state of that work. I could
only capture the development history up to Gianni Ciolli & Gabriele
Bartolini from the old pgsql-patches archives. Other people involved
were Jie Zhang, Gavin Sherry, Heikki Linnakangas, and Leonardo F. Are
you and the others still interested in getting this into PG? A rebase
of the most current bitmap index implementation onto master HEAD will be
the first 'byproduct' that I am going to deliver back to you.
1. Is anyone working on this currently?
2. Who has got the most current source code?
3. Is there a git of that or will I need to reconstruct the history from
the patches I collected?
Disclaimer: Although I read and manually processed most of the
executor's code during my last work, I would still call myself a newbie
and therefore will need some assistance most probably.
Regards,
Daniel
--
Daniel Bausch
Wissenschaftlicher Mitarbeiter
Technische Universität Darmstadt
Fachbereich Informatik
Fachgebiet Datenbanken und Verteilte Systeme
Hochschulstraße 10
64289 Darmstadt
Germany
Tel.: +49 6151 16 6706
Fax: +49 6151 16 6229
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2012-08-16 11:43:25 | Re: ToDo: allow to get a number of processed rows by COPY statement |
Previous Message | Marti Raudsepp | 2012-08-16 09:31:06 | UNION ALL with WHERE clause does not use Merge Append |