From: | Daniel Bausch <bausch(at)dvs(dot)tu-darmstadt(dot)de> |
---|---|
To: | Gianni Ciolli <gianni(dot)ciolli(at)2ndquadrant(dot)it> |
Cc: | Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, pgsql-hackers(at)postgresql(dot)org, Gabriele Bartolini <gabriele(dot)bartolini(at)2ndquadrant(dot)it> |
Subject: | Re: State of the on-disk bitmap index |
Date: | 2012-09-05 11:37:59 |
Message-ID: | 50473997.2050706@dvs.tu-darmstadt.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi Gianni!
Thank you for your attention and response!
>> I used the (more recent) patches posted by Gianni Ciolli in 2008 and
>> currently am in the process of porting those to master HEAD -- like I
>> promised.
>
> Back in 2008 the PostgreSQL project wasn't using git, and I wasn't
> either; hence that patch is the best starting point I can find.
Ok, fine. However, while I do not find the mail at the moment, I think,
someone said, he fixed the VACUUM. Additionally, the Wiki lists Simon
and Jonah as the last authors, pretending they prepared a patch for 8.5.
>> IIRC, it was already shown that bitmap indexes can speed up TPC-H
>> queries. I will compare B+-tree, bitmap, and encoded bitmap indexes.
>
> I think what Albe meant (also what we attempted back then, if memory
> serves me, but without reaching completion) is a set of tests which
> show a significant benefit of your implementation over the existing
> index type implementations in PostgreSQL, to justify the increased
> complexity of the source code.
>
> The kind of test I have in mind is: a big table T with a
> low-cardinality column C, such that a btree index on C is
> significantly larger than the corresponding bitmap index on the same
> column.
>
> Create the bitmap index, and run a query like
>
> SELECT ... FROM T WHERE C = ...
>
> more than once; then you should notice that subsequent scans are much
> faster than the first run, because the index is small enough to fit
> the shared memory and will not need to be reloaded from disk at every
> scan.
>
> Then drop the bitmap index, and create a btree index on the same
> column; this time the index will be too large and subsequent scans
> will be slow, because the index blocks must be reloaded from disk at
> every scan.
>
> Hope that helps;
Is that, what your bmi-perf-test.tar.gz from 2008 does? I did not look
into that. I will at least do something like you just described plus
some TPC-H test. As the encoding helps against the cardinality
problems, I will also draw comparisons with different cardinalities.
Yours sincerely,
Daniel
--
Daniel Bausch
Wissenschaftlicher Mitarbeiter
Technische Universität Darmstadt
Fachbereich Informatik
Fachgebiet Datenbanken und Verteilte Systeme
Hochschulstraße 10
64289 Darmstadt
Germany
Tel.: +49 6151 16 6706
Fax: +49 6151 16 6229
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2012-09-05 12:56:57 | Re: Proof of concept: standalone backend with full FE/BE protocol |
Previous Message | Kohei KaiGai | 2012-09-05 11:30:37 | Re: [bugfix] sepgsql didn't follow the latest core API changes |