From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Cc: | Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: gistchoose vs. bloat |
Date: | 2012-12-14 18:12:09 |
Message-ID: | 1355508729.11945.26.camel@jdavis |
Lists: | pgsql-hackers |
On Fri, 2012-12-14 at 18:36 +0200, Heikki Linnakangas wrote:
> One question: does the randomization ever help when building a new
> index? In the original test case, you repeatedly delete and insert
> tuples, and I can see how the index can get bloated in that case. But I
> don't see how bloat would occur when building the index from scratch.
When building an index on a bunch of identical int4range values (in my
test, [1,10) ), the resulting index was about 17% smaller with
randomization than without it.
If the current algorithm always chooses to insert on the left-most page,
then it seems like there would be a half-filled right page for every
split that occurs. Is that reasoning correct?
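To make that reasoning concrete, here is a minimal toy sketch, not the actual gistchoose code. It assumes a single flat level of leaf pages, a fixed per-page capacity, identical keys (so every page ties on penalty), and a split that moves half of the tuples to a new right-hand page. The names `build`, `choose_leftmost`, and `make_choose_random` are hypothetical and exist only for this illustration.

```python
import random


def build(n_tuples, capacity, choose):
    """Insert n_tuples identical keys; return the tuple count of each leaf page."""
    pages = [0]
    for _ in range(n_tuples):
        i = choose(pages)
        if pages[i] == capacity:
            # Page is full: split it, moving half of its tuples to a new page.
            moved = capacity // 2
            pages[i] = capacity - moved
            pages.append(moved)
        # The incoming tuple lands on the page that was chosen.
        pages[i] += 1
    return pages


def choose_leftmost(pages):
    # All penalties are equal, so always take the first candidate.
    return 0


def make_choose_random(seed):
    # Break penalty ties by picking uniformly among all candidates.
    rng = random.Random(seed)
    return lambda pages: rng.randrange(len(pages))


if __name__ == "__main__":
    n, cap = 10000, 100
    for name, choose in (("leftmost", choose_leftmost),
                         ("random", make_choose_random(0))):
        pages = build(n, cap, choose)
        fill = sum(pages) / (len(pages) * cap)
        print(f"{name}: {len(pages)} pages, average fill {fill:.0%}")
```

In this model the leftmost strategy never revisits a right-hand page, so every split leaves one half-filled page behind, while random tie-breaking keeps filling the older pages. A real GiST build has internal levels and a different split routine, so the numbers here should not be expected to match the measurement above.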
However, I'm having some second thoughts about the run time for index
builds. Maybe we should have a few more tests to determine if this
should really be the default or just an option?
> BTW, I don't much like the option name "randomization". It's not clear
> what's been randomized. I'd prefer something like
> "distribute_on_equal_penalty", although that's really long. Better ideas?
I agree that "randomization" is vague, but I can't think of anything
better.
Regards,
Jeff Davis