From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: gistchoose vs. bloat
Date: 2013-01-24 19:26:15
Message-ID: 51018AD7.5040402@vmware.com
Lists: pgsql-hackers
On 21.01.2013 15:19, Heikki Linnakangas wrote:
> On 21.01.2013 15:06, Tom Lane wrote:
>> Jeff Davis <pgsql(at)j-davis(dot)com> writes:
>>> On Mon, 2013-01-21 at 00:48 -0500, Tom Lane wrote:
>>>> I looked at this patch. ISTM we should not have the option at all but
>>>> just do it always. I cannot believe that always-go-left is ever a
>>>> preferable strategy in the long run; the resulting imbalance in the
>>>> index will surely kill any possible benefit. Even if there are some
>>>> cases where it miraculously fails to lose, how many users are going to
>>>> realize that applies to their case and make use of the option?
>>
>>> Sounds good to me.
>>
>>> If I remember correctly, there was also an argument that it may be
>>> useful for repeatable test results. That's a little questionable for
>>> performance (except in those cases where few penalties are identical
>>> anyway), but could plausibly be useful for a crash report or something.
>>
>> Meh. There's already a random decision, in the equivalent place and for
>> a comparable reason, in btree (cf _bt_findinsertloc). Nobody's ever
>> complained about that being indeterminate, so I'm unconvinced that
>> there's a market for it with gist.
>
> I wonder if it would work for gist to do something similar to
> _bt_findinsertloc, and have a bias towards the left page, but sometimes
> descend to one of the pages to the right. You would get the cache
> locality of usually descending down the same subtree, but also fill the
> pages to the right. Jeff / Alexander, want to give that a shot?
I did some experimenting with that. I used the same test case Alexander
did, with geonames data, and compared the unpatched version, the original
patch, and the attached patch, which prefers the first "best" tuple found
but still sometimes moves on to one of the other, equally good ones.
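To illustrate the tie-breaking logic, here's a minimal standalone
simulation (written just for this description, not the patch itself; the
real gistchoose works on index tuples and per-column penalties):

    /*
     * Simulation of biased tie-breaking: scan the candidates, track the
     * lowest penalty, and when a later candidate ties with the current
     * best, make a one-time random decision whether to keep the current
     * best at all.
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Choose a candidate with minimal penalty, biased toward earlier ones. */
    static int
    choose_biased(const double *penalty, int n)
    {
        int     best = 0;
        int     keep_current_best = -1; /* -1: not decided yet for this best */
        int     i;

        for (i = 1; i < n; i++)
        {
            if (penalty[i] < penalty[best])
            {
                /* strictly better: always switch, forget the old decision */
                best = i;
                keep_current_best = -1;
            }
            else if (penalty[i] == penalty[best])
            {
                /*
                 * Tie: decide once, with 1/2 probability, whether the
                 * current best is kept (the "patched-2" variant; a 9/10
                 * keep chance would correspond to "patched-10").
                 */
                if (keep_current_best == -1)
                    keep_current_best = rand() % 2;
                if (!keep_current_best)
                {
                    best = i;
                    keep_current_best = -1;
                }
            }
        }
        return best;
    }

    int
    main(void)
    {
        /* three equally good candidates, then a clearly worse one */
        double  penalty[] = {1.0, 1.0, 1.0, 5.0};
        int     hist[4] = {0, 0, 0, 0};
        int     trial, i;

        srand((unsigned) time(NULL));
        for (trial = 0; trial < 100000; trial++)
            hist[choose_biased(penalty, 4)]++;

        /* expect roughly 50% / 25% / 25% / 0%: biased left, but spreading */
        for (i = 0; i < 4; i++)
            printf("candidate %d: %d\n", i, hist[i]);
        return 0;
    }

With three equal candidates this picks the first about half the time and
each of the others a quarter of the time, so insertions still mostly go
down the same subtree, but the tied pages do get filled eventually.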
    testname    | initsize | finalsize | idx_blks_read | idx_blks_hit
----------------+----------+-----------+---------------+--------------
 patched-10-4mb | 75497472 |  90202112 |       5853604 |     10178331
 unpatched-4mb  | 75145216 |  94863360 |       5880676 |     10185647
 unpatched-4mb  | 75587584 |  97165312 |       5903107 |     10183759
 patched-2-4mb  | 74768384 |  81403904 |       5768124 |     10193738
 origpatch-4mb  | 74883072 |  82182144 |       5783412 |     10185373
All these tests were performed with shared_buffers=4MB, so that the
index won't fit completely in shared buffers, and I could use the
idx_blks_read/hit ratio as a measure of cache-friendliness. The
"origpatch" test used a simplified version of Alexander's patch, see
attached; it's the same as the original, but with the randomization
option and the GiST-build-related code removed. The patched-10 and
patched-2 tests are two variants of my patch, with a 1/10 and 1/2
chance, respectively, of moving to the next equally good tuple. The
differences in cache hit ratio might just be a result of the different
index sizes. I included two unpatched runs above to show that there's a
fair amount of noise in these tests; that's because of the random
nature of the test case, which picks rows to delete and insert at random.
I think the conclusion is that all of these patches are effective. The
1/10 variant is the least effective, as expected, since its behavior is
closest to the unpatched one. The 1/2 variant seems as good as the
original patch.
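For contrast, the original patch's behavior is roughly that of a uniform
choice among all tied candidates, which can be sketched (again
simplified) as a drop-in replacement for choose_biased() in the
simulation above, using classic reservoir sampling over the ties:

    #include <stdlib.h>

    /*
     * Uniform tie-breaking, approximating the original patch: every
     * candidate tied with the current minimum is equally likely to win.
     */
    static int
    choose_uniform(const double *penalty, int n)
    {
        int     best = 0;
        int     nties = 1;      /* candidates seen with the current minimum */
        int     i;

        for (i = 1; i < n; i++)
        {
            if (penalty[i] < penalty[best])
            {
                best = i;
                nties = 1;
            }
            else if (penalty[i] == penalty[best])
            {
                /* switch with probability 1/nties: uniform over all ties */
                nties++;
                if (rand() % nties == 0)
                    best = i;
            }
        }
        return best;
    }

Three equal candidates then come out at roughly 33% each, which spreads
the insertions more evenly but gives up more of the cache locality that
the left bias provides.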
I performed another test, inserting 1000000 duplicate values into an
empty table.
  testname  | finalsize | idx_blks_read | idx_blks_hit
------------+-----------+---------------+--------------
 unpatched  |  89030656 |         21350 |      2972033
 patched-10 |  88973312 |         21450 |      2971920
 patched-2  |  88481792 |         22947 |      2970260
 origpatch  |  61186048 |        761817 |      2221500
The original patch produces a much smaller index in this test, but
since it spreads the inserts randomly across the index, the pages are
also accessed randomly, which is bad for caching.
A table full of duplicates isn't a very realistic workload, but overall
I'm leaning towards my version of this patch (gistchoose-2.patch). It
has less potential to cause regressions in existing applications, while
being just as effective in the original scenario of repeated delete+insert.
- Heikki
Attachment                 | Content-Type     | Size
---------------------------+------------------+-----------
duplicatetest.sh           | application/x-sh | 625 bytes
gistbloat.sh               | application/x-sh | 1.9 KB
gistchoose-2.patch         | text/x-diff      | 2.7 KB
origpatch-simplified.patch | text/x-diff      | 2.5 KB