From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Юрий Соколов <funny(dot)falcon(at)gmail(dot)com>
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date: 2020-01-15 02:08:57
Message-ID: CAH2-Wz=Tr6mxMsKRmv_=9-05_O9QWqOzQ8GweRV2DXS6+Y38QQ@mail.gmail.com
Lists: pgsql-hackers
On Fri, Jan 10, 2020 at 1:36 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> Still, v29 doesn't resolve the following points you've raised, where I
> haven't reached a final opinion on what to do myself. These items are
> as follows (I'm quoting your modified patch file sent on January 8th
> here):
Still no progress on these items, but I am now posting v30. A new
version seems warranted, because I now want to revive a patch from a
couple of years back as part of the deduplication project -- it would
be good to get feedback on that sooner rather than later. This is a
patch that you [Heikki] are already familiar with -- the patch to
speed up compactify_tuples() [1]. Sokolov Yura is CC'd here, since he
is the original author.
The deduplication patch is much faster with this in place. For
example, with v30:
pg(at)regression:5432 [25216]=# create unlogged table foo(bar int4);
CREATE TABLE
pg(at)regression:5432 [25216]=# create index unlogged_foo_idx on foo(bar);
CREATE INDEX
pg(at)regression:5432 [25216]=# insert into foo select g from
generate_series(1, 1000000) g, generate_series(1,10) i;
INSERT 0 10000000
Time: 17842.455 ms (00:17.842)
If I revert the "Bucket sort for compactify_tuples" commit locally,
then the same insert statement takes 31.614 seconds! In other words,
the insert statement is made ~77% faster by that commit alone. The
improvement is stable and reproducible.
Clearly there is a big compactify_tuples() bottleneck that comes from
PageIndexMultiDelete(). The hot spot is quite visible with "perf top
-e branch-misses".
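For anyone who hasn't looked at the 2017/2018 patch, the core idea is that item offsets within a page are bounded by the block size, so a bucket sort can replace the comparator-driven qsort() (and its branch mispredictions). Here is a minimal, self-contained sketch of that idea only -- ItemData, N_BUCKETS, and bucket_sort_items() are illustrative names, not the names used in the actual patch, and the real code sorts in whatever order makes the subsequent tuple moves safe:

```c
#include <assert.h>

#define PAGE_SIZE 8192          /* BLCKSZ: page offsets are < 8192 */
#define N_BUCKETS 32
#define BUCKET_WIDTH (PAGE_SIZE / N_BUCKETS)

typedef struct
{
    unsigned short offset;      /* tuple's current offset within the page */
    unsigned short len;
} ItemData;

/*
 * Sort items by page offset using a two-pass bucket sort instead of
 * qsort().  Because offsets are bounded by the page size, items can be
 * distributed into buckets with no comparator calls at all; only the
 * cheap final cleanup pass over the nearly-sorted output compares keys.
 */
static void
bucket_sort_items(const ItemData *items, ItemData *sorted, int nitems)
{
    int         start[N_BUCKETS + 1] = {0};

    /* Pass 1: count how many items fall into each bucket */
    for (int i = 0; i < nitems; i++)
        start[items[i].offset / BUCKET_WIDTH + 1]++;

    /* Turn counts into each bucket's starting index (prefix sums) */
    for (int b = 1; b <= N_BUCKETS; b++)
        start[b] += start[b - 1];

    /* Pass 2: scatter items into their buckets */
    for (int i = 0; i < nitems; i++)
        sorted[start[items[i].offset / BUCKET_WIDTH]++] = items[i];

    /* Insertion sort: items are already grouped by bucket, so this is cheap */
    for (int i = 1; i < nitems; i++)
    {
        ItemData    tmp = sorted[i];
        int         j = i;

        while (j > 0 && sorted[j - 1].offset > tmp.offset)
        {
            sorted[j] = sorted[j - 1];
            j--;
        }
        sorted[j] = tmp;
    }
}
```

The point is that the hot distribution passes have no data-dependent branches, which is consistent with the branch-misses hot spot showing up in perf.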
The compactify_tuples() patch stalled because it wasn't clear if it
was worth the trouble at the time. It was originally written to
address a much smaller PageRepairFragmentation() bottleneck in heap
pruning. ISTM that deduplication alone is a good enough reason to
commit this patch. I haven't really changed anything about the
2017/2018 patch -- I need to do more review of that. We probably don't
need the qsort() inlining stuff (the bucket sort thing is the real
win), but I included it in v30 all the same.
Other changes in v30:
* We now avoid extra _bt_compare() calls within _bt_check_unique() --
no need to call _bt_compare() once per TID (once per equal tuple is
quite enough).
This is a noticeable performance win, even though the change was
originally intended to make the logic in _bt_check_unique() clearer.
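Schematically, the change looks like the following. This is a made-up sketch, not the actual _bt_check_unique() code -- PostingTuple, compare_keys(), and count_equal_tids() are hypothetical names -- but it shows why the comparator can be hoisted out of the per-TID loop once many TIDs share a single posting list tuple's key:

```c
#include <assert.h>

static int  ncompares = 0;

/* Stand-in for _bt_compare(): counts how often it is called */
static int
compare_keys(int tuplekey, int searchkey)
{
    ncompares++;
    return (tuplekey > searchkey) - (tuplekey < searchkey);
}

typedef struct
{
    int     key;                /* the (deduplicated) key value */
    int     ntids;              /* number of heap TIDs in the posting list */
} PostingTuple;

/*
 * Visit every TID whose tuple matches searchkey.  The comparator runs
 * once per tuple -- not once per TID -- which is the shape of the v30
 * change described above.
 */
static int
count_equal_tids(const PostingTuple *tuples, int n, int searchkey)
{
    int     matches = 0;

    for (int i = 0; i < n; i++)
    {
        if (compare_keys(tuples[i].key, searchkey) != 0)
            continue;           /* one comparison per equal tuple */
        for (int t = 0; t < tuples[i].ntids; t++)
            matches++;          /* per-TID work needs no key comparison */
    }
    return matches;
}
```

With three tuples of 100 TIDs each, the comparator runs 3 times rather than 300.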
* Reduced the limit on the size of a posting list tuple to 1/6 of a
page -- down from 1/3.
This seems like a good idea on the grounds that it keeps our options
open if we split a page full of duplicates due to UPDATEs rather than
INSERTs (i.e. we split a page full of duplicates that isn't also the
rightmost page among pages that store only those duplicates). A lower
limit is more conservative, and yet doesn't cost us that much space.
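Some back-of-the-envelope numbers make the "doesn't cost us that much space" point concrete. This assumes the standard 8KB BLCKSZ and 6-byte heap TIDs; the real limit calculation also has to account for tuple headers and alignment, which this sketch ignores:

```c
#include <assert.h>

#define BLCKSZ      8192
#define TID_SIZE    6           /* sizeof(ItemPointerData) */

/* Old and new caps on the size of a single posting list tuple */
static const int old_limit = BLCKSZ / 3;    /* 2730 bytes */
static const int new_limit = BLCKSZ / 6;    /* 1365 bytes */

/*
 * Even at the lower limit, one posting list tuple can still absorb on
 * the order of two hundred duplicate TIDs, so the deduplication win is
 * largely preserved.
 */
static const int tids_at_new_limit = BLCKSZ / 6 / TID_SIZE;  /* 227 */
```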
* Refined nbtsort.c/CREATE INDEX to work sensibly with non-standard
fillfactor settings.
This last item is a minor bugfix, really.
[1] https://commitfest.postgresql.org/14/1138/
--
Peter Geoghegan
| Attachment | Content-Type | Size |
|---|---|---|
| v30-0001-Add-deduplication-to-nbtree.patch | application/octet-stream | 216.2 KB |
| v30-0002-Header-for-customized-qsort.patch | application/octet-stream | 21.1 KB |
| v30-0003-Bucket-sort-for-compactify_tuples.patch | application/octet-stream | 3.7 KB |
| v30-0004-Teach-pageinspect-about-nbtree-posting-lists.patch | application/octet-stream | 18.5 KB |
| v30-0005-DEBUG-Show-index-values-in-pageinspect.patch | application/octet-stream | 4.4 KB |