From: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Date: | 2022-11-30 05:51:03 |
Message-ID: | CAFBsxsHgP5LP9q=Rq_3Q2S6Oyut67z+Vpemux+KseSPXhJF7sg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
There are a few things up in the air, so I'm coming back to this list to
summarize and add a recent update:
On Mon, Nov 14, 2022 at 7:59 PM John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
wrote:
>
> - See how much performance we actually gain from tagging the node kind.
Needs a benchmark that has enough branch mispredicts and L2/3 misses to
show a benefit. Otherwise either neutral or worse in its current form,
depending on compiler(?). Put off for later.
> - Try additional size classes while keeping the node kinds to only four.
This is relatively simple and effective. If only one additional size class
(total 5) is coded as a placeholder, I imagine it will be easier to rebase
shared memory logic than using this technique everywhere possible.
> - Optimize node128 insert.
I've attached a rough start at this. The basic idea is borrowed from our
bitmapset nodes, so we can iterate over and operate on word-sized (32- or
64-bit) types at a time, rather than bytes. To make this easier, I've moved
some of the lower-level macros and types from bitmapset.h/.c to
pg_bitutils.h. That's probably going to need a separate email thread to
resolve the coding style clash this causes, so that can be put off for
later. This is not meant to be included in the next patchset. For
demonstration purposes, I get these results with a function that repeatedly
deletes the last value from a mostly-full node128 leaf and re-inserts it:
select * from bench_node128_load(120);
v11
NOTICE: num_keys = 14400, height = 1, n1 = 0, n4 = 0, n15 = 0, n32 = 0,
n61 = 0, n128 = 121, n256 = 0
fanout | nkeys | rt_mem_allocated | rt_sparseload_ms
--------+-------+------------------+------------------
120 | 14400 | 208304 | 56
v11 + 0006 addendum
NOTICE: num_keys = 14400, height = 1, n1 = 0, n4 = 0, n15 = 0, n32 = 0,
n61 = 0, n128 = 121, n256 = 0
fanout | nkeys | rt_mem_allocated | rt_sparseload_ms
--------+-------+------------------+------------------
120 | 14400 | 208816 | 34
I didn't test inner nodes, but I imagine the difference is bigger. This
bitmap style should also be used for the node256-leaf isset array simply to
be consistent and avoid needing single-use macros, but that has not been
done yet. It won't make a difference for performance because there is no
iteration there.
> - Try templating out the differences between local and shared memory.
I hope to start this sometime after the crashes on 32-bit are resolved.
--
John Naylor
EDB: http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
v11-0006-addendum-bitmapword-node128.patch.txt | text/plain | 12.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2022-11-30 05:55:42 | Re: Strange failure on mamba |
Previous Message | Michael Paquier | 2022-11-30 05:50:16 | Re: Tests for psql \g and \o |