From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
Subject: | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Date: | 2024-03-15 07:36:10 |
Message-ID: | CANWCAZYOphTDyMmQkGCX5L_=v1_1jjwiFGxZx_h9bbNbyiOudg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Mar 14, 2024 at 7:04 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Mar 14, 2024 at 6:55 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> >
> > On Thu, Mar 14, 2024 at 12:06 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Mar 14, 2024 at 1:29 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> > > > Okay, here's an another idea: Change test_lookup_tids() to be more
> > > > general and put the validation down into C as well. First we save the
> > > > blocks from do_set_block_offsets() into a table, then with all those
> > > > blocks lookup a sufficiently-large range of possible offsets and save
> > > > found values in another array. So the static items structure would
> > > > have 3 arrays: inserts, successful lookups, and iteration (currently
> > > > the iteration output is private to check_set_block_offsets(). Then
> > > > sort as needed and check they are all the same.
> > >
> > > That's a promising idea. We can use the same mechanism for randomized
> > > tests too. If you're going to work on this, I'll do other tests on my
> > > environment in the meantime.
> >
> > Some progress on this in v72 -- I tried first without using SQL to
> > save the blocks, just using the unique blocks from the verification
> > array. It seems to work fine.
>
> Thanks!
Seems I forgot the attachment last time...there's more stuff now
anyway, based on discussion.
> > - Since there are now three arrays we should reduce max bytes to
> > something smaller.
>
> Agreed.
I went further than this, see below.
> > - Further on that, I'm not sure if the "is full" test is telling us
> > much. It seems we could make max bytes a static variable and set it to
> > the size of the empty store. I'm guessing it wouldn't take much to add
> > enough tids so that the contexts need to allocate some blocks, and
> > then it would appear full and we can test that. I've made it so all
> > arrays repalloc when needed, just in case.
>
> How about using work_mem as max_bytes instead of having it as a static
> variable? In test_tidstore.sql we set work_mem before creating the
> tidstore. It would make the tidstore more controllable by SQL queries.
My complaint is that the "is full" test is trivial, and also strange
in that max_bytes is used for two unrelated things:
- the initial size of the verification arrays, which was always larger
than necessary, and now there are three of them
- the hint to TidStoreCreate to calculate its max block size / the
threshold for being "full"
To make the "is_full" test slightly less trivial, my idea is to save
the empty store size and later add enough tids so that it has to
allocate new blocks/DSA segments, which is not that many, and then it
will appear full. I've done this and also separated the purpose of
various sizes in v72-0009/10.
Using actual work_mem seems a bit more difficult to make this work.
> > - I'm not sure it's useful to keep test_lookup_tids() around. Since we
> > now have a separate lookup test, the only thing it can tell us is that
> > lookups fail on an empty store. I arranged it so that
> > check_set_block_offsets() works on an empty store. Although that's
> > even more trivial, it's just reusing what we already need.
>
> Agreed.
Removed in v72-0007
On Fri, Mar 15, 2024 at 9:49 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> I have two questions on tidstore.c:
>
> +/*
> + * Set the given TIDs on the blkno to TidStore.
> + *
> + * NB: the offset numbers in offsets must be sorted in ascending order.
> + */
>
> Do we need some assertions to check if the given offset numbers are
> sorted expectedly?
Done in v72-0008
> ---
> + if (TidStoreIsShared(ts))
> + found = shared_rt_set(ts->tree.shared, blkno, page);
> + else
> + found = local_rt_set(ts->tree.local, blkno, page);
> +
> + Assert(!found);
>
> Given TidStoreSetBlockOffsets() is designed to always set (i.e.
> overwrite) the value, I think we should not expect that found is
> always false.
I find that a puzzling statement, since 1) it was designed for
insert-only workloads, not actual overwrite IIRC and 2) the tests will
now fail if the same block is set twice, since we just switched the
tests to use a remnant of vacuum's old array. Having said that, I
don't object to removing artificial barriers to using it for purposes
not yet imagined, as long as test_tidstore.sql warns against that.
Given the above two things, I think this function's comment needs
stronger language about its limitations. Perhaps even mention that
it's intended for, and optimized for, vacuum. You and I have long
known that tidstore would need a separate, more complex, function to
add or remove individual tids from existing entries, but it might be
good to have that documented.
Other things:
v72-0011: Test that zero offset raises an error.
v72-0013: I had wanted to microbenchmark this, but since we are
running short of time I decided to skip that, so I want to revert some
code to make it again more similar to the equivalent in tidbitmap.c.
In the absence of evidence, it seems better to do it this way.
Attachment | Content-Type | Size |
---|---|---|
v72-ART.tar.gz | application/gzip | 30.5 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2024-03-15 07:39:12 | Re: Weird test mixup |
Previous Message | Kyotaro Horiguchi | 2024-03-15 07:28:53 | Re: Inconsistent printf placeholders |