From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | John Naylor <johncnaylorls(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
Subject: | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Date: | 2024-03-14 01:53:14 |
Message-ID: | CAD21AoA_TtELXs_6Ks3oJiSdWehBkO5h4irrqi5xq8SPzWqJNA@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Mar 14, 2024 at 9:59 AM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
>
> On Wed, Mar 13, 2024 at 9:29 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Wed, Mar 13, 2024 at 8:05 PM John Naylor <johncnaylorls(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Mar 13, 2024 at 8:39 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > > As I mentioned above, if we implement the test cases in C, we can use
> > > > the debug-build array in the test code. And we won't use it in AND/OR
> > > > operations tests in the future.
> > >
> > > That's a really interesting idea, so I went ahead and tried that for
> > > v71. This seems like a good basis for testing larger, randomized
> > > inputs, once we decide how best to hide that from the expected output.
> > > The tests use SQL functions do_set_block_offsets() and
> > > check_set_block_offsets(). The latter does two checks against a tid
> > > array, and replaces test_dump_tids().
> >
> > Great! I think that's a very good starter.
> >
> > The lookup_test() (and test_lookup_tids()) do also test that the
> > IsMember() function returns false as expected if the TID doesn't exist
> > in it, and probably we can do these tests in a C function too.
> >
> > BTW do we still want to test the tidstore by using a combination of
> > SQL functions? We might no longer need to input TIDs via a SQL
> > function.
>
> I'm not sure. I stopped short of doing that to get feedback on this
> much. One advantage with SQL functions is we can use generate_series
> to easily input lists of blocks with different numbers and strides,
> and array literals for offsets are a bit easier. What do you think?
While I'm not a fan of the following part, I agree that it makes sense
to use SQL functions for test data generation:
-- Constant values used in the tests.
\set maxblkno 4294967295
-- The maximum number of heap tuples (MaxHeapTuplesPerPage) in 8kB block is 291.
-- We use a higher number to test tidstore.
\set maxoffset 512
It would also be easier for developers to test the tidstore with their
own data sets. So I agree with the current approach: use SQL functions
for data generation and do the actual tests inside C functions. Would it
be convenient for developers if we had functions like generate_tids()
and generate_random_tids() to generate TIDs that they can pass to
do_set_block_offsets()? They could then call check_set_block_offsets()
and the other functions for the actual lookup and iteration tests.
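To make the idea concrete, here is a rough standalone sketch of what such
generators could produce. It is not code from the patch: SketchTid and every
sketch_* name are hypothetical, it does not use the PostgreSQL or tidstore
APIs, and the parameters (start, end, stride, max_gap) are assumptions made
only for illustration. The membership check stands in for the kind of
verification a C test function could do against a reference array, including
returning false for a TID that was never added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a TID: (block number, offset number). */
typedef struct SketchTid
{
	uint32_t	block;
	uint16_t	offset;
} SketchTid;

/*
 * Hypothetical generate_tids(): blocks start, start + stride, ... up to and
 * including end, each with offsets 1..noffsets.  Returns the number of TIDs
 * written, or 0 if the arguments are invalid or "out" has no room.
 */
size_t
sketch_generate_tids(uint32_t start, uint32_t end, uint32_t stride,
					 uint16_t noffsets, SketchTid *out, size_t cap)
{
	size_t		n = 0;

	assert(out != NULL);
	if (stride == 0 || start > end)
		return 0;

	/* 64-bit counters avoid wraparound at block 4294967295 or offset 65535 */
	for (uint64_t blk = start; blk <= end; blk += stride)
	{
		for (uint32_t off = 1; off <= noffsets; off++)
		{
			if (n >= cap)
				return 0;
			out[n].block = (uint32_t) blk;
			out[n].offset = (uint16_t) off;
			n++;
		}
	}
	return n;
}

/* Small deterministic PRNG (xorshift32) so a failing seed can be replayed. */
static uint32_t
sketch_next_random(uint32_t *state)
{
	uint32_t	x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Hypothetical generate_random_tids(): "count" TIDs on strictly increasing
 * blocks (random gap in 1..max_gap), one random offset in 1..max_offset each.
 * Returns the number of TIDs written; this is less than "count" only if the
 * block number space runs out.
 */
size_t
sketch_generate_random_tids(uint32_t seed, size_t count, uint32_t max_gap,
							uint16_t max_offset, SketchTid *out, size_t cap)
{
	uint32_t	state = seed ? seed : 1;	/* xorshift state must be nonzero */
	uint64_t	blk = 0;

	assert(out != NULL);
	if (count > cap || max_gap == 0 || max_offset == 0)
		return 0;

	for (size_t i = 0; i < count; i++)
	{
		blk += 1 + sketch_next_random(&state) % max_gap;
		if (blk > UINT32_MAX)
			return i;
		out[i].block = (uint32_t) blk;
		out[i].offset = (uint16_t) (1 + sketch_next_random(&state) % max_offset);
	}
	return count;
}

/* Binary search over an array sorted by (block, offset); 1 if found, else 0. */
int
sketch_is_member(const SketchTid *tids, size_t n, uint32_t block, uint16_t offset)
{
	size_t		lo = 0;
	size_t		hi = n;

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (tids[mid].block == block && tids[mid].offset == offset)
			return 1;
		if (tids[mid].block < block ||
			(tids[mid].block == block && tids[mid].offset < offset))
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}
```

In the test module, the generated TIDs would go through
do_set_block_offsets() rather than a plain array, and
check_set_block_offsets() would do the comparison; the sketch only shows
the shape of the data and the negative-lookup check.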
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com