Re: testing framework for MVCC & vacuum (freeze) & heap_page_prune etc.

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: testing framework for MVCC & vacuum (freeze) & heap_page_prune etc.
Date: 2024-12-10 03:42:52
Message-ID: 87ldwoxkxv.fsf@163.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Andy Fan <zhihuifan1213(at)163(dot)com> writes:

> Hi,
>
> I'm willing to design one myself
> but it would be better have a ask first to see if there is some existing
> excellent project I can start with and contribute to.

Just to show that I'm not a person who taking things for granted, this is
my draft in this topic. I know there are lots of things missed and much
more details need to be think more, that's why I was asking if we have
existing project.

====
Key Concepts:

* MVCC
- xmin (insert, update)
- xmax (delete, update, select for update, epq query)
- CommandId
(insert/update/delete -> query)
- committs (xmin/xmax)

XID may be a state of in-progress, committed/aborted, 2pc prepared.

When during scanning tuple within a snapshot,

challenge 1: transaction stage changes:
(a) new xid. (in-progress)
(b) in-progress -> commit.
(c) in-progress -> abort
(d) new xid -> commit/abort.

Challenge 2: The old version deleted by hot update prune or vacuum.

hot update prune need hot update query.

We may also have sub-transaction, sub-transaction can be tested with
"statement_level_txn".

We also need to review the above on standby (hot-standby case)

todo: xid state crash recovery
todo: xid stage on standby (read-only).
todo: multiple IDs in each transaction.

* vacuum, freeze.

Besides the MVCC works correctly, we also need the vacuum & freeze works
correctly.

vacuum_freeze.sql. -- run 'vacuum (freeze) w;' randomly.

* error detection:

check_cnt.sql -- Check if the table count is 10000 rows all the time.
we inject random sleep time after fetching each tuple to simulate long
query, this leave the enough time for others to change their transaction
stage or prune the dead tuple etc.

check_diff.sql -- Check the result and check if it have 1...10000 IDs,
only used for troubleshooting. random sleep time is there already.

check_vacuum_freeze.sql: -- a PGSQL function to check the
pg_stat_user_tables.n_xxx. pg_class.relfrozenxid/relminmxid, they
should be able to advance after some period. If it not, some WARNING
should be added.

--
Best Regards
Andy Fan

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey M. Borodin 2024-12-10 03:50:23 Re: testing framework for MVCC & vacuum (freeze) & heap_page_prune etc.
Previous Message Andrey M. Borodin 2024-12-10 03:42:24 Re: UUID v7