| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Anastasia Lubennikova <lubennikovaav(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie> |
| Subject: | Re: Testing autovacuum wraparound (including failsafe) |
| Date: | 2022-06-30 01:40:12 |
| Message-ID: | CAD21AoDVhkXp8HjpFO-gp3TgL6tCKcZQNxn04m01VAtcSi-5sA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On Tue, Feb 1, 2022 at 11:58 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Fri, Jun 11, 2021 at 10:19 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> >
> > Hi,
> >
> > On 2021-06-10 16:42:01 +0300, Anastasia Lubennikova wrote:
> > > Cool. Thank you for working on that!
> > > Could you please share a WIP patch for the $subj? I'd be happy to help with
> > > it.
> >
> > I've attached the current WIP state, which hasn't evolved much since
> > this message... I put the test in src/backend/access/heap/t/001_emergency_vacuum.pl
> > but I'm not sure that's the best place. But I didn't think
> > src/test/recovery is great either.
> >
>
> Thank you for sharing the WIP patch.
>
> Regarding point (1) you mentioned (StartupSUBTRANS() takes a long time
> for zeroing out all pages), how about using single-user mode instead
> of preparing the transaction? That is, after pg_resetwal we check the
> ages of datfrozenxid by executing a query in single-user mode. That
> way, we don’t need to worry about autovacuum concurrently running
> while checking the ages of frozenxids. I’ve attached a PoC patch that
> does the scenario like:
>
> 1. start cluster with autovacuum=off and create tables with a few data
> and make garbage on them
> 2. stop cluster and do pg_resetwal
> 3. start cluster in single-user mode
> 4. check age(datfrozenxid)
> 5. stop cluster
> 6. start cluster and wait for autovacuums to increase template0,
> template1, and postgres datfrozenxids
The above steps are wrong.
I think we can expose a function in an extension used only by this
test in order to set nextXid to a future value with zeroing out
clog/subtrans pages. We don't need to fill all clog/subtrans pages
between oldestActiveXID and nextXid. I've attached a PoC patch for
adding this regression test and am going to register it to the next
CF.
BTW, while testing the emergency situation, I found there is a race
condition where anti-wraparound vacuum isn't invoked with the settings
autovacuum = off, autovacuum_max_workers = 1. AN autovacuum worker
sends a signal to the postmaster after advancing datfrozenxid in
SetTransactionIdLimit(). But with the settings, if the autovacuum
launcher attempts to launch a worker before the autovacuum worker who
has signaled to the postmaster finishes, the launcher exits without
launching a worker due to no free workers. The new launcher won’t be
launched until new XID is generated (and only when new XID % 65536 ==
0). Although autovacuum_max_workers = 1 is not mandatory for this
test, it's easier to verify the order of operations.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-Add-regression-tests-for-emergency-vacuums.patch | application/octet-stream | 10.0 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Yugo NAGATA | 2022-06-30 01:48:12 | Add a test for "cannot truncate foreign table" |
| Previous Message | Robert Haas | 2022-06-30 01:39:56 | Re: Emit extra debug message when executing extension script. |