From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Flush pgstats file during checkpoints |
Date: | 2024-09-03 01:08:46 |
Message-ID: | ZtZhnuWotWSI_Xbt@paquier.xyz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Sep 02, 2024 at 05:08:03PM +0300, Heikki Linnakangas wrote:
> Hmm, I'm a bit disappointed this doesn't address replication. It makes sense
> that scans are counted separately on a standby, but it would be nice if
> stats like last_vacuum were propagated from primary to standbys. I guess
> that can be handled separately later.
Yes, it's not something that I'm planning to tackle for this thread.
Speaking of which, the design that I got in mind for this area was not
"that" complicated:
- Add a new RMGR for all the stats.
- Add a first callback for stats kinds for WAL inserts, giving to each
stats the possibility to pass down data inserted to the record, as we
want to replicate a portion of the data depending on the kind dealt
with.
- Add a second callback for recovery, called depending on the kind ID.
I have not looked into the details yet, but stats to replicate should
be grouped in a single record on transaction commit or depending on
the flush timing for fixed-numbered stats. Or we should just add them
in commit records?
> Reviewing v7-0001-Flush-pgstats-file-during-checkpoints.patch:
>
> There are various race conditions where a stats entry can be leaked in the
> pgstats file. I.e. relation is dropped, but its stats entry is retained in
> the stats file after crash. In the worst case, suck leaked entries can
> accumulate until the stats file is manually removed, which resets all stats
> again. Perhaps that's acceptable - it's still possible leak the actual
> relation file for a new relation on crash, after all, which is much worse
> (I'm glad Horiguchi-san is working on that [1]).
Yeah, that's not an easy issue. We don't really have a protection
regarding that as well now. Backends can also refer to stats entries
in their shutdown callback that have been dropped concurrently. See
some details about that at https://commitfest.postgresql.org/49/5045/.
> Until 5891c7a8ed, there was a mechanism to garbage collect such orphaned
> entries (pgstat_vacuum()). How bad would it be to re-introduce that? Or can
> we make it more watertight so that there are no leaks?
Not sure about this part, TBH. Doing that again in autovacuum does
not excite me much as it has a cost.
> I think it's failing to flush the stats file at the end of recovery
> checkpoint.
Missed that, oops. I'll double-check this area.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Nathan Bossart | 2024-09-03 01:35:15 | Re: Partitioned tables and [un]loggedness |
Previous Message | Michael Paquier | 2024-09-03 00:22:58 | Re: Partitioned tables and [un]loggedness |