From: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: cleanup patches for incremental backup |
Date: | 2024-01-24 18:05:15 |
Message-ID: | 20240124180515.GA2709722@nathanxps13 |
Lists: | pgsql-hackers |
On Wed, Jan 24, 2024 at 12:46:16PM -0500, Robert Haas wrote:
> The "examining summary" line is generated based on the output of
> pg_available_wal_summaries(). The way that works is that the server
> calls readdir(), disassembles the filename into a TLI and two LSNs,
> and returns the result. Then, a fraction of a second later, the test
> script reassembles those components into a filename and finds the file
> missing. If the logic to translate between filenames and TLIs & LSNs
> were incorrect, the test would fail consistently. So the only
> explanation that seems to fit the facts is the file disappearing out
> from under us. But that really shouldn't happen. We do have code to
> remove such files in MaybeRemoveOldWalSummaries(), but it's only
> supposed to be nuking files more than 10 days old.
>
> So I don't really have a theory here as to what could be happening. :-(
There might be an overflow risk in the cutoff time calculation, but I doubt
that's the root cause of these failures:
	/*
	 * Files should only be removed if the last modification time precedes the
	 * cutoff time we compute here.
	 */
	cutoff_time = time(NULL) - 60 * wal_summary_keep_time;
Otherwise, I think we'll probably need to add some additional logging to
figure out what is happening...
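
To make the overflow concern concrete, here is a minimal sketch, not the actual PostgreSQL code. It assumes wal_summary_keep_time is a plain int holding minutes, so that "60 * wal_summary_keep_time" is evaluated in int arithmetic before being subtracted from the time_t. The helper names compute_cutoff() and keep_time_overflows_int() are hypothetical and exist only for illustration.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper, not PostgreSQL code: compute the removal cutoff
 * from the current time (seconds) and a retention period in minutes.
 * Casting to a 64-bit type before multiplying means the product cannot
 * overflow a 32-bit int.  Assumes keep_minutes is non-negative.
 */
static int64_t
compute_cutoff(int64_t now, int keep_minutes)
{
	return now - (int64_t) 60 * keep_minutes;
}

/*
 * Hypothetical helper: returns 1 if 60 * keep_minutes would not fit in
 * an int, i.e. if the unwidened expression would overflow.
 * Assumes keep_minutes is non-negative.
 */
static int
keep_time_overflows_int(int keep_minutes)
{
	return keep_minutes > INT_MAX / 60;
}
```

With a 32-bit int, the product only overflows for settings above INT_MAX / 60 = 35791394 minutes. A 10-day retention period is 14400 minutes, far below that, which is consistent with the overflow not explaining these failures.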
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com