Re: pgsql: Add TAP tests for pg_verify_checksums

From: Andres Freund <andres(at)anarazel(dot)de>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Postgres hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: pgsql: Add TAP tests for pg_verify_checksums
Date: 2018-10-19 20:07:23
Message-ID: 20181019200723.4byzaw5km5jvsxlc@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On 2018-10-19 15:57:28 -0400, Stephen Frost wrote:
> > > > > Of course- the same is true with a crash/restart case, so I'm not sure
> > > > > what you're getting at here.
> > > >
> > > > pg_verify_checksum doesn't support running on a crashed cluster, and I'm
> > > > not sure it'd make sense to teach it to so (there's not really much it
> > > > could do to discern whether a block is a torn page that replay will fix,
> > > > or corrupted).
> > >
> > > ... and this isn't at all relevant, because pg_basebackup does run on a
> > > running cluster.
> >
> > I wasn't ever denying that or anything close to it? My point is that
> > pg_verify_checksum needs much more filtering than it has now to deal
> > with that, because it needs to handle all files that could be present,
> > not just files that could be present after a graceful shutdown.
>
> Perhaps it doesn't today but surely one goal of pg_verify_checksum is to
> be able to run it on a running cluster eventually.

I was saying *precisely* that above. I give up.

> > pg_basebackup's case is *not* really comparable because basebackup.c
> > already performed a lot of filtering before noChecksumFiles is applied.
>
> This all really just points out that we should have the code for
> handling this somewhere common that both pg_verify_checksum and
> pg_basebackup can leverage without duplicating all of it.

I never argued against that. My point is that your argument that they
started out the same isn't true.

> The specific case that started all of this certainly looks pretty
> clearly like a case that both need to deal with.

Yep.

> > > As pointed out elsewhere on this thread, the logic was the same between
> > > the two before this commit... The code in pg_basebackup came from the
> > > prior pg_verify_checksums code. Certainly, some mention of the code
> > > existing in two places, at least, should have been in the comments.
> >
> > But the filter for basebackup only comes after the pre-existing
> > filtering that the basebackup.c code already does. All of:
> >
> > /*
> > * List of files excluded from backups.
> > */
> > static const char *excludeFiles[] =
> > {
> > /* Skip auto conf temporary file. */
> > PG_AUTOCONF_FILENAME ".tmp",
> >
> > /* Skip current log file temporary file */
> > LOG_METAINFO_DATAFILE_TMP,
> >
> > /* Skip relation cache because it is rebuilt on startup */
> > RELCACHE_INIT_FILENAME,
> >
> > /*
> > * If there's a backup_label or tablespace_map file, it belongs to a
> > * backup started by the user with pg_start_backup(). It is *not* correct
> > * for this backup. Our backup_label/tablespace_map is injected into the
> > * tar separately.
> > */
> > BACKUP_LABEL_FILE,
> > TABLESPACE_MAP,
> >
> > "postmaster.pid",
> > "postmaster.opts",
> >
> > /* end of list */
> > NULL
> > };
> >
> > is not applied, for example. Nor is:
> >
> > /* Skip temporary files */
> > if (strncmp(de->d_name,
> > PG_TEMP_FILE_PREFIX,
> > strlen(PG_TEMP_FILE_PREFIX)) == 0)
> > continue;
> >
> > Nor is
> > /* Exclude temporary relations */
> > if (isDbDir && looks_like_temp_rel_name(de->d_name))
> > {
> > elog(DEBUG2,
> > "temporary relation file \"%s\" excluded from backup",
> > de->d_name);
> >
> > continue;
> > }
> >
> > So, yea, they had:
> >
> > static const char *const skip[] = {
> > "pg_control",
> > "pg_filenode.map",
> > "pg_internal.init",
> > "PG_VERSION",
> > NULL,
> > };
> >
> > in common, but not more. All the above exclusions are strictly
> > necessary.
>
> ... but all of that code doesn't change that pg_basebackup also needs to
> ignore the EXEC_BACKEND created config_exec_params/.new files.

Right.

> You're right, pg_verify_checksums, with the assumption that it only runs
> on a cleanly shut-down cluster, doesn't need the temp-file-skipping
> logic, today, but it's going to need it tomorrow, isn't it?

No, it needs it today, as explain below in the email you're replaying
to.

> > FWIW, as far as I can tell, pg_verify_checksum appears to be broken in a
> > lot of (unfortunately) pretty normal scenarios right now. Not just on
> > windows. Besides the narrow window of crashing while a .tmp file is
> > present (and then shutting down normally after a crash restart), it also
> > has the much of wider window of crashing while *any* backend has
> > temporary files/relations. As crash-restarts don't release temp files,
> > they'll still be present after the crash, and a single graceful shutdown
> > afterwards will leave them in place. No?
>
> We do go through and do some cleanup at crash/restart

We don't clean up temp files tho.

* NOTE: we could, but don't, call this during a post-backend-crash restart
* cycle. The argument for not doing it is that someone might want to examine
* the temp files for debugging purposes. This does however mean that
* OpenTemporaryFile had better allow for collision with an existing temp
* file name.

> As you say, some of those are likely to have checksums, which should be
> handled by pg_basebackup and pg_verify_checksums, and that goes back to
> the point I was making up-thread that we want to make sure an account
> for everything. With this pattern-based approach, we could easily end
> up forgetting to add the correct new pattern into pg_verify_checksums.

If you're adding checksums for something, you better test it I don't buy
this. In contrast it's much more likely that there's a file that's
short-lived that you won't easily test against pg_verify_checksum
running in that moment.

> Playing this further and assuming that extensions dropping files into
> tablespace directories is acceptable, what are we supposed to do when
> there's some direct conflict between a file that we want to create in a
> PG tablespace directory and a file that some extension dropped there?
> Create yet another subdirectory which we call
> "THIS_IS_REALLY_ONLY_FOR_PG"?

Then it's a buggy extension. And we error out.

> What about two different extensions wanting to create files with the
> same names in the tablespace directories..?
>
> Experimentation is fine, of course, this is open source code and people
> should feel free to play with it, but we are not obligated to avoid
> breaking things when an extension author, through their experimentation,
> does something which is clearly not supported, like dropping files into
> PG's tablespace directories. Further, when it comes to our user's data,
> we should be taking a strict approach and accounting for everything,
> something that this whitelist-of-patterns-based approach to finding
> files to verify the checksums on doesn't do.

It's not economically feasible to work on extensions that will only be
usable a year down the road.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Stephen Frost 2018-10-19 20:35:46 Re: pgsql: Add TAP tests for pg_verify_checksums
Previous Message Stephen Frost 2018-10-19 19:57:28 Re: pgsql: Add TAP tests for pg_verify_checksums

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2018-10-19 20:35:46 Re: pgsql: Add TAP tests for pg_verify_checksums
Previous Message Stephen Frost 2018-10-19 19:57:28 Re: pgsql: Add TAP tests for pg_verify_checksums