Re: fix tablespace handling in pg_combinebackup

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: fix tablespace handling in pg_combinebackup
Date: 2024-04-18 17:45:54
Message-ID: 20240418174554.sumrmgqpcstppsiv@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2024-04-18 09:03:21 -0400, Robert Haas wrote:
> On Wed, Apr 17, 2024 at 5:50 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > > +If there are tablespace present in the backup, include tablespace_map as
> > > +a keyword parameter whose values is a hash. When tar_program is used, the
> > > +hash keys are tablespace OIDs; otherwise, they are the tablespace pathnames
> > > +used in the backup. In either case, the values are the tablespace pathnames
> > > +that should be used for the target cluster.
> >
> > Where would one get these oids?
>
> You pretty much have to pick them out of the tar file names. It sucks,
> but it's not this patch's fault.

I was really just remarking on this from the angle of a test writer. I know
that our interfaces around this suck...

For tests, do we really need to set anything on a per-tablespace basis? Can't
we instead just reparent all of them to a new directory?

> I wonder if we (as a project) would consider a patch that redesigned
> this whole mechanism. Produce ${TABLESPACE_NAME}.tar in tar-format,
> instead of ${OID}.tar. In directory-format, relocate via
> -T${TABLESPACE_NAME}=${DIR} instead of -T${SERVERDIR}=${DIR}. That
> would be a significant compatibility break, and you'd somehow need to
> solve the problem of what to put in the tablespace_map file, which
> requires OIDs. But it seems like if you could finesse that issue in
> some elegant way, the result would just be a heck of a lot more usable
> than what we have today.

For some things that'd definitely be nicer - but not sure it work well for
everything. Consider cases where you actually have external directories on
different disks, and you want to restore a backup after some data loss. Now
you need to list all the tablespaces separately, to put them back into their
own location.

One thing I've been wondering about is an option to put the "new" tablespaces
into a location relative to each of the old ones.
--tablespace-relative-location=../restore-2024-04-18
which would rewrite all the old tablespaces to that new location.

> > Could some of this be simplified by using allow_in_place_tablespaces instead?
> > Looks like it'd simplify at least the extended test somewhat?
>
> I don't think we can afford to assume that allow_in_place_tablespaces
> doesn't change the behavior.

I think we can't assume that absolutely everywhere, but we don't need to test
it in a lot of places.

> I said (at least off-list) when that feature was introduced that there was
> no way it was going to remain an isolated development hack, and I think
> that's proved to be 100% correct.

Hm, I guess I kinda agree. But not because I think it wasn't good for
development, but because it'd often be much saner to use relative tablespaces
than absolute ones even for prod.

My only point here was that the test would be simpler if you
a) didn't need to create a temp directory for the tablespace, both for
primary and standby
b) didn't need to "gin up" a tablespace map, because all the paths are
relative

Just to be clear: I don't want the above to block merging your test. If you
think you want to do it the way you did, please do.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema-Nio 2024-04-18 17:49:13 Re: Transparent column encryption
Previous Message Ranier Vilela 2024-04-18 17:43:37 Re: plenty code is confused about function level static