From: | Jan Wieck <jan(at)wi3ck(dot)info> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, Alexander Shutyaev <shutyaev(at)gmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: pg_upgrade and wraparound |
Date: | 2021-03-12 23:13:33 |
Message-ID: | f09a8c8e-b936-9878-4665-f21d4009e034@wi3ck.info |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Resurrecting an old thread.
We (AWS) have seen this wraparound during pg_upgrade more often recently
with customers who have millions of large objects in their databases.
On 6/11/18 1:14 PM, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
>> I suspect the issue is that pg_resetwal does:
>> if (set_xid != 0)
>> {
>> ControlFile.checkPointCopy.nextXid = set_xid;
>
>> /*
>> * For the moment, just set oldestXid to a value that will force
>> * immediate autovacuum-for-wraparound. It's not clear whether adding
>> * user control of this is useful, so let's just do something that's
>> * reasonably safe. The magic constant here corresponds to the
>> * maximum allowed value of autovacuum_freeze_max_age.
>> */
>> ControlFile.checkPointCopy.oldestXid = set_xid - 2000000000;
>> if (ControlFile.checkPointCopy.oldestXid < FirstNormalTransactionId)
>> ControlFile.checkPointCopy.oldestXid += FirstNormalTransactionId;
>> ControlFile.checkPointCopy.oldestXidDB = InvalidOid;
>> }
>
>> but we have codepath that doesn't check for oldestXidDB being
>> InvalidOid. Not great.
>
> Hm, I think I'd define the problem as "pg_resetwal is violating the
> expectation that oldestXidDB be valid".
>
> However, this just explains the basically-cosmetic issue that the
> complaint message mentions OID 0. It doesn't really get us to the
> answer to why Alexander is seeing a failure. It might be useful
> to see pg_controldata output for the old cluster, as well as
> "select datname, datfrozenxid from pg_database" output from the
> old cluster.
Unfortunately I don't have pg_controldata output from the old clusters
either. I would like to be able to artificially create an "old" cluster
that fails during pg_upgrade in that way.
One of the things in my way is that when using pg_resetwal to put the
NextXID way into the future (to push the old cluster close to wraparound
for example), the postmaster won't start because it doesn't have the
pg_xact files for that around. Should pg_resetwal create the files in
the gap between the old NextXID and the new one?
Onw thing I do have is a patch that provides a workaround for the
problem as well as a substantial speed improvement for the case at hand.
This patch adds some options to pg_upgrade, pg_dump and pg_restore.
Option added to pg_dump:
--blob-in-parallel
This option requires --schema-only. It causes pg_dump to emit the BLOB
metadata with SECTION_DATA instead of SECTION_PRE_DATA. This causes the
statements for creating the large object metadata (lo_create(OID) and
ALTER LARGE OBJECT) to move into the parallel phase of pg_restore, which
means that their metadata will be created in parallel. In my tests a
database containing large objects only is upgraded in 1/#cores the time.
Option added to pg_restore:
--blob-batch-size=N
With this option pg_restore tries to put N BLOB TOC entries into one
transaction. This is per parallel worker and it will commit those
batches if there is a change in object type, so only BLOB TOC entries
will ever be batched at all. With a sufficient
'max_locks_per_transation' a --blob-batch-size=1000 nicely reduces the
number of XIDs consumed for upgrading 10M large objects from 20M to 10K.
Options added to pg_upgrade:
--blob-in-parallel forwarded to pg_dump
--blob-batch-size=N forwarded to pg_restore
--restore-jobs=N forwarded as --jobs=N to pg_restore
Patch is attached.
Regards, Jan
--
Jan Wieck
Principle Database Engineer
Amazon Web Services
Attachment | Content-Type | Size |
---|---|---|
pg_upgrade_improvements.v1.diff | text/x-patch | 12.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2021-03-13 01:30:44 | Re: pg_upgrade and wraparound |
Previous Message | Brent Wood | 2021-03-12 20:44:12 | Re: hstore each() function - returned order?? |