From: | Andrew <pgsqlhackers(at)andrewrepp(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org, soumyadeep2007(at)gmail(dot)com, alexandra(dot)wanglei(at)gmail(dot)com |
Subject: | Re: Fix to enum hashing for dump and restore |
Date: | 2023-01-25 13:23:11 |
Message-ID: | f80347af-12d0-40d3-b8e6-65eb79e6a5fb@app.fastmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Those are excellent points.
We will investigate adjusting pg_dump behavior,
as this is primarily a dump+restore issue.
Thank you!
-Andrew J Repp (VMware)
On Tue, Jan 24, 2023, at 9:56 PM, Tom Lane wrote:
> Andrew <pgsqlhackers(at)andrewrepp(dot)com> writes:
> > I have discovered a bug in one usage of enums. If a table with hash
> > partitions uses an enum as a partitioning key, it can no longer be
> > backed up and restored correctly. This is because enums are represented
> > simply as oids, and the hash function for enums hashes that oid to
> > determine partition distribution. Given the way oids are assigned, any
> > dump+restore of a database with such a table may fail with the error
> > "ERROR: new row for relation "TABLENAME" violates partition constraint".
>
> Ugh, that was not well thought out :-(. I suppose this isn't a problem
> for pg_upgrade, which should preserve the enum value OIDs, but an
> ordinary dump/restore will indeed hit this.
>
> > I have written a patch to fix this bug (attached), by instead having the
> > hashenum functions look up the enumsortorder ID of the value being
> > hashed. These are deterministic across databases, and so allow for
> > stable dump and restore.
>
> Unfortunately, I'm not sure those are as deterministic as all that.
> They are floats, so there's a question of roundoff error, not to
> mention cross-platform variations in what a float looks like. (At the
> absolute minimum, I think we'd have to change your patch to force
> consistent byte ordering of the floats.) Actually though, roundoff
> error wouldn't be a problem for the normal exact-integer values of
> enumsortorder. Where it could come into play is with the fractional
> values used after you insert a value into the existing sort order.
> And then the whole idea fails, because a dump/restore won't duplicate
> those fractional values.
>
> Another problem with this approach is that we can't get from here to there
> without a guaranteed dump/reload failure, since it's quite unlikely that
> the partition assignment will be the same when based on enumsortorder
> as it was when based on OIDs. Worse, it also breaks the pg_upgrade case.
>
> I wonder if it'd work to make pg_dump force --load-via-partition-root
> mode when a hashed partition key includes an enum.
>
> regards, tom lane
>
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2023-01-25 13:29:32 | CREATE ROLE bug? |
Previous Message | shveta malik | 2023-01-25 13:02:11 | Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication |