Re: Collation versioning

From:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To:	Stephen Frost <sfrost(at)snowman(dot)net>
Cc:	Douglas Doole <dougdoole(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Christoph Berg <myon(at)debian(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Collation versioning
Date:	2018-09-18 20:34:36
Message-ID:	CAEepm=1XCyNbhEaU+1tG_dFDVje=q98dg2uQJoJPWMu6HEjW2g@mail.gmail.com
Lists:	pgsql-hackers

On Wed, Sep 19, 2018 at 12:48 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> * Thomas Munro (thomas(dot)munro(at)enterprisedb(dot)com) wrote:
> > So to be more concrete: pg_depend could have a new column
> > "refobjversion". Whenever indexes are created or rebuilt, we'd
> > capture the current version string in the pg_depend rows that link
> > index attributes and collations. Then we'd compare those against the
> > current value when we first open an index and complain if they don't
> > match. (In this model there would be no "collversion" column in the
> > pg_collation catalog.)
>
> I'm really not sure why you're pushing to have this in pg_depend..
>
> > That'd leave a place for other kinds of database objects (CHECKs,
> > PARTITIONS, ...) to store their version dependency, if someone later
> > wants to add support for that.
>
> Isn't what matters here where the data's stored, as in, in a column..?
>
> All of those would already have dependencies on the column so that they
> can be tracked back there.

Suppose I have a table "emp" and indexes "emp_firstname_idx" and
"emp_lastname_idx". Suppose I created them in a sequence like this:

0: collation fr_CA has version "30"
1: create table emp (firstname text collate "fr_CA", lastname text
collate "fr_CA");
2: create index on emp(firstname);
3: [upgrade operating system]; now collation fr_CA has version "31"
4: create index on emp(lastname);

Now I have two indexes, built when different versions of the collation
were in effect. One of them is potentially corrupted; the other
isn't. Where are you going to record that? Earlier I suggested that
pg_index could have an indcollversion column, so that
emp_firstname_idx's row would hold {"30"} and emp_lastname_idx's row
would hold {"31"}. The value would be captured at CREATE INDEX time,
and after that the only way to change it would be to REINDEX. Whenever
it disagrees with the current version according to the provider, you'd
get a warning that you can only clear by running REINDEX. Then I
suggested that perhaps pg_depend might be a better place for it,
because it would generalise to other kinds of object too.
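
The per-index bookkeeping described above can be sketched as a toy model in Python. This is not PostgreSQL code: the dictionaries standing in for the collation provider and for pg_index, and the function names create_index and open_index, are assumptions made up for illustration.

```python
import warnings

# Hypothetical stand-in for the collation provider: name -> reported version.
provider_versions = {"fr_CA": "30"}

# Hypothetical stand-in for the proposed indcollversion data:
# index name -> (collation name, version recorded at build time).
indcollversion = {}


def create_index(index_name, collation):
    """Record the provider's current version, as CREATE INDEX or REINDEX would."""
    indcollversion[index_name] = (collation, provider_versions[collation])


def open_index(index_name):
    """Return True if the recorded version matches the provider; warn if not."""
    collation, recorded = indcollversion[index_name]
    current = provider_versions[collation]
    if recorded != current:
        warnings.warn(
            f"index {index_name} was built with collation {collation} "
            f"version {recorded}, but the provider now reports {current}; "
            "REINDEX to clear this warning"
        )
        return False
    return True


create_index("emp_firstname_idx", "fr_CA")  # step 2
provider_versions["fr_CA"] = "31"           # step 3: operating system upgrade
create_index("emp_lastname_idx", "fr_CA")   # step 4
```

In this model, open_index("emp_firstname_idx") warns and returns False, while open_index("emp_lastname_idx") returns True. Calling create_index again on the first index plays the role of REINDEX and clears the mismatch.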

For example, suppose I create a constraint CHECK (foo < 'côté') [evil
laugh]. The pg_depend row that links the constraint and the collation
could record the current version as of the moment the constraint was
defined. After an OS upgrade that changes the reported version, I'd
see a warning whenever loading the check constraint, and the only way
to clear it would be to drop and recreate the constraint. (I'm not
proposing we do that, just trying to demonstrate that pg_depend might
be a tidier and more general solution than adding 'collation version'
columns holding arrays of version strings to multiple catalogs. So
help me Codd.)
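
A rough Python sketch of why the pg_depend placement generalises: one version check covers every kind of dependent object. The Depend class is a heavily simplified, hypothetical stand-in for a pg_depend row (the real catalog identifies objects by OIDs, not names), refobjversion is the proposed column, and the constraint name foo_check is invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Depend:
    objid: str          # dependent object: an index, a constraint, ...
    refobjid: str       # referenced object, here a collation name
    refobjversion: str  # version captured when the dependent was created


def stale_dependents(depends, current_versions):
    """List dependents whose recorded version differs from the current one."""
    return [
        d.objid
        for d in depends
        if current_versions.get(d.refobjid) != d.refobjversion
    ]


# Toy catalog contents after the OS upgrade in the scenario above.
pg_depend = [
    Depend("emp_firstname_idx", "fr_CA", "30"),  # built before the upgrade
    Depend("emp_lastname_idx", "fr_CA", "31"),   # built after the upgrade
    Depend("foo_check", "fr_CA", "30"),          # CHECK defined before the upgrade
]
```

With the provider now reporting version "31", stale_dependents(pg_depend, {"fr_CA": "31"}) picks out both the old index and the old constraint without needing a version column in each catalog.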

Just an idea...

--
Thomas Munro
http://www.enterprisedb.com

In response to

Re: Collation versioning at 2018-09-18 12:48:35 from Stephen Frost

Responses

Re: Collation versioning at 2018-09-18 21:54:51 from Stephen Frost
