From: | Stephen Frost <sfrost(at)snowman(dot)net> |
---|---|
To: | Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> |
Cc: | Douglas Doole <dougdoole(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Christoph Berg <myon(at)debian(dot)org>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Collation versioning |
Date: | 2018-09-18 21:54:51 |
Message-ID: | 20180918215451.GQ4184@tamriel.snowman.net |
Lists: | pgsql-hackers |
Greetings,
* Thomas Munro (thomas(dot)munro(at)enterprisedb(dot)com) wrote:
> On Wed, Sep 19, 2018 at 12:48 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > * Thomas Munro (thomas(dot)munro(at)enterprisedb(dot)com) wrote:
> > > So to be more concrete: pg_depend could have a new column
> > > "refobjversion". Whenever indexes are created or rebuilt, we'd
> > > capture the current version string in the pg_depend rows that link
> > > index attributes and collations. Then we'd compare those against the
> > > current value when we first open an index and complain if they don't
> > > match. (In this model there would be no "collversion" column in the
> > > pg_collation catalog.)
> >
> > I'm really not sure why you're pushing to have this in pg_depend..
> >
> > > That'd leave a place for other kinds of database objects (CHECKs,
> > > PARTITIONS, ...) to store their version dependency, if someone later
> > > wants to add support for that.
> >
> > Isn't what matters here where the data's stored, as in, in a column..?
> >
> > All of those would already have dependencies on the column so that they
> > can be tracked back there.
>
> Suppose I have a table "emp" and indexes "emp_firstname_idx" and
> "emp_lastname_idx". Suppose I created them in a sequence like this:
>
> 0: collation fr_CA has version "30"
> 1: create table emp (firstname text collate "fr_CA", lastname text
> collate "fr_CA");
> 2: create index on emp(firstname);
> 3: [upgrade operating system]; now collation fr_CA has version "31"
> 4: create index on emp(lastname);
>
> Now I have two indexes, built when different versions of the collation
> were in effect. One of them is potentially corrupted, the other
> isn't. Where are you going to record that? Earlier I suggested that
> pg_index could have an indcollversion column, so that
> emp_firstname_idx's row would hold {"30"} and emp_lastname_idx's row
> would hold {"31"}. It would be captured at CREATE INDEX time, and
> after that the only way to change it would be to REINDEX, and whenever
> it disagrees with the current version according to the provider you'd
> get a warning that you can only clear by running REINDEX. Then I
> suggested that perhaps pg_depend might be a better place for it,
> because it would generalise to other kinds of object too.
For indexes, just like for tables, we have entries in pg_attribute, and
that is where this information would go.
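To make the scenario quoted above concrete, here is a toy Python model of
recording a collation version per index attribute and comparing it with
what the provider currently reports. This is only a sketch, not
PostgreSQL code: the IndexAttribute class, its recorded_version field,
and the stale_attributes helper are hypothetical names, and pg_attribute
has no such column today.

```python
from dataclasses import dataclass


@dataclass
class IndexAttribute:
    """Hypothetical stand-in for an index column's catalog entry."""
    index_name: str
    column: str
    collation: str
    recorded_version: str  # assumed to be captured at CREATE INDEX / REINDEX time


def stale_attributes(attrs, current_versions):
    """Return attributes whose recorded version differs from the provider's."""
    return [a for a in attrs
            if current_versions[a.collation] != a.recorded_version]


# Step 0: collation fr_CA has version "30".
provider = {"fr_CA": "30"}
attrs = []

# Step 2: create index on emp(firstname) while version "30" is in effect.
attrs.append(IndexAttribute("emp_firstname_idx", "firstname", "fr_CA",
                            provider["fr_CA"]))

# Step 3: operating system upgrade; fr_CA now reports version "31".
provider["fr_CA"] = "31"

# Step 4: create index on emp(lastname) while version "31" is in effect.
attrs.append(IndexAttribute("emp_lastname_idx", "lastname", "fr_CA",
                            provider["fr_CA"]))

stale = stale_attributes(attrs, provider)
print([a.index_name for a in stale])  # ['emp_firstname_idx']
```

In this model only emp_firstname_idx is flagged, and clearing the flag
would amount to rebuilding it so that its recorded version is refreshed.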
> For example, suppose I create a constraint CHECK (foo < 'côté') [evil
> laugh]. The pg_depend row that links the constraint and the collation
> could record the current version as of the moment the constraint was
> defined. After an OS upgrade that changes the reported version, I'd
> see a warning whenever loading the check constraint, and the only way
> to clear it would be to drop and recreate the constraint. (I'm not
> proposing we do that, just trying to demonstrate that pg_depend might
> be a tidier and more general solution than adding 'collation version'
> columns holding arrays of version strings to multiple catalogs. So
> help me Codd.)
The CHECK constraint doesn't need to track that information directly;
it should have a dependency on the column in the table, and that's where
the information about the current collation version would be recorded.
Maybe I'm missing something, but I have to admit that I feel like
pg_depend is being looked at here because everything goes through it.
But everything goes through it because it's simple, and we just use it
to get to other things that have the complete definition of the object.
Lots and lots of things in pg_depend would have zero use for such a
field, and I'm a bit worried you'd also get into cases where you've got
different collation versions for the same object because of the
different dependencies into it. Even if you don't, you're duplicating
that information into every dependency, aren't you?
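A rough sketch of that worry, again as a toy Python model rather than
anything resembling the real catalogs: if every dependency row carried
its own copy of the version (the proposed "refobjversion"), the copies
for one column could disagree. The object names and the tuple layout
below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (dependent object, referenced column, collation, refobjversion)
depend_rows = [
    ("emp_lastname_idx", "emp.lastname", "fr_CA", "31"),
    ("emp_lastname_check", "emp.lastname", "fr_CA", "30"),
    ("emp_lastname_idx2", "emp.lastname", "fr_CA", "31"),
]

# Collect every version string recorded against each column.
versions_by_column = defaultdict(set)
for _dependent, column, _collation, version in depend_rows:
    versions_by_column[column].add(version)

# A column with more than one recorded version has conflicting copies.
conflicts = {col: sorted(vers)
             for col, vers in versions_by_column.items()
             if len(vers) > 1}

print(len(depend_rows))  # 3 copies of the version for a single column
print(conflicts)         # {'emp.lastname': ['30', '31']}
```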
Thanks!
Stephen