Re: ALTER EXTENSION UPGRADE, v3

From: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr>, "David E(dot) Wheeler" <david(at)kineticode(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: ALTER EXTENSION UPGRADE, v3
Date: 2011-02-10 21:11:38
Message-ID: m2d3mzfvcl.fsf@2ndQuadrant.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
> I spent some time reviewing this thread. I think the major point that's

Thanks for doing that, we badly needed someone without an horse in this
race to do that and finish the design.

> So I believe that it'd be a good idea if it were possible for an extension
> author to distribute a package that implements, say, versions 1.0, 1.1,
> and 2.0 of hstore. Not all will choose to do the work needed for that, of
> course, and that's fine. But the extension mechanism ought to permit it.

Agreed. I've been weary of being told I'm trying to put too much into
the first PostgreSQL release with extensions, but I'm happy to see you
want to take it this far. So well, here follows some ideas I've been
trying hard not to push too soon :)

> To do this, we need to remove the concept that the control file specifies
> "the" version of an extension; rather the version is associated with the
> SQL script file. I think we should embed the version number in the script
> file name, and require one to be present (no more omitted version
> numbers). So you would distribute, say,
> hstore-1.0.sql
> hstore-1.1.sql
> hstore-2.0.sql
> representing the scripts needed to install these three versions from

What I don't like in that is that this restrict what the version strings
can look like. In debian for example it's pretty common to use the ~
separator, because 1.0~alpha1 < 1.0~beta < 1.0 with their sorting rules.
And this trick won't work on windows filenames, AFAIK. That's one
reason why I've wanted to stay away from having the version number
strings encoded into the filename in the first place.

But if you get to sorting rules of version strings, you have to define
them properly and impose them to users. That's both a good thing and a
bad thing, but IMO requires that we provide a proper data type for that.

So my opinion here is that we should not only go with your design here
with the version string in the filename, but also imposes how to spell
out version strings in a way that we know will work for PostgreSQL on
every supported system.

> scratch. CREATE EXTENSION would have an option to select which
> version to install. If the option is omitted, there are at least two
> things we could do:
> 1. Choose the newest available version.
> 2. Let the control file specify which version is the default.
> I think I prefer #2 because it avoids needing a rule for comparing
> version identifiers, and it caters to the possibility that the "newest"
> version isn't yet mature enough to be a good default.

I like this idea. +1 for having the default version to install in the
control file. See below for some more details about that, though.

> As for upgrades, let's just expect upgrade scripts to be named
> extension-oldversion-newversion.sql. ALTER EXTENSION UPGRADE knows the
> relevant oldversion from pg_extension, and newversion can be handled the
> same way as in CREATE, ie, either the user says which version to update to
> or we use the default version from the control file.

Again, I like the idea and how simple it make things look, but I think
if we should then bite the bullet and restrict what a version string is
expected to be and offer a data type with proper sorting while at it.
And of course use that as the pg_extension.extversion column type.

That way a SQL query can check if there's a new version available on
your system. That's useful in some places to use as a monitoring alert
coupled with nagios. The sysadmin team does the apt-get install part of
the job and then the DBA team is paged to go upgrade the extensions in
the databases, or shut the alarm somehow.

> I don't seriously expect most extension authors to bother preparing
> upgrade scripts for any cases except adjacent pairs of versions.
> That means that if a user comes along and wants to upgrade across several
> versions of the extension, he'll have to do it in several steps:
> ALTER EXTENSION hstore UPGRADE TO '1.1';
> ALTER EXTENSION hstore UPGRADE TO '2.0';
> ALTER EXTENSION hstore UPGRADE TO '2.1';
> I don't see that as being a major problem --- how often will people have
> the need to do that, anyway? Authors who feel that it is a big deal can
> expend the work to provide shortcut scripts. I do not see adequate return
> on investment from the regexp-matching complications in the currently
> submitted patch.

The regexp matching reason to live is so that we don't have to know
anything about version strings at all. If you're saying that a version
string can not contain a dash and must be a valid filesystem name (often
enough, for all systems supported by PostgreSQL), and you're now saying
that ALTER EXTENSION UPGRADE could automate multi-steps upgrade, then I
think we have to provide the "version" (or "pgversion") data type and
all that jazz.

If we get to somehow, even lightly, depend on some rules, better offer
them in code and documentation rather than have them implicit.

> that it is necessary to solve that now. We could later on add some kind
> of script inclusion capability to allow authors to avoid code duplication
> in multi-version update scripts, but it's just not urgent.

I'd think that points get more or less moot with the automated
multi-step support. You could then have intermediate upgrade scripts
that matches no released version but just are there to ease the
packaging of them.

> So, concrete proposal is to enforce the "extension-version.sql" and
> "extension-oldversion-newversion.sql" naming rules for scripts, which
> means getting rid of the script name parameter in control files.

Well, just for the record, we could extend the script property to be a
key value thing that pairs a version string with an upgrade script
name. We should then maybe have hstore in core to support that.

> (Instead, we could have a directory parameter that tells which directory
> holds all the install and upgrade scripts for the extension.) Also, the

Putting all those files in a directory per extension does not seems much
an option for me. The share/contrib directory is already somewhat of a
mess now, I can't imagine what it will look like if people are to use
the extension mechanism whenever it makes sense… Another +1 here :)

> "version" parameter should be renamed to something like "current_version"
> or "default_version". We also have to be wary of whether any other

I can already hear people wanting version aliases instead. We could
support e.g. 4 or 5 aliases like 'stable', 'support', 'alpha', 'beta'
and maybe 'experimental'. Then rather than defining "current_version"
authors would define any set of those keywords here, and CREATE
EXTENSION and ALTER EXTENSION would by default only care for
resp. 'stable' and 'support'.

And of course the commands would have to support either version strings
or version aliases.

> control-file parameters specify something that might be version-specific.
> Looking at the current list:
>
> comment: probably OK to consider this as a default for all versions.
> We already have the ability for any script file to override it, anyway.

check

> encoding: I don't see any big problem with insisting that all scripts for
> a given extension be in the same encoding.

check

> requires, relocatable and schema: These are problematic, because it's not
[...]
> IOW, we'd read "extension.control" to get the directory and
> default_version values, then determine the version we are installing or
> upgrading to, then see if there's an "extension-version.control" file
> in the extension's directory, and if so read that and let it replace
> the remaining parameters' values.

I like this idea better. Like, way way better. The previous one does
not even stand a chance :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-02-10 21:12:12 Re: Why we don't want hints Was: Slow count(*) again...
Previous Message Tom Lane 2011-02-10 20:58:54 Re: ALTER EXTENSION UPGRADE, v3