From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | sfrost(at)snowman(dot)net |
Cc: | laurenz(dot)albe(at)cybertec(dot)at, andres(at)anarazel(dot)de, mail(at)joeconway(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: pg_config wrongly marked as not parallel safe? |
Date: | 2018-11-29 05:00:47 |
Message-ID: | 20181129.140047.123132781.horiguchi.kyotaro@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
At Wed, 28 Nov 2018 21:17:17 -0500, Stephen Frost <sfrost(at)snowman(dot)net> wrote in <20181129021717(dot)GT3415(at)tamriel(dot)snowman(dot)net>
> Greetings,
>
> * Kyotaro HORIGUCHI (horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp) wrote:
> > At Tue, 27 Nov 2018 06:33:42 +0100, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote in <8dcb0bb248e9aaef0f1ef0faa27ab583bfa5bb84(dot)camel(at)cybertec(dot)at>
> > > On Mon, 2018-11-26 at 19:51 -0500, Stephen Frost wrote:
> > > > If a function's results can change across minor or major versions, we
> > > > shouldn't be marking it as immutable because, by definition, it's not
> > > > immutable.
> > > >
> > > > We, today, have a baked in assumption that any function marked as
> > > > immutable will remain immutable across all major versions that we allow
> > > > indexes to be retained through, which is all of them since about 8.3 at
> > > > this point.
> > > >
> > > > We absolutely need a policy that if you decide to change the results of
> > > > some immutable function across a major version change, you need to
> > > > consider the results on indexes and possibly write into pg_upgrade
> > > > checks to try and detect any usage of that immutable function. I hope
> > > > we're in agreement there.
> > >
> > > It's hard to make a guarantee that a function will never change in the
> > > future. What if we fix some rounding or overflow problem in a floating
> > > point function?
> >
> > +1.
> >
> > Actually, some geometric comparisons are performed counting
> > tolerance margin, the validity of which is in doubt. Their
> > behavior has been changed in recent major version and still has a
> > room for improvement, and the functions are parallel-safe and
> > immutable. Immutablity is mentiond mainly in the light of
> > optimization in the documentation.
>
> I really don't buy off on these arguments in the least. I also didn't
> say that a function wasn't allowed to change- but that the output of an
> immutable function, for a given input, shouldn't change and in the very
> rare case where we absolutely had to make a change, it had better be for
> a very good reason and we need to consider the impact on user indexes.
>
> I don't agree that we can simply declare that all functional and partial
> indexes have to be rebuilt between every major version upgrade, which is
> the alternative. There really isn't another option- either we're
> extremely careful and take immutability of functions seriously and make
> sure to preserve behavior, and therefore indexes, across major versions,
> or we don't and we require indexes to be rebuilt.
Even though index is the major place where the results of
immutable funcions are stored without uesrs' clear consciousness,
things are not restricted to indexes. Just as a thouht experiment
with an extreme case, we can store the results of immutable
functions into tables even whether it makes sense or not. We can
dump'n-load onto PostgreSQL running on another platform. I'm not
sure the new database can work as the same with the original one.
I can't express my feeling well, but this is a problem of how far
we guarantee the immutability after all. We guarantee
planning-execution duration immutability, yes. We guarantee using
plan caches, yes. Beyond restart, yes. Beyond minor-versions,
maybe. Beyond major-versions, presumably? Beyond platform,
improbably?
For example, avg(), sum(), stddev() are immutable but I'm not
sure they are absolutely and in-exact-meaning immutable in all of
the senses above. (However, I don't think the error makes
problematic differences..)
# Yes, as I said above this is a very extreme case.
We might need to split the immutable into two kinds of volatility
level. Immutable as it can be used for indexes, and immutable-2
that immutable at least on running server. I'm not sure there are
any function or how many of the funcitons are categorized to the
latter, or we can sort them out. And I thing that it would be
inconvenient if avg and sum were moved out of the former
category..
> When it comes to the documentation, perhaps we should improve it to make
> it clear that if you change the results from an immutable function then
> you need to rebuild any indexes which use it.
>
> > > If we went that far, hardly any function could be IMMUTABLE.
>
> I don't agree.
>
> > > I think it is best to use IMMUTABLE whenever we don't expect it to
> > > change and it is a function useful for indexes, and if it happens to
> > > change nonetheless, write into the release notes that certain indexes
> > > have to be rebuilt after upgrade.
>
> I don't agree with the notion of "if it happens to change" as that
> sounds pretty cavalier, and only noting something in the release notes
> is a good way to cause a lot of pain for users. If it's possible to do
> better, which it often is (see the list of things pg_upgrade already
> checks...) then we should do so. If there's reasons we can't, fine, it
> happens, but that needs to be taken into account when we're considering
> making such a change and its impact on users.
>
> > > Of course, there is no problem to mark pg_config as stable, because
> > > there is little chance it will be used in an index anyway.
> >
> > Agreed. We have the following description in the documentation.
>
> I'm pretty sure we all agree that pg_config should be marked as stable
> and that changing it from immutable to stable won't be an issue.
>
> > | A common error is to label a function IMMUTABLE when its results
> > | depend on a configuration parameter. For example, a function that
> > | manipulates timestamps might well have results that depend on the
> > | TimeZone setting. For safety, such functions should be labeled
> > | STABLE instead.
> >
> > Still this isn't mentioning upgrades. We could add to this that
> > something like that immutablity is not guaranteed beyond
> > PostgreSQL versions, the results of immutable functions might
> > change for certain reasons including bug fixes, blah, blah..
>
> Which means we're telling every user out there that they have to rebuild
> every functional or partial index they have across every major version.
>
> I don't agree with that and I don't think we really have some pressing
> need to be that free with making changes to immutable functions.
>
> I do think it'd be good to make a mention when talking about immutable
> functions that they're allowed to be used in indexes and, as such, users
> need to be careful to make sure that they don't change the results from
> an immutable function unless they're sure it's not used in an index or
> they're prepared to rebuild all indexes where it is used.
I suppose that not a minor part of users aren't conscious of
immutability so clearly..
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
From | Date | Subject | |
---|---|---|---|
Next Message | Laurenz Albe | 2018-11-29 05:40:52 | Re: pg_config wrongly marked as not parallel safe? |
Previous Message | Etsuro Fujita | 2018-11-29 04:55:24 | Re: Unnecessary asterisk in comment in postgres_fdw.c |