From: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Marko Tiikkaja <marko(at)joh(dot)to>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Andrew Fletcher <andy(at)prestigedigital(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
Subject: | Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query |
Date: | 2018-08-15 00:44:27 |
Message-ID: | 87sh3g4h8n.fsf@news-spur.riddles.org.uk |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>> Marking the function parallel safe doesn't seem wrong to me. The
>> non-parallel-safe part is that the input gets fed to it in different
>> order in different workers. And I don't really think that to be the
>> function's fault.
Tom> So that basically opens the question of whether *any* window
Tom> function calculation can safely be pushed down to parallel
Tom> workers.
Grepping the spec for the phrase "possibly non-deterministic" is quite
enlightening. Leaving out non-determinisms caused by timezone or actual
volatility, leaving out cases of non-determinism that we'd call
"stable", and leaving out features like multisets that we don't support
at all, here's the list of interesting cases (comments after each quoted
paragraph are mine):
6.28 <value expression>
d) An <array value constructor by query>.
i.e. ARRAY(select)
o) An <aggregate function> that specifies MIN or MAX and that simply
contains a <value expression> whose declared type is based on a
character string type, user-defined type, or datetime with time
zone type.
i.e. MIN(x) is non-deterministic if "x" can have distinguishable values
that compare equal. PG doesn't have that for text or timestamptz, unlike
the spec, but it does for citext or other user-defined types.
q) An <array aggregate function>.
i.e. array_agg()
u) A <window function> that specifies ROW_NUMBER, FIRST_VALUE,
LAST_VALUE, NTH_VALUE, NTILE, LEAD, or LAG, or whose associated
<window specification> specifies ROWS.
This covers those cases where window functions don't treat peer rows
together.
7.6 <table reference>
27) A <table reference> is possibly non-deterministic if the simply
contained <table primary> or <joined table> is possibly
non-deterministic or if <sample clause> is specified.
i.e. TABLESAMPLE is non-deterministic
7.16 <query specification>
a) The <set quantifier> DISTINCT is specified and one of the columns
of T has a data type of character string, user-defined type, TIME
WITH TIME ZONE, or TIMESTAMP WITH TIME ZONE.
c) The <select list>, <having clause>, or <window clause> contains a
reference to a column C of T that has a data type of character
string, user-defined type, TIME WITH TIME ZONE, or TIMESTAMP WITH
TIME ZONE, and the functional dependency G C, where G is the set
consisting of the grouping columns of T, holds in T.
For both the above two cases, if distinguishable values of a type
compare equal, it's non-deterministic which gets into the result.
7.17 <query expression>
a) The <query expression> contains a <result offset clause>.
b) The <query expression> contains a <fetch first clause>.
f) Both of the following are true:
i) T contains a set operator UNION and ALL is not specified, or T
contains either of the set operators EXCEPT or INTERSECT.
ii) At least one of the following is true:
1) The first or second operand contains a column that has a
declared type of character string.
2) The first or second operand contains a column that has a
declared type of datetime with time zone.
3) The first or second operand contains a column that has a
declared type that is a user-defined type.
(I've left out the many clauses which just amount to "if $thing contains
something which is possibly non-deterministic then it is possibly
non-deterministic")
--
Andrew (irc:RhodiumToad)
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2018-08-15 06:51:54 | Re: BUG #15324: Non-deterministic behaviour from parallelised sub-query |
Previous Message | PG Bug reporting form | 2018-08-14 22:16:30 | BUG #15326: keep getting "ERROR: type "earth" does not exist" while it does exist |