From: | Boszormenyi Zoltan <zb(at)cybertec(dot)at> |
---|---|
To: | Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> |
Cc: | PG Hackers <pgsql-hackers(at)postgresql(dot)org>, Greg Stark <stark(at)mit(dot)edu>, David Fetter <david(at)fetter(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com> |
Subject: | Re: UNNEST with multiple args, and TABLE with multiple funcs |
Date: | 2013-08-19 16:23:44 |
Message-ID: | 52124690.8050309@cybertec.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
2013-08-13 15:54 keltezéssel, Andrew Gierth írta:
> Summary:
>
> This patch implements a method for expanding multiple SRFs in parallel
> that does not have the surprising LCM behaviour of SRFs-in-select-list.
> (Functions returning fewer rows are padded with nulls instead.)
>
> It then uses this method combined with a parse-time hack to implement
> the (intended to be) spec-conforming behaviour of UNNEST with multiple
> parameters, including flattening of composite results.
>
> The upshot is that given a table like this:
>
> postgres=# select * from t1;
> a | b | c
> ---------------+-------------------+----------------------------------------------
> {11,12,13} | {wombat} |
> {5,10} | {foo,bar} | {"(123,xyzzy)","(456,plugh)","(789,plover)"}
> {21,31,41,51} | {fred,jim,sheila} | {"(111,xyzzy)","(222,plugh)"}
> (3 rows)
>
> (where column "c" is an array of a composite type with 2 cols, "x" and "y")
>
> You can do this:
>
> postgres=# select u.* from t1, unnest(a,b,c) with ordinality as u;
> ?column? | ?column? | x | y | ordinality
> ----------+----------+-----+--------+------------
> 11 | wombat | | | 1
> 12 | | | | 2
> 13 | | | | 3
> 5 | foo | 123 | xyzzy | 1
> 10 | bar | 456 | plugh | 2
> | | 789 | plover | 3
> 21 | fred | 111 | xyzzy | 1
> 31 | jim | 222 | plugh | 2
> 41 | sheila | | | 3
> 51 | | | | 4
> (10 rows)
>
> Or for an example of general combination of functions:
>
> postgres=# select * from table(generate_series(10,20,5), unnest(array['fred','jim']));
> ?column? | ?column?
> ----------+----------
> 10 | fred
> 15 | jim
> 20 |
> (3 rows)
>
> Implementation Details:
>
> The spec syntax for table function calls, <table function derived table>
> in <table reference>, looks like TABLE(func(args...)) AS ...
>
> This patch implements that, plus an extension: it allows multiple
> functions, TABLE(func1(...), func2(...), func3(...)) [WITH ORDINALITY]
> and defines this as meaning that the functions are to be evaluated in
> parallel.
>
> This is implemented by changing RangeFunction, function RTEs, and the
> FunctionScan node to take lists of function calls rather than a single
> function. The calling convention for SRFs is completely unchanged; each
> function returns its own rows (or a tuplestore in materialize mode) just
> as before, and FunctionScan combines the results into a single output
> tuple (keeping track of which functions are exhausted in order to
> correctly fill in nulls on a backwards scan).
>
> Then, a hack in the parser converts unnest(...) appearing as a
> func_table (and only there) into a list of unnest() calls, one for each
> parameter. So
>
> select ... from unnest(a,b,c)
>
> is converted to
>
> select ... from TABLE(unnest(a),unnest(b),unnest(c))
>
> and if unnest appears as part of an existing list inside TABLE(), it's
> expanded to multiple entries there too.
>
> This parser hackery is of course somewhat ugly. But given the objective
> of implementing the spec's unnest syntax, it seems to be the least ugly
> of the possible approaches. (The hard part of doing it any other way
> would be generating the description of the result type; composite array
> parameters expand into multiple result columns.)
Harder maybe but it may still be cleaner in the long run.
> Overall, it's my intention here to remove as many as feasible of the old
> reasons why one might use an SRF in the select list.
Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
WITH ORDINALITY and this feature, I would vote for removing
SRF-in-targetlist and call the release PostgreSQL 10.0.
> This should also
> address the points that Josh brought up in discussion of ORDINALITY
> regarding use of SRF-in-select to unnest multiple arrays.
>
> (As a side issue, this patch also sets up pathkeys for ordinality along
> the lines of a patch I suggested to Greg a while back in response to
> his.)
>
> Current patch status:
>
> This is a first working cut: no docs, no tests, not enough comments, the
> deparse logic probably needs more work (it deparses correctly but the
> formatting may be suboptimal). However all the functionality is believed
> to be in place.
With this last paragraph in mind, I am trying a little review.
* Is the patch in a patch format which has context? (eg: context diff format)
Yes.
* Does it apply cleanly to the current git master?
Applies with some offsets on a few files but without fuzz.
* Does it include reasonable tests, necessary doc patches, etc?
No, as told by the patch author.
* Does the patch actually implement what it's supposed to do?
Yes.
* Do we want that?
Yes.
* Do we already have it?
No.
* Does it follow SQL spec, or the community-agreed behavior?
The SQL spec says these:
In 7.6 <table reference>
<table primary> ::=
...
| <table function derived table> [ AS ] <correlation name> [ <left paren> <derived column
list> <right paren> ]
also later in the same section:
<table function derived table> ::=
TABLE <left paren> <collection value expression> <right paren>
In 6.26 <value expression>
<collection value expression> ::=
<array value expression> | <multiset value expression>
In 6.36 <array value expression>
<array value expression> ::= <array concatenation> | <array primary>
<array concatenation> ::= <array value expression 1> <concatenation operator> <array primary>
<array value expression 1> ::= <array value expression>
<array primary> ::= <array value function> | <value expression primary>
6.3 <value expression primary>
<value expression primary> ::=
<parenthesized value expression>
| <nonparenthesized value expression primary>
<parenthesized value expression> ::= <left paren> <value expression> <right paren>
<nonparenthesized value expression primary> ::=
<unsigned value specification>
| <column reference>
| <set function specification>
| <window function>
| <nested window function>
| <scalar subquery>
| <case expression>
| <cast specification>
| <field reference>
| <subtype treatment>
| <method invocation>
| <static method invocation>
| <new specification>
| <attribute or method reference>
| <reference resolution>
| <collection value constructor>
| <array element reference>
| <multiset element reference>
| <next value expression>
| <routine invocation>
collection value constructor> ::=
| <array value constructor>
| <multiset value constructor>
So, the FROM TABLE(...) AS (...) syntax is a big can of worms and
I haven't even quoted <multiset value expression>.
As far as I can tell, these should also be allowed but isn't:
zozo=# select * from table('a'::text) as x;
ERROR: syntax error at or near "'a'"
LINE 1: select * from table('a'::text) as x;
^
zozo=# select x.* from t1, table(t1.a) as x;
ERROR: syntax error at or near ")"
LINE 1: select x.* from t1, table(t1.a) as x;
^
zozo=# select x.* from table((6)) as x(a int4);
ERROR: syntax error at or near "("
LINE 1: select x.* from table((6)) as x(a int4);
^
zozo=# select x.* from table(values (6)) as x(a int4);
ERROR: syntax error at or near "("
LINE 1: select x.* from table(values (6)) as x(a int4);
^
zozo=# select x.* from table(values(6)) as x(a int4);
ERROR: syntax error at or near "("
LINE 1: select x.* from table(values(6)) as x(a int4);
^
What the patch implements is only the last choice for
<nonparenthesized value expression primary>: <routine invocation>
When you add documentation, it would be nice to mention it.
Also, the grammar extension is a start for adding all the other
standard choices for TABLE().
* Does it include pg_dump support (if applicable)?
n/a
* Are there dangers?
I can't see any. 8-)
* Have all the bases been covered?
My previous comments about the TABLE() syntax says it all.
You can interpret it either way. :-)
* Does the feature work as advertised?
Yes.
* Are there corner cases the author has failed to consider?
I don't know.
* Are there any assertion failures or crashes?
No.
* Does the patch slow down simple tests?
No.
* If it claims to improve performance, does it?
It certainly improves writing queries, as functions inside
unnest() get processed in one scan.
* Does it slow down other things?
I don't think so.
* Does it follow the project coding guidelines?
Yes.
* Are there portability issues?
No.
* Will it work on Windows/BSD etc?
It should, the code uses standard internal PostgreSQL APIs
and extends them. No new system call.
* Are the comments sufficient and accurate?
According to the author, no.
* Does it do what it says, correctly?
Yes.
* Does it produce compiler warnings?
No.
* Can you make it crash?
No.
* Is everything done in a way that fits together coherently with other features/modules?
I think so
* Are there interdependencies that can cause problems?
I don't know.
Best regards,
Zoltán Böszörményi
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
From | Date | Subject | |
---|---|---|---|
Next Message | 'Bruce Momjian' | 2013-08-19 16:26:35 | Re: 9.3 release notes suggestions |
Previous Message | Bruce Momjian | 2013-08-19 16:15:43 | Re: Feature Request on Extensions |