From: | Antonin Houska <antonin(dot)houska(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kevin Grittner <kgrittn(at)ymail(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Reference to parent query from ANY sublink |
Date: | 2013-12-11 23:31:37 |
Message-ID: | 52A8F5D9.4090801@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 12/11/2013 10:15 PM, Tom Lane wrote:
>
> FWIW, that plan isn't obviously wrong; if it is broken, most likely the
> reason is that the HashAggregate is incorrectly unique-ifying the lower
> table. (Unfortunately, EXPLAIN doesn't show enough about the HashAgg
> to know what it's doing exactly.) The given query is, I think, in
> principle equivalent to
>
> SELECT ...
> FROM SUBSELECT_TBL upper
> WHERE (f1, f2::float) IN
> (SELECT f2, f3 FROM SUBSELECT_TBL);
>
> and if you ask unmodified HEAD to plan that you get
>
> Hash Join (cost=41.55..84.83 rows=442 width=16)
> Hash Cond: ((upper.f1 = subselect_tbl.f2) AND ((upper.f2)::double precision = subselect_tbl.f3))
> -> Seq Scan on subselect_tbl upper (cost=0.00..27.70 rows=1770 width=16)
> -> Hash (cost=38.55..38.55 rows=200 width=12)
> -> HashAggregate (cost=36.55..38.55 rows=200 width=12)
> -> Seq Scan on subselect_tbl (cost=0.00..27.70 rows=1770 width=12)
Before I opened your mail, I also recalled the technique that I noticed
in the planner code, to evaluate SEMI JOIN as INNER JOIN with the RHS
uniquified, so also thought it could be about the uniquification.
> which is the same thing at the visible level of detail ... but this
> version computes the correct result. The cost of the HashAggregate is
> estimated higher, though, which suggests that maybe it's distinct'ing on
> two columns where the bogus plan only does one.
debug_print_plan output contains
:grpColIdx 2
in the AGG node. I think this corresponds to the join condition, which
IMO should be
(upper.f1 = subselect_tbl.f2)
while the other condition was not in the list of join clauses and
therefore ignored for the uniquification's purpose.
And gdb tells me that create_unique_path() never gets more than 1
clause. I can't tell whether it should do just for this special purpose.
> Not sure about where Antonin's patch is going off the rails. I suspect
> it's too simple somehow, but it's also possible that it's OK and the
> real issue is some previously undetected bug in LATERAL processing.
So far I have no idea how to achieve such conditions without this patch.
Thanks for your comments.
// Antonin Houska (Tony)
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2013-12-11 23:49:18 | Re: Reference to parent query from ANY sublink |
Previous Message | Tom Lane | 2013-12-11 23:13:23 | Re: Reference to parent query from ANY sublink |