From: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
---|---|
To: | Kim Rose Carlsen <krc(at)hiper(dot)dk> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [GENERAL] Remove useless joins (VARCHAR vs TEXT) |
Date: | 2017-09-16 21:28:44 |
Message-ID: | CAKJS1f-CqHWD+JahHKSLsjRMFLH3EYFFav6DS1eznUa=LHTQZQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
On 17 September 2017 at 08:07, Kim Rose Carlsen <krc(at)hiper(dot)dk> wrote:
> It seems there are some difference in VARCHAR vs TEXT when postgres tries to
> decide if a LEFT JOIN is useful or not. I can't figure out if this is
> intentional because there are some difference between TEXT and VARCHAR that
> I dont know about or if it's a bug.
>
>
> I would expect both examples to produce same query plan
>
>
> a)
>
> create table a (id varchar primary key);
> create table b (id varchar primary key);
>
> explain select a.*
> from a
> left join (select distinct id from b) as b
> on a.id = b.id;
>
>
> QUERY PLAN
> ------------------------------------------------------------------
> Hash Right Join (cost=67.60..113.50 rows=1360 width=32)
> Hash Cond: ((b.id)::text = (a.id)::text)
> -> HashAggregate (cost=27.00..40.60 rows=1360 width=32)
> Group Key: b.id
> -> Seq Scan on b (cost=0.00..23.60 rows=1360 width=32)
> -> Hash (cost=23.60..23.60 rows=1360 width=32)
> -> Seq Scan on a (cost=0.00..23.60 rows=1360 width=32)
> (7 rows)
>
> b)
>
> create table a (id text primary key);
>
> create table b (id text primary key);
>
> explain select a.*
> from a
> left join (select distinct id from b) as b
> on a.id = b.id;
>
> QUERY PLAN
> ------------------------------------------------------
> Seq Scan on a (cost=0.00..23.60 rows=1360 width=32)
Yeah, it looks like the code to check for distinctness in the subquery
fails to consider that the join condition may contain RelabelTypes
instead of plain Vars.
The join would be removed if you'd written:
explain select a.* from a left join b on a.id = b.id;
so really the subquery version should be too.
I'm undecided if this should be classed as a bug or just a missed
optimisation. Certainly, the original code should have done this, so
I'm leaning slightly towards this being a bug.
The attached fixes.
(CC'd -hackers since we're starting to discuss code changes. Further
discussion which includes -hackers should drop the general list)
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment | Content-Type | Size |
---|---|---|
join_removal_subquery_fix.patch | application/octet-stream | 888 bytes |
From | Date | Subject | |
---|---|---|---|
Next Message | Gmail | 2017-09-17 01:30:35 | Re: looking for a globally unique row ID |
Previous Message | rob stone | 2017-09-16 20:20:21 | Re: looking for a globally unique row ID |
From | Date | Subject | |
---|---|---|---|
Next Message | Andreas Joseph Krogh | 2017-09-16 21:36:40 | Re: Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers) |
Previous Message | Thomas Munro | 2017-09-16 20:49:22 | Re: valgrind vs. shared typmod registry |