Re: select subquery versus join subquery

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Gunnar Nick Bluth <gunnar(dot)bluth(dot)extern(at)elster(dot)de>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: select subquery versus join subquery
Date: 2017-05-23 16:59:49
Message-ID: CAMkU=1z_zzi+h-NTo0Skftr3YxF0LPtZqDGLBSRr-pRcBKQOLQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, May 23, 2017 at 4:03 AM, Gunnar "Nick" Bluth <
gunnar(dot)bluth(dot)extern(at)elster(dot)de> wrote:

> Am 05/22/2017 um 09:57 PM schrieb Jeff Janes:
> >
> > create view view2 as select id,
> > (
> > select md5 from thing_alias where thing_id=id
> > order by priority desc limit 1
> > ) as md5,
> > cutoff from thing;
> >
> > Cheers,
> >
> > Jeff
>
> Hi Jeff,
>
> how does something like
>
> CREATE OR REPLACE VIEW public.view3 AS
> SELECT thing.id,
> foo.md5,
> thing.cutoff
> FROM thing,
> LATERAL ( SELECT DISTINCT ON (thing_alias.thing_id)
> thing_alias.thing_id,
> thing_alias.md5
> FROM thing_alias
> WHERE thing_alias.thing_id = thing.id
> ORDER BY thing_alias.thing_id, thing_alias.priority DESC) foo
>
> work for you? At least that's always using an index scan here, as
> opposed to view1, which (for me) defaults to a SeqScan on thing_alias at
> a low cutoff.
>

Unfortunately that always uses the index scan, even at a high cutoff where
aggregation on the seq scan and then hash joining is more appropriate. So
it is very similar to view2, except that it doesn't return the rows from
"thing" which have zero corresponding rows in thing_alias.

*****
> Note btw. that both view1 and view2 don't return any md5 values for me,
> while view3 does!
> *****
>

Because of the way I constructed the data, using the power transform of the
uniform random distribution, the early rows of the view (if sorted by
thing_id) are mostly null in the md5 column, so if you only look at the
first few screen-fulls you might not see any md5. But your view does
effectively an inner join rather than a left join, so your view gets rid of
the rows with a NULL md5. Most things don't have aliases; of the things
that do, most have 1; and some have a several.

Cheers,

Jeff

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jarek 2017-05-23 18:29:34 More cores or higer frequency ?
Previous Message Dinesh Chandra 12108 2017-05-23 13:07:00 Re: Log update query along with username who has executed the same.