Re: Efficiently selecting single row from a select with window functions row_number, lag and lead

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Andrew Bailey <hazlorealidad(at)gmail(dot)com>
Cc: PGSQL Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Efficiently selecting single row from a select with window functions row_number, lag and lead
Date: 2016-01-06 05:10:09
Message-ID: CAKJS1f-KbrYRxc3vdRUoA8kW46mfNbf9PtFDf2vqSAsq6JaDRA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 2 January 2016 at 16:39, Andrew Bailey <hazlorealidad(at)gmail(dot)com> wrote:

> I would like to do the following:
>
> select id, row_number() over w as rownum, lag(id, 1) over w as prev,
> lead(id, 1) over w as next from route where id=1350 window w as (order by
> shortname, id asc rows between 1 preceding and 1 following) order by
> shortname, id ;
>
> However this gives the result
> 1350;1;;
>
>
This does not work due to the id=1350 is always applied before the rows
make it into the window therefore you only have rows which match id=1350,
which is not what you want in this case.

> The following query gives the result I am expecting
>
> select * from (select id, row_number() over w as rownum,
> lag(id, 1) over w as prev, lead(id, 1) over w as next
> from route window w as (order by shortname, id
> rows between 1 preceding and 1 following) order by shortname, id) as s
> where id=1350
>
> 1350;3;1815;1813
>
>
This works because the id=1350 is not pushed down into the subquery which
contain the windowing functions, this also means that the entire route
table is processed and you may suffer from performance problems if the
route table is, or gets big. You'll be able to confirm this by looking at
the EXPLAIN output and noticing the lack of filter on the seqscan.

> The explain plan is
> "Subquery Scan on s (cost=0.14..15.29 rows=1 width=32)"
> " Filter: (s.id = 1350)"
> " -> WindowAgg (cost=0.14..13.51 rows=143 width=12)"
> " -> Index Only Scan using route_idx on route (cost=0.14..10.29
> rows=143 width=12)"
>
> as it makes use of the index created as follows
>
> CREATE INDEX route_idx
> ON route
> USING btree
> (shortname COLLATE pg_catalog."default", id);
>
> I believe that the index has all the data that is needed to obtain the
> results in a single query.
> Is it possible to write the query as a single select and if so how?
>

why not just write it as: select id, (select max(id) from route where id <
1350) as prev, (select min(id) from route where id > 1350) as next from
route where id=2; ?
That should be much more efficient for a larger table as it should avoid
the seqscan and allow the index to be used for all 3 numbers.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Vitaly Burovoy 2016-01-06 05:15:29 Re: Efficiently selecting single row from a select with window functions row_number, lag and lead
Previous Message Jim Nasby 2016-01-06 04:51:58 Re: Efficiently selecting single row from a select with window functions row_number, lag and lead