Re: SQL Help - Finding Next Lowest Value of Current Row Value

From: Jeff Adams <Jeff(dot)Adams(at)noaa(dot)gov>
To: chris(at)chriscurvey(dot)com
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: SQL Help - Finding Next Lowest Value of Current Row Value
Date: 2011-10-11 11:43:54
Message-ID: 000d01cc880b$0dce1850$296a48f0$%Adams@noaa.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks for the suggestions Chris (and Chris). After a bit more investigation
I stumbled upon the Window functions. The approach below turned out to be
much more efficient that a function or self join approach. The SQL that I
used is provided below (event_id and mmsi uniquely identify a vessel transit
for which I wished to compute how much time had elapsed between successive
records):

SELECT

a.event_id,

a.mmsi,

(a.epoch - lag(epoch) OVER (PARTITION BY event_id, mmsi ORDER BY epoch
ASC))/60 AS elapsed

FROM

dmas_ais a

Jeff

From: ccurvey(at)gmail(dot)com [mailto:ccurvey(at)gmail(dot)com] On Behalf Of Chris Curvey
Sent: Saturday, October 01, 2011 10:55 PM
To: Jeff Adams
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] SQL Help - Finding Next Lowest Value of Current Row
Value

On Sat, Oct 1, 2011 at 5:30 PM, Jeff Adams <Jeff(dot)Adams(at)noaa(dot)gov> wrote:

Greetings,

I have a large table (~19 million records). Records contains a field
identifying a vessel and a field containing an time (epoch). Using the
current rows vessel and time values, I need to be able to find the next
lowest time value for the vessel and use it to compute how much time has
elapsed between the records. I have tried a scalar subquery in the SELECT,
which works, but it runs quite slowly. Is there an alternative approach that
might perform better for this type of query. Any information would be
greatly appreciated. Thanks...

Jeff

Would a self-join with a MAX() help, like this? (Where "v" is your
vessel_id and "e" is your time value?)

create table stuff

( v int

, e timestamp

);

insert into stuff (v, e) values (1, '1/1/2011');

insert into stuff (v, e) values (1, '1/2/2011');

insert into stuff (v, e) values (1, '1/3/2011');

insert into stuff (v, e) values (2, '2/1/2011');

insert into stuff (v, e) values (2, '2/2/2011');

select a.v, a.e, max(b.e), a.e - max(b.e)

from stuff a

join stuff b on a.v = b.v

where a.e > b.e

group by a.v, a.e

I don't have a multi-million row table handy, but I'd be interested to hear
your results.

--

e-Mail is the equivalent of a postcard written in pencil. This message may
not have been sent by me, or intended for you. It may have been read or
even modified while in transit. e-Mail disclaimers have the same force in
law as a note passed in study hall. If your corporate attorney says that
you need an disclaimer in your signature, you need a new corporate attorney.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Krishnanand Gopinathan Sathikumari 2011-10-11 12:33:38 Question on GiST re-index
Previous Message Alban Hertroys 2011-10-11 07:05:29 Re: Should casting to integer produce same result as trunc()