Re: Approximate join on timestamps

From: Brent Wood <b(dot)wood(at)niwa(dot)co(dot)nz>
To: Phil Endecott <spam_from_postgresql_general(at)chezphil(dot)org>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Approximate join on timestamps
Date: 2007-03-21 00:39:33
Message-ID: 46007EC5.7050302@niwa.co.nz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Phil Endecott wrote:
> Dear Experts,
>
> I have two tables containing chronological data, and I want to join
> them using the timestamps. The challenge is that the timestamps only
> match approximately.

Hi Phil,

This is how we dealt with a similar situation. It may be suitable for
you....

We have about 200,000,000 records timestamped to facilitate this using a
more or less data warehousing approach.

We generate timestamps at one minute intervals, then assign the
appropriate values (readings) to each timestamp, by using the last
recorded reading before that time (separate record for each instrument)
within an appropriate interval (so missing values are not populated with
historic values). Note that this will discard all records for each
reading except for the last one per interval.

We also add a column ("timer") which has values of 1, 2, 5, 10, 20, 30,
60, 720, 1440 depending on the hour & minutes of the timestamp.

so a "where timer >=60" returns hourly readings, "where timer =720"
gives midday readings, "where timer >=10" gives every 10 minute reading,
etc

This then gets a clustered index on timestamp (and is partitioned on
year) and a 24 way self-relation (ie: 24 instrument readings joined by
timestamp returns 3 months of 10 minute values in 20 odd seconds on a
fastish desktop box.

HTH,

Brent Wood

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Matthew.Pettis@gmail.com 2007-03-21 01:48:15 Using PostgreSQL to archive personal email
Previous Message Tom Lane 2007-03-21 00:12:37 Re: "sniffing" postgres queries