Re: partial time stamp query

From: Brent Wood <Brent(dot)Wood(at)niwa(dot)co(dot)nz>
To: Kirk Wythers <wythe001(at)umn(dot)edu>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: partial time stamp query
Date: 2013-02-04 20:14:58
Message-ID: B30242D206AB9543A3406649674DB4198EFB1006@hamwexdr01.niwa.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Kirk,

We have a (near) real time data database for instrument observations from our research vessels. All observations (summarised to one minute intervals - the actual raw data is in netCDF, this database makes for easier access & meets most users needs) go into a single table, with other tables for metadata about platforms, instruments, etc. Now approaching 350m records, so reasonably substantial.

Underlying optimisations include

partitioned readings table, with a separate partition for each year (now 23 years)
clustered index on timestamp for the previous years partitions.
largeish filesystem block size - tested to work well with the clustered index & small size records)

These generally make a big difference to performance. To address one issue, much like yours, where some users want hourly data for a year, some want daily data for 10 years & some want 1 minute data for the last month (& some, no doubt, want one minute data for 20+ years!) I introduced an integer column called timer. This value is set according to the time (not date) of each record.

Along the lines of (from memory) :an even no of minutes after the hour is 2, 5 minutes is 4, 10 minutes is 8, 15 minute is 16, 30 minutes is 32, 60 minutes is 64, 6 hourly is 128, 12:00 AM is 256 & 12:00PM is 512. When any timestamp is in more than one category (eg: 12:00 is all of even, 5, 15m 30m 60 minutes), the timer value is set to the largest appropriate one.

So a request for:
1 minute data is select from table;
2 minute data is select from table where timer >=2 and timer !=15 and timer !=4;
hourly data is select from table where timer >=64 and timer != 15 and timer != 4;
etc

5 & 15 minute add a bit of complexity, but we gave the users what they wanted. This has worked well for years now, & we have an internal web (mapserver/openlayers based) application allowing users to visualise & download their selected data - they choose from an interval pick list & the SQL is hidden. Some extra enhancements are the automatic collation of lat & lon gps readings into a Postgis point for each reading record, & the automatic aggregation of daily points into daily track lines, so the track for any selected set of dates can easily be displayed on a map (the platforms are mobile vessels - not fixed sites)

You might adapt some of these ideas for your use case?

Cheers

Brent Wood

Programme leader: Environmental Information Delivery
NIWA
DDI: +64 (4) 3860529
________________________________________
From: pgsql-general-owner(at)postgresql(dot)org [pgsql-general-owner(at)postgresql(dot)org] on behalf of Kirk Wythers [wythe001(at)umn(dot)edu]
Sent: Tuesday, February 05, 2013 5:58 AM
To: pgsql-general(at)postgresql(dot)org
Subject: Fwd: [GENERAL] partial time stamp query

Thanks. That worked great! Now I am trying to aggregate these same fifteen minute to hourly. I have tried using date_trunk:

date_trunc('hour', derived_tsoil_fifteen_min_stacked.time2),

but date_truck only seems to aggriage the timestamp. I thought I could use

AVG(derived_tsoil_fifteen_min_stacked.value)

in combination with date_trunk, but I still get 15 minute values, not the hourly average from the four 15 minute records.

rowid date_truck time2 site canopy plot variable name value avg
2010-07-07_00:00:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:00:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 21.06 21.0599994659424
2010-07-07_00:15:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:15:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.96 20.9599990844727
2010-07-07_00:30:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:30:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.88 20.8799991607666
2010-07-07_00:45:00_b4warm_a 2010-07-07 00:00:00 1 2010-07-07 00:45:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.8 20.7999992370605
2010-07-07_01:00:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:00:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.72 20.7199993133545
2010-07-07_01:15:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:15:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.64 20.6399993896484
2010-07-07_01:30:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:30:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.55 20.5499992370605
2010-07-07_01:45:00_b4warm_a 2010-07-07 01:00:00 1 2010-07-07 01:45:00 cfc closed a2 tsoil_sc tsoil_avg1_sc 20.47 20.4699993133545

I was tying to get two records out of this set, with the 'avg" column representing the mean of the first and last four of each 15 minute records.

Perhaps date_trunk only works for the timestamp?

On Feb 4, 2013, at 8:50 AM, Misa Simic <misa(dot)simic(at)gmail(dot)com<mailto:misa(dot)simic(at)gmail(dot)com>> wrote:

WHERE derived_tsoil_fifteen_min_stacked.time2::date = '2010-07-07'::date

On Monday, February 4, 2013, Kirk Wythers wrote:
I am trying to write a query that grabs one particular day from a timestamp column. The data are ordered in 15 minute chunks like this:

2010-07-07 12:45:00
2010-07-07 13:00:00
2010-07-07 13:15:00
2010-07-07 13:30:00
etc…

If I wanted all records from july 7th 2010, I would expect 4 x 24 = 96 records per day.

I have tried the '=' operator, like this

WHERE derived_tsoil_fifteen_min_stacked.time2 = '2010-07-07*'

but that grabs nothing, and using the '~' operator grabs everything with a 2010 or 07 in it… in other words all days from July of 2010.

Any suggestions would be much appreciated.

--
Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Please consider the environment before printing this email.
NIWA is the trading name of the National Institute of Water & Atmospheric Research Ltd.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Steve Crawford 2013-02-04 20:16:17 Re: Is it possible to migrate database from Postgresql 8.2 to 9.2 using pg_upgrade?
Previous Message AI Rumman 2013-02-04 20:06:45 Is it possible to migrate database from Postgresql 8.2 to 9.2 using pg_upgrade?