Quick Links

Re: Best way to handle multi-billion row read-only table?

From:	Asher Hoskins <asher(at)piceur(dot)co(dot)uk>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Re: Best way to handle multi-billion row read-only table?
Date:	2010-02-09 21:41:26
Message-ID:	4B71D686.8040003@piceur.co.uk
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Justin Graf wrote:
> Well first is that 200hz meaning 200 samples per channel per second.
> That is very fast sampling for pressure sensor, I would be surprised if
> the meters are actually giving real results at that rate. I would look
> at reducing that down to what the meter is actual capable of sending
> What kind of AD card is being used as this effects what makes sense to
> record.

Yes, we really are measuring at 200 samples per second. We're trying to
capture high resolution images of pressure transients as they move along
water distribution pipelines (the backbones of the water network,
typically 4'-6' in diameter, carrying 500-1000 litres/second) to
understand how they travel and what stress they put upon the pipe. We're
using custom data loggers at the moment based around Intel iMote2 Linux
systems with a high-speed QuickFilter ADC (and the sensors we're using
can cope at 200Hz).

> I would look into table partitioning
> http://www.postgresql.org/docs/current/static/ddl-partitioning.html
> http://wiki.postgresql.org/wiki/Table_partitioning

Thanks for that, it looks like partitioning is the way to go. I'm
assuming that I should try and keep my total_relation_sizes less than
the memory size of the machine?

> A one big index for such a small record will not be a big win because
> the index are going to be the same size as table.
> Look into limiting the number of records each index covers.
> http://www.postgresql.org/docs/8.4/static/sql-createindex.html

If I partition so that each partition holds data for a single channel
(and set a CHECK constraint for this) then I can presumably remove the
channel from the index since constraint exclusion will mean that only
partitions holding the channel I'm interested in will be searched in a
query. Given that within a partition all of my sample_time's will be
different do you know if there's a more efficient way to index these?

Many thanks,
Asher

In response to

Re: Best way to handle multi-billion row read-only table? at 2010-02-09 19:40:14 from Justin Graf

Responses

Re: Best way to handle multi-billion row read-only table? at 2010-02-09 22:08:09 from John R Pierce
Re: Best way to handle multi-billion row read-only table? at 2010-02-09 23:05:21 from Justin Graf

Browse pgsql-general by date

	From	Date	Subject
Next Message	Scott Marlowe	2010-02-09 21:53:36	Re: Best way to handle multi-billion row read-only table?
Previous Message	Scott Marlowe	2010-02-09 21:12:03	Re: vacuumdb ERROR: out of memory