
From: Andrew Staller <andrew(at)timescale(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Large data and slow queries
Date: 2017-04-20 14:44:07
Message-ID: CAEsM1FvhnrnQDbkYDb2WQwayRF8Ft6c9OeQFRcs+jK3ZkPX74g@mail.gmail.com
Lists: pgsql-general

Awesome thread.

Samuel,

Just wanted you to be aware of the work we're doing at TimescaleDB (
http://www.timescale.com/), a time-series database extension for PostgreSQL.

Some of the ways we might help you:
- automatic partitioning by space (a partitioning key such as country_id)
and time. This creates "chunks" of your data, right-sized by volume and
time constraints, which keeps inserts fast at much greater scale than
vanilla PostgreSQL (not sure whether inserts are a problem for you). A
rough SQL sketch follows this list.
- this also helps if your queries are selective on time and country_id
(as in your example)
- partitioning by time lets you DROP old chunks without needing vacuums
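
To make that concrete, here is a minimal sketch, assuming a hypothetical
"events" table with a timestamptz "time" column, a country_id column and
the lat/long fields from this thread. The exact create_hypertable() /
drop_chunks() signatures vary between TimescaleDB versions, so check the
docs for the release you install:

  CREATE EXTENSION IF NOT EXISTS timescaledb;

  CREATE TABLE events (
      time        timestamptz NOT NULL,
      country_id  integer     NOT NULL,
      what        text,
      latitude    float4,
      longitude   float4
  );

  -- Partition by time and, additionally, by country_id across 4 space partitions.
  SELECT create_hypertable('events', 'time', 'country_id', 4);

  -- Retention: drop whole chunks older than 12 months instead of DELETE + VACUUM.
  SELECT drop_chunks(interval '12 months', 'events');

Dropping a chunk removes a whole child table at once, so it avoids the
bloat and vacuum cost of a large DELETE.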

On Thu, Apr 20, 2017 at 8:30 AM, Vick Khera <vivek(at)khera(dot)org> wrote:

> I'm curious why you have so many partial indexes. Are you trying to make
> custom indexes per query? It seems to me you might want to consider making
> the indexes general and removing the redundant ones (those that share the
> same prefix of indexed fields).
>
> Secondly, your table is 102 GB. Clearly there's a lot of data here. How
> many rows does that take? I would further suggest that you partition this
> table such that there are no more than about 10 million rows per partition
> (I've done this using an id % 100 computation). Maybe in your case it makes
> sense to partition it based on the "what" field, because it appears you are
> trying to do that with your partial indexes already.
>
> On Wed, Apr 19, 2017 at 10:11 PM, Samuel Williams <
> space(dot)ship(dot)traveller(at)gmail(dot)com> wrote:
>
>> Okay, so after changing longitude/latitude to float4, and
>> re-organizing the table a bit, I got the query down from about 8
>> minutes to 40 seconds.
>>
>> The details are in the gist comments:
>> https://gist.github.com/ioquatix/bddda36d9e4ffaceb7a62d7b62259121
>>
>> Now I just need to get performance another two orders of magnitude
>> better. Can we make the index a bit more coarse-grained, perhaps
>> convert long/lat to integers or something, use a Hilbert index, use a
>> PostGIS index... ideas?
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>>
>
>
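
For the partial-index and partitioning suggestions quoted above, a rough
sketch; the table and column names (events, id, what, time) are
hypothetical, it assumes events is kept as a plain PostgreSQL table rather
than the hypertable sketched earlier, and the inheritance approach reflects
what stock PostgreSQL offered before declarative partitioning:

  -- One general index whose leading columns cover the redundant partial
  -- indexes that share the same prefix (hypothetical columns):
  CREATE INDEX events_what_time_idx ON events (what, time);

  -- Inheritance-based partitioning keyed on id % 100. Rows are routed by
  -- the application or an INSERT trigger; constraint exclusion only helps
  -- queries that also filter on id % 100.
  CREATE TABLE events_p00 (CHECK (id % 100 = 0)) INHERITS (events);
  CREATE TABLE events_p01 (CHECK (id % 100 = 1)) INHERITS (events);
  -- ... through events_p99, each with its own copies of the indexes.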
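
And for the coarser / PostGIS-style index Samuel asks about above, a hedged
sketch of one common route: store the point once as a geometry and index it
with GiST (again, hypothetical table and column names):

  CREATE EXTENSION IF NOT EXISTS postgis;

  ALTER TABLE events ADD COLUMN geom geometry(Point, 4326);
  UPDATE events SET geom = ST_SetSRID(ST_MakePoint(longitude, latitude), 4326);

  CREATE INDEX events_geom_gist ON events USING gist (geom);

  -- Bounding-box queries can then use the GiST index, e.g.:
  -- SELECT count(*) FROM events
  --  WHERE geom && ST_MakeEnvelope(min_lon, min_lat, max_lon, max_lat, 4326);

Whether this beats a plain btree on (what, time, latitude, longitude)
depends on how selective the spatial filter is, so it is worth
benchmarking both.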

--
TimescaleDB | Growth & Developer Evangelism
c: 908.581.9509

335 Madison Ave.
New York, NY 10017
www.timescale.com
https://github.com/timescale/timescaledb
