Re: Query tuning: partitioning, DISTINCT ON, and indexing

From: bricklen <bricklen(at)gmail(dot)com>
To: Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Query tuning: partitioning, DISTINCT ON, and indexing
Date: 2013-06-21 16:08:38
Message-ID: CAGrpgQ_S-JTFmEKov+BOOTB0zPHM-ZOoguczzC4xVY46VEHcJg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Jun 20, 2013 at 10:14 PM, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>wrote:

> On Thu, Jun 20, 2013 at 9:13 PM, bricklen <bricklen(at)gmail(dot)com> wrote:
>
>>
>> On Thu, Jun 20, 2013 at 6:24 PM, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>wrote:
>>
>>> SELECT
>>> DISTINCT ON (type) ts, type, details
>>> FROM
>>> observations
>>> WHERE
>>> subject = '...'
>>> ORDER BY
>>> type, ts DESC;
>>>
>>
>> First thing: What is your "work_mem" set to, and how much RAM is in the
>> machine? If you look at the plan, you'll immediately notice the "external
>> merge Disk" line where it spills to disk on the sort. Try setting your
>> work_mem to 120MB or so (depending on how much RAM you have, # concurrent
>> sessions, complexity of queries etc)
>>
>
> Good call, thanks, although the in-mem quicksort is not much faster:
>
>
> QUERY PLAN
>
>
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=471248.30..489392.67 rows=3 width=47) (actual
> time=32002.133..32817.474 rows=3 loops=1)
> Buffers: shared read=30264
> -> Sort (cost=471248.30..480320.48 rows=3628873 width=47) (actual
> time=32002.128..32455.950 rows=3628803 loops=1)
> Sort Key: public.observations.type, public.observations.ts
> Sort Method: quicksort Memory: 381805kB
> Buffers: shared read=30264
> -> Result (cost=0.00..75862.81 rows=3628873 width=47) (actual
> time=0.026..1323.317 rows=3628803 loops=1)
> Buffers: shared read=30264
> -> Append (cost=0.00..75862.81 rows=3628873 width=47)
> (actual time=0.026..978.477 rows=3628803 loops=1)
> Buffers: shared read=30264
> ...
>
> the machine is not nailed down, but I think I'd need to find a way to
> drastically improve the plan to keep this in Postgres. The alternative is
> probably caching the results somewhere else: for any given subject, I only
> need the latest observation of each type 99.9+% of the time.
>

Here are some pages that might help for what details to provide:
https://wiki.postgresql.org/wiki/Server_Configuration
https://wiki.postgresql.org/wiki/Slow_Query_Questions

Did you try an index on (type, ts desc) ? I don't have much else to add at
this point, but maybe after posting some more server and table (parent and
child) details someone will have an answer for you.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Maciek Sakrejda 2013-06-21 17:08:50 Re: Query tuning: partitioning, DISTINCT ON, and indexing
Previous Message Maciek Sakrejda 2013-06-21 05:14:40 Re: Query tuning: partitioning, DISTINCT ON, and indexing