Re: Partitioned tables and SELECT ... ORDER BY ... LIMIT

From: Дмитрий Шалашов <skaurus(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Partitioned tables and SELECT ... ORDER BY ... LIMIT
Date: 2014-10-16 17:22:20
Message-ID: CAKPeCUFT8m3jtfAnnfTeZ4vCi2-CS5=d22CAf8iRS0qXso6oSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Jeff,

Thanks for clarifications!

In my case yes, it's just few blocks, but different ones every time I
change user_id value in my WHERE clause. When I change user_id - buffers
are no longer "shared hit" in EXPLAIN. This is a bit more worrying.

But if there is no easy fix - well, OK.

Best regards,
Dmitriy Shalashov

2014-10-16 21:04 GMT+04:00 Jeff Janes <jeff(dot)janes(at)gmail(dot)com>:

> On Thu, Oct 16, 2014 at 5:35 AM, Дмитрий Шалашов <skaurus(at)gmail(dot)com>
> wrote:
>
>> Hi,
>>
>> lets imagine that we have some table, partitioned by timestamp field, and
>> we query it with SELECT with ordering by that field (DESC for example),
>> with some modest limit.
>> Lets further say that required amount of rows is found in the first table
>> that query encounters (say, latest one).
>> I am just wondering, why nevertheless PostgreSQL does read couple of
>> buffers from each of the older tables?
>>
>
> The planner only does partition pruning statically, not dynamically.The
> LIMIT has to be implemented dynamically--it cannot prove absolutely that
> the "first" partition will have enough rows, so it cannot eliminate the
> others.
>
> The "Merge Append" does a priority queue merge, and so needs to read the
> "first" row (according to the ORDER BY) from each partition in order to seed
> the priority queue. I guess what it could be made to do in the case where
> there are suitable check constraints on a partition, is seed the priority
> queue with a dummy value constructed from the constraint. If the merge
> never gets far enough to draw upon that dummy value, then that whole plan
> node never needs to get started up.
>
> In your case that would save very little, as reading a few blocks for each
> partition is not much of a burden. Especially as it the same few blocks
> every time, so they should be well cached. There may be other case where
> this would be more helpful. But it isn't clear to me how the planner could
> build such a feature into its cost estimates, and the whole thing would be
> a rather complex and esoteric optimization to make for uncertain gain.
>
> Cheers,
>
> Jeff
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Emi Lu 2014-10-16 18:30:54 CopyManager(In/out) vs. delete/insert directly
Previous Message Felipe Santos 2014-10-16 17:15:43 Re: Partitioned tables and SELECT ... ORDER BY ... LIMIT