Re: Slow query with indexed ORDER BY and LIMIT when using OR'd conditions

From: johno <jan(dot)suchal(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow query with indexed ORDER BY and LIMIT when using OR'd conditions
Date: 2014-07-22 05:57:08
Message-ID: CACuOPqCHEAK93Q5LqfdC6EDqfP57=S2pq68Gn4=S3B1usz-GUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, Jul 22, 2014 at 4:53 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> johno <jan(dot)suchal(at)gmail(dot)com> writes:
> > I am trying to optimize a simple query that returns first 100 rows that
> > have been updated since a given timestamp (ordered by timestamp and id
> > desc). If there are several rows with the same timestamp I need to a
> > second condition, that states that I want to return rows having the given
> > timestamp and id > given id.
>
> > The obvious query is
>
> > SELECT * FROM register_uz_accounting_entities
> > WHERE effective_on > '2014-07-11' OR (effective_on = '2014-07-11' AND
> > id > 1459)
> > ORDER BY effective_on, id
> > LIMIT 100
>
> A more readily optimizable query is
>
> SELECT * FROM register_uz_accounting_entities
> WHERE (effective_on, id) > ('2014-07-11'::date, 1459)
> ORDER BY effective_on, id
> LIMIT 100
>

Yes, but that query has completely different semantics - I can't change
that.

>
> This formulation allows the planner to match both the WHERE and ORDER BY
> clauses directly to the two-column index.
>

Are both fields really used? I was under the impression that only the first
column from index can be used when there is a range query.

>
> > I've tried to optimize this query by pushing down the limit and order
> by's
> > into explicit subselects.
>
> As noted earlier, that's unlikely to be an improvement, because on its
> face it specifies more computation. Postgres is not terribly bright
> about UNIONs, either.
>

Despite the cost calculation in explain the actual query times are very
different. I get consistent sub 50ms responses from the optimized one
(union with pushing down the limits) and 500+ms for the plain one (when not
using bitmap index scan).

Is this possible optimization considered by query planner or do I have
"force" it?

Thanks again for your time and effort, I appreciate it.

>
> regards, tom lane
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2014-07-22 06:15:29 Re: Slow query with indexed ORDER BY and LIMIT when using OR'd conditions
Previous Message Tom Lane 2014-07-22 02:53:01 Re: Slow query with indexed ORDER BY and LIMIT when using OR'd conditions