Re: Help with optimizing a query over hierarchical data

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Damon Snyder <damon(at)huddler-inc(dot)com>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Help with optimizing a query over hierarchical data
Date: 2014-03-04 04:20:18
Message-ID: CAGTBQpaxhyvG42C5W82CtxOp+GLA9710thw1ai4PCxbByW3u=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Mar 3, 2014 at 10:12 PM, Damon Snyder <damon(at)huddler-inc(dot)com> wrote:
>
>> Um... I think your problem is a misuse of CTE. Your CTE is building an
> intermediate of several thousands of rows only to select a dozen
> afterwards. You may want to consider a view or subquery, though I'm
> not sure pg will be able to optimize much given your use of window
> functions, which forces a materialization of that intermediate result.
>
> The application requires that we find an element and it's neighbors within a
> sorted set at a given offset after filtering by category and status. In the
> examples provided, we need position 50000, 6 above, and 6 below. Is there a
> way do to that more efficiently without first determining the position of
> each element within the set using a window function? How would a subquery
> help?
>
> The only solution I could come up with was to materialize the intermediate
> result with the CTE (since you don't know ahead of time how many objects
> match the status and category criteria) then use the window to include the
> position or index.

You're materializing on a per-query basis. That's no good (as your
timings show). Try to find a way to materialize on a more permanent
basis.

I cannot give you a specific solution without investing way more time
than I would. But consider this: all your queries costs are CPU costs.
You need a better algorithm, or better hardware. I doubt you'll find
hardware that performs 16 times faster, so you have to concentrate on
a better algorithm.

And it's unlikely you'll find a better algorithm without a better data
structure. So you need to reorganize your database to make it easier
to query. I don't think simple SQL optimizations will get you to your
performance goal.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message acanada 2014-03-04 10:10:27 Re: Query taking long time
Previous Message Damon Snyder 2014-03-04 01:12:16 Re: Help with optimizing a query over hierarchical data