Re: Help with optimizing a query over hierarchical data

From: Damon Snyder <damon(at)huddler-inc(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Help with optimizing a query over hierarchical data
Date: 2014-03-04 17:06:12
Message-ID: CACkQbuhpsjSLnXykKVcieZey7mW5TozW-_j=TTE0Hpr2bxa8YA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Claudio,
Thanks for the help!

Damon

On Mon, Mar 3, 2014 at 8:20 PM, Claudio Freire <klaussfreire(at)gmail(dot)com>wrote:

> On Mon, Mar 3, 2014 at 10:12 PM, Damon Snyder <damon(at)huddler-inc(dot)com>
> wrote:
> >
> >> Um... I think your problem is a misuse of CTE. Your CTE is building an
> > intermediate of several thousands of rows only to select a dozen
> > afterwards. You may want to consider a view or subquery, though I'm
> > not sure pg will be able to optimize much given your use of window
> > functions, which forces a materialization of that intermediate result.
> >
> > The application requires that we find an element and it's neighbors
> within a
> > sorted set at a given offset after filtering by category and status. In
> the
> > examples provided, we need position 50000, 6 above, and 6 below. Is
> there a
> > way do to that more efficiently without first determining the position of
> > each element within the set using a window function? How would a subquery
> > help?
> >
> > The only solution I could come up with was to materialize the
> intermediate
> > result with the CTE (since you don't know ahead of time how many objects
> > match the status and category criteria) then use the window to include
> the
> > position or index.
>
>
> You're materializing on a per-query basis. That's no good (as your
> timings show). Try to find a way to materialize on a more permanent
> basis.
>
> I cannot give you a specific solution without investing way more time
> than I would. But consider this: all your queries costs are CPU costs.
> You need a better algorithm, or better hardware. I doubt you'll find
> hardware that performs 16 times faster, so you have to concentrate on
> a better algorithm.
>
> And it's unlikely you'll find a better algorithm without a better data
> structure. So you need to reorganize your database to make it easier
> to query. I don't think simple SQL optimizations will get you to your
> performance goal.
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Venkata Balaji Nagothi 2014-03-04 23:35:46 Re: Query taking long time
Previous Message acanada 2014-03-04 11:23:48 Re: Query taking long time