Re: Help with optimizing a query over hierarchical data

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Damon Snyder <damon(at)huddler-inc(dot)com>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Help with optimizing a query over hierarchical data
Date: 2014-03-03 21:52:06
Message-ID: CAGTBQpYM=BmLJ5hr8UXE9O0Ks0v3aO8hQMX4DhG-5e3H2_2aMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Um... I think your problem is a misuse of CTE. Your CTE is building an
intermediate of several thousands of rows only to select a dozen
afterwards. You may want to consider a view or subquery, though I'm
not sure pg will be able to optimize much given your use of window
functions, which forces a materialization of that intermediate result.

I think you need to re-think your queries to be smarter about that.

On Mon, Mar 3, 2014 at 2:55 PM, Damon Snyder <damon(at)huddler-inc(dot)com> wrote:
> Hi Claudio,
> Thanks for responding. Here is the explain (http://explain.depesz.com/s/W3W)
> for the ordering by meta container starting on line 192
> (https://gist.github.com/drsnyder/9277054#file-object-ordering-setup-sql-L192)
>
> Here is the explain (http://explain.depesz.com/s/d1O) for the ordering by
> score starting on line 192
> (https://gist.github.com/drsnyder/9277054#file-object-ordering-setup-sql-L216)
>
> Both of the explains were done with (ANALYZE, BUFFERS).
>
> Thanks for the suggestion regarding de-normalizing. I'll consider that
> approach for the score based query.
>
> I've also included the server config changes made from updates to
> postgresql.conf on the box that I'm testing on. See below.
>
> Thanks,
> Damon
>
> version
> --------------------------------------------------------------------------------------------------------------
> PostgreSQL 9.2.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7
> 20120313 (Red Hat 4.4.7-3), 64-bit
> (1 row)
>
> name | current_setting | source
> ------------------------------+--------------------+----------------------
> application_name | psql | client
> checkpoint_completion_target | 0.9 | configuration file
> checkpoint_segments | 16 | configuration file
> DateStyle | ISO, MDY | configuration file
> default_tablespace | ssd2 | user
> default_text_search_config | pg_catalog.english | configuration file
> effective_cache_size | 5632MB | configuration file
> lc_messages | en_US.UTF-8 | configuration file
> lc_monetary | en_US.UTF-8 | configuration file
> lc_numeric | en_US.UTF-8 | configuration file
> lc_time | en_US.UTF-8 | configuration file
> listen_addresses | * | configuration file
> log_destination | stderr | configuration file
> log_directory | pg_log | configuration file
> log_filename | postgresql-%a.log | configuration file
> log_line_prefix | %d %m %c %x: | configuration file
> log_min_duration_statement | 500ms | configuration file
> log_min_error_statement | error | configuration file
> log_min_messages | error | configuration file
> log_rotation_age | 1d | configuration file
> log_rotation_size | 0 | configuration file
> log_timezone | UTC | configuration file
> log_truncate_on_rotation | on | configuration file
> logging_collector | on | configuration file
> maintenance_work_mem | 480MB | configuration file
> max_connections | 80 | configuration file
> max_stack_depth | 2MB | environment variable
> port | 5432 | command line
> shared_buffers | 1920MB | configuration file
> TimeZone | UTC | configuration file
> wal_buffers | 16MB | configuration file
> work_mem | 8MB | configuration file
> (32 rows)
>
>
>
> On Sat, Mar 1, 2014 at 5:02 PM, Claudio Freire <klaussfreire(at)gmail(dot)com>
> wrote:
>>
>> On Fri, Feb 28, 2014 at 5:01 PM, Damon Snyder <damon(at)huddler-inc(dot)com>
>> wrote:
>> > The primary query that I'm trying to optimize executes in about 1600ms
>> > on my
>> > laptop and about 800ms on production-like hardware (more for the score
>> > version). My target is to get the data fetch down below 100ms if
>> > possible.
>>
>> Could you post some explain analyze of those particular queries?
>>
>> > If you have any suggestions it would be greatly appreciated. Am I
>> > missing
>> > something obvious? Is there a logically equivalent alternative that
>> > would be
>> > more efficient?
>>
>> I'd suggest de-normalizing a bit. For instance, why don't you put the
>> score right into the object? I'm sure the indirection is hurting.
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Venkata Balaji Nagothi 2014-03-03 23:28:22 Re: Query taking long time
Previous Message Eli Naeher 2014-03-03 18:55:12 Subselect an order of magnitude faster than nested joins