Re: PostgreSQL Query Speed Issues

From: Joseph Pravato <joseph(dot)pravato(at)nomagic(dot)com>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: "pgsql-novice(at)postgresql(dot)org" <pgsql-novice(at)postgresql(dot)org>
Subject: Re: PostgreSQL Query Speed Issues
Date: 2013-02-27 23:07:00
Message-ID: 512E9194.6000506@nomagic.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Kevin Grittner wrote:

> Ah, well that gets us into a whole new ballgame. With that many
> tables, planning time can become an issue in itself, so there are
> various things that PostgreSQL does to try to planning time from
> ballooning to a point where it takes longer than the time saved by
> plan improvement. The first thing I would try is boosting these
> settings to more than the number of table references:
>
> from_collapse_limit
> join_collapse_limit
> geqo_threshold
>
> Try an EXPLAIN ANALYZE and see whether the "actual time" at the top
> level node of the plan looks good, and see how it compares to
> "Total runtime" at the bottom. The difference is primarily
> planning time, so you can see how good a plan you got versus how
> expensive it was to find that plan. If you see that there are good
> plans, but it is too expensive to find them, you might want to let
> the "genetic query optimizer" have a shot at planning, by adjusting
> the above values. The concept of this alternative planner is that
> it tries to get a plan which is "good enough" with bounded planning
> time.
>
> See the docs for details.

We've changed both 'from_collapse_limit' and 'join_collapse_limit' from
8 to 15. We tried multiple combinations for both values from 8, 12-17, &
20 and both at 15 seemed to be a sweet spot for this view.
geqo_threshold didn't really make any changes once we changed the other two.

We've started to test more of our queries that involve the view and we
believe we may have come across a possible reason for the poor
performance in our views.
The view is pulling data from many tables as said before, one of them
has a varchar(255) column (ColA) that is copied over into the view.
However, in the view column definition, it is a varchar(2147483647).
In the query we tested, we are running an equivalence check against this
column, along with another varchar column of size 1 (ColB). When we
remove the check against ColA the query returns in 2.5 seconds. When
included it talks 200 seconds. There is almost no difference when the
ColB check is removed (2.5s vs 2.3s).

It is of our belief that this varchar(2147483647) could be causing
performance problems. ColA might be defaulting to 2147483647 because it
is being union-ed with the same column a couple of times in different
circumstances. So we are wondering if there a way to set the column's
varchar size in a view definition?

> The plan seems to be reluctant to use index scans, which might be
> related to the ratio between these values:
>
> random_page_cost | 1
> seq_page_cost | 0.1
>
> Do you get a better plan if these are equal? If random is only
> twice the sequential cost?

We have gone ahead and made these settings the same and had no
significant performance increase. We're unsure on what would be a better
plan, please see the next section we commented on.

> The other possible issue is that depending on how the views are
> used, it sometimes creates an optimization barrier. In general,
> the planner will see more options if the views are joined than if
> they are used in subqueries or CTEs.
>
> You might also want to try pasting your plan into:
>
> http://explain.depesz.com/

Side comment: I like this website, thanks for the link. However, we are
not exactly DBA's or anything and are new to databases & postgres. We
are still doing some research on what the site outputs since we're still
new to the information presented. Do you know of any good sites that
show good vs bad query plans? Given the sheer amount in the view's query
plan, we aren't sure whats going on.

Thank you again for all of your assistance!

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message oisin 2013-02-28 19:06:13 Partitions for large spatial data
Previous Message Tom Lisjac 2013-02-27 18:49:22 Re: PostgreSQL Query Speed Issues