Re: Slow query with planner row strange estimation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: damien hostin <damien(dot)hostin(at)axege(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow query with planner row strange estimation
Date: 2010-07-09 20:25:30
Message-ID: AANLkTilpsAwBy2x6Sy8EBYaprOoGp8H0REt3ZiH35cHn@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Jul 9, 2010 at 6:13 AM, damien hostin <damien(dot)hostin(at)axege(dot)com> wrote:
>> Have you tried running ANALYZE on the production server?
>>
>> You might also want to try ALTER TABLE ... SET STATISTICS to a large
>> value on some of the join columns involved in the query.
>
> Hello,
>
> Before comparing the test case on the two machines, I run analyse on the
> whole and look at pg_stats table to see if change occurs for the columns.
> but on the production server the stats never became as good as on the
> desktop computer. I set statistic at 10000 on column used by the join, run
> analyse which take a 3000000 row sample then look at the stats. The stats
> are not as good as on the desktop. Row number is nearly the same but only 1
> or 2 values are found.
>
> The data are not balanced the same way on the two computer :
> - Desktop is 12000 rows with 6000 implicated in the query (50%),
> - "Production" (actually a dev/test server) is 6 million rows with 6000
> implicated in the query (0,1%).
> Columns used in the query are nullable, and in the 5994000 other rows that
> are not implicated in the query these columns are null.
>
> I don't know if the statistic target is a % or a number of value to obtain,

It's a number of values to obtain.

> but event set at max (10000), it didn't managed to collect good stats (for
> this particular query).

I think there's a cutoff where it won't collect values unless they
occur significantly more often than the average frequency. I wonder
if that might be biting you here: without the actual values in the MCV
table, the join selectivity estimates probably aren't too good.

> As I don't know what more to do, my conclusion is that the data need to be
> better balanced to allow the analyse gather better stats. But if there is a
> way to improve the stats/query with this ugly balanced data, I'm open to it
> !
>
> I hope that in real production, data will never be loaded this way. If this
> appened we will maybe set enable_nestloop to off, but I don't think it's a
> good solution, other query have a chance to get slower.

Yeah, that usually works out poorly.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Haas 2010-07-09 20:35:28 Re: Need help in performance tuning.
Previous Message Kevin Grittner 2010-07-09 19:31:47 Re: Need help in performance tuning.