Quick Links

Re: Slow query with planner row strange estimation

From:	Dimitri <dimitrik(dot)fr(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	damien hostin <damien(dot)hostin(at)axege(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject:	Re: Slow query with planner row strange estimation
Date:	2010-07-12 07:17:32
Message-ID:	AANLkTinCxi1kURyjrnY3-UmZShUe8Aw-i4vTiasbdX-m@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

It's probably one of the cases when having HINTS in PostgreSQL may be
very helpful..

SELECT /*+ enable_nestloop=off */ ... FROM ...

will just fix this query without impacting other queries and without
adding any additional instructions into the application code..

So, why there is a such resistance to implement hints withing SQL
queries in PG?..

Rgds,
-Dimitri

On 7/9/10, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Jul 9, 2010 at 6:13 AM, damien hostin <damien(dot)hostin(at)axege(dot)com>
> wrote:
>>> Have you tried running ANALYZE on the production server?
>>>
>>> You might also want to try ALTER TABLE ... SET STATISTICS to a large
>>> value on some of the join columns involved in the query.
>>
>> Hello,
>>
>> Before comparing the test case on the two machines, I run analyse on the
>> whole and look at pg_stats table to see if change occurs for the columns.
>> but on the production server the stats never became as good as on the
>> desktop computer. I set statistic at 10000 on column used by the join, run
>> analyse which take a 3000000 row sample then look at the stats. The stats
>> are not as good as on the desktop. Row number is nearly the same but only
>> 1
>> or 2 values are found.
>>
>> The data are not balanced the same way on the two computer :
>> - Desktop is 12000 rows with 6000 implicated in the query (50%),
>> - "Production" (actually a dev/test server) is 6 million rows with 6000
>> implicated in the query (0,1%).
>> Columns used in the query are nullable, and in the 5994000 other rows that
>> are not implicated in the query these columns are null.
>>
>> I don't know if the statistic target is a % or a number of value to
>> obtain,
>
> It's a number of values to obtain.
>
>> but event set at max (10000), it didn't managed to collect good stats (for
>> this particular query).
>
> I think there's a cutoff where it won't collect values unless they
> occur significantly more often than the average frequency. I wonder
> if that might be biting you here: without the actual values in the MCV
> table, the join selectivity estimates probably aren't too good.
>
>> As I don't know what more to do, my conclusion is that the data need to be
>> better balanced to allow the analyse gather better stats. But if there is
>> a
>> way to improve the stats/query with this ugly balanced data, I'm open to
>> it
>> !
>>
>> I hope that in real production, data will never be loaded this way. If
>> this
>> appened we will maybe set enable_nestloop to off, but I don't think it's a
>> good solution, other query have a chance to get slower.
>
> Yeah, that usually works out poorly.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise Postgres Company
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

Re: Slow query with planner row strange estimation at 2010-07-09 20:25:30 from Robert Haas

Responses

Re: Slow query with planner row strange estimation at 2010-07-12 20:33:17 from phb07

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Jayadevan M	2010-07-12 08:29:36	Re: Queries about PostgreSQL PITR
Previous Message	Fujii Masao	2010-07-12 06:25:08	Re: Queries about PostgreSQL PITR