Re: Why does the number of rows are different in actual and estimated.

From: Evgeny Shishkin <itparanoia(at)gmail(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: AI Rumman <rummandba(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Why does the number of rows are different in actual and estimated.
Date: 2012-12-13 22:40:35
Message-ID: 55F1445F-304C-41C7-AB55-5ABC19184F00@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


On Dec 14, 2012, at 2:36 AM, Andrew Dunstan <andrew(at)dunslane(dot)net> wrote:

>
> On 12/13/2012 05:12 PM, AI Rumman wrote:
>> Why does the number of rows are different in actual and estimated?
>>
>
>
> Isn't that in the nature of estimates? An estimate is a heuristic guess at the number of rows it will find for the given query or part of a query. It's not uncommon for estimates to be out by several orders of magnitude. Guaranteeing estimates within bounded accuracy and in a given short amount of time (you don't want your planning time to overwhelm your execution time) isn't possible.
>

The main question i think is what to do with it.

The problem starts here

-> Hash Join (cost=9337.97..18115.71 rows=34489 width=244) (actual time=418.054..1156.453 rows=205420 loops=1)
Hash Cond: (customerdetails.customerid = entity.id)
-> Seq Scan on customerdetails (cost=0.00..4752.46 rows=327146 width=13) (actual time=0.021..176.389 rows=327328 loops=1)
-> Hash (cost=6495.65..6495.65 rows=227386 width=231) (actual time=417.839..417.839 rows=205420 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 16056kB
-> Index Scan using entity_setype_idx on entity (cost=0.00..6495.65 rows=227386 width=231) (actual time=0.033..2
53.880 rows=205420 loops=1)
Index Cond: ((setype)::text = 'con_s'::text)
-> Index Scan using con_address_pkey on con_address (cost=0.00..0.27 rows=1 width=46) (actual time=0.003..0.004 rows=1 loops=2054
20)

As you see access methods estimates are ok, it is join result set which is wrong.

How to deal with it?

May be a hack with CTE can help, but is there a way to improve statistics correlation?

> cheers
>
> andrew
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Claudio Freire 2012-12-13 22:42:05 Re: Why does the number of rows are different in actual and estimated.
Previous Message Andrew Dunstan 2012-12-13 22:36:26 Re: Why does the number of rows are different in actual and estimated.