Re: Estimation row error

From: "Gunnar \"Nick\" Bluth" <gunnar(dot)bluth(dot)extern(at)elster(dot)de>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Estimation row error
Date: 2015-12-15 09:25:30
Message-ID: 566FDC8A.8050701@elster.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Am 15.12.2015 um 09:05 schrieb Mathieu VINCENT:
> Hello,
>
> No one to help me to understand this bad estimation rows ?

Well,

on a rather beefy machine, I'm getting quite a different plan:
http://explain.depesz.com/s/3y5r

Which may be related to this setting:
perftest=# show default_statistics_target ;
default_statistics_target
---------------------------
1000
(1 Zeile)

I guess the wrong row assumption (which I get as well!) is caused by the
given correlation of t3.c1 and t3.c2 (which the planner doesn't "see").

Tomas Vondra has written a nice blog post, covering that topic as well:
http://blog.pgaddict.com/posts/common-issues-with-planner-statistics

AFAIK, 9.5 has received some improvements in that field, but I didn't
try that yet.

Best regards,

Nick

>
> Mathieu VINCENT
>
> 2015-12-11 12:35 GMT+01:00 Mathieu VINCENT
> <mathieu(dot)vincent(at)pmsipilot(dot)com <mailto:mathieu(dot)vincent(at)pmsipilot(dot)com>>:
>
> Sorry, I forget to precise Postgresql version
>
> 'PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
> 4.4.7 20120313 (Red Hat 4.4.7-11), 64-bit'
>
>
> BR
>
> Mathieu VINCENT
>
>
>
>
> 2015-12-11 9:53 GMT+01:00 Mathieu VINCENT
> <mathieu(dot)vincent(at)pmsipilot(dot)com <mailto:mathieu(dot)vincent(at)pmsipilot(dot)com>>:
>
> Hello,
>
> I would like to know how row estimation is calculed by explain ?
> In my execution plan, this estimation is extremely wrong (267
> instead of 198000)
> I reproduced this estimation error in this simple case :
>
> drop table if exists t1;
> drop table if exists t2;
> drop table if exists t3;
> drop table if exists t4;
>
> create table t1 as select generate_Series(1,300000) as c1;
> create table t2 as select generate_Series(1,400) as c1;
> create table t3 as select generate_Series(1,200000)%100 as
> c1,generate_Series(1,200000) as c2;
> create table t4 as select generate_Series(1,200000) as c1;
>
> alter table t1 add PRIMARY KEY (c1);
> alter table t2 add PRIMARY KEY (c1);
> alter table t3 add PRIMARY KEY (c1,c2);
> create index on t3 (c1);
> create index on t3 (c2);
> alter table t4 add PRIMARY KEY (c1);
>
> analyze t1;
> analyze t2;
> analyze t3;
> analyze t4;
>
> EXPLAIN (analyze on, buffers on, verbose on)
> select
> *
> from
> t1 t1
> inner join t2 on t1.c1=t2.c1
> inner join t3 on t2.c1=t3.c1
> inner join t4 on t3.c2=t4.c1
>
> Explain plan :
> http://explain.depesz.com/s/wZ3v
>
> I think this error may be problematic because planner will
> choose nested loop instead of hash joins for ultimate join. Can
> you help me to improve this row estimation ?
>
> Thank you for answering
>
> Best Regards,
> <http://www.psih.fr/> PSIH Décisionnel en santé
> Mathieu VINCENT
> Data Analyst
> PMSIpilot - 61 rue Sully - 69006 Lyon - France
>
>
>

--
Gunnar "Nick" Bluth
DBA ELSTER

Tel: +49 911/991-4665
Mobil: +49 172/8853339

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Andreas Kretschmer 2015-12-15 09:49:38 Re: Estimation row error
Previous Message Mathieu VINCENT 2015-12-15 08:05:38 Re: Estimation row error