Re: estimates for nested loop very wrong?

From: joostje(at)komputilo(dot)org
To: pgsql-sql(at)postgresql(dot)org
Cc: joostje(at)komputilo(dot)org
Subject: Re: estimates for nested loop very wrong?
Date: 2003-04-10 22:20:56
Message-ID: 20030410222056.GA4359@co.uea.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Je 2003/04/10(4)/16:04, Tom Lane skribis:
> joostje(at)komputilo(dot)org writes:
> > For example, for the stat.targ=500 run:
> > => explain analyse SELECT id from db, tmp0 WHERE valida AND poseda='uea' AND tab='pers' AND tmp0.v0=id ;
> > NOTICE: QUERY PLAN:
>
> > Nested Loop (cost=0.00..121000.31 rows=28184 width=39) (actual time=1.05..23.24 rows=415 loops=1)
> > -> Seq Scan on tmp0 (cost=0.00..20.00 rows=1000 width=32) (actual time=0.22..0.40 rows=29 loops=1)
> ^^^^^^^^^
> > -> Index Scan using db_id_idx on db (cost=0.00..120.63 rows=28 width=7) (actual time=0.27..0.75 rows=14 loops=29)
> > Total runtime: 23.92 msec
>
> Actually, I see part of the problem: you haven't vacuumed
> tmp0, so it's sitting at the default size estimate of 1000 rows.
> That accounts for more than a factor of 30 in the estimation error
> in the nestloop plan, while it wouldn't have nearly as much impact
> on hash or mergejoin estimates.
>
> There's still a good big error left to account for though :-(

OK, here it goes:

db.id.statistics target set to 10 (default);
effective_cache_size=1000 (default);
random_page_cost=4 (default);
=> estimated cost = 3690777.90

analyse tmp0;
=> estimated cost = 107033.27

set effective_cache_size=100000;
=> estimated cost = 107033.27 (no change there)

set random_page_cost=1; (default was 4)
=> estimated cost = 27592.06

db.id.statistics target = 500; (improves n_distinct estimate from 2041 to 39909)
=> estimated cost = 943.44

set enable_nestloop = off;
=> planner elects Hash Loop, estimated/real cost: 52834.33 12762.86

set enable_hashjoin = off;
=> planner elects Merge Join, estimated/real cost: 179961.75/54314.09

So, it seems everything is working as it should now!
Thanks!

> I don't think you mentioned the other WHERE conditions before. Which
> table are those restricting, and how selective are they?

Ah, sorry, they are worth not mentioning, as db.poseda='uea' and db.tab='pers' and valida
for about 97.6% of the rows. I guess I should have done the tests without
the two conditions, as they have no influence.

Thanks!
joostje

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message CIC mail 2003-04-11 00:56:05 unsubscrube
Previous Message Antti Haapala 2003-04-10 20:50:19 Re: estimates for nested loop very wrong?