From: | Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com> |
---|---|
To: | <joostje(at)komputilo(dot)org> |
Cc: | <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: estimates for nested loop very wrong? |
Date: | 2003-04-10 14:17:03 |
Message-ID: | 20030410071328.M79630-100000@megazone23.bigpanda.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
On Thu, 10 Apr 2003 joostje(at)komputilo(dot)org wrote:
> It really puzzles me why the estimate for the Nested Loop is so bad,
> as it looks like a rather easy thing to estimate...
>
> Below:
> table db has 2.5e6 entries, column "id" has rather evenly distributed values.
> column id has a (btree) index.
> table tmp1 has 23 entries, column "v0" has all unique entries.
> psql version: 7.2.1
>
> Here is the query twice, once with enable_hashjoin ON, second time
> with enable_hashjoin OFF, to force psql to use Nested Loop
>
> ueadb=> explain analyse select id, var, val from db, tmp1 where id=tmp1.v0;
> NOTICE: QUERY PLAN:
>
> Hash Join (cost=1.29..67863.71 rows=61140 width=38) (actual time=4475.26..32442.99 rows=756 loops=1)
> -> Seq Scan on db (cost=0.00..54498.12 rows=2520012 width=31) (actual time=0.07..29170.62 rows=2520012 loops=1)
> -> Hash (cost=1.23..1.23 rows=23 width=7) (actual time=0.25..0.25 rows=0 loops=1)
> -> Seq Scan on tmp1 (cost=0.00..1.23 rows=23 width=7) (actual time=0.17..0.22 rows=23 loops=1)
> Total runtime: 32443.78 msec
>
>
> --Setting hashjoin off, forcing psql to use the Nested Loop
> ueadb=> set enable_hashjoin = off;
>
>
> ueadb=> explain analyse select id, var, val from db, tmp1 where id=tmp1.v0;
> NOTICE: QUERY PLAN:
>
> Nested Loop (cost=0.00..208256.60 rows=61140 width=38) (actual time=0.92..18.49 rows=756 loops=1)
> -> Seq Scan on tmp1 (cost=0.00..1.23 rows=23 width=7) (actual time=0.24..0.39 rows=23 loops=1)
> -> Index Scan using db_id_idx on db (cost=0.00..9021.35 rows=2658 width=31) (actual time=0.32..0.69 rows=33 loops=23)
It seems to be misestimating the number of rows to return on db. That's
probably why the cost is so wrong (it's over estimating by nearly a factor
of 100). Have you analyzed db recently?
> I guess I'll be doing my queries with enable_hashjoin OFF, but is there anythign I'm
> doing wrong?
> (Apart from maybe uzing psql 7.2.1 -- would 7.3 be smarter here?)
It might.
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2003-04-10 14:17:09 | Re: estimates for nested loop very wrong? |
Previous Message | Christoph Haller | 2003-04-10 13:21:06 | Re: Extraordinary Full Join |