Re: Nested loops overpriced

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Nested loops overpriced
Date: 2007-05-09 16:17:44
Message-ID: 200705091817.45056.peter_e@gmx.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Am Mittwoch, 9. Mai 2007 16:11 schrieb Tom Lane:
> Well, there's something funny going on here. You've got for instance
>
> -> Index Scan using email_pkey on email (cost=0.00..3.85
> rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990) Index Cond:
> (email.email_id = eh_from.email_id)
> Filter: (("time" >= '2007-05-05 17:01:59'::timestamp
> without time zone) AND ("time" < '2007-05-05 18:01:59'::timestamp without
> time zone))
>
> on the inside of a nestloop whose outer side is predicted to return
> 107156 rows. That should've been discounted to *way* less than 3.85
> cost units per iteration.

This is the new plan with 8.2.4. It's still got the same problem, though.

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=5627064.21..5627718.73 rows=32726 width=184) (actual time=4904.834..5124.585 rows=35000 loops=1)
-> Sort (cost=5627064.21..5627146.03 rows=32726 width=184) (actual time=4904.771..4947.892 rows=35000 loops=1)
Sort Key: eh_subj.header_body
-> Nested Loop (cost=0.00..5624610.06 rows=32726 width=184) (actual time=0.397..4628.141 rows=35000 loops=1)
-> Nested Loop (cost=0.00..1193387.12 rows=28461 width=120) (actual time=0.322..3960.360 rows=35000 loops=1)
-> Nested Loop (cost=0.00..1081957.26 rows=28648 width=112) (actual time=0.238..3572.023 rows=35000 loops=1)
-> Index Scan using dummy_index on email_header eh_from (cost=0.00..13389.15 rows=280662 width=104) (actual time=0.133..1310.248 rows=280990 loops=1)
-> Index Scan using email_pkey on email (cost=0.00..3.79 rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990)
Index Cond: (email.email_id = eh_from.email_id)
Filter: (("time" >= '2007-05-05 17:01:59'::timestamp without time zone) AND ("time" < '2007-05-05 18:01:59'::timestamp without time zone))
-> Index Scan using mime_part_pkey on mime_part (cost=0.00..3.88 rows=1 width=12) (actual time=0.005..0.006 rows=1 loops=35000)
Index Cond: ((email.email_id = mime_part.email_id) AND (mime_part.mime_part_id = 0))
-> Index Scan using idx__email_header__email_id__mime_part_id on email_header eh_subj (cost=0.00..155.47 rows=18 width=104) (actual time=0.009..0.014 rows=1 loops=35000)
Index Cond: ((email.email_id = eh_subj.email_id) AND (0 = eh_subj.mime_part_id))
Filter: (header_name = 'subject'::text)
Total runtime: 5161.390 ms

> Are you using any nondefault planner settings?

random_page_cost = 3
effective_cache_size = 384MB

> How big are these tables, anyway?

email 35 MB
email_header 421 MB
mime_part 37 MB

Everything is analyzed, vacuumed, and reindexed.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jignesh Shah 2007-05-09 16:27:30 ZFS and Postgresql - WASRe: Best OS for Postgres 8.2
Previous Message Tom Lane 2007-05-09 15:56:13 Re: Apparently useless bitmap scans