Nested loops are killing throughput

From: CG <cgg007(at)yahoo(dot)com>
To: pgsql-sql(at)postgresql(dot)org
Subject: Nested loops are killing throughput
Date: 2006-09-19 14:46:28
Message-ID: 20060919144628.32833.qmail@web37903.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Postgresql 8.1

I've tried turning off nested loops to see what the query planner would choose instead of nested loops. It chose a hash join, and it still had a nested loop in the mix! How can I entice the query planner to use a simpler join scheme? What criteria is used to determine whether or not a merge join will be used?

Here's an explain analyze...

Subquery Scan foo (cost=762.21..762.50 rows=1 width=12) (actual time=10784.849..10786.992 rows=9 loops=1)
-> GroupAggregate (cost=762.21..762.24 rows=1 width=30) (actual time=10767.702..10768.539 rows=9 loops=1)
-> Sort (cost=762.21..762.21 rows=1 width=30) (actual time=10767.525..10767.989 rows=360 loops=1)
Sort Key: (dpdl.field_value)::timestamp with time zone
-> Nested Loop (cost=2.01..762.20 rows=1 width=30) (actual time=283.824..10749.007 rows=360 loops=1)
-> Nested Loop (cost=0.00..744.28 rows=1 width=16) (actual time=31.227..6980.765 rows=5436 loops=1)
-> Nested Loop (cost=0.00..135.29 rows=101 width=16) (actual time=25.514..273.660 rows=5436 loops=1)
-> Index Scan using ut_company_name_idx on user_table ut (cost=0.00..21.96 rows=5 width=11) (actual time=6.596..6.649 rows=9 loops=1)
Index Cond: ((company_name)::text = Acme, Inc.'::text)
-> Index Scan using packet_user_idx on packet dp (cost=0.00..19.89 rows=222 width=27) (actual time=16.939..28.025 rows=604 loops=9)
Index Cond: ((dp.username)::text = ("outer".username)::text)
Filter: (trans_date > (date_trunc('month'::text, (now() - '1 mon'::interval)) - '1 year 6 mons'::interval))
-> Index Scan using packet_status_puuid_pkey on packet_status dps (cost=0.00..6.02 rows=1 width=16) (actual time=1.226..1.228 rows=1 loops=5436)
Index Cond: (dps.packet_uuid = "outer".packet_uuid)
Filter: ((status & 2) = 0)
-> Bitmap Heap Scan on packet_datalink dpdl (cost=2.01..17.87 rows=4 width=30) (actual time=0.685..0.688 rows=0 loops=5436)
Recheck Cond: (dpdl.packet_uuid = "outer".packet_uuid)
Filter: (((field_name)::text = 'event_date'::text) AND ((field_value)::date >= ('now'::text)::date))
-> Bitmap Index Scan on packet_dl_puuid_idx (cost=0.00..2.01 rows=4 width=0) (actual time=0.402..0.402 rows=4 loops=5436)
Index Cond: (dpdl.packet_uuid = "outer".packet_uuid)
Total runtime: 10787.198 ms

Also, no one here can figure out why the row count prediction is off on the packet table either. It's vacuumed and analyzed.

...

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2006-09-19 15:03:07 Re: Nested loops are killing throughput
Previous Message James Im 2006-09-19 13:44:13 Re: inner join is much faster! is that right?