Re: Nested loops are killing throughput

From: CG <cgg007(at)yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Nested loops are killing throughput
Date: 2006-09-19 18:05:06
Message-ID: 20060919180506.16876.qmail@web37903.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

----- Original Message ----
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: CG <cgg007(at)yahoo(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Sent: Tuesday, September 19, 2006 11:03:07 AM
Subject: Re: [SQL] Nested loops are killing throughput

CG <cgg007(at)yahoo(dot)com> writes:
> -> Nested Loop (cost=0.00..744.28 rows=1 width=16) (actual time=31.227..6980.765 rows=5436 loops=1)
> -> Nested Loop (cost=0.00..135.29 rows=101 width=16) (actual time=25.514..273.660 rows=5436 loops=1)
> ...
> -> Index Scan using packet_status_puuid_pkey on packet_status dps (cost=0.00..6.02 rows=1 width=16) (actual time=1.226..1.228 rows=1 loops=5436)
> Index Cond: (dps.packet_uuid = "outer".packet_uuid)
> Filter: ((status & 2) = 0)

>One problem you've got is that the planner has no stats about the
>selectivity of that status condition. My advice would be to forget the
>cute bitmask and store one or more plain boolean columns. Easier to
>write queries against and way more tractable statistically.

n/p ... I'll ditch it.

Even if I simplify that query down to a straight-forward example...

select dp.*, dps.status from dpo.packet dp, dpo.packet_status dps where dp.packet_uuid = dps.packet_uuid and dp.username='joeuser';

I'm still getting the nested loop for a join.

Nested Loop (cost=100000000.00..100013378.98 rows=2206 width=145) (actual time=46.743..18202.318 rows=2225 loops=1)
-> Index Scan using packet_user_idx on packet dp (cost=0.00..88.03 rows=2206 width=125) (actual time=42.263..124.519 rows=2225 loops=1)
Index Cond: ((username)::text = 'joeuser'::text)
-> Index Scan using packet_status_puuid_pkey on packet_status dps (cost=0.00..6.01 rows=1 width=20) (actual time=8.115..8.117 rows=1 loops=2225)
Index Cond: ("outer".packet_uuid = dps.packet_uuid)
Total runtime: 18205.880 ms


These indexes are being used, and the tables are freshly vacuum-analyzed...

CREATE UNIQUE INDEX packet_puuid_idx
ON dpo.packet
USING btree
(packet_uuid);

CREATE INDEX packet_user_idx
ON dpo.packet
USING btree
(username);

CREATE UNIQUE INDEX packet_status_puuid_pkey
ON dpo.packet_status
USING btree
(packet_uuid);

I have no idea why I'm having such a difficult time digging through this data. It should be very straight-forward.

CG

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Lane 2006-09-19 18:17:52 Re: Nested loops are killing throughput
Previous Message A. Kretschmer 2006-09-19 15:15:09 Re: Count and list only where count is bigger than 1