Re: Query runs fast or slow

From: felix(at)crowfix(dot)com
To: pgsql-general(at)postgresql(dot)org
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Query runs fast or slow
Date: 2006-04-17 20:52:40
Message-ID: 20060417205240.GA9080@crowfix.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sun, Apr 16, 2006 at 04:32:25PM -0400, Tom Lane wrote:
> felix-accts-pgsql(at)crowfix(dot)com writes:
> > EXPLAIN ANALYZE doesn't show the slow timing
> > because it requires values, not $n placeholders,
>
> To analyze the plan used for a parameterized query, try
>
> PREPARE foo(...) AS SELECT ... $n ...
>
> EXPLAIN ANALYZE EXECUTE foo(...)
>
> But I already know what you're going to find: the planner's estimates
> for the range query are not going to be very good when it has no idea
> what the range bounds are. This is a situation where it may be best
> to absorb the hit of re-planning each time instead of using a generic
> parameterized plan.

OK, here is the new explain analyze. I eliminated cache effects by
dumping the tables and picking random values with an editor.

felix=> PREPARE foo(TEXT, INT, INT) AS SELECT s.data, g.key, g.val, g.sid FROM key k, val_int v, sid s, glue_int g WHERE (k.data = $1 AND k.id = g.key) AND (v.data >= $2 AND v.data <= $3) AND v.id = g.val AND g.sid = s.id;
PREPARE
felix=> explain analyze execute foo('mthNQFrmVs3Q4bVruCxIAGy', 1973028023, 1973028223);
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=1380.11..404223.36 rows=499 width=60) (actual time=5785.012..77823.688 rows=1 loops=1)
-> Hash Join (cost=1380.11..402713.38 rows=499 width=16) (actual time=5766.308..77804.969 rows=1 loops=1)
Hash Cond: ("outer".val = "inner".id)
-> Nested Loop (cost=0.00..400829.78 rows=99701 width=16) (actual time=115.154..77401.159 rows=100000 loops=1)
-> Index Scan using key_data_key on "key" k (cost=0.00..5.82 rows=1 width=4) (actual time=0.125..0.132 rows=1 loops=1)
Index Cond: (data = $1)
-> Index Scan using glue_int_key_idx on glue_int g (cost=0.00..399577.70 rows=99701 width=16) (actual time=115.011..76570.366 rows=100000 loops=1)
Index Cond: ("outer".id = g."key")
-> Hash (cost=1378.86..1378.86 rows=500 width=4) (actual time=11.580..11.580 rows=0 loops=1)
-> Index Scan using val_int_data_key on val_int v (cost=0.00..1378.86 rows=500 width=4) (actual time=11.556..11.561 rows=1 loops=1)
Index Cond: ((data >= $2) AND (data <= $3))
-> Index Scan using sid_pkey on sid s (cost=0.00..3.01 rows=1 width=52) (actual time=18.682..18.687 rows=1 loops=1)
Index Cond: ("outer".sid = s.id)
Total runtime: 77823.897 ms
(14 rows)

A repeat shows it faster, from 77 seconds to 3.

felix=> explain analyze execute foo('mthNQFrmVs3Q4bVruCxIAGy', 1973028023, 1973028223);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=1380.11..404223.36 rows=499 width=60) (actual time=205.137..2931.899 rows=1 loops=1)
-> Hash Join (cost=1380.11..402713.38 rows=499 width=16) (actual time=205.056..2931.803 rows=1 loops=1)
Hash Cond: ("outer".val = "inner".id)
-> Nested Loop (cost=0.00..400829.78 rows=99701 width=16) (actual time=0.148..2564.255 rows=100000 loops=1)
-> Index Scan using key_data_key on "key" k (cost=0.00..5.82 rows=1 width=4) (actual time=0.031..0.039 rows=1 loops=1)
Index Cond: (data = $1)
-> Index Scan using glue_int_key_idx on glue_int g (cost=0.00..399577.70 rows=99701 width=16) (actual time=0.105..1808.068 rows=100000 loops=1)
Index Cond: ("outer".id = g."key")
-> Hash (cost=1378.86..1378.86 rows=500 width=4) (actual time=0.090..0.090 rows=0 loops=1)
-> Index Scan using val_int_data_key on val_int v (cost=0.00..1378.86 rows=500 width=4) (actual time=0.074..0.080 rows=1 loops=1)
Index Cond: ((data >= $2) AND (data <= $3))
-> Index Scan using sid_pkey on sid s (cost=0.00..3.01 rows=1 width=52) (actual time=0.061..0.066 rows=1 loops=1)
Index Cond: ("outer".sid = s.id)
Total runtime: 2932.013 ms
(14 rows)

And running it as a simple query shows it much faster, 72 ms.

felix=> EXPLAIN ANALYZE SELECT s.data, g.key, g.val, g.sid FROM key k, val_int v, sid s, glue_int g WHERE (k.data = 'mthNQFrmVs3Q4bVruCxIAGy' AND k.id = g.key) AND (v.data >= 1973028023 AND v.data <= 1973028223) AND v.id = g.val AND g.sid = s.id;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=5.82..1162.41 rows=1 width=60) (actual time=53.341..71.996 rows=1 loops=1)
-> Hash Join (cost=5.82..1159.38 rows=1 width=16) (actual time=53.280..71.921 rows=1 loops=1)
Hash Cond: ("outer"."key" = "inner".id)
-> Nested Loop (cost=0.00..1148.54 rows=1001 width=16) (actual time=40.525..67.962 rows=1000 loops=1)
-> Index Scan using val_int_data_key on val_int v (cost=0.00..6.01 rows=1 width=4) (actual time=0.068..0.072 rows=1 loops=1)
Index Cond: ((data >= 1973028023) AND (data <= 1973028223))
-> Index Scan using glue_int_val_idx on glue_int g (cost=0.00..730.26 rows=32982 width=16) (actual time=40.441..60.218 rows=1000 loops=1)
Index Cond: ("outer".id = g.val)
-> Hash (cost=5.82..5.82 rows=1 width=4) (actual time=0.067..0.067 rows=0 loops=1)
-> Index Scan using key_data_key on "key" k (cost=0.00..5.82 rows=1 width=4) (actual time=0.049..0.054 rows=1 loops=1)
Index Cond: (data = 'mthNQFrmVs3Q4bVruCxIAGy'::text)
-> Index Scan using sid_pkey on sid s (cost=0.00..3.01 rows=1 width=52) (actual time=0.048..0.052 rows=1 loops=1)
Index Cond: ("outer".sid = s.id)
Total runtime: 72.099 ms
(14 rows)

But the plan has also changed, from outer.val to outer.key. Since the
key table is 100 times smaller than the val table, I can see why one
executes so much faster.

But is 77 seconds reasonable? It just seems incredibly slow.

--
... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
Felix Finch: scarecrow repairman & rocket surgeon / felix(at)crowfix(dot)com
GPG = E987 4493 C860 246C 3B1E 6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Teodor Sigaev 2006-04-17 20:56:16 Re: tsearch partial word
Previous Message Dawid Kuroczko 2006-04-17 20:44:20 Re: 21 bit number for sequence