From: | Mario Splivalo <mario(dot)splivalo(at)megafon(dot)hr> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Query much slower when run from postgres function |
Date: | 2009-03-09 13:36:32 |
Message-ID: | 49B51B60.6030600@megafon.hr |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-jdbc pgsql-performance |
I have a function, looking like this:
CREATE OR REPLACE FUNCTION get_memo_display_queue_size(a_service_id integer)
RETURNS integer AS
$BODY$
SELECT
COUNT(*)::integer
FROM
v_messages_memo
LEFT JOIN messages_memo_displayed
ON id = message_id
WHERE
service_id = $1
AND state = 1
AND admin_id IS NULL;
$BODY$
LANGUAGE 'sql' VOLATILE SECURITY DEFINER
COST 100;
Now, when I run that function from psql, it takes around 200ms to complete:
pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=219.728..219.730
rows=1 loops=1)
Total runtime: 219.758 ms
(2 rows)
pulitzer2=#
And it takes around 200ms each time I run the function!
When I rewrite the query so I can see queryplan, I get this:
create view _v1 as
SELECT
*
FROM
v_messages_memo
LEFT JOIN messages_memo_displayed
ON id = message_id
WHERE
state = 1
AND admin_id IS NULL;
pulitzer2=# EXPLAIN ANALYZE select count(*) from _v1 WHERE service_id =
1829;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=23506.14..23506.15 rows=1 width=0) (actual
time=6.001..6.002 rows=1 loops=1)
-> Nested Loop (cost=150.69..23505.59 rows=216 width=0) (actual
time=5.744..5.971 rows=13 loops=1)
-> Hash Left Join (cost=150.69..11035.16 rows=2104 width=4)
(actual time=5.721..5.793 rows=13 loops=1)
Hash Cond: (messages.id =
messages_memo_displayed.message_id)
Filter: (messages_memo_displayed.admin_id IS NULL)
-> Bitmap Heap Scan on messages (cost=97.03..10955.11
rows=4209 width=4) (actual time=0.042..0.075 rows=13 loops=1)
Recheck Cond: (service_id = 1829)
-> Bitmap Index Scan on
messages_uq__service_id__tan (cost=0.00..95.98 rows=4209 width=0)
(actual time=0.032..0.032 rows=13 loops=1)
Index Cond: (service_id = 1829)
-> Hash (cost=28.85..28.85 rows=1985 width=8) (actual
time=5.666..5.666 rows=1985 loops=1)
-> Seq Scan on messages_memo_displayed
(cost=0.00..28.85 rows=1985 width=8) (actual time=0.009..2.697 rows=1985
loops=1)
-> Index Scan using messages_memo_pk on messages_memo
(cost=0.00..5.91 rows=1 width=4) (actual time=0.006..0.008 rows=1 loops=13)
Index Cond: (messages_memo.message_id = messages.id)
Filter: ((messages_memo.state)::integer = 1)
Total runtime: 6.079 ms
(15 rows)
So I noticed that postgres is using seq_scan on messages_memo_displayed,
although there is a PK (and an index) on message_id in
messages_memo_displayed (I'll post DDL of the tables at the end of the
post).
So, I tried EXPLAIN ANALYZE after I forced planner not to use sequential
scans:
pulitzer2=# EXPLAIN ANALYZE select count(*) from _v1 WHERE service_id =
1829;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=25403.60..25403.61 rows=1 width=0) (actual
time=6.546..6.547 rows=1 loops=1)
-> Nested Loop (cost=2048.16..25403.06 rows=216 width=0) (actual
time=6.287..6.512 rows=13 loops=1)
-> Hash Left Join (cost=2048.16..12932.63 rows=2104 width=4)
(actual time=6.268..6.340 rows=13 loops=1)
Hash Cond: (messages.id =
messages_memo_displayed.message_id)
Filter: (messages_memo_displayed.admin_id IS NULL)
-> Bitmap Heap Scan on messages (cost=97.03..10955.11
rows=4209 width=4) (actual time=0.043..0.078 rows=13 loops=1)
Recheck Cond: (service_id = 1829)
-> Bitmap Index Scan on
messages_uq__service_id__tan (cost=0.00..95.98 rows=4209 width=0)
(actual time=0.032..0.032 rows=13 loops=1)
Index Cond: (service_id = 1829)
-> Hash (cost=1926.31..1926.31 rows=1985 width=8)
(actual time=6.211..6.211 rows=1985 loops=1)
-> Index Scan using messages_memo_displayed_pk on
messages_memo_displayed (cost=0.00..1926.31 rows=1985 width=8) (actual
time=0.069..3.221 rows=1985 loops=1)
-> Index Scan using messages_memo_pk on messages_memo
(cost=0.00..5.91 rows=1 width=4) (actual time=0.006..0.008 rows=1 loops=13)
Index Cond: (messages_memo.message_id = messages.id)
Filter: ((messages_memo.state)::integer = 1)
Total runtime: 6.628 ms
(15 rows)
No sequential scan. So I 'changed' my function so that first row says
'SET enable_seqscan TO false'. After that, here are the times for the
function call:
mike(at)som:~$ psql -U postgres pulitzer2
Welcome to psql 8.3.5, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit
pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=300.448..300.450
rows=1 loops=1)
Total runtime: 300.491 ms
(2 rows)
pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=1.940..1.941
rows=1 loops=1)
Total runtime: 1.961 ms
(2 rows)
pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=1.946..1.947
rows=1 loops=1)
Total runtime: 1.973 ms
(2 rows)
pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=1.936..1.937
rows=1 loops=1)
Total runtime: 1.964 ms
(2 rows)
pulitzer2=#
So, first query on the same connection takes 300ms, and any
subsequential query on the same connection takes less than 2 ms. If I
remove 'SET enable_seqscan TO false' from the top of the function, every
call to the function takes around 200-300ms.
Now, as I was explained on pg-jdbc mailinglist, that 'SET enable_seqscan
TO false' affects all queries on that persistent connection from tomcat,
and It's not good solution. So I wanted to post here to ask what other
options do I have.
While writing this I realized that, without forcing sequential scan out,
I get much quicker execution times when I do:
SELECT count(*) FROM _v1 WHERE service_id = 1829
then when I do
SELECT get_memo_display_queue_size(1829),
as seen here:
mike(at)som:~$ psql -U postgres pulitzer2
Welcome to psql 8.3.5, the PostgreSQL interactive terminal.
Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit
pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=259.090..259.092
rows=1 loops=1)
Total runtime: 259.132 ms
(2 rows)
pulitzer2=# EXPLAIN ANALYZE select count(*) from _v1 WHERE service_id =
1829;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=23517.98..23517.99 rows=1 width=0) (actual
time=5.942..5.943 rows=1 loops=1)
-> Nested Loop (cost=150.70..23517.44 rows=216 width=0) (actual
time=5.674..5.909 rows=13 loops=1)
-> Hash Left Join (cost=150.70..11037.87 rows=2105 width=4)
(actual time=5.633..5.706 rows=13 loops=1)
Hash Cond: (messages.id =
messages_memo_displayed.message_id)
Filter: (messages_memo_displayed.admin_id IS NULL)
-> Bitmap Heap Scan on messages (cost=97.04..10957.81
rows=4210 width=4) (actual time=0.032..0.063 rows=13 loops=1)
Recheck Cond: (service_id = 1829)
-> Bitmap Index Scan on
messages_uq__service_id__tan (cost=0.00..95.98 rows=4210 width=0)
(actual time=0.022..0.022 rows=13 loops=1)
Index Cond: (service_id = 1829)
-> Hash (cost=28.85..28.85 rows=1985 width=8) (actual
time=5.588..5.588 rows=1985 loops=1)
-> Seq Scan on messages_memo_displayed
(cost=0.00..28.85 rows=1985 width=8) (actual time=0.009..2.690 rows=1985
loops=1)
-> Index Scan using messages_memo_pk on messages_memo
(cost=0.00..5.92 rows=1 width=4) (actual time=0.008..0.010 rows=1 loops=13)
Index Cond: (messages_memo.message_id = messages.id)
Filter: ((messages_memo.state)::integer = 1)
Total runtime: 6.026 ms
(15 rows)
pulitzer2=# explain analyze select get_memo_display_queue_size(1829);
QUERY PLAN
----------------------------------------------------------------------------------------
Result (cost=0.00..0.26 rows=1 width=0) (actual time=211.712..211.714
rows=1 loops=1)
Total runtime: 211.742 ms
(2 rows)
pulitzer2=# EXPLAIN ANALYZE select count(*) from _v1 WHERE service_id =
1829;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=23517.98..23517.99 rows=1 width=0) (actual
time=5.918..5.920 rows=1 loops=1)
-> Nested Loop (cost=150.70..23517.44 rows=216 width=0) (actual
time=5.659..5.885 rows=13 loops=1)
-> Hash Left Join (cost=150.70..11037.87 rows=2105 width=4)
(actual time=5.638..5.711 rows=13 loops=1)
Hash Cond: (messages.id =
messages_memo_displayed.message_id)
Filter: (messages_memo_displayed.admin_id IS NULL)
-> Bitmap Heap Scan on messages (cost=97.04..10957.81
rows=4210 width=4) (actual time=0.043..0.078 rows=13 loops=1)
Recheck Cond: (service_id = 1829)
-> Bitmap Index Scan on
messages_uq__service_id__tan (cost=0.00..95.98 rows=4210 width=0)
(actual time=0.033..0.033 rows=13 loops=1)
Index Cond: (service_id = 1829)
-> Hash (cost=28.85..28.85 rows=1985 width=8) (actual
time=5.581..5.581 rows=1985 loops=1)
-> Seq Scan on messages_memo_displayed
(cost=0.00..28.85 rows=1985 width=8) (actual time=0.009..2.678 rows=1985
loops=1)
-> Index Scan using messages_memo_pk on messages_memo
(cost=0.00..5.92 rows=1 width=4) (actual time=0.006..0.008 rows=1 loops=13)
Index Cond: (messages_memo.message_id = messages.id)
Filter: ((messages_memo.state)::integer = 1)
Total runtime: 5.994 ms
(15 rows)
pulitzer2=#
Now I'm confused, why is 'sql' function much slower than 'direct' SELECT?
Mike
P.S. Here are tables definition, from psql:
pulitzer2=# \d messages
Table "public.messages"
Column | Type |
Modifiers
--------------------+--------------------------+---------------------------------------------------------------------
id | integer | not null default
nextval(('public.message_id_seq'::text)::regclass)
from | character varying(15) | not null
to | character varying(10) | not null
receiving_time | timestamp with time zone | not null default now()
raw_text | character varying | not null
keyword | character varying |
destination_id | integer | not null
vpn_id | integer |
service_id | integer |
status | integer | not null default 2
gateway_message_id | character varying | not null
prize_id | integer |
tan | character varying |
Indexes:
"messages_pk" PRIMARY KEY, btree (id)
"messages_uq__gateway_message_id" UNIQUE, btree (gateway_message_id)
"messages_uq__service_id__tan" UNIQUE, btree (service_id, tan)
"messages_ix_from" btree ("from")
"messages_ix_receiving_time__service_id__status" btree
(receiving_time, service_id, status)
"messages_ix_vpn_id" btree (vpn_id)
Foreign-key constraints:
"messages_fk__destinations_id" FOREIGN KEY (destination_id)
REFERENCES destinations(id)
"messages_fk__service_prizes_prize_id" FOREIGN KEY (prize_id)
REFERENCES service_prizes(prize_id)
"messages_fk__services_id" FOREIGN KEY (service_id) REFERENCES
services(id)
"messages_fk__vpns_id" FOREIGN KEY (vpn_id) REFERENCES vpns(id)
pulitzer2=# \d messages_memo
Table "public.messages_memo"
Column | Type | Modifiers
------------------------+--------------------------+-----------
message_id | integer | not null
memo | character varying |
state | dom_messages_memo_state | not null
admin_id | integer |
admin_change_timestamp | timestamp with time zone |
Indexes:
"messages_memo_pk" PRIMARY KEY, btree (message_id)
Foreign-key constraints:
"messages_memo_fk__messages_id" FOREIGN KEY (message_id) REFERENCES
messages(id)
pulitzer2=# \d messages_memo_displayed
Table "public.messages_memo_displayed"
Column | Type | Modifiers
------------+---------+-----------
message_id | integer | not null
admin_id | integer | not null
Indexes:
"messages_memo_displayed_pk" PRIMARY KEY, btree (message_id, admin_id)
Foreign-key constraints:
"messages_memo_displayed_fk__admins_id" FOREIGN KEY (admin_id)
REFERENCES admins(id)
"messages_memo_displayed_fk__messages_id" FOREIGN KEY (message_id)
REFERENCES messages(id)
pulitzer2=# \d v_messages_memo
View "public.v_messages_memo"
Column | Type | Modifiers
--------------------+--------------------------+-----------
id | integer |
from | character varying(15) |
to | character varying(10) |
receiving_time | timestamp with time zone |
raw_text | character varying |
keyword | character varying |
destination_id | integer |
vpn_id | integer |
service_id | integer |
status | integer |
gateway_message_id | character varying |
prize_id | integer |
tan | character varying |
memo | character varying |
state | dom_messages_memo_state |
displayed | boolean |
View definition:
SELECT v_messages_full.id, v_messages_full."from",
v_messages_full."to", v_messages_full.receiving_time,
v_messages_full.raw_text, v_messages_full.keyword,
v_messages_full.destination_id, v_messages_full.vpn_id,
v_messages_full.service_id, v_messages_full.status,
v_messages_full.gateway_message_id, v_messages_full.prize_id,
v_messages_full.tan, messages_memo.memo, messages_memo.state,
NULL::boolean AS displayed
FROM messages v_messages_full
JOIN messages_memo ON v_messages_full.id = messages_memo.message_id;
pulitzer2=#
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2009-03-09 16:31:34 | Re: Query much slower when run from postgres function |
Previous Message | Dave Cramer | 2009-03-09 12:37:00 | Re: Forcing postgres NOT to use sequential scan, trough JDBC |
From | Date | Subject | |
---|---|---|---|
Next Message | Lee Hughes | 2009-03-09 14:28:20 | DBT Presentation Location? |
Previous Message | Aaron Guyon | 2009-03-06 15:50:22 | Re: Postgres 8.3, four times slower queries? |