SRF called with optional NULL input runs 7x slower

From: Ow Mun Heng <Ow(dot)Mun(dot)Heng(at)wdc(dot)com>
To: ML PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: SRF called with optional NULL input runs 7x slower
Date: 2007-09-24 03:32:06
Message-ID: 1190604726.11717.79.camel@neuromancer.home.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I've got 2 nearly identical SRFs to retrieve data from the DB(pg 8.2.4)
which goes something like this

================= Function 1 ======================
CREATE OR REPLACE FUNCTION hot(fromdate timestamp without time zone, todate timestamp without time zone, code text[])
RETURNS SETOF trh_hot AS
$BODY$

DECLARE
rec RECORD;

BEGIN
FOR rec IN
SELECT
foo.a,
bar.b
FROM d
INNER JOIN ts
ON ts.id = D.id
inner join trh
ON ts.id = trh.id
AND ts.ttype = trh.ttype
AND ts.run_date = trh.run_date
WHERE d.record_update_date_time BETWEEN fromdate AND todate
AND trh.run_date BETWEEN fromdate AND todate
AND trh.ttype IN (select ttype from lookup_ttype where tsequence = 'hot')
AND d.code = any (code)
LOOP
RETURN NEXT rec;
END LOOP;
RETURN;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

===================================

========== Function 2 ===================
CREATE OR REPLACE FUNCTION hot(fromdate timestamp without time zone, todate timestamp without time zone, code text[], sn text[])
RETURNS SETOF trh_hot AS
$BODY$

DECLARE
rec RECORD;

BEGIN
FOR rec IN
SELECT
foo.a,
bar.b
FROM d
INNER JOIN ts
ON ts.id = D.id
inner join trh
ON ts.id = trh.id
AND ts.ttype = trh.ttype
AND ts.run_date = trh.run_date
WHERE d.record_update_date_time BETWEEN fromdate AND todate
AND trh.run_date BETWEEN fromdate AND todate
AND trh.ttype IN (select ttype from lookup_ttype where tsequence = 'hot')
--> AND (CASE WHEN code IS NULL THEN true else d.code = any (code) END)
--> AND (CASE WHEN sn IS NULL THEN TRUE else D.id = any(SN) END)
LOOP
RETURN NEXT rec;
END LOOP;
RETURN;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;

===================== >8 =============================

The main difference being these 2 lines

AND (CASE WHEN code IS NULL THEN true else d.code = any (code) END)
AND (CASE WHEN sn IS NULL THEN TRUE else D.id = any(SN) END)

which the aim is to check for NULL input and determine whether or not a
filter is necessary for those items.

The query is called by these respectively

Func1: select * from HOT('8/1/2007','9/30/2007','{HUA71}')
Func2: select * from HOT2('8/1/2007','9/30/2007','{HUA71}',NULL)

If I put in all the SN in replacement of the NULL, it's slightly faster
at 37 secs to return the data.

Vmstat 5 on Function1:7 secs 8400 rows
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
0 0 268 32296 2908 1096780 1 1 92 75 1083 1021 10 6 82 3
4 1 268 33708 2864 1094856 0 0 5771 21 1134 3518 23 19 51 7
6 0 268 39196 2876 1091396 0 0 2074 27 1230 3477 33 51 0 16

very little on cpu wait.

Vmstat 5 on Function2: 50secs 8400 rows
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
3 0 268 33648 4980 1087552 0 0 1606 103 1216 2943 21 32 0 47
2 1 268 33888 4484 1089576 0 0 2352 14 1209 2811 19 32 0 49
3 1 268 34380 4100 1090564 0 0 1270 13 1187 2890 13 21 0 66
2 1 268 34184 4004 1091408 0 0 1107 98 1206 2949 12 19 0 69
3 2 268 33004 3984 1093312 0 0 1357 34 1203 2931 11 20 0 69
4 0 268 33008 3860 1093556 0 0 1302 8 1193 2897 13 21 0 65
2 1 268 33608 3800 1093140 0 0 1286 6 1195 2954 12 18 0 70
1 1 268 33552 3300 1093776 0 0 1282 109 1213 2896 12 20 0 67
3 1 268 32404 3296 1095368 0 0 1202 1 1188 2908 12 19 0 69
1 1 268 33976 2680 1094476 0 0 1184 0 1180 2905 11 19 0 70
3 1 268 33336 2440 1095924 0 0 1291 8 1196 2944 11 19 0 70
2 1 268 34112 2272 1095412 0 0 1245 118 1221 2932 13 19 0 68
2 0 268 34844 2756 1095364 0 0 698 4 1145 3040 19 49 0 32

Just look at the CPU wait. What Gives? Does checkin on NULL causes a penalty? I
tried running the same query as is on pgadmin3 and it looked to be OK
(the null checking). I do notice that the system (my laptop) was doing
quite a bit of IO.

how can I debug or diagnose where the issues lies? Explain analyse
doesn't do much since this is a Function Scan anyway.

AS of right now, I'm trying to see if I can do Dynamic SQL instead.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ow Mun Heng 2007-09-24 03:40:43 Re: Is this good spec for a PostgreSQL server?
Previous Message yanot panara 2007-09-24 02:43:04 configuration question