PostgreSQL performance problem -> tuning

From: Yaroslav Mazurak <yamazurak(at)Lviv(dot)Bank(dot)Gov(dot)UA>
To: pgsql-performance(at)postgresql(dot)org
Subject: PostgreSQL performance problem -> tuning
Date: 2003-08-06 07:34:54
Message-ID: 3F30AF9E.2080607@lviv.bank.gov.ua
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi All!

I have installed PostgreSQL 7.3.2 on FreeBSD 4.7, running on PC with
CPU Pentium II 400MHz and 384Mb RAM.

Problem is that SQL statement (see below) is running too long. With
current WHERE clause 'SUBSTR(2, 2) IN ('NL', 'NM') return 25 records.
With 1 record, SELECT time is about 50 minutes and takes approx. 120Mb
RAM. With 25 records SELECT takes about 600Mb of memory and ends after
about 10 hours with error: "Memory exhausted in AllocSetAlloc(32)".

***
How can I speed up processing? Why query (IMHO not too complex)
executes so long? :(
***

Information about configuration, data structures and table sizes see
below. Model picture attached.

Current postgresql.conf settings (some) are:

=== Cut ===
max_connections = 8

shared_buffers = 8192
max_fsm_relations = 256
max_fsm_pages = 65536
max_locks_per_transaction = 16
wal_buffers = 256

sort_mem = 131072
vacuum_mem = 16384

checkpoint_segments = 4
checkpoint_timeout = 300
commit_delay = 32000
commit_siblings = 4
fsync = false

enable_seqscan = false

effective_cache_size = 65536
=== Cut ===

SELECT statement is:

SELECT showcalc('B00204', dd, r020, t071) AS s04
FROM v_file02wide
WHERE a011 = 3
AND inrepdate(data)
AND SUBSTR(ncks, 2, 2) IN ('NL', 'NM')
AND r030 = 980;

Query plan is:


QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=174200202474.99..174200202474.99 rows=1 width=143)
-> Hash Join (cost=174200199883.63..174200202474.89 rows=43 width=143)
Hash Cond: ("outer".id_k041 = "inner".id_k041)
-> Hash Join (cost=174200199880.57..174200202471.07 rows=43
width=139)
Hash Cond: ("outer".id_r030 = "inner".id_r030)
-> Hash Join (cost=174200199865.31..174200202410.31
rows=8992 width=135)
Hash Cond: ("outer".id_r020 = "inner".id_r020)
-> Hash Join
(cost=174200199681.91..174200202069.55 rows=8992 width=124)
Hash Cond: ("outer".id_dd = "inner".id_dd)
-> Merge Join
(cost=174200199676.04..174200201906.32 rows=8992 width=114)
Merge Cond: ("outer".id_v = "inner".id_v)
Join Filter: (("outer".data >= CASE
WHEN ("inner".dataa IS NOT NULL) THEN "inner".dataa WHEN ("outer".data
IS NOT NULL) THEN "outer".data ELSE NULL::date END) AND ("outer".data <=
CASE WHEN ("inner".datab IS NOT NULL) THEN "inner".datab WHEN
("outer".data IS NOT NULL) THEN "outer".data ELSE NULL::date END))
-> Sort (cost=42528.39..42933.04
rows=161858 width=65)
Sort Key: filexxr.id_v
-> Hash Join
(cost=636.25..28524.10 rows=161858 width=65)
Hash Cond: ("outer".id_obl
= "inner".id_obl)
-> Hash Join
(cost=632.67..25687.99 rows=161858 width=61)
Hash Cond:
("outer".id_r = "inner".id_r)
-> Index Scan using
index_file02_k041 on file02 (cost=0.00..18951.63 rows=816093 width=32)
-> Hash
(cost=615.41..615.41 rows=6903 width=29)
-> Index Scan
using index_filexxr_a011 on filexxr (cost=0.00..615.41 rows=6903 width=29)
Index
Cond: (id_a011 = 3)
Filter:
inrepdate(data)
-> Hash (cost=3.47..3.47
rows=43 width=4)
-> Index Scan using
kod_obl_pkey on kod_obl obl (cost=0.00..3.47 rows=43 width=4)
-> Sort
(cost=174200157147.65..174200157150.57 rows=1167 width=49)
Sort Key: dov_tvbv.id_v
-> Merge Join
(cost=0.00..174200157088.20 rows=1167 width=49)
Merge Cond:
("outer".id_bnk = "inner".id_bnk)
-> Index Scan using
dov_bank_pkey on dov_bank (cost=0.00..290100261328.45 rows=1450 width=13)
Filter: (subplan)
SubPlan
-> Materialize
(cost=100000090.02..100000090.02 rows=29 width=11)
-> Seq Scan
on dov_bank (cost=100000000.00..100000090.02 rows=29 width=11)

Filter: ((substr((box)::text, 2, 2) = 'NL'::text) OR
(substr((box)::text, 2, 2) = 'NM'::text))
-> Index Scan using
index_dov_tvbv_bnk on dov_tvbv (cost=0.00..142.42 rows=2334 width=36)
-> Hash (cost=5.83..5.83 rows=16 width=10)
-> Index Scan using ek_pok_r_pkey on
ek_pok_r epr (cost=0.00..5.83 rows=16 width=10)
-> Hash (cost=178.15..178.15 rows=2100 width=11)
-> Index Scan using kl_r020_pkey on kl_r020
(cost=0.00..178.15 rows=2100 width=11)
-> Hash (cost=15.26..15.26 rows=1 width=4)
-> Index Scan using kl_r030_pkey on kl_r030 r030
(cost=0.00..15.26 rows=1 width=4)
Filter: ((r030)::text = '980'::text)
-> Hash (cost=3.04..3.04 rows=4 width=4)
-> Index Scan using kl_k041_pkey on kl_k041
(cost=0.00..3.04 rows=4 width=4)
(45 rows)

Function showcalc definition is:

CREATE OR REPLACE FUNCTION showcalc(VARCHAR(10), VARCHAR(2), VARCHAR(4),
NUMERIC(16)) RETURNS NUMERIC(16)
LANGUAGE SQL AS '
-- Parameters: code, dd, r020, t071
SELECT COALESCE(
(SELECT sc.koef * $4
FROM showing AS s NATURAL JOIN showcomp AS sc
WHERE s.kod LIKE $1
AND NOT SUBSTR(acc_mask, 1, 1) LIKE ''[''
AND SUBSTR(acc_mask, 1, 4) LIKE $3
AND SUBSTR(acc_mask, 5, 1) LIKE SUBSTR($2, 1, 1)),
(SELECT SUM(sc.koef * COALESCE(showcalc(SUBSTR(acc_mask, 2,
LENGTH(acc_mask) - 2), $2, $3, $4), 0))
FROM showing AS s NATURAL JOIN showcomp AS sc
WHERE s.kod LIKE $1
AND SUBSTR(acc_mask, 1, 1) LIKE ''[''),
0) AS showing;
';

View v_file02wide is:

CREATE VIEW v_file02wide AS
SELECT id_a011 AS a011, data, obl.ko, obl.nazva AS oblast, b030,
banx.box AS ncks, banx.nazva AS bank,
epr.dd, r020, r030, a3, r030.nazva AS valuta, k041,
-- Sum equivalent in national currency
t071 * get_kurs(id_r030, data) AS t070,
t071
FROM v_file02 AS vf02
JOIN kod_obl AS obl USING(id_obl)
JOIN (dov_bank NATURAL JOIN dov_tvbv) AS banx
ON banx.id_v = vf02.id_v
AND data BETWEEN COALESCE(banx.dataa, data)
AND COALESCE(banx.datab, data)
JOIN ek_pok_r AS epr USING(id_dd)
JOIN kl_r020 USING(id_r020)
JOIN kl_r030 AS r030 USING(id_r030)
JOIN kl_k041 USING(id_k041);

Function inrepdate is:

CREATE OR REPLACE FUNCTION inrepdate(DATE) RETURNS BOOL
LANGUAGE SQL AS '
-- Returns true if given date is in repdate
SELECT (SELECT COUNT(*) FROM repdate
WHERE $1 BETWEEN COALESCE(data1, CURRENT_DATE)
AND COALESCE(data2, CURRENT_DATE))
> 0;
';

Table sizes (records)
filexxr 34712
file02 816589
v_file02 816589
kod_obl 43
banx 2334
ek_pok_r 16
kl_r020 2100
kl_r030 208
kl_r041 4
v_file02wide
showing 2787
showcomp 13646
repdate 1

Table has indexes almost for all selected fields.
showcalc in this query selects and uses 195 rows.
Total query size is 8066 records (COUNT(*) executes about 33 seconds
and uses 120Mb RAM).

With best regards
Yaroslav Mazurak.

Attachment Content-Type Size
image/gif 27.2 KB

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Huxton 2003-08-06 09:30:36 Re: PostgreSQL performance problem -> tuning
Previous Message Hannu Krosing 2003-08-06 06:42:33 Re: How Many Inserts Per Transactions