Optimizing a VIEW

From: Madison Kelly <linux(at)alteeve(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Optimizing a VIEW
Date: 2008-08-15 18:36:20
Message-ID: 48A5CCA4.2040206@alteeve.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi all,

I've got a simple table with a lot of data in it:

CREATE TABLE customer_data (
cd_id int primary key default(nextval('cd_seq')),
cd_cust_id int not null,
cd_variable text not null,
cd_value text,
cd_tag text,
added_user int not null,
added_date timestamp not null default now(),
modified_user int not null,
modified_date timestamp not null default now(),

FOREIGN KEY(cd_cust_id) REFERENCES customer(cust_id)
);

The 'cust_id' references the customer that the given data belongs to.
The reason for this "data bucket" (does this structure have a proper
name?) is that the data I need to store on a give customer is quite
variable and outside of my control. As it is, there is about 400
different variable/value pairs I need to store per customer.

This table has a copy in a second historical schema that matches this
one in public but with an additional 'history_id' sequence. I use a
simple function to copy an INSERT or UPDATE to any entry in the
historical schema.

Now I want to graph a certain subset of these variable/value pairs,
so I created a simple (in concept) view to pull out the historical data
set for a given customer. I do this by pulling up a set of records based
on the name of the 'cd_variable' and 'cd_tag' and connect the records
together using a matching timestamp.

The problem is that this view has very quickly become terribly slow.

I've got indexes on the 'cd_variable', 'cd_tag' and the parent
'cust_id' columns, and the plan seems to show that the indexes are
indeed being used, but the query against this view can take up to 10
minutes to respond. I am hoping to avoid making a dedicated table as
what I use to build this dataset may change over time.

Below I will post the VIEW and a sample of the query's EXPLAIN
ANALYZE. Thanks for any tips/help/clue-stick-beating you may be able to
share!

Madi

-=] VIEW

CREATE VIEW view_sync_rate_history AS
SELECT
a.cust_id AS vsrh_cust_id,
a.cust_name AS vsrh_cust_name,
a.cust_business AS vsrh_cust_business,
a.cust_nexxia_id||'-'||a.cust_nexxia_seq AS vsrh_cust_nexxia,
a.cust_phone AS vsrh_cust_phone,
b.cd_value AS vsrh_up_speed,
b.history_id AS vsrh_up_speed_history_id,
c.cd_value AS vsrh_up_rco,
c.history_id AS vsrh_up_rco_history_id,
d.cd_value AS vsrh_up_nm,
d.history_id AS vsrh_up_nm_history_id,
e.cd_value AS vsrh_up_sp,
e.history_id AS vsrh_up_sp_history_id,
f.cd_value AS vsrh_up_atten,
f.history_id AS vsrh_up_atten_history_id,
g.cd_value AS vsrh_down_speed,
g.history_id AS vsrh_down_speed_history_id,
h.cd_value AS vsrh_down_rco,
h.history_id AS vsrh_down_rco_history_id,
i.cd_value AS vsrh_down_nm,
i.history_id AS vsrh_down_nm_history_id,
j.cd_value AS vsrh_down_sp,
j.history_id AS vsrh_down_sp_history_id,
k.cd_value AS vsrh_down_atten,
k.history_id AS vsrh_down_atten_history_id,
l.cd_value AS vsrh_updated,
l.history_id AS vsrh_updated_history_id
FROM
customer a,
history.customer_data b,
history.customer_data c,
history.customer_data d,
history.customer_data e,
history.customer_data f,
history.customer_data g,
history.customer_data h,
history.customer_data i,
history.customer_data j,
history.customer_data k,
history.customer_data l
WHERE
a.cust_id=b.cd_cust_id AND
a.cust_id=c.cd_cust_id AND
a.cust_id=d.cd_cust_id AND
a.cust_id=e.cd_cust_id AND
a.cust_id=f.cd_cust_id AND
a.cust_id=g.cd_cust_id AND
a.cust_id=h.cd_cust_id AND
a.cust_id=i.cd_cust_id AND
a.cust_id=j.cd_cust_id AND
a.cust_id=k.cd_cust_id AND
a.cust_id=l.cd_cust_id AND
b.cd_tag='sync_rate' AND
c.cd_tag='sync_rate' AND
d.cd_tag='sync_rate' AND
e.cd_tag='sync_rate' AND
f.cd_tag='sync_rate' AND
g.cd_tag='sync_rate' AND
h.cd_tag='sync_rate' AND
i.cd_tag='sync_rate' AND
j.cd_tag='sync_rate' AND
k.cd_tag='sync_rate' AND
l.cd_tag='sync_rate' AND
b.cd_variable='upstream_speed' AND
c.cd_variable='upstream_relative_capacity_occupation' AND
d.cd_variable='upstream_noise_margin' AND
e.cd_variable='upstream_signal_power' AND
f.cd_variable='upstream_attenuation' AND
g.cd_variable='downstream_speed' AND
h.cd_variable='downstream_relative_capacity_occupation' AND
i.cd_variable='downstream_noise_margin' AND
j.cd_variable='downstream_signal_power' AND
k.cd_variable='downstream_attenuation' AND
l.cd_variable='sync_rate_updated' AND
b.modified_date=c.modified_date AND
b.modified_date=d.modified_date AND
b.modified_date=e.modified_date AND
b.modified_date=f.modified_date AND
b.modified_date=g.modified_date AND
b.modified_date=h.modified_date AND
b.modified_date=i.modified_date AND
b.modified_date=j.modified_date AND
b.modified_date=k.modified_date AND
b.modified_date=l.modified_date;

-=] EXPLAIN ANALYZE of a sample query
In case this is hard to read in the mail program, here is a link:
http://mizu-bu.org/misc/long_explain_analyze.txt


QUERY PLAN

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=1263.93..3417.98 rows=1 width=262) (actual
time=88.005..248996.948 rows=131 loops=1)
Join Filter: ("inner".modified_date = "outer".modified_date)
-> Nested Loop (cost=577.65..1482.46 rows=1 width=154) (actual
time=58.664..5253.873 rows=131 loops=1)
Join Filter: ("outer".modified_date = "inner".modified_date)
-> Nested Loop (cost=369.98..870.28 rows=1 width=128)
(actual time=51.858..4328.108 rows=131 loops=1)
Join Filter: ("inner".modified_date = "outer".modified_date)
-> Nested Loop (cost=343.35..823.93 rows=1 width=116)
(actual time=42.851..3185.995 rows=131 loops=1)
Join Filter: ("inner".modified_date =
"outer".modified_date)
-> Nested Loop (cost=126.20..185.37 rows=1
width=90) (actual time=36.181..2280.245 rows=131 loops=1)
Join Filter: ("inner".modified_date =
"outer".modified_date)
-> Nested Loop (cost=99.57..139.02 rows=1
width=64) (actual time=27.918..1168.061 rows=131 loops=1)
Join Filter: ("outer".modified_date =
"inner".modified_date)
-> Hash Join (cost=72.94..92.67
rows=1 width=38) (actual time=17.769..18.572 rows=131 loops=1)
Hash Cond:
("outer".modified_date = "inner".modified_date)
-> Bitmap Heap Scan on
customer_data i (cost=26.63..46.30 rows=4 width=26) (actual
time=8.226..8.563 rows=131 loops=1)
Recheck Cond:
((cd_variable = 'downstream_noise_margin'::text) AND (103 = cd_cust_id))
Filter: (cd_tag =
'sync_rate'::text)
-> BitmapAnd
(cost=26.63..26.63 rows=5 width=0) (actual time=8.172..8.172 rows=0 loops=1)
-> Bitmap Index
Scan on cd_variable_index (cost=0.00..10.21 rows=918 width=0) (actual
time=6.409..6.409 rows=20981 loops=1)
Index Cond:
(cd_variable = 'downstream_noise_margin'::text)
-> Bitmap Index
Scan on cd_id_index (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.502..0.502 rows=2619 loops=1)
Index Cond:
(103 = cd_cust_id)
-> Hash (cost=46.30..46.30
rows=4 width=12) (actual time=9.526..9.526 rows=131 loops=1)
-> Bitmap Heap Scan on
customer_data e (cost=26.63..46.30 rows=4 width=12) (actual
time=9.140..9.381 rows=131 loops=1)
Recheck Cond:
((cd_variable = 'upstream_signal_power'::text) AND (103 = cd_cust_id))
Filter: (cd_tag =
'sync_rate'::text)
-> BitmapAnd
(cost=26.63..26.63 rows=5 width=0) (actual time=9.082..9.082 rows=0 loops=1)
-> Bitmap
Index Scan on cd_variable_index (cost=0.00..10.21 rows=918 width=0)
(actual time=7.298..7.298 rows=20981 loops=1)
Index
Cond: (cd_variable = 'upstream_signal_power'::text)
-> Bitmap
Index Scan on cd_id_index (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.502..0.502 rows=2619 loops=1)
Index
Cond: (103 = cd_cust_id)
-> Bitmap Heap Scan on customer_data
c (cost=26.63..46.30 rows=4 width=26) (actual time=8.492..8.693
rows=131 loops=131)
Recheck Cond: ((cd_variable =
'upstream_relative_capacity_occupation'::text) AND (103 = cd_cust_id))
Filter: (cd_tag = 'sync_rate'::text)
-> BitmapAnd
(cost=26.63..26.63 rows=5 width=0) (actual time=8.446..8.446 rows=0
loops=131)
-> Bitmap Index Scan on
cd_variable_index (cost=0.00..10.21 rows=918 width=0) (actual
time=6.693..6.693 rows=20986 loops=131)
Index Cond:
(cd_variable = 'upstream_relative_capacity_occupation'::text)
-> Bitmap Index Scan on
cd_id_index (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.494..0.494 rows=2619 loops=131)
Index Cond: (103 =
cd_cust_id)
-> Bitmap Heap Scan on customer_data b
(cost=26.63..46.30 rows=4 width=26) (actual time=8.216..8.405 rows=131
loops=131)
Recheck Cond: ((cd_variable =
'upstream_speed'::text) AND (103 = cd_cust_id))
Filter: (cd_tag = 'sync_rate'::text)
-> BitmapAnd (cost=26.63..26.63
rows=5 width=0) (actual time=8.172..8.172 rows=0 loops=131)
-> Bitmap Index Scan on
cd_variable_index (cost=0.00..10.21 rows=918 width=0) (actual
time=6.417..6.417 rows=20986 loops=131)
Index Cond: (cd_variable =
'upstream_speed'::text)
-> Bitmap Index Scan on
cd_id_index (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.495..0.495 rows=2619 loops=131)
Index Cond: (103 = cd_cust_id)
-> Bitmap Heap Scan on customer_data l
(cost=217.14..637.28 rows=102 width=26) (actual time=6.653..6.843
rows=131 loops=131)
Recheck Cond: ((103 = cd_cust_id) AND
(cd_variable = 'sync_rate_updated'::text))
Filter: (cd_tag = 'sync_rate'::text)
-> BitmapAnd (cost=217.14..217.14 rows=117
width=0) (actual time=6.618..6.618 rows=0 loops=131)
-> Bitmap Index Scan on cd_id_index
(cost=0.00..16.17 rows=2333 width=0) (actual time=0.485..0.485 rows=2619
loops=131)
Index Cond: (103 = cd_cust_id)
-> Bitmap Index Scan on
cd_variable_index (cost=0.00..200.72 rows=21350 width=0) (actual
time=6.079..6.079 rows=20986 loops=131)
Index Cond: (cd_variable =
'sync_rate_updated'::text)
-> Bitmap Heap Scan on customer_data k
(cost=26.63..46.30 rows=4 width=12) (actual time=8.442..8.638 rows=131
loops=131)
Recheck Cond: ((cd_variable =
'downstream_attenuation'::text) AND (103 = cd_cust_id))
Filter: (cd_tag = 'sync_rate'::text)
-> BitmapAnd (cost=26.63..26.63 rows=5 width=0)
(actual time=8.397..8.397 rows=0 loops=131)
-> Bitmap Index Scan on cd_variable_index
(cost=0.00..10.21 rows=918 width=0) (actual time=6.624..6.624 rows=20986
loops=131)
Index Cond: (cd_variable =
'downstream_attenuation'::text)
-> Bitmap Index Scan on cd_id_index
(cost=0.00..16.17 rows=2333 width=0) (actual time=0.487..0.487 rows=2619
loops=131)
Index Cond: (103 = cd_cust_id)
-> Bitmap Heap Scan on customer_data d (cost=207.68..610.95
rows=98 width=26) (actual time=6.805..6.994 rows=131 loops=131)
Recheck Cond: ((103 = cd_cust_id) AND (cd_variable =
'upstream_noise_margin'::text))
Filter: (cd_tag = 'sync_rate'::text)
-> BitmapAnd (cost=207.68..207.68 rows=112 width=0)
(actual time=6.769..6.769 rows=0 loops=131)
-> Bitmap Index Scan on cd_id_index
(cost=0.00..16.17 rows=2333 width=0) (actual time=0.487..0.487 rows=2619
loops=131)
Index Cond: (103 = cd_cust_id)
-> Bitmap Index Scan on cd_variable_index
(cost=0.00..191.26 rows=20360 width=0) (actual time=6.230..6.230
rows=20986 loops=131)
Index Cond: (cd_variable =
'upstream_noise_margin'::text)
-> Nested Loop (cost=686.28..1935.49 rows=1 width=224) (actual
time=21.077..1860.475 rows=131 loops=131)
-> Seq Scan on customer a (cost=0.00..5.22 rows=1 width=164)
(actual time=0.053..0.090 rows=1 loops=131)
Filter: (cust_id = 103)
-> Nested Loop (cost=686.28..1930.26 rows=1 width=76)
(actual time=21.014..1860.177 rows=131 loops=131)
Join Filter: ("inner".modified_date = "outer".modified_date)
-> Nested Loop (cost=472.13..1298.07 rows=1 width=50)
(actual time=14.460..971.017 rows=131 loops=131)
Join Filter: ("inner".modified_date =
"outer".modified_date)
-> Hash Join (cost=259.97..674.63 rows=1
width=38) (actual time=7.459..8.272 rows=131 loops=131)
Hash Cond: ("outer".modified_date =
"inner".modified_date)
-> Bitmap Heap Scan on customer_data h
(cost=213.66..627.06 rows=100 width=26) (actual time=7.391..7.707
rows=131 loops=131)
Recheck Cond: ((103 = cd_cust_id) AND
(cd_variable = 'downstream_relative_capacity_occupation'::text))
Filter: (cd_tag = 'sync_rate'::text)
-> BitmapAnd (cost=213.66..213.66
rows=115 width=0) (actual time=7.355..7.355 rows=0 loops=131)
-> Bitmap Index Scan on
cd_id_index (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.493..0.493 rows=2619 loops=131)
Index Cond: (103 = cd_cust_id)
-> Bitmap Index Scan on
cd_variable_index (cost=0.00..197.24 rows=20926 width=0) (actual
time=6.809..6.809 rows=20986 loops=131)
Index Cond: (cd_variable =
'downstream_relative_capacity_occupation'::text)
-> Hash (cost=46.30..46.30 rows=4
width=12) (actual time=8.253..8.253 rows=131 loops=1)
-> Bitmap Heap Scan on customer_data
f (cost=26.63..46.30 rows=4 width=12) (actual time=7.882..8.113
rows=131 loops=1)
Recheck Cond: ((cd_variable =
'upstream_attenuation'::text) AND (103 = cd_cust_id))
Filter: (cd_tag = 'sync_rate'::text)
-> BitmapAnd
(cost=26.63..26.63 rows=5 width=0) (actual time=7.832..7.832 rows=0 loops=1)
-> Bitmap Index Scan on
cd_variable_index (cost=0.00..10.21 rows=918 width=0) (actual
time=6.065..6.065 rows=20981 loops=1)
Index Cond:
(cd_variable = 'upstream_attenuation'::text)
-> Bitmap Index Scan on
cd_id_index (cost=0.00..16.17 rows=2333 width=0) (actual
time=0.489..0.489 rows=2619 loops=1)
Index Cond: (103 =
cd_cust_id)
-> Bitmap Heap Scan on customer_data j
(cost=212.16..622.19 rows=100 width=12) (actual time=7.092..7.280
rows=131 loops=17161)
Recheck Cond: ((103 = cd_cust_id) AND
(cd_variable = 'downstream_signal_power'::text))
Filter: (cd_tag = 'sync_rate'::text)
-> BitmapAnd (cost=212.16..212.16 rows=114
width=0) (actual time=7.057..7.057 rows=0 loops=17161)
-> Bitmap Index Scan on cd_id_index
(cost=0.00..16.17 rows=2333 width=0) (actual time=0.493..0.493 rows=2619
loops=17161)
Index Cond: (103 = cd_cust_id)
-> Bitmap Index Scan on
cd_variable_index (cost=0.00..195.74 rows=20784 width=0) (actual
time=6.512..6.512 rows=20986 loops=17161)
Index Cond: (cd_variable =
'downstream_signal_power'::text)
-> Bitmap Heap Scan on customer_data g
(cost=214.15..630.92 rows=101 width=26) (actual time=6.526..6.718
rows=131 loops=17161)
Recheck Cond: ((103 = cd_cust_id) AND (cd_variable
= 'downstream_speed'::text))
Filter: (cd_tag = 'sync_rate'::text)
-> BitmapAnd (cost=214.15..214.15 rows=116
width=0) (actual time=6.492..6.492 rows=0 loops=17161)
-> Bitmap Index Scan on cd_id_index
(cost=0.00..16.17 rows=2333 width=0) (actual time=0.486..0.486 rows=2619
loops=17161)
Index Cond: (103 = cd_cust_id)
-> Bitmap Index Scan on cd_variable_index
(cost=0.00..197.73 rows=21067 width=0) (actual time=5.956..5.956
rows=20986 loops=17161)
Index Cond: (cd_variable =
'downstream_speed'::text)
Total runtime: 248997.571 ms
(114 rows)

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Bruce Momjian 2008-08-15 19:22:59 Re: file system and raid performance
Previous Message Ron Mayer 2008-08-15 04:33:28 Re: Filesystem benchmarking for pg 8.3.3 server