complex query performance assistance request

From: John Mendenhall <john(at)surfutopia(dot)net>
To: pgsql-performance list <pgsql-performance(at)postgresql(dot)org>
Subject: complex query performance assistance request
Date: 2005-08-21 03:48:41
Message-ID: 20050821034841.GA28968@calvin.surfutopia.net
Lists: pgsql-performance

I need to improve the performance of the following query.

Soon after I reboot my server, this query takes about 20 seconds
the first time I run it; on subsequent runs it takes approximately
2 seconds. I understand caching is taking place (at the OS or DB
level; which one doesn't matter here).

Here are the results of the explain analyze run:

-----
LOG: duration: 6259.632 ms statement: explain analyze
SELECT
c.id AS contact_id,
sr.id AS sales_rep_id,
LTRIM(RTRIM(sr.firstname || ' ' || sr.lastname)) AS sales_rep_name,
p.id AS partner_id,
p.company AS partner_company,
coalesce(LTRIM(RTRIM(c.company)), LTRIM(RTRIM(c.firstname || ' ' || c.lastname)))
AS contact_company,
LTRIM(RTRIM(c.city || ' ' || c.state || ' ' || c.postalcode || ' ' || c.country))
AS contact_location,
c.phone AS contact_phone,
c.email AS contact_email,
co.name AS contact_country,
TO_CHAR(c.request_status_last_modified, 'mm/dd/yy hh12:mi pm')
AS request_status_last_modified,
TO_CHAR(c.request_status_last_modified, 'yyyymmddhh24miss')
AS rqst_stat_last_mdfd_sortable,
c.token_id,
c.master_key_token AS token
FROM
sales_reps sr
JOIN partners p ON (sr.id = p.sales_rep_id)
JOIN contacts c ON (p.id = c.partner_id)
JOIN countries co ON (LOWER(c.country) = LOWER(co.code))
JOIN partner_classification pc ON (p.classification_id = pc.id AND pc.classification != 'Sales Rep')
WHERE
c.lead_deleted IS NULL
AND EXISTS
(
SELECT
lr.id
FROM
lead_requests lr,
lead_request_status lrs
WHERE
c.id = lr.contact_id AND
lr.status_id = lrs.id AND
lrs.is_closed = 0
)
ORDER BY
contact_company, contact_id;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=39093.16..39102.80 rows=3856 width=238) (actual time=6220.481..6221.188 rows=1071 loops=1)
Sort Key: COALESCE(ltrim(rtrim((c.company)::text)), ltrim(rtrim((((c.firstname)::text || ' '::text) || (c.lastname)::text)))), c.id
-> Merge Join (cost=38580.89..38863.48 rows=3856 width=238) (actual time=6015.751..6184.199 rows=1071 loops=1)
Merge Cond: ("outer"."?column3?" = "inner"."?column19?")
-> Sort (cost=14.00..14.61 rows=242 width=19) (actual time=9.250..9.500 rows=240 loops=1)
Sort Key: lower((co.code)::text)
-> Seq Scan on countries co (cost=0.00..4.42 rows=242 width=19) (actual time=0.132..4.498 rows=242 loops=1)
-> Sort (cost=38566.89..38574.86 rows=3186 width=225) (actual time=6005.644..6006.954 rows=1071 loops=1)
Sort Key: lower((c.country)::text)
-> Merge Join (cost=75.65..38381.50 rows=3186 width=225) (actual time=58.086..5979.287 rows=1071 loops=1)
Merge Cond: ("outer".partner_id = "inner".id)
-> Index Scan using contacts_partner_id_idx on contacts c (cost=0.00..160907.39 rows=20106 width=171) (actual time=2.569..5816.985 rows=1547 loops=1)
Filter: ((lead_deleted IS NULL) AND (subplan))
SubPlan
-> Nested Loop (cost=1.16..6.56 rows=2 width=10) (actual time=0.119..0.119 rows=0 loops=40261)
Join Filter: ("outer".status_id = "inner".id)
-> Index Scan using lead_requests_contact_id_idx on lead_requests lr (cost=0.00..4.86 rows=3 width=20) (actual time=0.079..0.083 rows=0 loops=40261)
Index Cond: ($0 = contact_id)
-> Materialize (cost=1.16..1.24 rows=8 width=10) (actual time=0.002..0.011 rows=6 loops=12592)
-> Seq Scan on lead_request_status lrs (cost=0.00..1.16 rows=8 width=10) (actual time=0.083..0.270 rows=7 loops=1)
Filter: (is_closed = 0::numeric)
-> Sort (cost=75.65..76.37 rows=290 width=64) (actual time=55.073..56.990 rows=1334 loops=1)
Sort Key: p.id
-> Merge Join (cost=59.24..63.79 rows=290 width=64) (actual time=31.720..41.096 rows=395 loops=1)
Merge Cond: ("outer".id = "inner".sales_rep_id)
-> Sort (cost=2.42..2.52 rows=39 width=31) (actual time=1.565..1.616 rows=39 loops=1)
Sort Key: sr.id
-> Seq Scan on sales_reps sr (cost=0.00..1.39 rows=39 width=31) (actual time=0.043..0.581 rows=39 loops=1)
-> Sort (cost=56.82..57.55 rows=290 width=43) (actual time=29.921..30.310 rows=395 loops=1)
Sort Key: p.sales_rep_id
-> Nested Loop (cost=24.35..44.96 rows=290 width=43) (actual time=0.169..22.566 rows=395 loops=1)
Join Filter: ("inner".classification_id = "outer".id)
-> Seq Scan on partner_classification pc (cost=0.00..1.04 rows=2 width=10) (actual time=0.059..0.102 rows=2 loops=1)
Filter: ((classification)::text <> 'Sales Rep'::text)
-> Materialize (cost=24.35..28.70 rows=435 width=53) (actual time=0.023..5.880 rows=435 loops=2)
-> Seq Scan on partners p (cost=0.00..24.35 rows=435 width=53) (actual time=0.034..8.937 rows=435 loops=1)
Total runtime: 6225.791 ms
(37 rows)

-----

My first question is: what is the Materialize query plan element?
It appears twice here, and when I see it, my query is usually slow.

My second, and more important, question: does anyone have
any ideas or suggestions for how I can increase the speed
of this query?

Things I have already done: modified the joins and conditions
so the query starts with the smaller tables, keeping the join
set small; adjusted the server configuration to ensure index
scans are used where they should be; and ran vacuumdb and
analyze on the database.
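One further rewrite I have been considering (untested against the real
data) is folding the correlated EXISTS into a plain join plus DISTINCT.
Below is a minimal sketch using SQLite stand-ins for the tables, with
made-up sample rows, just to check that the two forms select the same
contacts; column names follow the query above, everything else is
illustrative only:

```python
import sqlite3

# Miniature stand-ins for contacts / lead_requests / lead_request_status.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE contacts (id INTEGER, lead_deleted TEXT);
CREATE TABLE lead_requests (id INTEGER, contact_id INTEGER, status_id INTEGER);
CREATE TABLE lead_request_status (id INTEGER, is_closed INTEGER);
-- contact 1: open request, 2: closed request, 3: deleted lead
INSERT INTO contacts VALUES (1, NULL), (2, NULL), (3, 'y');
INSERT INTO lead_request_status VALUES (10, 0), (11, 1);
INSERT INTO lead_requests VALUES (100, 1, 10), (101, 2, 11), (102, 3, 10);
""")

# Correlated EXISTS, as in the query above.
exists_q = """
SELECT c.id FROM contacts c
WHERE c.lead_deleted IS NULL
  AND EXISTS (SELECT lr.id FROM lead_requests lr, lead_request_status lrs
              WHERE c.id = lr.contact_id AND lr.status_id = lrs.id
                AND lrs.is_closed = 0)
ORDER BY c.id"""

# Candidate rewrite: join + DISTINCT instead of a per-row subplan.
join_q = """
SELECT DISTINCT c.id FROM contacts c
JOIN lead_requests lr ON lr.contact_id = c.id
JOIN lead_request_status lrs ON lrs.id = lr.status_id
WHERE c.lead_deleted IS NULL AND lrs.is_closed = 0
ORDER BY c.id"""

print(cur.execute(exists_q).fetchall())  # → [(1,)]
print(cur.execute(join_q).fetchall())    # → [(1,)]
```

Whether the join form actually plans better on the real tables I do not
know; it just avoids re-running the subplan once per candidate row.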

Thank you very much in advance for any pointers to additional
places I can look.

Thanks.

JohnM

--
John Mendenhall
john(at)surfutopia(dot)net
surf utopia
internet services
