Re: bad query performance

From: Didrik Pinte <dp(at)adhocsolutions(dot)com>
To: Luis Sousa <llsousa(at)ualg(dot)pt>, Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: bad query performance
Date: 2003-05-15 07:55:40
Message-ID: 5.2.0.9.0.20030515093244.01bf9fb0@192.168.0.11
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

At 18:54 5/14/2003, Luis Sousa wrote:
>Stephan Szabo wrote:
>
>>On Mon, 12 May 2003, Didrik Pinte wrote:
>>
>>
>>
>>>Here is the query :
>>>------------------------------------------------------------------------------------
>>>SELECT profiles_et.username, profiles_et.name, profiles_et.firstname,
>>>profiles_et.email, company_et.name AS company, count(logs_et.dt) AS cnt,
>>>pro_invoice_addr_at.id
>>>
>>>
>>>FROM
>>>(
>>> (
>>> (
>>> secu.company_et JOIN secu.pro_invoice_addr_at
>>>ON (pro_invoice_addr_at.company = company_et.id)
>>> )
>>>
>>> JOIN secu.profiles_et ON (pro_invoice_addr_at.profile =
>>>profiles_et.id)
>>> )
>>> JOIN secu.logs_et ON (logs_et.invaddr = pro_invoice_addr_at.id)
>>>)
>>>
>>>
>>> GROUP BY profiles_et.username, profiles_et.name, profiles_et.firstname,
>>>profiles_et.email, company_et.name, pro_invoice_addr_at.id
>>>
>>>ORDER BY count(logs_et.dt) DESC;
>>>------------------------------------------------------------------------------------
>>>
>>>
>>>
>>>The secu_company, secu_prov_invoice_addr_at and secu_profiles contains only
>>>informations about the users.
>>>
>>>The query is executing in 8,6 minutes.... Indexes are defined on all the
>>>field used in the different joins.
>>>
>>
>>Have you recently analyzed the tables involved?
>>I'd guess that the two outer sorts and the sort on logs_et are probably
>>the killers, but can you send explain analyze output? Looking at the
>>estimated row numbers and width, it seems to me that with 16Mb of sort_mem
>>it's going to have to sort on disk (although you don't have enough memory
>>to up it far enough probably if the estimates are right)
>>
>>
>>
>>
>>>Here is the query plan :
>>>
>>>------------------------------------------------------------------------------------
>>>Sort (cost=895649.54..896073.23 rows=169474 width=145)
>>> Sort Key: count(logs_et.dt)
>>> -> Aggregate (cost=831240.24..865135.10 rows=169474 width=145)
>>> -> Group (cost=831240.24..860898.24 rows=1694743 width=145)
>>> -> Sort (cost=831240.24..835477.10 rows=1694743 width=145)
>>> Sort Key: profiles_et.username, profiles_et.name,
>>>profiles_et.firstname, profiles_et.email, company_et.name,
>>>pro_invoice_addr_at.id
>>> -> Merge Join (cost=274406.73..304066.75
>>>rows=1694743 width=145)
>>> Merge Cond: ("outer".id = "inner".invaddr)
>>> -> Sort (cost=143.81..144.92 rows=446
>>> width=126)
>>> Sort Key: pro_invoice_addr_at.id
>>> -> Merge Join (cost=90.27..124.18
>>>rows=446 width=126)
>>> Merge Cond: ("outer".id =
>>>"inner".profile)
>>> -> Index Scan using profiles_pk on
>>>profiles_et (cost=0.00..24.98 rows=449 width=66)
>>> -> Sort (cost=90.27..91.39
>>>rows=446 width=60)
>>> Sort Key:
>>>pro_invoice_addr_at.profile
>>> -> Merge
>>>Join (cost=37.82..70.65 rows=446 width=60)
>>> Merge Cond:
>>>("outer".company = "inner".id)
>>> -> Index Scan using
>>>invaddr_at_company_idx on pro_invoice_addr_at (cost=0.00..24.68 rows=446
>>>width=33)
>>> -> Sort
>>>(cost=37.82..38.55 rows=291 width=27)
>>> Sort Key:
>>>company_et.id
>>> -> Seq Scan on
>>>company_et (cost=0.00..25.91 rows=291 width=27)
>>> -> Sort (cost=274262.92..278499.78
>>>rows=1694743 width=19)
>>> Sort Key: logs_et.invaddr
>>> -> Seq Scan on
>>>logs_et (cost=0.00..55404.43 rows=1694743 width=19)
>>>------------------------------------------------------------------------------------
>>>
>>>The computer is a Pentium III 850 Mhz with 256 mb RAM (Sort memory for
>>>postres is 16 mb, and 64 mb of shared memory).
>>>
>>
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 3: if posting/reading through Usenet, please send an appropriate
>>subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
>>message can get through to the mailing list cleanly
>>
>>
>>
>-> Seq Scan on
>company_et (cost=0.00..25.91 rows=291 width=27)
> -> Sort (cost=274262.92..278499.78
>rows=1694743 width=19)
> Sort Key: logs_et.invaddr
>
>
>Do you have any indexes defined for logs_et.invaddr and for
>pro_invoice_addr_at.id ?
>
>Luis Sousa

pro_invoice_add_at.id had not index but was a primary key. I've created an
index on it. Here is the explain analyse for the query :

Sort (cost=895649.54..896073.23 rows=169474 width=145) (actual
time=553423.57..553423.76 rows=259 loops=1)
Sort Key: count(logs_et.dt)
-> Aggregate (cost=831240.24..865135.10 rows=169474 width=145) (actual
time=463611.22..553421.28 rows=259 loops=1)
-> Group (cost=831240.24..860898.24 rows=1694743 width=145)
(actual time=463610.39..550481.54 rows=1698665 loops=1)
-> Sort (cost=831240.24..835477.10 rows=1694743 width=145)
(actual time=463610.36..468962.97 rows=1698665 loops=1)
Sort Key: profiles_et.username, profiles_et.name,
profiles_et.firstname, profiles_et.email, company_et.name,
pro_invoice_addr_at.id
-> Merge Join (cost=274406.73..304066.75
rows=1694743 width=145) (actual time=76531.90..97594.64 rows=1698665 loops=1)
Merge Cond: ("outer".id = "inner".invaddr)
-> Sort (cost=143.81..144.92 rows=446
width=126) (actual time=147.27..147.67 rows=446 loops=1)
Sort Key: pro_invoice_addr_at.id
-> Merge Join (cost=90.27..124.18
rows=446 width=126) (actual time=105.69..143.63 rows=446 loops=1)
Merge Cond: ("outer".id =
"inner".profile)
-> Index Scan using profiles_pk on
profiles_et (cost=0.00..24.98 rows=449 width=66) (actual time=50.53..79.70
rows=449 loops=1)
-> Sort (cost=90.27..91.39
rows=446 width=60) (actual time=55.13..55.54 rows=446 loops=1)
Sort Key:
pro_invoice_addr_at.profile
-> Merge
Join (cost=37.82..70.65 rows=446 width=60) (actual time=40.11..49.69
rows=446 loops=1)
Merge Cond:
("outer".company = "inner".id)
-> Index Scan using
invaddr_at_company_idx on pro_invoice_addr_at (cost=0.00..24.68 rows=446
width=33) (actual time=9.91..13.54 rows=446 loops=1)
-> Sort
(cost=37.82..38.55 rows=291 width=27) (actual time=30.11..30.49 rows=447
loops=1)
Sort Key:
company_et.id
-> Seq Scan on
company_et (cost=0.00..25.91 rows=291 width=27) (actual time=7.24..27.73
rows=291 loops=1)
-> Sort (cost=274262.92..278499.78
rows=1694743 width=19) (actual time=76384.57..80077.80 rows=1698665 loops=1)
Sort Key: logs_et.invaddr
-> Seq Scan on
logs_et (cost=0.00..55404.43 rows=1694743 width=19) (actual
time=11.47..13274.68 rows=1698665 loops=1)
Total runtime: 553673.02 msec

,"**** DISCLAIMER ****
This e-mail and any attachments thereto may contain information
which is confidential and/or protected by intellectual property
rights and are intended for the sole use of the recipient(s) named above.
Any use of the information contained herein (including, but not limited to,
total or partial reproduction, communication or distribution in any form)
by persons other than the designated recipient(s) is prohibited.
If you have received this e-mail in error, please notify the sender
and delete the material from any computer.
Thank you for your cooperation.

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Luis Sousa 2003-05-15 08:43:10 Re: bad query performance
Previous Message Matt Mello 2003-05-15 05:44:21 Re: "deadlock detected" documentation