From: "Ken Egervari" <ken(at)upfactor(dot)com>
To: "Josh Berkus" <josh(at)agliodbs(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Help with tuning this query (with explain analyze finally)
Date: 2005-03-03 06:59:13
Message-ID: 004601c51fbe$828bbb50$6a01a8c0@KEN
Lists: pgsql-hackers-win32, pgsql-performance
Josh,
>1) To determine your query order ala Dan Tow and drive off of person,
>please
>SET JOIN_COLLAPSE_LIMIT = 1 and then run Mark Kirkwood's version of the
>query. (Not that I believe in Dan Tow ... see previous message ... but it
>would be interesting to see the results.
Unfortunately, the query still takes 250 milliseconds. I tried it with
other queries and the results are the same as before. Here is the explain
analyze anyway:
Sort  (cost=2036.83..2036.87 rows=16 width=103) (actual time=328.000..328.000 rows=39 loops=1)
  Sort Key: cs.date
  ->  Nested Loop Left Join  (cost=620.61..2036.51 rows=16 width=103) (actual time=250.000..328.000 rows=39 loops=1)
        ->  Hash Join  (cost=620.61..1984.90 rows=16 width=78) (actual time=250.000..328.000 rows=39 loops=1)
              Hash Cond: ("outer".carrier_code_id = "inner".id)
              ->  Merge Join  (cost=606.11..1965.99 rows=825 width=74) (actual time=250.000..328.000 rows=310 loops=1)
                    Merge Cond: ("outer".current_status_id = "inner".id)
                    ->  Index Scan using shipment_current_status_id_idx on shipment s  (cost=0.00..2701.26 rows=60307 width=66) (actual time=0.000..77.000 rows=27711 loops=1)
                          Filter: (is_purged = false)
                    ->  Sort  (cost=606.11..610.50 rows=1756 width=12) (actual time=141.000..141.000 rows=6902 loops=1)
                          Sort Key: cs.id
                          ->  Hash Join  (cost=1.11..511.48 rows=1756 width=12) (actual time=0.000..109.000 rows=6902 loops=1)
                                Hash Cond: ("outer".release_code_id = "inner".id)
                                ->  Index Scan Backward using current_status_date_idx on shipment_status cs  (cost=0.01..422.58 rows=14047 width=16) (actual time=0.000..78.000 rows=14925 loops=1)
                                      Index Cond: ((date >= (('now'::text)::date - 31)) AND (date <= ('now'::text)::date))
                                      Filter: (cs.* IS NOT NULL)
                                ->  Hash  (cost=1.10..1.10 rows=1 width=4) (actual time=0.000..0.000 rows=0 loops=1)
                                      ->  Seq Scan on release_code rc  (cost=0.00..1.10 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                                            Filter: ((number)::text = '9'::text)
              ->  Hash  (cost=14.49..14.49 rows=2 width=8) (actual time=0.000..0.000 rows=0 loops=1)
                    ->  Nested Loop  (cost=6.87..14.49 rows=2 width=8) (actual time=0.000..0.000 rows=2 loops=1)
                          ->  Index Scan using person_pkey on person p  (cost=0.00..5.73 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=1)
                                Index Cond: (id = 355)
                          ->  Hash Join  (cost=6.87..8.74 rows=2 width=8) (actual time=0.000..0.000 rows=2 loops=1)
                                Hash Cond: ("outer".carrier_id = "inner".carrier_id)
                                ->  Seq Scan on carrier_code cc  (cost=0.00..1.57 rows=57 width=8) (actual time=0.000..0.000 rows=57 loops=1)
                                ->  Hash  (cost=6.86..6.86 rows=1 width=12) (actual time=0.000..0.000 rows=0 loops=1)
                                      ->  Hash Join  (cost=3.04..6.86 rows=1 width=12) (actual time=0.000..0.000 rows=1 loops=1)
                                            Hash Cond: ("outer".id = "inner".carrier_id)
                                            ->  Seq Scan on carrier c  (cost=0.00..3.54 rows=54 width=4) (actual time=0.000..0.000 rows=54 loops=1)
                                            ->  Hash  (cost=3.04..3.04 rows=1 width=8) (actual time=0.000..0.000 rows=0 loops=1)
                                                  ->  Index Scan using carrier_to_person_person_id_idx on carrier_to_person ctp  (cost=0.00..3.04 rows=1 width=8) (actual time=0.000..0.000 rows=1 loops=1)
                                                        Index Cond: (355 = person_id)
        ->  Index Scan using shipment_status_shipment_id_idx on shipment_status ss  (cost=0.00..3.20 rows=2 width=25) (actual time=0.000..0.000 rows=1 loops=39)
              Index Cond: (ss.shipment_id = "outer".id)
Total runtime: 328.000 ms
>2) Force PG to drop the merge join via SET ENABLE_MERGEJOIN = FALSE;
Setting this option had no effect either. In fact, the query is a bit slower
(266 milliseconds, though 250 came up once in 20 executions).
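For anyone trying this at home, the toggle is per-session and easy to revert; a minimal sketch (the shipment query itself is elided):

```sql
-- Disable merge joins for this session only; the setting
-- reverts when the session ends or on RESET.
SET enable_mergejoin = FALSE;
SHOW enable_mergejoin;

-- Re-run the query under EXPLAIN ANALYZE to compare plans and timings
EXPLAIN ANALYZE SELECT ...;

-- Restore the default
RESET enable_mergejoin;
```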
>Also, please let us know some about the server you're using and your
>configuration parameters, particularly:
>shared_buffers
>work_mem
>effective_cache_size
>random_page_cost
Well, I'm on a test machine, so the settings haven't changed one bit from the
defaults. This may sound embarrassing, but I bet the production server is
not custom configured either. The computer I'm running these queries on is
just a simple Athlon XP 2100+ on WinXP with 1GB of RAM. The production
server is a faster P4, but the rest is the same. Here are the 4 values in
my configuration, but 3 of them were commented out:
shared_buffers = 1000
#work_mem = 1024
#effective_cache_size = 1000
#random_page_cost = 4
I'm not sure what these do, but I'm guessing the last 2 affect how the
planner uses its statistics. Should I increase the first two?
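For a rough sense of scale on a 1 GB machine, tuning guides of this era suggested starting points along these lines (illustrative values, not tested settings; units follow the 8.0-era defaults, where shared_buffers and effective_cache_size are counted in 8 kB pages and work_mem in kB):

```
# postgresql.conf sketch for a ~1 GB workstation -- starting points only
shared_buffers = 10000         # ~80 MB buffer pool, up from the 8 MB default
work_mem = 8192                # 8 MB per sort/hash operation (per node, per query)
effective_cache_size = 65536   # ~512 MB; planner's hint for OS + PG cache size
random_page_cost = 4           # lower (2-3) if the working set fits in cache
```

shared_buffers and work_mem change actual memory use, while effective_cache_size and random_page_cost only influence the planner's cost estimates, which is why the latter two can shift plan choices without changing resource consumption.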
Regards,
Ken
Next Message: Richard Huxton, 2005-03-03 07:06:47, Re: Help with tuning this query (more musings)
Previous Message: Ken Egervari, 2005-03-03 06:35:42, Re: Help with tuning this query (more musings)