Re: *very* slow query to summarize data for a month ...

From: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: "Marc G(dot) Fournier" <scrappy(at)postgresql(dot)org>, pgsql-performance(at)postgresql(dot)org
Subject: Re: *very* slow query to summarize data for a month ...
Date: 2003-11-11 00:49:57
Message-ID: 20031110203259.Y727@ganymede.hub.org
Lists: pgsql-performance

On Mon, 10 Nov 2003, Josh Berkus wrote:

> Marc,
>
> I'd say your machine is very low on available RAM, particularly sort_mem.
> The steps which are taking a long time are:

Here's the server:

last pid: 42651; load averages: 1.52, 0.96, 0.88
up 28+07:43:33 20:35:44
307 processes: 2 running, 304 sleeping, 1 zombie
CPU states: 18.0% user, 0.0% nice, 29.1% system, 0.6% interrupt, 52.3% idle
Mem: 1203M Active, 1839M Inact, 709M Wired, 206M Cache, 199M Buf, 5608K Free
Swap: 8192M Total, 1804K Used, 8190M Free

>
> > Aggregate (cost=32000.94..32083.07 rows=821 width=41) (actual
> time=32983.36..47586.17 rows=144 loops=1)
> > -> Group (cost=32000.94..32062.54 rows=8213 width=41) (actual
> time=32957.40..42817.88 rows=462198 loops=1)
>
> and:
>
> > -> Merge Join (cost=31321.45..31466.92 rows=8213 width=41)
> (actual time=13983.07..22642.14 rows=462198 loops=1)
> > Merge Cond: ("outer".company_id = "inner".company_id)
> > -> Sort (cost=24.41..25.29 rows=352 width=25) (actual
> time=5.52..7.40 rows=348 loops=1)
>
> There are also *large* delays between steps. Either your I/O is saturated,
> or you haven't run a VACUUM FULL ANALYZE in a while (which would also explain
> the estimates being off).

I thought about that before I started the thread, and ran it just in case ...

I just restarted the server with sort_mem set to 10M, and it didn't help much on the Aggregate or the Merge Join ... :

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=39674.38..39756.70 rows=823 width=41) (actual time=33066.25..54021.50 rows=144 loops=1)
-> Group (cost=39674.38..39736.12 rows=8232 width=41) (actual time=33040.25..47005.57 rows=462198 loops=1)
-> Sort (cost=39674.38..39694.96 rows=8232 width=41) (actual time=33040.22..37875.97 rows=462198 loops=1)
Sort Key: c.company_name, ts.company_id
-> Merge Join (cost=38993.22..39139.02 rows=8232 width=41) (actual time=14428.17..23568.80 rows=462198 loops=1)
Merge Cond: ("outer".company_id = "inner".company_id)
-> Sort (cost=24.41..25.29 rows=352 width=25) (actual time=5.80..7.66 rows=348 loops=1)
Sort Key: c.company_id
-> Seq Scan on company c (cost=0.00..9.52 rows=352 width=25) (actual time=0.08..3.06 rows=352 loops=1)
-> Sort (cost=38968.82..38989.40 rows=8232 width=16) (actual time=14422.27..17429.34 rows=462198 loops=1)
Sort Key: ts.company_id
-> Seq Scan on traffic_logs ts (cost=0.00..38433.46 rows=8232 width=16) (actual time=0.15..8119.72 rows=462198 loops=1)
Filter: (date_trunc('month'::text, runtime) = '2003-10-01 00:00:00'::timestamp without time zone)
Total runtime: 54034.44 msec
(14 rows)
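For reference, the plan above corresponds to a query of roughly this shape, reconstructed from the plan nodes (the actual SELECT list and aggregate function aren't quoted anywhere in this thread, so treat this only as a sketch):

```sql
-- Hypothetical reconstruction from the EXPLAIN output: Merge Join on
-- company_id, filter on date_trunc('month', runtime), group/aggregate by
-- company.  count(*) stands in for whatever aggregate the real query uses.
SELECT c.company_name, ts.company_id, count(*)
  FROM company c
  JOIN traffic_logs ts ON ts.company_id = c.company_id
 WHERE date_trunc('month', ts.runtime) = '2003-10-01'
 GROUP BY c.company_name, ts.company_id;
```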

the problem is that the results we are comparing against right now are the
ones that had the negative time in them :( Just restarted the server with
default sort_mem, and here is the query with that:

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=39691.27..39773.61 rows=823 width=41) (actual time=35077.18..50424.74 rows=144 loops=1)
-> Group (cost=39691.27..39753.03 rows=8234 width=41) (actual time=35051.29..-650049.84 rows=462198 loops=1)
-> Sort (cost=39691.27..39711.86 rows=8234 width=41) (actual time=35051.26..38847.40 rows=462198 loops=1)
Sort Key: c.company_name, ts.company_id
-> Merge Join (cost=39009.92..39155.76 rows=8234 width=41) (actual time=16155.37..25439.42 rows=462198 loops=1)
Merge Cond: ("outer".company_id = "inner".company_id)
-> Sort (cost=24.41..25.29 rows=352 width=25) (actual time=5.85..7.71 rows=348 loops=1)
Sort Key: c.company_id
-> Seq Scan on company c (cost=0.00..9.52 rows=352 width=25) (actual time=0.10..3.07 rows=352 loops=1)
-> Sort (cost=38985.51..39006.10 rows=8234 width=16) (actual time=16149.46..19437.47 rows=462198 loops=1)
Sort Key: ts.company_id
-> Seq Scan on traffic_logs ts (cost=0.00..38450.00 rows=8234 width=16) (actual time=0.16..8869.37 rows=462198 loops=1)
Filter: (date_trunc('month'::text, runtime) = '2003-10-01 00:00:00'::timestamp without time zone)
Total runtime: 50426.80 msec
(14 rows)

And, just on a whim, here it is set to 100M:

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=39691.27..39773.61 rows=823 width=41) (actual time=25888.20..38909.88 rows=144 loops=1)
-> Group (cost=39691.27..39753.03 rows=8234 width=41) (actual time=25862.81..34591.76 rows=462198 loops=1)
-> Sort (cost=39691.27..39711.86 rows=8234 width=41) (actual time=25862.77..723885.95 rows=462198 loops=1)
Sort Key: c.company_name, ts.company_id
-> Merge Join (cost=39009.92..39155.76 rows=8234 width=41) (actual time=12471.23..21855.08 rows=462198 loops=1)
Merge Cond: ("outer".company_id = "inner".company_id)
-> Sort (cost=24.41..25.29 rows=352 width=25) (actual time=5.87..7.74 rows=348 loops=1)
Sort Key: c.company_id
-> Seq Scan on company c (cost=0.00..9.52 rows=352 width=25) (actual time=0.11..3.14 rows=352 loops=1)
-> Sort (cost=38985.51..39006.10 rows=8234 width=16) (actual time=12465.29..14941.24 rows=462198 loops=1)
Sort Key: ts.company_id
-> Seq Scan on traffic_logs ts (cost=0.00..38450.00 rows=8234 width=16) (actual time=0.18..9106.16 rows=462198 loops=1)
Filter: (date_trunc('month'::text, runtime) = '2003-10-01 00:00:00'::timestamp without time zone)
Total runtime: 39077.75 msec
(14 rows)

So, the higher the sort_mem, the more noticeable the improvement ...
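For anyone reproducing this: sort_mem in 7.x is specified in kilobytes and applies per sort operation, so it can also be varied per-session instead of restarting the server each time (a sketch, not what was literally done above):

```sql
-- sort_mem is in KB; this only affects the current session, so different
-- values can be compared without a server restart.
SET sort_mem = 102400;          -- ~100 MB
EXPLAIN ANALYZE SELECT ...;     -- re-run the query under the new setting
```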

And, @ 100M for sort_mem and using the month_trunc index:

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=32089.29..32171.63 rows=823 width=41) (actual time=30822.51..57202.44 rows=144 loops=1)
-> Group (cost=32089.29..32151.04 rows=8234 width=41) (actual time=30784.24..743396.18 rows=462198 loops=1)
-> Sort (cost=32089.29..32109.87 rows=8234 width=41) (actual time=30784.21..36212.96 rows=462198 loops=1)
Sort Key: c.company_name, ts.company_id
-> Merge Join (cost=31407.94..31553.77 rows=8234 width=41) (actual time=11384.79..24918.56 rows=462198 loops=1)
Merge Cond: ("outer".company_id = "inner".company_id)
-> Sort (cost=24.41..25.29 rows=352 width=25) (actual time=5.92..9.55 rows=348 loops=1)
Sort Key: c.company_id
-> Seq Scan on company c (cost=0.00..9.52 rows=352 width=25) (actual time=0.08..3.21 rows=352 loops=1)
-> Sort (cost=31383.53..31404.12 rows=8234 width=16) (actual time=11378.81..15211.07 rows=462198 loops=1)
Sort Key: ts.company_id
-> Index Scan using tl_month on traffic_logs ts (cost=0.00..30848.02 rows=8234 width=16) (actual time=0.46..7055.75 rows=462198 loops=1)
Index Cond: (month_trunc(runtime) = '2003-10-01 00:00:00'::timestamp without time zone)
Total runtime: 57401.72 msec
(14 rows)
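The tl_month index in that last plan is a functional index over a user-defined month_trunc() function; something along these lines would produce it (an assumption; the actual definitions aren't quoted in this thread):

```sql
-- Assumed definitions: PostgreSQL 7.x can only index a function call on a
-- column, and requires the function to be marked IMMUTABLE, so
-- date_trunc('month', ...) is wrapped in a helper first.
CREATE FUNCTION month_trunc(timestamp without time zone)
  RETURNS timestamp without time zone
  AS 'SELECT date_trunc(''month'', $1)'
  LANGUAGE sql IMMUTABLE;

CREATE INDEX tl_month ON traffic_logs (month_trunc(runtime));
```

The payoff shows in the plan: the filter becomes an Index Cond on month_trunc(runtime) instead of a Seq Scan filter, though here the sort above it still dominates the runtime.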
