From: | Mike Charnoky <noky(at)nextbus(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: intermittant performance problem |
Date: | 2009-03-10 02:21:25 |
Message-ID: | 49B5CEA5.9010306@nextbus.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
The random sampling query is normally pretty snappy. It usually takes
on the order of 1 second to sample a few thousand rows of data out of a
few million. The sampling is consistently quick, too. However, on some
days, the sampling starts off quick, then when the process starts
sampling from a different subset of data (different range of times for
the same day), the sampling query takes a couple minutes.
Regarding the concurrent vacuuming, this is definitely not happening. I
always check pg_stat_activity whenever the sampling process starts to
lag behind. I have never seen a vacuum running during this time.
Interesting idea to issue the EXPLAIN first... I will see if I can
instrument the sampling program to do this.
Thanks for your help Tom.
Mike
Tom Lane wrote:
> Mike Charnoky <noky(at)nextbus(dot)com> writes:
>> The sampling query which runs really slow on some days looks something
>> like this:
>
>> INSERT INTO sampled_data
>> (item_name, timestmp, ... )
>> SELECT item_name, timestmp, ... )
>> FROM raw_data
>> WHERE timestmp >= ? and timestmp < ?
>> AND item_name=?
>> AND some_data_field NOTNULL
>> ORDER BY random()
>> LIMIT ?;
>
> Hmph, I'd expect that that would run pretty slowly *all* the time :-(.
> There's no good way to optimize "ORDER BY random()". However, it seems
> like the first thing you should do is modify the program so that it
> issues an EXPLAIN for that right before actually doing the query, and
> then you could see if the plan is different on the slow days.
>
>> We have done a great deal of PG tuning, including the autovacuum for the
>> "raw_data" table. Autovacuum kicks like clockwork every day on that
>> table after the sampling process finishes (after one day's worth of data
>> is deleted from "raw_data" table, a roughly 7% change in size).
>
> Also, are you sure you have ruled out the possibility that the problem
> comes from autovac kicking in *while* the update is running?
>
> regards, tom lane
>
From | Date | Subject | |
---|---|---|---|
Next Message | Craig Ringer | 2009-03-10 02:37:40 | Re: C++ User-defined functions |
Previous Message | Mike Charnoky | 2009-03-10 02:10:53 | Re: intermittant performance problem |