From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Greg Stark <stark(at)mit(dot)edu> |
Cc: | Jim Nasby <jim(at)nasby(dot)net>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: auto_explain WAS: RFC: Timing Events |
Date: | 2013-02-26 17:19:17 |
Message-ID: | CA+TgmoYE8_VGV2GC41ZHxkupmHcOO3X6F+haEQZ0uZFn_4Nfig@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Feb 25, 2013 at 10:22 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Mon, Feb 25, 2013 at 8:26 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Sun, Feb 24, 2013 at 7:27 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
>>> We actually do that in our application and have discovered that random
>>> sampling can end up significantly skewing your data.
>>
>> /me blinks.
>>
>> How so?
>
> Sampling is a pretty big area of statistics. There are dozens of
> sampling methods to deal with various problems that occur with
> different types of data distributions.
>
> One problem is if you have some very rare events then random sampling
> can produce odd results since those rare events will drop out entirely
> unless your sample is very large whereas less rare events are
> represented proportionally. There are sampling methods that ensure
> that x% of the rare events are included even if those rare events are
> less than x% of your total data set. One of those might be appropriate
> to use for profiling data when you're looking for rare slow queries
> amongst many faster queries.
I'll grant all that, but it still seems to me that logging x% of all
queries, plus every query running longer than y milliseconds, would
cover most of the interesting cases.
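
To make that rule concrete, here is a minimal sketch in Python of the decision it implies: log a query if it exceeds a duration threshold, otherwise log it with a fixed probability. The names `should_log`, `select_for_logging`, `sample_rate`, and `threshold_ms` are hypothetical and chosen for illustration only; this is not auto_explain's code or configuration.

```python
import random


def should_log(duration_ms, sample_rate, threshold_ms, rng=random):
    """Decide whether one finished query should be logged.

    Hypothetical helper: slow queries are always kept, everything
    else is kept with probability `sample_rate` (0.0 to 1.0).
    """
    if duration_ms > threshold_ms:
        return True  # rare slow queries never drop out of the sample
    return rng.random() < sample_rate


def select_for_logging(durations_ms, sample_rate, threshold_ms, seed=0):
    """Apply the rule to a list of durations, with a seeded generator
    so that repeated runs pick the same queries."""
    rng = random.Random(seed)
    return [d for d in durations_ms
            if should_log(d, sample_rate, threshold_ms, rng)]


if __name__ == "__main__":
    # Made-up workload: many fast queries plus a few rare slow ones.
    workload = [2.0] * 10000 + [5000.0] * 3
    kept = select_for_logging(workload, sample_rate=0.01, threshold_ms=1000.0)
    slow_kept = sum(1 for d in kept if d > 1000.0)
    print("kept", len(kept), "queries, of which", slow_kept, "were slow")
```

Under this rule all three slow queries in the made-up workload are retained regardless of the sampling fraction, which is the property the rare-event objection is concerned with. It does not, however, preserve rare events that are fast.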
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company