From: | Gregory Stark <stark(at)enterprisedb(dot)com> |
---|---|
To: | "Hakan Kocaman" <hkocam(at)googlemail(dot)com> |
Cc: | "Josh Berkus" <josh(at)agliodbs(dot)com>, "Joshua D(dot) Drake" <jd(at)commandprompt(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, "Peter Eisentraut" <peter_e(at)gmx(dot)net>, "Greg Smith" <gsmith(at)gregsmith(dot)com>, "Andreas Pflug" <pgadmin(at)pse-consulting(dot)de>, "Decibel!" <decibel(at)decibel(dot)org> |
Subject: | Re: Overhauling GUCS |
Date: | 2008-06-09 23:01:45 |
Message-ID: | 87ve0ius9y.fsf@oxford.xeocode.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
"Hakan Kocaman" <hkocam(at)googlemail(dot)com> writes:
> On 6/9/08, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
>>
>> n_distinct. For that Josh is right, we *would* need a sample size
>> proportional to the whole data set which would practically require us to
>> scan the whole table (and have a technique for summarizing the results in a
>> nearly constant sized data structure).
>
> is this (summarizing results in a constant sized data structure) something
> which could be achived by Bloom-Filters ?
Uhm, it would be a bit of a strange application of them but actually it seems
to me that would be a possible approach. It would need a formula for
estimating the number of distinct values given the number of bits set in the
bloom filter. That should be a tractable combinatorics problem (in fact it's
pretty similar to the combinatorics I posted a while back about getting all
the drives in a raid array busy). And if you have a dynamic structure where
the filter size grows then it would overestimate because extra copied bits
would be set.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2008-06-09 23:49:21 | Re: Overhauling GUCS |
Previous Message | Hakan Kocaman | 2008-06-09 21:17:08 | Re: Overhauling GUCS |