From: | Tomas Vondra <tv(at)fuzzy(dot)cz> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: proposal : cross-column stats |
Date: | 2010-12-24 13:06:38 |
Message-ID: | 4D149ADE.9010303@fuzzy.cz |
Lists: | pgsql-hackers |
On 24.12.2010 13:15, tv(at)fuzzy(dot)cz wrote:
>> 2010/12/24 Florian Pflug <fgp(at)phlo(dot)org>:
>>
>>> On Dec23, 2010, at 20:39 , Tomas Vondra wrote:
>>>
>>>> I guess we could use the highest possible value (equal to the number
>>>> of tuples) - according to wiki you need about 10 bits per element
>>>> with 1% error, i.e. about 10MB of memory for each million of
>>>> elements.
>>>
>>> Drat. I had expected these numbers to come out quite a bit lower than
>>> that, at least for a higher error target. But even with 10% false
>>> positive rate, it's still 4.5MB per 1e6 elements. Still too much to
>>> assume the filter will always fit into memory, I fear :-(
>>
>> I have the impression that both of you are forgetting that there are 8
>> bits in a byte. 10 bits per element = 1.25MB per million elements.
>
> We are aware of that, but we really needed to do some very rough estimates
> and it's much easier to do the calculations with 10. Actually according to
> Wikipedia it's not 10 bits per element but 9.6, etc. But it really does not
> matter if there is 10MB or 20MB of data, it's still a lot of data ...
Oops, now I see what the problem is. I thought you were pointing out
something else, but I had actually used 1B = 1b (treating bytes as bits,
which is obviously wrong). Florian already noticed that and fixed the
estimates.
Tomas
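
The bits-versus-bytes arithmetic discussed above can be checked with a short sketch. It assumes the standard sizing formula for an optimally configured Bloom filter, bits per element = -ln(p) / (ln 2)^2, where p is the false-positive rate. The function names are hypothetical and are not taken from any patch in this thread.

```python
import math


def bloom_bits_per_element(false_positive_rate):
    """Bits per element for an optimally sized Bloom filter (assumed formula)."""
    return -math.log(false_positive_rate) / (math.log(2) ** 2)


def bloom_megabytes(n_elements, false_positive_rate):
    """Approximate filter size in MB (1 MB = 10**6 bytes, 8 bits per byte)."""
    total_bits = n_elements * bloom_bits_per_element(false_positive_rate)
    return total_bits / 8 / 1e6


if __name__ == "__main__":
    # Rates mentioned in the thread: 1% and 10% false positives.
    for rate in (0.01, 0.10):
        bits = bloom_bits_per_element(rate)
        size_mb = bloom_megabytes(1_000_000, rate)
        print(f"p={rate:.2f}: {bits:.2f} bits/element, "
              f"{size_mb:.2f} MB per million elements")
```

Under that assumption, a 1% error rate needs roughly 9.6 bits per element, which is on the order of 1.2MB per million elements rather than 10MB.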