From: | Jan Urbański <wulczer(at)wulczer(dot)org> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Jesper Krogh <jesper(at)krogh(dot)cc>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: tsvector pg_stats seems quite a bit off. |
Date: | 2010-05-28 08:28:27 |
Message-ID: | 4BFF7EAB.6040706@wulczer.org |
Lists: | pgsql-hackers |
On 28/05/10 04:47, Tom Lane wrote:
> Jan Urbański <wulczer(at)wulczer(dot)org> writes:
>> On 19/05/10 21:01, Jesper Krogh wrote:
>>> In practice, just cranking the statistics estimate up high enough seems
>>> to solve the problem, but doesn't
>>> there seem to be something wrong in how the statistics are collected?
>
>> The algorithm to determine most common vals does not do it accurately.
>> That would require keeping all lexemes from the analysed tsvectors in
>> memory, which would be impractical. If you want to learn more about the
>> algorithm being used, try reading
>> http://www.vldb.org/conf/2002/S10P03.pdf and corresponding comments in
>> ts_typanalyze.c
>
> I re-scanned that paper and realized that there is indeed something
> wrong with the way we are doing it.
> So I think we have to fix this.
Hm, I'll try to take another look this evening (CEST).

Cheers,
Jan