From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Jan Urbański <j(dot)urbanski(at)students(dot)mimuw(dot)edu(dot)pl> |
Cc: | pgsql-hackers(at)postgresql(dot)org, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru> |
Subject: | Re: Stats target increase vs compute_tsvector_stats() |
Date: | 2008-12-15 15:01:48 |
Message-ID: | 29737.1229353308@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Jan Urbański <j(dot)urbanski(at)students(dot)mimuw(dot)edu(dot)pl> writes:
> Tom Lane wrote:
>> I came across this bit in ts_typanalyze.c:
>>
>> /* We want statistic_target * 100 lexemes in the MCELEM array */
>> num_mcelem = stats->attr->attstattarget * 100;
>>
>> I wonder whether the multiplier here should be changed?
> The origin of that bit is this post:
> http://archives.postgresql.org/pgsql-hackers/2008-07/msg00556.php
> and the following few downthread ones.
> If we bump the default statistics target 10 times, then changing the
> multiplier to 10 seems the right thing to do.
OK, will do.
> Only thing that needs
> caution is the frequency of pruning we do in the Lossy Counting
> algorithm, that IIRC is correlated with the desired target length of the
> MCELEM array.
Right below that we have
    /*
     * We set bucket width equal to the target number of result lexemes.
     * This is probably about right but perhaps might need to be scaled
     * up or down a bit?
     */
    bucket_width = num_mcelem;
so it should track automatically. AFAICS the argument in the above
thread that this is an appropriate pruning distance holds good
regardless of just how we obtain the target mcelem count.
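
As a rough illustration of why the pruning distance tracks automatically, here is a minimal sketch of the arithmetic. The helper names (target_mcelem, bucket_width_for, prune_passes) are hypothetical and do not appear in ts_typanalyze.c; the sketch also assumes that pruning happens once per bucket_width lexemes, and that the default statistics target goes from 10 to 100.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the quoted line
 *     num_mcelem = stats->attr->attstattarget * <multiplier>;
 */
static int
target_mcelem(int stats_target, int multiplier)
{
    return stats_target * multiplier;
}

/* Mirrors "bucket_width = num_mcelem" from the quoted comment. */
static int
bucket_width_for(int num_mcelem)
{
    return num_mcelem;
}

/*
 * Assumption for illustration: the tracking table is pruned each time a
 * bucket boundary is reached, i.e. once every bucket_width lexemes, so
 * this returns the number of prune passes over a stream of lexemes.
 */
static long
prune_passes(long lexemes_seen, int bucket_width)
{
    assert(bucket_width > 0);
    return lexemes_seen / bucket_width;
}
```

Under those assumptions, target_mcelem(10, 100) and target_mcelem(100, 10) both give 1000, so the bucket width and hence the pruning frequency stay where they were. Leaving the multiplier at 100 with a target of 100 would give 10000, and prune_passes() over the same input would drop by a factor of ten.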
> BTW: I've been occupied with other things and might have missed some
> discussions, but at some point it has been considered to use Lossy
> Counting to gather statistics from regular columns, not only tsvectors.
> Wouldn't this help the performance hit ANALYZE takes from upping
> default_stats_target?
Perhaps, but it's not likely to get done for 8.4 ...
regards, tom lane