Re: Bad n_distinct estimation; hacks suggested?

From: Marko Ristola <marko(dot)ristola(at)kolumbus(dot)fi>
To: pgsql-perform <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Bad n_distinct estimation; hacks suggested?
Date: 2005-04-22 18:39:12
Message-ID: 426944D0.6040706@kolumbus.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance


Hi.

Sometimes, if the random number generator, that PostgreSQL uses,
isn't good enough, the randomly selected pages for the statistics
might not be random enough.

Solaris is unknown to me. Maybe the used random number generator there
isn't good enough?

Good statistics depend on good random numbers.

So, for example, if you have one million pages, but the upper bound for
the random
numbers is one hundred thousand pages, the statistics might get tuned.

Or some random number generator has for example only 32000 different values.

Regards,
Marko Ristola

Josh Berkus wrote:

>Tom,
>
>Any thoughts? This is really messing up query execution all across the
>database ...
>
>--Josh
>
>
>
>>Here is the stats = 100 version. Notice that n_distinct has gone down.
>>
>> schemaname | tablename | attname | null_frac | avg_width |
>>n_distinct | most_common_vals
>>
>>| most_common_freqs
>>| histogram_bounds |
>>
>>correlation
>>
>>
>
>
>
>>-------------------+------------- public | web_site_activity_fa |
>>session_id | 0 | 8 | 96107 |
>>{4393922,6049228,6026260,4394034,60341,4393810,2562999,2573850,3006299,4705
>>488,2561499,4705258,3007378,4705490,60327,60352,2560950,2567640,2569852,3006
>>604,4394329,2570739,2406633,2407292,3006356,4393603,4394121,6449083,2565815,
>>4387881,2406770,2407081,2564340,3007328,2406578,2407295,2562813,2567603,4387
>>835,71014,2566253,2566900,6103079,2289424,2407597,2567627,2568333,3457448,23
>>450,23670,60743,70739,2406818,2406852,2407511,2562816,3007446,6306095,60506,
>>71902,591543,1169136,1447077,2285047,2406830,2573964,6222758,61393,70955,709
>>86,71207,71530,262368,2289213,2406899,2567361,2775952,3006824,4387864,623982
>>5,6244853,6422152,1739,58600,179293,278473,488407,1896390,2286976,2407020,25
>>46720,2677019,2984333,3006133,3007497,3310286,3631413,3801909,4366116,438802
>>5}
>>
>>{0.00166667,0.00146667,0.0013,0.0011,0.000933333,0.0009,0.0008,0.0008,0.000
>>733333,0.000733333,0.0007,0.000633333,0.0006,0.0006,0.000566667,0.000566667,
>>0.000566667,0.000566667,0.000566667,0.000566667,0.000566667,0.000533333,0.00
>>05,0.0005,0.0005,0.0005,0.0005,0.0005,0.000466667,0.000466667,0.000433333,0.
>>000433333,0.000433333,0.000433333,0.0004,0.0004,0.0004,0.0004,0.0004,0.00036
>>6667,0.000366667,0.000366667,0.000366667,0.000333333,0.000333333,0.000333333
>>,0.000333333,0.000333333,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.
>>0003,0.0003,0.0003,0.000266667,0.000266667,0.000266667,0.000266667,0.0002666
>>67,0.000266667,0.000266667,0.000266667,0.000266667,0.000233333,0.000233333,0
>>.000233333,0.000233333,0.000233333,0.000233333,0.000233333,0.000233333,0.000
>>233333,0.000233333,0.000233333,0.000233333,0.000233333,0.000233333,0.0002333
>>33,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0
>>002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002}
>>
>>{230,58907,88648,156764,216759,240405,264601,289047,312630,339947,364452,38
>>6486,409427,434075,455140,475759,500086,521530,544703,680376,981066,1313419,
>>1712592,1860151,1882452,1905328,1927504,1948159,1970054,1990408,2014501,2038
>>573,2062786,2087163,2110129,2132196,2155657,2181058,2204976,2228575,2256229,
>>2283897,2352453,2407153,2457716,2542081,2572119,2624133,2699592,2771254,2832
>>224,2908151,2951500,3005088,3032889,3137244,3158685,3179395,3203681,3261587,
>>3304359,3325577,3566688,3621357,3645094,3718667,3740821,3762386,3783169,3804
>>593,3826503,3904589,3931012,3957675,4141934,4265118,4288568,4316898,4365625,
>>4473965,4535752,4559700,4691802,4749478,5977208,6000272,6021416,6045939,6078
>>912,6111900,6145155,6176422,6206627,6238291,6271270,6303067,6334117,6365200,
>>6395250,6424719,6888329}
>>
>>| 0.41744
>>
>>
>
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Held 2005-04-22 18:39:15 Re: Woo hoo ... a whole new set of compiler headaches!!
Previous Message Jim C. Nasby 2005-04-22 18:23:45 Re: Bitmap scans vs. the statistics views

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2005-04-22 18:52:51 Re: Bad n_distinct estimation; hacks suggested?
Previous Message Alvaro Herrera 2005-04-22 18:30:54 Re: Joel's Performance Issues WAS : Opteron vs Xeon