Re: [GENERAL] Increasing statistics results in worse estimates

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Shelby Cain <alyandon(at)yahoo(dot)com>
Cc: pgsql-hackers-win32(at)postgresql(dot)org
Subject: Re: [GENERAL] Increasing statistics results in worse estimates
Date: 2005-05-01 17:03:55
Message-ID: 25129.1114967035@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers-win32

[ redirecting to pgsql-hackers-win32 ]

Shelby Cain <alyandon(at)yahoo(dot)com> writes:
> --- Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> However, there is something absolutely wacko about
>> the stats collection
>> process here ... you've got fairly reasonable
>> looking results for
>> most-common-values of city name at the lower end of
>> the stats settings
>> (HOUSTON and DALLAS are the most common, sounds
>> about right) ... but at
>> the higher settings the ordering of most-common
>> entries just goes nuts.
>> We've got some kind of bug there.

> I had noticed that as well but wasn't sure about the
> whether MCV really meant what I thought it did.

>> It might be easier to debug this if you could send
>> me the test case.

> I had already removed proprietary data to try and
> whittle down the number of columns I needed to
> demonstrate the weirdness so I can host a dump of the
> table. However, before I take that step I should
> mention that this is the native Windows port so if
> that changes anything let me know.

Thanks for sending me the test data. The bad news is that I can't
reproduce any strange behavior here: the stats get marginally more
accurate as the target goes up, just as you'd expect. So it would
seem there is something broken about ANALYZE on Windows. There's
not anything magic about this particular dataset, AFAICS.

Which Windows build are you using, exactly?

Can anyone else reproduce a problem with ANALYZE producing silly
most-common-values stats at higher statistics targets? The original
thread is here:
http://archives.postgresql.org/pgsql-general/2005-04/msg01368.php

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message CSN 2005-05-01 20:13:09 Can't compile plphp
Previous Message Peter Wilson 2005-05-01 10:17:04 DBmirror replication - replacement for DBMirror.pl

Browse pgsql-hackers-win32 by date

  From Date Subject
Next Message Shelby Cain 2005-05-02 03:53:33 Re: [GENERAL] Increasing statistics results in worse estimates
Previous Message Tom Lane 2005-04-29 17:26:04 Re: Increasing statistics results in worse estimates