Re: [ADMIN] Column missing from pg_statistics

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kadri Raudsepp <kadri(dot)raudsepp(at)nordicgaming(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [ADMIN] Column missing from pg_statistics
Date: 2014-01-10 16:15:14
Message-ID: 12480.1389370514@sss.pgh.pa.us
Lists: pgsql-admin pgsql-hackers

Kadri Raudsepp <kadri(dot)raudsepp(at)nordicgaming(dot)com> writes:
> I have set up a cron-job that sends me daily reports on bloat amount in
> tables and indices, which I calculate using pg_stats, not pgstattuple, for
> performance and I/O reasons. If the bloat amount or percentage is big
> enough, I use pg_repack to get rid of it. At some point I noticed that
> some tables keep showing up in the reports with the same amount of bloat,
> which pg_repack was seemingly unable to remove. Investigation showed that
> pgstattuple gave very different results than my bloat-finding query.
> Reason - for some tables there are some columns that never show up in
> pg_statistics.

Hmm. Eyeballing the ANALYZE code, I note that it will decide that it
hasn't got any valid statistics for a column if (1) it finds no NULL
values and (2) every single sampled value in the column is too wide
(more than WIDTH_THRESHOLD = 1024 bytes wide). Does this describe your
problematic column?
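
To make that condition concrete, here is a rough Python model of the decision.
It is only a sketch, not the actual ANALYZE code (which is C): the function
name has_valid_stats and the representation of a sample as a list of bytes
objects, with None standing in for NULL, are assumptions made for illustration.
Only the 1024-byte threshold comes from the description above.

```python
WIDTH_THRESHOLD = 1024  # bytes; the threshold quoted above


def has_valid_stats(sample, width_threshold=WIDTH_THRESHOLD):
    """Model of the rule described above (hypothetical helper, not PostgreSQL code).

    `sample` is a list of sampled column values: bytes for a non-NULL
    value, None for NULL.
    """
    null_cnt = sum(1 for v in sample if v is None)
    # Values wider than the threshold are skipped rather than analyzed.
    usable_cnt = sum(
        1 for v in sample if v is not None and len(v) <= width_threshold
    )
    # Statistics are produced only if at least one usable value
    # or at least one NULL was seen in the sample.
    return usable_cnt > 0 or null_cnt > 0
```

Under this model, a sample made up entirely of non-NULL values wider than
1024 bytes yields no statistics, while a single NULL or a single narrower
value in the sample is enough to get a row.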

It seems like the code is being too conservative here --- it could at
least generate valid values for stanullfrac and stawidth. I'm inclined
to think maybe it should also set stadistinct = -1 ("unique") in this
case, since the basic assumption that validates ignoring very wide
values is that they aren't duplicates.
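
A minimal sketch of what that fallback might compute, again in Python rather
than the real C code. The function name fallback_stats, the dict layout, and
the bytes-or-None sample representation are hypothetical. The three field
names are the ones mentioned above, and a stadistinct of -1 is the catalog's
way of saying the number of distinct values equals the number of rows.

```python
WIDTH_THRESHOLD = 1024  # bytes; the threshold quoted above


def fallback_stats(sample, width_threshold=WIDTH_THRESHOLD):
    """Stats for the all-too-wide, no-NULLs case (illustrative only).

    `sample` is a list of sampled values: bytes for non-NULL, None for NULL.
    Returns None when the case discussed above does not apply.
    """
    if not sample:
        return None
    if any(v is None for v in sample):
        return None  # NULLs present: not the case discussed here
    if any(len(v) <= width_threshold for v in sample):
        return None  # some value is narrow enough to analyze normally
    return {
        "stanullfrac": 0.0,  # no NULLs were sampled
        "stawidth": sum(len(v) for v in sample) / len(sample),  # average width
        "stadistinct": -1.0,  # assume very wide values are never duplicates
    }
```

For a sample of two values of 2000 and 4000 bytes, this gives an average
width of 3000 bytes, a null fraction of zero, and stadistinct = -1.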

regards, tom lane
