Re: [Proposal] More Vacuum Statistics

From: Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Naoya Anzai <nao-anzai(at)xc(dot)jp(dot)nec(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Akio Iwaasa <aki-iwaasa(at)vt(dot)jp(dot)nec(dot)com>, "bench(dot)coffee(at)gmail(dot)com" <bench(dot)coffee(at)gmail(dot)com>
Subject: Re: [Proposal] More Vacuum Statistics
Date: 2015-05-30 02:30:57
Message-ID: 556920E1.2050508@BlueTreble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5/28/15 9:14 AM, Tom Lane wrote:
> Naoya Anzai <nao-anzai(at)xc(dot)jp(dot)nec(dot)com> writes:
>> In my much experience up until now,I have an idea that we can add
>> 2 new vacuum statistics into pg_stat_xxx_tables.
>
> Adding new stats in that way requires adding per-table counters, which
> bloat the statistics files (especially in database with very many tables).
> I do not think you've made a case for these stats being valuable enough
> to justify such overhead for everybody.

It occurs to me that there's no good reason for vacuum-derived stats to
be in the stats file; it's not like users run vacuum anywhere near as
often as other commands. It's stats could be kept in pg_class; we're
already keeping things like relallvisible there.

> As far as the first one goes, I don't even think it's especially useful.
> There might be value in tracking the times of the last few vacuums on a
> table, but knowing the time for only the latest one doesn't sound like it
> would prove much. So I'd be inclined to think more along the lines of
> scanning the postmaster log for autovacuum runtimes, instead of squeezing
> it into the pg_stats views.

You'd also want to know how many pages were scanned, since any decent
estimation would need to take table size into account.

As for history, that's a problem that exists for *all* our statistics,
so anyone that cares about that is going to setup some system to
periodically capture the contents of pg_stat_*.

> A possible alternative so far as the second one goes is to add a function
> (perhaps in contrib/pg_freespacemap) that simply runs through a table's
> VM and counts the number of set bits. This would be more accurate (no
> risk of lost counter updates) and very possibly cheaper overall: it would
> take longer to find out the number when you wanted it, but you wouldn't be
> paying the distributed overhead of tracking it when you didn't want it.

Seems like a reasonable addition to that contrib module regardless. As
Jeff Janes mentioned this info is available in pg_class, but it requires
an ANALYZE to update it.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Data in Trouble? Get it in Treble! http://BlueTreble.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-05-30 02:41:25 Re: [Proposal] More Vacuum Statistics
Previous Message Tomas Vondra 2015-05-30 01:52:09 Re: nested loop semijoin estimates