Re: Autovacuum in the backend

From: "Thomas F(dot) O'Connell" <tfo(at)sitening(dot)com>
To: Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
Cc: Alvaro Herrera <alvherre(at)surnet(dot)cl>, "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Autovacuum in the backend
Date: 2005-06-17 04:55:34
Message-ID: CD96413A-E1FF-465A-B532-0C935DCE8D47@sitening.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Gavin,

For the record, I don't consider myself a PostgreSQL newbie, nor do I
manage any 2 TB databases (much less tables), but I do have an
unusual production use case: thousands (> 10,000) of tables, many of
them inherited, and many of them with hundreds of thousands (a few
with millions) of rows.

Honestly, creating crontab vacuum management for this scenario would
be a nightmare, and pg_autovacuum has been a godsend. Considering the
recent revelations of O(n^2) iterations over table lists in the
current versions and the stated and apparent ease with which this
problem could be solved by integrating the basic functionality of
pg_autovacuum into the backend, I can personally attest to there
being real-world use cases that would benefit tremendously from
integrated autovacuum.

A few months ago, I attempted to solve the wrong problem by
converting a hardcoded threshold into another command-line option. If
I had spotted the O(n^2) problem, I might've spent the time working
on it then instead of the new command-line option. I suppose it's
possible that I'll head down this road anyway if it looks like
integrated pg_autovacuum is going to be put on hold indefinitely
after this discussion.

Anyway, just wanted to throw out some food for thought for the
practicality of a tool like pg_autovacuum.

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source: Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Jun 16, 2005, at 5:22 PM, Gavin Sherry wrote:

> On Thu, 16 Jun 2005, Alvaro Herrera wrote:
>
>
>> On Thu, Jun 16, 2005 at 04:20:34PM +1000, Gavin Sherry wrote:
>>
>>
>>> 2) By no fault of its own, autovacuum's level of granularity is
>>> the table
>>> level. For people dealing with non-trivial amounts of data (and
>>> we're not
>>> talking gigabytes or terabytes here), this is a serious drawback.
>>> Vacuum
>>> at peak times can cause very intense IO bursts -- even with the
>>> enhancements in 8.0. I don't think the solution to the problem is
>>> to give
>>> users the impression that it is solved and then vacuum their
>>> tables during
>>> peak periods. I cannot stress this enough.
>>>
>>
>> People running systems with petabyte-sized tables can disable
>> autovacuum
>> for those tables, and leave it running for the rest. Then they can
>> schedule whatever maintenance they see fit on their gigantic tables.
>> Trying to run a database with more than a dozen gigabytes of data
>> without expert advice (or at least reading the manual) would be
>> extremely stupid anyway.
>>
>
> As I've said a few times, I'm not concerned about such users. I'm
> concerned about users with some busy tables of a few hundred
> megabytes. I
> still don't think VACUUM at arbitary times on such tables is suitable.
>
> Thanks,
>
> Gavin

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Joshua D. Drake 2005-06-17 07:00:59 Re: Autovacuum in the backend
Previous Message Greg Stark 2005-06-17 03:15:51 Re: Advice on structure /sequence / trigger

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Glaesemann 2005-06-17 05:09:50 Re: [PATCHES] Escape handling in strings
Previous Message Bruce Momjian 2005-06-17 04:22:41 Re: [PATCHES] Escape handling in strings