From: | Joe Conway <mail(at)joeconway(dot)com> |
---|---|
To: | "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: pg_autovacuum next steps |
Date: | 2004-03-22 23:27:45 |
Message-ID: | 405F7671.7070705@joeconway.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Matthew T. O'Connor wrote:
> * Inability to customize thresholds on a per table basis
I ran headlong into this one. IMHO fixing this is critical.
> * Inability to set default thresholds on a per database basis
> * Inability to exclude specific databases / tables from pg_autovacuum
> monitoring
These would be nice to have, but less critical than #1 I think.
> * Inability to schedule vacuums during off-peak times
This would be *really* nice to have. In my recent case, if pg_autovacuum
could work for say 3 minutes, and then back off for 2 minutes or so
while the batch transactions hit, it would be ideal.
> I'm not sure how to address all of these concerns, or that they all
> should be addressed right now. One of my big questions is backend
> integration. I am leaning towards leaving pg_autovacuum as a client
> application in contrib for one more release. During this time, I can
> continue to tweak and improve pg_autovacuum so that we will have a very
> good idea what the final product should be before we make it a standard
> backend process.
I really think pg_autovacuum ought to get folded into the backend now,
for 7.5. I haven't had time yet to read the entire thread, but I saw
others making the same comment. It would make some of the listed
problems go away, or at least become far easier to deal with.
> For PostgreSQL 7.5, I plan to implement these new features:
>
> 1.Per database defaults and per table thresholds (including total
> exclusion)
Great!
> 2.Persistent data
> 3.Single-Pass Mode (external scheduling from cron etc...)
> 4.Off peak scheduling
Great again!
> 1. Per Database defaults and Per table Thresholds:
>
> 1.Store config data inside a special pg_autovacuum table inside
> existing databases that wants custom settings.
A natural if folded into the backend.
> 3.Single-Pass Mode (External Scheduling):
>
> I have received requests to be able to run pg_autovacuum only on request
> (not as a daemon) making only one pass over all the tables (not looping
> indefinately). The advantage being that it will operate more like the
> current vacuum command except that it will only vacuum tables that need
> to be vacuumed. This feature could be useful as long as pg_autovacuum
> exists outside the backend. If pg_autovacuum gets integrated into the
> backend and gets automatically started as a daemon during startup, then
> this option will no longer make sense.
It still might make sense. You could have a mode where the daemon
essentially sleeps forever, until explicitly woken up by a signal. When
woken, it makes one pass, and goes back to infinite sleep. Then provide
a simple way to signal the autovacuum process -- maybe an extension of
the current VACUUM syntax.
> 4.Off-Peak Scheduling:
>
> A fundamental advantage of our vacuum system is that the work required
> to reclaim table space is taken out of the critical path and can be
> moved to and off-peak time when cycles are less precious. One of the
> drawbacks of the current pg_autovacuum is that it doesn't have any way
> to factor this in.
>
> In it's simplest form (which I will implement first) I would add the
> ability to add a second set of thresholds that will be active only
> during an “off-peak” time that can be specified in the pg_autovacuum
> database, perhaps in a general_settings table.
I don't know how this would work, but it is for sure important. In the
recent testing I found that pg_autovacuum (well, lazy vacuum in general,
but I was using pg_autovacuum to control it) made a huge difference in
performance of batch transactions. They range from 4-5 seconds without
vacuum running, to as high as 15 minutes with vacuum running. With the
vacuum delay patch, delay = 1, pagecount = 8, I still saw times go as
high as 10 minutes. Backing vacuum off any more than that caused it to
fall behind the transaction rate unrecoverably. But as I said above, if
the transactions could complete without vacuum running in 4-5 seconds,
then vacuuming resumes for the 3-to-4 minutes between batches, all would
be well.
Joe
From | Date | Subject | |
---|---|---|---|
Next Message | Matthew T. O'Connor | 2004-03-22 23:46:14 | Re: pg_autovacuum next steps |
Previous Message | Bernd Helmle | 2004-03-22 23:15:35 | Re: Thoughts about updateable views |