From: | "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net> |
---|---|
To: | Hackers <pgsql-hackers(at)postgresql(dot)org>, "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Gregory Stark <stark(at)enterprisedb(dot)com> |
Subject: | Re: autovacuum next steps, take 2 |
Date: | 2007-02-21 22:40:53 |
Message-ID: | 45DCCA75.7050908@zeut.net |
Lists: | pgsql-hackers |
Alvaro Herrera wrote:
> Ok, scratch that :-) Another round of braindumping below.
I still think this is a solution in search of a problem. The main problem
we have right now is that hot tables can be starved from vacuum. Most
of this proposal doesn't touch that. I would like to see that problem
solved first, then we can talk about adding multiple workers per
database or per tablespace etc...
> (This idea can be complemented by having another GUC var,
> autovacuum_hot_workers, which allows the DBA to have more than one
> worker on hot tables (just for the case where there are too many hot
> tables). This may be overkill.)
I think this is more along the lines of what we need first.
> Ron Mayer expressed the thought that we're complicating needlessly the
> UI for vacuum_delay, naptime, etc. He proposes that instead of having
> cost_delay etc, we have a mbytes_per_second parameter of some sort.
> This strikes me a good idea, but I think we could make that after this
> proposal is implemented. So this "take 2" could be implemented, and
> then we could switch the cost_delay stuff to using a MB/s kind of
> measurement somehow (he says waving his hands wildly).
Agreed, this is probably a good idea in the long run, and I also agree it is
lower on the priority list and should come next.
> Greg Stark and Matthew O'Connor say that we're misdirected in having
> more than one worker per tablespace. I say we're not :-) If we
> consider Ron Mayer's idea of measuring MB/s, but we do it per
> tablespace, then we would inflict the correct amount of vacuum pain to
> each tablespace, sleeping as appropriate. I think this would require
> workers of different databases to communicate what tablespaces they are
> using, so that all of them can utilize the correct amount of bandwidth.
I agree that in the long run it might be better to have multiple workers
that are tablespace-aware and throttled in MB/s, but we don't have any of that
infrastructure right now. I think the piece of low-hanging fruit that
your launcher concept can solve is the hot table starvation.
My Proposal: If we require admins to identify hot tables, then:
1) Launcher fires off worker1 into database X.
2) worker1 deals with "hot" tables first, then regular tables.
3) Launcher continues to launch workers to DB X every autovac naptime.
4) worker2 (or 3 or 4 etc...) checks whether it is alone in DB X; if so, it acts as
worker1 did above. If worker1 is still working in DB X, then worker2
looks for hot tables that are being starved because worker1 got busy.
If worker2 finds no hot tables that need work, then worker2 exits.
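To make steps 1-4 concrete, here is a rough sketch of the decision each newly launched worker would make. It is written in Python purely for illustration, not as backend C, and `Table`, `plan_worker`, and the `hot` flag are hypothetical names, not existing autovacuum code. It assumes the admin has marked hot tables and that a worker can tell whether another worker is already active in the same database.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    hot: bool = False             # assumption: admin-supplied "hot" marker
    needs_vacuum: bool = True     # assumption: result of the usual threshold check
    being_vacuumed: bool = False  # another worker is currently on this table


def plan_worker(tables, other_worker_active):
    """Return the ordered list of table names a new worker would process.

    An empty list means the worker has nothing to do and simply exits.
    """
    pending = [t for t in tables if t.needs_vacuum and not t.being_vacuumed]
    hot = [t.name for t in pending if t.hot]
    if other_worker_active:
        # Step 4: someone else is already in this database, so only
        # rescue hot tables that are being starved, then exit.
        return hot
    # Steps 1-2: alone in the database, so hot tables first, then the rest.
    regular = [t.name for t in pending if not t.hot]
    return hot + regular


# worker1 arrives alone and takes everything, hot tables first.
tables = [Table("orders"), Table("sessions", hot=True), Table("archive")]
worker1_plan = plan_worker(tables, other_worker_active=False)

# Later, worker1 is stuck on "orders" and "sessions" needs vacuuming again;
# worker2 picks up only the starved hot table.
tables[0].being_vacuumed = True
worker2_plan = plan_worker(tables, other_worker_active=True)
```

The point of the sketch is that the only new state needed beyond the launcher is the hot-table marker and a way to see whether another worker is in the database.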
This seems a very simple solution (given your launcher work) that can
solve the starvation problem.
Thoughts?