From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> |
Cc: | Shridhar Daithankar <shridhar_daithankar(at)myrealbox(dot)com>, pgsql-performance(at)postgresql(dot)org, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] More detail on settings for pgavd? |
Date: | 2003-11-21 17:09:00 |
Message-ID: | 200311210909.00978.josh@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
Matthew,
> As long as pg_autovacuum remains a contrib module, I don't think any
> changes to the system catelogs will be make. If pg_autovacuum is
> deemed ready to move out of contrib, then we can talk about the above.
But we could create a config file that would store stuff in a flatfile table,
OR we could add our own "system table" that would be created when one
"initializes" pg_avd.
Just an idea. Mind you, I'm not so sure that we want to focus immediately on
per-table settings. I think that we want to get the "automatic" settings
working fairly well first; a lot of new DBAs would use the per-table settings
to shoot themselves in the foot. So we need to be able to make a strong
recommendation to "try the automatic settings first."
> Are you saying that you can vacuum a 1 million row table in 2-4
> minutes? While a vacuum of the same table with an additional 1 million
> dead tuples would take an hour?
I'm probably exaggerating. I do know that I can vacuum a fairly clean 1-5
million row table in less than 4 mintues. I've never let such a table get
to 50% dead tuples, so I don't really know how long that takes. Call me a
coward if you like ...
> >I'd be really reluctant to base pv-avd frequency on the fsm settings
> > instead. What if the user loads 8GB of data but leaves fsm_pages at the
> > default of 10,000? You can't do much with that; you'd have to vacuum if
> > even 1% of the data changed.
>
> Ok, but as you said above it's very easy to set the FSM once you know
> your db size.
Actually, thinking about this I realize that PG_AVD and the Perl-based
postgresql.conf configuration script I was working on (darn, who was doing
that with me?) need to go togther. With pg_avd, setting max_fsm_pages is
very easy; without it its a bit of guesswork.
So I think we can do this: for 'auto' settings:
If max_fsm_pages is between 13% and 100% of the total database pages, then set
the vacuum scale factor to match 3/4 of the fsm_pages setting, e.g.
database = 18,000,000 data pages;
max_fsm_pages = 3,600,000;
set vacuum scale factor = 3.6mil/18mil * 3/4 = 0.15
If max_fsm_pages is less than 13% of database pages, issue a warning to the
user (log it, if possible) and set scale factor to 0.1. If it's greater
than 100% set it to 1 and leave it alone.
> I don't have a good plan as to how to incorporate
> this data, but to a large extent the FSM already tracks table activity
> and gives us the most accurate answer about storage growth (short of
> using something like contrib/pgstattuple which takes nearly the same
> amount of time as an actual vacuum)
I don't really think we need to do dynamic monitoring at this point. It
would be a lot of engineering to check data page pollution without having
significant performance impact. It's doable, but something I think we
should hold off until version 3. It would mean hacking the FSM, which is a
little beyond me right now.
> In my testing, I never changed the default statistics settings.
Ah. Well, a lot of users do to resolve query problems.
> But we track tuples because we can compare against the count given by
> the stats system. I don't know of a way (other than looking at the FSM,
> or contrib/pgstattuple ) to see how many dead pages exist.
No, but for scaling you don't need the dynamic count of tuples or of dead
tuples; pg_class holds a reasonable accurate count of pages per table as of
last vacuum.
--
Josh Berkus
Aglio Database Solutions
San Francisco
From | Date | Subject | |
---|---|---|---|
Next Message | Kurt Roeckx | 2003-11-21 17:25:51 | Re: 7.4 logging bug. |
Previous Message | Andreas Pflug | 2003-11-21 16:46:55 | Re: logical column position |
From | Date | Subject | |
---|---|---|---|
Next Message | Josh Berkus | 2003-11-21 21:23:18 | Re: [HACKERS] More detail on settings for pgavd? |
Previous Message | Matthew T. O'Connor | 2003-11-21 15:17:31 | Re: [HACKERS] More detail on settings for pgavd? |