Quick Links

Re: For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER

From:	Bill Moran <wmoran(at)potentialtech(dot)com>
To:	"Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER
Date:	2007-09-18 13:40:43
Message-ID:	20070918094043.6922c3cf.wmoran@potentialtech.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

In response to "Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>:

> > If you find that reindexing improves performance, then you should
> > investigate further. Depending on the exact nature of the problem,
> > there are many possible solutions, three that come to mind:
> > * Add RAM/SHM
>
> Can I add SHM with merely by managing the entry in sysctl.conf? My
> current values:
>
> kernel.shmmax = 536870912
> kernel.shmall = 536870912

These values define the max allowed. They exist to keep poorly written
applications from sucking up all the available memory. Setting them
higher than is needed does not cause any problems, unless a greedy or
poorly-written application grabs all that memory.

> My "shared_buffers" in postgresql.conf is "20000". From the website
> http://www.desknow.com/kb/idx/12/061/article/ I notice that shmmax
> should be sharedbuffer*8192, so I suppose my shmmax can be much lower
> than the above, but I raised it for performance. Am I wrong to do so?

It's completely impossible to tell without knowing more about your
physical hardware. The rule of thumb is 1/3 physical RAM to start, then
adjust if more or less seems to help. That advice is for versions of
PG >= 8. If you're still running a 7.X version, upgrade.

How much RAM does this system have in it? Unless you have other
applications running on this system using RAM, you should allocate
more of it to shared_buffers. If 160M is 1/3 your RAM, you probably
need to add more RAM.

How big is your database? If it's possible to fit it all in
shared_buffers, that will give you the best performance.

> > * REINDEX on a regular schedule
>
> This is sadly not really feasible, because we need to offer a 100%
> availability website. REINDEX does not work concurrently so it is not
> really an option for us. My max_fsm_pages and max_fsm_relations are
> way above the numbers that come up after the VACUUM ANALYZE VERBOSE
> run.

Hence my comment about "depending on your workload" and "investigating
the situation" to determine the best solution.

> But still, the autovacuum stuff seems like it is not working at all.
> Some related entries in the conf file:
>
> autovacuum = on
> autovacuum_vacuum_cost_delay = 20
> vacuum_cost_delay = 20
> autovacuum_naptime = 30
> stats_start_collector = on
> stats_row_level = on
> autovacuum_vacuum_threshold = 80
> autovacuum_analyze_threshold = 80
>
> And yet, the db often slows down, at which point I manually login and
> run a manual VACUUM ANALYZE and it seems fine for some more time.
> Sometimes, I also restart pgsql and that seems to help for a while.

You don't mention *_scale_factor settings. Those are going to be
important as well. Based on your symptoms, it sounds like autovacuum
is not getting those tables vacuumed enough. I recommend raising the
debug level and watching the logs to see if autovacuum is actually
getting tables vacuumed. Consider lowering your *_scale_factor values
if not. Or even reducing autovacuum_naptime.

> Another advice on these forums is to see "vmstat 1", without actually
> specifying how to draw inferences from it. The "free" version of it is
> coming up at decent rates, as follows:
>
>
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r b swpd free buff cache si so bi bo in cs us sy id wa
> 1 0 29124 110760 108980 3467736 0 1 206 140 0 4 2 1 85 12
> 0 0 29124 110632 108980 3467736 0 0 0 0 1052 108 0 0 100 0
> 2 0 29124 108840 108980 3467736 0 0 0 0 1112 299 1 1 98 0
> 1 0 29124 109288 108980 3467736 0 0 0 0 1073 319 2 1 98 0
> .....

Explaining how to interpret the output of this command and determine
what to do with it is not something easily done in a short paragraph.
However, it looks like you've got a lot of RAM being used for the disk
cache. That memory would probably be better used as shared_buffers, so
I suggest you increase that value considerably.

> > * (with newer version) reduce the fill factor and REINDEX
>
> I think some of my tables are updated very frequently so a smaller
> fill factor will be nice. How can I find the current fill factor on my
> tables? Also, is there some method or science to calculating a decent
> fill factor -- size of table, number of indexes, frequency of updates,
> and such? We have one major table which faces a lot of INSERTs and
> UPDATES in a day (up to 10,000) but many many more SELECTs (up to 10
> million).

I'm not sure how to find the current value, but a smaller fill factor
on busy tables should lead to less fragmentation, thus more efficient
indexes over time. Keep in mind that a smaller fill factor will also
lead to larger indexes initially.

--
Bill Moran
http://www.potentialtech.com

In response to

Re: For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER at 2007-09-18 12:15:22 from Phoenix Kiula

Responses

Re: For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER at 2007-09-18 16:16:55 from Phoenix Kiula

Browse pgsql-general by date

	From	Date	Subject
Next Message	Merlin Moncure	2007-09-18 14:55:32	Re: read-only queries on PITRslaves, any progress?
Previous Message	Sander Steffann	2007-09-18 13:07:09	Re: For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER