Re: How to Find Cause of Long Vacuum Times - NOOB Question

From: Yudhvir Singh Sidhu <ysidhu(at)gmail(dot)com>
To: Yudhvir Singh Sidhu <ysidhu(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: How to Find Cause of Long Vacuum Times - NOOB Question
Date: 2007-05-06 04:52:56
Message-ID: 463D5F28.5060006@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Steinar H. Gunderson wrote:
> On Sat, May 05, 2007 at 03:57:25PM -0700, Yudhvir Singh Sidhu wrote:
>
>> Situation: huge amounts of adds and deletes daily. Running daily vacuums
>>
>
> If you have huge amounts of adds and deletes, you might want to vacuum more
> often; optionally, look into autovacuum.
>
>
>> Problem: Vacuum times jump up from 45 minutes, or 1:30 minutes to 6+
>> hours overnight, once every 1 to 3 months.
>>
>
> You might want to check your FSM settings. Take a look at the output of
> VACUUM VERBOSE and see how the results stack up against your FSM settings.
> Optionally, you could do a VACUUM FULL to clear the bloat, but this will lock
> the tables and is not recommended on a regular basis.
>
>
>> I know my indexes are getting fragmented and my tables are getting
>> fragmented.
>>
>
> This sounds like a case of table bloat, ie. vacuuming too seldom and/or too
> low FSM settings.
>
>
>> I also know that some of my btree indexes are not being used in queries.
>>
>
> This is a separate problem, usually; if you need help with a specific query,
> post query and the EXPLAIN ANALYZE output here. (Note that using an index is
> not always a win; Postgres' planner knows about this and tries to figure out
> when it is a win and when it is not.)
>
>
>> I also know that using "UNIQUE" in a query makes PG ignore any index.
>>
>
> Do you mean DISTINCT? There are known problems with SELECT DISTINCT, but I'm
> not sure how it could make Postgres start ignoring an index. Again, it's a
> separate problem.
>
>
>> I am looking for the cause of this. Recently I have been looking at
>> EXPLAIN and ANALYZE.
>>
>
> This is a good beginning. :-)
>
>
>> 1. Running EXPLAIN on a query tells me how my query SHOULD run and
>> running ANALYZE tells me how it DOES run. Is that correct?
>>
>
> Nearly. EXPLAIN tells you how the plan Postgres has chosen, with estimates on
> the costs of each step. EXPLAIN ANALYZE (just plain "ANALYZE" is a different
> command, which updates the planner's statistics) does the same, but also runs
> the query and shows the time each step ended up taking. (Note that the
> units of the estimates and the timings are different, so you can't compare
> them directly.)
>
>
>> 2. If (1) is true, then a difference between the two means my query
>> plan is messed up and running ANALYZE on a table-level will somehow
>> rebuild the plan. Is that correct?
>>
>
> Again, sort of right, but not entirely. ANALYZE updates the planner's
> statistics. Having good statistics is very useful for the planner in
> selecting the plan that actually ends up being the best.
>
>
>> 3. If (2) is correct, then running ANALYZE on a nightly basis before
>> running vacuum will keep vacuum times down. Is that correct?
>>
>
> No, ANALYZE will only update planner statistics, which has nothing to do with
> vacuum times. On the other hand, it might help with some of your queries.
>
> /* Steinar */
>
Gee Wow. I am so glad I looked into this subject. I think I am onto the
right path in solving the long-running vacuum problem. Thanks a lot for
the detailed insight Steinar.

Here is what I think the story is:
a. Large amounts of rows are added to and deleted from a table - daily.
With this much activity, the statistics get out of whack easily. That's
where ANALYZE or VACUUM ANALYZE would help with query speed.
b. If ANALYZE does not have a direct impact on vacuum times, what does?
Meaning what in this EXPLAIN/ANALYZE and Indexing world would have a
direct impact?

Again, thank you Steinar for validating my suspicion. It is great to be
on the right path.

Yudhvir

Here is another command and I suspect does something different than
ANALYZE by itself: VACUUM ANALYZE.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jim Nasby 2007-05-06 05:38:10 Re: Feature Request --- was: PostgreSQL Performance Tuning
Previous Message Steinar H. Gunderson 2007-05-05 23:40:14 Re: How to Find Cause of Long Vacuum Times - NOOB Question