Re: Problem after VACUUM ANALYZE

From: "David Wilson" <david(dot)t(dot)wilson(at)gmail(dot)com>
To: mljv(at)planwerk6(dot)de
Cc: pgsql-general(at)postgresql(dot)org, "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
Subject: Re: Problem after VACUUM ANALYZE
Date: 2008-04-09 08:11:29
Message-ID: e7f9235d0804090111w1a4b0009m8f835eb8d7ed8cd3@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Apr 9, 2008 at 3:29 AM, <mljv(at)planwerk6(dot)de> wrote:

> But if i do "VACUUM ANALYZE" without concurrent queries, everything runs fine
> afterwards.
>
> If i run "VACUUM ANALYZE" with few concurrent queries, it slows down to a
> crawl.
>
> Could it be that something like this is happening:
> - In the early morning a new DB connection is opened.
> - While running VACUUM ANALYZE the planner uses different plans because some
> index could not be used or the statistics are right in that moment not
> present because they are updated... So the query gets a wrong plan. It uses a
> seqScan instead of an index scan.
> - This wrongly planned statement is prepared so even after VACUUM ANALYZE is
> done, the statement does not use the wrong plan.
> - load raises triggers many concurrent queries with wrong plans. so everything
> slows down.
>

I've run into a very similar problem. I have some long-running
processes that generate a large amount of data and then query that
data that must periodically drop their connections and rebuild to
ensure that query plans get re-planned according to updated
statistics. This is especially true when a new DB is first being
brought up with an initial set of data (~1 week of compute time, ~30gb
of data and ~120m rows) with processes that live through the entire
process.

My solution, as mentioned above, is to rebuild the connection approx
every hour on my long-running processes. This is a tricky question,
because you want the benefit of not re-planning queries in 99.9% of
the cases- I'm not really opposed to the current system that requires
the reconnections, given that the overhead involved in them is
completely negligible in the grand scheme of things. There are
downsides to not replanning queries, and if you can't live with them
then you should either force re-planning at intervals or avoid
prepared statements.

--
- David T. Wilson
Princeton Satellite Systems
david(dot)t(dot)wilson(at)gmail(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Pavan Deolasee 2008-04-09 08:16:25 Re: Problem after VACUUM ANALYZE
Previous Message mljv 2008-04-09 07:29:24 Re: Problem after VACUUM ANALYZE