Re: Problem after VACUUM ANALYZE

From: mljv(at)planwerk6(dot)de
To: "David Wilson" <david(dot)t(dot)wilson(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Problem after VACUUM ANALYZE
Date: 2008-04-09 08:17:39
Message-ID: 200804091017.39896.mljv@planwerk6.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Am Mittwoch, 9. April 2008 10:11 schrieb David Wilson:
> On Wed, Apr 9, 2008 at 3:29 AM, <mljv(at)planwerk6(dot)de> wrote:
> > But if i do "VACUUM ANALYZE" without concurrent queries, everything runs
> > fine afterwards.
> >
> > If i run "VACUUM ANALYZE" with few concurrent queries, it slows down to
> > a crawl.
> >
> > Could it be that something like this is happening:
> > - In the early morning a new DB connection is opened.
> > - While running VACUUM ANALYZE the planner uses different plans because
> > some index could not be used or the statistics are right in that moment
> > not present because they are updated... So the query gets a wrong plan.
> > It uses a seqScan instead of an index scan.
> > - This wrongly planned statement is prepared so even after VACUUM
> > ANALYZE is done, the statement does not use the wrong plan.
> > - load raises triggers many concurrent queries with wrong plans. so
> > everything slows down.
>
> I've run into a very similar problem. I have some long-running
> processes that generate a large amount of data and then query that
> data that must periodically drop their connections and rebuild to
> ensure that query plans get re-planned according to updated
> statistics. This is especially true when a new DB is first being
> brought up with an initial set of data (~1 week of compute time, ~30gb
> of data and ~120m rows) with processes that live through the entire
> process.

This is not the case at my place. We are not producing lots of data. Just few
data with small INSERTs all the time. I don't think that the statistic are
really changing quite often as the tables are rather fixed and just some data
is added.

> My solution, as mentioned above, is to rebuild the connection approx
> every hour on my long-running processes. This is a tricky question,
> because you want the benefit of not re-planning queries in 99.9% of
> the cases- I'm not really opposed to the current system that requires
> the reconnections, given that the overhead involved in them is
> completely negligible in the grand scheme of things. There are
> downsides to not replanning queries, and if you can't live with them
> then you should either force re-planning at intervals or avoid
> prepared statements.

In release 8.3:
http://www.postgresql.org/docs/8.3/interactive/release-8-3.html
"Automatically re-plan cached queries when table definitions change or
statistics are updated"

So no reconnection is necessary in 8.3 to replan prepared statements. Maybe it
helps you.

kind regards,
janning

In response to

Browse pgsql-general by date

  From Date Subject
Next Message josep porres 2008-04-09 08:20:18 wrong query result
Previous Message Pavan Deolasee 2008-04-09 08:16:25 Re: Problem after VACUUM ANALYZE