Re: Statistics collection question

From: "Gregory Williamson" <Gregory(dot)Williamson(at)digitalglobe(dot)com>
To: "Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Richard Broersma Jr" <rabroersma(at)yahoo(dot)com>, "Alban Hertroys" <alban(at)magproductions(dot)nl>, "Postgres General" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Statistics collection question
Date: 2007-09-04 07:28:50
Message-ID: 8B319E5A30FF4A48BE7EEAAF609DB233015E2ACD@COMAIL01.digitalglobe.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Sounds like index bloat to me ... lots of updates of indexed columns = lots of extra dead index entries. Since IIRC PostgreSQL indexes (indicii?) don't store information about the "liveness" of the referenced rows, indexed reads would have to sort through a lot of dead wood to find the few live indexed entries.

If you can, try to schedule a few minutes of down time every N hours and reindex the effected tables, followed by a vacuum/analyze to reclaim dead space and update stats maybe ? Admittedly hard but perhaps easier to have 5-10 minutes of down time regularly rather than very slow queries for hours on end. If this works even as a temporary solution it might point the way to a better long term fix.

It sounds as if you have too many services on one server -- the contentions of each for memory and disk I/O would worry me a lot. I tend to like having dedicated DB servers except for certain light-weight development environments.

And stop trying to make PostgreSQL into MySQL, or vice versa. Different engines, different regimes. Not translatable me thinks.

Just sodden thoughts ... sorry for top posting (challenged email tool).

Greg Williamson
Senior DBA
GlobeXplorer LLC, a DigitalGlobe company

Confidentiality Notice: This e-mail message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information and must be protected in accordance with those provisions. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message.

(My corporate masters made me say this.)

-----Original Message-----
From: pgsql-general-owner(at)postgresql(dot)org on behalf of Phoenix Kiula
Sent: Tue 9/4/2007 1:07 AM
To: Tom Lane
Cc: Richard Broersma Jr; Alban Hertroys; Postgres General
Subject: Re: [GENERAL] Statistics collection question

On 04/09/07, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com> writes:
> > On 04/09/07, Richard Broersma Jr <rabroersma(at)yahoo(dot)com> wrote:
> >> Thats odd, I wonder why the EXPLAIN ANALYZE time id some much less that the logged select
> >> statement times?
>
> > Because the statement has been executed and is in the cache.
>
> That answer is way too flippant. In particular it doesn't explain your
> repeated 80sec queries --- you should have enough memory in that thing
> to be caching a fair amount of your data.
>
> I'm wondering about some transaction taking exclusive lock on the table
> and sitting on it for a minute or so, and also about network problems
> delaying transmission of data to the client.
>

How can I check what is causing the lack? When I restart pgsql it goes
away. The log is empty for a day or too (I'm only logging errors or
slow queries) and the queries are super fast, but after a day it
starts filling up with abysmally slow queries, even on simple queries
with the WHERE clauses that have only one constant on the indexed
column!

As for "network problems delaying transmission of data" -- not sure
what this means. MySQL is super fast on the very same system. Does
pgsql require anything different?

Basically, what I am missing is some info on actually tweaking the
postgresql.conf to suit my system. I run Apache, MySQL, Postgres, and
Exim (email server) on the same dedicated hosting server. I don't mind
if Postgres hogs 2GB of memory, but I need to know how to tweak it. I
have made about eight posts on this list with my entire
postgresql.conf posted in. I have read and re-read the manual and
devoured as many google-groups archives of this list as I possibly
can. I am looking at plenty of catalogue and stats tables (a utility
that makes compiles all of it and presents the system's missteps and
guidelines may be useful, ala "Tuning Primer" script from MySQL camp)
but I am not sure where to begin!

Would appreciate any help. Why do indexed queries take so much time?
It's a simple DB with "10 relations" including tables and indexes.
Simple inserts and updates, about 5000 a day, but non-trivial
concurrent selects (about 45 million a day). Works fine when I
restart, but a day later all goes cattywumpus.

TIA!

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Martijn van Oosterhout 2007-09-04 07:41:27 Re: Statistics collection question
Previous Message Phoenix Kiula 2007-09-04 07:07:41 Re: Statistics collection question