Re: Reindex taking forever, and 99% CPU

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
Cc: PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Reindex taking forever, and 99% CPU
Date: 2014-08-04 01:55:17
Message-ID: CAMkU=1wOX4pCVEUNVfXMcXifEUn=t7bKn+xF-sJ=dpf50OjJVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Saturday, August 2, 2014, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> wrote:

> Hi. I've been patient. PG is 9.0.17, updated via Yum yesterday.
>
> One of my large tables (101 GB on disk, about 1.1 billion rows) used
> to take too long to vacuum.

Too long for what? Rome wasn't build in a day, it might not get vacuumed
in a day either. So what?

> Not sure if it's an index corruption
> issue. But I tried VACUUM FULL ANALYZE as recommended in another
> thread yesterday, which took 5 hours on the two times I tried, without
> finishing.
>
> Now the REINDEX TABLE has taken over 6 hours as I decided to be
> patient and just let something finish. Not sure this is normal though!
> How do production level DBAs do this if it takes so long?
>

Generally speaking, we don't.

>
> If I open another SSH window to my server and try "select * from
> pg_stats_activity" it just hangs there, as the REINDEX I presume is
> taking up all the memory? I basically can't do anything else on this
> server.
>

Is this large table one of the system tables?

>
> Just in case it helps, a segment of my postgresql.conf is below. Would
> appreciate any tips on what I can do.
>
> (I did a pg_dump of just this table, which also took about 2 hours,
> then I renamed the original table in the database, and tried to
> pg_restore just the table, but it gave me an error message about the
> archive being in the wrong format !!! So REINDEX or something like it
> seems to be the only idea?)
>

The only idea in order to DO WHAT? So far the only problems we know about
are the ones you are causing yourself, in an effort to fix some problem
which we know nothing about, and which might not actually exist in the
first place.

> Thanks for any help!
>
> PS: CentOS 6 64 bit, 4 GB of RAM, Raid 1 Raptor disks. Postgresql.conf
> and TOP output during the running of the REINDEX are below..
>

Does RAID 1 mean you only have 2 disks in your RAID? If so, that is
woefully inadequate to your apparent workload. The amount of RAM doesn't
inspire confidence, either. If you want to use this hardware, you need to
re-calibrate what "patience" means. Do a "vacuum verbose" (NOT "full") of
the large table, and let it run over a weekend, at least.

> ----POSTGRESQL.CONF-----
>
> max_connections = 180
>

That's probably absurd. If you have an application that loses track of
it's connections and doesn't actually try to make use of them and you can't
fix that application and you have no evidence of other problems, then this
might sense, kind of, as defensive measure. But since you are in an
emergency, or think you are, you should lower this.

maintenance_work_mem = 320MB
>

If the only thing running is the vacuum, you could give it a lot more
memory than this, like 2 or 3 GB. But you should probably do that only in
the session doing the "emergency" vacuum, not globally.

autovacuum_vacuum_cost_delay = 20ms
>

Is vacuum_cost_delay still the default of 0?

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David G Johnston 2014-08-04 06:17:35 Minor pg_stat_activity query/message improvement request
Previous Message Adrian Klaver 2014-08-03 13:29:11 Re: Reindex taking forever, and 99% CPU