From: | Jeff Amiel <becauseimjeff(at)yahoo(dot)com> |
---|---|
To: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Looking for settings/configuration for FASTEST reindex on idle system. |
Date: | 2014-01-09 22:03:25 |
Message-ID: | 1389305005.50135.YahooMailNeo@web122901.mail.ne1.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
I have a maintenance window coming up and using pg_upgrade to upgrade from 9.2.X to 9.3.X.
As part of the window, I’d like to ‘cluster’ each table by its primary key. After doing so, I see amazing performance improvements (probably mostly because of index bloat - but possibly due to table fragmentation)
That being said, I have a single table that is blowing my window -
at 140 million rows (28 gig in size with 75 gig worth of indexes), this bad boy is my white whale. There are 10 indexes (not including the primary key). Yes - 10 is a lot - but I’ve been monitoring their use (most are single column or partial indexes) and all are used.
That being said, I’ve been reading and experimenting in trying to get a cluster of this table (which re-indexes all 10/11 indexes) to complete in a reasonable amount of time.
There are lots of settings and ranges to chose from and while my experiments continue, I was looking to get some input. Lowest I have gotten for clustering this table is just under 6 hours.
I am familiar with pg_reorg and it’s sibling pg_repack - but they call the base postgresql reindex functions underneath - and I have learned by using ‘verbose’ that the actual clustering of the table is quick - it’s the reindexing that is slow (It’s doing each reindex sequentially instead of concurently)
PostgreSQL 9.3.2 on x86_64-pc-solaris2.11, compiled by gcc (GCC) 4.5.2, 64-bit
500 gig of ram
2.7gig processors (48 cores)
Shared buffers set to 120gig
Maintenance work men set to 1gig
work men set to 500 meg
Things I have read/seen/been told to tweak…
fsync (set to off)
setting wal_level to minimal (to avoid wal logging of cluster activity)
bumping up maintenance work men (but I’ve also seen/read that uber high values cause disk based sorts which ultimately slow things down)
Tweaking checkpoint settings (although with wal_level set to minimal - I don’t think it comes into play)
any good suggestions for lighting a fire under this process?
If worse comes to worse, I can vacuum full the table and reindex each index concurrently - but it won’t give me the benefit of having the tuples ordered by their oft-grouped primary key.
From | Date | Subject | |
---|---|---|---|
Next Message | Adrian Klaver | 2014-01-09 23:09:24 | Re: pg_restore - selective restore use cases. HINT use DROP CASCADE |
Previous Message | Day, David | 2014-01-09 21:51:39 | Re: pg_restore - selective restore use cases. HINT use DROP CASCADE |