From: | "Carlo Stonebanks" <stonec(dot)register(at)sympatico(dot)ca> |
---|---|
To: | <pgsql-performance(at)postgresql(dot)org> |
Subject: | Performance improves only after repeated VACUUM/ANALYZE |
Date: | 2007-09-18 19:59:44 |
Message-ID: | B57531DF0F844B86962E9A22852F27F6@serenity |
Lists: pgsql-performance
My client "publishes" an "edition" of his DB from his production site to
his hosted web/db server. This is done by FTPing a backup of the DB to his
hosting provider.
Immediately after a "publication" (a restore onto the web/db server), we run
VACUUM ANALYZE to make sure the statistics and row estimates are correct.
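For reference, the post-restore step is nothing more than this, run
database-wide from psql (the equivalent of running vacuumdb --analyze from
the shell):

    VACUUM ANALYZE;   -- reclaims space info and rebuilds planner statistics in one pass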
The problem is that after this initial VACUUM ANALYZE, the row estimates in
query plans are off by more than two orders of magnitude. For example, one
disastrous plan was chosen because the planner estimated 4K rows where the
query actually returned 980K rows.
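To illustrate how we see the mismatch - the table and predicate below are
placeholders, not the real query - we compare the planner's estimate
("rows=...") against the actual count ("actual ... rows=...") in the
EXPLAIN ANALYZE output:

    -- hypothetical query, for illustration only
    EXPLAIN ANALYZE
    SELECT *
    FROM big_table              -- placeholder table name
    WHERE some_column = 42;     -- placeholder predicate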
Sometimes - a day or two later - the plans return to "normal" and the row
estimates are closer to realistic values. Guessing that some background event
was correcting the row estimates over time, I ran ANALYZE on the DB myself -
and sure enough, the row estimates corrected themselves. The puzzling thing is
that there have been no writes of any sort to the data, so there is no reason
for the stats to have changed.
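If it helps, this is the kind of check I can run to see whether a background
autoanalyze has touched the statistics since the restore (assuming row-level
stats collection is on, which it has to be for autovacuum to work):

    -- per-table vacuum/analyze timestamps, available as of 8.2
    SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
    FROM pg_stat_user_tables
    ORDER BY relname;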
I believe a VACUUM may not be necessary on a newly restored DB, but I assumed
that VACUUM ANALYZE and a plain ANALYZE have the same net effect on the
statistics. Am I wrong?
If I am not wrong (i.e. VACUUM ANALYZE and ANALYZE should produce the same
statistics), why does performance on a DB that has seen no transactional
activity improve only after the SECOND try?
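If it would clarify anything, I can capture the planner's inputs right after
the first VACUUM ANALYZE and again after the later plain ANALYZE with
something like this (the table name is only an example):

    -- row and page counts the planner uses for the table
    SELECT relname, reltuples, relpages
    FROM pg_class
    WHERE relname = 'big_table';

    -- per-column statistics gathered by ANALYZE
    SELECT attname, n_distinct, null_frac
    FROM pg_stats
    WHERE tablename = 'big_table';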
PG 8.2.4 on Red Hat Linux, 1 GB RAM, SCSI RAID 1
Carlo