VACUUM FULL

From: Brian McCane <bmccane(at)mccons(dot)net>
To: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: VACUUM FULL
Date: 2002-05-11 00:18:42
Message-ID: 20020510185436.F62674-100000@fw.mccons.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin


Okay, I guess I misunderstood something about "VACUUM FULL". My
understanding was that a VACUUM (without FULL), marked unused records for
reuse. VACUUM FULL moved records from the end of a file to the holes
where unused records were stored and truncated the file to free disk
space. So far so good, but....

I have had continued loss of performance on one of my largest tables
(600,000,000 records). There are 4 fields in a record like such:

CREATE TABLE foo (
a int not null references bar(a) on delete cascade on update no action,
b int not null references baz(b) on delete cascade on update no action,
c int,
d smallint,
primary key(a, b)) ;
CREATE INDEX foo_ac ON foo (a,c) ;
CREATE INDEX foo_ad on foo (a,d) ;
And there are 3 triggers which fire before insert/delete/update.

I INSERT/UPDATE/DELETE approximately 300,000 records per day, but this
number is increasing on a daily basis as I make changes which improve the
performance of my data gathering tools (spiders ;). Two days ago, it had
reached the point where a search for a 3-word term (ie. free news servers)
took about 60 seconds. I have just spent 48 hours running a VACUUM FULL
on my table, and now the same search takes < 10 seconds. I assume that
the increase in performance is due to the decrease in table/index size
which added up to approximate 1GB of freed space on the machine, which was
approximately 4% of the original size of the table and all its indices.
But, a 4% decrease in size should not add up to a 84% increase in
performance (is that right? I always get the ratio confused :).

If all that VACUUM FULL did was move records from file 12345678.6 to file
12345678, the database would still being doing a large number of random
accesses on the table. However, if VACUUM FULL clusters the data
according to the primary key, it would still be doing a large number of
random access on the table, because the primary key has almost nothing to
do with how I actually access the data in real life. So, is VACUUM FULL
looking somewhere in pg_statistics (or pg_stat_user_indexes), to determine
which index I actually use most (foo_ad), and then clustering the data
that way, or is there some other agent at work here.

- brian

Wm. Brian McCane | Life is full of doors that won't open
Search http://recall.maxbaud.net/ | when you knock, equally spaced amid those
Usenet http://freenews.maxbaud.net/ | that open when you don't want them to.
Auction http://www.sellit-here.com/ | - Roger Zelazny "Blood of Amber"

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Dan Langille 2002-05-11 00:38:30 Re: VACUUM FULL
Previous Message dac 2002-05-10 20:03:49 unregister knight@tmilenio.com