From: | Rik Bellens <rik(dot)bellens(at)telin(dot)ugent(dot)be> |
---|---|
To: | Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: strange table disk sizes |
Date: | 2011-09-01 12:16:30 |
Message-ID: | 4E5F779E.1020300@telin.ugent.be |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Op 01-09-11 13:31, Scott Marlowe schreef:
> On Thu, Sep 1, 2011 at 4:32 AM, Rik Bellens<rik(dot)bellens(at)telin(dot)ugent(dot)be> wrote:
>> Hello,
>>
>> I have two large tables in a database, one containing original data and the
>> other one derived from the first table. The first table contains several
>> columns and indexes, while the second table has less columns and only one
>> index. Both tables have the same number of rows. Nevertheless, the second
>> table is much larger in disk size than the first one. How can this be
>> explained?
> This is most likely due to table bloat. In PostgreSQL when you update
> or delete a row, a dead version gets left behind. Vacuum eventually
> comes along and reclaims the empty space to be reused. If you delete
> / update a LOT of rows at once, then you'll have a lot of dead rows
> which can only be reused after vacuuming when you do more updates or
> deletes later on.
>
> A few salient questions. What version of PostgreSQL are you running?
> Is autovacuum running? Do you do a LOT of bulk deletes / updates? If
> you do a lot of bulk deletes on this table, and you delete everything,
> can you switch to using the truncate command instead?
I use version 8.3. I see the 'autovacuum launcher process' and
'autovacuum worker process' in the process list, so I suppose autovacuum
is running.
Rows in the measurement table are added once and never deleted or
updated. Adding a row to this table triggers a function that adds a row
to the stats_count table. Normally rows are added chronologically. So
rows in the stats_count table are normally not updated either. If
however, for some reason, a measurement is added from an older time, all
rows of that device which come after this time, are updated, but I don't
think this will happen very often.
The table 'stats_count' was created in a later stage, so the first 45M
rows were added at once and chronologically. However, because the
function to initialize this table took a long time and the client
application crashed a few times, I had to restart this function several
times. Can it be that there is some trash left from running this
function several times without finishing it? Would it be a solution to
run 'VACUUM FULL' to reclaim some disk space?
From | Date | Subject | |
---|---|---|---|
Next Message | Rik Bellens | 2011-09-01 12:38:50 | Re: strange table disk sizes |
Previous Message | Sim Zacks | 2011-09-01 12:10:22 | Re: md5 of table |