From: | Richard Huxton <dev(at)archonet(dot)com> |
---|---|
To: | Marcin Mańk <marcin(dot)mank(at)gmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: VACUUM FULL hangs |
Date: | 2006-03-15 11:54:02 |
Message-ID: | 4418005A.70108@archonet.com |
Lists: | pgsql-performance |
Marcin Mańk wrote:
> Hello list.
>
> I recently tried to do a Slony replica of my database, and doing it failed.
> I retried, and then it succeeded (why it failed is another story).
>
> As a result, the replica contains a lot of dead tuples (if I
> understand correctly, a failure in creating the replica means a HUGE aborted
> transaction - and Slony should TRUNCATE the table, getting rid of dead
> tuples, but that is a subject for another list).
>
> So I ran VACUUM FULL VERBOSE ANALYZE (does that make sense?)
Fair enough. If you want empty tables, TRUNCATE is probably a better bet,
though.
> This hung on a (quite large) table:
>
> INFO: vacuuming "public.calls"
> INFO: "calls": found 7980456 removable, 3989705 nonremovable row versions
> in 296943 pages
> DETAIL: 0 dead row versions cannot be removed yet.
> Nonremovable row versions range from 136 to 224 bytes long.
> There were 891 unused item pointers.
> Total free space (including removable row versions) is 1594703944 bytes.
> 197958 pages are or will become empty, including 0 at the end of the table.
> 212719 pages containing 1588415680 free bytes are potential move
> destinations.
OK, so there are about 8.0 million removable rows and 4.0 million
nonremovable, so TRUNCATE isn't an option since you have data you
presumably want to keep. It estimates about 200,000 pages will become
empty, but none of them are at the end of the table. This represents
roughly 1.5GB of unused disk space.
I'm a bit puzzled as to how you managed to get so much free space at the
start of the table. Did the replication work on the second try?
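To put the VACUUM output quoted above in proportion, here is a rough back-of-envelope calculation. It is only a sketch: the row, page and byte counts are copied from the output, but the 8192-byte page size is an assumption (it is the PostgreSQL default, and the output does not state it).

```python
# Figures copied from the VACUUM FULL VERBOSE output quoted above.
total_pages = 296_943
removable_rows = 7_980_456
nonremovable_rows = 3_989_705
free_bytes = 1_594_703_944
pages_becoming_empty = 197_958

# Assumption: default PostgreSQL block size; not stated in the output.
BLOCK_SIZE = 8192

table_bytes = total_pages * BLOCK_SIZE
dead_fraction = removable_rows / (removable_rows + nonremovable_rows)
free_fraction = free_bytes / table_bytes
empty_page_fraction = pages_becoming_empty / total_pages

print(f"table size on disk:    {table_bytes / 2**30:.2f} GiB")
print(f"free space:            {free_bytes / 2**30:.2f} GiB")
print(f"dead row fraction:     {dead_fraction:.1%}")
print(f"free space fraction:   {free_fraction:.1%}")
print(f"pages becoming empty:  {empty_page_fraction:.1%}")
```

Under that assumption, roughly two-thirds of the rows, of the bytes and of the pages are reclaimable, and with 0 empty pages at the end of the table none of that space can be released without moving live rows.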
> CPU 7.25s/3.28u sec elapsed 144.95 sec.
> INFO: index "calls_pkey" now contains 3989705 row versions in 8975 pages
> DETAIL: 108927 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 0.35s/0.59u sec elapsed 39.03 sec.
> INFO: index "calls_cli" now contains 3989705 row versions in 13504 pages
> DETAIL: 108927 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 0.51s/0.60u sec elapsed 58.60 sec.
> INFO: index "calls_dnis" now contains 3989705 row versions in 13600 pages
> DETAIL: 108927 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 0.60s/0.90u sec elapsed 27.05 sec.
> INFO: index "calls_u" now contains 3989705 row versions in 23820 pages
> DETAIL: 108927 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 0.92s/0.78u sec elapsed 80.51 sec.
> INFO: index "calls_z" now contains 3989705 row versions in 13607 pages
> DETAIL: 108927 index row versions were removed.
> 0 index pages have been deleted, 0 are currently reusable.
> CPU 0.60s/0.85u sec elapsed 39.77 sec.
It's done all the indexes (and seems to have done them quite quickly),
and is presumably working on the data now.
> It was hanging in this state for more than 3 hours, and I had to kill the
> vacuum process.
>
> From iostat I saw that there was continuous write activity, steadily about
> 1.3 MB/s (the disk system can do about 40 MB/s), and there were iowait
> processes. There was no read activity.
>
> There were no other clients for that database (but there were clients in
> other databases in the instance).
OK, so you might well be getting the vacuum writing one page, then WAL,
then vacuum, etc. That will mean the disk spends most of its time
seeking back and forth. How many disks do you have, and is the WAL on a
separate set of disks?
I think it's just taking a long time because you have so many pages to
move and not enough disk bandwidth. Of course the root of the problem is
that you had so many dead rows after a failed replication, but you're
right, that's a subject for another email.
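As a rough check on the bandwidth point, the iostat figures reported above can be turned into a total write volume. This is only a sketch: it treats "more than 3 hours" as exactly 3 hours (so it is a lower bound), takes MB to mean 10^6 bytes, and again assumes the default 8192-byte page size.

```python
# Figures reported in the original message.
observed_write_rate_mb_s = 1.3   # steady write rate seen in iostat
disk_capacity_mb_s = 40.0        # what the disk system can reportedly do
elapsed_hours = 3                # "more than 3 hours", so a lower bound
total_pages = 296_943            # from the VACUUM output

# Assumption: default PostgreSQL block size; not stated in the thread.
BLOCK_SIZE = 8192

written_mb = observed_write_rate_mb_s * elapsed_hours * 3600
table_mb = total_pages * BLOCK_SIZE / 1e6
utilisation = observed_write_rate_mb_s / disk_capacity_mb_s

print(f"written before the kill: at least {written_mb:.0f} MB")
print(f"whole table on disk:     about {table_mb:.0f} MB")
print(f"written / table size:    {written_mb / table_mb:.1f}x")
print(f"share of sequential bandwidth used: {utilisation:.1%}")
```

On these assumptions the vacuum had already written several times the size of the table while using only a few percent of the disks' sequential throughput. That is consistent with the seek-bound pattern described above, though it does not prove it.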
--
Richard Huxton
Archonet Ltd