V8.4 TOAST table problem

From: Paul Tilles <paul(dot)tilles(at)noaa(dot)gov>
To: pgsql-general(at)postgresql(dot)org
Cc: Brad Mccune <bradley(dot)mccune(at)noaa(dot)gov>
Subject: V8.4 TOAST table problem
Date: 2013-07-03 15:19:55
Message-ID: 51D4411B.2030908@noaa.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Recently, I've had a PostgreSQL 8.2.11 server upgraded to 8.4 in order
to take advantage of autovacuum features. This server exists in a very
closed environment (isolated network, limited root privileges; this
explains the older software in use) and runs on RHEL5.5 (i686). After
the upgrade, the database has constantly been growing to the tune of 5-6
GB a day. Normally, the database, as a whole, is ~20GB; currently, it is
~89GB. We have a couple other servers which run equivalent databases and
actually synchronize the records to each other via a 3rd party
application (one I do not have access to the inner workings). The other
databases are ~20GB as they should be.

Running the following SQL, it's fairly obvious there's an issue with a
particular table, and, more specifically, its TOAST table.

|SELECT nspname|| '.' || relnameAS "relation",
pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace NON (N.oid= C.relnamespace)
WHERE nspnameNOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC
LIMIT20;|

Which produces:

relation | size
--+---------------------------------
pg_toast.pg_toast_16874 | 89 GB
fews00.warmstates | 1095 MB
...
(20 rows)

This TOAST table is for a table called "timeseries" which saves large
records of blobbed data. A|SUM(LENGTH(blob)/1024./1024.)| of all the
records in timeseries yields ~16GB for that column. There should be
[b]no reason[/b] this table's TOAST table should be as large as it is.

I've performed a |VACUUM FULL VERBOSE ANALYZE timeseries|, and the
vacuum runs to completion with no errors.

INFO: vacuuming "pg_toast.pg_toast_16874"
INFO: "pg_toast_16874": found 22483 removable, 10475318 nonremovable
row versions in 10448587 pages
DETAIL: 0 dead row versions cannot be removed yet.
Nonremovable row versions range from 37 to 2036 bytes long.
There were 20121422 unused item pointers.
Total free space (including removable row versions) is 0 bytes.
4944885 pages are or will become empty, including 0 at the end of
the table. 4944885 pages containing 0 free bytes are potential move
destinations.
CPU 75.31s/29.59u sec elapsed 877.79 sec.
INFO: index "pg_toast_16874_index" now contains 10475318 row
versions in 179931 pages
DETAIL: 23884 index row versions were removed.
101623 index pages have been deleted, 101623 are currently reusable.
CPU 1.35s/2.46u sec elapsed 21.07 sec.

REINDEXed the table which freed [b]some[/b] space (~1GB). I can't
CLUSTER the table as there isn't enough space on disk for the process,
and I'm waiting to rebuild the table entirely as I'd like to find out
why it is so much bigger than equivalent databases we have.

Ran a query from the PostgreSQL wiki here - "Show Database Bloat"
<http://wiki.postgresql.org/wiki/Show_database_bloat>, and this is what
I get:

current_database | schemaname | tablename | tbloat | wastedbytes |
iname | ibloat | wastedibytes
------------------+------------+--------------------------------+--------+-------------+---------------------------------+--------+--------------
ptrdb04 | fews00 | timeseries | 1.0 | 0 | idx_timeseries_synchlevel
| 0.0 | 0
ptrdb04 | fews00 | timeseries | 1.0 | 0 | idx_timeseries_localavail
| 0.0 | 0
ptrdb04 | fews00 | timeseries | 1.0 | 0 | idx_timeseries_expirytime
| 0.0 | 0
ptrdb04 | fews00 | timeseries | 1.0 | 0 | idx_timeseries_expiry_null
| 0.0 | 0
ptrdb04 | fews00 | timeseries | 1.0 | 0 | uniq_localintid | 0.0 | 0
ptrdb04 | fews00 | timeseries | 1.0 | 0 | pk_timeseries | 0.1 | 0
ptrdb04 | fews00 | idx_timeseries_expiry_null | 0.6 | 0 | ? | 0.0 | 0

It looks like the database doesn't consider this space as "empty," at
all, but I just don't see where all the disk space is coming from!

I suspect that this database server is deciding to use 4-5x as much disk
space to save the same records pulled from the other data servers. My
question is this: Is there a way I can verify the physical disk size of
a row? I'd like to compare the size of one row on this database to
another "healthy" database.

Thanks for any help you can provide!

Paul Tilles

Responses

Browse pgsql-general by date

  From Date Subject
Next Message boraldomaster 2013-07-03 16:43:42 How to create a cursor that is independent of transactions and doesn't calculated when created ?
Previous Message Jake Silverman 2013-07-03 15:07:29 C User Defined Functions