Re: db size and VACUUM ANALYZE

From: Marcin Krol <mrkafk(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org >> PGSQL Mailing List" <pgsql-general(at)postgresql(dot)org>
Subject: Re: db size and VACUUM ANALYZE
Date: 2010-02-12 19:20:19
Message-ID: 4B75A9F3.5040109@gmail.com
Lists: pgsql-general

Bill Moran wrote:
> Note that the "correct" disk size for your database is probably closer
> to the 1.6G you were seeing before.

This might be the case, but how do I find out what the "correct" sizes are?

I have a script that runs the following queries:

SELECT relpages * 8192 AS size_in_bytes, relname FROM pg_class WHERE
relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')
ORDER BY size_in_bytes DESC LIMIT 10;

SELECT SUM(sizes.size_in_bytes) AS total_size_for_top_10_tables FROM
(SELECT relpages * 8192 AS size_in_bytes, relname FROM pg_class WHERE
relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')
ORDER BY size_in_bytes DESC LIMIT 10) AS sizes;

SELECT SUM(sizes.size_in_bytes) AS total_size_for_all_tables FROM
(SELECT relpages * 8192 AS size_in_bytes, relname FROM pg_class WHERE
relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public'))
AS sizes;
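
For comparison, I could probably get the actual on-disk sizes (including
indexes and TOAST) straight from the built-in size functions instead of
estimating from relpages. Assuming this installation is 8.1 or later,
something along these lines should work (I haven't run it on that box yet):

SELECT pg_total_relation_size(c.oid) AS total_bytes, c.relname
FROM pg_class c
WHERE c.relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = 'public')
  AND c.relkind = 'r'
ORDER BY total_bytes DESC LIMIT 10;

(relpages is only refreshed by VACUUM/ANALYZE, so the script above can lag
behind the real file sizes.)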

Results before (1.6G db):

 size_in_bytes |       relname
---------------+----------------------
     806387712 | cs_ver_digests_pkey
     103530496 | oai_edi_atts_pkey
      62021632 | cs_ver_paths
      61734912 | cs_ver_digests
      55721984 | cs_fil_paths
      45309952 | met_files
      38412288 | met_versions
      26247168 | cs_ver_content_types
      25444352 | met_edi_ver
      23724032 | met_edi_atts
(10 rows)

total_size_for_top_10_tables
------------------------------
1248534528
(1 row)

total_size_for_all_tables
---------------------------
1467809792
(1 row)

Results now (600M db):

 size_in_bytes |          relname
---------------+---------------------------
      62169088 | cs_ver_paths
      55828480 | cs_fil_paths
      45441024 | met_files
      42000384 | cs_ver_digests
      37552128 | met_versions
      25509888 | met_edi_ver
      24215552 | cs_ver_content_types
      20717568 | met_edi_atts
      18186240 | met_edi_ver_pkey
      13565952 | cs_ver_content_types_pkey
(10 rows)

total_size_for_top_10_tables
------------------------------
345186304
(1 row)

total_size_for_all_tables
---------------------------
467476480
(1 row)

>This allows PG some free space
> within the data files to add/remove records. vacuum full removes this
> space, and you'll likely find that the files will simply expand to
> use it again. Vacuum (without full) keeps that space at an equilibrium.

I don't mind a slight performance degradation; the problem is that this is
the second time that, beyond a certain db size, the degradation has become
almost runaway.
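
If it is useful, I could also watch whether dead rows pile up between
vacuums. Assuming the stats collector columns n_live_tup/n_dead_tup are
available on this version (8.3 or later, I believe), something like:

SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

run periodically should show whether some tables keep accumulating dead
rows faster than vacuum reclaims them.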

> As to performance degradation, you'll always see performance hits as
> your database size increases. I'm assuming from your need to ask about
> this issue that the degradation was significant.

Yes, to the point of being unacceptable (that is, queries took something
like 20-30 seconds).

> In that case, you first
> want to make sure that the tables in the database have indexes in
> all the right places -- in my experience, this is the biggest cause of
> performance issues. Use of EXPLAIN ANALYZE on queries that are performing
> slow will usually indicate where indexes can help.

I'll try, though that will not be easy, as the queries are complex and were
not written by me (it's a closed system).
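
Just so I understand the approach: I would grab one of the slow statements
from the log and run it under EXPLAIN ANALYZE, for example (the join
columns here are made up, I'd use one of the real queries):

EXPLAIN ANALYZE
SELECT v.version_id, f.file_name
FROM met_versions v
JOIN met_files f ON f.version_id = v.version_id
WHERE v.digest = 'abc123';

and then look for sequential scans over large row counts in the plan, since
those are usually where an index is missing.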

> From there, you may simply have too little hardware for the database to
> run at the speed you expect.

You see, that's the weird thing: the machine in question has 4 CPUs and
4G of RAM. When the performance was unacceptable, the loadavg was around
1, all CPUs were only slightly loaded, and iostat didn't show much happening
on the disks. The one thing I do remember is that there were many
postmaster processes (around 20), and they had huge virtual sizes (around
800M) and large resident sizes (around 300M).

On top of having the pg_dump backup, I also copied the db's binary files
while pg was stopped. I could play with those files (try them under the
same pg config on another machine).

> Giving it more RAM is cheap and tends to
> work wonders. Any time the system runs out of RAM, it needs to use disk
> instead, which significantly hurts performance.

This is my memory config:

shared_buffers = 768MB
temp_buffers = 32MB        # min 800kB
work_mem = 32MB            # min 64kB
max_stack_depth = 256MB    # min 100kB
max_fsm_pages = 153600

% sysctl -a | grep shm
kernel.shmmni = 4096
kernel.shmall = 262144
kernel.shmmax = 1073741824
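
If my arithmetic is right (assuming the usual 4 kB page size for shmall on
this kernel):

kernel.shmall = 262144 pages * 4096 bytes/page = 1073741824 bytes = 1 GB
kernel.shmmax = 1073741824 bytes = 1 GB

so the 768MB of shared_buffers, plus PostgreSQL's other shared memory
overhead, should still fit under the 1 GB segment limit.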
