From: Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
To: "Daniel Serodio (lists)" <daniel(dot)lists(at)mandic(dot)com(dot)br>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Help estimating database and WAL size
Date: 2012-10-09 02:18:03
Message-ID: 5073895B.9040700@archidevsys.co.nz
Lists: pgsql-general
On 09/10/12 09:39, Daniel Serodio (lists) wrote:
> We are preparing a PostgreSQL database for production usage and we
> need to estimate the storage size for this database. We're a team of
> developers with little expertise in database administration, so we are
> doing research, reading manuals and using our general IT knowledge to
> achieve this.
>
> We have actual data to migrate to this database and some rough
> estimates of growth. For the sake of the example, let's say we have
> an estimated growth of 50% per year.
>
> The point is: what's the general proper technique for doing a good
> size estimation?
>
> We are estimating the storage usage by the following rules. Topics
> where we need advice are marked with ** asterisks **. Feedback on the
> whole process is more than welcome.
>
> 1) Estimate the size of each table
> 1.1) Discover the actual size of each row.
> - For fields with a fixed size (like bigint, char, etc) we
> used the sizes described in the documentation
> - For fields with a dynamic size (like text) we estimated the
> string length and used the function select pg_column_size('expected
> text here'::text)
> - We added 4 more bytes for the OID that PostgreSQL uses
> internally
> 1.2) Multiply the size of each row by the number of estimated rows
> ** Do I need to consider any overhead here, like row or table
> metadata? **
>
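On the row-overhead question: yes. Each heap row carries a tuple header
of 23 bytes (typically padded to 24), plus a 4-byte line pointer in its
page, and pages themselves have a small header and usually some unused
space. Note also that the 4-byte OID is only present in tables created
WITH OIDS, which is off by default. Rather than summing column sizes by
hand, I would load a representative sample and measure; a rough sketch
("my_table" stands in for one of your real tables):

    -- Copy a representative sample, then measure real on-disk usage,
    -- which includes tuple headers, line pointers and page overhead.
    CREATE TABLE size_sample AS
        SELECT * FROM my_table LIMIT 10000;

    SELECT pg_relation_size('size_sample') / count(*) AS bytes_per_row
        FROM size_sample;

Multiply bytes_per_row by the estimated row count for each table.
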
> 2) Estimate the size of each table index
> ** Don't know how to estimate this, need advice here **
>
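Indexes are easiest to measure the same way: create the indexes you plan
to have on the sample table, compare their size to the table's, and scale
up. A sketch ("some_column" is whatever you will actually index):

    -- pg_indexes_size (9.0+) reports the total size of all indexes
    -- on a table; compare it to the heap size and scale up.
    CREATE INDEX size_sample_idx ON size_sample (some_column);

    SELECT pg_size_pretty(pg_relation_size('size_sample')) AS table_size,
           pg_size_pretty(pg_indexes_size('size_sample'))  AS index_size;
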
> 3) Estimate the size of the transaction log
> ** We've got no idea how to estimate this, need advice **
>
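The space pg_xlog itself needs is bounded by your settings rather than by
database size: normally no more than about
(2 + checkpoint_completion_target) * checkpoint_segments + 1 segment
files of 16 MB each. What you may also want is the rate at which WAL is
generated, e.g. for sizing a WAL archive. On 9.2 you can measure that
across a representative workload, roughly like this:

    -- Note the current WAL location, run a typical workload, then
    -- diff against the noted location ('0/0' is just a placeholder
    -- for the value you noted in the first step).
    SELECT pg_current_xlog_location();

    SELECT pg_size_pretty(
               pg_xlog_location_diff(pg_current_xlog_location(), '0/0'));
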
> 4) Estimate the size of the backups (full and incremental)
> ** Don't know how to estimate this, need advice here **
>
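For backups, the on-disk database size is a reasonable upper bound for an
uncompressed base backup, and a plain pg_dump is usually much smaller,
since it omits index contents, dead rows and most overhead; custom-format
dumps compress on top of that. For incremental (PITR) backups, the size
is essentially the WAL archive volume from point 3. A starting point:

    -- Upper bound for a full base backup of the current database.
    SELECT pg_size_pretty(pg_database_size(current_database()));
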
> 5) Sum all the estimates for the actual minimum size
>
> 6) Apply a factor of 1.5x (the 50% growth) to the sum of the estimates
> 1, 2 and 4 for the minimum size after 1 year
>
> 7) Apply an overall factor of 1.2 ~ 1.4 (20% to 40% more) to estimates
> 5 and 6 for a good safety margin
>
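As a worked example with made-up numbers: if tables, indexes and backups
come to 100 GB today, step 6 gives 150 GB after one year, and step 7 puts
the provisioning target at roughly 180 to 210 GB, on top of the (fairly
constant) WAL allowance from point 3.
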
> I know this list of rules got pretty extensive; please let me know if
> you need more data or examples for a better understanding.
>
> We've also posted this question to
> http://dba.stackexchange.com/q/25617/10166
>
> Thanks in advance,
> Daniel Serodio
>
>
You also have to allow for table & index bloat.

When a record is DELETEd or UPDATEd, the space it used on disk is not
automatically reclaimed. So in a very volatile database, the data files
could end up several times bigger than the live data actually requires.
There are automatic and manual procedures (look up VACUUM) that can keep
this under control, but you will still need to account for some bloat;
its extent depends very much on your usage patterns.
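Once you are running, a quick way to see whether dead rows are building
up faster than autovacuum reclaims them (one sketch, using the standard
statistics views):

    -- Tables with the most dead rows; persistent growth here suggests
    -- bloat that your VACUUM settings are not keeping up with.
    SELECT relname, n_live_tup, n_dead_tup
        FROM pg_stat_user_tables
        ORDER BY n_dead_tup DESC
        LIMIT 10;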
Cheers,
Gavin