From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | Troels Arvin <troels(at)arvin(dot)dk> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Storage consumption |
Date: | 2003-11-14 18:26:12 |
Message-ID: | 200311141826.hAEIQCD23265@candle.pha.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Did you see the FAQ item on estimating disk space?
---------------------------------------------------------------------------
Troels Arvin wrote:
> Hello,
>
> For some very data-intensive projects it's interesting how much space the
> DBMS uses for the storage of data, so I'm investigating how space
> efficient different DBMSes are.
>
> In the PostgreSQL manual, it's written that values of the type INTEGER
> take op four bytes. I was curious about how close to real-World this
> number is, so I did a test: How much space does PostgreSQL use when
> storing 100000 rows where each row consists of a single INTEGER value?
>
> With help from http://random.org/ I created a file with 100000 random
> integer insertions. The SQL used to do that is available at
> http://troels.arvin.dk/db/tests/storage-sizes/randomints.zip
>
> About installation: PostgreSQL v. 7.3.4 on Red Hat Linux 9, file system
> ext3. PostgreSQL data-area in /var/lib/pgsql/data.
>
> For this test, PostgreSQL is being used for nothing else.
>
> Before test start:
> -----------------
> Access to a default database ('psql' brings you right into a working
> database) from psql.
> Access to do a 'du' (disk usage unix-command) on /var/lib/pgsql/data from
> the command line.
> No existing table 'inttab' in database. PostgreSQL stopped.
>
> Test starts.
> -----------
> Output of 'du -sb /var/lib/pgsql/data': 77946519.
> Start PostgreSQL.
> Do: "CREATE TABLE inttab (intval INT) WITHOUT OIDS;"
> psql -q -f random_ints.sql
> (Wait for a long time.)
> Do: "VACUUM FULL;"
> Shut down PostgreSQL.
> Output of 'du -sb /var/lib/pgsql/data': 81190551.
>
> Result:
> ------
> Real difference: 81190551-77946519 = 3244032
> Optimal difference: 100000*4 = 400000
> Storage consumption rate ((real/optimal)*100)% = 811%
>
> I'm surprised by an overhead _that_ high. Any comments on my methology?
> Does it need adjustments? If you think it's rotten: What methology would
> you use to measure space overhead for a DBMS? (Again: Space overhead is
> seldomly interesting, but sometimes it is.)
>
> I guess that transaction log files are a joker in this context, but then
> again: A number which reflects the DBMS' disk usage before and after an
> operation does have real-World meaning, I think.
>
> (Of course, I'll need another methology for DBMSes which preallocate a
> fixed amount of storage for a database.)
>
> --
> Greetings from Troels Arvin, Copenhagen, Denmark
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff | 2003-11-14 18:36:26 | Re: Updated Documentation |
Previous Message | Rajesh Kumar Mallah | 2003-11-14 18:13:26 | Re: DOMAIN usability |