Re: Smaller data types use same disk space

From: Adrian Klaver <adrian(dot)klaver(at)gmail(dot)com>
To: "McGehee, Robert" <Robert(dot)McGehee(at)geodecapital(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Smaller data types use same disk space
Date: 2012-07-25 00:46:09
Message-ID: 500F41D1.3010806@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 07/24/2012 03:21 PM, McGehee, Robert wrote:
> Hi,
> I've created two tables labeled "Big" and "Small" that both store the same 10 million rows of data using 493MB and 487MB of disk space respectively. The difference is that the "Big" table uses data types that take up more space (integer rather than smallint, float rather than real, etc). The "Big" table should need about 27 bytes/row versus 16 bytes/row for the "Small" table, indicating to me that the "Big" table should be 70% bigger in actual disk size. In reality, it's only 1% bigger or 6MB (after clustering, vacuuming and analyzing). Why is this? Shouldn't the "Small" table be about 110MB smaller (11 bytes for 10 million rows)? I'm estimating table size with \d+
>
> Thanks, Robert
>
> Table "Big"
> Column | Type | Bytes
> ----------+------------------+-----------
> rmid | integer | 4
> date | date | 4
> rmfactor | text | 7 (about 3 characters/cell)
> id | integer | 4
> value | double precision | 8
> ---------------------------------
> Total Bytes/Row 27
> Rows 10M
> Actual Size 493MB
>
>
> Table "Small"
> Column | Type | Bytes
> --------+----------+-----------
> rmid | smallint | 2
> date | date | 4
> rmfid | smallint | 2 (rmfid is a smallint index into the rmfactor table)
> id | integer | 4
> value | real | 4
> ---------------------------------
> Total Bytes/Row 16
> Rows 10M
> Actual Size 487MB

See here for the gory details:
http://www.postgresql.org/docs/9.1/interactive/storage-page-layout.html

See in particular:
Table 55-4. HeapTupleHeaderData Layout
From the text:
""All table rows are structured in the same way. There is a fixed-size
header (occupying 23 bytes on most machines.." which breaks you
assumption of the Big/Small row size comparison.

>
>
>

--
Adrian Klaver
adrian(dot)klaver(at)gmail(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Steve Crawford 2012-07-25 00:54:31 Re: Smaller data types use same disk space
Previous Message Scott Marlowe 2012-07-25 00:02:05 Re: Pg 9.1: Do I need to run vacuum analyze instead vacuum full?