From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: rabt(at)dim(dot)uchile(dot)cl
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: BIG files
Date: 2005-06-19 18:20:50
Message-ID: 18672.1119205250@sss.pgh.pa.us
Lists: pgsql-novice
rabt(at)dim(dot)uchile(dot)cl writes:
> I work with VERY large datasets: 60 monthly tables with 700,000 rows and 99
> columns each, mostly large numeric values (15 digits, NUMERIC(15,0)
> datatype; not all columns are filled).
> The main problem is disk space. The database files stored in Postgres
> take 4 or 5 times more space than in MySQL. Just to be sure, after
> each bulk load I performed a VACUUM FULL to reclaim any possible lost
> space, but nothing gets reclaimed. My plain-text dump files with
> INSERTs are just 150 MB in size, while the files in the Postgres
> directory are more than 1 GB each!!
Hmm ... it's conventional wisdom that a PG data file will be larger than
an equivalent text file, but you seem to have a much worse case than I'd
expect. As Bruno noted, BIGINT sounds like a better choice than NUMERIC
for your data, but even with NUMERIC I don't see where the bloat is
coming from. A 15-digit NUMERIC value ought to occupy 16 bytes on disk
(8 bytes of overhead + 4 base-10000 digits at 2 bytes each), so that
ought to be just about one-for-one with the text representation. And
with 99 of those per row, the 32-byte row header overhead isn't
what's killing you either.
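If you want to verify the per-value cost on your own installation, a quick
check along these lines would do it (just a sketch: pg_column_size() only
exists in 8.1 and later, and the table and column names below are made-up
placeholders):

    -- Per-value on-disk size of the same 15-digit value in both types
    -- (exact byte counts vary somewhat across server versions):
    SELECT pg_column_size(123456789012345::numeric(15,0)) AS numeric_bytes,
           pg_column_size(123456789012345::bigint)        AS bigint_bytes;

    -- Converting a column in place (note: this rewrites the whole table
    -- and holds an exclusive lock while it runs):
    ALTER TABLE monthly_data ALTER COLUMN amount TYPE bigint;

Since BIGINT is a fixed 8 bytes, 99 such columns would drop from roughly
1584 bytes per row to 792, before any other overhead.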
Could we see the exact table declaration (including indexes) and a few
rows of sample data?
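psql output along these lines would show everything needed ("monthly_data"
is again just a placeholder for your real table name):

    \d monthly_data
    SELECT * FROM monthly_data LIMIT 3;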
regards, tom lane