Quick Links

Re: tuple data size and compression

From:	Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To:	Tom DalPozzo <t(dot)dalpozzo(at)gmail(dot)com>
Cc:	pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject:	Re: tuple data size and compression
Date:	2016-12-16 15:10:24
Message-ID:	38569cb0-7de0-4015-feac-755950955ec6@aklaver.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On 12/16/2016 12:23 AM, Tom DalPozzo wrote:
> I see. But in my case rows don't reach that thresold (I didn't
> check if
> 2K but I didn't change anything). So I'm wondering if there is
> any other
> chance except the TOAST to get the rows compressed or not.
>
>
> Are you really sure you want that? For small files the overhead of
> compression tends to out weigh the benefits. A contrived example
> biased to making my point:
>
> aklaver(at)killi:~> dd if=/dev/urandom of=file.txt bs=10 count=10
> 10+0 records in
> 10+0 records out
> 100 bytes (100 B) copied, 0.253617 s, 0.4 kB/s
> aklaver(at)killi:~> l -h file.txt
> -rw-r--r-- 1 aklaver users 100 Dec 15 13:07 file.txt
> aklaver(at)killi:~> gzip file.txt
> aklaver(at)killi:~> l -h file.txt.gz
> -rw-r--r-- 1 aklaver users 132 Dec 15 13:07 file.txt.gz
>
>
> I noticed that, when I use constant data, the total IO writes (by
> iostat) are more or less 1/2 of the the total IO writes when using
> random or other data hard to compress.
>
>
> Define constant data?
>
> I thought the data you are inputting is below the compression threshold?
>
> Is I/O causing a problem or to put it another way, what is the
> problem you are trying to solve?
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>
>
> Hi Adrian,
> I haven't got a problem. I'm just trying to estimate a worst case of
> total IO writes of my DB on the field, over the years.

Well worst case is some program goes haywire and saturates you I0 with
writes. I would think a more reasonable estimate would be the average
write load with a +/- standard deviation. Though that would entail
guessing more about the future then I would be willing to commit to.

> If tuples get compressed, then all depends on the data compressibility.
> So, either 1) I understand if/when they get compressed or 2) I must
> simulate with different kinds of data (compressible and not).
> Furthermore, in general I like, if possible, to be aware of what is
> going behind the scenes.
> As for 2) I made 3 tests, each inserting 1million rows of 100 bytea on
> my 'dati' field. First I tried 100 bytes=constant data=\\x7b7b7b7b....
> then I tried 100 bytes=random and finally inserting 100 bytes taken
> from random offsets of a compressed file.
> It seems that IO writes strongly depend on the data, even if dati field
> length is just 100.

That goes without saying.

Two questions:

1) Do you know what your data in the future is going to be?

2) Is a 100 byte bytea a realistic approximation of that data?

> wal_compression=off.
> Regards
> Pupillo
>
>
>
>
>
>
>
>
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Re: tuple data size and compression at 2016-12-16 08:23:06 from Tom DalPozzo

Responses

Re: tuple data size and compression at 2016-12-16 16:45:45 from Tom DalPozzo

Browse pgsql-general by date

	From	Date	Subject
Next Message	Vladimir Rusinov	2016-12-16 15:59:29	Re: pgbench initialize
Previous Message	Thomas.Deboben.ext	2016-12-16 15:01:25	Re: Windows installation - could not connect to server: Connection refused (0x0000274D/10061)