Re: tuple data size and compression

From: Tom DalPozzo <t(dot)dalpozzo(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: tuple data size and compression
Date: 2016-12-16 08:23:06
Message-ID: CAK77FCSdnSHtpermhLHKKnUXu9N9UeF-RRDpO=QwvZ6a4fvodA@mail.gmail.com
Lists: pgsql-general

>
>> I see. But in my case rows don't reach that threshold (I didn't check if
>> it's 2K, but I didn't change anything). So I'm wondering if there is any
>> other chance except TOAST to get the rows compressed or not.
>>
>
> Are you really sure you want that? For small files the overhead of
> compression tends to outweigh the benefits. A contrived example biased to
> making my point:
>
> aklaver(at)killi:~> dd if=/dev/urandom of=file.txt bs=10 count=10
> 10+0 records in
> 10+0 records out
> 100 bytes (100 B) copied, 0.253617 s, 0.4 kB/s
> aklaver(at)killi:~> l -h file.txt
> -rw-r--r-- 1 aklaver users 100 Dec 15 13:07 file.txt
> aklaver(at)killi:~> gzip file.txt
> aklaver(at)killi:~> l -h file.txt.gz
> -rw-r--r-- 1 aklaver users 132 Dec 15 13:07 file.txt.gz
>
>
>> I noticed that, when I use constant data, the total IO writes (by
>> iostat) are more or less 1/2 of the total IO writes when using
>> random or other data hard to compress.
>>
>
> Define constant data?
>
> I thought the data you are inputting is below the compression threshold?
>
> Is I/O causing a problem? Or, to put it another way, what is the problem
> you are trying to solve?
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com

Hi Adrian,
I haven't got a problem. I'm just trying to estimate a worst case for the
total IO writes of my DB in the field, over the years.
If tuples get compressed, then everything depends on the data's
compressibility.
So, either 1) I understand if/when they get compressed, or 2) I must
simulate with different kinds of data (compressible and not).
Furthermore, in general I like, if possible, to be aware of what is going
on behind the scenes.
As for 2), I ran 3 tests, each inserting 1 million rows with 100 bytes in
my 'dati' bytea field. First I tried 100 bytes of constant data
(\x7b7b7b7b....), then 100 random bytes, and finally 100 bytes taken from
random offsets of a compressed file.
It seems that IO writes strongly depend on the data, even though the dati
field length is just 100.
wal_compression=off.
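For what it's worth, the compressibility gap between those payloads can be
sketched with zlib as a rough stand-in (PostgreSQL's pglz is a different
algorithm, so the numbers won't match, but the relative difference between
constant and random data tells the same story; the variable names below are
illustrative, not from my tests):

```python
import os
import zlib

# Two of the three 100-byte payloads from the tests described above:
constant = b"\x7b" * 100        # constant data: compresses very well
random_bytes = os.urandom(100)  # random data: effectively incompressible
# (The third case, bytes sampled from a compressed file, behaves like
# the random case.)

for label, payload in (("constant", constant), ("random", random_bytes)):
    compressed = zlib.compress(payload)
    print(f"{label}: {len(payload)} bytes -> {len(compressed)} compressed")
```

The constant payload shrinks to a handful of bytes, while the random payload
actually grows slightly because of the compression header overhead, which is
the same effect Adrian's gzip example showed.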
Regards
Pupillo
