From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | Peter Geoghegan <pg(at)heroku(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Larry White <ljw1001(at)gmail(dot)com> |
Subject: | Re: jsonb format is pessimal for toast compression |
Date: | 2014-08-14 00:52:30 |
Message-ID: | 14953.1407977550@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Bruce Momjian <bruce(at)momjian(dot)us> writes:
> Seems we have two issues:
> 1) the header makes testing for compression likely to fail
> 2) use of pointers rather than offsets reduces compression potential
> I understand we are focusing on #1, but how much does compression reduce
> the storage size with and without #2? Seems we need to know that answer
> before deciding if it is worth reducing the ability to do fast lookups
> with #2.
That's a fair question. I did a very very simple hack to replace the item
offsets with item lengths -- turns out that that mostly requires removing
some code that changes lengths to offsets ;-). I then loaded up Larry's
example of a noncompressible JSON value, and compared pg_column_size()
which is just about the right thing here since it reports datum size after
compression. Remembering that the textual representation is 12353 bytes:
json: 382 bytes
jsonb, using offsets: 12593 bytes
jsonb, using lengths: 406 bytes
So this confirms my suspicion that the choice of offsets not lengths
is what's killing compressibility. If it used lengths, jsonb would be
very nearly as compressible as the original text.
Hack attached in case anyone wants to collect more thorough statistics.
We'd not actually want to do it like this because of the large expense
of recomputing the offsets on-demand all the time. (It does pass the
regression tests, for what that's worth.)
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
jsonb-with-lengths-hack.patch | text/x-diff | 2.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2014-08-14 01:01:43 | Re: jsonb format is pessimal for toast compression |
Previous Message | Stephen Frost | 2014-08-14 00:26:12 | Re: replication commands and log_statements |