Re: jsonb format is pessimal for toast compression

From: Laurence Rowe <l(at)lrowe(dot)co(dot)uk>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Arthur Silva <arthurprs(at)gmail(dot)com>, Larry White <ljw1001(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)heroku(dot)com>, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
Subject: Re: jsonb format is pessimal for toast compression
Date: 2014-08-26 19:16:18
Message-ID: CAOycyLSAFEu-jJMr_F6YtfRSGiPzJLs88R02_JMXEX3pp++rFA@mail.gmail.com
Lists: pgsql-hackers

On 26 August 2014 11:34, Josh Berkus <josh(at)agliodbs(dot)com> wrote:

> On 08/26/2014 07:51 AM, Tom Lane wrote:
> > My feeling about it at this point is that the apparent speed gain from
> > using offsets is illusory: in practically all real-world cases where there
> > are enough keys or array elements for it to matter, costs associated with
> > compression (or rather failure to compress) will dominate any savings we
> > get from offset-assisted lookups. I agree that the evidence for this
> > opinion is pretty thin ... but the evidence against it is nonexistent.
>
> Well, I have shown one test case which shows where lengths is a net
> penalty. However, for that to be the case, you have to have the
> following conditions *all* be true:
>
> * lots of top-level keys
> * short values
> * rows which are on the borderline for TOAST
> * table which fits in RAM
>
> ... so that's a "special case" and if it's sub-optimal, no biggie. Also,
> it's not like it's an order of magnitude slower.
>
> Anyway, I called for feedback on my blog, and have gotten some:
>
> http://www.databasesoup.com/2014/08/the-great-jsonb-tradeoff.html

It would be really interesting to see your results with column STORAGE
EXTERNAL for that benchmark. I think it is important to separate the
slowdown caused by decompression now being needed from any slowdown
inherent in the new format; we can always switch off compression on a
per-column basis using STORAGE EXTERNAL.
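For reference, switching a jsonb column to uncompressed out-of-line storage might look like this (table and column names here just mirror my test setup below; adjust to your benchmark's schema):

```sql
-- EXTERNAL keeps out-of-line TOAST storage but disables compression
-- for this column, so lookups skip the decompression step entirely.
ALTER TABLE uncompressed ALTER COLUMN properties SET STORAGE EXTERNAL;
```

Note this only affects newly stored rows; existing values stay as they were written, so the table should be reloaded (or the column rewritten) before re-running the benchmark.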

My JSON data has smallish objects with a small number of keys; it barely
compresses at all with the patch and shows results similar to Arthur's
data. Across ~500K rows I get:

encoded=# select count(properties->>'submitted_by') from compressed;
count
--------
431948
(1 row)

Time: 250.512 ms

encoded=# select count(properties->>'submitted_by') from uncompressed;
count
--------
431948
(1 row)

Time: 218.552 ms

Laurence
