From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Peter Geoghegan <pg(at)heroku(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Larry White <ljw1001(at)gmail(dot)com> |
Subject: | Re: jsonb format is pessimal for toast compression |
Date: | 2014-08-15 01:28:08 |
Message-ID: | 53ED6228.2060507@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> Before changing to to INT_MAX:
>
> thetype | colsize_distribution
> ---------+----------------------------
> json | {1741,1767,1854,1904,2292}
> jsonb | {3551,5866,5910,5958,6168}
>
> After:
>
> thetype | colsize_distribution
> ---------+----------------------------
> json | {1741,1767,1854,1904,2292}
> jsonb | {3515,3543,3636,3690,4038}
>
> So that did improve things, just not as much as we'd like.
And with Tom's test patch:
postgres=# select pg_size_pretty(pg_total_relation_size('jsonic'));
pg_size_pretty
----------------
394 MB
(1 row)
postgres=# select pg_size_pretty(pg_total_relation_size('jsonbish'));
pg_size_pretty
----------------
541 MB
(1 row)
thetype | colsize_distribution
---------+----------------------------
json | {1741,1767,1854,1904,2292}
jsonb | {2037,2114,2288,2348,2746}
Since that improved things a *lot*, just +40% instead of +200%, I
thought I'd test some select queries. I decided to test a GIN lookup
and value extraction, since indexed lookup is really what I care about.
9.4b2 no patches:
postgres=# explain analyze select row_to_json -> 'kt1_total_sum' from
jsonbish where row_to_json @> '{ "rpt_per_dt" : "2003-06-30" }';
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on jsonbish (cost=29.55..582.92 rows=200 width=18)
(actual time=20.814..2845.454 rows=100423 loops=1)
Recheck Cond: (row_to_json @> '{"rpt_per_dt": "2003-06-30"}'::jsonb)
Heap Blocks: exact=1471
-> Bitmap Index Scan on jsonbish_row_to_json_idx (cost=0.00..29.50
rows=200 width=0) (actual time=20.551..20.551 rows=100423 loops=1)
Index Cond: (row_to_json @> '{"rpt_per_dt": "2003-06-30"}'::jsonb)
Planning time: 0.102 ms
Execution time: 2856.179 ms
9.4b2 TL patch:
postgres=# explain analyze select row_to_json -> 'kt1_total_sum' from
jsonbish where row_to_json @> '{ "rpt_per_dt" : "2003-06-30" }';
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on jsonbish (cost=29.55..582.92 rows=200 width=18)
(actual time=24.071..5201.687 rows=100423 loops=1)
Recheck Cond: (row_to_json @> '{"rpt_per_dt": "2003-06-30"}'::jsonb)
Heap Blocks: exact=1471
-> Bitmap Index Scan on jsonbish_row_to_json_idx (cost=0.00..29.50
rows=200 width=0) (actual time=23.779..23.779 rows=100423 loops=1)
Index Cond: (row_to_json @> '{"rpt_per_dt": "2003-06-30"}'::jsonb)
Planning time: 0.098 ms
Execution time: 5214.212 ms
... so, an 80% increase in lookup and extraction time for swapping
offsets for lengths. That's actually all extraction time; I tried
removing the extraction from the query, and without it both queries are
close enough to be statstically insignificant.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2014-08-15 02:16:15 | run xmllint during build (was Re: need xmllint on borka) |
Previous Message | Josh Berkus | 2014-08-15 01:03:57 | Re: Reporting the commit LSN at commit time |