JSON path

From: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: JSON path
Date: 2019-11-14 13:55:28
Message-ID: 71a1d71e-5263-2980-117c-066a37e481bc@redhat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

We have a table which has a jsonb column in it. Each row contains a lot
of data in that column, so TOASTed.

We have to extract data from that column at different levels, so an
example query could look like

select
col1,
col2,
jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5.val1."text()"')
as val1,
jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5.val2."text()"')
as val2,
jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5.val3."text()"')
as val3
from tbl
where
id = 1;

I tried to rewrite it to

WITH foo AS (select
id,
col1,
col2,
jsonb_path_query_first(data,'$.lvl1.lvl2.lvl3.lvl4.lvl5') as jsondata,
from tbl )
select
col1,
col2,
jsondata->val1->'text()' as val1,
jsondata->val2->'text()' as val2,
jsondata->val3->'text()' as val3
from foo
where
id = 1;

However, WITH has the same run-time profile - most of the time is spent
in pglz_decompress. Using the -> notation has the same profile.

The more data I extract from the JSON object the slower the query gets.

Of course, if I change the column to EXTERNAL we see a ~3.5 x speedup in
the queries but disk space requirements goes up by too much.

(We need to use a jsonb column as the data is unstructured, and may
differ in structure between rows. Yes, yes, I know...)

PostgreSQL 12.x on RHEL.

If anybody has some good ideas it would be appreciated.

Thanks in advance !

Best regards,
Jesper

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2019-11-14 18:04:36 Re: JSON path
Previous Message Luís Roberto Weck 2019-11-14 11:14:25 Re: Parallel Query