Re: Abnormal JSON query performance

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: 007reader(at)gmail(dot)com
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Abnormal JSON query performance
Date: 2018-05-12 05:03:41
Message-ID: CAFj8pRAaD9zXF_tXLohAgaxMdj8Fb_oT_WsMU24em8ymPWKKrA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi

2018-05-12 6:32 GMT+02:00 <007reader(at)gmail(dot)com>:

> I observed performance anomaly with JSON that I believe is incorrect in
> version 10, although didn’t test earlier ones.
> Here is a test table:
> create table test(
> id integer primary key,
> data json
> );
>
> A simple query works fine on a table of 92,000 records with ~60KB average
> record size.
>
> select data->>key1 from test limit 1000;
>
> The query above takes about 155 ms. If I add additional field to the list
> of fields, execution time increases linearly - retrieving 2 fields take
> ~310 ms, 3 fields ~ 465 ms and so on. The explain statement for 5 fields
> looks normal
>
> Limit (cost=0.00..34.10 rows=1000 width=160) (actual time=1.783..813.643
> rows=1000 loops=1)
> -> Seq Scan on loans (cost=0.00..3165.85 rows=92834 width=160)
> (actual time=1.753..805.629 rows=1000 loops=1)
> Planning time: 8.546 ms
> Execution time: 817.874 ms
>
> Looks like every record gets parsed separately for each key... Unless I’m
> doing something incorrectly, this makes retrieval of individual key
> completely useless for practical cases. It’d leave only unattractive
> alternative of getting the entire JSON document. In my case, instead of
> retrieving a few hundred byte, it would force getting 60KB for each query...
>

It is not a bug. This behave is expected.

Json values are stored as plain text in PostgreSQL. Every time if it is
necessary it is parsed. If need fast access to one specific field, then you
can use functional index

CREATE INDEX ON test(data->>key1);

You can try JSONB type - it is like Json, but it is preparsed, so access to
some fields should be faster. Again, there can be used functional index, or
special JSONB indexes for faster searching.

Don't forget - PostgreSQL is relational databases - the sequential access
to data will be most fast, when data will be stored in relational form -
any attribute is stored in one specific column. No atomic data types are
more dynamic, but access is significantly slower - it is not important for
hundreds values, but it is clean when you read more than thousands values,
..

p.s. index is your friend

Regards

Pavel

>
> -Bob Jones
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message John Bester 2018-05-12 05:37:52 pg_upgradecluster 9.5 to 10.3
Previous Message 007reader 2018-05-12 04:32:42 Abnormal JSON query performance