Re: Abnormal JSON query performance

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: 007reader <007reader(at)gmail(dot)com>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Abnormal JSON query performance
Date: 2018-05-12 07:36:51
Message-ID: CAFj8pRDU1AkYKOr2kRopK1V6Sacd+VTSbiMZ_e1hJ21ZsPxz9Q@mail.gmail.com
Lists: pgsql-bugs

2018-05-12 9:31 GMT+02:00 <007reader(at)gmail(dot)com>:

> Sorry again, autocorrect kills me. I use JSONB, not JSON. Parsing the
> entire document for jsonb doesn't make much sense. If jsonb is pre-parsed,
> selecting and assembling a few keys should be faster than assembling all of
> them.
>

jsonb is pre-parsed, but that is all. PostgreSQL cannot store parts of a
single value in separate blocks. It cannot do that for any type - json(b) is no exception.
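As an aside, when several keys are needed from the same document, a row-expanding function such as jsonb_to_record can pull them out in one call instead of one data->>'key' expression per key. A sketch (the key names key1..key3 are illustrative, and this assumes the column is jsonb):

SELECT r.key1, r.key2, r.key3
  FROM test t,
       jsonb_to_record(t.data) AS r(key1 text, key2 text, key3 text)
 LIMIT 1000;

Whether this is measurably faster than repeated ->> extraction depends on the document size and key count; it is worth benchmarking on your data.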

Regards

Pavel

>
>
> On May 11, 2018, at 10:03 PM, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
> wrote:
>
> Hi
>
> 2018-05-12 6:32 GMT+02:00 <007reader(at)gmail(dot)com>:
>
>> I observed a performance anomaly with JSON in version 10 that I believe
>> is incorrect, although I didn't test earlier versions.
>> Here is a test table:
>> create table test(
>>   id integer primary key,
>>   data json
>> );
>>
>> A simple query works fine on a table of 92,000 records with ~60KB average
>> record size.
>>
>> select data->>'key1' from test limit 1000;
>>
>> The query above takes about 155 ms. If I add an additional field to the
>> list of fields, execution time increases linearly - retrieving 2 fields
>> takes ~310 ms, 3 fields ~465 ms, and so on. The explain statement for 5
>> fields looks normal:
>>
>> Limit (cost=0.00..34.10 rows=1000 width=160) (actual time=1.783..813.643
>> rows=1000 loops=1)
>> -> Seq Scan on loans (cost=0.00..3165.85 rows=92834 width=160)
>> (actual time=1.753..805.629 rows=1000 loops=1)
>> Planning time: 8.546 ms
>> Execution time: 817.874 ms
>>
>> Looks like every record gets parsed separately for each key... Unless I'm
>> doing something incorrectly, this makes retrieval of individual keys
>> completely useless for practical cases. It'd leave only the unattractive
>> alternative of getting the entire JSON document. In my case, instead of
>> retrieving a few hundred bytes, it would force getting 60KB for each query...
>>
>
> It is not a bug. This behavior is expected.
>
> JSON values are stored as plain text in PostgreSQL and re-parsed every
> time they are accessed. If you need fast access to one specific field, you
> can use a functional index:
>
> CREATE INDEX ON test ((data->>'key1'));
>
> You can try the JSONB type - it is like JSON, but pre-parsed, so access
> to individual fields should be faster. Again, a functional index can be
> used, or special JSONB indexes for faster searching.
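> For example (a sketch - the index names and the key name are illustrative,
> and the GIN index assumes the column is jsonb rather than json):
>
> -- expression index on one extracted key (note the double parentheses)
> CREATE INDEX test_data_key1_idx ON test ((data->>'key1'));
>
> -- GIN index over the whole jsonb column, supporting containment
> -- queries such as: WHERE data @> '{"key1": "some value"}'
> CREATE INDEX test_data_gin_idx ON test USING GIN (data);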
>
> Don't forget - PostgreSQL is a relational database - sequential access to
> data is fastest when the data is stored in relational form, with each
> attribute in its own specific column. Non-atomic data types are more
> dynamic, but access is significantly slower - it is not important for
> hundreds of values, but it becomes clear when you read more than thousands
> of values.
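> For a key that is read on nearly every query, one option along these
> lines (a sketch - the key name is illustrative) is to materialize it as a
> plain column:
>
> ALTER TABLE test ADD COLUMN key1 text;
> UPDATE test SET key1 = data->>'key1';
>
> The column then behaves like any other relational attribute and can be
> scanned and indexed without touching the JSON document at all.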
>
> P.S. An index is your friend.
>
> Regards
>
> Pavel
>
>
>
>
>>
>> -Bob Jones
>>
>
>
