Re: Speeding up JSON + TSQUERY + GIN

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: "Sven R(dot) Kunze" <srkunze(at)mail(dot)de>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Speeding up JSON + TSQUERY + GIN
Date: 2017-03-06 04:25:23
Message-ID: CAMkU=1wvV2pgCJtSfj5HOQuK4255G5S2XGftw-D3b3Vx6YBoNw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Thu, Mar 2, 2017 at 1:19 PM, Sven R. Kunze <srkunze(at)mail(dot)de> wrote:

> On 01.03.2017 18:04, Jeff Janes wrote:
>
> On Wed, Mar 1, 2017 at 6:02 AM, Sven R. Kunze <srkunze(at)mail(dot)de> wrote:
>
>> On 28.02.2017 17:49, Jeff Janes wrote:
>>
>> Oh. In my hands, it works very well. I get 70 seconds to do the {age:
>> 20} query from pure cold caches, versus 1.4 seconds from cold caches which
>> was followed by pg_prewarm('docs','prefetch').
>>
>> How much RAM do you have? Maybe you don't have enough to hold the table
>> in RAM. What kind of IO system? And what OS?
>>
>>
>> On my test system:
>>
>> RAM: 4GB
>> IO: SSD (random_page_cost = 1.0)
>> OS: Ubuntu 16.04
>>
>
>
> 4GB is not much RAM to be trying to pre-warm this amount of data into.
> Towards the end of the pg_prewarm, it is probably evicting data read in by
> the earlier part of it.
>
> What is shared_buffers?
>
>
> 942MB.
>
> But I see where you are coming from. How come that these queries need a
> Recheck Cond? I gather that this would require reading not only the index
> data but also the table itself which could be huge, right?
>

Bitmaps can overflow and drop the row-level information, tracking only the
blocks which need to be inspected. So it has to have a recheck in case
that happens (although in your case it is not actually overflowing--but it
still needs to be prepared for that). Also, I think that jsonb_path_ops
indexes the hashes of the paths, so it can deliver false positives which
need to be rechecked. And you are selecting `id`, which is not in the
index so it would have to consult the table anyway to retrieve that. Even
if it could get all the data from the index itself, I don't think GIN
indexes support that feature.

Cheers,

Jeff

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Justin Pryzby 2017-03-06 05:24:02 Re: Performance issue in PostgreSQL server...
Previous Message Jeff Janes 2017-03-06 04:23:08 Re: Performance issue in PostgreSQL server...