From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Marc-Olaf Jaschke <marc-olaf(dot)jaschke(at)s24(dot)com> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: performance issue with bitmap index scans on huge amounts of big jsonb documents |
Date: | 2016-12-05 02:28:16 |
Message-ID: | CAMkU=1y+_XTgLtA2uL01jX4=E_vFVa4YnE9_ZMA704uRLAB5yg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
> big_jsonb @> '[{"x": 1, "filler": "cfcd208495d565ef66e7dff9f98764da"}]';
> I wonder why bitmap heap scan adds such a big amount of time on top of the
> plain bitmap index scan.
> It seems to me, that the recheck is active although all blocks are exact
> [1] and that pg is loading the jsonb for the recheck.
>
> Is this an expected behavior?
>
Yes, this is expected. The gin index is lossy. It knows that all the
elements are present (except when it doesn't--large elements might get
hashed down and suffer hash collisions), but it doesn't know what the
recursive structure between them is, and has to do a recheck.
For example, if you change your example where clause to:
big_jsonb @> '[{"filler": 1, "x": "cfcd208495d565ef66e7dff9f98764da"}]';
You will see that the index still returns 50,000 rows, but now all of them
get rejected upon the recheck.
You could try changing the type of index to jsonb_path_ops. In your given
example, it won't make a difference, because you are actually counting half
the table and so half the table needs to be rechecked. But in my example,
jsonb_path_ops successfully rejects all the rows at the index stage.
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Marc-Olaf Jaschke | 2016-12-05 14:51:19 | Re: performance issue with bitmap index scans on huge amounts of big jsonb documents |
Previous Message | Bill Measday | 2016-12-03 20:42:02 | Re: Substantial different index use between 9.5 and 9.6 |