From: | Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, Mahendra Thalor <mahendra(dot)thalor(at)enterprisedb(dot)com>, Oleg Bartunov <obartunov(at)postgrespro(dot)ru> |
Subject: | Re: Collecting statistics about contents of JSONB columns |
Date: | 2022-01-05 20:22:31 |
Message-ID: | CANbhV-FgzKvB5Drz9R8yprC2=O156mYmncLdsRxE8CxyYzadJA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, 31 Dec 2021 at 22:07, Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> The patch does something far more
> elegant - it simply uses stavalues to store an array of JSONB documents,
> each describing stats for one path extracted from the sampled documents.
Sounds good.
> I'm sure there's plenty open questions - for example I think we'll need
> some logic to decide which paths to keep, otherwise the statistics can
> get quite big, if we're dealing with large / variable documents. We're
> already doing something similar for MCV lists.
>
> One of Nikita's patches not included in this thread allow "selective"
> statistics, where you can define in advance a "filter" restricting which
> parts are considered interesting by ANALYZE. That's interesting, but I
> think we need some simple MCV-like heuristics first anyway.
>
> Another open question is how deep the stats should be. Imagine documents
> like this:
>
> {"a" : {"b" : {"c" : {"d" : ...}}}}
>
> The current patch build stats for all possible paths:
>
> "a"
> "a.b"
> "a.b.c"
> "a.b.c.d"
>
> and of course many of the paths will often have JSONB documents as
> values, not simple scalar values. I wonder if we should limit the depth
> somehow, and maybe build stats only for scalar values.
The user interface for designing filters sounds hard, so I'd hope we
can ignore that for now.
--
Simon Riggs http://www.EnterpriseDB.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Gustafsson | 2022-01-05 20:56:26 | Re: biblio.sgml dead link |
Previous Message | Simon Riggs | 2022-01-05 20:19:30 | Re: Logical insert/update/delete WAL records for custom table AMs |