From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Calculage avg. width when operator = is missing |
Date: | 2015-09-22 18:58:09 |
Message-ID: | 5601A4C1.4060007@dunslane.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 09/22/2015 12:16 PM, Shulgin, Oleksandr wrote:
> Hi Hackers,
>
> I've recently stumbled upon a problem with table bloat estimation in
> case there are columns of type JSON.
>
> The quick bloat estimation queries use sum over pg_statistic.stawidth
> of table's columns, but in case of JSON the corresponding entry is
> never created by the ANALYZE command due to equality comparison
> operator missing. I understand why there is no such operator defined
> for this particular type, but shouldn't we still try to produce
> meaningful average width estimation?
>
> In my case the actual bloat is around 40% as verified with
> pgstattuple, while the bloat reported by quick estimate can be between
> 75% and 95%(!) in three instances of this problem. We're talking
> about some hundreds of GB of miscalculation.
>
> Attached patch against master makes the std_typanalyze still try to
> compute the minimal stats even if there is no "=" operator. Makes sense?
>
> I could also find this report in archives that talks about similar
> problem, but due to all values being over the analyze threshold:
>
> http://www.postgresql.org/message-id/flat/12480(dot)1389370514(at)sss(dot)pgh(dot)pa(dot)us#12480(dot)1389370514@sss.pgh.pa.us
>
> I think we could try harder, otherwise any estimate relying on average
> width can be way off in such cases.
Yes, "/revenons/ à /nos moutons/." You can set up text based comparison
ops fairly easily for json - you just need to be aware of the
limitations. See https://gist.github.com/adunstan/32ad224d7499d2603708
But I agree we should be able to do some analysis of types without
comparison ops.
cheers
andrew
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2015-09-22 19:11:27 | Re: [COMMITTERS] pgsql: Use gender-neutral language in documentation |
Previous Message | Robert Haas | 2015-09-22 18:52:49 | Re: Rework the way multixact truncations work |