Re: long analyze, libc bug and libicu

From: Grigory Smolkin <g(dot)smolkin(at)postgrespro(dot)ru>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: long analyze, libc bug and libicu
Date: 2018-07-07 09:18:59
Message-ID: 2da0c289-bfc8-a40f-e13b-1b2943746e9e@postgrespro.ru
Lists: pgsql-bugs

On 07/07/2018 10:10 AM, Peter Eisentraut wrote:

> On 05.07.18 17:05, Grigory Smolkin wrote:
>> Why does ANALYZE ignore the column COLLATE?
> I think the statistics would be mostly the same independent of which
> collation you use. This could possibly be refined, but I don't think
> it's a major problem right now.
>

Thank you for your interest in this problem!
> I think the statistics would be mostly the same independent of which
> collation you use.

I assumed that one of the goals of using libicu is to be independent
of libc collation and its bugs and inconsistencies, but ANALYZE is
currently forced to use libc anyway, which undermines that goal.
> This could possibly be refined, but I don't think
> it's a major problem right now.

It's a major problem for people who use the Thai alphabet.
Attached is a data sample (33 MB on my machine). Running ANALYZE on it
gives the following result:

postgres=# ANALYZE t_icu_coll;
ANALYZE
Time: 2252086.648 ms

37 minutes on a 33 MB table is painful. On big tables, autovacuum ANALYZE
runs for hours, starving autovacuum VACUUM of worker
slots (autovacuum_max_workers).
Another major problem is that while it is in strcoll_l(), the backend
process ignores pg_terminate_backend()/pg_cancel_backend().

With the attached patch, this problem goes away:

postgres=# analyze t_icu_coll;
ANALYZE
Time: 161.419 ms
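
As a rough illustration of the libc comparison path discussed above (this is
not PostgreSQL code), Python's locale module wraps the C library's collation
routines, so the same kind of sort can be tried outside the database. The
helper name sort_with_libc is hypothetical, and the sketch defaults to the "C"
locale only so that it runs anywhere; whether a Thai locale such as
th_TH.UTF-8 is available depends on what is installed on the machine.

```python
import functools
import locale


def sort_with_libc(values, locale_name="C"):
    """Sort strings with the C library's collation for locale_name.

    locale_name is an assumption for illustration; pass an installed
    locale (for example "th_TH.UTF-8") to exercise its libc collation.
    """
    # Remember the current LC_COLLATE setting so it can be restored.
    previous = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # locale.strcoll delegates the comparison to the C library.
        return sorted(values, key=functools.cmp_to_key(locale.strcoll))
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)


print(sort_with_libc(["b", "a", "c"]))
```

Timing this function on a sample of the affected strings under a Thai locale
would be one way to check whether the slowdown reproduces in libc alone.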

--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment                             Content-Type      Size
t_icu_coll.sql.gz                      application/gzip  3.2 MB
analyze_honour_column_collation.patch  text/x-patch      675 bytes
