Re: speed up unicode decomposition and recomposition

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speed up unicode decomposition and recomposition
Date: 2020-10-15 00:25:23
Message-ID: 20201015002523.GA2305@paquier.xyz
Lists: pgsql-hackers

On Wed, Oct 14, 2020 at 01:06:40PM -0400, Tom Lane wrote:
> John Naylor <john(dot)naylor(at)enterprisedb(dot)com> writes:
>> Some other considerations:
>> - As I alluded above, this adds ~26kB to libpq because of SASLPrep. Since
>> the decomp array was reordered to optimize linear search, it can no longer
>> be used for binary search. It's possible to build two arrays, one for
>> frontend and one for backend, but that's additional complexity. We could
>> also force frontend to do a linear search all the time, but that seems
>> foolish. I haven't checked if it's possible to exclude the hash from
>> backend's libpq.
>
> IIUC, the only place libpq uses this is to process a password-sized string
> or two during connection establishment. It seems quite silly to add
> 26kB in order to make that faster. Seems like a nice speedup on the
> backend side, but I'd vote for keeping the frontend as-is.

Agreed. Let's use the perfect hash only in the backend. It would be
nice to avoid generating an extra copy of the decomposition table for
that, and a table ordered by codepoints is easier to read. What do you
think the performance impact would be if the linear search used the
codepoint-ordered table rather than the most-optimized ordering?
--
Michael
