From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SCRAM authentication, take three
Date: 2017-02-15 10:58:22
Message-ID: a67d873a-fa0b-ba97-b2f8-89efb0ecdf65@iki.fi
Lists: pgsql-hackers
On 02/09/2017 09:33 AM, Michael Paquier wrote:
> On Tue, Feb 7, 2017 at 11:28 AM, Michael Paquier
> <michael(dot)paquier(at)gmail(dot)com> wrote:
>> Yes, I am actively working on this one now. I am trying to come up
>> with something in the shape of an extension first, and get a patch out
>> of it; that will be simpler for testing. For now, the work that really
>> remains in the patches attached on this thread is the internal
>> processing itself; the UTF8-related routines needed to work on the
>> strings are already present in scram-common.c.
>
> It took me a couple of days... Attached is a prototype implementing
> SASLprep(), or NFKC if you prefer, for UTF-8 strings.
Cool!
> Now, using this module, I have arrived at the following conclusions
> for keeping the size of the conversion tables to a minimum, without
> impacting lookup performance much:
> - Out of roughly 30k characters in total, 24k have a combining class
> of 0 and no decomposition; those need to be dropped from the
> conversion table.
> - Most characters have a decomposition of one or two characters; one
> has a decomposition of 18 characters. So we need to create two sets of
> conversion tables:
> -- A base table, with the character number (4 bytes), the combining
> class (1 byte) and the size of the decomposition (1 byte).
> -- A set of decomposition tables, grouped by decomposition size.
> So when decomposing a character, we first check the size of its
> decomposition, then fetch the sequence from the corresponding table.
Sounds good.
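Just to make sure I'm reading the layout the same way, here's a minimal
standalone sketch of how I understand the tables to fit together. The
struct name and the sample rows are made up for illustration, not taken
from your patch:

#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/*
 * Base table entry: one row per code point that has a nonzero combining
 * class or a decomposition.  Characters with combining class 0 and no
 * decomposition are left out entirely, as you describe.
 */
typedef struct
{
	uint32_t	codepoint;		/* character number (4 bytes) */
	uint8_t		comb_class;		/* canonical combining class (1 byte) */
	uint8_t		dec_size;		/* length of the decomposition (1 byte) */
	uint16_t	dec_index;		/* offset into the size-specific table */
} unicode_base_entry;

/* decomposition tables, one per decomposition length */
static const uint32_t decomp_size1[] = {
	0x0043,						/* U+2102 -> LATIN CAPITAL LETTER C */
};
static const uint32_t decomp_size2[][2] = {
	{0x0041, 0x0300},			/* U+00C0 -> 'A' + COMBINING GRAVE ACCENT */
};

/* entries sorted by code point, so a binary search works */
static const unicode_base_entry base_table[] = {
	{0x00C0, 0, 2, 0},
	{0x0300, 230, 0, 0},
	{0x2102, 0, 1, 0},
};

static const unicode_base_entry *
lookup_entry(uint32_t cp)
{
	size_t		lo = 0;
	size_t		hi = sizeof(base_table) / sizeof(base_table[0]);

	while (lo < hi)
	{
		size_t		mid = (lo + hi) / 2;

		if (base_table[mid].codepoint == cp)
			return &base_table[mid];
		if (base_table[mid].codepoint < cp)
			lo = mid + 1;
		else
			hi = mid;
	}
	return NULL;				/* combining class 0, no decomposition */
}

int
main(void)
{
	const unicode_base_entry *e = lookup_entry(0x00C0);

	if (e != NULL && e->dec_size == 2)
		printf("U+00C0 -> U+%04X U+%04X\n",
			   (unsigned int) decomp_size2[e->dec_index][0],
			   (unsigned int) decomp_size2[e->dec_index][1]);
	return 0;
}

(Note how the unused decomp_size1 table would hold the single-character
decompositions; a real generator script would of course emit all of these
tables from UnicodeData.txt.)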
> Now, regarding the shape of the implementation for SCRAM, we need one
> thing: a set of routines in src/common/ to build decompositions for a
> given UTF-8 string, covering the conversion UTF8 string <=> pg_wchar
> array, the decomposition, and the reordering. The attached extension
> roughly implements that. We could also have in contrib/ a module that
> does NFK[C|D] using the base APIs in src/common/. By using arrays of
> pg_wchar (integers) to manipulate the characters, we can validate the
> result and have a set of regression tests that do *not* have to print
> non-ASCII characters.
A contrib module or built-in extra functions to deal with Unicode
characters might be handy for a lot of things. But I'd leave that out
for now, to keep this patch minimal.
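To make the reordering step concrete, here is a minimal standalone sketch
of canonical reordering on a pg_wchar array. The combining-class lookup is
hard-coded for two characters; the real code would consult the base table
discussed above:

#include <stdio.h>
#include <stdint.h>

typedef uint32_t pg_wchar;

/*
 * Hard-coded combining-class lookup, for this sketch only; the real
 * implementation would use the conversion tables in src/common/.
 */
static uint8_t
get_comb_class(pg_wchar cp)
{
	if (cp == 0x0300)			/* COMBINING GRAVE ACCENT */
		return 230;
	if (cp == 0x0323)			/* COMBINING DOT BELOW */
		return 220;
	return 0;
}

/*
 * Canonical reordering: stable-sort each run of characters with a
 * nonzero combining class by that class.  A simple insertion sort is
 * enough, since combining sequences are short.
 */
static void
canonical_reorder(pg_wchar *str, int len)
{
	for (int i = 1; i < len; i++)
	{
		uint8_t		cc = get_comb_class(str[i]);

		if (cc == 0)
			continue;
		for (int j = i; j > 0; j--)
		{
			uint8_t		prev = get_comb_class(str[j - 1]);
			pg_wchar	tmp;

			if (prev == 0 || prev <= cc)
				break;			/* never move across a starter */
			tmp = str[j];
			str[j] = str[j - 1];
			str[j - 1] = tmp;
		}
	}
}

int
main(void)
{
	/* 'a' + COMBINING GRAVE + COMBINING DOT BELOW, out of canonical order */
	pg_wchar	s[] = {0x0061, 0x0300, 0x0323};

	canonical_reorder(s, 3);
	for (int i = 0; i < 3; i++)
		printf("U+%04X ", (unsigned int) s[i]);
	printf("\n");				/* expected: U+0061 U+0323 U+0300 */
	return 0;
}

Working on pg_wchar arrays like this also keeps the regression tests free
of non-ASCII output, as you note above.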
- Heikki