Re: Hostnames, IDNs, Punycode and Unicode Case Folding

From: Mike Cardwell <pgsql(at)lists(dot)grepular(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Hostnames, IDNs, Punycode and Unicode Case Folding
Date: 2014-12-29 23:50:54
Message-ID: 20141229235054.GA23430@glue.grepular.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

* on the Mon, Dec 29, 2014 at 04:23:22PM -0700, David G Johnston wrote:

>>> WHERE hostname='nißan.com'
>>>
>>
>> _IF_ Postgres had a punycode function, then you could use:
>> WHERE punycode(hostname) = punycode('nißan.com')
>
> If the OP wraps what he is doing up into a function that is what you end up
> getting: a memoized punycode function.
>
> http://en.wikipedia.org/wiki/Memoization
>
> It has to be defined as volatile but basically write the function to check
> for the provided input on the indexed table and if it doesn't exist the
> function will calculate the punycode value and store it onto the table
> before returning the punycode value to the caller.

I'm not sure all that is necessary. It could be quite a simple function,
like the lower() function. So what I would do is this:

CREATE UNIQUE INDEX hostnames_hostname_key ON hostnames (lower(punycode_encode(hostname)));

That would prevent adding more than one representation for the same hostname
to the column. And I if I wanted to do a fast, indexed search where I could
supply any representation of the hostname as input, I would just do:

WHERE lower(punycode_encode(hostname)) = lower(punycode_encode('any-representation'))

There doesn't need to be any extra table storage for the punycode encoded
version.

--
Mike Cardwell https://grepular.com https://emailprivacytester.com
OpenPGP Key 35BC AF1D 3AA2 1F84 3DC3 B0CF 70A5 F512 0018 461F
XMPP OTR Key 8924 B06A 7917 AAF3 DBB1 BF1B 295C 3C78 3EF1 46B4

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David Johnston 2014-12-29 23:56:26 Re: Rollback on include error in psql
Previous Message Adrian Klaver 2014-12-29 23:38:30 Re: [GENERAL] Rollback on include error in psql