Re: Hostnames, IDNs, Punycode and Unicode Case Folding

From: Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Hostnames, IDNs, Punycode and Unicode Case Folding
Date: 2014-12-30 01:53:09
Message-ID: 20141230015309.GJ54847@crankycanuck.ca
Lists: pgsql-general

On Tue, Dec 30, 2014 at 12:53:42AM +0000, Mike Cardwell wrote:
> > Hmm. How did you get the original, then?
>
> The "original" in my case, is the hostname which the end user supplied.
> Essentially, when I display it back to them, I want to make sure it is
> displayed the same way that it was when they originally submitted it.

Ah. This gets even better™ for you, then, because whereas in IDNA2003
you can pass it an old-fashioned LDH name (letter, digit, hyphen),
IDNA2008 treats those as _outside_ the spec. So basically, you first
have to get a label and determine whether it is LDH or not (you can do
this by checking for any octet outside the LDH range) and then you can
decide which way to process it. In IDNA2003, the punycode output from
an LDH label turns out always to be the LDH label. IDNA2008 excludes
them because you're supposed to validate that a U-label is really a
U-label before registering in IDNA2008, and lots of perfectly good LDH
labels (like EXAMPLE) are not valid under IDNA2008 because of upper
case.
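
As a rough illustration of the "check for any octet outside the LDH
range" step, here is a minimal Python sketch. The names is_ldh_label
and classify_hostname are made up for this example and are not taken
from any IDNA library. It only sorts labels into "ldh" and "non-ldh";
it does not validate anything against IDNA2008, and note that an
"xn--" label passes the octet check while still needing separate
handling. The last lines lean on Python's built-in "idna" codec,
which implements IDNA2003, to show an LDH label coming back unchanged,
upper case included.

```python
import string

# LDH = ASCII letters, digits, and hyphen. Iterating a bytes object
# yields ints, so this is a set of allowed octet values.
LDH_OCTETS = frozenset((string.ascii_letters + string.digits + "-").encode("ascii"))


def is_ldh_label(label: bytes) -> bool:
    """Return True if the label is non-empty and every octet is LDH."""
    return len(label) > 0 and all(octet in LDH_OCTETS for octet in label)


def classify_hostname(hostname: str) -> list:
    """Split on '.' and tag each label as 'ldh' or 'non-ldh'.

    Hypothetical helper: the caller decides how to process each kind.
    """
    result = []
    for label in hostname.split("."):
        kind = "ldh" if is_ldh_label(label.encode("utf-8")) else "non-ldh"
        result.append((label, kind))
    return result


# Python's stdlib "idna" codec is IDNA2003: a pure-ASCII LDH label is
# returned as-is, so the user's original spelling survives.
ldh_roundtrip = "EXAMPLE".encode("idna")

# A label with a non-LDH octet gets converted to its xn-- form.
non_ldh_encoded = "bücher".encode("idna")
```

With this, classify_hostname("bücher.example") tags the first label
"non-ldh" and the second "ldh", so each can be routed to whichever
processing you decide on.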

(If by now you think that maybe it's time for this DNS thing to get
replaced, you have company.)

> I was unaware of the different versions of IDNA. I basically started using
> the Perl module IDNA::Punycode in my project and assumed that this was the
> only type. Seems like I need to do some more reading.

Yeah, this is all made much harder by the fact that several IDN
libraries still do 2003. Here is one that many people are using for
IDNA2008:
<https://gitorious.org/libidn2/libidn2/source/0d6b5c0a9f1e4a9742c5ce32b6241afb4910cae1:>
It's GPLv3, though, which brings its own issues.

A

--
Andrew Sullivan
ajs(at)crankycanuck(dot)ca
