Re: Implementation of SASLprep for SCRAM-SHA-256

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>
Subject: Re: Implementation of SASLprep for SCRAM-SHA-256
Date: 2017-04-11 05:25:08
Message-ID: CAB7nPqTMcxFc5oefi6RBK9gzu0-9__ZTbhYro1NACBUZ33uLWQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 10, 2017 at 5:11 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> Here are some characters that seem plausible to be misinterpreted and
> ignored by SASLprep:
> EUC-JP and EUC-JISX0213:
>
> U+00AD (C2 AD): 足 (meaning "foot", per Unihan database)
> U+FE00-FE0F (EF B8 8X): 鏝 (meaning "trowel", per Unihan database)

Those are common words, but I still have to see those characters used
in passwords. For user names, I have seen games or apps using Kanjis,
so they could be used in such cases. But any application can be
careful in controlling the encoding used by the client, so I would not
worry much about them.

> EUC-CN:
>
> U+00AD (C2 AD): 颅 (meaning "skull", per Unihan database)
> U+FE00-FE0FF (EF B8 8X): 锔 (meaning "curium", per Unihan database)
> U+FEFF (EF BB BF): 锘 (meaning "nobelium", per Wikipedia)
>
> EUC-KR:
>
> U+FE00-FE0F (EF BB BF): 截 (meanings "cut off, stop, obstruct, intersect",
> per Unihan database
> U+FEFF (EF BB BF): 癤 (meanings "pimple, sore, boil", per Unihan database)
>
> EUC-TW:
> U+FE00-FE0F: 踫 (meanings "collide, bump into", per Unihan database)
> U+FEFF: 踢 (meaning "kick", per Unihan database)

Not completely sure about those ones. At least I think that the two
set of characters of Chinese Taipei are pretty common there.

> CP866:
> U+1806: саЖ
> U+180B: саЛ
> U+180C: саМ
> U+180D: саН
> U+200B: тАЛ
> U+200C: тАМ
> U+200D: тАН
> -------
>
> The CP866 cases seem most likely to cause confusion.

No idea about those.

> Those are all common
> words in Russian. I don't know how common those Chinese/Japanese characters are.
>
> Overall, I think this is OK. Even though there are those characters that can
> be misinterpreted, for it to be problem all of the following have to be
> true:
>
> 1. The client is using one of those encodings.
> 2. The password string as whole has to look like valid UTF-8.
> 3. Ignoring those characters/words from the password would lead to a
> significantly weaker password, i.e. it was not very long to begin with, or
> it consisted almost entirely of those characters/words.
>
> Thoughts? Attached is the full results of running iconv with each encoding,
> from which I picked the above cases.

I am not much worrying about such things either. Still I am wondering
though if it would be useful to give users the possibility to block
connection attempts from clients that do not use UTF-8 in this case.
Some people use open instances.
--
Michael

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-04-11 05:49:48 Re: contrib/bloom wal-check not run by default
Previous Message Heikki Linnakangas 2017-04-11 05:10:23 Re: SCRAM authentication, take three