Re: Doc: typo in config.sgml

From:	Tatsuo Ishii <ishii(at)postgresql(dot)org>
To:	nagata(at)sraoss(dot)co(dot)jp
Cc:	daniel(at)yesql(dot)se, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Doc: typo in config.sgml
Date:	2024-10-01 01:33:50
Message-ID:	20241001.103350.1086523034528885049.ishii@postgresql.org
Lists:	pgsql-hackers

>> That's because a non-breaking space (nbsp) is not encoded as 0xa0 in
>> UTF-8. nbsp in UTF-8 is "0xc2 0xa0" (2 bytes). (0xa0 is nbsp's code
>> point in Unicode, i.e. U+00A0.)
>> So grep -P "[\xC2\xA0]" should work to detect nbsp.
>
> `LC_ALL=C grep -P "\xC2\xA0"` works for my environment.
> ([ and ] were not necessary.)
>
> When LC_ALL is unset, `grep -P "\xA0"` could not detect any characters in charset.sgml,
> but I think it is better to specify both LC_ALL=C and "\xC2\xA0" to make sure nbsp is
> detected.
>
> One problem is that the -P option is available only in GNU grep; grep on macOS doesn't support it.
>
> On bash, we can also use `grep $'\xc2\xa0'`, but I am not sure we can assume the shell is bash.
>
> Maybe a better way is to use perl itself rather than grep, as follows.
>
> `perl -ne '/\xC2\xA0/ and print' `
>
> I attached a patch that fixes it this way.

GNU sed can also be used without setting LC_ALL:

sed -n /"\xC2\xA0"/p

However, I am not sure whether non-GNU sed can do this too...
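
One more possibility, offered only as a sketch and not something tried in this thread: POSIX printf accepts octal escapes, so the two nbsp bytes (0xC2 0xA0 is \302\240 in octal) can be generated by printf and handed to plain grep -F. That would avoid relying on grep -P, on bash's $'...' quoting, or on sed understanding \x escapes. The function name find_nbsp below is hypothetical.

```shell
# Hypothetical helper: print the lines (with line numbers) that contain a
# UTF-8 non-breaking space, using only POSIX printf and grep features.
find_nbsp() {
    # Build the two-byte UTF-8 sequence 0xC2 0xA0 from octal escapes.
    nbsp=$(printf '\302\240')
    # LC_ALL=C makes grep compare raw bytes; -F treats the pattern as a
    # fixed string; -n prefixes each match with its line number.
    LC_ALL=C grep -n -F -e "$nbsp" -- "$@"
}
```

For example, `find_nbsp charset.sgml` would print any matching lines, and the exit status is non-zero when no nbsp is found, so it could also be used as a check in a Makefile rule.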

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
