Re: Doc: typo in config.sgml

From: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: daniel(at)yesql(dot)se, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Doc: typo in config.sgml
Date: 2024-10-01 06:16:52
Message-ID: 20241001151652.94e03445e2d9815ac8651b55@sraoss.co.jp
Lists: pgsql-hackers

On Tue, 01 Oct 2024 10:33:50 +0900 (JST)
Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:

> >> That's because a non-breaking space (nbsp) is not encoded as 0xa0 in
> >> UTF-8. nbsp in UTF-8 is "0xc2 0xa0" (2 bytes); 0xa0 is nbsp's code
> >> point in Unicode, i.e. U+00A0.
> >> So grep -P "[\xC2\xA0]" should work to detect nbsp.
> >
> > `LC_ALL=C grep -P "\xC2\xA0"` works for my environment.
> > ([ and ] were not necessary.)
> >
> > When LC_ALL is not set, `grep -P "\xA0"` did not detect any characters in charset.sgml,
> > but I think it is better to specify both LC_ALL=C and "\xC2\xA0" to make sure that
> > nbsp is detected.
> >
> > One problem is that the -P option is available only in GNU grep; grep on macOS doesn't support it.
> >
> > On bash, we can also use `grep $'\xc2\xa0'`, but I am not sure we can assume the shell is bash.
> >
> > Maybe a better way is to use perl itself rather than grep, as follows.
> >
> > `perl -ne '/\xC2\xA0/ and print' `
> >
> > I attached a patch that fixes it this way.
>
> GNU sed can also be used without setting LC_ALL:
>
> sed -n /"\xC2\xA0"/p
>
> However I am not sure if non-GNU sed can do this too...

Although I have not checked it myself, BSD sed does not support the \x escape, according to [1].

[1] https://stackoverflow.com/questions/24275070/sed-not-giving-me-correct-substitute-operation-for-newline-with-mac-difference
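
For illustration, here is a minimal sketch of a variant that relies on neither grep -P, bash's $'...' quoting, nor \x support in sed. It assumes only a POSIX shell, printf, od and grep; the sample file and the variable names are hypothetical and not taken from the patch.

```shell
# POSIX printf understands octal escapes, so \302\240 yields the two
# bytes 0xC2 0xA0 (UTF-8 for U+00A0) in any POSIX shell, not only bash.
nbsp=$(printf '\302\240')

# Show the bytes in hex to confirm the encoding.
nbsp_hex=$(printf '%s' "$nbsp" | od -An -tx1 | tr -d ' \n')
echo "nbsp bytes: $nbsp_hex"

# Hypothetical sample file: one clean line, one line containing an nbsp.
sample=$(mktemp)
printf 'clean line\nbad%sline\n' "$nbsp" > "$sample"

# LC_ALL=C makes grep compare raw bytes; no -P or \x support is needed.
nbsp_count=$(LC_ALL=C grep -c "$nbsp" "$sample")
echo "lines with nbsp: $nbsp_count"

rm -f "$sample"
```

The pattern reaches grep as literal bytes, so the regex engine never has to interpret an escape sequence. That escape handling is the part that differs between the GNU and BSD tools.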

By the way, I've attached a slightly modified patch that uses the plural form
in the message, the same as check-tabs does:

Non-breaking **spaces** appear in SGML/XML files

Regards,
Yugo Nagata

>
> Best regards,
> --
> Tatsuo Ishii
> SRA OSS K.K.
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp

--
Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>

Attachment Content-Type Size
v3_check_nbsp.diff text/x-diff 822 bytes
