Re: improvements in Unicode tables generation code

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: peter(dot)eisentraut(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: improvements in Unicode tables generation code
Date: 2021-06-22 08:17:26
Message-ID: 20210622.171726.1714863438567031272.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Tue, 22 Jun 2021 09:20:16 +0200, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> wrote in
> I have accumulated a few patches to improve the output of the scripts
> in src/backend/utils/mb/Unicode/ to be less non-standard-looking and
> fix a few other minor things in that area.
>
> v1-0001-Make-Unicode-makefile-more-parallel-safe.patch
>
> The makefile rule that calls UCS_to_most.pl was written incorrectly
> for parallel make. The script writes all output files in one go, but
> the rule as written would call the command once for each output file
> in parallel.

That behavior annoyed me too, but I never found a way to stop it. The
patch looks like it works. (I haven't actually run it myself, for the
reason given at the end of this mail.)

> v1-0002-Make-UCS_to_most.pl-process-encodings-in-sorted-o.patch
>
> This mainly just helps eyeball the output while debugging the previous
> patch.
>
> v1-0003-Remove-some-whitespace-in-generated-C-output.patch
>
> Improve a small formatting issue in the output.

These look just fine.

> v1-0004-Simplify-code-generation-code.patch
>
> This simplifies the code a bit, which helps with the next patch.

This simplifies the code in exchange for allowing a comma after the
last element of array literals. I'm fine with that as long as we allow
that style in the tree.
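
For reference, a minimal sketch of the style in question. The table
name, the helper function, and the three entries are hypothetical
stand-ins chosen for illustration, not the actual generated output. A
comma after the last initializer is valid standard C, so a generator
can emit every element the same way without special-casing the final
one.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a generated table; the name and contents
 * are illustrative only.
 */
static const unsigned int example_map[] = {
	0x00a0,
	0x0104,
	0x02d8,		/* trailing comma after the last element */
};

/*
 * Number of elements in the table. The trailing comma does not add an
 * element, so this returns 3.
 */
static size_t
example_map_len(void)
{
	return sizeof(example_map) / sizeof(example_map[0]);
}
```

The trailing comma changes neither the size nor the contents of the
array; it only affects how the source text looks.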

> v1-0005-Fix-indentation-in-generated-output.patch
>
> This changes the indentation in the output from two spaces to a tab.
>
> I haven't included the actual output changes in the last patch,
> because they would be huge, but the idea should be clear.
>
> All together, these make the output look closer to how pgindent would
> make it.

I agree with the fix.

Mmm. (This is somewhat unrelated to this patch set.) I tried to run
this, but found that www.unicode.org has not been responding (for at
least the past several days). I'm not sure what is happening there.

> wget -O 8859-2.TXT --no-use-server-timestamps https://www.unicode.org/Public/MAPPINGS/ISO8859/8859-2.TXT
> --2021-06-22 17:09:34-- https://www.unicode.org/Public/MAPPINGS/ISO8859/8859-2.TXT
> Resolving www.unicode.org (www.unicode.org) 66.34.208.12
> Connecting to www.unicode.org (www.unicode.org)|66.34.208.12|:443...
(times out)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
