Re: Thoughts on using Text::Template for our autogenerated code?

From: Corey Huinker <corey(dot)huinker(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Daniel Gustafsson <daniel(at)yesql(dot)se>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Thoughts on using Text::Template for our autogenerated code?
Date: 2023-04-03 19:07:15
Message-ID: CADkLM=dwGKCgicsyTPBjtgtRMcgqAE+V4zgLXTAVLs8fe1oVJA@mail.gmail.com
Lists: pgsql-hackers

>
> Yeah, it's somewhat hard to believe that the cost/benefit ratio would be
> attractive. But maybe you could mock up some examples of what the input
> could look like, and get people on board (or not) before writing any
> code.
>
>
tl;dr - I tried a few things. Nothing here persuades me, let alone the
community, but there are perhaps some ideas for the future.

I borrowed Bertrand's ongoing work for waiteventnames.* because that is
what got me thinking about this in the first place. I considered a few
different templating libraries:

I could not find a Perl implementation of the golang template library
(example of that here: https://blog.gopheracademy.com/advent-2017/using-go-templates/ ).

Text::Template does not support loops, and as such it is no better than
here-docs.

Template Toolkit seems to do what we need, but it has a kitchen sink of
dependencies that makes it an unattractive option, so I didn't even attempt
it.

HTML::Template has looping and if/then/else constructs, and it is a single
standalone library. It also does a "separation of concerns" wherein you
pass in parameter names and values, and some parameters can be for loops,
which means you pass an arrayref of hashrefs that the template engine loops
over. That's where the advantages stop, however. It is fairly verbose, and
because it is HTML-centric it isn't very good about controlling whitespace,
which leads to piling template directives onto the same line in order to
avoid spurious newlines. As such I cannot recommend it.

My ideal template library would have text something like this:

[% loop events %]
[% $enum_value %]
[% if __first__ +%] = [%+ $initial_value %][% endif %]
[% if ! __last__ %],[% endif +%]
[% end loop %]
[% loop xml_blocks indent: relative,spaces,4 %]
<row>
<SomeElement attrib=[%attrib_val%]>[%element_body%]</SomeElement>
</row>
[% end loop %]

[%+ means "leading whitespace matters", and +%] means "trailing whitespace
matters".

That pseudocode is a mix of ASP and HTML::Template. The special variables
__first__ and __last__ indicate whether we are on the first or last
iteration of the loop. You would pass it a data structure like this:

{
  events: [ { enum_value: "abc", initial_value: "def" }, ... { enum_value: "wuv", initial_value: "xyz" } ],
  xml_blocks: [ { attrib_val: "one", element_body: "two" } ]
}
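
For comparison, here is a minimal sketch of the same idea in plain Perl with
no template library at all. The loop index stands in for the __first__ and
__last__ variables from the pseudocode. The function name render_events and
the sample values are mine, purely for illustration; they are not taken from
any of the attached scripts.

```perl
use strict;
use warnings;

# Hypothetical sample data, shaped like the "events" list described above.
my %data = (
    events => [
        { enum_value => 'abc', initial_value => 'def' },
        { enum_value => 'wuv', initial_value => 'xyz' },
    ],
);

# Emit one enum member per line. $i == 0 plays the role of __first__
# and $i == $last the role of __last__.
sub render_events {
    my ($events) = @_;
    my $last = $#{$events};
    my $out  = '';
    for my $i (0 .. $last) {
        my $ev = $events->[$i];
        $out .= "\t$ev->{enum_value}";
        $out .= " = $ev->{initial_value}" if $i == 0;
        $out .= ','                        if $i != $last;
        $out .= "\n";
    }
    return $out;
}

print render_events($data{events});
```

Even in this small case, the first/last bookkeeping is what clutters the
loop, which is the part a template engine with loop context variables would
hide.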

I did one initial pass with just converting printf statements to here-docs,
and the results were pretty unsatisfying. It wasn't really possible to
"see" the output files take shape.

My next attempt was to take the "separation of concerns" code from the
HTML::Template version, constructing the nested data structure of resolved
output values, and then iterating over that once per output file. This
resulted in something cleaner, partly because we're only writing one file
type at a time, partly because the interpolated variables have names much
closer to their output meaning.
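
A rough sketch of that shape, to make it concrete. This is not the attached
script: the tab-separated input format, the function names, and the
WaitEventExample type name are all assumptions made up for the example.

```perl
use strict;
use warnings;

# Step 1: input processing. Build a list of hashrefs whose keys are named
# after what they mean in the output. The two-column, tab-separated input
# is an assumption for illustration only.
sub parse_events {
    my (@lines) = @_;
    my @events;
    for my $line (@lines) {
        next if $line =~ /^\s*(#|$)/;    # skip comments and blank lines
        my ($name, $desc) = split /\t/, $line, 2;
        push @events, { enum_value => $name, description => $desc };
    }
    return \@events;
}

# Step 2: output processing, one function per output file.
sub render_header {
    my ($events) = @_;
    my $members = join ",\n", map { "\t$_->{enum_value}" } @$events;
    return <<"EOF";
typedef enum
{
$members
} WaitEventExample;
EOF
}

sub render_docs {
    my ($events) = @_;
    my $out = '';
    for my $ev (@$events) {
        $out .= <<"EOF";
<row>
 <entry>$ev->{enum_value}</entry>
 <entry>$ev->{description}</entry>
</row>
EOF
    }
    return $out;
}

my $events = parse_events("WAIT_A\tfirst example", "WAIT_B\tsecond example");
print render_header($events);
print render_docs($events);
```

Each render function touches only one file type, so the here-doc in it can
look like the file it produces.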

In doing this, it occurred to me that a lot of this effort goes into getting
the generated code to conform to our own style guide, at the cost of the
generator code being less readable. What if we wrote the generator and
formatted its output in a way that made sense for the generator, and then
ran pgindent on the result? That's not the workflow right now, but perhaps
it could be.

Conclusions:
- There is no "good enough" template engine that doesn't require big
changes in dependencies.
- pgindent will not save you from a run-on sentence, like putting all of
a typedef's enum values on one line.
- There is some clarity value in either separating input processing from
the output processing, or making the input align more closely with the
outputs
- Fiddling with indentation and spacing detracts from legibility no matter
what method is used.
- Here-docs are basically ok, but they necessarily confuse output
indentation with code indentation. It is possible to de-indent them
with <<~ but that requires Perl 5.26 or later.
- Any of these principles can be applied at any time, with no overhaul
required.
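
To illustrate the here-doc indentation point, here is a small sketch of two
ways to keep a here-doc indented with the surrounding code. The dedent helper
is hypothetical (it is not in any of the attachments) and simply treats the
first non-blank line's indentation as the prefix to strip. The second
function uses an indented here-doc, which needs a newer Perl and strips
whatever whitespace precedes the closing delimiter.

```perl
use strict;
use warnings;

# Hypothetical helper for older Perl versions: strip the indentation of
# the first non-blank line from every line that starts with it.
sub dedent {
    my ($text) = @_;
    my ($indent) = $text =~ /^([ \t]*)\S/m;
    $indent = '' unless defined $indent;
    $text =~ s/^\Q$indent\E//mg;
    return $text;
}

sub enum_portable {
    my ($name) = @_;
    return dedent(<<"EOF");
        typedef enum
        {
            $name
        } Example;
EOF
}

# Indented here-doc: the whitespace before the closing delimiter is
# removed from the start of every line.
sub enum_indented {
    my ($name) = @_;
    return <<~"EOF";
        typedef enum
        {
            $name
        } Example;
        EOF
}

print enum_portable('WAIT_A');
print enum_indented('WAIT_A');
```

Both calls should print the same text. The helper costs a few lines but keeps
the generator compatible with older Perl installations.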

"sorted-" is the slightly modified version of Bertrand's code.
"eof-as-is-" is a direct conversion of the original, but using here-docs.
"heredoc-one-file-at-a-time-" first generates an output-friendly data
structure, then writes each output file in turn.
"needs-pgindent-" is what is possible if we format for our own readability
and make pgindent fix the output, though it was not a perfect output match.

Attachment Content-Type Size
sorted-generate-waiteventnames.pl application/x-perl 4.9 KB
eof-as-is-generate-waiteventnames.pl application/x-perl 4.8 KB
heredoc-one-file-at-a-time-generate-waiteventnames.pl application/x-perl 5.4 KB
needs-pgindent-generate-waiteventnames.pl application/x-perl 5.4 KB
