Re: Getting our tables to render better in PDF output

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-docs(at)lists(dot)postgresql(dot)org
Subject: Re: Getting our tables to render better in PDF output
Date: 2020-02-14 21:00:00
Message-ID: e794bbd7-32f2-157d-8c7d-7bbfe4262e81@gmail.com
Lists: pgsql-docs

Hello Alvaro,
14.02.2020 23:16, Alvaro Herrera wrote:
> On 2020-Feb-13, Alexander Lakhin wrote:
>
>> Yes, I was starting with manual &zwsp; insertions into the translation,
>> but later I reduced such insertions just to several dozens. (For
>> example, we still have "3.1415926535&zwsp;8979323846" in the translation.)
>> The main issue of the manual approach was that I needed to recheck that
>> zwsp placement on updates, and I can't see where it's desired until I
>> generate pdf. Fortunately, fop prints warning like that:
>> [WARN] FOUserAgent - The contents of fo:block line 2 exceed the
>> available area in the inline-progression direction by 22725 millipoints.
>> (See position 127769:983)
>> It's not very user-friendly, but still useful when we have a pair or two
>> of them.
> It seems to me that a productive way forward would be to fix the layout
> to make these warnings disappear. Then it will be relatively easy to find
> where to fix, if new ones appear.
>
> Now I suppose you're complaining about the "position 127769:983" part of
> the error message which tells you with zero clarity where the problem
> is. Maybe what we need is to figure out what the numbers mean, and how
> to use them; for example if they are byte offsets into the file, then it
> should be possible to tell your editor to go to that byte in the
> complete XML file.
I'm not complaining about the cryptic positions of the problems; I'm
concerned with how many of them there are.
The position is specified as {line_number}:{character_position} in
postgres-*.fo (not in the DocBook source).
For example, when performing `make postgres-A4.pdf` on REL_12_STABLE I get:
[WARN] FOUserAgent - The contents of fo:block line 1 exceed the
available area in the inline-progression direction by more than 50
points. (See position 28808:374)

To find the exact problematic text, you can look at the specified line(s)
of postgres-A4.fo:
$ sed -n '28808,28811p' postgres-A4.fo
<fo:block id="id-1.5.13.4.7.12.1" wrap-option="wrap" text-align="start"
space-before.minimum="0.8em" space-before.optimum="1em"
space-before.maximum="1.2em" space-after.minimum="0.8em"
space-after.optimum="1em" space-after.maximum="1.2em" hyphenate="false"
white-space-collapse="false" white-space-treatment="preserve"
linefeed-treatment="preserve" font-family="monospace">
EXPLAIN SELECT * FROM tenk1 WHERE unique1 &lt; 100;

Searching for this text in the PDF gets you to page 467, where you can
see a long line of '---' running off the page.
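
With more than a handful of warnings, the lookup above could be scripted.
What follows is only a minimal sketch, assuming the FOP output has been
saved to a log file and that every overflow warning ends with a
"(See position LINE:COLUMN)" note as in the examples above. The function
names (warning_positions, fo_context, report) are made up for
illustration; they are not part of FOP or of our build system.

```python
import re

# Tail of a FOP overflow warning, e.g. "(See position 28808:374)".
# \s+ tolerates a line wrap between the words, as happens in mail quoting.
POSITION_RE = re.compile(r"\(See\s+position\s+(\d+):(\d+)\)")


def warning_positions(log_text):
    """Return (line, column) pairs for every 'See position' note in the log."""
    return [(int(line), int(col)) for line, col in POSITION_RE.findall(log_text)]


def fo_context(fo_lines, line_number, count=4):
    """Return `count` lines of the .fo file, starting at 1-based line_number."""
    start = line_number - 1
    return fo_lines[start:start + count]


def report(log_path, fo_path, count=4):
    """Print the .fo lines referenced by each warning in the saved FOP log."""
    with open(log_path, encoding="utf-8") as f:
        positions = warning_positions(f.read())
    with open(fo_path, encoding="utf-8") as f:
        fo_lines = f.read().splitlines()
    for line_number, column in positions:
        print(f"--- {fo_path}:{line_number} (column {column})")
        for text in fo_context(fo_lines, line_number, count):
            print(text)
```

Calling report("fop.log", "postgres-A4.fo") would then do the same as the
sed command for each warning in turn; "fop.log" is an assumed file name
for wherever the make output was redirected.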
>> Third (minor) issue is with translation - when I will see some break in
>> the English source, e.g. "split_part('abc~@~def&zwsp;~@~ghi', '~@~',
>> 2)", should I leave the break in the same place, or it's better to move
>> it because adjacent text has different length and the table columns have
>> different width?
> If the English version is warning-clean, then it should be possible to
> keep the zwsps in the same location in the translation, and then tweak
> the translation according to any new warnings that appear there.
> My guess is that the majority of zwsps are going to want to stay in the
> same place.
Yes, that's why I consider this a minor issue, but some kind of
automatic solution could eliminate it altogether.
>> Maybe some of the rules can be implemented explicitly in the DocBook
>> source, just to reduce tons of zwsp in the generated output, or the
>> "fo:table-cell/fo:block//text()" condition can be improved to filter
>> some (text-only?) tables out, but I think that the idea of our specific
>> line breaking rules could work.
> Maybe we can mark-up specific table cells/columns as being subject to
> the special line breaking rules.
Things are complicated by the XSLT preprocessing, because you can't see
DocBook tags and attributes at the FOP level, but I can explore possible
resolutions if we choose to go this way.
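
To make the "fo:table-cell/fo:block//text()" idea concrete, here is a
rough sketch of such a pass over the generated .fo tree. It is written in
Python only for illustration, not as the actual stylesheet change, and
the break rule (allow a break after "_", ",", ".", "(" or "/" when more
text follows) is an assumption chosen for the example, not a settled set
of rules.

```python
import re
import xml.etree.ElementTree as ET

FO_NS = "http://www.w3.org/1999/XSL/Format"
ZWSP = "\u200b"  # zero-width space, what &zwsp; stands for

# Hypothetical rule: permit a break after these characters when they are
# followed by something other than whitespace or an existing ZWSP.
BREAK_AFTER = re.compile(r"([_,.(/])(?=[^\s\u200b])")


def add_breaks(text):
    """Insert a zero-width space after each matching character."""
    return BREAK_AFTER.sub(lambda m: m.group(1) + ZWSP, text)


def process_fo(root):
    """Apply add_breaks to all text under fo:table-cell/fo:block."""
    ns = {"fo": FO_NS}
    for cell in root.iter(f"{{{FO_NS}}}table-cell"):
        for block in cell.findall("fo:block", ns):
            for elem in block.iter():
                if elem.text:
                    elem.text = add_breaks(elem.text)
                # The tail of the block itself lies outside the block.
                if elem is not block and elem.tail:
                    elem.tail = add_breaks(elem.tail)
```

Text outside table cells is left alone, and running the pass twice does
not add a second zero-width space, so nested tables are not affected
twice. Marking up only specific cells or columns would need some
attribute to survive into the .fo output, which this sketch does not
attempt.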

Best regards,
Alexander
