From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Cc: | "David E(dot) Wheeler" <david(at)justatheory(dot)com> |
Subject: | Re: Broken error detection in genbki.pl |
Date: | 2024-04-01 19:37:19 |
Message-ID: | 568b9070-7c43-8067-a7f5-7c826eca1827@dunslane.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2024-03-20 We 12:44, Tom Lane wrote:
> David Wheeler complained over in [1] that genbki.pl fails to produce a
> useful error message if there's anything wrong in a catalog data file.
> He's right about that, but the band-aid patch he proposed doesn't
> improve the situation much. The actual problem is that key error
> messages in genbki.pl expect $bki_values->{line_number} to be set,
> which it is not because we're invoking Catalog::ParseData with
> $preserve_formatting = 0, and that runs a fast path that doesn't do
> line-by-line processing and hence doesn't/can't fill that field.
>
> I'm quite sure that those error cases worked as-intended when first
> coded. I did not check the git history, but I suppose that somebody
> added the non-preserve_formatting fast path later without any
> consideration for the damage it was doing to error reporting ability.
> I'm of the opinion that this change was penny-wise and pound-foolish.
> On my machine, the time to re-make the bki files with the fast path
> enabled is 0.597s, and with it disabled (measured with the attached
> quick-hack patch) it's 0.612s. Is that savings worth future hackers
> having to guess what they broke and where in a large .dat file?
>
> As you can see from the quick-hack patch, it's kind of a mess to use
> the $preserve_formatting = 1 case, because there are a lot of loops
> that have to be taught to ignore comment lines, which we don't really
> care about except in reformat_dat_file.pl. What I suggest doing, but
> have not yet coded up, is to nuke the fast path in Catalog::ParseData
> and reinterpret the $preserve_formatting argument as controlling
> whether comments and whitespace are entered in the output data
> structure, but not whether we parse it line-by-line. That should fix
> this problem with zero change in the callers, and also buy back a
> little bit of the cost compared to this quick hack.
>
> Thoughts?
>
> regards, tom lane
>
> [1] https://www.postgresql.org/message-id/60EF4E11-BC1C-4034-B37B-695662D28AD2%40justatheory.com
>
Makes sense
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2024-04-01 19:43:06 | Re: pg_combinebackup --copy-file-range |
Previous Message | Nathan Bossart | 2024-04-01 19:37:18 | Re: pg_upgrade failing for 200+ million Large Objects |