Re: pgbench client-side performance issue on large scripts

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pgbench client-side performance issue on large scripts
Date: 2025-02-26 00:17:53
Lists: pgsql-hackers

I wrote:
> I got nerd-sniped by this question and spent some time looking into
> it.

On second look, I'd failed to absorb your point about how the main
loop of ParseScript doesn't need the line number at all; only if
it's a backslash command are we going to use that. So we can
move the calculation to be done only after we see a backslash.
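
To illustrate the shape of that change, here is a minimal sketch in C.
It is not the code in the attached patches: lineno_at(),
first_meta_command_line() and the lineno_calls counter are hypothetical
names, and the "lexer" just skips to the next semicolon, which is far
cruder than what pgbench really does. The point is only that the
line-number work happens on the backslash path and nowhere else.

```c
#include <stddef.h>

/* Instrumentation for this sketch only: counts line-number computations. */
static int lineno_calls = 0;

/* Hypothetical helper: 1-based line number of offset 'off' within 'buf'. */
static int
lineno_at(const char *buf, size_t off)
{
    int         lineno = 1;

    lineno_calls++;
    for (size_t i = 0; i < off; i++)
    {
        if (buf[i] == '\n')
            lineno++;
    }
    return lineno;
}

/*
 * Walk a script and return the line number of the first backslash command,
 * or 0 if there is none.  SQL commands are skipped without ever asking for
 * a line number.  (Simplification: semicolons inside quotes are not handled.)
 */
static int
first_meta_command_line(const char *script)
{
    size_t      pos = 0;

    while (script[pos] != '\0')
    {
        /* skip whitespace between commands */
        while (script[pos] == ' ' || script[pos] == '\t' || script[pos] == '\n')
            pos++;
        if (script[pos] == '\0')
            break;

        /* only a backslash command needs the line number */
        if (script[pos] == '\\')
            return lineno_at(script, pos);

        /* SQL command: advance past the terminating semicolon */
        while (script[pos] != '\0' && script[pos] != ';')
            pos++;
        if (script[pos] == ';')
            pos++;
    }
    return 0;
}
```

With a script consisting only of SQL commands, lineno_at() is never
called in this sketch, however long the script is.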

I'd spent a little time worrying about how the calculation was
really giving a wrong line number: typically, it'd return the
line number of the previous semicolon, since we haven't lexed
any further than that. That could be fixed with more code,
but it's pretty pointless if we don't need the value in the
first place.

Also, I did a tiny bit of micro-optimization in the first
patch to remove the hazard that it'd still be O(N^2) with
very long input lines.
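
For readers who want the general idea of avoiding the repeated rescans,
a rough sketch in C follows. Again this is an illustration under my own
assumptions, not the attached patch: LineTracker and its two functions
are hypothetical names. The tracker remembers the offset it has already
counted up to, so each call examines only the bytes since the previous
call and the total work over a whole script stays linear in its length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tracker: remembers how far newlines have been counted. */
typedef struct LineTracker
{
    const char *buf;        /* start of the script text */
    size_t      last_off;   /* offset up to which newlines were counted */
    int         lineno;     /* 1-based line number at last_off */
} LineTracker;

static void
line_tracker_init(LineTracker *lt, const char *buf)
{
    lt->buf = buf;
    lt->last_off = 0;
    lt->lineno = 1;
}

/*
 * Return the 1-based line number of offset 'off'.  Offsets must be passed
 * in non-decreasing order; only the bytes in [last_off, off) are scanned.
 */
static int
line_tracker_advance(LineTracker *lt, size_t off)
{
    assert(off >= lt->last_off);

    for (size_t i = lt->last_off; i < off; i++)
    {
        if (lt->buf[i] == '\n')
            lt->lineno++;
    }
    lt->last_off = off;
    return lt->lineno;
}
```

Asking for the same offset twice costs nothing the second time, since
the scan range is then empty.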

regards, tom lane

Attachment | Content-Type | Size
v2-0001-Get-rid-of-O-N-2-script-parsing-overhead-in-pgben.patch | text/x-diff | 12.3 KB
v2-0002-Avoid-unnecessary-computation-of-pgbench-s-script.patch | text/x-diff | 3.4 KB

