Quick Links

Re: benchmarking Flex practices

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: benchmarking Flex practices
Date:	2019-11-25 22:51:41
Message-ID:	31686.1574722301@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

[ My apologies for being so slow to get back to this ]

John Naylor <john(dot)naylor(at)2ndquadrant(dot)com> writes:
> Now that I think of it, the regression in v7 was largely due to the
> fact that the parser has to call the lexer 3 times per string in this
> case, and that's going to be slower no matter what we do.

Ah, of course. I'm not too fussed about the performance of queries with
an explicit UESCAPE clause, as that seems like a very minority use-case.
What we do want to pay attention to is not regressing for plain
identifiers/strings, and to a lesser extent the U& cases without UESCAPE.

> Inlining hexval() and friends seems to have helped somewhat for
> unicode escapes, but I'd have to profile to improve that further.
> However, v8 has regressed from v7 enough with both simple strings and
> the information schema that it's a noticeable regression from HEAD.
> I'm guessing getting rid of the "Uescape" production is to blame, but
> I haven't tried reverting just that one piece. Since inlining the
> rules didn't seem to help with the precedence hacks, it seems like the
> separate production was a better way. Thoughts?

I have duplicated your performance tests here, and get more or less
the same results (see below). I agree that the performance of the
v8 patch isn't really where we want to be --- and it also seems
rather invasive to gram.y, and hence error-prone. (If we do it
like that, I bet my bottom dollar that somebody would soon commit
a patch that adds a production using IDENT not Ident, and it'd take
a long time to notice.)

It struck me though that there's another solution we haven't discussed,
and that's to make the token lookahead filter in parser.c do the work
of converting UIDENT [UESCAPE SCONST] to IDENT, and similarly for the
string case. I pursued that to the extent of developing the attached
incomplete patch ("v9"), which looks reasonable from a performance
standpoint. I get these results with tests using the drive_parser
function:

information_schema

HEAD 3447.674 ms, 3433.498 ms, 3422.407 ms
v6 3381.851 ms, 3442.478 ms, 3402.629 ms
v7 3525.865 ms, 3441.038 ms, 3473.488 ms
v8 3567.640 ms, 3488.417 ms, 3556.544 ms
v9 3456.360 ms, 3403.635 ms, 3418.787 ms

pgbench str

HEAD 4414.046 ms, 4376.222 ms, 4356.468 ms
v6 4304.582 ms, 4245.534 ms, 4263.562 ms
v7 4395.815 ms, 4398.381 ms, 4460.304 ms
v8 4475.706 ms, 4466.665 ms, 4471.048 ms
v9 4392.473 ms, 4316.549 ms, 4318.472 ms

pgbench unicode

HEAD 4959.000 ms, 4921.751 ms, 4945.069 ms
v6 4856.998 ms, 4802.996 ms, 4855.486 ms
v7 5057.199 ms, 4948.342 ms, 4956.614 ms
v8 5008.090 ms, 4963.641 ms, 4983.576 ms
v9 4809.227 ms, 4767.355 ms, 4741.641 ms

pgbench uesc

HEAD 5114.401 ms, 5235.764 ms, 5200.567 ms
v6 5030.156 ms, 5083.398 ms, 4986.974 ms
v7 5915.508 ms, 5953.135 ms, 5929.775 ms
v8 5678.810 ms, 5665.239 ms, 5645.696 ms
v9 5648.965 ms, 5601.592 ms, 5600.480 ms

(A note about what we're looking at: on my machine, after using cpupower
to lock down the CPU frequency, and taskset to bind everything to one
CPU socket, I can get numbers that are very repeatable, to 0.1% or so
... until I restart the postmaster, and then I get different but equally
repeatable numbers. The difference can be several percent, which is a lot
of noise compared to what we're looking for. I believe the explanation is
that kernel ASLR has loaded the backend executable at some different
addresses and so there are different cache-line-boundary effects. While
I could lock that down too by disabling ASLR, the result would be to
overemphasize chance effects of a particular set of cache line boundaries.
So I prefer to run all the tests over again after restarting the
postmaster, a few times, and then look at the overall set of results to
see what things look like. Each number quoted above is median-of-three
tests within a single postmaster run.)

Anyway, my conclusion is that the attached patch is at least as fast
as today's HEAD; it's not as fast as v6, but on the other hand it's
an even smaller postmaster executable, so there's something to be said
for that:

$ size postg*
text data bss dec hex filename
7478138 57928 203360 7739426 761822 postgres.head
7271218 57928 203360 7532506 72efda postgres.v6
7275810 57928 203360 7537098 7301ca postgres.v7
7276978 57928 203360 7538266 73065a postgres.v8
7266274 57928 203360 7527562 72dc8a postgres.v9

I based this on your v7 not v8; not sure if there's anything you
want to salvage from v8.

Generally, I'm pretty happy with this approach: it touches gram.y
hardly at all, and it removes just about all of the complexity from
scan.l. I'm happier about dropping the support code into parser.c
than the other choices we've discussed.

There's still undone work here, though:

* I did not touch psql. Probably your patch is fine for that.

* I did not do more with ecpg than get it to compile, using the
same hacks as in your v7. It still fails its regression tests,
but now the reason is that what we've done in parser/parser.c
needs to be transposed into the identical functionality in
ecpg/preproc/parser.c. Or at least some kind of functionality
there. A problem with this approach is that it presumes we can
reduce a UIDENT sequence to a plain IDENT, but to do so we need
assumptions about the target encoding, and I'm not sure that
ecpg should make any such assumptions. Maybe ecpg should just
reject all cases that produce non-ASCII identifiers? (Probably
it could be made to do something smarter with more work, but
it's not clear to me that it's worth the trouble.)

* I haven't convinced myself either way as to whether it'd be
better to factor out the code duplicated between the UIDENT
and UCONST cases in base_yylex.

If this seems like a reasonable approach to you, please fill in
the missing psql and ecpg bits.

regards, tom lane

Attachment	Content-Type	Size
v9-draft-handle-uescapes-in-parser.patch	text/x-diff	31.5 KB

In response to

Re: benchmarking Flex practices at 2019-08-01 08:50:56 from John Naylor

Responses

Re: benchmarking Flex practices at 2019-11-26 11:38:00 from John Naylor

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jeremy Schneider	2019-11-25 23:05:03	Re: Proposal: Global Index
Previous Message	Robert Haas	2019-11-25 22:44:21	Re: global / super barriers (for checksums)