Re: benchmarking Flex practices

From:	John Naylor <john(dot)naylor(at)2ndquadrant(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: benchmarking Flex practices
Date:	2019-12-03 11:02:17
Message-ID:	CACPNZCv9fWE9Q8yVwHgQ4=5y3RBi2CkzoG2MD8Q_3ftNVt0sCg@mail.gmail.com
Lists:	pgsql-hackers

On Tue, Nov 26, 2019 at 10:32 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I haven't looked closely at what ecpg does with the processed
> identifiers. If it just spits them out as-is, a possible solution
> is to not do anything about de-escaping, but pass the sequence
> U&"..." (plus UESCAPE ... if any), just like that, on to the grammar
> as the value of the IDENT token.

It does pass them along as-is, so I did it that way.

In the attached v10, I've synced both ECPG and psql.

> * I haven't convinced myself either way as to whether it'd be
> better to factor out the code duplicated between the UIDENT
> and UCONST cases in base_yylex.

I chose to factor it out, since we have 2 versions of parser.c, and
this way was much easier to work with.

Some notes:

I arranged for the ECPG grammar to only see SCONST and IDENT. With
UCONST and UIDENT out of the way, it was a small additional step to
put all string reconstruction into the lexer, which has the advantage
of allowing removal of the other special-case ECPG string tokens as
well. The fewer special cases involved in pasting the grammar
together, the better. In doing so, I've probably introduced memory
leaks, but I wanted to get your opinion on the overall approach before
investigating.

In ECPG's parser.c, I simply copied check_uescapechar() and
ecpg_isspace(), but we could find a common place if desired. During
development, I found that this file replicates the location-tracking
logic in the backend, but doesn't seem to make use of it. I also would
have had to replicate the backend's datatype for YYLTYPE. Fixing that
might be worthwhile some day, but to get this working, I just ripped
out the extra location tracking.
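
For readers without the sources at hand, here is a rough sketch of what the two copied helpers check. It is reconstructed from the documented rule for UESCAPE (the escape character may be anything except a hexadecimal digit, a plus sign, a single or double quote, or whitespace), not taken from the attached patch. The sketch_ names are hypothetical stand-ins for ecpg_isspace() and check_uescapechar().

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Whitespace as the SQL lexer treats it: space, tab, newline,
 * carriage return, form feed. Stand-in for ecpg_isspace(); assumed,
 * not copied from the patch.
 */
static bool
sketch_isspace(char ch)
{
	return ch == ' ' || ch == '\t' || ch == '\n' ||
		ch == '\r' || ch == '\f';
}

/*
 * Returns true if "escape" is acceptable after UESCAPE.
 * Stand-in for check_uescapechar(); assumed, not copied from the patch.
 */
static bool
sketch_check_uescapechar(unsigned char escape)
{
	if (isxdigit(escape) ||
		escape == '+' ||
		escape == '\'' ||
		escape == '"' ||
		sketch_isspace((char) escape))
		return false;
	return true;
}
```

With these rules, U&"d!0061t" UESCAPE '!' has a valid escape character, while UESCAPE 'a' or UESCAPE '+' would be rejected.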

I no longer use state variables to track scanner state, and in fact I
removed the existing "state_before" variable in ECPG. Instead, I used
the Flex builtins yy_push_state(), yy_pop_state(), and yy_top_state().
These seem to have been part of Flex for a long time, so I think we're
okay as far as portability goes. I think it's cleaner this way, and
possibly faster. I also used this to reunite the xcc and xcsql states.
This whole part could be split out into a separate refactoring patch
to be applied first, if desired.
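
To illustrate why a state stack can replace a "state_before"-style variable, here is a small plain-C model of what the Flex builtins do (in a real scanner they are enabled with %option stack and operate on the scanner's start condition). This is only a sketch: the state names, the fixed stack size, and the sketch_ functions are hypothetical and are not taken from pgc.l or the patch.

```c
#include <assert.h>

/* Hypothetical start conditions: C code, embedded SQL, and one shared
 * comment state that can be entered from either of them. */
enum sketch_state { ST_C, ST_SQL, ST_COMMENT };

#define SKETCH_STACK_MAX 16

static enum sketch_state sketch_cur = ST_C;	/* current start condition */
static enum sketch_state sketch_stack[SKETCH_STACK_MAX];
static int	sketch_depth = 0;

/* Models yy_push_state(): save the current state, then switch. */
static void
sketch_push_state(enum sketch_state new_state)
{
	assert(sketch_depth < SKETCH_STACK_MAX);
	sketch_stack[sketch_depth++] = sketch_cur;
	sketch_cur = new_state;
}

/* Models yy_pop_state(): switch back to the most recently saved state. */
static void
sketch_pop_state(void)
{
	assert(sketch_depth > 0);
	sketch_cur = sketch_stack[--sketch_depth];
}

/* Models yy_top_state(): peek at the saved state without changing anything. */
static enum sketch_state
sketch_top_state(void)
{
	assert(sketch_depth > 0);
	return sketch_stack[sketch_depth - 1];
}
```

Because the comment state is entered with a push and left with a pop, it returns to whichever state was active before, so one comment state can serve both the C and the SQL context. That is my reading of how the xcc and xcsql states could be reunited; the patch itself may differ in detail.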

--
John Naylor https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment	Content-Type	Size
v10-handle-uescapes-in-parser.patch	application/octet-stream	61.6 KB

In response to

Re: benchmarking Flex practices at 2019-11-26 15:32:29 from Tom Lane

Responses

Re: benchmarking Flex practices at 2020-01-02 23:56:28 from John Naylor
