Re: Lexer patch question

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Lexer patch question
Date: 2005-06-15 17:36:22
Message-ID: 200506151736.j5FHaMM20283@candle.pha.pa.us
Lists: pgsql-patches

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> > I am confused why the following change Tom made to scan.l works.
> > Isn't that 'x' required so xqescape doesn't match '\x'?
>
> > *** scan.l 2 Jun 2005 01:23:08 -0000 1.123
> > --- scan.l 2 Jun 2005 17:45:17 -0000 1.124
> > ***************
> > *** 193,199 ****
> > xqstart {quote}
> > xqdouble {quote}{quote}
> > xqinside [^\\']+
> > ! xqescape [\\][^0-7x]
> > xqoctesc [\\][0-7]{1,3}
> > xqhexesc [\\]x[0-9A-Fa-f]{1,2}
>
> > --- 193,199 ----
> > xqstart {quote}
> > xqdouble {quote}{quote}
> > xqinside [^\\']+
> > ! xqescape [\\][^0-7]
> > xqoctesc [\\][0-7]{1,3}
> > xqhexesc [\\]x[0-9A-Fa-f]{1,2}
>
> No; if a match to xqhexesc is possible, the lexer will prefer that match
> because it is longer. If a match to xqhexesc is not possible --- that
> is, we have \x not followed by a hex digit --- then we *want* xqescape
> to match. The original coding forced a backup to the <xq>. rule in this
> situation, which is not how we want it to behave.

Oh, I didn't realize lexers would choose the longer token when given
multiple options.
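
For anyone following along, here is a rough Python simulation of the longest-match rule Tom describes. It is not flex itself. The names RULES_OLD, RULES_NEW, and longest_match are made up for illustration; only the three patterns come from the scan.l diff quoted above.

```python
import re

# Patterns transcribed from the quoted scan.l diff.
# RULES_NEW is the 1.124 form; RULES_OLD differs only in xqescape (1.123).
RULES_NEW = [
    ("xqescape", r"\\[^0-7]"),
    ("xqoctesc", r"\\[0-7]{1,3}"),
    ("xqhexesc", r"\\x[0-9A-Fa-f]{1,2}"),
]
RULES_OLD = [("xqescape", r"\\[^0-7x]")] + RULES_NEW[1:]


def longest_match(rules, text):
    """Return (rule_name, lexeme) for the longest match at the start of text.

    Ties go to the earliest rule, mirroring flex's tie-break.
    Returns None when no rule matches at all.
    """
    best = None
    for name, pattern in rules:
        m = re.match(pattern, text)
        # Strictly longer wins; on equal length the earlier rule is kept.
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best


if __name__ == "__main__":
    for sample in (r"\x41", r"\xZ", r"\101", r"\n"):
        print(sample, "old:", longest_match(RULES_OLD, sample),
              "new:", longest_match(RULES_NEW, sample))
```

With the 1.124 patterns, \x41 goes to xqhexesc because that match is longer, while \xZ falls to xqescape. With the 1.123 patterns nothing in this set matches \xZ, which corresponds to the backup to the <xq>. rule that Tom mentions.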

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
