Re: hint infrastructure setup (v3)

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: hint infrastructure setup (v3)
Date: 2004-04-05 16:52:10
Message-ID: 200404051652.i35GqAN05855@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches


As another teaching aid idea, is there something that would auto-format
queries to they are more readable?

---------------------------------------------------------------------------

Fabien COELHO wrote:
>
>
>
> Dear Tom,
>
> > No, just redefine "yyerror" as a macro that passes additional parameters.
>
> Ok! That's just the kind of the hack-hook I was looking for!!
>
>
> > > The terminals are not available, they must be kept separatly if you
> > > want them. This can be done in yylex().
> >
> > We don't need them. Any already-shifted tokens are to the left of where
> > the error is, no?
>
> Yes and no. I agree that you don't need them for computing the follow set.
> However it would help a lot to generate nicer hints. I'm looking for help
> the user, and the follow set is just part of the help.
>
> Think of typical "SELECT foo, bla, FROM xxx" (it looks stupid on a one
> line query, but not so on a very large query) which is quite easier to
> detect and hint about because of FROM is just after a comma.
>
> The rule part is also a real issue. There is no easy way to translate
> states into rules, to know whether we're somewhere in a "ShowStmt" or
> "AlterTableStmt"...
>
> If you're after a comma in state 1232, should you just say IDENT... I'd
> rather say "user name" or "table name" if I can... Otherwise the hint
> stays at a low parser level. Good hints requires to know about the
> current context, and it is not easy to get that deep in the automaton.
>
>
> > Yeah, I had come to the same conclusion --- state moves made without
> > consuming input would need to be backed out if we wanted to report the
> > complete follow set. I haven't yet looked to see how to do that.
>
> Well, following you're previous suggestion, one can redefine the YYLEX
> macro to save the current state each time a token is required.
>
>
> > > As you noted, for things like "SHOW 123", the follow set basically
> > > includes all keywords although you can have SHOW ALL or SHOW name.
> > > So, as you suggested, you can either say "ident" as a simplification, but
> >
> > You're ignoring the distinction between classes of keywords. I would
> > not recommend treating reserved_keywords as a subclass of ident.
>
> Ok, I agree that it can help in this very case a little bit here because
> ALL is reserved. I did not make this distinction when I played with bison.
>
>
> > > (5) anything that can be done would be hardwired to one version of bison.
> >
> > I think this argument is completely without merit.
>
> Possible;-)
>
> However I don't like writing hacks that depends on other people future
> behavior.
>
>
> > > (b) write a new "recursive descendant" parser, and drop "gram.y"
> >
> > Been there, done that, not impressed with the idea. There's a reason
> > why people invented parser generators...
>
> Sure. It was just a list of options;-)
>
>
> > > As a side effect of my inspection is that the automaton generated by bison
> > > could be simplified if a different tradeoff between the lexer, the parser
> > > and the post-processing would be chosen. Namelly, all tokens that are
> > > just identifiers should be dropped and processed differently.
> >
> > We are not going to reserve the keywords that are presently unreserved.
>
> Reserving keywords would help, but I would not think it could be accepted,
> because it is not SQL philosophy. SQL is about 30 years also, as old
> as yacc ideas... but they did not care at the time;-) When you look at
> the syntax, it was designed with a recursive parser in mind.
>
> Part of my comment was to explain "why" the generated automaton is large.
> SQL is a small language, but it has a big automaton in postgresql, and
> this is IMHO the explanation.
>
>
> > If you can think of a reasonable way to stop treating them as separate
> > tokens inside the grammar without altering the user-visible behavior,
> > I'm certainly interested.
>
> I was thinking about type names especially, and maybe others.
>
> I join a small proof-of-concept patch to drop some tokens out of the
> parser. As a results, 6 tokens, 6 rules and 9 states are removed,
> and the automaton table is reduced by 438 entries (1.5%). Not too bad;-)
> I think others could be dealt with similarily, or with more work.
>
>
> Thanks for the discussion,
>
> --
> Fabien.

Content-Description:

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
> joining column's datatypes do not match

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073

In response to

Responses

Browse pgsql-patches by date

  From Date Subject
Next Message Fabien COELHO 2004-04-05 17:01:25 Re: hint infrastructure setup (v3)
Previous Message Fabien COELHO 2004-04-05 16:44:22 Re: hint infrastructure setup (v3)