Converting tab-complete.c's else-if chain to a switch

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Converting tab-complete.c's else-if chain to a switch
Date: 2024-07-11 20:25:02
Message-ID: 2208466.1720729502@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Per discussion elsewhere [1], I've been looking at $SUBJECT.
I have a very crude Perl hack (too ugly to show yet ;-)) that
produces output along the lines of

Existing code:

/* CREATE STATISTICS <name> */
else if (Matches("CREATE", "STATISTICS", MatchAny))
COMPLETE_WITH("(", "ON");
else if (Matches("CREATE", "STATISTICS", MatchAny, "("))
COMPLETE_WITH("ndistinct", "dependencies", "mcv");
else if (Matches("CREATE", "STATISTICS", MatchAny, "(*)"))
COMPLETE_WITH("ON");
else if (HeadMatches("CREATE", "STATISTICS", MatchAny) &&
TailMatches("FROM"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables);

Generated tables:

enum TCPatternID
{
...
M_CREATE_STATISTICS_ANY,
M_CREATE_STATISTICS_ANY_LPAREN,
M_CREATE_STATISTICS_ANY_PARENSTARPAREN,
HM_CREATE_STATISTICS_ANY,
...
};

enum TCPatternKind
{
Match,
MatchCS,
HeadMatch,
HeadMatchCS,
TailMatch,
TailMatchCS,
};

typedef struct
{
enum TCPatternID id;
enum TCPatternKind kind;
int nwords;
const char *const *words;
} TCPattern;

#define TCPAT(id, kind, ...) \
{ (id), (kind), VA_ARGS_NARGS(__VA_ARGS__), \
(const char * const []) { __VA_ARGS__ } }

static const TCPattern tcpatterns[] =
{
...
TCPAT(M_CREATE_STATISTICS_ANY,
Match, "CREATE", "STATISTICS", MatchAny),
TCPAT(M_CREATE_STATISTICS_ANY_LPAREN,
Match, "CREATE", "STATISTICS", MatchAny, "("),
TCPAT(M_CREATE_STATISTICS_ANY_PARENSTARPAREN,
Match, "CREATE", "STATISTICS", MatchAny, "(*)"),
TCPAT(HM_CREATE_STATISTICS_ANY,
HeadMatch, "CREATE", "STATISTICS", MatchAny),
...
};

Converted code:

/* CREATE STATISTICS <name> */
case M_CREATE_STATISTICS_ANY:
COMPLETE_WITH("(", "ON");
break;
case M_CREATE_STATISTICS_ANY_LPAREN:
COMPLETE_WITH("ndistinct", "dependencies", "mcv");
break;
case M_CREATE_STATISTICS_ANY_PARENSTARPAREN:
COMPLETE_WITH("ON");
break;
case HM_CREATE_STATISTICS_ANY:
if (TailMatches("FROM"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables);
break;

The idea (I've not actually written this part yet) is that an
outer loop would iterate through the table entries and invoke
the appropriate switch case for any successful match. As soon
as the switch code sets "matches" non-NULL (it might not, as in
the last case in the example), we can exit.

While this clearly can be made to work, it doesn't seem like the
result will be very maintainable. You have to invent a single-use
enum label, and the actual definition of the primary match pattern
is far away from the code using it, and we have two fundamentally
different ways of writing the same pattern test (since we'll still
need some instances of direct calls to TailMatches and friends,
as in the last example case).

What I'm thinking about doing instead of pursuing this exact
implementation idea is that we should create a preprocessor
to deal with building the table. I'm envisioning that the
new source code will look like

/* CREATE STATISTICS <name> */
case Matches("CREATE", "STATISTICS", MatchAny):
COMPLETE_WITH("(", "ON");
break;
case Matches("CREATE", "STATISTICS", MatchAny, "("):
COMPLETE_WITH("ndistinct", "dependencies", "mcv");
break;
case Matches("CREATE", "STATISTICS", MatchAny, "(*)"):
COMPLETE_WITH("ON");
break;
case HeadMatches("CREATE", "STATISTICS", MatchAny):
if (TailMatches("FROM"))
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables);
break;

Of course this isn't legal C, since the case labels are not
constant expressions. The preprocessing script would look
for case labels that look like this, generate the appropriate
table entries, and replace the case labels with mechanically-
generated ID codes that don't need to be particularly human
readable. On the downside, YA preprocessor and mini-language
isn't much fun; but at least this is a lot closer to real C
than some of the other things we've invented.

(Further down the line, we can look into improvements such as
avoiding duplicate string comparisons; but that can be done behind
the scenes, without messing with this source-code notation.)

Does anyone particularly hate this plan, or have a better idea?

BTW, we have quite a few instances of code like the aforementioned

else if (HeadMatches("CREATE", "STATISTICS", MatchAny) &&
TailMatches("FROM"))

I'm thinking we should invent a Matches() option "MatchAnyN",
which could appear at most once and would represent an automatic
match to zero or more words appearing between the head part and
the tail part. Then this could be transformed to

else if (Matches("CREATE", "STATISTICS", MatchAny, MatchAnyN, "FROM"))

the advantage being that more of the pattern can be embedded in the
table and we need fewer bits of ad-hoc logic. Maybe this'd be worth
doing even if we don't go forward with the switch conversion.

regards, tom lane

[1] https://www.postgresql.org/message-id/1203570.1720470608%40sss.pgh.pa.us

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ilya Gladyshev 2024-07-11 20:35:24 Re: CREATE INDEX CONCURRENTLY on partitioned index
Previous Message Robert Haas 2024-07-11 20:09:31 Re: Parent/child context relation in pg_get_backend_memory_contexts()