From: | "Meredith L(dot) Patterson" <mlp(at)thesmartpolitenerd(dot)com> |
---|---|
To: | sfpug(at)postgresql(dot)org |
Subject: | Re: Questions for Meredith |
Date: | 2007-01-13 06:50:23 |
Message-ID: | 45A8812F.9070407@thesmartpolitenerd.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | sfpug |
Josh Berkus wrote:
> Now that your slides are up, let me give you the list of comments and
> questions from the PUG that you asked for:
Thanks!
I'm actually going to take these one at a time, because I think the
replies are going to run long.
> 1) Suggestion: the query template notation should distinguish between
> literals and non-literals. Currently, you have:
>
> SELECT {col1} FROM table;
> and SELECT '{col}' FROM table;
First, a sidenote: wow, a literal in the target list is legal? Checking
the grammar, apparently so, since target_list => target_el, target_el =>
a_expr, and a_expr => AexprConst (which can be pretty much any literal).
In the first case you're actually not *so* screwed, because the lowest
enclosing node over {col1} is actually ColId. Marking the lowest
enclosing node with an asterisk, a partial parse (of just the targetlist
subtree of the SELECT) looks something like:
simple_select =>
target_list =>
target_el =>
a_expr =>
c_expr =>
columnref =>
relation_name =>
*ColId =>
IDENT
So the only valid elements that can appear here are tokens that have
been lexed as IDENT, or a number of tokens which fall under the
catch-alls of unreserved_keyword and col_name_keyword. By contrast,
literals are lexed as SCONST, FCONST, ICONST, BCONST ... you get the idea.
The place where this *is* profoundly dangerous is in the WHERE clause,
as I'll show below.
> ... with the same notation, allowing an attacker to try to swap a literal
> with an object reference if they can defeat the escaping mechanism,
> especially if the programmer makes makes the mistake of omitting the
> quotes or putting them inside the brackets.
I'm actually going to switch to a different example from the slides,
because today as I was digging back through the lexer I realised I'd
made a mistake in the talk. String constants, aka string literals, are
lexed *with* the quotes around them. So {'foo'} and '{foo}' are actually
equivalent; by extension, the lowest enclosing node around them would be
Sconst, e.g.:
where_clause =>
a_expr =>
c_expr =>
AexprConst =>
*Sconst =>
SCONST
This is where Quinn's suggestion of a sanity-checker function which
shows you what you're *really* allowing to vary will come in handy.
So the good news is that I actually suggested more caution than was
necessary. The bad news is that allowing an attacker to include
comparisons between literals is still dangerous, so detecting this is
still a Good Idea.
However, I did some more thinking about it, and realised that it's also
possible to construct an object-literal comparison -- or merely an
object expression which always returns true. Consider:
SELECT foo FROM bar WHERE exists(select current_database())
Or, without resorting to any built-in functions other than exists():
SELECT foo FROM bar WHERE exists(select relname from pg_class)
The exists() function returns TRUE if the subquery given to it as an
argument returns at least one row, and it is always possible to
construct a subquery which returns at least one row by virtue of being
able to query about the database itself.
So I think it makes more sense to punt back to the more cautious stance
that some of us took: if it's possible for a rule to recurse directly,
i.e., a rule's right-hand side can contain an instance of its own
left-hand side, throw an exception during construction and force the
user to handle that.
This is easily implemented; since Dejector's parser constructs an XML
parse tree, it's trivial to add a "recurses" attribute to all
self-recursive rules, then check to see whether a node which is
potentially recursive appears as (or below) a lowest enclosing node.
More difficult, but something I would like to implement at some point,
is an idea which I think stems from a suggestion Quinn made. It could be
useful to be able to inspect a rule which appears in an exemplar and see
which applications the exemplar allows. Any rule which appears above a
lowest enclosing node will only have one allowable application: the one
which appears in the exemplar. (For example, if your exemplar is a
SELECT statement, then the <stmt> rule can *only* have <SelectStmt> as
its child, provided the entire statement isn't bracketed.) But the rule
at a lowest enclosing node can use *any* of its applications; therein
lies much of the danger with bracketing an <a_expr>.
The bracket notation is convenient, but this all-or-one-or-nothing
nature somewhat betrays the philosophy behind Dejector. If the goal is
to create a restricted sublanguage, then it should be possible to
selectively disallow some rule applications. For instance, if you could
remove the rule
<a_expr> ::= <a_expr> OR <a_expr>
from your sublanguage, you've just removed the "OR 1=1" attack while
still allowing other <a_expr>s into your WHERE clause.
Comments? Criticisms? Anything I'm missing here?
(Questions 2-5 coming up soon.)
--mlp
From | Date | Subject | |
---|---|---|---|
Next Message | Quinn Weaver | 2007-01-15 01:27:17 | Re: Questions for Meredith |
Previous Message | David Fetter | 2007-01-12 23:43:35 | Perl Regular Expressions |