Re: [RFC] Interface of Row Level Security

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [RFC] Interface of Row Level Security
Date: 2012-05-30 19:26:23
Message-ID: CADyhKSUOet3qsJ8SjnF7yD9R+TXn+DkdnWkpSWCiHA5AkG_q2Q@mail.gmail.com
Lists: pgsql-hackers

2012/5/29 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Fri, May 25, 2012 at 5:08 PM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>>>> I think it is a good idea not to apply RLS when current user has
>>>> superuser privilege from perspective of security model consistency,
>>>> but it is inconsistent to check privileges underlying tables.
>>>
>>> Seems like a somewhat random wart, if it's just an exception for
>>> superusers.  I think we need to do better than that.  For example, at
>>> my last company, sales reps A and B were permitted to see all
>>> customers of the company, but sales reps C, D, E, F, G, H, I, and J
>>> were permitted to see only their own accounts.  Those sorts of
>>> policies need to be easy to implement.
>>>
>> Probably, if "sales_rep" column records its responsible repo, its
>> security policy is able to be described as:
>>  (my_sales_rep() in ('A', 'B') OR sales_rep = my_sales_rep())
>
> Yes, but that's a pain to optimize.  When A or B tries to select from
> the table, the query optimizer has to realize that my_sales_rep() is
> stable, inline it, do constant simplification and throw away the
> entire OR clause.  Note that this won't work today, because we only
> constant-fold immutable functions, not stable ones.  Then, since there
> are no remaining security quals, we have to realize that we actually
> don't need the security_barrier subquery RTE at all, and optimize that
> away as well.  Maybe we can make all of that work, and maybe we should
> make all of it work, but it's fairly complex.  The advantage of having
> the function return the qual rather than contain the qual is that all
> of that goes away.  The function can choose to return nothing (no RLS
> for this user) or it can choose to return something (which will likely
> be simpler than what it would have needed to return out of the chute).
>  One disadvantage is that we have to parse the returned qual instead
> of just sucking in a node-tree.
>
> Anyway, I don't feel super-strongly about this particular idea, so if
> I'm the only one who likes it, fine, but that having been said, I
> think users are going to want a *declarative* way to control which
> policies are applied to which users.  Suppose Bob is a sales rep who
> is only allowed to see his own customers, but then one day, we decide
> we trust Bob after all, so we want to let him see everything.  We
> could go back and update the IN (...) list in the security policy
> function, but that's an ugly and unscalable nuisance, especially if
> we've got 10,000 users.  It's much nicer to be able to just grant bob
> a permission using some kind of, well, GRANT command.  That's what
> we're doing, after all.  Alastair's proposal of making the security
> policy a property of the GRANT is one way of tackling that, and the
> RLSBYPASS permission I proposed elsewhere is another.  Something along
> these lines seems likely to improve performance (by replacing a query
> optimization problem with a syscache lookup) as well as ease-of-use.
>
My preference is the RLSBYPASS permission rather than the approach
with functions that return the policy clause at run time, because the
latter requires invalidating prepared statements at unpredictable
times.
With the function approach, the RLS policy is generated at the
planner stage, and we cannot make any assumptions about the criteria
the policy depends on. A function might generate the RLS policy based
on the current user id. That case is straightforward: the prepared
statement should be invalidated whenever the current user id is
switched.
However, someone may define a function that generates the RLS policy
depending on the value of "client_min_messages", for example.
Do we need to invalidate prepared statements whenever a GUC is
updated? I think that is overkill. We cannot predict all the criteria
users will want to use to control the RLS policy through functions.
So the RLSBYPASS permission is a simpler way to limit the number of
situations in which prepared statements must be invalidated.
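
To illustrate the concern, here is a rough sketch in Python, not
PostgreSQL code. PlanCache, policy_from_guc and the settings dict are
all hypothetical names made up for this example. The cache replans
only when the user id changes, so a policy function that reads some
other setting leaves a stale plan behind:

```python
class PlanCache:
    """Toy plan cache that invalidates only when the user id changes."""

    def __init__(self, build_plan):
        self._build_plan = build_plan
        self._plans = {}   # statement name -> (user id at plan time, plan)
        self.replans = 0

    def get(self, name, current_user_id):
        cached = self._plans.get(name)
        if cached is None or cached[0] != current_user_id:
            # No plan yet, or the user was switched: plan again.
            plan = self._build_plan(name, current_user_id)
            self._plans[name] = (current_user_id, plan)
            self.replans += 1
            return plan
        return cached[1]


# Hypothetical stand-in for GUC state.
settings = {"client_min_messages": "notice"}


def policy_from_guc(name, user_id):
    # A policy function that depends on a setting the cache does not track.
    return f"{name}: filter for {settings['client_min_messages']}"


cache = PlanCache(policy_from_guc)
first = cache.get("q1", 10)
settings["client_min_messages"] = "debug1"
stale = cache.get("q1", 10)    # same user, so the old plan is reused
fresh = cache.get("q1", 11)    # user switched, so the plan is rebuilt
```

In this sketch "stale" is still the plan built under the old setting.
Catching that would mean tracking every input the function might
read, which is the overkill described above.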

If we had an "ideal optimizer", I'd still prefer that the optimizer
wipe out redundant clauses transparently, rather than relying on the
RLSBYPASS permission, because that permission only gives all-or-nothing
control.
For example, if tuples are categorized as unclassified, classified, or
secret, and the RLS policy is configured as:
((current_user IN ('alice', 'bob') AND X IN ('unclassified',
'classified')) OR (X IN ('unclassified'))),
the superuser can see all the tuples, and alice and bob can see
up to classified tuples.
Is it really hard to wipe out redundant conditions at the planner stage?
If current_user is obviously 'kaigai', it seems to me the left side of
this clause can be wiped out at the planner stage.
Am I oversimplifying the issue?
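
As a rough sketch of the simplification I have in mind (Python, not
planner code; UserIn, ColIn, And, Or and fold are hypothetical names
invented for this example), the policy above can be folded once
current_user is treated as known at plan time. It assumes the plan is
then tied to that user, which is exactly the stable-function problem
Robert pointed out:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserIn:            # current_user IN (...)
    names: frozenset


@dataclass(frozen=True)
class ColIn:             # X IN (...)
    values: frozenset


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


def fold(node, current_user):
    """Replace current_user tests with True/False and simplify AND/OR."""
    if isinstance(node, UserIn):
        return current_user in node.names
    if isinstance(node, ColIn):
        return node          # depends on the row, cannot be folded
    left = fold(node.left, current_user)
    right = fold(node.right, current_user)
    if isinstance(node, And):
        if left is False or right is False:
            return False
        if left is True:
            return right
        if right is True:
            return left
        return And(left, right)
    # Otherwise the node is an Or.
    if left is True or right is True:
        return True
    if left is False:
        return right
    if right is False:
        return left
    return Or(left, right)


policy = Or(
    And(UserIn(frozenset({"alice", "bob"})),
        ColIn(frozenset({"unclassified", "classified"}))),
    ColIn(frozenset({"unclassified"})),
)

for_kaigai = fold(policy, "kaigai")
for_alice = fold(policy, "alice")
```

For 'kaigai' the whole left side disappears and only the X IN
('unclassified') test remains. For 'alice' the current_user test is
dropped but both X tests stay.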

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
