From: | Zhaomo Yang <zhy001(at)cs(dot)ucsd(dot)edu> |
---|---|
To: | Stephen Frost <sfrost(at)snowman(dot)net> |
Cc: | pgsql-hackers(at)postgresql(dot)org, tgl(at)sss(dot)pgh(dot)pa(dot)us, Craig Ringer <craig(at)2ndquadrant(dot)com>, kaigai(at)ak(dot)jp(dot)nec(dot)com, Kirill Levchenko <klevchen(at)cs(dot)ucsd(dot)edu> |
Subject: | Re: A mechanism securing web applications in DBMS |
Date: | 2014-09-05 22:43:13 |
Message-ID: | CA+0EDdCvYPF4WJpnjNEkkk+Drc5iz02=jkWypYrGScu56=nToA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Stephen,
There are several things I want to explain:
(1) Two philosophies important to our design
- Try to not force web application developers to make too many changes
to their apps if they wa.
- Try to make our mechanism as simple as possible.
Web application developers have all kinds of backgrounds. If the
security mechanism is too alien to them, they wouldn't use it.
(2) Why we need to cache application-level users' identifiers
We want to differentiate application-level users in DBMS, but not by
creating a DB user (or role in PG's terminology ) for every
application-level user, otherwise there will be all sorts of problems
when the number of application-level users is greater than a threshold
(e.g. catalog, as you mentioned). Instead, we still use one DB user
representing all the application-level users, just as how web apps
work now. Besides the identifiers (attributes) of a application-level
user are stored in some private place of the corresponding session
(e.g. temp table) when the application-level user authenticates so
that the DBMS can differentiate application-level users. (Connection
pooling should be fine as long as an application session doesn't
return its connection until it finishes. )
Normally, a web application authenticates an application-level user by
making a SELECT query with the user provided user id and password on
the password table to see if there is a match (Of course this is an
over simplified version of how authentication works. ). Using our
mechanism, the web application instead calls the authentication
function, which does a SELECT on the table first, and store the
identifiers of that application-level user somewhere if a match found.
The identifiers of the current application-level user are referenced
by the policies so that fine-grained access control can be enforced.
(3) CREATE AUTHENTICATION FUNCTION
In our mechanism, we ask web application developers provide an
authentication function which normally takes user id and password as
inputs and returns a row containing all the identifiers (attributes)
of the corresponding application-level user. Let us call the place
storing the current application-level user's identifiers as
"identifier store".
For example,
This is an authentication function provided by a app developer.
CREATE AUTHENTICATION FUNCTION auth (user_id TEXT, password TEXT)
RETURNS table(uid BIGINT, permissions TEXT[]) AS $$
...
$$ LANGUAGE plpgsql SECURITY DEFINER;
Under the hood, two regular functions will be defined
# the same function with a different name
CREATE FUNCTION _auth (user_id TEXT, password TEXT)
RETURNS table(uid BIGINT, permissions TEXT[]) AS $$
# copy the function body from the CREATE AUTHENTICATION FUNCTION above
$$ LANGUAGE plpgsql SECURITY DEFINER;
# the function which is actually called in the server code
CREATE FUNCTION auth (user_id TEXT, password TEXT)
RETURNS table(uid BIGINT, permissions TEXT[]) AS $$
# clear the identifier store
# execute function _auth and insert the result into the identifier store.
# return the row in the identifier store
$$ LANGUAGE plpgsql SECURITY DEFINER;
The whole point of this CREATE AUTHENTICATION FUNCTION syntax is to
reduce developers' work. By giving developers very specific
instructions on how to write an authentication function, we hope they
would find it easy to write one. Admittedly, however, what CREATE
AUTHENTICATION FUNCTION does can be achieved by CREATE FUNCTION.
Please let me know if you have any other questions.
Zhaomo
On Thu, Sep 4, 2014 at 6:53 PM, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Zhaomo,
>
> * Zhaomo Yang (zhy001(at)cs(dot)ucsd(dot)edu) wrote:
>> I am a graduate student from UC San Diego. My adviser, Dr. Kirill
>> Levchenko, and I have been working on a web/DB security project for
>> the last few months. Since fine-grained access control in DBMS is part
>> of our project and the PostgreSQL community is also working on it now,
>> we would like to exchange some ideas on the issue with you.
>
> Fantastic! Very interested.
>
>> 1. Background
>> Nowadays people entrust more and more sensitive data to web
>> applications, while security vulnerabilities are common in these
>> applications. The normal structure of web applications consists of
>> three tiers: client-side code, server-side code and a database. In
>> server-side code a database user representing all the
>> application-level users interacts with the database with full
>> privileges. Since currently database built-in access control is too
>> coarse, no access control mechanism is used in most of web
>> applications. This situation is not ideal since a malicious
>> application user can tamper with other users’ data by exploiting
>> vulnerabilities in the application, and if the application is
>> completely under the malicious user’s control so is the database.
>
> Agreed- we are certainly working to improve that situation, though
> consideration must also be given to how our catalogs are structured and
> that they are unlikely to support "web-scale" numbers of individual
> roles as currently implemented. We're in a better spot than we were a
> few years ago (we used to have flat files for roles...), but we lack any
> partitioning capability in the catalogs today.
>
> It's entirely possible that, in some situations, the inability of PG to
> support the number of roles has dwarfed the permission granularity
> concern. Even today there are mechanisms available to limit what an
> individual role can view, add, update, or delete (though they are less
> than ideal, rather cludgy to work with and could be better).
>
>> 2. Our Mechanism.
>> In order to fix the problem, we believe there are two underlying
>> problems to be addressed: authenticating application-level users to
>> DBMS and fine-grained access control in the DBMS. Note that these two
>> problems are related. Without authenticating application-level users
>> to the DBMS, can the DBMS not know which application-level user it is
>> actually interacting with, thus the DBMS cannot apply fine-grained
>> access control policy correctly.
>
> Agreed- PG needs to know the user to be able to get to the level of
> returning only what that individual user is allowed to see.
>
>> 2.1 The authentication problem
>> We address the authentication problem by requiring developers to
>> define an authentication function in the DBMS. This function is
>> invoked whenever an application-level user logs in. An authentication
>> function contains the authentication logic in the server-side code.
>> Besides, it also caches some attributes of the current
>> application-level user, which will be checked by the fine-grained
>> access control mechanism.
>
> This is one approach which can be used, though PG has a reasonably
> complex authentication system which applications can leverage (consider
> Kerberos proxying and client-side certificates), rather than having the
> application independently develop an authentication system. Still, if
> the application is to be responsible for the authentication, then a
> simple 'SET ROLE' can be done in PG to switch to the context of an
> individual user.
>
>> In our implementation, we added a new type of function called
>> authentication function. Developers are supposed to use CREATE
>> AUTHENTICATION FUNCTION command to define authentication function,
>> which contains the application authentication logic and must be
>> defined to return table rows. When the authentication function is
>> invoked, the return values are cached in a temp table, called the
>> authentication table, with the same name as the authentication
>> function. The authentication table is available to access control
>> statements (GRANT-WHERE statements).
>
> It'd be great if you could explain the actual difference, in PG, between
> these AUTHENTICATION functions and regular functions (or perhaps
> security definer functions), which can also create tables and return
> rows.
>
>> 2.2 The fined-grained access control problem
>>
>> We address the fined-grained access control problem by using
>> GRANT-WHERE statements from [1]. Concretely, the syntax of GRANT-WHERE
>> statements is
>>
>> GRANT <privilege> [, ...] ON <relname> TO <user> [, ...]
>> USING <referenced relname> [, ...]
>> WHERE <predicate>;
>>
>> where <privilege> can be SELECT, INSERT, UPDATE or DELETE, the USING
>> clause lists the tables of which should be available to the predicate
>> and the WHERE clause specifies the predicate.
>
> Certainly an interesting alternative, though you run the very serious
> risk that the SQL standard will come up with an extention to the GRANT
> syntax that will end up causing problems long-term (this is part of the
> reason for the CREATE POLICY syntax that we're currently working
> towards).
>
>> 3. Integrate our mechanism to PostgreSQL.
>>
>> 3.1 Authentication Function
>>
>> The major challenge of this part is how to cache users’ authentication
>> information. In our prototype implementation, an authentication table
>> is actually a temporary table. But it makes our GRANT-WHERE statements
>> also temporary since they refer to the authentication table. Since
>> every session has its own authentication table and all the
>> authentication tables share the same schema, the global temporary
>> table [2] would be a perfect fit for caching users’ authentication
>> information. Also, it will not make GRANT-WHERE statements temporary
>> anymore.
>
> This certainly deserves further discussion. What is the reason that a
> cacheing mechanism into a temporary table is necessary..? Are these
> session identifiers to keep track of already-authenticated users? How
> does that play with a connection pooler (which is a very real
> consideration when thinking about anything web-scale).
>
>> 3.2 Fined-Grained Access Control
>>
>> We noticed that the PostgreSQL community is currently working on
>> fine-grained access control. Access control policies can be defined in
>> form of “ALTER TABLE … SET ROW SECURITY” statement. Although
>> functionally “ALTER TABLE … SET ROW SECURITY” statements and
>> GRANT-WHERE statements are equivalent, however, there are some
>> advantages in the latter one:
>>
>> 1) GRANT-WHERE statement is very similar to the SQL GRANT statement,
>> which should be already familiar to most of the developers, thus the
>> developers may feel more comfortable to use it.
>
> To be frank, I don't really buy this argument. There are certainly
> arguments to be had for why GRANT would make sense instead of ALTER
> TABLE, but "developer comfort" isn't one of them, and the concern
> regarding the SQL specification for the GRANT command trumps it.
>
>> 2) In GRANT-WHERE syntax, a DB user can be specified as the recipient
>> of the GRANT. In the current proposed row security syntax, one needs
>> to write a big statement containing all the predicates for all the
>> possible recipients per table per operation.
>
> This isn't actually accurate- the latest CREATE POLICY syntax includes
> the ability to specify roles to which the policy is to be applied.
> Further, the next patch will include multiple expressions to address
> what happens during UPDATE (where you have to consider what rows are
> allowed to be selected, and what rows are allowed to be added back into
> the table). That would certainly further complicate the already quite
> complicated GRANT syntax, and not to much advantage that I can see.
>
>> For example, suppose we have the following two GRANT-WHERE statements on table1
>
> [...]
>
>> As you can see, this syntax requires us to combine two logically
>> separate statement into one statement. Also, it can be hard for
>> developers to debug.
>
> We've moved off of ALTER TABLE for this approach for a number of quite
> reasonable reasons and have moved to an independent CREATE POLICY
> approach, which addresses the above concerns.
>
> That said, I do feel that in many situations the reality is that users
> will simply create a stored procedure (in plpgsql or similar) which then
> evaluates what is being requested against another table or possibly even
> an external policy system (eg: SELinux) to determine if the record
> should be returned. This runs the risk of not being as performant,
> which is absolutely a concern we are considering, but supporting the
> per-role policies (which are already in the latest patch) will hopefully
> address some of that concern, and we may be able to do even better in
> the future.
>
>> 4. Collaboration.
>>
>> We are looking forward to joining forces with PostgreSQL community on
>> this problem. More specifically, we have the following suggestions
>> which we hope the community would consider:
>>
>> 1) Using GRANT-WHERE syntax for the row-level access control. If for
>> any reasons it is impossible, we hope at least a clause will be added
>> to the “ALTER TABLE … SET ROW SECURITY” syntax such that a recipient
>> can be specified;
>
> Right, already done and integrated into the row-security patch (though
> the specific syntax is CREATE POLICY, as discussed previously). We
> certainly welcome feedback on that patch and would love to discuss
> further- feel free to contact me directly and we'll set up time to
> discuss.
>
>> 2) Restarting Global Temporary Table project;
>
> I'm certainly for this.
>
>> 3) Accepting “CREATE AUTHENTICATION FUNCTION” as a variant of “CREATE
>> FUNCTION” command.
>
> More information about how it's different from CREATE FUNCTION would be
> very helpful for this.
>
>> We’d like to discuss the mechanisms above with the community further,
>> and if any of the mechanisms are accepted by the community, we also
>> would like to take part in implementing it. Please feel free to
>> contact us if you have any questions.
>
> Consider it done- I'd like to collaborate with you on the row-security
> discussion primairly, but also about how to manage very many users in
> PG and the authentication challenges.
>
> Thanks!
>
> Stephen
From | Date | Subject | |
---|---|---|---|
Next Message | Marko Tiikkaja | 2014-09-05 22:53:21 | Re: Re: [PATCH] parser: optionally warn about missing AS for column and table aliases |
Previous Message | David Johnston | 2014-09-05 22:35:05 | Re: Re: [PATCH] parser: optionally warn about missing AS for column and table aliases |