Re: SE-PostgreSQL and row level security

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, bogdan(at)omnidatagrup(dot)ro, David Fetter <david(at)fetter(dot)org>, KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: SE-PostgreSQL and row level security
Date: 2009-02-16 14:58:05
Message-ID: 603c8f070902160658k3e9ae794r286466ef35fe7273@mail.gmail.com
Lists: pgsql-hackers

>> Both of SE-PostgreSQL and vanilla PostgreSQL don't give an assurance to
>> eliminate information leaks via such kind of covert channels, so they
>> don't violate any specifications of them. Thus, it is not a problem.
>
> If that's true then I don't see why we would try to automatically hide records
> you don't have access to. The only reason to do so is to try to close these
> covert channels and if we can't do that then I don't see any benefit to doing
> so.

So, this email really got me thinking, and after thinking about it for
a while I think you're wrong about this part. :-)

If we had no security in the database at all (no table or column
privileges, no login roles or privileges - everyone connects as
superuser!) then we wouldn't have any covert channels either. Covert
channels, by definition, are methods by which access controls can be
partially or completely subverted, so if there are no access controls,
there are no covert channels, either. In some sense, covert channels
are the degree to which it's possible to work around the overt security
controls.

It's worth noting that this is almost never zero. There are papers
out there about subverting SSH by measuring the length of time that
the remote machine takes to reject your request for access and
inferring from that at what stage of the process authentication
failed, and from that eventually being able to crack the system. Of
course they only got it working on a local LAN with a fast switch and
probably not a lot of other traffic on the network, but so what? The
point is that there is information there, as it is in every system,
and so the question is not "Are there covert channels?" but "Are the
covert channels sufficiently large so as to render the system not
useful in the real world?".

I haven't seen anyone present a shred of evidence that this would be
the case in SE-PostgreSQL. Even if you can infer the existence of a
referring key, as Kevin Grittner just pointed out in another email on
this thread, that may not be that helpful. The information is likely
to be some sort of unexciting key, like an integer or a UUID or (as in
Kevin's example) a sequentially assigned case number. Maybe if you're
really lucky and have just the right set of permissions you'll be able
to infer the size of the referring table, and there could be
situations where that is sensitive information, but Kevin's example is
a good example of a case where it's not: the case load of the family
court (or whatever) is not that much of a secret. The names of the
people involved in the cases are.
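
To make the foreign-key inference concrete, here is a minimal sketch. It
uses Python's sqlite3 module as a stand-in for PostgreSQL, and the table
names (cases, sealed_filings) and the helper has_hidden_reference are
made up for illustration; nothing here is SE-PostgreSQL code. The caller
never reads sealed_filings, yet a failed DELETE on the parent row reveals
that some hidden row refers to it. All that leaks is the existence of a
reference to an unexciting integer key.

```python
import sqlite3


def build_fk_demo():
    """Create a parent table and a child table the caller is assumed unable to read."""
    conn = sqlite3.connect(":memory:")
    # Foreign-key enforcement is off by default in SQLite; enable it up front.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE cases (id INTEGER PRIMARY KEY);
        -- Hypothetical table that row-level security would hide from the caller.
        CREATE TABLE sealed_filings (
            id INTEGER PRIMARY KEY,
            case_id INTEGER NOT NULL REFERENCES cases(id)
        );
        INSERT INTO cases (id) VALUES (1), (2);
        INSERT INTO sealed_filings (case_id) VALUES (1);
    """)
    return conn


def has_hidden_reference(conn, case_id):
    """Infer whether any row refers to case_id without reading the child table."""
    try:
        conn.execute("DELETE FROM cases WHERE id = ?", (case_id,))
        referenced = False
    except sqlite3.IntegrityError:
        # The constraint violation is the covert channel.
        referenced = True
    finally:
        # Undo the probe so the data is left unchanged either way.
        conn.rollback()
    return referenced
```

Calling has_hidden_reference(conn, 1) reports a reference and
has_hidden_reference(conn, 2) does not, which is the whole extent of the
leak in this toy setup.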

> If users want to select "all matching records the user has access to" they
> should just put that in the WHERE clause (and we should provide a convenient
> function to do so). If we implicitly put it in the WHERE clause then
> effectively we're providing incorrect answers to the SQL query they did
> submit.
>
> This is a big part of the "breaking SQL semantics" argument. Since the
> automatic row hiding provides different answers than the SQL query is really
> requesting it means we can't trust the results to follow the usual rules.

The requested functionality is no different in its effect than writing
a custom view for each user that enforces the desired permissions
checks, but it is a lot more convenient.
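
As a rough illustration of the per-user-view equivalence, the sketch
below builds one view per user over a labeled table. It uses Python's
sqlite3 module in place of PostgreSQL, and every name in it (cases,
user_labels, cases_for_clerk, cases_for_judge) is hypothetical. The same
SELECT against each view returns a different row set, which is the effect
automatic row hiding would have, minus the bookkeeping of maintaining the
views by hand.

```python
import sqlite3

# Views a caller may ask for; used as an allowlist because view names
# cannot be bound as SQL parameters.
KNOWN_VIEWS = {"cases_for_clerk", "cases_for_judge"}


def build_view_demo():
    """Create a labeled table plus one hand-written view per user."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cases (id INTEGER PRIMARY KEY, party TEXT, label TEXT);
        INSERT INTO cases (party, label) VALUES
            ('Alice v. Bob', 'public'),
            ('Carol v. Dave', 'sealed'),
            ('Erin v. Frank', 'public');

        -- Hypothetical mapping from users to the labels they may read.
        CREATE TABLE user_labels (username TEXT, label TEXT);
        INSERT INTO user_labels VALUES
            ('clerk', 'public'),
            ('judge', 'public'),
            ('judge', 'sealed');

        CREATE VIEW cases_for_clerk AS
            SELECT c.id, c.party FROM cases c
            WHERE c.label IN
                (SELECT label FROM user_labels WHERE username = 'clerk');

        CREATE VIEW cases_for_judge AS
            SELECT c.id, c.party FROM cases c
            WHERE c.label IN
                (SELECT label FROM user_labels WHERE username = 'judge');
    """)
    return conn


def visible_parties(conn, view_name):
    """Run the same query against a user's view and return the party names."""
    if view_name not in KNOWN_VIEWS:
        raise ValueError(f"unknown view: {view_name}")
    rows = conn.execute(f"SELECT party FROM {view_name} ORDER BY id").fetchall()
    return [row[0] for row in rows]
```

With this setup the clerk's view yields the two public cases and the
judge's view yields all three.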

> I think there's more to it though. Tom pointed out some respects in which the
> hooks are too late and too low level to really know what privilege set is in
> effect. The existing security checks are all performed earlier in plan
> execution, not at low level row access routines. This is a more fundamental
> change which you'll have to address before for *any* row level security scheme
> even without the automatic data hiding.
>
> So, assuming the SELinux integration for existing security checks is committed
> for 8.4 I think the things you need to address for 8.5 will be:
>
> 1) Row level security checks in general (whether SELinux or native Postgres
> security model) and showing that the hooks are in the right places for
> Tom's concerns.
>
> 2) Dealing with the scaling to security labels for billions of objects and
> dealing with garbage collecting unused labels. I think it might be simpler
> to have security labels be explicitly allocated and dropped instead of
> creating them on demand.
>
> 3) The data hiding scheme -- which frankly I think is dead in the water. It
> amounts to a major change to the SQL semantics where every query
> effectively has a volatile function in it which produces different answers
> for different users. And it doesn't accomplish anything since the covert
> channels it attempts to address are still open.

One thing that I do think is a legitimate concern is performance,
which I think is some of what you're getting at here. An iterative
lookup of the security ID for each row visited basically amounts to
forcing row-level security to be checked using a nested loop plan, but
it's probably not hard to construct scenarios where that isn't a very
good plan. Surely we want to be able to index the relation on the
security ID and do bitmap index scans, etc.
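
A small sketch of the contrast, again using Python's sqlite3 module as a
stand-in (SQLite has no bitmap index scans, so this only shows the
general idea of filtering on an indexed security ID rather than checking
each row as it is visited). The table docs, the column label_id, and the
index docs_label_idx are invented for the example and say nothing about
how SE-PostgreSQL actually stores labels.

```python
import sqlite3


def build_label_demo(n_rows=1000, n_labels=10):
    """Create a table whose rows carry a security ID, indexed on that ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, label_id INTEGER)")
    conn.executemany(
        "INSERT INTO docs (id, label_id) VALUES (?, ?)",
        [(i, i % n_labels) for i in range(n_rows)],
    )
    conn.execute("CREATE INDEX docs_label_idx ON docs (label_id)")
    conn.commit()
    return conn


def visible_ids_per_row(conn, allowed):
    """Visit every row and check its label one at a time (the iterative style)."""
    return sorted(
        row_id
        for row_id, label_id in conn.execute("SELECT id, label_id FROM docs")
        if label_id in allowed
    )


def visible_ids_indexed(conn, allowed):
    """Let the database filter on the security ID so it can use the index."""
    labels = sorted(allowed)
    placeholders = ", ".join("?" for _ in labels)
    query = f"SELECT id FROM docs WHERE label_id IN ({placeholders})"
    return sorted(row[0] for row in conn.execute(query, labels))


def plan_details(conn, label_id):
    """Return the planner's description of a single-label lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM docs WHERE label_id = ?", (label_id,)
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return [row[-1] for row in rows]
```

Both functions return the same IDs; the difference is that the second
gives the planner a predicate on label_id, so it is free to choose an
index-based plan when that is cheaper.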

...Robert
