Re: Updates of SE-PostgreSQL 8.4devel patches (r1197)

From: KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
To: Simon Riggs <simon(at)2ndQuadrant(dot)com>
Cc: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Updates of SE-PostgreSQL 8.4devel patches (r1197)
Date: 2008-11-08 05:21:59
Message-ID: 491521F7.3050703@kaigai.gr.jp
Lists: pgsql-hackers

Simon, thanks for your comments.

> Some initial thoughts based upon reading the Wiki. I've not been
> involved in things up to now, so if this dredges up old discussions,
> well, these are my thoughts.
>
> I want SEPostgreSQL, but I'd like it to work without needing to be a
> compile time option so people can just use it as/when needed. Plus we
> don't want to support what would be/is essentially a fork of Postgres.
> Most CPUs will optimise away a simple if-test in various places.

As Bruce also mentioned, it has to be linked with libselinux to communicate
with in-kernel SELinux, but that is not one of the universal libraries, so it
has to be enabled or disabled by a build-time option.

In addition, we can also disable the feature with the following configuration:
  sepostgresql = disabled   (in $PGDATA/postgresql.conf)

TODO: add a description of the GUC variable. It can have four states:
"default", "enforcing", "permissive" and "disabled".

> Some users will be able to take advantage of the facilities without
> implementing full MLS. Yet we want the full facilities for Government.
> Many people currently run multiple customers in different schemas,
> though this would let them just run a single set of tables so is much
> better for running many small customers.

The SELinux community also provides an MLS-enabled policy, but it is not
the default one now. SE-PostgreSQL has all of its access control
decisions made externally, so that is out of our control.

> I'm very unhappy about putting a nonoptional value on tuple headers,
> especially because it is updatable. Do we expect MVCC will work with
> updatable security contexts? i.e. when you update the security context
> of a tuple that won't effect its visibility to existing system users. I
> can't imagine you'd want that would you? It's kind of difficult to *not*
> get it though.

When a user updates the security context of some tuples, the change is
invisible to other clients until it is committed, just like any other data.
Sorry, it is unclear to me what concern you are raising.

> Looks to me that this feature is useless without things working at row
> level. So we can't leave that part out.

I guess you are saying that core PostgreSQL also has table- and column-level
granularity, but its criteria for making a decision are different.
SE-PostgreSQL makes its decision based on the privileges of the peer process,
not a database role.

> The security context on each row could be an optional column present
> only if HEAP_HASSECURITYCONTEXT is set (0x0010 see htup.h), just like
> OIDs. Use a specific datatype rather than TEXT. That datatype could be
> an identifier to pg_security. Security people have big databases too, so
> we need to compress the security context more and take out parse time of
> string handling. Don't think we should use Oids, they're too big. Might
> be easier to use a 2byte field and restrict access to 32,000 contexts,
> which is easily enough. TEXT also makes me nervous, just in case there
> is some collation/encoding weirdness that allows contexts to be
> subverted. Fixed integers are hard to compromise in that respect.

One issue is who can decide whether the security system attribute exists
or is needed. If the table owner can disable it, we cannot call it
a mandatory access control feature, so the security attribute always has to
be present when the security feature is enabled.

Here is an answer to the expected question.
When we refer to the "security_context" system column of a tuple without
HEAP_HAS_SECURITY, it returns an alternative label called "unlabeled_t",
because the tuple has no label.

The reason we adopted the TEXT type is that SELinux requires a userspace
object manager to make queries via the text representation of a security
context, and it can also be used by other security features to show clients
their security attribute in a human-readable format.

For your information, in-kernel SELinux can handle a theoretical maximum of
2^32 - 1 security contexts internally, so using an Oid as the security
identifier is a fair decision that spares the implementation from handling
overflow cases.
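
To make the identifier-versus-text point concrete, here is a minimal Python
sketch of the idea, not the actual patch code. The class name, the method
names and the starting id are all hypothetical; only "unlabeled_t" and the
role of pg_security come from the discussion above. A tuple stores a small
integer, the catalog maps it back to text, and a tuple with no stored
identifier reads as unlabeled.

```python
from typing import Dict, Optional

UNLABELED = "unlabeled_t"  # fallback label mentioned in this message


class SecurityCatalog:
    """Toy stand-in for pg_security: integer id <-> text label (hypothetical)."""

    def __init__(self) -> None:
        self._by_id: Dict[int, str] = {}
        self._by_label: Dict[str, int] = {}
        self._next_id = 1  # assumed starting value, for illustration only

    def intern(self, label: str) -> int:
        """Return the id of a label; add a catalog row only for a new label."""
        if label not in self._by_label:
            sid = self._next_id
            self._next_id += 1
            self._by_label[label] = sid
            self._by_id[sid] = label
        return self._by_label[label]

    def label_of(self, sid: Optional[int]) -> str:
        """Text form of a stored id; a tuple without one reads as unlabeled."""
        if sid is None:
            return UNLABELED
        return self._by_id[sid]

    def size(self) -> int:
        """Number of distinct labels stored so far."""
        return len(self._by_id)
```

Interning the same text twice returns the same id and adds no row, which is
why such a catalog would be written rarely and read often.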

> How will unique indexes work? Do you implicitly add security context as
> last column on every unique index, or does the uniqueness violation only
> occurs within security contexts, or does the uniqueness violation tested
> against all contextx that the inserter can currently see? Is there a
> change to system catalogs?

The unique/primary key constraint works at the lowest level.
So, it is possible that an invisible tuple prevents a tuple from being
inserted.
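
The behaviour can be sketched with a toy model in Python. This is only an
illustration under assumed names: the Table class, the "secret" and "public"
labels and the visibility callback are hypothetical and are not part of the
patch. Uniqueness is checked against every stored row, while reads are
filtered by label.

```python
class UniqueViolation(Exception):
    """Raised when a key already exists, whether or not the caller can see it."""


class Table:
    def __init__(self):
        self.rows = []  # list of (key, label) pairs

    def insert(self, key, label):
        # The constraint is enforced below the security filter:
        # every row counts, regardless of its label.
        if any(existing == key for existing, _ in self.rows):
            raise UniqueViolation(key)
        self.rows.append((key, label))

    def select(self, can_see):
        # Reads only return rows whose label the client may see.
        return [key for key, label in self.rows if can_see(label)]
```

A client that sees only "public" rows gets an empty result from select(),
yet its insert of a key held by a "secret" row still fails, which reveals
that the hidden row exists.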

> Foreign Key deletions could be handled correctly if you treat them as
> updates. If we have the following example
>
> TableA
> security_context=y value=2 fk=1
>
> TableB
> security_context=x value=1
>
> TableA refers to TableB. Context x cannot see context y.
>
> So if somebody with context x tries to delete value1 from TableB, they
> will be refused because of a row they cannot see. In this case the
> correct action is to update the tuple in TableB so it now has a
> security_context = y. The user with x cannot see it and can be persuaded
> he deleted it, while the user with y can still see it.

This was also discussed before.
In this case, SE-PostgreSQL prevents the deletion of the tuple in TableB
to keep referential integrity. So it enables unprivileged users to
infer the existence of referring tuples, but that is a limitation known
as a "covert channel".

Your example is the simplest case, so it seems to work well, but
most cases are not so obvious. If the TE policy prevents access
to tuples with security_context=x, we have no obvious way to decide
what the proper security context for the update would be.
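
For reference, the TableA/TableB example can be modelled with a short Python
sketch. The function and exception names are hypothetical, and this is a toy
model of the behaviour described above, not the real implementation. The
referential check looks at every referencing row, including rows the caller
cannot see.

```python
class ForeignKeyViolation(Exception):
    """Raised when a referencing row exists, visible to the caller or not."""


def visible(rows, can_see):
    """Rows of a table that a client with the given visibility rule can read."""
    return [row for row in rows if can_see(row["security_context"])]


def delete_from_table_b(table_a, table_b, value):
    """Delete value from TableB unless any TableA row still references it."""
    # The integrity check ignores labels, so hidden rows still block the delete.
    if any(row["fk"] == value for row in table_a):
        raise ForeignKeyViolation(value)
    table_b[:] = [row for row in table_b if row["value"] != value]


# The example from the quoted message.
table_a = [{"security_context": "y", "value": 2, "fk": 1}]
table_b = [{"security_context": "x", "value": 1}]


def sees_only_x(label):
    # Context x cannot see context y.
    return label == "x"
```

A client in context x sees no rows in TableA, but its delete of value 1 from
TableB is still refused. That refusal is the covert channel.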

> The section on LOAD doesn't sound very convincing. Loading a module
> could immediately subvert security. We could probably tighten up on that
> for general use as well. ISTM we need something like a new catalog table
> for loadable modules, so we can give them specific security contexts and
> potentially store some kind of verification information about them.
> Having a single "can load modules" isn't enough with Postgres, since it
> would effectively prevent us from loading *any* modules in a secure
> database. Which kinda removes much of the benefit of using Postgres.

SE-PostgreSQL applies access controls to individual loadable modules.
When a function implemented in an external module is about to be loaded,
it checks the security context of the database against that of the loadable
module file. (Needless to say, in-kernel SELinux assigns security contexts
to files in a common format, so we can compare them with each other.)

Perhaps the description I wrote is easy to misunderstand.
If it reads as though there is one wide-ranging "load modules" permission,
I'll revise its wording.

http://wiki.postgresql.org/wiki/SEPostgreSQL#Loading_shared_library_module

> Is there an issue with relation cache and catalog caches? ISTM that you
> should put a security context onto each shared invalidation message, so
> that backends know not to maintain caches for data they aren't allowed
> to see. Probably overkill, just thinking.

Sorry, I cannot understand what you are concerned about.
The pg_security system catalog is only modified when a new text
representation of a security attribute is given. So it is read-only
in most cases.

> The interaction with SELinux should not be hardcoded. I think we need
> some form of plugin/wrapper to allow other systems to work. Not sure
> what those are, but this isn't a Linux only project. We want to give
> everybody the ability to work with PostgreSQL.

The PGACE security framework already enables what you want.
The "rowacl" module is a proof of concept, and we can add a new mechanism
on top of the framework if necessary.
Do you know the LSM (Linux Security Modules) framework?

> How does Discretionary Access control work with regard to server logs?
> You said there was no superuser access, but I don't see any controls for
> the logs. Do we need log_max_security_context?

The server logs are written out to the filesystem, so access controls on
them should be applied by in-kernel SELinux. SE-PostgreSQL does not care
about system call invocations, because that is a task for kernel features.
It works as a reference monitor for SQL queries.

> Trusted procedures seem very similar to SECURITY DEFINER functions. Can
> you explain what the differences are? I'm sure we don't want to similar
> features.

A SECURITY DEFINER function does not change the working security context of
the client.

Please consider whether or not a set-uid program on the operating system
makes SELinux domain transitions unnecessary. They have separate
purposes and roles, so we need both of them.

> I don't see any reason for the "=" in most of the new DDL syntax. Just
> seems superfluous and out of character with normal SQL DDL.

Because two years ago I thought this syntax was the friendliest for
end users. It makes it obvious that the option specifies the security
context of the resource.
Sorry, it is not a strong reason, but it makes the option easy to
understand.

> With DDL we already have table options, so why not include
> security_context as an option? If we are adding this to databases as
> well, it seems a good time to include a generic database-level options
> facility and make "security_context" the first option.

How would we implement the "security_context" option for columns?
When we want to define a table with an explicitly labeled column,
I don't think a table option is a user-friendly interface.

> The parameter to enable this facility should be something like
> enhanced_security = on

As I noted above, it already provides:
sepostgresql = [ default | enforcing | permissive | disabled ]

> My feeling is that if you want to include these features in core
> PostgreSQL you won't be able to maintain the "separate branding" of
> SEPostgreSQL, logo etc.. Maybe I'll get used to it. But if we are going
> to use it, we should say SEPostgresSQL, not SE-PostgreSQL if we are
> saying SELinux.

Please feel free to call it SE-PostgreSQL, SEPostgreSQL, sepgsql
or anything else. I don't think it is a significant matter.
However, I would prefer not to discard the mascot and logo,
because they were a present from my friend who works as an illustrator.

In Japan, a turtle is used as a mascot of PostgreSQL. :-)
http://www.postgresql.jp/npo/logo/

> It would make sense for you to look at the work on replication also. Not
> much use having SEPostgreSQL if we can't replicate it securely. And
> possibly in-place update so that respects label security.

The log-shipping replication mechanism should work correctly,
but I have not evaluated it yet. In this case, the master
and slaves should have the same security policy for databases.

> I've not looked at the code and probably won't have time to do that. But
> if I did, I'd say "diff -c".

OK, I'll fix the scripts that generate the patch set.
The next series will be provided in "diff -c" style.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
