Re: [v9.2] Object access hooks with arguments support (v1)

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [v9.2] Object access hooks with arguments support (v1)
Date: 2011-10-18 15:25:34
Message-ID: CADyhKSXirpiqBg4_HGC56TOY+3CiFLRxBdtmTMZXm6RdzaQ_Eg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2011/10/18 Robert Haas <robertmhaas(at)gmail(dot)com>:
> On Thu, Oct 13, 2011 at 6:48 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>>    struct ObjectAccessInfoData {
>>        ObjectAccessType   oa_type;
>>        ObjectAddress         oa_address;
>>        union {
>>            struct {
>>                HeapTuple       newtuple;
>>                TupleDesc       tupdesc;  /* only if create a new relation */
>>                    :
>>            } post_create;      /* additional info for OAT_POST_CREATE event */
>>        }
>>    }
>
> That's possibly an improvement over just passing everything opaquely,
> but it still doesn't seem very good.  I mean, this is C, not Perl or
> Python.  Anything where you pass a bunch of arguments of indeterminate
> type and hope that the person you're calling can figure it out is
> inherently pretty dangerous.  I had it in mind that the object access
> hook mechanism could serve as a simple and generic way of letting an
> external module know that an event of a certain type has occurred on a
> certain object, and to let that module gain control.  But if we have
> to pass a raft of arguments in, then it's not generic any more - it's
> just a bunch of things that are probably really need to be separate
> hooks shoved into a single function.
>
I got an inspiration of this idea from APIs of foreign-data-wrapper that
trusts values of each fields that contains function pointers.
Indeed, we may have a risk of backend crash, if we try to run modules
built towards v9.1 on the pgsql-v9.2devel. However, it is not avoidable,
unless we rewrite whole of the code by Java?, Python? or something? :-(

At least, we don't promise binary compatible across major versions?
If so, I don't think dangerousness is a reasonable reason why the hook
does not take any additional arguments, as long as auther of modules
can be noticed by compiler warnings.

The point to be focused on is how do we implement the target feature
(DDL permissions by sepgsql in my case) with simpleness.

>>> Moreover, if
>>> you did document it, I think it would boil down to "this is what
>>> sepgsql happens to need", and I don't think that's an acceptable
>>> answer. ?We have repeatedly refused to adopt that approach in the
>>> past.
>>>
>> Unfortunately, I still don't have a clear answer for this point.
>> However, in general terms, it is impossible to design any interface without
>> knowledge of its usage more or less.
>
> Well, that's true.  But the hook also has to be maintainable.  ISTM
> that there's no reasonable way for someone making a modification to
> the code to know whether the additional local variable that they just
> added to the function should be passed to the hook, or not.  Here, at
> least as far as I can see, there's no guiding principle that would
> enable future contributors to make an intelligent decision about that.
>  And if someone wanted to write another client for the hook, it's not
> very obvious whether the particular things you've chosen to pass here
> would be relevant or not.  I think if you look over the code you'll
> find that there's nowhere else that we have a hook that looks anything
> like what you're proposing here.
>
In the case when the second client of the hook is proposed, a straight-
forward approach is to expand the above ObjectAccessInfoData to
satisfy the requirement of new one, if existing arguments are not enough.
Because existing modules ignore the new fields, it is harmless.

Linux adopts this approch to host multiple security modules without
confusion, although one makes decision by security label and one
other makes decision by pathname of files; they require different
information on the point to make its decision, because they ignore
unrelevant arguments; such as pathname in selinux.

I remember you worry about the number of arguments getting increased
infinity, however, it is an extreme situation. We are not an A.I to commit
proposed patches automatically, so auther of modules (including me)
will explain why the new arguments are additionally needed here, like as
pathname based security module did in Linux kernel development.

>> At least, this proposition enables modules being informed using
>> commonly used data type (such as HeapTuple, TupleDesc), compared
>> to the past approach that tried to push all the necessary information
>> into argument list individually.
>
> That does seem like a good direction to go in, but you're still
> passing a lot of other stuff in there.  I guess my feeling is that if
> we can't boil down the argument list to a short list of things that
> are more-or-less common to all the call sites, we probably need to
> rethink the whole design.
>
Honestly, the arguments of object_access_hook is a method to implement
security model of sepgsql, not a purpose of mine. So, an alternative idea is
quite welcome to implement sepgsql.
However, I think it is less invasive way than other ideas right now.

For example, I hope sepgsql to perform as follows when user create a new table.
- It computes a default security label that needs Oid of the namespace.
- It checks db_table:{create} permission on the security label being computed.
- In addition, it checks db_table:{insert} permission when SELECT INTO
- Also, it checks these permissions on columns being newly created.
- However, It does not check permissions when the tables are internally created,
such as toast tables or temporary relations created by make_new_heap().

Although table creation is the most complex case, it allows us to implement
with just three additional aguments. I tried to implement to deploy separate
hooks on the code path on DefineRelation and OpenIntoRel, but I didn't feel
it is enough simple than the latest approach.

Unfortunately, I have no idea to find out these "contextual information" being
commonly used for all the object classes, so I tried to define my example
ObjectAccessInfoData structure with optional fields depending on event
type and object class.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2011-10-18 16:18:49 Re: pg_ctl restart - behaviour based on wrong instance
Previous Message Simon Riggs 2011-10-18 15:18:14 Re: synchronized snapshots