Possible major bug in PlPython (plus some other ideas)

From: Kevin Jacobs <jacobs(at)penguin(dot)theopalgroup(dot)com>
To: <pgsql-hackers(at)postgresql(dot)org>
Cc: Kevin Jacobs <jacobs(at)penguin(dot)theopalgroup(dot)com>
Subject: Possible major bug in PlPython (plus some other ideas)
Date: 2001-11-08 13:39:30
Message-ID: Pine.LNX.4.33.0111080752500.29882-100000@penguin.theopalgroup.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi everyone,

I have noticed a possibly major issues in Plpython that may need to be
addressed before 7.2 is released:

1) If Plpython is installed as a trusted language, and from what little I
can glean from the documentation, it should not have any filesystem access.
However, the default behavior of the restricted execution environment
being used allows read-only filesystem access.

Here is the current behavior (from the Python Library Reference):

r_open(filename[, mode[, bufsize]])

Method called when open() is called in the restricted environment.
The arguments are identical to those of open(), and a file object (or
a class instance compatible with file objects) should be returned.
RExec's default behavior is allow opening any file for reading, but
forbidding any attempt to write a file. See the example below for an
implementation of a less restrictive r_open().

It is fairly easy to override this method to unconditionally raise an
access exception.

I have some other suggestions that may not be appropriate for the 7.2
release, but think should be addressed before too long:

2) I'm not sure why the TD dictionary exists. Why not create objects
'new', 'old' or 'event' in the global namespace when the interpreter is
called in the appropriate contexts? The current way is unwieldy, not
very 'Pythonic' (a frequent justification for change in the Python
world), and not consistent with other PostgreSQL procedural backends.
Its possible to keep TD for backward compatibility, so there is no
downside.

3) 'old' and 'new' should also provide class-like syntax:

e.g. old.foo, new.baz (using getitem)

instead of
old['foo'], new['baz'] (using getattr)

Of course we cannot drop the getattr interface, since many valid column
names are not valid python identifiers (I think -- I haven't looked at
the SQL grammar lately, though I'm guessing that is the case).

4) Plpython does not use the standard Python boolean checks, which is also
not very Pythonic and somewhat confusing:

e.g.

CREATE OR REPLACE FUNCTION py_true() RETURNS bool AS '
return "a"
' LANGUAGE 'plpython';

CREATE OR REPLACE FUNCTION py_false() RETURNS bool AS '
return {}
' LANGUAGE 'plpython';

# select py_true();
ERROR: Bad boolean external representation 'a'
select py_false();
ERROR: Bad boolean external representation '{}'

These should return:

# select py_true(); -- non-empty strings evaluate to true
bool
------
t
(1 row)
select py_false(); -- empty dictionaries evaluate to false
bool
------
f
(1 row)

I suggest changing the semantics of boolean return values to use
PyObject_IsTrue(PyObject *o) to properly test for truth values.

5) It should be trivial to create an "untrusted" version of Plpython that
bypasses the restricted execution environment. This is worthy of some
consideration, since it may be very useful and can be implemented with
relative ease.

6) [Very low priority] Its not insane to consider a Plpython procedure
that spawns off a Python thread to do background processing tasks.
This is obviously something that will only be possible in an untrusted
version of the interpreter. Also, if the SPI interface is thread-safe,
then it may be useful to release the Python interpreter lock around
some of the SPI calls.

OK, so I've got a laundry list of issues. Only the first issue is a real
show-stopper in my mind, though I'd like to see at least 1-4 addressed
before 7.2 or 7.2.1, if at all possible. After some discussion, I'll even
by happy to implement most/all of these items, though I'd like more
collaboration than just submitting patches blindly for consideration.

Thanks,
-Kevin Jacobs

PS: Oh, I'd like to thank everyone working on PostgreSQL for the wonderful
job they've done. I'm a _very_ new user and am in the process of
porting a _very large_ project from MySQL/MSSQL to PostgresQL, for
obvious reasons. I had expected the process to be painful and tedious,
but instead it has been a real pleasure and I have enjoyed exploring all
of the wonderful things you all have given me to play with.

--
Kevin Jacobs
The OPAL Group - Enterprise Systems Architect
Voice: (216) 986-0710 x 19 E-mail: jacobs(at)theopalgroup(dot)com
Fax: (216) 986-0714 WWW: http://www.theopalgroup.com

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message mlw 2001-11-08 13:52:22 Re: Storage Location Patch Proposal for V7.3
Previous Message Stefan Rindeskar 2001-11-08 12:52:40 Re: Storage Location Patch Proposal for V7.3