[I want more questions to be answered in this section; please send
them to amk1@bigfoot.com.]
How do I guard against denial-of-service attacks? Or, how do I keep restricted code from consuming a lot of memory?
Even if restricted code can't open sockets or write files, it can still
cause problems by entering an infinite loop or consuming lots of memory;
this is as easy as coding while 1: pass or 'a' * 12345678901.
Unfortunately, there's no way at present to prevent restricted code from
doing this. The Python process may therefore encounter a MemoryError
exception, loop forever, or be killed by the operating system.
One solution would be to perform os.fork() to get a child process
running the interpreter. The child could then use the resource module
to set limits on the amount of memory, stack space, and CPU time it can
consume, and run the restricted code. In the meantime, the parent
process can set a timeout and wait for the child to return its results;
if the child takes too long, the parent can conclude that the
restricted code looped forever, and kill the child process.
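The fork-and-limit approach described above might be sketched as follows. This is only a sketch, written for a current Python 3 on a Unix-like system. The names run_limited and _lower_soft_limit, the default limits, and the plain callable standing in for the restricted code are all assumptions made for illustration; none of them is part of rexec.

```python
import os
import resource
import signal
import time


def _lower_soft_limit(which, value):
    # Hypothetical helper: lower only the soft limit, and never ask
    # for more than the hard limit the process already has.
    soft, hard = resource.getrlimit(which)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(which, (value, hard))


def run_limited(func, cpu_seconds=2, memory_bytes=1024 ** 3, timeout=5.0):
    """Run func() in a forked child; return 'ok', 'failed' or 'timeout'."""
    pid = os.fork()
    if pid == 0:
        # Child: apply the limits, run the code, report via exit status.
        status = 1
        try:
            _lower_soft_limit(resource.RLIMIT_CPU, cpu_seconds)
            _lower_soft_limit(resource.RLIMIT_AS, memory_bytes)
            func()  # stand-in for running the restricted code
            status = 0
        except BaseException:
            status = 1
        finally:
            os._exit(status)  # never fall back into the parent's code

    # Parent: poll until the child exits or the timeout expires.
    deadline = time.time() + timeout
    while time.time() < deadline:
        done, wait_status = os.waitpid(pid, os.WNOHANG)
        if done == pid:
            if os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0:
                return 'ok'
            return 'failed'  # non-zero exit, or killed by a signal
        time.sleep(0.05)

    # The child took too long: assume it is looping and kill it.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return 'timeout'
```

On a system that enforces RLIMIT_AS, a call such as run_limited(lambda: 'a' * 12345678901) would be expected to report 'failed' instead of exhausting memory. Returning real results from the child (through a pipe, for instance) is left out to keep the sketch short.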
If restricted code returns a class instance via r_eval(), can that class instance do nasty things if unrestricted code calls its methods?
You might be worried about the handling of values returned by
r_eval(). For example, let's say your program does this:
    value = r_env.r_eval( expression )
    print str(value)
If value is a class instance, and has a __str__ method, that method
will get called by the str() function. Is it possible for the
restricted code to return a class instance where the __str__ function
does something nasty? Does this provide a way for restricted code to
smuggle out code that gets run without restrictions?
The answer is no. If restricted code returns a class instance, or a function, then, despite being called by unrestricted code, those functions will always be executed in the restricted environment. You can see why if you follow this little exercise. Run the interpreter in interactive mode, and create a sample class with a single method.
    >>> class C:
    ...     def f(self): print "Hi!"
    ...
Now, look at the attributes of the unbound method C.f:
    >>> dir(C.f)
    ['__doc__', '__name__', 'im_class', 'im_func', 'im_self']
im_func is the attribute we're interested in; it contains the actual
function for the method. Look at the function's attributes using the
dir() built-in function, and then look at the func_globals attribute.
    >>> dir(C.f.im_func)
    ['__doc__', '__name__', 'func_code', 'func_defaults', 'func_doc', 'func_globals', 'func_name']
    >>> C.f.im_func.func_globals
    {'__doc__': None, '__name__': '__main__', '__builtins__': <module '__builtin__'>, 'f': <function f at 1201a68b0>, 'C': <class __main__.C at 1201b35e0>, 'a': <__main__.C instance at 1201a6b10>}
See how the function's globals contain an entry for its __builtins__
module? This means that, wherever it goes, the function will always use
the same __builtin__ module, namely the one provided by the
restricted environment.
This means that the function's module scope is limited to that of the restricted environment; it has no way to access any variables or methods in the unrestricted environment that is calling into the restricted environment.
    r_env.r_exec('def f(): g()\n')
    f = r_env.r_eval('f')
    def g(): print "I'm unrestricted."
If you execute the f() function in the unrestricted module, it will
fail with a NameError exception, because f() doesn't have access to
the unrestricted namespace. To make this work, you must insert g into
the restricted namespace. Be careful when doing this, since g will be
executed without restrictions; you have to be sure that g is a
function that can't be used to do any damage. (Or is an instance with
no methods that do anything dangerous. Or is a module containing no
dangerous functions. You get the idea.)
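The namespace behaviour described above can be imitated without rexec. The sketch below assumes a current Python 3, where the function attribute is spelled __globals__ rather than func_globals, and uses exec() with an explicit globals dictionary as a stand-in for the restricted environment. The name restricted_ns is hypothetical, and this imitates only the lookup mechanism; it is not a secure sandbox.

```python
# A namespace standing in for the restricted environment. Its
# __builtins__ entry is a small dict, so code defined here sees only
# these built-in names. (Mechanism only; NOT a secure sandbox.)
restricted_ns = {'__builtins__': {'len': len}}

# Define f inside the stand-in restricted namespace, then pull it out.
exec("def f():\n    return g()\n", restricted_ns)
f = restricted_ns['f']


def g():
    # Defined in the calling (unrestricted) module.
    return "I'm unrestricted."


# f carries its defining namespace with it wherever it is called from.
assert f.__globals__ is restricted_ns

# Calling f from here fails: g is not in f's own namespace.
try:
    f()
    first_error = None
except NameError as exc:
    first_error = str(exc)

# Inserting g into the restricted namespace makes the call work, but
# g itself still runs with the globals of the module that defined it.
restricted_ns['g'] = g
result = f()
```

The second call succeeds only because g was deliberately handed to the restricted namespace, which is why such a function has to be vetted first.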
What happens if restricted code raises an exception?
The rexec module doesn't do anything special for exceptions raised by
restricted code; they'll be propagated up the call stack until a
try...except statement is found that catches them. If no exception
handler is found, the interpreter will print a traceback and exit,
which is its usual behaviour. To prevent untrusted code from
terminating the program, you should surround calls to r_exec(),
r_execfile(), etc. with a try...except statement.
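One way to wrap such calls is sketched below, again assuming a current Python 3. The helper run_guarded and the stand_in_exec function are hypothetical; in a real program the execute argument would be something like the r_exec method of the restricted environment. SystemExit is caught explicitly because it is not a subclass of Exception in current Python and would otherwise still end the program.

```python
import traceback


def run_guarded(execute, source):
    """Call execute(source); return None on success, else the traceback text."""
    try:
        execute(source)
    except (Exception, SystemExit):
        # Keep the failure as text so the caller can log or display it.
        return traceback.format_exc()
    return None


def stand_in_exec(source):
    # Stand-in for a restricted executor; runs source in a fresh namespace.
    exec(source, {})


error = run_guarded(stand_in_exec, "1 / 0")
if error is not None:
    print("Untrusted code failed:")
    print(error)
```

Whether to swallow, log, or re-raise the failure is a design choice for the calling program; the sketch only shows that the program keeps running.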
Python 1.5 introduced exceptions that could be classes; for more information about this new feature, consult http://www.python.org/doc/essays/stdexceptions.html. Class-based exceptions present a problem; the separation between restricted and unrestricted namespaces may cause confusion. Consider this example code, suggested by Jeff Rush.
t1.py:
    # t1.py
    from rexec import RHooks, RExec
    from t2 import MyException

    r = RExec()
    print 'MyException class:', repr(MyException)
    try:
        r.r_execfile('t3.py')
    except MyException, args:
        print 'Got MyException in t3.py'
    except:
        print 'Missed MyException "%s" in t3.py' % repr(MyException)
t2.py:
    # t2.py
    class MyException(Exception): pass

    def myfunc():
        print 'Raising', `MyException`
        raise MyException, 5

    print 't2 module initialized'
t3.py:
    # t3.py
    import sys
    from t2 import MyException, myfunc
    myfunc()
So, t1.py imports the MyException class from t2.py, and then executes
some restricted code that also imports t2.py and raises MyException.
However, because of the separation between restricted and unrestricted
code, t2.py is actually imported twice, once in each mode. Therefore
two distinct class objects are created for MyException, and the except
statement doesn't catch the exception because it seems to be of the
wrong class.
The solution is to modify t1.py to pluck the class object out of the
restricted environment, instead of importing it. The following code
will do the job, if added to t1.py:
    module = r.add_module('__main__')
    mod_dict = module.__dict__
    MyException = mod_dict['MyException']
The first two lines simply get the dictionary for the __main__ module;
this is a usage pattern discussed above. The last line simply gets the
value corresponding to 'MyException', which will be the class object
for MyException.
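The class mismatch and its fix can be reproduced without rexec. The sketch below assumes a current Python 3 and imitates the double import by executing the same class definition in two separate dictionaries. The names outer_ns, inner_ns and classify are hypothetical, and nothing here is restricted; only the class-identity behaviour is being shown.

```python
# The same source executed twice, as happens when t2.py is imported
# once in each mode, produces two unrelated class objects.
source = "class MyException(Exception):\n    pass\n"
outer_ns = {}   # stands in for the unrestricted importer (t1.py)
inner_ns = {}   # stands in for the restricted environment
exec(source, outer_ns)
exec(source, inner_ns)


def classify(raised, expected):
    """Raise an instance of `raised`; report whether `expected` catches it."""
    try:
        raise raised(5)
    except expected:
        return 'caught'
    except Exception:
        return 'missed'


# Catching with the separately created class misses the exception.
mismatch = classify(inner_ns['MyException'], outer_ns['MyException'])

# Plucking the class out of the namespace that raises it fixes this.
MyException = inner_ns['MyException']
fixed = classify(inner_ns['MyException'], MyException)

print(mismatch, fixed)
```

The two classes have the same name and the same definition, yet the except clause compares class objects, not names, which is why only the plucked class matches.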