For some applications, it's desirable to execute chunks of Python code that come from an outside source. The most obvious example is a Web browser such as Grail, which can download and execute applets written in Python.
An obvious danger of downloading and running code from anywhere is that someone might write a malicious applet that appears to be harmless, but silently erases files, makes copies of sensitive data, or gives the applet's author a back door into your system. The solution is to run the code in a restricted environment, where it's prevented from performing any operations that could be used maliciously.
Java does this by using the Java Virtual Machine, which executes Java bytecode. The virtual machine, or VM, has complete control over the running applet, and any dangerous operations must go through the VM in order to be performed. The VM can therefore trap suspicious activity and either stop the applet's execution, if a strict security policy is in force, or ask the user whether the operation should be permitted, if the policy is somewhat looser.
Python already has a virtual machine that executes Python bytecode, so creating a restricted execution environment simply requires sealing off dangerous built-in functions, such as open(), and dangerous modules, such as the socket module. This can be done by creating new namespaces, removing any dangerous functions, and forcing code to be executed in those namespaces. While the idea is simple, it's fairly complicated to implement in practice. Luckily, the required features have been present in Python for a while, and the whole scheme has already been implemented for you as a standard module.
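To make the namespace idea concrete, here is a minimal sketch that uses only the built-in exec() with an explicit globals dictionary, so it does not depend on the rexec module. The allow-list SAFE_NAMES and the helpers make_restricted_namespace() and run_restricted() are hypothetical names invented for this illustration. It shows the principle only and should not be mistaken for a secure sandbox or for the way rexec is actually implemented.

```python
import builtins

# Hypothetical allow-list, chosen for illustration only.
SAFE_NAMES = ("abs", "len", "max", "min", "range", "str", "int")


def make_restricted_namespace():
    """Build a globals dict whose built-ins contain only the allowed names."""
    safe_builtins = {name: getattr(builtins, name) for name in SAFE_NAMES}
    return {"__builtins__": safe_builtins}


def run_restricted(source, namespace=None):
    """Execute source in a namespace that lacks open() and __import__()."""
    if namespace is None:
        namespace = make_restricted_namespace()
    exec(source, namespace)
    return namespace


# Harmless computation still works.
ns = run_restricted("total = max(len('abc'), 2)")

# open() was never copied into the namespace, so the lookup fails.
try:
    run_restricted("open('/tmp/a.out', 'w')")
except NameError as exc:
    blocked = str(exc)
```

Because the code only ever sees the names placed in its namespace, a dangerous function is sealed off simply by leaving it out.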
Code for using a restricted execution environment is in the rexec module. The base class is called RExec; in a later section of this HOWTO, we'll show you how to create your own subclasses of RExec to customize the functions and modules that are available. Here's the documentation for creating a new RExec instance:
-v option was given to the Python interpreter.
The hooks parameter can be an instance of the RHooks class, or of some subclass of RHooks; a default instance will be used if the parameter is omitted. This is only required when creating particularly exotic restricted environments that import modules in new ways. If you need to use this, you'll have to consult the source code (or Guido) for a complete picture of what's going on.
The RExec instance has r_exec(), r_eval(), and r_execfile() methods, which do the same thing as Python's exec statement and its built-in eval() and execfile() functions, but perform them in the restricted environment. (There are also s_exec(), s_eval(), and s_execfile() methods, which replace the restricted environment's standard input, output, and error files with StringIO objects that allow you to control the input and capture any output generated.)
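The output-capturing idea behind the s_ methods can be sketched without rexec. The helper capture_exec() below is a hypothetical name, and it redirects only standard output, using io.StringIO and contextlib.redirect_stdout from the standard library; it is an illustration of the technique, not a description of what s_exec() does internally.

```python
import contextlib
import io


def capture_exec(source, namespace=None):
    """Run source and return whatever it wrote to standard output."""
    if namespace is None:
        namespace = {}
    buffer = io.StringIO()
    # While the block is active, print() writes into the buffer.
    with contextlib.redirect_stdout(buffer):
        exec(source, namespace)
    return buffer.getvalue()


output = capture_exec("print('hello from the restricted code')")
```

The caller can then inspect, log, or discard the captured text instead of letting the executed code write directly to the terminal.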
Here's a sample usage of a restricted environment. First, the RExec instance has to be created.
import rexec
r_env = rexec.RExec()
Now, we can execute code and evaluate expressions in the environment:
r_env.r_exec('import string')
expr = 'string.upper("This is a test")'
print r_env.r_eval( expr )
The first line executes a statement that imports the string module. Since it's considered a safe module, the operation succeeds. The second and third lines create a string containing an expression and evaluate that expression in the restricted environment; this prints "THIS IS A TEST", as you'd expect.
Unsafe operations trigger an exception. For example:
r_env.r_exec('import socket')
The previous line will cause an ImportError exception to be raised, with an associated string value that reads "untrusted dynamic module: _socket". Trying to open a file for writing is also forbidden:
r_env.r_exec('file = open("/tmp/a.out", "w")')
This will raise an IOError exception, with an associated string value that reads "can't open files for writing in restricted mode". The restricted code can catch the exception in a try...except block and continue running; this is useful for writing code that works in both restricted and unrestricted mode. Opening files for reading still works, however.
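The try...except pattern mentioned above might look like the following sketch. It does not use rexec: refuse_writes() is a hypothetical stand-in for the restricted open(), and the code under test is run with the built-in exec() in a namespace that exposes only that stand-in and the IOError name. The error message is copied from the text above; everything else is an assumption made for the example.

```python
def refuse_writes(path, mode="r", *args):
    """Hypothetical stand-in for the restricted open(): read modes only."""
    if mode not in ("r", "rb"):
        raise IOError("can't open files for writing in restricted mode")
    return open(path, mode, *args)


# Code written to run in both restricted and unrestricted mode.
UNTRUSTED_SOURCE = '''
try:
    log = open("/tmp/a.out", "w")
    status = "logging to file"
except IOError:
    log = None
    status = "logging disabled"
'''

# Only the stand-in open() and the exception name are visible to the code.
namespace = {"__builtins__": {"open": refuse_writes, "IOError": IOError}}
exec(UNTRUSTED_SOURCE, namespace)
status = namespace["status"]
```

Here the code falls back to running without a log file rather than aborting when the write is refused.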
Exactly what restrictions does the base RExec impose? It limits the modules that can be imported to the following safe list:
audioop, array, binascii, cmath, errno, imageop, marshal, math, md5, operator, parser, regex, pcre, rotor, select, strop, struct, time
In general, these are modules that can't affect anything outside of the executing code; they allow various forms of computation, but don't allow operations that change the filesystem or use network connections to other machines. (The pcre module may be unfamiliar. It's an internal module used by the re module, so restricted code can still use the re module to perform regular expression matches.)
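An import allow-list of this kind can be sketched by supplying a replacement __import__() in the namespace's built-ins. The sketch below does not use rexec; ALLOWED_MODULES is a hypothetical subset of the list above, restricted_import() is a name invented for the example, and the error message is illustrative rather than the one rexec produces.

```python
import builtins

# Hypothetical subset of the safe list, for illustration only.
ALLOWED_MODULES = frozenset(
    {"array", "binascii", "cmath", "errno", "math", "operator", "struct", "time"}
)


def restricted_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Import name only if its top-level package is on the allow-list."""
    root = name.partition(".")[0]
    if root not in ALLOWED_MODULES:
        raise ImportError("untrusted module: %s" % name)
    return builtins.__import__(name, globals, locals, fromlist, level)


# The import statement looks up __import__ in the namespace's built-ins.
namespace = {
    "__builtins__": {"__import__": restricted_import, "ImportError": ImportError}
}

exec("import math\nroot = math.sqrt(16)", namespace)

try:
    exec("import socket", namespace)
except ImportError as exc:
    refusal = str(exc)
```

Importing math succeeds because it is on the list, while importing socket is refused before the real import machinery is ever called.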
It also restricts the variables and functions that are available from the sys and os modules. The sys module contains only the following symbols:
ps1, ps2, copyright, version, platform, exit, maxint
The os module is reduced to the following functions:
error, fstat, listdir, lstat, readlink, stat, times, uname, getpid, getppid, getcwd, getuid, getgid, geteuid, getegid
Note that restricted code has some read-only access to the filesystem via functions like os.stat and os.readlink; if you wish to forbid all access to the filesystem, these functions must be removed.
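One way to picture the reduced os module is as a fresh module object onto which only the permitted names are copied. The sketch below does this with types.ModuleType; make_restricted_os() and FILESYSTEM_NAMES are hypothetical names for the example, and the approach is an assumption about how such a module could be built, not a description of rexec's internals.

```python
import os
import types

# The names listed above for the reduced os module.
SAFE_OS_NAMES = (
    "error", "fstat", "listdir", "lstat", "readlink", "stat", "times",
    "uname", "getpid", "getppid", "getcwd", "getuid", "getgid",
    "geteuid", "getegid",
)

# Hypothetical grouping: the names that give read-only filesystem access.
FILESYSTEM_NAMES = ("fstat", "listdir", "lstat", "readlink", "stat")


def make_restricted_os(names=SAFE_OS_NAMES):
    """Return a module object exposing only the given os attributes."""
    proxy = types.ModuleType("os")
    for name in names:
        # Some functions exist only on certain platforms, so skip any missing.
        if hasattr(os, name):
            setattr(proxy, name, getattr(os, name))
    return proxy


restricted_os = make_restricted_os()

# A stricter variant that also drops the filesystem functions.
no_fs_names = [n for n in SAFE_OS_NAMES if n not in FILESYSTEM_NAMES]
stricter_os = make_restricted_os(no_fs_names)
```

The stricter variant shows the change described above: once os.stat and its relatives are left out, code given this module has no os function for inspecting the filesystem.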
In restricted mode, several attributes of Python objects are no longer accessible: the __dict__ attribute of class, instance and module objects; the __self__ attribute of method objects; and most of the attributes of function objects, namely func_code, func_defaults, func_doc, func_globals, and func_name.
The __import__() and reload() functions are replaced by versions which implement the above restrictions. Finally, Python's usual open() function is removed and replaced by a restricted version that only allows opening files for reading.
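A read-only replacement for open() can be sketched as a thin wrapper that inspects the mode string before delegating to the real function. The name readonly_open() and the set WRITE_FLAGS are hypothetical, and the wrapper is only an illustration of the policy; the error message is the one quoted earlier in this section.

```python
import builtins
import os
import tempfile

# Mode characters that imply writing, appending, creating, or updating.
WRITE_FLAGS = set("wax+")


def readonly_open(path, mode="r", *args, **kwargs):
    """Open path only if the mode requests read access and nothing else."""
    if WRITE_FLAGS & set(mode):
        raise IOError("can't open files for writing in restricted mode")
    return builtins.open(path, mode, *args, **kwargs)


# Demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "w") as handle:
        handle.write("hello")

    # Reading through the wrapper works.
    with readonly_open(sample) as handle:
        contents = handle.read()

    # Any write mode is refused before the file is touched.
    try:
        readonly_open(sample, "w")
    except IOError as exc:
        refusal = str(exc)
```

Placing a wrapper like this under the name open in the namespace given to untrusted code reproduces the read-only behaviour described above.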
To change any of these policies, whether to be stricter or looser, see the section below on customizing the restricted environment.