[TSJ CTF 2022] javascript_vm, w4nn4cryp7 writeup
This is another CTF that I got to play with the people at team Project Sekai. The challenges were nice and interesting, and here are my writeups for the 2 RE challenges that I managed to solve.
javascript_vm
There are two kinds of Javascript virtual machines. Those who understand Javascript (like node.js) and those who don’t (like … ?).
Author: @wxrdnx
Writing a disassembler
This is a classic VM chall. We got a binary file, along with the GitHub repo for the VM implementation. Having the source code and documentation of the VM is a great help: it means we don't have to spend as much time and effort understanding the instruction set. We just need to look at the right files, write a disassembler, and the chall is 50% solved.
There are 14 instructions in total. We can easily find out how they are encoded in binary format by looking at the documentation and the file src/assembler/assembler/instruction-encoder.js. From there, I was able to write a disassembler for this VM.
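A table-driven disassembler for this kind of VM can be very small. The sketch below is not my actual script: it assumes 16-bit little-endian instruction words with the opcode in the low 4 bits (14 instructions fit in 4 bits), and TOY_MNEMONICS is a made-up placeholder table. The real opcode numbers and operand layout have to be taken from the documentation and instruction-encoder.js.

```python
import struct

# Toy placeholder table (hypothetical names and numbers). Replace it with the
# real opcode-to-mnemonic mapping from the VM documentation.
TOY_MNEMONICS = {0: "TOY_MOVE", 13: "TOY_NOARG"}


def disassemble(blob, mnemonics, opcode_bits=4):
    """Decode 16-bit little-endian words into (index, mnemonic, operand_bits).

    Assumption: the opcode sits in the low `opcode_bits` bits of each word.
    The remaining bits are returned undecoded as a single integer.
    """
    mask = (1 << opcode_bits) - 1
    count = len(blob) // 2  # ignore a trailing odd byte, if any
    rows = []
    for index, (word,) in enumerate(struct.iter_unpack("<H", blob[:count * 2])):
        opcode = word & mask
        name = mnemonics.get(opcode, "UNKNOWN_%d" % opcode)
        rows.append((index, name, word >> opcode_bits))
    return rows


if __name__ == "__main__":
    demo = struct.pack("<2H", 0x0120, 0x002D)  # two made-up instruction words
    for index, name, operands in disassemble(demo, TOY_MNEMONICS):
        print("%4d  %-10s 0x%03x" % (index, name, operands))
```

Keeping the table separate from the decoding loop makes it easy to fix a wrong mnemonic or field width without touching the rest of the script.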
@eana did that and his result helped me a lot in writing and fixing errors in my disassembler.
Analysing the binary
After disassembling the opcodes, we can now start analyzing the flow of the program.
The first 74 instructions seem to belong to the same function, which we'll call the main function. There is a call to a function at address 75; this is where the program asks for our input (you can check this out yourself).
After that, the program does a bunch of other stuff, but we don't need to care about that for now. The interesting part is at lines 27 to 74. It is a big loop with some comparisons.
If you can't see the loop yet, here are the opcodes that my teammate @eana printed out; they are much more intuitive than mine.
At addresses 44 and 48, we can see that 2 values are loaded from memory. After that, those values are compared with each other and, if they are not equal, we jump to address 59.
The instructions at address 59 tell the VM to print a string from memory; the address of the string is stored in register B (check the documentation of the NOA 2 instruction for more info).
We can calculate the address ourselves and search for the needed string inside the binary, and it turns out that the program is trying to print the string Wrong, which is stored at address 519.
Similarly, we can see that the instructions at addresses 67 to 74 print out Correct.
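Looking up such a string can be scripted in a few lines. This is a minimal sketch with hypothetical helper names: it assumes that a VM address maps directly to a byte offset in the loaded image and that strings are NUL-terminated, which should be checked against the VM documentation. The blob in the demo is synthetic, not the challenge binary.

```python
def read_cstring(blob, address):
    """Return the NUL-terminated ASCII string that starts at `address`."""
    end = blob.find(b"\0", address)
    if end == -1:  # no terminator found: read up to the end of the blob
        end = len(blob)
    return blob[address:end].decode("ascii", errors="replace")


def find_string(blob, text):
    """Return the byte offset of `text` in `blob`, or -1 if it is absent."""
    return blob.find(text.encode("ascii"))


if __name__ == "__main__":
    # Synthetic image: 5 padding bytes followed by two strings.
    image = bytes(5) + b"Wrong\0Correct\0"
    print(read_cstring(image, 5))
    print(find_string(image, "Correct"))
```

Going both ways (address to string, and string to address) is a quick sanity check that the address arithmetic taken from the disassembly is right.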
So now the flow of the program is quite clear. It first gets our input, does something with it, then finally validates it char by char using the comparisons at address 51.
Emulating the VM:
To find the correct input, we only need to make sure that at address 51, the program doesn't jump to the Wrong branch. So my solution is as follows:
- Emulate the whole VM in Python.
- When the program asks for input from stdin, inject z3 BitVec symbols into memory.
- When the program reaches address 51, skip the comparison, add a constraint to our z3 solver, then move to the correct branch.
With this approach, we don’t need to care about how the program encodes our input. I think it’s a very good way to deal with VM-type challenges in general.
Since my disassembler is already written in Python, all I need to do now is modify it so that it runs the code instead of just parsing it.
Then change the syscall() function to inject our z3 symbols.
And finally, we patch the jump at address 51 and add our constraints.
After the program finishes executing, we can eval our equations and get the flag: TSJ{17_15_n07_7h3_j4v45cr1p7_vm_y0u_r_f4m1l14r_w17h}
w4nn4cryp7:
Team Sekai stopped doing this CTF halfway to focus on another CTF. So I had to solve this all alone :sadge:
Original file, binary and dump file inside: w4nn4cryp7.zip
For those who don’t want to download the entire thing:
- Binary: encoder.exe
- Dump file: encoder.DMP
- Encrypted flag: CH4_Metasploit.txt.LMFAO
Oh nyo! TSJ’s PC has been infected by the w4nn4cryp7 malware! Hopefully, TSJ created a dump file for malware analysts to investigate. Can you help TSJ recover his infected C drive?
NOTE 1: encoder.exe IS A REAL MALWARE! PLEASE SOLVE THIS CHALLENGE IN A VIRTUAL MACHINE ENVIRONMENT!!!
Note 2: The flag is ASCII art, and it is hidden in one of the files in TSJ’s C drive.
Author: @wxrdnx
Basic analysis:
We are given a PE file, along with a .DMP file and an infected drive with many encrypted files.
The PE file is packed with UPX, so we need to unpack it before loading it into a disassembler. All the symbols are stripped, so it's going to be a little bit harder to analyze this binary. A good place to start is the main function; there are many guides online that show you how to locate the main function in a PE binary. For this particular file, main is at 0x401571.
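A quick way to confirm the UPX packing mentioned above is to look at the PE section names, since UPX normally leaves sections called UPX0 and UPX1. The sketch below is a heuristic with hypothetical helper names, not a full PE parser; a packed file can then usually be restored with `upx -d`.

```python
import struct


def pe_section_names(data):
    """Return the section names listed in a PE file's section table."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    pe_offset = struct.unpack_from("<I", data, 0x3C)[0]  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    section_count = struct.unpack_from("<H", data, pe_offset + 6)[0]
    optional_size = struct.unpack_from("<H", data, pe_offset + 20)[0]
    table = pe_offset + 24 + optional_size  # section table follows the headers
    names = []
    for i in range(section_count):
        raw = data[table + 40 * i:table + 40 * i + 8]  # 40-byte entries
        names.append(raw.rstrip(b"\0").decode("ascii", errors="replace"))
    return names


def looks_upx_packed(data):
    """Heuristic: UPX-packed files usually have sections named UPX0, UPX1."""
    return any(name.startswith("UPX") for name in pe_section_names(data))


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as handle:
        print(looks_upx_packed(handle.read()))
```

Section names can be renamed by a malware author, so a negative result does not prove that the file is unpacked.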
Since this is a stripped binary, we don't know which functions are library functions and may waste a lot of time trying to reverse unnecessary code. My way of dealing with such a binary is to guess what each function does using strings, debugging, etc., and to stay away from any code that I think is too complex.
Based on the strings in the binary, we can see that the program first checks if we supplied a directory name, then it goes on to check if the name is victim. If these checks fail, the program exits.
After that, we come to a small loop.
Thanks to the error message inside sub_559340, we know that the program is looping through every file inside the victim directory.
After some reading and debugging, I found out that the program adds all the filenames (except for encoder.exe) inside the victim directory to some kind of list. It will probably open every file in this list and encrypt it later.
After that, we see a call to sub_4A59F0.
This is a pretty interesting function. Since a C++ object of a class with virtual functions usually starts with a pointer to its vtable, and the vtable is usually stored in the binary as a global array, the statement *a1 = off_611010; makes me suspect that this may be the constructor of some class. Jumping to that address in IDA confirms my suspicion.
IDA recognizes the data structure at 0x611010 as the vtable for class CryptoPP::AutoSeededRandomPool, which means that sub_4A59F0 is indeed the constructor for CryptoPP::AutoSeededRandomPool.
So after some basic analysis, we now know that:
- The malware puts all filenames inside directory victim inside a list, possibly to encrypt them later on.
- The malware uses the CryptoPP library for some purpose, maybe to encrypt the files.
Looking for the encryption algorithm:
To decrypt all the given files, we have to find out which encryption algorithm is used. Using a plugin like findcrypt-yara, we learn that the malware uses either AES or RC6. But we don't know the block size or which mode of operation it uses, so we have to turn back to the code.
Using the same technique as above, we see that sub_4A59F0 is actually a constructor for class CryptoPP::AutoSeededRandomPool, which is used to generate random bytes. And at 0x40192F, the AutoSeededRandomPool object is passed as a parameter to another function.
The parameters passed to sub_4031F0 include an object of type AutoSeededRandomPool, a pointer to a buffer (check the asm code!), and a number. It's fair to guess that this function writes some random bytes to the specified buffer, and those random bytes may also be our key and iv.
Next, we come to a big loop. The program actually iterates through the list of names we mentioned above. Inside this loop, there is a call to sub_4BD0E0, which we can easily identify as the constructor for class CryptoPP::CipherModeFinalTemplate_CipherHolder<CryptoPP::BlockCipherFinal<(CryptoPP::CipherDir)0,CryptoPP::RC6::Enc>,CryptoPP::CBC_Encryption>.
The name is really long, but we can see the 2 class names CryptoPP::RC6::Enc and CryptoPP::CBC_Encryption inside the above template class. There seem to be no other crypto functions in the program, so I concluded that the malware uses RC6 in CBC mode to encrypt our files.
Finally, we need the key and iv. Checking the sample code from CryptoPP for RC6, we see that the function SetKeyWithIV is used to specify the key and iv for the encryption. There is a function with the same signature in the malware.
The random bytes generated using AutoSeededRandomPool are now used as key and iv for RC6, which seems plausible! Now we just need to extract those values from the dump file and it's done.
Extracting key and iv:
I had never actually analyzed a dump file before, so this step took me a lot of time. In my opinion, the best tool for this is WinDbg.
Firstly, key and iv are stored on the stack, so we can use the command k to view the stack frames. The result is as follows:
Based on the RetAddr field, we can see that the stack frame for the main function is frame 8.
Next, I use .frame /r 8 to see the frame context, which gives me the value of rbp for this frame.
Combining this with the stack info from IDA, we now know the absolute addresses of key and iv in memory, so we just need to use db to dump them out.
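The address arithmetic and the conversion of the db output into raw bytes can be sketched as below. The helper names are hypothetical, and the rbp value, stack offset and dumped bytes in SAMPLE are made up for illustration, not the real values from the dump. The parser assumes the usual db layout: an address, two spaces, 16 hex bytes with a dash after the 8th, two spaces, then the ASCII column.

```python
def local_address(rbp, offset):
    """Absolute address of a local that IDA shows at [rbp+offset]."""
    return (rbp + offset) & 0xFFFFFFFFFFFFFFFF


def parse_db_output(text):
    """Convert WinDbg `db` output (full 16-byte lines) into raw bytes."""
    data = bytearray()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The first double space separates the address from the hex field.
        _address, _, rest = line.partition("  ")
        # 16 bytes * 2 hex digits + 15 separators = 47 characters.
        data += bytes.fromhex(rest[:47].replace("-", " "))
    return bytes(data)


# Made-up example dump, NOT the real key or iv.
SAMPLE = """\
00000000`0014f7d0  00 01 02 03 04 05 06 07-08 09 0a 0b 0c 0d 0e 0f  ................
00000000`0014f7e0  10 11 12 13 14 15 16 17-18 19 1a 1b 1c 1d 1e 1f  ................
"""

if __name__ == "__main__":
    print(hex(local_address(0x14F800, -0x30)))
    print(parse_db_output(SAMPLE).hex())
```

Parsing the hex field by position rather than with a regex avoids picking up characters from the ASCII column by mistake.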
Decrypting files:
All we need to do now is write a script to decrypt our files.
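My script (decrypt.cpp) uses CryptoPP. For readers without it, the same idea can be sketched in pure Python. This is not that script: it is a minimal RC6-32/20 decryptor in CBC mode written from the public RC6 specification, and it assumes the standard 20 rounds and PKCS#7 padding, which I have not confirmed for this malware. The key and iv have to be filled in from the dump.

```python
import struct

MASK = 0xFFFFFFFF
ROUNDS = 20  # assumption: standard RC6-32/20 parameters


def rol(x, n):
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK


def ror(x, n):
    n &= 31
    return ((x >> n) | (x << (32 - n))) & MASK


def expand_key(key):
    """RC6 key schedule: returns the 2 * ROUNDS + 4 round-key words."""
    c = max(1, (len(key) + 3) // 4)
    L = list(struct.unpack("<%dI" % c, key.ljust(4 * c, b"\0")))
    n = 2 * ROUNDS + 4
    S = [(0xB7E15163 + i * 0x9E3779B9) & MASK for i in range(n)]
    A = B = i = j = 0
    for _ in range(3 * max(c, n)):
        A = S[i] = rol((S[i] + A + B) & MASK, 3)
        B = L[j] = rol((L[j] + A + B) & MASK, A + B)
        i, j = (i + 1) % n, (j + 1) % c
    return S


def decrypt_block(S, block):
    """Decrypt one 16-byte block with the expanded key S."""
    A, B, C, D = struct.unpack("<4I", block)
    C = (C - S[2 * ROUNDS + 3]) & MASK
    A = (A - S[2 * ROUNDS + 2]) & MASK
    for i in range(ROUNDS, 0, -1):
        A, B, C, D = D, A, B, C
        u = rol((D * (2 * D + 1)) & MASK, 5)
        t = rol((B * (2 * B + 1)) & MASK, 5)
        C = ror((C - S[2 * i + 1]) & MASK, t) ^ u
        A = ror((A - S[2 * i]) & MASK, u) ^ t
    D = (D - S[1]) & MASK
    B = (B - S[0]) & MASK
    return struct.pack("<4I", A, B, C, D)


def cbc_decrypt(key, iv, data):
    """CBC mode: plaintext block = D(cipher block) XOR previous cipher block."""
    S = expand_key(key)
    out, previous = bytearray(), iv
    for offset in range(0, len(data) - len(data) % 16, 16):
        block = data[offset:offset + 16]
        plain = decrypt_block(S, block)
        out += bytes(p ^ c for p, c in zip(plain, previous))
        previous = block
    return bytes(out)


def strip_pkcs7(data):
    """Remove PKCS#7 padding if present (assumed padding scheme)."""
    pad = data[-1] if data else 0
    if 1 <= pad <= 16 and data.endswith(bytes([pad]) * pad):
        return data[:-pad]
    return data  # assumption did not hold: keep the raw bytes
```

A file would then be recovered with something like strip_pkcs7(cbc_decrypt(key, iv, open(path, "rb").read())), where key, iv and path are placeholders for the values taken from the dump and the victim directory.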
It took quite a long time for me to find the correct file with the flag, but finally, I got it:
TSJ{Purchasing_iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea_DOT_com_wont_stop_me_from_going_brrrrr_LMAO}
(The correct file is CH4 Metasploit.txt.LMFAO)
Decrypting script: decrypt.cpp
Idb file: encoder.bin.i64