[REVERSING] Discount VMProtect - 465 Pts.


Challenge Description:

[Image: challenge description]


The challenge then gives us this file: [Image: the provided file]


Let’s see what the ELF file does. [Image: running the binary]

It asks for a password, then checks whether it matches the password embedded inside the executable, which is probably also the flag.


Let’s load it in gdb and start with some static analysis. First, let’s check which functions and how many symbols are left, since the file command says the binary is stripped.

Hmmm… There is no main, and only PLT entries remain. Let’s find the entry point, and from there the main function.
The entry point is at 0x400770.


Then our main address is 0x400c88.

This is the whole main assembly code: [Image: disassembly of main]

Summary of main:


Let’s check in Ghidra whether we get the same analysis. We do. Great! We also notice that 0x400857 is called twice. Let’s see what 0x400857 does.
In Ghidra: Hmmm… weird, it only returns. Let’s check gdb then.

In GDB: It looks like this function does a lot more than just return. It pushes 0x236c1a onto the stack, then XORs that value with 0x636465. The result is 0x40087f, and then the function returns. Since 0x40087f is now on top of the stack, the ret actually jumps to that address instead of going back to main. Let’s check in Ghidra what’s at 0x40087f. [Images: Ghidra decompilation of 0x40087f]

Woah! I think we just stumbled upon a treasure! However, there’s a lot of code and it looks scary. Also, what’s the deal with this loop? After a lot of headbanging and searching, I found this article about VM-protected binaries. I suggest giving it a read: VM Protected Binaries Article.
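The arithmetic behind this fake return can be checked in a few lines of Python. This is only a sketch: the function and variable names are mine, and the only values taken from the analysis are the XOR key 0x636465 and the jump target 0x40087f. XOR is its own inverse, so the constant that decodes to the target can be recomputed directly.

```python
XOR_KEY = 0x636465        # key applied to the pushed value (from the GDB analysis)
HIDDEN_TARGET = 0x40087f  # address the function "returns" into


def decode_return_address(pushed: int, key: int = XOR_KEY) -> int:
    """XOR the pushed constant with the key to get the real jump target."""
    return pushed ^ key


# XOR is its own inverse, so this is the constant that must sit on the stack.
pushed_constant = HIDDEN_TARGET ^ XOR_KEY

print(hex(pushed_constant))                         # 0x236c1a
print(hex(decode_return_address(pushed_constant)))  # 0x40087f
```

Hiding the target behind an XOR means the address 0x40087f never appears as a plain constant in the function, which is likely why the decompiler had nothing to follow.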
TL;DR: A VM-protected binary has its own set of “instructions”, a function that acts like a processor and reads those instructions, and its own memory (stack, heap, etc., depending on the design). The code at 0x40087f is actually the function that executes those instructions. We can also see the values in the switch cases: these are the opcodes of the instructions.
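To make the idea concrete, here is a toy interpreter loop in Python. It is a minimal sketch of the general technique, not a reconstruction of this binary: the three opcodes (PUSH, ADD, HALT) and their values are hypothetical and have nothing to do with the challenge’s opcodes.

```python
# Hypothetical opcodes for a toy VM; these are NOT the challenge's opcodes.
PUSH, ADD, HALT = 0x01, 0x02, 0xFF


def run(code: bytes) -> list:
    """Fetch, decode and execute bytecode until HALT; return the final stack."""
    ip = 0      # the VM's own instruction pointer
    stack = []  # the VM's own memory
    while True:
        opcode = code[ip]  # fetch
        ip += 1
        if opcode == PUSH:  # decode + execute
            stack.append(code[ip])  # the operand byte follows the opcode
            ip += 1
        elif opcode == ADD:
            b = stack.pop()
            a = stack.pop()
            stack.append((a + b) & 0xFF)
        elif opcode == HALT:
            return stack
        else:
            raise ValueError(f"unknown opcode {opcode:#x} at offset {ip - 1:#x}")


print(run(bytes([PUSH, 2, PUSH, 3, ADD, HALT])))  # [5]
```

The big switch inside a loop is the telltale shape: one case per opcode, with the loop variable acting as the program counter.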
Now that we know what the big loop is, we have to answer the following questions:
a) Where does the binary get the instructions?
b) What does each opcode do?
c) This is the main part of the program, so the flag must be connected to it somehow. Where is it?
Let’s answer (a). Let’s go back to 0x400857, just before the jump to 0x40087f.

Before the XOR operation, we can see that the value of rdi is stored at [rbp-0x28]:

Also, a very important note: those values appear in every operation for every opcode inside the big loop. But what is the value of rdi? Let’s go back to main.

The addresses 0x602320 and 0x6020e0 are passed as arguments to the function at 0x400857. What’s stored at these addresses?

Contents of 0x602320: [Image: memory dump]

Contents of 0x6020e0: [Image: memory dump]

We can see that many of the bytes start with the hex digit 3 (i.e. 0x32, 0x36, 0x33, etc.). Could these be the instructions? Most likely, but to confirm let’s go to Ghidra and check the decompiled code at 0x40087f.

[Image: decompiled code at 0x40087f]

Basically, what it says is:
[rbp-0x17] = [rbp-0x28][rbp-0x4]   # access the byte from [rbp-0x28] at offset [rbp-0x4]
[rbp-0x4] += 1

The value at [rbp-0x17] is then compared against the opcode cases, so the data at addresses 0x602320 and 0x6020e0 are the instructions. Notice also that the offset lives in [rbp-0x4], so this is the instruction pointer.
Takeaways:
-[rbp-0x28] is the base address of the instructions. It is set in 0x400857 from rdi, which comes from main.
-[rbp-0x4] is the instruction pointer.
-[rbp-0x8] is still unknown.
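The fetch step maps onto Python almost one to one. In this sketch the function name and the sample bytes are made up for illustration; only the two operations (read the byte at base + offset, then increment the offset) come from the decompiled code.

```python
def fetch(code: bytes, ip: int):
    """Return (opcode, new_ip): read the byte at base + ip, then advance ip."""
    opcode = code[ip]      # [rbp-0x17] = [rbp-0x28][rbp-0x4]
    return opcode, ip + 1  # [rbp-0x4] += 1


# Sample bytes made up for illustration, not dumped from the binary.
sample = bytes([0x32, 0x36, 0x33])

ip = 0
opcode, ip = fetch(sample, ip)
print(hex(opcode), ip)  # 0x32 1
```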
Let’s answer (b). Since there are a lot of opcodes, here is a list of each opcode and what it does.
0x30 ->

0x31 -> 0x4008cf - 0x400913

0x32 -> 0x400918 - 0x400940

0x33 -> 0x400945 - 0x40097d

0x34 -> 0x400982 - 0x4009bc

0x35

0x36

0x37

0x38

0x39

0x61

0x62

0x63 -> 0x400bd2 - 0x400bef

0x64 -> 0x400bf4 - 0x400c3b

0x65
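One detail about the opcode values themselves: every one of them is a printable ASCII character, which is why so many bytes in the dumps start with the hex digit 3. The short check below only uses the opcode values from the list above; the variable names are mine.

```python
# Opcode values taken from the list above.
OPCODES = [0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39,
           0x61, 0x62, 0x63, 0x64, 0x65]

# Interpret each opcode as an ASCII character.
as_text = "".join(chr(op) for op in OPCODES)
print(as_text)  # 0123456789abcde
```

In other words, the fifteen opcodes are the characters '0' through '9' and 'a' through 'e'.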


Takeaways:

-We can see that [rbp-0x8] is an offset from 0x6027a0 and is incremented by 1 whenever a value is saved. (You can check the other cases too: in some of them [rbp-0x8] is decremented by 1 when the operation is performed.)

-[rbp-0x28] is the base address of the instructions. It is set in 0x400857 from rdi, which comes from main.
-[rbp-0x4] is the instruction pointer.
-[rbp-0x8] is the stack pointer.
-0x6027a0 is the base address of the stack.
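The stack behaviour described in these takeaways can be modelled as follows. This is a sketch under assumptions: the class name, the 256-byte size, the byte-sized slots, and the exact order of "decrement, then read" on pop are mine. Only "write at base + offset, then add 1" comes from the analysis.

```python
class VMStack:
    """Byte stack addressed as base + sp, like the region at 0x6027a0."""

    def __init__(self, size: int = 256):  # size is an arbitrary assumption
        self.mem = bytearray(size)  # stands in for the memory at 0x6027a0
        self.sp = 0                 # plays the role of [rbp-0x8]

    def push(self, value: int) -> None:
        self.mem[self.sp] = value & 0xFF  # save the value at base + sp
        self.sp += 1                      # then increment the stack pointer

    def pop(self) -> int:
        self.sp -= 1                      # assumed order: decrement, then read
        return self.mem[self.sp]


stack = VMStack()
stack.push(0x80)
stack.push(0x12)
print(hex(stack.pop()), stack.sp)  # 0x12 1
```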


Finally, let’s answer (c). Where is the flag?

Let’s look at the first set of instructions, at address 0x602320:

0.  31, data = 80
2.  63
3.  32
4.  32
5.  64, data = 80
7.  33
8.  34, data = 12
a.  36, data = 1
c.  38
d.  32
e.  32
f.  39
10. 34, data = 3
12. 36, data = 23
14. 39
15. 34, data = 1b
17. 36, data = 0
19. 64, data = 0
1b. 36, data = 0
1d. 64, data = 80
1f. 30 
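A listing like this one can be produced with a very small disassembler. The sketch below is mine, not a tool used in the original analysis. The set of opcodes that carry a data byte (0x31, 0x34, 0x36, 0x64) is inferred from the listing, and the program bytes are reconstructed from the listing rather than dumped from the binary.

```python
# Opcodes followed by one data byte, inferred from the listing above.
HAS_OPERAND = {0x31, 0x34, 0x36, 0x64}

# Bytes reconstructed from the listing (not dumped from the binary).
PROGRAM = bytes.fromhex(
    "3180 63 32 32 6480 33 3412 3601 38 32 32 39 "
    "3403 3623 39 341b 3600 6400 3600 6480 30"
)


def disassemble(code: bytes) -> list:
    """Return a list of (offset, opcode, operand or None) tuples."""
    out = []
    ip = 0
    while ip < len(code):
        opcode = code[ip]
        if opcode in HAS_OPERAND and ip + 1 < len(code):
            out.append((ip, opcode, code[ip + 1]))
            ip += 2  # opcode byte + data byte
        else:
            out.append((ip, opcode, None))
            ip += 1  # opcode byte only
    return out


for offset, opcode, operand in disassemble(PROGRAM):
    data = "" if operand is None else f", data = {operand:x}"
    print(f"{offset:x}. {opcode:x}{data}")
```

Running it prints the same offsets and data bytes as the listing, starting with `0. 31, data = 80` and ending with `1f. 30`.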

Instructions 0-2:

Instructions 3-10:

Instructions 12-1d:

Instruction 1f:


Hmmm… This set of instructions is not that interesting, except for instruction 2, which copies some gibberish from 0x6020a0 to 0x6025aa.
Nevertheless, we now have a first grasp of how execution works. Let’s move on to the second instruction set, at 0x6020e0:

0.  31, data = 80
2.  32
3.  32
4.  64, data = 80
6.  33
7.  32
8.  34, data = 28 #This jumps directly to opcode 30 -> exit instruction
a.  35
b.  36, data = 63
d.  37
e.  36, data = 98
10. 38
11. 61
12. 62
13. 31, data = 80
15. 36, data = a
17. 38
18. 65
19. 39
1a. 34, data = 20
1c. 36, data = 0
1e. 64, data = 0
20. 36, data = 1
22. 38
23. 32
24. 32
25. 37
26. 34, data = 2
28. 30 #Exit instruction

Instructions 0 - 12:

a) char = char rotated right by 1 bit (Instruction a)

b) char = char ^ 0x63 (Instruction d)

c) char = (char + 0x98) & 0xff (Instruction 10)

d) char = ~char & 0xff, a bitwise NOT (Instruction 11)
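The four steps can be written as a forward transform in Python. This is a sketch with my own function names; step d) is modelled as a bitwise NOT, since that is what the `^ 0xff` in the reversal undoes. Encoding the known flag prefix `X-MAS{` reproduces the first six bytes of the embedded data.

```python
def ror8(value: int, count: int = 1) -> int:
    """Rotate an 8-bit value right by count bits."""
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF


def transform(char: int) -> int:
    """Apply the VM's per-character steps a) to d) to one input byte."""
    char = ror8(char, 1)         # a) rotate right by 1
    char ^= 0x63                 # b) XOR with 0x63
    char = (char + 0x98) & 0xFF  # c) add 0x98, wrapping at 8 bits
    char = ~char & 0xFF          # d) bitwise NOT (same as char ^ 0xFF)
    return char


encoded = bytes(transform(c) for c in b"X-MAS{")
print(encoded.hex())  # 1872a2a49d89
```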

Instructions 13 - 1e:

raw_data = '\x18r\xa2\xa4\x9d\x89\x1f\xa2\x8d\x9b\x94\rm\x9b\x95\xec\xec\x12\x9b\x94#\x16\x9bl\x13\x0em\r\x96\x8d\x0e\x90\x13\x97\x8a\xbb\xcfd~\xd3\x1a@#\xec\xdf\x00\x00\x00'


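Each byte is decoded with the following reversal process:

def get_flag(raw_data):
    """Undo the VM's per-character transform, one step at a time, in reverse order."""
    flag = ""
    for char in raw_data:
        value = ord(char) ^ 0xff                      # undo the bitwise NOT
        value = (value - 0x98) & 0xff                 # undo the addition of 0x98
        value ^= 0x63                                 # undo the XOR with 0x63
        value = ((value << 1) | (value >> 7)) & 0xff  # rotate left by 1 to undo the rotate right
        flag += chr(value)
    return flag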

The flag is: X-MAS{VMs_ar3_c00l_aNd_1nt3resting}
NICE! We were able to get the flag! The rest of the instructions just loop back so that the remaining characters get processed. A single wrong character sets 0x6025a0[0] to 0 (Instructions 1c and 1e).
Thank you for reading this long writeup.
Author’s comment: I’ve learned a lot doing this challenge and the writeup. Thank you, reader, for sticking with it until the end. I hope you’ve also learned something from this writeup.