This material comes from our Software Reverse Engineering course. In this tutorial you will learn how to approach disassembling programs written in high-level languages, using a simple example. It's a complex task that can't be fully learned from a single article, but this is a great place to start!
Almost every binary on your system was written in a high-level language, but you don’t always know which one. Sometimes it is possible to decompile a binary back into something close to the original source code. This is not an easy task, and even when it works you will not recover things like variable names or comments. You can, however, always look at the binary itself: convert it to hexadecimal and interpret each byte, or group of bytes, as an instruction (as you learned in the last section). You can do this manually or with a dedicated disassembling tool.
Table 21 lists some tools that you can use to disassemble x86 code:
Table 21 – Some disassemblers for x86.
|Tool||x86 Architecture (bits)||Operating System|
|Interactive Disassembler (IDA)||32 and 64||Linux / Win|
|Hack||16||DOS / Win|
|NDISASM||32 and 64||DOS / Win / Mac / Linux|
Of course, there are many more than these; you can search the Internet for free, open source and commercial disassembling tools. In this course, most of the time we are going to use OllyDbg or IDA (the free 32-bit version), because they are free and easy to use. The drawback is that these free versions only run on Windows. In the Linux examples, we are going to use NDISASM, which comes with NASM. One of the best disassemblers is the Interactive Disassembler, or just IDA. It can disassemble code for various architectures, not just x86. There is a free 32-bit x86 version for Windows; the full version, which runs on Linux and supports many more architectures, is very expensive.
Analyzing the Program
There are several things to think about before reverse engineering a binary program. Usually, you want to modify something specific, like removing a message that appears at a certain moment.
Sometimes you just want to change the value of a constant or variable; other times you may need to fix a bug in old legacy code.
Reverse engineering the whole program is really difficult and rarely practical: the binary was usually written in a high-level language, and when you disassemble it, you get many more lines of Assembly than the original source code had.
A good example to clarify this is a loop written in a high-level language. Let's suppose that you have C code like the following (Figure 4):
for (i = 0; i < 0x1212; i++) { /* empty body */ }
Figure 4 – A loop coded in C language.
The above code is really simple; as you can see, it does nothing except repeat 0x1212 times (the 0x prefix indicates a hexadecimal value; 0x1212 is 4626 in decimal). Just remember the syntax of the “for” loop: the first parameter (i = 0) is the initial condition, the second parameter (i < 0x1212) is the condition that keeps the loop running, and the third parameter is the increment (in this case the same as i = i + 1). To understand what happens to the code after you compile it, we compiled it in two different environments. The first one was the old Borland Turbo C (3.0).
After compiling and linking the source code, we got the binary. We could then disassemble the binary and analyze the Assembly code. Let's take a look at Figure 5.
XOR SI, SI
JMP SHORT loc_1029A
loc_10299: INC SI
loc_1029A: CMP SI, 1212h
JL SHORT loc_10299
Figure 5 – Disassembly of the binary loop compiled in Turbo C 3.
To disassemble the code, we used IDA Free (32-bit). Can you see the relationship between the original C code and the disassembled code? Let's walk through it.
The instruction XOR SI, SI corresponds to “i = 0” (the first parameter of the original C loop). How do we know this? It is easy: when you XOR (Exclusive OR) a value with itself, the result is always zero (if you don't know how XOR works, please look at the references).
Let's jump to the instruction INC SI. This one is the third parameter, “i++”. It is really easy to identify, because the Assembly instruction INC simply adds one to the operand that you specify. In this case, the operand is the index register SI, which was previously set to zero.
Going back to the second parameter (“i < 0x1212”) and looking at its disassembly, you will observe that more than one Assembly instruction is needed to represent it: JMP SHORT loc_1029A, CMP SI, 1212h and JL SHORT loc_10299. The first Assembly instruction (JMP SHORT loc_1029A) is an unconditional jump, which simply means that execution continues at the specified label (the label is just a name for a memory address; the IDA disassembler helped us by generating it).
When the code jumps to “loc_1029A”, you will see the instruction CMP SI, 1212h (our second Assembly instruction related to “i < 0x1212”). This is an ALU instruction, which compares the register SI with 1212h and sets the flag bits in the Flag Register (section 2.1.4). Actually, the instruction CMP acts like a SUB (subtract); the only difference is that CMP does not save the result, it just changes the flags. Now that the flags are updated, we can test them with a conditional jump, in this case JL. As you can see in the third instruction related to “i < 0x1212”, we have a JL SHORT loc_10299, which means “jump if less” (a signed comparison) and corresponds to the “<” symbol originally present in our C code.
Note that parameters one and three of our “for” loop were quick and easy to identify in the disassembly. Only parameter two takes a little longer to understand, but its logic is also simple.
Now, let's see what we get when we compile the same C code using the “gcc” compiler (Figure 6).
JMP SHORT loc_401354
loc_401350: INC [ESP+10h+var_4]
loc_401354: CMP [ESP+10h+var_4], 1211h
JLE SHORT loc_401350
Figure 6 – Disassembly of the binary loop compiled with GCC.
As you can see, the Assembly code is not the same. The compiler made a lot of different decisions, which resulted in a different OPCODE combination (that we are seeing here as Assembly instructions).
Let's start with our loop parameters: i = 0, i < 0x1212 and i++. What happened to the first parameter (i = 0)? Here it is harder to see, because the “gcc” compiler decided to work on the variable directly in memory, without moving it into a register (e.g. SI, as we saw before). To clarify this, let's jump again to the third parameter (i++), which will also help us explain the first one. The Assembly instruction INC [ESP+10h+var_4] is the one responsible for incrementing the variable “i” that controls the loop. The first thing to observe is that the compiler generated 32-bit code, as we can see from the ESP register (section 2.1.2). The other thing is that the brackets “[ ]” mean that the operand is the value stored in memory at the address “ESP+10h+var_4”. This is a location on the stack, and it is where the original variable “i” lives (which answers our question about the first parameter: “i = 0” is a write to this same location).
The second parameter, “i < 0x1212”, once again has to be analyzed as a group of Assembly instructions: JMP SHORT loc_401354, CMP [ESP+10h+var_4], 1211h and JLE SHORT loc_401350. The first instruction (JMP SHORT loc_401354) has the same purpose as before: it jumps to a label, where the second instruction (CMP [ESP+10h+var_4], 1211h) is executed. As you can see, this instruction again reads the memory at [ESP+10h+var_4] to get the value of the variable. The interesting thing here is that the variable “i” is compared with 1211h. Why is this happening? In our original C code, we compared the variable with 1212h. Well, the compiler “decided” to change the value and, to compensate, it also changed the next instruction. In the third instruction (JLE SHORT loc_401350), instead of the JL we saw before, we have a JLE, which means “jump if less or equal”. Instead of comparing with 0x1212 and “jump if less”, the compiler compares with 0x1211 and “jump if less or equal”. For integer values, both conditions are true for exactly the same values of “i”, so the loop behaves the same in the end.
Changing the Binary
Let's suppose that you want to change some binary code that you have already located, for example, the comparison that you found before (section 4.1). As we know, the loop keeps running until the second parameter (i < 0x1212) becomes false.
Imagine that you do not want to enter this loop anymore, but you do not have the source code, so you have to change the binary. One of the easiest ways is to change the target of the unconditional jump (JMP) so that it points to the instruction right after the conditional jump (JL or JLE).
Another possibility is to change the value of the variable before the comparison (CMP) is made. In the code compiled with “gcc”, you would have to change the value stored at [ESP+10h+var_4].
In some cases, we may want to remove a message that appears, or maybe a window. In these cases, there is usually a CALL (the Assembly instruction that calls a subroutine) that we should “remove”. Actually, we will need to replace it with another instruction, as we will see in the next subsection.
Number of Bytes
When we change something in our binary, we have to make sure that the number of bytes stays the same. Inserting or removing bytes would shift everything that follows, and the addresses and offsets that jumps, calls and file headers rely on would no longer match. We are not going to discuss the details here.
We have to keep this in mind if we want to make permanent changes to our binary. In section 4.2, we mentioned the JMP example. That case is really easy, because we only change the jump so that it goes to the next “label” or instruction.
But when we want to omit a CALL, we have to put something there to replace the old bytes, and it must not change the behavior of the other parts of our program. For this, the instruction NOP, which does nothing, is very useful. This instruction is a single byte (0x90), and we can put in as many as we need without changing the number of bytes of our original file.
Supposing that the original CALL has three bytes, we just replace these three bytes with three 0x90 bytes, which means executing NOP three times.
Sum of Bytes
Another issue that we sometimes have to deal with is that some binary files use a mechanism called CHECKSUM. This kind of mechanism is also very common in computer network protocols, for example.
In its simplest form, the CHECKSUM is the sum of all the bytes of the binary file, truncated to a fixed size (e.g. 2 bytes). Let's suppose we have code with the following bytes: 0x10, 0x22, 0x35. In this case, the sum is 0x10 + 0x22 + 0x35 = 0x67, so the CHECKSUM would be 0x0067 (2 bytes).
Usually, this field is somewhere at the end of the binary file, and if you change one of your bytes (e.g. 0x10 to 0x11), then you will also have to change the CHECKSUM field to the new value, in this case 0x0068.