Symbolic Execution with Angr - part 1
Table of Contents
An overview of the binary analysis python framework angr in performing symbolic execution.
Symbolic Execution #
Before introducing the angr python framework, it is important to understand the key concept that lies at the heart of angr’s core functionality - symbolic execution.
Symbolic execution is a technique of program analysis where instead of supplying normal inputs to a program during testing, symbolic inputs that represent arbitrary values are supplied.
Think of symbolic inputs as variables such as those in algebra. Since we are trying to find a variable that will trigger a given state, symbolic execution will walk through all possible paths to reach that desired state, solving for that particular variable.
Suppose we have an equation as follows:
\(x^2 + 6x + 18 = 12\)
x represents our symbol and it depends on a given execution path that binds it.
An execution path is how you traverse through the program to reach a state.
Example 1 #
Given a simple program, let us map out the execution paths of the program
input = input('Input the password: ')
if input == 'goodpass':
print 'Success.'
else:
print 'Try again.'
path 1: User enters ‘goodpass’ as input
path 2: User enters ‘pass123’ as input
To find the value of λ, we can walk back in a given execution path to find out what input triggers a given branch. In the first branch, backtracking leads you to (if input == ‘goodpass’), where goodpass triggers the desirable input.
Let us look into how we can do this using the angr framework
The Angr Framework #
Angr is a binary analysis framework based on a suite of python 3 libraries that allows you to carry out various tasks.
With its symbolic execution engine, it can step into a binary’s various states and follow any branch leading to the states, find a state that matches given criteria and solves for a symbolic variable depending on a given execution path as we demonstrated in the Introduction.
Installation instructions can be found here
Clone this repository which contains the files we will need throughout this n-part series.
In the dist directory, you will find all your project files.
➜ dist git:(master) pwd
/opt/angr_ctf/dist
➜ dist git:(master) ls
00_angr_find 07_angr_symbolic_file 14_angr_shared_library scaffold02.py scaffold09.py scaffold16.py
01_angr_avoid 08_angr_constraints 15_angr_arbitrary_read scaffold03.py scaffold10.py scaffold17.py
02_angr_find_condition 09_angr_hooks 16_angr_arbitrary_write scaffold04.py scaffold11.py
03_angr_symbolic_registers 10_angr_simprocedures 17_angr_arbitrary_jump scaffold05.py scaffold12.py
04_angr_symbolic_stack 11_angr_sim_scanf lib14_angr_shared_library.so scaffold06.py scaffold13.py
05_angr_symbolic_memory 12_angr_veritesting scaffold00.py scaffold07.py scaffold14.py
06_angr_symbolic_dynamic_memory 13_angr_static_binary scaffold01.py scaffold08.py scaffold15.py
➜ dist git:(master)
00_angr_find #
The first example we look into sets us up for basic angr usage. Run the binary in your terminal to see how it works
➜ dist git:(master) ./00_angr_find
Enter the password: pass123
Try again.
Program analysis #
It is a simple program that asks for a password. We can disassemble the program to find out more information. Using a disassembler of your choice, disassemble the main function and view your control graph.You can use ghidra, IDA pro, Cutter, radare etc. I disassembled with Cutter
The control graph is not too complicated but we don’t have to spend time analyzing it to find the password
2 branches are of interest.
Branch 1 triggered “Try again” while branch 2 should trigger “Good job”
Our goal is to trigger branch 2 by supplying the correct password.
Solution #
Let us build out our solution in python step by step.
- Import necessary libraries.
import angr
import sys
sys library will be used to parse our arguments
- Instantiate an angr project.
def main():
path_to_binary = '/opt/angr_ctf/dist/00_angr_find'
project = angr.Project(path_to_binary) #1
initial_state = project.factory.entry_state() #2
simulation = project.factory.simgr(initial_state) #3
We provide the path to the project and use the Project method to create an instance of an angr project to work with. #1
We then create an initial_state variable that holds the entry point to the program. This is necessary for the symbolic execution engine to know where to start exploring the program.#2
We follow through with a simulation manager object that starts executing at the entry point of our program. #3
- Addresses definition
print_good_address = 0x8048675 #4
simulation.explore(find=print_good_address) #5
The good address that we want is the one of the branches that leads to printing ‘Good job’ indicated below as 0x8048675
- House-keeping
if simulation.found: #6
solution_state = simulation.found[0] #7
solution = solution_state.posix.dumps(sys.stdin.fileno()) #8
print("[+] Solution found! : {}".format(solution.decode('utf-8'))) #9
else:
raise Exception("Could not find the solution") #10
if __name__ == "__main__":
main()
Once the execution paths to reach our desirable address have been obtained, we begin by checking if our simulation variable contains something. #6
.
If we found something, we obtain the input that triggered that particular state. #7
We then dump it #8
and print it out #9
If not, we raise an exception #10
Our final script looks as follows:
import angr
import sys
def main():
path_to_binary = '/opt/angr_ctf/dist/00_angr_find'
project = angr.Project(path_to_binary)
initial_state = project.factory.entry_state()
simulation = project.factory.simgr(initial_state)
print_good_address = 0x8048675
simulation.explore(find=print_good_address)
if simulation.found:
solution_state = simulation.found[0]
solution = solution_state.posix.dumps(sys.stdin.fileno())
print("[+] Success! Solution is: {}".format(solution.decode('utf-8')))
else:
raise Exception("Could not find the solution")
if __name__ == "__main__":
main()
Run it to see angr perform its dark magic
We easily find the string that triggers our desirable state.
For the next binary 01_angr_avoid, the solution is exactly the same as this only that you shall have to specify the address of the branch you want to avoid.
In the next tutorial, we will look at more advanced concepts such as how to avoid path explosion and also injecting symbolic values into registers. Stay tuned!