Reverse Engineering – Binary Program Analysis [July 2022]

In this training, we learn the fundamentals of reverse engineering from scratch: from reconstructing high-level code, through recovering complex data structures and C++ class hierarchies, to analyzing complex malware samples. Along the way, we become proficient with state-of-the-art tools such as IDA, Ghidra, and GDB. The training thus accompanies students through their first reverse engineering steps and prepares them for the long journey ahead.

$4,299.00

Duration

4 days

Delivery Method

virtual

Level

beginner

Seats Available

20

REGISTRATION CLOSED

DATE: 19-22 July 2022

TIME: 09:00 to 17:00 EDT/GMT-4

Date    Day        Time                   Duration
19 Jul  Tuesday    09:00-17:00 EDT/GMT-4  8 hours
20 Jul  Wednesday  09:00-17:00 EDT/GMT-4  8 hours
21 Jul  Thursday   09:00-17:00 EDT/GMT-4  8 hours
22 Jul  Friday     09:00-17:00 EDT/GMT-4  8 hours

 


Reverse engineering is the art of extracting valuable information from unknown binary programs. Whether we aim to find vulnerabilities in closed-source software, dissect the internals of nation-state malware, or simply bypass copy protection, reverse engineering helps us pinpoint relevant code and data locations, reconstruct high-level constructs from machine code, and thereby gain insight into a program's internals.

First, we discuss the layers between machine code and high-level languages, introduce binary file formats, and get to know important tools such as hex editors, disassemblers, decompilers, and debuggers. Afterward, we familiarize ourselves with the X86-64 instruction set architecture, the most common architecture on desktop computers and servers. Along the way, we learn how to write assembly code by hand, inspect registers and flags in a debugger, and reconstruct arithmetic calculations and loops in a disassembler.
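
To give a flavor of what a binary file format looks like under the hood, here is a minimal Python sketch (not taken from the course material) that decodes a few leading fields of an ELF header. The field offsets and constants follow the ELF specification; the helper name `parse_elf_ident` and the hand-built sample header are assumptions made purely for illustration.

```python
import struct

ELF_MAGIC = b"\x7fELF"
CLASSES = {1: "32-bit", 2: "64-bit"}                   # EI_CLASS values
MACHINES = {0x03: "x86", 0x28: "ARM", 0x3E: "x86-64"}  # small subset of e_machine


def parse_elf_ident(header: bytes) -> dict:
    """Decode a few leading ELF header fields (hypothetical helper)."""
    if len(header) < 20 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF header")
    ei_class, ei_data = header[4], header[5]
    if ei_data not in (1, 2):
        raise ValueError("unknown byte order")
    # EI_DATA: 1 = little endian, 2 = big endian
    byte_order = "<" if ei_data == 1 else ">"
    # e_type and e_machine are two 16-bit fields starting at offset 16
    e_type, e_machine = struct.unpack_from(byte_order + "HH", header, 16)
    return {
        "class": CLASSES.get(ei_class, "unknown"),
        "endianness": "little" if ei_data == 1 else "big",
        "type": e_type,
        "machine": MACHINES.get(e_machine, hex(e_machine)),
    }


# Synthetic 20-byte header: 64-bit, little endian, ET_DYN (3), EM_X86_64 (0x3E)
SAMPLE = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 3, 0x3E)
print(parse_elf_ident(SAMPLE))
```

To try it on a real binary, one could pass the first 20 bytes of the file instead of `SAMPLE`; a hex editor shows the same bytes at the same offsets.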

In the second part, we cover the reconstruction of high-level code constructs from machine code. For this, we compile C code to machine code and compare the two side by side. Using different compilers and optimization levels, we study the many ways in which a high-level construct can be represented. Afterward, we focus on manually recovering high-level functions from compiler-generated code. Finally, we dive into the area of software cracking and deepen our skills by reverse engineering and patching serial validation schemes.
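
As a rough illustration of what patching a check can mean at the byte level, the following Python sketch rewrites a short conditional jump in a small x86 code snippet. The opcodes used (0x74 for a short `je`, 0x75 for a short `jne`, 0xEB for a short `jmp`, 0x90 for `nop`) are standard x86 encodings; the helper `patch_short_jump`, the `CHECK` snippet, and its offsets are hypothetical and do not come from the course.

```python
JE_SHORT, JNE_SHORT, JMP_SHORT, NOP = 0x74, 0x75, 0xEB, 0x90

# Hypothetical validation stub:
#   0: 39 d8           cmp eax, ebx
#   2: 75 06           jne fail        (skips the success path)
#   4: b8 01 00 00 00  mov eax, 1      (success)
#   9: c3              ret
#  10: 31 c0           xor eax, eax    (fail)
#  12: c3              ret
CHECK = bytes.fromhex("39d87506b801000000c331c0c3")


def patch_short_jump(code: bytes, offset: int, mode: str) -> bytes:
    """Return a copy of `code` with the short je/jne at `offset` patched."""
    opcode = code[offset]
    if opcode not in (JE_SHORT, JNE_SHORT):
        raise ValueError("no short je/jne at this offset")
    patched = bytearray(code)
    if mode == "invert":      # je <-> jne: the two opcodes differ in the lowest bit
        patched[offset] = opcode ^ 1
    elif mode == "always":    # unconditional jump, same displacement byte
        patched[offset] = JMP_SHORT
    elif mode == "never":     # overwrite opcode and displacement with two NOPs
        patched[offset:offset + 2] = bytes([NOP, NOP])
    else:
        raise ValueError("mode must be 'invert', 'always' or 'never'")
    return bytes(patched)


# With the jump removed, execution always falls through to the success path.
print(patch_short_jump(CHECK, 2, "never").hex())
```

Which of the three modes is appropriate depends on how the compiler laid out the success and failure paths, which is exactly what the disassembler view is used to find out.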

Before we reconstruct complex data structures and C++ classes with Ghidra, we first learn how to identify them manually. Next, we look at how to recover class inheritance relationships, analyze constructors and virtual functions, and resolve virtual function calls.
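
Conceptually, resolving a virtual call means following the object's vtable pointer and reading the slot that the call instruction indexes. The Python sketch below models this with a toy memory map. It assumes the layout commonly produced by x86-64 compilers (vtable pointer at offset 0 of the object, 8-byte slots); all addresses and the names `Dog::speak` and `Animal::name` are made up for illustration.

```python
POINTER_SIZE = 8

# Toy memory: address -> 64-bit value stored there (all addresses hypothetical)
MEMORY = {
    0x4000: 0x1100,  # vtable slot 0
    0x4008: 0x1200,  # vtable slot 1
    0x7000: 0x4000,  # object: first field is the vtable pointer
}

# Hypothetical symbol names for the function addresses above
SYMBOLS = {0x1100: "Dog::speak", 0x1200: "Animal::name"}


def resolve_virtual_call(memory: dict, this_ptr: int, slot_offset: int) -> int:
    """Model `mov rax, [rdi]; call [rax + slot_offset]` and return the target."""
    if slot_offset % POINTER_SIZE != 0:
        raise ValueError("slot offset must be a multiple of the pointer size")
    vptr = memory[this_ptr]             # load the vtable pointer from the object
    return memory[vptr + slot_offset]   # load the function pointer from the slot


target = resolve_virtual_call(MEMORY, 0x7000, 8)
print(hex(target), SYMBOLS[target])
```

In a real binary the same lookup is done by hand or in the decompiler: once the vtable of the object's class is known, each indirect call can be annotated with its concrete target.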

Finally, we put the knowledge we have gained into practice by analyzing nation-state malware samples. After discussing challenges and strategies for dealing with complex binaries, we identify malware functionality based on API functions and reconstruct class hierarchies of malware modules. To reveal hidden strings in the binary, we script Ghidra to decrypt them automatically.
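
The core of such a string decryption script is often only a few lines long. The following standalone Python sketch assumes a single-byte XOR scheme, which is a common but by no means universal choice; the scheme used by any particular sample has to be recovered from its decryption routine first. The function names and the sample string are illustrative assumptions, and the Ghidra-specific part (reading the encrypted bytes from the program and writing back comments) is deliberately left out.

```python
def xor_decrypt(data: bytes, key: int) -> str:
    """Decrypt `data` with a single-byte XOR key (assumed scheme)."""
    return bytes(b ^ key for b in data).decode("ascii")


def find_printable_keys(data: bytes):
    """Try every nonzero single-byte key; yield those giving printable ASCII."""
    for key in range(1, 256):
        plain = bytes(b ^ key for b in data)
        if all(32 <= c < 127 for c in plain):
            yield key, plain.decode("ascii")


# Hypothetical encrypted string as it might appear in a sample's data section
ENCRYPTED = bytes(b ^ 0x5A for b in b"cmd.exe")

print(xor_decrypt(ENCRYPTED, 0x5A))
for key, text in find_printable_keys(ENCRYPTED):
    print(hex(key), text)
```

Brute-forcing the key usually yields several printable candidates, so the key recovered from the disassembly remains the reliable way to pick the right one.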

 

Note that the training focuses on hands-on sessions. While some lecture parts explain how high-level code is represented in machine code, various hands-on sessions teach how to work with reverse engineering tools and reconstruct high-level code from binary programs. The trainer actively helps students solve the exercises. After a task is completed, we discuss different solutions in class. Furthermore, students receive detailed reference solutions that can be used during and after the course.

While this class mostly focuses on the X86-64 architecture, we can optionally take a look at the ARM32 architecture and discuss the differences and similarities between the two. Since the course teaches reverse engineering in a general way, students will find that the techniques and tools also apply to other architectures.

 

Topics Covered

The training follows this outline:

  • Introduction to Reverse Engineering
    – Motivation
    – Application scenarios
    – From machine code to high-level languages
    – Compilers
    – Executable file formats (ELF & PE)
    – Static and dynamic program analysis
    – Editing ELF files with a hex editor
    – Disassembling with IDA
    – Decompilation with Ghidra
    – Debugging with GDB

 

  • X86-64 Architecture
    – Architecture overview
    – Registers and data types
    – Arithmetic operations and control-flow instructions
    – Stack operations and function invocations
    – Inspection of registers and flags with GDB
    – Implementation of arithmetic operations in assembly code
    – Reconstruction of simple calculations
    – Loop reconstruction with IDA

 

  • Reconstruction of Functions
    – Inspection of empty functions on the binary level
    – Stack frame analysis with GDB
    – Prologue and epilogue identification with IDA and GDB
    – Calling conventions
    – Basic blocks and control-flow graphs
    – Reconstruction of function signatures and arguments
    – Reconstruction of recursive functions
    – Reconstruction of (nested) conditionals and switch statements
    – Reconstruction of (nested) loops
    – Impact of compiler optimizations

 

  • Software Cracking
    – Software license checks and keygenning
    – Analysis of serial validation schemes with IDA/Ghidra and GDB
    – Patching to manipulate control flow

 

  • Reconstruction of Data Structures
    – Local and global data structures
    – Variables, arrays, strings and structs
    – Reconstruction of arrays with IDA/Ghidra
    – Reconstruction of structs with IDA/Ghidra

 

  • C++ Reverse Engineering
    – Function overloading and name mangling
    – Class objects and object life cycles
    – Identification and reconstruction of class objects
    – Reconstruction of class relationships/inheritance
    – Static and dynamic dispatch
    – Virtual functions and class inheritance
    – Identification and analysis of virtual function tables
    – Resolving virtual function calls

 

  • Malware Reverse Engineering
    – Malware types and behavior
    – Analysis challenges and strategies
    – Identification of malware functionality based on API functions
    – Class reconstruction of C++ malware with Ghidra
    – Ghidra scripting for automated string decryption

 

  • ARM32 Architecture (Optional)
    – Architecture overview
    – Differences from X86-64
    – Registers and data types
    – Stack operations
    – Arithmetic operations and control-flow instructions
    – Subroutines and calling convention

Why You Should Take This Course

This class is intended for students who have little or no prior knowledge of reverse engineering and want to learn binary program analysis from scratch. This includes, but is not limited to, security professionals as well as malware, forensic, and threat analysts. The course is also of interest to students who want to extend their knowledge and dive into data structure recovery, C++ reverse engineering, or malware analysis.

Who Should Attend

  • Anyone interested in learning binary program analysis
  • Security professionals
  • Malware analysts
  • Forensic analysts
  • Threat analysts

Key Learning Objectives

  • Learn reverse engineering from scratch and understand all layers between machine code and high-level languages

  • Become proficient in using state-of-the-art tools like IDA, Ghidra and GDB

  • Learn how to reconstruct (nested) conditionals and loops, functions, complex data structures and C++ classes from machine code

  • Get to know strategies to analyze complex binaries and apply them to nation-state malware samples

  • Deepen your reverse engineering skills in various hands-on sessions

Prerequisite Knowledge

Participants should have some familiarity with low-level programming in C. In particular, a basic understanding of pointers is recommended.

Hardware / Software Requirements

Students should have access to a computer with at least 4 GB of RAM and at least 20 GB of disk space. Furthermore, they should install virtualization software such as VirtualBox or VMware. Students will be provided with a Linux VM containing all necessary tools and setups.

Your Instructor

Tim Blazytko (@mr_phrazer) is a well-known binary security researcher and co-founder of emproof. After working on novel methods for code deobfuscation, fuzzing and root cause analysis during his PhD, Tim now builds code obfuscation schemes tailored to embedded devices. He also teaches reverse engineering and code deobfuscation, analyzes malware, and performs security audits.

What students say about this training:

From Tim’s past HITB training

  • Trainer’s Overall Score: 96%
  • Trainees’ Overall Feedback:

Would you recommend this class, or attend other classes by this trainer?
“I would absolutely recommend others take this class or any other classes taught by Tim”

“Yes, I would definitely recommend this class to any reverse engineers wanting to advance their skills, and I would attend other classes by this trainer.”

“Absolutely recommend this class. It has met and exceed all my expectations!”

What part of this course did you find most useful and interesting?
“I found all of it very interesting. The most useful parts to me were the coding/reversing exercises. That really helps to cement my understanding of the topics discussed”

“The latter part, dealing with the automation of analysis, [where we were] applying the theory of techniques covered earlier on”

“It is very difficult to fault any component of this course, its appears as a very mature and well refined project. Tim is clearly very passionate on the subjects and that is portrayed through the material and delivery.”