LOLBins Against the Machine: Reverse Engineering at Machine Speed

Matan Abutbul Senior Researcher
Pentera Labs

Purpose

Attackers can utilize Living Off the Land Binaries (LOLBins) to execute commands, evade detection, and maintain persistence using legitimate system tools already present in your environment. This research explores how AI can be used to proactively discover new, undocumented LOLBins before they are weaponized.

Executive Summary

The next LOLBin isn’t in a threat intel feed; it’s hiding in your /usr/bin directory right now. But most defenders only discover it after it’s been weaponized. I aim to flip this, helping defenders proactively discover unknown binaries that can be abused for command execution. My research introduces a novel approach that leverages AI to enhance reverse engineering by automating the tracing of execution paths and identifying attacker-relevant functionality. LOLBins serve as a case study for this methodology, but the framework is applicable to a wide range of binary analysis problems that extend far beyond this single category.

Does it apply to my organization?

Yes, if your organization runs Linux, Windows, or macOS environments and relies on built-in system utilities for administration or automation. Even if you have robust endpoint detection, these binaries can be abused for stealthy attacks that blend in with legitimate operations.

Who should read this?

Security researchers, red teamers, and defenders (including CISOs and SOC analysts) who want to understand and detect emerging “living off the land” techniques and learn how AI can be used to scale binary analysis and uncover new abuse paths.

The Cloak of Legitimacy

Picture a secure compound. Every vehicle approaching the loading dock is inspected. Unknown cars are stopped, IDs are checked, and questions are asked. But what if, instead of showing up in your own car, you arrive in the delivery truck of a well-known/pre-approved vendor? You are wearing the uniform, driving the right route, and following the expected process. To security, you look like part of the daily routine. No one questions you. You are waved through without a second glance.

This is exactly how attackers operate when they abuse LOLBins. These are legitimate system executables, already present on most machines and used daily for administration. They are signed, trusted, and pre-installed. Because of that, they rarely raise alarms. From the defender’s perspective, they look like the vendor truck: part of business as usual.

But attackers see them for what they really are: low-noise, high-leverage tools for stealth and persistence. Instead of dropping custom malware or noisy payloads, they turn the environment’s own tools against it. With LOLBins, they can run arbitrary commands, pivot between machines, establish footholds, and exfiltrate data without uploading a single foreign binary or tripping traditional security controls. These actions are wrapped in the cloak of legitimacy, hiding malicious behavior behind the trusted façade of routine operations.

That is what makes LOLBins so effective. Most detection logic is built to flag anomalies: new binaries, unexpected behaviors, or unusual traffic patterns. But LOLBins do not stand out. Their presence is expected, their behavior appears ordinary, and their abuse often mimics real administrative activity. As a result, defenders end up scanning for threats while the attacker is already inside, quietly blending into the background noise of everyday operations.

This write-up focuses on discovering new LOLBins in Linux environments, though the same principle applies across platforms. Windows, macOS, and cloud environments all contain their own versions of native utilities that can be misused in similar ways. The central idea is universal: the more trusted the tool, the more dangerous it becomes when used against you.

From a Simple Task to a Bigger Question

It all started when I was working on a new feature for our product. The feature was intended to add more options for executing commands with elevated privileges.

So I went to GTFOBins, the well-known repository, to search for binaries suited to the task. During my work, there was one thought I kept circling back to: how hard is it to create such a repository and maintain it?

After finishing my initially assigned task, I had time for a little research that combined “old-fashioned” reverse engineering and AI.

The research question was “Is it possible to use AI in order to automate the process of finding new LOLBins?” Not just to document them after the fact, but to proactively surface binaries with attacker-useful execution behavior before they are widely known.

The answer is YES, of course, but how?

Alongside this leading question, I had another goal in mind: using AI to improve my own efficiency. I remember how much time and effort I used to spend analyzing just one binary.

Before you can effectively automate a process, it’s better to understand what the process looks like manually. So I started some old-fashioned reverse engineering of one binary that is already known for privilege escalation (/usr/bin/find).

Source: https://gtfobins.github.io/gtfobins/find/

What I had in mind was that when a LOLBin executes a command or another binary, it must invoke some execution-related syscall or library function along the way. I wanted to find those calls, take their arguments, and trace them backward to check whether they were passed in from the main function via command-line arguments.

That manual workflow became the foundation for everything that followed.

Old-School LOLBins Hunting

Whenever I plan to automate something, I start by mastering it manually. Before writing a single line of code, I want to understand exactly how the behavior works, what it depends on, and which decisions I’ll eventually need to replicate programmatically. In this section, I’ll walk through the manual analysis flow I performed using the radare2 framework to determine whether a given binary exhibits LOLBin-like behavior.

The goal of this section is not to teach radare2, but to demonstrate the reasoning process the automation must later replicate.

After loading the binary into radare2, I begin by searching for common execution primitives:

I start with the first match. Ideally it’s the one we’re after. If not, I simply continue through the list and repeat the process.
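The same search can be scripted. The sketch below assumes an r2pipe session (radare2's scripting interface) and that radare2's iij command returns import entries carrying a name field; the helper name and the list of primitives are illustrative choices, not taken from the original analysis.

```python
# Illustrative set of process-spawning functions; extend as needed.
EXEC_PRIMITIVES = {
    "execl", "execlp", "execle", "execv", "execvp", "execve",
    "system", "popen", "posix_spawn",
}


def find_exec_imports(r2):
    """Return imported symbol names that match known execution primitives.

    `r2` is any object exposing cmdj(), for example an r2pipe session.
    """
    imports = r2.cmdj("iij") or []  # "iij" lists the binary's imports as JSON
    return sorted(
        entry["name"]
        for entry in imports
        if entry.get("name") in EXEC_PRIMITIVES
    )
```

With a live session this would be called as find_exec_imports(r2pipe.open("/usr/bin/find")), and the first name in the result is the natural place to start.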

Next, I look for all cross-references to sym.imp.execvp, effectively asking, “Where in the binary is execvp actually invoked?”:

In this case, there is only one caller: sym.imp.execvp is invoked exclusively from fcn.0000fdf0, at address 0x10186.

Since these cross-references cover every execvp call site, I now know exactly which function is responsible for spawning external commands: any command execution in this binary must flow through fcn.0000fdf0.

A quick look at the function with afi @ fcn.0000fdf0 shows 298 instructions, so we need to focus on specific blocks. First, we want to see the parameters of execvp and where they come from. pdf @ fcn.0000fdf0 prints the whole disassembled function, while pd -4 @ 0x10186+4 narrows the view to the instructions around the execvp call (the same step is repeated for every call site when there is more than one). I inspect how the argument registers are populated just before the call; four instructions are enough this time. In this case, the first argument (the program name) is built from r12.

r12 is the suspect: this is the register that holds the binary to execute next.

Around the execvp call there are several function calls: fcn.00021af0, sym.imp.dcgettext, sym.imp.error, fcn.00021cb0, and fcn.0001c150.

To decide where to dig next, we focus on data flow, not just proximity.

execvp uses r12 as argv, so we look for where r12 is last defined.

The other nearby calls (dcgettext, error, fcn.00021cb0) do not write to r12 at all. Rather, they handle error messages and directory changes. fcn.0001c150 only influences whether we reach the execvp call.

Only fcn.00021af0 returns a value that becomes r12 and is later passed to execvp. That’s why we choose fcn.00021af0 as the next function to analyze.
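The question "where is r12 last defined?" can be approximated mechanically. The sketch below is a deliberately naive backward scan over Intel-syntax disassembly lines. The helper name, the mnemonic list, and the sample instructions are illustrative assumptions, not the real disassembly of find. It ignores sub-registers, memory aliasing, and control flow, all of which a real tracer would have to handle.

```python
import re

# Mnemonics whose first operand is written (Intel syntax); intentionally incomplete.
WRITING_MNEMONICS = {"mov", "lea", "movzx", "movsx", "pop", "xor", "add", "sub"}


def last_definition(instructions, register, call_index):
    """Return the index of the last instruction before `call_index`
    that writes `register`, or None if no such instruction is found."""
    for i in range(call_index - 1, -1, -1):
        parts = re.split(r"[\s,]+", instructions[i].strip(), maxsplit=2)
        if (
            len(parts) >= 2
            and parts[0] in WRITING_MNEMONICS
            and parts[1] == register
        ):
            return i
    return None


# Made-up snippet that mirrors the pattern described in the text.
snippet = [
    "call fcn.00021af0",
    "mov r12, rax",
    "mov rdi, qword [r12]",
    "mov rsi, r12",
    "call sym.imp.execvp",
]
definition_index = last_definition(snippet, "r12", call_index=4)
```

Here the scan stops at the mov that copies rax into r12, which points at the preceding call as the next function to inspect.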

Inside fcn.00021af0 there is only one call instruction (call fcn.00034980), and the function immediately returns the value left in rax. Since fcn.0000fdf0 copies rax into r12 right after calling fcn.00021af0, we can confirm that whatever fcn.00034980 returns ultimately becomes the argument vector passed to execvp.

Now I’m going to skip a few intermediate steps (because the process becomes recursive) just to validate that r12 is indeed the pointer returned by fcn.00034980, and that it remains unchanged until it is later used as argv in the execvp call.

After validation, we want to see where the execvp caller function (fcn.0000fdf0) is coming from.

Using axt @ fcn.0000fdf0, we discover that this function is not called directly. Instead, its address is stored as a function pointer by another function (fcn.0001b550), which constructs an internal action node.

Tracing that backward reveals that fcn.0001b550 is itself called from a small adapter function (fcn.0000e550) that hardcodes the string “-exec” into rdi and then jumps into the action constructor.

This conclusively ties the only execvp call in the binary to the parsing and evaluation of the “-exec” expression in the user’s command line.

To sum it all up, we’re left with a tedious process (as always with reverse engineering 🙂) until reaching that one function that matches our needs.

After manually reversing a couple of known LOLBins from GTFOBins and confirming that this process reliably surfaced their execution paths, I was ready to start coding.

At this point, my goal was simple: run my tool against known GTFOBins entries to see whether it could rediscover them, and then potentially surface new LOLBins that had not yet been documented.

Moving to Automation

At this point, the obvious question is: where does AI come into play? That’s where things start to get interesting.

In this tool, AI is not used as a shortcut or a replacement for reverse engineering. Instead, it acts as a reverse engineering assistant inside the process.

The core idea is simple: the tool does all the heavy lifting first. It collects data, builds context around each execution path, and only then sends a structured query to the AI. AI’s role is limited to digesting that context and providing a decision: do the syscall parameters originate from command-line arguments, or not?
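To make the "structured query, single decision" idea concrete, here is a minimal, provider-agnostic sketch. The payload fields, the from_argv key, and both function names are assumptions made for illustration; the write-up does not show the tool's actual prompt or response format.

```python
import json


def build_query(exec_function, argument, call_chain):
    """Pack pre-collected context into one JSON prompt string.

    `call_chain` is a list of (function_name, disassembly_snippet) pairs,
    ordered from the entry point down to the execution call.
    """
    payload = {
        "question": (
            "Does the argument passed to the execution function originate "
            "from command-line arguments? Answer with JSON: "
            '{"from_argv": true|false, "reason": "..."}'
        ),
        "exec_function": exec_function,
        "argument": argument,
        "call_chain": [
            {"function": name, "snippet": snippet}
            for name, snippet in call_chain
        ],
    }
    return json.dumps(payload, indent=2)


def parse_decision(reply):
    """Return True or False for a well-formed reply, None otherwise."""
    try:
        data = json.loads(reply)
    except (TypeError, ValueError):
        return None
    verdict = data.get("from_argv") if isinstance(data, dict) else None
    return verdict if isinstance(verdict, bool) else None
```

Treating anything other than a clean boolean as None keeps a malformed or evasive model reply from being counted as a positive finding.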

To support this, I implemented a helper class called AIUtils, which handles all communication with the AI backend: creating assistants, sending queries, and processing responses. In this project, I used OpenAI, but the design is not tied to a specific provider. Any model with a Python API could be plugged in without changing the core analysis logic.

At the same time, I wanted a clean separation between reverse engineering logic and tooling. For that reason, I created another utility class called R2Utils. This class encapsulates all interactions with r2pipe and exposes the same primitives I used during the manual analysis phase, such as function listing, cross-reference discovery, and disassembly retrieval, along with additional helpers needed for automation.
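A wrapper of this kind might look roughly like the sketch below. The class and method names are illustrative assumptions rather than the tool's actual interface, and the JSON field names (name, fcn_name, from) are those emitted by recent radare2 versions for aflj and axtj, so they should be checked against the installed release.

```python
class R2UtilsSketch:
    """Illustrative wrapper; the real R2Utils interface is not shown in the write-up."""

    def __init__(self, pipe):
        # `pipe` is an r2pipe session (r2pipe.open(path)) or any object
        # that provides cmd() and cmdj().
        self.pipe = pipe
        self.pipe.cmd("aaa")  # run radare2's analysis once up front

    def list_functions(self):
        """Names of all functions radare2 identified (aflj)."""
        return [f["name"] for f in (self.pipe.cmdj("aflj") or [])]

    def xrefs_to(self, target):
        """(caller_name, call_address) pairs referencing `target` (axtj)."""
        refs = self.pipe.cmdj(f"axtj @ {target}") or []
        return [(ref.get("fcn_name"), ref.get("from")) for ref in refs]

    def disassemble(self, function):
        """Text disassembly of a whole function (pdf)."""
        return self.pipe.cmd(f"pdf @ {function}")
```

Passing the session in through the constructor keeps the analysis logic testable with a stub object, without radare2 installed.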

With these building blocks in place, the focus shifted to the logic of the tool itself.

Automating the Manual Process

The plan was to build an automated tool that accepts either a single binary or an entire directory (such as /usr/bin) as input, and then follows the same reasoning process I previously applied by hand to each discovered executable.
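Accepting "a file or a directory" as input could be handled along these lines. Filtering on the ELF magic bytes is an assumption added here to skip shell and Python scripts that also live in /usr/bin; the write-up does not say how the tool selects executables, and the function name is hypothetical.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def collect_binaries(path):
    """Return ELF files for a single path, or for every regular file
    directly inside a directory (no recursion)."""
    if os.path.isdir(path):
        candidates = [
            os.path.join(path, name) for name in sorted(os.listdir(path))
        ]
    else:
        candidates = [path]

    binaries = []
    for candidate in candidates:
        if not os.path.isfile(candidate):
            continue
        try:
            with open(candidate, "rb") as handle:
                if handle.read(4) == ELF_MAGIC:
                    binaries.append(candidate)
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
    return binaries
```

Each returned path can then be handed to the per-binary analysis in turn.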

First, each binary is opened using r2pipe, and full analysis is performed using radare2.

Next, the tool identifies candidate execution functions. Rather than relying solely on predefined assumptions, it can first leverage AI to analyze the full list of functions extracted from the binary and determine which ones are capable of executing external commands. AI is also used to sort and prioritize these candidates, allowing the analysis to focus on the most relevant execution paths early on.

This AI-driven classification is used as a discovery mechanism, especially for binaries that rely on less obvious helpers or undocumented execution wrappers.

To keep the analysis grounded and fail-safe, the tool also maintains a predefined list of well-known execution primitives such as execl, execvp, popen, fork, and system. This list acts as a fallback and a safety net, ensuring that common and well-understood execution paths are always included, even if the AI classification is inconclusive or unavailable.
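One way to combine the two sources is to trust the AI ranking first and then append any fallback primitive that is actually present. The sketch below uses the fallback list named above; the function name, the ordering policy, and the decision to drop AI suggestions that do not exist in the binary are assumptions for illustration.

```python
# Fallback list as named in the text.
FALLBACK_PRIMITIVES = ["execl", "execvp", "popen", "fork", "system"]


def select_candidates(function_names, ai_ranked=None):
    """Return candidates to analyze: AI-ranked names first, then fallback
    primitives found in the binary. Names the binary does not contain
    (for example, hallucinated by the model) are discarded."""
    present = set(function_names)
    fallback_hits = [
        name for name in function_names
        # "sym.imp.execvp" -> "execvp"
        if name.rsplit(".", 1)[-1] in FALLBACK_PRIMITIVES
    ]
    ordered = []
    for name in list(ai_ranked or []) + fallback_hits:
        if name in present and name not in ordered:
            ordered.append(name)
    return ordered
```

When ai_ranked is empty or None, the result degrades to the fallback list alone, which matches the safety-net role described above.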

For each detected execution call, the tool builds a reverse call graph, starting from the execution point and tracing backward through the program until it reaches the entry point, typically main or an equivalent dispatcher function.
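The backward walk can be sketched as a breadth-first search over callers. This is a simplified illustration, not the tool's implementation: the function name, the entry-point names, and the depth limit are assumptions, and callers_of stands in for whatever cross-reference lookup is available (for example, one built on axt).

```python
from collections import deque


def reverse_call_chains(start, callers_of,
                        entry_points=("main", "entry0"), max_depth=32):
    """Walk callers backward from `start` and return every path that
    reaches an entry point, ordered entry point first and `start` last.

    `callers_of(name)` must return the names of functions referencing `name`.
    """
    chains = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        head = path[0]
        if head in entry_points:
            chains.append(path)
            continue
        if len(path) >= max_depth:
            continue  # give up on implausibly deep chains
        for caller in callers_of(head):
            if caller not in path:  # avoid looping on recursive calls
                queue.append([caller] + path)
    return chains
```

Paths that never reach an entry point are simply dropped, so an execution call reachable only through unresolved indirect calls yields an empty result rather than a false chain.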

While building this call chain, the tool collects code snippets from each function along the path. These snippets preserve contextual disassembly around each call site and are later used for argument tracing and validation.

Once a complete call chain is constructed, the tool extracts the arguments passed to the execution function and traces their origin through the collected code. The goal is to determine whether those arguments ultimately originate from the program’s input arguments.

If the argument can be traced back to user-controlled input, the tool flags the execution path as a potential command execution vector. If not, the result remains unproven.

The key point here is that the automated flow is not fundamentally different from the manual one. It is the same process, encoded and enforced programmatically.

Scaling the Analysis

At some point, I needed to validate whether this approach actually worked at scale.

To do that, I set up a clean Ubuntu 22.04 environment and ran the tool against the entire /usr/bin directory, which contains roughly a thousand binaries. I launched the analysis overnight and waited for the results.

The moment of truth came the next morning. I filtered out binaries already documented in GTFOBins and focused on what remained. What surfaced were binaries that are not currently listed in the repository, yet exhibited execution paths consistent with LOLBin behavior that warranted further investigation.

Results of /usr/bin/chrt:

Results of /usr/bin/i386:

Getting this kind of validation was genuinely exciting. What started as a side research idea during day-to-day work turned into something tangible: a repeatable process that could surface interesting results without manual intervention.

The Bigger Takeaway

While the findings themselves are interesting, they are not the most important outcome of this work.

What I found more meaningful is the process behind it: the ability to take existing knowledge and skills and amplify them through automation and AI. This is not just about discovering new privilege escalation paths or building another binary analysis tool. It’s about learning how to extend your reach and efficiency by combining technical intuition with AI-driven reasoning.

If there is one takeaway I hope readers walk away with, it’s that this mindset applies far beyond security research. The tools are already here. What matters is how creatively and responsibly we choose to use them.
