AlgoKiller-Plugin: An MCP-Powered AI Agent for High-Fidelity ARM64 Trace Analysis & Crypto Triage

12 hours ago

Project Link: https://github.com/icloudza/algokiller-plugin

Current Version: v0.9.5 | License: MIT | Platform: macOS (Apple Silicon / Intel)

The Problem: The "Confidence Trap" of LLMs in RE

If you’ve ever pasted a raw execution trace into an LLM and asked it to analyze

an algorithm, you know the routine:

- It sees an add or a mul and confidently claims, "This is SHA-256."

- It spots 0x12345678 and declares, "This is the encryption key."

- You challenge it, it apologizes, and then hallucinates an equally confident

but different lie.

The issue isn't the model's intelligence; it's the lack of an "Evidence vs. Inference" framework. LLMs treat all outputs with the same level of authority.

AlgoKiller-Plugin changes the paradigm:

Don’t let the LLM look at raw traces; let it look through tools. Don’t let it

just give conclusions; force it to leave a verifiable reasoning trail.

What it is (and what it isn't)

Ideal Use Cases: You have a GB-sized ARM64 instruction trace (generated via

GumTrace) and need to answer:

- Which algorithm produced this 32-byte ciphertext? Where did the key

originate?

- How is the X-Sign / token / device fingerprint constructed in the trace?

- What does this VMP handler do? What is the shape of the opcode table?

- Trace a buffer from allocation to free: who wrote to it?

What it isn't:

- It is an AI Assistant, not a standalone static analysis platform like

IDA/Binary Ninja.

- It doesn't "one-click unpack" or "auto-recover" everything. It lays out the

evidence and categorizes inferences. The analyst makes the final call.

Architecture: Three Layers of Governance

1. Bottom Layer: ak_search (C Engine). A high-performance engine using mmap, Boyer-Moore-Horspool (BMH)

string matching, and inverted line indexing. It handles GB-scale traces with

millisecond latency.

2. Middle Layer: MCP Server (25 Tools). A Model Context Protocol (MCP) server

wrapping ak_search into JSON-RPC tools. Compatible with Claude 3.5 Sonnet,

Cursor, and Codex.

3. Top Layer: Hypothesis Ledger (The Anti-Hallucination Brain). This is the

core. It forces the LLM to explicitly record hypotheses, evidence,

confidence levels, and falsification conditions. High-confidence conclusions

in final reports must include [H<n>] back-references to the ledger.
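For orientation, an MCP tool invocation is a JSON-RPC 2.0 request whose method is `tools/call`. Here is a minimal sketch of building one in Python. The tool name `regflow` comes from this post, but the argument names (`trace_path`, `register`, `line`, `steps`) are hypothetical and not taken from the plugin's actual schema.

```python
import json

def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Argument names below are illustrative assumptions, not the plugin's schema.
request = build_tool_call(
    1,
    "regflow",
    {"trace_path": "/path/to/trace.log", "register": "x8", "line": 120000, "steps": 16},
)
wire = json.dumps(request)  # what would travel over the MCP transport
```

In practice the MCP client builds this message for you; the sketch only shows what the LLM's "look through tools" step amounts to on the wire.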

Featured Tools

1. regflow — Register Evolution Tracking

The most frequent question in RE: "Where did this value come from?" regflow

performs a backtrace of a register's assignment chain over N steps. It turns

"guessing" into "provenance."
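To make the idea concrete, here is a rough Python sketch of a register backtrace. It is not the plugin's implementation. It assumes a simplified, hypothetical trace line format (`addr: mnemonic dst, src...`), treats the first operand as the destination (wrong for stores and compares), ignores w/x aliasing, and follows only the first source register.

```python
import re

REG = re.compile(r"\b([xw]\d{1,2}|sp)\b")

def parse(line):
    """Split 'addr: mnemonic dst, src...' into (dst, [source registers])."""
    _, _, body = line.partition(": ")
    _, _, operands = body.partition(" ")
    parts = [p.strip() for p in operands.split(",")]
    dst = parts[0] if parts and REG.fullmatch(parts[0]) else None
    srcs = REG.findall(",".join(parts[1:]))
    return dst, srcs

def regflow(trace, start, reg, max_steps):
    """Walk backwards from line `start`, following writes to `reg` for up to max_steps hops."""
    chain = []
    i = start
    while len(chain) < max_steps and reg is not None:
        # Find the most recent earlier line that wrote the register.
        i -= 1
        while i >= 0 and parse(trace[i])[0] != reg:
            i -= 1
        if i < 0:
            break
        _, srcs = parse(trace[i])
        chain.append((i, trace[i]))
        reg = srcs[0] if srcs else None  # follow only the first source, for brevity
    return chain

trace = [
    "0x1000: mov x1, #0x10",
    "0x1004: add x2, x1, #4",
    "0x1008: mov x9, x3",
    "0x100c: eor x0, x2, x5",
]
chain = regflow(trace, len(trace), "x0", 8)
```

Each entry in `chain` is a (line index, trace line) pair, so every step of the provenance claim can be checked against the trace itself.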

2. fold — Block-Aware Trace Compression

An instruction trace records every loop iteration separately, so loops appear massively unrolled. fold identifies repeating Basic

Block (BB) sequences and collapses them (e.g., [BB @ 0x1234 × 47 times]).

Real-world test: Compressed a 115 MB trace to 1.1 MB with zero information loss,

making it possible to fit complex logic into the LLM's context window.
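The folding idea can be sketched in a few lines of Python. This is an illustration, not the plugin's algorithm: it assumes the trace has already been split into basic blocks identified by start address, and `max_period` (the longest repeating sequence considered) is a made-up parameter.

```python
def fold(blocks, max_period=4):
    """Collapse consecutive repeats of short block sequences into (sequence, count) pairs."""
    out = []
    i = 0
    while i < len(blocks):
        best_period, best_count = 1, 1
        for period in range(1, max_period + 1):
            unit = blocks[i:i + period]
            if len(unit) < period:
                break
            count = 1
            while blocks[i + count * period:i + (count + 1) * period] == unit:
                count += 1
            # Prefer the candidate that covers the most trace entries.
            if count > 1 and count * period > best_count * best_period:
                best_period, best_count = period, count
        out.append((tuple(blocks[i:i + best_period]), best_count))
        i += best_period * best_count
    return out

def unfold(folded):
    """Expand the folded form back to the original sequence (lossless round trip)."""
    return [b for unit, count in folded for _ in range(count) for b in unit]

blocks = [0x1000] + [0x1234, 0x1250] * 3 + [0x2000]
folded = fold(blocks)
```

Because `unfold(fold(blocks))` reproduces the input, the compressed form drops no information in this sketch; it only changes how repeats are written down.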

3. constscan — 95 Crypto Fingerprints + Verdict Gradings

Scans for MD5 T-tables, SHA-2 K-tables, AES S-boxes, ECC parameters, etc. The

magic is in the grading:

- real: Constant found + actual hash/cipher instruction context.

- weak: Constant found but context is suspicious (e.g., a simple memory copy).

- alu_only: Byte match found, but surrounded by generic ALU ops (likely a false positive).
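The three verdicts could be assigned from the instructions executed around a constant hit. The sketch below is only a guess at the shape of such a rule: both mnemonic sets and the `grade` function are illustrative assumptions, not the plugin's real criteria.

```python
# Both sets are illustrative assumptions, not the plugin's real rules.
CRYPTO_CONTEXT = {"ror", "sha256h", "sha256h2", "sha256su0", "sha256su1", "aese", "aesmc"}
COPY_ONLY = {"ldr", "ldp", "str", "stp", "ldrb", "strb"}

def grade(window):
    """Grade a constant hit by the mnemonics executed around it."""
    ops = {m.lower() for m in window}
    if ops & CRYPTO_CONTEXT:
        return "real"       # constant plus hash/cipher-looking context
    if ops and ops <= COPY_ONLY:
        return "weak"       # constant is only being loaded and stored
    return "alu_only"       # byte match amid generic arithmetic

verdicts = [
    grade(["ldr", "ror", "eor", "add"]),
    grade(["ldp", "stp"]),
    grade(["add", "and", "lsl"]),
]
```

The point of grading is that the LLM receives a verdict it can weigh, rather than a bare "constant found" that it might over-interpret.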

4. cryptoinstr — ARM Crypto Extensions

Detection of hardware-accelerated instructions: AESE, SHA256H, SM4E, PMULL, etc.

In modern iOS/Android apps, these are "Hard Evidence"—far more reliable than

constant scanning.
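As a rough illustration, detecting these instructions can be as simple as matching mnemonics in trace lines. The mnemonics below are real ARMv8 Cryptographic Extension instructions, but the trace line format and the `scan_crypto` helper are assumptions made for this sketch, not the plugin's code.

```python
import re
from collections import Counter

CRYPTO_MNEMONICS = (
    "aese", "aesd", "aesmc", "aesimc",
    "sha1c", "sha1h", "sha1m", "sha1p", "sha1su0", "sha1su1",
    "sha256h", "sha256h2", "sha256su0", "sha256su1",
    "sm4e", "sm4ekey", "pmull", "pmull2",
)
# Word boundaries keep 'aese' from matching inside 'aesemc'-style longer tokens.
PATTERN = re.compile(
    r"\b(" + "|".join(sorted(CRYPTO_MNEMONICS, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)

def scan_crypto(lines):
    """Count crypto-extension mnemonics seen in an iterable of trace lines."""
    hits = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            hits[match.group(1).lower()] += 1
    return hits

lines = [
    "0x1000: aese v0.16b, v1.16b",
    "0x1004: aesmc v0.16b, v0.16b",
    "0x1008: add x0, x1, x2",
    "0x100c: sha256h2 q0, q1, v2.4s",
    "0x1010: AESE v0.16b, v3.16b",
]
hits = scan_crypto(lines)
```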

The "Hypothesis Ledger" Mechanism

LLM reasoning is iterative: Evidence A -> Hypothesis X; Evidence B -> Strengthen

X; Evidence C (contradiction) -> Abandon X.

Without a ledger, LLMs "distill" this process into a single final (and often

confused) string. The Ledger forces the LLM to treat its own reasoning as a

verifiable state machine.
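A minimal sketch of that state machine follows. The class, field names, and state names (`open`, `supported`, `abandoned`) are hypothetical; the real ledger schema may differ.

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    claim: str
    falsified_by: str                 # the condition that would disprove the claim
    status: str = "open"              # open -> supported, or -> abandoned
    evidence: list = field(default_factory=list)

    def add_evidence(self, source_tool, excerpt, supports):
        """Record one piece of evidence and move the hypothesis to its next state."""
        if self.status == "abandoned":
            raise ValueError("hypothesis was already abandoned")
        self.evidence.append((source_tool, excerpt, supports))
        self.status = "supported" if supports else "abandoned"

h = Hypothesis(
    claim="The 32-byte output is a SHA-256 digest",
    falsified_by="No SHA-256 round constants or SHA256H instructions in the trace",
)
h.add_evidence("constscan", "SHA-2 K-table hit", supports=True)        # Evidence A
h.add_evidence("cryptoinstr", "sha256h q0, q1, v2.4s", supports=True)  # Evidence B
history = [h.status]
h.add_evidence("regflow", "output bytes come from a memcpy", supports=False)  # Evidence C
history.append(h.status)
```

Every transition leaves a record behind, so the path from Evidence A to an abandoned hypothesis stays inspectable instead of being flattened into one final answer.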

The Seven Gates of Verification

To "conclude" a hypothesis as High-Confidence, the server enforces 7 hard

checks:

1. Verbatim match of cited trace lines in the source file.

2. Evidence excerpts must exist in previous tool outputs.

3. At least 2 independent tool sources (No single-point-of-failure).

4. Creation timestamp must precede the conclusion (No "hindsight" forging).

5. Falsification conditions must be defined.

6. Must pass an independent "Red-Team Sub-Agent" review.

7. Final report back-references must link to an existing, concluded ledger

entry.
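Several of these gates are plain data checks. The sketch below covers gates 2 through 5 only; gates 1, 6, and 7 need the trace file, the reviewer sub-agent, and the final report, so they are left out. The entry layout (`evidence`, `tool`, `excerpt`, `created_at`, `falsification`) is an assumption for illustration.

```python
def failed_gates(entry, tool_outputs, concluded_at):
    """Return the numbers of the gates (2-5 only) that this ledger entry fails."""
    failures = []
    # Gate 2: every evidence excerpt must appear in an earlier tool output.
    if not all(any(ev["excerpt"] in out for out in tool_outputs) for ev in entry["evidence"]):
        failures.append(2)
    # Gate 3: at least two independent tool sources.
    if len({ev["tool"] for ev in entry["evidence"]}) < 2:
        failures.append(3)
    # Gate 4: the hypothesis must have been created before it is concluded.
    if not entry["created_at"] < concluded_at:
        failures.append(4)
    # Gate 5: a falsification condition must be defined.
    if not entry.get("falsification", "").strip():
        failures.append(5)
    return failures

outputs = ["0x1000: aese v0.16b, v1.16b", "hit: AES S-box at 0x7f00"]
good = {
    "evidence": [
        {"tool": "cryptoinstr", "excerpt": "aese v0.16b"},
        {"tool": "constscan", "excerpt": "AES S-box"},
    ],
    "created_at": 100,
    "falsification": "No AESE executes between key load and output write",
}
bad = {
    "evidence": [{"tool": "constscan", "excerpt": "SHA-256 K-table"}],
    "created_at": 300,
    "falsification": "",
}
```

Here `failed_gates(good, outputs, 200)` returns an empty list, while `bad` fails all four checked gates, so the server would refuse to mark it High-Confidence.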

The Red-Team Sub-Agent

Before closing a case, the primary agent must spawn a Reviewer Sub-Agent with a

clean context. It is given only the hypothesis and the evidence list. It

independently re-runs tools to verify the claims. If the reviewer fails it, the

primary agent cannot claim "High Confidence."

Special Focus: VMP Recovery (v0.9.5)

VMP (virtual-machine-based protection) is the "final boss" of RE. The plugin enforces a 4-phase methodology with

objective criteria:

- A. Identification: High-frequency dispatcher loops + jump table detection +

VM context structure.

- B. Opcode Schema: Frequency statistics for 100+ opcodes, requiring

independent hits and zero "ghost opcodes."

- C. Handler Translation: Every handler undergoes round-trip emulation

verification + Red-Team review.

- D. Business-Level Closure: Bit-for-bit consistency check of inputs and

outputs.
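Phase B can be pictured as a frequency report over dispatched opcodes. The post does not define "ghost opcode", so this sketch assumes it means a schema entry that never appears in the observed dispatch stream; `min_hits` and the opcode names are likewise made up for illustration.

```python
from collections import Counter

def opcode_report(dispatched, schema, min_hits=2):
    """Compare observed opcode dispatches against a recovered opcode schema."""
    freq = Counter(dispatched)
    return {
        "freq": freq,
        # Assumed meaning of "ghost": in the schema, never seen in the trace.
        "ghosts": sorted(op for op in schema if freq[op] == 0),
        # Seen, but fewer times than the required number of independent hits.
        "under_sampled": sorted(op for op in schema if 0 < freq[op] < min_hits),
        # Dispatched in the trace but missing from the schema.
        "unknown": sorted(op for op in freq if op not in schema),
    }

dispatched = [0x01, 0x02, 0x01, 0x03, 0x01, 0x02, 0x7F]
schema = {0x01: "push", 0x02: "add", 0x03: "xor", 0x04: "jmp"}
report = opcode_report(dispatched, schema)
```

Under these assumptions the schema above would not pass: 0x04 is a ghost, 0x03 has a single hit, and 0x7F is dispatched but unexplained.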

FAQ

Q: Why not just use frida-trace?

A: frida-trace works at the function level, so it lacks the instruction-level fidelity this analysis needs. Furthermore, frida-trace throughput is roughly 1k-10k instructions per second (IPS); GumTrace runs at hundreds of thousands of IPS via Stalker.

Q: Does it handle large traces?

A: Yes. The ak_search daemon uses mmap and builds line indexes. Initializing a GB-sized file takes 1-2 seconds; subsequent queries are sub-millisecond via IPC.
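The mmap-plus-line-index approach can be sketched in Python. This is a stand-in for the idea, not the C engine: `bytes.find`-style search replaces BMH, the helper names are invented, and no timing claims are implied.

```python
import mmap
import os
import tempfile

def build_line_index(mm):
    """Record the byte offset at which each line starts."""
    offsets = [0]
    pos = mm.find(b"\n")
    while pos != -1:
        offsets.append(pos + 1)
        pos = mm.find(b"\n", pos + 1)
    if offsets[-1] >= len(mm):
        offsets.pop()  # drop the phantom line after a trailing newline
    return offsets

def get_line(mm, offsets, n):
    """Fetch line n (0-based) by slicing the mapping, without scanning the file."""
    start = offsets[n]
    end = offsets[n + 1] if n + 1 < len(offsets) else len(mm)
    return mm[start:end].rstrip(b"\n").decode()

# Tiny stand-in trace; the line format is an assumption.
with tempfile.NamedTemporaryFile("wb", suffix=".log", delete=False) as f:
    f.write(b"0x1000: mov x0, x1\n0x1004: add x0, x0, #1\n0x1008: ret\n")
    path = f.name

with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
    offsets = build_line_index(mm)   # one-time cost at startup
    line_count = len(offsets)
    second = get_line(mm, offsets, 1)
os.remove(path)
```

The index is built once; after that, fetching any line is a slice of the mapping, which is why a long-lived daemon can answer repeated queries quickly.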

Getting Started

You can use it via Claude Desktop, Cursor, or Codex.

Claude Code / Desktop:

claude plugin marketplace add icloudza/algokiller-plugin

claude plugin install algokiller@algokiller-suite

Prompt example: "Use algokiller ciphertext mode to analyze /path/to/trace.log

and recover the X-Sign ciphertext a3b2c1..."

Project Credits:

- AlgoKiller by @lidongyooo: The original methodology and ak_search core.

- GumTrace by @lidongyooo: The ARM64 instruction-level tracer.

GitHub: https://github.com/icloudza/algokiller-plugin

I’m looking for feedback on real-world samples, especially regarding iOS literal

pool detection and VMP handler recovery. Let’s discuss in the Issues!

Hello everyone! This project aims to leverage AI’s powerful analysis capabilities to help with high-efficiency algorithm analysis and related scenarios. We hope the community can collaborate and build this project together.
