The Complete Guide to Reverse Engineering & Open-Source Replication in the AI Era
A Universal Methodology Using Open Computer Use as a Case Study

A hands-on summary from replicating Codex Computer Use as open source. The methodology applies to all similar "analyze → reverse-engineer → replicate → open-source" projects.


Table of Contents

  1. Project Background & Problem Definition
  2. Full Technology Stack Overview
  3. Phase 1: Reconnaissance & Information Gathering
  4. Phase 2: Static Analysis & Reverse Engineering
  5. Phase 3: Dynamic Analysis & Traffic Interception
  6. Phase 4: Breaking Process Signature Restrictions
  7. Phase 5: Core Feature Implementation
  8. Phase 6: Comparative Validation & Eval Feedback Loop
  9. Phase 7: Productization & Release
  10. Phase 8: Conquering Visual Details (Mouse Animation Reverse Engineering)
  11. Universal Methodology Summary
  12. Complete SOP for Future Similar Projects

1. Project Background & Problem Definition

1.1 What Is Computer Use

Computer Use is a technology that enables AI Agents to autonomously control a computer's interface (mouse, keyboard, applications) to complete tasks. There are traditionally two implementation approaches:

  • Connectors mode: Directly calls application APIs such as Gmail and Slack — no UI manipulation
  • GUI mode: Simulates mouse and keyboard events to directly operate on screen

OpenAI Codex's Innovation: Background Computer Use, a non-preemptive operation model. Previous solutions had to "occupy" the screen, preventing the user from using the computer at the same time; Codex's approach lets the AI operate in the background without disturbing the user's foreground work.

1.2 Breaking Down the Objective

Before starting, the team clearly defined what needed to be replicated:

Original Goal: Replicate the "non-preemptive" core capability of Codex Computer Use
Broken down into:
  ├── Functional Goals (must complete)
  │   ├── Background interaction with UI
  │   ├── Screenshot capture for multimodal inference
  │   └── Expose MCP service externally (9 tools)
  └── Experience Goals (bonus)
      ├── Smooth mouse animation effects
      ├── Permission request floating window
      └── One-click install/publish

Key Decision: Features first, experience second — advance in phases, don't let visual details block the main flow.


2. Full Technology Stack Overview

2.1 Analysis Toolchain

| Tool | Purpose | Principle |
| --- | --- | --- |
| file / strings | Binary basics | Reads ELF/Mach-O headers and printable strings |
| class-dump / nm | Swift/ObjC symbol export | Extracts class names and method names from the binary symbol table |
| Hopper Disassembler | Assembly-level reverse engineering | Decompiles machine code into pseudocode |
| mitmdump | HTTPS man-in-the-middle traffic capture | Proxy interception, TLS traffic decryption |
| mitmproxy | Traffic capture visualization | Interactive console UI for inspecting flows captured by the mitmdump engine |
| Codex AI | Assists binary analysis | Multimodal understanding of decompilation results |

2.2 Development Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| Core implementation | Swift | macOS native, accesses the AX API |
| Service wrapper | MCP (Model Context Protocol) | Provides standardized tool interfaces externally |
| Signature bypass | Go | CLI that borrows Codex.app's signature |
| Package management & release | npm / Node.js | `npm i -g open-computer-use` |
| Animation algorithm | Swift + Bézier curves | Mouse movement path calculation |
| Multimedia processing | ffmpeg / ImageMagick | Video frame extraction, image processing |

2.3 AI Auxiliary Tool Usage

Codex (Primary)
  ├── Analyze binary files
  ├── Open multiple parallel sessions for division of labor
  ├── Traffic capture configuration and execution
  └── Code generation and validation

Grok
  └── Mining technical clues from Twitter/X

Claude / GPT-4o
  └── Multimodal analysis (screenshots, video frame understanding)

3. Phase 1: Reconnaissance & Information Gathering

3.1 Goal: Find an Analyzable Entry Point

Before reverse-engineering any closed system, the very first step is always finding the physical files.

Steps

# 1. Find the location of Codex App's Computer Use plugin
ls -la ~/.codex/plugins/cache/openai-bundled/computer-use/

# 2. View directory structure
find ~/.codex/plugins/cache/openai-bundled/computer-use/ -type f | head -50

# 3. Check file size to estimate analysis workload
du -sh ~/.codex/plugins/cache/openai-bundled/computer-use/1.0.750/"Codex Computer Use.app"
# → 26.5MB, manageable workload

# 4. View .app internal structure (macOS app bundle is essentially a directory)
ls -la "Codex Computer Use.app/Contents/"
# → MacOS/ (executables), Frameworks/, Resources/

Key Findings

  • The entire feature is a standalone macOS App Bundle, 26.5MB
  • The core executable is SkyComputerUseClient
  • It exposes services externally via the MCP protocol (meaning the interface is standardized)

Why This Step Matters

Finding the physical files = finding the "battlefield." Without files, all subsequent analysis is empty talk. File size determines analysis cost; file type determines analysis method.

3.2 Gathering Public Information

Before diving into analysis, collect all the available "free intelligence":

Public information sources:
├── OpenAI official blog (feature intros, screenshots)
├── Twitter/X posts from Software.inc team members
├── App Store / release notes
├── MCP protocol specification docs (modelcontextprotocol.io)
└── Related open-source projects on GitHub

Actual operation: Ask Grok to search tweets from Software.inc and Ari, then infer technical keywords from the comments.


4. Phase 2: Static Analysis & Reverse Engineering

4.1 Extracting the Symbol Table (Swift Binary)

A Swift-compiled binary contains a wealth of symbol information — you can get class structure without ever executing the program.

# Extract all symbols (function names, class names, protocols)
nm "Codex Computer Use.app/Contents/MacOS/SkyComputerUseClient" | grep -i "computer\|screen\|mouse\|cursor"

# Use class-dump to extract ObjC/Swift class definitions
class-dump "Codex Computer Use.app/Contents/MacOS/SkyComputerUseClient" > symbols.txt

# strings extracts readable strings (discovers API endpoints, error messages, etc.)
strings "Codex Computer Use.app/Contents/MacOS/SkyComputerUseClient" | grep -E "mcp|tool|accessibility|screen"

Key Findings

Key clues extracted from the symbol table:

- AXUIElement (macOS Accessibility API related)
- CGScreenCapture (screenshot API)
- SkyComputerUseClient (main class name)
- MCP Tool related: 9 tool names in total
- osascript (AppleScript fallback)
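This keyword triage can be scripted. A minimal Python sketch (the bucket names and keyword lists are my own, chosen to mirror the clues above, not part of the original toolchain) that groups `nm`/`strings` output lines by keyword:

```python
# Keyword buckets mirroring the clues above (AX API, screen capture, MCP, AppleScript).
# Bucket names and keywords are illustrative choices, not from the original binary.
BUCKETS = {
    "accessibility": ("axuielement", "accessibility"),
    "screen_capture": ("cgscreencapture", "screenshot", "screencapture"),
    "mcp": ("mcp", "tool"),
    "applescript": ("osascript", "applescript"),
}

def triage(lines):
    """Group symbol/string dump lines into buckets by case-insensitive keyword match."""
    hits = {name: [] for name in BUCKETS}
    for line in lines:
        lowered = line.lower()
        for name, keywords in BUCKETS.items():
            if any(kw in lowered for kw in keywords):
                hits[name].append(line.strip())
                break  # first matching bucket wins
    return hits

sample = [
    "0000000100001234 T _AXUIElementPerformAction",
    "_CGScreenCaptureStream",
    "tools/list",
    "/usr/bin/osascript",
]
for bucket, lines in triage(sample).items():
    print(f"{bucket}: {len(lines)}")
```

Feeding it the full `nm`/`strings` dumps gives a rough per-subsystem hit count before any disassembly work starts.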

4.2 AI-Assisted Analysis of Decompilation Results

Directly feed decompiled screenshots or text to Codex/Claude:

Prompt template:
"This is a code snippet decompiled from [binary name]. Please analyze:
1. What is the core functionality of this code?
2. What are the key data structures and interface definitions?
3. How does it interact with [target system]?
Please organize the analysis into a document."

Discovered Architecture:

SkyComputerUseClient
├── MCP Server Layer (external: 9 standardized tools)
│   ├── tool_1: screenshot
│   ├── tool_2: click
│   ├── tool_3: type_text
│   ├── tool_4: scroll
│   ├── tool_5: find_element
│   ├── tool_6: get_ui_tree
│   ├── tool_7: run_applescript
│   ├── tool_8: key_press
│   └── tool_9: wait
└── Core Interaction Layer (underlying: three control methods)
    ├── AX API (preferred, background control)
    ├── osascript (fallback when AX fails)
    └── CGEvent (mouse events, last resort)

4.3 Core Principle: macOS Accessibility API

This is the technical foundation of the entire "non-preemptive" approach — it must be thoroughly understood.

What Is the Accessibility API (AX API)?

To support visually impaired users, macOS provides an interface called AXUIElement that allows programmatic reading and manipulation of all UI elements.

// Core capabilities of the AX API
import ApplicationServices

// 1. Get the AX tree (UI element tree) of an application
let appRef = AXUIElementCreateApplication(pid)

// 2. Find a specific UI element (by title, role, etc.)
var value: CFTypeRef?
AXUIElementCopyAttributeValue(appRef, kAXWindowsAttribute as CFString, &value)

// 3. Click a button in the background (no need for the window to be in the foreground!)
AXUIElementPerformAction(buttonRef, kAXPressAction as CFString)

// 4. Type text in the background
AXUIElementSetAttributeValue(fieldRef, kAXValueAttribute as CFString, "Hello" as CFTypeRef)

Key Feature: All AX API operations do not require the target window to be in the foreground — this is the fundamental reason for "non-preemptive" operation.

Three-Level Fallback Strategy: AX → osascript → CGEvent

func performClick(element: AXUIElement?, at point: CGPoint) {
    // First priority: AX API (most precise, works in background)
    if let element = element {
        let result = AXUIElementPerformAction(element, kAXPressAction as CFString)
        if result == .success { return }
    }
    
    // Second priority: AppleScript (good compatibility)
    let script = "tell application \"Safari\" to click button 1 of window 1"
    NSAppleScript(source: script)?.executeAndReturnError(nil)
    
    // Last resort: CGEvent mouse simulation (will occupy the screen);
    // a full click is a mouse-down followed by a mouse-up
    let down = CGEvent(mouseEventSource: nil, mouseType: .leftMouseDown,
                       mouseCursorPosition: point, mouseButton: .left)
    down?.post(tap: .cghidEventTap)
    let up = CGEvent(mouseEventSource: nil, mouseType: .leftMouseUp,
                     mouseCursorPosition: point, mouseButton: .left)
    up?.post(tap: .cghidEventTap)
}

5. Phase 3: Dynamic Analysis & Traffic Interception

5.1 Why Traffic Capture Is Necessary

Static analysis tells you "what exists," but the precise parameter definitions of tools (JSON Schema) can only be obtained from actual calls. Manually rewriting parameter definitions makes it nearly impossible to achieve 100% strict alignment.

Goal: Directly capture from Codex's actual network requests:

  • Complete system prompt
  • Precise JSON Schema definitions for all 9 tools
  • Actual request/response formats
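Once the official definitions are captured, "100% strict alignment" can be checked mechanically instead of by eye. A hedged Python sketch (the recursive diff and its message format are my own, not part of the original tooling):

```python
def schema_diff(official, replica, path=""):
    """Recursively compare two JSON Schema fragments; return a list of mismatch messages."""
    if type(official) is not type(replica):
        return [f"{path}: type {type(official).__name__} != {type(replica).__name__}"]
    diffs = []
    if isinstance(official, dict):
        for key in sorted(set(official) | set(replica)):
            if key not in replica:
                diffs.append(f"{path}.{key}: missing in replica")
            elif key not in official:
                diffs.append(f"{path}.{key}: extra in replica")
            else:
                diffs.extend(schema_diff(official[key], replica[key], f"{path}.{key}"))
    elif official != replica:
        diffs.append(f"{path}: {official!r} != {replica!r}")
    return diffs
```

Running this over each captured tool schema and its open-source counterpart turns alignment into a checkable property rather than a manual review.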

5.2 Configuring mitmdump for HTTPS Man-in-the-Middle Capture

Installation & Basic Configuration

# Install mitmproxy (includes mitmdump)
pip install mitmproxy
# or
brew install mitmproxy

# Start mitmdump, listening on port 8080
mitmdump -p 8080 --save-stream-file capture.mitm

# View real-time traffic (visual)
mitmweb -p 8080

Install mitmproxy Root Certificate (Trust HTTPS Decryption)

# 1. After starting mitmproxy, visit mitm.it in your browser
# 2. Download the certificate for your platform
# 3. Trust the certificate in macOS Keychain

# Or install via command line
security add-trusted-cert -d -r trustRoot -k ~/Library/Keychains/login.keychain ~/.mitmproxy/mitmproxy-ca-cert.pem

Configure Codex to Route Through the Proxy

# Method 1: System proxy (recommended)
# System Preferences → Network → Advanced → Proxies → HTTP/HTTPS Proxy → 127.0.0.1:8080

# Method 2: Environment variables
export HTTP_PROXY=http://127.0.0.1:8080
export HTTPS_PROXY=http://127.0.0.1:8080

# Method 3: Let Codex configure it itself (recursive)
# Simply tell Codex: "Please configure mitmdump and start traffic capture"

5.3 Recursive Capture: Let Codex Call Itself While Being Captured

This is the most elegant operation of the entire process:

1. User tells Codex: "Call your computer-use plugin to execute a screenshot task, while capturing traffic with mitmdump"
2. Codex configures the proxy environment
3. Codex calls its own computer-use MCP tool
4. mitmdump captures all HTTPS requests
5. User obtains the complete tools definition and system prompt

Captured Data Structure (Example)

{
  "model": "gpt-4o",
  "tools": [
    {
      "name": "screenshot",
      "description": "Capture a screenshot of the current screen state...",
      "input_schema": {
        "type": "object",
        "properties": {
          "app_name": {
            "type": "string",
            "description": "Target application name"
          }
        }
      }
    }
    // ... precise definitions for the remaining 8 tools
  ],
  "system": "You are a computer use agent capable of..."
}

5.4 Writing a mitmdump Filter Script

# filter.py - only capture computer-use related requests
from mitmproxy import http
import json

def request(flow: http.HTTPFlow) -> None:
    if "computer-use" in flow.request.pretty_url or \
       "api.openai.com" in flow.request.pretty_url:
        # Save request body (skip non-JSON payloads such as binary uploads)
        if flow.request.content:
            try:
                data = json.loads(flow.request.content)
            except (json.JSONDecodeError, UnicodeDecodeError):
                return
            with open("captured_tools.json", "w") as f:
                json.dump(data, f, indent=2)
            print(f"[+] Captured request to {flow.request.pretty_url}")

# Run: mitmdump -s filter.py -p 8080

6. Phase 4: Breaking Process Signature Restrictions

6.1 Problem Discovery

# Attempt to connect to the official plugin directly using an MCP Client
npx @modelcontextprotocol/inspector "Codex Computer Use.app/Contents/MacOS/SkyComputerUseClient"

# Result: process crashes immediately
# Error: Process terminated with signal 9 (SIGKILL)

6.2 Diagnosing the Cause

# View process crash logs
log show --predicate 'process == "SkyComputerUseClient"' --last 5m

# Analyze crash report
cat ~/Library/Logs/DiagnosticReports/SkyComputerUseClient*.crash | grep -A 20 "Exception"

# Key finding: the process uses SecCodeCopyGuestWithAttributes to verify the parent process signature
# Only a parent process signed by Codex.app can launch it

6.3 Solution: Signature Inheritance Proxy

Approach: Write a Go program that runs within Codex.app's process context, thereby inheriting the correct signature.

// launcher.go
package main

import (
    "fmt"
    "os"
    "os/exec"
    "path/filepath"
)

func main() {
    // Find the path to SkyComputerUseClient
    pluginPath := filepath.Join(
        os.Getenv("HOME"),
        ".codex/plugins/cache/openai-bundled/computer-use/1.0.750",
        "Codex Computer Use.app/Contents/MacOS/SkyComputerUseClient",
    )
    
    // Launch process, inheriting the signature context of the current process
    cmd := exec.Command(pluginPath)
    cmd.Stdin = os.Stdin
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    
    if err := cmd.Run(); err != nil {
        fmt.Fprintf(os.Stderr, "Failed to launch: %v\n", err)
        os.Exit(1)
    }
}

# Compile
go build -o codex-launcher launcher.go

# Key: use codesign to let the launcher borrow Codex.app's signature
# (or call from within the Codex.app process to directly inherit)

# Actual operation: have Codex itself execute this launcher
# Since Codex itself is a validly signed process, child processes forked from it inherit the signature

6.4 Verification of Success

# Successfully call the official MCP via CLI
echo '{"jsonrpc":"2.0","method":"tools/list","id":1}' | ./codex-launcher

# Expected output: the complete list of 9 tools
{
  "result": {
    "tools": [
      {"name": "screenshot", ...},
      // ...
    ]
  }
}
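The echo pipeline above is plain JSON-RPC 2.0 over stdio with newline-delimited messages. A minimal Python sketch of the same exchange (the launcher path argument and the 10-second timeout are assumptions of mine):

```python
import json
import subprocess

def jsonrpc_request(method, request_id, params=None):
    """Build a JSON-RPC 2.0 request line, as sent to an MCP server over stdio."""
    msg = {"jsonrpc": "2.0", "method": method, "id": request_id}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)

def list_tools(launcher_path):
    """Send tools/list to an MCP server binary and parse the first response line."""
    proc = subprocess.run(
        [launcher_path],
        input=jsonrpc_request("tools/list", 1) + "\n",
        capture_output=True,
        text=True,
        timeout=10,
    )
    return json.loads(proc.stdout.splitlines()[0])

# Equivalent request line to the echo above:
print(jsonrpc_request("tools/list", 1))
# → {"jsonrpc": "2.0", "method": "tools/list", "id": 1}
```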

7. Phase 5: Core Feature Implementation

7.1 Project Structure (Starting from harness-template)

open-computer-use/
├── Sources/
│   └── ComputerUse/
│       ├── main.swift              # Entry point
│       ├── MCPServer.swift         # MCP protocol layer
│       ├── AccessibilityEngine.swift # AX API core
│       ├── ScreenCapture.swift     # Screenshot module
│       ├── AppleScriptFallback.swift # Fallback solution
│       └── Tools/                  # 9 MCP tool implementations
│           ├── ScreenshotTool.swift
│           ├── ClickTool.swift
│           ├── TypeTextTool.swift
│           └── ...
├── Package.swift
├── install.sh                      # One-click install script
└── docs/                           # Analysis docs continuously output by AI (LLM Wiki)
    ├── architecture.md
    ├── ax-api-reference.md
    └── mcp-tools-spec.md

7.2 MCP Server Implementation

MCP (Model Context Protocol) is a standardized tool interface protocol proposed by Anthropic. AI models use MCP to call external tools.

// MCPServer.swift
import Foundation

struct MCPServer {
    // Tool registry
    let tools: [String: MCPTool] = [
        "screenshot": ScreenshotTool(),
        "click": ClickTool(),
        "type_text": TypeTextTool(),
        "scroll": ScrollTool(),
        "find_element": FindElementTool(),
        "get_ui_tree": GetUITreeTool(),
        "run_applescript": RunAppleScriptTool(),
        "key_press": KeyPressTool(),
        "wait": WaitTool(),
    ]
    
    // Handle JSON-RPC requests
    func handle(request: JSONRPCRequest) -> JSONRPCResponse {
        switch request.method {
        case "tools/list":
            return listTools()
        case "tools/call":
            return callTool(request)
        default:
            return errorResponse("Unknown method")
        }
    }
    
    // Communicate with MCP Client via stdio
    func run() {
        while let line = readLine() {
            guard let data = line.data(using: .utf8),
                  let request = try? JSONDecoder().decode(JSONRPCRequest.self, from: data) else {
                continue
            }
            let response = handle(request: request)
            let responseJSON = try! JSONEncoder().encode(response)
            print(String(data: responseJSON, encoding: .utf8)!)
        }
    }
}

7.3 Core Tool Implementation Examples

// ScreenshotTool.swift
import AppKit

struct ScreenshotTool: MCPTool {
    var name = "screenshot"
    var description = "Capture a screenshot of the current screen or a specific application window"
    var inputSchema: JSONSchema = [
        "type": "object",
        "properties": [
            "app_name": ["type": "string", "description": "Target app name (optional)"]
        ]
    ]
    
    func execute(input: [String: Any]) async throws -> MCPToolResult {
        let appName = input["app_name"] as? String
        
        // Screenshot logic
        let screenshot: NSImage
        if let app = appName {
            screenshot = try captureApp(named: app)  // Capture specific app
        } else {
            screenshot = try captureScreen()  // Capture full screen
        }
        
        // Convert to PNG (matching the declared mimeType), then base64
        let tiff = screenshot.tiffRepresentation
        let png = tiff.flatMap { NSBitmapImageRep(data: $0)?.representation(using: .png, properties: [:]) }
        let base64 = png?.base64EncodedString() ?? ""
        
        return MCPToolResult(
            content: [["type": "image", "data": base64, "mimeType": "image/png"]]
        )
    }
}
// AccessibilityEngine.swift - Core background control implementation
import ApplicationServices

class AccessibilityEngine {
    // Find UI element
    func findElement(inApp pid: pid_t, matching query: ElementQuery) -> AXUIElement? {
        let appRef = AXUIElementCreateApplication(pid)
        return searchUITree(root: appRef, query: query)
    }
    
    // Recursively search UI tree
    private func searchUITree(root: AXUIElement, query: ElementQuery) -> AXUIElement? {
        var children: CFTypeRef?
        AXUIElementCopyAttributeValue(root, kAXChildrenAttribute as CFString, &children)
        
        guard let childArray = children as? [AXUIElement] else { return nil }
        
        for child in childArray {
            if matches(element: child, query: query) { return child }
            if let found = searchUITree(root: child, query: query) { return found }
        }
        return nil
    }
    
    // Background click (no need for window in foreground!)
    func click(element: AXUIElement) throws {
        let result = AXUIElementPerformAction(element, kAXPressAction as CFString)
        guard result == .success else {
            throw AccessibilityError.actionFailed(result)
        }
    }
    
    // Background text input
    func typeText(_ text: String, into element: AXUIElement) throws {
        let result = AXUIElementSetAttributeValue(
            element,
            kAXValueAttribute as CFString,
            text as CFTypeRef
        )
        guard result == .success else {
            throw AccessibilityError.setValueFailed(result)
        }
    }
}

8. Phase 6: Comparative Validation & Eval Feedback Loop

8.1 Designing the Comparative Validation Framework

With the ability to "call the official version," establish a rigorous comparative validation system:

#!/bin/bash
# Test script: run the same task on both official and open-source versions
TASK="Take a screenshot of the current page in Safari browser"

echo "=== Official Version ==="
echo "$TASK" | codex --use-plugin=computer-use 2>&1 | tee official_output.json

echo "=== Open Source Version ==="
echo "$TASK" | codex --mcp-server=open-computer-use 2>&1 | tee opensource_output.json

# Compare differences
diff official_output.json opensource_output.json

8.2 Multi-Dimensional Comparison Metrics

Comparison dimensions:
├── Functionality (P0)
│   ├── Tool call success rate
│   ├── Screenshot quality (resolution, completeness)
│   ├── UI element location accuracy
│   └── Text input accuracy
├── Performance (P1)
│   ├── Response latency
│   ├── Memory usage
│   └── CPU usage
└── Stability (P2)
    ├── Long-run stability
    └── Exception recovery capability
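The top P0 row, tool call success rate, reduces to simple arithmetic once each run logs per-call outcomes. A hedged sketch (the boolean-list log format is my assumption; the point is that each dimension becomes a number trackable across iterations):

```python
def success_rate(outcomes):
    """Fraction of successful tool calls; `outcomes` is a list of booleans."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

def compare_runs(official, replica):
    """Return (official_rate, replica_rate, gap) for one task's call outcomes."""
    official_rate = success_rate(official)
    replica_rate = success_rate(replica)
    return official_rate, replica_rate, official_rate - replica_rate

# Example: official succeeds 4/4, the replica 3/4
print(compare_runs([True] * 4, [True, True, True, False]))
# → (1.0, 0.75, 0.25)
```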

8.3 Dog Fooding: Using Your Own Product for Testing

In the final stage, use open-computer-use itself to develop open-computer-use. This is both a test and the most realistic form of validation:

Tasks executed using open-computer-use:
- Open Xcode and modify a file
- Run tests in Terminal
- Take screenshots to verify UI changes
- Submit a Git commit

9. Phase 7: Productization & Release

9.1 Permission Request UI (Referencing Software.inc's Approach)

macOS requires explicit requests for two permissions:

  • Accessibility (required for AX API)
  • Screen Recording (required for screenshots)

// PermissionWindow.swift - Draggable floating window
import AppKit

class PermissionFloatingWindow: NSPanel {
    init() {
        super.init(
            contentRect: NSRect(x: 0, y: 0, width: 320, height: 200),
            styleMask: [.titled, .closable, .miniaturizable, .utilityWindow],
            backing: .buffered,
            defer: false
        )
        
        // Float above all other windows
        self.level = .floating
        self.isMovableByWindowBackground = true  // Drag anywhere to move
        
        // Check permission status and guide user to enable
        setupPermissionChecks()
    }
    
    func setupPermissionChecks() {
        // Check Accessibility permission
        let axEnabled = AXIsProcessTrusted()
        
        // Check Screen Recording permission
        let screenEnabled = CGPreflightScreenCaptureAccess()
        
        // Update UI based on status
        updateUI(ax: axEnabled, screen: screenEnabled)
    }
    
    @objc func openAccessibilitySettings() {
        NSWorkspace.shared.open(URL(string: "x-apple.systempreferences:com.apple.preference.security?Privacy_Accessibility")!)
    }
}

9.2 Publishing to npm

// package.json
{
  "name": "open-computer-use",
  "version": "1.0.0",
  "description": "Open-source alternative to Codex Computer Use",
  "bin": {
    "open-computer-use": "./bin/open-computer-use.js"
  },
  "scripts": {
    "postinstall": "node scripts/install.js"
  }
}

// bin/open-computer-use.js
#!/usr/bin/env node
const { execSync } = require('child_process');
const path = require('path');

// Find the Swift-compiled binary
const binaryPath = path.join(__dirname, '../bin/ComputerUse');

// Start the MCP service directly
execSync(binaryPath, { stdio: 'inherit' });

// install.js - automatically compile Swift code during installation
const { execSync } = require('child_process');
execSync('swift build -c release', { cwd: __dirname });
# Publish
npm publish

# User installation
npm install -g open-computer-use

# Add to Codex MCP config (one-click command)
open-computer-use install-to-codex

9.3 One-Click Integration with Codex MCP

// Automatically modify ~/.codex/config.json
const configPath = path.join(os.homedir(), '.codex', 'config.json');
const config = JSON.parse(fs.readFileSync(configPath, 'utf8'));

config.mcpServers = config.mcpServers || {};
config.mcpServers['open-computer-use'] = {
    command: 'open-computer-use',
    args: []
};

fs.writeFileSync(configPath, JSON.stringify(config, null, 2));
console.log('✅ open-computer-use has been added to Codex MCP config');

9.4 Logo Design (Fully AI-Generated)

Prompt to Codex:
"Design a logo for a project called open-computer-use.
Theme: AI controlling a computer in the background, mouse cursor is the core element.
Style: clean, modern, geeky.
Format: SVG, provide multiple options to choose from."

1. AI outputs multiple SVGs
2. Convert formats using ffmpeg/ImageMagick
3. AI self-reviews (send screenshots back to AI for confirmation)
4. Select final design

10. Phase 8: Conquering Visual Details (Mouse Animation Reverse Engineering)

This is the most hardcore part, demonstrating how visual effects can also be reverse-engineered.

10.1 Video Frame Analysis

# Download the demo video from the Software.inc author
# Use ffmpeg to extract frames (30 frames per second)
ffmpeg -i demo_video.mp4 -vf fps=30 frames/frame_%04d.png

# Ask Codex to analyze key frames
# Find the mouse position change sequence, reverse-engineer the motion curve
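Once per-frame cursor coordinates have been read off the extracted frames, the motion curve falls out of finite differences. A Python sketch (the coordinate values below are illustrative, not from the actual video):

```python
import math

def velocity_profile(positions, fps=30):
    """Per-frame cursor speed (px/s) from a list of (x, y) positions,
    sampled at the same rate used for frame extraction (30 fps above)."""
    speeds = []
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        speeds.append(math.hypot(x1 - x0, y1 - y0) * fps)
    return speeds

print(velocity_profile([(0, 0), (3, 4), (6, 8)]))  # → [150.0, 150.0]
```

An ease-in-out motion shows up as a speed curve that ramps up, plateaus, then ramps down, which is what points toward the easing analysis below.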

10.2 Mining Clues & Reference Material

A key phrase extracted from Twitter comments: "calculates natural and aesthetic motion paths"

Ask AI to search relevant materials:
├── Bezier Curve mouse paths
├── Fitts' Law — human mouse movement patterns
├── Critically Damped Spring Animation
├── Related paper: "Natural Mouse Trajectory Simulation"
└── Open-source implementations: human-cursor, naturalmouser
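Fitts' Law from the list above predicts how long a human takes to move to a target: MT = a + b · log2(D/W + 1). A sketch with illustrative constants (the values of `a` and `b` are made up; real values are fitted per device and user):

```python
import math

def fitts_movement_time(distance, target_width, a=0.1, b=0.15):
    """Fitts' Law (Shannon formulation): MT = a + b * log2(D/W + 1), in seconds.
    `a` and `b` are illustrative placeholder constants, not fitted values."""
    index_of_difficulty = math.log2(distance / target_width + 1)
    return a + b * index_of_difficulty

# 300 px move to a 100 px target: index of difficulty = log2(4) = 2
print(fitts_movement_time(300, 100))
```

For mouse animation, this gives a principled total duration for a move, while the Bézier path decides its shape.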

10.3 Binary Reverse Engineering to Extract the Algorithm

When AI and papers aren't precise enough, reverse-engineer the binary directly:

# Use Hopper/IDA to decompile mouse animation-related functions
# Search for function name keywords
strings SkyComputerUseClient | grep -i "cursor\|animate\|bezier\|easing"

# Locate the core function in the decompiler
# Ask AI to analyze the assembly/pseudocode and reconstruct the algorithm
AI Prompt:
"This is a mouse animation function decompiled from SkyComputerUseClient.
Please analyze the algorithm principle and re-implement it in Swift:
[paste the decompiled pseudocode]"

10.4 Mouse Path Algorithm Implementation

// CursorAnimator.swift
import CoreGraphics
import QuartzCore

class NaturalCursorAnimator {
    // Cubic Bezier curve path
    func generatePath(from start: CGPoint, to end: CGPoint) -> [CGPoint] {
        // Generate natural control points between start and end
        let midX = (start.x + end.x) / 2
        let midY = (start.y + end.y) / 2
        
        // Add random offset to simulate hand tremor
        let offset = CGFloat.random(in: -20...20)
        let control1 = CGPoint(x: midX + offset, y: start.y + offset * 0.5)
        let control2 = CGPoint(x: midX - offset, y: end.y - offset * 0.5)
        
        // Sample path points along the Bezier curve
        return sampleBezierCurve(
            p0: start, p1: control1, p2: control2, p3: end,
            steps: Int(distance(start, end) / 5)  // More sample points for longer distances
        )
    }
    
    // Velocity curve: accelerate → constant → decelerate (mimics human habits)
    func easeInOutCubic(_ t: CGFloat) -> CGFloat {
        if t < 0.5 {
            return 4 * t * t * t
        } else {
            let f = 2 * t - 2
            return 0.5 * f * f * f + 1
        }
    }
    
    // Move virtual cursor along the path (doesn't affect real mouse)
    func animateCursor(along path: [CGPoint], completion: @escaping () -> Void) {
        var index = 0
        Timer.scheduledTimer(withTimeInterval: 1.0/60.0, repeats: true) { timer in
            guard index < path.count else {
                timer.invalidate()
                completion()
                return
            }
            
            let t = CGFloat(index) / CGFloat(path.count)
            let easedT = self.easeInOutCubic(t)
            
            // Update virtual cursor position (drawn on overlay window)
            self.updateVirtualCursor(to: path[index], opacity: easedT > 0.9 ? 1 - (easedT - 0.9) * 10 : 1)
            index += 1
        }
    }
}
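The Swift above relies on a `sampleBezierCurve` helper that isn't shown; the underlying math is standard cubic Bézier evaluation. A Python sketch of that sampling (function names are mine):

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)

def sample_bezier_curve(p0, p1, p2, p3, steps):
    """Sample steps + 1 evenly spaced parameter values along the curve."""
    return [cubic_bezier(p0, p1, p2, p3, i / steps) for i in range(steps + 1)]
```

The `steps` argument plays the same role as `distance(start, end) / 5` in the Swift code: longer moves get more sample points, so on-screen spacing stays roughly constant.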

11. Universal Methodology Summary

11.1 The Meta-Framework for Problem Solving

Problem Definition
  What (what to do) + Why (why it's feasible)
          ↓
Information Gathering
  Public intelligence + Static analysis + Dynamic traffic capture
          ↓
Decomposition & Parallelism
  Split large problems into independent modules; advance in parallel with multiple AI sessions
          ↓
Implementation & Validation
  MVP → Comparative Validation → Dog Fooding
          ↓
Productization & Release
  Package → Publish → Record → Open Source

11.2 AI Assistance Principles

Providing AI with context is the core human responsibility.

The upper limit of AI capability = the quality of context you provide. When AI gets stuck, the problem isn't that AI isn't capable — it's that the right context is missing.

What AI is good at:
  • Analyzing existing code/binaries/docs
  • Processing multiple independent tasks in parallel
  • Generating code to precise specifications
  • Rapidly implementing algorithms when reference materials are available
  • Multimodal analysis of images/videos

What humans need to do:
  • Decide what to do and what not to do
  • Collect and provide critical context
  • Make judgment calls among multiple options
  • Discover AI blind spots and fill in missing information
  • Maintain overall direction and pace

11.3 Troubleshooting Path When Stuck

Problem: Implementation doesn't match the original
Step 1: Is there more precise reference material? (traffic capture, reverse engineering, papers)
Step 2: Is the context provided to AI specific enough?
Step 3: Can automated comparative validation be established?
Step 4: Can the binary be directly reverse-engineered to obtain the real algorithm?

12. Complete SOP for Future Similar Projects

Standard operating procedure applicable to all "analyze closed systems, replicate open-source" projects.

Phase 0: Preparation (30 minutes)

# 1. Create new project from harness-template
gh repo create my-project --template your-org/harness-template
cd my-project

# 2. Create docs/ folder, let AI continuously accumulate documentation
mkdir docs

# 3. Clearly decompose objectives
cat > docs/goal.md << EOF
## Objective
Replicate: [Target system name]

## Feature Breakdown
- P0 (must have):
- P1 (important):
- P2 (optional):

## Success Criteria
- [ ] Functional comparative validation passed
- [ ] Performance benchmarks met
- [ ] Releasable state
EOF

Phase 1: Information Gathering (1–2 hours)

# 1. Find target files
find / -name "*[target]*" 2>/dev/null

# 2. Basic analysis
file target_binary
strings target_binary | tee docs/strings.txt
nm target_binary | tee docs/symbols.txt

# 3. Have AI analyze and document in docs/
# Prompt: "Analyze these symbols, infer system architecture, write to docs/architecture.md"

# 4. Collect public intelligence
# - Official blog / documentation
# - GitHub Issues / PR
# - Twitter/X related discussions
# - Academic papers

Phase 2: Dynamic Analysis (1–2 hours)

# Configure traffic capture
pip install mitmproxy
mitmdump -p 8080 --save-stream-file capture.mitm &

# Configure proxy, trigger target features
export HTTPS_PROXY=http://127.0.0.1:8080

# Have AI analyze captured traffic
# Prompt: "Analyze capture.mitm, extract all API interface definitions, write to docs/api-spec.md"

Phase 3: Core Implementation (4–6 hours)

# Open multiple parallel AI sessions
# Session A: Implement core features
# Session B: Implement auxiliary tools
# Session C: Handle edge cases and error handling
# Session D: Write tests

# Each session's context must include:
# - docs/architecture.md (system architecture)
# - docs/api-spec.md (interface definitions)
# - Specific requirements for the current module

Phase 4: Validation Feedback Loop (1–2 hours)

# Establish comparative tests
./scripts/compare.sh "test task description" official open-source

# Dog Fooding
# Use your own tool for development work, discover real-world issues

# Fix → Validate → Iterate

Phase 5: Release (1 hour)

# Package
npm init && npm publish
# or
go build && goreleaser release

# Record a demo video
# Ask AI to recommend royalty-free music sources and download a suitable track
ffmpeg -i screen_record.mov -i music.mp3 -shortest output.mp4

# Open Source
gh repo create open-[target-name] --public
git push

Closing Thoughts

"What the AI era changes is only the method of solving problems — but the Geek spirit, the drive to solve problems, remains constant."