CyGuide
Most security tool wrappers are glorified shell scripts with a prettier interface. CyGuide's architectural bet is different: instead of only running tools and printing their output, it maintains a live relational graph of everything discovered during an investigation session, so every scan enriches the last rather than starting from scratch. The engine also enforces a strict plugin contract: any contributor can add a new tool in two files without touching core infrastructure, whether the tool is a network scanner, a web enumerator, or a credential checker.
Design Decisions
Architecture
Smart Adapter / Stupid Framework
The engine knows nothing about nmap, whois, or any other tool — that intelligence lives entirely in each tool's self-contained adapter. Adding a new tool is purely additive; no existing file changes.
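As a rough illustration of the idea, the sketch below shows what a plugin contract of this kind could look like. The names `ToolAdapter`, `Registry`, `build_command`, `parse_line`, and the toy `EchoAdapter` are hypothetical and are not taken from the CyGuide codebase; the point is only that the registry stores adapters without knowing anything about the tools behind them.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List


class ToolAdapter(ABC):
    """Hypothetical plugin contract: everything tool-specific lives here."""

    name: str  # unique tool identifier, e.g. "nmap"

    @abstractmethod
    def build_command(self, target: str) -> List[str]:
        """Return the argv list used to run the tool against a target."""

    @abstractmethod
    def parse_line(self, line: str) -> Iterable[dict]:
        """Turn one line of tool output into zero or more findings."""


class Registry:
    """The framework side: it stores adapters and knows nothing else."""

    def __init__(self) -> None:
        self._adapters: Dict[str, ToolAdapter] = {}

    def register(self, adapter: ToolAdapter) -> None:
        if adapter.name in self._adapters:
            raise ValueError(f"adapter already registered: {adapter.name}")
        self._adapters[adapter.name] = adapter

    def get(self, name: str) -> ToolAdapter:
        return self._adapters[name]

    def names(self) -> List[str]:
        return sorted(self._adapters)


class EchoAdapter(ToolAdapter):
    """Toy adapter for illustration only; it wraps the `echo` command."""

    name = "echo"

    def build_command(self, target: str) -> List[str]:
        return ["echo", target]

    def parse_line(self, line: str) -> Iterable[dict]:
        text = line.strip()
        return [{"schema": "note", "text": text}] if text else []


registry = Registry()
registry.register(EchoAdapter())  # adding a tool touches no existing class
```

Under this assumed design, supporting another tool means writing one more `ToolAdapter` subclass and registering it; `Registry` itself never changes.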
Data Model
Dual-Store: Graph + Event Log
Every finding is written to two places simultaneously — a deduplicated stateful graph for current truth, and an append-only event log for session reproducibility via replay.
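The dual-write idea can be sketched in a few lines. This is an in-memory illustration only, under stated assumptions: the class name `DualStore`, the JSON event format, and the last-write-wins merge are hypothetical and do not describe CyGuide's actual storage layer.

```python
import json
from typing import Dict, List, Tuple


class DualStore:
    """Hypothetical store: one deduplicated graph plus one append-only log."""

    def __init__(self) -> None:
        self.graph: Dict[Tuple[str, str], dict] = {}  # current truth
        self.events: List[str] = []                   # full history

    def record(self, schema: str, key: str, attrs: dict) -> None:
        # 1) Append the raw finding to the event log; entries are never edited.
        event = {"schema": schema, "key": key, "attrs": attrs}
        self.events.append(json.dumps(event, sort_keys=True))
        # 2) Merge the finding into the graph node with the same identity.
        self.graph.setdefault((schema, key), {}).update(attrs)

    @classmethod
    def replay(cls, events: List[str]) -> "DualStore":
        """Rebuild a store by feeding logged events through record() again."""
        store = cls()
        for raw in events:
            event = json.loads(raw)
            store.record(event["schema"], event["key"], event["attrs"])
        return store


store = DualStore()
store.record("host", "10.0.0.5", {"os": "linux"})
store.record("host", "10.0.0.5", {"hostname": "db01"})
rebuilt = DualStore.replay(store.events)
```

After the two calls the graph holds a single `host` node while the log holds two events, and replaying the log produces an identical graph.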
Identity
PIK-Based Upserts Over Append
Every schema declares a Primary Identity Key. When two tools report the same entity, the engine merges them into one node rather than creating a duplicate.
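A minimal sketch of a key-based upsert follows. The `Schema` dataclass, the `upsert` helper, the `open_port` schema, and the merge policy (later values overwrite earlier ones, `None` is ignored) are assumptions made for illustration, not CyGuide's real definitions.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Schema:
    name: str
    pik: Tuple[str, ...]  # fields that together form the Primary Identity Key


def upsert(graph: Dict[tuple, dict], schema: Schema, finding: dict) -> tuple:
    """Merge a finding into the node identified by the schema's PIK."""
    key = (schema.name,) + tuple(finding[field] for field in schema.pik)
    node = graph.setdefault(key, {})
    # Assumed merge policy: later non-None values win, None never erases data.
    node.update({k: v for k, v in finding.items() if v is not None})
    return key


OPEN_PORT = Schema("open_port", ("ip", "port"))  # hypothetical schema

graph: Dict[tuple, dict] = {}
# Two different tools report the same port on the same host.
upsert(graph, OPEN_PORT, {"ip": "10.0.0.5", "port": 22, "service": "ssh"})
upsert(graph, OPEN_PORT, {"ip": "10.0.0.5", "port": 22, "banner": "OpenSSH"})
```

Both reports land on the key `("open_port", "10.0.0.5", 22)`, so the graph ends up with one node carrying both `service` and `banner`.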
Safety
Canonicalization Boundary
A normalization layer standardizes syntax before data enters the graph. It deliberately stops at semantic inference to prevent corrupting investigations with "best guess" resolutions.
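To make the boundary concrete, here is a sketch of what purely syntactic normalization could look like for host values. The function name `canonicalize_host` and the specific rules are assumptions; the only standard-library piece relied on is `ipaddress.ip_address`.

```python
import ipaddress
from typing import Tuple


def canonicalize_host(value: str) -> Tuple[str, str]:
    """Normalize the spelling of a host value without interpreting it."""
    text = value.strip()
    try:
        # Syntax only: str() yields the standard compressed, lowercase form.
        return ("ip", str(ipaddress.ip_address(text)))
    except ValueError:
        pass
    # Syntax only: hostnames are case-insensitive and a trailing dot is
    # optional. No DNS lookup is attempted here; deciding which IP a name
    # "really" means would be semantic inference.
    return ("hostname", text.rstrip(".").lower())
```

With these rules, `" Example.COM. "` and `"example.com"` map to the same hostname node, but a hostname is never silently rewritten into an IP address.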
Problem Solved
Root cause: The executor originally used process.communicate(), which waits for a tool to finish entirely before returning any output. During a long nmap scan this froze the UI completely: clicks queued up, stats stalled, and the Stop button was inert. Three things contributed: the event loop was starved because there were no yield points, the output widget re-parsed the accumulated Rich markup at quadratic cost, and a new SQLite connection was opened for every single finding.
Fix: The executor was refactored to stream stdout line by line using readline() loops, with stderr consumed concurrently in a background task. Output accumulates in a list (amortized O(1) appends) rather than a growing string. The UI refreshes at a capped rate of 20 fps, markup parsing is disabled, and the store now uses a persistent connection instead of opening one per finding.
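The streaming pattern described in the fix can be sketched with asyncio's subprocess API. This is not CyGuide's executor: `run_streaming`, `_drain`, and the `on_line` callback are hypothetical names, and the example runs a small Python child process rather than nmap.

```python
import asyncio
import sys
from typing import Callable, List, Sequence, Tuple


async def _drain(stream: asyncio.StreamReader, sink: List[str]) -> None:
    """Read a stream to EOF line by line, appending each line to a list."""
    while True:
        raw = await stream.readline()  # yields to the event loop while waiting
        if not raw:
            break
        sink.append(raw.decode(errors="replace").rstrip("\r\n"))


async def run_streaming(
    argv: Sequence[str], on_line: Callable[[str], None]
) -> Tuple[int, List[str], List[str]]:
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out_lines: List[str] = []
    err_lines: List[str] = []
    # stderr is consumed in a background task so a full pipe cannot stall
    # the child while stdout is being read.
    stderr_task = asyncio.create_task(_drain(proc.stderr, err_lines))
    while True:
        raw = await proc.stdout.readline()
        if not raw:
            break
        line = raw.decode(errors="replace").rstrip("\r\n")
        out_lines.append(line)  # list append, not string concatenation
        on_line(line)           # e.g. hand the line to the UI
    await stderr_task
    return await proc.wait(), out_lines, err_lines


child = "import sys; print('a'); print('b'); print('e', file=sys.stderr)"
seen: List[str] = []
code, out, err = asyncio.run(
    run_streaming([sys.executable, "-c", child], seen.append)
)
```

Each `await stream.readline()` is a yield point, which is what lets other tasks such as UI event handlers run while the tool is still producing output.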
What else changed: With the event loop properly yielding, the Stop button became functional, the reactive stat labels update live during scans, and the explanation pane can stream responses in parallel without blocking raw output.
Two-Mode Design
CyGuide ships two modes because the gap between a student and a practitioner is not just knowledge — it is workflow. Learning Mode guides a newcomer through a structured tool browser, lets them select named recipes, and streams a first-principles explanation of what was found alongside the raw output. Power Mode is for someone who already knows what the tools do: a persistent session workspace where the entity graph accumulates across tool runs, suggestions surface automatically, and the user works with findings rather than typing IP addresses.
By the Numbers
- 15 finding schemas in the shared vocabulary, covering network, DNS, web, credential, and vulnerability layers.
- 27 automated tests across unit, integration, and mode-logic layers, including a full executor flow test using a mock nmap binary.
- 2 reference tool adapters (nmap and whois) that together demonstrate both the Scanner pattern and the Enrichment pattern for contributors.
- 3 explanation backends shipped: zero-dependency template engine, local Ollama LLM as auto-detected upgrade, and Anthropic API as explicit opt-in.
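The three explanation backends listed above imply a fallback order, which might look like the sketch below. The function name, its parameters, and the precedence (explicit opt-in first, then auto-detected Ollama, then the template engine) are assumptions rather than documented behaviour.

```python
def choose_backend(anthropic_opt_in: bool, ollama_detected: bool) -> str:
    """Pick an explanation backend; the precedence order is an assumption."""
    if anthropic_opt_in:
        return "anthropic"  # used only when the user explicitly asks for it
    if ollama_detected:
        return "ollama"     # local LLM, picked up automatically when present
    return "template"       # zero-dependency default, always available
```

Because the template engine needs no dependencies, a selection function of this shape always has a backend to return.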
Governance
- Tier 1 — Frozen: Engine core (store, registry, canonicalization) — changes require team consensus and are enforced via GitHub CODEOWNERS.
- Tier 2 — Review Required: All adapters and the executor — one peer review mandatory for async contract correctness and subprocess safety.
- Tier 3 — Free to Change: UI screens, CSS, documentation, manifest learning content — no approval needed.