Overview
Crow Eye is an open-source Windows forensic investigation engine designed to collect, analyze, and visualize various Windows artifacts. It features a modular architecture with specialized components for artifact collection, data processing, and visualization through a cyberpunk-themed GUI.
Key Features
- Comprehensive Artifact Collection: Supports Prefetch, Registry, Event Logs, Amcache, Jump Lists, SRUM, MFT, USN Journal, Recycle Bin.
- Timeline Visualization: Advanced timeline view with OpenGL-accelerated rendering.
- Case Management: Organize investigations into cases with persistent configuration.
- Modular Architecture: Easy to extend with new artifact parsers.
Project Structure
The project is organized into several key directories:
- Artifacts_Collectors/: Specialized parsers for Windows artifacts (Prefetch, MFT, Registry, etc.).
- eye/: The Eye AI Forensic Agent subsystem (Bridge, Services, UI).
- correlation_engine/: Advanced forensic correlation and identity engine.
- dynamic_mapping/: Intelligence layer for dynamic linking and enrichment.
- data/: Data management components and unified search engines.
- ui/: Main PyQt5 user interface components.
- timeline/: OpenGL and React-based timeline visualization.
- utils/: Core utilities, concurrency management, and dependency installers.
- configs/: System configurations and the Forensic Knowledge Base (RAG).
- CONTRIBUTING.md: Contribution guidelines.
- Crow Eye.py: Main application entry point.
- GUI_resources.py: UI assets and resources.
- LICENSE: Project license.
- README.md: Project overview.
- styles.py: Custom UI styling engine.
System Architecture & Orchestration
Crow Eye is built on a sophisticated multi-layered architecture that prioritizes forensic integrity and real-time responsiveness. The system orchestrates complex data pipelines while maintaining a seamless investigative workspace.
UI & Experience Layer
Responsive React-based interface embedded via QWebEngine. Manages the high-fidelity dark-mode canvas and real-time data streaming.
Forensic Collectors
Modular triage collectors. Each collector is an independent module capable of parsing raw Windows artifacts into standardized forensic formats.
Persistence & Search
High-performance SQLite backend. Manages forensic "Feathers" and provides O(log n) indexed lookups across millions of artifacts.
Image Parsing Engine
Modular engine powered by the dissect framework. Uses the Strategy Design Pattern to parse E01, VHDX, VMDK, and RAW images without host-OS mounting.
Timeline & Visualization
Hybrid React/Python architecture. Uses an SPA interface embedded via QWebEngine with the Timeline Bridge for asynchronous event streaming.
Forensic Correlation Engine
Advanced Correlation Framework
The Correlation Engine is the analytical heart of Crow-Eye. It provides a modular framework for finding temporal and semantic relationships across diverse forensic artifacts, enabling investigators to reconstruct complex attack chains with O(N log N) efficiency.
Core Methodology
Crow-Eye utilizes a Dual-Engine Architecture to balance forensic precision with computational performance. Whether tracking a single executable across the timeline or analyzing massive event logs, the system dynamically adapts to the investigative context.
🪶 Feather
The data normalization layer. Feathers transform raw artifacts (LNK, Prefetch, EVTX) into standardized SQLite schemas, ensuring tool-agnostic analysis.
🪽 Wing
The logic layer. Wings define the correlation rules, temporal boundaries (default: 180 minutes), and semantic mappings that guide the investigation.
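To make the idea of a temporal boundary concrete, here is a minimal sketch of window-based matching, assuming events are simple `(timestamp, label)` tuples. The function name `correlate_by_time` and the event labels are hypothetical and are not taken from the Crow-Eye code base; only the 180-minute default comes from the text above. Sorting once and then binary-searching each anchor keeps the work at O(N log N).

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta

# Mirrors the 180-minute default mentioned above; the constant name is assumed.
DEFAULT_WINDOW = timedelta(minutes=180)


def correlate_by_time(anchors, candidates, window=DEFAULT_WINDOW):
    """Pair each anchor event with the candidate events within +/- window.

    Both inputs are lists of (timestamp, label) tuples.
    """
    ordered = sorted(candidates, key=lambda event: event[0])  # O(N log N)
    times = [ts for ts, _ in ordered]
    matches = {}
    for ts, label in anchors:
        lo = bisect_left(times, ts - window)    # first candidate inside the window
        hi = bisect_right(times, ts + window)   # one past the last candidate inside
        matches[label] = [name for _, name in ordered[lo:hi]]
    return matches


prefetch = [(datetime(2024, 1, 1, 12, 0), "prefetch:EVIL.EXE")]
lnk = [
    (datetime(2024, 1, 1, 10, 30), "lnk:evil.lnk"),  # 90 minutes before the anchor
    (datetime(2024, 1, 1, 16, 0), "lnk:late.lnk"),   # 240 minutes after the anchor
]
result = correlate_by_time(prefetch, lnk)
```

With the default window, only the LNK event 90 minutes before the Prefetch execution is paired with it; the later one falls outside the boundary.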
Engine Subsystems
Engine Core
Feather System
Wings System
Dual-Engine Comparison
Identity-Based
O(N log N). Optimized for tracking specific files/applications across artifacts. Constant memory usage via streaming.
Time-Based
O(N log N). Systematic temporal analysis across large datasets. Ideal for reconstruction of event timelines.
Engine Selection Guide
| Feature | Identity-Based | Time-Based |
|---|---|---|
| Primary Focus | Entity Tracking | Temporal Context |
| Scaling | 10,000+ Records | 1,000+ Records |
| Memory | O(1) Streaming | Low (Buffered) |
Directory Structure
Eye AI Forensic Agent
Active Development Notice: The Eye Assistant is currently in continuous active development. Users and developers should expect significant changes, architectural optimizations, and new feature integrations in the upcoming releases.
The Eye AI Forensic Assistant is an embedded, tool-augmented "Forensic AI Agent" that orchestrates forensic data querying, analysis, and report generation within the Crow-eye suite. It acts as an expert forensic peer, accelerating the analysis of Windows artifacts by detecting suspicious behaviors through deep correlation and helping investigators build comprehensive forensic reports.
Core Capabilities
Eye consists of a modern React-based GUI embedded inside the PyQt5 application, interacting asynchronously with a Python backend Context Manager.
- Natural Language Investigation: Query forensic artifacts using conversational AI.
- Multi-Source Data Integration: Unified access to Prefetch, MFT, Registry, Event Logs, and more.
- Living Report Workspace: Real-time collaborative documentation with charts and evidence tracking.
- RAG-Enhanced Analysis: Retrieval-Augmented Generation for artifact-specific forensic knowledge.
- Multi-Backend Support: OpenAI, Anthropic, Gemini, Ollama, and LM Studio.
- Human-in-the-Loop: Critical investigative decisions require investigator validation.
Core Architectural Components
2.1. The Frontend (React + QWebChannel)
- Embedded Web UI: A modern interface providing a Chat view, Data Viewer, and a "Living Report" Generator. Embedded using QWebEngineView.
- Asynchronous Communication: Uses QWebChannel to stream chunks of text and tool states in real-time, reducing perceived latency.
2.2. The Backend Context Manager (Python)
The ContextManager is the brain of Eye. It maintains conversational state, enforces forensic reporting rules, applies RAG, and dictates authorized tool usage.
2.3. Model-Agnostic Router
Allows seamless switching between backends:
- Local Server API: LM Studio / Ollama for private, offline analysis.
- Cloud APIs: Gemini, Anthropic, OpenAI for complex reasoning.
2.4. TOON (Table-Oriented Object Notation) Engine
Prevents context window exhaustion. If a query returns >1000 rows, the TOON Engine applies SQL pushdowns to aggregate data into an ultra-compact structure (metadata, sample rows, summary stats) before delivery to the AI.
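The compaction step can be sketched as follows. This is only an illustration of the idea described above (aggregate in SQL, then hand over metadata, sample rows, and summary statistics); the function name `compact_result`, the output keys, and the demo table are assumptions, not the actual TOON format. Only the 1000-row threshold comes from the text.

```python
import sqlite3

ROW_THRESHOLD = 1000  # from the text above; everything else here is assumed


def compact_result(conn, table, numeric_col, sample_size=3):
    """Return all rows for small tables, or a compact summary for large ones.

    Identifiers are interpolated directly, so they must come from a trusted schema.
    """
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    if total <= ROW_THRESHOLD:
        cur = conn.execute(f"SELECT * FROM {table}")
        return {"columns": [d[0] for d in cur.description], "rows": cur.fetchall()}

    # Large result: let SQLite do the aggregation instead of shipping every row.
    cur = conn.execute(f"SELECT * FROM {table} LIMIT ?", (sample_size,))
    columns = [d[0] for d in cur.description]
    sample = cur.fetchall()
    lo, hi, avg = conn.execute(
        f"SELECT MIN({numeric_col}), MAX({numeric_col}), AVG({numeric_col}) FROM {table}"
    ).fetchone()
    return {
        "metadata": {"table": table, "columns": columns, "total_rows": total},
        "sample_rows": sample,
        "summary": {numeric_col: {"min": lo, "max": hi, "avg": avg}},
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prefetch (name TEXT, run_count INTEGER)")
conn.executemany(
    "INSERT INTO prefetch VALUES (?, ?)",
    [(f"APP{i}.EXE", i % 10) for i in range(1500)],
)
summary = compact_result(conn, "prefetch", "run_count")
```

Here 1,500 rows collapse into three sample rows plus a handful of statistics, which is the kind of reduction that keeps a large query result inside a model's context window.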
System Architecture Map
[Diagram: Eye system architecture map. The diagram source is truncated in this copy; the relationships that survive are listed below.]
- UI path: ChatUI <-> BridgeJS <-> EyeBridge <-> ContextMgr.
- Configuration: ConfigMgr validates ActiveConfig against Schema and drives ModelRouter.
- Query processing: ContextMgr -> QueryProc, which calls IntentEng, RAGSvc, TokenMgr, ModelRouter, HistoryMgr, AuditLog, EvidenceDet, DBSvc, SearchSvc, ReportEng, and Correlation.
- Threat intelligence: EvidenceDet -> ThreatIntel -> VT, OTX, LOLBAS.
- Model routing: ModelRouter -> LocalCLI, LocalAPI, CloudAPI; ModelRouter uses CredentialMgr, which exchanges credentials with the OS Keychain.
- Data: DBSvc -> TOON, ForensicDB, and Registry hives (SAM / SYSTEM); RAGSvc -> KnowledgeBase (embeddings); AuditLog -> truncation_audit.log; CaseDir -> ForensicDB.
- Reporting: ReportEng -> PDFExp, SVGExp, Heatmap, Timeline; Heatmap and Timeline -> ColorMgr; ReportEng exports HTML/PDF reports to the file system.
- Human in the loop: QueryProc requests investigator approval, and the decision returns to ContextMgr.
The Ghassan Elsman Protocol (GEP)
The GEP is a strict set of forensic integrity boundaries enforced during every investigation step.
truncation_audit.log.
Investigation Pipeline
The 7-stage processing flow ensures thorough and verifiable analysis.
- Intent Interception: Heuristic check for direct commands (e.g., model switching).
- Forensic Keyword Analysis: Identifying target artifacts (Prefetch, MFT, etc.).
- RAG Lookup: Contextual retrieval from the forensic knowledge base.
- Token Balancing: Optimization of the context window for history and RAG prompts.
- Tool Execution: Direct execution of database queries, regex searches, and intelligence lookups.
- Forensic Synthesis: Applying the Ghassan Elsman Protocol for strictly evidence-anchored reporting.
- Completion: Delivering the final payload, interactive action chips, and data viewers to the UI.
Eye Tool Arsenal
Eye is equipped with a suite of functional tools to manipulate data, generate reports, and hunt for intelligence.
Investigative & Data Tools
- query_database: Executes raw SQL SELECT queries directly against forensic SQLite databases with automatic TOON compression.
- search_artifacts: Performs text or regex searches across all available databases.
- get_schema: Retrieves table schema information (columns, types) as a fallback mechanism.
- query_correlation_results: Finds time-based or identity-based correlations using Crow-eye's Correlation Engine.
- list_case_files: Secure navigation of the active case directory to discover artifacts.
Reporting & Visualization (The Living Report)
- report_append_section: Adds markdown narrative and synthesis to the report.
- report_add_data_table: Generates interactive data tables for raw forensic evidence.
- report_add_chart: Creates data visualizations (Bar, Line, Pie) for pattern analysis.
- report_add_timeline: Constructs chronological timeline visualizations.
- report_add_heatmap: Generates intensity heatmaps (e.g., login activity by hour/day).
- report_add_chat_transcript: Documents AI reasoning or investigator dialogues.
- report_add_chain_of_custody: Documents evidence handling procedures.
- report_edit_section / report_delete_section: Modifies existing report blocks.
- export_report: Triggers formal export to HTML, PDF, or Markdown.
Threat Intelligence Tools
- query_threat_intel: Queries external intelligence (VirusTotal, AlienVault) for reputation data.
- query_living_off_the_land_intel: Assesses if binaries/drivers are known dual-use tools (LOLBAS/LOLDrivers).
- internet_search: Performs wide-spectrum research for emerging threats or techniques.
Initial Triage Workflow
Upon starting a new case, Eye autonomously executes a "Master Forensic Triage Report", encompassing:
- System Identity & Network Discovery
- Authentication & Login Activity
- Evidence of Execution (Top 10 Apps)
- Persistence Mechanisms (Auto-Runs)
- Anti-Forensics & File Lifecycle
- User Intent (Search & Commands)
- Connected Hardware (USB Devices)
- Final Synthesis & Strategy
This ensures the investigator is immediately presented with a comprehensive, actionable overview of the endpoint's state.
Forensic Knowledge Base (RAG)
The Retrieval-Augmented Generation (RAG) service empowers the Eye Assistant with deep, artifact-specific forensic knowledge. Instead of relying on general LLM training data, Eye consults an internal library of markdown documents.
- Artifact Schemas: Detailed breakdowns of database structures (e.g., global_schema_reference.md).
- Forensic Methodology: Step-by-step investigative workflows and best practices.
- Targeted Knowledge: Specific guides for each artifact (e.g., amcache_knowledge.md, prefetch_knowledge.md, usn_knowledge.md) outlining exactly what each field means and how it can be abused.
Directory Structure
Core Components
1. Main Application (Crow Eye.py)
The main application serves as the entry point and orchestrator for the entire system.
Responsibilities
- Environment Setup: Creates and manages a virtual environment with required dependencies
- UI Initialization: Sets up the PyQt5-based user interface with cyberpunk styling
- Artifact Collection Coordination: Invokes appropriate artifact collectors
- Data Visualization: Displays collected artifacts in tables and UI components
- Case Management: Handles case creation, loading, and configuration
Key Functions
- setup_virtual_environment(): Creates Python virtual environment
- check_and_install_requirements(): Ensures all packages are installed
- validate_dependencies(): Validates dependency functionality
- is_admin(): Checks for administrator privileges
- load_registry_data_from_db(): Master function for loading registry data
2. Styles System (styles.py)
Defines the cyberpunk-themed visual identity of Crow Eye with neon accents and dark backgrounds.
Features
- Custom color palette with neon cyan (#00FFFF) accents
- Dark theme optimized for long forensic sessions
- Consistent styling across all UI components
- Custom table styles with alternating row colors
3. Component Factory (component_factory.py)
Factory pattern for creating consistent UI elements throughout the application.
Created Components
- Styled tables with custom headers
- Search dialogs with filters
- Progress indicators
- Custom buttons and controls
Artifact Collectors
Each artifact collector is a specialized module for extracting and parsing a specific type of Windows forensic artifact.
Common Collector Pattern
All collectors follow this pattern:
- Locate: Find artifact source (files, registry keys, etc.)
- Parse: Extract binary data into structured information
- Store: Save results in SQLite databases
- Export: Generate JSON output for interoperability
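The four steps above could be expressed as a small base class. This is a sketch under stated assumptions: the class names `Collector` and `DemoCollector`, the method signatures, and the two-column table are all hypothetical and do not describe how the real collectors are written.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class Collector(ABC):
    """Hypothetical base class for the Locate / Parse / Store / Export steps."""

    table = "artifacts"

    @abstractmethod
    def locate(self):
        """Return an iterable of artifact sources (paths, registry keys, ...)."""

    @abstractmethod
    def parse(self, source):
        """Yield dict records extracted from one source."""

    def store(self, conn, records):
        conn.execute(f"CREATE TABLE IF NOT EXISTS {self.table} (source TEXT, value TEXT)")
        conn.executemany(f"INSERT INTO {self.table} VALUES (:source, :value)", records)
        conn.commit()

    def export(self, records):
        return json.dumps(records, indent=2)

    def run(self, conn):
        records = [rec for src in self.locate() for rec in self.parse(src)]
        self.store(conn, records)
        return self.export(records)


class DemoCollector(Collector):
    """Stand-in collector that returns one fixed record."""

    table = "demo"

    def locate(self):
        return ["demo://source"]

    def parse(self, source):
        yield {"source": source, "value": "example"}


conn = sqlite3.connect(":memory:")
exported = DemoCollector().run(conn)
```

A new artifact parser would then only need to implement `locate()` and `parse()`, while storage and JSON export stay uniform across collectors.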
1. Prefetch Parser (Prefetch_claw.py)
Parses Windows Prefetch files (.pf) to extract execution history.
Forensic Value
- Program execution history
- Last execution times (up to 8 timestamps)
- Run count
- Files and directories accessed by the program
Supported Versions
- Windows XP/2003 (Version 17)
- Windows Vista/7 (Version 23)
- Windows 8/8.1/2012 (Version 26)
- Windows 10/11 (Versions 30-31)
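The version numbers above are stored in the file header: an uncompressed Prefetch file starts with a 4-byte little-endian version followed by the ASCII signature `SCCA`, while Windows 10/11 files are normally stored compressed and begin with `MAM`. The sketch below reads that header; the function name `prefetch_version` is hypothetical and this is not the Prefetch_claw.py implementation.

```python
import struct

# Version-to-OS mapping taken from the list above.
PREFETCH_VERSIONS = {
    17: "Windows XP/2003",
    23: "Windows Vista/7",
    26: "Windows 8/8.1/2012",
    30: "Windows 10/11",
    31: "Windows 10/11",
}


def prefetch_version(header: bytes):
    """Return (version, os_family) from the first 8 bytes of a .pf file."""
    if header[:3] == b"MAM":
        # Compressed container used on Windows 10/11; it must be decompressed first.
        raise ValueError("compressed prefetch file (MAM); decompress before parsing")
    if len(header) < 8 or header[4:8] != b"SCCA":
        raise ValueError("not a prefetch file: SCCA signature missing")
    (version,) = struct.unpack_from("<I", header, 0)
    return version, PREFETCH_VERSIONS.get(version, "unknown")


sample_header = struct.pack("<I", 30) + b"SCCA"
detected = prefetch_version(sample_header)
```

Checking the signature before trusting the version field avoids misreading unrelated files that happen to carry a `.pf` extension.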
2. Registry Parser (Regclaw.py)
Extracts forensic artifacts from live Windows Registry hives.
Artifacts Collected
- USB Devices & Storage
- UserAssist (ROT-13 decoded)
- Shellbags (folder access)
- Recent Documents
- Network Lists
- Run/RunOnce keys
- Installed Programs
- Services
- BAM/DAM (Background Activity Moderator / Desktop Activity Moderator)
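For the UserAssist entry in the list above, value names are stored ROT-13 encoded, so decoding them is a one-line operation with the standard library. The helper name `decode_userassist_name` and the sample value are illustrative only.

```python
import codecs


def decode_userassist_name(value_name: str) -> str:
    """Decode a ROT-13 encoded UserAssist value name.

    ROT-13 only rotates ASCII letters; digits, braces, and path separators
    pass through unchanged.
    """
    return codecs.decode(value_name, "rot_13")


encoded = "P:\\Jvaqbjf\\abgrcnq.rkr"
decoded = decode_userassist_name(encoded)
```

Because ROT-13 is its own inverse, the same call can also be used to encode a path when searching raw hives for a known executable.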
3. Offline Registry Parser (offline_RegClaw.py)
Parses offline registry hives without requiring live system access.
Key Features
- Hive Support: SYSTEM, SOFTWARE, SAM, SECURITY, NTUSER.DAT
- Path Independence: No reliance on current system's registry API
- Cross-Analysis: Analyze hives from different Windows versions
4. Amcache Parser (amcacheparser.py)
Parses Amcache.hve to identify application execution history.
Database Tables
- InventoryApplication
- InventoryApplicationFile
- InventoryDriverBinary
- DeviceCensus
5. Event Log Parser (WinLog_Claw.py)
Parses Windows Event Log files (.evtx).
Forensic Value
- User logon/logoff events
- Process creation (Event ID 4688)
- Service installations
- System events
6. Jump Lists & LNK Parser (A_CJL_LNK_Claw.py)
Parses Jump Lists and LNK (shortcut) files.
Forensic Value
- Recently accessed files
- Application usage patterns
- File paths and network shares
- Timestamps of file access
7. SRUM Parser (SRUM_Claw.py)
Parses System Resource Usage Monitor database.
Forensic Value
- Application runtime and resource usage
- Network connectivity data
- Energy usage statistics
8. MFT Parser (MFT_Claw.py)
Parses the Master File Table from NTFS file systems.
Forensic Value
- Complete file system timeline
- File creation, modification, access times
- Deleted file recovery
- File attributes and permissions
9. USN Journal Parser (USN_Claw.py)
Parses the Update Sequence Number Journal.
Forensic Value
- File system change tracking
- File creation, deletion, renaming events
- Detailed change reasons
10. Recycle Bin Parser (recyclebin_claw.py)
Parses Recycle Bin artifacts.
Forensic Value
- Deleted file metadata
- Original file paths
- Deletion timestamps
- File sizes
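The fields listed above map directly onto the `$I` metadata files in the Recycle Bin: an 8-byte version, an 8-byte original size, an 8-byte FILETIME deletion timestamp, and then the original path (a fixed 520-byte UTF-16 field in version 1, or a 4-byte character count followed by the path in version 2). The sketch below parses that layout; `parse_i_file` is a hypothetical name and not the recyclebin_claw.py implementation.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def parse_i_file(data: bytes) -> dict:
    """Parse the contents of a Recycle Bin $I metadata file."""
    version, size, filetime = struct.unpack_from("<QQQ", data, 0)
    if version == 1:
        raw_path = data[24:24 + 520]           # fixed-width, null-padded path
    elif version == 2:
        (chars,) = struct.unpack_from("<I", data, 24)
        raw_path = data[28:28 + chars * 2]     # length is counted in UTF-16 characters
    else:
        raise ValueError(f"unsupported $I version: {version}")
    path = raw_path.decode("utf-16-le").split("\x00", 1)[0]
    # FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC.
    deleted = FILETIME_EPOCH + timedelta(microseconds=filetime // 10)
    return {
        "version": version,
        "size": size,
        "deleted_utc": deleted.isoformat(),
        "original_path": path,
    }


# Synthetic version-2 record for demonstration purposes.
sample_path = "C:\\Users\\demo\\secret.txt\x00"
sample = struct.pack("<QQQI", 2, 4096, 116444736000000000, len(sample_path))
sample += sample_path.encode("utf-16-le")
record = parse_i_file(sample)
```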
11. Shimcache Parser (shimcash_claw.py)
Parses the Windows Application Compatibility Cache (Shimcache) to track executable files that have been present on the system.
Forensic Value
- Execution history (even if the file is deleted)
- Full file paths
- Last modification timestamps
- Execution flags and status
Crow-claw: Advanced Collection Engine
Crow-claw is the high-fidelity collection core of Crow-Eye. It is designed to bypass operating system file locks and provide deep access to hidden or protected forensic artifacts.
Raw Disk Access
Bypasses Windows API locks using raw_disk_access_strategy.py, allowing the engine to read files like MFT, Registry Hives, and Pagefiles while the system is live.
VSS Management
The integrated shadow_copy_manager.py automatically identifies, mounts, and parses Volume Shadow Copies, enabling historical analysis of system states.
Core Components
- VSS Health Diagnostics: Automated volume consistency checks via vss_health_checker.py.
- Error Classifier: Advanced error handling (error_classifier.py) to distinguish between access denials and corrupted data.
- Multi-Strategy Collection: Dynamically switches between Standard, Shadow, and Raw access depending on the forensic context.
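The interplay of error classification and strategy switching can be sketched as an ordered fallback. Everything named here (`collect_with_fallback`, the reader functions, the error labels) is hypothetical and only illustrates the control flow; the real strategies would perform standard, shadow-copy, or raw-disk reads.

```python
def collect_with_fallback(path, strategies):
    """Try each (name, reader) pair in order.

    Returns (strategy_name, data, errors) for the first reader that succeeds,
    where errors records why earlier strategies failed.
    """
    errors = {}
    for name, reader in strategies:
        try:
            return name, reader(path), errors
        except PermissionError as exc:
            # Access denial (for example a locked file): move to the next strategy.
            errors[name] = f"access denied: {exc}"
        except OSError as exc:
            # Any other I/O failure is recorded separately from access denials.
            errors[name] = f"I/O error: {exc}"
    raise RuntimeError(f"all strategies failed for {path}: {errors}")


def standard_read(path):
    """Demo reader that simulates a file locked by the operating system."""
    raise PermissionError("file is locked by the OS")


def shadow_read(path):
    """Demo reader that simulates a successful read from a shadow copy."""
    return b"hive-bytes-from-shadow-copy"


name, data, errors = collect_with_fallback(
    "C:\\Windows\\System32\\config\\SYSTEM",
    [("standard", standard_read), ("shadow", shadow_read)],
)
```

Keeping the per-strategy error messages alongside the result lets a report state not only which access method produced the data, but also why the simpler methods were skipped.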
Forensic Image Parsing Engine
The Forensics Image Parsing module provides a robust, extensible architecture designed to analyze and extract artifacts from diverse forensic containers. By leveraging the Strategy Design Pattern and the dissect framework, it abstracts image complexity into a unified filesystem interface.
E01 / Ex01
Expert Witness Format support with intelligent loaders for segmented or missing image slices.
RAW / DD
Bit-for-bit raw copies with automated multi-part discovery (e.g., .001, .002).
VHDX / VMDK
Direct parsing of Hyper-V and VMware virtual disk formats for cloud and VM forensics.
Core Architectural Components
Image Parser (image_parser.py)
The central coordinator that detects formats through signature verification and manages the lifecycle of parsing strategies.
FS Accessor (file_system_accessor.py)
Abstraction layer handling complex NTFS features like Alternate Data Streams (ADS) and sparse file compaction ($J Journal).
Image Extractor (image_extractor.py)
The bridge between artifact definitions and parsed volumes, translating Windows environment variables for seamless extraction.
Partition Detector (partition_detector.py)
Scans for Volume Systems (MBR, GPT) and handles "Volume-Only" acquisitions without partition tables.
The Parsing Pipeline
- Detection: Cascading format checks via can_handle(), electing the appropriate strategy.
- Mounting: Transparent mounting of containers and resolution of split segments.
- Discovery: Automated partition probing and creation of PartitionInfo metadata.
- Traversal: Mapping offset addresses to mount specific NTFS/FAT32 volumes.
- Extraction: Streaming artifact data while preserving forensic MAC timestamps.
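The Detection step can be illustrated with a minimal Strategy-pattern sketch. Only the method name `can_handle()` comes from the text; its signature, the class names, and `select_strategy` are assumptions. The magic bytes used are the published file signatures for EWF (E01), VHDX, and VMDK sparse extents; Ex01 and text-based VMDK descriptor files are not covered here.

```python
from abc import ABC, abstractmethod


class ImageStrategy(ABC):
    """Hypothetical strategy interface; one subclass per container format."""

    name = "base"

    @abstractmethod
    def can_handle(self, header: bytes) -> bool:
        """Return True if the header bytes match this format."""


class E01Strategy(ImageStrategy):
    name = "E01"

    def can_handle(self, header):
        return header.startswith(b"EVF\x09\x0d\x0a\xff\x00")


class VhdxStrategy(ImageStrategy):
    name = "VHDX"

    def can_handle(self, header):
        return header.startswith(b"vhdxfile")


class VmdkStrategy(ImageStrategy):
    name = "VMDK"

    def can_handle(self, header):
        return header.startswith(b"KDMV")  # sparse extent header only


class RawStrategy(ImageStrategy):
    name = "RAW"

    def can_handle(self, header):
        return True  # raw images have no signature, so this is the fallback


# Order matters: specific formats first, the signature-less fallback last.
STRATEGIES = [E01Strategy(), VhdxStrategy(), VmdkStrategy(), RawStrategy()]


def select_strategy(header: bytes) -> ImageStrategy:
    """Cascade through the strategies and return the first that accepts the header."""
    return next(s for s in STRATEGIES if s.can_handle(header))


chosen = select_strategy(b"vhdxfile" + b"\x00" * 8)
```

Adding a new container format then means adding one class and one list entry, without touching the selection logic.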
Offline Artifact Importer
The Offline Importer subsystem allows investigators to process raw forensic artifacts that have been extracted from target machines or acquired via third-party tools. It provides a robust GUI for batch processing and validation.
Core Capabilities
- Automated Discovery: Scans directories to automatically identify artifact types (Prefetch, EVTX, Registry Hives) using artifact_type_detector.py.
- Batch Processing: Orchestrates multiple offline parsers concurrently using parser_invoker.py.
- Validation & Indexing: Validates file integrity before parsing and builds a comprehensive scan index.
- Standalone GUI: Provides a dedicated interface (offline_importer_gui.py) for managing imports outside of the live collection workflow.
Timeline Module
The Crow-Eye Timeline is a sophisticated analytical engine that aggregates and correlates forensic artifacts into a unified, chronologically ordered interface. It utilizes a Hybrid Architecture to deliver high-performance visualization of massive datasets.
Hybrid Architecture (React + Python)
The timeline bridges a robust Python data-processing backend with a modern, responsive React frontend hosted within a PyQt5 QWebEngineView.
Timeline Bridge
Uses QWebChannel (timeline_bridge.py) for asynchronous, sub-millisecond communication between the React UI and the Python forensic logic.
OpenGL Visualization
The React frontend leverages the Canvas API and GPU acceleration to render 100k+ events across interactive swimlanes and heatmaps.
Core Data Orchestration
Data Manager (timeline_data_manager.py)
Manages a thread-safe connection pool to multiple artifact databases (MFT, Registry, SRUM, etc.) with optimized time-range indexing.
Timestamp Parser (UniversalTimestampParser)
Forensically normalizes Windows FILETIME, Unix Epoch, Mac Absolute, and OLE dates into standardized UTC ISO 8601 strings.
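The four timestamp formats differ only in their epoch and unit, so normalization reduces to adding an offset to a known base date. The sketch below shows that arithmetic; the function `to_iso_utc` and its `kind` labels are hypothetical and do not reflect the UniversalTimestampParser interface.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def to_iso_utc(value, kind: str) -> str:
    """Convert a raw timestamp value to a UTC ISO 8601 string."""
    if kind == "filetime":        # 100-nanosecond ticks since 1601-01-01
        dt = datetime(1601, 1, 1, tzinfo=UTC) + timedelta(microseconds=value // 10)
    elif kind == "unix":          # seconds since 1970-01-01
        dt = datetime(1970, 1, 1, tzinfo=UTC) + timedelta(seconds=value)
    elif kind == "mac_absolute":  # seconds since 2001-01-01
        dt = datetime(2001, 1, 1, tzinfo=UTC) + timedelta(seconds=value)
    elif kind == "ole":           # days (possibly fractional) since 1899-12-30
        dt = datetime(1899, 12, 30, tzinfo=UTC) + timedelta(days=value)
    else:
        raise ValueError(f"unknown timestamp kind: {kind}")
    return dt.isoformat()


unix_epoch_as_filetime = to_iso_utc(116444736000000000, "filetime")
unix_epoch_as_ole = to_iso_utc(25569, "ole")
```

Both sample values describe the same instant (the Unix epoch), which makes them a convenient sanity check when adding support for another format.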
Optimization Strategies
- Time-Sliced Querying: Prevents UI blocking by sampling and chunking data for different zoom levels.
- Event Aggregation: Condenses raw events into high-level representations (e.g., system sessions) when viewed at macro scales.
- Progressive Loading: Asynchronously fetches data as the investigator pans through the temporal viewport.
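Event aggregation at different zoom levels can be approximated by bucketing timestamps into fixed-width intervals, with a wider bucket for a more zoomed-out view. This is a simplified stand-in for the strategy described above; `aggregate_events` and the bucket sizes are assumptions.

```python
from collections import Counter


def aggregate_events(epoch_seconds, bucket_seconds):
    """Count events per fixed-width time bucket, keyed by bucket start time."""
    counts = Counter(ts - (ts % bucket_seconds) for ts in epoch_seconds)
    return sorted(counts.items())


events = [0, 10, 59, 60, 125]                  # event times in seconds
per_minute = aggregate_events(events, 60)      # zoomed in: one bucket per minute
per_hour = aggregate_events(events, 3600)      # zoomed out: one bucket per hour
```

At the macro scale the UI only needs the five-event hour bucket, while zooming in replaces it with the three per-minute buckets.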
Investigation Workflow
- Initialization: TimelineDialog instantiates the bridge and loads the React build.
- Temporal Mapping: UI queries getTimeBounds() to map the absolute forensic scope of the case.
- Contextual Querying: As the analyst zooms, specific swimlanes (e.g., MFT, Network, Execution) request localized time-slices.
- Correlation: The correlation_engine.py identifies proximity-based relationships between isolated system events.
Dynamic Linking & Intelligence
The Dynamic Linking engine is the intelligence layer of Crow-Eye, responsible for enriching raw artifacts with system-level context and identifying hidden relationships between forensic data points.
Intelligence Engine (intelligence_engine.py)
The core logic that orchestrates link gathering and enrichment. It transforms anonymous IDs into human-readable investigative context.
Identity Enrichment
Automatically resolves Windows SIDs to Usernames and matches AppIDs to their friendly application names using local mapping databases.
Semantic Cross-Linking
Links diverse artifacts (e.g., Prefetch execution to LNK file creation) by identifying shared identifiers like file paths, hashes, or timestamps.
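As an illustration of the SID resolution described under Identity Enrichment, the sketch below checks a case-specific mapping first and then falls back to well-known Windows SIDs and relative identifiers (RIDs). The function `resolve_sid`, the dictionary names, and the sample account are hypothetical; the real engine reads its mappings from local databases.

```python
# Well-known SIDs and RIDs defined by Windows.
WELL_KNOWN_SIDS = {
    "S-1-5-18": "NT AUTHORITY\\SYSTEM",
    "S-1-5-19": "NT AUTHORITY\\LOCAL SERVICE",
    "S-1-5-20": "NT AUTHORITY\\NETWORK SERVICE",
    "S-1-5-32-544": "BUILTIN\\Administrators",
}
WELL_KNOWN_RIDS = {"500": "Administrator", "501": "Guest"}


def resolve_sid(sid: str, local_accounts=None) -> str:
    """Resolve a SID to a readable name, returning the SID itself if unknown."""
    local_accounts = local_accounts or {}
    if sid in local_accounts:              # case-specific mapping wins
        return local_accounts[sid]
    if sid in WELL_KNOWN_SIDS:
        return WELL_KNOWN_SIDS[sid]
    if sid.startswith("S-1-5-21-"):        # machine or domain account
        rid = sid.rsplit("-", 1)[1]
        if rid in WELL_KNOWN_RIDS:
            return WELL_KNOWN_RIDS[rid]
    return sid


# Hypothetical mapping, as it might be loaded from a case database.
case_accounts = {"S-1-5-21-1111-2222-3333-1001": "alice"}
resolved_user = resolve_sid("S-1-5-21-1111-2222-3333-1001", case_accounts)
resolved_system = resolve_sid("S-1-5-18")
```

Returning the original SID for unknown values keeps unresolved identities visible in the output instead of silently dropping them.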
Knowledge Architecture
- Rule Framework: Modular rules located in the rules/ directory define how different artifacts relate to each other.
- Enrichment Database: High-speed IO (io/database.py) for managing the dynamic mapping state and persistent link history.
- Intelligence IO: Handles the import/export of intelligence data for sharing across investigation teams.
Data Management Layer
The data layer handles all database operations, search functionality, and data loading.
Database Architecture
Crow Eye uses SQLite databases for storing parsed artifacts:
- Case Databases: One database per case
- Artifact Tables: Separate tables for each artifact type
- Indexes: Optimized for timestamp and text searches
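The effect of a timestamp index can be checked with SQLite's `EXPLAIN QUERY PLAN`. The sketch below uses an illustrative `mft` table and index name (both assumptions, not Crow Eye's actual schema) and stores timestamps as UTC ISO 8601 strings, which sort lexicographically in chronological order as long as they share one format.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mft (path TEXT, modified_utc TEXT)")

# One synthetic record per minute, starting at midnight UTC.
base = datetime(2024, 1, 1, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO mft VALUES (?, ?)",
    [(f"C:\\file{i}.txt", (base + timedelta(minutes=i)).isoformat()) for i in range(1000)],
)
conn.execute("CREATE INDEX idx_mft_modified ON mft (modified_utc)")

query = "SELECT path FROM mft WHERE modified_utc BETWEEN ? AND ?"
window = (
    (base + timedelta(minutes=10)).isoformat(),
    (base + timedelta(minutes=19)).isoformat(),
)

# The fourth column of each plan row is the human-readable detail text.
plan = " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query, window))
hits = conn.execute(query, window).fetchall()
```

When the plan text mentions the index, SQLite is performing a B-tree range search rather than scanning every row, which is what makes time-range filters practical on tables with millions of records.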
Key Components
- database_manager.py: Connection and transaction management
- base_loader.py: Base class for data loaders
- registry_loader.py: Registry-specific data loading
- mft_loader.py: MFT data loading with virtual tables
- usn_loader.py: USN Journal data loading
- search_engine.py: Full-text search across artifacts
- index_manager.py: Database index optimization
Search Capabilities
- Full-text search across all artifacts
- Timestamp range filtering
- Regular expression support
- Multi-field queries
- Search history tracking
UI Components
The UI layer provides a cyberpunk-themed interface for interacting with forensic data.
Component Factory Pattern
The component_factory.py module creates consistent UI elements:
- Styled tables with custom headers
- Search and filter dialogs
- Progress indicators
- Custom buttons and controls
Key Dialogs
- case_dialog.py: Case creation and management
- search_filter_dialog.py: Advanced search interface
- row_detail_dialog.py: Detailed artifact view
- Loading_dialog.py: Custom loading animations
Virtual Tables
For large datasets (MFT, USN), Crow Eye uses virtual tables:
- On-demand data loading
- Smooth scrolling for millions of records
- Memory-efficient rendering
- Pagination controls
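The on-demand loading behind a virtual table can be sketched without any GUI code: the view asks for one page at a time and only those rows are read from SQLite. The function `fetch_page` and the `usn` table below are hypothetical; in a PyQt5 application this kind of query would typically sit behind a table model that requests more rows as the user scrolls.

```python
import sqlite3


def fetch_page(conn, page, page_size=100):
    """Return one page of rows; page numbers start at 0."""
    return conn.execute(
        "SELECT id, name FROM usn ORDER BY id LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usn (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO usn (id, name) VALUES (?, ?)",
    [(i, f"record-{i}") for i in range(1, 1001)],
)

third_page = fetch_page(conn, page=2)   # rows 201..300
```

`LIMIT`/`OFFSET` is the simplest form of paging; for very deep pages, filtering on the last seen `id` (keyset pagination) avoids skipping over all earlier rows on every request.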
Documentation Reference
For developers, contributors, and deep-dive technical research, Crow Eye maintains extensive markdown documentation within its repository. These guides cover architectural decisions, system orchestration, and contribution standards.
General Documentation
Correlation Engine Documentation
The Correlation Engine is the most actively developed subsystem, with over 10,000 lines of documentation:
Engine Overview
System overview with architecture diagrams.
Engine Docs
Dual-engine architecture, selection guide, optimization.
Architecture
Component integration and data flow.
🪶 Feather Docs
Data normalization system.
🪽 Wings Docs
Correlation rules definitions.
Pipeline Docs
Workflow orchestration.
Artifact Registry
Centralized artifact type definitions.
Weight Precedence
Weight resolution hierarchy.
Config Reload
Live configuration updates.
Interfaces
Dependency injection and testing.
Contrib Guide
Priority areas and dev status.