ForensiX: ML-powered forensic analysis of Chrome and Brave browser artifacts

Browser forensic analysis tools often rely on parsing raw SQLite databases and logs, but turning that data into actionable intelligence is a challenge. ForensiX addresses this by combining traditional artifact extraction with a machine learning model that classifies URLs and generates behavioral profiles. This approach makes it possible to extract meaningful insights like browsing heatmaps, credential usage frequency, and estimated personal info from Chrome and Brave browser data.

what ForensiX does and how it works

ForensiX is a self-hosted forensic analysis tool focused on Google Chrome and Brave browser artifacts. It uses a client-server architecture where the frontend UI is built with Node.js and the backend processing happens in Python. Evidence is stored in MongoDB, which acts as a central store for extracted artifacts and processed insights.

The tool mounts browser data volumes in read-only mode with hash verification to ensure evidence integrity. It extracts a wide range of browser artifacts including browsing history, login data, autofill forms, downloads, bookmarks, cache, and favicons.
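The hash verification idea can be illustrated with a short Python sketch. This is not ForensiX's code: the function names, the choice of SHA-256, and the manifest layout are assumptions made for illustration. It hashes every file in an evidence directory before and after analysis and reports anything that differs.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash one file in chunks so large cache files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(evidence_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to evidence_dir) to its SHA-256 digest."""
    return {
        str(path.relative_to(evidence_dir)): sha256_file(path)
        for path in sorted(evidence_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return paths that were added, removed, or modified between manifests."""
    return sorted(
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    )
```

Building one manifest when the evidence is acquired and another after processing, then checking that `changed_files` returns an empty list, gives a simple record that the analysis did not alter the data.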

A critical component is the ML model (~700MB) that classifies URLs into categories. This model enriches the raw artifact data, allowing the system to generate behavioral profiles. These profiles include personal information estimations, browsing heatmaps that visualize activity over time, and analysis of credential frequency to spot reused passwords or accounts.
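To make the heatmap idea concrete, here is a minimal sketch of bucketing visit timestamps into a weekday-by-hour grid. Chrome and Brave record visit times as microseconds since 1601-01-01 UTC; how ForensiX itself aggregates them is not described here, so the function names and the 7x24 layout are assumptions.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

# Chrome stores timestamps as microseconds since 1601-01-01 00:00:00 UTC.
WEBKIT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def webkit_to_datetime(microseconds: int) -> datetime:
    """Convert a Chrome/WebKit timestamp to a timezone-aware UTC datetime."""
    return WEBKIT_EPOCH + timedelta(microseconds=microseconds)


def build_heatmap(visit_times: list[int]) -> list[list[int]]:
    """Return a 7x24 grid: rows are weekdays (Monday=0), columns are UTC hours."""
    grid = [[0] * 24 for _ in range(7)]
    for raw in visit_times:
        moment = webkit_to_datetime(raw)
        grid[moment.weekday()][moment.hour] += 1
    return grid
```

The grid uses UTC; an analyst would normally convert to the suspect's local time zone before bucketing, since the hour-of-day pattern is what makes the heatmap meaningful.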

Deployment is designed to be straightforward with Docker and docker-compose, but manual setup using pip and npm is also possible for environments where Docker is not available.

the integration of ML and behavioral profiling as a technical strength

What distinguishes ForensiX is the integration of a sizable ML model for URL classification directly within the forensic pipeline. This is not just a simple database scraper; it applies machine learning to transform raw browsing data into meaningful categories. This adds a layer of semantic understanding that many traditional forensic tools lack.
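The enrichment step can be sketched as follows. The lookup table below is a deliberately trivial stand-in for the trained model, and `toy_classify`, `enrich`, and the `category` field are hypothetical names; the point is only to show a category label being attached to each raw history record.

```python
from urllib.parse import urlparse

# Hypothetical stand-in lookup; ForensiX uses a trained ML model instead.
TOY_CATEGORIES = {
    "github.com": "technology",
    "news.ycombinator.com": "news",
}


def toy_classify(url: str) -> str:
    """Return a category for the URL's host, or 'unknown' if it is not listed."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return TOY_CATEGORIES.get(host, "unknown")


def enrich(records, classify=toy_classify):
    """Return copies of the records with a 'category' field added."""
    return [{**record, "category": classify(record["url"])} for record in records]
```

Passing the classifier as a parameter keeps the extraction code independent of the model, so a real classifier could be swapped in without touching the records themselves.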

The tradeoff here is the footprint and complexity: the ML model is about 700MB, which impacts download times and resource usage. The Docker build process reflects this with a potentially lengthy initial setup due to dependency and model downloads.

The architecture cleanly separates concerns: Node.js handles the frontend UI while Python manages backend data extraction and ML inference. MongoDB provides a flexible schema for storing diverse artifact types and the enriched metadata.

The codebase appears well-structured, with a clear division between client and server directories. This separation improves maintainability and lets developers work on the UI and the backend independently. Using docker-compose for orchestration simplifies the developer experience by bundling services such as MongoDB with the app.

However, the reliance on Docker and a heavyweight ML model might limit adoption in lightweight or resource-constrained environments. The manual installation path mitigates this somewhat but requires more setup effort.

quick start with Docker-compose

Here are the quick-start commands from the README to get ForensiX running:

# Clone the repo
git clone https://github.com/ChmaraX/forensix.git
cd forensix

# Prepare your browser data: run ONE of the two cp commands below,
# since both copy into the same ./data/ directory

# For Chrome (replace with your actual profile path)
cp -r "/Users/username/Library/Application Support/Google/Chrome/Default/." ./data/

# For Brave (replace with your actual profile path)
cp -r "/Users/username/Library/Application Support/BraveSoftware/Brave-Browser/Profile 2/." ./data/

# Build and start the application
docker-compose up --build

This setup will build all services from source, install dependencies, download the ML model (~700MB), and start the UI, server, and MongoDB.
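Before starting a long build, it can help to confirm that the copy step actually populated `./data/`. The sketch below checks for a few file names that Chrome and Brave use inside a profile directory. Which of these files ForensiX actually requires is an assumption, and `missing_artifacts` is a hypothetical helper, not part of the project.

```python
from __future__ import annotations

from pathlib import Path

# File names Chrome and Brave use inside a profile directory. Which of these
# ForensiX actually requires is an assumption, not taken from its README.
EXPECTED_ARTIFACTS = ("History", "Login Data", "Web Data", "Bookmarks", "Favicons")


def missing_artifacts(data_dir: Path, expected=EXPECTED_ARTIFACTS) -> list[str]:
    """Return the expected artifact files that are absent from data_dir."""
    return [name for name in expected if not (data_dir / name).is_file()]
```

Calling `missing_artifacts(Path("data"))` from the repository root returns an empty list when all of the listed files are present.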

Alternatively, manual installation is possible by installing Python dependencies, downloading the model, and running client and server via npm, but the Docker method is recommended for simplicity.

verdict: a solid tool for forensic practitioners comfortable with ML and Docker

ForensiX is a practical open-source tool for forensic analysts working with Chrome and Brave browser data. Its ML-powered URL classification and behavioral profiling elevate it beyond simple artifact extraction.

The Docker-based deployment streamlines setup, though the ML model size and initial build time are tradeoffs worth noting. Manual install caters to those who need more control or cannot use containers.

While it might not fit casual users or those unfamiliar with forensic workflows, ForensiX offers a valuable foundation for researchers and practitioners who want to turn browser artifacts into actionable forensic intelligence.

Its modular architecture makes it a good candidate for extension or integration into larger forensic toolchains.

Overall, this repo is worth exploring if you deal with browser forensics and want to see how ML can augment traditional artifact analysis.


→ GitHub Repo: ChmaraX/forensix ⭐ 240 · JavaScript

Noureddine RAMDI / ForensiX: ML-powered forensic analysis of Chrome and Brave browser artifacts
