Noureddine RAMDI / Leo Health Core: local-first parsing of massive health data with SAX streaming in Python

Created Mon, 04 May 2026 10:23:01 +0000 Modified Sat, 23 May 2026 20:41:27 +0000

sandseb123/Leo-Health-Core

Apple Health exports can reach several gigabytes in XML format, making it a challenge to process them efficiently on typical laptops without excessive memory use or long wait times. Leo Health Core tackles this problem head-on by using a SAX streaming XML parser that processes the data incrementally, avoiding the need to load the entire file into memory. Paired with a file watcher that automatically detects new data exported via AirDrop, it creates a smooth local pipeline for health data ingestion and exploration.

what Leo Health Core does and how it’s built

Leo Health Core is a zero-dependency Python command-line interface (CLI) tool that parses Apple Health XML exports and Whoop CSV files, consolidating heterogeneous wearable data into a single SQLite database. Its core function is to normalize these different data formats into a unified schema stored locally in SQLite, which users can query with standard SQL.
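
To make the idea of a unified schema concrete, here is a minimal sketch of how rows from both sources could be normalized into one table. The table name, column names, and the Whoop CSV column names (`timestamp`, `resting_hr`) are hypothetical and are not taken from the Leo Health Core source. Only the Apple Health `Record` attributes (`type`, `value`, `unit`, `startDate`) follow the export format.

```python
import sqlite3

# Hypothetical unified schema: names are illustrative, not the project's own.
SCHEMA = """
CREATE TABLE IF NOT EXISTS measurements (
    source      TEXT NOT NULL,   -- 'apple_health' or 'whoop'
    metric      TEXT NOT NULL,   -- metric name after normalization
    value       REAL NOT NULL,
    unit        TEXT,
    recorded_at TEXT NOT NULL    -- timestamp string as found in the export
);
"""


def normalize_apple_record(attrs):
    """Map the attributes of an Apple Health <Record> element to one row.

    Assumes a numeric value; category records with text values would need
    separate handling.
    """
    return (
        "apple_health",
        attrs["type"],
        float(attrs["value"]),
        attrs.get("unit"),
        attrs["startDate"],
    )


def normalize_whoop_row(row):
    """Map one Whoop CSV row (a dict) to one row.

    The keys 'timestamp' and 'resting_hr' are assumed for illustration.
    """
    return (
        "whoop",
        "resting_heart_rate",
        float(row["resting_hr"]),
        "count/min",
        row["timestamp"],
    )


def store(conn, rows):
    """Create the table if needed and insert the normalized rows."""
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO measurements VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
```

Once both sources share one table, a single query such as `SELECT source, AVG(value) FROM measurements GROUP BY source` can compare them without any source-specific code.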

The architecture is deliberately minimal and privacy-focused: the tool runs entirely locally, never making network requests. It exposes data through a localhost-only web dashboard and a terminal UI, enabling users to explore their health metrics without sending sensitive data to the cloud.
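
A localhost-only dashboard comes down to which address the server binds to. The sketch below uses the standard library's `http.server` to show the idea. The handler, its placeholder page, and the `make_server` helper are hypothetical and do not reflect how Leo Health Core implements its dashboard.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class DashboardHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that serves a single placeholder page."""

    def do_GET(self):
        body = b"<h1>Local dashboard placeholder</h1>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=0):
    """Create a server bound to the loopback interface only.

    Binding to 127.0.0.1 rather than 0.0.0.0 means other machines on the
    network cannot connect. Port 0 lets the OS pick a free port.
    """
    return ThreadingHTTPServer(("127.0.0.1", port), DashboardHandler)
```

Calling `make_server(8000).serve_forever()` would then serve the page at `http://127.0.0.1:8000/` for browsers on the same machine only.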

The stack is pure Python with no external dependencies, which makes it lightweight and easy to install. It uses SAX streaming parsing for Apple Health’s XML files, which are often very large (up to 4GB). SAX (Simple API for XML) is an event-driven API that reads XML sequentially, emitting parsing events without loading the entire document into memory. This approach is crucial for handling large exports efficiently.
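
As an illustration of the event-driven model, here is a minimal sketch using Python's built-in `xml.sax` module. It counts `Record` elements by their `type` attribute without ever holding the whole document in memory. The `RecordCounter` class and `count_records` function are hypothetical names, and the project's actual handler may be structured differently.

```python
import io
import xml.sax


class RecordCounter(xml.sax.ContentHandler):
    """Counts <Record> elements per type as the parser streams past them."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called once per opening tag; only the current element's
        # attributes are in memory at this point.
        if name == "Record":
            record_type = attrs.get("type", "unknown")
            self.counts[record_type] = self.counts.get(record_type, 0) + 1


def count_records(source):
    """Parse a file path or binary stream and return {type: count}."""
    handler = RecordCounter()
    xml.sax.parse(source, handler)
    return handler.counts
```

Memory use depends on the size of the `counts` dictionary rather than on the size of the file, which is the property that makes multi-gigabyte exports tractable.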

A file watcher polls the Downloads folder (or wherever exports arrive via AirDrop) every 10 seconds and automatically ingests any new files it finds. Users can export their data from Apple Health or Whoop on their devices and have it ingested without running any manual commands.
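
A polling watcher of this kind can be sketched with the standard library alone. The file patterns, the function names, and the `ingest` callback below are assumptions made for illustration. The 10-second default mirrors the interval described above.

```python
import time
from pathlib import Path


def find_new_exports(folder, seen, patterns=("*.zip", "*.xml", "*.csv")):
    """Return files in `folder` not seen before, and record them in `seen`.

    The patterns are assumed for illustration; a real watcher would likely
    match export file names more strictly.
    """
    new_files = []
    for pattern in patterns:
        for path in sorted(Path(folder).glob(pattern)):
            if path not in seen:
                seen.add(path)
                new_files.append(path)
    return new_files


def watch(folder, ingest, interval=10.0):
    """Poll `folder` forever, calling `ingest(path)` for each new file."""
    seen = set()
    while True:
        for path in find_new_exports(folder, seen):
            ingest(path)
        time.sleep(interval)
```

For example, `watch(Path.home() / "Downloads", print)` would print each new matching file as it appears. A production watcher would also need to confirm that a file has finished transferring before ingesting it, which this sketch does not attempt.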

Docker support allows running the tool in containers on macOS, Linux, and Windows, enabling consistent cross-platform deployment.

technical strengths and tradeoffs under the hood

The standout technical feature is the use of SAX streaming parsing for Apple Health XML. Many tools try to parse such large XML files by loading them fully into memory, which leads to high RAM usage and slow performance. By contrast, Leo Health Core’s streaming parser uses approximately 8MB of RAM and parses files in under 60 seconds, according to the README metrics.

This streaming approach is a tradeoff: it requires writing event-driven parsing logic that is more complex than simple DOM parsing, but it pays off with a much smaller memory footprint and faster processing times. It also means the parser can start processing and storing data immediately instead of waiting for the entire file to be read.
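
The point about storing data immediately can be sketched as a SAX handler that writes to SQLite in batches while parsing is still in progress. The `records` table, the batch size, and the class and function names are hypothetical choices made for this example, not the project's implementation.

```python
import io
import sqlite3
import xml.sax


class BatchingHandler(xml.sax.ContentHandler):
    """Buffers <Record> rows and writes them to SQLite in fixed-size batches."""

    def __init__(self, conn, batch_size=1000):
        super().__init__()
        self.conn = conn
        self.batch_size = batch_size
        self.batch = []

    def startElement(self, name, attrs):
        if name != "Record":
            return
        self.batch.append(
            (attrs.get("type"), attrs.get("value"), attrs.get("startDate"))
        )
        # Write as soon as the buffer is full, long before the file ends.
        if len(self.batch) >= self.batch_size:
            self.flush()

    def endDocument(self):
        # Write whatever is left once the parser reaches the end.
        self.flush()

    def flush(self):
        if self.batch:
            self.conn.executemany(
                "INSERT INTO records VALUES (?, ?, ?)", self.batch
            )
            self.conn.commit()
            self.batch.clear()


def ingest(source, conn, batch_size=1000):
    """Stream a file path or binary stream into the hypothetical table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records "
        "(type TEXT, value TEXT, start_date TEXT)"
    )
    xml.sax.parse(source, BatchingHandler(conn, batch_size))
```

Batching keeps at most `batch_size` rows in memory at once while avoiding the cost of one transaction per record.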

Another strength is the zero-dependency Python implementation. Not relying on external libraries reduces installation friction and potential security risks from dependencies. However, this also means the codebase carries all parsing and database logic internally, which might limit extensibility or require more maintenance.

The local privacy-first architecture is a deliberate design choice. Exposing data only on localhost and keeping everything on-device means no data is sent to external servers. This is ideal for privacy-conscious users but limits integration possibilities with cloud services or remote analytics.

The file watcher automates ingestion by polling for new files every 10 seconds, so a new export may wait up to 10 seconds before ingestion starts. It also assumes that exports land predictably in the monitored folders.

The SQLite database schema unifies data from different wearables but likely simplifies some device-specific metrics to fit a common model. This normalization tradeoff favors querying convenience over capturing every nuance.

Overall, the code is surprisingly clean for a zero-dependency project, with clear module separation between parsers, database interaction, and UI layers. The terminal UI and web dashboard provide accessible ways to explore data without needing custom queries.

quick start with Leo Health Core

Installation instructions are straightforward for users with Python 3.9+:

git clone https://github.com/sandseb123/Leo-Health-Core.git
cd Leo-Health-Core
pip3 install -e .

After installation, three commands are available globally on macOS:

leo          # view your health dashboard
leo-watch    # start watching Downloads for new exports
leo-dash     # open full web dashboard in browser

If pip3 is not found, the README suggests trying pip install -e . or installing Python 3.9+ from python.org.

For those wanting cross-platform compatibility or containerized deployment, Docker support is provided; consult the repository README for the exact Docker commands.

verdict: who should consider Leo Health Core

Leo Health Core is relevant for developers, data enthusiasts, or privacy-conscious users who want a local-first solution to unify and explore their Apple Health and Whoop data. Its zero-dependency Python CLI is lightweight and accessible, especially for those comfortable with command-line tools.

The SAX streaming parser and file watcher create a smooth experience for handling large exports without overloading system memory, a common pain point with Apple Health XML files. The SQLite backend with a unified schema provides a solid base for custom SQL queries or integration with dashboards.

Limitations include the local-only operation without cloud sync, the minimalistic normalization schema which may omit device-specific details, and some setup friction for users unfamiliar with Python environments. The file watcher polling interval may also introduce slight ingestion delays.

In sum, this repo solves a real problem with a pragmatic, privacy-first approach that balances performance and simplicity. For anyone looking to build personal health data tooling or research wearable data locally, Leo Health Core is worth exploring.


→ GitHub Repo: sandseb123/Leo-Health-Core ⭐ 83 · Python