SmartScan Android tackles a problem many of us face but rarely see solved well on mobile: how to search your photos and videos by text or by similar image without sending anything to the cloud. It uses on-device AI to generate vector embeddings of your media, then runs similarity search and clustering locally, so the app stays fully offline and privacy-first.
What SmartScan Android does and how it’s built
SmartScan is a Kotlin Android app designed to perform reverse-image search and text search on your local media collection without any cloud dependency. Its core innovation is running all AI inference locally, using ONNX Runtime to execute pre-trained embedding models directly on the device. This means no photos or videos ever leave your phone, addressing privacy concerns common with cloud-based search.
The app uses Jetpack Compose for its UI, reflecting modern Android design principles, and employs a vector search approach: it converts images and text into vector embeddings, which can then be compared for similarity. These embeddings are also clustered to group related media automatically, improving the search and browsing experience.
Embedding inference runs through ONNX Runtime, which ships mobile builds for Android and aims to balance compute efficiency with accuracy. The embedding vectors are stored locally and indexed for similarity search. The app also supports manual tagging with autocomplete to improve search relevance, plus collection management features for organizing media.
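To make the idea of storing embeddings locally more concrete, here is a minimal sketch of packing an embedding into bytes (for a database BLOB or a file) and reading it back. The helper names embeddingToBytes and bytesToEmbedding and the little-endian layout are assumptions for illustration, not SmartScan’s actual storage format.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Hypothetical helper: pack an embedding into a byte array for local storage.
fun embeddingToBytes(embedding: FloatArray): ByteArray {
    val buffer = ByteBuffer.allocate(embedding.size * Float.SIZE_BYTES)
        .order(ByteOrder.LITTLE_ENDIAN)
    buffer.asFloatBuffer().put(embedding)
    return buffer.array()
}

// Hypothetical helper: restore an embedding from bytes written by embeddingToBytes.
fun bytesToEmbedding(bytes: ByteArray): FloatArray {
    require(bytes.size % Float.SIZE_BYTES == 0) { "Byte length must be a multiple of 4" }
    val floats = FloatArray(bytes.size / Float.SIZE_BYTES)
    ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer().get(floats)
    return floats
}
```

A fixed-width binary layout like this keeps reads cheap, since a stored vector can be loaded without any parsing.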
All indexing and inference operations run entirely on-device, keeping the app fully functional offline. The project is distributed as a standalone APK under the GPLv3 license, and its Kotlin codebase is relatively clean and well-structured.
How SmartScan Android handles on-device vector similarity and clustering
The standout technical feature of SmartScan is its implementation of vector similarity search and clustering on Android hardware. Running ONNX models locally is non-trivial due to limited CPU/GPU resources and battery constraints. Using ONNX Runtime allows the app to execute embedding models efficiently without external dependencies.
The app generates vector embeddings for media items. These vectors capture semantic content such as image features or recognized text. Similarity search then uses these vectors to find related items based on distance metrics like cosine similarity.
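The similarity step can be sketched as a brute-force scan: compute cosine similarity between the query embedding and every stored embedding, then keep the best matches. The function names and the Map-based index below are assumptions for this example; the app may well use a different index structure.

```kotlin
import kotlin.math.sqrt

// Cosine similarity between two equal-length vectors; returns 0 for zero-norm input.
fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "Vectors must have the same dimension" }
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    val denom = sqrt(normA) * sqrt(normB)
    return if (denom == 0f) 0f else dot / denom
}

// Hypothetical brute-force search: ids of the k stored embeddings most similar to the query.
fun topK(query: FloatArray, index: Map<Long, FloatArray>, k: Int): List<Long> =
    index.entries
        .map { it.key to cosineSimilarity(query, it.value) }
        .sortedByDescending { it.second }
        .take(k)
        .map { it.first }
```

A linear scan like this is often adequate for a personal photo library; larger collections would likely need an approximate index.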
Clustering groups similar embeddings together, enabling automatic media grouping without manual input. This clustering mechanism is essential to organize large photo libraries and helps users navigate their collections intuitively.
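The article does not say which clustering algorithm SmartScan uses, so the following is only one simple possibility: a greedy threshold ("leader") clustering, where each embedding joins the first cluster whose leader is similar enough and otherwise starts a new cluster. All names and the threshold parameter are hypothetical.

```kotlin
import kotlin.math.sqrt

// Scale a vector to unit length so that a dot product equals cosine similarity.
fun normalize(v: FloatArray): FloatArray {
    val norm = sqrt(v.fold(0f) { acc, x -> acc + x * x })
    return if (norm == 0f) v.copyOf() else FloatArray(v.size) { v[it] / norm }
}

fun dot(a: FloatArray, b: FloatArray): Float {
    var sum = 0f
    for (i in a.indices) sum += a[i] * b[i]
    return sum
}

// Hypothetical greedy clustering: returns lists of input indices, one list per cluster.
fun clusterByThreshold(embeddings: List<FloatArray>, threshold: Float): List<List<Int>> {
    val leaders = mutableListOf<FloatArray>()      // first member of each cluster, normalized
    val clusters = mutableListOf<MutableList<Int>>()
    embeddings.forEachIndexed { i, raw ->
        val v = normalize(raw)
        val match = leaders.indexOfFirst { dot(it, v) >= threshold }
        if (match >= 0) {
            clusters[match].add(i)
        } else {
            leaders.add(v)
            clusters.add(mutableListOf(i))
        }
    }
    return clusters
}
```

This approach needs no preset cluster count, which suits photo libraries of unknown variety, though the result depends on input order and on the chosen threshold.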
One tradeoff is that on-device processing can be slower than cloud-based inference, especially on lower-end devices. The app balances this by optimizing model size and using efficient indexing structures. However, initial indexing can still take noticeable time, and battery consumption is a consideration.
The code quality is solid, with clear separation between UI (Jetpack Compose), embedding inference (ONNX Runtime wrappers), and data management layers. The use of Kotlin coroutines for asynchronous tasks improves responsiveness and user experience.
Here’s a simplified snippet illustrating how ONNX Runtime might be used to generate embeddings in the app:
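import ai.onnxruntime.*

val env = OrtEnvironment.getEnvironment()
val session = env.createSession(modelPath, OrtSession.SessionOptions())
val inputTensor = OnnxTensor.createTensor(env, inputData)  // inputData: preprocessed float array
val output = session.run(mapOf(inputName to inputTensor))
// Assumes the model returns a [1, dim] float tensor; take the first row as the embedding.
val embedding = (output[0].value as Array<FloatArray>)[0]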
This snippet shows the core of running a model inference to get an embedding vector, a critical step in the app’s search pipeline.
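Before inference, an image has to be turned into the float input the model expects. The sketch below assumes a model that takes channel-first RGB values scaled to [0, 1]; the function name pixelsToChw and the scaling are assumptions, and a real embedding model may also require resizing and mean/std normalization.

```kotlin
// Hypothetical preprocessing: convert packed ARGB pixels (the format returned by
// Android's Bitmap.getPixels) into a channel-first float array of size 3 * width * height.
fun pixelsToChw(pixels: IntArray, width: Int, height: Int): FloatArray {
    require(pixels.size == width * height) { "Pixel count must equal width * height" }
    val plane = width * height
    val out = FloatArray(3 * plane)
    for (i in 0 until plane) {
        val p = pixels[i]
        out[i] = ((p shr 16) and 0xFF) / 255f          // red plane
        out[plane + i] = ((p shr 8) and 0xFF) / 255f   // green plane
        out[2 * plane + i] = (p and 0xFF) / 255f       // blue plane
    }
    return out
}
```

The resulting array can be wrapped in a FloatBuffer and passed to OnnxTensor.createTensor together with a shape such as [1, 3, height, width], assuming the model uses that layout.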
Explore the project and understand its structure
The repo follows the typical structure of an Android app built with Kotlin and Jetpack Compose. Key directories include app/src/main/java for source code, with well-organized packages separating UI, data, and inference logic.
The README provides a detailed explanation of the app’s features and architecture but does not include installation or quickstart commands. The project is intended for developers comfortable with Android development who want to explore on-device ML applications.
Documentation covers the embedding models used, clustering algorithms, and data indexing strategies, which are worth reading to understand the app’s design decisions.
If you want to dive deeper, start by looking at the ONNX model integration code in the inference package, then examine the clustering implementation and how the UI binds to these data layers.
Verdict: who should consider SmartScan Android?
SmartScan Android is a solid example of privacy-first, on-device AI for media search. It’s relevant for Android developers interested in ML on edge devices, especially those exploring vector search and clustering without cloud dependency.
The app’s architecture is a practical demonstration of using ONNX Runtime on mobile hardware, balancing compute constraints and functionality. It’s also a reminder that fully offline AI experiences are achievable, though with tradeoffs in speed and battery use.
On the downside, the app may feel slow on older devices during indexing, and the feature set is focused on search and grouping without broader editing or sharing capabilities.
Overall, SmartScan is worth a look if you want to understand how to implement on-device vector embeddings and similarity search in Kotlin Android apps. It’s a useful reference for anyone building privacy-conscious media management tools or experimenting with mobile AI.
→ GitHub Repo: dev-diaries41/smartscan ⭐ 437 · Kotlin