Rapidhash tackles a common challenge in hashing: how to design fast, reliable 64-bit hash functions that adapt to different deployment environments while keeping the code footprint and instruction count tightly controlled. The project offers three variants tuned for general purpose, high-performance computing (HPC) with cache sensitivity, and mobile or embedded systems with strict code size constraints. This tiered approach based on instruction budget rather than raw speed alone is uncommon and worth a close look.
A family of platform-independent 64-bit hash functions optimized by instruction count
Rapidhash is a collection of three related hash functions targeting different performance and deployment tradeoffs. They are all implemented in C++ and avoid architecture-specific intrinsics, so the same code runs on both x86-64 and ARM64. Each variant compiles to a low instruction count on Clang 18+: approximately 185 instructions for the general-purpose rapidhash, about 140 for rapidhashMicro, which is designed for cache-miss-sensitive HPC and server workloads, and fewer than 100 for rapidhashNano, which targets embedded or mobile contexts with strict code size budgets.
The smaller variants produce the same output as rapidhash for short inputs: rapidhashMicro for inputs up to 80 bytes, and rapidhashNano for inputs up to 48 bytes. Below those thresholds the variants are interchangeable, so developers can pick the balance of code size and performance that suits each environment without changing hash values for typical key sizes.
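The output-compatibility guarantee is easy to smoke-test. The following is a minimal sketch, assuming each hash is exposed as a function taking a pointer and a length and returning a 64-bit value. The helper `first_mismatch` and the two stand-in hashes `toy_full` and `toy_small` are hypothetical and exist only to make the example self-contained; they are not part of rapidhash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed signature for the hashes under test (an assumption, not the library's API).
using HashFn = std::uint64_t (*)(const void*, std::size_t);

// Hypothetical stand-in for the "full" hash: plain 64-bit FNV-1a.
std::uint64_t toy_full(const void* key, std::size_t len) {
    const unsigned char* p = static_cast<const unsigned char*>(key);
    std::uint64_t h = 14695981039346656037ULL;   // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 1099511628211ULL;                   // FNV prime
    }
    return h;
}

// Hypothetical stand-in for a "small" variant: agrees with toy_full up to
// 48 bytes and deliberately diverges beyond that.
std::uint64_t toy_small(const void* key, std::size_t len) {
    const std::uint64_t h = toy_full(key, len);
    return len <= 48 ? h : ~h;
}

// Returns the smallest length in [0, max_len] at which a and b disagree on a
// fixed test buffer, or max_len + 1 if they agree for every length tried.
std::size_t first_mismatch(HashFn a, HashFn b, std::size_t max_len) {
    std::vector<unsigned char> buf(max_len);
    for (std::size_t i = 0; i < max_len; ++i) {
        buf[i] = static_cast<unsigned char>(i * 131u + 7u);
    }
    for (std::size_t len = 0; len <= max_len; ++len) {
        if (a(buf.data(), len) != b(buf.data(), len)) {
            return len;
        }
    }
    return max_len + 1;
}
```

With the real library one would pass the full and reduced variants in place of the stand-ins and expect no mismatch up to the documented threshold (80 or 48 bytes). This checks one buffer per length, so it is a sanity check rather than a proof of equivalence.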
Under the hood, rapidhash avoids specialized CPU instructions, relying on portable C++ code that still achieves excellent throughput and low latency. This universality is a significant advantage for cross-platform software where architecture-specific optimizations can add maintenance overhead or cause inconsistent performance.
Efficient instruction count and platform independence define its technical edge
What sets rapidhash apart is the strict control over instruction count for each variant, which is a rare consideration in hashing libraries. Most high-performance hashes focus purely on throughput or collision resistance, often leveraging architecture-specific intrinsics like SIMD or AES instructions. Rapidhash instead balances these factors with code simplicity and portability.
The instruction count metrics are concrete and impressive: rapidhash compiles to approximately 185 instructions on Clang 18+, rapidhashMicro to about 140 instructions without stack usage, and rapidhashNano to fewer than 100 instructions, also without stack usage. This means that even the heaviest variant keeps its machine code footprint tight, which is crucial for HPC workloads sensitive to cache misses and embedded systems where every byte and cycle matter.
Benchmark numbers from Apple Silicon platforms show rapidhash reaching a peak throughput of 71 GB/s on the M4 CPU and an average latency as low as 1.38 ns on the M3 Pro for small keys. In these results it is generally faster than xxh3, a popular high-speed hash, on Apple Silicon and ARM server platforms. The numbers make rapidhash a serious contender for performance-critical applications without sacrificing portability.
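For readers who want to reproduce a throughput figure on their own hardware, here is a minimal sketch of how a GB/s number is derived: total bytes hashed divided by elapsed wall-clock time. It is not the project's benchmark harness. The function `throughput_gbps` and the stand-in hash `toy_fnv1a` are hypothetical names introduced for illustration, and a byte-at-a-time hash like the stand-in will be far slower than the figures quoted above.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed signature for the hash under test (an assumption, not the library's API).
using HashFn = std::uint64_t (*)(const void*, std::size_t);

// Hypothetical stand-in hash (64-bit FNV-1a) so the sketch compiles on its own.
std::uint64_t toy_fnv1a(const void* key, std::size_t len) {
    const unsigned char* p = static_cast<const unsigned char*>(key);
    std::uint64_t h = 14695981039346656037ULL;
    for (std::size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

// Hashes a buffer of `bytes` bytes `rounds` times and returns the bulk
// throughput in GB/s (1e9 bytes per second).
double throughput_gbps(HashFn hash, std::size_t bytes, int rounds) {
    assert(bytes > 0 && rounds > 0);
    std::vector<unsigned char> buf(bytes, 0xA5);
    volatile std::uint64_t sink = 0;  // keeps the compiler from dropping the calls

    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        buf[0] = static_cast<unsigned char>(r);  // vary the input each round
        sink = sink ^ hash(buf.data(), bytes);
    }
    const auto stop = std::chrono::steady_clock::now();

    const double seconds = std::chrono::duration<double>(stop - start).count();
    return static_cast<double>(bytes) * rounds / seconds / 1e9;
}
```

A call such as `throughput_gbps(toy_fnv1a, 1 << 20, 64)` measures a 1 MiB buffer over 64 rounds. Serious measurements also need warm-up runs, pinned threads, and several buffer sizes, which this sketch leaves out.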
Quality and collision resistance are not afterthoughts. Rapidhash has been validated with the SMHasher and SMHasher3 test suites, which are established tools for hash quality assessment. Collision studies on very large datasets (15 Gi and 62 Gi keys, where 1 Gi = 2^30) show near-ideal collision distributions, with observed collision counts closely matching the values expected from an ideal 64-bit hash. This means rapidhash maintains high-quality hashing behavior even at massive scale.
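The "expected theoretical values" come from the birthday bound: for n keys and an ideal 64-bit hash, the expected number of colliding pairs is n(n-1)/2 divided by 2^64. The sketch below works this out for the two dataset sizes mentioned above. The function name `expected_collisions_64` and the constant `kGi` are illustrative choices, not taken from the project.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// 1 Gi = 2^30 keys.
constexpr std::uint64_t kGi = std::uint64_t{1} << 30;

// Expected number of colliding pairs among n keys hashed by an ideal
// (uniformly random) 64-bit hash: C(n, 2) / 2^64.
double expected_collisions_64(std::uint64_t n) {
    assert(n >= 1);
    const long double pairs =
        static_cast<long double>(n) * static_cast<long double>(n - 1) / 2.0L;
    return static_cast<double>(std::ldexp(pairs, -64));  // divide by 2^64
}
```

`expected_collisions_64(15 * kGi)` evaluates to about 7.03 and `expected_collisions_64(62 * kGi)` to about 120.1, so a well-behaved 64-bit hash should produce collision counts in that neighborhood on datasets of those sizes.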
Production adoption of rapidhash in major projects like Chromium, Folly’s F14 hash map, Fuchsia OS, Julia, and Zig speaks to its real-world reliability and performance.
The tradeoff is clear: rapidhash avoids architecture-specific instructions to keep the codebase maintainable and portable, which may leave some performance on the table compared to intrinsics-heavy implementations. However, for many users, the reduced complexity and consistent behavior across platforms outweigh this.
Explore the project and documentation
Since the repository doesn’t provide direct installation or quickstart commands, the best way to get started is by exploring the repo structure and documentation.
The main source code is organized around the three hash variants, with implementation files likely named accordingly (e.g., rapidhash.cpp, rapidhashMicro.cpp, rapidhashNano.cpp). The README contains detailed benchmark results and explains the design rationale. There are also test suites integrated to validate the quality of the hash functions.
For developers wanting to integrate rapidhash, reviewing the API header files will show the streaming hash API and usage patterns. The documentation provides guidance on the thresholds for identical output and how to select the variant that fits your deployment context.
Given the focus on instruction count and benchmarking, reading the benchmark scripts and performance testing code can provide insights into how rapidhash maintains its edge across different CPUs.
Verdict: a niche but valuable tool for performance-conscious hashing
Rapidhash is highly relevant for developers needing robust, fast, and platform-independent 64-bit hashing with a clear tradeoff between code size and performance. Its three-tiered design lets you pick the best variant for general use, HPC workloads sensitive to cache misses, or embedded/mobile environments with minimal code footprint.
The code is surprisingly clean for such a performance-focused project, clearly balancing complexity and maintainability. Its avoidance of architecture-specific intrinsics means you get consistent behavior regardless of CPU, which can be a big plus in cross-platform systems.
However, if you are targeting a single architecture and want to squeeze out every last cycle of performance, intrinsics-based hashes like xxh3 might still offer advantages. The repo also assumes familiarity with hashing concepts and benchmarking, so newcomers may need some background reading to fully appreciate its strengths.
For production systems where collision resistance, throughput, and portability matter in equal measure, rapidhash is worth understanding and potentially adopting. It solves a real problem of providing a family of hashes tuned by instruction count, which is a clever and pragmatic approach rarely seen in this space.
→ GitHub Repo: Nicoshev/rapidhash ⭐ 824 · C++