Noureddine RAMDI Dinour

Lead Developer & AI Enthusiast — Software Architecture, AI/LLM, Infrastructure Automation

3 results for Metal

Zinc: A Zig-based LLM inference engine optimized for AMD RDNA and Apple Silicon GPUs
Zinc is an LLM inference engine written in Zig that uses Vulkan and Metal to target AMD RDNA and Apple Silicon GPUs, respectively. It supports GGUF quantized models and exposes an OpenAI-compatible API with streaming.
github-stars zig llm gpu-inference vulkan Created Tue, 05 May 2026 13:37:39 +0000
dflash-mlx: Speculative decoding on Apple Silicon with Metal and MLX
dflash-mlx implements exact speculative decoding for language models on Apple Silicon using Metal and MLX. It reduces the number of forward passes by pairing a block-diffusion draft model with per-layer KV cache rollback.
github-stars python machine learning apple silicon metal Created Mon, 04 May 2026 10:23:02 +0000
vllm-mlx: Efficient LLM serving on Apple Silicon with SSD-tiered KV cache and continuous batching
vllm-mlx is a Python inference server for Apple Silicon that supports the OpenAI and Anthropic APIs. It features an SSD-tiered KV cache for long-context agents and continuous batching for higher throughput.
github-stars python apple-silicon machine-learning inference-server Created Mon, 04 May 2026 10:23:02 +0000