Noureddine RAMDI Dinour

Lead Developer & AI Enthusiast — Software Architecture, AI/LLM, Infrastructure Automation

18 results for GPU

ReconViaGen: a two-stage generative diffusion pipeline for multi-view 3D reconstruction
ReconViaGen reconstructs high-quality 3D objects from multiple images using a two-stage diffusion model. It runs inference on consumer GPUs and supports modular experimentation.
github-stars 3d-reconstruction diffusion-models python pytorch Created Mon, 06 Jul 2026 15:15:52 +0000
A structured GPU performance engineering curriculum from fundamentals to frontier labs
A curated GPU performance engineering curriculum focusing on CUDA, kernel optimization, and NVIDIA architectures, guiding engineers from fundamentals to advanced production techniques.
github-stars gpu cuda cutlass triton Created Sat, 23 May 2026 20:41:14 +0000
deck.gl-raster: GPU-accelerated client-side rendering of massive geospatial rasters
deck.gl-raster streams and renders huge Cloud-Optimized GeoTIFFs entirely in-browser using WebGL2, avoiding servers and preprocessing. It enables fast, scalable geospatial visualization of raw raster data.
github-stars typescript deck.gl webgl2 geospatial Created Sat, 23 May 2026 20:41:14 +0000
Inside Mini-SGLang: A clear and modular Python LLM inference engine
Mini-SGLang is a modular Python reimplementation of the SGLang LLM inference engine with production features like Radix Cache, chunked prefill, overlap scheduling, and tensor parallelism.
github-stars python llm inference gpu Created Sat, 23 May 2026 20:41:14 +0000
OCRFlux: GPU-Accelerated OCR with Python for High-Performance Document Processing
OCRFlux is a Python OCR tool optimized for NVIDIA GPUs, built for fast, high-quality OCR on documents. It installs into a conda environment and relies on poppler-utils for PDF rendering.
github-stars python ocr gpu conda Created Sat, 23 May 2026 20:41:14 +0000
SceneSmith: AI-driven pipeline for physics-ready 3D indoor scene generation from text
SceneSmith uses GPT-5-powered agents to generate physically plausible 3D indoor scenes from text prompts, ready for robotics simulation without manual cleanup.
github-stars python 3d-generation ai-agents robotics-simulation Created Sat, 23 May 2026 20:41:14 +0000
VisoMaster Fusion: a portable Windows app bundling multiple AI face-swapping models
VisoMaster Fusion bundles over a dozen AI face-swapping models into a portable Windows desktop app with automatic runtime setup, simplifying the complex AI video editing workflow.
github-stars python ai face-swapping gpu Created Sat, 23 May 2026 20:41:14 +0000
Repurposing the ASRock AMD BC-250: Community-driven firmware unlocking on PS5-derived silicon
The ASRock AMD BC-250 mining board uses PS5-derived silicon with 6 Zen 2 cores and a 24CU RDNA2 GPU sharing 16GB GDDR6. This repo documents community firmware mods and Linux GPU support.
github-stars amd linux firmware gpu Created Tue, 05 May 2026 16:46:42 +0000
TurboOCR: a GPU-accelerated OCR server optimized for raw pixel input and high throughput
TurboOCR is a C++/CUDA OCR server leveraging TensorRT FP16 for high throughput and low latency, featuring a zero-decode pixel pipeline and multi-protocol API.
github-stars cpp cuda ocr tensorrt Created Tue, 05 May 2026 13:37:39 +0000
NVIDIA Warp: JIT-compiling Python for CUDA-powered differentiable physics
NVIDIA Warp lets you write Python functions JIT-compiled into CUDA kernels for GPU-accelerated differentiable physics and ML integration, simplifying GPU programming in Python.
github-stars python cuda gpu jit-compilation Created Mon, 04 May 2026 10:23:03 +0000
AniGen: GPU-accelerated 3D animation generation with Python and CUDA
AniGen is a Linux-only Python project for 3D animation generation using NVIDIA GPUs and CUDA. It integrates PyTorch, spconv, and pytorch3d, with a setup script that handles the complex dependencies.
github-stars python cuda pytorch 3d-animation Created Mon, 04 May 2026 10:23:02 +0000
Lucebox Hub: hand-optimized CUDA kernels for efficient LLM inference on RTX 3090 and beyond
Lucebox Hub optimizes LLM inference on consumer GPUs using a megakernel CUDA approach and speculative decoding, achieving high throughput on RTX 3090 and newer NVIDIA GPUs.
github-stars cuda llm gpu inference Created Mon, 04 May 2026 10:23:02 +0000
NVIDIA open GPU kernel modules: a pragmatic architecture for Linux GPU drivers
NVIDIA’s open GPU kernel modules split driver code into pre-built OS-agnostic binaries and thin kernel interface layers, avoiding recompilation on Linux kernel updates. Here’s how it works.
github-stars linux gpu kernel-modules nvidia Created Mon, 04 May 2026 10:23:02 +0000
Recreating the 3dfx Voodoo GPU in SpinalHDL for FPGA and cycle-accurate simulation
SpinalVoodoo rebuilds the classic 3dfx Voodoo Graphics GPU in SpinalHDL, targeting FPGA synthesis and cycle-accurate simulation with a focus on perspective-corrected texture mapping and fixed-point interpolation.
github-stars hardware-design fpga spinalhdl gpu Created Mon, 04 May 2026 10:23:02 +0000
claude-shorts: AI-driven pipeline for viral vertical video clips from long-form content
claude-shorts uses AI scoring, GPU transcription, and adaptive video reframing to extract viral-ready vertical clips from long videos, optimizing cuts with audio-aware snapping and platform-specific encoding.
github-stars python video-processing ai gpu Created Mon, 04 May 2026 10:23:01 +0000
Cupid: feed-forward 3D reconstruction with joint camera pose estimation from single images
Cupid is a feed-forward 3D reconstruction model that jointly estimates camera pose and reconstructs 3D objects from single 2D images, outputting textured 3D meshes and radiance fields in seconds.
github-stars 3d-reconstruction computer-vision deep-learning cuda Created Mon, 04 May 2026 10:23:01 +0000
DeepEP: Optimizing communication for large Mixture-of-Experts models with CUDA kernels
DeepEP is a CUDA-based communication library designed for Mixture-of-Experts models, delivering high-throughput GPU kernels with NVLink and RDMA support for efficient expert parallelism.
github-stars cuda gpu mixture-of-experts expert-parallelism Created Sat, 02 May 2026 20:07:04 +0000
vLLM: Efficient large language model serving with paged attention and continuous batching
vLLM is a Python library for high-throughput LLM inference using paged attention and continuous batching. It supports quantization, distributed inference, and an OpenAI-compatible API.
github-stars python llm inference gpu Created Sat, 02 May 2026 20:07:04 +0000