Noureddine RAMDI Dinour

Lead Developer & AI Enthusiast — Software Architecture, AI/LLM, Infrastructure Automation

30 results for PyTorch

OpenThinkIMG: Modular vision tool orchestration for enhanced multimodal inference
OpenThinkIMG enables modular orchestration of independent vision tools for enhanced inference workflows using PyTorch and service-based architecture. Clear quickstart included.
github-stars python pytorch vision multimodal Created Mon, 06 Jul 2026 15:15:52 +0000
ReconViaGen: a two-stage generative diffusion pipeline for multi-view 3D reconstruction
ReconViaGen reconstructs high-quality 3D objects from multiple images using a two-stage diffusion model. It runs inference on consumer GPUs and supports modular experimentation.
github-stars 3d-reconstruction diffusion-models python pytorch Created Mon, 06 Jul 2026 15:15:52 +0000
APISR: a Python toolkit for AI-based image and video super-resolution with practical inference modes
APISR is a Python repo for AI-powered image and video super-resolution, offering fast Gradio inference and full-featured regular inference with dataset curation tools.
github-stars python pytorch super-resolution ai Created Sat, 23 May 2026 20:41:14 +0000
DeepSpeed: scalable deep learning optimization with extensible hardware support
DeepSpeed is a Python library that optimizes large-scale deep learning training with multi-hardware support and JIT-compiled CUDA extensions. Explore its architecture, strengths, and quick installation.
github-stars python deep-learning pytorch cuda Created Sat, 23 May 2026 20:41:14 +0000
DualSDF: A two-level signed distance function approach for semantic 3D shape manipulation
DualSDF separates coarse semantic structure from fine geometric detail in 3D shape modeling using a two-level signed distance function. It enables intuitive shape edits with pretrained models and a WebGL demo.
github-stars 3d pytorch signed-distance-function shape-manipulation Created Sat, 23 May 2026 20:41:14 +0000
Fast3R: scalable multi-view 3D reconstruction with a single forward pass
Fast3R from Meta FAIR processes 1000+ unordered images simultaneously for 3D reconstruction using a ViT-Large backbone and multi-view attention, eliminating iterative matching.
github-stars 3d-reconstruction computer-vision pytorch transformers Created Sat, 23 May 2026 20:41:14 +0000
Hivemind: decentralized peer-to-peer deep learning with PyTorch
Hivemind is a PyTorch library enabling decentralized deep learning over the internet using a peer-to-peer Distributed Hash Table (DHT). It supports fault-tolerant training and decentralized parameter averaging without global sync.
github-stars python pytorch distributed-training decentralized Created Sat, 23 May 2026 20:41:14 +0000
MASt3R-SLAM: integrating foundation-model 3D priors into real-time dense SLAM
MASt3R-SLAM integrates a pretrained 3D reconstruction model as a geometry prior in a dense SLAM pipeline, enabling real-time tracking and mapping without classical bundle adjustment or depth sensors.
github-stars slam computer-vision 3d-reconstruction pytorch Created Sat, 23 May 2026 20:41:14 +0000
OmniGen2: a unified multimodal generation model with separate decoding paths for text and images
OmniGen2 unifies visual understanding, text-to-image generation, and image editing using distinct decoding pathways for text and images. It is built on Qwen2.5-VL and supports CPU offloading for accessibility.
github-stars multimodal deep-learning pytorch image-generation Created Sat, 23 May 2026 20:41:14 +0000
PartCrafter: compositional 3D mesh generation with latent diffusion transformers
PartCrafter generates multiple semantically distinct 3D mesh parts from a single RGB image using latent diffusion transformers, enabling structured 3D generation with pretrained models and VLM-based part suggestions.
github-stars python latent-diffusion 3d-mesh computer-vision Created Sat, 23 May 2026 20:41:14 +0000
SVFR: unified video face restoration with task-conditioned stable video diffusion
SVFR combines blind face restoration, colorization, and inpainting in a single stable video diffusion model, enabling efficient multi-task video face enhancement.
github-stars python video-restoration diffusion-models deep-learning Created Sat, 23 May 2026 20:41:14 +0000
CodeFormer: Deep learning-based blind face restoration with fidelity control
CodeFormer uses a codebook transformer architecture for blind face restoration, letting users control the tradeoff between quality and fidelity through a fidelity weight parameter.
github-stars python deep-learning face-restoration computer-vision Created Tue, 05 May 2026 13:37:39 +0000
AniGen: GPU-accelerated 3D animation generation with Python and CUDA
AniGen is a Linux-only Python project for 3D animation generation using NVIDIA GPUs and CUDA. It integrates PyTorch, spconv, and PyTorch3D, with a setup script that handles the complex dependencies.
github-stars python cuda pytorch 3d-animation Created Mon, 04 May 2026 10:23:02 +0000
ComfyUI-Trellis2: Extending ComfyUI with DINOv3 for 3D-Aware Diffusion Workflows
ComfyUI-Trellis2 integrates Facebook’s DINOv3 model into ComfyUI for advanced 3D-aware diffusion workflows. This article breaks down its architecture, strengths, and installation steps.
github-stars python comfyui diffusion-models dinov3 Created Mon, 04 May 2026 10:23:02 +0000
DIMO: Distilling Diverse 3D Motion Priors for Arbitrary Object Motion Synthesis
DIMO distills motion priors from text-conditioned and multi-view video models into a shared latent space, enabling diverse 3D motion generation for arbitrary objects using 3D Gaussian splatting and 4D rendering.
github-stars python pytorch 3d-motion 3d-gaussian-splatting Created Mon, 04 May 2026 10:23:02 +0000
DROID-W: extending SLAM to dynamic, in-the-wild scenes with uncertainty estimation
DROID-W builds on DROID-SLAM to handle dynamic scenes in the wild by jointly estimating camera pose, scene structure, and dynamic uncertainty using Lie group optimization and metric depth estimation.
github-stars slam dynamic-scenes lie-group-optimization gaussian-rasterization Created Mon, 04 May 2026 10:23:02 +0000
Falcon-Perception: a minimal multimodal PyTorch engine for object detection, segmentation, and OCR
Falcon-Perception is a PyTorch engine for multimodal autoregressive Transformers handling detection, segmentation, and OCR with FlexAttention and efficient caching.
github-stars pytorch multimodal transformers cuda Created Mon, 04 May 2026 10:23:02 +0000
Omni-Diffusion: unified any-to-any multimodal generation with masked discrete diffusion
Omni-Diffusion models text, image, and speech tokens jointly via masked discrete diffusion, enabling any-to-any multimodal generation with a single unified model.
github-stars python multimodal diffusion-model machine-learning Created Mon, 04 May 2026 10:23:02 +0000
PEAR: real-time expressive 3D human mesh recovery at 100 FPS
PEAR predicts expressive 3D human mesh parameters for body, hands, and face simultaneously at 100 FPS using a pixel-aligned architecture based on PyTorch and SMPL-X models.
github-stars python pytorch 3d-human-mesh smpl-x Created Mon, 04 May 2026 10:23:02 +0000
Streaming 3D scene reconstruction with LingBot-Map’s geometric context transformer
LingBot-Map performs streaming 3D reconstruction from long image sequences at ~20 FPS using a geometric context transformer and paged KV cache attention for efficient memory management.
github-stars python 3d-reconstruction transformers streaming-inference Created Mon, 04 May 2026 10:23:02 +0000