Noureddine RAMDI Dinour

Lead Developer & AI Enthusiast — Software Architecture, AI/LLM, Infrastructure Automation

28 results for Computer-Vision

3D-RE-GEN: reconstructing editable 3D indoor scenes from a single photo with multi-model AI orchestration
3D-RE-GEN reconstructs complete editable 3D indoor scenes from a single RGB photo. It integrates SAM, Hunyuan3D-2.0, and VGGT models in a modular Python pipeline.
github-stars 3d-reconstruction python computer-vision ai Created Sat, 23 May 2026 20:41:14 +0000
Autodistill: Automating vision model distillation from foundation models to edge deployables
Autodistill automates the pipeline from large foundation models to edge-ready vision models using plugins and a natural language ontology for zero-shot labeling.
github-stars python computer-vision machine-learning model-distillation Created Sat, 23 May 2026 20:41:14 +0000
Comic Translate: AI-driven multi-language comic translation with full-page context
Comic Translate translates comics across languages with a multi-step pipeline that combines speech bubble detection, OCR, and LLMs given full-page context.
github-stars python llm ocr computer-vision Created Sat, 23 May 2026 20:41:14 +0000
Fast3R: scalable multi-view 3D reconstruction with a single forward pass
Fast3R from Meta FAIR processes 1000+ unordered images simultaneously for 3D reconstruction using a ViT-Large backbone and multi-view attention, eliminating iterative matching.
github-stars 3d-reconstruction computer-vision pytorch transformers Created Sat, 23 May 2026 20:41:14 +0000
MASt3R-SLAM: integrating foundation-model 3D priors into real-time dense SLAM
MASt3R-SLAM integrates a pretrained 3D reconstruction model as a geometry prior in a dense SLAM pipeline, enabling real-time tracking and mapping without classical bundle adjustment or depth sensors.
github-stars slam computer-vision 3d-reconstruction pytorch Created Sat, 23 May 2026 20:41:14 +0000
PartCrafter: compositional 3D mesh generation with latent diffusion transformers
PartCrafter generates multiple semantically distinct 3D mesh parts from a single RGB image using latent diffusion transformers, enabling structured 3D generation with pretrained models and VLM-based part suggestions.
github-stars python latent-diffusion 3d-mesh computer-vision Created Sat, 23 May 2026 20:41:14 +0000
Pixal3D: pixel-aligned 3D asset generation from a single image with projection conditioning
Pixal3D generates high-fidelity 3D assets with PBR textures from a single image using pixel-aligned projection conditioning. It offers a three-stage cascade and a low-VRAM mode for consumer GPUs.
github-stars python 3d-generation pbr-texturing deep-learning Created Sat, 23 May 2026 20:41:14 +0000
SAM3-UNet: Adapting Meta's SAM3 for efficient dense prediction with a lightweight U-Net decoder
SAM3-UNet adapts Meta’s SAM3 foundation model for dense prediction tasks using a parameter-efficient adapter and U-Net decoder, enabling training with under 6 GB of GPU memory.
github-stars python computer-vision segmentation deep-learning Created Sat, 23 May 2026 20:41:14 +0000
Tencent HY-World 2.0: multi-modal pipeline for persistent, editable 3D world generation
Tencent’s HY-World 2.0 generates persistent 3D assets from text, images, or video using a four-stage pipeline. It outputs editable worlds compatible with Blender, Unity, and Unreal Engine.
github-stars 3d generative-ai python deep-learning Created Sat, 23 May 2026 20:41:14 +0000
CodeFormer: Deep learning-based blind face restoration with fidelity control
CodeFormer uses a codebook transformer architecture for blind face restoration, letting users control the tradeoff between quality and fidelity with a fidelity weight parameter.
github-stars python deep-learning face-restoration computer-vision Created Tue, 05 May 2026 13:37:39 +0000
OVIE: Monocular novel view synthesis without multi-view supervision
OVIE trains novel view synthesis models using unpaired internet images, avoiding the need for calibrated multi-view datasets. It uses Vision Transformers and foundation models for pose and depth encoding.
github-stars computer-vision novel-view-synthesis vision-transformer monocular-3d Created Tue, 05 May 2026 13:37:39 +0000
StereoWorld: stereo vision-based 3D-consistent video generation from binocular inputs
StereoWorld uses binocular stereo vision cues to guide 3D-consistent stereo video generation, offering a biologically inspired approach to scene geometry understanding.
github-stars 3d stereo vision video generation world models Created Tue, 05 May 2026 13:37:39 +0000
Awesome-Deblurring: A comprehensive academic resource on image and video deblurring techniques
Awesome-Deblurring compiles 100+ key papers tracing image and video deblurring from classical optimization to modern deep learning, serving as a go-to bibliography for researchers and developers.
github-stars image-processing computer-vision deep-learning academic-resource Created Mon, 04 May 2026 10:23:02 +0000
MotionCrafter: unified 4D geometry and motion reconstruction from monocular video
MotionCrafter jointly reconstructs 4D geometry and dense motion from monocular video using a unified 4D VAE, eliminating post-optimization. This Python framework offers training and visualization tools.
github-stars python computer-vision video-diffusion 4d-vae Created Mon, 04 May 2026 10:23:02 +0000
MultiWorld: a unified framework for multi-agent multi-view video world modeling
MultiWorld offers a unified framework for multi-agent multi-view video world modeling using a frozen VGGT backbone for implicit 3D understanding. It supports scalable multi-agent control and autoregressive inference.
github-stars python multi-agent video-modeling computer-vision Created Mon, 04 May 2026 10:23:02 +0000
OpenPose: real-time multi-person 2D pose estimation with constant-time body detection
OpenPose is a C++ library for real-time multi-person 2D pose estimation using Part Affinity Fields, enabling constant inference time for body detection regardless of person count.
github-stars c++ pose-estimation computer-vision cuda Created Mon, 04 May 2026 10:23:02 +0000
PEAR: real-time expressive 3D human mesh recovery at 100 FPS
PEAR predicts expressive 3D human mesh parameters for body, hands, and face simultaneously at 100 FPS using a pixel-aligned architecture based on PyTorch and SMPL-X models.
github-stars python pytorch 3d-human-mesh smpl-x Created Mon, 04 May 2026 10:23:02 +0000
Viseron: a modular, self-hosted AI video surveillance platform
Viseron is a self-hosted AI NVR platform written in Python, with modular AI features for privacy-focused video surveillance. It runs fully locally and is deployed with Docker.
github-stars python docker computer-vision self-hosted Created Mon, 04 May 2026 10:23:02 +0000
Cupid: feed-forward 3D reconstruction with joint camera pose estimation from single images
Cupid is a feed-forward 3D reconstruction model that jointly estimates camera pose and reconstructs 3D objects from single 2D images, outputting textured 3D meshes and radiance fields in seconds.
github-stars 3d-reconstruction computer-vision deep-learning cuda Created Mon, 04 May 2026 10:23:01 +0000
NAS3R: Self-supervised 3D reconstruction and camera pose estimation with Gaussian splatting
NAS3R enables self-supervised 3D geometry and camera parameter estimation without ground-truth data, using Gaussian splatting and a VGGT backbone. It supports multi-view setups and optional pretrained initialization.
github-stars python 3d-reconstruction self-supervised-learning gaussian-splatting Created Mon, 04 May 2026 10:23:01 +0000