3D-RE-GEN reconstructs complete editable 3D indoor scenes from a single RGB photo. It integrates SAM, Hunyuan3D-2.0, and VGGT models in a modular Python pipeline.
Fast3R from Meta FAIR processes 1000+ unordered images in a single forward pass for 3D reconstruction, using a ViT-Large backbone and multi-view attention to eliminate iterative pairwise matching.
MASt3R-SLAM integrates a pretrained 3D reconstruction model as a geometry prior in a dense SLAM pipeline, enabling real-time tracking and mapping without classical bundle adjustment or depth sensors.
Matrix-3D generates explorable 360-degree 3D worlds from text or images using panoramic video and 3D Gaussian splatting, optimized to run on consumer GPUs with 12-19 GB of VRAM.
LingBot-Map performs streaming 3D reconstruction from long image sequences at ~20 FPS using a geometric context transformer and paged KV cache attention for efficient memory management.
Cupid is a feed-forward 3D reconstruction model that jointly estimates camera pose and reconstructs a 3D object from a single 2D image, outputting textured 3D meshes and radiance fields in seconds.
MV-SAM3D extends SAM 3D Objects with entropy-based multi-view fusion and optional pose optimization for more stable and consistent 3D object reconstruction across scenes.
NAS3R enables self-supervised 3D geometry and camera parameter estimation without ground-truth data, using Gaussian splatting and a VGGT backbone. It supports multi-view setups and optional pretrained initialization.
NOVA3R implements a non-pixel-aligned visual transformer for amodal 3D reconstruction from unposed multi-view images, recovering physically plausible geometry for occluded regions.