A curated repo breaking down large language model internals with numeric attention math, tokenization, and transformer architecture, targeting engineers who want to understand LLMs under the hood.
Cupid is a feed-forward 3D reconstruction model that jointly estimates camera pose and reconstructs 3D objects from single 2D images, outputting textured 3D meshes and radiance fields in seconds.
deepseek_ocr_app combines a React frontend with a FastAPI backend to run OCR on images and multipage PDFs, with exports to Markdown, HTML, DOCX, and JSON. It features real-time progress tracking and bounding-box visualization.
FinRL provides an open-source three-layer architecture for financial reinforcement learning, with 5 deep reinforcement learning (DRL) agents and 14+ data sources. It is well suited to learning DRL in finance.
NOVA3R implements a non-pixel-aligned visual transformer for amodal 3D reconstruction from unposed multi-view images, recovering occluded geometry with physical plausibility.
OpenMythos implements a recurrent-depth transformer that reuses layers via looped blocks, with input injection to prevent signal drift. It scales from 1B to 1T parameters with a context of up to 1M tokens.
SceneMaker separates de-occlusion from 3D object generation to handle occluded open-set scenes. It uses FLUX Kontext and Step1X-3D, with code and checkpoints available.
Tencent’s Hunyuan3D-Part offers a two-model pipeline for semantic mesh decomposition: P3-SAM for 3D mesh part segmentation and X-Part for high-fidelity part generation.
This repo provides annotated PyTorch implementations of major deep learning papers with side-by-side explanations, aiding understanding and prototyping.
face_recognition provides a simple Python API and CLI for highly accurate face detection and recognition using dlib’s deep learning model. It supports facial landmarks and multi-core processing.
The Chinese edition of Dive into Deep Learning is an interactive, code-driven deep learning textbook in Python that integrates theory with runnable examples for hands-on learning.
PyTorch pairs tape-based autograd with dynamic neural networks, enabling flexible model development and efficient GPU-accelerated tensor computation.
TensorFlow is a comprehensive open-source machine learning platform with stable multi-language APIs and broad hardware support, having evolved from a research prototype into a production-ready ecosystem.
YOLOv5 by Ultralytics offers an accessible, fast, and accurate PyTorch-based computer vision toolkit for object detection, segmentation, and classification. Explore its architecture, strengths, and quickstart usage.
Keras 3 introduces a multi-backend architecture supporting JAX, TensorFlow, PyTorch, and OpenVINO, enabling flexible deep learning model development with speedups of up to 350%.