Nougat is Meta’s neural OCR system for academic PDFs. It converts document pages, including LaTeX math and tables, into structured Markdown using a Vision Transformer encoder-decoder, and it offers a CLI, an API, and training tools.
OVIE trains novel view synthesis models on unpaired internet images, avoiding the need for calibrated multi-view datasets. It uses Vision Transformers and foundation models for pose and depth encoding.