<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Transformers on Noureddine RAMDI</title><link>https://ramdi.fr/tags/transformers/</link><description>Recent content in Transformers on Noureddine RAMDI</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sat, 23 May 2026 20:41:27 +0000</lastBuildDate><atom:link href="https://ramdi.fr/tags/transformers/index.xml" rel="self" type="application/rss+xml"/><item><title>AI-ML-Cheatsheets: a structured collection of AI and machine learning reference sheets</title><link>https://ramdi.fr/github-stars/ai-ml-cheatsheets-a-structured-collection-of-ai-and-machine-learning-reference-sheets/</link><pubDate>Sat, 23 May 2026 20:41:14 +0000</pubDate><guid>https://ramdi.fr/github-stars/ai-ml-cheatsheets-a-structured-collection-of-ai-and-machine-learning-reference-sheets/</guid><description>AI-ML-Cheatsheets offers a modular, offline-ready collection of concise AI/ML reference sheets from foundational math to transformers and large language models.</description></item><item><title>Fast3R: scalable multi-view 3D reconstruction with a single forward pass</title><link>https://ramdi.fr/github-stars/fast3r-scalable-multi-view-3d-reconstruction-with-a-single-forward-pass/</link><pubDate>Sat, 23 May 2026 20:41:14 +0000</pubDate><guid>https://ramdi.fr/github-stars/fast3r-scalable-multi-view-3d-reconstruction-with-a-single-forward-pass/</guid><description>Fast3R from Meta FAIR processes 1000+ unordered images simultaneously for 3D reconstruction using a ViT-Large backbone and multi-view attention, eliminating iterative matching.</description></item><item><title>Lynx: modular personalized video generation with dual adapters on a frozen diffusion transformer</title><link>https://ramdi.fr/github-stars/lynx-modular-personalized-video-generation-with-dual-adapters-on-a-frozen-diffusion-transformer/</link><pubDate>Sat, 23 May 2026 20:41:14 +0000</pubDate><guid>https://ramdi.fr/github-stars/lynx-modular-personalized-video-generation-with-dual-adapters-on-a-frozen-diffusion-transformer/</guid><description>Lynx generates personalized videos from a single image using a frozen Diffusion Transformer with ID and Ref adapters. This modular design balances fidelity and efficiency.</description></item><item><title>pdf-document-layout-analysis: a dual-model PDF layout analysis microservice with Docker deployment</title><link>https://ramdi.fr/github-stars/pdf-document-layout-analysis-a-dual-model-pdf-layout-analysis-microservice-with-docker-deployment/</link><pubDate>Sat, 23 May 2026 20:41:14 +0000</pubDate><guid>https://ramdi.fr/github-stars/pdf-document-layout-analysis-a-dual-model-pdf-layout-analysis-microservice-with-docker-deployment/</guid><description>pdf-document-layout-analysis is a Dockerized microservice using Vision Grid Transformer and LightGBM for PDF layout analysis, offering high accuracy or fast processing with OCR, translation, and multi-format export.</description></item><item><title>Falcon-Perception: a minimal multimodal PyTorch engine for object detection, segmentation, and OCR</title><link>https://ramdi.fr/github-stars/falcon-perception-a-minimal-multimodal-pytorch-engine-for-object-detection-segmentation-and-ocr/</link><pubDate>Mon, 04 May 2026 10:23:02 +0000</pubDate><guid>https://ramdi.fr/github-stars/falcon-perception-a-minimal-multimodal-pytorch-engine-for-object-detection-segmentation-and-ocr/</guid><description>Falcon-Perception is a PyTorch engine for multimodal autoregressive Transformers handling detection, segmentation, and OCR with FlexAttention and efficient caching.</description></item><item><title>Hands-On Large Language Models: A practical, visual journey through LLM engineering</title><link>https://ramdi.fr/github-stars/hands-on-large-language-models-a-practical-visual-journey-through-llm-engineering/</link><pubDate>Mon, 04 May 2026 10:23:02 +0000</pubDate><guid>https://ramdi.fr/github-stars/hands-on-large-language-models-a-practical-visual-journey-through-llm-engineering/</guid><description>Explore the Hands-On Large Language Models repo, a Jupyter notebook-based practical guide from fundamentals to fine-tuning, designed for hands-on LLM learning on free Colab GPUs.</description></item><item><title>Streaming 3D scene reconstruction with LingBot-Map’s geometric context transformer</title><link>https://ramdi.fr/github-stars/streaming-3d-scene-reconstruction-with-lingbot-maps-geometric-context-transformer/</link><pubDate>Mon, 04 May 2026 10:23:02 +0000</pubDate><guid>https://ramdi.fr/github-stars/streaming-3d-scene-reconstruction-with-lingbot-maps-geometric-context-transformer/</guid><description>LingBot-Map performs streaming 3D reconstruction from long image sequences at ~20 FPS using a geometric context transformer and paged KV cache attention for efficient memory management.</description></item><item><title>Exploring DeepMind's representations4d: advanced self-supervised video representations with moving latent tokens</title><link>https://ramdi.fr/github-stars/exploring-deepmind-s-representations4d-advanced-self-supervised-video-representations-with-moving-latent-tokens/</link><pubDate>Mon, 04 May 2026 10:23:01 +0000</pubDate><guid>https://ramdi.fr/github-stars/exploring-deepmind-s-representations4d-advanced-self-supervised-video-representations-with-moving-latent-tokens/</guid><description>Google DeepMind&amp;rsquo;s representations4d bundles three self-supervised video learning approaches using transformers, including a novel object-centric tracking method with latent tokens moving off the pixel grid.</description></item><item><title>In-Place TTT: Adaptive test-time training for transformer LLMs with in-place fast-weight updates</title><link>https://ramdi.fr/github-stars/in-place-ttt-adaptive-test-time-training-for-transformer-llms-with-in-place-fast-weight-updates/</link><pubDate>Mon, 04 May 2026 10:23:01 +0000</pubDate><guid>https://ramdi.fr/github-stars/in-place-ttt-adaptive-test-time-training-for-transformer-llms-with-in-place-fast-weight-updates/</guid><description>ByteDance&amp;rsquo;s In-Place TTT enables adaptive transformer inference by updating MLP down-projection weights in-place at test time, supporting long-context reasoning without extra modules.</description></item><item><title>OmniStream: a multi-frame transformer for continuous video stream perception</title><link>https://ramdi.fr/github-stars/omnistream-a-multi-frame-transformer-for-continuous-video-stream-perception/</link><pubDate>Mon, 04 May 2026 10:23:01 +0000</pubDate><guid>https://ramdi.fr/github-stars/omnistream-a-multi-frame-transformer-for-continuous-video-stream-perception/</guid><description>OmniStream uses a multi-frame transformer to process continuous video streams with patch-level temporal indexing, supporting downstream vision-language-action tasks.</description></item><item><title>OpenMythos: Exploring recurrent-depth transformers with input injection for sustained reasoning</title><link>https://ramdi.fr/github-stars/openmythos-exploring-recurrent-depth-transformers-with-input-injection-for-sustained-reasoning/</link><pubDate>Mon, 04 May 2026 10:23:01 +0000</pubDate><guid>https://ramdi.fr/github-stars/openmythos-exploring-recurrent-depth-transformers-with-input-injection-for-sustained-reasoning/</guid><description>OpenMythos implements a recurrent-depth transformer that recycles layers via looped blocks, using input injection to prevent signal drift. It scales from 1B to 1T parameters with up to 1M token context.</description></item><item><title>annotated_deep_learning_paper_implementations: annotated PyTorch implementations of key deep learning papers</title><link>https://ramdi.fr/github-stars/annotated-deep-learning-paper-implementations-annotated-pytorch-implementations-of-key-deep-learning-papers/</link><pubDate>Sat, 02 May 2026 20:07:04 +0000</pubDate><guid>https://ramdi.fr/github-stars/annotated-deep-learning-paper-implementations-annotated-pytorch-implementations-of-key-deep-learning-papers/</guid><description>This repo provides annotated PyTorch implementations of major deep learning papers with side-by-side explanations, aiding understanding and prototyping.</description></item><item><title>Hugging Face Transformers: a unified API for state-of-the-art AI models across modalities</title><link>https://ramdi.fr/github-stars/hugging-face-transformers-a-unified-api-for-state-of-the-art-ai-models-across-modalities/</link><pubDate>Sun, 26 Apr 2026 09:31:26 +0000</pubDate><guid>https://ramdi.fr/github-stars/hugging-face-transformers-a-unified-api-for-state-of-the-art-ai-models-across-modalities/</guid><description>Hugging Face Transformers offers a unified Python API to access over 1 million pretrained AI models for text, vision, and audio, simplifying complex pipelines with its Pipeline API.</description></item></channel></rss>