Gated-Delta-Networks on Noureddine RAMDI

https://ramdi.fr/tags/gated-delta-networks/
Recent content in Gated-Delta-Networks on Noureddine RAMDI
Last updated: Sat, 23 May 2026 20:41:27 +0000

Alibaba's Qwen3.6: Efficient large-scale LLMs with gated delta networks and sparse MoE
https://ramdi.fr/github-stars/alibaba-s-qwen3-6-efficient-large-scale-llms-with-gated-delta-networks-and-sparse-moe/
Published: Tue, 05 May 2026 13:37:39 +0000
Qwen3.6 from Alibaba uses gated delta networks and sparse Mixture-of-Experts to achieve performance close to that of a 397B-parameter model with only 3B active parameters, supporting 201 languages and a 262k context length.