MAGI implements a multi-round debate protocol in which three LLMs iteratively critique one another's answers and then vote on a final result, with the aim of matching the accuracy of stronger models. It also offers fault tolerance, adaptive escalation, and persona presets.
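To make the debate-and-vote idea concrete, here is a minimal sketch in Python. It is not MAGI's actual code or API: the names `debate`, `_ask`, and `max_rounds` are hypothetical, each "model" is modeled as a plain callable that takes the question and its peers' previous answers, and fault tolerance is reduced to dropping any agent whose call raises. Adaptive escalation and persona presets are not modeled.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical agent signature: (question, peers' previous answers) -> answer.
Agent = Callable[[str, List[str]], str]


def _ask(agent: Agent, question: str, peers: List[str]) -> Optional[str]:
    """Call one agent; a failure yields None instead of aborting the debate."""
    try:
        return agent(question, peers)
    except Exception:
        return None


def debate(question: str, agents: List[Agent], max_rounds: int = 3) -> str:
    """Run critique rounds until the surviving answers agree, then vote."""
    # Round 0: every agent answers independently.
    answers = [_ask(agent, question, []) for agent in agents]
    for _ in range(max_rounds):
        valid = [a for a in answers if a is not None]
        if len(set(valid)) <= 1:
            break  # consensus among the agents that responded (or none did)
        # Critique round: each agent sees the other agents' valid answers.
        answers = [
            _ask(agent, question,
                 [a for j, a in enumerate(answers) if j != i and a is not None])
            for i, agent in enumerate(agents)
        ]
    valid = [a for a in answers if a is not None]
    if not valid:
        raise RuntimeError("all agents failed")
    # Majority vote; ties go to the answer that appears first in the list.
    return Counter(valid).most_common(1)[0][0]


# Stub agents standing in for real LLM calls (assumptions for illustration).
def steady(question: str, peers: List[str]) -> str:
    return "4"


def follower(question: str, peers: List[str]) -> str:
    # Starts with a wrong answer, then adopts the most common peer answer.
    return Counter(peers).most_common(1)[0][0] if peers else "5"


def broken(question: str, peers: List[str]) -> str:
    raise TimeoutError("simulated provider outage")


result = debate("What is 2 + 2?", [steady, steady, follower])
degraded = debate("What is 2 + 2?", [broken, steady, steady])
```

In a real setup, each callable would wrap a request to a different model, and the critique prompt would include the peers' reasoning rather than only their final answers.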
The Awesome-LLM-Ensemble repo catalogs research on combining multiple LLMs, organized by a three-phase taxonomy: ensemble before inference, ensemble during inference, and ensemble after inference.