LLM-MM-Agent: autonomous mathematical modeling with hierarchical method selection

LLM-MM-Agent tackles the challenge of automating real-world mathematical modeling end-to-end by simulating how human experts approach these problems. It integrates problem analysis, model formulation, computational solving, and academic report writing into a single autonomous workflow driven by large language models (LLMs). What sets it apart is its structured knowledge representation and retrieval mechanism that bridges unstructured problem descriptions and formal mathematical methods.

What LLM-MM-Agent does and how it works

At a high level, LLM-MM-Agent is a Python-based system designed to solve complex mathematical modeling problems autonomously. The work was accepted at NeurIPS 2025, and its central idea is to simulate the workflow human experts follow in mathematical modeling contests such as MCM/ICM.

The system operates through four main stages:

Problem analysis: Parsing and understanding the problem statement using LLMs.
Mathematical model formulation: Selecting and formulating appropriate mathematical models.
Computational solving: Using the MLE-Solver module, which autonomously generates code to solve the formulated model.
Automated report writing: Generating a complete academic report explaining the problem, model, solution, and results.
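
To make the flow between the four stages concrete, here is a minimal sketch of a sequential pipeline in which each stage reads what earlier stages produced. The names ModelingState, run_pipeline, and the stub stage functions are hypothetical and not taken from the repository; the stubs return placeholder strings where a real system would call an LLM or run generated code.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ModelingState:
    """Accumulates the text produced by each stage (hypothetical structure)."""
    problem_text: str
    artifacts: dict[str, str] = field(default_factory=dict)


# A stage reads the state built so far and returns the text it contributes.
Stage = Callable[[ModelingState], str]


def run_pipeline(problem_text: str, stages: list[tuple[str, Stage]]) -> ModelingState:
    """Run the stages in order, storing each result under its stage name."""
    state = ModelingState(problem_text)
    for name, stage in stages:
        state.artifacts[name] = stage(state)
    return state


# Placeholder stages: a real system would call an LLM or a solver here.
def analyze(state: ModelingState) -> str:
    return f"analysis of: {state.problem_text}"


def formulate(state: ModelingState) -> str:
    return f"model based on [{state.artifacts['analysis']}]"


def solve(state: ModelingState) -> str:
    return f"solution of [{state.artifacts['model']}]"


def write_report(state: ModelingState) -> str:
    # The report summarizes everything produced before it.
    return "\n".join(f"{name}: {text}" for name, text in state.artifacts.items())


STAGES = [
    ("analysis", analyze),
    ("model", formulate),
    ("solution", solve),
    ("report", write_report),
]

if __name__ == "__main__":
    final = run_pipeline("toy problem statement", STAGES)
    print(final.artifacts["report"])
```

Keeping the stages as an ordered list of named callables makes the dependency explicit: formulation cannot run before analysis, and the report only sees what earlier stages stored.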

The core architectural innovation is the Hierarchical Mathematical Modeling Library (HMML). This is a tri-level knowledge graph spanning domains, subdomains, and 98 method nodes. It encodes expert intuition about which mathematical methods are relevant to different problem types. Rather than relying on purely unstructured LLM outputs, the system retrieves relevant modeling methods via an actor-critic mechanism that assesses both the problem context and solution progress, enabling dynamic and context-aware method selection.

The system currently supports OpenAI’s GPT-4o and DeepSeek-R1 LLMs, allowing some flexibility in backend model choice. The repository also includes an open-source demo implemented with a Next.js frontend and FastAPI backend for local deployment, making it accessible for experimentation.

Technical strengths and design tradeoffs

The standout technical feature of LLM-MM-Agent is the HMML’s tri-level knowledge hierarchy combined with the actor-critic retrieval mechanism. This structure is a deliberate design to address a common pain point in autonomous modeling: how to translate vague, natural language problem descriptions into precise mathematical formulations.

The 98 method nodes within HMML cover a broad spectrum of modeling techniques, organized by domain and subdomain. This organization provides a clear, structured path for method retrieval rather than a flat or ad hoc approach.

The actor-critic mechanism is particularly interesting. It splits method selection into two roles: the “actor” proposes candidate methods based on the problem context, while the “critic” evaluates those candidates in light of the partially constructed solution. This feedback loop refines method selection as the solution develops, which is more nuanced than one-shot retrieval.
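
The propose-then-evaluate loop can be sketched as follows. This is a simplified illustration, not the repository's implementation: select_method, its threshold and max_rounds parameters, and the two toy functions are hypothetical, and in a real system both roles would presumably be backed by LLM calls rather than string matching.

```python
from typing import Callable

# (problem, candidates, rejected) -> methods proposed this round
Actor = Callable[[str, list[str], list[str]], list[str]]
# (problem, partial_solution, method) -> suitability score
Critic = Callable[[str, str, str], float]


def select_method(problem: str, partial_solution: str, candidates: list[str],
                  actor: Actor, critic: Critic,
                  threshold: float = 0.5, max_rounds: int = 3):
    """Return (method, score): the first proposal the critic accepts, or the
    best-scoring rejected one. Returns (None, -inf) if nothing was proposed."""
    rejected: list[str] = []
    best = (None, float("-inf"))
    for _ in range(max_rounds):
        proposals = actor(problem, candidates, rejected)
        if not proposals:
            break
        for method in proposals:
            score = critic(problem, partial_solution, method)
            if score > best[1]:
                best = (method, score)
            if score >= threshold:
                return method, score
            # Feed the rejection back so the actor proposes something else.
            rejected.append(method)
    return best


def keyword_actor(problem: str, candidates: list[str], rejected: list[str]) -> list[str]:
    """Toy actor: propose the remaining candidate sharing the most words with the problem."""
    words = set(problem.lower().split())
    remaining = [c for c in candidates if c not in rejected]
    remaining.sort(key=lambda c: len(words & set(c.lower().split())), reverse=True)
    return remaining[:1]


def progress_critic(problem: str, partial_solution: str, method: str) -> float:
    """Toy critic: reject a method the partial solution already uses."""
    return 0.0 if method.lower() in partial_solution.lower() else 1.0


if __name__ == "__main__":
    choice = select_method(
        problem="forecast demand with a time series",
        partial_solution="We already applied time series forecasting to the demand data.",
        candidates=["time series forecasting", "linear programming"],
        actor=keyword_actor,
        critic=progress_critic,
    )
    print(choice)
```

In the usage example the actor's first proposal matches the problem wording best, but the critic rejects it because the partial solution already relies on it, so the second round yields a different method. That dependence on solution progress is what separates this loop from a single retrieval call.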

From a code quality perspective, the system targets Python 3.10 and keeps a modular codebase that cleanly separates the modeling library, the solver module, and report generation. Dependency management is straightforward, with requirements listed in a standard pip requirements file.

Tradeoffs are evident. The system depends on access to a hosted LLM API (the quick start below takes an OpenAI key), with the cost and latency that entails. The hierarchical modeling library is a static knowledge structure, so updating or extending the 98 method nodes requires manual curation and domain expertise. The system’s effectiveness is also demonstrated primarily on MCM/ICM-style problems, which have well-defined problem types, so its generalization beyond this scope remains to be seen.

Quick start

Running the Agent

You can directly run the Mathematical Modeling Agent with:

python MMAgent/main.py --key "your_openai_key" --task "task_id"

Example:

python MMAgent/main.py --key "sk-XXX" --task "2024_C"

Here, --task takes a problem ID from MM-Bench (e.g., "2024_C" refers to the 2024 MCM problem C).
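
If you want to run several tasks in a row, a small wrapper around the same command line can help. The sketch below uses only the --key and --task flags shown above; build_command, run_tasks, the dry_run switch, and reading the key from an OPENAI_API_KEY environment variable are assumptions of this example, not features of the repository. It defaults to a dry run so nothing is executed until you opt in.

```python
import os
import subprocess
import sys


def build_command(api_key: str, task_id: str, script: str = "MMAgent/main.py") -> list[str]:
    """Build the argv list for one agent run using the documented flags."""
    return [sys.executable, script, "--key", api_key, "--task", task_id]


def run_tasks(api_key: str, task_ids: list[str], dry_run: bool = True) -> list[list[str]]:
    """Run the agent once per task ID, or only print the commands when dry_run is True."""
    commands = [build_command(api_key, task_id) for task_id in task_ids]
    for cmd in commands:
        if dry_run:
            # Mask the key (argv index 3) so it does not end up in logs.
            print(" ".join(cmd[:3] + ["***"] + cmd[4:]))
        else:
            # check=True raises CalledProcessError if a run exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Assumption: the key is kept in an environment variable instead of being typed inline.
    key = os.environ.get("OPENAI_API_KEY", "sk-XXX")
    run_tasks(key, ["2024_C"], dry_run=True)
```

Passing the arguments as a list avoids shell quoting problems, and run it from the repository root so the relative script path resolves.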

Installation Guide

Prerequisites

Python 3.10 recommended
Conda (optional but preferred)

Setup Steps

Clone the Repository

git clone git@github.com:usail-hkust/LLM-MM-Agent.git

Create and Activate the Conda Environment

conda create --name math_modeling python=3.10
conda activate math_modeling

Navigate to Project Directory

cd LLM-MM-Agent

Install Dependencies

pip install -r requirements.txt
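
After setup, a quick check that the active interpreter is the recommended version can save some confusion when several environments are installed. This is a small hypothetical helper, not part of the repository; it only encodes the "Python 3.10 recommended" prerequisite listed above.

```python
import sys

RECOMMENDED = (3, 10)  # from the prerequisites: Python 3.10 recommended


def matches_recommended(version=None) -> bool:
    """Compare a (major, minor, ...) version tuple to the recommended release.

    Defaults to the interpreter running this script.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor == RECOMMENDED


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    status = "matches" if matches_recommended() else "differs from"
    print(f"Python {current} {status} the recommended 3.10")
```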

Verdict

LLM-MM-Agent is a rare example of an autonomous system that attempts to replicate the full expert process in mathematical modeling contests, from problem analysis to report writing. Its hierarchical modeling library and actor-critic retrieval mechanism are worth understanding for anyone interested in applying LLMs beyond simple prompt-based code generation.

The system is best suited for research and academic use cases where mathematical rigor and modeling intuition are central. It requires OpenAI API keys or compatible LLMs, which might limit easy adoption in some environments. Also, extending and maintaining the hierarchical method library demands domain knowledge.

If you’re working on autonomous agents for scientific or engineering modeling tasks, this repo offers a concrete architectural pattern to explore. For general-purpose LLM applications, it’s probably overkill. The demonstrated success in MCM/ICM is a solid proof point but also sets the scope of immediate applicability.

Overall, the code is clean, modular, and reasonably documented, making it a good resource for practitioners looking to understand advanced method selection and autonomous problem solving with LLMs.


→ GitHub Repo: usail-hkust/LLM-MM-Agent ⭐ 551 · Python

Noureddine RAMDI / LLM-MM-Agent: autonomous mathematical modeling with hierarchical method selection