<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Autonomous-Agents on Noureddine RAMDI</title><link>https://ramdi.fr/tags/autonomous-agents/</link><description>Recent content in Autonomous-Agents on Noureddine RAMDI</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sat, 23 May 2026 20:41:27 +0000</lastBuildDate><atom:link href="https://ramdi.fr/tags/autonomous-agents/index.xml" rel="self" type="application/rss+xml"/><item><title>Claw-Eval: a rigorous Python harness for trustworthy evaluation of LLM-powered autonomous agents</title><link>https://ramdi.fr/github-stars/claw-eval-a-rigorous-python-harness-for-trustworthy-evaluation-of-llm-powered-autonomous-agents/</link><pubDate>Sat, 23 May 2026 20:41:14 +0000</pubDate><guid>https://ramdi.fr/github-stars/claw-eval-a-rigorous-python-harness-for-trustworthy-evaluation-of-llm-powered-autonomous-agents/</guid><description>Claw-Eval offers a Python-based evaluation harness for LLM autonomous agents, featuring 300 tasks and a strict Pass^3 metric to ensure reliable, multi-dimensional benchmarking.</description></item><item><title>LLM-MM-Agent: autonomous mathematical modeling with hierarchical method selection</title><link>https://ramdi.fr/github-stars/llm-mm-agent-autonomous-mathematical-modeling-with-hierarchical-method-selection/</link><pubDate>Sat, 23 May 2026 20:41:14 +0000</pubDate><guid>https://ramdi.fr/github-stars/llm-mm-agent-autonomous-mathematical-modeling-with-hierarchical-method-selection/</guid><description>LLM-MM-Agent uses LLMs as autonomous agents for end-to-end mathematical modeling, featuring a unique hierarchical method library with actor-critic selection. 
Supports GPT-4o and DeepSeek-R1.</description></item><item><title>Minds Platform: An enterprise-grade AI foundation for autonomous agents and semantic search</title><link>https://ramdi.fr/github-stars/minds-platform-an-enterprise-grade-ai-foundation-for-autonomous-agents-and-semantic-search/</link><pubDate>Fri, 15 May 2026 14:23:51 +0000</pubDate><guid>https://ramdi.fr/github-stars/minds-platform-an-enterprise-grade-ai-foundation-for-autonomous-agents-and-semantic-search/</guid><description>Minds Platform offers a Python-based AI foundation with autonomous agents and semantic search, designed for flexible enterprise deployment across cloud and on-prem environments.</description></item><item><title>Goal-Driven: orchestrating long-lived AI agents with prompt-based verification loops</title><link>https://ramdi.fr/github-stars/goal-driven-orchestrating-long-lived-ai-agents-with-prompt-based-verification-loops/</link><pubDate>Tue, 05 May 2026 16:46:42 +0000</pubDate><guid>https://ramdi.fr/github-stars/goal-driven-orchestrating-long-lived-ai-agents-with-prompt-based-verification-loops/</guid><description>Goal-Driven offers a prompt-based master-subagent architecture to sustain long-running AI problem-solving sessions through a verification-driven orchestration loop without code or frameworks.</description></item><item><title>Mapping the AI agent self-evolution ecosystem with the awesome-agent-evolution taxonomy</title><link>https://ramdi.fr/github-stars/mapping-the-ai-agent-self-evolution-ecosystem-with-the-awesome-agent-evolution-taxonomy/</link><pubDate>Tue, 05 May 2026 16:46:42 +0000</pubDate><guid>https://ramdi.fr/github-stars/mapping-the-ai-agent-self-evolution-ecosystem-with-the-awesome-agent-evolution-taxonomy/</guid><description>The awesome-agent-evolution repo organizes 50+ open-source projects into a clear taxonomy of AI agent self-evolution and infrastructure layers, offering a practical ecosystem map for developers.</description></item><item><title>BoxPwnr: 
benchmarking autonomous LLM agents on cybersecurity challenges with iterative command execution</title><link>https://ramdi.fr/github-stars/boxpwnr-benchmarking-autonomous-llm-agents-on-cybersecurity-challenges-with-iterative-command-execution/</link><pubDate>Mon, 04 May 2026 10:23:01 +0000</pubDate><guid>https://ramdi.fr/github-stars/boxpwnr-benchmarking-autonomous-llm-agents-on-cybersecurity-challenges-with-iterative-command-execution/</guid><description>BoxPwnr benchmarks LLM-based autonomous agents on cybersecurity challenges using iterative command execution in a Kali Docker container, supporting 20+ LLMs and 13+ platforms.</description></item><item><title>Running autonomous software engineering agents with AWS CDK and EC2 workers</title><link>https://ramdi.fr/github-stars/running-autonomous-software-engineering-agents-with-aws-cdk-and-ec2-workers/</link><pubDate>Mon, 04 May 2026 10:23:01 +0000</pubDate><guid>https://ramdi.fr/github-stars/running-autonomous-software-engineering-agents-with-aws-cdk-and-ec2-workers/</guid><description>Explore how aws-samples/remote-swe-agents runs autonomous software engineering agents on dedicated EC2 instances orchestrated by AWS CDK, with a Next.js interface and Amazon Bedrock LLM integration.</description></item><item><title>Symphony: orchestrating autonomous coding agents with work-level management</title><link>https://ramdi.fr/github-stars/symphony-orchestrating-autonomous-coding-agents-with-work-level-management/</link><pubDate>Sun, 03 May 2026 11:08:03 +0000</pubDate><guid>https://ramdi.fr/github-stars/symphony-orchestrating-autonomous-coding-agents-with-work-level-management/</guid><description>Symphony by OpenAI orchestrates autonomous coding agents via work boards and proof-of-work validation, shifting AI coding from direct supervision to task-level management.</description></item></channel></rss>