<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Ai-Safety on Noureddine RAMDI</title><link>https://ramdi.fr/tags/ai-safety/</link><description>Recent content in Ai-Safety on Noureddine RAMDI</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sat, 23 May 2026 20:41:27 +0000</lastBuildDate><atom:link href="https://ramdi.fr/tags/ai-safety/index.xml" rel="self" type="application/rss+xml"/><item><title>DeepTeam: A Python framework for adversarial red teaming of large language models</title><link>https://ramdi.fr/github-stars/deepteam-a-python-framework-for-adversarial-red-teaming-of-large-language-models/</link><pubDate>Sat, 23 May 2026 20:41:14 +0000</pubDate><guid>https://ramdi.fr/github-stars/deepteam-a-python-framework-for-adversarial-red-teaming-of-large-language-models/</guid><description>DeepTeam is a Python tool for red teaming LLMs by dynamically generating adversarial attacks and evaluating vulnerabilities like bias. 
It requires minimal setup and no predefined datasets.</description></item><item><title>npcpy: enforcing AI behavioral compliance through architecture for multimodal LLM apps</title><link>https://ramdi.fr/github-stars/npcpy-enforcing-ai-behavioral-compliance-through-architecture-for-multimodal-llm-apps/</link><pubDate>Sat, 23 May 2026 20:41:14 +0000</pubDate><guid>https://ramdi.fr/github-stars/npcpy-enforcing-ai-behavioral-compliance-through-architecture-for-multimodal-llm-apps/</guid><description>npcpy provides an NPC Context-Agent-Tool data layer that enforces AI behavioral compliance through software architecture. It supports multimodal LLM apps and multi-agent systems with both local and cloud providers.</description></item><item><title>Inside Claude Code: A detailed reconstruction of Anthropic's AI safety and architecture</title><link>https://ramdi.fr/github-stars/inside-claude-code-a-detailed-reconstruction-of-anthropic-s-ai-safety-and-architecture/</link><pubDate>Tue, 05 May 2026 16:46:42 +0000</pubDate><guid>https://ramdi.fr/github-stars/inside-claude-code-a-detailed-reconstruction-of-anthropic-s-ai-safety-and-architecture/</guid><description>A deep dive into Claude Code’s 512K lines of TypeScript reveals a layered YOLO safety classifier, multi-agent IPC, and terminal UI rendering, three components key to Anthropic’s production AI system.</description></item><item><title>ISC-Bench: exposing fundamental AI safety failures from workflow-level design</title><link>https://ramdi.fr/github-stars/isc-bench-exposing-fundamental-ai-safety-failures-from-workflow-level-design/</link><pubDate>Mon, 04 May 2026 10:23:02 +0000</pubDate><guid>https://ramdi.fr/github-stars/isc-bench-exposing-fundamental-ai-safety-failures-from-workflow-level-design/</guid><description>ISC-Bench reveals a structural AI safety flaw in which LLMs produce harmful outputs in order to complete tasks, bypassing prompt-level defenses. It benchmarks this workflow-level vulnerability across top models.</description></item></channel></rss>