Turning text into AI podcast episodes with a coding agent and Fish Audio TTS

Turning arbitrary content (text, files, or URLs) into a podcast episode usually means three tedious steps: writing a script, generating voices, and stitching the audio. personalized-podcast runs that whole pipeline as a coding agent skill. The agent itself writes the script and then drives voice generation and stitching, so no separate script-generation API call is needed.

What personalized-podcast does and how it works

personalized-podcast is a Python-based Claude Code skill designed to automatically convert diverse content sources into a two-host AI podcast episode. It uses Claude Code, an environment where a coding agent runs skills, to orchestrate the entire process from content ingestion to audio output.

The pipeline begins with Claude reading the input content, which can be plain text, uploaded files, or URLs pointing to articles or documents. Following the constraints and show-format instructions stored in PROMPT.md, Claude writes a conversational script for two hosts: Alex, the curious one, and Sam, the analytical counterpart. The two-host format gives the dialogue natural variety.
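
To make the two-host script concrete, here is a minimal sketch of how such a script could be split into speaker turns before voice generation. The "Alex:" / "Sam:" line format, the Turn type, and parse_script are assumptions for illustration; the repo's actual script format may differ.

```python
import re
from typing import NamedTuple


class Turn(NamedTuple):
    speaker: str
    text: str


# Assumed line format: "Alex: ..." or "Sam: ..." (hypothetical, not confirmed by the repo).
_TURN = re.compile(r"^(Alex|Sam):\s*(.+)$")


def parse_script(script: str) -> list[Turn]:
    """Split a two-host script into ordered (speaker, text) turns."""
    turns: list[Turn] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines between turns
        match = _TURN.match(line)
        if match:
            turns.append(Turn(match.group(1), match.group(2).strip()))
        elif turns:
            # A line without a speaker label continues the previous turn.
            last = turns[-1]
            turns[-1] = Turn(last.speaker, f"{last.text} {line}")
        else:
            raise ValueError(f"script must start with a speaker label: {line!r}")
    return turns
```

Keeping the turns as an ordered list means each one can be sent to the TTS step independently while the speaking order is preserved for stitching.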

For voice synthesis, personalized-podcast uses Fish Audio’s text-to-speech API, which provides access to over 2 million voices; default voices for Alex and Sam come pre-configured. Each line of the generated script is sent to the TTS API, which returns one audio snippet per line in that speaker’s voice.
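
The per-line TTS call could look roughly like the sketch below. It is not the repo's code: the endpoint URL and the text, reference_id, and format fields are assumptions based on Fish Audio's public REST documentation and should be checked against the current API reference, and the voice IDs are placeholders.

```python
import json
import urllib.request

# Assumption: endpoint and JSON field names as described in Fish Audio's public docs.
FISH_TTS_URL = "https://api.fish.audio/v1/tts"

# Placeholder voice model IDs (hypothetical; the repo ships its own defaults).
VOICE_IDS = {"Alex": "<alex-voice-id>", "Sam": "<sam-voice-id>"}


def build_tts_request(speaker: str, text: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for one script line."""
    payload = {"text": text, "reference_id": VOICE_IDS[speaker], "format": "mp3"}
    return urllib.request.Request(
        FISH_TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(speaker: str, text: str, api_key: str, timeout: float = 60.0) -> bytes:
    """Send one line to the TTS endpoint and return the audio bytes."""
    request = build_tts_request(speaker, text, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Separating request construction from sending keeps the network call in one place, which makes retries or a different TTS provider easier to add later.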

Finally, the repo uses pydub, backed by ffmpeg, to stitch these audio snippets together. The stitching step preserves natural pacing and pauses between lines, producing a single MP3 file ready for distribution.
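
A minimal sketch of that stitching step with pydub follows. The pause lengths, the pause_ms helper, and the (speaker, path) clip layout are illustrative assumptions rather than the repo's actual values; pydub is imported inside the function because it needs ffmpeg on the PATH to read and write MP3.

```python
from pathlib import Path
from typing import Optional


def pause_ms(prev_speaker: str, next_speaker: str,
             same_speaker_ms: int = 150, speaker_change_ms: int = 400) -> int:
    """Pick a pause length: shorter within one speaker, longer on a hand-off.

    The default durations are hypothetical, chosen only for illustration.
    """
    return same_speaker_ms if prev_speaker == next_speaker else speaker_change_ms


def stitch(clips: list[tuple[str, Path]], out_path: Path) -> None:
    """Concatenate per-line MP3 clips, in order, into one episode file."""
    from pydub import AudioSegment  # third-party; requires ffmpeg for MP3

    episode = AudioSegment.empty()
    prev: Optional[str] = None
    for speaker, path in clips:
        if prev is not None:
            # Insert silence between consecutive lines.
            episode += AudioSegment.silent(duration=pause_ms(prev, speaker))
        episode += AudioSegment.from_file(str(path), format="mp3")
        prev = speaker
    episode.export(str(out_path), format="mp3")
```

Calling stitch([("Alex", Path("001.mp3")), ("Sam", Path("002.mp3"))], Path("episode.mp3")) would write the combined file, assuming those clips exist.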

Optionally, the tool supports creating a GitHub Pages-hosted RSS feed, allowing users to distribute the podcast episodes directly to podcast apps, streamlining the publishing workflow.
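
For readers unfamiliar with podcast feeds, the sketch below builds a small RSS 2.0 document with one enclosure per episode, using only the standard library. The build_feed function, the episode dictionary keys, and the example URLs are hypothetical; the repo's own feed generation may be structured differently.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title: str, site_url: str, description: str, episodes: list[dict]) -> bytes:
    """Return an RSS 2.0 feed as UTF-8 bytes, ready to write to a feed file."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = description
    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "guid").text = ep["url"]
        # RSS expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(ep["published"])
        # The enclosure element is what podcast apps download.
        ET.SubElement(item, "enclosure", url=ep["url"],
                      length=str(ep["size_bytes"]), type="audio/mpeg")
    return ET.tostring(rss, encoding="utf-8", xml_declaration=True)


# Example usage with placeholder values.
example = build_feed(
    "My Show",
    "https://example.github.io/podcast/",
    "Episodes generated from my reading list.",
    [{
        "title": "Episode 1",
        "url": "https://example.github.io/podcast/ep1.mp3",
        "size_bytes": 1234,
        "published": datetime(2025, 1, 1, tzinfo=timezone.utc),
    }],
)
```

Because GitHub Pages serves static files, committing the feed file and the MP3s to the Pages branch is enough for a podcast app to subscribe to the feed URL.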

The repo requires Python 3.10+, ffmpeg installed on the host system, and a valid Fish Audio API key (free tier available). It also depends on a coding agent environment that supports skills, such as Claude Code.

Technical strengths and architectural tradeoffs

The standout technical feature of personalized-podcast is its use of the coding agent itself as the script writer. Unlike typical pipelines where you might call an LLM API separately to generate a script, here Claude reads the input content and produces the podcast script internally, applying the prompt constraints from PROMPT.md. This removes an entire API layer, along with the complexity and latency that come with it. Apart from the TTS calls, the orchestration stays local.

This approach also means the skill is tightly integrated with Claude Code’s runtime environment, which is both a strength and a limitation. On one hand, it allows seamless interaction with the coding agent and easy updates to the prompt or config files for customization. On the other, it only runs inside a supported coding agent platform, which limits its use as a standalone tool.

Fish Audio’s TTS API is a solid choice for voice generation, offering a vast library of voices, including expressive options that help distinguish the two podcast hosts. The free tier lowers the barrier to entry, but the dependency on an external API introduces potential latency and reliability considerations.

The audio stitching with pydub and ffmpeg is straightforward and effective, but users must have ffmpeg installed and properly configured. This external dependency is common in audio processing but worth noting.

Customization is well supported: PROMPT.md lets you tailor the show format and conversational style, while config.yaml adjusts the tone and personality of the hosts. This flexibility is valuable for adapting the podcast’s voice without touching the core code.

The repo’s code is pragmatic, focused on pipeline flow rather than deep algorithmic complexity. It cleanly separates script generation, TTS, and audio stitching, which makes each component easier to maintain or swap.
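
One way to picture that separation is a pipeline function that receives the TTS and stitching steps as plain callables. This is a sketch of the idea, not the repo's structure: run_pipeline, Synthesizer, and Stitcher are hypothetical names.

```python
from typing import Callable

# A turn is (speaker, text); a clip is (speaker, audio bytes).
Turn = tuple[str, str]
Clip = tuple[str, bytes]
Synthesizer = Callable[[str, str], bytes]
Stitcher = Callable[[list[Clip]], bytes]


def run_pipeline(turns: list[Turn], synthesize: Synthesizer, stitch: Stitcher) -> bytes:
    """Voice each turn in order, then hand all clips to the stitcher."""
    clips = [(speaker, synthesize(speaker, text)) for speaker, text in turns]
    return stitch(clips)


# Stand-in components, useful for testing without network access or ffmpeg.
def fake_tts(speaker: str, text: str) -> bytes:
    return f"[{speaker}] {text}".encode("utf-8")


def fake_stitch(clips: list[Clip]) -> bytes:
    return b" | ".join(audio for _, audio in clips)
```

With this shape, swapping the TTS provider or the audio library means passing a different callable, and the rest of the pipeline is untouched.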

Quick start

1. Install

gh repo clone zarazhangrui/personalized-podcast-skill ~/.claude/skills/personalized-podcast

2. Go

/podcast <paste content, point to files, or describe a topic>

That’s it. On first run, Claude sets up the Python environment, installs dependencies, and asks you for a free Fish Audio API key. Default voices are pre-configured. Your first episode generates immediately.

Requirements

A coding agent that supports skills (e.g. Claude Code, Gemini CLI, Copilot CLI)
Python 3.10+
ffmpeg (brew install ffmpeg on macOS)
Fish Audio account (free tier available)
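
If you want to verify these requirements by hand before the first run, a small preflight check along these lines would do. It is a sketch: the preflight function is hypothetical, and the FISH_API_KEY variable name is an assumption, since the skill asks for the key itself and may store it elsewhere.

```python
import os
import shutil
import sys
from typing import Mapping, Optional


def preflight(env: Optional[Mapping[str, str]] = None,
              api_key_var: str = "FISH_API_KEY") -> list[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems: list[str] = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10+ is required")
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg was not found on PATH")
    if not env.get(api_key_var):
        # Variable name is an assumption; adjust to wherever the key is kept.
        problems.append(f"{api_key_var} is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("missing:", problem)
```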

Who personalized-podcast is for

This repo is a solid fit for developers or content creators who want to automate podcast generation from arbitrary content using AI, especially if they are already working within a coding agent environment like Claude Code. It solves the common pain point of script writing and voice generation by collapsing the script creation into a coding agent skill, simplifying the pipeline.

However, it is not a general-purpose standalone podcast generator. It requires setup within a specific environment and an external API key for TTS, which might not suit all users. The audio output quality depends on Fish Audio voices, which are good but may not fit all stylistic needs.

If you are comfortable with Python tooling, have access to Claude Code or a similar environment, and want to experiment with AI-driven podcast automation, personalized-podcast offers a pragmatic and extensible starting point. Its design choices prioritize simplicity and integration over heavy customization or multi-voice mixing.

In practice, this means you can get a two-host AI podcast episode from your text or URLs with minimal friction. The code is surprisingly clean for an AI orchestration pipeline, and the prompt/config-driven approach gives you decent control over the show’s style.

The main tradeoff is the tight coupling to the coding agent ecosystem and the dependency on an external TTS service. But for those who fit the target use case, personalized-podcast fills a neat niche in automated content-to-podcast workflows.

→ GitHub Repo: zarazhangrui/personalized-podcast ⭐ 337 · Python

Noureddine RAMDI