Noureddine RAMDI / pdftochat: a cloud-integrated PDF-to-chat system with hybrid vector search

Created Mon, 04 May 2026 10:23:02 +0000 Modified Sat, 23 May 2026 20:41:27 +0000

Nutlope/pdftochat

pdftochat tackles a common developer challenge: turning static PDFs into conversational chatbots without building your own embedding models or search infrastructure. It stitches together several cloud services to provide an end-to-end PDF-to-chat pipeline that’s scalable and relatively low friction if you’re willing to manage a few moving parts.

pdf-to-chat with cloud vector search and LLM integration

This repo is a Next.js app written in TypeScript designed to let you upload PDFs, index their content using vector search, and chat with the documents conversationally. It’s not just a local chatbot but a cloud-integrated system using several APIs and managed services.

Under the hood, it uses:

  • Together.ai for the large language model (LLM) that powers the chat responses.
  • Chroma Cloud for vector search, combining dense Qwen embeddings with sparse SPLADE embeddings, with the two result sets fused via Reciprocal Rank Fusion (RRF).
  • Bytescale for PDF storage.
  • Clerk for user authentication.
  • PostgreSQL (with Prisma ORM) for relational data storage.
  • Vercel as the deployment platform.

The architecture is a typical modern JAMstack-style app, with serverless functions and API routes managing communication with external services. The vector database is fully managed by Chroma Cloud, which generates the embeddings and maintains the indexes automatically, removing the need for local embedding infrastructure or manual index maintenance.

hybrid vector search and modular cloud architecture

What distinguishes pdftochat is its use of a hybrid vector search approach combining two distinct embedding types:

  • Dense embeddings from Qwen, which capture semantic similarity.
  • Sparse embeddings from SPLADE, which emphasize keyword matching.

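The results from these two embedding types are fused at query time using Reciprocal Rank Fusion (RRF), which merges the dense and sparse rankings by rank position rather than by raw similarity score. The aim is better retrieval than a single embedding model provides: dense vectors catch paraphrases, while sparse vectors catch exact terms such as names and identifiers.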
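To make the fusion step concrete, here is a minimal sketch of RRF in TypeScript. It is not code from the repo (Chroma Cloud performs the fusion on its side); the function name, the chunk IDs, and the constant k = 60 (a commonly used default) are assumptions for illustration. Each document scores the sum of 1 / (k + rank) over every ranked list it appears in.

```typescript
// Illustrative sketch only: pdftochat delegates fusion to Chroma Cloud.
// Each ranked list holds document IDs, best match first, with no duplicates.
type RankedList = string[];

interface FusedResult {
  id: string;
  score: number;
}

// Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank of d),
// where rank is 1-based. k = 60 is an assumed default, not a repo setting.
function reciprocalRankFusion(lists: RankedList[], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical chunk IDs returned by a dense and a sparse query.
const denseHits: RankedList = ["chunk-3", "chunk-1", "chunk-7"];
const sparseHits: RankedList = ["chunk-1", "chunk-9", "chunk-3"];
const fused = reciprocalRankFusion([denseHits, sparseHits]);
console.log(fused.map((r) => r.id));
```

Chunks that appear in both lists accumulate two contributions, so they rise above chunks that only one retriever found. Because only rank positions are used, the dense and sparse scores never need to be put on a common scale.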

The repo delegates embedding generation and vector search entirely to Chroma Cloud, simplifying the codebase but introducing a dependency on this external service. Collections are automatically created per document, so there’s no manual index management.

Authentication and user management are handled by Clerk, which integrates smoothly with Next.js. PDF storage is offloaded to Bytescale, a hosted file upload and storage service.

This modular cloud integration approach lets the repo focus on orchestrating the document ingestion, query interfaces, and chat UI without reinventing core ML or storage components.

The tradeoff is clear: you get a ready-to-deploy, scalable system but you’re tied to several third-party cloud services. For some, this reduces operational complexity; for others, it’s a limitation on control and customization.

Code quality is solid with TypeScript types enforced throughout, Prisma ORM managing the database layer, and environment variables clearly documented in .env.example. The code is surprisingly clean given the number of integrations.

quick start with environment setup and database push

To deploy your own instance, you need accounts and credentials for Together.ai, Chroma Cloud, Bytescale, and Clerk, plus a PostgreSQL database and a hosting platform (Vercel or similar). The .env.example file lists all required environment variables.

Before running, prepare your database schema with Prisma:

npx prisma db push

Set these environment variables for Chroma Cloud to enable vector search:

NEXT_PUBLIC_VECTORSTORE=chroma

CHROMA_API_KEY=       # Your Chroma Cloud API key
CHROMA_TENANT=        # Your tenant ID
CHROMA_DATABASE=      # Your database name

Collections are created automatically for each uploaded document, so no manual index setup is needed.
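With this many credentials, a missing value is an easy mistake to make. The following is a minimal sketch of a startup check for the three Chroma Cloud variables listed above. The helper name readChromaEnv and its error message are hypothetical and are not part of the repo.

```typescript
// Hypothetical helper, not taken from the pdftochat codebase.
// Variable names match the Chroma Cloud settings listed above.
const REQUIRED_CHROMA_VARS = [
  "CHROMA_API_KEY",
  "CHROMA_TENANT",
  "CHROMA_DATABASE",
] as const;

type ChromaVarName = (typeof REQUIRED_CHROMA_VARS)[number];
type ChromaEnv = Record<ChromaVarName, string>;

// Returns the trimmed values, or throws one error naming every missing variable.
function readChromaEnv(env: Record<string, string | undefined>): ChromaEnv {
  const values = {} as ChromaEnv;
  const missing: string[] = [];
  for (const name of REQUIRED_CHROMA_VARS) {
    const value = env[name]?.trim();
    if (value) {
      values[name] = value;
    } else {
      missing.push(name);
    }
  }
  if (missing.length > 0) {
    throw new Error(`Missing Chroma Cloud settings: ${missing.join(", ")}`);
  }
  return values;
}

// In a Node.js or Next.js server context this would typically be called as:
//   const chroma = readChromaEnv(process.env);
```

Reporting all missing names in one error avoids the fix-one-redeploy-fail-again loop that is common when configuring several services at once.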

The README details the deployment steps for Vercel including setting up Postgres or Neon as the database.

who should consider pdftochat

pdftochat is relevant for developers wanting a cloud-native PDF chatbot that offloads embedding and vector search to managed services. It’s a good fit if you want to avoid running your own embedding model or vector DB.

The tradeoff is the complexity of managing multiple cloud accounts and environment variables, plus vendor lock-in to Chroma Cloud and Together.ai. The system is opinionated around these services, so swapping components would require work.

If you’re looking for a straightforward, scalable PDF Q&A system built with modern TypeScript and Next.js, this repo is worth exploring. The hybrid vector search approach is a neat touch that could improve retrieval quality compared to simple single-embedding setups.

It’s not for those who want a fully self-hosted or minimal-dependency solution, since the external cloud services are integral to how it works. But for teams comfortable with cloud APIs and modern JAMstack architectures, pdftochat offers a practical, well-structured starting point.


→ GitHub Repo: Nutlope/pdftochat ⭐ 1,384 · TypeScript