Noureddine RAMDI / Osintgraph: AI-driven Instagram OSINT with Neo4j graph analysis

Created Sat, 23 May 2026 20:41:14 +0000 Modified Sat, 23 May 2026 20:41:27 +0000

XD-MHLOO/Osintgraph

Instagram OSINT is notoriously tricky due to the platform’s restrictions and the complex social graph data involved. Osintgraph tackles this challenge with a two-phase pipeline: it scrapes public Instagram data and stores it in a Neo4j graph database, then layers on an AI agent powered by Google’s Gemini to enable natural language querying and semantic analysis over the collected data. This architecture effectively turns a graph database into an interactive intelligence tool, bridging raw data collection and actionable insights.

How Osintgraph collects and analyzes Instagram data

At its core, Osintgraph is a Python tool designed for reconnaissance and social network analysis on Instagram. It scrapes profile data, followers, followees, posts, comments, and likes, the data points that map a social network’s structure and activity. Instead of dumping this data into flat files or relational tables, it stores entities and relationships directly in Neo4j, a graph database, which plays to the strengths of graph queries and visualization.
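To illustrate why a graph model fits this kind of data, here is a minimal sketch that turns scraped follower and followee lists into parameterized Cypher `MERGE` statements. The `Person` label, the `FOLLOWS` relationship, the `username` property, and the function name are assumptions made for illustration; they are not taken from Osintgraph's actual schema.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical schema: (:Person {username})-[:FOLLOWS]->(:Person {username}).
# MERGE creates a node or relationship only if it does not already exist,
# so re-running a collection pass does not duplicate data.
MERGE_FOLLOW = (
    "MERGE (a:Person {username: $follower}) "
    "MERGE (b:Person {username: $followee}) "
    "MERGE (a)-[:FOLLOWS]->(b)"
)


def follow_statements(
    target: str,
    followers: Iterable[str],
    followees: Iterable[str],
) -> List[Tuple[str, Dict[str, str]]]:
    """Return (cypher, params) pairs describing the target's follow edges."""
    statements = []
    for name in followers:
        # A follower points at the target.
        statements.append((MERGE_FOLLOW, {"follower": name, "followee": target}))
    for name in followees:
        # The target points at each account it follows.
        statements.append((MERGE_FOLLOW, {"follower": target, "followee": name}))
    return statements
```

Each pair could then be handed to a Neo4j driver session, for example `session.run(cypher, params)` with the official Python driver. Passing usernames as parameters rather than formatting them into the query string avoids Cypher injection from scraped values.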

The repo’s workflow splits into two distinct phases: reconnaissance and investigation. The reconnaissance phase involves data collection by scraping Instagram using credentials from an Instagram account (preferably a non-primary one to mitigate risk). This phase can optionally include pre-analysis of media content and posts using the Gemini AI API, which enriches the dataset with semantic insights.

Once the data lands in Neo4j, the investigation phase begins. Here, the Gemini-powered AI agent accepts natural language queries, enabling intuitive exploration of complex social graphs. This means you can ask questions like “Show the target user’s profile info” or perform keyword and semantic searches without writing Cypher queries. Neo4j’s visual console complements this by allowing manual graph exploration.
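To make the contrast concrete, here is a rough sketch of the kind of Cypher a question such as “which accounts does the target follow that also follow back?” would otherwise require, next to a pure-Python equivalent over an edge list. The `Person` label, the `FOLLOWS` relationship, and the function name are hypothetical; Osintgraph's real schema and the queries its agent generates may look different.

```python
from typing import Iterable, List, Tuple

# Hypothetical Cypher for "accounts the target follows that follow the target back".
MUTUAL_FOLLOWS_CYPHER = (
    "MATCH (t:Person {username: $target})-[:FOLLOWS]->(p:Person)-[:FOLLOWS]->(t) "
    "RETURN p.username AS username ORDER BY username"
)


def mutual_follows(edges: Iterable[Tuple[str, str]], target: str) -> List[str]:
    """Pure-Python equivalent; edges are (follower, followee) pairs."""
    edge_set = set(edges)
    # Everyone the target follows.
    followed_by_target = {dst for src, dst in edge_set if src == target}
    # Keep only those who also have an edge pointing back at the target.
    return sorted(
        p for p in followed_by_target
        if p != target and (p, target) in edge_set
    )
```

The natural language interface spares the investigator from writing the pattern-matching query by hand, while the underlying graph traversal stays the same.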

Technically, Osintgraph is Python-based, integrating several components:

  • Instagram scraping logic that handles session cookies and user agent to reduce detection risk
  • Neo4j graph database for data storage and query
  • Gemini API integration for AI-driven analysis and natural language interface

Together, these components combine scraping, graph storage, and AI analysis into a single, coherent OSINT workflow.

Technical strengths and design tradeoffs of Osintgraph

What sets Osintgraph apart is the tight integration of an AI agent with a graph database backend. Instead of just scraping and graphing, it empowers users to interact with the data conversationally. This lowers the barrier for investigators who might not be familiar with graph query languages.

The codebase is surprisingly clean for a project juggling network scraping, AI APIs, and graph storage. The Python CLI commands are well-documented, and the setup command guides users through configuring Instagram credentials, the Neo4j connection, and the Gemini API key.

However, the tradeoffs are clear:

  • Dependency on external services: Neo4j must be set up (a free tier is available, but it is still an external dependency), and the AI features need a Gemini API key from Google AI Studio.
  • Instagram scraping is fragile by nature. The tool uses session cookies and a user agent string to mimic a real user, but Instagram can change its API or block accounts at any time.
  • The AI agent’s capabilities depend on Gemini’s API limits and latency, which might affect responsiveness.
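A common way to soften the rate-limit and latency risks listed above is to retry transient failures with exponential backoff. The following is a generic sketch, not code from Osintgraph; the function name, the retry count, and the delays are all illustrative assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retries: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with delays of base_delay * 2**attempt."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the behavior can be checked without actually waiting. In real use, it would be wise to catch only the specific exceptions that signal a temporary failure rather than every `Exception`.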

The choice of Neo4j is well-suited for social network data but adds operational overhead compared to simpler storage. Still, the visual and query power it offers justifies this.

Overall, Osintgraph balances complexity and usability well, focusing on practical OSINT workflows rather than theoretical purity.

Quick start with Osintgraph

Getting Osintgraph running requires installing the package, configuring credentials, and running commands to collect and analyze data. Here’s the exact process from the README:

# Install OSINTGraph
pipx install osintgraph
# Or using pip inside a virtual environment, e.g.:
python -m venv venv && source venv/bin/activate
pip install osintgraph

Next, configure the required services:

  • Instagram account credentials (preferably not your main account)
  • Neo4j database instance (create a free instance, get admin credentials)
  • Gemini API key from Google AI Studio
  • Optional user agent string copied from your browser to reduce detection risk

Run the setup command to enter these configurations interactively:

osintgraph setup

To start collecting Instagram data on a target username (replace TARGET_INSTAGRAM_USERNAME):

osintgraph discover TARGET_INSTAGRAM_USERNAME --limit follower=100 followee=100 post=2
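The `--limit` option takes a list of `key=value` pairs. As a rough illustration of how such an option can be parsed, here is a small `argparse` sketch. It is not Osintgraph's actual argument parser; the program name `discover-sketch` and the function names are hypothetical.

```python
import argparse
from typing import Dict, List, Tuple


def parse_limits(pairs: List[str]) -> Dict[str, int]:
    """Convert ["follower=100", "post=2"] into {"follower": 100, "post": 2}."""
    limits = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key or not value.isdigit():
            raise ValueError(f"expected KEY=NUMBER, got {pair!r}")
        limits[key] = int(value)
    return limits


def parse_cli(argv: List[str]) -> Tuple[str, Dict[str, int]]:
    """Parse a discover-style command line into (username, limits)."""
    parser = argparse.ArgumentParser(prog="discover-sketch")
    parser.add_argument("username")
    # nargs="*" collects every following KEY=NUMBER token into one list.
    parser.add_argument("--limit", nargs="*", default=[])
    args = parser.parse_args(argv)
    return args.username, parse_limits(args.limit)
```

For example, `parse_cli(["TARGET", "--limit", "follower=100", "followee=100", "post=2"])` yields the username together with a dictionary of per-category caps, which a collector could use to stop early and keep the number of requests small.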

After data collection, launch the AI agent to query and analyze:

osintgraph agent

Try asking the agent simple commands like:

Show the target user's profile info

For manual graph visualization, use the Neo4j console:

  • Open Neo4j Console in a browser
  • Connect to your database
  • Use the “Explore” tab and search for “Show me a graph”

This lets you visualize the social network graph interactively.

Verdict

Osintgraph is a solid, practical tool for OSINT researchers focused on Instagram social network mapping. Its two-phase architecture integrating scraping, graph storage, and AI-driven natural language querying is well thought out and implemented.

It’s best suited for users comfortable managing external dependencies like Neo4j and the Gemini API, and who understand the inherent fragility of Instagram scraping. The AI agent adds a layer of usability uncommon in OSINT tools, making complex graph data accessible.

Limitations include dependency on third-party services, potential Instagram scraping blocks, and API usage costs or limits. It is not a turnkey solution but a robust framework that requires some setup and operational awareness.

For anyone needing to map Instagram social relations with AI-augmented investigation, Osintgraph is worth exploring.


→ GitHub Repo: XD-MHLOO/Osintgraph ⭐ 736 · Python