Scrapfly MCP Integration

Updated May 25, 2026

Cloud browser sessions, anti-bot bypass scraping, and structured data extraction from Scrapfly's hosted MCP cloud, for web-data-heavy analytics workflows.

Scrapfly is the cloud-browser and scraping infrastructure many data teams use when sites enforce anti-bot protection. Connecting Scrapfly to clariBI surfaces structured page data, headless browser sessions, and crawl-extraction tooling inside AI analyses.

Why connect Scrapfly

When an analysis needs data that sits behind anti-bot protection, search-grade tools such as Tavily or Firecrawl are not enough. Scrapfly's cloud-browser MCP gives the AI engine a fallback: it can render the page in a real browser session and extract the content from there.

You can ask "Extract the product table from this site that blocks scrapers", "Render this single-page app and pull the JSON-LD", or "Crawl the docs subdomain and surface page titles", and the AI engine routes the request through Scrapfly.

How the connection works

clariBI talks to Scrapfly through its hosted MCP server at https://mcp.scrapfly.io/mcp. Authentication uses an OAuth flow in which clariBI registers itself as the client, so no developer-console setup is needed on your side. Tokens are stored encrypted server-side and never leave clariBI in clear form.

sequenceDiagram
    actor U as You
    participant C as clariBI
    participant V as Scrapfly
    U->>C: Click Authorize with Scrapfly
    C->>V: Open OAuth authorization
    V-->>U: Grant read access?
    U->>V: Approve
    V-->>C: Authorization code
    C->>V: Exchange code for tokens
    V-->>C: Access + refresh tokens
    C->>C: Encrypt and store credentials
    C-->>U: Connection ready
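
The two requests in the diagram follow the general shape of an OAuth 2.0 authorization-code flow. The sketch below builds them without making any network call. The endpoint URLs, client ID, and redirect URI are placeholders invented for illustration; clariBI performs this exchange for you, and its actual parameters are not documented here.

```python
import secrets
from urllib.parse import urlencode

# Hypothetical values for illustration only. The real endpoints and client
# registration are handled by clariBI and are not part of this article.
AUTHORIZE_URL = "https://auth.example.com/oauth/authorize"
REDIRECT_URI = "https://app.example.com/oauth/callback"
CLIENT_ID = "example-client-id"


def build_authorization_url(state: str) -> str:
    """Build the URL the popup opens so the user can approve access."""
    params = {
        "response_type": "code",  # ask for an authorization code
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "state": state,  # echoed back by the server; guards against CSRF
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


def build_token_request(code: str) -> dict[str, str]:
    """Build the form fields posted to exchange the code for tokens."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
    }


def state_matches(expected: str, returned: str) -> bool:
    """Compare the returned state with the one we sent, in constant time."""
    return secrets.compare_digest(expected, returned)


if __name__ == "__main__":
    state = secrets.token_urlsafe(16)
    print(build_authorization_url(state))
```

The `state` check matters because the authorization code arrives through the browser: rejecting a callback whose `state` differs from the one sent prevents a third party from injecting a code of their own.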

Available tools

clariBI exposes the read-only Scrapfly tools that the vendor's MCP server publishes at connection time. Write operations (create, update, delete, send, refund) are filtered out by a name-pattern blocklist before any tool reaches the analysis engine, so connecting Scrapfly cannot modify data on the vendor side.

The exact tool inventory depends on the Scrapfly features your account has access to. After connecting, try a few natural-language questions to see what Scrapfly data clariBI can pull.
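
A name-pattern blocklist of the kind described above can be sketched in a few lines. The verb list comes from this section, but the regular expression and the tool names in the usage example are assumptions; the patterns clariBI actually applies are not published.

```python
import re

# Verbs taken from the text above. The matching rule itself is an assumption:
# a verb counts only when it appears as a whole word segment, delimited by
# the start or end of the name or by "_", "-", or ".".
WRITE_VERBS = ("create", "update", "delete", "send", "refund")
WRITE_PATTERN = re.compile(
    r"(?:^|[_\-.])(?:" + "|".join(WRITE_VERBS) + r")(?:$|[_\-.])",
    re.IGNORECASE,
)


def filter_read_only(tool_names: list[str]) -> list[str]:
    """Return only the tools whose names contain no write verb."""
    return [name for name in tool_names if not WRITE_PATTERN.search(name)]


if __name__ == "__main__":
    # Hypothetical tool names, not Scrapfly's real inventory.
    published = ["scrape_page", "extract_data", "delete_session", "create_crawl"]
    print(filter_read_only(published))
```

Matching on whole segments keeps a name such as `updated_at_lookup` from being dropped by accident. The trade-off is that camelCase names like `deleteSession` would slip through this particular pattern, so a real blocklist would need to normalize names first.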

Data flow during analysis

When you ask a question that maps to Scrapfly, the AI engine routes to the right tool, reads the result, and pairs the answer with a chart you can pin to a dashboard.

sequenceDiagram
    actor U as You
    participant C as clariBI
    participant AI as AI engine
    participant V as Scrapfly
    U->>C: Ask a question about cloud-browser scraping and structured extraction
    C->>AI: Plan the analysis
    AI->>V: Call the right tool
    V-->>AI: Tool result
    AI->>AI: Summarize and chart
    C-->>U: Answer plus visual
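
The "Call the right tool" step is, at the protocol level, a JSON-RPC 2.0 request using MCP's `tools/call` method. The sketch below shows the shape of that request and how text content might be read back from the result. The tool name `web_scrape` and its `url` argument are hypothetical; the real names depend on what Scrapfly's server publishes for your account.

```python
import json
from itertools import count

# Monotonic request IDs so each response can be matched to its request.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def extract_text(response: dict) -> str:
    """Join the text parts of a tools/call response, raising on errors."""
    if "error" in response:
        # JSON-RPC protocol error (unknown method, bad params, and so on).
        raise RuntimeError(response["error"].get("message", "unknown MCP error"))
    result = response["result"]
    text = "\n".join(
        part["text"]
        for part in result.get("content", [])
        if part.get("type") == "text"
    )
    if result.get("isError"):
        # The tool itself ran but reported a failure.
        raise RuntimeError(text or "tool reported an error")
    return text


if __name__ == "__main__":
    # "web_scrape" and its argument are hypothetical placeholders.
    request = build_tool_call("web_scrape", {"url": "https://example.com"})
    print(json.dumps(request, indent=2))
```

Separating protocol errors from tool-reported errors mirrors how MCP distinguishes a failed request from a tool that ran and returned `isError`, which is where messages like the ones under Troubleshooting would surface.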

Setting up the connection

1. Open Data Sources in the clariBI sidebar.
2. Click Add data source.
3. Open the MCP Servers tab.
4. Click the Scrapfly card.
5. Click Authorize with Scrapfly.
6. Sign in to Scrapfly in the popup window and grant the requested read scopes.
7. Back in clariBI, give your data source a name.
8. Click Finish.

Permissions and data access

OAuth scopes are granted at authorization time on the Scrapfly consent screen. clariBI restricts itself to read-only operations on the Scrapfly side; the data Scrapfly fetches comes from the third-party sites you choose.

Troubleshooting

Error	Cause	Fix
"Quota exceeded"	Scrapfly enforces per-plan request quotas.	Upgrade your Scrapfly plan or narrow the scrape scope.
"Site blocks scraper"	Some sites use advanced anti-bot fingerprinting that Scrapfly cannot bypass on your plan tier.	Try a higher Scrapfly plan with browser sessions, or try an alternative path such as Firecrawl, Tavily, or Apify instead.
