Import Confluence Pages into clariBI

Overview

Confluence is where many teams keep their documentation, meeting notes, project plans, and knowledge bases. By connecting Confluence to clariBI, you can import this content and use AI analytics to extract insights, track project status, and identify patterns across your documentation.

This guide walks through connecting Confluence, selecting content to import, and getting started with analysis.

Prerequisites

An active Atlassian Confluence account (Cloud or Data Center)
Admin access to the Confluence space(s) you want to import
Analyst role or above in your clariBI organization

Step 1: Connect Confluence

For Confluence Cloud

Go to Data Sources in the clariBI sidebar.
Click Add Source.
Select Confluence as the data source type.
Click Connect with Atlassian.
Log in to your Atlassian account if prompted.
Review the permissions clariBI is requesting:
Read Confluence content -- Access to pages, blog posts, and attachments
Read Confluence spaces -- Access to space metadata and structure
Click Accept.
After authorization, clariBI shows your Confluence site URL. Confirm it is correct.

For Confluence Data Center (Self-Hosted)

Go to Data Sources, click Add Source, and select Confluence Data Center as the data source type.
Enter your Confluence server URL (e.g., https://confluence.yourcompany.com).
Provide an API token generated from your Confluence admin settings.
Click Test Connection, then Save.
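
If Test Connection fails, it can help to check the server URL and token outside clariBI. The sketch below builds a request against the Confluence REST API's space listing endpoint. It assumes the token is a personal access token sent as a Bearer header; how clariBI itself tests the connection is not documented here, and the function name is hypothetical.

```python
import urllib.request


def build_space_list_request(base_url: str, token: str, limit: int = 1) -> urllib.request.Request:
    """Build (but do not send) a GET request that lists Confluence spaces.

    Assumption: the token is accepted as a Bearer token by your server.
    """
    url = f"{base_url.rstrip('/')}/rest/api/space?limit={limit}"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


if __name__ == "__main__":
    req = build_space_list_request("https://confluence.yourcompany.com", "YOUR_TOKEN")
    print(req.get_method(), req.full_url)
    # To actually send the request, uncomment the next line:
    # urllib.request.urlopen(req, timeout=10)
```

A 200 response suggests the URL and token are usable; a 401 or 403 points to the token or its permissions rather than to clariBI.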

Step 2: Select Spaces and Pages

After connecting:

clariBI shows a list of all Confluence spaces you have access to.
Select the spaces you want to import. You can choose:
Entire space -- All pages and sub-pages in the space
Specific pages -- Browse the page tree and select individual pages
Pages with a specific label -- Import only pages tagged with certain Confluence labels
Choose the content types to include:
Pages -- Standard wiki pages (recommended)
Blog posts -- Team blog entries
Attachments -- Files attached to pages (PDFs, spreadsheets, images)
Click Continue.

Filtering by Label

Label-based filtering is useful when you only want to analyze a subset of content. For example, if your team tags meeting notes with a "meeting-notes" label, you can import only those pages.
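
To preview which pages a label filter would match, you can run an equivalent search in Confluence itself using CQL (Confluence Query Language). The helper below is a minimal sketch that builds such a query; whether clariBI uses CQL internally is an assumption, and the space key "ENG" is only an example.

```python
from typing import Optional, Sequence


def build_label_cql(labels: Sequence[str], space_key: Optional[str] = None,
                    content_type: str = "page") -> str:
    """Build a CQL string matching content that carries any of the given labels."""
    if not labels:
        raise ValueError("at least one label is required")
    clauses = [f"type = {content_type}"]
    if space_key:
        clauses.append(f'space = "{space_key}"')
    if len(labels) == 1:
        clauses.append(f'label = "{labels[0]}"')
    else:
        quoted = ", ".join(f'"{label}"' for label in labels)
        clauses.append(f"label in ({quoted})")
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(build_label_cql(["meeting-notes"], space_key="ENG"))
    # prints: type = page AND space = "ENG" AND label = "meeting-notes"
```

Pasting the resulting query into Confluence's advanced search shows the pages the label selects before you commit to an import.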

Step 3: Configure Sync Settings

Sync frequency: Daily (recommended), weekly, or manual only.
Historical depth: Import all pages or only pages created/modified in the last 30, 90, or 180 days.
Content format: clariBI imports the text content of each page. Tables, headings, and lists are preserved. Images are referenced but not analyzed.
Click Save and Sync.

The initial sync imports all selected content within the historical depth you chose. Subsequent syncs pull only pages created or updated since the previous sync.
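
The difference between the initial and later syncs can be sketched as a selection over page modification times. This is an illustration only, not clariBI's implementation; `PageInfo`, `pages_to_sync`, and their parameters are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List, Optional


@dataclass
class PageInfo:
    page_id: str
    last_modified: datetime


def pages_to_sync(pages: Iterable[PageInfo], now: datetime,
                  last_sync: Optional[datetime] = None,
                  history_days: Optional[int] = None) -> List[PageInfo]:
    """Pick the pages a sync would fetch.

    - Later syncs (last_sync set): only pages modified after the last sync.
    - Initial sync (last_sync is None): every page, or only pages modified
      within the last `history_days` days if a historical depth is set.
    """
    if last_sync is not None:
        return [p for p in pages if p.last_modified > last_sync]
    if history_days is None:
        return list(pages)
    cutoff = now - timedelta(days=history_days)
    return [p for p in pages if p.last_modified >= cutoff]


if __name__ == "__main__":
    now = datetime(2024, 6, 1)
    pages = [
        PageInfo("old", datetime(2023, 1, 1)),
        PageInfo("recent", datetime(2024, 5, 20)),
    ]
    # Initial sync limited to the last 30 days selects only "recent".
    print([p.page_id for p in pages_to_sync(pages, now, history_days=30)])
```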

Using Confluence Data in clariBI

AI-Powered Analysis

With Confluence data imported, you can ask questions like:

"Summarize the key decisions from all meeting notes in the Engineering space this quarter"
"How many project status pages mention 'delayed' or 'at risk'?"
"What topics appear most frequently across our product documentation?"
"List all action items from the last 10 meeting notes"

Each query costs 1 AI credit.

Building Dashboards

Create a dashboard that tracks Confluence content:

Content volume widget -- Number of pages created per week/month
Top contributors -- Who is writing the most documentation
Label distribution -- Which labels are used most frequently
Recent updates -- A feed of recently modified pages
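
The first three widgets are simple aggregations over page metadata. The sketch below shows the underlying counts on a small in-memory sample; the record fields (`author`, `created`, `labels`) are assumed names for illustration, not clariBI's schema.

```python
from collections import Counter
from datetime import date

# Hypothetical sample of imported page metadata.
PAGES = [
    {"author": "ana", "created": date(2024, 1, 2), "labels": ["meeting-notes"]},
    {"author": "ben", "created": date(2024, 1, 9), "labels": ["meeting-notes", "roadmap"]},
    {"author": "ana", "created": date(2024, 1, 10), "labels": ["roadmap"]},
]


def pages_per_week(pages):
    """Count pages by (ISO year, ISO week) of their creation date."""
    weeks = Counter()
    for page in pages:
        iso = page["created"].isocalendar()
        weeks[(iso[0], iso[1])] += 1
    return weeks


def top_contributors(pages, n=3):
    """Return the n authors with the most pages, most prolific first."""
    return Counter(page["author"] for page in pages).most_common(n)


def label_distribution(pages):
    """Count how often each label appears across all pages."""
    return Counter(label for page in pages for label in page["labels"])


if __name__ == "__main__":
    print(pages_per_week(PAGES))
    print(top_contributors(PAGES))
    print(label_distribution(PAGES))
```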

Generating Reports

Generate AI-powered reports from your Confluence data:

Quarterly documentation review -- Summarizes new and updated pages
Meeting notes digest -- Extracts decisions and action items across multiple meetings
Knowledge gap analysis -- Identifies topics with sparse documentation

Content Processing

What clariBI Imports

Page title and hierarchy (parent/child relationships)
Page body as plain text (HTML formatting is stripped; headings, tables, and lists are preserved)
Author and last modified date
Labels/tags assigned to the page
Attachment metadata (file name, type, size) -- not the file contents
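
To give a sense of what "formatting stripped, structure preserved" can look like, here is a minimal sketch using Python's standard `html.parser`. It is not clariBI's converter: the output conventions (one line per heading or paragraph, `- ` for list items, ` | ` between table cells) are assumptions chosen for the example.

```python
from html.parser import HTMLParser

# Tags that start and end their own line in the plain-text output.
LINE_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6",
             "tr", "br", "div", "ul", "ol", "table", "li"}


class _TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.parts.append("\n- ")      # list items become "- item"
        elif tag in LINE_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self.parts.append(" | ")       # separate table cells
        elif tag in LINE_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html: str) -> str:
    """Strip inline formatting but keep headings, lists, and table rows as lines."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    lines = (line.strip().rstrip(" |") for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = ("<h1>Plan</h1><p>Ship <b>v2</b> soon.</p>"
              "<ul><li>Test</li><li>Deploy</li></ul>"
              "<table><tr><th>Owner</th><th>Status</th></tr>"
              "<tr><td>Ana</td><td>Done</td></tr></table>")
    print(html_to_text(sample))
```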

What clariBI Does Not Import

Page permissions (all imported content follows clariBI's access controls)
Confluence macros (rendered output is imported as text where possible)
Inline comments on pages
Page version history (only the current version is imported)

Managing the Connection

Re-syncing

To trigger a manual sync:

Go to Data Sources and click the Confluence connection.
Click Sync Now.
clariBI pulls any new or updated pages since the last sync.

Handling Deleted Pages

If a Confluence page is deleted after import, clariBI marks it as "Source Deleted" in the data. The content remains in clariBI until you manually remove it or run a cleanup.
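
Conceptually, this amounts to comparing the set of imported page IDs with the IDs still present in Confluence and flagging the difference rather than deleting it. The sketch below illustrates that idea; the record layout and function name are hypothetical, and only the "Source Deleted" label comes from the behavior described above.

```python
def mark_source_deleted(imported: dict, source_ids: set) -> list:
    """Flag imported pages whose source no longer exists.

    `imported` maps page ID -> record dict. Records are kept, not removed;
    the function returns the IDs that were newly flagged.
    """
    newly_flagged = []
    for page_id, record in imported.items():
        if page_id not in source_ids and record.get("status") != "Source Deleted":
            record["status"] = "Source Deleted"
            newly_flagged.append(page_id)
    return newly_flagged


if __name__ == "__main__":
    imported = {
        "101": {"title": "Roadmap", "status": "Active"},
        "102": {"title": "Old notes", "status": "Active"},
    }
    print(mark_source_deleted(imported, source_ids={"101"}))
    # prints: ['102']
```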

Disconnecting

Go to Data Sources.
Click the three-dot menu next to the Confluence connection.
Select Disconnect.
Previously imported data remains in clariBI but no new syncs occur.

To remove the imported data as well, select Disconnect and Remove Data instead of Disconnect.

Troubleshooting

"Insufficient Permissions" Error

Ensure your Confluence account has at least read access to the spaces you selected.
For Data Center, verify the API token has the correct permissions.

Missing Pages After Sync

Check that the pages are in a selected space or match the label filter.
Pages in restricted spaces (personal spaces or spaces with restricted permissions) may not be accessible to the API.
Run a manual sync and check the sync log for errors.

Large Spaces Take Too Long

If a space has thousands of pages, the initial sync may take 10-30 minutes. Subsequent syncs are faster because only changes are pulled.
Consider filtering by label to import only relevant content.
