Systematic Review Automation

PRISMA 2020 Pipeline with I-Category Agents

Conversation-driven systematic literature review automation with AI-assisted screening, automated PDF retrieval, and RAG-powered analysis.

What is Systematic Review Automation?

Diverga's I-category agents provide a 7-stage automated pipeline for systematic literature reviews that follows the PRISMA 2020 guidelines. They combine a conversation-driven workflow with AI-assisted screening and retrieval-augmented generation (RAG).

PRISMA 2020 compliant workflow

AI-assisted screening with Groq LLM

Automated PDF retrieval (3 databases)

RAG-powered literature analysis

7-Stage Pipeline

Each stage builds on the previous, ensuring systematic and reproducible research.

1. Research Domain Setup (15-20 min): Define the research question, scope, and constraints

2. Query Strategy Design (20-30 min): Design search queries with keywords and operators

3. PRISMA Configuration (15-25 min): Set inclusion/exclusion criteria and thresholds

4. Database Search (10-20 min): Fetch papers from Semantic Scholar, OpenAlex, and arXiv

5. Screening & Selection (30-60 min): AI-assisted relevance screening with a configurable LLM

6. RAG System Building (20-40 min): Create a vector database for semantic search

7. Analysis & Synthesis (ongoing): Query the literature and generate the PRISMA diagram
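The time estimates for the stages above can be totalled to gauge the initial effort. The following is a minimal sketch that uses only the stage names and ranges given on this page; the `Stage` class and the `setup_time_range` helper are illustrative names, not part of Diverga.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Stage:
    name: str
    minutes: Optional[Tuple[int, int]]  # None marks the open-ended "Ongoing" stage


# Stage names and time estimates copied from the pipeline description above.
PIPELINE = [
    Stage("Research Domain Setup", (15, 20)),
    Stage("Query Strategy Design", (20, 30)),
    Stage("PRISMA Configuration", (15, 25)),
    Stage("Database Search", (10, 20)),
    Stage("Screening & Selection", (30, 60)),
    Stage("RAG System Building", (20, 40)),
    Stage("Analysis & Synthesis", None),
]


def setup_time_range(stages):
    """Sum the lower and upper estimates of every time-boxed stage."""
    timed = [s.minutes for s in stages if s.minutes is not None]
    return sum(lo for lo, _ in timed), sum(hi for _, hi in timed)


low, high = setup_time_range(PIPELINE)
print(f"Stages 1-6: {low}-{high} min")
```

Under these estimates, stages 1 through 6 add up to 110-195 minutes, roughly two to three hours before the open-ended analysis stage begins.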

Two Project Types

Choose the workflow that matches your research scope:

Knowledge Repository

15,000-20,000 papers

50% relevance threshold

Broad exploration, topic discovery, RAG-first workflow

Emerging research fields

Interdisciplinary topics

Exploratory reviews

Systematic Review

50-300 papers

90% relevance threshold

Rigorous PRISMA 2020 compliance, publication-ready

Meta-analysis

Clinical guidelines

Evidence synthesis
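The two relevance thresholds determine how many screened papers are kept. The sketch below assumes each paper carries a relevance score between 0 and 1 from the screening LLM and that the threshold is inclusive; neither detail is stated on this page. The `relevance` field, the `THRESHOLDS` keys, and `select_papers` are hypothetical names.

```python
# Thresholds from the two project types above, expressed as fractions.
THRESHOLDS = {
    "knowledge_repository": 0.50,
    "systematic_review": 0.90,
}


def select_papers(papers, project_type):
    """Keep papers whose (assumed) relevance score meets the project threshold."""
    threshold = THRESHOLDS[project_type]
    return [p for p in papers if p["relevance"] >= threshold]


# Made-up scores for illustration only.
papers = [
    {"title": "Paper A", "relevance": 0.95},
    {"title": "Paper B", "relevance": 0.72},
    {"title": "Paper C", "relevance": 0.41},
]

for project_type in THRESHOLDS:
    kept = select_papers(papers, project_type)
    print(project_type, [p["title"] for p in kept])
```

With these sample scores the Knowledge Repository setting keeps Papers A and B, while the Systematic Review setting keeps only Paper A, which illustrates why the stricter workflow ends with far fewer papers.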

Project Structure

Diverga creates a dual-directory structure separating system files from researcher-facing documentation:

General Research Project

Created by natural-language project initialization or /diverga:setup

.research/                    # System files (hidden)
├── baselines/
│   ├── literature/
│   ├── methodology/
│   └── framework/
├── changes/
│   ├── current/
│   └── archive/
├── sessions/
├── project-state.yaml        # Research configuration
├── decision-log.yaml         # Checkpoint decisions
├── checkpoints.yaml          # Checkpoint states
└── hud-state.json            # HUD display state

docs/                          # Researcher-facing (auto-generated)
├── PROJECT_STATUS.md          # Progress tracking
├── DECISION_LOG.md            # Decision audit trail
├── RESEARCH_AUDIT.md          # IRB/reproducibility audit
├── METHODOLOGY.md             # Research design summary
├── TIMELINE.md                # Milestones and deadlines
├── REFERENCES.md              # Bibliography tracking
└── README.md                  # Project overview (editable)

PRISMA Pipeline Project

Additional structure created when running the systematic review pipeline

data/
├── raw/                       # Downloaded PDFs
│   ├── semantic_scholar/
│   ├── openalex/
│   └── arxiv/
├── processed/
│   ├── deduplicated.json      # After deduplication
│   ├── screened.json           # After AI screening
│   └── included.json           # Final included papers
├── vectordb/                   # ChromaDB vector database
│   └── chroma/
├── reports/
│   ├── prisma_flow.png         # PRISMA 2020 diagram
│   └── screening_report.md     # Screening statistics
└── config.yaml                 # Pipeline configuration
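For orientation, the directory part of the layout above could be reproduced by hand as follows. This is a sketch only: Diverga creates these folders itself, the `scaffold` helper and `PIPELINE_DIRS` list are hypothetical, and the example creates directories but none of the files shown in the tree.

```python
import tempfile
from pathlib import Path

# Directories taken from the data/ tree above (files are not created here).
PIPELINE_DIRS = [
    "raw/semantic_scholar",
    "raw/openalex",
    "raw/arxiv",
    "processed",
    "vectordb/chroma",
    "reports",
]


def scaffold(project_root):
    """Create the data/ directory skeleton under project_root and return its path."""
    data_dir = Path(project_root) / "data"
    for rel in PIPELINE_DIRS:
        (data_dir / rel).mkdir(parents=True, exist_ok=True)
    return data_dir


# Demonstrate in a throwaway directory so nothing is left on disk.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = scaffold(tmp)
    for path in sorted(data_dir.rglob("*")):
        print(path.relative_to(data_dir).as_posix())
```

Because `mkdir` is called with `exist_ok=True`, running the helper twice on the same project is harmless.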

Supported Databases

Three databases chosen for API access and PDF availability:

Semantic Scholar

~40% open access

Free API access

Citation network

Influential papers

OpenAlex

~50% open access

Polite pool (faster responses when requests include a contact email)

Rich metadata

Institution tracking

arXiv

100% PDF access

Preprint server

Full-text access

No API key required (arXiv asks clients to wait about 3 seconds between requests)
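To make the three sources concrete, the sketch below builds one search URL per database using their public REST endpoints. It only constructs the URLs and performs no network requests. The `build_search_urls` helper, the chosen `fields` list, and the simplified arXiv query syntax are assumptions for illustration and are not taken from Diverga's code.

```python
from urllib.parse import urlencode


def build_search_urls(query, limit=25, contact_email=None):
    """Return one keyword-search URL per database (no requests are sent)."""
    semantic_scholar = (
        "https://api.semanticscholar.org/graph/v1/paper/search?"
        + urlencode({"query": query, "limit": limit, "fields": "title,year,openAccessPdf"})
    )

    # OpenAlex routes requests that carry a contact email to its polite pool.
    openalex_params = {"search": query, "per-page": limit}
    if contact_email:
        openalex_params["mailto"] = contact_email
    openalex = "https://api.openalex.org/works?" + urlencode(openalex_params)

    # "all:" searches every arXiv metadata field; real queries may need quoting.
    arxiv = "http://export.arxiv.org/api/query?" + urlencode(
        {"search_query": f"all:{query}", "start": 0, "max_results": limit}
    )

    return {"semantic_scholar": semantic_scholar, "openalex": openalex, "arxiv": arxiv}


urls = build_search_urls("systematic review", limit=10, contact_email="me@example.org")
for name, url in urls.items():
    print(f"{name}: {url}")
```

Keeping URL construction separate from fetching makes it straightforward to log the exact queries that were run, which a PRISMA search strategy has to report.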

Cost Efficiency

Minimize API costs while maintaining quality:

500-paper review

~$0.07

Screening time

30-60 min

PDF retrieval

50-60%

Groq LLM (default): $0.01 per 100 papers
Local embeddings: Zero cost (sentence-transformers)
No institutional subscriptions required
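The screening rate quoted above scales linearly with the number of papers, so the cost of a larger project is easy to estimate. The following sketch uses only the $0.01 per 100 papers figure from this page; `screening_cost` is an illustrative helper name.

```python
import math

GROQ_COST_PER_100_PAPERS = 0.01  # USD, the default Groq rate quoted above


def screening_cost(n_papers, cost_per_100=GROQ_COST_PER_100_PAPERS):
    """Estimate LLM screening cost in USD for n_papers at a flat per-100 rate."""
    return n_papers / 100 * cost_per_100


for n in (500, 5_000, 20_000):
    print(f"{n:>6} papers: ${screening_cost(n):.2f}")

# Sanity check against the quoted rate: 100 papers should cost one cent.
assert math.isclose(screening_cost(100), 0.01)
```

At this rate, screening 500 papers costs about $0.05 and screening a 20,000-paper Knowledge Repository about $2.00. The ~$0.07 quoted for a 500-paper review is slightly higher than the screening rate alone; the page does not itemize the difference.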

Key Features

AI-Assisted Screening

Groq LLM (llama-3.3-70b) for relevance scoring with configurable thresholds

Conversation-Driven

Stage-by-stage prompts guide researchers through PRISMA workflow

Automated PDF Retrieval

Retry logic, fallback chains, and progress tracking for a 50-60% retrieval success rate

RAG-Powered Analysis

ChromaDB vector database for semantic search and synthesis

PRISMA Diagram

Auto-generate PRISMA 2020 flow diagram with exclusion tracking

Quality Validation

Checkpoint integration ensures reproducibility and transparency
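The retry logic and fallback chain mentioned under Automated PDF Retrieval can be pictured as follows. This is a generic sketch of the pattern, not Diverga's implementation: `fetch_with_fallback`, its parameters, and the two simulated fetchers are hypothetical, and real fetchers would perform HTTP downloads.

```python
import time


def fetch_with_fallback(paper_id, fetchers, retries=3, base_delay=1.0, sleep=time.sleep):
    """Try each (source, fetch) pair in order, retrying failures with exponential backoff.

    Returns (source, pdf_bytes) on success or (None, None) if every source fails.
    """
    for source, fetch in fetchers:
        for attempt in range(retries):
            try:
                pdf = fetch(paper_id)
            except Exception:  # broad on purpose: any failure triggers a retry
                pdf = None
            if pdf:
                return source, pdf
            if attempt < retries - 1:
                sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
    return None, None


# Simulated fetchers standing in for real download functions.
def always_fails(paper_id):
    raise ConnectionError("simulated network failure")


def succeeds(paper_id):
    return b"%PDF-1.7 simulated bytes for " + paper_id.encode()


source, pdf = fetch_with_fallback(
    "demo-paper",
    [("openalex", always_fails), ("arxiv", succeeds)],
    sleep=lambda seconds: None,  # skip real waiting in this demonstration
)
print(source)
```

Passing `sleep` as a parameter keeps the backoff testable without real delays. Recording which source finally succeeded is also a natural input for the progress tracking described above.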

Learn More

Explore detailed documentation for each component:

PRISMA Guidelines

PRISMA 2020 compliance and flow diagram

Supported Databases

Database API integration and PDF retrieval strategy

Project Structure

Auto-generated documentation and folder layout

Ready to Automate Your Systematic Review?

Start with Diverga's I-category agents to conduct PRISMA 2020 compliant systematic reviews in hours, not weeks.

View I-Category Agents

Quick Start Guide

GitHub Repository