Developer guides

LlamaIndex Integration Guide

Use Crawlora APIs to feed structured public web data into LlamaIndex readers, tools, and retrieval pipelines instead of parsing raw HTML.

Custom reader · Tool pattern · Retrieval ingestion · Python

Verified HTTP pattern

POST /google/search

Normalized JSON

Request

POST https://api.crawlora.net/api/v1/google/search
x-api-key: $CRAWLORA_API_KEY
Content-Type: application/json

{
  "country": "us",
  "keyword": "best CRM software",
  "language": "en",
  "limit": 10,
  "page": 1
}

Base URL: https://api.crawlora.net/api/v1

Auth header: x-api-key

Example endpoint: POST /google/search

This repository does not contain an official LlamaIndex package. Treat this guide as a custom reader and tool pattern around the HTTP API.
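The request above can be assembled in Python roughly as follows. This is a minimal sketch that uses only the standard library and builds the request without sending it. The helper name `build_search_request` and the `CRAWLORA_API_KEY` environment lookup are illustrative choices, not part of any Crawlora SDK, and the response is not parsed here because this guide does not document its exact shape.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.crawlora.net/api/v1"


def build_search_request(keyword: str, api_key: str, limit: int = 10, page: int = 1,
                         country: str = "us", language: str = "en") -> urllib.request.Request:
    """Build (but do not send) the POST /google/search request shown above."""
    body = {
        "country": country,
        "keyword": keyword,
        "language": language,
        "limit": limit,
        "page": page,
    }
    return urllib.request.Request(
        f"{BASE_URL}/google/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical usage: read the key from the environment, never hard-code it.
    req = build_search_request("best CRM software", os.environ.get("CRAWLORA_API_KEY", ""))
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req, timeout=10)
```

Keeping request construction separate from sending makes the payload easy to inspect in tests before any credits are spent.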

Developer workflow

Why use Crawlora with LlamaIndex?

LlamaIndex builds retrieval and agent workflows over your data. Crawlora returns normalized JSON from supported public platforms, which is easier to convert into Documents, embed, and query than raw HTML. The same agent-native structured data also backs Crawlora's hosted MCP tools.

Integration patterns

Custom reader that returns Documents.
Tool for agent-time queries.
Retrieval ingestion pipeline.
Scheduled data refresh job.
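The tool pattern in the list above could look something like the sketch below. It wraps a fetcher as a plain function with a docstring and a bounded result count, then hands it to LlamaIndex's `FunctionTool`. The names `make_search_fn`, `build_tool`, and `MAX_RESULTS` are hypothetical, and the `title`, `snippet`, and `url` fields follow the reader example in this guide rather than a documented response schema.

```python
from typing import Callable, Dict, List

# A fetcher takes (keyword, limit) and returns a list of result dicts.
Fetcher = Callable[[str, int], List[Dict]]

MAX_RESULTS = 5  # illustrative cap to keep agent-time calls small


def make_search_fn(fetch: Fetcher) -> Callable[..., str]:
    """Wrap a Crawlora fetcher as a plain function an agent can call."""

    def crawlora_search(keyword: str, limit: int = 3) -> str:
        """Search the public web via Crawlora and return short, citable results."""
        limit = max(1, min(limit, MAX_RESULTS))
        items = fetch(keyword, limit)[:limit]
        lines = [
            f"- {item.get('title', '')}: {item.get('snippet', '')} ({item.get('url', '')})"
            for item in items
        ]
        return "\n".join(lines) if lines else "No results."

    return crawlora_search


def build_tool(fetch: Fetcher):
    """Create a LlamaIndex FunctionTool; imported lazily so the wrapper stays testable."""
    from llama_index.core.tools import FunctionTool

    return FunctionTool.from_defaults(fn=make_search_fn(fetch), name="crawlora_search")
```

Injecting the fetcher keeps the HTTP call swappable, so the formatting and limit logic can be exercised with a stub before wiring in a real request.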

Python custom reader example

Convert Crawlora results into LlamaIndex Documents with clear metadata before indexing.

Custom reader · python

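import os

import requests
from llama_index.core import Document

API_URL = "https://api.crawlora.net/api/v1/google/search"

def crawlora_documents(query: str, limit: int = 10) -> list[Document]:
    # The endpoint expects a POST with a JSON body and a "keyword" field (see the request above).
    resp = requests.post(
        API_URL,
        headers={"x-api-key": os.environ["CRAWLORA_API_KEY"]},  # keep the key out of source code
        json={"keyword": query, "country": "us", "language": "en", "limit": limit, "page": 1},
        timeout=30,
    )
    resp.raise_for_status()  # surface 4xx/5xx (including 429) instead of parsing an error body
    # Assumes results arrive under "data" with "snippet", "url", and "title" fields; adjust to the actual response.
    return [
        Document(text=item.get("snippet", ""), metadata={"url": item.get("url"), "title": item.get("title")})
        for item in resp.json().get("data", [])
    ]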

Index and query

Build an index from the Documents, then query it in your RAG or agent flow.

Index + query · python

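from llama_index.core import VectorStoreIndex

# Assumes crawlora_documents() from the reader example is in scope and that an
# embedding model and LLM are configured for LlamaIndex.
docs = crawlora_documents("retrieval augmented generation")
index = VectorStoreIndex.from_documents(docs)
answer = index.as_query_engine().query("Summarize the latest on RAG")
print(answer)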
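Because the reader stores `url` and `title` in each Document's metadata, a query response can be turned into a citation list. The sketch below is one way to do that, assuming the response exposes `source_nodes` whose items carry the retrieved node under `.node` (as LlamaIndex's `NodeWithScore` does). The helper name `format_citations` is hypothetical.

```python
from typing import Any, Iterable, List


def format_citations(source_nodes: Iterable[Any]) -> List[str]:
    """Return de-duplicated "title (url)" strings from a response's source nodes."""
    seen = set()
    citations = []
    for item in source_nodes:
        metadata = item.node.metadata  # each item wraps the retrieved node
        url = metadata.get("url")
        if not url or url in seen:
            continue  # skip nodes without a source URL and repeated sources
        seen.add(url)
        citations.append(f"{metadata.get('title') or 'Untitled'} ({url})")
    return citations


# Hypothetical usage with the query above:
# response = index.as_query_engine().query("Summarize the latest on RAG")
# print(response, format_citations(response.source_nodes))
```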

Use cases

SERP research agent.
YouTube transcript summarizer.
Reddit community-insight retrieval.
Review and reputation monitoring.
Product and market research grounding.

Production tips

Keep API keys server-side.
Bound result counts and token budgets.
Add timeouts and handle 429 responses.
Cache repeated requests.
Keep source URLs in metadata for citations.
Avoid sending unnecessary personal data to LLMs.
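The timeout and 429 advice above can be sketched as a small retry helper. This is an illustration, not Crawlora's recommended client: `call_with_backoff` is a hypothetical name, the delays are arbitrary defaults, and whether the API sends a `Retry-After` header is an assumption.

```python
import random
import time
from typing import Any, Callable


def call_with_backoff(send: Callable[[], Any], max_attempts: int = 4,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> Any:
    """Call send() and retry on HTTP 429 with exponential backoff plus jitter.

    send must return an object exposing .status_code and .headers, as a
    requests.Response does. The last response is returned even if it is a 429.
    """
    response = None
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt == max_attempts - 1:
            break  # out of attempts; do not sleep again
        # Honor Retry-After when it is a plain number of seconds (assumption:
        # the guide does not say whether this header is sent).
        retry_after = response.headers.get("Retry-After", "")
        if retry_after.isdigit():
            delay = float(retry_after)
        else:
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
        sleep(delay)
    return response
```

A caller would pass something like `lambda: requests.post(url, headers=headers, json=body, timeout=30)`, so the timeout stays on the request itself and the attempt count stays bounded.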

Responsible public web data workflows

Use Crawlora for structured public web data workflows. Customers are responsible for compliance with applicable laws, third-party rights, platform rules, and Crawlora terms. Keep API keys server-side, validate inputs, and avoid collecting or storing unnecessary sensitive data.

Read Crawlora terms

FAQ

Common questions for this Crawlora developer integration path.

Does Crawlora have an official LlamaIndex package?

This frontend repository does not contain an official Crawlora LlamaIndex package. Use a custom reader or tool wrapper around the HTTP API.

Should I build a reader or a tool?

Use a reader for scheduled ingestion and indexing, and a tool for agent-time queries.

Can I use Crawlora results in a vector store?

Yes. Convert normalized result items into Documents with clear metadata before embedding.

Can I ground answers in YouTube transcripts?

Yes, if the selected transcript endpoint fits your workflow. Keep result counts and token budgets bounded.
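One simple way to keep a token budget bounded is to cap the text passed downstream before building Documents. The sketch below uses a rough characters-per-token heuristic, which is an assumption and not a real tokenizer, and the helper name `fit_to_budget` is hypothetical. Swap in your model's tokenizer for accurate counts.

```python
from typing import List


def fit_to_budget(texts: List[str], max_tokens: int = 2000,
                  chars_per_token: int = 4) -> List[str]:
    """Keep texts in order until a rough token budget is spent, truncating the last one."""
    remaining = max_tokens * chars_per_token  # budget expressed in characters
    kept = []
    for text in texts:
        if remaining <= 0:
            break
        piece = text[:remaining]
        kept.append(piece)
        remaining -= len(piece)
    return kept
```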

How do I control cost?

Bound result counts, cache repeated requests, and monitor credit usage in the console and against the pricing page.
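Bounding result counts and caching repeated requests can be combined in one thin wrapper. The sketch below is a minimal in-memory example under stated assumptions: `CachedSearch`, `MAX_LIMIT`, and the 15-minute default TTL are illustrative, and a multi-process deployment would need a shared cache instead.

```python
import time
from typing import Callable, Dict, List, Tuple

MAX_LIMIT = 10  # illustrative upper bound on results per request


class CachedSearch:
    """Cache results per (keyword, limit) for ttl seconds to avoid repeat calls."""

    def __init__(self, fetch: Callable[[str, int], List[dict]], ttl: float = 900.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl
        self._clock = clock
        self._store: Dict[Tuple[str, int], Tuple[float, List[dict]]] = {}

    def __call__(self, keyword: str, limit: int = 5) -> List[dict]:
        keyword = keyword.strip()
        limit = max(1, min(limit, MAX_LIMIT))  # clamp before spending credits
        key = (keyword, limit)
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # fresh cached result, no API call
        items = self._fetch(keyword, limit)
        self._store[key] = (now, items)
        return items
```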

How should I handle rate limits?

Back off on 429 responses, reduce concurrency, and avoid aggressive retry loops.

Wrap one endpoint as a LlamaIndex reader

Start with Google Search or YouTube transcripts, convert to Documents, then index and query.

Related developer links

Python guide

LangChain

n8n

MCP

YouTube

Google Search
