Developer guides
Use Crawlora APIs to feed structured public web data into LlamaIndex readers, tools, and retrieval pipelines instead of parsing raw HTML.
Verified HTTP pattern
POST /google/search
Request
POST https://api.crawlora.net/api/v1/google/search
x-api-key: $CRAWLORA_API_KEY
Content-Type: application/json
{
  "country": "us",
  "keyword": "best CRM software",
  "language": "en",
  "limit": 10,
  "page": 1
}
Base URL: https://api.crawlora.net/api/v1
Auth header: x-api-key
Example endpoint: POST /google/search
This repository does not contain an official LlamaIndex package. Treat this guide as a custom reader and tool pattern around the HTTP API.
Developer workflow
LlamaIndex builds retrieval and agent workflows over your data. Crawlora can provide normalized JSON from supported public platforms, making it easier to convert into Documents, embed, and query than raw HTML. The same agent-native structured data also backs Crawlora's hosted MCP tools.
Convert Crawlora results into LlamaIndex Documents with clear metadata before indexing.
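import os

import requests
from llama_index.core import Document


def crawlora_documents(query: str, limit: int = 10):
    # POST a JSON body with a "keyword" field, matching the verified HTTP pattern.
    resp = requests.post(
        "https://api.crawlora.net/api/v1/google/search",
        headers={"x-api-key": os.environ["CRAWLORA_API_KEY"]},
        json={"country": "us", "keyword": query, "language": "en", "limit": limit, "page": 1},
        timeout=30,
    )
    resp.raise_for_status()
    # The "data" list and its snippet/url/title fields are assumed here;
    # check them against the actual response before indexing.
    return [
        Document(text=item.get("snippet", ""), metadata={"url": item.get("url"), "title": item.get("title")})
        for item in resp.json().get("data", [])
    ]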
Build an index from the Documents, then query it in your RAG or agent flow.
from llama_index.core import VectorStoreIndex
docs = crawlora_documents("retrieval augmented generation")
index = VectorStoreIndex.from_documents(docs)
answer = index.as_query_engine().query("Summarize the latest on RAG")
Use Crawlora for structured public web data workflows. Customers are responsible for compliance with applicable laws, third-party rights, platform rules, and Crawlora terms. Keep API keys server-side, validate inputs, and avoid collecting or storing unnecessary sensitive data.
Read Crawlora terms
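The advice to validate inputs could look something like the sketch below, which cleans a user-supplied keyword and clamps the result count before building the JSON body for POST /google/search. The function name `build_search_payload`, the `MAX_LIMIT` cap of 20, and the 200-character keyword bound are illustrative assumptions for this sketch, not documented Crawlora limits.

```python
MAX_LIMIT = 20  # hypothetical cap chosen for this sketch, not a documented API limit
MAX_KEYWORD_CHARS = 200  # hypothetical bound on keyword length


def build_search_payload(keyword: str, limit: int = 10, page: int = 1) -> dict:
    """Validate user input and return a JSON body for POST /google/search."""
    cleaned = " ".join(keyword.split())  # trim and collapse runs of whitespace
    if not cleaned:
        raise ValueError("keyword must not be empty")
    if len(cleaned) > MAX_KEYWORD_CHARS:
        raise ValueError("keyword is too long")
    if page < 1:
        raise ValueError("page must be >= 1")
    return {
        "country": "us",
        "keyword": cleaned,
        "language": "en",
        "limit": max(1, min(limit, MAX_LIMIT)),  # clamp to a bounded result count
        "page": page,
    }
```

Running this server-side, before the API key is attached to the request, keeps both the key and the unvalidated input out of client code.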
Use these pages to move between endpoint discovery, examples, pricing, and responsible-use guidance.
Common questions for this Crawlora developer integration path.
This frontend repository does not contain an official Crawlora LlamaIndex package. Use a custom reader or tool wrapper around the HTTP API.
Use a reader for scheduled ingestion and indexing, and a tool for agent-time queries.
Yes. Convert normalized result items into Documents with clear metadata before embedding.
Yes, if the selected transcript endpoint fits your workflow. Keep result counts and token budgets bounded.
Bound result counts, cache repeated requests, and monitor credits on the pricing and console surfaces.
Back off on 429 responses, reduce concurrency, and avoid aggressive retry loops.
Start with Google Search or YouTube transcripts, convert to Documents, then index and query.
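As a rough illustration of the advice to back off on 429 responses and avoid aggressive retry loops, the sketch below retries a bounded number of times with exponential delays plus jitter. The helper `post_with_backoff` and its `send` callable (which returns a status code and a parsed body) are hypothetical names for this example. The delay values are arbitrary starting points, and the sketch does not assume the API sends any particular rate-limit headers.

```python
import random
import time
from typing import Callable, Tuple


def post_with_backoff(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            if status >= 400:
                # Other client or server errors are not retried in this sketch.
                raise RuntimeError(f"request failed with HTTP {status}")
            return body
        if attempt < max_attempts - 1:
            # Exponential backoff: base_delay, 2x, 4x, ... plus up to 50% jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError(f"still rate limited after {max_attempts} attempts")
```

In practice `send` would wrap the actual HTTP call and return its status code and decoded JSON. Passing `sleep` as a parameter keeps the helper easy to test without real waiting.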