Python Examples

Use Python for data pipelines, notebooks, scheduled jobs, JSONL exports, and analysis workflows.

Install requests

Install

python -m pip install requests

Basic request with timeout

Python request

Use environment variables for secrets and keep Crawlora API keys server-side.

import os
import requests

API_KEY = os.environ["CRAWLORA_API_KEY"]
BASE_URL = "https://api.crawlora.net/api/v1"

payload = {
    "country": "us",
    "keyword": "chatgpt",
    "language": "en",
    "limit": 10,
    "page": 1,
}

response = requests.post(
    f"{BASE_URL}/google/search",
    headers={
        "x-api-key": API_KEY,
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=60,
)

try:
    data = response.json()
except ValueError:
    data = {"raw": response.text}

if not response.ok:
    raise RuntimeError(f"Crawlora request failed: {response.status_code} {data}")

print(data)

Write results to JSONL

JSONL pipeline
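
This example calls a small crawlora_request helper instead of repeating the request code. A minimal sketch of such a helper, reusing the base URL, headers, and error handling from the basic example above:

import os
import requests

API_KEY = os.environ["CRAWLORA_API_KEY"]
BASE_URL = "https://api.crawlora.net/api/v1"

def crawlora_request(path, payload):
    """POST a JSON payload to a Crawlora endpoint and return the parsed response body."""
    response = requests.post(
        f"{BASE_URL}{path}",
        headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
        json=payload,
        timeout=60,
    )
    try:
        data = response.json()
    except ValueError:
        data = {"raw": response.text}
    if not response.ok:
        raise RuntimeError(f"Crawlora request failed: {response.status_code} {data}")
    return data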

import json

keywords = ["project management software", "crm for startups", "sales automation"]

with open("crawlora-search-results.jsonl", "w", encoding="utf-8") as output:
    for keyword in keywords:
        try:
            data = crawlora_request(
                "/google/search",
                {"keyword": keyword, "country": "us", "language": "en", "limit": 10, "page": 1},
            )
            output.write(json.dumps({"keyword": keyword, "response": data}) + "\n")
        except RuntimeError as exc:
            output.write(json.dumps({"keyword": keyword, "error": str(exc)}) + "\n")
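
In a notebook, the JSONL file can be loaded back for analysis, for example with pandas:

import pandas as pd

# Each line becomes one row; nested response objects remain dicts in the "response" column.
df = pd.read_json("crawlora-search-results.jsonl", lines=True)
print(df.head())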

Simple retry/backoff pattern

Retry helper

import time
import requests

def request_with_retry(method, url, **kwargs):
    """Retry on rate limits (429) and transient 5xx errors with exponential backoff."""
    retryable = {429, 500, 502, 503, 504}
    for attempt in range(1, 4):  # at most 3 attempts
        response = requests.request(method, url, timeout=60, **kwargs)
        if response.ok:
            return response
        if response.status_code not in retryable or attempt == 3:
            raise RuntimeError(f"Crawlora request failed: {response.status_code} {response.text}")
        time.sleep(min(2 ** (attempt - 1), 8))  # 1s, then 2s, capped at 8s
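
A usage sketch that mirrors the basic request above (API_KEY as defined earlier):

response = request_with_retry(
    "POST",
    "https://api.crawlora.net/api/v1/google/search",
    headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
    json={"keyword": "chatgpt", "country": "us", "language": "en", "limit": 10, "page": 1},
)
print(response.json())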

Notebook and data-pipeline usage

  • Keep API keys in environment variables or notebook secrets
  • Write raw responses before transforming them
  • Store endpoint, input, timestamp, and request ID when present (see the sketch after this list)
  • Use JSONL for append-only batch jobs
  • Back off on 429 and temporary 5xx responses
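
A minimal sketch of such an append-only record, assuming the response exposes a request ID under a request_id key (a hypothetical field name; check the endpoint docs for the real one):

import json
from datetime import datetime, timezone

def write_record(output, endpoint, payload, data):
    # One audit-friendly JSONL line per request; request_id is a hypothetical field name.
    record = {
        "endpoint": endpoint,
        "input": payload,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "request_id": data.get("request_id"),
        "response": data,
    }
    output.write(json.dumps(record) + "\n")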

Responsible public web data workflows

Crawlora is designed for responsible, structured workflows over public web data. Customers are responsible for using Crawlora in compliance with applicable laws, third-party rights, target-platform rules, and the Crawlora terms.

Read Crawlora terms

Use Python with real endpoint schemas

Open endpoint docs to confirm required fields, response structures, and credit cost before scheduling jobs.