How to Scrape Trustpilot Reviews in 2026 (API & Python)
Tony Wang · 4 min read
Three ways to scrape Trustpilot reviews and ratings in 2026 — DIY Python, no-code tools, or a structured API — what each returns and the legal basics.
The fastest way to scrape Trustpilot reviews in 2026 is to call a structured reviews API that returns normalized JSON — ratings, review text, dates, reviewer labels, and business profile data — instead of paginating and parsing Trustpilot's HTML yourself. DIY in Python is possible, but pagination, layout changes, and anti-bot defenses make it expensive to maintain, and review text is personal data you need to handle carefully.
Trustpilot does run an official Business API, but it requires a paid plan and partner access. A structured scraping API is the practical route when you want public review data across businesses you don't own.
Is it legal to scrape Trustpilot reviews?
Reviews are public, but they contain personal data (reviewer names and opinions), so handle them with extra care:
- Collect only public review pages — no logins.
- Treat reviewer information as personal data under GDPR/CCPA: minimize what you store and have a lawful basis.
- Respect rate limits and Trustpilot's terms; you are responsible for lawful use.
- Don't present scraped reviews in a way that misleads about their source or authenticity.
This is not legal advice. See "Is web scraping legal in 2026?", which covers the public-vs-personal-data line in more depth.
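One way to act on the data-minimization point is to drop reviewer-identifying fields before anything reaches storage. The sketch below assumes each review arrives as a dict; the helper `minimize_review` and the field names (`reviewer_name`, `stars`, `text`, and so on) are hypothetical, so map them to whatever your collector actually returns.

```python
# Fields kept for analysis; everything else (names, profile links, avatars) is dropped.
# These field names are assumptions for illustration, not a documented schema.
KEEP_FIELDS = ("stars", "title", "text", "date", "verified")


def minimize_review(review: dict) -> dict:
    """Return a copy of the review containing only the allow-listed fields."""
    return {key: review[key] for key in KEEP_FIELDS if key in review}


raw = {
    "stars": 2,
    "title": "Slow support",
    "text": "Waited a week for a reply.",
    "date": "2026-05-01",
    "verified": True,
    "reviewer_name": "Jane Example",  # personal data: not stored
}
stored = minimize_review(raw)
print(stored)
```

An allow-list is used rather than a block-list so that any new, unexpected field is dropped by default. Review text can itself contain personal details, so this step reduces what you hold but does not replace a lawful basis.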
Option 1: DIY in Python (and why it breaks)
A first pass fetches a business page and parses review cards:
import json

import requests
from bs4 import BeautifulSoup

resp = requests.get(
    "https://www.trustpilot.com/review/openai.com",
    headers={"User-Agent": "Mozilla/5.0", "Accept-Language": "en-US,en;q=0.9"},
    timeout=30,
)
resp.raise_for_status()  # fail loudly on an error status instead of parsing a block page
soup = BeautifulSoup(resp.text, "html.parser")

# Trustpilot pre-loads reviews into a __NEXT_DATA__ script tag, which is faster than parsing cards
tag = soup.select_one("#__NEXT_DATA__")
if tag is None or not tag.string:
    raise RuntimeError("__NEXT_DATA__ not found: the layout changed or the request was challenged")
data = json.loads(tag.string)
reviews = data["props"]["pageProps"].get("reviews", [])  # undocumented path; shifts between deploys
print(len(reviews), "reviews on page 1")
# ...then normalize each review to CSV, page every ?page= URL, and get past Cloudflare
The recurring costs:
- Cloudflare and rate limits: Trustpilot sits behind Cloudflare with bot detection and rate limiting, so repeated requests need residential proxies and JS-rendering fallbacks.
- Undocumented shape: the __NEXT_DATA__ blob's structure changes between deploys, so your path into it silently breaks.
- Pagination: reviews span many ?page= URLs to walk and dedupe.
- Normalization: stars, dates, verification flags, and replies all need consistent parsing.
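The pagination cost can be sketched as a small generator that walks page numbers until one comes back empty and skips reviews it has already seen. The `fetch_page` callable and the `id` field are assumptions for illustration; in a real crawler `fetch_page` would wrap the request and parsing step, and the dedupe key would be whatever unique identifier the page data exposes.

```python
from typing import Callable, Iterator


def walk_reviews(fetch_page: Callable[[int], list], max_pages: int = 50) -> Iterator[dict]:
    """Yield each review once, walking pages 1..max_pages until a page comes back empty."""
    seen = set()
    for page in range(1, max_pages + 1):
        reviews = fetch_page(page)
        if not reviews:
            break  # an empty page is treated as the end of the listing
        for review in reviews:
            review_id = review["id"]  # assumed unique key; hypothetical field name
            if review_id in seen:
                continue  # the same review can show up on two pages if the listing shifts mid-crawl
            seen.add(review_id)
            yield review


# Stand-in for a real fetcher so the sketch runs without network access.
fake_pages = {1: [{"id": "a"}, {"id": "b"}], 2: [{"id": "b"}, {"id": "c"}]}
unique = list(walk_reviews(lambda page: fake_pages.get(page, [])))
print([r["id"] for r in unique])
```

The `max_pages` cap is a safety net so a parsing bug that never returns an empty page cannot loop forever.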
Option 2: No-code and ready-made tools
Point-and-click scrapers can grab a page or two, but reputation monitoring means re-checking businesses on a schedule and storing the history. That is a pipeline job, which is where an API fits better than a manual tool.
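To make the "storing the history" part concrete, here is a minimal sketch that keeps one rating snapshot per business per day in SQLite. The table layout and the helper `record_snapshot` are assumptions, and the second data point is a made-up sample value; the scheduling itself (cron, a task queue, and so on) is left out.

```python
import sqlite3


def record_snapshot(conn: sqlite3.Connection, slug: str, rating: float,
                    review_count: int, day: str) -> None:
    """Store one snapshot per business per day; a re-run on the same day overwrites it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        "slug TEXT, day TEXT, rating REAL, review_count INTEGER, "
        "PRIMARY KEY (slug, day))"
    )
    conn.execute(
        "INSERT OR REPLACE INTO snapshots VALUES (?, ?, ?, ?)",
        (slug, day, rating, review_count),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # use a file path to keep history between runs
record_snapshot(conn, "openai.com", 4.1, 1234, "2026-05-01")
record_snapshot(conn, "openai.com", 4.0, 1250, "2026-05-02")  # sample values only
history = conn.execute(
    "SELECT day, rating FROM snapshots WHERE slug = ? ORDER BY day", ("openai.com",)
).fetchall()
print(history)
```

Keying on `(slug, day)` makes the job safe to re-run: a retry after a failure replaces the day's row instead of duplicating it.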
Option 3: A structured Trustpilot API
Crawlora's Trustpilot API exposes documented endpoints for business profiles, reviews, search, and categories, returning normalized JSON. Reviews are addressed by the business slug (its domain on Trustpilot):
curl -G "https://api.crawlora.net/api/v1/trustpilot/business/openai.com/reviews" \
  -H "x-api-key: $CRAWLORA_API_KEY" \
  --data-urlencode "page=1" \
  --data-urlencode "language=en"

The same request in Python:

import requests

resp = requests.get(
    "https://api.crawlora.net/api/v1/trustpilot/business/openai.com/reviews",
    headers={"x-api-key": "YOUR_API_KEY"},
    params={"page": 1, "language": "en"},
    timeout=30,
)
resp.raise_for_status()  # stop on an HTTP error before reading the body
for review in resp.json()["data"]["reviews"]:
    print(review["stars"], review["title"], review["date"])
A response is normalized JSON you can store directly (fields are illustrative — confirm the schema in the docs):
{
  "code": 200,
  "msg": "OK",
  "data": {
    "business": { "slug": "openai.com", "rating": 4.1, "review_count": 1234 },
    "reviews": [
      {
        "stars": 5,
        "title": "Great experience",
        "text": "Review body text.",
        "date": "2026-05-01",
        "verified": true
      }
    ]
  }
}
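As a rough illustration of storing that payload directly, the sketch below flattens one response into CSV rows, one per review, tagged with the business slug. It assumes the illustrative field names shown above; `to_rows` and `COLUMNS` are hypothetical names, and the real schema should be confirmed in the docs.

```python
import csv
import io

# Sample payload copied from the illustrative response shape; not live data.
payload = {
    "code": 200,
    "msg": "OK",
    "data": {
        "business": {"slug": "openai.com", "rating": 4.1, "review_count": 1234},
        "reviews": [
            {"stars": 5, "title": "Great experience", "text": "Review body text.",
             "date": "2026-05-01", "verified": True}
        ],
    },
}

COLUMNS = ["slug", "stars", "title", "text", "date", "verified"]


def to_rows(payload: dict) -> list:
    """Flatten one response into one row per review, tagged with the business slug."""
    slug = payload["data"]["business"]["slug"]
    return [
        {"slug": slug, **{col: review.get(col) for col in COLUMNS[1:]}}
        for review in payload["data"]["reviews"]
    ]


buffer = io.StringIO()  # swap for open("reviews.csv", "w", newline="") to write a file
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(to_rows(payload))
print(buffer.getvalue())
```

Using `review.get(col)` means a missing field becomes an empty cell rather than an exception, which is a forgiving choice while the schema is still unconfirmed.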
You can filter reviews by stars, verified, language, a search term q, and a date range. Don't know the slug? Find it, then pull the profile, related competitors, or a whole category:
import requests

h = {"x-api-key": "YOUR_API_KEY"}
base = "https://api.crawlora.net/api/v1/trustpilot"
found = requests.get(f"{base}/business-units/search", headers=h, params={"q": "openai"}).json()["data"]
profile = requests.get(f"{base}/business/openai.com", headers=h).json()["data"]
related = requests.get(f"{base}/business/openai.com/related", headers=h).json()["data"]
sector = requests.get(f"{base}/category/software_company", headers=h, params={"page": 1}).json()["data"]
The slug is the company's domain on Trustpilot (e.g. openai.com); the category endpoints (/categories, /category/{slug}) discover businesses in a sector for competitive benchmarking.
What you can collect
- Business profile: overall rating, review count, and category
- Individual reviews: stars, title, body, date, verification, and replies where present
- Filters for stars, verified status, language, keyword, and date range
- Category and search endpoints for discovering businesses
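The filters listed above could be combined into a single request URL along these lines. This is a sketch under assumptions: `language` and `q` appear in the examples earlier, while `stars` and `verified` as query-parameter names, and the `true`/`false` encoding, are guesses to check against the docs. The date range is left out because its parameter names are not given here.

```python
from typing import Optional
from urllib.parse import urlencode

BASE = "https://api.crawlora.net/api/v1/trustpilot"


def reviews_url(slug: str, page: int = 1, stars: Optional[int] = None,
                verified: Optional[bool] = None, language: Optional[str] = None,
                q: Optional[str] = None) -> str:
    """Build a reviews URL, including only the filters that were actually set."""
    params = {"page": page}
    if stars is not None:
        params["stars"] = stars  # assumed parameter name
    if verified is not None:
        params["verified"] = "true" if verified else "false"  # assumed name and encoding
    if language:
        params["language"] = language
    if q:
        params["q"] = q
    return f"{BASE}/business/{slug}/reviews?{urlencode(params)}"


print(reviews_url("openai.com", stars=1, verified=True, language="en", q="refund"))
```

Leaving unset filters out of the query string keeps the default request identical to the unfiltered call shown earlier.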
Limitations and common challenges
- The official API is own-profile-only. Trustpilot's Service Reviews and Business Units APIs require a verified Business account and return data only for your own Business Unit — to benchmark competitors you collect public reviews.
- Cloudflare & rate limits. Public pages sit behind Cloudflare with bot detection and rate limiting, so DIY needs residential proxies and JS rendering; a structured API handles that behind one key.
- Reviews are personal data. Reviewer names and opinions fall under GDPR/CCPA — minimize what you store, have a lawful basis, and don't misrepresent reviews' source or authenticity.
- Pagination. Reviews and category listings page; walk every page and dedupe.
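For the rate-limit point, a common pattern is to retry with exponential backoff when a request is throttled. The sketch below assumes throttling is signalled by HTTP status 429 (the standard "Too Many Requests" code); whether a given endpoint actually responds that way is an assumption. `with_backoff` is a hypothetical helper, and `call` stands in for whatever function performs the request and returns its status code.

```python
import time
from typing import Callable


def with_backoff(call: Callable[[], int], retries: int = 4, base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Call `call()` until it returns a status other than 429, doubling the wait each retry."""
    status = call()
    for attempt in range(retries):
        if status != 429:
            break
        sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ... with the defaults
        status = call()
    return status


# Simulated responses so the sketch runs instantly and without network access.
statuses = iter([429, 429, 200])
waits = []
result = with_backoff(lambda: next(statuses), sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter keeps the helper testable: the demo records the waits instead of actually pausing.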
Where this fits
Try it first, free: run any public URL through the Free Web Scraper, or check whether a site blocks bots with the Anti-Bot Checker — no signup.
This powers review and reputation monitoring: track a brand's rating over time, flag new negative reviews, and benchmark against competitors. Pair it with app-side sentiment from the App Store and Google Play review endpoints — or our pre-scraped app-review sentiment dataset — for a cross-channel view of customer feedback. For sentiment from other public sources, see how to scrape Reddit; for the broader toolkit, how to choose a web scraping API.
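Flagging new negative reviews can be as simple as comparing the latest pull against the IDs already seen. In this sketch the `id` field, the two-star threshold, and the helper name `new_negative_reviews` are all assumptions for illustration.

```python
def new_negative_reviews(seen_ids: set, reviews: list, max_stars: int = 2) -> list:
    """Return reviews that were not seen before and have at most `max_stars` stars."""
    return [r for r in reviews if r["id"] not in seen_ids and r["stars"] <= max_stars]


seen_ids = {"r1", "r2"}  # IDs stored from earlier runs
latest = [
    {"id": "r2", "stars": 1},  # negative, but already seen
    {"id": "r3", "stars": 5},  # new, but positive
    {"id": "r4", "stars": 2},  # new and negative: flagged
]
alerts = new_negative_reviews(seen_ids, latest)
print([r["id"] for r in alerts])
```

After each run, add the IDs from `latest` to the stored set so the same review is not flagged twice.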
Get started by testing the endpoint in the Playground, reading the request and response schema in the API docs, and reviewing credit costs on the pricing page.
Frequently asked questions
Can I scrape Trustpilot without getting blocked?
Crawlora manages proxy routing, request pacing, and retries behind the API and returns documented errors for unusable pages, so you call one endpoint and get normalized review JSON. (Trustpilot sits behind Cloudflare with bot detection and rate limiting, so DIY needs residential proxies and JS rendering.)
Does Trustpilot have an official API?
Yes, but it is own-profile-only: Trustpilot's Service Reviews and Business Units APIs require a verified Business account and return data only for your own Business Unit. To benchmark competitors, Crawlora collects public review data across businesses as normalized JSON.
Is review data personal data?
Reviews include reviewer names and opinions, which can be personal data under GDPR and CCPA. Minimize what you store, have a lawful basis, handle it responsibly, and don't misrepresent reviews' source or authenticity.
Can I scrape competitor reviews or a whole sector?
Yes. Use the business-units search to find slugs, the related-businesses endpoint for competitors, and the category endpoints to discover and pull every business in a sector for benchmarking.
What can I filter reviews on?
Stars, verified status, language, a keyword (q), and a date range, plus pagination, so you can pull just the reviews you need.
How do I find a business's slug?
Use the business-units search endpoint. The slug is the company's domain on Trustpilot (for example, openai.com), used in the reviews path.