Tony Wang | April 13, 2026 | Updated June 7, 2026 | 4 min read

How to Scrape Amazon Product Data in 2026 (API & Python)

Three ways to scrape Amazon product data, prices, and reviews in 2026: DIY Python, no-code, or a structured API — what each returns and the legal basics.

The fastest way to scrape Amazon product data in 2026 is to call a structured Amazon API that returns normalized JSON — title, ASIN, price, availability, rating, and review count — instead of running and babysitting your own scraper. You can build a DIY scraper in Python, but Amazon's anti-bot defenses, rotating layouts, and CAPTCHAs make it costly to keep working. This guide covers all three approaches, what each returns, where each breaks, and the legal basics.

Approach	Output	Holds up at scale?	Best for
DIY Python	Raw HTML you parse	No — CAPTCHAs, bans, layout churn	Learning, one-off scripts
No-code tools	CSV/JSON export	Limited	One-off exports
Structured Amazon API	Normalized JSON	Yes — proxies/parsing handled	In-product, scheduled pipelines

Is it legal to scrape Amazon product data?

Public product data (titles, prices, ratings, and reviews) is generally lower-risk to collect: in the US, the Ninth Circuit in hiQ Labs v. LinkedIn held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act, and facts such as prices are not copyrightable. That ruling does not override a site's terms, though: Amazon's Conditions of Use prohibit automated access, so you can still face blocks, bans, or a breach-of-contract claim, and personal data triggers privacy laws. Rules of thumb:

Collect only public product data; avoid personal data and anything behind a login.
Respect rate limits and don't degrade the service.
Review Amazon's terms and your own compliance requirements.

This is not legal advice — when in doubt, talk to a lawyer.

Option 1: DIY in Python (and why it breaks)

For a single page you might reach for requests + BeautifulSoup, parse the product fields, and export to CSV — escalating to a headless browser when JavaScript or anti-bot checks get in the way:

import csv
import requests
from bs4 import BeautifulSoup

ASIN = "B0DGJ736JM"

# Browser-like headers; the default python-requests User-Agent is an obvious bot signal.
headers = {
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Accept-Language": "en-US,en;q=0.9",
}
resp = requests.get(f"https://www.amazon.com/dp/{ASIN}", headers=headers, timeout=30)
soup = BeautifulSoup(resp.text, "html.parser")

# These selectors depend on the current page layout and can stop matching without warning.
title = soup.select_one("#productTitle")
price = soup.select_one(".a-price .a-offscreen")
rating = soup.select_one("#acrPopover .a-icon-alt")

# Missing elements become None instead of raising AttributeError.
row = {
    "asin": ASIN,
    "title": title.get_text(strip=True) if title else None,
    "price": price.get_text(strip=True) if price else None,
    "rating": rating.get_text(strip=True) if rating else None,
}

# Write a one-row CSV; explicit UTF-8 keeps non-ASCII titles intact on every platform.
with open("amazon.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=list(row))
    writer.writeheader()
    writer.writerow(row)
# ...on a good day. On a bad day: a CAPTCHA, a 503, or a redirect to a bot check.

It demos well and then breaks at scale:

Anti-bot — CAPTCHAs and IP bans push you into proxy rotation and fingerprinting.
Layout churn — Amazon changes the DOM and your selectors silently break.
ASIN and variation sprawl — prices and availability vary by variant, seller, and region.
Scale — reliable collection means proxy pools, retries, and monitoring you have to run.

Most of the cost is not the first scrape — it is keeping it alive.
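To make the anti-bot and scale points concrete, here is a minimal sketch of the plumbing a DIY scraper ends up needing: a block check plus retries with exponential backoff. Everything in it is an assumption for illustration: the marker phrases in BLOCK_MARKERS, the statuses treated as throttling, and the helper names are hypothetical, and the network call is passed in as a `fetch` callable so the logic can be read and tested offline.

```python
import random
import time
from typing import Callable, List, Optional, Tuple

# Assumed marker phrases for bot-check pages; illustrative only, real pages vary and change.
BLOCK_MARKERS = ("captcha", "robot check", "automated access")


def looks_blocked(status_code: int, html: str) -> bool:
    """Heuristic: a throttling status or bot-check wording counts as a block."""
    if status_code in (429, 503):
        return True
    lowered = html.lower()
    return any(marker in lowered for marker in BLOCK_MARKERS)


def backoff_delays(attempts: int, base: float = 2.0, cap: float = 60.0) -> List[float]:
    """Delays in seconds that double each attempt, never exceeding `cap`."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


def fetch_with_retries(
    fetch: Callable[[], Tuple[int, str]],
    attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call `fetch` until a response does not look blocked; return None if all fail."""
    delays = backoff_delays(attempts)
    for i, delay in enumerate(delays):
        status, html = fetch()
        if not looks_blocked(status, html):
            return html
        if i < len(delays) - 1:
            # Jitter keeps retries from several workers from lining up.
            sleep(delay + random.uniform(0, 1))
    return None
```

In a real script, `fetch` would wrap the requests.get call from the example above and return `(resp.status_code, resp.text)`. Proxy rotation, fingerprinting, and monitoring would all sit on top of this.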

Option 2: No-code tools

Visual extractors and browser extensions handle the page for you and export CSV or JSON. They are fine for one-off exports, but less convenient when you need Amazon data inside a product or pipeline, on a schedule, with predictable fields.

Option 3: A structured Amazon API

For repeatable, in-product workflows, an Amazon scraping API gives you documented endpoints that return normalized JSON — no browser, no selectors, no proxy pool to run. Fetch a product by ASIN:

curl https://api.crawlora.net/api/v1/amazon/product/B0DGJ736JM \
  -H "x-api-key: $CRAWLORA_API_KEY"

The same call in Python:

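import os
import requests

resp = requests.get(
    "https://api.crawlora.net/api/v1/amazon/product/B0DGJ736JM",
    # Read the key from the same environment variable the curl example uses.
    headers={"x-api-key": os.environ["CRAWLORA_API_KEY"]},
    timeout=30,
)
resp.raise_for_status()  # surface HTTP errors before touching the body
data = resp.json()["data"]
print(data["title"], data["price"], data["rating"])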

The response is normalized JSON you can store directly (fields shown are illustrative — check the docs for the current schema):

{
  "code": 200,
  "msg": "OK",
  "data": {
    "asin": "B0DGJ736JM",
    "title": "Example product",
    "price": 189,
    "currency": "USD",
    "availability": "In Stock",
    "rating": 4.4,
    "review_count": 1055
  }
}
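Before storing a response like this, it can help to map it onto a typed record so that missing or renamed fields surface early. The sketch below assumes exactly the illustrative envelope shown above (`code`, `msg`, `data`); the ProductSnapshot class and parse_product helper are hypothetical names, not part of any SDK, and the real schema may differ.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProductSnapshot:
    asin: str
    title: Optional[str]
    price: Optional[float]
    currency: Optional[str]
    availability: Optional[str]
    rating: Optional[float]
    review_count: Optional[int]


def parse_product(payload: dict) -> ProductSnapshot:
    """Map the illustrative response envelope onto a typed record."""
    if payload.get("code") != 200:
        raise ValueError(f"unexpected response: {payload.get('code')} {payload.get('msg')}")
    data = payload["data"]
    price = data.get("price")
    rating = data.get("rating")
    reviews = data.get("review_count")
    return ProductSnapshot(
        asin=data["asin"],  # required: fail loudly if it is absent
        title=data.get("title"),
        price=float(price) if price is not None else None,
        currency=data.get("currency"),
        availability=data.get("availability"),
        rating=float(rating) if rating is not None else None,
        review_count=int(reviews) if reviews is not None else None,
    )


# The sample payload from this article, used as offline input.
sample = json.loads("""
{
  "code": 200,
  "msg": "OK",
  "data": {
    "asin": "B0DGJ736JM",
    "title": "Example product",
    "price": 189,
    "currency": "USD",
    "availability": "In Stock",
    "rating": 4.4,
    "review_count": 1055
  }
}
""")
print(parse_product(sample))
```

Optional fields fall back to None rather than raising, since public listings do not always expose every field.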

Collecting a whole result set? Use the search endpoint:

curl "https://api.crawlora.net/api/v1/amazon/search?query=wireless+earbuds&page=1" \
  -H "x-api-key: $CRAWLORA_API_KEY"

What you can collect

Where the public listing exposes them, the fields fall into a few groups:

Product identity — title, ASIN, brand or seller, images, and the product URL.
Pricing — current price, currency, strikethrough/list price, and availability.
Social proof — average rating, review count, and review samples; for full review mining, page the product's reviews.
Listing metadata — search-result position and badges (Prime, Best Seller, Amazon's Choice), plus the ASIN or query context you requested.

Re-run on a schedule and store one row per ASIN per run to build a price-and-rating history.
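One way to keep that history is a small SQLite table keyed by ASIN and capture time. This is a sketch under assumptions: the table layout, the record_snapshot and price_history helper names, and the sample timestamps and prices are all illustrative, and the columns simply mirror the illustrative response fields shown earlier.

```python
import sqlite3
from datetime import datetime, timezone
from typing import List, Optional, Tuple

SCHEMA = """
CREATE TABLE IF NOT EXISTS snapshots (
    asin         TEXT NOT NULL,
    captured_at  TEXT NOT NULL,
    price        REAL,
    currency     TEXT,
    availability TEXT,
    rating       REAL,
    review_count INTEGER,
    PRIMARY KEY (asin, captured_at)
)
"""


def record_snapshot(conn: sqlite3.Connection, product: dict,
                    captured_at: Optional[str] = None) -> None:
    """Insert one row for this ASIN and run; defaults to the current UTC time."""
    captured_at = captured_at or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO snapshots VALUES (?, ?, ?, ?, ?, ?, ?)",
        (
            product["asin"],
            captured_at,
            product.get("price"),
            product.get("currency"),
            product.get("availability"),
            product.get("rating"),
            product.get("review_count"),
        ),
    )
    conn.commit()


def price_history(conn: sqlite3.Connection, asin: str) -> List[Tuple[str, float]]:
    """Return (captured_at, price) pairs for one ASIN, oldest first."""
    cursor = conn.execute(
        "SELECT captured_at, price FROM snapshots WHERE asin = ? ORDER BY captured_at",
        (asin,),
    )
    return cursor.fetchall()


# Two made-up runs against an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
record_snapshot(conn, {"asin": "B0DGJ736JM", "price": 189, "currency": "USD"},
                "2026-06-01T00:00:00+00:00")
record_snapshot(conn, {"asin": "B0DGJ736JM", "price": 179, "currency": "USD"},
                "2026-06-02T00:00:00+00:00")
print(price_history(conn, "B0DGJ736JM"))
```

The composite primary key rejects a second insert for the same ASIN and timestamp, which guards against double-writing a run.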

Limitations and common challenges

Even with a good API, keep a few Amazon realities in mind:

Personalized results. Amazon tailors results to location and history, so two requests can differ; locale parameters reduce but don't eliminate this.
Pagination caps. Search is capped at roughly 20 pages per query — split by category or filters to go deeper.
Price volatility. Prices change frequently, so for accurate tracking you need to re-scrape on a schedule, not once.
Anti-bot for DIY. Direct scraping faces CAPTCHAs, behavioral analysis, and cookie/fingerprint checks; a structured API absorbs proxy routing, rendering, and retries behind the endpoint.
Marketplace and currency. The endpoint defaults to amazon.com (USD); confirm coverage before relying on other regional marketplaces.

Where this gets used

Structured Amazon data powers a few common workflows:

Price & MAP monitoring — track prices and availability and flag MAP violations. See the price & MAP monitoring use case.
Product monitoring — watch listings, ratings, and Buy-Box changes via Amazon product monitoring.
E-commerce intelligence — compare catalogs and assortment across marketplaces with e-commerce product intelligence.
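As a rough illustration of the MAP (minimum advertised price) workflow, the sketch below compares collected prices against floors you supply yourself. The find_map_violations helper, the shape of the input dicts, and the sample numbers are assumptions for illustration; the floors come from your own pricing policy, not from any API.

```python
from typing import Dict, List


def find_map_violations(products: List[dict], map_floors: Dict[str, float]) -> List[dict]:
    """Return one record per product priced below its MAP floor."""
    violations = []
    for product in products:
        floor = map_floors.get(product["asin"])
        price = product.get("price")
        if floor is None or price is None:
            continue  # no policy for this ASIN, or no price was exposed
        if price < floor:
            violations.append({
                "asin": product["asin"],
                "price": price,
                "map": floor,
                "shortfall": round(floor - price, 2),
            })
    return violations


# Made-up inputs: one product under its floor, one at it, one without a price.
products = [
    {"asin": "B0DGJ736JM", "price": 189.0},
    {"asin": "B000000001", "price": 49.99},
    {"asin": "B000000002", "price": None},
]
floors = {"B0DGJ736JM": 199.0, "B000000001": 49.99, "B000000002": 20.0}
print(find_map_violations(products, floors))
```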

Start collecting

Try it first, free: run any public URL through the Free Web Scraper, or check whether a site blocks bots with the Anti-Bot Checker — no signup.

Test the product endpoint in the Playground, check the response schema in the API docs, and review credit costs on the pricing page. Running the same playbook on another marketplace? See how to scrape eBay and how to scrape Shopify stores. For the bigger picture, see best Amazon scraping APIs, how to choose a web scraping API, and ScraperAPI alternatives.

Frequently asked questions

Can I scrape Amazon without getting blocked?

With a structured API, proxy routing and browser execution are handled behind the endpoint, so you don't manage blocks yourself. A DIY scraper faces CAPTCHAs, behavioral analysis, and cookie/fingerprint checks, so it needs proxies and careful pacing.

What data can I get from Amazon?

Public listing fields: title, ASIN, price, currency, availability, rating, review count, brand or seller, images, and search-result position, where available.

Can I scrape Amazon reviews?

Yes, where public. Product responses include the average rating, review count, and review samples; for full review text and sentiment analysis, page the product's reviews. Treat reviewer names and text as potentially personal data and collect only what you need.

Can I scrape Amazon prices and track changes?

Yes. Pull price, currency, and availability and store one row per ASIN per run, then compare each run to the previous one to detect price moves, stock changes, and MAP violations. Prices change often, so re-scrape on a schedule.
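Comparing one run to the previous one can be as simple as the sketch below. It assumes each run has been loaded into a dict keyed by ASIN; the diff_runs helper, the choice of tracked fields, and the sample values are illustrative.

```python
from typing import Dict, List, Tuple

TRACKED_FIELDS = ("price", "availability")


def diff_runs(previous: Dict[str, dict], current: Dict[str, dict]) -> List[Tuple]:
    """Return (asin, field, old, new) for each tracked field that changed."""
    changes = []
    for asin, now in current.items():
        before = previous.get(asin)
        if before is None:
            continue  # first sighting of this ASIN, nothing to compare against
        for field in TRACKED_FIELDS:
            if before.get(field) != now.get(field):
                changes.append((asin, field, before.get(field), now.get(field)))
    return changes


# Two made-up runs: a price drop on one ASIN, a newly seen ASIN on the other.
run_1 = {"B0DGJ736JM": {"price": 189.0, "availability": "In Stock"}}
run_2 = {
    "B0DGJ736JM": {"price": 179.0, "availability": "In Stock"},
    "B000000001": {"price": 49.99, "availability": "In Stock"},
}
print(diff_runs(run_1, run_2))
```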

Can I scrape Amazon from other countries or marketplaces?

Amazon serves different results and currencies by marketplace and region. The endpoint defaults to amazon.com (USD); confirm coverage before relying on other regional marketplaces, and expect some personalization to remain.

Is this the official Amazon SP-API or PA-API?

No. This extracts public Amazon product data and is independent of Amazon's official Selling Partner API (for sellers) and Product Advertising API (for affiliates), both of which require an approved Amazon account: a seller account for SP-API and an Associates account for PA-API.

How often can I refresh the data?

As often as your plan and responsible-use constraints allow; most teams run scheduled snapshots rather than continuous polling.
