Is Web Scraping Legal in 2026? A Practical Guide
A practical 2026 guide to web scraping and the law: public vs private data, hiQ/CFAA, terms of service, copyright, and GDPR/CCPA, with a do/don't checklist.
Web scraping is generally legal when you collect publicly available data and access it without bypassing access controls. Whether a given project is legal depends on the data, the method, and what you do with the results, not on scraping as an activity. This guide breaks down the rules that actually matter in 2026 so you can scope a project on the right side of the line. It is not legal advice; when stakes are high, talk to a lawyer.
The three questions that decide it
Whether a specific project is defensible usually comes down to three things:
- What data — public facts (prices, addresses, ratings) carry the least risk; personal data and copyrighted content carry the most.
- How you access it — scraping public pages is very different from bypassing logins, defeating CAPTCHAs, or ignoring explicit blocks.
- What you do with it — internal research differs from republishing content or building a competing database.
Public vs. private data
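The cleanest rule of thumb: collect public, non-personal, factual data. In the US, the Ninth Circuit held in hiQ Labs v. LinkedIn (2019, reaffirmed in 2022) that accessing data available to the general public without authentication likely does not constitute access "without authorization" under the Computer Fraud and Abuse Act (CFAA). That ruling came at the preliminary-injunction stage and binds only courts in that circuit, so treat it as strong guidance rather than a nationwide rule. Under US copyright law, facts themselves (a price, an address, a star rating) are not copyrightable, although an original selection or arrangement of them can be.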
Risk rises sharply when data sits behind a login, includes personal information, or is creative/copyrighted content you intend to reuse.
Terms of service and robots.txt
A site's Terms of Service can prohibit automated access even when the data is public. Violating ToS is generally a contract matter (think account bans or cease-and-desist letters), not a criminal one — but it is still a real risk, especially for logged-in scraping. robots.txt is a crawling convention, not a law, but ignoring explicit disallows weakens your position. Respect rate limits and don't degrade the service.
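As a rough illustration of honoring robots.txt and pacing, the sketch below uses Python's standard-library `urllib.robotparser` to check whether a path is disallowed and to read a crawl delay. The robots.txt content, the `example-research-bot` user agent, and the helper names are assumptions made for the example; a real crawler would fetch the file from the target host and sleep for the returned delay between requests.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it from
# the target host before requesting any other page.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
Crawl-delay: 2
"""

USER_AGENT = "example-research-bot"  # assumed name, not a real crawler


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been retrieved."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str) -> bool:
    """Return True if the rules permit USER_AGENT to fetch this URL."""
    return parser.can_fetch(USER_AGENT, url)


def delay_seconds(parser: RobotFileParser, default: float = 1.0) -> float:
    """Use the site's Crawl-delay if it declares one, else a default pause."""
    delay = parser.crawl_delay(USER_AGENT)
    return float(delay) if delay is not None else default


parser = build_parser(SAMPLE_ROBOTS_TXT)
print(is_allowed(parser, "https://example.com/products/42"))
print(is_allowed(parser, "https://example.com/account/settings"))
print(delay_seconds(parser))
```

Passing this check says nothing about the site's Terms of Service; it only covers the crawling convention and a minimum pause between requests.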
Personal data: GDPR and CCPA
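If you collect personal data about people in the EU or California, privacy laws such as GDPR and CCPA/CPRA can apply. GDPR covers personal data even when it is publicly available, which means you need a lawful basis, purpose limits, and a way to honor data-subject rights. CCPA/CPRA excludes some "publicly available" information from its definition of personal information, but that exclusion has conditions and should not be assumed to cover everything visible on the web. The safest path for most products is to avoid personal data and stick to business- and product-level facts.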
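One way to stick to business- and product-level facts is to keep an explicit allowlist of fields and drop everything else before storage. The following is a minimal sketch under that assumption; the field names and the `minimize` helper are hypothetical, and an allowlist does not by itself establish compliance.

```python
# Hypothetical allowlist of non-personal, product-level fields.
ALLOWED_FIELDS = frozenset({"product_name", "price", "currency", "rating"})


def minimize(record: dict) -> dict:
    """Keep only allowlisted fields; anything unlisted is discarded."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


scraped = {
    "product_name": "Desk lamp",
    "price": 24.99,
    "currency": "USD",
    "rating": 4.6,
    "reviewer_name": "Jane D.",          # personal data: dropped
    "reviewer_email": "jane@example.com",  # personal data: dropped
}

print(minimize(scraped))
```

An allowlist fails closed: a new field that appears on the source page is discarded until someone decides it is safe to keep, which a blocklist of known personal fields would not guarantee.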
A practical do / don't checklist
Do
- Collect public, factual, non-personal data.
- Respect rate limits, robots.txt, and reasonable load.
- Review the source's terms and your own compliance obligations.
- Keep a clear, legitimate purpose for what you collect.
Don't
- Bypass logins, paywalls, or CAPTCHAs to reach gated data.
- Collect personal data without a lawful basis.
- Republish copyrighted content wholesale.
- Hammer a site or evade explicit blocks.
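The "Don't" items can be turned into a simple pre-flight check that a team runs when scoping a project. The sketch below is illustrative only: `ProjectScope`, its fields, and `red_flags` are hypothetical names, and an empty result is a prompt for review, not a legal clearance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectScope:
    """Hypothetical self-assessment of a planned scraping project."""
    bypasses_access_controls: bool          # logins, paywalls, CAPTCHAs
    personal_data_without_basis: bool       # no lawful basis identified
    republishes_copyrighted_content: bool   # wholesale reuse
    ignores_blocks_or_rate_limits: bool     # hammering or evading blocks


def red_flags(scope: ProjectScope) -> list[str]:
    """Return the checklist 'Don't' items this project would violate."""
    flags = []
    if scope.bypasses_access_controls:
        flags.append("bypasses logins, paywalls, or CAPTCHAs")
    if scope.personal_data_without_basis:
        flags.append("collects personal data without a lawful basis")
    if scope.republishes_copyrighted_content:
        flags.append("republishes copyrighted content wholesale")
    if scope.ignores_blocks_or_rate_limits:
        flags.append("hammers the site or evades explicit blocks")
    return flags


price_monitor = ProjectScope(
    bypasses_access_controls=False,
    personal_data_without_basis=False,
    republishes_copyrighted_content=False,
    ignores_blocks_or_rate_limits=False,
)
print(red_flags(price_monitor))
```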
How Crawlora fits
Crawlora is built for responsible public web data: documented platform APIs that return normalized JSON for public sources, with managed rate-limited access. It is infrastructure — you remain responsible for lawful, compliant use of the data. See rate limits for pacing guidance, and the web scraping API overview for how the endpoint model works.
FAQ
Is web scraping illegal? No, not inherently. Collecting public data is generally lawful in the US, UK, and EU; problems arise from how you access it (bypassing auth) and what you collect (personal or copyrighted data).
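Does hiQ v. LinkedIn mean I can scrape anything public? No. The Ninth Circuit's ruling means accessing public data is unlikely to count as "unauthorized access" under the CFAA, but ToS, copyright, and privacy law still apply. In the same litigation, the district court later found that hiQ had breached LinkedIn's User Agreement.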
Is violating a site's Terms of Service a crime? Generally it's a contract issue (bans, cease-and-desist), not criminal — but it's still a risk, particularly for logged-in or evasive scraping.
Can I scrape personal data if it's public? Public availability doesn't exempt you from GDPR, and the CCPA/CPRA exclusion for "publicly available" information is limited. Avoid personal data unless you have a lawful basis.
Is this legal advice? No. This is general information; consult a lawyer for your specific situation.
Related reading
- How to Scrape LinkedIn in 2026 (Legally) — the legal lines applied to one of the most sensitive sources.
- Best Web Scraping APIs in 2026: How to Choose — once you know the rules, pick the right tool.