
How to Scrape Product Data from E-Commerce Sites
Get Product Data from Multiple E-Commerce Websites with a Single Tool and Build a Data Pipeline in Minutes
If you need to monitor prices, product details, or availability across Amazon, Walmart, eBay, Target, and other e-commerce sites, you may be running into two problems:
Fragmented data collection — having to run separate scrapers for each site or type of page.
Incomplete coverage — category pages are great for finding new items, but can miss updates to important products; product detail pages give precise tracking but no discovery.
We'll show you how to solve these problems with E-commerce Scraping Tool. It extracts data from multiple e-commerce websites, from both category and product detail URLs, in the same run, so you can get complete, deduplicated datasets in one export.
Why Product Data Scraping Is Essential for E-Commerce Success
The e-commerce landscape has become brutally competitive. Prices change by the hour, inventory fluctuates constantly, and new products launch daily. According to Prisync's competitive pricing research, 90% of shoppers compare prices across multiple sites before making a purchase, and 60% of online shoppers say price is the most important factor in their buying decision.
The Challenge:
Traditional product tracking approaches fail at scale:
Manual monitoring: Staff checks competitor websites daily, copies data into spreadsheets. This takes hours and misses real-time changes.
Single-site scrapers: You build a custom Amazon scraper, then another for Walmart, then eBay, then Target... Each requires different code, maintenance, and proxy management. Your engineering team spends weeks maintaining scrapers instead of building product features.
Expensive monitoring platforms: Enterprise tools like Prisync or Omnia cost $500-$3,000/month and often limit product counts or require annual contracts.
Incomplete data: Category scrapers find new products but miss price updates on existing SKUs. Product-specific scrapers track changes but don't discover new competitors entering your category.
What You Really Need:
A unified scraper that handles multiple e-commerce platforms, extracts both product listings (for discovery) and individual product details (for tracking), deduplicates results automatically, and costs pennies per thousand products.
That's exactly what Apify's E-commerce Scraping Tool delivers.
How to Scrape Product Data for E-Commerce
You can use E-commerce Scraping Tool via the UI using natural language or JSON input, or run it programmatically via API. The UI is the best way to test it out, so let's start there.
Step 1: Open Apify's E-commerce Scraping Tool
E-commerce Scraping Tool lives on Apify Store—the world's largest marketplace of web scrapers. To get started, click Try for free. If you're logged in to your Apify account, you'll be taken to Apify Console—your dashboard for running the scraper. If not, you'll be prompted to sign in or sign up first.
New users receive $5 of free credit every month, which is enough to scrape 10,000+ products to test the tool.
Step 2: Choose Your Input Types
Once you're logged in, you can configure the tool in Apify Console.
The tool supports two main URL types: Category listing URLs and Product detail URLs.
| Input Type | What It Is | When to Use It |
|---|---|---|
| Category listing URLs | Search results or category pages with multiple products | Discover many products, monitor whole categories, find new arrivals |
| Product detail URLs | URLs pointing directly to a single product page | Monitor known SKUs, track specific items for price/stock changes |
Example Category URLs:
https://www.amazon.com/s?k=wireless+headphones
https://www.walmart.com/browse/electronics/headphones/3944_133251
https://www.ebay.com/sch/i.html?_nkw=bluetooth+speakers
https://www.target.com/c/electronics/-/N-5xtg6
Example Product URLs:
https://www.amazon.com/dp/B0CX23V2ZK
https://www.walmart.com/ip/Apple-AirPods-Pro/520468661
https://www.ebay.com/itm/234567890123
https://www.target.com/p/sony-headphones/-/A-87654321
Using the Manual Tab:
Simply paste your URLs into the input fields—one per line. The scraper automatically detects whether each URL is a category page or product detail page.
Same Configuration in JSON Tab:
For advanced users or API integration, you can configure via JSON:
```json
{
  "categoryUrls": [
    "https://www.amazon.com/s?k=wireless+headphones",
    "https://www.walmart.com/browse/electronics/headphones/3944_133251"
  ],
  "productUrls": [
    "https://www.amazon.com/dp/B0CX23V2ZK",
    "https://www.walmart.com/ip/520468661"
  ],
  "maxProducts": 100
}
```
When to Use Both Category and Detail URLs
Using both input types in the same run is helpful when you want:
Discovery + refresh → Crawl categories for new items while re-scraping your curated set of known products.
Mixed sourcing → Some sites give stable product URLs, others require crawling categories to find items.
Single dataset output → No need to merge separate runs.
The scraper automatically deduplicates products found in both inputs, so you never get duplicate entries even if a product appears in both category results and your direct URL list.
Step 3: Run and Export
Click Start to run the scraper.
When the run finishes, open the Storage tab to view results: product name, price, SKU, brand, image, description, and URL.
What You Get:
```json
{
  "title": "Sony WH-1000XM5 Wireless Headphones",
  "price": 329.99,
  "currency": "USD",
  "availability": "In Stock",
  "brand": "Sony",
  "sku": "WH1000XM5/B",
  "asin": "B0CX23V2ZK",
  "url": "https://www.amazon.com/dp/B0CX23V2ZK",
  "imageUrl": "https://m.media-amazon.com/images/I/...",
  "rating": 4.6,
  "reviewsCount": 12847,
  "category": "Electronics > Headphones > Over-Ear",
  "seller": "Amazon.com",
  "shippingPrice": 0,
  "originalPrice": 399.99,
  "discount": 70.00,
  "discountPercentage": 17.5,
  "description": "Industry-leading noise cancellation...",
  "features": [
    "30-hour battery life",
    "Multipoint connection",
    "Speak-to-chat technology"
  ],
  "specifications": {
    "weight": "250g",
    "connectivity": "Bluetooth 5.2",
    "colors": ["Black", "Silver"]
  }
}
```
Export your dataset as JSON, CSV, Excel, or HTML, or integrate it directly with your systems via:
Google Sheets (auto-sync results)
Airtable (build product databases)
Snowflake/BigQuery (data warehouse pipelines)
Zapier/Make.com (workflow automation)
Webhook (push to your backend in real-time)
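Beyond the built-in exports, every dataset is also reachable over plain HTTP, so you can script exports without the client library. A minimal sketch of building the export URL (the dataset ID shown is a placeholder):

```python
def dataset_export_url(dataset_id, fmt="csv", token=""):
    """Build the Apify dataset export URL (format: json, csv, xlsx, html)."""
    url = f"https://api.apify.com/v2/datasets/{dataset_id}/items?format={fmt}"
    if token:
        url += f"&token={token}"
    return url

# Download a finished run's dataset as CSV (the dataset ID here is hypothetical):
# urllib.request.urlretrieve(dataset_export_url("abc123", "csv", "YOUR_TOKEN"), "products.csv")
```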
Real-World Use Cases for Product Data Scraping
Use Case 1: Dynamic Repricing for Amazon Sellers
Scenario: You sell electronics on Amazon and compete with 50+ sellers who change prices multiple times daily.
Solution:
Add competitor ASINs to E-commerce Scraping Tool
Schedule scraper to run every 2 hours
Export results to your repricing software via API
Automatically adjust prices to stay competitive
Example Input:
https://www.amazon.com/dp/B0CX23V2ZK (competitor 1)
https://www.amazon.com/dp/B09JQL3NWT (competitor 2)
https://www.amazon.com/dp/B0BTXPX3NY (competitor 3)
Result: One Amazon seller reported a 23% increase in Buy Box wins after implementing 2-hour price monitoring, leading to a $47K monthly revenue increase with the same inventory.
Use Case 2: Product Discovery for Dropshipping
Scenario: You run a dropshipping store and need to identify trending products before they saturate the market.
Solution:
Scrape Amazon Best Sellers categories daily
Filter for products with 500+ reviews added in past 30 days
Analyze price points and shipping options
Add winning products to your Shopify store
Example Input:
https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics
https://www.amazon.com/Best-Sellers-Home-Kitchen/zgbs/home-garden
https://www.amazon.com/Best-Sellers-Sports-Outdoors/zgbs/sporting-goods
Result: One dropshipper identified 12 trending products per month, launched 5 on their store, and generated $28K in first-quarter sales from early market entry.
Use Case 3: Competitive Intelligence for Brands
Scenario: You're a brand manufacturer selling through authorized retailers. You need to monitor pricing, inventory, and unauthorized sellers across marketplaces.
Solution:
Scrape your SKUs across Amazon, Walmart, eBay, Target
Identify MAP policy violations (prices below minimum advertised price)
Detect unauthorized sellers and counterfeit listings
Track inventory availability across all channels
Example Input:
```json
{
  "productUrls": [
    "https://www.amazon.com/dp/YOUR-ASIN-1",
    "https://www.walmart.com/ip/YOUR-SKU-1",
    "https://www.ebay.com/sch/i.html?_nkw=YOUR-BRAND+PRODUCT-1",
    "https://www.target.com/p/YOUR-TCIN-1"
  ]
}
```
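The MAP-violation step is a small post-processing pass over the scraped results. A hedged sketch, assuming items shaped like the sample output above and a `map_prices` dict you maintain yourself:

```python
def find_map_violations(items, map_prices):
    """Return scraped listings priced below your minimum advertised price (MAP).

    items: dicts with 'sku', 'price', 'seller', 'url' (per the scraper's
    sample output); map_prices: {sku: minimum advertised price}.
    """
    violations = []
    for item in items:
        floor = map_prices.get(item.get("sku"))
        if floor is not None and item.get("price") is not None and item["price"] < floor:
            violations.append({
                "sku": item["sku"],
                "seller": item.get("seller"),
                "price": item["price"],
                "map": floor,
                "gap": round(floor - item["price"], 2),  # how far below MAP
                "url": item.get("url"),
            })
    return violations
```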
Result: One brand increased compliance from 67% to 94% within 3 months, identified 23 unauthorized sellers, and recovered $156K in lost revenue through cease-and-desist actions.
Use Case 4: Market Research for Product Launches
Scenario: You're launching a new yoga mat line and need competitive intelligence on pricing, features, and market positioning.
Solution:
Scrape "yoga mat" category across major marketplaces
Extract 500-1,000 products
Analyze: average price ($28.50), price range ($15-$89), top brands (Gaiam, Manduka, Liforme)
Identify feature gaps (eco-friendly materials, alignment guides, portability)
Example Data Analysis:
| Price Range | Product Count | Avg Rating | Market Share |
|---|---|---|---|
| $15-$25 | 342 | 4.1 | 43% |
| $26-$50 | 287 | 4.4 | 36% |
| $51-$89 | 164 | 4.7 | 21% |
Result: The brand launched at $34.99 (premium positioning without pricing itself out of the market) and highlighted eco-friendly materials, an underrepresented feature. First-month sales exceeded projections by 34%.
Use Case 5: Inventory Arbitrage for Resellers
Scenario: You're an e-commerce arbitrage seller looking for price discrepancies across platforms to flip products for profit.
Solution:
Scrape same products across Amazon, eBay, Walmart, Target
Calculate price spreads (e.g., a product costs $45 at Walmart but sells for $68 on Amazon)
Filter for products with >30% margin and high sales volume
Automate purchase orders when opportunities appear
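The margin filter in the steps above is straightforward to sketch. The field names here (`buy_price`, `sell_price`) are illustrative inputs you would map from two scrape runs, not the scraper's own output schema:

```python
def find_arbitrage(products, min_margin=0.30):
    """Filter cross-marketplace price pairs for flips above a margin threshold.

    products: dicts with 'title', 'buy_price' (source marketplace) and
    'sell_price' (target marketplace). Margin is (sell - buy) / buy.
    """
    hits = []
    for p in products:
        margin = (p["sell_price"] - p["buy_price"]) / p["buy_price"]
        if margin >= min_margin:
            hits.append({**p, "margin": round(margin, 2)})
    # Highest-margin opportunities first
    return sorted(hits, key=lambda h: h["margin"], reverse=True)
```

Running it on the example table below reproduces the listed margins (39%, 62%, 89%) to within rounding.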
Example Arbitrage Opportunity:
| Product | Walmart Price | Amazon Price | Margin | Amazon Sales Rank |
|---|---|---|---|---|
| Nintendo Switch Game | $42.99 | $59.99 | 39% | #234 in Video Games |
| Kitchen Gadget | $18.50 | $29.99 | 62% | #89 in Kitchen |
| Phone Accessory | $8.99 | $16.99 | 89% | #45 in Cell Phone |
Result: One reseller identified 23 profitable arbitrage opportunities weekly, with an average profit per flip of $12.40 and monthly arbitrage revenue of $1,200-1,800 with minimal effort.
Use Case 6: Historical Price Tracking for Deal Sites
Scenario: You run a deal aggregation website (like Slickdeals or Honey) and need to verify that advertised discounts are genuine.
Solution:
Scrape product prices daily across major retailers
Store historical data in time-series database
Calculate: average price, lowest price, price volatility
Flag deals as "genuine" only if current price is <90% of 30-day average
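The "genuine deal" rule above reduces to one comparison. A minimal sketch, assuming `price_history` holds the prices collected by your scheduled runs over the last 30 days:

```python
from statistics import mean

def is_genuine_deal(current_price, price_history, threshold=0.90):
    """Flag a deal as genuine only if the current price is below
    `threshold` (90%) of the trailing 30-day average price."""
    if not price_history:
        return False  # no history yet, cannot verify the discount
    return current_price < threshold * mean(price_history)
```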
Result: User trust increased 41% after implementing verified deal badges. Affiliate conversion rates improved 28% because users knew they were getting real deals, not fake "original price" markups.
How Much Will Your Run Cost?
E-commerce Scraping Tool uses a pay-per-event pricing model. You pay for:
Actor start (per run): $0.00007
Listings scraped (per page): $0.00026
Product details extracted (per product): $0.00100
Optional: Residential proxy use (per product): $0.00080
Optional: Browser rendering (per product): $0.00051
Example: Scraping 1,000 Listing Pages (~20,000 Products)
Without Proxies or Browser Rendering:
Actor start = $0.00007
Listings = 1,000 × $0.00026 = $0.26
Product details = 20,000 × $0.00100 = $20.00
Total ≈ $20.26
With Proxies + Browser Rendering:
Actor start = $0.00007
Listings = $0.26
Product details = $20.00
Residential proxy = 20,000 × $0.00080 = $16.00
Browser rendering = 20,000 × $0.00051 = $10.20
Total ≈ $46.46
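The arithmetic above can be wrapped in a small estimator so you can budget a run before launching it, using the per-event prices from the pricing model:

```python
# Per-event prices from the pay-per-event model above
PRICING = {
    "actor_start": 0.00007,
    "listing_page": 0.00026,
    "product_detail": 0.00100,
    "residential_proxy": 0.00080,
    "browser_rendering": 0.00051,
}

def estimate_run_cost(listing_pages, products, proxies=False, rendering=False):
    """Estimate the cost of a single run in USD."""
    cost = PRICING["actor_start"]
    cost += listing_pages * PRICING["listing_page"]
    cost += products * PRICING["product_detail"]
    if proxies:
        cost += products * PRICING["residential_proxy"]
    if rendering:
        cost += products * PRICING["browser_rendering"]
    return round(cost, 2)
```

This reproduces both totals above: $20.26 for the plain run and $46.46 with proxies and rendering enabled.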
Key Takeaway
Costs remain very low relative to data volume. A large run with proxies + browser rendering comes to about $46.46 for 20,000 products. That's $0.0023 per product—less than a quarter of a cent.
Compare to Alternatives:
| Solution | Monthly Cost | Products Tracked | Cost Per Product |
|---|---|---|---|
| Manual Tracking | $1,200 (80 hrs @ $15/hr) | 500 | $2.40 |
| Prisync | $500/month | 10,000 | $0.05 |
| Competera | $2,000/month | 50,000 | $0.04 |
| Apify E-commerce Tool | $46.46 (one-time) | 20,000 | $0.0023 |
For frequent large-scale runs, Apify's Business plan offers volume discounts and priority support.
Run E-commerce Scraping Tool via API
If you want to automate, scale, or integrate scraping into your existing workflow, you can run E-commerce Scraping Tool programmatically with the Apify API.
Why Use the API?
You already have URLs generated from another system (e.g., ERP, Google Sheet) and want to feed them directly.
You need to run scraping jobs on a schedule and deliver results automatically.
You're building a price monitoring or product tracking app that requires fresh data on demand.
How to Use It
Python Example:
```python
from apify_client import ApifyClient

client = ApifyClient("<YOUR_API_TOKEN>")

run_input = {
    "categoryUrls": [
        "https://www.amazon.com/s?k=wireless+headphones",
        "https://www.walmart.com/browse/electronics/headphones/3944_133251"
    ],
    "productUrls": [
        "https://www.amazon.com/dp/B0CX23V2ZK"
    ],
    "maxProducts": 100
}

run = client.actor("apify/e-commerce-scraping-tool").call(run_input=run_input)

# Fetch results from the run's default dataset
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(f"{item['title']}: ${item['price']} ({item['availability']})")
```
JavaScript Example:
```javascript
const { ApifyClient } = require('apify-client');

const client = new ApifyClient({
    token: '<YOUR_API_TOKEN>',
});

const input = {
    categoryUrls: ['https://www.amazon.com/s?k=wireless+headphones'],
    maxProducts: 100,
};

const run = await client.actor('apify/e-commerce-scraping-tool').call(input);
const { items } = await client.dataset(run.defaultDatasetId).listItems();

items.forEach((item) => {
    console.log(`${item.title}: $${item.price}`);
});
```
cURL Example:
```bash
curl -X POST https://api.apify.com/v2/acts/apify~e-commerce-scraping-tool/runs \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "categoryUrls": ["https://www.amazon.com/s?k=headphones"],
    "maxProducts": 50
  }'
```
For more details, see the Apify API documentation.
Advanced Features That Set This Scraper Apart
1. Automatic Deduplication
When scraping multiple category pages or combining results from different sources, you often get duplicate products. E-commerce Scraping Tool automatically deduplicates based on:
SKU matching
Product URL normalization
Title similarity (fuzzy matching)
You get clean datasets without manual cleanup.
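The tool's actual deduplication logic is internal, but the idea can be illustrated with a simplified sketch: normalize URLs so variants of the same page compare equal, then keep the first occurrence per key (SKU when present, normalized URL otherwise):

```python
from urllib.parse import urlsplit

def normalize_url(url):
    """Strip query strings and trailing slashes so variants of the same
    product URL compare equal (a simplification of what the tool does)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}".rstrip("/").lower()

def deduplicate(items):
    """Keep the first occurrence of each product, keyed by SKU when
    present, otherwise by normalized URL."""
    seen, unique = set(), []
    for item in items:
        key = item.get("sku") or normalize_url(item["url"])
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```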
2. Multi-Marketplace Support
Unlike specialized scrapers that only work on Amazon or eBay, this tool handles:
Amazon (all domains: .com, .co.uk, .de, .fr, .ca, .au, etc.)
Walmart
eBay
Target
Etsy
AliExpress
Best Buy
Home Depot
Wayfair
Newegg
And 50+ other major e-commerce sites
No need to manage multiple scrapers or learn different APIs for each platform.
3. Structured Data Extraction
The scraper uses intelligent parsing to extract structured fields:
Product identifiers: SKU, UPC, EAN, ASIN, TCIN
Pricing data: Current price, original price, discount percentage, currency
Availability: In stock, out of stock, limited quantity, pre-order, backorder
Seller information: Primary seller, marketplace sellers, third-party sellers
Product attributes: Brand, category, color, size, weight, dimensions, materials
Customer feedback: Rating, review count, review snippets, Q&A
Media: Primary image, gallery images, video URLs
Shipping: Shipping cost, estimated delivery, free shipping eligibility
This structured format makes downstream analysis and integration seamless.
4. Smart Pagination Handling
Category pages often display 20-50 products per page with pagination. The scraper automatically:
Detects pagination links (Next, Page 2, Load More)
Crawls through all pages up to your limit
Handles infinite scroll and lazy loading
Respects rate limits to avoid detection
5. Anti-Bot Detection Bypass
E-commerce sites use aggressive anti-scraping protections. The scraper handles:
Proxy rotation: Residential and datacenter proxy pools
Browser fingerprinting: Mimics real Chrome/Firefox browsers
CAPTCHA handling: Automatic solving for common challenges
Rate limiting: Smart delays between requests
Session management: Maintains cookies and user agents
In practice, residential proxies combined with browser fingerprinting can achieve 95%+ success rates even on heavily protected e-commerce sites.
Building Data Pipelines with E-Commerce Scraping Tool
Automated Daily Price Monitoring
Scenario: Track 1,000 competitor products daily and alert when prices drop below yours.
Setup:
Create a task with your product URLs
Schedule to run daily at 9 AM
Set up webhook integration to your backend
Backend compares new prices to your prices
Send email alerts for price drops >5%
Stack:
Apify (scraping)
Zapier/Make.com (orchestration)
Google Sheets (data storage)
SendGrid (email alerts)
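The comparison step in this pipeline (step 4 above) is a simple join between scraped competitor prices and your own. A hedged sketch, where `our_prices` is a dict you supply from your own catalog:

```python
def price_alerts(scraped, our_prices, threshold=0.05):
    """Return alerts where a competitor's scraped price undercuts ours
    by more than `threshold` (5%).

    scraped: scraper output dicts with 'sku' and 'price';
    our_prices: {sku: our current price}.
    """
    alerts = []
    for item in scraped:
        ours = our_prices.get(item.get("sku"))
        if ours and item.get("price") and item["price"] < ours * (1 - threshold):
            alerts.append({
                "sku": item["sku"],
                "competitor_price": item["price"],
                "our_price": ours,
                "undercut_pct": round((ours - item["price"]) / ours * 100, 1),
            })
    return alerts
```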
Real-Time Inventory Monitoring
Scenario: Monitor product availability for dropshipping—restock your store when suppliers restock.
Setup:
Scrape supplier product pages every hour
Push results to webhook endpoint
Backend checks availability status
If "In Stock" → auto-publish to Shopify
If "Out of Stock" → auto-unpublish from Shopify
Stack:
Apify (scraping)
Custom backend (Node.js/Python)
Shopify API (inventory sync)
PostgreSQL (historical tracking)
Competitor Product Launch Detection
Scenario: Get alerts when competitors launch new products in your category.
Setup:
Scrape competitor category pages weekly
Compare to previous week's dataset
Identify new products (not in historical data)
Send Slack notification with product details
Add to competitive intelligence dashboard
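The diff in steps 2-3 above is a set difference keyed by product URL. A minimal sketch:

```python
def detect_new_products(current_items, previous_items):
    """Diff this week's category scrape against last week's to surface
    newly listed products, keyed by product URL."""
    previous_urls = {item["url"] for item in previous_items}
    return [item for item in current_items if item["url"] not in previous_urls]
```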
Stack:
Apify (scraping)
Airtable (database)
Zapier (automation)
Slack (notifications)
Tableau (dashboard)
Common Pitfalls and How to Avoid Them
Pitfall 1: Scraping Too Aggressively
Problem: Setting up hourly scrapes of 10,000 products triggers rate limits and IP bans.
Solution: Start with daily runs and scale gradually. Use proxy rotation for high-volume scraping. Monitor error rates and adjust frequency accordingly.
Pitfall 2: Ignoring Data Quality Issues
Problem: Not all scraped data is perfect—out-of-stock products return null prices, promotional prices are temporary, bundle deals complicate comparisons.
Solution: Implement post-processing filters:
Remove products with null or zero prices
Flag promotional prices separately (check for "originalPrice" field)
Normalize pricing for bundles (calculate unit price)
Track historical data to identify temporary vs. permanent changes
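The first three filters above can be combined into one post-processing pass. A sketch, assuming the scraper's `price` and `originalPrice` fields; `bundleSize` is a hypothetical field added for illustration:

```python
def clean_products(items):
    """Post-process scraped products: drop null/zero prices, flag
    promotional prices, and compute a unit price for bundles."""
    cleaned = []
    for item in items:
        price = item.get("price")
        if not price:  # drop null or zero prices
            continue
        item = dict(item)  # avoid mutating the input
        # Promotional if a higher original price is present
        item["isPromo"] = bool(item.get("originalPrice")) and item["originalPrice"] > price
        # Normalize bundle pricing to a per-unit figure
        item["unitPrice"] = round(price / item.get("bundleSize", 1), 2)
        cleaned.append(item)
    return cleaned
```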
Pitfall 3: Not Scheduling Regular Updates
Problem: E-commerce prices change rapidly. One-time scrapes become stale within hours or days.
Solution: Use Apify's scheduling feature:
Save your scraper configuration as a "Task"
Set up a schedule (hourly, daily, weekly)
Enable email notifications on completion
Connect to webhooks for real-time data pipeline updates
Pitfall 4: Exceeding Budget Without Realizing
Problem: Scraping 100,000 products daily with proxies and browser rendering can rack up costs quickly.
Solution:
Test with small runs first (50-100 products)
Monitor usage in Apify Console dashboard
Set budget alerts
Disable proxies/rendering for low-protection sites
Use datacenter proxies instead of residential when possible
FAQ
Q: Can I scrape Amazon without getting blocked?
Yes. E-commerce Scraping Tool uses advanced anti-detection techniques including residential proxy rotation and browser fingerprinting. Success rates exceed 95% when proxies are enabled. For ultra-high-volume Amazon scraping, we recommend using datacenter proxies for listing pages and residential proxies only for product details to optimize costs.
Q: Does this work for international e-commerce sites?
Absolutely. The scraper supports Amazon domains across 20+ countries (.co.uk, .de, .fr, .ca, .au, etc.) and major international marketplaces like AliExpress, Mercado Libre, and Rakuten. Simply provide the full URL including country-specific domain.
Q: How often should I run price monitoring?
It depends on your industry:
Electronics/Tech: Every 2-6 hours (prices change rapidly)
Fashion/Apparel: Daily (seasonal changes and promotions)
Home Goods: Weekly (slower-moving inventory)
Groceries/Consumables: Daily during promotional periods
Q: Can I get historical price data?
The scraper captures current prices only. To build historical tracking, set up scheduled runs and store results in a database or Google Sheets with timestamps. Many users export to BigQuery or Snowflake for long-term analysis.
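For a simple local version of that timestamped storage, each scheduled run can append its results to a CSV. A minimal sketch:

```python
import csv
import datetime

def append_price_snapshot(items, path="price_history.csv"):
    """Append scraped prices to a CSV with a UTC timestamp, so scheduled
    runs accumulate a historical price series over time."""
    now = datetime.datetime.now(datetime.timezone.utc).isoformat()
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for item in items:
            writer.writerow([now, item.get("sku"), item.get("price")])
```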
Q: Is this legal?
Scraping publicly available pricing data is generally legal under US law (based on precedent like hiQ Labs v. LinkedIn). However, you must respect rate limits, not violate Terms of Service for commercial purposes without permission, and avoid scraping personal user data. Consult legal counsel for your specific jurisdiction and use case.
Q: What if the scraper breaks due to website changes?
Apify maintains the E-commerce Scraping Tool continuously. When target sites update their layouts, the scraper gets updated within 24-48 hours. You don't need to modify anything—just keep using it.
Try E-commerce Scraping Tool
For e-commerce product data, you can collect information from both listing and detail URLs across multiple websites and store it in one dataset. This lets you discover many products, track known SKUs, and get maximum coverage in one deduplicated dataset.
You can run the scraper via UI for simplicity, or programmatically via API for integration into your workflow.
Ready to start tracking product data at scale? Sign up for Apify here with $5 free monthly credit and start scraping e-commerce data in minutes.
Your competitors are already using automated product intelligence. It's time to level the playing field.