Multi Page Scraper
scraperMultiPage
The turnkey task scraperMultiPage extracts data from a list of web pages. Enter the websites in the input.json file and click START to scrape the required data. Alternatively, the command "dex8 start -i input.json" starts the crawler.
Features:
- Extract data from many web pages and from many HTML elements
- Extract data from dynamic HTML content that is loaded by JavaScript
- Extract data from an HTML element defined by a CSS selector
- Extract text, HTML, or the value of an HTML tag attribute
- Filter extracted data with a regular expression
- Correct extracted data with a custom JS function
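The three extraction modes in the list above can be pictured with the standard DOM API. This is only a minimal sketch, not the task's actual code: the helper name extractFrom is hypothetical, the field names tip, selector and attribute are borrowed from the input file format of this task, and the tip value "html" is an assumption inferred from the feature list.

```javascript
// Hypothetical helper, not the task's actual code. `root` is anything that
// offers querySelectorAll, for example `document` inside a browser page.
function extractFrom(root, extract) {
  const elements = Array.from(root.querySelectorAll(extract.selector));
  return elements.map((element) => {
    switch (extract.tip) {
      case 'attr':
        // Value of the named attribute, e.g. "href" or "content".
        return element.getAttribute(extract.attribute);
      case 'html': // assumed value, inferred from the feature list
        return element.innerHTML;
      default: // 'text'
        return element.textContent;
    }
  });
}
```

In a browser console, a call such as extractFrom(document, { tip: 'attr', attribute: 'href', selector: 'a' }) would collect the href value of every link on the current page.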
Input fields
Example of an input file:
{
  "device_name": "Desktop Linux",
  "urls": [
    "adsuu.com",
    "dex8.com"
  ],
  "encodeURL": false,
  "extracts": [
    {
      "tip": "text",
      "selector": "title"
    },
    {
      "tip": "attr",
      "attribute": "content",
      "selector": "meta[name=\"keywords\"]"
    },
    {
      "tip": "attr",
      "attribute": "content",
      "selector": "meta[name=\"description\"]"
    },
    {
      "tip": "attr",
      "attribute": "href",
      "selector": "a"
    }
  ],
  "filter": {
    "reg_str": "",
    "reg_flags": ""
  },
  "corrector": "return result;"
}
The "corrector" field can also be set to false.
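How the "filter" and "corrector" fields might act on extracted values can be sketched in plain JavaScript. Everything here is an assumption made for illustration: the helper name filterAndCorrect is hypothetical, an empty reg_str is treated as "no filtering", and the corrector string is treated as a function body that receives one extracted value as result. The real task may apply these fields differently.

```javascript
// Hypothetical helper, not part of the DEX8 API: applies the "filter" and
// "corrector" input fields to a list of already extracted strings.
function filterAndCorrect(values, filter, corrector) {
  // Assumption: an empty reg_str means "keep everything".
  const regex = filter && filter.reg_str
    ? new RegExp(filter.reg_str, filter.reg_flags || '')
    : null;

  // Assumption: the corrector string is a function body that receives one
  // extracted value as `result`; false (or an empty string) disables it.
  const correct = corrector
    ? new Function('result', corrector)
    : (result) => result;

  return values
    .filter((value) => {
      if (!regex) return true;
      regex.lastIndex = 0; // keep "g" or "y" flags from skipping matches
      return regex.test(value);
    })
    .map((value) => correct(value));
}

// Keep only absolute links and trim surrounding whitespace.
const links = ['/about', ' https://dex8.com ', '/contact'];
const kept = filterAndCorrect(
  links,
  { reg_str: 'https?://', reg_flags: 'i' },
  'return result.trim();'
);
console.log(kept);
```

With the default values from the example input file (empty reg_str and the corrector "return result;"), this sketch returns the extracted values unchanged.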
PRICE: 10.00 EUR /month
To buy this product, you need to sign up for a free account and log in.