Web Scraper is a commercial tool for extracting public information from websites. Some of its options require an additional payment, but it also offers a very useful free solution as a browser extension for Google Chrome and Firefox. Here is my test with Google Chrome.
You can download the software from the following link: https://webscraper.io/
In this step-by-step guide you will see the few steps needed to install it and use it to scrape text and links from Amazon, for example to create promoted links with an Amazon affiliate code.
The Steps are:
- Installation
- First steps
- Scraping
- Exporting the data
1.- Installation:
It is as easy as clicking a button: go to the download section of the Web Scraper site, look for the pricing chart and click on the free browser extension. There you will see an Install button; after clicking it, the extension will be added to your Google Chrome browser.
Extension details before installing:
Click on the Extensions button in the Google Chrome toolbar to see the extension properties. You will see a list of all the extensions in your browser; look for Web Scraper and pin it so you can always see it in the menu. As you will see in the next steps, it doesn't really matter if you don't see it in the toolbar, because we will open it from the Developer Tools panel at the bottom of your browser.
Detail of the pinned extension. I recommend pinning it, so you have quick access to more options.
Web Scraper is now installed and running; you don't need to restart the browser or the computer.
2.- First steps:
Open the "Inspect windows" with CTRL+SHIFT+I, you have to look for a new tab called "Web Scrapper", there is a easy configuration that you have to follow before you start scrapping a web site:
First, click on the sub-menu "Create new sitemap" and then "Create".
You will see the next window, where we have to enter the URL of a target website. Just add a sitemap name and the website URL; note the yellow mark in the next screenshot, which I will explain later.
Selecting the target: in this test we choose Amazon to download the names and links of different products. We run a product search until we reach the desired products, making sure the site has pagination so we can move through all of them.
Then we have to copy the target URL into our sitemap. Pay attention to the last part of the URL: that is where we have to find the pagination number and replace it with a dynamic range, the yellow mark I mentioned before.
In this example we will use the dynamic range [1-10]; remember to check the upper limit before scraping. Web Scraper will step through the pages from 1 to 10, and if you enter 20000 it will try to go that far as well.
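To make the range idea concrete, here is a minimal Python sketch of the page URLs that a [1-10] range stands for. The Amazon search URL below is a made-up example, not the one from my test.

```python
# Minimal sketch of the page URLs a [1-10] range stands for.
# The search URL is a hypothetical example, not taken from this test.
base_url = "https://www.amazon.com/s?k=laptop&page={page}"

# Web Scraper iterates the range by itself; this loop only illustrates the idea.
for page in range(1, 11):
    print(base_url.format(page=page))
```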
Now we have to create a selector. A selector is the object Web Scraper uses to identify an element of the website; for example, you can create a selector for "Text Title", "Product Link" or "Picture". In this example we are picking only the name and the link, so we will create two selectors.
Once you create the selector and flag the "Multiple" option, you will see that all these elements are highlighted on the Amazon page: Web Scraper searches the site for every element of the same type. In this example you can see that after the second click on an element, all other similar elements are selected as well. Note that on Amazon, promoted products don't have the same properties as normal products, so if you want to pick them all, you will need to create a specific selector for promoted products too.
We can create as many selectors as we want; there is no limit.
In this example we have created two selectors, one for the text and one for the links. Once a selector is created you will see the selector options menu, where you can preview the selected elements, see the data preview (the list of data that will be scraped), edit the selector again or delete it.
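For reference, sitemaps can also be exported and imported as JSON from the extension. Here is a rough Python sketch of what a sitemap with these two selectors could look like; the CSS selectors, the start URL and the exact field values are assumptions for illustration, so check your own export for the real format.

```python
import json

# Rough sketch of an exported Web Scraper sitemap with two selectors
# (product name as text, product link as link). The CSS selectors and the
# start URL are assumptions for illustration only.
sitemap = {
    "_id": "amazon-test",
    "startUrl": ["https://www.amazon.com/s?k=laptop&page=[1-10]"],
    "selectors": [
        {
            "id": "product-name",
            "type": "SelectorText",        # text selector
            "parentSelectors": ["_root"],
            "selector": "h2 a span",       # hypothetical CSS selector
            "multiple": True,
        },
        {
            "id": "product-link",
            "type": "SelectorLink",        # link selector
            "parentSelectors": ["_root"],
            "selector": "h2 a",            # hypothetical CSS selector
            "multiple": True,
        },
    ],
}

print(json.dumps(sitemap, indent=2))
```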
Before you scrape, you can check the data preview to make sure you are scraping the correct information. This is a recommendation, not a mandatory step.
3.- Scraping:
Now it's time to scrape: just go to the menu option "Sitemap test" and select "Scrape".
After you click the scrape option, the extension will ask you for a default wait time between requests. This helps to get past anti-scraping checks: if the server has protections against scraping, you can raise this number so the traffic looks more like a human clicking the mouse instead of a bot hammering the site.
At the beginning you will see a popup window; be careful not to close it before the scrape ends. Once it finishes, you will see the following report.
It's done. Now we can export the data in different formats; I prefer CSV because you can open it easily in Excel.
4.- Exporting the data:
In the same sitemap window, just go to the menu and click on "Export data as CSV". I like it more than exporting the sitemap, but it depends on your objective.
There are two steps to export the data; if you don't see the popup, you have to click the "Download now" link.
Your file is ready in your browser's download folder.
This is the exported data. Now, in Excel, there are no limits to editing this information; you can, for example, add your Amazon affiliate tag to each link and then publish them on another site.
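As a quick illustration, here is a small Python sketch that appends an affiliate tag to every scraped link in the exported CSV. The file names, the "product-link-href" column and the tag value are placeholders you would need to adapt to your own export.

```python
import csv

# Minimal sketch: append an Amazon affiliate tag to every scraped link.
# File names, the "product-link-href" column and the tag value are placeholders.
AFFILIATE_TAG = "tag=your-affiliate-id"

with open("amazon-test.csv", newline="", encoding="utf-8") as src, \
        open("amazon-test-affiliate.csv", "w", newline="", encoding="utf-8") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        link = row["product-link-href"]
        # Use ? or & depending on whether the URL already has a query string.
        row["product-link-href"] = link + ("&" if "?" in link else "?") + AFFILIATE_TAG
        writer.writerow(row)
```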
And now? What can we do with all that information? Imagination has no limits...