Scraping with Python: Scrapy, Beautiful Soup, or Selenium?

Scraping is the technique of collecting public information from other websites and saving it, either to analyze it or simply to organize it into different topics. Let's see a few legal examples:

  • Reading product prices from different websites, which saves you time compared with doing it manually.
  • Converting HTML tables to Excel files.
  • Checking information on different social media accounts that you own.
  • Downloading pictures from different sites.

All of these tasks can be tedious if you do them manually across a long list of websites. If you have some programming knowledge, there are interesting alternatives to point-and-click software like Web Scraper.

Let's see a comparison of the main Python scraping libraries; I'll show some examples of each one as soon as I code them. :-)



Scrapy

Pros:
  • Robust
  • Portable
  • Efficient

Cons:
  • Requires more coding knowledge

Beautiful Soup

Pros:
  • Easy to learn
  • Friendly interface
  • Many extensions

Cons:
  • Parsing can be inconsistent across parsers
  • Big dependencies

Selenium

Pros:
  • JavaScript friendly
  • Perfect for test automation
  • Can run in headless mode

Cons:
  • Not really a full web scraper, although it can do similar things

Let's see some examples:

Scrapy: extract text from div and span elements.

import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = [
        'http://quotes.toscrape.com/page/1/',
        'http://quotes.toscrape.com/page/2/',
    ]

    def parse(self, response):
        # Each quote on the page lives in a <div class="quote"> block
        for quote in response.css('div.quote'):
            yield {
                'text': quote.css('span.text::text').get(),
                'author': quote.css('small.author::text').get(),
                'tags': quote.css('div.tags a.tag::text').getall(),
            }


Beautiful Soup: extract text from div and span elements.
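A minimal sketch with Beautiful Soup, extracting the same fields as the Scrapy spider above. The HTML snippet is hand-written to mimic the quotes.toscrape.com markup so the example runs without a network request; in practice you would fetch the page first (for example with the `requests` package):

```python
from bs4 import BeautifulSoup

# Hand-written snippet mimicking the quotes.toscrape.com markup.
html = """
<div class="quote">
  <span class="text">"Hello world."</span>
  <small class="author">Jane Doe</small>
  <div class="tags"><a class="tag">greeting</a><a class="tag">demo</a></div>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

quotes = []
for quote in soup.select('div.quote'):
    quotes.append({
        'text': quote.select_one('span.text').get_text(),
        'author': quote.select_one('small.author').get_text(),
        'tags': [t.get_text() for t in quote.select('div.tags a.tag')],
    })

print(quotes)
```

Note that Beautiful Soup is only a parser: unlike Scrapy, it does not fetch pages or follow links by itself, so you pair it with an HTTP client.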
 
Selenium: extract text by driving a real browser.


Code sources:

 https://docs.scrapy.org/en/latest/intro/tutorial.html