How much can you earn with your own blog?

You may be wondering: how much do you actually earn with your own blog?

You can visit it and see that there is no AdSense, no direct advertisers, no sale of services or consulting; the eBooks I have are free, and so on.

Promoting only with affiliates, and not in a very invasive way, you can make around 500 USD per month overall. But imagine if you monetized your blog to the maximum.

You could consider adding:

  • AdSense advertising
  • More affiliates, many more!
  • eBook sales
  • Sale of advice on SEO
  • Direct advertisers 
  • More private courses 
  • Sale of reviews of similar products

Yes, you can monetize all of this and dedicate yourself exclusively to your blog, and I think you could easily make between $3,500 and $4,500 a month in profits.

As you can see, a simple blog can be very profitable if you show that you are an expert in your sector, if you entertain people and if you are consistent, adding to all this that you know how to monetize it correctly.

Scraping with Python: Scrapy, Beautiful Soup or Selenium?

Scraping is the technique of picking up public information from other websites and saving it, either to analyse it or just to organize it by topic. Let's see a few legal examples:

  • Reading different product prices from different websites, which saves you time compared to doing it manually.
  • Converting HTML tables to Excel files.
  • Checking information on the different social media accounts that you own.
  • Downloading pictures from different sites.

All these tasks can be annoying if you do them manually over a huge list of websites. If you have some knowledge of a programming language, there are some interesting alternatives to software like Web Scraper.

Let's see a comparison table of scraping libraries in Python, and I'll show some examples of each one as soon as I code them :-)



Scrapy
  • Pros: robust, portable, efficient
  • Cons: requires coding knowledge

Beautiful Soup
  • Pros: easy to learn, friendly interface, extensions
  • Cons: inconsistent, big dependencies

Selenium
  • Pros: JavaScript friendly, perfect for test automation, you can run it in dialog mode
  • Cons: not really a full web scraper, although it does similar things

Let's see some examples:

Scrapy: extract text from div and span elements.

import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = [
        'http://quotes.toscrape.com/page/1/',
        'http://quotes.toscrape.com/page/2/',
    ]

    def parse(self, response):
        # Each quote on the page lives in a <div class="quote"> element
        for quote in response.css('div.quote'):
            yield {
                'text': quote.css('span.text::text').get(),
                'author': quote.css('small.author::text').get(),
                'tags': quote.css('div.tags a.tag::text').getall(),
            }


Beautiful Soup: code coming soon.
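In the meantime, here is a minimal sketch, assuming the requests and beautifulsoup4 packages are installed: it fetches the same quotes.toscrape.com page used in the Scrapy example above and extracts the same fields.

import requests
from bs4 import BeautifulSoup

# Fetch one page of the same demo site used in the Scrapy example
response = requests.get('http://quotes.toscrape.com/page/1/')
soup = BeautifulSoup(response.text, 'html.parser')

# Each quote lives in a <div class="quote">, just like in the Scrapy selectors
for quote in soup.select('div.quote'):
    print({
        'text': quote.select_one('span.text').get_text(),
        'author': quote.select_one('small.author').get_text(),
        'tags': [tag.get_text() for tag in quote.select('div.tags a.tag')],
    })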
 
Selenium: code coming soon.
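In the meantime, here is a minimal sketch, assuming a local Chrome and chromedriver install and the selenium package (version 4 API): it drives a headless browser to load the same quotes page and read the rendered elements.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

# Run Chrome without opening a visible window
options = Options()
options.add_argument('--headless')
driver = webdriver.Chrome(options=options)

driver.get('http://quotes.toscrape.com/page/1/')

# Selenium works on the rendered page, so JavaScript-generated content is visible too
for quote in driver.find_elements(By.CSS_SELECTOR, 'div.quote'):
    print({
        'text': quote.find_element(By.CSS_SELECTOR, 'span.text').text,
        'author': quote.find_element(By.CSS_SELECTOR, 'small.author').text,
    })

driver.quit()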


Code Sources:

 https://docs.scrapy.org/en/latest/intro/tutorial.html




Proxies: what they are and how to use them

What is a proxy server?

A proxy server is a computer system that manages internet connections and handles the traffic between two or more points, from a client computer to another server. The proxy is used to relay the traffic of a connection; it offers security, performance and more privacy.

What are the most common cons of using a proxy?

  • Privacy: if all users identify themselves as one, it is difficult for the accessed resource to differentiate between them. This can be bad, for example, when identification is strictly required.
  • Abuse: by being willing to receive requests from many users and respond to them, the proxy may end up doing work that is not its own. You therefore need to control who has access to your services and who does not, which is usually very difficult.
  • Irregularity: the fact that the proxy represents more than one user causes problems in many scenarios, in particular those that presuppose direct communication between one sender and one receiver (such as TCP/IP).
How to set up a proxy configuration for all users on a single system:

You will need to set a global environment variable that affects all users. To do this, edit the file /etc/profile as the root user and add the entries with the information of the proxy that you would like to use. For example, if you would like to use a proxy for HTTPS and FTP connections, add the following entries:
export https_proxy=http://server-proxy:8080/
export ftp_proxy=http://210.113.232.28:8083/

Follow the next steps to set up a proxy through the terminal on a Linux system. If you have any scraping or crawling program on your system, you will need this. Use the http_proxy variable:

export http_proxy=http://server-name:port/
If the proxy server requires authentication, do this:

export http_proxy=http://user:password@server-name:port/

If you would like to use https for more security, use https_proxy like this:

export https_proxy=https://user:password@server-name:port/

To see your current configuration use the following commands:

echo $http_proxy
echo $https_proxy

If you need to delete the configuration, use the unset command.

unset http_proxy
unset https_proxy

Here you have a list of proxy server providers. Remember that proxy servers may insert your IP address into the request headers, or they can sniff your traffic, so if you handle sensitive information it's important that you trust your proxy supplier.

Remember, if you have a scraper or crawler running on your computer and/or server, it's recommended to use a good proxy to avoid being banned by the target server; a short example of how to do it from Python follows.
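Here is a minimal sketch (the proxy address and credentials are placeholders) of how a Python scraper based on the requests library can route its traffic through a proxy. Note that requests also honours the http_proxy / https_proxy environment variables set above, so the proxies argument is optional if the exports are already in place.

import requests

# Placeholder proxy address and credentials: replace them with your provider's data
proxies = {
    'http': 'http://user:password@server-name:8080',
    'https': 'http://user:password@server-name:8080',
}

# If http_proxy / https_proxy are already exported, requests picks them up automatically
response = requests.get('http://quotes.toscrape.com/page/1/', proxies=proxies)
print(response.status_code)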





Web Scraper - How to use the Chrome Extension

Web Scraper is a commercial piece of software to pick up public information from websites. There are options of the software that require an additional payment, but they offer a very useful free solution as a browser extension, for Google Chrome and for Firefox. Here is my test with Google Chrome.

You can download the software from the following link:  https://webscraper.io/

In this step-by-step guide you will see the few steps needed to install it and use it to scrape text and links from Amazon, for example to create promoted links with an Amazon affiliate code.

The Steps are:

  1.- Installation
  2.- First steps
  3.- Scrape
  4.- Export data


1.- Installation:

It's as easy as clicking a button: go to the download section of the Web Scraper site, look for the pricing chart and click on the free extension. There you will see an install button; after clicking it, the extension will be added to your Google Chrome browser.


Extension details before installing:

Click on the extensions icon in the toolbar of Google Chrome to see the extension properties. You will see a list of all the extensions in your browser; just look for Web Scraper and pin it to make sure you can see it in the menu. As you will see in the next steps, it doesn't matter if you don't see it in the top bar, because we will open it from the developer tools at the bottom of your browser.

Here is a detail of the pinned extension; I recommend pinning it so you can reach more options.
Your Web Scraper is now installed and running; you don't need to restart the browser or the computer.

2.- First steps 

Open the inspect window with CTRL+SHIFT+I and look for a new tab called "Web Scraper". There is an easy configuration that you have to follow before you start scraping a website:

First click on the sub-menu "Create new sitemap" and then "Create".

You will see the next window, where we have to enter the URL of a target website. Just add a sitemap name and the website URL; note the yellow mark in the next screenshot, I will explain it to you later.

Selecting the target: in this test we choose the Amazon site to download the names and links of different Amazon products. We make a product search until we find the desired products, and make sure the site has pagination so we can move forward through all the products.

Then we have to copy the target URL into our sitemap. Pay attention to the last part of the URL: there we have to find the pagination number and replace it with a dynamic variable, the yellow mark I mentioned before.

In this example we will use the dynamic variable [1-10]; remember to check the limit before you scrape. Web Scraper will step through the pages one by one from 1 to 10, and if you select 20000 it will try to do that too. A rough sketch of what this range expands to is shown below.
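As a rough sketch (the URL below is a made-up placeholder, not a real Amazon search), the dynamic variable simply makes Web Scraper visit one start URL per value in the range, equivalent to something like this:

# Hypothetical paginated search URL; the {} stands for the page number
base_url = 'https://www.amazon.com/s?k=your-keyword&page={}'

# [1-10] in Web Scraper expands to pages 1 through 10
start_urls = [base_url.format(page) for page in range(1, 11)]
for url in start_urls:
    print(url)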

Now we have to create a selector. A selector is an object that Web Scraper uses to identify an element of the website; for example, you can create a selector for "Text Title", "Product Link" and "Picture". In this example we are picking only the name and the link, so we will create two selectors.

Once you create the selector and flag the "multiple" option, you will see that all these elements are highlighted on the Amazon page; Web Scraper searches the site for all elements of the same type. In this example you can see that after the second click on an element, all other similar elements are also selected. On Amazon you will notice that promoted products don't have the same properties as normal products, so if you would like to pick them all, you will need to create a specific selector for promoted products as well.

We can create many selectors; there is no limit.
In this example we have created two selectors, one for text and another for links. Once a selector is created you will see the selector options menu, where you can preview the selected elements, see the data preview (the list of data that will be scraped), edit the selector again or delete it.
Before you scrape, you can check the data preview to make sure you are scraping the right information; this is a recommendation, not a mandatory step.
3.- Scrape:

Now it's time to scrape: just go to the menu option "Sitemap test" and select "Scrape".

After you click on the scrape option, the system will ask you for a default wait time. This is to avoid server checks: if the server has any anti-scraping protection, you can raise this number so the requests look like a human clicking the mouse rather than a bot.

At the beginning you will see a popup; take care not to close it before the scrape ends. Once it ends, you will see the following report.



It's done. Now we can export the data in different formats; I prefer CSV because you can open it easily in Excel.

4.- Exporting the data:

In the same sitemap window, just go to the menu and click on "Export data as CSV". I like it more than the sitemap export, but it depends on your objective.

There are two steps to export the data; if you don't see the popup, you have to click on the "Download now" link.

Your file is ready in your browser's download folder.

This is the exported data. Now, in Excel, there are no limits to editing this information; you can, for example, add your Amazon affiliate promo code to each link and then upload them to another site. A quick sketch of that idea follows below.
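Here is a small sketch of that idea (the file name and the column name are placeholders that depend on how you named your sitemap and selectors, and the affiliate tag is hypothetical): it reads the exported CSV and appends an affiliate tag to every product link.

import csv

AFFILIATE_TAG = 'tag=your-affiliate-id'  # placeholder: use your own Amazon affiliate tag

with open('amazon-products.csv', newline='', encoding='utf-8') as source, \
     open('amazon-products-tagged.csv', 'w', newline='', encoding='utf-8') as target:
    reader = csv.DictReader(source)
    writer = csv.DictWriter(target, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        link = row['product-link']  # column name matches the selector name you created
        separator = '&' if '?' in link else '?'
        row['product-link'] = link + separator + AFFILIATE_TAG
        writer.writerow(row)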

And now? What can we do with that information? Imagination has no limits...

Different types of cloaking

Remember that cloaking means dynamically changing web content depending on whether the visitor is a human user or a robot, so you can show HTML content to a bot and images or Flash content to humans; we talked about it in this post.

There are different types of cloaking:

  • User-agent cloaking

    Websites that use this technique identify the visit by the User-Agent field in the HTTP request. For example, you can identify a human visitor if you can see browser information in the user agent; bots don't use internet browsers :-). With this simple PHP code you can read the user agent information:
    <?php
    echo $_SERVER['HTTP_USER_AGENT'];
    ?>
  • To identify a bot, you can use code like this:

      if ( strstr( strtolower( $_SERVER['HTTP_USER_AGENT'] ), 'googlebot') )
          {
          // code to execute for bots
          }

    Examples of use (remember that Google will penalize you if they detect it):
    - Change image content to HTML content.
    - Redirect bots or humans to different pages.
    - etc.

  • IP-based cloaking

    This cloaking technique is quite similar to user-agent cloaking, but rather than checking whether the visitor is a human or a search engine bot, you read the visitor's IP and change the behavior of your site accordingly. To detect the visitor's IP in PHP you can use one of the following scripts:

    <?php
    $ip = $_SERVER['REMOTE_ADDR'];
    ?>
    or
    <?php
    $ip = $_SERVER['REMOTE_HOST'];
    ?>

    Examples of use (this is not as bad as user-agent cloaking):
    - This is common for geolocated sites: you can redirect your visitor to the correct language if you can geolocate their IP.
    - You can apply IP filtering for developers and testers: while you are developing a website, you may want to skip some controls for the IP that you are using.
    - You can identify your visitor's IP, search your user database to see whether it is a returning visitor, and show additional information based on the last visit.
  • HTTP language header cloaking

The first examples below read the referrer header: you can get this information using JavaScript or PHP; it is a property of the loaded document, but not of its parent.

JavaScript example:

<script type='text/javascript'>
document.write(document.referrer);
</script>

PHP example:

<?php echo $_SERVER["HTTP_REFERER"]; ?>


Here is another example that reads the Accept-Language header itself:

<?php
    $langu = substr($_SERVER['HTTP_ACCEPT_LANGUAGE'], 0, 2);
    $acceptLang = ['de', 'es', 'en'];
    $langu = in_array($langu, $acceptLang) ? $langu : 'jp';
    require_once "index{$langu}.php";
?>

This header is a trick to use when the server has no other way to determine the language, such as a specific URL and/or IP chosen by an explicit decision of the user.

  • JavaScript cloaking

    It's not very different from the examples explained before. Remember that JavaScript runs on the client side, so after the server has answered the HTTP request and the HTML is displayed, you can apply any JavaScript routine to manipulate the content based on what your human visitor does on the website.

    For example, you can redirect the user by reading the client information with JavaScript. You have more options there, but you are more vulnerable if the user has JavaScript blocked or tampered with.

    <a href="YourAffiliateSiteUrl" onMouseOver="func_changeurl('v_code'); return true" onMouseOut="window.status=' '; return true">Link Title</a>

    With that code you can change your affiliate link when your human visitor hovers over it, right before clicking; you can do the same using PHP, which is better and more secure.



Link Farming, definition and uses

What are Link Farms?

Link farms are groups of websites that know each other and constantly exchange links in order to improve the positioning of all the websites they link to. This practice is not very friendly in SEO terms and is currently in serious disuse, due to the positioning problems it causes once the search engines detect it, so try to avoid it.

It is another of the practices that were carried out in the past but caused the search algorithms to evolve in order to put a stop to them. Currently, Google and the rest of the engines reward the quality of the content and the natural placement of links within it, as well as the authority of the site from which the link is made and its target.

This change has meant that all the grouped websites that used to resort to these practices have completely collapsed. Link farms were developed to improve the PageRank of different websites, although currently they only cause problems by receiving penalties from Google or other engines.

It should be noted that they differ from web directories in that the latter do link, but provide context and aim to be useful to visitors. Moreover, directories do not have hundreds or thousands of links plaguing all their content.

What are Link Farms for?

Link farms were used so that all the websites linked to them could improve their PageRank and, therefore, achieve better organic positioning in search engines, especially in Google. It was a fairly common practice in the past; now it is considered illegitimate and brings more disadvantages than advantages if you don't use it well.

Take care and remember that receiving backlinks from these types of websites, usually directories, diminishes your page's reputation and therefore its position. They are something to be avoided as much as possible.

Structure of the link farm

Link farms are often websites that do not contain any relevant content and only serve to build links. Websites that are exclusively dedicated to link cultivation can also be called link networks. These sites are useless for visitors, as their only purpose is to refer to other websites and thus increase their link popularity. They are supposed to increase PageRank according to the number of links that point to the website in question.

Until a few years ago these methods worked, but search engines are getting better at identifying spam, link networks and other black hat techniques such as cloaking.
The use of link farms is targeted by Google and is considered spam. If a link farm is detected, it will either appear at the bottom of the search results or be removed from the search engine index entirely and therefore no longer be accessible to users; so now you know what happens if you don't use it well.

Relevance for SEO

Furthermore, Google will penalize any other website that is connected to such a link farm, as links from previously identified networks are generally considered to be harmful. Google not only evaluates the number of links, but also their quality. A senseless abundance of links that are in no way related to the content of the website is devalued by search engines and will then negatively affect positioning.

As a result, the linked websites will slip down the search results lists. This happens quite regularly, and website operators and agency clients are well advised to refrain from this type of black hat SEO, or if necessary to inform their contracted agency that they want natural link building. If you already have link farm links, they should be identified and removed. This requires first a detailed analysis of the backlinks; then all malicious links have to be removed manually. Subsequently, a reconsideration request can be sent to Google.

Does your site have an unknown clone? How to find duplicate content

Hi, I'm WBear and I will tell you something about what duplicate content on the Internet means for search engines and how it can be dangerous for your business if you don't take care.
               
What is Duplicate Content
               
There are two types of duplicate content: one is content that lives inside your own website and can be found in different sections, menus or areas; the other is text that is exactly the same as on another domain or website. Both can be partial or full copies. Let's see what the difference is between each one.
               
Search engines always care about what content they will show to users after a search; this is why they evaluate the quality of content before showing it in the search results.
               
How can we avoid using duplicate content?
               
If you are thinking of a CMS, you will remember that it is impossible to avoid all duplicate content, because there will always be some module that is visible in different sections. But this is not too important for search engines; on those websites, the important part regarding duplicate content is the main body of the page, the area where the section's information is loaded and shown to users.
               
There are some considerations to take care of before the go-live of your website:
               
Canonical link:
As described on Wikipedia, a canonical link is an HTML element that helps web developers and owners avoid duplicate content issues; it is also used if you work with a content management system or an e-commerce platform.
In that link you will find more information from Google about how you can use or define canonical links on your site.
               
Redirect, moving to a new domain:
When you need to move your site from one domain to another, you will need to notify Google and other search engines that you moved the content to a new domain with 301 redirects, as indicated by Google in that link.
               
Dynamic pages:
As I told you before, on dynamic pages there is always duplicate content, shown through URL parameters that change the behavior and the content of the site. To inform search engines about this, you need to specify which parameters need to be indexed and which do not.
               
How you can detect or avoid the use of duplicate content on your website:
The next list is a few tools that I sometimes use to check for duplicate content on websites where someone else writes the content.

*The description of each tool is the meta information of each site. Test them at your own discretion and select the one that works best for you; there isn't any perfect tool for this, so you have to use what you need in each situation. I don't use the same one in every analysis; I prefer to change the check method and compare the results of two different tools. After the list you will also find a small script to make a quick comparison yourself.

http://www.copyscape.com
Copyscape is a free plagiarism checker. The software lets you detect duplicate content and check if your articles are original.

http://www.siteliner.com/
Free and fast analysis of your entire website - duplicate content, broken links, internal page rank, redirections and more. Also creates an XML sitemap.

http://www.plagium.com/
Plagium: a plagiarism tracker and checker, source finder and copyright-violation detection tool.

http://www.similarsitesearch.com/
SimilarSiteSearch.com helps you find similar, related, or alternative websites. Our goal is to generate the most relevant results for our users.

https://www.google.com/webmasters/tools/
Finally, we can check whether there is duplicate content within our own website with Google's tools for webmasters: from there we can see if there are duplicate meta tags, titles and H1 headings within the pages that make up the domain, and if any are detected it is best to correct them so those mistakes disappear.
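As a complement to the tools above, here is a small sketch (the URLs are placeholders, and it assumes the requests and beautifulsoup4 packages): it downloads two pages, strips scripts and styles, and compares the visible text with Python's difflib to get a rough similarity percentage.

import difflib
import requests
from bs4 import BeautifulSoup

def visible_text(url):
    # Download the page and keep only the visible text, without scripts or styles
    soup = BeautifulSoup(requests.get(url).text, 'html.parser')
    for tag in soup(['script', 'style']):
        tag.decompose()
    return soup.get_text(separator=' ', strip=True)

# Placeholder URLs: your page and the suspected clone
original = visible_text('https://example.com/original-page')
suspect = visible_text('https://example.com/suspected-clone')

ratio = difflib.SequenceMatcher(None, original, suspect).ratio()
print('Similarity: {:.0%}'.format(ratio))  # values close to 100% suggest duplicate content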

That's all for now. Non-duplicated content will give you long-lasting relevance in search engines and a good user experience for your audience. This is only a small amount of information about how duplicate content is identified by search engines and the penalties for using it.
Quote from BBear: copy quality content from top-ranking pages, copy it 100%; after you open the content to the search engines, you have time to rewrite and change the content before your site is banned. That will give you some time of good results, and when you change the content again it will send a change notification that calls the search engine robots back.

Sources:
https://support.google.com/webmasters/answer/66359?hl=en
http://moz.com/learn/seo/duplicate-content
http://en.wikipedia.org/wiki/Duplicate_content
http://searchenginewatch.com/article/2049078/What-is-Duplicate-Content