Smoke Signals

Where there is smoke, there’s an SEO fanning a fire. In the old days, in-your-face signals like “Pay for a Link” language were enough to disqualify a link. Post-Penguin, more subtle signs of SEO are reason to pause and reconsider a link project. When analyzing a web page to decide whether a link should be disavowed, it’s important to be accurate: disavow the wrong links and the site may lose rankings. This article discusses why certain SEO signals, termed Smoke Signals, can and should be considered possible negative signs of quality.

It is not generally well understood how search engines use statistics to establish a baseline of what is normal and then discover sites whose features are unnatural. I highly encourage reading the paper Spam, Damn Spam, and Statistics – Using statistical analysis to locate spam web pages, because it explains the method in an easy-to-understand manner. I will quote a key passage here; added emphasis is mine:

Web pages and the hyperlinks between them induce a graph structure. Using graph-theoretic terminology, the out-degree of a web page is equal to the number of hyperlinks embedded in the page, while the in-degree of a page is equal to the number of hyperlinks referring to that page.

Figure 4 Distribution of Out-Degrees

“Figure 4 shows the distribution of out-degrees. The x-axis denotes the out-degree of a page; the y-axis denotes the number of pages in DS2 with that out-degree. Both axes are drawn on a logarithmic scale. (The 53.7 million pages in DS2 that have out-degree 0 are not included in this graph due to the limitations of the log-scale plot.) The graph appears linear over a wide range, a shape characteristic of a Zipfian distribution. The blue oval highlights a number of outliers in the distribution. For example, there are 158,290 pages with out-degree 1301; while according to the overall distribution of out-degrees we would expect only about 1,700 such pages. Overall, 0.05% of the pages in DS2 have an out-degree that is at least three times more common than the Zipfian distribution would suggest. We examined a cross-section of these pages, and virtually all of them are spam.

Figure 5 Distribution of In-Degrees

Figure 5 shows the distribution of in-degrees. As in figure 4, the x-axis denotes the in-degree of a page, the y-axis denotes the number of pages in DS2 with that in-degree, and both axes are drawn on a logarithmic scale. The graph appears linear over an even wider range than the previous graph, exhibiting an even more pronounced Zipfian distribution. However, there is also an even larger set of outliers, and some of them are even more pronounced. For example, there are 369,457 web pages with in-degree 1001 in DS2, while according to the overall in-degree distribution we would expect only about 2,000 such pages. Overall, 0.19% of the pages in DS2 have an in-degree that is at least three times more common than the Zipfian distribution would suggest. We examined a cross-section of these pages, and the vast majority of them are spam.”
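To make the idea in the quoted passage concrete, here is a minimal Python sketch of how such outliers might be flagged. It is not the paper's actual method: the quoted passage does not say how the expected counts were computed, so the plain least-squares line fitted on log-log axes is my assumption, and it is a crude way to fit a Zipfian curve. The function names and the `ratio` parameter are hypothetical; the default of 3.0 simply mirrors the "at least three times more common" wording above.

```python
import math
from collections import Counter


def fit_power_law(histogram):
    """Fit a straight line through (log degree, log page count).

    Returns (slope, intercept). A least-squares fit on log-log axes is
    an assumption made for illustration, not the paper's procedure.
    """
    if len(histogram) < 2:
        raise ValueError("need at least two distinct degrees to fit a line")
    xs = [math.log(degree) for degree in histogram]
    ys = [math.log(count) for count in histogram.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def find_outlier_degrees(degrees, ratio=3.0):
    """Return {degree: (observed_pages, expected_pages)} for every degree
    that is at least `ratio` times more common than the fitted line predicts.

    `degrees` holds one link count (in-degree or out-degree) per page.
    Pages with degree 0 are skipped, as they cannot be placed on a log scale.
    """
    histogram = Counter(d for d in degrees if d > 0)
    slope, intercept = fit_power_law(histogram)
    outliers = {}
    for degree, observed in histogram.items():
        expected = math.exp(intercept + slope * math.log(degree))
        if observed >= ratio * expected:
            outliers[degree] = (observed, expected)
    return outliers
```

Fed a list with one link count per page, `find_outlier_degrees` returns the degrees that stick out above the fitted line, which are the values worth checking by hand. Note that the outliers themselves pull the fitted line toward them a little, so a more careful analysis would use a robust fit.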

The point I am illustrating is that statistics can establish what constitutes a normal pattern and what constitutes an unnatural one, the latter being a sign of an attempt to influence search engine rankings. In addition to the standard metrics built into available tools (measuring c-blocks, etc.), it’s important to look for smoke signals: signs of unnatural activity that suggest the outbound links from a particular page are likely poisoned. Here are some tips:

1. Guest posts by SEO agencies or about SEO on a non-SEO website
Do a site search of the website to find guest posts by SEO companies or guest posts about SEO.
Guest posts by SEO companies on behalf of themselves (or clients) are frequently indicative of paid link activities or other unnatural activities. Articles about SEO on a site that is not expressly about SEO are also what I call smoke signals. There are often a lot of ugly things going on just beneath the surface.

2. Be wary when the bulk of the articles on a site are contributed guest posts
There are many legit sites featuring guest posts. However, there are many sites that trade on guest posts for revenue, particularly in the UK. While you should not dismiss these sites out of hand, a site that not only aggressively solicits guest posts but appears to depend on them should be given closer scrutiny.

Be wary of swallowing the advice that the search engines can’t identify paid articles. Tally up the Smoke Signals, and the unnatural patterns can and do stand out in a statistical analysis. What are possible negative signals?

  • Unnatural number of outbound links
  • Unnatural number of phrases such as “guest article”
  • Unnatural amount of anchor text within the body of an article that points to the same destination as the link in the author’s byline, etc.
  • Topic matter that is barely relevant

And those are just some of the on-page factors, not taking into account the off-page ones.
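As a rough illustration of tallying a few of the on-page signals listed above, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the function name `tally_smoke_signals`, the `byline_domain` parameter (the domain the author's byline links to, which you would supply yourself), and the short phrase list. It says nothing about how a search engine actually measures these things, and the raw counts only become meaningful when compared against what is normal for similar pages.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Illustrative phrase list (an assumption); extend it to suit the niche.
GUEST_PHRASES = ("guest article", "guest post")


class LinkAndTextCollector(HTMLParser):
    """Collects every link href and every text node in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)


def tally_smoke_signals(page_url, html, byline_domain):
    """Count a few on-page smoke signals for a single page.

    byline_domain is a hypothetical input: the host that the author's
    byline links to, e.g. "agency.example".
    """
    collector = LinkAndTextCollector()
    collector.feed(html)

    page_host = urlparse(page_url).netloc.lower()
    link_hosts = [urlparse(href).netloc.lower() for href in collector.hrefs]
    # Relative links have an empty host and are treated as internal.
    outbound = [host for host in link_hosts if host and host != page_host]

    # Collapse whitespace so phrases split across tags or lines still match.
    text = " ".join(" ".join(collector.text_parts).split()).lower()

    return {
        "outbound_links": len(outbound),
        "guest_phrases": sum(text.count(phrase) for phrase in GUEST_PHRASES),
        "links_to_byline_domain": sum(
            1 for host in outbound if host == byline_domain.lower()
        ),
    }
```

Run this over every article on a site and compare the tallies; the pages whose numbers sit far from the rest are the ones to hand-check first.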

Those of you who have placed a site into reconsideration know what I’m talking about when I say that a hand check is brutal. What passes for normal in the regular algo doesn’t pass under the scrutiny of a hand check. It’s important to be aware of the smoke signals when preparing a site for a reconsideration request.

This blog post is a portion of the entire article. The complete article is published in the Advanced Link Building Newsletter.
