How to Classify Thousands Of Unnatural Links (Really Quickly)

Posted on: June 10th, 2014 by Patrick Hathaway in Guides, How To

One of URL Profiler’s strengths is its speed and accuracy at auditing links. Our customers are using it every day to quickly recover from unnatural link penalties and Penguin-related downgrades. We’ve even got some who have used it to diagnose negative SEO attacks. Yeah, they do exist!

This post will show you how to very quickly classify thousands of links so you can figure out which ones are unnatural. We also have this information in video form, if you are more visually inclined.

Importing Links

An important part of link cleanup work is finding as many links as you possibly can. When carrying out link audits we will typically use 5 different link sources to make sure we haven’t missed any.

Rather than forcing you to combine all the links and dedupe them yourself, URL Profiler will do this for you. So just import the raw data files and URL Profiler will combine them all, across a wide range of supported reporting tools:

  • Google Webmaster Tools
  • Bing Webmaster Tools
  • ahrefs
  • Majestic SEO
  • Open Site Explorer

Within URL Profiler, right-click in the white box and choose ‘Import from File’. You can then import multiple CSV files in one go, and let the tool de-dupe the links for you.

Import Multiple CSV Files
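
If you are curious what ‘combine and de-dupe’ involves, here is a minimal Python sketch of the idea. It is not how URL Profiler does it internally (we haven’t described that here). The column names ‘URL’ and ‘Source URL’ and the sample rows are made up for illustration, and real exports from each tool will have their own headers.

```python
import csv
import io
from urllib.parse import urlsplit, urlunsplit


def normalise(url):
    """Lower-case the scheme and host and drop any #fragment, so that
    trivially different copies of the same URL compare as equal."""
    parts = urlsplit(url.strip())
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query, "")
    )


def combine_exports(exports):
    """Merge several CSV exports into one de-duplicated list of URLs.

    `exports` is an iterable of (file_object, url_column_name) pairs.
    The column names are hypothetical: check the header of each real export.
    """
    seen = set()
    combined = []
    for handle, column in exports:
        for row in csv.DictReader(handle):
            raw = (row.get(column) or "").strip()
            if not raw:
                continue
            key = normalise(raw)
            if key not in seen:  # keep only the first copy of each URL
                seen.add(key)
                combined.append(key)
    return combined


# Two tiny in-memory "exports" standing in for real files.
export_a = io.StringIO(
    "URL\n"
    "http://Example.com/page\n"
    "http://blog.example.org/post\n"
)
export_b = io.StringIO(
    "Source URL,Anchor\n"
    "http://example.com/page,seo links\n"
    "http://forum.example.net/thread,click here\n"
)

links = combine_exports([(export_a, "URL"), (export_b, "Source URL")])
print(links)
```

The first URL appears in both files (with different capitalisation), so it is only kept once.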

Setting Up The Tool

URL Profiler is a bit of a Swiss Army knife, in that it can be used for many different tasks. To get maximum benefit out of each use, you need to adjust some of the settings specific to the task. URL Profiler will save these settings for the next use, so if you are doing lots of similar tasks (e.g. link prospecting) you can just leave the settings alone after the first run.

Checking Your Links

We need URL Profiler to go and look at the links to your site, to determine things like:

  • Does the link still exist?
  • What anchor text is used on the link?
  • Where is the link positioned on the page?
  • What type of site is the link on?

In order to check the links still point at your site, we need to tell URL Profiler what your site is, so enter your domain name in the box ‘Domain to Check’ in the Link Analysis section at the bottom. You don’t need to worry about the ‘www’ bit unless you are specifically checking links to a subdomain.
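
To make the idea of a link check a bit more concrete, here is a rough sketch using only Python’s standard library. It parses a page’s HTML and reports any links pointing at a given domain (including subdomains such as ‘www’), along with the anchor text and whether the link is nofollow. The class and function names are my own, the sample HTML is invented, and this is only an illustration of the concept, not URL Profiler’s actual checker. It doesn’t attempt to work out where on the page the link sits.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkFinder(HTMLParser):
    """Collects <a> tags whose href points at `domain` or one of its subdomains."""

    def __init__(self, domain):
        super().__init__()
        self.domain = domain.lower()
        self.links = []        # one dict per matching link
        self._current = None   # the matching link currently being read, if any

    def _points_at_domain(self, href):
        try:
            host = (urlsplit(href).hostname or "").lower()
        except ValueError:     # malformed URL
            return False
        return host == self.domain or host.endswith("." + self.domain)

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        if self._points_at_domain(href):
            rel = (attrs.get("rel") or "").lower().split()
            self._current = {"href": href, "anchor": "", "nofollow": "nofollow" in rel}

    def handle_data(self, data):
        if self._current is not None:
            self._current["anchor"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Collapse runs of whitespace inside the anchor text.
            self._current["anchor"] = " ".join(self._current["anchor"].split())
            self.links.append(self._current)
            self._current = None


def find_links(html, domain):
    finder = LinkFinder(domain)
    finder.feed(html)
    finder.close()
    return finder.links


sample_html = """
<p>Need rankings? Try <a href="http://www.example.com/">cheap seo links</a> today.</p>
<p><a rel="nofollow" href="http://other.example.org/">elsewhere</a></p>
"""

found = find_links(sample_html, "example.com")
for link in found:
    print(link)
```

An empty result for a page would mean the link no longer exists there, which is exactly the first question in the list above.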

I’ve written another post about why our link checker is awesome – and why the following settings are recommended – so I’ll not repeat myself here (you can read that post here).

Choose the Settings option at the top left and make sure ‘Connection Timeout’ is set above 40 seconds:

URL Profiler Connections Settings

Then navigate to the ‘Link Analysis’ tab, and push the maximum retries up to 5 (or at least 3). Again, read my other post if you want to know why this will give you the most accurate link analysis possible.
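
If you want a feel for what a long timeout plus retries buys you, here is a small sketch of the general pattern in Python. The function name, the default values (45 seconds and 5 retries, picked to match the recommendations above) and the fake ‘flaky’ server are all assumptions for the sake of the example. URL Profiler’s own retry logic may well differ.

```python
import io
import urllib.request


def fetch_with_retries(url, timeout=45, max_retries=5, opener=urllib.request.urlopen):
    """Fetch a page body, trying again if the request fails.

    Makes one initial attempt plus up to `max_retries` retries.
    `opener` can be swapped out, which makes the function testable offline.
    """
    last_error = None
    for _attempt in range(1 + max_retries):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as error:  # covers URLError, timeouts and connection errors
            last_error = error
    raise last_error


# A pretend server that times out twice before answering (no network needed).
calls = []


def flaky_opener(url, timeout):
    calls.append(timeout)
    if len(calls) < 3:
        raise TimeoutError("simulated slow server")
    return io.BytesIO(b"<html>ok</html>")


body = fetch_with_retries("http://example.com/", opener=flaky_opener)
print(len(calls), body)
```

With no retries, the slow page in this example would have been written off as a failure on the first attempt, and its link would have been reported as missing.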

URL Profiler Link Analysis Settings

Defining Anchor Text

URL Profiler has a refined unnatural links classification system, which isn’t based on some fancy ‘machine learning algorithm’, but is instead based on how SEOs perform link classification.

When an SEO carries out a link classification, he/she is looking for patterns. We all know that SEOs of days gone by looked to scale any tactic that could get them links quickly and easily. By identifying patterns, you can identify link building footprints that will make your link classification much quicker.

For example, you might notice a few articles with heavily optimised anchors, and then look to see if there were many other links from article sites – a classic link building footprint. I wrote a lot more about this over on Search Engine Journal, and a lot of the same principles have been built into URL Profiler’s link classification system.

In order for it to work properly, you need to comprehensively define anchor text. So, after entering your domain to check, click the ‘Anchors’ button underneath:

Dodgy Backlinks

You will see 3 tabs – Branded, Commercial and Generic. So first define your branded anchor text, which is typically the name of the site, the web address and variations thereof:

Branded Anchor Text

Similarly, go to the Commercial tab and enter in any anchor text that might signify manipulative intent.

URL Profiler uses phrase matching so you don’t need to enter every single exact variation here. In my screenshot I have entered ‘seo’ – this would pick up both ‘seo links’ and ‘seo backlinks’ as commercial anchor text without me having to enter anything else – plus any other anchor text variations that have seo in them.


The Generic anchor text is pretty self explanatory, so just fill in a few options here:

Generic Anchor Text

I can’t emphasise enough the importance of entering your anchor variations properly (particularly Commercial) – it will save you a ton of time when you come to analyse the results.
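
Here is a tiny Python sketch of what phrase matching against the three lists could look like. It is a plain case-insensitive ‘contains’ check, and it tests Commercial first, then Branded, then Generic. Both of those choices are my assumptions for the example rather than a description of URL Profiler’s internals, and the brand name and phrases are placeholders.

```python
def classify_anchor(anchor, branded, commercial, generic):
    """Return 'commercial', 'branded', 'generic' or 'other' for one anchor text.

    A phrase matches if it appears anywhere inside the anchor text
    (case-insensitive). The order of the checks is an assumption.
    """
    text = anchor.lower()
    checks = (
        ("commercial", commercial),
        ("branded", branded),
        ("generic", generic),
    )
    for label, phrases in checks:
        if any(phrase.lower() in text for phrase in phrases):
            return label
    return "other"


# Placeholder lists: swap in your own brand and money terms.
branded = ["example co", "example.com"]
commercial = ["seo"]
generic = ["click here", "website", "read more"]

for anchor in ["seo links", "SEO backlinks", "Example Co", "Click here", "a great read"]:
    print(anchor, "->", classify_anchor(anchor, branded, commercial, generic))
```

Notice that the single entry ‘seo’ catches both ‘seo links’ and ‘SEO backlinks’. The flip side of a plain ‘contains’ check is that a very short phrase can match inside longer, unrelated words, so it pays to look over your Commercial list before running a big job.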

Selecting Domain Metrics

We deliberately built the link scoring system so that it doesn’t use any link metrics (such as Moz Domain Authority or PageRank). This is because an unnatural link isn’t an unnatural link because ‘Trust Flow is low’ – it’s an unnatural link because it was placed in order to manipulate the Google algorithm. If you somehow placed a highly optimised link on one of the most authoritative sites on the web, it would still be unnatural – never mind the PageRank.

That said – I do often look at some metrics to give me more information about my links (typically homepage PageRank, homepage index status, URL index status, Majestic Citation Flow & Trust Flow). However, we didn’t want the link scorer to be reliant on metrics, and we wanted it to be API independent. As such you can get very accurate link scores by selecting only the following:

Domain Metrics

Only ‘Site Type’ and IP Address. That’s it.

Site Type will identify if the site is an article repository, blog, link directory, forum etc… and the IP Address will allow you to identify links all coming from the same IP or subnet.
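
Once you have the IP addresses, spotting clusters is simple enough to do yourself. As a rough illustration, this Python sketch groups linking URLs by their /24 subnet using the standard `ipaddress` module. The URLs and IPs are invented (taken from the reserved documentation ranges), and the /24 grouping is just one common choice, not something URL Profiler prescribes.

```python
import ipaddress
from collections import defaultdict


def group_by_subnet(link_ips, prefix=24):
    """Group (url, ip) pairs by the subnet their IP address falls in."""
    groups = defaultdict(list)
    for url, ip in link_ips:
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        groups[str(network)].append(url)
    return dict(groups)


# Invented sample data using documentation-only IP ranges.
sample = [
    ("http://articles-one.example/post", "203.0.113.10"),
    ("http://articles-two.example/post", "203.0.113.77"),
    ("http://blog.example/review", "198.51.100.5"),
]

clusters = group_by_subnet(sample)
for subnet, urls in clusters.items():
    if len(urls) > 1:
        print(subnet, "hosts", len(urls), "linking pages:", urls)
```

Several supposedly unrelated sites sitting on the same subnet is worth a closer look, although shared hosting means it is a hint rather than proof.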

Once these settings are done, paste in your links and then run the Profiler. Armed with the site type and the anchor text, URL Profiler can find and classify all your links.

Analysing The Data

Once you open the results, you will be presented with a spreadsheet separated into different worksheets. The first sheet is a summary report.

Unnatural Links Results

This is simply an overview of the results – a bird’s eye view of the data. So you can see how many links were found, or not, and how many were classed as suspect, unnatural etc…

Many of our users will run this report before taking on new clients, almost like a risk analysis report!

It’s also really useful for getting a quick picture of how much work it might take to do an unnatural link clean up.

Link Scoring

The second sheet, entitled All, contains all the URLs and their data regardless of whether a link was found or not. The other worksheets split out the links based on their link score.

If we zoom in on some of the data you can see how this works:

Link Scores

This is a selection of links which we have deemed to be unnatural, as shown in the ‘Link Score’ column.

Notice that in the ‘reason’ column we are completely transparent with the data. There are no annoying codes or obscured reasoning. It is all perfectly clear, so you can see exactly why we have come to this conclusion.
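
If you export the ‘All’ data and want to slice it yourself, the summary counts and the per-score split are easy to reproduce. Below is a minimal Python sketch. The rows, the score labels and the reasons are invented for illustration and are not real URL Profiler output. Only the ‘Link Score’ and ‘Reason’ column names are borrowed from the report described above, and their exact spelling in a real export may differ.

```python
from collections import Counter, defaultdict

# Invented rows standing in for the 'All' sheet.
rows = [
    {"URL": "http://articles.example/post-1", "Link Score": "Unnatural",
     "Reason": "Commercial anchor on an article site"},
    {"URL": "http://directory.example/listing", "Link Score": "Unnatural",
     "Reason": "Commercial anchor on a link directory"},
    {"URL": "http://blog.example/review", "Link Score": "Suspect",
     "Reason": "Commercial anchor on a blog"},
    {"URL": "http://forum.example/thread", "Link Score": "Link Not Found",
     "Reason": "No link to the domain on the page"},
]


def split_by_score(rows):
    """Return {link score: [rows]}, i.e. one 'worksheet' per score."""
    sheets = defaultdict(list)
    for row in rows:
        sheets[row["Link Score"]].append(row)
    return dict(sheets)


sheets = split_by_score(rows)
summary = Counter(row["Link Score"] for row in rows)

# A bird's eye view, most common score first.
for score, count in summary.most_common():
    print(f"{score}: {count}")
```

From here you could sort each group by reason to work through the most obvious footprints first.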

This represents a very important principle for URL Profiler – we believe that the data is YOUR data. We can tell you what we think, but at the end of the day you are the SEO expert, and where you add value is in your interpretation of the data. URL Profiler is a tool built by SEOs for SEOs, so we will always try to make our data as useful as possible.

At this stage, you will have an abundance of data about your links. You will know which links to ignore (nofollow links etc…) and which ones to be most concerned about. You will still need to spend some time going through and verifying the data (we can’t do everything for you!) but we are sure we will have saved you a hell of a lot of time.
