
How to Audit Canonical Tags at Scale

Written by Suraj Lalchandani | September 30, 2022

Technical issues with a canonical tag prevent it from serving its purpose: telling Google and other search engines which page is the preferred version out of a group of similar pages.

Without canonicals, you may run into duplicate content problems and waste your crawl budget along the way.

This post explains how to leverage an SEO platform to audit your canonicals at scale to ensure they’re working for you, not against you.

If you’re still unsure about the purpose of a canonical, let’s start with some background. If you’re ready to audit, skip ahead to “How to Audit Your Canonical Tags for SEO.”

What is a Canonical Tag?

A canonical tag informs search engines that there is a preferred page to be indexed out of a group of similar or duplicate pages.

It may look something like this:

<link rel="canonical" href="https://www.website.com/page/" />
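To check a page for this tag programmatically, one option is to parse its HTML and look for a link element whose rel attribute contains "canonical". The sketch below uses only Python's standard library; the names CanonicalParser and find_canonical are illustrative and not part of any SEO platform.

```python
from html.parser import HTMLParser
from typing import List, Optional


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.canonicals: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # rel may hold several space-separated tokens; compare case-insensitively.
        rel_tokens = attr_map.get("rel", "").lower().split()
        href = attr_map.get("href", "").strip()
        if "canonical" in rel_tokens and href:
            self.canonicals.append(href)


def find_canonical(html: str) -> Optional[str]:
    """Return the first canonical URL declared in the HTML, or None."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonicals[0] if parser.canonicals else None


if __name__ == "__main__":
    sample = '<head><link rel="canonical" href="https://www.website.com/page/" /></head>'
    print(find_canonical(sample))
```

The parser keeps every canonical it finds, so a page that declares more than one (itself a common issue) can be flagged by checking the length of parser.canonicals.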

The Importance of Auditing Your Canonicals

Enterprise sites often create new, unique, or even personalized experiences, which can generate new URLs for Google to discover.

The result is hundreds, potentially thousands (or more), of individual URLs that may vary only by filter or tracking parameters.
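To see how quickly such variants pile up, here is a minimal sketch that strips a set of tracking parameters and groups the URLs that collapse to the same address. The IGNORED_PARAMS set is an assumption for illustration; which parameters leave the content unchanged depends on your site.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed set of parameters that do not change what the page shows.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid"}


def normalize(url: str) -> str:
    """Drop ignored query parameters and any fragment from the URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def group_variants(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Map each normalized URL to the list of raw URLs that reduce to it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return dict(groups)
```

Any group with more than one member is a set of pages that would likely be expected to share a single canonical URL.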

Search engines like Google are less likely to show a page at all when duplicate content exists, because duplication makes it difficult to determine:

  • Which version of the page to crawl and index,
  • Which version of the page to rank in SERPs, and
  • Whether link equity should be focused on one page or spread across several pages.

You don’t want Google to crawl and index different versions of the same page when it could be crawling other, unique content on your site.

An audit allows you to gain insights into a few important pieces of the puzzle:

  • All pages where a canonical tag exists, along with the pages where it doesn't.
  • Whether each tag points to the right page.
  • Whether the pages with canonical tags are both crawlable and indexable.
  • Canonicals that point to 3xx redirects, 4xx client errors, or 5xx server errors, all of which need to be fixed.
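The mechanical parts of that checklist can be sketched as a small rule set. The PageRecord fields below are hypothetical and stand in for whatever your crawler exports; deciding whether a tag points to the right page still requires human judgment and is not covered here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageRecord:
    """One crawled page. All field names are illustrative assumptions."""
    url: str
    canonical: Optional[str]         # canonical URL found on the page, if any
    canonical_status: Optional[int]  # HTTP status returned by the canonical target
    indexable: bool                  # False if the page is blocked or set to noindex


def canonical_issues(page: PageRecord) -> List[str]:
    """Return human-readable issues for a single page."""
    if page.canonical is None:
        return ["missing canonical tag"]

    issues: List[str] = []
    if not page.indexable:
        issues.append("page with canonical is not crawlable or indexable")

    status = page.canonical_status
    if status is None:
        issues.append("canonical target status unknown")
    elif 300 <= status < 400:
        issues.append("canonical points to a 3xx redirect")
    elif 400 <= status < 500:
        issues.append("canonical points to a 4xx client error")
    elif 500 <= status < 600:
        issues.append("canonical points to a 5xx server error")
    return issues
```

A page with a canonical whose target returns 200 and that is itself indexable produces an empty list, meaning none of these checks fired.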

How to Audit Your Canonical Tags for SEO

The concept is relatively straightforward. What makes canonical problems one of the most common technical SEO issues is the sheer breadth and scope of the task.

That’s why we’ll demonstrate how to audit your canonicals with an SEO platform.

Done manually, you’d need to keep a record of every single URL with a canonical tag, along with the pages these tags point to, and an evaluation of whether each one is still the right reference.

No one has the time or resources to do that. That’s like trying to count the grains of sand on a beach!

Within the seoClarity platform, clients can leverage our built-in crawler to run a site crawl and audit their canonicals.

Note: If you don’t have an appropriate tool, here’s a list of some of the best crawler and audit tools.

#1. Run a Crawl

Step one is, of course, to gather the crawl data. In seoClarity, you can set up full site crawls from a starting URL, CSV crawls, and sitemap crawls.

The crawls can be customized to focus on a specific section of your site (or all of it) depending on what pages you are looking to audit.

You’re in control of crawl depth, frequency (so you can set up recurring crawls), crawl speed, and more.
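For readers working outside a platform, the idea behind a sitemap crawl can be sketched in a few lines: read the sitemap and use its URLs as the list of pages to fetch. This is a generic illustration using Python's standard library, not how seoClarity's crawler is implemented, and urls_from_sitemap is a made-up helper name.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text: str) -> List[str]:
    """Return every <loc> value listed under <url> in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
```

The resulting list can then be filtered to a specific section of the site before any pages are requested.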

#2. View the Canonical Audit Report

Once the crawl is complete, it’s time to jump into the findings.

If this seems pretty simple so far, that's because with an SEO platform, it is!

Navigate to the “Canonical Audit” tab on the left-hand side to filter down to the canonical issues (instead of all technical SEO issues). The report includes the following columns:

  • Title/URL, as its header suggests, displays the URL and Title of the page. This way, you can see if pages with self-referencing canonical tags have unique title tags.
  • Canonical Type shows the canonical tag configuration of all the crawled URLs by percentage. Same means self-referencing canonicals, while Other means that the canonical is different from the URL of the page it's on. None means no tag is present.
  • Status displays the status code found on that page when it was crawled.
  • Canonical simply displays the canonical URL for the page on which it's listed.
  • On/Off Page Issues lists any issues found with the canonical tag for the respective page.
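The Same / Other / None breakdown in the Canonical Type column can be approximated from any crawl export. The sketch below assumes a simple mapping of page URL to canonical URL and compares the two as exact strings; a real tool may normalize case, trailing slashes, or protocol before comparing.

```python
from collections import Counter
from typing import Dict, Optional


def canonical_type(page_url: str, canonical: Optional[str]) -> str:
    """Label a page as Same (self-referencing), Other, or None (no tag)."""
    if not canonical:
        return "None"
    return "Same" if canonical == page_url else "Other"


def type_percentages(pages: Dict[str, Optional[str]]) -> Dict[str, float]:
    """Return the share of crawled pages in each canonical type, in percent."""
    counts = Counter(canonical_type(url, canon) for url, canon in pages.items())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        label: round(100 * counts[label] / total, 1)
        for label in ("Same", "Other", "None")
    }
```

A high share of None or an unexpectedly high share of Other is usually the first thing worth investigating in the report.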

Bonus: Fix the Canonical Issues at Scale

You can’t fix anything until you know it’s broken in the first place. That’s why running an audit is so crucial.

But now, it’s time for the most important step: fixing the canonical issues so they can serve their purpose.

Similar to finding the issues, a manual approach just isn’t feasible for an enterprise site. (Although, you can take that approach if you choose.)

You’ll be better served by using SEO tech to implement the fixes at scale, across all of your pages, in just a few steps.

ClarityAutomate is an execution-first SEO platform that lets you make site changes on your own timeline, all without the dev team. A few clicks are all it takes to resolve these canonical issues site-wide, along with plenty of other issues.

Familiarize yourself with all the potential canonical tag issues that can arise.

Or, if you need to solve a specific canonical issue, follow our step-by-step walkthrough to implement a solution at scale: