Canonicalization for Pagination – Roundup of SEO Wisdom [UPDATED]
Tongue twister, anyone? Pagination canonicalization is one of the trickier challenges for the technical SEO set today. We’ve been closely monitoring the hints and tips that occasionally drop from our favorite Googlers’ lips (sure, we all adore Matt Cutts, but let’s save a little love for JohnMu and Maile Ohye, shall we?). Here’s a roundup of good advice we’ve heard, along with our analysis and resulting recommendations.
Pagination is commonly seen on ecommerce sites when a category contains more products than can be listed on a single page, so the listing is split across a series of numbered pages.
The question often arises: How should an SEO handle all of these pages? Allowing every one of them to be indexed separately means indexing a lot of duplicate or near-duplicate content, which spreads a site's ranking signals across many similar URLs and creates an undesirable user experience, since nobody particularly wants to click through from Google to a deep paginated page.
Here is the wisdom we’ve gathered from our favorite Googlers on the subject:
JohnMu on paginated pages and the canonical tag (February 2010):
Pagination: this is complicated, I personally would be careful when using with rel=canonical with paginated lists. The important part is that we should be able to find all products listed, so at the very least those lists should provide a default sort order where we can access (and index) all pages. Since this is somewhat difficult unless you really, really know what you are doing, I would personally avoid adding rel=canonical for these pages. One possible solution could be to use JavaScript for paginated lists with different sort orders, for example, that way you would have a single URL which lists all products.
Our interpretation: Google needs to be able to crawl all of these paginated pages so that it can follow the links to every individual product page listed on them. JohnMu suggests providing a default sort order through which all of those links can be reached, or a single URL that lists every product. The trouble is, what if you have a thousand of these links, or even more?
At SMX West (March 2011), Maile Ohye noted that you shouldn’t use canonical tags for paginated pages. (We’ve heard her say this since 2010.) Here are notes from Barry Schwartz (SEORoundtable) on this session:
Maile explained that since the results on pages 2, 3, 4, and 5 are different from page 1, you should not use the canonical tag here.
Not only that, if you do, Google may ignore it because Google uses methods to determine if the canonical tag command is actually something valid for that case. So if you canonical page 2 to page 1 and page 2 is not similar enough to page 1, Google may ignore your canonical tag.
Our interpretation: Since paginated pages aren’t identical to each other, and they aren’t subsets of each other, they shouldn’t be canonicalized to each other. Google is pretty smart and can probably figure out when it’s dealing with page=1, page=2, page=3, so it may simply ignore canonical tags that point one of these pages at another.
The one exception to the rule against canonical tags on paginated pages is when you have a “view all” page. As Brian Ussery of SEO firm Nine by Blue notes:
If paginated content uses “view all” page if it loads well you can put rel canonical on those URLs.
If you don’t have a “view all” page 2 isn’t a subset so you can’t use rel canonical.
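The advice so far can be turned into a rough audit check. The sketch below is our own illustration, not anything Google has published: it assumes the page number lives in a `page` query parameter, and the function names and the `view_all_urls` argument are hypothetical. It flags a canonical tag that points one paginated page at a different page of the same list, while leaving self-referencing canonicals and canonicals to a known “view all” URL alone.

```python
from urllib.parse import urlparse, parse_qs


def page_number(url, param="page"):
    """Read the page number from the query string; no parameter means page 1.

    Assumes the value is numeric; a non-numeric value raises ValueError.
    """
    values = parse_qs(urlparse(url).query).get(param)
    return int(values[0]) if values else 1


def canonical_is_risky(url, canonical, view_all_urls=(), param="page"):
    """Return True when `canonical` points a paginated page at a different page.

    view_all_urls is a hypothetical collection of "view all" URLs that are
    acceptable canonical targets, because every page is a subset of them.
    """
    if canonical == url or canonical in view_all_urls:
        return False
    return page_number(url, param) != page_number(canonical, param)


# Example URLs are placeholders.
print(canonical_is_risky("https://example.com/shoes?page=2",
                         "https://example.com/shoes"))
```

Here page 2 canonicalized to page 1 is flagged, which is the pattern Maile Ohye warns about. The check says nothing about how Google actually decides whether to honor a canonical tag; it only catches the obvious mismatch.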
At the end of the day, our two favored pagination approaches are the following:
- noindex,follow
Place a robots meta tag on all deep paginated pages instructing the robots to “noindex, follow” the page. This allows Google to visit the deep pages and follow the links on them without indexing the pages themselves.
- “view all” + canonical
Create a “view all” page showing all of the links on your list. Then, on all other paginated pages (including page 1), add a canonical tag identifying the “view all” page as the canonical form. We think this solution is best used when the “view all” page is also the version of the page that is linked from elsewhere on the site.
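To make the two approaches concrete, here is a minimal sketch of the tags each one would put in a page’s `<head>`. The function name and the strategy labels `"noindex"` and `"view_all"` are hypothetical, and we assume that “deep” means page 2 and beyond, so page 1 stays indexable under the first approach.

```python
from html import escape


def pagination_head_tags(page, strategy, view_all_url=None):
    """Return the <head> tags for one page of a paginated list.

    strategy: "noindex" or "view_all" (labels made up for this sketch).
    """
    if page < 1:
        raise ValueError("page numbers start at 1")
    if strategy == "noindex":
        # Assumption: page 1 is left indexable; deeper pages can be
        # crawled and their links followed, but are kept out of the index.
        if page == 1:
            return []
        return ['<meta name="robots" content="noindex, follow">']
    if strategy == "view_all":
        if not view_all_url:
            raise ValueError("the view_all strategy needs a view_all_url")
        # Every paginated page, including page 1, names the "view all"
        # URL as its canonical form.
        href = escape(view_all_url, quote=True)
        return ['<link rel="canonical" href="%s">' % href]
    raise ValueError("unknown strategy: %r" % strategy)


# Placeholder URL for illustration only.
for tag in pagination_head_tags(3, "view_all", "https://example.com/shoes/all"):
    print(tag)
```

A site would use one strategy or the other for a given list, not both on the same page.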
UPDATE: Google got its act together and created a better way to deal with pagination canonicalization. Read all about it at Google’s webmaster blog.