Columnist Pratik Dholakiya shares the 8 technical SEO issues he sees most frequently. Even the most experienced SEO professionals can sometimes overlook these common issues!
SEO is more than inbound marketing. There’s massive overlap, but there’s a technical side to SEO that sometimes gets neglected, especially by casual followers of the industry.
As somebody who spends a great deal of time auditing sites for optimization opportunities, I notice patterns that crop up often: technical mistakes that show up again and again.
Let’s go over these mistakes. If my experience is anything to go by, odds are high you’re making at least one of them.
1. Nofollowing your own URLs
There comes a time in every SEO’s life when they need to keep a page hidden from the search results: to prevent duplicate content issues, to hide member areas, to keep thin content pages out of the index, to hide archives and internal search result pages, to run an A/B test and so on. This is perfectly innocent, perfectly noble and perfectly necessary. However…
… do not use the “nofollow” tag to accomplish this!
The “nofollow” tag (that is, rel="nofollow" on a link, or “nofollow” in a robots meta tag) doesn’t prevent pages from being indexed by the search engines, but it does ruin the flow of PageRank through your site.
For the very same reason, you should not attempt to sculpt the flow of PageRank through your site by using the “nofollow” tag. Let me explain.
The “nofollow” tag does prevent PageRank from passing through a link, but Google still takes into account the total number of links on your page when determining how much PageRank to pass. In other words, your followed links will pass the same amount of PageRank regardless of whether the other links on the page are nofollowed or not.
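To make the arithmetic concrete, here is a minimal sketch of the behavior described above. It is a toy model only: it splits a page's PageRank evenly across all links and ignores the damping factor and everything else Google does. The function names are hypothetical.

```python
def pagerank_per_followed_link(page_rank, followed, nofollowed):
    """PageRank passed by each followed link in this toy model."""
    total_links = followed + nofollowed
    if followed == 0 or total_links == 0:
        return 0.0
    # Nofollowed links still count toward the divisor.
    return page_rank / total_links


def pagerank_discarded(page_rank, followed, nofollowed):
    """PageRank that is thrown away through nofollowed links."""
    total_links = followed + nofollowed
    if total_links == 0:
        return 0.0
    # The share assigned to nofollowed links is not redistributed.
    return page_rank * nofollowed / total_links


if __name__ == "__main__":
    # Ten links, none nofollowed, versus ten links with five nofollowed.
    print(pagerank_per_followed_link(10.0, 10, 0))
    print(pagerank_per_followed_link(10.0, 5, 5))
    print(pagerank_discarded(10.0, 5, 5))
```

In this model, nofollowing five of ten links leaves each followed link passing exactly what it passed before, while half of the page's PageRank goes nowhere.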
I still see this happening often: SEO newcomers and webmasters using “nofollow” tags on their own content, either thinking that it will prevent a page from showing up in the search results, or thinking that they can use it to direct PageRank to their most important pages. The “nofollow” tag accomplishes neither of these things.
When you use a “nofollow” tag, you are throwing away PageRank. Don’t do it, not even on pages that you don’t want indexed. If you want to keep a page out of the index, use this in your HTML head:
<meta name="robots" content="noindex, follow">
The above directive prevents the page from turning up in the search results but recommends that the search engine follow the links on the page. That way, any PageRank that flows into the unindexed page will be passed back to your site through the links on the page, rather than getting dumped.
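If you want to check a page for this mistake, something like the following sketch may help. It uses only Python's standard library to list internal links carrying rel="nofollow" and to report the robots meta value. The class and function names are hypothetical, and "internal" is assumed to mean "same host as the page".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class NofollowAudit(HTMLParser):
    """Collects nofollowed internal links and the robots meta value."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.site_host = urlparse(page_url).netloc
        self.internal_nofollow = []
        self.robots = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots = (attrs.get("content") or "").lower()
        elif tag == "a" and attrs.get("href"):
            rel_tokens = (attrs.get("rel") or "").lower().split()
            # Resolve relative hrefs against the page URL before comparing hosts.
            target = urljoin(self.page_url, attrs["href"])
            same_host = urlparse(target).netloc == self.site_host
            if "nofollow" in rel_tokens and same_host:
                self.internal_nofollow.append(target)


def audit(html, page_url):
    """Return (nofollowed internal links, robots meta content or None)."""
    parser = NofollowAudit(page_url)
    parser.feed(html)
    return parser.internal_nofollow, parser.robots
```

Any URL that shows up in the first list is a candidate for removing the nofollow and, if the target should stay out of the index, adding the noindex directive to that target instead.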
2. Not using canonicalization
The rel=canonical tag in the HTML head looks like this:
<link rel="canonical" href="http://www.example.com/url-a.html" />
It tells search engines to treat the linked URL, rather than the current page, as the “canon” version.
Why would you use this tag? Its purpose is to prevent duplicate content from getting indexed, since indexed duplicates can dilute your search engine authority. Using the canonical tag also seems to pass PageRank from the non-canonical page to the canonical page, so there is no need to be concerned about losing the PageRank accumulated by the non-canonical page.
This is a place where conversion optimizers can often fail. Page alternates in an A/B test should make use of the canonical tag so that the alternate page doesn’t get indexed (and so that any authority picked up by the alternate page is passed to the primary page).
Variations on product pages, such as alternates with a different color, are another common example. Duplicates can also get created any time URL query strings are in use. For this reason, sitewide canonicalization can be a good solution for sites that make use of query strings. Self-referencing canonical tags are not generally thought to be an issue.
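One way sitewide canonicalization for query strings might be sketched is shown below. The list of ignored parameters is purely an assumption for illustration, as are the function names; which parameters actually change a page's content depends entirely on your site.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters assumed not to change the page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sort"}


def canonical_url(url, ignored=IGNORED_PARAMS):
    """Drop ignored query parameters and the fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(url):
    """Build the rel=canonical tag for the HTML head of the given URL."""
    href = html.escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href
```

A page requested with only tracking parameters gets a canonical tag pointing at the clean URL, and a page requested at its clean URL simply references itself.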
3. Poor use of outbound links
If you’re linking to another site in your site-wide navigation, and it’s not one of your social media profiles, odds are you should remove the link.
From a pure PageRank standpoint, external links dilute the authority that gets passed back to your own site. This isn’t to say that you shouldn’t be linking to anybody else (which would utterly defeat the purpose of using links as a ranking factor). But outbound links in your own site navigation compound the losses by affecting every page.
Of course, Google has come a long way since the original PageRank algorithm, but there’s another reason why external links in the navigation are iffy: It’s easy for them to look like spam.
The situation is, of course, far worse if the links use keyword anchor text or if the links are placed somewhere where they could be confused for internal site navigation.
Outbound links in the primary content are generally not an issue, but it is important to screen them for quality. Links to “bad neighborhoods” can get a site penalized by Google’s spam team or pushed down the rankings by anti-spam algorithms.
And, of course, it is absolutely crucial that you always nofollow advertisement links of any kind.
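To spot external links that have crept into site-wide navigation, one rough approach is to look for external hosts that are linked from every page you have crawled. The sketch below assumes you already have a mapping of page to the hrefs found on it; the function name and the input shape are hypothetical.

```python
from urllib.parse import urlparse


def sitewide_external_hosts(links_by_page, site_host):
    """Return the external hosts that are linked from every crawled page."""
    per_page_hosts = []
    for hrefs in links_by_page.values():
        hosts = {urlparse(href).netloc.lower() for href in hrefs}
        hosts.discard("")  # relative links have no host and are internal
        hosts.discard(site_host.lower())
        per_page_hosts.append(hosts)
    if not per_page_hosts:
        return set()
    # A host present on every page is probably in the template or navigation.
    return set.intersection(*per_page_hosts)
```

Anything this returns that is not one of your social media profiles is worth a second look.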
4. Not enough outbound links
The idea that “a little bit of knowledge is a dangerous thing” definitely applies here. A limited understanding of how the search engines work leads some to believe that they should never link to another site. While it’s true that the pure PageRank algorithm would suggest this, it’s simply not how things work out in the field.
A case study by Reboot Online makes a pretty clear case for this. They created 10 sites featuring a nonsense keyword, five featuring authoritative outbound links and five not.
The results were about as definitive as possible for a study of this size: All five of the sites with outbound links performed better than the sites without them.
In a post on PageRank sculpting, Matt Cutts, Google’s former head of web spam, also mentions that “parts of our system encourage links to good sites,” which seems to confirm the idea that linking to other sites is important.
To be fair, John Mueller has openly stated that outbound links aren’t “specifically a ranking factor,” while adding that they “can bring value to your content and that in turn can be relevant for us in search.” In the context of the Reboot Online study and Matt Cutts’s statement, this might be interpreted to mean that including citations boosts confidence in content, rather than meaning that outbound links have no effect at all.
Regardless, well-sourced content is a must if you want to be taken seriously — which may have a positive, if indirect, effect on rankings.
5. Poor internal link structure
There’s more than one right way to structure your links, but there are plenty of wrong ways to do it, too.
Let’s start with the basics. As the Google guidelines state:
“Build your site with a logical link structure. Every page should be reachable from at least one static text link.”
Your typical modern content management system will usually handle at least this much automatically. But this functionality sometimes gets broken. One dangerous myth is that you are supposed to canonicalize multi-page posts back to the first page. In reality, you should either leave well enough alone or canonicalize to a single page that contains the entire post. This goes for archives and similar pages, too. Canonicalizing these pages runs the risk of erasing the links on these pages from the search index.
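The reachability guideline quoted above can be checked with a simple breadth-first search over your internal links. The sketch below assumes you already have a crawl result shaped as a mapping of page to the pages it links to, plus a full list of pages (for example, from your sitemap). The function names and the input shape are hypothetical.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search from the home page.

    Returns a dict mapping each reachable page to its click depth.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphan_pages(links, home, all_pages):
    """Pages that exist but cannot be reached by following links from home."""
    reachable = click_depths(links, home)
    return sorted(set(all_pages) - set(reachable))
```

The click depths are a useful by-product: if every page sits at depth 1, you are likely looking at the completely flat architecture discussed in this section.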
A completely flat link architecture is another common issue. Some take the idea that every page needs to be accessible through links a bit too far, including links to virtually every page on the site within the navigation.
From the user perspective, this creates obvious issues by making it very difficult to locate appropriate pages.
But this confusion passes on to the search engines and the way that they interpret your site. Without a clear hierarchy, search engines have a very difficult time parsing which pages on your site are most important, which pages cover which topics, and so on.
Remember, there’s much more to the algorithm than PageRank. A categorical hierarchy helps search engines understand your site semantically, which is very important for rankings.
Watch out for tag clouds and long lists of dated archives. These show up less often in modern CMS themes, but they occur often enough that you should know they are to be avoided. Click-throughs on these are awful, and the extra links divide up PageRank. Dated archive lists, in particular, add no semantic information to your link architecture, and category links are much more organized than muddy tag clouds.
Finally, while it’s not exactly a mistake not to, we highly recommend referencing your own content within your body content. Contextual links within body content are generally believed to count more than links in the navigation, and they certainly add important semantic value.
6. Poor URL architecture
URL architecture can be a difficult thing to fix without breaking other aspects of your SEO, so we don’t recommend rushing into this, or you might do more harm than good.
That said, one of the most frequent issues I come across is a lack of solid URL architecture. In particular, folder organization is often spotty.
A few common issues:
- Blog posts listed in multiple categories, resulting in blog posts listed in multiple folders, creating duplicate content issues as a result.
- URLs with no folders other than the parent domain. While this is precisely the form your most important pages should take, pages further down the hierarchy should be listed in folders to categorize them.
- URLs with folders that are, themselves, 404 pages. If a URL is listed under a folder, many users expect that folder to be an operational page. From an architecture perspective, a folder that returns a 404 is semantically confusing, and from an internal link perspective, it’s ideal to have links to these pages from a parent folder.
- Junk URLs full of numbers and letters. These days, these are primarily reserved for search result pages and database queries that aren’t intended to be indexed and found in search engines. Your URLs should contain useful information intelligible to a human if you want them to contribute positively to your performance in the search engines.
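The third issue in the list, folders that are not themselves working pages, is easy to check if you have a list of live URL paths. This is a minimal sketch under that assumption: the function names are hypothetical, and it treats "/blog" and "/blog/" as the same page, which may not match how your server behaves.

```python
def parent_folders(path):
    """List every ancestor folder of a path, e.g. '/a/b/c' gives '/a/' and '/a/b/'."""
    segments = [segment for segment in path.split("/") if segment]
    return ["/" + "/".join(segments[:i]) + "/" for i in range(1, len(segments))]


def missing_parent_folders(live_paths):
    """Folders implied by the URL structure that are not live pages themselves."""
    # Normalize so that paths with and without a trailing slash compare equal.
    live = {path.rstrip("/") + "/" for path in live_paths}
    missing = set()
    for path in live_paths:
        for folder in parent_folders(path):
            if folder not in live:
                missing.add(folder)
    return sorted(missing)
```

Each folder it reports is a candidate for a category or index page that links down to the pages beneath it.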
In addressing these issues, there are two complications you want to avoid: creating 404 pages and losing existing link authority. When you change your URL architecture, you need to make sure that the old URLs 301-redirect to the new ones. Ideally, any internal links to the old pages should also be updated to point directly at the new URLs, since PageRank is reduced by the damping factor every time it passes through a link or 301.
As an exception, if blog posts are listed in multiple categories, a 301 isn’t always necessary, but in its place you should canonicalize to the preferable page.
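After more than one restructuring, redirect rules tend to pile up into chains (old URL to newer URL to newest URL). Since the concern above is authority lost at every hop, it may be worth flattening your redirect map so each old URL points straight at its final destination. This is a sketch assuming the map is a simple dictionary of old path to new path; the function name is hypothetical.

```python
def flatten_redirects(redirects):
    """Map every old URL directly to its final destination.

    Raises ValueError if the redirects form a loop.
    """
    flat = {}
    for start in redirects:
        seen = {start}
        target = redirects[start]
        # Keep following hops until we land on a URL that is not redirected.
        while target in redirects:
            if target in seen:
                raise ValueError("redirect loop involving %s" % start)
            seen.add(target)
            target = redirects[target]
        flat[start] = target
    return flat
```

The same flattened map can double as the lookup table for updating internal links to the old pages.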
7. Using frames
Frames and iframes are needed in a few places, but you should never use them for anything that you want to be indexed. Google is pretty clear on this:
“Frames can cause problems for search engines because they don’t correspond to the conceptual model of the web. In this model, one page displays only one URL. Pages that use frames or iframes display several URLs (one for each frame) within a single page. Google tries to associate framed content with the page containing the frames, but we don’t guarantee that we will.”
This isn’t to say that your site should never use them. YouTube embeds make use of iframes, for example.
What you absolutely should not do is use frames as a method of navigating content on your site. This not only makes the content difficult to index, it ruins your site architecture and makes it very difficult for people to reference your content with links.
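To see where a page relies on frames, a short standard-library sketch like the one below can list every frame and iframe along with its source. The class and function names are hypothetical, and deciding which results are legitimate embeds and which are navigation is still a manual call.

```python
from html.parser import HTMLParser


class FrameFinder(HTMLParser):
    """Records each <frame> and <iframe> tag together with its src attribute."""

    def __init__(self):
        super().__init__()
        self.frames = []

    def handle_starttag(self, tag, attrs):
        if tag in ("frame", "iframe"):
            self.frames.append((tag, dict(attrs).get("src")))


def find_frames(html):
    """Return a list of (tag, src) pairs in document order."""
    finder = FrameFinder()
    finder.feed(html)
    return finder.frames
```

A video embed showing up here is expected; a frame whose source is one of your own content pages is the pattern to fix.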
8. Using unindexable formats
Search engines have limited ability to crawl and index the content found inside images, Flash files, Java applets and videos.