Columnist Pratik Dholakiya shares the 8 technical SEO issues he sees most frequently. Even the most experienced SEO professionals can sometimes overlook these common issues!

SEO is more than inbound marketing. There’s massive overlap, but there’s a technical side to SEO that sometimes gets neglected, especially by casual followers of the industry.

As somebody who spends a great deal of time looking at sites in search of opportunities to optimize, I notice patterns that crop up often: technical mistakes that show up again and again.

Let’s go over these mistakes. If my experience is anything to go by, odds are high you’re making at least one of them.

1. Nofollowing your own URLs

There comes a time in every SEO’s life when they need to keep a page hidden from the search results: to prevent duplicate content issues, to hide member areas, to keep thin content pages out of the index, to hide archives and internal search result pages, during an A/B test and so on. This is perfectly innocent, perfectly noble and perfectly necessary. However…

… do not use the “nofollow” attribute to accomplish this!

The “nofollow” attribute doesn’t prevent pages from being indexed by the search engines, but it does ruin the flow of PageRank through your site.

For the very same reason, you should not attempt to sculpt the flow of PageRank through your site by using the “nofollow” attribute. Let me explain.

The “nofollow” attribute does prevent PageRank from passing through a link, but Google still counts the total number of links on your page when determining how much PageRank each link passes. In other words, your followed links pass the same amount of PageRank regardless of whether the other links on the page are nofollowed or not.
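To make the arithmetic concrete, here is a minimal sketch (my own illustration, using the simplified splitting model described above; the real algorithm involves far more, including a damping factor, which is ignored here, and the function names are hypothetical):

```python
def pagerank_per_followed_link(page_rank, total_links):
    """Simplified model: PageRank is split across ALL links on the page,
    so nofollowing some links does not raise the share of the rest."""
    return page_rank / total_links

def pagerank_passed_on(page_rank, total_links, nofollowed_links):
    """Total PageRank the page actually passes along. The share assigned
    to nofollowed links is simply discarded, not redistributed."""
    followed_links = total_links - nofollowed_links
    return followed_links * pagerank_per_followed_link(page_rank, total_links)

# A page with 1.0 PageRank and 10 links, none nofollowed:
print(pagerank_passed_on(1.0, 10, 0))   # 1.0 passed in total
# Nofollow 5 of the 10: each followed link still passes only 0.1,
# and half of the PageRank is thrown away.
print(pagerank_passed_on(1.0, 10, 5))   # 0.5 passed in total
```

Under this model, nofollowing half your links doesn’t concentrate anything; it just vaporizes half the PageRank the page could have passed.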

I still see this happening often: SEO newcomers and webmasters using the “nofollow” attribute on their own content, either thinking that it will prevent a page from showing up in the search results, or thinking that they can use it to direct PageRank to their most important pages. The “nofollow” attribute accomplishes neither of these things.

When you use a “nofollow” attribute, you are throwing away PageRank. Don’t do it, not even for pages that you don’t want indexed. If you want to keep a page out of the index, use this in your HTML head:

<meta name="robots" content="noindex, follow">

The above directive prevents the page from turning up in the search results but recommends that the search engine follow the links on the page. That way, any PageRank that flows into the unindexed page will be passed back to your site through the links on the page, rather than getting dumped.
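As a quick way to check that the directive is actually in place on your own pages, here is a small sketch using Python’s standard html.parser (my own illustration; the class and function names are hypothetical):

```python
from html.parser import HTMLParser

class RobotsMetaScanner(HTMLParser):
    """Collects the content value of every <meta name="robots"> tag."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attributes = dict(attrs)
            if attributes.get("name", "").lower() == "robots":
                self.directives.append(attributes.get("content", "").lower())

def keeps_page_out_of_index(html):
    """True if the page is noindexed while its links remain followed."""
    scanner = RobotsMetaScanner()
    scanner.feed(html)
    return any("noindex" in d and "nofollow" not in d
               for d in scanner.directives)

page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(keeps_page_out_of_index(page))  # True
```

A page carrying “noindex, nofollow” (or no robots meta tag at all) would return False, flagging it for review.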

2. Not using canonicalization

The rel=canonical tag in the HTML head looks like this:

<link rel="canonical" href="-a.html" />

It tells search engines that the linked URL, rather than the current page, should be treated as the canonical version.

Why would you use this tag? Its purpose is to prevent duplicate content from getting indexed, which can dilute your search engine authority. The canonical tag also appears to pass PageRank from the non-canonical page to the canonical page, so there is no need to be concerned about losing the PageRank accumulated by the non-canonical page.

This is a place where conversion optimizers often slip up. Page alternates in an A/B test should make use of the canonical tag so that the alternate page doesn’t get indexed (and so that any authority picked up by the alternate page is passed to the primary page).

Variations on product pages, such as alternates with a different color, are another common example. Duplicates can also get created any time URL query strings are in use. For this reason, sitewide canonicalization can be a good solution for sites that make use of query strings. Self-referencing canonical tags are not generally thought to be an issue.
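One way to implement sitewide canonicalization for query-string duplicates is to emit a canonical tag pointing at the URL with its query string stripped. This sketch (my own, built on Python’s standard urllib.parse) assumes the query parameters only filter, sort or track and never change the primary content, which you should verify before applying anything like it site-wide:

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_url(url):
    """Strip the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

def canonical_tag(url):
    """Build the <link> element to place in the page's HTML head."""
    return '<link rel="canonical" href="%s" />' % canonical_url(url)

print(canonical_tag("https://example.com/shirts/blue?utm_source=news&sort=price"))
# <link rel="canonical" href="https://example.com/shirts/blue" />
```

Every parameterized variant of the page then declares the clean URL as canonical, so any authority the variants pick up is consolidated there.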

3. Poor use of outbound links

If you’re linking to another site in your site-wide navigation, and it’s not one of your social media profiles, odds are you should remove the link.

From a pure PageRank standpoint, external links dilute the authority that gets passed back to your own site. This isn’t to say that you shouldn’t link to anybody else (which would utterly defeat the purpose of using links as a ranking factor), but outbound links in your site navigation compound the losses by affecting every page.

Of course, Google has come a long way since the original PageRank algorithm, but there’s another reason why external links in the navigation are iffy: it’s easy for them to look like spam.

The situation is, of course, far worse if the links use keyword anchor text or are placed where they could be confused for internal site navigation.

Outbound links in the primary content are generally not an issue, but it is important to screen them for quality. Links to “bad neighborhoods” can get a site penalized by Google’s spam team or pushed down the rankings by anti-spam algorithms.

And, of course, it is absolutely crucial that you always nofollow advertisement links of any kind.

4. Not enough outbound links

The idea that “a little bit of knowledge is a dangerous thing” definitely applies here. A limited understanding of how the search engines work leads some to believe that they should never link to another site. While it’s true that the pure PageRank algorithm would suggest this, it’s simply not how things work out in the field.

A case study by Reboot Online makes a pretty clear case for this. They created 10 sites targeting a nonsense keyword: five featuring authoritative outbound links and five without.

The results were about as definitive as possible for a study of this size: all five of the sites with outbound links performed better than the sites without them.

In a post on PageRank sculpting, Matt Cutts, Google’s former head of web spam, also mentions that “parts of our system encourage links to good sites,” which seems to confirm the idea that linking to other sites is important.

To be fair, John Mueller has openly stated that outbound links aren’t “specifically a ranking factor,” while adding that they “can bring value to your content and that in turn can be relevant for us in search.” In the context of the Reboot Online study and Matt Cutts’s statement, this might be interpreted to mean that including citations boosts confidence in content, rather than meaning that outbound links have no effect at all.

Regardless, well-sourced content is a must if you want to be taken seriously, which may have a positive, if indirect, effect on rankings.

5. Poor internal link structure

There’s more than one right way to structure your links, but there are plenty of wrong ways to do it, too.

Let’s start with the basics. As the Google guidelines state:

Build your site with a logical link structure. Every page should be reachable from at least one static text link.

Your typical modern content management system will usually handle at least this much automatically, but this functionality sometimes gets broken.

One dangerous myth is that you are supposed to canonicalize multi-page posts back to the first page. In reality, you should either leave well enough alone or canonicalize to a single page that contains the entire post. This goes for archives and similar pages, too: canonicalizing these pages risks erasing the links on them from the search index.

A completely flat link architecture is another common issue. Some take the idea that every page needs to be accessible through links a bit too far, including links to virtually every page on the site within the navigation.

From the user perspective, this creates obvious issues by making it very difficult to locate the appropriate pages.

This confusion also passes on to the search engines and the way they interpret your site. Without a clear hierarchy, search engines have a very difficult time parsing which pages on your site are most important, which pages cover which topics, and so on.

Remember, there’s much more to the algorithm than PageRank. A categorical hierarchy helps search engines understand your site semantically, which is very important for rankings.
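One simple way to audit a hierarchy is to measure click depth: how many clicks each page sits from the homepage. Here is a sketch using a breadth-first search over internal links (my own illustration; the site graph below is hypothetical):

```python
from collections import deque

def click_depths(links, homepage):
    """Breadth-first search over internal links, returning each reachable
    page's minimum click distance from the homepage."""
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site with a categorical hierarchy: home -> category -> post.
site = {
    "/": ["/blog/", "/products/"],
    "/blog/": ["/blog/post-1", "/blog/post-2"],
    "/products/": ["/products/widget"],
}
print(click_depths(site, "/"))
```

In a sensible hierarchy, category pages sit one click deep and posts two, without every URL being crammed into the navigation; pages that come back many clicks deep (or unreachable) are the ones to fix.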

Watch out for tag clouds and long lists of dated archives. These show up less often in modern CMS themes, but they occur often enough that you should know to avoid them. Click-through rates on these are awful, and the extra links divide up PageRank. Dated archive lists, in particular, add no semantic information to your link architecture, and category links are far better organized than muddy tag clouds.

Finally, while omitting them isn’t exactly a mistake, we highly recommend referencing your own content within your body content. Contextual links within body content are generally believed to count for more than links in the navigation, and they certainly add important semantic value.

6. Poor URL architecture

URL architecture can be a difficult thing to fix without breaking other aspects of your SEO, so we don’t recommend rushing into this, or you might do more harm than good.

That said, one of the most frequent issues I come across is the lack of a solid URL architecture. In particular, folder organization is often spotty.

A few common issues:

  • Blog posts listed in multiple categories, and therefore in multiple folders, creating duplicate content issues.
  • URLs with no folders other than the parent domain. While this is precisely the form your most important pages should take, pages further down the hierarchy should be listed in folders to categorize them.
  • URLs with folders that are, themselves, 404 pages. If a URL is listed under a folder, many users expect that folder to be an operational page. From an architecture perspective, it’s semantically confusing, and from an internal link perspective, it’s ideal to have links to these pages from a parent folder.
  • Junk URLs full of numbers and letters. These days, these are primarily reserved for search result pages and database queries that aren’t intended to be indexed and found in search engines. Your URLs should contain useful information intelligible to a human if you want them to contribute positively to your performance in the search engines.

In addressing these issues, there are two complications you want to avoid: creating 404 pages and losing existing link authority. When you change your URL architecture, you need to make sure that the old pages 301 redirect to the new ones. Ideally, any internal links to the old pages should also be updated, since PageRank is reduced by the damping factor every time it passes through a link or a 301.
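As a back-of-the-envelope illustration of why redirect chains matter, here is a sketch applying the commonly cited 0.85 damping factor once per hop (a deliberate simplification of the model; Google does not publish exact figures, so treat the numbers as illustrative only):

```python
DAMPING = 0.85  # the damping factor from the original PageRank paper

def rank_surviving_hops(rank, hops, damping=DAMPING):
    """PageRank left after a chain of links or 301 redirects, assuming the
    damping factor is applied once per hop (a simplification)."""
    return rank * damping ** hops

# A single clean 301 versus a sloppy chain of three redirects:
print(round(rank_surviving_hops(1.0, 1), 4))  # 0.85
print(round(rank_surviving_hops(1.0, 3), 4))  # 0.6141
```

This is why, after a restructure, it pays to point old internal links directly at the new URLs rather than leaning on chains of redirects.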

As an exception, if blog posts are listed in multiple categories, a 301 isn’t always necessary; in its place, you should canonicalize to the preferred page.

7. Using frames

Frames and iframes are needed in a few places, but you should never use them for anything that you want to be indexed. Google is pretty clear on this:

Frames can cause problems for search engines because they don’t correspond to the conceptual model of the web. In this model, one page displays only one URL. Pages that use frames or iframes display several URLs (one for each frame) within a single page. Google tries to associate framed content with the page containing the frames, but we don’t guarantee that we will.

This isn’t to say that your site should never use them. YouTube embeds make use of iframes, for example.

What you absolutely should not do is use frames as a method of navigating content on your site. This not only makes the content difficult to index, it ruins your site architecture and makes it very difficult for people to reference your content with links.

8. Using unindexable formats

Search engines have limited ability to crawl and index the content found inside images, Flash files, Java applets and videos.