SOURCE: Search Engine Jour­nal

Once a site is live or has advanced past a cer­tain age, most web­mas­ters don’t real­ly con­cern them­selves with their crawl bud­get any­more.

As long as you keep linking to new blog posts from somewhere on your website, they should simply show up in Google or Bing’s index and start ranking.

Only, over time, you notice that your site is starting to lose keyword rankings and none of your new posts are even hitting the top 100 for their target keyword.

It could sim­ply be a result of your site’s tech­ni­cal struc­ture, thin con­tent, or new algo­rithm changes, but it could also be caused by a very prob­lem­at­ic crawl error.

With hun­dreds of bil­lions of web­pages in Google’s index, you need to opti­mize your crawl bud­get to stay com­pet­i­tive.

Here are 11 tips and tricks to help opti­mize your crawl speed and help your web­pages rank high­er in search.

1. Track Crawl Status with Google Search Console

Errors in your crawl sta­tus could be indica­tive of a deep­er issue on your site.

Check­ing your crawl sta­tus every 30–60 days is impor­tant to iden­ti­fy poten­tial errors that are impact­ing your site’s over­all mar­ket­ing per­for­mance. It’s lit­er­al­ly the first step of SEO; with­out it, all oth­er efforts are null.

Right there on the side­bar, you’ll be able to check your crawl sta­tus under the index tab.

Now, if you want to remove access to a cer­tain web­page, you can tell Search Con­sole direct­ly. This is use­ful if a page is tem­porar­i­ly redi­rect­ed or has a 404 error.

A 410 status code will permanently remove a page from the index, so beware of using the nuclear option.

Common Crawl Errors & Solutions

If your web­site is unfor­tu­nate enough to be expe­ri­enc­ing a crawl error, it may require an easy solu­tion or be indica­tive of a much larg­er tech­ni­cal prob­lem on your site. The most com­mon crawl errors I see are:

  • DNS errors
  • Serv­er errors
  • Robots.txt errors
  • 404 errors

To diagnose some of these errors, you can leverage the Fetch as Google tool (since replaced by the URL Inspection tool in the current Search Console) to see how Google effectively views your site.

Fail­ure to prop­er­ly fetch and ren­der a page could be indica­tive of a deep­er DNS error that will need to be resolved by your DNS provider.

Resolv­ing a serv­er error requires diag­nos­ing a spe­cif­ic error that can be ref­er­enced in this guide. The most com­mon errors include:

  • Time­out
  • Con­nec­tion refused
  • Con­nect failed
  • Con­nect time­out
  • No response

Most of the time, a server error is temporary, although a persistent problem could require you to contact your hosting provider directly.

Robots.txt errors, on the other hand, could be more problematic for your site. If your robots.txt file is returning anything other than a 200 or 404 status code, it means search engines are having difficulty retrieving this file.

You could sub­mit a robots.txt sitemap or avoid the pro­to­col alto­geth­er, opt­ing to man­u­al­ly noin­dex pages that could be prob­lem­at­ic for your crawl.

Resolv­ing these errors quick­ly will ensure that all of your tar­get pages are crawled and indexed the next time search engines crawl your site.
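
As a rough illustration of the robots.txt check, here is a minimal Python sketch. It assumes the common convention that a 200 (file found) or 404 (no file) response is unproblematic and that anything else, such as a 5xx server error, deserves attention. The function names and the example.com domain are hypothetical and are not part of any Google tool.

```python
import urllib.error
import urllib.request


def classify_robots_status(status: int) -> str:
    """Label a robots.txt HTTP status for a crawl audit (illustrative rule)."""
    if status == 200:
        return "ok: file found"
    if status == 404:
        return "ok: no file, crawlers assume no restrictions"
    return "problem: crawlers may have trouble retrieving the file"


def fetch_robots_status(site: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for <site>/robots.txt."""
    url = site.rstrip("/") + "/robots.txt"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses raise HTTPError; the code is still useful.
        return err.code


# Example usage (hypothetical domain, requires network access):
# status = fetch_robots_status("https://example.com")
# print(status, classify_robots_status(status))
```

DNS failures and timeouts raise other exceptions (such as urllib.error.URLError) that this sketch deliberately leaves unhandled, since they point to the DNS and server errors covered earlier rather than a robots.txt issue.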

2. Create Mobile-Friendly Webpages

With the arrival of the mobile-first index, we must also optimize our pages to display mobile-friendly copies on the mobile index.

The good news is that a desk­top copy will still be indexed and dis­play under the mobile index if a mobile-friend­ly copy does not exist. The bad news is that your rank­ings may suf­fer as a result.

There are many technical tweaks that can instantly make your website more mobile-friendly, including:

  • Imple­ment­ing respon­sive web design.
  • Inserting the viewport meta tag in content.
  • Mini­fy­ing on-page resources (CSS and JS).
  • Tag­ging pages with the AMP cache.
  • Opti­miz­ing and com­press­ing images for faster load times.
  • Reduc­ing the size of on-page UI ele­ments.
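
One of the items above, the viewport meta tag, is easy to audit automatically. The sketch below uses only Python's standard library to report whether a page's HTML declares a viewport; the class and function names are made up for illustration, and the sample markup in the usage note is an assumption rather than a required value.

```python
from html.parser import HTMLParser


class ViewportFinder(HTMLParser):
    """Records the content of the first <meta name="viewport"> tag, if any."""

    def __init__(self) -> None:
        super().__init__()
        self.viewport = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.viewport is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "viewport":
            self.viewport = attributes.get("content") or ""


def find_viewport(html: str):
    """Return the viewport meta content, or None when the tag is missing."""
    finder = ViewportFinder()
    finder.feed(html)
    return finder.viewport
```

Calling find_viewport on a page whose head contains a tag such as `<meta name="viewport" content="width=device-width, initial-scale=1">` returns that content string; a page without the tag returns None and is a candidate for a mobile-friendliness fix.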

Be sure to test your website on a mobile platform and run it through Google PageSpeed Insights. Page speed is an important ranking factor and can affect the speed at which search engines can crawl your site.

3. Update Content Regularly

Search engines will crawl your site more reg­u­lar­ly if you pro­duce new con­tent on a reg­u­lar basis. This is espe­cial­ly use­ful for pub­lish­ers who need new sto­ries pub­lished and indexed on a reg­u­lar basis.

Pro­duc­ing con­tent on a reg­u­lar basis sig­nals to search engines that your site is con­stant­ly improv­ing and pub­lish­ing new con­tent and there­fore needs to be crawled more often to reach its intend­ed audi­ence.

4. Submit a Sitemap to Each Search Engine

One of the best tips for index­a­tion to this day remains sub­mit­ting a sitemap to Google Search Con­sole and Bing Web­mas­ter Tools.

You can cre­ate an XML ver­sion using a sitemap gen­er­a­tor or man­u­al­ly cre­ate one in Google Search Con­sole by tag­ging the canon­i­cal ver­sion of each page that con­tains dupli­cate con­tent.
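
If you would rather not rely on a sitemap generator, a basic XML sitemap is simple to produce yourself. The following is a minimal sketch, assuming you already have a list of canonical URLs; the function name and the example.com URLs in the usage note are hypothetical, and optional fields such as lastmod are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

For example, build_sitemap(["https://example.com/", "https://example.com/blog/"]) yields a document you could save as sitemap.xml and submit through Google Search Console and Bing Webmaster Tools.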

5. Optimize Your Interlinking Scheme

Estab­lish­ing a con­sis­tent infor­ma­tion archi­tec­ture is cru­cial to ensur­ing that your web­site is not only prop­er­ly indexed, but also prop­er­ly orga­nized.

Cre­at­ing main ser­vice cat­e­gories where relat­ed web­pages can sit can fur­ther help search engines prop­er­ly index web­page con­tent under cer­tain cat­e­gories when intent may not be clear.

6. Deep Link to Isolated Webpages

If a webpage on your site or a subdomain is created in isolation or there is an error preventing it from being crawled, then you can get it indexed by acquiring a link on an external domain. This is an especially useful strategy for promoting new pieces of content on your website and getting them indexed more quickly.

Beware of syndicating content to accomplish this, as search engines may ignore syndicated pages and it could create duplicate content errors if not properly canonicalized.
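
Before you can deep link to isolated pages, you have to find them. One way to do that, sketched below, is to take a map of internal links (for instance, exported from a crawl) and list the pages that nothing else links to. The data structure, function name, and sample paths are all assumptions made for illustration.

```python
def find_isolated_pages(links, entry_points=("/",)):
    """Return pages that no other page links to.

    `links` maps each page to the internal pages it links out to.
    Entry points such as the homepage are never reported as isolated.
    """
    linked_to = set()
    for source, targets in links.items():
        # Ignore self-links: a page linking to itself is still isolated.
        linked_to.update(t for t in targets if t != source)
    return sorted(
        page for page in links
        if page not in linked_to and page not in entry_points
    )


# Hypothetical link map for a small site.
site_links = {
    "/": ["/blog/", "/services/"],
    "/blog/": ["/", "/blog/post-1/"],
    "/services/": ["/"],
    "/blog/post-1/": [],
    "/landing/old-promo/": ["/"],
}
print(find_isolated_pages(site_links))
```

Here only /landing/old-promo/ is reported, because it links out but receives no internal links. Pages found this way are candidates for either an internal link or the external deep link described above.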

7. Minify On-Page Resources & Improve Load Times

Forc­ing search engines to crawl large and unop­ti­mized images will eat up your crawl bud­get and pre­vent your site from being indexed as often.

Search engines also have dif­fi­cul­ty crawl­ing cer­tain back­end ele­ments of your web­site. For exam­ple, Google has his­tor­i­cal­ly strug­gled to crawl JavaScript.

Even cer­tain resources like Flash and CSS can per­form poor­ly over mobile devices and eat up your crawl bud­get. In a sense, it’s a lose-lose sce­nario where page speed and crawl bud­get are sac­ri­ficed for obtru­sive on-page ele­ments.

Be sure to opti­mize your web­page for speed, espe­cial­ly over mobile, by mini­fy­ing on-page resources, such as CSS. You can also enable caching and com­pres­sion to help spi­ders crawl your site faster.
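
To get a feel for how much compression can help, you can measure it on your own assets. The sketch below estimates the share of bytes gzip would save on a text resource such as a stylesheet. The function name and the sample CSS are hypothetical, and real-world savings depend on the file and on how your server is configured.

```python
import gzip


def gzip_savings(payload: bytes) -> float:
    """Return the fraction of bytes saved by gzip-compressing the payload."""
    if not payload:
        return 0.0
    compressed = gzip.compress(payload)
    return 1 - len(compressed) / len(payload)


# Hypothetical stylesheet with a lot of repetition.
css = b"body { margin: 0; padding: 0; }\n" * 200
print(f"gzip would save about {gzip_savings(css):.0%} of this file")
```

Text resources like CSS, JavaScript, and HTML usually compress well, while already-compressed formats such as JPEG gain little, which is why images need to be optimized separately.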

8. Fix Pages with Noindex Tags

Over the course of your website’s devel­op­ment, it may make sense to imple­ment a noin­dex tag on pages that may be dupli­cat­ed or only meant for users who take a cer­tain action.

Regardless, you can identify web pages with noindex tags that are preventing them from being indexed by using a crawler like Screaming Frog, which offers a free version.

The Yoast plu­g­in for Word­Press allows you to eas­i­ly switch a page from index to noin­dex. You could also do this man­u­al­ly in the back­end of pages on your site.
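
If you want to spot-check a handful of pages without a dedicated tool, the check itself is small. This sketch looks for a noindex directive in a page's robots meta tag using Python's standard library. The names are made up for illustration, and it only inspects `<meta name="robots">`: it does not look at crawler-specific meta tags or the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            self.directives.update(
                part.strip().lower()
                for part in content.split(",")
                if part.strip()
            )


def has_noindex(html: str) -> bool:
    """Return True when the page's robots meta tag includes noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Running has_noindex over the HTML of your key landing pages is a quick way to catch a noindex tag that was left behind after development.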

9. Set a Custom Crawl Rate

In the old ver­sion of Google Search Con­sole, you can actu­al­ly slow or cus­tomize the speed of your crawl rates if Google’s spi­ders are neg­a­tive­ly impact­ing your site.

This also gives your web­site time to make nec­es­sary changes if it is going through a sig­nif­i­cant redesign or migra­tion.

10. Eliminate Duplicate Content

Hav­ing mas­sive amounts of dupli­cate con­tent can sig­nif­i­cant­ly slow down your crawl rate and eat up your crawl bud­get.

You can elim­i­nate these prob­lems by either block­ing these pages from being indexed or plac­ing a canon­i­cal tag on the page you wish to be indexed.

Along the same lines, it pays to opti­mize the meta tags of each indi­vid­ual page to pre­vent search engines from mis­tak­ing sim­i­lar pages as dupli­cate con­tent in their crawl.
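
A simple way to surface exact duplicates is to fingerprint each page's text and group the URLs that share a fingerprint. The sketch below assumes you have already extracted the main text of each page; the function names and sample URLs are hypothetical. It only catches copies that are identical after whitespace and case are normalized, not near-duplicates.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and lowercasing it."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicates(pages):
    """Group URLs whose text is identical after normalization.

    `pages` maps each URL to its extracted text.
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical pages: the second URL is a parameterized copy of the first.
pages = {
    "/shoes": "Red running shoes.",
    "/shoes?ref=nav": "  red   running shoes. ",
    "/hats": "Blue hats.",
}
print(find_duplicates(pages))
```

Each group returned is a set of URLs competing for the same content, so one of them should carry the canonical tag and the rest should point to it or be blocked from the index.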

11. Block Pages You Don’t Want Spiders to Crawl

There may be instances where you want to prevent search engines from crawling or indexing a specific page. You can accomplish this by the following methods:

  • Plac­ing a noin­dex tag.
  • Plac­ing the URL in a robots.txt file.
  • Delet­ing the page alto­geth­er.

This can also help your crawls run more efficiently, instead of forcing search engines to pore through duplicate content.
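
Before deploying robots.txt rules, it helps to confirm that they block what you intend and nothing more. Python's standard library includes a robots.txt parser that can test a URL against a set of rules. In the sketch below, the rules, the example.com URLs, and the is_crawlable helper are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of cart and internal search pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True when the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_crawlable(ROBOTS_TXT, "https://example.com/cart/checkout"))
print(is_crawlable(ROBOTS_TXT, "https://example.com/blog/post-1/"))
```

Keep in mind that the two main methods do different jobs: robots.txt stops a page from being crawled, while a noindex tag stops it from being indexed. A crawler that is blocked by robots.txt never sees the page's noindex tag, so combining both on the same URL can backfire.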

Conclusion

Chances are, if you are already fol­low­ing SEO best prac­tices, you should have noth­ing to wor­ry about with your crawl sta­tus.

Of course, it nev­er hurts to check your crawl sta­tus in Google Search Con­sole and to con­duct a reg­u­lar inter­nal link­ing audit.