A Beginner's Guide to Making Your Website Crawler-Friendly
27th August, 2021
Ok, so you've made your website and you think you've done a great job. You've had a look at the Best Website Builder Australia 2021 article and built your own website. But something a lot of people forget to consider, or don't even know they need to consider, is how it's going to be crawled!
When you're new to the SEO game, it can be quite tricky wrapping your head around all the jargon, the process and how it all works. We've actually created a super helpful Glossary for Essential SEO Terminology that you can check out before reading this. But in this article, we're going to focus on one step in the SEO process, known as website crawling, and how to make sure your website is being crawled.
What is website crawling?
The term website crawling doesn't sound all that fun, but it is a critical step in the SEO process. Website crawling is a task typically carried out by a web crawler bot. These bots go by various names, including spider bots, spiders or just crawlers. They start from a seed, a list of known URLs, and crawl the web pages at those URLs first. As they crawl those pages, they find hyperlinks to other URLs and add those to the list of pages to crawl next.
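If you're curious what that looks like in practice, here is a minimal sketch of the idea in Python. It assumes the requests library is installed, and it is nothing like a real search engine crawler, which also handles robots.txt rules, crawl scheduling, politeness and much more.

```python
# A rough sketch of how a crawler follows links, starting from a seed list.
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

import requests  # assumed to be installed: pip install requests


class LinkParser(HTMLParser):
    """Collects href values from <a> tags on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, max_pages=10):
    """Start from the seed URLs, fetch each page, and queue any new links found."""
    to_crawl = deque(seed_urls)
    seen = set(seed_urls)
    fetched = 0
    while to_crawl and fetched < max_pages:
        url = to_crawl.popleft()
        page = requests.get(url, timeout=10)
        fetched += 1
        parser = LinkParser()
        parser.feed(page.text)
        for link in parser.links:
            absolute = urljoin(url, link)   # resolve relative links like "/about"
            if absolute not in seen:
                seen.add(absolute)
                to_crawl.append(absolute)   # add to the list of pages to crawl next
    return seen
```

The important part is the loop: every page that gets fetched can add new pages to the queue, which is how a crawler ends up discovering pages it was never explicitly told about.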
You may be wondering: why do they crawl in the first place? Well, the internet is constantly expanding and changing, and it can be extremely challenging to keep track of what is where. The internet is often compared to a library, and not knowing where anything is, or having no way to organise it, would be quite chaotic. A web crawler's primary purpose is to discover publicly available web pages and bring data about those pages back to the search engine's servers, such as Google's. This way the search engine gets information on the 'books' of the internet and can identify web pages that match the queries of searchers.
Indexing
As mentioned previously, web crawlers crawl your pages and bring information back to the search engine. The process of storing that information is known as indexing. Just because your site has been crawled does not necessarily mean that it will be indexed. The index is where all the pages that have been discovered are stored: a huge database, like a massive library, where all this information is sorted.
Making your website crawlable
If spider bots cannot crawl your site, then it can’t and won’t be indexed, and therefore, it will not show up in search results. That is why website owners must make it easy for their website to be crawled. If your website does not get indexed, you are as good as invisible.
Removing crawl blocks
To ensure you are being crawled and indexed, you should make sure that you are not blocking crawlers. Crawlers can be blocked in something called a robots.txt file. You can check whether you are blocking them by simply visiting yourdomain.com/robots.txt. If you see rules along the lines of "Googlebot" followed by "Disallow", it means you're not letting Googlebot (Google's spider bot) crawl your pages, and you can simply remove those "disallow" rules to make sure your pages can be crawled.
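To give you an idea of what to look for, a robots.txt file that blocks Google's crawler from an entire site might look something like this (treat it as an illustrative example rather than something to copy as-is):

```
User-agent: Googlebot
Disallow: /
```

Deleting the Disallow: / line, or leaving the value empty as just Disallow:, tells Googlebot it is allowed to crawl the site.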
While removing crawl blocks is important for getting crawled and indexed, the name of the game isn't to get absolutely everything indexed. There are some things that you actually don't want spider bots to get their virtual claws into. These include old URLs with thin content, staging or test pages, special promo code pages and duplicate URLs. It's important to identify the pages that you do want to show up in the search results and the ones you should block from crawling.
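For instance, two common ways to keep something like a staging area or a promo code page out of the results are a robots.txt rule or a noindex meta tag. The /staging/ path below is just a made-up example:

```
User-agent: *
Disallow: /staging/
```

And on an individual page you'd rather keep out of the results:

```html
<meta name="robots" content="noindex">
```

As a rule of thumb, robots.txt stops bots from crawling a page, while the noindex tag lets them crawl it but tells them not to index it.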
Crawling "Through", Not Just "To"
Crawl bots don't just go to a page, grab all the information and leave. They follow the links on the page and collect the information they find on the pages those links lead to. In other words, crawlers need to be able to crawl not only to your website but also through it. While these bots may seem quite nifty and smart, they are actually limited in the functions they can perform.
Things like login forms can pose a real roadblock to a bot, so don't hide your information behind these locked doors. Bots can't use search forms either, so don't rely on them to surface everything your visitors might be searching for.
Don't use other forms of media (e.g. images, videos, GIFs etc.) to communicate text. There is no real guarantee that the bot will actually pick it up and index it.
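As a simple illustration, important text is safest as real text in the HTML, and if an image does carry a message, a descriptive alt attribute at least gives the bot something to read (the wording below is just an example):

```html
<!-- Real text the crawler can read directly -->
<h1>Spring Sale: 20% Off All Plans</h1>

<!-- If the message lives in an image, describe it in the alt attribute -->
<img src="sale-banner.png" alt="Spring sale: 20% off all plans">
```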
These are super quick and easy ways to help ensure you're meeting the prerequisites for showing up on that search page.
SEO can be quite complex, and crawling and indexing are only the very first stages of the long process known as SEO. If you are curious about the following steps of the SEO process, you should check out our other articles all about how you can benefit from SEO. And if you still want to learn more, contact us here for a little more information about how you can get professional help!
Written by
Tianna Chalon