How Google search works

Ever wondered how Google search works? If you are not a techie, this article will reveal many secrets of how you get '2.5 million search results in 0.35 seconds' on Google.

Google does not search the web in real time!

When we type a query into the Google search box, it returns thousands of search results. Many people think that Google sends its tools out to search the entire web and that the tools come back with the best results, like Aladdin's genie. No, it does not search the web like that.

What Google actually does is this: (i) its search bots keep crawling the web to find useful content, (ii) it maintains a huge index of the content that its bots judge to be good, and (iii) it uses a set of complex formulas to match your query against the content in its index.
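
To make the idea of "searching an index instead of the web" concrete, here is a minimal sketch in Python. It is only an illustration, not how Google implements it: the page URLs, the page texts and the function names are all made up for this example, and a real index stores far more than single words.

```python
from collections import defaultdict

# Hypothetical mini "web": URL -> page text (stands in for what a crawler fetched).
CRAWLED_PAGES = {
    "example.com/faucets": "how to repair a leaking faucet at home",
    "example.com/dogs": "dog breeds and dog food for a healthy pet",
    "example.com/weather": "latest weather updates for your city",
}

def build_index(pages):
    """Map each word to the set of pages that contain it (an inverted index)."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the pages that contain every word of the query, using only the index."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# The index is built once, ahead of time; each search only looks things up in it.
INDEX = build_index(CRAWLED_PAGES)
print(sorted(search(INDEX, "repair faucet")))
```

The point of the sketch is that the slow work (reading the pages) happens before anyone searches. Answering a query is then just a lookup, which is why results can come back in a fraction of a second.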

What type of content does Google consider good for search?

For Google, the quality of content comes first when indexing web pages and other web entities. Google evaluates a webpage on hundreds of quality parameters. These include:

  • Originality of content
  • Thoroughness of content
  • Usefulness of content
  • How often the webpage is linked to from authoritative websites
  • Social media signals about the content
  • Lack of grammatical and language errors
  • Lack of unethical optimization (= artificial boosting) on the webpage or website

Google assigns a score to each such parameter and uses a series of very complicated formulas (= algorithms) to give the webpage a combined ranking. It is reported that many parameters relate to the entire website and many more relate to individual entities (webpages, files, images, etc.). Webpages and websites with a low ranking are not kept in the index, and when the quality of a website or page deteriorates, it is taken out of the index. A webpage with a high ranking is likely to appear high in the search results when someone searches Google.
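
A combined ranking of this kind can be pictured as a weighted sum of individual scores. The sketch below assumes just four parameters; the weights, the cut-off value and the page scores are invented for illustration and are not Google's real parameters or numbers.

```python
# Hypothetical weights: illustrative only, not Google's real parameters or values.
WEIGHTS = {
    "originality": 0.3,
    "thoroughness": 0.2,
    "usefulness": 0.3,
    "authoritative_links": 0.2,
}

INDEX_THRESHOLD = 0.4  # assumed cut-off below which a page is dropped from the index

def combined_score(signals):
    """Weighted sum of the per-parameter scores, each expected in the range 0..1."""
    return sum(WEIGHTS[name] * signals.get(name, 0.0) for name in WEIGHTS)

def rank_pages(pages):
    """Drop pages that score below the threshold and sort the rest, best first."""
    scored = [(combined_score(signals), url) for url, signals in pages.items()]
    kept = [(score, url) for score, url in scored if score >= INDEX_THRESHOLD]
    return [url for score, url in sorted(kept, reverse=True)]

# Made-up pages with made-up scores for each parameter.
PAGES = {
    "example.com/guide": {"originality": 0.9, "thoroughness": 0.8,
                          "usefulness": 0.9, "authoritative_links": 0.7},
    "example.com/thin": {"originality": 0.2, "thoroughness": 0.1,
                         "usefulness": 0.2, "authoritative_links": 0.0},
    "example.com/ok": {"originality": 0.6, "thoroughness": 0.5,
                       "usefulness": 0.6, "authoritative_links": 0.3},
}
print(rank_pages(PAGES))
```

Here the thin page falls below the assumed cut-off and never reaches the results, while the other two are ordered by their combined score.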

Fresh content scores over old content.

Google likes fresh content. So, webpages that are updated regularly are more likely to be served than ones that have not been updated for a long time. Blogs score a big point here.

Google also has special freshness algorithms to work out whether the searcher is seeking up-to-date information or evergreen information. When a search query has keywords such as 'latest', 'updates' or 'score', Google tries to show content that gives fresh information on the topic or event.
However, evergreen content that does not change over time (e.g. 'the planetary system') has its own value for Google if it is written well. While frequently updated content has a short shelf life, high-quality evergreen content is searched for and shared again and again, and this sends positive signals to Google.
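
The difference between a "fresh" query and an "evergreen" one can be sketched as follows. The trigger words come from the examples above; everything else (the function names, the sample pages, and the rule that a page loses half its weight every 30 days) is an assumption made only for this illustration.

```python
from datetime import date

# Trigger words taken from the examples in the text; a real list would be much longer.
FRESHNESS_KEYWORDS = {"latest", "updates", "score"}

def wants_fresh_results(query):
    """True if the query contains a word that hints at up-to-date information."""
    return any(word in FRESHNESS_KEYWORDS for word in query.lower().split())

def order_results(query, pages, today):
    """Order pages, given as (url, relevance 0..1, last_updated), best first."""
    fresh = wants_fresh_results(query)

    def sort_key(page):
        url, relevance, last_updated = page
        if fresh:
            age_days = (today - last_updated).days
            # Hypothetical decay: a page loses half its weight for every 30 days of age.
            return relevance * 0.5 ** (age_days / 30)
        return relevance  # evergreen query: age does not matter

    return [url for url, _, _ in sorted(pages, key=sort_key, reverse=True)]

# Made-up pages: an older, more relevant report and a newer, less relevant one.
SAMPLE_PAGES = [
    ("example.com/old-report", 0.9, date(2018, 1, 1)),
    ("example.com/new-report", 0.7, date(2018, 12, 1)),
]
TODAY = date(2018, 12, 31)

print(order_results("latest cricket score", SAMPLE_PAGES, TODAY))
print(order_results("the planetary system", SAMPLE_PAGES, TODAY))
```

For the query with 'latest' and 'score', the newer page comes first even though it is less relevant. For the evergreen query, the age of the page is ignored and the more relevant page wins.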

Relevance is very important for search.

When serving results, Google looks at the meaningful expressions (= keywords) in the query and also at the intention behind the search.

If you search for 'power solutions', Google will try to work out whether you are looking for an electricity provider near you, an article on electric power problems and their solutions, a liquid solution with strong cleaning power, ways to deal with political power, or something else.
Once Google makes sense of what the search is about, it looks into its index and tries to serve the results that best match the query. Again, complicated algorithms are used to match search queries with entries in the index.

Search engines have become smarter over time.

Early search engines were 'dumb'; they just looked at the search query and matched it with entries in their indexes. Smart webmasters fooled them by stuffing keywords into their webpages and getting useless webpages to the top of the results pages. Then Google and others started penalizing such artificially boosted content. They also built language models to better understand how the same word can mean different things depending on the context or the way the query is phrased. Later, search engines started using machine learning to better understand the intent behind a search. Now they serve search results on the basis of many factors other than direct relevance, e.g. search settings, the searcher's location, what other searches were recently made on the same device, and which app the searcher is using at the time.
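
One simple way to picture how keyword stuffing can be spotted is to measure how much of a page is taken up by a single keyword. This is only a rough sketch: the 20% threshold and the function names are assumptions for illustration, and real spam detection relies on far more than one number.

```python
from collections import Counter

MAX_DENSITY = 0.2  # assumed threshold: one word making up over 20% of a page looks stuffed

def keyword_density(text, keyword):
    """Fraction of the words on the page that are exactly the keyword."""
    words = text.lower().split()
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)

def looks_stuffed(text, keyword):
    """True if the keyword takes up a suspiciously large share of the page."""
    return keyword_density(text, keyword) > MAX_DENSITY

STUFFED_PAGE = "dog dog dog dog buy dog dog cheap dog dog"
NATURAL_PAGE = "a dog needs good food regular walks and a yearly visit to the vet"

print(looks_stuffed(STUFFED_PAGE, "dog"))
print(looks_stuffed(NATURAL_PAGE, "dog"))
```

A 'dumb' engine that only counts matches would put the stuffed page on top. A check like this lets an engine treat a very high count as a warning sign instead of a merit.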

Signals of relevance, and forced relevance

Search engines have learnt that the context of a search keyword tells a lot about the search intent. For example, 'How to fix a faulty faucet' tells exactly what the searcher wants to know. So, instead of serving academic or engineering articles on faucets, Google will try to serve articles with advice on fixing a faulty faucet. It might also give links to faucet repair services near you.

As mentioned above, the freshness algorithm detects whether the query is asking for updates. If you search for 'Chicago weather', it will show the latest weather updates for Chicago at the top of the results page; if you query 'dollar-rupee rate', it will show the latest exchange rate between these currencies. It may also give links to forex dealers near you.

Coming back to the earlier example: instead of just indexing entries for 'faucets', Google has a way to index webpages that contain keywords of more than one word. In this case, Google will look for index entries in which 'repair' and 'faucet' come together in a meaningful sequence. So, Google is likely to serve search results with expressions such as the following: how to repair a faucet, what to do when a faucet leaks, ways of fixing faucets, how to fix taps so that they do not leak. Such long expressions that together identify a relevant indexed entry are called long tail keywords and are very helpful to Google in knowing the context of the query.
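
The idea of two words "coming together" on a page can be sketched as a simple nearness check. The function name and the window of four words are assumptions made for this illustration; it also matches exact words only, so it would not recognise 'fix' as 'repair' or 'taps' as 'faucets' the way a real search engine can.

```python
def words_near(text, first, second, window=4):
    """True if `first` and `second` occur within `window` words of each other."""
    words = text.lower().split()
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= window
               for i in first_positions
               for j in second_positions)

RELEVANT_PAGE = "how to repair a faucet that leaks"
UNRELATED_PAGE = "we repair cars and also sell every kind of kitchen faucet"

print(words_near(RELEVANT_PAGE, "repair", "faucet"))
print(words_near(UNRELATED_PAGE, "repair", "faucet"))
```

Both pages contain 'repair' and 'faucet', but only on the first page do the two words sit close enough to suggest that the page is about repairing faucets.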

Google goes a step further. It also looks for other items on webpages that tell whether a webpage is relevant to the search query. Taking an example from Google itself, if you search for 'dog', it will not serve a webpage with 'dog' written on it a hundred times. In addition to checking whether the webpage really has useful information on dogs (e.g. dog foods, breeds, diseases, pet care), it will also see whether it has photos, videos and links pertaining to dogs.

Search engine optimization or SEO

All webmasters know about search engine optimization. SEO includes measures that are taken on websites and specific webpages so that they come up in search results.

Search engines welcome ethical SEO, which guides search engines about the content and relevance of webpages, but they hate black-hat SEO, which tries to fool search engines into believing that a poor-quality webpage is of very high quality and relevance. Filtering out such bad, spammy webpages is one reason search engines keep changing their algorithms so regularly. Google has reported making as many as 3234 improvements to its search within the last year (i.e. 2018).

Google and other major search engines do not serve results just on the strength of quality and relevance. They give value to SEO, and therefore search-optimized content is likely to come up even if it is not the best or most relevant.

The changing face of queries and search results

More and more searches are now made on mobile devices. This has created new challenges and opportunities for search engines. Mobile phones have made Google available all the time and everywhere. New developments in localization etc. have also made it easy for users to search for anything on the go. Earlier, when we did not know details about something or someone, we looked it up on Google on our desktops. Now we go to Google when we plan a trip or look for a restaurant nearby. The results for such queries have to be exact, instant and accompanied by useful links.

Another major change that search engines are seeing is 'voice search'. More and more people are using voice for search, especially those on the move or those not comfortable typing fast. The mind works differently when one types a query and when one speaks it into a microphone. There are also issues relating to pronunciation and noise. Search engines, thus, have to be even smarter in getting the query and its intent right.

Playing with results

Search engines need huge resources. So, they must earn while giving the results free. They serve paid results before organic (= naturally occurring) search results. They also seem to give preference to market-friendly entries over knowledge articles. They place advertisements in the sidebar (on wide devices) and at the top and bottom (especially on smaller devices). They collect your browsing data, ostensibly to refine search but also with the intent to serve you targeted ads. There are many other ways in which search engines play with results, sometimes to help the searcher and sometimes for commercial reasons. Can you really blame them when you are getting so much information and convenience for free?

That's all for now. 
