PeterHoggan
It’s being suggested that Google’s new Caffeine update is going to place more weight on LSI. So here is some information for those interested in the basics. Don’t worry about references to previous tutorials, as this piece is pretty much standalone.
Search Engines and Latent Semantic Indexing
In the last unit of the course, we offered a basic overview of how search engines work. In this unit, we are going to take a more in-depth look at search technology, explaining some of the innovations that have been made to help return relevant results for search queries. In particular, we will be looking at some of the factors involved in Latent Semantic Indexing (or LSI for short).
Because of the nature of the subject, beginners may find the material in this unit quite advanced, so take your time and try to absorb the main points. We have also provided footnotes and suggestions for further reading. You are not required to read this material, but more advanced webmasters may find some of the sources mentioned useful.
By the end of this unit you should be able to:
- understand the basics of Latent Semantic Indexing
- understand how a search engine sees documents
- understand how a search engine weights keywords
This unit assumes that you have read the previous parts of the course and are familiar with major search engines such as Google.
3.1 Another look at search engines
SEO requires quite a lot of background knowledge if you are going to optimise your page in a manner that is effective and does not actually damage the ranking of your website. Before you even get your hands dirty altering the code on your web pages, you need to do quite a bit of research into the most effective keywords to use for your product or services and into the competition you face in search engine rankings. Even before this, however, you need to understand a bit about:
- search engines
- the individual searcher
In one sense, these can be considered the ‘rocks’ upon which effective SEO is founded. After all, SEO is about improving your search engine visibility in order to bring targeted traffic to your site, and this implicitly involves understanding the nature of both search engines and the average searcher. Knowledge of these areas will prove an immense help when you come to optimising your own pages.
The two areas are inter-related: after all, the function of a search engine is not to search documents per se, but to search documents in a way that satisfies the needs of the average searcher. In order to maintain trust, the search engine must continue to provide the user with reliable and relevant results for search queries. In this context, any innovations made in search engine algorithms can be considered, first and foremost, as refinements aimed at providing ever more relevant results for the searcher.
We will now look at some of the advanced techniques that search engines are beginning to employ in order to satisfy the needs of the searcher.
3.2 Google and Latent Semantic Indexing (LSI)
In the last unit of the course, we showed you how a search engine attempts to find relevant documents for a search query by locating pages in its index that match the search query - that is, pages that contain the specific words we entered. However, the process is rather more complex than this, largely because of an innovation on the part of the world’s leading search provider, Google.
In order to return more relevant results for the user, Google uses a method called ‘Latent Semantic Indexing’ when indexing documents on the web. Although this method is not used universally by all search engines, it is likely that other search engines will begin to factor this (or a similar) method into their algorithms in the future.
Note that Google does not rely entirely on LSI for finding relevant results. However, according to noted SEO experts, Google has been using LSI ‘for a while’ and has ‘recently increased its weighting’. This means that while traditional keyword-based search queries are still relevant (i.e., Google still tries to retrieve documents that contain the specific search terms or keywords you use), Google’s search algorithm has begun to place more importance on LSI when attempting to determine and retrieve relevant documents for a specific search query.
So what is LSI and how does it differ from a standard keyword search? In essence, LSI is a method for retrieving documents that are relevant to a search but that may not contain the specific keyword entered by the user.
For example, in a traditional keyword-based search, if I enter the search phrase ‘used cars’ into the search engine, it will only return documents that mention those actual terms somewhere on the page. It will not return web pages that mention terms that we normally consider to be closely related to our search query, e.g. ‘second-hand’, ‘vehicles’, ‘automobiles’, and so forth (unless these pages also happen to use the keyphrase ‘used cars’).
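The behaviour just described can be illustrated with a minimal sketch in Python. The page texts, the `PAGES` dictionary and the `keyword_search` function are all made up for this example; a real search engine uses a far larger index and many more ranking signals, so treat this only as a picture of strict keyword matching.

```python
# Toy collection of pages; the texts are invented for illustration.
PAGES = {
    "page_a": "quality used cars for sale",
    "page_b": "second-hand automobiles at low prices",
    "page_c": "used cars and second-hand vehicles",
}

def keyword_search(query, pages):
    """Return the names of pages that contain every word of the query."""
    wanted = set(query.lower().split())
    hits = []
    for name, text in pages.items():
        words = set(text.lower().split())
        if wanted <= words:  # every query word appears on the page
            hits.append(name)
    return sorted(hits)

results = keyword_search("used cars", PAGES)
print(results)  # ['page_a', 'page_c']
```

Note that `page_b` is left out even though it is clearly about the same subject, simply because it never uses the words ‘used’ or ‘cars’.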
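When using LSI, on the other hand, the search engine can also locate pages that contain related terms rather than our exact keyphrase. It does this by analysing which terms tend to appear together across many documents, so that pages using different words for the same topic end up looking similar. Our search might therefore return pages that mention only ‘second-hand automobiles’ alongside pages that specifically mention ‘used cars’.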
As you can see, then, LSI allows the search engine to return documents that are outside our specific search phrase, but that are still relevant to our search. It begins to approximate how we actually use language in real life, where we are aware of alternative terms and synonyms for words, and for this reason should prove to be more useful to the searcher than a standard keyword search.
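For readers who want to see the idea in action, here is a rough sketch of one common variant of LSI in Python. It is not Google’s implementation, which is not public. The five-page corpus, the function names and the choice of two ‘concepts’ (`k=2`) are assumptions made for illustration. To keep the example free of extra libraries, it finds the concept directions with simple power iteration; a real system would normally call a singular value decomposition (SVD) routine from a linear algebra library instead.

```python
import math

# Toy corpus, invented for illustration: three car pages and two baking pages.
DOCS = [
    "used cars",
    "used cars second-hand automobiles",  # uses both vocabularies
    "second-hand automobiles",
    "chocolate cake",
    "chocolate cake recipe",
]
TERMS = sorted({w for d in DOCS for w in d.split()})

def term_vector(text):
    """Count how often each known term occurs in the text."""
    words = text.split()
    return [float(words.count(t)) for t in TERMS]

# One row per document, one column per term.
D = [term_vector(d) for d in DOCS]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def top_concepts(doc_vectors, k, iterations=500):
    """Find the k strongest term-space directions ("concepts").

    These are the leading eigenvectors of the term-term co-occurrence
    matrix, i.e. the leading left singular vectors of the term-document
    matrix. Power iteration with deflation stands in for a library SVD.
    """
    n = len(doc_vectors[0])
    # C[i][j] = how strongly terms i and j co-occur across documents.
    C = [[sum(d[i] * d[j] for d in doc_vectors) for j in range(n)]
         for i in range(n)]
    concepts = []
    for _ in range(k):
        v = [1.0] * n
        for _step in range(iterations):
            w = [dot(row, v) for row in C]
            norm = math.sqrt(dot(w, w))
            v = [x / norm for x in w]
        eigenvalue = dot(v, [dot(row, v) for row in C])
        concepts.append(v)
        # Deflate: remove this concept so the next pass finds the next one.
        C = [[C[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return concepts

def to_concept_space(vector, concepts):
    """Project a term-count vector onto the concept directions."""
    return [dot(vector, c) for c in concepts]

def cosine(u, v):
    denom = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / denom if denom else 0.0

CONCEPTS = top_concepts(D, k=2)

def lsi_scores(query):
    """Score every document against the query in concept space."""
    q = to_concept_space(term_vector(query), CONCEPTS)
    return [cosine(q, to_concept_space(d, CONCEPTS)) for d in D]

scores = lsi_scores("used cars")
for text, score in zip(DOCS, scores):
    print(f"{score:5.2f}  {text}")
```

With this toy data, the page that only says ‘second-hand automobiles’ should score close to 1 for the query ‘used cars’, even though the two share no words, while the baking pages should score close to 0. The link comes from the second page, which uses both sets of terms together and so ties them to the same concept.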