Locayta Search® spell-correction

On some websites, up to 50% of search queries may be misspelled. Unless something is done about the misspelling problem, a single missing, extra or transposed character can produce no results or only weak results, prompting the visitor to leave your site believing that you do not have what they are looking for.

From the research that we've done, it would appear that a lot of misspelling is not actually misspelling but mistyping: the user knows how to spell the product name but has simply mistyped it.

Traditionally, most sites that try to address this problem do so by implementing a simple dictionary look-up, where words are checked against an English-language dictionary.

However, the problem with this is that you have to create and maintain the dictionary manually. Every time you add a product to the index, you have to check the product data for terms that are unique to that product and not yet in the dictionary, and then add those new terms to the dictionary.

Likewise, when you need to delete a product, you have to reverse the process: check whether the dictionary contains any terms that relate only to that one product, and remove them.
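The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the maintenance burden, not a description of any real product: the class name `ProductDictionary` and its methods are hypothetical, and terms are assumed to be simple whitespace-separated words.

```python
from collections import Counter


class ProductDictionary:
    """Hypothetical hand-maintained spelling dictionary (illustration only)."""

    def __init__(self):
        self.term_counts = Counter()  # term -> number of products that use it
        self.products = {}            # product id -> set of terms in that product

    def add_product(self, product_id, text):
        # Assumption: terms are lower-cased, whitespace-separated words.
        terms = set(text.lower().split())
        self.products[product_id] = terms
        for term in terms:
            self.term_counts[term] += 1

    def remove_product(self, product_id):
        # Reverse the process: drop any term that no other product still uses.
        for term in self.products.pop(product_id):
            self.term_counts[term] -= 1
            if self.term_counts[term] == 0:
                del self.term_counts[term]

    def knows(self, term):
        return term.lower() in self.term_counts
```

Even in this small form, every add and delete has to touch the dictionary, and a term can only be removed safely once its count shows that no other product still uses it.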

Another problem is that a standard dictionary won't recognise valid words that it doesn't list, such as brand names or surnames.

In addition, with a dictionary, you have to assume that the first one or two characters are correctly spelled and that any misspelling occurs further along the word. If you don’t make this assumption, then any misspelled 5-letter word could be any 5-letter word in the entire dictionary.

However, we know that a lot of misspelling is actually mistyping. The dictionary approach won’t help as mistyping often occurs with the first strike of the keyboard – i.e. you can’t assume that the first one or two letters are actually correct.
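To make the limitation concrete, here is a minimal Python sketch of a look-up that trusts the first two characters. The function name `prefix_candidates`, the `prefix_len` parameter and the sample word list are all assumptions made for illustration.

```python
def prefix_candidates(query, dictionary, prefix_len=2):
    """Return dictionary words that share the query's first prefix_len characters."""
    prefix = query[:prefix_len].lower()
    return [word for word in dictionary if word.startswith(prefix)]


# Hypothetical dictionary contents.
dictionary = ["camera", "camcorder", "cable", "tablet", "table"]

# Mistake later in the word: "camera" is still among the candidates.
print(prefix_candidates("camrea", dictionary))

# Mistake on the first keystroke: the prefix is wrong, so nothing is found.
print(prefix_candidates("acmera", dictionary))
```

The second call returns an empty list, because a slip on the very first keystroke makes the prefix itself wrong.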

Locayta Search is designed to catch misspelling, mistyping and misunderstanding, including the vaguest search impulse!

Locayta Search employs two different algorithms: trigram analysis and edit distance. Trigram analysis breaks words into three-character blocks and rearranges them to widen the set of possible word matches. Edit distance measures how far a typed word is from a candidate word, counted in single-character changes. Combined, the two provide a very powerful spell-correction capability.

Using these two analyses, Locayta Search handles misspelled or mistyped search terms by working out how a word has been misspelled in relation to what it knows about the product index. It will always try to return a result rather than zero results.
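As a rough illustration of how trigram overlap and edit distance can work together, the Python sketch below shortlists index terms by trigram similarity and then picks the one with the smallest edit distance. It is not Locayta's implementation: the padding convention, the Jaccard-style overlap score, the plain Levenshtein distance, the `shortlist_size` parameter and all function names are assumptions chosen for the example.

```python
def trigrams(word):
    """Split a word into overlapping three-character blocks (padded with spaces)."""
    padded = f"  {word.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def trigram_similarity(a, b):
    """Share of trigrams the two words have in common (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def edit_distance(a, b):
    """Levenshtein distance: single-character inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]


def suggest(query, index_terms, shortlist_size=10):
    """Return the closest index term; always returns one if the index is non-empty."""
    query = query.lower()
    shortlist = sorted(index_terms,
                       key=lambda term: trigram_similarity(query, term),
                       reverse=True)[:shortlist_size]
    return min(shortlist, key=lambda term: edit_distance(query, term.lower()))


# Hypothetical terms taken from a product index.
index_terms = ["camera", "camcorder", "cable", "tablet", "table"]
print(suggest("acmera", index_terms))   # first two letters transposed
print(suggest("cammera", index_terms))  # one extra character
```

Because candidates come from the product index itself, brand names are covered automatically, and no assumption is made about the first letters being correct. A production system would probably also apply a similarity threshold; this sketch simply returns the nearest term, however distant.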