
Locayta Search Feature Set
Locayta's latest search engine includes the following functionality:

Relevance Algorithm

Sorting

Balance Factor
The amount the results are skewed can also be controlled using a balance factor, a percentage value between 0% (the results are not skewed) and 100% (the results are ordered exactly as if they had simply been sorted by the field). This is especially useful when searching for products in a database that also contains accessories. Searching for a product will often return its accessories first, so a balance factor can be set so that non-accessories are boosted to the top of the results, while still allowing accessories to be found if the user searches for them specifically.

Filtering and facet generation
In order for this to be presented to the user, facet options are generated by the search engine. In the example above, a list of all the brands in the user's search results is generated, and the user can choose to narrow their search to just one (or several) of the options generated. This is also known as Guided Navigation.

Stemming
Stemming is the process of removing these suffixes and pluralisations to determine the 'stem' of each word in the record. The same process is also applied to the user's query words. This allows the user to search, for example, for 'dresses' and still find products that are described as a 'dress' (and vice versa). Stemming for different languages is often required too; in German, for example, pluralisation is often more complicated than simply adding an 's' to the end of the word. Locayta has stemming algorithms available for several common languages.

Spelling correction
Locayta performs spelling correction using a system called tri-gram analysis.
This uses the concept of edit distance to determine which of the words in the search index most closely matches the user's term, and it accommodates mistakes at the start of the word as well as in the middle and at the end. The leniency of the spell-corrector can also be configured.

Boolean operators
For simple queries, a default operator can be chosen. This allows a default of “matching all words” or “matching any words” to be set. The ESP platform also allows a staggered approach, e.g. use AND unless fewer than five results are found, in which case use OR. This behaviour can be customised.
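
The staggered operator behaviour can be illustrated with a minimal Python sketch. This is not the ESP platform's API: the function names, the whitespace tokenizer, the in-memory `docs` dictionary and the `min_results` parameter are all assumptions made for illustration.

```python
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lower-case the text and split it on whitespace (a deliberately naive tokenizer)."""
    return set(text.lower().split())


def search(query: str, docs: Dict[str, str], operator: str) -> List[str]:
    """Return the ids of documents matching all query words (AND) or any query word (OR)."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = []
    for doc_id, text in docs.items():
        words = tokenize(text)
        if operator == "AND":
            matched = terms <= words        # every query word is present
        else:
            matched = bool(terms & words)   # at least one query word is present
        if matched:
            hits.append(doc_id)
    return hits


def staggered_search(query: str, docs: Dict[str, str], min_results: int = 5) -> List[str]:
    """Try AND first; fall back to OR when fewer than min_results documents match."""
    results = search(query, docs, "AND")
    if len(results) < min_results:
        results = search(query, docs, "OR")
    return results
```

With a small catalogue such as `{"a": "red summer dress", "b": "red shoes", "c": "blue dress"}`, the query "red dress" matches only "a" under AND, so the default limit of five triggers the OR fallback and all three products are returned.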
Field weighting
This logic can also be applied to other parts of the products' information, if it is available, such as the brand name or manufacturer.

Stopwords
Stopwords can also be used to speed up the results a little, by removing words that often return a lot of generic results that aren't very useful. For example, in a clothing search, 'clothes' might be a stopword because if the user searches for 'red clothes', we can assume they will want to search for 'red' in every product, rather than just products containing the word 'clothes'. The full list of stopwords can be set in the ESP platform's control panel.

Synonyms
However, synonyms can be set up so that this does happen, allowing the user to find more relevant results despite not knowing how the terms are phrased in the product data. Synonyms are the easiest way to solve low-result searches.

More like this
This is a different approach to behavioural recommendations (also available from our platform) because it looks only at the text describing a product, rather than examining user behaviour. This makes it particularly useful on sites where that information is unavailable (for example, on new sites where no data has been accumulated yet), or for searches on data where user behaviour isn't considered useful.

Threshold cutoff
However, when sorting or balancing results, textually irrelevant results can end up being presented on the first page of results, which is often undesirable. To avoid this, Locayta can be configured to ignore results that are less relevant than a certain percentage. This also reduces the number of results returned, which can be less daunting to a user who would otherwise be presented with dozens or hundreds of pages of results to look through. Adding a threshold also slightly improves the speed at which results are returned, as less data has to be fetched.
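
A threshold cutoff of this kind can be sketched in a few lines of Python. The text does not say what the percentage is measured against, so this sketch assumes it is relative to the score of the best-matching result. The function name, the `(product, score)` tuples and the `cutoff_percent` parameter are hypothetical, not part of Locayta's interface.

```python
from typing import List, Tuple


def apply_threshold(
    results: List[Tuple[str, float]], cutoff_percent: float
) -> List[Tuple[str, float]]:
    """Keep only results scoring at least cutoff_percent of the top score.

    Assumes relevance scores are non-negative and that the percentage is
    relative to the best result (an assumption, not documented behaviour).
    """
    if not results:
        return []
    top_score = max(score for _, score in results)
    minimum = top_score * cutoff_percent / 100.0
    # The original ordering is preserved, so any sorting or balancing
    # applied earlier is left untouched.
    return [(product, score) for product, score in results if score >= minimum]
```

For example, with scores of 8.0, 4.0 and 1.0 and a cutoff of 50, the minimum acceptable score is 4.0, so the first two results are kept and the third is dropped.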