Spotlight on Refiners
Date: 08. Nov 2017
One of the recurring support requests revolves around refiners, tags and the displayed counts. Proper operation of the refiners depends on their configuration and the data provider used.
Are Refiners not showing the correct count?
In this blog post, we look at the refiner dynamics "behind the scenes", so you will know when a 'bad number count' can occur. We also explain how to address the situation.
A. Choosing the right data provider
1. ListDataProvider
This data provider processes all items in memory (incl. tag splitting and item count). The result is a correct item count in refiners, but the execution is potentially slow.
Let's assume the following scenario (using two Word documents with a combination of geographic tags from Switzerland):
After saving these documents, the documents' tag string contains the following data (the tag ids are examples only and may obviously differ in your environment):
Document A: ((df2)(df4)(df5)) ((df2)(df4)(df6))
(Or: Basel-Stadt (df5) and Basel-Landschaft (df6) with their respective parent tags: "Switzerland" (df4) and "Europe" (df2))
Document B: ((df2)(df4)(df5)) ((df2)(df4)(df7)) ((df2)(df4)(df8))
(Or: Basel-Stadt (df5), Luzern (df7), and Tessin (df8) with their respective parent tags: "Switzerland" (df4) and "Europe" (df2))
This is the search result when using the HierarchicalTagRenderer
The same search result, but with a FlatTagRenderer
Refined items vs search results
One of the common misunderstandings is to add up all the refined item counts (numbers in parentheses) and expect the total to match the number of returned search results. Here, we have two search results (documents) with a total occurrence count of nine refined items ("Location" tags).
The takeaway from this example: The tag occurrence count in the refiner reflects the number of tags found within the search result. The sum of the occurrence counts does not need to equal the number of returned search items (here: documents).
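The counting above can be reproduced with a small Python sketch. It is not MatchPoint code: the helper name tags_in is hypothetical, and it assumes that every tag (parents included) counts once per document. It only illustrates why the refiner counts add up to nine while the search returns two documents.

```python
import re
from collections import Counter

# Tag strings copied from the example above (the tag ids are examples only).
documents = {
    "Document A": "((df2)(df4)(df5)) ((df2)(df4)(df6))",
    "Document B": "((df2)(df4)(df5)) ((df2)(df4)(df7)) ((df2)(df4)(df8))",
}

def tags_in(tag_string):
    """Return the distinct tag ids (parent tags included) found in one tag string."""
    return set(re.findall(r"\(([^()]+)\)", tag_string))

# Assumption: a document contributes at most 1 to the count of each tag.
counts = Counter()
for tag_string in documents.values():
    counts.update(tags_in(tag_string))

for tag_id, count in sorted(counts.items()):
    print(tag_id, count)

print("search results:", len(documents))
print("sum of refiner counts:", sum(counts.values()))
```

Europe (df2), Switzerland (df4) and Basel-Stadt (df5) each get a count of 2, the remaining three tags a count of 1, which gives a sum of nine for two documents.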
2. SearchDataProvider without Content Enrichment Web Service (CEWS)
The refinement is done by using the "Refinement Result" returned from the SharePoint Search Service Application, calculated on the MATCHPOINTTAGS Managed Property. This "Refinement Result" returns the 100 most frequently used unique tag strings.
For the sake of this example, we are using a higher number of documents, to which we applied our Swiss location tags. This may lead to the following hypothetical results from the SharePoint Search:
((df2)(df4)(df5)) ((df2)(df4)(df6)) (6453)
((df2)(df4)(df6)) ((df2)(df4)(df5)) (4877)
((df2)(df4)(df5)) ((df2)(df4)(df6)) ((df2)(df4)(df8)) (3405)
... 96 other results
((df2)(df4)(df5)) ((df2)(df4)(df7)) ((df2)(df4)(df8)) (452)
As you see from the above example, sort order matters!
Which means:
6453 items contain both tags Basel-Stadt and Basel-Landschaft (in that order!)
4877 items contain both tags Basel-Landschaft and Basel-Stadt (in that order!)
3405 items contain the three tags Basel-Stadt, Basel-Landschaft and Tessin
...
452 items contain the three tags Basel-Stadt, Luzern, and Tessin
MatchPoint loads these 100 refinement result rows into memory and splits them into individual tags to build a tree structure with the total counts. In complex tagging scenarios, this can quickly lead to wrong counts, since the 100 most common tag strings (= sorted tag combinations) represent only a small subset of the total items stored in the environment. Moreover, the in-memory processing of the raw data can be rather slow.
The results of the calculation mentioned above might lead to the following tree display:
Europe [df2]: (6453 + 4877 + 3405 + 452)
. Switzerland [df4]: (6453 + 4877 + 3405 + 452)
.. Basel-Stadt [df5]: (6453 + 4877 + 3405 + 452)
.. Basel-Landschaft [df6]: (6453 + 4877 + 3405)
.. Luzern [df7]: (452)
.. Tessin [df8]: (3405 + 452)
The takeaway from this example: Many tag strings (= tag combinations) are relatively rare, but in sum they can play a significant role that is not reflected in the displayed counts (because only the "top 100" results are returned).
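The aggregation described in this section can be sketched in Python as follows. This is an illustration only, not the MatchPoint implementation: the function name aggregate is hypothetical, and only the four refinement rows spelled out in the example are used (the 96 elided rows are left out).

```python
import re
from collections import Counter

# The four refinement rows from the hypothetical example: (tag string, item count).
rows = [
    ("((df2)(df4)(df5)) ((df2)(df4)(df6))", 6453),
    ("((df2)(df4)(df6)) ((df2)(df4)(df5))", 4877),
    ("((df2)(df4)(df5)) ((df2)(df4)(df6)) ((df2)(df4)(df8))", 3405),
    ("((df2)(df4)(df5)) ((df2)(df4)(df7)) ((df2)(df4)(df8))", 452),
]

def aggregate(rows):
    """Split each returned tag string and add its item count to every tag it contains."""
    totals = Counter()
    for tag_string, item_count in rows:
        # set(): a tag appearing in several paths of one row is counted once per row.
        for tag_id in set(re.findall(r"\(([^()]+)\)", tag_string)):
            totals[tag_id] += item_count
    return totals

totals = aggregate(rows)
for tag_id, total in sorted(totals.items()):
    print(tag_id, total)
```

Any tag combination that does not make it into the returned rows never reaches this loop, so its items are missing from every total. That is exactly how the displayed counts end up too low.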
3. SearchDataProvider with Content Enrichment Web Service (CEWS)
When using the Content Enrichment Web Service (CEWS), the tags are split after the search crawl and saved as multi-value entries in the search index. Due to this small change in the index representation, the SharePoint search engine has enough semantic information to "understand" the values and calculate correct refiner values. This means the index does not contain the string "((df2)(df4)(df5)) ((df2)(df4)(df6))", but an array of strings ["((df2)(df4)(df5))","((df2)(df4)(df6))"].
The SharePoint search refiners are built from this data, which in turn means the refiner count is always correct (as of crawl time) and execution is fast. Moreover, because this integrates fully with the SharePoint refinement process, tag refiners can also be used in SharePoint OOB refiners on SharePoint OOB search pages!
The takeaway from this example: When using the CEWS, the "top 100" results that MatchPoint obtains from the SharePoint search do indeed contain the correct 100 most frequently occurring tags at the time of the crawl.
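The change in index representation can be illustrated with a short Python sketch. It is not the CEWS implementation (the real service is called by SharePoint during content processing), and the function name split_tag_string is hypothetical. It only shows how one tag string becomes an array of values, and why the order of the tags stops mattering once each value is counted on its own.

```python
import re
from collections import Counter

def split_tag_string(tag_string):
    """Split a tag string into one entry per tag path, e.g. '((df2)(df4)(df5))'."""
    return re.findall(r"\((?:\([^()]+\))+\)", tag_string)

# The same two tags, saved in a different order on two items.
item_1 = split_tag_string("((df2)(df4)(df5)) ((df2)(df4)(df6))")
item_2 = split_tag_string("((df2)(df4)(df6)) ((df2)(df4)(df5))")
print(item_1)

# Counting per value: both items contribute to the same two entries.
value_counts = Counter(item_1 + item_2)
for value, count in sorted(value_counts.items()):
    print(value, count)
```

With multi-value entries, the two orderings no longer produce two different refiner rows; each tag path is simply counted once per item.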
B. Choosing the right configuration approach (NativeRefinementSettings)
The configuration of the refinement web part should contain refinement settings with EnableNativeRefinement enabled.
For this to work, the fields specified in the refinement configuration must be marked as refinable (see "Managed Properties" in the SharePoint Central Administration).
FieldSetting
The NativeRefinementSettings / FieldSetting allows the configuration of refiner parameters on a per-field basis. The possible parameters and settings are explained in the Microsoft document Query Refinement in SharePoint. As the Microsoft documentation (and the MatchPoint configuration page) point out, a refiner parameter of "filter=200" would allow you to increase the number of refiner results on a per-field basis.
TagSetting
The NativeRefinementSettings / TagSetting is a convenience wrapper that has the same functionality as the FieldSetting - with the field name parameter being automatically set.
The takeaway from this example: All the above configuration parameters are explained both in detail and directly within the configuration console. This case also points out the SharePoint OOB limitation of 100 refinement results and how to overcome this limitation.