Dutch breaker / stemmer issue for SQL Server 2012 (SOLVED)
Dutch breaker / stemmer issue for SQL Server 2012 (SOLVED)
Problem
Summary
Faceted search breaks and stems search terms issued by the user, performing the actual search with the search terms entered by the user plus terms resulted from breaking and stemming. Faceted search uses the built-in support of SQL-Server for stemming and breaking. Unfortunately the Dutch breaker and stemmer, although breaking, does not correctly stem. This is a defect that has been confirmed by Microsoft. The defect was not present in SQL-Server 2008 and SQL-Server 2008 R2, and is present in SQL-Server 2012.
Effect
End users may notice that less pages are found. For example when searching for "lopen" the stemmer would normally produce stemmed terms such as "lopende", "loopt", "liepen", and several more. A subsequent search results in more pages.
Note that breaking is operational. For example when searching for "beeldvorming" the breaker produces "beeldvorming", "beeld", and "vorming". A subsequent search results in pages as expected.
Details
The result of breaking and stemming can be investigated using the following SQL-Server query.
SELECT * FROM sys.dm_fts_parser('FORMSOF(INFLECTIONAL, "lopen")', 1043, 0, 0);
1043 identifying the Dutch language.
Resolution
Fix: adjust a SQL-Server configuration in the registry
- On the database server open the registry editor
- Navigate to HKEY_LOCAL_MACHINE/SOFTWARE/Microsoft/Microsoft SQL Server/[YourServerInstance]/MSSearch/CLSID
- Add a new key with name {7c83a911-f24b-45f7-92b8-348484c33ba5}
- Set the (Default) value to MsWb7.dll
- Navigate to HKEY_LOCAL_MACHINE/SOFTWARE/Microsoft/Microsoft SQL Server/[YourServerInstance]/MSSearch/Language/nld
- Change the value for StemmerClass to {7c83a911-f24b-45f7-92b8-348484c33ba5}
- In SQL-Server Management studio execute the following TSQL to get the updated registry value:
EXEC sp_fulltext_service @action='load_os_resources', @value=1;
EXEC sp_fulltext_service 'update_languages';
EXEC sp_fulltext_service 'restart_all_fdhosts';
Verify:
SELECT * FROM sys.dm_fts_parser('FORMSOF(INFLECTIONAL, "rood")', 1043, 0, 0);
SELECT * FROM sys.dm_fts_parser('FORMSOF(INFLECTIONAL, "beeldvorming")', 1043, 0, 0);
- Yields 15 and 7 results on SQL-Server 2008 and 2008 R2
- Yields 1 and 3 results on SQL-Server 2012 before fixing
- Yields 15 and 14 results on SQL-Server 2012 after fixing
Earlier workaround: revert to SQL-Server 2008
Reverting to SQL-Server 2008 or SQL-Server 2008 R2 restores the original correct behavior.
Earlier attempted workaround: only revert the 2008 stemmer and breaker
A suggested workaround to continue using SQL-Server 2012 while reverting to the SQL-Server 2008 stemmer and breaker does not work. This workaround involves replacing NaturalLanguage6.dll by the SQL-Server 2008 version, and taking a few additional measures. Microsoft confirmed this is not a feasible workaround.
- Features
-
Add-On Modules
- SmartInstant
- Electronic front office "E-loket"
- Enforcing Cookie Policies
-
Faceted Search
- Requirements and deploying Faceted Search
- Introducing Faceted Search
- Using Faceted Search
- Faceted Search concepts
- Customizing your Faceted Search
- Reference
-
Knowledge Base
- Existing SearchGroups or LinkedDataSearchGroups CTFPs will be removed
- Upgrading alpha version Faceted Search
- How to customize result highlighting in Faceted Search
- Run stored procedure pr_tsFacetedSearch after adding/changing physical field weights
- Skin CSS gets mangled after saving
- Faceted Search Full-text vs Native
- Implementing exact matching of search term(s) for native queries
- Query XML Search Field Weights implementation adjustment
- Slides knowledge session (October 2010)
- Smartsite Management Information (SMI)
- Search Engine Optimization (SEO) Toolkit
- Web Toolkit
- Personalization
- DigiD Module
- Ogone Payment Module
- Emergency redirect