Stoplists and noise word lists
Introduction
All versions of SQL Server that have supported full-text search have included an option to ignore certain words that are considered unimportant for search purposes. In SQL Server 2005 and earlier, this option was known as the noise word list. Noise word lists were language-specific text files stored in the file system. In SQL Server 2008, noise word lists have been replaced by stoplists, which are likewise composed of words, known as stopwords, that are unimportant for search. Unlike noise word lists, stoplists are stored in the database rather than in the file system. SQL Server 2008 also provides T-SQL language improvements designed to make stoplists more flexible than noise word lists and to facilitate and ease their management.
The reasons for creating or using a stoplist include the following:
- To prevent your full-text indexes from becoming bloated with unnecessary and unimportant words
- To improve the quality of full-text search results by eliminating unimportant word matches from the results
- To increase the efficiency of full-text searches by eliminating searching and stemming of irrelevant words
SQL Server 2005 and prior
In SQL Server 2005 and prior versions, FTS utilizes noise word lists to indicate words that should be ignored during full-text searches. These noise word lists are stored as plain text files under SQL Server’s MSSQL\FTDATA directory with names like noiseXXX.txt, where the XXX indicates a three-letter language code such as ENU for U.S. English (LCID 1033) and NLD for Dutch (LCID 1043). In prior versions of SQL Server, you were allowed a single noise word list per supported language.
SQL Server 2008 (and beyond)
SQL Server 2008 changes this with stoplists. Stoplists are stored in the database instead of the file system, and you can create as many stoplists as you like for any supported language. The only limitation is that you can assign only one stoplist to any given full-text index.
System Stoplists
By default, SQL Server 2008 provides system stoplists for most languages that iFTS supports. You can view the system stoplists by querying the sys.fulltext_system_stopwords catalog view. This view returns a column with the stopword entries from the system stoplists and a language_id column with the LCID of each entry.
SQL |
![]() |
---|---|
/* Query the Dutch Language System Stoplist */ SELECT * FROM sys.fulltext_system_stopwords WHERE language_id = 1043; |
Creating Custom Stoplists
You can create and manage custom stoplists in SQL Server 2008, using either the SSMS GUI or T-SQL statements. When you create a custom stoplist, you can either create an empty stoplist or base it on a system - or an existing full-text stoplist. To create a custom stoplist in the SSMS GUI, right-click on Your database (e.g. EmptySix) ➤ Storage ➤ Full Text Stoplists in the Object Explorer. Then select New Full-Text Stoplist... and follow the wizard. The owner of the stoplist is most likely dbo.
SQL |
![]() |
---|---|
/* Create custom stoplist using T-SQL */ CREATE FULLTEXT STOPLIST EmptySix_Stoplist FROM SYSTEM STOPLIST /* or an existing custom stoplist; omit this line to create an empty stoplist */ AUTHORIZATION dbo; |
Managing Custom Stoplists
After you’ve created a custom stoplist, you can add or remove stopwords from the stoplist by right-clicking the name of the stoplist in the Object Explorer and selecting Properties from the pop-up context menu. Note: you can manage stopwords for any specific language (or for all languages) within a single full-text stoplist per database.
SQL |
![]() |
---|---|
/* Manage custom stoplist using T-SQL */ ALTER FULLTEXT STOPLIST EmptySix_Stoplist ADD 'mijn_stopwoord' LANGUAGE 1043; ALTER FULLTEXT STOPLIST EmptySix_Stoplist ADD 'my_stopword' LANGUAGE 1033; ALTER FULLTEXT STOPLIST EmptySix_Stoplist DROP 'mijn_stopwoord' LANGUAGE 1043; |
Once you’ve created a custom stoplist, you can use the SSMS Create Full-Text Index wizard to create a new full-text index that uses it. You can also assign it to an existing full-text index by right-clicking on the full-text indexed table and selecting Full-Text index ➤ Properties. In the Full-Text Index Properties window, you can assign a custom stoplist to the full-text index. Note: if your full-text index is set for automatic change tracking, changing the stoplist will kick off a full population.
SQL |
![]() |
---|---|
/* Assign custom stoplist to an existing full-text index using T-SQL */ ALTER FULLTEXT INDEX ON vwtsContentPUB SET STOPLIST = EmptySix_Stoplist; |
SQL |
![]() |
---|---|
/* Unassign any stoplist to an existing full-text index using T-SQL */ ALTER FULLTEXT INDEX ON vwtsContentPUB SET STOPLIST = OFF; |
SQL Server 2008 provides three catalog views that are useful for retrieving the contents of system and custom stoplists:
SQL |
![]() |
---|---|
SELECT * FROM sys.fulltext_system_stopwords; SELECT * FROM sys.fulltext_stoplists; SELECT * FROM sys.fulltext_stopwords; |
Determining Stoplist
In order to determine which stoplist (if any) is used for a full-text index, one could execute the following query (SQL Server 2008 only):
SQL |
![]() |
---|---|
SELECT object_name(object_id) AS IndexedObject, CASE ISNULL(fi.stoplist_id, -1) WHEN 0 THEN 'System Stoplist' WHEN -1 THEN 'No Stoplist' ELSE fs.name END AS UsedStoplist FROM sys.fulltext_indexes fi LEFT JOIN sys.fulltext_stoplists fs ON fi.stoplist_id = fs.stoplist_id; |
Upgrading Noise Word Lists to Stoplists
When you upgrade a full-text catalog from a SQL Server 2005 database, SQL Server doesn’t upgrade your old noise word lists automatically. It will copy the noise word lists to the MSSQL\FTData\FTNoiseThesaurusBak subdirectory under your SQL Server 2008 installation directory, but that’s about it. If you only used the default system noise word lists in SQL Server 2005, you won’t need to upgrade your old noise word lists, since the SQL Server 2008 system stoplists have equivalent content.
If you customized your old noise word lists, however, you’ll need to create a custom stoplist and import the old noise words into the stoplist. SQL Server doesn’t provide a standard utility to perform this function, but it’s easy enough to create one with the OPENROWSET rowset provider.
Stoplists and Indexing
When you create or populate a full-text index, SQL Server uses the stoplist associated with the index to ignore instances of the stopwords during full-text indexing. Basically, iFTS won’t index any stopwords it encounters during the indexing process. SQL Server will only ignore exact matches of the stopwords, however.
For example, when you add fish as a stopword, a search query on words like fishes, fished or fishing will still yield results, while only ignoring the literal word fish.
Noise file
The location of the specified noise file is set in the registry: HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\MSSearch\Language\nld ➤ NoiseFile.
After editing the noise file, rebuild the catalog (*Note, if you have a large catalog, do this off-hours as it will impact performance):
SQL |
![]() |
---|---|
/* Rebuild the catalog after editing noise file(s) */ ALTER FULLTEXT CATALOG SmsSearch REBUILD WITH ACCENT_SENSITIVITY = OFF; |