From ca450a07eeee7b5a52336796edddce31c5f87ccd Mon Sep 17 00:00:00 2001
From: Tom Lane
Date: Wed, 14 Nov 2007 18:36:37 +0000
Subject: Add an Accept parameter to "simple" dictionaries.

The default of true gives the old behavior; selecting false allows the
dictionary to be used as a filter ahead of other dictionaries, because it
will pass on rather than accept words that aren't in its stopword list.

Jan Urbanski
---
 doc/src/sgml/textsearch.sgml | 37 ++++++++++++++++++++++++++++++++++---
 1 file changed, 34 insertions(+), 3 deletions(-)

(limited to 'doc/src/sgml')

diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 0ba401c2a43..31753791cda 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -1,4 +1,4 @@
-
+
 Full Text Search
@@ -2093,9 +2093,11 @@ SELECT ts_rank_cd (to_tsvector('english','list stop words'), to_tsquery('list &a
 The simple dictionary template operates by converting the
 input token to lower case and checking it against a file of stop words.
- If it is found in the file then NULL is returned, causing
+ If it is found in the file then an empty array is returned, causing
 the token to be discarded. If not, the lower-cased form of the word
- is returned as the normalized lexeme.
+ is returned as the normalized lexeme. Alternatively, the dictionary
+ can be configured to report non-stop-words as unrecognized, allowing
+ them to be passed on to the next dictionary in the list.
@@ -2138,6 +2140,35 @@ SELECT ts_lexize('public.simple_dict','The');
+
+ We can also choose to return NULL, instead of the lower-cased
+ word, if it is not found in the stop words file. This behavior is
+ selected by setting the dictionary's Accept parameter to
+ false. Continuing the example:
+
+
+ALTER TEXT SEARCH DICTIONARY public.simple_dict ( Accept = false );
+
+SELECT ts_lexize('public.simple_dict','YeS');
+ ts_lexize
+-----------
+
+
+SELECT ts_lexize('public.simple_dict','The');
+ ts_lexize
+-----------
+ {}
+
+
+
+
+ With the default setting of Accept = true,
+ it is only useful to place a simple dictionary at the end
+ of a list of dictionaries, since it will never pass on any token to
+ a following dictionary. Conversely, Accept = false
+ is only useful when there is at least one following dictionary.
+
+
 Most types of dictionaries rely on configuration files, such as files of
--
cgit v1.2.3
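
The three outcomes the patch distinguishes (NULL for "not recognized", an empty array for "stop word, discard", and a lexeme array for "accepted") can be illustrated with a small model. The sketch below is a toy simulation in Python, not PostgreSQL's implementation: None stands in for NULL, an empty list for the empty array, and the names make_simple_dict, run_chain, and toy_stemmer are hypothetical helpers invented for this illustration. It also assumes plain str.lower() in place of the server's own case folding.

```python
from typing import Callable, List, Optional, Sequence

# Toy model only. None models SQL NULL (token not recognized, try the next
# dictionary), [] models an empty array (stop word, discard the token), and
# a non-empty list models the normalized lexemes.
Lexize = Callable[[str], Optional[List[str]]]


def make_simple_dict(stopwords: Sequence[str], accept: bool = True) -> Lexize:
    """Build a stand-in for a "simple" dictionary (hypothetical helper)."""
    stop = {w.lower() for w in stopwords}

    def lexize(token: str) -> Optional[List[str]]:
        word = token.lower()
        if word in stop:
            return []  # stop word: recognized, but discarded
        # Accept = true keeps the lower-cased word; Accept = false reports
        # the token as unrecognized so a later dictionary can handle it.
        return [word] if accept else None

    return lexize


def run_chain(dictionaries: Sequence[Lexize], token: str) -> Optional[List[str]]:
    """Consult dictionaries in order; the first non-None answer wins."""
    for lexize in dictionaries:
        result = lexize(token)
        if result is not None:
            return result
    return None


def toy_stemmer(token: str) -> Optional[List[str]]:
    """Hypothetical follow-on dictionary: strips one trailing 's'."""
    word = token.lower()
    if len(word) > 1 and word.endswith("s"):
        return [word[:-1]]
    return [word]


accepting = make_simple_dict(["the"], accept=True)
filtering = make_simple_dict(["the"], accept=False)

print(accepting("YeS"), accepting("The"))
print(filtering("YeS"), filtering("The"))
print(run_chain([filtering, toy_stemmer], "Words"))
print(run_chain([accepting, toy_stemmer], "Words"))
```

In this model the accepting dictionary never returns None, so anything listed after it is unreachable, which mirrors the patch's remark that Accept = true is only useful at the end of a dictionary list. The filtering variant only makes sense when something follows it in the chain.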