| author | Tom Lane <tgl@sss.pgh.pa.us> | 2007-11-14 23:43:27 +0000 |
|---|---|---|
| committer | Tom Lane <tgl@sss.pgh.pa.us> | 2007-11-14 23:43:27 +0000 |
| commit | 866bad9543897291319d0a309dbddeb9ea8808ac | |
| tree | a21e5743d19f3cc3104ff39085a0cdd8a9b840d1 /doc/src/sgml | |
| parent | 5858990f8793881144f0c113f49493861c6c3004 | |
Add a rank/(rank+1) normalization option to ts_rank(). While the usefulness
of this seems a bit marginal, if it's useful enough to be shown in the manual
then we probably ought to support doing it without double evaluation of the
ts_rank function. Per my proposal earlier today.
Diffstat (limited to 'doc/src/sgml')
| -rw-r--r-- | doc/src/sgml/textsearch.sgml | 22 |
1 files changed, 15 insertions, 7 deletions
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 31753791cda..9366fdd2407 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.33 2007/11/14 18:36:37 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.34 2007/11/14 23:43:27 tgl Exp $ -->
 
 <chapter id="textsearch">
  <title id="textsearch-title">Full Text Search</title>
@@ -940,6 +940,7 @@ SELECT plainto_tsquery('english', 'The Fat & Rats:C');
      <listitem>
       <para>
        4 divides the rank by the mean harmonic distance between extents
+       (this is implemented only by <function>ts_rank_cd</>)
       </para>
      </listitem>
      <listitem>
@@ -953,17 +954,24 @@ SELECT plainto_tsquery('english', 'The Fat & Rats:C');
        of unique words in document
       </para>
      </listitem>
+     <listitem>
+      <para>
+       32 divides the rank by itself + 1
+      </para>
+     </listitem>
     </itemizedlist>
 
+    If more than one flag bit is specified, the transformations are
+    applied in the order listed.
    </para>
 
    <para>
     It is important to note that the ranking functions do not use any global
-    information so it is impossible to produce a fair normalization to 1% or
-    100%, as sometimes desired. However, a simple technique like
-    <literal>rank/(rank+1)</literal> can be applied. Of course, this is just
-    a cosmetic change, i.e., the ordering of the search results will not
-    change.
+    information, so it is impossible to produce a fair normalization to 1% or
+    100% as sometimes desired. Normalization option 32
+    (<literal>rank/(rank+1)</literal>) can be applied to scale all ranks
+    into the range zero to one, but of course this is just a cosmetic change;
+    it will not affect the ordering of the search results.
    </para>
 
    <para>
@@ -991,7 +999,7 @@ ORDER BY rank DESC LIMIT 10;
     This is the same example using normalized ranking:
 
 <programlisting>
-SELECT title, ts_rank_cd(textsearch, query)/(ts_rank_cd(textsearch, query) + 1) AS rank
+SELECT title, ts_rank_cd(textsearch, query, 32 /* rank/(rank+1) */ ) AS rank
 FROM apod, to_tsquery('neutrino|(dark & matter)') query
 WHERE query @@ textsearch
 ORDER BY rank DESC LIMIT 10;
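The patch describes option 32 as a cosmetic change that scales ranks into the range zero to one without reordering results. The following is a minimal Python sketch of that arithmetic only, not of the PostgreSQL implementation. The helper name `normalize_rank` and the sample rank values are hypothetical, and the sketch assumes ranks are non-negative.

```python
def normalize_rank(rank: float) -> float:
    """Apply the rank/(rank+1) transformation described for option 32.

    Assumes rank >= 0; this helper is illustrative and is not part of
    PostgreSQL.
    """
    if rank < 0:
        raise ValueError("rank must be non-negative")
    return rank / (rank + 1.0)


# Hypothetical raw rank values, chosen only for illustration.
raw_ranks = [0.0, 0.1, 1.0, 2.4, 50.0]
normalized = [normalize_rank(r) for r in raw_ranks]

# Indices of the documents sorted by descending rank, before and after.
order_raw = sorted(range(len(raw_ranks)),
                   key=lambda i: raw_ranks[i], reverse=True)
order_norm = sorted(range(len(normalized)),
                    key=lambda i: normalized[i], reverse=True)

# The transformation is strictly increasing for rank >= 0, so the
# ordering is the same and every value lies in [0, 1).
print(order_raw == order_norm)
print(all(0.0 <= n < 1.0 for n in normalized))
```

Because x/(x+1) is strictly increasing for non-negative x, sorting by the normalized value gives the same ordering as sorting by the raw rank, which is why the documentation calls the change cosmetic.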
