| author | Tom Lane <tgl@sss.pgh.pa.us> | 2007-11-14 23:43:27 +0000 | 
|---|---|---|
| committer | Tom Lane <tgl@sss.pgh.pa.us> | 2007-11-14 23:43:27 +0000 | 
| commit | 866bad9543897291319d0a309dbddeb9ea8808ac | |
| tree | a21e5743d19f3cc3104ff39085a0cdd8a9b840d1 /doc/src/sgml | |
| parent | 5858990f8793881144f0c113f49493861c6c3004 | |
Add a rank/(rank+1) normalization option to ts_rank().  While the usefulness
of this seems a bit marginal, if it's useful enough to be shown in the manual
then we probably ought to support doing it without double evaluation of the
ts_rank function.  Per my proposal earlier today.
Diffstat (limited to 'doc/src/sgml')
| -rw-r--r-- | doc/src/sgml/textsearch.sgml | 22 | 
1 files changed, 15 insertions, 7 deletions
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 31753791cda..9366fdd2407 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.33 2007/11/14 18:36:37 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.34 2007/11/14 23:43:27 tgl Exp $ -->
 
 <chapter id="textsearch">
  <title id="textsearch-title">Full Text Search</title>
@@ -940,6 +940,7 @@ SELECT plainto_tsquery('english', 'The Fat & Rats:C');
      <listitem>
       <para>
        4 divides the rank by the mean harmonic distance between extents
+       (this is implemented only by <function>ts_rank_cd</>)
       </para>
      </listitem>
      <listitem>
@@ -953,17 +954,24 @@ SELECT plainto_tsquery('english', 'The Fat & Rats:C');
        of unique words in document
       </para>
      </listitem>
+     <listitem>
+      <para>
+       32 divides the rank by itself + 1
+      </para>
+     </listitem>
     </itemizedlist>
 
+    If more than one flag bit is specified, the transformations are
+    applied in the order listed.
    </para>
 
    <para>
     It is important to note that the ranking functions do not use any global
-    information so it is impossible to produce a fair normalization to 1% or
-    100%, as sometimes desired.  However, a simple technique like
-    <literal>rank/(rank+1)</literal> can be applied.  Of course, this is just
-    a cosmetic change, i.e., the ordering of the search results will not
-    change.
+    information, so it is impossible to produce a fair normalization to 1% or
+    100% as sometimes desired.  Normalization option 32
+    (<literal>rank/(rank+1)</literal>) can be applied to scale all ranks
+    into the range zero to one, but of course this is just a cosmetic change;
+    it will not affect the ordering of the search results.
    </para>
 
    <para>
@@ -991,7 +999,7 @@ ORDER BY rank DESC LIMIT 10;
     This is the same example using normalized ranking:
 
 <programlisting>
-SELECT title, ts_rank_cd(textsearch, query)/(ts_rank_cd(textsearch, query) + 1) AS rank
+SELECT title, ts_rank_cd(textsearch, query, 32 /* rank/(rank+1) */ ) AS rank
 FROM apod, to_tsquery('neutrino|(dark & matter)') query
 WHERE  query @@ textsearch
 ORDER BY rank DESC LIMIT 10;
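
The revised paragraph says that option 32 scales ranks into the range zero to one without changing the result ordering. That can be checked with a small standalone model. This Python sketch is not PostgreSQL code: `normalize_32` is a hypothetical helper that only mirrors the documented formula rank/(rank+1), the sample ranks are made up, and ranks are assumed to be non-negative.

```python
def normalize_32(rank: float) -> float:
    """Hypothetical model of normalization option 32: rank / (rank + 1)."""
    if rank < 0:
        # Assumption: the ranking functions never return negative values.
        raise ValueError("rank is assumed to be non-negative")
    return rank / (rank + 1.0)


def descending_order(values):
    """Return the indexes of `values` sorted from highest to lowest."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)


# Made-up raw ranks, standing in for ts_rank_cd() results.
raw_ranks = [0.0, 0.05, 0.5, 2.0, 40.0]
scaled = [normalize_32(r) for r in raw_ranks]

for raw, norm in zip(raw_ranks, scaled):
    print(f"raw={raw:<6} normalized={norm:.4f}")

# The transformation is strictly increasing for rank >= 0, so sorting by
# the scaled value gives the same order as sorting by the raw value.
print("same ordering:", descending_order(raw_ranks) == descending_order(scaled))
```

Because x/(x+1) is strictly increasing for x >= 0 and always below 1, the scaled values are easier to present, but `ORDER BY rank DESC` returns the same rows in the same order.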
