diff options
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/hstore.sgml | 17 | ||||
-rw-r--r-- | doc/src/sgml/indices.sgml | 2 | ||||
-rw-r--r-- | doc/src/sgml/intarray.sgml | 25 | ||||
-rw-r--r-- | doc/src/sgml/ltree.sgml | 37 | ||||
-rw-r--r-- | doc/src/sgml/pgtrgm.sgml | 17 | ||||
-rw-r--r-- | doc/src/sgml/ref/create_index.sgml | 16 | ||||
-rw-r--r-- | doc/src/sgml/textsearch.sgml | 13 |
7 files changed, 113 insertions, 14 deletions
diff --git a/doc/src/sgml/hstore.sgml b/doc/src/sgml/hstore.sgml index 64c2477fffc..f1f2b08cd79 100644 --- a/doc/src/sgml/hstore.sgml +++ b/doc/src/sgml/hstore.sgml @@ -468,6 +468,23 @@ CREATE INDEX hidx ON testhstore USING GIN (h); </programlisting> <para> + <literal>gist_hstore_ops</literal> GiST opclass approximates set of + key/value pairs as a bitmap signature. Optional integer parameter + <literal>siglen</literal> of <literal>gist_hstore_ops</literal> determines + signature length in bytes. Default signature length is 16 bytes. + Valid values of signature length are between 1 and 2024 bytes. Longer + signatures leads to more precise search (scan less fraction of index, scan + less heap pages), but larger index. + </para> + + <para> + Example of creating such an index with a signature length of 32 bytes: + </para> +<programlisting> + CREATE INDEX hidx ON testhstore USING GIST (h gist_hstore_ops(siglen=32)); +</programlisting> + + <para> <type>hstore</type> also supports <type>btree</type> or <type>hash</type> indexes for the <literal>=</literal> operator. This allows <type>hstore</type> columns to be declared <literal>UNIQUE</literal>, or to be used in <literal>GROUP BY</literal>, diff --git a/doc/src/sgml/indices.sgml b/doc/src/sgml/indices.sgml index 86539a781c5..1be209a2fe7 100644 --- a/doc/src/sgml/indices.sgml +++ b/doc/src/sgml/indices.sgml @@ -1316,7 +1316,7 @@ SELECT target FROM tests WHERE subject = 'some-subject' AND success; An index definition can specify an <firstterm>operator class</firstterm> for each column of an index. <synopsis> -CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> (<replaceable>column</replaceable> <replaceable>opclass</replaceable> <optional><replaceable>sort options</replaceable></optional> <optional>, ...</optional>); +CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> (<replaceable>column</replaceable> <replaceable>opclass</replaceable> [ ( <replaceable>opclass_options</replaceable> ) ] <optional><replaceable>sort options</replaceable></optional> <optional>, ...</optional>); </synopsis> The operator class identifies the operators to be used by the index for that column. For example, a B-tree index on the type <type>int4</type> diff --git a/doc/src/sgml/intarray.sgml b/doc/src/sgml/intarray.sgml index 025cbca616e..72b4b23c158 100644 --- a/doc/src/sgml/intarray.sgml +++ b/doc/src/sgml/intarray.sgml @@ -265,7 +265,7 @@ </para> <para> - Two GiST index operator classes are provided: + Two parametrized GiST index operator classes are provided: <literal>gist__int_ops</literal> (used by default) is suitable for small- to medium-size data sets, while <literal>gist__intbig_ops</literal> uses a larger signature and is more @@ -274,6 +274,25 @@ The implementation uses an RD-tree data structure with built-in lossy compression. </para> + + <para> + <literal>gist__int_ops</literal> approximates integer set as an array of + integer ranges. Optional integer parameter <literal>numranges</literal> of + <literal>gist__int_ops</literal> determines maximum number of ranges in + one index key. Default value of <literal>numranges</literal> is 100. + Valid values are between 1 and 253. Using larger arrays as GiST index + keys leads to more precise search (scan less fraction of index, scan less + heap pages), but larger index. + </para> + + <para> + <literal>gist__intbig_ops</literal> approximates integer set as a bitmap + signature. Optional integer parameter <literal>siglen</literal> of + <literal>gist__intbig_ops</literal> determines signature length in bytes. + Default signature length is 16 bytes. Valid values of signature length + are between 1 and 2024 bytes. Longer signatures leads to more precise + search (scan less fraction of index, scan less heap pages), but larger index. + </para> <para> There is also a non-default GIN operator class @@ -293,8 +312,8 @@ -- a message can be in one or more <quote>sections</quote> CREATE TABLE message (mid INT PRIMARY KEY, sections INT[], ...); --- create specialized index -CREATE INDEX message_rdtree_idx ON message USING GIST (sections gist__int_ops); +-- create specialized index with sigature length of 32 bytes +CREATE INDEX message_rdtree_idx ON message USING GIST (sections gist__int_ops(siglen=32)); -- select messages in section 1 OR 2 - OVERLAP operator SELECT message.mid FROM message WHERE message.sections && '{1,2}'; diff --git a/doc/src/sgml/ltree.sgml b/doc/src/sgml/ltree.sgml index 2d539f23fd8..ae4b33ec85e 100644 --- a/doc/src/sgml/ltree.sgml +++ b/doc/src/sgml/ltree.sgml @@ -498,32 +498,61 @@ Europe & Russia*@ & !Transportation </listitem> <listitem> <para> - GiST index over <type>ltree</type>: + GiST index over <type>ltree</type> (<literal>gist_ltree_ops</literal> + opclass): <literal><</literal>, <literal><=</literal>, <literal>=</literal>, <literal>>=</literal>, <literal>></literal>, <literal>@></literal>, <literal><@</literal>, <literal>@</literal>, <literal>~</literal>, <literal>?</literal> </para> <para> - Example of creating such an index: + <literal>gist_ltree_ops</literal> GiST opclass approximates set of + path labels as a bitmap signature. Optional integer parameter + <literal>siglen</literal> of <literal>gist_ltree_ops</literal> determines + signature length in bytes. Default signature length is 8 bytes. + Valid values of signature length are between 1 and 2024 bytes. Longer + signatures leads to more precise search (scan less fraction of index, scan + less heap pages), but larger index. + </para> + <para> + Example of creating such an index with a default signature length of 8 bytes: </para> <programlisting> CREATE INDEX path_gist_idx ON test USING GIST (path); </programlisting> + <para> + Example of creating such an index with a signature length of 100 bytes: + </para> +<programlisting> +CREATE INDEX path_gist_idx ON test USING GIST (path gist_ltree_ops(siglen=100)); +</programlisting> </listitem> <listitem> <para> - GiST index over <type>ltree[]</type>: + GiST index over <type>ltree[]</type> (<literal>gist__ltree_ops</literal> + opclass): <literal>ltree[] <@ ltree</literal>, <literal>ltree @> ltree[]</literal>, <literal>@</literal>, <literal>~</literal>, <literal>?</literal> </para> <para> - Example of creating such an index: + <literal>gist__ltree_ops</literal> GiST opclass works similar to + <literal>gist_ltree_ops</literal> and also takes signature length as + a parameter. Default value of <literal>siglen</literal> in + <literal>gist__ltree_ops</literal> is 28 bytes. + </para> + <para> + Example of creating such an index with a default signature length of 28 bytes: </para> <programlisting> CREATE INDEX path_gist_idx ON test USING GIST (array_path); </programlisting> <para> + Example of creating such an index with a signature length of 100 bytes: + </para> +<programlisting> +CREATE INDEX path_gist_idx ON test USING GIST (array_path gist__ltree_ops(siglen=100)); +</programlisting> + <para> Note: This index type is lossy. </para> </listitem> diff --git a/doc/src/sgml/pgtrgm.sgml b/doc/src/sgml/pgtrgm.sgml index 049f496869c..dde02634ae4 100644 --- a/doc/src/sgml/pgtrgm.sgml +++ b/doc/src/sgml/pgtrgm.sgml @@ -391,6 +391,23 @@ CREATE INDEX trgm_idx ON test_trgm USING GIN (t gin_trgm_ops); </para> <para> + <literal>gist_trgm_ops</literal> GiST opclass approximates set of + trigrams as a bitmap signature. Optional integer parameter + <literal>siglen</literal> of <literal>gist_trgm_ops</literal> determines + signature length in bytes. Default signature length is 12 bytes. + Valid values of signature length are between 1 and 2024 bytes. Longer + signatures leads to more precise search (scan less fraction of index, scan + less heap pages), but larger index. + </para> + + <para> + Example of creating such an index with a signature length of 32 bytes: + </para> +<programlisting> +CREATE INDEX trgm_idx ON test_trgm USING GIST (t gist_trgm_ops(siglen=32)); +</programlisting> + + <para> At this point, you will have an index on the <structfield>t</structfield> column that you can use for similarity searching. A typical query is <programlisting> diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml index f0fe6fb874b..3f902dcf84f 100644 --- a/doc/src/sgml/ref/create_index.sgml +++ b/doc/src/sgml/ref/create_index.sgml @@ -22,7 +22,7 @@ PostgreSQL documentation <refsynopsisdiv> <synopsis> CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class="parameter">name</replaceable> ] ON [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ USING <replaceable class="parameter">method</replaceable> ] - ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] ) + ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] { <replaceable class="parameter">opclass</replaceable> | DEFAULT } [ ( <replaceable class="parameter">opclass_parameter</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] ) ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] ) [ INCLUDE ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ] [ WITH ( <replaceable class="parameter">storage_parameter</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] ) ] [ TABLESPACE <replaceable class="parameter">tablespace_name</replaceable> ] @@ -286,6 +286,15 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class= </varlistentry> <varlistentry> + <term><replaceable class="parameter">opclass_parameter</replaceable></term> + <listitem> + <para> + The name of an operator class parameter. See below for details. + </para> + </listitem> + </varlistentry> + + <varlistentry> <term><literal>ASC</literal></term> <listitem> <para> @@ -679,8 +688,9 @@ Indexes: </para> <para> - An <firstterm>operator class</firstterm> can be specified for each - column of an index. The operator class identifies the operators to be + An <firstterm>operator class</firstterm> with its optional parameters + can be specified for each column of an index. + The operator class identifies the operators to be used by the index for that column. For example, a B-tree index on four-byte integers would use the <literal>int4_ops</literal> class; this operator class includes comparison functions for four-byte diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml index c5b254c7ca9..2217fcd6c2f 100644 --- a/doc/src/sgml/textsearch.sgml +++ b/doc/src/sgml/textsearch.sgml @@ -3637,7 +3637,7 @@ SELECT plainto_tsquery('supernovae stars'); <tertiary>text search</tertiary> </indexterm> - <literal>CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING GIST (<replaceable>column</replaceable>);</literal> + <literal>CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING GIST (<replaceable>column</replaceable> [ { DEFAULT | tsvector_ops } (siglen = <replaceable>number</replaceable>) ] );</literal> </term> <listitem> @@ -3645,6 +3645,8 @@ SELECT plainto_tsquery('supernovae stars'); Creates a GiST (Generalized Search Tree)-based index. The <replaceable>column</replaceable> can be of <type>tsvector</type> or <type>tsquery</type> type. + Optional integer parameter <literal>siglen</literal> determines + signature length in bytes (see below for details). </para> </listitem> </varlistentry> @@ -3668,12 +3670,17 @@ SELECT plainto_tsquery('supernovae stars'); to check the actual table row to eliminate such false matches. (<productname>PostgreSQL</productname> does this automatically when needed.) GiST indexes are lossy because each document is represented in the - index by a fixed-length signature. The signature is generated by hashing + index by a fixed-length signature. Signature length in bytes is determined + by the value of the optional integer parameter <literal>siglen</literal>. + Default signature length (when <literal>siglen</literal> is not specied) is + 124 bytes, maximal length is 2024 bytes. The signature is generated by hashing each word into a single bit in an n-bit string, with all these bits OR-ed together to produce an n-bit document signature. When two words hash to the same bit position there will be a false match. If all words in the query have matches (real or false) then the table row must be - retrieved to see if the match is correct. + retrieved to see if the match is correct. Longer signatures leads to more + precise search (scan less fraction of index, scan less heap pages), but + larger index. </para> <para> |