diff options
Diffstat (limited to 'doc/src')
-rw-r--r-- | doc/src/sgml/btree.sgml | 267 | ||||
-rw-r--r-- | doc/src/sgml/filelist.sgml | 1 | ||||
-rw-r--r-- | doc/src/sgml/postgres.sgml | 1 | ||||
-rw-r--r-- | doc/src/sgml/xindex.sgml | 15 |
4 files changed, 276 insertions, 8 deletions
diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml new file mode 100644 index 00000000000..9f39edc742d --- /dev/null +++ b/doc/src/sgml/btree.sgml @@ -0,0 +1,267 @@ +<!-- doc/src/sgml/btree.sgml --> + +<chapter id="btree"> +<title>B-Tree Indexes</title> + + <indexterm> + <primary>index</primary> + <secondary>B-Tree</secondary> + </indexterm> + +<sect1 id="btree-intro"> + <title>Introduction</title> + + <para> + <productname>PostgreSQL</productname> includes an implementation of the + standard <acronym>btree</acronym> (multi-way binary tree) index data + structure. Any data type that can be sorted into a well-defined linear + order can be indexed by a btree index. The only limitation is that an + index entry cannot exceed approximately one-third of a page (after TOAST + compression, if applicable). + </para> + + <para> + Because each btree operator class imposes a sort order on its data type, + btree operator classes (or, really, operator families) have come to be + used as <productname>PostgreSQL</productname>'s general representation + and understanding of sorting semantics. Therefore, they've acquired + some features that go beyond what would be needed just to support btree + indexes, and parts of the system that are quite distant from the + btree AM make use of them. + </para> + +</sect1> + +<sect1 id="btree-behavior"> + <title>Behavior of B-Tree Operator Classes</title> + + <para> + As shown in <xref linkend="xindex-btree-strat-table"/>, a btree operator + class must provide five comparison operators, + <literal><</literal>, + <literal><=</literal>, + <literal>=</literal>, + <literal>>=</literal> and + <literal>></literal>. + One might expect that <literal><></literal> should also be part of + the operator class, but it is not, because it would almost never be + useful to use a <literal><></literal> WHERE clause in an index + search. (For some purposes, the planner treats <literal><></literal> + as associated with a btree operator class; but it finds that operator via + the <literal>=</literal> operator's negator link, rather than + from <structname>pg_amop</structname>.) + </para> + + <para> + When several data types share near-identical sorting semantics, their + operator classes can be grouped into an operator family. Doing so is + advantageous because it allows the planner to make deductions about + cross-type comparisons. Each operator class within the family should + contain the single-type operators (and associated support functions) + for its input data type, while cross-type comparison operators and + support functions are <quote>loose</quote> in the family. It is + recommendable that a complete set of cross-type operators be included + in the family, thus ensuring that the planner can represent any + comparison conditions that it deduces from transitivity. + </para> + + <para> + There are some basic assumptions that a btree operator family must + satisfy: + </para> + + <itemizedlist> + <listitem> + <para> + An <literal>=</literal> operator must be an equivalence relation; that + is, for all non-null values <replaceable>A</replaceable>, + <replaceable>B</replaceable>, <replaceable>C</replaceable> of the + data type: + + <itemizedlist> + <listitem> + <para> + <replaceable>A</replaceable> <literal>=</literal> + <replaceable>A</replaceable> is true + (<firstterm>reflexive law</firstterm>) + </para> + </listitem> + <listitem> + <para> + if <replaceable>A</replaceable> <literal>=</literal> + <replaceable>B</replaceable>, + then <replaceable>B</replaceable> <literal>=</literal> + <replaceable>A</replaceable> + (<firstterm>symmetric law</firstterm>) + </para> + </listitem> + <listitem> + <para> + if <replaceable>A</replaceable> <literal>=</literal> + <replaceable>B</replaceable> and <replaceable>B</replaceable> + <literal>=</literal> <replaceable>C</replaceable>, + then <replaceable>A</replaceable> <literal>=</literal> + <replaceable>C</replaceable> + (<firstterm>transitive law</firstterm>) + </para> + </listitem> + </itemizedlist> + </para> + </listitem> + + <listitem> + <para> + A <literal><</literal> operator must be a strong ordering relation; + that is, for all non-null values <replaceable>A</replaceable>, + <replaceable>B</replaceable>, <replaceable>C</replaceable>: + + <itemizedlist> + <listitem> + <para> + <replaceable>A</replaceable> <literal><</literal> + <replaceable>A</replaceable> is false + (<firstterm>irreflexive law</firstterm>) + </para> + </listitem> + <listitem> + <para> + if <replaceable>A</replaceable> <literal><</literal> + <replaceable>B</replaceable> + and <replaceable>B</replaceable> <literal><</literal> + <replaceable>C</replaceable>, + then <replaceable>A</replaceable> <literal><</literal> + <replaceable>C</replaceable> + (<firstterm>transitive law</firstterm>) + </para> + </listitem> + </itemizedlist> + </para> + </listitem> + + <listitem> + <para> + Furthermore, the ordering is total; that is, for all non-null + values <replaceable>A</replaceable>, <replaceable>B</replaceable>: + + <itemizedlist> + <listitem> + <para> + exactly one of <replaceable>A</replaceable> <literal><</literal> + <replaceable>B</replaceable>, <replaceable>A</replaceable> + <literal>=</literal> <replaceable>B</replaceable>, and + <replaceable>B</replaceable> <literal><</literal> + <replaceable>A</replaceable> is true + (<firstterm>trichotomy law</firstterm>) + </para> + </listitem> + </itemizedlist> + + (The trichotomy law justifies the definition of the comparison support + function, of course.) + </para> + </listitem> + </itemizedlist> + + <para> + The other three operators are defined in terms of <literal>=</literal> + and <literal><</literal> in the obvious way, and must act consistently + with them. + </para> + + <para> + For an operator family supporting multiple data types, the above laws must + hold when <replaceable>A</replaceable>, <replaceable>B</replaceable>, + <replaceable>C</replaceable> are taken from any data types in the family. + The transitive laws are the trickiest to ensure, as in cross-type + situations they represent statements that the behaviors of two or three + different operators are consistent. + As an example, it would not work to put <type>float8</type> + and <type>numeric</type> into the same operator family, at least not with + the current semantics that <type>numeric</type> values are converted + to <type>float8</type> for comparison to a <type>float8</type>. Because + of the limited accuracy of <type>float8</type>, this means there are + distinct <type>numeric</type> values that will compare equal to the + same <type>float8</type> value, and thus the transitive law would fail. + </para> + + <para> + Another requirement for a multiple-data-type family is that any implicit + or binary-coercion casts that are defined between data types included in + the operator family must not change the associated sort ordering. + </para> + + <para> + It should be fairly clear why a btree index requires these laws to hold + within a single data type: without them there is no ordering to arrange + the keys with. Also, index searches using a comparison key of a + different data type require comparisons to behave sanely across two + data types. The extensions to three or more data types within a family + are not strictly required by the btree index mechanism itself, but the + planner relies on them for optimization purposes. + </para> + +</sect1> + +<sect1 id="btree-support-funcs"> + <title>B-Tree Support Functions</title> + + <para> + As shown in <xref linkend="xindex-btree-support-table"/>, btree defines + one required and one optional support function. + </para> + + <para> + For each combination of data types that a btree operator family provides + comparison operators for, it must provide a comparison support function, + registered in <structname>pg_amproc</structname> with support function + number 1 and + <structfield>amproclefttype</structfield>/<structfield>amprocrighttype</structfield> + equal to the left and right data types for the comparison (i.e., the + same data types that the matching operators are registered with + in <structname>pg_amop</structname>). + The comparison function must take two non-null values + <replaceable>A</replaceable> and <replaceable>B</replaceable> and + return an <type>int32</type> value that + is <literal><</literal> <literal>0</literal>, <literal>0</literal>, + or <literal>></literal> <literal>0</literal> + when <replaceable>A</replaceable> <literal><</literal> + <replaceable>B</replaceable>, <replaceable>A</replaceable> + <literal>=</literal> <replaceable>B</replaceable>, + or <replaceable>A</replaceable> <literal>></literal> + <replaceable>B</replaceable>, respectively. The function must not + return <literal>INT_MIN</literal> for the <replaceable>A</replaceable> + <literal><</literal> <replaceable>B</replaceable> case, + since the value may be negated before being tested for sign. A null + result is disallowed, too. + See <filename>src/backend/access/nbtree/nbtcompare.c</filename> for + examples. + </para> + + <para> + If the compared values are of a collatable data type, the appropriate + collation OID will be passed to the comparison support function, using + the standard <function>PG_GET_COLLATION()</function> mechanism. + </para> + + <para> + Optionally, a btree operator family may provide <firstterm>sort + support</firstterm> function(s), registered under support function number + 2. These functions allow implementing comparisons for sorting purposes + in a more efficient way than naively calling the comparison support + function. The APIs involved in this are defined in + <filename>src/include/utils/sortsupport.h</filename>. + </para> + +</sect1> + +<sect1 id="btree-implementation"> + <title>Implementation</title> + + <para> + An introduction to the btree index implementation can be found in + <filename>src/backend/access/nbtree/README</filename>. + </para> + +</sect1> + +</chapter> diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml index a72c50eadbb..732b8ab7d0b 100644 --- a/doc/src/sgml/filelist.sgml +++ b/doc/src/sgml/filelist.sgml @@ -83,6 +83,7 @@ <!ENTITY bki SYSTEM "bki.sgml"> <!ENTITY catalogs SYSTEM "catalogs.sgml"> <!ENTITY geqo SYSTEM "geqo.sgml"> +<!ENTITY btree SYSTEM "btree.sgml"> <!ENTITY gist SYSTEM "gist.sgml"> <!ENTITY spgist SYSTEM "spgist.sgml"> <!ENTITY gin SYSTEM "gin.sgml"> diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml index 041afdbd860..054347b17d9 100644 --- a/doc/src/sgml/postgres.sgml +++ b/doc/src/sgml/postgres.sgml @@ -252,6 +252,7 @@ &geqo; &indexam; &generic-wal; + &btree; &gist; &spgist; &gin; diff --git a/doc/src/sgml/xindex.sgml b/doc/src/sgml/xindex.sgml index 81c0cdc4f8d..e40131473fe 100644 --- a/doc/src/sgml/xindex.sgml +++ b/doc/src/sgml/xindex.sgml @@ -35,7 +35,7 @@ <productname>PostgreSQL</productname>, but all index methods are described in <classname>pg_am</classname>. It is possible to add a new index access method by writing the necessary code and - then creating a row in <classname>pg_am</classname> — but that is + then creating an entry in <classname>pg_am</classname> — but that is beyond the scope of this chapter (see <xref linkend="indexam"/>). </para> @@ -404,6 +404,8 @@ B-trees require a single support function, and allow a second one to be supplied at the operator class author's option, as shown in <xref linkend="xindex-btree-support-table"/>. + The requirements for these support functions are explained further in + <xref linkend="btree-support-funcs"/>. </para> <table tocentry="1" id="xindex-btree-support-table"> @@ -426,8 +428,8 @@ </row> <row> <entry> - Return the addresses of C-callable sort support function(s), - as documented in <filename>utils/sortsupport.h</filename> (optional) + Return the addresses of C-callable sort support function(s) + (optional) </entry> <entry>2</entry> </row> @@ -1056,11 +1058,8 @@ ALTER OPERATOR FAMILY integer_ops USING btree ADD <para> In a B-tree operator family, all the operators in the family must sort - compatibly, meaning that the transitive laws hold across all the data types - supported by the family: <quote>if A = B and B = C, then A = C</quote>, - and <quote>if A < B and B < C, then A < C</quote>. Moreover, implicit - or binary coercion casts between types represented in the operator family - must not change the associated sort ordering. For each + compatibly, as is specified in detail in <xref linkend="btree-behavior"/>. + For each operator in the family there must be a support function having the same two input data types as the operator. It is recommended that a family be complete, i.e., for each combination of data types, all operators are |