Use quicksort, not replacement selection, for external sorting.

We still use replacement selection for the first run of the sort only and only when the number of tuples is relatively small. Otherwise, the first run, and subsequent runs in all cases, are produced using quicksort. This tends to be faster except perhaps for very small amounts of working memory. Peter Geoghegan, reviewed by Tomas Vondra, Jeff Janes, Mithun Cy, Greg Stark, and me.
author: Robert Haas <rhaas@postgresql.org> 2016-04-08 02:36:26 -0400
committer: Robert Haas <rhaas@postgresql.org> 2016-04-08 02:36:26 -0400
commit: 0711803775a37e0bf39d7efdd1e34d9d7e640ea1 (patch)
tree: 8f9e68f32ace8aa5ca7520f7589aca419266b0ce /doc/src
parent: 719c84c1be51f3d3fe6049b77ddbaa0c4b58a9a9 (diff)
1 files changed, 39 insertions, 0 deletions
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fdeda90f03d..2f04702b6e5 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1472,6 +1472,45 @@ include_dir 'conf.d'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-replacement-sort-tuples" xreflabel="replacement_sort_tuples">
+      <term><varname>replacement_sort_tuples</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>replacement_sort_tuples</> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        When the number of tuples to be sorted is smaller than this number,
+        a sort will produce its first output run using replacement selection
+        rather than quicksort.  This may be useful in memory-constrained
+        environments where tuples that are input into larger sort operations
+        have a strong physical-to-logical correlation.  Note that this does
+        not include input tuples with an <emphasis>inverse</emphasis>
+        correlation.  It is possible for the replacement selection algorithm
+        to generate one long run that requires no merging, where use of the
+        default strategy would result in many runs that must be merged
+        to produce a final sorted output.  This may allow sort
+        operations to complete sooner.
+       </para>
+       <para>
+        The default is 150,000 tuples.  Note that higher values are typically
+        not much more effective, and may be counter-productive, since the
+        priority queue is sensitive to the size of available CPU cache, whereas
+        the default strategy sorts runs using a <firstterm>cache
+        oblivious</firstterm> algorithm.  This property allows the default sort
+        strategy to automatically and transparently make effective use
+        of available CPU cache.
+       </para>
+       <para>
+        Setting <varname>maintenance_work_mem</varname> to its default
+        value usually prevents utility command external sorts (e.g.,
+        sorts used by <command>CREATE INDEX</> to build B-Tree
+        indexes) from ever using replacement selection sort, unless the
+        input tuples are quite wide.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-autovacuum-work-mem" xreflabel="autovacuum_work_mem">
       <term><varname>autovacuum_work_mem</varname> (<type>integer</type>)
       <indexterm>
author	Robert Haas <rhaas@postgresql.org>	2016-04-08 02:36:26 -0400
committer	Robert Haas <rhaas@postgresql.org>	2016-04-08 02:36:26 -0400
commit	0711803775a37e0bf39d7efdd1e34d9d7e640ea1 (patch)
tree	8f9e68f32ace8aa5ca7520f7589aca419266b0ce /doc/src
parent	719c84c1be51f3d3fe6049b77ddbaa0c4b58a9a9 (diff)