Add parallel-aware hash joins.

Introduce parallel-aware hash joins that appear in EXPLAIN plans as Parallel
Hash Join with Parallel Hash. While hash joins could already appear in
parallel queries, they were previously always parallel-oblivious and had a
partial subplan only on the outer side, meaning that the work of the inner
subplan was duplicated in every worker.

After this commit, the planner will consider using a partial subplan on the
inner side too, using the Parallel Hash node to divide the work over the
available CPU cores and combine its results in shared memory. If the join
needs to be split into multiple batches in order to respect work_mem, then
workers process different batches as much as possible and then work together
on the remaining batches.

The advantages of a parallel-aware hash join over a parallel-oblivious hash
join used in a parallel query are that it:

 * avoids wasting memory on duplicated hash tables
 * avoids wasting disk space on duplicated batch files
 * divides the work of building the hash table over the CPUs

One disadvantage is that there is some communication between the participating
CPUs which might outweigh the benefits of parallelism in the case of small
hash tables. This is avoided by the planner's existing reluctance to supply
partial plans for small scans, but it may be necessary to estimate
synchronization costs in future if that situation changes. Another is that
outer batch 0 must be written to disk if multiple batches are required.

A potential future advantage of parallel-aware hash joins is that right and
full outer joins could be supported, since there is a single set of matched
bits for each hashtable, but that is not yet implemented.

A new GUC enable_parallel_hash is defined to control the feature, defaulting
to on.

Author: Thomas Munro
Reviewed-By: Andres Freund, Robert Haas
Tested-By: Rafia Sabih, Prabhat Sahu
Discussion:
    https://postgr.es/m/CAEepm=2W=cOkiZxcg6qiFQP-dHUe09aqTrEMM7yJDrHMhDv_RA@mail.gmail.com
    https://postgr.es/m/CAEepm=37HKyJ4U6XOLi=JgfSHM3o6B-GaeO-6hkOmneTDkH+Uw@mail.gmail.com
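
The difference between a parallel-oblivious and a parallel-aware build phase, as described in the commit message, can be illustrated with a toy model. The sketch below is not PostgreSQL code: the function names, the in-memory dictionaries, and the sequential loop standing in for workers are all assumptions made for illustration. It only shows that a shared table stores each inner tuple once and splits the build work, while private tables store it once per worker.

```python
from collections import defaultdict

def build_oblivious(inner, n_workers):
    """Toy parallel-oblivious build: every worker hashes the whole inner side."""
    tables = []
    for _ in range(n_workers):
        table = defaultdict(list)
        for key, payload in inner:
            table[key].append(payload)
        tables.append(table)  # one private copy per worker
    return tables

def build_parallel_aware(inner, n_workers):
    """Toy parallel-aware build: each worker hashes a slice into one shared table."""
    shared = defaultdict(list)
    inserts_per_worker = []
    for w in range(n_workers):
        chunk = inner[w::n_workers]  # stand-in for a partial inner subplan
        for key, payload in chunk:
            shared[key].append(payload)
        inserts_per_worker.append(len(chunk))
    return shared, inserts_per_worker

def probe(table, outer):
    """Return (key, outer_payload, inner_payload) for every matching pair."""
    return [(key, o, i) for key, o in outer for i in table.get(key, [])]

# Hypothetical sample data: 20 inner tuples, 14 outer tuples, 4 workers.
inner = [(k % 5, f"i{k}") for k in range(20)]
outer = [(k % 7, f"o{k}") for k in range(14)]
n_workers = 4

private_tables = build_oblivious(inner, n_workers)
shared_table, inserts = build_parallel_aware(inner, n_workers)

# Count how many inner tuples are held in memory under each strategy.
oblivious_stored = sum(len(v) for t in private_tables for v in t.values())
shared_stored = sum(len(v) for v in shared_table.values())

oblivious_result = sorted(probe(private_tables[0], outer))
shared_result = sorted(probe(shared_table, outer))

print("tuples stored, oblivious:", oblivious_stored)
print("tuples stored, parallel-aware:", shared_stored)
print("build inserts per worker, parallel-aware:", inserts)
print("same join result:", oblivious_result == shared_result)
```

The model ignores everything that makes the real feature hard (shared memory allocation, barriers between phases, batching under work_mem), so it should be read only as a picture of the memory and build-work savings listed above.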
author: Andres Freund <andres@anarazel.de> 2017-12-20 23:39:21 -0800
committer: Andres Freund <andres@anarazel.de> 2017-12-21 00:43:41 -0800
commit: 1804284042e659e7d16904e7bbb0ad546394b6a3 (patch)
tree: d1980f7d94cb31aed8880007e5a41ede08d0cb84 /doc/src
parent: f94eec490b2671399c102b89c9fa0311aea3a39f (diff)
2 files changed, 76 insertions, 1 deletions
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 533faf060de..e4a01699e46 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3647,6 +3647,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-parallel-hash" xreflabel="enable_parallel_hash">
+      <term><varname>enable_parallel_hash</varname> (<type>boolean</type>)
+       <indexterm>
+        <primary><varname>enable_parallel_hash</varname> configuration parameter</primary>
+       </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's use of hash-join plan
+        types with parallel hash. Has no effect if hash-join plans are not
+        also enabled. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-partition-wise-join" xreflabel="enable_partition_wise_join">
       <term><varname>enable_partition_wise_join</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b6f80d97080..8a9793644fa 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1263,7 +1263,7 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
          <entry>Waiting in an extension.</entry>
         </row>
         <row>
-         <entry morerows="17"><literal>IPC</literal></entry>
+         <entry morerows="32"><literal>IPC</literal></entry>
          <entry><literal>BgWorkerShutdown</literal></entry>
          <entry>Waiting for background worker to shut down.</entry>
         </row>
@@ -1280,6 +1280,66 @@ postgres   27093  0.0  0.0  30096  2752 ?        Ss   11:34   0:00 postgres: ser
          <entry>Waiting for activity from child process when executing <literal>Gather</literal> node.</entry>
         </row>
         <row>
+          <entry><literal>Hash/Batch/Allocating</literal></entry>
+          <entry>Waiting for an elected Parallel Hash participant to allocate a hash table.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/Batch/Electing</literal></entry>
+          <entry>Electing a Parallel Hash participant to allocate a hash table.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/Batch/Loading</literal></entry>
+          <entry>Waiting for other Parallel Hash participants to finish loading a hash table.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/Build/Allocating</literal></entry>
+          <entry>Waiting for an elected Parallel Hash participant to allocate the initial hash table.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/Build/Electing</literal></entry>
+          <entry>Electing a Parallel Hash participant to allocate the initial hash table.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/Build/HashingInner</literal></entry>
+          <entry>Waiting for other Parallel Hash participants to finish hashing the inner relation.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/Build/HashingOuter</literal></entry>
+          <entry>Waiting for other Parallel Hash participants to finish partitioning the outer relation.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/GrowBatches/Allocating</literal></entry>
+          <entry>Waiting for an elected Parallel Hash participant to allocate more batches.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/GrowBatches/Deciding</literal></entry>
+          <entry>Electing a Parallel Hash participant to decide on future batch growth.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/GrowBatches/Electing</literal></entry>
+          <entry>Electing a Parallel Hash participant to allocate more batches.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/GrowBatches/Finishing</literal></entry>
+          <entry>Waiting for an elected Parallel Hash participant to decide on future batch growth.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/GrowBatches/Repartitioning</literal></entry>
+          <entry>Waiting for other Parallel Hash participants to finish repartitioning.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/GrowBuckets/Allocating</literal></entry>
+          <entry>Waiting for an elected Parallel Hash participant to finish allocating more buckets.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/GrowBuckets/Electing</literal></entry>
+          <entry>Electing a Parallel Hash participant to allocate more buckets.</entry>
+        </row>
+        <row>
+          <entry><literal>Hash/GrowBuckets/Reinserting</literal></entry>
+          <entry>Waiting for other Parallel Hash participants to finish inserting tuples into new buckets.</entry>
+        </row>
+        <row>
          <entry><literal>LogicalSyncData</literal></entry>
          <entry>Waiting for logical replication remote server to send data for initial table synchronization.</entry>
         </row>