Disable parallel plans for RIGHT_SEMI joins

RIGHT_SEMI joins rely on the HEAP_TUPLE_HAS_MATCH flag to guarantee that only the first match for each inner tuple is considered. However, in a parallel hash join, the inner relation is stored in a shared global hash table that can be probed by multiple workers concurrently. This allows different workers to inspect and set the match flags of the same inner tuples at the same time. If two workers probe the same inner tuple concurrently, both may see the match flag as unset and emit the same tuple, leading to duplicate output rows and violating RIGHT_SEMI join semantics.

For now, we disable parallel plans for RIGHT_SEMI joins. In the long term, it may be possible to support parallel execution by performing atomic operations on the match flag, for example using a CAS or similar mechanism.

Backpatch to v18, where RIGHT_SEMI join was introduced.

Bug: #19094
Reported-by: Lori Corbani <Lori.Corbani@jax.org>
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19094-6ed410eb5b256abd@postgresql.org
Backpatch-through: 18
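
The commit message mentions a CAS on the match flag as a possible long-term direction. The following is a minimal, standalone sketch of that idea using C11 atomics. It is not PostgreSQL code and not part of this commit: `DemoTuple`, `TUPLE_HAS_MATCH`, and `try_claim_match` are hypothetical names chosen for illustration, and the real tuple header layout differs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit standing in for HEAP_TUPLE_HAS_MATCH. */
#define TUPLE_HAS_MATCH 0x0001u

/* Hypothetical stand-in for an inner tuple in a shared hash table. */
typedef struct DemoTuple
{
    _Atomic uint16_t flags;     /* match flag lives in one of these bits */
    int              key;
} DemoTuple;

/*
 * Try to be the first prober to mark this tuple as matched.
 *
 * A plain "if flag is unset, set it and emit" sequence is racy: two
 * workers can both read the flag as unset before either one writes it.
 * With compare-and-swap, only one caller can move the flag from unset
 * to set, so exactly one caller gets true for a given tuple.
 */
static bool
try_claim_match(DemoTuple *tuple)
{
    uint16_t old;

    assert(tuple != NULL);

    old = atomic_load(&tuple->flags);
    while ((old & TUPLE_HAS_MATCH) == 0)
    {
        /* On failure, "old" is refreshed with the current flags value. */
        if (atomic_compare_exchange_weak(&tuple->flags, &old,
                                         (uint16_t) (old | TUPLE_HAS_MATCH)))
            return true;        /* we set the flag: emit the tuple */
    }
    return false;               /* someone else already matched it */
}
```

In this sketch, a worker would emit the inner tuple only when `try_claim_match` returns true. Whether such an approach is practical inside the actual hash join executor is an open question that the commit leaves for future work.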
author: Richard Guo <rguo@postgresql.org> 2025-10-30 11:58:45 +0900
committer: Richard Guo <rguo@postgresql.org> 2025-10-30 11:58:45 +0900
commit: 257ee78341f2657d5c19cdaf0888f843e9bb0c33 (patch)
tree: 8d55877063d2c9b5cf164e135e59e5d55e4da85c /src/test/regress
parent: 50eb4e11815664bfcee883e92f4bf238ac23ec12 (diff)
2 files changed, 47 insertions, 0 deletions
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index d10095de70f..0e82ca1867a 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -3081,6 +3081,33 @@ select * from tbl_rs t1 join
 (6 rows)
 
 --
+-- regression test for bug with parallel-hash-right-semi join
+--
+begin;
+-- encourage use of parallel plans
+set local parallel_setup_cost=0;
+set local parallel_tuple_cost=0;
+set local min_parallel_table_scan_size=0;
+set local max_parallel_workers_per_gather=4;
+-- ensure we don't get parallel hash right semi join
+explain (costs off)
+select * from tenk1 t1
+where exists (select 1 from tenk1 t2 where fivethous = t1.fivethous)
+and t1.fivethous < 5;
+                    QUERY PLAN                    
+--------------------------------------------------
+ Gather
+   Workers Planned: 4
+   ->  Parallel Hash Semi Join
+         Hash Cond: (t1.fivethous = t2.fivethous)
+         ->  Parallel Seq Scan on tenk1 t1
+               Filter: (fivethous < 5)
+         ->  Parallel Hash
+               ->  Parallel Seq Scan on tenk1 t2
+(8 rows)
+
+rollback;
+--
 -- regression test for bug #13908 (hash join with skew tuples & nbatch increase)
 --
 set work_mem to '64kB';
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index b1732453e8d..c6b8b09a381 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -760,6 +760,26 @@ select * from tbl_rs t1 join
   on true;
 
 --
+-- regression test for bug with parallel-hash-right-semi join
+--
+
+begin;
+
+-- encourage use of parallel plans
+set local parallel_setup_cost=0;
+set local parallel_tuple_cost=0;
+set local min_parallel_table_scan_size=0;
+set local max_parallel_workers_per_gather=4;
+
+-- ensure we don't get parallel hash right semi join
+explain (costs off)
+select * from tenk1 t1
+where exists (select 1 from tenk1 t2 where fivethous = t1.fivethous)
+and t1.fivethous < 5;
+
+rollback;
+
+--
 -- regression test for bug #13908 (hash join with skew tuples & nbatch increase)
 --