From 00727249ec8404c68391ec58e9c9f0d8a88d5ca0 Mon Sep 17 00:00:00 2001
From: Paulo Casaretto
Date: Fri, 29 Aug 2025 16:02:54 +0000
Subject: range-diff: add configurable memory limit for cost matrix
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When comparing large commit ranges (e.g., 250,000+ commits), range-diff
attempts to allocate an n×n cost matrix that can exhaust available
memory. For example, with 256,784 commits (n = 513,568), the matrix
would require approximately 1TB of memory (513,568² × 4 bytes),
causing either immediate segmentation faults due to integer overflow or
system hangs.

Add a memory limit check in get_correspondences() before allocating
the cost matrix. This check uses the total size in bytes (n² ×
sizeof(int)) and compares it against a configurable maximum, preventing
both excessive memory usage and integer overflow issues.

The limit is configurable via a new --max-memory option that accepts
human-readable sizes (e.g., "1G", "500M"). The default is 4GB for 64-bit
systems and 2GB for 32-bit systems. This allows comparing ranges of
approximately 32,000 (16,000) commits - generous for real-world use
cases while preventing impractical operations.

When the limit is exceeded, range-diff now displays a clear error
message showing both the requested memory size and the maximum allowed,
formatted in human-readable units for better user experience.

Example usage:

  git range-diff --max-memory=1G branch1...branch2
  git range-diff --max-memory=500M base..topic1 base..topic2

This approach was chosen over alternatives:

- Pre-counting commits: Would require spawning additional git processes
  and reading all commits twice
- Limiting by commit count: Less precise than actual memory usage
- Streaming approach: Would require significant refactoring of the
  current algorithm

This issue was previously discussed in:
https://lore.kernel.org/git/RFC-cover-v2-0.5-00000000000-20211210T122901Z-avarab@gmail.com/

Acked-by: Johannes Schindelin
Signed-off-by: Paulo Casaretto
Signed-off-by: Junio C Hamano
---
 log-tree.c | 1 +
 1 file changed, 1 insertion(+)

(limited to 'log-tree.c')

diff --git a/log-tree.c b/log-tree.c
index 233bf9f227..73d21f7176 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -717,6 +717,7 @@ static void show_diff_of_diff(struct rev_info *opt)
 	struct range_diff_options range_diff_opts = {
 		.creation_factor = opt->creation_factor,
 		.dual_color = 1,
+		.max_memory = RANGE_DIFF_MAX_MEMORY_DEFAULT,
 		.diffopt = &opts
 	};
 
-- 
cgit v1.2.3
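
The size check described in the commit message (n² × sizeof(int) compared against a configurable maximum) could be sketched roughly as below. This is only an illustration, not the code from the patch: the helper name cost_matrix_fits() and its parameters are hypothetical, and the real check in get_correspondences() may be structured differently. The sketch divides instead of multiplying, so the product n * n * sizeof(int) is never formed and cannot overflow.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: returns 1 if an n x n
 * matrix of int fits within max_bytes, and 0 otherwise.
 *
 * The comparison is done by division so that n * n * sizeof(int) is
 * never computed and therefore cannot wrap around.
 */
static int cost_matrix_fits(uint64_t n, uint64_t max_bytes)
{
	/* Largest number of int cells that fit in the budget. */
	uint64_t max_cells = max_bytes / sizeof(int);

	if (n == 0)
		return 1; /* an empty matrix always fits */

	/* For positive integers: n * n <= max_cells  <=>  n <= max_cells / n */
	return n <= max_cells / n;
}
```

Assuming a 4-byte int and a 4 GiB limit, n = 32,768 sits exactly on the boundary (32,768² × 4 bytes = 4 GiB) and is accepted, while the n = 513,568 case from the commit message is rejected.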