author | Lidong Yan <yldhome2d2@gmail.com> | 2025-07-15 10:56:22 +0800 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2025-07-15 08:12:33 -0700 |
commit | 2a6ce090f27016d68ee6952809d98fe88ce53522 (patch) | |
tree | 6cc61bca262b9e127d8105f55d82858df20d5a73 /commit.c | |
parent | 937153dece3c2b1e04b0e071298745abd57cd347 (diff) |
bloom: optimize multiple pathspec items in revision
To enable optimization of multiple pathspec items in revision traversal,
return 0 from forbid_bloom_filters() if all pathspec items are literal.
Add for loops to initialize and check each pathspec item's bloom_keyvec
when optimization is possible.
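The control flow described above can be sketched in a few lines of C. This is an illustrative toy, not the code from this patch: every `toy_` name, the 64-bit filter, the FNV-1a hash, and the idea of storing one mask per leading directory are assumptions made for the example. Only the two rules come from the message: filters are allowed when every pathspec item is literal, and each item gets its own key vector that is initialized and checked in a loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are Git's real definitions. */
#define TOY_FILTER_BITS 64
#define TOY_MAX_KEYS 8

struct toy_pathspec_item {
	const char *match; /* path text of the pathspec item */
	int is_literal;    /* 0 if the item uses wildcards or magic */
};

struct toy_keyvec {
	uint64_t masks[TOY_MAX_KEYS]; /* one mask per path prefix */
	size_t nr;
};

/* Hash a path prefix with FNV-1a and select two filter bits (assumption). */
static uint64_t toy_mask(const char *s, size_t len)
{
	uint32_t h = 2166136261u;
	for (size_t i = 0; i < len; i++) {
		h ^= (unsigned char)s[i];
		h *= 16777619u;
	}
	return (UINT64_C(1) << (h % TOY_FILTER_BITS)) |
	       (UINT64_C(1) << ((h >> 8) % TOY_FILTER_BITS));
}

/* Return 0 (filters allowed) only if every pathspec item is literal. */
static int toy_forbid_filters(const struct toy_pathspec_item *items, size_t nr)
{
	if (!nr)
		return 1;
	for (size_t i = 0; i < nr; i++)
		if (!items[i].is_literal)
			return 1;
	return 0;
}

/* Build one mask for the full path and one per leading directory. */
static void toy_keyvec_init(struct toy_keyvec *kv, const char *path)
{
	size_t len = strlen(path);
	kv->nr = 0;
	for (size_t i = 1; i <= len && kv->nr < TOY_MAX_KEYS; i++)
		if (i == len || path[i] == '/')
			kv->masks[kv->nr++] = toy_mask(path, i);
}

/* Record a changed path (and its leading directories) in a commit's filter. */
static void toy_filter_add(uint64_t *filter, const char *path)
{
	struct toy_keyvec kv;
	toy_keyvec_init(&kv, path);
	for (size_t i = 0; i < kv.nr; i++)
		*filter |= kv.masks[i];
}

/*
 * A commit may touch the pathspec if ANY item's key vector is fully
 * contained in the filter; otherwise it definitely touches none of them.
 */
static int toy_maybe_changed(uint64_t filter, const struct toy_keyvec *kvs,
			     size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		size_t j;
		for (j = 0; j < kvs[i].nr; j++)
			if ((filter & kvs[i].masks[j]) != kvs[i].masks[j])
				break;
		if (j == kvs[i].nr)
			return 1;
	}
	return 0;
}
```

In this sketch a traversal would call toy_forbid_filters() once, build one toy_keyvec per pathspec item, and then skip the tree diff for any commit where toy_maybe_changed() returns 0. A return of 1 only means "maybe", so the commit still needs the regular diff.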
Add new test cases in t/t4216-log-bloom.sh to ensure that:
- the results for multiple pathspec items are consistent between runs
that use the bloom filter optimization and runs that do not.
- the bloom filter is not used if any pathspec item is not literal.
With these optimizations, we get some improvements for multi-pathspec runs
of 'git log'. First, in the Git repository we see these modest results:
Benchmark 1: old
Time (mean ± σ): 73.1 ms ± 2.9 ms
Range (min … max): 69.9 ms … 84.5 ms 42 runs
Benchmark 2: new
Time (mean ± σ): 55.1 ms ± 2.9 ms
Range (min … max): 51.1 ms … 61.2 ms 52 runs
Summary
'new' ran
1.33 ± 0.09 times faster than 'old'
But in a larger repo, such as the LLVM project repo below, we get even
better results:
Benchmark 1: old
Time (mean ± σ): 1.974 s ± 0.006 s
Range (min … max): 1.960 s … 1.983 s 10 runs
Benchmark 2: new
Time (mean ± σ): 262.9 ms ± 2.4 ms
Range (min … max): 257.7 ms … 266.2 ms 11 runs
Summary
'new' ran
7.51 ± 0.07 times faster than 'old'
Signed-off-by: Derrick Stolee <stolee@gmail.com>
[ly: rename convert_pathspec_to_filter() to convert_pathspec_to_bloom_keyvec()]
Signed-off-by: Lidong Yan <502024330056@smail.nju.edu.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>