path: root/src/backend/access/heap/heapam.c
author: Tom Lane <tgl@sss.pgh.pa.us> 2025-10-13 13:17:45 -0400
committer: Tom Lane <tgl@sss.pgh.pa.us> 2025-10-13 13:17:45 -0400
commit: 1f8062dd9668572d66549fc798a7d2057aa34ee1
tree: 9ab6ecd05fa101bae13388a14ff5141fbd711ac0 /src/backend/access/heap/heapam.c
parent: fe8192a95e6c7159d639e341740e32966c9cf385
Fix serious performance problems in LZ4Stream_read_internal.
I was distressed to find that reading an LZ4-compressed toc.dat file was hundreds of times slower than it ought to be. On investigation, the blame mostly affixes to LZ4Stream_read_overflow's habit of memmove'ing all the remaining buffered data after each read operation. Since reading a TOC file tends to involve a lot of small (even one-byte) decompression calls, that amounts to an O(N^2) cost.

This could have been fixed with a minimal patch, but to my eyes LZ4Stream_read_internal and LZ4Stream_read_overflow are badly-written spaghetti code; in particular the eol_flag logic is inefficient and duplicative. I chose to throw the code away and rewrite from scratch. This version is about sixty lines shorter as well as not having the performance issue.

Fortunately, AFAICT the only way to get to this problem is to manually LZ4-compress the toc.dat and/or blobs.toc files within a directory-style archive; in the main data files, we read blocks that are large enough that the O(N^2) behavior doesn't manifest. Few people do that, which likely explains the lack of field complaints. Otherwise this performance bug might be considered bad enough to warrant back-patching.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us
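
The O(N^2) cost described in the message can be illustrated with a small standalone sketch. This is not the pg_dump code: DemoBuf, demo_read_shift, and demo_read_offset are hypothetical names invented here, and the real LZ4Stream state is not shown on this page. The first reader slides the unread bytes to the front of the buffer after every call, as the message says the old code did; the second only advances a read offset, which is one possible way to avoid the copying and is not necessarily how the rewrite does it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer of already-decompressed bytes (illustration only). */
typedef struct DemoBuf
{
    char   *data;   /* storage holding the decompressed bytes */
    size_t  len;    /* shift variant: unread bytes; offset variant: total bytes */
    size_t  pos;    /* offset variant only: index of the next unread byte */
    size_t  moved;  /* bytes passed to memmove so far, to make the cost visible */
} DemoBuf;

/*
 * Shift variant: copy out n bytes, then memmove everything that remains
 * to the start of the buffer.  With many one-byte reads, the total number
 * of bytes moved grows quadratically with the buffer size.
 */
static size_t
demo_read_shift(DemoBuf *b, char *dst, size_t n)
{
    if (n > b->len)
        n = b->len;
    memcpy(dst, b->data, n);
    b->len -= n;
    memmove(b->data, b->data + n, b->len);
    b->moved += b->len;
    return n;
}

/*
 * Offset variant: copy out n bytes and advance a read position.
 * Nothing is moved, so each read costs only the bytes it returns.
 */
static size_t
demo_read_offset(DemoBuf *b, char *dst, size_t n)
{
    size_t  avail;

    assert(b->pos <= b->len);
    avail = b->len - b->pos;
    if (n > avail)
        n = avail;
    memcpy(dst, b->data + b->pos, n);
    b->pos += n;
    return n;
}
```

Draining an 8-byte buffer one byte at a time moves 7 + 6 + ... + 0 = 28 bytes in the shift variant and none in the offset variant. For N buffered bytes the shift variant moves N(N-1)/2 bytes in total, which is the quadratic behavior the message refers to.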
Diffstat (limited to 'src/backend/access/heap/heapam.c')
0 files changed, 0 insertions, 0 deletions