| author | Tom Lane <tgl@sss.pgh.pa.us> | 2011-08-20 14:51:02 -0400 | 
|---|---|---|
| committer | Tom Lane <tgl@sss.pgh.pa.us> | 2011-08-20 14:51:32 -0400 | 
| commit | 706493a1f7cbd9c7d3a792fd5066b55c145b9b01 (patch) | |
| tree | b008d8e26dc55ec2fc607c4592a25fd1243996bd /src | |
| parent | d4c24254fab0b9908211f990c4239664f3ed5630 (diff) | |
Fix performance problem when building a lossy tidbitmap.
As pointed out by Sergey Koposov, repeated invocations of tbm_lossify can
make building a large tidbitmap into an O(N^2) operation.  To fix, make
sure we remove more than the minimum amount of information per call, and
add a fallback path to behave sanely if we're unable to fit the bitmap
within the requested amount of memory.
This has been wrong since the tidbitmap code was written, so back-patch
to all supported branches.
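To see the shape of the fix in isolation, here is a minimal standalone C sketch of the strategy — not the actual tidbitmap code. `Bitmap` carries only the two counters the patch touches, and `shrink_one_step()` is a hypothetical stand-in for `tbm_mark_page_lossy()`, which trades precision for space.

```c
#include <limits.h>
#include <stdbool.h>

/* Simplified model: nentries is the current entry count, maxentries the
 * cap derived from work_mem.  (Hypothetical; not PostgreSQL's structs.) */
typedef struct
{
	int			nentries;
	int			maxentries;
} Bitmap;

/* Hypothetical stand-in for tbm_mark_page_lossy(): compresses one exact
 * page into lossy form, reducing nentries; returns false once nothing
 * more can be compressed. */
extern bool shrink_one_step(Bitmap *bm);

static void
lossify(Bitmap *bm)
{
	/*
	 * Stopping as soon as nentries <= maxentries (the old behavior)
	 * means the very next insertion pushes us over the cap again, so in
	 * the worst case every insertion pays for a full scan: O(N^2)
	 * overall.  Driving nentries down to maxentries / 2 instead buys
	 * roughly maxentries / 2 cheap insertions before the next call.
	 */
	while (bm->nentries > bm->maxentries / 2 && shrink_one_step(bm))
		;

	/*
	 * Fallback path: if we still could not get under the target, admit
	 * the bitmap will not fit in the memory budget and raise the cap to
	 * double the current entry count, clamped (as the patch does with
	 * Min()) so the doubling cannot overflow int.
	 */
	if (bm->nentries > bm->maxentries / 2)
	{
		int			n = bm->nentries < (INT_MAX - 1) / 2 ?
			bm->nentries : (INT_MAX - 1) / 2;

		bm->maxentries = n * 2;
	}
}
```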
Diffstat (limited to 'src')
| -rw-r--r-- | src/backend/nodes/tidbitmap.c | 22 | 
1 file changed, 19 insertions(+), 3 deletions(-)
```diff
diff --git a/src/backend/nodes/tidbitmap.c b/src/backend/nodes/tidbitmap.c
index ae29bd9970c..54d8028da01 100644
--- a/src/backend/nodes/tidbitmap.c
+++ b/src/backend/nodes/tidbitmap.c
@@ -953,8 +953,11 @@ tbm_lossify(TIDBitmap *tbm)
 	/*
 	 * XXX Really stupid implementation: this just lossifies pages in
 	 * essentially random order.  We should be paying some attention to the
-	 * number of bits set in each page, instead.  Also it might be a good idea
-	 * to lossify more than the minimum number of pages during each call.
+	 * number of bits set in each page, instead.
+	 *
+	 * Since we are called as soon as nentries exceeds maxentries, we should
+	 * push nentries down to significantly less than maxentries, or else we'll
+	 * just end up doing this again very soon.  We shoot for maxentries/2.
 	 */
 	Assert(!tbm->iterating);
 	Assert(tbm->status == TBM_HASH);
@@ -975,7 +978,7 @@ tbm_lossify(TIDBitmap *tbm)
 		/* This does the dirty work ... */
 		tbm_mark_page_lossy(tbm, page->blockno);
 
-		if (tbm->nentries <= tbm->maxentries)
+		if (tbm->nentries <= tbm->maxentries / 2)
 		{
 			/* we have done enough */
 			hash_seq_term(&status);
@@ -988,6 +991,19 @@ tbm_lossify(TIDBitmap *tbm)
 		 * not care whether we visit lossy chunks or not.
 		 */
 	}
+
+	/*
+	 * With a big bitmap and small work_mem, it's possible that we cannot
+	 * get under maxentries.  Again, if that happens, we'd end up uselessly
+	 * calling tbm_lossify over and over.  To prevent this from becoming a
+	 * performance sink, force maxentries up to at least double the current
+	 * number of entries.  (In essence, we're admitting inability to fit
+	 * within work_mem when we do this.)  Note that this test will not fire
+	 * if we broke out of the loop early; and if we didn't, the current
+	 * number of entries is simply not reducible any further.
+	 */
+	if (tbm->nentries > tbm->maxentries / 2)
+		tbm->maxentries = Min(tbm->nentries, (INT_MAX - 1) / 2) * 2;
 }
 
 /*
```
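Two details of the patch deserve a note. The choice of maxentries/2 as the target is what restores linear behavior: stopping exactly at maxentries meant the next insertion immediately re-triggered tbm_lossify, so each scan was paid for by a single insertion, whereas aiming for half the cap means each scan is followed by roughly maxentries/2 insertions before the next one. And in the fallback path, Min(tbm->nentries, (INT_MAX - 1) / 2) * 2 clamps the new cap so the multiplication cannot overflow a signed int: (INT_MAX - 1) / 2 is the largest value that can safely be doubled.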
