user/sven/postgresql.git

Age	Commit message	Author
2007-11-08	Improve error message	Peter Eisentraut

2007-11-07	Improve the performance of LIKE/regex estimation in non-C locales, by making	Tom Lane
	make_greater_string() try harder to generate a string that's actually greater than its input string. Before we just assumed that making a string that was memcmp-greater was enough, but it is easy to generate examples where this is not so when the locale is not C. Instead, loop until the relevant comparison function agrees that the generated string is greater than the input. Unfortunately this is probably not enough to guarantee that the generated string is greater than all extensions of the input, so we cannot relax the restriction to C locale for the LIKE/regex index optimization. But it should at least improve the odds of getting a useful selectivity estimate in prefix_selectivity(). Per example from Guillaume Smet. Backpatch to 8.1, mainly because that's what the complainant is using...
2007-11-07	Fix patternsel() and callers to do the right thing for NOT LIKE and the other	Tom Lane
	negated-match operators. patternsel had been using the supplied operator as though it were a positive-match operator, and thus obtaining a wrong result, which was even more wrong after the caller subtracted it from 1. Seems cleanest to give patternsel an explicit "negate" argument so that it knows what's going on. Also install the same factorization scheme for pattern join selectivity estimators; even though they are just stubs at the moment, this may keep someone from making the same type of mistake when they get filled out. Per report from Greg Mullane. Backpatch to 8.2 --- previous releases do not show the problem because patternsel() doesn't actually use the operator directly.
2007-11-06	Some code review for xml.c:	Tom Lane
	Add some more xml_init() calls that might not be necessary, but seem like a good idea to avoid possible problems like we saw in xmlelement(). Fix unsafe assumption that you can keep using the tupledesc of a relcache entry you don't have open. Add missing error checks for SearchSysCache failure. Get rid of handwritten array traversal in xpath() and O(N^2), broken-for-nulls array access code in map_sql_value_to_xml_value(), in favor of using deconstruct_array. Manually adjust a lot of line breaks in places where the code is otherwise gonna look pretty awful after pg_indent hacks it up (original author seems to have liked to lay out code for a 200-column window).
2007-11-05	Fix xmlelement() to initialize libxml correctly before using it, and to avoid	Tom Lane
	assuming that evaluation of its input expressions won't change the state of libxml. This requires refactoring xml_init() to not call xmlInitParser(), since now not all of its callers want that. I also tweaked things to avoid repeated execution of one-time-only tests inside xml_init(), though this is mostly for clarity rather than in hopes of saving any noticeable amount of runtime. Per report from Sheikh Amjad and subsequent discussion. In passing, fix a couple of inadequately schema-qualified queries.
2007-10-24	Set read_only = TRUE while evaluating input queries for ts_rewrite()	Tom Lane
	and ts_stat(), per my recent suggestion. Also add a possibly-not-needed-but-can't-hurt check for NULL SPI_tuptable, before we try to dereference same.
2007-10-24	Remove the aggregate form of ts_rewrite(), since it doesn't work as desired	Tom Lane
	if there are zero rows to aggregate over, and the API seems both conceptually and notationally ugly anyway. We should look for something that improves on the tsquery-and-text-SELECT version (which is also pretty ugly but at least it works...), but it seems that will take query infrastructure that doesn't exist today. (Hm, I wonder if there's anything in or near SQL2003 window functions that would help?) Per discussion.
2007-10-23	Fix two-argument form of ts_rewrite() so it actually works for cases where	Tom Lane
	a later rewrite rule should change a subtree modified by an earlier one. Per my gripe of a few days ago.
2007-10-23	Fix several bugs in tsvectorin, including crash due to uninitialized field and	Tom Lane
	miscomputation of required palloc size. The crash could only occur if the input contained lexemes both with and without positions, which is probably not common in practice. The miscomputation would definitely result in wasted space. Also fix some inconsistent coding around alignment of strings and positions in a tsvector value; these errors could also lead to crashes given mixed with/without position data and a machine that's picky about alignment. And be more careful about checking for overflow of string offsets. Patch is only against HEAD --- I have not looked to see if same bugs are in back-branch contrib/tsearch2 code.
2007-10-21	Fix shared tsvector/tsquery input code so that we don't say "syntax error in	Tom Lane
	tsvector" when we are really parsing a tsquery. Report the bogus input, too. Make styles of some related error messages more consistent.
2007-10-20	Adjust error message to agree with documentation. The tsearch documentation	Tom Lane
	uniformly calls these things weights, not classes.
2007-10-13	Migrate the former contrib/txid module into core. This will make it easier	Tom Lane
	for Slony and Skytools to depend on it. Per discussion.
2007-10-13	Guard against possible double free during error escape from XML	Tom Lane
	functions. Patch for the reported issue from Kris Jurka, some other potential trouble spots plugged by Tom.
2007-10-13	Fix the inadvertent libpq ABI breakage discovered by Martin Pitt: the	Tom Lane
	renumbering of encoding IDs done between 8.2 and 8.3 turns out to break 8.2 initdb and psql if they are run with an 8.3beta1 libpq.so. For the moment we can rearrange the order of enum pg_enc to keep the same number for everything except PG_JOHAB, which isn't a problem since there are no direct references to it in the 8.2 programs anyway. (This does force initdb unfortunately.) Going forward, we want to fix things so that encoding IDs can be changed without an ABI break, and this commit includes the changes needed to allow libpq's encoding IDs to be treated as fully independent of the backend's. The main issue is that libpq clients should not include pg_wchar.h or otherwise assume they know the specific values of libpq's encoding IDs, since they might encounter version skew between pg_wchar.h and the libpq.so they are using. To fix, have libpq officially export functions needed for encoding name<=>ID conversion and validity checking; it was doing this anyway unofficially. It's still the case that we can't renumber backend encoding IDs until the next bump in libpq's major version number, since doing so will break the 8.2-era client programs. However the code is now prepared to avoid this type of problem in future. Note that initdb is no longer a libpq client: we just pull in the two source files we need directly. The patch also fixes a few places that were being sloppy about checking for an unrecognized encoding name.
2007-10-13	Fix ALTER COLUMN TYPE to preserve the tablespace and reloptions of indexes	Tom Lane
	it affects. The original coding neglected tablespace entirely (causing the indexes to move to the database's default tablespace) and for an index belonging to a UNIQUE or PRIMARY KEY constraint, it would actually try to assign the parent table's reloptions to the index :-(. Per bug #3672 and subsequent investigation. 8.0 and 8.1 did not have reloptions, but the tablespace bug is present.
2007-09-26	In the integer-datetimes case, date2timestamp and date2timestamptz need	Tom Lane
	to check for overflow because the legal range of type date is actually wider than timestamp's. Problem found by Neil Conway.
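
The overflow check described above can be sketched in Python. This is an illustration, not the PostgreSQL C code: it assumes integer datetimes store a timestamp as a signed 64-bit count of microseconds and a date as a count of days from the same origin, and the function name simply mirrors the one in the commit message.

```python
USECS_PER_DAY = 86_400_000_000      # microseconds in one day
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1

def date2timestamp(days):
    """Convert a day count to microseconds, refusing values that would
    not fit in a signed 64-bit integer (assumed timestamp representation)."""
    result = days * USECS_PER_DAY   # Python ints do not wrap, so check explicitly
    if not INT64_MIN <= result <= INT64_MAX:
        raise OverflowError("date out of range for timestamp")
    return result

# Largest day count that still fits under the assumed representation.
MAX_SAFE_DAYS = INT64_MAX // USECS_PER_DAY
```

Under these assumptions a 32-bit day count can hold values far beyond `MAX_SAFE_DAYS`, which is why the conversion needs the check.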
2007-09-25	Just-in-time background writing strategy. This code avoids re-scanning	Tom Lane
	buffers that cannot possibly need to be cleaned, and estimates how many buffers it should try to clean based on moving averages of recent allocation requests and density of reusable buffers. The patch also adds a couple more columns to pg_stat_bgwriter to help measure the effectiveness of the bgwriter. Greg Smith, building on his own work and ideas from several other people, in particular a much older patch from Itagaki Takahiro.
2007-09-23	Fix bugs in XML binary I/O functions. Heikki and Tom	Tom Lane

2007-09-22	Fix bogus calculation of potential output string length in translate().	Tom Lane

2007-09-22	Although I'd misdiagnosed the reason for the recent failures on	Tom Lane
	buildfarm member grebe, I see no reason to revert the 1-byte-header-friendly changes I made in varlena.c. Instead, tweak the code a little bit to get more advantage out of that.
2007-09-22	Doh --- what's really happening on buildfarm member grebe is that its	Tom Lane
	malloc returns NULL for malloc(0). Defend against that case.
2007-09-22	Go back to using a separate method for doing ILIKE for single byte	Andrew Dunstan
	character encodings that doesn't involve calling lower(). This should cure the performance regression in this case complained of by Guillaume Smet. It still leaves the horrid performance for multi-byte encodings introduced in 8.2, but there's no obvious solution for that in sight.
2007-09-22	Fix varlena.c routines to allow 1-byte-header text values. This is now	Tom Lane
	demonstrably necessary for text_substring() since regexp_split functions may pass it such a value; and we might as well convert the whole file at once. Per buildfarm results (though I wonder why most machines aren't showing a failure).
2007-09-21	Fix regex, LIKE, and some other second-rank text-manipulation functions	Tom Lane
	to not cause needless copying of text datums that have 1-byte headers. Greg Stark, in response to performance gripe from Guillaume Smet and ITAGAKI Takahiro.
2007-09-20	Solaris portability fix that was previously made in contrib/tsearch2	Tom Lane
	but got lost from the version committed to main tree. Per Greg Stark.
2007-09-20	Fix msvc warnings, patch by Hannes Eder <Hannes@HannesEder.net>	Teodor Sigaev

2007-09-20	HOT updates. When we update a tuple without changing any of its indexed	Tom Lane
	columns, and the new version can be stored on the same heap page, we no longer generate extra index entries for the new version. Instead, index searches follow the HOT-chain links to ensure they find the correct tuple version. In addition, this patch introduces the ability to "prune" dead tuples on a per-page basis, without having to do a complete VACUUM pass to recover space. VACUUM is still needed to clean up dead index entries, however. Pavan Deolasee, with help from a bunch of other people.
2007-09-19	Prevent corr() from returning the wrong results for negative correlation	Neil Conway
	values. The previous coding essentially assumed that x = sqrt(x*x), which does not hold for x < 0. Thanks to Jie Zhang at Greenplum and Gavin Sherry for reporting this issue.
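
A small Python sketch of why the sign gets lost. This is not the PostgreSQL code; the two function names are hypothetical and only contrast a computation that takes the square root of a squared quantity with one that keeps the sign of the covariance term.

```python
import math

def _sums(xs, ys):
    """Return (Sxx, Syy, Sxy), the centered sums of squares and products."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, syy, sxy

def corr_via_square(xs, ys):
    """Hypothetical buggy form: sqrt(r^2) is |r|, so the sign is discarded."""
    sxx, syy, sxy = _sums(xs, ys)
    return math.sqrt((sxy * sxy) / (sxx * syy))

def corr_signed(xs, ys):
    """Sign-preserving form: divide Sxy by sqrt(Sxx * Syy)."""
    sxx, syy, sxy = _sums(xs, ys)
    return sxy / math.sqrt(sxx * syy)
```

For perfectly anti-correlated input such as `xs = [1, 2, 3, 4]` and `ys = [8, 6, 4, 2]`, the first form yields a positive value while the second yields a negative one.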
2007-09-18	Close previously open holes for invalidly encoded data to enter the	Andrew Dunstan
	database via builtin functions, as recently discussed on -hackers. chr() now returns a character in the database encoding. For UTF8 encoded databases the argument is treated as a Unicode code point. For other multi-byte encodings the argument must designate a strict ascii character, or an error is raised, as is also the case if the argument is 0. ascii() is adjusted so that it remains the inverse of chr(). The two argument form of convert() is gone, and the three argument form now takes a bytea first argument and returns a bytea. To cover this loss three new functions are introduced:
	. convert_from(bytea, name) returns text - converts the first argument from the named encoding to the database encoding
	. convert_to(text, name) returns bytea - converts the first argument from the database encoding to the named encoding
	. length(bytea, name) returns int - gives the length of the first argument in characters in the named encoding
2007-09-16	Fix overflow in extract(epoch from interval) for intervals exceeding 68 years.	Tom Lane
	Seems to have been introduced in 8.1 by careless SECS_PER_DAY search-and-replace.
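
The 68-year threshold is what one would expect if the number of seconds passed through a signed 32-bit integer. The commit message does not state the exact mechanism, so the Python sketch below only illustrates that assumption; the helper names are hypothetical.

```python
SECS_PER_DAY = 86400
INT32_MAX = 2**31 - 1

def wrap_int32(value):
    """Simulate two's-complement wraparound of a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31

def epoch_seconds_int32(days):
    """Seconds in `days` days, computed as if in 32-bit arithmetic (assumed bug)."""
    return wrap_int32(days * SECS_PER_DAY)

def epoch_seconds_wide(days):
    """Same computation without a width limit."""
    return days * SECS_PER_DAY

# Largest whole number of days whose second count fits in 32 bits.
MAX_SAFE_DAYS = INT32_MAX // SECS_PER_DAY
YEARS_AT_LIMIT = MAX_SAFE_DAYS / 365.25
```

`MAX_SAFE_DAYS` works out to 24855 days, a little over 68 years; one more day and the 32-bit result turns negative.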
2007-09-13	Fix typo in typecasting.	Teodor Sigaev
	patch from ITAGAKI Takahiro <itagaki.takahiro@oss.ntt.co.jp>
2007-09-11	Remove QueryOperand->istrue flag, it was used only in cover ranking	Teodor Sigaev
	(ts_rank_cd). Use palloc'ed array in ranking instead of flag.
2007-09-11	Refactor from Heikki Linnakangas <heikki@enterprisedb.com>:	Teodor Sigaev
	* Defined new struct WordEntryPosVector that holds a uint16 length and a variable size array of WordEntries. This replaces the previous convention of a variable size uint16 array, with the first element implying the length. WordEntryPosVector has the same layout in memory, but is more readable in source code. The POSDATAPTR and POSDATALEN macros are still used, though it would now be more readable to access the fields in WordEntryPosVector directly.
	* Removed needfree field from DocRepresentation. It was always set to false.
	* Miscellaneous other commenting and refactoring
2007-09-11	Rename recently-added pg_stat_activity column from txn_start to xact_start,	Tom Lane
	for consistency with other column names such as in pg_stat_database.
2007-09-11	Arrange for SET LOCAL's effects to persist until the end of the current top	Tom Lane
	transaction, unless rolled back or overridden by a SET clause for the same variable attached to a surrounding function call. Per discussion, these seem the best semantics. Note that this is an INCOMPATIBLE CHANGE: in 8.0 through 8.2, SET LOCAL's effects disappeared at subtransaction commit (leading to behavior that made little sense at the SQL level). I took advantage of the opportunity to rewrite and simplify the GUC variable save/restore logic a little bit. The old idea of a "tentative" value is gone; it was a hangover from before we had a stack. Also, we no longer need a stack entry for every nesting level, but only for those in which a variable's value actually changed.
2007-09-10	Change void* opaque argument to Datum type, add argument's	Teodor Sigaev
	name to PushFunction type definition. Per suggestion by Tom Lane <tgl@sss.pgh.pa.us>
2007-09-07	Improvements from Heikki Linnakangas <heikki@enterprisedb.com>	Teodor Sigaev
	- change the alignment requirement of lexemes in TSVector slightly. Lexeme strings were always padded to 2-byte aligned length to make sure that if there's position array (uint16[]) it has the right alignment. The patch changes that so that the padding is not done when there's no positions. That makes the storage of tsvectors without positions slightly more compact.
	- added some #include "miscadmin.h" lines I missed earlier when I added calls to check_stack_depth().
	- Reimplement the send/recv functions, and added a comment above them describing the on-wire format. The CRC is now recalculated in tsquery as well per previous discussion.
2007-09-07	Improving various checks by Heikki Linnakangas <heikki@enterprisedb.com>	Teodor Sigaev
	- add code to check that the query tree is well-formed. It was indeed possible to send malformed queries in binary mode, which produced all kinds of strange results.
	- make the left-field a uint32. There's no reason to arbitrarily limit it to 16-bits, and it won't increase the disk/memory footprint either now that QueryOperator and QueryOperand are separate structs.
	- add check_stack_depth() call to all recursive functions I found. Some of them might have a natural limit so that you can't force arbitrarily deep recursions, but check_stack_depth() is cheap enough that it seems best to just stick it into anything that might be a problem.
2007-09-07	Refactoring by Heikki Linnakangas <heikki@enterprisedb.com> with	Teodor Sigaev
	small editorization by me
	- Break the QueryItem struct into QueryOperator and QueryOperand. Type was really the only common field between them. QueryItem still exists, and is used in the TSQuery struct as before, but it's now a union of the two. Many other changes fell from that, like separation of pushval_asis function into pushValue, pushOperator and pushStop.
	- Moved some structs that were for internal use only from header files to the right .c-files.
	- Moved tsvector parser to a new tsvector_parser.c file. Parser code was about half of the size of tsvector.c, it's also used from tsquery.c, and it has some data structures of its own, so it seems better to separate it. Cleaned up the API so that TSVectorParserState is not accessed from outside tsvector_parser.c.
	- Separated enumerations (#defines, really) used for QueryItem.type field and as return codes from gettoken_query. It was just accidental code sharing.
	- Removed ParseQueryNode struct used internally by makepol and friends. push*-functions now construct QueryItems directly.
	- Changed int4 variables to just ints for variables like "i" or "array size", where the storage-size was not significant.
2007-09-05	Implement lazy XID allocation: transactions that do not modify any database	Tom Lane
	rows will normally never obtain an XID at all. We already did things this way for subtransactions, but this patch extends the concept to top-level transactions. In applications where there are lots of short read-only transactions, this should improve performance noticeably; not so much from removal of the actual XID-assignments, as from reduction of overhead that's driven by the rate of XID consumption. We add a concept of a "virtual transaction ID" so that active transactions can be uniquely identified even if they don't have a regular XID. This is a much lighter-weight concept: uniqueness of VXIDs is only guaranteed over the short term, and no on-disk record is made about them. Florian Pflug, with some editorialization by Tom.
2007-09-04	Provide for binary input/output of enums, to fix complaint from Merlin Moncure.	Andrew Dunstan
	This just provides text values, we're not exposing the underlying Oid representation. Catalog version bumped.
2007-08-31	Apply a band-aid fix for the problem that 8.2 and up completely misestimate	Tom Lane
	the number of rows likely to be produced by a query such as
	    SELECT * FROM t1 LEFT JOIN t2 USING (key) WHERE t2.key IS NULL;
	What this is doing is selecting for t1 rows with no match in t2, and thus it may produce a significant number of rows even if the t2.key table column contains no nulls at all. 8.2 thinks the table column's null fraction is relevant and thus may estimate no rows out, which results in terrible plans if there are more joins above this one. A proper fix for this will involve passing much more information about the context of a clause to the selectivity estimator functions than we ever have. There's no time left to write such a patch for 8.3, and it wouldn't be back-patchable into 8.2 anyway. Instead, put in an ad-hoc test to defeat the normal table-stats-based estimation when an IS NULL test is evaluated at an outer join, and just use a constant estimate instead --- I went with 0.5 for lack of a better idea. This won't catch every case but it will catch the typical ways of writing such queries, and it seems unlikely to make things worse for other queries.
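
The mismatch between the column's null fraction and the query's result can be reproduced with a toy data set. The Python sketch below uses the standard-library sqlite3 module purely as a convenient SQL engine; it says nothing about PostgreSQL's planner, and the table contents and the column name k are made up for illustration.

```python
import sqlite3

def anti_join_demo():
    """Return (number of NULLs stored in t2.k, rows of t1 with no match in t2)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (k INTEGER)")
    conn.execute("CREATE TABLE t2 (k INTEGER)")
    # t1 holds 0..9; t2 holds only the even keys and no NULLs at all.
    conn.executemany("INSERT INTO t1 VALUES (?)", [(i,) for i in range(10)])
    conn.executemany("INSERT INTO t2 VALUES (?)", [(i,) for i in range(0, 10, 2)])
    nulls_in_t2 = conn.execute(
        "SELECT count(*) FROM t2 WHERE k IS NULL"
    ).fetchone()[0]
    unmatched = conn.execute(
        "SELECT count(*) FROM t1 LEFT JOIN t2 USING (k) WHERE t2.k IS NULL"
    ).fetchone()[0]
    conn.close()
    return nulls_in_t2, unmatched
```

Here the stored column contains no NULLs, yet half of the t1 rows come back from the outer join, so an estimate derived from the stored null fraction alone would predict no rows.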
2007-08-31	Install check_stack_depth() protection in two recursive tsquery	Tom Lane
	processing routines. Per Heikki.
2007-08-30	Fix int8mul so that overflow check is applied correctly for INT64_IS_BUSTED	Tom Lane
	case, per Florian Pflug. Not back-patched since it's unclear that anyone but me still cares ...
2007-08-29	Relax permissions checks on dbsize functions, per discussion. Revert out all	Tom Lane
	checks for individual-table-size functions, since anyone in the database could get approximate values from pg_class.relpages anyway. Allow database-size to users with CONNECT privilege for the target database (note that this is granted by default). Allow tablespace-size if the user has CREATE privilege on the tablespace (which is not granted by default), or if the tablespace is the default tablespace for the current database (since we treat that as implicitly allowing use of the tablespace).
2007-08-27	Remove the 'not in' operator (!!=). This was a hangover from Berkeley	Tom Lane
	days that was obsolete the moment we had IN (SELECT ...) capability. It's arguably a security hole since it applied no permissions check to the table it searched, and since it was never documented anywhere, removing it seems more appropriate than fixing it.
2007-08-27	Restrict pg_relation_size to relation owner, pg_database_size to DB owner,	Tom Lane
	and pg_tablespace_size to superusers. Perhaps we could weaken the first case to just require SELECT privilege, but that doesn't work for the other cases, so use ownership as the common concept.
2007-08-27	Make currtid() functions require SELECT privileges on the target table.	Tom Lane
	While it's not clear that TID linkage info is of any great use to a nefarious user, it's certainly unexpected that these functions wouldn't insist on read privileges.
2007-08-21	Remove extraneous semicolon --- buildfarm member bear, for one,	Tom Lane
	objects to it.
2007-08-21	Fix cash_mul_int4 and cash_div_int4 for overenthusiastic substitution	Tom Lane
	of int64 for int32. Per reports from Merlin Moncure and Andrew Chernow.