user/sven/postgresql.git

Age	Commit message (Collapse)	Author
2007-09-26	Create a function variable "join_search_hook" to let plugins override the	Tom Lane
	join search order portion of the planner; this is specifically intended to simplify developing a replacement for GEQO planning. Patch by Julius Stroffek, editorialized on by me. I renamed make_one_rel_by_joins to standard_join_search and make_rels_by_joins to join_search_one_level to better reflect their place within this scheme.
2007-09-25	Change on-disk representation of NUMERIC datatype so that the sign_dscale	Tom Lane
	word comes before the weight instead of after. This will allow future binary-compatible extension of the representation to support compact formats, as discussed on pgsql-hackers around 2007/06/18. The reason to do it now is that we've already pretty well broken any chance of simple in-place upgrade from 8.2 to 8.3, but it's possible that 8.3 to 8.4 (or whenever we get around to squeezing NUMERIC) could otherwise be data-compatible.
2007-09-25	Just-in-time background writing strategy. This code avoids re-scanning	Tom Lane
	buffers that cannot possibly need to be cleaned, and estimates how many buffers it should try to clean based on moving averages of recent allocation requests and density of reusable buffers. The patch also adds a couple more columns to pg_stat_bgwriter to help measure the effectiveness of the bgwriter. Greg Smith, building on his own work and ideas from several other people, in particular a much older patch from Itagaki Takahiro.
2007-09-24	Simplify and rename some GUC variables, per various recent discussions:	Tom Lane
	* stats_start_collector goes away; we always start the collector process, unless prevented by a problem with setting up the stats UDP socket. * stats_reset_on_server_start goes away; it seems useless in view of the availability of pg_stat_reset(). * stats_block_level and stats_row_level are merged into a single variable "track_counts", which controls all reports sent to the collector process. * stats_command_string is renamed to track_activities. * log_autovacuum is renamed to log_autovacuum_min_duration to better reflect its meaning. The log_autovacuum change is not a compatibility issue since it didn't exist before 8.3 anyway. The other changes need to be release-noted.
2007-09-24	Remove "convert 'blah' using conversion_name" facility, because if it	Andrew Dunstan
	produces text it is an encoding hole and if not it's incompatible with the spec, whatever the spec means (which we're not sure about anyway).
2007-09-22	Fix cost estimates for EXISTS subqueries that are evaluated as initPlans	Tom Lane
	(because they are uncorrelated with the immediate parent query). We were charging the full run cost to the parent node, disregarding the fact that only one row need be fetched for EXISTS. While this would only be a cosmetic issue in most cases, it might possibly affect planning outcomes if the parent query were itself a subquery to some upper query. Per recent discussion with Steve Crawford.
2007-09-22	Parenthesize macro arguments safely. I see no bug among the current	Tom Lane
	uses of PG_DETOAST_DATUM_SLICE, but it's clearly trouble waiting to happen.
2007-09-21	Fix regex, LIKE, and some other second-rank text-manipulation functions	Tom Lane
	to not cause needless copying of text datums that have 1-byte headers. Greg Stark, in response to performance gripe from Guillaume Smet and ITAGAKI Takahiro.
2007-09-21	Improve handling of prune/no-prune decisions by storing a page's oldest	Tom Lane
	unpruned XMAX in its header. At the cost of 4 bytes per page, this keeps us from performing heap_page_prune when there's no chance of pruning anything. Seems to be necessary per Heikki's preliminary performance testing.
2007-09-21	If we're gonna provide an --enable-profiling configure option, surely	Tom Lane
	it ought to know that you need -DLINUX_PROFILE on Linux.
2007-09-20	HOT updates. When we update a tuple without changing any of its indexed	Tom Lane
	columns, and the new version can be stored on the same heap page, we no longer generate extra index entries for the new version. Instead, index searches follow the HOT-chain links to ensure they find the correct tuple version. In addition, this patch introduces the ability to "prune" dead tuples on a per-page basis, without having to do a complete VACUUM pass to recover space. VACUUM is still needed to clean up dead index entries, however. Pavan Deolasee, with help from a bunch of other people.
2007-09-18	Close previously open holes for invalidly encoded data to enter the	Andrew Dunstan
	database via builtin functions, as recently discussed on -hackers. chr() now returns a character in the database encoding. For UTF8 encoded databases the argument is treated as a Unicode code point. For other multi-byte encodings the argument must designate a strict ascii character, or an error is raised, as is also the case if the argument is 0. ascii() is adjusted so that it remains the inverse of chr(). The two argument form of convert() is gone, and the three argument form now takes a bytea first argument and returns a bytea. To cover this loss three new functions are introduced: . convert_from(bytea, name) returns text - converts the first argument from the named encoding to the database encoding . convert_to(text, name) returns bytea - converts the first argument from the database encoding to the named encoding . length(bytea, name) returns int - gives the length of the first argument in characters in the named encoding
2007-09-12	Redefine the lp_flags field of item pointers as having four states, rather	Tom Lane
	than two independent bits (one of which was never used in heap pages anyway, or at least hadn't been in a very long time). This gives us flexibility to add the HOT notions of redirected and dead item pointers without requiring anything so klugy as magic values of lp_off and lp_len. The state values are chosen so that for the states currently in use (pre-HOT) there is no change in the physical representation.
2007-09-11	Remove QueryOperand->istrue flag, it was used only in cover ranking	Teodor Sigaev
	(ts_rank_cd). Use palloc'ed array in ranking instead of flag.
2007-09-11	Fix header's size of structs defines in ispell.	Teodor Sigaev
	Backpatch is needed for contrib version.
2007-09-11	Refactor from Heikki Linnakangas <heikki@enterprisedb.com>:	Teodor Sigaev
	* Defined new struct WordEntryPosVector that holds a uint16 length and a variable size array of WordEntries. This replaces the previous convention of a variable size uint16 array, with the first element implying the length. WordEntryPosVector has the same layout in memory, but is more readable in source code. The POSDATAPTR and POSDATALEN macros are still used, though it would now be more readable to access the fields in WordEntryPosVector directly. * Removed needfree field from DocRepresentation. It was always set to false. * Miscellaneous other commenting and refactoring
2007-09-11	Rename recently-added pg_stat_activity column from txn_start to xact_start,	Tom Lane
	for consistency with other column names such as in pg_stat_database.
2007-09-11	Arrange for SET LOCAL's effects to persist until the end of the current top	Tom Lane
	transaction, unless rolled back or overridden by a SET clause for the same variable attached to a surrounding function call. Per discussion, these seem the best semantics. Note that this is an INCOMPATIBLE CHANGE: in 8.0 through 8.2, SET LOCAL's effects disappeared at subtransaction commit (leading to behavior that made little sense at the SQL level). I took advantage of the opportunity to rewrite and simplify the GUC variable save/restore logic a little bit. The old idea of a "tentative" value is gone; it was a hangover from before we had a stack. Also, we no longer need a stack entry for every nesting level, but only for those in which a variable's value actually changed.
2007-09-10	Heikki Linnakangas <heikki@enterprisedb.com>:	Teodor Sigaev
	Add tsearch subdirectory is added to Makefile to allow compile custom tsearch dictionary as an external module.
2007-09-10	Change void* opaque argument to Datum type, add argument's	Teodor Sigaev
	name to PushFunction type definition. Per suggestion by Tome Lane <tgl@sss.pgh.pa.us>
2007-09-10	Code review for GUC revert-values-if-removed-from-postgresql.conf patch;	Tom Lane
	and in passing, fix some bogosities dating from the custom_variable_classes patch. Fix guc-file.l to correctly check changes in custom_variable_classes that are attempted concurrently with additions/removals of custom variables, and don't allow the new setting to be applied in advance of checking it. Clean up messy and undocumented situation for string variables with NULL boot_val. Fix DefineCustomVariable functions to initialize boot_val correctly. Prevent find_option from inserting bogus placeholders for custom variables that are simply inquired about rather than being set.
2007-09-08	Replace the former method of determining snapshot xmax --- to wit, calling	Tom Lane
	ReadNewTransactionId from GetSnapshotData --- with a "latestCompletedXid" variable that is updated during transaction commit or abort. Since latestCompletedXid is written only in places that had to lock ProcArrayLock exclusively anyway, and is read only in places that had to lock ProcArrayLock shared anyway, it adds no new locking requirements to the system despite being cluster-wide. Moreover, removing ReadNewTransactionId from snapshot acquisition eliminates the need to take both XidGenLock and ProcArrayLock at the same time. Since XidGenLock is sometimes held across I/O this can be a significant win. Some preliminary benchmarking suggested that this patch has no effect on average throughput but can significantly improve the worst-case transaction times seen in pgbench. Concept by Florian Pflug, implementation by Tom Lane.
2007-09-07	Improvements from Heikki Linnakangas <heikki@enterprisedb.com>	Teodor Sigaev
	- change the alignment requirement of lexemes in TSVector slightly. Lexeme strings were always padded to 2-byte aligned length to make sure that if there's position array (uint16[]) it has the right alignment. The patch changes that so that the padding is not done when there's no positions. That makes the storage of tsvectors without positions slightly more compact. - added some #include "miscadmin.h" lines I missed in the earlier when I added calls to check_stack_depth(). - Reimplement the send/recv functions, and added a comment above them describing the on-wire format. The CRC is now recalculated in tsquery as well per previous discussion.
2007-09-07	Improving various checks by Heikki Linnakangas <heikki@enterprisedb.com>	Teodor Sigaev
	- add code to check that the query tree is well-formed. It was indeed possible to send malformed queries in binary mode, which produced all kinds of strange results. - make the left-field a uint32. There's no reason to arbitrarily limit it to 16-bits, and it won't increase the disk/memory footprint either now that QueryOperator and QueryOperand are separate structs. - add check_stack_depth() call to all recursive functions I found. Some of them might have a natural limit so that you can't force arbitrarily deep recursions, but check_stack_depth() is cheap enough that seems best to just stick it into anything that might be a problem.
2007-09-07	Refactoring by Heikki Linnakangas <heikki@enterprisedb.com> with	Teodor Sigaev
	small editorization by me - Brake the QueryItem struct into QueryOperator and QueryOperand. Type was really the only common field between them. QueryItem still exists, and is used in the TSQuery struct as before, but it's now a union of the two. Many other changes fell from that, like separation of pushval_asis function into pushValue, pushOperator and pushStop. - Moved some structs that were for internal use only from header files to the right .c-files. - Moved tsvector parser to a new tsvector_parser.c file. Parser code was about half of the size of tsvector.c, it's also used from tsquery.c, and it has some data structures of its own, so it seems better to separate it. Cleaned up the API so that TSVectorParserState is not accessed from outside tsvector_parser.c. - Separated enumerations (#defines, really) used for QueryItem.type field and as return codes from gettoken_query. It was just accidental code sharing. - Removed ParseQueryNode struct used internally by makepol and friends. push*-functions now construct QueryItems directly. - Changed int4 variables to just ints for variables like "i" or "array size", where the storage-size was not significant.
2007-09-07	Allow CREATE INDEX CONCURRENTLY to disregard transactions in other	Tom Lane
	databases, per gripe from hubert depesz lubaczewski. Patch from Simon Riggs.
2007-09-06	Make eval_const_expressions() preserve typmod when simplifying something like	Tom Lane
	null::char(3) to a simple Const node. (It already worked for non-null values, but not when we skipped evaluation of a strict coercion function.) This prevents loss of typmod knowledge in situations such as exhibited in bug #3598. Unfortunately there seems no good way to fix that bug in 8.1 and 8.2, because they simply don't carry a typmod for a plain Const node. In passing I made all the other callers of makeNullConst supply "real" typmod values too, though I think it probably doesn't matter anywhere else.
2007-09-05	Implement lazy XID allocation: transactions that do not modify any database	Tom Lane
	rows will normally never obtain an XID at all. We already did things this way for subtransactions, but this patch extends the concept to top-level transactions. In applications where there are lots of short read-only transactions, this should improve performance noticeably; not so much from removal of the actual XID-assignments, as from reduction of overhead that's driven by the rate of XID consumption. We add a concept of a "virtual transaction ID" so that active transactions can be uniquely identified even if they don't have a regular XID. This is a much lighter-weight concept: uniqueness of VXIDs is only guaranteed over the short term, and no on-disk record is made about them. Florian Pflug, with some editorialization by Tom.
2007-09-04	Provide for binary input/output of enums, to fix complaint from Merlin Moncure.	Andrew Dunstan
	This just provides text values, we're not exposing the underlying Oid representation. Catalog version bumped.
2007-09-03	Support SET FROM CURRENT in CREATE/ALTER FUNCTION, ALTER DATABASE, ALTER ROLE.	Tom Lane
	(Actually, it works as a plain statement too, but I didn't document that because it seems a bit useless.) Unify VariableResetStmt with VariableSetStmt, and clean up some ancient cruft in the representation of same.
2007-09-03	Improve stylistic consistency of descriptions of built-in objects by avoiding	Tom Lane
	initcap style --- the vast majority of the existing descriptions do not use an initial cap. I didn't change places where the first word was all-cap. initdb not forced because this doesn't change any regression test results.
2007-09-03	Fix breakage of GIN support for varchar[] and cidr[] that I introduced in the	Tom Lane
	operator-family rewrite. I had mistakenly supposed that these could use the pg_amproc entries for text[] and inet[] respectively. However, binary compatibility of the underlying types does not make two array types binary compatible (since they must differ in the header field that gives the element type OID), and so the index support code doesn't consider those entries applicable. Add back the missing pg_amproc entries, and add an opr_sanity query to try to catch such mistakes in future. Per report from Gregory Maxwell.
2007-09-03	Implement function-local GUC parameter settings, as per recent discussion.	Tom Lane
	There are still some loose ends: I didn't do anything about the SET FROM CURRENT idea yet, and it's not real clear whether we are happy with the interaction of SET LOCAL with function-local settings. The documentation is a bit spartan, too.
2007-08-31	Apply a band-aid fix for the problem that 8.2 and up completely misestimate	Tom Lane
	the number of rows likely to be produced by a query such as SELECT * FROM t1 LEFT JOIN t2 USING (key) WHERE t2.key IS NULL; What this is doing is selecting for t1 rows with no match in t2, and thus it may produce a significant number of rows even if the t2.key table column contains no nulls at all. 8.2 thinks the table column's null fraction is relevant and thus may estimate no rows out, which results in terrible plans if there are more joins above this one. A proper fix for this will involve passing much more information about the context of a clause to the selectivity estimator functions than we ever have. There's no time left to write such a patch for 8.3, and it wouldn't be back-patchable into 8.2 anyway. Instead, put in an ad-hoc test to defeat the normal table-stats-based estimation when an IS NULL test is evaluated at an outer join, and just use a constant estimate instead --- I went with 0.5 for lack of a better idea. This won't catch every case but it will catch the typical ways of writing such queries, and it seems unlikely to make things worse for other queries.
2007-08-31	Rewrite make_outerjoininfo's construction of min_lefthand and min_righthand	Tom Lane
	sets for outer joins, in the light of bug #3588 and additional thought and experimentation. The original methodology was fatally flawed for nests of more than two outer joins: it got the relationships between adjacent joins right, but didn't always come to the right conclusions about whether a join could be interchanged with one two or more levels below it. This was largely caused by a mistaken idea that we should use the min_lefthand + min_righthand sets of a sub-join as the minimum left or right input set of an upper join when we conclude that the sub-join can't commute with the upper one. If there's a still-lower join that the sub-join can commute with, this method led us to think that that one could commute with the topmost join; which it can't. Another problem (not directly connected to bug #3588) was that make_outerjoininfo's processing-order-dependent method for enforcing outer join identity #3 didn't work right: if we decided that join A could safely commute with lower join B, we dropped all information about sub-joins under B that join A could perhaps not safely commute with, because we removed B's entire min_righthand from A's. To fix, make an explicit computation of all inner join combinations that occur below an outer join, and add to that the full syntactic relsets of any lower outer joins that we determine it can't commute with. This method gives much more direct enforcement of the outer join rearrangement identities, and it turns out not to cost a lot of additional bookkeeping. Thanks to Richard Harris for the bug report and test case.
2007-08-27	Fix a couple of misbehaviors rooted in the fact that the default creation	Tom Lane
	namespace isn't necessarily first in the search path (there could be implicit schemas ahead of it). Examples are test=# set search_path TO s1; test=# create view pg_timezone_names as select * from pg_timezone_names(); ERROR: "pg_timezone_names" is already a view test=# create table pg_class (f1 int primary key); ERROR: permission denied: "pg_class" is a system catalog You'd expect these commands to create the requested objects in s1, since names beginning with pg_ aren't supposed to be reserved anymore. What is happening is that we create the requested base table and then execute additional commands (here, CREATE RULE or CREATE INDEX), and that code is passed the same RangeVar that was in the original command. Since that RangeVar has schemaname = NULL, the secondary commands think they should do a path search, and that means they find system catalogs that are implicitly in front of s1 in the search path. This is perilously close to being a security hole: if the secondary command failed to apply a permission check then it'd be possible for unprivileged users to make schema modifications to system catalogs. But as far as I can find, there is no code path in which a check doesn't occur. Which makes it just a weird corner-case bug for people who are silly enough to want to name their tables the same as a system catalog. The relevant code has changed quite a bit since 8.2, which means this patch wouldn't work as-is in the back branches. Since it's a corner case no one has reported from the field, I'm not going to bother trying to back-patch.
2007-08-27	Remove the 'not in' operator (!!=). This was a hangover from Berkeley	Tom Lane
	days that was obsolete the moment we had IN (SELECT ...) capability. It's arguably a security hole since it applied no permissions check to the table it searched, and since it was never documented anywhere, removing it seems more appropriate than fixing it.
2007-08-26	Make ARRAY(SELECT ...) return an empty array, rather than a NULL, when the	Tom Lane
	sub-select returns zero rows. Per complaint from Jens Schicke. Since this is more in the nature of a definition change than a bug, not back-patched.
2007-08-25	Rename built-in Snowball stemmer dictionaries to be english_stem,	Tom Lane
	russian_stem, etc. Per discussion.
2007-08-25	Cleanup for some problems in tsearch patch:	Tom Lane
	- ispell initialization crashed on empty dictionary file - ispell initialization crashed on affix file with prefixes but no suffixes - stop words file was run through pg_verify_mbstr, with database encoding, but it's supposed to be UTF-8; similar bug for synonym files - bunch of comments added, typos fixed, and other cleanup Introduced consistent encoding checking/conversion of data read from tsearch configuration files, by doing this in a single t_readline() subroutine (replacing direct usages of fgets). Cleaned up API for readstopwords too. Heikki Linnakangas
2007-08-22	Remove option to change parser of an existing text search configuration.	Tom Lane
	This prevents needing to do complex and poorly-defined updates of the mapping table if the new parser has different token types than the old. Per discussion.
2007-08-22	Simplify the syntax of CREATE/ALTER TEXT SEARCH DICTIONARY by treating the	Tom Lane
	init options of the template as top-level options in the syntax. This also makes ALTER a bit easier to use, since options can be replaced individually. I also made these statements verify that the tmplinit method will accept the new settings before they get stored; in the original coding you didn't find out about mistakes until the dictionary got invoked. Under the hood, init methods now get options as a List of DefElem instead of a raw text string --- that lets tsearch use existing options-pushing code instead of duplicating functionality.
2007-08-21	Tsearch2 functionality migrates to core. The bulk of this work is by	Tom Lane
	Oleg Bartunov and Teodor Sigaev, but I did a lot of editorializing, so anything that's broken is probably my fault. Documentation is nonexistent as yet, but let's land the patch so we can get some portability testing done.
2007-08-19	Provide for logfiles in machine readable CSV format. In consequence, rename	Andrew Dunstan
	redirect_stderr to logging_collector. Original patch from Arul Shaji, subsequently modified by Greg Smith, and then heavily modified by me.
2007-08-15	Arrange to cache a ResultRelInfo in the executor's EState for relations that	Tom Lane
	are not one of the query's defined result relations, but nonetheless have triggers fired against them while the query is active. This was formerly impossible but can now occur because of my recent patch to fix the firing order for RI triggers. Caching a ResultRelInfo avoids duplicating work by repeatedly opening and closing the same relation, and also allows EXPLAIN ANALYZE to "see" and report on these extra triggers. Use the same mechanism to cache open relations when firing deferred triggers at transaction shutdown; this replaces the former one-element-cache strategy used in that case, and should improve performance a bit when there are deferred triggers on a number of relations.
2007-08-15	Repair problems occurring when multiple RI updates have to be done to the same	Tom Lane
	row within one query: we were firing check triggers before all the updates were done, leading to bogus failures. Fix by making the triggers queued by an RI update go at the end of the outer query's trigger event list, thereby effectively making the processing "breadth-first". This was indeed how it worked pre-8.0, so the bug does not occur in the 7.x branches. Per report from Pavel Stehule.
2007-08-14	Fix oversight in async-commit patch: there were some places in heapam.c	Tom Lane
	that still thought they could set HEAP_XMAX_COMMITTED immediately after seeing the other transaction commit. Make them use the same logic as tqual.c does to determine if the hint bit can be set yet.
2007-08-07	Adjust the output of MemoryContextStats() so that the stats for a	Neil Conway
	child memory contexts is indented two spaces to the right of its parent context. This should make it easier to deduce the memory context hierarchy from the output of MemoryContextStats().
2007-08-05	Apparently icc doesn't always define __ICC, and it's more correct to	Tom Lane
	check for __INTEL_COMPILER. Per report from Dirk Tilger. Not back-patched since I don't fully trust it yet ...
2007-08-04	Fix up bad layout of some comments (probably pg_indent's fault), and	Tom Lane
	improve grammar a tad. Per Greg Stark.