path: root/src
Age | Commit message | Author
2013-11-23PL/Tcl: Add event trigger supportPeter Eisentraut
From: Dimitri Fontaine <dimitri@2ndQuadrant.fr>
2013-11-23Fix array slicing of int2vector and oidvector values.Tom Lane
The previous coding labeled expressions such as pg_index.indkey[1:3] as being of int2vector type; which is not right because the subscript bounds of such a result don't, in general, satisfy the restrictions of int2vector. To fix, implicitly promote the result of slicing int2vector to int2[], or oidvector to oid[]. This is similar to what we've done with domains over arrays, which is a good analogy because these types are very much like restricted domains of the corresponding regular-array types. A side-effect is that we now also forbid array-element updates on such columns, eg while "update pg_index set indkey[4] = 42" would have worked before if you were superuser (and corrupted your catalogs irretrievably, no doubt) it's now disallowed. This seems like a good thing since, again, some choices of subscripting would've led to results not satisfying the restrictions of int2vector. The case of an array-slice update was rejected before, though with a different error message than you get now. We could make these cases work in future if we added a cast from int2[] to int2vector (with a cast function checking the subscript restrictions) but it seems unlikely that there's any value in that. Per report from Ronan Dunklau. Back-patch to all supported branches because of the crash risks involved.
2013-11-23Ensure _dosmaperr() actually sets errno correctly.Tom Lane
If logging is enabled, either ereport() or fprintf() might stomp on errno internally, causing this function to return the wrong result. That might only end in a misleading error report, but in any code that's examining errno to decide what to do next, the consequences could be far graver. This has been broken since the very first version of this file in 2006 ... it's a bit astonishing that we didn't identify this long ago. Reported by Amit Kapila, though this isn't his proposed fix.
2013-11-23Fix thinko in SPI_execute_plan() callsPeter Eisentraut
Two call sites were apparently thinking that the last argument of SPI_execute_plan() is the number of query parameters, but it is actually the row limit. Change the calls to 0, since we don't care about the limit there. The previous code didn't break anything, but it was still wrong.
2013-11-23Avoid potential buffer overflow crashPeter Eisentraut
A pointer to a C string was treated as a pointer to a "name" datum and passed to SPI_execute_plan(). This pointer would then end up being passed through datumCopy(), which would try to copy the entire 64 bytes of name data, thus running past the end of the C string. Fix by converting the string to a proper name structure. Found by LLVM AddressSanitizer.
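The size mismatch behind this crash can be modelled outside the server. The following is a minimal Python sketch, assuming a fixed name size of 64 bytes as stated in the message; `NAMEDATALEN` is used here only as a label for that size, and `to_name` is a hypothetical helper, not a PostgreSQL function.

```python
NAMEDATALEN = 64  # assumed fixed size of a "name" datum, per the message above


def to_name(cstring: bytes) -> bytes:
    """Copy a C-style string into a fixed-size, NUL-padded name buffer.

    Anything that later copies NAMEDATALEN bytes from the result stays
    inside the buffer, which is the property the raw C string lacked.
    """
    if len(cstring) >= NAMEDATALEN:
        raise ValueError("string does not fit in a name buffer")
    return cstring + b"\x00" * (NAMEDATALEN - len(cstring))


name = to_name(b"pg_class")
```

A short string such as `b"pg_class"` occupies only 8 bytes on its own, so a 64-byte copy would read 56 bytes past its end; the padded buffer makes the full-width copy safe.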
2013-11-22Flatten join alias Vars before pulling up targetlist items from a subquery.Tom Lane
pullup_replace_vars()'s decisions about whether a pulled-up replacement expression needs to be wrapped in a PlaceHolderVar depend on the assumption that what looks like a Var behaves like a Var. However, if the Var is a join alias reference, later flattening of join aliases might replace the Var with something that's not a Var at all, and should have been wrapped. To fix, do a forcible pass of flatten_join_alias_vars() on the subquery targetlist before we start to copy items out of it. We'll re-run that processing on the pulled-up expressions later, but that's harmless. Per report from Ken Tanzer; the added regression test case is based on his example. This bug has been there since the PlaceHolderVar mechanism was invented, but has escaped detection because the circumstances that trigger it are fairly narrow. You need a flattenable query underneath an outer join, which contains another flattenable query inside a join of its own, with a dangerous expression (a constant or something else non-strict) in that one's targetlist. Having seen this, I'm wondering if it wouldn't be prudent to do all alias-variable flattening earlier, perhaps even in the rewriter. But that would probably not be a back-patchable change.
2013-11-22Fix Hot-Standby initialization of clog and subtrans.Heikki Linnakangas
These bugs can cause data loss on standbys started with hot_standby=on at the moment they start to accept read-only queries, by marking committed transactions as uncommitted. The likelihood of such corruptions is small unless the primary has a high transaction rate. 5a031a5556ff83b8a9646892715d7fef415b83c3 fixed bugs in HS's startup logic by maintaining less state until at least STANDBY_SNAPSHOT_PENDING state was reached, missing the fact that both clog and subtrans are written to before that. This only failed to fail in common cases because the usage of ExtendCLOG in procarray.c was superfluous since clog extensions are actually WAL logged. f44eedc3f0f347a856eea8590730769125964597/I then tried to fix the missing extensions of pg_subtrans due to the former commit's changes - which are not WAL logged - by performing the extensions when switching to a state > STANDBY_INITIALIZED and not performing xid assignments before that - again missing the fact that ExtendCLOG is unnecessary - but screwed up twice: Once because latestObservedXid wasn't updated anymore in that state due to the earlier commit and once by having an off-by-one error in the loop performing extensions. This means that whenever a CLOG_XACTS_PER_PAGE (32768 with default settings) boundary was crossed between the start of the checkpoint recovery started from and the first xl_running_xact record, old transactions' commit bits in pg_clog could be overwritten if they started and committed in that window. Fix this mess by not performing ExtendCLOG() in HS at all anymore since it's unneeded and evidently dangerous and by performing subtrans extensions even before reaching STANDBY_SNAPSHOT_PENDING. Analysis and patch by Andres Freund. Reported by Christophe Pettus. Backpatch down to 9.0, like the previous commit that caused this.
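To make the boundary condition concrete, here is a small Python sketch of which clog page a transaction ID falls on and when a range of IDs crosses a page boundary. It assumes the default of 32768 transactions per page quoted above and ignores transaction ID wraparound; `clog_page` and `crosses_page_boundary` are hypothetical names for illustration, not server functions.

```python
CLOG_XACTS_PER_PAGE = 32768  # default value quoted in the message above


def clog_page(xid: int) -> int:
    """Page number holding the commit bits for xid (wraparound ignored)."""
    return xid // CLOG_XACTS_PER_PAGE


def crosses_page_boundary(first_xid: int, last_xid: int) -> bool:
    """True if the two xids live on different clog pages."""
    return clog_page(first_xid) != clog_page(last_xid)
```

In this model, a window running from xid 32767 to xid 32768 crosses a boundary, while one running from 100 to 200 does not; only the former kind of window was exposed to the overwrite described above.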
2013-11-22Avoid acquiring spinlock when checking if recovery has finished, for speed.Heikki Linnakangas
RecoveryIsInProgress() can be called very frequently. During normal operation, it just checks a backend-local variable and returns quickly, but during hot standby, it checks a spinlock-protected shared variable. Those spinlock acquisitions can become a point of contention on a busy hot standby system. Replace the spinlock acquisition with a memory barrier. Per discussion with Andres Freund, Ants Aasma and Merlin Moncure.
2013-11-21Tweak streamutil.c further to avoid scan-build warningPeter Eisentraut
The previous change added a new scan-build warning about need_password assigned but not read.
2013-11-21Support multi-argument UNNEST(), and TABLE() syntax for multiple functions.Tom Lane
This patch adds the ability to write TABLE( function1(), function2(), ...) as a single FROM-clause entry. The result is the concatenation of the first row from each function, followed by the second row from each function, etc; with NULLs inserted if any function produces fewer rows than others. This is believed to be a much more useful behavior than what Postgres currently does with multiple SRFs in a SELECT list. This syntax also provides a reasonable way to combine use of column definition lists with WITH ORDINALITY: put the column definition list inside TABLE(), where it's clear that it doesn't control the ordinality column as well. Also implement SQL-compliant multiple-argument UNNEST(), by turning UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)). The SQL standard specifies TABLE() with only a single function, not multiple functions, and it seems to require an implicit UNNEST() which is not what this patch does. There may be something wrong with that reading of the spec, though, because if it's right then the spec's TABLE() is just a pointless alternative spelling of UNNEST(). After further review of that, we might choose to adopt a different syntax for what this patch does, but in any case this functionality seems clearly worthwhile. Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and significantly revised by me
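The row-combining rule described here (first row from each function, then the second, with NULLs where a function runs out) behaves like a padded zip. A minimal Python sketch of that rule, using `None` for NULL; `table_of` is a hypothetical name and each list stands in for one function's result set:

```python
from itertools import zip_longest


def table_of(*function_results):
    """Combine several row sources side by side, padding short ones with None."""
    return list(zip_longest(*function_results, fillvalue=None))


rows = table_of([1, 2, 3], ["a", "b"])
```

Here `rows` has three entries, and the last one pairs `3` with `None` because the second source produced only two rows.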
2013-11-21Fix pg_isready to handle -d option properly.Fujii Masao
Previously, the -d option of pg_isready was broken. When a database name was specified with -d, pg_isready failed with an error. When the conninfo specified with -d contained a host name but not a numeric IP address (i.e., hostaddr), pg_isready displayed the wrong connection message. The -d option could not handle a valid URI prefix at all. This commit fixes these pg_isready bugs. Backpatch to 9.3, where pg_isready was introduced. Per report from Josh Berkus and Robert Haas. Original patch by Fabrízio de Royes Mello, heavily modified by me.
2013-11-20More GIN refactoring.Heikki Linnakangas
Split off the portion of ginInsertValue that inserts the tuple to current level into a separate function, ginPlaceToPage. ginInsertValue's charter is now to recurse up the tree to insert the downlink, when a page split is required. This is in preparation for a patch to change the way incomplete splits are handled, which will need to do these operations separately. And IMHO makes the code more readable anyway.
2013-11-20Refactor the internal GIN B-tree interface for forming a downlink.Heikki Linnakangas
This creates a new gin-btree callback function for creating a downlink for a page. Previously, ginxlog.c duplicated the logic used during normal operation.
2013-11-20Further GIN refactoring.Heikki Linnakangas
Merge some functions that were always called together. Makes the code a little bit more readable.
2013-11-19ecpg: Split off mmfatal() from mmerror()Peter Eisentraut
This allows decorating mmfatal() with noreturn compiler hints, leading to better diagnostics.
2013-11-19Add tab completion for \pset in psql.Fujii Masao
Pavel Stehule, reviewed by Ian Lawrence Barwick
2013-11-18Spell SQL keywords in uppercase in pg_dump's query.Heikki Linnakangas
The server won't care, but let's be consistent. David Rowley.
2013-11-18Replace appendPQExpBuffer(..., <constant>) with appendPQExpBufferStrHeikki Linnakangas
Arguably makes the code a bit more readable, and might give a small performance gain. David Rowley
2013-11-18Use cstring_to_text_with_len when length is known.Robert Haas
This avoids a potentially-expensive extra call to strlen(). David Rowley
2013-11-18Count locked pages that don't need vacuuming as scanned.Heikki Linnakangas
Previously, if VACUUM skipped vacuuming a page because it's pinned, it didn't count that page as scanned. However, that meant that relfrozenxid was not bumped up either, which prevented anti-wraparound vacuum from doing its job. Report by Миша Тюрин, analysis and patch by Sergey Burladyn and Jeff Janes. Backpatch to 9.2, where the skip-locked-pages behavior was introduced.
2013-11-17Add make_date() and make_time() functions.Tom Lane
Pavel Stehule, reviewed by Jeevan Chalke and Atri Sharma
2013-11-16Improve performance of numeric sum(), avg(), stddev(), variance(), etc.Tom Lane
This patch improves performance of most built-in aggregates that formerly used a NUMERIC or NUMERIC array as their transition type; this includes not only aggregates on numeric inputs, but some aggregates on integer inputs where overflow of an int8 value is a possibility. The code now uses a special-purpose data structure to avoid array construction and deconstruction overhead, as well as packing and unpacking overhead for numeric values. These aggregates' transition type is now declared as INTERNAL, since it doesn't correspond to any SQL data type. To keep the planner from thinking that that means a lot of storage will be used, we make use of the just-added pg_aggregate.aggtransspace feature. The space estimate is set to 128 bytes, which is at least in the right ballpark. Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra
2013-11-16Allow aggregates to provide estimates of their transition state data size.Tom Lane
Formerly the planner had a hard-wired rule of thumb for guessing the amount of space consumed by an aggregate function's transition state data. This estimate is critical to deciding whether it's OK to use hash aggregation, and in many situations the built-in estimate isn't very good. This patch adds a column to pg_aggregate wherein a per-aggregate estimate can be provided, overriding the planner's default, and infrastructure for setting the column via CREATE AGGREGATE. It may be that additional smarts will be required in future, perhaps even a per-aggregate estimation function. But this is already a step forward. This is extracted from a larger patch to improve the performance of numeric and int8 aggregates. I (tgl) thought it was worth reviewing and committing this infrastructure separately. In this commit, all built-in aggregates are given aggtransspace = 0, so no behavior should change. Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra
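The override logic can be sketched as follows. This is a rough Python model, not the planner's actual arithmetic: the default per-state size, the per-entry overhead, and the function names are all hypothetical. The only details taken from the messages are that aggtransspace = 0 means "use the planner's default" and that the numeric aggregates later set it to 128 bytes.

```python
DEFAULT_TRANS_SPACE = 32   # hypothetical planner default, in bytes
PER_ENTRY_OVERHEAD = 64    # hypothetical hash-entry overhead, in bytes


def estimated_hash_bytes(num_groups: int, aggtransspace: int) -> int:
    """Estimate hash table size; a positive aggtransspace overrides the default."""
    trans = aggtransspace if aggtransspace > 0 else DEFAULT_TRANS_SPACE
    return num_groups * (PER_ENTRY_OVERHEAD + trans)


def hash_agg_fits(num_groups: int, aggtransspace: int, work_mem_bytes: int) -> bool:
    """Decide whether hash aggregation is estimated to fit in memory."""
    return estimated_hash_bytes(num_groups, aggtransspace) <= work_mem_bytes
```

Under these made-up constants, 1000 groups fit in a 100000-byte budget with the default estimate but not with a 128-byte per-aggregate estimate, which is the kind of decision the new column is meant to inform.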
2013-11-15Fix incorrect loop counts in tidbitmap.c.Tom Lane
A couple of places that should have been iterating over WORDS_PER_CHUNK words were iterating over WORDS_PER_PAGE words instead. This thinko accidentally failed to fail, because (at least on common architectures with default BLCKSZ) WORDS_PER_CHUNK is a bit less than WORDS_PER_PAGE, and the extra words being looked at were always zero so nothing happened. Still, it's a bug waiting to happen if anybody ever fools with the parameters affecting TIDBitmap sizes, and it's a small waste of cycles too. So back-patch to all active branches. Etsuro Fujita
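Why the wrong loop bound "accidentally failed to fail" can be shown with a toy model. In the Python sketch below, the word counts 10 and 8 are assumed for illustration (the message says only that WORDS_PER_CHUNK is a bit less than WORDS_PER_PAGE), words are treated as 32 bits wide, and `set_bits` is a hypothetical helper.

```python
WORDS_PER_PAGE = 10   # assumed value for illustration
WORDS_PER_CHUNK = 8   # assumed value; only "a bit less" is stated above
BITS_PER_WORD = 32    # assumed word width


def set_bits(words, nwords):
    """Return the positions of all set bits in the first nwords words."""
    found = []
    for w in range(nwords):
        for b in range(BITS_PER_WORD):
            if (words[w] >> b) & 1:
                found.append(w * BITS_PER_WORD + b)
    return found


# Storage is sized for the larger count; chunk data only ever touches the
# first WORDS_PER_CHUNK words, so the trailing words stay zero.
words = [0] * WORDS_PER_PAGE
words[0] = 0b101
words[7] = 1 << 31

correct = set_bits(words, WORDS_PER_CHUNK)
buggy = set_bits(words, WORDS_PER_PAGE)
```

Both calls return the same positions because the extra words are always zero; the over-long loop only costs cycles, until someone changes the sizing parameters so that the relationship no longer holds.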
2013-11-15Speed up printing of INSERT statements in pg_dump.Tom Lane
In --inserts and especially --column-inserts mode, we can get a useful speedup by generating the common prefix of all a table's INSERT commands just once, and then printing the prebuilt string for each row. This avoids multiple invocations of fmtId() and other minor fooling around. David Rowley
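The optimization amounts to hoisting the invariant part of each statement out of the per-row loop. A minimal Python sketch under stated assumptions: `quote_ident` is a simplified stand-in for fmtId() that always quotes, `literal` handles only NULL, integers and strings, and none of these names come from pg_dump itself.

```python
def quote_ident(name: str) -> str:
    """Simplified identifier quoting (always quotes; doubles embedded quotes)."""
    return '"' + name.replace('"', '""') + '"'


def literal(value) -> str:
    """Render None, ints and strings as SQL literals (illustrative only)."""
    if value is None:
        return "NULL"
    if isinstance(value, int):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def dump_inserts(table, columns, rows):
    # Build the common prefix once, instead of once per row.
    prefix = "INSERT INTO %s (%s) VALUES (" % (
        quote_ident(table),
        ", ".join(quote_ident(c) for c in columns),
    )
    return [prefix + ", ".join(literal(v) for v in row) + ");" for row in rows]


statements = dump_inserts("t", ["a", "b"], [(1, "x"), (2, None)])
```

The identifier quoting runs once per table rather than once per row, which is where the saving comes from when a table has many rows.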
2013-11-15Clean up password prompting logic in streamutil.c.Tom Lane
The previous coding was fairly unreadable and drew double-free warnings from clang. I believe the double free was actually not reachable, because PQconnectionNeedsPassword is coded to not return true if a password was provided, so that the loop can't iterate more than twice. Nonetheless it seems worth rewriting. No back-patch since this is just cosmetic.
2013-11-15Compute correct em_nullable_relids in get_eclass_for_sort_expr().Tom Lane
Bug #8591 from Claudio Freire demonstrates that get_eclass_for_sort_expr must be able to compute valid em_nullable_relids for any new equivalence class members it creates. I'd worried about this in the commit message for db9f0e1d9a4a0842c814a464cdc9758c3f20b96c, but claimed that it wasn't a problem because multi-member ECs should already exist when it runs. That is transparently wrong, though, because this function is also called by initialize_mergeclause_eclasses, which runs during deconstruct_jointree. The example given in the bug report (which the new regression test item is based upon) fails because the COALESCE() expression is first seen by initialize_mergeclause_eclasses rather than process_equivalence. Fixing this requires passing the appropriate nullable_relids set to get_eclass_for_sort_expr, and it requires new code to compute that set for top-level expressions such as ORDER BY, GROUP BY, etc. We store the top-level nullable_relids in a new field in PlannerInfo to avoid computing it many times. In the back branches, I've added the new field at the end of the struct to minimize ABI breakage for planner plugins. There doesn't seem to be a good alternative to changing get_eclass_for_sort_expr's API signature, though. There probably aren't any third-party extensions calling that function directly; moreover, if there are, they probably need to think about what to pass for nullable_relids anyway. Back-patch to 9.2, like the previous patch in this area.
2013-11-15Prevent leakage of cached plans and execution trees in plpgsql DO blocks.Tom Lane
plpgsql likes to cache query plans and simple-expression execution state trees across calls. This is a considerable win for multiple executions of the same function. However, it's useless for DO blocks, since by definition those are executed only once and discarded. Nonetheless, we were allowing a DO block's expression execution trees to survive until end of transaction, resulting in a significant intra-transaction memory leak, as reported by Yeb Havinga. Worse, if the DO block exited with an error, the compiled form of the block's code was leaked till end of session --- along with subsidiary plancache entries. To fix, make DO blocks keep their expression execution trees in a private EState that's deleted at exit from the block, and add a PG_TRY block to plpgsql_inline_handler to make sure that memory cleanup happens even on error exits. Also add a regression test covering error handling in a DO block, because my first try at this broke that. (The test is not meant to prove that we don't leak memory anymore, though it could be used for that with a much larger loop count.) Ideally we'd back-patch this into all versions supporting DO blocks; but the patch needs to add a field to struct PLpgSQL_execstate, and that would break ABI compatibility for third-party plugins such as the plpgsql debugger. Given the small number of complaints so far, fixing this in HEAD only seems like an acceptable choice.
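The shape of the fix, per-block private state that is released on both normal and error exits, corresponds to a try/finally pattern. The Python sketch below is only an analogy: `EState` here is a hypothetical stand-in class, `live_estates` models memory still held, and nothing in it reflects the real plpgsql data structures.

```python
class EState:
    """Hypothetical stand-in for a DO block's private execution state."""

    def __init__(self):
        self.trees = []


live_estates = set()  # models state that has not been freed yet


def run_do_block(body):
    """Run body with a private EState and always release it afterwards."""
    estate = EState()
    live_estates.add(estate)
    try:
        body(estate)
    finally:
        # Plays the role of the cleanup that must also run on error exits.
        live_estates.discard(estate)


def failing_body(estate):
    estate.trees.append("expr")
    raise RuntimeError("error inside DO block")
```

Calling `run_do_block(failing_body)` still propagates the error to the caller, but `live_estates` is empty afterwards, which is the property the PG_TRY block is described as providing.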
2013-11-15Minor comment corrections for sequence hashtable patch.Tom Lane
There were enough typos in the comments to annoy me ...
2013-11-15Fix buffer overrun in isolation test program.Kevin Grittner
Commit 061b88c732952c59741374806e1e41c1ec845d50 saved argv0 to a global buffer without ensuring that it was zero terminated, allowing references to it to overrun the buffer and access other memory. This probably would not have presented any security risk, but could have resulted in very confusing failures if the path to the executable was very long. Reported by David Rowley
2013-11-15Fix bogus hash table creation.Heikki Linnakangas
Andres Freund
2013-11-15Use a hash table to store current sequence values.Heikki Linnakangas
This speeds up nextval() and currval(), when you touch a lot of different sequences in the same backend. David Rowley
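A keyed lookup of per-sequence state can be sketched in Python with a dictionary. This is a toy model: `SeqCache`, its methods, and the use of an integer OID as the key are illustrative assumptions, and real sequences have bounds, caching and persistence that are not modelled.

```python
class SeqCache:
    """Toy per-backend cache of current sequence values, keyed by sequence OID."""

    def __init__(self):
        self._by_oid = {}

    def nextval(self, oid: int, increment: int = 1) -> int:
        # Average-case constant-time lookup, however many sequences are touched.
        entry = self._by_oid.setdefault(oid, {"last": 0})
        entry["last"] += increment
        return entry["last"]

    def currval(self, oid: int) -> int:
        return self._by_oid[oid]["last"]


cache = SeqCache()
```

The cost of each call stays roughly flat as more distinct sequences are touched, which matches the case the message says is sped up.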
2013-11-14Add a regression test case for \d on an index.Tom Lane
Previous commit shows the need for this. The coverage isn't really thorough, but it's better than nothing.
2013-11-14Fix incorrect column name in psql \d code.Tom Lane
pg_index.indisreplident had at one time in its development been called indisidentity. describe.c got missed when it was renamed. Bug introduced in commit 07cacba983ef79be4a84fcd0e0ca3b5fcb85dd65. Andres Freund
2013-11-13Fix whitespacePeter Eisentraut
2013-11-13Fix isolation check for MSVC to handle recent changes.Andrew Dunstan
2013-11-13Fix relfilenodemap.c's handling of cache invalidations.Robert Haas
The old code entered a new hash table entry first, then scanned pg_class to determine what value to fill in, and then populated the entry. This fails to work properly if a cache invalidation happens as a result of opening pg_class. Repair. Along the way, get rid of the idea of blowing away the entire hash table as a method of processing invalidations. Instead, just delete all the entries one by one. This is probably not quite as cheap but it's simpler, and shouldn't happen often. Andres Freund
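One way to avoid the hazard is to create the cache entry only after the value is known, so that an invalidation fired during the scan has no half-built entry to disturb. The Python sketch below illustrates that ordering and the entry-by-entry invalidation; it is an assumption about the general approach, and `RelfilenodeMap`, `scan` and the key and value types are hypothetical.

```python
class RelfilenodeMap:
    """Toy cache whose lookups may trigger invalidation while scanning."""

    def __init__(self, scan):
        self._scan = scan   # callable(key) -> value; may call invalidate()
        self._cache = {}

    def invalidate(self):
        # Delete entries one by one rather than discarding the whole table.
        for key in list(self._cache):
            del self._cache[key]

    def lookup(self, key):
        if key in self._cache:
            return self._cache[key]
        value = self._scan(key)    # invalidation may happen in here
        self._cache[key] = value   # the entry is created only now
        return value


calls = []


def scan(key):
    calls.append(key)
    relmap.invalidate()   # simulate an invalidation arriving mid-scan
    return key * 10


relmap = RelfilenodeMap(scan)
```

A first `relmap.lookup(1)` runs the scan, survives the simulated invalidation, and caches the result; a second lookup is served from the cache without scanning again.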
2013-11-13Free ignorelist after each regression test schedule.Kevin Grittner
It's a trivial amount of RAM held until the end of the regression test run; but it's probably worth fixing to silence future warnings from code analyzers. This was the only memory leak pointed out by clang's static code analysis tool.
2013-11-13Fix bug in GIN posting tree root creation.Heikki Linnakangas
The root page is filled with as many items as fit, and the rest are inserted using normal insertions. However, I fumbled the variable names, and the code actually memcpy'd all the items on the page, overflowing the buffer. While at it, rename the variable to make the distinction more clear. Reported by Teodor Sigaev. This bug was introduced by my recent refactorings, so no backpatching required.
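The intended split is easy to state in code. A minimal Python sketch, where `capacity` stands for however many items fit on the root page and `build_root` is a hypothetical name:

```python
def build_root(items, capacity):
    """Split items into those placed on the root and those inserted normally."""
    on_root = items[:capacity]      # copy only as many items as fit
    remaining = items[capacity:]    # the rest go through normal insertion
    return on_root, remaining


on_root, remaining = build_root(list(range(10)), 4)
```

The bug described above corresponds to copying all of `items` where only `on_root` should have been copied.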
2013-11-13Move variable closer to where it is usedPeter Eisentraut
This avoids an unused variable warning on Windows when building without asserts. From: David Rowley <dgrowleyml@gmail.com>
2013-11-12Try again to make pg_isolation_regress work its build directory.Robert Haas
We can't search for the isolationtester binary until after we've set up the environment, because otherwise when find_other_exec() tries to invoke it with the -V option, it might fail for inability to locate a working libpq. So postpone that step. Andres Freund
2013-11-12Remove leftovers of IRIX portPeter Eisentraut
This removes the remaining pieces of the IRIX port that was removed by ea91a6be89575095f61ebf36d67c2df98be093db.
2013-11-11Fix failure with whole-row reference to a subquery.Tom Lane
Simple oversight in commit 1cb108efb0e60d87e4adec38e7636b6e8efbeb57 --- recursively examining a subquery output column is only sane if the original Var refers to a single output column. Found by Kevin Grittner.
2013-11-11Fix ruleutils pretty-printing to not generate trailing whitespace.Tom Lane
The pretty-printing logic in ruleutils.c operates by inserting a newline and some indentation whitespace into strings that are already valid SQL. This naturally results in leaving some trailing whitespace before the newline in many cases; which can be annoying when processing the output with other tools, as complained of by Joe Abbate. We can fix that in a pretty localized fashion by deleting any trailing whitespace before we append a pretty-printing newline. In addition, we have to modify the code inserted by commit 2f582f76b1945929ff07116cd4639747ce9bb8a1 so that we also delete trailing whitespace when transposing items from temporary buffers into the main result string, when a temporary item starts with a newline. This results in rather voluminous changes to the regression test results, but it's easily verified that they are only removal of trailing whitespace. Back-patch to 9.3, because the aforementioned commit resulted in many more cases of trailing whitespace than had occurred in earlier branches.
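The localized fix can be pictured as a single string operation: strip trailing blanks from the buffer, then append the newline and indentation. A minimal Python sketch, where `append_pretty_newline` is a hypothetical name and the buffer is modelled as an immutable string rather than a growable C buffer:

```python
def append_pretty_newline(buf: str, indent: int) -> str:
    """Append a newline plus indentation, first dropping trailing blanks."""
    return buf.rstrip(" \t") + "\n" + " " * indent


out = append_pretty_newline("SELECT a, ", 4) + "b"
```

No line in `out` ends in whitespace, whereas appending the newline directly would have left a space after the comma.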
2013-11-11Re-allow duplicate aliases within aliased JOINs.Tom Lane
Although the SQL spec forbids duplicate table aliases, historically we've allowed queries like SELECT ... FROM tab1 x CROSS JOIN (tab2 x CROSS JOIN tab3 y) z on the grounds that the aliased join (z) hides the aliases within it, therefore there is no conflict between the two RTEs named "x". The LATERAL patch broke this, on the misguided basis that "x" could be ambiguous if tab3 were a LATERAL subquery. To avoid breaking existing queries, it's better to allow this situation and complain only if tab3 actually does contain an ambiguous reference. We need only remove the check that was throwing an error, because the column lookup code is already prepared to handle ambiguous references. Per bug #8444.
2013-11-11Don't abort pg_basebackup when receiving empty WAL blockMagnus Hagander
This is a fix similar to c6ec8793aa59d1842082e14b4b4aae7d4bd883fd for 9.2. This should never happen in 9.3 and newer since the special case cannot happen there, but this patch synchronizes the code so there is no confusion about why they're different. An empty block is as harmless in 9.3 as it was in 9.2, and can safely be ignored.
2013-11-10Fix whitespace issues found by git diff --check, add gitattributesPeter Eisentraut
Set per file type attributes in .gitattributes to fine-tune whitespace checks. With the associated cleanups, the tree is now clean for git
2013-11-09Fix ECPG compiler warning.Robert Haas
Commit 9b4d52f2095be96ca238ce41f6963ec56376491f failed to notice that pg_regress_ecpg needed updating. This patch was independently submitted by both David Rowley and Andres Freund.
2013-11-08Fix race condition in GIN posting tree page deletion.Heikki Linnakangas
If a page is deleted, and reused for something else, just as a search is following a rightlink to it from its left sibling, the search would continue scanning whatever the new contents of the page are. That could lead to incorrect query results, or even something more curious if the page is reused for a different kind of page. To fix, modify the search algorithm to lock the next page before releasing the previous one, and refrain from deleting pages from the leftmost branch of the tree. Add a new Concurrency section to the README, explaining why this works. There is a lot more one could say about concurrency in GIN, but that's for another patch. Backpatch to all supported versions.
2013-11-08Fix pg_isolation_regress to work outside its build directory.Robert Haas
This makes it possible to, for example, use the isolation tester to test a contrib module. Andres Freund