summaryrefslogtreecommitdiff
path: root/src
AgeCommit message (Collapse)Author
2014-01-29Fix unsafe references to errno within error messaging logic.Tom Lane
Various places were supposing that errno could be expected to hold still within an ereport() nest or similar contexts. This isn't true necessarily, though in some cases it accidentally failed to fail depending on how the compiler chanced to order the subexpressions. This class of thinko explains recent reports of odd failures on clang-built versions, typically missing or inappropriate HINT fields in messages. Problem identified by Christian Kruse, who also submitted the patch this commit is based on. (I fixed a few issues in his patch and found a couple of additional places with the same disease.) Back-patch as appropriate to all supported branches.
2014-01-23Fix bugs in PQhost().Fujii Masao
In the platform that doesn't support Unix-domain socket, when neither host nor hostaddr are specified, the default host 'localhost' is used to connect to the server and PQhost() must return that, but it didn't. This patch fixes PQhost() so that it returns the default host in that case. Also this patch fixes PQhost() so that it doesn't return Unix-domain socket directory path in the platform that doesn't support Unix-domain socket. Back-patch to all supported versions.
2014-01-21Allow type_func_name_keywords in even more placesStephen Frost
A while back, 2c92edad48796119c83d7dbe6c33425d1924626d allowed type_func_name_keywords to be used in more places, including role identifiers. Unfortunately, that commit missed out on cases where name_list was used for lists-of-roles, eg: for DROP ROLE. This resulted in the unfortunate situation that you could CREATE a role with a type_func_name_keywords-allowed identifier, but not DROP it (directly- ALTER could be used to rename it to something which could be DROP'd). This extends allowing type_func_name_keywords to places where role lists can be used. Back-patch to 9.0, as 2c92edad48796119c83d7dbe6c33425d1924626d was.
2014-01-21Tweak parse location assignment for CURRENT_DATE and related constructs.Tom Lane
All these constructs generate parse trees consisting of a Const and a run-time type coercion (perhaps a FuncExpr or a CoerceViaIO). Modify the raw parse output so that we end up with the original token's location attached to the type coercion node while the Const has location -1; before, it was the other way around. This makes no difference in terms of what exprLocation() will say about the parse tree as a whole, so it should not have any user-visible impact. The point of changing it is that we do not want contrib/pg_stat_statements to treat these constructs as replaceable constants. It will do the right thing if the Const has location -1 rather than a valid location. This is a pretty ugly hack, but then this code is ugly already; we should someday replace this translation with special-purpose parse node(s) that would allow ruleutils.c to reconstruct the original query text. (See also commit 5d3fcc4c2e137417ef470d604fee5e452b22f6a7, which also hacked location assignment rules for the benefit of pg_stat_statements.) Back-patch to 9.2 where pg_stat_statements grew the ability to recognize replaceable constants. Kyotaro Horiguchi
2014-01-18Allow SET TABLESPACE to database defaultStephen Frost
We've always allowed CREATE TABLE to create tables in the database's default tablespace without checking for CREATE permissions on that tablespace. Unfortunately, the original implementation of ALTER TABLE ... SET TABLESPACE didn't pick up on that exception. This changes ALTER TABLE ... SET TABLESPACE to allow the database's default tablespace without checking for CREATE rights on that tablespace, just as CREATE TABLE works today. Users could always do this through a series of commands (CREATE TABLE ... AS SELECT * FROM ...; DROP TABLE ...; etc), so let's fix the oversight in SET TABLESPACE's original implementation.
2014-01-17Fix client-only installationPeter Eisentraut
The psql Makefile was not creating $(datadir) before installing psqlrc.sample there. In most cases, the directory would be created in some other way, but for the documented from-source client-only installation procedure, it could fail. Reported-by: Mike Blackwell <mike.blackwell@rrd.com>
2014-01-14Improve FILES section of psql reference page.Tom Lane
Primarily, explain where to find the system-wide psqlrc file, per recent gripe from John Sutton. Do some general wordsmithing and improve the markup, too. Also adjust psqlrc.sample so its comments about file location are somewhat trustworthy. (Not sure why we bother with this file when it's empty, but whatever.) Back-patch to 9.2 where the startup file naming scheme was last changed.
2014-01-14Fix multiple bugs in index page locking during hot-standby WAL replay.Tom Lane
In ordinary operation, VACUUM must be careful to take a cleanup lock on each leaf page of a btree index; this ensures that no indexscans could still be "in flight" to heap tuples due to be deleted. (Because of possible index-tuple motion due to concurrent page splits, it's not enough to lock only the pages we're deleting index tuples from.) In Hot Standby, the WAL replay process must likewise lock every leaf page. There were several bugs in the code for that: * The replay scan might come across unused, all-zero pages in the index. While btree_xlog_vacuum itself did the right thing (ie, nothing) with such pages, xlogutils.c supposed that such pages must be corrupt and would throw an error. This accounts for various reports of replication failures with "PANIC: WAL contains references to invalid pages". To fix, add a ReadBufferMode value that instructs XLogReadBufferExtended not to complain when we're doing this. * btree_xlog_vacuum performed the extra locking if standbyState == STANDBY_SNAPSHOT_READY, but that's not the correct test: we won't open up for hot standby queries until the database has reached consistency, and we don't want to do the extra locking till then either, for fear of reading corrupted pages (which bufmgr.c would complain about). Fix by exporting a new function from xlog.c that will report whether we're actually in hot standby replay mode. * To ensure full coverage of the index in the replay scan, btvacuumscan would emit a dummy WAL record for the last page of the index, if no vacuuming work had been done on that page. However, if the last page of the index is all-zero, that would result in corruption of said page, since the functions called on it weren't prepared to handle that case. There's no need to lock any such pages, so change the logic to target the last normal leaf page instead. The first two of these bugs were diagnosed by Andres Freund, the other one by me. Fixes based on ideas from Heikki Linnakangas and myself. This has been wrong since Hot Standby was introduced, so back-patch to 9.0.
2014-01-11Fix possible crashes due to using elog/ereport too early in startup.Tom Lane
Per reports from Andres Freund and Luke Campbell, a server failure during set_pglocale_pgservice results in a segfault rather than a useful error message, because the infrastructure needed to use ereport hasn't been initialized; specifically, MemoryContextInit hasn't been called. One known cause of this is starting the server in a directory it doesn't have permission to read. We could try to prevent set_pglocale_pgservice from using anything that depends on palloc or elog, but that would be messy, and the odds of future breakage seem high. Moreover there are other things being called in main.c that look likely to use palloc or elog too --- perhaps those things shouldn't be there, but they are there today. The best solution seems to be to move the call of MemoryContextInit to very early in the backend's real main() function. I've verified that an elog or ereport occurring immediately after that is now capable of sending something useful to stderr. I also added code to elog.c to print something intelligible rather than just crashing if MemoryContextInit hasn't created the ErrorContext. This could happen if MemoryContextInit itself fails (due to malloc failure), and provides some future-proofing against someone trying to sneak in new code even earlier in server startup. Back-patch to all supported branches. Since we've only heard reports of this type of failure recently, it may be that some recent change has made it more likely to see a crash of this kind; but it sure looks like it's broken all the way back.
2014-01-11Fix compute_scalar_stats() for case that all values exceed WIDTH_THRESHOLD.Tom Lane
The standard typanalyze functions skip over values whose detoasted size exceeds WIDTH_THRESHOLD (1024 bytes), so as to limit memory bloat during ANALYZE. However, we (I think I, actually :-() failed to consider the possibility that *every* non-null value in a column is too wide. While compute_minimal_stats() seems to behave reasonably anyway in such a case, compute_scalar_stats() just fell through and generated no pg_statistic entry at all. That's unnecessarily pessimistic: we can still produce valid stanullfrac and stawidth values in such cases, since we do include too-wide values in the average-width calculation. Furthermore, since the general assumption in this code is that too-wide values are probably all distinct from each other, it seems reasonable to set stadistinct to -1 ("all distinct"). Per complaint from Kadri Raudsepp. This has been like this since roughly neolithic times, so back-patch to all supported branches.
2014-01-09Fix descriptor output in ECPG.Michael Meskes
While working on most platforms the old way sometimes created alignment problems. This should fix it. Also the regresion tests were updated to test for the reported case. Report and fix by MauMau <maumau307@gmail.com>
2014-01-08Fix "cannot accept a set" error when only some arms of a CASE return a set.Tom Lane
In commit c1352052ef1d4eeb2eb1d822a207ddc2d106cb13, I implemented an optimization that assumed that a function's argument expressions would either always return a set (ie multiple rows), or always not. This is wrong however: we allow CASE expressions in which some arms return a set of some type and others just return a scalar of that type. There may be other examples as well. To fix, replace the run-time test of whether an argument returned a set with a static precheck (expression_returns_set). This adds a little bit of query startup overhead, but it seems barely measurable. Per bug #8228 from David Johnston. This has been broken since 8.0, so patch all supported branches.
2014-01-08Fix pause_at_recovery_target + recovery_target_inclusive combination.Heikki Linnakangas
If pause_at_recovery_target is set, recovery pauses *before* applying the target record, even if recovery_target_inclusive is set. If you then continue with pg_xlog_replay_resume(), it will apply the target record before ending recovery. In other words, if you log in while it's paused and verify that the database looks OK, ending recovery changes its state again, possibly destroying data that you were tring to salvage with PITR. Backpatch to 9.1, this has been broken since pause_at_recovery_target was added.
2014-01-08Fix bug in determining when recovery has reached consistency.Heikki Linnakangas
When starting WAL replay from an online checkpoint, the last replayed WAL record variable was initialized using the checkpoint record's location, even though the records between the REDO location and the checkpoint record had not been replayed yet. That was noted as "slightly confusing" but harmless in the comment, but in some cases, it fooled CheckRecoveryConsistency to incorrectly conclude that we had already reached a consistent state immediately at the beginning of WAL replay. That caused the system to accept read-only connections in hot standby mode too early, and also PANICs with message "WAL contains references to invalid pages". Fix by initializing the variables to the REDO location instead. In 9.2 and above, change CheckRecoveryConsistency() to use lastReplayedEndRecPtr variable when checking if backup end location has been reached. It was inconsistently using EndRecPtr for that check, but lastReplayedEndRecPtr when checking min recovery point. It made no difference before this patch, because in all the places where CheckRecoveryConsistency was called the two variables were the same, but it was always an accident waiting to happen, and would have been wrong after this patch anyway. Report and analysis by Tomonari Katsumata, bug #8686. Backpatch to 9.0, where hot standby was introduced.
2014-01-07Move permissions check from do_pg_start_backup to pg_start_backupMagnus Hagander
And the same for do_pg_stop_backup. The code in do_pg_* is not allowed to access the catalogs. For manual base backups, the permissions check can be handled in the calling function, and for streaming base backups only users with the required permissions can get past the authentication step in the first place. Reported by Antonin Houska, diagnosed by Andres Freund
2014-01-07Avoid including tablespaces inside PGDATA twice in base backupsMagnus Hagander
If a tablespace was crated inside PGDATA it was backed up both as part of the PGDATA backup and as the backup of the tablespace. Avoid this by skipping any directory inside PGDATA that contains one of the active tablespaces. Dimitri Fontaine and Magnus Hagander
2014-01-04Fix translatability markings in psql, and add defenses against future bugs.Tom Lane
Several previous commits have added columns to various \d queries without updating their translate_columns[] arrays, leading to potentially incorrect translations in NLS-enabled builds. Offenders include commit 893686762 (added prosecdef to \df+), c9ac00e6e (added description to \dc+) and 3b17efdfd (added description to \dC+). Fix those cases back to 9.3 or 9.2 as appropriate. Since this is evidently more easily missed than one would like, in HEAD also add an Assert that the supplied array is long enough. This requires an API change for printQuery(), so it seems inappropriate for back branches, but presumably all future changes will be tested in HEAD anyway. In HEAD and 9.3, also clean up a whole lot of sloppiness in the emitted SQL for \dy (event triggers): lack of translatability due to failing to pass words-to-be-translated through gettext_noop(), inadequate schema qualification, and sloppy formatting resulting in unnecessarily ugly -E output. Peter Eisentraut and Tom Lane, per bug #8702 from Sergey Burladyan
2014-01-01Do not use an empty hostname.Michael Meskes
When trying to connect to a given database libecpg should not try using an empty hostname if no hostname was given.
2013-12-29Don't attempt to limit target database for pg_restore.Kevin Grittner
There was an apparent attempt to limit the target database for pg_restore to version 7.1.0 or later. Due to a leading zero this was interpreted as an octal number, which allowed targets with version numbers down to 2.87.36. The lowest actual release above that was 6.0.0, so that was effectively the limit. Since the success of the restore attempt will depend primarily on on what statements were generated by the dump run, we don't want pg_restore trying to guess whether a given target should be allowed based on version number. Allow a connection to any version. Since it is very unlikely that anyone would be using a recent version of pg_restore to restore to a pre-6.0 database, this has little to no practical impact, but it makes the code less confusing to read. Issue reported and initial patch suggestion from Joel Jacobson based on an article by Andrey Karpov reporting on issues found by PVS-Studio static code analyzer. Final patch based on analysis by Tom Lane. Back-patch to all supported branches.
2013-12-27Properly detect invalid JSON numbers when generating JSON.Andrew Dunstan
Instead of looking for characters that aren't valid in JSON numbers, we simply pass the output string through the JSON number parser, and if it fails the string is quoted. This means among other things that money and domains over money will be quoted correctly and generate valid JSON. Fixes bug #8676 reported by Anderson Cristian da Silva. Backpatched to 9.2 where JSON generation was introduced.
2013-12-27Fix misplaced right paren bugs in pgstatfuncs.c.Kevin Grittner
The bug would only show up if the C sockaddr structure contained zero in the first byte for a valid address; otherwise it would fail to fail, which is probably why it went unnoticed for so long. Patch submitted by Joel Jacobson after seeing an article by Andrey Karpov in which he reports finding this through static code analysis using PVS-Studio. While I was at it I moved a definition of a local variable referenced in the buggy code to a more local context. Backpatch to all supported branches.
2013-12-15Add "SHIFT_JIS" as an accepted encoding name for locale checking.Tatsuo Ishii
When locale is "ja_JP.SJIS", nl_langinfo(CODESET) returns "SHIFT_JIS" on some platforms, at least on RedHat Linux. So the encoding/locale match table (encoding_match_list) needs the entry. Otherwise client encoding is set to SQL_ASCII. Back patch to all supported branches.
2013-12-14Fix inherited UPDATE/DELETE with UNION ALL subqueries.Tom Lane
Fix an oversight in commit b3aaf9081a1a95c245fd605dcf02c91b3a5c3a29: we do indeed need to process the planner's append_rel_list when copying RTE subqueries, because if any of them were flattenable UNION ALL subqueries, the append_rel_list shows which subquery RTEs were pulled up out of which other ones. Without this, UNION ALL subqueries aren't correctly inserted into the update plans for inheritance child tables after the first one, typically resulting in no update happening for those child table(s). Per report from Victor Yegorov. Experimentation with this case also exposed a fault in commit a7b965382cf0cb30aeacb112572718045e6d4be7: if an inherited UPDATE/DELETE was proven totally dummy by constraint exclusion, we might arrive at add_rtes_to_flat_rtable with root->simple_rel_array being NULL. This should be interpreted as not having any RelOptInfos. I chose to code the guard as a check against simple_rel_array_size, so as to also provide some protection against indexing off the end of the array. Back-patch to 9.2 where the faulty code was added.
2013-12-13Add HOLD/RESUME_INTERRUPTS in HandleCatchupInterrupt/HandleNotifyInterrupt.Tom Lane
This prevents a possible longjmp out of the signal handler if a timeout or SIGINT occurs while something within the handler has transiently set ImmediateInterruptOK. For safety we must hold off the timeout or cancel error until we're back in mainline, or at least till we reach the end of the signal handler when ImmediateInterruptOK was true at entry. This syncs these functions with the logic now present in handle_sig_alarm. AFAICT there is no live bug here in 9.0 and up, because I don't think we currently can wait for any heavyweight lock inside these functions, and there is no other code (except read-from-client) that will turn on ImmediateInterruptOK. However, that was not true pre-9.0: in older branches ProcessIncomingNotify might block trying to lock pg_listener, and then a SIGINT could lead to undesirable control flow. It might be all right anyway given the relatively narrow code ranges in which NOTIFY interrupts are enabled, but for safety's sake I'm back-patching this.
2013-12-12Fix ancient docs/comments thinko: XID comparison is mod 2^32, not 2^31.Tom Lane
Pointed out by Gianni Ciolli.
2013-12-11Tweak placement of explicit ANALYZE commands in the regression tests.Tom Lane
Make the COPY test, which loads most of the large static tables used in the tests, also explicitly ANALYZE those tables. This allows us to get rid of various ad-hoc, and rather redundant, ANALYZE commands that had gotten stuck into various test scripts over time to ensure we got consistent plan choices. (We could have done a database-wide ANALYZE, but that would cause stats to get attached to the small static tables too, which results in plan changes compared to the historical behavior. I'm not sure that's a good idea, so not going that far for now.) Back-patch to 9.0, since 9.0 and 9.1 are currently sometimes failing regression tests for lack of an "ANALYZE tenk1" in the subselect test. There's no need for this in 8.4 since we didn't print any plans back then.
2013-12-10Fix possible crash with nested SubLinks.Tom Lane
An expression such as WHERE (... x IN (SELECT ...) ...) IN (SELECT ...) could produce an invalid plan that results in a crash at execution time, if the planner attempts to flatten the outer IN into a semi-join. This happens because convert_testexpr() was not expecting any nested SubLinks and would wrongly replace any PARAM_SUBLINK Params belonging to the inner SubLink. (I think the comment denying that this case could happen was wrong when written; it's certainly been wrong for quite a long time, since very early versions of the semijoin flattening logic.) Per report from Teodor Sigaev. Back-patch to all supported branches.
2013-12-05Clear retry flags properly in replacement OpenSSL sock_write function.Tom Lane
Current OpenSSL code includes a BIO_clear_retry_flags() step in the sock_write() function. Either we failed to copy the code correctly, or they added this since we copied it. In any case, lack of the clear step appears to be the cause of the server lockup after connection loss reported in bug #8647 from Valentine Gogichashvili. Assume that this is correct coding for all OpenSSL versions, and hence back-patch to all supported branches. Diagnosis and patch by Alexander Kukushkin.
2013-12-03Fix full-page writes of internal GIN pages.Heikki Linnakangas
Insertion to a non-leaf GIN page didn't make a full-page image of the page, which is wrong. The code used to do it correctly, but was changed (commit 853d1c3103fa961ae6219f0281885b345593d101) because the redo-routine didn't track incomplete splits correctly when the page was restored from a full page image. Of course, that was not right way to fix it, the redo routine should've been fixed instead. The redo-routine was surreptitiously fixed in 2010 (commit 4016bdef8aded77b4903c457050622a5a1815c16), so all we need to do now is revert the code that creates the record to its original form. This doesn't change the format of the WAL record. Backpatch to all supported versions.
2013-12-02Fix crash in assign_collations_walker for EXISTS with empty SELECT list.Tom Lane
We (I think I, actually) forgot about this corner case while coding collation resolution. Per bug #8648 from Arjen Nienhuis.
2013-12-02Stamp 9.2.6.REL9_2_6Tom Lane
2013-12-02Fix incomplete backpatch of pg_multixact truncation changes to <= 9.2Alvaro Herrera
The backpatch of a95335b544d9c8377e9dc7a399d8e9a155895f82 to 9.2, 9.1 and 9.0 was incomplete, missing changes to xlog.c, primarily the call to TrimMultiXact(). Testing presumably didn't show a problem without these changes because TrimMultiXact() performs defense-in-depth work, which is not strictly necessary. It also missed moving StartupMultiXact() which would have been problematic if a restartpoing happened in exactly the wrong moment, causing a transient error. Andres Freund
2013-12-02Translation updatesPeter Eisentraut
2013-12-01Update time zone data files to tzdata release 2013h.Tom Lane
DST law changes in Argentina, Brazil, Jordan, Libya, Liechtenstein, Morocco, Palestine. New timezone abbreviations WIB, WIT, WITA for Indonesia.
2013-12-01Back-patch src/timezone/known_abbrevs.txt into all active branches.Tom Lane
Needed so that timezone data update patches can be cherry-picked into older branches conveniently.
2013-11-30Fix pg_dumpall to work for databases flagged as read-only.Kevin Grittner
pg_dumpall's charter is to be able to recreate a database cluster's contents in a virgin installation, but it was failing to honor that contract if the cluster had any ALTER DATABASE SET default_transaction_read_only settings. By including a SET command for the connection for each connection opened by pg_dumpall output, errors are avoided and the source cluster is successfully recreated. There was discussion of whether to also set this for the connection applying pg_dump output, but it was felt that it was both less appropriate in that context, and far easier to work around. Backpatch to all supported branches.
2013-11-29Truncate pg_multixact/'s contents during crash recoveryAlvaro Herrera
Commit 9dc842f08 of 8.2 era prevented MultiXact truncation during crash recovery, because there was no guarantee that enough state had been setup, and because it wasn't deemed to be a good idea to remove data during crash recovery anyway. Since then, due to Hot-Standby, streaming replication and PITR, the amount of time a cluster can spend doing crash recovery has increased significantly, to the point that a cluster may even never come out of it. This has made not truncating the content of pg_multixact/ not defensible anymore. To fix, take care to setup enough state for multixact truncation before crash recovery starts (easy since checkpoints contain the required information), and move the current end-of-recovery actions to a new TrimMultiXact() function, analogous to TrimCLOG(). At some later point, this should probably done similarly to the way clog.c is doing it, which is to just WAL log truncations, but we can't do that for the back branches. Back-patch to 9.0. 8.4 also has the problem, but since there's no hot standby there, it's much less pressing. In 9.2 and earlier, this patch is simpler than in newer branches, because multixact access during recovery isn't required. Add appropriate checks to make sure that's not happening. Andres Freund
2013-11-29Fix assorted issues in pg_ctl's pgwin32_CommandLine().Tom Lane
Ensure that the invocation command for postgres or pg_ctl runservice double-quotes the executable's pathname; failure to do this leads to trouble when the path contains spaces. Also, ensure that the path ends in ".exe" in both cases and uses backslashes rather than slashes as directory separators. The latter issue is reported to confuse some third-party tools such as Symantec Backup Exec. Also, rewrite the function to avoid buffer overrun issues by using a PQExpBuffer instead of a fixed-size static buffer. Combinations of very long executable pathnames and very long data directory pathnames could have caused trouble before, for example. Back-patch to all active branches, since this code has been like this for a long while. Naoya Anzai and Tom Lane, reviewed by Rajeev Rastogi
2013-11-29Be sure to release proc->backendLock after SetupLockInTable() failure.Tom Lane
The various places that transferred fast-path locks to the main lock table neglected to release the PGPROC's backendLock if SetupLockInTable failed due to being out of shared memory. In most cases this is no big deal since ensuing error cleanup would release all held LWLocks anyway. But there are some hot-standby functions that don't consider failure of FastPathTransferRelationLocks to be a hard error, and in those cases this oversight could lead to system lockup. For consistency, make all of these places look the same as FastPathTransferRelationLocks. Noted while looking for the cause of Dan Wood's bugs --- this wasn't it, but it's a bug anyway.
2013-11-28Fix latent(?) race condition in LockReleaseAll.Tom Lane
We have for a long time checked the head pointer of each of the backend's proclock lists and skipped acquiring the corresponding locktable partition lock if the head pointer was NULL. This was safe enough in the days when proclock lists were changed only by the owning backend, but it is pretty questionable now that the fast-path patch added cases where backends add entries to other backends' proclock lists. However, we don't really wish to revert to locking each partition lock every time, because in simple transactions that would add a lot of useless lock/unlock cycles on already-heavily-contended LWLocks. Fortunately, the only way that another backend could be modifying our proclock list at this point would be if it was promoting a formerly fast-path lock of ours; and any such lock must be one that we'd decided not to delete in the previous loop over the locallock table. So it's okay if we miss seeing it in this loop; we'd just decide not to delete it again. However, once we've detected a non-empty list, we'd better re-fetch the list head pointer after acquiring the partition lock. This guards against possibly fetching a corrupt-but-non-null pointer if pointer fetch/store isn't atomic. It's not clear if any practical architectures are like that, but we've never assumed that before and don't wish to start here. In any case, the situation certainly deserves a code comment. While at it, refactor the partition traversal loop to use a for() construct instead of a while() loop with goto's. Back-patch, just in case the risk is real and not hypothetical.
2013-11-27Fix stale-pointer problem in fast-path locking logic.Tom Lane
When acquiring a lock in fast-path mode, we must reset the locallock object's lock and proclock fields to NULL. They are not necessarily that way to start with, because the locallock could be left over from a failed lock acquisition attempt earlier in the transaction. Failure to do this led to all sorts of interesting misbehaviors when LockRelease tried to clean up no-longer-related lock and proclock objects in shared memory. Per report from Dan Wood. In passing, modify LockRelease to elog not just Assert if it doesn't find lock and proclock objects for a formerly fast-path lock, matching the code in FastPathGetRelationLockEntry and LockRefindAndRelease. This isn't a bug but it will help in diagnosing any future bugs in this area. Also, modify FastPathTransferRelationLocks and FastPathGetRelationLockEntry to break out of their loops over the fastpath array once they've found the sole matching entry. This was inconsistently done in some search loops and not others. Improve assorted related comments, too. Back-patch to 9.2 where the fast-path mechanism was introduced.
2013-11-27Don't update relfrozenxid if any pages were skipped.Heikki Linnakangas
Vacuum recognizes that it can update relfrozenxid by checking whether it has processed all pages of a relation. Unfortunately it performed that check after truncating the dead pages at the end of the relation, and used the new number of pages to decide whether all pages have been scanned. If the new number of pages happened to be smaller or equal to the number of pages scanned, it incorrectly decided that all pages were scanned. This can lead to relfrozenxid being updated, even though some pages were skipped that still contain old XIDs. That can lead to data loss due to xid wraparounds with some rows suddenly missing. This likely has escaped notice so far because it takes a large number (~2^31) of xids being used to see the effect, while a full-table vacuum before that would fix the issue. The incorrect logic was introduced by commit b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8. Backpatch this fix down to 8.4, like that commit. Andres Freund, with some modifications by me.
2013-11-27ECPG: Fix searching for quoted cursor names case-sensitively.Michael Meskes
Patch by Böszörményi Zoltán <zb@cybertec.at>
2013-11-26ECPG: Make the preprocessor emit ';' if the variable type for a list ofMichael Meskes
variables is varchar. This fixes this test case: int main(void) { exec sql begin declare section; varchar a[50], b[50]; exec sql end declare section; return 0; } Since varchars are internally turned into custom structs and the type name is emitted for these variable declarations, the preprocessed code previously had: struct varchar_1 { ... } a _,_ struct varchar_2 { ... } b ; The comma in the generated C file was a syntax error. There are no regression test changes since it's not exercised. Patch by Boszormenyi Zoltan <zb@cybertec.at>
2013-11-26ECPG: Fix offset to NULL/size indicator array.Michael Meskes
Patch by Boszormenyi Zoltan <zb@cybertec.at>
2013-11-23Fix array slicing of int2vector and oidvector values.Tom Lane
The previous coding labeled expressions such as pg_index.indkey[1:3] as being of int2vector type; which is not right because the subscript bounds of such a result don't, in general, satisfy the restrictions of int2vector. To fix, implicitly promote the result of slicing int2vector to int2[], or oidvector to oid[]. This is similar to what we've done with domains over arrays, which is a good analogy because these types are very much like restricted domains of the corresponding regular-array types. A side-effect is that we now also forbid array-element updates on such columns, eg while "update pg_index set indkey[4] = 42" would have worked before if you were superuser (and corrupted your catalogs irretrievably, no doubt) it's now disallowed. This seems like a good thing since, again, some choices of subscripting would've led to results not satisfying the restrictions of int2vector. The case of an array-slice update was rejected before, though with a different error message than you get now. We could make these cases work in future if we added a cast from int2[] to int2vector (with a cast function checking the subscript restrictions) but it seems unlikely that there's any value in that. Per report from Ronan Dunklau. Back-patch to all supported branches because of the crash risks involved.
2013-11-23Ensure _dosmaperr() actually sets errno correctly.Tom Lane
If logging is enabled, either ereport() or fprintf() might stomp on errno internally, causing this function to return the wrong result. That might only end in a misleading error report, but in any code that's examining errno to decide what to do next, the consequences could be far graver. This has been broken since the very first version of this file in 2006 ... it's a bit astonishing that we didn't identify this long ago. Reported by Amit Kapila, though this isn't his proposed fix.
2013-11-23Avoid potential buffer overflow crashPeter Eisentraut
A pointer to a C string was treated as a pointer to a "name" datum and passed to SPI_execute_plan(). This pointer would then end up being passed through datumCopy(), which would try to copy the entire 64 bytes of name data, thus running past the end of the C string. Fix by converting the string to a proper name structure. Found by LLVM AddressSanitizer.
2013-11-22Flatten join alias Vars before pulling up targetlist items from a subquery.Tom Lane
pullup_replace_vars()'s decisions about whether a pulled-up replacement expression needs to be wrapped in a PlaceHolderVar depend on the assumption that what looks like a Var behaves like a Var. However, if the Var is a join alias reference, later flattening of join aliases might replace the Var with something that's not a Var at all, and should have been wrapped. To fix, do a forcible pass of flatten_join_alias_vars() on the subquery targetlist before we start to copy items out of it. We'll re-run that processing on the pulled-up expressions later, but that's harmless. Per report from Ken Tanzer; the added regression test case is based on his example. This bug has been there since the PlaceHolderVar mechanism was invented, but has escaped detection because the circumstances that trigger it are fairly narrow. You need a flattenable query underneath an outer join, which contains another flattenable query inside a join of its own, with a dangerous expression (a constant or something else non-strict) in that one's targetlist. Having seen this, I'm wondering if it wouldn't be prudent to do all alias-variable flattening earlier, perhaps even in the rewriter. But that would probably not be a back-patchable change.
2013-11-22Fix Hot-Standby initialization of clog and subtrans.Heikki Linnakangas
These bugs can cause data loss on standbys started with hot_standby=on at the moment they start to accept read only queries, by marking committed transactions as uncommited. The likelihood of such corruptions is small unless the primary has a high transaction rate. 5a031a5556ff83b8a9646892715d7fef415b83c3 fixed bugs in HS's startup logic by maintaining less state until at least STANDBY_SNAPSHOT_PENDING state was reached, missing the fact that both clog and subtrans are written to before that. This only failed to fail in common cases because the usage of ExtendCLOG in procarray.c was superflous since clog extensions are actually WAL logged. f44eedc3f0f347a856eea8590730769125964597/I then tried to fix the missing extensions of pg_subtrans due to the former commit's changes - which are not WAL logged - by performing the extensions when switching to a state > STANDBY_INITIALIZED and not performing xid assignments before that - again missing the fact that ExtendCLOG is unneccessary - but screwed up twice: Once because latestObservedXid wasn't updated anymore in that state due to the earlier commit and once by having an off-by-one error in the loop performing extensions. This means that whenever a CLOG_XACTS_PER_PAGE (32768 with default settings) boundary was crossed between the start of the checkpoint recovery started from and the first xl_running_xact record old transactions commit bits in pg_clog could be overwritten if they started and committed in that window. Fix this mess by not performing ExtendCLOG() in HS at all anymore since it's unneeded and evidently dangerous and by performing subtrans extensions even before reaching STANDBY_SNAPSHOT_PENDING. Analysis and patch by Andres Freund. Reported by Christophe Pettus. Backpatch down to 9.0, like the previous commit that caused this.