path: root/src
Age | Commit message | Author
2013-06-28 | Make the OVER keyword unreserved. | Robert Haas
This results in a slightly less specific error message when OVER is used in a context where we don't accept window functions, but per discussion, it's worth it to get the benefit of not needing to reserve this keyword any more. This same refactoring will also let us avoid reserving some other keywords that we expect to add in upcoming patches (specifically, IGNORE, RESPECT, and FILTER). Troels Nielsen, with minor changes by me
2013-06-28 | Define Trap and TrapMacro even in non-cassert builds. | Robert Haas
In some cases, the use of these macros may be preferable to Assert() or AssertMacro(), since this way the caller can set the trap message. Andres Freund and Robert Haas
2013-06-28 | Track spinlock delay in microsecond granularity. | Heikki Linnakangas
On many platforms the OS will round the sleep time to millisecond resolution, but there is no reason for us to pre-emptively round the argument to pg_usleep. When the delay was measured in milliseconds and started from 1 ms, it sometimes took many attempts until the logic that increases the delay by multiplying with a random value between 1 and 2 actually managed to bump it from 1 ms to 2 ms. That led to a sequence of 1 ms waits until the delay started to increase. This wasn't really a problem but it looked odd if you observed the waits. There is no measurable difference in performance, but it's more readable this way. Jeff Janes
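To see why a millisecond-granularity delay could stall at 1 ms, here is a minimal sketch in Python. It assumes the back-off step adds `int(delay * r + 0.5)` for a random `r` in [0, 1), which is one way to read "multiplying with a random value between 1 and 2"; the function name and the exact rounding are illustrative assumptions, not the actual spinlock code.

```python
def next_delay(cur_delay, r):
    """Grow the delay by a random fraction r in [0, 1) of itself.

    Hypothetical model of the back-off step; the result is in the same
    unit as cur_delay and is always an integer.
    """
    return cur_delay + int(cur_delay * r + 0.5)


# Millisecond units: starting from 1, any draw below 0.5 adds nothing.
ms_steps = [next_delay(1, r) for r in (0.1, 0.3, 0.49)]

# Microsecond units: the same draws always make progress from 1000 us.
us_steps = [next_delay(1000, r) for r in (0.1, 0.3, 0.49)]
```

Under this model, roughly half of all draws leave a 1 ms delay unchanged, which would produce the runs of 1 ms waits the message describes.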
2013-06-27 | Update pg_resetxlog's documentation on multixacts | Alvaro Herrera
I added some more functionality to it in 0ac5ad5134f27 but neglected to add it to the docs. Per Peter Eisentraut in message 1367112171.32604.4.camel@vanquo.pezone.net
2013-06-27 | Permit super-MaxAllocSize allocations with MemoryContextAllocHuge(). | Noah Misch
The MaxAllocSize guard is convenient for most callers, because it reduces the need for careful attention to overflow, data type selection, and the SET_VARSIZE() limit. A handful of callers are happy to navigate those hazards in exchange for the ability to allocate a larger chunk. Introduce MemoryContextAllocHuge() and repalloc_huge(). Use this in tuplesort.c and tuplestore.c, enabling internal sorts of up to INT_MAX tuples, a factor-of-48 increase. In particular, B-tree index builds can now benefit from much-larger maintenance_work_mem settings. Reviewed by Stephen Frost, Simon Riggs and Jeff Janes.
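The "factor-of-48" figure can be checked with a little arithmetic. The sketch below assumes that each in-memory sort entry occupies 24 bytes on a 64-bit build and that the old limit was simply the number of such entries fitting under MaxAllocSize (1 GB minus 1 byte); both the entry size and the variable names are assumptions made for illustration.

```python
MAX_ALLOC_SIZE = 0x3FFFFFFF  # 1 GB - 1, the MaxAllocSize guard
INT_MAX = 2**31 - 1          # new ceiling on the number of tuples
SORT_TUPLE_BYTES = 24        # assumed size of one sort-array entry (64-bit build)

# Number of entries that fit in a single allocation under the old guard.
old_limit = MAX_ALLOC_SIZE // SORT_TUPLE_BYTES

# How many times larger the new INT_MAX ceiling is.
increase = INT_MAX // old_limit
```

With these assumptions the old ceiling is about 44.7 million tuples, and INT_MAX is 48 times that.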
2013-06-27 | Mark index-constraint comments with correct dependency in pg_dump. | Tom Lane
When there's a comment on an index that was created with UNIQUE or PRIMARY KEY constraint syntax, we need to label the comment as depending on the constraint not the index, since only the constraint object actually appears in the dump. This incorrect dependency can lead to parallel pg_restore trying to restore the comment before the index has been created, per bug #8257 from Lloyd Albin. This patch fixes pg_dump to produce the right dependency in dumps made in the future. Usually we also try to hack pg_restore to work around bogus dependencies, so that existing (wrong) dumps can still be restored in parallel mode; but that doesn't seem practical here since there's no easy way to relate the constraint dump entry to the comment after the fact. Andres Freund
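The ordering hazard can be illustrated with a toy model of a dump's dependency list. The sketch assumes that a restore only honors dependencies on entries that actually appear in the dump, so a dependency on the absent index entry constrains nothing; the entry names and the helper function are hypothetical and do not reflect pg_restore's real data structures.

```python
def effective_deps(dump_entries, deps):
    """Keep only dependencies that point at entries present in the dump.

    dump_entries: set of entry names in the archive.
    deps: mapping from entry name to the set of names it depends on.
    """
    return {
        entry: {d for d in deps.get(entry, set()) if d in dump_entries}
        for entry in dump_entries
    }


entries = {"constraint", "comment"}  # the index itself has no dump entry

# Old labeling: the comment depends on the index, which is not in the dump.
wrong = effective_deps(entries, {"comment": {"index"}})

# Fixed labeling: the comment depends on the constraint entry.
right = effective_deps(entries, {"comment": {"constraint"}})
```

In the first case the comment has no effective prerequisite, so a parallel restore would be free to schedule it before the constraint (and hence the index) exists.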
2013-06-27 | Expect EWOULDBLOCK from a non-blocking connect() call only on Windows. | Tom Lane
On Unix-ish platforms, EWOULDBLOCK may be the same as EAGAIN, which is *not* a success return, at least not on Linux. We need to treat it as a failure to avoid giving a misleading error message. Per the Single Unix Spec, only EINPROGRESS and EINTR returns indicate that the connection attempt is in progress. On Windows, on the other hand, EWOULDBLOCK (WSAEWOULDBLOCK) is the expected case. We must accept EINPROGRESS as well because Cygwin will return that, and it doesn't seem worth distinguishing Cygwin from native Windows here. It's not very clear whether EINTR can occur on Windows, but let's leave that part of the logic alone in the absence of concrete trouble reports. Also, remove the test for errno == 0, effectively reverting commit da9501bddb42222dc33c031b1db6ce2133bcee7b, which AFAICS was just a thinko; or at best it might have been a workaround for a platform-specific bug, which we can hope is gone now thirteen years later. In any case, since libpq makes no effort to reset errno to zero before calling connect(), it seems unlikely that that test has ever reliably done anything useful. Andres Freund and Tom Lane
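The decision rule described above can be summarized as a small predicate. This is a sketch of the classification only, written in Python with its standard `errno` module; the function name and the `on_windows` flag are illustrative, and the real logic lives in libpq's C code.

```python
import errno


def connect_in_progress(err, on_windows):
    """Return True if a non-blocking connect() error means 'still connecting'.

    On Windows (including Cygwin, which reports EINPROGRESS) EWOULDBLOCK is
    the expected case. Elsewhere only EINPROGRESS and EINTR count, because
    EWOULDBLOCK may alias EAGAIN, which is a real failure.
    """
    if on_windows:
        return err in (errno.EWOULDBLOCK, errno.EINPROGRESS, errno.EINTR)
    return err in (errno.EINPROGRESS, errno.EINTR)
```

Note that an errno of 0 is not treated as "in progress" on either branch, in line with the removal of that test.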
2013-06-26 | Cooperate with the Valgrind instrumentation framework. | Noah Misch
Valgrind "client requests" in aset.c and mcxt.c teach Valgrind and its Memcheck tool about the PostgreSQL allocator. This makes Valgrind roughly as sensitive to memory errors involving palloc chunks as it is to memory errors involving malloc chunks. Further client requests in PageAddItem() and printtup() verify that all bits being added to a buffer page or furnished to an output function are predictably-defined. Those tests catch failures of C-language functions to fully initialize the bits of a Datum, which in turn stymie optimizations that rely on _equalConst(). Define the USE_VALGRIND symbol in pg_config_manual.h to enable these additions. An included "suppression file" silences nominal errors we don't plan to fix. Reviewed in earlier versions by Peter Geoghegan and Korry Douglas.
2013-06-26 | Refactor aset.c and mcxt.c in preparation for Valgrind cooperation. | Noah Misch
Move some repeated debugging code into functions and store intermediates in variables where not presently necessary. No code-generation changes in a production build, and no functional changes. This simplifies and focuses the main patch.
2013-06-26 | Initialize pad bytes in GinFormTuple(). | Noah Misch
Every other core buffer page consumer initializes the bytes it furnishes to PageAddItem(). For consistency, do the same here. No back-patch; regardless, we couldn't count on the fix so long as binary upgrade can carry forward affected index builds.
2013-06-26 | Renovate display of non-ASCII messages on Windows. | Noah Misch
GNU gettext selects a default encoding for the messages it emits in a platform-specific manner; it uses the Windows ANSI code page on Windows and follows LC_CTYPE on other platforms. This is inconvenient for PostgreSQL server processes, so realize consistent cross-platform behavior by calling bind_textdomain_codeset() on Windows each time we permanently change LC_CTYPE. This primarily affects SQL_ASCII databases and processes like the postmaster that do not attach to a database, making their behavior consistent with PostgreSQL on non-Windows platforms. Messages from SQL_ASCII databases use the encoding implied by the database LC_CTYPE, and messages from non-database processes use LC_CTYPE from the postmaster system environment. PlatformEncoding becomes unused, so remove it. Make write_console() prefer WriteConsoleW() to write() regardless of the encodings in use. In this situation, write() will invariably mishandle non-ASCII characters. elog.c has assumed that messages conform to the database encoding. While usually true, this does not hold for SQL_ASCII and MULE_INTERNAL. Introduce MessageEncoding to track the actual encoding of message text. The present consumers are Windows-specific code for converting messages to UTF16 for use in system interfaces. This fixes the appearance in Windows event logs and consoles of translated messages from SQL_ASCII processes like the postmaster. Note that SQL_ASCII inherently disclaims a strong notion of encoding, so non-ASCII byte sequences interpolated into messages by %s may yet yield a nonsensical message. MULE_INTERNAL has similar problems at present, albeit for a different reason: its lack of libiconv support or a conversion to UTF8. Consequently, one need no longer restart Windows with a different Windows ANSI code page to broadly test backend logging under a given language. Changing the user's locale ("Format") is enough. 
Several accounts can simultaneously run postmasters under different locales, all correctly logging localized messages to Windows event logs and consoles. Alexander Law and Noah Misch
2013-06-26 | pg_receivexlog: Fix logic error | Peter Eisentraut
The code checking the WAL file name contained a logic error and wouldn't actually catch some bad names.
2013-06-25 | Avoid inconsistent type declaration | Alvaro Herrera
Clang 3.3 correctly complains that a variable of type enum MultiXactStatus cannot hold a value of -1, which makes sense. Change the declared type of the variable to int instead, and apply casting as necessary to avoid the warning. Per notice from Andres Freund
2013-06-25 | Properly dump dropped foreign table cols in binary-upgrade mode. | Andrew Dunstan
In binary upgrade mode, we need to recreate and then drop dropped columns so that all the columns get the right attribute number. This is true for foreign tables as well as for native tables. For foreign tables we have been getting the first part right but not the second, leading to bogus columns in the upgraded database. Fix this all the way back to 9.1, where foreign tables were introduced.
2013-06-26 | Support clean switchover. | Fujii Masao
In replication, when we shut down the master, walsender tries to send all the outstanding WAL records to the standby, and then to exit. This basically means that all the WAL records are fully synced between the two servers after a clean shutdown of the master. So, after promoting the standby to new master, we can restart the stopped master as a new standby without the need for a fresh backup from the new master. But there was one problem so far: though walsender tries to send all the outstanding WAL records, it doesn't wait for them to be replicated to the standby. Then, before receiving all the WAL records, walreceiver can detect the closure of the connection and exit. We cannot guarantee that there is no missing WAL in the standby after a clean shutdown of the master. In this case, a backup from the new master is required when restarting the stopped master as a new standby. This patch fixes this problem. It just changes walsender so that it waits for all the outstanding WAL records to be replicated to the standby before closing the replication connection. Per discussion, this is a fix that needs to get backpatched rather than a new feature. So, back-patch to 9.1 where enough infrastructure for this exists. Patch by me, reviewed by Andres Freund.
2013-06-24 | Reverting previous commit, pending investigation | Simon Riggs
of sporadic seg faults from various build farm members.
2013-06-24 | ALTER TABLE ... ALTER CONSTRAINT for FKs | Simon Riggs
Allow constraint attributes to be altered, so the default setting of NOT DEFERRABLE can be altered to DEFERRABLE and back. Review by Abhijit Menon-Sen
2013-06-24 | Translation updates | Peter Eisentraut
2013-06-23 | Add a comment warning against use of pg_usleep() for long sleeps. | Tom Lane
Follow-up to commit 873ab97219caabeb2f7b390268a4fe01e2b7518c, in which I noted that WaitLatch was a better solution in the commit log message, but neglected to add any documentation in the code.
2013-06-23 | Ensure no xid gaps during Hot Standby startup | Simon Riggs
In some cases with higher numbers of subtransactions it was possible for us to incorrectly initialize subtrans, leading to complaints of missing pages. Bug report by Sergey Konoplev. Analysis and fix by Andres Freund.
2013-06-20 | Clarify terminology standalone backend vs. single-user mode | Peter Eisentraut
Most of the documentation uses "single-user mode", so use that in the code as well. Adjust the documentation to match the new error message wording. Also add a documentation index entry for "single-user mode". Based-on-patch-by: Jeff Janes <jeff.janes@gmail.com>
2013-06-19 | initdb: Add blank line before output about checksums | Peter Eisentraut
This maintains the logical grouping of the output better.
2013-06-20 | Support TB (terabyte) memory unit in GUC variables. | Fujii Masao
Patch by Simon Riggs, reviewed by Jeff Janes and me.
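As a rough illustration of what accepting a TB suffix means, the sketch below converts a memory setting string to kilobytes using 1024-based multipliers. It is a simplified, hypothetical parser: the function name, the choice of kB as the common base, and the absence of per-variable base units and range checks are all assumptions, not the GUC implementation.

```python
# Multipliers relative to one kilobyte; each unit is 1024 times the previous.
_UNITS_KB = {"kB": 1, "MB": 1024, "GB": 1024**2, "TB": 1024**3}


def parse_memory_kb(text):
    """Convert a string such as '64MB' or '1TB' to a number of kilobytes."""
    text = text.strip()
    for unit, factor in _UNITS_KB.items():
        if text.endswith(unit):
            # Everything before the unit suffix must be an integer.
            return int(text[: -len(unit)].strip()) * factor
    raise ValueError(f"unrecognized memory unit in {text!r}")
```

For example, `parse_memory_kb("1TB")` yields 1024 cubed kilobytes.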
2013-06-19 | Modernize entab source code | Bruce Momjian
Remove halt.c, improve comments, rename manual page file.
2013-06-19 | Fix the create_index regression test for Danish collation. | Kevin Grittner
In Danish collations, there are letter combinations which sort higher than 'Z'. A test for values > 'WA' was picking up rows where the value started with 'AA', causing the test to fail. Backpatch to 9.2, where the failing test was added. Per report from Svenne Krap and analysis by Jeff Janes
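The effect can be reproduced with a toy sort key. The sketch below does not use a real Danish locale; it merely assumes that the digraph "AA" collates like "Å", which sorts after "Z", and models that by substituting a character with a higher code point. The key function and sample value are hypothetical.

```python
def danish_like_key(s):
    """Toy collation key: treat 'AA' like 'Å' (U+00C5), which sorts after 'Z'."""
    return s.replace("AA", "\u00c5")


value = "AAAAxx"

# Plain code-point comparison: 'A' < 'W', so the row is not selected.
plain_match = value > "WA"

# Danish-like comparison: the leading 'AA' sorts after 'W', so it is selected.
danish_match = danish_like_key(value) > danish_like_key("WA")
```

A predicate such as `> 'WA'` therefore returns extra rows under this kind of collation, which is what made the regression test's expected output locale-dependent.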
2013-06-17 | psql: Re-allow -1 together with -c or -l | Peter Eisentraut
2013-06-17 | Add buffer_std flag to MarkBufferDirtyHint(). | Jeff Davis
MarkBufferDirtyHint() writes WAL, and should know if it's got a standard buffer or not. Currently, the only callers where buffer_std is false are related to the FSM. In passing, rename XLOG_HINT to XLOG_FPI, which is more descriptive. Back-patch to 9.3.
2013-06-15 | Use WaitLatch, not pg_usleep, for delaying in pg_sleep(). | Tom Lane
This avoids platform-dependent behavior wherein pg_sleep() might fail to be interrupted by statement timeout, query cancel, SIGTERM, etc. Also, since there's no reason to wake up once a second any more, we can reduce the power consumption of a sleeping backend a tad. Back-patch to 9.3, since use of SA_RESTART for SIGALRM makes this a bigger issue than it used to be.
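The difference between a plain timed sleep and a wait that can be woken early can be shown by analogy. The sketch below uses Python's `threading.Event` as a stand-in for a latch; it is not PostgreSQL's latch API, and the function name is hypothetical.

```python
import threading


def interruptible_sleep(wakeup, timeout_s):
    """Sleep for up to timeout_s seconds, returning early if wakeup is set.

    Returns True if the sleep was cut short by the event, False on timeout.
    """
    return wakeup.wait(timeout_s)


# Another thread (or a signal handler's proxy) sets the event to end the sleep.
latch = threading.Event()
latch.set()
woken_early = interruptible_sleep(latch, 5.0)  # returns at once, no 5 s wait
```

Because the waiter blocks on the event itself, there is no need for the periodic once-a-second wakeups that a loop of short sleeps would require.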
2013-06-16 | Fix pg_restore -l with the directory archive to display the correct format name. | Fujii Masao
Back-patch to 9.1 where the directory archive was introduced.
2013-06-15 | Use SA_RESTART for all signals, including SIGALRM. | Tom Lane
The exclusion of SIGALRM dates back to Berkeley days, when Postgres used SIGALRM in only one very short stretch of code. Nowadays, allowing it to interrupt kernel calls doesn't seem like a very good idea, since its use for statement_timeout means SIGALRM could occur anyplace in the code, and there are far too many call sites where we aren't prepared to deal with EINTR failures. When third-party code is taken into consideration, it seems impossible that we ever could be fully EINTR-proof, so better to use SA_RESTART always and deal with the implications of that. One such implication is that we should not assume pg_usleep() will be terminated early by a signal. Therefore, long sleeps should probably be replaced by WaitLatch operations where practical. Back-patch to 9.3 so we can get some beta testing on this change.
2013-06-15 | Be consistent about #define'ing configure symbols as "1" not empty. | Tom Lane
This is just neatnik-ism, since all the tests in the code are #ifdefs, but we shouldn't specify symbols as "Define to 1 ..." and then not actually define them that way.
2013-06-14 | Update RELEASE_CHANGES to describe library version bumping more fully. | Tom Lane
2013-06-14 | Stamp shared-library minor version numbers for 9.4. | Tom Lane
2013-06-14 | Stamp HEAD as 9.4devel. | Tom Lane
Let the hacking begin ...
2013-06-14 | Avoid deadlocks during insertion into SP-GiST indexes. | Tom Lane
SP-GiST's original scheme for avoiding deadlocks during concurrent index insertions doesn't work, as per report from Hailong Li, and there isn't any evident way to make it work completely. We could possibly lock individual inner tuples instead of their whole pages, but preliminary experimentation suggests that the performance penalty would be huge. Instead, if we fail to get a buffer lock while descending the tree, just restart the tree descent altogether. We keep the old tuple positioning rules, though, in hopes of reducing the number of cases where this can happen. Teodor Sigaev, somewhat edited by Tom Lane
2013-06-13 | Remove special-case treatment of LOG severity level in standalone mode. | Tom Lane
elog.c has historically treated LOG messages as low-priority during bootstrap and standalone operation. This has led to confusion and even masked a bug, because the normal expectation of code authors is that elog(LOG) will put something into the postmaster log, and that wasn't happening during initdb. So get rid of the special-case rule and make the priority order the same as it is in normal operation. To keep from cluttering initdb's output and the behavior of a standalone backend, tweak the severity level of three messages routinely issued by xlog.c during startup and shutdown so that they won't appear in these cases. Per my proposal back in December.
2013-06-13 | Refactor checksumming code to make it easier to use externally. | Tom Lane
pg_filedump and other external utility programs are likely to want to be able to check Postgres page checksums. To avoid messy duplication of code, move the checksumming functionality into an exported header file, much as we did awhile back for the CRC code. In passing, get rid of an unportable assumption that a static char[] array will be word-aligned, and do some other minor code beautification.
2013-06-13 | PL/Python: Fix type mixup | Peter Eisentraut
Memory was allocated based on the sizeof a type that was not the type of the pointer that the result was being assigned to. The types happen to be of the same size, but it's still wrong.
2013-06-13 | Only install a portal's ResourceOwner if it actually has one. | Tom Lane
In most scenarios a portal without a ResourceOwner is dead and not subject to any further execution, but a portal for a cursor WITH HOLD remains in existence with no ResourceOwner after the creating transaction is over. In this situation, if we attempt to "execute" the portal directly to fetch data from it, we were setting CurrentResourceOwner to NULL, leading to a segfault if the datatype output code did anything that required a resource owner (such as trying to fetch system catalog entries that weren't already cached). The case appears to be impossible to provoke with stock libpq, but psqlODBC at least is able to cause it when working with held cursors. Simplest fix is to just skip the assignment to CurrentResourceOwner, so that any resources used by the data output operations will be managed by the transaction-level resource owner instead. For consistency I changed all the places that install a portal's resowner as current, even though some of them are probably not reachable with a held cursor's portal. Per report from Joshua Berry (with thanks to Hiroshi Inoue for developing a self-contained test case). Back-patch to all supported versions.
2013-06-12 | Avoid reading past datum end when parsing JSON. | Noah Misch
Several loops in the JSON parser examined a byte in memory just before checking whether its address was in-bounds, so they could read one byte beyond the datum's allocation. A SIGSEGV is possible. New in 9.3, so no back-patch.
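The general shape of the fix is to test the bound before touching the byte. The sketch below shows that ordering on a generic whitespace-skipping loop in Python; the function is hypothetical and only illustrates the pattern, not the actual JSON lexer loops. In Python an out-of-range read raises an exception rather than reading stray memory, but the required order of the two checks is the same.

```python
def skip_spaces(buf, pos):
    """Advance pos past ASCII whitespace without ever reading buf[len(buf)].

    The bounds test comes first, so the short-circuit 'and' guarantees
    buf[pos] is evaluated only when pos is a valid index.
    """
    while pos < len(buf) and buf[pos] in b" \t\n\r":
        pos += 1
    return pos
```

Swapping the two operands of `and` would evaluate `buf[pos]` one position past the end whenever the input ends in whitespace.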
2013-06-12 | Avoid reading below the start of a stack variable in tokenize_file(). | Noah Misch
We would wrongly overwrite the prior stack byte if it happened to contain '\n' or '\r'. New in 9.3, so no back-patch.
2013-06-12 | Don't pass oidvector by value. | Noah Misch
Since the structure ends with a flexible array, doing so truncates any vector having more than one element. New in 9.3, so no back-patch.
2013-06-12 | Observe array length in HaveVirtualXIDsDelayingChkpt(). | Noah Misch
Since commit f21bb9cfb5646e1793dcc9c0ea697bab99afa523, this function ignores the caller-provided length and loops until it finds a terminator, which GetVirtualXIDsDelayingChkpt() never adds. Restore the previous loop control logic. In passing, revert the addition of an unused variable by the same commit, presumably a debugging relic.
2013-06-12 | Don't use ordinary NULL-terminated strings as Name datums. | Noah Misch
Consumers are entitled to read the full 64 bytes pertaining to a Name; using a shorter NULL-terminated string leads to reading beyond the end of its allocation; a SIGSEGV is possible. Use the frequent idiom of copying to a NameData on the stack. New in 9.3, so no back-patch.
2013-06-12 | Improve updatability checking for views and foreign tables. | Tom Lane
Extend the FDW API (which we already changed for 9.3) so that an FDW can report whether specific foreign tables are insertable/updatable/deletable. The default assumption continues to be that they're updatable if the relevant executor callback function is supplied by the FDW, but finer granularity is now possible. As a test case, add an "updatable" option to contrib/postgres_fdw. This patch also fixes the information_schema views, which previously did not think that foreign tables were ever updatable, and fixes view_is_auto_updatable() so that a view on a foreign table can be auto-updatable. initdb forced due to changes in information_schema views and the functions they rely on. This is a bit unfortunate to do post-beta1, but if we don't change this now then we'll have another API break for FDWs when we do change it. Dean Rasheed, somewhat editorialized on by Tom Lane
2013-06-12 | Fix unescaping of JSON Unicode escapes, especially for non-UTF8. | Andrew Dunstan
Per discussion on -hackers. We treat Unicode escapes when unescaping them similarly to the way we treat them in PostgreSQL string literals. Escapes in the ASCII range are always accepted, no matter what the database encoding. Escapes for higher code points are only processed in UTF8 databases, and attempts to process them in other databases will result in an error. \u0000 is never unescaped, since it would result in an impermissible null byte.
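The three rules above can be written down as a small decision function. This is a sketch of the stated policy only: the function name, the boolean encoding flag, and the choice to return the escape text unchanged for \u0000 are assumptions made for illustration, and surrogate pairs are deliberately out of scope.

```python
def unescape_unicode(hex4, db_is_utf8):
    """Apply the described policy to one \\uXXXX escape, given its hex digits."""
    code_point = int(hex4, 16)
    if code_point == 0:
        # Never unescaped: it would produce an impermissible null byte.
        return "\\u0000"
    if code_point < 0x80:
        # ASCII range: accepted in every database encoding.
        return chr(code_point)
    if db_is_utf8:
        # Higher code points are processed only in UTF8 databases.
        return chr(code_point)
    raise ValueError("Unicode escape above ASCII requires a UTF8 database")
```

For instance, the escape for "A" is accepted everywhere, while the escape for "é" is converted only when the database encoding is UTF8.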
2013-06-11 | Fix cache flush hazard in cache_record_field_properties(). | Tom Lane
We need to increment the refcount on the composite type's cached tuple descriptor while we do lookups of its column types. Otherwise a cache flush could occur and release the tuple descriptor before we're done with it. This fails reliably with -DCLOBBER_CACHE_ALWAYS, but the odds of a failure in a production build seem rather low (since the pfree'd descriptor typically wouldn't get scribbled on immediately). That may explain the lack of any previous reports. Buildfarm issue noted by Christian Ullrich. Back-patch to 9.1 where the bogus code was added.
2013-06-11 | Fix pg_isready to handle conninfo properly. | Fujii Masao
pg_isready displays the host name and the port number that it uses to connect to the server. So far, pg_isready didn't use the conninfo specified in the -d option when determining that host name and port number, which could lead to displaying the wrong values to the user. This commit changes pg_isready so that it uses the conninfo for that calculation. Original patch by Phil Sorber, modified by me.
2013-06-09 | Fix ordering of obj id for Rules and EventTriggers in pg_dump. | Joe Conway
getSchemaData() must identify extension member objects and mark them as not to be dumped. This must happen after reading all objects that can be direct members of extensions, but before we begin to process table subsidiary objects. Both rules and event triggers were wrong in this regard. Backport rules portion of patch to 9.1 -- event triggers do not exist prior to 9.3. Suggested fix by Tom Lane, initial complaint and patch by me.
2013-06-09 | Remove unnecessary restrictions about RowExprs in transformAExprIn(). | Tom Lane
When the existing code here was written, it made sense to special-case RowExprs because that was the only way that we could handle row comparisons at all. Now that we have record_eq() and arrays of composites, the generic logic for "scalar" types will in fact work on RowExprs too, so there's no reason to throw error for combinations of RowExprs and other ways of forming composite values, nor to ignore the possibility of using a ScalarArrayOpExpr. But keep using the old logic when comparing two RowExprs, for consistency with the main transformAExprOp() logic. (This allows some cases with not-quite-identical rowtypes to succeed, so we might get push-back if we removed it.) Per bug #8198 from Rafal Rzepecki. Back-patch to all supported branches, since this works fine as far back as 8.4. Rafal Rzepecki and Tom Lane