path: root/src/backend/utils
Age | Commit message | Author
2015-07-09 | Make wal_compression PGC_SUSET rather than PGC_USERSET. | Fujii Masao
When wal_compression is enabled, there is a risk of leaking data, similar to the BREACH and CRIME attacks on SSL, because the compression ratio of a full page image gives a hint about the existing data on that page. This vulnerability is quite cumbersome to exploit in practice, but doable. So this patch makes wal_compression PGC_SUSET in order to prevent non-superusers from enabling it and exploiting the vulnerability when the DBA takes the risk seriously and has disabled it in postgresql.conf. Back-patch to 9.5 where wal_compression was introduced.
2015-07-08 | Revoke support for strxfrm() that write past the specified array length. | Noah Misch
This formalizes a decision implicit in commit 4ea51cdfe85ceef8afabceb03c446574daa0ac23 and adds clean detection of affected systems. Vendor updates are available for each such known bug. Back-patch to 9.5, where the aforementioned commit first appeared.
2015-07-02 | Use appendStringInfoString/Char et al where appropriate. | Heikki Linnakangas
Patch by David Rowley. Backpatch to 9.5, as some of the calls were new in 9.5, and keeping the code in sync with master makes future backpatching easier.
2015-07-01 | Make sampler_random_fract() actually obey its API contract. | Tom Lane
This function is documented to return a value in the range (0,1), which is what its predecessor anl_random_fract() did. However, the new version depends on pg_erand48() which returns a value in [0,1). The possibility of returning zero creates hazards of division by zero or trying to compute log(0) at some call sites, and it might well break third-party modules using anl_random_fract() too. So let's change it to never return zero. Spotted by Coverity. Michael Paquier, cosmetically adjusted by me
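The open-interval contract described above can be illustrated with a small retry loop. This is only a sketch under stated assumptions: `random_fract_open` is a hypothetical name, and POSIX `erand48()` stands in for `pg_erand48()`; it is not the committed code.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700		/* ask for the erand48() declaration */
#endif

#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for sampler_random_fract(): return a uniform
 * value in the open interval (0,1).  erand48() yields values in [0,1),
 * so an exact zero is rejected by drawing again.
 */
double
random_fract_open(unsigned short state[3])
{
	double		res;

	do
	{
		res = erand48(state);
	} while (res == 0.0);

	assert(res > 0.0 && res < 1.0);
	return res;
}
```

With zero excluded, call sites that compute a logarithm of the result or divide by it no longer need their own guard.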
2015-06-29 | In bttext_abbrev_convert, move pfree to the right place. | Robert Haas
Without this, we might access memory that's already been freed, or leak memory if in the C locale. Peter Geoghegan
2015-06-29 | Code + docs review for escaping of option values (commit 11a020eb6). | Tom Lane
Avoid memory leak from incorrect choice of how to free a StringInfo (resetStringInfo doesn't do it). Now that pg_split_opts doesn't scribble on the optstr, mark that as "const" for clarity. Attach the commentary in protocol.sgml to the right place, and add documentation about the user-visible effects of this change on postgres' -o option and libpq's PGOPTIONS option.
2015-06-28 | Run the C portions of guc-file.l through pgindent. | Tom Lane
Yeah, I know, pretty anal-retentive of me. But we oughta find some way to automate this for the .y and .l files.
2015-06-28 | Improve design and implementation of pg_file_settings view. | Tom Lane
As first committed, this view reported on the file contents as they were at the last SIGHUP event. That's not as useful as reporting on the current contents, and what's more, it didn't work right on Windows unless the current session had serviced at least one SIGHUP. Therefore, arrange to re-read the files when pg_show_all_settings() is called. This requires only minor refactoring so that we can pass changeVal = false to set_config_option() so that it won't actually apply any changes locally. In addition, add error reporting so that errors that would prevent the configuration files from being loaded, or would prevent individual settings from being applied, are visible directly in the view. This makes the view usable for pre-testing whether edits made in the config files will have the desired effect, before one actually issues a SIGHUP. I also added an "applied" column so that it's easy to identify entries that are superseded by later entries; this was the main use-case for the original design, but it seemed unnecessarily hard to use for that. Also fix a 9.4.1 regression that allowed multiple entries for a PGC_POSTMASTER variable to cause bogus complaints in the postmaster log. (The issue here was that commit bf007a27acd7b2fb unintentionally reverted 3e3f65973a3c94a6, which suppressed any duplicate entries within ParseConfigFp. However, since the original coding of the pg_file_settings view depended on such suppression *not* happening, we couldn't have fixed this issue now without first doing something with pg_file_settings. Now we suppress duplicates by marking them "ignored" within ProcessConfigFileInternal, which doesn't hide them in the view.) Lesser changes include: Drive the view directly off the ConfigVariable list, instead of making a basically-equivalent second copy of the data. There's no longer any need to hang onto the data permanently, anyway. 
Convert show_all_file_settings() to do its work in one call and return a tuplestore; this avoids risks associated with assuming that the GUC state will hold still over the course of query execution. (I think there were probably latent bugs here, though you might need something like a cursor on the view to expose them.) Arrange to run SIGHUP processing in a short-lived memory context, to forestall process-lifespan memory leaks. (There is one known leak in this code, in ProcessConfigDirectory; it seems minor enough to not be worth back-patching a specific fix for.) Remove mistaken assignment to ConfigFileLineno that caused line counting after an include_dir directive to be completely wrong. Add missed failure check in AlterSystemSetConfigFile(). We don't really expect ParseConfigFp() to fail, but that's not an excuse for not checking.
2015-06-28 | Add missing_ok option to the SQL functions for reading files. | Heikki Linnakangas
This makes it possible to use the functions without getting errors, if there is a chance that the file might be removed or renamed concurrently. pg_rewind needs to do just that, although this could be useful for other purposes too. (The changes to pg_rewind to use these functions will come in a separate commit.) The read_binary_file() function isn't very well-suited for extension.c's purposes anymore, if it ever was. So bite the bullet and make a copy of it in extension.c, tailored for that use case. This seems better than the accidental code reuse, even if it's some more lines of code. Michael Paquier, with plenty of kibitzing by me.
2015-06-28 | Fix comment for GetCurrentIntegerTimestamp(). | Kevin Grittner
The unit of measure is microseconds, not milliseconds. Backpatch to 9.3 where the function and its comment were added.
2015-06-27 | Avoid passing NULL to memcmp() in lookups of zero-argument functions. | Tom Lane
A few places assumed they could pass NULL for the argtypes array when looking up functions known to have zero arguments. At first glance it seems that this should be safe enough, since memcmp() is surely not allowed to fetch any bytes if its count argument is zero. However, close reading of the C standard says that such calls have undefined behavior, so we'd probably best avoid it. Since the number of places doing this is quite small, and some other places looking up zero-argument functions were already passing dummy arrays, let's standardize on the latter solution rather than hacking the function lookup code to avoid calling memcmp() in these cases. I also added Asserts to catch any future violations of the new rule. Given the utter lack of any evidence that this actually causes any problems in the field, I don't feel a need to back-patch this change. Per report from Piotr Stefaniak, though this is not his patch.
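The dummy-array convention can be sketched as follows. The helper names and the local `Oid` typedef are hypothetical stand-ins for illustration, not the function lookup code this commit touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef unsigned int Oid;		/* stand-in for PostgreSQL's Oid typedef */

/*
 * Hypothetical lookup helper: does a candidate's argument list match?
 * Callers must pass valid pointers even when nargs is zero, because
 * calling memcmp() with a null pointer is undefined behavior even for
 * a zero count.  The assertion catches violations of that rule.
 */
bool
argtypes_match(const Oid *want, const Oid *have, int nargs)
{
	assert(want != NULL && have != NULL);
	return memcmp(want, have, (size_t) nargs * sizeof(Oid)) == 0;
}

/* For a zero-argument function, pass a dummy array rather than NULL. */
bool
matches_zero_arg_function(const Oid *candidate_argtypes)
{
	Oid			dummy[1] = {0};	/* never read when nargs == 0 */

	return argtypes_match(dummy, candidate_argtypes, 0);
}
```

Keeping the rule at the call sites, backed by an assertion, leaves the comparison itself free of a special case for zero arguments.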
2015-06-25 | Allow background workers to connect to no particular database. | Robert Haas
The documentation claims that this is supported, but it didn't actually work. Fix that. Reported by Pavel Stehule; patch by me.
2015-06-25 | Fix the logic for putting relations into the relcache init file. | Tom Lane
Commit f3b5565dd4e59576be4c772da364704863e6a835 was a couple of bricks shy of a load; specifically, it missed putting pg_trigger_tgrelid_tgname_index into the relcache init file, because that index is not used by any syscache. However, we have historically nailed that index into cache for performance reasons. The upshot was that load_relcache_init_file always decided that the init file was busted and silently ignored it, resulting in a significant hit to backend startup speed. To fix, reinstantiate RelationIdIsInInitFile() as a wrapper around RelationSupportsSysCache(), which can know about additional relations that should be in the init file despite being unknown to syscache.c. Also install some guards against future mistakes of this type: make write_relcache_init_file Assert that all nailed relations get written to the init file, and make load_relcache_init_file emit a WARNING if it takes the "wrong number of nailed relations" exit path. Now that we remove the init files during postmaster startup, that case should never occur in the field, even if we are starting a minor-version update that added or removed rels from the nailed set. So the warning shouldn't ever be seen by end users, but it will show up in the regression tests if somebody breaks this logic. Back-patch to all supported branches, like the previous commit.
2015-06-20 | Fix failure to copy setlocale() return value. | Noah Misch
POSIX permits setlocale() calls to invalidate any previous setlocale() return values, but commit 5f538ad004aa00cf0881f179f0cde789aad4f47e neglected to account for setlocale(LC_CTYPE, NULL) doing so. The effect was to set the LC_CTYPE environment variable to an unintended value. pg_perm_setlocale() sets this variable to assist PL/Perl; without it, Perl would undo PostgreSQL's locale settings. The known-affected configurations are 32-bit, release builds using Visual Studio 2012 or Visual Studio 2013. Visual Studio 2010 is unaffected, as were all buildfarm-attested configurations. In principle, this bug could leave the wrong LC_CTYPE in effect after PL/Perl use, which could in turn facilitate problems like corrupt tsvector datums. No known platform experiences that consequence, because PL/Perl on Windows does not use this environment variable. The bug has been user-visible, as early postmaster failure, on systems with Windows ANSI code page set to CP936 for "Chinese (Simplified, PRC)" and probably on systems using other multibyte code pages. (SetEnvironmentVariable() rejects values containing character data not valid under the Windows ANSI code page.) Back-patch to 9.4, where the faulty commit first appeared. Reported by Didi Hu and 林鹏程. Reviewed by Tom Lane, though this fix strategy was not his first choice.
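The general remedy for this class of bug is to copy the string that setlocale() returns before making any further setlocale() call. A minimal sketch, assuming a hypothetical helper name `copy_current_locale`; it is not the code in pg_perm_setlocale().

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a malloc'd copy of the current setting for a locale category,
 * or NULL on failure.  The copy stays valid across later setlocale()
 * calls, unlike the pointer that setlocale() itself returns.
 */
char *
copy_current_locale(int category)
{
	const char *cur = setlocale(category, NULL);
	char	   *copy;
	size_t		len;

	if (cur == NULL)
		return NULL;

	len = strlen(cur) + 1;
	copy = malloc(len);
	if (copy != NULL)
		memcpy(copy, cur, len);
	return copy;
}
```

The caller owns the returned string and releases it with free() once it is no longer needed.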
2015-06-20 | Revert "Detect setlocale(LC_CTYPE, NULL) clobbering previous return values." | Noah Misch
This reverts commit b76e76be460a240e99c33f6fb470dd1d5fe01a2a. The buildfarm yielded no related failures.
2015-06-17 | Detect setlocale(LC_CTYPE, NULL) clobbering previous return values. | Noah Misch
POSIX permits setlocale() calls to invalidate any previous setlocale() return values. Commit 5f538ad004aa00cf0881f179f0cde789aad4f47e neglected to account for that. In advance of fixing that bug, switch to failing hard on affected configurations. This is a planned temporary commit to assay buildfarm-represented configurations.
2015-06-12 | Fix "path" infrastructure bug affecting jsonb_set() | Andrew Dunstan
jsonb_set() and other clients of the setPathArray() utility function could get spurious results when an array integer subscript is provided that is not within the range of int. To fix, ensure that the value returned by strtol() within setPathArray() is within the range of int; when it isn't, assume an invalid input in line with existing, similar cases. The path-orientated operators that appeared in PostgreSQL 9.3 and 9.4 do not call setPathArray(), and already independently take this precaution, so no change there. Peter Geoghegan
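The range check on the strtol() result can be sketched like this. `parse_int_subscript` is a hypothetical helper written for illustration; it is not setPathArray() itself, and its exact rejection rules are an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Parse a decimal array subscript.  Returns false for non-numeric
 * input and for values that do not fit in an int, so that the caller
 * can treat both cases as invalid input.
 */
bool
parse_int_subscript(const char *text, int *result)
{
	char	   *end;
	long		val;

	errno = 0;
	val = strtol(text, &end, 10);

	if (end == text || *end != '\0')
		return false;			/* no digits, or trailing junk */
	if (errno == ERANGE || val > INT_MAX || val < INT_MIN)
		return false;			/* outside the range of long or int */

	*result = (int) val;
	return true;
}
```

On platforms where long is wider than int, the explicit INT_MAX and INT_MIN comparison is what catches a value that strtol() accepts but that would be silently truncated by the cast.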
2015-06-12 | Fix alphabetization in catalogs.sgml. | Fujii Masao
System catalogs and views should be listed alphabetically in catalogs.sgml, but the pg_file_settings view was not. This patch also fixes typos in pg_file_settings comments.
2015-06-07 | Desupport jsonb subscript deletion on objects | Andrew Dunstan
Supporting deletion of JSON pairs within jsonb objects using an array-style integer subscript allowed for surprising outcomes. This was mostly due to the implementation-defined ordering of pairs within objects for jsonb. It also seems desirable to make jsonb integer subscript deletion consistent with the 9.4 era general purpose integer subscripting operator for jsonb (although that operator returns NULL when an object is encountered, while we prefer here to throw an error). Peter Geoghegan, following discussion on -hackers.
2015-06-07 | Use a safer method for determining whether relcache init file is stale. | Tom Lane
When we invalidate the relcache entry for a system catalog or index, we must also delete the relcache "init file" if the init file contains a copy of that rel's entry. The old way of doing this relied on a specially maintained list of the OIDs of relations present in the init file: we made the list either when reading the file in, or when writing the file out. The problem is that when writing the file out, we included only rels present in our local relcache, which might have already suffered some deletions due to relcache inval events. In such cases we correctly decided not to overwrite the real init file with incomplete data --- but we still used the incomplete initFileRelationIds list for the rest of the current session. This could result in wrong decisions about whether the session's own actions require deletion of the init file, potentially allowing an init file created by some other concurrent session to be left around even though it's been made stale. Since we don't support changing the schema of a system catalog at runtime, the only likely scenario in which this would cause a problem in the field involves a "vacuum full" on a catalog concurrently with other activity, and even then it's far from easy to provoke. Remarkably, this has been broken since 2002 (in commit 786340441706ac1957a031f11ad1c2e5b6e18314), but we had never seen a reproducible test case until recently. If it did happen in the field, the symptoms would probably involve unexpected "cache lookup failed" errors to begin with, then "could not open file" failures after the next checkpoint, as all accesses to the affected catalog stopped working. Recovery would require manually removing the stale "pg_internal.init" file. To fix, get rid of the initFileRelationIds list, and instead consult syscache.c's list of relations used in catalog caches to decide whether a relation is included in the init file. 
This should be a tad more efficient anyway, since we're replacing linear search of a list with ~100 entries with a binary search. It's a bit ugly that the init file contents are now so directly tied to the catalog caches, but in practice that won't make much difference. Back-patch to all supported branches.
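The efficiency remark is about membership tests on a sorted list of relation OIDs. A rough sketch of such a test using the standard bsearch(); the function names and the local `Oid` typedef are hypothetical, and this is not syscache.c's actual interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef unsigned int Oid;		/* stand-in for PostgreSQL's Oid typedef */

/* Three-way comparison of two OIDs, in the form bsearch() expects. */
static int
oid_cmp(const void *a, const void *b)
{
	Oid			x = *(const Oid *) a;
	Oid			y = *(const Oid *) b;

	return (x > y) - (x < y);
}

/* Is relid present in an array of OIDs sorted in ascending order? */
bool
oid_in_sorted_list(Oid relid, const Oid *sorted, size_t n)
{
	return bsearch(&relid, sorted, n, sizeof(Oid), oid_cmp) != NULL;
}
```

For a list of about 100 entries this needs roughly seven comparisons per lookup, where a linear scan averages about fifty.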
2015-06-05 | Fix incorrect order of database-locking operations in InitPostgres(). | Tom Lane
We should set MyProc->databaseId after acquiring the per-database lock, not beforehand. The old way risked deadlock against processes trying to copy or delete the target database, since they would first acquire the lock and then wait for processes with matching databaseId to exit; that left a window wherein an incoming process could set its databaseId and then block on the lock, while the other process had the lock and waited in vain for the incoming process to exit. CountOtherDBBackends() would time out and fail after 5 seconds, so this just resulted in an unexpected failure not a permanent lockup, but it's still annoying when it happens. A real-world example of a use-case is that short-duration connections to a template database should not cause CREATE DATABASE to fail. Doing it in the other order should be fine since the contract has always been that processes searching the ProcArray for a database ID must hold the relevant per-database lock while searching. Thus, this actually removes the former race condition that required an assumption that storing to MyProc->databaseId is atomic. It's been like this for a long time, so back-patch to all active branches.
2015-05-31 | Avoid naming a variable "new", and remove bogus initializer. | Andrew Dunstan
Per gripe from Tom Lane.
2015-05-31 | Add a couple of missing JsonbValue type initialisers. | Andrew Dunstan
2015-05-31 | Rename jsonb_replace to jsonb_set and allow it to add new values | Andrew Dunstan
The function is given a fourth parameter, which defaults to true. When this parameter is true, if the last element of the path is missing in the original json, jsonb_set creates it in the result and assigns it the new value. If it is false then the function does nothing unless all elements of the path are present, including the last. Based on some original code from Dmitry Dolgov, heavily modified by me. Catalog version bumped.
2015-05-29 | Revert exporting of internal GUC variable "data_directory". | Tom Lane
This undoes a poorly-thought-out choice in commit 970a18687f9b3058, namely to export guc.c's internal variable data_directory. The authoritative variable so far as C code is concerned is DataDir; there is no reason for anything except specific bits of GUC code to look at the GUC variable. After yesterday's commits fixing the fsync-on-restart patch, the only remaining misuse of data_directory was in AlterSystemSetConfigFile(), which would be much better off just using a relative path anyhow: it's less code and it doesn't break if the DBA moves the data directory of a running system, which is a case we've taken some pains over in the past. This is mostly cosmetic, so no need for a back-patch (and I'd be hesitant to remove a global variable in stable branches anyway).
2015-05-28 | Fix assorted inconsistencies in our calls of readlink(). | Tom Lane
Ensure that we null-terminate the result string (one place in pg_rewind). Be paranoid about out-of-range results from readlink() (should not happen, but there is no good reason for some call sites to be careful about it and others not). Consistently use the whole buffer, not sometimes one byte less. Ensure we emit an appropriate errcode() in all cases. Spell the error messages the same way. The only serious bug here is the missing null-termination in pg_rewind, which is new code, so no need for a back-patch. Abhijit Menon-Sen and Tom Lane
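The calling pattern described above (pass the whole buffer, reject out-of-range results, then null-terminate) might look like the following. `read_link_target` is a hypothetical wrapper for illustration, not one of the call sites fixed in this commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the target of a symbolic link into buf as a null-terminated
 * string.  Returns false on error, or if the target does not fit.
 */
bool
read_link_target(const char *path, char *buf, size_t bufsize)
{
	ssize_t		len;

	if (bufsize == 0)
		return false;

	/* readlink() does not null-terminate, so the whole buffer is offered */
	len = readlink(path, buf, bufsize);

	/* a result that fills the buffer leaves no room for the terminator */
	if (len < 0 || (size_t) len >= bufsize)
		return false;

	buf[len] = '\0';
	return true;
}
```

A result that exactly fills the buffer is treated as "too long", since it cannot be told apart from a truncated target.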
2015-05-28 | Fix pg_get_functiondef() to print a function's LEAKPROOF property. | Tom Lane
Seems to have been an oversight in the original leakproofness patch. Per report and patch from Jeevan Chalke. In passing, prettify some awkward leakproof-related code in AlterFunction.
2015-05-26 | Revert "Add all structured objects passed to pushJsonbValue piecewise." | Andrew Dunstan
This reverts commit 54547bd87f49326d67051254c363e6597d16ffda. This appears to have been a thinko on my part. I will try to come up with a better solution.
2015-05-26 | Revert "Simplify addJsonbToParseState()" | Andrew Dunstan
This reverts commit fba12c8c6c4159e1923958a4006b26f3cf873254. This relied on a commit that is also being reverted.
2015-05-26 | Simplify addJsonbToParseState() | Andrew Dunstan
This function no longer needs to walk non-scalar structures passed to it, following commit 54547bd87f49326d67051254c363e6597d16ffda.
2015-05-26 | Add all structured objects passed to pushJsonbValue piecewise. | Andrew Dunstan
Commit 9b74f32cdbff8b9be47fc69164eae552050509ff did this for objects of type jbvBinary, but in trying further to simplify some of the new jsonb code I discovered that objects of type jbvObject or jbvArray passed as WJB_ELEM or WJB_VALUE also caused problems. These too are now added component by component. Backpatch to 9.4.
2015-05-25 | Clean up and simplify jsonb_concat code. | Andrew Dunstan
Some of this is made possible by commit 9b74f32cdbff8b9be47fc69164eae552050509ff which lets pushJsonbValue handle binary Jsonb values, meaning that clients no longer have to, and some is just doing things in simpler and more straightforward ways.
2015-05-24 | Manual cleanup of pgindent results. | Tom Lane
Fix some places where pgindent did silly stuff, often because project style wasn't followed to begin with. (I've not touched the atomics headers, though.)
2015-05-24 | Remove no-longer-required function declarations. | Tom Lane
Remove a bunch of "extern Datum foo(PG_FUNCTION_ARGS);" declarations that are no longer needed now that PG_FUNCTION_INFO_V1(foo) provides that. Some of these were evidently missed in commit e7128e8dbb305059, but others were cargo-culted in in code added since then. Possibly that can be blamed in part on the fact that we'd not fixed relevant documentation examples, which I've now done.
2015-05-23 | pgindent run for 9.5 | Bruce Momjian
2015-05-23 | Fix yet another bug in ON CONFLICT rule deparsing. | Andres Freund
Expand testing of rule deparsing a good bit, it's evidently needed. Author: Peter Geoghegan, Andres Freund Discussion: CAM3SWZQmXxZhQC32QVEOTYfNXJBJ_Q2SDENL7BV14Cq-zL0FLg@mail.gmail.com
2015-05-22 | Fix recently-introduced crash in array_contain_compare(). | Tom Lane
Silly oversight in commit 1dc5ebc9077ab742079ce5dac9a6664248d42916: when array2 is an expanded array, it might have array2->xpn.dnulls equal to NULL, indicating the array is known null-free. The code wasn't expecting that, because it formerly always used deconstruct_array() which always delivers a nulls array. Per bug #13334 from Regina Obe.
2015-05-22 | Unpack jbvBinary objects passed to pushJsonbValue | Andrew Dunstan
pushJsonbValue was accepting jbvBinary objects passed as WJB_ELEM or WJB_VALUE data. While this succeeded, when those objects were later encountered in attempting to convert the result to Jsonb, errors occurred. With this change we guarantee that a JsonbValue constructed from calls to pushJsonbValue does not contain any jbvBinary objects. This cures a problem observed with jsonb_delete. This means callers of pushJsonbValue no longer need to perform this unpacking themselves. A subsequent patch will perform some cleanup in that area. The error was not triggered by any 9.4 code, but this is a publicly visible routine, and so the error could be exercised by third party code, therefore backpatch to 9.4. Bug report from Peter Geoghegan, fix by me.
2015-05-20 | Collection of typo fixes. | Heikki Linnakangas
Use "a" and "an" correctly, mostly in comments. Two error messages were also fixed (they were just elogs, so no translation work required). Two function comments in pg_proc.h were also fixed. Etsuro Fujita reported one of these, but I found a lot more with grep. Also fix a few other typos spotted while grepping for the a/an typos. For example, "consists out of ..." -> "consists of ...". Plus a "though"/ "through" mixup reported by Euler Taveira. Many of these typos were in old code, which would be nice to backpatch to make future backpatching easier. But much of the code was new, and I didn't feel like crafting separate patches for each branch. So no backpatching.
2015-05-19 | Various fixes around ON CONFLICT for rule deparsing. | Andres Freund
Neither the deparsing of the new alias for INSERT's target table nor that of the inference clause was supported. Also fix up a typo in an error message. Add regression tests to test those code paths. Author: Peter Geoghegan
2015-05-18 | Put back a backwards-compatible version of sampling support functions. | Tom Lane
Commit 83e176ec18d2a91dbea1d0d1bd94c38dc47cd77c removed the longstanding support functions for block sampling without any consideration of the impact this would have on third-party FDWs. The new API is not notably more functional for FDWs than the old, so forcing them to change doesn't seem like a good thing. We can provide the old API as a wrapper (more or less) around the new one for a minimal amount of extra code.
2015-05-18 | Check return values of sensitive system library calls. | Noah Misch
PostgreSQL already checked the vast majority of these, missing this handful that nearly cannot fail. If putenv() failed with ENOMEM in pg_GSS_recvauth(), authentication would proceed with the wrong keytab file. If strftime() returned zero in cache_locale_time(), using the unspecified buffer contents could lead to information exposure or a crash. Back-patch to 9.0 (all supported versions). Other unchecked calls to these functions, especially those in frontend code, pose negligible security concern. This patch does not address them. Nonetheless, it is always better to check return values whose specification provides for indicating an error. In passing, fix an off-by-one error in strftime_win32()'s invocation of WideCharToMultiByte(). Upon retrieving a value of exactly MAX_L10N_DATA bytes, strftime_win32() would overrun the caller's buffer by one byte. MAX_L10N_DATA is chosen to exceed the length of every possible value, so the vulnerable scenario probably does not arise. Security: CVE-2015-3166
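The strftime() hazard comes from using the buffer after a zero return, when its contents are unspecified. A minimal sketch of a checked wrapper; `format_time_checked` is a hypothetical name and this is not cache_locale_time().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/*
 * Format a broken-down time into dst.  On failure the buffer is reset
 * to an empty string so that no unspecified contents can leak out.
 *
 * Caveat: strftime() also returns 0 when the formatted result is
 * legitimately empty; this sketch treats that case as a failure too.
 */
bool
format_time_checked(char *dst, size_t dstsize,
					const char *fmt, const struct tm *tm)
{
	if (dstsize == 0)
		return false;

	if (strftime(dst, dstsize, fmt, tm) == 0)
	{
		dst[0] = '\0';
		return false;
	}
	return true;
}
```

Callers then branch on the return value instead of trusting whatever happens to be in the buffer.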
2015-05-17 | Fix typos in comments | Magnus Hagander
Dmitriy Olshevskiy
2015-05-16 | Support GROUPING SETS, CUBE and ROLLUP. | Andres Freund
This SQL standard functionality allows aggregating data by different GROUP BY clauses at once. Each grouping set returns rows with columns grouped by in other sets set to NULL. This could previously be achieved by doing each grouping as a separate query, conjoined by UNION ALLs. Besides being considerably more concise, grouping sets will in many cases be faster, requiring only one scan over the underlying data. The current implementation of grouping sets only supports using sorting for input. Individual sets that share a sort order are computed in one pass. If there are sets that don't share a sort order, additional sort & aggregation steps are performed. These additional passes are sourced by the previous sort step, thus avoiding repeated scans of the source data. The code is structured in a way that adding support for purely using hash aggregation or a mix of hashing and sorting is possible. Sorting was chosen to be supported first, as it is the most generic method of implementation. Instead of, as in earlier versions of the patch, representing the chain of sort and aggregation steps as full-blown planner and executor nodes, all but the first sort are performed inside the aggregation node itself. This avoids the need to do some unusual gymnastics to handle having to return aggregated and non-aggregated tuples from underlying nodes, as well as having to shut down underlying nodes early to limit memory usage. The optimizer still builds Sort/Agg nodes to describe each phase, but they're not part of the plan tree; instead they are additional data for the aggregation node. They're a convenient and preexisting way to describe aggregation and sorting. The first (and possibly only) sort step is still performed as a separate execution step. That retains similarity with existing group by plans, makes rescans fairly simple, avoids very deep plans (leading to slow explains) and makes it easy to avoid the sorting step if the underlying data is sorted by other means. 
A somewhat ugly side of this patch is having to deal with a grammar ambiguity between the new CUBE keyword and the cube extension/functions named cube (and rollup). To avoid breaking existing deployments of the cube extension it has not been renamed, neither has cube been made a reserved keyword. Instead precedence hacking is used to make GROUP BY cube(..) refer to the CUBE grouping sets feature, and not the function cube(). To actually group by a function cube(), unlikely as that might be, the function name has to be quoted. Needs a catversion bump because stored rules may change. Author: Andrew Gierth and Atri Sharma, with contributions from Andres Freund Reviewed-By: Andres Freund, Noah Misch, Tom Lane, Svenne Krap, Tomas Vondra, Erik Rijkers, Marti Raudsepp, Pavel Stehule Discussion: CAOeZVidmVRe2jU6aMk_5qkxnB7dfmPROzM7Ur8JPW5j8Y5X-Lw@mail.gmail.com
2015-05-15 | Add BRIN infrastructure for "inclusion" opclasses | Alvaro Herrera
This lets BRIN be used with R-Tree-like indexing strategies. Also provided are operator classes for range types, box and inet/cidr. The infrastructure provided here should be sufficient to create operator classes for similar datatypes; for instance, opclasses for PostGIS geometries should be doable, though we didn't try to implement one. (A box/point opclass was also submitted, but we ripped it out before commit because the handling of floating point comparisons in existing code is inconsistent and would generate corrupt indexes.) Author: Emre Hasegeli. Cosmetic changes by me Review: Andreas Karlsson
2015-05-15 | Move strategy numbers to include/access/stratnum.h | Alvaro Herrera
For upcoming BRIN opclasses, it's convenient to have strategy numbers defined in a single place. Since there's nothing appropriate, create it. The StrategyNumber typedef now lives there, as well as existing strategy numbers for B-trees (from skey.h) and R-tree-and-friends (from gist.h). skey.h is forced to include stratnum.h because of the StrategyNumber typedef, but gist.h is not; extensions that currently rely on gist.h for rtree strategy numbers might need to add a new A few .c files can stop including skey.h and/or gist.h, which is a nice side benefit. Per discussion: https://www.postgresql.org/message-id/20150514232132.GZ2523@alvh.no-ip.org Authored by Emre Hasegeli and Álvaro. (It's not clear to me why bootscanner.l has any #include lines at all.)
2015-05-15 | Extend GB18030 encoding conversion to cover full Unicode range. | Tom Lane
Our previous code for GB18030 <-> UTF8 conversion only covered Unicode code points up to U+FFFF, but the actual spec defines conversions for all code points up to U+10FFFF. That would be rather impractical as a lookup table, but fortunately there is a simple algorithmic conversion between the additional code points and the equivalent GB18030 byte patterns. Make use of the just-added callback facility in LocalToUtf/UtfToLocal to perform the additional conversions. Having created the infrastructure to do that, we can use the same code to map certain linearly-related subranges of the Unicode space below U+FFFF, allowing removal of the corresponding lookup table entries. This more than halves the lookup table size, which is a substantial savings; utf8_and_gb18030.so drops from nearly a megabyte to about half that. In support of doing that, replace ISO10646-GB18030.TXT with the data file gb-18030-2000.xml (retrieved from http://source.icu-project.org/repos/icu/data/trunk/charset/data/xml/ ) in which these subranges have been deleted from the simple lookup entries. Per bug #12845 from Arjen Nienhuis. The conversion code added here is based on his proposed patch, though I whacked it around rather heavily.
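For code points above U+FFFF the algorithmic conversion is a plain linear mapping onto four-byte sequences, starting at 0x90 0x30 0x81 0x30 for U+10000. The sketch below illustrates that mapping in both directions. The function names are hypothetical, and this is not the callback added in this commit, which also covers linear subranges below U+FFFF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Convert a supplementary-plane code point (U+10000..U+10FFFF) to its
 * four-byte GB18030 form.  Byte 1 counts from 0x90, bytes 2 and 4 are
 * decimal digits 0x30..0x39, and byte 3 runs over 0x81..0xFE.
 */
bool
unicode_to_gb18030_supp(uint32_t cp, unsigned char out[4])
{
	uint32_t	linear;

	if (cp < 0x10000 || cp > 0x10FFFF)
		return false;

	linear = cp - 0x10000;
	out[3] = (unsigned char) (0x30 + linear % 10);
	linear /= 10;
	out[2] = (unsigned char) (0x81 + linear % 126);
	linear /= 126;
	out[1] = (unsigned char) (0x30 + linear % 10);
	linear /= 10;
	out[0] = (unsigned char) (0x90 + linear);
	return true;
}

/* Inverse mapping; rejects bytes outside the supplementary range. */
bool
gb18030_supp_to_unicode(const unsigned char in[4], uint32_t *cp)
{
	uint32_t	linear;

	if (in[0] < 0x90 || in[0] > 0xE3 ||
		in[1] < 0x30 || in[1] > 0x39 ||
		in[2] < 0x81 || in[2] > 0xFE ||
		in[3] < 0x30 || in[3] > 0x39)
		return false;

	linear = (uint32_t) (in[0] - 0x90);
	linear = linear * 10 + (uint32_t) (in[1] - 0x30);
	linear = linear * 126 + (uint32_t) (in[2] - 0x81);
	linear = linear * 10 + (uint32_t) (in[3] - 0x30);

	if (linear > 0xFFFFF)
		return false;			/* beyond U+10FFFF */

	*cp = 0x10000 + linear;
	return true;
}
```

Because the mapping is pure arithmetic, none of the roughly one million supplementary code points needs a lookup table entry.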
2015-05-15 | TABLESAMPLE, SQL Standard and extensible | Simon Riggs
Add a TABLESAMPLE clause to SELECT statements that allows the user to specify random BERNOULLI sampling or block-level SYSTEM sampling. The implementation allows extensible sampling functions to be written, using a standard API. The basic version follows the SQL standard exactly. Usable concrete use cases for the sampling API follow in later commits. Petr Jelinek. Reviewed by Michael Paquier and Simon Riggs.
2015-05-15 | Add archive_mode='always' option. | Heikki Linnakangas
In 'always' mode, the standby independently archives all files it receives from the primary. Original patch by Fujii Masao, docs and review by me.
2015-05-15 | Fix insufficiently-paranoid GB18030 encoding verifier. | Tom Lane
The previous coding effectively only verified that the second byte of a multibyte character was in the expected range; moreover, it wasn't careful to make sure that the second byte even exists in the buffer before touching it. The latter seems unlikely to cause any real problems in the field (in particular, it could never be a problem with null-terminated input), but it's still a bug. Since GB18030 is not a supported backend encoding, the only thing we'd really be doing with GB18030 text is converting it to UTF8 in LocalToUtf, which would fail anyway on any invalid character for lack of a match in its lookup table. So the only user-visible consequence of this change should be that you'll get "invalid byte sequence for encoding" rather than "character has no equivalent" for malformed GB18030 input. However, impending changes to the GB18030 conversion code will require these tighter up-front checks to avoid producing bogus results.
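A verifier of the stricter kind described above checks every byte of a character against its permitted range, and checks that each byte exists in the buffer before reading it. The sketch below is built from the published GB18030 byte ranges. `gb18030_char_len` is a hypothetical name and the code is an illustration, not the committed verifier.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length (1, 2 or 4) of the GB18030 character starting at
 * s, or -1 if it is malformed or truncated.  "remaining" is the number
 * of bytes available at s; no byte beyond that is ever read.
 */
int
gb18030_char_len(const unsigned char *s, size_t remaining)
{
	if (remaining == 0)
		return -1;
	if (s[0] < 0x80)
		return 1;				/* single-byte ASCII */
	if (s[0] < 0x81 || s[0] > 0xFE)
		return -1;				/* invalid lead byte */
	if (remaining < 2)
		return -1;				/* second byte is missing */

	if (s[1] >= 0x30 && s[1] <= 0x39)
	{
		/* four-byte form: lead, digit, 0x81..0xFE, digit */
		if (remaining < 4)
			return -1;
		if (s[2] < 0x81 || s[2] > 0xFE)
			return -1;
		if (s[3] < 0x30 || s[3] > 0x39)
			return -1;
		return 4;
	}

	/* two-byte form: second byte in 0x40..0x7E or 0x80..0xFE */
	if ((s[1] >= 0x40 && s[1] <= 0x7E) || (s[1] >= 0x80 && s[1] <= 0xFE))
		return 2;

	return -1;
}
```

Checking the length before each read is what makes the function safe on input that is not null-terminated.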