summaryrefslogtreecommitdiff
path: root/doc/src
AgeCommit message (Collapse)Author
2019-03-27Use Pandoc also for plain-text documentation outputPeter Eisentraut
The makefile rule for the (rarely used) plain-text output postgres.txt was still written to use lynx, but in 96b8b8b6f9d8de4af01a77797273ad88c7a8e32e, where the INSTALL file was switched to pandoc, the rest of the makefile support for lynx was removed, so this was broken. Rewrite the rule to also use pandoc for postgres.txt.
2019-03-27Add support for multivariate MCV listsTomas Vondra
Introduce a third extended statistic type, supported by the CREATE STATISTICS command - MCV lists, a generalization of the statistic already built and used for individual columns. Compared to the already supported types (n-distinct coefficients and functional dependencies), MCV lists are more complex, include column values and allow estimation of much wider range of common clauses (equality and inequality conditions, IS NULL, IS NOT NULL etc.). Similarly to the other types, a new pseudo-type (pg_mcv_list) is used. Author: Tomas Vondra Reviewed-by: Dean Rasheed, David Rowley, Mark Dilger, Alvaro Herrera Discussion: https://postgr.es/m/dfdac334-9cf2-2597-fb27-f0fb3753f435@2ndquadrant.com
2019-03-27Improve error handling of column references in expression transformationMichael Paquier
Column references are not allowed in default expressions and partition bound expressions, and are restricted as such once the transformation of their expressions is done. However, trying to use more complex column references can lead to confusing error messages. For example, trying to use a two-field column reference name for default expressions and partition bounds leads to "missing FROM-clause entry for table", which makes no sense in their respective context. In order to make the errors generated more useful, this commit adds more verbose messages when transforming column references depending on the context. This has a little consequence though: for example an expression using an aggregate with a column reference as argument would cause an error to be generated for the column reference, while the aggregate was the problem reported before this commit because column references get transformed first. The confusion exists for default expressions for a long time, and the problem is new as of v12 for partition bounds. Still per the lack of complaints on the matter no backpatch is done. The patch has been written by Amit Langote and me, and Tom Lane has provided the improvement of the documentation for default expressions on the CREATE TABLE page. Author: Amit Langote, Michael Paquier Reviewed-by: Tom Lane Discussion: https://postgr.es/m/20190326020853.GM2558@paquier.xyz
2019-03-26Fix typoMagnus Hagander
Author: Daniel Gustafsson <daniel@yesql.se>
2019-03-25Doc: clarify that REASSIGN OWNED doesn't handle default privileges.Tom Lane
It doesn't touch regular privileges either, but only the latter was explicitly stated. Discussion: https://postgr.es/m/155348282848.9808.12629518043813943231@wrigleys.postgresql.org
2019-03-25pgbench: Remove \csetAlvaro Herrera
Partial revert of commit 6260cc550b0e, "pgbench: add \cset and \gset commands". While \gset is widely considered a useful and necessary tool for user- defined benchmarks, \cset does not have as much value, and its implementation was considered "not to be up to project standards" (though I, Álvaro, can't quite understand exactly how). Therefore, remove \cset. Author: Fabien Coelho Discussion: https://postgr.es/m/alpine.DEB.2.21.1903230716030.18811@lancre Discussion: https://postgr.es/m/201901101900.mv7zduch6sad@alvherre.pgsql
2019-03-25Add progress reporting for CLUSTER and VACUUM FULL.Robert Haas
This uses the same progress reporting infrastructure added in commit c16dc1aca5e01e6acaadfcf38f5fc964a381dc62 and extends it to these additional cases. We lack the ability to track the internal progress of sorts and index builds so the information reported is coarse-grained for some parts of the operation, but it still seems like a significant improvement over having nothing at all. Tatsuro Yamada, reviewed by Thomas Munro, Masahiko Sawada, Michael Paquier, Jeff Janes, Alvaro Herrera, Rafia Sabih, and by me. A fair amount of polishing also by me. Discussion: http://postgr.es/m/59A77072.3090401@lab.ntt.co.jp
2019-03-24Transaction chainingPeter Eisentraut
Add command variants COMMIT AND CHAIN and ROLLBACK AND CHAIN, which start new transactions with the same transaction characteristics as the just finished one, per SQL standard. Support for transaction chaining in PL/pgSQL is also added. This functionality is especially useful when running COMMIT in a loop in PL/pgSQL. Reviewed-by: Fabien COELHO <coelho@cri.ensmp.fr> Discussion: https://www.postgresql.org/message-id/flat/28536681-324b-10dc-ade8-ab46f7645a5a@2ndquadrant.com
2019-03-23Accept XML documents when xmloption = content, as required by SQL:2006+.Tom Lane
Previously we were using the SQL:2003 definition, which doesn't allow this, but that creates a serious dump/restore gotcha: there is no setting of xmloption that will allow all valid XML data. Hence, switch to the 2006 definition. Since libxml doesn't accept <!DOCTYPE> directives in the mode we use for CONTENT parsing, the implementation is to detect <!DOCTYPE> in the input and switch to DOCUMENT parsing mode. This should not cost much, because <!DOCTYPE> should be close to the front of the input if it's there at all. It's possible that this causes the error messages for malformed input to be slightly different than they were before, if said input includes <!DOCTYPE>; but that does not seem like a big problem. In passing, buy back a few cycles in parsing of large XML documents by not doing strlen() of the whole input in parse_xml_decl(). Back-patch because dump/restore failures are not nice. This change shouldn't break any cases that worked before, so it seems safe to back-patch. Chapman Flack (revised a bit by me) Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com
2019-03-23Add option -N/--no-sync to pg_checksumsMichael Paquier
This is an option consistent with what pg_dump, pg_rewind and pg_basebackup provide which is useful for leveraging the I/O effort when testing things, not to be used in a production environment. Author: Michael Paquier Reviewed-by: Michael Banck, Fabien Coelho, Sergei Kornilov Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
2019-03-23Add options to enable and disable checksums in pg_checksumsMichael Paquier
An offline cluster can now work with more modes in pg_checksums: - --enable enables checksums in a cluster, updating all blocks with a correct checksum, and updating the control file at the end. - --disable disables checksums in a cluster, updating only the control file. - --check is an extra option able to verify checksums for a cluster, and the default used if no mode is specified. When running --enable or --disable, the data folder gets fsync'd for durability, and then it is followed by a control file update and flush to keep the operation consistent should the tool be interrupted, killed or the host unplugged. If no mode is specified in the options, then --check is used for compatibility with older versions of pg_checksums (named pg_verify_checksums in v11 where it was introduced). Author: Michael Banck, Michael Paquier Reviewed-by: Fabien Coelho, Magnus Hagander, Sergei Kornilov Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
2019-03-22Collations with nondeterministic comparisonPeter Eisentraut
This adds a flag "deterministic" to collations. If that is false, such a collation disables various optimizations that assume that strings are equal only if they are byte-wise equal. That then allows use cases such as case-insensitive or accent-insensitive comparisons or handling of strings with different Unicode normal forms. This functionality is only supported with the ICU provider. At least glibc doesn't appear to have any locales that work in a nondeterministic way, so it's not worth supporting this for the libc provider. The term "deterministic comparison" in this context is from Unicode Technical Standard #10 (https://unicode.org/reports/tr10/#Deterministic_Comparison). This patch makes changes in three areas: - CREATE COLLATION DDL changes and system catalog changes to support this new flag. - Many executor nodes and auxiliary code are extended to track collations. Previously, this code would just throw away collation information, because the eventually-called user-defined functions didn't use it since they only cared about equality, which didn't need collation information. - String data type functions that do equality comparisons and hashing are changed to take the (non-)deterministic flag into account. For comparison, this just means skipping various shortcuts and tie breakers that use byte-wise comparison. For hashing, we first need to convert the input string to a canonical "sort key" using the ICU analogue of strxfrm(). Reviewed-by: Daniel Verite <daniel@manitou-mail.org> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/flat/1ccc668f-4cbc-0bef-af67-450b47cdfee7@2ndquadrant.com
2019-03-21Add DNS SRV support for LDAP server discovery.Thomas Munro
LDAP servers can be advertised on a network with RFC 2782 DNS SRV records. The OpenLDAP command-line tools automatically try to find servers that way, if no server name is provided by the user. Teach PostgreSQL to do the same using OpenLDAP's support functions, when building with OpenLDAP. For now, we assume that HAVE_LDAP_INITIALIZE (an OpenLDAP extension available since OpenLDAP 2.0 and also present in Apple LDAP) implies that you also have ldap_domain2hostlist() (which arrived in the same OpenLDAP version and is also present in Apple LDAP). Author: Thomas Munro Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/CAEepm=2hAnSfhdsd6vXsM6VZVN0br-FbAZ-O+Swk18S5HkCP=A@mail.gmail.com
2019-03-20Allow amcheck to re-find tuples using new search.Peter Geoghegan
Teach contrib/amcheck's bt_index_parent_check() function to take advantage of the uniqueness property of heapkeyspace indexes in support of a new verification option: non-pivot tuples (non-highkey tuples on the leaf level) can optionally be re-found using a new search for each, that starts from the root page. If a tuple cannot be re-found, report that the index is corrupt. The new "rootdescend" verification option is exhaustive, and can therefore make a call to bt_index_parent_check() take a lot longer. Re-finding tuples during verification is mostly intended as an option for backend developers, since the corruption scenarios that it alone is uniquely capable of detecting seem fairly far-fetched. For example, "rootdescend" verification is much more likely to detect corruption of the least significant byte of a key from a pivot tuple in the root page of a B-Tree that already has at least three levels. Typically, only a few tuples on a cousin leaf page are at risk of "getting overlooked" by index scans in this scenario. The corrupt key in the root page is only slightly corrupt: corrupt enough to give wrong answers to some queries, and yet not corrupt enough to allow the problem to be detected without verifying agreement between the leaf page and the root page, skipping at least one internal page level. The existing bt_index_parent_check() checks never cross more than a single level. Author: Peter Geoghegan Reviewed-By: Heikki Linnakangas Discussion: https://postgr.es/m/CAH2-Wz=yTWnVu+HeHGKb2AGiADL9eprn-cKYAto4MkKOuiGtRQ@mail.gmail.com
2019-03-20Make heap TID a tiebreaker nbtree index column.Peter Geoghegan
Make nbtree treat all index tuples as having a heap TID attribute. Index searches can distinguish duplicates by heap TID, since heap TID is always guaranteed to be unique. This general approach has numerous benefits for performance, and is prerequisite to teaching VACUUM to perform "retail index tuple deletion". Naively adding a new attribute to every pivot tuple has unacceptable overhead (it bloats internal pages), so suffix truncation of pivot tuples is added. This will usually truncate away the "extra" heap TID attribute from pivot tuples during a leaf page split, and may also truncate away additional user attributes. This can increase fan-out, especially in a multi-column index. Truncation can only occur at the attribute granularity, which isn't particularly effective, but works well enough for now. A future patch may add support for truncating "within" text attributes by generating truncated key values using new opclass infrastructure. Only new indexes (BTREE_VERSION 4 indexes) will have insertions that treat heap TID as a tiebreaker attribute, or will have pivot tuples undergo suffix truncation during a leaf page split (on-disk compatibility with versions 2 and 3 is preserved). Upgrades to version 4 cannot be performed on-the-fly, unlike upgrades from version 2 to version 3. contrib/amcheck continues to work with version 2 and 3 indexes, while also enforcing stricter invariants when verifying version 4 indexes. These stricter invariants are the same invariants described by "3.1.12 Sequencing" from the Lehman and Yao paper. A later patch will enhance the logic used by nbtree to pick a split point. This patch is likely to negatively impact performance without smarter choices around the precise point to split leaf pages at. Making these two mostly-distinct sets of enhancements into distinct commits seems like it might clarify their design, even though neither commit is particularly useful on its own. The maximum allowed size of new tuples is reduced by an amount equal to the space required to store an extra MAXALIGN()'d TID in a new high key during leaf page splits. The user-facing definition of the "1/3 of a page" restriction is already imprecise, and so does not need to be revised. However, there should be a compatibility note in the v12 release notes. Author: Peter Geoghegan Reviewed-By: Heikki Linnakangas, Alexander Korotkov Discussion: https://postgr.es/m/CAH2-WzkVb0Kom=R+88fDFb=JSxZMFvbHVC6Mn9LJ2n=X=kS-Uw@mail.gmail.com
2019-03-19Implement OR REPLACE option for CREATE AGGREGATE.Andrew Gierth
Aggregates have acquired a dozen or so optional attributes in recent years for things like parallel query and moving-aggregate mode; the lack of an OR REPLACE option to add or change these for an existing agg makes extension upgrades gratuitously hard. Rectify.
2019-03-18Fix typos in sgml docs about RefetchForeignRow().Andres Freund
I screwed this up in ad0bda5d24e. Reported-By: Jie Zhang, Michael Paquier, Etsuro Fujita Discussion: https://postgr.es/m/1396E95157071C4EBBA51892C5368521017F2DA203@G08CNEXMBPEKD02.g08.fujitsu.local
2019-03-16Numeric error suppression in jsonpathAlexander Korotkov
Add support of numeric error suppression to jsonpath as it's required by standard. This commit doesn't use PG_TRY()/PG_CATCH() in order to implement that. Instead, it provides internal versions of numeric functions used, which support error suppression. Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com Author: Alexander Korotkov, Nikita Glukhov Reviewed-by: Tomas Vondra
2019-03-16Partial implementation of SQL/JSON path languageAlexander Korotkov
SQL 2016 standards among other things contains set of SQL/JSON features for JSON processing inside of relational database. The core of SQL/JSON is JSON path language, allowing access parts of JSON documents and make computations over them. This commit implements partial support JSON path language as separate datatype called "jsonpath". The implementation is partial because it's lacking datetime support and suppression of numeric errors. Missing features will be added later by separate commits. Support of SQL/JSON features requires implementation of separate nodes, and it will be considered in subsequent patches. This commit includes following set of plain functions, allowing to execute jsonpath over jsonb values: * jsonb_path_exists(jsonb, jsonpath[, jsonb, bool]), * jsonb_path_match(jsonb, jsonpath[, jsonb, bool]), * jsonb_path_query(jsonb, jsonpath[, jsonb, bool]), * jsonb_path_query_array(jsonb, jsonpath[, jsonb, bool]). * jsonb_path_query_first(jsonb, jsonpath[, jsonb, bool]). This commit also implements "jsonb @? jsonpath" and "jsonb @@ jsonpath", which are wrappers over jsonpath_exists(jsonb, jsonpath) and jsonpath_predicate(jsonb, jsonpath) correspondingly. These operators will have an index support (implemented in subsequent patches). Catversion bumped, to add new functions and operators. Code was written by Nikita Glukhov and Teodor Sigaev, revised by me. Documentation was written by Oleg Bartunov and Liudmila Mantrova. The work was inspired by Oleg Bartunov. Discussion: https://postgr.es/m/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com Author: Nikita Glukhov, Teodor Sigaev, Alexander Korotkov, Oleg Bartunov, Liudmila Mantrova Reviewed-by: Tomas Vondra, Andrew Dunstan, Pavel Stehule, Alexander Korotkov
2019-03-15Enable parallel query with SERIALIZABLE isolation.Thomas Munro
Previously, the SERIALIZABLE isolation level prevented parallel query from being used. Allow the two features to be used together by sharing the leader's SERIALIZABLEXACT with parallel workers. An extra per-SERIALIZABLEXACT LWLock is introduced to make it safe to share, and new logic is introduced to coordinate the early release of the SERIALIZABLEXACT required for the SXACT_FLAG_RO_SAFE optimization, as follows: The first backend to observe the SXACT_FLAG_RO_SAFE flag (set by some other transaction) will 'partially release' the SERIALIZABLEXACT, meaning that the conflicts and locks it holds are released, but the SERIALIZABLEXACT itself will remain active because other backends might still have a pointer to it. Whenever any backend notices the SXACT_FLAG_RO_SAFE flag, it clears its own MySerializableXact variable and frees local resources so that it can skip SSI checks for the rest of the transaction. In the special case of the leader process, it transfers the SERIALIZABLEXACT to a new variable SavedSerializableXact, so that it can be completely released at the end of the transaction after all workers have exited. Remove the serializable_okay flag added to CreateParallelContext() by commit 9da0cc35, because it's now redundant. Author: Thomas Munro Reviewed-by: Haribabu Kommi, Robert Haas, Masahiko Sawada, Kevin Grittner Discussion: https://postgr.es/m/CAEepm=0gXGYhtrVDWOTHS8SQQy_=S9xo+8oCxGLWZAOoeJ=yzQ@mail.gmail.com
2019-03-15During pg_upgrade, conditionally skip transfer of FSMs.Amit Kapila
If a heap on the old cluster has 4 pages or fewer, and the old cluster was PG v11 or earlier, don't copy or link the FSM. This will shrink space usage for installations with large numbers of small tables. This will allow pg_upgrade to take advantage of commit b0eaa4c51b where we have avoided creation of the free space map for small heap relations. Author: John Naylor Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CACPNZCu4cOdm3uGnNEGXivy7Gz8UWyQjynDpdkPGabQ18_zK6g%40mail.gmail.com
2019-03-14Sync commentary in transam.h and bki.sgml.Tom Lane
Commit a6417078c missed updating some comments in transam.h about reservation of high OIDs for development purposes. Also tamp down an over-optimistic comment there about how easy it'd be to change FirstNormalObjectId. Earlier, commit 09568ec3d failed to update bki.sgml for the split between genbki.pl-assigned OIDs and those assigned during initdb. Also fix genbki.pl so that it will complain if it overruns that split. It's possible that doing so would have no very bad consequences, but that's no excuse for not detecting it.
2019-03-14Use condition variables to wait for checkpoints.Thomas Munro
Previously we used a polling/sleeping loop to wait for checkpoints to begin and end, which leads to up to a couple hundred milliseconds of needless thumb-twiddling. Use condition variables instead. Author: Thomas Munro Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CA%2BhUKGLY7sDe%2Bbg1K%3DbnEzOofGoo4bJHYh9%2BcDCXJepb6DQmLw%40mail.gmail.com
2019-03-13Allow ALTER TABLE .. SET NOT NULL to skip provably unnecessary scans.Robert Haas
If existing CHECK or NOT NULL constraints preclude the presence of nulls, we need not look to see whether any are present. Sergei Kornilov, reviewed by Stephen Frost, Ildar Musin, David Rowley, and by me. Discussion: http://postgr.es/m/81911511895540@web58j.yandex.ru
2019-03-13Remove extra commaMagnus Hagander
Author: Christoph Berg <myon@debian.org>
2019-03-13Rename pg_verify_checksums to pg_checksumsMichael Paquier
The current tool name is too restrictive and focuses only on verifying checksums. As more options to control checksums for an offline cluster are planned to be added, switch to a more generic name. Documentation as well as all past references to the tool are updated. Author: Michael Paquier Reviewed-by: Michael Banck, Fabien Coelho, Seigei Kornilov Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
2019-03-12Add support for hyperbolic functions, as well as log10().Tom Lane
The SQL:2016 standard adds support for the hyperbolic functions sinh(), cosh(), and tanh(). POSIX has long required libm to provide those functions as well as their inverses asinh(), acosh(), atanh(). Hence, let's just expose the libm functions to the SQL level. As with the trig functions, we only implement versions for float8, not numeric. For the moment, we'll assume that all platforms actually do have these functions; if experience teaches otherwise, some autoconf effort may be needed. SQL:2016 also adds support for base-10 logarithm, but with the function name log10(), whereas the name we've long used is log(). Add aliases named log10() for the float8 and numeric versions. Lætitia Avrot Discussion: https://postgr.es/m/CAB_COdguG22LO=rnxDQ2DW1uzv8aQoUzyDQNJjrR4k00XSgm5w@mail.gmail.com
2019-03-12Remove remaining hard-wired OID references in the initial catalog data.Tom Lane
In the v11-era commits that taught genbki.pl to resolve symbolic OID references in the initial catalog data, we didn't bother to make every last reference symbolic; some of the catalogs have so few initial rows that it didn't seem worthwhile. However, the new project policy that OIDs assigned by new patches should be automatically renumberable changes this calculus. A patch that wants to add a row in one of these catalogs would have a problem when the OID it assigns gets renumbered. Hence, do the mop-up work needed to make all OID references in initial data be symbolic, and establish an associated project policy that we'll never again write a hard-wired OID reference there. No catversion bump since the contents of postgres.bki aren't actually changed by this commit. Discussion: https://postgr.es/m/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-=eaochT0dd2BN9CQ@mail.gmail.com
2019-03-12Create a script that can renumber manually-assigned OIDs.Tom Lane
This commit adds a Perl script renumber_oids.pl, which can reassign a range of manually-assigned OIDs to someplace else by modifying OID fields of the catalog *.dat files and OID-assigning macros in the catalog *.h files. Up to now, we've encouraged new patches that need manually-assigned OIDs to use OIDs just above the range of existing OIDs. Predictably, this leads to patches stepping on each others' toes, as whichever one gets committed first creates an OID conflict that other patch author(s) have to resolve manually. With the availability of renumber_oids.pl, we can eliminate a lot of this hassle. The new project policy, therefore, is: * Encourage new patches to use high OIDs (the documentation suggests choosing a block of OIDs at random in 8000..9999). * After feature freeze in each development cycle, run renumber_oids.pl to move all such OIDs down to lower numbers, thus freeing the high OID range for the next development cycle. This plan should greatly reduce the risk of OID collisions between concurrently-developed patches. Also, if such a collision happens anyway, we have the option to resolve it without much effort by doing an off-schedule OID renumbering to get the first-committed patch out of the way. Or a patch author could use renumber_oids.pl to change their patch's assignments without much pain. This approach does put a premium on not hard-wiring any OID values in places where renumber_oids.pl and genbki.pl can't fix them. Project practice in that respect seems to be pretty good already, but a follow-on patch will sand down some rough edges. John Naylor and Tom Lane, per an idea of Peter Geoghegan's Discussion: https://postgr.es/m/CAH2-WzmMTGMcPuph4OvsO7Ykut0AOCF_i-=eaochT0dd2BN9CQ@mail.gmail.com
2019-03-11Allow fractional input values for integer GUCs, and improve rounding logic.Tom Lane
Historically guc.c has just refused examples like set work_mem = '30.1GB', but it seems more useful for it to take that and round off the value to some reasonable approximation of what the user said. Just rounding to the parameter's native unit would work, but it would lead to rather silly-looking settings, such as 31562138kB for this example. Instead let's round to the nearest multiple of the next smaller unit (if any), producing 30822MB. Also, do the units conversion math in floating point and round to integer (if needed) only at the end. This produces saner results for inputs that aren't exact multiples of the parameter's native unit, and removes another difference in the behavior for integer vs. float parameters. In passing, document the ability to use hex or octal input where it ought to be documented. Discussion: https://postgr.es/m/1798.1552165479@sss.pgh.pa.us
2019-03-11Tweak wording on VARIADIC array doc patch.Andrew Dunstan
Per suggestion from Tom Lane.
2019-03-11Document incompatibility of comparison expressions with VARIADIC array argumentsAndrew Dunstan
COALESCE, GREATEST and LEAST all look like functions taking variable numbers of arguments, but in fact they are not functions, and so VARIADIC array arguments don't work with them. Add a note to the docs explaining this fact. The consensus is not to try to make this work, but just to document the limitation. Discussion: https://postgr.es/m/CAFj8pRCaAtuXuRtvXf5GmPbAVriUQrNMo7-=TXUFN025S31R_w@mail.gmail.com
2019-03-11pgbench: increase the maximum number of variables/argumentsAndrew Dunstan
pgbench's arbitrary limit of 10 arguments for SQL statements or metacommands is far too low. Increase it to 256. This results in a very modest increase in memory usage, not enough to worry about. The maximum includes the SQL statement or metacommand. This is reflected in the comments and revised TAP tests. Simon Riggs and Dagfinn Ilmari Mannsåker with some light editing by me. Reviewed by: David Rowley and Fabien Coelho Discussion: https://postgr.es/m/CANP8+jJiMJOAf-dLoHuR-8GENiK+eHTY=Omw38Qx7j2g0NDTXA@mail.gmail.com
2019-03-10Fix documentation on partitioning vs. foreign tablesAlvaro Herrera
1. The PARTITION OF clause of CREATE FOREIGN TABLE was not explained in the CREATE FOREIGN TABLE reference page. Add it. (Postgres 10 onwards) 2. The limitation that tuple routing cannot target partitions that are foreign tables was not documented clearly enough. Improve wording. (Postgres 10 onwards) 3. The UPDATE tuple re-routing concurrency behavior was explained in the DDL chapter, which doesn't seem the right place. Move it to the UPDATE reference page instead. (Postgres 11 onwards). Authors: Amit Langote, David Rowley. Reviewed-by: Etsuro Fujita. Reported-by: Derek Hans Discussion: https://postgr.es/m/CAGrP7a3Xc1Qy_B2WJcgAD8uQTS_NDcJn06O5mtS_Ne1nYhBsyw@mail.gmail.com
2019-03-10Reduce the default value of autovacuum_vacuum_cost_delay to 2ms.Tom Lane
This is a better way to implement the desired change of increasing autovacuum's default resource consumption. Discussion: https://postgr.es/m/28720.1552101086@sss.pgh.pa.us
2019-03-10Revert "Increase the default vacuum_cost_limit from 200 to 2000"Tom Lane
This reverts commit bd09503e633b8077822bb4daf91625b71ac16253. Per discussion, it seems like what we should do instead is to reduce the default value of autovacuum_vacuum_cost_delay by the same factor. That's functionally equivalent as long as the platform can accurately service the smaller delay request, which should be true on anything released in the last 10 years or more. And smaller, more-closely-spaced delays are better in terms of providing a steady I/O load. Discussion: https://postgr.es/m/28720.1552101086@sss.pgh.pa.us
2019-03-10Convert [autovacuum_]vacuum_cost_delay into floating-point GUCs.Tom Lane
This change makes it possible to specify sub-millisecond delays, which work well on most modern platforms, though that was not true when the cost-delay feature was designed. To support this without breaking existing configuration entries, improve guc.c to allow floating-point GUCs to have units. Also, allow "us" (microseconds) as an input/output unit for time-unit GUCs. (It's not allowed as a base unit, at least not yet.) Likewise change the autovacuum_vacuum_cost_delay reloption to be floating-point; this forces a catversion bump because the layout of StdRdOptions changes. This patch doesn't in itself change the default values or allowed ranges for these parameters, and it should not affect the behavior for any already-allowed setting for them. Discussion: https://postgr.es/m/1798.1552165479@sss.pgh.pa.us
2019-03-10Support for INCLUDE attributes in GiST indexesAlexander Korotkov
Similarly to B-tree, GiST index access method gets support of INCLUDE attributes. These attributes aren't used for tree navigation and aren't present in non-leaf pages. But they are present in leaf pages and can be fetched during index-only scan. The point of having INCLUDE attributes in GiST indexes is slightly different from the point of having them in B-tree. The main point of INCLUDE attributes in B-tree is to define UNIQUE constraint over part of attributes enabled for index-only scan. In GiST the main point of INCLUDE attributes is to use index-only scan for attributes, whose data types don't have GiST opclasses. Discussion: https://postgr.es/m/73A1A452-AD5F-40D4-BD61-978622FF75C1%40yandex-team.ru Author: Andrey Borodin, with small changes by me Reviewed-by: Andreas Karlsson
2019-03-09Simplify release-note links to back branches.Tom Lane
Now that https://www.postgresql.org/docs/release/ is populated, replace the stopgap text we had under "Prior Releases" with a pointer to that archive. Discussion: https://postgr.es/m/e0f09c9a-bd2b-862a-d379-601dfabc8969@postgresql.org
2019-03-09Add new clientcert hba option verify-fullMagnus Hagander
This allows a login to require both that the cn of the certificate matches (like authentication type cert) *and* that another authentication method (such as password or kerberos) succeeds as well. The old value of clientcert=1 maps to the new clientcert=verify-ca, clientcert=0 maps to the new clientcert=no-verify, and the new option erify-full will add the validation of the CN. Author: Julian Markwort, Marius Timmer Reviewed by: Magnus Hagander, Thomas Munro
2019-03-09Track block level checksum failures in pg_stat_databaseMagnus Hagander
This adds a column that counts how many checksum failures have occurred on files belonging to a specific database. Both checksum failures during normal backend processing and those created when a base backup detects a checksum failure are counted. Author: Magnus Hagander Reviewed by: Julien Rouhaud
2019-03-08Fix function signatures of pageinspect in documentationMichael Paquier
tuple_data_split() lacked the type of the first argument, and heap_page_item_attrs() has reversed the first and second argument, with the bytea argument using an incorrect name. Author: Laurenz Albe Discussion: https://postgr.es/m/8f9ab7b16daf623e87eeef5203a4ffc0dece8dfd.camel@cybertec.at Backpatch-through: 9.6
2019-03-07Fix minor deficiencies in XMLTABLE, xpath(), xmlexists()Alvaro Herrera
Correctly process nodes of more types than previously. In some cases, nodes were being ignored (nothing was output); in other cases, trying to return them resulted in errors about unrecognized nodes. In yet other cases, necessary escaping (of XML special characters) was not being done. Fix all those (as far as the authors could find) and add regression tests cases verifying the new behavior. I (Álvaro) was of two minds about backpatching these changes. They do seem bugfixes that would benefit most users of the affected functions; but on the other hand it would change established behavior in minor releases, so it seems prudent not to. Authors: Pavel Stehule, Markus Winand, Chapman Flack Discussion: https://postgr.es/m/CAFj8pRA6J25CtAZ2TuRvxK3gat7-bBUYh0rfE2yM7Hj9GD14Dg@mail.gmail.com https://postgr.es/m/8BDB0627-2105-4564-AA76-7849F028B96E@winand.at The elephant in the room as pointed out by Chapman Flack, not fixed in this commit, is that we still have XMLTABLE operating on XPath 1.0 instead of the standard-mandated XQuery (or even its subset XPath 2.0). Fixing that is a major undertaking, however.
2019-03-07Allow ATTACH PARTITION with only ShareUpdateExclusiveLock.Robert Haas
We still require AccessExclusiveLock on the partition itself, because otherwise an insert that violates the newly-imposed partition constraint could be in progress at the same time that we're changing that constraint; only the lock level on the parent relation is weakened. To make this safe, we have to cope with (at least) three separate problems. First, relevant DDL might commit while we're in the process of building a PartitionDesc. If so, find_inheritance_children() might see a new partition while the RELOID system cache still has the old partition bound cached, and even before invalidation messages have been queued. To fix that, if we see that the pg_class tuple seems to be missing or to have a null relpartbound, refetch the value directly from the table. We can't get the wrong value, because DETACH PARTITION still requires AccessExclusiveLock throughout; if we ever want to change that, this will need more thought. In testing, I found it quite difficult to hit even the null-relpartbound case; the race condition is extremely tight, but the theoretical risk is there. Second, successive calls to RelationGetPartitionDesc might not return the same answer. The query planner will get confused if lookup up the PartitionDesc for a particular relation does not return a consistent answer for the entire duration of query planning. Likewise, query execution will get confused if the same relation seems to have a different PartitionDesc at different times. Invent a new PartitionDirectory concept and use it to ensure consistency. This ensures that a single invocation of either the planner or the executor sees the same view of the PartitionDesc from beginning to end, but it does not guarantee that the planner and the executor see the same view. Since this allows pointers to old PartitionDesc entries to survive even after a relcache rebuild, also postpone removing the old PartitionDesc entry until we're certain no one is using it. For the most part, it seems to be OK for the planner and executor to have different views of the PartitionDesc, because the executor will just ignore any concurrently added partitions which were unknown at plan time; those partitions won't be part of the inheritance expansion, but invalidation messages will trigger replanning at some point. Normally, this happens by the time the very next command is executed, but if the next command acquires no locks and executes a prepared query, it can manage not to notice until a new transaction is started. We might want to tighten that up, but it's material for a separate patch. There would still be a small window where a query that started just after an ATTACH PARTITION command committed might fail to notice its results -- but only if the command starts before the commit has been acknowledged to the user. All in all, the warts here around serializability seem small enough to be worth accepting for the considerable advantage of being able to add partitions without a full table lock. Although in general the consequences of new partitions showing up between planning and execution are limited to the query not noticing the new partitions, run-time partition pruning will get confused in that case, so that's the third problem that this patch fixes. Run-time partition pruning assumes that indexes into the PartitionDesc are stable between planning and execution. So, add code so that if new partitions are added between plan time and execution time, the indexes stored in the subplan_map[] and subpart_map[] arrays within the plan's PartitionedRelPruneInfo get adjusted accordingly. There does not seem to be a simple way to generalize this scheme to cope with partitions that are removed, mostly because they could then get added back again with different bounds, but it works OK for added partitions. This code does not try to ensure that every backend participating in a parallel query sees the same view of the PartitionDesc. That currently doesn't matter, because we never pass PartitionDesc indexes between backends. Each backend will ignore the concurrently added partitions which it notices, and it doesn't matter if different backends are ignoring different sets of concurrently added partitions. If in the future that matters, for example because we allow writes in parallel query and want all participants to do tuple routing to the same set of partitions, the PartitionDirectory concept could be improved to share PartitionDescs across backends. There is a draft patch to serialize and restore PartitionDescs on the thread where this patch was discussed, which may be a useful place to start. Patch by me. Thanks to Alvaro Herrera, David Rowley, Simon Riggs, Amit Langote, and Michael Paquier for discussion, and to Alvaro Herrera for some review. Discussion: http://postgr.es/m/CA+Tgmobt2upbSocvvDej3yzokd7AkiT+PvgFH+a9-5VV1oJNSQ@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoZE0r9-cyA-aY6f8WFEROaDLLL7Vf81kZ8MtFCkxpeQSw@mail.gmail.com Discussion: http://postgr.es/m/CA+TgmoY13KQZF-=HNTrt9UYWYx3_oYOQpu9ioNT49jGgiDpUEA@mail.gmail.com
2019-03-07Fix broken markupAlvaro Herrera
2019-03-07Fix the BY {REF,VALUE} clause of XMLEXISTS/XMLTABLEAlvaro Herrera
This clause is used to indicate the passing mode of a XML document, but we were doing it wrong: we accepted BY REF and ignored it, and rejected BY VALUE as a syntax error. The reality, however, is that documents are always passed BY VALUE, so rejecting that clause was silly. Change things so that we accept BY VALUE. BY REF continues to be accepted, and continues to be ignored. Author: Chapman Flack Reviewed-by: Pavel Stehule Discussion: https://postgr.es/m/5C297BB7.9070509@anastigmatix.net
2019-03-07pg_dump: allow multiple rows per insertAlvaro Herrera
This is useful to speed up loading data in a different database engine. Authors: Surafel Temesgen and David Rowley. Lightly edited by Álvaro. Reviewed-by: Fabien Coelho Discussion: https://postgr.es/m/CALAY4q9kumSdnRBzvRJvSRf2+BH20YmSvzqOkvwpEmodD-xv6g@mail.gmail.com
2019-03-06tableam: introduce table AM infrastructure.Andres Freund
This introduces the concept of table access methods, i.e. CREATE ACCESS METHOD ... TYPE TABLE and CREATE TABLE ... USING (storage-engine). No table access functionality is delegated to table AMs as of this commit, that'll be done in following commits. Subsequent commits will incrementally abstract table access functionality to be routed through table access methods. That change is too large to be reviewed & committed at once, so it'll be done incrementally. Docs will be updated at the end, as adding them incrementally would likely make them less coherent, and definitely is a lot more work, without a lot of benefit. Table access methods are specified similar to index access methods, i.e. pg_am.amhandler returns, as INTERNAL, a pointer to a struct with callbacks. In contrast to index AMs that struct needs to live as long as a backend, typically that's achieved by just returning a pointer to a constant struct. Psql's \d+ now displays a table's access method. That can be disabled with HIDE_TABLEAM=true, which is mainly useful so regression tests can be run against different AMs. It's quite possible that this behaviour still needs to be fine tuned. For now it's not allowed to set a table AM for a partitioned table, as we've not resolved how partitions would inherit that. Disallowing allows us to introduce, if we decide that's the way forward, such a behaviour without a compatibility break. Catversion bumped, to add the heap table AM and references to it. Author: Haribabu Kommi, Andres Freund, Alvaro Herrera, Dimitri Golgov and others Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql https://postgr.es/m/20190107235616.6lur25ph22u5u5av@alap3.anarazel.de https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
2019-03-06Increase the default vacuum_cost_limit from 200 to 2000Andrew Dunstan
The original 200 default value was set back in f425b605f4e when the cost delay settings were first added. Hardware has improved quite a bit since then and we've also made improvements such as sorting buffers during checkpoints (9cd00c457e6) which should result in less random writes. This low default value was reportedly causing problems with badly configured servers and in the absence of a native method to remove excessive bloat from tables without incurring an AccessExclusiveLock, this often made cleaning up the damage caused by badly configured auto-vacuums difficult. It seems more likely that someone will notice that auto-vacuum is running too quickly than too slowly, so let's go all out and multiple the default value for the setting by 10. With the default vacuum_cost_page_dirty and autovacuum_vacuum_cost_delay (assuming a page size of 8192 bytes), this allows autovacuum a theoretical maximum dirty write rate of around 39MB/s instead of just 3.9MB/s. Author: David Rowley Discussion: https://postgr.es/m/CAKJS1f_YbXC2qTMPyCbmsPiKvZYwpuQNQMohiRXLj1r=8_rYvw@mail.gmail.com
2019-03-04pg_partition_ancestorsAlvaro Herrera
Adds another introspection feature for partitioning, necessary for further psql patches. Reviewed-by: Michaël Paquier Discussion: https://postgr.es/m/20190226222757.GA31622@alvherre.pgsql