summaryrefslogtreecommitdiff
path: root/src/backend/commands/copy.c
AgeCommit message (Collapse)Author
2017-05-16Fix relcache leak when row triggers on partitions are fired by COPY.Robert Haas
Thomas Munro, reviewed by Amit Langote Discussion: http://postgr.es/m/CAEepm=15Jss-yhFApuKzxcoCuFnb8TR8iQiWMjG=CLYPx48QLw@mail.gmail.com
2017-05-10Revert "Permit dump/reload of not-too-large >1GB tuples"Alvaro Herrera
This reverts commits fa2fa9955280 and 42f50cb8fa98. While the functionality that was intended to be provided by these commits is desired, the patch didn't actually solve as many of the problematic situations as we hoped, and it created a bunch of its own problems. Since we're going to require more extensive changes soon for other reasons and users have been working around these problems for a long time already, there is no point in spending effort in fixing this halfway measure. Per complaint from Tom Lane. Discussion: https://postgr.es/m/21407.1484606922@sss.pgh.pa.us (Commit fa2fa9955280 had already been reverted in branches 9.5 as f858524ee4f and 9.6 as e9e44a0953, so this touches master only. Commit 42f50cb8fa98 was not present in the older branches.)
2017-04-17Set range table for CopyFrom() in tablesyncPeter Eisentraut
CopyFrom() needs a range table for formatting certain errors for constraint violations. This changes the mechanism of how the range table is passed to the CopyFrom() executor state. We used to generate the range table and one entry for the relation manually inside DoCopy(). Now we use addRangeTableEntryForRelation() to setup the range table and relation entry for the ParseState, which is then passed down by BeginCopyFrom(). Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Reported-by: Euler Taveira <euler@timbira.com.br>
2017-04-10Improve castNode notation by introducing list-extraction-specific variants.Tom Lane
This extends the castNode() notation introduced by commit 5bcab1114 to provide, in one step, extraction of a list cell's pointer and coercion to a concrete node type. For example, "lfirst_node(Foo, lc)" is the same as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode that have appeared so far include a list extraction call, so this is pretty widely useful, and it saves a few more keystrokes compared to the old way. As with the previous patch, back-patch the addition of these macros to pg_list.h, so that the notation will be available when back-patching. Patch by me, after an idea of Andrew Gierth's. Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
2017-04-10Fix reporting of violations in ExecConstraints, again.Robert Haas
We decided in f1b4c771ea74f42447dccaed42ffcdcccf3aa694 to pass the original slot to ExecConstraints(), but that breaks when there are BEFORE ROW triggers involved. So we need to do reverse-map the tuples back to the original descriptor instead, as Amit originally proposed. Amit Langote, reviewed by Ashutosh Bapat. One overlooked comment fixed by me. Discussion: http://postgr.es/m/b3a17254-6849-e542-2353-bde4e880b6a4@lab.ntt.co.jp
2017-03-31Add infrastructure to support EphemeralNamedRelation references.Kevin Grittner
A QueryEnvironment concept is added, which allows new types of objects to be passed into queries from parsing on through execution. At this point, the only thing implemented is a collection of EphemeralNamedRelation objects -- relations which can be referenced by name in queries, but do not exist in the catalogs. The only type of ENR implemented is NamedTuplestore, but provision is made to add more types fairly easily. An ENR can carry its own TupleDesc or reference a relation in the catalogs by relid. Although these features can be used without SPI, convenience functions are added to SPI so that ENRs can easily be used by code run through SPI. The initial use of all this is going to be transition tables in AFTER triggers, but that will be added to each PL as a separate commit. An incidental effect of this patch is to produce a more informative error message if an attempt is made to modify the contents of a CTE from a referencing DML statement. No tests previously covered that possibility, so one is added. Kevin Grittner and Thomas Munro Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro with valuable comments and suggestions from many others
2017-03-28Cast result of copyObject() to correct typePeter Eisentraut
copyObject() is declared to return void *, which allows easily assigning the result independent of the input, but it loses all type checking. If the compiler supports typeof or something similar, cast the result to the input type. This creates a greater amount of type safety. In some cases, where the result is assigned to a generic type such as Node * or Expr *, new casts are now necessary, but in general casts are now unnecessary in the normal case and indicate that something unusual is happening. Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
2017-03-23Allow for parallel execution whenever ExecutorRun() is done only once.Robert Haas
Previously, it was unsafe to execute a plan in parallel if ExecutorRun() might be called with a non-zero row count. However, it's quite easy to fix things up so that we can support that case, provided that it is known that we will never call ExecutorRun() a second time for the same QueryDesc. Add infrastructure to signal this, and cross-checks to make sure that a caller who claims this is true doesn't later reneg. While that pattern never happens with queries received directly from a client -- there's no way to know whether multiple Execute messages will be sent unless the first one requests all the rows -- it's pretty common for queries originating from procedural languages, which often limit the result to a single tuple or to a user-specified number of tuples. This commit doesn't actually enable parallelism in any additional cases, because currently none of the places that would be able to benefit from this infrastructure pass CURSOR_OPT_PARALLEL_OK in the first place, but it makes it much more palatable to pass CURSOR_OPT_PARALLEL_OK in places where we currently don't, because it eliminates some cases where we'd end up having to run the parallel plan serially. Patch by me, based on some ideas from Rafia Sabih and corrected by Rafia Sabih based on feedback from Dilip Kumar and myself. Discussion: http://postgr.es/m/CA+TgmobXEhvHbJtWDuPZM9bVSLiTj-kShxQJ2uM5GPDze9fRYA@mail.gmail.com
2017-03-23Logical replication support for initial data copyPeter Eisentraut
Add functionality for a new subscription to copy the initial data in the tables and then sync with the ongoing apply process. For the copying, add a new internal COPY option to have the COPY source data provided by a callback function. The initial data copy works on the subscriber by receiving COPY data from the publisher and then providing it locally into a COPY that writes to the destination table. A WAL receiver can now execute full SQL commands. This is used here to obtain information about tables and publications. Several new options were added to CREATE and ALTER SUBSCRIPTION to control whether and when initial table syncing happens. Change pg_dump option --no-create-subscription-slots to --no-subscription-connect and use the new CREATE SUBSCRIPTION ... NOCONNECT option for that. Author: Petr Jelinek <petr.jelinek@2ndquadrant.com> Tested-by: Erik Rijkers <er@xs4all.nl>
2017-03-14Spelling fixes in code commentsPeter Eisentraut
From: Josh Soref <jsoref@gmail.com>
2017-03-06Avoid dangling pointer to relation name in RLS code path in DoCopy().Tom Lane
With RLS active, "COPY tab TO ..." failed under -DRELCACHE_FORCE_RELEASE, and would sometimes fail without that, because it used the relation name directly from the relcache as part of the parsetree it's building. That becomes a potentially-dangling pointer as soon as the relcache entry is closed, a bit further down. Typical symptom if the relcache entry chanced to get cleared would be "relation does not exist" error with a garbage relation name, or possibly a core dump; but if you were really truly unlucky, the COPY might copy from the wrong table. Per report from Andrew Dunstan that regression tests fail with -DRELCACHE_FORCE_RELEASE. The core tests now pass for me (but have not tried "make check-world" yet). Discussion: https://postgr.es/m/7b52f900-0579-cda9-ae2e-de5da17090e6@2ndQuadrant.com
2017-02-21Make more use of castNode()Peter Eisentraut
2017-01-26Use the new castNode() macro in a number of places.Andres Freund
This is far from a pervasive conversion, but it's a good starting point. Author: Peter Eisentraut, with some minor changes by me Reviewed-By: Tom Lane Discussion: https://postgr.es/m/c5d387d9-3440-f5e0-f9d4-71d53b9fbe52@2ndquadrant.com
2017-01-24Reindent table partitioning code.Robert Haas
We've accumulated quite a bit of stuff with which pgindent is not quite happy in this code; clean it up to provide a less-annoying base for future pgindent runs.
2017-01-24Fix interaction of partitioned tables with BulkInsertState.Robert Haas
When copying into a partitioned table, the target heap may change from one tuple to next. We must ask ReadBufferBI() to get a new buffer every time such change occurs. To do that, use new function ReleaseBulkInsertStatePin(). This fixes the bug that tuples ended up being inserted into the wrong partition, which occurred exactly because the wrong buffer was used. Amit Langote, per a suggestion from Robert Haas. Some cosmetic adjustments by me. Reports by 高增琦 (Gao Zengqi), Venkata B Nagothi, and Ragnar Ouchterlony. Discussion: http://postgr.es/m/CAFmBtr32FDOqofo8yG-4mjzL1HnYHxXK5S9OGFJ%3D%3DcJpgEW4vA%40mail.gmail.com Discussion: http://postgr.es/m/CAEyp7J9WiX0L3DoiNcRrY-9iyw%3DqP%2Bj%3DDLsAnNFF1xT2J1ggfQ%40mail.gmail.com Discussion: http://postgr.es/m/16d73804-c9cd-14c5-463e-5caad563ff77%40agama.tv Discussion: http://postgr.es/m/CA+TgmoaiZpDVUUN8LZ4jv1qFE_QyR+H9ec+79f5vNczYarg5Zg@mail.gmail.com
2017-01-19Remove obsoleted code relating to targetlist SRF evaluation.Andres Freund
Since 69f4b9c plain expression evaluation (and thus normal projection) can't return sets of tuples anymore. Thus remove code dealing with that possibility. This will require adjustments in external code using ExecEvalExpr()/ExecProject() - that should neither be hard nor very common. Author: Andres Freund and Tom Lane Discussion: https://postgr.es/m/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de
2017-01-19Fix failure to enforce partitioning contraint for internal partitions.Robert Haas
When a tuple is inherited into a partitioning root, no partition constraints need to be enforced; when it is inserted into a leaf, the parent's partitioning quals needed to be enforced. The previous coding got both of those cases right. When a tuple is inserted into an intermediate level of the partitioning hierarchy (i.e. a table which is both a partition itself and in turn partitioned), it must enforce the partitioning qual inherited from its parent. That case got overlooked; repair. Amit Langote
2017-01-14Change representation of statement lists, and add statement location info.Tom Lane
This patch makes several changes that improve the consistency of representation of lists of statements. It's always been the case that the output of parse analysis is a list of Query nodes, whatever the types of the individual statements in the list. This patch brings similar consistency to the outputs of raw parsing and planning steps: * The output of raw parsing is now always a list of RawStmt nodes; the statement-type-dependent nodes are one level down from that. * The output of pg_plan_queries() is now always a list of PlannedStmt nodes, even for utility statements. In the case of a utility statement, "planning" just consists of wrapping a CMD_UTILITY PlannedStmt around the utility node. This list representation is now used in Portal and CachedPlan plan lists, replacing the former convention of intermixing PlannedStmts with bare utility-statement nodes. Now, every list of statements has a consistent head-node type depending on how far along it is in processing. This allows changing many places that formerly used generic "Node *" pointers to use a more specific pointer type, thus reducing the number of IsA() tests and casts needed, as well as improving code clarity. Also, the post-parse-analysis representation of DECLARE CURSOR is changed so that it looks more like EXPLAIN, PREPARE, etc. That is, the contained SELECT remains a child of the DeclareCursorStmt rather than getting flipped around to be the other way. It's now true for both Query and PlannedStmt that utilityStmt is non-null if and only if commandType is CMD_UTILITY. That allows simplifying a lot of places that were testing both fields. (I think some of those were just defensive programming, but in many places, it was actually necessary to avoid confusing DECLARE CURSOR with SELECT.) Because PlannedStmt carries a canSetTag field, we're also able to get rid of some ad-hoc rules about how to reconstruct canSetTag for a bare utility statement; specifically, the assumption that a utility is canSetTag if and only if it's the only one in its list. While I see no near-term need for relaxing that restriction, it's nice to get rid of the ad-hocery. The API of ProcessUtility() is changed so that what it's passed is the wrapper PlannedStmt not just the bare utility statement. This will affect all users of ProcessUtility_hook, but the changes are pretty trivial; see the affected contrib modules for examples of the minimum change needed. (Most compilers should give pointer-type-mismatch warnings for uncorrected code.) There's also a change in the API of ExplainOneQuery_hook, to pass through cursorOptions instead of expecting hook functions to know what to pick. This is needed because of the DECLARE CURSOR changes, but really should have been done in 9.6; it's unlikely that any extant hook functions know about using CURSOR_OPT_PARALLEL_OK. Finally, teach gram.y to save statement boundary locations in RawStmt nodes, and pass those through to Query and PlannedStmt nodes. This allows more intelligent handling of cases where a source query string contains multiple statements. This patch doesn't actually do anything with the information, but a follow-on patch will. (Passing this information through cleanly is the true motivation for these changes; while I think this is all good cleanup, it's unlikely we'd have bothered without this end goal.) catversion bump because addition of location fields to struct Query affects stored rules. This patch is by me, but it owes a good deal to Fabien Coelho who did a lot of preliminary work on the problem, and also reviewed the patch. Discussion: https://postgr.es/m/alpine.DEB.2.20.1612200926310.29821@lancre
2017-01-04Fix reporting of constraint violations for table partitioning.Robert Haas
After a tuple is routed to a partition, it has been converted from the root table's row type to the partition's row type. ExecConstraints needs to report the failure using the original tuple and the parent's tuple descriptor rather than the ones for the selected partition. Amit Langote
2017-01-04Move partition_tuple_slot out of EState.Robert Haas
Commit 2ac3ef7a01df859c62d0a02333b646d65eaec5ff added a TupleTapleSlot for partition tuple slot to EState (es_partition_tuple_slot) but it's more logical to have it as part of ModifyTableState (mt_partition_tuple_slot) and CopyState (partition_tuple_slot). Discussion: http://postgr.es/m/1bd459d9-4c0c-197a-346e-e5e59e217d97@lab.ntt.co.jp Amit Langote, per a gripe from me
2017-01-03Update copyright via script for 2017Bruce Momjian
2016-12-22Fix tuple routing in cases where tuple descriptors don't match.Robert Haas
The previous coding failed to work correctly when we have a multi-level partitioned hierarchy where tables at successive levels have different attribute numbers for the partition key attributes. To fix, have each PartitionDispatch object store a standalone TupleTableSlot initialized with the TupleDesc of the corresponding partitioned table, along with a TupleConversionMap to map tuples from the its parent's rowtype to own rowtype. After tuple routing chooses a leaf partition, we must use the leaf partition's tuple descriptor, not the root table's. To that end, a dedicated TupleTableSlot for tuple routing is now allocated in EState. Amit Langote
2016-12-21Refactor partition tuple routing code to reduce duplication.Robert Haas
Amit Langote
2016-12-13Clean up code, comments, and formatting for table partitioning.Robert Haas
Amit Langote, plus pgindent-ing by me. Inspired in part by review comments from Tomas Vondra.
2016-12-07Implement table partitioning.Robert Haas
Table partitioning is like table inheritance and reuses much of the existing infrastructure, but there are some important differences. The parent is called a partitioned table and is always empty; it may not have indexes or non-inherited constraints, since those make no sense for a relation with no data of its own. The children are called partitions and contain all of the actual data. Each partition has an implicit partitioning constraint. Multiple inheritance is not allowed, and partitioning and inheritance can't be mixed. Partitions can't have extra columns and may not allow nulls unless the parent does. Tuples inserted into the parent are automatically routed to the correct partition, so tuple-routing ON INSERT triggers are not needed. Tuple routing isn't yet supported for partitions which are foreign tables, and it doesn't handle updates that cross partition boundaries. Currently, tables can be range-partitioned or list-partitioned. List partitioning is limited to a single column, but range partitioning can involve multiple columns. A partitioning "column" can be an expression. Because table partitioning is less general than table inheritance, it is hoped that it will be easier to reason about properties of partitions, and therefore that this will serve as a better foundation for a variety of possible optimizations, including query planner optimizations. The tuple routing based which this patch does based on the implicit partitioning constraints is an example of this, but it seems likely that many other useful optimizations are also possible. Amit Langote, reviewed and tested by Robert Haas, Ashutosh Bapat, Amit Kapila, Rajkumar Raghuwanshi, Corey Huinker, Jaime Casanova, Rushabh Lathia, Erik Rijkers, among others. Minor revisions by me.
2016-12-02Permit dump/reload of not-too-large >1GB tuplesAlvaro Herrera
Our documentation states that our maximum field size is 1 GB, and that our maximum row size of 1.6 TB. However, while this might be attainable in theory with enough contortions, it is not workable in practice; for starters, pg_dump fails to dump tables containing rows larger than 1 GB, even if individual columns are well below the limit; and even if one does manage to manufacture a dump file containing a row that large, the server refuses to load it anyway. This commit enables dumping and reloading of such tuples, provided two conditions are met: 1. no single column is larger than 1 GB (in output size -- for bytea this includes the formatting overhead) 2. the whole row is not larger than 2 GB There are three related changes to enable this: a. StringInfo's API now has two additional functions that allow creating a string that grows beyond the typical 1GB limit (and "long" string). ABI compatibility is maintained. We still limit these strings to 2 GB, though, for reasons explained below. b. COPY now uses long StringInfos, so that pg_dump doesn't choke trying to emit rows longer than 1GB. c. heap_form_tuple now uses the MCXT_ALLOW_HUGE flag in its allocation for the input tuple, which means that large tuples are accepted on input. Note that at this point we do not apply any further limit to the input tuple size. The main reason to limit to 2 GB is that the FE/BE protocol uses 32 bit length words to describe each row; and because the documentation is ambiguous on its signedness and libpq does consider it signed, we cannot use the highest-order bit. Additionally, the StringInfo API uses "int" (which is 4 bytes wide in most platforms) in many places, so we'd need to change that API too in order to improve, which has lots of fallout. Backpatch to 9.5, which is the oldest that has MemoryContextAllocExtended, a necessary piece of infrastructure. We could apply to 9.4 with very minimal additional effort, but any further than that would require backpatching "huge" allocations too. This is the largest set of changes we could find that can be back-patched without breaking compatibility with existing systems. Fixing a bigger set of problems (for example, dumping tuples bigger than 2GB, or dumping fields bigger than 1GB) would require changing the FE/BE protocol and/or changing the StringInfo API in an ABI-incompatible way, neither of which would be back-patchable. Authors: Daniel Vérité, Álvaro Herrera Reviewed by: Tomas Vondra Discussion: https://postgr.es/m/20160229183023.GA286012@alvherre.pgsql
2016-11-10Support "COPY view FROM" for views with INSTEAD OF INSERT triggers.Tom Lane
We just pass the data to the INSTEAD trigger. Haribabu Kommi, reviewed by Dilip Kumar Patch: <CAJrrPGcSQkrNkO+4PhLm4B8UQQQmU9YVUuqmtgM=pmzMfxWaWQ@mail.gmail.com>
2016-10-11Drop server support for FE/BE protocol version 1.0.Tom Lane
While this isn't a lot of code, it's been essentially untestable for a very long time, because libpq doesn't support anything older than protocol 2.0, and has not since release 6.3. There's no reason to believe any other client-side code still uses that protocol, either. Discussion: <2661.1475849167@sss.pgh.pa.us>
2016-10-03Fix RLS with COPY (col1, col2) FROM tabStephen Frost
Attempting to COPY a subset of columns from a table with RLS enabled would fail due to an invalid query being constructed (using a single ColumnRef with the list of fields to exact in 'fields', but that's for the different levels of an indirection for a single column, not for specifying multiple columns). Correct by building a ColumnRef and then RestTarget for each column being requested and then adding those to the targetList for the select query. Include regression tests to hopefully catch if this is broken again in the future. Patch-By: Adam Brightwell Reviewed-By: Michael Paquier
2016-09-06Add a HINT for client-vs-server COPY failure cases.Tom Lane
Users often get confused between COPY and \copy and try to use client-side paths with COPY. The server then cannot find the file (if remote), or sees a permissions problem (if local), or some variant of that. Emit a hint about this in the most common cases. In future we might want to expand the set of errnos for which the hint gets printed, but be conservative for now. Craig Ringer, reviewed by Christoph Berg and Tom Lane Discussion: <CAMsr+YEqtD97qPEzQDqrCt5QiqPbWP_X4hmvy2pQzWC0GWiyPA@mail.gmail.com>
2016-09-06Add location field to DefElemPeter Eisentraut
Add a location field to the DefElem struct, used to parse many utility commands. Update various error messages to supply error position information. To propogate the error position information in a more systematic way, create a ParseState in standard_ProcessUtility() and pass that to interested functions implementing the utility commands. This seems better than passing the query string and then reassembling a parse state ad hoc, which violates the encapsulation of the ParseState type. Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
2016-08-27Add macros to make AllocSetContextCreate() calls simpler and safer.Tom Lane
I found that half a dozen (nearly 5%) of our AllocSetContextCreate calls had typos in the context-sizing parameters. While none of these led to especially significant problems, they did create minor inefficiencies, and it's now clear that expecting people to copy-and-paste those calls accurately is not a great idea. Let's reduce the risk of future errors by introducing single macros that encapsulate the common use-cases. Three such macros are enough to cover all but two special-purpose contexts; those two calls can be left as-is, I think. While this patch doesn't in itself improve matters for third-party extensions, it doesn't break anything for them either, and they can gradually adopt the simplified notation over time. In passing, change TopMemoryContext to use the default allocation parameters. Formerly it could only be extended 8K at a time. That was probably reasonable when this code was written; but nowadays we create many more contexts than we did then, so that it's not unusual to have a couple hundred K in TopMemoryContext, even without considering various dubious code that sticks other things there. There seems no good reason not to let it use growing blocks like most other contexts. Back-patch to 9.6, mostly because that's still close enough to HEAD that it's easy to do so, and keeping the branches in sync can be expected to avoid some future back-patching pain. The bugs fixed by these changes don't seem to be significant enough to justify fixing them further back. Discussion: <21072.1472321324@sss.pgh.pa.us>
2016-06-09pgindent run for 9.6Robert Haas
2016-06-06Stop the executor if no more tuples can be sent from worker to leader.Robert Haas
If a Gather node has read as many tuples as it needs (for example, due to Limit) it may detach the queue connecting it to the worker before reading all of the worker's tuples. Rather than let the worker continue to generate and send all of the results, have it stop after sending the next tuple. More could be done here to stop the worker even quicker, but this is about as well as we can hope to do for 9.6. This is in response to a problem report from Andreas Seltenreich. Commit 44339b892a04e94bbb472235882dc6f7023bdc65 should be actually be sufficient to fix that example even without this change, but it seems better to do this, too, since we might otherwise waste quite a large amount of effort in one or more workers. Discussion: CAA4eK1KOKGqmz9bGu+Z42qhRwMbm4R5rfnqsLCNqFs9j14jzEA@mail.gmail.com Amit Kapila
2016-01-02Update copyright for 2016Bruce Momjian
Backpatch certain files through 9.1
2015-11-27COPY (INSERT/UPDATE/DELETE .. RETURNING ..)Teodor Sigaev
Attached is a patch for being able to do COPY (query) without a CTE. Author: Marko Tiikkaja Review: Michael Paquier
2015-11-17Message style fixPeter Eisentraut
from Euler Taveira
2015-11-16Message improvementsPeter Eisentraut
2015-11-04Update spelling of COPY optionsPeter Eisentraut
The preferred spelling was changed from FORCE QUOTE to FORCE_QUOTE and the like, but some code was still referring to the old spellings.
2015-10-28Message style improvementsPeter Eisentraut
Message style, plurals, quoting, spelling, consistency with similar messages
2015-09-28COPY: use pg_plan_query() instead of planner()Alvaro Herrera
While at it, trim the includes list in copy.c. The planner headers cannot be removed, but there are a few others that are not of any use.
2015-08-02Fix a number of places that produced XX000 errors in the regression tests.Tom Lane
It's against project policy to use elog() for user-facing errors, or to omit an errcode() selection for errors that aren't supposed to be "can't happen" cases. Fix all the violations of this policy that result in ERRCODE_INTERNAL_ERROR log entries during the standard regression tests, as errors that can reliably be triggered from SQL surely should be considered user-facing. I also looked through all the files touched by this commit and fixed other nearby problems of the same ilk. I do not claim to have fixed all violations of the policy, just the ones in these files. In a few places I also changed existing ERRCODE choices that didn't seem particularly appropriate; mainly replacing ERRCODE_SYNTAX_ERROR by something more specific. Back-patch to 9.5, but no further; changing ERRCODE assignments in stable branches doesn't seem like a good idea.
2015-07-27Improve RLS handling in copy.cStephen Frost
To avoid a race condition where the relation being COPY'd could be changed into a view or otherwise modified, keep the original lock on the relation. Further, fully qualify the relation when building the query up. Also remove the poorly thought-out Assert() and check the entire relationOids list as, post-RLS, there can certainly be multiple relations involved and the planner does not guarantee their ordering. Per discussion with Noah and Andres. Back-patch to 9.5 where RLS was introduced.
2015-05-23pgindent run for 9.5Bruce Momjian
2015-05-20Collection of typo fixes.Heikki Linnakangas
Use "a" and "an" correctly, mostly in comments. Two error messages were also fixed (they were just elogs, so no translation work required). Two function comments in pg_proc.h were also fixed. Etsuro Fujita reported one of these, but I found a lot more with grep. Also fix a few other typos spotted while grepping for the a/an typos. For example, "consists out of ..." -> "consists of ...". Plus a "though"/ "through" mixup reported by Euler Taveira. Many of these typos were in old code, which would be nice to backpatch to make future backpatching easier. But much of the code was new, and I didn't feel like crafting separate patches for each branch. So no backpatching.
2015-05-08Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.Andres Freund
The newly added ON CONFLICT clause allows to specify an alternative to raising a unique or exclusion constraint violation error when inserting. ON CONFLICT refers to constraints that can either be specified using a inference clause (by specifying the columns of a unique constraint) or by naming a unique or exclusion constraint. DO NOTHING avoids the constraint violation, without touching the pre-existing row. DO UPDATE SET ... [WHERE ...] updates the pre-existing tuple, and has access to both the tuple proposed for insertion and the existing tuple; the optional WHERE clause can be used to prevent an update from being executed. The UPDATE SET and WHERE clauses have access to the tuple proposed for insertion using the "magic" EXCLUDED alias, and to the pre-existing tuple using the table name or its alias. This feature is often referred to as upsert. This is implemented using a new infrastructure called "speculative insertion". It is an optimistic variant of regular insertion that first does a pre-check for existing tuples and then attempts an insert. If a violating tuple was inserted concurrently, the speculatively inserted tuple is deleted and a new attempt is made. If the pre-check finds a matching tuple the alternative DO NOTHING or DO UPDATE action is taken. If the insertion succeeds without detecting a conflict, the tuple is deemed inserted. To handle the possible ambiguity between the excluded alias and a table named excluded, and for convenience with long relation names, INSERT INTO now can alias its target table. Bumps catversion as stored rules change. Author: Peter Geoghegan, with significant contributions from Heikki Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes. Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs, Dean Rasheed, Stephen Frost and many others.
2015-05-08Represent columns requiring insert and update privileges indentently.Andres Freund
Previously, relation range table entries used a single Bitmapset field representing which columns required either UPDATE or INSERT privileges, despite the fact that INSERT and UPDATE privileges are separately cataloged, and may be independently held. As statements so far required either insert or update privileges but never both, that was sufficient. The required permission could be inferred from the top level statement run. The upcoming INSERT ... ON CONFLICT UPDATE feature needs to independently check for both privileges in one statement though, so that is not sufficient anymore. Bumps catversion as stored rules change. Author: Peter Geoghegan Reviewed-By: Andres Freund
2015-04-30Create an infrastructure for parallel computation in PostgreSQL.Robert Haas
This does four basic things. First, it provides convenience routines to coordinate the startup and shutdown of parallel workers. Second, it synchronizes various pieces of state (e.g. GUCs, combo CID mappings, transaction snapshot) from the parallel group leader to the worker processes. Third, it prohibits various operations that would result in unsafe changes to that state while parallelism is active. Finally, it propagates events that would result in an ErrorResponse, NoticeResponse, or NotifyResponse message being sent to the client from the parallel workers back to the master, from which they can then be sent on to the client. Robert Haas, Amit Kapila, Noah Misch, Rushabh Lathia, Jeevan Chalke. Suggestions and review from Andres Freund, Heikki Linnakangas, Noah Misch, Simon Riggs, Euler Taveira, and Jim Nasby.
2015-02-02Be more careful to not lose sync in the FE/BE protocol.Heikki Linnakangas
If any error occurred while we were in the middle of reading a protocol message from the client, we could lose sync, and incorrectly try to interpret a part of another message as a new protocol message. That will usually lead to an "invalid frontend message" error that terminates the connection. However, this is a security issue because an attacker might be able to deliberately cause an error, inject a Query message in what's supposed to be just user data, and have the server execute it. We were quite careful to not have CHECK_FOR_INTERRUPTS() calls or other operations that could ereport(ERROR) in the middle of processing a message, but a query cancel interrupt or statement timeout could nevertheless cause it to happen. Also, the V2 fastpath and COPY handling were not so careful. It's very difficult to recover in the V2 COPY protocol, so we will just terminate the connection on error. In practice, that's what happened previously anyway, as we lost protocol sync. To fix, add a new variable in pqcomm.c, PqCommReadingMsg, that is set whenever we're in the middle of reading a message. When it's set, we cannot safely ERROR out and continue running, because we might've read only part of a message. PqCommReadingMsg acts somewhat similarly to critical sections in that if an error occurs while it's set, the error handler will force the connection to be terminated, as if the error was FATAL. It's not implemented by promoting ERROR to FATAL in elog.c, like ERROR is promoted to PANIC in critical sections, because we want to be able to use PG_TRY/CATCH to recover and regain protocol sync. pq_getmessage() takes advantage of that to prevent an OOM error from terminating the connection. To prevent unnecessary connection terminations, add a holdoff mechanism similar to HOLD/RESUME_INTERRUPTS() that can be used hold off query cancel interrupts, but still allow die interrupts. The rules on which interrupts are processed when are now a bit more complicated, so refactor ProcessInterrupts() and the calls to it in signal handlers so that the signal handlers always call it if ImmediateInterruptOK is set, and ProcessInterrupts() can decide to not do anything if the other conditions are not met. Reported by Emil Lenngren. Patch reviewed by Noah Misch and Andres Freund. Backpatch to all supported versions. Security: CVE-2015-0244
2015-01-28Clean up range-table building in copy.cStephen Frost
Commit 804b6b6db4dcfc590a468e7be390738f9f7755fb added the build of a range table in copy.c to initialize the EState es_range_table since it can be needed in error paths. Unfortunately, that commit didn't appreciate that some code paths might end up not initializing the rte which is used to build the range table. Fix that and clean up a couple others things along the way- build it only once and don't explicitly set it on the !is_from path as it doesn't make any sense there (cstate is palloc0'd, so this isn't an issue from an initializing standpoint either). The prior commit went back to 9.0, but this only goes back to 9.1 as prior to that the range table build happens immediately after building the RTE and therefore doesn't suffer from this issue. Pointed out by Robert.