summaryrefslogtreecommitdiff
path: root/src/backend/commands/copy.c
AgeCommit message (Collapse)Author
2007-12-27Fix ill-advised usage of x?y:z expressions in errmsg() and errhint() calls.Tom Lane
This prevented gettext from recognizing the strings that need to be translated.
2007-12-27Swap the order of testing for control characters and for column delimiter inTom Lane
CopyAttributeOutText(), so that control characters are converted to the C-style escape sequences even if they happen to be equal to the column delimiter (as is true by default for tab, for example). Oversight in my previous patch to restore pre-8.3 behavior of COPY OUT escaping. Per report from Tomas Szepe.
2007-12-03Revert COPY OUT to follow the pre-8.3 handling of ASCII control characters,Tom Lane
namely that \r, \n, \t, \b, \f, \v are dumped as those two-character representations rather than a backslash and the literal control character. I had made it do the other to save some code, but this was ill-advised, because dump files in which these characters appear literally are prone to newline mangling. Fortunately, doing it the old way should only cost a few more lines of code, and not slow down the copy loop materially. Per bug #3795 from Lou Duchez.
2007-11-30Avoid incrementing the CommandCounter when CommandCounterIncrement is calledTom Lane
but no database changes have been made since the last CommandCounterIncrement. This should result in a significant improvement in the number of "commands" that can typically be performed within a transaction before hitting the 2^32 CommandId size limit. In particular this buys back (and more) the possible adverse consequences of my previous patch to fix plan caching behavior. The implementation requires tracking whether the current CommandCounter value has been "used" to mark any tuples. CommandCounter values stored into snapshots are presumed not to be used for this purpose. This requires some small executor changes, since the executor used to conflate the curcid of the snapshot it was using with the command ID to mark output tuples with. Separating these concepts allows some small simplifications in executor APIs. Something for the TODO list: look into having CommandCounterIncrement not do AcceptInvalidationMessages. It seems fairly bogus to be doing it there, but exactly where to do it instead isn't clear, and I'm disinclined to mess with asynchronous behavior during late beta.
2007-11-15pgindent run for 8.3.Bruce Momjian
2007-09-12Perform post-escaping encoding validity checks on SQL literals and COPY inputAndrew Dunstan
so that invalidly encoded data cannot enter the database by these means.
2007-09-07Don't take ProcArrayLock while exiting a transaction that has no XID; there isTom Lane
no need for serialization against snapshot-taking because the xact doesn't affect anyone else's snapshot anyway. Per discussion. Also, move various info about the interlocking of transactions and snapshots out of code comments and into a hopefully-more-cohesive discussion in access/transam/README. Also, remove a couple of now-obsolete comments about having to force some WAL to be written to persuade RecordTransactionCommit to do its thing.
2007-06-20Minor code cleanup: calling FreeFile() before ereport(ERROR) is notNeil Conway
necessary, since files opened via AllocateFile() are closed automatically as part of error recovery.
2007-06-17Marginal hacking to improve the speed of COPY OUT. I had found in a bit ofTom Lane
profiling that CopyAttributeOutText was taking an unreasonable fraction of the backend run time (like 66%!) on the following trivial test case: $ time psql -c "copy (select repeat('xyzzy',50) from generate_series(1,10000000)) to stdout" regression >/dev/null The time is all being spent on scanning the string for characters to be escaped, which most of the time there aren't any of. Some tweaking to take as many tests as possible out of the inner loop reduced the runtime of this example by more than 10%. In a real-world case it wouldn't be as useful a speedup, but it still seems worth adding a few lines here.
2007-04-27Modify processing of DECLARE CURSOR and EXPLAIN so that they can resolve theTom Lane
types of unspecified parameters when submitted via extended query protocol. This worked in 8.2 but I had broken it during plancache changes. DECLARE CURSOR is now treated almost exactly like a plain SELECT through parse analysis, rewrite, and planning; only just before sending to the executor do we divert it away to ProcessUtility. This requires a special-case check in a number of places, but practically all of them were already special-casing SELECT INTO, so it's not too ugly. (Maybe it would be a good idea to merge the two by treating IntoClause as a form of utility statement? Not going to worry about that now, though.) That approach doesn't work for EXPLAIN, however, so for that I punted and used a klugy solution of running parse analysis an extra time if under extended query protocol.
2007-04-18Update docs/error message for CSV quote/escape --- must be ASCII.Bruce Momjian
Backpatch doc change to 8.2.X.
2007-04-18Update error message for COPY with a multi-byte delimiter.Bruce Momjian
2007-04-16Expose more cursor-related functionality in SPI: specifically, allowTom Lane
access to the planner's cursor-related planning options, and provide new FETCH/MOVE routines that allow access to the full power of those commands. Small refactoring of planner(), pg_plan_query(), and pg_plan_queries() APIs to make it convenient to pass the planning options down from SPI. This is the core-code portion of Pavel Stehule's patch for scrollable cursor support in plpgsql; I'll review and apply the plpgsql changes separately.
2007-03-29Teach CLUSTER to skip writing WAL if not needed (ie, not using archiving)Tom Lane
--- Simon. Also, code review and cleanup for the previous COPY-no-WAL patches --- Tom.
2007-03-13First phase of plan-invalidation project: create a plan cache managementTom Lane
module and teach PREPARE and protocol-level prepared statements to use it. In service of this, rearrange utility-statement processing so that parse analysis does not assume table schemas can't change before execution for utility statements (necessary because we don't attempt to re-acquire locks for utility statements when reusing a stored plan). This requires some refactoring of the ProcessUtility API, but it ends up cleaner anyway, for instance we can get rid of the QueryContext global. Still to do: fix up SPI and related code to use the plan cache; I'm tempted to try to make SQL functions use it too. Also, there are at least some aspects of system state that we want to ensure remain the same during a replan as in the original processing; search_path certainly ought to behave that way for instance, and perhaps there are others.
2007-03-03Add resetStringInfo(), which clears the content of a StringInfo, andNeil Conway
fixup various places in the tree that were clearing a StringInfo by hand. Making this function a part of the API simplifies client code slightly, and avoids needlessly peeking inside the StringInfo interface.
2007-02-20Remove the Query structure from the executor's API. This allows us to stopTom Lane
storing mostly-redundant Query trees in prepared statements, portals, etc. To replace Query, a new node type called PlannedStmt is inserted by the planner at the top of a completed plan tree; this carries just the fields of Query that are still needed at runtime. The statement lists kept in portals etc. now consist of intermixed PlannedStmt and bare utility-statement nodes --- no Query. This incidentally allows us to remove some fields from Query and Plan nodes that shouldn't have been there in the first place. Still to do: simplify the execution-time range table; at the moment the range table passed to the executor still contains Query trees for subqueries. initdb forced due to change of stored rules.
2007-01-25Prevent WAL logging when COPY is done in the same transation thatBruce Momjian
created it. Simon Riggs
2007-01-05Update CVS HEAD for 2007 copyright. Back branches are typically notBruce Momjian
back-stamped for this.
2006-10-06Message style improvementsPeter Eisentraut
2006-10-04pgindent run for 8.2.Bruce Momjian
2006-08-31Attibution addition: Add Karel Zak also for COPY SELECT.Bruce Momjian
2006-08-31Correct attibution:Bruce Momjian
COPY SELECT work done by Zoltan Boszormenyi
2006-08-30Extend COPY to support COPY (SELECT ...) TO ...Tom Lane
Bernd Helmle
2006-07-14Remove 576 references of include files that were not needed.Bruce Momjian
2006-07-13Allow include files to compile own their own.Bruce Momjian
Strip unused include files out unused include files, and add needed includes to C files. The next step is to remove unused include files in C files.
2006-05-26Further hacking on performance of COPY OUT. It seems that fwrite()'sTom Lane
per-call overhead is quite significant, at least on Linux: whatever it's doing is more than just shoving the bytes into a buffer. Buffering the data so we can call fwrite() just once per row seems to be a win.
2006-05-25Reduce per-character overhead in COPY OUT by combining calls toTom Lane
CopySendData.
2006-05-21Change the backend to reject strings containing invalidly-encoded multibyteTom Lane
characters in all cases. Formerly we mostly just threw warnings for invalid input, and failed to detect it at all if no encoding conversion was required. The tighter check is needed to defend against SQL-injection attacks as per CVE-2006-2313 (further details will be published after release). Embedded zero (null) bytes will be rejected as well. The checks are applied during input to the backend (receipt from client or COPY IN), so it no longer seems necessary to check in textin() and related routines; any string arriving at those functions will already have been validated. Conversion failure reporting (for characters with no equivalent in the destination encoding) has been cleaned up and made consistent while at it. Also, fix a few longstanding errors in little-used encoding conversion routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic, mic_to_euc_tw were all broken to varying extents. Patches by Tatsuo Ishii and Tom Lane. Thanks to Akio Ishida and Yasuo Ohgaki for identifying the security issues.
2006-04-05Fix a bunch of problems with domains by making them use special input functionsTom Lane
that apply the necessary domain constraint checks immediately. This fixes cases where domain constraints went unchecked for statement parameters, PL function local variables and results, etc. We can also eliminate existing special cases for domains in places that had gotten it right, eg COPY. Also, allow domains over domains (base of a domain is another domain type). This almost worked before, but was disallowed because the original patch hadn't gotten it quite right.
2006-04-04Modify all callers of datatype input and receive functions so that if theseTom Lane
functions are not strict, they will be called (passing a NULL first parameter) during any attempt to input a NULL value of their datatype. Currently, all our input functions are strict and so this commit does not change any behavior. However, this will make it possible to build domain input functions that centralize checking of domain constraints, thereby closing numerous holes in our domain support, as per previous discussion. While at it, I took the opportunity to introduce convenience functions InputFunctionCall, OutputFunctionCall, etc to use in code that calls I/O functions. This eliminates a lot of grotty-looking casts, but the main motivation is to make it easier to grep for these places if we ever need to touch them again.
2006-03-23Add error location info to ResTarget parse nodes. Allows error cursor to be ↵Tom Lane
supplied for various mistakes involving INSERT and UPDATE target columns.
2006-03-05Update copyright for 2006. Update scripts.Bruce Momjian
2006-03-03Make the COPY command return a command tag that includes the number ofTom Lane
rows copied. Backend side of Volkan Yazici's recent patch, with corrections and documentation.
2006-02-03Prevent COPY from using newline or carriage return as delimiter or null.Bruce Momjian
Disallow backslash as the delimiter in non-CVS mode. David Fetter
2005-12-28Add regression tests for CSV and \., and add automatic quoting of aBruce Momjian
single column dump that has a \. value, so the load works properly. I also added documentation describing this issue.
2005-12-27Our code had:Bruce Momjian
if (c == '\\' && cstate->line_buf.len == 0) The problem with that is the because of the input and _output_ buffering, cstate->line_buf.len could be zero even if we are not on the first character of a line. In fact, for a typical line, it is zero for all characters on the line. The proper solution is to introduce a boolean, first_char_in_line, that we set as we enter the loop and clear once we process a character. I have restructured the line-reading code in copy.c by: o merging the CSV/non-CSV functions into a single function o used macros to centralize and clarify the buffering code o updated comments o renamed client_encoding_only to encoding_embeds_ascii o added a high-bit test to the encoding_embeds_ascii test for performance o in CSV mode, allow a backslash followed by a non-period to continue being processed as a data value There should be no performance impact from this patch because it is functionally equivalent. If you apply the patch you will see copy.c is much clearer in this area now and might suggest additional optimizations. I have also attached a 8.1-only patch to fix the CSV \. handling bug with no code restructuring.
2005-11-22Re-run pgindent, fixing a problem where comment lines after a blankBruce Momjian
comment line where output as too long, and update typedefs for /lib directory. Also fix case where identifiers were used as variable names in the backend, but as typedefs in ecpg (favor the backend for indenting). Backpatch to 8.1.X.
2005-11-03Rename the members of CommandDest enum so they don't collide with other uses ofAlvaro Herrera
those names. (Debug and None were pretty bad names anyway.) I hope I catched all uses of the names in comments too.
2005-10-15Standard pgindent run for 8.1.Bruce Momjian
2005-10-03COPY's test for read-only transaction was backward; it prohibited COPY TOTom Lane
where it should prohibit COPY FROM. Found by Alon Goldshuv.
2005-09-24Clean up possibly-uninitialized-variable warnings reported by gcc 4.x.Tom Lane
2005-09-24Suppress signed-vs-unsigned-char warnings.Tom Lane
2005-09-01Fix unportable uses of <ctype.h> functions. Per Sergey Koposov.Tom Lane
2005-08-06COPY performance improvements. Avoid calling CopyGetData for each inputTom Lane
character, tighten the inner loops of CopyReadLine and CopyReadAttribute, arrange to parse out all the attributes of a line in just one call instead of one CopyReadAttribute call per attribute, be smarter about which client encodings require slow pg_encoding_mblen() loops. Also, clean up the mishmash of static variables and overly-long parameter lists in favor of passing around a single CopyState struct containing all the state data. Original patch by Alon Goldshuv, reworked by Tom Lane.
2005-07-10Change typreceive function API so that receive functions get the sameTom Lane
optional arguments as text input functions, ie, typioparam OID and atttypmod. Make all the datatypes that use typmod enforce it the same way in typreceive as they do in typinput. This fixes a problem with failure to enforce length restrictions during COPY FROM BINARY.
2005-06-28Replace pg_shadow and pg_group by new role-capable catalogs pg_authidTom Lane
and pg_auth_members. There are still many loose ends to finish in this patch (no documentation, no regression tests, no pg_dump support for instance). But I'm going to commit it now anyway so that Alvaro can make some progress on shared dependencies. The catalog changes should be pretty much done.
2005-06-02Add support for \x hex escapes in COPY.Bruce Momjian
Sergey Ten
2005-05-07Add COPY WITH CVS HEADER to allow a heading line as the first line inBruce Momjian
COPY. Andrew Dunstan
2005-05-06For some reason access/tupmacs.h has been #including utils/memutils.h,Tom Lane
which is neither needed by nor related to that header. Remove the bogus inclusion and instead include the header in those C files that actually need it. Also fix unnecessary inclusions and bad inclusion order in tsearch2 files.