summaryrefslogtreecommitdiff
path: root/src/backend/utils/adt/varchar.c
AgeCommit message (Collapse)Author
2016-02-03Extend sortsupport for text to more opclasses.Robert Haas
Have varlena.c expose an interface that allows the char(n), bytea, and bpchar types to piggyback on a now-generalized SortSupport for text. This pushes a little more knowledge of the bpchar/char(n) type into varlena.c than might be preferred, but that seems like the approach that creates least friction. Also speed things up for index builds that use text_pattern_ops or varchar_pattern_ops. This patch does quite a bit of renaming, but it seems likely to be worth it, so as to avoid future confusion about the fact that this code is now more generally used than the old names might have suggested. Peter Geoghegan, reviewed by Álvaro Herrera and Andreas Karlsson, with small tweaks by me.
2016-01-02Update copyright for 2016Bruce Momjian
Backpatch certain files through 9.1
2015-01-06Update copyright for 2015Bruce Momjian
Backpatch certain files through 9.0
2014-05-06pgindent run for 9.4Bruce Momjian
This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.
2014-02-24docs: document behavior of CHAR() comparisons with chars < spaceBruce Momjian
Space trimming rather than space-padding causes unusual behavior, which might not be standards-compliant. Also remove recently-added now-redundant C comment.
2014-02-13Add C comment about problems with CHAR() space trimmingBruce Momjian
2014-01-07Update copyright for 2014Bruce Momjian
Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.
2013-01-01Update copyrights for 2013Bruce Momjian
Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.
2012-06-10Run pgindent on 9.2 source tree in preparation for first 9.3Bruce Momjian
commit-fest.
2012-05-25Fix string truncation to be multibyte-aware in text_name and bpchar_name.Tom Lane
Previously, casts to name could generate invalidly-encoded results. Also, make these functions match namein() more exactly, by consistently using palloc0() instead of ad-hoc zeroing code. Back-patch to all supported branches. Karl Schnaitter and Tom Lane
2012-03-23Code review for protransform patches.Tom Lane
Fix loss of previous expression-simplification work when a transform function fires: we must not simply revert to untransformed input tree. Instead build a dummy FuncExpr node to pass to the transform function. This has the additional advantage of providing a simpler, more uniform API for transform functions. Move documentation to a somewhat less buried spot, relocate some poorly-placed code, be more wary of null constants and invalid typmod values, add an opr_sanity check on protransform function signatures, and some other minor cosmetic adjustments. Note: although this patch touches pg_proc.h, no need for catversion bump, because the changes are cosmetic and don't actually change the intended catalog contents.
2012-01-01Update copyright notices for year 2012.Bruce Momjian
2011-06-21Add notion of a "transform function" that can simplify function calls.Robert Haas
Initially, we use this only to eliminate calls to the varchar() function in cases where the length is not being reduced and, therefore, the function call is equivalent to a RelabelType operation. The most significant effect of this is that we can avoid a table rewrite when changing a varchar(X) column to a varchar(Y) column, where Y > X. Noah Misch, reviewed by me and Alexey Klyukin
2011-02-08Per-column collation supportPeter Eisentraut
This adds collation support for columns and domains, a COLLATE clause to override it per expression, and B-tree index support. Peter Eisentraut reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch
2011-01-01Stamp copyrights for year 2011.Bruce Momjian
2010-12-21Use memcmp() rather than strncmp() when shorter string length is known.Robert Haas
It appears that this will be faster for all but the shortest strings; at least one some platforms, memcmp() can use word-at-a-time comparisons. Noah Misch, somewhat pared down.
2010-09-20Remove cvs keywords from all files.Magnus Hagander
2010-01-02Update copyright for the year 2010.Bruce Momjian
2009-07-11Alter some gratuitous uses of "ANSI" when "SQL standard" might have beenPeter Eisentraut
meant or the reference to a standard was unnecessary.
2009-06-118.4 pgindent run, with new combined Linux/FreeBSD/MinGW typedef listBruce Momjian
provided by Andrew.
2009-01-01Update copyright for 2009.Bruce Momjian
2008-05-27Alter the xxx_pattern_ops opclasses to use the regular equality operator ofTom Lane
the associated datatype as their equality member. This means that these opclasses can now support plain equality comparisons along with LIKE tests, thus avoiding the need for an extra index in some applications. This optimization was not possible when the pattern opclasses were first introduced, because we didn't insist that text equality meant bitwise equality; but we do now, so there is no semantic difference between regular and pattern equality operators. I removed the name_pattern_ops opclass altogether, since it's really useless: name's regular comparisons are just strcmp() and are unlikely to become something different. Instead teach indxpath.c that btree name_ops can be used for LIKE whether or not the locale is C. This might lead to a useful speedup in LIKE queries on the system catalogs in non-C locales. The ~=~ and ~<>~ operators are gone altogether. (It would have been nice to keep them for backward compatibility's sake, but since the pg_amop structure doesn't allow multiple equality operators per opclass, there's no way.) A not-immediately-obvious incompatibility is that the sort order within bpchar_pattern_ops indexes changes --- it had been identical to plain strcmp, but is now trailing-blank-insensitive. This will impact in-place upgrades, if those ever happen. Per discussions a couple months ago.
2008-05-04Use new cstring/text conversion functions in some additional places.Tom Lane
These changes assume that the varchar and xml data types are represented the same as text. (I did not, however, accept the portions of the proposed patch that wanted to assume bytea is the same as text --- tgl.) Brendan Jurd
2008-03-25Simplify and standardize conversions between TEXT datums and ordinary CTom Lane
strings. This patch introduces four support functions cstring_to_text, cstring_to_text_with_len, text_to_cstring, and text_to_cstring_buffer, and two macros CStringGetTextDatum and TextDatumGetCString. A number of existing macros that provided variants on these themes were removed. Most of the places that need to make such conversions now require just one function or macro call, in place of the multiple notational layers that used to be needed. There are no longer any direct calls of textout or textin, and we got most of the places that were using handmade conversions via memcpy (there may be a few still lurking, though). This commit doesn't make any serious effort to eliminate transient memory leaks caused by detoasting toasted text objects before they reach text_to_cstring. We changed PG_GETARG_TEXT_P to PG_GETARG_TEXT_PP in a few places where it was easy, but much more could be done. Brendan Jurd and Tom Lane
2008-01-01Update copyrights in source tree to 2008.Bruce Momjian
2007-11-15pgindent run for 8.3.Bruce Momjian
2007-06-15Tweak the API for per-datatype typmodin functions so that they are passedTom Lane
an array of strings rather than an array of integers, and allow any simple constant or identifier to be used in typmods; for example create table foo (f1 widget(42,'23skidoo',point)); Of course the typmodin function has still got to pack this info into a non-negative int32 for storage, but it's still a useful improvement in flexibility, especially considering that you can do nearly anything if you are willing to keep the info in a side table. We can get away with this change since we have not yet released a version providing user-definable typmods. Per discussion.
2007-04-06Support varlena fields with single-byte headers and unaligned storage.Tom Lane
This commit breaks any code that assumes that the mere act of forming a tuple (without writing it to disk) does not "toast" any fields. While all available regression tests pass, I'm not totally sure that we've fixed every nook and cranny, especially in contrib. Greg Stark with some help from Tom Lane
2007-02-27Replace direct assignments to VARATT_SIZEP(x) with SET_VARSIZE(x, len).Tom Lane
Get rid of VARATT_SIZE and VARATT_DATA, which were simply redundant with VARSIZE and VARDATA, and as a consequence almost no code was using the longer names. Rename the length fields of struct varlena and various derived structures to catch anyplace that was accessing them directly; and clean up various places so caught. In itself this patch doesn't change any behavior at all, but it is necessary infrastructure if we hope to play any games with the representation of varlena headers. Greg Stark and Tom Lane
2007-01-05Update CVS HEAD for 2007 copyright. Back branches are typically notBruce Momjian
back-stamped for this.
2006-12-30Support type modifiers for user-defined types, and pull most knowledgeTom Lane
about typmod representation for standard types out into type-specific typmod I/O functions. Teodor Sigaev, with some editorialization by Tom Lane.
2006-10-04pgindent run for 8.2.Bruce Momjian
2006-07-14Remove 576 references of include files that were not needed.Bruce Momjian
2006-07-11Alphabetically order reference to include files, "S"-"Z".Bruce Momjian
2006-05-21Change the backend to reject strings containing invalidly-encoded multibyteTom Lane
characters in all cases. Formerly we mostly just threw warnings for invalid input, and failed to detect it at all if no encoding conversion was required. The tighter check is needed to defend against SQL-injection attacks as per CVE-2006-2313 (further details will be published after release). Embedded zero (null) bytes will be rejected as well. The checks are applied during input to the backend (receipt from client or COPY IN), so it no longer seems necessary to check in textin() and related routines; any string arriving at those functions will already have been validated. Conversion failure reporting (for characters with no equivalent in the destination encoding) has been cleaned up and made consistent while at it. Also, fix a few longstanding errors in little-used encoding conversion routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic, mic_to_euc_tw were all broken to varying extents. Patches by Tatsuo Ishii and Tom Lane. Thanks to Akio Ishida and Yasuo Ohgaki for identifying the security issues.
2006-03-05Update copyright for 2006. Update scripts.Bruce Momjian
2005-12-22Adjust string comparison so that only bitwise-equal strings are consideredTom Lane
equal: if strcoll claims two strings are equal, check it with strcmp, and sort according to strcmp if not identical. This fixes inconsistent behavior under glibc's hu_HU locale, and probably under some other locales as well. Also, take advantage of the now-well-defined behavior to speed up texteq, textne, bpchareq, bpcharne: they may as well just do a bitwise comparison and not bother with strcoll at all. NOTE: affected databases may need to REINDEX indexes on text columns to be sure they are self-consistent.
2005-10-15Standard pgindent run for 8.1.Bruce Momjian
2005-07-29Fix typo.Bruce Momjian
uniware
2005-07-10Change typreceive function API so that receive functions get the sameTom Lane
optional arguments as text input functions, ie, typioparam OID and atttypmod. Make all the datatypes that use typmod enforce it the same way in typreceive as they do in typinput. This fixes a problem with failure to enforce length restrictions during COPY FROM BINARY.
2005-05-29Marginal hack to save a little bit of time in bpcharin() when typmod is -1,Tom Lane
which is a common case.
2005-04-12Add aggsortop column to pg_aggregate, so that MIN/MAX optimization canTom Lane
be supported for all datatypes. Add CREATE AGGREGATE and pg_dump support too. Add specialized min/max aggregates for bpchar, instead of depending on text's min/max, because otherwise the possible use of bpchar indexes cannot be recognized. initdb forced because of catalog changes.
2004-12-31Tag appropriate files for rc3PostgreSQL Daemon
Also performed an initial run through of upgrading our Copyright date to extend to 2005 ... first run here was very simple ... change everything where: grep 1996-2004 && the word 'Copyright' ... scanned through the generated list with 'less' first, and after, to make sure that I only picked up the right entries ...
2004-08-29Pgindent run for 8.0.Bruce Momjian
2004-08-29Update copyright to 2004.Bruce Momjian
2004-08-02While perusing SQL92 I realized that we are delivering the wrong SQLSTATETom Lane
error code for string-too-long errors. It should be STRING_DATA_RIGHT_TRUNCATION not STRING_DATA_LENGTH_MISMATCH. The latter probably should only be applied to cases where a string must be exactly so many bits --- there are no cases at all where it applies to character strings, only bit strings.
2004-02-01Make length() disregard trailing spaces in char(n) values, per discussionTom Lane
some time ago and recent patch from Gavin Sherry. Update documentation to point out that trailing spaces are insignificant in char(n).
2003-11-29$Header: -> $PostgreSQL Changes ...PostgreSQL Daemon
2003-08-04Remove --enable-recode feature, since it's been broken by IPv6 changes,Tom Lane
and seems to have too few users to justify maintaining.
2003-08-04Update copyrights to 2003.Bruce Momjian