summaryrefslogtreecommitdiff
path: root/src/interfaces/ecpg
AgeCommit message (Collapse)Author
2024-10-23ecpg: Fix out-of-bound read in DecodeDateTime()Michael Paquier
It was possible for the code to read out-of-bound data from the "day_tab" table with some crafted input data. Let's treat these as invalid input as the month number is incorrect. A test is added to test this case with a check on the errno returned by the decoding routine. A test close to the new one added in this commit was testing for a failure, but did not look at the errno generated, so let's use this commit to also change it, adding a check on the errno returned by DecodeDateTime(). Like the other test scripts, dt_test should likely be expanded to include more checks based on the errnos generated in these code paths. This is left as future work. This issue exists since 2e6f97560a83, so backpatch all the way down. Reported-by: Pavel Nekrasov Author: Bruce Momjian, Pavel Nekrasov Discussion: https://postgr.es/m/18614-6bbe00117352309e@postgresql.org Backpatch-through: 12
2024-10-22ecpg: Refactor ecpg_log() to skip unnecessary calls to ECPGget_sqlca().Fujii Masao
Previously, ecpg_log() always called ECPGget_sqlca() to retrieve sqlca, even though it was only needed for debug logging. This commit updates ecpg_log() to call ECPGget_sqlca() only when debug logging is enabled. Author: Yuto Sasaki Reviewed-by: Alvaro Herrera, Tom Lane, Fujii Masao Discussion: https://postgr.es/m/TY2PR01MB3628A85689649BABC9A1C6C3C1782@TY2PR01MB3628.jpnprd01.prod.outlook.com
2024-10-17ecpg: fix more minor mishandling of bad input in preprocessor.Tom Lane
Don't get confused by an unmatched right brace in the input. (Previously, this led to discarding information about file-level variables and then possibly crashing.) Detect, rather than crash on, an attempt to index into a non-array variable. As before, in the absence of field complaints I'm not too excited about back-patching these. Per valgrind testing by Alexander Lakhin. Discussion: https://postgr.es/m/a239aec2-6c79-5fc9-9272-cea41158a360@gmail.com
2024-10-16ecpg: fix some minor mishandling of bad input in preprocessor.Tom Lane
Avoid null-pointer crash when considering a cursor declaration that's outside any C function (a case which is useless anyway). Ensure a cursor for a prepared statement is marked as initially not open. At worst, if we chanced to get not-already-zeroed memory from malloc(), this oversight would result in failing to issue a "cursor "foo" has been declared but not opened" warning that would have been appropriate. Avoid running off the end of the buffer when there are mismatched square brackets following a variable name. This could lead to SIGSEGV after reaching the end of memory. Given the lack of field complaints, none of these seem to be worth back-patching, but let's clean them up in HEAD. Per valgrind testing by Alexander Lakhin. Discussion: https://postgr.es/m/5f5bcecd-d7ec-b8c0-6c92-d1a7c6e0f639@gmail.com
2024-10-14ecpg: invent a saner syntax for ecpg.addons entries.Tom Lane
Put the rule type at the start not the end, and put spaces between the constitutent token names instead of smashing them into an illegible mess. This has no functional impact but I think it makes the rules a great deal more readable. Discussion: https://postgr.es/m/1185216.1724001216@sss.pgh.pa.us
2024-10-14ecpg: add cross-checks to parse.pl for usage of internal tables.Tom Lane
parse.pl contains several constant tables that describe tweaks to be made to the backend grammar. In the same spirit as 00b0e7204, add cross-checks that each table entry is used at least once (or exactly once if that's appropriate). This should help catch cases where adjustments to the backend grammar cause a table entry not to match as expected. Per suggestion from Michael Paquier. Discussion: https://postgr.es/m/ZsLVbjsc5x5Saesg@paquier.xyz
2024-10-14ecpg: avoid breaking the IDENT precedence level in two.Tom Lane
Careless string hacking caused parse.pl to transform gram.y's declaration %nonassoc IDENT PARTITION RANGE ROWS ... into %nonassoc IDENT %nonassoc CSTRING PARTITION RANGE ROWS ... It turns out that this has no semantic impact, because the generated preproc.c is exactly the same either way (if you inject a blank line to keep line numbers the same). Nonetheless, given the great emphasis that the commentary in gram.y places on keeping those other keywords at the same precedence level as IDENT, this seems like foolishly risking ecpg behaving differently from the core parser. Adjust the code so that CSTRING is added to the precedence line without breaking it into two lines. Discussion: https://postgr.es/m/2157151.1713540065@sss.pgh.pa.us
2024-10-14ecpg: improve preprocessor's memory management.Tom Lane
Invent a notion of "local" storage that will automatically be reclaimed at the end of each statement. Use this for location strings as well as other visibly short-lived data within the parser. Also, make cat_str and make_str return local storage and not free their inputs, which allows dispensing with a whole lot of retail mm_strdup calls. We do have to add some new ones in places where a local-lifetime string needs to be added to a longer-lived data structure, but on balance there are a lot less mm_strdup calls than before. In hopes of flushing out places where changes were necessary, I changed YYLTYPE from "char *" to "const char *", which forced const-ification of various function arguments that probably should've been like that all along. This still leaks somewhat more memory than v17, but that will be cleaned up in future commits. Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14ecpg: move some functions into a new file ecpg/preproc/util.c.Tom Lane
mm_alloc and mm_strdup were in type.c, which seems a completely random choice. No doubt the original author thought two small functions didn't deserve their own file. But I'm about to add some more memory-management stuff beside them, so let's put them in a less surprising place. This seems like a better home for mmerror, mmfatal, and the cat_str/make_str family, too. Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14ecpg: re-implement preprocessor's string management.Tom Lane
Most productions in the preprocessor grammar construct strings representing SQL or C statements or fragments thereof. Instead of returning these as <str> results of the productions, return them as "location" values, taking advantage of Bison's flexibility about what a location is. We aren't really giving up anything thereby, since ecpg's error reports have always just given line numbers, and that's tracked separately. The advantage of this is that a single instance of the YYLLOC_DEFAULT macro can perform all the work needed by the vast majority of productions, including all the ones made automatically by parse.pl. This avoids having large numbers of effectively-identical productions, which tickles an optimization inefficiency in recent versions of clang. (This patch reduces the compilation time for preproc.o by more than 100-fold with clang 16, and is visibly helpful with gcc too.) The compiled parser is noticeably smaller as well. A disadvantage of this approach is that YYLLOC_DEFAULT is applied before running the production's semantic action (if any). This means it cannot use the method favored by cat_str() of free'ing all the input strings; if the action needs to look at the input strings, it'd be looking at dangling storage. As this stands, therefore, it leaks memory like a sieve. This is already a big patch though, and fixing the memory management seems like a separable problem, so let's leave that for the next step. (This does remove some free() calls that I'd have had to touch anyway, in the expectation that the next step will manage memory reclamation quite differently.) Most of the changes here are mindless substitution of "@N" for "$N" in grammar rules; see the changes to README.parser for an explanation. Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14ecpg: major cleanup, simplification, and documentation of parse.pl.Tom Lane
Remove a lot of cruft, clean up and document what's left. This produces the same preproc.y output as before, except for fewer blank lines. (It's not like we're making any attempt to match the layout of gram.y, so I removed the one bit of logic that seemed to have that in mind.) Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14ecpg: remove check_rules.pl.Tom Lane
As noted in the previous commit, check_rules.pl is now entirely redundant with checks made by parse.pl, or would be if it weren't for the places where it's wrong. It's a waste of build cycles and maintenance effort, so remove it. Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14ecpg: clean up documentation of parse.pl, and add more input checking.Tom Lane
README.parser is the user's manual, such as it is, for parse.pl. It's rather poorly written if you ask me; so try to improve it. (More could be written here, but this at least covers the same info in a more organized fashion.) Also, the single solitary line of usage info in parse.pl itself was a lie. Replace. Add some error checks that the ecpg.addons entries meet the syntax rules set forth in README.parser. One of them didn't, but accidentally worked anyway because the logic in include_addon is such that 'block' is the default behavior. Also add a cross-check that each ecpg.addons entry is matched exactly once in the backend grammar. This exposed that there are two dead entries there --- they are dead because the %replace_types table in parse.pl causes their nonterminals to be ignored altogether. Removing them doesn't change the generated preproc.y file. (This implies that check_rules.pl is completely worthless and should be nuked: it adds build cycles and maintenance effort while failing to reliably accomplish its one job of detecting dead rules. I'll do that separately.) Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-10-14Remove traces of BeOS.Peter Eisentraut
Commit 15abc7788e6 tolerated namespace pollution from BeOS system headers. Commit 44f902122 de-supported BeOS. Since that stuff didn't make it into the Meson build system, synchronize by removing from configure. Author: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Japin Li <japinli@hotmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (the idea, not the patch) Discussion: https://postgr.es/m/ME3P282MB3166F9D1F71F787929C0C7E7B6312%40ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM
2024-10-04ecpg: avoid adding whitespace around '&' in connection URLs.Tom Lane
The preprocessor really should not have done this to begin with. The space after '&' was masked by ECPGconnect's skipping spaces before option keywords, and the space before by dint of libpq being (mostly) insensitive to trailing space in option values. We fixed the one known problem with that in 920d51979. Hence this patch is mostly cosmetic, and we'll just change it in HEAD. Discussion: https://postgr.es/m/TY2PR01MB36286A7B97B9A15793335D18C1772@TY2PR01MB3628.jpnprd01.prod.outlook.com
2024-10-01Simplify checking for xlocale.hPeter Eisentraut
Instead of XXX_IN_XLOCALE_H for several features XXX, let's just include <xlocale.h> if HAVE_XLOCALE_H. The reason for the extra complication was apparently that some old glibc systems also had an <xlocale.h>, and you weren't supposed to include it directly, but it's gone now (as far as I can tell it was harmless to do so anyway). Author: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CWZBBRR6YA8D.8EHMDRGLCKCD%40neon.tech
2024-09-05Prevent mis-encoding of "trailing junk after numeric literal" errors.Tom Lane
Since commit 2549f0661, we reject an identifier immediately following a numeric literal (without separating whitespace), because that risks ambiguity with hex/octal/binary integers. However, that patch used token patterns like "{integer}{ident_start}", which is problematic because {ident_start} matches only a single byte. If the first character after the integer is a multibyte character, this ends up with flex reporting an error message that includes a partial multibyte character. That can cause assorted bad-encoding problems downstream, both in the report to the client and in the postmaster log file. To fix, use {identifier} not {ident_start} in the "junk" token patterns, so that they will match complete multibyte characters. This seems generally better user experience quite aside from the encoding problem: for "123abc" the error message will now say that the error appeared at or near "123abc" instead of "123a". While at it, add some commentary about why these patterns exist and how they work. Report and patch by Karina Litskevich; review by Pavel Borisov. Back-patch to v15 where the problem came in. Discussion: https://postgr.es/m/CACiT8iZ_diop=0zJ7zuY3BXegJpkKK1Av-PU7xh0EDYHsa5+=g@mail.gmail.com
2024-08-23thread-safety: gmtime_r(), localtime_r()Peter Eisentraut
Use gmtime_r() and localtime_r() instead of gmtime() and localtime(), for thread-safety. There are a few affected calls in libpq and ecpg's libpgtypes, which are probably effectively bugs, because those libraries already claim to be thread-safe. There is one affected call in the backend. Most of the backend otherwise uses the custom functions pg_gmtime() and pg_localtime(), which are implemented differently. While we're here, change the call in the backend to gmtime*() instead of localtime*(), since for that use time zone behavior is irrelevant, and this side-steps any questions about when time zones are initialized by localtime_r() vs localtime(). Portability: gmtime_r() and localtime_r() are in POSIX but are not available on Windows. Windows has functions gmtime_s() and localtime_s() that can fulfill the same purpose, so we add some small wrappers around them. (Note that these *_s() functions are also different from the *_s() functions in the bounds-checking extension of C11. We are not using those here.) On MinGW, you can get the POSIX-style *_r() functions by defining _POSIX_C_SOURCE appropriately before including <time.h>. This leads to a conflict at least in plpython because apparently _POSIX_C_SOURCE gets defined in some header there, and then our replacement definitions conflict with the system definitions. To avoid that sort of thing, we now always define _POSIX_C_SOURCE on MinGW and use the POSIX-style functions here. Reviewed-by: Stepan Neretin <sncfmgg@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/eba1dc75-298e-4c46-8869-48ba8aad7d70@eisentraut.org
2024-08-15Remove dependence on -fwrapv semantics in a few places.Nathan Bossart
This commit attempts to update a few places, such as the money, numeric, and timestamp types, to no longer rely on signed integer wrapping for correctness. This is intended to move us closer towards removing -fwrapv, which may enable some compiler optimizations. However, there is presently no plan to actually remove that compiler option in the near future. Besides using some of the existing overflow-aware routines in int.h, this commit introduces and makes use of some new ones. Specifically, it adds functions that accept a signed integer and return its absolute value as an unsigned integer with the same width (e.g., pg_abs_s64()). It also adds functions that accept an unsigned integer, store the result of negating that integer in a signed integer with the same width, and return whether the negation overflowed (e.g., pg_neg_u64_overflow()). Finally, this commit adds a couple of tests for timestamps near POSTGRES_EPOCH_JDATE. Author: Joseph Koshakow Reviewed-by: Tom Lane, Heikki Linnakangas, Jian He Discussion: https://postgr.es/m/CAAvxfHdBPOyEGS7s%2Bxf4iaW0-cgiq25jpYdWBqQqvLtLe_t6tw%40mail.gmail.com
2024-08-15Clean up indentation and whitespace inconsistencies in ecpg.Tom Lane
ecpg's lexer and parser files aren't normally processed by pgindent, and unsurprisingly there's a lot of code in there that doesn't really match project style. I spent some time running pgindent over the fragments of these files that are C code, and this is the result. This is in the same spirit as commit 30ed71e42, though apparently Peter used a different method for that one, since it didn't find these problems. Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us
2024-08-10Fix inappropriate uses of atol()Peter Eisentraut
Some code using atol() would not work correctly if sizeof(long)==4: - src/bin/pg_basebackup/pg_basebackup.c: Would miscount size of a tablespace over 2 TB. - src/bin/pg_basebackup/streamutil.c: Would truncate a timeline ID beyond INT32_MAX. - src/bin/pg_rewind/libpq_source.c: Would miscount size of files larger than 2 GB (but this currently cannot happen). Replace these with atoll(). In one case, the use of atol() did not result in incorrect behavior but seems inconsistent with related code: - src/interfaces/ecpg/ecpglib/execute.c: Gratuitous, since it processes a value from pg_type.typlen, which is int16. Replace this with atoi(). Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://www.postgresql.org/message-id/flat/a52738ad-06bc-4d45-b59f-b38a8a89de49%40eisentraut.org
2024-08-07Revert ECPG's use of pnstrdup()Peter Eisentraut
Commit 0b9466fce added a dependency on fe_memutils' pnstrdup() inside informix.c. This adds an exit() path in a library, which we don't want. (Unlike libpq, the ecpg libraries don't have an automated check for that, but it makes sense to keep them to a similar standard.) The ecpg code can already handle failure results from the *strdup() call by itself. Author: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/CAOYmi+=pg=W5L1h=3MEP_EB24jaBu2FyATrLXqQHGe7cpuvwyg@mail.gmail.com
2024-08-03Add -Wmissing-variable-declarations to the standard compilation flagsPeter Eisentraut
This warning flag detects global variables not declared in header files. This is similar to what -Wmissing-prototypes does for functions. (More correctly, it is similar to what -Wmissing-declarations does for functions, but -Wmissing-prototypes is a superset of that in C.) This flag is new in GCC 14. Clang has supported it for a while. Several recent commits have cleaned up warnings triggered by this, so it should now be clean. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-08-02Include bison header files into implementation filesPeter Eisentraut
Before Bison 3.4, the generated parser implementation files run afoul of -Wmissing-variable-declarations (in spite of commit ab61c40bfa2) because declarations for yylval and possibly yylloc are missing. The generated header files contain an extern declaration, but the implementation files don't include the header files. Since Bison 3.4, the generated implementation files automatically include the generated header files, so then it works. To make this work with older Bison versions as well, include the generated header file from the .y file. (With older Bison versions, the generated implementation file contains effectively a copy of the header file pasted in, so including the header file is redundant. But we know this works anyway because the core grammar uses this arrangement already.) Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-25Add extern declarations for Bison global variablesPeter Eisentraut
This adds extern declarations for some global variables produced by Bison that are not already declared in its generated header file. This is a workaround to be able to add -Wmissing-variable-declarations to the global set of warning options in the near future. Another longer-term solution would be to convert these grammars to "pure" parsers in Bison, to avoid global variables altogether. Note that the core grammar is already pure, so this patch did not need to touch it. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-02Fix overflow in parsing of positional parameterPeter Eisentraut
Replace atol with pg_strtoint32_safe in the backend parser and with strtoint in ECPG to reject overflows when parsing the number of a positional parameter. With atol from glibc, parameters $2147483648 and $4294967297 turn into $-2147483648 and $1, respectively. Author: Erik Wienhold <ewie@ewie.name> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/5d216d1c-91f6-4cbe-95e2-b4cbd930520c@ewie.name
2024-06-04Fix PL/pgSQL's handling of integer ranges containing underscores.Dean Rasheed
Commit faff8f8e47 allowed integer literals to contain underscores, but failed to update the lexer's "numericfail" rule. As a result, a decimal integer literal containing underscores would fail to parse, if used in an integer range with no whitespace after the first number, such as "1_001..1_003" in a PL/pgSQL FOR loop. Fix and backpatch to v16, where support for underscores in integer literals was added. Report and patch by Erik Wienhold. Discussion: https://postgr.es/m/808ce947-46ec-4628-85fa-3dd600b2c154%40ewie.name
2024-05-23Remove race conditions between ECPGdebug() and ecpg_log().Tom Lane
Coverity complains that ECPGdebug is accessing debugstream without holding debug_mutex, which is a fair complaint: we should take debug_mutex while changing the settings ecpg_log looks at. In some branches it also complains about unlocked use of simple_debug. I think it's intentional and safe to have a quick unlocked check of simple_debug at the start of ecpg_log, since that early exit will always be taken in non-debug cases. But we should recheck simple_debug after acquiring the mutex. In the worst case, calling ECPGdebug concurrently with ecpg_log in another thread could result in a null-pointer dereference due to debugstream transiently being NULL while simple_debug isn't 0. This is largely hypothetical, since it's unlikely anybody uses ECPGdebug() at all in the field, and our own regression tests don't seem to be hitting the theoretical race conditions either. Still, if we're going to the trouble of having mutexes here, we ought to be using them in a way that's actually safe not just almost safe. Hence, back-patch to all supported branches.
2024-05-15Re-forbid underscore in positional parametersPeter Eisentraut
Underscores were added to numeric literals in faff8f8e47. This change also affected the positional parameters (e.g., $1) rule, which uses the same production for its digits. But this did not actually work, because the digits for parameters are processed using atol(), which does not handle underscores and ignores whatever it cannot parse. The underscores notation is probably not useful for positional parameters, so for simplicity revert that rule to its old form that only accepts digits 0-9. Author: Erik Wienhold <ewie@ewie.name> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/5d216d1c-91f6-4cbe-95e2-b4cbd930520c%40ewie.name
2024-05-06Translation updatesPeter Eisentraut
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: be182cc55e6f72c66215fd9b38851969e3ce5480
2024-04-29Make two-phase tests of ECPG and main suite more concurrent-proofMichael Paquier
The ECPG and main 2PC tests have been using rather-generic names for the prepared transactions they generate. This commit switches the 2PC transactions to use more complex GIDs, reducing the risk of naming conflicts. The main 2PC tests also include scans of pg_prepared_xacts that do not apply filters on the GID of the prepared transactions, making it possible to fail the test when any 2PC transaction runs concurrently. The CI has been able to see such failures with an installcheck running the ECPG and the main regression test suites in parallel. The queries on pg_prepared_xacts gain quals to only look after the GIDs generated locally. The race is very hard to reproduce, so no backbatch is done for now. Reported-by: Richard Guo Discussion: https://postgr.es/m/CAMbWs4-mWCGbbE_bne5=AfqjYGDaUZmjCw2+soLjrdNA0xUDFw@mail.gmail.com
2024-04-24Remove obsolete symbol from ecpg_config.h.inPeter Eisentraut
HAVE_LONG_LONG_INT was not added to ecpg_config.h.in by the meson build system, but rather than add it there, we decided to remove it from the makefile build system, to make both consistent that way. There is no documentation or examples that suggest that the presence of this symbol was publicly advertised, and of course the feature is required by C99 (but we don't necessarily require C99 for ecpg user code). ecpg core code and ecpg tests use the symbol HAVE_LONG_LONG_INT_64 instead, which is still there. Discussion: https://www.postgresql.org/message-id/flat/bf35d032-02fc-4173-9f4f-840999cc3ef3%40eisentraut.org
2024-04-16Fix assorted bugs in ecpg's macro mechanism.Tom Lane
The code associated with EXEC SQL DEFINE was unreadable and full of bugs, notably: * It'd attempt to free a non-malloced string if the ecpg program tries to redefine a macro that was defined on the command line. * Possible memory stomp if user writes "-D=foo". * Undef'ing or redefining a macro defined on the command line would change the state visible to the next file, when multiple files are specified on the command line. (While possibly that could have been an intentional choice, the code clearly intends to revert to the original macro state; it's just failing to consider this interaction.) * Missing "break" in defining a new macro meant that redefinition of an existing name would cause an extra entry to be added to the definition list. While not immediately harmful, a subsequent undef would result in the prior entry becoming visible again. * The interactions with input buffering are subtle and were entirely undocumented. It's not that surprising that we hadn't noticed these bugs, because there was no test coverage at all of either the -D command line switch or multiple input files. This patch adds such coverage (in a rather hacky way I guess). In addition to the code bugs, the user documentation was confused about whether the -D switch defines a C macro or an ecpg one, and it failed to mention that you can write "-Dsymbol=value". These problems are old, so back-patch to all supported branches. Discussion: https://postgr.es/m/998011.1713217712@sss.pgh.pa.us
2024-04-08JSON_TABLE: Add support for NESTED paths and columnsAmit Langote
A NESTED path allows to extract data from nested levels of JSON objects given by the parent path expression, which are projected as columns specified using a nested COLUMNS clause, just like the parent COLUMNS clause. Rows comprised from a NESTED columns are "joined" to the row comprised from the parent columns. If a particular NESTED path evaluates to 0 rows, then the nested COLUMNS will emit NULLs, making it an OUTER join. NESTED columns themselves may include NESTED paths to allow extracting data from arbitrary nesting levels, which are likewise joined against the rows at the parent level. Multiple NESTED paths at a given level are called "sibling" paths and their rows are combined by UNIONing them, that is, after being joined against the parent row as described above. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-04-04Fix ecpg's mechanism for detecting unsupported cases in the grammar.Tom Lane
ecpg wants to emit a warning if it parses a SQL construct that the backend can parse but will immediately throw a FEATURE_NOT_SUPPORTED error for. The way it was testing for this was to see if the string ERRCODE_FEATURE_NOT_SUPPORTED appeared anywhere in the gram.y code. This is, of course, not nearly good enough, as there are plenty of rules in gram.y that throw that error only conditionally. There was a hack dating to 2008 to suppress the warning in one rule that doesn't even exist anymore, but nothing for other cases we've created since then. End result was that you could get "unsupported feature will be passed to server" warnings while compiling perfectly good SQL code in ecpg. Somehow we'd not heard complaints about this, but it was exposed by the recent addition of an ecpg test for a SQL/JSON construct. To fix, suppress the warning if the rule contains any "if" statement. Manual comparison of gram.y with the generated preproc.y file shows that the warning is now emitted only in rules where it's sensible. This problem has existed for a long time, so back-patch to all supported branches. Discussion: https://postgr.es/m/603615.1712245382@sss.pgh.pa.us
2024-04-04Further cleanup for recent JSON-related commits.Tom Lane
Add overlooked .gitignore entries. Fix test_json_parser/Makefile to use the pgxs.mk clean rule instead of fighting it. Suppresses a warning from make, at least for me.
2024-04-04Add basic JSON_TABLE() functionalityAmit Langote
JSON_TABLE() allows JSON data to be converted into a relational view and thus used, for example, in a FROM clause, like other tabular data. Data to show in the view is selected from a source JSON object using a JSON path expression to get a sequence of JSON objects that's called a "row pattern", which becomes the source to compute the SQL/JSON values that populate the view's output columns. Column values themselves are computed using JSON path expressions applied to each of the JSON objects comprising the "row pattern", for which the SQL/JSON query functions added in 6185c9737cf4 are used. To implement JSON_TABLE() as a table function, this augments the TableFunc and TableFuncScanState nodes that are currently used to support XMLTABLE() with some JSON_TABLE()-specific fields. Note that the JSON_TABLE() spec includes NESTED COLUMNS and PLAN clauses, which are required to provide more flexibility to extract data out of nested JSON objects, but they are not implemented here to keep this commit of manageable size. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-25ecpg: Fix return code for overflow in numeric conversionDaniel Gustafsson
The decimal conversion functions dectoint and dectolong are documented to return ECPG_INFORMIX_NUM_OVERFLOW in case of overflows, but always returned -1 on all errors due to incorrectly checking the returnvalue from the PGTYPES* functions. Author: Aidar Imamov <a.imamov@postgrespro.ru> Discussion: https://postgr.es/m/54d2b53327516d9454daa5fb2f893bdc@postgrespro.ru
2024-03-21Add SQL/JSON query functionsAmit Langote
This introduces the following SQL/JSON functions for querying JSON data using jsonpath expressions: JSON_EXISTS(), which can be used to apply a jsonpath expression to a JSON value to check if it yields any values. JSON_QUERY(), which can be used to to apply a jsonpath expression to a JSON value to get a JSON object, an array, or a string. There are various options to control whether multi-value result uses array wrappers and whether the singleton scalar strings are quoted or not. JSON_VALUE(), which can be used to apply a jsonpath expression to a JSON value to return a single scalar value, producing an error if it multiple values are matched. Both JSON_VALUE() and JSON_QUERY() functions have options for handling EMPTY and ERROR conditions, which can be used to specify the behavior when no values are matched and when an error occurs during jsonpath evaluation, respectively. Author: Nikita Glukhov <n.gluhov@postgrespro.ru> Author: Teodor Sigaev <teodor@sigaev.ru> Author: Oleg Bartunov <obartunov@gmail.com> Author: Alexander Korotkov <aekorotkov@gmail.com> Author: Andrew Dunstan <andrew@dunslane.net> Author: Amit Langote <amitlangote09@gmail.com> Author: Peter Eisentraut <peter@eisentraut.org> Author: Jian He <jian.universality@gmail.com> Reviewers have included (in no particular order): Andres Freund, Alexander Korotkov, Pavel Stehule, Andrew Alsup, Erik Rijkers, Zihong Yu, Himanshu Upadhyaya, Daniel Gustafsson, Justin Pryzby, Álvaro Herrera, Jian He, Anton A. Melnikov, Nikita Malakhov, Peter Eisentraut, Tomas Vondra Discussion: https://postgr.es/m/cd0bb935-0158-78a7-08b5-904886deac4b@postgrespro.ru Discussion: https://postgr.es/m/20220616233130.rparivafipt6doj3@alap3.anarazel.de Discussion: https://postgr.es/m/abd9b83b-aa66-f230-3d6d-734817f0995d%40postgresql.org Discussion: https://postgr.es/m/CA+HiwqHROpf9e644D8BRqYvaAPmgBZVup-xKMDPk-nd4EpgzHw@mail.gmail.com Discussion: https://postgr.es/m/CA+HiwqE4XTdfb1nW=Ojoy_tQSRhYt-q_kb6i5d4xcKyrLC1Nbg@mail.gmail.com
2024-03-13Make the order of the header file includes consistentPeter Eisentraut
Similar to commit 7e735035f20. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4-WhpCFMbXCjtJ%2BFzmjfPrp7Hw1pk4p%2BZpU95Kh3ofZ1A%40mail.gmail.com
2024-03-12Use printf's %m format instead of strerror(errno) in more placesMichael Paquier
Most callers of strerror() are removed from the backend code. The remaining callers require special handling with a saved errno from a previous system call. The frontend code still needs strerror() where error states need to be handled outside of fprintf. Note that pg_regress is not changed to use %m as the TAP output may clobber errno, since those functions call fprintf() and friends before evaluating the format string. Support for %m in src/port/snprintf.c has been added in d6c55de1f99a, hence all the stable branches currently supported include it. Author: Dagfinn Ilmari Mannsåker Discussion: https://postgr.es/m/87sf13jhuw.fsf@wibble.ilmari.org
2024-02-19ecpg: Fix zero-termination of string generated by intoasc()Michael Paquier
intoasc(), a wrapper for PGTYPESinterval_to_asc that converts an interval to its textual representation, used a plain memcpy() when copying its result. This could miss a zero-termination in the result string, leading to an incorrect result. The routines in informix.c do not provide the length of their result buffer, which would allow a replacement of strcpy() to safer strlcpy() calls, but this requires an ABI breakage and that cannot happen in back-branches. Author: Oleg Tselebrovskiy Reviewed-by: Ashutosh Bapat Discussion: https://postgr.es/m/bf47888585149f83b276861a1662f7e4@postgrespro.ru Backpatch-through: 12
2024-02-19ecpg: Fix error handling on OOMs when parsing timestampsMichael Paquier
pgtypes_alloc() can return NULL when failing an allocation, which is something that PGTYPEStimestamp_defmt_asc() has forgotten about when translating a timestamp for 'D', 'r', 'R' and 'T' as these require a temporary allocation. This is unlikely going to be a problem in practice, so no backpatch is done. Author: Oleg Tselebrovskiy Discussion: https://postgr.es/m/bf47888585149f83b276861a1662f7e4@postgrespro.ru
2024-02-09Avoid concurrent calls to bindtextdomain().Tom Lane
We previously supposed that it was okay for different threads to call bindtextdomain() concurrently (cf. commit 1f655fdc3). It now emerges that there's at least one gettext implementation in which that triggers an abort() crash, so let's stop doing that. Add mutexes guarding libpq's and ecpglib's calls, which are the only ones that need worry about multithreaded callers. Note: in libpq, we could perhaps have piggybacked on default_threadlock() to avoid defining a new mutex variable. I judge that not terribly safe though, since libpq_gettext could be called from code that is holding the default mutex. If that were the first such call in the process, it'd fail. An extra mutex is cheap insurance against unforeseen interactions. Per bug #18312 from Christian Maurer. Back-patch to all supported versions. Discussion: https://postgr.es/m/18312-bbbabc8113592b78@postgresql.org Discussion: https://postgr.es/m/264860.1707163416@sss.pgh.pa.us
2024-02-09Clean up Windows-specific mutex code in libpq and ecpglib.Tom Lane
Fix pthread-win32.h and pthread-win32.c to provide a more complete emulation of POSIX pthread mutexes: define PTHREAD_MUTEX_INITIALIZER and make sure that pthread_mutex_lock() can operate on a mutex object that's been initialized that way. Then we don't need the duplicative platform-specific logic in default_threadlock() and pgtls_init(), which we'd otherwise need yet a third copy of for an upcoming bug fix. Also, since default_threadlock() supposes that pthread_mutex_lock() cannot fail, try to ensure that that's actually true, by getting rid of the malloc call that was formerly involved in initializing an emulated mutex. We can define an extra state for the spinlock field instead. Also, replace the similar code in ecpglib/misc.c with this version. While ecpglib's version at least had a POSIX-compliant API, it also had the potential of failing during mutex init (but here, because of CreateMutex failure rather than malloc failure). Since all of misc.c's callers ignore failures, it seems like a wise idea to avoid failures here too. A further improvement in this area could be to unify libpq's and ecpglib's implementations into a src/port/pthread-win32.c file. But that doesn't seem like a bug fix, so I'll desist for now. In preparation for the aforementioned bug fix, back-patch to all supported branches. Discussion: https://postgr.es/m/264860.1707163416@sss.pgh.pa.us
2024-01-03Update copyright for 2024Bruce Momjian
Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12
2024-01-03Fix some typosMichael Paquier
Author: Dagfinn Ilmari Mannsåker Reviewed-by: Shubham Khanna Discussion: https://postgr.es/m/87le9fmi01.fsf@wibble.ilmari.org
2023-12-29Make all Perl warnings fatalPeter Eisentraut
There are a lot of Perl scripts in the tree, mostly code generation and TAP tests. Occasionally, these scripts produce warnings. These are probably always mistakes on the developer side (true positives). Typical examples are warnings from genbki.pl or related when you make a mess in the catalog files during development, or warnings from tests when they massage a config file that looks different on different hosts, or mistakes during merges (e.g., duplicate subroutine definitions), or just mistakes that weren't noticed because there is a lot of output in a verbose build. This changes all warnings into fatal errors, by replacing use warnings; by use warnings FATAL => 'all'; in all Perl files. Discussion: https://www.postgresql.org/message-id/flat/06f899fd-1826-05ab-42d6-adeb1fd5e200%40eisentraut.org
2023-11-06Remove distprepPeter Eisentraut
A PostgreSQL release tarball contains a number of prebuilt files, in particular files produced by bison, flex, perl, and well as html and man documentation. We have done this consistent with established practice at the time to not require these tools for building from a tarball. Some of these tools were hard to get, or get the right version of, from time to time, and shipping the prebuilt output was a convenience to users. Now this has at least two problems: One, we have to make the build system(s) work in two modes: Building from a git checkout and building from a tarball. This is pretty complicated, but it works so far for autoconf/make. It does not currently work for meson; you can currently only build with meson from a git checkout. Making meson builds work from a tarball seems very difficult or impossible. One particular problem is that since meson requires a separate build directory, we cannot make the build update files like gram.h in the source tree. So if you were to build from a tarball and update gram.y, you will have a gram.h in the source tree and one in the build tree, but the way things work is that the compiler will always use the one in the source tree. So you cannot, for example, make any gram.y changes when building from a tarball. This seems impossible to fix in a non-horrible way. Second, there is increased interest nowadays in precisely tracking the origin of software. We can reasonably track contributions into the git tree, and users can reasonably track the path from a tarball to packages and downloads and installs. But what happens between the git tree and the tarball is obscure and in some cases non-reproducible. The solution for both of these issues is to get rid of the step that adds prebuilt files to the tarball. The tarball now only contains what is in the git tree (*). Getting the additional build dependencies is no longer a problem nowadays, and the complications to keep these dual build modes working are significant. And of course we want to get the meson build system working universally. This commit removes the make distprep target altogether. The make dist target continues to do its job, it just doesn't call distprep anymore. (*) - The tarball also contains the INSTALL file that is built at make dist time, but not by distprep. This is unchanged for now. The make maintainer-clean target, whose job it is to remove the prebuilt files in addition to what make distclean does, is now just an alias to make distprep. (In practice, it is probably obsolete given that git clean is available.) The following programs are now hard build requirements in configure (they were already required by meson.build): - bison - flex - perl Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/e07408d9-e5f2-d9fd-5672-f53354e9305e@eisentraut.org
2023-10-24Speed up pg_regress server readiness testing.Daniel Gustafsson
Instead of connecting to the server with psql to check if it is ready for running tests, this changes pg_regress to use PQPing which avoids performing system() calls which are expensive on some platforms, like Windows. The frequency of tests is also increased in order to connect to the server faster. This patch is part of a larger effort to make testing consume fewer resources in order to be able to fit more tests into the available CI system constraints. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/20230823192239.jxew5s3sjru63lio@awork3.anarazel.de