|
If we have an array of records stored on disk, the individual record fields
cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are
only prepared to deal with TOAST pointers appearing in top-level fields of
a stored row. The same applies for ranges over composite types, nested
composites, etc. However, the existing code only took care of expanding
sub-field TOAST pointers for the case of nested composites, not for other
structured types containing composites. For example, given a command such
as
UPDATE tab SET arraycol = ARRAY[ROW(x,42)::mycompositetype] ...
where x is a direct reference to a field of an on-disk tuple, if that field
is long enough to be toasted out-of-line then the TOAST pointer would be
inserted as-is into the array column. If the source record for x is later
deleted, the array field value would become a dangling pointer, leading
to errors along the lines of "missing chunk number 0 for toast value ..."
when the value is referenced. A reproducible test case for this was
provided by Jan Pecek, but it seems likely that some of the "missing chunk
number" reports we've heard in the past were caused by similar issues.
Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to
produce a self-contained Datum value if the Datum is of composite type.
Seen in this light, the problem is not just confined to arrays and ranges,
but could also affect some other places where detoasting is done in that
way, for example index_form_tuple().
I tried teaching the array code to apply toast_flatten_tuple_attribute()
along with PG_DETOAST_DATUM() when the array element type is composite,
but this was messy and imposed extra cache lookup costs whether or not any
TOAST pointers were present, indeed sometimes when the array element type
isn't even composite (since sometimes it takes a typcache lookup to find
that out). The idea of extending that approach to all the places that
currently use PG_DETOAST_DATUM() wasn't attractive at all.
This patch instead solves the problem by decreeing that composite Datum
values must not contain any out-of-line TOAST pointers in the first place;
that is, we expand out-of-line fields at the point of constructing a
composite Datum, not at the point where we're about to insert it into a
larger tuple. This rule is applied only to true composite Datums, not
to tuples that are being passed around the system as tuples, so it's not
as invasive as it might sound at first. With this approach, the amount
of code that has to be touched for a full solution is greatly reduced,
and added cache lookup costs are avoided except when there actually is
a TOAST pointer that needs to be inlined.
The main drawback of this approach is that we might sometimes dereference
a TOAST pointer that will never actually be used by the query, imposing a
rather large cost that wasn't there before. On the other side of the coin,
if the field value is used multiple times then we'll come out ahead by
avoiding repeat detoastings. Experimentation suggests that common SQL
coding patterns are unaffected either way, though. Applications that are
very negatively affected could be advised to modify their code to not fetch
columns they won't be using.
In future, we might consider reverting this solution in favor of detoasting
only at the point where data is about to be stored to disk, using some
method that can drill down into multiple levels of nested structured types.
That will require defining new APIs for structured types, though, so it
doesn't seem feasible as a back-patchable fix.
Note that this patch changes HeapTupleGetDatum() from a macro to a function
call; this means that any third-party code using that macro will not get
protection against creating TOAST-pointer-containing Datums until it's
recompiled. The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER().
It seems likely that this is not a big problem in practice: most of the
tuple-returning functions in core and contrib produce outputs that could
not possibly be toasted anyway, and the same probably holds for third-party
extensions.
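For illustration (this is the typical caller pattern, not part of the patch),
a composite-returning C function builds its result as below; with this change,
the HeapTupleGetDatum() step is where any out-of-line field values get expanded:
    tuple = heap_form_tuple(tupdesc, values, nulls);
    /* HeapTupleGetDatum() is now a function rather than a macro, so only
     * recompiled code gets the TOAST-pointer flattening performed here */
    PG_RETURN_DATUM(HeapTupleGetDatum(tuple));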
This bug has existed since TOAST was invented, so back-patch to all
supported branches.
|
|
Since ruleutils.c recurses, it could be driven to stack overflow by
deeply nested constructs. Very large queries might also take long
enough to deparse that a check for interrupts seems like a good idea.
Stick appropriate tests into a couple of key places.
Noted by Greg Stark. Back-patch to all supported branches.
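A sketch of the sort of test added (placement is illustrative, not the exact
committed locations):
    /* at the top of the recursive deparsing routines */
    check_stack_depth();        /* fail cleanly instead of overflowing the stack */
    CHECK_FOR_INTERRUPTS();     /* let a cancel terminate a very long deparse */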
|
|
EXEC_BACKEND builds (i.e., Windows) failed to absorb values of PGC_BACKEND
parameters if they'd been changed post-startup via the config file. This
for example prevented log_connections from working if it were turned on
post-startup. The mechanism for handling this case has always been a bit
of a kluge, and it wasn't revisited when we implemented EXEC_BACKEND.
While in a normal forking environment new backends will inherit the
postmaster's value of such settings, EXEC_BACKEND backends have to read
the settings from the CONFIG_EXEC_PARAMS file, and they were mistakenly
rejecting them. This case has thus always been broken in the Windows port,
so back-patch to all supported branches.
Amit Kapila
|
|
Initialization of this field was not being done according to the
st_changecount protocol (it has to be done within the changecount increment
range, not outside). And the test to see if the value should be reported
as null was wrong. Noted while perusing uses of Port.remote_hostname.
This was wrong from the introduction of this code (commit 4a25bc145),
so back-patch to 9.1.
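For reference, the protocol the fix restores looks roughly like this (field
names are illustrative):
    beentry->st_changecount++;             /* odd: update in progress */
    beentry->st_clienthostname[0] = '\0';  /* initialize inside the increment range */
    beentry->st_changecount++;             /* even: readers may trust the fields again */
    Assert((beentry->st_changecount & 1) == 0);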
|
|
This should eliminate the risk of recursive entry to syslog(3), which
appears to be the cause of the hang reported in bug #9551 from James
Morton.
Arguably, the real problem here is auth.c's willingness to turn on
ImmediateInterruptOK while executing fairly wide swaths of backend code.
We may well need to work at narrowing the code ranges in which the
authentication_timeout interrupt is enabled. For the moment, though,
this is a cheap and reasonably noninvasive fix for a field-reported
failure; the other approach would be complex and not necessarily
bug-free itself.
Back-patch to all supported branches.
|
|
The behavior of that is undefined, although unlikely to lead to problems in
practice.
Found by running regression tests with Valgrind.
|
|
In make_ruledef and get_query_def, we have long used AcquireRewriteLocks
to ensure that the querytree we are about to deparse is up-to-date and
the schemas of the underlying relations aren't changing. However, that
function thinks the query is about to be executed, so it acquires locks
that are stronger than necessary for the purpose of deparsing. Thus for
example, if pg_dump asks to deparse a rule that includes "INSERT INTO t",
we'd acquire RowExclusiveLock on t. That results in interference with
concurrent transactions that might for example ask for ShareLock on t.
Since pg_dump is documented as being purely read-only, this is unexpected.
(Worse, it used to actually be read-only; this behavior dates back only
to 8.1, cf commit ba4200246.)
Fix this by adding a parameter to AcquireRewriteLocks to tell it whether
we want the "real" execution locks or only AccessShareLock.
Report, diagnosis, and patch by Dean Rasheed. Back-patch to all supported
branches.
|
|
The regex code didn't have any provision for query cancel, which is
unsurprising given its non-Postgres origin, but still problematic since
some operations can take a long time. Introduce a callback function to
check for a pending query cancel or session termination request, and
call it in a couple of strategic spots where we can make the regex code
exit with an error indicator.
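A minimal sketch of such a callback (the name and exact condition are
illustrative, not necessarily the committed code):
    static int
    rcancelrequested(void)
    {
        /* nonzero tells the regex code to stop and report a cancel indicator */
        return InterruptPending && (QueryCancelPending || ProcDiePending);
    }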
If we ever actually split out the regex code as a standalone library,
some additional work will be needed to let the cancel callback function
be specified externally to the library. But that's straightforward
(certainly so by comparison to putting the locale-dependent character
classification logic on a similar arms-length basis), and there seems
no need to do it right now.
A bigger issue is that there may be more places than these two where
we need to check for cancels. We can always add more checks later,
now that the infrastructure is in place.
Since there are known examples of not-terribly-long regexes that can
lock up a backend for a long time, back-patch to all supported branches.
I have hopes of fixing the known performance problems later, but adding
query cancel ability seems like a good idea even if they were all fixed.
|
|
If there are lots of uncommitted tuples at the end of the index range,
get_actual_variable_range() ends up fetching each one and doing an MVCC
visibility check on it, until it finally hits a visible tuple. This is
bad enough in isolation, considering that we don't need an exact answer,
only an approximate one. But because the tuples are not yet committed,
each visibility check does a TransactionIdIsInProgress() test, which
involves scanning the ProcArray. When multiple sessions do this
concurrently, the ensuing contention results in horrid performance loss.
20X overall throughput loss on not-too-complicated queries is easy to
demonstrate in the back branches (though someone's made it noticeably
less bad in HEAD).
We can dodge the problem fairly effectively by using SnapshotDirty rather
than a normal MVCC snapshot. This will cause the index probe to take
uncommitted tuples as good, so that we incur only one tuple fetch and test
even if there are many such tuples. The extent to which this degrades the
estimate is debatable: it's possible the result is actually a more accurate
prediction than before, if the endmost tuple has become committed by the
time we actually execute the query being planned. In any case, it's not
very likely that it makes the estimate a lot worse.
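Sketched (not the committed diff, and the variable names are assumed), the
probe changes along these lines:
    SnapshotData SnapshotDirty;
    InitDirtySnapshot(SnapshotDirty);
    /* accept in-doubt endpoint tuples instead of testing each one against MVCC */
    index_scan = index_beginscan(heapRel, indexRel, &SnapshotDirty, 1, 0);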
SnapshotDirty will still reject tuples that are known committed dead, so
we won't give bogus answers if an invalid outlier has been deleted but not
yet vacuumed from the index. (Because btrees know how to mark such tuples
dead in the index, we shouldn't have a big performance problem in the case
that there are many of them at the end of the range.) This consideration
motivates not using SnapshotAny, which was also considered as a fix.
Note: the back branches were using SnapshotNow instead of an MVCC snapshot,
but the problem and solution are the same.
Per performance complaints from Bartlomiej Romanski, Josh Berkus, and
others. Back-patch to 9.0, where the issue was introduced (by commit
40608e7f949fb7e4025c0ddd5be01939adc79eec).
|
|
Coverity identified a number of places in which it couldn't prove that a
string being copied into a fixed-size buffer would fit. We believe that
most, perhaps all of these are in fact safe, or are copying data that is
coming from a trusted source so that any overrun is not really a security
issue. Nonetheless it seems prudent to forestall any risk by using
strlcpy() and similar functions.
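The general shape of the change, shown as an illustration rather than any
specific call site (buf and source_string are placeholders):
    char    buf[MAXPGPATH];
    /* strlcpy() truncates instead of overrunning, and always NUL-terminates */
    strlcpy(buf, source_string, sizeof(buf));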
Fixes by Peter Eisentraut and Jozef Mlich based on Coverity reports.
In addition, fix a potential null-pointer-dereference crash in
contrib/chkpass. The crypt(3) function is defined to return NULL on
failure, but chkpass.c didn't check for that before using the result.
The main practical case in which this could be an issue is if libc is
configured to refuse to execute unapproved hashing algorithms (e.g.,
"FIPS mode"). This ideally should've been a separate commit, but
since it touches code adjacent to one of the buffer overrun changes,
I included it in this commit to avoid last-minute merge issues.
This issue was reported by Honza Horak.
Security: CVE-2014-0065 for buffer overruns, CVE-2014-0066 for crypt()
|
|
Several functions, mostly type input functions, calculated an allocation
size such that the calculation wrapped to a small positive value when
arguments implied a sufficiently-large requirement. Writes past the end
of the inadvertent small allocation followed shortly thereafter.
Coverity identified the path_in() vulnerability; code inspection led to
the rest. In passing, add check_stack_depth() to prevent stack overflow
in related functions.
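An illustrative form of the kind of guard added, here for something like
path_in()'s point count; the precise limits differ per call site and this is
not the exact committed text:
    /* reject point counts whose size computation would wrap around */
    if (npts <= 0 || npts >= (int32) ((INT_MAX - offsetof(PATH, p)) / sizeof(Point)))
        ereport(ERROR,
                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
                 errmsg("too many points requested")));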
Back-patch to 8.4 (all supported versions). The non-comment hstore
changes touch code that did not exist in 8.4, so that part stops at 9.0.
Noah Misch and Heikki Linnakangas, reviewed by Tom Lane.
Security: CVE-2014-0064
|
|
The primary role of PL validators is to be called implicitly during
CREATE FUNCTION, but they are also normal functions that a user can call
explicitly. Add a permissions check to each validator to ensure that a
user cannot use explicit validator calls to achieve things he could not
otherwise achieve. Back-patch to 8.4 (all supported versions).
Non-core procedural language extensions ought to make the same two-line
change to their own validators.
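That two-line addition looks roughly like the following; treat the helper's
exact name and return convention as assumptions rather than gospel:
    if (!CheckFunctionValidatorAccess(fcinfo->flinfo->fn_oid, funcoid))
        PG_RETURN_VOID();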
Andres Freund, reviewed by Tom Lane and Noah Misch.
Security: CVE-2014-0061
|
|
Granting a role without ADMIN OPTION is supposed to prevent the grantee
from adding or removing members from the granted role. Issuing SET ROLE
before the GRANT bypassed that, because the role itself had an implicit
right to add or remove members. Plug that hole by recognizing that
implicit right only when the session user matches the current role.
Additionally, do not recognize it during a security-restricted operation
or during execution of a SECURITY DEFINER function. The restriction on
SECURITY DEFINER is not security-critical. However, it seems best for a
user testing his own SECURITY DEFINER function to see the same behavior
others will see. Back-patch to 8.4 (all supported versions).
The SQL standards do not conflate roles and users as PostgreSQL does;
only SQL roles have members, and only SQL users initiate sessions. An
application using PostgreSQL users and roles as SQL users and roles will
never attempt to grant membership in the role that is the session user,
so the implicit right to add or remove members will never arise.
The security impact was mostly that a role member could revoke access
from others, contrary to the wishes of his own grantor. Unapproved role
member additions are less notable, because the member can still largely
achieve that by creating a view or a SECURITY DEFINER function.
Reviewed by Andres Freund and Tom Lane. Reported, independently, by
Jonas Sundman and Noah Misch.
Security: CVE-2014-0060
|
|
Per reports from Andres Freund and Luke Campbell, a server failure during
set_pglocale_pgservice results in a segfault rather than a useful error
message, because the infrastructure needed to use ereport hasn't been
initialized; specifically, MemoryContextInit hasn't been called.
One known cause of this is starting the server in a directory it
doesn't have permission to read.
We could try to prevent set_pglocale_pgservice from using anything that
depends on palloc or elog, but that would be messy, and the odds of future
breakage seem high. Moreover there are other things being called in main.c
that look likely to use palloc or elog too --- perhaps those things
shouldn't be there, but they are there today. The best solution seems to
be to move the call of MemoryContextInit to very early in the backend's
real main() function. I've verified that an elog or ereport occurring
immediately after that is now capable of sending something useful to
stderr.
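Sketched (simplified, not the full main.c change):
    int
    main(int argc, char *argv[])
    {
        /* set up the memory-context infrastructure first, so that any early
         * elog/ereport has ErrorContext available */
        MemoryContextInit();
        /* ... remainder of startup ... */
    }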
I also added code to elog.c to print something intelligible rather than
just crashing if MemoryContextInit hasn't created the ErrorContext.
This could happen if MemoryContextInit itself fails (due to malloc
failure), and provides some future-proofing against someone trying to
sneak in new code even earlier in server startup.
Back-patch to all supported branches. Since we've only heard reports of
this type of failure recently, it may be that some recent change has made
it more likely to see a crash of this kind; but it sure looks like it's
broken all the way back.
|
|
Instead of looking for characters that aren't valid in JSON numbers, we
simply pass the output string through the JSON number parser, and if it
fails the string is quoted. This means among other things that money and
domains over money will be quoted correctly and generate valid JSON.
Fixes bug #8676 reported by Anderson Cristian da Silva.
Backpatched to 9.2 where JSON generation was introduced.
|
|
The bug would only show up if the C sockaddr structure contained
zero in the first byte for a valid address; otherwise it would
fail to fail, which is probably why it went unnoticed for so long.
Patch submitted by Joel Jacobson after seeing an article by Andrey
Karpov in which he reports finding this through static code
analysis using PVS-Studio. While I was at it I moved a definition
of a local variable referenced in the buggy code to a more local
context.
Backpatch to all supported branches.
|
|
A pointer to a C string was treated as a pointer to a "name" datum and
passed to SPI_execute_plan(). This pointer would then end up being
passed through datumCopy(), which would try to copy the entire 64 bytes
of name data, thus running past the end of the C string. Fix by
converting the string to a proper name structure.
Found by LLVM AddressSanitizer.
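An illustrative version of the corrected pattern (variable names are
placeholders):
    NameData    name;
    /* build a proper fixed-length Name instead of passing a C string as a name Datum */
    namestrcpy(&name, cstring_value);
    values[0] = NameGetDatum(&name);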
|
|
Simple oversight in commit 1cb108efb0e60d87e4adec38e7636b6e8efbeb57 ---
recursively examining a subquery output column is only sane if the
original Var refers to a single output column. Found by Kevin Grittner.
|
|
Back-patch commits 8e68816cc2567642c6fcca4eaac66c25e0ae5ced and
8dace66e0735ca39b779922d02c24ea2686e6521 into the stable branches.
Buildfarm testing revealed no great portability surprises, and it
seems useful to have this robustness improvement in all branches.
|
|
These things didn't work because the planner omitted to do the necessary
preprocessing of a WindowFunc's argument list. Add the few dozen lines
of code needed to handle that.
Although this sounds like a feature addition, it's really a bug fix because
the default-argument case was likely to crash previously, due to lack of
checking of the number of supplied arguments in the built-in window
functions. It's not a security issue because there's no way for a
non-superuser to create a window function definition with defaults that
refers to a built-in C function, but nonetheless people might be annoyed
that it crashes rather than producing a useful error message. So
back-patch as far as the patch applies easily, which turns out to be 9.2.
I'll put a band-aid in earlier versions as a separate patch.
(Note that these features still don't work for aggregates, and fixing that
case will be harder since we represent aggregate arg lists as target lists
not bare expression lists. There's no crash risk though because CREATE
AGGREGATE doesn't accept defaults, and we reject named-argument notation
when parsing an aggregate call.)
|
|
Historically, printtup() has assumed that it could prevent memory leakage
by pfree'ing the string result of each output function and manually
managing detoasting of toasted values. This amounts to assuming that
datatype output functions never leak any memory internally, an assumption
we've already decided to be bogus elsewhere, for example in COPY OUT.
range_out in particular is known to leak multiple kilobytes per call, as
noted in bug #8573 from Godfried Vanluffelen. While we could go in and fix
that leak, it wouldn't be very notationally convenient, and in any case
there have been and undoubtedly will again be other leaks in other output
functions. So what seems like the best solution is to run the output
functions in a temporary memory context that can be reset after each row,
as we're doing in COPY OUT. Some quick experimentation suggests this is
actually a tad faster than the retail pfree's anyway.
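The per-row reset pattern, sketched (the context name and exact placement are
illustrative):
    /* once, when the receiver starts up */
    myState->tmpcontext = AllocSetContextCreate(CurrentMemoryContext,
                                                "printtup",
                                                ALLOCSET_DEFAULT_MINSIZE,
                                                ALLOCSET_DEFAULT_INITSIZE,
                                                ALLOCSET_DEFAULT_MAXSIZE);
    /* for each row: run output functions in the temp context, then wipe it */
    oldcontext = MemoryContextSwitchTo(myState->tmpcontext);
    /* ... call the type output functions and emit the row here ... */
    MemoryContextSwitchTo(oldcontext);
    MemoryContextReset(myState->tmpcontext);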
This patch fixes all the variants of printtup, except for debugtup()
which is used in standalone mode. It doesn't seem worth worrying
about query-lifespan leaks in standalone mode, and fixing that case
would be a bit tedious since debugtup() doesn't currently have any
startup or shutdown functions.
While at it, remove manual detoast management from several other
output-function call sites that had copied it from printtup(). This
doesn't make a lot of difference right now, but in view of recent
discussions about supporting "non-flattened" Datums, we're going to
want that code gone eventually anyway.
Back-patch to 9.2 where range_out was introduced. We might eventually
decide to back-patch this further, but in the absence of known major
leaks in older output functions, I'll refrain for now.
|
|
Formerly, when using a SQL-spec timezone setting with a fixed GMT offset
(called a "brute force" timezone in the code), the session_timezone
variable was not updated to match the nominal timezone; rather, all code
was expected to ignore session_timezone if HasCTZSet was true. This is
obviously fragile, though a search of the code finds only
timeofday() failing to honor the rule. A bigger problem was that
DetermineTimeZoneOffset() supposed that if its pg_tz parameter was
pointer-equal to session_timezone, then HasCTZSet should override the
parameter. This would cause datetime input containing an explicit zone
name to be treated as referencing the brute-force zone instead, if the
zone name happened to match the session timezone that had prevailed
before installing the brute-force zone setting (as reported in bug #8572).
The same malady could affect AT TIME ZONE operators.
To fix, set up session_timezone so that it matches the brute-force zone
specification, which we can do using the POSIX timezone definition syntax
"<abbrev>offset", and get rid of the bogus lookaside check in
DetermineTimeZoneOffset(). Aside from fixing the erroneous behavior in
datetime parsing and AT TIME ZONE, this will cause the timeofday() function
to print its result in the user-requested time zone rather than some
previously-set zone. It might also affect results in third-party
extensions, if there are any that make use of session_timezone without
considering HasCTZSet, but in all cases the new behavior should be saner
than before.
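For illustration only (not the committed code), installing a fixed UTC-5 zone
that way might look like:
    /* POSIX zone syntax: the offset counts hours west of Greenwich, so UTC-5
     * is "+5"; the angle-bracketed part supplies a displayable abbreviation */
    pg_tz *tz = pg_tzset("<-05>+5");
    if (tz)
        session_timezone = tz;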
Back-patch to all supported branches.
|
|
B-tree operators are not allowed to leak memory into the current memory
context. range_cmp() leaked detoasted copies of the arguments. That caused
a quick out-of-memory error when creating an index on a range column.
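The usual leak-free pattern for a btree support function, shown as an
illustration rather than the exact diff:
    RangeType  *r1 = PG_GETARG_RANGE(0);
    RangeType  *r2 = PG_GETARG_RANGE(1);
    int         cmp;
    /* ... comparison logic sets cmp ... */
    /* free detoasted copies so index builds don't accumulate them */
    PG_FREE_IF_COPY(r1, 0);
    PG_FREE_IF_COPY(r2, 1);
    PG_RETURN_INT32(cmp);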
Reported by Marian Krucina, bug #8468.
|
|
The previous coding in plancache.c essentially used 10% of the estimated
runtime as its cost estimate for planning. This can be pretty bogus,
especially when the estimated runtime is very small, such as in a simple
expression plan created by plpgsql, or a simple INSERT ... VALUES.
While we don't have a really good handle on how planning time compares
to runtime, it seems reasonable to use an estimate based on the number of
relations referenced in the query, with a rather large multiplier. This
patch uses 1000 * cpu_operator_cost * (nrelations + 1), so that even a
trivial query will be charged 1000 * cpu_operator_cost for planning.
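In code form, the heuristic amounts to something like this (variable names
assumed):
    /* charge planning cost in proportion to the number of relations referenced */
    Cost    planning_cost = 1000.0 * cpu_operator_cost * (num_relations + 1);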
This should address the problem reported by Marc Cousin and others that
9.2 and up prefer custom plans in cases where the planning time greatly
exceeds what can be saved.
|
|
The C99 and POSIX standards require strtod() to accept all these spellings
(case-insensitively): "inf", "+inf", "-inf", "infinity", "+infinity",
"-infinity". However, pre-C99 systems might accept only some or none of
these, and apparently Windows still doesn't accept "inf". To avoid
surprising cross-platform behavioral differences, manually check for each
of these spellings if strtod() fails. We were previously handling just
"infinity" and "-infinity" that way, but since C99 is most of the world
now, it seems likely that applications are expecting all these spellings
to work.
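The fallback looks roughly like this; it illustrates the approach, not the
committed float8in() text:
    val = strtod(num, &endptr);
    if (endptr == num)
    {
        /* strtod() rejected it outright; check the infinity spellings ourselves */
        if (pg_strncasecmp(num, "Infinity", 8) == 0)
        {
            val = get_float8_infinity();
            endptr = num + 8;
        }
        else if (pg_strncasecmp(num, "-Infinity", 9) == 0)
        {
            val = -get_float8_infinity();
            endptr = num + 9;
        }
        /* ... likewise for "inf", "+inf", "-inf", and "+Infinity" ... */
    }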
Per bug #8355 from Basil Peace. It turns out this fix won't actually
resolve his problem, because Python isn't being this careful; but that
doesn't mean we shouldn't be.
|
|
If a tuple is locked but not updated by a concurrent transaction,
HeapTupleSatisfiesDirty would return that transaction's Xid in xmax,
causing callers to wait on it, when it is not necessary (in fact, if the
other transaction had used a multixact instead of a plain Xid to mark
the tuple, HeapTupleSatisfiesDirty would have behaved differently and
*not* returned the Xmax).
This bug was introduced in commit 3f7fbf85dc5b42, dated December 1998,
so it's almost 15 years old now. However, it's hard to see this
misbehave, because before we had NOWAIT the only consequence of this was
that transactions would wait for slightly more time than necessary; so
it's not surprising that this hasn't been reported yet.
Craig Ringer and Andres Freund
|
|
We'd find the same match twice if it was of zero length and not immediately
adjacent to the previous match. replace_text_regexp() got similar cases
right, so adjust this search logic to match that. Note that even though
the regexp_split_to_xxx() functions share this code, they did not display
equivalent misbehavior, because the second match would be considered
degenerate and ignored.
Jeevan Chalke, with some cosmetic changes by me.
|
|
In a boolean column that contains mostly nulls, ANALYZE might not find
enough non-null values to populate the most-common-values stats,
but it would still create a pg_statistic entry with stanullfrac set.
The logic in booltestsel() for this situation did the wrong thing for
"col IS NOT TRUE" and "col IS NOT FALSE" tests, forgetting that null
values would satisfy these tests (so that the true selectivity would
be close to one, not close to zero). Per bug #8274.
Fix by Andrew Gierth, some comment-smithing by me.
|
|
It's possible to drop a column from an input table of a JOIN clause in a
view, if that column is nowhere actually referenced in the view. But it
will still be there in the JOIN clause's joinaliasvars list. We used to
replace such entries with NULL Const nodes, which is handy for generation
of RowExpr expansion of a whole-row reference to the view. The trouble
with that is that it can't be distinguished from the situation after
subquery pull-up of a constant subquery output expression below the JOIN.
Instead, replace such joinaliasvars with null pointers (empty expression
trees), which can't be confused with pulled-up expressions. expandRTE()
still emits the old convention, though, for convenience of RowExpr
generation and to reduce the risk of breaking extension code.
In HEAD and 9.3, this patch also fixes a problem with some new code in
ruleutils.c that was failing to cope with implicitly-casted joinaliasvars
entries, as per recent report from Feike Steenbergen. That oversight was
because of an inadequate description of the data structure in parsenodes.h,
which I've now corrected. There were some pre-existing oversights of the
same ilk elsewhere, which I believe are now all fixed.
|
|
In tuplesort.c:inittapes(), we calculate tapeSpace by first figuring
out how many 'tapes' we can use (maxTapes) and then multiplying the
result by the tape buffer overhead for each. Unfortunately, when
we are on a system with an 8-byte long, we allow work_mem to be
larger than 2GB and that allows maxTapes to be large enough that the
32-bit arithmetic can overflow when multiplied against the buffer
overhead.
When this overflow happens, we end up adding the overflow to the
amount of space available, causing the amount of memory allocated to
be larger than work_mem.
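The hazard and the shape of the fix, sketched (the committed change may differ
in detail):
    /* maxTapes derived from a multi-gigabyte work_mem can make this product
     * exceed 32 bits ... */
    tapeSpace = maxTapes * TAPE_BUFFER_OVERHEAD;
    /* ... so widen the arithmetic before multiplying */
    tapeSpace = (int64) maxTapes * TAPE_BUFFER_OVERHEAD;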
Note that to reach this point, you have to set work_mem to at least
24GB and be sorting a set which is at least that size. Given that a
user who can set work_mem to 24GB could also set it even higher, if
they were looking to run the system out of memory, this isn't
considered a security issue.
This overflow risk was found by the Coverity scanner.
Back-patch to all supported branches, as this issue has existed
since before 8.4.
|
|
We need to increment the refcount on the composite type's cached tuple
descriptor while we do lookups of its column types. Otherwise a cache
flush could occur and release the tuple descriptor before we're done with
it. This fails reliably with -DCLOBBER_CACHE_ALWAYS, but the odds of a
failure in a production build seem rather low (since the pfree'd descriptor
typically wouldn't get scribbled on immediately). That may explain the
lack of any previous reports. Buildfarm issue noted by Christian Ullrich.
Back-patch to 9.1 where the bogus code was added.
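The standard pattern here, as a sketch rather than the exact diff:
    TupleDesc   tupdesc;
    /* lookup_rowtype_tupdesc() returns the descriptor with its refcount bumped,
     * so a cache flush can't free it while we inspect the columns */
    tupdesc = lookup_rowtype_tupdesc(typid, typmod);
    /* ... per-column type lookups ... */
    ReleaseTupleDesc(tupdesc);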
|
|
This case doesn't normally happen, because the planner usually clamps
all row estimates to at least one row; but I found that it can arise
when dealing with relations excluded by constraints. Without a defense,
estimate_num_groups() can return zero, which leads to divisions by zero
inside the planner as well as assertion failures in the executor.
An alternative fix would be to change set_dummy_rel_pathlist() to make
the size estimate for a dummy relation 1 row instead of 0, but that seemed
pretty ugly; and probably someday we'll want to drop the convention that
the minimum rowcount estimate is 1 row.
Back-patch to 8.4, as the problem can be demonstrated that far back.
|
|
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction. However, plancache.c busily
saved or examined the search_path for every cached plan. If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt. This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.
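Roughly, the skip looks like this; the exact test in the committed code may be
spelled differently, and source_context is a placeholder:
    /* transaction control statements must not provoke catalog access, so
     * don't capture search_path for them */
    if (raw_parse_tree == NULL || !IsA(raw_parse_tree, TransactionStmt))
        plansource->search_path = GetOverrideSearchPath(source_context);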
This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
|
|
The point of turning off track_activities is to avoid this reporting
overhead, but a thinko in commit 4f42b546fd87a80be30c53a0f2c897acb826ad52
caused pgstat_report_activity() to perform half of its updates anyway.
Fix that, and also make sure that we clear all the now-disabled fields
when transitioning to the non-reporting state.
|
|
An oversight in commit e710b65c1c56ca7b91f662c63d37ff2e72862a94 allowed
database names beginning with "-" to be treated as though they were secure
command-line switches; and this switch processing occurs before client
authentication, so that even an unprivileged remote attacker could exploit
the bug, needing only connectivity to the postmaster's port. Assorted
exploits for this are possible, some requiring a valid database login,
some not. The worst known problem is that the "-r" switch can be invoked
to redirect the process's stderr output, so that subsequent error messages
will be appended to any file the server can write. This can for example be
used to corrupt the server's configuration files, so that it will fail when
next restarted. Complete destruction of database tables is also possible.
Fix by keeping the database name extracted from a startup packet fully
separate from command-line switches, as had already been done with the
user name field.
The Postgres project thanks Mitsumasa Kondo for discovering this bug,
Kyotaro Horiguchi for drafting the fix, and Noah Misch for recognizing
the full extent of the danger.
Security: CVE-2013-1899
|
|
The pg_start_backup() and pg_stop_backup() functions checked the privileges
of the initially-authenticated user rather than the current user, which is
wrong. For example, a user-defined index function could successfully call
these functions when executed by ANALYZE within autovacuum. This could
allow an attacker with valid but low-privilege database access to interfere
with creation of routine backups. Reported and fixed by Noah Misch.
Security: CVE-2013-1901
|
|
formatting.c used locale-dependent case folding rules in some code paths
where the result isn't supposed to be locale-dependent, for example
to_char(timestamp, 'DAY'). Since the source data is always just ASCII
in these cases, that usually didn't matter ... but it does matter in
Turkish locales, which have unusual treatment of "i" and "I". To confuse
matters even more, the misbehavior was only visible in UTF8 encoding,
because in single-byte encodings we used pg_toupper/pg_tolower which
don't have locale-specific behavior for ASCII characters. Fix by providing
intentionally ASCII-only case-folding functions and using these where
appropriate. Per bug #7913 from Adnan Dursun. Back-patch to all active
branches, since it's been like this for a long time.
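An illustrative ASCII-only fold (the helper here is hypothetical; the committed
functions operate on whole strings):
    static char
    ascii_toupper_char(char ch)
    {
        /* fold a-z only, ignoring locale rules such as Turkish i/I */
        if (ch >= 'a' && ch <= 'z')
            ch += 'A' - 'a';
        return ch;
    }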
|
|
I fixed this code back in commit 841b4a2d5, but didn't think carefully
enough about the behavior near zero, which meant it improperly rejected
1999-12-31 24:00:00. Per report from Magnus Hagander.
|
|
Backpatch to release 9.2
Brar Piening and Noah Misch, reviewed by Craig Ringer.
|
|
This function was misdeclared to take cstring when it should take internal.
This at least allows crashing the server, and in principle an attacker
might be able to use the function to examine the contents of server memory.
The correct fix is to adjust the system catalog contents (and fix the
regression tests that should have caught this but failed to). However,
asking users to correct the catalog contents in existing installations
is a pain, so as a band-aid fix for the back branches, install a check
in enum_recv() to make it throw error if called with a cstring argument.
We will later revert this in HEAD in favor of correcting the catalogs.
Our thanks to Sumit Soni (via Secunia SVCRP) for reporting this issue.
Security: CVE-2013-0255
|
|
Revert commit 84725aa5efe11688633b553e58113efce4181f2e
|
|
Following bug analysis of #7819 by Tom Lane
|
|
The code failed to detect an out-of-memory failure.
Xi Wang
|
|
Dates outside the supported range could be entered, but would not print
reasonably, and operations such as conversion to timestamp wouldn't behave
sanely either. Since this has the potential to result in undumpable table
data, it seems worth back-patching.
Hitoshi Harada
|
|
This partially reverts commit 21a39de5809cd3050a37d2554323cc1d0cbeed9d,
restoring the pre-9.2 cost estimates for index usage. That change
introduced much too large a bias against larger indexes, as per reports
from Jeff Janes and others. The whole thing needs a rewrite, which I've
done in HEAD, but the safest thing to do in 9.2 is just to undo this
multiplier change.
|
|
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path. And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead. This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before. This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.
By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.
An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones. This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.
Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)
Heikki Linnakangas and Tom Lane
|
|
Code review for commit 2f582f76b1945929ff07116cd4639747ce9bb8a1: don't use
a static variable for what ought to be a deparse_context field, fix
non-multibyte-safe test for spaces, avoid useless and potentially O(N^2)
(though admittedly with a very small constant) calculations of wrap
positions when we aren't going to wrap.
|
|
During crash recovery, we remove disk files belonging to temporary tables,
but the system catalog entries for such tables are intentionally not
cleaned up right away. Instead, the first backend that uses a temp schema
is expected to clean out any leftover objects therein. This approach
requires that we be careful to ignore leftover temp tables (since any
actual access attempt would fail), *even if their BackendId matches our
session*, if we have not yet established use of the session's corresponding
temp schema. That worked fine in the past, but was broken by commit
debcec7dc31a992703911a9953e299c8d730c778 which incorrectly removed the
rd_islocaltemp relcache flag. Put it back, and undo various changes
that substituted tests like "rel->rd_backend == MyBackendId" for use
of a state-aware flag. Per trouble report from Heikki Linnakangas.
Back-patch to 9.1 where the erroneous change was made. In the back
branches, be careful to add rd_islocaltemp in a spot in the struct that
was alignment padding before, so as not to break existing add-on code.
|
|
We failed to ever fill the sixth line (LISTEN_ADDR), which caused the
attempt to fill the seventh line (SHMEM_KEY) to fail, so that the shared
memory key never got added to the file in standalone mode. This has been
broken since we added more content to our lock files in 9.1.
To fix, tweak the logic in CreateLockFile to add an empty LISTEN_ADDR
line in standalone mode. This is a tad grotty, but since that function
already knows almost everything there is to know about the contents of
lock files, it doesn't seem that it's any better to hack it elsewhere.
It's not clear how significant this bug really is, since a standalone
backend should never have any children and thus it seems not critical
to be able to check the nattch count of the shmem segment externally.
But I'm going to back-patch the fix anyway.
This problem had escaped notice because of an ancient (and in hindsight
pretty dubious) decision to suppress LOG-level messages by default in
standalone mode; so that the elog(LOG) complaint in AddToDataDirLockFile
that should have warned of the problem didn't do anything. Fixing that
is material for a separate patch though.
|
|
The dynahash code requires the number of buckets in a hash table to fit
in an int; but since we calculate the desired hash table size dynamically,
there are various scenarios where we might calculate too large a value.
The resulting overflow can lead to infinite loops, division-by-zero
crashes, etc. I (tgl) had previously installed some defenses against that
in commit 299d1716525c659f0e02840e31fbe4dea3, but that covered only one
call path. Moreover it worked by limiting the request size to work_mem,
but in a 64-bit machine it's possible to set work_mem high enough that the
problem appears anyway. So let's fix the problem at the root by installing
limits in the dynahash.c functions themselves.
Trouble report and patch by Jeff Davis.
|