Age | Commit message (Collapse) | Author |
|
The original coding of this function overlooked the possibility that
it could be passed anything except simple OpExpr indexquals. But
ScalarArrayOpExpr is possible too, and the code would probably crash
(and surely give ridiculous answers) in such a case. Add logic to try
to estimate sanely for such cases.
In passing, fix the treatment of inner-indexscan cost estimation: it was
failing to scale up properly for multiple iterations of a nestloop.
(I think somebody might've thought that index_pages_fetched() is linear,
but of course it's not.)
Report, diagnosis, and preliminary patch by Marti Raudsepp; I refactored
it a bit and fixed the cost estimation.
Back-patch into 9.1 where the bogus code was introduced.
|
|
I forgot to change the functions to use the PG_GETARG_INET_PP() macro,
when I changed DatumGetInetP() to unpack the datum, like Datum*P macros
usually do. Also, I screwed up the definition of the PG_GETARG_INET_PP()
macro, and didn't notice because it wasn't used.
This fixes the memory leak when sorting inet values, as reported
by Jochen Erwied and debugged by Andres Freund. Backpatch to 8.3, like
the previous patch that broke it.
|
|
Since record[] uses array_in, it needs to have its element type passed
as typioparam. In HEAD and 9.1, this fix essentially reverts commit
9bc933b2125a5358722490acbc50889887bf7680, which was a hack that is no
longer needed since domains don't set their typelem anymore. Before
that, adjust the logic so that only domains are excluded from being
treated like arrays, rather than assuming that only base types should
be included. Add a regression test to demonstrate the need for this.
Per report from Maxim Boguk.
Back-patch to 8.4, where type record[] was added.
|
|
On a platform that isn't supplying __FILE__, previous coding would either
crash or give a stale result for the filename string. Not sure how likely
that is, but the original code catered for it, so let's keep doing so.
|
|
In vpath builds, the __FILE__ macro that is used in verbose error
reports contains the full absolute file name, which makes the error
messages excessively verbose. So keep only the base name, thus
matching the behavior of non-vpath builds.
|
|
a new macro, DatumGetInetPP(), that does not. This brings these macros
in line with other DatumGet*P() macros.
Backpatch to 8.3, where 1-byte header varlenas were introduced.
|
|
If a tuple in a syscache contains an out-of-line toasted field, and we
try to fetch that field shortly after some other transaction has committed
an update or deletion of the tuple, there is a race condition: vacuum
could come along and remove the toast tuples before we can fetch them.
This leads to transient failures like "missing chunk number 0 for toast
value NNNNN in pg_toast_2619", as seen in recent reports from Andrew
Hammond and Tim Uckun.
The design idea of syscache is that access to stale syscache entries
should be prevented by relation-level locks, but that fails for at least
two cases where toasted fields are possible: ANALYZE updates pg_statistic
rows without locking out sessions that might want to plan queries on the
same table, and CREATE OR REPLACE FUNCTION updates pg_proc rows without
any meaningful lock at all.
The least risky fix seems to be an idea that Heikki suggested when we
were dealing with a related problem back in August: forcibly detoast any
out-of-line fields before putting a tuple into syscache in the first place.
This avoids the problem because at the time we fetch the parent tuple from
the catalog, we should be holding an MVCC snapshot that will prevent
removal of the toast tuples, even if the parent tuple is outdated
immediately after we fetch it. (Note: I'm not convinced that this
statement holds true at every instant where we could be fetching a syscache
entry at all, but it does appear to hold true at the times where we could
fetch an entry that could have a toasted field. We will need to be a bit
wary of adding toast tables to low-level catalogs that don't have them
already.) An additional benefit is that subsequent uses of the syscache
entry should be faster, since they won't have to detoast the field.
Back-patch to all supported versions. The problem is significantly harder
to reproduce in pre-9.0 releases, because of their willingness to flush
every entry in a syscache whenever the underlying catalog is vacuumed
(cf CatalogCacheFlushRelation); but there is still a window for trouble.
|
|
cash_out failed to handle multiple-byte thousands separators, as per bug
#6277 from Alexander Law. In addition, cash_in didn't handle that either,
nor could it handle multiple-byte positive_sign. Both routines failed to
support multiple-byte mon_decimal_point, which I did not think was worth
changing, but at least now they check for the possibility and fall back to
using '.' rather than emitting invalid output. Also, make cash_in handle
trailing negative signs, which formerly it would reject. Since cash_out
generates trailing negative signs whenever the locale tells it to, this
last omission represents a fail-to-reload-dumped-data bug. IMO that
justifies patching this all the way back.
|
|
The uniqueness condition might fail to hold intra-transaction, and assuming
it does can give incorrect query results. Per report from Marti Raudsepp,
though this is not his proposed patch.
Back-patch to 9.0, where both these features were introduced. In the
released branches, add the new IndexOptInfo field to the end of the struct,
to try to minimize ABI breakage for third-party code that may be examining
that struct.
|
|
CREATE EXTENSION needs to transiently set search_path, as well as
client_min_messages and log_min_messages. We were doing this by the
expedient of saving the current string value of each variable, doing a
SET LOCAL, and then doing another SET LOCAL with the previous value at
the end of the command. This is a bit expensive though, and it also fails
badly if there is anything funny about the existing search_path value,
as seen in a recent report from Roger Niederland. Fortunately, there's a
much better way, which is to piggyback on the GUC infrastructure previously
developed for functions with SET options. We just open a new GUC nesting
level, do our assignments with GUC_ACTION_SAVE, and then close the nesting
level when done. This automatically restores the prior settings without a
re-parsing pass, so (in principle anyway) there can't be an error. And
guc.c still takes care of cleanup in event of an error abort.
The CREATE EXTENSION code for this was modeled on some much older code in
ri_triggers.c, which I also changed to use the better method, even though
there wasn't really much risk of failure there. Also improve the comments
in guc.c to reflect this additional usage.
|
|
This oversight meant that on Windows, the pg_settings view would not
display source file or line number information for values coming from
postgresql.conf, unless the backend had received a SIGHUP since starting.
In passing, also make the error detection in read_nondefault_variables a
tad more thorough, and fix it to not lose precision on float GUCs (these
changes are already in HEAD as of my previous commit).
|
|
Forgot to consider this case ...
|
|
The standardized errno code for "no such locale" failures is ENOENT, which
we were just reporting at face value, viz "No such file or directory".
Per gripe from Thom Brown, this might confuse users, so add an errdetail
message to clarify what it means. Also, report newlocale() failures as
ERRCODE_INVALID_PARAMETER_VALUE rather than using
errcode_for_file_access(), since newlocale()'s errno values aren't
necessarily tied directly to file access failures.
|
|
Trailing-zero stripping applied by the FM specifier could strip zeroes
to the left of the decimal point, for a format with no digit positions
after the decimal point (such as "FM999.").
Reported and diagnosed by Marti Raudsepp, though I didn't use his patch.
|
|
With 9.1's use of Params to pass down values from NestLoop join nodes
to their inner plans, it is possible for a Param to have type RECORD, in
which case the set of fields comprising the value isn't determinable by
inspection of the Param alone. However, just as with a Var of type RECORD,
we can find out what we need to know if we can locate the expression that
the Param represents. We already knew how to do this in get_parameter(),
but I'd overlooked the need to be able to cope in get_name_for_var_field(),
which led to EXPLAIN failing with "record type has not been registered".
To fix, refactor the search code in get_parameter() so it can be used by
both functions.
Per report from Marti Raudsepp.
|
|
The code in shift_jis_20042euc_jis_2004() would fetch two bytes even when
only one remained in the string. Since conversion functions aren't
supposed to assume null-terminated input, this poses a small risk of
fetching past the end of memory and incurring SIGSEGV. No such crash has
been identified in the field, but we've certainly seen the equivalent
happen in other code paths, so patch this one all the way back.
Report and patch by Noah Misch.
|
|
dots. I previously worked around this in initdb, mapping the known
problematic locale names to aliases that work, but Hiroshi Inoue pointed
out that that's not enough because even if you use one of the aliases, like
"Chinese_HKG", setlocale(LC_CTYPE, NULL) returns back the long form, ie.
"Chinese_Hong Kong S.A.R.". When we try to restore an old locale value by
passing that value back to setlocale(), it fails. Note that you are affected
by this bug also if you use one of those short-form names manually, so just
reverting the hack in initdb won't fix it.
To work around that, move the locale name mapping from initdb to a wrapper
around setlocale(), so that the mapping is invoked on every setlocale() call.
Also, add a few checks for failed setlocale() calls in the backend. These
calls shouldn't fail, and if they do there isn't much we can do about it,
but at least you'll get a warning.
Backpatch to 9.1, where the initdb hack was introduced. The Windows bug
affects older versions too if you set locale manually to one of the aliases,
but given the lack of complaints from the field, I'm hesitent to backpatch.
|
|
Examination of examples provided by Mark Kirkwood and others has convinced
me that actually commit 7f3eba30c9d622d1981b1368f2d79ba0999cdff2 was quite
a few bricks shy of a load. The useful part of that patch was clamping
ndistinct for the inner side of a semi or anti join, and the reason why
that's needed is that it's the only way that restriction clauses
eliminating rows from the inner relation can affect the estimated size of
the join result. I had not clearly understood why the clamping was
appropriate, and so mis-extrapolated to conclude that we should clamp
ndistinct for the outer side too, as well as for both sides of regular
joins. These latter actions were all wrong, and are reverted with this
patch. In addition, the clamping logic is now made to affect the behavior
of both paths in eqjoinsel_semi, with or without MCV lists to compare.
When we have MCVs, we suppose that the most common values are the ones
that are most likely to survive the decimation resulting from a lower
restriction clause, so we think of the clamping as eliminating non-MCV
values, or potentially even the least-common MCVs for the inner relation.
Back-patch to 8.4, same as previous fixes in this area.
|
|
This patch fixes an oversight in my commit
7f3eba30c9d622d1981b1368f2d79ba0999cdff2 of 2008-10-23. That patch
accounted for baserel restriction clauses that reduced the number of rows
coming out of a table (and hence the number of possibly-distinct values of
a join variable), but not for join restriction clauses that might have been
applied at a lower level of join. To account for the latter, look up the
sizes of the min_lefthand and min_righthand inputs of the current join,
and clamp with those in the same way as for the base relations.
Noted while investigating a complaint from Ben Chobot, although this in
itself doesn't seem to explain his report.
Back-patch to 8.4; previous versions used different estimation methods
for which this heuristic isn't relevant.
|
|
It is possible for VACUUM to scan no pages at all, if the visibility map
shows that all pages are all-visible. In this situation VACUUM has no new
information to report about the relation's tuple density, so it wasn't
changing pg_class.reltuples ... but it updated pg_class.relpages anyway.
That's wrong in general, since there is no evidence to justify changing the
density ratio reltuples/relpages, but it's particularly bad if the previous
state was relpages=reltuples=0, which means "unknown tuple density".
We just replaced "unknown" with "zero". ANALYZE would eventually recover
from this, but it could take a lot of repetitions of ANALYZE to do so if
the relation size is much larger than the maximum number of pages ANALYZE
will scan, because of the moving-average behavior introduced by commit
b4b6923e03f4d29636a94f6f4cc2f5cf6298b8c8.
The only known situation where we could have relpages=reltuples=0 and yet
the visibility map asserts everything's visible is immediately following
a pg_upgrade. It might be advisable for pg_upgrade to try to preserve the
relpages/reltuples statistics; but in any case this code is wrong on its
own terms, so fix it. Per report from Sergey Koposov.
Back-patch to 8.4, where the visibility map was introduced, same as the
previous change.
|
|
Per bug #6181 from Itagaki Takahiro. Also do some marginal code cleanup
and improve error handling.
|
|
tsvector_concat() allocated its result workspace using the "conservative"
estimate of the sum of the two input tsvectors' sizes. Unfortunately that
wasn't so conservative as all that, because it supposed that the number of
pad bytes required could not grow. Which it can, as per test case from
Jesper Krogh, if there's a mix of lexemes with positions and lexemes
without them in the input data. The fix is to assume that we might add
a not-previously-present pad byte for each and every lexeme in the two
inputs; which really is conservative, but it doesn't seem worthwhile to
try to be more precise.
This is an aboriginal bug in tsvector_concat, so back-patch to all
versions containing it.
|
|
This reverts commit 1bde67c0b9adce8b7ed2a2d1fcb2788cf96cea64, which
should have been done only on the master branch.
|
|
This makes it slightly more clear that '*' is not part of the default
value, in case that wasn't obvious.
As requested by Dougal Sutherland.
|
|
The TID isn't stable enough: we might queue an sinval event before a VACUUM
FULL, and then process it afterwards, when the target tuple no longer has
the same TID. So we must invalidate entries on the basis of hash value
only. The old coding can be shown to result in various bizarre,
hard-to-reproduce errors in the presence of concurrent VACUUM FULLs on
system catalogs, and could easily result in permanent catalog corruption,
up to and including complete loss of tables.
This commit is just a minimal fix that removes the unsafe comparison.
We should remove transmission of the tuple TID from sinval messages
altogether, and then arrange to suppress the extra message in the common
case of a heap_update that doesn't change the key hashvalue. But that's
going to be much more invasive, and will only produce a probably-marginal
performance gain, so it doesn't seem like material for a back-patch.
Back-patch to 9.0. Before that, VACUUM FULL refused to do any tuple moving
if it found any INSERT_IN_PROGRESS or DELETE_IN_PROGRESS tuples (and
CLUSTER would give up altogether), so there was no risk of moving a tuple
that might be the subject of an unsent sinval message.
|
|
We have to be sure that we have revalidated each nailed-in-cache relcache
entry before we try to use it to load data for some other relcache entry.
The introduction of "mapped relations" in 9.0 broke this, because although
we updated the state kept in relmapper.c early enough, we failed to
propagate that information into relcache entries soon enough; in
particular, we could try to fetch pg_class rows out of pg_class before
we'd updated its relcache entry's rd_node.relNode value from the map.
This bug accounts for Dave Gould's report of failures after "vacuum full
pg_class", and I believe that there is risk for other system catalogs
as well.
The core part of the fix is to copy relmapper data into the relcache
entries during "phase 1" in RelationCacheInvalidate(), before they'll be
used in "phase 2". To try to future-proof the code against other similar
bugs, I also rearranged the order in which nailed relations are visited
during phase 2: now it's pg_class first, then pg_class_oid_index, then
other nailed relations. This should ensure that RelationClearRelation can
apply RelationReloadIndexInfo to all nailed indexes without risking use
of not-yet-revalidated relcache entries.
Back-patch to 9.0 where the relation mapper was introduced.
|
|
The previous code tried to synchronize by unlinking the init file twice,
but that doesn't actually work: it leaves a window wherein a third process
could read the already-stale init file but miss the SI messages that would
tell it the data is stale. The result would be bizarre failures in catalog
accesses, typically "could not read block 0 in file ..." later during
startup.
Instead, hold RelCacheInitLock across both the unlink and the sending of
the SI messages. This is more straightforward, and might even be a bit
faster since only one unlink call is needed.
This has been wrong since it was put in (in 2002!), so back-patch to all
supported releases.
|
|
The statement start timestamp was not set before initiating the transaction
that is used to look up client authentication information in pg_authid.
In consequence, enable_sig_alarm computed a wrong value (far in the past)
for statement_fin_time. That didn't have any immediate effect, because the
timeout alarm was set without reference to statement_fin_time; but if we
subsequently blocked on a lock for a short time, CheckStatementTimeout
would consult the bogus value when we cancelled the lock timeout wait,
and then conclude we'd timed out, leading to immediate failure of the
connection attempt. Thus an innocent "vacuum full pg_authid" would cause
failures of concurrent connection attempts. Noted while testing other,
more serious consequences of vacuum full on system catalogs.
We should set the statement timestamp before StartTransactionCommand(),
so that the transaction start timestamp is also valid. I'm not sure if
there are any non-cosmetic effects of it not being valid, but the xact
timestamp is at least sent to the statistics machinery.
Back-patch to 9.0. Before that, the client authentication timeout was done
outside any transaction and did not depend on this state to be valid.
|
|
The previous limit of 1024 was set on the assumption that all modern syslog
implementations have line length limits of 2KB or so. However, this is
false, as at least Solaris and sysklogd truncate at only 1KB. 900 seems
to leave enough room for the max likely length of the tacked-on prefixes,
so let's go with that.
As with the previous change, it doesn't seem wise to back-patch this into
already-released branches; but it should be OK to sneak it into 9.1.
Noah Misch
|
|
There may be some other places where we should use errdetail_internal,
but they'll have to be evaluated case-by-case. This commit just hits
a bunch of places where invoking gettext is obviously a waste of cycles.
|
|
This function supports untranslated detail messages, in the same way that
errmsg_internal supports untranslated primary messages. We've needed this
for some time IMO, but discussion of some cases in the SSI code provided
the impetus to actually add it.
Kevin Grittner, with minor adjustments by me
|
|
Regular aggregate functions in combination with, or within the arguments
of, window functions are OK per spec; they have the semantics that the
aggregate output rows are computed and then we run the window functions
over that row set. (Thus, this combination is not really useful unless
there's a GROUP BY so that more than one aggregate output row is possible.)
The case without GROUP BY could fail, as recently reported by Jeff Davis,
because sloppy construction of the Agg node's targetlist resulted in extra
references to possibly-ungrouped Vars appearing outside the aggregate
function calls themselves. See the added regression test case for an
example.
Fixing this requires modifying the API of flatten_tlist and its underlying
function pull_var_clause. I chose to make pull_var_clause's API for
aggregates identical to what it was already doing for placeholders, since
the useful behaviors turn out to be the same (error, report node as-is, or
recurse into it). I also tightened the error checking in this area a bit:
if it was ever valid to see an uplevel Var, Aggref, or PlaceHolderVar here,
that was a long time ago, so complain instead of ignoring them.
Backpatch into 9.1. The failure exists in 8.4 and 9.0 as well, but seeing
that it only occurs in a basically-useless corner case, it doesn't seem
worth the risks of changing a function API in a minor release. There might
be third-party code using pull_var_clause.
|
|
We were using GetConfigOption to collect the old value of each setting,
overlooking the possibility that it didn't exist yet. This does happen
in the case of adding a new entry within a custom variable class, as
exhibited in bug #6097 from Maxim Boguk.
To fix, add a missing_ok parameter to GetConfigOption, but only in 9.1
and HEAD --- it seems possible that some third-party code is using that
function, so changing its API in a minor release would cause problems.
In 9.0, create a near-duplicate function instead.
|
|
|
|
Per discussion, this structure seems more understandable than what was
there before. Make config.sgml and postgresql.conf.sample agree.
In passing do a bit of editorial work on the variable descriptions.
|
|
get_op_btree_interpretation assumed this in order to save some duplication
of code, but it's not true in general anymore because we added <> support
to btree_gist. (We still assume it for btree opclasses, though.)
Also, essentially the same logic was baked into predtest.c. Get rid of
that duplication by generalizing get_op_btree_interpretation so that it
can be used by predtest.c.
Per bug report from Denis de Bernardy and investigation by Jeff Davis,
though I didn't use Jeff's patch exactly as-is.
Back-patch to 9.1; we do not support this usage before that.
|
|
|
|
We had previously (af26857a2775e7ceb0916155e931008c2116632f)
established the U.S. spellings as standard.
|
|
Per bug #6082, reported by Steve Haslam
|
|
|
|
|
|
The previous code went into an infinite loop after overflow. In fact,
an overflow is not really an error; it just means that the current
value is the last one we need to return. So, just arrange to stop
immediately when overflow is detected.
Back-patch all the way.
|
|
The initial commit of the ALTER TABLE ADD FOREIGN KEY NOT VALID feature
failed to support labeling such constraints as deferrable. The best fix
for this seems to be to fold NOT VALID into ConstraintAttributeSpec.
That's a bit more general than the documented syntax, but it allows
better-targeted syntax error messages.
In addition, do some mostly-but-not-entirely-cosmetic code review for
the whole NOT VALID patch.
|
|
This oversight could result in a tuplestore using much more than the
intended amount of memory. It would only happen in a code path that loaded
a tuplestore via tuplestore_putvalues(), and many of those won't emit huge
amounts of data; but cases such as holdable cursors and plpgsql's RETURN
NEXT command could have the problem. The fix ensures that the tuplestore
will switch to write-to-disk mode when it overruns work_mem.
The potential overrun was finite, because we would still count the space
used by the tuple pointer array, so the tuplestore code would eventually
flip into write-to-disk mode anyway. When storing wide tuples we would
go far past the expected work_mem usage before that happened; but this
may account for the lack of prior reports.
Back-patch to 8.4, where tuplestore_putvalues was introduced.
Per bug #6061 from Yann Delorme.
|
|
called 'pid'.
|
|
|
|
This case was missed when NOT VALID constraints were first introduced in
commit 722bf7017bbe796decc79c1fde03e7a83dae9ada by Simon Riggs on
2011-02-08. Among other things, it causes pg_dump to omit the NOT VALID
flag when dumping such constraints, which may cause them to fail to
load afterwards, if they contained values failing the constraint.
Per report from Thom Brown.
|
|
The existence of a btree opclass accepting composite types caused us to
assume that every composite type is sortable. This isn't true of course;
we need to check if the column types are all sortable. There was logic
for this for the case of array comparison (ie, check that the element
type is sortable), but we missed the point for rowtypes. Per Teodor's
report of an ANALYZE failure for an unsortable composite type.
Rather than just add some more ad-hoc logic for this, I moved knowledge of
the issue into typcache.c. The typcache will now only report out array_eq,
record_cmp, and friends as usable operators if the array or composite type
will work with those functions.
Unfortunately we don't have enough info to do this for anonymous RECORD
types; in that case, just assume it will work, and take the runtime failure
as before if it doesn't.
This patch might be a candidate for back-patching at some point, but
given the lack of complaints from the field, I'd rather just test it in
HEAD for now.
Note: most of the places touched in this patch will need further work
when we get around to supporting hashing of record types.
|
|
parse_xml_decl's header comment says you can pass NULL for any unwanted
output parameter, but it failed to honor this contract for the "standalone"
flag. The only currently-affected caller is xml_recv, so the net effect is
that sending a binary XML value containing a standalone parameter in its
xml declaration would crash the backend. Per bug #6044 from Christopher
Dillard.
In passing, remove useless initializations of parse_xml_decl's output
parameters in xml_parse.
Back-patch to 8.3, where this code was introduced.
|
|
We had some hacks in ruleutils.c to cope with various odd transformations
that the optimizer could do on a CASE foo WHEN "CaseTestExpr = RHS" clause.
However, the fundamental impossibility of covering all cases was exposed
by Heikki, who pointed out that the "=" operator could get replaced by an
inlined SQL function, which could contain nearly anything at all. So give
up on the hacks and just print the expression as-is if we fail to recognize
it as "CaseTestExpr = RHS". (We must cover that case so that decompiled
rules print correctly; but we are not under any obligation to make EXPLAIN
output be 100% valid SQL in all cases, and already could not do so in some
other cases.) This approach requires that we have some printable
representation of the CaseTestExpr node type; I used "CASE_TEST_EXPR".
Back-patch to all supported branches, since the problem case fails in all.
|