Age | Commit message (Collapse) | Author |
|
is in progress on the same hashtable. This seems the least invasive way to
fix the recently-recognized problem that a split could cause the scan to
visit entries twice or (with much lower probability) miss them entirely.
The only field-reported problem caused by this is the "failed to re-find
shared lock object" PANIC in COMMIT PREPARED reported by Michel Dorochevsky,
which was caused by multiply visited entries. However, it seems certain
that mdsync() is vulnerable to missing required fsync's due to missed
entries, and I am fearful that RelationCacheInitializePhase2() might be at
risk as well. Because of that and the generalized hazard presented by this
bug, back-patch all the supported branches.
Along the way, fix pg_prepared_statement() and pg_cursor() to not assume
that the hashtables they are examining will stay static between calls.
This is risky regardless of the newly noted dynahash problem, because
hash_seq_search() has never promised to cope with deletion of table entries
other than the just-returned one. There may be no bug here because the only
supported way to call these functions is via ExecMakeTableFunctionResult()
which will cycle them to completion before doing anything very interesting,
but it seems best to get rid of the assumption. This affects 8.2 and HEAD
only, since those functions weren't there earlier.
|
|
characters in all cases. Formerly we mostly just threw warnings for invalid
input, and failed to detect it at all if no encoding conversion was required.
The tighter check is needed to defend against SQL-injection attacks as per
CVE-2006-2313 (further details will be published after release). Embedded
zero (null) bytes will be rejected as well. The checks are applied during
input to the backend (receipt from client or COPY IN), so it no longer seems
necessary to check in textin() and related routines; any string arriving at
those functions will already have been validated. Conversion failure
reporting (for characters with no equivalent in the destination encoding)
has been cleaned up and made consistent while at it.
Also, fix a few longstanding errors in little-used encoding conversion
routines: win1251_to_iso, win866_to_iso, euc_tw_to_big5, euc_tw_to_mic,
mic_to_euc_tw were all broken to varying extents.
Patches by Tatsuo Ishii and Tom Lane. Thanks to Akio Ishida and Yasuo Ohgaki
for identifying the security issues.
|
|
to try to create a log segment file concurrently, but the code erroneously
specified O_EXCL to open(), resulting in a needless failure. Before 7.4,
it was even a PANIC condition :-(. Correct code is actually simpler than
what we had, because we can just say O_CREAT to start with and not need a
second open() call. I believe this accounts for several recent reports of
hard-to-reproduce "could not create file ...: File exists" errors in both
pg_clog and pg_subtrans.
|
|
Back-patch of previous fix in HEAD for plperl-vs-locale issue.
|
|
very narrow window in which SimpleLruReadPage or SimpleLruWritePage could
think that I/O was needed when it wasn't (and indeed the buffer had already
been assigned to another page). This would result in an Assert failure if
Asserts were enabled, and probably in silent data corruption if not.
Reported independently by Jim Nasby and Robert Creager.
I intend a more extensive fix when 8.2 development starts, but this is a
reasonably low-impact patch for the existing branches.
|
|
WAL record; this is necessary to be sure we recognize stale WAL records
when a WAL page was only partially written during a system crash.
|
|
ON COMMIT actions. Per bug report from Michael Guerin.
|
|
for transaction commits that occurred just before the checkpoint. This is
an EXTREMELY serious bug --- kudos to Satoshi Okada for creating a
reproducible test case to prove its existence.
|
|
and FreeDir routines modeled on the existing AllocateFile/FreeFile.
Like the latter, these routines will avoid failing on EMFILE/ENFILE
conditions whenever possible, and will prevent leakage of directory
descriptors if an elog() occurs while one is open.
Also, reduce PANIC to ERROR in MoveOfflineLogs() --- this is not
critical code and there is no reason to force a DB restart on failure.
All per recent trouble report from Olivier Hubaut.
|
|
complete ExtendCLOG() before advancing nextXid, so that if that routine
fails, the next incoming transaction will try it again. Per trouble
report from Christopher Kings-Lynne.
|
|
protocol, per report from Igor Shevchenko. NOTIFY thought it could
do its thing if transaction blockState is TBLOCK_DEFAULT, but in
reality it had better check the low-level transaction state is
TRANS_DEFAULT as well. Formerly it was not possible to wait for the
client in a state where the first is true and the second is not ...
but now we can have such a state. Minor cleanup in StartTransaction()
as well.
|
|
post-abort cleanup hooks. I'm surprised that we have not needed this
already, but I need it now to fix a plpgsql problem, and the usefulness
for other dynamically loaded modules seems obvious.
|
|
|
|
|
|
terms, add some clarifications, fix some untranslatable attempts at dynamic
message building.
|
|
now able to cope with assigning new relfilenode values to nailed-in-cache
indexes, so they can be reindexed using the fully crash-safe method. This
leaves only shared system indexes as special cases. Remove the 'index
deactivation' code, since it provides no useful protection in the shared-
index case. Require reindexing of shared indexes to be done in standalone
mode, but remove other restrictions on REINDEX. -P (IgnoreSystemIndexes)
now prevents using indexes for lookups, but does not disable index updates.
It is therefore safe to allow from PGOPTIONS. Upshot: reindexing system catalogs
can be done without a standalone backend for all cases except
shared catalogs.
|
|
|
|
|
|
so it won't miss 'em again.
|
|
|
|
|
|
|
|
|
|
xid when we fail to access pg_clog.
|
|
fixed incorrect initial setting of StartUpID. The logic in XLogWrite()
expects that Write->curridx is advanced to the next page as soon as
LogwrtResult points to the end of the current page, but StartupXLOG()
failed to make that happen when the old WAL ended exactly on a page
boundary. Per trouble report from Hannu Krosing.
|
|
|
|
least-recently-used strategy from clog.c into slru.c. It doesn't
change any visible behaviour and passes all regression tests plus a
TruncateCLOG test done manually.
Apart from refactoring I made a little change to SlruRecentlyUsed,
formerly ClogRecentlyUsed: It now skips incrementing lru_counts, if
slotno is already the LRU slot, thus saving a few CPU cycles. To make
this work, lru_counts are initialised to 1 in SimpleLruInit.
SimpleLru will be used by pg_subtrans (part of the nested transactions
project), so the main purpose of this patch is to avoid future code
duplication.
Manfred Koizar
|
|
example from Rao Kumar. This is a very corner corner-case, requiring
a minimum of three closely-spaced database crashes and an unlucky
positioning of the second recovery's checkpoint record before you'd notice
any problem. But the consequences are dire enough that it's a must-fix.
|
|
only remnant of this failed experiment is that the server will take
SET AUTOCOMMIT TO ON. Still TODO: provide some client-side autocommit
logic in libpq.
|
|
but that was enough tedium for one day. Along the way, move the few
support routines for types xid and cid into a more logical place.
|
|
dead xlog segments are not considered part of a critical section. It is
not necessary to force a database-wide panic if we get a failure in these
operations. Per recent trouble reports.
|
|
|
|
|
|
|
|
Both plannable queries and utility commands are now always executed
within Portals, which have been revamped so that they can handle the
load (they used to be good only for single SELECT queries). Restructure
code to push command-completion-tag selection logic out of postgres.c,
so that it won't have to be duplicated between simple and extended queries.
initdb forced due to addition of a field to Query nodes.
|
|
for tableID/columnID in RowDescription. (The latter isn't really
implemented yet though --- the backend always sends zeroes, and libpq
just throws away the data.)
|
|
initial values and runtime changes in selected parameters. This gets
rid of the need for an initial 'select pg_client_encoding()' query in
libpq, bringing us back to one message transmitted in each direction
for a standard connection startup. To allow server version to be sent
using the same GUC mechanism that handles other parameters, invent the
concept of a never-settable GUC parameter: you can 'show server_version'
but it's not settable by any GUC input source. Create 'lc_collate' and
'lc_ctype' never-settable parameters so that people can find out these
settings without need for pg_controldata. (These side ideas were all
discussed some time ago in pgsql-hackers, but not yet implemented.)
|
|
|
|
be reduced to a plain ERROR. Should make it at least a little less
painful to deal with data-corruption problems.
|
|
(materialization into a tuple store) discussed on pgsql-hackers earlier.
I've updated the documentation and the regression tests.
Notes on the implementation:
- I needed to change the tuple store API slightly -- it assumes that it
won't be used to hold data across transaction boundaries, so the temp
files that it uses for on-disk storage are automatically reclaimed at
end-of-transaction. I added a flag to tuplestore_begin_heap() to control
this behavior. Is changing the tuple store API in this fashion OK?
- in order to store executor results in a tuple store, I added a new
CommandDest. This works well for the most part, with one exception: the
current DestFunction API doesn't provide enough information to allow the
Executor to store results into an arbitrary tuple store (where the
particular tuple store to use is chosen by the call site of
ExecutorRun). To workaround this, I've temporarily hacked up a solution
that works, but is not ideal: since the receiveTuple DestFunction is
passed the portal name, we can use that to lookup the Portal data
structure for the cursor and then use that to get at the tuple store the
Portal is using. This unnecessarily ties the Portal code with the
tupleReceiver code, but it works...
The proper fix for this is probably to change the DestFunction API --
Tom suggested passing the full QueryDesc to the receiveTuple function.
In that case, callers of ExecutorRun could "subclass" QueryDesc to add
any additional fields that their particular CommandDest needed to get
access to. This approach would work, but I'd like to think about it for
a little bit longer before deciding which route to go. In the mean time,
the code works fine, so I don't think a fix is urgent.
- (semi-related) I added a NO SCROLL keyword to DECLARE CURSOR, and
adjusted the behavior of SCROLL in accordance with the discussion on
-hackers.
- (unrelated) Cleaned up some SGML markup in sql.sgml, copy.sgml
Neil Conway
|
|
|
|
|
|
PostgreSQL source code.
Neil Conway
|
|
support btree compaction, as per proposal of a few days ago. btree index
pages no longer store parent links, instead they have a level indicator
(counting up from zero for leaf pages). The FixBTree recovery logic is
removed, and replaced by code that detects missing parent-level insertions
during WAL replay. Also, generate appropriate WAL entries when updating
btree metapage and when building a btree index from scratch. I believe
btree indexes are now completely WAL-legal for the first time.
initdb forced due to index and WAL changes.
|
|
rather than actually opening the files. This eliminates some corner cases
where the file indeed exists but open() fails for another reason, such
as being out of file descriptors. The net reliability gain is probably
tiny, since xlog.c is full of other file open calls that will elog(PANIC)
if they fail for any reason; but this specific failure mode has been
observed in the field, so we may as well fix it.
|
|
|
|
-hackers a couple days ago.
Notes/caveats:
- added regression tests for the new functionality, all
regression tests pass on my machine
- added pg_dump support
- updated PL/PgSQL to support per-statement triggers; didn't
look at the other procedural languages.
- there's (even) more code duplication in trigger.c than there
was previously. Any suggestions on how to refactor the
ExecXXXTriggers() functions to reuse more code would be
welcome -- I took a brief look at it, but couldn't see an
easy way to do it (there are several subtly-different
versions of the code in question)
- updated the documentation. I also took the liberty of
removing a big chunk of duplicated syntax documentation in
the Programmer's Guide on triggers, and moving that
information to the CREATE TRIGGER reference page.
- I also included some spelling fixes and similar small
cleanups I noticed while making the changes. If you'd like
me to split those into a separate patch, let me know.
Neil Conway
|
|
but do it correctly now.
|
|
|
|
before commit, not after :-( --- the original coding is not only unsafe
if an error occurs while it's processing, but it generates an invalid
sequence of WAL entries. Resurrect 7.2 logic for deleting items when
no longer needed. Use an enum instead of random macros. Editorialize
on names used for routines and constants. Teach backend/nodes routines
about new field in CreateTable struct. Add a regression test.
|