path: root/src
Age | Commit message | Author
2011-11-03Support range data types.Heikki Linnakangas
Selectivity estimation functions are missing for some range type operators, which is a TODO. Jeff Davis
2011-11-03Fix handling of PlaceHolderVars in nestloop parameter management.Tom Lane
If we use a PlaceHolderVar from the outer relation in an inner indexscan, we need to reference the PlaceHolderVar as such as the value to be passed in from the outer relation. The previous code effectively tried to reconstruct the PHV from its component expression, which doesn't work since (a) the Vars therein aren't necessarily bubbled up far enough, and (b) it would be the wrong semantics anyway because of the possibility that the PHV is supposed to have gone to null at some point before the current join. Point (a) led to "variable not found in subplan target list" planner errors, but point (b) would have led to silently wrong answers. Per report from Roger Niederland.
2011-11-02Avoid scanning nulls at the beginning of a btree index scan.Tom Lane
If we have an inequality key that constrains the other end of the index, it doesn't directly help us in doing the initial positioning ... but it does imply a NOT NULL constraint on the index column. If the index stores nulls at this end, we can use the implied NOT NULL condition for initial positioning, just as if it had been stated explicitly. This avoids wasting time when there are a lot of nulls in the column. This is the reverse of the examples given in bugs #6278 and #6283, which were about failing to stop early when we encounter nulls at the end of the indexscan.
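A minimal sketch of the idea in the entry above, assuming a toy "index" that is just a Python list with NULLs (None) stored first and the remaining values sorted ascending. The function names are hypothetical and this is not the btree code itself; it only illustrates how using the implied NOT NULL condition for initial positioning avoids stepping over every leading null.

```python
def first_non_null(index):
    """Binary-search for the first non-null entry (stands in for a tree descent)."""
    lo, hi = 0, len(index)
    while lo < hi:
        mid = (lo + hi) // 2
        if index[mid] is None:
            lo = mid + 1
        else:
            hi = mid
    return lo


def naive_scan(index, bound):
    """Scan for `col < bound` starting at the very beginning of the index."""
    rows, visited = [], 0
    for value in index:
        visited += 1
        if value is None:
            continue          # wasted work: nulls can never satisfy col < bound
        if value >= bound:
            break
        rows.append(value)
    return rows, visited


def positioned_scan(index, bound):
    """Same scan, but start just past the nulls (col < bound implies NOT NULL)."""
    rows, visited = [], 0
    for value in index[first_non_null(index):]:
        visited += 1
        if value >= bound:
            break
        rows.append(value)
    return rows, visited


index = [None] * 1000 + [1, 5, 7, 12, 20]
print(naive_scan(index, 10))       # same rows, many entries visited
print(positioned_scan(index, 10))  # same rows, only a handful visited
```

Both scans return the same rows; only the number of index entries visited differs.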
2011-11-02Fix btree stop-at-nulls logic properly.Tom Lane
As pointed out by Naoya Anzai, my previous try at this was a few bricks shy of a load, because I had forgotten that the initial-positioning logic might not try to skip over nulls at the end of the index the scan will start from. We ought to fix that, because it represents an unnecessary inefficiency, but first let's get the scan-stop logic back to a safe state. With this patch, we preserve the performance benefit requested in bug #6278 for the case of scanning forward into NULLs (in a NULLS LAST index), but the reverse case of scanning backward across NULLs when there's no suitable initial-positioning qual is still inefficient.
2011-11-02Update more comments about checkpoints being done by bgwriterSimon Riggs
2011-11-02Reduce checkpoints and WAL traffic on low activity database serverSimon Riggs
Previously, we skipped a checkpoint if no WAL had been written since the last checkpoint, though this does not appear in user documentation. As of now, we skip a checkpoint until we have written enough WAL to switch to the next WAL file. This greatly reduces the level of activity and number of WAL messages generated by a very low activity server. This is safe because the purpose of a checkpoint is to act as a starting place for a recovery, in case of crash. This patch maintains minimal WAL volume for replay in case of crash, thus maintaining very low crash recovery time.
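A rough sketch of the skip rule described above, under stated assumptions: WAL positions are modelled as plain byte offsets, the segment size is taken to be 16 MB, and `should_skip_checkpoint` is a hypothetical name rather than anything in xlog.c. It only illustrates the "same WAL file as the last checkpoint" test.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: 16 MB WAL segment files


def should_skip_checkpoint(last_checkpoint_pos, current_insert_pos,
                           segment_size=WAL_SEGMENT_SIZE):
    """Skip while the insert position is still in the last checkpoint's WAL file."""
    return current_insert_pos // segment_size == last_checkpoint_pos // segment_size


# A nearly idle server: a few hundred bytes of WAL since the last checkpoint.
print(should_skip_checkpoint(1_000, 1_400))
# A busier server: WAL has moved on into the next segment file.
print(should_skip_checkpoint(1_000, WAL_SEGMENT_SIZE + 5))
```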
2011-11-02Refactor xlog.c to create src/backend/postmaster/startup.cSimon Riggs
Startup process now has its own dedicated file, just like all other special/background processes. Reduces role and size of xlog.c
2011-11-02Derive oldestActiveXid at correct time for Hot Standby.Simon Riggs
There was a timing window between when oldestActiveXid was derived and when it should have been derived that only shows itself under heavy load. Move code around to ensure correct timing of derivation. No change to StartupSUBTRANS() code, which is where this failed. Bug report by Chris Redekop
2011-11-02Start Hot Standby faster when initial snapshot is incomplete.Simon Riggs
If the initial snapshot had overflowed, we can start as soon as the latest snapshot is empty or not overflowed, or, as we did already, when the xmin on the primary is higher than the xmax of our starting snapshot, which proves we have full snapshot data. Bug report by Chris Redekop
2011-11-02Remove spurious entry from missed catch while patch jugglingSimon Riggs
2011-11-02Fix timing of Startup CLOG and MultiXact during Hot StandbySimon Riggs
Patch by me, bug report by Chris Redekop, analysis by Florian Pflug
2011-11-01Initialize myProcLocks queues just once, at postmaster startup.Robert Haas
In assert-enabled builds, we assert during the shutdown sequence that the queues have been properly emptied, and during process startup that we are inheriting empty queues. In non-assert enabled builds, we just save a few cycles.
2011-11-01Preserve Var location information during flatten_join_alias_vars.Tom Lane
This allows us to give correct syntax error pointers when complaining about ungrouped variables in a join query with aggregates or GROUP BY. It's pretty much irrelevant for the planner's use of the function, though perhaps it might aid debugging sometimes.
2011-11-01Fix race condition with toast table access from a stale syscache entry.Tom Lane
If a tuple in a syscache contains an out-of-line toasted field, and we try to fetch that field shortly after some other transaction has committed an update or deletion of the tuple, there is a race condition: vacuum could come along and remove the toast tuples before we can fetch them. This leads to transient failures like "missing chunk number 0 for toast value NNNNN in pg_toast_2619", as seen in recent reports from Andrew Hammond and Tim Uckun. The design idea of syscache is that access to stale syscache entries should be prevented by relation-level locks, but that fails for at least two cases where toasted fields are possible: ANALYZE updates pg_statistic rows without locking out sessions that might want to plan queries on the same table, and CREATE OR REPLACE FUNCTION updates pg_proc rows without any meaningful lock at all. The least risky fix seems to be an idea that Heikki suggested when we were dealing with a related problem back in August: forcibly detoast any out-of-line fields before putting a tuple into syscache in the first place. This avoids the problem because at the time we fetch the parent tuple from the catalog, we should be holding an MVCC snapshot that will prevent removal of the toast tuples, even if the parent tuple is outdated immediately after we fetch it. (Note: I'm not convinced that this statement holds true at every instant where we could be fetching a syscache entry at all, but it does appear to hold true at the times where we could fetch an entry that could have a toasted field. We will need to be a bit wary of adding toast tables to low-level catalogs that don't have them already.) An additional benefit is that subsequent uses of the syscache entry should be faster, since they won't have to detoast the field. Back-patch to all supported versions. 
The problem is significantly harder to reproduce in pre-9.0 releases, because of their willingness to flush every entry in a syscache whenever the underlying catalog is vacuumed (cf CatalogCacheFlushRelation); but there is still a window for trouble.
2011-11-01Clean up whitespace and indentation in parser and scanner filesPeter Eisentraut
These are not touched by pgindent, so clean them up a bit manually.
2011-11-01Comment changes to show bgwriter no longer performs checkpoints.Simon Riggs
2011-11-01Have checkpointer send stats once each processing loop.Simon Riggs
Noted by Fujii Masao
2011-11-01Add new file for checkpointer.cSimon Riggs
2011-11-01Split work of bgwriter between 2 processes: bgwriter and checkpointer.Simon Riggs
bgwriter is now a much less important process, responsible for page cleaning duties only. checkpointer is now responsible for checkpoints and so has a key role in shutdown. Later patches will correct doc references to the now old idea that bgwriter performs checkpoints. Has a beneficial effect on performance at high write rates, but is mainly a refactoring to more easily allow changes for power reduction, by simplifying the previously tortuous code required to allow page cleaning and checkpointing to time-slice in the same process. Patch by me, review by Dickson Guedes
2011-10-31Stop btree indexscans upon reaching nulls in either direction.Tom Lane
The existing scan-direction-sensitive tests were overly complex, and failed to stop the scan in cases where it's perfectly legitimate to do so. Per bug #6278 from Maksym Boguk. Back-patch to 8.3, which is as far back as the patch applies easily. Doesn't seem worth sweating over a relatively minor performance issue in 8.2 at this late date. (But note that this was a performance regression from 8.1 and before, so 8.2 is being left as an outlier.)
2011-10-30Support more locale-specific formatting options in cash_out().Tom Lane
The POSIX spec defines locale fields for controlling the ordering of the value, sign, and currency symbol in monetary output, but cash_out only supported a small subset of these options. Fully implement p/n_sign_posn, p/n_cs_precedes, and p/n_sep_by_space per spec. Fix up cash_in so that it will accept all these format variants. Also, make sure that thousands_sep is only inserted to the left of the decimal point, as required by spec. Per bug #6144 from Eduard Kracmar and discussion of bug #6277. This patch includes some ideas from Alexander Lakhin's proposed patch, though it is very different in detail.
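To make the locale fields named above more concrete, here is a small sketch of how `p/n_sign_posn`, `p/n_cs_precedes`, and `p/n_sep_by_space` could drive the layout of a monetary string. It is not the cash_out code: `format_money` is a hypothetical helper, the locale settings are hand-written dictionaries rather than values read from the system, and only `sep_by_space` values 0 and 1 are handled.

```python
def format_money(amount_str, negative, conv):
    """Lay out an already-grouped amount (e.g. '1,234.56') per lconv-style fields."""
    prefix = "n" if negative else "p"
    sign = conv["negative_sign"] if negative else conv["positive_sign"]
    symbol = conv["currency_symbol"]
    cs_precedes = conv[prefix + "_cs_precedes"]
    # Simplification: only sep_by_space 0 (no space) and 1 (space) are modelled.
    space = " " if conv[prefix + "_sep_by_space"] == 1 else ""
    sign_posn = conv[prefix + "_sign_posn"]

    if sign_posn == 3:        # sign immediately precedes the currency symbol
        symbol = sign + symbol
    elif sign_posn == 4:      # sign immediately follows the currency symbol
        symbol = symbol + sign

    body = symbol + space + amount_str if cs_precedes else amount_str + space + symbol

    if sign_posn == 0:        # parentheses surround the quantity and symbol
        return "(" + body + ")"
    if sign_posn == 1:        # sign precedes the quantity and symbol
        return sign + body
    if sign_posn == 2:        # sign follows the quantity and symbol
        return body + sign
    return body


# Hand-written, illustrative settings (not read from any installed locale).
dollars = {
    "currency_symbol": "$", "positive_sign": "", "negative_sign": "-",
    "p_cs_precedes": 1, "n_cs_precedes": 1,
    "p_sep_by_space": 0, "n_sep_by_space": 0,
    "p_sign_posn": 1, "n_sign_posn": 1,
}
parens = dict(dollars, n_sign_posn=0)
trailing = dict(dollars, currency_symbol="EUR", n_cs_precedes=0, n_sep_by_space=1)

print(format_money("1,234.56", True, dollars))
print(format_money("1,234.56", True, parens))
print(format_money("1.234,56", True, trailing))
```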
2011-10-30Further improvement of make_greater_string.Tom Lane
Make sure that it considers all the possibilities that the old code did, instead of trying only one possibility per character position. To keep the runtime in bounds, instead tweak the character incrementers to not try every possible multibyte character code. Remove unnecessary logic to restore the old character value on failure. Additional comment and formatting cleanup.
2011-10-29Update visibilitymap.c header comments.Robert Haas
Recent work on index-only scans left this somewhat out of date.
2011-10-29Fix assorted bogosities in cash_in() and cash_out().Tom Lane
cash_out failed to handle multiple-byte thousands separators, as per bug #6277 from Alexander Law. In addition, cash_in didn't handle that either, nor could it handle multiple-byte positive_sign. Both routines failed to support multiple-byte mon_decimal_point, which I did not think was worth changing, but at least now they check for the possibility and fall back to using '.' rather than emitting invalid output. Also, make cash_in handle trailing negative signs, which formerly it would reject. Since cash_out generates trailing negative signs whenever the locale tells it to, this last omission represents a fail-to-reload-dumped-data bug. IMO that justifies patching this all the way back.
2011-10-29Improve make_greater_string() with encoding-specific incrementers.Robert Haas
This infrastructure doesn't in any way guarantee that the character we produce will sort after the one we incremented; but it does at least make it much more likely that we'll end up with something that is a valid character, which improves our chances. Kyotaro Horiguchi, with various adjustments by me.
2011-10-28Allow hint bits to be set sooner for temporary and unlogged tables.Robert Haas
We need not wait until the commit record is durably on disk, because in the event of a crash the page we're updating with hint bits will be gone anyway. Per off-list report from Heikki Linnakangas, this can significantly degrade the performance of unlogged tables; I was able to show a 2x speedup from this patch on a pgbench run with scale factor 15. In practice, this will mostly help small, heavily updated tables, because on larger tables you're unlikely to run into the same row again before the commit record makes it out to disk.
2011-10-28Demote some sanity checks in BufferIsValid() to assertions.Robert Haas
Testing reveals that this macro is a hot-spot for index-only-scans. Per discussion with Tom Lane.
2011-10-28Remove hard-coded "\connect postgres" from pg_dumpall.Robert Haas
This doesn't appear to accomplish anything useful, and does make the restore fail if the postgres database happens to have been dropped.
2011-10-28De-parallelize ecpg build some more.Tom Lane
Make sure ecpg/include/ is rebuilt before the other subdirectories, so that ecpg_config.h is up to date. This is not likely to matter during production builds, only development, so no back-patch.
2011-10-27Update docs to point to the timezone library's new home at IANA.Tom Lane
The recent unpleasantness with copyrights has accelerated a move that was already in planning.
2011-10-27Fix the number of lwlocks needed by the "fast path" lock patch. It needsHeikki Linnakangas
one lock per backend or auxiliary process - the need for a lock for each aux process was not accounted for in NumLWLocks(). No-one noticed, because the three locks needed for the three aux processes fit into the few extra lwlocks we allocate for 3rd party modules that don't call RequestAddinLWLocks() (NUM_USER_DEFINED_LWLOCKS, 4 by default).
2011-10-27Avoid recursion while processing ELSIF lists in plpgsql.Tom Lane
The original implementation of ELSIF in plpgsql converted the construct into nested simple IF statements. This was prone to stack overflow with long ELSIF lists, in two different ways. First, it's difficult to generate the parsetree without using right-recursion in the bison grammar, and that's prone to parser stack overflow since nothing can be reduced until the whole list has been read. Second, we'd recurse during execution, thus creating an unnecessary risk of execution-time stack overflow. Rewrite so that the ELSIF list is represented as a flat list, scanned via iteration not recursion, and generated through left-recursion in the grammar. Per a gripe from Håvard Kongsgård.
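The difference between the two representations can be sketched in a few lines. This is an illustration only, with hypothetical function names and Python tuples standing in for plpgsql's statement nodes: the nested form needs one stack frame per ELSIF arm at execution time, while the flat form walks a list in a loop.

```python
def run_nested(node):
    """Nested form: (cond, then_result, else_node); a bare value is the final ELSE."""
    if not isinstance(node, tuple):
        return node
    cond, then_result, else_node = node
    if cond():
        return then_result
    return run_nested(else_node)      # recursion: depth grows with each ELSIF


def run_flat(if_cond, then_result, elsif_list, else_result):
    """Flat form: the ELSIF arms are a list scanned by iteration."""
    if if_cond():
        return then_result
    for cond, result in elsif_list:   # constant stack depth, however long the list
        if cond():
            return result
    return else_result


x = 7
is_negative = lambda: x < 0
is_small = lambda: x < 5
is_medium = lambda: x < 10

nested = (is_negative, "negative", (is_small, "small", (is_medium, "medium", "large")))
print(run_nested(nested))
print(run_flat(is_negative, "negative",
               [(is_small, "small"), (is_medium, "medium")], "large"))
```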
2011-10-27Add simple script to check for right recursion in Bison grammars.Tom Lane
We should generally use left-recursion not right-recursion to parse lists. Bison hasn't got any built-in way to check for this type of inefficiency, and I didn't find anything on the net in a quick search, so I wrote a little Perl script to do it. Add to src/tools/ so we don't have to re-invent this wheel next time we wonder if we're doing anything stupid. Currently, the only place that seems to need fixing is plpgsql's stmt_else production, so the problem doesn't appear to be common enough to warrant trying to include such a test in our standard build process. If we did want to do that, we'd need a way to ignore some false positives, such as a_expr := '-' a_expr
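For illustration, the kind of check described above can be approximated in a few lines. This is not the Perl script added to src/tools/; it is a sketch that assumes the grammar has already been parsed into a dictionary of productions, and the grammar fragment below is made up for the example. A production is flagged when its last symbol is the nonterminal being defined.

```python
def find_right_recursion(grammar):
    """Return (lhs, rhs) pairs where a multi-symbol production ends with its own lhs."""
    hits = []
    for lhs, productions in grammar.items():
        for rhs in productions:
            if len(rhs) > 1 and rhs[-1] == lhs:
                hits.append((lhs, rhs))
    return hits


# Illustrative grammar fragment (hypothetical, not copied from any .y file).
grammar = {
    "item_list": [["item"], ["item_list", "','", "item"]],            # left-recursive
    "stmt_else": [[], ["K_ELSIF", "expr", "K_THEN", "proc_sect", "stmt_else"],
                  ["K_ELSE", "proc_sect"]],                           # right-recursive
    "a_expr": [["'-'", "a_expr"], ["NUMBER"]],                        # false positive
}

for lhs, rhs in find_right_recursion(grammar):
    print(lhs, ":=", " ".join(rhs))
```

As the entry notes, a rule such as `a_expr := '-' a_expr` is reported too, even though it is not a list construct, so a real check would need a way to ignore such cases.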
2011-10-26Improve planner's ability to recognize cases where an IN's RHS is unique.Tom Lane
If the right-hand side of a semijoin is unique, then we can treat it like a normal join (or another way to say that is: we don't need to explicitly unique-ify the data before doing it as a normal join). We were recognizing such cases when the RHS was a sub-query with appropriate DISTINCT or GROUP BY decoration, but there's another way: if the RHS is a plain relation with unique indexes, we can check if any of the indexes prove the output is unique. Most of the infrastructure for that was there already in the join removal code, though I had to rearrange it a bit. Per reflection about a recent example in pgsql-performance.
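A toy illustration of why uniqueness of the right-hand side matters, using Python lists in place of relations (the function names are hypothetical and nothing here reflects planner code): when the RHS has no duplicates, a plain join returns exactly the semijoin result, so no separate unique-ification step is needed.

```python
def semijoin(left, right):
    """Each left row appears at most once if it has any match on the right."""
    right_keys = set(right)
    return [l for l in left if l in right_keys]


def inner_join(left, right):
    """A plain join emits one output row per matching (left, right) pair."""
    return [l for l in left for r in right if l == r]


left = [1, 2, 3]
unique_rhs = [2, 3, 4]       # e.g. a column covered by a unique index
duplicate_rhs = [2, 2, 3]    # no uniqueness guarantee

print(semijoin(left, unique_rhs), inner_join(left, unique_rhs))
print(semijoin(left, duplicate_rhs), inner_join(left, duplicate_rhs))
```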
2011-10-26Fix pg_bsd_indent bug where newlines were not being trimmed from typedefBruce Momjian
lines. Update pg_bsd_indent required version to 1.1 (and update ftp site). Problem reported by Magnus.
2011-10-26Implement streaming xlog for backup toolsMagnus Hagander
Add option for parallel streaming of the transaction log while a base backup is running, to get the logfiles before the server has removed them. Also add a tool called pg_receivexlog, which streams the transaction log into files, creating a log archive without having to wait for segments to complete, thus decreasing the window of data loss without having to waste space using archive_timeout. This works best in combination with archive_command - suggested usage docs etc coming later.
2011-10-26MingW doesn't support wcstombs_s()...Magnus Hagander
2011-10-26Change FK trigger naming convention to fix self-referential FKs.Tom Lane
Use names like "RI_ConstraintTrigger_a_NNNN" for FK action triggers and "RI_ConstraintTrigger_c_NNNN" for FK check triggers. This ensures the action trigger fires first in self-referential cases where the very same row update fires both an action and a check trigger. This change provides a non-probabilistic solution for bug #6268, at the risk that it could break client code that is making assumptions about the exact names assigned to auto-generated FK triggers. Hence, change this in HEAD only. No need for forced initdb since old triggers continue to work fine.
2011-10-26Change FK trigger creation order to better support self-referential FKs.Tom Lane
When a foreign-key constraint references another column of the same table, row updates will queue both the PK's ON UPDATE action and the FK's CHECK action in the same event. The ON UPDATE action must execute first, else the CHECK will check a non-final state of the row and possibly throw an inappropriate error, as seen in bug #6268 from Roman Lytovchenko. Now, the firing order of multiple triggers for the same event is determined by the sort order of their pg_trigger.tgnames, and the auto-generated names we use for FK triggers are "RI_ConstraintTrigger_NNNN" where NNNN is the trigger OID. So most of the time the firing order is the same as creation order, and so rearranging the creation order fixes it. This patch will fail to fix the problem if the OID counter wraps around or adds a decimal digit (eg, from 99999 to 100000) while we are creating the triggers for an FK constraint. Given the small odds of that, and the low usage of self-referential FKs, we'll live with that solution in the back branches. A better fix is to change the auto-generated names for FK triggers, but it seems unwise to do that in stable branches because there may be client code that depends on the naming convention. We'll fix it that way in HEAD in a separate patch. Back-patch to all supported branches, since this bug has existed for a long time.
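The name-ordering argument in the two FK trigger entries above can be checked with a short sketch. It assumes trigger names compare as plain byte strings and uses made-up OIDs; it shows why the old "RI_ConstraintTrigger_NNNN" names can fire in the wrong order when the OID gains a digit, and why the "_a_"/"_c_" names do not depend on the OIDs at all.

```python
# Old convention: action trigger created first, check trigger second,
# with the OID counter rolling from 99999 to 100000 in between (hypothetical OIDs).
old_action = "RI_ConstraintTrigger_99999"
old_check = "RI_ConstraintTrigger_100000"
old_order = sorted([old_action, old_check])

# New convention: "_a_" for action triggers, "_c_" for check triggers.
# The OIDs are deliberately chosen to be unfavourable.
new_action = "RI_ConstraintTrigger_a_100000"
new_check = "RI_ConstraintTrigger_c_99999"
new_order = sorted([new_check, new_action])

print(old_order)  # the check trigger sorts first, because "1" < "9"
print(new_order)  # the action trigger sorts first, because "a" < "c"
```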
2011-10-25Make event_source visible on all platformsMagnus Hagander
On non-Windows platforms, we just ignore any value set there. Noted by Jaime Casanova
2011-10-25Remove argument decoration that appears unsupported on mingwMagnus Hagander
2011-10-25Support configurable eventlog application names on WindowsMagnus Hagander
This allows different instances to use the eventlog with different identifiers, by setting the event_source GUC, similar to how syslog_ident works. Original patch by MauMau, heavily modified by Magnus Hagander
2011-10-24Add debugging aid in isolationtesterAlvaro Herrera
2011-10-24Make TABLE tab completion in psql include all relationsMagnus Hagander
Not just tables, since views also work fine with the TABLE command.
2011-10-23Make psql support tab completion of EXECUTE <prepared-statement-name>.Tom Lane
Andreas Karlsson, reviewed by Josh Kupershmidt
2011-10-23Improve git_changelog's handling of inconsistent commit orderings.Tom Lane
Use the CommitDate not the AuthorDate, as the former is representative of the order in which things went into the main repository, and the latter isn't very; we now have instances where the AuthorDate is as much as a month before the patch really went in. Also, get rid of the "commit order inversions" heuristic, which turns out not to do anything very desirable. Instead we just print commits in strict timestamp order, interpreting the "timestamp" of a merged commit as its timestamp on the newest branch it appears in. This fixes some cases where very ancient commits were being printed relatively early in the report.
2011-10-23Don't trust deferred-unique indexes for join removal.Tom Lane
The uniqueness condition might fail to hold intra-transaction, and assuming it does can give incorrect query results. Per report from Marti Raudsepp, though this is not his proposed patch. Back-patch to 9.0, where both these features were introduced. In the released branches, add the new IndexOptInfo field to the end of the struct, to try to minimize ABI breakage for third-party code that may be examining that struct.
2011-10-22Support synchronization of snapshots through an export/import procedure.Tom Lane
A transaction can export a snapshot with pg_export_snapshot(), and then others can import it with SET TRANSACTION SNAPSHOT. The data does not leave the server so there are no security issues. A snapshot can only be imported while the exporting transaction is still running, and there are some other restrictions. I'm not totally convinced that we've covered all the bases for SSI (true serializable) mode, but it works fine for lesser isolation modes. Joachim Wieland, reviewed by Marko Tiikkaja, and rather heavily modified by Tom Lane
2011-10-22Fix overly-complicated usage of errcode_for_file_access().Heikki Linnakangas
No need to do "errcode(errcode_for_file_access())", just "errcode_for_file_access()" is enough. The extra errcode() call is useless but harmless, so there's no user-visible bug here. Nevertheless, backpatch to 9.1 where this code was added.
2011-10-21Code review for pgstat_get_crashed_backend_activity patch.Tom Lane
Avoid possibly dumping core when pgstat_track_activity_query_size has a less-than-default value; avoid uselessly searching for the query string of a successfully-exited backend; don't bother putting out an ERRDETAIL if we don't have a query to show; some other minor stylistic improvements.