path: root/src/include
2008-08-11 Heikki Linnakangas: Relation forks patch requires a catversion bump due to changes in the format
of some WAL records, and two-phase state files, which I forgot.
2008-08-11 Heikki Linnakangas: Introduce the concept of relation forks. An smgr relation can now consist
of multiple forks, and each fork can be created and grown separately. The bulk of this patch is about changing the smgr API to include an extra ForkNumber argument in every smgr function. Also, smgrscheduleunlink and smgrdounlink no longer implicitly call smgrclose, because other forks might still exist after unlinking one. The callers of those functions have been modified to call smgrclose instead. This patch in itself doesn't have any user-visible effect, but provides the infrastructure needed for upcoming patches. The additional forks envisioned are a rewritten FSM implementation that doesn't rely on a fixed-size shared memory block, and a visibility map to allow skipping portions of a table in VACUUM that have no dead tuples.
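
A minimal sketch of the idea above, assuming a toy in-memory storage manager. Every name here (ToyForkNumber, ToySMgrRelation, the toy_smgr* functions) is hypothetical and only mirrors the shape of the change: each call takes an extra fork argument, and unlinking one fork does not close the relation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical fork numbers for illustration; not PostgreSQL's actual enum. */
typedef enum ToyForkNumber {
    TOY_MAIN_FORK = 0,
    TOY_EXTRA_FORK,
    TOY_NUM_FORKS
} ToyForkNumber;

/* Toy relation handle: one block counter per fork. */
typedef struct ToySMgrRelation {
    bool     is_open;
    bool     fork_exists[TOY_NUM_FORKS];
    unsigned nblocks[TOY_NUM_FORKS];
} ToySMgrRelation;

void toy_smgropen(ToySMgrRelation *reln)
{
    reln->is_open = true;
    for (int f = 0; f < TOY_NUM_FORKS; f++) {
        reln->fork_exists[f] = false;
        reln->nblocks[f] = 0;
    }
}

/* Each fork is created separately... */
void toy_smgrcreate(ToySMgrRelation *reln, ToyForkNumber forknum)
{
    assert(reln->is_open);
    reln->fork_exists[forknum] = true;
    reln->nblocks[forknum] = 0;
}

/* ...and grown separately, one block at a time. */
void toy_smgrextend(ToySMgrRelation *reln, ToyForkNumber forknum)
{
    assert(reln->is_open && reln->fork_exists[forknum]);
    reln->nblocks[forknum]++;
}

unsigned toy_smgrnblocks(const ToySMgrRelation *reln, ToyForkNumber forknum)
{
    return reln->nblocks[forknum];
}

/* Unlinking one fork leaves the relation open: other forks may still exist. */
void toy_smgrdounlink(ToySMgrRelation *reln, ToyForkNumber forknum)
{
    reln->fork_exists[forknum] = false;
    reln->nblocks[forknum] = 0;
}

/* The caller closes the relation explicitly when it is done with all forks. */
void toy_smgrclose(ToySMgrRelation *reln)
{
    reln->is_open = false;
}
```

In this sketch a caller that drops the extra fork still sees the main fork's blocks afterwards, and has to call toy_smgrclose() itself.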
2008-08-07 Tom Lane: Improve INTERSECT/EXCEPT hashing by realizing that we don't need to make any
hashtable entries for tuples that are found only in the second input: they can never contribute to the output. Furthermore, this implies that the planner should endeavor to put first the smaller (in number of groups) input relation for an INTERSECT. Implement that, and upgrade prepunion's estimation of the number of rows returned by setops so that there's some amount of sanity in the estimate of which one is smaller.
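
The observation above can be sketched on plain integers. This is an illustration only, assuming a tiny fixed-size open-addressing table; the names (Slot, slot_for, hash_intersect, TABLE_SLOTS) are hypothetical and unrelated to the executor code. Entries are created from the first input only; the second input merely probes and marks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy capacity; must exceed the number of distinct values in the first input. */
#define TABLE_SLOTS 64

typedef struct {
    int  key;
    bool used;
    bool seen_in_second;
} Slot;

/* Linear probing: return the slot holding key, or the empty slot where it would go. */
static size_t slot_for(const Slot *table, int key)
{
    size_t i = ((unsigned int) key * 2654435761u) % TABLE_SLOTS;

    while (table[i].used && table[i].key != key)
        i = (i + 1) % TABLE_SLOTS;
    return i;
}

/* Writes the distinct values common to both inputs into out; returns how many. */
size_t hash_intersect(const int *first, size_t nfirst,
                      const int *second, size_t nsecond,
                      int *out)
{
    Slot   table[TABLE_SLOTS] = {0};
    size_t ngroups = 0;
    size_t nout = 0;

    /* Pass 1: one entry per distinct value of the FIRST input. */
    for (size_t i = 0; i < nfirst; i++) {
        size_t s = slot_for(table, first[i]);

        if (!table[s].used) {
            assert(ngroups < TABLE_SLOTS - 1);  /* keep one slot empty so probing stops */
            table[s].used = true;
            table[s].key = first[i];
            ngroups++;
        }
    }

    /* Pass 2: the second input only probes; values missing from the table are skipped. */
    for (size_t i = 0; i < nsecond; i++) {
        size_t s = slot_for(table, second[i]);

        if (table[s].used)
            table[s].seen_in_second = true;
    }

    /* Emit in table order, which is not sorted order. */
    for (size_t s = 0; s < TABLE_SLOTS; s++)
        if (table[s].used && table[s].seen_in_second)
            out[nout++] = table[s].key;

    return nout;
}
```

Since the table is sized by the first input's groups, passing the input with fewer groups first keeps the table smaller, which is the reason for the planner change described above.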
2008-08-07 Tom Lane: Support hashing for duplicate-elimination in INTERSECT and EXCEPT queries.
This completes my project of improving usage of hashing for duplicate elimination (aggregate functions with DISTINCT remain undone, but that's for some other day). As with the previous patches, this means we can INTERSECT/EXCEPT on datatypes that can hash but not sort, and it means that INTERSECT/EXCEPT without ORDER BY are no longer certain to produce sorted output.
2008-08-07 Tom Lane: Teach the system how to use hashing for UNION. (INTERSECT/EXCEPT will follow,
but seem like a separate patch since most of the remaining work is on the executor side.) I took the opportunity to push selection of the grouping operators for set operations into the parser where it belongs. Otherwise this is just a small exercise in making prepunion.c consider both alternatives. As with the recent DISTINCT patch, this means we can UNION on datatypes that can hash but not sort, and it means that UNION without ORDER BY is no longer certain to produce sorted output.
2008-08-05 Magnus Hagander: Move pgstat.tmp into a temporary directory under $PGDATA named pg_stat_tmp.
This allows the use of a ramdrive (either through mount or symlink) for the temporary file that's written every half second, which should reduce I/O. On server shutdown/startup, the file is written to the old location in the global directory, to preserve data across restarts. Bump catversion since the $PGDATA directory layout changed.
2008-08-05 Tom Lane: Improve SELECT DISTINCT to consider hash aggregation, as well as sort/uniq,
as methods for implementing the DISTINCT step. This eliminates the former performance gap between DISTINCT and GROUP BY, and also makes it possible to do SELECT DISTINCT on datatypes that only support hashing not sorting. SELECT DISTINCT ON is still always implemented by sorting; it would take executor changes to support hashing that, and it's not clear it's worth the trouble. This is a release-note-worthy incompatibility from previous PG versions, since SELECT DISTINCT can no longer be counted on to deliver sorted output without explicitly saying ORDER BY. (Anyone who can't cope with that can consider turning off enable_hashagg.) Several regression test queries needed to have ORDER BY added to preserve stable output order. I fixed the ones that manifested here, but there might be some other cases that show up on other platforms.
2008-08-04 Tom Lane: Improve CREATE/DROP/RENAME DATABASE so that when failing because the source
or target database is being accessed by other users, it tells you whether the "other users" are live sessions or uncommitted prepared transactions. (Indeed, it tells you exactly how many of each, but that's mostly just because it was easy to do so.) This should help forestall the gotcha of not realizing that a prepared transaction is what's blocking the command. Per discussion.
2008-08-02 Tom Lane: Rearrange the querytree representation of ORDER BY/GROUP BY/DISTINCT items
as per my recent proposal:

1. Fold SortClause and GroupClause into a single node type SortGroupClause. We were already relying on them to be struct-equivalent, so using two node tags wasn't accomplishing much except to get in the way of comparing items with equal().

2. Add an "eqop" field to SortGroupClause to carry the associated equality operator. This is cheap for the parser to get at the same time it's looking up the sort operator, and storing it eliminates the need for repeated not-so-cheap lookups during planning. In future this will also let us represent GROUP/DISTINCT operations on datatypes that have hash opclasses but no btree opclasses (ie, they have equality but no natural sort order). The previous representation simply didn't work for that, since its only indicator of comparison semantics was a sort operator.

3. Add a hasDistinctOn boolean to struct Query to explicitly record whether the distinctClause came from DISTINCT or DISTINCT ON. This allows removing some complicated and not 100% bulletproof code that attempted to figure that out from the distinctClause alone.

This patch doesn't in itself create any new capability, but it's necessary infrastructure for future attempts to use hash-based grouping for DISTINCT and UNION/INTERSECT/EXCEPT.
2008-08-01 Magnus Hagander: Move ident authentication code into auth.c along with the other authentication
routines, leaving hba.c to deal only with processing the HBA specific files.
2008-07-31 Tom Lane: Fix parser so that we don't modify the user-written ORDER BY list in order
to represent DISTINCT or DISTINCT ON. This gets rid of a longstanding annoyance that a view or rule using SELECT DISTINCT will be dumped out with an overspecified ORDER BY list, and is one small step along the way to decoupling DISTINCT and ORDER BY enough so that hash-based implementation of DISTINCT will be possible. In passing, improve transformDistinctClause so that it doesn't reject duplicate DISTINCT ON items, as was reported by Steve Midgley a couple weeks ago.
2008-07-30 Tom Lane: Flip the default typispreferred setting from true to false. This affects
only type categories in which the previous coding made *every* type preferred; so there is no change in effective behavior, because the function resolution rules only do something different when faced with a choice between preferred and non-preferred types in the same category. It just seems safer and less surprising to have CREATE TYPE default to non-preferred status ...
2008-07-30 Tom Lane: Replace the hard-wired type knowledge in TypeCategory() and IsPreferredType()
with system catalog lookups, as was foreseen to be necessary almost since their creation. Instead put the information into two new pg_type columns, typcategory and typispreferred. Add support for setting these when creating a user-defined base type. The category column is just a "char" (i.e. a poor man's enum), allowing a crude form of user extensibility of the category list: just use an otherwise-unused character. This seems sufficient for foreseen uses, but we could upgrade to having an actual category catalog someday, if there proves to be a huge demand for custom type categories. In this patch I have attempted to hew exactly to the behavior of the previous hardwired logic, except for introducing new type categories for arrays, composites, and enums. In particular the default preferred state for user-defined types remains TRUE. That seems worth revisiting, but it should be done as a separate patch from introducing the infrastructure. Likewise, any adjustment of the standard set of categories should be done separately.
2008-07-26 Tom Lane: As noted by Andrew Gierth, there's really no need any more to force a junk
filter to be used when INSERT or SELECT INTO has a plan that returns raw disk tuples. The virtual-tuple-slot optimizations that were put in place awhile ago mean that ExecInsert has to do ExecMaterializeSlot, and that already copies the tuple if it's raw (and does so more efficiently than a junk filter, too). So get rid of that logic. This in turn means that we can throw away ExecMayReturnRawTuples, which wasn't used for any other purpose, and was always a kluge anyway. In passing, move a couple of SELECT-INTO-specific fields out of EState and into the private state of the SELECT INTO DestReceiver, as was foreseen in an old comment there. Also make intorel_receive use ExecMaterializeSlot not ExecCopySlotTuple, for consistency with ExecInsert and to possibly save a tuple copy step in some cases.
2008-07-23 Tom Lane: Use guc.c's parse_int() instead of pg_atoi() to parse fillfactor in
default_reloptions(). The previous coding was really a bug because pg_atoi() will always throw elog on bad input data, whereas default_reloptions is not supposed to complain about bad input unless its validate parameter is true. Right now you could only expose the problem by hand-modifying pg_class.reloptions into an invalid state, so it doesn't seem worth back-patching; but we should get it right in HEAD because there might be other situations in future. Noted while studying GIN fast-update patch.
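
A rough sketch of the distinction drawn above, assuming simplified stand-ins: toy_parse_int() reports bad input through its return value instead of raising an error, so the caller can decide whether to complain. Both function names and their signatures are hypothetical; the real parse_int() in guc.c takes additional arguments.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Returns true and stores the value on success; returns false on bad input. */
bool toy_parse_int(const char *value, int *result)
{
    char *end;
    long  v;

    assert(value != NULL && result != NULL);
    errno = 0;
    v = strtol(value, &end, 10);
    if (end == value || *end != '\0' || errno == ERANGE ||
        v < INT_MIN || v > INT_MAX)
        return false;
    *result = (int) v;
    return true;
}

/*
 * Bad input is reported (here: by setting *complained) only when validate
 * is true; otherwise the fallback value is used silently.
 */
int toy_fillfactor(const char *value, bool validate, int fallback,
                   bool *complained)
{
    int parsed;

    *complained = false;
    if (toy_parse_int(value, &parsed))
        return parsed;
    if (validate)
        *complained = true;
    return fallback;
}
```

A parser that always throws on bad input cannot support the validate = false path, which is the bug the commit describes.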
2008-07-18 Tom Lane: Adjust things so that the query_string of a cached plan and the sourceText of
a portal are never NULL, but reliably provide the source text of the query. It turns out that there was only one place that was really taking a short-cut, which was the 'EXECUTE' utility statement. That doesn't seem like a sufficiently critical performance hotspot to justify not offering a guarantee of validity of the portal source text. Fix it to copy the source text over from the cached plan. Add Asserts in the places that set up cached plans and portals to reject null source strings, and simplify a bunch of places that formerly needed to guard against nulls.

There may be a few places that cons up statements for execution without having any source text at all; I found one such in ConvertTriggerToFK(). It seems sufficient to inject a phony source string in such a case, for instance

    ProcessUtility((Node *) atstmt,
                   "(generated ALTER TABLE ADD FOREIGN KEY command)",
                   NULL, false, None_Receiver, NULL);

We should take a second look at the usage of debug_query_string, particularly the recently added current_query() SQL function.

ITAGAKI Takahiro and Tom Lane
2008-07-18 Tom Lane: Provide a function hook to let plug-ins get control around ExecutorRun.
ITAGAKI Takahiro
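
The general function-hook pattern can be sketched as follows. This is a toy model, not the executor's actual interface: the types, the single-argument signature, the counters and all toy_* names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the executor's query descriptor. */
typedef struct ToyQueryDesc {
    const char *source_text;
} ToyQueryDesc;

typedef void (*toy_executor_run_hook_type)(ToyQueryDesc *query_desc);

/* Core side: the hook is NULL unless a plug-in installs something. */
toy_executor_run_hook_type toy_executor_run_hook = NULL;

int toy_standard_calls = 0;   /* counters exist only to make the flow observable */
int toy_plugin_calls = 0;

void toy_standard_executor_run(ToyQueryDesc *query_desc)
{
    assert(query_desc != NULL);
    toy_standard_calls++;
}

/* Entry point: call the hook if one is set, else the standard routine. */
void toy_executor_run(ToyQueryDesc *query_desc)
{
    if (toy_executor_run_hook)
        (*toy_executor_run_hook)(query_desc);
    else
        toy_standard_executor_run(query_desc);
}

/* Plug-in side: remember any earlier hook and chain to it. */
static toy_executor_run_hook_type toy_prev_hook = NULL;

static void toy_plugin_executor_run(ToyQueryDesc *query_desc)
{
    toy_plugin_calls++;                      /* work done "around" the run */
    if (toy_prev_hook)
        (*toy_prev_hook)(query_desc);
    else
        toy_standard_executor_run(query_desc);
}

void toy_plugin_init(void)
{
    toy_prev_hook = toy_executor_run_hook;
    toy_executor_run_hook = toy_plugin_executor_run;
}
```

Saving the previous hook value at load time lets several plug-ins stack without knowing about each other.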
2008-07-18 Tom Lane: Implement SQL-spec RETURNS TABLE syntax for functions.
(Unlike the original submission, this patch treats TABLE output parameters as being entirely equivalent to OUT parameters -- tgl) Pavel Stehule
2008-07-16 Tom Lane: Add a "provariadic" column to pg_proc to eliminate the remarkably expensive
need to deconstruct proargmodes for each pg_proc entry inspected by FuncnameGetCandidates(). Fixes function lookup performance regression caused by yesterday's variadic-functions patch. In passing, make pg_proc.probin be NULL, rather than a dummy value '-', in cases where it is not actually used for the particular type of function. This should buy back some of the space cost of the extra column.
2008-07-16 Tom Lane: Support "variadic" functions, which can accept a variable number of arguments
so long as all the trailing arguments are of the same (non-array) type. The function receives them as a single array argument (which is why they have to all be the same type). It might be useful to extend this facility to aggregates, but this patch doesn't do that. This patch imposes a noticeable slowdown on function lookup --- a follow-on patch will fix that by adding a redundant column to pg_proc. Pavel Stehule
2008-07-16 Bruce Momjian: Add array_fill() to create arrays initialized with a value.
Pavel Stehule
2008-07-14 Tom Lane: Clean up buildfarm failures arising from the seemingly straightforward page
macros patch :-(. Results from both baiji and mastodon imply that MSVC fails to perceive offsetof(PageHeaderData, pd_linp[0]) as a constant expression in some contexts where offsetof(PageHeaderData, pd_linp) works fine. Sloth, thy name is Micro.
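
To make the two spellings concrete, here is a small sketch on a toy struct. ToyPageHeader and the function names are hypothetical; only the offsetof() usage pattern is taken from the message above. With compilers that accept both forms they yield the same value, and the member form is the one used where a constant expression is required.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy page header ending in a nominally one-element array. */
typedef struct ToyPageHeader {
    uint16_t lower;
    uint16_t upper;
    uint32_t linp[1];   /* really variable-length */
} ToyPageHeader;

/* An enum constant must be an integer constant expression; use the member form. */
enum { TOY_SIZE_OF_PAGE_HEADER = offsetof(ToyPageHeader, linp) };

size_t toy_header_size_by_member(void)
{
    return offsetof(ToyPageHeader, linp);
}

size_t toy_header_size_by_element(void)
{
    return offsetof(ToyPageHeader, linp[0]);
}

/* Both spellings name the same byte offset. */
void toy_check_header_size(void)
{
    assert(toy_header_size_by_member() == toy_header_size_by_element());
}
```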
2008-07-14 Tom Lane: Create a type-specific typanalyze routine for tsvector, which collects stats
on the most common individual lexemes in place of the mostly-useless default behavior of counting duplicate tsvectors. Future work: create selectivity estimation functions that actually do something with these stats. (Some other things we ought to look at doing: using the Lossy Counting algorithm in compute_minimal_stats, and using the element-counting idea for stats on regular arrays.) Jan Urbanski
2008-07-13 Tom Lane: Change the PageGetContents() macro to guarantee its result is maxalign'd,
thereby forestalling any problems with alignment of the data structure placed there. Since SizeOfPageHeaderData is maxalign'd anyway in 8.3 and HEAD, this does not actually change anything right now, but it is foreseeable that the header size will change again someday. I had to fix a couple of places that were assuming that the content offset is just SizeOfPageHeaderData rather than MAXALIGN(SizeOfPageHeaderData). Per discussion of Zdenek's page-macros patch.
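
The rounding involved can be sketched like this, assuming a maximum alignment of 8 bytes. TOY_MAXIMUM_ALIGNOF, TOY_MAXALIGN and the two functions are hypothetical stand-ins, not the definitions in the PostgreSQL headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed platform maximum alignment; must be a power of two. */
#define TOY_MAXIMUM_ALIGNOF 8

/* Round len up to the next multiple of TOY_MAXIMUM_ALIGNOF. */
#define TOY_MAXALIGN(len) \
    (((uintptr_t) (len) + (TOY_MAXIMUM_ALIGNOF - 1)) & \
     ~((uintptr_t) (TOY_MAXIMUM_ALIGNOF - 1)))

size_t toy_maxalign(size_t len)
{
    return (size_t) TOY_MAXALIGN(len);
}

/*
 * Content starts at the aligned header size, not at the raw header size.
 * The two coincide only when the header size is already a multiple of 8.
 */
char *toy_page_get_contents(char *page, size_t header_size)
{
    assert(page != NULL);
    return page + toy_maxalign(header_size);
}
```

With a 24-byte header the two offsets agree, which matches the remark that nothing changes right now; with a hypothetical 20-byte header the content would start at offset 24 rather than 20.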
2008-07-13 Tom Lane: Clean up the use of some page-header-access macros: principally, use
SizeOfPageHeaderData instead of sizeof(PageHeaderData) in places where that makes the code clearer, and avoid casting between Page and PageHeader where possible. Zdenek Kotala, with some additional cleanup by Heikki Linnakangas. I did not apply the parts of the proposed patch that would have resulted in slightly changing the on-disk format of hash indexes; it seems to me that's not a win as long as there's any chance of having in-place upgrade for 8.4.
2008-07-12 Tom Lane: Don't make --enable-cassert turn on RANDOMIZE_ALLOCATED_MEMORY automatically;
it's just too dang expensive. Per recent discussion, but I just got my nose rubbed in it again while doing some performance checking.
2008-07-12 Tom Lane: Const-ify the arguments of str_tolower() and friends to suppress compile
warnings. Clean up various unneeded cruft that was left behind after creating those routines. Introduce some convenience functions str_tolower_z etc to eliminate tedious and error-prone double arguments in formatting.c. (Currently there seems no need to export the latter, but maybe reconsider this later.)
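
A sketch of what such a convenience wrapper looks like, assuming a malloc-based, single-byte-locale stand-in. The toy_* names and their signatures are hypothetical and do not reproduce the functions in formatting.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Lower-case the first nbytes of buff into a newly allocated string. */
char *toy_str_tolower(const char *buff, size_t nbytes)
{
    char *result;

    assert(buff != NULL);
    result = malloc(nbytes + 1);
    if (result == NULL)
        return NULL;
    for (size_t i = 0; i < nbytes; i++)
        result[i] = (char) tolower((unsigned char) buff[i]);
    result[nbytes] = '\0';
    return result;
}

/* "_z" variant: the input is NUL-terminated, so the length is not passed twice. */
char *toy_str_tolower_z(const char *buff)
{
    return toy_str_tolower(buff, strlen(buff));
}
```

The wrapper removes call sites of the form f(s, strlen(s)), where the two arguments can drift apart during later edits.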
2008-07-11 Tom Lane: Multi-column GIN indexes. Teodor Sigaev
2008-07-10 Tom Lane: Tighten up SS_finalize_plan's computation of valid_params to exclude Params of
the current query level that aren't in fact output parameters of the current initPlans. (This means, for example, output parameters of regular subplans.) To make this work correctly for output parameters coming from sibling initplans requires rejiggering the API of SS_finalize_plan just a bit: we need the siblings to be visible to it, rather than hidden as SS_make_initplan_from_plan had been doing. This is really part of my response to bug #4290, but I concluded this part probably shouldn't be back-patched, since all that it's doing is to make a debugging cross-check tighter.
2008-07-03 Tom Lane: Add a function pg_get_keywords() to let clients find out the set of keywords
known to the SQL parser. Dave Page
2008-07-03 Bruce Momjian: Update source code comment about when to use gettext_noop().
2008-07-01 Heikki Linnakangas: Extend VacAttrStats to allow typanalyze functions to store statistic values
of different types than the underlying column. The capability isn't yet used for anything, but will be required by upcoming patch to analyze tsvector columns. Jan Urbanski
2008-07-01 Tom Lane: Teach autovacuum how to determine whether a temp table belongs to a crashed
backend. If so, send a LOG message to the postmaster log, and if the table is beyond the vacuum-for-wraparound horizon, forcibly drop it. Per recent discussions. Perhaps we ought to back-patch this, but it probably needs to age a bit in HEAD first.
2008-06-30 Bruce Momjian: Fix recovery.conf boolean variables to take the same range of string
values as postgresql.conf.
2008-06-30 Heikki Linnakangas: Turn PGBE_ACTIVITY_SIZE into a GUC variable, track_activity_query_size.
As the buffer could now be a lot larger than before, and copying it could thus be a lot more expensive than before, use strcpy instead of memcpy to copy the query string, as was already suggested in comments. Also, only copy the PgBackendStatus struct and string if the slot is in use. Patch by Thomas Lee, with some changes by me.
2008-06-28 Tom Lane: If pnstrdup is going to be promoted to a generally available function,
it ought to conform to the rest of palloc.h in using Size for sizes.
2008-06-24 Tom Lane: Reduce the alignment requirement of type "name" from int to char, and arrange
to suppress zero-padding of "name" entries in indexes. The alignment change is unlikely to save any space, but it is really needed anyway to make the world safe for our widespread practice of passing plain old C strings to functions that are declared as taking Name. In the previous coding, the C compiler was entitled to assume that a Name pointer was word-aligned; but we were failing to guarantee that. I think the reason we'd not seen failures is that usually the only thing that gets done with such a pointer is strcmp(), which is hard to optimize in a way that exploits word-alignment. Still, some enterprising compiler guy will probably think of a way eventually, or we might change our code in a way that exposes more-obvious optimization opportunities. The padding change is accomplished in one-liner fashion by declaring the "name" index opclasses to use storage type "cstring" in pg_opclass.h. Normally btree and hash don't allow a nondefault storage type, because they don't have any provisions for converting the input datum to another type. However, because name and cstring are effectively the same thing except for padding, no conversion is needed --- we only need index_form_tuple() to treat the datum as being cstring not name, and this is sufficient. This seems to make for about a one-third reduction in the typical sizes of system catalog indexes that involve "name" columns, of which we have many. These two changes are only weakly related, but the alignment change makes me feel safer that the padding change won't introduce problems, so I'm committing them together.
2008-06-23 Bruce Momjian: Merge duplicate upper/lower/initcap() routines in oracle_compat.c and
formatting.c to use common code; remove duplicate functions and support routines that are no longer needed.
2008-06-19 Tom Lane: Rewrite the sinval messaging mechanism to reduce contention and avoid
unnecessary cache resets. The major changes are:

* When the queue overflows, we only issue a cache reset to the specific backend or backends that still haven't read the oldest message, rather than resetting everyone as in the original coding.

* When we observe backend(s) falling well behind, we signal SIGUSR1 to only one backend, the one that is furthest behind and doesn't already have a signal outstanding for it. When it finishes catching up, it will in turn signal SIGUSR1 to the next-furthest-back guy, if there is one that is far enough behind to justify a signal. The PMSIGNAL_WAKEN_CHILDREN mechanism is removed.

* We don't attempt to clean out dead messages after every message-receipt operation; rather, we do it on the insertion side, and only when the queue fullness passes certain thresholds.

* Split SInvalLock into SInvalReadLock and SInvalWriteLock so that readers don't block writers nor vice versa (except during the infrequent queue cleanout operations).

* Transfer multiple sinval messages for each acquisition of a read or write lock.
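
The first point in the list above can be sketched with a toy single-threaded queue. This is a simplified model under stated assumptions: no locking, no signalling, a four-slot buffer, and hypothetical names throughout (ToyQueue, toy_insert, toy_read). It shows only that an overflow resets the readers that had not yet consumed the oldest message and leaves the others alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define QUEUE_SIZE  4
#define MAX_READERS 3

typedef struct ToyQueue {
    int  msgs[QUEUE_SIZE];       /* circular buffer */
    long min_msg;                /* number of the oldest retained message */
    long max_msg;                /* number the next inserted message will get */
    long next_msg[MAX_READERS];  /* per reader: next message number to read */
    bool reset[MAX_READERS];     /* per reader: must discard its caches */
} ToyQueue;

void toy_queue_init(ToyQueue *q)
{
    assert(q != NULL);
    memset(q, 0, sizeof(*q));
}

void toy_insert(ToyQueue *q, int msg)
{
    if (q->max_msg - q->min_msg >= QUEUE_SIZE) {
        /* Full: drop the oldest message, resetting only readers that missed it. */
        for (int r = 0; r < MAX_READERS; r++)
            if (q->next_msg[r] <= q->min_msg)
                q->reset[r] = true;
        q->min_msg++;
    }
    q->msgs[q->max_msg % QUEUE_SIZE] = msg;
    q->max_msg++;
}

/* Returns 1 and stores a message, 0 if caught up, -1 if the reader was reset. */
int toy_read(ToyQueue *q, int reader, int *msg)
{
    if (q->reset[reader]) {
        /* A reset reader rebuilds everything, so it can skip the backlog. */
        q->reset[reader] = false;
        q->next_msg[reader] = q->max_msg;
        return -1;
    }
    if (q->next_msg[reader] >= q->max_msg)
        return 0;
    *msg = q->msgs[q->next_msg[reader] % QUEUE_SIZE];
    q->next_msg[reader]++;
    return 1;
}
```

A reader that keeps up never sees a reset in this model, however far the others fall behind.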
2008-06-19 Alvaro Herrera: Improve our #include situation by moving pointer types away from the
corresponding struct definitions. This allows other headers to avoid including certain highly-loaded headers such as rel.h and relscan.h, instead using just relcache.h, heapam.h or genam.h, which are more lightweight and thus cause less unnecessary dependencies.
2008-06-18 Tom Lane: Improve error reporting for problems in text search configuration files
by installing an error context subroutine that will provide the file name and line number for all errors detected while reading a config file. Some of the reader routines were already doing that in an ad-hoc way for errors detected directly in the reader, but it didn't help for problems detected in subroutines, such as encoding violations. Back-patch to 8.3 because 8.3 is where people will be trying to debug configuration files.
2008-06-18 Bruce Momjian: Move wchar2char() and char2wchar() from tsearch into /mb to be easier to
use for other modules; also move pnstrdup(). Clean up code slightly.
2008-06-17 Tom Lane: Clean up some problems with redundant cross-type arithmetic operators. Add
int2-and-int8 implementations of the basic arithmetic operators +, -, *, /. This doesn't really add any new functionality, but it avoids "operator is not unique" failures that formerly occurred in these cases because the parser couldn't decide whether to promote the int2 to int4 or int8. We could alternatively have removed the existing cross-type operators, but experimentation shows that the cost of an additional type coercion expression node is noticeable compared to such cheap operators; so let's not give up any performance here. On the other hand, I removed the int2-and-int4 modulo (%) operators since they didn't seem as important from a performance standpoint. Per a complaint last January from ykhuang.
2008-06-17 Bruce Momjian: Move USE_WIDE_UPPER_LOWER define to c.h, and remove TS_USE_WIDE and use
USE_WIDE_UPPER_LOWER instead.
2008-06-15 Tom Lane: Rearrange ALTER TABLE syntax processing as per my recent proposal: the
grammar allows ALTER TABLE/INDEX/SEQUENCE/VIEW interchangeably for all subforms of those commands, and then we sort out what's really legal at execution time. This allows the ALTER SEQUENCE/VIEW reference pages to fully document all the ALTER forms available for sequences and views respectively, and eliminates a longstanding cause of confusion for users.

The net effect is that the following forms are allowed that weren't before:

    ALTER SEQUENCE OWNER TO
    ALTER VIEW ALTER COLUMN SET/DROP DEFAULT
    ALTER VIEW OWNER TO
    ALTER VIEW SET SCHEMA

(There's no actual functionality gain here, but formerly you had to say ALTER TABLE instead.)

Interestingly, the grammar tables actually get smaller, probably because there are fewer special cases to keep track of.

I did not disallow using ALTER TABLE for these operations. Perhaps we should, but there's a backwards-compatibility issue if we do; in fact it would break existing pg_dump scripts. I did however tighten up ALTER SEQUENCE and ALTER VIEW to reject non-sequences and non-views in the new cases as well as a couple of cases where they didn't before.

The patch doesn't change pg_dump to use the new syntaxes, either.
2008-06-14 Tom Lane: Refactor the handling of the various DropStmt variants so that when multiple
objects are specified, we drop them all in a single performMultipleDeletions call. This makes the RESTRICT/CASCADE checks more relaxed: it's not counted as a cascade if one of the later objects has a dependency on an earlier one. NOTICE messages about such cases go away, too. In passing, fix the permissions check for DROP CONVERSION, which for some reason was never made role-aware, and omitted the namespace-owner exemption too. Alex Hunsaker, with further fiddling by me.
2008-06-12 Heikki Linnakangas: Refactor XLogOpenRelation() and XLogReadBuffer() in preparation for relation
forks. XLogOpenRelation() and the associated light-weight relation cache in xlogutils.c is gone, and XLogReadBuffer() now takes a RelFileNode as argument, instead of Relation. For functions that still need a Relation struct during WAL replay, there's a new function called CreateFakeRelcacheEntry() that returns a fake entry like XLogOpenRelation() used to.
2008-06-10 Heikki Linnakangas: Comment fix, should say TSQuery instead of TSVector.
Per Jan Urbanski.
2008-06-08 Tom Lane: Rewrite DROP's dependency traversal algorithm into an honest two-pass
algorithm, replacing the original intention of a one-pass search, which had been hacked up over time to be partially two-pass in hopes of handling various corner cases better. It still wasn't quite there, especially as regards emitting unwanted NOTICE messages. More importantly, this approach lets us fix a number of open bugs concerning concurrent DROP scenarios, because we can take locks during the first pass and avoid traversing to dependent objects that were just deleted by someone else. There is more that can be done here, but I'll go ahead and commit the base patch before working on the options.
2008-06-08 Alvaro Herrera: Move BufferGetPageSize and BufferGetPage from bufpage.h to bufmgr.h. It is
more logical that way, and also it reduces the amount of unnecessary includes in bufpage.h, which is widely used. Zdenek Kotala. My previous patch to bufpage.h should also have credited him as author, but I forgot (sorry about that).