Age | Commit message (Collapse) | Author |
|
This introduces the concept of table access methods, i.e. CREATE
ACCESS METHOD ... TYPE TABLE and
CREATE TABLE ... USING (storage-engine).
No table access functionality is delegated to table AMs as of this
commit, that'll be done in following commits.
Subsequent commits will incrementally abstract table access
functionality to be routed through table access methods. That change
is too large to be reviewed & committed at once, so it'll be done
incrementally.
Docs will be updated at the end, as adding them incrementally would
likely make them less coherent, and definitely is a lot more work,
without a lot of benefit.
Table access methods are specified similar to index access methods,
i.e. pg_am.amhandler returns, as INTERNAL, a pointer to a struct with
callbacks. In contrast to index AMs that struct needs to live as long
as a backend, typically that's achieved by just returning a pointer to
a constant struct.
Psql's \d+ now displays a table's access method. That can be disabled
with HIDE_TABLEAM=true, which is mainly useful so regression tests can
be run against different AMs. It's quite possible that this behaviour
still needs to be fine tuned.
For now it's not allowed to set a table AM for a partitioned table, as
we've not resolved how partitions would inherit that. Disallowing
allows us to introduce, if we decide that's the way forward, such a
behaviour without a compatibility break.
Catversion bumped, to add the heap table AM and references to it.
Author: Haribabu Kommi, Andres Freund, Alvaro Herrera, Dimitri Golgov and others
Discussion:
https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
https://postgr.es/m/20160812231527.GA690404@alvherre.pgsql
https://postgr.es/m/20190107235616.6lur25ph22u5u5av@alap3.anarazel.de
https://postgr.es/m/20190304234700.w5tmhducs5wxgzls@alap3.anarazel.de
|
|
Adds another introspection feature for partitioning, necessary for
further psql patches.
Reviewed-by: Michaël Paquier
Discussion: https://postgr.es/m/20190226222757.GA31622@alvherre.pgsql
|
|
This changes the partition functions so as tables and indexes which are
not part of partition trees are handled the same way as what is done for
undefined objects and unsupported relkinds: pg_partition_tree() returns
no rows and pg_partition_root() returns a NULL result. Hence,
partitioned tables, partitioned indexes and relations whose flag
pg_class.relispartition is set are considered as valid objects to
process.
Previously, tables and indexes not included in a partition tree were
processed the same way as a partition or a partitioned table, which
caused the functions to return inconsistent results for inherited
tables, especially when inheriting from multiple tables.
Reported-by: Álvaro Herrera
Author: Amit Langote, Michael Paquier
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/20190228193203.GA26151@alvherre.pgsql
|
|
The function was tweaked so as it returned one row full of NULLs when
working on an unsupported relkind or an undefined object as of cc53123,
and after discussion with Amit and Álvaro it looks more natural to make
it return no rows.
Author: Michael Paquier
Reviewed-by: Álvaro Herrera, Amit Langote
Discussion: https://postgr.es/m/20190227184808.GA17357@alvherre.pgsql
|
|
Merge ri_setnull and ri_setdefault into one function ri_set. These
functions were to a large part identical.
This is a continuation in spirit of
4797f9b519995ceca5d6b8550b5caa2ff6d19347.
Author: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0ccdd3e1-10b0-dd05-d8a7-183507c11eb1%402ndquadrant.com
|
|
We have forboth() and forthree() macros that simplify iterating
through several parallel lists, but not everyplace that could
reasonably use those was doing so. Also invent forfour() and
forfive() macros to do the same for four or five parallel lists,
and use those where applicable.
The immediate motivation for doing this is to reduce the number
of ad-hoc lnext() calls, to reduce the footprint of a WIP patch.
However, it seems like good cleanup and error-proofing anyway;
the places that were combining forthree() with a manually iterated
loop seem particularly illegible and bug-prone.
There was some speculation about restructuring related parsetree
representations to reduce the need for parallel list chasing of
this sort. Perhaps that's a win, or perhaps not, but in any case
it would be considerably more invasive than this patch; and it's
not particularly related to my immediate goal of improving the
List infrastructure. So I'll leave that question for another day.
Patch by me; thanks to David Rowley for review.
Discussion: https://postgr.es/m/11587.1550975080@sss.pgh.pa.us
|
|
There was a mix of old_slot/oldslot, new_slot/newslot. Since we've
changed everything from row to slot, we might as well take this
opportunity to clean this up.
Also update some more comments for the slot change.
|
|
Declare loop variable in for loop, for readability and to save space.
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0ccdd3e1-10b0-dd05-d8a7-183507c11eb1%402ndquadrant.com
|
|
Reduce the vertical space used by comments in ri_triggers.c, making
the file longer and more tedious to read than it needs to be. Update
some comments to use a more common style.
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0ccdd3e1-10b0-dd05-d8a7-183507c11eb1%402ndquadrant.com
|
|
ri_triggers.c spends a lot of space catering to a not-yet-implemented
MATCH PARTIAL option. An actual implementation would probably not use
the existing code structure anyway, so let's just simplify this for
now.
First, have ri_FetchConstraintInfo() check that riinfo->confmatchtype
is valid. Then we don't have to repeat that everywhere.
In the various referential action functions, we don't need to pay
attention to the match type at all right now, so remove all that code.
A future MATCH PARTIAL implementation would probably have some
conditions added to the present code, but it won't need an entirely
separate switch branch in each case.
In RI_FKey_fk_upd_check_required(), reorganize the code to make it
much simpler.
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0ccdd3e1-10b0-dd05-d8a7-183507c11eb1%402ndquadrant.com
|
|
for ff11e7f4b9ae017585c3ba146db7ba39c31f209a
|
|
In preparation for abstracting table storage, convert trigger.c to
track tuples in slots. Which also happens to make code calling
triggers simpler.
As the calling interface for triggers themselves is not changed in
this patch, HeapTuples still are extracted from the slot at that
time. But that's handled solely inside trigger.c, not visible to
callers. It's quite likely that we'll want to revise the external
trigger interface, but that's a separate large project.
As part of this work the slots used for old/new/return tuples are
moved from EState into ResultRelInfo, as different updated tables
might need different slots. The slots are now also now created
on-demand, which is good both from an efficiency POV, but also makes
the modifying code simpler.
Author: Andres Freund, Amit Khandekar and Ashutosh Bapat
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
|
|
It seems to make more sense for this to be in selfuncs.c, since it's
largely a statistical-estimation thing, and it's related to other
functions like estimate_hash_bucket_stats that are there.
While at it, change the result type from Size to double. Perhaps at one
point it was impossible for the result to overflow an integer, but
I've got no confidence in that proposition anymore. Nothing's actually
done with the result except to compare it to a work_mem-based limit,
so as long as we don't get an overflow on the way to that comparison,
things should be fine even with very large dNumGroups.
Code movement proposed by Antonin Houska, type change by me
Discussion: https://postgr.es/m/25767.1549359615@localhost
|
|
Change pg_stat_ssl so that an unprivileged user can only see their own
rows; other rows will be all null. This makes the behavior consistent
with pg_stat_activity, where information about where the connection
came from is also restricted.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/63117976-d02c-c8e2-3aef-caa31a5ab8d3%402ndquadrant.com
|
|
Historically we've always materialized the full output of a CTE query,
treating WITH as an optimization fence (so that, for example, restrictions
from the outer query cannot be pushed into it). This is appropriate when
the CTE query is INSERT/UPDATE/DELETE, or is recursive; but when the CTE
query is non-recursive and side-effect-free, there's no hazard of changing
the query results by pushing restrictions down.
Another argument for materialization is that it can avoid duplicate
computation of an expensive WITH query --- but that only applies if
the WITH query is called more than once in the outer query. Even then
it could still be a net loss, if each call has restrictions that
would allow just a small part of the WITH query to be computed.
Hence, let's change the behavior for WITH queries that are non-recursive
and side-effect-free. By default, we will inline them into the outer
query (removing the optimization fence) if they are called just once.
If they are called more than once, we will keep the old behavior by
default, but the user can override this and force inlining by specifying
NOT MATERIALIZED. Lastly, the user can force the old behavior by
specifying MATERIALIZED; this would mainly be useful when the query had
deliberately been employing WITH as an optimization fence to prevent a
poor choice of plan.
Andreas Karlsson, Andrew Gierth, David Fetter
Discussion: https://postgr.es/m/87sh48ffhb.fsf@news-spur.riddles.org.uk
|
|
Test for the compiler builtins __builtin_clz, __builtin_ctz, and
__builtin_popcount, and make use of these in preference to
handwritten C code if they're available. Create src/port
infrastructure for "leftmost one", "rightmost one", and "popcount"
so as to centralize these decisions.
On x86_64, __builtin_popcount generally won't make use of the POPCNT
opcode because that's not universally supported yet. Provide code
that checks CPUID and then calls POPCNT via asm() if available.
This requires indirecting through a function pointer, which is
an annoying amount of overhead for a one-instruction operation,
but it's probably not worth working harder than this for our
current use-cases.
I'm not sure we've found all the existing places that could profit
from this new infrastructure; but we at least touched all the
ones that used copied-and-pasted versions of the bitmapset.c code,
and got rid of multiple copies of the associated constant arrays.
While at it, replace c-compiler.m4's one-per-builtin-function
macros with a single one that can handle all the cases we need
to worry about so far. Also, because I'm paranoid, make those
checks into AC_LINK checks rather than just AC_COMPILE; the
former coding failed to verify that libgcc has support for the
builtin, in cases where it's not inline code.
David Rowley, Thomas Munro, Alvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/CAKJS1f9WTAGG1tPeJnD18hiQW5gAk59fQ6WK-vfdAKEHyRg2RA@mail.gmail.com
|
|
Get rid of deconstruct_indexquals() in favor of just iterating over the
IndexClause list directly. The extra services that that function used to
provide, such as hiding clause commutation and associating the right index
column with each clause, are no longer useful given the new data structure.
I'd originally thought that it'd provide a useful amount of abstraction
by freeing callers from paying attention to the exact clause type of each
indexqual, but that hope proves to have been vain, because few callers can
ignore the semantic differences between different clause types. Indeed,
removing it results in a net code savings, and probably some cycles shaved
by not having to build an extra list-of-structs data structure.
Also, export a few formerly-static support functions, with the goal
of allowing extension AMs to write functionality equivalent to
genericcostestimate() without pointless code duplication.
Discussion: https://postgr.es/m/24586.1550106354@sss.pgh.pa.us
|
|
In commit 1a8d5afb0, I thought it'd be a good idea to define
IndexClause.indexquals as NIL in the most common case where the given
clause (IndexClause.rinfo) is usable exactly as-is. It'd be more
consistent to define the indexquals in that case as being a one-element
list containing IndexClause.rinfo, but I thought saving the palloc
overhead for making such a list would be worthwhile.
In hindsight, that was a great example of "premature optimization is the
root of all evil": it's complicated everyplace that needs to deal with
the indexquals, requiring duplicative code to handle both the simple
case and the not-simple case. I'd initially found that tolerable but
it's getting less so as I mop up some areas that I'd not touched in
1a8d5afb0. In any case, two more pallocs during a planner run are
surely at the noise level (a conclusion confirmed by a bit of
microbenchmarking). So let's change this decision before it becomes
set in stone, and insist that IndexClause.indexquals always be a valid
list of the actual index quals for the clause.
Discussion: https://postgr.es/m/24586.1550106354@sss.pgh.pa.us
|
|
While at it, refactor patternsel() a bit so that it can be used from
the LIKE/regex planner support functions as well. This makes the
planner able to deal equally well with either operator or function
syntax for these operations. I'm not excited about that as a feature
in itself, but it provides a nice model for extensions to follow if
they want such behavior for their operations.
This change localizes the use of pattern_fixed_prefix() and
make_greater_string() so that they no longer need be exported.
(We might get pushback from extensions about that, perhaps,
in which case I'd be inclined to re-export them in a new header
file like_support.h.)
This reduces the bulk of selfuncs.c a fair amount, removing ~1370
lines or about one-sixth of that file; it's still too big, but this
is progress.
Discussion: https://postgr.es/m/24537.1550093915@sss.pgh.pa.us
|
|
Previously, floating-point output was done by rounding to a specific
decimal precision; by default, to 6 or 15 decimal digits (losing
information) or as requested using extra_float_digits. Drivers that
wanted exact float values, and applications like pg_dump that must
preserve values exactly, set extra_float_digits=3 (or sometimes 2 for
historical reasons, though this isn't enough for float4).
Unfortunately, decimal rounded output is slow enough to become a
noticable bottleneck when dealing with large result sets or COPY of
large tables when many floating-point values are involved.
Floating-point output can be done much faster when the output is not
rounded to a specific decimal length, but rather is chosen as the
shortest decimal representation that is closer to the original float
value than to any other value representable in the same precision. The
recently published Ryu algorithm by Ulf Adams is both relatively
simple and remarkably fast.
Accordingly, change float4out/float8out to output shortest decimal
representations if extra_float_digits is greater than 0, and make that
the new default. Applications that need rounded output can set
extra_float_digits back to 0 or below, and take the resulting
performance hit.
We make one concession to portability for systems with buggy
floating-point input: we do not output decimal values that fall
exactly halfway between adjacent representable binary values (which
would rely on the reader doing round-to-nearest-even correctly). This
is known to be a problem at least for VS2013 on Windows.
Our version of the Ryu code originates from
https://github.com/ulfjack/ryu/ at commit c9c3fb1979, but with the
following (significant) modifications:
- Output format is changed to use fixed-point notation for small
exponents, as printf would, and also to use lowercase 'e', a
minimum of 2 exponent digits, and a mandatory sign on the exponent,
to keep the formatting as close as possible to previous output.
- The output of exact midpoint values is disabled as noted above.
- The integer fast-path code is changed somewhat (since we have
fixed-point output and the upstream did not).
- Our project style has been largely applied to the code with the
exception of C99 declaration-after-statement, which has been
retained as an exception to our present policy.
- Most of upstream's debugging and conditionals are removed, and we
use our own configure tests to determine things like uint128
availability.
Changing the float output format obviously affects a number of
regression tests. This patch uses an explicit setting of
extra_float_digits=0 for test output that is not expected to be
exactly reproducible (e.g. due to numerical instability or differing
algorithms for transcendental functions).
Conversions from floats to numeric are unchanged by this patch. These
may appear in index expressions and it is not yet clear whether any
change should be made, so that can be left for another day.
This patch assumes that the only supported floating point format is
now IEEE format, and the documentation is updated to reflect that.
Code by me, adapting the work of Ulf Adams and other contributors.
References:
https://dl.acm.org/citation.cfm?id=3192369
Reviewed-by: Tom Lane, Andres Freund, Donald Dong
Discussion: https://postgr.es/m/87r2el1bx6.fsf@news-spur.riddles.org.uk
|
|
Using strtod() creates a double-rounding problem; the input decimal
value is first rounded to the nearest double; rounding that to the
nearest float may then give an incorrect result.
An example is that 7.038531e-26 when input via strtod and then rounded
to float4 gives 0xAE43FEp-107 instead of the correct 0xAE43FDp-107.
Values output by earlier PG versions with extra_float_digits=3 should
all be read in with the same values as previously. However, values
supplied by other software using shortest representations could be
mis-read.
On platforms that lack a strtof() entirely, we fall back to the old
incorrect rounding behavior. (As strtof() is required by C99, such
platforms are considered of primarily historical interest.) On VS2013,
some workarounds are used to get correct error handling.
The regression tests now test for the correct input values, so
platforms that lack strtof() will need resultmap entries. An entry for
HP-UX 10 is included (more may be needed).
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/871s5emitx.fsf@news-spur.riddles.org.uk
Discussion: https://postgr.es/m/87d0owlqpv.fsf@news-spur.riddles.org.uk
|
|
Replace casts whose only purpose is to cast away const with the
unconstify() macro.
Discussion: https://www.postgresql.org/message-id/flat/53a28052-f9f3-1808-fed9-460fd43035ab%402ndquadrant.com
|
|
We're only going to consider key columns when creating indexquals,
so there is no point in having the outer loops in indxpath.c iterate
further than nkeycolumns.
Doing so in match_pathkeys_to_index() is actually wrong, and would have
caused crashes by now, except that we have no index AMs supporting both
amcanorderbyop and amcaninclude.
It's also wrong in relation_has_unique_index_for(). The effect there is
to fail to prove uniqueness even when the index does prove it, if there
are extra columns.
Also future-proof examine_variable() for the day when extra columns can
be expressions, and fix what's either a thinko or just an oversight in
btcostestimate(): we should consider the number of key columns, not the
total, when deciding whether to derate correlation.
None of these things seemed important enough to risk changing in a
just-before-wrap patch, but since we're past the release wrap window,
time to fix 'em.
Discussion: https://postgr.es/m/25526.1549847928@sss.pgh.pa.us
|
|
For a long time, indxpath.c has had the ability to extract derived (lossy)
index conditions from certain operators such as LIKE. For just as long,
it's been obvious that we really ought to make that capability available
to extensions. This commit finally accomplishes that, by adding another
API for planner support functions that lets them create derived index
conditions for their functions. As proof of concept, the hardwired
"special index operator" code formerly present in indxpath.c is pushed
out to planner support functions attached to LIKE and other relevant
operators.
A weak spot in this design is that an extension needs to know OIDs for
the operators, datatypes, and opfamilies involved in the transformation
it wants to make. The core-code prototypes use hard-wired OID references
but extensions don't have that option for their own operators etc. It's
usually possible to look up the required info, but that may be slow and
inconvenient. However, improving that situation is a separate task.
I want to do some additional refactorization around selfuncs.c, but
that also seems like a separate task.
Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
|
|
Last use was removed in 2c66f9924c1162bfba27c77004ccf42fb6ea188d.
|
|
Add support function requests for estimating the selectivity, cost,
and number of result rows (if a SRF) of the target function.
The lack of a way to estimate selectivity of a boolean-returning
function in WHERE has been a recognized deficiency of the planner
since Berkeley days. This commit finally fixes it.
In addition, non-constant estimates of cost and number of output
rows are now possible. We still fall back to looking at procost
and prorows if the support function doesn't service the request,
of course.
To make concrete use of the possibility of estimating output rowcount
for SRFs, this commit adds support functions for array_unnest(anyarray)
and the integer variants of generate_series; the lack of plausible
rowcount estimates for those, even when it's obvious to a human,
has been a repeated subject of complaints. Obviously, much more
could now be done in this line, but I'm mostly just trying to get
the infrastructure in place.
Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
|
|
Rename/repurpose pg_proc.protransform as "prosupport". The idea is
still that it names an internal function that provides knowledge to
the planner about the behavior of the function it's attached to;
but redesign the API specification so that it's not limited to doing
just one thing, but can support an extensible set of requests.
The original purpose of simplifying a function call is handled by
the first request type to be invented, SupportRequestSimplify.
Adjust all the existing transform functions to handle this API,
and rename them fron "xxx_transform" to "xxx_support" to reflect
the potential generalization of what they do. (Since we never
previously provided any way for extensions to add transform functions,
this change doesn't create an API break for them.)
Also add DDL and pg_dump support for attaching a support function to a
user-defined function. Unfortunately, DDL access has to be restricted
to superusers, at least for now; but seeing that support functions
will pretty much have to be written in C, that limitation is just
theoretical. (This support is untested in this patch, but a follow-on
patch will add cases that exercise it.)
Discussion: https://postgr.es/m/15193.1548028093@sss.pgh.pa.us
|
|
In place of three separate but interrelated lists (indexclauses,
indexquals, and indexqualcols), an IndexPath now has one list
"indexclauses" of IndexClause nodes. This holds basically the same
information as before, but in a more useful format: in particular, there
is now a clear connection between an indexclause (an original restriction
clause from WHERE or JOIN/ON) and the indexquals (directly usable index
conditions) derived from it.
We also change the ground rules a bit by mandating that clause commutation,
if needed, be done up-front so that what is stored in the indexquals list
is always directly usable as an index condition. This gets rid of repeated
re-determination of which side of the clause is the indexkey during costing
and plan generation, as well as repeated lookups of the commutator
operator. To minimize the added up-front cost, the typical case of
commuting a plain OpExpr is handled by a new special-purpose function
commute_restrictinfo(). For RowCompareExprs, generating the new clause
properly commuted to begin with is not really any more complex than before,
it's just different --- and we can save doing that work twice, as the
pretty-klugy original implementation did.
Tracking the connection between original and derived clauses lets us
also track explicitly whether the derived clauses are an exact or lossy
translation of the original. This provides a cheap solution to getting
rid of unnecessary rechecks of boolean index clauses, which previously
seemed like it'd be more expensive than it was worth.
Another pleasant (IMO) side-effect is that EXPLAIN now always shows
index clauses with the indexkey on the left; this seems less confusing.
This commit leaves expand_indexqual_conditions() and some related
functions in a slightly messy state. I didn't bother to change them
any more than minimally necessary to work with the new data structure,
because all that code is going to be refactored out of existence in
a follow-on patch.
Discussion: https://postgr.es/m/22182.1549124950@sss.pgh.pa.us
|
|
While this isn't really supposed to happen, it can occur in OOM
situations and perhaps others. Instead of crashing, substitute
"(no message provided)".
I didn't worry about localizing this text, since we aren't
localizing anything else here; besides, if we're on the edge of
OOM, it's unlikely gettext() would work.
Report and fix by Sergio Conde Gómez in bug #15624.
Discussion: https://postgr.es/m/15624-4dea54091a2864e6@postgresql.org
|
|
These mainly help understanding the function signatures better.
|
|
This is useful when looking at partition trees with multiple layers, and
combined with pg_partition_tree, it provides the possibility to show up
an entire tree by just knowing one member at any level.
Author: Michael Paquier
Reviewed-by: Álvaro Herrera, Amit Langote
Discussion: https://postgr.es/m/20181207014015.GP2407@paquier.xyz
|
|
Change pg_dump and ruleutils.c to use the FUNCTION keyword instead of
PROCEDURE in trigger and event trigger definitions.
This completes the pieces of the transition started in
0a63f996e018ac508c858e87fa39cc254a5db49f that were kept out of
PostgreSQL 11 because of the required catversion change.
Discussion: https://www.postgresql.org/message-id/381bef53-f7be-29c8-d977-948e389161d6@2ndquadrant.com
|
|
Over at patch https://commitfest.postgresql.org/21/1062/ Dmitry wants to
introduce a more generic subscription mechanism, which allows
subscripting not only arrays but also other object types such as JSONB.
That functionality is introduced in a largish invasive patch, out of
which this internal renaming patch was extracted.
Author: Dmitry Dolgov
Reviewed-by: Tom Lane, Arthur Zakirov
Discussion: https://postgr.es/m/CA+q6zcUK4EqPAu7XRRO5CCjMwhz5zvg+rfWuLzVoxp_5sKS6=w@mail.gmail.com
|
|
Add columns client_serial and issuer_dn to pg_stat_ssl. These allow
uniquely identifying the client certificate.
Rename the existing column clientdn to client_dn, to make the naming
more consistent and easier to read.
Discussion: https://www.postgresql.org/message-id/flat/398754d8-6bb5-c5cf-e7b8-22e5f0983caf@2ndquadrant.com/
|
|
Create a new header optimizer/optimizer.h, which exposes just the
planner functions that can be used "at arm's length", without need
to access Paths or the other planner-internal data structures defined
in nodes/relation.h. This is intended to provide the whole planner
API seen by most of the rest of the system; although FDWs still need
to use additional stuff, and more thought is also needed about just
what selfuncs.c should rely on.
The main point of doing this now is to limit the amount of new
#include baggage that will be needed by "planner support functions",
which I expect to introduce later, and which will be in relevant
datatype modules rather than anywhere near the planner.
This commit just moves relevant declarations into optimizer.h from
other header files (a couple of which go away because everything
got moved), and adjusts #include lists to match. There's further
cleanup that could be done if we want to decide that some stuff
being exposed by optimizer.h doesn't belong in the planner at all,
but I'll leave that for another day.
Discussion: https://postgr.es/m/11460.1548706639@sss.pgh.pa.us
|
|
Return null if there is no client certificate. This is how it has
always been documented, but in reality it returned an empty string.
Reviewed-by: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Discussion: https://www.postgresql.org/message-id/flat/398754d8-6bb5-c5cf-e7b8-22e5f0983caf@2ndquadrant.com/
|
|
The fact that "SELECT expression" has no base relations has long been a
thorn in the side of the planner. It makes it hard to flatten a sub-query
that looks like that, or is a trivial VALUES() item, because the planner
generally uses relid sets to identify sub-relations, and such a sub-query
would have an empty relid set if we flattened it. prepjointree.c contains
some baroque logic that works around this in certain special cases --- but
there is a much better answer. We can replace an empty FROM clause with a
dummy RTE that acts like a table of one row and no columns, and then there
are no such corner cases to worry about. Instead we need some logic to
get rid of useless dummy RTEs, but that's simpler and covers more cases
than what was there before.
For really trivial cases, where the query is just "SELECT expression" and
nothing else, there's a hazard that adding the extra RTE makes for a
noticeable slowdown; even though it's not much processing, there's not
that much for the planner to do overall. However testing says that the
penalty is very small, close to the noise level. In more complex queries,
this is able to find optimizations that we could not find before.
The new RTE type is called RTE_RESULT, since the "scan" plan type it
gives rise to is a Result node (the same plan we produced for a "SELECT
expression" query before). To avoid confusion, rename the old ResultPath
path type to GroupResultPath, reflecting that it's only used in degenerate
grouping cases where we know the query produces just one grouped row.
(It wouldn't work to unify the two cases, because there are different
rules about where the associated quals live during query_planner.)
Note: although this touches readfuncs.c, I don't think a catversion
bump is required, because the added case can't occur in stored rules,
only plans.
Patch by me, reviewed by David Rowley and Mark Dilger
Discussion: https://postgr.es/m/15944.1521127664@sss.pgh.pa.us
|
|
Before this change FunctionCallInfoData, the struct arguments etc for
V1 function calls are stored in, always had space for
FUNC_MAX_ARGS/100 arguments, storing datums and their nullness in two
arrays. For nearly every function call 100 arguments is far more than
needed, therefore wasting memory. Arg and argnull being two separate
arrays also guarantees that to access a single argument, two
cachelines have to be touched.
Change the layout so there's a single variable-length array with pairs
of value / isnull. That drastically reduces memory consumption for
most function calls (on x86-64 a two argument function now uses
64bytes, previously 936 bytes), and makes it very likely that argument
value and its nullness are on the same cacheline.
Arguments are stored in a new NullableDatum struct, which, due to
padding, needs more memory per argument than before. But as usually
far fewer arguments are stored, and individual arguments are cheaper
to access, that's still a clear win. It's likely that there's other
places where conversion to NullableDatum arrays would make sense,
e.g. TupleTableSlots, but that's for another commit.
Because the function call information is now variable-length
allocations have to take the number of arguments into account. For
heap allocations that can be done with SizeForFunctionCallInfoData(),
for on-stack allocations there's a new LOCAL_FCINFO(name, nargs) macro
that helps to allocate an appropriately sized and aligned variable.
Some places with stack allocation function call information don't know
the number of arguments at compile time, and currently variably sized
stack allocations aren't allowed in postgres. Therefore allow for
FUNC_MAX_ARGS space in these cases. They're not that common, so for
now that seems acceptable.
Because of the need to allocate FunctionCallInfo of the appropriate
size, older extensions may need to update their code. To avoid subtle
breakages, the FunctionCallInfoData struct has been renamed to
FunctionCallInfoBaseData. Most code only references FunctionCallInfo,
so that shouldn't cause much collateral damage.
This change is also a prerequisite for more efficient expression JIT
compilation (by allocating the function call information on the stack,
allowing LLVM to optimize it away); previously the size of the call
information caused problems inside LLVM's optimizer.
Author: Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/20180605172952.x34m5uz6ju6enaem@alap3.anarazel.de
|
|
While it's perhaps unlikely that users would write an explicit test
like "ctid IS NULL", this function is also used in range estimation,
and an incorrect answer can throw off the results for tight ranges.
Anyway it's not much code so we might as well do it.
Edmund Horner
Discussion: https://postgr.es/m/CAMyN-kCa3BFUFrCTtQeprxTU1anCd3Pua7zXstGCKq4pXgjukw@mail.gmail.com
|
|
Heikki's compiler doesn't complain about end_ptr, apparently,
but mine does.
In passing, I failed to resist the temptation to remove the
no-longer-used fldnum variable, and relocate chunk_len's
declaration to a narrower scope.
|
|
The old implementation first converted the input strings to arrays of
wchars, and performed the conversion on those. However, the conversion is
expensive, and for a large input string, consumes a lot of memory.
Allocating the large arrays also meant that these functions could not be
used on strings larger 1 GB / pg_encoding_max_length() (256 MB for UTF-8).
Avoid the conversion, and instead use the single-byte algorithm even with
multibyte encodings. That can get fooled, if there is a matching byte
sequence in the middle of a multi-byte character, so to eliminate false
positives like that, we verify any matches by walking the string character
by character with pg_mblen(). Also, if the caller needs the position of
the match, as a character-offset, we also need to walk the string to count
the characters.
Performance testing shows that walking the whole string with pg_mblen() is
somewhat slower than converting the whole string to wchars. It's still
often a win, though, because we don't need to do it if there is no match,
and even when there is, we only need to walk up to the point where the
match is, not the whole string. Even in the worst case, there would be
room for optimization: Much of the CPU time in the current loop with
pg_mblen() is function call overhead, and could be improved by inlining
pg_mblen() and/or the encoding-specific mblen() functions. But I didn't
attempt to do that as part of this patch.
Most of the callers of text_position_setup/next functions were actually
not interested in the position of the match, counted in characters. To
cater for them, refactor the text_position_next() interface into two
parts: searching for the next match (text_position_next()), and returning
the current match's position as a pointer (text_position_get_match_ptr())
or as a character offset (text_position_get_match_pos()). Getting the
pointer to the match is a more convenient API for many callers, and with
UTF-8, it allows skipping the character-walking step altogether, because
UTF-8 can't have false matches even when treated like raw byte strings.
Reviewed-by: John Naylor
Discussion: https://www.postgresql.org/message-id/3173d989-bc1c-fc8a-3b69-f24246f73876%40iki.fi
|
|
Given these routines are heap specific, and that there will be more
generic visibility support in via table AM, it makes sense to move the
prototypes to heapam.h (routines like HeapTupleSatisfiesVacuum will
not be exposed in a generic fashion, because they are too storage
specific).
Similarly, the code in tqual.c is specific to heap, so moving it into
access/heap/ makes sense.
Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
|
|
The code in tqual.c is largely heap specific. Due to the upcoming
pluggable storage work, it therefore makes sense to move it into
access/heap/ (as the file's header notes, the tqual name isn't very
good).
But the various statically allocated snapshot and snapshot
initialization functions are now (see previous commit) generic and do
not depend on functions declared in tqual.h anymore. Therefore move.
Also move XidInMVCCSnapshot as that's useful for future AMs, and
already used outside of tqual.c.
Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
|
|
Most of these had been obsoleted by 568d4138c / the SnapshotNow
removal.
This is is preparation for moving most of tqual.[ch] into either
snapmgr.h or heapam.h, which in turn is in preparation for pluggable
table AMs.
Author: Andres Freund
Discussion: https://postgr.es/m/20180703070645.wchpu5muyto5n647@alap3.anarazel.de
|
|
Author: Andres Freund
Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de
|
|
A lot of files only included heapam.h for relation_open, heap_open etc
- replace the heapam.h include in those files with the narrower
header.
Author: Andres Freund
Discussion: https://postgr.es/m/20190111000539.xbv7s6w7ilcvm7dp@alap3.anarazel.de
|
|
My commit 3de241dba86f introduced some code (in tablecmds.c) to obtain
data from a pg_constraint row for a foreign key, that already existed in
ri_triggers.c. Split it out into its own routine in pg_constraint.c,
where it naturally belongs.
No functional code changes, only code movement.
Backpatch to pg11, because a future bugfix is simpler after this.
|
|
heapam.h previously was included in a number of widely used
headers (e.g. execnodes.h, indirectly in executor.h, ...). That's
problematic on its own, as heapam.h contains a lot of low-level
details that don't need to be exposed that widely, but becomes more
problematic with the upcoming introduction of pluggable table storage
- it seems inappropriate for heapam.h to be included that widely
afterwards.
heapam.h was largely only included in other headers to get the
HeapScanDesc typedef (which was defined in heapam.h, even though
HeapScanDescData is defined in relscan.h). The better solution here
seems to be to just use the underlying struct (forward declared where
necessary). Similar for BulkInsertState.
Another problem was that LockTupleMode was used in executor.h - parts
of the file tried to cope without heapam.h, but due to the fact that
it indirectly included it, several subsequent violations of that goal
were not not noticed. We could just reuse the approach of declaring
parameters as int, but it seems nicer to move LockTupleMode to
lockoptions.h - that's not a perfect location, but also doesn't seem
bad.
As a number of files relied on implicitly included heapam.h, a
significant number of files grew an explicit include. It's quite
probably that a few external projects will need to do the same.
Author: Andres Freund
Reviewed-By: Alvaro Herrera
Discussion: https://postgr.es/m/20190114000701.y4ttcb74jpskkcfb@alap3.anarazel.de
|
|
Avoid using "typeid" as a parameter name in header files, since that
is a C++ keyword. These cases were introduced recently, in 04fe805a1
and 586b98fdf.
Since I'm an incurable neatnik, also rename these parameters in the
underlying function definitions. That's not really necessary per
project rules, but I don't like function declarations that don't
quite agree with the underlying definitions.
Per src/tools/pginclude/cpluspluscheck.
|
|
Previously, ScanKeywordLookup was passed an array of string pointers.
This had some performance deficiencies: the strings themselves might
be scattered all over the place depending on the compiler (and some
quick checking shows that at least with gcc-on-Linux, they indeed
weren't reliably close together). That led to very cache-unfriendly
behavior as the binary search touched strings in many different pages.
Also, depending on the platform, the string pointers might need to
be adjusted at program start, so that they couldn't be simple constant
data. And the ScanKeyword struct had been designed with an eye to
32-bit machines originally; on 64-bit it requires 16 bytes per
keyword, making it even more cache-unfriendly.
Redesign so that the keyword strings themselves are allocated
consecutively (as part of one big char-string constant), thereby
eliminating the touch-lots-of-unrelated-pages syndrome. And get
rid of the ScanKeyword array in favor of three separate arrays:
uint16 offsets into the keyword array, uint16 token codes, and
uint8 keyword categories. That reduces the overhead per keyword
to 5 bytes instead of 16 (even less in programs that only need
one of the token codes and categories); moreover, the binary search
only touches the offsets array, further reducing its cache footprint.
This also lets us put the token codes somewhere else than the
keyword strings are, which avoids some unpleasant build dependencies.
While we're at it, wrap the data used by ScanKeywordLookup into
a struct that can be treated as an opaque type by most callers.
That doesn't change things much right now, but it will make it
less painful to switch to a hash-based lookup method, as is being
discussed in the mailing list thread.
Most of the change here is associated with adding a generator
script that can build the new data structure from the same
list-of-PG_KEYWORD header representation we used before.
The PG_KEYWORD lists that plpgsql and ecpg used to embed in
their scanner .c files have to be moved into headers, and the
Makefiles have to be taught to invoke the generator script.
This work is also necessary if we're to consider hash-based lookup,
since the generator script is what would be responsible for
constructing a hash table.
Aside from saving a few kilobytes in each program that includes
the keyword table, this seems to speed up raw parsing (flex+bison)
by a few percent. So it's worth doing even as it stands, though
we think we can gain even more with a follow-on patch to switch
to hash-based lookup.
John Naylor, with further hacking by me
Discussion: https://postgr.es/m/CAJVSVGXdFVU2sgym89XPL=Lv1zOS5=EHHQ8XWNzFL=mTXkKMLw@mail.gmail.com
|