PostgreSQL Coding Conventions
  
   Formatting
   
    Source code formatting uses 4 column tab spacing, with
    tabs preserved (i.e., tabs are not expanded to spaces).
    Each logical indentation level is one additional tab stop.
   
   
    Layout rules (brace positioning, etc.) follow BSD conventions.  In
    particular, curly braces for the controlled blocks of if,
    while, switch, etc. go on their own lines.
   
   
    Do not use C++ style comments (// comments).  Strict ANSI C
    compilers do not accept them.  For the same reason, do not use C++
    extensions such as declaring new variables mid-block.
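
    As a rough illustration of the rules so far, the sketch below lays out a
    small function in this style: tab indentation, braces on their own lines,
    a block comment rather than a // comment, and all declarations at the top
    of the block.  The function toy_count_spaces is invented for this example
    and is not part of PostgreSQL.

```c
#include <stddef.h>

/*
 * Count the space characters in a NUL-terminated string.
 */
size_t
toy_count_spaces(const char *str)
{
	size_t		count = 0;
	const char *p;

	for (p = str; *p != '\0'; p++)
	{
		if (*p == ' ')
			count++;
	}

	return count;
}
```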
   
   
    The preferred style for multi-line comment blocks is
/*
 * comment text begins here
 * and continues here
 */
    Note that comment blocks that begin in column 1 will be preserved as-is
    by pgindent, but it will re-flow indented comment blocks
    as though they were plain text.  If you want to preserve the line breaks
    in an indented block, add dashes like this:
    /*----------
     * comment text begins here
     * and continues here
     *----------
     */
   
   
    While submitted patches do not absolutely have to follow these formatting
    rules, it's a good idea to do so.  Your code will get run through
    pgindent before the next release, so there's no point in
    making it look nice under some other set of formatting conventions.
   
   
    The src/tools directory contains sample settings
    files that can be used with the emacs,
    xemacs or vim
    editors to help ensure that they format code according to these
    conventions.
   
   
    The text browsing tools more and
    less can be invoked as:
more -x4
less -x4
    to make them show tabs appropriately.
   
  
  
   Reporting Errors Within the Server
   
   
    Error, warning, and log messages generated within the server code
    should be created using ereport, or its older cousin
    elog.  The use of ereport is complex enough to
    require some explanation.
   
   
    There are two required elements for every message: a severity level
    (ranging from DEBUG to PANIC) and a primary
    message text.  In addition there are optional elements, the most
    common of which is an error identifier code that follows the SQL spec's
    SQLSTATE conventions.
    ereport itself is just a shell function that exists
    mainly for the syntactic convenience of making message generation
    look like a function call in the C source code.  The only parameter
    accepted directly by ereport is the severity level.
    The primary message text and any optional message elements are
    generated by calling auxiliary functions, such as errmsg,
    within the ereport call.
   
   
    A typical call to ereport might look like this:
ereport(ERROR,
        (errcode(ERRCODE_DIVISION_BY_ZERO),
         errmsg("division by zero")));
    This specifies error severity level ERROR (a run-of-the-mill
    error).  The errcode call specifies the SQLSTATE error code
    using a macro defined in src/include/utils/errcodes.h.  The
    errmsg call provides the primary message text.  Notice the
    extra set of parentheses surrounding the auxiliary function calls;
    these are annoying but syntactically necessary.
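
    To see why the extra parentheses matter, here is a self-contained toy
    sketch of how a macro of this shape can be built.  Every name prefixed
    with toy_ or TOY_ is invented for illustration, and the numeric level is
    arbitrary; the real PostgreSQL definitions may well differ.  The point is
    only that the parenthesized group is passed as a single macro argument
    and then becomes the argument list of the finishing function.

```c
#include <stdio.h>
#include <string.h>

/* Toy stand-ins; none of these are PostgreSQL's real definitions. */
#define TOY_ERROR	20

typedef struct ToyErrorData
{
	int			level;
	char		sqlstate[6];
	char		message[128];
} ToyErrorData;

static ToyErrorData toy_error;

/* Begin a report: record the level and reset the optional fields. */
static int
toy_errstart(int level)
{
	toy_error.level = level;
	strcpy(toy_error.sqlstate, "XX000");
	toy_error.message[0] = '\0';
	return 1;
}

/* Auxiliary function: store the five-character SQLSTATE. */
static int
toy_errcode(const char *sqlstate)
{
	strncpy(toy_error.sqlstate, sqlstate, 5);
	toy_error.sqlstate[5] = '\0';
	return 0;
}

/* Auxiliary function: store the primary message text. */
static int
toy_errmsg(const char *msg)
{
	strncpy(toy_error.message, msg, sizeof(toy_error.message) - 1);
	toy_error.message[sizeof(toy_error.message) - 1] = '\0';
	return 0;
}

/* Finish the report; the arguments exist only so that they get evaluated. */
static void
toy_errfinish(int dummy, ...)
{
	(void) dummy;
	fprintf(stderr, "level %d [%s]: %s\n",
			toy_error.level, toy_error.sqlstate, toy_error.message);
}

/*
 * "rest" is one macro argument thanks to its own parentheses; pasted after
 * toy_errfinish it becomes that function's argument list.
 */
#define toy_ereport(level, rest) \
	(toy_errstart(level) ? (toy_errfinish rest) : (void) 0)
```

    With these definitions, toy_ereport(TOY_ERROR, (toy_errcode("22012"),
    toy_errmsg("division by zero"))); compiles and fills in toy_error.
    Without the inner parentheses the preprocessor would see three macro
    arguments instead of two and reject the call.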
   
   
    Here is a more complex example:
ereport(ERROR,
        (errcode(ERRCODE_AMBIGUOUS_FUNCTION),
         errmsg("function %s is not unique",
                func_signature_string(funcname, nargs,
                                      NIL, actual_arg_types)),
         errhint("Unable to choose a best candidate function. "
                 "You might need to add explicit typecasts.")));
    This illustrates the use of format codes to embed run-time values into
    a message text.  Also, an optional "hint" message is provided.
   
   
    The available auxiliary routines for ereport are:
  
   
    
     errcode(sqlerrcode) specifies the SQLSTATE error identifier
     code for the condition.  If this routine is not called, the error
     identifier defaults to
     ERRCODE_INTERNAL_ERROR when the error severity level is
     ERROR or higher, ERRCODE_WARNING when the
     error level is WARNING, otherwise (for NOTICE
     and below) ERRCODE_SUCCESSFUL_COMPLETION.
     While these defaults are often convenient, always think whether they
     are appropriate before omitting the errcode() call.
    
   
   
    
     errmsg(const char *msg, ...) specifies the primary error
     message text, and possibly run-time values to insert into it.  Insertions
     are specified by sprintf-style format codes.  In addition to
     the standard format codes accepted by sprintf, the format
     code %m can be used to insert the error message returned
     by strerror for the current value of errno.
     (That is, the value that was current when the ereport call
     was reached; changes of errno within the auxiliary reporting
     routines will not affect it.  That would not be true if you were to
     write strerror(errno) explicitly in errmsg's
     parameter list; accordingly, do not do so.)
     %m does not require any
     corresponding entry in the parameter list for errmsg.
     Note that the message string will be run through gettext
     for possible localization before format codes are processed.
    
   
   
    
     errmsg_internal(const char *msg, ...) is the same as
     errmsg, except that the message string will not be
     translated nor included in the internationalization message dictionary.
     This should be used for "cannot happen" cases that are probably
     not worth expending translation effort on.
    
   
   
    
     errmsg_plural(const char *fmt_singular, const char *fmt_plural,
     unsigned long n, ...) is like errmsg, but with
     support for various plural forms of the message.
     fmt_singular is the English singular format,
     fmt_plural is the English plural format,
     n is the integer value that determines which plural
     form is needed, and the remaining arguments are formatted according
     to the selected format string.  For more information see
     .
    
   
   
    
     errdetail(const char *msg, ...) supplies an optional
     "detail" message; this is to be used when there is additional
     information that seems inappropriate to put in the primary message.
     The message string is processed in just the same way as for
     errmsg.
    
   
   
    
     errdetail_log(const char *msg, ...) is the same as
     errdetail except that this string goes only to the server
     log, never to the client.  If both errdetail and
     errdetail_log are used then one string goes to the client
     and the other to the log.  This is useful for error details that are
     too security-sensitive or too bulky to include in the report
     sent to the client.
    
   
   
    
     errdetail_plural(const char *fmt_singular, const char *fmt_plural,
     unsigned long n, ...) is like errdetail, but with
     support for various plural forms of the message.
     For more information see .
    
   
   
    
     errhint(const char *msg, ...) supplies an optional
     "hint" message; this is to be used when offering suggestions
     about how to fix the problem, as opposed to factual details about
     what went wrong.
     The message string is processed in just the same way as for
     errmsg.
    
   
   
    
     errcontext(const char *msg, ...) is not normally called
     directly from an ereport message site; rather it is used
     in error_context_stack callback functions to provide
     information about the context in which an error occurred, such as the
     current location in a PL function.
     The message string is processed in just the same way as for
     errmsg.  Unlike the other auxiliary functions, this can
     be called more than once per ereport call; the successive
     strings thus supplied are concatenated with separating newlines.
    
   
   
    
     errposition(int cursorpos) specifies the textual location
     of an error within a query string.  Currently it is only useful for
     errors detected in the lexical and syntactic analysis phases of
     query processing.
    
   
   
    
     errcode_for_file_access() is a convenience function that
     selects an appropriate SQLSTATE error identifier for a failure in a
     file-access-related system call.  It uses the saved
     errno to determine which error code to generate.
     Usually this should be used in combination with %m in the
     primary error message text.
    
   
   
    
     errcode_for_socket_access() is a convenience function that
     selects an appropriate SQLSTATE error identifier for a failure in a
     socket-related system call.
    
   
   
    
     errhidestmt(bool hide_stmt) can be called to specify
     suppression of the STATEMENT: portion of a message in the
     postmaster log.  Generally this is appropriate if the message text
     includes the current statement already.
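
    The remark about errno in the description of errmsg can be pictured with
    a small stand-alone sketch.  It assumes, purely for illustration, a
    reporting entry point that saves errno before doing any other work; the
    names toy_report_begin and toy_errno_for_message are hypothetical and do
    not exist in PostgreSQL.  A %m-style insertion would read the saved copy,
    whereas an explicit strerror(errno) evaluated later would see whatever
    value the intervening code left behind.

```c
#include <errno.h>

static int	toy_saved_errno;

/*
 * Called first, before any auxiliary routine runs: remember errno, then do
 * some setup work that happens to overwrite it.
 */
void
toy_report_begin(void)
{
	toy_saved_errno = errno;
	errno = 0;					/* simulate setup code clobbering errno */
}

/* What a %m-style insertion would use: the value saved at entry. */
int
toy_errno_for_message(void)
{
	return toy_saved_errno;
}
```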
    
   
  
   
   
    There is an older function elog that is still heavily used.
    An elog call:
elog(level, "format string", ...);
    is exactly equivalent to:
ereport(level, (errmsg_internal("format string", ...)));
    Notice that the SQLSTATE error code is always defaulted, and the message
    string is not subject to translation.
    Therefore, elog should be used only for internal errors and
    low-level debug logging.  Any message that is likely to be of interest to
    ordinary users should go through ereport.  Nonetheless,
    there are enough internal "cannot happen" error checks in the
    system that elog is still widely used; it is preferred for
    those messages for its notational simplicity.
   
   
    Advice about writing good error messages can be found in
    the Error Message Style Guide below.
   
  
  
   Error Message Style Guide
   
    This style guide is offered in the hope of maintaining a consistent,
    user-friendly style throughout all the messages generated by
    PostgreSQL.
   
  
   What goes where
   
    The primary message should be short, factual, and avoid reference to
    implementation details such as specific function names.
    "Short" means "should fit on one line under normal
    conditions".  Use a detail message if needed to keep the primary
    message short, or if you feel a need to mention implementation details
    such as the particular system call that failed.  Both primary and detail
    messages should be factual.  Use a hint message for suggestions about what
    to do to fix the problem, especially if the suggestion might not always be
    applicable.
   
   
    For example, instead of:
IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m
(plus a long addendum that is basically a hint)
    write:
Primary:    could not create shared memory segment: %m
Detail:     Failed syscall was shmget(key=%d, size=%u, 0%o).
Hint:       the addendum
   
   
    Rationale: keeping the primary message short helps keep it to the point,
    and lets clients lay out screen space on the assumption that one line is
    enough for error messages.  Detail and hint messages can be relegated to a
    verbose mode, or perhaps a pop-up error-details window.  Also, details and
    hints would normally be suppressed from the server log to save
    space. Reference to implementation details is best avoided since users
    don't know the details anyway.
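
    The split into primary, detail, and hint parts can be pictured from the
    client side with a short sketch.  This is not how any real client is
    implemented; ToyReport, toy_format_report, and the line labels are
    assumptions made for the example.  It only shows that a terse display
    needs the primary message alone, while a verbose one can append the
    optional parts.

```c
#include <stdio.h>
#include <string.h>

typedef struct ToyReport
{
	const char *primary;		/* required */
	const char *detail;			/* optional, may be NULL */
	const char *hint;			/* optional, may be NULL */
} ToyReport;

/*
 * Write the report into buf (buflen must be at least 1).  Terse mode emits
 * only the primary line; verbose mode adds the detail and hint lines when
 * they are present.  Returns the length of the resulting string.
 */
size_t
toy_format_report(const ToyReport *report, int verbose,
				  char *buf, size_t buflen)
{
	size_t		used;

	snprintf(buf, buflen, "ERROR:  %s\n", report->primary);
	used = strlen(buf);
	if (verbose && report->detail != NULL)
	{
		snprintf(buf + used, buflen - used, "DETAIL:  %s\n", report->detail);
		used = strlen(buf);
	}
	if (verbose && report->hint != NULL)
	{
		snprintf(buf + used, buflen - used, "HINT:  %s\n", report->hint);
		used = strlen(buf);
	}
	return used;
}
```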
   
  
  
   Formatting
   
    Don't put any specific assumptions about formatting into the message
    texts.  Expect clients and the server log to wrap lines to fit their own
    needs.  In long messages, newline characters (\n) can be used to indicate
    suggested paragraph breaks.  Don't end a message with a newline.  Don't
    use tabs or other formatting characters.  (In error context displays,
    newlines are automatically added to separate levels of context such as
    function calls.)
   
   
    Rationale: Messages are not necessarily displayed on terminal-type
    displays.  In GUI displays or browsers these formatting instructions are
    at best ignored.
   
  
  
   Quotation marks
   
    English text should use double quotes when quoting is appropriate.
    Text in other languages should consistently use a single kind of quotes,
    chosen to match publishing customs and the computer output of other programs.
   
   
    Rationale: The choice of double quotes over single quotes is somewhat
    arbitrary, but tends to be the preferred use.  Some have suggested
    choosing the kind of quotes depending on the type of object according to
    SQL conventions (namely, strings single quoted, identifiers double
    quoted).  But this is a language-internal technical issue that many users
    aren't even familiar with, it won't scale to other kinds of quoted terms,
    it doesn't translate to other languages, and it's pretty pointless, too.
   
  
  
   Use of quotes
   
    Always use quotes to delimit file names, user-supplied identifiers, and
    other variables that might contain words.  Do not use them to mark up
    variables that will not contain words (for example, operator names).
   
   
    There are functions in the backend that will double-quote their own output
    at need (for example, format_type_be()).  Do not put
    additional quotes around the output of such functions.
   
   
    Rationale: Objects can have names that create ambiguity when embedded in a
    message.  Be consistent about denoting where a plugged-in name starts and
    ends.  But don't clutter messages with unnecessary or duplicate quote
    marks.
   
  
  
   Grammar and punctuation
   
    The rules are different for primary error messages and for detail/hint
    messages:
   
   
    Primary error messages: Do not capitalize the first letter.  Do not end a
    message with a period.  Do not even think about ending a message with an
    exclamation point.
   
   
    Detail and hint messages: Use complete sentences, and end each with
    a period.  Capitalize the first word of sentences.  Put two spaces after
    the period if another sentence follows (for English text; might be
    inappropriate in other languages).
   
   
    Rationale: Avoiding punctuation makes it easier for client applications to
    embed the message into a variety of grammatical contexts.  Often, primary
    messages are not grammatically complete sentences anyway.  (And if they're
    long enough to be more than one sentence, they should be split into
    primary and detail parts.)  However, detail and hint messages are longer
    and might need to include multiple sentences.  For consistency, they should
    follow complete-sentence style even when there's only one sentence.
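
    The punctuation rules lend themselves to a mechanical check.  The
    following is a deliberately naive sketch, not an existing PostgreSQL
    tool; the function names are made up.  It tests only the endings (plus
    the no-trailing-newline rule from the Formatting section) and the leading
    capital of detail and hint messages.  It does not check capitalization of
    primary messages, since one may legitimately begin with an upper-case SQL
    key word.

```c
#include <ctype.h>
#include <string.h>

/* Return 1 if msg is non-empty and its last character is c. */
static int
toy_ends_with(const char *msg, char c)
{
	size_t		len = strlen(msg);

	return len > 0 && msg[len - 1] == c;
}

/*
 * Primary message: must be non-empty and must not end with a period, an
 * exclamation point, or a newline.
 */
int
toy_primary_style_ok(const char *msg)
{
	if (msg[0] == '\0')
		return 0;
	return !toy_ends_with(msg, '.') &&
		!toy_ends_with(msg, '!') &&
		!toy_ends_with(msg, '\n');
}

/* Detail or hint: must start with a capital letter and end with a period. */
int
toy_detail_style_ok(const char *msg)
{
	return isupper((unsigned char) msg[0]) != 0 && toy_ends_with(msg, '.');
}
```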
   
  
  
   Upper case vs. lower case
   
    Use lower case for message wording, including the first letter of a
    primary error message.  Use upper case for SQL commands and key words if
    they appear in the message.
   
   
    Rationale: It's easier to make everything look more consistent this
    way, since some messages are complete sentences and some not.
   
  
  
   Avoid passive voice
   
    Use the active voice.  Use complete sentences when there is an acting
    subject ("A could not do B").  Use telegram style without
    subject if the subject would be the program itself; do not use
    "I" for the program.
   
   
    Rationale: The program is not human.  Don't pretend otherwise.
   
  
  
   Present vs past tense
   
    Use past tense if an attempt to do something failed, but could perhaps
    succeed next time (perhaps after fixing some problem).  Use present tense
    if the failure is certainly permanent.
   
   
    There is a nontrivial semantic difference between sentences of the form:
could not open file "%s": %m
and:
cannot open file "%s"
    The first one means that the attempt to open the file failed.  The
    message should give a reason, such as "disk full" or
    "file doesn't exist".  The past tense is appropriate because
    next time the disk might not be full anymore or the file in question might
    exist.
   
   
    The second form indicates that the functionality of opening the named file
    does not exist at all in the program, or that it's conceptually
    impossible.  The present tense is appropriate because the condition will
    persist indefinitely.
   
   
    Rationale: Granted, the average user will not be able to draw great
    conclusions merely from the tense of the message, but since the language
    provides us with a grammar we should use it correctly.
   
  
  
   Type of the object
   
    When citing the name of an object, state what kind of object it is.
   
   
    Rationale: Otherwise no one will know what foo.bar.baz
    refers to.
   
  
  
   Brackets
   
    Square brackets are only to be used (1) in command synopses to denote
    optional arguments, or (2) to denote an array subscript.
   
   
    Rationale: Anything else does not correspond to widely-known customary
    usage and will confuse people.
   
  
  
   Assembling error messages
   
   When a message includes text that is generated elsewhere, embed it in
   this style:
could not open file %s: %m
   
   
    Rationale: It would be difficult to account for all possible error codes
    to paste this into a single smooth sentence, so some sort of punctuation
    is needed.  Putting the embedded text in parentheses has also been
    suggested, but it's unnatural if the embedded text is likely to be the
    most important part of the message, as is often the case.
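
    Outside the server, where ereport and %m are not available, the same
    message shape can be produced in plain C.  The sketch below is only an
    analogy under that assumption; toy_check_readable is a hypothetical
    helper.  It saves errno immediately after the failing call and appends
    the system's description after a colon.  The file name is quoted here,
    following the advice in the section on use of quotes.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Try to open path for reading.  On success, close it again and return 0.
 * On failure, write a message into buf and return -1.
 */
int
toy_check_readable(const char *path, char *buf, size_t buflen)
{
	FILE	   *fp;
	int			save_errno;

	fp = fopen(path, "r");
	if (fp != NULL)
	{
		fclose(fp);
		return 0;
	}

	/* Save errno right away, before any other library call can change it. */
	save_errno = errno;
	snprintf(buf, buflen, "could not open file \"%s\": %s",
			 path, strerror(save_errno));
	return -1;
}
```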
   
  
  
   Reasons for errors
   
    Messages should always state the reason why an error occurred.
    For example:
BAD:    could not open file %s
BETTER: could not open file %s (I/O failure)
    If no reason is known, you had better fix the code.
   
  
  
   Function names
   
    Don't include the name of the reporting routine in the error text. We have
    other mechanisms for finding that out when needed, and for most users it's
    not helpful information.  If the error text doesn't make as much sense
    without the function name, reword it.
BAD:    pg_atoi: error in "z": cannot parse "z"
BETTER: invalid input syntax for integer: "z"
   
   
    Avoid mentioning called function names, either; instead say what the code
    was trying to do:
BAD:    open() failed: %m
BETTER: could not open file %s: %m
    If it really seems necessary, mention the system call in the detail
    message.  (In some cases, providing the actual values passed to the
    system call might be appropriate information for the detail message.)
   
   
    Rationale: Users don't know what all those functions do.
   
  
  
   Tricky words to avoid
  
    Unable
   
    "Unable" is nearly the passive voice.  Better use
    "cannot" or "could not", as appropriate.
   
  
  
    Bad
   
    Error messages like "bad result" are really hard to interpret
    intelligently.  It's better to write why the result is "bad",
    e.g., "invalid format".
   
  
  
    Illegal
   
    "Illegal" stands for a violation of the law; the rest is
    "invalid".  Better yet, say why it's invalid.
   
  
  
    Unknown
   
    Try to avoid "unknown".  Consider "error: unknown
    response".  If you don't know what the response is, how do you know
    it's erroneous?  "Unrecognized" is often a better choice.
    Also, be sure to include the value being complained of.
BAD:    unknown node type
BETTER: unrecognized node type: 42
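
    In code, this advice usually lands in the default branch of a switch.
    The stand-alone sketch below uses invented node type tags and a
    hypothetical toy_describe_node function; in the server the message would
    be reported through elog or ereport instead of being written to a buffer.

```c
#include <stdio.h>

/* Hypothetical node type tags, invented for this example. */
#define TOY_NODE_SCAN	1
#define TOY_NODE_JOIN	2

/*
 * Write a short description of node_type into buf.  Returns 0 for a known
 * type and -1 for an unrecognized one.
 */
int
toy_describe_node(int node_type, char *buf, size_t buflen)
{
	switch (node_type)
	{
		case TOY_NODE_SCAN:
			snprintf(buf, buflen, "scan node");
			return 0;
		case TOY_NODE_JOIN:
			snprintf(buf, buflen, "join node");
			return 0;
		default:
			/* Include the offending value, as the guideline asks. */
			snprintf(buf, buflen, "unrecognized node type: %d", node_type);
			return -1;
	}
}
```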
   
  
  
    Find vs. Exists
   
    If the program uses a nontrivial algorithm to locate a resource (e.g., a
    path search) and that algorithm fails, it is fair to say that the program
    couldn't "find" the resource.  If, on the other hand, the
    expected location of the resource is known but the program cannot access
    it there then say that the resource doesn't "exist".  Using
    "find" in this case sounds weak and confuses the issue.
   
  
  
    May vs. Can vs. Might
   
    "May" suggests permission (e.g., "You may borrow my rake."),
    and has little use in documentation or error messages.
    "Can" suggests ability (e.g., "I can lift that log."),
    and "might" suggests possibility (e.g., "It might rain
    today.").  Using the proper word clarifies meaning and assists
    translation.
   
  
  
    Contractions
   
    Avoid contractions, like "can't"; use
    "cannot" instead.
   
  
  
  
   Proper spelling
   
    Spell out words in full.  For instance, avoid:
  
   
    
     spec
    
   
   
    
     stats
    
   
   
    
     parens
    
   
   
    
     auth
    
   
   
    
     xact
    
   
  
   
   
    Rationale: This will improve consistency.
   
  
  
   Localization
   
    Keep in mind that error message texts need to be translated into other
    languages.  Follow the guidelines in 
    to avoid making life difficult for translators.