PL/Perl - Perl Procedural Language
  
   PL/Perl
  
  
   Perl
  
  
   PL/Perl is a loadable procedural language that enables you to write
   PostgreSQL functions in the 
   Perl programming language.
  
   The usual advantage to using PL/Perl is that this allows use,
   within stored functions, of the manyfold string
    munging
 operators and functions available for Perl.  Parsing
   complex strings may be be easier using Perl than it is with the
   string functions and control structures provided in PL/pgSQL.
  
  
   To install PL/Perl in a particular database, use
   createlang plperl dbname>.
  
  
   
    If a language is installed into template1>, all subsequently
    created databases will have the language installed automatically.
   
  
  
   
    Users of source packages must specially enable the build of
    PL/Perl during the installation process.  (Refer to  for more information.)  Users of
    binary packages might find PL/Perl in a separate subpackage.
   
  
 
  PL/Perl Functions and Arguments
  
   To create a function in the PL/Perl language, use the standard
   
   syntax:
CREATE FUNCTION funcname (argument-types) RETURNS return-type AS $$
    # PL/Perl function body
$$ LANGUAGE plperl;
   The body of the function is ordinary Perl code. In fact, the PL/Perl
   glue code wraps it inside a Perl subroutine. A PL/Perl function must
   always return a scalar value.  You can return more complex structures
   (arrays, records, and sets) by returning a reference, as discussed below.
   Never return a list.
  
  
   
    The use of named nested subroutines is dangerous in Perl, especially if
    they refer to lexical variables in the enclosing scope. Because a PL/Perl
    function is wrapped in a subroutine, any named subroutine you create will
    be nested. In general, it is far safer to create anonymous subroutines
    which you call via a coderef. See the perldiag
    man page for more details.
   
  
  
   The syntax of the CREATE FUNCTION command requires
   the function body to be written as a string constant.  It is usually
   most convenient to use dollar quoting (see ) for the string constant.
   If you choose to use regular single-quoted string constant syntax,
   you must escape single quote marks ('>) and backslashes
   (\>) used in the body of the function, typically by
   doubling them (see ).
  
  
   Arguments and results are handled as in any other Perl subroutine:
   arguments are passed in @_, and a result value
   is returned with return> or as the last expression
   evaluated in the function.
  
  
   For example, a function returning the greater of two integer values
   could be defined as:
CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
    if ($_[0] > $_[1]) { return $_[0]; }
    return $_[1];
$$ LANGUAGE plperl;
  
  
   If an SQL null valuenull value>in PL/Perl> is passed to a function,
   the argument value will appear as undefined> in Perl.  The
   above function definition will not behave very nicely with null
   inputs (in fact, it will act as though they are zeroes).  We could
   add STRICT> to the function definition to make
   PostgreSQL do something more reasonable:
   if a null value is passed, the function will not be called at all,
   but will just return a null result automatically.  Alternatively,
   we could check for undefined inputs in the function body.  For
   example, suppose that we wanted perl_max with
   one null and one nonnull argument to return the nonnull argument,
   rather than a null value:
CREATE FUNCTION perl_max (integer, integer) RETURNS integer AS $$
    my ($x,$y) = @_;
    if (! defined $x) {
        if (! defined $y) { return undef; }
        return $y;
    }
    if (! defined $y) { return $x; }
    if ($x > $y) { return $x; }
    return $y;
$$ LANGUAGE plperl;
   As shown above, to return an SQL null value from a PL/Perl
   function, return an undefined value.  This can be done whether the
   function is strict or not.
  
  
   Perl can return PostgreSQL arrays as
   references to Perl arrays.  Here is an example:
CREATE OR REPLACE function returns_array()
RETURNS text[][] AS $$
    return [['a"b','c,d'],['e\\f','g']];
$$ LANGUAGE plperl;
select returns_array();
  
  
   Composite-type arguments are passed to the function as references
   to hashes.  The keys of the hash are the attribute names of the
   composite type.  Here is an example:
CREATE TABLE employee (
    name text,
    basesalary integer,
    bonus integer
);
CREATE FUNCTION empcomp(employee) RETURNS integer AS $$
    my ($emp) = @_;
    return $emp->{basesalary} + $emp->{bonus};
$$ LANGUAGE plperl;
SELECT name, empcomp(employee.*) FROM employee;
  
  
   A PL/Perl function can return a composite-type result using the same
   approach: return a reference to a hash that has the required attributes.
   For example,
CREATE TYPE testrowperl AS (f1 integer, f2 text, f3 text);
CREATE OR REPLACE FUNCTION perl_row() RETURNS testrowperl AS $$
    return {f2 => 'hello', f1 => 1, f3 => 'world'};
$$ LANGUAGE plperl;
SELECT * FROM perl_row();
   Any columns in the declared result data type that are not present in the
   hash will be returned as null values.
  
  
    PL/Perl functions can also return sets of either scalar or
    composite types.  Usually you'll want to return rows one at a
    time, both to speed up startup time and to keep from queueing up
    the entire result set in memory.  You can do this with
    return_next as illustrated below.  Note that
    after the last return_next, you must put
    either return or (better) return
    undef.
CREATE OR REPLACE FUNCTION perl_set_int(int)
RETURNS SETOF INTEGER AS $$
    foreach (0..$_[0]) {
        return_next($_);
    }
    return undef;
$$ LANGUAGE plperl;
SELECT * FROM perl_set_int(5);
CREATE OR REPLACE FUNCTION perl_set()
RETURNS SETOF testrowperl AS $$
    return_next({ f1 => 1, f2 => 'Hello', f3 => 'World' });
    return_next({ f1 => 2, f2 => 'Hello', f3 => 'PostgreSQL' });
    return_next({ f1 => 3, f2 => 'Hello', f3 => 'PL/Perl' });
    return undef;
$$ LANGUAGE plperl;
    For small result sets, you can return a reference to an array that
    contains either scalars, references to arrays, or references to
    hashes for simple types, array types, and composite types,
    respectively.  Here are some simple examples of returning the entire
    result set as an array reference:
CREATE OR REPLACE FUNCTION perl_set_int(int) RETURNS SETOF INTEGER AS $$
    return [0..$_[0]];
$$ LANGUAGE plperl;
SELECT * FROM perl_set_int(5);
CREATE OR REPLACE FUNCTION perl_set() RETURNS SETOF testrowperl AS $$
    return [
        { f1 => 1, f2 => 'Hello', f3 => 'World' },
        { f1 => 2, f2 => 'Hello', f3 => 'PostgreSQL' },
        { f1 => 3, f2 => 'Hello', f3 => 'PL/Perl' }
    ];
$$ LANGUAGE plperl;
SELECT * FROM perl_set();
  
  
   If you wish to use the strict> pragma with your code,
   the easiest way to do so is to SET>
   plperl.use_strict to true.  This parameter affects
   subsequent compilations of PL/Perl> functions, but not
   functions already compiled in the current session.  To set the
   parameter before PL/Perl> has been loaded, it is
   necessary to have added plperl>> to the  list in
   postgresql.conf.
  
  
   Another way to use the strict> pragma is to put
use strict;
   in the function body.  But this only works in PL/PerlU>
   functions, since use> is not a trusted operation.  In
   PL/Perl> functions you can instead do
BEGIN { strict->import(); }
  
 
 
  Database Access from PL/Perl
  
   Access to the database itself from your Perl function can be done
   via the function spi_exec_query described
   below, or via an experimental module 
   
   DBD::PgSPI
   (also available at 
   CPAN mirror sites>).  This module makes available a
   DBI>-compliant database-handle named
   $pg_dbh that can be used to perform queries with
   normal DBI>
   syntax.DBI>
  
  
   PL/Perl provides additional Perl commands:
   
    
     
      spi_exec_query
      in PL/Perl
     
     spi_exec_query>(query [, max-rows])
     spi_query>(command)
     spi_fetchrow>(cursor)
     spi_prepare>(command, argument types)
     spi_exec_prepared>(plan)
     spi_query_prepared>(plan [, attributes], arguments)
     spi_cursor_close>(cursor)
     spi_freeplan>(plan)
     
      
       spi_exec_query executes an SQL command and
returns the entire row set as a reference to an array of hash
references.  You should only use this command when you know
that the result set will be relatively small.  Here is an
example of a query (SELECT command) with the
optional maximum number of rows:
$rv = spi_exec_query('SELECT * FROM my_table', 5);
        This returns up to 5 rows from the table
        my_table.  If my_table
        has a column my_column, you can get that
        value from row $i of the result like this:
$foo = $rv->{rows}[$i]->{my_column};
       The total number of rows returned from a SELECT
       query can be accessed like this:
$nrows = $rv->{processed}
      
      
       Here is an example using a different command type:
$query = "INSERT INTO my_table VALUES (1, 'test')";
$rv = spi_exec_query($query);
       You can then access the command status (e.g.,
       SPI_OK_INSERT) like this:
$res = $rv->{status};
       To get the number of rows affected, do:
$nrows = $rv->{processed};
      
      
       Here is a complete example:
CREATE TABLE test (
    i int,
    v varchar
);
INSERT INTO test (i, v) VALUES (1, 'first line');
INSERT INTO test (i, v) VALUES (2, 'second line');
INSERT INTO test (i, v) VALUES (3, 'third line');
INSERT INTO test (i, v) VALUES (4, 'immortal');
CREATE OR REPLACE FUNCTION test_munge() RETURNS SETOF test AS $$
    my $rv = spi_exec_query('select i, v from test;');
    my $status = $rv->{status};
    my $nrows = $rv->{processed};
    foreach my $rn (0 .. $nrows - 1) {
        my $row = $rv->{rows}[$rn];
        $row->{i} += 200 if defined($row->{i});
        $row->{v} =~ tr/A-Za-z/a-zA-Z/ if (defined($row->{v}));
        return_next($row);
    }
    return undef;
$$ LANGUAGE plperl;
SELECT * FROM test_munge();
    
    
    spi_query and spi_fetchrow
    work together as a pair for row sets which may be large, or for cases
    where you wish to return rows as they arrive.
    spi_fetchrow works only with
    spi_query. The following example illustrates how
    you use them together:
CREATE TYPE foo_type AS (the_num INTEGER, the_text TEXT);
CREATE OR REPLACE FUNCTION lotsa_md5 (INTEGER) RETURNS SETOF foo_type AS $$
    use Digest::MD5 qw(md5_hex);
    my $file = '/usr/share/dict/words';
    my $t = localtime;
    elog(NOTICE, "opening file $file at $t" );
    open my $fh, '<', $file # ooh, it's a file access!
        or elog(ERROR, "can't open $file for reading: $!");
    my @words = <$fh>;
    close $fh;
    $t = localtime;
    elog(NOTICE, "closed file $file at $t");
    chomp(@words);
    my $row;
    my $sth = spi_query("SELECT * FROM generate_series(1,$_[0]) AS b(a)");
    while (defined ($row = spi_fetchrow($sth))) {
        return_next({
            the_num => $row->{a},
            the_text => md5_hex($words[rand @words])
        });
    }
    return;
$$ LANGUAGE plperlu;
SELECT * from lotsa_md5(500);
    
      
    
    spi_prepare, spi_query_prepared, spi_exec_prepared, 
    and spi_freeplan implement the same functionality but for prepared queries. Once
    a query plan is prepared by a call to spi_prepare, the plan can be used instead
    of the string query, either in spi_exec_prepared, where the result is the same as returned
    by spi_exec_query, or in spi_query_prepared which returns a cursor
    exactly as spi_query does, which can be later passed to spi_fetchrow.
    
    
    
    The advantage of prepared queries is that is it possible to use one prepared plan for more
    than one query execution. After the plan is not needed anymore, it may be freed with 
    spi_freeplan:
    
    
    
CREATE OR REPLACE FUNCTION init() RETURNS INTEGER AS $$
        $_SHARED{my_plan} = spi_prepare( 'SELECT (now() + $1)::date AS now', 'INTERVAL');
$$ LANGUAGE plperl;
CREATE OR REPLACE FUNCTION add_time( INTERVAL ) RETURNS TEXT AS $$
        return spi_exec_prepared( 
                $_SHARED{my_plan},
                $_[0],
        )->{rows}->[0]->{now};
$$ LANGUAGE plperl;
CREATE OR REPLACE FUNCTION done() RETURNS INTEGER AS $$
        spi_freeplan( $_SHARED{my_plan});
        undef $_SHARED{my_plan};
$$ LANGUAGE plperl;
SELECT init();
SELECT add_time('1 day'), add_time('2 days'), add_time('3 days');
SELECT done();
  add_time  |  add_time  |  add_time  
------------+------------+------------
 2005-12-10 | 2005-12-11 | 2005-12-12
    
    
    
    Note that the parameter subscript in spi_prepare is defined via
    $1, $2, $3, etc, so avoid declaring query strings in double quotes that might easily
    lead to hard-to-catch bugs.
    
    
     Normally, spi_fetchrow> should be repeated until it
     returns undef, indicating that there are no more
     rows to read.  The cursor is automatically freed when
     spi_fetchrow> returns undef.
     If you do not wish to read all the rows, instead call
     spi_cursor_close> to free the cursor.
     Failure to do so will result in memory leaks.
    
     
    
    
     
      elog
      in PL/Perl
     
     elog>(level, msg)
     
      
       Emit a log or error message. Possible levels are
       DEBUG>, LOG>, INFO>,
       NOTICE>, WARNING>, and ERROR>.
       ERROR>
        raises an error condition; if this is not trapped by the surrounding
        Perl code, the error propagates out to the calling query, causing
        the current transaction or subtransaction to be aborted.  This
        is effectively the same as the Perl die> command.
        The other levels only generate messages of different
        priority levels.
        Whether messages of a particular priority are reported to the client,
        written to the server log, or both is controlled by the
         and
         configuration
        variables. See  for more
        information.
      
     
    
   
  
 
 
  Data Values in PL/Perl
  
   The argument values supplied to a PL/Perl function's code are
   simply the input arguments converted to text form (just as if they
   had been displayed by a SELECT statement).
   Conversely, the return> command will accept any string
   that is acceptable input format for the function's declared return
   type.  So, within the PL/Perl function,
   all values are just text strings.
  
 
 
  Global Values in PL/Perl
  
    You can use the global hash %_SHARED to store
    data, including code references, between function calls for the
    lifetime of the current session.
  
  
    Here is a simple example for shared data:
CREATE OR REPLACE FUNCTION set_var(name text, val text) RETURNS text AS $$
    if ($_SHARED{$_[0]} = $_[1]) {
        return 'ok';
    } else {
        return "can't set shared variable $_[0] to $_[1]";
    }
$$ LANGUAGE plperl;
CREATE OR REPLACE FUNCTION get_var(name text) RETURNS text AS $$
    return $_SHARED{$_[0]};
$$ LANGUAGE plperl;
SELECT set_var('sample', 'Hello, PL/Perl!  How's tricks?');
SELECT get_var('sample');
  
  
   Here is a slightly more complicated example using a code reference:
CREATE OR REPLACE FUNCTION myfuncs() RETURNS void AS $$
    $_SHARED{myquote} = sub {
        my $arg = shift;
        $arg =~ s/(['\\])/\\$1/g;
        return "'$arg'";
    };
$$ LANGUAGE plperl;
SELECT myfuncs(); /* initializes the function */
/* Set up a function that uses the quote function */
CREATE OR REPLACE FUNCTION use_quote(TEXT) RETURNS text AS $$
    my $text_to_quote = shift;
    my $qfunc = $_SHARED{myquote};
    return &$qfunc($text_to_quote);
$$ LANGUAGE plperl;
   (You could have replaced the above with the one-liner
   return $_SHARED{myquote}->($_[0]);
   at the expense of readability.)
  
 
 
  Trusted and Untrusted PL/Perl
  
   trusted
   PL/Perl
  
  
   Normally, PL/Perl is installed as a trusted> programming
   language named plperl>.  In this setup, certain Perl
   operations are disabled to preserve security.  In general, the
   operations that are restricted are those that interact with the
   environment. This includes file handle operations,
   require, and use (for
   external modules).  There is no way to access internals of the
   database server process or to gain OS-level access with the
   permissions of the server process,
   as a C function can do.  Thus, any unprivileged database user may
   be permitted to use this language.
  
  
   Here is an example of a function that will not work because file
   system operations are not allowed for security reasons:
CREATE FUNCTION badfunc() RETURNS integer AS $$
    my $tmpfile = "/tmp/badfile";
    open my $fh, '>', $tmpfile
        or elog(ERROR, qq{could not open the file "$tmpfile": $!});
    print $fh "Testing writing to a file\n";
    close $fh or elog(ERROR, qq{could not close the file "$tmpfile": $!});
    return 1;
$$ LANGUAGE plperl;
    The creation of this function will fail as its use of a forbidden
    operation will be be caught by the validator.
  
  
   Sometimes it is desirable to write Perl functions that are not
   restricted.  For example, one might want a Perl function that sends
   mail.  To handle these cases, PL/Perl can also be installed as an
   untrusted> language (usually called
   PL/PerlUPL/PerlU>).
   In this case the full Perl language is available.  If the
   createlang program is used to install the
   language, the language name plperlu will select
   the untrusted PL/Perl variant.
  
  
   The writer of a PL/PerlU> function must take care that the function
   cannot be used to do anything unwanted, since it will be able to do
   anything that could be done by a user logged in as the database
   administrator.  Note that the database system allows only database
   superusers to create functions in untrusted languages.
  
  
   If the above function was created by a superuser using the language
   plperlu>, execution would succeed.
  
  
    
	  For security reasons, to stop a leak of privileged operations from
      PL/PerlU> to PL/Perl>, these two languages
	  have to run in separate instances of the Perl interpreter. If your
	  Perl installation has been appropriately compiled, this is not a problem.
	  However, not all installations are compiled with the requisite flags.
	  If PostgreSQL> detects that this is the case then it will
	  not start a second interpreter, but instead create an error. In
	  consequence, in such an installation, you cannot use both 
	  PL/PerlU> and PL/Perl> in the same backend
	  process. The remedy for this is to obtain a Perl installation created
	  with the appropriate flags, namely either usemultiplicity> or
	  both usethreads> and useithreads>. 
	  For more details,see the perlembed> manual page.
    
  
  
 
 
  PL/Perl Triggers
  
   PL/Perl can be used to write trigger functions.  In a trigger function,
   the hash reference $_TD contains information about the
   current trigger event. $_TD> is a global variable, 
   which gets a separate local value for each invocation of the trigger. 
   The fields of the $_TD hash reference are:
   
    
     $_TD->{new}{foo}
     
      
       NEW value of column foo
      
     
    
    
     $_TD->{old}{foo}
     
      
       OLD value of column foo
      
     
    
    
     $_TD->{name}
     
      
       Name of the trigger being called
      
     
    
    
     $_TD->{event}
     
      
       Trigger event: INSERT>, UPDATE>, DELETE>, or UNKNOWN>
      
     
    
    
     $_TD->{when}
     
      
       When the trigger was called: BEFORE, AFTER, or UNKNOWN
      
     
    
    
     $_TD->{level}
     
      
       The trigger level: ROW, STATEMENT, or UNKNOWN
      
     
    
    
     $_TD->{relid}
     
      
       OID of the table on which the trigger fired
      
     
    
    
     $_TD->{table_name}
     
      
       Name of the table on which the trigger fired
      
     
    
    
     $_TD->{relname}
     
      
       Name of the table on which the trigger fired. This has been deprecated,
       and could be removed in a future release. 
       Please use $_TD->{table_name} instead.
      
     
    
    
     $_TD->{table_schema}
     
      
       Name of the schema in which the table on which the trigger fired, is
      
     
    
    
     $_TD->{argc}
     
      
       Number of arguments of the trigger function
      
     
    
    
     @{$_TD->{args}}
     
      
       Arguments of the trigger function.  Does not exist if $_TD->{argc} is 0.
      
     
    
   
  
  
   Triggers can return one of the following:
   
    
     return;
     
      
       Execute the statement
      
     
    
    
     "SKIP"
     
      
       Don't execute the statement
      
     
    
    
     "MODIFY"
     
      
       Indicates that the NEW row was modified by
       the trigger function
      
     
    
   
  
  
   Here is an example of a trigger function, illustrating some of the
   above:
CREATE TABLE test (
    i int,
    v varchar
);
CREATE OR REPLACE FUNCTION valid_id() RETURNS trigger AS $$
    if (($_TD->{new}{i} >= 100) || ($_TD->{new}{i} <= 0)) {
        return "SKIP";    # skip INSERT/UPDATE command
    } elsif ($_TD->{new}{v} ne "immortal") {
        $_TD->{new}{v} .= "(modified by trigger)";
        return "MODIFY";  # modify row and execute INSERT/UPDATE command
    } else {
        return;           # execute INSERT/UPDATE command
    }
$$ LANGUAGE plperl;
CREATE TRIGGER test_valid_id_trig
    BEFORE INSERT OR UPDATE ON test
    FOR EACH ROW EXECUTE PROCEDURE valid_id();
  
 
 
  Limitations and Missing Features
  
   The following features are currently missing from PL/Perl, but they
   would make welcome contributions.
   
    
     
      PL/Perl functions cannot call each other directly (because they
      are anonymous subroutines inside Perl).
     
    
    
     
      SPI is not yet fully implemented.
     
    
    
     
      If you are fetching very large data sets using
      spi_exec_query, you should be aware that
      these will all go into memory.  You can avoid this by using
      spi_query/spi_fetchrow as
      illustrated earlier.
     
     
        A similar problem occurs if a set-returning function passes a
        large set of rows back to PostgreSQL via return. You
        can avoid this problem too by instead using
        return_next for each row returned, as shown
        previously.