From 41b9c8452b9df3a431dffc346890f926d17d47ad Mon Sep 17 00:00:00 2001
From: Tom Lane
Date: Thu, 2 Aug 2012 13:10:30 -0400
Subject: Replace libpq's "row processor" API with a "single row" mode.

After taking awhile to digest the row-processor feature that was added to libpq in commit 92785dac2ee7026948962cd61c4cd84a2d052772, we've concluded it is over-complicated and too hard to use. Leave the core infrastructure changes in place (that is, there's still a row processor function inside libpq), but remove the exposed API pieces, and instead provide a "single row" mode switch that causes PQgetResult to return one row at a time in separate PGresult objects.

This approach incurs more overhead than proper use of a row processor callback would, since construction of a PGresult per row adds extra cycles. However, it is far easier to use and harder to break. The single-row mode still affords applications the primary benefit that the row processor API was meant to provide, namely not having to accumulate large result sets in memory before processing them. Preliminary testing suggests that we can probably buy back most of the extra cycles by micro-optimizing construction of the extra results, but that task will be left for another day.

Marko Kreen
---
 doc/src/sgml/libpq.sgml | 416 ++++++++++++++++--------------------------------
 1 file changed, 139 insertions(+), 277 deletions(-)

(limited to 'doc/src')

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 5c5dd68db30..255c5c1abb8 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -2418,14 +2418,28 @@ ExecStatusType PQresultStatus(const PGresult *res);
 PGRES_COPY_BOTH
- Copy In/Out (to and from server) data transfer started. This is
- currently used only for streaming replication.
+ Copy In/Out (to and from server) data transfer started. This
+ feature is currently used only for streaming replication,
+ so this status should not occur in ordinary applications.
+ + + + + + PGRES_SINGLE_TUPLE + + + The PGresult contains a single result tuple + from the current command. This status occurs only when + single-row mode has been selected for the query + (see ). - If the result status is PGRES_TUPLES_OK, then + If the result status is PGRES_TUPLES_OK or + PGRES_SINGLE_TUPLE, then the functions described below can be used to retrieve the rows returned by the query. Note that a SELECT command that happens to retrieve zero rows still shows @@ -2726,7 +2740,8 @@ void PQclear(PGresult *res); These functions are used to extract information from a PGresult object that represents a successful query result (that is, one that has status - PGRES_TUPLES_OK). They can also be used to extract + PGRES_TUPLES_OK or PGRES_SINGLE_TUPLE). + They can also be used to extract information from a successful Describe operation: a Describe's result has all the same column information that actual execution of the query would provide, but it has zero rows. For objects with other status values, @@ -3738,7 +3753,7 @@ unsigned char *PQunescapeBytea(const unsigned char *from, size_t *to_length); The PQexec function is adequate for submitting - commands in normal, synchronous applications. It has a couple of + commands in normal, synchronous applications. It has a few deficiencies, however, that can be of importance to some users: @@ -3769,6 +3784,15 @@ unsigned char *PQunescapeBytea(const unsigned char *from, size_t *to_length); PQexec. + + + + PQexec always collects the command's entire result, + buffering it in a single PGresult. While + this simplifies error-handling logic for the application, it can be + impractical for results containing many rows. + + @@ -3984,8 +4008,11 @@ int PQsendDescribePortal(PGconn *conn, const char *portalName); Waits for the next result from a prior PQsendQuery, PQsendQueryParams, - PQsendPrepare, or - PQsendQueryPrepared call, and returns it. 
+ PQsendPrepare, + PQsendQueryPrepared, + PQsendDescribePrepared, or + PQsendDescribePortal + call, and returns it. A null pointer is returned when the command is complete and there will be no more results. @@ -4012,7 +4039,7 @@ PGresult *PQgetResult(PGconn *conn); Even when PQresultStatus indicates a fatal error, PQgetResult should be called until it - returns a null pointer to allow libpq to + returns a null pointer, to allow libpq to process the error information completely. @@ -4029,7 +4056,18 @@ PGresult *PQgetResult(PGconn *conn); can be obtained individually. (This allows a simple form of overlapped processing, by the way: the client can be handling the results of one command while the server is still working on later queries in the same - command string.) However, calling PQgetResult + command string.) + + + + Another frequently-desired feature that can be obtained with + PQsendQuery and PQgetResult + is retrieving large query results a row at a time. This is discussed + in . + + + + By itself, calling PQgetResult will still cause the client to block until the server completes the next SQL command. This can be avoided by proper use of two more functions: @@ -4238,6 +4276,98 @@ int PQflush(PGconn *conn); + + Retrieving Query Results Row-By-Row + + + libpq + single-row mode + + + + Ordinarily, libpq collects a SQL command's + entire result and returns it to the application as a single + PGresult. This can be unworkable for commands + that return a large number of rows. For such cases, applications can use + PQsendQuery and PQgetResult in + single-row mode. In this mode, the result row(s) are + returned to the application one at a time, as they are received from the + server. + + + + To enter single-row mode, call PQsetSingleRowMode + immediately after a successful call of PQsendQuery + (or a sibling function). This mode selection is effective only for the + currently executing query. 
Then call PQgetResult + repeatedly, until it returns null, as documented in . If the query returns any rows, they are returned + as individual PGresult objects, which look like + normal query results except for having status code + PGRES_SINGLE_TUPLE instead of + PGRES_TUPLES_OK. After the last row, or immediately if + the query returns zero rows, a zero-row object with status + PGRES_TUPLES_OK is returned; this is the signal that no + more rows will arrive. (But note that it is still necessary to continue + calling PQgetResult until it returns null.) All of + these PGresult objects will contain the same row + description data (column names, types, etc) that an ordinary + PGresult object for the query would have. + Each object should be freed with PQclear as usual. + + + + + + + PQsetSingleRowMode + + PQsetSingleRowMode + + + + + + Select single-row mode for the currently-executing query. + + +int PQsetSingleRowMode(PGconn *conn); + + + + + This function can only be called immediately after + PQsendQuery or one of its sibling functions, + before any other operation on the connection such as + PQconsumeInput or + PQgetResult. If called at the correct time, + the function activates single-row mode for the current query and + returns 1. Otherwise the mode stays unchanged and the function + returns 0. In any case, the mode reverts to normal after + completion of the current query. + + + + + + + + + While processing a query, the server may return some rows and then + encounter an error, causing the query to be aborted. Ordinarily, + libpq discards any such rows and reports only the + error. But in single-row mode, those rows will have already been + returned to the application. Hence, the application will see some + PGRES_SINGLE_TUPLE PGresult + objects followed by a PGRES_FATAL_ERROR object. 
For + proper transactional behavior, the application must be designed to + discard or undo whatever has been done with the previously-processed + rows, if the query ultimately fails. + + + + + Canceling Queries in Progress @@ -5700,274 +5830,6 @@ defaultNoticeProcessor(void *arg, const char *message) - - Custom Row Processing - - - PQrowProcessor - - - - row processor - in libpq - - - - Ordinarily, when receiving a query result from the server, - libpq adds each row value to the current - PGresult until the entire result set is received; then - the PGresult is returned to the application as a unit. - This approach is simple to work with, but becomes inefficient for large - result sets. To improve performance, an application can register a - custom row processor function that processes each row - as the data is received from the network. The custom row processor could - process the data fully, or store it into some application-specific data - structure for later processing. - - - - - The row processor function sees the rows before it is known whether the - query will succeed overall, since the server might return some rows before - encountering an error. For proper transactional behavior, it must be - possible to discard or undo whatever the row processor has done, if the - query ultimately fails. - - - - - When using a custom row processor, row data is not accumulated into the - PGresult, so the PGresult ultimately delivered to - the application will contain no rows (PQntuples = - 0). However, it still has PQresultStatus = - PGRES_TUPLES_OK, and it contains correct information about the - set of columns in the query result. On the other hand, if the query fails - partway through, the returned PGresult has - PQresultStatus = PGRES_FATAL_ERROR. The - application must be prepared to undo any actions of the row processor - whenever it gets a PGRES_FATAL_ERROR result. 
- - - - A custom row processor is registered for a particular connection by - calling PQsetRowProcessor, described below. - This row processor will be used for all subsequent query results on that - connection until changed again. A row processor function must have a - signature matching - - -typedef int (*PQrowProcessor) (PGresult *res, const PGdataValue *columns, - const char **errmsgp, void *param); - - where PGdataValue is described by - -typedef struct pgDataValue -{ - int len; /* data length in bytes, or <0 if NULL */ - const char *value; /* data value, without zero-termination */ -} PGdataValue; - - - - - The res parameter is the PGRES_TUPLES_OK - PGresult that will eventually be delivered to the calling - application (if no error intervenes). It contains information about - the set of columns in the query result, but no row data. In particular the - row processor must fetch PQnfields(res) to know the number of - data columns. - - - - Immediately after libpq has determined the result set's - column information, it will make a call to the row processor with - columns set to NULL, but the other parameters as - usual. The row processor can use this call to initialize for a new result - set; if it has nothing to do, it can just return 1. In - subsequent calls, one per received row, columns - is non-NULL and points to an array of PGdataValue structs, one per - data column. - - - - errmsgp is an output parameter used only for error - reporting. If the row processor needs to report an error, it can set - *errmsgp to point to a suitable message - string (and then return -1). As a special case, returning - -1 without changing *errmsgp - from its initial value of NULL is taken to mean out of memory. - - - - The last parameter, param, is just a void pointer - passed through from PQsetRowProcessor. This can be - used for communication between the row processor function and the - surrounding application. 
- - - - In the PGdataValue array passed to a row processor, data values - cannot be assumed to be zero-terminated, whether the data format is text - or binary. A SQL NULL value is indicated by a negative length field. - - - - The row processor must process the row data values - immediately, or else copy them into application-controlled storage. - The value pointers passed to the row processor point into - libpq's internal data input buffer, which will be - overwritten by the next packet fetch. - - - - The row processor function must return either 1 or - -1. - 1 is the normal, successful result value; libpq - will continue with receiving row values from the server and passing them to - the row processor. -1 indicates that the row processor has - encountered an error. In that case, - libpq will discard all remaining rows in the result set - and then return a PGRES_FATAL_ERROR PGresult to - the application (containing the specified error message, or out of - memory for query result if *errmsgp - was left as NULL). - - - - Another option for exiting a row processor is to throw an exception using - C's longjmp() or C++'s throw. If this is done, - processing of the incoming data can be resumed later by calling - PQgetResult; the row processor will be invoked as normal for - any remaining rows in the current result. - As with any usage of PQgetResult, the application - should continue calling PQgetResult until it gets a NULL - result before issuing any new query. - - - - In some cases, an exception may mean that the remainder of the - query result is not interesting. In such cases the application can discard - the remaining rows with PQskipResult, described below. - Another possible recovery option is to close the connection altogether with - PQfinish. - - - - - - - PQsetRowProcessor - - PQsetRowProcessor - - - - - - Sets a callback function to process each row. 
- - -void PQsetRowProcessor(PGconn *conn, PQrowProcessor func, void *param); - - - - - The specified row processor function func is installed as - the active row processor for the given connection conn. - Also, param is installed as the passthrough pointer to - pass to it. Alternatively, if func is NULL, the standard - row processor is reinstalled on the given connection (and - param is ignored). - - - - Although the row processor can be changed at any time in the life of a - connection, it's generally unwise to do so while a query is active. - In particular, when using asynchronous mode, be aware that both - PQisBusy and PQgetResult can call the current - row processor. - - - - - - - PQgetRowProcessor - - PQgetRowProcessor - - - - - - Fetches the current row processor for the specified connection. - - -PQrowProcessor PQgetRowProcessor(const PGconn *conn, void **param); - - - - - In addition to returning the row processor function pointer, the - current passthrough pointer will be returned at - *param, if param is not NULL. - - - - - - - PQskipResult - - PQskipResult - - - - - - Discard all the remaining rows in the incoming result set. - - -PGresult *PQskipResult(PGconn *conn); - - - - - This is a simple convenience function to discard incoming data after a - row processor has failed or it's determined that the rest of the result - set is not interesting. PQskipResult is exactly - equivalent to PQgetResult except that it transiently - installs a dummy row processor function that just discards data. - The returned PGresult can be discarded without further ado - if it has status PGRES_TUPLES_OK; but other status values - should be handled normally. (In particular, - PGRES_FATAL_ERROR indicates a server-reported error that - will still need to be dealt with.) - As when using PQgetResult, one should usually repeat the - call until NULL is returned to ensure the connection has reached an - idle state. 
Another possible usage is to call - PQskipResult just once, and then resume using - PQgetResult to process subsequent result sets normally. - - - - Because PQskipResult will wait for server input, it is not - very useful in asynchronous applications. In particular you should not - code a loop of PQisBusy and PQskipResult, - because that will result in the installed row processor being called - within PQisBusy. To get the proper behavior in an - asynchronous application, you'll need to install a dummy row processor - (or set a flag to make your normal row processor do nothing) and leave - it that way until you have discarded all incoming data via your normal - PQisBusy and PQgetResult loop. - - - - - - - - Event System