From 6f922ef88e43b3084cdddf4b5ffe525a00896a90 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Wed, 4 Apr 2012 18:39:08 -0400
Subject: Improve efficiency of dblink by using libpq's new row processor API.

This patch provides a test case for libpq's row processor API.
contrib/dblink can deal with very large result sets by dumping them into
a tuplestore (which can spill to disk) --- but until now, the intermediate
storage of the query result in a PGresult meant memory bloat for any large
result.  Now we use a row processor to convert the data to tuple form and
dump it directly into the tuplestore.

A limitation is that this only works for plain dblink() queries, not
dblink_send_query() followed by dblink_get_result().  In the latter
case we don't know the desired tuple rowtype soon enough.  While hack
solutions to that are possible, a different user-level API would
probably be a better answer.

Kyotaro Horiguchi, reviewed by Marko Kreen and Tom Lane
---
 doc/src/sgml/dblink.sgml | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

(limited to 'doc/src')
diff --git a/doc/src/sgml/dblink.sgml b/doc/src/sgml/dblink.sgml
index 855495c54d5..72ca765be73 100644
--- a/doc/src/sgml/dblink.sgml
+++ b/doc/src/sgml/dblink.sgml
@@ -425,14 +425,6 @@ SELECT *
   <refsect1>
    <title>Notes</title>
 
-   <para>
-    <function>dblink</> fetches the entire remote query result before
-    returning any of it to the local system.  If the query is expected
-    to return a large number of rows, it's better to open it as a cursor
-    with <function>dblink_open</> and then fetch a manageable number
-    of rows at a time.
-   </para>
-
    <para>
     A convenient way to use <function>dblink</> with predetermined
     queries is to create a view.
@@ -1432,6 +1424,18 @@ dblink_get_result(text connname [, bool fail_on_error]) returns setof record
     sent, and one additional time to obtain an empty set result,
     before the connection can be used again.
    </para>
+
+   <para>
+    When using <function>dblink_send_query</> and
+    <function>dblink_get_result</>, <application>dblink</> fetches the entire
+    remote query result before returning any of it to the local query
+    processor.  If the query returns a large number of rows, this can result
+    in transient memory bloat in the local session.  It may be better to open
+    such a query as a cursor with <function>dblink_open</> and then fetch a
+    manageable number of rows at a time.  Alternatively, use plain
+    <function>dblink()</>, which avoids memory bloat by spooling large result
+    sets to disk.
+   </para>
   </refsect1>
 
   <refsect1>
-- 
cgit v1.2.3