summaryrefslogtreecommitdiff
path: root/doc/src
diff options
context:
space:
mode:
authorTomas Vondra <tomas.vondra@postgresql.org>2022-12-30 23:14:53 +0100
committerTomas Vondra <tomas.vondra@postgresql.org>2022-12-30 23:16:01 +0100
commit8ad51b5f446b5c19ba2c0033a0f7b3180b3b6d95 (patch)
treede0cd74753d56a785ce61315a0044383faf8e9eb /doc/src
parent02699bc1fd3b7fccba2e6b55189a148ef69b8a00 (diff)
Sample postgres_fdw tables remotely during ANALYZE
When collecting ANALYZE sample on foreign tables, postgres_fdw fetched all rows and performed the sampling locally. For large tables this means transferring and immediately discarding large amounts of data. This commit allows the sampling to be performed on the remote server, transferring only the much smaller sample. The sampling is performed using the built-in TABLESAMPLE methods (system, bernoulli) or random() function, depending on the remote server version. Remote sampling can be enabled by analyze_sampling on the foreign server and/or foreign table, with supported values 'off', 'auto', 'system', 'bernoulli' and 'random'. The default value is 'auto' which uses either 'bernoulli' (TABLESAMPLE method) or 'random' (for remote servers without TABLESAMPLE support).
Diffstat (limited to 'doc/src')
-rw-r--r--doc/src/sgml/postgres-fdw.sgml35
1 files changed, 35 insertions, 0 deletions
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 527f4deaaa2..6c943184850 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -326,6 +326,41 @@ OPTIONS (ADD password_required 'false');
frequently updated, the local statistics will soon be obsolete.
</para>
+ <para>
+ The following option controls how such an <command>ANALYZE</command>
+ operation behaves:
+ </para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term><literal>analyze_sampling</literal> (<type>text</type>)</term>
+ <listitem>
+ <para>
+ This option, which can be specified for a foreign table or a foreign
+ server, determines if <command>ANALYZE</command> on a foreign table
+ samples the data on the remote side, or reads and transfers all data
+ and performs the sampling locally. The supported values
+ are <literal>off</literal>, <literal>random</literal>,
+ <literal>system</literal>, <literal>bernoulli</literal>
+ and <literal>auto</literal>. <literal>off</literal> disables remote
+ sampling, so all data are transferred and sampled locally.
+ <literal>random</literal> performs remote sampling using the
+ <literal>random()</literal> function to choose returned rows,
+ while <literal>system</literal> and <literal>bernoulli</literal> rely
+ on the built-in <literal>TABLESAMPLE</literal> methods of those
+ names. <literal>random</literal> works on all remote server versions,
+ while <literal>TABLESAMPLE</literal> is supported only since 9.5.
+ <literal>auto</literal> (the default) picks the recommended sampling
+ method automatically; currently it means
+ either <literal>bernoulli</literal> or <literal>random</literal>
+ depending on the remote server version.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
</sect3>
<sect3>