From e35cc3b3f2d081e2de3a0e077715d12b3580cc74 Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Mon, 24 Jul 2023 13:48:22 +0900
Subject: pgbench: Use COPY for client-side data generation

This commit switches the client-side data generation from INSERT queries
to COPY for the two tables pgbench_branches and pgbench_tellers.
pgbench_accounts was already using COPY.

COPY is a better interface for bulk loading or high latency connections
(this point can be countered with the option for server-side data
generation, still client-side is the default), and measurements have
proved that using it for these two other tables can lead to improvements
during initialization. I did not notice slowdowns at large scale
numbers on a local setup, either, most of the work happening for the
accounts table.

Previously COPY was only used for the pgbench_accounts table because the
amount of data was much larger than the two other tables. The code is
refactored so as all three tables use the same code path to execute the
COPY queries, with a callback to build data rows.

Author: Tristan Partin
Discussion: https://postgr.es/m/CSTU5P82ONZ1.19XFUGHMXHBRY@c3po
---
 doc/src/sgml/ref/pgbench.sgml | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

(limited to 'doc/src')

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 850028557d3..6c5c8afa6d4 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -231,10 +231,11 @@ pgbench options d
 extensively through a COPY.
 pgbench uses the FREEZE option with version 14 or later
 of PostgreSQL to speed up
-subsequent VACUUM, unless partitions are enabled.
-Using g causes logging to print one message
-every 100,000 rows while generating data for the
-pgbench_accounts table.
+subsequent VACUUM, except on the
+pgbench_accounts table if partitions are
+enabled. Using g causes logging to
+print one message every 100,000 rows while generating data for all
+tables.
 With G (server-side data generation),
-- 
cgit v1.2.3
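
Note on the refactoring described in the commit message: this view of the
patch is limited to doc/src, so the C changes are not shown here. The
fragment below is only a minimal sketch of the idea of "one COPY code path
with a callback to build data rows", not the code from the commit. All
names in it (row_builder, build_branch_row, build_teller_row,
build_account_row, copy_table) are hypothetical, and it writes a
psql-style COPY script to a FILE * where a real client would more likely
send the rows through libpq (PQputCopyData and PQputCopyEnd). The
per-scale ratios of 10 tellers and 100,000 accounts per branch follow
pgbench's usual defaults.

```c
#include <stdio.h>
#include <stddef.h>

/* Assumed per-branch row counts (pgbench defaults at scale factor 1). */
#define TELLERS_PER_BRANCH  10L
#define ACCOUNTS_PER_BRANCH 100000L

/* Hypothetical callback type: format row number `row` (0-based) as one
 * tab-separated COPY text line; returns the snprintf() result. */
typedef int (*row_builder)(char *buf, size_t len, long row);

/* bid, bbalance, filler (NULL is written as \N in COPY text format). */
int build_branch_row(char *buf, size_t len, long row)
{
    return snprintf(buf, len, "%ld\t0\t\\N\n", row + 1);
}

/* tid, bid, tbalance, filler (NULL). */
int build_teller_row(char *buf, size_t len, long row)
{
    return snprintf(buf, len, "%ld\t%ld\t0\t\\N\n",
                    row + 1, row / TELLERS_PER_BRANCH + 1);
}

/* aid, bid, abalance, filler (empty string). */
int build_account_row(char *buf, size_t len, long row)
{
    return snprintf(buf, len, "%ld\t%ld\t0\t\n",
                    row + 1, row / ACCOUNTS_PER_BRANCH + 1);
}

/* Shared code path: emit the COPY header, one line per row produced by
 * the callback, then the end-of-data marker. Returns the number of rows
 * written, or -1 if a row did not fit in the buffer. */
long copy_table(FILE *out, const char *table, long nrows, row_builder build)
{
    char row[128];
    long sent = 0;

    fprintf(out, "COPY %s FROM STDIN;\n", table);
    for (long i = 0; i < nrows; i++)
    {
        int n = build(row, sizeof row, i);

        if (n < 0 || (size_t) n >= sizeof row)
            return -1;
        fputs(row, out);
        sent++;
    }
    fputs("\\.\n", out);
    return sent;
}
```

With this shape, loading a different table only changes the callback and
the row count, for example
copy_table(stdout, "pgbench_tellers", scale * TELLERS_PER_BRANCH, build_teller_row)
for a caller-supplied scale.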