From 81134af3ec09d67043833f8d614fd688f17cb213 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Tue, 10 Mar 2015 22:33:24 -0400 Subject: Move pgbench from contrib/ to src/bin/ Reviewed-by: Michael Paquier --- doc/src/sgml/contrib.sgml | 1 - doc/src/sgml/filelist.sgml | 1 - doc/src/sgml/pgbench.sgml | 1176 ---------------------------------------- doc/src/sgml/ref/allfiles.sgml | 1 + doc/src/sgml/ref/pgbench.sgml | 1176 ++++++++++++++++++++++++++++++++++++++++ doc/src/sgml/reference.sgml | 1 + 6 files changed, 1178 insertions(+), 1178 deletions(-) delete mode 100644 doc/src/sgml/pgbench.sgml create mode 100644 doc/src/sgml/ref/pgbench.sgml (limited to 'doc/src') diff --git a/doc/src/sgml/contrib.sgml b/doc/src/sgml/contrib.sgml index f21fa149182..57730955bfa 100644 --- a/doc/src/sgml/contrib.sgml +++ b/doc/src/sgml/contrib.sgml @@ -187,7 +187,6 @@ pages. &oid2name; - &pgbench; &vacuumlo; diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml index 8b9d6a91279..ab935a6664f 100644 --- a/doc/src/sgml/filelist.sgml +++ b/doc/src/sgml/filelist.sgml @@ -125,7 +125,6 @@ - diff --git a/doc/src/sgml/pgbench.sgml b/doc/src/sgml/pgbench.sgml deleted file mode 100644 index ed12e279064..00000000000 --- a/doc/src/sgml/pgbench.sgml +++ /dev/null @@ -1,1176 +0,0 @@ - - - - - pgbench - - - - pgbench - 1 - Application - - - - pgbench - run a benchmark test on PostgreSQL - - - - - pgbench - - option - dbname - - - pgbench - option - dbname - - - - - Description - - pgbench is a simple program for running benchmark - tests on PostgreSQL. It runs the same sequence of SQL - commands over and over, possibly in multiple concurrent database sessions, - and then calculates the average transaction rate (transactions per second). - By default, pgbench tests a scenario that is - loosely based on TPC-B, involving five SELECT, - UPDATE, and INSERT commands per transaction. - However, it is easy to test other cases by writing your own transaction - script files. 
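The transaction rate pgbench reports is simply completed transactions divided by elapsed wall-clock time; the "including/excluding connections establishing" pair in the output differs only in whether connection-setup time counts toward the elapsed time. A minimal sketch of that arithmetic (illustrative only, with assumed timings; this is not pgbench's source):

```python
# Illustrative sketch of the tps figures pgbench prints (not pgbench's code).
# The two reported tps values differ only in whether connection-setup
# time is included in the elapsed time.

def tps(n_transactions: int, elapsed_s: float) -> float:
    """Average transaction rate in transactions per second."""
    return n_transactions / elapsed_s

# Hypothetical run: 10 clients x 1000 transactions; durations are assumed.
n_xacts = 10 * 1000
total_s = 117.4   # wall clock including connection establishment (assumed)
conn_s = 0.2      # time spent establishing connections (assumed)

tps_including = tps(n_xacts, total_s)
tps_excluding = tps(n_xacts, total_s - conn_s)
assert tps_excluding > tps_including  # removing setup time can only raise tps
```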
- - - - Typical output from pgbench looks like: - - -transaction type: TPC-B (sort of) -scaling factor: 10 -query mode: simple -number of clients: 10 -number of threads: 1 -number of transactions per client: 1000 -number of transactions actually processed: 10000/10000 -tps = 85.184871 (including connections establishing) -tps = 85.296346 (excluding connections establishing) - - - The first six lines report some of the most important parameter - settings. The next line reports the number of transactions completed - and intended (the latter being just the product of number of clients - and number of transactions per client); these will be equal unless the run - failed before completion. (In - - - The default TPC-B-like transaction test requires specific tables to be - set up beforehand. pgbench should be invoked with - the - - - - pgbench -i creates four tables pgbench_accounts, - pgbench_branches, pgbench_history, and - pgbench_tellers, - destroying any existing tables of these names. - Be very careful to use another database if you have tables having these - names! - - - - - At the default scale factor of 1, the tables initially - contain this many rows: - -table # of rows ---------------------------------- -pgbench_branches 1 -pgbench_tellers 10 -pgbench_accounts 100000 -pgbench_history 0 - - You can (and, for most purposes, probably should) increase the number - of rows by using the - - - Once you have done the necessary setup, you can run your benchmark - with a command that doesn't include - - - - Options - - - The following is divided into three subsections: Different options are used - during database initialization and while running benchmarks, some options - are useful in both cases. - - - - Initialization Options - - - pgbench accepts the following command-line - initialization arguments: - - - - - - - - - Required to invoke initialization mode. 
- - - - - - fillfactor - fillfactor - - - Create the pgbench_accounts, - pgbench_tellers and - pgbench_branches tables with the given fillfactor. - Default is 100. - - - - - - - - - - Perform no vacuuming after initialization. - - - - - - - - - - Switch logging to quiet mode, producing only one progress message per 5 - seconds. The default logging prints one message each 100000 rows, which - often outputs many lines per second (especially on good hardware). - - - - - - scale_factor - scale_factor - - - Multiply the number of rows generated by the scale factor. - For example, -s 100 will create 10,000,000 rows - in the pgbench_accounts table. Default is 1. - When the scale is 20,000 or larger, the columns used to - hold account identifiers (aid columns) - will switch to using larger integers (bigint), - in order to be big enough to hold the range of account - identifiers. - - - - - - - - - Create foreign key constraints between the standard tables. - - - - - - - - - Create indexes in the specified tablespace, rather than the default - tablespace. - - - - - - - - - Create tables in the specified tablespace, rather than the default - tablespace. - - - - - - - - - Create all tables as unlogged tables, rather than permanent tables. - - - - - - - - - - - Benchmarking Options - - - pgbench accepts the following command-line - benchmarking arguments: - - - - - clients - clients - - - Number of clients simulated, that is, number of concurrent database - sessions. Default is 1. - - - - - - - - - - Establish a new connection for each transaction, rather than - doing it just once per client session. - This is useful to measure the connection overhead. - - - - - - - - - - Print debugging output. - - - - - - varname=value - varname=value - - - Define a variable for use by a custom script (see below). - Multiple - - - - - filename - filename - - - Read transaction script from filename. - See below for details. - , , and - are mutually exclusive. 
- - - - - - threads - threads - - - Number of worker threads within pgbench. - Using more than one thread can be helpful on multi-CPU machines. - The number of clients must be a multiple of the number of threads, - since each thread is given the same number of client sessions to manage. - Default is 1. - - - - - - - - - - Write the time taken by each transaction to a log file. - See below for details. - - - - - - limit - limit - - - Transaction which last more than limit milliseconds - are counted and reported separately, as late. - - - When throttling is used ( - - - - - querymode - querymode - - - Protocol to use for submitting queries to the server: - - - simple: use simple query protocol. - - - extended: use extended query protocol. - - - prepared: use extended query protocol with prepared statements. - - - The default is simple query protocol. (See - for more information.) - - - - - - - - - - Perform no vacuuming before running the test. - This option is necessary - if you are running a custom test scenario that does not include - the standard tables pgbench_accounts, - pgbench_branches, pgbench_history, and - pgbench_tellers. - - - - - - - - - - Do not update pgbench_tellers and - pgbench_branches. - This will avoid update contention on these tables, but - it makes the test case even less like TPC-B. - - - - - - sec - sec - - - Show progress report every sec seconds. The report - includes the time since the beginning of the run, the tps since the - last report, and the transaction latency average and standard - deviation since the last report. Under throttling ( - - - - - - - - - Report the average per-statement latency (execution time from the - perspective of the client) of each command after the benchmark - finishes. See below for details. - - - - - - rate - rate - - - Execute transactions targeting the specified rate instead of running - as fast as possible (the default). The rate is given in transactions - per second. 
If the targeted rate is above the maximum possible rate, - the rate limit won't impact the results. - - - The rate is targeted by starting transactions along a - Poisson-distributed schedule time line. The expected start time - schedule moves forward based on when the client first started, not - when the previous transaction ended. That approach means that when - transactions go past their original scheduled end time, it is - possible for later ones to catch up again. - - - When throttling is active, the transaction latency reported at the - end of the run is calculated from the scheduled start times, so it - includes the time each transaction had to wait for the previous - transaction to finish. The wait time is called the schedule lag time, - and its average and maximum are also reported separately. The - transaction latency with respect to the actual transaction start time, - i.e. the time spent executing the transaction in the database, can be - computed by subtracting the schedule lag time from the reported - latency. - - - - If - - - A high schedule lag time is an indication that the system cannot - process transactions at the specified rate, with the chosen number of - clients and threads. When the average transaction execution time is - longer than the scheduled interval between each transaction, each - successive transaction will fall further behind, and the schedule lag - time will keep increasing the longer the test run is. When that - happens, you will have to reduce the specified transaction rate. - - - - - - scale_factor - scale_factor - - - Report the specified scale factor in pgbench's - output. With the built-in tests, this is not necessary; the - correct scale factor will be detected by counting the number of - rows in the pgbench_branches table. However, when testing - custom benchmarks ( - - - - - - - - - Perform select-only transactions instead of TPC-B-like test. 
- - - - - - transactions - transactions - - - Number of transactions each client runs. Default is 10. - - - - - - seconds - seconds - - - Run the test for this many seconds, rather than a fixed number of - transactions per client. and - are mutually exclusive. - - - - - - - - - - Vacuum all four standard tables before running the test. - With neither - - - - - - - - Length of aggregation interval (in seconds). May be used only together - with -l - with this option, the log contains - per-interval summary (number of transactions, min/max latency and two - additional fields useful for variance estimation). - - - This option is not currently supported on Windows. - - - - - - - - - Sampling rate, used when writing data into the log, to reduce the - amount of log generated. If this option is given, only the specified - fraction of transactions are logged. 1.0 means all transactions will - be logged, 0.05 means only 5% of the transactions will be logged. - - - Remember to take the sampling rate into account when processing the - log file. For example, when computing tps values, you need to multiply - the numbers accordingly (e.g. with 0.01 sample rate, you'll only get - 1/100 of the actual tps). - - - - - - - - - - - Common Options - - - pgbench accepts the following command-line - common arguments: - - - - - hostname - hostname - - - The database server's host name - - - - - - port - port - - - The database server's port number - - - - - - login - login - - - The user name to connect as - - - - - - - - - - Print the pgbench version and exit. - - - - - - - - - - Show help about pgbench command line - arguments, and exit. - - - - - - - - - - - Notes - - - What is the <quote>Transaction</> Actually Performed in pgbench? 
- - - The default transaction script issues seven commands per transaction: - - - - BEGIN; - UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; - SELECT abalance FROM pgbench_accounts WHERE aid = :aid; - UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; - UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; - INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); - END; - - - - If you specify - - - - Custom Scripts - - - pgbench has support for running custom - benchmark scenarios by replacing the default transaction script - (described above) with a transaction script read from a file - ( option). In this case a transaction - counts as one execution of a script file. You can even specify - multiple scripts (multiple options), in which - case a random one of the scripts is chosen each time a client session - starts a new transaction. - - - - The format of a script file is one SQL command per line; multiline - SQL commands are not supported. Empty lines and lines beginning with - -- are ignored. Script file lines can also be - meta commands, which are interpreted by pgbench - itself, as described below. - - - - There is a simple variable-substitution facility for script files. - Variables can be set by the command-line - - - Automatic variables - - - - Variable - Description - - - - - - scale - current scale factor - - - - client_id - unique number identifying the client session (starts from zero) - - - -
- - - Script file meta commands begin with a backslash (\). - Arguments to a meta command are separated by white space. - These meta commands are supported: - - - - - - \set varname expression - - - - - Sets variable varname to an integer value calculated - from expression. - The expression may contain integer constants such as 5432, - references to variables :variablename, - and expressions composed of unary (-) or binary operators - (+, -, *, /, %) - with their usual associativity, and parentheses. - - - - Examples: - -\set ntellers 10 * :scale -\set aid (1021 * :aid) % (100000 * :scale) + 1 - - - - - - - \setrandom varname min max [ uniform | { gaussian | exponential } threshold ] - - - - - Sets variable varname to a random integer value - between the limits min and max inclusive. - Each limit can be either an integer constant or a - :variablename reference to a variable - having an integer value. - - - - By default, or when uniform is specified, all values in the - range are drawn with equal probability. Specifying gaussian - or exponential options modifies this behavior; each - requires a mandatory threshold which determines the precise shape of the - distribution. - - - - For a Gaussian distribution, the interval is mapped onto a standard - normal distribution (the classical bell-shaped Gaussian curve) truncated - at -threshold on the left and +threshold - on the right. - To be precise, if PHI(x) is the cumulative distribution - function of the standard normal distribution, with mean mu - defined as (max + min) / 2.0, then value i - between min and max inclusive is drawn - with probability: - - (PHI(2.0 * threshold * (i - min - mu + 0.5) / (max - min + 1)) - - PHI(2.0 * threshold * (i - min - mu - 0.5) / (max - min + 1))) / - (2.0 * PHI(threshold) - 1.0). - Intuitively, the larger the threshold, the more - frequently values close to the middle of the interval are drawn, and the - less frequently values close to the min and - max bounds. 
- About 67% of values are drawn from the middle 1.0 / threshold - and 95% in the middle 2.0 / threshold; for instance, if - threshold is 4.0, 67% of values are drawn from the middle - quarter and 95% from the middle half of the interval. - The minimum threshold is 2.0 for performance of - the Box-Muller transform. - - - - For an exponential distribution, the threshold - parameter controls the distribution by truncating a quickly-decreasing - exponential distribution at threshold, and then - projecting onto integers between the bounds. - To be precise, value i between min and - max inclusive is drawn with probability: - (exp(-threshold*(i-min)/(max+1-min)) - - exp(-threshold*(i+1-min)/(max+1-min))) / (1.0 - exp(-threshold)). - Intuitively, the larger the threshold, the more - frequently values close to min are accessed, and the - less frequently values close to max are accessed. - The closer to 0 the threshold, the flatter (more uniform) the access - distribution. - A crude approximation of the distribution is that the most frequent 1% - values in the range, close to min, are drawn - threshold% of the time. - The threshold value must be strictly positive. - - - - Example: - -\setrandom aid 1 :naccounts gaussian 5.0 - - - - - - - \sleep number [ us | ms | s ] - - - - - Causes script execution to sleep for the specified duration in - microseconds (us), milliseconds (ms) or seconds - (s). If the unit is omitted then seconds are the default. - number can be either an integer constant or a - :variablename reference to a variable - having an integer value. - - - - Example: - -\sleep 10 ms - - - - - - - \setshell varname command [ argument ... ] - - - - - Sets variable varname to the result of the shell command - command. The command must return an integer value - through its standard output. - - - argument can be either a text constant or a - :variablename reference to a variable of - any types. 
If you want to use argument starting with - colons, you need to add an additional colon at the beginning of - argument. - - - - Example: - -\setshell variable_to_be_assigned command literal_argument :variable ::literal_starting_with_colon - - - - - - - \shell command [ argument ... ] - - - - - Same as \setshell, but the result is ignored. - - - - Example: - -\shell command literal_argument :variable ::literal_starting_with_colon - - - - - - - As an example, the full definition of the built-in TPC-B-like - transaction is: - - -\set nbranches :scale -\set ntellers 10 * :scale -\set naccounts 100000 * :scale -\setrandom aid 1 :naccounts -\setrandom bid 1 :nbranches -\setrandom tid 1 :ntellers -\setrandom delta -5000 5000 -BEGIN; -UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; -SELECT abalance FROM pgbench_accounts WHERE aid = :aid; -UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; -UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; -INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); -END; - - - This script allows each iteration of the transaction to reference - different, randomly-chosen rows. (This example also shows why it's - important for each client session to have its own variables — - otherwise they'd not be independently touching different rows.) - - -
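The truncated-Gaussian and exponential weightings used by \setrandom can be checked numerically. The sketch below (not pgbench's random-number code) evaluates the CDF-difference formulas quoted above; the Gaussian argument is taken as (i - mu ± 0.5), the form that, with mu = (max + min) / 2.0, makes the probabilities telescope to exactly one over the interval:

```python
import math

# Numerical check of the \setrandom gaussian/exponential probabilities
# described above (a sketch, not pgbench's actual generator).

def phi(x: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def p_gaussian(i, lo, hi, threshold):
    # Probability of drawing i from a Gaussian truncated at +/- threshold,
    # centred on mu = (lo + hi) / 2, mapped onto the integers lo..hi.
    mu = (hi + lo) / 2.0
    width = hi - lo + 1
    return (phi(2.0 * threshold * (i - mu + 0.5) / width)
            - phi(2.0 * threshold * (i - mu - 0.5) / width)) \
           / (2.0 * phi(threshold) - 1.0)

def p_exponential(i, lo, hi, threshold):
    # Probability of drawing i from a truncated exponential favouring lo,
    # exactly as given in the text.
    span = hi + 1 - lo
    return (math.exp(-threshold * (i - lo) / span)
            - math.exp(-threshold * (i + 1 - lo) / span)) \
           / (1.0 - math.exp(-threshold))

lo, hi = 1, 100
g = [p_gaussian(i, lo, hi, 4.0) for i in range(lo, hi + 1)]
e = [p_exponential(i, lo, hi, 2.0) for i in range(lo, hi + 1)]
assert abs(sum(g) - 1.0) < 1e-9 and abs(sum(e) - 1.0) < 1e-9
assert g[0] < g[49] and g[-1] < g[50]  # mass concentrates mid-interval
assert e[0] == max(e)                  # exponential favours values near min
```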
- - - Per-Transaction Logging - - - With the , - pgbench writes the time taken by each transaction - to a log file. The log file will be named - pgbench_log.nnn, where - nnn is the PID of the pgbench process. - If the - - - The format of the log is: - - -client_id transaction_no time file_no time_epoch time_us schedule_lag - - - where time is the total elapsed transaction time in microseconds, - file_no identifies which script file was used - (useful when multiple scripts were specified with - - - Here is a snippet of the log file generated: - - 0 199 2241 0 1175850568 995598 - 0 200 2465 0 1175850568 998079 - 0 201 2513 0 1175850569 608 - 0 202 2038 0 1175850569 2663 - - - Another example with --rate=100 and --latency-limit=5 (note the additional - schedule_lag column): - - 0 81 4621 0 1412881037 912698 3005 - 0 82 6173 0 1412881037 914578 4304 - 0 83 skipped 0 1412881037 914578 5217 - 0 83 skipped 0 1412881037 914578 5099 - 0 83 4722 0 1412881037 916203 3108 - 0 84 4142 0 1412881037 918023 2333 - 0 85 2465 0 1412881037 919759 740 - - In this example, transaction 82 was late, because it's latency (6.173 ms) was - over the 5 ms limit. The next two transactions were skipped, because they - were already late before they were even started. - - - - When running a long test on hardware that can handle a lot of transactions, - the log files can become very large. The - - - - Aggregated Logging - - - With the option, the logs use a bit different format: - - -interval_start num_of_transactions latency_sum latency_2_sum min_latency max_latency lag_sum lag_2_sum min_lag max_lag skipped_transactions - - - where interval_start is the start of the interval (UNIX epoch - format timestamp), num_of_transactions is the number of transactions - within the interval, latency_sum is a sum of latencies - (so you can compute average latency easily). 
The following two fields are useful - for variance estimation - latency_sum is a sum of latencies and - latency_2_sum is a sum of 2nd powers of latencies. The last two - fields are min_latency - a minimum latency within the interval, and - max_latency - maximum latency within the interval. A transaction is - counted into the interval when it was committed. The fields in the end, - lag_sum, lag_2_sum, min_lag, - and max_lag, are only present if the - - - Here is example outputs: - -1345828501 5601 1542744 483552416 61 2573 -1345828503 7884 1979812 565806736 60 1479 -1345828505 7208 1979422 567277552 59 1391 -1345828507 7685 1980268 569784714 60 1398 -1345828509 7073 1979779 573489941 236 1411 - - - - Notice that while the plain (unaggregated) log file contains index - of the custom script files, the aggregated log does not. Therefore if - you need per script data, you need to aggregate the data on your own. - - - - - - Per-Statement Latencies - - - With the - - - For the default script, the output will look similar to this: - -starting vacuum...end. 
-transaction type: TPC-B (sort of) -scaling factor: 1 -query mode: simple -number of clients: 10 -number of threads: 1 -number of transactions per client: 1000 -number of transactions actually processed: 10000/10000 -tps = 618.764555 (including connections establishing) -tps = 622.977698 (excluding connections establishing) -statement latencies in milliseconds: - 0.004386 \set nbranches 1 * :scale - 0.001343 \set ntellers 10 * :scale - 0.001212 \set naccounts 100000 * :scale - 0.001310 \setrandom aid 1 :naccounts - 0.001073 \setrandom bid 1 :nbranches - 0.001005 \setrandom tid 1 :ntellers - 0.001078 \setrandom delta -5000 5000 - 0.326152 BEGIN; - 0.603376 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; - 0.454643 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; - 5.528491 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; - 7.335435 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; - 0.371851 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); - 1.212976 END; - - - - - If multiple script files are specified, the averages are reported - separately for each script file. - - - - Note that collecting the additional timing information needed for - per-statement latency computation adds some overhead. This will slow - average execution speed and lower the computed TPS. The amount - of slowdown varies significantly depending on platform and hardware. - Comparing average TPS values with and without latency reporting enabled - is a good way to measure if the timing overhead is significant. - - - - - Good Practices - - - It is very easy to use pgbench to produce completely - meaningless numbers. Here are some guidelines to help you get useful - results. - - - - In the first place, never believe any test that runs - for only a few seconds. 
Use the - - - For the default TPC-B-like test scenario, the initialization scale factor - ( - - - The default test scenario is also quite sensitive to how long it's been - since the tables were initialized: accumulation of dead rows and dead space - in the tables changes the results. To understand the results you must keep - track of the total number of updates and when vacuuming happens. If - autovacuum is enabled it can result in unpredictable changes in measured - performance. - - - - A limitation of pgbench is that it can itself become - the bottleneck when trying to test a large number of client sessions. - This can be alleviated by running pgbench on a different - machine from the database server, although low network latency will be - essential. It might even be useful to run several pgbench - instances concurrently, on several client machines, against the same - database server. - - -
-
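The plain per-transaction log lines shown earlier can be rolled up into per-interval summaries like those the aggregated logging format provides. A sketch (the field layout follows the text: client_id, transaction_no, time in microseconds, file_no, time_epoch, time_us; the aggregation code itself is illustrative, not pgbench's):

```python
from collections import defaultdict

# Sketch: roll plain per-transaction log lines (format described above:
# client_id transaction_no time file_no time_epoch time_us) up into
# per-interval summaries like pgbench's aggregated logging produces.
# Illustrative only -- pgbench computes these internally in C.

def aggregate(lines, interval_s=2):
    buckets = defaultdict(list)
    for line in lines:
        fields = line.split()
        latency_us = int(fields[2])          # total transaction time, in us
        epoch = int(fields[4])               # commit time, UNIX epoch seconds
        start = epoch - epoch % interval_s   # interval the commit falls into
        buckets[start].append(latency_us)
    out = {}
    for start, lat in sorted(buckets.items()):
        out[start] = {
            "num_of_transactions": len(lat),
            "latency_sum": sum(lat),
            "latency_2_sum": sum(x * x for x in lat),  # variance estimation
            "min_latency": min(lat),
            "max_latency": max(lat),
        }
    return out

log = [
    "0 199 2241 0 1175850568 995598",
    "0 200 2465 0 1175850568 998079",
    "0 201 2513 0 1175850569 608",
    "0 202 2038 0 1175850569 2663",
]
summary = aggregate(log, interval_s=2)
assert summary[1175850568]["num_of_transactions"] == 4
assert summary[1175850568]["latency_sum"] == 2241 + 2465 + 2513 + 2038
```

Note that, as the text says, the plain log's file_no (field four) carries the custom-script index that the aggregated format drops; a real post-processor would key buckets on it as well if per-script data were needed.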
diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml index f3b577119bd..9ae6aecb1a1 100644 --- a/doc/src/sgml/ref/allfiles.sgml +++ b/doc/src/sgml/ref/allfiles.sgml @@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory. + diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml new file mode 100644 index 00000000000..a8085463a5e --- /dev/null +++ b/doc/src/sgml/ref/pgbench.sgml @@ -0,0 +1,1176 @@ + + + + + pgbench + + + + pgbench + 1 + Application + + + + pgbench + run a benchmark test on PostgreSQL + + + + + pgbench + + option + dbname + + + pgbench + option + dbname + + + + + Description + + pgbench is a simple program for running benchmark + tests on PostgreSQL. It runs the same sequence of SQL + commands over and over, possibly in multiple concurrent database sessions, + and then calculates the average transaction rate (transactions per second). + By default, pgbench tests a scenario that is + loosely based on TPC-B, involving five SELECT, + UPDATE, and INSERT commands per transaction. + However, it is easy to test other cases by writing your own transaction + script files. + + + + Typical output from pgbench looks like: + + +transaction type: TPC-B (sort of) +scaling factor: 10 +query mode: simple +number of clients: 10 +number of threads: 1 +number of transactions per client: 1000 +number of transactions actually processed: 10000/10000 +tps = 85.184871 (including connections establishing) +tps = 85.296346 (excluding connections establishing) + + + The first six lines report some of the most important parameter + settings. The next line reports the number of transactions completed + and intended (the latter being just the product of number of clients + and number of transactions per client); these will be equal unless the run + failed before completion. (In + + + The default TPC-B-like transaction test requires specific tables to be + set up beforehand. 
pgbench should be invoked with + the + + + + pgbench -i creates four tables pgbench_accounts, + pgbench_branches, pgbench_history, and + pgbench_tellers, + destroying any existing tables of these names. + Be very careful to use another database if you have tables having these + names! + + + + + At the default scale factor of 1, the tables initially + contain this many rows: + +table # of rows +--------------------------------- +pgbench_branches 1 +pgbench_tellers 10 +pgbench_accounts 100000 +pgbench_history 0 + + You can (and, for most purposes, probably should) increase the number + of rows by using the + + + Once you have done the necessary setup, you can run your benchmark + with a command that doesn't include + + + + Options + + + The following is divided into three subsections: Different options are used + during database initialization and while running benchmarks, some options + are useful in both cases. + + + + Initialization Options + + + pgbench accepts the following command-line + initialization arguments: + + + + + + + + + Required to invoke initialization mode. + + + + + + fillfactor + fillfactor + + + Create the pgbench_accounts, + pgbench_tellers and + pgbench_branches tables with the given fillfactor. + Default is 100. + + + + + + + + + + Perform no vacuuming after initialization. + + + + + + + + + + Switch logging to quiet mode, producing only one progress message per 5 + seconds. The default logging prints one message each 100000 rows, which + often outputs many lines per second (especially on good hardware). + + + + + + scale_factor + scale_factor + + + Multiply the number of rows generated by the scale factor. + For example, -s 100 will create 10,000,000 rows + in the pgbench_accounts table. Default is 1. + When the scale is 20,000 or larger, the columns used to + hold account identifiers (aid columns) + will switch to using larger integers (bigint), + in order to be big enough to hold the range of account + identifiers. 
+ + + + + + + + + Create foreign key constraints between the standard tables. + + + + + + + + + Create indexes in the specified tablespace, rather than the default + tablespace. + + + + + + + + + Create tables in the specified tablespace, rather than the default + tablespace. + + + + + + + + + Create all tables as unlogged tables, rather than permanent tables. + + + + + + + + + + + Benchmarking Options + + + pgbench accepts the following command-line + benchmarking arguments: + + + + + clients + clients + + + Number of clients simulated, that is, number of concurrent database + sessions. Default is 1. + + + + + + + + + + Establish a new connection for each transaction, rather than + doing it just once per client session. + This is useful to measure the connection overhead. + + + + + + + + + + Print debugging output. + + + + + + varname=value + varname=value + + + Define a variable for use by a custom script (see below). + Multiple + + + + + filename + filename + + + Read transaction script from filename. + See below for details. + , , and + are mutually exclusive. + + + + + + threads + threads + + + Number of worker threads within pgbench. + Using more than one thread can be helpful on multi-CPU machines. + The number of clients must be a multiple of the number of threads, + since each thread is given the same number of client sessions to manage. + Default is 1. + + + + + + + + + + Write the time taken by each transaction to a log file. + See below for details. + + + + + + limit + limit + + + Transaction which last more than limit milliseconds + are counted and reported separately, as late. + + + When throttling is used ( + + + + + querymode + querymode + + + Protocol to use for submitting queries to the server: + + + simple: use simple query protocol. + + + extended: use extended query protocol. + + + prepared: use extended query protocol with prepared statements. + + + The default is simple query protocol. (See + for more information.) 
+ + + + + + + + + + Perform no vacuuming before running the test. + This option is necessary + if you are running a custom test scenario that does not include + the standard tables pgbench_accounts, + pgbench_branches, pgbench_history, and + pgbench_tellers. + + + + + + + + + + Do not update pgbench_tellers and + pgbench_branches. + This will avoid update contention on these tables, but + it makes the test case even less like TPC-B. + + + + + + sec + sec + + + Show progress report every sec seconds. The report + includes the time since the beginning of the run, the tps since the + last report, and the transaction latency average and standard + deviation since the last report. Under throttling ( + + + + + + + + + Report the average per-statement latency (execution time from the + perspective of the client) of each command after the benchmark + finishes. See below for details. + + + + + + rate + rate + + + Execute transactions targeting the specified rate instead of running + as fast as possible (the default). The rate is given in transactions + per second. If the targeted rate is above the maximum possible rate, + the rate limit won't impact the results. + + + The rate is targeted by starting transactions along a + Poisson-distributed schedule time line. The expected start time + schedule moves forward based on when the client first started, not + when the previous transaction ended. That approach means that when + transactions go past their original scheduled end time, it is + possible for later ones to catch up again. + + + When throttling is active, the transaction latency reported at the + end of the run is calculated from the scheduled start times, so it + includes the time each transaction had to wait for the previous + transaction to finish. The wait time is called the schedule lag time, + and its average and maximum are also reported separately. The + transaction latency with respect to the actual transaction start time, + i.e. 
the time spent executing the transaction in the database, can be + computed by subtracting the schedule lag time from the reported + latency. + + + + If + + + A high schedule lag time is an indication that the system cannot + process transactions at the specified rate, with the chosen number of + clients and threads. When the average transaction execution time is + longer than the scheduled interval between each transaction, each + successive transaction will fall further behind, and the schedule lag + time will keep increasing the longer the test run is. When that + happens, you will have to reduce the specified transaction rate. + + + + + + scale_factor + scale_factor + + + Report the specified scale factor in pgbench's + output. With the built-in tests, this is not necessary; the + correct scale factor will be detected by counting the number of + rows in the pgbench_branches table. However, when testing + custom benchmarks ( + + + + + + + + + Perform select-only transactions instead of TPC-B-like test. + + + + + + transactions + transactions + + + Number of transactions each client runs. Default is 10. + + + + + + seconds + seconds + + + Run the test for this many seconds, rather than a fixed number of + transactions per client. and + are mutually exclusive. + + + + + + + + + + Vacuum all four standard tables before running the test. + With neither + + + + + + + + Length of aggregation interval (in seconds). May be used only together + with -l - with this option, the log contains + per-interval summary (number of transactions, min/max latency and two + additional fields useful for variance estimation). + + + This option is not currently supported on Windows. + + + + + + + + + Sampling rate, used when writing data into the log, to reduce the + amount of log generated. If this option is given, only the specified + fraction of transactions are logged. 1.0 means all transactions will + be logged, 0.05 means only 5% of the transactions will be logged. 
+ + + Remember to take the sampling rate into account when processing the + log file. For example, when computing tps values, you need to multiply + the numbers accordingly (e.g. with 0.01 sample rate, you'll only get + 1/100 of the actual tps). + + + + + + + + + + + Common Options + + + pgbench accepts the following command-line + common arguments: + + + + + hostname + hostname + + + The database server's host name + + + + + + port + port + + + The database server's port number + + + + + + login + login + + + The user name to connect as + + + + + + + + + + Print the pgbench version and exit. + + + + + + + + + + Show help about pgbench command line + arguments, and exit. + + + + + + + + + + + Notes + + + What is the <quote>Transaction</> Actually Performed in pgbench? + + + The default transaction script issues seven commands per transaction: + + + + BEGIN; + UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; + SELECT abalance FROM pgbench_accounts WHERE aid = :aid; + UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; + UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; + INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); + END; + + + + If you specify + + + + Custom Scripts + + + pgbench has support for running custom + benchmark scenarios by replacing the default transaction script + (described above) with a transaction script read from a file + ( option). In this case a transaction + counts as one execution of a script file. You can even specify + multiple scripts (multiple options), in which + case a random one of the scripts is chosen each time a client session + starts a new transaction. + + + + The format of a script file is one SQL command per line; multiline + SQL commands are not supported. Empty lines and lines beginning with + -- are ignored. 
Script file lines can also be + meta commands, which are interpreted by pgbench + itself, as described below. + + + + There is a simple variable-substitution facility for script files. + Variables can be set by the command-line + + + Automatic variables + + + + Variable + Description + + + + + + scale + current scale factor + + + + client_id + unique number identifying the client session (starts from zero) + + + +
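To make the substitution rule concrete, here is a simplified Python sketch (this is for illustration only and is not pgbench's actual implementation):

```python
import re

# Simplified sketch of pgbench-style variable substitution: replace
# :name references in a script line with the variable's value, while
# leaving double-colon sequences (e.g. SQL casts like ::int) alone.

def substitute(line: str, variables: dict) -> str:
    def repl(match):
        name = match.group(1)
        # Unknown variable names are left untouched.
        return str(variables.get(name, match.group(0)))
    # A colon not preceded by another colon, followed by an identifier.
    return re.sub(r'(?<!:):(\w+)', repl, line)

print(substitute("SELECT abalance FROM pgbench_accounts WHERE aid = :aid;",
                 {"aid": 42}))
# SELECT abalance FROM pgbench_accounts WHERE aid = 42;
```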
+ + + Script file meta commands begin with a backslash (\). + Arguments to a meta command are separated by white space. + These meta commands are supported: + + + + + + \set varname expression + + + + + Sets variable varname to an integer value calculated + from expression. + The expression may contain integer constants such as 5432, + references to variables :variablename, + and expressions composed of unary (-) or binary operators + (+, -, *, /, %) + with their usual associativity, and parentheses. + + + + Examples: + +\set ntellers 10 * :scale +\set aid (1021 * :aid) % (100000 * :scale) + 1 + + + + + + + \setrandom varname min max [ uniform | { gaussian | exponential } threshold ] + + + + + Sets variable varname to a random integer value + between the limits min and max inclusive. + Each limit can be either an integer constant or a + :variablename reference to a variable + having an integer value. + + + + By default, or when uniform is specified, all values in the + range are drawn with equal probability. Specifying gaussian + or exponential options modifies this behavior; each + requires a mandatory threshold which determines the precise shape of the + distribution. + + + + For a Gaussian distribution, the interval is mapped onto a standard + normal distribution (the classical bell-shaped Gaussian curve) truncated + at -threshold on the left and +threshold + on the right. + To be precise, if PHI(x) is the cumulative distribution + function of the standard normal distribution, with mean mu + defined as (max + min) / 2.0, then value i + between min and max inclusive is drawn + with probability: + + (PHI(2.0 * threshold * (i - min - mu + 0.5) / (max - min + 1)) - + PHI(2.0 * threshold * (i - min - mu - 0.5) / (max - min + 1))) / + (2.0 * PHI(threshold) - 1.0). + Intuitively, the larger the threshold, the more + frequently values close to the middle of the interval are drawn, and the + less frequently values close to the min and + max bounds. 
+ About 67% of values are drawn from the middle 1.0 / threshold + and 95% in the middle 2.0 / threshold; for instance, if + threshold is 4.0, 67% of values are drawn from the middle + quarter and 95% from the middle half of the interval. + The minimum threshold is 2.0, for the performance of + the Box-Muller transform. + + + + For an exponential distribution, the threshold + parameter controls the distribution by truncating a quickly-decreasing + exponential distribution at threshold, and then + projecting onto integers between the bounds. + To be precise, value i between min and + max inclusive is drawn with probability: + (exp(-threshold*(i-min)/(max+1-min)) - + exp(-threshold*(i+1-min)/(max+1-min))) / (1.0 - exp(-threshold)). + Intuitively, the larger the threshold, the more + frequently values close to min are accessed, and the + less frequently values close to max are accessed. + The closer the threshold is to 0, the flatter (more uniform) the access + distribution. + A crude approximation of the distribution is that the most frequent 1% + of values in the range, close to min, are drawn + threshold% of the time. + The threshold value must be strictly positive. + + + + Example: + +\setrandom aid 1 :naccounts gaussian 5.0 + + + + + + + \sleep number [ us | ms | s ] + + + + + Causes script execution to sleep for the specified duration in + microseconds (us), milliseconds (ms) or seconds + (s). If the unit is omitted, seconds are the default. + number can be either an integer constant or a + :variablename reference to a variable + having an integer value. + + + + Example: + +\sleep 10 ms + + + + + + + \setshell varname command [ argument ... ] + + + + + Sets variable varname to the result of the shell command + command. The command must return an integer value + through its standard output. + + + argument can be either a text constant or a + :variablename reference to a variable of + any type. 
If you want to use an argument starting with + a colon, you need to add an additional colon at the beginning of + the argument. + + + + Example: + +\setshell variable_to_be_assigned command literal_argument :variable ::literal_starting_with_colon + + + + + + + \shell command [ argument ... ] + + + + + Same as \setshell, but the result is ignored. + + + + Example: + +\shell command literal_argument :variable ::literal_starting_with_colon + + + + + + + As an example, the full definition of the built-in TPC-B-like + transaction is: + + +\set nbranches :scale +\set ntellers 10 * :scale +\set naccounts 100000 * :scale +\setrandom aid 1 :naccounts +\setrandom bid 1 :nbranches +\setrandom tid 1 :ntellers +\setrandom delta -5000 5000 +BEGIN; +UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; +SELECT abalance FROM pgbench_accounts WHERE aid = :aid; +UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; +UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; +INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); +END; + + + This script allows each iteration of the transaction to reference + different, randomly-chosen rows. (This example also shows why it's + important for each client session to have its own variables — + otherwise they'd not be independently touching different rows.) + + +
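As an aside, the exponential \setrandom variant described above can be approximated with inverse-transform sampling. The following Python sketch is illustrative only (the function name is hypothetical and this is not taken from pgbench's source):

```python
import math
import random

# Illustrative sketch: draw an integer from a truncated exponential
# distribution over [lo, hi], as described for \setrandom ... exponential.
# Larger thresholds concentrate draws near the lower bound.

def exponential_rand(lo: int, hi: int, threshold: float) -> int:
    assert threshold > 0.0
    u = random.random()  # uniform in [0, 1)
    # Inverse CDF of an exponential truncated at `threshold`,
    # then projected onto the integer range.
    x = -math.log(1.0 - u * (1.0 - math.exp(-threshold))) / threshold
    return lo + int(x * (hi - lo + 1))

random.seed(12345)
samples = [exponential_rand(1, 100, 5.0) for _ in range(10000)]
assert all(1 <= s <= 100 for s in samples)
# With threshold 5.0, well over half of the draws land in the
# bottom fifth of the range (about 0.64 in expectation).
print(sum(1 for s in samples if s <= 20) / len(samples))
```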
+ + + Per-Transaction Logging + + + With the , + pgbench writes the time taken by each transaction + to a log file. The log file will be named + pgbench_log.nnn, where + nnn is the PID of the pgbench process. + If the + + + The format of the log is: + + +client_id transaction_no time file_no time_epoch time_us schedule_lag + + + where time is the total elapsed transaction time in microseconds, + file_no identifies which script file was used + (useful when multiple scripts were specified with + + + Here is a snippet of a generated log file: + + 0 199 2241 0 1175850568 995598 + 0 200 2465 0 1175850568 998079 + 0 201 2513 0 1175850569 608 + 0 202 2038 0 1175850569 2663 + + + Another example with --rate=100 and --latency-limit=5 (note the additional + schedule_lag column): + + 0 81 4621 0 1412881037 912698 3005 + 0 82 6173 0 1412881037 914578 4304 + 0 83 skipped 0 1412881037 914578 5217 + 0 83 skipped 0 1412881037 914578 5099 + 0 83 4722 0 1412881037 916203 3108 + 0 84 4142 0 1412881037 918023 2333 + 0 85 2465 0 1412881037 919759 740 + + In this example, transaction 82 was late because its latency (6.173 ms) was + over the 5 ms limit. The next two transactions were skipped because they + were already late before they were even started. + + + + When running a long test on hardware that can handle a lot of transactions, + the log files can become very large. The + + + + Aggregated Logging + + + With the option, the log uses a slightly different format: + + +interval_start num_of_transactions latency_sum latency_2_sum min_latency max_latency lag_sum lag_2_sum min_lag max_lag skipped_transactions + + + where interval_start is the start of the interval (a UNIX epoch + format timestamp), num_of_transactions is the number of transactions + within the interval, and latency_sum is the sum of the latencies + (so you can compute the average latency easily). 
The following two fields are useful + for variance estimation: latency_sum is the sum of the latencies and + latency_2_sum is the sum of the squared latencies. The last two + fields are min_latency, the minimum latency within the interval, and + max_latency, the maximum latency within the interval. A transaction is + counted in the interval in which it was committed. The fields at the end, + lag_sum, lag_2_sum, min_lag, + and max_lag, are only present if the + + + Here is some example output: + +1345828501 5601 1542744 483552416 61 2573 +1345828503 7884 1979812 565806736 60 1479 +1345828505 7208 1979422 567277552 59 1391 +1345828507 7685 1980268 569784714 60 1398 +1345828509 7073 1979779 573489941 236 1411 + + + + Notice that while the plain (unaggregated) log file records the index + of the custom script file, the aggregated log does not. Therefore, if + you need per-script data, you have to aggregate it on your own. + + + + + + Per-Statement Latencies + + + With the + + + For the default script, the output will look similar to this: + +starting vacuum...end. 
+transaction type: TPC-B (sort of) +scaling factor: 1 +query mode: simple +number of clients: 10 +number of threads: 1 +number of transactions per client: 1000 +number of transactions actually processed: 10000/10000 +tps = 618.764555 (including connections establishing) +tps = 622.977698 (excluding connections establishing) +statement latencies in milliseconds: + 0.004386 \set nbranches 1 * :scale + 0.001343 \set ntellers 10 * :scale + 0.001212 \set naccounts 100000 * :scale + 0.001310 \setrandom aid 1 :naccounts + 0.001073 \setrandom bid 1 :nbranches + 0.001005 \setrandom tid 1 :ntellers + 0.001078 \setrandom delta -5000 5000 + 0.326152 BEGIN; + 0.603376 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid; + 0.454643 SELECT abalance FROM pgbench_accounts WHERE aid = :aid; + 5.528491 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid; + 7.335435 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid; + 0.371851 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP); + 1.212976 END; + + + + + If multiple script files are specified, the averages are reported + separately for each script file. + + + + Note that collecting the additional timing information needed for + per-statement latency computation adds some overhead. This will slow + average execution speed and lower the computed TPS. The amount + of slowdown varies significantly depending on platform and hardware. + Comparing average TPS values with and without latency reporting enabled + is a good way to measure if the timing overhead is significant. + + + + + Good Practices + + + It is very easy to use pgbench to produce completely + meaningless numbers. Here are some guidelines to help you get useful + results. + + + + In the first place, never believe any test that runs + for only a few seconds. 
Use the + + + For the default TPC-B-like test scenario, the initialization scale factor + ( + + + The default test scenario is also quite sensitive to how long it's been + since the tables were initialized: accumulation of dead rows and dead space + in the tables changes the results. To understand the results you must keep + track of the total number of updates and when vacuuming happens. If + autovacuum is enabled it can result in unpredictable changes in measured + performance. + + + + A limitation of pgbench is that it can itself become + the bottleneck when trying to test a large number of client sessions. + This can be alleviated by running pgbench on a different + machine from the database server, although low network latency will be + essential. It might even be useful to run several pgbench + instances concurrently, on several client machines, against the same + database server. + + +
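As a final illustration, the aggregated log fields described earlier can be turned into per-interval statistics. This sketch assumes the field layout shown in the Aggregated Logging section, and the helper name is hypothetical:

```python
import math

# Derive average latency and its standard deviation for one aggregated
# log interval from num_of_transactions, latency_sum, and latency_2_sum.

def interval_stats(line: str):
    fields = line.split()
    n = int(fields[1])               # num_of_transactions
    latency_sum = float(fields[2])
    latency_2_sum = float(fields[3])
    avg = latency_sum / n
    # Var(X) = E[X^2] - E[X]^2, using the sum of squared latencies.
    variance = latency_2_sum / n - avg * avg
    return avg, math.sqrt(variance)

# First example interval from the output shown above.
avg, stddev = interval_stats("1345828501 5601 1542744 483552416 61 2573")
print(round(avg, 1), round(stddev, 1))  # 275.4 102.3
```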
+
diff --git a/doc/src/sgml/reference.sgml b/doc/src/sgml/reference.sgml index 9fb32f8c284..c1765ef1c5e 100644 --- a/doc/src/sgml/reference.sgml +++ b/doc/src/sgml/reference.sgml @@ -230,6 +230,7 @@ &dropuser; &ecpgRef; &pgBasebackup; + &pgbench; &pgConfig; &pgDump; &pgDumpall;