user/sven/linux.git/tools/perf/bench, branch v3.14.2

perf bench numa: Make no args mean 'run all tests'

2014-03-14T13:04:10Z

If we call just: perf bench numa mem it will present the same output as: perf bench numa mem -h i.e. ask for instructions about what to run. While that is kinda ok, using 'run all tests' as the default, i.e. making 'no parms' be equivalent to: perf bench numa mem -a Will allow: perf bench numa all to actually do what is asked: i.e. run all the 'bench' tests, instead of responding to that by asking what to do. That, in turn, allows: perf bench all to actually complete, for the same reasons. And after that, the tests that come after that, and that at some point hit a NULL deref, will run, allowing me to reproduce a recently reported problem. That when you have the needed numa libraries, which wasn't the case for the reporter, making me a bit confused after trying to reproduce his report. So make no parms mean -a. Cc: Adrian Hunter Cc: David Ahern Cc: Frederic Weisbecker Cc: Jiri Olsa Cc: Mike Galbraith Cc: Namhyung Kim Cc: Patrick Palka Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Stephane Eranian Link: http://lkml.kernel.org/n/tip-x7h0ghx4pef4n0brywg21krk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo

perf tools: Fix bench/numa.c for 32-bit build

2013-10-21T14:19:42Z

bench/numa.c: In function 'worker_thread': bench/numa.c:1123:20: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare] bench/numa.c:1171:6: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'u64' [-Werror=format] cc1: all warnings being treated as errors Signed-off-by: Adrian Hunter Cc: David Ahern Cc: Frederic Weisbecker Cc: Ingo Molnar Cc: Jiri Olsa Cc: Mike Galbraith Cc: Namhyung Kim Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Stephane Eranian Link: http://lkml.kernel.org/r/1382099356-4918-13-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo

perf bench: Fix failing assertions in numa bench

2013-10-11T15:17:38Z

Patch adds more subtle handling of -C and -N parameters in parse_{cpu,node}_setup_list() functions when there isn't enough NUMA nodes or CPUs present. Instead of assertion and terminating benchmark, partial test is skipped with error message and perf will continue to the next one. Fixed problem can be easily reproduced on machine with only one NUMA node: # Running numa/mem benchmark... # Running main, "perf bench numa mem -a" ... # Running RAM-bw-remote, "perf bench numa mem -p 1 -t 1 -P 1024 -C 0 -M 1 -s perf: bench/numa.c:622: parse_setup_node_list: Assertion `!(bind_node_0 < 0 || bind_node_0 >= g->p.nr_nodes)' failed. Aborted Signed-off-by: Petr Holasek Acked-by: Ingo Molnar Cc: Ingo Molnar Cc: Jiri Olsa Cc: Petr Benas Link: http://lkml.kernel.org/r/1380821325-4017-1-git-send-email-pholasek@redhat.com Signed-off-by: Petr Benas Signed-off-by: Arnaldo Carvalho de Melo

perf bench sched: Add --threaded option

2013-10-11T15:17:25Z

Allow the measurement of thread versus process context switch performance. The default stays at 'process' based measurement, like lmbench's lat_ctx benchmark. Sample output: comet:~/tip/tools/perf> taskset 1 ./perf bench sched pipe # Running sched/pipe benchmark... # Executed 1000000 pipe operations between two processes Total time: 4.138 [sec] 4.138729 usecs/op 241620 ops/sec comet:~/tip/tools/perf> taskset 1 ./perf bench sched pipe --threaded # Running sched/pipe benchmark... # Executed 1000000 pipe operations between two threads Total time: 3.667 [sec] 3.667667 usecs/op 272652 ops/sec Signed-off-by: Ingo Molnar Cc: Peter Zijlstra Cc: Namhyung Kim Cc: Jiri Olsa Cc: Arnaldo Carvalho de Melo Link: http://lkml.kernel.org/r/20130917114256.GA31159@gmail.com Signed-off-by: Arnaldo Carvalho de Melo

tools/perf: Standardize feature support define names to: HAVE_{FEATURE}_SUPPORT

2013-10-09T06:48:28Z

Standardize all the feature flags based on the HAVE_{FEATURE}_SUPPORT naming convention: HAVE_ARCH_X86_64_SUPPORT HAVE_BACKTRACE_SUPPORT HAVE_CPLUS_DEMANGLE_SUPPORT HAVE_DWARF_SUPPORT HAVE_ELF_GETPHDRNUM_SUPPORT HAVE_GTK2_SUPPORT HAVE_GTK_INFO_BAR_SUPPORT HAVE_LIBAUDIT_SUPPORT HAVE_LIBELF_MMAP_SUPPORT HAVE_LIBELF_SUPPORT HAVE_LIBNUMA_SUPPORT HAVE_LIBUNWIND_SUPPORT HAVE_ON_EXIT_SUPPORT HAVE_PERF_REGS_SUPPORT HAVE_SLANG_SUPPORT HAVE_STRLCPY_SUPPORT Cc: Arnaldo Carvalho de Melo Cc: Peter Zijlstra Cc: Namhyung Kim Cc: David Ahern Cc: Jiri Olsa Link: http://lkml.kernel.org/n/tip-u3zvqejddfZhtrbYbfhi3spa@git.kernel.org Signed-off-by: Ingo Molnar

perf bench: Fix memcpy benchmark for large sizes

2013-07-22T15:41:56Z

The glibc calloc() function has an optimization to not explicitely memset() very large calloc allocations that just came from mmap(), because they are known to be zero. This could result in the perf memcpy benchmark reading only from the zero page, which gives unrealistic results. Always call memset explicitly on the source area to avoid this problem. Signed-off-by: Andi Kleen Cc: Hitoshi Mitake Cc: Kirill A. Shutemov Link: http://lkml.kernel.org/n/tip-pzz2qrdq9eymxda0y8yxdn33@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo

perf bench: Fix memory allocation fail check in mem{set,cpy} workloads

2013-07-08T20:35:40Z

Addresses of allocated memory areas saved to '*src' and '*dst', so we need to check them for NULL, not 'src' and 'dst'. Signed-off-by: Kirill A. Shutemov Acked-by: Hitoshi Mitake Cc: Hitoshi Mitake Cc: Ingo Molnar Cc: Paul Mackerras Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/1370518503-4230-1-git-send-email-kirill.shutemov@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo

perf tools: Fix LIBNUMA build with glibc 2.12 and older.

2013-03-14T11:06:21Z

The tokens MADV_HUGEPAGE and MADV_NOHUGEPAGE are not available with glibc 2.12 and older. Define these tokens if they are not already defined. This patch fixes these build errors with older versions of glibc. CC bench/numa.o bench/numa.c: In function ‘alloc_data’: bench/numa.c:334: error: ‘MADV_HUGEPAGE’ undeclared (first use in this function) bench/numa.c:334: error: (Each undeclared identifier is reported only once bench/numa.c:334: error: for each function it appears in.) bench/numa.c:341: error: ‘MADV_NOHUGEPAGE’ undeclared (first use in this function) make: *** [bench/numa.o] Error 1 Signed-off-by: Vinson Lee Acked-by: Ingo Molnar Cc: Ingo Molnar Cc: Irina Tirdea Cc: Paul Mackerras Cc: Pekka Enberg Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/1363214064-4671-2-git-send-email-vlee@twitter.com Signed-off-by: Arnaldo Carvalho de Melo

perf: Add 'perf bench numa mem' NUMA performance measurement suite

2013-01-30T13:35:36Z

Add a suite of NUMA performance benchmarks. The goal was simulate the behavior and access patterns of real NUMA workloads, via a wide range of parameters, so this tool goes well beyond simple bzero() measurements that most NUMA micro-benchmarks use: - It processes the data and creates a chain of data dependencies, like a real workload would. Neither the compiler, nor the kernel (via KSM and other optimizations) nor the CPU can eliminate parts of the workload. - It randomizes the initial state and also randomizes the target addresses of the processing - it's not a simple forward scan of addresses. - It provides flexible options to set process, thread and memory relationship information: -G sets "global" memory shared between all test processes, -P sets "process" memory shared by all threads of a process and -T sets "thread" private memory. - There's a NUMA convergence monitoring and convergence latency measurement option via -c and -m. - Micro-sleeps and synchronization can be injected to provoke lock contention and scheduling, via the -u and -S options. This simulates IO and contention. - The -x option instructs the workload to 'perturb' itself artificially every N seconds, by moving to the first and last CPU of the system periodically. This way the stability of convergence equilibrium and the number of steps taken for the scheduler to reach equilibrium again can be measured. - The amount of work can be specified via the -l loop count, and/or via a -s seconds-timeout value. - CPU and node memory binding options, to test hard binding scenarios. THP can be turned on and off via madvise() calls. - Live reporting of convergence progress in an 'at glance' output format. Printing of convergence and deconvergence events. The 'perf bench numa mem -a' option will start an array of about 30 individual tests that will each output such measurements: # Running 5x5-bw-thread, "perf bench numa mem -p 5 -t 5 -P 512 -s 20 -zZ0q --thp 1" 5x5-bw-thread, 20.276, secs, runtime-max/thread 5x5-bw-thread, 20.004, secs, runtime-min/thread 5x5-bw-thread, 20.155, secs, runtime-avg/thread 5x5-bw-thread, 0.671, %, spread-runtime/thread 5x5-bw-thread, 21.153, GB, data/thread 5x5-bw-thread, 528.818, GB, data-total 5x5-bw-thread, 0.959, nsecs, runtime/byte/thread 5x5-bw-thread, 1.043, GB/sec, thread-speed 5x5-bw-thread, 26.081, GB/sec, total-speed See the help text and the code for more details. Cc: Peter Zijlstra Cc: Arnaldo Carvalho de Melo Cc: Frederic Weisbecker Cc: Mike Galbraith Cc: Steven Rostedt Cc: Linus Torvalds Cc: Andrew Morton Cc: Peter Zijlstra Cc: Andrea Arcangeli Cc: Rik van Riel Cc: Mel Gorman Cc: Hugh Dickins Signed-off-by: Ingo Molnar

perf tools: Use __maybe_used for unused variables

2012-09-11T15:19:15Z

perf defines both __used and __unused variables to use for marking unused variables. The variable __used is defined to __attribute__((__unused__)), which contradicts the kernel definition to __attribute__((__used__)) for new gcc versions. On Android, __used is also defined in system headers and this leads to warnings like: warning: '__used__' attribute ignored __unused is not defined in the kernel and is not a standard definition. If __unused is included everywhere instead of __used, this leads to conflicts with glibc headers, since glibc has a variables with this name in its headers. The best approach is to use __maybe_unused, the definition used in the kernel for __attribute__((unused)). In this way there is only one definition in perf sources (instead of 2 definitions that point to the same thing: __used and __unused) and it works on both Linux and Android. This patch simply replaces all instances of __used and __unused with __maybe_unused. Signed-off-by: Irina Tirdea Acked-by: Pekka Enberg Cc: David Ahern Cc: Ingo Molnar Cc: Namhyung Kim Cc: Paul Mackerras Cc: Peter Zijlstra Cc: Steven Rostedt Link: http://lkml.kernel.org/r/1347315303-29906-7-git-send-email-irina.tirdea@intel.com [ committer note: fixed up conflict with a116e05 in builtin-sched.c ] Signed-off-by: Arnaldo Carvalho de Melo