| author | Linus Torvalds <torvalds@linux-foundation.org> | 2018-01-30 11:15:14 -0800 | 
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2018-01-30 11:15:14 -0800 | 
| commit | d8b91dde38f4c43bd0bbbf17a90f735b16aaff2c (patch) | |
| tree | bd72dabf6e4b23e060fce429c87e60504f69de54 /tools/perf/util/evlist.c | |
| parent | 5e7481a25e90b661d1dbbba18be3fd3dfe12ec6f (diff) | |
| parent | e4c1091cb495d9cbec8956d642644a71a1689958 (diff) | |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Kernel side changes:
   - Clean up the x86 instruction decoder (Masami Hiramatsu)
   - Add new uprobes optimization for PUSH instructions on x86 (Yonghong
     Song)
   - Add MSR_IA32_THERM_STATUS to the MSR events (Stephane Eranian)
   - Fix misc bugs, update documentation, plus various cleanups (Jiri
     Olsa)
  There's a large number of tooling side improvements:
   - Intel-PT/BTS improvements (Adrian Hunter)
   - Numerous 'perf trace' improvements (Arnaldo Carvalho de Melo)
   - Introduce an errno code to string facility (Hendrik Brueckner)
   - Various build system improvements (Jiri Olsa)
   - Add support for CoreSight trace decoding by making the perf tools
     use the external openCSD (Mathieu Poirier, Tor Jeremiassen)
   - Add ARM Statistical Profiling Extensions (SPE) support (Kim
     Phillips)
   - libtraceevent updates (Steven Rostedt)
   - Intel vendor event JSON updates (Andi Kleen)
   - Introduce 'perf report --mmaps' and 'perf report --tasks' to show
     info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo)
   - Add infrastructure to record first and last sample time to the
     perf.data file header, so that when processing all samples in a
     'perf record' session, such as when doing build-id processing, or
     when specifically requesting that that info be recorded, use that
     in 'perf report --time', that also got support for percent slices
     in addition to absolute ones.
     I.e. now it is possible to ask for the samples in the 10%-20% time
     slice of a perf.data file (Jin Yao)
   - Allow system wide 'perf stat --per-thread', sorting the result (Jin
     Yao)
     E.g.:
      [root@jouet ~]# perf stat --per-thread --metrics IPC
      ^C
       Performance counter stats for 'system wide':
                  make-22229  23,012,094,032  inst_retired.any   #  0.8 IPC
                   cc1-22419     692,027,497  inst_retired.any   #  0.8 IPC
                   gcc-22418     328,231,855  inst_retired.any   #  0.9 IPC
                   cc1-22509     220,853,647  inst_retired.any   #  0.8 IPC
                   gcc-22486     199,874,810  inst_retired.any   #  1.0 IPC
                    as-22466     177,896,365  inst_retired.any   #  0.9 IPC
                   cc1-22465     150,732,374  inst_retired.any   #  0.8 IPC
                   gcc-22508     112,555,593  inst_retired.any   #  0.9 IPC
                   cc1-22487     108,964,079  inst_retired.any   #  0.7 IPC
       qemu-system-x86-2697       21,330,550  inst_retired.any   #  0.3 IPC
       systemd-journal-551        20,642,951  inst_retired.any   #  0.4 IPC
       docker-containe-17651       9,552,892  inst_retired.any   #  0.5 IPC
       dockerd-current-9809        7,528,586  inst_retired.any   #  0.5 IPC
                  make-22153  12,504,194,380  inst_retired.any   #  0.8 IPC
               python2-22429  12,081,290,954  inst_retired.any   #  0.8 IPC
      <SNIP>
               python2-22429  15,026,328,103  cpu_clk_unhalted.thread
                   cc1-22419     826,660,193  cpu_clk_unhalted.thread
                   gcc-22418     365,321,295  cpu_clk_unhalted.thread
                   cc1-22509     279,169,362  cpu_clk_unhalted.thread
                   gcc-22486     210,156,950  cpu_clk_unhalted.thread
      <SNIP>
           5.638075538 seconds time elapsed
     [root@jouet ~]#
   - Improve shell auto-completion of perf events (Jin Yao)
   - 'perf probe' improvements (Masami Hiramatsu)
   - Improve PMU infrastructure to support arm64's ThunderX2
     implementation defined core events (Ganapatrao Kulkarni)
   - Various annotation related improvements and fixes (Thomas Richter)
   - Clarify usage of 'overwrite' and 'backward' in the evlist/mmap
     code, removing the 'overwrite' parameter from several functions as
     it was always used as 'false' (Wang Nan)
   - Fix/improve 'perf record' reverse recording support (Wang Nan)
   - Improve command line options documentation (Sihyeon Jang)
   - Optimize sample parsing for ordering events, where we don't need to
     parse all the PERF_SAMPLE_ bits, just the ones leading to the
     timestamp needed to reorder events (Jiri Olsa)
   - Generalize the annotation code to support other source information
     besides objdump/DWARF obtained ones, starting with python scripts,
     which are slated to be merged soon (Jiri Olsa)
   - ... and a lot more that I failed to list, see the shortlog and
     changelog for details"
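As a rough illustration of the percent-slice idea described above for 'perf report --time', the sketch below maps a percentage range onto absolute timestamps using the first and last sample times. The names `time_slice` and `percent_to_slice` are hypothetical and are not taken from the perf sources; the actual option parsing in perf may well differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: an absolute [start, end] time window, e.g. in ns. */
struct time_slice {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical helper (not a perf API): convert a percent range of the
 * recording, bounded by the first and last sample times, into absolute
 * timestamps. Assumes first <= last and start_pct <= end_pct <= 100.
 * The multiplication can overflow for extremely long recordings.
 */
static struct time_slice percent_to_slice(uint64_t first, uint64_t last,
					  unsigned int start_pct,
					  unsigned int end_pct)
{
	uint64_t total = last - first;
	struct time_slice slice;

	slice.start = first + total * start_pct / 100;
	slice.end = first + total * end_pct / 100;
	return slice;
}
```

Under these assumptions, asking for the 10%-20% slice of a recording whose samples span timestamps 1000 to 2000 would select the window from 1100 to 1200.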
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits)
  perf trace beauty flock: Move to separate object file
  perf evlist: Remove fcntl.h from evlist.h
  perf trace beauty futex: Beautify FUTEX_BITSET_MATCH_ANY
  perf trace: Do not print from time delta for interrupted syscall lines
  perf trace: Add --print-sample
  perf bpf: Remove misplaced __maybe_unused attribute
  MAINTAINERS: Adding entry for CoreSight trace decoding
  perf tools: Add mechanic to synthesise CoreSight trace packets
  perf tools: Add full support for CoreSight trace decoding
  pert tools: Add queue management functionality
  perf tools: Add functionality to communicate with the openCSD decoder
  perf tools: Add support for decoding CoreSight trace data
  perf tools: Add decoder mechanic to support dumping trace data
  perf tools: Add processing of coresight metadata
  perf tools: Add initial entry point for decoder CoreSight traces
  perf tools: Integrating the CoreSight decoding library
  perf vendor events intel: Update IvyTown files to V20
  perf vendor events intel: Update IvyBridge files to V20
  perf vendor events intel: Update BroadwellDE events to V7
  perf vendor events intel: Update SkylakeX events to V1.06
  ...
Diffstat (limited to 'tools/perf/util/evlist.c')
| -rw-r--r-- | tools/perf/util/evlist.c | 70 | 
1 files changed, 43 insertions, 27 deletions
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index b62e523a7035..ac35cd214feb 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -25,6 +25,7 @@
 #include "parse-events.h"
 #include <subcmd/parse-options.h>
 
+#include <fcntl.h>
 #include <sys/ioctl.h>
 #include <sys/mman.h>
 
@@ -125,7 +126,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
 	zfree(&evlist->mmap);
-	zfree(&evlist->backward_mmap);
+	zfree(&evlist->overwrite_mmap);
 	fdarray__exit(&evlist->pollfd);
 }
 
@@ -675,11 +676,11 @@ static int perf_evlist__set_paused(struct perf_evlist *evlist, bool value)
 {
 	int i;
 
-	if (!evlist->backward_mmap)
+	if (!evlist->overwrite_mmap)
 		return 0;
 
 	for (i = 0; i < evlist->nr_mmaps; i++) {
-		int fd = evlist->backward_mmap[i].fd;
+		int fd = evlist->overwrite_mmap[i].fd;
 		int err;
 
 		if (fd < 0)
@@ -711,7 +712,7 @@ union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int
 	 * No need for read-write ring buffer: kernel stop outputting when
 	 * it hit md->prev (perf_mmap__consume()).
 	 */
-	return perf_mmap__read_forward(md, evlist->overwrite);
+	return perf_mmap__read_forward(md);
 }
 
 union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
@@ -738,7 +739,7 @@ void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
 
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 {
-	perf_mmap__consume(&evlist->mmap[idx], evlist->overwrite);
+	perf_mmap__consume(&evlist->mmap[idx], false);
 }
 
 static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
@@ -749,16 +750,16 @@ static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
 		for (i = 0; i < evlist->nr_mmaps; i++)
 			perf_mmap__munmap(&evlist->mmap[i]);
 
-	if (evlist->backward_mmap)
+	if (evlist->overwrite_mmap)
 		for (i = 0; i < evlist->nr_mmaps; i++)
-			perf_mmap__munmap(&evlist->backward_mmap[i]);
+			perf_mmap__munmap(&evlist->overwrite_mmap[i]);
 }
 
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	perf_evlist__munmap_nofree(evlist);
 	zfree(&evlist->mmap);
-	zfree(&evlist->backward_mmap);
+	zfree(&evlist->overwrite_mmap);
 }
 
 static struct perf_mmap *perf_evlist__alloc_mmap(struct perf_evlist *evlist)
@@ -800,7 +801,7 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu_idx,
-				       int thread, int *_output, int *_output_backward)
+				       int thread, int *_output, int *_output_overwrite)
 {
 	struct perf_evsel *evsel;
 	int revent;
@@ -812,18 +813,20 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		int fd;
 		int cpu;
 
+		mp->prot = PROT_READ | PROT_WRITE;
 		if (evsel->attr.write_backward) {
-			output = _output_backward;
-			maps = evlist->backward_mmap;
+			output = _output_overwrite;
+			maps = evlist->overwrite_mmap;
 
 			if (!maps) {
 				maps = perf_evlist__alloc_mmap(evlist);
 				if (!maps)
 					return -1;
-				evlist->backward_mmap = maps;
+				evlist->overwrite_mmap = maps;
 				if (evlist->bkw_mmap_state == BKW_MMAP_NOTREADY)
 					perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING);
 			}
+			mp->prot &= ~PROT_WRITE;
 		}
 
 		if (evsel->system_wide && thread)
@@ -884,14 +887,14 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
 		int output = -1;
-		int output_backward = -1;
+		int output_overwrite = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output, &output_backward))
+							thread, &output, &output_overwrite))
 				goto out_unmap;
 		}
 	}
@@ -912,13 +915,13 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
-		int output_backward = -1;
+		int output_overwrite = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output, &output_backward))
+						&output, &output_overwrite))
 			goto out_unmap;
 	}
 
@@ -1052,15 +1055,18 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * Return: %0 on success, negative error code otherwise.
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
+			 unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite)
 {
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
-	struct mmap_params mp = {
-		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
-	};
+	/*
+	 * Delay setting mp.prot: set it before calling perf_mmap__mmap.
+	 * Its value is decided by evsel's write_backward.
+	 * So &mp should not be passed through const pointer.
+	 */
+	struct mmap_params mp;
 
 	if (!evlist->mmap)
 		evlist->mmap = perf_evlist__alloc_mmap(evlist);
@@ -1070,7 +1076,6 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
-	evlist->overwrite = overwrite;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
 	mp.mask = evlist->mmap_len - page_size - 1;
@@ -1091,10 +1096,9 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	return perf_evlist__mmap_per_cpu(evlist, &mp);
 }
 
-int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
-		      bool overwrite)
+int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages)
 {
-	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
+	return perf_evlist__mmap_ex(evlist, pages, 0, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
@@ -1102,7 +1106,8 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 	struct cpu_map *cpus;
 	struct thread_map *threads;
 
-	threads = thread_map__new_str(target->pid, target->tid, target->uid);
+	threads = thread_map__new_str(target->pid, target->tid, target->uid,
+				      target->per_thread);
 
 	if (!threads)
 		return -1;
@@ -1582,6 +1587,17 @@ int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *even
 	return perf_evsel__parse_sample(evsel, event, sample);
 }
 
+int perf_evlist__parse_sample_timestamp(struct perf_evlist *evlist,
+					union perf_event *event,
+					u64 *timestamp)
+{
+	struct perf_evsel *evsel = perf_evlist__event2evsel(evlist, event);
+
+	if (!evsel)
+		return -EFAULT;
+	return perf_evsel__parse_sample_timestamp(evsel, event, timestamp);
+}
+
 size_t perf_evlist__fprintf(struct perf_evlist *evlist, FILE *fp)
 {
 	struct perf_evsel *evsel;
@@ -1739,13 +1755,13 @@ void perf_evlist__toggle_bkw_mmap(struct perf_evlist *evlist,
 		RESUME,
 	} action = NONE;
 
-	if (!evlist->backward_mmap)
+	if (!evlist->overwrite_mmap)
 		return;
 
 	switch (old_state) {
 	case BKW_MMAP_NOTREADY: {
 		if (state != BKW_MMAP_RUNNING)
-			goto state_err;;
+			goto state_err;
 		break;
 	}
 	case BKW_MMAP_RUNNING: {
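The diff above moves the choice of mmap protection from a single 'overwrite' flag on the evlist to a per-evsel decision based on write_backward. A minimal standalone sketch of that decision follows; the helper name `ring_buffer_prot` is hypothetical and exists only for illustration, since in the patch the logic is inlined in perf_evlist__mmap_per_evsel().

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of perf): pick the mmap protection for an
 * event's ring buffer. Start from a read-write mapping and drop PROT_WRITE
 * for write_backward (overwrite) events, as the patch does with mp->prot.
 */
static int ring_buffer_prot(bool write_backward)
{
	int prot = PROT_READ | PROT_WRITE;

	if (write_backward)
		prot &= ~PROT_WRITE;
	return prot;
}
```

With this sketch, a regular event gets PROT_READ | PROT_WRITE while a write_backward event gets a read-only mapping, matching the values the patch assigns to mp->prot.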
