Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar: "Kernel side changes: - Clean up the x86 instruction decoder (Masami Hiramatsu) - Add new uprobes optimization for PUSH instructions on x86 (Yonghong Song) - Add MSR_IA32_THERM_STATUS to the MSR events (Stephane Eranian) - Fix misc bugs, update documentation, plus various cleanups (Jiri Olsa) There's a large number of tooling side improvements: - Intel-PT/BTS improvements (Adrian Hunter) - Numerous 'perf trace' improvements (Arnaldo Carvalho de Melo) - Introduce an errno code to string facility (Hendrik Brueckner) - Various build system improvements (Jiri Olsa) - Add support for CoreSight trace decoding by making the perf tools use the external openCSD (Mathieu Poirier, Tor Jeremiassen) - Add ARM Statistical Profiling Extensions (SPE) support (Kim Phillips) - libtraceevent updates (Steven Rostedt) - Intel vendor event JSON updates (Andi Kleen) - Introduce 'perf report --mmaps' and 'perf report --tasks' to show info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo) - Add infrastructure to record first and last sample time to the perf.data file header, so that when processing all samples in a 'perf record' session, such as when doing build-id processing, or when specifically requesting that that info be recorded, use that in 'perf report --time', that also got support for percent slices in addition to absolute ones. I.e. now it is possible to ask for the samples in the 10%-20% time slice of a perf.data file (Jin Yao) - Allow system wide 'perf stat --per-thread', sorting the result (Jin Yao) E.g.: [root@jouet ~]# perf stat --per-thread --metrics IPC ^C Performance counter stats for 'system wide': make-22229 23,012,094,032 inst_retired.any # 0.8 IPC cc1-22419 692,027,497 inst_retired.any # 0.8 IPC gcc-22418 328,231,855 inst_retired.any # 0.9 IPC cc1-22509 220,853,647 inst_retired.any # 0.8 IPC gcc-22486 199,874,810 inst_retired.any # 1.0 IPC as-22466 177,896,365 inst_retired.any # 0.9 IPC cc1-22465 150,732,374 inst_retired.any # 0.8 IPC gcc-22508 112,555,593 inst_retired.any # 0.9 IPC cc1-22487 108,964,079 inst_retired.any # 0.7 IPC qemu-system-x86-2697 21,330,550 inst_retired.any # 0.3 IPC systemd-journal-551 20,642,951 inst_retired.any # 0.4 IPC docker-containe-17651 9,552,892 inst_retired.any # 0.5 IPC dockerd-current-9809 7,528,586 inst_retired.any # 0.5 IPC make-22153 12,504,194,380 inst_retired.any # 0.8 IPC python2-22429 12,081,290,954 inst_retired.any # 0.8 IPC <SNIP> python2-22429 15,026,328,103 cpu_clk_unhalted.thread cc1-22419 826,660,193 cpu_clk_unhalted.thread gcc-22418 365,321,295 cpu_clk_unhalted.thread cc1-22509 279,169,362 cpu_clk_unhalted.thread gcc-22486 210,156,950 cpu_clk_unhalted.thread <SNIP> 5.638075538 seconds time elapsed [root@jouet ~]# - Improve shell auto-completion of perf events (Jin Yao) - 'perf probe' improvements (Masami Hiramatsu) - Improve PMU infrastructure to support amp64's ThunderX2 implementation defined core events (Ganapatrao Kulkarni) - Various annotation related improvements and fixes (Thomas Richter) - Clarify usage of 'overwrite' and 'backward' in the evlist/mmap code, removing the 'overwrite' parameter from several functions as it was always used it as 'false' (Wang Nan) - Fix/improve 'perf record' reverse recording support (Wang Nan) - Improve command line options documentation (Sihyeon Jang) - Optimize sample parsing for ordering events, where we don't need to parse all the PERF_SAMPLE_ bits, just the ones leading to the timestamp needed to reorder events (Jiri Olsa) - Generalize the annotation code to support other source information besides objdump/DWARF obtained ones, starting with python scripts, that will is slated to be merged soon (Jiri Olsa) - ... and a lot more that I failed to list, see the shortlog and changelog for details" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits) perf trace beauty flock: Move to separate object file perf evlist: Remove fcntl.h from evlist.h perf trace beauty futex: Beautify FUTEX_BITSET_MATCH_ANY perf trace: Do not print from time delta for interrupted syscall lines perf trace: Add --print-sample perf bpf: Remove misplaced __maybe_unused attribute MAINTAINERS: Adding entry for CoreSight trace decoding perf tools: Add mechanic to synthesise CoreSight trace packets perf tools: Add full support for CoreSight trace decoding pert tools: Add queue management functionality perf tools: Add functionality to communicate with the openCSD decoder perf tools: Add support for decoding CoreSight trace data perf tools: Add decoder mechanic to support dumping trace data perf tools: Add processing of coresight metadata perf tools: Add initial entry point for decoder CoreSight traces perf tools: Integrating the CoreSight decoding library perf vendor events intel: Update IvyTown files to V20 perf vendor events intel: Update IvyBridge files to V20 perf vendor events intel: Update BroadwellDE events to V7 perf vendor events intel: Update SkylakeX events to V1.06 ...
author: Linus Torvalds <torvalds@linux-foundation.org> 2018-01-30 11:15:14 -0800
committer: Linus Torvalds <torvalds@linux-foundation.org> 2018-01-30 11:15:14 -0800
commit: d8b91dde38f4c43bd0bbbf17a90f735b16aaff2c (patch)
tree: bd72dabf6e4b23e060fce429c87e60504f69de54 /tools/perf/util/evlist.c
parent: 5e7481a25e90b661d1dbbba18be3fd3dfe12ec6f (diff)
parent: e4c1091cb495d9cbec8956d642644a71a1689958 (diff)
1 files changed, 43 insertions, 27 deletions
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index b62e523a7035..ac35cd214feb 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -25,6 +25,7 @@
 #include "parse-events.h"
 #include <subcmd/parse-options.h>
 
+#include <fcntl.h>
 #include <sys/ioctl.h>
 #include <sys/mman.h>
 
@@ -125,7 +126,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
 	zfree(&evlist->mmap);
-	zfree(&evlist->backward_mmap);
+	zfree(&evlist->overwrite_mmap);
 	fdarray__exit(&evlist->pollfd);
 }
 
@@ -675,11 +676,11 @@ static int perf_evlist__set_paused(struct perf_evlist *evlist, bool value)
 {
 	int i;
 
-	if (!evlist->backward_mmap)
+	if (!evlist->overwrite_mmap)
 		return 0;
 
 	for (i = 0; i < evlist->nr_mmaps; i++) {
-		int fd = evlist->backward_mmap[i].fd;
+		int fd = evlist->overwrite_mmap[i].fd;
 		int err;
 
 		if (fd < 0)
@@ -711,7 +712,7 @@ union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int
 	 * No need for read-write ring buffer: kernel stop outputting when
 	 * it hit md->prev (perf_mmap__consume()).
 	 */
-	return perf_mmap__read_forward(md, evlist->overwrite);
+	return perf_mmap__read_forward(md);
 }
 
 union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
@@ -738,7 +739,7 @@ void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
 
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 {
-	perf_mmap__consume(&evlist->mmap[idx], evlist->overwrite);
+	perf_mmap__consume(&evlist->mmap[idx], false);
 }
 
 static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
@@ -749,16 +750,16 @@ static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
 		for (i = 0; i < evlist->nr_mmaps; i++)
 			perf_mmap__munmap(&evlist->mmap[i]);
 
-	if (evlist->backward_mmap)
+	if (evlist->overwrite_mmap)
 		for (i = 0; i < evlist->nr_mmaps; i++)
-			perf_mmap__munmap(&evlist->backward_mmap[i]);
+			perf_mmap__munmap(&evlist->overwrite_mmap[i]);
 }
 
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	perf_evlist__munmap_nofree(evlist);
 	zfree(&evlist->mmap);
-	zfree(&evlist->backward_mmap);
+	zfree(&evlist->overwrite_mmap);
 }
 
 static struct perf_mmap *perf_evlist__alloc_mmap(struct perf_evlist *evlist)
@@ -800,7 +801,7 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu_idx,
-				       int thread, int *_output, int *_output_backward)
+				       int thread, int *_output, int *_output_overwrite)
 {
 	struct perf_evsel *evsel;
 	int revent;
@@ -812,18 +813,20 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		int fd;
 		int cpu;
 
+		mp->prot = PROT_READ | PROT_WRITE;
 		if (evsel->attr.write_backward) {
-			output = _output_backward;
-			maps = evlist->backward_mmap;
+			output = _output_overwrite;
+			maps = evlist->overwrite_mmap;
 
 			if (!maps) {
 				maps = perf_evlist__alloc_mmap(evlist);
 				if (!maps)
 					return -1;
-				evlist->backward_mmap = maps;
+				evlist->overwrite_mmap = maps;
 				if (evlist->bkw_mmap_state == BKW_MMAP_NOTREADY)
 					perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING);
 			}
+			mp->prot &= ~PROT_WRITE;
 		}
 
 		if (evsel->system_wide && thread)
@@ -884,14 +887,14 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
 		int output = -1;
-		int output_backward = -1;
+		int output_overwrite = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output, &output_backward))
+							thread, &output, &output_overwrite))
 				goto out_unmap;
 		}
 	}
@@ -912,13 +915,13 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
-		int output_backward = -1;
+		int output_overwrite = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output, &output_backward))
+						&output, &output_overwrite))
 			goto out_unmap;
 	}
 
@@ -1052,15 +1055,18 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * Return: %0 on success, negative error code otherwise.
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
+			 unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite)
 {
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
-	struct mmap_params mp = {
-		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
-	};
+	/*
+	 * Delay setting mp.prot: set it before calling perf_mmap__mmap.
+	 * Its value is decided by evsel's write_backward.
+	 * So &mp should not be passed through const pointer.
+	 */
+	struct mmap_params mp;
 
 	if (!evlist->mmap)
 		evlist->mmap = perf_evlist__alloc_mmap(evlist);
@@ -1070,7 +1076,6 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
-	evlist->overwrite = overwrite;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
 	mp.mask = evlist->mmap_len - page_size - 1;
@@ -1091,10 +1096,9 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	return perf_evlist__mmap_per_cpu(evlist, &mp);
 }
 
-int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
-		      bool overwrite)
+int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages)
 {
-	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
+	return perf_evlist__mmap_ex(evlist, pages, 0, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
@@ -1102,7 +1106,8 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 	struct cpu_map *cpus;
 	struct thread_map *threads;
 
-	threads = thread_map__new_str(target->pid, target->tid, target->uid);
+	threads = thread_map__new_str(target->pid, target->tid, target->uid,
+				      target->per_thread);
 
 	if (!threads)
 		return -1;
@@ -1582,6 +1587,17 @@ int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *even
 	return perf_evsel__parse_sample(evsel, event, sample);
 }
 
+int perf_evlist__parse_sample_timestamp(struct perf_evlist *evlist,
+					union perf_event *event,
+					u64 *timestamp)
+{
+	struct perf_evsel *evsel = perf_evlist__event2evsel(evlist, event);
+
+	if (!evsel)
+		return -EFAULT;
+	return perf_evsel__parse_sample_timestamp(evsel, event, timestamp);
+}
+
 size_t perf_evlist__fprintf(struct perf_evlist *evlist, FILE *fp)
 {
 	struct perf_evsel *evsel;
@@ -1739,13 +1755,13 @@ void perf_evlist__toggle_bkw_mmap(struct perf_evlist *evlist,
 		RESUME,
 	} action = NONE;
 
-	if (!evlist->backward_mmap)
+	if (!evlist->overwrite_mmap)
 		return;
 
 	switch (old_state) {
 	case BKW_MMAP_NOTREADY: {
 		if (state != BKW_MMAP_RUNNING)
-			goto state_err;;
+			goto state_err;
 		break;
 	}
 	case BKW_MMAP_RUNNING: {
author	Linus Torvalds <torvalds@linux-foundation.org>	2018-01-30 11:15:14 -0800
committer	Linus Torvalds <torvalds@linux-foundation.org>	2018-01-30 11:15:14 -0800
commit	d8b91dde38f4c43bd0bbbf17a90f735b16aaff2c (patch)
tree	bd72dabf6e4b23e060fce429c87e60504f69de54 /tools/perf/util/evlist.c
parent	5e7481a25e90b661d1dbbba18be3fd3dfe12ec6f (diff)
parent	e4c1091cb495d9cbec8956d642644a71a1689958 (diff)