All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
@ 2022-09-05 11:42 Adrian Hunter
  2022-09-06 12:53 ` Arnaldo Carvalho de Melo
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Adrian Hunter @ 2022-09-05 11:42 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Namhyung Kim, Ian Rogers, linux-kernel,
	linux-perf-users, Peter Zijlstra, Ingo Molnar

The offending commit removed mmap_per_thread(), which did not consider
the different set-output rules for per-thread mmaps i.e. in the per-thread
case set-output is used for file descriptors of the same thread not the
same cpu.

This was not immediately noticed because it only happens with
multi-threaded targets and we do not have a test for that yet.

Reinstate mmap_per_thread() expanding it to cover also system-wide per-cpu
events i.e. to continue to allow the mixing of per-thread and per-cpu
mmaps.

Debug messages (with -vv) show the file descriptors that are opened with
sys_perf_event_open. New debug messages are added (needs -vvv) that show
also which file descriptors are mmapped and which are redirected with
set-output.

In the per-cpu case (cpu != -1) file descriptors for the same CPU are
set-output to the first file descriptor for that CPU.

In the per-thread case (cpu == -1) file descriptors for the same thread are
set-output to the first file descriptor for that thread.

Example (process 17489 has 2 threads):

 Before (but with new debug prints):

   $ perf record --no-bpf-event -vvv --per-thread -p 17489
   <SNIP>
   sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
   sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
   <SNIP>
   libperf: idx 0: mmapping fd 5
   libperf: idx 0: set output fd 6 -> 5
   failed to mmap with 22 (Invalid argument)

 After:

   $ perf record --no-bpf-event -vvv --per-thread -p 17489
   <SNIP>
   sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
   sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
   <SNIP>
   libperf: mmap_per_thread: nr cpu values (may include -1) 1 nr threads 2
   libperf: idx 0: mmapping fd 5
   libperf: idx 1: mmapping fd 6
   <SNIP>
   [ perf record: Woken up 2 times to write data ]
   [ perf record: Captured and wrote 0.018 MB perf.data (15 samples) ]

Per-cpu example (process 20341 has 2 threads, same as above):

   $ perf record --no-bpf-event -vvv -p 20341
   <SNIP>
   sys_perf_event_open: pid 20341  cpu 0  group_fd -1  flags 0x8 = 5
   sys_perf_event_open: pid 20342  cpu 0  group_fd -1  flags 0x8 = 6
   sys_perf_event_open: pid 20341  cpu 1  group_fd -1  flags 0x8 = 7
   sys_perf_event_open: pid 20342  cpu 1  group_fd -1  flags 0x8 = 8
   sys_perf_event_open: pid 20341  cpu 2  group_fd -1  flags 0x8 = 9
   sys_perf_event_open: pid 20342  cpu 2  group_fd -1  flags 0x8 = 10
   sys_perf_event_open: pid 20341  cpu 3  group_fd -1  flags 0x8 = 11
   sys_perf_event_open: pid 20342  cpu 3  group_fd -1  flags 0x8 = 12
   sys_perf_event_open: pid 20341  cpu 4  group_fd -1  flags 0x8 = 13
   sys_perf_event_open: pid 20342  cpu 4  group_fd -1  flags 0x8 = 14
   sys_perf_event_open: pid 20341  cpu 5  group_fd -1  flags 0x8 = 15
   sys_perf_event_open: pid 20342  cpu 5  group_fd -1  flags 0x8 = 16
   sys_perf_event_open: pid 20341  cpu 6  group_fd -1  flags 0x8 = 17
   sys_perf_event_open: pid 20342  cpu 6  group_fd -1  flags 0x8 = 18
   sys_perf_event_open: pid 20341  cpu 7  group_fd -1  flags 0x8 = 19
   sys_perf_event_open: pid 20342  cpu 7  group_fd -1  flags 0x8 = 20
   <SNIP>
   libperf: mmap_per_cpu: nr cpu values 8 nr threads 2
   libperf: idx 0: mmapping fd 5
   libperf: idx 0: set output fd 6 -> 5
   libperf: idx 1: mmapping fd 7
   libperf: idx 1: set output fd 8 -> 7
   libperf: idx 2: mmapping fd 9
   libperf: idx 2: set output fd 10 -> 9
   libperf: idx 3: mmapping fd 11
   libperf: idx 3: set output fd 12 -> 11
   libperf: idx 4: mmapping fd 13
   libperf: idx 4: set output fd 14 -> 13
   libperf: idx 5: mmapping fd 15
   libperf: idx 5: set output fd 16 -> 15
   libperf: idx 6: mmapping fd 17
   libperf: idx 6: set output fd 18 -> 17
   libperf: idx 7: mmapping fd 19
   libperf: idx 7: set output fd 20 -> 19
   <SNIP>
   [ perf record: Woken up 7 times to write data ]
   [ perf record: Captured and wrote 0.020 MB perf.data (17 samples) ]

Fixes: ae4f8ae16a07 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---


Changes in V2:

	Expand commit message.
	White-space fixups.


 tools/lib/perf/evlist.c | 50 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
index e6c98a6e3908..6b1bafe267a4 100644
--- a/tools/lib/perf/evlist.c
+++ b/tools/lib/perf/evlist.c
@@ -486,6 +486,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 			if (ops->idx)
 				ops->idx(evlist, evsel, mp, idx);
 
+			pr_debug("idx %d: mmapping fd %d\n", idx, *output);
 			if (ops->mmap(map, mp, *output, evlist_cpu) < 0)
 				return -1;
 
@@ -494,6 +495,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 			if (!idx)
 				perf_evlist__set_mmap_first(evlist, map, overwrite);
 		} else {
+			pr_debug("idx %d: set output fd %d -> %d\n", idx, fd, *output);
 			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
 				return -1;
 
@@ -519,6 +521,48 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 	return 0;
 }
 
+static int
+mmap_per_thread(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
+		struct perf_mmap_param *mp)
+{
+	int nr_threads = perf_thread_map__nr(evlist->threads);
+	int nr_cpus    = perf_cpu_map__nr(evlist->all_cpus);
+	int cpu, thread, idx = 0;
+	int nr_mmaps = 0;
+
+	pr_debug("%s: nr cpu values (may include -1) %d nr threads %d\n",
+		 __func__, nr_cpus, nr_threads);
+
+	/* per-thread mmaps */
+	for (thread = 0; thread < nr_threads; thread++, idx++) {
+		int output = -1;
+		int output_overwrite = -1;
+
+		if (mmap_per_evsel(evlist, ops, idx, mp, 0, thread, &output,
+				   &output_overwrite, &nr_mmaps))
+			goto out_unmap;
+	}
+
+	/* system-wide mmaps i.e. per-cpu */
+	for (cpu = 1; cpu < nr_cpus; cpu++, idx++) {
+		int output = -1;
+		int output_overwrite = -1;
+
+		if (mmap_per_evsel(evlist, ops, idx, mp, cpu, 0, &output,
+				   &output_overwrite, &nr_mmaps))
+			goto out_unmap;
+	}
+
+	if (nr_mmaps != evlist->nr_mmaps)
+		pr_err("Miscounted nr_mmaps %d vs %d\n", nr_mmaps, evlist->nr_mmaps);
+
+	return 0;
+
+out_unmap:
+	perf_evlist__munmap(evlist);
+	return -1;
+}
+
 static int
 mmap_per_cpu(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 	     struct perf_mmap_param *mp)
@@ -528,6 +572,8 @@ mmap_per_cpu(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
 	int nr_mmaps = 0;
 	int cpu, thread;
 
+	pr_debug("%s: nr cpu values %d nr threads %d\n", __func__, nr_cpus, nr_threads);
+
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
 		int output = -1;
 		int output_overwrite = -1;
@@ -569,6 +615,7 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
 			  struct perf_evlist_mmap_ops *ops,
 			  struct perf_mmap_param *mp)
 {
+	const struct perf_cpu_map *cpus = evlist->all_cpus;
 	struct perf_evsel *evsel;
 
 	if (!ops || !ops->get || !ops->mmap)
@@ -588,6 +635,9 @@ int perf_evlist__mmap_ops(struct perf_evlist *evlist,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
+	if (perf_cpu_map__empty(cpus))
+		return mmap_per_thread(evlist, ops, mp);
+
 	return mmap_per_cpu(evlist, ops, mp);
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
  2022-09-05 11:42 [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets Adrian Hunter
@ 2022-09-06 12:53 ` Arnaldo Carvalho de Melo
  2022-09-06 12:59 ` Jiri Olsa
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-09-06 12:53 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Jiri Olsa, Namhyung Kim, Ian Rogers, linux-kernel,
	linux-perf-users, Peter Zijlstra, Ingo Molnar

Em Mon, Sep 05, 2022 at 02:42:09PM +0300, Adrian Hunter escreveu:
> The offending commit removed mmap_per_thread(), which did not consider
> the different set-output rules for per-thread mmaps i.e. in the per-thread
> case set-output is used for file descriptors of the same thread not the
> same cpu.
> 
> Fixes: ae4f8ae16a07 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

I added these:

Reported-by: Tomáš Trnka <trnka@scm.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216441

Jiri made a comment, can you please reply?

Applied locally for further tests,

Thanks,

- Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
  2022-09-05 11:42 [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets Adrian Hunter
  2022-09-06 12:53 ` Arnaldo Carvalho de Melo
@ 2022-09-06 12:59 ` Jiri Olsa
  2022-09-06 14:04   ` Adrian Hunter
  2022-09-06 17:50 ` Namhyung Kim
  2022-09-07 14:20 ` Jiri Olsa
  3 siblings, 1 reply; 12+ messages in thread
From: Jiri Olsa @ 2022-09-06 12:59 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-kernel, linux-perf-users, Peter Zijlstra, Ingo Molnar

On Mon, Sep 05, 2022 at 02:42:09PM +0300, Adrian Hunter wrote:

SNIP

> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> index e6c98a6e3908..6b1bafe267a4 100644
> --- a/tools/lib/perf/evlist.c
> +++ b/tools/lib/perf/evlist.c
> @@ -486,6 +486,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>  			if (ops->idx)
>  				ops->idx(evlist, evsel, mp, idx);
>  
> +			pr_debug("idx %d: mmapping fd %d\n", idx, *output);
>  			if (ops->mmap(map, mp, *output, evlist_cpu) < 0)
>  				return -1;
>  
> @@ -494,6 +495,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>  			if (!idx)
>  				perf_evlist__set_mmap_first(evlist, map, overwrite);
>  		} else {
> +			pr_debug("idx %d: set output fd %d -> %d\n", idx, fd, *output);
>  			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
>  				return -1;
>  
> @@ -519,6 +521,48 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>  	return 0;
>  }
>  
> +static int
> +mmap_per_thread(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
> +		struct perf_mmap_param *mp)
> +{
> +	int nr_threads = perf_thread_map__nr(evlist->threads);
> +	int nr_cpus    = perf_cpu_map__nr(evlist->all_cpus);
> +	int cpu, thread, idx = 0;
> +	int nr_mmaps = 0;
> +
> +	pr_debug("%s: nr cpu values (may include -1) %d nr threads %d\n",
> +		 __func__, nr_cpus, nr_threads);

-1 as cpu value is only for 'empty' perf_cpu_map, right?

> +
> +	/* per-thread mmaps */
> +	for (thread = 0; thread < nr_threads; thread++, idx++) {
> +		int output = -1;
> +		int output_overwrite = -1;
> +
> +		if (mmap_per_evsel(evlist, ops, idx, mp, 0, thread, &output,
> +				   &output_overwrite, &nr_mmaps))
> +			goto out_unmap;
> +	}
> +
> +	/* system-wide mmaps i.e. per-cpu */
> +	for (cpu = 1; cpu < nr_cpus; cpu++, idx++) {
> +		int output = -1;
> +		int output_overwrite = -1;
> +
> +		if (mmap_per_evsel(evlist, ops, idx, mp, cpu, 0, &output,
> +				   &output_overwrite, &nr_mmaps))
> +			goto out_unmap;
> +	}

will this loop be executed? we are in here because all_cpus is empty, right?

thanks,
jirka

> +
> +	if (nr_mmaps != evlist->nr_mmaps)
> +		pr_err("Miscounted nr_mmaps %d vs %d\n", nr_mmaps, evlist->nr_mmaps);
> +
> +	return 0;
> +
> +out_unmap:
> +	perf_evlist__munmap(evlist);
> +	return -1;
> +}

SNIP

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
  2022-09-06 12:59 ` Jiri Olsa
@ 2022-09-06 14:04   ` Adrian Hunter
  2022-09-06 17:43     ` Ian Rogers
  2022-09-06 19:45     ` Jiri Olsa
  0 siblings, 2 replies; 12+ messages in thread
From: Adrian Hunter @ 2022-09-06 14:04 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-kernel, linux-perf-users, Peter Zijlstra, Ingo Molnar

On 6/09/22 15:59, Jiri Olsa wrote:
> On Mon, Sep 05, 2022 at 02:42:09PM +0300, Adrian Hunter wrote:
> 
> SNIP
> 
>> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
>> index e6c98a6e3908..6b1bafe267a4 100644
>> --- a/tools/lib/perf/evlist.c
>> +++ b/tools/lib/perf/evlist.c
>> @@ -486,6 +486,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>>  			if (ops->idx)
>>  				ops->idx(evlist, evsel, mp, idx);
>>  
>> +			pr_debug("idx %d: mmapping fd %d\n", idx, *output);
>>  			if (ops->mmap(map, mp, *output, evlist_cpu) < 0)
>>  				return -1;
>>  
>> @@ -494,6 +495,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>>  			if (!idx)
>>  				perf_evlist__set_mmap_first(evlist, map, overwrite);
>>  		} else {
>> +			pr_debug("idx %d: set output fd %d -> %d\n", idx, fd, *output);
>>  			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
>>  				return -1;
>>  
>> @@ -519,6 +521,48 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>>  	return 0;
>>  }
>>  
>> +static int
>> +mmap_per_thread(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>> +		struct perf_mmap_param *mp)
>> +{
>> +	int nr_threads = perf_thread_map__nr(evlist->threads);
>> +	int nr_cpus    = perf_cpu_map__nr(evlist->all_cpus);
>> +	int cpu, thread, idx = 0;
>> +	int nr_mmaps = 0;
>> +
>> +	pr_debug("%s: nr cpu values (may include -1) %d nr threads %d\n",
>> +		 __func__, nr_cpus, nr_threads);
> 
> -1 as cpu value is only for 'empty' perf_cpu_map, right?

The cpu map is a map of valid 3rd arguments to perf_event_open, so -1
means all CPUs which is per-thread by necessity.

> 
>> +
>> +	/* per-thread mmaps */
>> +	for (thread = 0; thread < nr_threads; thread++, idx++) {
>> +		int output = -1;
>> +		int output_overwrite = -1;
>> +
>> +		if (mmap_per_evsel(evlist, ops, idx, mp, 0, thread, &output,
>> +				   &output_overwrite, &nr_mmaps))
>> +			goto out_unmap;
>> +	}
>> +
>> +	/* system-wide mmaps i.e. per-cpu */
>> +	for (cpu = 1; cpu < nr_cpus; cpu++, idx++) {
>> +		int output = -1;
>> +		int output_overwrite = -1;
>> +
>> +		if (mmap_per_evsel(evlist, ops, idx, mp, cpu, 0, &output,
>> +				   &output_overwrite, &nr_mmaps))
>> +			goto out_unmap;
>> +	}
> 
> will this loop be executed? we are in here because all_cpus is empty, right?

Yes it is executed.  I put back the code that was there before ae4f8ae16a07
("libperf evlist: Allow mixing per-thread and per-cpu mmaps"), which uses
perf_cpu_map__empty() which only checks the first entry is -1:

bool perf_cpu_map__empty(const struct perf_cpu_map *map)
{
	return map ? map->map[0].cpu == -1 : true;
}

But there can be more CPUs in the map, so perf_cpu_map__empty()
returns true for the per-thread case, as desired, even if there
are also system-wide CPUs.

I guess perf_cpu_map__empty() needs renaming.

> 
> thanks,
> jirka
> 
>> +
>> +	if (nr_mmaps != evlist->nr_mmaps)
>> +		pr_err("Miscounted nr_mmaps %d vs %d\n", nr_mmaps, evlist->nr_mmaps);
>> +
>> +	return 0;
>> +
>> +out_unmap:
>> +	perf_evlist__munmap(evlist);
>> +	return -1;
>> +}
> 
> SNIP


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
  2022-09-06 14:04   ` Adrian Hunter
@ 2022-09-06 17:43     ` Ian Rogers
  2022-09-06 21:17       ` Vince Weaver
  2022-09-06 19:45     ` Jiri Olsa
  1 sibling, 1 reply; 12+ messages in thread
From: Ian Rogers @ 2022-09-06 17:43 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	linux-kernel, linux-perf-users, Peter Zijlstra, Ingo Molnar

On Tue, Sep 6, 2022 at 7:04 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 6/09/22 15:59, Jiri Olsa wrote:
> > On Mon, Sep 05, 2022 at 02:42:09PM +0300, Adrian Hunter wrote:
> >
> > SNIP
> >
> >> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> >> index e6c98a6e3908..6b1bafe267a4 100644
> >> --- a/tools/lib/perf/evlist.c
> >> +++ b/tools/lib/perf/evlist.c
> >> @@ -486,6 +486,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
> >>                      if (ops->idx)
> >>                              ops->idx(evlist, evsel, mp, idx);
> >>
> >> +                    pr_debug("idx %d: mmapping fd %d\n", idx, *output);
> >>                      if (ops->mmap(map, mp, *output, evlist_cpu) < 0)
> >>                              return -1;
> >>
> >> @@ -494,6 +495,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
> >>                      if (!idx)
> >>                              perf_evlist__set_mmap_first(evlist, map, overwrite);
> >>              } else {
> >> +                    pr_debug("idx %d: set output fd %d -> %d\n", idx, fd, *output);
> >>                      if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
> >>                              return -1;
> >>
> >> @@ -519,6 +521,48 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
> >>      return 0;
> >>  }
> >>
> >> +static int
> >> +mmap_per_thread(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
> >> +            struct perf_mmap_param *mp)
> >> +{
> >> +    int nr_threads = perf_thread_map__nr(evlist->threads);
> >> +    int nr_cpus    = perf_cpu_map__nr(evlist->all_cpus);
> >> +    int cpu, thread, idx = 0;
> >> +    int nr_mmaps = 0;
> >> +
> >> +    pr_debug("%s: nr cpu values (may include -1) %d nr threads %d\n",
> >> +             __func__, nr_cpus, nr_threads);
> >
> > -1 as cpu value is only for 'empty' perf_cpu_map, right?
>
> The cpu map is a map of valid 3rd arguments to perf_event_open, so -1
> means all CPUs which is per-thread by necessity.
>
> >
> >> +
> >> +    /* per-thread mmaps */
> >> +    for (thread = 0; thread < nr_threads; thread++, idx++) {
> >> +            int output = -1;
> >> +            int output_overwrite = -1;
> >> +
> >> +            if (mmap_per_evsel(evlist, ops, idx, mp, 0, thread, &output,
> >> +                               &output_overwrite, &nr_mmaps))
> >> +                    goto out_unmap;
> >> +    }
> >> +
> >> +    /* system-wide mmaps i.e. per-cpu */
> >> +    for (cpu = 1; cpu < nr_cpus; cpu++, idx++) {
> >> +            int output = -1;
> >> +            int output_overwrite = -1;
> >> +
> >> +            if (mmap_per_evsel(evlist, ops, idx, mp, cpu, 0, &output,
> >> +                               &output_overwrite, &nr_mmaps))
> >> +                    goto out_unmap;
> >> +    }
> >
> > will this loop be executed? we are in here because all_cpus is empty, right?
>
> Yes it is executed.  I put back the code that was there before ae4f8ae16a07
> ("libperf evlist: Allow mixing per-thread and per-cpu mmaps"), which uses
> perf_cpu_map__empty() which only checks the first entry is -1:
>
> bool perf_cpu_map__empty(const struct perf_cpu_map *map)
> {
>         return map ? map->map[0].cpu == -1 : true;
> }
>
> But there can be more CPUs in the map, so perf_cpu_map__empty()
> returns true for the per-thread case, as desired, even if there
> are also system-wide CPUs.
>
> I guess perf_cpu_map__empty() needs renaming.

Let me nitpick :-) -1 means any CPU, as per the perf_event_open man
page, all CPUs would be a CPU map with every CPU on the system in it -
all CPUs can have some ambiguity in whether it includes offline CPUs,
which isn't true for any. I've not sent out a patch to clean up the
libperf CPU map code as I'd like to propose we start a libperf2 (with
LPC coming up it'd be a good place to discuss this, there's also the
office hours on Thursday). What I hope for libperf2 is that we can
make its license the same as libbpf, so the code can be more widely
included in projects. As such it wouldn't be valid to copy paste
libperf's CPU map into libperf2, but a larger clean up and refactor
could be put into libperf2 with libperf and perf depending upon it.
There are other areas that'd benefit from cleanup such as how a dash
is handled by parse events. It makes sense (imo) for this all to
become libperf2 work, and once we're happy with libperf2 we can
replace libperf completely.

Thanks,
Ian

> >
> > thanks,
> > jirka
> >
> >> +
> >> +    if (nr_mmaps != evlist->nr_mmaps)
> >> +            pr_err("Miscounted nr_mmaps %d vs %d\n", nr_mmaps, evlist->nr_mmaps);
> >> +
> >> +    return 0;
> >> +
> >> +out_unmap:
> >> +    perf_evlist__munmap(evlist);
> >> +    return -1;
> >> +}
> >
> > SNIP
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
  2022-09-05 11:42 [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets Adrian Hunter
  2022-09-06 12:53 ` Arnaldo Carvalho de Melo
  2022-09-06 12:59 ` Jiri Olsa
@ 2022-09-06 17:50 ` Namhyung Kim
  2022-09-07  4:52   ` Adrian Hunter
  2022-09-07 14:20 ` Jiri Olsa
  3 siblings, 1 reply; 12+ messages in thread
From: Namhyung Kim @ 2022-09-06 17:50 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers, linux-kernel,
	linux-perf-users, Peter Zijlstra, Ingo Molnar

On Mon, Sep 5, 2022 at 4:42 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> The offending commit removed mmap_per_thread(), which did not consider
> the different set-output rules for per-thread mmaps i.e. in the per-thread
> case set-output is used for file descriptors of the same thread not the
> same cpu.
>
> This was not immediately noticed because it only happens with
> multi-threaded targets and we do not have a test for that yet.

Yeah, this is unfortunate.  I feel like I need to spend some time on it.

>
> Reinstate mmap_per_thread() expanding it to cover also system-wide per-cpu
> events i.e. to continue to allow the mixing of per-thread and per-cpu
> mmaps.
>
> Debug messages (with -vv) show the file descriptors that are opened with
> sys_perf_event_open. New debug messages are added (needs -vvv) that show
> also which file descriptors are mmapped and which are redirected with
> set-output.
>
> In the per-cpu case (cpu != -1) file descriptors for the same CPU are
> set-output to the first file descriptor for that CPU.
>
> In the per-thread case (cpu == -1) file descriptors for the same thread are
> set-output to the first file descriptor for that thread.
>
> Example (process 17489 has 2 threads):
>
>  Before (but with new debug prints):
>
>    $ perf record --no-bpf-event -vvv --per-thread -p 17489
>    <SNIP>
>    sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
>    sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
>    <SNIP>
>    libperf: idx 0: mmapping fd 5
>    libperf: idx 0: set output fd 6 -> 5
>    failed to mmap with 22 (Invalid argument)
>
>  After:
>
>    $ perf record --no-bpf-event -vvv --per-thread -p 17489
>    <SNIP>
>    sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
>    sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
>    <SNIP>
>    libperf: mmap_per_thread: nr cpu values (may include -1) 1 nr threads 2
>    libperf: idx 0: mmapping fd 5
>    libperf: idx 1: mmapping fd 6
>    <SNIP>
>    [ perf record: Woken up 2 times to write data ]
>    [ perf record: Captured and wrote 0.018 MB perf.data (15 samples) ]

It'd be nice if the example had 2 events so that it could check the
set-output rule actually worked.

Thanks,
Namhyung

>
> Per-cpu example (process 20341 has 2 threads, same as above):
>
>    $ perf record --no-bpf-event -vvv -p 20341
>    <SNIP>
>    sys_perf_event_open: pid 20341  cpu 0  group_fd -1  flags 0x8 = 5
>    sys_perf_event_open: pid 20342  cpu 0  group_fd -1  flags 0x8 = 6
>    sys_perf_event_open: pid 20341  cpu 1  group_fd -1  flags 0x8 = 7
>    sys_perf_event_open: pid 20342  cpu 1  group_fd -1  flags 0x8 = 8
>    sys_perf_event_open: pid 20341  cpu 2  group_fd -1  flags 0x8 = 9
>    sys_perf_event_open: pid 20342  cpu 2  group_fd -1  flags 0x8 = 10
>    sys_perf_event_open: pid 20341  cpu 3  group_fd -1  flags 0x8 = 11
>    sys_perf_event_open: pid 20342  cpu 3  group_fd -1  flags 0x8 = 12
>    sys_perf_event_open: pid 20341  cpu 4  group_fd -1  flags 0x8 = 13
>    sys_perf_event_open: pid 20342  cpu 4  group_fd -1  flags 0x8 = 14
>    sys_perf_event_open: pid 20341  cpu 5  group_fd -1  flags 0x8 = 15
>    sys_perf_event_open: pid 20342  cpu 5  group_fd -1  flags 0x8 = 16
>    sys_perf_event_open: pid 20341  cpu 6  group_fd -1  flags 0x8 = 17
>    sys_perf_event_open: pid 20342  cpu 6  group_fd -1  flags 0x8 = 18
>    sys_perf_event_open: pid 20341  cpu 7  group_fd -1  flags 0x8 = 19
>    sys_perf_event_open: pid 20342  cpu 7  group_fd -1  flags 0x8 = 20
>    <SNIP>
>    libperf: mmap_per_cpu: nr cpu values 8 nr threads 2
>    libperf: idx 0: mmapping fd 5
>    libperf: idx 0: set output fd 6 -> 5
>    libperf: idx 1: mmapping fd 7
>    libperf: idx 1: set output fd 8 -> 7
>    libperf: idx 2: mmapping fd 9
>    libperf: idx 2: set output fd 10 -> 9
>    libperf: idx 3: mmapping fd 11
>    libperf: idx 3: set output fd 12 -> 11
>    libperf: idx 4: mmapping fd 13
>    libperf: idx 4: set output fd 14 -> 13
>    libperf: idx 5: mmapping fd 15
>    libperf: idx 5: set output fd 16 -> 15
>    libperf: idx 6: mmapping fd 17
>    libperf: idx 6: set output fd 18 -> 17
>    libperf: idx 7: mmapping fd 19
>    libperf: idx 7: set output fd 20 -> 19
>    <SNIP>
>    [ perf record: Woken up 7 times to write data ]
>    [ perf record: Captured and wrote 0.020 MB perf.data (17 samples) ]
>
> Fixes: ae4f8ae16a07 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
  2022-09-06 14:04   ` Adrian Hunter
  2022-09-06 17:43     ` Ian Rogers
@ 2022-09-06 19:45     ` Jiri Olsa
  2022-09-07  5:05       ` Adrian Hunter
  1 sibling, 1 reply; 12+ messages in thread
From: Jiri Olsa @ 2022-09-06 19:45 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Jiri Olsa, Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim,
	Ian Rogers, linux-kernel, linux-perf-users, Peter Zijlstra,
	Ingo Molnar

On Tue, Sep 06, 2022 at 05:04:45PM +0300, Adrian Hunter wrote:
> On 6/09/22 15:59, Jiri Olsa wrote:
> > On Mon, Sep 05, 2022 at 02:42:09PM +0300, Adrian Hunter wrote:
> > 
> > SNIP
> > 
> >> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
> >> index e6c98a6e3908..6b1bafe267a4 100644
> >> --- a/tools/lib/perf/evlist.c
> >> +++ b/tools/lib/perf/evlist.c
> >> @@ -486,6 +486,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
> >>  			if (ops->idx)
> >>  				ops->idx(evlist, evsel, mp, idx);
> >>  
> >> +			pr_debug("idx %d: mmapping fd %d\n", idx, *output);
> >>  			if (ops->mmap(map, mp, *output, evlist_cpu) < 0)
> >>  				return -1;
> >>  
> >> @@ -494,6 +495,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
> >>  			if (!idx)
> >>  				perf_evlist__set_mmap_first(evlist, map, overwrite);
> >>  		} else {
> >> +			pr_debug("idx %d: set output fd %d -> %d\n", idx, fd, *output);
> >>  			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
> >>  				return -1;
> >>  
> >> @@ -519,6 +521,48 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
> >>  	return 0;
> >>  }
> >>  
> >> +static int
> >> +mmap_per_thread(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
> >> +		struct perf_mmap_param *mp)
> >> +{
> >> +	int nr_threads = perf_thread_map__nr(evlist->threads);
> >> +	int nr_cpus    = perf_cpu_map__nr(evlist->all_cpus);
> >> +	int cpu, thread, idx = 0;
> >> +	int nr_mmaps = 0;
> >> +
> >> +	pr_debug("%s: nr cpu values (may include -1) %d nr threads %d\n",
> >> +		 __func__, nr_cpus, nr_threads);
> > 
> > -1 as cpu value is only for 'empty' perf_cpu_map, right?
> 
> The cpu map is a map of valid 3rd arguments to perf_event_open, so -1
> means all CPUs which is per-thread by necessity.
> 
> > 
> >> +
> >> +	/* per-thread mmaps */
> >> +	for (thread = 0; thread < nr_threads; thread++, idx++) {
> >> +		int output = -1;
> >> +		int output_overwrite = -1;
> >> +
> >> +		if (mmap_per_evsel(evlist, ops, idx, mp, 0, thread, &output,
> >> +				   &output_overwrite, &nr_mmaps))
> >> +			goto out_unmap;
> >> +	}
> >> +
> >> +	/* system-wide mmaps i.e. per-cpu */
> >> +	for (cpu = 1; cpu < nr_cpus; cpu++, idx++) {
> >> +		int output = -1;
> >> +		int output_overwrite = -1;
> >> +
> >> +		if (mmap_per_evsel(evlist, ops, idx, mp, cpu, 0, &output,
> >> +				   &output_overwrite, &nr_mmaps))
> >> +			goto out_unmap;
> >> +	}
> > 
> > will this loop be executed? we are in here because all_cpus is empty, right?
> 
> Yes it is executed.  I put back the code that was there before ae4f8ae16a07
> ("libperf evlist: Allow mixing per-thread and per-cpu mmaps"), which uses

hm, but commit ae4f8ae16a07 does not have similar cpu loop

> perf_cpu_map__empty() which only checks the first entry is -1:
> 
> bool perf_cpu_map__empty(const struct perf_cpu_map *map)
> {
> 	return map ? map->map[0].cpu == -1 : true;
> }
> 
> But there can be more CPUs in the map, so perf_cpu_map__empty()
> returns true for the per-thread case, as desired, even if there
> are also system-wide CPUs.

I don't see how, if I'd see -1 together with other cpu values in
perf_cpu_map I'd think it's a bug, but I might be missing some
auxtrace usage,

I thought we use -1 just for empty cpu map, so in per-thread case
-1 is properly passed to perf_event_open syscall

jirka

> 
> I guess perf_cpu_map__empty() needs renaming.
> 
> > 
> > thanks,
> > jirka
> > 
> >> +
> >> +	if (nr_mmaps != evlist->nr_mmaps)
> >> +		pr_err("Miscounted nr_mmaps %d vs %d\n", nr_mmaps, evlist->nr_mmaps);
> >> +
> >> +	return 0;
> >> +
> >> +out_unmap:
> >> +	perf_evlist__munmap(evlist);
> >> +	return -1;
> >> +}
> > 
> > SNIP
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
  2022-09-06 17:43     ` Ian Rogers
@ 2022-09-06 21:17       ` Vince Weaver
  0 siblings, 0 replies; 12+ messages in thread
From: Vince Weaver @ 2022-09-06 21:17 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Adrian Hunter, Jiri Olsa, Arnaldo Carvalho de Melo, Jiri Olsa,
	Namhyung Kim, linux-kernel, linux-perf-users, Peter Zijlstra,
	Ingo Molnar

On Tue, 6 Sep 2022, Ian Rogers wrote:

> > >>                              perf_evlist__set_mmap_first(evlist, map, overwrite);
> > >>              } else {
> libperf CPU map code as I'd like to propose we start a libperf2 (with
> LPC coming up it'd be a good place to discuss this, there's also the
> office hours on Thursday). What I hope for libperf2 is that we can
> make its license the same as libbpf, so the code can be more widely
> included in projects.

If you did plan to do this, it might be nice to include some 
representatives from groups that would be likely to use such a library.  
This might include the PAPI library developers, and there are various 
other tools (especially in the HPC area) who are coding to 
perf_event_open() directly but probably would like to use a library 
instead if possible.

Vince Weaver
vincent.weaver@maine.edu

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
  2022-09-06 17:50 ` Namhyung Kim
@ 2022-09-07  4:52   ` Adrian Hunter
  0 siblings, 0 replies; 12+ messages in thread
From: Adrian Hunter @ 2022-09-07  4:52 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Ian Rogers, linux-kernel,
	linux-perf-users, Peter Zijlstra, Ingo Molnar

On 6/09/22 20:50, Namhyung Kim wrote:
> On Mon, Sep 5, 2022 at 4:42 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>>
>> The offending commit removed mmap_per_thread(), which did not consider
>> the different set-output rules for per-thread mmaps i.e. in the per-thread
>> case set-output is used for file descriptors of the same thread not the
>> same cpu.
>>
>> This was not immediately noticed because it only happens with
>> multi-threaded targets and we do not have a test for that yet.
> 
> Yeah, this is unfortunate.  I feel like I need to spend some time on it.
> 
>>
>> Reinstate mmap_per_thread() expanding it to cover also system-wide per-cpu
>> events i.e. to continue to allow the mixing of per-thread and per-cpu
>> mmaps.
>>
>> Debug messages (with -vv) show the file descriptors that are opened with
>> sys_perf_event_open. New debug messages are added (needs -vvv) that show
>> also which file descriptors are mmapped and which are redirected with
>> set-output.
>>
>> In the per-cpu case (cpu != -1) file descriptors for the same CPU are
>> set-output to the first file descriptor for that CPU.
>>
>> In the per-thread case (cpu == -1) file descriptors for the same thread are
>> set-output to the first file descriptor for that thread.
>>
>> Example (process 17489 has 2 threads):
>>
>>  Before (but with new debug prints):
>>
>>    $ perf record --no-bpf-event -vvv --per-thread -p 17489
>>    <SNIP>
>>    sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
>>    sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
>>    <SNIP>
>>    libperf: idx 0: mmapping fd 5
>>    libperf: idx 0: set output fd 6 -> 5
>>    failed to mmap with 22 (Invalid argument)
>>
>>  After:
>>
>>    $ perf record --no-bpf-event -vvv --per-thread -p 17489
>>    <SNIP>
>>    sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
>>    sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
>>    <SNIP>
>>    libperf: mmap_per_thread: nr cpu values (may include -1) 1 nr threads 2
>>    libperf: idx 0: mmapping fd 5
>>    libperf: idx 1: mmapping fd 6
>>    <SNIP>
>>    [ perf record: Woken up 2 times to write data ]
>>    [ perf record: Captured and wrote 0.018 MB perf.data (15 samples) ]
> 
> It'd be nice if the example had 2 events so that it could check the
> set-output rule actually worked.

Here is an Intel PT example

$ perf record -vvv --per-thread -p 1521 -e intel_pt//
Using CPUID GenuineIntel-6-8C-1
intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
DEBUGINFOD_URLS=
nr_cblocks: 0
affinity: SYS
mmap flush: 1
comp level: 0
------------------------------------------------------------
perf_event_attr:
  type                             9
  size                             128
  config                           0x300e601
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|IDENTIFIER
  read_format                      ID
  disabled                         1
  sample_id_all                    1
  exclude_guest                    1
  aux_watermark                    1048576
------------------------------------------------------------
sys_perf_event_open: pid 1521  cpu -1  group_fd -1  flags 0x8 = 5
sys_perf_event_open: pid 1522  cpu -1  group_fd -1  flags 0x8 = 6
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             128
  config                           0x9
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|IDENTIFIER
  read_format                      ID
  disabled                         1
  exclude_kernel                   1
  exclude_hv                       1
  mmap                             1
  comm                             1
  task                             1
  sample_id_all                    1
  exclude_guest                    1
  mmap2                            1
  comm_exec                        1
  bpf_event                        1
------------------------------------------------------------
sys_perf_event_open: pid 1521  cpu -1  group_fd -1  flags 0x8 = 7
sys_perf_event_open: pid 1522  cpu -1  group_fd -1  flags 0x8 = 8
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             128
  config                           0x9
  { sample_period, sample_freq }   1
  sample_type                      IP|TID|TIME|IDENTIFIER
  read_format                      ID
  exclude_kernel                   1
  exclude_hv                       1
  sample_id_all                    1
  exclude_guest                    1
  ksymbol                          1
  text_poke                        1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 9
sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 10
sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 11
sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 12
sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 13
sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 14
sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 15
sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 16
mmap size 528384B
AUX area mmap length 4194304
libperf: mmap_per_thread: nr cpu values (may include -1) 9 nr threads 2
libperf: idx 0: mmapping fd 5
libperf: idx 0: set output fd 7 -> 5
libperf: idx 1: mmapping fd 6
libperf: idx 1: set output fd 8 -> 6
libperf: idx 2: mmapping fd 9
libperf: idx 3: mmapping fd 10
libperf: idx 4: mmapping fd 11
libperf: idx 5: mmapping fd 12
libperf: idx 6: mmapping fd 13
libperf: idx 7: mmapping fd 14
libperf: idx 8: mmapping fd 15
libperf: idx 9: mmapping fd 16
Control descriptor is not initialized
thread_data[0x5572a73ec2f0]: nr_mmaps=10, maps=0x5572a73ec380, ow_maps=(nil)
thread_data[0x5572a73ec2f0]: cpu-1: maps[0] -> mmap[0]
thread_data[0x5572a73ec2f0]: cpu0: maps[1] -> mmap[1]
thread_data[0x5572a73ec2f0]: cpu1: maps[2] -> mmap[2]
thread_data[0x5572a73ec2f0]: cpu2: maps[3] -> mmap[3]
thread_data[0x5572a73ec2f0]: cpu3: maps[4] -> mmap[4]
thread_data[0x5572a73ec2f0]: cpu4: maps[5] -> mmap[5]
thread_data[0x5572a73ec2f0]: cpu5: maps[6] -> mmap[6]
thread_data[0x5572a73ec2f0]: cpu6: maps[7] -> mmap[7]
thread_data[0x5572a73ec2f0]: cpu7: maps[8] -> mmap[8]
thread_data[0x5572a73ec2f0]: cpu-1: maps[9] -> mmap[9]
thread_data[0x5572a73ec2f0]: pollfd[0] <- event_fd=5
thread_data[0x5572a73ec2f0]: pollfd[1] <- event_fd=7
thread_data[0x5572a73ec2f0]: pollfd[2] <- event_fd=6
thread_data[0x5572a73ec2f0]: pollfd[3] <- event_fd=8
thread_data[0x5572a73ec2f0]: pollfd[4] <- non_perf_event fd=4
------------------------------------------------------------
perf_event_attr:
  type                             1
  size                             128
  config                           0x9
  watermark                        1
  sample_id_all                    1
  bpf_event                        1
  { wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 1521  cpu -1  group_fd -1  flags 0x8 = 17
sys_perf_event_open: pid 1522  cpu -1  group_fd -1  flags 0x8 = 18
mmap size 528384B
libperf: mmap_per_thread: nr cpu values (may include -1) 1 nr threads 2
libperf: idx 0: mmapping fd 17
libperf: idx 1: mmapping fd 18
Synthesizing TSC conversion information
Synthesizing id index
Synthesizing auxtrace information
auxtrace idx 1 old 0 head 0x2c0 diff 0x2c0
auxtrace idx 0 old 0 head 0x2060 diff 0x2060
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.029 MB perf.data ]




> 
> Thanks,
> Namhyung
> 
>>
>> Per-cpu example (process 20341 has 2 threads, same as above):
>>
>>    $ perf record --no-bpf-event -vvv -p 20341
>>    <SNIP>
>>    sys_perf_event_open: pid 20341  cpu 0  group_fd -1  flags 0x8 = 5
>>    sys_perf_event_open: pid 20342  cpu 0  group_fd -1  flags 0x8 = 6
>>    sys_perf_event_open: pid 20341  cpu 1  group_fd -1  flags 0x8 = 7
>>    sys_perf_event_open: pid 20342  cpu 1  group_fd -1  flags 0x8 = 8
>>    sys_perf_event_open: pid 20341  cpu 2  group_fd -1  flags 0x8 = 9
>>    sys_perf_event_open: pid 20342  cpu 2  group_fd -1  flags 0x8 = 10
>>    sys_perf_event_open: pid 20341  cpu 3  group_fd -1  flags 0x8 = 11
>>    sys_perf_event_open: pid 20342  cpu 3  group_fd -1  flags 0x8 = 12
>>    sys_perf_event_open: pid 20341  cpu 4  group_fd -1  flags 0x8 = 13
>>    sys_perf_event_open: pid 20342  cpu 4  group_fd -1  flags 0x8 = 14
>>    sys_perf_event_open: pid 20341  cpu 5  group_fd -1  flags 0x8 = 15
>>    sys_perf_event_open: pid 20342  cpu 5  group_fd -1  flags 0x8 = 16
>>    sys_perf_event_open: pid 20341  cpu 6  group_fd -1  flags 0x8 = 17
>>    sys_perf_event_open: pid 20342  cpu 6  group_fd -1  flags 0x8 = 18
>>    sys_perf_event_open: pid 20341  cpu 7  group_fd -1  flags 0x8 = 19
>>    sys_perf_event_open: pid 20342  cpu 7  group_fd -1  flags 0x8 = 20
>>    <SNIP>
>>    libperf: mmap_per_cpu: nr cpu values 8 nr threads 2
>>    libperf: idx 0: mmapping fd 5
>>    libperf: idx 0: set output fd 6 -> 5
>>    libperf: idx 1: mmapping fd 7
>>    libperf: idx 1: set output fd 8 -> 7
>>    libperf: idx 2: mmapping fd 9
>>    libperf: idx 2: set output fd 10 -> 9
>>    libperf: idx 3: mmapping fd 11
>>    libperf: idx 3: set output fd 12 -> 11
>>    libperf: idx 4: mmapping fd 13
>>    libperf: idx 4: set output fd 14 -> 13
>>    libperf: idx 5: mmapping fd 15
>>    libperf: idx 5: set output fd 16 -> 15
>>    libperf: idx 6: mmapping fd 17
>>    libperf: idx 6: set output fd 18 -> 17
>>    libperf: idx 7: mmapping fd 19
>>    libperf: idx 7: set output fd 20 -> 19
>>    <SNIP>
>>    [ perf record: Woken up 7 times to write data ]
>>    [ perf record: Captured and wrote 0.020 MB perf.data (17 samples) ]
>>
>> Fixes: ae4f8ae16a07 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
  2022-09-06 19:45     ` Jiri Olsa
@ 2022-09-07  5:05       ` Adrian Hunter
  0 siblings, 0 replies; 12+ messages in thread
From: Adrian Hunter @ 2022-09-07  5:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-kernel, linux-perf-users, Peter Zijlstra, Ingo Molnar

On 6/09/22 22:45, Jiri Olsa wrote:
> On Tue, Sep 06, 2022 at 05:04:45PM +0300, Adrian Hunter wrote:
>> On 6/09/22 15:59, Jiri Olsa wrote:
>>> On Mon, Sep 05, 2022 at 02:42:09PM +0300, Adrian Hunter wrote:
>>>
>>> SNIP
>>>
>>>> diff --git a/tools/lib/perf/evlist.c b/tools/lib/perf/evlist.c
>>>> index e6c98a6e3908..6b1bafe267a4 100644
>>>> --- a/tools/lib/perf/evlist.c
>>>> +++ b/tools/lib/perf/evlist.c
>>>> @@ -486,6 +486,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>>>>  			if (ops->idx)
>>>>  				ops->idx(evlist, evsel, mp, idx);
>>>>  
>>>> +			pr_debug("idx %d: mmapping fd %d\n", idx, *output);
>>>>  			if (ops->mmap(map, mp, *output, evlist_cpu) < 0)
>>>>  				return -1;
>>>>  
>>>> @@ -494,6 +495,7 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>>>>  			if (!idx)
>>>>  				perf_evlist__set_mmap_first(evlist, map, overwrite);
>>>>  		} else {
>>>> +			pr_debug("idx %d: set output fd %d -> %d\n", idx, fd, *output);
>>>>  			if (ioctl(fd, PERF_EVENT_IOC_SET_OUTPUT, *output) != 0)
>>>>  				return -1;
>>>>  
>>>> @@ -519,6 +521,48 @@ mmap_per_evsel(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>>>>  	return 0;
>>>>  }
>>>>  
>>>> +static int
>>>> +mmap_per_thread(struct perf_evlist *evlist, struct perf_evlist_mmap_ops *ops,
>>>> +		struct perf_mmap_param *mp)
>>>> +{
>>>> +	int nr_threads = perf_thread_map__nr(evlist->threads);
>>>> +	int nr_cpus    = perf_cpu_map__nr(evlist->all_cpus);
>>>> +	int cpu, thread, idx = 0;
>>>> +	int nr_mmaps = 0;
>>>> +
>>>> +	pr_debug("%s: nr cpu values (may include -1) %d nr threads %d\n",
>>>> +		 __func__, nr_cpus, nr_threads);
>>>
>>> -1 as cpu value is only for 'empty' perf_cpu_map, right?
>>
>> The cpu map is a map of valid 3rd arguments to perf_event_open, so -1
>> means all CPUs which is per-thread by necessity.
>>
>>>
>>>> +
>>>> +	/* per-thread mmaps */
>>>> +	for (thread = 0; thread < nr_threads; thread++, idx++) {
>>>> +		int output = -1;
>>>> +		int output_overwrite = -1;
>>>> +
>>>> +		if (mmap_per_evsel(evlist, ops, idx, mp, 0, thread, &output,
>>>> +				   &output_overwrite, &nr_mmaps))
>>>> +			goto out_unmap;
>>>> +	}
>>>> +
>>>> +	/* system-wide mmaps i.e. per-cpu */
>>>> +	for (cpu = 1; cpu < nr_cpus; cpu++, idx++) {
>>>> +		int output = -1;
>>>> +		int output_overwrite = -1;
>>>> +
>>>> +		if (mmap_per_evsel(evlist, ops, idx, mp, cpu, 0, &output,
>>>> +				   &output_overwrite, &nr_mmaps))
>>>> +			goto out_unmap;
>>>> +	}
>>>
>>> will this loop be executed? we are in here because all_cpus is empty, right?
>>
>> Yes it is executed.  I put back the code that was there before ae4f8ae16a07
>> ("libperf evlist: Allow mixing per-thread and per-cpu mmaps"), which uses
> 
> hm, but commit ae4f8ae16a07 does not have similar cpu loop

It was calling mmap_per_cpu() for that case.

The 2 cases: mmap_per_cpu() and mmap_per_thread() could still be
combined into a single function.

> 
>> perf_cpu_map__empty() which only checks the first entry is -1:
>>
>> bool perf_cpu_map__empty(const struct perf_cpu_map *map)
>> {
>> 	return map ? map->map[0].cpu == -1 : true;
>> }
>>
>> But there can be more CPUs in the map, so perf_cpu_map__empty()
>> returns true for the per-thread case, as desired, even if there
>> are also system-wide CPUs.
> 
> I don't see how, if I'd see -1 together with other cpu values in
> perf_cpu_map I'd think it's a bug, but I might be missing some
> auxtrace usage,

Yes, it is for system-wide collection of events that can affect
every CPU.  Currently text_poke is always system-wide - see the
Intel PT example.

> 
> I thought we use -1 just for empty cpu map, so in per-thread case
> -1 is properly passed to perf_event_open syscall

Yes, but it does not need to be limited to that case.

> 
> jirka
> 
>>
>> I guess perf_cpu_map__empty() needs renaming.
>>
>>>
>>> thanks,
>>> jirka
>>>
>>>> +
>>>> +	if (nr_mmaps != evlist->nr_mmaps)
>>>> +		pr_err("Miscounted nr_mmaps %d vs %d\n", nr_mmaps, evlist->nr_mmaps);
>>>> +
>>>> +	return 0;
>>>> +
>>>> +out_unmap:
>>>> +	perf_evlist__munmap(evlist);
>>>> +	return -1;
>>>> +}
>>>
>>> SNIP
>>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
  2022-09-05 11:42 [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets Adrian Hunter
                   ` (2 preceding siblings ...)
  2022-09-06 17:50 ` Namhyung Kim
@ 2022-09-07 14:20 ` Jiri Olsa
  2022-09-08 15:18   ` Arnaldo Carvalho de Melo
  3 siblings, 1 reply; 12+ messages in thread
From: Jiri Olsa @ 2022-09-07 14:20 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Namhyung Kim, Ian Rogers,
	linux-kernel, linux-perf-users, Peter Zijlstra, Ingo Molnar

On Mon, Sep 05, 2022 at 02:42:09PM +0300, Adrian Hunter wrote:
> The offending commit removed mmap_per_thread(), which did not consider
> the different set-output rules for per-thread mmaps i.e. in the per-thread
> case set-output is used for file descriptors of the same thread not the
> same cpu.
> 
> This was not immediately noticed because it only happens with
> multi-threaded targets and we do not have a test for that yet.
> 
> Reinstate mmap_per_thread() expanding it to cover also system-wide per-cpu
> events i.e. to continue to allow the mixing of per-thread and per-cpu
> mmaps.
> 
> Debug messages (with -vv) show the file descriptors that are opened with
> sys_perf_event_open. New debug messages are added (needs -vvv) that show
> also which file descriptors are mmapped and which are redirected with
> set-output.
> 
> In the per-cpu case (cpu != -1) file descriptors for the same CPU are
> set-output to the first file descriptor for that CPU.
> 
> In the per-thread case (cpu == -1) file descriptors for the same thread are
> set-output to the first file descriptor for that thread.
> 
> Example (process 17489 has 2 threads):
> 
>  Before (but with new debug prints):
> 
>    $ perf record --no-bpf-event -vvv --per-thread -p 17489
>    <SNIP>
>    sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
>    sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
>    <SNIP>
>    libperf: idx 0: mmapping fd 5
>    libperf: idx 0: set output fd 6 -> 5
>    failed to mmap with 22 (Invalid argument)
> 
>  After:
> 
>    $ perf record --no-bpf-event -vvv --per-thread -p 17489
>    <SNIP>
>    sys_perf_event_open: pid 17489  cpu -1  group_fd -1  flags 0x8 = 5
>    sys_perf_event_open: pid 17490  cpu -1  group_fd -1  flags 0x8 = 6
>    <SNIP>
>    libperf: mmap_per_thread: nr cpu values (may include -1) 1 nr threads 2
>    libperf: idx 0: mmapping fd 5
>    libperf: idx 1: mmapping fd 6
>    <SNIP>
>    [ perf record: Woken up 2 times to write data ]
>    [ perf record: Captured and wrote 0.018 MB perf.data (15 samples) ]
> 
> Per-cpu example (process 20341 has 2 threads, same as above):
> 
>    $ perf record --no-bpf-event -vvv -p 20341
>    <SNIP>
>    sys_perf_event_open: pid 20341  cpu 0  group_fd -1  flags 0x8 = 5
>    sys_perf_event_open: pid 20342  cpu 0  group_fd -1  flags 0x8 = 6
>    sys_perf_event_open: pid 20341  cpu 1  group_fd -1  flags 0x8 = 7
>    sys_perf_event_open: pid 20342  cpu 1  group_fd -1  flags 0x8 = 8
>    sys_perf_event_open: pid 20341  cpu 2  group_fd -1  flags 0x8 = 9
>    sys_perf_event_open: pid 20342  cpu 2  group_fd -1  flags 0x8 = 10
>    sys_perf_event_open: pid 20341  cpu 3  group_fd -1  flags 0x8 = 11
>    sys_perf_event_open: pid 20342  cpu 3  group_fd -1  flags 0x8 = 12
>    sys_perf_event_open: pid 20341  cpu 4  group_fd -1  flags 0x8 = 13
>    sys_perf_event_open: pid 20342  cpu 4  group_fd -1  flags 0x8 = 14
>    sys_perf_event_open: pid 20341  cpu 5  group_fd -1  flags 0x8 = 15
>    sys_perf_event_open: pid 20342  cpu 5  group_fd -1  flags 0x8 = 16
>    sys_perf_event_open: pid 20341  cpu 6  group_fd -1  flags 0x8 = 17
>    sys_perf_event_open: pid 20342  cpu 6  group_fd -1  flags 0x8 = 18
>    sys_perf_event_open: pid 20341  cpu 7  group_fd -1  flags 0x8 = 19
>    sys_perf_event_open: pid 20342  cpu 7  group_fd -1  flags 0x8 = 20
>    <SNIP>
>    libperf: mmap_per_cpu: nr cpu values 8 nr threads 2
>    libperf: idx 0: mmapping fd 5
>    libperf: idx 0: set output fd 6 -> 5
>    libperf: idx 1: mmapping fd 7
>    libperf: idx 1: set output fd 8 -> 7
>    libperf: idx 2: mmapping fd 9
>    libperf: idx 2: set output fd 10 -> 9
>    libperf: idx 3: mmapping fd 11
>    libperf: idx 3: set output fd 12 -> 11
>    libperf: idx 4: mmapping fd 13
>    libperf: idx 4: set output fd 14 -> 13
>    libperf: idx 5: mmapping fd 15
>    libperf: idx 5: set output fd 16 -> 15
>    libperf: idx 6: mmapping fd 17
>    libperf: idx 6: set output fd 18 -> 17
>    libperf: idx 7: mmapping fd 19
>    libperf: idx 7: set output fd 20 -> 19
>    <SNIP>
>    [ perf record: Woken up 7 times to write data ]
>    [ perf record: Captured and wrote 0.020 MB perf.data (17 samples) ]
> 
> Fixes: ae4f8ae16a07 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets
  2022-09-07 14:20 ` Jiri Olsa
@ 2022-09-08 15:18   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2022-09-08 15:18 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Adrian Hunter, Jiri Olsa, Namhyung Kim, Ian Rogers, linux-kernel,
	linux-perf-users, Peter Zijlstra, Ingo Molnar

Em Wed, Sep 07, 2022 at 04:20:59PM +0200, Jiri Olsa escreveu:
> On Mon, Sep 05, 2022 at 02:42:09PM +0300, Adrian Hunter wrote:
> > Fixes: ae4f8ae16a07 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
> > Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> 
> Acked-by: Jiri Olsa <jolsa@kernel.org>

Thanks, added,

- Arnaldo

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-09-08 15:18 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-05 11:42 [PATCH V2] libperf evlist: Fix per-thread mmaps for multi-threaded targets Adrian Hunter
2022-09-06 12:53 ` Arnaldo Carvalho de Melo
2022-09-06 12:59 ` Jiri Olsa
2022-09-06 14:04   ` Adrian Hunter
2022-09-06 17:43     ` Ian Rogers
2022-09-06 21:17       ` Vince Weaver
2022-09-06 19:45     ` Jiri Olsa
2022-09-07  5:05       ` Adrian Hunter
2022-09-06 17:50 ` Namhyung Kim
2022-09-07  4:52   ` Adrian Hunter
2022-09-07 14:20 ` Jiri Olsa
2022-09-08 15:18   ` Arnaldo Carvalho de Melo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.