linux-kernel.vger.kernel.org archive mirror
* [PATCH] perf cs-etm: Remove duplicate and incorrect aux size checks
@ 2021-12-08 11:54 James Clark
  2021-12-08 13:17 ` Leo Yan
  0 siblings, 1 reply; 8+ messages in thread
From: James Clark @ 2021-12-08 11:54 UTC (permalink / raw)
  To: mathieu.poirier, coresight
  Cc: suzuki.poulose, James Clark, Mike Leach, Leo Yan, John Garry,
	Will Deacon, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, linux-arm-kernel, linux-perf-users, linux-kernel

There are two checks, one is for size when running without admin, but
this one is covered by the driver and reported on in more detail here
(builtin-record.c):

  pr_err("Permission error mapping pages.\n"
         "Consider increasing "
         "/proc/sys/kernel/perf_event_mlock_kb,\n"
         "or try again with a smaller value of -m/--mmap_pages.\n"
         "(current value: %u,%u)\n",

This had the effect of artificially limiting the aux buffer size to a
value smaller than what was allowed because perf_event_mlock_kb wasn't
taken into account.

The second is to check for a power of two, but this is covered here
(evlist.c):

  pr_info("rounding mmap pages size to %s (%lu pages)\n",
          buf, pages);

Signed-off-by: James Clark <james.clark@arm.com>
---
 tools/perf/arch/arm/util/cs-etm.c | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c
index 293a23bf8be3..8a3d54a86c9c 100644
--- a/tools/perf/arch/arm/util/cs-etm.c
+++ b/tools/perf/arch/arm/util/cs-etm.c
@@ -407,25 +407,6 @@ static int cs_etm_recording_options(struct auxtrace_record *itr,
 
 	}
 
-	/* Validate auxtrace_mmap_pages provided by user */
-	if (opts->auxtrace_mmap_pages) {
-		unsigned int max_page = (KiB(128) / page_size);
-		size_t sz = opts->auxtrace_mmap_pages * (size_t)page_size;
-
-		if (!privileged &&
-		    opts->auxtrace_mmap_pages > max_page) {
-			opts->auxtrace_mmap_pages = max_page;
-			pr_err("auxtrace too big, truncating to %d\n",
-			       max_page);
-		}
-
-		if (!is_power_of_2(sz)) {
-			pr_err("Invalid mmap size for %s: must be a power of 2\n",
-			       CORESIGHT_ETM_PMU_NAME);
-			return -EINVAL;
-		}
-	}
-
 	if (opts->auxtrace_snapshot_mode)
 		pr_debug2("%s snapshot size: %zu\n", CORESIGHT_ETM_PMU_NAME,
 			  opts->auxtrace_snapshot_size);
-- 
2.28.0
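
For illustration, the power-of-two handling that evlist.c applies to the
requested page count could be sketched roughly as follows. The helper names
are hypothetical stand-ins, not the perf functions themselves, and the sketch
assumes the rounding goes up to the next power of two.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True when n has exactly one bit set. */
bool is_pow2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Smallest power of two >= n; n must be non-zero and representable. */
unsigned long round_up_pow2(unsigned long n)
{
	unsigned long p = 1;

	assert(n != 0 && n <= ULONG_MAX / 2 + 1);
	while (p < n)
		p <<= 1;
	return p;
}
```

Under this sketch a request of 3 pages would become 4, while a request of
1024 pages would be left unchanged.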



* Re: [PATCH] perf cs-etm: Remove duplicate and incorrect aux size checks
  2021-12-08 11:54 [PATCH] perf cs-etm: Remove duplicate and incorrect aux size checks James Clark
@ 2021-12-08 13:17 ` Leo Yan
  2021-12-08 14:08   ` James Clark
  0 siblings, 1 reply; 8+ messages in thread
From: Leo Yan @ 2021-12-08 13:17 UTC (permalink / raw)
  To: James Clark
  Cc: mathieu.poirier, coresight, suzuki.poulose, Mike Leach,
	John Garry, Will Deacon, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, linux-perf-users,
	linux-kernel

Hi James,

On Wed, Dec 08, 2021 at 11:54:35AM +0000, James Clark wrote:
> There are two checks, one is for size when running without admin, but
> this one is covered by the driver and reported on in more detail here
> (builtin-record.c):
> 
>   pr_err("Permission error mapping pages.\n"
>          "Consider increasing "
>          "/proc/sys/kernel/perf_event_mlock_kb,\n"
>          "or try again with a smaller value of -m/--mmap_pages.\n"
>          "(current value: %u,%u)\n",

I looked into the kernel code and found:

  sysctl_perf_event_mlock = 512 + (PAGE_SIZE / 1024);  // 512KB + 1 page

If the system has multiple cores, say 8 cores, the kernel relaxes the
limitation even further with:

  user_lock_limit *= num_online_cpus();

So the memory lock limit becomes:

  (512KB + 1 page) * 8 = 4MB + 8 pages.

It seems to me this is much more relaxed than the 128KB limit in user
space. And on an Arm server the permitted buffer size can be a huge
value (e.g. for a system with 128 cores).
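
As a rough sketch of the arithmetic above (the helper names are hypothetical
and not kernel functions; this only mirrors the two quoted expressions and
ignores any other accounting done in perf_mmap()):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Default lock limit in KiB, mirroring the quoted expression:
 * 512 KiB plus one page.
 */
size_t default_mlock_kb(size_t page_size_bytes)
{
	assert(page_size_bytes >= 1024);
	return 512 + page_size_bytes / 1024;
}

/* Limit in KiB after scaling by the number of online CPUs. */
size_t scaled_mlock_kb(size_t page_size_bytes, unsigned int online_cpus)
{
	return default_mlock_kb(page_size_bytes) * online_cpus;
}
```

With 4 KiB pages and 8 CPUs this gives 4128 KiB, i.e. 4MB + 8 pages; with
128 CPUs it gives 66048 KiB.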

Could you confirm if this is right?

Thanks,
Leo



* Re: [PATCH] perf cs-etm: Remove duplicate and incorrect aux size checks
  2021-12-08 13:17 ` Leo Yan
@ 2021-12-08 14:08   ` James Clark
  2021-12-09 13:44     ` Leo Yan
  0 siblings, 1 reply; 8+ messages in thread
From: James Clark @ 2021-12-08 14:08 UTC (permalink / raw)
  To: Leo Yan
  Cc: mathieu.poirier, coresight, suzuki.poulose, Mike Leach,
	John Garry, Will Deacon, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, linux-perf-users,
	linux-kernel



On 08/12/2021 13:17, Leo Yan wrote:
> Could you confirm if this is right?

Yes that seems to be the case. And the commit message for that addition
states the reasoning:

  perf_counter: Increase mmap limit
  
  In a default 'perf top' run the tool will create a counter for
  each online CPU. With enough CPUs this will eventually exhaust
  the default limit.

  So scale it up with the number of online CPUs.

To me that makes sense. Normally the memory installed also scales with the
number of cores.

Are you saying that we should look into modifying that scaling factor in
perf_mmap()? Or that we should still add something to userspace for
coresight to limit user supplied buffer sizes?

I think it makes sense to allow the user to specify any value that will work,
it's up to them.

James



* Re: [PATCH] perf cs-etm: Remove duplicate and incorrect aux size checks
  2021-12-08 14:08   ` James Clark
@ 2021-12-09 13:44     ` Leo Yan
  2021-12-09 14:16       ` James Clark
  0 siblings, 1 reply; 8+ messages in thread
From: Leo Yan @ 2021-12-09 13:44 UTC (permalink / raw)
  To: James Clark
  Cc: mathieu.poirier, coresight, suzuki.poulose, Mike Leach,
	John Garry, Will Deacon, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, linux-perf-users,
	linux-kernel

On Wed, Dec 08, 2021 at 02:08:04PM +0000, James Clark wrote:
> Yes that seems to be the case. And the commit message for that addition
> states the reasoning:
> 
>   perf_counter: Increase mmap limit
>   
>   In a default 'perf top' run the tool will create a counter for
>   each online CPU. With enough CPUs this will eventually exhaust
>   the default limit.
> 
>   So scale it up with the number of online CPUs.
> 
> To me that makes sense. Normally the memory installed also scales with the
> number of cores.
> 
> Are you saying that we should look into modifying that scaling factor in
> perf_mmap()? Or that we should still add something to userspace for
> coresight to limit user supplied buffer sizes?

I don't think we should modify the scaling factor in perf_mmap(). The
logic is not only used by the AUX buffer, it's also shared by the
normal event ring buffer.

> I think it makes sense to allow the user to specify any value that will work,
> it's up to them.

Understood. I verified this patch with the steps below:

root@debian:~# echo 0 > /proc/sys/kernel/perf_event_paranoid

leoy@debian:~$ perf record -e cs_etm// -m 4M,8M -o perf_test.data -- sleep 1
Permission error mapping pages.
Consider increasing /proc/sys/kernel/perf_event_mlock_kb,
or try again with a smaller value of -m/--mmap_pages.
(current value: 1024,2048)

leoy@debian:~$ perf record -e cs_etm// -m 4M,4M -o perf_test.data -- sleep 1
Couldn't synthesize bpf events.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.607 MB perf_test.data ]
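
The "(current value: 1024,2048)" line above reports the two -m values in
pages. A minimal sketch of that conversion, assuming 4 KiB pages (which
matches the figures shown; the helper name is hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Convert a buffer size in bytes to a whole number of pages. */
size_t bytes_to_pages(size_t bytes, size_t page_size_bytes)
{
	assert(page_size_bytes != 0 && bytes % page_size_bytes == 0);
	return bytes / page_size_bytes;
}
```

4M and 8M then correspond to 1024 and 2048 pages respectively.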

So this patch looks good to me:

Reviewed-by: Leo Yan <leo.yan@linaro.org>


* Re: [PATCH] perf cs-etm: Remove duplicate and incorrect aux size checks
  2021-12-09 13:44     ` Leo Yan
@ 2021-12-09 14:16       ` James Clark
  2021-12-10 16:54         ` Mathieu Poirier
  2021-12-10 19:03         ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 8+ messages in thread
From: James Clark @ 2021-12-09 14:16 UTC (permalink / raw)
  To: Leo Yan
  Cc: mathieu.poirier, coresight, suzuki.poulose, Mike Leach,
	John Garry, Will Deacon, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, linux-perf-users,
	linux-kernel



On 09/12/2021 13:44, Leo Yan wrote:
> Reviewed-by: Leo Yan <leo.yan@linaro.org>
> 
Thanks Leo!


* Re: [PATCH] perf cs-etm: Remove duplicate and incorrect aux size checks
  2021-12-09 14:16       ` James Clark
@ 2021-12-10 16:54         ` Mathieu Poirier
  2021-12-10 17:55           ` Arnaldo Carvalho de Melo
  2021-12-10 19:03         ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 8+ messages in thread
From: Mathieu Poirier @ 2021-12-10 16:54 UTC (permalink / raw)
  To: James Clark
  Cc: Leo Yan, coresight, suzuki.poulose, Mike Leach, John Garry,
	Will Deacon, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, linux-arm-kernel, linux-perf-users, linux-kernel

On Thu, Dec 09, 2021 at 02:16:43PM +0000, James Clark wrote:
> Thanks Leo!

Arnaldo is not on the recipient list and as such he won't see this patch...



* Re: [PATCH] perf cs-etm: Remove duplicate and incorrect aux size checks
  2021-12-10 16:54         ` Mathieu Poirier
@ 2021-12-10 17:55           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-12-10 17:55 UTC (permalink / raw)
  To: Mathieu Poirier, James Clark
  Cc: Leo Yan, coresight, suzuki.poulose, Mike Leach, John Garry,
	Will Deacon, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, linux-arm-kernel, linux-perf-users, linux-kernel



On December 10, 2021 1:54:36 PM GMT-03:00, Mathieu Poirier <mathieu.poirier@linaro.org> wrote:
>Arnaldo is not on the recipient list and as such he won't see this patch...
>

I saw it now, can I take this as an Acked-by from Mathieu too?

- Arnaldo


* Re: [PATCH] perf cs-etm: Remove duplicate and incorrect aux size checks
  2021-12-09 14:16       ` James Clark
  2021-12-10 16:54         ` Mathieu Poirier
@ 2021-12-10 19:03         ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 8+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-12-10 19:03 UTC (permalink / raw)
  To: James Clark
  Cc: Leo Yan, mathieu.poirier, coresight, suzuki.poulose, Mike Leach,
	John Garry, Will Deacon, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-arm-kernel, linux-perf-users,
	linux-kernel

On Thu, Dec 09, 2021 at 02:16:43PM +0000, James Clark wrote:
> Thanks Leo!


Thanks, applied.

- Arnaldo


