* [PATCH] perf tools: allow overriding MAX_NR_CPUS at compile time
@ 2017-09-22 11:20 Christophe Leroy
  2017-09-22 14:09 ` Arnaldo Carvalho de Melo
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Christophe Leroy @ 2017-09-22 11:20 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin
  Cc: linux-kernel, linuxppc-dev

After a kernel update, the perf tool no longer runs on my powerpc board
with 32MB of RAM, although it still runs on a board with 128MB of RAM:

~# strace perf
execve("/usr/sbin/perf", ["perf"], [/* 12 vars */]) = -1 ENOMEM (Cannot allocate memory)
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
+++ killed by SIGSEGV +++
Segmentation fault

objdump -x shows that the .bss section is huge, about 24 MB:

 27 .bss          016baca8  101cebb8  101cebb8  001cd988  2**3

The following objects in particular are quite large, 0x140000 bytes
(1.25 MiB) each:

10205f80 l     O .bss	00140000              runtime_cycles_stats
10345f80 l     O .bss	00140000              runtime_stalled_cycles_front_stats
10485f80 l     O .bss	00140000              runtime_stalled_cycles_back_stats
105c5f80 l     O .bss	00140000              runtime_branches_stats
10705f80 l     O .bss	00140000              runtime_cacherefs_stats
10845f80 l     O .bss	00140000              runtime_l1_dcache_stats
10985f80 l     O .bss	00140000              runtime_l1_icache_stats
10ac5f80 l     O .bss	00140000              runtime_ll_cache_stats
10c05f80 l     O .bss	00140000              runtime_itlb_cache_stats
10d45f80 l     O .bss	00140000              runtime_dtlb_cache_stats
10e85f80 l     O .bss	00140000              runtime_cycles_in_tx_stats
10fc5f80 l     O .bss	00140000              runtime_transaction_stats
11105f80 l     O .bss	00140000              runtime_elision_stats
11245f80 l     O .bss	00140000              runtime_topdown_total_slots
11385f80 l     O .bss	00140000              runtime_topdown_slots_retired
114c5f80 l     O .bss	00140000              runtime_topdown_slots_issued
11605f80 l     O .bss	00140000              runtime_topdown_fetch_bubbles
11745f80 l     O .bss	00140000              runtime_topdown_recovery_bubbles

This is due to commit 4d255766d28b1 ("perf: Bump max number of cpus
to 1024"), because many tables are sized with MAX_NR_CPUS.

This patch makes it possible to override MAX_NR_CPUS at build time, e.g.:

make EXTRA_CFLAGS=-DMAX_NR_CPUS=1

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 tools/perf/perf.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index dc442ba21bf6..a9db563da0a9 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -23,7 +23,9 @@ static inline unsigned long long rdclock(void)
 	return ts.tv_sec * 1000000000ULL + ts.tv_nsec;
 }
 
+#ifndef MAX_NR_CPUS
 #define MAX_NR_CPUS			1024
+#endif
 
 extern const char *input_name;
 extern bool perf_host, perf_guest;
-- 
2.13.3

* Re: [PATCH] perf tools: allow overriding MAX_NR_CPUS at compile time
  2017-09-22 11:20 [PATCH] perf tools: allow overriding MAX_NR_CPUS at compile time Christophe Leroy
@ 2017-09-22 14:09 ` Arnaldo Carvalho de Melo
  2018-05-03 13:40 ` Arnaldo Carvalho de Melo
  2018-08-02  8:16 ` [tip:perf/core] perf tools: Allow " tip-bot for Christophe Leroy
  2 siblings, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-09-22 14:09 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Peter Zijlstra, Ingo Molnar, Alexander Shishkin, linux-kernel,
	linuxppc-dev

On Fri, Sep 22, 2017 at 01:20:43PM +0200, Christophe Leroy wrote:
> After update of kernel, perf tool doesn't run anymore on my
> 32MB RAM powerpc board, but still runs on a 128MB RAM board:
> 
> ~# strace perf
> execve("/usr/sbin/perf", ["perf"], [/* 12 vars */]) = -1 ENOMEM (Cannot allocate memory)
> --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
> +++ killed by SIGSEGV +++
> Segmentation fault
> 
> objdump -x shows that .bss section has a huge size of 24Mbytes:
> 
>  27 .bss          016baca8  101cebb8  101cebb8  001cd988  2**3
> 
> With especially the following objects having quite big size
> 
> 10205f80 l     O .bss	00140000              runtime_cycles_stats
> 10345f80 l     O .bss	00140000              runtime_stalled_cycles_front_stats
> 10485f80 l     O .bss	00140000              runtime_stalled_cycles_back_stats
> 105c5f80 l     O .bss	00140000              runtime_branches_stats
> 10705f80 l     O .bss	00140000              runtime_cacherefs_stats
> 10845f80 l     O .bss	00140000              runtime_l1_dcache_stats
> 10985f80 l     O .bss	00140000              runtime_l1_icache_stats
> 10ac5f80 l     O .bss	00140000              runtime_ll_cache_stats
> 10c05f80 l     O .bss	00140000              runtime_itlb_cache_stats
> 10d45f80 l     O .bss	00140000              runtime_dtlb_cache_stats
> 10e85f80 l     O .bss	00140000              runtime_cycles_in_tx_stats
> 10fc5f80 l     O .bss	00140000              runtime_transaction_stats
> 11105f80 l     O .bss	00140000              runtime_elision_stats
> 11245f80 l     O .bss	00140000              runtime_topdown_total_slots
> 11385f80 l     O .bss	00140000              runtime_topdown_slots_retired
> 114c5f80 l     O .bss	00140000              runtime_topdown_slots_issued
> 11605f80 l     O .bss	00140000              runtime_topdown_fetch_bubbles
> 11745f80 l     O .bss	00140000              runtime_topdown_recovery_bubbles
> 
> This is due to commit 4d255766d28b1 ("perf: Bump max number of cpus
> to 1024"), because many tables are sized with MAX_NR_CPUS
> 
> This patch gives the opportunity to redefine MAX_NR_CPUS via
> 
> make EXTRA_CFLAGS=-DMAX_NR_CPUS=1

I'll probably apply this, but the right thing would be to get rid of
MAX_NR_CPUS completely and instead allocate these tables dynamically,
sized by the CPU count that sysconf reports:

[root@jouet ~]# perf trace -e open,getdents getconf _NPROCESSORS_CONF
4
     0.014 ( 0.011 ms): getconf/2452 open(filename: /etc/ld.so.cache, flags: CLOEXEC                       ) = 3
     0.045 ( 0.008 ms): getconf/2452 open(filename: /lib64/libc.so.6, flags: CLOEXEC                       ) = 3
     0.264 ( 0.012 ms): getconf/2452 open(filename: /usr/lib/locale/locale-archive, flags: CLOEXEC         ) = 3
     0.315 ( 0.028 ms): getconf/2452 open(filename: /sys/devices/system/cpu, flags: CLOEXEC|DIRECTORY|NONBLOCK) = 3
     0.349 ( 0.013 ms): getconf/2452 getdents(fd: 3</sys/devices/system/cpu>, dirent: 0x944030, count: 32768) = 624
     0.365 ( 0.001 ms): getconf/2452 getdents(fd: 3</sys/devices/system/cpu>, dirent: 0x944030, count: 32768) = 0
[root@jouet ~]#
 
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
>  tools/perf/perf.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tools/perf/perf.h b/tools/perf/perf.h
> index dc442ba21bf6..a9db563da0a9 100644
> --- a/tools/perf/perf.h
> +++ b/tools/perf/perf.h
> @@ -23,7 +23,9 @@ static inline unsigned long long rdclock(void)
>  	return ts.tv_sec * 1000000000ULL + ts.tv_nsec;
>  }
>  
> +#ifndef MAX_NR_CPUS
>  #define MAX_NR_CPUS			1024
> +#endif
>  
>  extern const char *input_name;
>  extern bool perf_host, perf_guest;
> -- 
> 2.13.3

* Re: [PATCH] perf tools: allow overriding MAX_NR_CPUS at compile time
  2017-09-22 11:20 [PATCH] perf tools: allow overriding MAX_NR_CPUS at compile time Christophe Leroy
  2017-09-22 14:09 ` Arnaldo Carvalho de Melo
@ 2018-05-03 13:40 ` Arnaldo Carvalho de Melo
  2018-08-01  9:37   ` Christophe LEROY
  2018-08-02  8:16 ` [tip:perf/core] perf tools: Allow " tip-bot for Christophe Leroy
  2 siblings, 1 reply; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-05-03 13:40 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Peter Zijlstra, Ingo Molnar, Alexander Shishkin, linux-kernel,
	linuxppc-dev

On Fri, Sep 22, 2017 at 01:20:43PM +0200, Christophe Leroy wrote:
> After update of kernel, perf tool doesn't run anymore on my
> 32MB RAM powerpc board, but still runs on a 128MB RAM board:

Cleaning up my inbox, found this one, simple enough, still applies,
applied.

These tables all need to be dynamically allocated, but since this patch
is enough to get a functioning tool, apply it.

- Arnaldo

* Re: [PATCH] perf tools: allow overriding MAX_NR_CPUS at compile time
  2018-05-03 13:40 ` Arnaldo Carvalho de Melo
@ 2018-08-01  9:37   ` Christophe LEROY
  2018-08-01 14:40     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 6+ messages in thread
From: Christophe LEROY @ 2018-08-01  9:37 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Peter Zijlstra, Ingo Molnar, Alexander Shishkin, linux-kernel,
	linuxppc-dev



On 03/05/2018 at 15:40, Arnaldo Carvalho de Melo wrote:
> On Fri, Sep 22, 2017 at 01:20:43PM +0200, Christophe Leroy wrote:
>> After update of kernel, perf tool doesn't run anymore on my
>> 32MB RAM powerpc board, but still runs on a 128MB RAM board:
> 
> Cleaning up my inbox, found this one, simple enough, still applies,
> applied.

Did you end up applying it? I can't see it in linux-next. Will it be
merged into 4.19?

Thanks
Christophe

* Re: [PATCH] perf tools: allow overriding MAX_NR_CPUS at compile time
  2018-08-01  9:37   ` Christophe LEROY
@ 2018-08-01 14:40     ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2018-08-01 14:40 UTC (permalink / raw)
  To: Christophe LEROY
  Cc: Peter Zijlstra, Ingo Molnar, Alexander Shishkin, linux-kernel,
	linuxppc-dev

On Wed, Aug 01, 2018 at 11:37:30AM +0200, Christophe LEROY wrote:
> 
> 
> On 03/05/2018 at 15:40, Arnaldo Carvalho de Melo wrote:
> > On Fri, Sep 22, 2017 at 01:20:43PM +0200, Christophe Leroy wrote:
> > > After update of kernel, perf tool doesn't run anymore on my
> > > 32MB RAM powerpc board, but still runs on a 128MB RAM board:
> > 
> > Cleaning up my inbox, found this one, simple enough, still applies,
> > applied.
> 
> Did you finally apply it ? I can't see it in linux-next. Will it be merged
> into 4.19 ?

Sure, finally applied it.

- Arnaldo

* [tip:perf/core] perf tools: Allow overriding MAX_NR_CPUS at compile time
  2017-09-22 11:20 [PATCH] perf tools: allow overriding MAX_NR_CPUS at compile time Christophe Leroy
  2017-09-22 14:09 ` Arnaldo Carvalho de Melo
  2018-05-03 13:40 ` Arnaldo Carvalho de Melo
@ 2018-08-02  8:16 ` tip-bot for Christophe Leroy
  2 siblings, 0 replies; 6+ messages in thread
From: tip-bot for Christophe Leroy @ 2018-08-02  8:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: hpa, alexander.shishkin, acme, tglx, christophe.leroy, peterz,
	mingo, linux-kernel

Commit-ID:  21b8732eb4479b579bda9ee38e62b2c312c2a0e5
Gitweb:     https://git.kernel.org/tip/21b8732eb4479b579bda9ee38e62b2c312c2a0e5
Author:     Christophe Leroy <christophe.leroy@c-s.fr>
AuthorDate: Fri, 22 Sep 2017 13:20:43 +0200
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 1 Aug 2018 12:33:24 -0300

perf tools: Allow overriding MAX_NR_CPUS at compile time

After a kernel update, the perf tool no longer runs on my powerpc board
with 32MB of RAM, although it still runs on a board with 128MB of RAM:

  ~# strace perf
  execve("/usr/sbin/perf", ["perf"], [/* 12 vars */]) = -1 ENOMEM (Cannot allocate memory)
  --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
  +++ killed by SIGSEGV +++
  Segmentation fault

objdump -x shows that the .bss section is huge, about 24 MB:

 27 .bss          016baca8  101cebb8  101cebb8  001cd988  2**3

The following objects in particular are quite large, 0x140000 bytes
(1.25 MiB) each:

  10205f80 l     O .bss	00140000     runtime_cycles_stats
  10345f80 l     O .bss	00140000     runtime_stalled_cycles_front_stats
  10485f80 l     O .bss	00140000     runtime_stalled_cycles_back_stats
  105c5f80 l     O .bss	00140000     runtime_branches_stats
  10705f80 l     O .bss	00140000     runtime_cacherefs_stats
  10845f80 l     O .bss	00140000     runtime_l1_dcache_stats
  10985f80 l     O .bss	00140000     runtime_l1_icache_stats
  10ac5f80 l     O .bss	00140000     runtime_ll_cache_stats
  10c05f80 l     O .bss	00140000     runtime_itlb_cache_stats
  10d45f80 l     O .bss	00140000     runtime_dtlb_cache_stats
  10e85f80 l     O .bss	00140000     runtime_cycles_in_tx_stats
  10fc5f80 l     O .bss	00140000     runtime_transaction_stats
  11105f80 l     O .bss	00140000     runtime_elision_stats
  11245f80 l     O .bss	00140000     runtime_topdown_total_slots
  11385f80 l     O .bss	00140000     runtime_topdown_slots_retired
  114c5f80 l     O .bss	00140000     runtime_topdown_slots_issued
  11605f80 l     O .bss	00140000     runtime_topdown_fetch_bubbles
  11745f80 l     O .bss	00140000     runtime_topdown_recovery_bubbles

This is due to commit 4d255766d28b1 ("perf: Bump max number of cpus
to 1024"), because many tables are sized with MAX_NR_CPUS.

This patch makes it possible to override MAX_NR_CPUS at build time, e.g.:

  $ make EXTRA_CFLAGS=-DMAX_NR_CPUS=1

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20170922112043.8349468C57@po15668-vm-win7.idsi0.si.c-s.fr
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/perf.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index d215714f48df..21bf7f5a3cf5 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -25,7 +25,9 @@ static inline unsigned long long rdclock(void)
 	return ts.tv_sec * 1000000000ULL + ts.tv_nsec;
 }
 
+#ifndef MAX_NR_CPUS
 #define MAX_NR_CPUS			1024
+#endif
 
 extern const char *input_name;
 extern bool perf_host, perf_guest;
