linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RESEND 0/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
@ 2022-04-12 23:15 Libo Chen
  2022-04-12 23:15 ` [PATCH RESEND 1/1] " Libo Chen
  0 siblings, 1 reply; 18+ messages in thread
From: Libo Chen @ 2022-04-12 23:15 UTC (permalink / raw)
  To: gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch

We encountered an issue related to NR_CPUMASK_BITS on ARM64. It turned
out CPUMASK_OFFSTACK was not enabled and NR_CPUS is set to 4096 in our
aarch64 config, so each cpumask operation is unnecessarily expensive
esp. on small VMs. There are decent number of cpumask operations in the
scheduler. Many of them are on the hot paths. The cumulative effect is
quite siginificant. With CONFIG_CPUMASK_OFFSTACKT set, we saw ~10% gain on
ApacheBench and a few percentage on redis in a 32-core ARM64 VM with 5.16
kernel.
One may argue why not just set NR_CPUS to smaller values, the thing is,
for example, we support VMs with flexible shapes. The customer can
specify the number of CPUs ranging from 1 to 160. You cannot just prepare
160 kernel images for each release with one for each shape. And kernel
with NR_CPUS=160 doesn't perform well on a 1-CPU VM. It's important that
we can set CPUMASK_OFFSTACK=y so that the kernel can dynamically allocate
cpumasks based on the actual number of CPUs in the system.

The problem I am trying to address in this patch is currently
CPUMASK_OFFSTACK depends on DEBUG_PER_CPU_MAPS except for x86 and it
doesn't even show up in kconfig menu as well as kconfig
without DEBUG_PER_CPU_MAPS=y. We should remove such outdated,
unnecessary dependency.

Of course, I am open to other ideas. And people who are more familar with
this matter can shed a light on why this dependency has been kept for so
long. My goal here is to determine if DEBUG_PER_CPU_MAPS is absoultely
necessary for CPUMASK_OFFSTACK.

RESEND:
- Correct the subject line of cover letter
- Send to more people, esp. gkh for more attention since no one seems to
  be responsible for this part of code

Libo Chen (1):
  lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK

 lib/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-12 23:15 [PATCH RESEND 0/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK Libo Chen
@ 2022-04-12 23:15 ` Libo Chen
  2022-04-13  0:18   ` Randy Dunlap
                     ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Libo Chen @ 2022-04-12 23:15 UTC (permalink / raw)
  To: gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch

Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
make a lot of sense nowaday. Even the original patch dating back to 2008,
aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
rationale for such dependency.

Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
There is no reason other architectures cannot given the fact that they
have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
x86.

Signed-off-by: Libo Chen <libo.chen@oracle.com>
---
 lib/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/Kconfig b/lib/Kconfig
index 087e06b4cdfd..7209039dfb59 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -511,7 +511,7 @@ config CHECK_SIGNATURE
 	bool
 
 config CPUMASK_OFFSTACK
-	bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
+	bool "Force CPU masks off stack"
 	help
 	  Use dynamic allocation for cpumask_var_t, instead of putting
 	  them on the stack.  This is a bit more expensive, but avoids
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-12 23:15 ` [PATCH RESEND 1/1] " Libo Chen
@ 2022-04-13  0:18   ` Randy Dunlap
  2022-04-13  1:35     ` Libo Chen
  2022-04-13 13:11   ` kernel test robot
  2022-04-13 14:33   ` kernel test robot
  2 siblings, 1 reply; 18+ messages in thread
From: Randy Dunlap @ 2022-04-13  0:18 UTC (permalink / raw)
  To: Libo Chen, gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch

Hi--

On 4/12/22 16:15, Libo Chen wrote:
> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
> make a lot of sense nowaday. Even the original patch dating back to 2008,
> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
> rationale for such dependency.
> 
> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
> There is no reason other architectures cannot given the fact that they
> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
> x86.
> 
> Signed-off-by: Libo Chen <libo.chen@oracle.com>
> ---
>  lib/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/Kconfig b/lib/Kconfig
> index 087e06b4cdfd..7209039dfb59 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>  	bool
>  
>  config CPUMASK_OFFSTACK
> -	bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS

This "if" dependency only controls whether the Kconfig symbol's prompt is
displayed (presented) in kconfig tools. Removing it makes the prompt always
be displayed.

Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
of DEBUG_PER_CPU_MAPS.

Is there another problem here?

> +	bool "Force CPU masks off stack"
>  	help
>  	  Use dynamic allocation for cpumask_var_t, instead of putting
>  	  them on the stack.  This is a bit more expensive, but avoids

-- 
~Randy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13  0:18   ` Randy Dunlap
@ 2022-04-13  1:35     ` Libo Chen
  2022-04-13  2:13       ` Randy Dunlap
  0 siblings, 1 reply; 18+ messages in thread
From: Libo Chen @ 2022-04-13  1:35 UTC (permalink / raw)
  To: Randy Dunlap, gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch

Hi Randy,

On 4/12/22 17:18, Randy Dunlap wrote:
> Hi--
>
> On 4/12/22 16:15, Libo Chen wrote:
>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>> rationale for such dependency.
>>
>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>> There is no reason other architectures cannot given the fact that they
>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>> x86.
>>
>> Signed-off-by: Libo Chen <libo.chen@oracle.com>
>> ---
>>   lib/Kconfig | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/lib/Kconfig b/lib/Kconfig
>> index 087e06b4cdfd..7209039dfb59 100644
>> --- a/lib/Kconfig
>> +++ b/lib/Kconfig
>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>   	bool
>>   
>>   config CPUMASK_OFFSTACK
>> -	bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> This "if" dependency only controls whether the Kconfig symbol's prompt is
> displayed (presented) in kconfig tools. Removing it makes the prompt always
> be displayed.
>
> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
> of DEBUG_PER_CPU_MAPS.
Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under 
some config xxx? That will work but it requires code changes for each 
architecture.
But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without 
CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it 
doesn't work.

Libo
> Is there another problem here?
>
>> +	bool "Force CPU masks off stack"
>>   	help
>>   	  Use dynamic allocation for cpumask_var_t, instead of putting
>>   	  them on the stack.  This is a bit more expensive, but avoids


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13  1:35     ` Libo Chen
@ 2022-04-13  2:13       ` Randy Dunlap
  2022-04-13  2:34         ` Libo Chen
  0 siblings, 1 reply; 18+ messages in thread
From: Randy Dunlap @ 2022-04-13  2:13 UTC (permalink / raw)
  To: Libo Chen, gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch

Hi,

On 4/12/22 18:35, Libo Chen wrote:
> Hi Randy,
> 
> On 4/12/22 17:18, Randy Dunlap wrote:
>> Hi--
>>
>> On 4/12/22 16:15, Libo Chen wrote:
>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>> rationale for such dependency.
>>>
>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>> There is no reason other architectures cannot given the fact that they
>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>> x86.
>>>
>>> Signed-off-by: Libo Chen <libo.chen@oracle.com>
>>> ---
>>>   lib/Kconfig | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>> index 087e06b4cdfd..7209039dfb59 100644
>>> --- a/lib/Kconfig
>>> +++ b/lib/Kconfig
>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>       bool
>>>     config CPUMASK_OFFSTACK
>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>> be displayed.
>>
>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>> of DEBUG_PER_CPU_MAPS.
> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.

I'm just talking about the Kconfig change below.  Not talking about whatever else
it might require per architecture.

But you say you have tried that and it doesn't work. What part of it doesn't work?
The Kconfig part or some code execution?

I'll test the Kconfig part of it later (in a few hours).

> Libo
>> Is there another problem here?
>>
>>> +    bool "Force CPU masks off stack"
>>>       help
>>>         Use dynamic allocation for cpumask_var_t, instead of putting
>>>         them on the stack.  This is a bit more expensive, but avoids
> 

-- 
~Randy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13  2:13       ` Randy Dunlap
@ 2022-04-13  2:34         ` Libo Chen
  2022-04-13  5:54           ` Randy Dunlap
  0 siblings, 1 reply; 18+ messages in thread
From: Libo Chen @ 2022-04-13  2:34 UTC (permalink / raw)
  To: Randy Dunlap, gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch



On 4/12/22 19:13, Randy Dunlap wrote:
> Hi,
>
> On 4/12/22 18:35, Libo Chen wrote:
>> Hi Randy,
>>
>> On 4/12/22 17:18, Randy Dunlap wrote:
>>> Hi--
>>>
>>> On 4/12/22 16:15, Libo Chen wrote:
>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>>> rationale for such dependency.
>>>>
>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>>> There is no reason other architectures cannot given the fact that they
>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>>> x86.
>>>>
>>>> Signed-off-by: Libo Chen <libo.chen@oracle.com>
>>>> ---
>>>>    lib/Kconfig | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>>> index 087e06b4cdfd..7209039dfb59 100644
>>>> --- a/lib/Kconfig
>>>> +++ b/lib/Kconfig
>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>>        bool
>>>>      config CPUMASK_OFFSTACK
>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>>> be displayed.
>>>
>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>>> of DEBUG_PER_CPU_MAPS.
>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
> I'm just talking about the Kconfig change below.  Not talking about whatever else
> it might require per architecture.
>
> But you say you have tried that and it doesn't work. What part of it doesn't work?
> The Kconfig part or some code execution?
oh the Kconfig part. For example, make olddefconfig on a config file 
with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I 
explicitly set DEBUG_PER_CPU_MAPS=y

Libo
> I'll test the Kconfig part of it later (in a few hours).
>
>> Libo
>>> Is there another problem here?
>>>
>>>> +    bool "Force CPU masks off stack"
>>>>        help
>>>>          Use dynamic allocation for cpumask_var_t, instead of putting
>>>>          them on the stack.  This is a bit more expensive, but avoids


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13  2:34         ` Libo Chen
@ 2022-04-13  5:54           ` Randy Dunlap
  2022-04-13  6:56             ` Libo Chen
  0 siblings, 1 reply; 18+ messages in thread
From: Randy Dunlap @ 2022-04-13  5:54 UTC (permalink / raw)
  To: Libo Chen, gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch

Hi Libo,

On 4/12/22 19:34, Libo Chen wrote:
> 
> 
> On 4/12/22 19:13, Randy Dunlap wrote:
>> Hi,
>>
>> On 4/12/22 18:35, Libo Chen wrote:
>>> Hi Randy,
>>>
>>> On 4/12/22 17:18, Randy Dunlap wrote:
>>>> Hi--
>>>>
>>>> On 4/12/22 16:15, Libo Chen wrote:
>>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>>>> rationale for such dependency.
>>>>>
>>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>>>> There is no reason other architectures cannot given the fact that they
>>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>>>> x86.
>>>>>
>>>>> Signed-off-by: Libo Chen <libo.chen@oracle.com>
>>>>> ---
>>>>>    lib/Kconfig | 2 +-
>>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>>>> index 087e06b4cdfd..7209039dfb59 100644
>>>>> --- a/lib/Kconfig
>>>>> +++ b/lib/Kconfig
>>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>>>        bool
>>>>>      config CPUMASK_OFFSTACK
>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>>>> be displayed.
>>>>
>>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>>>> of DEBUG_PER_CPU_MAPS.
>>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
>>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
>> I'm just talking about the Kconfig change below.  Not talking about whatever else
>> it might require per architecture.
>>
>> But you say you have tried that and it doesn't work. What part of it doesn't work?
>> The Kconfig part or some code execution?
> oh the Kconfig part. For example, make olddefconfig on a config file with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I explicitly set DEBUG_PER_CPU_MAPS=y

I can enable CPUMASK_OFFSTACK for arm64 without having DEBUG_PER_CPU_MAPS enabled.
(with a patch, of course.)
It builds OK. I don't know if it will run OK.

I think that you are arguing for a patch like this:

--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -511,7 +511,8 @@ config CHECK_SIGNATURE
 	bool
 
 config CPUMASK_OFFSTACK
-	bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
+	bool "Force CPU masks off stack"
+	depends on DEBUG_PER_CPU_MAPS
 	help
 	  Use dynamic allocation for cpumask_var_t, instead of putting
 	  them on the stack.  This is a bit more expensive, but avoids


As I said earlier, the "if" on the "bool" line just controls the prompt message.
This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.


> Libo
>> I'll test the Kconfig part of it later (in a few hours).
>>
>>> Libo
>>>> Is there another problem here?
>>>>
>>>>> +    bool "Force CPU masks off stack"
>>>>>        help
>>>>>          Use dynamic allocation for cpumask_var_t, instead of putting
>>>>>          them on the stack.  This is a bit more expensive, but avoids
> 

-- 
~Randy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13  5:54           ` Randy Dunlap
@ 2022-04-13  6:56             ` Libo Chen
  2022-04-13  8:37               ` Masahiro Yamada
  2022-04-13 15:41               ` Randy Dunlap
  0 siblings, 2 replies; 18+ messages in thread
From: Libo Chen @ 2022-04-13  6:56 UTC (permalink / raw)
  To: Randy Dunlap, gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch

Hi Randy

On 4/12/22 22:54, Randy Dunlap wrote:
> Hi Libo,
>
> On 4/12/22 19:34, Libo Chen wrote:
>>
>> On 4/12/22 19:13, Randy Dunlap wrote:
>>> Hi,
>>>
>>> On 4/12/22 18:35, Libo Chen wrote:
>>>> Hi Randy,
>>>>
>>>> On 4/12/22 17:18, Randy Dunlap wrote:
>>>>> Hi--
>>>>>
>>>>> On 4/12/22 16:15, Libo Chen wrote:
>>>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>>>>> rationale for such dependency.
>>>>>>
>>>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>>>>> There is no reason other architectures cannot given the fact that they
>>>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>>>>> x86.
>>>>>>
>>>>>> Signed-off-by: Libo Chen <libo.chen@oracle.com>
>>>>>> ---
>>>>>>     lib/Kconfig | 2 +-
>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>>>>> index 087e06b4cdfd..7209039dfb59 100644
>>>>>> --- a/lib/Kconfig
>>>>>> +++ b/lib/Kconfig
>>>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>>>>         bool
>>>>>>       config CPUMASK_OFFSTACK
>>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>>>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>>>>> be displayed.
>>>>>
>>>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>>>>> of DEBUG_PER_CPU_MAPS.
>>>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
>>>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
>>> I'm just talking about the Kconfig change below.  Not talking about whatever else
>>> it might require per architecture.
>>>
>>> But you say you have tried that and it doesn't work. What part of it doesn't work?
>>> The Kconfig part or some code execution?
>> oh the Kconfig part. For example, make olddefconfig on a config file with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I explicitly set DEBUG_PER_CPU_MAPS=y
> I can enable CPUMASK_OFFSTACK for arm64 without having DEBUG_PER_CPU_MAPS enabled.
> (with a patch, of course.)
> It builds OK. I don't know if it will run OK.

I am a little confused, did you succeed with your patch (replacing "if" 
with "depends on") or my patch (removing "if")? Because I definitely 
cannot enable CPUMASK_OFFSTACK for arm64 without DEBUG_PER_CPUMAPS 
enabled using your change.
> I think that you are arguing for a patch like this:

I am actually arguing for the opposite, I don't think CPUMASK_OFFSTACK 
should require DEBUG_PER_CPU_MAPS. They should be separate and 
independent to each other. So removing "if ..." should be enough in my 
opinion.
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
>   	bool
>   
>   config CPUMASK_OFFSTACK
> -	bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> +	bool "Force CPU masks off stack"
> +	depends on DEBUG_PER_CPU_MAPS

This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to 
enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument 
is CPUMASK_OFFSTACK should be enable/disabled independent of 
DEBUG_PER_CPU_MASK
>   	help
>   	  Use dynamic allocation for cpumask_var_t, instead of putting
>   	  them on the stack.  This is a bit more expensive, but avoids
>
>
> As I said earlier, the "if" on the "bool" line just controls the prompt message.
> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
>

Okay I understand now "if" on the "boot" is not a dependency and it only 
controls the prompt message, then the question is why we cannot enable 
CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt 
message? Is it not the behavior we expect?

Libo

>> Libo
>>> I'll test the Kconfig part of it later (in a few hours).
>>>
>>>> Libo
>>>>> Is there another problem here?
>>>>>
>>>>>> +    bool "Force CPU masks off stack"
>>>>>>         help
>>>>>>           Use dynamic allocation for cpumask_var_t, instead of putting
>>>>>>           them on the stack.  This is a bit more expensive, but avoids


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13  6:56             ` Libo Chen
@ 2022-04-13  8:37               ` Masahiro Yamada
  2022-04-13 15:41               ` Randy Dunlap
  1 sibling, 0 replies; 18+ messages in thread
From: Masahiro Yamada @ 2022-04-13  8:37 UTC (permalink / raw)
  To: Libo Chen
  Cc: Randy Dunlap, Greg Kroah-Hartman, Thomas Gleixner,
	Peter Zijlstra (Intel),
	Ingo Molnar, Vlastimil Babka, Andrew Morton,
	Linux Kernel Mailing List, Linux Kbuild mailing list,
	linux-arm-kernel, linux-arch

On Wed, Apr 13, 2022 at 3:56 PM Libo Chen <libo.chen@oracle.com> wrote:
>
> Hi Randy
>
> On 4/12/22 22:54, Randy Dunlap wrote:
> > Hi Libo,
> >
> > On 4/12/22 19:34, Libo Chen wrote:
> >>
> >> On 4/12/22 19:13, Randy Dunlap wrote:
> >>> Hi,
> >>>
> >>> On 4/12/22 18:35, Libo Chen wrote:
> >>>> Hi Randy,
> >>>>
> >>>> On 4/12/22 17:18, Randy Dunlap wrote:
> >>>>> Hi--
> >>>>>
> >>>>> On 4/12/22 16:15, Libo Chen wrote:
> >>>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
> >>>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
> >>>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
> >>>>>> rationale for such dependency.
> >>>>>>
> >>>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
> >>>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
> >>>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
> >>>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
> >>>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
> >>>>>> There is no reason other architectures cannot given the fact that they
> >>>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
> >>>>>> x86.
> >>>>>>
> >>>>>> Signed-off-by: Libo Chen <libo.chen@oracle.com>
> >>>>>> ---
> >>>>>>     lib/Kconfig | 2 +-
> >>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
> >>>>>>
> >>>>>> diff --git a/lib/Kconfig b/lib/Kconfig
> >>>>>> index 087e06b4cdfd..7209039dfb59 100644
> >>>>>> --- a/lib/Kconfig
> >>>>>> +++ b/lib/Kconfig
> >>>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
> >>>>>>         bool
> >>>>>>       config CPUMASK_OFFSTACK
> >>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> >>>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
> >>>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
> >>>>> be displayed.
> >>>>>
> >>>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
> >>>>> of DEBUG_PER_CPU_MAPS.
> >>>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
> >>>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
> >>> I'm just talking about the Kconfig change below.  Not talking about whatever else
> >>> it might require per architecture.
> >>>
> >>> But you say you have tried that and it doesn't work. What part of it doesn't work?
> >>> The Kconfig part or some code execution?
> >> oh the Kconfig part. For example, make olddefconfig on a config file with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I explicitly set DEBUG_PER_CPU_MAPS=y
> > I can enable CPUMASK_OFFSTACK for arm64 without having DEBUG_PER_CPU_MAPS enabled.
> > (with a patch, of course.)
> > It builds OK. I don't know if it will run OK.
>
> I am a little confused, did you succeed with your patch (replacing "if"
> with "depends on") or my patch (removing "if")? Because I definitely
> cannot enable CPUMASK_OFFSTACK for arm64 without DEBUG_PER_CPUMAPS
> enabled using your change.
> > I think that you are arguing for a patch like this:
>
> I am actually arguing for the opposite, I don't think CPUMASK_OFFSTACK
> should require DEBUG_PER_CPU_MAPS. They should be separate and
> independent to each other. So removing "if ..." should be enough in my
> opinion.
> > --- a/lib/Kconfig
> > +++ b/lib/Kconfig
> > @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
> >       bool
> >
> >   config CPUMASK_OFFSTACK
> > -     bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> > +     bool "Force CPU masks off stack"
> > +     depends on DEBUG_PER_CPU_MAPS
>
> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to
> enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument
> is CPUMASK_OFFSTACK should be enable/disabled independent of
> DEBUG_PER_CPU_MASK
> >       help
> >         Use dynamic allocation for cpumask_var_t, instead of putting
> >         them on the stack.  This is a bit more expensive, but avoids
> >
> >
> > As I said earlier, the "if" on the "bool" line just controls the prompt message.
> > This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
> >
>
> Okay I understand now "if" on the "boot" is not a dependency and it only
> controls the prompt message, then the question is why we cannot enable
> CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt
> message? Is it not the behavior we expect?
>


    config CPUMASK_OFFSTACK
            bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS

... is equivalent to this:

    config CPUMASK_OFFSTACK
            bool
            prompt "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS



When DEBUG_PER_CPU_MAPS is disabled, the prompt line is ignored,
and CPUMASK_OFFSTACK becomes a user-unconfigurable option.


Other options still can select it,
but users cannot enable it directly from the prompt.

I see x86 and powerpc do this.

$ kgrep 'select CPUMASK_OFFSTACK'
./arch/x86/Kconfig:946: select CPUMASK_OFFSTACK
./arch/powerpc/Kconfig:164: select CPUMASK_OFFSTACK if NR_CPUS >= 8192







> Libo
>
> >> Libo
> >>> I'll test the Kconfig part of it later (in a few hours).
> >>>
> >>>> Libo
> >>>>> Is there another problem here?
> >>>>>
> >>>>>> +    bool "Force CPU masks off stack"
> >>>>>>         help
> >>>>>>           Use dynamic allocation for cpumask_var_t, instead of putting
> >>>>>>           them on the stack.  This is a bit more expensive, but avoids
>


-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-12 23:15 ` [PATCH RESEND 1/1] " Libo Chen
  2022-04-13  0:18   ` Randy Dunlap
@ 2022-04-13 13:11   ` kernel test robot
  2022-04-13 14:33   ` kernel test robot
  2 siblings, 0 replies; 18+ messages in thread
From: kernel test robot @ 2022-04-13 13:11 UTC (permalink / raw)
  To: Libo Chen, gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: kbuild-all, linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch

Hi Libo,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.18-rc2 next-20220413]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Libo-Chen/lib-Kconfig-remove-DEBUG_PER_CPU_MAPS-dependency-for-CPUMASK_OFFSTACK/20220413-073657
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ce522ba9ef7e2d9fb22a39eb3371c0c64e2a433e
config: nios2-randconfig-r023-20220413 (https://download.01.org/0day-ci/archive/20220413/202204132113.V7jyB6Zc-lkp@intel.com/config)
compiler: nios2-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/6636f7cf28d2a79cde937c0f212e8a87080da06d
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Libo-Chen/lib-Kconfig-remove-DEBUG_PER_CPU_MAPS-dependency-for-CPUMASK_OFFSTACK/20220413-073657
        git checkout 6636f7cf28d2a79cde937c0f212e8a87080da06d
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=nios2 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 71
   kernel/workqueue.o: in function `free_workqueue_attrs':
   workqueue.c:(.text+0x3fb8): undefined reference to `free_cpumask_var'
   workqueue.c:(.text+0x3fb8): relocation truncated to fit: R_NIOS2_CALL26 against `free_cpumask_var'
   nios2-linux-ld: kernel/workqueue.o: in function `alloc_workqueue_attrs':
>> workqueue.c:(.text+0x40d4): undefined reference to `alloc_cpumask_var'
   workqueue.c:(.text+0x40d4): relocation truncated to fit: R_NIOS2_CALL26 against `alloc_cpumask_var'
   nios2-linux-ld: kernel/workqueue.o: in function `workqueue_set_unbound_cpumask':
   workqueue.c:(.text+0x5584): undefined reference to `zalloc_cpumask_var'
   workqueue.c:(.text+0x5584): relocation truncated to fit: R_NIOS2_CALL26 against `zalloc_cpumask_var'
   nios2-linux-ld: workqueue.c:(.text+0x5680): undefined reference to `free_cpumask_var'
   workqueue.c:(.text+0x5680): relocation truncated to fit: R_NIOS2_CALL26 against `free_cpumask_var'
   nios2-linux-ld: kernel/workqueue.o: in function `wq_unbound_cpumask_store':
   workqueue.c:(.text+0x56e0): undefined reference to `zalloc_cpumask_var'
   workqueue.c:(.text+0x56e0): relocation truncated to fit: R_NIOS2_CALL26 against `zalloc_cpumask_var'
   nios2-linux-ld: workqueue.c:(.text+0x5718): undefined reference to `free_cpumask_var'
   workqueue.c:(.text+0x5718): relocation truncated to fit: R_NIOS2_CALL26 against `free_cpumask_var'
   nios2-linux-ld: kernel/workqueue.o: in function `workqueue_init_early':
>> workqueue.c:(.init.text+0x1e8): undefined reference to `alloc_cpumask_var'
   workqueue.c:(.init.text+0x1e8): relocation truncated to fit: R_NIOS2_CALL26 against `alloc_cpumask_var'
   nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 15310
   kernel/sched/core.o: in function `sched_setaffinity':
>> core.c:(.text+0x3b40): undefined reference to `alloc_cpumask_var'
   core.c:(.text+0x3b40): relocation truncated to fit: R_NIOS2_CALL26 against `alloc_cpumask_var'
>> nios2-linux-ld: core.c:(.text+0x3b54): undefined reference to `alloc_cpumask_var'
   core.c:(.text+0x3b54): relocation truncated to fit: R_NIOS2_CALL26 against `alloc_cpumask_var'
>> nios2-linux-ld: core.c:(.text+0x3be0): undefined reference to `free_cpumask_var'
   core.c:(.text+0x3be0): relocation truncated to fit: R_NIOS2_CALL26 against `free_cpumask_var'
   nios2-linux-ld: core.c:(.text+0x3bf0): undefined reference to `free_cpumask_var'
   core.c:(.text+0x3bf0): additional relocation overflows omitted from the output
   nios2-linux-ld: kernel/sched/core.o: in function `__se_sys_sched_setaffinity':
   core.c:(.text+0x3c64): undefined reference to `alloc_cpumask_var'
   nios2-linux-ld: core.c:(.text+0x3cc4): undefined reference to `free_cpumask_var'
   nios2-linux-ld: kernel/sched/core.o: in function `__se_sys_sched_getaffinity':
   core.c:(.text+0x3dd4): undefined reference to `alloc_cpumask_var'
   nios2-linux-ld: core.c:(.text+0x3e20): undefined reference to `free_cpumask_var'
   nios2-linux-ld: kernel/sched/core.o: in function `sched_init':
>> core.c:(.init.text+0x27c): undefined reference to `load_balance_mask'
>> nios2-linux-ld: core.c:(.init.text+0x284): undefined reference to `load_balance_mask'
>> nios2-linux-ld: core.c:(.init.text+0x28c): undefined reference to `select_idle_mask'
   nios2-linux-ld: core.c:(.init.text+0x290): undefined reference to `select_idle_mask'
   nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 18
   kernel/sched/build_utility.o: in function `housekeeping_setup_type':
   build_utility.c:(.init.text+0x130): undefined reference to `alloc_bootmem_cpumask_var'
   nios2-linux-ld: kernel/sched/build_utility.o: in function `housekeeping_setup':
   build_utility.c:(.init.text+0x198): undefined reference to `alloc_bootmem_cpumask_var'
   nios2-linux-ld: build_utility.c:(.init.text+0x1b4): undefined reference to `alloc_bootmem_cpumask_var'
   nios2-linux-ld: build_utility.c:(.init.text+0x2f4): undefined reference to `free_bootmem_cpumask_var'
   nios2-linux-ld: build_utility.c:(.init.text+0x304): undefined reference to `free_bootmem_cpumask_var'
   nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 91
   kernel/profile.o: in function `prof_cpu_mask_proc_write':
   profile.c:(.text+0x40c): undefined reference to `zalloc_cpumask_var'
   nios2-linux-ld: profile.c:(.text+0x450): undefined reference to `free_cpumask_var'
   nios2-linux-ld: kernel/profile.o: in function `profile_init':
>> profile.c:(.ref.text+0xc0): undefined reference to `alloc_cpumask_var'
>> nios2-linux-ld: profile.c:(.ref.text+0x134): undefined reference to `free_cpumask_var'
   nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 700424681
   kernel/time/hrtimer.o: in function `clock_was_set':
>> hrtimer.c:(.text+0xc48): undefined reference to `zalloc_cpumask_var'
>> nios2-linux-ld: hrtimer.c:(.text+0xd50): undefined reference to `free_cpumask_var'
   nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 361628
   kernel/torture.o: in function `torture_cleanup_begin':
>> torture.c:(.text+0x758): undefined reference to `free_cpumask_var'
   nios2-linux-ld: kernel/torture.o: in function `torture_shuffle_init':
>> torture.c:(.text+0xbe4): undefined reference to `alloc_cpumask_var'
   nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 13757
   block/blk-mq.o: in function `blk_mq_alloc_hctx':
   blk-mq.c:(.text+0x1550): undefined reference to `zalloc_cpumask_var_node'
   nios2-linux-ld: blk-mq.c:(.text+0x16d4): undefined reference to `free_cpumask_var'
   nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 264202560
   block/blk-mq-sysfs.o: in function `blk_mq_hw_sysfs_release':
   blk-mq-sysfs.c:(.text+0x2c0): undefined reference to `free_cpumask_var'
   nios2-linux-ld: nios2-linux-ld: DWARF error: could not find abbrev number 3666
   drivers/base/cpu.o: in function `print_cpus_offline':
   cpu.c:(.text+0x94): undefined reference to `alloc_cpumask_var'
   nios2-linux-ld: cpu.c:(.text+0xec): undefined reference to `free_cpumask_var'
   nios2-linux-ld: drivers/base/cpu.o: in function `print_cpus_isolated':
   cpu.c:(.text+0x1b8): undefined reference to `alloc_cpumask_var'
   nios2-linux-ld: cpu.c:(.text+0x20c): undefined reference to `free_cpumask_var'

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-12 23:15 ` [PATCH RESEND 1/1] " Libo Chen
  2022-04-13  0:18   ` Randy Dunlap
  2022-04-13 13:11   ` kernel test robot
@ 2022-04-13 14:33   ` kernel test robot
  2 siblings, 0 replies; 18+ messages in thread
From: kernel test robot @ 2022-04-13 14:33 UTC (permalink / raw)
  To: Libo Chen, gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: kbuild-all, linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch

Hi Libo,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.18-rc2 next-20220413]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/intel-lab-lkp/linux/commits/Libo-Chen/lib-Kconfig-remove-DEBUG_PER_CPU_MAPS-dependency-for-CPUMASK_OFFSTACK/20220413-073657
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ce522ba9ef7e2d9fb22a39eb3371c0c64e2a433e
config: parisc-randconfig-r014-20220413 (https://download.01.org/0day-ci/archive/20220413/202204132236.KPzXaw0b-lkp@intel.com/config)
compiler: hppa-linux-gcc (GCC) 11.2.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/6636f7cf28d2a79cde937c0f212e8a87080da06d
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Libo-Chen/lib-Kconfig-remove-DEBUG_PER_CPU_MAPS-dependency-for-CPUMASK_OFFSTACK/20220413-073657
        git checkout 6636f7cf28d2a79cde937c0f212e8a87080da06d
        # save the config file to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=parisc SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   hppa-linux-ld: kernel/workqueue.o: in function `free_workqueue_attrs':
>> kernel/workqueue.c:3370: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/workqueue.o: in function `alloc_workqueue_attrs':
>> kernel/workqueue.c:3390: undefined reference to `alloc_cpumask_var'
   hppa-linux-ld: kernel/workqueue.o: in function `workqueue_set_unbound_cpumask':
>> kernel/workqueue.c:5390: undefined reference to `zalloc_cpumask_var'
>> hppa-linux-ld: kernel/workqueue.c:5406: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/workqueue.o: in function `wq_unbound_cpumask_store':
   kernel/workqueue.c:5664: undefined reference to `zalloc_cpumask_var'
   hppa-linux-ld: kernel/workqueue.c:5671: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/workqueue.o: in function `workqueue_init_early':
   kernel/workqueue.c:5995: undefined reference to `alloc_cpumask_var'
   hppa-linux-ld: kernel/sched/core.o: in function `sched_setaffinity':
>> kernel/sched/core.c:7948: undefined reference to `alloc_cpumask_var'
>> hppa-linux-ld: kernel/sched/core.c:7951: undefined reference to `alloc_cpumask_var'
>> hppa-linux-ld: kernel/sched/core.c:7978: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/sched/core.c:7980: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/sched/core.o: in function `__se_sys_sched_setaffinity':
   kernel/sched/core.c:8051: undefined reference to `alloc_cpumask_var'
   hppa-linux-ld: kernel/sched/core.c:8057: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/sched/core.o: in function `__se_sys_sched_getaffinity':
   kernel/sched/core.c:8108: undefined reference to `alloc_cpumask_var'
   hppa-linux-ld: kernel/sched/core.c:8120: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/sched/core.o: in function `sched_init':
>> kernel/sched/core.c:9499: undefined reference to `load_balance_mask'
>> hppa-linux-ld: kernel/sched/core.c:9499: undefined reference to `load_balance_mask'
>> hppa-linux-ld: kernel/sched/core.c:9501: undefined reference to `select_idle_mask'
>> hppa-linux-ld: kernel/sched/core.c:9501: undefined reference to `select_idle_mask'
   hppa-linux-ld: kernel/sched/build_utility.o: in function `housekeeping_setup_type':
>> kernel/sched/isolation.c:104: undefined reference to `alloc_bootmem_cpumask_var'
   hppa-linux-ld: kernel/sched/build_utility.o: in function `housekeeping_setup':
   kernel/sched/isolation.c:122: undefined reference to `alloc_bootmem_cpumask_var'
>> hppa-linux-ld: kernel/sched/isolation.c:128: undefined reference to `alloc_bootmem_cpumask_var'
>> hppa-linux-ld: kernel/sched/isolation.c:173: undefined reference to `free_bootmem_cpumask_var'
   hppa-linux-ld: kernel/sched/isolation.c:175: undefined reference to `free_bootmem_cpumask_var'
   hppa-linux-ld: kernel/taskstats.o: in function `taskstats_user_cmd':
>> kernel/taskstats.c:441: undefined reference to `alloc_cpumask_var'
>> hppa-linux-ld: kernel/taskstats.c:457: undefined reference to `alloc_cpumask_var'
>> hppa-linux-ld: kernel/taskstats.c:464: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/events/core.o: in function `perf_event_init':
>> kernel/events/core.c:13237: undefined reference to `zalloc_cpumask_var'
   hppa-linux-ld: fs/io_uring.o: in function `__io_uring_register':
>> fs/io_uring.c:11472: undefined reference to `alloc_cpumask_var'
>> hppa-linux-ld: fs/io_uring.c:11488: undefined reference to `free_cpumask_var'
   hppa-linux-ld: fs/io_uring.c:11493: undefined reference to `free_cpumask_var'
   hppa-linux-ld: fs/io-wq.o: in function `io_wq_create':
   fs/io-wq.c:1180: undefined reference to `alloc_cpumask_var'
   hppa-linux-ld: fs/io-wq.c:1214: undefined reference to `free_cpumask_var'
   hppa-linux-ld: fs/io-wq.o: in function `io_wq_put_and_exit':
   fs/io-wq.c:1290: undefined reference to `free_cpumask_var'
   hppa-linux-ld: block/blk-mq.o: in function `blk_mq_alloc_hctx':
   block/blk-mq.c:3528: undefined reference to `zalloc_cpumask_var_node'
   hppa-linux-ld: block/blk-mq.c:3575: undefined reference to `free_cpumask_var'
   hppa-linux-ld: drivers/base/cpu.o: in function `print_cpus_offline':
   drivers/base/cpu.c:245: undefined reference to `alloc_cpumask_var'
   hppa-linux-ld: drivers/base/cpu.c:249: undefined reference to `free_cpumask_var'
   hppa-linux-ld: drivers/base/cpu.o: in function `print_cpus_isolated':
   drivers/base/cpu.c:274: undefined reference to `alloc_cpumask_var'
   hppa-linux-ld: drivers/base/cpu.c:281: undefined reference to `free_cpumask_var'
   hppa-linux-ld: drivers/net/ethernet/emulex/benet/be_main.o: in function `be_clear_queues':
   drivers/net/ethernet/emulex/benet/be_main.c:2943: undefined reference to `free_cpumask_var'
   hppa-linux-ld: drivers/net/ethernet/emulex/benet/be_main.o: in function `be_setup_queues':
   drivers/net/ethernet/emulex/benet/be_main.c:2981: undefined reference to `zalloc_cpumask_var'
   hppa-linux-ld: drivers/net/ethernet/sfc/falcon/efx.o: in function `ef4_probe_nic':
   drivers/net/ethernet/sfc/falcon/efx.c:1329: undefined reference to `zalloc_cpumask_var'
   hppa-linux-ld: drivers/net/ethernet/sfc/falcon/efx.c:1344: undefined reference to `free_cpumask_var'
   hppa-linux-ld: net/core/dev.o: in function `netif_get_num_default_rss_queues':
   net/core/dev.c:3001: undefined reference to `zalloc_cpumask_var'
   hppa-linux-ld: net/core/dev.c:3009: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/profile.o: in function `prof_cpu_mask_proc_write':
   kernel/profile.c:361: undefined reference to `zalloc_cpumask_var'
   hppa-linux-ld: kernel/profile.c:369: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/profile.o: in function `profile_init':
   kernel/profile.c:114: undefined reference to `alloc_cpumask_var'
   hppa-linux-ld: kernel/profile.c:132: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/torture.o: in function `torture_cleanup_begin':
   kernel/torture.c:591: undefined reference to `free_cpumask_var'
   hppa-linux-ld: kernel/torture.o: in function `torture_shuffle_init':
   kernel/torture.c:572: undefined reference to `alloc_cpumask_var'
   hppa-linux-ld: block/blk-mq-sysfs.o: in function `blk_mq_hw_sysfs_release':
   block/blk-mq-sysfs.c:41: undefined reference to `free_cpumask_var'


vim +3370 kernel/workqueue.c

1fa44ecad2b864 James Bottomley     2006-02-23  3360  
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3361  /**
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3362   * free_workqueue_attrs - free a workqueue_attrs
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3363   * @attrs: workqueue_attrs to free
226223ab3c4118 Tejun Heo           2013-03-12  3364   *
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3365   * Undo alloc_workqueue_attrs().
226223ab3c4118 Tejun Heo           2013-03-12  3366   */
513c98d0868295 Daniel Jordan       2019-09-05  3367  void free_workqueue_attrs(struct workqueue_attrs *attrs)
226223ab3c4118 Tejun Heo           2013-03-12  3368  {
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3369  	if (attrs) {
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02 @3370  		free_cpumask_var(attrs->cpumask);
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3371  		kfree(attrs);
226223ab3c4118 Tejun Heo           2013-03-12  3372  	}
226223ab3c4118 Tejun Heo           2013-03-12  3373  }
226223ab3c4118 Tejun Heo           2013-03-12  3374  
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3375  /**
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3376   * alloc_workqueue_attrs - allocate a workqueue_attrs
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3377   *
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3378   * Allocate a new workqueue_attrs, initialize with default settings and
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3379   * return it.
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3380   *
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3381   * Return: The allocated new workqueue_attr on success. %NULL on failure.
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3382   */
513c98d0868295 Daniel Jordan       2019-09-05  3383  struct workqueue_attrs *alloc_workqueue_attrs(void)
226223ab3c4118 Tejun Heo           2013-03-12  3384  {
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3385  	struct workqueue_attrs *attrs;
226223ab3c4118 Tejun Heo           2013-03-12  3386  
be69d00d976957 Thomas Gleixner     2019-06-26  3387  	attrs = kzalloc(sizeof(*attrs), GFP_KERNEL);
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3388  	if (!attrs)
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3389  		goto fail;
be69d00d976957 Thomas Gleixner     2019-06-26 @3390  	if (!alloc_cpumask_var(&attrs->cpumask, GFP_KERNEL))
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3391  		goto fail;
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3392  
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3393  	cpumask_copy(attrs->cpumask, cpu_possible_mask);
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3394  	return attrs;
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3395  fail:
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3396  	free_workqueue_attrs(attrs);
6ba94429c8e7b8 Frederic Weisbecker 2015-04-02  3397  	return NULL;
226223ab3c4118 Tejun Heo           2013-03-12  3398  }
226223ab3c4118 Tejun Heo           2013-03-12  3399  

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13  6:56             ` Libo Chen
  2022-04-13  8:37               ` Masahiro Yamada
@ 2022-04-13 15:41               ` Randy Dunlap
  2022-04-13 19:28                 ` Libo Chen
  1 sibling, 1 reply; 18+ messages in thread
From: Randy Dunlap @ 2022-04-13 15:41 UTC (permalink / raw)
  To: Libo Chen, gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch



On 4/12/22 23:56, Libo Chen wrote:
> Hi Randy
> 
> On 4/12/22 22:54, Randy Dunlap wrote:
>> Hi Libo,
>>
>> On 4/12/22 19:34, Libo Chen wrote:
>>>
>>> On 4/12/22 19:13, Randy Dunlap wrote:
>>>> Hi,
>>>>
>>>> On 4/12/22 18:35, Libo Chen wrote:
>>>>> Hi Randy,
>>>>>
>>>>> On 4/12/22 17:18, Randy Dunlap wrote:
>>>>>> Hi--
>>>>>>
>>>>>> On 4/12/22 16:15, Libo Chen wrote:
>>>>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>>>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>>>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>>>>>> rationale for such dependency.
>>>>>>>
>>>>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>>>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>>>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>>>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>>>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>>>>>> There is no reason other architectures cannot given the fact that they
>>>>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>>>>>> x86.
>>>>>>>
>>>>>>> Signed-off-by: Libo Chen <libo.chen@oracle.com>
>>>>>>> ---
>>>>>>>     lib/Kconfig | 2 +-
>>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>>>>>> index 087e06b4cdfd..7209039dfb59 100644
>>>>>>> --- a/lib/Kconfig
>>>>>>> +++ b/lib/Kconfig
>>>>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>>>>>         bool
>>>>>>>       config CPUMASK_OFFSTACK
>>>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>>>>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>>>>>> be displayed.
>>>>>>
>>>>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>>>>>> of DEBUG_PER_CPU_MAPS.
>>>>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
>>>>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
>>>> I'm just talking about the Kconfig change below.  Not talking about whatever else
>>>> it might require per architecture.
>>>>
>>>> But you say you have tried that and it doesn't work. What part of it doesn't work?
>>>> The Kconfig part or some code execution?
>>> oh the Kconfig part. For example, make olddefconfig on a config file with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I explicitly set DEBUG_PER_CPU_MAPS=y
>> I can enable CPUMASK_OFFSTACK for arm64 without having DEBUG_PER_CPU_MAPS enabled.
>> (with a patch, of course.)
>> It builds OK. I don't know if it will run OK.
> 
> I am a little confused, did you succeed with your patch (replacing "if" with "depends on") or my patch (removing "if")? Because I definitely cannot enable CPUMASK_OFFSTACK for arm64 without DEBUG_PER_CPUMAPS enabled using your change.

This patch builds cleanly for me:

---
 arch/arm64/Kconfig |    1 +
 lib/Kconfig        |    2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -511,7 +511,7 @@ config CHECK_SIGNATURE
 	bool
 
 config CPUMASK_OFFSTACK
-	bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
+	bool "Force CPU masks off stack"
 	help
 	  Use dynamic allocation for cpumask_var_t, instead of putting
 	  them on the stack.  This is a bit more expensive, but avoids
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -316,6 +316,7 @@ config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
 
 config SMP
 	def_bool y
+	select CPUMASK_OFFSTACK
 
 config KERNEL_MODE_NEON
 	def_bool y

along with:
# CONFIG_DEBUG_PER_CPU_MAPS is not set


>> I think that you are arguing for a patch like this:
> 
> I am actually arguing for the opposite, I don't think CPUMASK_OFFSTACK should require DEBUG_PER_CPU_MAPS. They should be separate and independent to each other. So removing "if ..." should be enough in my opinion.

I agree.

>> --- a/lib/Kconfig
>> +++ b/lib/Kconfig
>> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
>>       bool
>>     config CPUMASK_OFFSTACK
>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>> +    bool "Force CPU masks off stack"
>> +    depends on DEBUG_PER_CPU_MAPS
> 
> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument is CPUMASK_OFFSTACK should be enable/disabled independent of DEBUG_PER_CPU_MASK
>>       help
>>         Use dynamic allocation for cpumask_var_t, instead of putting
>>         them on the stack.  This is a bit more expensive, but avoids
>>
>>
>> As I said earlier, the "if" on the "bool" line just controls the prompt message.
>> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
>>
> 
> Okay I understand now "if" on the "boot" is not a dependency and it only controls the prompt message, then the question is why we cannot enable CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt message? Is it not the behavior we expect?

Yes, it is. I don't know that the problem is...

-- 
~Randy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13 15:41               ` Randy Dunlap
@ 2022-04-13 19:28                 ` Libo Chen
  2022-04-13 20:52                   ` Arnd Bergmann
  0 siblings, 1 reply; 18+ messages in thread
From: Libo Chen @ 2022-04-13 19:28 UTC (permalink / raw)
  To: Randy Dunlap, gregkh, masahiroy, tglx, peterz, mingo, vbabka, akpm
  Cc: linux-kernel, linux-kbuild, linux-arm-kernel, linux-arch



On 4/13/22 08:41, Randy Dunlap wrote:
>
> On 4/12/22 23:56, Libo Chen wrote:
>> Hi Randy
>>
>> On 4/12/22 22:54, Randy Dunlap wrote:
>>> Hi Libo,
>>>
>>> On 4/12/22 19:34, Libo Chen wrote:
>>>> On 4/12/22 19:13, Randy Dunlap wrote:
>>>>> Hi,
>>>>>
>>>>> On 4/12/22 18:35, Libo Chen wrote:
>>>>>> Hi Randy,
>>>>>>
>>>>>> On 4/12/22 17:18, Randy Dunlap wrote:
>>>>>>> Hi--
>>>>>>>
>>>>>>> On 4/12/22 16:15, Libo Chen wrote:
>>>>>>>> Forcing CPUMASK_OFFSTACK to be conditoned on DEBUG_PER_CPU_MAPS doesn't
>>>>>>>> make a lot of sense nowaday. Even the original patch dating back to 2008,
>>>>>>>> aab46da0520a ("cpumask: Add CONFIG_CPUMASK_OFFSTACK") didn't give any
>>>>>>>> rationale for such dependency.
>>>>>>>>
>>>>>>>> Nowhere in the code supports the presumption that DEBUG_PER_CPU_MAPS is
>>>>>>>> necessary for CONFIG_CPUMASK_OFFSTACK. Make no mistake, it's good to
>>>>>>>> have DEBUG_PER_CPU_MAPS for debugging purpose or precaution, but it's
>>>>>>>> simply not a hard requirement for CPUMASK_OFFSTACK. Moreover, x86 Kconfig
>>>>>>>> already can set CPUMASK_OFFSTACK=y without DEBUG_PER_CPU_MAPS=y.
>>>>>>>> There is no reason other architectures cannot given the fact that they
>>>>>>>> have even fewer, if any, arch-specific CONFIG_DEBUG_PER_CPU_MAPS code than
>>>>>>>> x86.
>>>>>>>>
>>>>>>>> Signed-off-by: Libo Chen <libo.chen@oracle.com>
>>>>>>>> ---
>>>>>>>>      lib/Kconfig | 2 +-
>>>>>>>>      1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/lib/Kconfig b/lib/Kconfig
>>>>>>>> index 087e06b4cdfd..7209039dfb59 100644
>>>>>>>> --- a/lib/Kconfig
>>>>>>>> +++ b/lib/Kconfig
>>>>>>>> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>>>>>>>>          bool
>>>>>>>>        config CPUMASK_OFFSTACK
>>>>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>>>>> This "if" dependency only controls whether the Kconfig symbol's prompt is
>>>>>>> displayed (presented) in kconfig tools. Removing it makes the prompt always
>>>>>>> be displayed.
>>>>>>>
>>>>>>> Any architecture could select (should be able to) CPUMASK_OFFSTACK independently
>>>>>>> of DEBUG_PER_CPU_MAPS.
>>>>>> Do you mean changing arch/xxxx/Kconfig to select CPUMASK_OFFSTACK under some config xxx? That will work but it requires code changes for each architecture.
>>>>>> But if you are talking about setting CONFIG_CPUMASK_OFFSTACK=y without CONFIG_DEBUG_PER_CPU_MAPS directly in config file, I have tried, it doesn't work.
>>>>> I'm just talking about the Kconfig change below.  Not talking about whatever else
>>>>> it might require per architecture.
>>>>>
>>>>> But you say you have tried that and it doesn't work. What part of it doesn't work?
>>>>> The Kconfig part or some code execution?
>>>> oh the Kconfig part. For example, make olddefconfig on a config file with CPUMASK_OFFSTACK=y only turns off CPUMASK_OFFSTACK unless I explicitly set DEBUG_PER_CPU_MAPS=y
>>> I can enable CPUMASK_OFFSTACK for arm64 without having DEBUG_PER_CPU_MAPS enabled.
>>> (with a patch, of course.)
>>> It builds OK. I don't know if it will run OK.
>> I am a little confused, did you succeed with your patch (replacing "if" with "depends on") or my patch (removing "if")? Because I definitely cannot enable CPUMASK_OFFSTACK for arm64 without DEBUG_PER_CPUMAPS enabled using your change.
> This patch builds cleanly for me:
>
> ---
>   arch/arm64/Kconfig |    1 +
>   lib/Kconfig        |    2 +-
>   2 files changed, 2 insertions(+), 1 deletion(-)
>
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -511,7 +511,7 @@ config CHECK_SIGNATURE
>   	bool
>   
>   config CPUMASK_OFFSTACK
> -	bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> +	bool "Force CPU masks off stack"
>   	help
>   	  Use dynamic allocation for cpumask_var_t, instead of putting
>   	  them on the stack.  This is a bit more expensive, but avoids
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -316,6 +316,7 @@ config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
>   
>   config SMP
>   	def_bool y
> +	select CPUMASK_OFFSTACK
>   
>   config KERNEL_MODE_NEON
>   	def_bool y
>
> along with:
> # CONFIG_DEBUG_PER_CPU_MAPS is not set
>
>
>>> I think that you are arguing for a patch like this:
>> I am actually arguing for the opposite, I don't think CPUMASK_OFFSTACK should require DEBUG_PER_CPU_MAPS. They should be separate and independent to each other. So removing "if ..." should be enough in my opinion.
> I agree.
>
>>> --- a/lib/Kconfig
>>> +++ b/lib/Kconfig
>>> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
>>>        bool
>>>      config CPUMASK_OFFSTACK
>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>> +    bool "Force CPU masks off stack"
>>> +    depends on DEBUG_PER_CPU_MAPS
>> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument is CPUMASK_OFFSTACK should be enable/disabled independent of DEBUG_PER_CPU_MASK
>>>        help
>>>          Use dynamic allocation for cpumask_var_t, instead of putting
>>>          them on the stack.  This is a bit more expensive, but avoids
>>>
>>>
>>> As I said earlier, the "if" on the "bool" line just controls the prompt message.
>>> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
>>>
>> Okay I understand now "if" on the "boot" is not a dependency and it only controls the prompt message, then the question is why we cannot enable CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt message? Is it not the behavior we expect?
> Yes, it is. I don't know that the problem is...
Masahiro explained that CPUMASK_OFFSTACK can only be configured by 
options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't 
seem to be what we want.

Libo


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13 19:28                 ` Libo Chen
@ 2022-04-13 20:52                   ` Arnd Bergmann
  2022-04-13 21:50                     ` Libo Chen
  0 siblings, 1 reply; 18+ messages in thread
From: Arnd Bergmann @ 2022-04-13 20:52 UTC (permalink / raw)
  To: Libo Chen
  Cc: Randy Dunlap, gregkh, Masahiro Yamada, Thomas Gleixner,
	Peter Zijlstra, Ingo Molnar, Vlastimil Babka, Andrew Morton,
	Linux Kernel Mailing List, Linux Kbuild mailing list, Linux ARM,
	linux-arch

On Wed, Apr 13, 2022 at 9:28 PM Libo Chen <libo.chen@oracle.com> wrote:
> On 4/13/22 08:41, Randy Dunlap wrote:
> > On 4/12/22 23:56, Libo Chen wrote:
> >>> --- a/lib/Kconfig
> >>> +++ b/lib/Kconfig
> >>> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
> >>>        bool
> >>>      config CPUMASK_OFFSTACK
> >>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> >>> +    bool "Force CPU masks off stack"
> >>> +    depends on DEBUG_PER_CPU_MAPS
> >> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument is CPUMASK_OFFSTACK should be enable/disabled independent of DEBUG_PER_CPU_MASK
> >>>        help
> >>>          Use dynamic allocation for cpumask_var_t, instead of putting
> >>>          them on the stack.  This is a bit more expensive, but avoids
> >>>
> >>>
> >>> As I said earlier, the "if" on the "bool" line just controls the prompt message.
> >>> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
> >>>
> >> Okay I understand now "if" on the "boot" is not a dependency and it only controls the prompt message, then the question is why we cannot enable CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt message? Is it not the behavior we expect?
> > Yes, it is. I don't know that the problem is...
> Masahiro explained that CPUMASK_OFFSTACK can only be configured by
> options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't
> seem to be what we want.

I think the correct way to do it is to follow x86 and powerpc, and tying
CPUMASK_OFFSTACK to "large" values of CONFIG_NR_CPUS.
For smaller values of NR_CPUS, the onstack masks are obviously
cheaper, we just need to decide what the cut-off point is.

In x86, the onstack masks can be selected for normal SMP builds with
up to 512 CPUs, while CONFIG_MAXSMP=y raises the limit to 8192
CPUs while selecting CPUMASK_OFFSTACK.
PowerPC does it the other way round, selecting CPUMASK_OFFSTACK
implicitly whenever NR_CPUS is set to 8192 or more.

I think we can easily do the same as powerpc on arm64. With the
ApacheBench test you cite in the patch description, what is the
value of NR_CPUS at which you start seeing a noticeable
benefit for offstack masks? Can you do the same test for
NR_CPUS=1024 or 2048?

           Arnd

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13 20:52                   ` Arnd Bergmann
@ 2022-04-13 21:50                     ` Libo Chen
  2022-04-14  1:20                       ` Randy Dunlap
  2022-04-14 11:41                       ` Arnd Bergmann
  0 siblings, 2 replies; 18+ messages in thread
From: Libo Chen @ 2022-04-13 21:50 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Randy Dunlap, gregkh, Masahiro Yamada, Thomas Gleixner,
	Peter Zijlstra, Ingo Molnar, Vlastimil Babka, Andrew Morton,
	Linux Kernel Mailing List, Linux Kbuild mailing list, Linux ARM,
	linux-arch



On 4/13/22 13:52, Arnd Bergmann wrote:
> On Wed, Apr 13, 2022 at 9:28 PM Libo Chen <libo.chen@oracle.com> wrote:
>> On 4/13/22 08:41, Randy Dunlap wrote:
>>> On 4/12/22 23:56, Libo Chen wrote:
>>>>> --- a/lib/Kconfig
>>>>> +++ b/lib/Kconfig
>>>>> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
>>>>>         bool
>>>>>       config CPUMASK_OFFSTACK
>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>>> +    bool "Force CPU masks off stack"
>>>>> +    depends on DEBUG_PER_CPU_MAPS
>>>> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument is CPUMASK_OFFSTACK should be enable/disabled independent of DEBUG_PER_CPU_MASK
>>>>>         help
>>>>>           Use dynamic allocation for cpumask_var_t, instead of putting
>>>>>           them on the stack.  This is a bit more expensive, but avoids
>>>>>
>>>>>
>>>>> As I said earlier, the "if" on the "bool" line just controls the prompt message.
>>>>> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
>>>>>
>>>> Okay I understand now "if" on the "boot" is not a dependency and it only controls the prompt message, then the question is why we cannot enable CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt message? Is it not the behavior we expect?
>>> Yes, it is. I don't know that the problem is...
>> Masahiro explained that CPUMASK_OFFSTACK can only be configured by
>> options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't
>> seem to be what we want.
> I think the correct way to do it is to follow x86 and powerpc, and tying
> CPUMASK_OFFSTACK to "large" values of CONFIG_NR_CPUS.
> For smaller values of NR_CPUS, the onstack masks are obviously
> cheaper, we just need to decide what the cut-off point is.
I agree. It appears enabling CPUMASK_OFFSTACK breaks kernel builds on 
some architectures such as parisc and nios2 as reported by kernel test 
robot. Maybe it makes sense to use DEBUG_PER_CPU_MAPS as some kind of 
guard on CPUMASK_OFFSTACK.
> In x86, the onstack masks can be selected for normal SMP builds with
> up to 512 CPUs, while CONFIG_MAXSMP=y raises the limit to 8192
> CPUs while selecting CPUMASK_OFFSTACK.
> PowerPC does it the other way round, selecting CPUMASK_OFFSTACK
> implicitly whenever NR_CPUS is set to 8192 or more.
>
> I think we can easily do the same as powerpc on arm64. With the
I am leaning more towards x86's way because even NR_CPUS=160 is too 
expensive for 4-core arm64 VMs according to apachebench. I highly doubt 
that there is a good cut-off point to make everybody happy (or not unhappy).
> ApacheBench test you cite in the patch description, what is the
> value of NR_CPUS at which you start seeing a noticeable
> benefit for offstack masks? Can you do the same test for
> NR_CPUS=1024 or 2048?
As mentioned above, a good cut-off point moves depends on the actual 
number of CPUs. But yeah I can do the same test for 1024 or even smaller 
NR_CPUs values on the same 64-core arm64 VM setup.

Libo

>
>             Arnd


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13 21:50                     ` Libo Chen
@ 2022-04-14  1:20                       ` Randy Dunlap
  2022-04-14 11:41                       ` Arnd Bergmann
  1 sibling, 0 replies; 18+ messages in thread
From: Randy Dunlap @ 2022-04-14  1:20 UTC (permalink / raw)
  To: Libo Chen, Arnd Bergmann
  Cc: gregkh, Masahiro Yamada, Thomas Gleixner, Peter Zijlstra,
	Ingo Molnar, Vlastimil Babka, Andrew Morton,
	Linux Kernel Mailing List, Linux Kbuild mailing list, Linux ARM,
	linux-arch

Hi,

On 4/13/22 14:50, Libo Chen wrote:
> 
> 
> On 4/13/22 13:52, Arnd Bergmann wrote:
>> On Wed, Apr 13, 2022 at 9:28 PM Libo Chen <libo.chen@oracle.com> wrote:
>>> On 4/13/22 08:41, Randy Dunlap wrote:
>>>> On 4/12/22 23:56, Libo Chen wrote:
>>>>>> --- a/lib/Kconfig
>>>>>> +++ b/lib/Kconfig
>>>>>> @@ -511,7 +511,8 @@ config CHECK_SIGNATURE
>>>>>>         bool
>>>>>>       config CPUMASK_OFFSTACK
>>>>>> -    bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
>>>>>> +    bool "Force CPU masks off stack"
>>>>>> +    depends on DEBUG_PER_CPU_MAPS
>>>>> This forces every arch to enable DEBUG_PER_CPU_MAPS if they want to enable CPUMASK_OFFSTACK, it's even stronger than "if". My whole argument is CPUMASK_OFFSTACK should be enable/disabled independent of DEBUG_PER_CPU_MASK
>>>>>>         help
>>>>>>           Use dynamic allocation for cpumask_var_t, instead of putting
>>>>>>           them on the stack.  This is a bit more expensive, but avoids
>>>>>>
>>>>>>
>>>>>> As I said earlier, the "if" on the "bool" line just controls the prompt message.
>>>>>> This patch make CPUMASK_OFFSTACK require DEBUG_PER_CPU_MAPS -- which might be overkill.
>>>>>>
>>>>> Okay I understand now "if" on the "boot" is not a dependency and it only controls the prompt message, then the question is why we cannot enable CPUMASK_OFFSTACK without DEBUG_PER_CPU_MAPS if it only controls prompt message? Is it not the behavior we expect?
>>>> Yes, it is. I don't know that the problem is...
>>> Masahiro explained that CPUMASK_OFFSTACK can only be configured by
>>> options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't
>>> seem to be what we want.
>> I think the correct way to do it is to follow x86 and powerpc, and tying
>> CPUMASK_OFFSTACK to "large" values of CONFIG_NR_CPUS.

Sure. Just FTR, I was just trying to see if an arch (arm64) would build OK or not
when CPUMASK_OFFSTACK was enabled. and it does.
My patch wasn't meant to have a very long life.

>> For smaller values of NR_CPUS, the onstack masks are obviously
>> cheaper, we just need to decide what the cut-off point is.
> I agree. It appears enabling CPUMASK_OFFSTACK breaks kernel builds on some architectures such as parisc and nios2 as reported by kernel test robot. Maybe it makes sense to use DEBUG_PER_CPU_MAPS as some kind of guard on CPUMASK_OFFSTACK.
>> In x86, the onstack masks can be selected for normal SMP builds with
>> up to 512 CPUs, while CONFIG_MAXSMP=y raises the limit to 8192
>> CPUs while selecting CPUMASK_OFFSTACK.
>> PowerPC does it the other way round, selecting CPUMASK_OFFSTACK
>> implicitly whenever NR_CPUS is set to 8192 or more.
>>
>> I think we can easily do the same as powerpc on arm64. With the
> I am leaning more towards x86's way because even NR_CPUS=160 is too expensive for 4-core arm64 VMs according to apachebench. I highly doubt that there is a good cut-off point to make everybody happy (or not unhappy).
>> ApacheBench test you cite in the patch description, what is the
>> value of NR_CPUS at which you start seeing a noticeable
>> benefit for offstack masks? Can you do the same test for
>> NR_CPUS=1024 or 2048?
> As mentioned above, a good cut-off point moves depends on the actual number of CPUs. But yeah I can do the same test for 1024 or even smaller NR_CPUs values on the same 64-core arm64 VM setup.


-- 
~Randy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-13 21:50                     ` Libo Chen
  2022-04-14  1:20                       ` Randy Dunlap
@ 2022-04-14 11:41                       ` Arnd Bergmann
  2022-04-14 18:01                         ` Libo Chen
  1 sibling, 1 reply; 18+ messages in thread
From: Arnd Bergmann @ 2022-04-14 11:41 UTC (permalink / raw)
  To: Libo Chen
  Cc: Arnd Bergmann, Randy Dunlap, gregkh, Masahiro Yamada,
	Thomas Gleixner, Peter Zijlstra, Ingo Molnar, Vlastimil Babka,
	Andrew Morton, Linux Kernel Mailing List,
	Linux Kbuild mailing list, Linux ARM, linux-arch

On Wed, Apr 13, 2022 at 11:50 PM Libo Chen <libo.chen@oracle.com> wrote:
> On 4/13/22 13:52, Arnd Bergmann wrote:
> >>> Yes, it is. I don't know that the problem is...
> >> Masahiro explained that CPUMASK_OFFSTACK can only be configured by
> >> options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't
> >> seem to be what we want.
> > I think the correct way to do it is to follow x86 and powerpc, and tying
> > CPUMASK_OFFSTACK to "large" values of CONFIG_NR_CPUS.
> > For smaller values of NR_CPUS, the onstack masks are obviously
> > cheaper, we just need to decide what the cut-off point is.
>
> I agree. It appears enabling CPUMASK_OFFSTACK breaks kernel builds on
> some architectures such as parisc and nios2 as reported by kernel test
> robot. Maybe it makes sense to use DEBUG_PER_CPU_MAPS as some kind of
> guard on CPUMASK_OFFSTACK.

NIOS2 does not support SMP builds at all, so it should never be possible to
select CPUMASK_OFFSTACK there. We may want to guard
DEBUG_PER_CPU_MAPS by adding a 'depends on SMP' in order to
prevent it from getting selected.

For PARISC, the largest configuration is 32-way SMP, so CPUMASK_OFFSTACK
is clearly pointless there as well, even though it should technically
be possible
to support. What is the build error on parisc?

> > In x86, the onstack masks can be selected for normal SMP builds with
> > up to 512 CPUs, while CONFIG_MAXSMP=y raises the limit to 8192
> > CPUs while selecting CPUMASK_OFFSTACK.
> > PowerPC does it the other way round, selecting CPUMASK_OFFSTACK
> > implicitly whenever NR_CPUS is set to 8192 or more.
> >
> > I think we can easily do the same as powerpc on arm64. With the
> I am leaning more towards x86's way because even NR_CPUS=160 is too
> expensive for 4-core arm64 VMs according to apachebench. I highly doubt
> that there is a good cut-off point to make everybody happy (or not unhappy).

It seems surprising that you would see any improvement for offstack masks
when using NR_CPUS=160, that is just three 64-bit words worth of data, but
it requires allocating the mask dynamically, which takes way more memory
to initialize.

> > ApacheBench test you cite in the patch description, what is the
> > value of NR_CPUS at which you start seeing a noticeable
> > benefit for offstack masks? Can you do the same test for
> > NR_CPUS=1024 or 2048?
>
> As mentioned above, a good cut-off point moves depends on the actual
> number of CPUs. But yeah I can do the same test for 1024 or even smaller
> NR_CPUs values on the same 64-core arm64 VM setup.

If you see an improvement for small NR_CPUS values using offstack masks,
it's possible that the actual difference is something completely
different and we
can just make the on-stack case faster, possibly the cause is something about
cacheline alignment or inlining decisions using your specific kernel config.

Are you able to compare the 'perf report' output between runs with either
size to see where the extra time gets spent?

        Arnd

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
  2022-04-14 11:41                       ` Arnd Bergmann
@ 2022-04-14 18:01                         ` Libo Chen
  0 siblings, 0 replies; 18+ messages in thread
From: Libo Chen @ 2022-04-14 18:01 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Randy Dunlap, gregkh, Masahiro Yamada, Thomas Gleixner,
	Peter Zijlstra, Ingo Molnar, Vlastimil Babka, Andrew Morton,
	Linux Kernel Mailing List, Linux Kbuild mailing list, Linux ARM,
	linux-arch



On 4/14/22 04:41, Arnd Bergmann wrote:
> On Wed, Apr 13, 2022 at 11:50 PM Libo Chen <libo.chen@oracle.com> wrote:
>> On 4/13/22 13:52, Arnd Bergmann wrote:
>>>>> Yes, it is. I don't know that the problem is...
>>>> Masahiro explained that CPUMASK_OFFSTACK can only be configured by
>>>> options not users if DEBUG_PER_CPU_MASK is not enabled. This doesn't
>>>> seem to be what we want.
>>> I think the correct way to do it is to follow x86 and powerpc, and tying
>>> CPUMASK_OFFSTACK to "large" values of CONFIG_NR_CPUS.
>>> For smaller values of NR_CPUS, the onstack masks are obviously
>>> cheaper, we just need to decide what the cut-off point is.
>> I agree. It appears enabling CPUMASK_OFFSTACK breaks kernel builds on
>> some architectures such as parisc and nios2 as reported by kernel test
>> robot. Maybe it makes sense to use DEBUG_PER_CPU_MAPS as some kind of
>> guard on CPUMASK_OFFSTACK.
> NIOS2 does not support SMP builds at all, so it should never be possible to
> select CPUMASK_OFFSTACK there. We may want to guard
> DEBUG_PER_CPU_MAPS by adding a 'depends on SMP' in order to
> prevent it from getting selected.
>
> For PARISC, the largest configuration is 32-way SMP, so CPUMASK_OFFSTACK
> is clearly pointless there as well, even though it should technically
> be possible
> to support. What is the build error on parisc?
Similar to NIOS2, A bunch of undefined references to *_cpumask_var() 
calls.  It seems that PARISC doesn't support the cpumask offstack API at all

>>> In x86, the onstack masks can be selected for normal SMP builds with
>>> up to 512 CPUs, while CONFIG_MAXSMP=y raises the limit to 8192
>>> CPUs while selecting CPUMASK_OFFSTACK.
>>> PowerPC does it the other way round, selecting CPUMASK_OFFSTACK
>>> implicitly whenever NR_CPUS is set to 8192 or more.
>>>
>>> I think we can easily do the same as powerpc on arm64. With the
>> I am leaning more towards x86's way because even NR_CPUS=160 is too
>> expensive for 4-core arm64 VMs according to apachebench. I highly doubt
>> that there is a good cut-off point to make everybody happy (or not unhappy).
> It seems surprising that you would see any improvement for offstack masks
> when using NR_CPUS=160, that is just three 64-bit words worth of data, but
> it requires allocating the mask dynamically, which takes way more memory
> to initialize.
>
>>> ApacheBench test you cite in the patch description, what is the
>>> value of NR_CPUS at which you start seeing a noticeable
>>> benefit for offstack masks? Can you do the same test for
>>> NR_CPUS=1024 or 2048?
>> As mentioned above, a good cut-off point moves depends on the actual
>> number of CPUs. But yeah I can do the same test for 1024 or even smaller
>> NR_CPUs values on the same 64-core arm64 VM setup.
> If you see an improvement for small NR_CPUS values using offstack masks,
> it's possible that the actual difference is something completely
> different and we
> can just make the on-stack case faster, possibly the cause is something about
> cacheline alignment or inlining decisions using your specific kernel config.
>
> Are you able to compare the 'perf report' output between runs with either
> size to see where the extra time gets spent?
Okay yeah I will take some time to gather more data. It does appear that 
something else may also contribute to the performance difference.

Thanks
Libo
>          Arnd


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2022-04-14 18:01 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-12 23:15 [PATCH RESEND 0/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK Libo Chen
2022-04-12 23:15 ` [PATCH RESEND 1/1] " Libo Chen
2022-04-13  0:18   ` Randy Dunlap
2022-04-13  1:35     ` Libo Chen
2022-04-13  2:13       ` Randy Dunlap
2022-04-13  2:34         ` Libo Chen
2022-04-13  5:54           ` Randy Dunlap
2022-04-13  6:56             ` Libo Chen
2022-04-13  8:37               ` Masahiro Yamada
2022-04-13 15:41               ` Randy Dunlap
2022-04-13 19:28                 ` Libo Chen
2022-04-13 20:52                   ` Arnd Bergmann
2022-04-13 21:50                     ` Libo Chen
2022-04-14  1:20                       ` Randy Dunlap
2022-04-14 11:41                       ` Arnd Bergmann
2022-04-14 18:01                         ` Libo Chen
2022-04-13 13:11   ` kernel test robot
2022-04-13 14:33   ` kernel test robot

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).