LKML Archive on lore.kernel.org
* [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
@ 2019-01-21 11:47 Eial Czerwacki
  2019-01-30 10:33 ` Eial Czerwacki
  2019-03-01 18:30 ` Barret Rhoden
  0 siblings, 2 replies; 17+ messages in thread
From: Eial Czerwacki @ 2019-01-21 11:47 UTC (permalink / raw)
  To: linux-kernel; +Cc: Eial Czerwacki, Shai Fultheim, Oren Twaig

as reported in bug #201339 (https://bugzilla.kernel.org/show_bug.cgi?id=201339)
by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the default one
causing the struct size to exceed the size ok 8KB.

in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if CONFIG_X86_VSMP is set.

the value was caculated on linux 4.20.3, make allmodconfig all and the following oneliner:
for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done |grep data..percpu -B 1 |grep ko |while read r; do echo -n "$r: "; objdump --syms --section=.data..percpu $r|grep data |sort -n  |awk '{c++; d=strtonum("0x" $1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 0x%x ( %d )\n", c, m, m)}' ; done |column -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc

Signed-off-by: Eial Czerwacki <eial@scalemp.com>
Signed-off-by: Shai Fultheim <shai@scalemp.com>
Signed-off-by: Oren Twaig <oren@scalemp.com>
---
 include/linux/percpu.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index 70b7123..6b79693 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -14,7 +14,11 @@
 
 /* enough to cover all DEFINE_PER_CPUs in modules */
 #ifdef CONFIG_MODULES
+#ifdef X86_VSMP
+#define PERCPU_MODULE_RESERVE		(1 << 16)
+#else
 #define PERCPU_MODULE_RESERVE		(8 << 10)
+#endif
 #else
 #define PERCPU_MODULE_RESERVE		0
 #endif
-- 
2.7.4



* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-01-21 11:47 [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set Eial Czerwacki
@ 2019-01-30 10:33 ` Eial Czerwacki
  2019-03-01 18:30 ` Barret Rhoden
  1 sibling, 0 replies; 17+ messages in thread
From: Eial Czerwacki @ 2019-01-30 10:33 UTC (permalink / raw)
  To: linux-kernel; +Cc: Shai Fultheim, Oren Twaig

Greetings,

On 1/21/19 1:47 PM, Eial Czerwacki wrote:
> as reported in bug #201339 (https://bugzilla.kernel.org/show_bug.cgi?id=201339)
> by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the default one
> causing the struct size to exceed the size ok 8KB.
> 
> in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if CONFIG_X86_VSMP is set.
> 
> the value was caculated on linux 4.20.3, make allmodconfig all and the following oneliner:
> for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done |grep data..percpu -B 1 |grep ko |while read r; do echo -n "$r: "; objdump --syms --section=.data..percpu $r|grep data |sort -n  |awk '{c++; d=strtonum("0x" $1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 0x%x ( %d )\n", c, m, m)}' ; done |column -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc
> 
> Signed-off-by: Eial Czerwacki <eial@scalemp.com>
> Signed-off-by: Shai Fultheim <shai@scalemp.com>
> Signed-off-by: Oren Twaig <oren@scalemp.com>
> ---
>  include/linux/percpu.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
> index 70b7123..6b79693 100644
> --- a/include/linux/percpu.h
> +++ b/include/linux/percpu.h
> @@ -14,7 +14,11 @@
>  
>  /* enough to cover all DEFINE_PER_CPUs in modules */
>  #ifdef CONFIG_MODULES
> +#ifdef X86_VSMP
> +#define PERCPU_MODULE_RESERVE		(1 << 16)
> +#else
>  #define PERCPU_MODULE_RESERVE		(8 << 10)
> +#endif
>  #else
>  #define PERCPU_MODULE_RESERVE		0
>  #endif
> 
Is it possible to push this patch to mainline?
There seem to be no objections or comments regarding it, and
we'd like to fix the bug mentioned above.


* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-01-21 11:47 [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set Eial Czerwacki
  2019-01-30 10:33 ` Eial Czerwacki
@ 2019-03-01 18:30 ` Barret Rhoden
  2019-03-01 20:34   ` Dennis Zhou
  2019-03-04  7:42   ` Eial Czerwacki
  1 sibling, 2 replies; 17+ messages in thread
From: Barret Rhoden @ 2019-03-01 18:30 UTC (permalink / raw)
  To: Eial Czerwacki, dennis, tj, cl; +Cc: linux-kernel, Shai Fultheim, Oren Twaig

Hi -

On 01/21/2019 06:47 AM, Eial Czerwacki wrote:
 >

Your main issue was that you only sent this patch to LKML, but not the 
maintainers of the file.  If you don't, your patch might get lost.  To 
get the appropriate people and lists, run:

	scripts/get_maintainer.pl YOUR_PATCH.patch

For this patch, you'll get this:

Dennis Zhou <dennis@kernel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
Tejun Heo <tj@kernel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
Christoph Lameter <cl@linux.com> (maintainer:PER-CPU MEMORY ALLOCATOR)
linux-kernel@vger.kernel.org (open list)

I added the three maintainers to this email.

I have a few minor comments below.

> [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set

You misspelled 'reservation'.  Also, I'd just say: "percpu: increase 
module reservation size if X86_VSMP is set".  (That changes 'change' to 
'increase' and says 'reservation' only once.)

> as reported in bug #201339 (https://bugzilla.kernel.org/show_bug.cgi?id=201339)

I think you can add a tag for this right above your Signed-off-by tags. 
e.g.:

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201339

> by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the default one
> causing the struct size to exceed the size ok 8KB.
                                             ^of

Which struct are you talking about?  I have one in mind, but others 
might not know from reading the commit message.

I ran into this on https://bugzilla.kernel.org/show_bug.cgi?id=202511. 
In that case, it was because modules (drm and amdkfd) were using 
DEFINE_SRCU, which does a DEFINE_PER_CPU on struct srcu_data, and that 
used ____cacheline_internodealigned_in_smp.
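
To illustrate why the alignment matters, here is a minimal user-space
sketch (not kernel code). It assumes a 4096-byte internode alignment with
X86_VSMP and 64 bytes otherwise. struct demo_payload, struct demo_default
and struct demo_vsmp are hypothetical stand-ins, not the real struct
srcu_data, and __attribute__((aligned)) is the GCC/Clang spelling.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed alignments: 64 bytes by default, 4096 bytes with X86_VSMP. */
#define DEFAULT_ALIGN	64
#define VSMP_ALIGN	4096

/* Hypothetical payload; the real struct srcu_data has different fields. */
struct demo_payload {
	unsigned long counters[4];
	void *ptrs[8];
};

/* Same payload, aligned to the assumed default boundary. */
struct demo_default {
	struct demo_payload p;
} __attribute__((aligned(DEFAULT_ALIGN)));

/* Same payload, aligned to the assumed VSMP internode boundary. */
struct demo_vsmp {
	struct demo_payload p;
} __attribute__((aligned(VSMP_ALIGN)));

/* Round size up to the next multiple of align (a power of two). */
static size_t round_up_pow2(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}
```

Because sizeof() is always a multiple of the type's alignment, the
4096-aligned variant occupies a full 4096 bytes however small its payload
is, and every per-CPU instance of it has to start on a 4096-byte boundary.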

> 
> in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if CONFIG_X86_VSMP is set.
                                ^increase

> 
> the value was caculated on linux 4.20.3, make allmodconfig all and the following oneliner:
                ^calculated

> for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done |grep data..percpu -B 1 |grep ko |while read r; do echo -n "$r: "; objdump --syms --section=.data..percpu $r|grep data |sort -n  |awk '{c++; d=strtonum("0x" $1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 0x%x ( %d )\n", c, m, m)}' ; done |column -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc

Not sure how useful the one-liner is, versus a description of what 
you're doing, e.g. "the size of all module percpu data sections", or 
something like that.

Also, how close was that calculated value to 64K?  If more modules start 
using DEFINE_SRCU, each of which uses 8K, then that 64K might run out.

Thanks,
Barret

> Signed-off-by: Eial Czerwacki <eial@scalemp.com>
> Signed-off-by: Shai Fultheim <shai@scalemp.com>
> Signed-off-by: Oren Twaig <oren@scalemp.com>
> ---
>   include/linux/percpu.h | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
> index 70b7123..6b79693 100644
> --- a/include/linux/percpu.h
> +++ b/include/linux/percpu.h
> @@ -14,7 +14,11 @@
>   
>   /* enough to cover all DEFINE_PER_CPUs in modules */
>   #ifdef CONFIG_MODULES
> +#ifdef X86_VSMP
> +#define PERCPU_MODULE_RESERVE		(1 << 16)
> +#else
>   #define PERCPU_MODULE_RESERVE		(8 << 10)
> +#endif
>   #else
>   #define PERCPU_MODULE_RESERVE		0
>   #endif
> 



* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-01 18:30 ` Barret Rhoden
@ 2019-03-01 20:34   ` Dennis Zhou
  2019-03-01 21:27     ` Barret Rhoden
  2019-03-04  7:42   ` Eial Czerwacki
  1 sibling, 1 reply; 17+ messages in thread
From: Dennis Zhou @ 2019-03-01 20:34 UTC (permalink / raw)
  To: Barret Rhoden
  Cc: Eial Czerwacki, dennis, tj, cl, linux-kernel, Shai Fultheim,
	Oren Twaig, Paul E. McKenney

Hi Barret,

On Fri, Mar 01, 2019 at 01:30:15PM -0500, Barret Rhoden wrote:
> Hi -
> 
> On 01/21/2019 06:47 AM, Eial Czerwacki wrote:
> >
> 
> Your main issue was that you only sent this patch to LKML, but not the
> maintainers of the file.  If you don't, your patch might get lost.  To get
> the appropriate people and lists, run:
> 
> 	scripts/get_maintainer.pl YOUR_PATCH.patch.
> 
> For this patch, you'll get this:
> 
> Dennis Zhou <dennis@kernel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
> Tejun Heo <tj@kernel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
> Christoph Lameter <cl@linux.com> (maintainer:PER-CPU MEMORY ALLOCATOR)
> linux-kernel@vger.kernel.org (open list)
> 
> I added the three maintainers to this email.
> 
> I have a few minor comments below.
> 
> > [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
> 
> You misspelled 'reservation'.  Also, I'd just say: "percpu: increase module
> reservation size if X86_VSMP is set".  ('change' -> 'increase'), only says
> 'reservation' once.)
> 
> > as reported in bug #201339 (https://bugzilla.kernel.org/show_bug.cgi?id=201339)
> 
> I think you can add a tag for this right above your Signed-off-by tags.
> e.g.:
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201339
> 
> > by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the default one
> > causing the struct size to exceed the size ok 8KB.
>                                             ^of
> 
> Which struct are you talking about?  I have one in mind, but others might
> not know from reading the commit message.
> 
> I ran into this on https://bugzilla.kernel.org/show_bug.cgi?id=202511. In
> that case, it was because modules (drm and amdkfd) were using DEFINE_SRCU,
> which does a DEFINE_PER_CPU on struct srcu_data, and that used
> ____cacheline_internodealigned_in_smp.
> 
> > 
> > in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if CONFIG_X86_VSMP is set.
>                                ^increase
> 
> > 
> > the value was caculated on linux 4.20.3, make allmodconfig all and the following oneliner:
>                ^calculated
> 
> > for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done |grep data..percpu -B 1 |grep ko |while read r; do echo -n "$r: "; objdump --syms --section=.data..percpu $r|grep data |sort -n  |awk '{c++; d=strtonum("0x" $1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 0x%x ( %d )\n", c, m, m)}' ; done |column -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc
> 
> Not sure how useful the one-liner is, versus a description of what you're
> doing.  i.e. "the size of all module percpu data sections, or something."
> 
> Also, how close was that calculated value to 64K?  If more modules start
> using DEFINE_SRCU, each of which uses 8K, then that 64K might run out.
> 
> Thanks,
> Barret
> 
> > Signed-off-by: Eial Czerwacki <eial@scalemp.com>
> > Signed-off-by: Shai Fultheim <shai@scalemp.com>
> > Signed-off-by: Oren Twaig <oren@scalemp.com>
> > ---
> >   include/linux/percpu.h | 4 ++++
> >   1 file changed, 4 insertions(+)
> > 
> > diff --git a/include/linux/percpu.h b/include/linux/percpu.h
> > index 70b7123..6b79693 100644
> > --- a/include/linux/percpu.h
> > +++ b/include/linux/percpu.h
> > @@ -14,7 +14,11 @@
> >   /* enough to cover all DEFINE_PER_CPUs in modules */
> >   #ifdef CONFIG_MODULES
> > +#ifdef X86_VSMP
> > +#define PERCPU_MODULE_RESERVE		(1 << 16)
> > +#else
> >   #define PERCPU_MODULE_RESERVE		(8 << 10)
> > +#endif
> >   #else
> >   #define PERCPU_MODULE_RESERVE		0
> >   #endif
> > 
> 

Thanks for sending this to me.

I must say, I really do not want to expand the reserved region. In most
cases, it can easily end up unused and thus be wasted memory, as it is
hard-allocated at boot. The reservation exists because code gen assumes
static variables are close to the program counter. That would not be
true of dynamic allocations, which live at the end of the vmalloc area
(summarized from Tejun's account in [1]).

Another note on the reserved region: it starts at the end of the static
region, which means it generally isn't page aligned. So while an 8KB
allocation might fit by size, one that also needs 4KB alignment would
more than likely fail. Something as large as 8KB should probably be
dynamically allocated as well.
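
The alignment problem can be sketched in user space as follows. This is
only an illustration under assumptions: fits_in_reserve() is a
hypothetical helper that models an empty reserved region, not the percpu
allocator's actual logic, and the offsets used with it are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Round off up to the next multiple of align (a power of two). */
static size_t align_up(size_t off, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (off + align - 1) & ~(align - 1);
}

/*
 * Hypothetical fit check: can an object of the given size and alignment
 * be placed in an empty reserved region that starts at region_start
 * (the end of the static percpu area) and is region_size bytes long?
 */
static bool fits_in_reserve(size_t region_start, size_t region_size,
			    size_t size, size_t align)
{
	size_t start = align_up(region_start, align);
	size_t pad = start - region_start;	/* bytes lost to alignment */

	return pad <= region_size && size <= region_size - pad;
}
```

With an 8K region starting at the unaligned offset 0x1a40, an 8K object
with 8-byte alignment fits, but the same object with 4K alignment does
not, because the padding needed to reach the next 4K boundary is taken
out of the region.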

I read through the bugzilla report and it seems that the culprits are:
  drivers/gpu/drm/amd/amdkfd/kfd_process.c:DEFINE_SRCU(kfd_processes_srcu);
  drivers/gpu/drm/drm_drv.c:DEFINE_STATIC_SRCU(drm_unplug_srcu);

Is there a reason we cannot dynamically initialize these structs? I've
cced Paul McKenney because we saw an issue with ipmi in December [1].

[1] https://lore.kernel.org/linux-mm/CAJM9R-JWO1P_qJzw2JboMH2dgPX7K1tF49nO5ojvf=iwGddXRQ@mail.gmail.com/

Thanks,
Dennis


* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-01 20:34   ` Dennis Zhou
@ 2019-03-01 21:27     ` Barret Rhoden
  2019-03-01 21:54       ` Christopher Lameter
  0 siblings, 1 reply; 17+ messages in thread
From: Barret Rhoden @ 2019-03-01 21:27 UTC (permalink / raw)
  To: Dennis Zhou
  Cc: Eial Czerwacki, tj, cl, linux-kernel, Shai Fultheim, Oren Twaig,
	Paul E. McKenney

Hi -

On 03/01/2019 03:34 PM, Dennis Zhou wrote:
> Hi Barret,
> 
> On Fri, Mar 01, 2019 at 01:30:15PM -0500, Barret Rhoden wrote:
>> Hi -
>>
>> On 01/21/2019 06:47 AM, Eial Czerwacki wrote:
>>>
>>
>> Your main issue was that you only sent this patch to LKML, but not the
>> maintainers of the file.  If you don't, your patch might get lost.  To get
>> the appropriate people and lists, run:
>>
>> 	scripts/get_maintainer.pl YOUR_PATCH.patch.
>>
>> For this patch, you'll get this:
>>
>> Dennis Zhou <dennis@kernel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> Tejun Heo <tj@kernel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> Christoph Lameter <cl@linux.com> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> linux-kernel@vger.kernel.org (open list)
>>
>> I added the three maintainers to this email.
>>
>> I have a few minor comments below.
>>
>>> [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
>>
>> You misspelled 'reservation'.  Also, I'd just say: "percpu: increase module
>> reservation size if X86_VSMP is set".  ('change' -> 'increase'), only says
>> 'reservation' once.)
>>
>>> as reported in bug #201339 (https://bugzilla.kernel.org/show_bug.cgi?id=201339)
>>
>> I think you can add a tag for this right above your Signed-off-by tags.
>> e.g.:
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201339
>>
>>> by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the default one
>>> causing the struct size to exceed the size ok 8KB.
>>                                              ^of
>>
>> Which struct are you talking about?  I have one in mind, but others might
>> not know from reading the commit message.
>>
>> I ran into this on https://bugzilla.kernel.org/show_bug.cgi?id=202511. In
>> that case, it was because modules (drm and amdkfd) were using DEFINE_SRCU,
>> which does a DEFINE_PER_CPU on struct srcu_data, and that used
>> ____cacheline_internodealigned_in_smp.
>>
>>>
>>> in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if CONFIG_X86_VSMP is set.
>>                                 ^increase
>>
>>>
>>> the value was caculated on linux 4.20.3, make allmodconfig all and the following oneliner:
>>                 ^calculated
>>
>>> for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done |grep data..percpu -B 1 |grep ko |while read r; do echo -n "$r: "; objdump --syms --section=.data..percpu $r|grep data |sort -n  |awk '{c++; d=strtonum("0x" $1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 0x%x ( %d )\n", c, m, m)}' ; done |column -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc
>>
>> Not sure how useful the one-liner is, versus a description of what you're
>> doing.  i.e. "the size of all module percpu data sections, or something."
>>
>> Also, how close was that calculated value to 64K?  If more modules start
>> using DEFINE_SRCU, each of which uses 8K, then that 64K might run out.
>>
>> Thanks,
>> Barret
>>
>>> Signed-off-by: Eial Czerwacki <eial@scalemp.com>
>>> Signed-off-by: Shai Fultheim <shai@scalemp.com>
>>> Signed-off-by: Oren Twaig <oren@scalemp.com>
>>> ---
>>>    include/linux/percpu.h | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>>
>>> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
>>> index 70b7123..6b79693 100644
>>> --- a/include/linux/percpu.h
>>> +++ b/include/linux/percpu.h
>>> @@ -14,7 +14,11 @@
>>>    /* enough to cover all DEFINE_PER_CPUs in modules */
>>>    #ifdef CONFIG_MODULES
>>> +#ifdef X86_VSMP
>>> +#define PERCPU_MODULE_RESERVE		(1 << 16)
>>> +#else
>>>    #define PERCPU_MODULE_RESERVE		(8 << 10)
>>> +#endif
>>>    #else
>>>    #define PERCPU_MODULE_RESERVE		0
>>>    #endif
>>>
>>
> 
> Thanks for sending this to me.
> 
> I must say, I really do not want to expand the reserved region. In most
> cases, it can easily end up unused and thus wasted memory as it is hard
> allocated on boot. This is done because code gen assumes static
> variables are close to the program counter. This would not be true with
> dynamic allocations which being at the end of the vmalloc area
> (Summarized from Tejun's account in [1]).
> 
> Another note on the reserved region. It starts at the end of the static
> region which means it generally isn't page aligned. So while an 8kb
> allocation would fit, a 4kb alignment more than likely would fail.
> Something as large as 8kb should probably be dynamically allocated as
> well.
 >
> I read through the bugzilla report and it seems that the culprits are:
>    drivers/gpu/drm/amd/amdkfd/kfd_process.c:DEFINE_SRCU(kfd_processes_srcu);
>    drivers/gpu/drm/drm_drv.c:DEFINE_STATIC_SRCU(drm_unplug_srcu);
> 
> Is there a reason we cannot dynamically initialize these structs? I've
> cced Paul McKenney because we saw an issue with ipmi in December [1].

I looked at the AMD driver, and it looks like they could dynamically 
initialize it.  It would require a little extra plumbing.  I imagine the 
DRM one is the same way.

To catch this in the future, should we disallow DEFINE_SRCU in modules 
or something?  Otherwise, this will pop up again the next time someone 
uses DEFINE_SRCU in a module and builds with CONFIG_X86_VSMP.

That might be a little much, and it still won't be sufficient to catch 
all cases.  This will also come up any time a module has a static 
per-cpu data structure that uses ____cacheline_internodealigned_in_smp, 
so it's not limited to SRCU either.

I'm not familiar with VSMP - how bad is it to use L1 cache alignment 
instead of 4K page alignment?  Maybe some structures can use the smaller 
alignment?  Or maybe have VSMP require SRCU-using modules to be built-in?

Thanks,

Barret



* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-01 21:27     ` Barret Rhoden
@ 2019-03-01 21:54       ` Christopher Lameter
  2019-03-13 19:40         ` Barret Rhoden
  0 siblings, 1 reply; 17+ messages in thread
From: Christopher Lameter @ 2019-03-01 21:54 UTC (permalink / raw)
  To: Barret Rhoden
  Cc: Dennis Zhou, Eial Czerwacki, tj, linux-kernel, Shai Fultheim,
	Oren Twaig, Paul E. McKenney

On Fri, 1 Mar 2019, Barret Rhoden wrote:

> I'm not familiar with VSMP - how bad is it to use L1 cache alignment instead
> of 4K page alignment?  Maybe some structures can use the smaller alignment?
> Or maybe have VSMP require SRCU-using modules to be built-in?

It is very expensive. VSMP exchanges 4K segments via RDMA between servers
to build a large address space and run a kernel in the large address
space. Using smaller segments can cause a lot of
"cacheline" bouncing (meaning transfers of 4K segments back and forth
between servers).
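
A small user-space sketch of the effect, assuming a 64-byte L1 line
(shift 6) and the 4K transfer unit described above (shift 12).
same_segment() is a hypothetical helper and the addresses used with it
are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define L1_SHIFT	6	/* assumed 64-byte L1 cache line */
#define SEGMENT_SHIFT	12	/* 4K segment, as described above */

/* Do two addresses fall into the same 2^shift-byte unit? */
static bool same_segment(uintptr_t a, uintptr_t b, unsigned int shift)
{
	assert(shift < sizeof(uintptr_t) * 8);
	return (a >> shift) == (b >> shift);
}
```

Two objects placed 64 bytes apart, say at 0x1000 and 0x1040, sit on
different L1 lines but in the same 4K segment, so writers on different
servers would still bounce that segment back and forth.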


* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-01 18:30 ` Barret Rhoden
  2019-03-01 20:34   ` Dennis Zhou
@ 2019-03-04  7:42   ` Eial Czerwacki
  1 sibling, 0 replies; 17+ messages in thread
From: Eial Czerwacki @ 2019-03-04  7:42 UTC (permalink / raw)
  To: Barret Rhoden; +Cc: dennis, tj, cl, linux-kernel, Shai Fultheim, Oren Twaig

Greetings Barret,

On 3/1/19 8:30 PM, Barret Rhoden wrote:
> Hi -
> 
> On 01/21/2019 06:47 AM, Eial Czerwacki wrote:
>>
> 
> Your main issue was that you only sent this patch to LKML, but not the
> maintainers of the file.  If you don't, your patch might get lost.  To
> get the appropriate people and lists, run:
> 
>     scripts/get_maintainer.pl YOUR_PATCH.patch.
> 
> For this patch, you'll get this:
> 
> Dennis Zhou <dennis@kernel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
> Tejun Heo <tj@kernel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
> Christoph Lameter <cl@linux.com> (maintainer:PER-CPU MEMORY ALLOCATOR)
> linux-kernel@vger.kernel.org (open list)
> 
> I added the three maintainers to this email.
> 
> I have a few minor comments below.
> 
Thanks, I did not know that. I'll use it next time.

>> [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
> 
> You misspelled 'reservation'.  Also, I'd just say: "percpu: increase
> module reservation size if X86_VSMP is set".  ('change' -> 'increase'),
> only says 'reservation' once.)
> 
>> as reported in bug #201339
>> (https://bugzilla.kernel.org/show_bug.cgi?id=201339)
> 
> I think you can add a tag for this right above your Signed-off-by tags.
> e.g.:
> 
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201339
> 
>> by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from
>> the default one
>> causing the struct size to exceed the size ok 8KB.
>                                             ^of
> 
will fix, thanks.

> Which struct are you talking about?  I have one in mind, but others
> might not know from reading the commit message.
> 
> I ran into this on https://bugzilla.kernel.org/show_bug.cgi?id=202511.
> In that case, it was because modules (drm and amdkfd) were using
> DEFINE_SRCU, which does a DEFINE_PER_CPU on struct srcu_data, and that
> used ____cacheline_internodealigned_in_smp.
You are correct, the structure in question is struct srcu_data.

> 
>>
>> in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if
>> CONFIG_X86_VSMP is set.
>                                ^increase
> 
>>
>> the value was caculated on linux 4.20.3, make allmodconfig all and the
>> following oneliner:
>                ^calculated
> 
will fix, thanks.

>> for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc;
>> done |grep data..percpu -B 1 |grep ko |while read r; do echo -n "$r:
>> "; objdump --syms --section=.data..percpu $r|grep data |sort -n  |awk
>> '{c++; d=strtonum("0x" $1) + strtonum("0x" $5); if (m < d) m = d;} END
>> {printf("%d vars-> last addr 0x%x ( %d )\n", c, m, m)}' ; done |column
>> -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc
> 
> Not sure how useful the one-liner is, versus a description of what
> you're doing.  i.e. "the size of all module percpu data sections, or
> something."
I thought an easy way to reproduce the number would suffice; I'll take that into account.

> 
> Also, how close was that calculated value to 64K?  If more modules start
> using DEFINE_SRCU, each of which uses 8K, then that 64K might run out.
The biggest module's percpu data was 12472 bytes in size. Since multiple
modules share the same percpu reserve, more is needed; 64K was the only
size with which I was able to make everything fit.

Of course, it is possible that in some specific scenario 64K will not be
enough, but we have yet to encounter such a scenario.
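
As a description of what the one-liner computes, here is a user-space
sketch in C. It is an illustration only: struct percpu_sym and both
helpers are hypothetical names, and the inputs would be the symbol
addresses and sizes that objdump reports for each module's
.data..percpu section.

```c
#include <assert.h>
#include <stddef.h>

/* One symbol in a module's .data..percpu section (illustrative type). */
struct percpu_sym {
	size_t offset;	/* symbol address within the section */
	size_t size;	/* symbol size in bytes */
};

/* Extent of one module's percpu section: the largest offset + size. */
static size_t module_percpu_extent(const struct percpu_sym *syms, size_t n)
{
	size_t end = 0;

	for (size_t i = 0; i < n; i++) {
		/* Guard against overflow of offset + size. */
		assert(syms[i].size <= (size_t)-1 - syms[i].offset);
		if (syms[i].offset + syms[i].size > end)
			end = syms[i].offset + syms[i].size;
	}
	return end;
}

/* Sum the per-module extents, as the final "paste -sd+ | bc" step does. */
static size_t total_percpu_extent(const size_t *extents, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += extents[i];
	return total;
}
```

The total is an upper bound for the case where every module is loaded at
once; it does not account for padding that alignment adds between
modules inside the reserved region.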

> 
> Thanks,
> Barret
> 
>> Signed-off-by: Eial Czerwacki <eial@scalemp.com>
>> Signed-off-by: Shai Fultheim <shai@scalemp.com>
>> Signed-off-by: Oren Twaig <oren@scalemp.com>
>> ---
>>   include/linux/percpu.h | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
>> index 70b7123..6b79693 100644
>> --- a/include/linux/percpu.h
>> +++ b/include/linux/percpu.h
>> @@ -14,7 +14,11 @@
>>     /* enough to cover all DEFINE_PER_CPUs in modules */
>>   #ifdef CONFIG_MODULES
>> +#ifdef X86_VSMP
>> +#define PERCPU_MODULE_RESERVE        (1 << 16)
>> +#else
>>   #define PERCPU_MODULE_RESERVE        (8 << 10)
>> +#endif
>>   #else
>>   #define PERCPU_MODULE_RESERVE        0
>>   #endif
>>
> 
> 



* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-01 21:54       ` Christopher Lameter
@ 2019-03-13 19:40         ` Barret Rhoden
  2019-03-13 20:26           ` Tejun Heo
  2019-04-04 20:53           ` Christopher Lameter
  0 siblings, 2 replies; 17+ messages in thread
From: Barret Rhoden @ 2019-03-13 19:40 UTC (permalink / raw)
  To: Christopher Lameter
  Cc: Dennis Zhou, Eial Czerwacki, tj, linux-kernel, Shai Fultheim,
	Oren Twaig, Paul E. McKenney

Hi -

On 03/01/2019 04:54 PM, Christopher Lameter wrote:
> On Fri, 1 Mar 2019, Barret Rhoden wrote:
> 
>> I'm not familiar with VSMP - how bad is it to use L1 cache alignment instead
>> of 4K page alignment?  Maybe some structures can use the smaller alignment?
>> Or maybe have VSMP require SRCU-using modules to be built-in?
> 
> It is very expensive. VSMP exchanges 4K segments via RDMA between servers
> to build a large address space and run a kernel in the large address
> space. Using smaller segments can cause a lot of
> "cacheline" bouncing (meaning transfers of 4K segments back and forth
> between servers).
> 

Given that these are large machines, would it be OK to statically 
reserve 64K on them for modules' percpu data?

The bug that led me here came from someone running on a non-VSMP 
machine with that config option set.  Perhaps we should make it clearer 
in the Kconfig option's description that it should not be set on other 
machines.  That would make it less likely that anyone on a non-VSMP 
machine pays the 64K overhead.

Are there any other alternatives?  Not using static SRCU in any code 
that could be built as a module seems a little harsh.

Thanks,

Barret



* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-13 19:40         ` Barret Rhoden
@ 2019-03-13 20:26           ` Tejun Heo
  2019-03-13 21:22             ` Paul E. McKenney
  2019-04-04 20:53           ` Christopher Lameter
  1 sibling, 1 reply; 17+ messages in thread
From: Tejun Heo @ 2019-03-13 20:26 UTC (permalink / raw)
  To: Barret Rhoden
  Cc: Christopher Lameter, Dennis Zhou, Eial Czerwacki, linux-kernel,
	Shai Fultheim, Oren Twaig, Paul E. McKenney

Hello,

On Wed, Mar 13, 2019 at 03:40:04PM -0400, Barret Rhoden wrote:
> Are there any other alternatives?  Not using static SRCU in any code
> that could be built as a module seems a little harsh.

Yes, allocate the srcu dynamically on module init and destroy on
module exit.  That's how the other similar case got solved too.  We
can't keep bumping up reserved size by the number of static SRCUs in
modules.  It's mostly there to make trivial small things easier.  We
don't lose anything meaningful by allocating srcu dynamically.
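
The shape of that change can be sketched in user space as below. This is
not kernel code: the struct srcu_struct, init_srcu_struct() and
cleanup_srcu_struct() defined here are simplified stand-ins for the
kernel facilities of the same names, and demo_srcu, demo_module_init(),
demo_module_exit() and DEMO_NR_CPUS are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define DEMO_NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* User-space stand-in for the kernel's struct srcu_struct. */
struct srcu_struct {
	unsigned long *percpu_counters;	/* models per-CPU data allocated at run time */
};

/* Stand-in for init_srcu_struct(): allocate the per-CPU state dynamically. */
static int init_srcu_struct(struct srcu_struct *ssp)
{
	ssp->percpu_counters = calloc(DEMO_NR_CPUS,
				      sizeof(*ssp->percpu_counters));
	return ssp->percpu_counters ? 0 : -ENOMEM;
}

/* Stand-in for cleanup_srcu_struct(): release what init allocated. */
static void cleanup_srcu_struct(struct srcu_struct *ssp)
{
	free(ssp->percpu_counters);
	ssp->percpu_counters = NULL;
}

/* Instead of DEFINE_STATIC_SRCU(demo_srcu): a plain object, set up at load time. */
static struct srcu_struct demo_srcu;

/* What the module's init function would do. */
static int demo_module_init(void)
{
	return init_srcu_struct(&demo_srcu);
}

/* What the module's exit function would do, once no readers remain. */
static void demo_module_exit(void)
{
	cleanup_srcu_struct(&demo_srcu);
}
```

In this arrangement the module contributes only the small struct itself
to its static data; the per-CPU part comes from the dynamic allocator and
never touches the reserved region.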

Thanks.

-- 
tejun


* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-13 20:26           ` Tejun Heo
@ 2019-03-13 21:22             ` Paul E. McKenney
  2019-03-13 21:29               ` Tejun Heo
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2019-03-13 21:22 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Barret Rhoden, Christopher Lameter, Dennis Zhou, Eial Czerwacki,
	linux-kernel, Shai Fultheim, Oren Twaig

On Wed, Mar 13, 2019 at 01:26:40PM -0700, Tejun Heo wrote:
> Hello,
> 
> On Wed, Mar 13, 2019 at 03:40:04PM -0400, Barret Rhoden wrote:
> > Are there any other alternatives?  Not using static SRCU in any code
> > that could be built as a module seems a little harsh.
> 
> Yes, allocate the srcu dynamically on module init and destroy on
> module exit.  That's how the other similar case got solved too.  We
> can't keep bumping up reserved size by the number of static SRCUs in
> modules.  It's mostly there to make trivial small things easier.  We
> don't lose anything meaningful by allocating srcu dynamically.

Should I define DEFINE_SRCU() and DEFINE_STATIC_SRCU() only if
!defined(MODULE)?

Untested (probably doesn't even build) patch below.

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 7f7c8c050f63..a979da9cf71f 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -105,6 +105,8 @@ struct srcu_struct {
  * Define and initialize a srcu struct at build time.
  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
  *
+ * Build-time srcu_struct definition is not allowed in modules.
+ *
  * Note that although DEFINE_STATIC_SRCU() hides the name from other
  * files, the per-CPU variable rules nevertheless require that the
  * chosen name be globally unique.  These rules also prohibit use of
@@ -120,11 +122,13 @@ struct srcu_struct {
  *
  * See include/linux/percpu-defs.h for the rules on per-CPU variables.
  */
-#define __DEFINE_SRCU(name, is_static)					\
-	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);\
+#ifndef MODULE
+#  define __DEFINE_SRCU(name, is_static)				\
+	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);	\
 	is_static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name##_srcu_data)
-#define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
-#define DEFINE_STATIC_SRCU(name)	__DEFINE_SRCU(name, static)
+#  define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
+#  define DEFINE_STATIC_SRCU(name)	__DEFINE_SRCU(name, static)
+#endif
 
 void synchronize_srcu_expedited(struct srcu_struct *ssp);
 void srcu_barrier(struct srcu_struct *ssp);
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 5ff797fd3715..7cf1e3aed695 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -496,9 +496,18 @@ static struct rcu_torture_ops rcu_busted_ops = {
  * Definitions for srcu torture testing.
  */
 
-DEFINE_STATIC_SRCU(srcu_ctl);
 static struct srcu_struct srcu_ctld;
-static struct srcu_struct *srcu_ctlp = &srcu_ctl;
+static struct srcu_struct *srcu_ctlp;
+
+#ifndef MODULE
+DEFINE_STATIC_SRCU(srcu_ctl);
+
+static void srcu_torture_init(void)
+{
+	rcu_sync_torture_init();
+	srcu_ctlp = &srcu_ctl;
+}
+#endif
 
 static int srcu_torture_read_lock(void) __acquires(srcu_ctlp)
 {
@@ -565,9 +574,10 @@ static void srcu_torture_synchronize_expedited(void)
 	synchronize_srcu_expedited(srcu_ctlp);
 }
 
+#ifndef MODULE
 static struct rcu_torture_ops srcu_ops = {
 	.ttype		= SRCU_FLAVOR,
-	.init		= rcu_sync_torture_init,
+	.init		= srcu_torture_init,
 	.readlock	= srcu_torture_read_lock,
 	.read_delay	= srcu_read_delay,
 	.readunlock	= srcu_torture_read_unlock,
@@ -581,25 +591,25 @@ static struct rcu_torture_ops srcu_ops = {
 	.irq_capable	= 1,
 	.name		= "srcu"
 };
+#endif
 
-static void srcu_torture_init(void)
+static void srcud_torture_init(void)
 {
 	rcu_sync_torture_init();
 	WARN_ON(init_srcu_struct(&srcu_ctld));
 	srcu_ctlp = &srcu_ctld;
 }
 
-static void srcu_torture_cleanup(void)
+static void srcud_torture_cleanup(void)
 {
 	cleanup_srcu_struct(&srcu_ctld);
-	srcu_ctlp = &srcu_ctl; /* In case of a later rcutorture run. */
 }
 
 /* As above, but dynamically allocated. */
 static struct rcu_torture_ops srcud_ops = {
 	.ttype		= SRCU_FLAVOR,
-	.init		= srcu_torture_init,
-	.cleanup	= srcu_torture_cleanup,
+	.init		= srcud_torture_init,
+	.cleanup	= srcud_torture_cleanup,
 	.readlock	= srcu_torture_read_lock,
 	.read_delay	= srcu_read_delay,
 	.readunlock	= srcu_torture_read_unlock,
@@ -617,8 +627,8 @@ static struct rcu_torture_ops srcud_ops = {
 /* As above, but broken due to inappropriate reader extension. */
 static struct rcu_torture_ops busted_srcud_ops = {
 	.ttype		= SRCU_FLAVOR,
-	.init		= srcu_torture_init,
-	.cleanup	= srcu_torture_cleanup,
+	.init		= srcud_torture_init,
+	.cleanup	= srcud_torture_cleanup,
 	.readlock	= srcu_torture_read_lock,
 	.read_delay	= rcu_read_delay,
 	.readunlock	= srcu_torture_read_unlock,


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-13 21:22             ` Paul E. McKenney
@ 2019-03-13 21:29               ` Tejun Heo
  2019-03-13 23:11                 ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Tejun Heo @ 2019-03-13 21:29 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Barret Rhoden, Christopher Lameter, Dennis Zhou, Eial Czerwacki,
	linux-kernel, Shai Fultheim, Oren Twaig

Hello,

On Wed, Mar 13, 2019 at 02:22:55PM -0700, Paul E. McKenney wrote:
> Should I define DEFINE_SRCU() and DEFINE_STATIC_SRCU() only if
> !defined(MODULE)?

Yeah, that sounds like a great idea with comments explaining why it's
like that.
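
For readers following along, the guard under discussion can be modeled in
user space. This is only a sketch, not kernel code: struct thing,
DEFINE_STATIC_THING(), thing_init_dynamic() and thing_get() are hypothetical
names invented for illustration, and only the `#ifndef MODULE` guard mirrors
the proposal.

```c
#include <assert.h>

/* Hypothetical stand-in for a structure that needs per-CPU backing storage. */
struct thing {
	int initialized;
};

/*
 * Build-time definition is offered only to built-in code.  When this file
 * is compiled with -DMODULE the macro does not exist, so any use of it
 * fails at compile time rather than consuming reserved space at load time.
 */
#ifndef MODULE
#  define DEFINE_STATIC_THING(name) \
	static struct thing name = { .initialized = 1 }
DEFINE_STATIC_THING(builtin_thing);
#endif

/* Always available: plain storage that is set up at run time. */
static struct thing dynamic_thing;

/* Run-time initialization path, usable by built-in and module code alike. */
static struct thing *thing_init_dynamic(struct thing *t)
{
	assert(t);
	t->initialized = 1;
	return t;
}

/* Prefer the build-time instance when it exists, else the dynamic one. */
static struct thing *thing_get(void)
{
#ifndef MODULE
	return &builtin_thing;
#else
	return thing_init_dynamic(&dynamic_thing);
#endif
}
```

Built without -DMODULE, thing_get() returns the build-time instance; built
with -DMODULE, the macro is gone and only the run-time path remains.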

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-13 21:29               ` Tejun Heo
@ 2019-03-13 23:11                 ` Paul E. McKenney
  2019-03-14 17:36                   ` Tejun Heo
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2019-03-13 23:11 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Barret Rhoden, Christopher Lameter, Dennis Zhou, Eial Czerwacki,
	linux-kernel, Shai Fultheim, Oren Twaig

On Wed, Mar 13, 2019 at 02:29:12PM -0700, Tejun Heo wrote:
> Hello,
> 
> On Wed, Mar 13, 2019 at 02:22:55PM -0700, Paul E. McKenney wrote:
> > Should I define DEFINE_SRCU() and DEFINE_STATIC_SRCU() only if
> > !defined(MODULE)?
> 
> Yeah, that sounds like a great idea with comments explaining why it's
> like that.

Like this?

 * Build-time srcu_struct definition is not allowed in modules because
 * otherwise it is necessary to increase the size of the reserved region
 * each time a DEFINE_SRCU() or DEFINE_STATIC_SRCU() is added to a
 * kernel module.  Kernel modules should instead declare an srcu_struct
 * and then invoke init_srcu_struct() from their module_init function and
 * cleanup_srcu_struct() from their module_exit function.  Note that modules
 * using call_srcu() will also need to invoke srcu_barrier() from their
 * module_exit function.
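
The ordering this comment asks of modules can be sketched as a user-space
model. Everything prefixed mock_ is a stand-in written for this illustration
and is not the kernel API; my_module_init() and my_module_exit() are
likewise hypothetical. In a real module the corresponding calls would be
init_srcu_struct(), call_srcu(), srcu_barrier() and cleanup_srcu_struct().

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's struct srcu_struct; NOT the real layout. */
struct mock_srcu {
	bool initialized;
	int pending_callbacks;	/* callbacks queued via the call_srcu() model */
};

/* Models init_srcu_struct(): set the structure up at run time. */
static int mock_init_srcu(struct mock_srcu *sp)
{
	sp->initialized = true;
	sp->pending_callbacks = 0;
	return 0;
}

/* Models call_srcu(): queue a callback to run after a grace period. */
static void mock_call_srcu(struct mock_srcu *sp)
{
	assert(sp->initialized);
	sp->pending_callbacks++;
}

/* Models srcu_barrier(): wait until all queued callbacks have run. */
static void mock_srcu_barrier(struct mock_srcu *sp)
{
	sp->pending_callbacks = 0;
}

/* Models cleanup_srcu_struct(): modeled as valid only with nothing pending. */
static void mock_cleanup_srcu(struct mock_srcu *sp)
{
	assert(sp->pending_callbacks == 0);
	sp->initialized = false;
}

/* Declared, not build-time initialized. */
static struct mock_srcu my_srcu;

/* What a module_init function would do. */
static int my_module_init(void)
{
	return mock_init_srcu(&my_srcu);
}

/* What a module_exit function would do: barrier first, then cleanup. */
static void my_module_exit(void)
{
	mock_srcu_barrier(&my_srcu);
	mock_cleanup_srcu(&my_srcu);
}
```

The point of the model is the exit path: the barrier runs before cleanup,
so no queued callback can outlive the structure.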

Also, it looks like Barret beat me to this suggestion.  ;-)

In addition, rcutorture and rcuperf needed to be updated because
they used to use DEFINE_STATIC_SRCU() whether built in or built
as a loadable module.

How does the (very lightly tested) patch below look to you all?

							Thanx, Paul

------------------------------------------------------------------------

commit 34f67df09cc0c6bf082a7cfca435373caeeb8d82
Author: Paul E. McKenney <paulmck@linux.ibm.com>
Date:   Wed Mar 13 16:06:22 2019 -0700

    srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules
    
    Adding DEFINE_SRCU() or DEFINE_STATIC_SRCU() to a loadable module
    requires that the size of the reserved region be increased, which is
    not something we want to be doing all that often.  Instead, loadable
    modules should define an srcu_struct and invoke init_srcu_struct()
    from their module_init function and cleanup_srcu_struct() from their
    module_exit function.  Note that modules using call_srcu() will also
    need to invoke srcu_barrier() from their module_exit function.
    
    This commit enforces this advice by refusing to define DEFINE_SRCU()
    and DEFINE_STATIC_SRCU() within loadable modules.
    
    Suggested-by: Barret Rhoden <brho@google.com>
    Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>

diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 7f7c8c050f63..ac5ea1c72e97 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -105,6 +105,15 @@ struct srcu_struct {
  * Define and initialize a srcu struct at build time.
  * Do -not- call init_srcu_struct() nor cleanup_srcu_struct() on it.
  *
+ * Build-time srcu_struct definition is not allowed in modules because
+ * otherwise it is necessary to increase the size of the reserved region
+ * each time a DEFINE_SRCU() or DEFINE_STATIC_SRCU() is added to a
+ * kernel module.  Kernel modules should instead declare an srcu_struct
+ * and then invoke init_srcu_struct() from their module_init function and
+ * cleanup_srcu_struct() from their module_exit function.  Note that modules
+ * using call_srcu() will also need to invoke srcu_barrier() from their
+ * module_exit function.
+ *
  * Note that although DEFINE_STATIC_SRCU() hides the name from other
  * files, the per-CPU variable rules nevertheless require that the
  * chosen name be globally unique.  These rules also prohibit use of
@@ -120,11 +129,13 @@ struct srcu_struct {
  *
  * See include/linux/percpu-defs.h for the rules on per-CPU variables.
  */
-#define __DEFINE_SRCU(name, is_static)					\
-	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);\
+#ifndef MODULE
+#  define __DEFINE_SRCU(name, is_static)				\
+	static DEFINE_PER_CPU(struct srcu_data, name##_srcu_data);	\
 	is_static struct srcu_struct name = __SRCU_STRUCT_INIT(name, name##_srcu_data)
-#define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
-#define DEFINE_STATIC_SRCU(name)	__DEFINE_SRCU(name, static)
+#  define DEFINE_SRCU(name)		__DEFINE_SRCU(name, /* not static */)
+#  define DEFINE_STATIC_SRCU(name)	__DEFINE_SRCU(name, static)
+#endif
 
 void synchronize_srcu_expedited(struct srcu_struct *ssp);
 void srcu_barrier(struct srcu_struct *ssp);
diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index c29761152874..b44208b3bf95 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -139,6 +139,7 @@ struct rcu_perf_ops {
 	void (*sync)(void);
 	void (*exp_sync)(void);
 	const char *name;
+	const char *altname;
 };
 
 static struct rcu_perf_ops *cur_ops;
@@ -186,8 +187,16 @@ static struct rcu_perf_ops rcu_ops = {
  * Definitions for srcu perf testing.
  */
 
+static struct srcu_struct *srcu_ctlp;
+
+#ifndef MODULE
 DEFINE_STATIC_SRCU(srcu_ctl_perf);
-static struct srcu_struct *srcu_ctlp = &srcu_ctl_perf;
+
+static void srcu_sync_perf_init(void)
+{
+	srcu_ctlp = &srcu_ctl_perf;
+}
+#endif
 
 static int srcu_perf_read_lock(void) __acquires(srcu_ctlp)
 {
@@ -224,9 +233,10 @@ static void srcu_perf_synchronize_expedited(void)
 	synchronize_srcu_expedited(srcu_ctlp);
 }
 
+#ifndef MODULE
 static struct rcu_perf_ops srcu_ops = {
 	.ptype		= SRCU_FLAVOR,
-	.init		= rcu_sync_perf_init,
+	.init		= srcu_sync_perf_init,
 	.readlock	= srcu_perf_read_lock,
 	.readunlock	= srcu_perf_read_unlock,
 	.get_gp_seq	= srcu_perf_completed,
@@ -238,24 +248,25 @@ static struct rcu_perf_ops srcu_ops = {
 	.exp_sync	= srcu_perf_synchronize_expedited,
 	.name		= "srcu"
 };
+#endif
 
 static struct srcu_struct srcud;
 
-static void srcu_sync_perf_init(void)
+static void srcud_sync_perf_init(void)
 {
 	srcu_ctlp = &srcud;
 	init_srcu_struct(srcu_ctlp);
 }
 
-static void srcu_sync_perf_cleanup(void)
+static void srcud_sync_perf_cleanup(void)
 {
 	cleanup_srcu_struct(srcu_ctlp);
 }
 
 static struct rcu_perf_ops srcud_ops = {
 	.ptype		= SRCU_FLAVOR,
-	.init		= srcu_sync_perf_init,
-	.cleanup	= srcu_sync_perf_cleanup,
+	.init		= srcud_sync_perf_init,
+	.cleanup	= srcud_sync_perf_cleanup,
 	.readlock	= srcu_perf_read_lock,
 	.readunlock	= srcu_perf_read_unlock,
 	.get_gp_seq	= srcu_perf_completed,
@@ -265,7 +276,10 @@ static struct rcu_perf_ops srcud_ops = {
 	.gp_barrier	= srcu_rcu_barrier,
 	.sync		= srcu_perf_synchronize,
 	.exp_sync	= srcu_perf_synchronize_expedited,
-	.name		= "srcud"
+	.name		= "srcud",
+#ifndef MODULE
+	.altname	= "srcu" /* Avoid breaking kbuild test robot. */
+#endif
 };
 
 /*
@@ -594,7 +608,11 @@ rcu_perf_init(void)
 	long i;
 	int firsterr = 0;
 	static struct rcu_perf_ops *perf_ops[] = {
-		&rcu_ops, &srcu_ops, &srcud_ops, &tasks_ops,
+		&rcu_ops,
+#ifndef MODULE
+		&srcu_ops,
+#endif
+		&srcud_ops, &tasks_ops,
 	};
 
 	if (!torture_init_begin(perf_type, verbose))
@@ -605,6 +623,11 @@ rcu_perf_init(void)
 		cur_ops = perf_ops[i];
 		if (strcmp(perf_type, cur_ops->name) == 0)
 			break;
+		if (cur_ops->altname &&
+		    strcmp(perf_type, cur_ops->altname) == 0) {
+			pr_alert("rcu-perf: substituting perf type: \"%s\" for \"%s\"\n", cur_ops->name, perf_type);
+			break;
+		}
 	}
 	if (i == ARRAY_SIZE(perf_ops)) {
 		pr_alert("rcu-perf: invalid perf type: \"%s\"\n", perf_type);
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 5ff797fd3715..e4674c550b0f 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -300,6 +300,7 @@ struct rcu_torture_ops {
 	int can_boost;
 	int extendables;
 	const char *name;
+	const char *altname;
 };
 
 static struct rcu_torture_ops *cur_ops;
@@ -496,9 +497,18 @@ static struct rcu_torture_ops rcu_busted_ops = {
  * Definitions for srcu torture testing.
  */
 
-DEFINE_STATIC_SRCU(srcu_ctl);
 static struct srcu_struct srcu_ctld;
-static struct srcu_struct *srcu_ctlp = &srcu_ctl;
+static struct srcu_struct *srcu_ctlp;
+
+#ifndef MODULE
+DEFINE_STATIC_SRCU(srcu_ctl);
+
+static void srcu_torture_init(void)
+{
+	rcu_sync_torture_init();
+	srcu_ctlp = &srcu_ctl;
+}
+#endif
 
 static int srcu_torture_read_lock(void) __acquires(srcu_ctlp)
 {
@@ -565,9 +575,10 @@ static void srcu_torture_synchronize_expedited(void)
 	synchronize_srcu_expedited(srcu_ctlp);
 }
 
+#ifndef MODULE
 static struct rcu_torture_ops srcu_ops = {
 	.ttype		= SRCU_FLAVOR,
-	.init		= rcu_sync_torture_init,
+	.init		= srcu_torture_init,
 	.readlock	= srcu_torture_read_lock,
 	.read_delay	= srcu_read_delay,
 	.readunlock	= srcu_torture_read_unlock,
@@ -581,25 +592,25 @@ static struct rcu_torture_ops srcu_ops = {
 	.irq_capable	= 1,
 	.name		= "srcu"
 };
+#endif
 
-static void srcu_torture_init(void)
+static void srcud_torture_init(void)
 {
 	rcu_sync_torture_init();
 	WARN_ON(init_srcu_struct(&srcu_ctld));
 	srcu_ctlp = &srcu_ctld;
 }
 
-static void srcu_torture_cleanup(void)
+static void srcud_torture_cleanup(void)
 {
 	cleanup_srcu_struct(&srcu_ctld);
-	srcu_ctlp = &srcu_ctl; /* In case of a later rcutorture run. */
 }
 
 /* As above, but dynamically allocated. */
 static struct rcu_torture_ops srcud_ops = {
 	.ttype		= SRCU_FLAVOR,
-	.init		= srcu_torture_init,
-	.cleanup	= srcu_torture_cleanup,
+	.init		= srcud_torture_init,
+	.cleanup	= srcud_torture_cleanup,
 	.readlock	= srcu_torture_read_lock,
 	.read_delay	= srcu_read_delay,
 	.readunlock	= srcu_torture_read_unlock,
@@ -611,14 +622,17 @@ static struct rcu_torture_ops srcud_ops = {
 	.cb_barrier	= srcu_torture_barrier,
 	.stats		= srcu_torture_stats,
 	.irq_capable	= 1,
-	.name		= "srcud"
+	.name		= "srcud",
+#ifndef MODULE
+	.altname	= "srcu" /* Avoid breaking kbuild test robot. */
+#endif
 };
 
 /* As above, but broken due to inappropriate reader extension. */
 static struct rcu_torture_ops busted_srcud_ops = {
 	.ttype		= SRCU_FLAVOR,
-	.init		= srcu_torture_init,
-	.cleanup	= srcu_torture_cleanup,
+	.init		= srcud_torture_init,
+	.cleanup	= srcud_torture_cleanup,
 	.readlock	= srcu_torture_read_lock,
 	.read_delay	= rcu_read_delay,
 	.readunlock	= srcu_torture_read_unlock,
@@ -2235,7 +2249,11 @@ rcu_torture_init(void)
 	int cpu;
 	int firsterr = 0;
 	static struct rcu_torture_ops *torture_ops[] = {
-		&rcu_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops,
+		&rcu_ops, &rcu_busted_ops,
+#ifndef MODULE
+		&srcu_ops,
+#endif
+		&srcud_ops,
 		&busted_srcud_ops, &tasks_ops,
 	};
 
@@ -2247,6 +2265,11 @@ rcu_torture_init(void)
 		cur_ops = torture_ops[i];
 		if (strcmp(torture_type, cur_ops->name) == 0)
 			break;
+		if (cur_ops->altname &&
+		    strcmp(torture_type, cur_ops->altname) == 0) {
+			pr_alert("rcu-torture: substituting torture type: \"%s\" for \"%s\"\n", cur_ops->name, torture_type);
+			break;
+		}
 	}
 	if (i == ARRAY_SIZE(torture_ops)) {
 		pr_alert("rcu-torture: invalid torture type: \"%s\"\n",
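
The name/altname lookup that this patch adds to rcu_perf_init() and
rcu_torture_init() can be tried outside the kernel. The sketch is a
user-space model under stated assumptions: struct ops, find_ops() and
example_table are hypothetical names, and the table contents are chosen
only to exercise the alias path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct ops {
	const char *name;
	const char *altname;	/* optional alias; NULL when unused */
};

/*
 * Return the index of the first entry whose name or altname matches
 * `type`, or -1 when nothing matches.  *substituted is set to 1 when the
 * match came from altname, which is where the patch prints its
 * "substituting" message.
 */
static int find_ops(const struct ops *table, size_t n, const char *type,
		    int *substituted)
{
	size_t i;

	assert(table && type && substituted);
	*substituted = 0;
	for (i = 0; i < n; i++) {
		if (strcmp(type, table[i].name) == 0)
			return (int)i;
		if (table[i].altname && strcmp(type, table[i].altname) == 0) {
			*substituted = 1;
			return (int)i;
		}
	}
	return -1;
}

/* Hypothetical table: no "srcu" entry, and "srcud" carries it as an alias. */
static const struct ops example_table[] = {
	{ .name = "rcu" },
	{ .name = "srcud", .altname = "srcu" },
	{ .name = "tasks" },
};
```

With this table a request for "srcu" resolves to the "srcud" entry and is
flagged as a substitution, while an unknown type still falls through to the
"invalid type" path.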


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-13 23:11                 ` Paul E. McKenney
@ 2019-03-14 17:36                   ` Tejun Heo
  2019-03-14 22:19                     ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Tejun Heo @ 2019-03-14 17:36 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Barret Rhoden, Christopher Lameter, Dennis Zhou, Eial Czerwacki,
	linux-kernel, Shai Fultheim, Oren Twaig

On Wed, Mar 13, 2019 at 04:11:55PM -0700, Paul E. McKenney wrote:
> commit 34f67df09cc0c6bf082a7cfca435373caeeb8d82
> Author: Paul E. McKenney <paulmck@linux.ibm.com>
> Date:   Wed Mar 13 16:06:22 2019 -0700
> 
>     srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules
>     
>     Adding DEFINE_SRCU() or DEFINE_STATIC_SRCU() to a loadable module
>     requires that the size of the reserved region be increased, which is
>     not something we want to be doing all that often.  Instead, loadable
>     modules should define an srcu_struct and invoke init_srcu_struct()
>     from their module_init function and cleanup_srcu_struct() from their
>     module_exit function.  Note that modules using call_srcu() will also
>     need to invoke srcu_barrier() from their module_exit function.
>     
>     This commit enforces this advice by refusing to define DEFINE_SRCU()
>     and DEFINE_STATIC_SRCU() within loadable modules.
>     
>     Suggested-by: Barret Rhoden <brho@google.com>
>     Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>

Looks-great-to-me-by: Tejun Heo <tj@kernel.org>

Thanks. :)

-- 
tejun

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-14 17:36                   ` Tejun Heo
@ 2019-03-14 22:19                     ` Paul E. McKenney
  2019-03-18  8:18                       ` Eial Czerwacki
  0 siblings, 1 reply; 17+ messages in thread
From: Paul E. McKenney @ 2019-03-14 22:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Barret Rhoden, Christopher Lameter, Dennis Zhou, Eial Czerwacki,
	linux-kernel, Shai Fultheim, Oren Twaig

On Thu, Mar 14, 2019 at 10:36:19AM -0700, Tejun Heo wrote:
> On Wed, Mar 13, 2019 at 04:11:55PM -0700, Paul E. McKenney wrote:
> > commit 34f67df09cc0c6bf082a7cfca435373caeeb8d82
> > Author: Paul E. McKenney <paulmck@linux.ibm.com>
> > Date:   Wed Mar 13 16:06:22 2019 -0700
> > 
> >     srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules
> >     
> >     Adding DEFINE_SRCU() or DEFINE_STATIC_SRCU() to a loadable module
> >     requires that the size of the reserved region be increased, which is
> >     not something we want to be doing all that often.  Instead, loadable
> >     modules should define an srcu_struct and invoke init_srcu_struct()
> >     from their module_init function and cleanup_srcu_struct() from their
> >     module_exit function.  Note that modules using call_srcu() will also
> >     need to invoke srcu_barrier() from their module_exit function.
> >     
> >     This commit enforces this advice by refusing to define DEFINE_SRCU()
> >     and DEFINE_STATIC_SRCU() within loadable modules.
> >     
> >     Suggested-by: Barret Rhoden <brho@google.com>
> >     Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> 
> Looks-great-to-me-by: Tejun Heo <tj@kernel.org>

Applied.  ;-)

							Thanx, Paul

> Thanks. :)
> 
> -- 
> tejun
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-14 22:19                     ` Paul E. McKenney
@ 2019-03-18  8:18                       ` Eial Czerwacki
  2019-03-18 14:27                         ` Paul E. McKenney
  0 siblings, 1 reply; 17+ messages in thread
From: Eial Czerwacki @ 2019-03-18  8:18 UTC (permalink / raw)
  To: paulmck
  Cc: Tejun Heo, Barret Rhoden, Christopher Lameter, Dennis Zhou,
	linux-kernel, Shai Fultheim, Oren Twaig

Greetings Paul,

On 3/15/19 12:19 AM, Paul E. McKenney wrote:
> On Thu, Mar 14, 2019 at 10:36:19AM -0700, Tejun Heo wrote:
>> On Wed, Mar 13, 2019 at 04:11:55PM -0700, Paul E. McKenney wrote:
>>> commit 34f67df09cc0c6bf082a7cfca435373caeeb8d82
>>> Author: Paul E. McKenney <paulmck@linux.ibm.com>
>>> Date:   Wed Mar 13 16:06:22 2019 -0700
>>>
>>>     srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules
>>>     
>>>     Adding DEFINE_SRCU() or DEFINE_STATIC_SRCU() to a loadable module
>>>     requires that the size of the reserved region be increased, which is
>>>     not something we want to be doing all that often.  Instead, loadable
>>>     modules should define an srcu_struct and invoke init_srcu_struct()
>>>     from their module_init function and cleanup_srcu_struct() from their
>>>     module_exit function.  Note that modules using call_srcu() will also
>>>     need to invoke srcu_barrier() from their module_exit function.
>>>     
>>>     This commit enforces this advice by refusing to define DEFINE_SRCU()
>>>     and DEFINE_STATIC_SRCU() within loadable modules.
>>>     
>>>     Suggested-by: Barret Rhoden <brho@google.com>
>>>     Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
>>
>> Looks-great-to-me-by: Tejun Heo <tj@kernel.org>
> 
> Applied.  ;-)
> 
> 							Thanx, Paul
> 
>> Thanks. :)
>>
>> -- 
>> tejun
>>
> 
> 

When can this patch be found in the kernel mainline git repo? I'd like
to test and see if the issue that started this mail thread still occurs.

Thanks,

Eial.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-18  8:18                       ` Eial Czerwacki
@ 2019-03-18 14:27                         ` Paul E. McKenney
  0 siblings, 0 replies; 17+ messages in thread
From: Paul E. McKenney @ 2019-03-18 14:27 UTC (permalink / raw)
  To: Eial Czerwacki
  Cc: Tejun Heo, Barret Rhoden, Christopher Lameter, Dennis Zhou,
	linux-kernel, Shai Fultheim, Oren Twaig

On Mon, Mar 18, 2019 at 10:18:48AM +0200, Eial Czerwacki wrote:
> Greetings Paul,
> 
> On 3/15/19 12:19 AM, Paul E. McKenney wrote:
> > On Thu, Mar 14, 2019 at 10:36:19AM -0700, Tejun Heo wrote:
> >> On Wed, Mar 13, 2019 at 04:11:55PM -0700, Paul E. McKenney wrote:
> >>> commit 34f67df09cc0c6bf082a7cfca435373caeeb8d82
> >>> Author: Paul E. McKenney <paulmck@linux.ibm.com>
> >>> Date:   Wed Mar 13 16:06:22 2019 -0700
> >>>
> >>>     srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules
> >>>     
> >>>     Adding DEFINE_SRCU() or DEFINE_STATIC_SRCU() to a loadable module
> >>>     requires that the size of the reserved region be increased, which is
> >>>     not something we want to be doing all that often.  Instead, loadable
> >>>     modules should define an srcu_struct and invoke init_srcu_struct()
> >>>     from their module_init function and cleanup_srcu_struct() from their
> >>>     module_exit function.  Note that modules using call_srcu() will also
> >>>     need to invoke srcu_barrier() from their module_exit function.
> >>>     
> >>>     This commit enforces this advice by refusing to define DEFINE_SRCU()
> >>>     and DEFINE_STATIC_SRCU() within loadable modules.
> >>>     
> >>>     Suggested-by: Barret Rhoden <brho@google.com>
> >>>     Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
> >>
> >> Looks-great-to-me-by: Tejun Heo <tj@kernel.org>
> > 
> > Applied.  ;-)
> > 
> > 							Thanx, Paul
> > 
> >> Thanks. :)
> >>
> >> -- 
> >> tejun
> >>
> > 
> > 
> 
> When can this patch be found in the kernel mainline git repo? I'd like
> to test and see if the issue that started this mail thread still occurs.

Thank you for your interest!

It is a3f5f4fae725 ("srcu: Forbid DEFINE{,_STATIC}_SRCU() from modules")
in my -rcu tree.  If all goes well, I will submit it to the v5.2 merge
window.  I do not expect it to be submitted to -stable.

And -rcu is here:

	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git

							Thanx, Paul


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
  2019-03-13 19:40         ` Barret Rhoden
  2019-03-13 20:26           ` Tejun Heo
@ 2019-04-04 20:53           ` Christopher Lameter
  1 sibling, 0 replies; 17+ messages in thread
From: Christopher Lameter @ 2019-04-04 20:53 UTC (permalink / raw)
  To: Barret Rhoden
  Cc: Dennis Zhou, Eial Czerwacki, tj, linux-kernel, Shai Fultheim,
	Oren Twaig, Paul E. McKenney

On Wed, 13 Mar 2019, Barret Rhoden wrote:

> > It is very expensive. VSMP exchanges 4K segments via RDMA between servers
> > to build a large address space and run a kernel in the large address
> > space. Using smaller segments can cause a lot of
> > "cacheline" bouncing (meaning transfers of 4K segments back and forth
> > between servers).
> >
>
> Given that these are large machines, would it be OK to statically reserve 64K
> on them for modules' percpu data?

Likely.

> The bug that led me here was from someone running on a non-VSMP machine
> with that config set.  Perhaps we should make it clearer in the Kconfig
> option that it should not be set on other machines.  That might make it
> less likely that anyone on a non-VSMP machine pays the 64K overhead.

Right.
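
As a rough worked figure for the 64K overhead mentioned above, assuming the
reserved region is replicated once per CPU: the two sizes are the
PERCPU_MODULE_RESERVE values from the original patch, while
extra_reserve_bytes() and the macro names are hypothetical helpers for this
sketch.

```c
#include <assert.h>
#include <stddef.h>

#define RESERVE_DEFAULT	((size_t)8 << 10)	/* 8 KiB, the existing reserve */
#define RESERVE_VSMP	((size_t)1 << 16)	/* 64 KiB, proposed for X86_VSMP */

static_assert(RESERVE_VSMP > RESERVE_DEFAULT, "VSMP reserve must be larger");

/* Extra bytes reserved machine-wide, assuming one reserved region per CPU. */
static size_t extra_reserve_bytes(size_t nr_cpus)
{
	return nr_cpus * (RESERVE_VSMP - RESERVE_DEFAULT);
}
```

Under that assumption the increase is 56 KiB per CPU, so a machine with,
say, 128 CPUs would set aside an additional 7 MiB.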

> Are there any other alternatives?  Not using static SRCU in any code that
> could be built as a module seems a little harsh.

Sorry this ended up in my spam folder somehow. Just fished it out.


^ permalink raw reply	[flat|nested] 17+ messages in thread

Thread overview: 17+ messages
2019-01-21 11:47 [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set Eial Czerwacki
2019-01-30 10:33 ` Eial Czerwacki
2019-03-01 18:30 ` Barret Rhoden
2019-03-01 20:34   ` Dennis Zhou
2019-03-01 21:27     ` Barret Rhoden
2019-03-01 21:54       ` Christopher Lameter
2019-03-13 19:40         ` Barret Rhoden
2019-03-13 20:26           ` Tejun Heo
2019-03-13 21:22             ` Paul E. McKenney
2019-03-13 21:29               ` Tejun Heo
2019-03-13 23:11                 ` Paul E. McKenney
2019-03-14 17:36                   ` Tejun Heo
2019-03-14 22:19                     ` Paul E. McKenney
2019-03-18  8:18                       ` Eial Czerwacki
2019-03-18 14:27                         ` Paul E. McKenney
2019-04-04 20:53           ` Christopher Lameter
2019-03-04  7:42   ` Eial Czerwacki
