linux-kernel.vger.kernel.org archive mirror
From: Barret Rhoden <brho@google.com>
To: Dennis Zhou <dennis@kernel.org>
Cc: Eial Czerwacki <eial@scalemp.com>,
	tj@kernel.org, cl@linux.com, linux-kernel@vger.kernel.org,
	Shai Fultheim <shai@scalemp.com>, Oren Twaig <oren@scalemp.com>,
	"Paul E. McKenney" <paulmck@linux.ibm.com>
Subject: Re: [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
Date: Fri, 1 Mar 2019 16:27:54 -0500
Message-ID: <c72402f2-967e-cd56-99d8-9139c9e7f267@google.com>
In-Reply-To: <20190301203455.GA97188@dennisz-mbp.dhcp.thefacebook.com>

Hi -

On 03/01/2019 03:34 PM, Dennis Zhou wrote:
> Hi Barret,
> 
> On Fri, Mar 01, 2019 at 01:30:15PM -0500, Barret Rhoden wrote:
>> Hi -
>>
>> On 01/21/2019 06:47 AM, Eial Czerwacki wrote:
>>>
>>
>> Your main issue was that you only sent this patch to LKML, but not the
>> maintainers of the file.  If you don't, your patch might get lost.  To get
>> the appropriate people and lists, run:
>>
>> 	scripts/get_maintainer.pl YOUR_PATCH.patch
>>
>> For this patch, you'll get this:
>>
>> Dennis Zhou <dennis@kernel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> Tejun Heo <tj@kernel.org> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> Christoph Lameter <cl@linux.com> (maintainer:PER-CPU MEMORY ALLOCATOR)
>> linux-kernel@vger.kernel.org (open list)
>>
>> I added the three maintainers to this email.
>>
>> I have a few minor comments below.
>>
>>> [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set
>>
>> You misspelled 'reservation'.  Also, I'd just say: "percpu: increase module
>> reservation size if X86_VSMP is set"  ('change' -> 'increase', and it only
>> says 'reservation' once).
>>
>>> as reported in bug #201339 (https://bugzilla.kernel.org/show_bug.cgi?id=201339)
>>
>> I think you can add a tag for this right above your Signed-off-by tags.
>> e.g.:
>>
>> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=201339
>>
>>> by enabling X86_VSMP, INTERNODE_CACHE_BYTES's definition differs from the default one
>>> causing the struct size to exceed the size ok 8KB.
>>                                              ^of
>>
>> Which struct are you talking about?  I have one in mind, but others might
>> not know from reading the commit message.
>>
>> I ran into this on https://bugzilla.kernel.org/show_bug.cgi?id=202511. In
>> that case, it was because modules (drm and amdkfd) were using DEFINE_SRCU,
>> which does a DEFINE_PER_CPU on struct srcu_data, and that used
>> ____cacheline_internodealigned_in_smp.
>>
>>>
>>> in order to avoid such issue, increse PERCPU_MODULE_RESERVE to 64KB if CONFIG_X86_VSMP is set.
>>                                 ^increase
>>
>>>
>>> the value was caculated on linux 4.20.3, make allmodconfig all and the following oneliner:
>>                 ^calculated
>>
>>> for f in `find -name *.ko`; do echo $f; readelf -S $f  |grep perc; done |grep data..percpu -B 1 |grep ko |while read r; do echo -n "$r: "; objdump --syms --section=.data..percpu $r|grep data |sort -n  |awk '{c++; d=strtonum("0x" $1) + strtonum("0x" $5); if (m < d) m = d;} END {printf("%d vars-> last addr 0x%x ( %d )\n", c, m, m)}' ; done |column -t |sort -k 8 -n | awk '{print $8}'| paste -sd+ | bc
>>
>> Not sure how useful the one-liner is, versus a description of what you're
>> doing, e.g. "the size of all module percpu data sections", or something.
>>
>> Also, how close was that calculated value to 64K?  If more modules start
>> using DEFINE_SRCU, each of which uses 8K, then that 64K might run out.
>>
>> Thanks,
>> Barret
>>
>>> Signed-off-by: Eial Czerwacki <eial@scalemp.com>
>>> Signed-off-by: Shai Fultheim <shai@scalemp.com>
>>> Signed-off-by: Oren Twaig <oren@scalemp.com>
>>> ---
>>>    include/linux/percpu.h | 4 ++++
>>>    1 file changed, 4 insertions(+)
>>>
>>> diff --git a/include/linux/percpu.h b/include/linux/percpu.h
>>> index 70b7123..6b79693 100644
>>> --- a/include/linux/percpu.h
>>> +++ b/include/linux/percpu.h
>>> @@ -14,7 +14,11 @@
>>>    /* enough to cover all DEFINE_PER_CPUs in modules */
>>>    #ifdef CONFIG_MODULES
>>> +#ifdef CONFIG_X86_VSMP
>>> +#define PERCPU_MODULE_RESERVE		(1 << 16)
>>> +#else
>>>    #define PERCPU_MODULE_RESERVE		(8 << 10)
>>> +#endif
>>>    #else
>>>    #define PERCPU_MODULE_RESERVE		0
>>>    #endif
>>>
>>
> 
> Thanks for sending this to me.
> 
> I must say, I really do not want to expand the reserved region. In most
> cases, it can easily end up unused and thus wasted memory as it is hard
> allocated on boot. This is done because code gen assumes static
> variables are close to the program counter. This would not be true with
> dynamic allocations, which begin at the end of the vmalloc area
> (Summarized from Tejun's account in [1]).
> 
> Another note on the reserved region. It starts at the end of the static
> region which means it generally isn't page aligned. So while an 8kb
> allocation would fit, a 4kb alignment more than likely would fail.
> Something as large as 8kb should probably be dynamically allocated as
> well.
> 
> I read through the bugzilla report and it seems that the culprits are:
>    drivers/gpu/drm/amd/amdkfd/kfd_process.c:DEFINE_SRCU(kfd_processes_srcu);
>    drivers/gpu/drm/drm_drv.c:DEFINE_STATIC_SRCU(drm_unplug_srcu);
> 
> Is there a reason we cannot dynamically initialize these structs? I've
> CCed Paul McKenney because we saw an issue with ipmi in December [1].

I looked at the AMD driver, and it looks like they could dynamically 
initialize it.  It would require a little extra plumbing.  I imagine the 
DRM one is the same way.

To catch this in the future, should we disallow DEFINE_SRCU in modules 
or something?  Otherwise, this will pop up again the next time someone 
uses DEFINE_SRCU in a module and builds with CONFIG_X86_VSMP.

That might be a little much, and it still won't be sufficient to catch
all cases.  This will also come up any time a module has a static
per-cpu data structure that uses ____cacheline_internodealigned_in_smp,
so it's not limited to SRCU either.
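
To make the size effect concrete, here is a minimal userspace sketch (not
kernel code).  It assumes a 64-byte L1 line and a 4096-byte internode
alignment; the DEMO_* names and the structs are made up for illustration,
and the aligned attribute is a GCC/Clang extension.  The point is only that
aligning a type to 4096 bytes rounds its size up to a multiple of 4096, so
even a small payload takes half of an 8K reserve.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; the real ones depend on the kernel config. */
#define DEMO_L1_BYTES		64		/* assumed L1 cache line size */
#define DEMO_INTERNODE_BYTES	4096		/* assumed 4K internode alignment */
#define DEMO_MODULE_RESERVE	(8 << 10)	/* the 8K value from the patch context */

/* Hypothetical per-cpu payload: at most 64 bytes of real data. */
struct demo_payload {
	long counters[8];
};

struct demo_l1_aligned {
	struct demo_payload data;
} __attribute__((aligned(DEMO_L1_BYTES)));

struct demo_internode_aligned {
	struct demo_payload data;
} __attribute__((aligned(DEMO_INTERNODE_BYTES)));

/* sizeof is always a multiple of the alignment, so the padding is part of the type. */
static_assert(sizeof(struct demo_l1_aligned) == DEMO_L1_BYTES,
	      "L1-aligned struct pads to one cache line");
static_assert(sizeof(struct demo_internode_aligned) == DEMO_INTERNODE_BYTES,
	      "internode-aligned struct pads to a full 4K");

/* How many objects of obj_size bytes an 8K reserve could hold, ignoring where it starts. */
static size_t demo_objects_per_reserve(size_t obj_size)
{
	return DEMO_MODULE_RESERVE / obj_size;
}
```

With these assumed values, demo_objects_per_reserve() gives 128 for the
L1-aligned struct but only 2 for the internode-aligned one.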

I'm not familiar with VSMP - how bad is it to use L1 cache alignment 
instead of 4K page alignment?  Maybe some structures can use the smaller 
alignment?  Or maybe have VSMP require SRCU-using modules to be built-in?
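
For a rough feel of the difference, here is a small userspace sketch (not
kernel code, and not how the percpu allocator is implemented) of the fit
problem Dennis describes: the reserve starts wherever the static region
ends, so the first usable address has to be rounded up to the object's
alignment.  The function names and the 0x1240 start offset used below are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Return 1 if an object of `size` bytes that needs `align`-byte alignment
 * fits in a region of `reserve` bytes starting at `start`, else 0.
 */
static int fits_in_reserve(uintptr_t start, size_t reserve,
			   size_t size, size_t align)
{
	uintptr_t first = align_up(start, align);

	return first + size <= start + reserve;
}
```

With a hypothetical start of 0x1240 and an 8K reserve,
fits_in_reserve(0x1240, 8192, 8192, 64) returns 1, while
fits_in_reserve(0x1240, 8192, 8192, 4096) returns 0: rounding 0x1240 up to
0x2000 leaves less than 8K before the end of the region.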

Thanks,

Barret



Thread overview: 17+ messages
2019-01-21 11:47 [PATCH] percpu/module resevation: change resevation size iff X86_VSMP is set Eial Czerwacki
2019-01-30 10:33 ` Eial Czerwacki
2019-03-01 18:30 ` Barret Rhoden
2019-03-01 20:34   ` Dennis Zhou
2019-03-01 21:27     ` Barret Rhoden [this message]
2019-03-01 21:54       ` Christopher Lameter
2019-03-13 19:40         ` Barret Rhoden
2019-03-13 20:26           ` Tejun Heo
2019-03-13 21:22             ` Paul E. McKenney
2019-03-13 21:29               ` Tejun Heo
2019-03-13 23:11                 ` Paul E. McKenney
2019-03-14 17:36                   ` Tejun Heo
2019-03-14 22:19                     ` Paul E. McKenney
2019-03-18  8:18                       ` Eial Czerwacki
2019-03-18 14:27                         ` Paul E. McKenney
2019-04-04 20:53           ` Christopher Lameter
2019-03-04  7:42   ` Eial Czerwacki
