From: Nico Pache <npache@redhat.com>
To: Yang Shi <shy828301@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>, Shakeel Butt <shakeelb@google.com>,
	Kirill Tkhai <ktkhai@virtuozzo.com>, Roman Gushchin <guro@fb.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	raquini@redhat.com, Michal Hocko <mhocko@suse.com>,
	David Hildenbrand <david@redhat.com>
Subject: Re: [PATCH v2 1/1] mm/vmscan.c: Prevent allocating shrinker_info on offlined nodes
Date: Tue, 7 Dec 2021 19:33:40 -0500
Message-ID: <17a7d9e4-5ebc-1160-1e5e-97707b6e5286@redhat.com>
In-Reply-To: <CAHbLzkoCds-WOoN5CKas4DThk8hU65pgtMcga10QEqEmKU2f5A@mail.gmail.com>



On 12/7/21 19:26, Yang Shi wrote:
> On Tue, Dec 7, 2021 at 3:44 PM Andrew Morton <akpm@linux-foundation.org> wrote:
>>
>> On Tue,  7 Dec 2021 17:40:13 -0500 Nico Pache <npache@redhat.com> wrote:
>>
>>> We have run into a panic caused by a shrinker allocation being attempted
>>> on an offlined node.
>>>
>>> Our crash analysis has determined that the issue originates from trying
>>> to allocate pages on an offlined node in expand_one_shrinker_info. This
>>> function makes the incorrect assumption that we can allocate on any node.
>>> To correct this we make sure the node is online before attempting an
>>> allocation. If it is not online choose the closest node.
>>
>> This isn't fully accurate, is it?  We could allocate on a node which is
>> presently offline but which was previously onlined, by testing
>> NODE_DATA(nid).
>>
>> It isn't entirely clear to me from the v1 discussion why this approach
>> isn't being taken?
>>
>> AFAICT the proposed patch is *already* taking this approach, by having
>> no protection against a concurrent or subsequent node offlining?
> 
> AFAICT, we have not reached agreement on how to fix it yet. I saw 3
> proposals at least:
> 
> 1. From Michal, allocate node data for all possible nodes.
> https://lore.kernel.org/all/Ya89aqij6nMwJrIZ@dhcp22.suse.cz/T/#u
> 
> 2. What this patch does. Proposed originally from
> https://lore.kernel.org/all/20211108202325.20304-1-amakhalov@vmware.com/T/#u

Correct me if I'm wrong, but isn't that a different caller? This patch fixes
the issue in expand_one_shrinker_info().

> 3. From David, fix in node_zonelist().
> https://lore.kernel.org/all/51c65635-1dae-6ba4-daf9-db9df0ec35d8@redhat.com/T/#u
> 
>>
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -222,13 +222,16 @@ static int expand_one_shrinker_info(struct mem_cgroup *memcg,
>>>       int size = map_size + defer_size;
>>>
>>>       for_each_node(nid) {
>>> +             int tmp = nid;
>>
>> Not `tmp', please.  Better to use an identifier which explains the
>> variable's use.  target_nid?
>>
>> And a newline after defining locals, please.
>>
>>>               pn = memcg->nodeinfo[nid];
>>>               old = shrinker_info_protected(memcg, nid);
>>>               /* Not yet online memcg */
>>>               if (!old)
>>>                       return 0;
>>>
>>> -             new = kvmalloc_node(sizeof(*new) + size, GFP_KERNEL, nid);
>>> +             if(!node_online(nid))
>>
>> s/if(/if (/
>>
>>> +                     tmp = numa_mem_id();
>>> +             new = kvmalloc_node(sizeof(*new) + size, GFP_KERNEL, tmp);
>>>               if (!new)
>>>                       return -ENOMEM;
>>>
>>
>> And a code comment fully explaining what's going on here?
> 
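
For reference, here is a rough user-space sketch of the node selection in
the quoted hunk with the style comments above applied (target_nid instead
of tmp, a newline after the locals, "if (" spacing, and a comment). It is
only a model, not the kernel code: MAX_NODES, node_is_online[],
model_node_online(), model_numa_mem_id() and pick_target_nid() are
hypothetical stand-ins for the real node mask, node_online() and
numa_mem_id(), and the online/offline table is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 4	/* hypothetical node count for this model */

/* Hypothetical online map: nodes 0 and 2 online, nodes 1 and 3 offline. */
static bool node_is_online[MAX_NODES] = { true, false, true, false };

/* Stand-in for the kernel's node_online(). */
static bool model_node_online(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return node_is_online[nid];
}

/*
 * Stand-in for numa_mem_id(). In the kernel this is the nearest node
 * with memory relative to the current CPU; here it is pinned to node 0.
 */
static int model_numa_mem_id(void)
{
	return 0;
}

/* Pick the node to pass to the allocator for a given nid. */
static int pick_target_nid(int nid)
{
	int target_nid = nid;

	/*
	 * An offline node cannot back the allocation, so fall back to
	 * the local memory node instead of passing nid through.
	 */
	if (!model_node_online(nid))
		target_nid = model_numa_mem_id();

	return target_nid;
}
```

With the table above, nodes 0 and 2 are used as-is while nodes 1 and 3 fall
back to node 0. The sketch does not model a node going offline between the
check and the allocation, which is the race raised earlier in this thread.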


Thread overview: 17+ messages
2021-12-07 22:40 [PATCH v2 0/1] Dont allocate pages on a offline node Nico Pache
2021-12-07 22:40 ` [PATCH v2 1/1] mm/vmscan.c: Prevent allocating shrinker_info on offlined nodes Nico Pache
2021-12-07 23:34   ` Matthew Wilcox
2021-12-08  0:25     ` Nico Pache
2021-12-08  1:53       ` Andrew Morton
2021-12-07 23:44   ` Andrew Morton
2021-12-08  0:26     ` Yang Shi
2021-12-08  0:33       ` Nico Pache [this message]
2021-12-08  1:23         ` Yang Shi
2021-12-08  1:26           ` Yang Shi
2021-12-08  7:59             ` Michal Hocko
2021-12-13 19:10               ` Yang Shi
2022-01-10 17:09       ` Rafael Aquini
2022-01-10 17:16         ` Michal Hocko
2022-01-10 17:21           ` Rafael Aquini
2021-12-08  0:40     ` Nico Pache
2021-12-08  7:54       ` Michal Hocko