linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michael Bringmann <mwb@linux.vnet.ibm.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>,
	Kees Cook <keescook@chromium.org>,
	Mathieu Malaterre <malat@debian.org>,
	Pavel Tatashin <pasha.tatashin@oracle.com>,
	Nicholas Piggin <npiggin@gmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>,
	Juliet Kim <minkim@us.ibm.com>,
	Tyrel Datwyler <tyreld@linux.vnet.ibm.com>,
	Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>,
	Nathan Fontenot <nfont@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>,
	linuxppc-dev@lists.ozlabs.org,
	Dan Williams <dan.j.williams@intel.com>,
	Oscar Salvador <osalvador@suse.de>
Subject: Re: [PATCH] migration/mm: Add WARN_ON to try_offline_node
Date: Tue, 2 Oct 2018 13:13:22 -0500	[thread overview]
Message-ID: <e7dd66c1-d196-3a14-0115-acdaf538ebfd@linux.vnet.ibm.com> (raw)
In-Reply-To: <20181002160446.GA18290@dhcp22.suse.cz>



On 10/02/2018 11:04 AM, Michal Hocko wrote:
> On Tue 02-10-18 10:14:49, Michael Bringmann wrote:
>> On 10/02/2018 09:59 AM, Michal Hocko wrote:
>>> On Tue 02-10-18 09:51:40, Michael Bringmann wrote:
>>> [...]
>>>> When the device-tree affinity attributes have changed for memory,
>>>> the 'nid' affinity calculated points to a different node for the
>>>> memory block than the one used to install it, previously on the
>>>> source system.  The newly calculated 'nid' affinity may not yet
>>>> be initialized on the target system.  The current memory tracking
>>>> mechanisms do not record the node to which a memory block was
>>>> associated when it was added.  Nathan is looking at adding this
>>>> feature to the new implementation of LMBs, but it is not there
>>>> yet, and won't be present in earlier kernels without backporting a
>>>> significant number of changes.
>>>
>>> Then the patch you have proposed here just papers over a real issue, no?
>>> IIUC then you simply do not remove the memory if you lose the race.
>>
>> The problem occurs when removing memory after an affinity change
>> references a node that was previously unreferenced.  Other code
>> in 'kernel/mm/memory_hotplug.c' deals with initializing an empty
>> node when adding memory to a system.  The 'removing memory' case is
>> specific to systems that perform LPM and allow device-tree changes.
>> The powerpc kernel does not have the option of accepting some PRRN
>> requests and accepting others.  It must perform them all.
> 
> I am sorry, but you are still too cryptic for me. Either there is a
> correctness issue and the the patch doesn't really fix anything or the
> final race doesn't make any difference and then the ppc code should be
> explicit about that. Checking the node inside the hotplug core code just
> looks as a wrong layer to mitigate an arch specific problem. I am not
> saying the patch is a no-go but if anything we want a big fat comment
> explaining how this is possible because right now it just points to an
> incorrect API usage.
> 
> That being said, this sounds pretty much ppc specific problem and I
> would _prefer_ it to be handled there (along with a big fat comment of
> course).

Let me try again.  Regardless of the path to which we get to this condition,
we currently crash the kernel.  This patch changes that to a WARN_ON notice
and continues executing the kernel without shutting down the system.  I saw
the problem during powerpc testing, because that is the focus of my work.
There are other paths to this function besides powerpc.  I feel that the
kernel should keep running instead of halting.

Regards,

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:       (512) 466-0650
mwb@linux.vnet.ibm.com


  reply	other threads:[~2018-10-02 18:13 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-01 18:56 [PATCH] migration/mm: Add WARN_ON to try_offline_node Michael Bringmann
2018-10-01 20:02 ` Kees Cook
2018-10-01 20:27 ` Michal Hocko
2018-10-01 23:20   ` Tyrel Datwyler
2018-10-02 14:51     ` Michael Bringmann
2018-10-02 14:59       ` Michal Hocko
2018-10-02 15:14         ` Michael Bringmann
2018-10-02 16:04           ` Michal Hocko
2018-10-02 18:13             ` Michael Bringmann [this message]
2018-10-02 19:45               ` Tyrel Datwyler
2018-10-03  7:03                 ` Michal Hocko
2018-10-03 13:27                 ` Michael Bringmann
2018-10-03 23:05                   ` Tyrel Datwyler
2018-10-04  1:02                     ` Michael Bringmann
2018-10-01 23:23   ` Tyrel Datwyler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e7dd66c1-d196-3a14-0115-acdaf538ebfd@linux.vnet.ibm.com \
    --to=mwb@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=bauerman@linux.vnet.ibm.com \
    --cc=dan.j.williams@intel.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=malat@debian.org \
    --cc=mauricfo@linux.vnet.ibm.com \
    --cc=mhocko@kernel.org \
    --cc=minkim@us.ibm.com \
    --cc=nfont@linux.vnet.ibm.com \
    --cc=npiggin@gmail.com \
    --cc=osalvador@suse.de \
    --cc=pasha.tatashin@oracle.com \
    --cc=tlfalcon@linux.vnet.ibm.com \
    --cc=tyreld@linux.vnet.ibm.com \
    --cc=yasu.isimatu@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).