All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka@suse.cz>
To: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Greg Thelen <gthelen@google.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Pekka Enberg <penberg@kernel.org>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: Re: [RFC v2 4/4] mm: fallback for offline nodes in alloc_pages_node
Date: Mon, 27 Jul 2015 13:29:45 +0200	[thread overview]
Message-ID: <55B61629.1060701@suse.cz> (raw)
In-Reply-To: <alpine.DEB.2.10.1507241559181.12744@chino.kir.corp.google.com>

On 07/25/2015 01:06 AM, David Rientjes wrote:
> On Fri, 24 Jul 2015, Vlastimil Babka wrote:
>
>> Because I didn't think you would suggest the "nid = numa_mem_id()" for
>> !node_online(nid) fixup would happen only for CONFIG_DEBUG_VM kernels. But it
>> seems that you do suggest that? I would understand if the fixup (correcting an
>> offline node to some that's online) was done regardless of DEBUG_VM, and
>> DEBUG_VM just switched between silent and noisy fixup. But having a debug option
>> alter the outcome seems wrong?
>
> Hmm, not sure why this is surprising, I don't expect people to deploy
> production kernels with CONFIG_DEBUG_VM enabled, it's far too expensive.
> I was expecting they would enable it for, well... debug :)

But is there any other place that does such thing for debug builds?

> In that case, if nid is a valid node but offline, then the nid =
> numa_mem_id() fixup seems fine to allow the kernel to continue debugging.
>
> When a node is offlined as a result of memory hotplug, the pgdat doesn't
> get freed so it can be onlined later.  Thus, alloc_pages_node() with an
> offline node and !CONFIG_DEBUG_VM may not panic.  If it does, this can
> probably be removed because we're covered.

I've checked, but can't say I understand the hotplug code completely... 
but it seems there are two cases
- the node was never online, and the nid passed to alloc_pages_node() is 
clearly bogus. Then there's no pgdat and it should crash on NULL pointer 
dereference. VM_WARN_ON() in __alloc_pages_node() will already catch 
this and provide more details as to what caused the crash. Fixup would 
allow "continue debugging", but it seems that having configured e.g. a 
crashdump to inspect is a better way to debug than letting the kernel 
continue?
- the node has been online in the past, so the nid pointing to an 
offline node might be due to a race with offlining. It shouldn't crash, 
and most likely the zonelist that is left untouched by the offlining 
(AFAICS) will allow fallback to other nodes. Unless there is a nodemask 
of __GFP_THIS_NODE, in which case allocation fails. Again, VM_WARN_ON() 
in __alloc_pages_node() will warn us already. I doubt the fixup is 
needed here?

So I would just drop this patch. We already have the debug warning in 
__alloc_pages_node(), and a fixup is imho just confusing.

WARNING: multiple messages have this Message-ID (diff)
From: Vlastimil Babka <vbabka@suse.cz>
To: David Rientjes <rientjes@google.com>
Cc: Christoph Lameter <cl@linux.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Greg Thelen <gthelen@google.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Pekka Enberg <penberg@kernel.org>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: Re: [RFC v2 4/4] mm: fallback for offline nodes in alloc_pages_node
Date: Mon, 27 Jul 2015 13:29:45 +0200	[thread overview]
Message-ID: <55B61629.1060701@suse.cz> (raw)
In-Reply-To: <alpine.DEB.2.10.1507241559181.12744@chino.kir.corp.google.com>

On 07/25/2015 01:06 AM, David Rientjes wrote:
> On Fri, 24 Jul 2015, Vlastimil Babka wrote:
>
>> Because I didn't think you would suggest the "nid = numa_mem_id()" for
>> !node_online(nid) fixup would happen only for CONFIG_DEBUG_VM kernels. But it
>> seems that you do suggest that? I would understand if the fixup (correcting an
>> offline node to some that's online) was done regardless of DEBUG_VM, and
>> DEBUG_VM just switched between silent and noisy fixup. But having a debug option
>> alter the outcome seems wrong?
>
> Hmm, not sure why this is surprising, I don't expect people to deploy
> production kernels with CONFIG_DEBUG_VM enabled, it's far too expensive.
> I was expecting they would enable it for, well... debug :)

But is there any other place that does such thing for debug builds?

> In that case, if nid is a valid node but offline, then the nid =
> numa_mem_id() fixup seems fine to allow the kernel to continue debugging.
>
> When a node is offlined as a result of memory hotplug, the pgdat doesn't
> get freed so it can be onlined later.  Thus, alloc_pages_node() with an
> offline node and !CONFIG_DEBUG_VM may not panic.  If it does, this can
> probably be removed because we're covered.

I've checked, but can't say I understand the hotplug code completely... 
but it seems there are two cases
- the node was never online, and the nid passed to alloc_pages_node() is 
clearly bogus. Then there's no pgdat and it should crash on NULL pointer 
dereference. VM_WARN_ON() in __alloc_pages_node() will already catch 
this and provide more details as to what caused the crash. Fixup would 
allow "continue debugging", but it seems that having configured e.g. a 
crashdump to inspect is a better way to debug than letting the kernel 
continue?
- the node has been online in the past, so the nid pointing to an 
offline node might be due to a race with offlining. It shouldn't crash, 
and most likely the zonelist that is left untouched by the offlining 
(AFAICS) will allow fallback to other nodes. Unless there is a nodemask 
of __GFP_THIS_NODE, in which case allocation fails. Again, VM_WARN_ON() 
in __alloc_pages_node() will warn us already. I doubt the fixup is 
needed here?

So I would just drop this patch. We already have the debug warning in 
__alloc_pages_node(), and a fixup is imho just confusing.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2015-07-27 11:29 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-24 14:45 [RFC v2 1/4] mm: make alloc_pages_exact_node pass __GFP_THISNODE Vlastimil Babka
2015-07-24 14:45 ` Vlastimil Babka
2015-07-24 14:45 ` [RFC v2 2/4] mm: unify checks in alloc_pages_node family of functions Vlastimil Babka
2015-07-24 14:45   ` Vlastimil Babka
2015-07-24 20:09   ` David Rientjes
2015-07-24 20:09     ` David Rientjes
2015-07-24 14:45 ` [RFC v2 3/4] mm: use numa_mem_id in alloc_pages_node() Vlastimil Babka
2015-07-24 14:45   ` Vlastimil Babka
2015-07-24 20:09   ` David Rientjes
2015-07-24 20:09     ` David Rientjes
2015-07-29 13:31   ` Mel Gorman
2015-07-29 13:31     ` Mel Gorman
2015-07-24 14:45 ` [RFC v2 4/4] mm: fallback for offline nodes in alloc_pages_node Vlastimil Babka
2015-07-24 14:45   ` Vlastimil Babka
2015-07-24 15:48   ` Christoph Lameter
2015-07-24 15:48     ` Christoph Lameter
2015-07-24 19:54     ` David Rientjes
2015-07-24 19:54       ` David Rientjes
2015-07-24 20:39       ` Vlastimil Babka
2015-07-24 20:39         ` Vlastimil Babka
2015-07-24 23:06         ` David Rientjes
2015-07-24 23:06           ` David Rientjes
2015-07-27 11:29           ` Vlastimil Babka [this message]
2015-07-27 11:29             ` Vlastimil Babka
2015-07-24 20:08 ` [RFC v2 1/4] mm: make alloc_pages_exact_node pass __GFP_THISNODE David Rientjes
2015-07-24 20:08   ` David Rientjes
2015-07-24 20:52   ` Vlastimil Babka
2015-07-24 20:52     ` Vlastimil Babka
2015-07-24 23:09     ` David Rientjes
2015-07-24 23:09       ` David Rientjes
2015-07-27 15:39 ` Johannes Weiner
2015-07-27 15:39   ` Johannes Weiner
2015-07-27 15:47   ` Vlastimil Babka
2015-07-27 15:47     ` Vlastimil Babka
2015-07-29 13:30 ` Mel Gorman
2015-07-29 13:30   ` Mel Gorman
2015-07-30 14:33   ` Christoph Lameter
2015-07-30 14:33     ` Christoph Lameter
2015-07-30 15:14   ` Johannes Weiner
2015-07-30 15:14     ` Johannes Weiner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55B61629.1060701@suse.cz \
    --to=vbabka@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=cl@linux.com \
    --cc=gthelen@google.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=penberg@kernel.org \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.