linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm: Remove VM_BUG_ON in __alloc_pages_node
@ 2019-06-05  6:02 Bharath Vedartham
  2019-06-05  7:03 ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: Bharath Vedartham @ 2019-06-05  6:02 UTC (permalink / raw)
  To: akpm, vbabka, mhocko, rientjes; +Cc: khalid.aziz, linux-mm, linux-kernel

In __alloc_pages_node, there is a VM_BUG_ON on the condition (nid < 0 ||
nid >= MAX_NUMNODES). Remove this VM_BUG_ON and add a VM_WARN_ON, if the
condition fails and fail the allocation if an invalid NUMA node id is
passed to __alloc_pages_node.

The check (nid < 0 || nid >= MAX_NUMNODES) also considers NUMA_NO_NODE
as an invalid nid, but the caller of __alloc_pages_node is assumed to
have checked for the case where nid == NUMA_NO_NODE.

Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com>
---
 include/linux/gfp.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 5f5e25f..075bdaf 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -480,7 +480,11 @@ __alloc_pages(gfp_t gfp_mask, unsigned int order, int preferred_nid)
 static inline struct page *
 __alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
 {
-	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
+	if (nid < 0 || nid >= MAX_NUMNODES) {
+		VM_WARN_ON(nid < 0 || nid >= MAX_NUMNODES);
+		return NULL; 
+	}
+
 	VM_WARN_ON((gfp_mask & __GFP_THISNODE) && !node_online(nid));
 
 	return __alloc_pages(gfp_mask, order, nid);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm: Remove VM_BUG_ON in __alloc_pages_node
  2019-06-05  6:02 [PATCH] mm: Remove VM_BUG_ON in __alloc_pages_node Bharath Vedartham
@ 2019-06-05  7:03 ` Michal Hocko
  2019-06-05 13:07   ` Bharath Vedartham
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2019-06-05  7:03 UTC (permalink / raw)
  To: Bharath Vedartham
  Cc: akpm, vbabka, rientjes, khalid.aziz, linux-mm, linux-kernel

On Wed 05-06-19 11:32:29, Bharath Vedartham wrote:
> In __alloc_pages_node, there is a VM_BUG_ON on the condition (nid < 0 ||
> nid >= MAX_NUMNODES). Remove this VM_BUG_ON and add a VM_WARN_ON, if the
> condition fails and fail the allocation if an invalid NUMA node id is
> passed to __alloc_pages_node.

What is the motivation of the patch? VM_BUG_ON is not enabled by default
and your patch adds a branch to a really hot path. Why is this an
improvement for something that shouldn't happen in the first place?

> 
> The check (nid < 0 || nid >= MAX_NUMNODES) also considers NUMA_NO_NODE
> as an invalid nid, but the caller of __alloc_pages_node is assumed to
> have checked for the case where nid == NUMA_NO_NODE.
> 
> Signed-off-by: Bharath Vedartham <linux.bhar@gmail.com>
> ---
>  include/linux/gfp.h | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
> index 5f5e25f..075bdaf 100644
> --- a/include/linux/gfp.h
> +++ b/include/linux/gfp.h
> @@ -480,7 +480,11 @@ __alloc_pages(gfp_t gfp_mask, unsigned int order, int preferred_nid)
>  static inline struct page *
>  __alloc_pages_node(int nid, gfp_t gfp_mask, unsigned int order)
>  {
> -	VM_BUG_ON(nid < 0 || nid >= MAX_NUMNODES);
> +	if (nid < 0 || nid >= MAX_NUMNODES) {
> +		VM_WARN_ON(nid < 0 || nid >= MAX_NUMNODES);
> +		return NULL; 
> +	}
> +
>  	VM_WARN_ON((gfp_mask & __GFP_THISNODE) && !node_online(nid));
>  
>  	return __alloc_pages(gfp_mask, order, nid);
> -- 
> 2.7.4
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm: Remove VM_BUG_ON in __alloc_pages_node
  2019-06-05  7:03 ` Michal Hocko
@ 2019-06-05 13:07   ` Bharath Vedartham
  2019-06-05 14:22     ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: Bharath Vedartham @ 2019-06-05 13:07 UTC (permalink / raw)
  To: Michal Hocko, akpm, vbabka, rientjes; +Cc: khalid.aziz, linux-mm, linux-kernel

[Not replying inline as my mail is bouncing back]

This patch is based on reading the code rather than a kernel crash. My
thought process was that if an invalid node id was passed to
__alloc_pages_node, it would be better to add a VM_WARN_ON and fail the
allocation rather than crashing the kernel. 
I feel it would be better to fail the allocation early in the hot path
if an invalid node id is passed. This is irrespective of whether the
VM_[BUG|WARN]_*s are enabled or not. I do not see any checks in the hot
path for the node id, which in turn may cause NODE_DATA(nid) to fail to
get the pglist_data pointer for the node id. 
We can optimise the branch by wrapping it around in unlikely(), if
performance is the issue?
What are your thoughts on this? 

Thank you 
Bharath

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm: Remove VM_BUG_ON in __alloc_pages_node
  2019-06-05 13:07   ` Bharath Vedartham
@ 2019-06-05 14:22     ` Michal Hocko
  2019-06-05 15:55       ` Bharath Vedartham
  0 siblings, 1 reply; 6+ messages in thread
From: Michal Hocko @ 2019-06-05 14:22 UTC (permalink / raw)
  To: Bharath Vedartham
  Cc: akpm, vbabka, rientjes, khalid.aziz, linux-mm, linux-kernel

On Wed 05-06-19 18:37:28, Bharath Vedartham wrote:
> [Not replying inline as my mail is bouncing back]
> 
> This patch is based on reading the code rather than a kernel crash. My
> thought process was that if an invalid node id was passed to
> __alloc_pages_node, it would be better to add a VM_WARN_ON and fail the
> allocation rather than crashing the kernel. 

This makes some sense to me because BUG_ONs are usually a wrong way to
handle wrong usage of the API. On the other hand VM_BUG_ON is special in
the way that production although some distributions enable it by default
IIRC.

> I feel it would be better to fail the allocation early in the hot path
> if an invalid node id is passed. This is irrespective of whether the
> VM_[BUG|WARN]_*s are enabled or not. I do not see any checks in the hot
> path for the node id, which in turn may cause NODE_DATA(nid) to fail to
> get the pglist_data pointer for the node id. 
> We can optimise the branch by wrapping it around in unlikely(), if
> performance is the issue?

unlikely will just move the return NULL ouside of the main code flow.
The check will still be done.

> What are your thoughts on this? 

I don't know. I would leave the code as it is now or remove the
VM_BUG_ON. I do not remember this would be catching any real issues in
the past.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm: Remove VM_BUG_ON in __alloc_pages_node
  2019-06-05 14:22     ` Michal Hocko
@ 2019-06-05 15:55       ` Bharath Vedartham
  2019-06-06  5:29         ` Michal Hocko
  0 siblings, 1 reply; 6+ messages in thread
From: Bharath Vedartham @ 2019-06-05 15:55 UTC (permalink / raw)
  To: Michal Hocko; +Cc: akpm, vbabka, rientjes, khalid.aziz, linux-mm, linux-kernel

IMO the reason why a lot of failures must not have occured in the past
might be because the programs which use it use stuff like cpu_to_node or
have checks for nid.
If one day we do get a program which passes an invalid node id without
VM_BUG_ON enabled, it might get weird.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] mm: Remove VM_BUG_ON in __alloc_pages_node
  2019-06-05 15:55       ` Bharath Vedartham
@ 2019-06-06  5:29         ` Michal Hocko
  0 siblings, 0 replies; 6+ messages in thread
From: Michal Hocko @ 2019-06-06  5:29 UTC (permalink / raw)
  To: Bharath Vedartham
  Cc: akpm, vbabka, rientjes, khalid.aziz, linux-mm, linux-kernel

On Wed 05-06-19 21:25:01, Bharath Vedartham wrote:
> IMO the reason why a lot of failures must not have occured in the past
> might be because the programs which use it use stuff like cpu_to_node or
> have checks for nid.
> If one day we do get a program which passes an invalid node id without
> VM_BUG_ON enabled, it might get weird.

It will blow up on a NULL NODE_DATA and it will be quite obvious what
that was so I wouldn't lose any sleep over that. I do not think we have
any directly user controlable way to provide a completely ad-hoc numa
node for an allocation.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-06-06  5:29 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-05  6:02 [PATCH] mm: Remove VM_BUG_ON in __alloc_pages_node Bharath Vedartham
2019-06-05  7:03 ` Michal Hocko
2019-06-05 13:07   ` Bharath Vedartham
2019-06-05 14:22     ` Michal Hocko
2019-06-05 15:55       ` Bharath Vedartham
2019-06-06  5:29         ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).