All of lore.kernel.org
 help / color / mirror / Atom feed
From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, Mel Gorman <mgorman@suse.de>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Sachin Sant <sachinp@linux.vnet.ibm.com>,
	Michal Hocko <mhocko@kernel.org>,
	Christopher Lameter <cl@linux.com>,
	linuxppc-dev@lists.ozlabs.org,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Kirill Tkhai <ktkhai@virtuozzo.com>,
	Bharata B Rao <bharata@linux.ibm.com>
Subject: Re: [PATCH 2/4] mm/slub: Use mem_node to allocate a new slab
Date: Tue, 17 Mar 2020 19:15:23 +0530	[thread overview]
Message-ID: <20200317134523.GB4334@linux.vnet.ibm.com> (raw)
In-Reply-To: <ef34b0bb-4dfa-cb0e-1830-9ad59119da5e@suse.cz>

* Vlastimil Babka <vbabka@suse.cz> [2020-03-17 14:34:25]:

> On 3/17/20 2:17 PM, Srikar Dronamraju wrote:
> > Currently while allocating a slab for a offline node, we use its
> > associated node_numa_mem to search for a partial slab. If we don't find
> > a partial slab, we try allocating a slab from the offline node using
> > __alloc_pages_node. However this is bound to fail.
> > 
> > NIP [c00000000039a300] __alloc_pages_nodemask+0x130/0x3b0
> > LR [c00000000039a3c4] __alloc_pages_nodemask+0x1f4/0x3b0
> > Call Trace:
> > [c0000008b36837f0] [c00000000039a3b4] __alloc_pages_nodemask+0x1e4/0x3b0 (unreliable)
> > [c0000008b3683870] [c0000000003d1ff8] new_slab+0x128/0xcf0
> > [c0000008b3683950] [c0000000003d6060] ___slab_alloc+0x410/0x820
> > [c0000008b3683a40] [c0000000003d64a4] __slab_alloc+0x34/0x60
> > [c0000008b3683a70] [c0000000003d78b0] __kmalloc_node+0x110/0x490
> > [c0000008b3683af0] [c000000000343a08] kvmalloc_node+0x58/0x110
> > [c0000008b3683b30] [c0000000003ffd44] mem_cgroup_css_online+0x104/0x270
> > [c0000008b3683b90] [c000000000234e08] online_css+0x48/0xd0
> > [c0000008b3683bc0] [c00000000023dedc] cgroup_apply_control_enable+0x2ec/0x4d0
> > [c0000008b3683ca0] [c0000000002416f8] cgroup_mkdir+0x228/0x5f0
> > [c0000008b3683d10] [c000000000520360] kernfs_iop_mkdir+0x90/0xf0
> > [c0000008b3683d50] [c00000000043e400] vfs_mkdir+0x110/0x230
> > [c0000008b3683da0] [c000000000441ee0] do_mkdirat+0xb0/0x1a0
> > [c0000008b3683e20] [c00000000000b278] system_call+0x5c/0x68
> > 
> > Mitigate this by allocating the new slab from the node_numa_mem.
> 
> Are you sure this is really needed and the other 3 patches are not enough for
> the current SLUB code to work as needed? It seems you are changing the semantics
> here...
> 

The other 3 patches are not enough because we don't carry the searchnode
when the actual alloc_pages_node gets called. 

With only the 3 patches, we see the above Panic, its signature is slightly
different from what Sachin first reported and which I have carried in 1st
patch.

> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -1970,14 +1970,8 @@ static void *get_partial(struct kmem_cache *s, gfp_t flags, int node,
> >  		struct kmem_cache_cpu *c)
> >  {
> >  	void *object;
> > -	int searchnode = node;
> >  
> > -	if (node == NUMA_NO_NODE)
> > -		searchnode = numa_mem_id();
> > -	else if (!node_present_pages(node))
> > -		searchnode = node_to_mem_node(node);
> > -
> > -	object = get_partial_node(s, get_node(s, searchnode), c, flags);
> > +	object = get_partial_node(s, get_node(s, node), c, flags);
> >  	if (object || node != NUMA_NO_NODE)>  		return object;
> >
> >       return get_any_partial(s, flags, c);
> 
> I.e. here in this if(), now node will never equal NUMA_NO_NODE (thanks to the
> hunk below), thus the get_any_partial() call becomes dead code?
> 
> > @@ -2470,6 +2464,11 @@ static inline void *new_slab_objects(struct kmem_cache *s, gfp_t flags,
> >  
> >  	WARN_ON_ONCE(s->ctor && (flags & __GFP_ZERO));
> >  
> > +	if (node == NUMA_NO_NODE)
> > +		node = numa_mem_id();
> > +	else if (!node_present_pages(node))
> > +		node = node_to_mem_node(node);
> > +
> >  	freelist = get_partial(s, flags, node, c);
> >  
> >  	if (freelist)
> > @@ -2569,12 +2568,10 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> >  redo:
> >  
> >  	if (unlikely(!node_match(page, node))) {
> > -		int searchnode = node;
> > -
> >  		if (node != NUMA_NO_NODE && !node_present_pages(node))
> > -			searchnode = node_to_mem_node(node);
> > +			node = node_to_mem_node(node);
> >  
> > -		if (unlikely(!node_match(page, searchnode))) {
> > +		if (unlikely(!node_match(page, node))) {
> >  			stat(s, ALLOC_NODE_MISMATCH);
> >  			deactivate_slab(s, page, c->freelist, c);
> >  			goto new_slab;
> > 
> 

-- 
Thanks and Regards
Srikar Dronamraju



WARNING: multiple messages have this Message-ID (diff)
From: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Sachin Sant <sachinp@linux.vnet.ibm.com>,
	Michal Hocko <mhocko@kernel.org>,
	linux-mm@kvack.org, Kirill Tkhai <ktkhai@virtuozzo.com>,
	Mel Gorman <mgorman@suse.de>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Bharata B Rao <bharata@linux.ibm.com>,
	linuxppc-dev@lists.ozlabs.org, Christopher Lameter <cl@linux.com>
Subject: Re: [PATCH 2/4] mm/slub: Use mem_node to allocate a new slab
Date: Tue, 17 Mar 2020 19:15:23 +0530	[thread overview]
Message-ID: <20200317134523.GB4334@linux.vnet.ibm.com> (raw)
In-Reply-To: <ef34b0bb-4dfa-cb0e-1830-9ad59119da5e@suse.cz>

* Vlastimil Babka <vbabka@suse.cz> [2020-03-17 14:34:25]:

> On 3/17/20 2:17 PM, Srikar Dronamraju wrote:
> > Currently while allocating a slab for a offline node, we use its
> > associated node_numa_mem to search for a partial slab. If we don't find
> > a partial slab, we try allocating a slab from the offline node using
> > __alloc_pages_node. However this is bound to fail.
> > 
> > NIP [c00000000039a300] __alloc_pages_nodemask+0x130/0x3b0
> > LR [c00000000039a3c4] __alloc_pages_nodemask+0x1f4/0x3b0
> > Call Trace:
> > [c0000008b36837f0] [c00000000039a3b4] __alloc_pages_nodemask+0x1e4/0x3b0 (unreliable)
> > [c0000008b3683870] [c0000000003d1ff8] new_slab+0x128/0xcf0
> > [c0000008b3683950] [c0000000003d6060] ___slab_alloc+0x410/0x820
> > [c0000008b3683a40] [c0000000003d64a4] __slab_alloc+0x34/0x60
> > [c0000008b3683a70] [c0000000003d78b0] __kmalloc_node+0x110/0x490
> > [c0000008b3683af0] [c000000000343a08] kvmalloc_node+0x58/0x110
> > [c0000008b3683b30] [c0000000003ffd44] mem_cgroup_css_online+0x104/0x270
> > [c0000008b3683b90] [c000000000234e08] online_css+0x48/0xd0
> > [c0000008b3683bc0] [c00000000023dedc] cgroup_apply_control_enable+0x2ec/0x4d0
> > [c0000008b3683ca0] [c0000000002416f8] cgroup_mkdir+0x228/0x5f0
> > [c0000008b3683d10] [c000000000520360] kernfs_iop_mkdir+0x90/0xf0
> > [c0000008b3683d50] [c00000000043e400] vfs_mkdir+0x110/0x230
> > [c0000008b3683da0] [c000000000441ee0] do_mkdirat+0xb0/0x1a0
> > [c0000008b3683e20] [c00000000000b278] system_call+0x5c/0x68
> > 
> > Mitigate this by allocating the new slab from the node_numa_mem.
> 
> Are you sure this is really needed and the other 3 patches are not enough for
> the current SLUB code to work as needed? It seems you are changing the semantics
> here...
> 

The other 3 patches are not enough because we don't carry the searchnode
when the actual alloc_pages_node gets called. 

With only the 3 patches, we see the above Panic, its signature is slightly
different from what Sachin first reported and which I have carried in 1st
patch.

> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -1970,14 +1970,8 @@ static void *get_partial(struct kmem_cache *s, gfp_t flags, int node,
> >  		struct kmem_cache_cpu *c)
> >  {
> >  	void *object;
> > -	int searchnode = node;
> >  
> > -	if (node == NUMA_NO_NODE)
> > -		searchnode = numa_mem_id();
> > -	else if (!node_present_pages(node))
> > -		searchnode = node_to_mem_node(node);
> > -
> > -	object = get_partial_node(s, get_node(s, searchnode), c, flags);
> > +	object = get_partial_node(s, get_node(s, node), c, flags);
> >  	if (object || node != NUMA_NO_NODE)>  		return object;
> >
> >       return get_any_partial(s, flags, c);
> 
> I.e. here in this if(), now node will never equal NUMA_NO_NODE (thanks to the
> hunk below), thus the get_any_partial() call becomes dead code?
> 
> > @@ -2470,6 +2464,11 @@ static inline void *new_slab_objects(struct kmem_cache *s, gfp_t flags,
> >  
> >  	WARN_ON_ONCE(s->ctor && (flags & __GFP_ZERO));
> >  
> > +	if (node == NUMA_NO_NODE)
> > +		node = numa_mem_id();
> > +	else if (!node_present_pages(node))
> > +		node = node_to_mem_node(node);
> > +
> >  	freelist = get_partial(s, flags, node, c);
> >  
> >  	if (freelist)
> > @@ -2569,12 +2568,10 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
> >  redo:
> >  
> >  	if (unlikely(!node_match(page, node))) {
> > -		int searchnode = node;
> > -
> >  		if (node != NUMA_NO_NODE && !node_present_pages(node))
> > -			searchnode = node_to_mem_node(node);
> > +			node = node_to_mem_node(node);
> >  
> > -		if (unlikely(!node_match(page, searchnode))) {
> > +		if (unlikely(!node_match(page, node))) {
> >  			stat(s, ALLOC_NODE_MISMATCH);
> >  			deactivate_slab(s, page, c->freelist, c);
> >  			goto new_slab;
> > 
> 

-- 
Thanks and Regards
Srikar Dronamraju


  reply	other threads:[~2020-03-17 13:45 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-18 10:45 [5.6.0-rc2-next-20200218/powerpc] Boot failure on POWER9 Sachin Sant
2020-02-18 10:50 ` Kirill Tkhai
2020-02-18 10:50   ` Kirill Tkhai
2020-02-18 11:01   ` Kirill Tkhai
2020-02-18 11:01     ` Kirill Tkhai
2020-02-18 11:35     ` Kirill Tkhai
2020-02-18 11:35       ` Kirill Tkhai
2020-02-18 11:40     ` Sachin Sant
2020-02-18 11:55       ` Michal Hocko
2020-02-18 11:55         ` Michal Hocko
2020-02-18 14:00         ` Sachin Sant
2020-02-18 14:00           ` Sachin Sant
2020-02-18 14:26           ` Michal Hocko
2020-02-18 14:26             ` Michal Hocko
2020-02-18 15:11             ` Sachin Sant
2020-02-18 15:11               ` Sachin Sant
2020-02-18 15:24               ` Michal Hocko
2020-02-18 15:24                 ` Michal Hocko
2020-02-22  3:38                 ` Christopher Lameter
2020-02-22  3:38                   ` Christopher Lameter
2020-02-24  8:58                   ` Michal Hocko
2020-02-24  8:58                     ` Michal Hocko
2020-02-26 18:25                     ` Christopher Lameter
2020-02-26 18:25                       ` Christopher Lameter
2020-02-26 18:41                       ` Michal Hocko
2020-02-26 18:41                         ` Michal Hocko
2020-02-26 18:44                         ` Christopher Lameter
2020-02-26 18:44                           ` Christopher Lameter
2020-02-26 19:01                           ` Michal Hocko
2020-02-26 19:01                             ` Michal Hocko
2020-02-26 20:31                             ` David Rientjes
2020-02-26 20:31                               ` David Rientjes
2020-02-26 20:52                               ` Michal Hocko
2020-02-26 20:52                                 ` Michal Hocko
2020-02-26 21:45                         ` Vlastimil Babka
2020-02-26 21:45                           ` Vlastimil Babka
2020-02-26 22:29                           ` Vlastimil Babka
2020-02-26 22:29                             ` Vlastimil Babka
2020-02-27 12:12                             ` Michal Hocko
2020-02-27 12:12                               ` Michal Hocko
2020-02-27 16:00                               ` Sachin Sant
2020-02-27 16:00                                 ` Sachin Sant
2020-02-27 16:16                                 ` Vlastimil Babka
2020-02-27 18:26                                   ` Michal Hocko
2020-02-27 18:26                                     ` Michal Hocko
2020-03-10 15:01                                     ` Michal Hocko
2020-03-10 15:01                                       ` Michal Hocko
2020-03-12 12:18                                       ` Michael Ellerman
2020-03-12 12:18                                         ` Michael Ellerman
2020-03-12 16:51                                         ` Sachin Sant
2020-03-12 16:51                                           ` Sachin Sant
2020-03-13 10:48                                           ` Michael Ellerman
2020-03-13 10:48                                             ` Michael Ellerman
2020-03-13 11:12                                             ` Srikar Dronamraju
2020-03-13 11:12                                               ` Srikar Dronamraju
2020-03-13 11:35                                               ` Vlastimil Babka
2020-03-13 11:35                                                 ` Vlastimil Babka
2020-03-14  8:10                                                 ` Sachin Sant
2020-02-27 12:02                           ` Michal Hocko
2020-02-27 12:02                             ` Michal Hocko
2020-02-18 11:38   ` Sachin Sant
2020-02-18 11:53     ` Kirill Tkhai
2020-03-17 13:17 ` [PATCH 0/4] Fix kmalloc_node on offline nodes Srikar Dronamraju
2020-03-17 13:17   ` Srikar Dronamraju
2020-03-17 13:17   ` [PATCH 1/4] mm: Check for node_online in node_present_pages Srikar Dronamraju
2020-03-17 13:17     ` Srikar Dronamraju
2020-03-17 13:37     ` Srikar Dronamraju
2020-03-17 13:37       ` Srikar Dronamraju
2020-03-17 13:17   ` [PATCH 2/4] mm/slub: Use mem_node to allocate a new slab Srikar Dronamraju
2020-03-17 13:17     ` Srikar Dronamraju
2020-03-17 13:34     ` Vlastimil Babka
2020-03-17 13:34       ` Vlastimil Babka
2020-03-17 13:45       ` Srikar Dronamraju [this message]
2020-03-17 13:45         ` Srikar Dronamraju
2020-03-17 13:53         ` Vlastimil Babka
2020-03-17 13:53           ` Vlastimil Babka
2020-03-17 14:51           ` Srikar Dronamraju
2020-03-17 14:51             ` Srikar Dronamraju
2020-03-17 15:29             ` Vlastimil Babka
2020-03-17 15:29               ` Vlastimil Babka
2020-03-18  7:29               ` Srikar Dronamraju
2020-03-18  7:29                 ` Srikar Dronamraju
2020-03-17 16:41       ` Srikar Dronamraju
2020-03-17 16:41         ` Srikar Dronamraju
2020-03-17 13:17   ` [PATCH 3/4] mm: Implement reset_numa_mem Srikar Dronamraju
2020-03-17 13:17     ` Srikar Dronamraju
2020-03-17 13:17   ` [PATCH 4/4] powerpc/numa: Set fallback nodes for offline nodes Srikar Dronamraju
2020-03-17 13:17     ` Srikar Dronamraju
2020-03-17 14:22     ` Bharata B Rao
2020-03-17 14:22       ` Bharata B Rao
2020-03-17 14:29       ` Srikar Dronamraju
2020-03-17 14:29         ` Srikar Dronamraju

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200317134523.GB4334@linux.vnet.ibm.com \
    --to=srikar@linux.vnet.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=bharata@linux.ibm.com \
    --cc=cl@linux.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=ktkhai@virtuozzo.com \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@kernel.org \
    --cc=mpe@ellerman.id.au \
    --cc=sachinp@linux.vnet.ibm.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.