linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Rientjes <rientjes@google.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	linux-mm@kvack.org, Mikulas Patocka <mpatocka@redhat.com>,
	Ondrej Kozina <okozina@redhat.com>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Mel Gorman <mgorman@suse.de>, Neil Brown <neilb@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	dm-devel@redhat.com
Subject: Re: [RFC PATCH 1/2] mempool: do not consume memory reserves from the reclaim path
Date: Wed, 20 Jul 2016 14:06:26 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.10.1607201353230.22427@chino.kir.corp.google.com> (raw)
In-Reply-To: <20160720081541.GF11249@dhcp22.suse.cz>

On Wed, 20 Jul 2016, Michal Hocko wrote:

> > Any mempool_alloc() user that then takes a contended mutex can do this.  
> > An example:
> > 
> > 	taskA		taskB		taskC
> > 	-----		-----		-----
> > 	mempool_alloc(a)
> > 			mutex_lock(b)
> > 	mutex_lock(b)
> > 					mempool_alloc(a)
> > 
> > Imagine the mempool_alloc() done by taskA depleting all free elements so 
> > we rely on it to do mempool_free() before any other mempool allocator can 
> > be guaranteed.
> > 
> > If taskC is oom killed, or has PF_MEMALLOC set, it cannot access memory 
> > reserves from the page allocator if __GFP_NOMEMALLOC is automatic in 
> > mempool_alloc().  This livelocks the page allocator for all processes.
> > 
> > taskB in this case need only stall after taking mutex_lock() successfully; 
> > that could be because of the oom livelock, it is contended on another 
> > mutex held by an allocator, etc.
> 
> But that falls down to the deadlock described by Johannes above because
> then the mempool user would _depend_ on an "unguarded page allocation"
> via that particular lock and that is a bug.
>  

It becomes a deadlock because of mempool_alloc(a) forcing 
__GFP_NOMEMALLOC, I agree.

For that not to be the case, it must be required that between 
mempool_alloc() and mempool_free() that we take no mutex that may be held 
by any other thread on the system, in any context, that is allocating 
memory.  If that's a caller's bug as you describe it, and only enabled by 
mempool_alloc() forcing __GFP_NOMEMALLOC, then please add the relevant 
lockdep detection, which would be trivial to add, so we can determine if 
any users are unsafe and prevent this issue in the future.  The 
overwhelming goal here should be to prevent possible problems in the 
future especially if an API does not allow you to opt-out of the behavior.

> > My point is that I don't think we should be forcing any behavior wrt 
> > memory reserves as part of the mempool implementation. 
> 
> Isn't the reserve management the whole point of the mempool approach?
> 

No, the whole point is to maintain the freelist of elements that are 
guaranteed; my suggestion is that we cannot make that guarantee if we are 
blocked from freeing elements.  It's trivial to fix by allowing 
__GFP_NOMEMALLOC from the caller in cases where you cannot possibly be 
blocked by an oom victim.

> Or it would get stuck because even page allocator memory reserves got
> depleted. Without any way to throttle there is no guarantee to make
> further progress. In fact this is not a theoretical situation. It has
> been observed with the swap over dm-crypt and there shouldn't be any
> lock dependeces you are describing above there AFAIU.
> 

They should do mempool_alloc(__GFP_NOMEMALLOC), no argument.

> > The mempool_alloc() user may construct their 
> > set of gfp flags as appropriate just like any other memory allocator in 
> > the kernel.
> 
> So which users of mempool_alloc would benefit from not having
> __GFP_NOMEMALLOC and why?
> 

Any mempool_alloc() user that would be blocked on returning the element 
back to the freelist by an oom condition.  I think the dm-crypt case is 
quite unique on how it is able to deplete memory reserves.

> Anway, I feel we are looping in a circle. We have a clear regression
> caused by your patch. It might solve some oom livelock you are seeing
> but there are only very dim details about it and the patch might very
> well paper over a bug in mempool usage somewhere else. We definitely
> need more details to know that better.
> 

What is the objection to allowing __GFP_NOMEMALLOC from the caller with 
clear documentation on how to use it?  It can be described to not allow 
depletion of memory reserves with the caveat that the caller must ensure 
mempool_free() cannot be blocked in lowmem situations.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2016-07-20 21:06 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-18  8:39 [RFC PATCH 0/2] mempool vs. page allocator interaction Michal Hocko
2016-07-18  8:41 ` [RFC PATCH 1/2] mempool: do not consume memory reserves from the reclaim path Michal Hocko
2016-07-18  8:41   ` [RFC PATCH 2/2] mm, mempool: do not throttle PF_LESS_THROTTLE tasks Michal Hocko
2016-07-19 21:50     ` Mikulas Patocka
2016-07-22  8:46     ` NeilBrown
2016-07-22  9:04       ` NeilBrown
2016-07-22  9:15       ` Michal Hocko
2016-07-23  0:12         ` NeilBrown
2016-07-25  8:32           ` Michal Hocko
2016-07-25 19:23             ` Michal Hocko
2016-07-26  7:07               ` Michal Hocko
2016-07-27  3:43             ` [dm-devel] " NeilBrown
2016-07-27 18:24               ` Michal Hocko
2016-07-27 21:33                 ` NeilBrown
2016-07-28  7:17                   ` Michal Hocko
2016-08-03 12:53                     ` Mikulas Patocka
2016-08-03 14:34                       ` Michal Hocko
2016-08-04 18:49                         ` Mikulas Patocka
2016-08-12 12:32                           ` Michal Hocko
2016-08-13 17:34                             ` Mikulas Patocka
2016-08-14 10:34                               ` Michal Hocko
2016-08-15 16:15                                 ` Mikulas Patocka
2016-11-23 21:11                                 ` Mikulas Patocka
2016-11-24 13:29                                   ` Michal Hocko
2016-11-24 17:10                                     ` Mikulas Patocka
2016-11-28 14:06                                       ` Michal Hocko
2016-07-25 21:52           ` Mikulas Patocka
2016-07-26  7:25             ` Michal Hocko
2016-07-27  4:02             ` [dm-devel] " NeilBrown
2016-07-27 14:28               ` Mikulas Patocka
2016-07-27 18:40                 ` Michal Hocko
2016-08-03 13:59                   ` Mikulas Patocka
2016-08-03 14:42                     ` Michal Hocko
2016-08-04 18:46                       ` Mikulas Patocka
2016-07-27 21:36                 ` NeilBrown
2016-07-19  2:00   ` [RFC PATCH 1/2] mempool: do not consume memory reserves from the reclaim path David Rientjes
2016-07-19  7:49     ` Michal Hocko
2016-07-19 13:54   ` Johannes Weiner
2016-07-19 14:19     ` Michal Hocko
2016-07-19 22:01       ` Mikulas Patocka
2016-07-19 20:45     ` David Rientjes
2016-07-20  8:15       ` Michal Hocko
2016-07-20 21:06         ` David Rientjes [this message]
2016-07-21  8:52           ` Michal Hocko
2016-07-21 12:13             ` Johannes Weiner
2016-07-21 14:53               ` Michal Hocko
2016-07-21 15:26                 ` Johannes Weiner
2016-07-22  1:41                 ` NeilBrown
2016-07-22  6:37                 ` Michal Hocko
2016-07-22 12:26                   ` Vlastimil Babka
2016-07-22 19:44                     ` Andrew Morton
2016-07-23 18:52                       ` Vlastimil Babka
2016-07-19 21:50   ` Mikulas Patocka
2016-07-20  6:44     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.10.1607201353230.22427@chino.kir.corp.google.com \
    --to=rientjes@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=dm-devel@redhat.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=neilb@suse.de \
    --cc=okozina@redhat.com \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).