linux-kernel.vger.kernel.org archive mirror
From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Mikulas Patocka <mpatocka@redhat.com>,
	Ondrej Kozina <okozina@redhat.com>,
	Jerome Marchand <jmarchan@redhat.com>,
	Stanislav Kozina <skozina@redhat.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	David Rientjes <rientjes@google.com>
Subject: Re: System freezes after OOM
Date: Wed, 13 Jul 2016 15:39:55 +0200
Message-ID: <20160713133955.GK28723@dhcp22.suse.cz>
In-Reply-To: <2d5e1f84-e886-7b98-cb11-170d7104fd13@I-love.SAKURA.ne.jp>

[CC David]

On Wed 13-07-16 22:19:23, Tetsuo Handa wrote:
> >> On Mon 11-07-16 11:43:02, Mikulas Patocka wrote:
> >> [...]
> >>> The general problem is that the memory allocator does 16 retries to 
> >>> allocate a page and then triggers the OOM killer (and it doesn't take into 
> >>> account how much swap space is free or how many dirty pages were really 
> >>> swapped out while it waited).
> >>
> >> Well, that is not exactly how it works. We retry as long as there is
> >> reclaim progress (at least one page freed), and back off only if the
> >> reclaimable memory can exceed the watermark, which is scaled down over
> >> 16 retries. The overall size of free swap is not really that important if we
> >> cannot swap out like here due to complete memory reserves depletion:
> >> https://okozina.fedorapeople.org/bugs/swap_on_dmcrypt/vmlog-1462458369-00000/sample-00011/dmesg:
> >> [   90.491276] Node 0 DMA free:0kB min:60kB low:72kB high:84kB active_anon:4096kB inactive_anon:4636kB active_file:212kB inactive_file:280kB unevictable:488kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:488kB dirty:276kB writeback:4636kB mapped:476kB shmem:12kB slab_reclaimable:204kB slab_unreclaimable:4700kB kernel_stack:48kB pagetables:120kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:61132 all_unreclaimable? yes
> >> [   90.491283] lowmem_reserve[]: 0 977 977 977
> >> [   90.491286] Node 0 DMA32 free:0kB min:3828kB low:4824kB high:5820kB active_anon:423820kB inactive_anon:424916kB active_file:17996kB inactive_file:21800kB unevictable:20724kB isolated(anon):384kB isolated(file):0kB present:1032184kB managed:1001260kB mlocked:20724kB dirty:25236kB writeback:49972kB mapped:23076kB shmem:1364kB slab_reclaimable:13796kB slab_unreclaimable:43008kB kernel_stack:2816kB pagetables:7320kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:5635400 all_unreclaimable? yes
> >>
> >> Look at the amount of free memory. It is completely depleted. So it
> >> smells like a process which has access to memory reserves has consumed
> >> all of it. I suspect a __GFP_MEMALLOC resp. PF_MEMALLOC from softirq
> >> context user which went off the leash.
> > 
> > It is caused by the commit f9054c70d28bc214b2857cf8db8269f4f45a5e23. Prior 
> > to this commit, mempool allocations set __GFP_NOMEMALLOC, so they never 
> > exhausted reserved memory. With this commit, mempool allocations drop 
> > __GFP_NOMEMALLOC, so they can dig deeper (if the process has PF_MEMALLOC, 
> > they can bypass all limits).
> 
> I wonder whether commit f9054c70d28bc214 ("mm, mempool: only set
> __GFP_NOMEMALLOC if there are free elements") is doing the correct thing.
> It says
> 
>     If an oom killed thread calls mempool_alloc(), it is possible that it'll
>     loop forever if there are no elements on the freelist since
>     __GFP_NOMEMALLOC prevents it from accessing needed memory reserves in
>     oom conditions.

I haven't studied the patch very deeply so I might be missing something
but from a quick look the patch does exactly what the above says.

mempool_alloc used to inhibit ALLOC_NO_WATERMARKS by default. David's
change only allows ALLOC_NO_WATERMARKS when there are no objects left
in the pool, i.e. when we have no fallback for the default __GFP_NORETRY
request.
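
To illustrate the difference, here is a minimal userspace sketch of how
the gfp mask is adjusted before and after that commit. It is a model, not
the kernel code: the bit values and the helper names (mempool_gfp_before,
mempool_gfp_after, reserves_not_inhibited) are made up for this example,
and the real logic lives in mempool_alloc() in mm/mempool.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for this model; the real ones are in include/linux/gfp.h. */
#define GFP_NOMEMALLOC (1u << 0)
#define GFP_NORETRY    (1u << 1)
#define GFP_NOWARN     (1u << 2)

/* Before the commit: every mempool allocation inhibits access to reserves. */
static unsigned int mempool_gfp_before(unsigned int gfp_mask, int curr_nr)
{
	assert(curr_nr >= 0);
	return gfp_mask | GFP_NOMEMALLOC | GFP_NORETRY | GFP_NOWARN;
}

/* After the commit: reserves are inhibited only while the pool has elements. */
static unsigned int mempool_gfp_after(unsigned int gfp_mask, int curr_nr)
{
	unsigned int gfp = gfp_mask | GFP_NORETRY | GFP_NOWARN;

	assert(curr_nr >= 0);
	if (curr_nr > 0)
		gfp |= GFP_NOMEMALLOC;
	return gfp;
}

/*
 * True if the mask does not forbid ALLOC_NO_WATERMARKS. Whether the page
 * allocator actually grants it still depends on the caller's context
 * (PF_MEMALLOC, TIF_MEMDIE, __GFP_MEMALLOC).
 */
static bool reserves_not_inhibited(unsigned int gfp)
{
	return !(gfp & GFP_NOMEMALLOC);
}
```

In this model reserves_not_inhibited(mempool_gfp_after(0, 0)) is true
while reserves_not_inhibited(mempool_gfp_after(0, 4)) is false, so only
an empty pool lets the request through to the context checks.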

> but we can allow mempool_alloc(__GFP_NOMEMALLOC) requests to access
> memory reserves via below change, can't we?

Well, I do not see all the potential side effects of such a change but
I believe it shouldn't be really necessary because we should eventually
allow ALLOC_NO_WATERMARKS even from mempool_alloc.

> The purpose of allowing
> ALLOC_NO_WATERMARKS via TIF_MEMDIE is to make sure current allocation
> request does not loop forever inside the page allocator, isn't it?
> Why do we need to allow mempool_alloc(__GFP_NOMEMALLOC) requests to use
> ALLOC_NO_WATERMARKS when TIF_MEMDIE is not set?
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 6903b69..e4e3700 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3439,14 +3439,14 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
>  	} else if (unlikely(rt_task(current)) && !in_interrupt())
>  		alloc_flags |= ALLOC_HARDER;
>  
> -	if (likely(!(gfp_mask & __GFP_NOMEMALLOC))) {
> +	if (!in_interrupt() && unlikely(test_thread_flag(TIF_MEMDIE)))
> +		alloc_flags |= ALLOC_NO_WATERMARKS;
> +	else if (likely(!(gfp_mask & __GFP_NOMEMALLOC))) {
>  		if (gfp_mask & __GFP_MEMALLOC)
>  			alloc_flags |= ALLOC_NO_WATERMARKS;
>  		else if (in_serving_softirq() && (current->flags & PF_MEMALLOC))
>  			alloc_flags |= ALLOC_NO_WATERMARKS;
> -		else if (!in_interrupt() &&
> -				((current->flags & PF_MEMALLOC) ||
> -				 unlikely(test_thread_flag(TIF_MEMDIE))))
> +		else if (!in_interrupt() && (current->flags & PF_MEMALLOC))
>  			alloc_flags |= ALLOC_NO_WATERMARKS;
>  	}
>  #ifdef CONFIG_CMA
> 
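
For reference, the decision encoded by the quoted diff can be modelled in
userspace as below. This is only a sketch derived from the hunk itself:
struct alloc_ctx, the bit values and both function names are hypothetical,
and the context predicates are plain booleans here (in the kernel,
in_serving_softirq() implies in_interrupt(), which this model does not
enforce).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values for this model only. */
#define GFP_NOMEMALLOC (1u << 0)
#define GFP_MEMALLOC   (1u << 1)

/* Caller context, standing in for the kernel predicates named in the diff. */
struct alloc_ctx {
	bool in_interrupt;       /* in_interrupt() */
	bool in_serving_softirq; /* in_serving_softirq() */
	bool pf_memalloc;        /* current->flags & PF_MEMALLOC */
	bool tif_memdie;         /* test_thread_flag(TIF_MEMDIE) */
};

/* Logic on the '-' side of the hunk: __GFP_NOMEMALLOC overrides everything. */
static bool no_watermarks_current(unsigned int gfp, const struct alloc_ctx *c)
{
	assert(c != NULL);
	if (gfp & GFP_NOMEMALLOC)
		return false;
	if (gfp & GFP_MEMALLOC)
		return true;
	if (c->in_serving_softirq && c->pf_memalloc)
		return true;
	return !c->in_interrupt && (c->pf_memalloc || c->tif_memdie);
}

/* Logic on the '+' side: TIF_MEMDIE is checked first and wins over __GFP_NOMEMALLOC. */
static bool no_watermarks_proposed(unsigned int gfp, const struct alloc_ctx *c)
{
	assert(c != NULL);
	if (!c->in_interrupt && c->tif_memdie)
		return true;
	if (gfp & GFP_NOMEMALLOC)
		return false;
	if (gfp & GFP_MEMALLOC)
		return true;
	if (c->in_serving_softirq && c->pf_memalloc)
		return true;
	return !c->in_interrupt && c->pf_memalloc;
}
```

The two functions differ only for a TIF_MEMDIE task in process context
that passes __GFP_NOMEMALLOC: the current logic refuses the reserves, the
proposed one grants them. A PF_MEMALLOC task passing __GFP_NOMEMALLOC is
refused in both.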

-- 
Michal Hocko
SUSE Labs
