From: Minchan Kim <minchan@kernel.org>
To: Gerhard Wiesinger <lists@wiesinger.com>
Cc: Michal Hocko <mhocko@kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: Still OOM problems with 4.9er/4.10er kernels
Date: Mon, 27 Feb 2017 18:02:36 +0900
Message-ID: <20170227090236.GA2789@bbox>
In-Reply-To: <82bce413-1bd7-7f66-1c3d-0d890bbaf6f1@wiesinger.com>

On Sun, Feb 26, 2017 at 09:40:42AM +0100, Gerhard Wiesinger wrote:
> On 04.01.2017 10:11, Michal Hocko wrote:
> >>The VM stops working (e.g. not pingable) after around 8h (will be restarted
> >>automatically), happened several times.
> >>
> >>Had also further OOMs which I sent to Minchan.
> >Could you post them to the mailing list as well, please?
> 
> Still OOMs on dnf update procedure with kernel 4.10: 4.10.0-1.fc26.x86_64 as
> well on 4.9.9-200.fc25.x86_64
> 
> On 4.10er kernels:
> 
> Free swap  = 1137532kB
> 
> cat /etc/sysctl.d/* | grep ^vm
> vm.dirty_background_ratio = 3
> vm.dirty_ratio = 15
> vm.overcommit_memory = 2
> vm.overcommit_ratio = 80
> vm.swappiness=10
> 
> kernel: python invoked oom-killer:
> gfp_mask=0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), nodemask=0, order=0,
> oom_score_adj=0
> kernel: python cpuset=/ mems_allowed=0
> kernel: CPU: 1 PID: 813 Comm: python Not tainted 4.10.0-1.fc26.x86_64 #1
> kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3
> 04/01/2014
> kernel: Call Trace:
> kernel:  dump_stack+0x63/0x84
> kernel:  dump_header+0x7b/0x1f6
> kernel:  ? do_try_to_free_pages+0x2c5/0x340
> kernel:  oom_kill_process+0x202/0x3d0
> kernel:  out_of_memory+0x2b7/0x4e0
> kernel:  __alloc_pages_slowpath+0x915/0xb80
> kernel:  __alloc_pages_nodemask+0x218/0x2d0
> kernel:  alloc_pages_current+0x93/0x150
> kernel:  __page_cache_alloc+0xcf/0x100
> kernel:  filemap_fault+0x39d/0x800
> kernel:  ? page_add_file_rmap+0xe5/0x200
> kernel:  ? filemap_map_pages+0x2e1/0x4e0
> kernel:  ext4_filemap_fault+0x36/0x50
> kernel:  __do_fault+0x21/0x110
> kernel:  handle_mm_fault+0xdd1/0x1410
> kernel:  ? swake_up+0x42/0x50
> kernel:  __do_page_fault+0x23f/0x4c0
> kernel:  trace_do_page_fault+0x41/0x120
> kernel:  do_async_page_fault+0x51/0xa0
> kernel:  async_page_fault+0x28/0x30
> kernel: RIP: 0033:0x7f0681ad6350
> kernel: RSP: 002b:00007ffcbdd238d8 EFLAGS: 00010246
> kernel: RAX: 00007f0681b0f960 RBX: 0000000000000000 RCX: 7fffffffffffffff
> kernel: RDX: 0000000000000000 RSI: 3ff0000000000000 RDI: 3ff0000000000000
> kernel: RBP: 00007f067461ab40 R08: 0000000000000000 R09: 3ff0000000000000
> kernel: R10: 0000556f1c6d8a80 R11: 0000000000000001 R12: 00007f0676d1a8d0
> kernel: R13: 0000000000000000 R14: 00007f06746168bc R15: 00007f0674385910
> kernel: Mem-Info:
> kernel: active_anon:37423 inactive_anon:37512 isolated_anon:0
>          active_file:462 inactive_file:603 isolated_file:0
>          unevictable:0 dirty:0 writeback:0 unstable:0
>          slab_reclaimable:3538 slab_unreclaimable:4818
>          mapped:859 shmem:9 pagetables:3370 bounce:0
>          free:1650 free_pcp:103 free_cma:0
> kernel: Node 0 active_anon:149380kB inactive_anon:149704kB
> active_file:1848kB inactive_file:3660kB unevictable:0kB isolated(anon):128kB
> isolated(file):0kB mapped:4580kB dirty:0kB writeback:380kB shmem:0kB
> shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 36kB writeback_tmp:0kB
> unstable:0kB pages_scanned:352 all_unreclaimable? no
> kernel: Node 0 DMA free:1484kB min:104kB low:128kB high:152kB
> active_anon:5660kB inactive_anon:6156kB active_file:56kB inactive_file:64kB
> unevictable:0kB writepending:0kB present:15992kB managed:15908kB mlocked:0kB
> slab_reclaimable:444kB slab_unreclaimable:1208kB kernel_stack:32kB
> pagetables:592kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> kernel: lowmem_reserve[]: 0 327 327 327 327
> kernel: Node 0 DMA32 free:5012kB min:2264kB low:2828kB high:3392kB
> active_anon:143580kB inactive_anon:143300kB active_file:2576kB
> inactive_file:2560kB unevictable:0kB writepending:0kB present:376688kB
> managed:353968kB mlocked:0kB slab_reclaimable:13708kB
> slab_unreclaimable:18064kB kernel_stack:2352kB pagetables:12888kB bounce:0kB
> free_pcp:412kB local_pcp:88kB free_cma:0kB
> kernel: lowmem_reserve[]: 0 0 0 0 0
> kernel: Node 0 DMA: 70*4kB (UMEH) 20*8kB (UMEH) 13*16kB (MH) 5*32kB (H)
> 4*64kB (H) 2*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB =
> 1576kB
> kernel: Node 0 DMA32: 1134*4kB (UMEH) 25*8kB (UMEH) 13*16kB (MH) 7*32kB (H)
> 3*64kB (H) 0*128kB 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 5616kB
 
Although the DMA32 zone appears to have enough free memory, that free memory
includes H pageblocks, which are reserved for high-order atomic allocations.
That might be why the allocation cannot pass the watermark check.
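
To make the arithmetic concrete, here is a minimal user-space sketch of an
order-0 watermark check, using the DMA32 figures from the report above. It is
a simplified model, not the kernel's __zone_watermark_ok(): the helper names
are hypothetical, the 512-page pageblock size is an assumption (2 MB blocks on
x86_64), and the number of reserved highatomic pageblocks is not printed in
the report, so the values tried here are illustrative only. As I understand
the 4.10 code, an ordinary allocation has the whole highatomic reserve
subtracted from the zone's free page count before the comparison.

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_KB 4L            /* x86_64 base page size in kB */
#define PAGEBLOCK_PAGES 512L  /* ASSUMPTION: 2 MB pageblocks (order 9) */

/* Convert a kB figure from the OOM report into 4 kB pages. */
static long kb_to_pages(long kb)
{
    assert(kb >= 0);
    return kb / PAGE_KB;
}

/*
 * Simplified order-0 check for an ordinary (non-atomic) request:
 * pages counted in the highatomic reserve are treated as unavailable.
 * Hypothetical helper, modelled loosely on the kernel's logic.
 */
static bool order0_watermark_ok(long free_pages, long min_pages,
                                long lowmem_reserve_pages,
                                long reserved_highatomic_pages)
{
    long usable = free_pages - reserved_highatomic_pages;

    return usable > min_pages + lowmem_reserve_pages;
}

/*
 * Free kB on the DMA32 free lists that the report tags only with (H):
 * 7*32kB + 3*64kB + 1*256kB. Lists tagged (UMEH) or (MH) also hold some
 * H pages, so this is only a lower bound on the free highatomic memory.
 */
static long dma32_h_only_free_kb(void)
{
    return 7 * 32L + 3 * 64L + 1 * 256L;
}
```

With free 5012kB (1253 pages), min 2264kB (566 pages) and a lowmem_reserve of
0, the check passes with zero or one reserved pageblock and fails with two
(1253 - 1024 = 229 pages). At least 672kB of the DMA32 free memory sits on
lists tagged only (H).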

I tried to solve this during the 4.9 cycle by using up the reserved memory
before the OOM kill, and that change was merged into 4.10, but I think there
is still a hole. Could you apply this patch on top of your 4.10 kernel?
(To be clear, it cannot be applied to 4.9.)

From 9779a1c5d32e2edb64da5cdfcd6f9737b94a247a Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Mon, 27 Feb 2017 17:39:06 +0900
Subject: [PATCH] mm: use up highatomic before OOM kill

Not-Yet-Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/page_alloc.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 614cd0397ce3..e073cca4969e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3549,16 +3549,6 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		*no_progress_loops = 0;
 	else
 		(*no_progress_loops)++;
-
-	/*
-	 * Make sure we converge to OOM if we cannot make any progress
-	 * several times in the row.
-	 */
-	if (*no_progress_loops > MAX_RECLAIM_RETRIES) {
-		/* Before OOM, exhaust highatomic_reserve */
-		return unreserve_highatomic_pageblock(ac, true);
-	}
-
 	/*
 	 * Keep reclaiming pages while there is a chance this will lead
 	 * somewhere.  If none of the target zones can satisfy our allocation
@@ -3821,6 +3811,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	if (read_mems_allowed_retry(cpuset_mems_cookie))
 		goto retry_cpuset;
 
+	/* Before OOM, exhaust highatomic_reserve */
+	if (unreserve_highatomic_pageblock(ac, true))
+		goto retry;
+
 	/* Reclaim has failed us, start killing things */
 	page = __alloc_pages_may_oom(gfp_mask, order, ac, &did_some_progress);
 	if (page)
-- 
2.7.4
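
The control-flow change in the patch above can be sketched with a small
user-space model. This is one possible reading of the "hole", not something
the message states: in the old placement the reserve is released only after
the no-progress counter exceeds MAX_RECLAIM_RETRIES, so if the retry heuristic
gives up earlier for another reason, the OOM killer runs with the reserve
still intact. All names below except MAX_RECLAIM_RETRIES are hypothetical, and
the model never lets an allocation succeed; it only reports how many reserved
pageblocks remain when the OOM killer would be invoked.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_RECLAIM_RETRIES 16

struct model {
    int reserved_blocks; /* highatomic pageblocks still reserved */
    int give_up_after;   /* loop count at which the retry heuristic
                            decides reclaim is hopeless (hypothetical) */
};

/* Stand-in for unreserve_highatomic_pageblock(ac, true). */
static bool unreserve_one(struct model *m)
{
    assert(m->reserved_blocks >= 0);
    if (m->reserved_blocks == 0)
        return false;
    m->reserved_blocks--;
    return true;
}

/* Old placement: the reserve is touched only once the counter overflows. */
static bool old_should_retry(struct model *m, int *loops)
{
    (*loops)++;
    if (*loops > MAX_RECLAIM_RETRIES)
        return unreserve_one(m);
    return *loops < m->give_up_after;
}

/* New placement: the retry heuristic no longer touches the reserve. */
static bool new_should_retry(const struct model *m, int *loops)
{
    (*loops)++;
    return *loops < m->give_up_after;
}

/* Returns the reserved blocks left when the OOM killer is reached. */
static int old_slowpath(struct model m)
{
    int loops = 0;

    while (old_should_retry(&m, &loops))
        ;
    return m.reserved_blocks;
}

static int new_slowpath(struct model m)
{
    int loops = 0;

    for (;;) {
        while (new_should_retry(&m, &loops))
            ;
        /* Before OOM, exhaust the reserve and retry if anything was freed. */
        if (unreserve_one(&m))
            continue;
        return m.reserved_blocks;
    }
}
```

In this model, a heuristic that gives up after 3 loops leaves both reserved
blocks untouched on the old path, while the new path always drains the reserve
before reaching the OOM killer.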
