All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Michal Hocko <mhocko@kernel.org>
Cc: David Rientjes <rientjes@google.com>, Tejun Heo <tj@kernel.org>,
	Roman Gushchin <guro@fb.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-mm <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm,page_alloc: PF_WQ_WORKER threads must sleep at should_reclaim_retry().
Date: Thu, 06 Sep 2018 10:00:00 +0900	[thread overview]
Message-ID: <201809060100.w86100i6060716@www262.sakura.ne.jp> (raw)
In-Reply-To: <20180905140451.GG14951@dhcp22.suse.cz>

Michal Hocko wrote:
> On Wed 05-09-18 22:53:33, Tetsuo Handa wrote:
> > On 2018/09/05 22:40, Michal Hocko wrote:
> > > Changelog said 
> > > 
> > > "Although this is possible in principle let's wait for it to actually
> > > happen in real life before we make the locking more complex again."
> > > 
> > > So what is the real life workload that hits it? The log you have pasted
> > > below doesn't tell much.
> > 
> > Nothing special. I just ran a multi-threaded memory eater on a CONFIG_PREEMPT=y kernel.
> 
> I strongly suspec that your test doesn't really represent or simulate
> any real and useful workload. Sure it triggers a rare race and we kill
> another oom victim. Does this warrant to make the code more complex?
> Well, I am not convinced, as I've said countless times.

Yes. Below is an example from a machine running Apache Web server/Tomcat AP server/PostgreSQL DB server.
An memory eater needlessly killed Tomcat due to this race. I assert that we should fix af5679fbc669f31f.



Before:

systemd(1)-+-NetworkManager(693)-+-dhclient(791)
           |                     |-{NetworkManager}(698)
           |                     `-{NetworkManager}(702)
           |-abrtd(630)
           |-agetty(1007)
           |-atd(653)
           |-auditd(600)---{auditd}(601)
           |-avahi-daemon(625)---avahi-daemon(631)
           |-crond(657)
           |-dbus-daemon(638)
           |-firewalld(661)---{firewalld}(788)
           |-httpd(1169)-+-httpd(1170)
           |             |-httpd(1171)
           |             |-httpd(1172)
           |             |-httpd(1173)
           |             `-httpd(1174)
           |-irqbalance(628)
           |-java(1074)-+-{java}(1092)
           |            |-{java}(1093)
           |            |-{java}(1094)
           |            |-{java}(1095)
           |            |-{java}(1096)
           |            |-{java}(1097)
           |            |-{java}(1098)
           |            |-{java}(1099)
           |            |-{java}(1100)
           |            |-{java}(1101)
           |            |-{java}(1102)
           |            |-{java}(1103)
           |            |-{java}(1104)
           |            |-{java}(1105)
           |            |-{java}(1106)
           |            |-{java}(1107)
           |            |-{java}(1108)
           |            |-{java}(1109)
           |            |-{java}(1110)
           |            |-{java}(1111)
           |            |-{java}(1114)
           |            |-{java}(1115)
           |            |-{java}(1116)
           |            |-{java}(1117)
           |            |-{java}(1118)
           |            |-{java}(1119)
           |            |-{java}(1120)
           |            |-{java}(1121)
           |            |-{java}(1122)
           |            |-{java}(1123)
           |            |-{java}(1124)
           |            |-{java}(1125)
           |            |-{java}(1126)
           |            |-{java}(1127)
           |            |-{java}(1128)
           |            |-{java}(1129)
           |            |-{java}(1130)
           |            |-{java}(1131)
           |            |-{java}(1132)
           |            |-{java}(1133)
           |            |-{java}(1134)
           |            |-{java}(1135)
           |            |-{java}(1136)
           |            |-{java}(1137)
           |            `-{java}(1138)
           |-ksmtuned(659)---sleep(1727)
           |-login(1006)---bash(1052)---pstree(1728)
           |-polkitd(624)-+-{polkitd}(633)
           |              |-{polkitd}(642)
           |              |-{polkitd}(643)
           |              |-{polkitd}(645)
           |              `-{polkitd}(650)
           |-postgres(1154)-+-postgres(1155)
           |                |-postgres(1157)
           |                |-postgres(1158)
           |                |-postgres(1159)
           |                |-postgres(1160)
           |                `-postgres(1161)
           |-rsyslogd(986)-+-{rsyslogd}(997)
           |               `-{rsyslogd}(999)
           |-sendmail(1008)
           |-sendmail(1023)
           |-smbd(983)-+-cleanupd(1027)
           |           |-lpqd(1032)
           |           `-smbd-notifyd(1026)
           |-sshd(981)
           |-systemd-journal(529)
           |-systemd-logind(627)
           |-systemd-udevd(560)
           `-tuned(980)-+-{tuned}(1030)
                        |-{tuned}(1031)
                        |-{tuned}(1033)
                        `-{tuned}(1047)



After:

systemd(1)-+-NetworkManager(693)-+-dhclient(791)
           |                     |-{NetworkManager}(698)
           |                     `-{NetworkManager}(702)
           |-abrtd(630)
           |-agetty(1007)
           |-atd(653)
           |-auditd(600)---{auditd}(601)
           |-avahi-daemon(625)---avahi-daemon(631)
           |-crond(657)
           |-dbus-daemon(638)
           |-firewalld(661)---{firewalld}(788)
           |-httpd(1169)-+-httpd(1170)
           |             |-httpd(1171)
           |             |-httpd(1172)
           |             |-httpd(1173)
           |             `-httpd(1174)
           |-irqbalance(628)
           |-ksmtuned(659)---sleep(1758)
           |-login(1006)---bash(1052)---pstree(1759)
           |-polkitd(624)-+-{polkitd}(633)
           |              |-{polkitd}(642)
           |              |-{polkitd}(643)
           |              |-{polkitd}(645)
           |              `-{polkitd}(650)
           |-postgres(1154)-+-postgres(1155)
           |                |-postgres(1157)
           |                |-postgres(1158)
           |                |-postgres(1159)
           |                |-postgres(1160)
           |                `-postgres(1161)
           |-rsyslogd(986)-+-{rsyslogd}(997)
           |               `-{rsyslogd}(999)
           |-sendmail(1008)
           |-sendmail(1023)
           |-smbd(983)-+-cleanupd(1027)
           |           |-lpqd(1032)
           |           `-smbd-notifyd(1026)
           |-sshd(981)
           |-systemd-journal(529)
           |-systemd-logind(627)
           |-systemd-udevd(560)
           `-tuned(980)-+-{tuned}(1030)
                        |-{tuned}(1031)
                        |-{tuned}(1033)
                        `-{tuned}(1047)



[  222.165946] a.out invoked oom-killer: gfp_mask=0x6280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
[  222.170631] a.out cpuset=/ mems_allowed=0
[  222.172956] CPU: 4 PID: 1748 Comm: a.out Tainted: G                T 4.19.0-rc2+ #690
[  222.176517] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  222.180892] Call Trace:
[  222.182947]  dump_stack+0x85/0xcb
[  222.185240]  dump_header+0x69/0x2fe
[  222.187579]  ? _raw_spin_unlock_irqrestore+0x41/0x70
[  222.190319]  oom_kill_process+0x307/0x390
[  222.192803]  out_of_memory+0x2f3/0x5d0
[  222.195190]  __alloc_pages_slowpath+0xc01/0x1030
[  222.197844]  __alloc_pages_nodemask+0x333/0x390
[  222.200452]  alloc_pages_vma+0x77/0x1f0
[  222.202869]  __handle_mm_fault+0x81c/0xf40
[  222.205334]  handle_mm_fault+0x1b7/0x3c0
[  222.207712]  __do_page_fault+0x2a6/0x580
[  222.210036]  do_page_fault+0x32/0x270
[  222.212266]  ? page_fault+0x8/0x30
[  222.214402]  page_fault+0x1e/0x30
[  222.216463] RIP: 0033:0x4008d8
[  222.218429] Code: Bad RIP value.
[  222.220388] RSP: 002b:00007fff34061350 EFLAGS: 00010206
[  222.222931] RAX: 00007efea3c2e010 RBX: 0000000100000000 RCX: 0000000000000000
[  222.225976] RDX: 00000000b190f000 RSI: 0000000000020000 RDI: 0000000200000050
[  222.228891] RBP: 00007efea3c2e010 R08: 0000000200001000 R09: 0000000000021000
[  222.231779] R10: 0000000000000022 R11: 0000000000001000 R12: 0000000000000006
[  222.234626] R13: 00007fff34061440 R14: 0000000000000000 R15: 0000000000000000
[  222.238482] Mem-Info:
[  222.240511] active_anon:789816 inactive_anon:3457 isolated_anon:0
[  222.240511]  active_file:11 inactive_file:44 isolated_file:0
[  222.240511]  unevictable:0 dirty:6 writeback:0 unstable:0
[  222.240511]  slab_reclaimable:8052 slab_unreclaimable:24408
[  222.240511]  mapped:1898 shmem:3704 pagetables:4316 bounce:0
[  222.240511]  free:20841 free_pcp:0 free_cma:0
[  222.254349] Node 0 active_anon:3159264kB inactive_anon:13828kB active_file:44kB inactive_file:176kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:7592kB dirty:24kB writeback:0kB shmem:14816kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 2793472kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[  222.264038] Node 0 DMA free:13812kB min:308kB low:384kB high:460kB active_anon:1876kB inactive_anon:8kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15960kB managed:15876kB mlocked:0kB kernel_stack:0kB pagetables:4kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  222.274208] lowmem_reserve[]: 0 2674 3378 3378
[  222.276831] Node 0 DMA32 free:56068kB min:53260kB low:66572kB high:79884kB active_anon:2673292kB inactive_anon:216kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129152kB managed:2738564kB mlocked:0kB kernel_stack:96kB pagetables:3024kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  222.287961] lowmem_reserve[]: 0 0 703 703
[  222.291154] Node 0 Normal free:13672kB min:14012kB low:17512kB high:21012kB active_anon:483864kB inactive_anon:13604kB active_file:0kB inactive_file:4kB unevictable:0kB writepending:0kB present:1048576kB managed:720644kB mlocked:0kB kernel_stack:7520kB pagetables:14272kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  222.302407] lowmem_reserve[]: 0 0 0 0
[  222.304596] Node 0 DMA: 1*4kB (M) 0*8kB 1*16kB (U) 7*32kB (UM) 4*64kB (U) 4*128kB (UM) 2*256kB (U) 0*512kB 0*1024kB 0*2048kB 3*4096kB (ME) = 13812kB
[  222.311748] Node 0 DMA32: 37*4kB (U) 29*8kB (UM) 20*16kB (UM) 30*32kB (UME) 28*64kB (UME) 11*128kB (UME) 9*256kB (UME) 8*512kB (UM) 6*1024kB (UME) 1*2048kB (E) 9*4096kB (UM) = 56316kB
[  222.318932] Node 0 Normal: 151*4kB (UM) 2*8kB (UM) 97*16kB (UM) 195*32kB (UME) 53*64kB (UME) 11*128kB (UME) 2*256kB (UM) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 13724kB
[  222.325455] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  222.329059] 3707 total pagecache pages
[  222.331315] 0 pages in swap cache
[  222.333440] Swap cache stats: add 0, delete 0, find 0/0
[  222.336097] Free swap  = 0kB
[  222.338091] Total swap = 0kB
[  222.340084] 1048422 pages RAM
[  222.342165] 0 pages HighMem/MovableOnly
[  222.344460] 179651 pages reserved
[  222.347423] 0 pages cma reserved
[  222.349793] 0 pages hwpoisoned
[  222.351784] Out of memory: Kill process 1748 (a.out) score 838 or sacrifice child
[  222.355131] Killed process 1748 (a.out) total-vm:4267252kB, anon-rss:2909224kB, file-rss:0kB, shmem-rss:0kB
[  222.359644] java invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[  222.364180] java cpuset=/ mems_allowed=0
[  222.366619] CPU: 0 PID: 1110 Comm: java Tainted: G                T 4.19.0-rc2+ #690
[  222.370088] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  222.377089] Call Trace:
[  222.380503]  dump_stack+0x85/0xcb
[  222.380509]  dump_header+0x69/0x2fe
[  222.380515]  ? _raw_spin_unlock_irqrestore+0x41/0x70
[  222.380518]  oom_kill_process+0x307/0x390
[  222.380553]  out_of_memory+0x2f3/0x5d0
[  222.396414]  __alloc_pages_slowpath+0xc01/0x1030
[  222.396423]  __alloc_pages_nodemask+0x333/0x390
[  222.396431]  filemap_fault+0x465/0x910
[  222.404090]  ? xfs_ilock+0xbf/0x2b0 [xfs]
[  222.404118]  ? __xfs_filemap_fault+0x7d/0x2c0 [xfs]
[  222.404124]  ? down_read_nested+0x66/0xa0
[  222.404148]  __xfs_filemap_fault+0x8e/0x2c0 [xfs]
[  222.404156]  __do_fault+0x11/0x133
[  222.404158]  __handle_mm_fault+0xa57/0xf40
[  222.404165]  handle_mm_fault+0x1b7/0x3c0
[  222.404171]  __do_page_fault+0x2a6/0x580
[  222.404187]  do_page_fault+0x32/0x270
[  222.404194]  ? page_fault+0x8/0x30
[  222.404196]  page_fault+0x1e/0x30
[  222.404199] RIP: 0033:0x7fedb229ed42
[  222.404205] Code: Bad RIP value.
[  222.404207] RSP: 002b:00007fed92ae9c90 EFLAGS: 00010202
[  222.404209] RAX: ffffffffffffff92 RBX: 00007fedb187c470 RCX: 00007fedb229ed42
[  222.404210] RDX: 0000000000000001 RSI: 0000000000000089 RDI: 00007fedac13c354
[  222.404211] RBP: 00007fed92ae9d50 R08: 00007fedac13c328 R09: 00000000ffffffff
[  222.404212] R10: 00007fed92ae9cf0 R11: 0000000000000202 R12: 0000000000000001
[  222.404213] R13: 00007fed92ae9cf0 R14: ffffffffffffff92 R15: 00007fedac13c300
[  222.404783] Mem-Info:
[  222.404790] active_anon:429056 inactive_anon:3457 isolated_anon:0
[  222.404790]  active_file:0 inactive_file:833 isolated_file:0
[  222.404790]  unevictable:0 dirty:0 writeback:0 unstable:0
[  222.404790]  slab_reclaimable:8052 slab_unreclaimable:24344
[  222.404790]  mapped:2375 shmem:3704 pagetables:3030 bounce:0
[  222.404790]  free:381368 free_pcp:89 free_cma:0
[  222.404793] Node 0 active_anon:1716224kB inactive_anon:13828kB active_file:0kB inactive_file:3332kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:9500kB dirty:0kB writeback:0kB shmem:14816kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 155648kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
[  222.404794] Node 0 DMA free:13812kB min:308kB low:384kB high:460kB active_anon:1876kB inactive_anon:8kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15960kB managed:15876kB mlocked:0kB kernel_stack:0kB pagetables:4kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  222.404798] lowmem_reserve[]: 0 2674 3378 3378
[  222.404802] Node 0 DMA32 free:1362940kB min:53260kB low:66572kB high:79884kB active_anon:1366928kB inactive_anon:216kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129152kB managed:2738564kB mlocked:0kB kernel_stack:96kB pagetables:3028kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  222.404831] lowmem_reserve[]: 0 0 703 703
[  222.404837] Node 0 Normal free:149160kB min:14012kB low:17512kB high:21012kB active_anon:348116kB inactive_anon:13604kB active_file:0kB inactive_file:3080kB unevictable:0kB writepending:0kB present:1048576kB managed:720644kB mlocked:0kB kernel_stack:7504kB pagetables:9088kB bounce:0kB free_pcp:376kB local_pcp:12kB free_cma:0kB
[  222.404841] lowmem_reserve[]: 0 0 0 0
[  222.404859] Node 0 DMA: 1*4kB (M) 0*8kB 1*16kB (U) 7*32kB (UM) 4*64kB (U) 4*128kB (UM) 2*256kB (U) 0*512kB 0*1024kB 0*2048kB 3*4096kB (ME) = 13812kB
[  222.405114] Node 0 DMA32: 37*4kB (U) 29*8kB (UM) 20*16kB (UM) 30*32kB (UME) 28*64kB (UME) 11*128kB (UME) 9*256kB (UME) 8*512kB (UM) 6*1024kB (UME) 10*2048kB (ME) 326*4096kB (UM) = 1373180kB
[  222.405423] Node 0 Normal: 512*4kB (U) 1075*8kB (UM) 1667*16kB (UM) 1226*32kB (UME) 497*64kB (UME) 209*128kB (UME) 50*256kB (UM) 0*512kB 0*1024kB 0*2048kB 1*4096kB (M) = 152008kB
[  222.405797] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  222.405799] 4655 total pagecache pages
[  222.405801] 0 pages in swap cache
[  222.405802] Swap cache stats: add 0, delete 0, find 0/0
[  222.405803] Free swap  = 0kB
[  222.405803] Total swap = 0kB
[  222.405835] 1048422 pages RAM
[  222.405837] 0 pages HighMem/MovableOnly
[  222.405838] 179651 pages reserved
[  222.405839] 0 pages cma reserved
[  222.405840] 0 pages hwpoisoned
[  222.405843] Out of memory: Kill process 1074 (java) score 50 or sacrifice child
[  222.406136] Killed process 1074 (java) total-vm:5555688kB, anon-rss:174244kB, file-rss:0kB, shmem-rss:0kB
[  222.443446] oom_reaper: reaped process 1748 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

  reply	other threads:[~2018-09-06  1:00 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-26 11:06 [PATCH] mm,page_alloc: PF_WQ_WORKER threads must sleep at should_reclaim_retry() Tetsuo Handa
2018-07-26 11:06 ` Tetsuo Handa
2018-07-26 11:39 ` Michal Hocko
2018-07-27 15:47   ` Tetsuo Handa
2018-07-30  9:32     ` Michal Hocko
2018-07-30 14:34       ` Tetsuo Handa
2018-07-30 14:46         ` Michal Hocko
2018-07-30 14:54           ` Tejun Heo
2018-07-30 15:25             ` Tetsuo Handa
2018-07-30 15:44               ` Tejun Heo
2018-07-30 18:51                 ` Michal Hocko
2018-07-30 19:10                   ` Michal Hocko
2018-07-30 19:10                     ` Michal Hocko
2018-07-30 21:01                     ` Tetsuo Handa
2018-07-31  5:09                       ` Michal Hocko
2018-07-31 10:47                         ` Tetsuo Handa
2018-07-31 11:15                           ` Michal Hocko
2018-07-31 11:30                             ` Tetsuo Handa
2018-07-31 11:55                               ` Michal Hocko
2018-08-02 22:05                         ` Tetsuo Handa
2018-08-03  6:16                           ` Michal Hocko
2018-08-21 21:07                             ` Tetsuo Handa
2018-08-22  7:32                               ` Michal Hocko
2018-08-23 20:06                               ` David Rientjes
2018-08-23 21:00                                 ` Tetsuo Handa
2018-08-23 22:45                                   ` David Rientjes
2018-08-24  0:31                                     ` Tetsuo Handa
2018-09-05 13:20                                       ` Tetsuo Handa
2018-09-05 13:40                                         ` Michal Hocko
2018-09-05 13:53                                           ` Tetsuo Handa
2018-09-05 14:04                                             ` Michal Hocko
2018-09-06  1:00                                               ` Tetsuo Handa [this message]
2018-09-06  5:57                                                 ` Michal Hocko
2018-09-06  6:22                                                   ` Tetsuo Handa
2018-09-06  7:03                                                   ` Tetsuo Handa
2018-07-30 19:14                   ` Tejun Heo
2018-08-27 13:51 Michal Hocko
2018-08-27 13:51 ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201809060100.w86100i6060716@www262.sakura.ne.jp \
    --to=penguin-kernel@i-love.sakura.ne.jp \
    --cc=akpm@linux-foundation.org \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=rientjes@google.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.