From: Michal Hocko <mhocko@suse.cz>
To: azurIt <azurit@pobox.sk>
Cc: Johannes Weiner <hannes@cmpxchg.org>, Andrew Morton <akpm@linux-foundation.org>, David Rientjes <rientjes@google.com>, KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>, linux-mm@kvack.org, cgroups@vger.kernel.org, x86@kernel.org, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 0/7] improve memcg oom killer robustness v2
Date: Thu, 5 Sep 2013 14:03:47 +0200	[thread overview]
Message-ID: <20130905120347.GA13666@dhcp22.suse.cz> (raw)
In-Reply-To: <20130905134702.C703F65B@pobox.sk>

On Thu 05-09-13 13:47:02, azurIt wrote:
> >On Thu 05-09-13 12:17:00, azurIt wrote:
> >> >[...]
> >> >> My script detected another frozen cgroup today, sending stacks. Is
> >> >> there anything interesting?
> >> >
> >> >3 tasks are sleeping and waiting for somebody to take an action to
> >> >resolve the memcg OOM. Is the memcg oom killer enabled for that group?
> >> >If yes, which task has been selected to be killed? You can find that
> >> >in the oom report in dmesg.
> >> >
> >> >I can see a way how this might happen. If the killed task happened to
> >> >allocate memory while it was exiting then it would get into the oom
> >> >condition again without freeing any memory, so nobody waiting on the
> >> >memcg_oom_waitq gets woken. We have a report like that:
> >> >https://lkml.org/lkml/2013/7/31/94
> >> >
> >> >The issue went silent in the meantime so it is time to wake it up.
> >> >It would definitely be good to see what happened in your case though.
> >> >If any of the below tasks was the oom victim then it is very probable
> >> >this is the same issue.
> >>
> >> Here it is:
> >> http://watchdog.sk/lkml/kern5.log
> >
> >$ grep "Killed process \<103[168]\>" kern5.log
> >$
> >
> >So none of the sleeping tasks has been killed previously.
> >
> >> Processes were killed by my script
> >
> >OK, I am really confused now.
> >The log contains a lot of in-kernel memcg oom killer messages:
> >$ grep "Memory cgroup out of memory:" kern5.log | wc -l
> >809
> >
> >This suggests that the oom killer is not disabled. What exactly has your
> >script done?
> >
> >> at about 11:05:35.
> >
> >There is an oom killer striking at 11:05:35:
> >Sep  5 11:05:35 server02 kernel: [1751856.433101] Task in /1066/uid killed as a result of limit of /1066
> >[...]
> >Sep  5 11:05:35 server02 kernel: [1751856.539356] [ pid ]   uid  tgid total_vm    rss cpu oom_adj oom_score_adj name
> >Sep  5 11:05:35 server02 kernel: [1751856.539745] [ 1046]  1066  1046   228537  95491   3       0             0 apache2
> >Sep  5 11:05:35 server02 kernel: [1751856.539894] [ 1047]  1066  1047   228604  95488   6       0             0 apache2
> >Sep  5 11:05:35 server02 kernel: [1751856.540043] [ 1050]  1066  1050   228470  95452   5       0             0 apache2
> >Sep  5 11:05:35 server02 kernel: [1751856.540191] [ 1051]  1066  1051   228592  95521   6       0             0 apache2
> >Sep  5 11:05:35 server02 kernel: [1751856.540340] [ 1052]  1066  1052   228594  95546   5       0             0 apache2
> >Sep  5 11:05:35 server02 kernel: [1751856.540489] [ 1054]  1066  1054   228470  95453   5       0             0 apache2
> >Sep  5 11:05:35 server02 kernel: [1751856.540646] Memory cgroup out of memory: Kill process 1046 (apache2) score 1000 or sacrifice child
> >
> >And this doesn't list any of the tasks sleeping and waiting for oom
> >resolving, so they must have been created after this OOM. Is this the
> >same group?

> cgroup was 1066. My script is doing this:
> 1.) It checks the memory usage of all cgroups and searches for those
> whose memory usage is >= 99% of their limit.
> 2.) If any are found, they are saved in an array of 'candidates for
> killing'.
> 3.) It sleeps for 30 seconds.
> 4.) It does (1) again and, if any of the found cgroups were also found
> in (2), it kills all processes inside them.
> 5.) It clears the array of saved cgroups and continues.

This is racy and doesn't really tell you anything about any group being
frozen.

[...]
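For reference, the quoted five-step watchdog could look roughly like the sketch below. This is a hypothetical reconstruction, not the actual script (which was never posted): the cgroup v1 mount point, control-file names, and the exact 99% comparison are all assumptions.

```python
import os
import signal
import time

MEMCG_ROOT = "/sys/fs/cgroup/memory"  # assumed cgroup v1 memcg mount point
THRESHOLD = 0.99                      # "usage >= 99% of the limit"

def read_int(path):
    with open(path) as f:
        return int(f.read().strip())

def is_candidate(usage, limit):
    """Step 1: a group qualifies when usage is >= 99% of its limit."""
    return limit > 0 and usage >= THRESHOLD * limit

def scan():
    """Return the set of memcg directories currently over the threshold."""
    found = set()
    for entry in os.listdir(MEMCG_ROOT):
        path = os.path.join(MEMCG_ROOT, entry)
        if not os.path.isdir(path):
            continue
        try:
            usage = read_int(os.path.join(path, "memory.usage_in_bytes"))
            limit = read_int(os.path.join(path, "memory.limit_in_bytes"))
        except OSError:
            continue  # the group went away between listdir() and read
        if is_candidate(usage, limit):
            found.add(path)
    return found

def kill_group(path):
    """Step 4: kill every task listed in the group's tasks file."""
    with open(os.path.join(path, "tasks")) as f:
        for pid in f.read().split():
            try:
                os.kill(int(pid), signal.SIGKILL)
            except ProcessLookupError:
                pass  # the task already exited

def main():
    candidates = set()             # step 2: groups remembered from last pass
    while True:
        over = scan()
        for path in over & candidates:
            kill_group(path)       # still over the limit 30 seconds later
        candidates = over          # step 5: reset for the next round
        time.sleep(30)             # step 3

if __name__ == "__main__":
    main()
```

As the reply points out, two over-threshold samples 30 seconds apart cannot distinguish a group that is genuinely stuck in memcg OOM from one that repeatedly hits its limit and recovers in between, which is exactly the race being discussed.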
> But, of course, I cannot guarantee that the killed cgroup was really
> frozen (because of a bug in the linux kernel), there could be some false
> positives - for example, a cgroup has 99% usage of its memory, my script
> detected it, OOM successfully resolved the problem and, after 30
> seconds, the same cgroup has again 99% usage of its memory and my
> script detected it again.

Exactly.

> This is why I'm sending stacks here, I simply cannot tell if
> there was or wasn't a problem.

On the other hand, if those processes were stuck waiting for somebody to
resolve the OOM for a long time without any change then yes, we have a
problem.

Just to be sure I got you right: you have killed all the processes from
the group you have sent stacks for, right? If that is the case I am
really curious about processes sitting in sleep_on_page_killable because
those are killable by definition.

> I can disable the script and wait until the problem really occurs but
> when it happens, our services will go down.

I definitely do not want to encourage you to let your services go down...

> Hope I was clear enough - if not, I can post the source code of that
> script.

-- 
Michal Hocko
SUSE Labs