All of lore.kernel.org
 help / color / mirror / Atom feed
From: "azurIt" <azurit@pobox.sk>
To: "Johannes Weiner" <hannes@cmpxchg.org>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
	"Michal Hocko" <mhocko@suse.cz>,
	"David Rientjes" <rientjes@google.com>,
	"KAMEZAWA Hiroyuki" <kamezawa.hiroyu@jp.fujitsu.com>,
	"KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org, x86@kernel.org,
	linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 0/7] improve memcg oom killer robustness v2
Date: Mon, 16 Sep 2013 12:22:59 +0200	[thread overview]
Message-ID: <20130916122259.F60857D4@pobox.sk> (raw)
In-Reply-To: <20130911200426.GO856@cmpxchg.org>

> CC: "Andrew Morton" <akpm@linux-foundation.org>, "Michal Hocko" <mhocko@suse.cz>, "David Rientjes" <rientjes@google.com>, "KAMEZAWA Hiroyuki" <kamezawa.hiroyu@jp.fujitsu.com>, "KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com>, linux-mm@kvack.org, cgroups@vger.kernel.org, x86@kernel.org, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
>On Wed, Sep 11, 2013 at 09:41:18PM +0200, azurIt wrote:
>> >On Wed, Sep 11, 2013 at 08:54:48PM +0200, azurIt wrote:
>> >> >On Wed, Sep 11, 2013 at 02:33:05PM +0200, azurIt wrote:
>> >> >> >On Tue, Sep 10, 2013 at 11:32:47PM +0200, azurIt wrote:
>> >> >> >> >On Tue, Sep 10, 2013 at 11:08:53PM +0200, azurIt wrote:
>> >> >> >> >> >On Tue, Sep 10, 2013 at 09:32:53PM +0200, azurIt wrote:
>> >> >> >> >> >> Here is full kernel log between 6:00 and 7:59:
>> >> >> >> >> >> http://watchdog.sk/lkml/kern6.log
>> >> >> >> >> >
>> >> >> >> >> >Wow, your apaches are like the hydra.  Whenever one is OOM killed,
>> >> >> >> >> >more show up!
>> >> >> >> >> 
>> >> >> >> >> 
>> >> >> >> >> 
>> >> >> >> >> Yeah, it's supposed to do this ;)
>> >> >> >
>> >> >> >How are you expecting the machine to recover from an OOM situation,
>> >> >> >though?  I guess I don't really understand what these machines are
>> >> >> >doing.  But if you are overloading them like crazy, isn't that the
>> >> >> >expected outcome?
>> >> >> 
>> >> >> 
>> >> >> 
>> >> >> 
>> >> >> 
>> >> >> There's no global OOM, server has enough of memory. OOM is occuring only in cgroups (customers who simply don't want to pay for more memory).
>> >> >
>> >> >Yes, sure, but when the cgroups are thrashing, they use the disk and
>> >> >CPU to the point where the overall system is affected.
>> >> 
>> >> 
>> >> 
>> >> 
>> >> Didn't know that there is a disk usage because of this, i never noticed anything yet.
>> >
>> >You said there was heavy IO going on...?
>> 
>> 
>> 
>> Yes, there usually was a big IO but it was related to that
>> deadlocking bug in kernel (or i assume it was). I never saw a big IO
>> in normal conditions even when there were lots of OOM in
>> cgroups. I'm even not using swap because of this so i was assuming
>> that lacks of memory is not doing any additional IO (or am i
>> wrong?). And if you mean that last problem with IO from Monday, i
>> don't exactly know what happens but it's really long time when we
>> had so big problem with IO that it disables also root login on
>> console.
>
>The deadlocking problem should be separate from this.
>
>Even without swap, the binaries and libraries of the running tasks can
>get reclaimed (and immediately faulted back from disk, i.e thrashing).
>
>Usually the OOM killer should kick in before tasks cannibalize each
>other like that.
>
>The patch you were using did in fact have the side effect of widening
>the window between tasks entering heavy reclaim and the OOM killer
>kicking in, so it could explain the IO worsening while fixing the dead
>lock problem.
>
>That followup patch tries to narrow this window by quite a bit and
>tries to stop concurrent reclaim when the group is already OOM.
>

Johannes,

it's, unfortunately, happening several times per day and we cannot work like this :( i will boot previous kernel this night. If you have any patches which can help me or you, please send them so i can install them with this reboot. Thank you.

azur

WARNING: multiple messages have this Message-ID (diff)
From: "azurIt" <azurit@pobox.sk>
To: "Johannes Weiner" <hannes@cmpxchg.org>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
	"Michal Hocko" <mhocko@suse.cz>,
	"David Rientjes" <rientjes@google.com>,
	"KAMEZAWA Hiroyuki" <kamezawa.hiroyu@jp.fujitsu.com>,
	"KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org, x86@kernel.org,
	linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 0/7] improve memcg oom killer robustness v2
Date: Mon, 16 Sep 2013 12:22:59 +0200	[thread overview]
Message-ID: <20130916122259.F60857D4@pobox.sk> (raw)
In-Reply-To: <20130911200426.GO856@cmpxchg.org>

> CC: "Andrew Morton" <akpm@linux-foundation.org>, "Michal Hocko" <mhocko@suse.cz>, "David Rientjes" <rientjes@google.com>, "KAMEZAWA Hiroyuki" <kamezawa.hiroyu@jp.fujitsu.com>, "KOSAKI Motohiro" <kosaki.motohiro@jp.fujitsu.com>, linux-mm@kvack.org, cgroups@vger.kernel.org, x86@kernel.org, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
>On Wed, Sep 11, 2013 at 09:41:18PM +0200, azurIt wrote:
>> >On Wed, Sep 11, 2013 at 08:54:48PM +0200, azurIt wrote:
>> >> >On Wed, Sep 11, 2013 at 02:33:05PM +0200, azurIt wrote:
>> >> >> >On Tue, Sep 10, 2013 at 11:32:47PM +0200, azurIt wrote:
>> >> >> >> >On Tue, Sep 10, 2013 at 11:08:53PM +0200, azurIt wrote:
>> >> >> >> >> >On Tue, Sep 10, 2013 at 09:32:53PM +0200, azurIt wrote:
>> >> >> >> >> >> Here is full kernel log between 6:00 and 7:59:
>> >> >> >> >> >> http://watchdog.sk/lkml/kern6.log
>> >> >> >> >> >
>> >> >> >> >> >Wow, your apaches are like the hydra.  Whenever one is OOM killed,
>> >> >> >> >> >more show up!
>> >> >> >> >> 
>> >> >> >> >> 
>> >> >> >> >> 
>> >> >> >> >> Yeah, it's supposed to do this ;)
>> >> >> >
>> >> >> >How are you expecting the machine to recover from an OOM situation,
>> >> >> >though?  I guess I don't really understand what these machines are
>> >> >> >doing.  But if you are overloading them like crazy, isn't that the
>> >> >> >expected outcome?
>> >> >> 
>> >> >> 
>> >> >> 
>> >> >> 
>> >> >> 
>> >> >> There's no global OOM, server has enough of memory. OOM is occuring only in cgroups (customers who simply don't want to pay for more memory).
>> >> >
>> >> >Yes, sure, but when the cgroups are thrashing, they use the disk and
>> >> >CPU to the point where the overall system is affected.
>> >> 
>> >> 
>> >> 
>> >> 
>> >> Didn't know that there is a disk usage because of this, i never noticed anything yet.
>> >
>> >You said there was heavy IO going on...?
>> 
>> 
>> 
>> Yes, there usually was a big IO but it was related to that
>> deadlocking bug in kernel (or i assume it was). I never saw a big IO
>> in normal conditions even when there were lots of OOM in
>> cgroups. I'm even not using swap because of this so i was assuming
>> that lacks of memory is not doing any additional IO (or am i
>> wrong?). And if you mean that last problem with IO from Monday, i
>> don't exactly know what happens but it's really long time when we
>> had so big problem with IO that it disables also root login on
>> console.
>
>The deadlocking problem should be separate from this.
>
>Even without swap, the binaries and libraries of the running tasks can
>get reclaimed (and immediately faulted back from disk, i.e thrashing).
>
>Usually the OOM killer should kick in before tasks cannibalize each
>other like that.
>
>The patch you were using did in fact have the side effect of widening
>the window between tasks entering heavy reclaim and the OOM killer
>kicking in, so it could explain the IO worsening while fixing the dead
>lock problem.
>
>That followup patch tries to narrow this window by quite a bit and
>tries to stop concurrent reclaim when the group is already OOM.
>

Johannes,

it's, unfortunately, happening several times per day and we cannot work like this :( i will boot previous kernel this night. If you have any patches which can help me or you, please send them so i can install them with this reboot. Thank you.

azur

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2013-09-16 10:23 UTC|newest]

Thread overview: 227+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-03 16:59 [patch 0/7] improve memcg oom killer robustness v2 Johannes Weiner
2013-08-03 16:59 ` Johannes Weiner
2013-08-03 16:59 ` [patch 1/7] arch: mm: remove obsolete init OOM protection Johannes Weiner
2013-08-03 16:59   ` Johannes Weiner
2013-08-06  6:34   ` Vineet Gupta
2013-08-06  6:34     ` Vineet Gupta
2013-08-06  6:34     ` Vineet Gupta
2013-08-03 16:59 ` [patch 2/7] arch: mm: do not invoke OOM killer on kernel fault OOM Johannes Weiner
2013-08-03 16:59   ` Johannes Weiner
2013-08-03 16:59 ` [patch 3/7] arch: mm: pass userspace fault flag to generic fault handler Johannes Weiner
2013-08-03 16:59   ` Johannes Weiner
2013-08-05 22:06   ` Andrew Morton
2013-08-05 22:06     ` Andrew Morton
2013-08-05 22:25     ` Johannes Weiner
2013-08-05 22:25       ` Johannes Weiner
2013-08-03 16:59 ` [patch 4/7] x86: finish user fault error path with fatal signal Johannes Weiner
2013-08-03 16:59   ` Johannes Weiner
2013-08-03 16:59 ` [patch 5/7] mm: memcg: enable memcg OOM killer only for user faults Johannes Weiner
2013-08-03 16:59   ` Johannes Weiner
2013-08-05  9:18   ` Michal Hocko
2013-08-05  9:18     ` Michal Hocko
2013-08-03 16:59 ` [patch 6/7] mm: memcg: rework and document OOM waiting and wakeup Johannes Weiner
2013-08-03 16:59   ` Johannes Weiner
2013-08-03 17:00 ` [patch 7/7] mm: memcg: do not trap chargers with full callstack on OOM Johannes Weiner
2013-08-03 17:00   ` Johannes Weiner
2013-08-05  9:54   ` Michal Hocko
2013-08-05  9:54     ` Michal Hocko
2013-08-05  9:54     ` Michal Hocko
2013-08-05 20:56     ` Johannes Weiner
2013-08-05 20:56       ` Johannes Weiner
2013-08-03 17:08 ` [patch 0/7] improve memcg oom killer robustness v2 Johannes Weiner
2013-08-03 17:08   ` Johannes Weiner
2013-08-09  9:06   ` azurIt
2013-08-09  9:06     ` azurIt
2013-08-09  9:06     ` azurIt
2013-08-30 19:58   ` azurIt
2013-08-30 19:58     ` azurIt
2013-09-02 10:38     ` azurIt
2013-09-02 10:38       ` azurIt
2013-09-03 20:48       ` Johannes Weiner
2013-09-03 20:48         ` Johannes Weiner
2013-09-04  7:53         ` azurIt
2013-09-04  7:53           ` azurIt
2013-09-04  7:53           ` azurIt
2013-09-04  7:53           ` azurIt
2013-09-04  8:18         ` azurIt
2013-09-04  8:18           ` azurIt
2013-09-05 11:54           ` Johannes Weiner
2013-09-05 11:54             ` Johannes Weiner
2013-09-05 12:43             ` Michal Hocko
2013-09-05 12:43               ` Michal Hocko
2013-09-05 16:18               ` Johannes Weiner
2013-09-05 16:18                 ` Johannes Weiner
2013-09-09 12:36                 ` Michal Hocko
2013-09-09 12:36                   ` Michal Hocko
2013-09-09 12:56                   ` Michal Hocko
2013-09-09 12:56                     ` Michal Hocko
2013-09-12 12:59                     ` Johannes Weiner
2013-09-12 12:59                       ` Johannes Weiner
2013-09-16 14:03                       ` Michal Hocko
2013-09-16 14:03                         ` Michal Hocko
2013-09-16 14:03                         ` Michal Hocko
2013-09-05 13:24             ` Michal Hocko
2013-09-05 13:24               ` Michal Hocko
2013-09-09 13:10             ` azurIt
2013-09-09 13:10               ` azurIt
2013-09-09 17:28               ` Johannes Weiner
2013-09-09 17:28                 ` Johannes Weiner
2013-09-09 19:59                 ` azurIt
2013-09-09 19:59                   ` azurIt
2013-09-09 20:12                   ` Johannes Weiner
2013-09-09 20:12                     ` Johannes Weiner
2013-09-09 20:18                     ` azurIt
2013-09-09 20:18                       ` azurIt
2013-09-09 21:08                     ` azurIt
2013-09-09 21:08                       ` azurIt
2013-09-10 18:13                     ` azurIt
2013-09-10 18:13                       ` azurIt
2013-09-10 18:37                       ` Johannes Weiner
2013-09-10 18:37                         ` Johannes Weiner
2013-09-10 19:32                         ` azurIt
2013-09-10 19:32                           ` azurIt
2013-09-10 20:12                           ` Johannes Weiner
2013-09-10 20:12                             ` Johannes Weiner
2013-09-10 21:08                             ` azurIt
2013-09-10 21:08                               ` azurIt
2013-09-10 21:08                               ` azurIt
2013-09-10 21:18                               ` Johannes Weiner
2013-09-10 21:18                                 ` Johannes Weiner
2013-09-10 21:32                                 ` azurIt
2013-09-10 21:32                                   ` azurIt
2013-09-10 22:03                                   ` Johannes Weiner
2013-09-10 22:03                                     ` Johannes Weiner
2013-09-11 12:33                                     ` azurIt
2013-09-11 12:33                                       ` azurIt
2013-09-11 18:03                                       ` Johannes Weiner
2013-09-11 18:03                                         ` Johannes Weiner
2013-09-11 18:03                                         ` Johannes Weiner
2013-09-11 18:54                                         ` azurIt
2013-09-11 18:54                                           ` azurIt
2013-09-11 19:11                                           ` Johannes Weiner
2013-09-11 19:11                                             ` Johannes Weiner
2013-09-11 19:41                                             ` azurIt
2013-09-11 19:41                                               ` azurIt
2013-09-11 20:04                                               ` Johannes Weiner
2013-09-11 20:04                                                 ` Johannes Weiner
2013-09-14 10:48                                                 ` azurIt
2013-09-14 10:48                                                   ` azurIt
2013-09-16 13:40                                                   ` Michal Hocko
2013-09-16 13:40                                                     ` Michal Hocko
2013-09-16 14:01                                                     ` azurIt
2013-09-16 14:01                                                       ` azurIt
2013-09-16 14:06                                                       ` Michal Hocko
2013-09-16 14:06                                                         ` Michal Hocko
2013-09-16 14:13                                                         ` azurIt
2013-09-16 14:13                                                           ` azurIt
2013-09-16 14:13                                                           ` azurIt
2013-09-16 14:57                                                           ` Michal Hocko
2013-09-16 14:57                                                             ` Michal Hocko
2013-09-16 15:05                                                             ` azurIt
2013-09-16 15:05                                                               ` azurIt
2013-09-16 15:17                                                               ` Johannes Weiner
2013-09-16 15:17                                                                 ` Johannes Weiner
2013-09-16 15:17                                                                 ` Johannes Weiner
2013-09-16 15:24                                                                 ` azurIt
2013-09-16 15:24                                                                   ` azurIt
2013-09-16 15:25                                                               ` Michal Hocko
2013-09-16 15:25                                                                 ` Michal Hocko
2013-09-16 15:40                                                                 ` azurIt
2013-09-16 15:40                                                                   ` azurIt
2013-09-16 20:52                                                                 ` azurIt
2013-09-16 20:52                                                                   ` azurIt
2013-09-17  0:02                                                                   ` Johannes Weiner
2013-09-17  0:02                                                                     ` Johannes Weiner
2013-09-17 11:15                                                                     ` azurIt
2013-09-17 11:15                                                                       ` azurIt
2013-09-17 11:15                                                                       ` azurIt
2013-09-17 14:10                                                                       ` Michal Hocko
2013-09-17 14:10                                                                         ` Michal Hocko
2013-09-18 14:03                                                                         ` azurIt
2013-09-18 14:03                                                                           ` azurIt
2013-09-18 14:03                                                                           ` azurIt
2013-09-18 14:24                                                                           ` Michal Hocko
2013-09-18 14:24                                                                             ` Michal Hocko
2013-09-18 14:33                                                                             ` azurIt
2013-09-18 14:33                                                                               ` azurIt
2013-09-18 14:42                                                                               ` Michal Hocko
2013-09-18 14:42                                                                                 ` Michal Hocko
2013-09-18 14:42                                                                                 ` Michal Hocko
2013-09-18 18:02                                                                                 ` azurIt
2013-09-18 18:02                                                                                   ` azurIt
2013-09-18 18:36                                                                                   ` Michal Hocko
2013-09-18 18:36                                                                                     ` Michal Hocko
2013-09-18 18:36                                                                                     ` Michal Hocko
2013-09-18 18:04                                                                           ` Johannes Weiner
2013-09-18 18:04                                                                             ` Johannes Weiner
2013-09-18 18:19                                                                             ` Johannes Weiner
2013-09-18 18:19                                                                               ` Johannes Weiner
2013-09-18 19:55                                                                               ` Johannes Weiner
2013-09-18 19:55                                                                                 ` Johannes Weiner
2013-09-18 19:55                                                                                 ` Johannes Weiner
2013-09-18 20:52                                                                                 ` azurIt
2013-09-18 20:52                                                                                   ` azurIt
2013-09-18 20:52                                                                                   ` azurIt
2013-09-25  7:26                                                                                 ` azurIt
2013-09-25  7:26                                                                                   ` azurIt
2013-09-25  7:26                                                                                   ` azurIt
2013-09-26 16:54                                                                                 ` azurIt
2013-09-26 16:54                                                                                   ` azurIt
2013-09-26 16:54                                                                                   ` azurIt
2013-09-26 19:27                                                                                   ` Johannes Weiner
2013-09-26 19:27                                                                                     ` Johannes Weiner
2013-09-27  2:04                                                                                     ` azurIt
2013-09-27  2:04                                                                                       ` azurIt
2013-09-27  2:04                                                                                       ` azurIt
2013-09-27  2:04                                                                                       ` azurIt
2013-10-07 11:01                                                                                     ` azurIt
2013-10-07 11:01                                                                                       ` azurIt
2013-10-07 11:01                                                                                       ` azurIt
2013-10-07 11:01                                                                                       ` azurIt
2013-10-07 19:23                                                                                       ` Johannes Weiner
2013-10-07 19:23                                                                                         ` Johannes Weiner
2013-10-09 18:44                                                                                         ` azurIt
2013-10-09 18:44                                                                                           ` azurIt
2013-10-09 18:44                                                                                           ` azurIt
2013-10-10  0:14                                                                                           ` Johannes Weiner
2013-10-10  0:14                                                                                             ` Johannes Weiner
2013-10-10  0:14                                                                                             ` Johannes Weiner
2013-10-10 22:59                                                                                             ` azurIt
2013-10-10 22:59                                                                                               ` azurIt
2013-10-10 22:59                                                                                               ` azurIt
2013-09-17 11:20                                                                     ` azurIt
2013-09-17 11:20                                                                       ` azurIt
2013-09-16 10:22                                                 ` azurIt [this message]
2013-09-16 10:22                                                   ` azurIt
2013-09-04  9:45         ` azurIt
2013-09-04  9:45           ` azurIt
2013-09-04 11:57           ` Michal Hocko
2013-09-04 11:57             ` Michal Hocko
2013-09-04 12:10             ` azurIt
2013-09-04 12:10               ` azurIt
2013-09-04 12:10               ` azurIt
2013-09-04 12:26               ` Michal Hocko
2013-09-04 12:26                 ` Michal Hocko
2013-09-04 12:26                 ` Michal Hocko
2013-09-04 12:39                 ` azurIt
2013-09-04 12:39                   ` azurIt
2013-09-05  9:14                 ` azurIt
2013-09-05  9:14                   ` azurIt
2013-09-05  9:53                   ` Michal Hocko
2013-09-05  9:53                     ` Michal Hocko
2013-09-05 10:17                     ` azurIt
2013-09-05 10:17                       ` azurIt
2013-09-05 11:17                       ` Michal Hocko
2013-09-05 11:17                         ` Michal Hocko
2013-09-05 11:17                         ` Michal Hocko
2013-09-05 11:47                         ` azurIt
2013-09-05 11:47                           ` azurIt
2013-09-05 12:03                           ` Michal Hocko
2013-09-05 12:03                             ` Michal Hocko
2013-09-05 12:33                             ` azurIt
2013-09-05 12:33                               ` azurIt
2013-09-05 12:33                               ` azurIt
2013-09-05 12:45                               ` Michal Hocko
2013-09-05 12:45                                 ` Michal Hocko
2013-09-05 13:00                                 ` azurIt
2013-09-05 13:00                                   ` azurIt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130916122259.F60857D4@pobox.sk \
    --to=azurit@pobox.sk \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=hannes@cmpxchg.org \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=rientjes@google.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.