linux-kernel.vger.kernel.org archive mirror
From: Michal Hocko <mhocko@suse.cz>
To: azurIt <azurit@pobox.sk>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	cgroups mailinglist <cgroups@vger.kernel.org>
Subject: Re: memory-cgroup bug
Date: Fri, 23 Nov 2012 11:04:38 +0100
Message-ID: <20121123100438.GF24698@dhcp22.suse.cz>
In-Reply-To: <20121123102137.10D6D653@pobox.sk>

On Fri 23-11-12 10:21:37, azurIt wrote:
[...]
> Luckily, it happened again, so I have more info.
> 
>  - there weren't any OOM messages in the kernel log for that cgroup
>  - there were 16 processes in the cgroup
>  - processes in the cgroup were together taking 100% of CPU (it
>    was allowed to use only one core, so 100% of that core)
>  - memory.failcnt was growing fast
>  - oom_control:
> oom_kill_disable 0
> under_oom 0 (this was looping from 0 to 1)

So there was an OOM going on but no messages in the log? Really strange.
Kame already asked about oom_score_adj of the processes in the group but
it didn't look like all the processes would have oom disabled, right?
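
To check that for the whole group in one go, something along these lines
should do. It is only a sketch: the function name and the optional
proc-root argument are made up for illustration, and it assumes a cgroup
v1 hierarchy where the group directory contains a tasks file.

```shell
#!/bin/sh
# Print "<pid> <oom_score_adj>" for every task listed in a cgroup's tasks file.
# Hypothetical helper: $1 is the cgroup directory, $2 the proc root
# (defaults to /proc; only overridden for testing).
dump_oom_score_adj() {
    cgdir=$1
    proc=${2:-/proc}
    while read -r pid; do
        # A task may exit between reading "tasks" and reading its proc entry.
        [ -r "$proc/$pid/oom_score_adj" ] || continue
        printf '%s %s\n' "$pid" "$(cat "$proc/$pid/oom_score_adj")"
    done < "$cgdir/tasks"
}
```

A value of -1000 means the task is exempt from the OOM killer; if every
task in the group shows that, there is nothing left to kill.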

>  - limit_in_bytes was set to 157286400
>  - content of stat (as you can see, the whole memory limit was used):
> cache 0
> rss 0

This looks like a top-level group for your user.

> mapped_file 0
> pgpgin 0
> pgpgout 0
> swap 0
> pgfault 0
> pgmajfault 0
> inactive_anon 0
> active_anon 0
> inactive_file 0
> active_file 0
> unevictable 0
> hierarchical_memory_limit 157286400
> hierarchical_memsw_limit 157286400
> total_cache 0
> total_rss 157286400

OK, so all the memory is anonymous and you have no swap so the oom is
the only thing to do.
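
As a sanity check on the numbers: 157286400 bytes is exactly 150 MiB, and
total_rss equals hierarchical_memory_limit while total_cache and
total_swap are 0. A small sketch to pull this out of memory.stat (the
function name is hypothetical; it assumes the v1 field names shown in the
quoted output):

```shell
#!/bin/sh
# Summarise the hierarchical counters of a memcg v1 memory.stat file in MiB.
# Hypothetical helper: $1 is the path to the memory.stat file.
memcg_usage() {
    awk '
        $1 == "total_rss"                 { rss = $2 }
        $1 == "total_cache"               { cache = $2 }
        $1 == "total_swap"                { swap = $2 }
        $1 == "hierarchical_memory_limit" { limit = $2 }
        END {
            printf "limit=%d MiB anon=%d MiB cache=%d MiB swap=%d MiB\n",
                   limit / 1048576, rss / 1048576,
                   cache / 1048576, swap / 1048576
        }' "$1"
}
```

When anon equals the limit and both cache and swap are zero, reclaim has
nothing to drop or swap out.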

> total_mapped_file 0
> total_pgpgin 10326454
> total_pgpgout 10288054
> total_swap 0
> total_pgfault 12939677
> total_pgmajfault 4283
> total_inactive_anon 0
> total_active_anon 157286400
> total_inactive_file 0
> total_active_file 0
> total_unevictable 0
> 
> 
> I also grabbed oom_adj, oom_score_adj and stack of all processes, here
> it is:
> http://www.watchdog.sk/lkml/memcg-bug.tar

Hmm, all processes waiting for oom are stuck at the very same place:
$ grep mem_cgroup_handle_oom -r [0-9]*
30858/stack:[<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
30859/stack:[<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
30860/stack:[<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
30892/stack:[<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
30898/stack:[<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
31588/stack:[<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
32044/stack:[<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
32358/stack:[<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
6031/stack:[<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
6534/stack:[<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
7020/stack:[<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0

We are taking memcg_oom_lock spinlock twice in that function + we can
schedule. As none of the tasks is scheduled this would suggest that you
are blocked at the first lock. But who got the lock then?
This is really strange.
Btw. is sysrq+t resp. sysrq+w showing the same traces as
/proc/<pid>/stack?
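
For comparing traces, a rough way to summarise the first line (the
innermost frame) of every captured stack file is sketched below.
top_frames is a made-up helper, and it assumes a directory of <pid>/stack
files laid out like the grep output above.

```shell
#!/bin/sh
# Count how many tasks share the same innermost stack frame.
# Hypothetical helper: $1 is a directory holding <pid>/stack files.
top_frames() {
    for f in "$1"/[0-9]*/stack; do
        [ -r "$f" ] || continue
        # First line is the innermost frame; strip the "[<address>] " prefix.
        sed -n '1s/^\[<[0-9a-f]*>] //p' "$f"
    done | sort | uniq -c | sort -rn | awk '{ print $1, $2 }'
}
```

Each output line is a count followed by symbol+offset/size, most common
frame first.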
 
> Notice that the stack is different for a few processes.

Yes, the others are in VFS resp. ext3. ext3_write_begin looks a bit dangerous
but it grabs the page before it really starts a transaction.

> Stacks for all processes were NOT changing and stayed the same.

Could you take a few snapshots over time?
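
A minimal sketch of how such snapshots could be collected. The function
name, the output layout and the optional proc-root argument are
assumptions for illustration; reading /proc/<pid>/stack normally needs
root and a kernel built with stack trace support.

```shell
#!/bin/sh
# Copy /proc/<pid>/stack of every task in a cgroup into a timestamped
# directory and print that directory's path.
# Hypothetical helper: $1 cgroup directory, $2 output directory,
# $3 proc root (defaults to /proc; only overridden for testing).
snapshot_stacks() {
    cgdir=$1
    outdir=$2
    proc=${3:-/proc}
    snap="$outdir/$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$snap" || return 1
    while read -r pid; do
        # Skip tasks that exited or whose stack is not readable.
        [ -r "$proc/$pid/stack" ] || continue
        cat "$proc/$pid/stack" > "$snap/$pid.stack"
    done < "$cgdir/tasks"
    echo "$snap"
}
```

Calling it a handful of times a few seconds apart and running diff -r on
two of the resulting directories would show whether any task made
progress between snapshots.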

> Btw, I don't know if it matters but I had several cgroup subsystems
> mounted and I'm also using them (I was not activating freezer in this
> case, don't know if it can be activated automatically by the kernel or what,

No

> I didn't check if the cgroup was frozen but I suppose it wasn't):
> none            /cgroups        cgroup  defaults,cpuacct,cpuset,memory,freezer,task,blkio 0 0

Do you see the same issue if only the memory controller is mounted (resp.
cpuset, which you seem to use as well from your description)?

I know you said booting into a vanilla kernel would be problematic but
could you at least rule out the cgroup patches that you have mentioned?
If you need to move a task to a group based on a uid you can use
cgrules daemon (libcgroup1 package) for that as well.
-- 
Michal Hocko
SUSE Labs
