From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752228Ab3IRSg0 (ORCPT );
	Wed, 18 Sep 2013 14:36:26 -0400
Received: from cantor2.suse.de ([195.135.220.15]:56382 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751529Ab3IRSgY (ORCPT );
	Wed, 18 Sep 2013 14:36:24 -0400
Date: Wed, 18 Sep 2013 20:36:17 +0200
From: Michal Hocko
To: azurIt
Cc: Johannes Weiner , Andrew Morton , David Rientjes ,
	KAMEZAWA Hiroyuki , KOSAKI Motohiro , linux-mm@kvack.org,
	cgroups@vger.kernel.org, x86@kernel.org, linux-arch@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [patch 0/7] improve memcg oom killer robustness v2
Message-ID: <20130918183617.GD3421@dhcp22.suse.cz>
References: <20130916152548.GF3674@dhcp22.suse.cz>
	<20130916225246.A633145B@pobox.sk>
	<20130917000244.GD3278@cmpxchg.org>
	<20130917131535.94E0A843@pobox.sk>
	<20130917141013.GA30838@dhcp22.suse.cz>
	<20130918160304.6EDF2729@pobox.sk>
	<20130918142400.GA3421@dhcp22.suse.cz>
	<20130918163306.3620C973@pobox.sk>
	<20130918144245.GC3421@dhcp22.suse.cz>
	<20130918200239.DDA96791@pobox.sk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20130918200239.DDA96791@pobox.sk>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 18-09-13 20:02:39, azurIt wrote:
> > CC: "Johannes Weiner" , "Andrew Morton" , "David Rientjes" , "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" , linux-mm@kvack.org, cgroups@vger.kernel.org, x86@kernel.org, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
> >On Wed 18-09-13 16:33:06, azurIt wrote:
> >> > CC: "Johannes Weiner" , "Andrew Morton" , "David Rientjes" , "KAMEZAWA Hiroyuki" , "KOSAKI Motohiro" , linux-mm@kvack.org, cgroups@vger.kernel.org, x86@kernel.org, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
> >> >On Wed 18-09-13 16:03:04, azurIt wrote:
> >> >[..]
> >> >> I was finally able to get the stack of the problematic process :) I saved it
> >> >> two times from the same process, as Michal suggested (I wasn't able to
> >> >> take more). Here it is:
> >> >>
> >> >> First (doesn't look very helpful):
> >> >> [] 0xffffffffffffffff
> >> >
> >> >No it is not.
> >> >
> >> >> Second:
> >> >> [] shrink_zone+0x481/0x650
> >> >> [] do_try_to_free_pages+0xde/0x550
> >> >> [] try_to_free_pages+0x9b/0x120
> >> >> [] free_more_memory+0x5d/0x60
> >> >> [] __getblk+0x14d/0x2c0
> >> >> [] __bread+0x13/0xc0
> >> >> [] ext3_get_branch+0x98/0x140
> >> >> [] ext3_get_blocks_handle+0xd7/0xdc0
> >> >> [] ext3_get_block+0xc4/0x120
> >> >> [] do_mpage_readpage+0x38a/0x690
> >> >> [] mpage_readpages+0xfb/0x160
> >> >> [] ext3_readpages+0x1d/0x20
> >> >> [] __do_page_cache_readahead+0x1c5/0x270
> >> >> [] ra_submit+0x21/0x30
> >> >> [] filemap_fault+0x380/0x4f0
> >> >> [] __do_fault+0x78/0x5a0
> >> >> [] handle_pte_fault+0x84/0x940
> >> >> [] handle_mm_fault+0x16a/0x320
> >> >> [] do_page_fault+0x13b/0x490
> >> >> [] page_fault+0x1f/0x30
> >> >> [] 0xffffffffffffffff
> >> >
> >> >This is the direct reclaim path. You are simply running out of memory
> >> >globally. There is no memcg specific code in that trace.
> >>
> >>
> >> No, I'm not. Here are htop and server graphs from this case:
> >
> >Bahh, right you are. I didn't look at the trace carefully. It is
> >free_more_memory which calls the direct reclaim shrinking.
> >
> >Sorry about the confusion
>
>
> It happened again and this time I got 5x this:
> [] 0xffffffffffffffff
>
> :( it's probably looping very fast so I need to have some luck

Or it is looping in the userspace.
-- 
Michal Hocko
SUSE Labs
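For reference, a code path that is looping this fast is easiest to catch by sampling /proc/<pid>/stack repeatedly instead of reading it by hand. Below is a minimal sketch, not from the original thread: it assumes a kernel built with CONFIG_STACKTRACE, enough privilege to read the target task's stack (usually root), and the program name stacksample is made up for illustration.

/*
 * stacksample.c - illustrative sketch only: repeatedly dump the kernel
 * stack of a target task via /proc/<pid>/stack so a fast-moving code
 * path has a better chance of being caught mid-flight.
 *
 * Build: gcc -O2 -o stacksample stacksample.c
 * Usage: ./stacksample <pid> <samples>
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	if (argc != 3) {
		fprintf(stderr, "usage: %s <pid> <samples>\n", argv[0]);
		return 1;
	}

	char path[64];
	snprintf(path, sizeof(path), "/proc/%s/stack", argv[1]);

	int samples = atoi(argv[2]);
	char buf[8192];

	for (int i = 0; i < samples; i++) {
		FILE *f = fopen(path, "r");
		if (!f) {
			perror("fopen");	/* task exited or not enough privilege */
			return 1;
		}

		size_t n = fread(buf, 1, sizeof(buf) - 1, f);
		buf[n] = '\0';
		fclose(f);

		printf("--- sample %d ---\n%s", i + 1, buf);

		usleep(100 * 1000);	/* 100ms between samples */
	}

	return 0;
}

Comparing a handful of samples taken this way makes it easier to tell a task that keeps re-entering kernel reclaim from one spinning in userspace, where /proc/<pid>/stack shows only the terminating 0xffffffffffffffff frame.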