From: Michal Hocko <mhocko@suse.cz>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	azurIt <azurit@pobox.sk>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org, x86@kernel.org,
	linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 7/7] mm: memcg: do not trap chargers with full callstack on OOM
Date: Mon, 5 Aug 2013 11:54:29 +0200	[thread overview]
Message-ID: <20130805095429.GJ10146@dhcp22.suse.cz> (raw)
In-Reply-To: <1375549200-19110-8-git-send-email-hannes@cmpxchg.org>

On Sat 03-08-13 13:00:00, Johannes Weiner wrote:
> The memcg OOM handling is incredibly fragile and can deadlock.  When a
> task fails to charge memory, it invokes the OOM killer and loops right
> there in the charge code until it succeeds.  Comparably, any other
> task that enters the charge path at this point will go to a waitqueue
> right then and there and sleep until the OOM situation is resolved.
> The problem is that these tasks may hold filesystem locks and the
> mmap_sem; locks that the selected OOM victim may need to exit.
> 
> For example, in one reported case, the task invoking the OOM killer
> was about to charge a page cache page during a write(), which holds
> the i_mutex.  The OOM killer selected a task that was just entering
> truncate() and trying to acquire the i_mutex:
> 
> OOM invoking task:
> [<ffffffff8110a9c1>] mem_cgroup_handle_oom+0x241/0x3b0
> [<ffffffff8110b5ab>] T.1146+0x5ab/0x5c0
> [<ffffffff8110c22e>] mem_cgroup_cache_charge+0xbe/0xe0
> [<ffffffff810ca28c>] add_to_page_cache_locked+0x4c/0x140
> [<ffffffff810ca3a2>] add_to_page_cache_lru+0x22/0x50
> [<ffffffff810ca45b>] grab_cache_page_write_begin+0x8b/0xe0
> [<ffffffff81193a18>] ext3_write_begin+0x88/0x270
> [<ffffffff810c8fc6>] generic_file_buffered_write+0x116/0x290
> [<ffffffff810cb3cc>] __generic_file_aio_write+0x27c/0x480
> [<ffffffff810cb646>] generic_file_aio_write+0x76/0xf0           # takes ->i_mutex
> [<ffffffff8111156a>] do_sync_write+0xea/0x130
> [<ffffffff81112183>] vfs_write+0xf3/0x1f0
> [<ffffffff81112381>] sys_write+0x51/0x90
> [<ffffffff815b5926>] system_call_fastpath+0x18/0x1d
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> OOM kill victim:
> [<ffffffff811109b8>] do_truncate+0x58/0xa0              # takes i_mutex
> [<ffffffff81121c90>] do_last+0x250/0xa30
> [<ffffffff81122547>] path_openat+0xd7/0x440
> [<ffffffff811229c9>] do_filp_open+0x49/0xa0
> [<ffffffff8110f7d6>] do_sys_open+0x106/0x240
> [<ffffffff8110f950>] sys_open+0x20/0x30
> [<ffffffff815b5926>] system_call_fastpath+0x18/0x1d
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> The OOM handling task will retry the charge indefinitely while the OOM
> killed task is not releasing any resources.
> 
> A similar scenario can happen when the kernel OOM killer for a memcg
> is disabled and a userspace task is in charge of resolving OOM
> situations.  In this case, ALL tasks that enter the OOM path will be
> made to sleep on the OOM waitqueue and wait for userspace to free
> resources or increase the group's limit.  But a userspace OOM handler
> is prone to deadlock itself on the locks held by the waiting tasks.
> For example one of the sleeping tasks may be stuck in a brk() call
> with the mmap_sem held for writing but the userspace handler, in order
> to pick an optimal victim, may need to read files from /proc/<pid>,
> which tries to acquire the same mmap_sem for reading and deadlocks.
> 
> This patch changes the way tasks behave after detecting a memcg OOM
> and makes sure nobody loops or sleeps with locks held:
> 
> 1. When OOMing in a user fault, invoke the OOM killer and restart the
>    fault instead of looping on the charge attempt.  This way, the OOM
>    victim can not get stuck on locks the looping task may hold.
> 
> 2. When OOMing in a user fault but somebody else is handling it
>    (either the kernel OOM killer or a userspace handler), don't go to
>    sleep in the charge context.  Instead, remember the OOMing memcg in
>    the task struct and then fully unwind the page fault stack with
>    -ENOMEM.  pagefault_out_of_memory() will then call back into the
>    memcg code to check if the -ENOMEM came from the memcg, and then
>    either put the task to sleep on the memcg's OOM waitqueue or just
>    restart the fault.  The OOM victim can no longer get stuck on any
>    lock a sleeping task may hold.
> 
> Reported-by: azurIt <azurit@pobox.sk>
> Debugged-by: Michal Hocko <mhocko@suse.cz>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

I was thinking whether we should add a task_in_memcg_oom check into the
return-to-userspace path as well, just in case, but this should be OK
for now, and new users of mem_cgroup_enable_oom will be fought against
hard.

Acked-by: Michal Hocko <mhocko@suse.cz>

Thanks

> ---
>  include/linux/memcontrol.h |  21 +++++++
>  include/linux/sched.h      |   4 ++
>  mm/memcontrol.c            | 154 +++++++++++++++++++++++++++++++--------------
>  mm/memory.c                |   3 +
>  mm/oom_kill.c              |   7 ++-
>  5 files changed, 140 insertions(+), 49 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 9c449c1..cb84058 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -131,6 +131,10 @@ extern void mem_cgroup_replace_page_cache(struct page *oldpage,
>   *
>   * Toggle whether a failed memcg charge should invoke the OOM killer
>   * or just return -ENOMEM.  Returns the previous toggle state.
> + *
> + * NOTE: Any path that enables the OOM killer before charging must
> + *       call mem_cgroup_oom_synchronize() afterward to finalize the
> + *       OOM handling and clean up.
>   */
>  static inline bool mem_cgroup_toggle_oom(bool new)
>  {
> @@ -156,6 +160,13 @@ static inline void mem_cgroup_disable_oom(void)
>  	WARN_ON(old == false);
>  }
>  
> +static inline bool task_in_memcg_oom(struct task_struct *p)
> +{
> +	return p->memcg_oom.in_memcg_oom;
> +}
> +
> +bool mem_cgroup_oom_synchronize(void);
> +
>  #ifdef CONFIG_MEMCG_SWAP
>  extern int do_swap_account;
>  #endif
> @@ -392,6 +403,16 @@ static inline void mem_cgroup_disable_oom(void)
>  {
>  }
>  
> +static inline bool task_in_memcg_oom(struct task_struct *p)
> +{
> +	return false;
> +}
> +
> +static inline bool mem_cgroup_oom_synchronize(void)
> +{
> +	return false;
> +}
> +
>  static inline void mem_cgroup_inc_page_stat(struct page *page,
>  					    enum mem_cgroup_page_stat_item idx)
>  {
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 4b3effc..4593e27 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1400,6 +1400,10 @@ struct task_struct {
>  	unsigned int memcg_kmem_skip_account;
>  	struct memcg_oom_info {
>  		unsigned int may_oom:1;
> +		unsigned int in_memcg_oom:1;
> +		unsigned int oom_locked:1;
> +		int wakeups;
> +		struct mem_cgroup *wait_on_memcg;
>  	} memcg_oom;
>  #endif
>  #ifdef CONFIG_UPROBES
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 3d0c1d3..b30c67a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -280,6 +280,7 @@ struct mem_cgroup {
>  
>  	bool		oom_lock;
>  	atomic_t	under_oom;
> +	atomic_t	oom_wakeups;
>  
>  	int	swappiness;
>  	/* OOM-Killer disable */
> @@ -2180,6 +2181,7 @@ static int memcg_oom_wake_function(wait_queue_t *wait,
>  
>  static void memcg_wakeup_oom(struct mem_cgroup *memcg)
>  {
> +	atomic_inc(&memcg->oom_wakeups);
>  	/* for filtering, pass "memcg" as argument. */
>  	__wake_up(&memcg_oom_waitq, TASK_NORMAL, 0, memcg);
>  }
> @@ -2191,19 +2193,17 @@ static void memcg_oom_recover(struct mem_cgroup *memcg)
>  }
>  
>  /*
> - * try to call OOM killer. returns false if we should exit memory-reclaim loop.
> + * try to call OOM killer
>   */
> -static bool mem_cgroup_handle_oom(struct mem_cgroup *memcg, gfp_t mask,
> -				  int order)
> +static void mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
>  {
> -	struct oom_wait_info owait;
>  	bool locked;
> +	int wakeups;
>  
> -	owait.memcg = memcg;
> -	owait.wait.flags = 0;
> -	owait.wait.func = memcg_oom_wake_function;
> -	owait.wait.private = current;
> -	INIT_LIST_HEAD(&owait.wait.task_list);
> +	if (!current->memcg_oom.may_oom)
> +		return;
> +
> +	current->memcg_oom.in_memcg_oom = 1;
>  
>  	/*
>  	 * As with any blocking lock, a contender needs to start
> @@ -2211,12 +2211,8 @@ static bool mem_cgroup_handle_oom(struct mem_cgroup *memcg, gfp_t mask,
>  	 * otherwise it can miss the wakeup from the unlock and sleep
>  	 * indefinitely.  This is just open-coded because our locking
>  	 * is so particular to memcg hierarchies.
> -	 *
> -	 * Even if signal_pending(), we can't quit charge() loop without
> -	 * accounting. So, UNINTERRUPTIBLE is appropriate. But SIGKILL
> -	 * under OOM is always welcomed, use TASK_KILLABLE here.
>  	 */
> -	prepare_to_wait(&memcg_oom_waitq, &owait.wait, TASK_KILLABLE);
> +	wakeups = atomic_read(&memcg->oom_wakeups);
>  	mem_cgroup_mark_under_oom(memcg);
>  
>  	locked = mem_cgroup_oom_trylock(memcg);
> @@ -2226,15 +2222,95 @@ static bool mem_cgroup_handle_oom(struct mem_cgroup *memcg, gfp_t mask,
>  
>  	if (locked && !memcg->oom_kill_disable) {
>  		mem_cgroup_unmark_under_oom(memcg);
> -		finish_wait(&memcg_oom_waitq, &owait.wait);
>  		mem_cgroup_out_of_memory(memcg, mask, order);
> +		mem_cgroup_oom_unlock(memcg);
> +		/*
> +		 * There is no guarantee that an OOM-lock contender
> +		 * sees the wakeups triggered by the OOM kill
> +		 * uncharges.  Wake any sleepers explicitly.
> +		 */
> +		memcg_oom_recover(memcg);
>  	} else {
> -		schedule();
> -		mem_cgroup_unmark_under_oom(memcg);
> -		finish_wait(&memcg_oom_waitq, &owait.wait);
> +		/*
> +		 * A system call can just return -ENOMEM, but if this
> +		 * is a page fault and somebody else is handling the
> +		 * OOM already, we need to sleep on the OOM waitqueue
> +		 * for this memcg until the situation is resolved.
> +		 * Which can take some time because it might be
> +		 * handled by a userspace task.
> +		 *
> +		 * However, this is the charge context, which means
> +		 * that we may sit on a large call stack and hold
> +		 * various filesystem locks, the mmap_sem etc. and we
> +		 * don't want the OOM handler to deadlock on them
> +		 * while we sit here and wait.  Store the current OOM
> +		 * context in the task_struct, then return -ENOMEM.
> +		 * At the end of the page fault handler, with the
> +		 * stack unwound, pagefault_out_of_memory() will check
> +		 * back with us by calling
> +		 * mem_cgroup_oom_synchronize(), possibly putting the
> +		 * task to sleep.
> +		 */
> +		current->memcg_oom.oom_locked = locked;
> +		current->memcg_oom.wakeups = wakeups;
> +		css_get(&memcg->css);
> +		current->memcg_oom.wait_on_memcg = memcg;
>  	}
> +}
> +
> +/**
> + * mem_cgroup_oom_synchronize - complete memcg OOM handling
> + *
> + * This has to be called at the end of a page fault if the memcg
> + * OOM handler was enabled and the fault is returning %VM_FAULT_OOM.
> + *
> + * Memcg supports userspace OOM handling, so failed allocations must
> + * sleep on a waitqueue until the userspace task resolves the
> + * situation.  Sleeping directly in the charge context with all kinds
> + * of locks held is not a good idea, instead we remember an OOM state
> + * in the task and mem_cgroup_oom_synchronize() has to be called at
> + * the end of the page fault to put the task to sleep and clean up the
> + * OOM state.
> + *
> + * Returns %true if an ongoing memcg OOM situation was detected and
> + * finalized, %false otherwise.
> + */
> +bool mem_cgroup_oom_synchronize(void)
> +{
> +	struct oom_wait_info owait;
> +	struct mem_cgroup *memcg;
> +
> +	/* OOM is global, do not handle */
> +	if (!current->memcg_oom.in_memcg_oom)
> +		return false;
> +
> +	/*
> +	 * We invoked the OOM killer but there is a chance that a kill
> +	 * did not free up any charges.  Everybody else might already
> +	 * be sleeping, so restart the fault and keep the rampage
> +	 * going until some charges are released.
> +	 */
> +	memcg = current->memcg_oom.wait_on_memcg;
> +	if (!memcg)
> +		goto out;
> +
> +	if (test_thread_flag(TIF_MEMDIE) || fatal_signal_pending(current))
> +		goto out_memcg;
> +
> +	owait.memcg = memcg;
> +	owait.wait.flags = 0;
> +	owait.wait.func = memcg_oom_wake_function;
> +	owait.wait.private = current;
> +	INIT_LIST_HEAD(&owait.wait.task_list);
>  
> -	if (locked) {
> +	prepare_to_wait(&memcg_oom_waitq, &owait.wait, TASK_KILLABLE);
> +	/* Only sleep if we didn't miss any wakeups since OOM */
> +	if (atomic_read(&memcg->oom_wakeups) == current->memcg_oom.wakeups)
> +		schedule();
> +	finish_wait(&memcg_oom_waitq, &owait.wait);
> +out_memcg:
> +	mem_cgroup_unmark_under_oom(memcg);
> +	if (current->memcg_oom.oom_locked) {
>  		mem_cgroup_oom_unlock(memcg);
>  		/*
>  		 * There is no guarantee that an OOM-lock contender
> @@ -2243,11 +2319,10 @@ static bool mem_cgroup_handle_oom(struct mem_cgroup *memcg, gfp_t mask,
>  		 */
>  		memcg_oom_recover(memcg);
>  	}
> -
> -	if (test_thread_flag(TIF_MEMDIE) || fatal_signal_pending(current))
> -		return false;
> -	/* Give chance to dying process */
> -	schedule_timeout_uninterruptible(1);
> +	css_put(&memcg->css);
> +	current->memcg_oom.wait_on_memcg = NULL;
> +out:
> +	current->memcg_oom.in_memcg_oom = 0;
>  	return true;
>  }
>  
> @@ -2560,12 +2635,11 @@ enum {
>  	CHARGE_RETRY,		/* need to retry but retry is not bad */
>  	CHARGE_NOMEM,		/* we can't do more. return -ENOMEM */
>  	CHARGE_WOULDBLOCK,	/* GFP_WAIT wasn't set and no enough res. */
> -	CHARGE_OOM_DIE,		/* the current is killed because of OOM */
>  };
>  
>  static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  				unsigned int nr_pages, unsigned int min_pages,
> -				bool oom_check)
> +				bool invoke_oom)
>  {
>  	unsigned long csize = nr_pages * PAGE_SIZE;
>  	struct mem_cgroup *mem_over_limit;
> @@ -2622,14 +2696,10 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	if (mem_cgroup_wait_acct_move(mem_over_limit))
>  		return CHARGE_RETRY;
>  
> -	/* If we don't need to call oom-killer at el, return immediately */
> -	if (!oom_check || !current->memcg_oom.may_oom)
> -		return CHARGE_NOMEM;
> -	/* check OOM */
> -	if (!mem_cgroup_handle_oom(mem_over_limit, gfp_mask, get_order(csize)))
> -		return CHARGE_OOM_DIE;
> +	if (invoke_oom)
> +		mem_cgroup_oom(mem_over_limit, gfp_mask, get_order(csize));
>  
> -	return CHARGE_RETRY;
> +	return CHARGE_NOMEM;
>  }
>  
>  /*
> @@ -2732,7 +2802,7 @@ again:
>  	}
>  
>  	do {
> -		bool oom_check;
> +		bool invoke_oom = oom && !nr_oom_retries;
>  
>  		/* If killed, bypass charge */
>  		if (fatal_signal_pending(current)) {
> @@ -2740,14 +2810,8 @@ again:
>  			goto bypass;
>  		}
>  
> -		oom_check = false;
> -		if (oom && !nr_oom_retries) {
> -			oom_check = true;
> -			nr_oom_retries = MEM_CGROUP_RECLAIM_RETRIES;
> -		}
> -
> -		ret = mem_cgroup_do_charge(memcg, gfp_mask, batch, nr_pages,
> -		    oom_check);
> +		ret = mem_cgroup_do_charge(memcg, gfp_mask, batch,
> +					   nr_pages, invoke_oom);
>  		switch (ret) {
>  		case CHARGE_OK:
>  			break;
> @@ -2760,16 +2824,12 @@ again:
>  			css_put(&memcg->css);
>  			goto nomem;
>  		case CHARGE_NOMEM: /* OOM routine works */
> -			if (!oom) {
> +			if (!oom || invoke_oom) {
>  				css_put(&memcg->css);
>  				goto nomem;
>  			}
> -			/* If oom, we never return -ENOMEM */
>  			nr_oom_retries--;
>  			break;
> -		case CHARGE_OOM_DIE: /* Killed by OOM Killer */
> -			css_put(&memcg->css);
> -			goto bypass;
>  		}
>  	} while (ret != CHARGE_OK);
>  
> diff --git a/mm/memory.c b/mm/memory.c
> index 58ef726..91da6fb 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3868,6 +3868,9 @@ int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	if (flags & FAULT_FLAG_USER)
>  		mem_cgroup_disable_oom();
>  
> +	if (WARN_ON(task_in_memcg_oom(current) && !(ret & VM_FAULT_OOM)))
> +		mem_cgroup_oom_synchronize();
> +
>  	return ret;
>  }
>  
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 98e75f2..314e9d2 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -678,9 +678,12 @@ out:
>   */
>  void pagefault_out_of_memory(void)
>  {
> -	struct zonelist *zonelist = node_zonelist(first_online_node,
> -						  GFP_KERNEL);
> +	struct zonelist *zonelist;
>  
> +	if (mem_cgroup_oom_synchronize())
> +		return;
> +
> +	zonelist = node_zonelist(first_online_node, GFP_KERNEL);
>  	if (try_set_zonelist_oom(zonelist, GFP_KERNEL)) {
>  		out_of_memory(NULL, 0, 0, NULL, false);
>  		clear_zonelist_oom(zonelist, GFP_KERNEL);
> -- 
> 1.8.3.2
> 

-- 
Michal Hocko
SUSE Labs

>  		clear_zonelist_oom(zonelist, GFP_KERNEL);
> -- 
> 1.8.3.2
> 

-- 
Michal Hocko
SUSE Labs

Thread overview: 227+ messages
2013-08-03 16:59 [patch 0/7] improve memcg oom killer robustness v2 Johannes Weiner
2013-08-03 16:59 ` [patch 1/7] arch: mm: remove obsolete init OOM protection Johannes Weiner
2013-08-06  6:34   ` Vineet Gupta
2013-08-03 16:59 ` [patch 2/7] arch: mm: do not invoke OOM killer on kernel fault OOM Johannes Weiner
2013-08-03 16:59 ` [patch 3/7] arch: mm: pass userspace fault flag to generic fault handler Johannes Weiner
2013-08-05 22:06   ` Andrew Morton
2013-08-05 22:25     ` Johannes Weiner
2013-08-03 16:59 ` [patch 4/7] x86: finish user fault error path with fatal signal Johannes Weiner
2013-08-03 16:59 ` [patch 5/7] mm: memcg: enable memcg OOM killer only for user faults Johannes Weiner
2013-08-05  9:18   ` Michal Hocko
2013-08-03 16:59 ` [patch 6/7] mm: memcg: rework and document OOM waiting and wakeup Johannes Weiner
2013-08-03 17:00 ` [patch 7/7] mm: memcg: do not trap chargers with full callstack on OOM Johannes Weiner
2013-08-05  9:54   ` Michal Hocko [this message]
2013-08-05 20:56     ` Johannes Weiner
2013-08-03 17:08 ` [patch 0/7] improve memcg oom killer robustness v2 Johannes Weiner
2013-08-09  9:06   ` azurIt
2013-08-30 19:58   ` azurIt
2013-09-02 10:38     ` azurIt
2013-09-03 20:48       ` Johannes Weiner
2013-09-04  7:53         ` azurIt
2013-09-04  8:18         ` azurIt
2013-09-05 11:54           ` Johannes Weiner
2013-09-05 12:43             ` Michal Hocko
2013-09-05 16:18               ` Johannes Weiner
2013-09-09 12:36                 ` Michal Hocko
2013-09-09 12:56                   ` Michal Hocko
2013-09-12 12:59                     ` Johannes Weiner
2013-09-16 14:03                       ` Michal Hocko
2013-09-05 13:24             ` Michal Hocko
2013-09-09 13:10             ` azurIt
2013-09-09 17:28               ` Johannes Weiner
2013-09-09 19:59                 ` azurIt
2013-09-09 20:12                   ` Johannes Weiner
2013-09-09 20:18                     ` azurIt
2013-09-09 21:08                     ` azurIt
2013-09-10 18:13                     ` azurIt
2013-09-10 18:37                       ` Johannes Weiner
2013-09-10 19:32                         ` azurIt
2013-09-10 20:12                           ` Johannes Weiner
2013-09-10 21:08                             ` azurIt
2013-09-10 21:18                               ` Johannes Weiner
2013-09-10 21:32                                 ` azurIt
2013-09-10 22:03                                   ` Johannes Weiner
2013-09-11 12:33                                     ` azurIt
2013-09-11 18:03                                       ` Johannes Weiner
2013-09-11 18:54                                         ` azurIt
2013-09-11 19:11                                           ` Johannes Weiner
2013-09-11 19:41                                             ` azurIt
2013-09-11 20:04                                               ` Johannes Weiner
2013-09-14 10:48                                                 ` azurIt
2013-09-16 13:40                                                   ` Michal Hocko
2013-09-16 14:01                                                     ` azurIt
2013-09-16 14:06                                                       ` Michal Hocko
2013-09-16 14:13                                                         ` azurIt
2013-09-16 14:57                                                           ` Michal Hocko
2013-09-16 15:05                                                             ` azurIt
2013-09-16 15:17                                                               ` Johannes Weiner
2013-09-16 15:24                                                                 ` azurIt
2013-09-16 15:25                                                               ` Michal Hocko
2013-09-16 15:40                                                                 ` azurIt
2013-09-16 20:52                                                                 ` azurIt
2013-09-17  0:02                                                                   ` Johannes Weiner
2013-09-17 11:15                                                                     ` azurIt
2013-09-17 14:10                                                                       ` Michal Hocko
2013-09-18 14:03                                                                         ` azurIt
2013-09-18 14:24                                                                           ` Michal Hocko
2013-09-18 14:33                                                                             ` azurIt
2013-09-18 14:42                                                                               ` Michal Hocko
2013-09-18 18:02                                                                                 ` azurIt
2013-09-18 18:36                                                                                   ` Michal Hocko
2013-09-18 18:04                                                                           ` Johannes Weiner
2013-09-18 18:19                                                                             ` Johannes Weiner
2013-09-18 19:55                                                                               ` Johannes Weiner
2013-09-18 20:52                                                                                 ` azurIt
2013-09-25  7:26                                                                                 ` azurIt
2013-09-26 16:54                                                                                 ` azurIt
2013-09-26 19:27                                                                                   ` Johannes Weiner
2013-09-27  2:04                                                                                     ` azurIt
2013-10-07 11:01                                                                                     ` azurIt
2013-10-07 19:23                                                                                       ` Johannes Weiner
2013-10-09 18:44                                                                                         ` azurIt
2013-10-10  0:14                                                                                           ` Johannes Weiner
2013-10-10 22:59                                                                                             ` azurIt
2013-09-17 11:20                                                                     ` azurIt
2013-09-16 10:22                                                 ` azurIt
2013-09-04  9:45         ` azurIt
2013-09-04  9:45           ` azurIt
2013-09-04 11:57           ` Michal Hocko
2013-09-04 11:57             ` Michal Hocko
2013-09-04 12:10             ` azurIt
2013-09-04 12:10               ` azurIt
2013-09-04 12:10               ` azurIt
2013-09-04 12:26               ` Michal Hocko
2013-09-04 12:26                 ` Michal Hocko
2013-09-04 12:26                 ` Michal Hocko
2013-09-04 12:39                 ` azurIt
2013-09-04 12:39                   ` azurIt
2013-09-05  9:14                 ` azurIt
2013-09-05  9:14                   ` azurIt
2013-09-05  9:53                   ` Michal Hocko
2013-09-05  9:53                     ` Michal Hocko
2013-09-05 10:17                     ` azurIt
2013-09-05 10:17                       ` azurIt
2013-09-05 11:17                       ` Michal Hocko
2013-09-05 11:17                         ` Michal Hocko
2013-09-05 11:17                         ` Michal Hocko
2013-09-05 11:47                         ` azurIt
2013-09-05 11:47                           ` azurIt
2013-09-05 12:03                           ` Michal Hocko
2013-09-05 12:03                             ` Michal Hocko
2013-09-05 12:33                             ` azurIt
2013-09-05 12:33                               ` azurIt
2013-09-05 12:33                               ` azurIt
2013-09-05 12:45                               ` Michal Hocko
2013-09-05 12:45                                 ` Michal Hocko
2013-09-05 13:00                                 ` azurIt
2013-09-05 13:00                                   ` azurIt

