From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752912AbbCZPiw (ORCPT <rfc822;w@1wt.eu>);
	Thu, 26 Mar 2015 11:38:52 -0400
Received: from cantor2.suse.de ([195.135.220.15]:58397 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751774AbbCZPit (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Thu, 26 Mar 2015 11:38:49 -0400
Date: Thu, 26 Mar 2015 16:38:47 +0100
From: Michal Hocko <mhocko@suse.cz>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>, linux-mm@kvack.org,
        linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
        torvalds@linux-foundation.org, akpm@linux-foundation.org,
        ying.huang@intel.com, aarcange@redhat.com, david@fromorbit.com,
        tytso@mit.edu
Subject: Re: [patch 08/12] mm: page_alloc: wait for OOM killer progress
 before retrying
Message-ID: <20150326153847.GP15257@dhcp22.suse.cz>
References: <1427264236-17249-1-git-send-email-hannes@cmpxchg.org>
 <1427264236-17249-9-git-send-email-hannes@cmpxchg.org>
 <201503252315.FBJ09847.FSOtOJQFOMLFVH@I-love.SAKURA.ne.jp>
 <20150326112445.GC18560@cmpxchg.org>
 <20150326143223.GM15257@dhcp22.suse.cz>
 <20150326152343.GE23973@cmpxchg.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150326152343.GE23973@cmpxchg.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 26-03-15 11:23:43, Johannes Weiner wrote:
> On Thu, Mar 26, 2015 at 03:32:23PM +0100, Michal Hocko wrote:
> > On Thu 26-03-15 07:24:45, Johannes Weiner wrote:
> > > On Wed, Mar 25, 2015 at 11:15:48PM +0900, Tetsuo Handa wrote:
> > > > Johannes Weiner wrote:
> > [...]
> > > > >  	/*
> > > > > -	 * Acquire the oom lock.  If that fails, somebody else is
> > > > > -	 * making progress for us.
> > > > > +	 * This allocating task can become the OOM victim itself at
> > > > > +	 * any point before acquiring the lock.  In that case, exit
> > > > > +	 * quickly and don't block on the lock held by another task
> > > > > +	 * waiting for us to exit.
> > > > >  	 */
> > > > > -	if (!mutex_trylock(&oom_lock)) {
> > > > > -		*did_some_progress = 1;
> > > > > -		schedule_timeout_uninterruptible(1);
> > > > > -		return NULL;
> > > > > +	if (test_thread_flag(TIF_MEMDIE) || mutex_lock_killable(&oom_lock)) {
> > > > > +		alloc_flags |= ALLOC_NO_WATERMARKS;
> > > > > +		goto alloc;
> > > > >  	}
> > > > 
> > > > When a thread group has 1000 threads and most of them are doing memory allocation
> > > > request, all of them will get fatal_signal_pending() == true when one of them are
> > > > chosen by OOM killer.
> > > > This code will allow most of them to access memory reserves, won't it?
> > > 
> > > Ah, good point!  Only TIF_MEMDIE should get reserve access, not just
> > > any dying thread.  Thanks, I'll fix it in v2.
> > 
> > Do you plan to post this v2 here for review?
> 
> Yeah, I was going to wait for feedback to settle before updating the
> code.  But I was thinking something like this?
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 9ce9c4c083a0..106793a75461 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2344,7 +2344,8 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order, int alloc_flags,
>  	 * waiting for us to exit.
>  	 */
>  	if (test_thread_flag(TIF_MEMDIE) || mutex_lock_killable(&oom_lock)) {
> -		alloc_flags |= ALLOC_NO_WATERMARKS;
> +		if (test_thread_flag(TIF_MEMDIE))
> +			alloc_flags |= ALLOC_NO_WATERMARKS;
>  		goto alloc;
>  	}

OK, I have expected something like this. I understand why you want to
retry inside this function. But I would prefer if gfp_to_alloc_flags was
used here so that we do not have that TIF_MEMDIE logic duplicated at two
places.
-- 
Michal Hocko
SUSE Labs

From mboxrd@z Thu Jan  1 00:00:00 1970
From: Michal Hocko <mhocko@suse.cz>
Subject: Re: [patch 08/12] mm: page_alloc: wait for OOM killer progress
 before retrying
Date: Thu, 26 Mar 2015 16:38:47 +0100
Message-ID: <20150326153847.GP15257@dhcp22.suse.cz>
References: <1427264236-17249-1-git-send-email-hannes@cmpxchg.org>
 <1427264236-17249-9-git-send-email-hannes@cmpxchg.org>
 <201503252315.FBJ09847.FSOtOJQFOMLFVH@I-love.SAKURA.ne.jp>
 <20150326112445.GC18560@cmpxchg.org>
 <20150326143223.GM15257@dhcp22.suse.cz>
 <20150326152343.GE23973@cmpxchg.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	ying.huang@intel.com, aarcange@redhat.com, david@fromorbit.com,
	tytso@mit.edu
To: Johannes Weiner <hannes@cmpxchg.org>
Return-path: <owner-linux-mm@kvack.org>
Content-Disposition: inline
In-Reply-To: <20150326152343.GE23973@cmpxchg.org>
Sender: owner-linux-mm@kvack.org
List-Id: linux-fsdevel.vger.kernel.org

On Thu 26-03-15 11:23:43, Johannes Weiner wrote:
> On Thu, Mar 26, 2015 at 03:32:23PM +0100, Michal Hocko wrote:
> > On Thu 26-03-15 07:24:45, Johannes Weiner wrote:
> > > On Wed, Mar 25, 2015 at 11:15:48PM +0900, Tetsuo Handa wrote:
> > > > Johannes Weiner wrote:
> > [...]
> > > > >  	/*
> > > > > -	 * Acquire the oom lock.  If that fails, somebody else is
> > > > > -	 * making progress for us.
> > > > > +	 * This allocating task can become the OOM victim itself at
> > > > > +	 * any point before acquiring the lock.  In that case, exit
> > > > > +	 * quickly and don't block on the lock held by another task
> > > > > +	 * waiting for us to exit.
> > > > >  	 */
> > > > > -	if (!mutex_trylock(&oom_lock)) {
> > > > > -		*did_some_progress = 1;
> > > > > -		schedule_timeout_uninterruptible(1);
> > > > > -		return NULL;
> > > > > +	if (test_thread_flag(TIF_MEMDIE) || mutex_lock_killable(&oom_lock)) {
> > > > > +		alloc_flags |= ALLOC_NO_WATERMARKS;
> > > > > +		goto alloc;
> > > > >  	}
> > > > 
> > > > When a thread group has 1000 threads and most of them are doing memory allocation
> > > > request, all of them will get fatal_signal_pending() == true when one of them are
> > > > chosen by OOM killer.
> > > > This code will allow most of them to access memory reserves, won't it?
> > > 
> > > Ah, good point!  Only TIF_MEMDIE should get reserve access, not just
> > > any dying thread.  Thanks, I'll fix it in v2.
> > 
> > Do you plan to post this v2 here for review?
> 
> Yeah, I was going to wait for feedback to settle before updating the
> code.  But I was thinking something like this?
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 9ce9c4c083a0..106793a75461 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2344,7 +2344,8 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order, int alloc_flags,
>  	 * waiting for us to exit.
>  	 */
>  	if (test_thread_flag(TIF_MEMDIE) || mutex_lock_killable(&oom_lock)) {
> -		alloc_flags |= ALLOC_NO_WATERMARKS;
> +		if (test_thread_flag(TIF_MEMDIE))
> +			alloc_flags |= ALLOC_NO_WATERMARKS;
>  		goto alloc;
>  	}

OK, I have expected something like this. I understand why you want to
retry inside this function. But I would prefer if gfp_to_alloc_flags was
used here so that we do not have that TIF_MEMDIE logic duplicated at two
places.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>