From: Mel Gorman <mgorman@suse.de>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>,
	Minchan Kim <minchan@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jia He <hejianet@gmail.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com, Vlastimil Babka <vbabka@suse.cz>
Subject: Re: [PATCH 1/9] mm: fix 100% CPU kswapd busyloop on unreclaimable nodes
Date: Thu, 9 Mar 2017 14:20:44 +0000	[thread overview]
Message-ID: <20170309142044.5ewlvus6ana6boqp@suse.de> (raw)
In-Reply-To: <20170307165631.GA21425@cmpxchg.org>

On Tue, Mar 07, 2017 at 11:56:31AM -0500, Johannes Weiner wrote:
> On Tue, Mar 07, 2017 at 11:17:02AM +0100, Michal Hocko wrote:
> > On Mon 06-03-17 11:24:10, Johannes Weiner wrote:
> > > @@ -3271,7 +3271,8 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int classzone_idx)
> > >  		 * Raise priority if scanning rate is too low or there was no
> > >  		 * progress in reclaiming pages
> > >  		 */
> > > -		if (raise_priority || !sc.nr_reclaimed)
> > > +		nr_reclaimed = sc.nr_reclaimed - nr_reclaimed;
> > > +		if (raise_priority || !nr_reclaimed)
> > >  			sc.priority--;
> > >  	} while (sc.priority >= 1);
> > >  
> > 
> > I would rather not play with the sc state here. From a quick look at
> > least 
> > 	/*
> > 	 * Fragmentation may mean that the system cannot be rebalanced for
> > 	 * high-order allocations. If twice the allocation size has been
> > 	 * reclaimed then recheck watermarks only at order-0 to prevent
> > 	 * excessive reclaim. Assume that a process requested a high-order
> > 	 * can direct reclaim/compact.
> > 	 */
> > 	if (sc->order && sc->nr_reclaimed >= compact_gap(sc->order))
> > 		sc->order = 0;
> > 
> > does rely on the value. Wouldn't something like the following be safer?
> 
> Well, what behavior is correct, though? This check looks like an
> argument *against* resetting sc.nr_reclaimed.
> 
> If kswapd is woken up for a higher order, this check sets a reclaim
> cutoff beyond which it should give up on the order and balance for 0.
> 
> That's on the scope of the kswapd invocation. Applying this threshold
> to the outcome of just the preceding priority seems like a mistake.
> 
> Mel? Vlastimil?

I cannot say which is definitely the correct behaviour. The current
behaviour is conservative due to the historical concerns about kswapd
reclaiming the world. The hazard as I see it is that resetting it *may*
lead to more aggressive reclaim for high-order allocations. That may be
welcome to those who really want high-order pages and unwelcome to others
who prefer pages to remain resident.

However, in this case it's a tight window and problems would be tricky to
detect. THP allocations won't trigger the behaviour and with vmalloc'd
stacks, I'd expect that only SLUB-intensive workloads using high-order
pages would trigger any adverse behaviour. While I'm mildly concerned, I
would be a little surprised if it actually caused runaway reclaim.
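
To make the trade-off above concrete, here is a minimal userspace sketch,
not kernel code. It assumes compact_gap() is "twice the allocation size"
as the quoted comment describes, and it models each priority pass as a
fixed number of reclaimed pages. The function first_pass_giving_up() and
its parameters are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: twice the allocation size, per the quoted comment. */
static unsigned long compact_gap(unsigned int order)
{
	return 2UL << order;
}

/*
 * Hypothetical toy model of the priority loop. per_pass[i] is the number
 * of pages reclaimed in pass i. Returns the index of the first pass after
 * which the high-order request would be abandoned (order set to 0), or -1
 * if the threshold is never reached.
 *
 * reset_each_pass == 0: the counter accumulates over the whole invocation.
 * reset_each_pass != 0: the counter only reflects the latest pass.
 */
static int first_pass_giving_up(const unsigned long *per_pass, int n,
				unsigned int order, int reset_each_pass)
{
	unsigned long nr_reclaimed = 0;

	assert(per_pass != NULL);

	for (int i = 0; i < n; i++) {
		if (reset_each_pass)
			nr_reclaimed = 0;
		nr_reclaimed += per_pass[i];

		/* Same shape as the quoted check against compact_gap(). */
		if (order && nr_reclaimed >= compact_gap(order))
			return i;
	}
	return -1;
}
```

With four passes of 6 pages each and order 3 (a gap of 16 pages), the
accumulating counter crosses the threshold on the third pass (index 2),
while the per-pass counter never does, so the model keeps reclaiming for
the high order. That is the "more aggressive reclaim" hazard in miniature;
real reclaim rates vary per pass, so this only illustrates the direction
of the effect.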

-- 
Mel Gorman
SUSE Labs
