From: Greg KH <greg@kroah.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Rik van Riel <riel@redhat.com>,
	Christoph Lameter <cl@linux-foundation.org>,
	Dave Chinner <david@fromorbit.com>,
	Linux Kernel List <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Minchan Kim <minchan.kim@gmail.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	stable@kernel.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [stable] [PATCH 0/3] Reduce watermark-related problems with the per-cpu allocator V4
Date: Tue, 21 Sep 2010 05:58:14 -0700
Message-ID: <20100921125814.GF1205@kroah.com>
In-Reply-To: <20100921111741.GB11439@csn.ul.ie>

On Tue, Sep 21, 2010 at 12:17:41PM +0100, Mel Gorman wrote:
> On Fri, Sep 03, 2010 at 04:05:51PM -0700, Andrew Morton wrote:
> > On Fri,  3 Sep 2010 10:08:43 +0100
> > Mel Gorman <mel@csn.ul.ie> wrote:
> > 
> > > The noteworthy change is to patch 2 which now uses the generic
> > > zone_page_state_snapshot() in zone_nr_free_pages(). Similar logic still
> > > > applies for *when* zone_page_state_snapshot() is used, to avoid overhead.
> > > 
> > > Changelog since V3
> > >   o Use generic helper for NR_FREE_PAGES estimate when necessary
> > > 
> > > Changelog since V2
> > >   o Minor clarifications
> > >   o Rebase to 2.6.36-rc3
> > > 
> > > Changelog since V1
> > >   o Fix for !CONFIG_SMP
> > >   o Correct spelling mistakes
> > >   o Clarify a ChangeLog
> > > >   o Only check for counter drift on machines large enough for the counter
> > > >     drift to breach the min watermark when NR_FREE_PAGES reports that the
> > > >     low watermark is fine
> > > 
> > > Internal IBM test teams beta testing distribution kernels have reported
> > > problems on machines with a large number of CPUs whereby page allocator
> > > failure messages show huge differences between the nr_free_pages vmstat
> > > counter and what is available on the buddy lists. In an extreme example,
> > > nr_free_pages was above the min watermark but zero pages were on the buddy
> > > > lists, allowing the system to potentially livelock, unable to make forward
> > > > progress unless an allocation succeeds. There is no reason why the problems
> > > would not affect mainline so the following series mitigates the problems
> > > > in the page allocator related to per-cpu counter drift and lists.
> > > 
> > > The first patch ensures that counters are updated after pages are added to
> > > free lists.
> > > 
> > > The second patch notes that the counter drift between nr_free_pages and what
> > > is on the per-cpu lists can be very high. When memory is low and kswapd
> > > is awake, the per-cpu counters are checked as well as reading the value
> > > of NR_FREE_PAGES. This will slow the page allocator when memory is low and
> > > kswapd is awake but it will be much harder to breach the min watermark and
> > > potentially livelock the system.
> > > 
> > > The third patch notes that after direct-reclaim an allocation can
> > > fail because the necessary pages are on the per-cpu lists. After a
> > > direct-reclaim-and-allocation-failure, the per-cpu lists are drained and
> > > a second attempt is made.
> > > 
> > > Performance tests against 2.6.36-rc3 did not show up anything interesting. A
> > > version of this series that continually called vmstat_update() when
> > > memory was low was tested internally and found to help the counter drift
> > > problem. I described this during LSF/MM Summit and the potential for IPI
> > > storms was frowned upon. An alternative fix is in patch two which uses
> > > for_each_online_cpu() to read the vmstat deltas while memory is low and
> > > kswapd is awake. This should be functionally similar.
> > > 
> > > This patch should be merged after the patch "vmstat : update
> > > zone stat threshold at onlining a cpu" which is in mmotm as
> > > vmstat-update-zone-stat-threshold-when-onlining-a-cpu.patch .
> > > 
> > > If we can agree on it, this series is a stable candidate.
> > 
> > (cc stable@kernel.org)
> > 
> > >  include/linux/mmzone.h |   13 +++++++++++++
> > >  include/linux/vmstat.h |   22 ++++++++++++++++++++++
> > >  mm/mmzone.c            |   21 +++++++++++++++++++++
> > >  mm/page_alloc.c        |   29 +++++++++++++++++++++--------
> > >  mm/vmstat.c            |   15 ++++++++++++++-
> > >  5 files changed, 91 insertions(+), 9 deletions(-)
> > 
> > For the entire patch series I get
> > 
> >  include/linux/mmzone.h |   13 +++++++++++++
> >  include/linux/vmstat.h |   22 ++++++++++++++++++++++
> >  mm/mmzone.c            |   21 +++++++++++++++++++++
> >  mm/page_alloc.c        |   33 +++++++++++++++++++++++----------
> >  mm/vmstat.c            |   16 +++++++++++++++-
> >  5 files changed, 94 insertions(+), 11 deletions(-)
> > 
> > The patches do apply OK to 2.6.35.
> > 
> > > Given the extent and the coreness of it all, it's a bit more than I'd
> > usually push at the -stable guys.  But I guess that if the patches fix
> > all the issues you've noted, as well as David's "minute-long livelocks
> > in memory reclaim" then yup, it's worth backporting it all.
> > 
> 
> These patches have made it to mainline as the following commits.
> 
> 9ee493c mm: page allocator: drain per-cpu lists after direct reclaim allocation fails
> aa45484 mm: page allocator: calculate a better estimate of NR_FREE_PAGES when memory is low and kswapd is awake
> 72853e2 mm: page allocator: update free page counters after pages are placed on the free list
> 
> I have not heard from the -stable guys, is there a reasonable
> expectation that they'll be picked up?

If you ask me, then I'll know to give a response :)

None of these were tagged as going to the stable tree, should I include
them?  If so, for which -stable tree?  .27, .32, and .35 are all
currently active.

thanks,

greg k-h
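
For readers following the quoted cover letter, the counter drift it describes
can be illustrated with a small user-space model. This is only a sketch under
stated assumptions, not the kernel code: struct zone_model, mod_free_pages(),
free_pages_fast(), free_pages_snapshot() and the NR_CPUS and THRESHOLD values
are all hypothetical names and numbers chosen for illustration. Each CPU
batches its changes in a private delta and folds them into the global counter
only once the delta exceeds a threshold, so the cheap global read can be off
by up to NR_CPUS * THRESHOLD pages. The snapshot read adds the pending deltas
back in, which is the idea the cover letter attributes to
zone_page_state_snapshot().

```c
#include <assert.h>

#define NR_CPUS   4   /* hypothetical CPU count for this model */
#define THRESHOLD 8   /* hypothetical per-CPU batching threshold */

/* Hypothetical model of one zone's free-page accounting. */
struct zone_model {
	long nr_free_pages;       /* global counter, cheap to read */
	long cpu_delta[NR_CPUS];  /* per-CPU changes not yet folded in */
};

/*
 * Account a change on one CPU. The delta is folded into the global
 * counter only once its magnitude exceeds THRESHOLD.
 */
static void mod_free_pages(struct zone_model *z, int cpu, long delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	z->cpu_delta[cpu] += delta;
	if (z->cpu_delta[cpu] > THRESHOLD || z->cpu_delta[cpu] < -THRESHOLD) {
		z->nr_free_pages += z->cpu_delta[cpu];
		z->cpu_delta[cpu] = 0;
	}
}

/* Cheap read: global counter only, may be stale. */
static long free_pages_fast(const struct zone_model *z)
{
	return z->nr_free_pages;
}

/* Slower read: global counter plus every pending per-CPU delta. */
static long free_pages_snapshot(const struct zone_model *z)
{
	long sum = z->nr_free_pages;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += z->cpu_delta[cpu];

	/* Clamp, since a sum taken without locking could dip below zero. */
	return sum < 0 ? 0 : sum;
}
```

In this model, starting from 100 free pages and having each of the four CPUs
take 8 pages leaves the fast read at 100 while the snapshot reports 68. The
snapshot costs a walk over every CPU, which is why the cover letter restricts
it to the case where memory is low and kswapd is awake.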
