From: Mel Gorman <mel@csn.ul.ie>
To: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Christoph Lameter <cl@linux-foundation.org>,
	Nick Piggin <npiggin@suse.de>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Lin Ming <ming.m.lin@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH 00/22] Cleanup and optimise the page allocator V7
Date: Tue, 28 Apr 2009 11:27:19 +0100
Message-ID: <20090428102719.GA23540@csn.ul.ie>
In-Reply-To: <1240883957.2567.886.camel@ymzhang>

On Tue, Apr 28, 2009 at 09:59:17AM +0800, Zhang, Yanmin wrote:
> On Mon, 2009-04-27 at 15:38 +0100, Mel Gorman wrote:
> > On Mon, Apr 27, 2009 at 03:58:39PM +0800, Zhang, Yanmin wrote:
> > > On Wed, 2009-04-22 at 14:53 +0100, Mel Gorman wrote:
> > > > Here is V7 of the cleanup and optimisation of the page allocator and
> > > > it should be ready for wider testing. Please consider a possibility for
> > > > merging as a Pass 1 at making the page allocator faster. Other passes will
> > > > occur later when this one has had a bit of exercise. This patchset is based
> > > > on mmotm-2009-04-17 and I've tested it successfully on a small number of
> > > > machines.
> > > We ran some performance benchmarks against V7 patch on top of 2.6.30-rc3.
> > > It seems some counters in kernel are incorrect after we run some ffsb (disk I/O benchmark)
> > > and swap-cp (a simple swap memory testing by cp on tmpfs). Free memory is bigger than
> > > total memory.
> > >
> >
> > oops. Can you try this patch please?
> >
> > ==== CUT HERE ====
> >
> > Properly account for freed pages in free_pages_bulk() and when allocating high-order pages in buffered_rmqueue()
> >
> > free_pages_bulk() updates the number of free pages in the zone but it is
> > assuming that the pages being freed are order-0. While this is currently
> > always true, it's wrong to assume the order is 0. This patch fixes the
> > problem.
> >
> > buffered_rmqueue() is not updating NR_FREE_PAGES when allocating pages with
> > __rmqueue(). This means that any high-order allocation will appear to increase
> > the number of free pages leading to the situation where free pages appears to
> > exceed available RAM. This patch accounts for those allocated pages properly.
> >
> > This is a candidate fix to the patch
> > page-allocator-update-nr_free_pages-only-as-necessary.patch. It has yet to be
> > verified as fixing a problem where the free pages count is getting corrupted.
> >
> > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> > ---
> >  mm/page_alloc.c |    3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 3db5f57..dd69593 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -545,7 +545,7 @@ static void free_pages_bulk(struct zone *zone, int count,
> >  	zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> >  	zone->pages_scanned = 0;
> >
> > -	__mod_zone_page_state(zone, NR_FREE_PAGES, count);
> > +	__mod_zone_page_state(zone, NR_FREE_PAGES, count << order);
> >  	while (count--) {
> >  		struct page *page;
> >
> > @@ -1151,6 +1151,7 @@ again:
> >  	} else {
> >  		spin_lock_irqsave(&zone->lock, flags);
> >  		page = __rmqueue(zone, order, migratetype);
> > +		__mod_zone_page_state(zone, NR_FREE_PAGES, -(i << order));
> Here 'i' should be 1?

1UL even. Not sure how I managed to send a version with 'i' after a build +
boot test.

>
> >  		spin_unlock(&zone->lock);
> >  		if (!page)
> >  			goto failed;
>
> I ran a cp kernel source files and swap-cp workload and didn't find
> bad counter now.
>

I'm assuming you mean that it worked with s/i/1/. I'll send out an updated
version.

Thanks a lot.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
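
The accounting rule discussed in this message can be modelled in user space. The following is only a sketch under stated assumptions, not the kernel code: struct toy_zone and the toy_* helpers are hypothetical names invented for illustration, and the counter is a plain long rather than the kernel's per-zone vmstat counter. It shows that freeing adds count << order pages, allocating one block subtracts 1UL << order pages, and that leaving out the allocation-side update lets the free count climb past the zone size.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct zone: only what this sketch needs. */
struct toy_zone {
	unsigned long total_pages;	/* pages managed by the zone */
	long nr_free;			/* models the NR_FREE_PAGES counter */
};

/* Models __mod_zone_page_state(zone, NR_FREE_PAGES, delta). */
static void toy_mod_free(struct toy_zone *zone, long delta)
{
	zone->nr_free += delta;
}

/* Free 'count' blocks of 2^order pages each: add count << order. */
static void toy_free_bulk(struct toy_zone *zone, int count, unsigned int order)
{
	assert(order < 8 * sizeof(long) - 1);	/* keep the shift well defined */
	toy_mod_free(zone, (long)count << order);
}

/*
 * Allocate one block of 2^order pages: subtract 1UL << order.
 * Passing account_alloc = false reproduces the missing update
 * described for buffered_rmqueue().
 */
static void toy_alloc(struct toy_zone *zone, unsigned int order,
		      bool account_alloc)
{
	assert(order < 8 * sizeof(long) - 1);
	if (account_alloc)
		toy_mod_free(zone, -(long)(1UL << order));
}

/* The counter is sane if it lies between 0 and the zone size. */
static bool toy_counter_sane(const struct toy_zone *zone)
{
	return zone->nr_free >= 0 &&
	       (unsigned long)zone->nr_free <= zone->total_pages;
}
```

With a zone of 1024 free pages, an order-3 allocation followed by an order-3 free returns the counter to 1024 when both sides are accounted. If the allocation side is skipped, the same sequence leaves the counter at 1032, which matches the reported symptom of free memory exceeding total memory.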