From: Nick Piggin <nickpiggin@yahoo.com.au> To: Mel Gorman <mel@csn.ul.ie> Cc: Linux Memory Management List <linux-mm@kvack.org>, Pekka Enberg <penberg@cs.helsinki.fi>, Rik van Riel <riel@redhat.com>, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>, Christoph Lameter <cl@linux-foundation.org>, Johannes Weiner <hannes@cmpxchg.org>, Nick Piggin <npiggin@suse.de>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Lin Ming <ming.m.lin@intel.com>, Zhang Yanmin <yanmin_zhang@linux.intel.com> Subject: Re: [RFC PATCH 00/20] Cleanup and optimise the page allocator Date: Tue, 24 Feb 2009 02:22:03 +1100 [thread overview] Message-ID: <200902240222.04645.nickpiggin@yahoo.com.au> (raw) In-Reply-To: <20090223150055.GK6740@csn.ul.ie> On Tuesday 24 February 2009 02:00:56 Mel Gorman wrote: > On Tue, Feb 24, 2009 at 01:46:01AM +1100, Nick Piggin wrote: > > free_page_mlock shouldn't really be in free_pages_check, but oh well. > > Agreed, I took it out of there. Oh good. I didn't notice that. > > > Patch 16 avoids using the zonelist cache on non-NUMA machines > > > > > > Patch 17 removes an expensive and excessively paranoid check in the > > > allocator fast path > > > > I would be careful of removing useful debug checks completely like > > this. What is the cost? Obviously non-zero, but it is also a check > > The cost was something like 1/10th the cost of the path. There are atomic > operations in there that are causing the problems. The only atomic memory operations in there should be atomic loads of word or atomic_t sized and aligned locations, which should just be normal loads on any architecture? The only atomic RMW you might see in that function would come from free_page_mlock (which you moved out of there, and anyway can be made non-atomic). I'd like you to just reevaluate it after your patchset, after the patch to make mlock non-atomic, and my patch I just sent. > > I have seen trigger on quite a lot of occasions (due to kernel bugs > > and hardware bugs, and in each case it is better to warn than not, > > even if many other situations can go undetected). > > Have you really seen it trigger for the allocation path or did it > trigger in teh free path? Essentially we are making the same check on > every allocation and free which is why I considered it excessivly > paranoid. Yes I've seen it trigger in the allocation path. Kernel memory scribbles or RAM errors. > > One problem is that some of the calls we're making in page_alloc.c > > do the compound_head() thing, wheras we know that we only want to > > look at this page. I've attached a patch which cuts out about 150 > > bytes of text and several branches from these paths. > > Nice, I should have spotted that. I'm going to fold this into the series > if that is ok with you? I'll replace patch 17 with it and see does it > still show up on profiles. Great! Sure fold it in (and put SOB: me on there if you like). > > > So, by and large it's an improvement of some sort. > > > > Most of these benchmarks *really* need to be run quite a few times to get > > a reasonable confidence. > > Most are run repeatedly and an average taken but I should double check > what is going on. It's irritating that gains/regressions are > inconsistent between different machine types but that is nothing new. Yeah. Cache behaviour maybe. One thing you might try is to size the struct page out to 64 bytes if it isn't already. This could bring down any skews if one kernel is lucky to get a nice packing of pages, or another is unlucky to get lots of struct pages spread over 2 cachelines. Maybe I'm just thinking wishfully :) I think with many of your changes, common sense will tell us that it is a better code sequence. Sometimes it's just impossible to really get "scientific proof" :)
WARNING: multiple messages have this Message-ID (diff)
From: Nick Piggin <nickpiggin@yahoo.com.au> To: Mel Gorman <mel@csn.ul.ie> Cc: Linux Memory Management List <linux-mm@kvack.org>, Pekka Enberg <penberg@cs.helsinki.fi>, Rik van Riel <riel@redhat.com>, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>, Christoph Lameter <cl@linux-foundation.org>, Johannes Weiner <hannes@cmpxchg.org>, Nick Piggin <npiggin@suse.de>, Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, Lin Ming <ming.m.lin@intel.com>, Zhang Yanmin <yanmin_zhang@linux.intel.com> Subject: Re: [RFC PATCH 00/20] Cleanup and optimise the page allocator Date: Tue, 24 Feb 2009 02:22:03 +1100 [thread overview] Message-ID: <200902240222.04645.nickpiggin@yahoo.com.au> (raw) In-Reply-To: <20090223150055.GK6740@csn.ul.ie> On Tuesday 24 February 2009 02:00:56 Mel Gorman wrote: > On Tue, Feb 24, 2009 at 01:46:01AM +1100, Nick Piggin wrote: > > free_page_mlock shouldn't really be in free_pages_check, but oh well. > > Agreed, I took it out of there. Oh good. I didn't notice that. > > > Patch 16 avoids using the zonelist cache on non-NUMA machines > > > > > > Patch 17 removes an expensive and excessively paranoid check in the > > > allocator fast path > > > > I would be careful of removing useful debug checks completely like > > this. What is the cost? Obviously non-zero, but it is also a check > > The cost was something like 1/10th the cost of the path. There are atomic > operations in there that are causing the problems. The only atomic memory operations in there should be atomic loads of word or atomic_t sized and aligned locations, which should just be normal loads on any architecture? The only atomic RMW you might see in that function would come from free_page_mlock (which you moved out of there, and anyway can be made non-atomic). I'd like you to just reevaluate it after your patchset, after the patch to make mlock non-atomic, and my patch I just sent. > > I have seen trigger on quite a lot of occasions (due to kernel bugs > > and hardware bugs, and in each case it is better to warn than not, > > even if many other situations can go undetected). > > Have you really seen it trigger for the allocation path or did it > trigger in teh free path? Essentially we are making the same check on > every allocation and free which is why I considered it excessivly > paranoid. Yes I've seen it trigger in the allocation path. Kernel memory scribbles or RAM errors. > > One problem is that some of the calls we're making in page_alloc.c > > do the compound_head() thing, wheras we know that we only want to > > look at this page. I've attached a patch which cuts out about 150 > > bytes of text and several branches from these paths. > > Nice, I should have spotted that. I'm going to fold this into the series > if that is ok with you? I'll replace patch 17 with it and see does it > still show up on profiles. Great! Sure fold it in (and put SOB: me on there if you like). > > > So, by and large it's an improvement of some sort. > > > > Most of these benchmarks *really* need to be run quite a few times to get > > a reasonable confidence. > > Most are run repeatedly and an average taken but I should double check > what is going on. It's irritating that gains/regressions are > inconsistent between different machine types but that is nothing new. Yeah. Cache behaviour maybe. One thing you might try is to size the struct page out to 64 bytes if it isn't already. This could bring down any skews if one kernel is lucky to get a nice packing of pages, or another is unlucky to get lots of struct pages spread over 2 cachelines. Maybe I'm just thinking wishfully :) I think with many of your changes, common sense will tell us that it is a better code sequence. Sometimes it's just impossible to really get "scientific proof" :) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2009-02-23 15:23 UTC|newest] Thread overview: 190+ messages / expand[flat|nested] mbox.gz Atom feed top 2009-02-22 23:17 [RFC PATCH 00/20] Cleanup and optimise the page allocator Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 01/20] Replace __alloc_pages_internal() with __alloc_pages_nodemask() Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 02/20] Do not sanity check order in the fast path Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 03/20] Do not check NUMA node ID when the caller knows the node is valid Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-23 15:01 ` Christoph Lameter 2009-02-23 15:01 ` Christoph Lameter 2009-02-23 16:24 ` Mel Gorman 2009-02-23 16:24 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 04/20] Convert gfp_zone() to use a table of precalculated values Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-23 11:55 ` [PATCH] mm: clean up __GFP_* flags a bit Peter Zijlstra 2009-02-23 11:55 ` Peter Zijlstra 2009-02-23 18:01 ` Mel Gorman 2009-02-23 18:01 ` Mel Gorman 2009-02-23 20:27 ` Vegard Nossum 2009-02-23 20:27 ` Vegard Nossum 2009-02-23 15:23 ` [PATCH 04/20] Convert gfp_zone() to use a table of precalculated values Christoph Lameter 2009-02-23 15:23 ` Christoph Lameter 2009-02-23 15:41 ` Nick Piggin 2009-02-23 15:41 ` Nick Piggin 2009-02-23 15:43 ` [PATCH 04/20] Convert gfp_zone() to use a table of precalculated value Christoph Lameter 2009-02-23 15:43 ` Christoph Lameter 2009-02-23 16:40 ` Mel Gorman 2009-02-23 16:40 ` Mel Gorman 2009-02-23 17:03 ` Christoph Lameter 2009-02-23 17:03 ` Christoph Lameter 2009-02-24 1:32 ` KAMEZAWA Hiroyuki 2009-02-24 1:32 ` KAMEZAWA Hiroyuki 2009-02-24 3:59 ` Nick Piggin 2009-02-24 3:59 ` Nick Piggin 2009-02-24 5:20 ` KAMEZAWA Hiroyuki 2009-02-24 5:20 ` KAMEZAWA Hiroyuki 2009-02-24 11:36 ` Mel Gorman 2009-02-24 11:36 ` Mel Gorman 2009-02-23 16:33 ` [PATCH 04/20] Convert gfp_zone() to use a table of precalculated values Mel Gorman 2009-02-23 16:33 ` Mel Gorman 2009-02-23 16:33 ` [PATCH 04/20] Convert gfp_zone() to use a table of precalculated value Christoph Lameter 2009-02-23 16:33 ` Christoph Lameter 2009-02-23 17:41 ` Mel Gorman 2009-02-23 17:41 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 05/20] Check only once if the zonelist is suitable for the allocation Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 06/20] Break up the allocator entry point into fast and slow paths Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 07/20] Simplify the check on whether cpusets are a factor or not Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-23 7:14 ` Pekka J Enberg 2009-02-23 7:14 ` Pekka J Enberg 2009-02-23 9:07 ` Peter Zijlstra 2009-02-23 9:07 ` Peter Zijlstra 2009-02-23 9:13 ` Pekka Enberg 2009-02-23 9:13 ` Pekka Enberg 2009-02-23 11:39 ` Mel Gorman 2009-02-23 11:39 ` Mel Gorman 2009-02-23 13:19 ` Pekka Enberg 2009-02-23 13:19 ` Pekka Enberg 2009-02-23 9:14 ` Li Zefan 2009-02-23 9:14 ` Li Zefan 2009-02-22 23:17 ` [PATCH 08/20] Move check for disabled anti-fragmentation out of fastpath Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 09/20] Calculate the preferred zone for allocation only once Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 10/20] Calculate the migratetype " Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 11/20] Inline get_page_from_freelist() in the fast-path Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-23 7:21 ` Pekka Enberg 2009-02-23 7:21 ` Pekka Enberg 2009-02-23 11:42 ` Mel Gorman 2009-02-23 11:42 ` Mel Gorman 2009-02-23 15:32 ` Nick Piggin 2009-02-23 15:32 ` Nick Piggin 2009-02-24 13:32 ` Mel Gorman 2009-02-24 13:32 ` Mel Gorman 2009-02-24 14:08 ` Nick Piggin 2009-02-24 14:08 ` Nick Piggin 2009-02-24 15:03 ` Mel Gorman 2009-02-24 15:03 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 12/20] Inline __rmqueue_smallest() Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 13/20] Inline buffered_rmqueue() Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-23 7:24 ` Pekka Enberg 2009-02-23 7:24 ` Pekka Enberg 2009-02-23 11:44 ` Mel Gorman 2009-02-23 11:44 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 14/20] Do not call get_pageblock_migratetype() more than necessary Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 15/20] Do not disable interrupts in free_page_mlock() Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-23 9:19 ` Peter Zijlstra 2009-02-23 9:19 ` Peter Zijlstra 2009-02-23 12:23 ` Mel Gorman 2009-02-23 12:23 ` Mel Gorman 2009-02-23 12:44 ` Peter Zijlstra 2009-02-23 12:44 ` Peter Zijlstra 2009-02-23 14:25 ` Mel Gorman 2009-02-23 14:25 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 16/20] Do not setup zonelist cache when there is only one node Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 17/20] Do not double sanity check page attributes during allocation Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 18/20] Split per-cpu list into one-list-per-migrate-type Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 19/20] Batch free pages from migratetype per-cpu lists Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-22 23:17 ` [PATCH 20/20] Get rid of the concept of hot/cold page freeing Mel Gorman 2009-02-22 23:17 ` Mel Gorman 2009-02-23 9:37 ` Andrew Morton 2009-02-23 9:37 ` Andrew Morton 2009-02-23 23:30 ` Mel Gorman 2009-02-23 23:30 ` Mel Gorman 2009-02-23 23:53 ` Andrew Morton 2009-02-23 23:53 ` Andrew Morton 2009-02-24 11:51 ` Mel Gorman 2009-02-24 11:51 ` Mel Gorman 2009-02-25 0:01 ` Andrew Morton 2009-02-25 0:01 ` Andrew Morton 2009-02-25 16:01 ` Mel Gorman 2009-02-25 16:01 ` Mel Gorman 2009-02-25 16:19 ` Andrew Morton 2009-02-25 16:19 ` Andrew Morton 2009-02-26 16:37 ` Mel Gorman 2009-02-26 16:37 ` Mel Gorman 2009-02-26 17:00 ` Christoph Lameter 2009-02-26 17:00 ` Christoph Lameter 2009-02-26 17:15 ` Mel Gorman 2009-02-26 17:15 ` Mel Gorman 2009-02-26 17:30 ` Christoph Lameter 2009-02-26 17:30 ` Christoph Lameter 2009-02-27 11:33 ` Nick Piggin 2009-02-27 11:33 ` Nick Piggin 2009-02-27 15:40 ` Christoph Lameter 2009-02-27 15:40 ` Christoph Lameter 2009-03-03 13:52 ` Mel Gorman 2009-03-03 13:52 ` Mel Gorman 2009-03-03 18:53 ` Christoph Lameter 2009-03-03 18:53 ` Christoph Lameter 2009-02-27 11:38 ` Nick Piggin 2009-02-27 11:38 ` Nick Piggin 2009-03-01 10:37 ` KOSAKI Motohiro 2009-03-01 10:37 ` KOSAKI Motohiro 2009-02-25 18:33 ` Christoph Lameter 2009-02-25 18:33 ` Christoph Lameter 2009-02-22 23:57 ` [RFC PATCH 00/20] Cleanup and optimise the page allocator Andi Kleen 2009-02-22 23:57 ` Andi Kleen 2009-02-23 12:34 ` Mel Gorman 2009-02-23 12:34 ` Mel Gorman 2009-02-23 15:34 ` [RFC PATCH 00/20] Cleanup and optimise the page allocato Christoph Lameter 2009-02-23 15:34 ` Christoph Lameter 2009-02-23 0:02 ` [RFC PATCH 00/20] Cleanup and optimise the page allocator Andi Kleen 2009-02-23 0:02 ` Andi Kleen 2009-02-23 14:32 ` Mel Gorman 2009-02-23 14:32 ` Mel Gorman 2009-02-23 17:49 ` Andi Kleen 2009-02-23 17:49 ` Andi Kleen 2009-02-24 14:32 ` Mel Gorman 2009-02-24 14:32 ` Mel Gorman 2009-02-23 7:29 ` Pekka Enberg 2009-02-23 7:29 ` Pekka Enberg 2009-02-23 8:34 ` Zhang, Yanmin 2009-02-23 8:34 ` Zhang, Yanmin 2009-02-23 9:10 ` KOSAKI Motohiro 2009-02-23 9:10 ` KOSAKI Motohiro 2009-02-23 11:55 ` [PATCH] mm: gfp_to_alloc_flags() Peter Zijlstra 2009-02-23 11:55 ` Peter Zijlstra 2009-02-23 14:00 ` Pekka Enberg 2009-02-23 14:00 ` Pekka Enberg 2009-02-23 18:17 ` Mel Gorman 2009-02-23 18:17 ` Mel Gorman 2009-02-23 20:09 ` Peter Zijlstra 2009-02-23 20:09 ` Peter Zijlstra 2009-02-23 22:59 ` Andrew Morton 2009-02-23 22:59 ` Andrew Morton 2009-02-24 8:59 ` Peter Zijlstra 2009-02-24 8:59 ` Peter Zijlstra 2009-02-23 14:38 ` [RFC PATCH 00/20] Cleanup and optimise the page allocator Christoph Lameter 2009-02-23 14:38 ` Christoph Lameter 2009-02-23 14:46 ` Nick Piggin 2009-02-23 14:46 ` Nick Piggin 2009-02-23 15:00 ` Mel Gorman 2009-02-23 15:00 ` Mel Gorman 2009-02-23 15:22 ` Nick Piggin [this message] 2009-02-23 15:22 ` Nick Piggin 2009-02-23 20:26 ` Mel Gorman 2009-02-23 20:26 ` Mel Gorman
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=200902240222.04645.nickpiggin@yahoo.com.au \ --to=nickpiggin@yahoo.com.au \ --cc=cl@linux-foundation.org \ --cc=hannes@cmpxchg.org \ --cc=kosaki.motohiro@jp.fujitsu.com \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=mel@csn.ul.ie \ --cc=ming.m.lin@intel.com \ --cc=npiggin@suse.de \ --cc=penberg@cs.helsinki.fi \ --cc=riel@redhat.com \ --cc=yanmin_zhang@linux.intel.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.