From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Dave Hansen <dave@sr71.net>
Cc: Seth Jennings <sjenning@linux.vnet.ibm.com>,
	Nathan Fontenot <nfont@linux.vnet.ibm.com>,
	Cody P Schafer <cody@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Lai Jiangshan <laijs@cn.fujitsu.com>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC][PATCH] drivers: base: dynamic memory block creation
Date: Wed, 14 Aug 2013 13:35:46 -0700
Message-ID: <20130814203546.GA6200@kroah.com>
In-Reply-To: <520BE30D.3070401@sr71.net>

On Wed, Aug 14, 2013 at 01:05:33PM -0700, Dave Hansen wrote:
> On 08/14/2013 12:43 PM, Greg Kroah-Hartman wrote:
> > On Wed, Aug 14, 2013 at 02:31:45PM -0500, Seth Jennings wrote:
> >> ppc64 has a normal memory block size of 256M (however sometimes as low
> >> as 16M depending on the system LMB size), and (I think) x86 is 128M.  With
> >> 1TB of RAM and a 256M block size, that's 4k memory blocks with 20 sysfs
> >> entries per block that's around 80k items that need be created at boot
> >> time in sysfs.  Some systems go up to 16TB where the issue is even more
> >> severe.
> > 
> > The x86 developers are working with larger memory sizes and they haven't
> > seen the problem in this area, for them it's in other places, as I
> > referred to in my other email.
> 
> The SGI guys don't run normal distro kernels and don't turn on memory
> hotplug, so they don't see this.  I do the same in my testing of
> large-memory x86 systems to speed up my boots.  I'll go stick it back in
> there and see if I can generate some numbers for a 1TB machine.
> 
> But, the problem on x86 is at _worst_ 1/8 of the problem on ppc64 since
> the SECTION_SIZE is 8x bigger by default.
> 
> Also, the cost of creating sections on ppc is *MUCH* higher than x86
> when amortized across the number of pages that you're initializing.  A
> section on ppc64 has to be created for each (2^24/2^16)=256 pages while
> one on x86 is created for each (2^27/2^12)=32768 pages.
> 
> Thus, x86 folks with our small pages and large sections tend to be
> focused on per-page costs.  The ppc folks with their small sections and
> larger pages tend to be focused on the per-section costs.

Ah, thanks for the explanation, now it makes more sense why they are
both optimizing in different places.
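
For anyone following along, the arithmetic quoted above can be checked
with a short sketch.  It only uses the sizes given in the thread (16M
sections with 64K pages on ppc64, 128M sections with 4K pages on x86,
256M memory blocks, 20 sysfs entries per block).  The function and
constant names are made up for illustration and do not correspond to
anything in the kernel.

```python
# Illustrative only: sizes come from the quoted discussion, and the
# helper names below are hypothetical, not kernel identifiers.
MIB = 1 << 20
TIB = 1 << 40


def pages_per_section(section_size: int, page_size: int) -> int:
    """Number of pages initialized for each section that is created."""
    return section_size // page_size


def sysfs_entries(ram_bytes: int, block_size: int, entries_per_block: int) -> int:
    """Total sysfs items created at boot for all memory blocks."""
    return (ram_bytes // block_size) * entries_per_block


ppc64_pages = pages_per_section(1 << 24, 1 << 16)  # 16M section, 64K pages
x86_pages = pages_per_section(1 << 27, 1 << 12)    # 128M section, 4K pages

print(ppc64_pages)                      # 256
print(x86_pages)                        # 32768
print((1 << 27) // (1 << 24))           # 8, the section size ratio
print(sysfs_entries(TIB, 256 * MIB, 20))  # 81920, i.e. "around 80k"
```

So a ppc64 section is amortized over 128 times fewer pages than an x86
one, which matches the per-section versus per-page focus described
above.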

But a "cleanup" patch first, and then the "change the logic to go
faster" would be better here, so that we can review what is really
happening.

thanks,

greg k-h
