linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Luck, Tony" <tony.luck@intel.com>
To: "Martin J. Bligh" <mbligh@aracnet.com>
Cc: <linux-kernel@vger.kernel.org>
Subject: Re: memory hotremove prototype, take 3
Date: Tue, 9 Dec 2003 16:45:10 -0800	[thread overview]
Message-ID: <B8E391BBE9FE384DAA4C5C003888BE6F4FAF4A@scsmsx401.sc.intel.com> (raw)

> If your target is NUMA, then you really, really need CONFIG_NONLINEAR.
> We don't support multiple pgdats per node, nor do I wish to, as it'll
> make an unholy mess ;-). With CONFIG_NONLINEAR, the discontiguities
> within a node are buried down further, so we have much less complexity
> to deal with from the main VM. The abstraction also keeps the poor
> VM engineers trying to read / write the code saner via simplicity ;-)
> 
> WRT generic discontigmem support (not NUMA), doing that via pgdats
> should really go away, as there's no real difference between the 
> chunks of physical memory as far as the page allocator is concerned.
> The plan is to use Daniel's nonlinear stuff to replace that, and keep
> the pgdats strictly for NUMA. Same would apply to hotpluggable zones - 
> I'd hate to end up with 512 pgdats of stuff that are really all the
> same memory types underneath.

I guess this all depends on whether you allow bits of memory on
nodes to be hot-plugged ... or insist on the whole node being
added/removed in one fell swoop.  I'd expect the latter to be
a more common model, and in that case the "pgdat-for-the-node" is
the same as the "pgdat-for-the-hot-plug-zone", so you don't have
a proliferation of pgdats to support hotplug.

> The real issue you have is the mapping of the struct pages - if we can
> achieve a non-contig mapping of the mem_map / lmem_map array, we should
> be able to take memory on and offline reasonably easy. If you're willing
> for a first implementation to pre-allocate the struct page array for 
> every possible virtual address, it makes life a lot easier.

On 64-bit systems with CONFIG_VIRTUAL_MEMMAP, this would be trivial,
and avoids the need for the extra level of indirection in the psection[]
and vection[] arrays in CONFIG_NONLINEAR (ok ... it doesn't really get
rid of the indirection, as the page table lookup to access the virtual
mem_map effectively ends up doing the same thing).
 
> Adding the other layer of indirection for access the struct page array
> should fix up most of that, and is very easily abstracted out via the
> pfn_to_page macros and friends. I ripped out all the direct references
> to mem_map indexing already in 2.6, so it should all be nicely 
> abstracted out.

I did go back and look at the CONFIG_NONLINEAR patch again, and I
still can't see how to make it useful on 64-bit machines. Jack
Steiner asked a bunch of questions on how it would work for an
architecture like the SGI:
http://marc.theaimsgroup.com/?l=lse-tech&m=101828803506249&w=2
I don't remember seeing any answers on the list.  Assuming he
were to use a section size of 64MB (a convenient number for ia64)
he'd end up with psection[]/vsection[] tables with 8 million
entries each (@ 4 bytes/entry -> 64MB for the pair).

-Tony Luck  


             reply	other threads:[~2003-12-10  0:45 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-12-10  0:45 Luck, Tony [this message]
  -- strict thread matches above, loose matches on Subject: below --
2003-12-03 17:57 memory hotremove prototype, take 3 Luck, Tony
2003-12-03  5:19 Perez-Gonzalez, Inaky
2003-12-01 20:12 Luck, Tony
2003-12-02  3:01 ` IWAMOTO Toshihiro
2003-12-02  6:43   ` Hirokazu Takahashi
2003-12-02 22:26 ` Yasunori Goto
2003-12-01  3:41 IWAMOTO Toshihiro
2003-12-01 19:56 ` Pavel Machek
2003-12-03 19:41 ` Martin J. Bligh
2003-12-04  3:58   ` IWAMOTO Toshihiro
2003-12-04  5:38     ` Martin J. Bligh
2003-12-04 15:44       ` IWAMOTO Toshihiro
2003-12-04 17:12         ` Martin J. Bligh
2003-12-04 18:27         ` Jesse Barnes
2003-12-04 18:29           ` Martin J. Bligh
2003-12-04 18:59             ` Jesse Barnes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=B8E391BBE9FE384DAA4C5C003888BE6F4FAF4A@scsmsx401.sc.intel.com \
    --to=tony.luck@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mbligh@aracnet.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).