linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: IWAMOTO Toshihiro <iwamoto@valinux.co.jp>
To: "Martin J. Bligh" <mbligh@aracnet.com>
Cc: IWAMOTO Toshihiro <iwamoto@valinux.co.jp>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: memory hotremove prototype, take 3
Date: Fri, 05 Dec 2003 00:44:06 +0900	[thread overview]
Message-ID: <20031204154406.7FC587007A@sv1.valinux.co.jp> (raw)
In-Reply-To: <152440000.1070516333@[10.10.2.4]>

At Wed, 03 Dec 2003 21:38:54 -0800,
Martin J. Bligh <mbligh@aracnet.com> wrote:
> > My target is somewhat NUMA-ish and fairly large.  So I'm not sure if
> > CONFIG_NONLINEAR fits, but CONFIG_NUMA isn't perfect either.
> 
> If your target is NUMA, then you really, really need CONFIG_NONLINEAR.
> We don't support multiple pgdats per node, nor do I wish to, as it'll
> make an unholy mess ;-). With CONFIG_NONLINEAR, the discontiguities
> within a node are buried down further, so we have much less complexity
> to deal with from the main VM. The abstraction also keeps the poor
> VM engineers trying to read / write the code saner via simplicity ;-)

IIRC, memory is contiguous within a NUMA node.  I think Goto-san will
clarify this issue when his code gets ready. :-)

> WRT generic discontigmem support (not NUMA), doing that via pgdats
> should really go away, as there's no real difference between the 
> chunks of physical memory as far as the page allocator is concerned.
> The plan is to use Daniel's nonlinear stuff to replace that, and keep
> the pgdats strictly for NUMA. Same would apply to hotpluggable zones - 
> I'd hate to end up with 512 pgdats of stuff that are really all the
> same memory types underneath.

Yes. Unnecessary zone rebalancing would suck.

> The real issue you have is the mapping of the struct pages - if we can
> acheive a non-contig mapping of the mem_map / lmem_map array, we should
> be able to take memory on and offline reasonably easy. If you're willing
> for a first implementation to pre-allocate the struct page array for 
> every possible virtual address, it makes life a lot easier.

Preallocating struct page array isn't feasible for the target system
because max memory / min memory ratio is large.
Our plan is to use the beginning (or the end) of the memory block being
hotplugged.  If a 2GB memory block is added, first ~20MB is used for
the struct page array for the rest of the memory block.


> >> PS. What's this bit of the patch for?
> >> 
> >>  void *vmalloc(unsigned long size)
> >>  {
> >> +#ifdef CONFIG_MEMHOTPLUGTEST
> >> +       return __vmalloc(size, GFP_KERNEL, PAGE_KERNEL);
> >> +#else
> >>         return __vmalloc(size, GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL);
> >> +#endif
> >>  }
> > 
> > This is necessary because kernel memory cannot be swapped out.
> > Only highmem can be hot removed, though it doesn't need to be highmem.
> > We can define another zone attribute such as GFP_HOTPLUGGABLE.
> 
> You could just lock the pages, I'd think? I don't see at a glance
> exactly what you were using this for, but would that work?

I haven't seriously considered to implement vmalloc'd memory, but I
guess that would be too complicated if not impossible.
Making kernel threads or interrupt handlers block on memory access
sound very difficult to me.

--
IWAMOTO Toshihiro

  reply	other threads:[~2003-12-04 15:44 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-12-01  3:41 memory hotremove prototype, take 3 IWAMOTO Toshihiro
2003-12-01 19:56 ` Pavel Machek
2003-12-03 19:41 ` Martin J. Bligh
2003-12-04  3:58   ` IWAMOTO Toshihiro
2003-12-04  5:38     ` Martin J. Bligh
2003-12-04 15:44       ` IWAMOTO Toshihiro [this message]
2003-12-04 17:12         ` Martin J. Bligh
2003-12-04 18:27         ` Jesse Barnes
2003-12-04 18:29           ` Martin J. Bligh
2003-12-04 18:59             ` Jesse Barnes
2003-12-01 20:12 Luck, Tony
2003-12-02  3:01 ` IWAMOTO Toshihiro
2003-12-02  6:43   ` Hirokazu Takahashi
2003-12-02 22:26 ` Yasunori Goto
2003-12-03  5:19 Perez-Gonzalez, Inaky
2003-12-03 17:57 Luck, Tony
2003-12-10  0:45 Luck, Tony

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20031204154406.7FC587007A@sv1.valinux.co.jp \
    --to=iwamoto@valinux.co.jp \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mbligh@aracnet.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).