From: IWAMOTO Toshihiro <iwamoto@valinux.co.jp>
To: "Martin J. Bligh" <mbligh@aracnet.com>
Cc: IWAMOTO Toshihiro <iwamoto@valinux.co.jp>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: memory hotremove prototype, take 3
Date: Fri, 05 Dec 2003 00:44:06 +0900
Message-ID: <20031204154406.7FC587007A@sv1.valinux.co.jp>
In-Reply-To: <152440000.1070516333@[10.10.2.4]>
At Wed, 03 Dec 2003 21:38:54 -0800,
Martin J. Bligh <mbligh@aracnet.com> wrote:
> > My target is somewhat NUMA-ish and fairly large. So I'm not sure if
> > CONFIG_NONLINEAR fits, but CONFIG_NUMA isn't perfect either.
>
> If your target is NUMA, then you really, really need CONFIG_NONLINEAR.
> We don't support multiple pgdats per node, nor do I wish to, as it'll
> make an unholy mess ;-). With CONFIG_NONLINEAR, the discontiguities
> within a node are buried down further, so we have much less complexity
> to deal with from the main VM. The abstraction also keeps the poor
> VM engineers trying to read / write the code saner via simplicity ;-)
IIRC, memory is contiguous within a NUMA node. I think Goto-san will
clarify this issue once his code is ready. :-)
> WRT generic discontigmem support (not NUMA), doing that via pgdats
> should really go away, as there's no real difference between the
> chunks of physical memory as far as the page allocator is concerned.
> The plan is to use Daniel's nonlinear stuff to replace that, and keep
> the pgdats strictly for NUMA. Same would apply to hotpluggable zones -
> I'd hate to end up with 512 pgdats of stuff that are really all the
> same memory types underneath.
Yes. Unnecessary zone rebalancing would suck.
> The real issue you have is the mapping of the struct pages - if we can
> achieve a non-contig mapping of the mem_map / lmem_map array, we should
> be able to take memory on and offline reasonably easy. If you're willing
> for a first implementation to pre-allocate the struct page array for
> every possible virtual address, it makes life a lot easier.
Preallocating the struct page array isn't feasible for the target
system because the ratio of maximum to minimum memory is large.
Our plan is to use the beginning (or the end) of the memory block being
hotplugged. If a 2GB memory block is added, the first ~20MB is used for
the struct page array for the rest of the memory block.
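The ~20MB figure can be checked with a small user-space calculation.
This is only a sketch: the 4KB page size and the 40-byte struct page
are assumptions chosen because they reproduce the number above, and the
helper names are hypothetical, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, not taken from the patch: 4KB pages and a 40-byte
 * struct page. Both are illustrative and vary by architecture and
 * kernel configuration. */
#define ASSUMED_PAGE_SIZE        4096ULL
#define ASSUMED_STRUCT_PAGE_SIZE 40ULL

/* Bytes of struct page array needed to describe block_bytes of memory.
 * The block is assumed to be a whole number of pages. */
static uint64_t mem_map_bytes(uint64_t block_bytes)
{
    assert(block_bytes % ASSUMED_PAGE_SIZE == 0);
    return (block_bytes / ASSUMED_PAGE_SIZE) * ASSUMED_STRUCT_PAGE_SIZE;
}

/* Number of pages at the start (or end) of the block that the array
 * itself would occupy, rounded up to a whole page. */
static uint64_t mem_map_pages(uint64_t block_bytes)
{
    return (mem_map_bytes(block_bytes) + ASSUMED_PAGE_SIZE - 1)
           / ASSUMED_PAGE_SIZE;
}
```

Under these assumptions, a 2GB block (2ULL << 30 bytes) has 524288
pages, so the array needs 524288 * 40 = 20971520 bytes, which is 20MB
or 5120 pages. This slightly overestimates the need, since the pages
holding the array are counted as well.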
> >> PS. What's this bit of the patch for?
> >>
> >> void *vmalloc(unsigned long size)
> >> {
> >> +#ifdef CONFIG_MEMHOTPLUGTEST
> >> + return __vmalloc(size, GFP_KERNEL, PAGE_KERNEL);
> >> +#else
> >> return __vmalloc(size, GFP_KERNEL | __GFP_HIGHMEM, PAGE_KERNEL);
> >> +#endif
> >> }
> >
> > This is necessary because kernel memory cannot be swapped out.
> > Only highmem can be hot removed, though it doesn't need to be highmem.
> > We can define another zone attribute such as GFP_HOTPLUGGABLE.
>
> You could just lock the pages, I'd think? I don't see at a glance
> exactly what you were using this for, but would that work?
I haven't seriously considered implementing support for vmalloc'd
memory, but I guess that would be too complicated, if not impossible.
Making kernel threads or interrupt handlers block on memory access
sounds very difficult to me.
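For reference, the effect of the vmalloc() hunk quoted above can be
modeled in user space as follows. This is a simplified sketch: the flag
values and function names are made up for illustration and are not the
kernel's real definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; these are NOT the kernel's real values. */
#define SKETCH_GFP_KERNEL  0x01u
#define SKETCH_GFP_HIGHMEM 0x02u

/* Mirrors the #ifdef in the quoted vmalloc() hunk: when hot removal is
 * being tested, vmalloc'd pages are kept out of highmem, because in the
 * prototype highmem is the zone that can be removed. */
static unsigned int sketch_vmalloc_gfp(bool memhotplug_test)
{
    if (memhotplug_test)
        return SKETCH_GFP_KERNEL;
    return SKETCH_GFP_KERNEL | SKETCH_GFP_HIGHMEM;
}

/* True if an allocation with this mask may land in the removable zone. */
static bool sketch_may_use_removable_zone(unsigned int gfp)
{
    /* Sanity check: only the two sketch bits are defined here. */
    assert((gfp & ~(SKETCH_GFP_KERNEL | SKETCH_GFP_HIGHMEM)) == 0);
    return (gfp & SKETCH_GFP_HIGHMEM) != 0;
}
```

With the test option enabled, the mask never includes the highmem bit,
so vmalloc'd pages cannot end up in memory that is later removed. A
dedicated attribute such as the GFP_HOTPLUGGABLE mentioned above would
play the role that the highmem bit plays in this sketch.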
--
IWAMOTO Toshihiro