linux-kernel.vger.kernel.org archive mirror
From: Steve Longerbeam <stevel@mvista.com>
To: Ray Bryant <raybry@sgi.com>
Cc: Andi Kleen <ak@muc.de>, Hirokazu Takahashi <taka@valinux.co.jp>,
	Dave Hansen <haveblue@us.ibm.com>,
	Marcello Tosatti <marcelo.tosatti@cyclades.com>,
	Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>, andrew morton <akpm@osdl.org>
Subject: Re: page migration patchset
Date: Wed, 05 Jan 2005 10:41:51 -0800
Message-ID: <41DC34EF.7010507@mvista.com>
In-Reply-To: <41DB5CE9.6090505@sgi.com>

Hi everyone,

Ray Bryant wrote:

> Andi Kleen wrote:
>
>> Ray Bryant <raybry@sgi.com> writes:
>>
>>
>>> http://sr71.net/patches/2.6.10/2.6.10-mm1-mhp-test7/
>>>
>>> A number of us are interested in using the page migration patchset 
>>> by itself:
>>>
>>> (1)  Myself, for a manual page migration project I am working on.
>>>      (This is for migrating jobs from one set of nodes to another
>>>      under batch scheduler control).
>>> (2)  Marcello, for his memory defragmentation work.
>>> (3)  Of course, the memory hotplug project itself.
>>>
>>> (there are probably other "users" that I have not enumerated here).
>>
>>
>>
>> Could you coordinate that with Steve Longerbeam (cc'ed) ?
>> He has a NUMA API extension ready to be merged into -mm* that also
>> does kind of page migration when changing the policies of files.
>>
>> -Andi
>>
>>
> Yes, Steve's patch tries to move page cache pages that are found to be
> allocated in the "wrong" place.  (See remove_invalid_filemap_page() in
> his patch of 11/02/2004 on lkml).  But if the page is found to be busy,
> the code gives up, as near as I can tell.


correct, my patch is using invalidate_mapping_pages(), which doesn't wait
for a locked pagecache page.

>
> If the page migration patch were merged, Steve could call
> migrate_onepage(page,node) to move the page to the correct node, even
> if it is busy [hopefully his code can "wait" at that point, I haven't
> looked into it further to see if that is the case.]


sounds good to me. And it can wait, since remove_invalid_filemap_page()
is called at syscall time, so the syscall will just block.

>
> [This is really the page migration patch plus a small patch of
> mine that adds the node argument to migrate_onepage(), and that I
> hope will get merged into the page migration patch shortly]
>
> Other than that, I don't see a big intersection between the two patches.
> Steve, do you see anything else where we need to coordinate?


well, I need to study the page migration patch more (this is the
first time I've heard of it). But it sounds as if my patch and the
page migration patch are complementary.

>
> On the other hand, there is some work to be done wrt memory policies
> and page migration.  For the project I am working on, we need to be able
> to move all of the pages used by a process on one set of nodes to another
> set of nodes.  At some point during this process we will need to update
> the memory policy for that process.  For Steve's patch, we will
> similarly need to update the policy associated with files associated with
> the process, I would think, elsewise new pages will get allocated on the
> old set of nodes, which is something we don't want.  Sounds like some
> new interfaces will have to be developed here.  Does that make sense
> to you, Andi and Steve?


yes.

>
> My personal preference would be to keep as much of this as possible
> under user space control; that is, rather than having a big autonomous
> system call that migrates pages and then updates policy information,
> I'd prefer to split the work into several smaller system calls that
> are issued by a user space program responsible for coordinating the
> process migration as a series of steps, e.g.:
>
> (1)  suspend the process via SIGSTOP
> (2)  update the mempolicy information
> (3)  migrate the process's pages
> (4)  migrate the process to the new cpu via sched_setaffinity()
> (5)  resume the process via SIGCONT
>
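
[Editorial sketch, not part of the original message.] The five steps Ray
lists could be driven from user space roughly as below. This is only an
illustration under stated assumptions: it uses the migrate_pages(2)
system call, which reached mainline later and is not the interface of the
patchset under discussion; relocate_process() and the STEP_* flags are
hypothetical names; and step (2) is left as a comment because no call for
rewriting another process's memory policy is assumed here.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bit flags recording which step failed. */
enum {
    STEP_STOP     = 1 << 0,
    STEP_MIGRATE  = 1 << 1,
    STEP_AFFINITY = 1 << 2,
    STEP_CONT     = 1 << 3
};

/*
 * Hypothetical helper: walk the steps for one process.
 * old_nodes and new_nodes are single-word node bitmasks.
 * Returns 0 on success, otherwise an OR of the STEP_* flags that failed.
 */
int relocate_process(pid_t pid, unsigned long old_nodes,
                     unsigned long new_nodes, const cpu_set_t *new_cpus)
{
    const unsigned long maxnode = sizeof(unsigned long) * 8;
    int failed = 0;

    /* (1) Suspend the process. SIGSTOP is delivered asynchronously; a
     * real tool would confirm the stopped state before continuing. */
    if (kill(pid, SIGSTOP) != 0)
        return STEP_STOP;

    /* (2) Updating the target's memory policy is omitted: this sketch
     * assumes no interface for changing another process's policy. */

    /* (3) Ask the kernel to move the pages from old_nodes to new_nodes.
     * This can fail without NUMA support or sufficient privilege. */
#ifdef SYS_migrate_pages
    if (syscall(SYS_migrate_pages, pid, maxnode, &old_nodes, &new_nodes) < 0)
        failed |= STEP_MIGRATE;
#else
    failed |= STEP_MIGRATE;
#endif

    /* (4) Move the process to the new CPUs. */
    if (sched_setaffinity(pid, sizeof(*new_cpus), new_cpus) != 0)
        failed |= STEP_AFFINITY;

    /* (5) Always resume, even if an earlier step failed. */
    if (kill(pid, SIGCONT) != 0)
        failed |= STEP_CONT;

    return failed;
}
```

A caller would build new_cpus with CPU_ZERO()/CPU_SET() and inspect the
returned flags. Resuming unconditionally in step (5) is a design choice of
this sketch, so that a failed migration never leaves the job stopped.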

steps 2 and 3 can be accomplished by a call to mbind() and
specifying MPOL_MF_MOVE. And since mbind() takes an
address range, you could probably migrate pages and change
the policies for all of the process' mappings in a single mbind()
call.
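
[Editorial sketch, not part of the original message.] For a process acting
on its own mappings, the mbind() call described above might look like the
following. It assumes the mbind(2) interface and the MPOL_MF_MOVE flag as
they exist in today's mainline kernel, which may differ from the patch
discussed in this thread. bind_and_move() and demo_bind_and_move() are
hypothetical names, and the constants are copied from <linux/mempolicy.h>
so that the example builds without the libnuma headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values as found in the mainline uapi header <linux/mempolicy.h>. */
#ifndef MPOL_BIND
#define MPOL_BIND 2
#endif
#ifndef MPOL_MF_MOVE
#define MPOL_MF_MOVE (1 << 1)
#endif

/*
 * Hypothetical helper: bind [addr, addr + len) to one node and ask the
 * kernel to move pages already allocated elsewhere. addr must be
 * page-aligned. Returns 0 on success, -1 with errno set on failure.
 */
long bind_and_move(void *addr, size_t len, unsigned int node)
{
    unsigned long nodemask;
    const unsigned long maxnode = sizeof(nodemask) * 8;

    /* Conservative bound: keep the node bit strictly below maxnode - 1. */
    assert(node < maxnode - 1);
    nodemask = 1UL << node;

    return syscall(SYS_mbind, addr, (unsigned long)len, MPOL_BIND,
                   &nodemask, maxnode, (unsigned int)MPOL_MF_MOVE);
}

/*
 * Hypothetical demo: map four pages, touch them, then bind them to node 0.
 * Returns 0 if mbind() succeeded, 1 if the kernel refused (no NUMA
 * support, sandbox restrictions, node 0 not allowed), -1 if mmap() failed.
 */
int demo_bind_and_move(void)
{
    const size_t page = (size_t)sysconf(_SC_PAGESIZE);
    const size_t len = 4 * page;
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int rc;

    if (buf == MAP_FAILED)
        return -1;

    /* Touch each page so there is something to migrate. */
    for (size_t off = 0; off < len; off += page)
        buf[off] = 1;

    rc = (bind_and_move(buf, len, 0) == 0) ? 0 : 1;
    munmap(buf, len);
    return rc;
}
```

Because mbind() acts only on the calling process's address space, this
covers steps 2 and 3 when the process migrates itself. An external
coordinator would need some other interface to reach another process.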

Note that Andrew had to drop my patch from 2.6.10, because
the 4-level page tables feature was re-implemented using a
different interface, which broke my patch. So Andrew asked me
to re-do the patch for inclusion in 2.6.11. That gives us ~2 months
to work on integrating the page migration and NUMA mempolicy
filemap patches.

Ray, btw it is beneficial that I can work with you on this, because I have
no access to true NUMA machines. My testing of the filemap mempolicy
patch has only been on a UP discontiguous memory system. I assume
you've got access to Altix machines at SGI to do testing and benchmarking
of my filemap patch and your migration patches.

Steve

