linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jerome Glisse <jglisse@redhat.com>
To: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, John Hubbard <jhubbard@nvidia.com>,
	Russell King <linux@armlinux.org.uk>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Yoshinori Sato <ysato@users.sourceforge.jp>,
	Rich Felker <dalias@libc.org>,
	Chris Metcalf <cmetcalf@mellanox.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [HMM v13 01/18] mm/memory/hotplug: convert device parameter bool to set of flags
Date: Mon, 21 Nov 2016 07:27:40 -0500	[thread overview]
Message-ID: <20161121122740.GB2392@redhat.com> (raw)
In-Reply-To: <5832972E.1050405@linux.vnet.ibm.com>

On Mon, Nov 21, 2016 at 12:11:50PM +0530, Anshuman Khandual wrote:
> On 11/18/2016 11:48 PM, Jérôme Glisse wrote:

[...]

> > @@ -956,7 +963,7 @@ kernel_physical_mapping_remove(unsigned long start, unsigned long end)
> >  	remove_pagetable(start, end, true);
> >  }
> >  
> > -int __ref arch_remove_memory(u64 start, u64 size)
> > +int __ref arch_remove_memory(u64 start, u64 size, int flags)
> >  {
> >  	unsigned long start_pfn = start >> PAGE_SHIFT;
> >  	unsigned long nr_pages = size >> PAGE_SHIFT;
> > @@ -965,6 +972,12 @@ int __ref arch_remove_memory(u64 start, u64 size)
> >  	struct zone *zone;
> >  	int ret;
> >  
> > +	/* Need to add support for device and unaddressable memory if needed */
> > +	if (flags & MEMORY_UNADDRESSABLE) {
> > +		BUG();
> > +		return -EINVAL;
> > +	}
> > +
> >  	/* With altmap the first mapped page is offset from @start */
> >  	altmap = to_vmem_altmap((unsigned long) page);
> >  	if (altmap)
> 
> So with this patch none of the architectures support un-addressable
> memory but then support will be added through later patches ?
> zone_for_memory function's flag now takes MEMORY_DEVICE parameter.
> Then we need to change all the previous ZONE_DEVICE changes which
> ever took "for_device" to accommodate this new flag ? just curious.

Yes correct.


> > diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> > index 01033fa..ba9b12e 100644
> > --- a/include/linux/memory_hotplug.h
> > +++ b/include/linux/memory_hotplug.h
> > @@ -103,7 +103,7 @@ extern bool memhp_auto_online;
> >  
> >  #ifdef CONFIG_MEMORY_HOTREMOVE
> >  extern bool is_pageblock_removable_nolock(struct page *page);
> > -extern int arch_remove_memory(u64 start, u64 size);
> > +extern int arch_remove_memory(u64 start, u64 size, int flags);
> >  extern int __remove_pages(struct zone *zone, unsigned long start_pfn,
> >  	unsigned long nr_pages);
> >  #endif /* CONFIG_MEMORY_HOTREMOVE */
> > @@ -275,7 +275,20 @@ extern int add_memory(int nid, u64 start, u64 size);
> >  extern int add_memory_resource(int nid, struct resource *resource, bool online);
> >  extern int zone_for_memory(int nid, u64 start, u64 size, int zone_default,
> >  		bool for_device);
> > -extern int arch_add_memory(int nid, u64 start, u64 size, bool for_device);
> > +
> > +/*
> > + * For device memory we want more informations than just knowing it is device
> > + * memory. We want to know if we can migrate it (ie it is not storage memory
> > + * use by DAX). Is it addressable by the CPU ? Some device memory like GPU
> > + * memory can not be access by CPU but we still want struct page so that we
> > + * can use it like regular memory.
> 
> Some typos here. Needs to be cleaned up as well. But please have a
> look at comment below over the classification itself.
> 
> > + */
> > +#define MEMORY_FLAGS_NONE 0
> > +#define MEMORY_DEVICE (1 << 0)
> > +#define MEMORY_MOVABLE (1 << 1)
> > +#define MEMORY_UNADDRESSABLE (1 << 2)
> 
> It should be DEVICE_MEMORY_* instead of MEMORY_* as we are trying to
> classify device memory (though they are represented with struct page)
> not regular system ram memory. This should attempt to classify device
> memory which is backed by struct pages. arch_add_memory/arch_remove
> _memory does not come into play if it's traditional device memory
> which is just PFN and does not have struct page associated with it.

Good idea i will change that.


> Broadly they are either CPU accessible or in-accessible. Storage
> memory like persistent memory represented though ZONE_DEVICE fall
> under the accessible (coherent) category. IIUC right now they are
> not movable because page->pgmap replaces page->lru in struct page
> hence its inability to be on standard LRU lists as one of the
> reasons. As there was a need to have struct page to exploit more
> core VM features on these memory going forward it will have to be
> migratable one way or the other to accommodate features like
> compaction, HW poison etc in these storage memory. Hence my point
> here is lets not classify any of these memories as non-movable.
> Just addressable or not should be the only classification.

Being on the lru or not is not and issue in respect to migration. Being
on the lru was use as an indication that the page is manage through the
standard mm code and thus that many assumptions hold which in turn do
allow migration. But if one use device memory following all rules of
regular memory then migration can be done to no matter if page is on
lru or not.

I still think that the MOVABLE is an important distinction as i am pretty
sure that the persistent folks do not want to see their page migrated in
anyway. I might rename it to DEVICE_MEMORY_ALLOW_MIGRATION.

Cheers,
Jérôme 

  reply	other threads:[~2016-11-21 12:27 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-18 18:18 [HMM v13 00/18] HMM (Heterogeneous Memory Management) v13 Jérôme Glisse
2016-11-18 18:18 ` [HMM v13 01/18] mm/memory/hotplug: convert device parameter bool to set of flags Jérôme Glisse
2016-11-21  0:44   ` Balbir Singh
2016-11-21  4:53     ` Jerome Glisse
2016-11-21  6:57       ` Anshuman Khandual
2016-11-21 12:19         ` Jerome Glisse
2016-11-21  6:41   ` Anshuman Khandual
2016-11-21 12:27     ` Jerome Glisse [this message]
2016-11-22  5:35       ` Anshuman Khandual
2016-11-22 14:08         ` Jerome Glisse
2016-11-18 18:18 ` [HMM v13 02/18] mm/ZONE_DEVICE/unaddressable: add support for un-addressable device memory Jérôme Glisse
2016-11-21  8:06   ` Anshuman Khandual
2016-11-21 12:33     ` Jerome Glisse
2016-11-22  5:15       ` Anshuman Khandual
2016-11-18 18:18 ` [HMM v13 03/18] mm/ZONE_DEVICE/free_hot_cold_page: catch ZONE_DEVICE pages Jérôme Glisse
2016-11-21  8:18   ` Anshuman Khandual
2016-11-21 12:50     ` Jerome Glisse
2016-11-22  4:30       ` Anshuman Khandual
2016-11-18 18:18 ` [HMM v13 04/18] mm/ZONE_DEVICE/free-page: callback when page is freed Jérôme Glisse
2016-11-21  1:49   ` Balbir Singh
2016-11-21  4:57     ` Jerome Glisse
2016-11-21  8:26   ` Anshuman Khandual
2016-11-21 12:34     ` Jerome Glisse
2016-11-22  5:02       ` Anshuman Khandual
2016-11-18 18:18 ` [HMM v13 05/18] mm/ZONE_DEVICE/devmem_pages_remove: allow early removal of device memory Jérôme Glisse
2016-11-21 10:37   ` Anshuman Khandual
2016-11-21 12:39     ` Jerome Glisse
2016-11-22  4:54       ` Anshuman Khandual
2016-11-18 18:18 ` [HMM v13 06/18] mm/ZONE_DEVICE/unaddressable: add special swap for unaddressable Jérôme Glisse
2016-11-21  2:06   ` Balbir Singh
2016-11-21  5:05     ` Jerome Glisse
2016-11-22  2:19       ` Balbir Singh
2016-11-22 13:59         ` Jerome Glisse
2016-11-21 11:10     ` Anshuman Khandual
2016-11-21 10:58   ` Anshuman Khandual
2016-11-21 12:42     ` Jerome Glisse
2016-11-22  4:48       ` Anshuman Khandual
2016-11-24 13:56         ` Jerome Glisse
2016-11-18 18:18 ` [HMM v13 07/18] mm/ZONE_DEVICE/x86: add support for un-addressable device memory Jérôme Glisse
2016-11-21  2:08   ` Balbir Singh
2016-11-21  5:08     ` Jerome Glisse
2016-11-18 18:18 ` [HMM v13 08/18] mm/hmm: heterogeneous memory management (HMM for short) Jérôme Glisse
2016-11-21  2:29   ` Balbir Singh
2016-11-21  5:14     ` Jerome Glisse
2016-11-23  4:03   ` Anshuman Khandual
2016-11-27 13:10     ` Jerome Glisse
2016-11-28  2:58       ` Anshuman Khandual
2016-11-28  9:41         ` Jerome Glisse
2016-11-18 18:18 ` [HMM v13 09/18] mm/hmm/mirror: mirror process address space on device with HMM helpers Jérôme Glisse
2016-11-21  2:42   ` Balbir Singh
2016-11-21  5:18     ` Jerome Glisse
2016-11-18 18:18 ` [HMM v13 10/18] mm/hmm/mirror: add range lock helper, prevent CPU page table update for the range Jérôme Glisse
2016-11-18 18:18 ` [HMM v13 11/18] mm/hmm/mirror: add range monitor helper, to monitor CPU page table update Jérôme Glisse
2016-11-18 18:18 ` [HMM v13 12/18] mm/hmm/mirror: helper to snapshot CPU page table Jérôme Glisse
2016-11-18 18:18 ` [HMM v13 13/18] mm/hmm/mirror: device page fault handler Jérôme Glisse
2016-11-18 18:18 ` [HMM v13 14/18] mm/hmm/migrate: support un-addressable ZONE_DEVICE page in migration Jérôme Glisse
2016-11-18 18:18 ` [HMM v13 15/18] mm/hmm/migrate: add new boolean copy flag to migratepage() callback Jérôme Glisse
2016-11-18 18:18 ` [HMM v13 16/18] mm/hmm/migrate: new memory migration helper for use with device memory Jérôme Glisse
2016-11-18 19:57   ` Aneesh Kumar K.V
2016-11-18 20:15     ` Jerome Glisse
2016-11-19 14:32   ` Aneesh Kumar K.V
2016-11-19 17:17     ` Jerome Glisse
2016-11-20 18:21       ` Aneesh Kumar K.V
2016-11-20 20:06         ` Jerome Glisse
2016-11-21  3:30   ` Balbir Singh
2016-11-21  5:31     ` Jerome Glisse
2016-11-18 18:18 ` [HMM v13 17/18] mm/hmm/devmem: device driver helper to hotplug ZONE_DEVICE memory Jérôme Glisse
2016-11-18 18:18 ` [HMM v13 18/18] mm/hmm/devmem: dummy HMM device as an helper for " Jérôme Glisse
2016-11-19  0:41 ` [HMM v13 00/18] HMM (Heterogeneous Memory Management) v13 John Hubbard
2016-11-19 14:50   ` Aneesh Kumar K.V
2016-11-23  9:16 ` Haggai Eran
2016-11-25 16:16   ` Jerome Glisse
2016-11-27 13:27     ` Haggai Eran

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161121122740.GB2392@redhat.com \
    --to=jglisse@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=cmetcalf@mellanox.com \
    --cc=dalias@libc.org \
    --cc=heiko.carstens@de.ibm.com \
    --cc=hpa@zytor.com \
    --cc=jhubbard@nvidia.com \
    --cc=khandual@linux.vnet.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@armlinux.org.uk \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=paulus@samba.org \
    --cc=schwidefsky@de.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=ysato@users.sourceforge.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).