From: Jerome Glisse <jglisse@redhat.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Evgeny Baskakov <ebaskakov@nvidia.com>,
	Dan Williams <dan.j.williams@intel.com>,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>
Subject: Re: [HMM 07/15] mm/ZONE_DEVICE: new type of ZONE_DEVICE for unaddressable memory v3
Date: Mon, 12 Jun 2017 13:57:41 -0400
Message-ID: <20170612175741.GA4696@redhat.com>
In-Reply-To: <9d4efdd1-1a76-27e2-5e6b-86bfe13b9865@nvidia.com>

On Thu, Jun 08, 2017 at 08:55:05PM -0700, John Hubbard wrote:
> On 05/24/2017 10:20 AM, Jérôme Glisse wrote:
> [...8<...]
> > +#if IS_ENABLED(CONFIG_DEVICE_PRIVATE)
> > +int device_private_entry_fault(struct vm_area_struct *vma,
> > +		       unsigned long addr,
> > +		       swp_entry_t entry,
> > +		       unsigned int flags,
> > +		       pmd_t *pmdp)
> > +{
> > +	struct page *page = device_private_entry_to_page(entry);
> > +
> > +	/*
> > +	 * The page_fault() callback must migrate page back to system memory
> > +	 * so that CPU can access it. This might fail for various reasons
> > +	 * (device issue, device was unsafely unplugged, ...). When such
> > +	 * error conditions happen, the callback must return VM_FAULT_SIGBUS.
> > +	 *
> > +	 * Note that because memory cgroup charges are accounted to the device
> > +	 * memory, this should never fail because of memory restrictions (but
> > +	 * allocation of regular system page might still fail because we are
> > +	 * out of memory).
> > +	 *
> > +	 * There is a more in-depth description of what that callback can and
> > +	 * cannot do, in include/linux/memremap.h
> > +	 */
> > +	return page->pgmap->page_fault(vma, addr, page, flags, pmdp);
> > +}
> > +EXPORT_SYMBOL(device_private_entry_fault);
> > +#endif /* CONFIG_DEVICE_PRIVATE */
> > +
> >   static void pgmap_radix_release(struct resource *res)
> >   {
> >   	resource_size_t key, align_start, align_size, align_end;
> > @@ -321,6 +351,10 @@ void *devm_memremap_pages(struct device *dev, struct resource *res,
> >   	}
> >   	pgmap->ref = ref;
> >   	pgmap->res = &page_map->res;
> > +	pgmap->type = MEMORY_DEVICE_PUBLIC;
> > +	pgmap->page_fault = NULL;
> > +	pgmap->page_free = NULL;
> > +	pgmap->data = NULL;
> >   	mutex_lock(&pgmap_lock);
> >   	error = 0;
> > diff --git a/mm/Kconfig b/mm/Kconfig
> > index d744cff..f5357ff 100644
> > --- a/mm/Kconfig
> > +++ b/mm/Kconfig
> > @@ -736,6 +736,19 @@ config ZONE_DEVICE
> >   	  If FS_DAX is enabled, then say Y.
> > +config DEVICE_PRIVATE
> > +	bool "Unaddressable device memory (GPU memory, ...)"
> > +	depends on X86_64
> > +	depends on ZONE_DEVICE
> > +	depends on MEMORY_HOTPLUG
> > +	depends on MEMORY_HOTREMOVE
> > +	depends on SPARSEMEM_VMEMMAP
> > +
> > +	help
> > +	  Allows creation of struct pages to represent unaddressable device
> > +	  memory; i.e., memory that is only accessible from the device (or
> > +	  group of devices).
> > +
> 
> Hi Jerome,
> 
> CONFIG_DEVICE_PRIVATE has caused me some problems, because it's not coupled to HMM_DEVMEM.
> 
> To fix this, my first choice would be to just s/DEVICE_PRIVATE/HMM_DEVMEM/g,
> because I don't see any value to DEVICE_PRIVATE as an independent Kconfig
> choice. It's complicating the Kconfig choices, and adding problems. However,
> if DEVICE_PRIVATE must be kept, then something like this also fixes my HMM
> tests:


A "depends on" would be better, so that you cannot select HMM_DEVMEM if you do
not have DEVICE_PRIVATE. But maybe these can be merged under one config option;
I do not have any strong preference personally. HMM_DEVMEM just enables helper
code that makes using CONFIG_DEVICE_PRIVATE easier for device drivers, but it is
not strictly needed, i.e. a device driver can reimplement what HMM_DEVMEM provides.
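
For illustration, the "depends on" variant might look roughly like the
fragment below. This is only a sketch against the HMM_DEVMEM entry shown in
the quoted patch, not a tested change; the placement of the new line is an
assumption and the help text is left out.

```
config HMM_DEVMEM
	bool "HMM device memory helpers (to leverage ZONE_DEVICE)"
	depends on ARCH_HAS_HMM
	# Assumed addition: HMM_DEVMEM is only offered once DEVICE_PRIVATE is on
	depends on DEVICE_PRIVATE
	select HMM
```

With "depends on", HMM_DEVMEM is only visible when DEVICE_PRIVATE and its own
dependencies (X86_64, ZONE_DEVICE, ...) are already satisfied, whereas "select"
forces the selected symbol on without checking that symbol's dependencies.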

I might just merge this kernel option as part of the CDM patchset that I am
about to send.

Cheers,
Jérôme

> 
> From: John Hubbard <jhubbard@nvidia.com>
> Date: Thu, 8 Jun 2017 20:13:13 -0700
> Subject: [PATCH] hmm: select CONFIG_DEVICE_PRIVATE with HMM_DEVMEM
> 
> The HMM_DEVMEM feature is useless without the various
> features that are guarded with CONFIG_DEVICE_PRIVATE.
> Therefore, auto-select DEVICE_PRIVATE when selecting
> HMM_DEVMEM.
> 
> Otherwise, you can easily end up with a partially
> working HMM installation: if you select HMM_DEVMEM,
> but do not select DEVICE_PRIVATE, then faulting and
> migrating to a device (such as a GPU) works, but CPU
> page faults are ignored, so the page never migrates
> back to the CPU.
> 
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  mm/Kconfig | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 46296d5d7570..23d2f5ec865e 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -318,6 +318,8 @@ config HMM_DEVMEM
>  	bool "HMM device memory helpers (to leverage ZONE_DEVICE)"
>  	depends on ARCH_HAS_HMM
>  	select HMM
> +	select DEVICE_PRIVATE
> +
>  	help
>  	  HMM devmem is a set of helper routines to leverage the ZONE_DEVICE
>  	  feature. This is just to avoid having device drivers to replicating a lot
> -- 
> 2.13.1
> 
> This is a minor thing, and I don't think this needs to hold up merging HMM
> v23 into -mm, IMHO. But I would like it fixed at some point.
> 
> thanks,
> --
> John Hubbard
> NVIDIA
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


