linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_*
@ 2021-05-27 23:08 Felix Kuehling
  2021-05-27 23:08 ` [RFC PATCH 1/5] drm/amdkfd: add SPM support for SVM Felix Kuehling
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Felix Kuehling @ 2021-05-27 23:08 UTC (permalink / raw)
  To: felix.kuehling, akpm, linux-mm; +Cc: hch, jglisse, jgg, dri-devel, amd-gfx

AMD is building a system architecture for the Frontier supercomputer with
a coherent interconnect between CPUs and GPUs. This hardware architecture
allows the CPUs to coherently access GPU device memory. We have hardware
in our labs and we are working with our partner HPE on the BIOS, firmware
and software for delivery to the DOE.

The system BIOS advertises the GPU device memory (aka VRAM) as SPM
(special purpose memory) in the UEFI system address map. The amdgpu driver
looks it up with lookup_resource and registers it with devmap as
MEMORY_DEVICE_GENERIC using devm_memremap_pages.

Now we're trying to migrate data to and from that memory using the
migrate_vma_* helpers so we can support page-based migration in our
unified memory allocations, while also supporting CPU access to those
pages.

This patch series makes a few changes to make MEMORY_DEVICE_GENERIC pages
behave correctly in the migrate_vma_* helpers. We are looking for feedback
about this approach. If we're close, what's needed to make our patches
acceptable upstream? If we're not close, any suggestions how else to
achieve what we are trying to do (i.e. page migration and coherent CPU
access to VRAM)?

This work is based on HMM and our SVM memory manager that was recently
upstreamed to Dave Airlie's drm-next branch
[https://cgit.freedesktop.org/drm/drm/log/?h=drm-next]. On top of that we
did some rework of our VRAM management for migrations to remove some
incorrect assumptions, allow partially successful migrations and GPU
memory mappings that mix pages in VRAM and system memory.
[https://patchwork.kernel.org/project/dri-devel/list/?series=489811]

In this RFC, patches 1 and 2 are for context to show how we are looking up
the SPM memory and registering it with devmap.

Patches 3-5 are the changes we are trying to upstream or rework to make
them acceptable upstream.

Alex Sierra (5):
  drm/amdkfd: add SPM support for SVM
  drm/amdkfd: generic type as sys mem on migration to ram
  include/linux/mm.h: helper to check zone device generic type
  mm: add generic type support for device zone page migration
  mm: changes to unref pages with Generic type

 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 15 +++++++++++----
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h     |  1 -
 include/linux/mm.h                       |  8 ++++++++
 kernel/resource.c                        |  2 +-
 mm/memremap.c                            |  5 ++++-
 mm/migrate.c                             | 13 ++++++++-----
 6 files changed, 32 insertions(+), 12 deletions(-)

-- 
2.31.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC PATCH 1/5] drm/amdkfd: add SPM support for SVM
  2021-05-27 23:08 [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Felix Kuehling
@ 2021-05-27 23:08 ` Felix Kuehling
  2021-05-29  6:38   ` Christoph Hellwig
  2021-05-27 23:08 ` [RFC PATCH 2/5] drm/amdkfd: generic type as sys mem on migration to ram Felix Kuehling
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Felix Kuehling @ 2021-05-27 23:08 UTC (permalink / raw)
  To: felix.kuehling, akpm, linux-mm
  Cc: hch, jglisse, jgg, dri-devel, amd-gfx, Alex Sierra

From: Alex Sierra <alex.sierra@amd.com>

When CPU is connected throug XGMI, it has coherent
access to VRAM resource. In this case that resource
is taken from a table in the device gmc aperture base.
This resource is used along with the device type, which could
be DEVICE_PRIVATE or DEVICE_GENERIC to create the device
page map region.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 12 +++++++++---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.h     |  1 -
 kernel/resource.c                        |  2 +-
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index c8ca3252cbc2..f5939449a99f 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -895,6 +895,7 @@ int svm_migrate_init(struct amdgpu_device *adev)
 	struct resource *res;
 	unsigned long size;
 	void *r;
+	bool xgmi_connected_to_cpu = adev->gmc.xgmi.connected_to_cpu;
 
 	/* Page migration works on Vega10 or newer */
 	if (kfddev->device_info->asic_family < CHIP_VEGA10)
@@ -907,17 +908,22 @@ int svm_migrate_init(struct amdgpu_device *adev)
 	 * should remove reserved size
 	 */
 	size = ALIGN(adev->gmc.real_vram_size, 2ULL << 20);
-	res = devm_request_free_mem_region(adev->dev, &iomem_resource, size);
+	if (xgmi_connected_to_cpu)
+		res = lookup_resource(&iomem_resource, adev->gmc.aper_base);
+	else
+		res = devm_request_free_mem_region(adev->dev, &iomem_resource, size);
+
 	if (IS_ERR(res))
 		return -ENOMEM;
 
-	pgmap->type = MEMORY_DEVICE_PRIVATE;
 	pgmap->nr_range = 1;
 	pgmap->range.start = res->start;
 	pgmap->range.end = res->end;
+	pgmap->type = xgmi_connected_to_cpu ?
+				MEMORY_DEVICE_GENERIC : MEMORY_DEVICE_PRIVATE;
 	pgmap->ops = &svm_migrate_pgmap_ops;
 	pgmap->owner = SVM_ADEV_PGMAP_OWNER(adev);
-	pgmap->flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
+	pgmap->flags = 0;
 	r = devm_memremap_pages(adev->dev, pgmap);
 	if (IS_ERR(r)) {
 		pr_err("failed to register HMM device memory\n");
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
index 21f693767a0d..3881a93192ed 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.h
@@ -38,7 +38,6 @@
 #define SVM_RANGE_VRAM_DOMAIN (1UL << 0)
 #define SVM_ADEV_PGMAP_OWNER(adev)\
 			((adev)->hive ? (void *)(adev)->hive : (void *)(adev))
-
 struct svm_range_bo {
 	struct amdgpu_bo		*bo;
 	struct kref			kref;
diff --git a/kernel/resource.c b/kernel/resource.c
index 627e61b0c124..da137553b83e 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -783,7 +783,7 @@ struct resource *lookup_resource(struct resource *root, resource_size_t start)
 
 	return res;
 }
-
+EXPORT_SYMBOL(lookup_resource);
 /*
  * Insert a resource into the resource tree. If successful, return NULL,
  * otherwise return the conflicting resource (compare to __request_resource())
-- 
2.31.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC PATCH 2/5] drm/amdkfd: generic type as sys mem on migration to ram
  2021-05-27 23:08 [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Felix Kuehling
  2021-05-27 23:08 ` [RFC PATCH 1/5] drm/amdkfd: add SPM support for SVM Felix Kuehling
@ 2021-05-27 23:08 ` Felix Kuehling
  2021-05-27 23:08 ` [RFC PATCH 3/5] include/linux/mm.h: helper to check zone device generic type Felix Kuehling
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Felix Kuehling @ 2021-05-27 23:08 UTC (permalink / raw)
  To: felix.kuehling, akpm, linux-mm
  Cc: hch, jglisse, jgg, dri-devel, amd-gfx, Alex Sierra

From: Alex Sierra <alex.sierra@amd.com>

Generic device type memory on VRAM to RAM migration,
has similar access as System RAM from the CPU. This flag sets
the source from the sender. Which in Generic type case,
should be set as SYSTEM.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
index f5939449a99f..7b41006c1164 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
@@ -653,8 +653,9 @@ svm_migrate_vma_to_ram(struct amdgpu_device *adev, struct svm_range *prange,
 	migrate.vma = vma;
 	migrate.start = start;
 	migrate.end = end;
-	migrate.flags = MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
 	migrate.pgmap_owner = SVM_ADEV_PGMAP_OWNER(adev);
+	migrate.flags = adev->gmc.xgmi.connected_to_cpu ?
+			MIGRATE_VMA_SELECT_SYSTEM : MIGRATE_VMA_SELECT_DEVICE_PRIVATE;
 
 	size = 2 * sizeof(*migrate.src) + sizeof(uint64_t) + sizeof(dma_addr_t);
 	size *= npages;
-- 
2.31.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC PATCH 3/5] include/linux/mm.h: helper to check zone device generic type
  2021-05-27 23:08 [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Felix Kuehling
  2021-05-27 23:08 ` [RFC PATCH 1/5] drm/amdkfd: add SPM support for SVM Felix Kuehling
  2021-05-27 23:08 ` [RFC PATCH 2/5] drm/amdkfd: generic type as sys mem on migration to ram Felix Kuehling
@ 2021-05-27 23:08 ` Felix Kuehling
  2021-05-27 23:08 ` [RFC PATCH 4/5] mm: add generic type support for device zone page migration Felix Kuehling
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Felix Kuehling @ 2021-05-27 23:08 UTC (permalink / raw)
  To: felix.kuehling, akpm, linux-mm
  Cc: hch, jglisse, jgg, dri-devel, amd-gfx, Alex Sierra

From: Alex Sierra <alex.sierra@amd.com>

Helper to check if zone device page is generic type.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
---
 include/linux/mm.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c9900aedc195..1af7b9b76948 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1158,6 +1158,13 @@ static inline bool is_device_private_page(const struct page *page)
 		page->pgmap->type == MEMORY_DEVICE_PRIVATE;
 }
 
+static inline bool is_device_generic_page(const struct page *page)
+{
+	return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
+		is_zone_device_page(page) &&
+		page->pgmap->type == MEMORY_DEVICE_GENERIC;
+}
+
 static inline bool is_pci_p2pdma_page(const struct page *page)
 {
 	return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) &&
-- 
2.31.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC PATCH 4/5] mm: add generic type support for device zone page migration
  2021-05-27 23:08 [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Felix Kuehling
                   ` (2 preceding siblings ...)
  2021-05-27 23:08 ` [RFC PATCH 3/5] include/linux/mm.h: helper to check zone device generic type Felix Kuehling
@ 2021-05-27 23:08 ` Felix Kuehling
  2021-05-29  6:40   ` Christoph Hellwig
  2021-05-27 23:08 ` [RFC PATCH 5/5] mm: changes to unref pages with Generic type Felix Kuehling
  2021-05-28 13:08 ` [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Jason Gunthorpe
  5 siblings, 1 reply; 15+ messages in thread
From: Felix Kuehling @ 2021-05-27 23:08 UTC (permalink / raw)
  To: felix.kuehling, akpm, linux-mm
  Cc: hch, jglisse, jgg, dri-devel, amd-gfx, Alex Sierra

From: Alex Sierra <alex.sierra@amd.com>

This support is only for generic type anonymous memory.
Generic type with zone device pages require to take an extra reference,
as it's done with device private type.
Also, support added to migrate pages meta-data for generic device type.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
---
 mm/migrate.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 20ca887ea769..33e573a992e5 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -380,7 +380,8 @@ static int expected_page_refs(struct address_space *mapping, struct page *page)
 	 * Device private pages have an extra refcount as they are
 	 * ZONE_DEVICE pages.
 	 */
-	expected_count += is_device_private_page(page);
+	expected_count +=
+			(is_device_private_page(page) || is_device_generic_page(page));
 	if (mapping)
 		expected_count += thp_nr_pages(page) + page_has_private(page);
 
@@ -2607,7 +2608,7 @@ static bool migrate_vma_check_page(struct page *page)
 		 * FIXME proper solution is to rework migration_entry_wait() so
 		 * it does not need to take a reference on page.
 		 */
-		return is_device_private_page(page);
+		return is_device_private_page(page) | is_device_generic_page(page);
 	}
 
 	/* For file back page */
@@ -3069,10 +3070,12 @@ void migrate_vma_pages(struct migrate_vma *migrate)
 		mapping = page_mapping(page);
 
 		if (is_zone_device_page(newpage)) {
-			if (is_device_private_page(newpage)) {
+			if (is_device_private_page(newpage) ||
+			    is_device_generic_page(newpage)) {
 				/*
-				 * For now only support private anonymous when
-				 * migrating to un-addressable device memory.
+				 * For now only support private and devdax/generic
+				 * anonymous when migrating to un-addressable
+				 * device memory.
 				 */
 				if (mapping) {
 					migrate->src[i] &= ~MIGRATE_PFN_MIGRATE;
-- 
2.31.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* [RFC PATCH 5/5] mm: changes to unref pages with Generic type
  2021-05-27 23:08 [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Felix Kuehling
                   ` (3 preceding siblings ...)
  2021-05-27 23:08 ` [RFC PATCH 4/5] mm: add generic type support for device zone page migration Felix Kuehling
@ 2021-05-27 23:08 ` Felix Kuehling
  2021-05-29  6:42   ` Christoph Hellwig
  2021-05-28 13:08 ` [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Jason Gunthorpe
  5 siblings, 1 reply; 15+ messages in thread
From: Felix Kuehling @ 2021-05-27 23:08 UTC (permalink / raw)
  To: felix.kuehling, akpm, linux-mm
  Cc: hch, jglisse, jgg, dri-devel, amd-gfx, Alex Sierra

From: Alex Sierra <alex.sierra@amd.com>

pages in device mapping refcounts are 1-based, instead
of 0-based. If refcount 1, means it can be freed.
This logic is not set for Generic memory type. Therefore,
its release is threated as a normal page, instead of
the callback device driver release it.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
---
 include/linux/mm.h | 1 +
 mm/memremap.c      | 5 ++++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1af7b9b76948..83bd2f3e111b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1130,6 +1130,7 @@ static inline bool page_is_devmap_managed(struct page *page)
 	switch (page->pgmap->type) {
 	case MEMORY_DEVICE_PRIVATE:
 	case MEMORY_DEVICE_FS_DAX:
+	case MEMORY_DEVICE_GENERIC:
 		return true;
 	default:
 		break;
diff --git a/mm/memremap.c b/mm/memremap.c
index 16b2fb482da1..d2563fbcf987 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -44,6 +44,7 @@ EXPORT_SYMBOL(devmap_managed_key);
 static void devmap_managed_enable_put(struct dev_pagemap *pgmap)
 {
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE ||
+	    pgmap->type == MEMORY_DEVICE_GENERIC ||
 	    pgmap->type == MEMORY_DEVICE_FS_DAX)
 		static_branch_dec(&devmap_managed_key);
 }
@@ -51,6 +52,7 @@ static void devmap_managed_enable_put(struct dev_pagemap *pgmap)
 static void devmap_managed_enable_get(struct dev_pagemap *pgmap)
 {
 	if (pgmap->type == MEMORY_DEVICE_PRIVATE ||
+	    pgmap->type == MEMORY_DEVICE_GENERIC ||
 	    pgmap->type == MEMORY_DEVICE_FS_DAX)
 		static_branch_inc(&devmap_managed_key);
 }
@@ -480,7 +482,8 @@ EXPORT_SYMBOL_GPL(get_dev_pagemap);
 void free_devmap_managed_page(struct page *page)
 {
 	/* notify page idle for dax */
-	if (!is_device_private_page(page)) {
+	if (!(is_device_private_page(page) ||
+	    is_device_generic_page(page))) {
 		wake_up_var(&page->_refcount);
 		return;
 	}
-- 
2.31.1



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_*
  2021-05-27 23:08 [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Felix Kuehling
                   ` (4 preceding siblings ...)
  2021-05-27 23:08 ` [RFC PATCH 5/5] mm: changes to unref pages with Generic type Felix Kuehling
@ 2021-05-28 13:08 ` Jason Gunthorpe
  2021-05-28 15:56   ` Felix Kuehling
  5 siblings, 1 reply; 15+ messages in thread
From: Jason Gunthorpe @ 2021-05-28 13:08 UTC (permalink / raw)
  To: Felix Kuehling
  Cc: felix.kuehling, akpm, linux-mm, hch, jglisse, dri-devel, amd-gfx

On Thu, May 27, 2021 at 07:08:04PM -0400, Felix Kuehling wrote:
> Now we're trying to migrate data to and from that memory using the
> migrate_vma_* helpers so we can support page-based migration in our
> unified memory allocations, while also supporting CPU access to those
> pages.

So you have completely coherent and indistinguishable GPU and CPU
memory and the need of migration is basicaly alot like NUMA policy
choice - get better access locality?
 
> This patch series makes a few changes to make MEMORY_DEVICE_GENERIC pages
> behave correctly in the migrate_vma_* helpers. We are looking for feedback
> about this approach. If we're close, what's needed to make our patches
> acceptable upstream? If we're not close, any suggestions how else to
> achieve what we are trying to do (i.e. page migration and coherent CPU
> access to VRAM)?

I'm not an expert in migrate, but it doesn't look outrageous.

Have you thought about allowing MEMORY_DEVICE_GENERIC to work with
hmm_range_fault() so you can have nice uniform RDMA?

People have wanted to do that with MEMORY_DEVICE_PRIVATE but nobody
finished the work

Jason


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_*
  2021-05-28 13:08 ` [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Jason Gunthorpe
@ 2021-05-28 15:56   ` Felix Kuehling
  2021-05-29  6:41     ` Christoph Hellwig
  0 siblings, 1 reply; 15+ messages in thread
From: Felix Kuehling @ 2021-05-28 15:56 UTC (permalink / raw)
  To: Jason Gunthorpe, Felix Kuehling
  Cc: dri-devel, linux-mm, jglisse, amd-gfx, akpm, hch

Am 2021-05-28 um 9:08 a.m. schrieb Jason Gunthorpe:
> On Thu, May 27, 2021 at 07:08:04PM -0400, Felix Kuehling wrote:
>> Now we're trying to migrate data to and from that memory using the
>> migrate_vma_* helpers so we can support page-based migration in our
>> unified memory allocations, while also supporting CPU access to those
>> pages.
> So you have completely coherent and indistinguishable GPU and CPU
> memory and the need of migration is basicaly alot like NUMA policy
> choice - get better access locality?

Yes. For a typical GPU compute application it means the GPU gets the
best bandwidth/latency, and the CPU can coherently access the results
without page faults and migrations. That's especially valuable for
applications with persistent compute kernels that want to exploit
concurrency between CPU and GPU.


>  
>> This patch series makes a few changes to make MEMORY_DEVICE_GENERIC pages
>> behave correctly in the migrate_vma_* helpers. We are looking for feedback
>> about this approach. If we're close, what's needed to make our patches
>> acceptable upstream? If we're not close, any suggestions how else to
>> achieve what we are trying to do (i.e. page migration and coherent CPU
>> access to VRAM)?
> I'm not an expert in migrate, but it doesn't look outrageous.
>
> Have you thought about allowing MEMORY_DEVICE_GENERIC to work with
> hmm_range_fault() so you can have nice uniform RDMA?

Yes. That's our plan for RDMA to unified memory on this system. My
understanding was, that DEVICE_GENERIC pages should already work with
hmm_range_fault. But maybe I'm missing something.


>
> People have wanted to do that with MEMORY_DEVICE_PRIVATE but nobody
> finished the work

Yeah, for DEVICE_PRIVATE it seems more tricky because the peer device is
not the owner of the pages and would need help from the actual owner to
get proper DMA addresses.

Regards,
  Felix


>
> Jason


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 1/5] drm/amdkfd: add SPM support for SVM
  2021-05-27 23:08 ` [RFC PATCH 1/5] drm/amdkfd: add SPM support for SVM Felix Kuehling
@ 2021-05-29  6:38   ` Christoph Hellwig
  2021-05-29 18:42     ` Felix Kuehling
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2021-05-29  6:38 UTC (permalink / raw)
  To: Felix Kuehling
  Cc: felix.kuehling, akpm, linux-mm, hch, jglisse, jgg, dri-devel,
	amd-gfx, Alex Sierra

On Thu, May 27, 2021 at 07:08:05PM -0400, Felix Kuehling wrote:
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 627e61b0c124..da137553b83e 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -783,7 +783,7 @@ struct resource *lookup_resource(struct resource *root, resource_size_t start)
>  
>  	return res;
>  }
> -
> +EXPORT_SYMBOL(lookup_resource);

NAK on hiding random exports in a driver patch.  This needs to be a proper
patch with a proper rationale, a kerneldoc comment and use
EXPORT_SYMBOL_GPL.  And even then it smells rather fishy, but I'll wait
for the rationale.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 4/5] mm: add generic type support for device zone page migration
  2021-05-27 23:08 ` [RFC PATCH 4/5] mm: add generic type support for device zone page migration Felix Kuehling
@ 2021-05-29  6:40   ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2021-05-29  6:40 UTC (permalink / raw)
  To: Felix Kuehling
  Cc: felix.kuehling, akpm, linux-mm, hch, jglisse, jgg, dri-devel,
	amd-gfx, Alex Sierra

On Thu, May 27, 2021 at 07:08:08PM -0400, Felix Kuehling wrote:
> -	expected_count += is_device_private_page(page);
> +	expected_count +=
> +			(is_device_private_page(page) || is_device_generic_page(page));

Please avoid the completely unreadable overly long lines.  And given
how oftへn this check is duplicated you probably really want a helper.
And properly document it while you're at it.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_*
  2021-05-28 15:56   ` Felix Kuehling
@ 2021-05-29  6:41     ` Christoph Hellwig
  2021-05-29 18:37       ` Felix Kuehling
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2021-05-29  6:41 UTC (permalink / raw)
  To: Felix Kuehling
  Cc: Jason Gunthorpe, Felix Kuehling, dri-devel, linux-mm, jglisse,
	amd-gfx, akpm, hch

On Fri, May 28, 2021 at 11:56:36AM -0400, Felix Kuehling wrote:
> Am 2021-05-28 um 9:08 a.m. schrieb Jason Gunthorpe:
> > On Thu, May 27, 2021 at 07:08:04PM -0400, Felix Kuehling wrote:
> >> Now we're trying to migrate data to and from that memory using the
> >> migrate_vma_* helpers so we can support page-based migration in our
> >> unified memory allocations, while also supporting CPU access to those
> >> pages.
> > So you have completely coherent and indistinguishable GPU and CPU
> > memory and the need of migration is basicaly alot like NUMA policy
> > choice - get better access locality?
> 
> Yes. For a typical GPU compute application it means the GPU gets the
> best bandwidth/latency, and the CPU can coherently access the results
> without page faults and migrations. That's especially valuable for
> applications with persistent compute kernels that want to exploit
> concurrency between CPU and GPU.

So why not expose the GPU memory as a CPUless memory node?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 5/5] mm: changes to unref pages with Generic type
  2021-05-27 23:08 ` [RFC PATCH 5/5] mm: changes to unref pages with Generic type Felix Kuehling
@ 2021-05-29  6:42   ` Christoph Hellwig
  2021-05-29 18:44     ` Felix Kuehling
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2021-05-29  6:42 UTC (permalink / raw)
  To: Felix Kuehling
  Cc: felix.kuehling, akpm, linux-mm, hch, jglisse, jgg, dri-devel,
	amd-gfx, Alex Sierra

On Thu, May 27, 2021 at 07:08:09PM -0400, Felix Kuehling wrote:
> From: Alex Sierra <alex.sierra@amd.com>
> 
> pages in device mapping refcounts are 1-based, instead
> of 0-based. If refcount 1, means it can be freed.
> This logic is not set for Generic memory type. Therefore,
> its release is threated as a normal page, instead of
> the callback device driver release it.

So we really need to stop this magic one off refcount.  We had a WIP
patchset to do that, and we need to finish that off instad of piling on
more hacks.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_*
  2021-05-29  6:41     ` Christoph Hellwig
@ 2021-05-29 18:37       ` Felix Kuehling
  0 siblings, 0 replies; 15+ messages in thread
From: Felix Kuehling @ 2021-05-29 18:37 UTC (permalink / raw)
  To: Christoph Hellwig, Felix Kuehling
  Cc: Jason Gunthorpe, dri-devel, linux-mm, jglisse, amd-gfx, akpm


[-- Attachment #1.1: Type: text/plain, Size: 2108 bytes --]

Am 2021-05-29 um 2:41 a.m. schrieb Christoph Hellwig:
> On Fri, May 28, 2021 at 11:56:36AM -0400, Felix Kuehling wrote:
>> Am 2021-05-28 um 9:08 a.m. schrieb Jason Gunthorpe:
>>> On Thu, May 27, 2021 at 07:08:04PM -0400, Felix Kuehling wrote:
>>>> Now we're trying to migrate data to and from that memory using the
>>>> migrate_vma_* helpers so we can support page-based migration in our
>>>> unified memory allocations, while also supporting CPU access to those
>>>> pages.
>>> So you have completely coherent and indistinguishable GPU and CPU
>>> memory and the need of migration is basicaly alot like NUMA policy
>>> choice - get better access locality?
>> Yes. For a typical GPU compute application it means the GPU gets the
>> best bandwidth/latency, and the CPU can coherently access the results
>> without page faults and migrations. That's especially valuable for
>> applications with persistent compute kernels that want to exploit
>> concurrency between CPU and GPU.
> So why not expose the GPU memory as a CPUless memory node?

We did consider this, and are in fact still considering it for future
systems. For this system we decided not to go that way for several reasons.

For one, it means the driver would need to allocate VRAM with
__alloc_pages_nodemask for its own needs (firmware blobs, page tables,
etc.) and traditional BO-based memory allocation APIs. The GPU driver
would compete for VRAM with other application allocations, such as
malloc, mmap, page cache etc. Benchmarking and optimizing the NUMA
policy for such a system with a wide variety of workloads would be a big
effort.

All VRAM would need to be 0-initialized at allocation time. (I know
about init_on_free=1. In fact that's what our GPU driver does for VRAM
today, but asynchronously to hide the latency. However, init_on_free is
synchronous and has other drawbacks for system memory according to the
documentation of config INIT_ON_FREE_DEFAULT_ON.)

To make virtualization work, GPU access to its own local VRAM would need
to go through the system IOMMU.

Regards,
  Felix




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 1/5] drm/amdkfd: add SPM support for SVM
  2021-05-29  6:38   ` Christoph Hellwig
@ 2021-05-29 18:42     ` Felix Kuehling
  0 siblings, 0 replies; 15+ messages in thread
From: Felix Kuehling @ 2021-05-29 18:42 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: felix.kuehling, akpm, linux-mm, jglisse, jgg, dri-devel, amd-gfx,
	Alex Sierra

Am 2021-05-29 um 2:38 a.m. schrieb Christoph Hellwig:
> On Thu, May 27, 2021 at 07:08:05PM -0400, Felix Kuehling wrote:
>> diff --git a/kernel/resource.c b/kernel/resource.c
>> index 627e61b0c124..da137553b83e 100644
>> --- a/kernel/resource.c
>> +++ b/kernel/resource.c
>> @@ -783,7 +783,7 @@ struct resource *lookup_resource(struct resource *root, resource_size_t start)
>>  
>>  	return res;
>>  }
>> -
>> +EXPORT_SYMBOL(lookup_resource);
> NAK on hiding random exports in a driver patch.  This needs to be a proper
> patch with a proper rationale, a kerneldoc comment and use
> EXPORT_SYMBOL_GPL.  And even then it smells rather fishy, but I'll wait
> for the rationale.

Yeah, I missed that in internal review. I agree this needs to be a
separate patch.

The rationale is, that the GPU driver needs to be able to look up and
claim the resource corresponding to its VRAM in order to register it as
DEVICE_GENERIC memory with devmap. If there is a better way to do this,
I'm open to suggestions.

Regards,
  Felix



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC PATCH 5/5] mm: changes to unref pages with Generic type
  2021-05-29  6:42   ` Christoph Hellwig
@ 2021-05-29 18:44     ` Felix Kuehling
  0 siblings, 0 replies; 15+ messages in thread
From: Felix Kuehling @ 2021-05-29 18:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: felix.kuehling, akpm, linux-mm, jglisse, jgg, dri-devel, amd-gfx,
	Alex Sierra


[-- Attachment #1.1: Type: text/plain, Size: 686 bytes --]

Am 2021-05-29 um 2:42 a.m. schrieb Christoph Hellwig:
> On Thu, May 27, 2021 at 07:08:09PM -0400, Felix Kuehling wrote:
>> From: Alex Sierra <alex.sierra@amd.com>
>>
>> pages in device mapping refcounts are 1-based, instead
>> of 0-based. If refcount 1, means it can be freed.
>> This logic is not set for Generic memory type. Therefore,
>> its release is threated as a normal page, instead of
>> the callback device driver release it.
> So we really need to stop this magic one off refcount.  We had a WIP
> patchset to do that, and we need to finish that off instad of piling on
> more hacks.

Sure. I'll be happy to help with that, if I can.

Thanks,
  Felix




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 236 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-05-29 18:44 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-27 23:08 [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Felix Kuehling
2021-05-27 23:08 ` [RFC PATCH 1/5] drm/amdkfd: add SPM support for SVM Felix Kuehling
2021-05-29  6:38   ` Christoph Hellwig
2021-05-29 18:42     ` Felix Kuehling
2021-05-27 23:08 ` [RFC PATCH 2/5] drm/amdkfd: generic type as sys mem on migration to ram Felix Kuehling
2021-05-27 23:08 ` [RFC PATCH 3/5] include/linux/mm.h: helper to check zone device generic type Felix Kuehling
2021-05-27 23:08 ` [RFC PATCH 4/5] mm: add generic type support for device zone page migration Felix Kuehling
2021-05-29  6:40   ` Christoph Hellwig
2021-05-27 23:08 ` [RFC PATCH 5/5] mm: changes to unref pages with Generic type Felix Kuehling
2021-05-29  6:42   ` Christoph Hellwig
2021-05-29 18:44     ` Felix Kuehling
2021-05-28 13:08 ` [RFC PATCH 0/5] Support DEVICE_GENERIC memory in migrate_vma_* Jason Gunthorpe
2021-05-28 15:56   ` Felix Kuehling
2021-05-29  6:41     ` Christoph Hellwig
2021-05-29 18:37       ` Felix Kuehling

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).