All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] KFD: Mapping-specific MTYPEs on Arcturus
@ 2019-08-23 21:33 Kuehling, Felix
       [not found] ` <20190823213249.10749-1-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Kuehling, Felix @ 2019-08-23 21:33 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak, Koenig, Christian

This is a simplified version of Oak's patch series, addressing some of
the concerns raised during code review. Patches 1-3 are taken directly
from Oak's series. Patch 4 is a simplified version of Oak's patches
3 and 5 that keeps all the KFD memory type abstractions out of the GMC
IP-version-specific code.

Compile tested only.

Felix Kuehling (1):
  drm/amdgpu: Determing PTE flags separately for each mapping

Oak Zeng (3):
  drm/amdgpu: Extends amdgpu vm definitions
  drm/amdgpu: Support new arcturus mtype
  drm/amdgpu: Support snooped PTE flag

 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 60 ++++++++++++++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |  9 ++-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c         |  6 ++
 drivers/gpu/drm/amd/include/vega10_enum.h     |  1 +
 include/uapi/drm/amdgpu_drm.h                 |  4 ++
 6 files changed, 63 insertions(+), 19 deletions(-)

-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] drm/amdgpu: Extends amdgpu vm definitions
       [not found] ` <20190823213249.10749-1-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2019-08-23 21:33   ` [PATCH 2/4] drm/amdgpu: Support new arcturus mtype Kuehling, Felix
@ 2019-08-23 21:33   ` Kuehling, Felix
       [not found]     ` <20190823213249.10749-2-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2019-08-23 21:33   ` [PATCH 3/4] drm/amdgpu: Support snooped PTE flag Kuehling, Felix
  2019-08-23 21:33   ` [PATCH 4/4] drm/amdgpu: Determing PTE flags separately for each mapping Kuehling, Felix
  3 siblings, 1 reply; 16+ messages in thread
From: Kuehling, Felix @ 2019-08-23 21:33 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak, Koenig, Christian

From: Oak Zeng <Oak.Zeng@amd.com>

Add definition of all supported mtypes. The RW mtype
is recently introduced for arcturus. Also add definition
of a flag to probe and possibly invalidate remote GPU
cache, which will be used later in this series.

Change-Id: I96fc9bb4b6b1e62bdc10b600d8aaa6a802128d6d
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 9 +++++++--
 include/uapi/drm/amdgpu_drm.h          | 4 ++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 2eda3a8c330d..7a77477af6a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -80,8 +80,13 @@ struct amdgpu_bo_list_entry;
 #define AMDGPU_PTE_MTYPE_VG10(a)	((uint64_t)(a) << 57)
 #define AMDGPU_PTE_MTYPE_VG10_MASK	AMDGPU_PTE_MTYPE_VG10(3ULL)
 
-#define AMDGPU_MTYPE_NC 0
-#define AMDGPU_MTYPE_CC 2
+enum amdgpu_mtype {
+	AMDGPU_MTYPE_NC = 0,
+	AMDGPU_MTYPE_WC = 1,
+	AMDGPU_MTYPE_CC = 2,
+	AMDGPU_MTYPE_UC = 3,
+	AMDGPU_MTYPE_RW = 4,
+};
 
 #define AMDGPU_PTE_DEFAULT_ATC  (AMDGPU_PTE_SYSTEM      \
                                 | AMDGPU_PTE_SNOOPED    \
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index ca97b6802275..97e8e51f76aa 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -503,6 +503,10 @@ struct drm_amdgpu_gem_op {
 #define AMDGPU_VM_MTYPE_CC		(3 << 5)
 /* Use UC MTYPE instead of default MTYPE */
 #define AMDGPU_VM_MTYPE_UC		(4 << 5)
+/* Use RW MTYPE instead of default MTYPE */
+#define AMDGPU_VM_MTYPE_RW		(5 << 5)
+/* Cacheable/snoopable */
+#define AMDGPU_VM_PAGE_INVALIDATE_PROBE	(1 << 9)
 
 struct drm_amdgpu_gem_va {
 	/** GEM object handle */
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] drm/amdgpu: Support new arcturus mtype
       [not found] ` <20190823213249.10749-1-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-23 21:33   ` Kuehling, Felix
  2019-08-23 21:33   ` [PATCH 1/4] drm/amdgpu: Extends amdgpu vm definitions Kuehling, Felix
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Kuehling, Felix @ 2019-08-23 21:33 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak, Koenig, Christian

From: Oak Zeng <Oak.Zeng@amd.com>

Arcturus repurposed mtype WC to RW. Modify gmc functions
to support the new mtype

Change-Id: Idc338e5386a57020f45262025e2664ab4ba9f291
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c     | 3 +++
 drivers/gpu/drm/amd/include/vega10_enum.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 6ce2977c3c59..9aafcda6c488 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -587,6 +587,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct amdgpu_device *adev,
 	case AMDGPU_VM_MTYPE_WC:
 		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_WC);
 		break;
+	case AMDGPU_VM_MTYPE_RW:
+		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_RW);
+		break;
 	case AMDGPU_VM_MTYPE_CC:
 		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_CC);
 		break;
diff --git a/drivers/gpu/drm/amd/include/vega10_enum.h b/drivers/gpu/drm/amd/include/vega10_enum.h
index c14ba65a2415..adf1b754666e 100644
--- a/drivers/gpu/drm/amd/include/vega10_enum.h
+++ b/drivers/gpu/drm/amd/include/vega10_enum.h
@@ -1037,6 +1037,7 @@ TCC_CACHE_POLICY_STREAM                  = 0x00000001,
 typedef enum MTYPE {
 MTYPE_NC                                 = 0x00000000,
 MTYPE_WC                                 = 0x00000001,
+MTYPE_RW                                 = 0x00000001,
 MTYPE_CC                                 = 0x00000002,
 MTYPE_UC                                 = 0x00000003,
 } MTYPE;
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] drm/amdgpu: Support snooped PTE flag
       [not found] ` <20190823213249.10749-1-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2019-08-23 21:33   ` [PATCH 2/4] drm/amdgpu: Support new arcturus mtype Kuehling, Felix
  2019-08-23 21:33   ` [PATCH 1/4] drm/amdgpu: Extends amdgpu vm definitions Kuehling, Felix
@ 2019-08-23 21:33   ` Kuehling, Felix
       [not found]     ` <20190823213249.10749-4-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  2019-08-23 21:33   ` [PATCH 4/4] drm/amdgpu: Determing PTE flags separately for each mapping Kuehling, Felix
  3 siblings, 1 reply; 16+ messages in thread
From: Kuehling, Felix @ 2019-08-23 21:33 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak, Koenig, Christian

From: Oak Zeng <Oak.Zeng@amd.com>

Set snooped PTE flag according to mapping flag. Write request to a
page with snooped bit set, will send out invalidate probe request
to TCC of the remote GPU where the vram page resides.

Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 9aafcda6c488..8a7c4ec69ae8 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -604,6 +604,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct amdgpu_device *adev,
 	if (flags & AMDGPU_VM_PAGE_PRT)
 		pte_flag |= AMDGPU_PTE_PRT;
 
+	if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
+		pte_flag |= AMDGPU_PTE_SNOOPED;
+
 	return pte_flag;
 }
 
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] drm/amdgpu: Determing PTE flags separately for each mapping
       [not found] ` <20190823213249.10749-1-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2019-08-23 21:33   ` [PATCH 3/4] drm/amdgpu: Support snooped PTE flag Kuehling, Felix
@ 2019-08-23 21:33   ` Kuehling, Felix
  3 siblings, 0 replies; 16+ messages in thread
From: Kuehling, Felix @ 2019-08-23 21:33 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak, Koenig, Christian

The same BO can be mapped with different PTE flags by different GPUs.
Therefore determine the PTE flags separately for each mapping instead
of storing them in the buffer object.

Add a helper function to determine the PTE flags with special logic
for Arcturus that takes advantages of the new MTYPE_RW and the ability
to probe-invalidate remote caches over XGMI.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h    |  2 +-
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 60 ++++++++++++++-----
 2 files changed, 45 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index e519df3fd2b6..1af8f83f7e02 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -57,7 +57,7 @@ struct kgd_mem {
 	unsigned int mapped_to_gpu_memory;
 	uint64_t va;
 
-	uint32_t mapping_flags;
+	uint32_t alloc_flags;
 
 	atomic_t invalid;
 	struct amdkfd_process_info *process_info;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 44a52b09cc58..dd24d9517de2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -31,6 +31,7 @@
 #include "amdgpu_vm.h"
 #include "amdgpu_amdkfd.h"
 #include "amdgpu_dma_buf.h"
+#include "amdgpu_xgmi.h"
 
 /* Special VM and GART address alignment needed for VI pre-Fiji due to
  * a HW bug.
@@ -355,6 +356,43 @@ static int vm_update_pds(struct amdgpu_vm *vm, struct amdgpu_sync *sync)
 	return amdgpu_sync_fence(NULL, sync, vm->last_update, false);
 }
 
+static uint32_t get_pte_flags(struct amdgpu_device *adev, struct kgd_mem *mem)
+{
+	struct amdgpu_device *bo_adev = amdgpu_ttm_adev(mem->bo->tbo.bdev);
+	bool coherent = mem->alloc_flags & ALLOC_MEM_FLAGS_COHERENT;
+	uint32_t mapping_flags;
+
+	mapping_flags = AMDGPU_VM_PAGE_READABLE;
+	if (mem->alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
+	if (mem->alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
+
+	switch (adev->asic_type) {
+	case CHIP_ARCTURUS:
+		if (mem->alloc_flags & ALLOC_MEM_FLAGS_VRAM) {
+			if (bo_adev == adev) {
+				mapping_flags |= coherent ?
+					AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW;
+			} else {
+				mapping_flags |= AMDGPU_VM_MTYPE_UC;
+				if (amdgpu_xgmi_same_hive(adev, bo_adev))
+					mapping_flags |=
+						AMDGPU_VM_PAGE_INVALIDATE_PROBE;
+			}
+		} else {
+			mapping_flags |= coherent ?
+				AMDGPU_VM_MTYPE_UC : AMDGPU_VM_MTYPE_NC;
+		}
+		break;
+	default:
+		mapping_flags |= coherent ?
+			AMDGPU_VM_MTYPE_UC : AMDGPU_VM_MTYPE_NC;
+	}
+
+	return amdgpu_gmc_get_pte_flags(adev, mapping_flags);
+}
+
 /* add_bo_to_vm - Add a BO to a VM
  *
  * Everything that needs to bo done only once when a BO is first added
@@ -403,8 +441,7 @@ static int add_bo_to_vm(struct amdgpu_device *adev, struct kgd_mem *mem,
 	}
 
 	bo_va_entry->va = va;
-	bo_va_entry->pte_flags = amdgpu_gmc_get_pte_flags(adev,
-							 mem->mapping_flags);
+	bo_va_entry->pte_flags = get_pte_flags(adev, mem);
 	bo_va_entry->kgd_dev = (void *)adev;
 	list_add(&bo_va_entry->bo_list, list_bo_va);
 
@@ -1081,7 +1118,6 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
 	int byte_align;
 	u32 domain, alloc_domain;
 	u64 alloc_flags;
-	uint32_t mapping_flags;
 	int ret;
 
 	/*
@@ -1143,16 +1179,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
 			adev->asic_type != CHIP_VEGAM) ?
 			VI_BO_SIZE_ALIGN : 1;
 
-	mapping_flags = AMDGPU_VM_PAGE_READABLE;
-	if (flags & ALLOC_MEM_FLAGS_WRITABLE)
-		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
-	if (flags & ALLOC_MEM_FLAGS_EXECUTABLE)
-		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
-	if (flags & ALLOC_MEM_FLAGS_COHERENT)
-		mapping_flags |= AMDGPU_VM_MTYPE_UC;
-	else
-		mapping_flags |= AMDGPU_VM_MTYPE_NC;
-	(*mem)->mapping_flags = mapping_flags;
+	(*mem)->alloc_flags = flags;
 
 	amdgpu_sync_create(&(*mem)->sync);
 
@@ -1625,9 +1652,10 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct kgd_dev *kgd,
 
 	INIT_LIST_HEAD(&(*mem)->bo_va_list);
 	mutex_init(&(*mem)->lock);
-	(*mem)->mapping_flags =
-		AMDGPU_VM_PAGE_READABLE | AMDGPU_VM_PAGE_WRITEABLE |
-		AMDGPU_VM_PAGE_EXECUTABLE | AMDGPU_VM_MTYPE_NC;
+	(*mem)->alloc_flags =
+		((bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) ?
+		 ALLOC_MEM_FLAGS_VRAM : ALLOC_MEM_FLAGS_GTT) |
+		ALLOC_MEM_FLAGS_WRITABLE | ALLOC_MEM_FLAGS_EXECUTABLE;
 
 	(*mem)->bo = amdgpu_bo_ref(bo);
 	(*mem)->va = va;
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] drm/amdgpu: Extends amdgpu vm definitions
       [not found]     ` <20190823213249.10749-2-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-24 11:12       ` Christian König
  0 siblings, 0 replies; 16+ messages in thread
From: Christian König @ 2019-08-24 11:12 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Zeng, Oak, Koenig, Christian

Am 23.08.19 um 23:33 schrieb Kuehling, Felix:
> From: Oak Zeng <Oak.Zeng@amd.com>
>
> Add definition of all supported mtypes. The RW mtype
> is recently introduced for arcturus. Also add definition
> of a flag to probe and possibly invalidate remote GPU
> cache, which will be used later in this series.

Maybe separate that into two patches?

Christian.

>
> Change-Id: I96fc9bb4b6b1e62bdc10b600d8aaa6a802128d6d
> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 9 +++++++--
>   include/uapi/drm/amdgpu_drm.h          | 4 ++++
>   2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 2eda3a8c330d..7a77477af6a4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -80,8 +80,13 @@ struct amdgpu_bo_list_entry;
>   #define AMDGPU_PTE_MTYPE_VG10(a)	((uint64_t)(a) << 57)
>   #define AMDGPU_PTE_MTYPE_VG10_MASK	AMDGPU_PTE_MTYPE_VG10(3ULL)
>   
> -#define AMDGPU_MTYPE_NC 0
> -#define AMDGPU_MTYPE_CC 2
> +enum amdgpu_mtype {
> +	AMDGPU_MTYPE_NC = 0,
> +	AMDGPU_MTYPE_WC = 1,
> +	AMDGPU_MTYPE_CC = 2,
> +	AMDGPU_MTYPE_UC = 3,
> +	AMDGPU_MTYPE_RW = 4,
> +};
>   
>   #define AMDGPU_PTE_DEFAULT_ATC  (AMDGPU_PTE_SYSTEM      \
>                                   | AMDGPU_PTE_SNOOPED    \
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index ca97b6802275..97e8e51f76aa 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -503,6 +503,10 @@ struct drm_amdgpu_gem_op {
>   #define AMDGPU_VM_MTYPE_CC		(3 << 5)
>   /* Use UC MTYPE instead of default MTYPE */
>   #define AMDGPU_VM_MTYPE_UC		(4 << 5)
> +/* Use RW MTYPE instead of default MTYPE */
> +#define AMDGPU_VM_MTYPE_RW		(5 << 5)
> +/* Cacheable/snoopable */
> +#define AMDGPU_VM_PAGE_INVALIDATE_PROBE	(1 << 9)
>   
>   struct drm_amdgpu_gem_va {
>   	/** GEM object handle */

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: Support snooped PTE flag
       [not found]     ` <20190823213249.10749-4-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-24 11:13       ` Christian König
       [not found]         ` <f09e6893-347b-4ade-76e5-ad37d8e4e782-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Christian König @ 2019-08-24 11:13 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Zeng, Oak, Koenig, Christian

Am 23.08.19 um 23:33 schrieb Kuehling, Felix:
> From: Oak Zeng <Oak.Zeng@amd.com>
>
> Set snooped PTE flag according to mapping flag. Write request to a
> page with snooped bit set, will send out invalidate probe request
> to TCC of the remote GPU where the vram page resides.
>
> Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 9aafcda6c488..8a7c4ec69ae8 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -604,6 +604,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct amdgpu_device *adev,
>   	if (flags & AMDGPU_VM_PAGE_PRT)
>   		pte_flag |= AMDGPU_PTE_PRT;
>   
> +	if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
> +		pte_flag |= AMDGPU_PTE_SNOOPED;
> +

That is still a NAK without further checks. We need to make absolutely 
sure that we don't set this when PCIe routing is in use.

Christian.

>   	return pte_flag;
>   }
>   

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: Support snooped PTE flag
       [not found]         ` <f09e6893-347b-4ade-76e5-ad37d8e4e782-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2019-08-24 16:24           ` Kuehling, Felix
       [not found]             ` <a7f9ad48-2e46-5415-e2a8-1738a101d716-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Kuehling, Felix @ 2019-08-24 16:24 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak

On 2019-08-24 7:13, Christian König wrote:
> Am 23.08.19 um 23:33 schrieb Kuehling, Felix:
>> From: Oak Zeng <Oak.Zeng@amd.com>
>>
>> Set snooped PTE flag according to mapping flag. Write request to a
>> page with snooped bit set, will send out invalidate probe request
>> to TCC of the remote GPU where the vram page resides.
>>
>> Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
>> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> index 9aafcda6c488..8a7c4ec69ae8 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> @@ -604,6 +604,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct 
>> amdgpu_device *adev,
>>       if (flags & AMDGPU_VM_PAGE_PRT)
>>           pte_flag |= AMDGPU_PTE_PRT;
>>   +    if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
>> +        pte_flag |= AMDGPU_PTE_SNOOPED;
>> +
>
> That is still a NAK without further checks. We need to make absolutely 
> sure that we don't set this when PCIe routing is in use.

The only place where AMDGPU_VM_... flags are accepted from user mode 
seems to be amdgpu_gem_va_ioctl. It has an explicit set of valid_flags 
it accepts. The new INVALIDATE_PROBE flag is not part of it. That means 
user mode will  not be able to set it directly. If we added it to the 
set of valid_flags in amdgpu_gem_va_ioctl, we'd need to add appropriate 
checks at the same time.

KFD does not expose AMDGPU_VM_... flags directly to user mode. It only 
sets the INVALIDATE_PROBE flag in kernel mode for mappings in the same 
XGMI hive on Arturus (in patch 4).

If there is something I'm missing, please point it out. But AFAICT the 
checking that is currently done should satisfy your requirements.

Regards,
   Felix

>
> Christian.
>
>>       return pte_flag;
>>   }
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: Support snooped PTE flag
       [not found]             ` <a7f9ad48-2e46-5415-e2a8-1738a101d716-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-24 18:59               ` Christian König
       [not found]                 ` <96b6ac1d-de87-3fdb-a531-af4b0a42f1d5-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Christian König @ 2019-08-24 18:59 UTC (permalink / raw)
  To: Kuehling, Felix, Koenig, Christian,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Zeng, Oak

Am 24.08.19 um 18:24 schrieb Kuehling, Felix:
> On 2019-08-24 7:13, Christian König wrote:
>> Am 23.08.19 um 23:33 schrieb Kuehling, Felix:
>>> From: Oak Zeng <Oak.Zeng@amd.com>
>>>
>>> Set snooped PTE flag according to mapping flag. Write request to a
>>> page with snooped bit set, will send out invalidate probe request
>>> to TCC of the remote GPU where the vram page resides.
>>>
>>> Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
>>> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
>>>    1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> index 9aafcda6c488..8a7c4ec69ae8 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> @@ -604,6 +604,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct
>>> amdgpu_device *adev,
>>>        if (flags & AMDGPU_VM_PAGE_PRT)
>>>            pte_flag |= AMDGPU_PTE_PRT;
>>>    +    if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
>>> +        pte_flag |= AMDGPU_PTE_SNOOPED;
>>> +
>> That is still a NAK without further checks. We need to make absolutely
>> sure that we don't set this when PCIe routing is in use.
> The only place where AMDGPU_VM_... flags are accepted from user mode
> seems to be amdgpu_gem_va_ioctl. It has an explicit set of valid_flags
> it accepts. The new INVALIDATE_PROBE flag is not part of it. That means
> user mode will  not be able to set it directly. If we added it to the
> set of valid_flags in amdgpu_gem_va_ioctl, we'd need to add appropriate
> checks at the same time.
>
> KFD does not expose AMDGPU_VM_... flags directly to user mode. It only
> sets the INVALIDATE_PROBE flag in kernel mode for mappings in the same
> XGMI hive on Arturus (in patch 4).
>
> If there is something I'm missing, please point it out. But AFAICT the
> checking that is currently done should satisfy your requirements.

The hardware behavior depends on the placement of the buffer, so at bare 
minimum we need to check if it's pointing to PCIe or local (check the 
system bit).

But even if it's local what is the behavior for local memory? E.g. not 
accessed through XGMI?

As far as I can see what we need to check here is that this is a remote 
access over XGMI and then (and only then!) we are allowed to set the 
snoop bit on the PTE.

Regards,
Christian.

>
> Regards,
>     Felix
>
>> Christian.
>>
>>>        return pte_flag;
>>>    }
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: Support snooped PTE flag
       [not found]                 ` <96b6ac1d-de87-3fdb-a531-af4b0a42f1d5-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2019-08-26 15:45                   ` Kuehling, Felix
       [not found]                     ` <4b48fc1a-6ee2-bb60-0518-ea9c6346b8d6-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Kuehling, Felix @ 2019-08-26 15:45 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak

On 2019-08-24 2:59 p.m., Christian König wrote:
> Am 24.08.19 um 18:24 schrieb Kuehling, Felix:
>> On 2019-08-24 7:13, Christian König wrote:
>>> Am 23.08.19 um 23:33 schrieb Kuehling, Felix:
>>>> From: Oak Zeng <Oak.Zeng@amd.com>
>>>>
>>>> Set snooped PTE flag according to mapping flag. Write request to a
>>>> page with snooped bit set, will send out invalidate probe request
>>>> to TCC of the remote GPU where the vram page resides.
>>>>
>>>> Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
>>>> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
>>>>    1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> index 9aafcda6c488..8a7c4ec69ae8 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>> @@ -604,6 +604,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct
>>>> amdgpu_device *adev,
>>>>        if (flags & AMDGPU_VM_PAGE_PRT)
>>>>            pte_flag |= AMDGPU_PTE_PRT;
>>>>    +    if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
>>>> +        pte_flag |= AMDGPU_PTE_SNOOPED;
>>>> +
>>> That is still a NAK without further checks. We need to make absolutely
>>> sure that we don't set this when PCIe routing is in use.
>> The only place where AMDGPU_VM_... flags are accepted from user mode
>> seems to be amdgpu_gem_va_ioctl. It has an explicit set of valid_flags
>> it accepts. The new INVALIDATE_PROBE flag is not part of it. That means
>> user mode will  not be able to set it directly. If we added it to the
>> set of valid_flags in amdgpu_gem_va_ioctl, we'd need to add appropriate
>> checks at the same time.
>>
>> KFD does not expose AMDGPU_VM_... flags directly to user mode. It only
>> sets the INVALIDATE_PROBE flag in kernel mode for mappings in the same
>> XGMI hive on Arturus (in patch 4).
>>
>> If there is something I'm missing, please point it out. But AFAICT the
>> checking that is currently done should satisfy your requirements.
>
> The hardware behavior depends on the placement of the buffer, so at 
> bare minimum we need to check if it's pointing to PCIe or local (check 
> the system bit).
>
> But even if it's local what is the behavior for local memory? E.g. not 
> accessed through XGMI?
>
> As far as I can see what we need to check here is that this is a 
> remote access over XGMI and then (and only then!) we are allowed to 
> set the snoop bit on the PTE.

My point is, the only place where this bit can get set right now is in 
kernel mode in amdgpu_amdkfd_gpuvm.c. See patch 4 of my series. That 
code already has all the right conditions to make sure the 
INVALIDATE_PROBE bit is only set in the correct circumstances (remote 
XGMI mappings in the same hive):

> +	switch (adev->asic_type) {
> +	case CHIP_ARCTURUS:
> +		if (mem->alloc_flags & ALLOC_MEM_FLAGS_VRAM) {
> +			if (bo_adev == adev) {
> +				...
> +			} else {
> +				...
> +				if (amdgpu_xgmi_same_hive(adev, bo_adev))
> +					mapping_flags |=
> +						AMDGPU_VM_PAGE_INVALIDATE_PROBE;
> +			}
> +		} else {
I think you're asking me to add an extra check for the same conditions 
in the GMC code? Like GMC doesn't trust amdgpu_amdkfd. It seems 
completely redundant and a bit paranoid to me. gmc_v9_0_get_vm_pte_flags 
doesn't even have all the information it needs to make that determination.

Regards,
   Felix


>
> Regards,
> Christian.
>
>>
>> Regards,
>>     Felix
>>
>>> Christian.
>>>
>>>>        return pte_flag;
>>>>    }
>> _______________________________________________
>> amd-gfx mailing list
>> amd-gfx@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: Support snooped PTE flag
       [not found]                     ` <4b48fc1a-6ee2-bb60-0518-ea9c6346b8d6-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-26 17:16                       ` Koenig, Christian
       [not found]                         ` <89bf5baa-4b63-e40a-7995-fa35bad988b7-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Koenig, Christian @ 2019-08-26 17:16 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak

Am 26.08.19 um 17:45 schrieb Kuehling, Felix:
> On 2019-08-24 2:59 p.m., Christian König wrote:
>> Am 24.08.19 um 18:24 schrieb Kuehling, Felix:
>>> On 2019-08-24 7:13, Christian König wrote:
>>>> Am 23.08.19 um 23:33 schrieb Kuehling, Felix:
>>>>> From: Oak Zeng <Oak.Zeng@amd.com>
>>>>>
>>>>> Set snooped PTE flag according to mapping flag. Write request to a
>>>>> page with snooped bit set, will send out invalidate probe request
>>>>> to TCC of the remote GPU where the vram page resides.
>>>>>
>>>>> Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
>>>>> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
>>>>> ---
>>>>>     drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
>>>>>     1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>> index 9aafcda6c488..8a7c4ec69ae8 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>> @@ -604,6 +604,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct
>>>>> amdgpu_device *adev,
>>>>>         if (flags & AMDGPU_VM_PAGE_PRT)
>>>>>             pte_flag |= AMDGPU_PTE_PRT;
>>>>>     +    if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
>>>>> +        pte_flag |= AMDGPU_PTE_SNOOPED;
>>>>> +
>>>> That is still a NAK without further checks. We need to make absolutely
>>>> sure that we don't set this when PCIe routing is in use.
>>> The only place where AMDGPU_VM_... flags are accepted from user mode
>>> seems to be amdgpu_gem_va_ioctl. It has an explicit set of valid_flags
>>> it accepts. The new INVALIDATE_PROBE flag is not part of it. That means
>>> user mode will  not be able to set it directly. If we added it to the
>>> set of valid_flags in amdgpu_gem_va_ioctl, we'd need to add appropriate
>>> checks at the same time.
>>>
>>> KFD does not expose AMDGPU_VM_... flags directly to user mode. It only
>>> sets the INVALIDATE_PROBE flag in kernel mode for mappings in the same
>>> XGMI hive on Arturus (in patch 4).
>>>
>>> If there is something I'm missing, please point it out. But AFAICT the
>>> checking that is currently done should satisfy your requirements.
>> The hardware behavior depends on the placement of the buffer, so at
>> bare minimum we need to check if it's pointing to PCIe or local (check
>> the system bit).
>>
>> But even if it's local what is the behavior for local memory? E.g. not
>> accessed through XGMI?
>>
>> As far as I can see what we need to check here is that this is a
>> remote access over XGMI and then (and only then!) we are allowed to
>> set the snoop bit on the PTE.
> My point is, the only place where this bit can get set right now is in
> kernel mode in amdgpu_amdkfd_gpuvm.c. See patch 4 of my series. That
> code already has all the right conditions to make sure the
> INVALIDATE_PROBE bit is only set in the correct circumstances (remote
> XGMI mappings in the same hive):
>
>> +	switch (adev->asic_type) {
>> +	case CHIP_ARCTURUS:
>> +		if (mem->alloc_flags & ALLOC_MEM_FLAGS_VRAM) {
>> +			if (bo_adev == adev) {
>> +				...
>> +			} else {
>> +				...
>> +				if (amdgpu_xgmi_same_hive(adev, bo_adev))
>> +					mapping_flags |=
>> +						AMDGPU_VM_PAGE_INVALIDATE_PROBE;
>> +			}
>> +		} else {
> I think you're asking me to add an extra check for the same conditions
> in the GMC code? Like GMC doesn't trust amdgpu_amdkfd. It seems
> completely redundant and a bit paranoid to me.

Well the job of the VM code is to figure out the flags and location for 
the PTE based on the current placement BO and the flags given for the 
mapping.

And yes that implies that we don't trust upper layers to do this instead.

> gmc_v9_0_get_vm_pte_flags doesn't even have all the information it needs to make that determination.

Well then we probably need to change that.

Regards,
Christian.

>
> Regards,
>     Felix
>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: Support snooped PTE flag
       [not found]                         ` <89bf5baa-4b63-e40a-7995-fa35bad988b7-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-26 18:03                           ` Kuehling, Felix
       [not found]                             ` <216f63db-78c5-1098-bea5-2f379b0bf051-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Kuehling, Felix @ 2019-08-26 18:03 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak

On 2019-08-26 1:16 p.m., wrote:
> Am 26.08.19 um 17:45 schrieb Kuehling, Felix:
>> On 2019-08-24 2:59 p.m., Christian König wrote:
>>> Am 24.08.19 um 18:24 schrieb Kuehling, Felix:
>>>> On 2019-08-24 7:13, Christian König wrote:
>>>>> Am 23.08.19 um 23:33 schrieb Kuehling, Felix:
>>>>>> From: Oak Zeng <Oak.Zeng@amd.com>
>>>>>>
>>>>>> Set snooped PTE flag according to mapping flag. Write request to a
>>>>>> page with snooped bit set, will send out invalidate probe request
>>>>>> to TCC of the remote GPU where the vram page resides.
>>>>>>
>>>>>> Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
>>>>>> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
>>>>>> ---
>>>>>>      drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
>>>>>>      1 file changed, 3 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>> index 9aafcda6c488..8a7c4ec69ae8 100644
>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>> @@ -604,6 +604,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct
>>>>>> amdgpu_device *adev,
>>>>>>          if (flags & AMDGPU_VM_PAGE_PRT)
>>>>>>              pte_flag |= AMDGPU_PTE_PRT;
>>>>>>      +    if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
>>>>>> +        pte_flag |= AMDGPU_PTE_SNOOPED;
>>>>>> +
>>>>> That is still a NAK without further checks. We need to make absolutely
>>>>> sure that we don't set this when PCIe routing is in use.
>>>> The only place where AMDGPU_VM_... flags are accepted from user mode
>>>> seems to be amdgpu_gem_va_ioctl. It has an explicit set of valid_flags
>>>> it accepts. The new INVALIDATE_PROBE flag is not part of it. That means
>>>> user mode will  not be able to set it directly. If we added it to the
>>>> set of valid_flags in amdgpu_gem_va_ioctl, we'd need to add appropriate
>>>> checks at the same time.
>>>>
>>>> KFD does not expose AMDGPU_VM_... flags directly to user mode. It only
>>>> sets the INVALIDATE_PROBE flag in kernel mode for mappings in the same
>>>> XGMI hive on Arturus (in patch 4).
>>>>
>>>> If there is something I'm missing, please point it out. But AFAICT the
>>>> checking that is currently done should satisfy your requirements.
>>> The hardware behavior depends on the placement of the buffer, so at
>>> bare minimum we need to check if it's pointing to PCIe or local (check
>>> the system bit).
>>>
>>> But even if it's local what is the behavior for local memory? E.g. not
>>> accessed through XGMI?
>>>
>>> As far as I can see what we need to check here is that this is a
>>> remote access over XGMI and then (and only then!) we are allowed to
>>> set the snoop bit on the PTE.
>> My point is, the only place where this bit can get set right now is in
>> kernel mode in amdgpu_amdkfd_gpuvm.c. See patch 4 of my series. That
>> code already has all the right conditions to make sure the
>> INVALIDATE_PROBE bit is only set in the correct circumstances (remote
>> XGMI mappings in the same hive):
>>
>>> +	switch (adev->asic_type) {
>>> +	case CHIP_ARCTURUS:
>>> +		if (mem->alloc_flags & ALLOC_MEM_FLAGS_VRAM) {
>>> +			if (bo_adev == adev) {
>>> +				...
>>> +			} else {
>>> +				...
>>> +				if (amdgpu_xgmi_same_hive(adev, bo_adev))
>>> +					mapping_flags |=
>>> +						AMDGPU_VM_PAGE_INVALIDATE_PROBE;
>>> +			}
>>> +		} else {
>> I think you're asking me to add an extra check for the same conditions
>> in the GMC code? Like GMC doesn't trust amdgpu_amdkfd. It seems
>> completely redundant and a bit paranoid to me.
> Well the job of the VM code is to figure out the flags and location for
> the PTE based on the current placement BO and the flags given for the
> mapping.
>
> And yes that implies that we don't trust upper layers to do this instead.

I consider amdgpu_amdkfd_gpuvm as part of the lower layer. It has 
control over the placement of the buffers.

That said, if the GMC code has to figure out the PTE mapping flags based 
on the location of the buffer and the intended usage, it'll be hard to 
avoid some of the abstractions you criticized in Oak's patch series. You 
can't have it both ways. Either you give user mode the responsibility to 
know all the HW details and let user mode determine the mtype and flags 
(which is what you currently do in the GEM interface), or you let the VM 
code determine the flags based on more abstract information from user mode.

I see this discussion moving towards a more abstract interface. I'll see 
if I can add that into the GMC IP without breaking the existing AMDGPU 
GEM APIs.

Regards,
   Felix


>
>> gmc_v9_0_get_vm_pte_flags doesn't even have all the information it needs to make that determination.
> Well then we probably need to change that.
>
> Regards,
> Christian.
>
>> Regards,
>>      Felix
>>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: Support snooped PTE flag
       [not found]                             ` <216f63db-78c5-1098-bea5-2f379b0bf051-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-26 18:12                               ` Koenig, Christian
       [not found]                                 ` <49ba47da-a225-5a21-9014-ccc316c55b60-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Koenig, Christian @ 2019-08-26 18:12 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak

Am 26.08.19 um 20:03 schrieb Kuehling, Felix:
> On 2019-08-26 1:16 p.m., wrote:
>> Am 26.08.19 um 17:45 schrieb Kuehling, Felix:
>>> On 2019-08-24 2:59 p.m., Christian König wrote:
>>>> Am 24.08.19 um 18:24 schrieb Kuehling, Felix:
>>>>> On 2019-08-24 7:13, Christian König wrote:
>>>>>> Am 23.08.19 um 23:33 schrieb Kuehling, Felix:
>>>>>>> From: Oak Zeng <Oak.Zeng@amd.com>
>>>>>>>
>>>>>>> Set snooped PTE flag according to mapping flag. Write request to a
>>>>>>> page with snooped bit set, will send out invalidate probe request
>>>>>>> to TCC of the remote GPU where the vram page resides.
>>>>>>>
>>>>>>> Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
>>>>>>> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
>>>>>>> ---
>>>>>>>       drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
>>>>>>>       1 file changed, 3 insertions(+)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>> index 9aafcda6c488..8a7c4ec69ae8 100644
>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>>>>>> @@ -604,6 +604,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct
>>>>>>> amdgpu_device *adev,
>>>>>>>           if (flags & AMDGPU_VM_PAGE_PRT)
>>>>>>>               pte_flag |= AMDGPU_PTE_PRT;
>>>>>>>       +    if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
>>>>>>> +        pte_flag |= AMDGPU_PTE_SNOOPED;
>>>>>>> +
>>>>>> That is still a NAK without further checks. We need to make absolutely
>>>>>> sure that we don't set this when PCIe routing is in use.
>>>>> The only place where AMDGPU_VM_... flags are accepted from user mode
>>>>> seems to be amdgpu_gem_va_ioctl. It has an explicit set of valid_flags
>>>>> it accepts. The new INVALIDATE_PROBE flag is not part of it. That means
>>>>> user mode will  not be able to set it directly. If we added it to the
>>>>> set of valid_flags in amdgpu_gem_va_ioctl, we'd need to add appropriate
>>>>> checks at the same time.
>>>>>
>>>>> KFD does not expose AMDGPU_VM_... flags directly to user mode. It only
>>>>> sets the INVALIDATE_PROBE flag in kernel mode for mappings in the same
>>>>> XGMI hive on Arturus (in patch 4).
>>>>>
>>>>> If there is something I'm missing, please point it out. But AFAICT the
>>>>> checking that is currently done should satisfy your requirements.
>>>> The hardware behavior depends on the placement of the buffer, so at
>>>> bare minimum we need to check if it's pointing to PCIe or local (check
>>>> the system bit).
>>>>
>>>> But even if it's local what is the behavior for local memory? E.g. not
>>>> accessed through XGMI?
>>>>
>>>> As far as I can see what we need to check here is that this is a
>>>> remote access over XGMI and then (and only then!) we are allowed to
>>>> set the snoop bit on the PTE.
>>> My point is, the only place where this bit can get set right now is in
>>> kernel mode in amdgpu_amdkfd_gpuvm.c. See patch 4 of my series. That
>>> code already has all the right conditions to make sure the
>>> INVALIDATE_PROBE bit is only set in the correct circumstances (remote
>>> XGMI mappings in the same hive):
>>>
>>>> +	switch (adev->asic_type) {
>>>> +	case CHIP_ARCTURUS:
>>>> +		if (mem->alloc_flags & ALLOC_MEM_FLAGS_VRAM) {
>>>> +			if (bo_adev == adev) {
>>>> +				...
>>>> +			} else {
>>>> +				...
>>>> +				if (amdgpu_xgmi_same_hive(adev, bo_adev))
>>>> +					mapping_flags |=
>>>> +						AMDGPU_VM_PAGE_INVALIDATE_PROBE;
>>>> +			}
>>>> +		} else {
>>> I think you're asking me to add an extra check for the same conditions
>>> in the GMC code? Like GMC doesn't trust amdgpu_amdkfd. It seems
>>> completely redundant and a bit paranoid to me.
>> Well the job of the VM code is to figure out the flags and location for
>> the PTE based on the current placement BO and the flags given for the
>> mapping.
>>
>> And yes that implies that we don't trust upper layers to do this instead.
> I consider amdgpu_amdkfd_gpuvm as part of the lower layer. It has
> control over the placement of the buffers.
>
> That said, if the GMC code has to figure out the PTE mapping flags based
> on the location of the buffer and the intended usage, it'll be hard to
> avoid some of the abstractions you criticized in Oak's patch series. You
> can't have it both ways. Either you give user mode the responsibility to
> know all the HW details and let user mode determine the mtype and flags
> (which is what you currently do in the GEM interface), or you let the VM
> code determine the flags based on more abstract information from user mode.

Good point, and yes I actually think as well that we shouldn't have had 
the gmc_v9_0_get_vm_pte_flags callback in the first place.

> I see this discussion moving towards a more abstract interface. I'll see
> if I can add that into the GMC IP without breaking the existing AMDGPU
> GEM APIs.

We should probably just revert back the to doing most of the mapping in 
amdgpu_vm_bo_split_mapping().

Here we already have a whole bunch of ASIC specific handling anyway:

>         if (!(mapping->flags & AMDGPU_PTE_READABLE))
>                 flags &= ~AMDGPU_PTE_READABLE;
>         if (!(mapping->flags & AMDGPU_PTE_WRITEABLE))
>                 flags &= ~AMDGPU_PTE_WRITEABLE;
>
>         flags &= ~AMDGPU_PTE_EXECUTABLE;
>         flags |= mapping->flags & AMDGPU_PTE_EXECUTABLE;
>
>         if (adev->asic_type == CHIP_NAVI10) {
>                 flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
>                 flags |= (mapping->flags & AMDGPU_PTE_MTYPE_NV10_MASK);
>         } else {
>                 flags &= ~AMDGPU_PTE_MTYPE_VG10_MASK;
>                 flags |= (mapping->flags & AMDGPU_PTE_MTYPE_VG10_MASK);
>         }
>
>         if ((mapping->flags & AMDGPU_PTE_PRT) &&
>             (adev->asic_type >= CHIP_VEGA10)) {
>                 flags |= AMDGPU_PTE_PRT;
>                 if (adev->asic_type >= CHIP_NAVI10) {
>                         flags |= AMDGPU_PTE_SNOOPED;
>                         flags |= AMDGPU_PTE_LOG;
>                         flags |= AMDGPU_PTE_SYSTEM;
>                 }
>                 flags &= ~AMDGPU_PTE_VALID;
>         }

And now that you mentioned it, the code you proposed wouldn't have 
worked anyway because the AMDGPU_PTE_SNOOPED would have been filtered 
out here :)

Regards,
Christian.

>
> Regards,
>     Felix
>
>
>>> gmc_v9_0_get_vm_pte_flags doesn't even have all the information it needs to make that determination.
>> Well then we probably need to change that.
>>
>> Regards,
>> Christian.
>>
>>> Regards,
>>>       Felix
>>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] drm/amdgpu: Support snooped PTE flag
       [not found]                                 ` <49ba47da-a225-5a21-9014-ccc316c55b60-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-26 22:32                                   ` Kuehling, Felix
  0 siblings, 0 replies; 16+ messages in thread
From: Kuehling, Felix @ 2019-08-26 22:32 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak

On 2019-08-26 2:12 p.m., Koenig, Christian wrote:
>
> We should probably just revert back the to doing most of the mapping in
> amdgpu_vm_bo_split_mapping().
>
> Here we already have a whole bunch of ASIC specific handling anyway:
>
>>          if (!(mapping->flags & AMDGPU_PTE_READABLE))
>>                  flags &= ~AMDGPU_PTE_READABLE;
>>          if (!(mapping->flags & AMDGPU_PTE_WRITEABLE))
>>                  flags &= ~AMDGPU_PTE_WRITEABLE;
>>
>>          flags &= ~AMDGPU_PTE_EXECUTABLE;
>>          flags |= mapping->flags & AMDGPU_PTE_EXECUTABLE;
>>
>>          if (adev->asic_type == CHIP_NAVI10) {
>>                  flags &= ~AMDGPU_PTE_MTYPE_NV10_MASK;
>>                  flags |= (mapping->flags & AMDGPU_PTE_MTYPE_NV10_MASK);
>>          } else {
>>                  flags &= ~AMDGPU_PTE_MTYPE_VG10_MASK;
>>                  flags |= (mapping->flags & AMDGPU_PTE_MTYPE_VG10_MASK);
>>          }
>>
>>          if ((mapping->flags & AMDGPU_PTE_PRT) &&
>>              (adev->asic_type >= CHIP_VEGA10)) {
>>                  flags |= AMDGPU_PTE_PRT;
>>                  if (adev->asic_type >= CHIP_NAVI10) {
>>                          flags |= AMDGPU_PTE_SNOOPED;
>>                          flags |= AMDGPU_PTE_LOG;
>>                          flags |= AMDGPU_PTE_SYSTEM;
>>                  }
>>                  flags &= ~AMDGPU_PTE_VALID;
>>          }
> And now that you mentioned it, the code you proposed wouldn't have
> worked anyway because the AMDGPU_PTE_SNOOPED would have been filtered
> out here :)

OK, and that does look like the right place to set the PTE_SNOOPED bit 
for XGMI mappings on Arcturus. I'll send out an updated patch series 
that no longer needs the AMDGPU_VM_PAGE_INVALIDATE_PROBE bit in the UAPI.

Thanks for the pointer,
   Felix


>
> Regards,
> Christian.
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/4] drm/amdgpu: Support new arcturus mtype
       [not found]     ` <20190826230355.25007-2-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-27  8:57       ` Koenig, Christian
  0 siblings, 0 replies; 16+ messages in thread
From: Koenig, Christian @ 2019-08-27  8:57 UTC (permalink / raw)
  To: Kuehling, Felix, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak

Am 27.08.19 um 01:07 schrieb Kuehling, Felix:
> From: Oak Zeng <Oak.Zeng@amd.com>
>
> Arcturus repurposed mtype WC to RW. Modify gmc functions
> to support the new mtype
>
> Change-Id: Idc338e5386a57020f45262025e2664ab4ba9f291
> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c     | 3 +++
>   drivers/gpu/drm/amd/include/vega10_enum.h | 1 +
>   2 files changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index f77138ba41f6..7aa365cd8d1d 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -603,6 +603,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct amdgpu_device *adev,
>   	case AMDGPU_VM_MTYPE_WC:
>   		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_WC);
>   		break;
> +	case AMDGPU_VM_MTYPE_RW:
> +		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_RW);
> +		break;
>   	case AMDGPU_VM_MTYPE_CC:
>   		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_CC);
>   		break;
> diff --git a/drivers/gpu/drm/amd/include/vega10_enum.h b/drivers/gpu/drm/amd/include/vega10_enum.h
> index c14ba65a2415..adf1b754666e 100644
> --- a/drivers/gpu/drm/amd/include/vega10_enum.h
> +++ b/drivers/gpu/drm/amd/include/vega10_enum.h
> @@ -1037,6 +1037,7 @@ TCC_CACHE_POLICY_STREAM                  = 0x00000001,
>   typedef enum MTYPE {
>   MTYPE_NC                                 = 0x00000000,
>   MTYPE_WC                                 = 0x00000001,
> +MTYPE_RW                                 = 0x00000001,
>   MTYPE_CC                                 = 0x00000002,
>   MTYPE_UC                                 = 0x00000003,
>   } MTYPE;

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 2/4] drm/amdgpu: Support new arcturus mtype
       [not found] ` <20190826230355.25007-1-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-26 23:07   ` Kuehling, Felix
       [not found]     ` <20190826230355.25007-2-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Kuehling, Felix @ 2019-08-26 23:07 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW; +Cc: Zeng, Oak, Koenig, Christian

From: Oak Zeng <Oak.Zeng@amd.com>

Arcturus repurposed mtype WC to RW. Modify gmc functions
to support the new mtype

Change-Id: Idc338e5386a57020f45262025e2664ab4ba9f291
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c     | 3 +++
 drivers/gpu/drm/amd/include/vega10_enum.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index f77138ba41f6..7aa365cd8d1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -603,6 +603,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct amdgpu_device *adev,
 	case AMDGPU_VM_MTYPE_WC:
 		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_WC);
 		break;
+	case AMDGPU_VM_MTYPE_RW:
+		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_RW);
+		break;
 	case AMDGPU_VM_MTYPE_CC:
 		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_CC);
 		break;
diff --git a/drivers/gpu/drm/amd/include/vega10_enum.h b/drivers/gpu/drm/amd/include/vega10_enum.h
index c14ba65a2415..adf1b754666e 100644
--- a/drivers/gpu/drm/amd/include/vega10_enum.h
+++ b/drivers/gpu/drm/amd/include/vega10_enum.h
@@ -1037,6 +1037,7 @@ TCC_CACHE_POLICY_STREAM                  = 0x00000001,
 typedef enum MTYPE {
 MTYPE_NC                                 = 0x00000000,
 MTYPE_WC                                 = 0x00000001,
+MTYPE_RW                                 = 0x00000001,
 MTYPE_CC                                 = 0x00000002,
 MTYPE_UC                                 = 0x00000003,
 } MTYPE;
-- 
2.17.1

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2019-08-27  8:57 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-23 21:33 [PATCH 0/4] KFD: Mapping-specific MTYPEs on Arcturus Kuehling, Felix
     [not found] ` <20190823213249.10749-1-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2019-08-23 21:33   ` [PATCH 2/4] drm/amdgpu: Support new arcturus mtype Kuehling, Felix
2019-08-23 21:33   ` [PATCH 1/4] drm/amdgpu: Extends amdgpu vm definitions Kuehling, Felix
     [not found]     ` <20190823213249.10749-2-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2019-08-24 11:12       ` Christian König
2019-08-23 21:33   ` [PATCH 3/4] drm/amdgpu: Support snooped PTE flag Kuehling, Felix
     [not found]     ` <20190823213249.10749-4-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2019-08-24 11:13       ` Christian König
     [not found]         ` <f09e6893-347b-4ade-76e5-ad37d8e4e782-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2019-08-24 16:24           ` Kuehling, Felix
     [not found]             ` <a7f9ad48-2e46-5415-e2a8-1738a101d716-5C7GfCeVMHo@public.gmane.org>
2019-08-24 18:59               ` Christian König
     [not found]                 ` <96b6ac1d-de87-3fdb-a531-af4b0a42f1d5-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2019-08-26 15:45                   ` Kuehling, Felix
     [not found]                     ` <4b48fc1a-6ee2-bb60-0518-ea9c6346b8d6-5C7GfCeVMHo@public.gmane.org>
2019-08-26 17:16                       ` Koenig, Christian
     [not found]                         ` <89bf5baa-4b63-e40a-7995-fa35bad988b7-5C7GfCeVMHo@public.gmane.org>
2019-08-26 18:03                           ` Kuehling, Felix
     [not found]                             ` <216f63db-78c5-1098-bea5-2f379b0bf051-5C7GfCeVMHo@public.gmane.org>
2019-08-26 18:12                               ` Koenig, Christian
     [not found]                                 ` <49ba47da-a225-5a21-9014-ccc316c55b60-5C7GfCeVMHo@public.gmane.org>
2019-08-26 22:32                                   ` Kuehling, Felix
2019-08-23 21:33   ` [PATCH 4/4] drm/amdgpu: Determing PTE flags separately for each mapping Kuehling, Felix
2019-08-26 23:07 [PATCH 1/4] drm/amdgpu: Extends amdgpu vm definitions (v2) Kuehling, Felix
     [not found] ` <20190826230355.25007-1-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2019-08-26 23:07   ` [PATCH 2/4] drm/amdgpu: Support new arcturus mtype Kuehling, Felix
     [not found]     ` <20190826230355.25007-2-Felix.Kuehling-5C7GfCeVMHo@public.gmane.org>
2019-08-27  8:57       ` Koenig, Christian

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.