All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/5] drm/amdgpu: Extends amdgpu vm definitions
@ 2019-08-09  2:15 Zeng, Oak
       [not found] ` <1565316926-19516-1-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Zeng, Oak @ 2019-08-09  2:15 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Zeng, Oak, Koenig, Christian, Keely, Sean

Add definition of all supported mtypes. The RW mtype
is recently introduced for arcturus. Also add definition
of a flag to probe and possibly invalidate remote GPU
cache, which will be used later in this series.

Change-Id: I96fc9bb4b6b1e62bdc10b600d8aaa6a802128d6d
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 9 +++++++--
 include/uapi/drm/amdgpu_drm.h          | 4 ++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 2eda3a8..7a77477 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -80,8 +80,13 @@ struct amdgpu_bo_list_entry;
 #define AMDGPU_PTE_MTYPE_VG10(a)	((uint64_t)(a) << 57)
 #define AMDGPU_PTE_MTYPE_VG10_MASK	AMDGPU_PTE_MTYPE_VG10(3ULL)
 
-#define AMDGPU_MTYPE_NC 0
-#define AMDGPU_MTYPE_CC 2
+enum amdgpu_mtype {
+	AMDGPU_MTYPE_NC = 0,
+	AMDGPU_MTYPE_WC = 1,
+	AMDGPU_MTYPE_CC = 2,
+	AMDGPU_MTYPE_UC = 3,
+	AMDGPU_MTYPE_RW = 4,
+};
 
 #define AMDGPU_PTE_DEFAULT_ATC  (AMDGPU_PTE_SYSTEM      \
                                 | AMDGPU_PTE_SNOOPED    \
diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
index ca97b68..97e8e51 100644
--- a/include/uapi/drm/amdgpu_drm.h
+++ b/include/uapi/drm/amdgpu_drm.h
@@ -503,6 +503,10 @@ struct drm_amdgpu_gem_op {
 #define AMDGPU_VM_MTYPE_CC		(3 << 5)
 /* Use UC MTYPE instead of default MTYPE */
 #define AMDGPU_VM_MTYPE_UC		(4 << 5)
+/* Use RW MTYPE instead of default MTYPE */
+#define AMDGPU_VM_MTYPE_RW		(5 << 5)
+/* Cacheable/snoopable */
+#define AMDGPU_VM_PAGE_INVALIDATE_PROBE	(1 << 9)
 
 struct drm_amdgpu_gem_va {
 	/** GEM object handle */
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/5] drm/amdgpu: Support new arcturus mtype
       [not found] ` <1565316926-19516-1-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-09  2:15   ` Zeng, Oak
  2019-08-09  2:15   ` [PATCH 3/5] drm/amdkfd: Postpone memory mapping flags calculation to mapping time Zeng, Oak
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Zeng, Oak @ 2019-08-09  2:15 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Zeng, Oak, Koenig, Christian, Keely, Sean

Arcturus repurposed mtype WC to RW. Modify gmc functions
to support the new mtype

Change-Id: Idc338e5386a57020f45262025e2664ab4ba9f291
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c     | 3 +++
 drivers/gpu/drm/amd/include/vega10_enum.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index a2aa35e..e6adc86 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -582,6 +582,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct amdgpu_device *adev,
 	case AMDGPU_VM_MTYPE_WC:
 		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_WC);
 		break;
+	case AMDGPU_VM_MTYPE_RW:
+		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_RW);
+		break;
 	case AMDGPU_VM_MTYPE_CC:
 		pte_flag |= AMDGPU_PTE_MTYPE_VG10(MTYPE_CC);
 		break;
diff --git a/drivers/gpu/drm/amd/include/vega10_enum.h b/drivers/gpu/drm/amd/include/vega10_enum.h
index c14ba65..adf1b75 100644
--- a/drivers/gpu/drm/amd/include/vega10_enum.h
+++ b/drivers/gpu/drm/amd/include/vega10_enum.h
@@ -1037,6 +1037,7 @@ TCC_CACHE_POLICY_STREAM                  = 0x00000001,
 typedef enum MTYPE {
 MTYPE_NC                                 = 0x00000000,
 MTYPE_WC                                 = 0x00000001,
+MTYPE_RW                                 = 0x00000001,
 MTYPE_CC                                 = 0x00000002,
 MTYPE_UC                                 = 0x00000003,
 } MTYPE;
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/5] drm/amdkfd: Postpone memory mapping flags calculation to mapping time
       [not found] ` <1565316926-19516-1-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
  2019-08-09  2:15   ` [PATCH 2/5] drm/amdgpu: Support new arcturus mtype Zeng, Oak
@ 2019-08-09  2:15   ` Zeng, Oak
       [not found]     ` <1565316926-19516-3-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
  2019-08-09  2:15   ` [PATCH 4/5] drm/amdgpu: Support snooped PTE flag Zeng, Oak
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Zeng, Oak @ 2019-08-09  2:15 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Zeng, Oak, Koenig, Christian, Keely, Sean

Some mapping flags are decided by memory mapping destination which is not
know at memory object allocation time. So it is reasonable to decide memory
mapping flags at mapping time, instead of alloc time. Record memory allocation
flags during allocation time and calculate mapping flags during memory mapping
time.

Also made the memory mapping flage calculation to be asic-specific, because
different asic can have different mapping scheme.

The new memory mapping flag is decided by both memory allocation flags and
whether the memory mapping is to a remote device.

This is preparation work to introduce more sophisticated mapping scheme.

Change-Id: I98e7c47d850585ad7f0a9e11617c92b7a9aced3b
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h       |  1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++++-----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h          |  4 ++++
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c           | 20 +++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c            | 19 ++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c            | 20 +++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c            | 21 ++++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c            | 21 ++++++++++++++++++++-
 8 files changed, 106 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index e519df3..026475a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -58,6 +58,7 @@ struct kgd_mem {
 	uint64_t va;
 
 	uint32_t mapping_flags;
+	uint32_t alloc_flags;
 
 	atomic_t invalid;
 	struct amdkfd_process_info *process_info;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 44a52b0..f91ef48 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1081,7 +1081,6 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
 	int byte_align;
 	u32 domain, alloc_domain;
 	u64 alloc_flags;
-	uint32_t mapping_flags;
 	int ret;
 
 	/*
@@ -1143,16 +1142,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
 			adev->asic_type != CHIP_VEGAM) ?
 			VI_BO_SIZE_ALIGN : 1;
 
-	mapping_flags = AMDGPU_VM_PAGE_READABLE;
-	if (flags & ALLOC_MEM_FLAGS_WRITABLE)
-		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
-	if (flags & ALLOC_MEM_FLAGS_EXECUTABLE)
-		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
-	if (flags & ALLOC_MEM_FLAGS_COHERENT)
-		mapping_flags |= AMDGPU_VM_MTYPE_UC;
-	else
-		mapping_flags |= AMDGPU_VM_MTYPE_NC;
-	(*mem)->mapping_flags = mapping_flags;
+	(*mem)->alloc_flags = flags;
 
 	amdgpu_sync_create(&(*mem)->sync);
 
@@ -1298,6 +1288,7 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
 		struct kgd_dev *kgd, struct kgd_mem *mem, void *vm)
 {
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
+	struct amdgpu_device *bo_adev;
 	struct amdgpu_vm *avm = (struct amdgpu_vm *)vm;
 	int ret;
 	struct amdgpu_bo *bo;
@@ -1315,6 +1306,9 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
 		return -EINVAL;
 	}
 
+	bo_adev = amdgpu_ttm_adev(bo->tbo.bdev);
+	mem->mapping_flags = amdgpu_gmc_get_vm_mapping_flags(adev, mem->alloc_flags, bo_adev != adev);
+
 	/* Make sure restore is not running concurrently. Since we
 	 * don't map invalid userptr BOs, we rely on the next restore
 	 * worker to do the mapping
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
index 071145a..6bd0c28 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
@@ -105,6 +105,9 @@ struct amdgpu_gmc_funcs {
 	/* get the pde for a given mc addr */
 	void (*get_vm_pde)(struct amdgpu_device *adev, int level,
 			   u64 *dst, u64 *flags);
+	/* get per asic vm mapping flags */
+	uint32_t (*get_vm_mapping_flags)(struct amdgpu_device *adev,
+			uint32_t alloc_flags, bool remote_mapping);
 };
 
 struct amdgpu_xgmi {
@@ -185,6 +188,7 @@ struct amdgpu_gmc {
 #define amdgpu_gmc_emit_pasid_mapping(r, vmid, pasid) (r)->adev->gmc.gmc_funcs->emit_pasid_mapping((r), (vmid), (pasid))
 #define amdgpu_gmc_get_vm_pde(adev, level, dst, flags) (adev)->gmc.gmc_funcs->get_vm_pde((adev), (level), (dst), (flags))
 #define amdgpu_gmc_get_pte_flags(adev, flags) (adev)->gmc.gmc_funcs->get_vm_pte_flags((adev),(flags))
+#define amdgpu_gmc_get_vm_mapping_flags(adev, alloc_flags, remote_mapping) ((adev)->gmc.gmc_funcs->get_vm_mapping_flags((adev), (alloc_flags), (remote_mapping)))
 
 /**
  * amdgpu_gmc_vram_full_visible - Check if full VRAM is visible through the BAR
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index 4e3ac10..7e420e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -415,12 +415,30 @@ static void gmc_v10_0_get_vm_pde(struct amdgpu_device *adev, int level,
 	}
 }
 
+static uint32_t gmc_v10_0_get_vm_mapping_flags(struct amdgpu_device *adev,
+				uint32_t alloc_flags, bool remote_mapping)
+{
+	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
+
+	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
+	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
+	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
+		mapping_flags |= AMDGPU_VM_MTYPE_UC;
+	else
+		mapping_flags |= AMDGPU_VM_MTYPE_NC;
+
+	return mapping_flags;
+}
+
 static const struct amdgpu_gmc_funcs gmc_v10_0_gmc_funcs = {
 	.flush_gpu_tlb = gmc_v10_0_flush_gpu_tlb,
 	.emit_flush_gpu_tlb = gmc_v10_0_emit_flush_gpu_tlb,
 	.emit_pasid_mapping = gmc_v10_0_emit_pasid_mapping,
 	.get_vm_pte_flags = gmc_v10_0_get_vm_pte_flags,
-	.get_vm_pde = gmc_v10_0_get_vm_pde
+	.get_vm_pde = gmc_v10_0_get_vm_pde,
+	.get_vm_mapping_flags = gmc_v10_0_get_vm_mapping_flags
 };
 
 static void gmc_v10_0_set_gmc_funcs(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index b06d876..2b2af6a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -404,6 +404,22 @@ static void gmc_v6_0_get_vm_pde(struct amdgpu_device *adev, int level,
 	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
 }
 
+static uint32_t gmc_v6_0_get_vm_mapping_flags(struct amdgpu_device *adev,
+				uint32_t alloc_flags, bool remote_mapping)
+{
+	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
+
+	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
+	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
+	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
+		mapping_flags |= AMDGPU_VM_MTYPE_UC;
+	else
+		mapping_flags |= AMDGPU_VM_MTYPE_NC;
+
+	return mapping_flags;
+}
 static void gmc_v6_0_set_fault_enable_default(struct amdgpu_device *adev,
 					      bool value)
 {
@@ -1157,7 +1173,8 @@ static const struct amdgpu_gmc_funcs gmc_v6_0_gmc_funcs = {
 	.emit_flush_gpu_tlb = gmc_v6_0_emit_flush_gpu_tlb,
 	.set_prt = gmc_v6_0_set_prt,
 	.get_vm_pde = gmc_v6_0_get_vm_pde,
-	.get_vm_pte_flags = gmc_v6_0_get_vm_pte_flags
+	.get_vm_pte_flags = gmc_v6_0_get_vm_pte_flags,
+	.get_vm_mapping_flags = gmc_v6_0_get_vm_mapping_flags
 };
 
 static const struct amdgpu_irq_src_funcs gmc_v6_0_irq_funcs = {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 75aa333..619862e 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -481,6 +481,23 @@ static void gmc_v7_0_get_vm_pde(struct amdgpu_device *adev, int level,
 	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
 }
 
+static uint32_t gmc_v7_0_get_vm_mapping_flags(struct amdgpu_device *adev,
+				uint32_t alloc_flags, bool remote_mapping)
+{
+	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
+
+	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
+	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
+	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
+		mapping_flags |= AMDGPU_VM_MTYPE_UC;
+	else
+		mapping_flags |= AMDGPU_VM_MTYPE_NC;
+
+	return mapping_flags;
+}
+
 /**
  * gmc_v8_0_set_fault_enable_default - update VM fault handling
  *
@@ -1353,7 +1370,8 @@ static const struct amdgpu_gmc_funcs gmc_v7_0_gmc_funcs = {
 	.emit_pasid_mapping = gmc_v7_0_emit_pasid_mapping,
 	.set_prt = gmc_v7_0_set_prt,
 	.get_vm_pte_flags = gmc_v7_0_get_vm_pte_flags,
-	.get_vm_pde = gmc_v7_0_get_vm_pde
+	.get_vm_pde = gmc_v7_0_get_vm_pde,
+	.get_vm_mapping_flags = gmc_v7_0_get_vm_mapping_flags
 };
 
 static const struct amdgpu_irq_src_funcs gmc_v7_0_irq_funcs = {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index 8bf2ba3..d2cecb30 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -706,6 +706,24 @@ static void gmc_v8_0_get_vm_pde(struct amdgpu_device *adev, int level,
 	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
 }
 
+static uint32_t gmc_v8_0_get_vm_mapping_flags(struct amdgpu_device *adev,
+				uint32_t alloc_flags, bool remote_mapping)
+{
+	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
+
+	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
+	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
+	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
+		mapping_flags |= AMDGPU_VM_MTYPE_UC;
+	else
+		mapping_flags |= AMDGPU_VM_MTYPE_NC;
+
+	return mapping_flags;
+}
+
+
 /**
  * gmc_v8_0_set_fault_enable_default - update VM fault handling
  *
@@ -1721,7 +1739,8 @@ static const struct amdgpu_gmc_funcs gmc_v8_0_gmc_funcs = {
 	.emit_pasid_mapping = gmc_v8_0_emit_pasid_mapping,
 	.set_prt = gmc_v8_0_set_prt,
 	.get_vm_pte_flags = gmc_v8_0_get_vm_pte_flags,
-	.get_vm_pde = gmc_v8_0_get_vm_pde
+	.get_vm_pde = gmc_v8_0_get_vm_pde,
+	.get_vm_mapping_flags = gmc_v8_0_get_vm_mapping_flags
 };
 
 static const struct amdgpu_irq_src_funcs gmc_v8_0_irq_funcs = {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index e6adc86..d709902 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -626,12 +626,31 @@ static void gmc_v9_0_get_vm_pde(struct amdgpu_device *adev, int level,
 	}
 }
 
+static uint32_t gmc_v9_0_get_vm_mapping_flags(struct amdgpu_device *adev,
+				uint32_t alloc_flags, bool remote_mapping)
+{
+	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
+
+	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
+	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
+	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
+		mapping_flags |= AMDGPU_VM_MTYPE_UC;
+	else
+		mapping_flags |= AMDGPU_VM_MTYPE_NC;
+
+	return mapping_flags;
+}
+
+
 static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = {
 	.flush_gpu_tlb = gmc_v9_0_flush_gpu_tlb,
 	.emit_flush_gpu_tlb = gmc_v9_0_emit_flush_gpu_tlb,
 	.emit_pasid_mapping = gmc_v9_0_emit_pasid_mapping,
 	.get_vm_pte_flags = gmc_v9_0_get_vm_pte_flags,
-	.get_vm_pde = gmc_v9_0_get_vm_pde
+	.get_vm_pde = gmc_v9_0_get_vm_pde,
+	.get_vm_mapping_flags = gmc_v9_0_get_vm_mapping_flags
 };
 
 static void gmc_v9_0_set_gmc_funcs(struct amdgpu_device *adev)
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 4/5] drm/amdgpu: Support snooped PTE flag
       [not found] ` <1565316926-19516-1-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
  2019-08-09  2:15   ` [PATCH 2/5] drm/amdgpu: Support new arcturus mtype Zeng, Oak
  2019-08-09  2:15   ` [PATCH 3/5] drm/amdkfd: Postpone memory mapping flags calculation to mapping time Zeng, Oak
@ 2019-08-09  2:15   ` Zeng, Oak
       [not found]     ` <1565316926-19516-4-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
  2019-08-09  2:15   ` [PATCH 5/5] amd/amdgpu: Introduce new page mapping scheme for arcturus Zeng, Oak
  2019-08-09 12:28   ` [PATCH 1/5] drm/amdgpu: Extends amdgpu vm definitions Koenig, Christian
  4 siblings, 1 reply; 14+ messages in thread
From: Zeng, Oak @ 2019-08-09  2:15 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Zeng, Oak, Koenig, Christian, Keely, Sean

Set snooped PTE flag according to mapping flag. Write request to a
page with snooped bit set, will send out invalidate probe request
to TCC of the remote GPU where the vram page resides.

Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index d709902..8faead3 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -599,6 +599,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct amdgpu_device *adev,
 	if (flags & AMDGPU_VM_PAGE_PRT)
 		pte_flag |= AMDGPU_PTE_PRT;
 
+	if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
+		pte_flag |= AMDGPU_PTE_SNOOPED;
+
 	return pte_flag;
 }
 
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 5/5] amd/amdgpu: Introduce new page mapping scheme for arcturus
       [not found] ` <1565316926-19516-1-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
                     ` (2 preceding siblings ...)
  2019-08-09  2:15   ` [PATCH 4/5] drm/amdgpu: Support snooped PTE flag Zeng, Oak
@ 2019-08-09  2:15   ` Zeng, Oak
  2019-08-09 12:28   ` [PATCH 1/5] drm/amdgpu: Extends amdgpu vm definitions Koenig, Christian
  4 siblings, 0 replies; 14+ messages in thread
From: Zeng, Oak @ 2019-08-09  2:15 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Zeng, Oak, Koenig, Christian, Keely, Sean

The new memory mapping scheme is:
For vram:
Fine-grain coherency: local CC and remote UC, with snoop.
Coarse-grain coherency: local RW and remote UC, with snoop.
For host memory (not changed)
Fine-grain coherency: UC
Coarse-grain coherency: NC

Change-Id: I6a071249f953cbed813bfd953b6a2e0826f54f86
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c |  1 +
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 47 ++++++++++++++++++++++++++++++++++-
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index 2b2af6a..51dae7f 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -420,6 +420,7 @@ static uint32_t gmc_v6_0_get_vm_mapping_flags(struct amdgpu_device *adev,
 
 	return mapping_flags;
 }
+
 static void gmc_v6_0_set_fault_enable_default(struct amdgpu_device *adev,
 					      bool value)
 {
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 8faead3..2f742e3 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -646,6 +646,35 @@ static uint32_t gmc_v9_0_get_vm_mapping_flags(struct amdgpu_device *adev,
 	return mapping_flags;
 }
 
+static uint32_t gmc_v9_0_arcturus_get_vm_mapping_flags(struct amdgpu_device *adev,
+				uint32_t alloc_flags, bool remote_mapping)
+{
+	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
+
+	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
+	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
+		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
+
+	if ((alloc_flags & ALLOC_MEM_FLAGS_VRAM)) {
+		if (!remote_mapping) {
+			if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
+				mapping_flags |= AMDGPU_VM_MTYPE_CC;
+			else
+				mapping_flags |= AMDGPU_VM_MTYPE_RW;
+		} else {
+			mapping_flags |= AMDGPU_VM_MTYPE_UC;
+			mapping_flags |= AMDGPU_VM_PAGE_INVALIDATE_PROBE;
+		}
+	} else {
+		if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
+			mapping_flags |= AMDGPU_VM_MTYPE_UC;
+		else
+			mapping_flags |= AMDGPU_VM_MTYPE_NC;
+	}
+
+	return mapping_flags;
+}
 
 static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = {
 	.flush_gpu_tlb = gmc_v9_0_flush_gpu_tlb,
@@ -656,9 +685,25 @@ static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = {
 	.get_vm_mapping_flags = gmc_v9_0_get_vm_mapping_flags
 };
 
+static const struct amdgpu_gmc_funcs gmc_v9_0_arcturus_gmc_funcs = {
+	.flush_gpu_tlb = gmc_v9_0_flush_gpu_tlb,
+	.emit_flush_gpu_tlb = gmc_v9_0_emit_flush_gpu_tlb,
+	.emit_pasid_mapping = gmc_v9_0_emit_pasid_mapping,
+	.get_vm_pte_flags = gmc_v9_0_get_vm_pte_flags,
+	.get_vm_pde = gmc_v9_0_get_vm_pde,
+	.get_vm_mapping_flags = gmc_v9_0_arcturus_get_vm_mapping_flags
+};
+
 static void gmc_v9_0_set_gmc_funcs(struct amdgpu_device *adev)
 {
-	adev->gmc.gmc_funcs = &gmc_v9_0_gmc_funcs;
+	switch(adev->asic_type) {
+	case CHIP_ARCTURUS:
+		adev->gmc.gmc_funcs = &gmc_v9_0_arcturus_gmc_funcs;
+		break;
+	default:
+		adev->gmc.gmc_funcs = &gmc_v9_0_gmc_funcs;
+		break;
+	}
 }
 
 static void gmc_v9_0_set_umc_funcs(struct amdgpu_device *adev)
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/5] drm/amdgpu: Extends amdgpu vm definitions
       [not found] ` <1565316926-19516-1-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
                     ` (3 preceding siblings ...)
  2019-08-09  2:15   ` [PATCH 5/5] amd/amdgpu: Introduce new page mapping scheme for arcturus Zeng, Oak
@ 2019-08-09 12:28   ` Koenig, Christian
       [not found]     ` <801142da-51cb-4efd-2cd6-860c65bfb311-5C7GfCeVMHo@public.gmane.org>
  4 siblings, 1 reply; 14+ messages in thread
From: Koenig, Christian @ 2019-08-09 12:28 UTC (permalink / raw)
  To: Zeng, Oak, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Keely, Sean

Am 09.08.19 um 04:15 schrieb Zeng, Oak:
> Add definition of all supported mtypes. The RW mtype
> is recently introduced for arcturus. Also add definition
> of a flag to probe and possibly invalidate remote GPU
> cache, which will be used later in this series.
>
> Change-Id: I96fc9bb4b6b1e62bdc10b600d8aaa6a802128d6d
> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 9 +++++++--
>   include/uapi/drm/amdgpu_drm.h          | 4 ++++
>   2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 2eda3a8..7a77477 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -80,8 +80,13 @@ struct amdgpu_bo_list_entry;
>   #define AMDGPU_PTE_MTYPE_VG10(a)	((uint64_t)(a) << 57)
>   #define AMDGPU_PTE_MTYPE_VG10_MASK	AMDGPU_PTE_MTYPE_VG10(3ULL)
>   
> -#define AMDGPU_MTYPE_NC 0
> -#define AMDGPU_MTYPE_CC 2
> +enum amdgpu_mtype {
> +	AMDGPU_MTYPE_NC = 0,
> +	AMDGPU_MTYPE_WC = 1,
> +	AMDGPU_MTYPE_CC = 2,
> +	AMDGPU_MTYPE_UC = 3,
> +	AMDGPU_MTYPE_RW = 4,
> +};

Mhm, why an enum?

Christian.

>   
>   #define AMDGPU_PTE_DEFAULT_ATC  (AMDGPU_PTE_SYSTEM      \
>                                   | AMDGPU_PTE_SNOOPED    \
> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
> index ca97b68..97e8e51 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -503,6 +503,10 @@ struct drm_amdgpu_gem_op {
>   #define AMDGPU_VM_MTYPE_CC		(3 << 5)
>   /* Use UC MTYPE instead of default MTYPE */
>   #define AMDGPU_VM_MTYPE_UC		(4 << 5)
> +/* Use RW MTYPE instead of default MTYPE */
> +#define AMDGPU_VM_MTYPE_RW		(5 << 5)
> +/* Cacheable/snoopable */
> +#define AMDGPU_VM_PAGE_INVALIDATE_PROBE	(1 << 9)
>   
>   struct drm_amdgpu_gem_va {
>   	/** GEM object handle */

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/5] drm/amdkfd: Postpone memory mapping flags calculation to mapping time
       [not found]     ` <1565316926-19516-3-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-09 12:30       ` Koenig, Christian
       [not found]         ` <95459b48-c8e5-b54c-00f1-8a1fdccd7e66-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Koenig, Christian @ 2019-08-09 12:30 UTC (permalink / raw)
  To: Zeng, Oak, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Keely, Sean

Am 09.08.19 um 04:15 schrieb Zeng, Oak:
> Some mapping flags are decided by memory mapping destination which is not
> know at memory object allocation time. So it is reasonable to decide memory
> mapping flags at mapping time, instead of alloc time. Record memory allocation
> flags during allocation time and calculate mapping flags during memory mapping
> time.
>
> Also made the memory mapping flage calculation to be asic-specific, because
> different asic can have different mapping scheme.
>
> The new memory mapping flag is decided by both memory allocation flags and
> whether the memory mapping is to a remote device.
>
> This is preparation work to introduce more sophisticated mapping scheme.
>
> Change-Id: I98e7c47d850585ad7f0a9e11617c92b7a9aced3b
> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h       |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++++-----------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h          |  4 ++++
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c           | 20 +++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c            | 19 ++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c            | 20 +++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c            | 21 ++++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c            | 21 ++++++++++++++++++++-
>   8 files changed, 106 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index e519df3..026475a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -58,6 +58,7 @@ struct kgd_mem {
>   	uint64_t va;
>   
>   	uint32_t mapping_flags;
> +	uint32_t alloc_flags;
>   
>   	atomic_t invalid;
>   	struct amdkfd_process_info *process_info;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 44a52b0..f91ef48 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1081,7 +1081,6 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   	int byte_align;
>   	u32 domain, alloc_domain;
>   	u64 alloc_flags;
> -	uint32_t mapping_flags;
>   	int ret;
>   
>   	/*
> @@ -1143,16 +1142,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   			adev->asic_type != CHIP_VEGAM) ?
>   			VI_BO_SIZE_ALIGN : 1;
>   
> -	mapping_flags = AMDGPU_VM_PAGE_READABLE;
> -	if (flags & ALLOC_MEM_FLAGS_WRITABLE)
> -		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> -	if (flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> -		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> -	if (flags & ALLOC_MEM_FLAGS_COHERENT)
> -		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> -	else
> -		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> -	(*mem)->mapping_flags = mapping_flags;
> +	(*mem)->alloc_flags = flags;
>   
>   	amdgpu_sync_create(&(*mem)->sync);
>   
> @@ -1298,6 +1288,7 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
>   		struct kgd_dev *kgd, struct kgd_mem *mem, void *vm)
>   {
>   	struct amdgpu_device *adev = get_amdgpu_device(kgd);
> +	struct amdgpu_device *bo_adev;
>   	struct amdgpu_vm *avm = (struct amdgpu_vm *)vm;
>   	int ret;
>   	struct amdgpu_bo *bo;
> @@ -1315,6 +1306,9 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
>   		return -EINVAL;
>   	}
>   
> +	bo_adev = amdgpu_ttm_adev(bo->tbo.bdev);
> +	mem->mapping_flags = amdgpu_gmc_get_vm_mapping_flags(adev, mem->alloc_flags, bo_adev != adev);
> +
>   	/* Make sure restore is not running concurrently. Since we
>   	 * don't map invalid userptr BOs, we rely on the next restore
>   	 * worker to do the mapping
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> index 071145a..6bd0c28 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> @@ -105,6 +105,9 @@ struct amdgpu_gmc_funcs {
>   	/* get the pde for a given mc addr */
>   	void (*get_vm_pde)(struct amdgpu_device *adev, int level,
>   			   u64 *dst, u64 *flags);
> +	/* get per asic vm mapping flags */
> +	uint32_t (*get_vm_mapping_flags)(struct amdgpu_device *adev,
> +			uint32_t alloc_flags, bool remote_mapping);
>   };
>   
>   struct amdgpu_xgmi {
> @@ -185,6 +188,7 @@ struct amdgpu_gmc {
>   #define amdgpu_gmc_emit_pasid_mapping(r, vmid, pasid) (r)->adev->gmc.gmc_funcs->emit_pasid_mapping((r), (vmid), (pasid))
>   #define amdgpu_gmc_get_vm_pde(adev, level, dst, flags) (adev)->gmc.gmc_funcs->get_vm_pde((adev), (level), (dst), (flags))
>   #define amdgpu_gmc_get_pte_flags(adev, flags) (adev)->gmc.gmc_funcs->get_vm_pte_flags((adev),(flags))
> +#define amdgpu_gmc_get_vm_mapping_flags(adev, alloc_flags, remote_mapping) ((adev)->gmc.gmc_funcs->get_vm_mapping_flags((adev), (alloc_flags), (remote_mapping)))
>   
>   /**
>    * amdgpu_gmc_vram_full_visible - Check if full VRAM is visible through the BAR
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index 4e3ac10..7e420e0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -415,12 +415,30 @@ static void gmc_v10_0_get_vm_pde(struct amdgpu_device *adev, int level,
>   	}
>   }
>   
> +static uint32_t gmc_v10_0_get_vm_mapping_flags(struct amdgpu_device *adev,
> +				uint32_t alloc_flags, bool remote_mapping)
> +{
> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
> +
> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> +	else
> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> +
> +	return mapping_flags;
> +}
> +
>   static const struct amdgpu_gmc_funcs gmc_v10_0_gmc_funcs = {
>   	.flush_gpu_tlb = gmc_v10_0_flush_gpu_tlb,
>   	.emit_flush_gpu_tlb = gmc_v10_0_emit_flush_gpu_tlb,
>   	.emit_pasid_mapping = gmc_v10_0_emit_pasid_mapping,
>   	.get_vm_pte_flags = gmc_v10_0_get_vm_pte_flags,
> -	.get_vm_pde = gmc_v10_0_get_vm_pde
> +	.get_vm_pde = gmc_v10_0_get_vm_pde,
> +	.get_vm_mapping_flags = gmc_v10_0_get_vm_mapping_flags
>   };
>   
>   static void gmc_v10_0_set_gmc_funcs(struct amdgpu_device *adev)
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> index b06d876..2b2af6a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> @@ -404,6 +404,22 @@ static void gmc_v6_0_get_vm_pde(struct amdgpu_device *adev, int level,
>   	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
>   }
>   
> +static uint32_t gmc_v6_0_get_vm_mapping_flags(struct amdgpu_device *adev,
> +				uint32_t alloc_flags, bool remote_mapping)
> +{
> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
> +
> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> +	else
> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> +
> +	return mapping_flags;
> +}
>   static void gmc_v6_0_set_fault_enable_default(struct amdgpu_device *adev,
>   					      bool value)
>   {
> @@ -1157,7 +1173,8 @@ static const struct amdgpu_gmc_funcs gmc_v6_0_gmc_funcs = {
>   	.emit_flush_gpu_tlb = gmc_v6_0_emit_flush_gpu_tlb,
>   	.set_prt = gmc_v6_0_set_prt,
>   	.get_vm_pde = gmc_v6_0_get_vm_pde,
> -	.get_vm_pte_flags = gmc_v6_0_get_vm_pte_flags
> +	.get_vm_pte_flags = gmc_v6_0_get_vm_pte_flags,
> +	.get_vm_mapping_flags = gmc_v6_0_get_vm_mapping_flags

That looks like a duplication of what get_vm_pte_flags is supposed to be 
doing.

Christian.

>   };
>   
>   static const struct amdgpu_irq_src_funcs gmc_v6_0_irq_funcs = {
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> index 75aa333..619862e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> @@ -481,6 +481,23 @@ static void gmc_v7_0_get_vm_pde(struct amdgpu_device *adev, int level,
>   	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
>   }
>   
> +static uint32_t gmc_v7_0_get_vm_mapping_flags(struct amdgpu_device *adev,
> +				uint32_t alloc_flags, bool remote_mapping)
> +{
> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
> +
> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> +	else
> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> +
> +	return mapping_flags;
> +}
> +
>   /**
>    * gmc_v8_0_set_fault_enable_default - update VM fault handling
>    *
> @@ -1353,7 +1370,8 @@ static const struct amdgpu_gmc_funcs gmc_v7_0_gmc_funcs = {
>   	.emit_pasid_mapping = gmc_v7_0_emit_pasid_mapping,
>   	.set_prt = gmc_v7_0_set_prt,
>   	.get_vm_pte_flags = gmc_v7_0_get_vm_pte_flags,
> -	.get_vm_pde = gmc_v7_0_get_vm_pde
> +	.get_vm_pde = gmc_v7_0_get_vm_pde,
> +	.get_vm_mapping_flags = gmc_v7_0_get_vm_mapping_flags
>   };
>   
>   static const struct amdgpu_irq_src_funcs gmc_v7_0_irq_funcs = {
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> index 8bf2ba3..d2cecb30 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> @@ -706,6 +706,24 @@ static void gmc_v8_0_get_vm_pde(struct amdgpu_device *adev, int level,
>   	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
>   }
>   
> +static uint32_t gmc_v8_0_get_vm_mapping_flags(struct amdgpu_device *adev,
> +				uint32_t alloc_flags, bool remote_mapping)
> +{
> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
> +
> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> +	else
> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> +
> +	return mapping_flags;
> +}
> +
> +
>   /**
>    * gmc_v8_0_set_fault_enable_default - update VM fault handling
>    *
> @@ -1721,7 +1739,8 @@ static const struct amdgpu_gmc_funcs gmc_v8_0_gmc_funcs = {
>   	.emit_pasid_mapping = gmc_v8_0_emit_pasid_mapping,
>   	.set_prt = gmc_v8_0_set_prt,
>   	.get_vm_pte_flags = gmc_v8_0_get_vm_pte_flags,
> -	.get_vm_pde = gmc_v8_0_get_vm_pde
> +	.get_vm_pde = gmc_v8_0_get_vm_pde,
> +	.get_vm_mapping_flags = gmc_v8_0_get_vm_mapping_flags
>   };
>   
>   static const struct amdgpu_irq_src_funcs gmc_v8_0_irq_funcs = {
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index e6adc86..d709902 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -626,12 +626,31 @@ static void gmc_v9_0_get_vm_pde(struct amdgpu_device *adev, int level,
>   	}
>   }
>   
> +static uint32_t gmc_v9_0_get_vm_mapping_flags(struct amdgpu_device *adev,
> +				uint32_t alloc_flags, bool remote_mapping)
> +{
> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
> +
> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> +	else
> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> +
> +	return mapping_flags;
> +}
> +
> +
>   static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = {
>   	.flush_gpu_tlb = gmc_v9_0_flush_gpu_tlb,
>   	.emit_flush_gpu_tlb = gmc_v9_0_emit_flush_gpu_tlb,
>   	.emit_pasid_mapping = gmc_v9_0_emit_pasid_mapping,
>   	.get_vm_pte_flags = gmc_v9_0_get_vm_pte_flags,
> -	.get_vm_pde = gmc_v9_0_get_vm_pde
> +	.get_vm_pde = gmc_v9_0_get_vm_pde,
> +	.get_vm_mapping_flags = gmc_v9_0_get_vm_mapping_flags
>   };
>   
>   static void gmc_v9_0_set_gmc_funcs(struct amdgpu_device *adev)

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Support snooped PTE flag
       [not found]     ` <1565316926-19516-4-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-09 12:31       ` Koenig, Christian
       [not found]         ` <88fad585-ad1b-bca9-7079-d79896def19c-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Koenig, Christian @ 2019-08-09 12:31 UTC (permalink / raw)
  To: Zeng, Oak, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Keely, Sean

Am 09.08.19 um 04:15 schrieb Zeng, Oak:
> Set snooped PTE flag according to mapping flag. Write request to a
> page with snooped bit set, will send out invalidate probe request
> to TCC of the remote GPU where the vram page resides.
>
> Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index d709902..8faead3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -599,6 +599,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct amdgpu_device *adev,
>   	if (flags & AMDGPU_VM_PAGE_PRT)
>   		pte_flag |= AMDGPU_PTE_PRT;
>   
> +	if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
> +		pte_flag |= AMDGPU_PTE_SNOOPED;
> +

Still a NAK, we absolutely need a check here that this is only set when 
the BO is in XGMI.

Christian.

>   	return pte_flag;
>   }
>   

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 1/5] drm/amdgpu: Extends amdgpu vm definitions
       [not found]     ` <801142da-51cb-4efd-2cd6-860c65bfb311-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-09 13:50       ` Zeng, Oak
  0 siblings, 0 replies; 14+ messages in thread
From: Zeng, Oak @ 2019-08-09 13:50 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Keely, Sean



Regards,
Oak

-----Original Message-----
From: Koenig, Christian <Christian.Koenig@amd.com> 
Sent: Friday, August 9, 2019 8:29 AM
To: Zeng, Oak <Oak.Zeng@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix <Felix.Kuehling@amd.com>; Keely, Sean <Sean.Keely@amd.com>
Subject: Re: [PATCH 1/5] drm/amdgpu: Extends amdgpu vm definitions

Am 09.08.19 um 04:15 schrieb Zeng, Oak:
> Add definition of all supported mtypes. The RW mtype is recently 
> introduced for arcturus. Also add definition of a flag to probe and 
> possibly invalidate remote GPU cache, which will be used later in this 
> series.
>
> Change-Id: I96fc9bb4b6b1e62bdc10b600d8aaa6a802128d6d
> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 9 +++++++--
>   include/uapi/drm/amdgpu_drm.h          | 4 ++++
>   2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> index 2eda3a8..7a77477 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
> @@ -80,8 +80,13 @@ struct amdgpu_bo_list_entry;
>   #define AMDGPU_PTE_MTYPE_VG10(a)	((uint64_t)(a) << 57)
>   #define AMDGPU_PTE_MTYPE_VG10_MASK	AMDGPU_PTE_MTYPE_VG10(3ULL)
>   
> -#define AMDGPU_MTYPE_NC 0
> -#define AMDGPU_MTYPE_CC 2
> +enum amdgpu_mtype {
> +	AMDGPU_MTYPE_NC = 0,
> +	AMDGPU_MTYPE_WC = 1,
> +	AMDGPU_MTYPE_CC = 2,
> +	AMDGPU_MTYPE_UC = 3,
> +	AMDGPU_MTYPE_RW = 4,
> +};

Mhm, why an enum?

[Oak]: To me, enum and macro don't make big difference, except you can use enum at run time (for example use it in a debugger) while macro will be replaced during pre-compile. Any reason no enum h ere? I can change it to macro, not a big deal.

Christian.

>   
>   #define AMDGPU_PTE_DEFAULT_ATC  (AMDGPU_PTE_SYSTEM      \
>                                   | AMDGPU_PTE_SNOOPED    \
> diff --git a/include/uapi/drm/amdgpu_drm.h 
> b/include/uapi/drm/amdgpu_drm.h index ca97b68..97e8e51 100644
> --- a/include/uapi/drm/amdgpu_drm.h
> +++ b/include/uapi/drm/amdgpu_drm.h
> @@ -503,6 +503,10 @@ struct drm_amdgpu_gem_op {
>   #define AMDGPU_VM_MTYPE_CC		(3 << 5)
>   /* Use UC MTYPE instead of default MTYPE */
>   #define AMDGPU_VM_MTYPE_UC		(4 << 5)
> +/* Use RW MTYPE instead of default MTYPE */
> +#define AMDGPU_VM_MTYPE_RW		(5 << 5)
> +/* Cacheable/snoopable */
> +#define AMDGPU_VM_PAGE_INVALIDATE_PROBE	(1 << 9)
>   
>   struct drm_amdgpu_gem_va {
>   	/** GEM object handle */

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 3/5] drm/amdkfd: Postpone memory mapping flags calculation to mapping time
       [not found]         ` <95459b48-c8e5-b54c-00f1-8a1fdccd7e66-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-09 14:21           ` Zeng, Oak
       [not found]             ` <BL0PR12MB258085011920B10BDE86856B80D60-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Zeng, Oak @ 2019-08-09 14:21 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Keely, Sean



Regards,
Oak

-----Original Message-----
From: Koenig, Christian <Christian.Koenig@amd.com> 
Sent: Friday, August 9, 2019 8:31 AM
To: Zeng, Oak <Oak.Zeng@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix <Felix.Kuehling@amd.com>; Keely, Sean <Sean.Keely@amd.com>
Subject: Re: [PATCH 3/5] drm/amdkfd: Postpone memory mapping flags calculation to mapping time

Am 09.08.19 um 04:15 schrieb Zeng, Oak:
> Some mapping flags are decided by memory mapping destination which is 
> not know at memory object allocation time. So it is reasonable to 
> decide memory mapping flags at mapping time, instead of alloc time. 
> Record memory allocation flags during allocation time and calculate 
> mapping flags during memory mapping time.
>
> Also made the memory mapping flage calculation to be asic-specific, 
> because different asic can have different mapping scheme.
>
> The new memory mapping flag is decided by both memory allocation flags 
> and whether the memory mapping is to a remote device.
>
> This is preparation work to introduce more sophisticated mapping scheme.
>
> Change-Id: I98e7c47d850585ad7f0a9e11617c92b7a9aced3b
> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h       |  1 +
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++++-----------
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h          |  4 ++++
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c           | 20 +++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c            | 19 ++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c            | 20 +++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c            | 21 ++++++++++++++++++++-
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c            | 21 ++++++++++++++++++++-
>   8 files changed, 106 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> index e519df3..026475a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
> @@ -58,6 +58,7 @@ struct kgd_mem {
>   	uint64_t va;
>   
>   	uint32_t mapping_flags;
> +	uint32_t alloc_flags;
>   
>   	atomic_t invalid;
>   	struct amdkfd_process_info *process_info; diff --git 
> a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index 44a52b0..f91ef48 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -1081,7 +1081,6 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   	int byte_align;
>   	u32 domain, alloc_domain;
>   	u64 alloc_flags;
> -	uint32_t mapping_flags;
>   	int ret;
>   
>   	/*
> @@ -1143,16 +1142,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>   			adev->asic_type != CHIP_VEGAM) ?
>   			VI_BO_SIZE_ALIGN : 1;
>   
> -	mapping_flags = AMDGPU_VM_PAGE_READABLE;
> -	if (flags & ALLOC_MEM_FLAGS_WRITABLE)
> -		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> -	if (flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> -		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> -	if (flags & ALLOC_MEM_FLAGS_COHERENT)
> -		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> -	else
> -		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> -	(*mem)->mapping_flags = mapping_flags;
> +	(*mem)->alloc_flags = flags;
>   
>   	amdgpu_sync_create(&(*mem)->sync);
>   
> @@ -1298,6 +1288,7 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
>   		struct kgd_dev *kgd, struct kgd_mem *mem, void *vm)
>   {
>   	struct amdgpu_device *adev = get_amdgpu_device(kgd);
> +	struct amdgpu_device *bo_adev;
>   	struct amdgpu_vm *avm = (struct amdgpu_vm *)vm;
>   	int ret;
>   	struct amdgpu_bo *bo;
> @@ -1315,6 +1306,9 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
>   		return -EINVAL;
>   	}
>   
> +	bo_adev = amdgpu_ttm_adev(bo->tbo.bdev);
> +	mem->mapping_flags = amdgpu_gmc_get_vm_mapping_flags(adev, 
> +mem->alloc_flags, bo_adev != adev);
> +
>   	/* Make sure restore is not running concurrently. Since we
>   	 * don't map invalid userptr BOs, we rely on the next restore
>   	 * worker to do the mapping
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> index 071145a..6bd0c28 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
> @@ -105,6 +105,9 @@ struct amdgpu_gmc_funcs {
>   	/* get the pde for a given mc addr */
>   	void (*get_vm_pde)(struct amdgpu_device *adev, int level,
>   			   u64 *dst, u64 *flags);
> +	/* get per asic vm mapping flags */
> +	uint32_t (*get_vm_mapping_flags)(struct amdgpu_device *adev,
> +			uint32_t alloc_flags, bool remote_mapping);
>   };
>   
>   struct amdgpu_xgmi {
> @@ -185,6 +188,7 @@ struct amdgpu_gmc {
>   #define amdgpu_gmc_emit_pasid_mapping(r, vmid, pasid) (r)->adev->gmc.gmc_funcs->emit_pasid_mapping((r), (vmid), (pasid))
>   #define amdgpu_gmc_get_vm_pde(adev, level, dst, flags) (adev)->gmc.gmc_funcs->get_vm_pde((adev), (level), (dst), (flags))
>   #define amdgpu_gmc_get_pte_flags(adev, flags) 
> (adev)->gmc.gmc_funcs->get_vm_pte_flags((adev),(flags))
> +#define amdgpu_gmc_get_vm_mapping_flags(adev, alloc_flags, 
> +remote_mapping) ((adev)->gmc.gmc_funcs->get_vm_mapping_flags((adev), 
> +(alloc_flags), (remote_mapping)))
>   
>   /**
>    * amdgpu_gmc_vram_full_visible - Check if full VRAM is visible 
> through the BAR diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index 4e3ac10..7e420e0 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -415,12 +415,30 @@ static void gmc_v10_0_get_vm_pde(struct amdgpu_device *adev, int level,
>   	}
>   }
>   
> +static uint32_t gmc_v10_0_get_vm_mapping_flags(struct amdgpu_device *adev,
> +				uint32_t alloc_flags, bool remote_mapping) {
> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
> +
> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> +	else
> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> +
> +	return mapping_flags;
> +}
> +
>   static const struct amdgpu_gmc_funcs gmc_v10_0_gmc_funcs = {
>   	.flush_gpu_tlb = gmc_v10_0_flush_gpu_tlb,
>   	.emit_flush_gpu_tlb = gmc_v10_0_emit_flush_gpu_tlb,
>   	.emit_pasid_mapping = gmc_v10_0_emit_pasid_mapping,
>   	.get_vm_pte_flags = gmc_v10_0_get_vm_pte_flags,
> -	.get_vm_pde = gmc_v10_0_get_vm_pde
> +	.get_vm_pde = gmc_v10_0_get_vm_pde,
> +	.get_vm_mapping_flags = gmc_v10_0_get_vm_mapping_flags
>   };
>   
>   static void gmc_v10_0_set_gmc_funcs(struct amdgpu_device *adev) diff 
> --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> index b06d876..2b2af6a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
> @@ -404,6 +404,22 @@ static void gmc_v6_0_get_vm_pde(struct amdgpu_device *adev, int level,
>   	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
>   }
>   
> +static uint32_t gmc_v6_0_get_vm_mapping_flags(struct amdgpu_device *adev,
> +				uint32_t alloc_flags, bool remote_mapping) {
> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
> +
> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> +	else
> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> +
> +	return mapping_flags;
> +}
>   static void gmc_v6_0_set_fault_enable_default(struct amdgpu_device *adev,
>   					      bool value)
>   {
> @@ -1157,7 +1173,8 @@ static const struct amdgpu_gmc_funcs gmc_v6_0_gmc_funcs = {
>   	.emit_flush_gpu_tlb = gmc_v6_0_emit_flush_gpu_tlb,
>   	.set_prt = gmc_v6_0_set_prt,
>   	.get_vm_pde = gmc_v6_0_get_vm_pde,
> -	.get_vm_pte_flags = gmc_v6_0_get_vm_pte_flags
> +	.get_vm_pte_flags = gmc_v6_0_get_vm_pte_flags,
> +	.get_vm_mapping_flags = gmc_v6_0_get_vm_mapping_flags

That looks like a duplication of what get_vm_pte_flags is supposed to be doing.

[Oak] There are two layers of definition here: general user-accessible AMDGPU_VM_xxx definitions (in amdgpu_drm.h) and asic-specific AMDGPU_PTE_xxx definitions (defined in amdgpu_vm.h). For amdgpu drm user, user pass in AMDGPU_VM_XX flags and get_vm_pte_flags translate those flags to AMDGPU_PTE_xxx flags to program page table.

For compute stack, there are two layers of translation: we will first generate the AMDGPU_VM_xxx flags from memory object allocation flags/memory physical location/memory mapping destination/memory access path (pcie or xgmi). Then get_vm_pte_flags is called to translate AMDGPU_VM_xx flags to AMDGPU_PTE_xx flags. The second layer translation is the same with the drm usage.

For compute stack user, due the API definition, user can't directly pass in the AMDGPU_VM_xxx flags, instead it pass in an allocation flags. Also it is not possible for user to pass in the AMDGPU_VM flags as some of the flags is decided by factors that user is unaware, for example whether the access is through pcie or xgmi. 

Due to above reasons, We had to do two layers of translations. The two layers of translation exists before this patch seriers (the first layer was in amdgpu_amdkfd_gpuvm.c). When I do this patch series, I had a need to make the first layers of translation also to be asic-specific as Arcturus introduce different mapping scheme. Then I moved the first layer translation to asic-specifc gmc functions. 

Christian.

>   };
>   
>   static const struct amdgpu_irq_src_funcs gmc_v6_0_irq_funcs = { diff 
> --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> index 75aa333..619862e 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
> @@ -481,6 +481,23 @@ static void gmc_v7_0_get_vm_pde(struct amdgpu_device *adev, int level,
>   	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
>   }
>   
> +static uint32_t gmc_v7_0_get_vm_mapping_flags(struct amdgpu_device *adev,
> +				uint32_t alloc_flags, bool remote_mapping) {
> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
> +
> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> +	else
> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> +
> +	return mapping_flags;
> +}
> +
>   /**
>    * gmc_v8_0_set_fault_enable_default - update VM fault handling
>    *
> @@ -1353,7 +1370,8 @@ static const struct amdgpu_gmc_funcs gmc_v7_0_gmc_funcs = {
>   	.emit_pasid_mapping = gmc_v7_0_emit_pasid_mapping,
>   	.set_prt = gmc_v7_0_set_prt,
>   	.get_vm_pte_flags = gmc_v7_0_get_vm_pte_flags,
> -	.get_vm_pde = gmc_v7_0_get_vm_pde
> +	.get_vm_pde = gmc_v7_0_get_vm_pde,
> +	.get_vm_mapping_flags = gmc_v7_0_get_vm_mapping_flags
>   };
>   
>   static const struct amdgpu_irq_src_funcs gmc_v7_0_irq_funcs = { diff 
> --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> index 8bf2ba3..d2cecb30 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
> @@ -706,6 +706,24 @@ static void gmc_v8_0_get_vm_pde(struct amdgpu_device *adev, int level,
>   	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
>   }
>   
> +static uint32_t gmc_v8_0_get_vm_mapping_flags(struct amdgpu_device *adev,
> +				uint32_t alloc_flags, bool remote_mapping) {
> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
> +
> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> +	else
> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> +
> +	return mapping_flags;
> +}
> +
> +
>   /**
>    * gmc_v8_0_set_fault_enable_default - update VM fault handling
>    *
> @@ -1721,7 +1739,8 @@ static const struct amdgpu_gmc_funcs gmc_v8_0_gmc_funcs = {
>   	.emit_pasid_mapping = gmc_v8_0_emit_pasid_mapping,
>   	.set_prt = gmc_v8_0_set_prt,
>   	.get_vm_pte_flags = gmc_v8_0_get_vm_pte_flags,
> -	.get_vm_pde = gmc_v8_0_get_vm_pde
> +	.get_vm_pde = gmc_v8_0_get_vm_pde,
> +	.get_vm_mapping_flags = gmc_v8_0_get_vm_mapping_flags
>   };
>   
>   static const struct amdgpu_irq_src_funcs gmc_v8_0_irq_funcs = { diff 
> --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index e6adc86..d709902 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -626,12 +626,31 @@ static void gmc_v9_0_get_vm_pde(struct amdgpu_device *adev, int level,
>   	}
>   }
>   
> +static uint32_t gmc_v9_0_get_vm_mapping_flags(struct amdgpu_device *adev,
> +				uint32_t alloc_flags, bool remote_mapping) {
> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
> +
> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
> +	else
> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
> +
> +	return mapping_flags;
> +}
> +
> +
>   static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = {
>   	.flush_gpu_tlb = gmc_v9_0_flush_gpu_tlb,
>   	.emit_flush_gpu_tlb = gmc_v9_0_emit_flush_gpu_tlb,
>   	.emit_pasid_mapping = gmc_v9_0_emit_pasid_mapping,
>   	.get_vm_pte_flags = gmc_v9_0_get_vm_pte_flags,
> -	.get_vm_pde = gmc_v9_0_get_vm_pde
> +	.get_vm_pde = gmc_v9_0_get_vm_pde,
> +	.get_vm_mapping_flags = gmc_v9_0_get_vm_mapping_flags
>   };
>   
>   static void gmc_v9_0_set_gmc_funcs(struct amdgpu_device *adev)

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 4/5] drm/amdgpu: Support snooped PTE flag
       [not found]         ` <88fad585-ad1b-bca9-7079-d79896def19c-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-09 14:24           ` Zeng, Oak
       [not found]             ` <BL0PR12MB25801EB371ADC1E82ED6AEA180D60-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
  0 siblings, 1 reply; 14+ messages in thread
From: Zeng, Oak @ 2019-08-09 14:24 UTC (permalink / raw)
  To: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Keely, Sean



Regards,
Oak

-----Original Message-----
From: Koenig, Christian <Christian.Koenig@amd.com> 
Sent: Friday, August 9, 2019 8:31 AM
To: Zeng, Oak <Oak.Zeng@amd.com>; amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix <Felix.Kuehling@amd.com>; Keely, Sean <Sean.Keely@amd.com>
Subject: Re: [PATCH 4/5] drm/amdgpu: Support snooped PTE flag

Am 09.08.19 um 04:15 schrieb Zeng, Oak:
> Set snooped PTE flag according to mapping flag. Write request to a 
> page with snooped bit set, will send out invalidate probe request to 
> TCC of the remote GPU where the vram page resides.
>
> Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c 
> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index d709902..8faead3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -599,6 +599,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct amdgpu_device *adev,
>   	if (flags & AMDGPU_VM_PAGE_PRT)
>   		pte_flag |= AMDGPU_PTE_PRT;
>   
> +	if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
> +		pte_flag |= AMDGPU_PTE_SNOOPED;
> +

Still a NAK, we absolutely need a check here that this is only set when the BO is in XGMI.

[Oak]: Per discussion with upper layer stack, remote vram mapping (either over PCIe or XGMI) should always invalidate probe the cache lines of GPU who owns the memory object.

Christian.

>   	return pte_flag;
>   }
>   

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/5] drm/amdgpu: Support snooped PTE flag
       [not found]             ` <BL0PR12MB25801EB371ADC1E82ED6AEA180D60-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2019-08-09 14:36               ` Koenig, Christian
  0 siblings, 0 replies; 14+ messages in thread
From: Koenig, Christian @ 2019-08-09 14:36 UTC (permalink / raw)
  To: Zeng, Oak, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Keely, Sean

Am 09.08.19 um 16:24 schrieb Zeng, Oak:
>
> Regards,
> Oak
>
> -----Original Message-----
> From: Koenig, Christian <Christian.Koenig@amd.com>
> Sent: Friday, August 9, 2019 8:31 AM
> To: Zeng, Oak <Oak.Zeng@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: Kuehling, Felix <Felix.Kuehling@amd.com>; Keely, Sean <Sean.Keely@amd.com>
> Subject: Re: [PATCH 4/5] drm/amdgpu: Support snooped PTE flag
>
> Am 09.08.19 um 04:15 schrieb Zeng, Oak:
>> Set snooped PTE flag according to mapping flag. Write request to a
>> page with snooped bit set, will send out invalidate probe request to
>> TCC of the remote GPU where the vram page resides.
>>
>> Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
>> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
>>    1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> index d709902..8faead3 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> @@ -599,6 +599,9 @@ static uint64_t gmc_v9_0_get_vm_pte_flags(struct amdgpu_device *adev,
>>    	if (flags & AMDGPU_VM_PAGE_PRT)
>>    		pte_flag |= AMDGPU_PTE_PRT;
>>    
>> +	if (flags & AMDGPU_VM_PAGE_INVALIDATE_PROBE)
>> +		pte_flag |= AMDGPU_PTE_SNOOPED;
>> +
> Still a NAK, we absolutely need a check here that this is only set when the BO is in XGMI.
>
> [Oak]: Per discussion with upper layer stack, remote vram mapping (either over PCIe or XGMI) should always invalidate probe the cache lines of GPU who owns the memory object.

Then we can just drop the AMDGPU_VM_PAGE_INVALIDATE_PROBE flag.

But the problem is something entirely different, the PTE_SNOOPED flag 
has different meaning depending on if the transaction is routed to XGMI 
or PCIe.

For XGMI the flag triggers invalidate probe on the remote GPU, but for 
PCIe it controls CPU cache snooping.

And the CPU cache snooping flags of the devices and host must match or 
otherwise you run into a hell lot of problems on some architectures.

So we can only allow setting the snooped flag here when that PTE is 
really pointing to XGMI.

Regards,
Christian.

>
> Christian.
>
>>    	return pte_flag;
>>    }
>>    

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/5] drm/amdkfd: Postpone memory mapping flags calculation to mapping time
       [not found]             ` <BL0PR12MB258085011920B10BDE86856B80D60-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
@ 2019-08-09 14:43               ` Christian König
  0 siblings, 0 replies; 14+ messages in thread
From: Christian König @ 2019-08-09 14:43 UTC (permalink / raw)
  To: Zeng, Oak, Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Keely, Sean

Am 09.08.19 um 16:21 schrieb Zeng, Oak:
>
> Regards,
> Oak
>
> -----Original Message-----
> From: Koenig, Christian <Christian.Koenig@amd.com>
> Sent: Friday, August 9, 2019 8:31 AM
> To: Zeng, Oak <Oak.Zeng@amd.com>; amd-gfx@lists.freedesktop.org
> Cc: Kuehling, Felix <Felix.Kuehling@amd.com>; Keely, Sean <Sean.Keely@amd.com>
> Subject: Re: [PATCH 3/5] drm/amdkfd: Postpone memory mapping flags calculation to mapping time
>
> Am 09.08.19 um 04:15 schrieb Zeng, Oak:
>> Some mapping flags are decided by memory mapping destination which is
>> not know at memory object allocation time. So it is reasonable to
>> decide memory mapping flags at mapping time, instead of alloc time.
>> Record memory allocation flags during allocation time and calculate
>> mapping flags during memory mapping time.
>>
>> Also made the memory mapping flage calculation to be asic-specific,
>> because different asic can have different mapping scheme.
>>
>> The new memory mapping flag is decided by both memory allocation flags
>> and whether the memory mapping is to a remote device.
>>
>> This is preparation work to introduce more sophisticated mapping scheme.
>>
>> Change-Id: I98e7c47d850585ad7f0a9e11617c92b7a9aced3b
>> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
>> ---
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h       |  1 +
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 16 +++++-----------
>>    drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h          |  4 ++++
>>    drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c           | 20 +++++++++++++++++++-
>>    drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c            | 19 ++++++++++++++++++-
>>    drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c            | 20 +++++++++++++++++++-
>>    drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c            | 21 ++++++++++++++++++++-
>>    drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c            | 21 ++++++++++++++++++++-
>>    8 files changed, 106 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> index e519df3..026475a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
>> @@ -58,6 +58,7 @@ struct kgd_mem {
>>    	uint64_t va;
>>    
>>    	uint32_t mapping_flags;
>> +	uint32_t alloc_flags;
>>    
>>    	atomic_t invalid;
>>    	struct amdkfd_process_info *process_info; diff --git
>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> index 44a52b0..f91ef48 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
>> @@ -1081,7 +1081,6 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>>    	int byte_align;
>>    	u32 domain, alloc_domain;
>>    	u64 alloc_flags;
>> -	uint32_t mapping_flags;
>>    	int ret;
>>    
>>    	/*
>> @@ -1143,16 +1142,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>>    			adev->asic_type != CHIP_VEGAM) ?
>>    			VI_BO_SIZE_ALIGN : 1;
>>    
>> -	mapping_flags = AMDGPU_VM_PAGE_READABLE;
>> -	if (flags & ALLOC_MEM_FLAGS_WRITABLE)
>> -		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
>> -	if (flags & ALLOC_MEM_FLAGS_EXECUTABLE)
>> -		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
>> -	if (flags & ALLOC_MEM_FLAGS_COHERENT)
>> -		mapping_flags |= AMDGPU_VM_MTYPE_UC;
>> -	else
>> -		mapping_flags |= AMDGPU_VM_MTYPE_NC;
>> -	(*mem)->mapping_flags = mapping_flags;
>> +	(*mem)->alloc_flags = flags;
>>    
>>    	amdgpu_sync_create(&(*mem)->sync);
>>    
>> @@ -1298,6 +1288,7 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
>>    		struct kgd_dev *kgd, struct kgd_mem *mem, void *vm)
>>    {
>>    	struct amdgpu_device *adev = get_amdgpu_device(kgd);
>> +	struct amdgpu_device *bo_adev;
>>    	struct amdgpu_vm *avm = (struct amdgpu_vm *)vm;
>>    	int ret;
>>    	struct amdgpu_bo *bo;
>> @@ -1315,6 +1306,9 @@ int amdgpu_amdkfd_gpuvm_map_memory_to_gpu(
>>    		return -EINVAL;
>>    	}
>>    
>> +	bo_adev = amdgpu_ttm_adev(bo->tbo.bdev);
>> +	mem->mapping_flags = amdgpu_gmc_get_vm_mapping_flags(adev,
>> +mem->alloc_flags, bo_adev != adev);
>> +
>>    	/* Make sure restore is not running concurrently. Since we
>>    	 * don't map invalid userptr BOs, we rely on the next restore
>>    	 * worker to do the mapping
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
>> index 071145a..6bd0c28 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h
>> @@ -105,6 +105,9 @@ struct amdgpu_gmc_funcs {
>>    	/* get the pde for a given mc addr */
>>    	void (*get_vm_pde)(struct amdgpu_device *adev, int level,
>>    			   u64 *dst, u64 *flags);
>> +	/* get per asic vm mapping flags */
>> +	uint32_t (*get_vm_mapping_flags)(struct amdgpu_device *adev,
>> +			uint32_t alloc_flags, bool remote_mapping);
>>    };
>>    
>>    struct amdgpu_xgmi {
>> @@ -185,6 +188,7 @@ struct amdgpu_gmc {
>>    #define amdgpu_gmc_emit_pasid_mapping(r, vmid, pasid) (r)->adev->gmc.gmc_funcs->emit_pasid_mapping((r), (vmid), (pasid))
>>    #define amdgpu_gmc_get_vm_pde(adev, level, dst, flags) (adev)->gmc.gmc_funcs->get_vm_pde((adev), (level), (dst), (flags))
>>    #define amdgpu_gmc_get_pte_flags(adev, flags)
>> (adev)->gmc.gmc_funcs->get_vm_pte_flags((adev),(flags))
>> +#define amdgpu_gmc_get_vm_mapping_flags(adev, alloc_flags,
>> +remote_mapping) ((adev)->gmc.gmc_funcs->get_vm_mapping_flags((adev),
>> +(alloc_flags), (remote_mapping)))
>>    
>>    /**
>>     * amdgpu_gmc_vram_full_visible - Check if full VRAM is visible
>> through the BAR diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> index 4e3ac10..7e420e0 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
>> @@ -415,12 +415,30 @@ static void gmc_v10_0_get_vm_pde(struct amdgpu_device *adev, int level,
>>    	}
>>    }
>>    
>> +static uint32_t gmc_v10_0_get_vm_mapping_flags(struct amdgpu_device *adev,
>> +				uint32_t alloc_flags, bool remote_mapping) {
>> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
>> +
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
>> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
>> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
>> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
>> +	else
>> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
>> +
>> +	return mapping_flags;
>> +}
>> +
>>    static const struct amdgpu_gmc_funcs gmc_v10_0_gmc_funcs = {
>>    	.flush_gpu_tlb = gmc_v10_0_flush_gpu_tlb,
>>    	.emit_flush_gpu_tlb = gmc_v10_0_emit_flush_gpu_tlb,
>>    	.emit_pasid_mapping = gmc_v10_0_emit_pasid_mapping,
>>    	.get_vm_pte_flags = gmc_v10_0_get_vm_pte_flags,
>> -	.get_vm_pde = gmc_v10_0_get_vm_pde
>> +	.get_vm_pde = gmc_v10_0_get_vm_pde,
>> +	.get_vm_mapping_flags = gmc_v10_0_get_vm_mapping_flags
>>    };
>>    
>>    static void gmc_v10_0_set_gmc_funcs(struct amdgpu_device *adev) diff
>> --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
>> index b06d876..2b2af6a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
>> @@ -404,6 +404,22 @@ static void gmc_v6_0_get_vm_pde(struct amdgpu_device *adev, int level,
>>    	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
>>    }
>>    
>> +static uint32_t gmc_v6_0_get_vm_mapping_flags(struct amdgpu_device *adev,
>> +				uint32_t alloc_flags, bool remote_mapping) {
>> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
>> +
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
>> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
>> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
>> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
>> +	else
>> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
>> +
>> +	return mapping_flags;
>> +}
>>    static void gmc_v6_0_set_fault_enable_default(struct amdgpu_device *adev,
>>    					      bool value)
>>    {
>> @@ -1157,7 +1173,8 @@ static const struct amdgpu_gmc_funcs gmc_v6_0_gmc_funcs = {
>>    	.emit_flush_gpu_tlb = gmc_v6_0_emit_flush_gpu_tlb,
>>    	.set_prt = gmc_v6_0_set_prt,
>>    	.get_vm_pde = gmc_v6_0_get_vm_pde,
>> -	.get_vm_pte_flags = gmc_v6_0_get_vm_pte_flags
>> +	.get_vm_pte_flags = gmc_v6_0_get_vm_pte_flags,
>> +	.get_vm_mapping_flags = gmc_v6_0_get_vm_mapping_flags
> That looks like a duplication of what get_vm_pte_flags is supposed to be doing.
>
> [Oak] There are two layers of definition here: general user-accessible AMDGPU_VM_xxx definitions (in amdgpu_drm.h) and asic-specific AMDGPU_PTE_xxx definitions (defined in amdgpu_vm.h). For amdgpu drm user, user pass in AMDGPU_VM_XX flags and get_vm_pte_flags translate those flags to AMDGPU_PTE_xxx flags to program page table.
>
> For compute stack, there are two layers of translation: we will first generate the AMDGPU_VM_xxx flags from memory object allocation flags/memory physical location/memory mapping destination/memory access path (pcie or xgmi). Then get_vm_pte_flags is called to translate AMDGPU_VM_xx flags to AMDGPU_PTE_xx flags. The second layer translation is the same with the drm usage.
>
> For compute stack user, due the API definition, user can't directly pass in the AMDGPU_VM_xxx flags, instead it pass in an allocation flags. Also it is not possible for user to pass in the AMDGPU_VM flags as some of the flags is decided by factors that user is unaware, for example whether the access is through pcie or xgmi.
>
> Due to above reasons, We had to do two layers of translations. The two layers of translation exists before this patch seriers (the first layer was in amdgpu_amdkfd_gpuvm.c). When I do this patch series, I had a need to make the first layers of translation also to be asic-specific as Arcturus introduce different mapping scheme. Then I moved the first layer translation to asic-specifc gmc functions.

Yeah, but this is a really bad design. We are essentially leaking the 
problem that we have two upper layers into the hardware specific lower 
layer and that is a rather big no-go.

We need to find a different approach for this which makes it possible to 
handle this with a single callback or the design is a rather clear NAK.

Christian.

>
> Christian.
>
>>    };
>>    
>>    static const struct amdgpu_irq_src_funcs gmc_v6_0_irq_funcs = { diff
>> --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>> index 75aa333..619862e 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
>> @@ -481,6 +481,23 @@ static void gmc_v7_0_get_vm_pde(struct amdgpu_device *adev, int level,
>>    	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
>>    }
>>    
>> +static uint32_t gmc_v7_0_get_vm_mapping_flags(struct amdgpu_device *adev,
>> +				uint32_t alloc_flags, bool remote_mapping) {
>> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
>> +
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
>> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
>> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
>> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
>> +	else
>> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
>> +
>> +	return mapping_flags;
>> +}
>> +
>>    /**
>>     * gmc_v8_0_set_fault_enable_default - update VM fault handling
>>     *
>> @@ -1353,7 +1370,8 @@ static const struct amdgpu_gmc_funcs gmc_v7_0_gmc_funcs = {
>>    	.emit_pasid_mapping = gmc_v7_0_emit_pasid_mapping,
>>    	.set_prt = gmc_v7_0_set_prt,
>>    	.get_vm_pte_flags = gmc_v7_0_get_vm_pte_flags,
>> -	.get_vm_pde = gmc_v7_0_get_vm_pde
>> +	.get_vm_pde = gmc_v7_0_get_vm_pde,
>> +	.get_vm_mapping_flags = gmc_v7_0_get_vm_mapping_flags
>>    };
>>    
>>    static const struct amdgpu_irq_src_funcs gmc_v7_0_irq_funcs = { diff
>> --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> index 8bf2ba3..d2cecb30 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
>> @@ -706,6 +706,24 @@ static void gmc_v8_0_get_vm_pde(struct amdgpu_device *adev, int level,
>>    	BUG_ON(*addr & 0xFFFFFF0000000FFFULL);
>>    }
>>    
>> +static uint32_t gmc_v8_0_get_vm_mapping_flags(struct amdgpu_device *adev,
>> +				uint32_t alloc_flags, bool remote_mapping) {
>> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
>> +
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
>> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
>> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
>> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
>> +	else
>> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
>> +
>> +	return mapping_flags;
>> +}
>> +
>> +
>>    /**
>>     * gmc_v8_0_set_fault_enable_default - update VM fault handling
>>     *
>> @@ -1721,7 +1739,8 @@ static const struct amdgpu_gmc_funcs gmc_v8_0_gmc_funcs = {
>>    	.emit_pasid_mapping = gmc_v8_0_emit_pasid_mapping,
>>    	.set_prt = gmc_v8_0_set_prt,
>>    	.get_vm_pte_flags = gmc_v8_0_get_vm_pte_flags,
>> -	.get_vm_pde = gmc_v8_0_get_vm_pde
>> +	.get_vm_pde = gmc_v8_0_get_vm_pde,
>> +	.get_vm_mapping_flags = gmc_v8_0_get_vm_mapping_flags
>>    };
>>    
>>    static const struct amdgpu_irq_src_funcs gmc_v8_0_irq_funcs = { diff
>> --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> index e6adc86..d709902 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>> @@ -626,12 +626,31 @@ static void gmc_v9_0_get_vm_pde(struct amdgpu_device *adev, int level,
>>    	}
>>    }
>>    
>> +static uint32_t gmc_v9_0_get_vm_mapping_flags(struct amdgpu_device *adev,
>> +				uint32_t alloc_flags, bool remote_mapping) {
>> +	uint32_t mapping_flags = AMDGPU_VM_PAGE_READABLE;
>> +
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
>> +		mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
>> +		mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
>> +	if (alloc_flags & ALLOC_MEM_FLAGS_COHERENT)
>> +		mapping_flags |= AMDGPU_VM_MTYPE_UC;
>> +	else
>> +		mapping_flags |= AMDGPU_VM_MTYPE_NC;
>> +
>> +	return mapping_flags;
>> +}
>> +
>> +
>>    static const struct amdgpu_gmc_funcs gmc_v9_0_gmc_funcs = {
>>    	.flush_gpu_tlb = gmc_v9_0_flush_gpu_tlb,
>>    	.emit_flush_gpu_tlb = gmc_v9_0_emit_flush_gpu_tlb,
>>    	.emit_pasid_mapping = gmc_v9_0_emit_pasid_mapping,
>>    	.get_vm_pte_flags = gmc_v9_0_get_vm_pte_flags,
>> -	.get_vm_pde = gmc_v9_0_get_vm_pde
>> +	.get_vm_pde = gmc_v9_0_get_vm_pde,
>> +	.get_vm_mapping_flags = gmc_v9_0_get_vm_mapping_flags
>>    };
>>    
>>    static void gmc_v9_0_set_gmc_funcs(struct amdgpu_device *adev)
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/5] drm/amdgpu: Support snooped PTE flag
       [not found] ` <1565145062-16674-1-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
@ 2019-08-07  2:31   ` Zeng, Oak
  0 siblings, 0 replies; 14+ messages in thread
From: Zeng, Oak @ 2019-08-07  2:31 UTC (permalink / raw)
  To: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
  Cc: Kuehling, Felix, Zeng, Oak, Koenig, Christian, Keely, Sean

Set snooped PTE flag according to mapping flag

Change-Id: I799f68ec7a5a1abf32075f5ef31051641a0b3736
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index adad7c4..8beefaf 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -639,6 +639,9 @@ static uint64_t gmc_v9_0_arcturus_get_vm_pte_flags(struct amdgpu_device *adev,
 	if (flags & AMDGPU_VM_PAGE_PRT)
 		pte_flag |= AMDGPU_PTE_PRT;
 
+	if (flags & AMDGPU_VM_PAGE_SNOOPED)
+		pte_flag |= AMDGPU_PTE_SNOOPED;
+
 	return pte_flag;
 }
 
-- 
2.7.4

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-08-09 14:43 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-09  2:15 [PATCH 1/5] drm/amdgpu: Extends amdgpu vm definitions Zeng, Oak
     [not found] ` <1565316926-19516-1-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
2019-08-09  2:15   ` [PATCH 2/5] drm/amdgpu: Support new arcturus mtype Zeng, Oak
2019-08-09  2:15   ` [PATCH 3/5] drm/amdkfd: Postpone memory mapping flags calculation to mapping time Zeng, Oak
     [not found]     ` <1565316926-19516-3-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
2019-08-09 12:30       ` Koenig, Christian
     [not found]         ` <95459b48-c8e5-b54c-00f1-8a1fdccd7e66-5C7GfCeVMHo@public.gmane.org>
2019-08-09 14:21           ` Zeng, Oak
     [not found]             ` <BL0PR12MB258085011920B10BDE86856B80D60-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-08-09 14:43               ` Christian König
2019-08-09  2:15   ` [PATCH 4/5] drm/amdgpu: Support snooped PTE flag Zeng, Oak
     [not found]     ` <1565316926-19516-4-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
2019-08-09 12:31       ` Koenig, Christian
     [not found]         ` <88fad585-ad1b-bca9-7079-d79896def19c-5C7GfCeVMHo@public.gmane.org>
2019-08-09 14:24           ` Zeng, Oak
     [not found]             ` <BL0PR12MB25801EB371ADC1E82ED6AEA180D60-b4cIHhjg/p/XzH18dTCKOgdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-08-09 14:36               ` Koenig, Christian
2019-08-09  2:15   ` [PATCH 5/5] amd/amdgpu: Introduce new page mapping scheme for arcturus Zeng, Oak
2019-08-09 12:28   ` [PATCH 1/5] drm/amdgpu: Extends amdgpu vm definitions Koenig, Christian
     [not found]     ` <801142da-51cb-4efd-2cd6-860c65bfb311-5C7GfCeVMHo@public.gmane.org>
2019-08-09 13:50       ` Zeng, Oak
  -- strict thread matches above, loose matches on Subject: below --
2019-08-07  2:31 Zeng, Oak
     [not found] ` <1565145062-16674-1-git-send-email-Oak.Zeng-5C7GfCeVMHo@public.gmane.org>
2019-08-07  2:31   ` [PATCH 4/5] drm/amdgpu: Support snooped PTE flag Zeng, Oak

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.