* [PATCH 0/3] Some refactoring and maybe a memory accounting fixlet
@ 2024-04-26 16:43 Tvrtko Ursulin
  2024-04-26 16:43 ` [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper Tvrtko Ursulin
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-04-26 16:43 UTC (permalink / raw)
  To: amd-gfx; +Cc: Christian König, kernel-dev, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

As I was reading through the driver I spotted one thing which could
perhaps make it more readable (1/3), and one thing which (in principle)
reduces some double conversions from TTM placement back to domain (2/3).
The latter also enables the last patch in the series, which maybe fixes a
small memory accounting issue, although IIUC it is only relevant when
interacting with KFD.

Assuming the fix is real and not my confusion, please let me know if it is
all acceptable, or whether you would like me to re-structure the series to
contain only the fix and split off the rest for later (or never).

Tvrtko Ursulin (3):
  drm/amdgpu: Add amdgpu_bo_is_vm_bo helper
  drm/amdgpu: Reduce mem_type to domain double indirection
  drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting

 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c     |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 31 ++++++++++-----------
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h  | 14 ++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c      | 31 +++++++++------------
 5 files changed, 44 insertions(+), 38 deletions(-)

-- 
2.44.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper
  2024-04-26 16:43 [PATCH 0/3] Some refactoring and maybe a memory accounting fixlet Tvrtko Ursulin
@ 2024-04-26 16:43 ` Tvrtko Ursulin
  2024-04-29 11:02   ` Christian König
  2024-04-26 16:43 ` [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection Tvrtko Ursulin
  2024-04-26 16:43 ` [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting Tvrtko Ursulin
  2 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-04-26 16:43 UTC (permalink / raw)
  To: amd-gfx; +Cc: Christian König, kernel-dev, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

Help code readability by replacing a bunch of:

bo->tbo.base.resv == vm->root.bo->tbo.base.resv

With:

amdgpu_bo_is_vm_bo(bo, vm)

No functional changes.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 14 ++++++++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 31 +++++++++-------------
 3 files changed, 28 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 67c234bcf89f..32e4a9c6e805 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -174,7 +174,7 @@ static int amdgpu_gem_object_open(struct drm_gem_object *obj,
 		return -EPERM;
 
 	if (abo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID &&
-	    abo->tbo.base.resv != vm->root.bo->tbo.base.resv)
+	    !amdgpu_bo_is_vm_bo(abo, vm))
 		return -EPERM;
 
 	r = amdgpu_bo_reserve(abo, false);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index be679c42b0b8..f2bb6965cc77 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -307,6 +307,20 @@ static inline struct amdgpu_bo *amdgpu_bo_shadowed(struct amdgpu_bo *bo)
 	return NULL;
 }
 
+/**
+ * amdgpu_bo_is_vm_bo - check if the BO is an always valid per-VM BO
+ *
+ * @bo: BO to be tested.
+ * @vm: VM to test against.
+ *
+ * Returns true if the BO shares the reservation object with @vm's root PD.
+ */
+static inline bool amdgpu_bo_is_vm_bo(struct amdgpu_bo *bo,
+				      struct amdgpu_vm *vm)
+{
+	return bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv;
+}
+
 bool amdgpu_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
 void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 domain);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 8af3f0fd3073..6d6f0e325172 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -333,7 +333,7 @@ void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
 	base->next = bo->vm_bo;
 	bo->vm_bo = base;
 
-	if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
+	if (!amdgpu_bo_is_vm_bo(bo, vm))
 		return;
 
 	dma_resv_assert_held(vm->root.bo->tbo.base.resv);
@@ -1101,13 +1101,12 @@ static void amdgpu_vm_bo_get_memory(struct amdgpu_bo_va *bo_va,
 	 * For now ignore BOs which are currently locked and potentially
 	 * changing their location.
 	 */
-	if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv &&
-	    !dma_resv_trylock(bo->tbo.base.resv))
+	if (!amdgpu_bo_is_vm_bo(bo, vm) && !dma_resv_trylock(bo->tbo.base.resv))
 		return;
 
 	amdgpu_bo_get_memory(bo, stats);
-	if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
-	    dma_resv_unlock(bo->tbo.base.resv);
+	if (!amdgpu_bo_is_vm_bo(bo, vm))
+		dma_resv_unlock(bo->tbo.base.resv);
 }
 
 void amdgpu_vm_get_memory(struct amdgpu_vm *vm,
@@ -1203,8 +1202,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
 		uncached = false;
 	}
 
-	if (clear || (bo && bo->tbo.base.resv ==
-		      vm->root.bo->tbo.base.resv))
+	if (clear || amdgpu_bo_is_vm_bo(bo, vm))
 		last_update = &vm->last_update;
 	else
 		last_update = &bo_va->last_pt_update;
@@ -1246,7 +1244,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
 	 * the evicted list so that it gets validated again on the
 	 * next command submission.
 	 */
-	if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv) {
+	if (amdgpu_bo_is_vm_bo(bo, vm)) {
 		uint32_t mem_type = bo->tbo.resource->mem_type;
 
 		if (!(bo->preferred_domains &
@@ -1640,10 +1638,9 @@ static void amdgpu_vm_bo_insert_map(struct amdgpu_device *adev,
 	if (mapping->flags & AMDGPU_PTE_PRT)
 		amdgpu_vm_prt_get(adev);
 
-	if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
-	    !bo_va->base.moved) {
+	if (amdgpu_bo_is_vm_bo(bo, vm) && !bo_va->base.moved)
 		amdgpu_vm_bo_moved(&bo_va->base);
-	}
+
 	trace_amdgpu_vm_bo_map(bo_va, mapping);
 }
 
@@ -1922,8 +1919,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
 		if (before->flags & AMDGPU_PTE_PRT)
 			amdgpu_vm_prt_get(adev);
 
-		if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
-		    !before->bo_va->base.moved)
+		if (amdgpu_bo_is_vm_bo(bo, vm) && !before->bo_va->base.moved)
 			amdgpu_vm_bo_moved(&before->bo_va->base);
 	} else {
 		kfree(before);
@@ -1937,8 +1933,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
 		if (after->flags & AMDGPU_PTE_PRT)
 			amdgpu_vm_prt_get(adev);
 
-		if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
-		    !after->bo_va->base.moved)
+		if (amdgpu_bo_is_vm_bo(bo, vm) && !after->bo_va->base.moved)
 			amdgpu_vm_bo_moved(&after->bo_va->base);
 	} else {
 		kfree(after);
@@ -2017,7 +2012,7 @@ void amdgpu_vm_bo_del(struct amdgpu_device *adev,
 
 	if (bo) {
 		dma_resv_assert_held(bo->tbo.base.resv);
-		if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv)
+		if (amdgpu_bo_is_vm_bo(bo, vm))
 			ttm_bo_set_bulk_move(&bo->tbo, NULL);
 
 		for (base = &bo_va->base.bo->vm_bo; *base;
@@ -2111,7 +2106,7 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
 	for (bo_base = bo->vm_bo; bo_base; bo_base = bo_base->next) {
 		struct amdgpu_vm *vm = bo_base->vm;
 
-		if (evicted && bo->tbo.base.resv == vm->root.bo->tbo.base.resv) {
+		if (evicted && amdgpu_bo_is_vm_bo(bo, vm)) {
 			amdgpu_vm_bo_evicted(bo_base);
 			continue;
 		}
@@ -2122,7 +2117,7 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
 
 		if (bo->tbo.type == ttm_bo_type_kernel)
 			amdgpu_vm_bo_relocated(bo_base);
-		else if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv)
+		else if (amdgpu_bo_is_vm_bo(bo, vm))
 			amdgpu_vm_bo_moved(bo_base);
 		else
 			amdgpu_vm_bo_invalidated(bo_base);
-- 
2.44.0



* [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection
  2024-04-26 16:43 [PATCH 0/3] Some refactoring and maybe a memory accounting fixlet Tvrtko Ursulin
  2024-04-26 16:43 ` [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper Tvrtko Ursulin
@ 2024-04-26 16:43 ` Tvrtko Ursulin
  2024-04-29 11:03   ` Christian König
  2024-04-26 16:43 ` [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting Tvrtko Ursulin
  2 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-04-26 16:43 UTC (permalink / raw)
  To: amd-gfx; +Cc: Christian König, kernel-dev, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

All memory domains apart from AMDGPU_GEM_DOMAIN_GTT map 1:1 to TTM
placements, while GTT can be either AMDGPU_PL_PREEMPT or TTM_PL_TT,
depending on AMDGPU_GEM_CREATE_PREEMPTIBLE.

Simplify a few places in the code which convert the TTM placement into
a domain by checking against the current placement directly.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  4 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 30 ++++++++++-----------
 2 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 055ba2ea4c12..ff83f8d8628c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -165,8 +165,8 @@ static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach,
 		if (r)
 			return ERR_PTR(r);
 
-	} else if (!(amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type) &
-		     AMDGPU_GEM_DOMAIN_GTT)) {
+	} else if (bo->tbo.resource->mem_type != TTM_PL_TT &&
+		   bo->tbo.resource->mem_type != AMDGPU_PL_PREEMPT) {
 		return ERR_PTR(-EBUSY);
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 8bc79924d171..fb984669fc3a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -976,12 +976,12 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
 
 	ttm_bo_pin(&bo->tbo);
 
-	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
-	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
+	if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
 		atomic64_add(amdgpu_bo_size(bo), &adev->vram_pin_size);
 		atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
 			     &adev->visible_pin_size);
-	} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
+	} else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
+		   bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
 		atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
 	}
 
@@ -1280,7 +1280,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
 {
 	uint64_t size = amdgpu_bo_size(bo);
 	struct drm_gem_object *obj;
-	unsigned int domain;
 	bool shared;
 
 	/* Abort if the BO doesn't currently have a backing store */
@@ -1290,21 +1289,21 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
 	obj = &bo->tbo.base;
 	shared = drm_gem_object_is_shared_for_memory_stats(obj);
 
-	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
-	switch (domain) {
-	case AMDGPU_GEM_DOMAIN_VRAM:
+	switch (bo->tbo.resource->mem_type) {
+	case TTM_PL_VRAM:
 		stats->vram += size;
 		if (amdgpu_bo_in_cpu_visible_vram(bo))
 			stats->visible_vram += size;
 		if (shared)
 			stats->vram_shared += size;
 		break;
-	case AMDGPU_GEM_DOMAIN_GTT:
+	case TTM_PL_TT:
+	case AMDGPU_PL_PREEMPT:
 		stats->gtt += size;
 		if (shared)
 			stats->gtt_shared += size;
 		break;
-	case AMDGPU_GEM_DOMAIN_CPU:
+	case TTM_PL_SYSTEM:
 	default:
 		stats->cpu += size;
 		if (shared)
@@ -1317,7 +1316,7 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
 		if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
 			stats->requested_visible_vram += size;
 
-		if (domain != AMDGPU_GEM_DOMAIN_VRAM) {
+		if (bo->tbo.resource->mem_type != TTM_PL_VRAM) {
 			stats->evicted_vram += size;
 			if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
 				stats->evicted_visible_vram += size;
@@ -1592,19 +1591,18 @@ u64 amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, struct seq_file *m)
 	u64 size;
 
 	if (dma_resv_trylock(bo->tbo.base.resv)) {
-		unsigned int domain;
-		domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
-		switch (domain) {
-		case AMDGPU_GEM_DOMAIN_VRAM:
+		switch (bo->tbo.resource->mem_type) {
+		case TTM_PL_VRAM:
 			if (amdgpu_bo_in_cpu_visible_vram(bo))
 				placement = "VRAM VISIBLE";
 			else
 				placement = "VRAM";
 			break;
-		case AMDGPU_GEM_DOMAIN_GTT:
+		case TTM_PL_TT:
+		case AMDGPU_PL_PREEMPT:
 			placement = "GTT";
 			break;
-		case AMDGPU_GEM_DOMAIN_CPU:
+		case TTM_PL_SYSTEM:
 		default:
 			placement = "CPU";
 			break;
-- 
2.44.0



* [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting
  2024-04-26 16:43 [PATCH 0/3] Some refactoring and maybe a memory accounting fixlet Tvrtko Ursulin
  2024-04-26 16:43 ` [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper Tvrtko Ursulin
  2024-04-26 16:43 ` [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection Tvrtko Ursulin
@ 2024-04-26 16:43 ` Tvrtko Ursulin
  2024-04-26 22:24   ` Felix Kuehling
  2024-04-29 11:06   ` Christian König
  2 siblings, 2 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-04-26 16:43 UTC (permalink / raw)
  To: amd-gfx; +Cc: Christian König, kernel-dev, Tvrtko Ursulin, Felix Kuehling

From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible
SG BOs") added a new TTM region it failed to notice the conceptual
imbalance this creates in GART pin size accounting, as done in
amdgpu_bo_pin/unpin.

That imbalance leads to such objects getting accounted against the
resource when pinned, but not un-accounted when unpinned.

Fix by extending the accounting criteria in amdgpu_bo_unpin.

What also appears to need fixing is to stop reporting their size from
amdgpu_bo_get_memory, which is used to implement fdinfo stats, so they
are not mixed with the regular userspace created and driver owned
objects.

And also amdgpu_bo_print_info for debugfs reporting.

Note that the patch depends on the previous one which broke down the
relevant checks from the domain based to placement based.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible SG BOs")
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index fb984669fc3a..5a2bbc793953 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
 		atomic64_sub(amdgpu_bo_size(bo), &adev->vram_pin_size);
 		atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
 			     &adev->visible_pin_size);
-	} else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
+	} else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
+		   bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
 		atomic64_sub(amdgpu_bo_size(bo), &adev->gart_pin_size);
 	}
 
@@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
 			stats->vram_shared += size;
 		break;
 	case TTM_PL_TT:
-	case AMDGPU_PL_PREEMPT:
 		stats->gtt += size;
 		if (shared)
 			stats->gtt_shared += size;
@@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, struct seq_file *m)
 				placement = "VRAM";
 			break;
 		case TTM_PL_TT:
-		case AMDGPU_PL_PREEMPT:
 			placement = "GTT";
 			break;
 		case TTM_PL_SYSTEM:
-- 
2.44.0



* Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting
  2024-04-26 16:43 ` [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting Tvrtko Ursulin
@ 2024-04-26 22:24   ` Felix Kuehling
  2024-04-29  9:43     ` Tvrtko Ursulin
  2024-04-29 11:06   ` Christian König
  1 sibling, 1 reply; 22+ messages in thread
From: Felix Kuehling @ 2024-04-26 22:24 UTC (permalink / raw)
  To: Tvrtko Ursulin, amd-gfx; +Cc: Christian König, kernel-dev, Tvrtko Ursulin


On 2024-04-26 12:43, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible
> SG BOs") added a new TTM region it failed to notice the conceptual
> imbalance this creates in GART pin size accounting, as done in
> amdgpu_bo_pin/unpin.
>
> That imbalance leads to such objects getting accounted against the
> resource when pinned, but not un-accounted when unpinned.

AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
pinned. In any case you should make sure that the accounting is 
consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
patch breaks that consistency.


>
> Fix by extending the accounting criteria in amdgpu_bo_unpin.
>
> What also appears to need fixing is to stop reporting their size from
> amdgpu_bo_get_memory, which is used to implement fdinfo stats, so they
> are not mixed with the regular userspace created and driver owned
> objects.

I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
does use system memory and it is GPU accessible, just like GTT. The only 
difference is that it's not subject to the GTT limits, because eviction 
of such BOs is handled by callbacks other than TTM evictions and doesn't 
need to wait for fences.

Regards,
   Felix


>
> And also amdgpu_bo_print_info for debugfs reporting.
>
> Note that the patch depends on the previous one which broke down the
> relevant checks from the domain based to placement based.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible SG BOs")
> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index fb984669fc3a..5a2bbc793953 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
>   		atomic64_sub(amdgpu_bo_size(bo), &adev->vram_pin_size);
>   		atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
>   			     &adev->visible_pin_size);
> -	} else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
> +	} else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
> +		   bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
>   		atomic64_sub(amdgpu_bo_size(bo), &adev->gart_pin_size);
>   	}
>   
> @@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   			stats->vram_shared += size;
>   		break;
>   	case TTM_PL_TT:
> -	case AMDGPU_PL_PREEMPT:
>   		stats->gtt += size;
>   		if (shared)
>   			stats->gtt_shared += size;
> @@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, struct seq_file *m)
>   				placement = "VRAM";
>   			break;
>   		case TTM_PL_TT:
> -		case AMDGPU_PL_PREEMPT:
>   			placement = "GTT";
>   			break;
>   		case TTM_PL_SYSTEM:


* Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting
  2024-04-26 22:24   ` Felix Kuehling
@ 2024-04-29  9:43     ` Tvrtko Ursulin
  2024-04-29 11:11       ` Christian König
  2024-04-29 14:43       ` Felix Kuehling
  0 siblings, 2 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-04-29  9:43 UTC (permalink / raw)
  To: Felix Kuehling, Tvrtko Ursulin, amd-gfx; +Cc: Christian König, kernel-dev


On 26/04/2024 23:24, Felix Kuehling wrote:
> 
> On 2024-04-26 12:43, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>
>> When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible
>> SG BOs") added a new TTM region it failed to notice the conceptual
>> imbalance this creates in GART pin size accounting, as done in
>> amdgpu_bo_pin/unpin.
>>
>> That imbalance leads to such objects getting accounted against the
>> resource when pinned, but not un-accounted when unpinned.
> 
> AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
> pinned. In any case you should make sure that the accounting is 
> consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
> patch breaks that consistency.

You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run for 
such objects, or something else?

If they run, then at the end of pin there is:

	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
...
	} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
		atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);

And unpin has no handling for AMDGPU_PL_PREEMPT.

Ah, I see... does it rely on amdgpu_mem_type_to_domain returning 0 for 
AMDGPU_PL_PREEMPT? My confusion was that I misread the pinning check as 
checking the domain as stored in the BO at creation time.

Although I am still confused by the statement that userptr BOs are not 
pinned. Is it not needed to map them via GART on AMD hardware for the GPU 
to be able to access them?

>> Fix by extending the accounting criteria in amdgpu_bo_unpin.
>>
>> What also appears to need fixing is to stop reporting their size from
>> amdgpu_bo_get_memory, which is used to implement fdinfo stats, so they
>> are not mixed with the regular userspace created and driver owned
>> objects.
> 
> I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
> does use system memory and it is GPU accessible, just like GTT. The only 
> difference is that it's not subject to the GTT limits, because eviction 
> of such BOs is handled by callbacks other than TTM evictions and doesn't 
> need to wait for fences.

As in you think those two hunks of the patch are correct?

Regards,

Tvrtko


> Regards,
>    Felix
> 
> 
>>
>> And also amdgpu_bo_print_info for debugfs reporting.
>>
>> Note that the patch depends on the previous one which broke down the
>> relevant checks from the domain based to placement based.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible SG 
>> BOs")
>> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
>> Cc: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index fb984669fc3a..5a2bbc793953 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
>>           atomic64_sub(amdgpu_bo_size(bo), &adev->vram_pin_size);
>>           atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
>>                    &adev->visible_pin_size);
>> -    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
>> +    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
>> +           bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
>>           atomic64_sub(amdgpu_bo_size(bo), &adev->gart_pin_size);
>>       }
>> @@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>               stats->vram_shared += size;
>>           break;
>>       case TTM_PL_TT:
>> -    case AMDGPU_PL_PREEMPT:
>>           stats->gtt += size;
>>           if (shared)
>>               stats->gtt_shared += size;
>> @@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct 
>> amdgpu_bo *bo, struct seq_file *m)
>>                   placement = "VRAM";
>>               break;
>>           case TTM_PL_TT:
>> -        case AMDGPU_PL_PREEMPT:
>>               placement = "GTT";
>>               break;
>>           case TTM_PL_SYSTEM:


* Re: [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper
  2024-04-26 16:43 ` [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper Tvrtko Ursulin
@ 2024-04-29 11:02   ` Christian König
  2024-04-29 13:34     ` Tvrtko Ursulin
  0 siblings, 1 reply; 22+ messages in thread
From: Christian König @ 2024-04-29 11:02 UTC (permalink / raw)
  To: Tvrtko Ursulin, amd-gfx; +Cc: Christian König, kernel-dev, Tvrtko Ursulin

Am 26.04.24 um 18:43 schrieb Tvrtko Ursulin:
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> Help code readability by replacing a bunch of:
>
> bo->tbo.base.resv == vm->root.bo->tbo.base.resv
>
> With:
>
> amdgpu_bo_is_vm_bo(bo, vm)
>
> No functional changes.

Ah, yes that was on my TODO list as well.

But I would have rather added this to the VM instead. In other words 
move it to amdgpu_vm.h and call it amdgpu_vm_is_bo_always_valid() or 
something like that.

Regards,
Christian.

>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    |  2 +-
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 14 ++++++++++
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 31 +++++++++-------------
>   3 files changed, 28 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> index 67c234bcf89f..32e4a9c6e805 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
> @@ -174,7 +174,7 @@ static int amdgpu_gem_object_open(struct drm_gem_object *obj,
>   		return -EPERM;
>   
>   	if (abo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID &&
> -	    abo->tbo.base.resv != vm->root.bo->tbo.base.resv)
> +	    !amdgpu_bo_is_vm_bo(abo, vm))
>   		return -EPERM;
>   
>   	r = amdgpu_bo_reserve(abo, false);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> index be679c42b0b8..f2bb6965cc77 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
> @@ -307,6 +307,20 @@ static inline struct amdgpu_bo *amdgpu_bo_shadowed(struct amdgpu_bo *bo)
>   	return NULL;
>   }
>   
> +/**
> + * amdgpu_bo_is_vm_bo - check if the BO is an always valid per-VM BO
> + *
> + * @bo: BO to be tested.
> + * @vm: VM to test against.
> + *
> + * Returns true if the BO shares the reservation object with @vm's root PD.
> + */
> +static inline bool amdgpu_bo_is_vm_bo(struct amdgpu_bo *bo,
> +				      struct amdgpu_vm *vm)
> +{
> +	return bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv;
> +}
> +
>   bool amdgpu_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
>   void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 domain);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> index 8af3f0fd3073..6d6f0e325172 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
> @@ -333,7 +333,7 @@ void amdgpu_vm_bo_base_init(struct amdgpu_vm_bo_base *base,
>   	base->next = bo->vm_bo;
>   	bo->vm_bo = base;
>   
> -	if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
> +	if (!amdgpu_bo_is_vm_bo(bo, vm))
>   		return;
>   
>   	dma_resv_assert_held(vm->root.bo->tbo.base.resv);
> @@ -1101,13 +1101,12 @@ static void amdgpu_vm_bo_get_memory(struct amdgpu_bo_va *bo_va,
>   	 * For now ignore BOs which are currently locked and potentially
>   	 * changing their location.
>   	 */
> -	if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv &&
> -	    !dma_resv_trylock(bo->tbo.base.resv))
> +	if (!amdgpu_bo_is_vm_bo(bo, vm) && !dma_resv_trylock(bo->tbo.base.resv))
>   		return;
>   
>   	amdgpu_bo_get_memory(bo, stats);
> -	if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
> -	    dma_resv_unlock(bo->tbo.base.resv);
> +	if (!amdgpu_bo_is_vm_bo(bo, vm))
> +		dma_resv_unlock(bo->tbo.base.resv);
>   }
>   
>   void amdgpu_vm_get_memory(struct amdgpu_vm *vm,
> @@ -1203,8 +1202,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
>   		uncached = false;
>   	}
>   
> -	if (clear || (bo && bo->tbo.base.resv ==
> -		      vm->root.bo->tbo.base.resv))
> +	if (clear || amdgpu_bo_is_vm_bo(bo, vm))
>   		last_update = &vm->last_update;
>   	else
>   		last_update = &bo_va->last_pt_update;
> @@ -1246,7 +1244,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va,
>   	 * the evicted list so that it gets validated again on the
>   	 * next command submission.
>   	 */
> -	if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv) {
> +	if (amdgpu_bo_is_vm_bo(bo, vm)) {
>   		uint32_t mem_type = bo->tbo.resource->mem_type;
>   
>   		if (!(bo->preferred_domains &
> @@ -1640,10 +1638,9 @@ static void amdgpu_vm_bo_insert_map(struct amdgpu_device *adev,
>   	if (mapping->flags & AMDGPU_PTE_PRT)
>   		amdgpu_vm_prt_get(adev);
>   
> -	if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
> -	    !bo_va->base.moved) {
> +	if (amdgpu_bo_is_vm_bo(bo, vm) && !bo_va->base.moved)
>   		amdgpu_vm_bo_moved(&bo_va->base);
> -	}
> +
>   	trace_amdgpu_vm_bo_map(bo_va, mapping);
>   }
>   
> @@ -1922,8 +1919,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
>   		if (before->flags & AMDGPU_PTE_PRT)
>   			amdgpu_vm_prt_get(adev);
>   
> -		if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
> -		    !before->bo_va->base.moved)
> +		if (amdgpu_bo_is_vm_bo(bo, vm) && !before->bo_va->base.moved)
>   			amdgpu_vm_bo_moved(&before->bo_va->base);
>   	} else {
>   		kfree(before);
> @@ -1937,8 +1933,7 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
>   		if (after->flags & AMDGPU_PTE_PRT)
>   			amdgpu_vm_prt_get(adev);
>   
> -		if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
> -		    !after->bo_va->base.moved)
> +		if (amdgpu_bo_is_vm_bo(bo, vm) && !after->bo_va->base.moved)
>   			amdgpu_vm_bo_moved(&after->bo_va->base);
>   	} else {
>   		kfree(after);
> @@ -2017,7 +2012,7 @@ void amdgpu_vm_bo_del(struct amdgpu_device *adev,
>   
>   	if (bo) {
>   		dma_resv_assert_held(bo->tbo.base.resv);
> -		if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv)
> +		if (amdgpu_bo_is_vm_bo(bo, vm))
>   			ttm_bo_set_bulk_move(&bo->tbo, NULL);
>   
>   		for (base = &bo_va->base.bo->vm_bo; *base;
> @@ -2111,7 +2106,7 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
>   	for (bo_base = bo->vm_bo; bo_base; bo_base = bo_base->next) {
>   		struct amdgpu_vm *vm = bo_base->vm;
>   
> -		if (evicted && bo->tbo.base.resv == vm->root.bo->tbo.base.resv) {
> +		if (evicted && amdgpu_bo_is_vm_bo(bo, vm)) {
>   			amdgpu_vm_bo_evicted(bo_base);
>   			continue;
>   		}
> @@ -2122,7 +2117,7 @@ void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
>   
>   		if (bo->tbo.type == ttm_bo_type_kernel)
>   			amdgpu_vm_bo_relocated(bo_base);
> -		else if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv)
> +		else if (amdgpu_bo_is_vm_bo(bo, vm))
>   			amdgpu_vm_bo_moved(bo_base);
>   		else
>   			amdgpu_vm_bo_invalidated(bo_base);



* Re: [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection
  2024-04-26 16:43 ` [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection Tvrtko Ursulin
@ 2024-04-29 11:03   ` Christian König
  2024-04-29 13:38     ` Tvrtko Ursulin
  0 siblings, 1 reply; 22+ messages in thread
From: Christian König @ 2024-04-29 11:03 UTC (permalink / raw)
  To: Tvrtko Ursulin, amd-gfx; +Cc: Christian König, kernel-dev, Tvrtko Ursulin

Am 26.04.24 um 18:43 schrieb Tvrtko Ursulin:
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> All apart from AMDGPU_GEM_DOMAIN_GTT memory domains map 1:1 to TTM
> placements. And the former can be either AMDGPU_PL_PREEMPT or TTM_PL_TT,
> depending on AMDGPU_GEM_CREATE_PREEMPTIBLE.
>
> Simplify a few places in the code which convert the TTM placement into
> a domain by checking against the current placement directly.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

Reviewed-by: Christian König <christian.koenig@amd.com>

> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  4 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 30 ++++++++++-----------
>   2 files changed, 16 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index 055ba2ea4c12..ff83f8d8628c 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -165,8 +165,8 @@ static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach,
>   		if (r)
>   			return ERR_PTR(r);
>   
> -	} else if (!(amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type) &
> -		     AMDGPU_GEM_DOMAIN_GTT)) {
> +	} else if (bo->tbo.resource->mem_type != TTM_PL_TT &&
> +		   bo->tbo.resource->mem_type != AMDGPU_PL_PREEMPT) {
>   		return ERR_PTR(-EBUSY);
>   	}
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 8bc79924d171..fb984669fc3a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -976,12 +976,12 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
>   
>   	ttm_bo_pin(&bo->tbo);
>   
> -	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> -	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
> +	if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
>   		atomic64_add(amdgpu_bo_size(bo), &adev->vram_pin_size);
>   		atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
>   			     &adev->visible_pin_size);
> -	} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
> +	} else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
> +		   bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
>   		atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
>   	}
>   
> @@ -1280,7 +1280,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   {
>   	uint64_t size = amdgpu_bo_size(bo);
>   	struct drm_gem_object *obj;
> -	unsigned int domain;
>   	bool shared;
>   
>   	/* Abort if the BO doesn't currently have a backing store */
> @@ -1290,21 +1289,21 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   	obj = &bo->tbo.base;
>   	shared = drm_gem_object_is_shared_for_memory_stats(obj);
>   
> -	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> -	switch (domain) {
> -	case AMDGPU_GEM_DOMAIN_VRAM:
> +	switch (bo->tbo.resource->mem_type) {
> +	case TTM_PL_VRAM:
>   		stats->vram += size;
>   		if (amdgpu_bo_in_cpu_visible_vram(bo))
>   			stats->visible_vram += size;
>   		if (shared)
>   			stats->vram_shared += size;
>   		break;
> -	case AMDGPU_GEM_DOMAIN_GTT:
> +	case TTM_PL_TT:
> +	case AMDGPU_PL_PREEMPT:
>   		stats->gtt += size;
>   		if (shared)
>   			stats->gtt_shared += size;
>   		break;
> -	case AMDGPU_GEM_DOMAIN_CPU:
> +	case TTM_PL_SYSTEM:
>   	default:
>   		stats->cpu += size;
>   		if (shared)
> @@ -1317,7 +1316,7 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   		if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
>   			stats->requested_visible_vram += size;
>   
> -		if (domain != AMDGPU_GEM_DOMAIN_VRAM) {
> +		if (bo->tbo.resource->mem_type != TTM_PL_VRAM) {
>   			stats->evicted_vram += size;
>   			if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
>   				stats->evicted_visible_vram += size;
> @@ -1592,19 +1591,18 @@ u64 amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, struct seq_file *m)
>   	u64 size;
>   
>   	if (dma_resv_trylock(bo->tbo.base.resv)) {
> -		unsigned int domain;
> -		domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> -		switch (domain) {
> -		case AMDGPU_GEM_DOMAIN_VRAM:
> +		switch (bo->tbo.resource->mem_type) {
> +		case TTM_PL_VRAM:
>   			if (amdgpu_bo_in_cpu_visible_vram(bo))
>   				placement = "VRAM VISIBLE";
>   			else
>   				placement = "VRAM";
>   			break;
> -		case AMDGPU_GEM_DOMAIN_GTT:
> +		case TTM_PL_TT:
> +		case AMDGPU_PL_PREEMPT:
>   			placement = "GTT";
>   			break;
> -		case AMDGPU_GEM_DOMAIN_CPU:
> +		case TTM_PL_SYSTEM:
>   		default:
>   			placement = "CPU";
>   			break;



* Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting
  2024-04-26 16:43 ` [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting Tvrtko Ursulin
  2024-04-26 22:24   ` Felix Kuehling
@ 2024-04-29 11:06   ` Christian König
  1 sibling, 0 replies; 22+ messages in thread
From: Christian König @ 2024-04-29 11:06 UTC (permalink / raw)
  To: Tvrtko Ursulin, amd-gfx
  Cc: Christian König, kernel-dev, Tvrtko Ursulin, Felix Kuehling

Am 26.04.24 um 18:43 schrieb Tvrtko Ursulin:
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible
> SG BOs") added a new TTM region it failed to notice the conceptual
> imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.
>
> That imbalance leads to such objects getting accounted against the
> resource when pinned, but not un-accounted when unpinned.
>
> Fix by extending the accounting criteria in amdgpu_bo_unpin.
>
> What also appears to need fixing is to stop reporting their size from
> amdgpu_bo_get_memory, which is used to implement fdinfo stats, so they are
> not mixed with the regular userspace-created and driver-owned objects.
>
> And also amdgpu_bo_print_info for debugfs reporting.
>
> Note that the patch depends on the previous one which broke down the
> relevant checks from the domain based to placement based.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible SG BOs")
> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
>   1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index fb984669fc3a..5a2bbc793953 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
>   		atomic64_sub(amdgpu_bo_size(bo), &adev->vram_pin_size);
>   		atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
>   			     &adev->visible_pin_size);
> -	} else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
> +	} else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
> +		   bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {

Good catch, but please separate that one from the other changes since we 
probably want to backport it.

>   		atomic64_sub(amdgpu_bo_size(bo), &adev->gart_pin_size);
>   	}
>   
> @@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   			stats->vram_shared += size;
>   		break;
>   	case TTM_PL_TT:
> -	case AMDGPU_PL_PREEMPT:
>   		stats->gtt += size;
>   		if (shared)
>   			stats->gtt_shared += size;
> @@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, struct seq_file *m)
>   				placement = "VRAM";
>   			break;
>   		case TTM_PL_TT:
> -		case AMDGPU_PL_PREEMPT:

Yeah, that makes sense as well. But we need a case for AMDGPU_PL_PREEMPT 
here as well then.

Regards,
Christian.

>   			placement = "GTT";
>   			break;
>   		case TTM_PL_SYSTEM:



* Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting
  2024-04-29  9:43     ` Tvrtko Ursulin
@ 2024-04-29 11:11       ` Christian König
  2024-04-29 13:45         ` Tvrtko Ursulin
  2024-04-29 14:43       ` Felix Kuehling
  1 sibling, 1 reply; 22+ messages in thread
From: Christian König @ 2024-04-29 11:11 UTC (permalink / raw)
  To: Tvrtko Ursulin, Felix Kuehling, Tvrtko Ursulin, amd-gfx; +Cc: kernel-dev

Am 29.04.24 um 11:43 schrieb Tvrtko Ursulin:
>
> On 26/04/2024 23:24, Felix Kuehling wrote:
>>
>> On 2024-04-26 12:43, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>
>>> When commit b453e42a6e8b ("drm/amdgpu: Add new placement for 
>>> preemptible
>>> SG BOs") added a new TTM region it failed to notice the conceptual
>>> imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.
>>>
>>> That imbalance leads to such objects getting accounted against the
>>> resource when pinned, but not un-accounted when unpinned.
>>
>> AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
>> pinned. In any case you should make sure that the accounting is 
>> consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
>> patch breaks that consistency.
>
> You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run for 
> such objects, or something else?
>
> If they run, then at the end of pin there is:
>
>     domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> ...
>     } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
>         atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
>
> And unpin has no handling for AMDGPU_PL_PREEMPT.
>
> Ah I see.. does it rely on amdgpu_mem_type_to_domain returning 0 for 
> AMDGPU_PL_PREEMPT? My confusion was I misread the pinning check as 
> checking the domain as stored in the bo at creation time.
>
> Although I am still confused by the statement userptr BOs are not 
> pinned. Is it not needed to map them via GART on AMD hardware for the
> GPU to be able to access them?

No, a GART mapping is only needed if you want to scanout from them or 
otherwise use them from the kernel on the GPU.

Background is that the kernel doesn't have a VM with page tables.

>>> Fix by extending the accounting criteria in amdgpu_bo_unpin.
>>>
>>> What also appears to need fixing is to stop reporting their size from
>>> the amdgpu_bo_get_memory, which is used to implement fdinfo stats, so
>>> they are not mixed with the regular userspace-created and driver-owned
>>> objects.
>>
>> I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
>> does use system memory and it is GPU accessible, just like GTT. The 
>> only difference is, that it's not subject to the GTT limits because 
>> their eviction is handled by callbacks other than TTM evictions and 
>> doesn't need to wait for fences.
>
> As in you think those two hunks of the patch are correct?

I think so as well, yes. But we still need a name for preemptible BOs 
while printing them in debugfs.

Regards,
Christian.

>
> Regards,
>
> Tvrtko
>
>
>> Regards,
>>    Felix
>>
>>
>>>
>>> And also amdgpu_bo_print_info for debugfs reporting.
>>>
>>> Note that the patch depends on the previous one which broke down the
>>> relevant checks from the domain based to placement based.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>> Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible 
>>> SG BOs")
>>> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index fb984669fc3a..5a2bbc793953 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
>>>           atomic64_sub(amdgpu_bo_size(bo), &adev->vram_pin_size);
>>>           atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
>>>                    &adev->visible_pin_size);
>>> -    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
>>> +    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
>>> +           bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
>>>           atomic64_sub(amdgpu_bo_size(bo), &adev->gart_pin_size);
>>>       }
>>> @@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>>               stats->vram_shared += size;
>>>           break;
>>>       case TTM_PL_TT:
>>> -    case AMDGPU_PL_PREEMPT:
>>>           stats->gtt += size;
>>>           if (shared)
>>>               stats->gtt_shared += size;
>>> @@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct 
>>> amdgpu_bo *bo, struct seq_file *m)
>>>                   placement = "VRAM";
>>>               break;
>>>           case TTM_PL_TT:
>>> -        case AMDGPU_PL_PREEMPT:
>>>               placement = "GTT";
>>>               break;
>>>           case TTM_PL_SYSTEM:



* Re: [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper
  2024-04-29 11:02   ` Christian König
@ 2024-04-29 13:34     ` Tvrtko Ursulin
  2024-04-29 13:51       ` Christian König
  0 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-04-29 13:34 UTC (permalink / raw)
  To: Christian König, Tvrtko Ursulin, amd-gfx
  Cc: Christian König, kernel-dev


On 29/04/2024 12:02, Christian König wrote:
> Am 26.04.24 um 18:43 schrieb Tvrtko Ursulin:
>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>
>> Help code readability by replacing a bunch of:
>>
>> bo->tbo.base.resv == vm->root.bo->tbo.base.resv
>>
>> With:
>>
>> amdgpu_bo_is_vm_bo(bo, vm)
>>
>> No functional changes.
> 
> Ah,yes that was on my TODO list as well.
> 
> But I would have rather added this to the VM instead. In other words 
> move it to amdgpu_vm.h and call it amdgpu_vm_is_bo_always_valid() or 
> something like that.

I am happy to move it around as long as you are sure amdgpu_vm.h is the 
location.

For instance, the main API there seems to be amdgpu_vm_bo's. At least all
the amdgpu_bo usages do not need anything more than the struct being
forward declared.

So if I move the helper there I either need to make it include another 
header, or move the helper out of line to amdgpu_vm.c.

Thoughts?

Regards,

Tvrtko

>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    |  2 +-
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 14 ++++++++++
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 31 +++++++++-------------
>>   3 files changed, 28 insertions(+), 19 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> index 67c234bcf89f..32e4a9c6e805 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>> @@ -174,7 +174,7 @@ static int amdgpu_gem_object_open(struct 
>> drm_gem_object *obj,
>>           return -EPERM;
>>       if (abo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID &&
>> -        abo->tbo.base.resv != vm->root.bo->tbo.base.resv)
>> +        !amdgpu_bo_is_vm_bo(abo, vm))
>>           return -EPERM;
>>       r = amdgpu_bo_reserve(abo, false);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> index be679c42b0b8..f2bb6965cc77 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>> @@ -307,6 +307,20 @@ static inline struct amdgpu_bo 
>> *amdgpu_bo_shadowed(struct amdgpu_bo *bo)
>>       return NULL;
>>   }
>> +/**
>> + * amdgpu_bo_is_vm_bo - check if the BO is VM always valid
>> + *
>> + * @abo: BO to be tested.
>> + * @vm: VM to test against.
>> + *
>> + * Returns true if the BO is VM always valid.
>> + */
>> +static inline bool amdgpu_bo_is_vm_bo(struct amdgpu_bo *bo,
>> +                      struct amdgpu_vm *vm)
>> +{
>> +    return bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv;
>> +}
>> +
>>   bool amdgpu_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
>>   void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 
>> domain);
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> index 8af3f0fd3073..6d6f0e325172 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>> @@ -333,7 +333,7 @@ void amdgpu_vm_bo_base_init(struct 
>> amdgpu_vm_bo_base *base,
>>       base->next = bo->vm_bo;
>>       bo->vm_bo = base;
>> -    if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
>> +    if (!amdgpu_bo_is_vm_bo(bo, vm))
>>           return;
>>       dma_resv_assert_held(vm->root.bo->tbo.base.resv);
>> @@ -1101,13 +1101,12 @@ static void amdgpu_vm_bo_get_memory(struct 
>> amdgpu_bo_va *bo_va,
>>        * For now ignore BOs which are currently locked and potentially
>>        * changing their location.
>>        */
>> -    if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv &&
>> -        !dma_resv_trylock(bo->tbo.base.resv))
>> +    if (!amdgpu_bo_is_vm_bo(bo, vm) && 
>> !dma_resv_trylock(bo->tbo.base.resv))
>>           return;
>>       amdgpu_bo_get_memory(bo, stats);
>> -    if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
>> -        dma_resv_unlock(bo->tbo.base.resv);
>> +    if (!amdgpu_bo_is_vm_bo(bo, vm))
>> +        dma_resv_unlock(bo->tbo.base.resv);
>>   }
>>   void amdgpu_vm_get_memory(struct amdgpu_vm *vm,
>> @@ -1203,8 +1202,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device 
>> *adev, struct amdgpu_bo_va *bo_va,
>>           uncached = false;
>>       }
>> -    if (clear || (bo && bo->tbo.base.resv ==
>> -              vm->root.bo->tbo.base.resv))
>> +    if (clear || amdgpu_bo_is_vm_bo(bo, vm))
>>           last_update = &vm->last_update;
>>       else
>>           last_update = &bo_va->last_pt_update;
>> @@ -1246,7 +1244,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device 
>> *adev, struct amdgpu_bo_va *bo_va,
>>        * the evicted list so that it gets validated again on the
>>        * next command submission.
>>        */
>> -    if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv) {
>> +    if (amdgpu_bo_is_vm_bo(bo, vm)) {
>>           uint32_t mem_type = bo->tbo.resource->mem_type;
>>           if (!(bo->preferred_domains &
>> @@ -1640,10 +1638,9 @@ static void amdgpu_vm_bo_insert_map(struct 
>> amdgpu_device *adev,
>>       if (mapping->flags & AMDGPU_PTE_PRT)
>>           amdgpu_vm_prt_get(adev);
>> -    if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
>> -        !bo_va->base.moved) {
>> +    if (amdgpu_bo_is_vm_bo(bo, vm) && !bo_va->base.moved)
>>           amdgpu_vm_bo_moved(&bo_va->base);
>> -    }
>> +
>>       trace_amdgpu_vm_bo_map(bo_va, mapping);
>>   }
>> @@ -1922,8 +1919,7 @@ int amdgpu_vm_bo_clear_mappings(struct 
>> amdgpu_device *adev,
>>           if (before->flags & AMDGPU_PTE_PRT)
>>               amdgpu_vm_prt_get(adev);
>> -        if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
>> -            !before->bo_va->base.moved)
>> +        if (amdgpu_bo_is_vm_bo(bo, vm) && !before->bo_va->base.moved)
>>               amdgpu_vm_bo_moved(&before->bo_va->base);
>>       } else {
>>           kfree(before);
>> @@ -1937,8 +1933,7 @@ int amdgpu_vm_bo_clear_mappings(struct 
>> amdgpu_device *adev,
>>           if (after->flags & AMDGPU_PTE_PRT)
>>               amdgpu_vm_prt_get(adev);
>> -        if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
>> -            !after->bo_va->base.moved)
>> +        if (amdgpu_bo_is_vm_bo(bo, vm) && !after->bo_va->base.moved)
>>               amdgpu_vm_bo_moved(&after->bo_va->base);
>>       } else {
>>           kfree(after);
>> @@ -2017,7 +2012,7 @@ void amdgpu_vm_bo_del(struct amdgpu_device *adev,
>>       if (bo) {
>>           dma_resv_assert_held(bo->tbo.base.resv);
>> -        if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv)
>> +        if (amdgpu_bo_is_vm_bo(bo, vm))
>>               ttm_bo_set_bulk_move(&bo->tbo, NULL);
>>           for (base = &bo_va->base.bo->vm_bo; *base;
>> @@ -2111,7 +2106,7 @@ void amdgpu_vm_bo_invalidate(struct 
>> amdgpu_device *adev,
>>       for (bo_base = bo->vm_bo; bo_base; bo_base = bo_base->next) {
>>           struct amdgpu_vm *vm = bo_base->vm;
>> -        if (evicted && bo->tbo.base.resv == 
>> vm->root.bo->tbo.base.resv) {
>> +        if (evicted && amdgpu_bo_is_vm_bo(bo, vm)) {
>>               amdgpu_vm_bo_evicted(bo_base);
>>               continue;
>>           }
>> @@ -2122,7 +2117,7 @@ void amdgpu_vm_bo_invalidate(struct 
>> amdgpu_device *adev,
>>           if (bo->tbo.type == ttm_bo_type_kernel)
>>               amdgpu_vm_bo_relocated(bo_base);
>> -        else if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv)
>> +        else if (amdgpu_bo_is_vm_bo(bo, vm))
>>               amdgpu_vm_bo_moved(bo_base);
>>           else
>>               amdgpu_vm_bo_invalidated(bo_base);
> 


* Re: [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection
  2024-04-29 11:03   ` Christian König
@ 2024-04-29 13:38     ` Tvrtko Ursulin
  0 siblings, 0 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-04-29 13:38 UTC (permalink / raw)
  To: Christian König, Tvrtko Ursulin, amd-gfx
  Cc: Christian König, kernel-dev


Hi Christian,

On 29/04/2024 12:03, Christian König wrote:
> Am 26.04.24 um 18:43 schrieb Tvrtko Ursulin:
>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>
>> All apart from AMDGPU_GEM_DOMAIN_GTT memory domains map 1:1 to TTM
>> placements. And the former can be either AMDGPU_PL_PREEMPT or TTM_PL_TT,
>> depending on AMDGPU_GEM_CREATE_PREEMPTIBLE.
>>
>> Simplify a few places in the code which convert the TTM placement into
>> a domain by checking against the current placement directly.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> 
> Reviewed-by: Christian König <christian.koenig@amd.com>

Thanks! I am however a bit unsure now, read below..

>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  4 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 30 ++++++++++-----------
>>   2 files changed, 16 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> index 055ba2ea4c12..ff83f8d8628c 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> @@ -165,8 +165,8 @@ static struct sg_table *amdgpu_dma_buf_map(struct 
>> dma_buf_attachment *attach,
>>           if (r)
>>               return ERR_PTR(r);
>> -    } else if (!(amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type) &
>> -             AMDGPU_GEM_DOMAIN_GTT)) {
>> +    } else if (bo->tbo.resource->mem_type != TTM_PL_TT &&
>> +           bo->tbo.resource->mem_type != AMDGPU_PL_PREEMPT) {
>>           return ERR_PTR(-EBUSY);

... this for instance (one hunk below too).

Because what we are discussing in 3/3, 
amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type) does not return a 
valid domain for AMDGPU_PL_PREEMPT. Which could make mentioning it here 
misleading. So I am not really sure what the design is supposed to be.

Is this a weakness in the current code? Maybe best to leave the 
discussion for 3/3.

Regards,

Tvrtko

>>       }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index 8bc79924d171..fb984669fc3a 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -976,12 +976,12 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo 
>> *bo, u32 domain,
>>       ttm_bo_pin(&bo->tbo);
>> -    domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
>> -    if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
>> +    if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
>>           atomic64_add(amdgpu_bo_size(bo), &adev->vram_pin_size);
>>           atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
>>                    &adev->visible_pin_size);
>> -    } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
>> +    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
>> +           bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
>>           atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
>>       }
>> @@ -1280,7 +1280,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>   {
>>       uint64_t size = amdgpu_bo_size(bo);
>>       struct drm_gem_object *obj;
>> -    unsigned int domain;
>>       bool shared;
>>       /* Abort if the BO doesn't currently have a backing store */
>> @@ -1290,21 +1289,21 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>       obj = &bo->tbo.base;
>>       shared = drm_gem_object_is_shared_for_memory_stats(obj);
>> -    domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
>> -    switch (domain) {
>> -    case AMDGPU_GEM_DOMAIN_VRAM:
>> +    switch (bo->tbo.resource->mem_type) {
>> +    case TTM_PL_VRAM:
>>           stats->vram += size;
>>           if (amdgpu_bo_in_cpu_visible_vram(bo))
>>               stats->visible_vram += size;
>>           if (shared)
>>               stats->vram_shared += size;
>>           break;
>> -    case AMDGPU_GEM_DOMAIN_GTT:
>> +    case TTM_PL_TT:
>> +    case AMDGPU_PL_PREEMPT:
>>           stats->gtt += size;
>>           if (shared)
>>               stats->gtt_shared += size;
>>           break;
>> -    case AMDGPU_GEM_DOMAIN_CPU:
>> +    case TTM_PL_SYSTEM:
>>       default:
>>           stats->cpu += size;
>>           if (shared)
>> @@ -1317,7 +1316,7 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>           if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
>>               stats->requested_visible_vram += size;
>> -        if (domain != AMDGPU_GEM_DOMAIN_VRAM) {
>> +        if (bo->tbo.resource->mem_type != TTM_PL_VRAM) {
>>               stats->evicted_vram += size;
>>               if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
>>                   stats->evicted_visible_vram += size;
>> @@ -1592,19 +1591,18 @@ u64 amdgpu_bo_print_info(int id, struct 
>> amdgpu_bo *bo, struct seq_file *m)
>>       u64 size;
>>       if (dma_resv_trylock(bo->tbo.base.resv)) {
>> -        unsigned int domain;
>> -        domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
>> -        switch (domain) {
>> -        case AMDGPU_GEM_DOMAIN_VRAM:
>> +        switch (bo->tbo.resource->mem_type) {
>> +        case TTM_PL_VRAM:
>>               if (amdgpu_bo_in_cpu_visible_vram(bo))
>>                   placement = "VRAM VISIBLE";
>>               else
>>                   placement = "VRAM";
>>               break;
>> -        case AMDGPU_GEM_DOMAIN_GTT:
>> +        case TTM_PL_TT:
>> +        case AMDGPU_PL_PREEMPT:
>>               placement = "GTT";
>>               break;
>> -        case AMDGPU_GEM_DOMAIN_CPU:
>> +        case TTM_PL_SYSTEM:
>>           default:
>>               placement = "CPU";
>>               break;
> 


* Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting
  2024-04-29 11:11       ` Christian König
@ 2024-04-29 13:45         ` Tvrtko Ursulin
  2024-04-29 14:34           ` Felix Kuehling
  0 siblings, 1 reply; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-04-29 13:45 UTC (permalink / raw)
  To: Christian König, Felix Kuehling, Tvrtko Ursulin, amd-gfx; +Cc: kernel-dev


On 29/04/2024 12:11, Christian König wrote:
> Am 29.04.24 um 11:43 schrieb Tvrtko Ursulin:
>>
>> On 26/04/2024 23:24, Felix Kuehling wrote:
>>>
>>> On 2024-04-26 12:43, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>>
>>>> When commit b453e42a6e8b ("drm/amdgpu: Add new placement for 
>>>> preemptible
>>>> SG BOs") added a new TTM region it failed to notice the conceptual
>>>> imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.
>>>>
>>>> That imbalance leads to such objects getting accounted against the
>>>> resource when pinned, but not un-accounted when unpinned.
>>>
>>> AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
>>> pinned. In any case you should make sure that the accounting is 
>>> consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
>>> patch breaks that consistency.
>>
>> You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run for 
>> such objects, or something else?
>>
>> If they run, then at the end of pin there is:
>>
>>     domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
>> ...
>>     } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
>>         atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
>>
>> And unpin has no handling for AMDGPU_PL_PREEMPT.
>>
>> Ah I see.. does it rely on amdgpu_mem_type_to_domain returning 0 for 
>> AMDGPU_PL_PREEMPT? My confusion was I misread the pinning check as 
>> checking the domain as stored in the bo at creation time.
>>
>> Although I am still confused by the statement userptr BOs are not 
>> pinned. Is it not needed to map them via GART on AMD hardware for the
>> GPU to be able to access them?
> 
> No, a GART mapping is only needed if you want to scanout from them or 
> otherwise use them from the kernel on the GPU.
> 
> Background is that the kernel doesn't have a VM with page tables.

Got it, thanks!

Presumably calling pin/unpin on those is then prevented somewhere else in
the code?

What to do, if anything, with the attempt to address the asymmetry in 
the accounting criteria between the pin and unpin?

I mean domain based on pin:

	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
		atomic64_add(amdgpu_bo_size(bo), &adev->vram_pin_size);
		atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
			     &adev->visible_pin_size);
	} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
		atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
	}

Versus placement based on unpin:

	if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
		atomic64_sub(amdgpu_bo_size(bo), &adev->vram_pin_size);
		atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
			     &adev->visible_pin_size);
	} else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
		atomic64_sub(amdgpu_bo_size(bo), &adev->gart_pin_size);
	}

The fact amdgpu_mem_type_to_domain never translates AMDGPU_PL_PREEMPT
back to a domain means there is indeed currently no bug.

Is 2/3 still desirable to convert the check in pin to be mem_type based?

>>>> Fix by extending the accounting criteria in amdgpu_bo_unpin.
>>>>
>>>> What also appears to need fixing is to stop reporting their size from
>>>> the amdgpu_bo_get_memory, which is used to implement fdinfo stats, so
>>>> they are not mixed with the regular userspace-created and driver-owned
>>>> objects.
>>>
>>> I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
>>> does use system memory and it is GPU accessible, just like GTT. The 
>>> only difference is, that it's not subject to the GTT limits because 
>>> their eviction is handled by callbacks other than TTM evictions and 
>>> doesn't need to wait for fences.
>>
>> As in you think those two hunks of the patch are correct?
> 
> I think so as well, yes. But we still need a name for preemptible BOs 
> while printing them in debugfs.

Currently it looks the name is 'CPU':

amdgpu_bo_print_info()
...
		case AMDGPU_GEM_DOMAIN_CPU:
		default:
			placement = "CPU";
			break;


Also, where to account them in struct amdgpu_mem_stats?

Regards,

Tvrtko

> 
> Regards,
> Christian.
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>
>>> Regards,
>>>    Felix
>>>
>>>
>>>>
>>>> And also amdgpu_bo_print_info for debugfs reporting.
>>>>
>>>> Note that the patch depends on the previous one which broke down the
>>>> relevant checks from the domain based to placement based.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>> Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible 
>>>> SG BOs")
>>>> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
>>>> Cc: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
>>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> index fb984669fc3a..5a2bbc793953 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> @@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
>>>>           atomic64_sub(amdgpu_bo_size(bo), &adev->vram_pin_size);
>>>>           atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
>>>>                    &adev->visible_pin_size);
>>>> -    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
>>>> +    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
>>>> +           bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
>>>>           atomic64_sub(amdgpu_bo_size(bo), &adev->gart_pin_size);
>>>>       }
>>>> @@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>>>               stats->vram_shared += size;
>>>>           break;
>>>>       case TTM_PL_TT:
>>>> -    case AMDGPU_PL_PREEMPT:
>>>>           stats->gtt += size;
>>>>           if (shared)
>>>>               stats->gtt_shared += size;
>>>> @@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct 
>>>> amdgpu_bo *bo, struct seq_file *m)
>>>>                   placement = "VRAM";
>>>>               break;
>>>>           case TTM_PL_TT:
>>>> -        case AMDGPU_PL_PREEMPT:
>>>>               placement = "GTT";
>>>>               break;
>>>>           case TTM_PL_SYSTEM:
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper
  2024-04-29 13:34     ` Tvrtko Ursulin
@ 2024-04-29 13:51       ` Christian König
  0 siblings, 0 replies; 22+ messages in thread
From: Christian König @ 2024-04-29 13:51 UTC (permalink / raw)
  To: Tvrtko Ursulin, Christian König, Tvrtko Ursulin, amd-gfx; +Cc: kernel-dev

Am 29.04.24 um 15:34 schrieb Tvrtko Ursulin:
>
> On 29/04/2024 12:02, Christian König wrote:
>> Am 26.04.24 um 18:43 schrieb Tvrtko Ursulin:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>
>>> Help code readability by replacing a bunch of:
>>>
>>> bo->tbo.base.resv == vm->root.bo->tbo.base.resv
>>>
>>> With:
>>>
>>> amdgpu_bo_is_vm_bo(bo, vm)
>>>
>>> No functional changes.
>>
>> Ah,yes that was on my TODO list as well.
>>
>> But I would have rather added this to the VM instead. In other words 
>> move it to amdgpu_vm.h and call it amdgpu_vm_is_bo_always_valid() or 
>> something like that.
>
> I am happy to move it around as long as you are sure amdgpu_vm.h is 
> the location.
>
> For instance main API there it seems to be amdgpu_vm_bo's. At least 
> all the amdgpu_bo usages do not needing anything more that the struct 
> forward declared.
>
> So if I move the helper there I either need to make it include another 
> header, or move the helper out of line to amdgpu_vm.c.
>
> Thoughts?

amdgpu_vm.c is fine as well. I just thought that something so simple 
could be an inline function in the header.
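A minimal sketch of what the relocated helper could look like, using stand-in structs since the real amdgpu_vm/amdgpu_bo definitions live in the driver headers (the name amdgpu_vm_is_bo_always_valid is the one floated above):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the driver structs, just enough to model the
 * reservation-object comparison the helper performs. */
struct dma_resv { int dummy; };
struct amdgpu_bo { struct dma_resv *resv; };
struct amdgpu_vm { struct amdgpu_bo *root_bo; };

/* Sketch of amdgpu_vm_is_bo_always_valid(): a BO is "always valid" in
 * a VM when it shares the VM root BO's reservation object. */
static bool amdgpu_vm_is_bo_always_valid(struct amdgpu_vm *vm,
					 struct amdgpu_bo *bo)
{
	return bo && bo->resv == vm->root_bo->resv;
}
```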

Regards,
Christian.

>
> Regards,
>
> Tvrtko
>
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c    |  2 +-
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.h | 14 ++++++++++
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c     | 31 
>>> +++++++++-------------
>>>   3 files changed, 28 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> index 67c234bcf89f..32e4a9c6e805 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
>>> @@ -174,7 +174,7 @@ static int amdgpu_gem_object_open(struct 
>>> drm_gem_object *obj,
>>>           return -EPERM;
>>>       if (abo->flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID &&
>>> -        abo->tbo.base.resv != vm->root.bo->tbo.base.resv)
>>> +        !amdgpu_bo_is_vm_bo(abo, vm))
>>>           return -EPERM;
>>>       r = amdgpu_bo_reserve(abo, false);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> index be679c42b0b8..f2bb6965cc77 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> @@ -307,6 +307,20 @@ static inline struct amdgpu_bo 
>>> *amdgpu_bo_shadowed(struct amdgpu_bo *bo)
>>>       return NULL;
>>>   }
>>> +/**
>>> + * amdgpu_bo_is_vm_bo - check if the BO is VM always valid
>>> + *
>>> + * @bo: BO to be tested.
>>> + * @vm: VM to test against.
>>> + *
>>> + * Returns true if the BO is VM always valid.
>>> + */
>>> +static inline bool amdgpu_bo_is_vm_bo(struct amdgpu_bo *bo,
>>> +                      struct amdgpu_vm *vm)
>>> +{
>>> +    return bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv;
>>> +}
>>> +
>>>   bool amdgpu_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
>>>   void amdgpu_bo_placement_from_domain(struct amdgpu_bo *abo, u32 
>>> domain);
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> index 8af3f0fd3073..6d6f0e325172 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
>>> @@ -333,7 +333,7 @@ void amdgpu_vm_bo_base_init(struct 
>>> amdgpu_vm_bo_base *base,
>>>       base->next = bo->vm_bo;
>>>       bo->vm_bo = base;
>>> -    if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
>>> +    if (!amdgpu_bo_is_vm_bo(bo, vm))
>>>           return;
>>>       dma_resv_assert_held(vm->root.bo->tbo.base.resv);
>>> @@ -1101,13 +1101,12 @@ static void amdgpu_vm_bo_get_memory(struct 
>>> amdgpu_bo_va *bo_va,
>>>        * For now ignore BOs which are currently locked and potentially
>>>        * changing their location.
>>>        */
>>> -    if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv &&
>>> -        !dma_resv_trylock(bo->tbo.base.resv))
>>> +    if (!amdgpu_bo_is_vm_bo(bo, vm) && !dma_resv_trylock(bo->tbo.base.resv))
>>>           return;
>>>       amdgpu_bo_get_memory(bo, stats);
>>> -    if (bo->tbo.base.resv != vm->root.bo->tbo.base.resv)
>>> -        dma_resv_unlock(bo->tbo.base.resv);
>>> +    if (!amdgpu_bo_is_vm_bo(bo, vm))
>>> +        dma_resv_unlock(bo->tbo.base.resv);
>>>   }
>>>   void amdgpu_vm_get_memory(struct amdgpu_vm *vm,
>>> @@ -1203,8 +1202,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device 
>>> *adev, struct amdgpu_bo_va *bo_va,
>>>           uncached = false;
>>>       }
>>> -    if (clear || (bo && bo->tbo.base.resv ==
>>> -              vm->root.bo->tbo.base.resv))
>>> +    if (clear || amdgpu_bo_is_vm_bo(bo, vm))
>>>           last_update = &vm->last_update;
>>>       else
>>>           last_update = &bo_va->last_pt_update;
>>> @@ -1246,7 +1244,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device 
>>> *adev, struct amdgpu_bo_va *bo_va,
>>>        * the evicted list so that it gets validated again on the
>>>        * next command submission.
>>>        */
>>> -    if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv) {
>>> +    if (amdgpu_bo_is_vm_bo(bo, vm)) {
>>>           uint32_t mem_type = bo->tbo.resource->mem_type;
>>>           if (!(bo->preferred_domains &
>>> @@ -1640,10 +1638,9 @@ static void amdgpu_vm_bo_insert_map(struct 
>>> amdgpu_device *adev,
>>>       if (mapping->flags & AMDGPU_PTE_PRT)
>>>           amdgpu_vm_prt_get(adev);
>>> -    if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
>>> -        !bo_va->base.moved) {
>>> +    if (amdgpu_bo_is_vm_bo(bo, vm) && !bo_va->base.moved)
>>>           amdgpu_vm_bo_moved(&bo_va->base);
>>> -    }
>>> +
>>>       trace_amdgpu_vm_bo_map(bo_va, mapping);
>>>   }
>>> @@ -1922,8 +1919,7 @@ int amdgpu_vm_bo_clear_mappings(struct 
>>> amdgpu_device *adev,
>>>           if (before->flags & AMDGPU_PTE_PRT)
>>>               amdgpu_vm_prt_get(adev);
>>> -        if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
>>> -            !before->bo_va->base.moved)
>>> +        if (amdgpu_bo_is_vm_bo(bo, vm) && !before->bo_va->base.moved)
>>>             amdgpu_vm_bo_moved(&before->bo_va->base);
>>>       } else {
>>>           kfree(before);
>>> @@ -1937,8 +1933,7 @@ int amdgpu_vm_bo_clear_mappings(struct 
>>> amdgpu_device *adev,
>>>           if (after->flags & AMDGPU_PTE_PRT)
>>>               amdgpu_vm_prt_get(adev);
>>> -        if (bo && bo->tbo.base.resv == vm->root.bo->tbo.base.resv &&
>>> -            !after->bo_va->base.moved)
>>> +        if (amdgpu_bo_is_vm_bo(bo, vm) && !after->bo_va->base.moved)
>>>             amdgpu_vm_bo_moved(&after->bo_va->base);
>>>       } else {
>>>           kfree(after);
>>> @@ -2017,7 +2012,7 @@ void amdgpu_vm_bo_del(struct amdgpu_device *adev,
>>>       if (bo) {
>>>           dma_resv_assert_held(bo->tbo.base.resv);
>>> -        if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv)
>>> +        if (amdgpu_bo_is_vm_bo(bo, vm))
>>>               ttm_bo_set_bulk_move(&bo->tbo, NULL);
>>>           for (base = &bo_va->base.bo->vm_bo; *base;
>>> @@ -2111,7 +2106,7 @@ void amdgpu_vm_bo_invalidate(struct 
>>> amdgpu_device *adev,
>>>       for (bo_base = bo->vm_bo; bo_base; bo_base = bo_base->next) {
>>>           struct amdgpu_vm *vm = bo_base->vm;
>>> -        if (evicted && bo->tbo.base.resv == 
>>> vm->root.bo->tbo.base.resv) {
>>> +        if (evicted && amdgpu_bo_is_vm_bo(bo, vm)) {
>>>               amdgpu_vm_bo_evicted(bo_base);
>>>               continue;
>>>           }
>>> @@ -2122,7 +2117,7 @@ void amdgpu_vm_bo_invalidate(struct 
>>> amdgpu_device *adev,
>>>           if (bo->tbo.type == ttm_bo_type_kernel)
>>>               amdgpu_vm_bo_relocated(bo_base);
>>> -        else if (bo->tbo.base.resv == vm->root.bo->tbo.base.resv)
>>> +        else if (amdgpu_bo_is_vm_bo(bo, vm))
>>>               amdgpu_vm_bo_moved(bo_base);
>>>           else
>>>               amdgpu_vm_bo_invalidated(bo_base);
>>



* Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting
  2024-04-29 13:45         ` Tvrtko Ursulin
@ 2024-04-29 14:34           ` Felix Kuehling
  0 siblings, 0 replies; 22+ messages in thread
From: Felix Kuehling @ 2024-04-29 14:34 UTC (permalink / raw)
  To: Tvrtko Ursulin, Christian König, Tvrtko Ursulin, amd-gfx; +Cc: kernel-dev



On 2024-04-29 9:45, Tvrtko Ursulin wrote:
> 
> On 29/04/2024 12:11, Christian König wrote:
>> Am 29.04.24 um 11:43 schrieb Tvrtko Ursulin:
>>>
>>> On 26/04/2024 23:24, Felix Kuehling wrote:
>>>>
>>>> On 2024-04-26 12:43, Tvrtko Ursulin wrote:
>>>>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>>>
>>>>> When commit b453e42a6e8b ("drm/amdgpu: Add new placement for 
>>>>> preemptible
>>>>> SG BOs") added a new TTM region, it failed to notice the conceptual
>>>>> imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.
>>>>>
>>>>> That imbalance leads to such objects getting accounted against the
>>>>> resource but not un-accounted when unpinned.
>>>>
>>>> AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
>>>> pinned. In any case you should make sure that the accounting is 
>>>> consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. 
>>>> This patch breaks that consistency.
>>>
>>> You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run 
>>> for such objects, or something else?
>>>
>>> If they run, then at the end of pin there is:
>>>
>>>     domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
>>> ...
>>>     } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
>>>         atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
>>>
>>> And unpin has no handling for AMDGPU_PL_PREEMPT.
>>>
>>> Ah I see.. does it rely on amdgpu_mem_type_to_domain returning 0 for 
>>> AMDGPU_PL_PREEMPT? My confusion was I misread the pinning check as 
>>> checking the domain as stored in the bo at creation time.
>>>
>>> Although I am still confused by the statement userptr BOs are not 
>>> pinned. It is not needed to map them via GART on AMD hardware for GPU 
>>> to be able to access them?
>>
>> No, a GART mapping is only needed if you want to scanout from them or 
>> otherwise use them from the kernel on the GPU.
>>
>> Background is that the kernel doesn't have a VM with page tables.
> 
> Got it, thanks!
> 
> Presumably calling pin/unpin on those is then prevented somewhere else 
> in the code?

I was referring to this condition in amdgpu_bo_pin_restricted:

         if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm))
                 return -EPERM;

However, when I look into it more, I see that AMDGPU_PL_PREEMPT is used 
for other SG BOs that actually are pinned, specifically BOs created by 
KFD with KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL or 
KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP. These are very small BOs (one or two 
pages), and only one per process, per GPU, so I'm not sure it's worth 
adding special handling for them in the BO pin accounting.

Regards,
   Felix


> 
> What to do, if anything, with the attempt to address the asymmetry in 
> the accounting criteria between the pin and unpin?
> 
> I mean domain based on pin:
> 
>      domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
>      if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
>          atomic64_add(amdgpu_bo_size(bo), &adev->vram_pin_size);
>          atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
>                   &adev->visible_pin_size);
>      } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
>          atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
>      }
> 
> Versus placement based on unpin:
> 
>      if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
>          atomic64_sub(amdgpu_bo_size(bo), &adev->vram_pin_size);
>          atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
>                   &adev->visible_pin_size);
>      } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
>          atomic64_sub(amdgpu_bo_size(bo), &adev->gart_pin_size);
>      }
> 
> The fact amdgpu_mem_type_to_domain never translates back to 
> AMDGPU_PL_PREEMPT means there is indeed currently no bug.
> 
> Is 2/3 still desirable, to convert the check in pin to be mem_type based?
> 
>>>>> Fix by extending the accounting criteria in amdgpu_bo_unpin.
>>>>>
>>>>> What also appears to need fixing is not reporting their size from the
>>>>> amdgpu_bo_get_memory, which is used to implement fdinfo stats, so 
>>>>> they are
>>>>> not mixed with the regular userspace created and driver owned objects.
>>>>
>>>> I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
>>>> does use system memory and it is GPU accessible, just like GTT. The 
>>> only difference is that it's not subject to the GTT limits because 
>>>> their eviction is handled by callbacks other than TTM evictions and 
>>>> doesn't need to wait for fences.
>>>
>>> As in you think those two hunks of the patch are correct?
>>
>> I think so as well, yes. But we still need a name for preemptible BOs 
>> while printing them in debugfs.
> 
> Currently it looks like the name is 'CPU':
> 
> amdgpu_bo_print_info()
> ...
>          case AMDGPU_GEM_DOMAIN_CPU:
>          default:
>              placement = "CPU";
>              break;
> 
> 
> Also, where to account them in struct amdgpu_mem_stats?
> 
> Regards,
> 
> Tvrtko
> 
>>
>> Regards,
>> Christian.
>>
>>>
>>> Regards,
>>>
>>> Tvrtko
>>>
>>>
>>>> Regards,
>>>>    Felix
>>>>
>>>>
>>>>>
>>>>> And also amdgpu_bo_print_info for debugfs reporting.
>>>>>
>>>>> Note that the patch depends on the previous one which broke down the
>>>>> relevant checks from the domain based to placement based.
>>>>>
>>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>>> Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible 
>>>>> SG BOs")
>>>>> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
>>>>> Cc: Christian König <christian.koenig@amd.com>
>>>>> ---
>>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
>>>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>> index fb984669fc3a..5a2bbc793953 100644
>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>>> @@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
>>>>>           atomic64_sub(amdgpu_bo_size(bo), &adev->vram_pin_size);
>>>>>           atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
>>>>>                    &adev->visible_pin_size);
>>>>> -    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
>>>>> +    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
>>>>> +           bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
>>>>>           atomic64_sub(amdgpu_bo_size(bo), &adev->gart_pin_size);
>>>>>       }
>>>>> @@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>>>>               stats->vram_shared += size;
>>>>>           break;
>>>>>       case TTM_PL_TT:
>>>>> -    case AMDGPU_PL_PREEMPT:
>>>>>           stats->gtt += size;
>>>>>           if (shared)
>>>>>               stats->gtt_shared += size;
>>>>> @@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct 
>>>>> amdgpu_bo *bo, struct seq_file *m)
>>>>>                   placement = "VRAM";
>>>>>               break;
>>>>>           case TTM_PL_TT:
>>>>> -        case AMDGPU_PL_PREEMPT:
>>>>>               placement = "GTT";
>>>>>               break;
>>>>>           case TTM_PL_SYSTEM:
>>


* Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting
  2024-04-29  9:43     ` Tvrtko Ursulin
  2024-04-29 11:11       ` Christian König
@ 2024-04-29 14:43       ` Felix Kuehling
  2024-04-29 16:51         ` Tvrtko Ursulin
  1 sibling, 1 reply; 22+ messages in thread
From: Felix Kuehling @ 2024-04-29 14:43 UTC (permalink / raw)
  To: Tvrtko Ursulin, Tvrtko Ursulin, amd-gfx; +Cc: Christian König, kernel-dev

On 2024-04-29 5:43, Tvrtko Ursulin wrote:
> 
> On 26/04/2024 23:24, Felix Kuehling wrote:
>>
>> On 2024-04-26 12:43, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>
>>> When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible
>>> SG BOs") added a new TTM region, it failed to notice the conceptual
>>> imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.
>>>
>>> That imbalance leads to such objects getting accounted against the
>>> resource but not un-accounted when unpinned.
>>
>> AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
>> pinned. In any case you should make sure that the accounting is 
>> consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
>> patch breaks that consistency.
> 
> You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run for 
> such objects, or something else?

Right. amdgpu_bo_pin_restricted will return an error for userptr BOs:

         if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm))
                 return -EPERM;


> 
> If they run, then at the end of pin there is:
> 
>      domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> ...
>      } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
>          atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);

You changed that in your patch 2:

-	} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
+	} else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
+		   bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
  		atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
  	}

I was suggesting you just change this in patch 2 like this, so it 
matches what's done on unpin:

-	} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
+	} else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
  		atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
  	}
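The effect of keying both sides off the same placement check, as suggested here, can be modelled standalone (constants and the stats struct below are illustrative stand-ins, not the driver's types): whichever set of placements the shared predicate covers, pin and unpin stay balanced.

```c
#include <assert.h>

/* Illustrative stand-ins for the kernel's placement constants. */
enum { TTM_PL_SYSTEM, TTM_PL_TT, TTM_PL_VRAM, AMDGPU_PL_PREEMPT };

struct pin_stats { long gart_pin_size; };

/* Model of mem_type based GART pin accounting: using one predicate on
 * both pin and unpin keeps the counter balanced no matter which
 * placements it covers. */
static int counts_as_gart(unsigned int mem_type)
{
	return mem_type == TTM_PL_TT;
}

static void model_pin(struct pin_stats *s, unsigned int mem_type, long size)
{
	if (counts_as_gart(mem_type))
		s->gart_pin_size += size;
}

static void model_unpin(struct pin_stats *s, unsigned int mem_type, long size)
{
	if (counts_as_gart(mem_type))
		s->gart_pin_size -= size;
}
```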


> 
> And unpin has no handling for AMDGPU_PL_PREEMPT.
> 
> Ah I see.. does it rely on amdgpu_mem_type_to_domain returning 0 for 
> AMDGPU_PL_PREEMPT? My confusion was I misread the pinning check as 
> checking the domain as stored in the bo at creation time.
> 
> Although I am still confused by the statement userptr BOs are not 
> pinned. It is not needed to map them via GART on AMD hardware for GPU to 
> be able to access them?
>>> Fix by extending the accounting criteria in amdgpu_bo_unpin.
>>>
>>> What also appears to need fixing is not reporting their size from the
>>> amdgpu_bo_get_memory, which is used to implement fdinfo stats, so 
>>> they are
>>> not mixed with the regular userspace created and driver owned objects.
>>
>> I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
>> does use system memory and it is GPU accessible, just like GTT. The 
>> only difference is that it's not subject to the GTT limits because 
>> their eviction is handled by callbacks other than TTM evictions and 
>> doesn't need to wait for fences.
> 
> As in you think those two hunks of the patch are correct?

Yes. It seems Christian agrees but wants to show preemptible memory 
separately in debugfs instead of not showing it at all.

Regards,
   Felix


> 
> Regards,
> 
> Tvrtko
> 
> 
>> Regards,
>>    Felix
>>
>>
>>>
>>> And also amdgpu_bo_print_info for debugfs reporting.
>>>
>>> Note that the patch depends on the previous one which broke down the
>>> relevant checks from the domain based to placement based.
>>>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>> Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible 
>>> SG BOs")
>>> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
>>> Cc: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index fb984669fc3a..5a2bbc793953 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
>>>           atomic64_sub(amdgpu_bo_size(bo), &adev->vram_pin_size);
>>>           atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
>>>                    &adev->visible_pin_size);
>>> -    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
>>> +    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
>>> +           bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
>>>           atomic64_sub(amdgpu_bo_size(bo), &adev->gart_pin_size);
>>>       }
>>> @@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>>               stats->vram_shared += size;
>>>           break;
>>>       case TTM_PL_TT:
>>> -    case AMDGPU_PL_PREEMPT:
>>>           stats->gtt += size;
>>>           if (shared)
>>>               stats->gtt_shared += size;
>>> @@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct 
>>> amdgpu_bo *bo, struct seq_file *m)
>>>                   placement = "VRAM";
>>>               break;
>>>           case TTM_PL_TT:
>>> -        case AMDGPU_PL_PREEMPT:
>>>               placement = "GTT";
>>>               break;
>>>           case TTM_PL_SYSTEM:


* Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting
  2024-04-29 14:43       ` Felix Kuehling
@ 2024-04-29 16:51         ` Tvrtko Ursulin
  0 siblings, 0 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-04-29 16:51 UTC (permalink / raw)
  To: Felix Kuehling, Tvrtko Ursulin, amd-gfx; +Cc: Christian König, kernel-dev


On 29/04/2024 15:43, Felix Kuehling wrote:
> On 2024-04-29 5:43, Tvrtko Ursulin wrote:
>>
>> On 26/04/2024 23:24, Felix Kuehling wrote:
>>>
>>> On 2024-04-26 12:43, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>>
>>>> When commit b453e42a6e8b ("drm/amdgpu: Add new placement for 
>>>> preemptible
>>>> SG BOs") added a new TTM region, it failed to notice the conceptual
>>>> imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin.
>>>>
>>>> That imbalance leads to such objects getting accounted against the
>>>> resource but not un-accounted when unpinned.
>>>
>>> AMDGPU_PL_PREEMPT is mostly used for userptr BOs, which cannot be 
>>> pinned. In any case you should make sure that the accounting is 
>>> consistent between amdgpu_bo_pin_restricted and amdgpu_bo_unpin. This 
>>> patch breaks that consistency.
>>
>> You mean amdgpu_bo_pin(_restricted) and amdgpu_bo_unpin do not run for 
>> such objects, or something else?
> 
> Right. amdgpu_bo_pin_restricted will return an error for userptr BOs:
> 
>          if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm))
>                  return -EPERM;

Ah I missed that, thank you!

>>
>> If they run, then at the end of pin there is:
>>
>>      domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
>> ...
>>      } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
>>          atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
> 
> You changed that in your patch 2:
> 
> -    } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
> +    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
> +           bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
>           atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
>       }
> 
> I was suggesting you just change this in patch 2 like this, so it 
> matches what's done on unpin:
> 
> -    } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
> +    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
>           atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
>       }
> 
> 
>>
>> And unpin has no handling for AMDGPU_PL_PREEMPT.
>>
>> Ah I see.. does it rely on amdgpu_mem_type_to_domain returning 0 for 
>> AMDGPU_PL_PREEMPT? My confusion was I misread the pinning check as 
>> checking the domain as stored in the bo at creation time.
>>
>> Although I am still confused by the statement userptr BOs are not 
>> pinned. It is not needed to map them via GART on AMD hardware for GPU 
>> to be able to access them?
>>>> Fix by extending the accounting criteria in amdgpu_bo_unpin.
>>>>
>>>> What also appears to need fixing is not reporting their size from the
>>>> amdgpu_bo_get_memory, which is used to implement fdinfo stats, so 
>>>> they are
>>>> not mixed with the regular userspace created and driver owned objects.
>>>
>>> I think that's true. It's a very fine distinction. AMDGPU_PL_PREEMPT 
>>> does use system memory and it is GPU accessible, just like GTT. The 
>>> only difference is that it's not subject to the GTT limits because 
>>> their eviction is handled by callbacks other than TTM evictions and 
>>> doesn't need to wait for fences.
>>
>> As in you think those two hunks of the patch are correct?
> 
> Yes. It seems, Christian agrees but wants to show preemptible memory 
> separately in debugfs instead of not showing it at all.

Okay, I've posted a respin with a name 'PREEMPTIBLE' to start the naming 
discussion. :)

If I did not get confused again, it is only the last patch in the series 
that moves the reporting for those objects from 'CPU' to this new label.

This is because both the current code:

                domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
                switch (domain) {
...
                default:
                         placement = "CPU";
                         break;

And with my 2/3:

                switch (bo->tbo.resource->mem_type) {
...
                case TTM_PL_SYSTEM:
                default:
                         placement = "CPU";
                         break;

They end up in the CPU bucket. Things only change with 3/3.
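A sketch of the kind of labelling 3/3 would end up with, modelled standalone (the constant values are stand-ins and 'PREEMPTIBLE' is just the proposed starting point for the naming discussion):

```c
#include <assert.h>
#include <string.h>

/* Illustrative stand-ins for the kernel's placement constants. */
enum { TTM_PL_SYSTEM, TTM_PL_TT, TTM_PL_VRAM, AMDGPU_PL_PREEMPT };

/* Model of the debugfs placement naming with a dedicated label for
 * preemptible BOs instead of letting them fall through to "CPU". */
static const char *placement_name(unsigned int mem_type)
{
	switch (mem_type) {
	case TTM_PL_VRAM:
		return "VRAM";
	case TTM_PL_TT:
		return "GTT";
	case AMDGPU_PL_PREEMPT:
		return "PREEMPTIBLE";
	case TTM_PL_SYSTEM:
	default:
		return "CPU";
	}
}
```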

Regards,

Tvrtko

> 
> Regards,
>    Felix
> 
> 
>>
>> Regards,
>>
>> Tvrtko
>>
>>
>>> Regards,
>>>    Felix
>>>
>>>
>>>>
>>>> And also amdgpu_bo_print_info for debugfs reporting.
>>>>
>>>> Note that the patch depends on the previous one which broke down the
>>>> relevant checks from the domain based to placement based.
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>>> Fixes: b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible 
>>>> SG BOs")
>>>> Cc: Felix Kuehling <Felix.Kuehling@amd.com>
>>>> Cc: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 5 ++---
>>>>   1 file changed, 2 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> index fb984669fc3a..5a2bbc793953 100644
>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>>> @@ -1032,7 +1032,8 @@ void amdgpu_bo_unpin(struct amdgpu_bo *bo)
>>>>           atomic64_sub(amdgpu_bo_size(bo), &adev->vram_pin_size);
>>>>           atomic64_sub(amdgpu_vram_mgr_bo_visible_size(bo),
>>>>                    &adev->visible_pin_size);
>>>> -    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
>>>> +    } else if (bo->tbo.resource->mem_type == TTM_PL_TT ||
>>>> +           bo->tbo.resource->mem_type == AMDGPU_PL_PREEMPT) {
>>>>           atomic64_sub(amdgpu_bo_size(bo), &adev->gart_pin_size);
>>>>       }
>>>> @@ -1298,7 +1299,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>>>               stats->vram_shared += size;
>>>>           break;
>>>>       case TTM_PL_TT:
>>>> -    case AMDGPU_PL_PREEMPT:
>>>>           stats->gtt += size;
>>>>           if (shared)
>>>>               stats->gtt_shared += size;
>>>> @@ -1599,7 +1599,6 @@ u64 amdgpu_bo_print_info(int id, struct 
>>>> amdgpu_bo *bo, struct seq_file *m)
>>>>                   placement = "VRAM";
>>>>               break;
>>>>           case TTM_PL_TT:
>>>> -        case AMDGPU_PL_PREEMPT:
>>>>               placement = "GTT";
>>>>               break;
>>>>           case TTM_PL_SYSTEM:


* [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection
  2024-05-03  9:14 [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper Tvrtko Ursulin
@ 2024-05-03  9:14 ` Tvrtko Ursulin
  0 siblings, 0 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-05-03  9:14 UTC (permalink / raw)
  To: amd-gfx; +Cc: kernel-dev, Tvrtko Ursulin, Christian König, Felix Kuehling

From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

All memory domains apart from AMDGPU_GEM_DOMAIN_GTT map 1:1 to TTM
placements, while AMDGPU_GEM_DOMAIN_GTT itself can be either
AMDGPU_PL_PREEMPT or TTM_PL_TT, depending on
AMDGPU_GEM_CREATE_PREEMPTIBLE.

Simplify a few places in the code which convert the TTM placement into
a domain by checking against the current placement directly.

In the conversion, AMDGPU_PL_PREEMPT does not have to be handled because
amdgpu_mem_type_to_domain() cannot return that value anyway.

v2:
 * Remove AMDGPU_PL_PREEMPT handling.

v3:
 * Rebase.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com> # v1
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> # v2
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  3 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 29 +++++++++------------
 2 files changed, 13 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 055ba2ea4c12..0b3b10d21952 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -165,8 +165,7 @@ static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach,
 		if (r)
 			return ERR_PTR(r);
 
-	} else if (!(amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type) &
-		     AMDGPU_GEM_DOMAIN_GTT)) {
+	} else if (bo->tbo.resource->mem_type != TTM_PL_TT) {
 		return ERR_PTR(-EBUSY);
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 8d8c39be6129..4f9073dd19eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -983,12 +983,11 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
 
 	ttm_bo_pin(&bo->tbo);
 
-	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
-	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
+	if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
 		atomic64_add(amdgpu_bo_size(bo), &adev->vram_pin_size);
 		atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
 			     &adev->visible_pin_size);
-	} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
+	} else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
 		atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
 	}
 
@@ -1293,7 +1292,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
 	struct ttm_resource *res = bo->tbo.resource;
 	uint64_t size = amdgpu_bo_size(bo);
 	struct drm_gem_object *obj;
-	unsigned int domain;
 	bool shared;
 
 	/* Abort if the BO doesn't currently have a backing store */
@@ -1303,21 +1301,20 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
 	obj = &bo->tbo.base;
 	shared = drm_gem_object_is_shared_for_memory_stats(obj);
 
-	domain = amdgpu_mem_type_to_domain(res->mem_type);
-	switch (domain) {
-	case AMDGPU_GEM_DOMAIN_VRAM:
+	switch (res->mem_type) {
+	case TTM_PL_VRAM:
 		stats->vram += size;
-		if (amdgpu_res_cpu_visible(adev, bo->tbo.resource))
+		if (amdgpu_res_cpu_visible(adev, res))
 			stats->visible_vram += size;
 		if (shared)
 			stats->vram_shared += size;
 		break;
-	case AMDGPU_GEM_DOMAIN_GTT:
+	case TTM_PL_TT:
 		stats->gtt += size;
 		if (shared)
 			stats->gtt_shared += size;
 		break;
-	case AMDGPU_GEM_DOMAIN_CPU:
+	case TTM_PL_SYSTEM:
 	default:
 		stats->cpu += size;
 		if (shared)
@@ -1330,7 +1327,7 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
 		if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
 			stats->requested_visible_vram += size;
 
-		if (domain != AMDGPU_GEM_DOMAIN_VRAM) {
+		if (res->mem_type != TTM_PL_VRAM) {
 			stats->evicted_vram += size;
 			if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
 				stats->evicted_visible_vram += size;
@@ -1604,20 +1601,18 @@ u64 amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, struct seq_file *m)
 	u64 size;
 
 	if (dma_resv_trylock(bo->tbo.base.resv)) {
-		unsigned int domain;
 
-		domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
-		switch (domain) {
-		case AMDGPU_GEM_DOMAIN_VRAM:
+		switch (bo->tbo.resource->mem_type) {
+		case TTM_PL_VRAM:
 			if (amdgpu_res_cpu_visible(adev, bo->tbo.resource))
 				placement = "VRAM VISIBLE";
 			else
 				placement = "VRAM";
 			break;
-		case AMDGPU_GEM_DOMAIN_GTT:
+		case TTM_PL_TT:
 			placement = "GTT";
 			break;
-		case AMDGPU_GEM_DOMAIN_CPU:
+		case TTM_PL_SYSTEM:
 		default:
 			placement = "CPU";
 			break;
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection
  2024-04-29 16:47 ` [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection Tvrtko Ursulin
  2024-04-30  0:31   ` Felix Kuehling
@ 2024-05-03  6:31   ` Christian König
  1 sibling, 0 replies; 22+ messages in thread
From: Christian König @ 2024-05-03  6:31 UTC (permalink / raw)
  To: Tvrtko Ursulin, amd-gfx; +Cc: kernel-dev, Tvrtko Ursulin

Am 29.04.24 um 18:47 schrieb Tvrtko Ursulin:
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> All apart from AMDGPU_GEM_DOMAIN_GTT memory domains map 1:1 to TTM
> placements. And the former can be either AMDGPU_PL_PREEMPT or TTM_PL_TT,
> depending on AMDGPU_GEM_CREATE_PREEMPTIBLE.
>
> Simplify a few places in the code which convert the TTM placement into
> a domain by checking against the current placement directly.
>
> AMDGPU_PL_PREEMPT does not have to be handled in the conversion because
> amdgpu_mem_type_to_domain() cannot return that value anyway.
>
> v2:
>   * Remove AMDGPU_PL_PREEMPT handling.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Reviewed-by: Christian König <christian.koenig@amd.com> # v1
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  3 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 27 +++++++++------------
>   2 files changed, 12 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index 055ba2ea4c12..0b3b10d21952 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -165,8 +165,7 @@ static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach,
>   		if (r)
>   			return ERR_PTR(r);
>   
> -	} else if (!(amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type) &
> -		     AMDGPU_GEM_DOMAIN_GTT)) {
> +	} else if (bo->tbo.resource->mem_type != TTM_PL_TT) {
>   		return ERR_PTR(-EBUSY);
>   	}
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 8bc79924d171..eb5bd6962560 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -976,12 +976,11 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
>   
>   	ttm_bo_pin(&bo->tbo);
>   
> -	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> -	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
> +	if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
>   		atomic64_add(amdgpu_bo_size(bo), &adev->vram_pin_size);
>   		atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
>   			     &adev->visible_pin_size);
> -	} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
> +	} else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
>   		atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
>   	}
>   
> @@ -1280,7 +1279,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   {
>   	uint64_t size = amdgpu_bo_size(bo);
>   	struct drm_gem_object *obj;
> -	unsigned int domain;
>   	bool shared;
>   
>   	/* Abort if the BO doesn't currently have a backing store */
> @@ -1290,21 +1288,20 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   	obj = &bo->tbo.base;
>   	shared = drm_gem_object_is_shared_for_memory_stats(obj);
>   
> -	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> -	switch (domain) {
> -	case AMDGPU_GEM_DOMAIN_VRAM:
> +	switch (bo->tbo.resource->mem_type) {
> +	case TTM_PL_VRAM:
>   		stats->vram += size;
>   		if (amdgpu_bo_in_cpu_visible_vram(bo))
>   			stats->visible_vram += size;
>   		if (shared)
>   			stats->vram_shared += size;
>   		break;
> -	case AMDGPU_GEM_DOMAIN_GTT:
> +	case TTM_PL_TT:
>   		stats->gtt += size;
>   		if (shared)
>   			stats->gtt_shared += size;
>   		break;
> -	case AMDGPU_GEM_DOMAIN_CPU:
> +	case TTM_PL_SYSTEM:
>   	default:
>   		stats->cpu += size;
>   		if (shared)
> @@ -1317,7 +1314,7 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   		if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
>   			stats->requested_visible_vram += size;
>   
> -		if (domain != AMDGPU_GEM_DOMAIN_VRAM) {
> +		if (bo->tbo.resource->mem_type != TTM_PL_VRAM) {
>   			stats->evicted_vram += size;
>   			if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
>   				stats->evicted_visible_vram += size;
> @@ -1592,19 +1589,17 @@ u64 amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, struct seq_file *m)
>   	u64 size;
>   
>   	if (dma_resv_trylock(bo->tbo.base.resv)) {
> -		unsigned int domain;
> -		domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> -		switch (domain) {
> -		case AMDGPU_GEM_DOMAIN_VRAM:
> +		switch (bo->tbo.resource->mem_type) {
> +		case TTM_PL_VRAM:
>   			if (amdgpu_bo_in_cpu_visible_vram(bo))
>   				placement = "VRAM VISIBLE";
>   			else
>   				placement = "VRAM";
>   			break;
> -		case AMDGPU_GEM_DOMAIN_GTT:
> +		case TTM_PL_TT:
>   			placement = "GTT";
>   			break;
> -		case AMDGPU_GEM_DOMAIN_CPU:
> +		case TTM_PL_SYSTEM:

I would still prefer an AMDGPU_PL_PREEMPT case here to be able to distinguish those.

On the other hand, the OA, GWS and GDS placements are missing as well, so 
that switch should probably be fixed.

Either way the patch is Reviewed-by: Christian König 
<christian.koenig@amd.com> for now since it doesn't change the handling 
at all.

Regards,
Christian.

>   		default:
>   			placement = "CPU";
>   			break;


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection
  2024-04-30  0:31   ` Felix Kuehling
@ 2024-04-30  8:27     ` Tvrtko Ursulin
  0 siblings, 0 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-04-30  8:27 UTC (permalink / raw)
  To: Felix Kuehling, Tvrtko Ursulin, amd-gfx; +Cc: Christian König, kernel-dev


On 30/04/2024 01:31, Felix Kuehling wrote:
> 
> On 2024-04-29 12:47, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>>
>> All apart from AMDGPU_GEM_DOMAIN_GTT memory domains map 1:1 to TTM
>> placements. And the former can be either AMDGPU_PL_PREEMPT or TTM_PL_TT,
>> depending on AMDGPU_GEM_CREATE_PREEMPTIBLE.
>>
>> Simplify a few places in the code which convert the TTM placement into
>> a domain by checking against the current placement directly.
>>
>> AMDGPU_PL_PREEMPT does not have to be handled in the conversion because
>> amdgpu_mem_type_to_domain() cannot return that value anyway.
>>
>> v2:
>>   * Remove AMDGPU_PL_PREEMPT handling.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>> Reviewed-by: Christian König <christian.koenig@amd.com> # v1
> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
> 
> I also ran kfdtest on a multi-GPU system just to make sure this didn't 
> break our multi-GPU support. BTW, I had to fix up some things when I 

Excellent thank you!

Btw, an important thing to stress is that I hope the r-b means not only 
that the patch is functionally correct but that you guys actually agree 
it is an improvement. Because I am quite new to your code base, please 
apply strict criteria to my proposals.

> tried to apply your patch to the current amd-staging-drm-next branch. 
> That branch was just rebased on Linux 6.8, so maybe that's part of the 
> reason.

I am conditioned to work against drm-tip, so maybe that is one reason, or 
possibly because I now see I used drm-tip from more than a week ago as a 
base. :( I can rebase and re-send. So amd-staging-drm-next is the correct 
branch?

Regards,

Tvrtko

> 
> 
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  3 +--
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 27 +++++++++------------
>>   2 files changed, 12 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> index 055ba2ea4c12..0b3b10d21952 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
>> @@ -165,8 +165,7 @@ static struct sg_table *amdgpu_dma_buf_map(struct 
>> dma_buf_attachment *attach,
>>           if (r)
>>               return ERR_PTR(r);
>> -    } else if (!(amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type) &
>> -             AMDGPU_GEM_DOMAIN_GTT)) {
>> +    } else if (bo->tbo.resource->mem_type != TTM_PL_TT) {
>>           return ERR_PTR(-EBUSY);
>>       }
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c 
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> index 8bc79924d171..eb5bd6962560 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>> @@ -976,12 +976,11 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo 
>> *bo, u32 domain,
>>       ttm_bo_pin(&bo->tbo);
>> -    domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
>> -    if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
>> +    if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
>>           atomic64_add(amdgpu_bo_size(bo), &adev->vram_pin_size);
>>           atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
>>                    &adev->visible_pin_size);
>> -    } else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
>> +    } else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
>>           atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
>>       }
>> @@ -1280,7 +1279,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>   {
>>       uint64_t size = amdgpu_bo_size(bo);
>>       struct drm_gem_object *obj;
>> -    unsigned int domain;
>>       bool shared;
>>       /* Abort if the BO doesn't currently have a backing store */
>> @@ -1290,21 +1288,20 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>       obj = &bo->tbo.base;
>>       shared = drm_gem_object_is_shared_for_memory_stats(obj);
>> -    domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
>> -    switch (domain) {
>> -    case AMDGPU_GEM_DOMAIN_VRAM:
>> +    switch (bo->tbo.resource->mem_type) {
>> +    case TTM_PL_VRAM:
>>           stats->vram += size;
>>           if (amdgpu_bo_in_cpu_visible_vram(bo))
>>               stats->visible_vram += size;
>>           if (shared)
>>               stats->vram_shared += size;
>>           break;
>> -    case AMDGPU_GEM_DOMAIN_GTT:
>> +    case TTM_PL_TT:
>>           stats->gtt += size;
>>           if (shared)
>>               stats->gtt_shared += size;
>>           break;
>> -    case AMDGPU_GEM_DOMAIN_CPU:
>> +    case TTM_PL_SYSTEM:
>>       default:
>>           stats->cpu += size;
>>           if (shared)
>> @@ -1317,7 +1314,7 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>>           if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
>>               stats->requested_visible_vram += size;
>> -        if (domain != AMDGPU_GEM_DOMAIN_VRAM) {
>> +        if (bo->tbo.resource->mem_type != TTM_PL_VRAM) {
>>               stats->evicted_vram += size;
>>               if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
>>                   stats->evicted_visible_vram += size;
>> @@ -1592,19 +1589,17 @@ u64 amdgpu_bo_print_info(int id, struct 
>> amdgpu_bo *bo, struct seq_file *m)
>>       u64 size;
>>       if (dma_resv_trylock(bo->tbo.base.resv)) {
>> -        unsigned int domain;
>> -        domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
>> -        switch (domain) {
>> -        case AMDGPU_GEM_DOMAIN_VRAM:
>> +        switch (bo->tbo.resource->mem_type) {
>> +        case TTM_PL_VRAM:
>>               if (amdgpu_bo_in_cpu_visible_vram(bo))
>>                   placement = "VRAM VISIBLE";
>>               else
>>                   placement = "VRAM";
>>               break;
>> -        case AMDGPU_GEM_DOMAIN_GTT:
>> +        case TTM_PL_TT:
>>               placement = "GTT";
>>               break;
>> -        case AMDGPU_GEM_DOMAIN_CPU:
>> +        case TTM_PL_SYSTEM:
>>           default:
>>               placement = "CPU";
>>               break;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection
  2024-04-29 16:47 ` [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection Tvrtko Ursulin
@ 2024-04-30  0:31   ` Felix Kuehling
  2024-04-30  8:27     ` Tvrtko Ursulin
  2024-05-03  6:31   ` Christian König
  1 sibling, 1 reply; 22+ messages in thread
From: Felix Kuehling @ 2024-04-30  0:31 UTC (permalink / raw)
  To: Tvrtko Ursulin, amd-gfx; +Cc: Christian König, kernel-dev, Tvrtko Ursulin


On 2024-04-29 12:47, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
>
> All apart from AMDGPU_GEM_DOMAIN_GTT memory domains map 1:1 to TTM
> placements. And the former can be either AMDGPU_PL_PREEMPT or TTM_PL_TT,
> depending on AMDGPU_GEM_CREATE_PREEMPTIBLE.
>
> Simplify a few places in the code which convert the TTM placement into
> a domain by checking against the current placement directly.
>
> AMDGPU_PL_PREEMPT does not have to be handled in the conversion because
> amdgpu_mem_type_to_domain() cannot return that value anyway.
>
> v2:
>   * Remove AMDGPU_PL_PREEMPT handling.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Reviewed-by: Christian König <christian.koenig@amd.com> # v1
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>

I also ran kfdtest on a multi-GPU system just to make sure this didn't 
break our multi-GPU support. BTW, I had to fix up some things when I 
tried to apply your patch to the current amd-staging-drm-next branch. 
That branch was just rebased on Linux 6.8, so maybe that's part of the 
reason.


> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  3 +--
>   drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 27 +++++++++------------
>   2 files changed, 12 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> index 055ba2ea4c12..0b3b10d21952 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
> @@ -165,8 +165,7 @@ static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach,
>   		if (r)
>   			return ERR_PTR(r);
>   
> -	} else if (!(amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type) &
> -		     AMDGPU_GEM_DOMAIN_GTT)) {
> +	} else if (bo->tbo.resource->mem_type != TTM_PL_TT) {
>   		return ERR_PTR(-EBUSY);
>   	}
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> index 8bc79924d171..eb5bd6962560 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
> @@ -976,12 +976,11 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
>   
>   	ttm_bo_pin(&bo->tbo);
>   
> -	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> -	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
> +	if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
>   		atomic64_add(amdgpu_bo_size(bo), &adev->vram_pin_size);
>   		atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
>   			     &adev->visible_pin_size);
> -	} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
> +	} else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
>   		atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
>   	}
>   
> @@ -1280,7 +1279,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   {
>   	uint64_t size = amdgpu_bo_size(bo);
>   	struct drm_gem_object *obj;
> -	unsigned int domain;
>   	bool shared;
>   
>   	/* Abort if the BO doesn't currently have a backing store */
> @@ -1290,21 +1288,20 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   	obj = &bo->tbo.base;
>   	shared = drm_gem_object_is_shared_for_memory_stats(obj);
>   
> -	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> -	switch (domain) {
> -	case AMDGPU_GEM_DOMAIN_VRAM:
> +	switch (bo->tbo.resource->mem_type) {
> +	case TTM_PL_VRAM:
>   		stats->vram += size;
>   		if (amdgpu_bo_in_cpu_visible_vram(bo))
>   			stats->visible_vram += size;
>   		if (shared)
>   			stats->vram_shared += size;
>   		break;
> -	case AMDGPU_GEM_DOMAIN_GTT:
> +	case TTM_PL_TT:
>   		stats->gtt += size;
>   		if (shared)
>   			stats->gtt_shared += size;
>   		break;
> -	case AMDGPU_GEM_DOMAIN_CPU:
> +	case TTM_PL_SYSTEM:
>   	default:
>   		stats->cpu += size;
>   		if (shared)
> @@ -1317,7 +1314,7 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
>   		if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
>   			stats->requested_visible_vram += size;
>   
> -		if (domain != AMDGPU_GEM_DOMAIN_VRAM) {
> +		if (bo->tbo.resource->mem_type != TTM_PL_VRAM) {
>   			stats->evicted_vram += size;
>   			if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
>   				stats->evicted_visible_vram += size;
> @@ -1592,19 +1589,17 @@ u64 amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, struct seq_file *m)
>   	u64 size;
>   
>   	if (dma_resv_trylock(bo->tbo.base.resv)) {
> -		unsigned int domain;
> -		domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
> -		switch (domain) {
> -		case AMDGPU_GEM_DOMAIN_VRAM:
> +		switch (bo->tbo.resource->mem_type) {
> +		case TTM_PL_VRAM:
>   			if (amdgpu_bo_in_cpu_visible_vram(bo))
>   				placement = "VRAM VISIBLE";
>   			else
>   				placement = "VRAM";
>   			break;
> -		case AMDGPU_GEM_DOMAIN_GTT:
> +		case TTM_PL_TT:
>   			placement = "GTT";
>   			break;
> -		case AMDGPU_GEM_DOMAIN_CPU:
> +		case TTM_PL_SYSTEM:
>   		default:
>   			placement = "CPU";
>   			break;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection
  2024-04-29 16:47 [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper Tvrtko Ursulin
@ 2024-04-29 16:47 ` Tvrtko Ursulin
  2024-04-30  0:31   ` Felix Kuehling
  2024-05-03  6:31   ` Christian König
  0 siblings, 2 replies; 22+ messages in thread
From: Tvrtko Ursulin @ 2024-04-29 16:47 UTC (permalink / raw)
  To: amd-gfx; +Cc: Christian König, kernel-dev, Tvrtko Ursulin

From: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>

All apart from AMDGPU_GEM_DOMAIN_GTT memory domains map 1:1 to TTM
placements. And the former can be either AMDGPU_PL_PREEMPT or TTM_PL_TT,
depending on AMDGPU_GEM_CREATE_PREEMPTIBLE.

Simplify a few places in the code which convert the TTM placement into
a domain by checking against the current placement directly.

AMDGPU_PL_PREEMPT does not have to be handled in the conversion because
amdgpu_mem_type_to_domain() cannot return that value anyway.

v2:
 * Remove AMDGPU_PL_PREEMPT handling.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com> # v1
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c |  3 +--
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c  | 27 +++++++++------------
 2 files changed, 12 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
index 055ba2ea4c12..0b3b10d21952 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c
@@ -165,8 +165,7 @@ static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach,
 		if (r)
 			return ERR_PTR(r);
 
-	} else if (!(amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type) &
-		     AMDGPU_GEM_DOMAIN_GTT)) {
+	} else if (bo->tbo.resource->mem_type != TTM_PL_TT) {
 		return ERR_PTR(-EBUSY);
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 8bc79924d171..eb5bd6962560 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -976,12 +976,11 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
 
 	ttm_bo_pin(&bo->tbo);
 
-	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
-	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
+	if (bo->tbo.resource->mem_type == TTM_PL_VRAM) {
 		atomic64_add(amdgpu_bo_size(bo), &adev->vram_pin_size);
 		atomic64_add(amdgpu_vram_mgr_bo_visible_size(bo),
 			     &adev->visible_pin_size);
-	} else if (domain == AMDGPU_GEM_DOMAIN_GTT) {
+	} else if (bo->tbo.resource->mem_type == TTM_PL_TT) {
 		atomic64_add(amdgpu_bo_size(bo), &adev->gart_pin_size);
 	}
 
@@ -1280,7 +1279,6 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
 {
 	uint64_t size = amdgpu_bo_size(bo);
 	struct drm_gem_object *obj;
-	unsigned int domain;
 	bool shared;
 
 	/* Abort if the BO doesn't currently have a backing store */
@@ -1290,21 +1288,20 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
 	obj = &bo->tbo.base;
 	shared = drm_gem_object_is_shared_for_memory_stats(obj);
 
-	domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
-	switch (domain) {
-	case AMDGPU_GEM_DOMAIN_VRAM:
+	switch (bo->tbo.resource->mem_type) {
+	case TTM_PL_VRAM:
 		stats->vram += size;
 		if (amdgpu_bo_in_cpu_visible_vram(bo))
 			stats->visible_vram += size;
 		if (shared)
 			stats->vram_shared += size;
 		break;
-	case AMDGPU_GEM_DOMAIN_GTT:
+	case TTM_PL_TT:
 		stats->gtt += size;
 		if (shared)
 			stats->gtt_shared += size;
 		break;
-	case AMDGPU_GEM_DOMAIN_CPU:
+	case TTM_PL_SYSTEM:
 	default:
 		stats->cpu += size;
 		if (shared)
@@ -1317,7 +1314,7 @@ void amdgpu_bo_get_memory(struct amdgpu_bo *bo,
 		if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
 			stats->requested_visible_vram += size;
 
-		if (domain != AMDGPU_GEM_DOMAIN_VRAM) {
+		if (bo->tbo.resource->mem_type != TTM_PL_VRAM) {
 			stats->evicted_vram += size;
 			if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
 				stats->evicted_visible_vram += size;
@@ -1592,19 +1589,17 @@ u64 amdgpu_bo_print_info(int id, struct amdgpu_bo *bo, struct seq_file *m)
 	u64 size;
 
 	if (dma_resv_trylock(bo->tbo.base.resv)) {
-		unsigned int domain;
-		domain = amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type);
-		switch (domain) {
-		case AMDGPU_GEM_DOMAIN_VRAM:
+		switch (bo->tbo.resource->mem_type) {
+		case TTM_PL_VRAM:
 			if (amdgpu_bo_in_cpu_visible_vram(bo))
 				placement = "VRAM VISIBLE";
 			else
 				placement = "VRAM";
 			break;
-		case AMDGPU_GEM_DOMAIN_GTT:
+		case TTM_PL_TT:
 			placement = "GTT";
 			break;
-		case AMDGPU_GEM_DOMAIN_CPU:
+		case TTM_PL_SYSTEM:
 		default:
 			placement = "CPU";
 			break;
-- 
2.44.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2024-05-03  9:15 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-26 16:43 [PATCH 0/3] Some refactoring and maybe a memory accounting fixlet Tvrtko Ursulin
2024-04-26 16:43 ` [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper Tvrtko Ursulin
2024-04-29 11:02   ` Christian König
2024-04-29 13:34     ` Tvrtko Ursulin
2024-04-29 13:51       ` Christian König
2024-04-26 16:43 ` [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection Tvrtko Ursulin
2024-04-29 11:03   ` Christian König
2024-04-29 13:38     ` Tvrtko Ursulin
2024-04-26 16:43 ` [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting Tvrtko Ursulin
2024-04-26 22:24   ` Felix Kuehling
2024-04-29  9:43     ` Tvrtko Ursulin
2024-04-29 11:11       ` Christian König
2024-04-29 13:45         ` Tvrtko Ursulin
2024-04-29 14:34           ` Felix Kuehling
2024-04-29 14:43       ` Felix Kuehling
2024-04-29 16:51         ` Tvrtko Ursulin
2024-04-29 11:06   ` Christian König
2024-04-29 16:47 [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper Tvrtko Ursulin
2024-04-29 16:47 ` [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection Tvrtko Ursulin
2024-04-30  0:31   ` Felix Kuehling
2024-04-30  8:27     ` Tvrtko Ursulin
2024-05-03  6:31   ` Christian König
2024-05-03  9:14 [PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper Tvrtko Ursulin
2024-05-03  9:14 ` [PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection Tvrtko Ursulin
