All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alex Deucher <alexdeucher@gmail.com>
To: "Michel Dänzer" <michel@daenzer.net>
Cc: Maling list - DRI developers <dri-devel@lists.freedesktop.org>
Subject: Re: [PATCH] drm/radeon: fix TOPDOWN handling for bo_create
Date: Mon, 16 Mar 2015 18:32:16 -0400	[thread overview]
Message-ID: <CADnq5_PqVU1PeDr0XMvA1nKa5VmoRvWPVJQ_uZuvq2LoALgRwA@mail.gmail.com> (raw)
In-Reply-To: <550251B2.5020000@daenzer.net>

[-- Attachment #1: Type: text/plain, Size: 3296 bytes --]

On Thu, Mar 12, 2015 at 10:55 PM, Michel Dänzer <michel@daenzer.net> wrote:
> On 12.03.2015 22:09, Alex Deucher wrote:
>> On Thu, Mar 12, 2015 at 5:23 AM, Christian König
>> <deathsimple@vodafone.de> wrote:
>>> On 12.03.2015 10:02, Michel Dänzer wrote:
>>>>
>>>> On 12.03.2015 06:14, Alex Deucher wrote:
>>>>>
>>>>> On Wed, Mar 11, 2015 at 4:51 PM, Alex Deucher <alexdeucher@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> On Wed, Mar 11, 2015 at 2:21 PM, Christian König
>>>>>> <deathsimple@vodafone.de> wrote:
>>>>>>>
>>>>>>> On 11.03.2015 16:44, Alex Deucher wrote:
>>>>>>>>
>>>>>>>> radeon_bo_create() calls radeon_ttm_placement_from_domain()
>>>>>>>> before ttm_bo_init() is called.  radeon_ttm_placement_from_domain()
>>>>>>>> uses the ttm bo size to determine when to select top down
>>>>>>>> allocation but since the ttm bo is not initialized yet the
>>>>>>>> check is always false.
>>>>>>>>
>>>>>>>> Noticed-by: Oded Gabbay <oded.gabbay@amd.com>
>>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>>>>>>> Cc: stable@vger.kernel.org
>>>>>>>
>>>>>>>
>>>>>>> And I was already wondering why the heck the BOs always made this
>>>>>>> ping/pong
>>>>>>> in memory after creation.
>>>>>>>
>>>>>>> Patch is Reviewed-by: Christian König <christian.koenig@amd.com>
>>>>>>
>>>>>> And fixing that promptly broke VCE due to vram location requirements.
>>>>>> Updated patch attached.  Thoughts?
>>>>>
>>>>> And one more take to make things a bit more explicit for static kernel
>>>>> driver allocations.
>>>>
>>>> struct ttm_place::lpfn is honoured even with TTM_PL_FLAG_TOPDOWN, so
>>>> latter should work with RADEON_GEM_CPU_ACCESS. It sounds like the
>>>> problem is really that some BOs are expected to be within a certain
>>>> range from the beginning of VRAM, but lpfn isn't set accordingly. It
>>>> would be better to fix that by setting lpfn directly than indirectly via
>>>> RADEON_GEM_CPU_ACCESS.
>>>
>>>
>>> Yeah, agree. We should probably try to find the root cause of this instead.
>>>
>>> As far as I know VCE has no documented limitation on where buffers are
>>> placed (unlike UVD). So this is a bit strange. Are you sure that it isn't
>>> UVD which breaks here?
>>
>> It's definitely VCE, I don't know why UVD didn't have a problem.  I
>> considered using pin_restricted to make sure it got pinned in the CPU
>> visible region, but that had two problems: 1. it would end up getting
>> migrated when pinned,
>
> Maybe something like radeon_uvd_force_into_uvd_segment() is needed for
> VCE as well?
>
>
>> 2. it would end up at the top of the restricted
>> region since the top down flag is set which would end up fragmenting
>> vram.
>
> If that's an issue (which outweighs the supposed benefit of
> TTM_PL_FLAG_TOPDOWN), then again the proper solution would be not to set
> TTM_PL_FLAG_TOPDOWN when rbo->placements[i].lpfn != 0 and smaller than
> the whole available region, instead of checking for VRAM and
> RADEON_GEM_CPU_ACCESS.
>

How about something like the attached patch?  I'm not really sure
about the restrictions for the UVD and VCE fw and stack/heap buffers,
but this seems to work.  It seems like the current UVD/VCE code works
by accident since the check for TOPDOWN fails.

Alex

[-- Attachment #2: 0001-drm-radeon-handle-pfn-restrictions-and-TOPDOWN-in-ra.patch --]
[-- Type: text/x-patch, Size: 26208 bytes --]

From 304963717d0ad761fc860928c8b08df297635668 Mon Sep 17 00:00:00 2001
From: Alex Deucher <alexander.deucher@amd.com>
Date: Wed, 11 Mar 2015 11:27:26 -0400
Subject: [PATCH] drm/radeon: handle pfn restrictions and TOPDOWN in
 radeon_bo_create()

Explicitly set the pfn restrictions on bo_create. Previously
we were relying on bottom up behavior in a number of places.
Make it explicit.

Also pass the size explicitly since radeon_bo_create() calls
radeon_ttm_placement_from_domain() before ttm_bo_init() is called.
radeon_ttm_placement_from_domain() uses the ttm bo size to
determine when to select top down allocation but since the ttm bo
is not initialized yet the check is always false.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/radeon/cik.c              |  4 +-
 drivers/gpu/drm/radeon/evergreen.c        |  6 +--
 drivers/gpu/drm/radeon/r600.c             |  4 +-
 drivers/gpu/drm/radeon/radeon.h           |  3 +-
 drivers/gpu/drm/radeon/radeon_benchmark.c |  6 ++-
 drivers/gpu/drm/radeon/radeon_device.c    |  2 +-
 drivers/gpu/drm/radeon/radeon_gart.c      |  1 +
 drivers/gpu/drm/radeon/radeon_gem.c       |  5 +-
 drivers/gpu/drm/radeon/radeon_kfd.c       |  2 +-
 drivers/gpu/drm/radeon/radeon_mn.c        |  5 +-
 drivers/gpu/drm/radeon/radeon_object.c    | 86 ++++++++++++++++---------------
 drivers/gpu/drm/radeon/radeon_object.h    |  3 +-
 drivers/gpu/drm/radeon/radeon_prime.c     |  2 +-
 drivers/gpu/drm/radeon/radeon_ring.c      |  2 +-
 drivers/gpu/drm/radeon/radeon_sa.c        |  2 +-
 drivers/gpu/drm/radeon/radeon_test.c      |  4 +-
 drivers/gpu/drm/radeon/radeon_ttm.c       | 14 +++--
 drivers/gpu/drm/radeon/radeon_uvd.c       |  9 +---
 drivers/gpu/drm/radeon/radeon_vce.c       |  4 +-
 drivers/gpu/drm/radeon/radeon_vm.c        |  4 +-
 20 files changed, 87 insertions(+), 81 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 28faea9..8d70176 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4756,7 +4756,7 @@ static int cik_mec_init(struct radeon_device *rdev)
 		r = radeon_bo_create(rdev,
 				     rdev->mec.num_mec *rdev->mec.num_pipe * MEC_HPD_SIZE * 2,
 				     PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_GTT, 0, NULL, NULL,
+				     RADEON_GEM_DOMAIN_GTT, 0, 0, 0, NULL, NULL,
 				     &rdev->mec.hpd_eop_obj);
 		if (r) {
 			dev_warn(rdev->dev, "(%d) create HDP EOP bo failed\n", r);
@@ -4922,7 +4922,7 @@ static int cik_cp_compute_resume(struct radeon_device *rdev)
 			r = radeon_bo_create(rdev,
 					     sizeof(struct bonaire_mqd),
 					     PAGE_SIZE, true,
-					     RADEON_GEM_DOMAIN_GTT, 0, NULL,
+					     RADEON_GEM_DOMAIN_GTT, 0, 0, 0, NULL,
 					     NULL, &rdev->ring[idx].mqd_obj);
 			if (r) {
 				dev_warn(rdev->dev, "(%d) create MQD bo failed\n", r);
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index f848acf..02b5d2e 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -4054,7 +4054,7 @@ int sumo_rlc_init(struct radeon_device *rdev)
 		/* save restore block */
 		if (rdev->rlc.save_restore_obj == NULL) {
 			r = radeon_bo_create(rdev, dws * 4, PAGE_SIZE, true,
-					     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+					     RADEON_GEM_DOMAIN_VRAM, 0, 0, 0, NULL,
 					     NULL, &rdev->rlc.save_restore_obj);
 			if (r) {
 				dev_warn(rdev->dev, "(%d) create RLC sr bo failed\n", r);
@@ -4133,7 +4133,7 @@ int sumo_rlc_init(struct radeon_device *rdev)
 
 		if (rdev->rlc.clear_state_obj == NULL) {
 			r = radeon_bo_create(rdev, dws * 4, PAGE_SIZE, true,
-					     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+					     RADEON_GEM_DOMAIN_VRAM, 0, 0, 0, NULL,
 					     NULL, &rdev->rlc.clear_state_obj);
 			if (r) {
 				dev_warn(rdev->dev, "(%d) create RLC c bo failed\n", r);
@@ -4210,7 +4210,7 @@ int sumo_rlc_init(struct radeon_device *rdev)
 		if (rdev->rlc.cp_table_obj == NULL) {
 			r = radeon_bo_create(rdev, rdev->rlc.cp_table_size,
 					     PAGE_SIZE, true,
-					     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+					     RADEON_GEM_DOMAIN_VRAM, 0, 0, 0, NULL,
 					     NULL, &rdev->rlc.cp_table_obj);
 			if (r) {
 				dev_warn(rdev->dev, "(%d) create RLC cp table bo failed\n", r);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 8f6d862..7249620 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -1457,7 +1457,7 @@ int r600_vram_scratch_init(struct radeon_device *rdev)
 	if (rdev->vram_scratch.robj == NULL) {
 		r = radeon_bo_create(rdev, RADEON_GPU_PAGE_SIZE,
 				     PAGE_SIZE, true, RADEON_GEM_DOMAIN_VRAM,
-				     0, NULL, NULL, &rdev->vram_scratch.robj);
+				     0, 0, 0, NULL, NULL, &rdev->vram_scratch.robj);
 		if (r) {
 			return r;
 		}
@@ -3390,7 +3390,7 @@ int r600_ih_ring_alloc(struct radeon_device *rdev)
 	if (rdev->ih.ring_obj == NULL) {
 		r = radeon_bo_create(rdev, rdev->ih.ring_size,
 				     PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_GTT, 0,
+				     RADEON_GEM_DOMAIN_GTT, 0, 0, 0,
 				     NULL, NULL, &rdev->ih.ring_obj);
 		if (r) {
 			DRM_ERROR("radeon: failed to create ih ring buffer (%d).\n", r);
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 35ab65d..809fc49 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -2980,7 +2980,8 @@ extern void radeon_surface_init(struct radeon_device *rdev);
 extern int radeon_cs_parser_init(struct radeon_cs_parser *p, void *data);
 extern void radeon_legacy_set_clock_gating(struct radeon_device *rdev, int enable);
 extern void radeon_atom_set_clock_gating(struct radeon_device *rdev, int enable);
-extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain);
+extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain,
+					     u64 size, unsigned fpfn, unsigned lpfn);
 extern bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo);
 extern int radeon_ttm_tt_set_userptr(struct ttm_tt *ttm, uint64_t addr,
 				     uint32_t flags);
diff --git a/drivers/gpu/drm/radeon/radeon_benchmark.c b/drivers/gpu/drm/radeon/radeon_benchmark.c
index 87d5fb2..1caa887 100644
--- a/drivers/gpu/drm/radeon/radeon_benchmark.c
+++ b/drivers/gpu/drm/radeon/radeon_benchmark.c
@@ -94,7 +94,8 @@ static void radeon_benchmark_move(struct radeon_device *rdev, unsigned size,
 	int time;
 
 	n = RADEON_BENCHMARK_ITERATIONS;
-	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, sdomain, 0, NULL, NULL, &sobj);
+	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, sdomain, 0, 0, 0,
+			     NULL, NULL, &sobj);
 	if (r) {
 		goto out_cleanup;
 	}
@@ -106,7 +107,8 @@ static void radeon_benchmark_move(struct radeon_device *rdev, unsigned size,
 	if (r) {
 		goto out_cleanup;
 	}
-	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, ddomain, 0, NULL, NULL, &dobj);
+	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, ddomain, 0, 0, 0,
+			     NULL, NULL, &dobj);
 	if (r) {
 		goto out_cleanup;
 	}
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index b7ca4c5..d1cebfe 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -465,7 +465,7 @@ int radeon_wb_init(struct radeon_device *rdev)
 
 	if (rdev->wb.wb_obj == NULL) {
 		r = radeon_bo_create(rdev, RADEON_GPU_PAGE_SIZE, PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_GTT, 0, NULL, NULL,
+				     RADEON_GEM_DOMAIN_GTT, 0, 0, 0, NULL, NULL,
 				     &rdev->wb.wb_obj);
 		if (r) {
 			dev_warn(rdev->dev, "(%d) create WB bo failed\n", r);
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c
index 5450fa9..1513c1c 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -128,6 +128,7 @@ int radeon_gart_table_vram_alloc(struct radeon_device *rdev)
 	if (rdev->gart.robj == NULL) {
 		r = radeon_bo_create(rdev, rdev->gart.table_size,
 				     PAGE_SIZE, true, RADEON_GEM_DOMAIN_VRAM,
+				     0, (rdev->mc.visible_vram_size >> PAGE_SHIFT),
 				     0, NULL, NULL, &rdev->gart.robj);
 		if (r) {
 			return r;
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index ac3c131..58f4556 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -67,7 +67,7 @@ int radeon_gem_object_create(struct radeon_device *rdev, unsigned long size,
 
 retry:
 	r = radeon_bo_create(rdev, size, alignment, kernel, initial_domain,
-			     flags, NULL, NULL, &robj);
+			     0, 0, flags, NULL, NULL, &robj);
 	if (r) {
 		if (r != -ERESTARTSYS) {
 			if (initial_domain == RADEON_GEM_DOMAIN_VRAM) {
@@ -337,7 +337,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 			goto release_object;
 		}
 
-		radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_GTT);
+		radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_GTT, bo->tbo.mem.size,
+						 0, 0);
 		r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
 		radeon_bo_unreserve(bo);
 		up_read(&current->mm->mmap_sem);
diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 061eaa9..e0caa6c 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -212,7 +212,7 @@ static int alloc_gtt_mem(struct kgd_dev *kgd, size_t size,
 		return -ENOMEM;
 
 	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, RADEON_GEM_DOMAIN_GTT,
-				RADEON_GEM_GTT_WC, NULL, NULL, &(*mem)->bo);
+			     0, 0, RADEON_GEM_GTT_WC, NULL, NULL, &(*mem)->bo);
 	if (r) {
 		dev_err(rdev->dev,
 			"failed to allocate BO for amdkfd (%d)\n", r);
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index a69bd44..7a8b0e5 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -141,14 +141,15 @@ static void radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
 				DRM_ERROR("(%d) failed to wait for user bo\n", r);
 		}
 
-		radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_CPU);
+		radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_CPU, bo->tbo.mem.size,
+						 0, 0);
 		r = ttm_bo_validate(&bo->tbo, &bo->placement, false, false);
 		if (r)
 			DRM_ERROR("(%d) failed to validate user bo\n", r);
 
 		radeon_bo_unreserve(bo);
 	}
-	
+
 	mutex_unlock(&rmn->lock);
 }
 
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 43e0994..aa2f815 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -93,7 +93,8 @@ bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo)
 	return false;
 }
 
-void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
+void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain,
+				      u64 size, unsigned fpfn, unsigned lpfn)
 {
 	u32 c = 0, i;
 
@@ -105,14 +106,17 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 		 */
 		if ((rbo->flags & RADEON_GEM_NO_CPU_ACCESS) &&
 		    rbo->rdev->mc.visible_vram_size < rbo->rdev->mc.real_vram_size) {
-			rbo->placements[c].fpfn =
-				rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
+			if (fpfn > (rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT))
+				rbo->placements[c].fpfn = fpfn;
+			else
+				rbo->placements[c].fpfn =
+					rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
 			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
 						     TTM_PL_FLAG_UNCACHED |
 						     TTM_PL_FLAG_VRAM;
 		}
 
-		rbo->placements[c].fpfn = 0;
+		rbo->placements[c].fpfn = fpfn;
 		rbo->placements[c++].flags = TTM_PL_FLAG_WC |
 					     TTM_PL_FLAG_UNCACHED |
 					     TTM_PL_FLAG_VRAM;
@@ -120,18 +124,17 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 
 	if (domain & RADEON_GEM_DOMAIN_GTT) {
 		if (rbo->flags & RADEON_GEM_GTT_UC) {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
 				TTM_PL_FLAG_TT;
-
 		} else if ((rbo->flags & RADEON_GEM_GTT_WC) ||
 			   (rbo->rdev->flags & RADEON_IS_AGP)) {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
 				TTM_PL_FLAG_UNCACHED |
 				TTM_PL_FLAG_TT;
 		} else {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_CACHED |
 						     TTM_PL_FLAG_TT;
 		}
@@ -139,24 +142,24 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 
 	if (domain & RADEON_GEM_DOMAIN_CPU) {
 		if (rbo->flags & RADEON_GEM_GTT_UC) {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
 				TTM_PL_FLAG_SYSTEM;
 
 		} else if ((rbo->flags & RADEON_GEM_GTT_WC) ||
 		    rbo->rdev->flags & RADEON_IS_AGP) {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
 				TTM_PL_FLAG_UNCACHED |
 				TTM_PL_FLAG_SYSTEM;
 		} else {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_CACHED |
 						     TTM_PL_FLAG_SYSTEM;
 		}
 	}
 	if (!c) {
-		rbo->placements[c].fpfn = 0;
+		rbo->placements[c].fpfn = fpfn;
 		rbo->placements[c++].flags = TTM_PL_MASK_CACHING |
 					     TTM_PL_FLAG_SYSTEM;
 	}
@@ -171,7 +174,7 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 			rbo->placements[i].lpfn =
 				rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
 		else
-			rbo->placements[i].lpfn = 0;
+			rbo->placements[i].lpfn = lpfn;
 	}
 
 	/*
@@ -179,17 +182,18 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 	 * improve fragmentation quality.
 	 * 512kb was measured as the most optimal number.
 	 */
-	if (rbo->tbo.mem.size > 512 * 1024) {
+	if (size > 512 * 1024) {
 		for (i = 0; i < c; i++) {
-			rbo->placements[i].flags |= TTM_PL_FLAG_TOPDOWN;
+			if (rbo->placements[i].lpfn == 0)
+				rbo->placements[i].flags |= TTM_PL_FLAG_TOPDOWN;
 		}
 	}
 }
 
 int radeon_bo_create(struct radeon_device *rdev,
 		     unsigned long size, int byte_align, bool kernel,
-		     u32 domain, u32 flags, struct sg_table *sg,
-		     struct reservation_object *resv,
+		     u32 domain, unsigned fpfn, unsigned lpfn, u32 flags,
+		     struct sg_table *sg, struct reservation_object *resv,
 		     struct radeon_bo **bo_ptr)
 {
 	struct radeon_bo *bo;
@@ -252,7 +256,7 @@ int radeon_bo_create(struct radeon_device *rdev,
 	bo->flags &= ~RADEON_GEM_GTT_WC;
 #endif
 
-	radeon_ttm_placement_from_domain(bo, domain);
+	radeon_ttm_placement_from_domain(bo, domain, size, fpfn, lpfn);
 	/* Kernel allocation are uninterruptible */
 	down_read(&rdev->pm.mclk_lock);
 	r = ttm_bo_init(&rdev->mman.bdev, &bo->tbo, size, type,
@@ -328,6 +332,7 @@ int radeon_bo_pin_restricted(struct radeon_bo *bo, u32 domain, u64 max_offset,
 			     u64 *gpu_addr)
 {
 	int r, i;
+	unsigned lpfn;
 
 	if (radeon_ttm_tt_has_userptr(bo->tbo.ttm))
 		return -EPERM;
@@ -350,19 +355,16 @@ int radeon_bo_pin_restricted(struct radeon_bo *bo, u32 domain, u64 max_offset,
 
 		return 0;
 	}
-	radeon_ttm_placement_from_domain(bo, domain);
-	for (i = 0; i < bo->placement.num_placement; i++) {
-		/* force to pin into visible video ram */
-		if ((bo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
-		    !(bo->flags & RADEON_GEM_NO_CPU_ACCESS) &&
-		    (!max_offset || max_offset > bo->rdev->mc.visible_vram_size))
-			bo->placements[i].lpfn =
-				bo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
-		else
-			bo->placements[i].lpfn = max_offset >> PAGE_SHIFT;
-
+	/* force to pin into visible video ram */
+	if ((domain == RADEON_GEM_DOMAIN_VRAM) &&
+	    !(bo->flags & RADEON_GEM_NO_CPU_ACCESS) &&
+	    (!max_offset || max_offset > bo->rdev->mc.visible_vram_size))
+		lpfn = bo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
+	else
+		lpfn = max_offset >> PAGE_SHIFT;
+	radeon_ttm_placement_from_domain(bo, domain, bo->tbo.mem.size, 0, lpfn);
+	for (i = 0; i < bo->placement.num_placement; i++)
 		bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
-	}
 
 	r = ttm_bo_validate(&bo->tbo, &bo->placement, false, false);
 	if (likely(r == 0)) {
@@ -557,9 +559,13 @@ int radeon_bo_list_validate(struct radeon_device *rdev,
 			}
 
 		retry:
-			radeon_ttm_placement_from_domain(bo, domain);
-			if (ring == R600_RING_TYPE_UVD_INDEX)
+			if (ring == R600_RING_TYPE_UVD_INDEX) {
+				radeon_ttm_placement_from_domain(bo, domain, bo->tbo.mem.size,
+								 0, (256 * 1024 * 1024) >> PAGE_SHIFT);
 				radeon_uvd_force_into_uvd_segment(bo, allowed);
+			} else {
+				radeon_ttm_placement_from_domain(bo, domain, bo->tbo.mem.size, 0, 0);
+			}
 
 			initial_bytes_moved = atomic64_read(&rdev->num_bytes_moved);
 			r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
@@ -784,7 +790,7 @@ int radeon_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
 	struct radeon_device *rdev;
 	struct radeon_bo *rbo;
 	unsigned long offset, size, lpfn;
-	int i, r;
+	int r;
 
 	if (!radeon_ttm_bo_is_radeon_bo(bo))
 		return 0;
@@ -800,17 +806,13 @@ int radeon_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
 		return 0;
 
 	/* hurrah the memory is not visible ! */
-	radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_VRAM);
-	lpfn =	rdev->mc.visible_vram_size >> PAGE_SHIFT;
-	for (i = 0; i < rbo->placement.num_placement; i++) {
-		/* Force into visible VRAM */
-		if ((rbo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
-		    (!rbo->placements[i].lpfn || rbo->placements[i].lpfn > lpfn))
-			rbo->placements[i].lpfn = lpfn;
-	}
+	lpfn = rdev->mc.visible_vram_size >> PAGE_SHIFT;
+	radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_VRAM,
+					 rbo->tbo.mem.size, 0, lpfn);
 	r = ttm_bo_validate(bo, &rbo->placement, false, false);
 	if (unlikely(r == -ENOMEM)) {
-		radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_GTT);
+		radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_GTT,
+						 rbo->tbo.mem.size, 0, 0);
 		return ttm_bo_validate(bo, &rbo->placement, false, false);
 	} else if (unlikely(r != 0)) {
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h
index d8d295e..806dc5c 100644
--- a/drivers/gpu/drm/radeon/radeon_object.h
+++ b/drivers/gpu/drm/radeon/radeon_object.h
@@ -124,7 +124,8 @@ extern int radeon_bo_wait(struct radeon_bo *bo, u32 *mem_type,
 
 extern int radeon_bo_create(struct radeon_device *rdev,
 			    unsigned long size, int byte_align,
-			    bool kernel, u32 domain, u32 flags,
+			    bool kernel, u32 domain,
+			    unsigned fpfn, unsigned lpfn, u32 flags,
 			    struct sg_table *sg,
 			    struct reservation_object *resv,
 			    struct radeon_bo **bo_ptr);
diff --git a/drivers/gpu/drm/radeon/radeon_prime.c b/drivers/gpu/drm/radeon/radeon_prime.c
index f3609c9..d808d8f 100644
--- a/drivers/gpu/drm/radeon/radeon_prime.c
+++ b/drivers/gpu/drm/radeon/radeon_prime.c
@@ -68,7 +68,7 @@ struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device *dev,
 
 	ww_mutex_lock(&resv->lock, NULL);
 	ret = radeon_bo_create(rdev, attach->dmabuf->size, PAGE_SIZE, false,
-			       RADEON_GEM_DOMAIN_GTT, 0, sg, resv, &bo);
+			       RADEON_GEM_DOMAIN_GTT, 0, 0, 0, sg, resv, &bo);
 	ww_mutex_unlock(&resv->lock);
 	if (ret)
 		return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 2456f69..d9f0150 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -383,7 +383,7 @@ int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, unsig
 	/* Allocate ring buffer */
 	if (ring->ring_obj == NULL) {
 		r = radeon_bo_create(rdev, ring->ring_size, PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_GTT, 0, NULL,
+				     RADEON_GEM_DOMAIN_GTT, 0, 0, 0, NULL,
 				     NULL, &ring->ring_obj);
 		if (r) {
 			dev_err(rdev->dev, "(%d) ring create failed\n", r);
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index c507896..3013fb18 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -65,7 +65,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev,
 	}
 
 	r = radeon_bo_create(rdev, size, align, true,
-			     domain, flags, NULL, NULL, &sa_manager->bo);
+			     domain, 0, 0, flags, NULL, NULL, &sa_manager->bo);
 	if (r) {
 		dev_err(rdev->dev, "(%d) failed to allocate bo for manager\n", r);
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_test.c b/drivers/gpu/drm/radeon/radeon_test.c
index 79181816..073d6d2 100644
--- a/drivers/gpu/drm/radeon/radeon_test.c
+++ b/drivers/gpu/drm/radeon/radeon_test.c
@@ -67,7 +67,7 @@ static void radeon_do_test_moves(struct radeon_device *rdev, int flag)
 	}
 
 	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, RADEON_GEM_DOMAIN_VRAM,
-			     0, NULL, NULL, &vram_obj);
+			     0, 0, 0, NULL, NULL, &vram_obj);
 	if (r) {
 		DRM_ERROR("Failed to create VRAM object\n");
 		goto out_cleanup;
@@ -87,7 +87,7 @@ static void radeon_do_test_moves(struct radeon_device *rdev, int flag)
 		struct radeon_fence *fence = NULL;
 
 		r = radeon_bo_create(rdev, size, PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_GTT, 0, NULL, NULL,
+				     RADEON_GEM_DOMAIN_GTT, 0, 0, 0, NULL, NULL,
 				     gtt_obj + i);
 		if (r) {
 			DRM_ERROR("Failed to create GTT object %d\n", i);
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index d02aa1d..befb590 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -197,7 +197,8 @@ static void radeon_evict_flags(struct ttm_buffer_object *bo,
 	switch (bo->mem.mem_type) {
 	case TTM_PL_VRAM:
 		if (rbo->rdev->ring[radeon_copy_ring_index(rbo->rdev)].ready == false)
-			radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_CPU);
+			radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_CPU,
+							 rbo->tbo.mem.size, 0, 0);
 		else if (rbo->rdev->mc.visible_vram_size < rbo->rdev->mc.real_vram_size &&
 			 bo->mem.start < (rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT)) {
 			unsigned fpfn = rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
@@ -209,7 +210,8 @@ static void radeon_evict_flags(struct ttm_buffer_object *bo,
 			 * BOs to be evicted from VRAM
 			 */
 			radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_VRAM |
-							 RADEON_GEM_DOMAIN_GTT);
+							 RADEON_GEM_DOMAIN_GTT,
+							 rbo->tbo.mem.size, 0, 0);
 			rbo->placement.num_busy_placement = 0;
 			for (i = 0; i < rbo->placement.num_placement; i++) {
 				if (rbo->placements[i].flags & TTM_PL_FLAG_VRAM) {
@@ -222,11 +224,13 @@ static void radeon_evict_flags(struct ttm_buffer_object *bo,
 				}
 			}
 		} else
-			radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_GTT);
+			radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_GTT,
+							 rbo->tbo.mem.size, 0, 0);
 		break;
 	case TTM_PL_TT:
 	default:
-		radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_CPU);
+		radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_CPU,
+						 rbo->tbo.mem.size, 0, 0);
 	}
 	*placement = rbo->placement;
 }
@@ -888,7 +892,7 @@ int radeon_ttm_init(struct radeon_device *rdev)
 	radeon_ttm_set_active_vram_size(rdev, rdev->mc.visible_vram_size);
 
 	r = radeon_bo_create(rdev, 256 * 1024, PAGE_SIZE, true,
-			     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+			     RADEON_GEM_DOMAIN_VRAM, 0, (256 * 1024) >> PAGE_SHIFT, 0, NULL,
 			     NULL, &rdev->stollen_vga_memory);
 	if (r) {
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_uvd.c b/drivers/gpu/drm/radeon/radeon_uvd.c
index c10b2ae..6261463 100644
--- a/drivers/gpu/drm/radeon/radeon_uvd.c
+++ b/drivers/gpu/drm/radeon/radeon_uvd.c
@@ -141,7 +141,7 @@ int radeon_uvd_init(struct radeon_device *rdev)
 		  RADEON_UVD_STACK_SIZE + RADEON_UVD_HEAP_SIZE +
 		  RADEON_GPU_PAGE_SIZE;
 	r = radeon_bo_create(rdev, bo_size, PAGE_SIZE, true,
-			     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+			     RADEON_GEM_DOMAIN_VRAM, 0, (256 * 1024 * 1024) >> PAGE_SHIFT, 0, NULL,
 			     NULL, &rdev->uvd.vcpu_bo);
 	if (r) {
 		dev_err(rdev->dev, "(%d) failed to allocate UVD bo\n", r);
@@ -259,13 +259,6 @@ int radeon_uvd_resume(struct radeon_device *rdev)
 void radeon_uvd_force_into_uvd_segment(struct radeon_bo *rbo,
 				       uint32_t allowed_domains)
 {
-	int i;
-
-	for (i = 0; i < rbo->placement.num_placement; ++i) {
-		rbo->placements[i].fpfn = 0 >> PAGE_SHIFT;
-		rbo->placements[i].lpfn = (256 * 1024 * 1024) >> PAGE_SHIFT;
-	}
-
 	/* If it must be in VRAM it must be in the first segment as well */
 	if (allowed_domains == RADEON_GEM_DOMAIN_VRAM)
 		return;
diff --git a/drivers/gpu/drm/radeon/radeon_vce.c b/drivers/gpu/drm/radeon/radeon_vce.c
index 976fe43..c35f0a9 100644
--- a/drivers/gpu/drm/radeon/radeon_vce.c
+++ b/drivers/gpu/drm/radeon/radeon_vce.c
@@ -126,8 +126,8 @@ int radeon_vce_init(struct radeon_device *rdev)
 	size = RADEON_GPU_PAGE_ALIGN(rdev->vce_fw->size) +
 	       RADEON_VCE_STACK_SIZE + RADEON_VCE_HEAP_SIZE;
 	r = radeon_bo_create(rdev, size, PAGE_SIZE, true,
-			     RADEON_GEM_DOMAIN_VRAM, 0, NULL, NULL,
-			     &rdev->vce.vcpu_bo);
+			     RADEON_GEM_DOMAIN_VRAM, 0, (256 * 1024 * 1024) >> PAGE_SHIFT, 0,
+			     NULL, NULL, &rdev->vce.vcpu_bo);
 	if (r) {
 		dev_err(rdev->dev, "(%d) failed to allocate VCE bo\n", r);
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
index 2a5a4a9..e845d4c 100644
--- a/drivers/gpu/drm/radeon/radeon_vm.c
+++ b/drivers/gpu/drm/radeon/radeon_vm.c
@@ -542,7 +542,7 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
 
 		r = radeon_bo_create(rdev, RADEON_VM_PTE_COUNT * 8,
 				     RADEON_GPU_PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_VRAM, 0,
+				     RADEON_GEM_DOMAIN_VRAM, 0, 0, 0,
 				     NULL, NULL, &pt);
 		if (r)
 			return r;
@@ -1186,7 +1186,7 @@ int radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm)
 	}
 
 	r = radeon_bo_create(rdev, pd_size, align, true,
-			     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+			     RADEON_GEM_DOMAIN_VRAM, 0, 0, 0, NULL,
 			     NULL, &vm->page_directory);
 	if (r)
 		return r;
-- 
1.8.3.1


[-- Attachment #3: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply	other threads:[~2015-03-16 22:32 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-11 15:44 [PATCH] drm/radeon: fix TOPDOWN handling for bo_create Alex Deucher
2015-03-11 18:21 ` Christian König
2015-03-11 20:51   ` Alex Deucher
2015-03-11 21:14     ` Alex Deucher
2015-03-12  9:02       ` Michel Dänzer
2015-03-12  9:23         ` Christian König
2015-03-12  9:30           ` Oded Gabbay
2015-03-12  9:36             ` Christian König
2015-03-15 15:07               ` Oded Gabbay
2015-03-12 13:09           ` Alex Deucher
2015-03-13  2:55             ` Michel Dänzer
2015-03-16 22:32               ` Alex Deucher [this message]
2015-03-17  3:48                 ` Michel Dänzer
2015-03-17 15:19                   ` Alex Deucher
2015-03-17 15:43                     ` Christian König
2015-03-17 15:50                       ` Alex Deucher
2015-03-17 11:40                 ` Christian König
2015-03-17 13:49                   ` Alex Deucher
2015-03-12 10:21         ` Lauri Kasanen
2015-03-13  9:11         ` Daniel Vetter
2015-03-13  9:46           ` Michel Dänzer
2015-03-13 16:36             ` Daniel Vetter
2015-03-13 17:57               ` Alex Deucher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CADnq5_PqVU1PeDr0XMvA1nKa5VmoRvWPVJQ_uZuvq2LoALgRwA@mail.gmail.com \
    --to=alexdeucher@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=michel@daenzer.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.