All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] fence, sa allocator, ib pool, semaphore rework
@ 2012-05-04 22:43 j.glisse
  2012-05-04 22:43 ` [PATCH 01/18] drm/radeon: replace the per ring mutex with a global one j.glisse
                   ` (17 more replies)
  0 siblings, 18 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel

First chunk rework fence to use uin64_t, unlike previous patch,
we only emit the lower 32 bits with the hw. The upper 32bits is
handled in the fence process function where a lenghty comment
discuss all the possible things that can go wrong and why it
doesn't matter.

Then taking advantage of faster fence querying, the sa allocator
is simplified and improved. The commit message describe the
algorithm used. Taking advantage of that ib & semaphore is simplified
to use the same temporary ring related manager.

To finish all this, more code is axed and simplified.


It all need the previous 17 patches and the various other small fixes.
To make it easier i regrouped all the patch needed at :
http://people.freedesktop.org/~glisse/reset5/base/

Those patch should hit next soon.

The whole patchset can also be found at :
http://people.freedesktop.org/~glisse/reset5/


It all got limited testing so far, but i tried hard to make sure that
each step is working and thus bisectable. I split the fence rework
in 2 part so one can bisect if either the move to 64bits fence or
the complete change in fence handling logic is causing trouble.

Cheers,
Jerome

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 01/18] drm/radeon: replace the per ring mutex with a global one
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 02/18] drm/radeon: convert fence to uint64_t v3 j.glisse
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König

From: Christian König <deathsimple@vodafone.de>

A single global mutex for ring submissions seems sufficient.

Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon.h           |    3 +-
 drivers/gpu/drm/radeon/radeon_device.c    |    3 +-
 drivers/gpu/drm/radeon/radeon_pm.c        |   10 +-----
 drivers/gpu/drm/radeon/radeon_ring.c      |   28 +++++++++++-------
 drivers/gpu/drm/radeon/radeon_semaphore.c |   42 +++++++++++++----------------
 5 files changed, 41 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 729d332..71d8af8 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -667,7 +667,6 @@ struct radeon_ring {
 	uint64_t		gpu_addr;
 	uint32_t		align_mask;
 	uint32_t		ptr_mask;
-	struct mutex		mutex;
 	bool			ready;
 	u32			ptr_reg_shift;
 	u32			ptr_reg_mask;
@@ -806,6 +805,7 @@ int radeon_ring_alloc(struct radeon_device *rdev, struct radeon_ring *cp, unsign
 int radeon_ring_lock(struct radeon_device *rdev, struct radeon_ring *cp, unsigned ndw);
 void radeon_ring_commit(struct radeon_device *rdev, struct radeon_ring *cp);
 void radeon_ring_unlock_commit(struct radeon_device *rdev, struct radeon_ring *cp);
+void radeon_ring_undo(struct radeon_ring *ring);
 void radeon_ring_unlock_undo(struct radeon_device *rdev, struct radeon_ring *cp);
 int radeon_ring_test(struct radeon_device *rdev, struct radeon_ring *cp);
 void radeon_ring_force_activity(struct radeon_device *rdev, struct radeon_ring *ring);
@@ -1516,6 +1516,7 @@ struct radeon_device {
 	rwlock_t			fence_lock;
 	struct radeon_fence_driver	fence_drv[RADEON_NUM_RINGS];
 	struct radeon_semaphore_driver	semaphore_drv;
+	struct mutex			ring_lock;
 	struct radeon_ring		ring[RADEON_NUM_RINGS];
 	struct radeon_ib_pool		ib_pool;
 	struct radeon_irq		irq;
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index 93615d1..ad3a7fb 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -724,8 +724,7 @@ int radeon_device_init(struct radeon_device *rdev,
 	 * can recall function without having locking issues */
 	radeon_mutex_init(&rdev->cs_mutex);
 	radeon_mutex_init(&rdev->ib_pool.mutex);
-	for (i = 0; i < RADEON_NUM_RINGS; ++i)
-		mutex_init(&rdev->ring[i].mutex);
+	mutex_init(&rdev->ring_lock);
 	mutex_init(&rdev->dc_hw_i2c_mutex);
 	if (rdev->family >= CHIP_R600)
 		spin_lock_init(&rdev->ih.lock);
diff --git a/drivers/gpu/drm/radeon/radeon_pm.c b/drivers/gpu/drm/radeon/radeon_pm.c
index caa55d6..7c38745 100644
--- a/drivers/gpu/drm/radeon/radeon_pm.c
+++ b/drivers/gpu/drm/radeon/radeon_pm.c
@@ -252,10 +252,7 @@ static void radeon_pm_set_clocks(struct radeon_device *rdev)
 
 	mutex_lock(&rdev->ddev->struct_mutex);
 	mutex_lock(&rdev->vram_mutex);
-	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
-		if (rdev->ring[i].ring_obj)
-			mutex_lock(&rdev->ring[i].mutex);
-	}
+	mutex_lock(&rdev->ring_lock);
 
 	/* gui idle int has issues on older chips it seems */
 	if (rdev->family >= CHIP_R600) {
@@ -311,10 +308,7 @@ static void radeon_pm_set_clocks(struct radeon_device *rdev)
 
 	rdev->pm.dynpm_planned_action = DYNPM_ACTION_NONE;
 
-	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
-		if (rdev->ring[i].ring_obj)
-			mutex_unlock(&rdev->ring[i].mutex);
-	}
+	mutex_unlock(&rdev->ring_lock);
 	mutex_unlock(&rdev->vram_mutex);
 	mutex_unlock(&rdev->ddev->struct_mutex);
 }
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 2eb4c6e..a4d60ae 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -346,9 +346,9 @@ int radeon_ring_alloc(struct radeon_device *rdev, struct radeon_ring *ring, unsi
 		if (ndw < ring->ring_free_dw) {
 			break;
 		}
-		mutex_unlock(&ring->mutex);
+		mutex_unlock(&rdev->ring_lock);
 		r = radeon_fence_wait_next(rdev, radeon_ring_index(rdev, ring));
-		mutex_lock(&ring->mutex);
+		mutex_lock(&rdev->ring_lock);
 		if (r)
 			return r;
 	}
@@ -361,10 +361,10 @@ int radeon_ring_lock(struct radeon_device *rdev, struct radeon_ring *ring, unsig
 {
 	int r;
 
-	mutex_lock(&ring->mutex);
+	mutex_lock(&rdev->ring_lock);
 	r = radeon_ring_alloc(rdev, ring, ndw);
 	if (r) {
-		mutex_unlock(&ring->mutex);
+		mutex_unlock(&rdev->ring_lock);
 		return r;
 	}
 	return 0;
@@ -389,20 +389,25 @@ void radeon_ring_commit(struct radeon_device *rdev, struct radeon_ring *ring)
 void radeon_ring_unlock_commit(struct radeon_device *rdev, struct radeon_ring *ring)
 {
 	radeon_ring_commit(rdev, ring);
-	mutex_unlock(&ring->mutex);
+	mutex_unlock(&rdev->ring_lock);
 }
 
-void radeon_ring_unlock_undo(struct radeon_device *rdev, struct radeon_ring *ring)
+void radeon_ring_undo(struct radeon_ring *ring)
 {
 	ring->wptr = ring->wptr_old;
-	mutex_unlock(&ring->mutex);
+}
+
+void radeon_ring_unlock_undo(struct radeon_device *rdev, struct radeon_ring *ring)
+{
+	radeon_ring_undo(ring);
+	mutex_unlock(&rdev->ring_lock);
 }
 
 void radeon_ring_force_activity(struct radeon_device *rdev, struct radeon_ring *ring)
 {
 	int r;
 
-	mutex_lock(&ring->mutex);
+	mutex_lock(&rdev->ring_lock);
 	radeon_ring_free_size(rdev, ring);
 	if (ring->rptr == ring->wptr) {
 		r = radeon_ring_alloc(rdev, ring, 1);
@@ -411,7 +416,7 @@ void radeon_ring_force_activity(struct radeon_device *rdev, struct radeon_ring *
 			radeon_ring_commit(rdev, ring);
 		}
 	}
-	mutex_unlock(&ring->mutex);
+	mutex_unlock(&rdev->ring_lock);
 }
 
 void radeon_ring_lockup_update(struct radeon_ring *ring)
@@ -520,11 +525,12 @@ void radeon_ring_fini(struct radeon_device *rdev, struct radeon_ring *ring)
 	int r;
 	struct radeon_bo *ring_obj;
 
-	mutex_lock(&ring->mutex);
+	mutex_lock(&rdev->ring_lock);
 	ring_obj = ring->ring_obj;
+	ring->ready = false;
 	ring->ring = NULL;
 	ring->ring_obj = NULL;
-	mutex_unlock(&ring->mutex);
+	mutex_unlock(&rdev->ring_lock);
 
 	if (ring_obj) {
 		r = radeon_bo_reserve(ring_obj, false);
diff --git a/drivers/gpu/drm/radeon/radeon_semaphore.c b/drivers/gpu/drm/radeon/radeon_semaphore.c
index 930a08a..c5b3d8e 100644
--- a/drivers/gpu/drm/radeon/radeon_semaphore.c
+++ b/drivers/gpu/drm/radeon/radeon_semaphore.c
@@ -39,7 +39,6 @@ static int radeon_semaphore_add_bo(struct radeon_device *rdev)
 	uint32_t *cpu_ptr;
 	int r, i;
 
-
 	bo = kmalloc(sizeof(struct radeon_semaphore_bo), GFP_KERNEL);
 	if (bo == NULL) {
 		return -ENOMEM;
@@ -154,13 +153,17 @@ int radeon_semaphore_sync_rings(struct radeon_device *rdev,
 				bool sync_to[RADEON_NUM_RINGS],
 				int dst_ring)
 {
-	int i, r;
+	int i = 0, r;
 
-	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
-		unsigned num_ops = i == dst_ring ? RADEON_NUM_RINGS : 1;
+	mutex_lock(&rdev->ring_lock);
+	r = radeon_ring_alloc(rdev, &rdev->ring[dst_ring], RADEON_NUM_RINGS * 8);
+	if (r) {
+		goto error;
+	}
 
-		/* don't lock unused rings */
-		if (!sync_to[i] && i != dst_ring)
+	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+		/* no need to sync to our own or unused rings */
+		if (!sync_to[i] || i == dst_ring)
 			continue;
 
 		/* prevent GPU deadlocks */
@@ -170,38 +173,31 @@ int radeon_semaphore_sync_rings(struct radeon_device *rdev,
 			goto error;
 		}
 
-                r = radeon_ring_lock(rdev, &rdev->ring[i], num_ops * 8);
-                if (r)
+		r = radeon_ring_alloc(rdev, &rdev->ring[i], 8);
+		if (r) {
 			goto error;
-	}
-
-	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
-		/* no need to sync to our own or unused rings */
-		if (!sync_to[i] || i == dst_ring)
-                        continue;
+		}
 
 		radeon_semaphore_emit_signal(rdev, i, semaphore);
 		radeon_semaphore_emit_wait(rdev, dst_ring, semaphore);
-	}
 
-	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
-
-		/* don't unlock unused rings */
-		if (!sync_to[i] && i != dst_ring)
-			continue;
-
-		radeon_ring_unlock_commit(rdev, &rdev->ring[i]);
+		radeon_ring_commit(rdev, &rdev->ring[i]);
 	}
 
+	radeon_ring_commit(rdev, &rdev->ring[dst_ring]);
+	mutex_unlock(&rdev->ring_lock);
+
 	return 0;
 
 error:
 	/* unlock all locks taken so far */
 	for (--i; i >= 0; --i) {
 		if (sync_to[i] || i == dst_ring) {
-			radeon_ring_unlock_undo(rdev, &rdev->ring[i]);
+			radeon_ring_undo(&rdev->ring[i]);
 		}
 	}
+	radeon_ring_undo(&rdev->ring[dst_ring]);
+	mutex_unlock(&rdev->ring_lock);
 	return r;
 }
 
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 02/18] drm/radeon: convert fence to uint64_t v3
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
  2012-05-04 22:43 ` [PATCH 01/18] drm/radeon: replace the per ring mutex with a global one j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 03/18] drm/radeon: rework fence handling, drop fence list v5 j.glisse
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Jerome Glisse, Christian König

From: Jerome Glisse <jglisse@redhat.com>

This convert fence to use uint64_t sequence number intention is
to use the fact that uin64_t is big enough that we don't need to
care about wrap around.

Tested with and without writeback using 0xFFFFF000 as initial
fence sequence and thus allowing to test the wrap around from
32bits to 64bits.

v2: Add comment about possible race btw CPU & GPU, add comment
    stressing that we need 2 dword aligned for R600_WB_EVENT_OFFSET
    Read fence sequenc in reverse order of GPU write them so we
    mitigate the race btw CPU and GPU.

v3: Drop the need for ring to emit the 64bits fence, and just have
    each ring emit the lower 32bits of the fence sequence. We
    handle the wrap over 32bits in fence_process.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon.h       |   39 ++++++-----
 drivers/gpu/drm/radeon/radeon_fence.c |  116 +++++++++++++++++++++++----------
 drivers/gpu/drm/radeon/radeon_ring.c  |    9 ++-
 3 files changed, 107 insertions(+), 57 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 71d8af8..4d9c433 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -100,28 +100,32 @@ extern int radeon_lockup_timeout;
  * Copy from radeon_drv.h so we don't have to include both and have conflicting
  * symbol;
  */
-#define RADEON_MAX_USEC_TIMEOUT		100000	/* 100 ms */
-#define RADEON_FENCE_JIFFIES_TIMEOUT	(HZ / 2)
+#define RADEON_MAX_USEC_TIMEOUT			100000	/* 100 ms */
+#define RADEON_FENCE_JIFFIES_TIMEOUT		(HZ / 2)
 /* RADEON_IB_POOL_SIZE must be a power of 2 */
-#define RADEON_IB_POOL_SIZE		16
-#define RADEON_DEBUGFS_MAX_COMPONENTS	32
-#define RADEONFB_CONN_LIMIT		4
-#define RADEON_BIOS_NUM_SCRATCH		8
+#define RADEON_IB_POOL_SIZE			16
+#define RADEON_DEBUGFS_MAX_COMPONENTS		32
+#define RADEONFB_CONN_LIMIT			4
+#define RADEON_BIOS_NUM_SCRATCH			8
 
 /* max number of rings */
-#define RADEON_NUM_RINGS 3
+#define RADEON_NUM_RINGS			3
+
+/* fence seq are set to this number when signaled */
+#define RADEON_FENCE_SIGNALED_SEQ		0LL
+#define RADEON_FENCE_NOTEMITED_SEQ		(~0LL)
 
 /* internal ring indices */
 /* r1xx+ has gfx CP ring */
-#define RADEON_RING_TYPE_GFX_INDEX  0
+#define RADEON_RING_TYPE_GFX_INDEX		0
 
 /* cayman has 2 compute CP rings */
-#define CAYMAN_RING_TYPE_CP1_INDEX 1
-#define CAYMAN_RING_TYPE_CP2_INDEX 2
+#define CAYMAN_RING_TYPE_CP1_INDEX		1
+#define CAYMAN_RING_TYPE_CP2_INDEX		2
 
 /* hardcode those limit for now */
-#define RADEON_VA_RESERVED_SIZE		(8 << 20)
-#define RADEON_IB_VM_MAX_SIZE		(64 << 10)
+#define RADEON_VA_RESERVED_SIZE			(8 << 20)
+#define RADEON_IB_VM_MAX_SIZE			(64 << 10)
 
 /*
  * Errata workarounds.
@@ -254,8 +258,9 @@ struct radeon_fence_driver {
 	uint32_t			scratch_reg;
 	uint64_t			gpu_addr;
 	volatile uint32_t		*cpu_addr;
-	atomic_t			seq;
-	uint32_t			last_seq;
+	/* seq is protected by ring emission lock */
+	uint64_t			seq;
+	atomic64_t			last_seq;
 	unsigned long			last_activity;
 	wait_queue_head_t		queue;
 	struct list_head		emitted;
@@ -268,11 +273,9 @@ struct radeon_fence {
 	struct kref			kref;
 	struct list_head		list;
 	/* protected by radeon_fence.lock */
-	uint32_t			seq;
-	bool				emitted;
-	bool				signaled;
+	uint64_t			seq;
 	/* RB, DMA, etc. */
-	int				ring;
+	unsigned			ring;
 	struct radeon_semaphore		*semaphore;
 };
 
diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 5bb78bf..0d3d226 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -66,14 +66,14 @@ int radeon_fence_emit(struct radeon_device *rdev, struct radeon_fence *fence)
 	unsigned long irq_flags;
 
 	write_lock_irqsave(&rdev->fence_lock, irq_flags);
-	if (fence->emitted) {
+	if (fence->seq && fence->seq < RADEON_FENCE_NOTEMITED_SEQ) {
 		write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 		return 0;
 	}
-	fence->seq = atomic_add_return(1, &rdev->fence_drv[fence->ring].seq);
+	/* we are protected by the ring emission mutex */
+	fence->seq = rdev->fence_drv[fence->ring].seq++;
 	radeon_fence_ring_emit(rdev, fence->ring, fence);
 	trace_radeon_fence_emit(rdev->ddev, fence->seq);
-	fence->emitted = true;
 	/* are we the first fence on a previusly idle ring? */
 	if (list_empty(&rdev->fence_drv[fence->ring].emitted)) {
 		rdev->fence_drv[fence->ring].last_activity = jiffies;
@@ -87,14 +87,59 @@ static bool radeon_fence_poll_locked(struct radeon_device *rdev, int ring)
 {
 	struct radeon_fence *fence;
 	struct list_head *i, *n;
-	uint32_t seq;
+	uint64_t seq, last_seq;
+	unsigned count_loop = 0;
 	bool wake = false;
 
-	seq = radeon_fence_read(rdev, ring);
-	if (seq == rdev->fence_drv[ring].last_seq)
-		return false;
+	/* Note there is a scenario here for an infinite loop but it's
+	 * very unlikely to happen. For it to happen, the current polling
+	 * process need to be interrupted by another process and another
+	 * process needs to update the last_seq btw the atomic read and
+	 * xchg of the current process.
+	 *
+	 * More over for this to go in infinite loop there need to be
+	 * continuously new fence signaled ie radeon_fence_read needs
+	 * to return a different value each time for both the currently
+	 * polling process and the other process that xchg the last_seq
+	 * btw atomic read and xchg of the current process. And the
+	 * value the other process set as last seq must be higher than
+	 * the seq value we just read. Which means that current process
+	 * need to be interrupted after radeon_fence_read and before
+	 * atomic xchg.
+	 *
+	 * To be even more safe we count the number of time we loop and
+	 * we bail after 10 loop just accepting the fact that we might
+	 * have temporarly set the last_seq not to the true real last
+	 * seq but to an older one.
+	 */
+	do {
+		last_seq = atomic64_read(&rdev->fence_drv[ring].last_seq);
+		seq = radeon_fence_read(rdev, ring);
+		seq |= last_seq & 0xffffffff00000000LL;
+		if (seq < last_seq) {
+			seq += 0x100000000LL;
+		}
 
-	rdev->fence_drv[ring].last_seq = seq;
+		if (!wake && seq == last_seq) {
+			return false;
+		}
+		/* If we loop over we don't want to return without
+		 * checking if a fence is signaled as it means that the
+		 * seq we just read is different from the previous on.
+		 */
+		wake = true;
+		if ((count_loop++) > 10) {
+			/* We looped over too many time leave with the
+			 * fact that we might have set an older fence
+			 * seq then the current real last seq as signaled
+			 * by the hw.
+			 */
+			break;
+		}
+	} while (atomic64_xchg(&rdev->fence_drv[ring].last_seq, seq) > seq);
+
+	/* reset wake to false */
+	wake = false;
 	rdev->fence_drv[ring].last_activity = jiffies;
 
 	n = NULL;
@@ -112,7 +157,7 @@ static bool radeon_fence_poll_locked(struct radeon_device *rdev, int ring)
 			n = i->prev;
 			list_move_tail(i, &rdev->fence_drv[ring].signaled);
 			fence = list_entry(i, struct radeon_fence, list);
-			fence->signaled = true;
+			fence->seq = RADEON_FENCE_SIGNALED_SEQ;
 			i = n;
 		} while (i != &rdev->fence_drv[ring].emitted);
 		wake = true;
@@ -128,7 +173,7 @@ static void radeon_fence_destroy(struct kref *kref)
 	fence = container_of(kref, struct radeon_fence, kref);
 	write_lock_irqsave(&fence->rdev->fence_lock, irq_flags);
 	list_del(&fence->list);
-	fence->emitted = false;
+	fence->seq = RADEON_FENCE_NOTEMITED_SEQ;
 	write_unlock_irqrestore(&fence->rdev->fence_lock, irq_flags);
 	if (fence->semaphore)
 		radeon_semaphore_free(fence->rdev, fence->semaphore);
@@ -145,9 +190,7 @@ int radeon_fence_create(struct radeon_device *rdev,
 	}
 	kref_init(&((*fence)->kref));
 	(*fence)->rdev = rdev;
-	(*fence)->emitted = false;
-	(*fence)->signaled = false;
-	(*fence)->seq = 0;
+	(*fence)->seq = RADEON_FENCE_NOTEMITED_SEQ;
 	(*fence)->ring = ring;
 	(*fence)->semaphore = NULL;
 	INIT_LIST_HEAD(&(*fence)->list);
@@ -163,18 +206,18 @@ bool radeon_fence_signaled(struct radeon_fence *fence)
 		return true;
 
 	write_lock_irqsave(&fence->rdev->fence_lock, irq_flags);
-	signaled = fence->signaled;
+	signaled = (fence->seq == RADEON_FENCE_SIGNALED_SEQ);
 	/* if we are shuting down report all fence as signaled */
 	if (fence->rdev->shutdown) {
 		signaled = true;
 	}
-	if (!fence->emitted) {
+	if (fence->seq == RADEON_FENCE_NOTEMITED_SEQ) {
 		WARN(1, "Querying an unemitted fence : %p !\n", fence);
 		signaled = true;
 	}
 	if (!signaled) {
 		radeon_fence_poll_locked(fence->rdev, fence->ring);
-		signaled = fence->signaled;
+		signaled = (fence->seq == RADEON_FENCE_SIGNALED_SEQ);
 	}
 	write_unlock_irqrestore(&fence->rdev->fence_lock, irq_flags);
 	return signaled;
@@ -183,8 +226,8 @@ bool radeon_fence_signaled(struct radeon_fence *fence)
 int radeon_fence_wait(struct radeon_fence *fence, bool intr)
 {
 	struct radeon_device *rdev;
-	unsigned long irq_flags, timeout;
-	u32 seq;
+	unsigned long irq_flags, timeout, last_activity;
+	uint64_t seq;
 	int i, r;
 	bool signaled;
 
@@ -207,7 +250,9 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr)
 			timeout = 1;
 		}
 		/* save current sequence value used to check for GPU lockups */
-		seq = rdev->fence_drv[fence->ring].last_seq;
+		seq = atomic64_read(&rdev->fence_drv[fence->ring].last_seq);
+		/* Save current last activity valuee, used to check for GPU lockups */
+		last_activity = rdev->fence_drv[fence->ring].last_activity;
 		read_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 
 		trace_radeon_fence_wait_begin(rdev->ddev, seq);
@@ -235,24 +280,23 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr)
 			}
 
 			write_lock_irqsave(&rdev->fence_lock, irq_flags);
-			/* check if sequence value has changed since last_activity */
-			if (seq != rdev->fence_drv[fence->ring].last_seq) {
+			/* test if somebody else has already decided that this is a lockup */
+			if (last_activity != rdev->fence_drv[fence->ring].last_activity) {
 				write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 				continue;
 			}
 
-			/* change sequence value on all rings, so nobody else things there is a lockup */
-			for (i = 0; i < RADEON_NUM_RINGS; ++i)
-				rdev->fence_drv[i].last_seq -= 0x10000;
-
-			rdev->fence_drv[fence->ring].last_activity = jiffies;
 			write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 
 			if (radeon_ring_is_lockup(rdev, fence->ring, &rdev->ring[fence->ring])) {
-
 				/* good news we believe it's a lockup */
-				printk(KERN_WARNING "GPU lockup (waiting for 0x%08X last fence id 0x%08X)\n",
-				     fence->seq, seq);
+				dev_warn(rdev->dev, "GPU lockup (waiting for 0x%016llx last fence id 0x%016llx)\n",
+					 fence->seq, seq);
+
+				/* change last activity so nobody else think there is a lockup */
+				for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+					rdev->fence_drv[i].last_activity = jiffies;
+				}
 
 				/* mark the ring as not ready any more */
 				rdev->ring[fence->ring].ready = false;
@@ -387,9 +431,9 @@ int radeon_fence_driver_start_ring(struct radeon_device *rdev, int ring)
 	}
 	rdev->fence_drv[ring].cpu_addr = &rdev->wb.wb[index/4];
 	rdev->fence_drv[ring].gpu_addr = rdev->wb.gpu_addr + index;
-	radeon_fence_write(rdev, atomic_read(&rdev->fence_drv[ring].seq), ring);
+	radeon_fence_write(rdev, rdev->fence_drv[ring].seq, ring);
 	rdev->fence_drv[ring].initialized = true;
-	DRM_INFO("fence driver on ring %d use gpu addr 0x%08Lx and cpu addr 0x%p\n",
+	DRM_INFO("fence driver on ring %d use gpu addr 0x%016llx and cpu addr 0x%p\n",
 		 ring, rdev->fence_drv[ring].gpu_addr, rdev->fence_drv[ring].cpu_addr);
 	write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 	return 0;
@@ -400,7 +444,9 @@ static void radeon_fence_driver_init_ring(struct radeon_device *rdev, int ring)
 	rdev->fence_drv[ring].scratch_reg = -1;
 	rdev->fence_drv[ring].cpu_addr = NULL;
 	rdev->fence_drv[ring].gpu_addr = 0;
-	atomic_set(&rdev->fence_drv[ring].seq, 0);
+	/* we start at one */
+	rdev->fence_drv[ring].seq = 1;
+	atomic64_set(&rdev->fence_drv[ring].last_seq, 0);
 	INIT_LIST_HEAD(&rdev->fence_drv[ring].emitted);
 	INIT_LIST_HEAD(&rdev->fence_drv[ring].signaled);
 	init_waitqueue_head(&rdev->fence_drv[ring].queue);
@@ -458,12 +504,12 @@ static int radeon_debugfs_fence_info(struct seq_file *m, void *data)
 			continue;
 
 		seq_printf(m, "--- ring %d ---\n", i);
-		seq_printf(m, "Last signaled fence 0x%08X\n",
-			   radeon_fence_read(rdev, i));
+		seq_printf(m, "Last signaled fence 0x%016lx\n",
+			   atomic64_read(&rdev->fence_drv[i].last_seq));
 		if (!list_empty(&rdev->fence_drv[i].emitted)) {
 			fence = list_entry(rdev->fence_drv[i].emitted.prev,
 					   struct radeon_fence, list);
-			seq_printf(m, "Last emitted fence %p with 0x%08X\n",
+			seq_printf(m, "Last emitted fence %p with 0x%016llx\n",
 				   fence,  fence->seq);
 		}
 	}
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index a4d60ae..4ae222b 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -82,7 +82,7 @@ bool radeon_ib_try_free(struct radeon_device *rdev, struct radeon_ib *ib)
 	bool done = false;
 
 	/* only free ib which have been emited */
-	if (ib->fence && ib->fence->emitted) {
+	if (ib->fence && ib->fence->seq < RADEON_FENCE_NOTEMITED_SEQ) {
 		if (radeon_fence_signaled(ib->fence)) {
 			radeon_fence_unref(&ib->fence);
 			radeon_sa_bo_free(rdev, &ib->sa_bo);
@@ -149,8 +149,9 @@ retry:
 	/* this should be rare event, ie all ib scheduled none signaled yet.
 	 */
 	for (i = 0; i < RADEON_IB_POOL_SIZE; i++) {
-		if (rdev->ib_pool.ibs[idx].fence && rdev->ib_pool.ibs[idx].fence->emitted) {
-			r = radeon_fence_wait(rdev->ib_pool.ibs[idx].fence, false);
+		struct radeon_fence *fence = rdev->ib_pool.ibs[idx].fence;
+		if (fence && fence->seq < RADEON_FENCE_NOTEMITED_SEQ) {
+			r = radeon_fence_wait(fence, false);
 			if (!r) {
 				goto retry;
 			}
@@ -173,7 +174,7 @@ void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib)
 		return;
 	}
 	radeon_mutex_lock(&rdev->ib_pool.mutex);
-	if (tmp->fence && !tmp->fence->emitted) {
+	if (tmp->fence && tmp->fence->seq == RADEON_FENCE_NOTEMITED_SEQ) {
 		radeon_sa_bo_free(rdev, &tmp->sa_bo);
 		radeon_fence_unref(&tmp->fence);
 	}
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 03/18] drm/radeon: rework fence handling, drop fence list v5
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
  2012-05-04 22:43 ` [PATCH 01/18] drm/radeon: replace the per ring mutex with a global one j.glisse
  2012-05-04 22:43 ` [PATCH 02/18] drm/radeon: convert fence to uint64_t v3 j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 04/18] drm/radeon: hold ring emission mutex for lockup detection j.glisse
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Jerome Glisse, Christian König

From: Jerome Glisse <jglisse@redhat.com>

Using 64bits fence sequence we can directly compare sequence
number to know if a fence is signaled or not. Thus the fence
list became useless, so does the fence lock that mainly
protected the fence list.

Things like ring.ready are no longer behind a lock, this should
be ok as ring.ready is initialized once and will only change
when facing lockup. Worst case is that we return an -EBUSY just
after a successfull GPU reset, or we go into wait state instead
of returning -EBUSY (thus delaying reporting -EBUSY to fence
wait caller).

v2: Remove left over comment, force using writeback on cayman and
    newer, thus not having to suffer from possibly scratch reg
    exhaustion
v3: Rebase on top of change to uint64 fence patch
v4: Change DCE5 test to force write back on cayman and newer but
    also any APU such as PALM or SUMO family
v5: Rebase on top of new uint64 fence patch

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon.h        |    6 +-
 drivers/gpu/drm/radeon/radeon_device.c |    8 +-
 drivers/gpu/drm/radeon/radeon_fence.c  |  287 +++++++++++++-------------------
 3 files changed, 123 insertions(+), 178 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 4d9c433..2928139 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -263,15 +263,12 @@ struct radeon_fence_driver {
 	atomic64_t			last_seq;
 	unsigned long			last_activity;
 	wait_queue_head_t		queue;
-	struct list_head		emitted;
-	struct list_head		signaled;
 	bool				initialized;
 };
 
 struct radeon_fence {
 	struct radeon_device		*rdev;
 	struct kref			kref;
-	struct list_head		list;
 	/* protected by radeon_fence.lock */
 	uint64_t			seq;
 	/* RB, DMA, etc. */
@@ -291,7 +288,7 @@ int radeon_fence_wait_next(struct radeon_device *rdev, int ring);
 int radeon_fence_wait_empty(struct radeon_device *rdev, int ring);
 struct radeon_fence *radeon_fence_ref(struct radeon_fence *fence);
 void radeon_fence_unref(struct radeon_fence **fence);
-int radeon_fence_count_emitted(struct radeon_device *rdev, int ring);
+unsigned radeon_fence_count_emitted(struct radeon_device *rdev, int ring);
 
 /*
  * Tiling registers
@@ -1516,7 +1513,6 @@ struct radeon_device {
 	struct radeon_mode_info		mode_info;
 	struct radeon_scratch		scratch;
 	struct radeon_mman		mman;
-	rwlock_t			fence_lock;
 	struct radeon_fence_driver	fence_drv[RADEON_NUM_RINGS];
 	struct radeon_semaphore_driver	semaphore_drv;
 	struct mutex			ring_lock;
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index ad3a7fb..cb4f9c2 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -225,9 +225,9 @@ int radeon_wb_init(struct radeon_device *rdev)
 	/* disable event_write fences */
 	rdev->wb.use_event = false;
 	/* disabled via module param */
-	if (radeon_no_wb == 1)
+	if (radeon_no_wb == 1) {
 		rdev->wb.enabled = false;
-	else {
+	} else {
 		if (rdev->flags & RADEON_IS_AGP) {
 			/* often unreliable on AGP */
 			rdev->wb.enabled = false;
@@ -237,8 +237,9 @@ int radeon_wb_init(struct radeon_device *rdev)
 		} else {
 			rdev->wb.enabled = true;
 			/* event_write fences are only available on r600+ */
-			if (rdev->family >= CHIP_R600)
+			if (rdev->family >= CHIP_R600) {
 				rdev->wb.use_event = true;
+			}
 		}
 	}
 	/* always use writeback/events on NI, APUs */
@@ -731,7 +732,6 @@ int radeon_device_init(struct radeon_device *rdev,
 	mutex_init(&rdev->gem.mutex);
 	mutex_init(&rdev->pm.mutex);
 	mutex_init(&rdev->vram_mutex);
-	rwlock_init(&rdev->fence_lock);
 	rwlock_init(&rdev->semaphore_drv.lock);
 	INIT_LIST_HEAD(&rdev->gem.objects);
 	init_waitqueue_head(&rdev->irq.vblank_queue);
diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 0d3d226..1610601 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -63,30 +63,18 @@ static u32 radeon_fence_read(struct radeon_device *rdev, int ring)
 
 int radeon_fence_emit(struct radeon_device *rdev, struct radeon_fence *fence)
 {
-	unsigned long irq_flags;
-
-	write_lock_irqsave(&rdev->fence_lock, irq_flags);
+	/* we are protected by the ring emission mutex */
 	if (fence->seq && fence->seq < RADEON_FENCE_NOTEMITED_SEQ) {
-		write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 		return 0;
 	}
-	/* we are protected by the ring emission mutex */
 	fence->seq = rdev->fence_drv[fence->ring].seq++;
 	radeon_fence_ring_emit(rdev, fence->ring, fence);
 	trace_radeon_fence_emit(rdev->ddev, fence->seq);
-	/* are we the first fence on a previusly idle ring? */
-	if (list_empty(&rdev->fence_drv[fence->ring].emitted)) {
-		rdev->fence_drv[fence->ring].last_activity = jiffies;
-	}
-	list_move_tail(&fence->list, &rdev->fence_drv[fence->ring].emitted);
-	write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 	return 0;
 }
 
-static bool radeon_fence_poll_locked(struct radeon_device *rdev, int ring)
+static bool radeon_fence_poll(struct radeon_device *rdev, unsigned ring)
 {
-	struct radeon_fence *fence;
-	struct list_head *i, *n;
 	uint64_t seq, last_seq;
 	unsigned count_loop = 0;
 	bool wake = false;
@@ -138,43 +126,17 @@ static bool radeon_fence_poll_locked(struct radeon_device *rdev, int ring)
 		}
 	} while (atomic64_xchg(&rdev->fence_drv[ring].last_seq, seq) > seq);
 
-	/* reset wake to false */
-	wake = false;
 	rdev->fence_drv[ring].last_activity = jiffies;
 
-	n = NULL;
-	list_for_each(i, &rdev->fence_drv[ring].emitted) {
-		fence = list_entry(i, struct radeon_fence, list);
-		if (fence->seq == seq) {
-			n = i;
-			break;
-		}
-	}
-	/* all fence previous to this one are considered as signaled */
-	if (n) {
-		i = n;
-		do {
-			n = i->prev;
-			list_move_tail(i, &rdev->fence_drv[ring].signaled);
-			fence = list_entry(i, struct radeon_fence, list);
-			fence->seq = RADEON_FENCE_SIGNALED_SEQ;
-			i = n;
-		} while (i != &rdev->fence_drv[ring].emitted);
-		wake = true;
-	}
-	return wake;
+	return true;
 }
 
 static void radeon_fence_destroy(struct kref *kref)
 {
-	unsigned long irq_flags;
-        struct radeon_fence *fence;
+	struct radeon_fence *fence;
 
 	fence = container_of(kref, struct radeon_fence, kref);
-	write_lock_irqsave(&fence->rdev->fence_lock, irq_flags);
-	list_del(&fence->list);
 	fence->seq = RADEON_FENCE_NOTEMITED_SEQ;
-	write_unlock_irqrestore(&fence->rdev->fence_lock, irq_flags);
 	if (fence->semaphore)
 		radeon_semaphore_free(fence->rdev, fence->semaphore);
 	kfree(fence);
@@ -193,80 +155,93 @@ int radeon_fence_create(struct radeon_device *rdev,
 	(*fence)->seq = RADEON_FENCE_NOTEMITED_SEQ;
 	(*fence)->ring = ring;
 	(*fence)->semaphore = NULL;
-	INIT_LIST_HEAD(&(*fence)->list);
 	return 0;
 }
 
 bool radeon_fence_signaled(struct radeon_fence *fence)
 {
-	unsigned long irq_flags;
-	bool signaled = false;
+	struct radeon_device *rdev;
+	unsigned i;
+	bool signaled;
 
-	if (!fence)
+	if (!fence) {
 		return true;
-
-	write_lock_irqsave(&fence->rdev->fence_lock, irq_flags);
-	signaled = (fence->seq == RADEON_FENCE_SIGNALED_SEQ);
-	/* if we are shuting down report all fence as signaled */
-	if (fence->rdev->shutdown) {
-		signaled = true;
 	}
+
+	rdev = fence->rdev;
 	if (fence->seq == RADEON_FENCE_NOTEMITED_SEQ) {
 		WARN(1, "Querying an unemitted fence : %p !\n", fence);
 		signaled = true;
 	}
-	if (!signaled) {
-		radeon_fence_poll_locked(fence->rdev, fence->ring);
-		signaled = (fence->seq == RADEON_FENCE_SIGNALED_SEQ);
+	signaled = (fence->seq == RADEON_FENCE_SIGNALED_SEQ);
+	if (signaled) {
+		return true;
+	}
+	/* poll new last sequence at least once */
+	for (i = 0; i < 2; i++) {
+		if (atomic64_read(&rdev->fence_drv[fence->ring].last_seq) >= fence->seq) {
+			fence->seq = RADEON_FENCE_SIGNALED_SEQ;
+			return true;
+		}
+		radeon_fence_poll(fence->rdev, fence->ring);
 	}
-	write_unlock_irqrestore(&fence->rdev->fence_lock, irq_flags);
-	return signaled;
+	return false;
 }
 
-int radeon_fence_wait(struct radeon_fence *fence, bool intr)
+bool radeon_seq_signaled(struct radeon_device *rdev, u64 seq, unsigned ring)
 {
-	struct radeon_device *rdev;
-	unsigned long irq_flags, timeout, last_activity;
+	unsigned i;
+
+	/* poll new last sequence at least once */
+	for (i = 0; i < 2; i++) {
+		if (atomic64_read(&rdev->fence_drv[ring].last_seq) >= seq) {
+			return true;
+		}
+		radeon_fence_poll(rdev, ring);
+	}
+	return false;
+}
+
+static int radeon_wait_seq(struct radeon_device *rdev, u64 target_seq,
+			   unsigned ring, bool intr)
+{
+	unsigned long timeout, last_activity;
 	uint64_t seq;
-	int i, r;
+	unsigned i;
 	bool signaled;
+	int r;
 
-	if (fence == NULL) {
-		WARN(1, "Querying an invalid fence : %p !\n", fence);
-		return -EINVAL;
-	}
+	while (target_seq > atomic64_read(&rdev->fence_drv[ring].last_seq)) {
+		if (!rdev->ring[ring].ready) {
+			return -EBUSY;
+		}
 
-	rdev = fence->rdev;
-	signaled = radeon_fence_signaled(fence);
-	while (!signaled) {
-		read_lock_irqsave(&rdev->fence_lock, irq_flags);
 		timeout = jiffies - RADEON_FENCE_JIFFIES_TIMEOUT;
-		if (time_after(rdev->fence_drv[fence->ring].last_activity, timeout)) {
+		if (time_after(rdev->fence_drv[ring].last_activity, timeout)) {
 			/* the normal case, timeout is somewhere before last_activity */
-			timeout = rdev->fence_drv[fence->ring].last_activity - timeout;
+			timeout = rdev->fence_drv[ring].last_activity - timeout;
 		} else {
 			/* either jiffies wrapped around, or no fence was signaled in the last 500ms
-			 * anyway we will just wait for the minimum amount and then check for a lockup */
+			 * anyway we will just wait for the minimum amount and then check for a lockup
+			 */
 			timeout = 1;
 		}
-		/* save current sequence value used to check for GPU lockups */
-		seq = atomic64_read(&rdev->fence_drv[fence->ring].last_seq);
+		seq = atomic64_read(&rdev->fence_drv[ring].last_seq);
 		/* Save current last activity valuee, used to check for GPU lockups */
-		last_activity = rdev->fence_drv[fence->ring].last_activity;
-		read_unlock_irqrestore(&rdev->fence_lock, irq_flags);
+		last_activity = rdev->fence_drv[ring].last_activity;
 
 		trace_radeon_fence_wait_begin(rdev->ddev, seq);
-		radeon_irq_kms_sw_irq_get(rdev, fence->ring);
+		radeon_irq_kms_sw_irq_get(rdev, ring);
 		if (intr) {
-			r = wait_event_interruptible_timeout(
-				rdev->fence_drv[fence->ring].queue,
-				(signaled = radeon_fence_signaled(fence)), timeout);
+			r = wait_event_interruptible_timeout(rdev->fence_drv[ring].queue,
+							     (signaled = radeon_seq_signaled(rdev, target_seq, ring)),
+							     timeout);
 		} else {
-			r = wait_event_timeout(
-				rdev->fence_drv[fence->ring].queue,
-				(signaled = radeon_fence_signaled(fence)), timeout);
+			r = wait_event_timeout(rdev->fence_drv[ring].queue,
+					       (signaled = radeon_seq_signaled(rdev, target_seq, ring)),
+					       timeout);
 		}
-		radeon_irq_kms_sw_irq_put(rdev, fence->ring);
+		radeon_irq_kms_sw_irq_put(rdev, ring);
 		if (unlikely(r < 0)) {
 			return r;
 		}
@@ -279,19 +254,24 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr)
 				continue;
 			}
 
-			write_lock_irqsave(&rdev->fence_lock, irq_flags);
+			/* check if sequence value has changed since last_activity */
+			if (seq != atomic64_read(&rdev->fence_drv[ring].last_seq)) {
+				continue;
+			}
 			/* test if somebody else has already decided that this is a lockup */
-			if (last_activity != rdev->fence_drv[fence->ring].last_activity) {
-				write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
+			if (last_activity != rdev->fence_drv[ring].last_activity) {
 				continue;
 			}
 
-			write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
-
-			if (radeon_ring_is_lockup(rdev, fence->ring, &rdev->ring[fence->ring])) {
+			if (radeon_ring_is_lockup(rdev, ring, &rdev->ring[ring])) {
 				/* good news we believe it's a lockup */
 				dev_warn(rdev->dev, "GPU lockup (waiting for 0x%016llx last fence id 0x%016llx)\n",
-					 fence->seq, seq);
+					 target_seq, seq);
+
+				/* change last activity so nobody else think there is a lockup */
+				for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+					rdev->fence_drv[i].last_activity = jiffies;
+				}
 
 				/* change last activity so nobody else think there is a lockup */
 				for (i = 0; i < RADEON_NUM_RINGS; ++i) {
@@ -299,7 +279,7 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr)
 				}
 
 				/* mark the ring as not ready any more */
-				rdev->ring[fence->ring].ready = false;
+				rdev->ring[ring].ready = false;
 				return -EDEADLK;
 			}
 		}
@@ -307,52 +287,47 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr)
 	return 0;
 }
 
-int radeon_fence_wait_next(struct radeon_device *rdev, int ring)
+int radeon_fence_wait(struct radeon_fence *fence, bool intr)
 {
-	unsigned long irq_flags;
-	struct radeon_fence *fence;
 	int r;
 
-	write_lock_irqsave(&rdev->fence_lock, irq_flags);
-	if (!rdev->ring[ring].ready) {
-		write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
-		return -EBUSY;
+	if (fence == NULL) {
+		WARN(1, "Querying an invalid fence : %p !\n", fence);
+		return -EINVAL;
 	}
-	if (list_empty(&rdev->fence_drv[ring].emitted)) {
-		write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
-		return -ENOENT;
+
+	r = radeon_wait_seq(fence->rdev, fence->seq, fence->ring, intr);
+	if (r) {
+		return r;
 	}
-	fence = list_entry(rdev->fence_drv[ring].emitted.next,
-			   struct radeon_fence, list);
-	radeon_fence_ref(fence);
-	write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
-	r = radeon_fence_wait(fence, false);
-	radeon_fence_unref(&fence);
-	return r;
+	fence->seq = RADEON_FENCE_SIGNALED_SEQ;
+	return 0;
 }
 
-int radeon_fence_wait_empty(struct radeon_device *rdev, int ring)
+int radeon_fence_wait_next(struct radeon_device *rdev, int ring)
 {
-	unsigned long irq_flags;
-	struct radeon_fence *fence;
-	int r;
+	uint64_t seq;
 
-	write_lock_irqsave(&rdev->fence_lock, irq_flags);
-	if (!rdev->ring[ring].ready) {
-		write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
-		return -EBUSY;
-	}
-	if (list_empty(&rdev->fence_drv[ring].emitted)) {
-		write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
+	/* We are not protected by ring lock when reading current seq but
+	 * it's ok as worst case is we return to early while we could have
+	 * wait.
+	 */
+	seq = atomic64_read(&rdev->fence_drv[ring].last_seq) + 1ULL;
+	if (seq >= rdev->fence_drv[ring].seq) {
+		/* nothing to wait for, last_seq is already the last emited fence */
 		return 0;
 	}
-	fence = list_entry(rdev->fence_drv[ring].emitted.prev,
-			   struct radeon_fence, list);
-	radeon_fence_ref(fence);
-	write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
-	r = radeon_fence_wait(fence, false);
-	radeon_fence_unref(&fence);
-	return r;
+	return radeon_wait_seq(rdev, seq, ring, false);
+}
+
+int radeon_fence_wait_empty(struct radeon_device *rdev, int ring)
+{
+	/* We are not protected by ring lock when reading current seq
+	 * but it's ok as wait empty is call from place where no more
+	 * activity can be scheduled so there won't be concurrent access
+	 * to seq value.
+	 */
+	return radeon_wait_seq(rdev, rdev->fence_drv[ring].seq, ring, false);
 }
 
 struct radeon_fence *radeon_fence_ref(struct radeon_fence *fence)
@@ -373,47 +348,35 @@ void radeon_fence_unref(struct radeon_fence **fence)
 
 void radeon_fence_process(struct radeon_device *rdev, int ring)
 {
-	unsigned long irq_flags;
 	bool wake;
 
-	write_lock_irqsave(&rdev->fence_lock, irq_flags);
-	wake = radeon_fence_poll_locked(rdev, ring);
-	write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
+	wake = radeon_fence_poll(rdev, ring);
 	if (wake) {
 		wake_up_all(&rdev->fence_drv[ring].queue);
 	}
 }
 
-int radeon_fence_count_emitted(struct radeon_device *rdev, int ring)
+unsigned radeon_fence_count_emitted(struct radeon_device *rdev, int ring)
 {
-	unsigned long irq_flags;
-	int not_processed = 0;
+	uint64_t emitted;
 
-	read_lock_irqsave(&rdev->fence_lock, irq_flags);
-	if (!rdev->fence_drv[ring].initialized) {
-		read_unlock_irqrestore(&rdev->fence_lock, irq_flags);
-		return 0;
-	}
-
-	if (!list_empty(&rdev->fence_drv[ring].emitted)) {
-		struct list_head *ptr;
-		list_for_each(ptr, &rdev->fence_drv[ring].emitted) {
-			/* count up to 3, that's enought info */
-			if (++not_processed >= 3)
-				break;
-		}
+	radeon_fence_poll(rdev, ring);
+	/* We are not protected by ring lock when reading the last sequence
+	 * but it's ok to report slightly wrong fence count here.
+	 */
+	emitted = rdev->fence_drv[ring].seq - atomic64_read(&rdev->fence_drv[ring].last_seq);
+	/* to avoid 32bits warp around */
+	if (emitted > 0x10000000) {
+		emitted = 0x10000000;
 	}
-	read_unlock_irqrestore(&rdev->fence_lock, irq_flags);
-	return not_processed;
+	return (unsigned)emitted;
 }
 
 int radeon_fence_driver_start_ring(struct radeon_device *rdev, int ring)
 {
-	unsigned long irq_flags;
 	uint64_t index;
 	int r;
 
-	write_lock_irqsave(&rdev->fence_lock, irq_flags);
 	radeon_scratch_free(rdev, rdev->fence_drv[ring].scratch_reg);
 	if (rdev->wb.use_event) {
 		rdev->fence_drv[ring].scratch_reg = 0;
@@ -422,7 +385,6 @@ int radeon_fence_driver_start_ring(struct radeon_device *rdev, int ring)
 		r = radeon_scratch_get(rdev, &rdev->fence_drv[ring].scratch_reg);
 		if (r) {
 			dev_err(rdev->dev, "fence failed to get scratch register\n");
-			write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 			return r;
 		}
 		index = RADEON_WB_SCRATCH_OFFSET +
@@ -433,9 +395,8 @@ int radeon_fence_driver_start_ring(struct radeon_device *rdev, int ring)
 	rdev->fence_drv[ring].gpu_addr = rdev->wb.gpu_addr + index;
 	radeon_fence_write(rdev, rdev->fence_drv[ring].seq, ring);
 	rdev->fence_drv[ring].initialized = true;
-	DRM_INFO("fence driver on ring %d use gpu addr 0x%016llx and cpu addr 0x%p\n",
+	dev_info(rdev->dev, "fence driver on ring %d use gpu addr 0x%016llx and cpu addr 0x%p\n",
 		 ring, rdev->fence_drv[ring].gpu_addr, rdev->fence_drv[ring].cpu_addr);
-	write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 	return 0;
 }
 
@@ -447,22 +408,18 @@ static void radeon_fence_driver_init_ring(struct radeon_device *rdev, int ring)
 	/* we start at one */
 	rdev->fence_drv[ring].seq = 1;
 	atomic64_set(&rdev->fence_drv[ring].last_seq, 0);
-	INIT_LIST_HEAD(&rdev->fence_drv[ring].emitted);
-	INIT_LIST_HEAD(&rdev->fence_drv[ring].signaled);
+	rdev->fence_drv[ring].last_activity = jiffies;
 	init_waitqueue_head(&rdev->fence_drv[ring].queue);
 	rdev->fence_drv[ring].initialized = false;
 }
 
 int radeon_fence_driver_init(struct radeon_device *rdev)
 {
-	unsigned long irq_flags;
 	int ring;
 
-	write_lock_irqsave(&rdev->fence_lock, irq_flags);
 	for (ring = 0; ring < RADEON_NUM_RINGS; ring++) {
 		radeon_fence_driver_init_ring(rdev, ring);
 	}
-	write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 	if (radeon_debugfs_fence_init(rdev)) {
 		dev_err(rdev->dev, "fence debugfs file creation failed\n");
 	}
@@ -471,7 +428,6 @@ int radeon_fence_driver_init(struct radeon_device *rdev)
 
 void radeon_fence_driver_fini(struct radeon_device *rdev)
 {
-	unsigned long irq_flags;
 	int ring;
 
 	for (ring = 0; ring < RADEON_NUM_RINGS; ring++) {
@@ -479,9 +435,7 @@ void radeon_fence_driver_fini(struct radeon_device *rdev)
 			continue;
 		radeon_fence_wait_empty(rdev, ring);
 		wake_up_all(&rdev->fence_drv[ring].queue);
-		write_lock_irqsave(&rdev->fence_lock, irq_flags);
 		radeon_scratch_free(rdev, rdev->fence_drv[ring].scratch_reg);
-		write_unlock_irqrestore(&rdev->fence_lock, irq_flags);
 		rdev->fence_drv[ring].initialized = false;
 	}
 }
@@ -496,7 +450,6 @@ static int radeon_debugfs_fence_info(struct seq_file *m, void *data)
 	struct drm_info_node *node = (struct drm_info_node *)m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct radeon_device *rdev = dev->dev_private;
-	struct radeon_fence *fence;
 	int i;
 
 	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
@@ -506,12 +459,8 @@ static int radeon_debugfs_fence_info(struct seq_file *m, void *data)
 		seq_printf(m, "--- ring %d ---\n", i);
 		seq_printf(m, "Last signaled fence 0x%016lx\n",
 			   atomic64_read(&rdev->fence_drv[i].last_seq));
-		if (!list_empty(&rdev->fence_drv[i].emitted)) {
-			fence = list_entry(rdev->fence_drv[i].emitted.prev,
-					   struct radeon_fence, list);
-			seq_printf(m, "Last emitted fence %p with 0x%016llx\n",
-				   fence,  fence->seq);
-		}
+		seq_printf(m, "Last emitted  0x%016llx\n",
+			   rdev->fence_drv[i].seq);
 	}
 	return 0;
 }
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 04/18] drm/radeon: hold ring emission mutex for lockup detection
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (2 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 03/18] drm/radeon: rework fence handling, drop fence list v5 j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 05/18] drm/radeon: use inline functions to calc sa_bo addr j.glisse
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König

From: Christian König <deathsimple@vodafone.de>

We are locking the ring emission mutex anyway, so
there is no harm in doing it a bit earlier and
prevent multiple resets to happen at the same time.

Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon_fence.c |   10 +++++-----
 drivers/gpu/drm/radeon/radeon_ring.c  |    2 --
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 1610601..99c31b2 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -258,8 +258,11 @@ static int radeon_wait_seq(struct radeon_device *rdev, u64 target_seq,
 			if (seq != atomic64_read(&rdev->fence_drv[ring].last_seq)) {
 				continue;
 			}
+
+			mutex_lock(&rdev->ring_lock);
 			/* test if somebody else has already decided that this is a lockup */
 			if (last_activity != rdev->fence_drv[ring].last_activity) {
+				mutex_unlock(&rdev->ring_lock);
 				continue;
 			}
 
@@ -273,15 +276,12 @@ static int radeon_wait_seq(struct radeon_device *rdev, u64 target_seq,
 					rdev->fence_drv[i].last_activity = jiffies;
 				}
 
-				/* change last activity so nobody else think there is a lockup */
-				for (i = 0; i < RADEON_NUM_RINGS; ++i) {
-					rdev->fence_drv[i].last_activity = jiffies;
-				}
-
 				/* mark the ring as not ready any more */
 				rdev->ring[ring].ready = false;
+				mutex_unlock(&rdev->ring_lock);
 				return -EDEADLK;
 			}
+			mutex_unlock(&rdev->ring_lock);
 		}
 	}
 	return 0;
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 4ae222b..8b7f00e 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -408,7 +408,6 @@ void radeon_ring_force_activity(struct radeon_device *rdev, struct radeon_ring *
 {
 	int r;
 
-	mutex_lock(&rdev->ring_lock);
 	radeon_ring_free_size(rdev, ring);
 	if (ring->rptr == ring->wptr) {
 		r = radeon_ring_alloc(rdev, ring, 1);
@@ -417,7 +416,6 @@ void radeon_ring_force_activity(struct radeon_device *rdev, struct radeon_ring *
 			radeon_ring_commit(rdev, ring);
 		}
 	}
-	mutex_unlock(&rdev->ring_lock);
 }
 
 void radeon_ring_lockup_update(struct radeon_ring *ring)
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 05/18] drm/radeon: use inline functions to calc sa_bo addr
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (3 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 04/18] drm/radeon: hold ring emission mutex for lockup detection j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 06/18] drm/radeon: add proper locking to the SA v3 j.glisse
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König

From: Christian König <deathsimple@vodafone.de>

Instead of hacking the calculation multiple times.

Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon_gart.c      |    6 ++----
 drivers/gpu/drm/radeon/radeon_object.h    |   11 +++++++++++
 drivers/gpu/drm/radeon/radeon_ring.c      |    6 ++----
 drivers/gpu/drm/radeon/radeon_semaphore.c |    6 ++----
 4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c
index c58a036..4a5d9d4 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -404,10 +404,8 @@ retry:
 		radeon_vm_unbind(rdev, vm_evict);
 		goto retry;
 	}
-	vm->pt = rdev->vm_manager.sa_manager.cpu_ptr;
-	vm->pt += (vm->sa_bo.offset >> 3);
-	vm->pt_gpu_addr = rdev->vm_manager.sa_manager.gpu_addr;
-	vm->pt_gpu_addr += vm->sa_bo.offset;
+	vm->pt = radeon_sa_bo_cpu_addr(&vm->sa_bo);
+	vm->pt_gpu_addr = radeon_sa_bo_gpu_addr(&vm->sa_bo);
 	memset(vm->pt, 0, RADEON_GPU_PAGE_ALIGN(vm->last_pfn * 8));
 
 retry_id:
diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h
index f9104be..c120ab9 100644
--- a/drivers/gpu/drm/radeon/radeon_object.h
+++ b/drivers/gpu/drm/radeon/radeon_object.h
@@ -146,6 +146,17 @@ extern struct radeon_bo_va *radeon_bo_va(struct radeon_bo *rbo,
 /*
  * sub allocation
  */
+
+static inline uint64_t radeon_sa_bo_gpu_addr(struct radeon_sa_bo *sa_bo)
+{
+	return sa_bo->manager->gpu_addr + sa_bo->offset;
+}
+
+static inline void * radeon_sa_bo_cpu_addr(struct radeon_sa_bo *sa_bo)
+{
+	return sa_bo->manager->cpu_ptr + sa_bo->offset;
+}
+
 extern int radeon_sa_bo_manager_init(struct radeon_device *rdev,
 				     struct radeon_sa_manager *sa_manager,
 				     unsigned size, u32 domain);
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 8b7f00e..b4c8a67 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -127,10 +127,8 @@ retry:
 					     size, 256);
 			if (!r) {
 				*ib = &rdev->ib_pool.ibs[idx];
-				(*ib)->ptr = rdev->ib_pool.sa_manager.cpu_ptr;
-				(*ib)->ptr += ((*ib)->sa_bo.offset >> 2);
-				(*ib)->gpu_addr = rdev->ib_pool.sa_manager.gpu_addr;
-				(*ib)->gpu_addr += (*ib)->sa_bo.offset;
+				(*ib)->ptr = radeon_sa_bo_cpu_addr(&(*ib)->sa_bo);
+				(*ib)->gpu_addr = radeon_sa_bo_gpu_addr(&(*ib)->sa_bo);
 				(*ib)->fence = fence;
 				(*ib)->vm_id = 0;
 				(*ib)->is_const_ib = false;
diff --git a/drivers/gpu/drm/radeon/radeon_semaphore.c b/drivers/gpu/drm/radeon/radeon_semaphore.c
index c5b3d8e..f312ba5 100644
--- a/drivers/gpu/drm/radeon/radeon_semaphore.c
+++ b/drivers/gpu/drm/radeon/radeon_semaphore.c
@@ -53,10 +53,8 @@ static int radeon_semaphore_add_bo(struct radeon_device *rdev)
 		kfree(bo);
 		return r;
 	}
-	gpu_addr = rdev->ib_pool.sa_manager.gpu_addr;
-	gpu_addr += bo->ib->sa_bo.offset;
-	cpu_ptr = rdev->ib_pool.sa_manager.cpu_ptr;
-	cpu_ptr += (bo->ib->sa_bo.offset >> 2);
+	gpu_addr = radeon_sa_bo_gpu_addr(&bo->ib->sa_bo);
+	cpu_ptr = radeon_sa_bo_cpu_addr(&bo->ib->sa_bo);
 	for (i = 0; i < (RADEON_SEMAPHORE_BO_SIZE/8); i++) {
 		bo->semaphores[i].gpu_addr = gpu_addr;
 		bo->semaphores[i].cpu_ptr = cpu_ptr;
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 06/18] drm/radeon: add proper locking to the SA v3
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (4 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 05/18] drm/radeon: use inline functions to calc sa_bo addr j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 07/18] drm/radeon: add sub allocator debugfs file j.glisse
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König

From: Christian König <deathsimple@vodafone.de>

Make the suballocator self containing to locking.

v2: split the bugfix into a seperate patch.
v3: remove some unreleated changes.

Sig-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon.h    |    1 +
 drivers/gpu/drm/radeon/radeon_sa.c |    6 ++++++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 2928139..2621794 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -381,6 +381,7 @@ struct radeon_bo_list {
  * alignment).
  */
 struct radeon_sa_manager {
+	spinlock_t		lock;
 	struct radeon_bo	*bo;
 	struct list_head	sa_bo;
 	unsigned		size;
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index 8fbfe69..aed0a8c 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -37,6 +37,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev,
 {
 	int r;
 
+	spin_lock_init(&sa_manager->lock);
 	sa_manager->bo = NULL;
 	sa_manager->size = size;
 	sa_manager->domain = domain;
@@ -139,6 +140,7 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 
 	BUG_ON(align > RADEON_GPU_PAGE_SIZE);
 	BUG_ON(size > sa_manager->size);
+	spin_lock(&sa_manager->lock);
 
 	/* no one ? */
 	head = sa_manager->sa_bo.prev;
@@ -172,6 +174,7 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 	offset += wasted;
 	if ((sa_manager->size - offset) < size) {
 		/* failed to find somethings big enough */
+		spin_unlock(&sa_manager->lock);
 		return -ENOMEM;
 	}
 
@@ -180,10 +183,13 @@ out:
 	sa_bo->offset = offset;
 	sa_bo->size = size;
 	list_add(&sa_bo->list, head);
+	spin_unlock(&sa_manager->lock);
 	return 0;
 }
 
 void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo *sa_bo)
 {
+	spin_lock(&sa_bo->manager->lock);
 	list_del_init(&sa_bo->list);
+	spin_unlock(&sa_bo->manager->lock);
 }
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 07/18] drm/radeon: add sub allocator debugfs file
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (5 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 06/18] drm/radeon: add proper locking to the SA v3 j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 08/18] drm/radeon: keep start and end offset in the SA j.glisse
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König

From: Christian König <deathsimple@vodafone.de>

Dumping the current allocations.

Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon_object.h |    5 +++++
 drivers/gpu/drm/radeon/radeon_ring.c   |   22 ++++++++++++++++++++++
 drivers/gpu/drm/radeon/radeon_sa.c     |   14 ++++++++++++++
 3 files changed, 41 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h
index c120ab9..d9fca1e 100644
--- a/drivers/gpu/drm/radeon/radeon_object.h
+++ b/drivers/gpu/drm/radeon/radeon_object.h
@@ -172,5 +172,10 @@ extern int radeon_sa_bo_new(struct radeon_device *rdev,
 			    unsigned size, unsigned align);
 extern void radeon_sa_bo_free(struct radeon_device *rdev,
 			      struct radeon_sa_bo *sa_bo);
+#if defined(CONFIG_DEBUG_FS)
+extern void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager,
+					 struct seq_file *m);
+#endif
+
 
 #endif
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index b4c8a67..40a8528 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -603,6 +603,23 @@ static int radeon_debugfs_ib_info(struct seq_file *m, void *data)
 static struct drm_info_list radeon_debugfs_ib_list[RADEON_IB_POOL_SIZE];
 static char radeon_debugfs_ib_names[RADEON_IB_POOL_SIZE][32];
 static unsigned radeon_debugfs_ib_idx[RADEON_IB_POOL_SIZE];
+
+static int radeon_debugfs_sa_info(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = (struct drm_info_node *) m->private;
+	struct drm_device *dev = node->minor->dev;
+	struct radeon_device *rdev = dev->dev_private;
+
+	radeon_sa_bo_dump_debug_info(&rdev->ib_pool.sa_manager, m);
+
+	return 0;
+
+}
+
+static struct drm_info_list radeon_debugfs_sa_list[] = {
+        {"radeon_sa_info", &radeon_debugfs_sa_info, 0, NULL},
+};
+
 #endif
 
 int radeon_debugfs_ring_init(struct radeon_device *rdev, struct radeon_ring *ring)
@@ -629,6 +646,11 @@ int radeon_debugfs_ib_init(struct radeon_device *rdev)
 {
 #if defined(CONFIG_DEBUG_FS)
 	unsigned i;
+	int r;
+
+	r = radeon_debugfs_add_files(rdev, radeon_debugfs_sa_list, 1);
+	if (r)
+		return r;
 
 	for (i = 0; i < RADEON_IB_POOL_SIZE; i++) {
 		sprintf(radeon_debugfs_ib_names[i], "radeon_ib_%04u", i);
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index aed0a8c..1db0568 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -193,3 +193,17 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo *sa_bo)
 	list_del_init(&sa_bo->list);
 	spin_unlock(&sa_bo->manager->lock);
 }
+
+#if defined(CONFIG_DEBUG_FS)
+void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager,
+				  struct seq_file *m)
+{
+	struct radeon_sa_bo *i;
+
+	spin_lock(&sa_manager->lock);
+	list_for_each_entry(i, &sa_manager->sa_bo, list) {
+		seq_printf(m, "offset %08d: size %4d\n", i->offset, i->size);
+	}
+	spin_unlock(&sa_manager->lock);
+}
+#endif
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 08/18] drm/radeon: keep start and end offset in the SA
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (6 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 07/18] drm/radeon: add sub allocator debugfs file j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 09/18] drm/radeon: make sa bo a stand alone object j.glisse
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König

From: Christian König <deathsimple@vodafone.de>

Instead of offset + size keep start and end offset directly.

Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon.h        |    4 ++--
 drivers/gpu/drm/radeon/radeon_cs.c     |    4 ++--
 drivers/gpu/drm/radeon/radeon_object.h |    4 ++--
 drivers/gpu/drm/radeon/radeon_sa.c     |   13 +++++++------
 4 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 2621794..a90cabf 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -396,8 +396,8 @@ struct radeon_sa_bo;
 struct radeon_sa_bo {
 	struct list_head		list;
 	struct radeon_sa_manager	*manager;
-	unsigned			offset;
-	unsigned			size;
+	unsigned			soffset;
+	unsigned			eoffset;
 };
 
 /*
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index dd0fdef..ac6c9e3 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -470,7 +470,7 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 		/* ib pool is bind at 0 in virtual address space to gpu_addr is the
 		 * offset inside the pool bo
 		 */
-		parser->const_ib->gpu_addr = parser->const_ib->sa_bo.offset;
+		parser->const_ib->gpu_addr = parser->const_ib->sa_bo.soffset;
 		r = radeon_ib_schedule(rdev, parser->const_ib);
 		if (r)
 			goto out;
@@ -480,7 +480,7 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 	/* ib pool is bind at 0 in virtual address space to gpu_addr is the
 	 * offset inside the pool bo
 	 */
-	parser->ib->gpu_addr = parser->ib->sa_bo.offset;
+	parser->ib->gpu_addr = parser->ib->sa_bo.soffset;
 	parser->ib->is_const_ib = false;
 	r = radeon_ib_schedule(rdev, parser->ib);
 out:
diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h
index d9fca1e..99ab46a 100644
--- a/drivers/gpu/drm/radeon/radeon_object.h
+++ b/drivers/gpu/drm/radeon/radeon_object.h
@@ -149,12 +149,12 @@ extern struct radeon_bo_va *radeon_bo_va(struct radeon_bo *rbo,
 
 static inline uint64_t radeon_sa_bo_gpu_addr(struct radeon_sa_bo *sa_bo)
 {
-	return sa_bo->manager->gpu_addr + sa_bo->offset;
+	return sa_bo->manager->gpu_addr + sa_bo->soffset;
 }
 
 static inline void * radeon_sa_bo_cpu_addr(struct radeon_sa_bo *sa_bo)
 {
-	return sa_bo->manager->cpu_ptr + sa_bo->offset;
+	return sa_bo->manager->cpu_ptr + sa_bo->soffset;
 }
 
 extern int radeon_sa_bo_manager_init(struct radeon_device *rdev,
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index 1db0568..3bea7ba 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -152,11 +152,11 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 	offset = 0;
 	list_for_each_entry(tmp, &sa_manager->sa_bo, list) {
 		/* room before this object ? */
-		if (offset < tmp->offset && (tmp->offset - offset) >= size) {
+		if (offset < tmp->soffset && (tmp->soffset - offset) >= size) {
 			head = tmp->list.prev;
 			goto out;
 		}
-		offset = tmp->offset + tmp->size;
+		offset = tmp->eoffset;
 		wasted = offset % align;
 		if (wasted) {
 			wasted = align - wasted;
@@ -166,7 +166,7 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 	/* room at the end ? */
 	head = sa_manager->sa_bo.prev;
 	tmp = list_entry(head, struct radeon_sa_bo, list);
-	offset = tmp->offset + tmp->size;
+	offset = tmp->eoffset;
 	wasted = offset % align;
 	if (wasted) {
 		wasted = align - wasted;
@@ -180,8 +180,8 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 
 out:
 	sa_bo->manager = sa_manager;
-	sa_bo->offset = offset;
-	sa_bo->size = size;
+	sa_bo->soffset = offset;
+	sa_bo->eoffset = offset + size;
 	list_add(&sa_bo->list, head);
 	spin_unlock(&sa_manager->lock);
 	return 0;
@@ -202,7 +202,8 @@ void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager,
 
 	spin_lock(&sa_manager->lock);
 	list_for_each_entry(i, &sa_manager->sa_bo, list) {
-		seq_printf(m, "offset %08d: size %4d\n", i->offset, i->size);
+		seq_printf(m, "[%08x %08x] size %4d [%p]\n",
+			   i->soffset, i->eoffset, i->eoffset - i->soffset, i);
 	}
 	spin_unlock(&sa_manager->lock);
 }
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 09/18] drm/radeon: make sa bo a stand alone object
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (7 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 08/18] drm/radeon: keep start and end offset in the SA j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 10/18] drm/radeon: define new SA interface v2 j.glisse
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König

From: Christian König <deathsimple@vodafone.de>

Allocating and freeing it seperately.

Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon.h           |    4 ++--
 drivers/gpu/drm/radeon/radeon_cs.c        |    4 ++--
 drivers/gpu/drm/radeon/radeon_gart.c      |    4 ++--
 drivers/gpu/drm/radeon/radeon_object.h    |    4 ++--
 drivers/gpu/drm/radeon/radeon_ring.c      |    6 +++---
 drivers/gpu/drm/radeon/radeon_sa.c        |   28 +++++++++++++++++++---------
 drivers/gpu/drm/radeon/radeon_semaphore.c |    4 ++--
 7 files changed, 32 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index a90cabf..3519a52 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -629,7 +629,7 @@ void radeon_irq_kms_pflip_irq_put(struct radeon_device *rdev, int crtc);
  */
 
 struct radeon_ib {
-	struct radeon_sa_bo	sa_bo;
+	struct radeon_sa_bo	*sa_bo;
 	unsigned		idx;
 	uint32_t		length_dw;
 	uint64_t		gpu_addr;
@@ -684,7 +684,7 @@ struct radeon_vm {
 	unsigned			last_pfn;
 	u64				pt_gpu_addr;
 	u64				*pt;
-	struct radeon_sa_bo		sa_bo;
+	struct radeon_sa_bo		*sa_bo;
 	struct mutex			mutex;
 	/* last fence for cs using this vm */
 	struct radeon_fence		*fence;
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index ac6c9e3..2aa3f89 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -470,7 +470,7 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 		/* ib pool is bind at 0 in virtual address space to gpu_addr is the
 		 * offset inside the pool bo
 		 */
-		parser->const_ib->gpu_addr = parser->const_ib->sa_bo.soffset;
+		parser->const_ib->gpu_addr = parser->const_ib->sa_bo->soffset;
 		r = radeon_ib_schedule(rdev, parser->const_ib);
 		if (r)
 			goto out;
@@ -480,7 +480,7 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 	/* ib pool is bind at 0 in virtual address space to gpu_addr is the
 	 * offset inside the pool bo
 	 */
-	parser->ib->gpu_addr = parser->ib->sa_bo.soffset;
+	parser->ib->gpu_addr = parser->ib->sa_bo->soffset;
 	parser->ib->is_const_ib = false;
 	r = radeon_ib_schedule(rdev, parser->ib);
 out:
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c
index 4a5d9d4..c5789ef 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -404,8 +404,8 @@ retry:
 		radeon_vm_unbind(rdev, vm_evict);
 		goto retry;
 	}
-	vm->pt = radeon_sa_bo_cpu_addr(&vm->sa_bo);
-	vm->pt_gpu_addr = radeon_sa_bo_gpu_addr(&vm->sa_bo);
+	vm->pt = radeon_sa_bo_cpu_addr(vm->sa_bo);
+	vm->pt_gpu_addr = radeon_sa_bo_gpu_addr(vm->sa_bo);
 	memset(vm->pt, 0, RADEON_GPU_PAGE_ALIGN(vm->last_pfn * 8));
 
 retry_id:
diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h
index 99ab46a..4fc7f07 100644
--- a/drivers/gpu/drm/radeon/radeon_object.h
+++ b/drivers/gpu/drm/radeon/radeon_object.h
@@ -168,10 +168,10 @@ extern int radeon_sa_bo_manager_suspend(struct radeon_device *rdev,
 					struct radeon_sa_manager *sa_manager);
 extern int radeon_sa_bo_new(struct radeon_device *rdev,
 			    struct radeon_sa_manager *sa_manager,
-			    struct radeon_sa_bo *sa_bo,
+			    struct radeon_sa_bo **sa_bo,
 			    unsigned size, unsigned align);
 extern void radeon_sa_bo_free(struct radeon_device *rdev,
-			      struct radeon_sa_bo *sa_bo);
+			      struct radeon_sa_bo **sa_bo);
 #if defined(CONFIG_DEBUG_FS)
 extern void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager,
 					 struct seq_file *m);
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 40a8528..739554c 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -127,8 +127,8 @@ retry:
 					     size, 256);
 			if (!r) {
 				*ib = &rdev->ib_pool.ibs[idx];
-				(*ib)->ptr = radeon_sa_bo_cpu_addr(&(*ib)->sa_bo);
-				(*ib)->gpu_addr = radeon_sa_bo_gpu_addr(&(*ib)->sa_bo);
+				(*ib)->ptr = radeon_sa_bo_cpu_addr((*ib)->sa_bo);
+				(*ib)->gpu_addr = radeon_sa_bo_gpu_addr((*ib)->sa_bo);
 				(*ib)->fence = fence;
 				(*ib)->vm_id = 0;
 				(*ib)->is_const_ib = false;
@@ -227,7 +227,7 @@ int radeon_ib_pool_init(struct radeon_device *rdev)
 		rdev->ib_pool.ibs[i].fence = NULL;
 		rdev->ib_pool.ibs[i].idx = i;
 		rdev->ib_pool.ibs[i].length_dw = 0;
-		INIT_LIST_HEAD(&rdev->ib_pool.ibs[i].sa_bo.list);
+		rdev->ib_pool.ibs[i].sa_bo = NULL;
 	}
 	rdev->ib_pool.head_id = 0;
 	rdev->ib_pool.ready = true;
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index 3bea7ba..625f2d4 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -131,7 +131,7 @@ int radeon_sa_bo_manager_suspend(struct radeon_device *rdev,
  */
 int radeon_sa_bo_new(struct radeon_device *rdev,
 		     struct radeon_sa_manager *sa_manager,
-		     struct radeon_sa_bo *sa_bo,
+		     struct radeon_sa_bo **sa_bo,
 		     unsigned size, unsigned align)
 {
 	struct radeon_sa_bo *tmp;
@@ -140,6 +140,9 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 
 	BUG_ON(align > RADEON_GPU_PAGE_SIZE);
 	BUG_ON(size > sa_manager->size);
+
+	*sa_bo = kmalloc(sizeof(struct radeon_sa_bo), GFP_KERNEL);
+
 	spin_lock(&sa_manager->lock);
 
 	/* no one ? */
@@ -175,23 +178,30 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 	if ((sa_manager->size - offset) < size) {
 		/* failed to find somethings big enough */
 		spin_unlock(&sa_manager->lock);
+		kfree(*sa_bo);
+		*sa_bo = NULL;
 		return -ENOMEM;
 	}
 
 out:
-	sa_bo->manager = sa_manager;
-	sa_bo->soffset = offset;
-	sa_bo->eoffset = offset + size;
-	list_add(&sa_bo->list, head);
+	(*sa_bo)->manager = sa_manager;
+	(*sa_bo)->soffset = offset;
+	(*sa_bo)->eoffset = offset + size;
+	list_add(&(*sa_bo)->list, head);
 	spin_unlock(&sa_manager->lock);
 	return 0;
 }
 
-void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo *sa_bo)
+void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo **sa_bo)
 {
-	spin_lock(&sa_bo->manager->lock);
-	list_del_init(&sa_bo->list);
-	spin_unlock(&sa_bo->manager->lock);
+	if (!sa_bo || !*sa_bo)
+		return;
+
+	spin_lock(&(*sa_bo)->manager->lock);
+	list_del_init(&(*sa_bo)->list);
+	spin_unlock(&(*sa_bo)->manager->lock);
+	kfree(*sa_bo);
+	*sa_bo = NULL;
 }
 
 #if defined(CONFIG_DEBUG_FS)
diff --git a/drivers/gpu/drm/radeon/radeon_semaphore.c b/drivers/gpu/drm/radeon/radeon_semaphore.c
index f312ba5..d518d32 100644
--- a/drivers/gpu/drm/radeon/radeon_semaphore.c
+++ b/drivers/gpu/drm/radeon/radeon_semaphore.c
@@ -53,8 +53,8 @@ static int radeon_semaphore_add_bo(struct radeon_device *rdev)
 		kfree(bo);
 		return r;
 	}
-	gpu_addr = radeon_sa_bo_gpu_addr(&bo->ib->sa_bo);
-	cpu_ptr = radeon_sa_bo_cpu_addr(&bo->ib->sa_bo);
+	gpu_addr = radeon_sa_bo_gpu_addr(bo->ib->sa_bo);
+	cpu_ptr = radeon_sa_bo_cpu_addr(bo->ib->sa_bo);
 	for (i = 0; i < (RADEON_SEMAPHORE_BO_SIZE/8); i++) {
 		bo->semaphores[i].gpu_addr = gpu_addr;
 		bo->semaphores[i].cpu_ptr = cpu_ptr;
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 10/18] drm/radeon: define new SA interface v2
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (8 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 09/18] drm/radeon: make sa bo a stand alone object j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 11/18] drm/radeon: use one wait queue for all rings add fence_wait_any j.glisse
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König, Jerome Glisse

From: Christian König <deathsimple@vodafone.de>

Define the interface without modifying the allocation
algorithm in any way.

v2: rebase on top of fence new uint64 patch

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon.h           |    1 +
 drivers/gpu/drm/radeon/radeon_gart.c      |    6 +-
 drivers/gpu/drm/radeon/radeon_object.h    |    5 +-
 drivers/gpu/drm/radeon/radeon_ring.c      |    8 ++--
 drivers/gpu/drm/radeon/radeon_sa.c        |   60 ++++++++++++++++++++++++----
 drivers/gpu/drm/radeon/radeon_semaphore.c |    2 +-
 6 files changed, 63 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 3519a52..c78f15b 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -398,6 +398,7 @@ struct radeon_sa_bo {
 	struct radeon_sa_manager	*manager;
 	unsigned			soffset;
 	unsigned			eoffset;
+	struct radeon_fence		*fence;
 };
 
 /*
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c
index c5789ef..53dba8e 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -326,7 +326,7 @@ static void radeon_vm_unbind_locked(struct radeon_device *rdev,
 	rdev->vm_manager.use_bitmap &= ~(1 << vm->id);
 	list_del_init(&vm->list);
 	vm->id = -1;
-	radeon_sa_bo_free(rdev, &vm->sa_bo);
+	radeon_sa_bo_free(rdev, &vm->sa_bo, NULL);
 	vm->pt = NULL;
 
 	list_for_each_entry(bo_va, &vm->va, vm_list) {
@@ -395,7 +395,7 @@ int radeon_vm_bind(struct radeon_device *rdev, struct radeon_vm *vm)
 retry:
 	r = radeon_sa_bo_new(rdev, &rdev->vm_manager.sa_manager, &vm->sa_bo,
 			     RADEON_GPU_PAGE_ALIGN(vm->last_pfn * 8),
-			     RADEON_GPU_PAGE_SIZE);
+			     RADEON_GPU_PAGE_SIZE, false);
 	if (r) {
 		if (list_empty(&rdev->vm_manager.lru_vm)) {
 			return r;
@@ -426,7 +426,7 @@ retry_id:
 	/* do hw bind */
 	r = rdev->vm_manager.funcs->bind(rdev, vm, id);
 	if (r) {
-		radeon_sa_bo_free(rdev, &vm->sa_bo);
+		radeon_sa_bo_free(rdev, &vm->sa_bo, NULL);
 		return r;
 	}
 	rdev->vm_manager.use_bitmap |= 1 << id;
diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h
index 4fc7f07..befec7d 100644
--- a/drivers/gpu/drm/radeon/radeon_object.h
+++ b/drivers/gpu/drm/radeon/radeon_object.h
@@ -169,9 +169,10 @@ extern int radeon_sa_bo_manager_suspend(struct radeon_device *rdev,
 extern int radeon_sa_bo_new(struct radeon_device *rdev,
 			    struct radeon_sa_manager *sa_manager,
 			    struct radeon_sa_bo **sa_bo,
-			    unsigned size, unsigned align);
+			    unsigned size, unsigned align, bool block);
 extern void radeon_sa_bo_free(struct radeon_device *rdev,
-			      struct radeon_sa_bo **sa_bo);
+			      struct radeon_sa_bo **sa_bo,
+			      struct radeon_fence *fence);
 #if defined(CONFIG_DEBUG_FS)
 extern void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager,
 					 struct seq_file *m);
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 739554c..c84af9c 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -85,7 +85,7 @@ bool radeon_ib_try_free(struct radeon_device *rdev, struct radeon_ib *ib)
 	if (ib->fence && ib->fence->seq < RADEON_FENCE_NOTEMITED_SEQ) {
 		if (radeon_fence_signaled(ib->fence)) {
 			radeon_fence_unref(&ib->fence);
-			radeon_sa_bo_free(rdev, &ib->sa_bo);
+			radeon_sa_bo_free(rdev, &ib->sa_bo, NULL);
 			done = true;
 		}
 	}
@@ -124,7 +124,7 @@ retry:
 		if (rdev->ib_pool.ibs[idx].fence == NULL) {
 			r = radeon_sa_bo_new(rdev, &rdev->ib_pool.sa_manager,
 					     &rdev->ib_pool.ibs[idx].sa_bo,
-					     size, 256);
+					     size, 256, false);
 			if (!r) {
 				*ib = &rdev->ib_pool.ibs[idx];
 				(*ib)->ptr = radeon_sa_bo_cpu_addr((*ib)->sa_bo);
@@ -173,7 +173,7 @@ void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib)
 	}
 	radeon_mutex_lock(&rdev->ib_pool.mutex);
 	if (tmp->fence && tmp->fence->seq == RADEON_FENCE_NOTEMITED_SEQ) {
-		radeon_sa_bo_free(rdev, &tmp->sa_bo);
+		radeon_sa_bo_free(rdev, &tmp->sa_bo, NULL);
 		radeon_fence_unref(&tmp->fence);
 	}
 	radeon_mutex_unlock(&rdev->ib_pool.mutex);
@@ -247,7 +247,7 @@ void radeon_ib_pool_fini(struct radeon_device *rdev)
 	radeon_mutex_lock(&rdev->ib_pool.mutex);
 	if (rdev->ib_pool.ready) {
 		for (i = 0; i < RADEON_IB_POOL_SIZE; i++) {
-			radeon_sa_bo_free(rdev, &rdev->ib_pool.ibs[i].sa_bo);
+			radeon_sa_bo_free(rdev, &rdev->ib_pool.ibs[i].sa_bo, NULL);
 			radeon_fence_unref(&rdev->ib_pool.ibs[i].fence);
 		}
 		radeon_sa_bo_manager_fini(rdev, &rdev->ib_pool.sa_manager);
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index 625f2d4..bc4ef87 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -129,20 +129,32 @@ int radeon_sa_bo_manager_suspend(struct radeon_device *rdev,
  *
  * Alignment can't be bigger than page size
  */
+
+static void radeon_sa_bo_remove_locked(struct radeon_sa_bo *sa_bo)
+{
+	list_del(&sa_bo->list);
+	radeon_fence_unref(&sa_bo->fence);
+	kfree(sa_bo);
+}
+
 int radeon_sa_bo_new(struct radeon_device *rdev,
 		     struct radeon_sa_manager *sa_manager,
 		     struct radeon_sa_bo **sa_bo,
-		     unsigned size, unsigned align)
+		     unsigned size, unsigned align, bool block)
 {
-	struct radeon_sa_bo *tmp;
+	struct radeon_fence *fence = NULL;
+	struct radeon_sa_bo *tmp, *next;
 	struct list_head *head;
 	unsigned offset = 0, wasted = 0;
+	int r;
 
 	BUG_ON(align > RADEON_GPU_PAGE_SIZE);
 	BUG_ON(size > sa_manager->size);
 
 	*sa_bo = kmalloc(sizeof(struct radeon_sa_bo), GFP_KERNEL);
 
+retry:
+
 	spin_lock(&sa_manager->lock);
 
 	/* no one ? */
@@ -153,7 +165,17 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 
 	/* look for a hole big enough */
 	offset = 0;
-	list_for_each_entry(tmp, &sa_manager->sa_bo, list) {
+	list_for_each_entry_safe(tmp, next, &sa_manager->sa_bo, list) {
+		/* try to free this object */
+		if (tmp->fence) {
+			if (radeon_fence_signaled(tmp->fence)) {
+				radeon_sa_bo_remove_locked(tmp);
+				continue;
+			} else {
+				fence = tmp->fence;
+			}
+		}
+
 		/* room before this object ? */
 		if (offset < tmp->soffset && (tmp->soffset - offset) >= size) {
 			head = tmp->list.prev;
@@ -178,6 +200,13 @@ int radeon_sa_bo_new(struct radeon_device *rdev,
 	if ((sa_manager->size - offset) < size) {
 		/* failed to find somethings big enough */
 		spin_unlock(&sa_manager->lock);
+		if (block && fence) {
+			r = radeon_fence_wait(fence, false);
+			if (r)
+				return r;
+
+			goto retry;
+		}
 		kfree(*sa_bo);
 		*sa_bo = NULL;
 		return -ENOMEM;
@@ -192,15 +221,22 @@ out:
 	return 0;
 }
 
-void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo **sa_bo)
+void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo **sa_bo,
+		       struct radeon_fence *fence)
 {
+	struct radeon_sa_manager *sa_manager;
+
 	if (!sa_bo || !*sa_bo)
 		return;
 
-	spin_lock(&(*sa_bo)->manager->lock);
-	list_del_init(&(*sa_bo)->list);
-	spin_unlock(&(*sa_bo)->manager->lock);
-	kfree(*sa_bo);
+	sa_manager = (*sa_bo)->manager;
+	spin_lock(&sa_manager->lock);
+	if (fence && fence->seq && fence->seq < RADEON_FENCE_NOTEMITED_SEQ) {
+		(*sa_bo)->fence = radeon_fence_ref(fence);
+	} else {
+		radeon_sa_bo_remove_locked(*sa_bo);
+	}
+	spin_unlock(&sa_manager->lock);
 	*sa_bo = NULL;
 }
 
@@ -212,8 +248,14 @@ void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager,
 
 	spin_lock(&sa_manager->lock);
 	list_for_each_entry(i, &sa_manager->sa_bo, list) {
-		seq_printf(m, "[%08x %08x] size %4d [%p]\n",
+		seq_printf(m, "[%08x %08x] size %4d (%p)",
 			   i->soffset, i->eoffset, i->eoffset - i->soffset, i);
+		if (i->fence) {
+			seq_printf(m, " protected by %Ld (%p)\n",
+				   i->fence->seq, i->fence);
+		} else {
+			seq_printf(m, "\n");
+		}
 	}
 	spin_unlock(&sa_manager->lock);
 }
diff --git a/drivers/gpu/drm/radeon/radeon_semaphore.c b/drivers/gpu/drm/radeon/radeon_semaphore.c
index d518d32..dbde874 100644
--- a/drivers/gpu/drm/radeon/radeon_semaphore.c
+++ b/drivers/gpu/drm/radeon/radeon_semaphore.c
@@ -72,7 +72,7 @@ static int radeon_semaphore_add_bo(struct radeon_device *rdev)
 static void radeon_semaphore_del_bo_locked(struct radeon_device *rdev,
 					   struct radeon_semaphore_bo *bo)
 {
-	radeon_sa_bo_free(rdev, &bo->ib->sa_bo);
+	radeon_sa_bo_free(rdev, &bo->ib->sa_bo, NULL);
 	radeon_fence_unref(&bo->ib->fence);
 	list_del(&bo->list);
 	kfree(bo);
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 11/18] drm/radeon: use one wait queue for all rings add fence_wait_any
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (9 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 10/18] drm/radeon: define new SA interface v2 j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 12/18] drm/radeon: multiple ring allocator j.glisse
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Jerome Glisse, Christian König

From: Jerome Glisse <jglisse@redhat.com>

Use one wait queue for all rings. When one ring progress, other
likely does to and we are not expecting to have a lot of waiter
anyway.

Also add a fence_wait_any that will wait until the first fence
in the fence array (one fence per ring) is signaled. This allow
to wait on all rings.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
---
 drivers/gpu/drm/radeon/radeon.h       |    5 +-
 drivers/gpu/drm/radeon/radeon_fence.c |  151 +++++++++++++++++++++++++++++++-
 2 files changed, 150 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index c78f15b..6f8d0f5 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -262,7 +262,6 @@ struct radeon_fence_driver {
 	uint64_t			seq;
 	atomic64_t			last_seq;
 	unsigned long			last_activity;
-	wait_queue_head_t		queue;
 	bool				initialized;
 };
 
@@ -286,6 +285,9 @@ bool radeon_fence_signaled(struct radeon_fence *fence);
 int radeon_fence_wait(struct radeon_fence *fence, bool interruptible);
 int radeon_fence_wait_next(struct radeon_device *rdev, int ring);
 int radeon_fence_wait_empty(struct radeon_device *rdev, int ring);
+int radeon_fence_wait_any(struct radeon_device *rdev,
+			  struct radeon_fence **fences,
+			  bool intr);
 struct radeon_fence *radeon_fence_ref(struct radeon_fence *fence);
 void radeon_fence_unref(struct radeon_fence **fence);
 unsigned radeon_fence_count_emitted(struct radeon_device *rdev, int ring);
@@ -1516,6 +1518,7 @@ struct radeon_device {
 	struct radeon_scratch		scratch;
 	struct radeon_mman		mman;
 	struct radeon_fence_driver	fence_drv[RADEON_NUM_RINGS];
+	wait_queue_head_t		fence_queue;
 	struct radeon_semaphore_driver	semaphore_drv;
 	struct mutex			ring_lock;
 	struct radeon_ring		ring[RADEON_NUM_RINGS];
diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 99c31b2..9c2c1b3 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -233,11 +233,11 @@ static int radeon_wait_seq(struct radeon_device *rdev, u64 target_seq,
 		trace_radeon_fence_wait_begin(rdev->ddev, seq);
 		radeon_irq_kms_sw_irq_get(rdev, ring);
 		if (intr) {
-			r = wait_event_interruptible_timeout(rdev->fence_drv[ring].queue,
+			r = wait_event_interruptible_timeout(rdev->fence_queue,
 							     (signaled = radeon_seq_signaled(rdev, target_seq, ring)),
 							     timeout);
 		} else {
-			r = wait_event_timeout(rdev->fence_drv[ring].queue,
+			r = wait_event_timeout(rdev->fence_queue,
 					       (signaled = radeon_seq_signaled(rdev, target_seq, ring)),
 					       timeout);
 		}
@@ -304,6 +304,147 @@ int radeon_fence_wait(struct radeon_fence *fence, bool intr)
 	return 0;
 }
 
+bool radeon_seq_any_signaled(struct radeon_device *rdev, u64 *seq)
+{
+	unsigned i, j;
+
+	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+		/* poll new last sequence at least once */
+		for (j = 0; j < 2; j++) {
+			if (atomic64_read(&rdev->fence_drv[i].last_seq) >= seq[i]) {
+				return true;
+			}
+			radeon_fence_poll(rdev, i);
+		}
+	}
+	return false;
+}
+
+static int radeon_wait_seq_any(struct radeon_device *rdev, u64 *target_seq,
+			       bool intr)
+{
+	unsigned long timeout, last_activity, tmp;
+	unsigned i, ring = 0;
+	bool signaled;
+	int r;
+
+	/* use the most recent one as indicator */
+	for (i = 0, last_activity = 0; i < RADEON_NUM_RINGS; ++i) {
+		if (time_after(rdev->fence_drv[i].last_activity, last_activity)) {
+			last_activity = rdev->fence_drv[i].last_activity;
+		}
+		/* For lockup detection we just pick one of the ring we are
+		 * actively waiting for
+		 */
+		if (target_seq[i]) {
+			ring = i;
+		}
+	}
+	while (!radeon_seq_any_signaled(rdev, target_seq)) {
+		timeout = jiffies - RADEON_FENCE_JIFFIES_TIMEOUT;
+		if (time_after(last_activity, timeout)) {
+			/* the normal case, timeout is somewhere before last_activity */
+			timeout = last_activity - timeout;
+		} else {
+			/* either jiffies wrapped around, or no fence was signaled in the last 500ms
+			 * anyway we will just wait for the minimum amount and then check for a lockup
+			 */
+			timeout = 1;
+		}
+
+		trace_radeon_fence_wait_begin(rdev->ddev, target_seq[ring]);
+		for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+			if (target_seq[i]) {
+				radeon_irq_kms_sw_irq_get(rdev, i);
+			}
+		}
+		if (intr) {
+			r = wait_event_interruptible_timeout(rdev->fence_queue,
+							     (signaled = radeon_seq_any_signaled(rdev, target_seq)),
+							     timeout);
+		} else {
+			r = wait_event_timeout(rdev->fence_queue,
+					       (signaled = radeon_seq_any_signaled(rdev, target_seq)),
+					       timeout);
+		}
+		for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+			if (target_seq[i]) {
+				radeon_irq_kms_sw_irq_put(rdev, i);
+			}
+		}
+		if (unlikely(r < 0)) {
+			return r;
+		}
+		trace_radeon_fence_wait_end(rdev->ddev, target_seq[ring]);
+
+		if (unlikely(!signaled)) {
+			/* we were interrupted for some reason and fence
+			 * isn't signaled yet, resume waiting */
+			if (r) {
+				continue;
+			}
+
+			mutex_lock(&rdev->ring_lock);
+			for (i = 0, tmp = 0; i < RADEON_NUM_RINGS; ++i) {
+				if (time_after(rdev->fence_drv[i].last_activity, tmp)) {
+					tmp = rdev->fence_drv[i].last_activity;
+				}
+			}
+			/* test if somebody else has already decided that this is a lockup */
+			if (last_activity != tmp) {
+				last_activity = tmp;
+				mutex_unlock(&rdev->ring_lock);
+				continue;
+			}
+
+			if (radeon_ring_is_lockup(rdev, ring, &rdev->ring[ring])) {
+				/* good news we believe it's a lockup */
+				dev_warn(rdev->dev, "GPU lockup (waiting for 0x%016llx)\n",
+					 target_seq[ring]);
+
+				/* change last activity so nobody else think there is a lockup */
+				for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+					rdev->fence_drv[i].last_activity = jiffies;
+				}
+
+				/* mark the ring as not ready any more */
+				rdev->ring[ring].ready = false;
+				mutex_unlock(&rdev->ring_lock);
+				return -EDEADLK;
+			}
+			mutex_unlock(&rdev->ring_lock);
+		}
+	}
+	return 0;
+}
+
+int radeon_fence_wait_any(struct radeon_device *rdev,
+			  struct radeon_fence **fences,
+			  bool intr)
+{
+	uint64_t seq[RADEON_NUM_RINGS];
+	unsigned i, c;
+	int r;
+
+	for (i = 0, c = 0; i < RADEON_NUM_RINGS; ++i) {
+		seq[i] = 0;
+		if (fences[i] && fences[i]->seq < RADEON_FENCE_NOTEMITED_SEQ) {
+			seq[i] = fences[i]->seq;
+			c++;
+		}
+	}
+	if (!c) {
+		/* nothing to wait for */
+		return 0;
+	}
+
+	r = radeon_wait_seq_any(rdev, seq, intr);
+	if (r) {
+		return r;
+	}
+	return 0;
+}
+
 int radeon_fence_wait_next(struct radeon_device *rdev, int ring)
 {
 	uint64_t seq;
@@ -352,7 +493,7 @@ void radeon_fence_process(struct radeon_device *rdev, int ring)
 
 	wake = radeon_fence_poll(rdev, ring);
 	if (wake) {
-		wake_up_all(&rdev->fence_drv[ring].queue);
+		wake_up_all(&rdev->fence_queue);
 	}
 }
 
@@ -409,7 +550,6 @@ static void radeon_fence_driver_init_ring(struct radeon_device *rdev, int ring)
 	rdev->fence_drv[ring].seq = 1;
 	atomic64_set(&rdev->fence_drv[ring].last_seq, 0);
 	rdev->fence_drv[ring].last_activity = jiffies;
-	init_waitqueue_head(&rdev->fence_drv[ring].queue);
 	rdev->fence_drv[ring].initialized = false;
 }
 
@@ -417,6 +557,7 @@ int radeon_fence_driver_init(struct radeon_device *rdev)
 {
 	int ring;
 
+	init_waitqueue_head(&rdev->fence_queue);
 	for (ring = 0; ring < RADEON_NUM_RINGS; ring++) {
 		radeon_fence_driver_init_ring(rdev, ring);
 	}
@@ -434,7 +575,7 @@ void radeon_fence_driver_fini(struct radeon_device *rdev)
 		if (!rdev->fence_drv[ring].initialized)
 			continue;
 		radeon_fence_wait_empty(rdev, ring);
-		wake_up_all(&rdev->fence_drv[ring].queue);
+		wake_up_all(&rdev->fence_queue);
 		radeon_scratch_free(rdev, rdev->fence_drv[ring].scratch_reg);
 		rdev->fence_drv[ring].initialized = false;
 	}
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 12/18] drm/radeon: multiple ring allocator
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (10 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 11/18] drm/radeon: use one wait queue for all rings add fence_wait_any j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 13/18] drm/radeon: simplify semaphore handling v2 j.glisse
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König, Jerome Glisse

From: Christian König <deathsimple@vodafone.de>

A startover with a new idea for a multiple ring allocator.
Should perform as well as a normal ring allocator as long
as only one ring does somthing, but falls back to a more
complex algorithm if more complex things start to happen.

We store the last allocated bo in last, we always try to allocate
after the last allocated bo. Principle is that in a linear GPU ring
progression was is after last is the oldest bo we allocated and thus
the first one that should no longer be in use by the GPU.

If it's not the case we skip over the bo after last to the closest
done bo if such one exist. If none exist and we are not asked to
block we report failure to allocate.

If we are asked to block we wait on all the oldest fence of all
rings. We just wait for any of those fence to complete.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
---
 drivers/gpu/drm/radeon/radeon.h      |    7 +-
 drivers/gpu/drm/radeon/radeon_ring.c |   19 +--
 drivers/gpu/drm/radeon/radeon_sa.c   |  314 ++++++++++++++++++++++++----------
 3 files changed, 233 insertions(+), 107 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 6f8d0f5..1c3eb06 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -385,7 +385,9 @@ struct radeon_bo_list {
 struct radeon_sa_manager {
 	spinlock_t		lock;
 	struct radeon_bo	*bo;
-	struct list_head	sa_bo;
+	struct radeon_sa_bo	*last;
+	struct list_head	flist[RADEON_NUM_RINGS];
+	struct list_head	olist;
 	unsigned		size;
 	uint64_t		gpu_addr;
 	void			*cpu_ptr;
@@ -396,7 +398,8 @@ struct radeon_sa_bo;
 
 /* sub-allocation buffer */
 struct radeon_sa_bo {
-	struct list_head		list;
+	struct list_head		olist;
+	struct list_head		flist;
 	struct radeon_sa_manager	*manager;
 	unsigned			soffset;
 	unsigned			eoffset;
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index c84af9c..2eed251 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -204,25 +204,22 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib)
 
 int radeon_ib_pool_init(struct radeon_device *rdev)
 {
-	struct radeon_sa_manager tmp;
 	int i, r;
 
-	r = radeon_sa_bo_manager_init(rdev, &tmp,
-				      RADEON_IB_POOL_SIZE*64*1024,
-				      RADEON_GEM_DOMAIN_GTT);
-	if (r) {
-		return r;
-	}
-
 	radeon_mutex_lock(&rdev->ib_pool.mutex);
 	if (rdev->ib_pool.ready) {
 		radeon_mutex_unlock(&rdev->ib_pool.mutex);
-		radeon_sa_bo_manager_fini(rdev, &tmp);
 		return 0;
 	}
 
-	rdev->ib_pool.sa_manager = tmp;
-	INIT_LIST_HEAD(&rdev->ib_pool.sa_manager.sa_bo);
+	r = radeon_sa_bo_manager_init(rdev, &rdev->ib_pool.sa_manager,
+				      RADEON_IB_POOL_SIZE*64*1024,
+				      RADEON_GEM_DOMAIN_GTT);
+	if (r) {
+		radeon_mutex_unlock(&rdev->ib_pool.mutex);
+		return r;
+	}
+
 	for (i = 0; i < RADEON_IB_POOL_SIZE; i++) {
 		rdev->ib_pool.ibs[i].fence = NULL;
 		rdev->ib_pool.ibs[i].idx = i;
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index bc4ef87..dc67cb4 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -27,21 +27,42 @@
  * Authors:
  *    Jerome Glisse <glisse@freedesktop.org>
  */
+/* Algorithm:
+ *
+ * We store the last allocated bo in last, we always try to allocate
+ * after the last allocated bo. Principle is that in a linear GPU ring
+ * progression was is after last is the oldest bo we allocated and thus
+ * the first one that should no longer be in use by the GPU.
+ *
+ * If it's not the case we skip over the bo after last to the closest
+ * done bo if such one exist. If none exist and we are not asked to
+ * block we report failure to allocate.
+ *
+ * If we are asked to block we wait on all the oldest fence of all
+ * rings. We just wait for any of those fence to complete.
+ */
 #include "drmP.h"
 #include "drm.h"
 #include "radeon.h"
 
+static void radeon_sa_bo_remove_locked(struct radeon_sa_bo *sa_bo);
+static void radeon_sa_bo_try_free(struct radeon_sa_manager *sa_manager);
+
 int radeon_sa_bo_manager_init(struct radeon_device *rdev,
 			      struct radeon_sa_manager *sa_manager,
 			      unsigned size, u32 domain)
 {
-	int r;
+	int i, r;
 
 	spin_lock_init(&sa_manager->lock);
 	sa_manager->bo = NULL;
 	sa_manager->size = size;
 	sa_manager->domain = domain;
-	INIT_LIST_HEAD(&sa_manager->sa_bo);
+	INIT_LIST_HEAD(&sa_manager->olist);
+	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+		INIT_LIST_HEAD(&sa_manager->flist[i]);
+	}
+	sa_manager->last = NULL;
 
 	r = radeon_bo_create(rdev, size, RADEON_GPU_PAGE_SIZE, true,
 			     RADEON_GEM_DOMAIN_CPU, &sa_manager->bo);
@@ -58,11 +79,17 @@ void radeon_sa_bo_manager_fini(struct radeon_device *rdev,
 {
 	struct radeon_sa_bo *sa_bo, *tmp;
 
-	if (!list_empty(&sa_manager->sa_bo)) {
-		dev_err(rdev->dev, "sa_manager is not empty, clearing anyway\n");
+	if (!list_empty(&sa_manager->olist)) {
+		sa_manager->last = list_entry(sa_manager->olist.next,
+					      struct radeon_sa_bo, olist);
+		radeon_sa_bo_try_free(sa_manager);
+		radeon_sa_bo_remove_locked(sa_manager->last);
+		if (!list_empty(&sa_manager->olist)) {
+			dev_err(rdev->dev, "sa_manager is not empty, clearing anyway\n");
+		}
 	}
-	list_for_each_entry_safe(sa_bo, tmp, &sa_manager->sa_bo, list) {
-		list_del_init(&sa_bo->list);
+	list_for_each_entry_safe(sa_bo, tmp, &sa_manager->olist, olist) {
+		radeon_sa_bo_remove_locked(sa_bo);
 	}
 	radeon_bo_unref(&sa_manager->bo);
 	sa_manager->size = 0;
@@ -114,111 +141,204 @@ int radeon_sa_bo_manager_suspend(struct radeon_device *rdev,
 	return r;
 }
 
-/*
- * Principe is simple, we keep a list of sub allocation in offset
- * order (first entry has offset == 0, last entry has the highest
- * offset).
- *
- * When allocating new object we first check if there is room at
- * the end total_size - (last_object_offset + last_object_size) >=
- * alloc_size. If so we allocate new object there.
- *
- * When there is not enough room at the end, we start waiting for
- * each sub object until we reach object_offset+object_size >=
- * alloc_size, this object then become the sub object we return.
- *
- * Alignment can't be bigger than page size
- */
-
 static void radeon_sa_bo_remove_locked(struct radeon_sa_bo *sa_bo)
 {
-	list_del(&sa_bo->list);
+	struct radeon_sa_manager *sa_manager = sa_bo->manager;
+	struct list_head *prev = sa_bo->olist.prev;
+
+	list_del_init(&sa_bo->olist);
+	if (list_empty(&sa_manager->olist)) {
+		sa_manager->last = NULL;
+	} if (sa_manager->last == sa_bo) {
+		if (prev != &sa_manager->olist) {
+			sa_manager->last = list_entry(prev,
+						      struct radeon_sa_bo,
+						      olist);
+		} else {
+			sa_manager->last = list_entry(sa_manager->olist.prev,
+						      struct radeon_sa_bo,
+						      olist);
+		}
+	}
+	list_del_init(&sa_bo->flist);
 	radeon_fence_unref(&sa_bo->fence);
 	kfree(sa_bo);
 }
 
+static void radeon_sa_bo_try_free(struct radeon_sa_manager *sa_manager)
+{
+	struct radeon_sa_bo *sa_bo, *tmp;
+
+	if (sa_manager->last == NULL) {
+		return;
+	}
+	sa_bo = sa_manager->last;
+	list_for_each_entry_safe_continue(sa_bo, tmp, &sa_manager->olist, olist) {
+		if (sa_bo->fence == NULL || !radeon_fence_signaled(sa_bo->fence)) {
+			return;
+		}
+		radeon_sa_bo_remove_locked(sa_bo);
+	}
+}
+
+static inline void radeon_sa_bo_hole_after(struct radeon_sa_bo *sa_bo,
+					   unsigned *soffset, unsigned *eoffset)
+{
+
+	if (sa_bo == NULL) {
+		return;
+	}
+	*soffset = sa_bo->eoffset;
+	if (sa_bo->olist.next == &sa_bo->manager->olist) {
+		*eoffset = sa_bo->manager->size;
+	} else {
+		struct radeon_sa_bo *next;
+
+		next = list_entry(sa_bo->olist.next, struct radeon_sa_bo, olist);
+		*eoffset = next->soffset;
+	}
+}
+
+static bool radeon_sa_bo_try_alloc(struct radeon_sa_manager *sa_manager,
+				   struct radeon_sa_bo *sa_bo,
+				   unsigned size, unsigned align)
+{
+	unsigned soffset, eoffset, wasted;
+
+	soffset = 0;
+	eoffset = sa_manager->size;
+	radeon_sa_bo_hole_after(sa_manager->last, &soffset, &eoffset);
+	wasted = (align - (soffset % align)) % align;
+
+	if ((eoffset - soffset) >= (size + wasted)) {
+		soffset += wasted;
+
+		sa_bo->manager = sa_manager;
+		sa_bo->soffset = soffset;
+		sa_bo->eoffset = soffset + size;
+		if (sa_manager->last) {
+			list_add(&sa_bo->olist, &sa_manager->last->olist);
+		} else {
+			list_add(&sa_bo->olist, &sa_manager->olist);
+		}
+		INIT_LIST_HEAD(&sa_bo->flist);
+		sa_manager->last = sa_bo;
+		return true;
+	}
+	return false;
+}
+
+static void radeon_sa_bo_next_hole(struct radeon_sa_manager *sa_manager)
+{
+	unsigned i, soffset, best, tmp;
+
+	/* last can't be NULL if we reach this function */
+	soffset = sa_manager->last->eoffset;
+	/* to handle wrap around we add sa_manager->size */
+	best = sa_manager->size * 2;
+	/* go over all fence list and try to find the closest sa_bo
+	 * of the current last
+	 */
+	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+		struct radeon_sa_bo *sa_bo;
+
+		if (list_empty(&sa_manager->flist[i]))
+			continue;
+		sa_bo = list_first_entry(&sa_manager->flist[i],
+					 struct radeon_sa_bo, flist);
+
+		if (!radeon_fence_signaled(sa_bo->fence))
+			continue;
+
+		tmp = sa_bo->soffset;
+		if (tmp < soffset) {
+			/* wrap around, pretend it's after */
+			tmp += sa_manager->size;
+		}
+		tmp -= soffset;
+		if (tmp < best) {
+			/* this sa bo is the closest one */
+			best = tmp;
+			sa_manager->last = list_entry(sa_bo->olist.prev,
+						      struct radeon_sa_bo,
+						      olist);
+		}
+	}
+}
+
+static int radeon_sa_bo_block(struct radeon_device *rdev,
+			      struct radeon_sa_manager *sa_manager)
+{
+	struct radeon_fence *fences[RADEON_NUM_RINGS];
+	int i, r;
+
+	for (i = 0; i < RADEON_NUM_RINGS; ++i) {
+		struct radeon_sa_bo *sa_bo;
+
+		fences[i] = NULL;
+		if (list_empty(&sa_manager->flist[i])) {
+			continue;
+		}
+		sa_bo = list_first_entry(&sa_manager->flist[i],
+					 struct radeon_sa_bo, flist);
+		fences[i] = sa_bo->fence;
+	}
+
+	spin_unlock(&sa_manager->lock);
+	r = radeon_fence_wait_any(rdev, fences, false);
+	spin_lock(&sa_manager->lock);
+	return r;
+}
+
 int radeon_sa_bo_new(struct radeon_device *rdev,
 		     struct radeon_sa_manager *sa_manager,
 		     struct radeon_sa_bo **sa_bo,
 		     unsigned size, unsigned align, bool block)
 {
-	struct radeon_fence *fence = NULL;
-	struct radeon_sa_bo *tmp, *next;
-	struct list_head *head;
-	unsigned offset = 0, wasted = 0;
-	int r;
+	unsigned i;
+	int r = -ENOMEM;
 
 	BUG_ON(align > RADEON_GPU_PAGE_SIZE);
 	BUG_ON(size > sa_manager->size);
 
 	*sa_bo = kmalloc(sizeof(struct radeon_sa_bo), GFP_KERNEL);
-
-retry:
+	if ((*sa_bo) == NULL) {
+		return -ENOMEM;
+	}
+	(*sa_bo)->manager = sa_manager;
+	(*sa_bo)->fence = NULL;
+	INIT_LIST_HEAD(&(*sa_bo)->olist);
+	INIT_LIST_HEAD(&(*sa_bo)->flist);
 
 	spin_lock(&sa_manager->lock);
+	do {
+		/* try to allocate couple time before going to wait */
+		for (i = 0; i < 2; ++i) {
+			radeon_sa_bo_try_free(sa_manager);
 
-	/* no one ? */
-	head = sa_manager->sa_bo.prev;
-	if (list_empty(&sa_manager->sa_bo)) {
-		goto out;
-	}
-
-	/* look for a hole big enough */
-	offset = 0;
-	list_for_each_entry_safe(tmp, next, &sa_manager->sa_bo, list) {
-		/* try to free this object */
-		if (tmp->fence) {
-			if (radeon_fence_signaled(tmp->fence)) {
-				radeon_sa_bo_remove_locked(tmp);
-				continue;
-			} else {
-				fence = tmp->fence;
+			if (radeon_sa_bo_try_alloc(sa_manager, *sa_bo,
+						   size, align)) {
+				spin_unlock(&sa_manager->lock);
+				return 0;
 			}
-		}
 
-		/* room before this object ? */
-		if (offset < tmp->soffset && (tmp->soffset - offset) >= size) {
-			head = tmp->list.prev;
-			goto out;
-		}
-		offset = tmp->eoffset;
-		wasted = offset % align;
-		if (wasted) {
-			wasted = align - wasted;
+			/* see if we can skip over some allocations */
+			radeon_sa_bo_next_hole(sa_manager);
 		}
-		offset += wasted;
-	}
-	/* room at the end ? */
-	head = sa_manager->sa_bo.prev;
-	tmp = list_entry(head, struct radeon_sa_bo, list);
-	offset = tmp->eoffset;
-	wasted = offset % align;
-	if (wasted) {
-		wasted = align - wasted;
-	}
-	offset += wasted;
-	if ((sa_manager->size - offset) < size) {
-		/* failed to find somethings big enough */
-		spin_unlock(&sa_manager->lock);
-		if (block && fence) {
-			r = radeon_fence_wait(fence, false);
-			if (r)
-				return r;
-
-			goto retry;
+
+		if (block) {
+			r = radeon_sa_bo_block(rdev, sa_manager);
+			if (r) {
+				goto out_err;
+			}
 		}
-		kfree(*sa_bo);
-		*sa_bo = NULL;
-		return -ENOMEM;
-	}
+	} while (block);
 
-out:
-	(*sa_bo)->manager = sa_manager;
-	(*sa_bo)->soffset = offset;
-	(*sa_bo)->eoffset = offset + size;
-	list_add(&(*sa_bo)->list, head);
+out_err:
 	spin_unlock(&sa_manager->lock);
-	return 0;
+	kfree(*sa_bo);
+	*sa_bo = NULL;
+	return r;
 }
 
 void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo **sa_bo,
@@ -226,13 +346,16 @@ void radeon_sa_bo_free(struct radeon_device *rdev, struct radeon_sa_bo **sa_bo,
 {
 	struct radeon_sa_manager *sa_manager;
 
-	if (!sa_bo || !*sa_bo)
+	if (sa_bo == NULL || *sa_bo == NULL) {
 		return;
+	}
 
 	sa_manager = (*sa_bo)->manager;
 	spin_lock(&sa_manager->lock);
 	if (fence && fence->seq && fence->seq < RADEON_FENCE_NOTEMITED_SEQ) {
 		(*sa_bo)->fence = radeon_fence_ref(fence);
+		list_add_tail(&(*sa_bo)->flist,
+			      &sa_manager->flist[fence->ring]);
 	} else {
 		radeon_sa_bo_remove_locked(*sa_bo);
 	}
@@ -247,15 +370,18 @@ void radeon_sa_bo_dump_debug_info(struct radeon_sa_manager *sa_manager,
 	struct radeon_sa_bo *i;
 
 	spin_lock(&sa_manager->lock);
-	list_for_each_entry(i, &sa_manager->sa_bo, list) {
-		seq_printf(m, "[%08x %08x] size %4d (%p)",
-			   i->soffset, i->eoffset, i->eoffset - i->soffset, i);
-		if (i->fence) {
-			seq_printf(m, " protected by %Ld (%p)\n",
-				   i->fence->seq, i->fence);
+	list_for_each_entry(i, &sa_manager->olist, olist) {
+		if (i == sa_manager->last) {
+			seq_printf(m, ">");
 		} else {
-			seq_printf(m, "\n");
+			seq_printf(m, " ");
+		}
+		seq_printf(m, "[0x%08x 0x%08x] size %8d",
+			   i->soffset, i->eoffset, i->eoffset - i->soffset);
+		if (i->fence) {
+			seq_printf(m, " protected by 0x%016llx", i->fence->seq);
 		}
+		seq_printf(m, "\n");
 	}
 	spin_unlock(&sa_manager->lock);
 }
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 13/18] drm/radeon: simplify semaphore handling v2
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (11 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 12/18] drm/radeon: multiple ring allocator j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 14/18] drm/radeon: rip out the ib pool j.glisse
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Jerome Glisse, Christian König

From: Jerome Glisse <jglisse@redhat.com>

Directly use the suballocator to get small chunks of memory.
It's equally fast and doesn't crash when we encounter a GPU reset.

v2: rebased on new SA interface.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
---
 drivers/gpu/drm/radeon/evergreen.c        |    1 -
 drivers/gpu/drm/radeon/ni.c               |    1 -
 drivers/gpu/drm/radeon/r600.c             |    1 -
 drivers/gpu/drm/radeon/radeon.h           |   29 +-----
 drivers/gpu/drm/radeon/radeon_device.c    |    2 -
 drivers/gpu/drm/radeon/radeon_fence.c     |    2 +-
 drivers/gpu/drm/radeon/radeon_semaphore.c |  137 +++++------------------------
 drivers/gpu/drm/radeon/radeon_test.c      |    4 +-
 drivers/gpu/drm/radeon/rv770.c            |    1 -
 drivers/gpu/drm/radeon/si.c               |    1 -
 10 files changed, 30 insertions(+), 149 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 0e860c6..ec61194 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3422,7 +3422,6 @@ void evergreen_fini(struct radeon_device *rdev)
 	evergreen_pcie_gart_fini(rdev);
 	r600_vram_scratch_fini(rdev);
 	radeon_gem_fini(rdev);
-	radeon_semaphore_driver_fini(rdev);
 	radeon_fence_driver_fini(rdev);
 	radeon_agp_fini(rdev);
 	radeon_bo_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/ni.c b/drivers/gpu/drm/radeon/ni.c
index 9cd2657..107b217 100644
--- a/drivers/gpu/drm/radeon/ni.c
+++ b/drivers/gpu/drm/radeon/ni.c
@@ -1744,7 +1744,6 @@ void cayman_fini(struct radeon_device *rdev)
 	cayman_pcie_gart_fini(rdev);
 	r600_vram_scratch_fini(rdev);
 	radeon_gem_fini(rdev);
-	radeon_semaphore_driver_fini(rdev);
 	radeon_fence_driver_fini(rdev);
 	radeon_bo_fini(rdev);
 	radeon_atombios_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 1cadf97..2bce657 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2658,7 +2658,6 @@ void r600_fini(struct radeon_device *rdev)
 	r600_vram_scratch_fini(rdev);
 	radeon_agp_fini(rdev);
 	radeon_gem_fini(rdev);
-	radeon_semaphore_driver_fini(rdev);
 	radeon_fence_driver_fini(rdev);
 	radeon_bo_fini(rdev);
 	radeon_atombios_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 1c3eb06..937db02 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -434,34 +434,13 @@ int radeon_mode_dumb_destroy(struct drm_file *file_priv,
 /*
  * Semaphores.
  */
-struct radeon_ring;
-
-#define	RADEON_SEMAPHORE_BO_SIZE	256
-
-struct radeon_semaphore_driver {
-	rwlock_t			lock;
-	struct list_head		bo;
-};
-
-struct radeon_semaphore_bo;
-
 /* everything here is constant */
 struct radeon_semaphore {
-	struct list_head		list;
+	struct radeon_sa_bo		*sa_bo;
+	signed				waiters;
 	uint64_t			gpu_addr;
-	uint32_t			*cpu_ptr;
-	struct radeon_semaphore_bo	*bo;
 };
 
-struct radeon_semaphore_bo {
-	struct list_head		list;
-	struct radeon_ib		*ib;
-	struct list_head		free;
-	struct radeon_semaphore		semaphores[RADEON_SEMAPHORE_BO_SIZE/8];
-	unsigned			nused;
-};
-
-void radeon_semaphore_driver_fini(struct radeon_device *rdev);
 int radeon_semaphore_create(struct radeon_device *rdev,
 			    struct radeon_semaphore **semaphore);
 void radeon_semaphore_emit_signal(struct radeon_device *rdev, int ring,
@@ -473,7 +452,8 @@ int radeon_semaphore_sync_rings(struct radeon_device *rdev,
 				bool sync_to[RADEON_NUM_RINGS],
 				int dst_ring);
 void radeon_semaphore_free(struct radeon_device *rdev,
-			   struct radeon_semaphore *semaphore);
+			   struct radeon_semaphore *semaphore,
+			   struct radeon_fence *fence);
 
 /*
  * GART structures, functions & helpers
@@ -1522,7 +1502,6 @@ struct radeon_device {
 	struct radeon_mman		mman;
 	struct radeon_fence_driver	fence_drv[RADEON_NUM_RINGS];
 	wait_queue_head_t		fence_queue;
-	struct radeon_semaphore_driver	semaphore_drv;
 	struct mutex			ring_lock;
 	struct radeon_ring		ring[RADEON_NUM_RINGS];
 	struct radeon_ib_pool		ib_pool;
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index cb4f9c2..9e28060 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -732,11 +732,9 @@ int radeon_device_init(struct radeon_device *rdev,
 	mutex_init(&rdev->gem.mutex);
 	mutex_init(&rdev->pm.mutex);
 	mutex_init(&rdev->vram_mutex);
-	rwlock_init(&rdev->semaphore_drv.lock);
 	INIT_LIST_HEAD(&rdev->gem.objects);
 	init_waitqueue_head(&rdev->irq.vblank_queue);
 	init_waitqueue_head(&rdev->irq.idle_queue);
-	INIT_LIST_HEAD(&rdev->semaphore_drv.bo);
 	/* initialize vm here */
 	rdev->vm_manager.use_bitmap = 1;
 	rdev->vm_manager.max_pfn = 1 << 20;
diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 9c2c1b3..8b4778f 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -138,7 +138,7 @@ static void radeon_fence_destroy(struct kref *kref)
 	fence = container_of(kref, struct radeon_fence, kref);
 	fence->seq = RADEON_FENCE_NOTEMITED_SEQ;
 	if (fence->semaphore)
-		radeon_semaphore_free(fence->rdev, fence->semaphore);
+		radeon_semaphore_free(fence->rdev, fence->semaphore, NULL);
 	kfree(fence);
 }
 
diff --git a/drivers/gpu/drm/radeon/radeon_semaphore.c b/drivers/gpu/drm/radeon/radeon_semaphore.c
index dbde874..1bc5513 100644
--- a/drivers/gpu/drm/radeon/radeon_semaphore.c
+++ b/drivers/gpu/drm/radeon/radeon_semaphore.c
@@ -31,118 +31,40 @@
 #include "drm.h"
 #include "radeon.h"
 
-static int radeon_semaphore_add_bo(struct radeon_device *rdev)
-{
-	struct radeon_semaphore_bo *bo;
-	unsigned long irq_flags;
-	uint64_t gpu_addr;
-	uint32_t *cpu_ptr;
-	int r, i;
-
-	bo = kmalloc(sizeof(struct radeon_semaphore_bo), GFP_KERNEL);
-	if (bo == NULL) {
-		return -ENOMEM;
-	}
-	INIT_LIST_HEAD(&bo->free);
-	INIT_LIST_HEAD(&bo->list);
-	bo->nused = 0;
-
-	r = radeon_ib_get(rdev, 0, &bo->ib, RADEON_SEMAPHORE_BO_SIZE);
-	if (r) {
-		dev_err(rdev->dev, "failed to get a bo after 5 retry\n");
-		kfree(bo);
-		return r;
-	}
-	gpu_addr = radeon_sa_bo_gpu_addr(bo->ib->sa_bo);
-	cpu_ptr = radeon_sa_bo_cpu_addr(bo->ib->sa_bo);
-	for (i = 0; i < (RADEON_SEMAPHORE_BO_SIZE/8); i++) {
-		bo->semaphores[i].gpu_addr = gpu_addr;
-		bo->semaphores[i].cpu_ptr = cpu_ptr;
-		bo->semaphores[i].bo = bo;
-		list_add_tail(&bo->semaphores[i].list, &bo->free);
-		gpu_addr += 8;
-		cpu_ptr += 2;
-	}
-	write_lock_irqsave(&rdev->semaphore_drv.lock, irq_flags);
-	list_add_tail(&bo->list, &rdev->semaphore_drv.bo);
-	write_unlock_irqrestore(&rdev->semaphore_drv.lock, irq_flags);
-	return 0;
-}
-
-static void radeon_semaphore_del_bo_locked(struct radeon_device *rdev,
-					   struct radeon_semaphore_bo *bo)
-{
-	radeon_sa_bo_free(rdev, &bo->ib->sa_bo, NULL);
-	radeon_fence_unref(&bo->ib->fence);
-	list_del(&bo->list);
-	kfree(bo);
-}
-
-void radeon_semaphore_shrink_locked(struct radeon_device *rdev)
-{
-	struct radeon_semaphore_bo *bo, *n;
-
-	if (list_empty(&rdev->semaphore_drv.bo)) {
-		return;
-	}
-	/* only shrink if first bo has free semaphore */
-	bo = list_first_entry(&rdev->semaphore_drv.bo, struct radeon_semaphore_bo, list);
-	if (list_empty(&bo->free)) {
-		return;
-	}
-	list_for_each_entry_safe_continue(bo, n, &rdev->semaphore_drv.bo, list) {
-		if (bo->nused)
-			continue;
-		radeon_semaphore_del_bo_locked(rdev, bo);
-	}
-}
 
 int radeon_semaphore_create(struct radeon_device *rdev,
 			    struct radeon_semaphore **semaphore)
 {
-	struct radeon_semaphore_bo *bo;
-	unsigned long irq_flags;
-	bool do_retry = true;
 	int r;
 
-retry:
-	*semaphore = NULL;
-	write_lock_irqsave(&rdev->semaphore_drv.lock, irq_flags);
-	list_for_each_entry(bo, &rdev->semaphore_drv.bo, list) {
-		if (list_empty(&bo->free))
-			continue;
-		*semaphore = list_first_entry(&bo->free, struct radeon_semaphore, list);
-		(*semaphore)->cpu_ptr[0] = 0;
-		(*semaphore)->cpu_ptr[1] = 0;
-		list_del(&(*semaphore)->list);
-		bo->nused++;
-		break;
-	}
-	write_unlock_irqrestore(&rdev->semaphore_drv.lock, irq_flags);
-
+	*semaphore = kmalloc(sizeof(struct radeon_semaphore), GFP_KERNEL);
 	if (*semaphore == NULL) {
-		if (do_retry) {
-			do_retry = false;
-			r = radeon_semaphore_add_bo(rdev);
-			if (r)
-				return r;
-			goto retry;
-		}
 		return -ENOMEM;
 	}
-
+	r = radeon_sa_bo_new(rdev, &rdev->ib_pool.sa_manager,
+			     &(*semaphore)->sa_bo, 8, 8, true);
+	if (r) {
+		kfree(*semaphore);
+		*semaphore = NULL;
+		return r;
+	}
+	(*semaphore)->waiters = 0;
+	(*semaphore)->gpu_addr = radeon_sa_bo_gpu_addr((*semaphore)->sa_bo);
+	*((uint64_t*)radeon_sa_bo_cpu_addr((*semaphore)->sa_bo)) = 0;
 	return 0;
 }
 
 void radeon_semaphore_emit_signal(struct radeon_device *rdev, int ring,
 			          struct radeon_semaphore *semaphore)
 {
+	--semaphore->waiters;
 	radeon_semaphore_ring_emit(rdev, ring, &rdev->ring[ring], semaphore, false);
 }
 
 void radeon_semaphore_emit_wait(struct radeon_device *rdev, int ring,
 			        struct radeon_semaphore *semaphore)
 {
+	++semaphore->waiters;
 	radeon_semaphore_ring_emit(rdev, ring, &rdev->ring[ring], semaphore, true);
 }
 
@@ -200,29 +122,16 @@ error:
 }
 
 void radeon_semaphore_free(struct radeon_device *rdev,
-			   struct radeon_semaphore *semaphore)
+			   struct radeon_semaphore *semaphore,
+			   struct radeon_fence *fence)
 {
-	unsigned long irq_flags;
-
-	write_lock_irqsave(&rdev->semaphore_drv.lock, irq_flags);
-	semaphore->bo->nused--;
-	list_add_tail(&semaphore->list, &semaphore->bo->free);
-	radeon_semaphore_shrink_locked(rdev);
-	write_unlock_irqrestore(&rdev->semaphore_drv.lock, irq_flags);
-}
-
-void radeon_semaphore_driver_fini(struct radeon_device *rdev)
-{
-	struct radeon_semaphore_bo *bo, *n;
-	unsigned long irq_flags;
-
-	write_lock_irqsave(&rdev->semaphore_drv.lock, irq_flags);
-	/* we force to free everything */
-	list_for_each_entry_safe(bo, n, &rdev->semaphore_drv.bo, list) {
-		if (!list_empty(&bo->free)) {
-			dev_err(rdev->dev, "still in use semaphore\n");
-		}
-		radeon_semaphore_del_bo_locked(rdev, bo);
+	if (semaphore == NULL) {
+		return;
+	}
+	if (semaphore->waiters > 0) {
+		dev_err(rdev->dev, "semaphore %p has more waiters than signalers,"
+			" hardware lockup imminent!\n", semaphore);
 	}
-	write_unlock_irqrestore(&rdev->semaphore_drv.lock, irq_flags);
+	radeon_sa_bo_free(rdev, &semaphore->sa_bo, fence);
+	kfree(semaphore);
 }
diff --git a/drivers/gpu/drm/radeon/radeon_test.c b/drivers/gpu/drm/radeon/radeon_test.c
index dc5dcf4..b057387 100644
--- a/drivers/gpu/drm/radeon/radeon_test.c
+++ b/drivers/gpu/drm/radeon/radeon_test.c
@@ -317,7 +317,7 @@ void radeon_test_ring_sync(struct radeon_device *rdev,
 
 out_cleanup:
 	if (semaphore)
-		radeon_semaphore_free(rdev, semaphore);
+		radeon_semaphore_free(rdev, semaphore, NULL);
 
 	if (fence1)
 		radeon_fence_unref(&fence1);
@@ -437,7 +437,7 @@ void radeon_test_ring_sync2(struct radeon_device *rdev,
 
 out_cleanup:
 	if (semaphore)
-		radeon_semaphore_free(rdev, semaphore);
+		radeon_semaphore_free(rdev, semaphore, NULL);
 
 	if (fenceA)
 		radeon_fence_unref(&fenceA);
diff --git a/drivers/gpu/drm/radeon/rv770.c b/drivers/gpu/drm/radeon/rv770.c
index a8b0016..40f82e2 100644
--- a/drivers/gpu/drm/radeon/rv770.c
+++ b/drivers/gpu/drm/radeon/rv770.c
@@ -1278,7 +1278,6 @@ void rv770_fini(struct radeon_device *rdev)
 	rv770_pcie_gart_fini(rdev);
 	r600_vram_scratch_fini(rdev);
 	radeon_gem_fini(rdev);
-	radeon_semaphore_driver_fini(rdev);
 	radeon_fence_driver_fini(rdev);
 	radeon_agp_fini(rdev);
 	radeon_bo_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/si.c b/drivers/gpu/drm/radeon/si.c
index f8ee066..b25a8ff 100644
--- a/drivers/gpu/drm/radeon/si.c
+++ b/drivers/gpu/drm/radeon/si.c
@@ -4107,7 +4107,6 @@ void si_fini(struct radeon_device *rdev)
 	si_pcie_gart_fini(rdev);
 	r600_vram_scratch_fini(rdev);
 	radeon_gem_fini(rdev);
-	radeon_semaphore_driver_fini(rdev);
 	radeon_fence_driver_fini(rdev);
 	radeon_bo_fini(rdev);
 	radeon_atombios_fini(rdev);
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 14/18] drm/radeon: rip out the ib pool
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (12 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 13/18] drm/radeon: simplify semaphore handling v2 j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 15/18] drm/radeon: immediately remove ttm-move semaphore j.glisse
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Jerome Glisse, Christian König

From: Jerome Glisse <jglisse@redhat.com>

It isn't necessary any more and the suballocator seems to perform
even better.

Signed-off-by: Christian König <deathsimple@vodafone.de>
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
---
 drivers/gpu/drm/radeon/radeon.h           |   17 +--
 drivers/gpu/drm/radeon/radeon_device.c    |    1 -
 drivers/gpu/drm/radeon/radeon_gart.c      |   12 +-
 drivers/gpu/drm/radeon/radeon_ring.c      |  241 ++++++++---------------------
 drivers/gpu/drm/radeon/radeon_semaphore.c |    2 +-
 5 files changed, 71 insertions(+), 202 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 937db02..24220d1 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -616,7 +616,6 @@ void radeon_irq_kms_pflip_irq_put(struct radeon_device *rdev, int crtc);
 
 struct radeon_ib {
 	struct radeon_sa_bo	*sa_bo;
-	unsigned		idx;
 	uint32_t		length_dw;
 	uint64_t		gpu_addr;
 	uint32_t		*ptr;
@@ -625,18 +624,6 @@ struct radeon_ib {
 	bool			is_const_ib;
 };
 
-/*
- * locking -
- * mutex protects scheduled_ibs, ready, alloc_bm
- */
-struct radeon_ib_pool {
-	struct radeon_mutex		mutex;
-	struct radeon_sa_manager	sa_manager;
-	struct radeon_ib		ibs[RADEON_IB_POOL_SIZE];
-	bool				ready;
-	unsigned			head_id;
-};
-
 struct radeon_ring {
 	struct radeon_bo	*ring_obj;
 	volatile uint32_t	*ring;
@@ -778,7 +765,6 @@ struct si_rlc {
 int radeon_ib_get(struct radeon_device *rdev, int ring,
 		  struct radeon_ib **ib, unsigned size);
 void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib);
-bool radeon_ib_try_free(struct radeon_device *rdev, struct radeon_ib *ib);
 int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
 int radeon_ib_pool_init(struct radeon_device *rdev);
 void radeon_ib_pool_fini(struct radeon_device *rdev);
@@ -1504,7 +1490,8 @@ struct radeon_device {
 	wait_queue_head_t		fence_queue;
 	struct mutex			ring_lock;
 	struct radeon_ring		ring[RADEON_NUM_RINGS];
-	struct radeon_ib_pool		ib_pool;
+	bool				ib_pool_ready;
+	struct radeon_sa_manager	ring_tmp_bo;
 	struct radeon_irq		irq;
 	struct radeon_asic		*asic;
 	struct radeon_gem		gem;
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index 9e28060..3253e8d 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -724,7 +724,6 @@ int radeon_device_init(struct radeon_device *rdev,
 	/* mutex initialization are all done here so we
 	 * can recall function without having locking issues */
 	radeon_mutex_init(&rdev->cs_mutex);
-	radeon_mutex_init(&rdev->ib_pool.mutex);
 	mutex_init(&rdev->ring_lock);
 	mutex_init(&rdev->dc_hw_i2c_mutex);
 	if (rdev->family >= CHIP_R600)
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c
index 53dba8e..8e9ef34 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -432,8 +432,8 @@ retry_id:
 	rdev->vm_manager.use_bitmap |= 1 << id;
 	vm->id = id;
 	list_add_tail(&vm->list, &rdev->vm_manager.lru_vm);
-	return radeon_vm_bo_update_pte(rdev, vm, rdev->ib_pool.sa_manager.bo,
-				       &rdev->ib_pool.sa_manager.bo->tbo.mem);
+	return radeon_vm_bo_update_pte(rdev, vm, rdev->ring_tmp_bo.bo,
+				       &rdev->ring_tmp_bo.bo->tbo.mem);
 }
 
 /* object have to be reserved */
@@ -631,7 +631,7 @@ int radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm)
 	/* map the ib pool buffer at 0 in virtual address space, set
 	 * read only
 	 */
-	r = radeon_vm_bo_add(rdev, vm, rdev->ib_pool.sa_manager.bo, 0,
+	r = radeon_vm_bo_add(rdev, vm, rdev->ring_tmp_bo.bo, 0,
 			     RADEON_VM_PAGE_READABLE | RADEON_VM_PAGE_SNOOPED);
 	return r;
 }
@@ -648,12 +648,12 @@ void radeon_vm_fini(struct radeon_device *rdev, struct radeon_vm *vm)
 	radeon_mutex_unlock(&rdev->cs_mutex);
 
 	/* remove all bo */
-	r = radeon_bo_reserve(rdev->ib_pool.sa_manager.bo, false);
+	r = radeon_bo_reserve(rdev->ring_tmp_bo.bo, false);
 	if (!r) {
-		bo_va = radeon_bo_va(rdev->ib_pool.sa_manager.bo, vm);
+		bo_va = radeon_bo_va(rdev->ring_tmp_bo.bo, vm);
 		list_del_init(&bo_va->bo_list);
 		list_del_init(&bo_va->vm_list);
-		radeon_bo_unreserve(rdev->ib_pool.sa_manager.bo);
+		radeon_bo_unreserve(rdev->ring_tmp_bo.bo);
 		kfree(bo_va);
 	}
 	if (!list_empty(&vm->va)) {
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 2eed251..c70a3b7 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -24,6 +24,7 @@
  * Authors: Dave Airlie
  *          Alex Deucher
  *          Jerome Glisse
+ *          Christian König
  */
 #include <linux/seq_file.h>
 #include <linux/slab.h>
@@ -33,8 +34,10 @@
 #include "radeon.h"
 #include "atom.h"
 
-int radeon_debugfs_ib_init(struct radeon_device *rdev);
-int radeon_debugfs_ring_init(struct radeon_device *rdev, struct radeon_ring *ring);
+/*
+ * IB.
+ */
+int radeon_debugfs_sa_init(struct radeon_device *rdev);
 
 u32 radeon_get_ib_value(struct radeon_cs_parser *p, int idx)
 {
@@ -61,106 +64,37 @@ u32 radeon_get_ib_value(struct radeon_cs_parser *p, int idx)
 	return idx_value;
 }
 
-void radeon_ring_write(struct radeon_ring *ring, uint32_t v)
-{
-#if DRM_DEBUG_CODE
-	if (ring->count_dw <= 0) {
-		DRM_ERROR("radeon: writting more dword to ring than expected !\n");
-	}
-#endif
-	ring->ring[ring->wptr++] = v;
-	ring->wptr &= ring->ptr_mask;
-	ring->count_dw--;
-	ring->ring_free_dw--;
-}
-
-/*
- * IB.
- */
-bool radeon_ib_try_free(struct radeon_device *rdev, struct radeon_ib *ib)
-{
-	bool done = false;
-
-	/* only free ib which have been emited */
-	if (ib->fence && ib->fence->seq < RADEON_FENCE_NOTEMITED_SEQ) {
-		if (radeon_fence_signaled(ib->fence)) {
-			radeon_fence_unref(&ib->fence);
-			radeon_sa_bo_free(rdev, &ib->sa_bo, NULL);
-			done = true;
-		}
-	}
-	return done;
-}
-
 int radeon_ib_get(struct radeon_device *rdev, int ring,
 		  struct radeon_ib **ib, unsigned size)
 {
-	struct radeon_fence *fence;
-	unsigned cretry = 0;
-	int r = 0, i, idx;
-
-	*ib = NULL;
-	/* align size on 256 bytes */
-	size = ALIGN(size, 256);
-
-	r = radeon_fence_create(rdev, &fence, ring);
-	if (r) {
-		dev_err(rdev->dev, "failed to create fence for new IB\n");
-		return r;
-	}
+	int r;
 
-	radeon_mutex_lock(&rdev->ib_pool.mutex);
-	idx = rdev->ib_pool.head_id;
-retry:
-	if (cretry > 5) {
-		dev_err(rdev->dev, "failed to get an ib after 5 retry\n");
-		radeon_mutex_unlock(&rdev->ib_pool.mutex);
-		radeon_fence_unref(&fence);
+	*ib = kmalloc(sizeof(struct radeon_ib), GFP_KERNEL);
+	if (*ib == NULL) {
 		return -ENOMEM;
 	}
-	cretry++;
-	for (i = 0; i < RADEON_IB_POOL_SIZE; i++) {
-		radeon_ib_try_free(rdev, &rdev->ib_pool.ibs[idx]);
-		if (rdev->ib_pool.ibs[idx].fence == NULL) {
-			r = radeon_sa_bo_new(rdev, &rdev->ib_pool.sa_manager,
-					     &rdev->ib_pool.ibs[idx].sa_bo,
-					     size, 256, false);
-			if (!r) {
-				*ib = &rdev->ib_pool.ibs[idx];
-				(*ib)->ptr = radeon_sa_bo_cpu_addr((*ib)->sa_bo);
-				(*ib)->gpu_addr = radeon_sa_bo_gpu_addr((*ib)->sa_bo);
-				(*ib)->fence = fence;
-				(*ib)->vm_id = 0;
-				(*ib)->is_const_ib = false;
-				/* ib are most likely to be allocated in a ring fashion
-				 * thus rdev->ib_pool.head_id should be the id of the
-				 * oldest ib
-				 */
-				rdev->ib_pool.head_id = (1 + idx);
-				rdev->ib_pool.head_id &= (RADEON_IB_POOL_SIZE - 1);
-				radeon_mutex_unlock(&rdev->ib_pool.mutex);
-				return 0;
-			}
-		}
-		idx = (idx + 1) & (RADEON_IB_POOL_SIZE - 1);
+	r = radeon_sa_bo_new(rdev, &rdev->ring_tmp_bo, &(*ib)->sa_bo, size, 256, true);
+	if (r) {
+		dev_err(rdev->dev, "failed to get a new IB (%d)\n", r);
+		kfree(*ib);
+		*ib = NULL;
+		return r;
 	}
-	/* this should be rare event, ie all ib scheduled none signaled yet.
-	 */
-	for (i = 0; i < RADEON_IB_POOL_SIZE; i++) {
-		struct radeon_fence *fence = rdev->ib_pool.ibs[idx].fence;
-		if (fence && fence->seq < RADEON_FENCE_NOTEMITED_SEQ) {
-			r = radeon_fence_wait(fence, false);
-			if (!r) {
-				goto retry;
-			}
-			/* an error happened */
-			break;
-		}
-		idx = (idx + 1) & (RADEON_IB_POOL_SIZE - 1);
+	r = radeon_fence_create(rdev, &(*ib)->fence, ring);
+	if (r) {
+		dev_err(rdev->dev, "failed to create fence for new IB (%d)\n", r);
+		radeon_sa_bo_free(rdev, &(*ib)->sa_bo, NULL);
+		kfree(*ib);
+		*ib = NULL;
+		return r;
 	}
-	radeon_mutex_unlock(&rdev->ib_pool.mutex);
-	radeon_fence_unref(&fence);
-	return r;
+
+	(*ib)->ptr = radeon_sa_bo_cpu_addr((*ib)->sa_bo);
+	(*ib)->gpu_addr = radeon_sa_bo_gpu_addr((*ib)->sa_bo);
+	(*ib)->vm_id = 0;
+	(*ib)->is_const_ib = false;
+
+	return 0;
 }
 
 void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib)
@@ -171,12 +105,9 @@ void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib)
 	if (tmp == NULL) {
 		return;
 	}
-	radeon_mutex_lock(&rdev->ib_pool.mutex);
-	if (tmp->fence && tmp->fence->seq == RADEON_FENCE_NOTEMITED_SEQ) {
-		radeon_sa_bo_free(rdev, &tmp->sa_bo, NULL);
-		radeon_fence_unref(&tmp->fence);
-	}
-	radeon_mutex_unlock(&rdev->ib_pool.mutex);
+	radeon_sa_bo_free(rdev, &tmp->sa_bo, tmp->fence);
+	radeon_fence_unref(&tmp->fence);
+	kfree(tmp);
 }
 
 int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib)
@@ -186,14 +117,14 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib)
 
 	if (!ib->length_dw || !ring->ready) {
 		/* TODO: Nothings in the ib we should report. */
-		DRM_ERROR("radeon: couldn't schedule IB(%u).\n", ib->idx);
+		dev_err(rdev->dev, "couldn't schedule ib\n");
 		return -EINVAL;
 	}
 
 	/* 64 dwords should be enough for fence too */
 	r = radeon_ring_lock(rdev, ring, 64);
 	if (r) {
-		DRM_ERROR("radeon: scheduling IB failed (%d).\n", r);
+		dev_err(rdev->dev, "scheduling IB failed (%d).\n", r);
 		return r;
 	}
 	radeon_ring_ib_execute(rdev, ib->fence->ring, ib);
@@ -204,63 +135,40 @@ int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib)
 
 int radeon_ib_pool_init(struct radeon_device *rdev)
 {
-	int i, r;
+	int r;
 
-	radeon_mutex_lock(&rdev->ib_pool.mutex);
-	if (rdev->ib_pool.ready) {
-		radeon_mutex_unlock(&rdev->ib_pool.mutex);
+	if (rdev->ib_pool_ready) {
 		return 0;
 	}
-
-	r = radeon_sa_bo_manager_init(rdev, &rdev->ib_pool.sa_manager,
+	r = radeon_sa_bo_manager_init(rdev, &rdev->ring_tmp_bo,
 				      RADEON_IB_POOL_SIZE*64*1024,
 				      RADEON_GEM_DOMAIN_GTT);
 	if (r) {
-		radeon_mutex_unlock(&rdev->ib_pool.mutex);
 		return r;
 	}
-
-	for (i = 0; i < RADEON_IB_POOL_SIZE; i++) {
-		rdev->ib_pool.ibs[i].fence = NULL;
-		rdev->ib_pool.ibs[i].idx = i;
-		rdev->ib_pool.ibs[i].length_dw = 0;
-		rdev->ib_pool.ibs[i].sa_bo = NULL;
-	}
-	rdev->ib_pool.head_id = 0;
-	rdev->ib_pool.ready = true;
-	DRM_INFO("radeon: ib pool ready.\n");
-
-	if (radeon_debugfs_ib_init(rdev)) {
-		DRM_ERROR("Failed to register debugfs file for IB !\n");
+	rdev->ib_pool_ready = true;
+	if (radeon_debugfs_sa_init(rdev)) {
+		dev_err(rdev->dev, "failed to register debugfs file for SA\n");
 	}
-	radeon_mutex_unlock(&rdev->ib_pool.mutex);
 	return 0;
 }
 
 void radeon_ib_pool_fini(struct radeon_device *rdev)
 {
-	unsigned i;
-
-	radeon_mutex_lock(&rdev->ib_pool.mutex);
-	if (rdev->ib_pool.ready) {
-		for (i = 0; i < RADEON_IB_POOL_SIZE; i++) {
-			radeon_sa_bo_free(rdev, &rdev->ib_pool.ibs[i].sa_bo, NULL);
-			radeon_fence_unref(&rdev->ib_pool.ibs[i].fence);
-		}
-		radeon_sa_bo_manager_fini(rdev, &rdev->ib_pool.sa_manager);
-		rdev->ib_pool.ready = false;
+	if (rdev->ib_pool_ready) {
+		radeon_sa_bo_manager_fini(rdev, &rdev->ring_tmp_bo);
+		rdev->ib_pool_ready = false;
 	}
-	radeon_mutex_unlock(&rdev->ib_pool.mutex);
 }
 
 int radeon_ib_pool_start(struct radeon_device *rdev)
 {
-	return radeon_sa_bo_manager_start(rdev, &rdev->ib_pool.sa_manager);
+	return radeon_sa_bo_manager_start(rdev, &rdev->ring_tmp_bo);
 }
 
 int radeon_ib_pool_suspend(struct radeon_device *rdev)
 {
-	return radeon_sa_bo_manager_suspend(rdev, &rdev->ib_pool.sa_manager);
+	return radeon_sa_bo_manager_suspend(rdev, &rdev->ring_tmp_bo);
 }
 
 int radeon_ib_ring_tests(struct radeon_device *rdev)
@@ -296,6 +204,21 @@ int radeon_ib_ring_tests(struct radeon_device *rdev)
 /*
  * Ring.
  */
+int radeon_debugfs_ring_init(struct radeon_device *rdev, struct radeon_ring *ring);
+
+void radeon_ring_write(struct radeon_ring *ring, uint32_t v)
+{
+#if DRM_DEBUG_CODE
+	if (ring->count_dw <= 0) {
+		DRM_ERROR("radeon: writting more dword to ring than expected !\n");
+	}
+#endif
+	ring->ring[ring->wptr++] = v;
+	ring->wptr &= ring->ptr_mask;
+	ring->count_dw--;
+	ring->ring_free_dw--;
+}
+
 int radeon_ring_index(struct radeon_device *rdev, struct radeon_ring *ring)
 {
 	/* r1xx-r5xx only has CP ring */
@@ -577,37 +500,13 @@ static struct drm_info_list radeon_debugfs_ring_info_list[] = {
 	{"radeon_ring_cp2", radeon_debugfs_ring_info, 0, &cayman_ring_type_cp2_index},
 };
 
-static int radeon_debugfs_ib_info(struct seq_file *m, void *data)
-{
-	struct drm_info_node *node = (struct drm_info_node *) m->private;
-	struct drm_device *dev = node->minor->dev;
-	struct radeon_device *rdev = dev->dev_private;
-	struct radeon_ib *ib = &rdev->ib_pool.ibs[*((unsigned*)node->info_ent->data)];
-	unsigned i;
-
-	if (ib == NULL) {
-		return 0;
-	}
-	seq_printf(m, "IB %04u\n", ib->idx);
-	seq_printf(m, "IB fence %p\n", ib->fence);
-	seq_printf(m, "IB size %05u dwords\n", ib->length_dw);
-	for (i = 0; i < ib->length_dw; i++) {
-		seq_printf(m, "[%05u]=0x%08X\n", i, ib->ptr[i]);
-	}
-	return 0;
-}
-
-static struct drm_info_list radeon_debugfs_ib_list[RADEON_IB_POOL_SIZE];
-static char radeon_debugfs_ib_names[RADEON_IB_POOL_SIZE][32];
-static unsigned radeon_debugfs_ib_idx[RADEON_IB_POOL_SIZE];
-
 static int radeon_debugfs_sa_info(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct radeon_device *rdev = dev->dev_private;
 
-	radeon_sa_bo_dump_debug_info(&rdev->ib_pool.sa_manager, m);
+	radeon_sa_bo_dump_debug_info(&rdev->ring_tmp_bo, m);
 
 	return 0;
 
@@ -639,26 +538,10 @@ int radeon_debugfs_ring_init(struct radeon_device *rdev, struct radeon_ring *rin
 	return 0;
 }
 
-int radeon_debugfs_ib_init(struct radeon_device *rdev)
+int radeon_debugfs_sa_init(struct radeon_device *rdev)
 {
 #if defined(CONFIG_DEBUG_FS)
-	unsigned i;
-	int r;
-
-	r = radeon_debugfs_add_files(rdev, radeon_debugfs_sa_list, 1);
-	if (r)
-		return r;
-
-	for (i = 0; i < RADEON_IB_POOL_SIZE; i++) {
-		sprintf(radeon_debugfs_ib_names[i], "radeon_ib_%04u", i);
-		radeon_debugfs_ib_idx[i] = i;
-		radeon_debugfs_ib_list[i].name = radeon_debugfs_ib_names[i];
-		radeon_debugfs_ib_list[i].show = &radeon_debugfs_ib_info;
-		radeon_debugfs_ib_list[i].driver_features = 0;
-		radeon_debugfs_ib_list[i].data = &radeon_debugfs_ib_idx[i];
-	}
-	return radeon_debugfs_add_files(rdev, radeon_debugfs_ib_list,
-					RADEON_IB_POOL_SIZE);
+	return radeon_debugfs_add_files(rdev, radeon_debugfs_sa_list, 1);
 #else
 	return 0;
 #endif
diff --git a/drivers/gpu/drm/radeon/radeon_semaphore.c b/drivers/gpu/drm/radeon/radeon_semaphore.c
index 1bc5513..e2ace5d 100644
--- a/drivers/gpu/drm/radeon/radeon_semaphore.c
+++ b/drivers/gpu/drm/radeon/radeon_semaphore.c
@@ -41,7 +41,7 @@ int radeon_semaphore_create(struct radeon_device *rdev,
 	if (*semaphore == NULL) {
 		return -ENOMEM;
 	}
-	r = radeon_sa_bo_new(rdev, &rdev->ib_pool.sa_manager,
+	r = radeon_sa_bo_new(rdev, &rdev->ring_tmp_bo,
 			     &(*semaphore)->sa_bo, 8, 8, true);
 	if (r) {
 		kfree(*semaphore);
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 15/18] drm/radeon: immediately remove ttm-move semaphore
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (13 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 14/18] drm/radeon: rip out the ib pool j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 16/18] drm/radeon: move the semaphore from the fence into the ib j.glisse
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König

From: Christian König <deathsimple@vodafone.de>

We can now protected the semaphore ram by a
fence, so free it immediately.

Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon_ttm.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index 5e3d54d..0f6aee8 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -223,6 +223,7 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,
 	struct radeon_device *rdev;
 	uint64_t old_start, new_start;
 	struct radeon_fence *fence, *old_fence;
+	struct radeon_semaphore *sem = NULL;
 	int r;
 
 	rdev = radeon_get_rdev(bo->bdev);
@@ -272,15 +273,16 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,
 		bool sync_to_ring[RADEON_NUM_RINGS] = { };
 		sync_to_ring[old_fence->ring] = true;
 
-		r = radeon_semaphore_create(rdev, &fence->semaphore);
+		r = radeon_semaphore_create(rdev, &sem);
 		if (r) {
 			radeon_fence_unref(&fence);
 			return r;
 		}
 
-		r = radeon_semaphore_sync_rings(rdev, fence->semaphore,
+		r = radeon_semaphore_sync_rings(rdev, sem,
 						sync_to_ring, fence->ring);
 		if (r) {
+			radeon_semaphore_free(rdev, sem, NULL);
 			radeon_fence_unref(&fence);
 			return r;
 		}
@@ -292,6 +294,7 @@ static int radeon_move_blit(struct ttm_buffer_object *bo,
 	/* FIXME: handle copy error */
 	r = ttm_bo_move_accel_cleanup(bo, (void *)fence, NULL,
 				      evict, no_wait_reserve, no_wait_gpu, new_mem);
+	radeon_semaphore_free(rdev, sem, fence);
 	radeon_fence_unref(&fence);
 	return r;
 }
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 16/18] drm/radeon: move the semaphore from the fence into the ib
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (14 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 15/18] drm/radeon: immediately remove ttm-move semaphore j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 17/18] drm/radeon: remove r600 blit mutex v2 j.glisse
  2012-05-04 22:43 ` [PATCH 18/18] drm/radeon: make the ib an inline object j.glisse
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Jerome Glisse, Christian König

From: Jerome Glisse <jglisse@redhat.com>

It never really belonged there in the first place.

Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/radeon.h       |   16 ++++++++--------
 drivers/gpu/drm/radeon/radeon_cs.c    |    4 ++--
 drivers/gpu/drm/radeon/radeon_fence.c |    3 ---
 drivers/gpu/drm/radeon/radeon_ring.c  |    2 ++
 4 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 24220d1..3b9e2d3 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -272,7 +272,6 @@ struct radeon_fence {
 	uint64_t			seq;
 	/* RB, DMA, etc. */
 	unsigned			ring;
-	struct radeon_semaphore		*semaphore;
 };
 
 int radeon_fence_driver_start_ring(struct radeon_device *rdev, int ring);
@@ -615,13 +614,14 @@ void radeon_irq_kms_pflip_irq_put(struct radeon_device *rdev, int crtc);
  */
 
 struct radeon_ib {
-	struct radeon_sa_bo	*sa_bo;
-	uint32_t		length_dw;
-	uint64_t		gpu_addr;
-	uint32_t		*ptr;
-	struct radeon_fence	*fence;
-	unsigned		vm_id;
-	bool			is_const_ib;
+	struct radeon_sa_bo		*sa_bo;
+	uint32_t			length_dw;
+	uint64_t			gpu_addr;
+	uint32_t			*ptr;
+	struct radeon_fence		*fence;
+	unsigned			vm_id;
+	bool				is_const_ib;
+	struct radeon_semaphore		*semaphore;
 };
 
 struct radeon_ring {
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 2aa3f89..7dbf632 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -138,12 +138,12 @@ static int radeon_cs_sync_rings(struct radeon_cs_parser *p)
 		return 0;
 	}
 
-	r = radeon_semaphore_create(p->rdev, &p->ib->fence->semaphore);
+	r = radeon_semaphore_create(p->rdev, &p->ib->semaphore);
 	if (r) {
 		return r;
 	}
 
-	return radeon_semaphore_sync_rings(p->rdev, p->ib->fence->semaphore,
+	return radeon_semaphore_sync_rings(p->rdev, p->ib->semaphore,
 					   sync_to_ring, p->ring);
 }
 
diff --git a/drivers/gpu/drm/radeon/radeon_fence.c b/drivers/gpu/drm/radeon/radeon_fence.c
index 8b4778f..541ae92 100644
--- a/drivers/gpu/drm/radeon/radeon_fence.c
+++ b/drivers/gpu/drm/radeon/radeon_fence.c
@@ -137,8 +137,6 @@ static void radeon_fence_destroy(struct kref *kref)
 
 	fence = container_of(kref, struct radeon_fence, kref);
 	fence->seq = RADEON_FENCE_NOTEMITED_SEQ;
-	if (fence->semaphore)
-		radeon_semaphore_free(fence->rdev, fence->semaphore, NULL);
 	kfree(fence);
 }
 
@@ -154,7 +152,6 @@ int radeon_fence_create(struct radeon_device *rdev,
 	(*fence)->rdev = rdev;
 	(*fence)->seq = RADEON_FENCE_NOTEMITED_SEQ;
 	(*fence)->ring = ring;
-	(*fence)->semaphore = NULL;
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index c70a3b7..b41aa82 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -93,6 +93,7 @@ int radeon_ib_get(struct radeon_device *rdev, int ring,
 	(*ib)->gpu_addr = radeon_sa_bo_gpu_addr((*ib)->sa_bo);
 	(*ib)->vm_id = 0;
 	(*ib)->is_const_ib = false;
+	(*ib)->semaphore = NULL;
 
 	return 0;
 }
@@ -105,6 +106,7 @@ void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib)
 	if (tmp == NULL) {
 		return;
 	}
+	radeon_semaphore_free(rdev, tmp->semaphore, tmp->fence);
 	radeon_sa_bo_free(rdev, &tmp->sa_bo, tmp->fence);
 	radeon_fence_unref(&tmp->fence);
 	kfree(tmp);
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 17/18] drm/radeon: remove r600 blit mutex v2
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (15 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 16/18] drm/radeon: move the semaphore from the fence into the ib j.glisse
@ 2012-05-04 22:43 ` j.glisse
  2012-05-04 22:43 ` [PATCH 18/18] drm/radeon: make the ib an inline object j.glisse
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Christian König

From: Christian König <deathsimple@vodafone.de>

If we don't store local data into global variables
it isn't necessary to lock anything.

v2: rebased on new SA interface

Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/evergreen_blit_kms.c |    1 -
 drivers/gpu/drm/radeon/r600.c               |   13 +---
 drivers/gpu/drm/radeon/r600_blit_kms.c      |   99 +++++++++++----------------
 drivers/gpu/drm/radeon/radeon.h             |    3 -
 drivers/gpu/drm/radeon/radeon_asic.h        |    9 ++-
 5 files changed, 50 insertions(+), 75 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen_blit_kms.c b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
index 222acd2..30f0480 100644
--- a/drivers/gpu/drm/radeon/evergreen_blit_kms.c
+++ b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
@@ -637,7 +637,6 @@ int evergreen_blit_init(struct radeon_device *rdev)
 	if (rdev->r600_blit.shader_obj)
 		goto done;
 
-	mutex_init(&rdev->r600_blit.mutex);
 	rdev->r600_blit.state_offset = 0;
 
 	if (rdev->family < CHIP_CAYMAN)
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 2bce657..d963fd8 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2363,20 +2363,15 @@ int r600_copy_blit(struct radeon_device *rdev,
 		   unsigned num_gpu_pages,
 		   struct radeon_fence *fence)
 {
+	struct radeon_sa_bo *vb = NULL;
 	int r;
 
-	mutex_lock(&rdev->r600_blit.mutex);
-	rdev->r600_blit.vb_ib = NULL;
-	r = r600_blit_prepare_copy(rdev, num_gpu_pages);
+	r = r600_blit_prepare_copy(rdev, num_gpu_pages, &vb);
 	if (r) {
-		if (rdev->r600_blit.vb_ib)
-			radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
-		mutex_unlock(&rdev->r600_blit.mutex);
 		return r;
 	}
-	r600_kms_blit_copy(rdev, src_offset, dst_offset, num_gpu_pages);
-	r600_blit_done_copy(rdev, fence);
-	mutex_unlock(&rdev->r600_blit.mutex);
+	r600_kms_blit_copy(rdev, src_offset, dst_offset, num_gpu_pages, vb);
+	r600_blit_done_copy(rdev, fence, vb);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c
index db38f58..ef20822 100644
--- a/drivers/gpu/drm/radeon/r600_blit_kms.c
+++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
@@ -513,7 +513,6 @@ int r600_blit_init(struct radeon_device *rdev)
 	rdev->r600_blit.primitives.set_default_state = set_default_state;
 
 	rdev->r600_blit.ring_size_common = 40; /* shaders + def state */
-	rdev->r600_blit.ring_size_common += 16; /* fence emit for VB IB */
 	rdev->r600_blit.ring_size_common += 5; /* done copy */
 	rdev->r600_blit.ring_size_common += 16; /* fence emit for done copy */
 
@@ -528,7 +527,6 @@ int r600_blit_init(struct radeon_device *rdev)
 	if (rdev->r600_blit.shader_obj)
 		goto done;
 
-	mutex_init(&rdev->r600_blit.mutex);
 	rdev->r600_blit.state_offset = 0;
 
 	if (rdev->family >= CHIP_RV770)
@@ -621,27 +619,6 @@ void r600_blit_fini(struct radeon_device *rdev)
 	radeon_bo_unref(&rdev->r600_blit.shader_obj);
 }
 
-static int r600_vb_ib_get(struct radeon_device *rdev, unsigned size)
-{
-	int r;
-	r = radeon_ib_get(rdev, RADEON_RING_TYPE_GFX_INDEX,
-			  &rdev->r600_blit.vb_ib, size);
-	if (r) {
-		DRM_ERROR("failed to get IB for vertex buffer\n");
-		return r;
-	}
-
-	rdev->r600_blit.vb_total = size;
-	rdev->r600_blit.vb_used = 0;
-	return 0;
-}
-
-static void r600_vb_ib_put(struct radeon_device *rdev)
-{
-	radeon_fence_emit(rdev, rdev->r600_blit.vb_ib->fence);
-	radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
-}
-
 static unsigned r600_blit_create_rect(unsigned num_gpu_pages,
 				      int *width, int *height, int max_dim)
 {
@@ -688,7 +665,8 @@ static unsigned r600_blit_create_rect(unsigned num_gpu_pages,
 }
 
 
-int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_gpu_pages)
+int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_gpu_pages,
+			   struct radeon_sa_bo **vb)
 {
 	struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	int r;
@@ -705,46 +683,54 @@ int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_gpu_pages)
 	}
 
 	/* 48 bytes for vertex per loop */
-	r = r600_vb_ib_get(rdev, (num_loops*48)+256);
-	if (r)
+	r = radeon_sa_bo_new(rdev, &rdev->ring_tmp_bo, vb,
+			     (num_loops*48)+256, 256, true);
+	if (r) {
 		return r;
+	}
 
 	/* calculate number of loops correctly */
 	ring_size = num_loops * dwords_per_loop;
 	ring_size += rdev->r600_blit.ring_size_common;
 	r = radeon_ring_lock(rdev, ring, ring_size);
-	if (r)
+	if (r) {
+		radeon_sa_bo_free(rdev, vb, NULL);
 		return r;
+	}
 
 	rdev->r600_blit.primitives.set_default_state(rdev);
 	rdev->r600_blit.primitives.set_shaders(rdev);
 	return 0;
 }
 
-void r600_blit_done_copy(struct radeon_device *rdev, struct radeon_fence *fence)
+void r600_blit_done_copy(struct radeon_device *rdev, struct radeon_fence *fence,
+			 struct radeon_sa_bo *vb)
 {
+	struct radeon_ring *ring = &rdev->ring[RADEON_RING_TYPE_GFX_INDEX];
 	int r;
 
-	if (rdev->r600_blit.vb_ib)
-		r600_vb_ib_put(rdev);
-
-	if (fence)
-		r = radeon_fence_emit(rdev, fence);
+	r = radeon_fence_emit(rdev, fence);
+	if (r) {
+		radeon_ring_unlock_undo(rdev, ring);
+		return;
+	}
 
-	radeon_ring_unlock_commit(rdev, &rdev->ring[RADEON_RING_TYPE_GFX_INDEX]);
+	radeon_ring_unlock_commit(rdev, ring);
+	radeon_sa_bo_free(rdev, &vb, fence);
 }
 
 void r600_kms_blit_copy(struct radeon_device *rdev,
 			u64 src_gpu_addr, u64 dst_gpu_addr,
-			unsigned num_gpu_pages)
+			unsigned num_gpu_pages,
+			struct radeon_sa_bo *vb)
 {
 	u64 vb_gpu_addr;
-	u32 *vb;
+	u32 *vb_cpu_addr;
 
-	DRM_DEBUG("emitting copy %16llx %16llx %d %d\n",
-		  src_gpu_addr, dst_gpu_addr,
-		  num_gpu_pages, rdev->r600_blit.vb_used);
-	vb = (u32 *)(rdev->r600_blit.vb_ib->ptr + rdev->r600_blit.vb_used);
+	DRM_DEBUG("emitting copy %16llx %16llx %d\n",
+		  src_gpu_addr, dst_gpu_addr, num_gpu_pages);
+	vb_cpu_addr = (u32 *)radeon_sa_bo_cpu_addr(vb);
+	vb_gpu_addr = radeon_sa_bo_gpu_addr(vb);
 
 	while (num_gpu_pages) {
 		int w, h;
@@ -756,39 +742,34 @@ void r600_kms_blit_copy(struct radeon_device *rdev,
 		size_in_bytes = pages_per_loop * RADEON_GPU_PAGE_SIZE;
 		DRM_DEBUG("rectangle w=%d h=%d\n", w, h);
 
-		if ((rdev->r600_blit.vb_used + 48) > rdev->r600_blit.vb_total) {
-			WARN_ON(1);
-		}
-
-		vb[0] = 0;
-		vb[1] = 0;
-		vb[2] = 0;
-		vb[3] = 0;
+		vb_cpu_addr[0] = 0;
+		vb_cpu_addr[1] = 0;
+		vb_cpu_addr[2] = 0;
+		vb_cpu_addr[3] = 0;
 
-		vb[4] = 0;
-		vb[5] = i2f(h);
-		vb[6] = 0;
-		vb[7] = i2f(h);
+		vb_cpu_addr[4] = 0;
+		vb_cpu_addr[5] = i2f(h);
+		vb_cpu_addr[6] = 0;
+		vb_cpu_addr[7] = i2f(h);
 
-		vb[8] = i2f(w);
-		vb[9] = i2f(h);
-		vb[10] = i2f(w);
-		vb[11] = i2f(h);
+		vb_cpu_addr[8] = i2f(w);
+		vb_cpu_addr[9] = i2f(h);
+		vb_cpu_addr[10] = i2f(w);
+		vb_cpu_addr[11] = i2f(h);
 
 		rdev->r600_blit.primitives.set_tex_resource(rdev, FMT_8_8_8_8,
 							    w, h, w, src_gpu_addr, size_in_bytes);
 		rdev->r600_blit.primitives.set_render_target(rdev, COLOR_8_8_8_8,
 							     w, h, dst_gpu_addr);
 		rdev->r600_blit.primitives.set_scissors(rdev, 0, 0, w, h);
-		vb_gpu_addr = rdev->r600_blit.vb_ib->gpu_addr + rdev->r600_blit.vb_used;
 		rdev->r600_blit.primitives.set_vtx_resource(rdev, vb_gpu_addr);
 		rdev->r600_blit.primitives.draw_auto(rdev);
 		rdev->r600_blit.primitives.cp_set_surface_sync(rdev,
 				    PACKET3_CB_ACTION_ENA | PACKET3_CB0_DEST_BASE_ENA,
 				    size_in_bytes, dst_gpu_addr);
 
-		vb += 12;
-		rdev->r600_blit.vb_used += 4*12;
+		vb_cpu_addr += 12;
+		vb_gpu_addr += 4*12;
 		src_gpu_addr += size_in_bytes;
 		dst_gpu_addr += size_in_bytes;
 		num_gpu_pages -= pages_per_loop;
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 3b9e2d3..c43c856 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -734,7 +734,6 @@ struct r600_blit_cp_primitives {
 };
 
 struct r600_blit {
-	struct mutex		mutex;
 	struct radeon_bo	*shader_obj;
 	struct r600_blit_cp_primitives primitives;
 	int max_dim;
@@ -744,8 +743,6 @@ struct r600_blit {
 	u32 vs_offset, ps_offset;
 	u32 state_offset;
 	u32 state_len;
-	u32 vb_used, vb_total;
-	struct radeon_ib *vb_ib;
 };
 
 void r600_blit_suspend(struct radeon_device *rdev);
diff --git a/drivers/gpu/drm/radeon/radeon_asic.h b/drivers/gpu/drm/radeon/radeon_asic.h
index 513f33f..b15d4cf 100644
--- a/drivers/gpu/drm/radeon/radeon_asic.h
+++ b/drivers/gpu/drm/radeon/radeon_asic.h
@@ -371,11 +371,14 @@ void r600_hdmi_init(struct drm_encoder *encoder);
 int r600_hdmi_buffer_status_changed(struct drm_encoder *encoder);
 void r600_hdmi_update_audio_settings(struct drm_encoder *encoder);
 /* r600 blit */
-int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_gpu_pages);
-void r600_blit_done_copy(struct radeon_device *rdev, struct radeon_fence *fence);
+int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_gpu_pages,
+			   struct radeon_sa_bo **vb);
+void r600_blit_done_copy(struct radeon_device *rdev, struct radeon_fence *fence,
+			 struct radeon_sa_bo *vb);
 void r600_kms_blit_copy(struct radeon_device *rdev,
 			u64 src_gpu_addr, u64 dst_gpu_addr,
-			unsigned num_gpu_pages);
+			unsigned num_gpu_pages,
+			struct radeon_sa_bo *vb);
 int r600_mc_wait_for_idle(struct radeon_device *rdev);
 
 /*
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 18/18] drm/radeon: make the ib an inline object
  2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
                   ` (16 preceding siblings ...)
  2012-05-04 22:43 ` [PATCH 17/18] drm/radeon: remove r600 blit mutex v2 j.glisse
@ 2012-05-04 22:43 ` j.glisse
  17 siblings, 0 replies; 19+ messages in thread
From: j.glisse @ 2012-05-04 22:43 UTC (permalink / raw)
  To: dri-devel; +Cc: Jerome Glisse, Christian König

From: Jerome Glisse <jglisse@redhat.com>

No need to malloc it any more.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Christian König <deathsimple@vodafone.de>
---
 drivers/gpu/drm/radeon/evergreen_cs.c |   10 +++---
 drivers/gpu/drm/radeon/r100.c         |   38 ++++++++++----------
 drivers/gpu/drm/radeon/r200.c         |    2 +-
 drivers/gpu/drm/radeon/r300.c         |    4 +-
 drivers/gpu/drm/radeon/r600.c         |   16 ++++----
 drivers/gpu/drm/radeon/r600_cs.c      |   22 +++++------
 drivers/gpu/drm/radeon/radeon.h       |    8 ++--
 drivers/gpu/drm/radeon/radeon_cs.c    |   61 ++++++++++++++++-----------------
 drivers/gpu/drm/radeon/radeon_ring.c  |   41 +++++++---------------
 9 files changed, 92 insertions(+), 110 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen_cs.c b/drivers/gpu/drm/radeon/evergreen_cs.c
index 70089d3..4e7dd2b 100644
--- a/drivers/gpu/drm/radeon/evergreen_cs.c
+++ b/drivers/gpu/drm/radeon/evergreen_cs.c
@@ -1057,7 +1057,7 @@ static int evergreen_cs_packet_parse_vline(struct radeon_cs_parser *p)
 	uint32_t header, h_idx, reg, wait_reg_mem_info;
 	volatile uint32_t *ib;
 
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 
 	/* parse the WAIT_REG_MEM */
 	r = evergreen_cs_packet_parse(p, &wait_reg_mem, p->idx);
@@ -1215,7 +1215,7 @@ static int evergreen_cs_check_reg(struct radeon_cs_parser *p, u32 reg, u32 idx)
 		if (!(evergreen_reg_safe_bm[i] & m))
 			return 0;
 	}
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 	switch (reg) {
 	/* force following reg to 0 in an attempt to disable out buffer
 	 * which will need us to better understand how it works to perform
@@ -1896,7 +1896,7 @@ static int evergreen_packet3_check(struct radeon_cs_parser *p,
 	u32 idx_value;
 
 	track = (struct evergreen_cs_track *)p->track;
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 	idx = pkt->idx + 1;
 	idx_value = radeon_get_ib_value(p, idx);
 
@@ -2610,8 +2610,8 @@ int evergreen_cs_parse(struct radeon_cs_parser *p)
 		}
 	} while (p->idx < p->chunks[p->chunk_ib_idx].length_dw);
 #if 0
-	for (r = 0; r < p->ib->length_dw; r++) {
-		printk(KERN_INFO "%05d  0x%08X\n", r, p->ib->ptr[r]);
+	for (r = 0; r < p->ib.length_dw; r++) {
+		printk(KERN_INFO "%05d  0x%08X\n", r, p->ib.ptr[r]);
 		mdelay(1);
 	}
 #endif
diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c
index ee0103c..a8e00e8 100644
--- a/drivers/gpu/drm/radeon/r100.c
+++ b/drivers/gpu/drm/radeon/r100.c
@@ -139,9 +139,9 @@ int r100_reloc_pitch_offset(struct radeon_cs_parser *p,
 		}
 
 		tmp |= tile_flags;
-		p->ib->ptr[idx] = (value & 0x3fc00000) | tmp;
+		p->ib.ptr[idx] = (value & 0x3fc00000) | tmp;
 	} else
-		p->ib->ptr[idx] = (value & 0xffc00000) | tmp;
+		p->ib.ptr[idx] = (value & 0xffc00000) | tmp;
 	return 0;
 }
 
@@ -156,7 +156,7 @@ int r100_packet3_load_vbpntr(struct radeon_cs_parser *p,
 	volatile uint32_t *ib;
 	u32 idx_value;
 
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 	track = (struct r100_cs_track *)p->track;
 	c = radeon_get_ib_value(p, idx++) & 0x1F;
 	if (c > 16) {
@@ -1271,7 +1271,7 @@ void r100_cs_dump_packet(struct radeon_cs_parser *p,
 	unsigned i;
 	unsigned idx;
 
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 	idx = pkt->idx;
 	for (i = 0; i <= (pkt->count + 1); i++, idx++) {
 		DRM_INFO("ib[%d]=0x%08X\n", idx, ib[idx]);
@@ -1350,7 +1350,7 @@ int r100_cs_packet_parse_vline(struct radeon_cs_parser *p)
 	uint32_t header, h_idx, reg;
 	volatile uint32_t *ib;
 
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 
 	/* parse the wait until */
 	r = r100_cs_packet_parse(p, &waitreloc, p->idx);
@@ -1529,7 +1529,7 @@ static int r100_packet0_check(struct radeon_cs_parser *p,
 	u32 tile_flags = 0;
 	u32 idx_value;
 
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 	track = (struct r100_cs_track *)p->track;
 
 	idx_value = radeon_get_ib_value(p, idx);
@@ -1885,7 +1885,7 @@ static int r100_packet3_check(struct radeon_cs_parser *p,
 	volatile uint32_t *ib;
 	int r;
 
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 	idx = pkt->idx + 1;
 	track = (struct r100_cs_track *)p->track;
 	switch (pkt->opcode) {
@@ -3680,7 +3680,7 @@ void r100_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib)
 
 int r100_ib_test(struct radeon_device *rdev, struct radeon_ring *ring)
 {
-	struct radeon_ib *ib;
+	struct radeon_ib ib;
 	uint32_t scratch;
 	uint32_t tmp = 0;
 	unsigned i;
@@ -3696,22 +3696,22 @@ int r100_ib_test(struct radeon_device *rdev, struct radeon_ring *ring)
 	if (r) {
 		return r;
 	}
-	ib->ptr[0] = PACKET0(scratch, 0);
-	ib->ptr[1] = 0xDEADBEEF;
-	ib->ptr[2] = PACKET2(0);
-	ib->ptr[3] = PACKET2(0);
-	ib->ptr[4] = PACKET2(0);
-	ib->ptr[5] = PACKET2(0);
-	ib->ptr[6] = PACKET2(0);
-	ib->ptr[7] = PACKET2(0);
-	ib->length_dw = 8;
-	r = radeon_ib_schedule(rdev, ib);
+	ib.ptr[0] = PACKET0(scratch, 0);
+	ib.ptr[1] = 0xDEADBEEF;
+	ib.ptr[2] = PACKET2(0);
+	ib.ptr[3] = PACKET2(0);
+	ib.ptr[4] = PACKET2(0);
+	ib.ptr[5] = PACKET2(0);
+	ib.ptr[6] = PACKET2(0);
+	ib.ptr[7] = PACKET2(0);
+	ib.length_dw = 8;
+	r = radeon_ib_schedule(rdev, &ib);
 	if (r) {
 		radeon_scratch_free(rdev, scratch);
 		radeon_ib_free(rdev, &ib);
 		return r;
 	}
-	r = radeon_fence_wait(ib->fence, false);
+	r = radeon_fence_wait(ib.fence, false);
 	if (r) {
 		return r;
 	}
diff --git a/drivers/gpu/drm/radeon/r200.c b/drivers/gpu/drm/radeon/r200.c
index a59cc47..a26144d 100644
--- a/drivers/gpu/drm/radeon/r200.c
+++ b/drivers/gpu/drm/radeon/r200.c
@@ -154,7 +154,7 @@ int r200_packet0_check(struct radeon_cs_parser *p,
 	u32 tile_flags = 0;
 	u32 idx_value;
 
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 	track = (struct r100_cs_track *)p->track;
 	idx_value = radeon_get_ib_value(p, idx);
 	switch (reg) {
diff --git a/drivers/gpu/drm/radeon/r300.c b/drivers/gpu/drm/radeon/r300.c
index 6419a59..97722a3 100644
--- a/drivers/gpu/drm/radeon/r300.c
+++ b/drivers/gpu/drm/radeon/r300.c
@@ -604,7 +604,7 @@ static int r300_packet0_check(struct radeon_cs_parser *p,
 	int r;
 	u32 idx_value;
 
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 	track = (struct r100_cs_track *)p->track;
 	idx_value = radeon_get_ib_value(p, idx);
 
@@ -1146,7 +1146,7 @@ static int r300_packet3_check(struct radeon_cs_parser *p,
 	unsigned idx;
 	int r;
 
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 	idx = pkt->idx + 1;
 	track = (struct r100_cs_track *)p->track;
 	switch(pkt->opcode) {
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index d963fd8..5916d9b 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -2681,7 +2681,7 @@ void r600_ring_ib_execute(struct radeon_device *rdev, struct radeon_ib *ib)
 
 int r600_ib_test(struct radeon_device *rdev, struct radeon_ring *ring)
 {
-	struct radeon_ib *ib;
+	struct radeon_ib ib;
 	uint32_t scratch;
 	uint32_t tmp = 0;
 	unsigned i;
@@ -2699,18 +2699,18 @@ int r600_ib_test(struct radeon_device *rdev, struct radeon_ring *ring)
 		DRM_ERROR("radeon: failed to get ib (%d).\n", r);
 		return r;
 	}
-	ib->ptr[0] = PACKET3(PACKET3_SET_CONFIG_REG, 1);
-	ib->ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2);
-	ib->ptr[2] = 0xDEADBEEF;
-	ib->length_dw = 3;
-	r = radeon_ib_schedule(rdev, ib);
+	ib.ptr[0] = PACKET3(PACKET3_SET_CONFIG_REG, 1);
+	ib.ptr[1] = ((scratch - PACKET3_SET_CONFIG_REG_OFFSET) >> 2);
+	ib.ptr[2] = 0xDEADBEEF;
+	ib.length_dw = 3;
+	r = radeon_ib_schedule(rdev, &ib);
 	if (r) {
 		radeon_scratch_free(rdev, scratch);
 		radeon_ib_free(rdev, &ib);
 		DRM_ERROR("radeon: failed to schedule ib (%d).\n", r);
 		return r;
 	}
-	r = radeon_fence_wait(ib->fence, false);
+	r = radeon_fence_wait(ib.fence, false);
 	if (r) {
 		DRM_ERROR("radeon: fence wait failed (%d).\n", r);
 		return r;
@@ -2722,7 +2722,7 @@ int r600_ib_test(struct radeon_device *rdev, struct radeon_ring *ring)
 		DRM_UDELAY(1);
 	}
 	if (i < rdev->usec_timeout) {
-		DRM_INFO("ib test on ring %d succeeded in %u usecs\n", ib->fence->ring, i);
+		DRM_INFO("ib test on ring %d succeeded in %u usecs\n", ib.fence->ring, i);
 	} else {
 		DRM_ERROR("radeon: ib test failed (scratch(0x%04X)=0x%08X)\n",
 			  scratch, tmp);
diff --git a/drivers/gpu/drm/radeon/r600_cs.c b/drivers/gpu/drm/radeon/r600_cs.c
index b8e12af..0133f5f 100644
--- a/drivers/gpu/drm/radeon/r600_cs.c
+++ b/drivers/gpu/drm/radeon/r600_cs.c
@@ -345,7 +345,7 @@ static int r600_cs_track_validate_cb(struct radeon_cs_parser *p, int i)
 	u32 height, height_align, pitch, pitch_align, depth_align;
 	u64 base_offset, base_align;
 	struct array_mode_checker array_check;
-	volatile u32 *ib = p->ib->ptr;
+	volatile u32 *ib = p->ib.ptr;
 	unsigned array_mode;
 	u32 format;
 
@@ -471,7 +471,7 @@ static int r600_cs_track_validate_db(struct radeon_cs_parser *p)
 	u64 base_offset, base_align;
 	struct array_mode_checker array_check;
 	int array_mode;
-	volatile u32 *ib = p->ib->ptr;
+	volatile u32 *ib = p->ib.ptr;
 
 
 	if (track->db_bo == NULL) {
@@ -961,7 +961,7 @@ static int r600_cs_packet_parse_vline(struct radeon_cs_parser *p)
 	uint32_t header, h_idx, reg, wait_reg_mem_info;
 	volatile uint32_t *ib;
 
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 
 	/* parse the WAIT_REG_MEM */
 	r = r600_cs_packet_parse(p, &wait_reg_mem, p->idx);
@@ -1110,7 +1110,7 @@ static int r600_cs_check_reg(struct radeon_cs_parser *p, u32 reg, u32 idx)
 	m = 1 << ((reg >> 2) & 31);
 	if (!(r600_reg_safe_bm[i] & m))
 		return 0;
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 	switch (reg) {
 	/* force following reg to 0 in an attempt to disable out buffer
 	 * which will need us to better understand how it works to perform
@@ -1714,7 +1714,7 @@ static int r600_packet3_check(struct radeon_cs_parser *p,
 	u32 idx_value;
 
 	track = (struct r600_cs_track *)p->track;
-	ib = p->ib->ptr;
+	ib = p->ib.ptr;
 	idx = pkt->idx + 1;
 	idx_value = radeon_get_ib_value(p, idx);
 
@@ -2249,8 +2249,8 @@ int r600_cs_parse(struct radeon_cs_parser *p)
 		}
 	} while (p->idx < p->chunks[p->chunk_ib_idx].length_dw);
 #if 0
-	for (r = 0; r < p->ib->length_dw; r++) {
-		printk(KERN_INFO "%05d  0x%08X\n", r, p->ib->ptr[r]);
+	for (r = 0; r < p->ib.length_dw; r++) {
+		printk(KERN_INFO "%05d  0x%08X\n", r, p->ib.ptr[r]);
 		mdelay(1);
 	}
 #endif
@@ -2298,7 +2298,6 @@ int r600_cs_legacy(struct drm_device *dev, void *data, struct drm_file *filp,
 {
 	struct radeon_cs_parser parser;
 	struct radeon_cs_chunk *ib_chunk;
-	struct radeon_ib fake_ib;
 	struct r600_cs_track *track;
 	int r;
 
@@ -2314,9 +2313,8 @@ int r600_cs_legacy(struct drm_device *dev, void *data, struct drm_file *filp,
 	parser.dev = &dev->pdev->dev;
 	parser.rdev = NULL;
 	parser.family = family;
-	parser.ib = &fake_ib;
 	parser.track = track;
-	fake_ib.ptr = ib;
+	parser.ib.ptr = ib;
 	r = radeon_cs_parser_init(&parser, data);
 	if (r) {
 		DRM_ERROR("Failed to initialize parser !\n");
@@ -2333,8 +2331,8 @@ int r600_cs_legacy(struct drm_device *dev, void *data, struct drm_file *filp,
 	 * input memory (cached) and write to the IB (which can be
 	 * uncached). */
 	ib_chunk = &parser.chunks[parser.chunk_ib_idx];
-	parser.ib->length_dw = ib_chunk->length_dw;
-	*l = parser.ib->length_dw;
+	parser.ib.length_dw = ib_chunk->length_dw;
+	*l = parser.ib.length_dw;
 	r = r600_cs_parse(&parser);
 	if (r) {
 		DRM_ERROR("Invalid command stream !\n");
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index c43c856..9f29d5d 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -760,8 +760,8 @@ struct si_rlc {
 };
 
 int radeon_ib_get(struct radeon_device *rdev, int ring,
-		  struct radeon_ib **ib, unsigned size);
-void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib);
+		  struct radeon_ib *ib, unsigned size);
+void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib);
 int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib);
 int radeon_ib_pool_init(struct radeon_device *rdev);
 void radeon_ib_pool_fini(struct radeon_device *rdev);
@@ -829,8 +829,8 @@ struct radeon_cs_parser {
 	int			chunk_relocs_idx;
 	int			chunk_flags_idx;
 	int			chunk_const_ib_idx;
-	struct radeon_ib	*ib;
-	struct radeon_ib	*const_ib;
+	struct radeon_ib	ib;
+	struct radeon_ib	const_ib;
 	void			*track;
 	unsigned		family;
 	int			parser_error;
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 7dbf632..b9d9a9e 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -138,12 +138,12 @@ static int radeon_cs_sync_rings(struct radeon_cs_parser *p)
 		return 0;
 	}
 
-	r = radeon_semaphore_create(p->rdev, &p->ib->semaphore);
+	r = radeon_semaphore_create(p->rdev, &p->ib.semaphore);
 	if (r) {
 		return r;
 	}
 
-	return radeon_semaphore_sync_rings(p->rdev, p->ib->semaphore,
+	return radeon_semaphore_sync_rings(p->rdev, p->ib.semaphore,
 					   sync_to_ring, p->ring);
 }
 
@@ -161,8 +161,10 @@ int radeon_cs_parser_init(struct radeon_cs_parser *p, void *data)
 	/* get chunks */
 	INIT_LIST_HEAD(&p->validated);
 	p->idx = 0;
-	p->ib = NULL;
-	p->const_ib = NULL;
+	p->ib.sa_bo = NULL;
+	p->ib.semaphore = NULL;
+	p->const_ib.sa_bo = NULL;
+	p->const_ib.semaphore = NULL;
 	p->chunk_ib_idx = -1;
 	p->chunk_relocs_idx = -1;
 	p->chunk_flags_idx = -1;
@@ -296,10 +298,9 @@ static void radeon_cs_parser_fini(struct radeon_cs_parser *parser, int error)
 {
 	unsigned i;
 
-
-	if (!error && parser->ib)
+	if (!error)
 		ttm_eu_fence_buffer_objects(&parser->validated,
-					    parser->ib->fence);
+					    parser->ib.fence);
 	else
 		ttm_eu_backoff_reservation(&parser->validated);
 
@@ -320,9 +321,7 @@ static void radeon_cs_parser_fini(struct radeon_cs_parser *parser, int error)
 	kfree(parser->chunks);
 	kfree(parser->chunks_array);
 	radeon_ib_free(parser->rdev, &parser->ib);
-	if (parser->const_ib) {
-		radeon_ib_free(parser->rdev, &parser->const_ib);
-	}
+	radeon_ib_free(parser->rdev, &parser->const_ib);
 }
 
 static int radeon_cs_ib_chunk(struct radeon_device *rdev,
@@ -348,7 +347,7 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
 		DRM_ERROR("Failed to get ib !\n");
 		return r;
 	}
-	parser->ib->length_dw = ib_chunk->length_dw;
+	parser->ib.length_dw = ib_chunk->length_dw;
 	r = radeon_cs_parse(rdev, parser->ring, parser);
 	if (r || parser->parser_error) {
 		DRM_ERROR("Invalid command stream !\n");
@@ -363,8 +362,8 @@ static int radeon_cs_ib_chunk(struct radeon_device *rdev,
 	if (r) {
 		DRM_ERROR("Failed to synchronize rings !\n");
 	}
-	parser->ib->vm_id = 0;
-	r = radeon_ib_schedule(rdev, parser->ib);
+	parser->ib.vm_id = 0;
+	r = radeon_ib_schedule(rdev, &parser->ib);
 	if (r) {
 		DRM_ERROR("Failed to schedule IB !\n");
 	}
@@ -415,14 +414,14 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 			DRM_ERROR("Failed to get const ib !\n");
 			return r;
 		}
-		parser->const_ib->is_const_ib = true;
-		parser->const_ib->length_dw = ib_chunk->length_dw;
+		parser->const_ib.is_const_ib = true;
+		parser->const_ib.length_dw = ib_chunk->length_dw;
 		/* Copy the packet into the IB */
-		if (DRM_COPY_FROM_USER(parser->const_ib->ptr, ib_chunk->user_ptr,
+		if (DRM_COPY_FROM_USER(parser->const_ib.ptr, ib_chunk->user_ptr,
 				       ib_chunk->length_dw * 4)) {
 			return -EFAULT;
 		}
-		r = radeon_ring_ib_parse(rdev, parser->ring, parser->const_ib);
+		r = radeon_ring_ib_parse(rdev, parser->ring, &parser->const_ib);
 		if (r) {
 			return r;
 		}
@@ -439,13 +438,13 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 		DRM_ERROR("Failed to get ib !\n");
 		return r;
 	}
-	parser->ib->length_dw = ib_chunk->length_dw;
+	parser->ib.length_dw = ib_chunk->length_dw;
 	/* Copy the packet into the IB */
-	if (DRM_COPY_FROM_USER(parser->ib->ptr, ib_chunk->user_ptr,
+	if (DRM_COPY_FROM_USER(parser->ib.ptr, ib_chunk->user_ptr,
 			       ib_chunk->length_dw * 4)) {
 		return -EFAULT;
 	}
-	r = radeon_ring_ib_parse(rdev, parser->ring, parser->ib);
+	r = radeon_ring_ib_parse(rdev, parser->ring, &parser->ib);
 	if (r) {
 		return r;
 	}
@@ -466,29 +465,29 @@ static int radeon_cs_ib_vm_chunk(struct radeon_device *rdev,
 
 	if ((rdev->family >= CHIP_TAHITI) &&
 	    (parser->chunk_const_ib_idx != -1)) {
-		parser->const_ib->vm_id = vm->id;
+		parser->const_ib.vm_id = vm->id;
 		/* ib pool is bind at 0 in virtual address space to gpu_addr is the
 		 * offset inside the pool bo
 		 */
-		parser->const_ib->gpu_addr = parser->const_ib->sa_bo->soffset;
-		r = radeon_ib_schedule(rdev, parser->const_ib);
+		parser->const_ib.gpu_addr = parser->const_ib.sa_bo->soffset;
+		r = radeon_ib_schedule(rdev, &parser->const_ib);
 		if (r)
 			goto out;
 	}
 
-	parser->ib->vm_id = vm->id;
+	parser->ib.vm_id = vm->id;
 	/* ib pool is bind at 0 in virtual address space to gpu_addr is the
 	 * offset inside the pool bo
 	 */
-	parser->ib->gpu_addr = parser->ib->sa_bo->soffset;
-	parser->ib->is_const_ib = false;
-	r = radeon_ib_schedule(rdev, parser->ib);
+	parser->ib.gpu_addr = parser->ib.sa_bo->soffset;
+	parser->ib.is_const_ib = false;
+	r = radeon_ib_schedule(rdev, &parser->ib);
 out:
 	if (!r) {
 		if (vm->fence) {
 			radeon_fence_unref(&vm->fence);
 		}
-		vm->fence = radeon_fence_ref(parser->ib->fence);
+		vm->fence = radeon_fence_ref(parser->ib.fence);
 	}
 	mutex_unlock(&fpriv->vm.mutex);
 	return r;
@@ -566,7 +565,7 @@ int radeon_cs_finish_pages(struct radeon_cs_parser *p)
 				size = PAGE_SIZE;
 		}
 		
-		if (DRM_COPY_FROM_USER(p->ib->ptr + (i * (PAGE_SIZE/4)),
+		if (DRM_COPY_FROM_USER(p->ib.ptr + (i * (PAGE_SIZE/4)),
 				       ibc->user_ptr + (i * PAGE_SIZE),
 				       size))
 			return -EFAULT;
@@ -582,7 +581,7 @@ int radeon_cs_update_pages(struct radeon_cs_parser *p, int pg_idx)
 	int size = PAGE_SIZE;
 
 	for (i = ibc->last_copied_page + 1; i < pg_idx; i++) {
-		if (DRM_COPY_FROM_USER(p->ib->ptr + (i * (PAGE_SIZE/4)),
+		if (DRM_COPY_FROM_USER(p->ib.ptr + (i * (PAGE_SIZE/4)),
 				       ibc->user_ptr + (i * PAGE_SIZE),
 				       PAGE_SIZE)) {
 			p->parser_error = -EFAULT;
@@ -606,7 +605,7 @@ int radeon_cs_update_pages(struct radeon_cs_parser *p, int pg_idx)
 	}
 
 	/* copy to IB here */
-	memcpy((void *)(p->ib->ptr+(pg_idx*(PAGE_SIZE/4))), ibc->kpage[new_page], size);
+	memcpy((void *)(p->ib.ptr+(pg_idx*(PAGE_SIZE/4))), ibc->kpage[new_page], size);
 
 	ibc->last_copied_page = pg_idx;
 	ibc->kpage_idx[new_page] = pg_idx;
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index b41aa82..d640e32 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -65,51 +65,36 @@ u32 radeon_get_ib_value(struct radeon_cs_parser *p, int idx)
 }
 
 int radeon_ib_get(struct radeon_device *rdev, int ring,
-		  struct radeon_ib **ib, unsigned size)
+		  struct radeon_ib *ib, unsigned size)
 {
 	int r;
 
-	*ib = kmalloc(sizeof(struct radeon_ib), GFP_KERNEL);
-	if (*ib == NULL) {
-		return -ENOMEM;
-	}
-	r = radeon_sa_bo_new(rdev, &rdev->ring_tmp_bo, &(*ib)->sa_bo, size, 256, true);
+	r = radeon_sa_bo_new(rdev, &rdev->ring_tmp_bo, &ib->sa_bo, size, 256, true);
 	if (r) {
 		dev_err(rdev->dev, "failed to get a new IB (%d)\n", r);
-		kfree(*ib);
-		*ib = NULL;
 		return r;
 	}
-	r = radeon_fence_create(rdev, &(*ib)->fence, ring);
+	r = radeon_fence_create(rdev, &ib->fence, ring);
 	if (r) {
 		dev_err(rdev->dev, "failed to create fence for new IB (%d)\n", r);
-		radeon_sa_bo_free(rdev, &(*ib)->sa_bo, NULL);
-		kfree(*ib);
-		*ib = NULL;
+		radeon_sa_bo_free(rdev, &ib->sa_bo, NULL);
 		return r;
 	}
 
-	(*ib)->ptr = radeon_sa_bo_cpu_addr((*ib)->sa_bo);
-	(*ib)->gpu_addr = radeon_sa_bo_gpu_addr((*ib)->sa_bo);
-	(*ib)->vm_id = 0;
-	(*ib)->is_const_ib = false;
-	(*ib)->semaphore = NULL;
+	ib->ptr = radeon_sa_bo_cpu_addr(ib->sa_bo);
+	ib->gpu_addr = radeon_sa_bo_gpu_addr(ib->sa_bo);
+	ib->vm_id = 0;
+	ib->is_const_ib = false;
+	ib->semaphore = NULL;
 
 	return 0;
 }
 
-void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib **ib)
+void radeon_ib_free(struct radeon_device *rdev, struct radeon_ib *ib)
 {
-	struct radeon_ib *tmp = *ib;
-
-	*ib = NULL;
-	if (tmp == NULL) {
-		return;
-	}
-	radeon_semaphore_free(rdev, tmp->semaphore, tmp->fence);
-	radeon_sa_bo_free(rdev, &tmp->sa_bo, tmp->fence);
-	radeon_fence_unref(&tmp->fence);
-	kfree(tmp);
+	radeon_semaphore_free(rdev, ib->semaphore, ib->fence);
+	radeon_sa_bo_free(rdev, &ib->sa_bo, ib->fence);
+	radeon_fence_unref(&ib->fence);
 }
 
 int radeon_ib_schedule(struct radeon_device *rdev, struct radeon_ib *ib)
-- 
1.7.7.6

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2012-05-04 22:44 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-04 22:43 [RFC] fence, sa allocator, ib pool, semaphore rework j.glisse
2012-05-04 22:43 ` [PATCH 01/18] drm/radeon: replace the per ring mutex with a global one j.glisse
2012-05-04 22:43 ` [PATCH 02/18] drm/radeon: convert fence to uint64_t v3 j.glisse
2012-05-04 22:43 ` [PATCH 03/18] drm/radeon: rework fence handling, drop fence list v5 j.glisse
2012-05-04 22:43 ` [PATCH 04/18] drm/radeon: hold ring emission mutex for lockup detection j.glisse
2012-05-04 22:43 ` [PATCH 05/18] drm/radeon: use inline functions to calc sa_bo addr j.glisse
2012-05-04 22:43 ` [PATCH 06/18] drm/radeon: add proper locking to the SA v3 j.glisse
2012-05-04 22:43 ` [PATCH 07/18] drm/radeon: add sub allocator debugfs file j.glisse
2012-05-04 22:43 ` [PATCH 08/18] drm/radeon: keep start and end offset in the SA j.glisse
2012-05-04 22:43 ` [PATCH 09/18] drm/radeon: make sa bo a stand alone object j.glisse
2012-05-04 22:43 ` [PATCH 10/18] drm/radeon: define new SA interface v2 j.glisse
2012-05-04 22:43 ` [PATCH 11/18] drm/radeon: use one wait queue for all rings add fence_wait_any j.glisse
2012-05-04 22:43 ` [PATCH 12/18] drm/radeon: multiple ring allocator j.glisse
2012-05-04 22:43 ` [PATCH 13/18] drm/radeon: simplify semaphore handling v2 j.glisse
2012-05-04 22:43 ` [PATCH 14/18] drm/radeon: rip out the ib pool j.glisse
2012-05-04 22:43 ` [PATCH 15/18] drm/radeon: immediately remove ttm-move semaphore j.glisse
2012-05-04 22:43 ` [PATCH 16/18] drm/radeon: move the semaphore from the fence into the ib j.glisse
2012-05-04 22:43 ` [PATCH 17/18] drm/radeon: remove r600 blit mutex v2 j.glisse
2012-05-04 22:43 ` [PATCH 18/18] drm/radeon: make the ib an inline object j.glisse

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.