* [PATCH 0/4] drm/ttm: Get rid of a number of atomic read-modify-write ops v2
@ 2012-11-05 13:55 Thomas Hellstrom
  2012-11-05 13:55 ` [PATCH 1/4] drm: Make hashtab rcu-safe Thomas Hellstrom
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Thomas Hellstrom @ 2012-11-05 13:55 UTC (permalink / raw)
  To: airlied, airlied; +Cc: dri-devel, linux-kernel

A patch series for -next that removes a substantial number of atomic
read-modify-write operations from TTM command submission, in particular when
TTM base objects are used to export objects to user-space. The only per-object
atomic r-m-w operations left during a typical execbuf call should be the
refcount up and down.

v2: Formatting fixes.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] drm: Make hashtab rcu-safe
  2012-11-05 13:55 [PATCH 0/4] drm/ttm: Get rid of a number of atomic read-modify-write ops v2 Thomas Hellstrom
@ 2012-11-05 13:55 ` Thomas Hellstrom
  2012-11-05 13:55 ` [PATCH 2/4] kref: Implement kref_get_unless_zero v2 Thomas Hellstrom
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Thomas Hellstrom @ 2012-11-05 13:55 UTC (permalink / raw)
  To: airlied, airlied; +Cc: dri-devel, linux-kernel, Thomas Hellstrom

TTM base objects will be the first consumer.
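
For illustration, a minimal usage sketch (not part of the patch) of how a
consumer is expected to use the RCU-safe hashtab: lookups run under
rcu_read_lock() only, while insert/remove still serialize against each other
with an external writer lock. The my_* names and the spinlock are assumptions
made for the example.

/* Hedged usage sketch, not from the patch. */
#include <linux/types.h>
#include <linux/rcupdate.h>
#include <linux/spinlock.h>
#include <drm/drm_hashtab.h>

static DEFINE_SPINLOCK(my_lock);	/* serializes writers only */

static bool my_key_present(struct drm_open_hash *ht, unsigned long key)
{
	struct drm_hash_item *hash;
	bool found;

	rcu_read_lock();		/* reader side: no writer lock needed */
	found = (drm_ht_find_item(ht, key, &hash) == 0);
	rcu_read_unlock();
	return found;
}

static int my_insert(struct drm_open_hash *ht, struct drm_hash_item *item)
{
	int ret;

	spin_lock(&my_lock);		/* writers still serialize externally */
	ret = drm_ht_insert_item(ht, item);	/* hlist_add_*_rcu inside */
	spin_unlock(&my_lock);
	return ret;
}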

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/drm_hashtab.c |   18 +++++++-----------
 1 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/drm_hashtab.c b/drivers/gpu/drm/drm_hashtab.c
index c3745c4..5729e39 100644
--- a/drivers/gpu/drm/drm_hashtab.c
+++ b/drivers/gpu/drm/drm_hashtab.c
@@ -67,10 +67,8 @@ void drm_ht_verbose_list(struct drm_open_hash *ht, unsigned long key)
 	hashed_key = hash_long(key, ht->order);
 	DRM_DEBUG("Key is 0x%08lx, Hashed key is 0x%08x\n", key, hashed_key);
 	h_list = &ht->table[hashed_key];
-	hlist_for_each(list, h_list) {
-		entry = hlist_entry(list, struct drm_hash_item, head);
+	hlist_for_each_entry_rcu(entry, list, h_list, head)
 		DRM_DEBUG("count %d, key: 0x%08lx\n", count++, entry->key);
-	}
 }
 
 static struct hlist_node *drm_ht_find_key(struct drm_open_hash *ht,
@@ -83,8 +81,7 @@ static struct hlist_node *drm_ht_find_key(struct drm_open_hash *ht,
 
 	hashed_key = hash_long(key, ht->order);
 	h_list = &ht->table[hashed_key];
-	hlist_for_each(list, h_list) {
-		entry = hlist_entry(list, struct drm_hash_item, head);
+	hlist_for_each_entry_rcu(entry, list, h_list, head) {
 		if (entry->key == key)
 			return list;
 		if (entry->key > key)
@@ -105,8 +102,7 @@ int drm_ht_insert_item(struct drm_open_hash *ht, struct drm_hash_item *item)
 	hashed_key = hash_long(key, ht->order);
 	h_list = &ht->table[hashed_key];
 	parent = NULL;
-	hlist_for_each(list, h_list) {
-		entry = hlist_entry(list, struct drm_hash_item, head);
+	hlist_for_each_entry_rcu(entry, list, h_list, head) {
 		if (entry->key == key)
 			return -EINVAL;
 		if (entry->key > key)
@@ -114,9 +110,9 @@ int drm_ht_insert_item(struct drm_open_hash *ht, struct drm_hash_item *item)
 		parent = list;
 	}
 	if (parent) {
-		hlist_add_after(parent, &item->head);
+		hlist_add_after_rcu(parent, &item->head);
 	} else {
-		hlist_add_head(&item->head, h_list);
+		hlist_add_head_rcu(&item->head, h_list);
 	}
 	return 0;
 }
@@ -171,7 +167,7 @@ int drm_ht_remove_key(struct drm_open_hash *ht, unsigned long key)
 
 	list = drm_ht_find_key(ht, key);
 	if (list) {
-		hlist_del_init(list);
+		hlist_del_init_rcu(list);
 		return 0;
 	}
 	return -EINVAL;
@@ -179,7 +175,7 @@ int drm_ht_remove_key(struct drm_open_hash *ht, unsigned long key)
 
 int drm_ht_remove_item(struct drm_open_hash *ht, struct drm_hash_item *item)
 {
-	hlist_del_init(&item->head);
+	hlist_del_init_rcu(&item->head);
 	return 0;
 }
 EXPORT_SYMBOL(drm_ht_remove_item);
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4] kref: Implement kref_get_unless_zero v2
  2012-11-05 13:55 [PATCH 0/4] drm/ttm: Get rid of a number of atomic read-modify-write ops v2 Thomas Hellstrom
  2012-11-05 13:55 ` [PATCH 1/4] drm: Make hashtab rcu-safe Thomas Hellstrom
@ 2012-11-05 13:55 ` Thomas Hellstrom
  2012-11-05 13:55 ` [PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups v2 Thomas Hellstrom
  2012-11-05 13:55 ` [PATCH 4/4] drm/ttm: Optimize reservation slightly Thomas Hellstrom
  3 siblings, 0 replies; 9+ messages in thread
From: Thomas Hellstrom @ 2012-11-05 13:55 UTC (permalink / raw)
  To: airlied, airlied; +Cc: dri-devel, linux-kernel, Thomas Hellstrom

This function is intended to simplify locking around refcounting for
objects that can be looked up from a lookup structure, and which are
removed from that lookup structure in the object destructor.
Operations on such objects require at least a read lock around
lookup + kref_get, and a write lock around kref_put + remove from lookup
structure. Furthermore, RCU implementations become extremely tricky.
With a lookup followed by a kref_get_unless_zero *with return value check*,
locking in the kref_put path can be deferred to the actual removal from
the lookup structure, and RCU lookups become trivial.
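
For illustration, a minimal sketch of the pattern this enables (all my_*
names, the lookup helper and the spinlock are invented for the example; the
return convention follows this patch, where 0 means the increment succeeded):

/* Hedged sketch, not part of the patch: lockless lookup, with locking
 * deferred to the put path. */
#include <linux/kref.h>
#include <linux/rculist.h>
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(my_lock);

struct my_obj {
	struct kref kref;
	struct hlist_node node;
	struct rcu_head rhead;
};

struct my_obj *my_hash_find(unsigned long key);	/* assumed RCU-safe lookup */

static struct my_obj *my_obj_lookup(unsigned long key)
{
	struct my_obj *obj;

	rcu_read_lock();
	obj = my_hash_find(key);
	if (obj && kref_get_unless_zero(&obj->kref) != 0)
		obj = NULL;		/* refcount already dropped to zero */
	rcu_read_unlock();
	return obj;
}

static void my_obj_release(struct kref *kref)
{
	struct my_obj *obj = container_of(kref, struct my_obj, kref);

	spin_lock(&my_lock);		/* only the removal needs the lock */
	hlist_del_init_rcu(&obj->node);
	spin_unlock(&my_lock);
	kfree_rcu(obj, rhead);		/* freed after concurrent readers */
}

static void my_obj_put(struct my_obj *obj)
{
	kref_put(&obj->kref, my_obj_release);
}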

v2: Formatting fixes.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 include/linux/kref.h |   21 +++++++++++++++++++++
 1 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/include/linux/kref.h b/include/linux/kref.h
index 65af688..bae91d0 100644
--- a/include/linux/kref.h
+++ b/include/linux/kref.h
@@ -111,4 +111,25 @@ static inline int kref_put_mutex(struct kref *kref,
 	}
 	return 0;
 }
+
+/**
+ * kref_get_unless_zero - Increment refcount for object unless it is zero.
+ * @kref: object.
+ *
+ * Return 0 if the increment succeeded. Otherwise return non-zero.
+ *
+ * This function is intended to simplify locking around refcounting for
+ * objects that can be looked up from a lookup structure, and which are
+ * removed from that lookup structure in the object destructor.
+ * Operations on such objects require at least a read lock around
+ * lookup + kref_get, and a write lock around kref_put + remove from lookup
+ * structure. Furthermore, RCU implementations become extremely tricky.
+ * With a lookup followed by a kref_get_unless_zero *with return value check*
+ * locking in the kref_put path can be deferred to the actual removal from
+ * the lookup structure and RCU lookups become trivial.
+ */
+static inline int __must_check kref_get_unless_zero(struct kref *kref)
+{
+	return !atomic_add_unless(&kref->refcount, 1, 0);
+}
 #endif /* _KREF_H_ */
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups v2
  2012-11-05 13:55 [PATCH 0/4] drm/ttm: Get rid of a number of atomic read-modify-write ops v2 Thomas Hellstrom
  2012-11-05 13:55 ` [PATCH 1/4] drm: Make hashtab rcu-safe Thomas Hellstrom
  2012-11-05 13:55 ` [PATCH 2/4] kref: Implement kref_get_unless_zero v2 Thomas Hellstrom
@ 2012-11-05 13:55 ` Thomas Hellstrom
  2012-11-05 13:55 ` [PATCH 4/4] drm/ttm: Optimize reservation slightly Thomas Hellstrom
  3 siblings, 0 replies; 9+ messages in thread
From: Thomas Hellstrom @ 2012-11-05 13:55 UTC (permalink / raw)
  To: airlied, airlied; +Cc: dri-devel, linux-kernel, Thomas Hellstrom

The most commonly used lookup+get and put+potential_destroy paths of TTM
objects are converted to use RCU locking. This will substantially decrease
the number of locked bus cycles during normal operation.
Since we use kfree_rcu to free the objects, no RCU synchronization is needed
at module unload time.
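
For illustration, a hedged sketch of the free side (the my_obj names are
invented; the pattern matches the vmwgfx changes below):

/* Hedged sketch, not from the patch: how a driver destructor frees an
 * object embedding a ttm_base_object after this change. */
struct my_obj {
	struct ttm_base_object base;
	/* driver-private data ... */
};

static void my_obj_free(struct my_obj *obj)
{
	/* Expands to kfree_rcu(obj, base.rhead): the memory is only
	 * reclaimed after a grace period, so RCU lookups racing with
	 * the final kref_put see either a still-valid object (whose
	 * zero refcount is caught by kref_get_unless_zero) or nothing,
	 * never freed memory.  This is why no explicit synchronize_rcu()
	 * is needed at module unload time. */
	ttm_base_object_kfree(obj, base);
}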

v2: Don't touch include/linux/kref.h

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/ttm/ttm_object.c         |   30 +++++++++++-------------------
 drivers/gpu/drm/vmwgfx/vmwgfx_resource.c |    8 ++++----
 include/drm/ttm/ttm_object.h             |    4 ++++
 3 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_object.c b/drivers/gpu/drm/ttm/ttm_object.c
index c785787..9d7f674 100644
--- a/drivers/gpu/drm/ttm/ttm_object.c
+++ b/drivers/gpu/drm/ttm/ttm_object.c
@@ -80,7 +80,7 @@ struct ttm_object_file {
  */
 
 struct ttm_object_device {
-	rwlock_t object_lock;
+	spinlock_t object_lock;
 	struct drm_open_hash object_hash;
 	atomic_t object_count;
 	struct ttm_mem_global *mem_glob;
@@ -157,12 +157,12 @@ int ttm_base_object_init(struct ttm_object_file *tfile,
 	base->refcount_release = refcount_release;
 	base->ref_obj_release = ref_obj_release;
 	base->object_type = object_type;
-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
 	kref_init(&base->refcount);
 	ret = drm_ht_just_insert_please(&tdev->object_hash,
 					&base->hash,
 					(unsigned long)base, 31, 0, 0);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 	if (unlikely(ret != 0))
 		goto out_err0;
 
@@ -186,30 +186,22 @@ static void ttm_release_base(struct kref *kref)
 	    container_of(kref, struct ttm_base_object, refcount);
 	struct ttm_object_device *tdev = base->tfile->tdev;
 
+	spin_lock(&tdev->object_lock);
 	(void)drm_ht_remove_item(&tdev->object_hash, &base->hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 	if (base->refcount_release) {
 		ttm_object_file_unref(&base->tfile);
 		base->refcount_release(&base);
 	}
-	write_lock(&tdev->object_lock);
 }
 
 void ttm_base_object_unref(struct ttm_base_object **p_base)
 {
 	struct ttm_base_object *base = *p_base;
-	struct ttm_object_device *tdev = base->tfile->tdev;
 
 	*p_base = NULL;
 
-	/*
-	 * Need to take the lock here to avoid racing with
-	 * users trying to look up the object.
-	 */
-
-	write_lock(&tdev->object_lock);
 	kref_put(&base->refcount, ttm_release_base);
-	write_unlock(&tdev->object_lock);
 }
 EXPORT_SYMBOL(ttm_base_object_unref);
 
@@ -221,14 +213,14 @@ struct ttm_base_object *ttm_base_object_lookup(struct ttm_object_file *tfile,
 	struct drm_hash_item *hash;
 	int ret;
 
-	read_lock(&tdev->object_lock);
+	rcu_read_lock();
 	ret = drm_ht_find_item(&tdev->object_hash, key, &hash);
 
 	if (likely(ret == 0)) {
 		base = drm_hash_entry(hash, struct ttm_base_object, hash);
-		kref_get(&base->refcount);
+		ret = kref_get_unless_zero(&base->refcount);
 	}
-	read_unlock(&tdev->object_lock);
+	rcu_read_unlock();
 
 	if (unlikely(ret != 0))
 		return NULL;
@@ -426,7 +418,7 @@ struct ttm_object_device *ttm_object_device_init(struct ttm_mem_global
 		return NULL;
 
 	tdev->mem_glob = mem_glob;
-	rwlock_init(&tdev->object_lock);
+	spin_lock_init(&tdev->object_lock);
 	atomic_set(&tdev->object_count, 0);
 	ret = drm_ht_create(&tdev->object_hash, hash_order);
 
@@ -444,9 +436,9 @@ void ttm_object_device_release(struct ttm_object_device **p_tdev)
 
 	*p_tdev = NULL;
 
-	write_lock(&tdev->object_lock);
+	spin_lock(&tdev->object_lock);
 	drm_ht_remove(&tdev->object_hash);
-	write_unlock(&tdev->object_lock);
+	spin_unlock(&tdev->object_lock);
 
 	kfree(tdev);
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
index da3c6b5..ae675c6 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_resource.c
@@ -351,7 +351,7 @@ static void vmw_user_context_free(struct vmw_resource *res)
 	    container_of(res, struct vmw_user_context, res);
 	struct vmw_private *dev_priv = res->dev_priv;
 
-	kfree(ctx);
+	ttm_base_object_kfree(ctx, base);
 	ttm_mem_global_free(vmw_mem_glob(dev_priv),
 			    vmw_user_context_size);
 }
@@ -1147,7 +1147,7 @@ static void vmw_user_surface_free(struct vmw_resource *res)
 	kfree(srf->offsets);
 	kfree(srf->sizes);
 	kfree(srf->snooper.image);
-	kfree(user_srf);
+	ttm_base_object_kfree(user_srf, base);
 	ttm_mem_global_free(vmw_mem_glob(dev_priv), size);
 }
 
@@ -1575,7 +1575,7 @@ static void vmw_user_dmabuf_destroy(struct ttm_buffer_object *bo)
 {
 	struct vmw_user_dma_buffer *vmw_user_bo = vmw_user_dma_buffer(bo);
 
-	kfree(vmw_user_bo);
+	ttm_base_object_kfree(vmw_user_bo, base);
 }
 
 static void vmw_user_dmabuf_release(struct ttm_base_object **p_base)
@@ -1763,7 +1763,7 @@ static void vmw_user_stream_free(struct vmw_resource *res)
 	    container_of(res, struct vmw_user_stream, stream.res);
 	struct vmw_private *dev_priv = res->dev_priv;
 
-	kfree(stream);
+	ttm_base_object_kfree(stream, base);
 	ttm_mem_global_free(vmw_mem_glob(dev_priv),
 			    vmw_user_stream_size);
 }
diff --git a/include/drm/ttm/ttm_object.h b/include/drm/ttm/ttm_object.h
index b01c563..fc0cf06 100644
--- a/include/drm/ttm/ttm_object.h
+++ b/include/drm/ttm/ttm_object.h
@@ -40,6 +40,7 @@
 #include <linux/list.h>
 #include <drm/drm_hashtab.h>
 #include <linux/kref.h>
+#include <linux/rcupdate.h>
 #include <ttm/ttm_memory.h>
 
 /**
@@ -120,6 +121,7 @@ struct ttm_object_device;
  */
 
 struct ttm_base_object {
+	struct rcu_head rhead;
 	struct drm_hash_item hash;
 	enum ttm_object_type object_type;
 	bool shareable;
@@ -268,4 +270,6 @@ extern struct ttm_object_device *ttm_object_device_init
 
 extern void ttm_object_device_release(struct ttm_object_device **p_tdev);
 
+#define ttm_base_object_kfree(__object, __base)\
+	kfree_rcu(__object, __base.rhead)
 #endif
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4] drm/ttm: Optimize reservation slightly
  2012-11-05 13:55 [PATCH 0/4] drm/ttm: Get rid of a number of atomic read-modify-write ops v2 Thomas Hellstrom
                   ` (2 preceding siblings ...)
  2012-11-05 13:55 ` [PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups v2 Thomas Hellstrom
@ 2012-11-05 13:55 ` Thomas Hellstrom
  3 siblings, 0 replies; 9+ messages in thread
From: Thomas Hellstrom @ 2012-11-05 13:55 UTC (permalink / raw)
  To: airlied, airlied; +Cc: dri-devel, linux-kernel, Thomas Hellstrom

Reservation locking currently always takes place under the LRU spinlock.
Hence, strictly speaking, there is no need for an atomic_cmpxchg call; we can
use atomic_read followed by atomic_set, since nobody else will ever reserve
without the LRU spinlock held.
At least on Intel this should remove a locked bus cycle on each successful
reserve.
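
For illustration, a hedged sketch of why the plain read + set is safe here
(a simplified no-wait path, not the actual function):

/* Hedged sketch: the no-wait reserve fast path after this change,
 * assuming bo->glob->lru_lock is held by the caller, which is what
 * makes the read + set race-free. */
static int reserve_nowait_sketch(struct ttm_buffer_object *bo)
{
	if (unlikely(atomic_read(&bo->reserved) != 0))
		return -EBUSY;	/* contended: the real code may wait instead */

	/* Plain store instead of a locked cmpxchg: no other thread can
	 * reserve between the read above and this set while the LRU
	 * spinlock is held. */
	atomic_set(&bo->reserved, 1);
	return 0;
}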

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index bf6e4b5..46008ea 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -220,7 +220,7 @@ int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
 	struct ttm_bo_global *glob = bo->glob;
 	int ret;
 
-	while (unlikely(atomic_cmpxchg(&bo->reserved, 0, 1) != 0)) {
+	while (unlikely(atomic_read(&bo->reserved) != 0)) {
 		/**
 		 * Deadlock avoidance for multi-bo reserving.
 		 */
@@ -249,6 +249,7 @@ int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
 			return ret;
 	}
 
+	atomic_set(&bo->reserved, 1);
 	if (use_sequence) {
 		/**
 		 * Wake up waiters that may need to recheck for deadlock,
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4] drm/ttm: Optimize reservation slightly
  2012-11-06 11:31 [PATCH 0/4] drm/ttm: Get rid of a number of atomic read-modify-write ops v3 Thomas Hellstrom
@ 2012-11-06 11:31 ` Thomas Hellstrom
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Hellstrom @ 2012-11-06 11:31 UTC (permalink / raw)
  To: airlied, airlied; +Cc: dri-devel, linux-kernel, Thomas Hellstrom

Reservation locking currently always takes place under the LRU spinlock.
Hence, strictly speaking, there is no need for an atomic_cmpxchg call; we can
use atomic_read followed by atomic_set, since nobody else will ever reserve
without the LRU spinlock held.
At least on Intel this should remove a locked bus cycle on each successful
reserve.

Note that this commit may be obsoleted by the cross-device reservation work.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index bf6e4b5..46008ea 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -220,7 +220,7 @@ int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
 	struct ttm_bo_global *glob = bo->glob;
 	int ret;
 
-	while (unlikely(atomic_cmpxchg(&bo->reserved, 0, 1) != 0)) {
+	while (unlikely(atomic_read(&bo->reserved) != 0)) {
 		/**
 		 * Deadlock avoidance for multi-bo reserving.
 		 */
@@ -249,6 +249,7 @@ int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
 			return ret;
 	}
 
+	atomic_set(&bo->reserved, 1);
 	if (use_sequence) {
 		/**
 		 * Wake up waiters that may need to recheck for deadlock,
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/4] drm/ttm: Optimize reservation slightly
  2012-11-05 14:01   ` Maarten Lankhorst
@ 2012-11-05 14:09     ` Thomas Hellstrom
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Hellstrom @ 2012-11-05 14:09 UTC (permalink / raw)
  To: Maarten Lankhorst; +Cc: airlied, airlied, linux-kernel, dri-devel

On 11/05/2012 03:01 PM, Maarten Lankhorst wrote:
> Hey,
>
> Op 05-11-12 14:31, Thomas Hellstrom schreef:
>> Reservation locking currently always takes place under the LRU spinlock.
>> Hence, strictly speaking, there is no need for an atomic_cmpxchg call; we can
>> use atomic_read followed by atomic_set, since nobody else will ever reserve
>> without the LRU spinlock held.
>> At least on Intel this should remove a locked bus cycle on each successful
>> reserve.
>>
>> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
>>
> Is that really a good thing to submit when I am busy killing the lru lock around reserve? :-)
>

If your patch series makes it into the same kernel, let's kill this patch.
Otherwise it may live at least for a kernel release. It's not a big thing to
rebase against, and I won't complain if your patch adds another atomic
read-modify-write op here. :)

/Thomas


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 4/4] drm/ttm: Optimize reservation slightly
  2012-11-05 13:31 ` [PATCH 4/4] drm/ttm: Optimize reservation slightly Thomas Hellstrom
@ 2012-11-05 14:01   ` Maarten Lankhorst
  2012-11-05 14:09     ` Thomas Hellstrom
  0 siblings, 1 reply; 9+ messages in thread
From: Maarten Lankhorst @ 2012-11-05 14:01 UTC (permalink / raw)
  To: Thomas Hellstrom; +Cc: airlied, airlied, linux-kernel, dri-devel

Hey,

Op 05-11-12 14:31, Thomas Hellstrom schreef:
> Reservation locking currently always takes place under the LRU spinlock.
> Hence, strictly speaking, there is no need for an atomic_cmpxchg call; we can
> use atomic_read followed by atomic_set, since nobody else will ever reserve
> without the LRU spinlock held.
> At least on Intel this should remove a locked bus cycle on each successful
> reserve.
>
> Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
>
Is that really a good thing to submit when I am busy killing the lru lock around reserve? :-)

-	while (unlikely(atomic_cmpxchg(&bo->reserved, 0, 1) != 0)) {
+	while (unlikely(atomic_xchg(&bo->reserved, 1) != 0)) {

Works without lru lock too!

In fact mutexes are done in a similar way[1], except with some more magic, and the unlocked state is 1, not 0.
However I do think that to get that right (it saves an irq disable in the unlock path, and fewer wakeups in the
contended case), I should really just post the mutex extension patches for reservations and ride the flames. It's
getting too close to real mutexes, so I really want it to be a mutex in that case. So let's convert it... Soon! :-)

~Maarten

[1] See linux/include/asm-generic/mutex-xchg.h and linux/include/asm-generic/mutex-dec.h for how
archs generally implement mutex fastpaths.
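
For reference, the xchg-based fastpath referred to in [1] looks roughly like
the sketch below (a hedged paraphrase in the spirit of
asm-generic/mutex-xchg.h, with my_* names, not a verbatim copy): the unlocked
state is 1, so a single xchg both tests and takes the lock.

/* Hedged sketch of an xchg-style mutex fastpath. */
static inline void my_mutex_lock_fastpath(atomic_t *count,
					  void (*fail_fn)(atomic_t *))
{
	if (unlikely(atomic_xchg(count, 0) != 1))
		fail_fn(count);		/* contended: take the slow path */
}

static inline void my_mutex_unlock_fastpath(atomic_t *count,
					    void (*fail_fn)(atomic_t *))
{
	if (unlikely(atomic_xchg(count, 1) != 0))
		fail_fn(count);		/* waiters may need waking */
}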


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 4/4] drm/ttm: Optimize reservation slightly
  2012-11-05 13:31 [PATCH 0/4] drm/ttm: Get rid of a number of atomic read-modify-write ops Thomas Hellstrom
@ 2012-11-05 13:31 ` Thomas Hellstrom
  2012-11-05 14:01   ` Maarten Lankhorst
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Hellstrom @ 2012-11-05 13:31 UTC (permalink / raw)
  To: airlied, airlied; +Cc: dri-devel, linux-kernel, Thomas Hellstrom

Reservation locking currently always takes place under the LRU spinlock.
Hence, strictly speaking, there is no need for an atomic_cmpxchg call; we can
use atomic_read followed by atomic_set, since nobody else will ever reserve
without the LRU spinlock held.
At least on Intel this should remove a locked bus cycle on each successful
reserve.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index bf6e4b5..46008ea 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -220,7 +220,7 @@ int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
 	struct ttm_bo_global *glob = bo->glob;
 	int ret;
 
-	while (unlikely(atomic_cmpxchg(&bo->reserved, 0, 1) != 0)) {
+	while (unlikely(atomic_read(&bo->reserved) != 0)) {
 		/**
 		 * Deadlock avoidance for multi-bo reserving.
 		 */
@@ -249,6 +249,7 @@ int ttm_bo_reserve_locked(struct ttm_buffer_object *bo,
 			return ret;
 	}
 
+	atomic_set(&bo->reserved, 1);
 	if (use_sequence) {
 		/**
 		 * Wake up waiters that may need to recheck for deadlock,
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2012-11-06 11:32 UTC | newest]

Thread overview: 9+ messages
2012-11-05 13:55 [PATCH 0/4] drm/ttm: Get rid of a number of atomic read-modify-write ops v2 Thomas Hellstrom
2012-11-05 13:55 ` [PATCH 1/4] drm: Make hashtab rcu-safe Thomas Hellstrom
2012-11-05 13:55 ` [PATCH 2/4] kref: Implement kref_get_unless_zero v2 Thomas Hellstrom
2012-11-05 13:55 ` [PATCH 3/4] drm/ttm, drm/vmwgfx: Use RCU locking for object lookups v2 Thomas Hellstrom
2012-11-05 13:55 ` [PATCH 4/4] drm/ttm: Optimize reservation slightly Thomas Hellstrom
  -- strict thread matches above, loose matches on Subject: below --
2012-11-06 11:31 [PATCH 0/4] drm/ttm: Get rid of a number of atomic read-modify-write ops v3 Thomas Hellstrom
2012-11-06 11:31 ` [PATCH 4/4] drm/ttm: Optimize reservation slightly Thomas Hellstrom
2012-11-05 13:31 [PATCH 0/4] drm/ttm: Get rid of a number of atomic read-modify-write ops Thomas Hellstrom
2012-11-05 13:31 ` [PATCH 4/4] drm/ttm: Optimize reservation slightly Thomas Hellstrom
2012-11-05 14:01   ` Maarten Lankhorst
2012-11-05 14:09     ` Thomas Hellstrom
