* [PATCH 0/7] cross-device reservation for dma-buf support
@ 2013-01-15 12:33 Maarten Lankhorst
  2013-01-15 12:33 ` [PATCH 1/7] arch: add __mutex_fastpath_lock_retval_arg to generic/sh/x86/powerpc/ia64 Maarten Lankhorst
                   ` (6 more replies)
From: Maarten Lankhorst @ 2013-01-15 12:33 UTC (permalink / raw)
  To: dri-devel, linux-kernel, linux-media, linaro-mm-sig

So I'm resending the patch series for reservations. This is identical to my git
tree at

http://cgit.freedesktop.org/~mlankhorst/linux/

Some changes have been made since my last version. Most notably, mutexes are
now used instead of a custom lock primitive, which would have ended up
duplicating them anyway.

The git tree also has a version where i915 and radeon work together this way.
It's probably a bit hacky, but it works on my MacBook Pro 8,2. :)

I haven't had any replies on the mutex extensions when I sent them out
separately, so I'm including them in this series.

The idea is that for lockdep purposes, the seqno is tied to a locking class.
This locking class is not exclusive, but as can be seen from the last patch in
the series, it catches all the violations we care about.
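
For illustration, here is a rough sketch (not part of the series; struct
my_buf, struct my_ticket and the list handling are hypothetical stand-ins for
what the dma-buf manager patches do) of reserving a set of buffers under a
single reservation_id, backing off on -EAGAIN and re-taking the contended
lock with the _slow variant:

struct my_buf {
	struct ticket_mutex lock;
	/* ... */
};

struct my_ticket {
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	/* nest_lock annotation only; a real ticket would lockdep_init_map() this */
	struct lockdep_map dep_map;
#endif
};

static int my_reserve_bufs(struct my_buf **bufs, int n,
			   struct my_ticket *ticket, unsigned long id)
{
	struct my_buf *contended = NULL;
	int i, j, ret;

retry:
	for (i = 0; i < n; i++) {
		if (bufs[i] == contended) {
			/* already reserved via the slowpath below */
			contended = NULL;
			continue;
		}

		ret = mutex_reserve_lock(&bufs[i]->lock, ticket, id);
		if (!ret)
			continue;

		/* back off: drop everything reserved so far */
		for (j = 0; j < i; j++)
			mutex_unreserve_unlock(&bufs[j]->lock);
		if (contended) {
			mutex_unreserve_unlock(&contended->lock);
			contended = NULL;
		}

		/* -EDEADLK means bufs[i] was already reserved under this id */
		if (ret != -EAGAIN)
			return ret;

		/* an older reservation holds bufs[i]: wait for it, then retry */
		mutex_reserve_lock_slow(&bufs[i]->lock, ticket, id);
		contended = bufs[i];
		goto retry;
	}
	return 0;
}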



* [PATCH 1/7] arch: add __mutex_fastpath_lock_retval_arg to generic/sh/x86/powerpc/ia64
  2013-01-15 12:33 [PATCH 0/7] cross-device reservation for dma-buf support Maarten Lankhorst
@ 2013-01-15 12:33 ` Maarten Lankhorst
  2013-01-15 13:49   ` Maarten Lankhorst
  2013-01-15 12:33 ` [PATCH 2/7] mutex: add support for reservation style locks Maarten Lankhorst
                   ` (5 subsequent siblings)
From: Maarten Lankhorst @ 2013-01-15 12:33 UTC (permalink / raw)
  To: dri-devel, linux-kernel, linux-media, linaro-mm-sig; +Cc: Maarten Lankhorst

Needed for reservation slowpath.
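
For reference, the intended calling convention looks roughly like this
(mirroring what the next patch does in _mutex_reserve_lock; my_lock_contended
is a hypothetical slowpath): the fastpath takes the lock with one atomic op,
and only on contention is the slowpath called with the extra argument passed
through as a void pointer.

static noinline int my_lock_slowpath(atomic_t *lock_count, void *arg)
{
	struct mutex *lock = container_of(lock_count, struct mutex, count);

	/* contended: recover the id and take the full slowpath */
	return my_lock_contended(lock, (unsigned long)arg);
}

static int my_lock(struct mutex *lock, unsigned long id)
{
	return __mutex_fastpath_lock_retval_arg(&lock->count, (void *)id,
						my_lock_slowpath);
}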
---
 arch/ia64/include/asm/mutex.h    | 20 ++++++++++++++++++++
 arch/powerpc/include/asm/mutex.h | 20 ++++++++++++++++++++
 arch/sh/include/asm/mutex-llsc.h | 20 ++++++++++++++++++++
 arch/x86/include/asm/mutex_32.h  | 20 ++++++++++++++++++++
 arch/x86/include/asm/mutex_64.h  | 20 ++++++++++++++++++++
 include/asm-generic/mutex-dec.h  | 20 ++++++++++++++++++++
 include/asm-generic/mutex-null.h |  1 +
 include/asm-generic/mutex-xchg.h | 21 +++++++++++++++++++++
 8 files changed, 142 insertions(+)

diff --git a/arch/ia64/include/asm/mutex.h b/arch/ia64/include/asm/mutex.h
index bed73a6..2510058 100644
--- a/arch/ia64/include/asm/mutex.h
+++ b/arch/ia64/include/asm/mutex.h
@@ -44,6 +44,26 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
 }
 
 /**
+ *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
+ *                                     from 1 to a 0 value
+ *  @count: pointer of type atomic_t
+ *  @arg: argument to pass along if fastpath fails.
+ *  @fail_fn: function to call if the original value was not 1
+ *
+ * Change the count from 1 to a value lower than 1, and call <fail_fn> if
+ * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
+ * or anything the slow path function returns.
+ */
+static inline int __mutex_fastpath_lock_retval_arg(atomic_t *count,
+				void *arg, int (*fail_fn)(atomic_t *, void*))
+{
+	if (unlikely(ia64_fetchadd4_acq(count, -1) != 1))
+		return fail_fn(count, arg);
+	else
+		return 0;
+}
+
+/**
  *  __mutex_fastpath_unlock - try to promote the count from 0 to 1
  *  @count: pointer of type atomic_t
  *  @fail_fn: function to call if the original value was not 0
diff --git a/arch/powerpc/include/asm/mutex.h b/arch/powerpc/include/asm/mutex.h
index 5399f7e..df4bcff 100644
--- a/arch/powerpc/include/asm/mutex.h
+++ b/arch/powerpc/include/asm/mutex.h
@@ -97,6 +97,26 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
 }
 
 /**
+ *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
+ *                                     from 1 to a 0 value
+ *  @count: pointer of type atomic_t
+ *  @arg: argument to pass along if fastpath fails.
+ *  @fail_fn: function to call if the original value was not 1
+ *
+ * Change the count from 1 to a value lower than 1, and call <fail_fn> if
+ * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
+ * or anything the slow path function returns.
+ */
+static inline int __mutex_fastpath_lock_retval_arg(atomic_t *count,
+				void *arg, int (*fail_fn)(atomic_t *, void*))
+{
+	if (unlikely(__mutex_dec_return_lock(count) < 0))
+		return fail_fn(count, arg);
+	else
+		return 0;
+}
+
+/**
  *  __mutex_fastpath_unlock - try to promote the count from 0 to 1
  *  @count: pointer of type atomic_t
  *  @fail_fn: function to call if the original value was not 0
diff --git a/arch/sh/include/asm/mutex-llsc.h b/arch/sh/include/asm/mutex-llsc.h
index 090358a..b68dd6d 100644
--- a/arch/sh/include/asm/mutex-llsc.h
+++ b/arch/sh/include/asm/mutex-llsc.h
@@ -56,6 +56,26 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
 	return __res;
 }
 
+static inline int __mutex_fastpath_lock_retval_arg(atomic_t *count,
+				void *arg, int (*fail_fn)(atomic_t *, void *))
+{
+	int __done, __res;
+
+	__asm__ __volatile__ (
+		"movli.l	@%2, %0	\n"
+		"add		#-1, %0	\n"
+		"movco.l	%0, @%2	\n"
+		"movt		%1	\n"
+		: "=&z" (__res), "=&r" (__done)
+		: "r" (&(count)->counter)
+		: "t");
+
+	if (unlikely(!__done || __res != 0))
+		__res = fail_fn(count, arg);
+
+	return __res;
+}
+
 static inline void
 __mutex_fastpath_unlock(atomic_t *count, void (*fail_fn)(atomic_t *))
 {
diff --git a/arch/x86/include/asm/mutex_32.h b/arch/x86/include/asm/mutex_32.h
index 03f90c8..34f77f9 100644
--- a/arch/x86/include/asm/mutex_32.h
+++ b/arch/x86/include/asm/mutex_32.h
@@ -58,6 +58,26 @@ static inline int __mutex_fastpath_lock_retval(atomic_t *count,
 }
 
 /**
+ *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
+ *                                     from 1 to a 0 value
+ *  @count: pointer of type atomic_t
+ *  @arg: argument to pass along if fastpath fails.
+ *  @fail_fn: function to call if the original value was not 1
+ *
+ * Change the count from 1 to a value lower than 1, and call <fail_fn> if
+ * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
+ * or anything the slow path function returns.
+ */
+static inline int __mutex_fastpath_lock_retval_arg(atomic_t *count,
+				void *arg, int (*fail_fn)(atomic_t *, void*))
+{
+	if (unlikely(atomic_dec_return(count) < 0))
+		return fail_fn(count, arg);
+	else
+		return 0;
+}
+
+/**
  *  __mutex_fastpath_unlock - try to promote the mutex from 0 to 1
  *  @count: pointer of type atomic_t
  *  @fail_fn: function to call if the original value was not 0
diff --git a/arch/x86/include/asm/mutex_64.h b/arch/x86/include/asm/mutex_64.h
index 68a87b0..148249e 100644
--- a/arch/x86/include/asm/mutex_64.h
+++ b/arch/x86/include/asm/mutex_64.h
@@ -53,6 +53,26 @@ static inline int __mutex_fastpath_lock_retval(atomic_t *count,
 }
 
 /**
+ *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
+ *                                     from 1 to a 0 value
+ *  @count: pointer of type atomic_t
+ *  @arg: argument to pass along if fastpath fails.
+ *  @fail_fn: function to call if the original value was not 1
+ *
+ * Change the count from 1 to a value lower than 1, and call <fail_fn> if
+ * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
+ * or anything the slow path function returns.
+ */
+static inline int __mutex_fastpath_lock_retval_arg(atomic_t *count,
+				void *arg, int (*fail_fn)(atomic_t *, void*))
+{
+	if (unlikely(atomic_dec_return(count) < 0))
+		return fail_fn(count, arg);
+	else
+		return 0;
+}
+
+/**
  * __mutex_fastpath_unlock - increment and call function if nonpositive
  * @v: pointer of type atomic_t
  * @fail_fn: function to call if the result is nonpositive
diff --git a/include/asm-generic/mutex-dec.h b/include/asm-generic/mutex-dec.h
index f104af7..f5d027e 100644
--- a/include/asm-generic/mutex-dec.h
+++ b/include/asm-generic/mutex-dec.h
@@ -43,6 +43,26 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
 }
 
 /**
+ *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
+ *                                     from 1 to a 0 value
+ *  @count: pointer of type atomic_t
+ *  @arg: argument to pass along if fastpath fails.
+ *  @fail_fn: function to call if the original value was not 1
+ *
+ * Change the count from 1 to a value lower than 1, and call <fail_fn> if
+ * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
+ * or anything the slow path function returns.
+ */
+static inline int
+__mutex_fastpath_lock_retval_arg(atomic_t *count, void *arg,
+				 int (*fail_fn)(atomic_t *, void*))
+{
+	if (unlikely(atomic_dec_return(count) < 0))
+		return fail_fn(count, arg);
+	return 0;
+}
+
+/**
  *  __mutex_fastpath_unlock - try to promote the count from 0 to 1
  *  @count: pointer of type atomic_t
  *  @fail_fn: function to call if the original value was not 0
diff --git a/include/asm-generic/mutex-null.h b/include/asm-generic/mutex-null.h
index e1bbbc7..991e9c3 100644
--- a/include/asm-generic/mutex-null.h
+++ b/include/asm-generic/mutex-null.h
@@ -12,6 +12,7 @@
 
 #define __mutex_fastpath_lock(count, fail_fn)		fail_fn(count)
 #define __mutex_fastpath_lock_retval(count, fail_fn)	fail_fn(count)
+#define __mutex_fastpath_lock_retval_arg(count, arg, fail_fn)	fail_fn(count, arg)
 #define __mutex_fastpath_unlock(count, fail_fn)		fail_fn(count)
 #define __mutex_fastpath_trylock(count, fail_fn)	fail_fn(count)
 #define __mutex_slowpath_needs_to_unlock()		1
diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
index c04e0db..d9cc971 100644
--- a/include/asm-generic/mutex-xchg.h
+++ b/include/asm-generic/mutex-xchg.h
@@ -55,6 +55,27 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
 }
 
 /**
+ *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
+ *                                     from 1 to a 0 value
+ *  @count: pointer of type atomic_t
+ *  @arg: argument to pass along if fastpath fails.
+ *  @fail_fn: function to call if the original value was not 1
+ *
+ * Change the count from 1 to a value lower than 1, and call <fail_fn> if
+ * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
+ * or anything the slow path function returns.
+ */
+static inline int
+__mutex_fastpath_lock_retval_arg(atomic_t *count, void *arg,
+				 int (*fail_fn)(atomic_t *, void*))
+{
+	if (unlikely(atomic_xchg(count, 0) != 1))
+		if (likely(atomic_xchg(count, -1) != 1))
+			return fail_fn(count, arg);
+	return 0;
+}
+
+/**
  *  __mutex_fastpath_unlock - try to promote the mutex from 0 to 1
  *  @count: pointer of type atomic_t
  *  @fail_fn: function to call if the original value was not 0
-- 
1.8.0.3



* [PATCH 2/7] mutex: add support for reservation style locks
  2013-01-15 12:33 [PATCH 0/7] cross-device reservation for dma-buf support Maarten Lankhorst
  2013-01-15 12:33 ` [PATCH 1/7] arch: add __mutex_fastpath_lock_retval_arg to generic/sh/x86/powerpc/ia64 Maarten Lankhorst
@ 2013-01-15 12:33 ` Maarten Lankhorst
  2013-01-15 13:43   ` Maarten Lankhorst
  2013-01-30  1:07   ` Rob Clark
  2013-01-15 12:34 ` [PATCH 3/7] sched: allow try_to_wake_up to be used internally outside of core.c Maarten Lankhorst
                   ` (4 subsequent siblings)
From: Maarten Lankhorst @ 2013-01-15 12:33 UTC (permalink / raw)
  To: dri-devel, linux-kernel, linux-media, linaro-mm-sig; +Cc: Maarten Lankhorst

Makes it easier to port ttm over.
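
A minimal usage sketch (struct my_bo is made up here; the real users are ttm
and the dma-buf manager code later in the series):

struct my_bo {
	struct ticket_mutex resv;
	/* ... */
};

static void my_bo_init(struct my_bo *bo)
{
	static struct lock_class_key key;

	__ticket_mutex_init(&bo->resv, "my_bo_resv", &key);
}

/*
 * Reserve with mutex_reserve_lock(&bo->resv, ticket, id): 0 on success,
 * -EDEADLK if this id already holds the lock, -EAGAIN if the caller must
 * back off (see the backoff sketch in the cover letter).  Release with:
 */
static void my_bo_unreserve(struct my_bo *bo)
{
	mutex_unreserve_unlock(&bo->resv);
}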

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
 include/linux/mutex.h |  86 +++++++++++++-
 kernel/mutex.c        | 317 +++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 387 insertions(+), 16 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 9121595..602c247 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -62,6 +62,11 @@ struct mutex {
 #endif
 };
 
+struct ticket_mutex {
+	struct mutex base;
+	atomic_long_t reservation_id;
+};
+
 /*
  * This is the control structure for tasks blocked on mutex,
  * which resides on the blocked task's kernel stack:
@@ -109,12 +114,24 @@ static inline void mutex_destroy(struct mutex *lock) {}
 		__DEBUG_MUTEX_INITIALIZER(lockname) \
 		__DEP_MAP_MUTEX_INITIALIZER(lockname) }
 
+#define __TICKET_MUTEX_INITIALIZER(lockname) \
+		{ .base = __MUTEX_INITIALIZER(lockname) \
+		, .reservation_id = ATOMIC_LONG_INIT(0) }
+
 #define DEFINE_MUTEX(mutexname) \
 	struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
 
 extern void __mutex_init(struct mutex *lock, const char *name,
 			 struct lock_class_key *key);
 
+static inline void __ticket_mutex_init(struct ticket_mutex *lock,
+				       const char *name,
+				       struct lock_class_key *key)
+{
+	__mutex_init(&lock->base, name, key);
+	atomic_long_set(&lock->reservation_id, 0);
+}
+
 /**
  * mutex_is_locked - is the mutex locked
  * @lock: the mutex to be queried
@@ -133,26 +150,91 @@ static inline int mutex_is_locked(struct mutex *lock)
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
 extern void _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest_lock);
+
 extern int __must_check mutex_lock_interruptible_nested(struct mutex *lock,
 					unsigned int subclass);
 extern int __must_check mutex_lock_killable_nested(struct mutex *lock,
 					unsigned int subclass);
 
+extern int __must_check _mutex_reserve_lock(struct ticket_mutex *lock,
+					struct lockdep_map *nest_lock,
+					unsigned long reservation_id);
+
+extern int __must_check _mutex_reserve_lock_interruptible(struct ticket_mutex *,
+					struct lockdep_map *nest_lock,
+					unsigned long reservation_id);
+
+extern void _mutex_reserve_lock_slow(struct ticket_mutex *lock,
+				     struct lockdep_map *nest_lock,
+				     unsigned long reservation_id);
+
+extern int __must_check _mutex_reserve_lock_intr_slow(struct ticket_mutex *,
+					struct lockdep_map *nest_lock,
+					unsigned long reservation_id);
+
 #define mutex_lock(lock) mutex_lock_nested(lock, 0)
 #define mutex_lock_interruptible(lock) mutex_lock_interruptible_nested(lock, 0)
 #define mutex_lock_killable(lock) mutex_lock_killable_nested(lock, 0)
 
 #define mutex_lock_nest_lock(lock, nest_lock)				\
 do {									\
-	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);		\
+	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
 	_mutex_lock_nest_lock(lock, &(nest_lock)->dep_map);		\
 } while (0)
 
+#define mutex_reserve_lock(lock, nest_lock, reservation_id)		\
+({									\
+	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
+	_mutex_reserve_lock(lock, &(nest_lock)->dep_map, reservation_id);	\
+})
+
+#define mutex_reserve_lock_interruptible(lock, nest_lock, reservation_id)	\
+({									\
+	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
+	_mutex_reserve_lock_interruptible(lock, &(nest_lock)->dep_map,	\
+					   reservation_id);		\
+})
+
+#define mutex_reserve_lock_slow(lock, nest_lock, reservation_id)	\
+do {									\
+	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
+	_mutex_reserve_lock_slow(lock, &(nest_lock)->dep_map, reservation_id);	\
+} while (0)
+
+#define mutex_reserve_lock_intr_slow(lock, nest_lock, reservation_id)	\
+({									\
+	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
+	_mutex_reserve_lock_intr_slow(lock, &(nest_lock)->dep_map,	\
+				      reservation_id);			\
+})
+
 #else
 extern void mutex_lock(struct mutex *lock);
 extern int __must_check mutex_lock_interruptible(struct mutex *lock);
 extern int __must_check mutex_lock_killable(struct mutex *lock);
 
+extern int __must_check _mutex_reserve_lock(struct ticket_mutex *lock,
+					    unsigned long reservation_id);
+extern int __must_check _mutex_reserve_lock_interruptible(struct ticket_mutex *,
+						unsigned long reservation_id);
+
+extern void _mutex_reserve_lock_slow(struct ticket_mutex *lock,
+				     unsigned long reservation_id);
+extern int __must_check _mutex_reserve_lock_intr_slow(struct ticket_mutex *,
+						unsigned long reservation_id);
+
+#define mutex_reserve_lock(lock, nest_lock, reservation_id)		\
+	_mutex_reserve_lock(lock, reservation_id)
+
+#define mutex_reserve_lock_interruptible(lock, nest_lock, reservation_id)	\
+	_mutex_reserve_lock_interruptible(lock, reservation_id)
+
+#define mutex_reserve_lock_slow(lock, nest_lock, reservation_id)	\
+	_mutex_reserve_lock_slow(lock, reservation_id)
+
+#define mutex_reserve_lock_intr_slow(lock, nest_lock, reservation_id)	\
+	_mutex_reserve_lock_intr_slow(lock, reservation_id)
+
 # define mutex_lock_nested(lock, subclass) mutex_lock(lock)
 # define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
 # define mutex_lock_killable_nested(lock, subclass) mutex_lock_killable(lock)
@@ -167,6 +249,8 @@ extern int __must_check mutex_lock_killable(struct mutex *lock);
  */
 extern int mutex_trylock(struct mutex *lock);
 extern void mutex_unlock(struct mutex *lock);
+extern void mutex_unreserve_unlock(struct ticket_mutex *lock);
+
 extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
 
 #ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX
diff --git a/kernel/mutex.c b/kernel/mutex.c
index a307cc9..8282729 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -126,16 +126,119 @@ void __sched mutex_unlock(struct mutex *lock)
 
 EXPORT_SYMBOL(mutex_unlock);
 
+/**
+ * mutex_unreserve_unlock - release the mutex
+ * @lock: the mutex to be released
+ *
+ * Unlock a mutex that has been locked by this task previously
+ * with _mutex_reserve_lock*.
+ *
+ * This function must not be used in interrupt context. Unlocking
+ * of a not locked mutex is not allowed.
+ */
+void __sched mutex_unreserve_unlock(struct ticket_mutex *lock)
+{
+	/*
+	 * mark mutex as no longer part of a reservation, next
+	 * locker can set this again
+	 */
+	atomic_long_set(&lock->reservation_id, 0);
+
+	/*
+	 * The unlocking fastpath is the 0->1 transition from 'locked'
+	 * into 'unlocked' state:
+	 */
+#ifndef CONFIG_DEBUG_MUTEXES
+	/*
+	 * When debugging is enabled we must not clear the owner before time,
+	 * the slow path will always be taken, and that clears the owner field
+	 * after verifying that it was indeed current.
+	 */
+	mutex_clear_owner(&lock->base);
+#endif
+	__mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
+}
+EXPORT_SYMBOL(mutex_unreserve_unlock);
+
+static inline int __sched
+__mutex_lock_check_reserve(struct mutex *lock, unsigned long reservation_id)
+{
+	struct ticket_mutex *m = container_of(lock, struct ticket_mutex, base);
+	unsigned long cur_id;
+
+	cur_id = atomic_long_read(&m->reservation_id);
+	if (!cur_id)
+		return 0;
+
+	if (unlikely(reservation_id == cur_id))
+		return -EDEADLK;
+
+	if (unlikely(reservation_id - cur_id <= LONG_MAX))
+		return -EAGAIN;
+
+	return 0;
+}
+
+/*
+ * after acquiring lock with fastpath or when we lost out in contested
+ * slowpath, set reservation_id and wake up any waiters so they can recheck.
+ */
+static __always_inline void
+mutex_set_reservation_fastpath(struct ticket_mutex *lock,
+			       unsigned long reservation_id, bool check_res)
+{
+	unsigned long flags;
+	struct mutex_waiter *cur;
+
+	if (check_res || config_enabled(CONFIG_DEBUG_LOCK_ALLOC)) {
+		unsigned long cur_id;
+
+		cur_id = atomic_long_xchg(&lock->reservation_id,
+					  reservation_id);
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+		if (check_res)
+			DEBUG_LOCKS_WARN_ON(cur_id &&
+					   cur_id != reservation_id);
+		else
+			DEBUG_LOCKS_WARN_ON(cur_id);
+		lockdep_assert_held(&lock->base);
+#endif
+
+		if (unlikely(cur_id == reservation_id))
+			return;
+	} else
+		atomic_long_set(&lock->reservation_id, reservation_id);
+
+	/*
+	 * Check if lock is contended, if not there is nobody to wake up
+	 */
+	if (likely(atomic_read(&lock->base.count) == 0))
+		return;
+
+	/*
+	 * Uh oh, we raced in fastpath, wake up everyone in this case,
+	 * so they can see the new reservation_id
+	 */
+	spin_lock_mutex(&lock->base.wait_lock, flags);
+	list_for_each_entry(cur, &lock->base.wait_list, list) {
+		debug_mutex_wake_waiter(&lock->base, cur);
+		wake_up_process(cur->task);
+	}
+	spin_unlock_mutex(&lock->base.wait_lock, flags);
+}
+
 /*
  * Lock a mutex (possibly interruptible), slowpath:
  */
 static inline int __sched
 __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
-		    struct lockdep_map *nest_lock, unsigned long ip)
+		    struct lockdep_map *nest_lock, unsigned long ip,
+		    unsigned long reservation_id, bool res_slow)
 {
 	struct task_struct *task = current;
 	struct mutex_waiter waiter;
 	unsigned long flags;
+	int ret;
 
 	preempt_disable();
 	mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
@@ -162,6 +265,12 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 	for (;;) {
 		struct task_struct *owner;
 
+		if (!__builtin_constant_p(reservation_id) && !res_slow) {
+			ret = __mutex_lock_check_reserve(lock, reservation_id);
+			if (ret)
+				goto err_nowait;
+		}
+
 		/*
 		 * If there's an owner, wait for it to either
 		 * release the lock or go to sleep.
@@ -172,6 +281,13 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 
 		if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
 			lock_acquired(&lock->dep_map, ip);
+			if (res_slow) {
+				struct ticket_mutex *m;
+				m = container_of(lock, struct ticket_mutex, base);
+
+				mutex_set_reservation_fastpath(m, reservation_id, false);
+			}
+
 			mutex_set_owner(lock);
 			preempt_enable();
 			return 0;
@@ -227,15 +343,16 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		 * TASK_UNINTERRUPTIBLE case.)
 		 */
 		if (unlikely(signal_pending_state(state, task))) {
-			mutex_remove_waiter(lock, &waiter,
-					    task_thread_info(task));
-			mutex_release(&lock->dep_map, 1, ip);
-			spin_unlock_mutex(&lock->wait_lock, flags);
+			ret = -EINTR;
+			goto err;
+		}
 
-			debug_mutex_free_waiter(&waiter);
-			preempt_enable();
-			return -EINTR;
+		if (!__builtin_constant_p(reservation_id) && !res_slow) {
+			ret = __mutex_lock_check_reserve(lock, reservation_id);
+			if (ret)
+				goto err;
 		}
+
 		__set_task_state(task, state);
 
 		/* didn't get the lock, go to sleep: */
@@ -250,6 +367,28 @@ done:
 	mutex_remove_waiter(lock, &waiter, current_thread_info());
 	mutex_set_owner(lock);
 
+	if (!__builtin_constant_p(reservation_id)) {
+		struct ticket_mutex *m;
+		struct mutex_waiter *cur;
+		/*
+		 * this should get optimized out for the common case,
+		 * and is only important for _mutex_reserve_lock
+		 */
+
+		m = container_of(lock, struct ticket_mutex, base);
+		atomic_long_set(&m->reservation_id, reservation_id);
+
+		/*
+		 * give any possible sleeping processes the chance to wake up,
+		 * so they can recheck if they have to back off from
+		 * reservations
+		 */
+		list_for_each_entry(cur, &lock->wait_list, list) {
+			debug_mutex_wake_waiter(lock, cur);
+			wake_up_process(cur->task);
+		}
+	}
+
 	/* set it to 0 if there are no waiters left: */
 	if (likely(list_empty(&lock->wait_list)))
 		atomic_set(&lock->count, 0);
@@ -260,6 +399,19 @@ done:
 	preempt_enable();
 
 	return 0;
+
+err:
+	mutex_remove_waiter(lock, &waiter, task_thread_info(task));
+	spin_unlock_mutex(&lock->wait_lock, flags);
+	debug_mutex_free_waiter(&waiter);
+
+#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
+err_nowait:
+#endif
+	mutex_release(&lock->dep_map, 1, ip);
+
+	preempt_enable();
+	return ret;
 }
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -267,7 +419,8 @@ void __sched
 mutex_lock_nested(struct mutex *lock, unsigned int subclass)
 {
 	might_sleep();
-	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, subclass, NULL, _RET_IP_);
+	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
+			    subclass, NULL, _RET_IP_, 0, 0);
 }
 
 EXPORT_SYMBOL_GPL(mutex_lock_nested);
@@ -276,7 +429,8 @@ void __sched
 _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest)
 {
 	might_sleep();
-	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0, nest, _RET_IP_);
+	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
+			    0, nest, _RET_IP_, 0, 0);
 }
 
 EXPORT_SYMBOL_GPL(_mutex_lock_nest_lock);
@@ -285,7 +439,8 @@ int __sched
 mutex_lock_killable_nested(struct mutex *lock, unsigned int subclass)
 {
 	might_sleep();
-	return __mutex_lock_common(lock, TASK_KILLABLE, subclass, NULL, _RET_IP_);
+	return __mutex_lock_common(lock, TASK_KILLABLE,
+				   subclass, NULL, _RET_IP_, 0, 0);
 }
 EXPORT_SYMBOL_GPL(mutex_lock_killable_nested);
 
@@ -294,10 +449,63 @@ mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
 {
 	might_sleep();
 	return __mutex_lock_common(lock, TASK_INTERRUPTIBLE,
-				   subclass, NULL, _RET_IP_);
+				   subclass, NULL, _RET_IP_, 0, 0);
 }
 
 EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
+
+int __sched
+_mutex_reserve_lock(struct ticket_mutex *lock, struct lockdep_map *nest,
+		    unsigned long reservation_id)
+{
+	DEBUG_LOCKS_WARN_ON(!reservation_id);
+
+	might_sleep();
+	return __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
+				   0, nest, _RET_IP_, reservation_id, 0);
+}
+EXPORT_SYMBOL_GPL(_mutex_reserve_lock);
+
+
+int __sched
+_mutex_reserve_lock_interruptible(struct ticket_mutex *lock,
+				  struct lockdep_map *nest,
+				  unsigned long reservation_id)
+{
+	DEBUG_LOCKS_WARN_ON(!reservation_id);
+
+	might_sleep();
+	return __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE,
+				   0, nest, _RET_IP_, reservation_id, 0);
+}
+EXPORT_SYMBOL_GPL(_mutex_reserve_lock_interruptible);
+
+void __sched
+_mutex_reserve_lock_slow(struct ticket_mutex *lock, struct lockdep_map *nest,
+			 unsigned long reservation_id)
+{
+	DEBUG_LOCKS_WARN_ON(!reservation_id);
+
+	might_sleep();
+	__mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE, 0,
+			    nest, _RET_IP_, reservation_id, 1);
+}
+EXPORT_SYMBOL_GPL(_mutex_reserve_lock_slow);
+
+int __sched
+_mutex_reserve_lock_intr_slow(struct ticket_mutex *lock,
+			      struct lockdep_map *nest,
+			      unsigned long reservation_id)
+{
+	DEBUG_LOCKS_WARN_ON(!reservation_id);
+
+	might_sleep();
+	return __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE, 0,
+				   nest, _RET_IP_, reservation_id, 1);
+}
+EXPORT_SYMBOL_GPL(_mutex_reserve_lock_intr_slow);
+
+
 #endif
 
 /*
@@ -400,7 +608,8 @@ __mutex_lock_slowpath(atomic_t *lock_count)
 {
 	struct mutex *lock = container_of(lock_count, struct mutex, count);
 
-	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0, NULL, _RET_IP_);
+	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0,
+			    NULL, _RET_IP_, 0, 0);
 }
 
 static noinline int __sched
@@ -408,7 +617,8 @@ __mutex_lock_killable_slowpath(atomic_t *lock_count)
 {
 	struct mutex *lock = container_of(lock_count, struct mutex, count);
 
-	return __mutex_lock_common(lock, TASK_KILLABLE, 0, NULL, _RET_IP_);
+	return __mutex_lock_common(lock, TASK_KILLABLE, 0,
+				   NULL, _RET_IP_, 0, 0);
 }
 
 static noinline int __sched
@@ -416,8 +626,28 @@ __mutex_lock_interruptible_slowpath(atomic_t *lock_count)
 {
 	struct mutex *lock = container_of(lock_count, struct mutex, count);
 
-	return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0, NULL, _RET_IP_);
+	return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0,
+				   NULL, _RET_IP_, 0, 0);
+}
+
+static noinline int __sched
+__mutex_lock_reserve_slowpath(atomic_t *lock_count, void *rid)
+{
+	struct mutex *lock = container_of(lock_count, struct mutex, count);
+
+	return __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0,
+				   NULL, _RET_IP_, (unsigned long)rid, 0);
+}
+
+static noinline int __sched
+__mutex_lock_interruptible_reserve_slowpath(atomic_t *lock_count, void *rid)
+{
+	struct mutex *lock = container_of(lock_count, struct mutex, count);
+
+	return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0,
+				   NULL, _RET_IP_, (unsigned long)rid, 0);
 }
+
 #endif
 
 /*
@@ -473,6 +703,63 @@ int __sched mutex_trylock(struct mutex *lock)
 }
 EXPORT_SYMBOL(mutex_trylock);
 
+#ifndef CONFIG_DEBUG_LOCK_ALLOC
+int __sched
+_mutex_reserve_lock(struct ticket_mutex *lock, unsigned long rid)
+{
+	int ret;
+
+	might_sleep();
+
+	ret = __mutex_fastpath_lock_retval_arg(&lock->base.count, (void *)rid,
+					__mutex_lock_reserve_slowpath);
+
+	if (!ret) {
+		mutex_set_reservation_fastpath(lock, rid, true);
+		mutex_set_owner(&lock->base);
+	}
+	return ret;
+}
+EXPORT_SYMBOL(_mutex_reserve_lock);
+
+int __sched
+_mutex_reserve_lock_interruptible(struct ticket_mutex *lock, unsigned long rid)
+{
+	int ret;
+
+	might_sleep();
+
+	ret = __mutex_fastpath_lock_retval_arg(&lock->base.count, (void *)rid,
+				__mutex_lock_interruptible_reserve_slowpath);
+
+	if (!ret) {
+		mutex_set_reservation_fastpath(lock, rid, true);
+		mutex_set_owner(&lock->base);
+	}
+	return ret;
+}
+EXPORT_SYMBOL(_mutex_reserve_lock_interruptible);
+
+void __sched
+_mutex_reserve_lock_slow(struct ticket_mutex *lock, unsigned long rid)
+{
+	might_sleep();
+	__mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
+			    0, NULL, _RET_IP_, rid, 1);
+}
+EXPORT_SYMBOL(_mutex_reserve_lock_slow);
+
+int __sched
+_mutex_reserve_lock_intr_slow(struct ticket_mutex *lock, unsigned long rid)
+{
+	might_sleep();
+	return __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE,
+				   0, NULL, _RET_IP_, rid, 1);
+}
+EXPORT_SYMBOL(_mutex_reserve_lock_intr_slow);
+
+#endif
+
 /**
  * atomic_dec_and_mutex_lock - return holding mutex if we dec to 0
  * @cnt: the atomic which we are to dec
-- 
1.8.0.3



* [PATCH 3/7] sched: allow try_to_wake_up to be used internally outside of core.c
  2013-01-15 12:33 [PATCH 0/7] cross-device reservation for dma-buf support Maarten Lankhorst
  2013-01-15 12:33 ` [PATCH 1/7] arch: add __mutex_fastpath_lock_retval_arg to generic/sh/x86/powerpc/ia64 Maarten Lankhorst
  2013-01-15 12:33 ` [PATCH 2/7] mutex: add support for reservation style locks Maarten Lankhorst
@ 2013-01-15 12:34 ` Maarten Lankhorst
  2013-01-15 12:34 ` [PATCH 4/7] fence: dma-buf cross-device synchronization (v11) Maarten Lankhorst
                   ` (3 subsequent siblings)
From: Maarten Lankhorst @ 2013-01-15 12:34 UTC (permalink / raw)
  To: dri-devel, linux-kernel, linux-media, linaro-mm-sig; +Cc: Maarten Lankhorst

Not exported, since only used by the fence implementation.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
 include/linux/wait.h | 1 +
 kernel/sched/core.c  | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 7cb64d4..7aaba95 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -11,6 +11,7 @@
 typedef struct __wait_queue wait_queue_t;
 typedef int (*wait_queue_func_t)(wait_queue_t *wait, unsigned mode, int flags, void *key);
 int default_wake_function(wait_queue_t *wait, unsigned mode, int flags, void *key);
+int try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags);
 
 struct __wait_queue {
 	unsigned int flags;
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 257002c..5f23fe3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1425,7 +1425,7 @@ static void ttwu_queue(struct task_struct *p, int cpu)
  * Returns %true if @p was woken up, %false if it was already running
  * or @state didn't match @p's state.
  */
-static int
+int
 try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 {
 	unsigned long flags;
-- 
1.8.0.3



* [PATCH 4/7] fence: dma-buf cross-device synchronization (v11)
  2013-01-15 12:33 [PATCH 0/7] cross-device reservation for dma-buf support Maarten Lankhorst
                   ` (2 preceding siblings ...)
  2013-01-15 12:34 ` [PATCH 3/7] sched: allow try_to_wake_up to be used internally outside of core.c Maarten Lankhorst
@ 2013-01-15 12:34 ` Maarten Lankhorst
  2013-01-22 15:13   ` [Linaro-mm-sig] " Francesco Lavra
  2013-01-31  9:32   ` Inki Dae
  2013-01-15 12:34 ` [PATCH 5/7] seqno-fence: Hardware dma-buf implementation of fencing (v4) Maarten Lankhorst
                   ` (2 subsequent siblings)
From: Maarten Lankhorst @ 2013-01-15 12:34 UTC (permalink / raw)
  To: dri-devel, linux-kernel, linux-media, linaro-mm-sig; +Cc: Maarten Lankhorst

A fence can be attached to a buffer which is being filled or consumed by hw,
to allow userspace to pass the buffer to another device without waiting.  For
example, userspace can call the page_flip ioctl to display the next frame of
next frame of graphics after kicking the GPU but while the GPU is still
rendering.  The display device sharing the buffer with the GPU would
attach a callback to get notified when the GPU's rendering-complete IRQ
fires, to update the scan-out address of the display, without having to
wake up userspace.

A driver must allocate a fence context for each execution ring that can
run in parallel. The function for this takes the number of contexts to
allocate as its argument:
  + fence_context_alloc()

A fence is a transient, one-shot deal.  It is allocated and attached
to one or more dma-bufs.  When the one that attached it is done with the
pending operation, it can signal the fence:
  + fence_signal()

To get a rough approximation of whether a fence has fired, call:
  + fence_is_signaled()

The dma-buf-mgr handles tracking, and waiting on, the fences associated
with a dma-buf.

A waiter on the fence can add an async callback:
  + fence_add_callback()

The callback can optionally be cancelled with:
  + fence_remove_callback()

To wait synchronously, optionally with a timeout:
  + fence_wait()
  + fence_wait_timeout()
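
Roughly, these calls fit together as in the following sketch (not from the
patch itself; the my_* names and the page-flip scenario are hypothetical, and
my_fence_ops is a driver-provided fence_ops like the one sketched below after
the enable_signaling paragraph):

  static DEFINE_SPINLOCK(my_fence_lock); /* irqsafe, shared by this driver's fences */
  static unsigned my_context;            /* my_context = fence_context_alloc(MY_NUM_RINGS); */
  static struct fence_cb my_cb;

  struct fence *my_fence_create(unsigned ring, unsigned seqno)
  {
      struct fence *f = kmalloc(sizeof(*f), GFP_KERNEL);

      if (f)
          __fence_init(f, &my_fence_ops, &my_fence_lock,
                       my_context + ring, seqno);
      return f;
  }

  static void my_flip_cb(struct fence *f, struct fence_cb *cb, void *priv)
  {
      /* may be called in atomic context, straight from fence_signal() */
      my_update_scanout(priv);
  }

  /* consumer: either register an async callback ... */
  if (fence_add_callback(f, &my_cb, my_flip_cb, my_crtc) == -ENOENT)
      my_update_scanout(my_crtc);        /* fence already signaled */

  /* ... or block, optionally with a timeout */
  long ret = fence_wait_timeout(f, true, msecs_to_jiffies(100));

  /* producer, e.g. from the render-complete irq handler: */
  fence_signal(f);
  fence_put(f);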

A default software-only implementation is provided, which can be used
by drivers attaching a fence to a buffer when they have no other means
for hw sync.  But a memory-backed fence is also envisioned, because it
is common that GPUs can write to, or poll on, some memory location for
synchronization.  For example:

  fence = custom_get_fence(...);
  if ((seqno_fence = to_seqno_fence(fence)) != NULL) {
    dma_buf *fence_buf = fence->sync_buf;
    get_dma_buf(fence_buf);

    ... tell the hw the memory location to wait ...
    custom_wait_on(fence_buf, fence->seqno_ofs, fence->seqno);
  } else {
    /* fall-back to sw sync */
    fence_add_callback(fence, my_cb);
  }

On SoC platforms, if some other hw mechanism is provided for synchronizing
between IP blocks, it could be supported as an alternate implementation
with its own fence ops in a similar way.

The enable_signaling callback is used to provide sw signaling in case a cpu
waiter is requested or no compatible hardware signaling could be used.
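
For a hw fence that is signaled from an irq handler, enable_signaling might
look something like this sketch (the my_hw_* hooks are made up):

  static bool my_fence_enable_signaling(struct fence *f)
  {
      /* called with f->lock held */
      if (my_hw_seqno_passed(f))
          return false;          /* already passed: the core signals it for us */

      fence_get(f);              /* keep the fence alive until the irq signals it */
      my_hw_enable_seqno_irq(f);
      return true;
  }

  static const struct fence_ops my_fence_ops = {
      .enable_signaling = my_fence_enable_signaling,
      .wait = fence_default_wait, /* sleeps until fence_signal() runs */
      /* .signaled and .release are optional */
  };

  /* and in the irq handler, for each fence whose seqno has now passed: */
  fence_signal(f);
  fence_put(f);                  /* drops the reference from enable_signaling */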

The intention is to provide a userspace interface (presumably via eventfd)
later, to be used in conjunction with dma-buf's mmap support for sw access
to buffers (or for userspace apps that would prefer to do their own
synchronization).

v1: Original
v2: After discussion w/ danvet and mlankhorst on #dri-devel, we decided
    that dma-fence didn't need to care about the sw->hw signaling path
    (it can be handled same as sw->sw case), and therefore the fence->ops
    can be simplified and more handled in the core.  So remove the signal,
    add_callback, cancel_callback, and wait ops, and replace with a simple
    enable_signaling() op which can be used to inform a fence supporting
    hw->hw signaling that one or more devices which do not support hw
    signaling are waiting (and therefore it should enable an irq or do
    whatever is necessary in order that the CPU is notified when the
    fence is passed).
v3: Fix locking fail in attach_fence() and get_fence()
v4: Remove tie-in w/ dma-buf..  after discussion w/ danvet and mlankhorst
    we decided that we need to be able to attach one fence to N dma-buf's,
    so using the list_head in dma-fence struct would be problematic.
v5: [ Maarten Lankhorst ] Updated for dma-bikeshed-fence and dma-buf-manager.
v6: [ Maarten Lankhorst ] I removed dma_fence_cancel_callback and some comments
    about checking if fence fired or not. This is broken by design.
    waitqueue_active during destruction is now fatal, since the signaller
    should be holding a reference in enable_signalling until it signalled
    the fence. Pass the original dma_fence_cb along, and call __remove_wait
    in the dma_fence_callback handler, so that no cleanup needs to be
    performed.
v7: [ Maarten Lankhorst ] Set cb->func and only enable sw signaling if
    fence wasn't signaled yet, for example for hardware fences that may
    choose to signal blindly.
v8: [ Maarten Lankhorst ] Tons of tiny fixes, moved __dma_fence_init to
    header and fixed include mess. dma-fence.h now includes dma-buf.h
    All members are now initialized, so kmalloc can be used for
    allocating a dma-fence. More documentation added.
v9: Change compiler bitfields to flags, change return type of
    enable_signaling to bool. Rework dma_fence_wait. Added
    dma_fence_is_signaled and dma_fence_wait_timeout.
    s/dma// and change exports to non GPL. Added fence_is_signaled and
    fence_enable_sw_signaling calls, add ability to override default
    wait operation.
v10: remove event_queue, use a custom list, export try_to_wake_up from
    scheduler. Remove fence lock and use a global spinlock instead,
    this should hopefully remove all the locking headaches I was having
    on trying to implement this. enable_signaling is called with this
    lock held.
v11:
    Use atomic ops for flags, lifting the need for some spin_lock_irqsaves.
    However I kept the guarantee that after fence_signal returns, it is
    guaranteed that enable_signaling has either been called to completion,
    or will not be called any more.

    Add contexts and seqno to base fence implementation. This allows you
    to wait for fewer fences, by testing for seqno + signaled, and then only
    waiting on the later fence.

    Add FENCE_TRACE, FENCE_WARN, and FENCE_ERR. This makes debugging easier.
    A CONFIG_DEBUG_FENCE option will be added to turn off the FENCE_TRACE
    spam, and a separate runtime option can disable it at runtime.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
 Documentation/DocBook/device-drivers.tmpl |   2 +
 drivers/base/Makefile                     |   2 +-
 drivers/base/fence.c                      | 286 ++++++++++++++++++++++++
 include/linux/fence.h                     | 347 ++++++++++++++++++++++++++++++
 4 files changed, 636 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/fence.c
 create mode 100644 include/linux/fence.h

diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
index 7514dbf..6f53fc0 100644
--- a/Documentation/DocBook/device-drivers.tmpl
+++ b/Documentation/DocBook/device-drivers.tmpl
@@ -126,6 +126,8 @@ X!Edrivers/base/interface.c
      </sect1>
      <sect1><title>Device Drivers DMA Management</title>
 !Edrivers/base/dma-buf.c
+!Edrivers/base/fence.c
+!Iinclude/linux/fence.h
 !Edrivers/base/dma-coherent.c
 !Edrivers/base/dma-mapping.c
      </sect1>
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 5aa2d70..0026563 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o
 obj-y			+= power/
 obj-$(CONFIG_HAS_DMA)	+= dma-mapping.o
 obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o
-obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o
+obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o fence.o
 obj-$(CONFIG_ISA)	+= isa.o
 obj-$(CONFIG_FW_LOADER)	+= firmware_class.o
 obj-$(CONFIG_NUMA)	+= node.o
diff --git a/drivers/base/fence.c b/drivers/base/fence.c
new file mode 100644
index 0000000..28e5ffd
--- /dev/null
+++ b/drivers/base/fence.c
@@ -0,0 +1,286 @@
+/*
+ * Fence mechanism for dma-buf and to allow for asynchronous dma access
+ *
+ * Copyright (C) 2012 Canonical Ltd
+ * Copyright (C) 2012 Texas Instruments
+ *
+ * Authors:
+ * Rob Clark <rob.clark@linaro.org>
+ * Maarten Lankhorst <maarten.lankhorst@canonical.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/slab.h>
+#include <linux/export.h>
+#include <linux/fence.h>
+
+atomic_t fence_context_counter = ATOMIC_INIT(0);
+EXPORT_SYMBOL(fence_context_counter);
+
+int __fence_signal(struct fence *fence)
+{
+	struct fence_cb *cur, *tmp;
+	int ret = 0;
+
+	if (WARN_ON(!fence))
+		return -EINVAL;
+
+	if (test_and_set_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
+		ret = -EINVAL;
+
+		/*
+		 * we might have raced with the unlocked fence_signal,
+		 * still run through all callbacks
+		 */
+	}
+
+	list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
+		list_del_init(&cur->node);
+		cur->func(fence, cur, cur->priv);
+	}
+	return ret;
+}
+EXPORT_SYMBOL(__fence_signal);
+
+/**
+ * fence_signal - signal completion of a fence
+ * @fence: the fence to signal
+ *
+ * Signal completion for software callbacks on a fence, this will unblock
+ * fence_wait() calls and run all the callbacks added with
+ * fence_add_callback(). Can be called multiple times, but since a fence
+ * can only go from unsignaled to signaled state, it will only be effective
+ * the first time.
+ */
+int fence_signal(struct fence *fence)
+{
+	unsigned long flags;
+
+	if (!fence || test_and_set_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+		return -EINVAL;
+
+	if (test_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags)) {
+		struct fence_cb *cur, *tmp;
+
+		spin_lock_irqsave(fence->lock, flags);
+		list_for_each_entry_safe(cur, tmp, &fence->cb_list, node) {
+			list_del_init(&cur->node);
+			cur->func(fence, cur, cur->priv);
+		}
+		spin_unlock_irqrestore(fence->lock, flags);
+	}
+	return 0;
+}
+EXPORT_SYMBOL(fence_signal);
+
+void release_fence(struct kref *kref)
+{
+	struct fence *fence =
+			container_of(kref, struct fence, refcount);
+
+	BUG_ON(!list_empty(&fence->cb_list));
+
+	if (fence->ops->release)
+		fence->ops->release(fence);
+	else
+		kfree(fence);
+}
+EXPORT_SYMBOL(release_fence);
+
+/**
+ * fence_enable_sw_signaling - enable signaling on fence
+ * @fence:	[in]	the fence to enable
+ *
+ * this will request for sw signaling to be enabled, to make the fence
+ * complete as soon as possible
+ */
+void fence_enable_sw_signaling(struct fence *fence)
+{
+	unsigned long flags;
+
+	if (!test_and_set_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags) &&
+	    !test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
+		spin_lock_irqsave(fence->lock, flags);
+
+		if (!fence->ops->enable_signaling(fence))
+			__fence_signal(fence);
+
+		spin_unlock_irqrestore(fence->lock, flags);
+	}
+}
+EXPORT_SYMBOL(fence_enable_sw_signaling);
+
+/**
+ * fence_add_callback - add a callback to be called when the fence
+ * is signaled
+ * @fence:	[in]	the fence to wait on
+ * @cb:		[in]	the callback to register
+ * @func:	[in]	the function to call
+ * @priv:	[in]	the argument to pass to function
+ *
+ * cb will be initialized by fence_add_callback, no initialization
+ * by the caller is required. Any number of callbacks can be registered
+ * to a fence, but a callback can only be registered to one fence at a time.
+ *
+ * Note that the callback can be called from an atomic context.  If
+ * fence is already signaled, this function will return -ENOENT (and
+ * *not* call the callback)
+ *
+ * Add a software callback to the fence. Same restrictions apply to
+ * refcount as it does to fence_wait, however the caller doesn't need to
+ * keep a refcount to fence afterwards: when software access is enabled,
+ * the creator of the fence is required to keep the fence alive until
+ * after it signals with fence_signal. The callback itself can be called
+ * from irq context.
+ *
+ */
+int fence_add_callback(struct fence *fence, struct fence_cb *cb,
+		       fence_func_t func, void *priv)
+{
+	unsigned long flags;
+	int ret = 0;
+	bool was_set;
+
+	if (WARN_ON(!fence || !func))
+		return -EINVAL;
+
+	if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+		return -ENOENT;
+
+	spin_lock_irqsave(fence->lock, flags);
+
+	was_set = test_and_set_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags);
+
+	if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+		ret = -ENOENT;
+	else if (!was_set && !fence->ops->enable_signaling(fence)) {
+		__fence_signal(fence);
+		ret = -ENOENT;
+	}
+
+	if (!ret) {
+		cb->func = func;
+		cb->priv = priv;
+		list_add_tail(&cb->node, &fence->cb_list);
+	}
+	spin_unlock_irqrestore(fence->lock, flags);
+
+	return ret;
+}
+EXPORT_SYMBOL(fence_add_callback);
+
+/**
+ * fence_remove_callback - remove a callback from the signaling list
+ * @fence:	[in]	the fence to wait on
+ * @cb:		[in]	the callback to remove
+ *
+ * Remove a previously queued callback from the fence. This function returns
+ * true if the callback is successfully removed, or false if the fence has
+ * already been signaled.
+ *
+ * *WARNING*:
+ * Cancelling a callback should only be done if you really know what you're
+ * doing, since deadlocks and race conditions could occur all too easily. For
+ * this reason, it should only ever be done on hardware lockup recovery,
+ * with a reference held to the fence.
+ */
+bool
+fence_remove_callback(struct fence *fence, struct fence_cb *cb)
+{
+	unsigned long flags;
+	bool ret;
+
+	spin_lock_irqsave(fence->lock, flags);
+
+	ret = !list_empty(&cb->node);
+	if (ret)
+		list_del_init(&cb->node);
+
+	spin_unlock_irqrestore(fence->lock, flags);
+
+	return ret;
+}
+EXPORT_SYMBOL(fence_remove_callback);
+
+static void
+fence_default_wait_cb(struct fence *fence, struct fence_cb *cb, void *priv)
+{
+	try_to_wake_up(priv, TASK_NORMAL, 0);
+}
+
+/**
+ * fence_default_wait - default sleep until the fence gets signaled
+ * or until timeout elapses
+ * @fence:	[in]	the fence to wait on
+ * @intr:	[in]	if true, do an interruptible wait
+ * @timeout:	[in]	timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
+ *
+ * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or the
+ * remaining timeout in jiffies on success.
+ */
+long
+fence_default_wait(struct fence *fence, bool intr, signed long timeout)
+{
+	struct fence_cb cb;
+	unsigned long flags;
+	long ret = timeout;
+	bool was_set;
+
+	if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+		return timeout;
+
+	spin_lock_irqsave(fence->lock, flags);
+
+	if (intr && signal_pending(current)) {
+		ret = -ERESTARTSYS;
+		goto out;
+	}
+
+	was_set = test_and_set_bit(FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags);
+
+	if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+		goto out;
+
+	if (!was_set && !fence->ops->enable_signaling(fence)) {
+		__fence_signal(fence);
+		goto out;
+	}
+
+	cb.func = fence_default_wait_cb;
+	cb.priv = current;
+	list_add(&cb.node, &fence->cb_list);
+
+	while (!test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags) && ret > 0) {
+		if (intr)
+			__set_current_state(TASK_INTERRUPTIBLE);
+		else
+			__set_current_state(TASK_UNINTERRUPTIBLE);
+		spin_unlock_irqrestore(fence->lock, flags);
+
+		ret = schedule_timeout(ret);
+
+		spin_lock_irqsave(fence->lock, flags);
+		if (ret > 0 && intr && signal_pending(current))
+			ret = -ERESTARTSYS;
+	}
+
+	if (!list_empty(&cb.node))
+		list_del(&cb.node);
+	__set_current_state(TASK_RUNNING);
+
+out:
+	spin_unlock_irqrestore(fence->lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL(fence_default_wait);
diff --git a/include/linux/fence.h b/include/linux/fence.h
new file mode 100644
index 0000000..d9f091d
--- /dev/null
+++ b/include/linux/fence.h
@@ -0,0 +1,347 @@
+/*
+ * Fence mechanism for dma-buf to allow for asynchronous dma access
+ *
+ * Copyright (C) 2012 Canonical Ltd
+ * Copyright (C) 2012 Texas Instruments
+ *
+ * Authors:
+ * Rob Clark <rob.clark@linaro.org>
+ * Maarten Lankhorst <maarten.lankhorst@canonical.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __LINUX_FENCE_H
+#define __LINUX_FENCE_H
+
+#include <linux/err.h>
+#include <linux/wait.h>
+#include <linux/list.h>
+#include <linux/bitops.h>
+#include <linux/kref.h>
+#include <linux/sched.h>
+#include <linux/printk.h>
+
+struct fence;
+struct fence_ops;
+struct fence_cb;
+
+/**
+ * struct fence - software synchronization primitive
+ * @refcount: refcount for this fence
+ * @ops: fence_ops associated with this fence
+ * @cb_list: list of all callbacks to call
+ * @lock: spin_lock_irqsave used for locking
+ * @context, @seqno: execution context and sequence number within that context
+ * @flags: A mask of FENCE_FLAG_* defined below
+ *
+ * the flags member must be manipulated and read using the appropriate
+ * atomic ops (bit_*), so taking the spinlock will not be needed most
+ * of the time.
+ *
+ * FENCE_FLAG_SIGNALED_BIT - fence is already signaled
+ * FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called*
+ * FENCE_FLAG_USER_BITS - start of the unused bits, can be used by the
+ * implementer of the fence for its own purposes. Can be used in different
+ * ways by different fence implementers, so do not rely on this.
+ *
+ * *) Since atomic bitops are used, this is not guaranteed to be the case.
+ * In particular, fence_signal may have run right before this bit was set
+ * and have already set FENCE_FLAG_SIGNALED_BIT, so that enable_signaling
+ * is never called for this fence.
+ * Adding a check for FENCE_FLAG_SIGNALED_BIT after setting
+ * FENCE_FLAG_ENABLE_SIGNAL_BIT closes this race, and makes sure that
+ * after fence_signal was called, any enable_signaling call will have either
+ * been completed, or never called at all.
+ */
+struct fence {
+	struct kref refcount;
+	const struct fence_ops *ops;
+	struct list_head cb_list;
+	spinlock_t *lock;
+	unsigned context, seqno;
+	unsigned long flags;
+};
+
+enum fence_flag_bits {
+	FENCE_FLAG_SIGNALED_BIT,
+	FENCE_FLAG_ENABLE_SIGNAL_BIT,
+	FENCE_FLAG_USER_BITS, /* must always be last member */
+};
+
+typedef void (*fence_func_t)(struct fence *fence, struct fence_cb *cb, void *priv);
+
+/**
+ * struct fence_cb - callback for fence_add_callback
+ * @func: fence_func_t to call
+ * @priv: value of priv to pass to function
+ *
+ * This struct will be initialized by fence_add_callback, additional
+ * data can be passed along by embedding fence_cb in another struct.
+ */
+struct fence_cb {
+	struct list_head node;
+	fence_func_t func;
+	void *priv;
+};
+
+/**
+ * struct fence_ops - operations implemented for fence
+ * @enable_signaling: enable software signaling of fence
+ * @signaled: [optional] peek whether the fence is signaled
+ * @release: [optional] called on destruction of fence
+ *
+ * Notes on enable_signaling:
+ * For fence implementations that have the capability for hw->hw
+ * signaling, they can implement this op to enable the necessary
+ * irqs, or insert commands into cmdstream, etc.  This is called
+ * in the first wait() or add_callback() path to let the fence
+ * implementation know that there is another driver waiting on
+ * the signal (ie. hw->sw case).
+ *
+ * This function can be called from atomic context, but not
+ * from irq context, so normal spinlocks can be used.
+ *
+ * A return value of false indicates the fence already passed,
+ * or some failure occurred that made it impossible to enable
+ * signaling. True indicates successful enabling.
+ *
+ * Calling fence_signal before enable_signaling is called allows
+ * for a tiny race window in which enable_signaling is called during,
+ * before, or after fence_signal. To fight this, it is recommended
+ * that before enable_signaling returns true an extra reference is
+ * taken on the fence, to be released when the fence is signaled.
+ * This will mean fence_signal will still be called twice, but
+ * the second time will be a noop since it was already signaled.
+ *
+ * Notes on release:
+ * Can be NULL, this function allows additional commands to run on
+ * destruction of the fence. Can be called from irq context.
+ * If pointer is set to NULL, kfree will get called instead.
+ */
+
+struct fence_ops {
+	bool (*enable_signaling)(struct fence *fence);
+	bool (*signaled)(struct fence *fence);
+	long (*wait)(struct fence *fence, bool intr, signed long);
+	void (*release)(struct fence *fence);
+};
+
+/**
+ * __fence_init - Initialize a custom fence.
+ * @fence:	[in]	the fence to initialize
+ * @ops:	[in]	the fence_ops for operations on this fence
+ * @lock:	[in]	the irqsafe spinlock to use for locking this fence
+ * @context:	[in]	the execution context this fence is run on
+ * @seqno:	[in]	a linear increasing sequence number for this context
+ *
+ * Initializes an allocated fence, the caller doesn't have to keep its
+ * refcount after committing with this fence, but it will need to hold a
+ * refcount again if fence_ops.enable_signaling gets called. This can
+ * be used for implementing other types of fences.
+ *
+ * context and seqno are used for easy comparison between fences, allowing
+ * to check which fence is later by simply using fence_later.
+ */
+static inline void
+__fence_init(struct fence *fence, const struct fence_ops *ops,
+	     spinlock_t *lock, unsigned context, unsigned seqno)
+{
+	BUG_ON(!ops || !lock || !ops->enable_signaling || !ops->wait);
+
+	kref_init(&fence->refcount);
+	fence->ops = ops;
+	INIT_LIST_HEAD(&fence->cb_list);
+	fence->lock = lock;
+	fence->context = context;
+	fence->seqno = seqno;
+	fence->flags = 0UL;
+}
+
+/**
+ * fence_get - increases refcount of the fence
+ * @fence:	[in]	fence to increase refcount of
+ */
+static inline void fence_get(struct fence *fence)
+{
+	if (WARN_ON(!fence))
+		return;
+	kref_get(&fence->refcount);
+}
+
+extern void release_fence(struct kref *kref);
+
+/**
+ * fence_put - decreases refcount of the fence
+ * @fence:	[in]	fence to reduce refcount of
+ */
+static inline void fence_put(struct fence *fence)
+{
+	if (WARN_ON(!fence))
+		return;
+	kref_put(&fence->refcount, release_fence);
+}
+
+int fence_signal(struct fence *fence);
+int __fence_signal(struct fence *fence);
+long fence_default_wait(struct fence *fence, bool intr, signed long);
+int fence_add_callback(struct fence *fence, struct fence_cb *cb,
+		       fence_func_t func, void *priv);
+bool fence_remove_callback(struct fence *fence, struct fence_cb *cb);
+void fence_enable_sw_signaling(struct fence *fence);
+
+/**
+ * fence_is_signaled - Return an indication if the fence is signaled yet.
+ * @fence:	[in]	the fence to check
+ *
+ * Returns true if the fence was already signaled, false if not. Since this
+ * function doesn't enable signaling, it is not guaranteed to ever return true
+ * If fence_add_callback, fence_wait or fence_enable_sw_signaling
+ * haven't been called before.
+ *
+ * It's recommended for seqno fences to call fence_signal when the
+ * operation is complete, it makes it possible to prevent issues from
+ * wraparound between time of issue and time of use by checking the return
+ * value of this function before calling hardware-specific wait instructions.
+ */
+static inline bool
+fence_is_signaled(struct fence *fence)
+{
+	if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
+		return true;
+
+	if (fence->ops->signaled && fence->ops->signaled(fence)) {
+		fence_signal(fence);
+		return true;
+	}
+
+	return false;
+}
+
+/**
+ * fence_later - return the chronologically later fence
+ * @f1:	[in]	the first fence from the same context
+ * @f2:	[in]	the second fence from the same context
+ *
+ * Returns NULL if both fences are signaled, otherwise the fence that would be
+ * signaled last. Both fences must be from the same context, since a seqno is
+ * not re-used across contexts.
+ */
+static inline struct fence *fence_later(struct fence *f1, struct fence *f2)
+{
+	bool sig1, sig2;
+
+	/*
+	 * Can't check just FENCE_FLAG_SIGNALED_BIT here, it may never have
+	 * been set if enable_signaling wasn't called, and enabling that here
+	 * is overkill.
+	 */
+	sig1 = fence_is_signaled(f1);
+	sig2 = fence_is_signaled(f2);
+
+	if (sig1 && sig2)
+		return NULL;
+
+	BUG_ON(f1->context != f2->context);
+
+	if (sig1 || f2->seqno - f1->seqno <= INT_MAX)
+		return f2;
+	else
+		return f1;
+}
+
+/**
+ * fence_wait_timeout - sleep until the fence gets signaled
+ * or until timeout elapses
+ * @fence:	[in]	the fence to wait on
+ * @intr:	[in]	if true, do an interruptible wait
+ * @timeout:	[in]	timeout value in jiffies, or MAX_SCHEDULE_TIMEOUT
+ *
+ * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or the
+ * remaining timeout in jiffies on success. Other error values may be
+ * returned on custom implementations.
+ *
+ * Performs a synchronous wait on this fence. It is assumed the caller
+ * directly or indirectly (buf-mgr between reservation and committing)
+ * holds a reference to the fence, otherwise the fence might be
+ * freed before return, resulting in undefined behavior.
+ */
+static inline long
+fence_wait_timeout(struct fence *fence, bool intr, signed long timeout)
+{
+	if (WARN_ON(timeout < 0))
+		return -EINVAL;
+
+	return fence->ops->wait(fence, intr, timeout);
+}
+
+/**
+ * fence_wait - sleep until the fence gets signaled
+ * @fence:	[in]	the fence to wait on
+ * @intr:	[in]	if true, do an interruptible wait
+ *
+ * This function will return -ERESTARTSYS if interrupted by a signal,
+ * or 0 if the fence was signaled. Other error values may be
+ * returned on custom implementations.
+ *
+ * Performs a synchronous wait on this fence. It is assumed the caller
+ * directly or indirectly (buf-mgr between reservation and committing)
+ * holds a reference to the fence, otherwise the fence might be
+ * freed before return, resulting in undefined behavior.
+ */
+static inline long fence_wait(struct fence *fence, bool intr)
+{
+	long ret;
+
+	/* Since fence_wait_timeout cannot time out with
+	 * MAX_SCHEDULE_TIMEOUT, the only valid return values are
+	 * -ERESTARTSYS and MAX_SCHEDULE_TIMEOUT.
+	 */
+	ret = fence_wait_timeout(fence, intr, MAX_SCHEDULE_TIMEOUT);
+
+	return ret < 0 ? ret : 0;
+}
+
+/**
+ * fence context counter: each execution context should have its own
+ * fence context; this allows checking if fences belong to the same
+ * context or not. One device can have multiple separate contexts,
+ * and they're used when an engine can run independently of another.
+ */
+extern atomic_t fence_context_counter;
+
+static inline unsigned fence_context_alloc(unsigned num)
+{
+	BUG_ON(!num);
+	return atomic_add_return(num, &fence_context_counter) - num;
+}
+
+#define FENCE_TRACE(f, fmt, args...) \
+	do {									\
+		struct fence *__ff = (f);					\
+		pr_info("f %u#%u: " fmt, __ff->context, __ff->seqno, ##args);	\
+	} while (0)
+
+#define FENCE_WARN(f, fmt, args...) \
+	do {									\
+		struct fence *__ff = (f);					\
+		pr_warn("f %u#%u: " fmt, __ff->context, __ff->seqno, ##args);	\
+	} while (0)
+
+#define FENCE_ERR(f, fmt, args...) \
+	do {									\
+		struct fence *__ff = (f);					\
+		pr_err("f %u#%u: " fmt, __ff->context, __ff->seqno, ##args);	\
+	} while (0)
+
+#endif /* __LINUX_FENCE_H */
-- 
1.8.0.3


^ permalink raw reply related	[flat|nested] 32+ messages in thread
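
For readers skimming the fence.h API above, a hedged usage sketch (not part of
the series; the caller, the one-second timeout and the -ETIMEDOUT mapping are
made up for illustration) showing how the documented wait helpers fit together:

#include <linux/errno.h>
#include <linux/fence.h>
#include <linux/jiffies.h>

/* Illustrative consumer of a fence obtained elsewhere (e.g. from a driver). */
static long wait_for_buffer_idle(struct fence *fence)
{
	long ret;

	fence_get(fence);		/* hold a reference across the wait */

	if (fence_is_signaled(fence)) {	/* fast path, no sleeping */
		fence_put(fence);
		return 0;
	}

	/* interruptible wait, at most one second */
	ret = fence_wait_timeout(fence, true, HZ);
	fence_put(fence);

	if (ret < 0)
		return ret;		/* -ERESTARTSYS or a driver error */
	if (ret == 0)
		return -ETIMEDOUT;	/* the wait timed out */
	return 0;			/* signaled; ret was the remaining jiffies */
}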

* [PATCH 5/7] seqno-fence: Hardware dma-buf implementation of fencing (v4)
  2013-01-15 12:33 [PATCH 0/7] cross-device reservation for dma-buf support Maarten Lankhorst
                   ` (3 preceding siblings ...)
  2013-01-15 12:34 ` [PATCH 4/7] fence: dma-buf cross-device synchronization (v11) Maarten Lankhorst
@ 2013-01-15 12:34 ` Maarten Lankhorst
  2013-01-16  6:28   ` [Linaro-mm-sig] " Inki Dae
  2013-01-15 12:34 ` [PATCH 6/7] reservation: cross-device reservation support Maarten Lankhorst
  2013-01-15 12:34 ` [PATCH 7/7] reservation: Add lockdep annotation and selftests Maarten Lankhorst
  6 siblings, 1 reply; 32+ messages in thread
From: Maarten Lankhorst @ 2013-01-15 12:34 UTC (permalink / raw)
  To: dri-devel, linux-kernel, linux-media, linaro-mm-sig; +Cc: Maarten Lankhorst

This type of fence can be used with hardware synchronization for simple
hardware that can block execution until the condition
(dma_buf[offset] - value) >= 0 has been met.

A software fallback still has to be provided in case the fence is used
with a device that doesn't support this mechanism. It is useful to expose
this for graphics cards that have an operation to support it.

Some cards like i915 can export those, but don't have an option to wait,
so they need the software fallback.
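
As a rough illustration (not part of the patch; the CPU mapping and the
variable names below are made up), the software-side version of that check
boils down to a wraparound-safe signed comparison on the 32-bit value stored
in the sync_buf:

#include <linux/types.h>

/*
 * Hedged sketch: seqno_cpu_addr is assumed to be a CPU mapping of the
 * sync_buf and seqno_ofs a byte offset into it; the s32 cast keeps the
 * comparison correct across u32 wraparound, matching the
 * (dma_buf[offset] - value) >= 0 condition above.
 */
static bool seqno_condition_met(const void *seqno_cpu_addr, u32 seqno_ofs,
				u32 seqno)
{
	u32 cur = *(const u32 *)(seqno_cpu_addr + seqno_ofs);

	return (s32)(cur - seqno) >= 0;
}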

I extended the original patch by Rob Clark.

v1: Original
v2: Renamed from bikeshed to seqno, moved into dma-fence.c since
    not much was left of the file. Lots of documentation added.
v3: Use fence_ops instead of custom callbacks. Moved to own file
    to avoid circular dependency between dma-buf.h and fence.h
v4: Add spinlock pointer to seqno_fence_init

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
 Documentation/DocBook/device-drivers.tmpl |   1 +
 drivers/base/fence.c                      |  38 +++++++++++
 include/linux/seqno-fence.h               | 105 ++++++++++++++++++++++++++++++
 3 files changed, 144 insertions(+)
 create mode 100644 include/linux/seqno-fence.h

diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
index 6f53fc0..ad14396 100644
--- a/Documentation/DocBook/device-drivers.tmpl
+++ b/Documentation/DocBook/device-drivers.tmpl
@@ -128,6 +128,7 @@ X!Edrivers/base/interface.c
 !Edrivers/base/dma-buf.c
 !Edrivers/base/fence.c
 !Iinclude/linux/fence.h
+!Iinclude/linux/seqno-fence.h
 !Edrivers/base/dma-coherent.c
 !Edrivers/base/dma-mapping.c
      </sect1>
diff --git a/drivers/base/fence.c b/drivers/base/fence.c
index 28e5ffd..1d3f29c 100644
--- a/drivers/base/fence.c
+++ b/drivers/base/fence.c
@@ -24,6 +24,7 @@
 #include <linux/slab.h>
 #include <linux/export.h>
 #include <linux/fence.h>
+#include <linux/seqno-fence.h>
 
 atomic_t fence_context_counter = ATOMIC_INIT(0);
 EXPORT_SYMBOL(fence_context_counter);
@@ -284,3 +285,40 @@ out:
 	return ret;
 }
 EXPORT_SYMBOL(fence_default_wait);
+
+static bool seqno_enable_signaling(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+	return seqno_fence->ops->enable_signaling(fence);
+}
+
+static bool seqno_signaled(struct fence *fence)
+{
+	struct seqno_fence *seqno_fence = to_seqno_fence(fence);
+	return seqno_fence->ops->signaled && seqno_fence->ops->signaled(fence);
+}
+
+static void seqno_release(struct fence *fence)
+{
+	struct seqno_fence *f = to_seqno_fence(fence);
+
+	dma_buf_put(f->sync_buf);
+	if (f->ops->release)
+		f->ops->release(fence);
+	else
+		kfree(f);
+}
+
+static long seqno_wait(struct fence *fence, bool intr, signed long timeout)
+{
+	struct seqno_fence *f = to_seqno_fence(fence);
+	return f->ops->wait(fence, intr, timeout);
+}
+
+const struct fence_ops seqno_fence_ops = {
+	.enable_signaling = seqno_enable_signaling,
+	.signaled = seqno_signaled,
+	.wait = seqno_wait,
+	.release = seqno_release
+};
+EXPORT_SYMBOL_GPL(seqno_fence_ops);
diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h
new file mode 100644
index 0000000..603adc0
--- /dev/null
+++ b/include/linux/seqno-fence.h
@@ -0,0 +1,105 @@
+/*
+ * seqno-fence, using a dma-buf to synchronize fencing
+ *
+ * Copyright (C) 2012 Texas Instruments
+ * Copyright (C) 2012 Canonical Ltd
+ * Authors:
+ *   Rob Clark <rob.clark@linaro.org>
+ *   Maarten Lankhorst <maarten.lankhorst@canonical.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __LINUX_SEQNO_FENCE_H
+#define __LINUX_SEQNO_FENCE_H
+
+#include <linux/fence.h>
+#include <linux/dma-buf.h>
+
+struct seqno_fence {
+	struct fence base;
+
+	const struct fence_ops *ops;
+	struct dma_buf *sync_buf;
+	uint32_t seqno_ofs;
+};
+
+extern const struct fence_ops seqno_fence_ops;
+
+/**
+ * to_seqno_fence - cast a fence to a seqno_fence
+ * @fence: fence to cast to a seqno_fence
+ *
+ * Returns NULL if the fence is not a seqno_fence,
+ * or the seqno_fence otherwise.
+ */
+static inline struct seqno_fence *
+to_seqno_fence(struct fence *fence)
+{
+	if (fence->ops != &seqno_fence_ops)
+		return NULL;
+	return container_of(fence, struct seqno_fence, base);
+}
+
+/**
+ * seqno_fence_init - initialize a seqno fence
+ * @fence: seqno_fence to initialize
+ * @lock: pointer to spinlock to use for fence
+ * @sync_buf: buffer containing the memory location to signal on
+ * @context: the execution context this fence is a part of
+ * @seqno_ofs: the offset within @sync_buf
+ * @seqno: the sequence # to signal on
+ * @ops: the fence_ops for operations on this seqno fence
+ *
+ * This function initializes a struct seqno_fence with the passed parameters,
+ * and takes a reference on sync_buf which is released on fence destruction.
+ *
+ * A seqno_fence is a dma_fence which can complete in software when
+ * enable_signaling is called, but it also completes when
+ * (s32)((sync_buf)[seqno_ofs] - seqno) >= 0 is true.
+ *
+ * The seqno_fence will take a refcount on the sync_buf until it's
+ * destroyed, but the actual lifetime of sync_buf may be longer if one of
+ * the callers takes a reference to it.
+ *
+ * Certain hardware has instructions to insert this type of wait condition
+ * in the command stream, so no intervention from software would be needed.
+ * This type of fence can be destroyed before it is completed; however, a
+ * reference on the sync_buf dma-buf can be taken. It is encouraged to re-use
+ * the same dma-buf for sync_buf, since mapping or unmapping the sync_buf to
+ * the device's vm can be expensive.
+ *
+ * It is recommended for creators of seqno_fence to call fence_signal
+ * before destruction. This will prevent possible issues from wraparound at
+ * time of issue vs time of check, since users can check fence_is_signaled
+ * before submitting instructions for the hardware to wait on the fence.
+ * However, when ops.enable_signaling is not called, it doesn't have to be
+ * done as soon as possible, just before there's any real danger of seqno
+ * wraparound.
+ */
+static inline void
+seqno_fence_init(struct seqno_fence *fence, spinlock_t *lock,
+		 struct dma_buf *sync_buf,  uint32_t context, uint32_t seqno_ofs,
+		 uint32_t seqno, const struct fence_ops *ops)
+{
+	BUG_ON(!fence || !sync_buf || !ops->enable_signaling || !ops->wait);
+
+	__fence_init(&fence->base, &seqno_fence_ops, lock, context, seqno);
+
+	get_dma_buf(sync_buf);
+	fence->ops = ops;
+	fence->sync_buf = sync_buf;
+	fence->seqno_ofs = seqno_ofs;
+}
+
+#endif /* __LINUX_SEQNO_FENCE_H */
-- 
1.8.0.3


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 6/7] reservation: cross-device reservation support
  2013-01-15 12:33 [PATCH 0/7] cross-device reservation for dma-buf support Maarten Lankhorst
                   ` (4 preceding siblings ...)
  2013-01-15 12:34 ` [PATCH 5/7] seqno-fence: Hardware dma-buf implementation of fencing (v4) Maarten Lankhorst
@ 2013-01-15 12:34 ` Maarten Lankhorst
  2013-01-22 16:47   ` [Linaro-mm-sig] " Francesco Lavra
  2013-02-04  7:06   ` Inki Dae
  2013-01-15 12:34 ` [PATCH 7/7] reservation: Add lockdep annotation and selftests Maarten Lankhorst
  6 siblings, 2 replies; 32+ messages in thread
From: Maarten Lankhorst @ 2013-01-15 12:34 UTC (permalink / raw)
  To: dri-devel, linux-kernel, linux-media, linaro-mm-sig; +Cc: Maarten Lankhorst

This adds support for a generic reservations framework that can be
hooked up to ttm and dma-buf and allows easy sharing of reservations
across devices.

The idea is that a dma-buf and ttm object both will get a pointer
to a struct reservation_object, which has to be reserved before
anything is done with the contents of the dma-buf.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
 Documentation/DocBook/device-drivers.tmpl |   2 +
 drivers/base/Makefile                     |   2 +-
 drivers/base/reservation.c                | 251 ++++++++++++++++++++++++++++++
 include/linux/reservation.h               | 182 ++++++++++++++++++++++
 4 files changed, 436 insertions(+), 1 deletion(-)
 create mode 100644 drivers/base/reservation.c
 create mode 100644 include/linux/reservation.h

diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
index ad14396..24e6e80 100644
--- a/Documentation/DocBook/device-drivers.tmpl
+++ b/Documentation/DocBook/device-drivers.tmpl
@@ -129,6 +129,8 @@ X!Edrivers/base/interface.c
 !Edrivers/base/fence.c
 !Iinclude/linux/fence.h
 !Iinclude/linux/seqno-fence.h
+!Edrivers/base/reservation.c
+!Iinclude/linux/reservation.h
 !Edrivers/base/dma-coherent.c
 !Edrivers/base/dma-mapping.c
      </sect1>
diff --git a/drivers/base/Makefile b/drivers/base/Makefile
index 0026563..f6f731d 100644
--- a/drivers/base/Makefile
+++ b/drivers/base/Makefile
@@ -10,7 +10,7 @@ obj-$(CONFIG_CMA) += dma-contiguous.o
 obj-y			+= power/
 obj-$(CONFIG_HAS_DMA)	+= dma-mapping.o
 obj-$(CONFIG_HAVE_GENERIC_DMA_COHERENT) += dma-coherent.o
-obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o fence.o
+obj-$(CONFIG_DMA_SHARED_BUFFER) += dma-buf.o fence.o reservation.o
 obj-$(CONFIG_ISA)	+= isa.o
 obj-$(CONFIG_FW_LOADER)	+= firmware_class.o
 obj-$(CONFIG_NUMA)	+= node.o
diff --git a/drivers/base/reservation.c b/drivers/base/reservation.c
new file mode 100644
index 0000000..07584dd
--- /dev/null
+++ b/drivers/base/reservation.c
@@ -0,0 +1,251 @@
+/*
+ * Copyright (C) 2012 Canonical Ltd
+ *
+ * Based on bo.c which bears the following copyright notice,
+ * but is dual licensed:
+ *
+ * Copyright (c) 2006-2009 VMware, Inc., Palo Alto, CA., USA
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ **************************************************************************/
+/*
+ * Authors: Thomas Hellstrom <thellstrom-at-vmware-dot-com>
+ */
+
+#include <linux/fence.h>
+#include <linux/reservation.h>
+#include <linux/export.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+
+atomic_long_t reservation_counter = ATOMIC_LONG_INIT(1);
+EXPORT_SYMBOL(reservation_counter);
+
+const char reservation_object_name[] = "reservation_object";
+EXPORT_SYMBOL(reservation_object_name);
+
+const char reservation_ticket_name[] = "reservation_ticket";
+EXPORT_SYMBOL(reservation_ticket_name);
+
+struct lock_class_key reservation_object_class;
+EXPORT_SYMBOL(reservation_object_class);
+
+struct lock_class_key reservation_ticket_class;
+EXPORT_SYMBOL(reservation_ticket_class);
+
+/**
+ * ticket_backoff - cancel a reservation
+ * @ticket:	[in] a reservation_ticket
+ * @entries:	[in] the list of reservation_entry entries to unreserve
+ *
+ * This function cancels a previous reservation done by
+ * ticket_reserve. This is useful in case something
+ * goes wrong between reservation and committing.
+ *
+ * This should only be called after ticket_reserve returns success.
+ */
+void
+ticket_backoff(struct reservation_ticket *ticket, struct list_head *entries)
+{
+	struct list_head *cur;
+
+	if (list_empty(entries))
+		return;
+
+	list_for_each(cur, entries) {
+		struct reservation_object *obj;
+
+		reservation_entry_get(cur, &obj, NULL);
+
+		mutex_unreserve_unlock(&obj->lock);
+	}
+	reservation_ticket_fini(ticket);
+}
+EXPORT_SYMBOL(ticket_backoff);
+
+static void
+ticket_backoff_early(struct list_head *list, struct reservation_entry *entry)
+{
+	list_for_each_entry_continue_reverse(entry, list, head) {
+		struct reservation_object *obj;
+
+		reservation_entry_get(&entry->head, &obj, NULL);
+		mutex_unreserve_unlock(&obj->lock);
+	}
+}
+
+/**
+ * ticket_reserve - reserve a list of reservation_entry
+ * @ticket:	[out]	a reservation_ticket
+ * @entries:	[in]	a list of entries to reserve.
+ *
+ * Do not initialize the ticket; it will be initialized by this function
+ * on success.
+ *
+ * Returns -EINTR if a signal got queued, -EINVAL if the fence list is full,
+ * -EDEADLK if buffer appears on the list twice, 0 on success.
+ *
+ * XXX: Nuke rest
+ * The caller will have to queue waits on those fences before calling
+ * ufmgr_fence_buffer_objects, with either hardware-specific methods,
+ * fence_add_callback, or fence_wait.
+ *
+ * As such, by incrementing refcount on reservation_entry before calling
+ * fence_add_callback, and making the callback decrement refcount on
+ * reservation_entry, or releasing refcount if fence_add_callback
+ * failed, the reservation_entry will be freed when all the fences
+ * have been signaled, and only after the last ref is released, which should
+ * be after ufmgr_fence_buffer_objects. With proper locking, when the
+ * list_head holding the list of reservation_entry's becomes empty it
+ * indicates all fences for all bufs have been signaled.
+ */
+int
+ticket_reserve(struct reservation_ticket *ticket,
+		   struct list_head *entries)
+{
+	struct list_head *cur;
+	int ret;
+	struct reservation_object *res_bo = NULL;
+
+	if (list_empty(entries))
+		return 0;
+
+	reservation_ticket_init(ticket);
+retry:
+	list_for_each(cur, entries) {
+		struct reservation_entry *entry;
+		struct reservation_object *bo;
+		bool shared;
+
+		entry = reservation_entry_get(cur, &bo, &shared);
+		if (unlikely(res_bo == bo)) {
+			res_bo = NULL;
+			continue;
+		}
+
+		ret = mutex_reserve_lock_interruptible(&bo->lock,
+						       ticket,
+						       ticket->seqno);
+		switch (ret) {
+		case 0:
+			break;
+		case -EAGAIN:
+			ticket_backoff_early(entries, entry);
+			if (res_bo)
+				mutex_unreserve_unlock(&res_bo->lock);
+
+			ret = mutex_reserve_lock_intr_slow(&bo->lock, ticket,
+							   ticket->seqno);
+			if (unlikely(ret != 0))
+				goto err_fini;
+			res_bo = bo;
+			break;
+		default:
+			goto err;
+		}
+
+		if (shared &&
+		    bo->fence_shared_count == BUF_MAX_SHARED_FENCE) {
+			WARN_ON_ONCE(1);
+			ret = -EINVAL;
+			mutex_unreserve_unlock(&bo->lock);
+			goto err;
+		}
+		if (unlikely(res_bo == bo))
+			goto retry;
+		continue;
+
+err:
+		if (res_bo != bo)
+			mutex_unreserve_unlock(&bo->lock);
+		ticket_backoff_early(entries, entry);
+err_fini:
+		reservation_ticket_fini(ticket);
+		return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(ticket_reserve);
+
+/**
+ * ticket_commit - commit a reservation with a new fence
+ * @ticket:	[in]	the reservation_ticket returned by
+ * ticket_reserve
+ * @entries:	[in]	a linked list of struct reservation_entry
+ * @fence:	[in]	the fence that indicates completion
+ *
+ * This function will call reservation_ticket_fini, no need
+ * to do it manually.
+ *
+ * This function should be called after a hardware command submission is
+ * completed successfully. The fence is used to indicate completion of
+ * those commands.
+ */
+void
+ticket_commit(struct reservation_ticket *ticket,
+		  struct list_head *entries, struct fence *fence)
+{
+	struct list_head *cur;
+
+	if (list_empty(entries))
+		return;
+
+	if (WARN_ON(!fence)) {
+		ticket_backoff(ticket, entries);
+		return;
+	}
+
+	list_for_each(cur, entries) {
+		struct reservation_object *bo;
+		bool shared;
+
+		reservation_entry_get(cur, &bo, &shared);
+
+		if (!shared) {
+			int i;
+			for (i = 0; i < bo->fence_shared_count; ++i) {
+				fence_put(bo->fence_shared[i]);
+				bo->fence_shared[i] = NULL;
+			}
+			bo->fence_shared_count = 0;
+			if (bo->fence_excl)
+				fence_put(bo->fence_excl);
+
+			bo->fence_excl = fence;
+		} else {
+			if (WARN_ON(bo->fence_shared_count >=
+				    ARRAY_SIZE(bo->fence_shared))) {
+				mutex_unreserve_unlock(&bo->lock);
+				continue;
+			}
+
+			bo->fence_shared[bo->fence_shared_count++] = fence;
+		}
+		fence_get(fence);
+
+		mutex_unreserve_unlock(&bo->lock);
+	}
+	reservation_ticket_fini(ticket);
+}
+EXPORT_SYMBOL(ticket_commit);
diff --git a/include/linux/reservation.h b/include/linux/reservation.h
new file mode 100644
index 0000000..fc2349d
--- /dev/null
+++ b/include/linux/reservation.h
@@ -0,0 +1,182 @@
+/*
+ * Header file for reservations for dma-buf and ttm
+ *
+ * Copyright(C) 2011 Linaro Limited. All rights reserved.
+ * Copyright (C) 2012 Canonical Ltd
+ * Copyright (C) 2012 Texas Instruments
+ *
+ * Authors:
+ * Rob Clark <rob.clark@linaro.org>
+ * Maarten Lankhorst <maarten.lankhorst@canonical.com>
+ * Thomas Hellstrom <thellstrom-at-vmware-dot-com>
+ *
+ * Based on bo.c which bears the following copyright notice,
+ * but is dual licensed:
+ *
+ * Copyright (c) 2006-2009 VMware, Inc., Palo Alto, CA., USA
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ */
+#ifndef _LINUX_RESERVATION_H
+#define _LINUX_RESERVATION_H
+
+#include <linux/spinlock.h>
+#include <linux/mutex.h>
+#include <linux/fence.h>
+
+#define BUF_MAX_SHARED_FENCE 8
+
+struct fence;
+
+extern atomic_long_t reservation_counter;
+extern const char reservation_object_name[];
+extern struct lock_class_key reservation_object_class;
+extern const char reservation_ticket_name[];
+extern struct lock_class_key reservation_ticket_class;
+
+struct reservation_object {
+	struct ticket_mutex lock;
+
+	u32 fence_shared_count;
+	struct fence *fence_excl;
+	struct fence *fence_shared[BUF_MAX_SHARED_FENCE];
+};
+
+struct reservation_ticket {
+	unsigned long seqno;
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	struct lockdep_map dep_map;
+#endif
+};
+
+/**
+ * struct reservation_entry - reservation structure for a
+ * reservation_object
+ * @head:	list entry
+ * @obj_shared:	pointer to a reservation_object to reserve
+ *
+ * Bit 0 of obj_shared is set to bool shared, so the pointer has to be
+ * converted back, which can be done with reservation_entry_get.
+ */
+struct reservation_entry {
+	struct list_head head;
+	unsigned long obj_shared;
+};
+
+
+static inline void
+reservation_object_init(struct reservation_object *obj)
+{
+	obj->fence_shared_count = 0;
+	obj->fence_excl = NULL;
+
+	__ticket_mutex_init(&obj->lock, reservation_object_name,
+			    &reservation_object_class);
+}
+
+static inline void
+reservation_object_fini(struct reservation_object *obj)
+{
+	int i;
+
+	if (obj->fence_excl)
+		fence_put(obj->fence_excl);
+	for (i = 0; i < obj->fence_shared_count; ++i)
+		fence_put(obj->fence_shared[i]);
+
+	mutex_destroy(&obj->lock.base);
+}
+
+static inline void
+reservation_ticket_init(struct reservation_ticket *t)
+{
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	/*
+	 * Make sure we are not reinitializing a held ticket:
+	 */
+
+	debug_check_no_locks_freed((void *)t, sizeof(*t));
+	lockdep_init_map(&t->dep_map, reservation_ticket_name,
+			 &reservation_ticket_class, 0);
+#endif
+	mutex_acquire(&t->dep_map, 0, 0, _THIS_IP_);
+	do {
+		t->seqno = atomic_long_inc_return(&reservation_counter);
+	} while (unlikely(!t->seqno));
+}
+
+/**
+ * reservation_ticket_fini - end a reservation ticket
+ * @t:	[in]	reservation_ticket that completed all reservations
+ *
+ * This currently does nothing, but should be called after all reservations
+ * made with this ticket have been unreserved. It is likely that in the future
+ * it will be hooked up to perf events, or aid in debugging in other ways.
+ */
+static inline void
+reservation_ticket_fini(struct reservation_ticket *t)
+{
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	mutex_release(&t->dep_map, 0, _THIS_IP_);
+	t->seqno = 0;
+#endif
+}
+
+/**
+ * reservation_entry_init - initialize and append a reservation_entry
+ * to the list
+ * @entry:	entry to initialize
+ * @list:	list to append to
+ * @obj:	reservation_object to initialize the entry with
+ * @shared:	whether shared or exclusive access is requested
+ */
+static inline void
+reservation_entry_init(struct reservation_entry *entry,
+			   struct list_head *list,
+			   struct reservation_object *obj, bool shared)
+{
+	entry->obj_shared = (unsigned long)obj | !!shared;
+	list_add_tail(&entry->head, list);
+}
+
+static inline struct reservation_entry *
+reservation_entry_get(struct list_head *list,
+			  struct reservation_object **obj, bool *shared)
+{
+	struct reservation_entry *e = container_of(list, struct reservation_entry, head);
+	unsigned long val = e->obj_shared;
+
+	if (obj)
+		*obj = (struct reservation_object *)(val & ~1);
+	if (shared)
+		*shared = val & 1;
+	return e;
+}
+
+extern int ticket_reserve(struct reservation_ticket *,
+			  struct list_head *entries);
+extern void ticket_backoff(struct reservation_ticket *,
+			   struct list_head *entries);
+extern void ticket_commit(struct reservation_ticket *,
+			  struct list_head *entries, struct fence *);
+
+#endif /* _LINUX_RESERVATION_H */
-- 
1.8.0.3


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 7/7] reservation: Add lockdep annotation and selftests
  2013-01-15 12:33 [PATCH 0/7] cross-device reservation for dma-buf support Maarten Lankhorst
                   ` (5 preceding siblings ...)
  2013-01-15 12:34 ` [PATCH 6/7] reservation: cross-device reservation support Maarten Lankhorst
@ 2013-01-15 12:34 ` Maarten Lankhorst
  6 siblings, 0 replies; 32+ messages in thread
From: Maarten Lankhorst @ 2013-01-15 12:34 UTC (permalink / raw)
  To: dri-devel, linux-kernel, linux-media, linaro-mm-sig; +Cc: Maarten Lankhorst

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>

---

The self-tests will fail if the commit "lockdep: Check if nested
lock is actually held" from linux tip core/locking is not applied.
---
 lib/Kconfig.debug      |   1 +
 lib/locking-selftest.c | 385 ++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 367 insertions(+), 19 deletions(-)

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 67604e5..017bcea 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -716,6 +716,7 @@ config DEBUG_ATOMIC_SLEEP
 config DEBUG_LOCKING_API_SELFTESTS
 	bool "Locking API boot-time self-tests"
 	depends on DEBUG_KERNEL
+	select DMA_SHARED_BUFFER
 	help
 	  Say Y here if you want the kernel to run a short self-test during
 	  bootup. The self-test checks whether common types of locking bugs
diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index 7aae0f2..7fe22c2 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -20,6 +20,7 @@
 #include <linux/interrupt.h>
 #include <linux/debug_locks.h>
 #include <linux/irqflags.h>
+#include <linux/reservation.h>
 
 /*
  * Change this to 1 if you want to see the failure printouts:
@@ -42,6 +43,7 @@ __setup("debug_locks_verbose=", setup_debug_locks_verbose);
 #define LOCKTYPE_RWLOCK	0x2
 #define LOCKTYPE_MUTEX	0x4
 #define LOCKTYPE_RWSEM	0x8
+#define LOCKTYPE_RESERVATION	0x10
 
 /*
  * Normal standalone locks, for the circular and irq-context
@@ -920,11 +922,17 @@ GENERATE_PERMUTATIONS_3_EVENTS(irq_read_recursion_soft)
 static void reset_locks(void)
 {
 	local_irq_disable();
+	lockdep_free_key_range(&reservation_object_class, 1);
+	lockdep_free_key_range(&reservation_ticket_class, 1);
+
 	I1(A); I1(B); I1(C); I1(D);
 	I1(X1); I1(X2); I1(Y1); I1(Y2); I1(Z1); I1(Z2);
 	lockdep_reset();
 	I2(A); I2(B); I2(C); I2(D);
 	init_shared_classes();
+
+	memset(&reservation_object_class, 0, sizeof reservation_object_class);
+	memset(&reservation_ticket_class, 0, sizeof reservation_ticket_class);
 	local_irq_enable();
 }
 
@@ -938,7 +946,6 @@ static int unexpected_testcase_failures;
 static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
 {
 	unsigned long saved_preempt_count = preempt_count();
-	int expected_failure = 0;
 
 	WARN_ON(irqs_disabled());
 
@@ -946,26 +953,16 @@ static void dotest(void (*testcase_fn)(void), int expected, int lockclass_mask)
 	/*
 	 * Filter out expected failures:
 	 */
+	if (debug_locks != expected) {
 #ifndef CONFIG_PROVE_LOCKING
-	if ((lockclass_mask & LOCKTYPE_SPIN) && debug_locks != expected)
-		expected_failure = 1;
-	if ((lockclass_mask & LOCKTYPE_RWLOCK) && debug_locks != expected)
-		expected_failure = 1;
-	if ((lockclass_mask & LOCKTYPE_MUTEX) && debug_locks != expected)
-		expected_failure = 1;
-	if ((lockclass_mask & LOCKTYPE_RWSEM) && debug_locks != expected)
-		expected_failure = 1;
+		expected_testcase_failures++;
+		printk("failed|");
+#else
+		unexpected_testcase_failures++;
+		printk("FAILED|");
+
+		dump_stack();
 #endif
-	if (debug_locks != expected) {
-		if (expected_failure) {
-			expected_testcase_failures++;
-			printk("failed|");
-		} else {
-			unexpected_testcase_failures++;
-
-			printk("FAILED|");
-			dump_stack();
-		}
 	} else {
 		testcase_successes++;
 		printk("  ok  |");
@@ -1108,6 +1105,354 @@ static inline void print_testname(const char *testname)
 	DO_TESTCASE_6IRW(desc, name, 312);			\
 	DO_TESTCASE_6IRW(desc, name, 321);
 
+static void reservation_test_fail_reserve(void)
+{
+	struct reservation_ticket t;
+	struct reservation_object o;
+	int ret;
+
+	reservation_object_init(&o);
+	reservation_ticket_init(&t);
+	t.seqno++;
+
+	ret = mutex_reserve_lock(&o.lock, &t, t.seqno);
+
+	BUG_ON(!atomic_long_read(&o.lock.reservation_id));
+
+	/* No lockdep test, pure API */
+	ret = mutex_reserve_lock(&o.lock, &t, t.seqno);
+	WARN_ON(ret != -EDEADLK);
+
+	t.seqno++;
+	ret = mutex_trylock(&o.lock.base);
+	WARN_ON(ret);
+
+	ret = mutex_reserve_lock(&o.lock, &t, t.seqno);
+	WARN_ON(ret != -EAGAIN);
+	mutex_unlock(&o.lock.base);
+
+	if (mutex_trylock(&o.lock.base))
+		mutex_unlock(&o.lock.base);
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	else DEBUG_LOCKS_WARN_ON(1);
+#endif
+
+	reservation_ticket_fini(&t);
+}
+
+static void reservation_test_two_tickets(void)
+{
+	struct reservation_ticket t, t2;
+
+	reservation_ticket_init(&t);
+	reservation_ticket_init(&t2);
+
+	reservation_ticket_fini(&t2);
+	reservation_ticket_fini(&t);
+}
+
+static void reservation_test_ticket_unreserve_twice(void)
+{
+	struct reservation_ticket t;
+
+	reservation_ticket_init(&t);
+	reservation_ticket_fini(&t);
+	reservation_ticket_fini(&t);
+}
+
+static void reservation_test_object_unreserve_twice(void)
+{
+	struct reservation_object o;
+
+	reservation_object_init(&o);
+	mutex_lock(&o.lock.base);
+	mutex_unlock(&o.lock.base);
+	mutex_unlock(&o.lock.base);
+}
+
+static void reservation_test_fence_nest_unreserved(void)
+{
+	struct reservation_object o;
+
+	reservation_object_init(&o);
+
+	spin_lock_nest_lock(&lock_A, &o.lock.base);
+	spin_unlock(&lock_A);
+}
+
+static void reservation_test_ticket_block(void)
+{
+	struct reservation_ticket t;
+	struct reservation_object o, o2;
+	int ret;
+
+	reservation_object_init(&o);
+	reservation_object_init(&o2);
+	reservation_ticket_init(&t);
+
+	ret = mutex_reserve_lock(&o.lock, &t, t.seqno);
+	WARN_ON(ret);
+	mutex_lock(&o2.lock.base);
+	mutex_unlock(&o2.lock.base);
+	mutex_unreserve_unlock(&o.lock);
+
+	reservation_ticket_fini(&t);
+}
+
+static void reservation_test_ticket_try(void)
+{
+	struct reservation_ticket t;
+	struct reservation_object o, o2;
+	int ret;
+
+	reservation_object_init(&o);
+	reservation_object_init(&o2);
+	reservation_ticket_init(&t);
+
+	ret = mutex_reserve_lock(&o.lock, &t, t.seqno);
+	WARN_ON(ret);
+
+	mutex_trylock(&o2.lock.base);
+	mutex_unlock(&o2.lock.base);
+	mutex_unreserve_unlock(&o.lock);
+
+	reservation_ticket_fini(&t);
+}
+
+static void reservation_test_ticket_ticket(void)
+{
+	struct reservation_ticket t;
+	struct reservation_object o, o2;
+	int ret;
+
+	reservation_object_init(&o);
+	reservation_object_init(&o2);
+	reservation_ticket_init(&t);
+
+	ret = mutex_reserve_lock(&o.lock, &t, t.seqno);
+	WARN_ON(ret);
+
+	ret = mutex_reserve_lock(&o2.lock, &t, t.seqno);
+	WARN_ON(ret);
+
+	mutex_unreserve_unlock(&o2.lock);
+	mutex_unreserve_unlock(&o.lock);
+
+	reservation_ticket_fini(&t);
+}
+
+static void reservation_test_try_block(void)
+{
+	struct reservation_object o, o2;
+
+	reservation_object_init(&o);
+	reservation_object_init(&o2);
+
+	mutex_trylock(&o.lock.base);
+	mutex_lock(&o2.lock.base);
+	mutex_unlock(&o2.lock.base);
+	mutex_unlock(&o.lock.base);
+}
+
+static void reservation_test_try_try(void)
+{
+	struct reservation_object o, o2;
+
+	reservation_object_init(&o);
+	reservation_object_init(&o2);
+
+	mutex_trylock(&o.lock.base);
+	mutex_trylock(&o2.lock.base);
+	mutex_unlock(&o2.lock.base);
+	mutex_unlock(&o.lock.base);
+}
+
+static void reservation_test_try_ticket(void)
+{
+	struct reservation_ticket t;
+	struct reservation_object o, o2;
+	int ret;
+
+	reservation_object_init(&o);
+	reservation_object_init(&o2);
+
+	mutex_trylock(&o.lock.base);
+	reservation_ticket_init(&t);
+
+	ret = mutex_reserve_lock(&o2.lock, &t, t.seqno);
+	WARN_ON(ret);
+
+	mutex_unreserve_unlock(&o2.lock);
+	mutex_unlock(&o.lock.base);
+
+	reservation_ticket_fini(&t);
+}
+
+static void reservation_test_block_block(void)
+{
+	struct reservation_object o, o2;
+
+	reservation_object_init(&o);
+	reservation_object_init(&o2);
+
+	mutex_lock(&o.lock.base);
+	mutex_lock(&o2.lock.base);
+	mutex_unlock(&o2.lock.base);
+	mutex_unlock(&o.lock.base);
+}
+
+static void reservation_test_block_try(void)
+{
+	struct reservation_object o, o2;
+
+	reservation_object_init(&o);
+	reservation_object_init(&o2);
+
+	mutex_lock(&o.lock.base);
+	mutex_trylock(&o2.lock.base);
+	mutex_unlock(&o2.lock.base);
+	mutex_unlock(&o.lock.base);
+}
+
+static void reservation_test_block_ticket(void)
+{
+	struct reservation_ticket t;
+	struct reservation_object o, o2;
+	int ret;
+
+	reservation_object_init(&o);
+	reservation_object_init(&o2);
+
+	mutex_lock(&o.lock.base);
+	reservation_ticket_init(&t);
+
+	ret = mutex_reserve_lock(&o2.lock, &t, t.seqno);
+	WARN_ON(ret);
+	mutex_unreserve_unlock(&o2.lock);
+	mutex_unlock(&o.lock.base);
+
+	reservation_ticket_fini(&t);
+}
+
+static void reservation_test_fence_block(void)
+{
+	struct reservation_object o;
+
+	reservation_object_init(&o);
+	spin_lock(&lock_A);
+	spin_unlock(&lock_A);
+
+	mutex_lock(&o.lock.base);
+	spin_lock(&lock_A);
+	spin_unlock(&lock_A);
+	mutex_unlock(&o.lock.base);
+
+	spin_lock(&lock_A);
+	mutex_lock(&o.lock.base);
+	mutex_unlock(&o.lock.base);
+	spin_unlock(&lock_A);
+}
+
+static void reservation_test_fence_try(void)
+{
+	struct reservation_object o;
+
+	reservation_object_init(&o);
+	spin_lock(&lock_A);
+	spin_unlock(&lock_A);
+
+	mutex_trylock(&o.lock.base);
+	spin_lock(&lock_A);
+	spin_unlock(&lock_A);
+	mutex_unlock(&o.lock.base);
+
+	spin_lock(&lock_A);
+	mutex_trylock(&o.lock.base);
+	mutex_unlock(&o.lock.base);
+	spin_unlock(&lock_A);
+}
+
+static void reservation_test_fence_ticket(void)
+{
+	struct reservation_ticket t;
+	struct reservation_object o;
+	int ret;
+
+	reservation_object_init(&o);
+	spin_lock(&lock_A);
+	spin_unlock(&lock_A);
+
+	reservation_ticket_init(&t);
+
+	ret = mutex_reserve_lock(&o.lock, &t, t.seqno);
+	WARN_ON(ret);
+	spin_lock(&lock_A);
+	spin_unlock(&lock_A);
+	mutex_unreserve_unlock(&o.lock);
+
+	spin_lock(&lock_A);
+	ret = mutex_reserve_lock(&o.lock, &t, t.seqno);
+	WARN_ON(ret);
+	mutex_unreserve_unlock(&o.lock);
+	spin_unlock(&lock_A);
+
+	reservation_ticket_fini(&t);
+}
+
+static void reservation_tests(void)
+{
+	printk("  --------------------------------------------------------------------------\n");
+	printk("  | Reservation tests |\n");
+	printk("  ---------------------\n");
+
+	print_testname("reservation api failures");
+	dotest(reservation_test_fail_reserve, SUCCESS, LOCKTYPE_RESERVATION);
+	printk("\n");
+
+	print_testname("reserving two tickets");
+	dotest(reservation_test_two_tickets, FAILURE, LOCKTYPE_RESERVATION);
+	printk("\n");
+
+	print_testname("unreserve ticket twice");
+	dotest(reservation_test_ticket_unreserve_twice, FAILURE, LOCKTYPE_RESERVATION);
+	printk("\n");
+
+	print_testname("unreserve object twice");
+	dotest(reservation_test_object_unreserve_twice, FAILURE, LOCKTYPE_RESERVATION);
+	printk("\n");
+
+	print_testname("spinlock nest unreserved");
+	dotest(reservation_test_fence_nest_unreserved, FAILURE, LOCKTYPE_RESERVATION);
+	printk("\n");
+
+	printk("  -----------------------------------------------------\n");
+	printk("                                 |block | try  |ticket|\n");
+	printk("  -----------------------------------------------------\n");
+
+	print_testname("ticket");
+	dotest(reservation_test_ticket_block, FAILURE, LOCKTYPE_RESERVATION);
+	dotest(reservation_test_ticket_try, SUCCESS, LOCKTYPE_RESERVATION);
+	dotest(reservation_test_ticket_ticket, SUCCESS, LOCKTYPE_RESERVATION);
+	printk("\n");
+
+	print_testname("try");
+	dotest(reservation_test_try_block, FAILURE, LOCKTYPE_RESERVATION);
+	dotest(reservation_test_try_try, SUCCESS, LOCKTYPE_RESERVATION);
+	dotest(reservation_test_try_ticket, FAILURE, LOCKTYPE_RESERVATION);
+	printk("\n");
+
+	print_testname("block");
+	dotest(reservation_test_block_block, FAILURE, LOCKTYPE_RESERVATION);
+	dotest(reservation_test_block_try, SUCCESS, LOCKTYPE_RESERVATION);
+	dotest(reservation_test_block_ticket, FAILURE, LOCKTYPE_RESERVATION);
+	printk("\n");
+
+	print_testname("spinlock");
+	dotest(reservation_test_fence_block, FAILURE, LOCKTYPE_RESERVATION);
+	dotest(reservation_test_fence_try, SUCCESS, LOCKTYPE_RESERVATION);
+	dotest(reservation_test_fence_ticket, FAILURE, LOCKTYPE_RESERVATION);
+	printk("\n");
+}
 
 void locking_selftest(void)
 {
@@ -1188,6 +1533,8 @@ void locking_selftest(void)
 	DO_TESTCASE_6x2("irq read-recursion", irq_read_recursion);
 //	DO_TESTCASE_6x2B("irq read-recursion #2", irq_read_recursion2);
 
+	reservation_tests();
+
 	if (unexpected_testcase_failures) {
 		printk("-----------------------------------------------------------------\n");
 		debug_locks = 0;
-- 
1.8.0.3


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/7] mutex: add support for reservation style locks
  2013-01-15 12:33 ` [PATCH 2/7] mutex: add support for reservation style locks Maarten Lankhorst
@ 2013-01-15 13:43   ` Maarten Lankhorst
  2013-01-30  1:07   ` Rob Clark
  1 sibling, 0 replies; 32+ messages in thread
From: Maarten Lankhorst @ 2013-01-15 13:43 UTC (permalink / raw)
  To: Maarten Lankhorst; +Cc: dri-devel, linux-kernel, linux-media, linaro-mm-sig

Woops, missed the updated patch description..

Op 15-01-13 13:33, Maarten Lankhorst schreef:
> makes it easier to port ttm over..
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>

mutex_reserve_lock, and mutex_reserve_lock_interruptible:
  Lock a buffer with a reservation_id set. reservation_id must not be set to 0,
  since this is a special value that means no reservation_id.

  Normally if reservation_id is not set, or is older than the reservation_id that's
  currently set on the mutex, the behavior will be to wait normally.

  However, if  the reservation_id is newer than the current reservation_id, -EAGAIN
  will be returned, and this function must unreserve all other mutexes and then redo
  a blocking lock with normal mutex calls to prevent a deadlock, then call
  mutex_locked_set_reservation on successful locking to set the reservation_id inside
  the lock.

  These functions will return -EDEADLK instead of -EAGAIN if the reservation_id already set
  on the mutex is the same as the one the lock is attempted with, since in that case you
  presumably attempted to lock the same lock twice.

mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow:
  Similar to mutex_reserve_lock, except it won't backoff with -EAGAIN. This is useful
  after mutex_reserve_lock failed with -EAGAIN, and you unreserved all buffers so no
  deadlock can occur.

mutex_unreserve_unlock:
   Unlock a buffer reserved with the previous calls.
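
To make the backoff rules above concrete, a hedged sketch of a caller locking
two ticket mutexes by hand (the helper, its arguments and the retry policy are
made up; a real user would typically loop over a whole list of objects, as the
reservation code in patch 6/7 does):

#include <linux/mutex.h>
#include <linux/reservation.h>

/*
 * Sketch only: take a and b under one reservation ticket t, handling the
 * -EAGAIN backoff with the *_slow variant as described above.
 */
static int lock_pair(struct ticket_mutex *a, struct ticket_mutex *b,
		     struct reservation_ticket *t)
{
	int ret;

	ret = mutex_reserve_lock(a, t, t->seqno);
	if (ret)
		return ret;			/* e.g. -EDEADLK: we already hold a */

	ret = mutex_reserve_lock(b, t, t->seqno);
	if (ret == -EAGAIN) {
		/* a newer ticket owns b: drop what we hold, then block on b */
		mutex_unreserve_unlock(a);
		mutex_reserve_lock_slow(b, t, t->seqno);

		/* retake a; this can fail again, so a real caller loops */
		ret = mutex_reserve_lock(a, t, t->seqno);
		if (ret) {
			mutex_unreserve_unlock(b);
			return ret;
		}
	} else if (ret) {
		mutex_unreserve_unlock(a);
		return ret;
	}

	return 0;
}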

Missing at the moment, maybe TODO?
  * lockdep warnings when wrongly calling mutex_unreserve_unlock or mutex_unlock,
    depending on whether reservation_id was set previously or not.
    - Does lockdep have something for this or do I need to extend struct mutex?

  * Check if lockdep warns if you unlock a lock that other locks were nested to.
    - spin_lock(m); spin_lock_nest_lock(a, m); spin_unlock(m); spin_unlock(a);
      would be nice if it gave a splat. Have to recheck if it does, though..

Design:
  I chose for ticket_mutex to encapsulate struct mutex, so the extra memory usage and
  atomic set on init will only happen when you deliberately create a ticket lock.

  Since the mutexes are mostly meant to protect buffer object serialization in ttm, not
  much contention is expected. I could be slightly smarter with wakeups, but this would
  be at the expense of adding a field to struct mutex_waiter. Because this would add
  overhead to all cases where ticket_mutexes are not used, and ticket_mutexes are less
  performance sensitive anyway since they only protect buffer objects, I didn't want to
  do this. It's still better than ttm always calling wake_up_all, which does an
  unconditional spin_lock_irqsave/irqrestore.

  I needed this in kernel/mutex.c because of the extensions to __lock_common, which are
  hopefully optimized away for all normal paths.

Changes since RFC patch v1:
 - Updated to use atomic_long instead of atomic, since the reservation_id was a long.
 - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
 - removed mutex_locked_set_reservation_id (or w/e it was called)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>

> ---
>  include/linux/mutex.h |  86 +++++++++++++-
>  kernel/mutex.c        | 317 +++++++++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 387 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
> index 9121595..602c247 100644
> --- a/include/linux/mutex.h
> +++ b/include/linux/mutex.h
> @@ -62,6 +62,11 @@ struct mutex {
>  #endif
>  };
>  
> +struct ticket_mutex {
> +	struct mutex base;
> +	atomic_long_t reservation_id;
> +};
> +
>  /*
>   * This is the control structure for tasks blocked on mutex,
>   * which resides on the blocked task's kernel stack:
> @@ -109,12 +114,24 @@ static inline void mutex_destroy(struct mutex *lock) {}
>  		__DEBUG_MUTEX_INITIALIZER(lockname) \
>  		__DEP_MAP_MUTEX_INITIALIZER(lockname) }
>  
> +#define __TICKET_MUTEX_INITIALIZER(lockname) \
> +		{ .base = __MUTEX_INITIALIZER(lockname) \
> +		, .reservation_id = ATOMIC_LONG_INIT(0) }
> +
>  #define DEFINE_MUTEX(mutexname) \
>  	struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
>  
>  extern void __mutex_init(struct mutex *lock, const char *name,
>  			 struct lock_class_key *key);
>  
> +static inline void __ticket_mutex_init(struct ticket_mutex *lock,
> +				       const char *name,
> +				       struct lock_class_key *key)
> +{
> +	__mutex_init(&lock->base, name, key);
> +	atomic_long_set(&lock->reservation_id, 0);
> +}
> +
>  /**
>   * mutex_is_locked - is the mutex locked
>   * @lock: the mutex to be queried
> @@ -133,26 +150,91 @@ static inline int mutex_is_locked(struct mutex *lock)
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>  extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
>  extern void _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest_lock);
> +
>  extern int __must_check mutex_lock_interruptible_nested(struct mutex *lock,
>  					unsigned int subclass);
>  extern int __must_check mutex_lock_killable_nested(struct mutex *lock,
>  					unsigned int subclass);
>  
> +extern int __must_check _mutex_reserve_lock(struct ticket_mutex *lock,
> +					struct lockdep_map *nest_lock,
> +					unsigned long reservation_id);
> +
> +extern int __must_check _mutex_reserve_lock_interruptible(struct ticket_mutex *,
> +					struct lockdep_map *nest_lock,
> +					unsigned long reservation_id);
> +
> +extern void _mutex_reserve_lock_slow(struct ticket_mutex *lock,
> +				     struct lockdep_map *nest_lock,
> +				     unsigned long reservation_id);
> +
> +extern int __must_check _mutex_reserve_lock_intr_slow(struct ticket_mutex *,
> +					struct lockdep_map *nest_lock,
> +					unsigned long reservation_id);
> +
>  #define mutex_lock(lock) mutex_lock_nested(lock, 0)
>  #define mutex_lock_interruptible(lock) mutex_lock_interruptible_nested(lock, 0)
>  #define mutex_lock_killable(lock) mutex_lock_killable_nested(lock, 0)
>  
>  #define mutex_lock_nest_lock(lock, nest_lock)				\
>  do {									\
> -	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);		\
> +	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
>  	_mutex_lock_nest_lock(lock, &(nest_lock)->dep_map);		\
>  } while (0)
>  
> +#define mutex_reserve_lock(lock, nest_lock, reservation_id)		\
> +({									\
> +	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
> +	_mutex_reserve_lock(lock, &(nest_lock)->dep_map, reservation_id);	\
> +})
> +
> +#define mutex_reserve_lock_interruptible(lock, nest_lock, reservation_id)	\
> +({									\
> +	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
> +	_mutex_reserve_lock_interruptible(lock, &(nest_lock)->dep_map,	\
> +					   reservation_id);		\
> +})
> +
> +#define mutex_reserve_lock_slow(lock, nest_lock, reservation_id)	\
> +do {									\
> +	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
> +	_mutex_reserve_lock_slow(lock, &(nest_lock)->dep_map, reservation_id);	\
> +} while (0)
> +
> +#define mutex_reserve_lock_intr_slow(lock, nest_lock, reservation_id)	\
> +({									\
> +	typecheck(struct lockdep_map *, &(nest_lock)->dep_map);	\
> +	_mutex_reserve_lock_intr_slow(lock, &(nest_lock)->dep_map,	\
> +				      reservation_id);			\
> +})
> +
>  #else
>  extern void mutex_lock(struct mutex *lock);
>  extern int __must_check mutex_lock_interruptible(struct mutex *lock);
>  extern int __must_check mutex_lock_killable(struct mutex *lock);
>  
> +extern int __must_check _mutex_reserve_lock(struct ticket_mutex *lock,
> +					    unsigned long reservation_id);
> +extern int __must_check _mutex_reserve_lock_interruptible(struct ticket_mutex *,
> +						unsigned long reservation_id);
> +
> +extern void _mutex_reserve_lock_slow(struct ticket_mutex *lock,
> +				     unsigned long reservation_id);
> +extern int __must_check _mutex_reserve_lock_intr_slow(struct ticket_mutex *,
> +						unsigned long reservation_id);
> +
> +#define mutex_reserve_lock(lock, nest_lock, reservation_id)		\
> +	_mutex_reserve_lock(lock, reservation_id)
> +
> +#define mutex_reserve_lock_interruptible(lock, nest_lock, reservation_id)	\
> +	_mutex_reserve_lock_interruptible(lock, reservation_id)
> +
> +#define mutex_reserve_lock_slow(lock, nest_lock, reservation_id)	\
> +	_mutex_reserve_lock_slow(lock, reservation_id)
> +
> +#define mutex_reserve_lock_intr_slow(lock, nest_lock, reservation_id)	\
> +	_mutex_reserve_lock_intr_slow(lock, reservation_id)
> +
>  # define mutex_lock_nested(lock, subclass) mutex_lock(lock)
>  # define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
>  # define mutex_lock_killable_nested(lock, subclass) mutex_lock_killable(lock)
> @@ -167,6 +249,8 @@ extern int __must_check mutex_lock_killable(struct mutex *lock);
>   */
>  extern int mutex_trylock(struct mutex *lock);
>  extern void mutex_unlock(struct mutex *lock);
> +extern void mutex_unreserve_unlock(struct ticket_mutex *lock);
> +
>  extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
>  
>  #ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index a307cc9..8282729 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -126,16 +126,119 @@ void __sched mutex_unlock(struct mutex *lock)
>  
>  EXPORT_SYMBOL(mutex_unlock);
>  
> +/**
> + * mutex_unreserve_unlock - release the mutex
> + * @lock: the mutex to be released
> + *
> + * Unlock a mutex that has been locked by this task previously
> + * with _mutex_reserve_lock*.
> + *
> + * This function must not be used in interrupt context. Unlocking
> + * of a not locked mutex is not allowed.
> + */
> +void __sched mutex_unreserve_unlock(struct ticket_mutex *lock)
> +{
> +	/*
> +	 * mark mutex as no longer part of a reservation, next
> +	 * locker can set this again
> +	 */
> +	atomic_long_set(&lock->reservation_id, 0);
> +
> +	/*
> +	 * The unlocking fastpath is the 0->1 transition from 'locked'
> +	 * into 'unlocked' state:
> +	 */
> +#ifndef CONFIG_DEBUG_MUTEXES
> +	/*
> +	 * When debugging is enabled we must not clear the owner before time,
> +	 * the slow path will always be taken, and that clears the owner field
> +	 * after verifying that it was indeed current.
> +	 */
> +	mutex_clear_owner(&lock->base);
> +#endif
> +	__mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
> +}
> +EXPORT_SYMBOL(mutex_unreserve_unlock);
> +
> +static inline int __sched
> +__mutex_lock_check_reserve(struct mutex *lock, unsigned long reservation_id)
> +{
> +	struct ticket_mutex *m = container_of(lock, struct ticket_mutex, base);
> +	unsigned long cur_id;
> +
> +	cur_id = atomic_long_read(&m->reservation_id);
> +	if (!cur_id)
> +		return 0;
> +
> +	if (unlikely(reservation_id == cur_id))
> +		return -EDEADLK;
> +
> +	if (unlikely(reservation_id - cur_id <= LONG_MAX))
> +		return -EAGAIN;
> +
> +	return 0;
> +}
> +
> +/*
> + * after acquiring lock with fastpath or when we lost out in contested
> + * slowpath, set reservation_id and wake up any waiters so they can recheck.
> + */
> +static __always_inline void
> +mutex_set_reservation_fastpath(struct ticket_mutex *lock,
> +			       unsigned long reservation_id, bool check_res)
> +{
> +	unsigned long flags;
> +	struct mutex_waiter *cur;
> +
> +	if (check_res || config_enabled(CONFIG_DEBUG_LOCK_ALLOC)) {
> +		unsigned long cur_id;
> +
> +		cur_id = atomic_long_xchg(&lock->reservation_id,
> +					  reservation_id);
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> +		if (check_res)
> +			DEBUG_LOCKS_WARN_ON(cur_id &&
> +					   cur_id != reservation_id);
> +		else
> +			DEBUG_LOCKS_WARN_ON(cur_id);
> +		lockdep_assert_held(&lock->base);
> +#endif
> +
> +		if (unlikely(cur_id == reservation_id))
> +			return;
> +	} else
> +		atomic_long_set(&lock->reservation_id, reservation_id);
> +
> +	/*
> +	 * Check if lock is contended, if not there is nobody to wake up
> +	 */
> +	if (likely(atomic_read(&lock->base.count) == 0))
> +		return;
> +
> +	/*
> +	 * Uh oh, we raced in fastpath, wake up everyone in this case,
> +	 * so they can see the new reservation_id
> +	 */
> +	spin_lock_mutex(&lock->base.wait_lock, flags);
> +	list_for_each_entry(cur, &lock->base.wait_list, list) {
> +		debug_mutex_wake_waiter(&lock->base, cur);
> +		wake_up_process(cur->task);
> +	}
> +	spin_unlock_mutex(&lock->base.wait_lock, flags);
> +}
> +
>  /*
>   * Lock a mutex (possibly interruptible), slowpath:
>   */
>  static inline int __sched
>  __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
> -		    struct lockdep_map *nest_lock, unsigned long ip)
> +		    struct lockdep_map *nest_lock, unsigned long ip,
> +		    unsigned long reservation_id, bool res_slow)
>  {
>  	struct task_struct *task = current;
>  	struct mutex_waiter waiter;
>  	unsigned long flags;
> +	int ret;
>  
>  	preempt_disable();
>  	mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
> @@ -162,6 +265,12 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  	for (;;) {
>  		struct task_struct *owner;
>  
> +		if (!__builtin_constant_p(reservation_id) && !res_slow) {
> +			ret = __mutex_lock_check_reserve(lock, reservation_id);
> +			if (ret)
> +				goto err_nowait;
> +		}
> +
>  		/*
>  		 * If there's an owner, wait for it to either
>  		 * release the lock or go to sleep.
> @@ -172,6 +281,13 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  
>  		if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
>  			lock_acquired(&lock->dep_map, ip);
> +			if (res_slow) {
> +				struct ticket_mutex *m;
> +				m = container_of(lock, struct ticket_mutex, base);
> +
> +				mutex_set_reservation_fastpath(m, reservation_id, false);
> +			}
> +
>  			mutex_set_owner(lock);
>  			preempt_enable();
>  			return 0;
> @@ -227,15 +343,16 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>  		 * TASK_UNINTERRUPTIBLE case.)
>  		 */
>  		if (unlikely(signal_pending_state(state, task))) {
> -			mutex_remove_waiter(lock, &waiter,
> -					    task_thread_info(task));
> -			mutex_release(&lock->dep_map, 1, ip);
> -			spin_unlock_mutex(&lock->wait_lock, flags);
> +			ret = -EINTR;
> +			goto err;
> +		}
>  
> -			debug_mutex_free_waiter(&waiter);
> -			preempt_enable();
> -			return -EINTR;
> +		if (!__builtin_constant_p(reservation_id) && !res_slow) {
> +			ret = __mutex_lock_check_reserve(lock, reservation_id);
> +			if (ret)
> +				goto err;
>  		}
> +
>  		__set_task_state(task, state);
>  
>  		/* didn't get the lock, go to sleep: */
> @@ -250,6 +367,28 @@ done:
>  	mutex_remove_waiter(lock, &waiter, current_thread_info());
>  	mutex_set_owner(lock);
>  
> +	if (!__builtin_constant_p(reservation_id)) {
> +		struct ticket_mutex *m;
> +		struct mutex_waiter *cur;
> +		/*
> +		 * this should get optimized out for the common case,
> +		 * and is only important for _mutex_reserve_lock
> +		 */
> +
> +		m = container_of(lock, struct ticket_mutex, base);
> +		atomic_long_set(&m->reservation_id, reservation_id);
> +
> +		/*
> +		 * give any possible sleeping processes the chance to wake up,
> +		 * so they can recheck if they have to back off from
> +		 * reservations
> +		 */
> +		list_for_each_entry(cur, &lock->wait_list, list) {
> +			debug_mutex_wake_waiter(lock, cur);
> +			wake_up_process(cur->task);
> +		}
> +	}
> +
>  	/* set it to 0 if there are no waiters left: */
>  	if (likely(list_empty(&lock->wait_list)))
>  		atomic_set(&lock->count, 0);
> @@ -260,6 +399,19 @@ done:
>  	preempt_enable();
>  
>  	return 0;
> +
> +err:
> +	mutex_remove_waiter(lock, &waiter, task_thread_info(task));
> +	spin_unlock_mutex(&lock->wait_lock, flags);
> +	debug_mutex_free_waiter(&waiter);
> +
> +#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
> +err_nowait:
> +#endif
> +	mutex_release(&lock->dep_map, 1, ip);
> +
> +	preempt_enable();
> +	return ret;
>  }
>  
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
> @@ -267,7 +419,8 @@ void __sched
>  mutex_lock_nested(struct mutex *lock, unsigned int subclass)
>  {
>  	might_sleep();
> -	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, subclass, NULL, _RET_IP_);
> +	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
> +			    subclass, NULL, _RET_IP_, 0, 0);
>  }
>  
>  EXPORT_SYMBOL_GPL(mutex_lock_nested);
> @@ -276,7 +429,8 @@ void __sched
>  _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest)
>  {
>  	might_sleep();
> -	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0, nest, _RET_IP_);
> +	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
> +			    0, nest, _RET_IP_, 0, 0);
>  }
>  
>  EXPORT_SYMBOL_GPL(_mutex_lock_nest_lock);
> @@ -285,7 +439,8 @@ int __sched
>  mutex_lock_killable_nested(struct mutex *lock, unsigned int subclass)
>  {
>  	might_sleep();
> -	return __mutex_lock_common(lock, TASK_KILLABLE, subclass, NULL, _RET_IP_);
> +	return __mutex_lock_common(lock, TASK_KILLABLE,
> +				   subclass, NULL, _RET_IP_, 0, 0);
>  }
>  EXPORT_SYMBOL_GPL(mutex_lock_killable_nested);
>  
> @@ -294,10 +449,63 @@ mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
>  {
>  	might_sleep();
>  	return __mutex_lock_common(lock, TASK_INTERRUPTIBLE,
> -				   subclass, NULL, _RET_IP_);
> +				   subclass, NULL, _RET_IP_, 0, 0);
>  }
>  
>  EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
> +
> +int __sched
> +_mutex_reserve_lock(struct ticket_mutex *lock, struct lockdep_map *nest,
> +		    unsigned long reservation_id)
> +{
> +	DEBUG_LOCKS_WARN_ON(!reservation_id);
> +
> +	might_sleep();
> +	return __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
> +				   0, nest, _RET_IP_, reservation_id, 0);
> +}
> +EXPORT_SYMBOL_GPL(_mutex_reserve_lock);
> +
> +
> +int __sched
> +_mutex_reserve_lock_interruptible(struct ticket_mutex *lock,
> +				  struct lockdep_map *nest,
> +				  unsigned long reservation_id)
> +{
> +	DEBUG_LOCKS_WARN_ON(!reservation_id);
> +
> +	might_sleep();
> +	return __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE,
> +				   0, nest, _RET_IP_, reservation_id, 0);
> +}
> +EXPORT_SYMBOL_GPL(_mutex_reserve_lock_interruptible);
> +
> +void __sched
> +_mutex_reserve_lock_slow(struct ticket_mutex *lock, struct lockdep_map *nest,
> +			 unsigned long reservation_id)
> +{
> +	DEBUG_LOCKS_WARN_ON(!reservation_id);
> +
> +	might_sleep();
> +	__mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE, 0,
> +			    nest, _RET_IP_, reservation_id, 1);
> +}
> +EXPORT_SYMBOL_GPL(_mutex_reserve_lock_slow);
> +
> +int __sched
> +_mutex_reserve_lock_intr_slow(struct ticket_mutex *lock,
> +			      struct lockdep_map *nest,
> +			      unsigned long reservation_id)
> +{
> +	DEBUG_LOCKS_WARN_ON(!reservation_id);
> +
> +	might_sleep();
> +	return __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE, 0,
> +				   nest, _RET_IP_, reservation_id, 1);
> +}
> +EXPORT_SYMBOL_GPL(_mutex_reserve_lock_intr_slow);
> +
> +
>  #endif
>  
>  /*
> @@ -400,7 +608,8 @@ __mutex_lock_slowpath(atomic_t *lock_count)
>  {
>  	struct mutex *lock = container_of(lock_count, struct mutex, count);
>  
> -	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0, NULL, _RET_IP_);
> +	__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0,
> +			    NULL, _RET_IP_, 0, 0);
>  }
>  
>  static noinline int __sched
> @@ -408,7 +617,8 @@ __mutex_lock_killable_slowpath(atomic_t *lock_count)
>  {
>  	struct mutex *lock = container_of(lock_count, struct mutex, count);
>  
> -	return __mutex_lock_common(lock, TASK_KILLABLE, 0, NULL, _RET_IP_);
> +	return __mutex_lock_common(lock, TASK_KILLABLE, 0,
> +				   NULL, _RET_IP_, 0, 0);
>  }
>  
>  static noinline int __sched
> @@ -416,8 +626,28 @@ __mutex_lock_interruptible_slowpath(atomic_t *lock_count)
>  {
>  	struct mutex *lock = container_of(lock_count, struct mutex, count);
>  
> -	return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0, NULL, _RET_IP_);
> +	return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0,
> +				   NULL, _RET_IP_, 0, 0);
> +}
> +
> +static noinline int __sched
> +__mutex_lock_reserve_slowpath(atomic_t *lock_count, void *rid)
> +{
> +	struct mutex *lock = container_of(lock_count, struct mutex, count);
> +
> +	return __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0,
> +				   NULL, _RET_IP_, (unsigned long)rid, 0);
> +}
> +
> +static noinline int __sched
> +__mutex_lock_interruptible_reserve_slowpath(atomic_t *lock_count, void *rid)
> +{
> +	struct mutex *lock = container_of(lock_count, struct mutex, count);
> +
> +	return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0,
> +				   NULL, _RET_IP_, (unsigned long)rid, 0);
>  }
> +
>  #endif
>  
>  /*
> @@ -473,6 +703,63 @@ int __sched mutex_trylock(struct mutex *lock)
>  }
>  EXPORT_SYMBOL(mutex_trylock);
>  
> +#ifndef CONFIG_DEBUG_LOCK_ALLOC
> +int __sched
> +_mutex_reserve_lock(struct ticket_mutex *lock, unsigned long rid)
> +{
> +	int ret;
> +
> +	might_sleep();
> +
> +	ret = __mutex_fastpath_lock_retval_arg(&lock->base.count, (void *)rid,
> +					__mutex_lock_reserve_slowpath);
> +
> +	if (!ret) {
> +		mutex_set_reservation_fastpath(lock, rid, true);
> +		mutex_set_owner(&lock->base);
> +	}
> +	return ret;
> +}
> +EXPORT_SYMBOL(_mutex_reserve_lock);
> +
> +int __sched
> +_mutex_reserve_lock_interruptible(struct ticket_mutex *lock, unsigned long rid)
> +{
> +	int ret;
> +
> +	might_sleep();
> +
> +	ret = __mutex_fastpath_lock_retval_arg(&lock->base.count, (void *)rid,
> +				__mutex_lock_interruptible_reserve_slowpath);
> +
> +	if (!ret) {
> +		mutex_set_reservation_fastpath(lock, rid, true);
> +		mutex_set_owner(&lock->base);
> +	}
> +	return ret;
> +}
> +EXPORT_SYMBOL(_mutex_reserve_lock_interruptible);
> +
> +void __sched
> +_mutex_reserve_lock_slow(struct ticket_mutex *lock, unsigned long rid)
> +{
> +	might_sleep();
> +	__mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
> +			    0, NULL, _RET_IP_, rid, 1);
> +}
> +EXPORT_SYMBOL(_mutex_reserve_lock_slow);
> +
> +int __sched
> +_mutex_reserve_lock_intr_slow(struct ticket_mutex *lock, unsigned long rid)
> +{
> +	might_sleep();
> +	return __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE,
> +				   0, NULL, _RET_IP_, rid, 1);
> +}
> +EXPORT_SYMBOL(_mutex_reserve_lock_intr_slow);
> +
> +#endif
> +
>  /**
>   * atomic_dec_and_mutex_lock - return holding mutex if we dec to 0
>   * @cnt: the atomic which we are to dec


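Taken together, the intended calling pattern for these reservation locks
looks roughly like the sketch below. The mutex_reserve_lock() /
mutex_reserve_lock_slow() wrappers and the unlock_all_reserved() helper
are assumed names for illustration only; only the underscore-prefixed
functions appear in the patch itself, and real code would add proper
error handling.

static int reserve_all(struct list_head *buffers, unsigned long rid)
{
	struct buffer *buf, *contended = NULL;
	int ret;

retry:
	list_for_each_entry(buf, buffers, head) {
		if (buf == contended)	/* already taken by the slowpath below */
			continue;

		ret = mutex_reserve_lock(&buf->lock, rid);	/* wrapper name assumed */
		if (ret) {
			/* contention: back off, dropping every lock taken so
			 * far (including a previously contended one) ... */
			unlock_all_reserved(buffers, buf, contended);	/* assumed helper */
			/* ... then sleep on the contended lock and retry */
			mutex_reserve_lock_slow(&buf->lock, rid);	/* wrapper name assumed */
			contended = buf;
			goto retry;
		}
	}
	return 0;
}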

* Re: [PATCH 1/7] arch: add __mutex_fastpath_lock_retval_arg to generic/sh/x86/powerpc/ia64
  2013-01-15 12:33 ` [PATCH 1/7] arch: add __mutex_fastpath_lock_retval_arg to generic/sh/x86/powerpc/ia64 Maarten Lankhorst
@ 2013-01-15 13:49   ` Maarten Lankhorst
  0 siblings, 0 replies; 32+ messages in thread
From: Maarten Lankhorst @ 2013-01-15 13:49 UTC (permalink / raw)
  To: Maarten Lankhorst; +Cc: dri-devel, linux-kernel, linux-media, linaro-mm-sig

Again, missing entry :(

Op 15-01-13 13:33, Maarten Lankhorst schreef:
> Needed for reservation slowpath.

I was hoping to convert the 'mutexes' in ttm to proper mutexes, so I extended the
core mutex code slightly to add support for reservations. However, this requires
passing an argument to __mutex_fastpath_lock_retval, so I added such a variant for
all archs, based on their existing __mutex_fastpath_lock_retval implementations.

I'm guessing this would have to be split up in the final version, but for now it's
easier to edit as a single patch.
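
For context, the core mutex patch quoted earlier in this thread ends up using
the new helper like this (condensed here, with comments added):

int __sched
_mutex_reserve_lock(struct ticket_mutex *lock, unsigned long rid)
{
	int ret;

	might_sleep();

	/* fastpath: try the 1 -> 0 count transition; if it fails, the
	 * reservation id is handed to the slowpath as the extra argument */
	ret = __mutex_fastpath_lock_retval_arg(&lock->base.count, (void *)rid,
					__mutex_lock_reserve_slowpath);

	if (!ret) {
		/* fastpath took the lock: publish the reservation id */
		mutex_set_reservation_fastpath(lock, rid, true);
		mutex_set_owner(&lock->base);
	}
	return ret;
}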

> ---
>  arch/ia64/include/asm/mutex.h    | 20 ++++++++++++++++++++
>  arch/powerpc/include/asm/mutex.h | 20 ++++++++++++++++++++
>  arch/sh/include/asm/mutex-llsc.h | 20 ++++++++++++++++++++
>  arch/x86/include/asm/mutex_32.h  | 20 ++++++++++++++++++++
>  arch/x86/include/asm/mutex_64.h  | 20 ++++++++++++++++++++
>  include/asm-generic/mutex-dec.h  | 20 ++++++++++++++++++++
>  include/asm-generic/mutex-null.h |  1 +
>  include/asm-generic/mutex-xchg.h | 21 +++++++++++++++++++++
>  8 files changed, 142 insertions(+)
>
> diff --git a/arch/ia64/include/asm/mutex.h b/arch/ia64/include/asm/mutex.h
> index bed73a6..2510058 100644
> --- a/arch/ia64/include/asm/mutex.h
> +++ b/arch/ia64/include/asm/mutex.h
> @@ -44,6 +44,26 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
>  }
>  
>  /**
> + *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
> + *                                     from 1 to a 0 value
> + *  @count: pointer of type atomic_t
> + *  @arg: argument to pass along if fastpath fails.
> + *  @fail_fn: function to call if the original value was not 1
> + *
> + * Change the count from 1 to a value lower than 1, and call <fail_fn> if
> + * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
> + * or anything the slow path function returns.
> + */
> +static inline int __mutex_fastpath_lock_retval_arg(atomic_t *count,
> +				void *arg, int (*fail_fn)(atomic_t *, void*))
> +{
> +	if (unlikely(ia64_fetchadd4_acq(count, -1) != 1))
> +		return fail_fn(count, arg);
> +	else
> +		return 0;
> +}
> +
> +/**
>   *  __mutex_fastpath_unlock - try to promote the count from 0 to 1
>   *  @count: pointer of type atomic_t
>   *  @fail_fn: function to call if the original value was not 0
> diff --git a/arch/powerpc/include/asm/mutex.h b/arch/powerpc/include/asm/mutex.h
> index 5399f7e..df4bcff 100644
> --- a/arch/powerpc/include/asm/mutex.h
> +++ b/arch/powerpc/include/asm/mutex.h
> @@ -97,6 +97,26 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
>  }
>  
>  /**
> + *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
> + *                                     from 1 to a 0 value
> + *  @count: pointer of type atomic_t
> + *  @arg: argument to pass along if fastpath fails.
> + *  @fail_fn: function to call if the original value was not 1
> + *
> + * Change the count from 1 to a value lower than 1, and call <fail_fn> if
> + * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
> + * or anything the slow path function returns.
> + */
> +static inline int __mutex_fastpath_lock_retval_arg(atomic_t *count,
> +				void *arg, int (*fail_fn)(atomic_t *, void*))
> +{
> +	if (unlikely(__mutex_dec_return_lock(count) < 0))
> +		return fail_fn(count, arg);
> +	else
> +		return 0;
> +}
> +
> +/**
>   *  __mutex_fastpath_unlock - try to promote the count from 0 to 1
>   *  @count: pointer of type atomic_t
>   *  @fail_fn: function to call if the original value was not 0
> diff --git a/arch/sh/include/asm/mutex-llsc.h b/arch/sh/include/asm/mutex-llsc.h
> index 090358a..b68dd6d 100644
> --- a/arch/sh/include/asm/mutex-llsc.h
> +++ b/arch/sh/include/asm/mutex-llsc.h
> @@ -56,6 +56,26 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
>  	return __res;
>  }
>  
> +static inline int __mutex_fastpath_lock_retval_arg(atomic_t *count,
> +				void *arg, int (*fail_fn)(atomic_t *, void *))
> +{
> +	int __done, __res;
> +
> +	__asm__ __volatile__ (
> +		"movli.l	@%2, %0	\n"
> +		"add		#-1, %0	\n"
> +		"movco.l	%0, @%2	\n"
> +		"movt		%1	\n"
> +		: "=&z" (__res), "=&r" (__done)
> +		: "r" (&(count)->counter)
> +		: "t");
> +
> +	if (unlikely(!__done || __res != 0))
> +		__res = fail_fn(count, arg);
> +
> +	return __res;
> +}
> +
>  static inline void
>  __mutex_fastpath_unlock(atomic_t *count, void (*fail_fn)(atomic_t *))
>  {
> diff --git a/arch/x86/include/asm/mutex_32.h b/arch/x86/include/asm/mutex_32.h
> index 03f90c8..34f77f9 100644
> --- a/arch/x86/include/asm/mutex_32.h
> +++ b/arch/x86/include/asm/mutex_32.h
> @@ -58,6 +58,26 @@ static inline int __mutex_fastpath_lock_retval(atomic_t *count,
>  }
>  
>  /**
> + *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
> + *                                     from 1 to a 0 value
> + *  @count: pointer of type atomic_t
> + *  @arg: argument to pass along if fastpath fails.
> + *  @fail_fn: function to call if the original value was not 1
> + *
> + * Change the count from 1 to a value lower than 1, and call <fail_fn> if
> + * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
> + * or anything the slow path function returns.
> + */
> +static inline int __mutex_fastpath_lock_retval_arg(atomic_t *count,
> +				void *arg, int (*fail_fn)(atomic_t *, void*))
> +{
> +	if (unlikely(atomic_dec_return(count) < 0))
> +		return fail_fn(count, arg);
> +	else
> +		return 0;
> +}
> +
> +/**
>   *  __mutex_fastpath_unlock - try to promote the mutex from 0 to 1
>   *  @count: pointer of type atomic_t
>   *  @fail_fn: function to call if the original value was not 0
> diff --git a/arch/x86/include/asm/mutex_64.h b/arch/x86/include/asm/mutex_64.h
> index 68a87b0..148249e 100644
> --- a/arch/x86/include/asm/mutex_64.h
> +++ b/arch/x86/include/asm/mutex_64.h
> @@ -53,6 +53,26 @@ static inline int __mutex_fastpath_lock_retval(atomic_t *count,
>  }
>  
>  /**
> + *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
> + *                                     from 1 to a 0 value
> + *  @count: pointer of type atomic_t
> + *  @arg: argument to pass along if fastpath fails.
> + *  @fail_fn: function to call if the original value was not 1
> + *
> + * Change the count from 1 to a value lower than 1, and call <fail_fn> if
> + * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
> + * or anything the slow path function returns.
> + */
> +static inline int __mutex_fastpath_lock_retval_arg(atomic_t *count,
> +				void *arg, int (*fail_fn)(atomic_t *, void*))
> +{
> +	if (unlikely(atomic_dec_return(count) < 0))
> +		return fail_fn(count, arg);
> +	else
> +		return 0;
> +}
> +
> +/**
>   * __mutex_fastpath_unlock - increment and call function if nonpositive
>   * @v: pointer of type atomic_t
>   * @fail_fn: function to call if the result is nonpositive
> diff --git a/include/asm-generic/mutex-dec.h b/include/asm-generic/mutex-dec.h
> index f104af7..f5d027e 100644
> --- a/include/asm-generic/mutex-dec.h
> +++ b/include/asm-generic/mutex-dec.h
> @@ -43,6 +43,26 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
>  }
>  
>  /**
> + *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
> + *                                     from 1 to a 0 value
> + *  @count: pointer of type atomic_t
> + *  @arg: argument to pass along if fastpath fails.
> + *  @fail_fn: function to call if the original value was not 1
> + *
> + * Change the count from 1 to a value lower than 1, and call <fail_fn> if
> + * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
> + * or anything the slow path function returns.
> + */
> +static inline int
> +__mutex_fastpath_lock_retval_arg(atomic_t *count, void *arg,
> +				 int (*fail_fn)(atomic_t *, void*))
> +{
> +	if (unlikely(atomic_dec_return(count) < 0))
> +		return fail_fn(count, arg);
> +	return 0;
> +}
> +
> +/**
>   *  __mutex_fastpath_unlock - try to promote the count from 0 to 1
>   *  @count: pointer of type atomic_t
>   *  @fail_fn: function to call if the original value was not 0
> diff --git a/include/asm-generic/mutex-null.h b/include/asm-generic/mutex-null.h
> index e1bbbc7..991e9c3 100644
> --- a/include/asm-generic/mutex-null.h
> +++ b/include/asm-generic/mutex-null.h
> @@ -12,6 +12,7 @@
>  
>  #define __mutex_fastpath_lock(count, fail_fn)		fail_fn(count)
>  #define __mutex_fastpath_lock_retval(count, fail_fn)	fail_fn(count)
> +#define __mutex_fastpath_lock_retval_arg(count, arg, fail_fn)	fail_fn(count, arg)
>  #define __mutex_fastpath_unlock(count, fail_fn)		fail_fn(count)
>  #define __mutex_fastpath_trylock(count, fail_fn)	fail_fn(count)
>  #define __mutex_slowpath_needs_to_unlock()		1
> diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
> index c04e0db..d9cc971 100644
> --- a/include/asm-generic/mutex-xchg.h
> +++ b/include/asm-generic/mutex-xchg.h
> @@ -55,6 +55,27 @@ __mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
>  }
>  
>  /**
> + *  __mutex_fastpath_lock_retval_arg - try to take the lock by moving the count
> + *                                     from 1 to a 0 value
> + *  @count: pointer of type atomic_t
> + *  @arg: argument to pass along if fastpath fails.
> + *  @fail_fn: function to call if the original value was not 1
> + *
> + * Change the count from 1 to a value lower than 1, and call <fail_fn> if
> + * it wasn't 1 originally. This function returns 0 if the fastpath succeeds,
> + * or anything the slow path function returns.
> + */
> +static inline int
> +__mutex_fastpath_lock_retval_arg(atomic_t *count, void *arg,
> +				 int (*fail_fn)(atomic_t *, void*))
> +{
> +	if (unlikely(atomic_xchg(count, 0) != 1))
> +		if (likely(atomic_xchg(count, -1) != 1))
> +			return fail_fn(count, arg);
> +	return 0;
> +}
> +
> +/**
>   *  __mutex_fastpath_unlock - try to promote the mutex from 0 to 1
>   *  @count: pointer of type atomic_t
>   *  @fail_fn: function to call if the original value was not 0



* Re: [Linaro-mm-sig] [PATCH 5/7] seqno-fence: Hardware dma-buf implementation of fencing (v4)
  2013-01-15 12:34 ` [PATCH 5/7] seqno-fence: Hardware dma-buf implementation of fencing (v4) Maarten Lankhorst
@ 2013-01-16  6:28   ` Inki Dae
  2013-01-16 10:36     ` Maarten Lankhorst
  0 siblings, 1 reply; 32+ messages in thread
From: Inki Dae @ 2013-01-16  6:28 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: dri-devel, linux-kernel, linux-media, linaro-mm-sig, Maarten Lankhorst

2013/1/15 Maarten Lankhorst <m.b.lankhorst@gmail.com>:
> This type of fence can be used with hardware synchronization for simple
> hardware that can block execution until the condition
> (dma_buf[offset] - value) >= 0 has been met.
>
> A software fallback still has to be provided in case the fence is used
> with a device that doesn't support this mechanism. It is useful to expose
> this for graphics cards that have an op to support this.
>
> Some cards like i915 can export those, but don't have an option to wait,
> so they need the software fallback.
>
> I extended the original patch by Rob Clark.
>
> v1: Original
> v2: Renamed from bikeshed to seqno, moved into dma-fence.c since
>     not much was left of the file. Lots of documentation added.
> v3: Use fence_ops instead of custom callbacks. Moved to own file
>     to avoid circular dependency between dma-buf.h and fence.h
> v4: Add spinlock pointer to seqno_fence_init
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> ---
>  Documentation/DocBook/device-drivers.tmpl |   1 +
>  drivers/base/fence.c                      |  38 +++++++++++
>  include/linux/seqno-fence.h               | 105 ++++++++++++++++++++++++++++++
>  3 files changed, 144 insertions(+)
>  create mode 100644 include/linux/seqno-fence.h
>
> diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
> index 6f53fc0..ad14396 100644
> --- a/Documentation/DocBook/device-drivers.tmpl
> +++ b/Documentation/DocBook/device-drivers.tmpl
> @@ -128,6 +128,7 @@ X!Edrivers/base/interface.c
>  !Edrivers/base/dma-buf.c
>  !Edrivers/base/fence.c
>  !Iinclude/linux/fence.h
> +!Iinclude/linux/seqno-fence.h
>  !Edrivers/base/dma-coherent.c
>  !Edrivers/base/dma-mapping.c
>       </sect1>
> diff --git a/drivers/base/fence.c b/drivers/base/fence.c
> index 28e5ffd..1d3f29c 100644
> --- a/drivers/base/fence.c
> +++ b/drivers/base/fence.c
> @@ -24,6 +24,7 @@
>  #include <linux/slab.h>
>  #include <linux/export.h>
>  #include <linux/fence.h>
> +#include <linux/seqno-fence.h>
>
>  atomic_t fence_context_counter = ATOMIC_INIT(0);
>  EXPORT_SYMBOL(fence_context_counter);
> @@ -284,3 +285,40 @@ out:
>         return ret;
>  }
>  EXPORT_SYMBOL(fence_default_wait);
> +
> +static bool seqno_enable_signaling(struct fence *fence)
> +{
> +       struct seqno_fence *seqno_fence = to_seqno_fence(fence);
> +       return seqno_fence->ops->enable_signaling(fence);
> +}
> +
> +static bool seqno_signaled(struct fence *fence)
> +{
> +       struct seqno_fence *seqno_fence = to_seqno_fence(fence);
> +       return seqno_fence->ops->signaled && seqno_fence->ops->signaled(fence);
> +}
> +
> +static void seqno_release(struct fence *fence)
> +{
> +       struct seqno_fence *f = to_seqno_fence(fence);
> +
> +       dma_buf_put(f->sync_buf);
> +       if (f->ops->release)
> +               f->ops->release(fence);
> +       else
> +               kfree(f);
> +}
> +
> +static long seqno_wait(struct fence *fence, bool intr, signed long timeout)
> +{
> +       struct seqno_fence *f = to_seqno_fence(fence);
> +       return f->ops->wait(fence, intr, timeout);
> +}
> +
> +const struct fence_ops seqno_fence_ops = {
> +       .enable_signaling = seqno_enable_signaling,
> +       .signaled = seqno_signaled,
> +       .wait = seqno_wait,
> +       .release = seqno_release
> +};
> +EXPORT_SYMBOL_GPL(seqno_fence_ops);
> diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h
> new file mode 100644
> index 0000000..603adc0
> --- /dev/null
> +++ b/include/linux/seqno-fence.h
> @@ -0,0 +1,105 @@
> +/*
> + * seqno-fence, using a dma-buf to synchronize fencing
> + *
> + * Copyright (C) 2012 Texas Instruments
> + * Copyright (C) 2012 Canonical Ltd
> + * Authors:
> + *   Rob Clark <rob.clark@linaro.org>
> + *   Maarten Lankhorst <maarten.lankhorst@canonical.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published by
> + * the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef __LINUX_SEQNO_FENCE_H
> +#define __LINUX_SEQNO_FENCE_H
> +
> +#include <linux/fence.h>
> +#include <linux/dma-buf.h>
> +
> +struct seqno_fence {
> +       struct fence base;
> +
> +       const struct fence_ops *ops;
> +       struct dma_buf *sync_buf;
> +       uint32_t seqno_ofs;
> +};

Hi Maarten,

I'm applying dma-fence v11 and seqno-fence v4 to exynos drm and have
some proposals.

The above seqno_fence structure has only one dmabuf. Shouldn't it have
multiple dmabufs? For example, in the case of a drm driver, when a pageflip
is requested, one framebuffer could have more than one gem buffer (for the
NV12M format), and that means more than one exported dmabuf should be
synchronized with other devices. Below is a simple structure for it:

struct seqno_fence_dmabuf {
        struct list_head        list;
        int                     id;
        struct dma_buf          *sync_buf;
        uint32_t                seqno_ops;
        uint32_t                seqno;
};

The member id could be used to identify which device sync_buf is going
to be accessed by. In the case of a drm driver, one framebuffer could be
accessed by more than one device: one is the display controller and
another is the HDMI controller. So id would hold the crtc number.

And the seqno_fence structure could be defined like below:

struct seqno_fence {
        struct list_head        sync_buf_list;
        struct fence            base;
        const struct fence_ops  *ops;
};

In addition, I have implemented a fence-helper framework for sw sync as
a WIP, and below are the interfaces for it:

struct fence_helper {
        struct list_head                entries;
        struct reservation_ticket       ticket;
        struct seqno_fence              *sf;
        spinlock_t                      lock;
        void                            *priv;
};

int fence_helper_init(struct fence_helper *fh, void *priv, void
(*release)(struct fence *fence));
- This function is called at driver open time so that each
process-unique context gets a new seqno_fence instance. It just calls
seqno_fence_init, initializes the entries list and sets the
device-specific fence release callback.

bool fence_helper_check_sync_buf(struct fence_helper *fh, struct
dma_buf *sync_buf, int id);
- This function is called before dma is started and checks whether the
same sync_bufs have already been committed to the reservation_object,
bo->fence_shared[n]. And id could be used to identify which device
sync_buf is going to be accessed by.

int fence_helper_add_sync_buf(struct fence_helper *fh, struct dma_buf
*sync_buf, int id);
- This function is called if fence_helper_check_sync_buf() returns true
and adds sync_buf to the seqno_fence's sync_buf_list, wrapped in a
seqno_fence_dmabuf structure. With this function call, one seqno_fence
instance can have multiple sync_bufs. At this time, a reference to this
sync_buf is taken.

void fence_helper_del_sync_buf(struct fence_helper *fh, int id);
- This function is called if some operation fails after the
fence_helper_add_sync_buf call, to release the relevant resources.

int fence_helper_init_reservation_entry(struct fence_helper *fh,
struct dma_buf *dmabuf, bool shared, int id);
- This function is called after the fence_helper_add_sync_buf call and
calls the reservation_entry_init function to set the reservation object
of sync_buf on a new reservation_entry object. The new reservation_entry
is then added to fh->entries to track all sync_bufs this device is going
to access.

void fence_helper_fini_reservation_entry(struct fence_helper *fh, int id);
- This function is called if some operation fails after the
fence_helper_init_reservation_entry call, to release the relevant resources.

int fence_helper_ticket_reserve(struct fence_helper *fh, int id);
- This function is called after the fence_helper_init_reservation_entry
call and calls the ticket_reserve function to reserve a ticket (taking
the lock for each reservation entry in fh->entries).

void fence_helper_ticket_commit(struct fence_helper *fh, int id);
- This function is called after fence_helper_ticket_reserve() to commit
this device's fence to the reservation_object of each sync_buf and to
unlock each reservation entry in fh->entries. After that, once other
devices try to access these buffers, they will be blocked.

int fence_helper_wait(struct fence_helper *fh, struct dma_buf *dmabuf,
bool intr);
- This function is called before fence_helper_add_sync_buf() is called
to wait for a signal from other devices.

int fence_helper_signal(struct fence_helper *fh, int id);
- This function is called by the device's interrupt handler (or similar)
when dma access to this buffer has been completed, and calls
fence_signal() on each fence registered to each reservation object in
fh->entries to notify the other devices of dma access completion. At
this point the blocked devices are woken up and can proceed to their
next step.

In more detail, this function also does the following:
- deletes each reservation entry in fh->entries.
- releases each seqno_fence_dmabuf object in the seqno_fence's
sync_buf_list and calls dma_buf_put() to drop the reference to the
dmabuf.


The fence-helper framework is still a WIP, so I may be missing some
points. If that is ok with you, I'd like to post it as an RFC.

Thanks,
Inki Dae


> +
> +extern const struct fence_ops seqno_fence_ops;
> +
> +/**
> + * to_seqno_fence - cast a fence to a seqno_fence
> + * @fence: fence to cast to a seqno_fence
> + *
> + * Returns NULL if the fence is not a seqno_fence,
> + * or the seqno_fence otherwise.
> + */
> +static inline struct seqno_fence *
> +to_seqno_fence(struct fence *fence)
> +{
> +       if (fence->ops != &seqno_fence_ops)
> +               return NULL;
> +       return container_of(fence, struct seqno_fence, base);
> +}
> +
> +/**
> + * seqno_fence_init - initialize a seqno fence
> + * @fence: seqno_fence to initialize
> + * @lock: pointer to spinlock to use for fence
> + * @sync_buf: buffer containing the memory location to signal on
> + * @context: the execution context this fence is a part of
> + * @seqno_ofs: the offset within @sync_buf
> + * @seqno: the sequence # to signal on
> + * @ops: the fence_ops for operations on this seqno fence
> + *
> + * This function initializes a struct seqno_fence with passed parameters,
> + * and takes a reference on sync_buf which is released on fence destruction.
> + *
> + * A seqno_fence is a dma_fence which can complete in software when
> + * enable_signaling is called, but it also completes when
> + * (s32)((sync_buf)[seqno_ofs] - seqno) >= 0 is true
> + *
> + * The seqno_fence will take a refcount on the sync_buf until it's
> + * destroyed, but actual lifetime of sync_buf may be longer if one of the
> + * callers take a reference to it.
> + *
> + * Certain hardware have instructions to insert this type of wait condition
> + * in the command stream, so no intervention from software would be needed.
> + * This type of fence can be destroyed before completed, however a reference
> + * on the sync_buf dma-buf can be taken. It is encouraged to re-use the same
> + * dma-buf for sync_buf, since mapping or unmapping the sync_buf to the
> + * device's vm can be expensive.
> + *
> + * It is recommended for creators of seqno_fence to call fence_signal
> + * before destruction. This will prevent possible issues from wraparound at
> + * time of issue vs time of check, since users can check fence_is_signaled
> + * before submitting instructions for the hardware to wait on the fence.
> + * However, when ops.enable_signaling is not called, it doesn't have to be
> + * done as soon as possible, just before there's any real danger of seqno
> + * wraparound.
> + */
> +static inline void
> +seqno_fence_init(struct seqno_fence *fence, spinlock_t *lock,
> +                struct dma_buf *sync_buf,  uint32_t context, uint32_t seqno_ofs,
> +                uint32_t seqno, const struct fence_ops *ops)
> +{
> +       BUG_ON(!fence || !sync_buf || !ops->enable_signaling || !ops->wait);
> +
> +       __fence_init(&fence->base, &seqno_fence_ops, lock, context, seqno);
> +
> +       get_dma_buf(sync_buf);
> +       fence->ops = ops;
> +       fence->sync_buf = sync_buf;
> +       fence->seqno_ofs = seqno_ofs;
> +}
> +
> +#endif /* __LINUX_SEQNO_FENCE_H */
> --
> 1.8.0.3
>
>
> _______________________________________________
> Linaro-mm-sig mailing list
> Linaro-mm-sig@lists.linaro.org
> http://lists.linaro.org/mailman/listinfo/linaro-mm-sig


* Re: [Linaro-mm-sig] [PATCH 5/7] seqno-fence: Hardware dma-buf implementation of fencing (v4)
  2013-01-16  6:28   ` [Linaro-mm-sig] " Inki Dae
@ 2013-01-16 10:36     ` Maarten Lankhorst
  2013-01-16 12:00       ` Inki Dae
  2013-01-24 14:52       ` Inki Dae
  0 siblings, 2 replies; 32+ messages in thread
From: Maarten Lankhorst @ 2013-01-16 10:36 UTC (permalink / raw)
  To: Inki Dae
  Cc: Maarten Lankhorst, dri-devel, linux-kernel, linux-media, linaro-mm-sig

Op 16-01-13 07:28, Inki Dae schreef:
> 2013/1/15 Maarten Lankhorst <m.b.lankhorst@gmail.com>:
>> This type of fence can be used with hardware synchronization for simple
>> hardware that can block execution until the condition
>> (dma_buf[offset] - value) >= 0 has been met.
>>
>> A software fallback still has to be provided in case the fence is used
>> with a device that doesn't support this mechanism. It is useful to expose
>> this for graphics cards that have an op to support this.
>>
>> Some cards like i915 can export those, but don't have an option to wait,
>> so they need the software fallback.
>>
>> I extended the original patch by Rob Clark.
>>
>> v1: Original
>> v2: Renamed from bikeshed to seqno, moved into dma-fence.c since
>>     not much was left of the file. Lots of documentation added.
>> v3: Use fence_ops instead of custom callbacks. Moved to own file
>>     to avoid circular dependency between dma-buf.h and fence.h
>> v4: Add spinlock pointer to seqno_fence_init
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> ---
>>  Documentation/DocBook/device-drivers.tmpl |   1 +
>>  drivers/base/fence.c                      |  38 +++++++++++
>>  include/linux/seqno-fence.h               | 105 ++++++++++++++++++++++++++++++
>>  3 files changed, 144 insertions(+)
>>  create mode 100644 include/linux/seqno-fence.h
>>
>> diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
>> index 6f53fc0..ad14396 100644
>> --- a/Documentation/DocBook/device-drivers.tmpl
>> +++ b/Documentation/DocBook/device-drivers.tmpl
>> @@ -128,6 +128,7 @@ X!Edrivers/base/interface.c
>>  !Edrivers/base/dma-buf.c
>>  !Edrivers/base/fence.c
>>  !Iinclude/linux/fence.h
>> +!Iinclude/linux/seqno-fence.h
>>  !Edrivers/base/dma-coherent.c
>>  !Edrivers/base/dma-mapping.c
>>       </sect1>
>> diff --git a/drivers/base/fence.c b/drivers/base/fence.c
>> index 28e5ffd..1d3f29c 100644
>> --- a/drivers/base/fence.c
>> +++ b/drivers/base/fence.c
>> @@ -24,6 +24,7 @@
>>  #include <linux/slab.h>
>>  #include <linux/export.h>
>>  #include <linux/fence.h>
>> +#include <linux/seqno-fence.h>
>>
>>  atomic_t fence_context_counter = ATOMIC_INIT(0);
>>  EXPORT_SYMBOL(fence_context_counter);
>> @@ -284,3 +285,40 @@ out:
>>         return ret;
>>  }
>>  EXPORT_SYMBOL(fence_default_wait);
>> +
>> +static bool seqno_enable_signaling(struct fence *fence)
>> +{
>> +       struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>> +       return seqno_fence->ops->enable_signaling(fence);
>> +}
>> +
>> +static bool seqno_signaled(struct fence *fence)
>> +{
>> +       struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>> +       return seqno_fence->ops->signaled && seqno_fence->ops->signaled(fence);
>> +}
>> +
>> +static void seqno_release(struct fence *fence)
>> +{
>> +       struct seqno_fence *f = to_seqno_fence(fence);
>> +
>> +       dma_buf_put(f->sync_buf);
>> +       if (f->ops->release)
>> +               f->ops->release(fence);
>> +       else
>> +               kfree(f);
>> +}
>> +
>> +static long seqno_wait(struct fence *fence, bool intr, signed long timeout)
>> +{
>> +       struct seqno_fence *f = to_seqno_fence(fence);
>> +       return f->ops->wait(fence, intr, timeout);
>> +}
>> +
>> +const struct fence_ops seqno_fence_ops = {
>> +       .enable_signaling = seqno_enable_signaling,
>> +       .signaled = seqno_signaled,
>> +       .wait = seqno_wait,
>> +       .release = seqno_release
>> +};
>> +EXPORT_SYMBOL_GPL(seqno_fence_ops);
>> diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h
>> new file mode 100644
>> index 0000000..603adc0
>> --- /dev/null
>> +++ b/include/linux/seqno-fence.h
>> @@ -0,0 +1,105 @@
>> +/*
>> + * seqno-fence, using a dma-buf to synchronize fencing
>> + *
>> + * Copyright (C) 2012 Texas Instruments
>> + * Copyright (C) 2012 Canonical Ltd
>> + * Authors:
>> + *   Rob Clark <rob.clark@linaro.org>
>> + *   Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published by
>> + * the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef __LINUX_SEQNO_FENCE_H
>> +#define __LINUX_SEQNO_FENCE_H
>> +
>> +#include <linux/fence.h>
>> +#include <linux/dma-buf.h>
>> +
>> +struct seqno_fence {
>> +       struct fence base;
>> +
>> +       const struct fence_ops *ops;
>> +       struct dma_buf *sync_buf;
>> +       uint32_t seqno_ofs;
>> +};
> Hi maarten,
>
> I'm applying dma-fence v11 and seqno-fence v4 to exynos drm and have
> some proposals.
>
> The above seqno_fence structure has only one dmabuf. Shouldn't it have
> mutiple dmabufs? For example, in case of drm driver, when pageflip is
> requested, one framebuffer could have one more gem buffer for NV12M
> format. And this means that one more exported dmabufs should be
> sychronized with other devices. Below is simple structure for it,
The fence guards a single operation, so I didn't feel more than one
dma-buf was needed to guard it.

Have you considered simply attaching multiple fences instead, each with its own dma-buf?
There has been some muttering about allowing multiple exclusive fences to be attached, for ARM SoCs.
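
As a rough illustration (names are placeholders, not working driver code),
using the seqno_fence_init() signature from this patch, an NV12M framebuffer
backed by two GEM buffers could simply get one fence per exported dma-buf:

	struct seqno_fence f_luma, f_chroma;

	/* one seqno_fence per dma-buf; dev->fence_lock, ctx, SYNC_OFS,
	 * seqno and drv_seqno_ops are placeholder names */
	seqno_fence_init(&f_luma, &dev->fence_lock, luma_dmabuf,
			 ctx, SYNC_OFS, seqno, &drv_seqno_ops);
	seqno_fence_init(&f_chroma, &dev->fence_lock, chroma_dmabuf,
			 ctx, SYNC_OFS, seqno, &drv_seqno_ops);

	/* each fence is then attached to the reservation object of the
	 * buffer it guards */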

But I'm also considering getting rid of the dma-buf member and adding a function call to retrieve it, since
the sync dma-buf member should not change often, and that would save 2 atomic ops on every fence.
I just want it replaced by something that's not 10x more complicated.

Maybe "int get_sync_dma_buf(fence, old_dma_buf, &new_dma_buf)" that will set new_dma_buf = NULL
if the old_dma_buf is unchanged, and return true + return a new reference to the sync dma_buf if it's not identical to old_dma_buf.
old_dma_buf can also be NULL or a dma_buf that belongs to a different fence->context entirely. It might be capable of
returning an error, in which case the fence would count as being signaled. This could reduce the need for separately checking
fence_is_signaled first.

I think this would allow caching the synchronization dma_buf in a similar way without each fence needing
to hold a reference to the dma_buf all the time, even for fences that are only used internally.
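
To make that concrete, a minimal sketch of what such an interface could look
like; nothing below is implemented, and the name and semantics are only as
speculated above:

/* proposed, not implemented: set *new_dma_buf = NULL and return 0 when
 * old_dma_buf is still the sync buffer, return 1 plus a new reference in
 * *new_dma_buf when it changed, or a negative error, in which case the
 * fence is simply treated as signaled */
int get_sync_dma_buf(struct fence *fence, struct dma_buf *old_dma_buf,
		     struct dma_buf **new_dma_buf);

/* hypothetical caller caching the sync buffer: */
struct dma_buf *new_buf;
int ret = get_sync_dma_buf(fence, cached_buf, &new_buf);

if (ret < 0) {
	/* treat the fence as signaled, nothing to map or wait on */
} else if (new_buf) {
	if (cached_buf)
		dma_buf_put(cached_buf);	/* drop the old cached reference */
	cached_buf = new_buf;			/* keep the new one */
}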

> struct seqno_fence_dmabuf {
>         struct list_head        list;
>         int                            id;
>         struct dmabuf          *sync_buf;
>         uint32_t                   seqno_ops;
>         uint32_t                   seqno;
> };
>
> The member, id, could be used to identify which device sync_buf is
> going to be accessed by. In case of drm driver, one framebuffer could
> be accessed by one more devices, one is Display controller and another
> is HDMI controller. So id would have crtc number.
Why do you need this? The base fence already has a context member.

In fact I don't see why you need a linked list; at worst you'd need a static array, since the number of
dma-bufs should already be known at allocation time.

I would prefer to simply make reservation_object->fence_exclusive an array, since it would be easier to implement,
and there have been some indications that ARM might need such a thing.
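
Roughly something like the following; the count field and the array size are
made up here, and the rest of the reservation_object layout is left out:

struct reservation_object {
	/* ... existing members ... */
	unsigned int fence_exclusive_count;			/* assumed */
	struct fence *fence_exclusive[MAX_EXCLUSIVE_FENCES];	/* was a single pointer */
};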


> And seqno_fence structure could be defined like below,
>
> struct seqno_fence {
>         struct list_head                sync_buf_list;
>         struct fence                      base;
>         const struct fence_ops     *ops;
> };
>
> In addition, I have implemented fence-helper framework for sw sync as
> WIP and below is intefaces for it,
>
> struct fence_helper {
>         struct list_head                    entries;
>         struct reservation_ticket       ticket;
>         struct seqno_fence              *sf;
>         spinlock_t                             lock;
>         void                                      *priv;
> };
>
> int fence_helper_init(struct fence_helper *fh, void *priv, void
> (*remease)(struct fence *fence));
> - This function is called at driver open so process unique context
> would have a new seqno_fence instance. This function does just
> seqno_fence_init call, initialize entries list and set device specific
> fence release callback.
>
> bool fence_helper_check_sync_buf(struct fence_helper *fh, struct
> dma_buf *sync_buf, int id);
> - This function is called before dma is started and checks if same
> sync_bufs had already be committed to reservation_object,
> bo->fence_shared[n]. And id could be used to identy which device
> sync_buf is going to be accessed by.
>
> int fence_helper_add_sync_buf(struct fence_helper *fh, struct dma_buf
> *sync_buf, int id);
> - This function is called if fence_helper_check_sync_buf() is true and
> adds it seqno_fence's sync_buf_list wrapping sync_buf as
> seqno_fence_dma_buf structure. With this function call, one
> seqno_fence instance would have more sync_bufs. At this time, the
> reference count to this sync_buf is taken.
>
> void fence_helper_del_sync_buf(struct fence_helper *fh, int id);
> - This function is called if some operation is failed after
> fence_helper_add_sync_buf call to release relevant resources.
>
> int fence_helper_init_reservation_entry(struct fence_helper *fh,
> struct dma_buf *dmabuf, bool shared, int id);
> - This function is called after fence_helper_add_sync_buf call and
> calls reservation_entry_init function to set a reservation object of
> sync_buf to a new reservation_entry object. And then the new
> reservation_entry is added to fh->entries to track all sync_bufs this
> device is going to access.
>
> void fence_helper_fini_reservation_entry(struct fence_helper *fh, int id);
> - This function is called if some operation is failed after
> fence_helper_init_reservation_entry call to releae relevant resources.
>
> int fence_helper_ticket_reserve(struct fence_helper *fh, int id);
> - This function is called after fence_helper_init_reservation_entry
> call and calls ticket_reserve function to reserve a ticket(locked for
> each reservation entry in fh->entires)
>
> void fence_helper_ticket_commit(struct fence_helper *fh, int id);
> - This function is called after fence_helper_ticket_reserve() is
> called to commit this device's fence to all reservation_objects of
> each sync_buf. After that, once other devices try to access these
> buffers, they would be blocked and unlock each reservation entry in
> fh->entires.
>
> int fence_helper_wait(struct fence_helper *fh, struct dma_buf *dmabuf,
> bool intr);
> - This function is called before fence_helper_add_sync_buf() is called
> to wait for a signal from other devices.
>
> int fence_helper_signal(struct fence_helper *fh, int id);
> - This function is called by device's interrupt handler or somewhere
> when dma access to this buffer has been completed and calls
> fence_signal() with each fence registed to each reservation object in
> fh->entries to notify dma access completion to other deivces. At this
> time, other devices blocked would be waked up and forward their next
> step.
>
> For more detail, in addition, this function does the following,
> - delete each reservation entry in fh->entries.
> - release each seqno_fence_dmabuf object in seqno_fence's
> sync_buf_list and call dma_buf_put() to put the reference count to
> dmabuf.
>
>
> Now the fence-helper framework is just WIP yet so there may be my
> missing points. If you are ok, I'd like to post it as RFC.
Way too complicated.




* Re: [Linaro-mm-sig] [PATCH 5/7] seqno-fence: Hardware dma-buf implementation of fencing (v4)
  2013-01-16 10:36     ` Maarten Lankhorst
@ 2013-01-16 12:00       ` Inki Dae
  2013-01-24 14:52       ` Inki Dae
  1 sibling, 0 replies; 32+ messages in thread
From: Inki Dae @ 2013-01-16 12:00 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: linaro-mm-sig, Maarten Lankhorst, linux-kernel, dri-devel, linux-media

2013/1/16 Maarten Lankhorst <maarten.lankhorst@canonical.com>:
> Op 16-01-13 07:28, Inki Dae schreef:
>> 2013/1/15 Maarten Lankhorst <m.b.lankhorst@gmail.com>:
>>> This type of fence can be used with hardware synchronization for simple
>>> hardware that can block execution until the condition
>>> (dma_buf[offset] - value) >= 0 has been met.
>>>
>>> A software fallback still has to be provided in case the fence is used
>>> with a device that doesn't support this mechanism. It is useful to expose
>>> this for graphics cards that have an op to support this.
>>>
>>> Some cards like i915 can export those, but don't have an option to wait,
>>> so they need the software fallback.
>>>
>>> I extended the original patch by Rob Clark.
>>>
>>> v1: Original
>>> v2: Renamed from bikeshed to seqno, moved into dma-fence.c since
>>>     not much was left of the file. Lots of documentation added.
>>> v3: Use fence_ops instead of custom callbacks. Moved to own file
>>>     to avoid circular dependency between dma-buf.h and fence.h
>>> v4: Add spinlock pointer to seqno_fence_init
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>> ---
>>>  Documentation/DocBook/device-drivers.tmpl |   1 +
>>>  drivers/base/fence.c                      |  38 +++++++++++
>>>  include/linux/seqno-fence.h               | 105 ++++++++++++++++++++++++++++++
>>>  3 files changed, 144 insertions(+)
>>>  create mode 100644 include/linux/seqno-fence.h
>>>
>>> diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
>>> index 6f53fc0..ad14396 100644
>>> --- a/Documentation/DocBook/device-drivers.tmpl
>>> +++ b/Documentation/DocBook/device-drivers.tmpl
>>> @@ -128,6 +128,7 @@ X!Edrivers/base/interface.c
>>>  !Edrivers/base/dma-buf.c
>>>  !Edrivers/base/fence.c
>>>  !Iinclude/linux/fence.h
>>> +!Iinclude/linux/seqno-fence.h
>>>  !Edrivers/base/dma-coherent.c
>>>  !Edrivers/base/dma-mapping.c
>>>       </sect1>
>>> diff --git a/drivers/base/fence.c b/drivers/base/fence.c
>>> index 28e5ffd..1d3f29c 100644
>>> --- a/drivers/base/fence.c
>>> +++ b/drivers/base/fence.c
>>> @@ -24,6 +24,7 @@
>>>  #include <linux/slab.h>
>>>  #include <linux/export.h>
>>>  #include <linux/fence.h>
>>> +#include <linux/seqno-fence.h>
>>>
>>>  atomic_t fence_context_counter = ATOMIC_INIT(0);
>>>  EXPORT_SYMBOL(fence_context_counter);
>>> @@ -284,3 +285,40 @@ out:
>>>         return ret;
>>>  }
>>>  EXPORT_SYMBOL(fence_default_wait);
>>> +
>>> +static bool seqno_enable_signaling(struct fence *fence)
>>> +{
>>> +       struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>>> +       return seqno_fence->ops->enable_signaling(fence);
>>> +}
>>> +
>>> +static bool seqno_signaled(struct fence *fence)
>>> +{
>>> +       struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>>> +       return seqno_fence->ops->signaled && seqno_fence->ops->signaled(fence);
>>> +}
>>> +
>>> +static void seqno_release(struct fence *fence)
>>> +{
>>> +       struct seqno_fence *f = to_seqno_fence(fence);
>>> +
>>> +       dma_buf_put(f->sync_buf);
>>> +       if (f->ops->release)
>>> +               f->ops->release(fence);
>>> +       else
>>> +               kfree(f);
>>> +}
>>> +
>>> +static long seqno_wait(struct fence *fence, bool intr, signed long timeout)
>>> +{
>>> +       struct seqno_fence *f = to_seqno_fence(fence);
>>> +       return f->ops->wait(fence, intr, timeout);
>>> +}
>>> +
>>> +const struct fence_ops seqno_fence_ops = {
>>> +       .enable_signaling = seqno_enable_signaling,
>>> +       .signaled = seqno_signaled,
>>> +       .wait = seqno_wait,
>>> +       .release = seqno_release
>>> +};
>>> +EXPORT_SYMBOL_GPL(seqno_fence_ops);
>>> diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h
>>> new file mode 100644
>>> index 0000000..603adc0
>>> --- /dev/null
>>> +++ b/include/linux/seqno-fence.h
>>> @@ -0,0 +1,105 @@
>>> +/*
>>> + * seqno-fence, using a dma-buf to synchronize fencing
>>> + *
>>> + * Copyright (C) 2012 Texas Instruments
>>> + * Copyright (C) 2012 Canonical Ltd
>>> + * Authors:
>>> + *   Rob Clark <rob.clark@linaro.org>
>>> + *   Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify it
>>> + * under the terms of the GNU General Public License version 2 as published by
>>> + * the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful, but WITHOUT
>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>>> + * more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License along with
>>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>>> + */
>>> +
>>> +#ifndef __LINUX_SEQNO_FENCE_H
>>> +#define __LINUX_SEQNO_FENCE_H
>>> +
>>> +#include <linux/fence.h>
>>> +#include <linux/dma-buf.h>
>>> +
>>> +struct seqno_fence {
>>> +       struct fence base;
>>> +
>>> +       const struct fence_ops *ops;
>>> +       struct dma_buf *sync_buf;
>>> +       uint32_t seqno_ofs;
>>> +};
>> Hi maarten,
>>
>> I'm applying dma-fence v11 and seqno-fence v4 to exynos drm and have
>> some proposals.
>>
>> The above seqno_fence structure has only one dmabuf. Shouldn't it have
>> mutiple dmabufs? For example, in case of drm driver, when pageflip is
>> requested, one framebuffer could have one more gem buffer for NV12M
>> format. And this means that one more exported dmabufs should be
>> sychronized with other devices. Below is simple structure for it,
> The fence guards a single operation, as such I didn't feel like more than one
> dma-buf was needed to guard it.
>
> Have you considered simply attaching multiple fences instead? Each with their own dma-buf.

I thought each per-device context should have one fence. If not, I
think we could use multiple fences instead.

> There has been some muttering about allowing multiple exclusive fences to be attached, for arm soc's.
>
> But I'm also considering getting rid of the dma-buf member and add a function call to retrieve it, since
> the sync dma-buf member should not be changing often, and it would zap 2 atomic ops on every fence,
> but I want it replaced by something that's not 10x more complicated.
>
> Maybe "int get_sync_dma_buf(fence, old_dma_buf, &new_dma_buf)" that will set new_dma_buf = NULL
> if the old_dma_buf is unchanged, and return true + return a new reference to the sync dma_buf if it's not identical to old_dma_buf.
> old_dma_buf can also be NULL or a dma_buf that belongs to a different fence->context entirely. It might be capable of
> returning an error, in which case the fence would count as being signaled. This could reduce the need for separately checking
> fence_is_signaled first.
>
> I think this would allow caching the synchronization dma_buf in a similar way without each fence needing
> to hold a reference to the dma_buf all the time, even for fences that are only used internally.
>
>> struct seqno_fence_dmabuf {
>>         struct list_head        list;
>>         int                            id;
>>         struct dmabuf          *sync_buf;
>>         uint32_t                   seqno_ops;
>>         uint32_t                   seqno;
>> };
>>
>> The member, id, could be used to identify which device sync_buf is
>> going to be accessed by. In case of drm driver, one framebuffer could
>> be accessed by one more devices, one is Display controller and another
>> is HDMI controller. So id would have crtc number.
> Why do you need this? the base fence already has a context member.
>
> In fact I don't see why you need a linked list, at worst you'd need a static array since the amount of
> dma-bufs should already be known during allocation time.
>
> I would prefer to simply make reservation_object->fence_exclusive an array, since it would be easier to implement,
> and there have been some calls that arm might need such a thing.
>

Right, an array could be used instead. I just did a quick implementation,
so it's not perfect.

>
>> And seqno_fence structure could be defined like below,
>>
>> struct seqno_fence {
>>         struct list_head                sync_buf_list;
>>         struct fence                      base;
>>         const struct fence_ops     *ops;
>> };
>>
>> In addition, I have implemented fence-helper framework for sw sync as
>> WIP and below is intefaces for it,
>>
>> struct fence_helper {
>>         struct list_head                    entries;
>>         struct reservation_ticket       ticket;
>>         struct seqno_fence              *sf;
>>         spinlock_t                             lock;
>>         void                                      *priv;
>> };
>>
>> int fence_helper_init(struct fence_helper *fh, void *priv, void
>> (*remease)(struct fence *fence));
>> - This function is called at driver open so process unique context
>> would have a new seqno_fence instance. This function does just
>> seqno_fence_init call, initialize entries list and set device specific
>> fence release callback.
>>
>> bool fence_helper_check_sync_buf(struct fence_helper *fh, struct
>> dma_buf *sync_buf, int id);
>> - This function is called before dma is started and checks if same
>> sync_bufs had already be committed to reservation_object,
>> bo->fence_shared[n]. And id could be used to identy which device
>> sync_buf is going to be accessed by.
>>
>> int fence_helper_add_sync_buf(struct fence_helper *fh, struct dma_buf
>> *sync_buf, int id);
>> - This function is called if fence_helper_check_sync_buf() is true and
>> adds it seqno_fence's sync_buf_list wrapping sync_buf as
>> seqno_fence_dma_buf structure. With this function call, one
>> seqno_fence instance would have more sync_bufs. At this time, the
>> reference count to this sync_buf is taken.
>>
>> void fence_helper_del_sync_buf(struct fence_helper *fh, int id);
>> - This function is called if some operation is failed after
>> fence_helper_add_sync_buf call to release relevant resources.
>>
>> int fence_helper_init_reservation_entry(struct fence_helper *fh,
>> struct dma_buf *dmabuf, bool shared, int id);
>> - This function is called after fence_helper_add_sync_buf call and
>> calls reservation_entry_init function to set a reservation object of
>> sync_buf to a new reservation_entry object. And then the new
>> reservation_entry is added to fh->entries to track all sync_bufs this
>> device is going to access.
>>
>> void fence_helper_fini_reservation_entry(struct fence_helper *fh, int id);
>> - This function is called if some operation is failed after
>> fence_helper_init_reservation_entry call to releae relevant resources.
>>
>> int fence_helper_ticket_reserve(struct fence_helper *fh, int id);
>> - This function is called after fence_helper_init_reservation_entry
>> call and calls ticket_reserve function to reserve a ticket(locked for
>> each reservation entry in fh->entires)
>>
>> void fence_helper_ticket_commit(struct fence_helper *fh, int id);
>> - This function is called after fence_helper_ticket_reserve() is
>> called to commit this device's fence to all reservation_objects of
>> each sync_buf. After that, once other devices try to access these
>> buffers, they would be blocked and unlock each reservation entry in
>> fh->entires.
>>
>> int fence_helper_wait(struct fence_helper *fh, struct dma_buf *dmabuf,
>> bool intr);
>> - This function is called before fence_helper_add_sync_buf() is called
>> to wait for a signal from other devices.
>>
>> int fence_helper_signal(struct fence_helper *fh, int id);
>> - This function is called by device's interrupt handler or somewhere
>> when dma access to this buffer has been completed and calls
>> fence_signal() with each fence registed to each reservation object in
>> fh->entries to notify dma access completion to other deivces. At this
>> time, other devices blocked would be waked up and forward their next
>> step.
>>
>> For more detail, in addition, this function does the following,
>> - delete each reservation entry in fh->entries.
>> - release each seqno_fence_dmabuf object in seqno_fence's
>> sync_buf_list and call dma_buf_put() to put the reference count to
>> dmabuf.
>>
>>
>> Now the fence-helper framework is just WIP yet so there may be my
>> missing points. If you are ok, I'd like to post it as RFC.
> Way too complicated..

The purpose of the fence-helper is to make dma-fence simpler to use.
With the fence-helper, we don't need to deal with the fence and
reservation interfaces directly. All we have to do is call the
fence-helper interfaces, without having to consider those two things
(fence and reservation).

For example (in the consumer case):

driver_open()
{
        struct fence_helper *fh;
        ...
        ctx->fh = kzalloc();
        ...
        fence_helper_init(fh, ...);
}

driver_addfb()
{
       ...
       fence_helper_add_sync_buf(fh, sync_buf, ...);
       ...
}

driver_pageflip()
{
       ...
       fence_helper_wait(fh, sync_buf, ...);
       fence_helper_init_reservation_entry(fh, sync_buf, ...);
       fence_helper_ticket_reserve(fh, ...);
       fence_helper_ticket_commit(fh, ...);
       ...
}

driver_pageflip_handler()
{
       ...
       fence_helper_signal(fh, ...);
}

The above functions are called in the following order:
1. driver_open() -> 2. driver_addfb() -> 3. driver_pageflip() ->
4. driver_pageflip_handler()

Steps 3 and 4 would be called repeatedly, and a producer could work in a
similar way.

I'm not sure I understand the dma-fence framework fully, so there may be
mistakes here or a better way to do this.

Thanks,
Inki Dae

>
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 4/7] fence: dma-buf cross-device synchronization (v11)
  2013-01-15 12:34 ` [PATCH 4/7] fence: dma-buf cross-device synchronization (v11) Maarten Lankhorst
@ 2013-01-22 15:13   ` Francesco Lavra
  2013-01-23 14:56     ` Maarten Lankhorst
  2013-01-31  9:32   ` Inki Dae
  1 sibling, 1 reply; 32+ messages in thread
From: Francesco Lavra @ 2013-01-22 15:13 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: dri-devel, linux-kernel, linux-media, linaro-mm-sig, Maarten Lankhorst

Hi,

On 01/15/2013 01:34 PM, Maarten Lankhorst wrote:
[...]
> diff --git a/include/linux/fence.h b/include/linux/fence.h
> new file mode 100644
> index 0000000..d9f091d
> --- /dev/null
> +++ b/include/linux/fence.h
> @@ -0,0 +1,347 @@
> +/*
> + * Fence mechanism for dma-buf to allow for asynchronous dma access
> + *
> + * Copyright (C) 2012 Canonical Ltd
> + * Copyright (C) 2012 Texas Instruments
> + *
> + * Authors:
> + * Rob Clark <rob.clark@linaro.org>
> + * Maarten Lankhorst <maarten.lankhorst@canonical.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published by
> + * the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef __LINUX_FENCE_H
> +#define __LINUX_FENCE_H
> +
> +#include <linux/err.h>
> +#include <linux/wait.h>
> +#include <linux/list.h>
> +#include <linux/bitops.h>
> +#include <linux/kref.h>
> +#include <linux/sched.h>
> +#include <linux/printk.h>
> +
> +struct fence;
> +struct fence_ops;
> +struct fence_cb;
> +
> +/**
> + * struct fence - software synchronization primitive
> + * @refcount: refcount for this fence
> + * @ops: fence_ops associated with this fence
> + * @cb_list: list of all callbacks to call
> + * @lock: spin_lock_irqsave used for locking
> + * @priv: fence specific private data
> + * @flags: A mask of FENCE_FLAG_* defined below
> + *
> + * the flags member must be manipulated and read using the appropriate
> + * atomic ops (bit_*), so taking the spinlock will not be needed most
> + * of the time.
> + *
> + * FENCE_FLAG_SIGNALED_BIT - fence is already signaled
> + * FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called*
> + * FENCE_FLAG_USER_BITS - start of the unused bits, can be used by the
> + * implementer of the fence for its own purposes. Can be used in different
> + * ways by different fence implementers, so do not rely on this.
> + *
> + * *) Since atomic bitops are used, this is not guaranteed to be the case.
> + * Particularly, if the bit was set, but fence_signal was called right
> + * before this bit was set, it would have been able to set the
> + * FENCE_FLAG_SIGNALED_BIT, before enable_signaling was called.
> + * Adding a check for FENCE_FLAG_SIGNALED_BIT after setting
> + * FENCE_FLAG_ENABLE_SIGNAL_BIT closes this race, and makes sure that
> + * after fence_signal was called, any enable_signaling call will have either
> + * been completed, or never called at all.
> + */
> +struct fence {
> +	struct kref refcount;
> +	const struct fence_ops *ops;
> +	struct list_head cb_list;
> +	spinlock_t *lock;
> +	unsigned context, seqno;
> +	unsigned long flags;
> +};

The documentation above should be updated with the new structure members
context and seqno.

> +
> +enum fence_flag_bits {
> +	FENCE_FLAG_SIGNALED_BIT,
> +	FENCE_FLAG_ENABLE_SIGNAL_BIT,
> +	FENCE_FLAG_USER_BITS, /* must always be last member */
> +};
> +
> +typedef void (*fence_func_t)(struct fence *fence, struct fence_cb *cb, void *priv);
> +
> +/**
> + * struct fence_cb - callback for fence_add_callback
> + * @func: fence_func_t to call
> + * @priv: value of priv to pass to function
> + *
> + * This struct will be initialized by fence_add_callback, additional
> + * data can be passed along by embedding fence_cb in another struct.
> + */
> +struct fence_cb {
> +	struct list_head node;
> +	fence_func_t func;
> +	void *priv;
> +};

Documentation should be updated here too.

> +
> +/**
> + * struct fence_ops - operations implemented for fence
> + * @enable_signaling: enable software signaling of fence
> + * @signaled: [optional] peek whether the fence is signaled
> + * @release: [optional] called on destruction of fence
> + *
> + * Notes on enable_signaling:
> + * For fence implementations that have the capability for hw->hw
> + * signaling, they can implement this op to enable the necessary
> + * irqs, or insert commands into cmdstream, etc.  This is called
> + * in the first wait() or add_callback() path to let the fence
> + * implementation know that there is another driver waiting on
> + * the signal (ie. hw->sw case).
> + *
> + * This function can be called called from atomic context, but not
> + * from irq context, so normal spinlocks can be used.
> + *
> + * A return value of false indicates the fence already passed,
> + * or some failure occured that made it impossible to enable
> + * signaling. True indicates succesful enabling.
> + *
> + * Calling fence_signal before enable_signaling is called allows
> + * for a tiny race window in which enable_signaling is called during,
> + * before, or after fence_signal. To fight this, it is recommended
> + * that before enable_signaling returns true an extra reference is
> + * taken on the fence, to be released when the fence is signaled.
> + * This will mean fence_signal will still be called twice, but
> + * the second time will be a noop since it was already signaled.
> + *
> + * Notes on release:
> + * Can be NULL, this function allows additional commands to run on
> + * destruction of the fence. Can be called from irq context.
> + * If pointer is set to NULL, kfree will get called instead.
> + */
> +
> +struct fence_ops {
> +	bool (*enable_signaling)(struct fence *fence);
> +	bool (*signaled)(struct fence *fence);
> +	long (*wait)(struct fence *fence, bool intr, signed long);
> +	void (*release)(struct fence *fence);
> +};

Ditto.

> +
> +/**
> + * __fence_init - Initialize a custom fence.
> + * @fence:	[in]	the fence to initialize
> + * @ops:	[in]	the fence_ops for operations on this fence
> + * @lock:	[in]	the irqsafe spinlock to use for locking this fence
> + * @context:	[in]	the execution context this fence is run on
> + * @seqno:	[in]	a linear increasing sequence number for this context
> + *
> + * Initializes an allocated fence, the caller doesn't have to keep its
> + * refcount after committing with this fence, but it will need to hold a
> + * refcount again if fence_ops.enable_signaling gets called. This can
> + * be used for other implementing other types of fence.
> + *
> + * context and seqno are used for easy comparison between fences, allowing
> + * to check which fence is later by simply using fence_later.
> + */
> +static inline void
> +__fence_init(struct fence *fence, const struct fence_ops *ops,
> +	     spinlock_t *lock, unsigned context, unsigned seqno)
> +{
> +	BUG_ON(!ops || !lock || !ops->enable_signaling || !ops->wait);
> +
> +	kref_init(&fence->refcount);
> +	fence->ops = ops;
> +	INIT_LIST_HEAD(&fence->cb_list);
> +	fence->lock = lock;
> +	fence->context = context;
> +	fence->seqno = seqno;
> +	fence->flags = 0UL;
> +}
> +
> +/**
> + * fence_get - increases refcount of the fence
> + * @fence:	[in]	fence to increase refcount of
> + */
> +static inline void fence_get(struct fence *fence)
> +{
> +	if (WARN_ON(!fence))
> +		return;
> +	kref_get(&fence->refcount);
> +}
> +
> +extern void release_fence(struct kref *kref);
> +
> +/**
> + * fence_put - decreases refcount of the fence
> + * @fence:	[in]	fence to reduce refcount of
> + */
> +static inline void fence_put(struct fence *fence)
> +{
> +	if (WARN_ON(!fence))
> +		return;
> +	kref_put(&fence->refcount, release_fence);
> +}
> +
> +int fence_signal(struct fence *fence);
> +int __fence_signal(struct fence *fence);
> +long fence_default_wait(struct fence *fence, bool intr, signed long);

In the parameter list the first two parameters are named, and the last
one isn't. Feels a bit odd...

> +int fence_add_callback(struct fence *fence, struct fence_cb *cb,
> +		       fence_func_t func, void *priv);
> +bool fence_remove_callback(struct fence *fence, struct fence_cb *cb);
> +void fence_enable_sw_signaling(struct fence *fence);
> +
> +/**
> + * fence_is_signaled - Return an indication if the fence is signaled yet.
> + * @fence:	[in]	the fence to check
> + *
> + * Returns true if the fence was already signaled, false if not. Since this
> + * function doesn't enable signaling, it is not guaranteed to ever return true
> + * If fence_add_callback, fence_wait or fence_enable_sw_signaling
> + * haven't been called before.
> + *
> + * It's recommended for seqno fences to call fence_signal when the
> + * operation is complete, it makes it possible to prevent issues from
> + * wraparound between time of issue and time of use by checking the return
> + * value of this function before calling hardware-specific wait instructions.
> + */
> +static inline bool
> +fence_is_signaled(struct fence *fence)
> +{
> +	if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
> +		return true;
> +
> +	if (fence->ops->signaled && fence->ops->signaled(fence)) {
> +		fence_signal(fence);
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +/**
> + * fence_later - return the chronologically later fence
> + * @f1:	[in]	the first fence from the same context
> + * @f2:	[in]	the second fence from the same context
> + *
> + * Returns NULL if both fences are signaled, otherwise the fence that would be
> + * signaled last. Both fences must be from the same context, since a seqno is
> + * not re-used across contexts.
> + */
> +static inline struct fence *fence_later(struct fence *f1, struct fence *f2)
> +{
> +	bool sig1, sig2;
> +
> +	/*
> +	 * can't check just FENCE_FLAG_SIGNALED_BIT here, it may never have been
> +	 * set called if enable_signaling wasn't, and enabling that here is
> +	 * overkill.
> +	 */
> +	sig1 = fence_is_signaled(f1);
> +	sig2 = fence_is_signaled(f2);
> +
> +	if (sig1 && sig2)
> +		return NULL;
> +
> +	BUG_ON(f1->context != f2->context);
> +
> +	if (sig1 || f2->seqno - f2->seqno <= INT_MAX)

I guess you meant (f2->seqno - f1->seqno).

> +		return f2;
> +	else
> +		return f1;
> +}

Regards,
Francesco

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 6/7] reservation: cross-device reservation support
  2013-01-15 12:34 ` [PATCH 6/7] reservation: cross-device reservation support Maarten Lankhorst
@ 2013-01-22 16:47   ` Francesco Lavra
  2013-01-22 17:04     ` Maarten Lankhorst
  2013-02-04  7:06   ` Inki Dae
  1 sibling, 1 reply; 32+ messages in thread
From: Francesco Lavra @ 2013-01-22 16:47 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: dri-devel, linux-kernel, linux-media, linaro-mm-sig, Maarten Lankhorst

On 01/15/2013 01:34 PM, Maarten Lankhorst wrote:
> This adds support for a generic reservations framework that can be
> hooked up to ttm and dma-buf and allows easy sharing of reservations
> across devices.
> 
> The idea is that a dma-buf and ttm object both will get a pointer
> to a struct reservation_object, which has to be reserved before
> anything is done with the contents of the dma-buf.
> 
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> ---
>  Documentation/DocBook/device-drivers.tmpl |   2 +
>  drivers/base/Makefile                     |   2 +-
>  drivers/base/reservation.c                | 251 ++++++++++++++++++++++++++++++
>  include/linux/reservation.h               | 182 ++++++++++++++++++++++
>  4 files changed, 436 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/base/reservation.c
>  create mode 100644 include/linux/reservation.h
[...]
> diff --git a/include/linux/reservation.h b/include/linux/reservation.h
> new file mode 100644
> index 0000000..fc2349d
> --- /dev/null
> +++ b/include/linux/reservation.h
> @@ -0,0 +1,182 @@
> +/*
> + * Header file for reservations for dma-buf and ttm
> + *
> + * Copyright(C) 2011 Linaro Limited. All rights reserved.
> + * Copyright (C) 2012 Canonical Ltd
> + * Copyright (C) 2012 Texas Instruments
> + *
> + * Authors:
> + * Rob Clark <rob.clark@linaro.org>
> + * Maarten Lankhorst <maarten.lankhorst@canonical.com>
> + * Thomas Hellstrom <thellstrom-at-vmware-dot-com>
> + *
> + * Based on bo.c which bears the following copyright notice,
> + * but is dual licensed:
> + *
> + * Copyright (c) 2006-2009 VMware, Inc., Palo Alto, CA., USA
> + * All Rights Reserved.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the
> + * "Software"), to deal in the Software without restriction, including
> + * without limitation the rights to use, copy, modify, merge, publish,
> + * distribute, sub license, and/or sell copies of the Software, and to
> + * permit persons to whom the Software is furnished to do so, subject to
> + * the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> + * next paragraph) shall be included in all copies or substantial portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
> + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
> + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
> + * USE OR OTHER DEALINGS IN THE SOFTWARE.
> + */
> +#ifndef _LINUX_RESERVATION_H
> +#define _LINUX_RESERVATION_H
> +
> +#include <linux/spinlock.h>
> +#include <linux/mutex.h>
> +#include <linux/fence.h>
> +
> +#define BUF_MAX_SHARED_FENCE 8
> +
> +struct fence;
> +
> +extern atomic_long_t reservation_counter;
> +extern const char reservation_object_name[];
> +extern struct lock_class_key reservation_object_class;
> +extern const char reservation_ticket_name[];
> +extern struct lock_class_key reservation_ticket_class;
> +
> +struct reservation_object {
> +	struct ticket_mutex lock;
> +
> +	u32 fence_shared_count;
> +	struct fence *fence_excl;
> +	struct fence *fence_shared[BUF_MAX_SHARED_FENCE];
> +};
> +
> +struct reservation_ticket {
> +	unsigned long seqno;
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> +	struct lockdep_map dep_map;
> +#endif
> +};
> +
> +/**
> + * struct reservation_entry - reservation structure for a
> + * reservation_object
> + * @head:	list entry
> + * @obj_shared:	pointer to a reservation_object to reserve
> + *
> + * Bit 0 of obj_shared is set to bool shared, as such pointer has to be
> + * converted back, which can be done with reservation_entry_get.
> + */
> +struct reservation_entry {
> +	struct list_head head;
> +	unsigned long obj_shared;
> +};
> +
> +
> +static inline void
> +reservation_object_init(struct reservation_object *obj)
> +{
> +	obj->fence_shared_count = 0;
> +	obj->fence_excl = NULL;
> +
> +	__ticket_mutex_init(&obj->lock, reservation_object_name,
> +			    &reservation_object_class);
> +}
> +
> +static inline void
> +reservation_object_fini(struct reservation_object *obj)
> +{
> +	int i;
> +
> +	if (obj->fence_excl)
> +		fence_put(obj->fence_excl);
> +	for (i = 0; i < obj->fence_shared_count; ++i)
> +		fence_put(obj->fence_shared[i]);
> +
> +	mutex_destroy(&obj->lock.base);
> +}
> +
> +static inline void
> +reservation_ticket_init(struct reservation_ticket *t)
> +{
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> +	/*
> +	 * Make sure we are not reinitializing a held ticket:
> +	 */
> +
> +	debug_check_no_locks_freed((void *)t, sizeof(*t));
> +	lockdep_init_map(&t->dep_map, reservation_ticket_name,
> +			 &reservation_ticket_class, 0);
> +#endif
> +	mutex_acquire(&t->dep_map, 0, 0, _THIS_IP_);

If CONFIG_DEBUG_LOCK_ALLOC is not defined, t->dep_map is not there.

> +	do {
> +		t->seqno = atomic_long_inc_return(&reservation_counter);
> +	} while (unlikely(!t->seqno));
> +}

--
Francesco

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 6/7] reservation: cross-device reservation support
  2013-01-22 16:47   ` [Linaro-mm-sig] " Francesco Lavra
@ 2013-01-22 17:04     ` Maarten Lankhorst
  0 siblings, 0 replies; 32+ messages in thread
From: Maarten Lankhorst @ 2013-01-22 17:04 UTC (permalink / raw)
  To: Francesco Lavra
  Cc: Maarten Lankhorst, dri-devel, linux-kernel, linux-media, linaro-mm-sig

Op 22-01-13 17:47, Francesco Lavra schreef:
> On 01/15/2013 01:34 PM, Maarten Lankhorst wrote:
>> This adds support for a generic reservations framework that can be
>> hooked up to ttm and dma-buf and allows easy sharing of reservations
>> across devices.
>>
>> The idea is that a dma-buf and ttm object both will get a pointer
>> to a struct reservation_object, which has to be reserved before
>> anything is done with the contents of the dma-buf.
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> ---
>>  Documentation/DocBook/device-drivers.tmpl |   2 +
>>  drivers/base/Makefile                     |   2 +-
>>  drivers/base/reservation.c                | 251 ++++++++++++++++++++++++++++++
>>  include/linux/reservation.h               | 182 ++++++++++++++++++++++
>>  4 files changed, 436 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/base/reservation.c
>>  create mode 100644 include/linux/reservation.h
> [...]
>> diff --git a/include/linux/reservation.h b/include/linux/reservation.h
>> new file mode 100644
>> index 0000000..fc2349d
>> --- /dev/null
>> +++ b/include/linux/reservation.h
>> @@ -0,0 +1,182 @@
>> +/*
>> + * Header file for reservations for dma-buf and ttm
>> + *
>> + * Copyright(C) 2011 Linaro Limited. All rights reserved.
>> + * Copyright (C) 2012 Canonical Ltd
>> + * Copyright (C) 2012 Texas Instruments
>> + *
>> + * Authors:
>> + * Rob Clark <rob.clark@linaro.org>
>> + * Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> + * Thomas Hellstrom <thellstrom-at-vmware-dot-com>
>> + *
>> + * Based on bo.c which bears the following copyright notice,
>> + * but is dual licensed:
>> + *
>> + * Copyright (c) 2006-2009 VMware, Inc., Palo Alto, CA., USA
>> + * All Rights Reserved.
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a
>> + * copy of this software and associated documentation files (the
>> + * "Software"), to deal in the Software without restriction, including
>> + * without limitation the rights to use, copy, modify, merge, publish,
>> + * distribute, sub license, and/or sell copies of the Software, and to
>> + * permit persons to whom the Software is furnished to do so, subject to
>> + * the following conditions:
>> + *
>> + * The above copyright notice and this permission notice (including the
>> + * next paragraph) shall be included in all copies or substantial portions
>> + * of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
>> + * THE COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM,
>> + * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
>> + * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
>> + * USE OR OTHER DEALINGS IN THE SOFTWARE.
>> + */
>> +#ifndef _LINUX_RESERVATION_H
>> +#define _LINUX_RESERVATION_H
>> +
>> +#include <linux/spinlock.h>
>> +#include <linux/mutex.h>
>> +#include <linux/fence.h>
>> +
>> +#define BUF_MAX_SHARED_FENCE 8
>> +
>> +struct fence;
>> +
>> +extern atomic_long_t reservation_counter;
>> +extern const char reservation_object_name[];
>> +extern struct lock_class_key reservation_object_class;
>> +extern const char reservation_ticket_name[];
>> +extern struct lock_class_key reservation_ticket_class;
>> +
>> +struct reservation_object {
>> +	struct ticket_mutex lock;
>> +
>> +	u32 fence_shared_count;
>> +	struct fence *fence_excl;
>> +	struct fence *fence_shared[BUF_MAX_SHARED_FENCE];
>> +};
>> +
>> +struct reservation_ticket {
>> +	unsigned long seqno;
>> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
>> +	struct lockdep_map dep_map;
>> +#endif
>> +};
>> +
>> +/**
>> + * struct reservation_entry - reservation structure for a
>> + * reservation_object
>> + * @head:	list entry
>> + * @obj_shared:	pointer to a reservation_object to reserve
>> + *
>> + * Bit 0 of obj_shared is set to bool shared, as such pointer has to be
>> + * converted back, which can be done with reservation_entry_get.
>> + */
>> +struct reservation_entry {
>> +	struct list_head head;
>> +	unsigned long obj_shared;
>> +};
>> +
>> +
>> +static inline void
>> +reservation_object_init(struct reservation_object *obj)
>> +{
>> +	obj->fence_shared_count = 0;
>> +	obj->fence_excl = NULL;
>> +
>> +	__ticket_mutex_init(&obj->lock, reservation_object_name,
>> +			    &reservation_object_class);
>> +}
>> +
>> +static inline void
>> +reservation_object_fini(struct reservation_object *obj)
>> +{
>> +	int i;
>> +
>> +	if (obj->fence_excl)
>> +		fence_put(obj->fence_excl);
>> +	for (i = 0; i < obj->fence_shared_count; ++i)
>> +		fence_put(obj->fence_shared[i]);
>> +
>> +	mutex_destroy(&obj->lock.base);
>> +}
>> +
>> +static inline void
>> +reservation_ticket_init(struct reservation_ticket *t)
>> +{
>> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
>> +	/*
>> +	 * Make sure we are not reinitializing a held ticket:
>> +	 */
>> +
>> +	debug_check_no_locks_freed((void *)t, sizeof(*t));
>> +	lockdep_init_map(&t->dep_map, reservation_ticket_name,
>> +			 &reservation_ticket_class, 0);
>> +#endif
>> +	mutex_acquire(&t->dep_map, 0, 0, _THIS_IP_);
> If CONFIG_DEBUG_LOCK_ALLOC is not defined, t->dep_map is not there.
And mutex_acquire will not expand either, so it's harmless. :-)
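For reference, the non-lockdep stub is roughly the following (simplified
from include/linux/lockdep.h), so the &t->dep_map argument above is
discarded at preprocessing time:

#ifndef CONFIG_DEBUG_LOCK_ALLOC
/* without lockdep the annotation macro is an empty statement */
# define mutex_acquire(l, s, t, i)		do { } while (0)
#endif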
>> +	do {
>> +		t->seqno = atomic_long_inc_return(&reservation_counter);
>> +	} while (unlikely(!t->seqno));
>> +}
> --
> Francesco
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 4/7] fence: dma-buf cross-device synchronization (v11)
  2013-01-22 15:13   ` [Linaro-mm-sig] " Francesco Lavra
@ 2013-01-23 14:56     ` Maarten Lankhorst
  2013-01-23 17:14       ` Francesco Lavra
  0 siblings, 1 reply; 32+ messages in thread
From: Maarten Lankhorst @ 2013-01-23 14:56 UTC (permalink / raw)
  To: Francesco Lavra
  Cc: Maarten Lankhorst, dri-devel, linux-kernel, linux-media, linaro-mm-sig

Op 22-01-13 16:13, Francesco Lavra schreef:
> Hi,
>
> On 01/15/2013 01:34 PM, Maarten Lankhorst wrote:
> [...]
>> diff --git a/include/linux/fence.h b/include/linux/fence.h
>> new file mode 100644
>> index 0000000..d9f091d
>> --- /dev/null
>> +++ b/include/linux/fence.h
>> @@ -0,0 +1,347 @@
>> +/*
>> + * Fence mechanism for dma-buf to allow for asynchronous dma access
>> + *
>> + * Copyright (C) 2012 Canonical Ltd
>> + * Copyright (C) 2012 Texas Instruments
>> + *
>> + * Authors:
>> + * Rob Clark <rob.clark@linaro.org>
>> + * Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published by
>> + * the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef __LINUX_FENCE_H
>> +#define __LINUX_FENCE_H
>> +
>> +#include <linux/err.h>
>> +#include <linux/wait.h>
>> +#include <linux/list.h>
>> +#include <linux/bitops.h>
>> +#include <linux/kref.h>
>> +#include <linux/sched.h>
>> +#include <linux/printk.h>
>> +
>> +struct fence;
>> +struct fence_ops;
>> +struct fence_cb;
>> +
>> +/**
>> + * struct fence - software synchronization primitive
>> + * @refcount: refcount for this fence
>> + * @ops: fence_ops associated with this fence
>> + * @cb_list: list of all callbacks to call
>> + * @lock: spin_lock_irqsave used for locking
>> + * @priv: fence specific private data
>> + * @flags: A mask of FENCE_FLAG_* defined below
>> + *
>> + * the flags member must be manipulated and read using the appropriate
>> + * atomic ops (bit_*), so taking the spinlock will not be needed most
>> + * of the time.
>> + *
>> + * FENCE_FLAG_SIGNALED_BIT - fence is already signaled
>> + * FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called*
>> + * FENCE_FLAG_USER_BITS - start of the unused bits, can be used by the
>> + * implementer of the fence for its own purposes. Can be used in different
>> + * ways by different fence implementers, so do not rely on this.
>> + *
>> + * *) Since atomic bitops are used, this is not guaranteed to be the case.
>> + * Particularly, if the bit was set, but fence_signal was called right
>> + * before this bit was set, it would have been able to set the
>> + * FENCE_FLAG_SIGNALED_BIT, before enable_signaling was called.
>> + * Adding a check for FENCE_FLAG_SIGNALED_BIT after setting
>> + * FENCE_FLAG_ENABLE_SIGNAL_BIT closes this race, and makes sure that
>> + * after fence_signal was called, any enable_signaling call will have either
>> + * been completed, or never called at all.
>> + */
>> +struct fence {
>> +	struct kref refcount;
>> +	const struct fence_ops *ops;
>> +	struct list_head cb_list;
>> +	spinlock_t *lock;
>> +	unsigned context, seqno;
>> +	unsigned long flags;
>> +};
> The documentation above should be updated with the new structure members
> context and seqno.
>
>> +
>> +enum fence_flag_bits {
>> +	FENCE_FLAG_SIGNALED_BIT,
>> +	FENCE_FLAG_ENABLE_SIGNAL_BIT,
>> +	FENCE_FLAG_USER_BITS, /* must always be last member */
>> +};
>> +
>> +typedef void (*fence_func_t)(struct fence *fence, struct fence_cb *cb, void *priv);
>> +
>> +/**
>> + * struct fence_cb - callback for fence_add_callback
>> + * @func: fence_func_t to call
>> + * @priv: value of priv to pass to function
>> + *
>> + * This struct will be initialized by fence_add_callback, additional
>> + * data can be passed along by embedding fence_cb in another struct.
>> + */
>> +struct fence_cb {
>> +	struct list_head node;
>> +	fence_func_t func;
>> +	void *priv;
>> +};
> Documentation should be updated here too.
>
>> +
>> +/**
>> + * struct fence_ops - operations implemented for fence
>> + * @enable_signaling: enable software signaling of fence
>> + * @signaled: [optional] peek whether the fence is signaled
>> + * @release: [optional] called on destruction of fence
>> + *
>> + * Notes on enable_signaling:
>> + * For fence implementations that have the capability for hw->hw
>> + * signaling, they can implement this op to enable the necessary
>> + * irqs, or insert commands into cmdstream, etc.  This is called
>> + * in the first wait() or add_callback() path to let the fence
>> + * implementation know that there is another driver waiting on
>> + * the signal (ie. hw->sw case).
>> + *
>> + * This function can be called called from atomic context, but not
>> + * from irq context, so normal spinlocks can be used.
>> + *
>> + * A return value of false indicates the fence already passed,
>> + * or some failure occured that made it impossible to enable
>> + * signaling. True indicates succesful enabling.
>> + *
>> + * Calling fence_signal before enable_signaling is called allows
>> + * for a tiny race window in which enable_signaling is called during,
>> + * before, or after fence_signal. To fight this, it is recommended
>> + * that before enable_signaling returns true an extra reference is
>> + * taken on the fence, to be released when the fence is signaled.
>> + * This will mean fence_signal will still be called twice, but
>> + * the second time will be a noop since it was already signaled.
>> + *
>> + * Notes on release:
>> + * Can be NULL, this function allows additional commands to run on
>> + * destruction of the fence. Can be called from irq context.
>> + * If pointer is set to NULL, kfree will get called instead.
>> + */
>> +
>> +struct fence_ops {
>> +	bool (*enable_signaling)(struct fence *fence);
>> +	bool (*signaled)(struct fence *fence);
>> +	long (*wait)(struct fence *fence, bool intr, signed long);
>> +	void (*release)(struct fence *fence);
>> +};
> Ditto.
>
>> +
>> +/**
>> + * __fence_init - Initialize a custom fence.
>> + * @fence:	[in]	the fence to initialize
>> + * @ops:	[in]	the fence_ops for operations on this fence
>> + * @lock:	[in]	the irqsafe spinlock to use for locking this fence
>> + * @context:	[in]	the execution context this fence is run on
>> + * @seqno:	[in]	a linear increasing sequence number for this context
>> + *
>> + * Initializes an allocated fence, the caller doesn't have to keep its
>> + * refcount after committing with this fence, but it will need to hold a
>> + * refcount again if fence_ops.enable_signaling gets called. This can
>> + * be used for other implementing other types of fence.
>> + *
>> + * context and seqno are used for easy comparison between fences, allowing
>> + * to check which fence is later by simply using fence_later.
>> + */
>> +static inline void
>> +__fence_init(struct fence *fence, const struct fence_ops *ops,
>> +	     spinlock_t *lock, unsigned context, unsigned seqno)
>> +{
>> +	BUG_ON(!ops || !lock || !ops->enable_signaling || !ops->wait);
>> +
>> +	kref_init(&fence->refcount);
>> +	fence->ops = ops;
>> +	INIT_LIST_HEAD(&fence->cb_list);
>> +	fence->lock = lock;
>> +	fence->context = context;
>> +	fence->seqno = seqno;
>> +	fence->flags = 0UL;
>> +}
>> +
>> +/**
>> + * fence_get - increases refcount of the fence
>> + * @fence:	[in]	fence to increase refcount of
>> + */
>> +static inline void fence_get(struct fence *fence)
>> +{
>> +	if (WARN_ON(!fence))
>> +		return;
>> +	kref_get(&fence->refcount);
>> +}
>> +
>> +extern void release_fence(struct kref *kref);
>> +
>> +/**
>> + * fence_put - decreases refcount of the fence
>> + * @fence:	[in]	fence to reduce refcount of
>> + */
>> +static inline void fence_put(struct fence *fence)
>> +{
>> +	if (WARN_ON(!fence))
>> +		return;
>> +	kref_put(&fence->refcount, release_fence);
>> +}
>> +
>> +int fence_signal(struct fence *fence);
>> +int __fence_signal(struct fence *fence);
>> +long fence_default_wait(struct fence *fence, bool intr, signed long);
> In the parameter list the first two parameters are named, and the last
> one isn't. Feels a bit odd...
>
>> +int fence_add_callback(struct fence *fence, struct fence_cb *cb,
>> +		       fence_func_t func, void *priv);
>> +bool fence_remove_callback(struct fence *fence, struct fence_cb *cb);
>> +void fence_enable_sw_signaling(struct fence *fence);
>> +
>> +/**
>> + * fence_is_signaled - Return an indication if the fence is signaled yet.
>> + * @fence:	[in]	the fence to check
>> + *
>> + * Returns true if the fence was already signaled, false if not. Since this
>> + * function doesn't enable signaling, it is not guaranteed to ever return true
>> + * If fence_add_callback, fence_wait or fence_enable_sw_signaling
>> + * haven't been called before.
>> + *
>> + * It's recommended for seqno fences to call fence_signal when the
>> + * operation is complete, it makes it possible to prevent issues from
>> + * wraparound between time of issue and time of use by checking the return
>> + * value of this function before calling hardware-specific wait instructions.
>> + */
>> +static inline bool
>> +fence_is_signaled(struct fence *fence)
>> +{
>> +	if (test_bit(FENCE_FLAG_SIGNALED_BIT, &fence->flags))
>> +		return true;
>> +
>> +	if (fence->ops->signaled && fence->ops->signaled(fence)) {
>> +		fence_signal(fence);
>> +		return true;
>> +	}
>> +
>> +	return false;
>> +}
>> +
>> +/**
>> + * fence_later - return the chronologically later fence
>> + * @f1:	[in]	the first fence from the same context
>> + * @f2:	[in]	the second fence from the same context
>> + *
>> + * Returns NULL if both fences are signaled, otherwise the fence that would be
>> + * signaled last. Both fences must be from the same context, since a seqno is
>> + * not re-used across contexts.
>> + */
>> +static inline struct fence *fence_later(struct fence *f1, struct fence *f2)
>> +{
>> +	bool sig1, sig2;
>> +
>> +	/*
>> +	 * can't check just FENCE_FLAG_SIGNALED_BIT here, it may never have been
>> +	 * set called if enable_signaling wasn't, and enabling that here is
>> +	 * overkill.
>> +	 */
>> +	sig1 = fence_is_signaled(f1);
>> +	sig2 = fence_is_signaled(f2);
>> +
>> +	if (sig1 && sig2)
>> +		return NULL;
>> +
>> +	BUG_ON(f1->context != f2->context);
>> +
>> +	if (sig1 || f2->seqno - f2->seqno <= INT_MAX)
> I guess you meant (f2->seqno - f1->seqno).
>
>> +		return f2;
>> +	else
>> +		return f1;
>> +}
> Regards,
> Francesco
>
Thanks for the review, how does this delta look?

diff --git a/include/linux/fence.h b/include/linux/fence.h
index d9f091d..831ed0a 100644
--- a/include/linux/fence.h
+++ b/include/linux/fence.h
@@ -42,7 +42,10 @@ struct fence_cb;
  * @ops: fence_ops associated with this fence
  * @cb_list: list of all callbacks to call
  * @lock: spin_lock_irqsave used for locking
- * @priv: fence specific private data
+ * @context: execution context this fence belongs to, returned by
+ *           fence_context_alloc()
+ * @seqno: the sequence number of this fence inside the executation context,
+ * can be compared to decide which fence would be signaled later.
  * @flags: A mask of FENCE_FLAG_* defined below
  *
  * the flags member must be manipulated and read using the appropriate
@@ -83,6 +86,7 @@ typedef void (*fence_func_t)(struct fence *fence, struct fence_cb *cb, void *pri
 
 /**
  * struct fence_cb - callback for fence_add_callback
+ * @node: used by fence_add_callback to append this struct to fence::cb_list
  * @func: fence_func_t to call
  * @priv: value of priv to pass to function
  *
@@ -98,8 +102,9 @@ struct fence_cb {
 /**
  * struct fence_ops - operations implemented for fence
  * @enable_signaling: enable software signaling of fence
- * @signaled: [optional] peek whether the fence is signaled
- * @release: [optional] called on destruction of fence
+ * @signaled: [optional] peek whether the fence is signaled, can be null
+ * @wait: custom wait implementation
+ * @release: [optional] called on destruction of fence, can be null
  *
  * Notes on enable_signaling:
  * For fence implementations that have the capability for hw->hw
@@ -124,6 +129,16 @@ struct fence_cb {
  * This will mean fence_signal will still be called twice, but
  * the second time will be a noop since it was already signaled.
  *
+ * Notes on wait:
+ * Must not be NULL, set to fence_default_wait for default implementation.
+ * the fence_default_wait implementation should work for any fence, as long
+ * as enable_signaling works correctly.
+ *
+ * Must return -ERESTARTSYS if the wait is intr = true and the wait was interrupted,
+ * and remaining jiffies if fence has signaled. Can also return other error
+ * values on custom implementations, which should be treated as if the fence
+ * is signaled. For example a hardware lockup could be reported like that.
+ *
  * Notes on release:
  * Can be NULL, this function allows additional commands to run on
  * destruction of the fence. Can be called from irq context.
@@ -133,7 +148,7 @@ struct fence_cb {
 struct fence_ops {
 	bool (*enable_signaling)(struct fence *fence);
 	bool (*signaled)(struct fence *fence);
-	long (*wait)(struct fence *fence, bool intr, signed long);
+	long (*wait)(struct fence *fence, bool intr, signed long timeout);
 	void (*release)(struct fence *fence);
 };
 
@@ -194,7 +209,7 @@ static inline void fence_put(struct fence *fence)
 
 int fence_signal(struct fence *fence);
 int __fence_signal(struct fence *fence);
-long fence_default_wait(struct fence *fence, bool intr, signed long);
+long fence_default_wait(struct fence *fence, bool intr, signed long timeout);
 int fence_add_callback(struct fence *fence, struct fence_cb *cb,
 		       fence_func_t func, void *priv);
 bool fence_remove_callback(struct fence *fence, struct fence_cb *cb);
@@ -254,7 +269,7 @@ static inline struct fence *fence_later(struct fence *f1, struct fence *f2)
 
 	BUG_ON(f1->context != f2->context);
 
-	if (sig1 || f2->seqno - f2->seqno <= INT_MAX)
+	if (sig1 || f2->seqno - f1->seqno <= INT_MAX)
 		return f2;
 	else
 		return f1;
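
And as a purely hypothetical example of the wait convention documented
in the delta above (the foo_* names are made up; fence_default_wait is
the helper declared in this patch):

static long foo_fence_wait(struct fence *fence, bool intr, signed long timeout)
{
	/* a driver-detected failure is reported like a signaled fence */
	if (foo_hw_is_wedged())
		return -EIO;

	/* -ERESTARTSYS on interruption, remaining jiffies once signaled */
	return fence_default_wait(fence, intr, timeout);
}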


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 4/7] fence: dma-buf cross-device synchronization (v11)
  2013-01-23 14:56     ` Maarten Lankhorst
@ 2013-01-23 17:14       ` Francesco Lavra
  0 siblings, 0 replies; 32+ messages in thread
From: Francesco Lavra @ 2013-01-23 17:14 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: Maarten Lankhorst, dri-devel, linux-kernel, linux-media, linaro-mm-sig

On 01/23/2013 03:56 PM, Maarten Lankhorst wrote:
> Thanks for the review, how does this delta look?
> 
> diff --git a/include/linux/fence.h b/include/linux/fence.h
> index d9f091d..831ed0a 100644
> --- a/include/linux/fence.h
> +++ b/include/linux/fence.h
> @@ -42,7 +42,10 @@ struct fence_cb;
>   * @ops: fence_ops associated with this fence
>   * @cb_list: list of all callbacks to call
>   * @lock: spin_lock_irqsave used for locking
> - * @priv: fence specific private data
> + * @context: execution context this fence belongs to, returned by
> + *           fence_context_alloc()
> + * @seqno: the sequence number of this fence inside the executation context,

s/executation/execution

Otherwise, looks good to me.

--
Francesco

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 5/7] seqno-fence: Hardware dma-buf implementation of fencing (v4)
  2013-01-16 10:36     ` Maarten Lankhorst
  2013-01-16 12:00       ` Inki Dae
@ 2013-01-24 14:52       ` Inki Dae
  1 sibling, 0 replies; 32+ messages in thread
From: Inki Dae @ 2013-01-24 14:52 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: linaro-mm-sig, Maarten Lankhorst, linux-kernel, dri-devel, linux-media

2013/1/16 Maarten Lankhorst <maarten.lankhorst@canonical.com>:
> Op 16-01-13 07:28, Inki Dae schreef:
>> 2013/1/15 Maarten Lankhorst <m.b.lankhorst@gmail.com>:
>>> This type of fence can be used with hardware synchronization for simple
>>> hardware that can block execution until the condition
>>> (dma_buf[offset] - value) >= 0 has been met.
>>>
>>> A software fallback still has to be provided in case the fence is used
>>> with a device that doesn't support this mechanism. It is useful to expose
>>> this for graphics cards that have an op to support this.
>>>
>>> Some cards like i915 can export those, but don't have an option to wait,
>>> so they need the software fallback.
>>>
>>> I extended the original patch by Rob Clark.
>>>
>>> v1: Original
>>> v2: Renamed from bikeshed to seqno, moved into dma-fence.c since
>>>     not much was left of the file. Lots of documentation added.
>>> v3: Use fence_ops instead of custom callbacks. Moved to own file
>>>     to avoid circular dependency between dma-buf.h and fence.h
>>> v4: Add spinlock pointer to seqno_fence_init
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>> ---
>>>  Documentation/DocBook/device-drivers.tmpl |   1 +
>>>  drivers/base/fence.c                      |  38 +++++++++++
>>>  include/linux/seqno-fence.h               | 105 ++++++++++++++++++++++++++++++
>>>  3 files changed, 144 insertions(+)
>>>  create mode 100644 include/linux/seqno-fence.h
>>>
>>> diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl
>>> index 6f53fc0..ad14396 100644
>>> --- a/Documentation/DocBook/device-drivers.tmpl
>>> +++ b/Documentation/DocBook/device-drivers.tmpl
>>> @@ -128,6 +128,7 @@ X!Edrivers/base/interface.c
>>>  !Edrivers/base/dma-buf.c
>>>  !Edrivers/base/fence.c
>>>  !Iinclude/linux/fence.h
>>> +!Iinclude/linux/seqno-fence.h
>>>  !Edrivers/base/dma-coherent.c
>>>  !Edrivers/base/dma-mapping.c
>>>       </sect1>
>>> diff --git a/drivers/base/fence.c b/drivers/base/fence.c
>>> index 28e5ffd..1d3f29c 100644
>>> --- a/drivers/base/fence.c
>>> +++ b/drivers/base/fence.c
>>> @@ -24,6 +24,7 @@
>>>  #include <linux/slab.h>
>>>  #include <linux/export.h>
>>>  #include <linux/fence.h>
>>> +#include <linux/seqno-fence.h>
>>>
>>>  atomic_t fence_context_counter = ATOMIC_INIT(0);
>>>  EXPORT_SYMBOL(fence_context_counter);
>>> @@ -284,3 +285,40 @@ out:
>>>         return ret;
>>>  }
>>>  EXPORT_SYMBOL(fence_default_wait);
>>> +
>>> +static bool seqno_enable_signaling(struct fence *fence)
>>> +{
>>> +       struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>>> +       return seqno_fence->ops->enable_signaling(fence);
>>> +}
>>> +
>>> +static bool seqno_signaled(struct fence *fence)
>>> +{
>>> +       struct seqno_fence *seqno_fence = to_seqno_fence(fence);
>>> +       return seqno_fence->ops->signaled && seqno_fence->ops->signaled(fence);
>>> +}
>>> +
>>> +static void seqno_release(struct fence *fence)
>>> +{
>>> +       struct seqno_fence *f = to_seqno_fence(fence);
>>> +
>>> +       dma_buf_put(f->sync_buf);
>>> +       if (f->ops->release)
>>> +               f->ops->release(fence);
>>> +       else
>>> +               kfree(f);
>>> +}
>>> +
>>> +static long seqno_wait(struct fence *fence, bool intr, signed long timeout)
>>> +{
>>> +       struct seqno_fence *f = to_seqno_fence(fence);
>>> +       return f->ops->wait(fence, intr, timeout);
>>> +}
>>> +
>>> +const struct fence_ops seqno_fence_ops = {
>>> +       .enable_signaling = seqno_enable_signaling,
>>> +       .signaled = seqno_signaled,
>>> +       .wait = seqno_wait,
>>> +       .release = seqno_release
>>> +};
>>> +EXPORT_SYMBOL_GPL(seqno_fence_ops);
>>> diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h
>>> new file mode 100644
>>> index 0000000..603adc0
>>> --- /dev/null
>>> +++ b/include/linux/seqno-fence.h
>>> @@ -0,0 +1,105 @@
>>> +/*
>>> + * seqno-fence, using a dma-buf to synchronize fencing
>>> + *
>>> + * Copyright (C) 2012 Texas Instruments
>>> + * Copyright (C) 2012 Canonical Ltd
>>> + * Authors:
>>> + *   Rob Clark <rob.clark@linaro.org>
>>> + *   Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>> + *
>>> + * This program is free software; you can redistribute it and/or modify it
>>> + * under the terms of the GNU General Public License version 2 as published by
>>> + * the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful, but WITHOUT
>>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>>> + * more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License along with
>>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>>> + */
>>> +
>>> +#ifndef __LINUX_SEQNO_FENCE_H
>>> +#define __LINUX_SEQNO_FENCE_H
>>> +
>>> +#include <linux/fence.h>
>>> +#include <linux/dma-buf.h>
>>> +
>>> +struct seqno_fence {
>>> +       struct fence base;
>>> +
>>> +       const struct fence_ops *ops;
>>> +       struct dma_buf *sync_buf;
>>> +       uint32_t seqno_ofs;
>>> +};
>> Hi maarten,
>>
>> I'm applying dma-fence v11 and seqno-fence v4 to exynos drm and have
>> some proposals.
>>
>> The above seqno_fence structure has only one dmabuf. Shouldn't it have
>> mutiple dmabufs? For example, in case of drm driver, when pageflip is
>> requested, one framebuffer could have one more gem buffer for NV12M
>> format. And this means that one more exported dmabufs should be
>> sychronized with other devices. Below is simple structure for it,
> The fence guards a single operation, as such I didn't feel like more than one
> dma-buf was needed to guard it.
>
> Have you considered simply attaching multiple fences instead? Each with their own dma-buf.
> There has been some muttering about allowing multiple exclusive fences to be attached, for arm soc's.
>
> But I'm also considering getting rid of the dma-buf member and add a function call to retrieve it, since
> the sync dma-buf member should not be changing often, and it would zap 2 atomic ops on every fence,
> but I want it replaced by something that's not 10x more complicated.
>
> Maybe "int get_sync_dma_buf(fence, old_dma_buf, &new_dma_buf)" that will set new_dma_buf = NULL
> if the old_dma_buf is unchanged, and return true + return a new reference to the sync dma_buf if it's not identical to old_dma_buf.
> old_dma_buf can also be NULL or a dma_buf that belongs to a different fence->context entirely. It might be capable of
> returning an error, in which case the fence would count as being signaled. This could reduce the need for separately checking
> fence_is_signaled first.
>
> I think this would allow caching the synchronization dma_buf in a similar way without each fence needing
> to hold a reference to the dma_buf all the time, even for fences that are only used internally.
>
>> struct seqno_fence_dmabuf {
>>         struct list_head        list;
>>         int                            id;
>>         struct dmabuf          *sync_buf;
>>         uint32_t                   seqno_ops;
>>         uint32_t                   seqno;
>> };
>>
>> The member, id, could be used to identify which device sync_buf is
>> going to be accessed by. In case of drm driver, one framebuffer could
>> be accessed by one more devices, one is Display controller and another
>> is HDMI controller. So id would have crtc number.
> Why do you need this? the base fence already has a context member.
>

That was my missing point. Please ignore 'id'. If the fence-relevant
things are contained in each context (in the drm page flip case, an
event), each driver could call fence_signal() with the proper fence.
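
As a purely illustrative, hypothetical caller-side sketch of the
get_sync_dma_buf() idea quoted above (neither that function nor its
exact return convention exists in the posted patches):

static int update_cached_sync_buf(struct fence *fence,
				  struct dma_buf **cached)
{
	struct dma_buf *new_buf;
	int ret;

	ret = get_sync_dma_buf(fence, *cached, &new_buf);
	if (ret < 0)
		return ret; /* treated as if the fence had signaled */

	if (new_buf) {
		/* sync_buf changed: swap the cached reference */
		if (*cached)
			dma_buf_put(*cached);
		*cached = new_buf;
	}
	/* new_buf == NULL means the cached sync_buf is unchanged */
	return 0;
}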

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/7] mutex: add support for reservation style locks
  2013-01-15 12:33 ` [PATCH 2/7] mutex: add support for reservation style locks Maarten Lankhorst
  2013-01-15 13:43   ` Maarten Lankhorst
@ 2013-01-30  1:07   ` Rob Clark
  2013-01-30 11:08     ` Daniel Vetter
  2013-01-30 11:16     ` Maarten Lankhorst
  1 sibling, 2 replies; 32+ messages in thread
From: Rob Clark @ 2013-01-30  1:07 UTC (permalink / raw)
  To: Maarten Lankhorst; +Cc: dri-devel, linux-kernel, linux-media, linaro-mm-sig

On Tue, Jan 15, 2013 at 6:33 AM, Maarten Lankhorst
<m.b.lankhorst@gmail.com> wrote:
>

Hi Maarten,

This is a nice looking extension to avoid re-implementing a mutex in
the TTM/reservation code..  Of course, someone more familiar with the
mutex code should review it, but a bit of explanation about what and
why would be helpful.

> mutex_reserve_lock, and mutex_reserve_lock_interruptible:
>   Lock a buffer with a reservation_id set. reservation_id must not be set to 0,
>   since this is a special value that means no reservation_id.
>
>   Normally if reservation_id is not set, or is older than the reservation_id that's
>   currently set on the mutex, the behavior will be to wait normally.
>
>   However, if  the reservation_id is newer than the current reservation_id, -EAGAIN
>   will be returned, and this function must unreserve all other mutexes and then redo
>   a blocking lock with normal mutex calls to prevent a deadlock, then call
>   mutex_locked_set_reservation on successful locking to set the reservation_id inside
>   the lock.

It might be a bit clearer to write up how this works from the
perspective of the user of ticket_mutex first, separately from the
internal implementation, and then how it works internally.  Ie, the
mutex_set_reservation_fastpath() call is internal to the
implementation of ticket_mutex, but -EAGAIN is something the caller of
ticket_mutex has to deal with.  This might give a clearer picture of
how TTM / reservation uses this to prevent deadlock, so those less
familiar with TTM can understand it better.

Well, here is an attempt to start a write-up, which should perhaps
eventually be folded into Documentation/ticket-mutex-design.txt.  But
hopefully a better explanation of the problem and the solution will
encourage some review of the ticket_mutex changes.

==========================
Basic problem statement:
----- ------- ---------
GPU's do operations that commonly involve many buffers.  Those buffers
can be shared across contexts/processes, exist in different memory
domains (for example VRAM vs system memory), and so on.  And with
PRIME / dmabuf, they can even be shared across devices.  So there are
a handful of situations where the driver needs to wait for buffers to
become ready.  If you think about this in terms of waiting on a buffer
mutex for it to become available, this presents a problem because
there is no way to guarantee that buffers appear in an execbuf/batch in
the same order in all contexts.  That is directly under the control of
userspace, and a result of the sequence of GL calls that an application
makes, which results in the potential for deadlock.  The problem gets
more complex when you consider that the kernel may need to migrate the
buffer(s) into VRAM before the GPU operates on the buffer(s), which may
in turn require evicting some other buffers (and you don't want to
evict other buffers which are already queued up to the GPU), but for a
simplified understanding of the problem you can ignore this.

The algorithm that TTM came up with for dealing with this problem is
quite simple.  For each group of buffers (execbuf) that need to be
locked, the caller would be assigned a unique reservation_id, from a
global counter.  In case of deadlock in the process of locking all the
buffers associated with an execbuf, the one with the lowest
reservation_id wins, and the one with the higher reservation_id
unlocks all of the buffers that it has already locked, and then tries
again.

Originally TTM implemented this algorithm on top of an event-queue and
atomic-ops, but Maarten Lankhorst realized that by merging this with
the mutex code we could take advantage of the existing mutex fast-path
code and end up with a simpler solution, and so ticket_mutex was born.
(Well, there were also some additional complexities with the original
implementation when you start adding in cross-device buffer sharing
for PRIME.. Maarten could probably explain that better.)

How it is used:
--- -- -- -----

A very simplified version:

  int submit_execbuf(execbuf)
  {
      /* acquiring locks, before queuing up to GPU: */
      seqno = assign_global_seqno();
  retry:
      for (buf in execbuf->buffers) {
          ret = mutex_reserve_lock(&buf->lock, seqno);
          switch (ret) {
          case 0:
              /* we got the lock */
              break;
          case -EAGAIN:
              /* someone with a lower seqno, so unreserve and try again: */
              for (buf2 in reverse order starting before buf in execbuf->buffers)
                  mutex_unreserve_unlock(&buf2->lock);
              goto retry;
          default:
              goto err;
          }
      }

      /* now everything is good to go, submit job to GPU: */
      ...
  }

  int finish_execbuf(execbuf)
  {
      /* when GPU is finished: */
      for (buf in execbuf->buffers)
          mutex_unreserve_unlock(&buf->lock);
  }
==========================
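
One thing the simplified retry above glosses over is the slow path from
the patch description quoted below: after backing off with -EAGAIN, the
loser can block on the contended lock with mutex_reserve_lock_slow()
(which won't back off with -EAGAIN) before retrying the rest.  Roughly,
with the same pseudocode simplifications as above:

          case -EAGAIN:
              /* someone with a lower seqno, so unreserve everything: */
              for (buf2 in reverse order starting before buf in execbuf->buffers)
                  mutex_unreserve_unlock(&buf2->lock);
              /* block on the contended lock, then retry the others;
               * buf is already locked+reserved on the retry pass */
              mutex_reserve_lock_slow(&buf->lock, seqno);
              goto retry;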

anyways, for the rest of the patch, I'm still going through the
mutex/ticket_mutex code in conjunction with the reservation/fence
patches, so for now just a couple very superficial comments.

>   These functions will return -EDEADLK instead of -EAGAIN if reservation_id is the same
>   as the reservation_id that's attempted to lock the mutex with, since in that case you
>   presumably attempted to lock the same lock twice.
>
> mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow:
>   Similar to mutex_reserve_lock, except it won't backoff with -EAGAIN. This is useful
>   after mutex_reserve_lock failed with -EAGAIN, and you unreserved all buffers so no
>   deadlock can occur.
>
> mutex_unreserve_unlock:
>    Unlock a buffer reserved with the previous calls.
>
> Missing at the moment, maybe TODO?
>   * lockdep warnings when wrongly calling mutex_unreserve_unlock or mutex_unlock,
>     depending on whether reservation_id was set previously or not.
>     - Does lockdep have something for this or do I need to extend struct mutex?
>
>   * Check if lockdep warns if you unlock a lock that other locks were nested to.
>     - spin_lock(m); spin_lock_nest_lock(a, m); spin_unlock(m); spin_unlock(a);
>       would be nice if it gave a splat. Have to recheck if it does, though..
>
> Design:
>   I chose for ticket_mutex to encapsulate struct mutex, so the extra memory usage and
>   atomic set on init will only happen when you deliberately create a ticket lock.
>
>   Since the mutexes are mostly meant to protect buffer object serialization in ttm, not
>   much contention is expected. I could be slightly smarter with wakeups, but this would
>   be at the expense at adding a field to struct mutex_waiter. Because this would add
>   overhead to all cases where ticket_mutexes are not used, and ticket_mutexes are less
>   performance sensitive anyway since they only protect buffer objects, I didn't want to
>   do this. It's still better than ttm always calling wake_up_all, which does a
>   unconditional spin_lock_irqsave/irqrestore.
>
>   I needed this in kernel/mutex.c because of the extensions to __lock_common, which are
>   hopefully optimized away for all normal paths.
>
> Changes since RFC patch v1:
>  - Updated to use atomic_long instead of atomic, since the reservation_id was a long.
>  - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
>  - removed mutex_locked_set_reservation_id (or w/e it was called)
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> ---
>  include/linux/mutex.h |  86 +++++++++++++-
>  kernel/mutex.c        | 317 +++++++++++++++++++++++++++++++++++++++++++++++---
>  2 files changed, 387 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
> index 9121595..602c247 100644
> --- a/include/linux/mutex.h
> +++ b/include/linux/mutex.h
> @@ -62,6 +62,11 @@ struct mutex {
>  #endif
>  };
>
> +struct ticket_mutex {
> +       struct mutex base;
> +       atomic_long_t reservation_id;
> +};
> +
>  /*
>   * This is the control structure for tasks blocked on mutex,
>   * which resides on the blocked task's kernel stack:
> @@ -109,12 +114,24 @@ static inline void mutex_destroy(struct mutex *lock) {}
>                 __DEBUG_MUTEX_INITIALIZER(lockname) \
>                 __DEP_MAP_MUTEX_INITIALIZER(lockname) }
>
> +#define __TICKET_MUTEX_INITIALIZER(lockname) \
> +               { .base = __MUTEX_INITIALIZER(lockname) \
> +               , .reservation_id = ATOMIC_LONG_INIT(0) }
> +
>  #define DEFINE_MUTEX(mutexname) \
>         struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
>
>  extern void __mutex_init(struct mutex *lock, const char *name,
>                          struct lock_class_key *key);
>
> +static inline void __ticket_mutex_init(struct ticket_mutex *lock,
> +                                      const char *name,
> +                                      struct lock_class_key *key)
> +{
> +       __mutex_init(&lock->base, name, key);
> +       atomic_long_set(&lock->reservation_id, 0);
> +}
> +
>  /**
>   * mutex_is_locked - is the mutex locked
>   * @lock: the mutex to be queried
> @@ -133,26 +150,91 @@ static inline int mutex_is_locked(struct mutex *lock)
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>  extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
>  extern void _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest_lock);
> +
>  extern int __must_check mutex_lock_interruptible_nested(struct mutex *lock,
>                                         unsigned int subclass);
>  extern int __must_check mutex_lock_killable_nested(struct mutex *lock,
>                                         unsigned int subclass);
>
> +extern int __must_check _mutex_reserve_lock(struct ticket_mutex *lock,
> +                                       struct lockdep_map *nest_lock,
> +                                       unsigned long reservation_id);
> +
> +extern int __must_check _mutex_reserve_lock_interruptible(struct ticket_mutex *,
> +                                       struct lockdep_map *nest_lock,
> +                                       unsigned long reservation_id);
> +
> +extern void _mutex_reserve_lock_slow(struct ticket_mutex *lock,
> +                                    struct lockdep_map *nest_lock,
> +                                    unsigned long reservation_id);
> +
> +extern int __must_check _mutex_reserve_lock_intr_slow(struct ticket_mutex *,
> +                                       struct lockdep_map *nest_lock,
> +                                       unsigned long reservation_id);
> +
>  #define mutex_lock(lock) mutex_lock_nested(lock, 0)
>  #define mutex_lock_interruptible(lock) mutex_lock_interruptible_nested(lock, 0)
>  #define mutex_lock_killable(lock) mutex_lock_killable_nested(lock, 0)
>
>  #define mutex_lock_nest_lock(lock, nest_lock)                          \
>  do {                                                                   \
> -       typecheck(struct lockdep_map *, &(nest_lock)->dep_map);         \
> +       typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \

looks like that was an unintended whitespace change..

>         _mutex_lock_nest_lock(lock, &(nest_lock)->dep_map);             \
>  } while (0)
>
> +#define mutex_reserve_lock(lock, nest_lock, reservation_id)            \
> +({                                                                     \
> +       typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \
> +       _mutex_reserve_lock(lock, &(nest_lock)->dep_map, reservation_id);       \
> +})
> +
> +#define mutex_reserve_lock_interruptible(lock, nest_lock, reservation_id)      \
> +({                                                                     \
> +       typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \
> +       _mutex_reserve_lock_interruptible(lock, &(nest_lock)->dep_map,  \
> +                                          reservation_id);             \
> +})
> +
> +#define mutex_reserve_lock_slow(lock, nest_lock, reservation_id)       \
> +do {                                                                   \
> +       typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \
> +       _mutex_reserve_lock_slow(lock, &(nest_lock)->dep_map, reservation_id);  \
> +} while (0)
> +
> +#define mutex_reserve_lock_intr_slow(lock, nest_lock, reservation_id)  \
> +({                                                                     \
> +       typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \
> +       _mutex_reserve_lock_intr_slow(lock, &(nest_lock)->dep_map,      \
> +                                     reservation_id);                  \
> +})
> +
>  #else
>  extern void mutex_lock(struct mutex *lock);
>  extern int __must_check mutex_lock_interruptible(struct mutex *lock);
>  extern int __must_check mutex_lock_killable(struct mutex *lock);
>
> +extern int __must_check _mutex_reserve_lock(struct ticket_mutex *lock,
> +                                           unsigned long reservation_id);
> +extern int __must_check _mutex_reserve_lock_interruptible(struct ticket_mutex *,
> +                                               unsigned long reservation_id);
> +
> +extern void _mutex_reserve_lock_slow(struct ticket_mutex *lock,
> +                                    unsigned long reservation_id);
> +extern int __must_check _mutex_reserve_lock_intr_slow(struct ticket_mutex *,
> +                                               unsigned long reservation_id);
> +
> +#define mutex_reserve_lock(lock, nest_lock, reservation_id)            \
> +       _mutex_reserve_lock(lock, reservation_id)
> +
> +#define mutex_reserve_lock_interruptible(lock, nest_lock, reservation_id)      \
> +       _mutex_reserve_lock_interruptible(lock, reservation_id)
> +
> +#define mutex_reserve_lock_slow(lock, nest_lock, reservation_id)       \
> +       _mutex_reserve_lock_slow(lock, reservation_id)
> +
> +#define mutex_reserve_lock_intr_slow(lock, nest_lock, reservation_id)  \
> +       _mutex_reserve_lock_intr_slow(lock, reservation_id)
> +
>  # define mutex_lock_nested(lock, subclass) mutex_lock(lock)
>  # define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
>  # define mutex_lock_killable_nested(lock, subclass) mutex_lock_killable(lock)
> @@ -167,6 +249,8 @@ extern int __must_check mutex_lock_killable(struct mutex *lock);
>   */
>  extern int mutex_trylock(struct mutex *lock);
>  extern void mutex_unlock(struct mutex *lock);
> +extern void mutex_unreserve_unlock(struct ticket_mutex *lock);
> +
>  extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
>
>  #ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX
> diff --git a/kernel/mutex.c b/kernel/mutex.c
> index a307cc9..8282729 100644
> --- a/kernel/mutex.c
> +++ b/kernel/mutex.c
> @@ -126,16 +126,119 @@ void __sched mutex_unlock(struct mutex *lock)
>
>  EXPORT_SYMBOL(mutex_unlock);
>
> +/**
> + * mutex_unreserve_unlock - release the mutex
> + * @lock: the mutex to be released
> + *
> + * Unlock a mutex that has been locked by this task previously
> + * with _mutex_reserve_lock*.
> + *
> + * This function must not be used in interrupt context. Unlocking
> + * of a not locked mutex is not allowed.
> + */
> +void __sched mutex_unreserve_unlock(struct ticket_mutex *lock)
> +{
> +       /*
> +        * mark mutex as no longer part of a reservation, next
> +        * locker can set this again
> +        */
> +       atomic_long_set(&lock->reservation_id, 0);
> +
> +       /*
> +        * The unlocking fastpath is the 0->1 transition from 'locked'
> +        * into 'unlocked' state:
> +        */
> +#ifndef CONFIG_DEBUG_MUTEXES
> +       /*
> +        * When debugging is enabled we must not clear the owner before time,
> +        * the slow path will always be taken, and that clears the owner field
> +        * after verifying that it was indeed current.
> +        */
> +       mutex_clear_owner(&lock->base);
> +#endif
> +       __mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
> +}
> +EXPORT_SYMBOL(mutex_unreserve_unlock);
> +
> +static inline int __sched
> +__mutex_lock_check_reserve(struct mutex *lock, unsigned long reservation_id)
> +{
> +       struct ticket_mutex *m = container_of(lock, struct ticket_mutex, base);
> +       unsigned long cur_id;
> +
> +       cur_id = atomic_long_read(&m->reservation_id);
> +       if (!cur_id)
> +               return 0;
> +
> +       if (unlikely(reservation_id == cur_id))
> +               return -EDEADLK;
> +
> +       if (unlikely(reservation_id - cur_id <= LONG_MAX))
> +               return -EAGAIN;
> +
> +       return 0;
> +}
> +
> +/*
> + * after acquiring lock with fastpath or when we lost out in contested
> + * slowpath, set reservation_id and wake up any waiters so they can recheck.
> + */

I think that is a bit misleading; if I'm understanding correctly, this
is called once you get the lock (in either the fast or the slow path)

> +static __always_inline void
> +mutex_set_reservation_fastpath(struct ticket_mutex *lock,
> +                              unsigned long reservation_id, bool check_res)
> +{
> +       unsigned long flags;
> +       struct mutex_waiter *cur;
> +
> +       if (check_res || config_enabled(CONFIG_DEBUG_LOCK_ALLOC)) {
> +               unsigned long cur_id;
> +
> +               cur_id = atomic_long_xchg(&lock->reservation_id,
> +                                         reservation_id);
> +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> +               if (check_res)
> +                       DEBUG_LOCKS_WARN_ON(cur_id &&
> +                                          cur_id != reservation_id);
> +               else
> +                       DEBUG_LOCKS_WARN_ON(cur_id);
> +               lockdep_assert_held(&lock->base);
> +#endif
> +
> +               if (unlikely(cur_id == reservation_id))
> +                       return;
> +       } else
> +               atomic_long_set(&lock->reservation_id, reservation_id);
> +
> +       /*
> +        * Check if lock is contended, if not there is nobody to wake up
> +        */
> +       if (likely(atomic_read(&lock->base.count) == 0))
> +               return;
> +
> +       /*
> +        * Uh oh, we raced in fastpath, wake up everyone in this case,
> +        * so they can see the new reservation_id
> +        */
> +       spin_lock_mutex(&lock->base.wait_lock, flags);
> +       list_for_each_entry(cur, &lock->base.wait_list, list) {
> +               debug_mutex_wake_waiter(&lock->base, cur);
> +               wake_up_process(cur->task);
> +       }
> +       spin_unlock_mutex(&lock->base.wait_lock, flags);
> +}
> +
>  /*
>   * Lock a mutex (possibly interruptible), slowpath:
>   */
>  static inline int __sched
>  __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
> -                   struct lockdep_map *nest_lock, unsigned long ip)
> +                   struct lockdep_map *nest_lock, unsigned long ip,
> +                   unsigned long reservation_id, bool res_slow)
>  {
>         struct task_struct *task = current;
>         struct mutex_waiter waiter;
>         unsigned long flags;
> +       int ret;
>
>         preempt_disable();
>         mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
> @@ -162,6 +265,12 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>         for (;;) {
>                 struct task_struct *owner;
>
> +               if (!__builtin_constant_p(reservation_id) && !res_slow) {
> +                       ret = __mutex_lock_check_reserve(lock, reservation_id);
> +                       if (ret)
> +                               goto err_nowait;
> +               }
> +
>                 /*
>                  * If there's an owner, wait for it to either
>                  * release the lock or go to sleep.
> @@ -172,6 +281,13 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>
>                 if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
>                         lock_acquired(&lock->dep_map, ip);
> +                       if (res_slow) {
> +                               struct ticket_mutex *m;
> +                               m = container_of(lock, struct ticket_mutex, base);
> +
> +                               mutex_set_reservation_fastpath(m, reservation_id, false);
> +                       }
> +
>                         mutex_set_owner(lock);
>                         preempt_enable();
>                         return 0;
> @@ -227,15 +343,16 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
>                  * TASK_UNINTERRUPTIBLE case.)
>                  */
>                 if (unlikely(signal_pending_state(state, task))) {
> -                       mutex_remove_waiter(lock, &waiter,
> -                                           task_thread_info(task));
> -                       mutex_release(&lock->dep_map, 1, ip);
> -                       spin_unlock_mutex(&lock->wait_lock, flags);
> +                       ret = -EINTR;
> +                       goto err;
> +               }
>
> -                       debug_mutex_free_waiter(&waiter);
> -                       preempt_enable();
> -                       return -EINTR;
> +               if (!__builtin_constant_p(reservation_id) && !res_slow) {
> +                       ret = __mutex_lock_check_reserve(lock, reservation_id);
> +                       if (ret)
> +                               goto err;
>                 }
> +
>                 __set_task_state(task, state);
>
>                 /* didn't get the lock, go to sleep: */
> @@ -250,6 +367,28 @@ done:
>         mutex_remove_waiter(lock, &waiter, current_thread_info());
>         mutex_set_owner(lock);
>
> +       if (!__builtin_constant_p(reservation_id)) {
> +               struct ticket_mutex *m;
> +               struct mutex_waiter *cur;
> +               /*
> +                * this should get optimized out for the common case,
> +                * and is only important for _mutex_reserve_lock
> +                */
> +
> +               m = container_of(lock, struct ticket_mutex, base);
> +               atomic_long_set(&m->reservation_id, reservation_id);
> +
> +               /*
> +                * give any possible sleeping processes the chance to wake up,
> +                * so they can recheck if they have to back off from
> +                * reservations
> +                */
> +               list_for_each_entry(cur, &lock->wait_list, list) {
> +                       debug_mutex_wake_waiter(lock, cur);
> +                       wake_up_process(cur->task);
> +               }
> +       }
> +
>         /* set it to 0 if there are no waiters left: */
>         if (likely(list_empty(&lock->wait_list)))
>                 atomic_set(&lock->count, 0);
> @@ -260,6 +399,19 @@ done:
>         preempt_enable();
>
>         return 0;
> +
> +err:
> +       mutex_remove_waiter(lock, &waiter, task_thread_info(task));
> +       spin_unlock_mutex(&lock->wait_lock, flags);
> +       debug_mutex_free_waiter(&waiter);
> +
> +#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
> +err_nowait:
> +#endif
> +       mutex_release(&lock->dep_map, 1, ip);
> +
> +       preempt_enable();
> +       return ret;
>  }
>
>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
> @@ -267,7 +419,8 @@ void __sched
>  mutex_lock_nested(struct mutex *lock, unsigned int subclass)
>  {
>         might_sleep();
> -       __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, subclass, NULL, _RET_IP_);
> +       __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
> +                           subclass, NULL, _RET_IP_, 0, 0);
>  }
>
>  EXPORT_SYMBOL_GPL(mutex_lock_nested);
> @@ -276,7 +429,8 @@ void __sched
>  _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest)
>  {
>         might_sleep();
> -       __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0, nest, _RET_IP_);
> +       __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
> +                           0, nest, _RET_IP_, 0, 0);
>  }
>
>  EXPORT_SYMBOL_GPL(_mutex_lock_nest_lock);
> @@ -285,7 +439,8 @@ int __sched
>  mutex_lock_killable_nested(struct mutex *lock, unsigned int subclass)
>  {
>         might_sleep();
> -       return __mutex_lock_common(lock, TASK_KILLABLE, subclass, NULL, _RET_IP_);
> +       return __mutex_lock_common(lock, TASK_KILLABLE,
> +                                  subclass, NULL, _RET_IP_, 0, 0);
>  }
>  EXPORT_SYMBOL_GPL(mutex_lock_killable_nested);
>
> @@ -294,10 +449,63 @@ mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
>  {
>         might_sleep();
>         return __mutex_lock_common(lock, TASK_INTERRUPTIBLE,
> -                                  subclass, NULL, _RET_IP_);
> +                                  subclass, NULL, _RET_IP_, 0, 0);
>  }
>
>  EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
> +
> +int __sched
> +_mutex_reserve_lock(struct ticket_mutex *lock, struct lockdep_map *nest,
> +                   unsigned long reservation_id)
> +{
> +       DEBUG_LOCKS_WARN_ON(!reservation_id);
> +
> +       might_sleep();
> +       return __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
> +                                  0, nest, _RET_IP_, reservation_id, 0);
> +}
> +EXPORT_SYMBOL_GPL(_mutex_reserve_lock);
> +
> +
> +int __sched
> +_mutex_reserve_lock_interruptible(struct ticket_mutex *lock,
> +                                 struct lockdep_map *nest,
> +                                 unsigned long reservation_id)
> +{
> +       DEBUG_LOCKS_WARN_ON(!reservation_id);
> +
> +       might_sleep();
> +       return __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE,
> +                                  0, nest, _RET_IP_, reservation_id, 0);
> +}
> +EXPORT_SYMBOL_GPL(_mutex_reserve_lock_interruptible);
> +
> +void __sched
> +_mutex_reserve_lock_slow(struct ticket_mutex *lock, struct lockdep_map *nest,
> +                        unsigned long reservation_id)
> +{
> +       DEBUG_LOCKS_WARN_ON(!reservation_id);
> +
> +       might_sleep();
> +       __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE, 0,
> +                           nest, _RET_IP_, reservation_id, 1);
> +}
> +EXPORT_SYMBOL_GPL(_mutex_reserve_lock_slow);
> +
> +int __sched
> +_mutex_reserve_lock_intr_slow(struct ticket_mutex *lock,
> +                             struct lockdep_map *nest,
> +                             unsigned long reservation_id)
> +{
> +       DEBUG_LOCKS_WARN_ON(!reservation_id);
> +
> +       might_sleep();
> +       return __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE, 0,
> +                                  nest, _RET_IP_, reservation_id, 1);
> +}
> +EXPORT_SYMBOL_GPL(_mutex_reserve_lock_intr_slow);
> +
> +
>  #endif
>
>  /*
> @@ -400,7 +608,8 @@ __mutex_lock_slowpath(atomic_t *lock_count)
>  {
>         struct mutex *lock = container_of(lock_count, struct mutex, count);
>
> -       __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0, NULL, _RET_IP_);
> +       __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0,
> +                           NULL, _RET_IP_, 0, 0);
>  }
>
>  static noinline int __sched
> @@ -408,7 +617,8 @@ __mutex_lock_killable_slowpath(atomic_t *lock_count)
>  {
>         struct mutex *lock = container_of(lock_count, struct mutex, count);
>
> -       return __mutex_lock_common(lock, TASK_KILLABLE, 0, NULL, _RET_IP_);
> +       return __mutex_lock_common(lock, TASK_KILLABLE, 0,
> +                                  NULL, _RET_IP_, 0, 0);
>  }
>
>  static noinline int __sched
> @@ -416,8 +626,28 @@ __mutex_lock_interruptible_slowpath(atomic_t *lock_count)
>  {
>         struct mutex *lock = container_of(lock_count, struct mutex, count);
>
> -       return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0, NULL, _RET_IP_);
> +       return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0,
> +                                  NULL, _RET_IP_, 0, 0);
> +}
> +
> +static noinline int __sched
> +__mutex_lock_reserve_slowpath(atomic_t *lock_count, void *rid)
> +{
> +       struct mutex *lock = container_of(lock_count, struct mutex, count);
> +
> +       return __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE, 0,
> +                                  NULL, _RET_IP_, (unsigned long)rid, 0);
> +}
> +
> +static noinline int __sched
> +__mutex_lock_interruptible_reserve_slowpath(atomic_t *lock_count, void *rid)
> +{
> +       struct mutex *lock = container_of(lock_count, struct mutex, count);
> +
> +       return __mutex_lock_common(lock, TASK_INTERRUPTIBLE, 0,
> +                                  NULL, _RET_IP_, (unsigned long)rid, 0);
>  }
> +
>  #endif
>
>  /*
> @@ -473,6 +703,63 @@ int __sched mutex_trylock(struct mutex *lock)
>  }
>  EXPORT_SYMBOL(mutex_trylock);
>
> +#ifndef CONFIG_DEBUG_LOCK_ALLOC
> +int __sched
> +_mutex_reserve_lock(struct ticket_mutex *lock, unsigned long rid)
> +{
> +       int ret;
> +
> +       might_sleep();
> +
> +       ret = __mutex_fastpath_lock_retval_arg(&lock->base.count, (void *)rid,
> +                                       __mutex_lock_reserve_slowpath);
> +
> +       if (!ret) {
> +               mutex_set_reservation_fastpath(lock, rid, true);
> +               mutex_set_owner(&lock->base);
> +       }
> +       return ret;
> +}
> +EXPORT_SYMBOL(_mutex_reserve_lock);
> +
> +int __sched
> +_mutex_reserve_lock_interruptible(struct ticket_mutex *lock, unsigned long rid)
> +{
> +       int ret;
> +
> +       might_sleep();
> +
> +       ret = __mutex_fastpath_lock_retval_arg(&lock->base.count, (void *)rid,
> +                               __mutex_lock_interruptible_reserve_slowpath);
> +
> +       if (!ret) {
> +               mutex_set_reservation_fastpath(lock, rid, true);
> +               mutex_set_owner(&lock->base);
> +       }
> +       return ret;
> +}
> +EXPORT_SYMBOL(_mutex_reserve_lock_interruptible);
> +
> +void __sched
> +_mutex_reserve_lock_slow(struct ticket_mutex *lock, unsigned long rid)
> +{
> +       might_sleep();
> +       __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
> +                           0, NULL, _RET_IP_, rid, 1);
> +}
> +EXPORT_SYMBOL(_mutex_reserve_lock_slow);
> +
> +int __sched
> +_mutex_reserve_lock_intr_slow(struct ticket_mutex *lock, unsigned long rid)
> +{
> +       might_sleep();
> +       return __mutex_lock_common(&lock->base, TASK_INTERRUPTIBLE,
> +                                  0, NULL, _RET_IP_, rid, 1);
> +}
> +EXPORT_SYMBOL(_mutex_reserve_lock_intr_slow);
> +
> +#endif
> +
>  /**
>   * atomic_dec_and_mutex_lock - return holding mutex if we dec to 0
>   * @cnt: the atomic which we are to dec
> --
> 1.8.0.3
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/7] mutex: add support for reservation style locks
  2013-01-30  1:07   ` Rob Clark
@ 2013-01-30 11:08     ` Daniel Vetter
  2013-01-30 11:52       ` Rob Clark
  2013-01-30 11:16     ` Maarten Lankhorst
  1 sibling, 1 reply; 32+ messages in thread
From: Daniel Vetter @ 2013-01-30 11:08 UTC (permalink / raw)
  To: Rob Clark
  Cc: Maarten Lankhorst, linaro-mm-sig, linux-kernel, dri-devel, linux-media

On Wed, Jan 30, 2013 at 2:07 AM, Rob Clark <robdclark@gmail.com> wrote:
> ==========================
> Basic problem statement:
> ----- ------- ---------
> GPU's do operations that commonly involve many buffers.  Those buffers
> can be shared across contexts/processes, exist in different memory
> domains (for example VRAM vs system memory), and so on.  And with
> PRIME / dmabuf, they can even be shared across devices.  So there are
> a handful of situations where the driver needs to wait for buffers to
> become ready.  If you think about this in terms of waiting on a buffer
> mutex for it to become available, this presents a problem because
> there is no way to guarantee that buffers appear in an execbuf/batch in
> the same order in all contexts.  That is directly under control of
> userspace, and a result of the sequence of GL calls that an
> application makes.  Which results in the potential for deadlock.  The
> problem gets more complex when you consider that the kernel may need
> to migrate the buffer(s) into VRAM before the GPU operates on the
> buffer(s), which may in turn require evicting some other buffers (and
> you don't want to evict other buffers which are already queued up to
> the GPU), but for a simplified understanding of the problem you can
> ignore this.
>
> The algorithm that TTM came up with for dealing with this problem is
> quite simple.  For each group of buffers (execbuf) that need to be
> locked, the caller would be assigned a unique reservation_id, from a
> global counter.  In case of deadlock in the process of locking all the
> buffers associated with a execbuf, the one with the lowest
> reservation_id wins, and the one with the higher reservation_id
> unlocks all of the buffers that it has already locked, and then tries
> again.
>
> Originally TTM implemented this algorithm on top of an event-queue and
> atomic-ops, but Maarten Lankhorst realized that by merging this with
> the mutex code we could take advantage of the existing mutex fast-path
> code and result in a simpler solution, and so ticket_mutex was born.
> (Well, there were also some additional complexities with the original
> implementation when you start adding in cross-device buffer sharing
> for PRIME.. Maarten could probably better explain.)

I think the motivational writeup above is really nice, but the example
code below is a bit wrong

> How it is used:
> --- -- -- -----
>
> A very simplified version:
>
>   int submit_execbuf(execbuf)
>   {
>       /* acquiring locks, before queuing up to GPU: */
>       seqno = assign_global_seqno();
>   retry:
>       for (buf in execbuf->buffers) {
>           ret = mutex_reserve_lock(&buf->lock, seqno);
>           switch (ret) {
>           case 0:
>               /* we got the lock */
>               break;
>           case -EAGAIN:
>               /* someone with a lower seqno, so unreserve and try again: */
>               for (buf2 in reverse order starting before buf in
> execbuf->buffers)
>                   mutex_unreserve_unlock(&buf2->lock);
>               goto retry;
>           default:
>               goto err;
>           }
>       }
>
>       /* now everything is good to go, submit job to GPU: */
>       ...
>   }
>
>   int finish_execbuf(execbuf)
>   {
>       /* when GPU is finished: */
>       for (buf in execbuf->buffers)
>           mutex_unreserve_unlock(&buf->lock);
>   }
> ==========================

Since gpu command submission is all async (hopefully at least) we
don't unlock once it completes, but right away after the commands are
submitted. Otherwise you wouldn't be able to submit new execbufs using
the same buffer objects (and besides, holding locks while going back
out to userspace is evil).

The trick is to add a fence object for async operation (essentially a
waitqueue on steroids to support gpu->gpu direct signalling). And
updating fences for a given execbuf needs to happen atomically for all
buffers, for otherwise userspace could trick the kernel into creating
a circular fence chain. This wouldn't deadlock the kernel, since
everything is async, but it'll nicely deadlock the gpus involved.
Hence why we need ticketing locks to get dma_buf fences off the
ground.
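
To make the ordering concrete, the async flow looks roughly like this, in the
same simplified pseudocode as above (the fence create/attach calls here are
just placeholders, not the actual API from the fence patch):

    int submit_execbuf(execbuf)
    {
        /* reserve all buffers with the ticket/seqno scheme as before */
        ...

        fence = create_fence_for_this_submission();
        for (buf in execbuf->buffers)
            attach_fence(buf, fence);   /* atomically, while all locks are held */

        submit_to_gpu(execbuf);         /* hw signals the fence when it's done */

        for (buf in execbuf->buffers)
            mutex_unreserve_unlock(&buf->lock);  /* drop the locks right away */

        /* no finish_execbuf() step; later users just wait on the buffers' fences */
    }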

Maybe wait for Maarten's feedback, then update your motivational blurb a bit?

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/7] mutex: add support for reservation style locks
  2013-01-30  1:07   ` Rob Clark
  2013-01-30 11:08     ` Daniel Vetter
@ 2013-01-30 11:16     ` Maarten Lankhorst
  1 sibling, 0 replies; 32+ messages in thread
From: Maarten Lankhorst @ 2013-01-30 11:16 UTC (permalink / raw)
  To: Rob Clark
  Cc: Maarten Lankhorst, linaro-mm-sig, linux-kernel, dri-devel, linux-media

Op 30-01-13 02:07, Rob Clark schreef:
> On Tue, Jan 15, 2013 at 6:33 AM, Maarten Lankhorst
> <m.b.lankhorst@gmail.com> wrote:
> Hi Maarten,
>
> This is a nice looking extension to avoid re-implementing a mutex in
> TTM/reservation code..  ofc, someone more familiar with the mutex
> code should probably review, but a bit of explanation about
> what and why would be helpful.
>
>> mutex_reserve_lock, and mutex_reserve_lock_interruptible:
>>   Lock a buffer with a reservation_id set. reservation_id must not be set to 0,
>>   since this is a special value that means no reservation_id.
>>
>>   Normally if reservation_id is not set, or is older than the reservation_id that's
>>   currently set on the mutex, the behavior will be to wait normally.
>>
>>   However, if  the reservation_id is newer than the current reservation_id, -EAGAIN
>>   will be returned, and this function must unreserve all other mutexes and then redo
>>   a blocking lock with normal mutex calls to prevent a deadlock, then call
>>   mutex_locked_set_reservation on successful locking to set the reservation_id inside
>>   the lock.
> It might be a bit more clear to write up how this works from the
> perspective of the user of ticket_mutex, separately from the internal
> implementation first, and then how it works internally?  Ie, the
> mutex_set_reservation_fastpath() call is internal to the
> implementation of ticket_mutex, but -EAGAIN is something the caller of
> ticket_mutex shall deal with.  This might give a clearer picture of
> how TTM / reservation uses this to prevent deadlock, so those less
> familiar with TTM could better understand.
>
> Well, here is an attempt to start a write-up, which should perhaps
> eventually be folded into Documentation/ticket-mutex-design.txt.  But
> hopefully a better explanation of the problem and the solution will
> encourage some review of the ticket_mutex changes.
>
> ==========================
> Basic problem statement:
> ----- ------- ---------
> GPU's do operations that commonly involve many buffers.  Those buffers
> can be shared across contexts/processes, exist in different memory
> domains (for example VRAM vs system memory), and so on.  And with
> PRIME / dmabuf, they can even be shared across devices.  So there are
> a handful of situations where the driver needs to wait for buffers to
> become ready.  If you think about this in terms of waiting on a buffer
> mutex for it to become available, this presents a problem because
> there is no way to guarantee that buffers appear in an execbuf/batch in
> the same order in all contexts.  That is directly under control of
> userspace, and a result of the sequence of GL calls that an
> application makes.  Which results in the potential for deadlock.  The
> problem gets more complex when you consider that the kernel may need
> to migrate the buffer(s) into VRAM before the GPU operates on the
> buffer(s), which may in turn require evicting some other buffers (and
> you don't want to evict other buffers which are already queued up to
> the GPU), but for a simplified understanding of the problem you can
> ignore this.
>
> The algorithm that TTM came up with for dealing with this problem is
> quite simple.  For each group of buffers (execbuf) that need to be
> locked, the caller would be assigned a unique reservation_id, from a
> global counter.  In case of deadlock in the process of locking all the
> buffers associated with a execbuf, the one with the lowest
> reservation_id wins, and the one with the higher reservation_id
> unlocks all of the buffers that it has already locked, and then tries
> again.
>
> Originally TTM implemented this algorithm on top of an event-queue and
> atomic-ops, but Maarten Lankhorst realized that by merging this with
> the mutex code we could take advantage of the existing mutex fast-path
> code and result in a simpler solution, and so ticket_mutex was born.
> (Well, there were also some additional complexities with the original
> implementation when you start adding in cross-device buffer sharing
> for PRIME.. Maarten could probably better explain.)
>
> How it is used:
> --- -- -- -----
>
> A very simplified version:
>
>   int submit_execbuf(execbuf)
>   {
>       /* acquiring locks, before queuing up to GPU: */
>       seqno = assign_global_seqno();
You also need to create a 'lock' type for the seqno, and acquire it for lockdep purposes.
This will be a virtual lock that only exists in lockdep, but it's needed for proper lockdep annotation.

See reservation_ticket_init/fini. It's also important that seqno must not be 0, ever.
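
Roughly like this (the helpers are from the reservation patch in this series;
treat the exact usage below as a sketch rather than the definitive API):

    struct reservation_ticket ticket;

    reservation_ticket_init(&ticket);  /* lockdep-wise this "acquires" the
                                        * virtual ticket lock, and it hands
                                        * out a fresh, non-zero seqno */

    /* ... mutex_reserve_lock(&buf->lock, <the ticket's seqno>) on each buffer ... */

    reservation_ticket_fini(&ticket);  /* "releases" the virtual lock again */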


>   retry:
>       for (buf in execbuf->buffers) {
>           ret = mutex_reserve_lock(&buf->lock, seqno);
The lockdep class for this lock must be the same for all reservations, and for maximum lockdep usability
you want the buf->lock lockdep class to be the same for all objects across all devices too.

The __ticket_mutex_init in reservation_object_init does just that for you. :-)
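
I.e. something like this, with one global class key so that every buffer lock
on every device ends up in the same lockdep class (the struct and member names
below are only illustrative):

    static struct lock_class_key reservation_object_class;

    void reservation_object_init(struct reservation_object *obj)
    {
        __ticket_mutex_init(&obj->lock, "reservation_object",
                            &reservation_object_class);
        /* ... rest of the init ... */
    }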

>           switch (ret) {
>           case 0:
>               /* we got the lock */
>               break;
>           case -EAGAIN:
>               /* someone with a lower seqno, so unreserve and try again: */
>               for (buf2 in reverse order starting before buf in
> execbuf->buffers)
>                   mutex_unreserve_unlock(&buf2->lock);
>               goto retry;
Almost correct: after unreserving all other buffers you need to re-grab buf->lock with mutex_reserve_lock_slow, then goto retry, and skip over this bo when doing the normal locking.

The difference between mutex_reserve_lock and mutex_reserve_lock_slow is that mutex_reserve_lock_slow will block indefinitely, whereas mutex_reserve_lock would return -EAGAIN.
mutex_reserve_lock_slow does not return an error code. mutex_reserve_lock_intr_slow can return -EINTR if interrupted.
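
So the retry path ends up looking roughly like this, in the same simplified
pseudocode as your example (unreserve_all_locked() is just shorthand here for
walking the already-reserved buffers and calling mutex_unreserve_unlock on
them):

    int submit_execbuf(execbuf)
    {
        struct buf *contended = NULL;

        seqno = assign_global_seqno();
    retry:
        for (buf in execbuf->buffers) {
            if (buf == contended) {
                /* already locked by mutex_reserve_lock_slow below */
                contended = NULL;
                continue;
            }
            ret = mutex_reserve_lock(&buf->lock, seqno);
            if (ret == -EAGAIN) {
                /* back off: drop every lock we hold, including a
                 * previously contended buffer if we still hold it */
                unreserve_all_locked(execbuf);
                /* then block until the contended lock is ours */
                mutex_reserve_lock_slow(&buf->lock, seqno);
                contended = buf;
                goto retry;
            } else if (ret) {
                goto err;
            }
        }

        /* all buffers locked, submit job to GPU: */
        ...
    }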
>           default:
>               goto err;
>           }
>       }
>
>       /* now everything is good to go, submit job to GPU: */
>       ...
>   }
>
>   int finish_execbuf(execbuf)
>   {
>       /* when GPU is finished: */
>       for (buf in execbuf->buffers)
>           mutex_unreserve_unlock(&buf->lock);
>   }
> ==========================
Thanks for putting the effort into writing this.
> anyways, for the rest of the patch, I'm still going through the
> mutex/ticket_mutex code in conjunction with the reservation/fence
> patches, so for now just a couple very superficial comments.
>
>>   These functions will return -EDEADLK instead of -EAGAIN if reservation_id is the same
>>   as the reservation_id that's attempted to lock the mutex with, since in that case you
>>   presumably attempted to lock the same lock twice.
>>
>> mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow:
>>   Similar to mutex_reserve_lock, except it won't backoff with -EAGAIN. This is useful
>>   after mutex_reserve_lock failed with -EAGAIN, and you unreserved all buffers so no
>>   deadlock can occur.
>>
>> mutex_unreserve_unlock:
>>    Unlock a buffer reserved with the previous calls.
>>
>> Missing at the moment, maybe TODO?
>>   * lockdep warnings when wrongly calling mutex_unreserve_unlock or mutex_unlock,
>>     depending on whether reservation_id was set previously or not.
>>     - Does lockdep have something for this or do I need to extend struct mutex?
>>
>>   * Check if lockdep warns if you unlock a lock that other locks were nested to.
>>     - spin_lock(m); spin_lock_nest_lock(a, m); spin_unlock(m); spin_unlock(a);
>>       would be nice if it gave a splat. Have to recheck if it does, though..
>>
>> Design:
>>   I chose for ticket_mutex to encapsulate struct mutex, so the extra memory usage and
>>   atomic set on init will only happen when you deliberately create a ticket lock.
>>
>>   Since the mutexes are mostly meant to protect buffer object serialization in ttm, not
>>   much contention is expected. I could be slightly smarter with wakeups, but this would
>>   be at the expense of adding a field to struct mutex_waiter. Because this would add
>>   overhead to all cases where ticket_mutexes are not used, and ticket_mutexes are less
>>   performance sensitive anyway since they only protect buffer objects, I didn't want to
>>   do this. It's still better than ttm always calling wake_up_all, which does an
>>   unconditional spin_lock_irqsave/irqrestore.
>>
>>   I needed this in kernel/mutex.c because of the extensions to __lock_common, which are
>>   hopefully optimized away for all normal paths.
>>
>> Changes since RFC patch v1:
>>  - Updated to use atomic_long instead of atomic, since the reservation_id was a long.
>>  - added mutex_reserve_lock_slow and mutex_reserve_lock_intr_slow
>>  - removed mutex_locked_set_reservation_id (or w/e it was called)
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> ---
>>  include/linux/mutex.h |  86 +++++++++++++-
>>  kernel/mutex.c        | 317 +++++++++++++++++++++++++++++++++++++++++++++++---
>>  2 files changed, 387 insertions(+), 16 deletions(-)
>>
>> diff --git a/include/linux/mutex.h b/include/linux/mutex.h
>> index 9121595..602c247 100644
>> --- a/include/linux/mutex.h
>> +++ b/include/linux/mutex.h
>> @@ -62,6 +62,11 @@ struct mutex {
>>  #endif
>>  };
>>
>> +struct ticket_mutex {
>> +       struct mutex base;
>> +       atomic_long_t reservation_id;
>> +};
>> +
>>  /*
>>   * This is the control structure for tasks blocked on mutex,
>>   * which resides on the blocked task's kernel stack:
>> @@ -109,12 +114,24 @@ static inline void mutex_destroy(struct mutex *lock) {}
>>                 __DEBUG_MUTEX_INITIALIZER(lockname) \
>>                 __DEP_MAP_MUTEX_INITIALIZER(lockname) }
>>
>> +#define __TICKET_MUTEX_INITIALIZER(lockname) \
>> +               { .base = __MUTEX_INITIALIZER(lockname) \
>> +               , .reservation_id = ATOMIC_LONG_INIT(0) }
>> +
>>  #define DEFINE_MUTEX(mutexname) \
>>         struct mutex mutexname = __MUTEX_INITIALIZER(mutexname)
>>
>>  extern void __mutex_init(struct mutex *lock, const char *name,
>>                          struct lock_class_key *key);
>>
>> +static inline void __ticket_mutex_init(struct ticket_mutex *lock,
>> +                                      const char *name,
>> +                                      struct lock_class_key *key)
>> +{
>> +       __mutex_init(&lock->base, name, key);
>> +       atomic_long_set(&lock->reservation_id, 0);
>> +}
>> +
>>  /**
>>   * mutex_is_locked - is the mutex locked
>>   * @lock: the mutex to be queried
>> @@ -133,26 +150,91 @@ static inline int mutex_is_locked(struct mutex *lock)
>>  #ifdef CONFIG_DEBUG_LOCK_ALLOC
>>  extern void mutex_lock_nested(struct mutex *lock, unsigned int subclass);
>>  extern void _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest_lock);
>> +
>>  extern int __must_check mutex_lock_interruptible_nested(struct mutex *lock,
>>                                         unsigned int subclass);
>>  extern int __must_check mutex_lock_killable_nested(struct mutex *lock,
>>                                         unsigned int subclass);
>>
>> +extern int __must_check _mutex_reserve_lock(struct ticket_mutex *lock,
>> +                                       struct lockdep_map *nest_lock,
>> +                                       unsigned long reservation_id);
>> +
>> +extern int __must_check _mutex_reserve_lock_interruptible(struct ticket_mutex *,
>> +                                       struct lockdep_map *nest_lock,
>> +                                       unsigned long reservation_id);
>> +
>> +extern void _mutex_reserve_lock_slow(struct ticket_mutex *lock,
>> +                                    struct lockdep_map *nest_lock,
>> +                                    unsigned long reservation_id);
>> +
>> +extern int __must_check _mutex_reserve_lock_intr_slow(struct ticket_mutex *,
>> +                                       struct lockdep_map *nest_lock,
>> +                                       unsigned long reservation_id);
>> +
>>  #define mutex_lock(lock) mutex_lock_nested(lock, 0)
>>  #define mutex_lock_interruptible(lock) mutex_lock_interruptible_nested(lock, 0)
>>  #define mutex_lock_killable(lock) mutex_lock_killable_nested(lock, 0)
>>
>>  #define mutex_lock_nest_lock(lock, nest_lock)                          \
>>  do {                                                                   \
>> -       typecheck(struct lockdep_map *, &(nest_lock)->dep_map);         \
>> +       typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \
> looks like that was an unintended whitespace change..
I think it was intentional, as the line would be just over 80 characters otherwise.

>>         _mutex_lock_nest_lock(lock, &(nest_lock)->dep_map);             \
>>  } while (0)
>>
>> +#define mutex_reserve_lock(lock, nest_lock, reservation_id)            \
>> +({                                                                     \
>> +       typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \
>> +       _mutex_reserve_lock(lock, &(nest_lock)->dep_map, reservation_id);       \
>> +})
>> +
>> +#define mutex_reserve_lock_interruptible(lock, nest_lock, reservation_id)      \
>> +({                                                                     \
>> +       typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \
>> +       _mutex_reserve_lock_interruptible(lock, &(nest_lock)->dep_map,  \
>> +                                          reservation_id);             \
>> +})
>> +
>> +#define mutex_reserve_lock_slow(lock, nest_lock, reservation_id)       \
>> +do {                                                                   \
>> +       typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \
>> +       _mutex_reserve_lock_slow(lock, &(nest_lock)->dep_map, reservation_id);  \
>> +} while (0)
>> +
>> +#define mutex_reserve_lock_intr_slow(lock, nest_lock, reservation_id)  \
>> +({                                                                     \
>> +       typecheck(struct lockdep_map *, &(nest_lock)->dep_map); \
>> +       _mutex_reserve_lock_intr_slow(lock, &(nest_lock)->dep_map,      \
>> +                                     reservation_id);                  \
>> +})
>> +
>>  #else
>>  extern void mutex_lock(struct mutex *lock);
>>  extern int __must_check mutex_lock_interruptible(struct mutex *lock);
>>  extern int __must_check mutex_lock_killable(struct mutex *lock);
>>
>> +extern int __must_check _mutex_reserve_lock(struct ticket_mutex *lock,
>> +                                           unsigned long reservation_id);
>> +extern int __must_check _mutex_reserve_lock_interruptible(struct ticket_mutex *,
>> +                                               unsigned long reservation_id);
>> +
>> +extern void _mutex_reserve_lock_slow(struct ticket_mutex *lock,
>> +                                    unsigned long reservation_id);
>> +extern int __must_check _mutex_reserve_lock_intr_slow(struct ticket_mutex *,
>> +                                               unsigned long reservation_id);
>> +
>> +#define mutex_reserve_lock(lock, nest_lock, reservation_id)            \
>> +       _mutex_reserve_lock(lock, reservation_id)
>> +
>> +#define mutex_reserve_lock_interruptible(lock, nest_lock, reservation_id)      \
>> +       _mutex_reserve_lock_interruptible(lock, reservation_id)
>> +
>> +#define mutex_reserve_lock_slow(lock, nest_lock, reservation_id)       \
>> +       _mutex_reserve_lock_slow(lock, reservation_id)
>> +
>> +#define mutex_reserve_lock_intr_slow(lock, nest_lock, reservation_id)  \
>> +       _mutex_reserve_lock_intr_slow(lock, reservation_id)
>> +
>>  # define mutex_lock_nested(lock, subclass) mutex_lock(lock)
>>  # define mutex_lock_interruptible_nested(lock, subclass) mutex_lock_interruptible(lock)
>>  # define mutex_lock_killable_nested(lock, subclass) mutex_lock_killable(lock)
>> @@ -167,6 +249,8 @@ extern int __must_check mutex_lock_killable(struct mutex *lock);
>>   */
>>  extern int mutex_trylock(struct mutex *lock);
>>  extern void mutex_unlock(struct mutex *lock);
>> +extern void mutex_unreserve_unlock(struct ticket_mutex *lock);
>> +
>>  extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
>>
>>  #ifndef CONFIG_HAVE_ARCH_MUTEX_CPU_RELAX
>> diff --git a/kernel/mutex.c b/kernel/mutex.c
>> index a307cc9..8282729 100644
>> --- a/kernel/mutex.c
>> +++ b/kernel/mutex.c
>> @@ -126,16 +126,119 @@ void __sched mutex_unlock(struct mutex *lock)
>>
>>  EXPORT_SYMBOL(mutex_unlock);
>>
>> +/**
>> + * mutex_unreserve_unlock - release the mutex
>> + * @lock: the mutex to be released
>> + *
>> + * Unlock a mutex that has been locked by this task previously
>> + * with _mutex_reserve_lock*.
>> + *
>> + * This function must not be used in interrupt context. Unlocking
>> + * of a not locked mutex is not allowed.
>> + */
>> +void __sched mutex_unreserve_unlock(struct ticket_mutex *lock)
>> +{
>> +       /*
>> +        * mark mutex as no longer part of a reservation, next
>> +        * locker can set this again
>> +        */
>> +       atomic_long_set(&lock->reservation_id, 0);
>> +
>> +       /*
>> +        * The unlocking fastpath is the 0->1 transition from 'locked'
>> +        * into 'unlocked' state:
>> +        */
>> +#ifndef CONFIG_DEBUG_MUTEXES
>> +       /*
>> +        * When debugging is enabled we must not clear the owner before time,
>> +        * the slow path will always be taken, and that clears the owner field
>> +        * after verifying that it was indeed current.
>> +        */
>> +       mutex_clear_owner(&lock->base);
>> +#endif
>> +       __mutex_fastpath_unlock(&lock->base.count, __mutex_unlock_slowpath);
>> +}
>> +EXPORT_SYMBOL(mutex_unreserve_unlock);
>> +
>> +static inline int __sched
>> +__mutex_lock_check_reserve(struct mutex *lock, unsigned long reservation_id)
>> +{
>> +       struct ticket_mutex *m = container_of(lock, struct ticket_mutex, base);
>> +       unsigned long cur_id;
>> +
>> +       cur_id = atomic_long_read(&m->reservation_id);
>> +       if (!cur_id)
>> +               return 0;
>> +
>> +       if (unlikely(reservation_id == cur_id))
>> +               return -EDEADLK;
>> +
>> +       if (unlikely(reservation_id - cur_id <= LONG_MAX))
>> +               return -EAGAIN;
>> +
>> +       return 0;
>> +}
>> +
>> +/*
>> + * after acquiring lock with fastpath or when we lost out in contested
>> + * slowpath, set reservation_id and wake up any waiters so they can recheck.
>> + */
> I think that is a bit misleading; if I'm understanding correctly, this
> is called once you get the lock (in either the fast or the slow path)
Yes, but strictly speaking it does not need to be called on the slow path: it will do an atomic_xchg to set reservation_id,
see that reservation_id was already set, and skip the rest. :-)

Maybe I should just return !__builtin_constant_p(reservation_id) in __mutex_lock_common to distinguish between fastpath and slowpath.
That would cause __mutex_lock_common to return 0 for normal mutexes, 1 when reservation_id is set and slowpath is used. This would
allow me to check whether mutex_set_reservation_fastpath needs to be called or not, and tighten up lockdep detection of mismatched
mutex_reserve_lock / mutex_lock with mutex_unlock / mutex_unreserve_unlock.

~Maarten

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/7] mutex: add support for reservation style locks
  2013-01-30 11:08     ` Daniel Vetter
@ 2013-01-30 11:52       ` Rob Clark
  2013-01-31 13:38         ` Rob Clark
  0 siblings, 1 reply; 32+ messages in thread
From: Rob Clark @ 2013-01-30 11:52 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Maarten Lankhorst, linaro-mm-sig, linux-kernel, dri-devel, linux-media

On Wed, Jan 30, 2013 at 5:08 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Wed, Jan 30, 2013 at 2:07 AM, Rob Clark <robdclark@gmail.com> wrote:
>> ==========================
>> Basic problem statement:
>> ----- ------- ---------
>> GPU's do operations that commonly involve many buffers.  Those buffers
>> can be shared across contexts/processes, exist in different memory
>> domains (for example VRAM vs system memory), and so on.  And with
>> PRIME / dmabuf, they can even be shared across devices.  So there are
>> a handful of situations where the driver needs to wait for buffers to
>> become ready.  If you think about this in terms of waiting on a buffer
>> mutex for it to become available, this presents a problem because
>> there is no way to guarantee that buffers appear in an execbuf/batch in
>> the same order in all contexts.  That is directly under control of
>> userspace, and a result of the sequence of GL calls that an
>> application makes.  Which results in the potential for deadlock.  The
>> problem gets more complex when you consider that the kernel may need
>> to migrate the buffer(s) into VRAM before the GPU operates on the
>> buffer(s), which may in turn require evicting some other buffers (and
>> you don't want to evict other buffers which are already queued up to
>> the GPU), but for a simplified understanding of the problem you can
>> ignore this.
>>
>> The algorithm that TTM came up with for dealing with this problem is
>> quite simple.  For each group of buffers (execbuf) that need to be
>> locked, the caller would be assigned a unique reservation_id, from a
>> global counter.  In case of deadlock in the process of locking all the
>> buffers associated with a execbuf, the one with the lowest
>> reservation_id wins, and the one with the higher reservation_id
>> unlocks all of the buffers that it has already locked, and then tries
>> again.
>>
>> Originally TTM implemented this algorithm on top of an event-queue and
>> atomic-ops, but Maarten Lankhorst realized that by merging this with
>> the mutex code we could take advantage of the existing mutex fast-path
>> code and result in a simpler solution, and so ticket_mutex was born.
>> (Well, there were also some additional complexities with the original
>> implementation when you start adding in cross-device buffer sharing
>> for PRIME.. Maarten could probably better explain.)
>
> I think the motivational writeup above is really nice, but the example
> code below is a bit wrong
>
>> How it is used:
>> --- -- -- -----
>>
>> A very simplified version:
>>
>>   int submit_execbuf(execbuf)
>>   {
>>       /* acquiring locks, before queuing up to GPU: */
>>       seqno = assign_global_seqno();
>>   retry:
>>       for (buf in execbuf->buffers) {
>>           ret = mutex_reserve_lock(&buf->lock, seqno);
>>           switch (ret) {
>>           case 0:
>>               /* we got the lock */
>>               break;
>>           case -EAGAIN:
>>               /* someone with a lower seqno, so unreserve and try again: */
>>               for (buf2 in reverse order starting before buf in
>> execbuf->buffers)
>>                   mutex_unreserve_unlock(&buf2->lock);
>>               goto retry;
>>           default:
>>               goto err;
>>           }
>>       }
>>
>>       /* now everything is good to go, submit job to GPU: */
>>       ...
>>   }
>>
>>   int finish_execbuf(execbuf)
>>   {
>>       /* when GPU is finished: */
>>       for (buf in execbuf->buffers)
>>           mutex_unreserve_unlock(&buf->lock);
>>   }
>> ==========================
>
> Since gpu command submission is all async (hopefully at least) we
> don't unlock once it completes, but right away after the commands are
> submitted. Otherwise you wouldn't be able to submit new execbufs using
> the same buffer objects (and besides, holding locks while going back
> out to userspace is evil).

right.. but I was trying to simplify the explanation for non-gpu
folk.. maybe that was an over-simplification ;-)

BR,
-R

> The trick is to add a fence object for async operation (essentially a
> waitqueue on steroids to support gpu->gpu direct signalling). And
> updating fences for a given execbuf needs to happen atomically for all
> buffers, for otherwise userspace could trick the kernel into creating
> a circular fence chain. This wouldn't deadlock the kernel, since
> everything is async, but it'll nicely deadlock the gpus involved.
> Hence why we need ticketing locks to get dma_buf fences off the
> ground.
>
> Maybe wait for Maarten's feedback, then update your motivational blurb a bit?
>
> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 4/7] fence: dma-buf cross-device synchronization (v11)
  2013-01-15 12:34 ` [PATCH 4/7] fence: dma-buf cross-device synchronization (v11) Maarten Lankhorst
  2013-01-22 15:13   ` [Linaro-mm-sig] " Francesco Lavra
@ 2013-01-31  9:32   ` Inki Dae
  2013-01-31  9:53     ` Maarten Lankhorst
  2013-01-31  9:57     ` Daniel Vetter
  1 sibling, 2 replies; 32+ messages in thread
From: Inki Dae @ 2013-01-31  9:32 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: dri-devel, linux-kernel, linux-media, linaro-mm-sig, Maarten Lankhorst

Hi,

below is my opinion.

> +struct fence;
> +struct fence_ops;
> +struct fence_cb;
> +
> +/**
> + * struct fence - software synchronization primitive
> + * @refcount: refcount for this fence
> + * @ops: fence_ops associated with this fence
> + * @cb_list: list of all callbacks to call
> + * @lock: spin_lock_irqsave used for locking
> + * @priv: fence specific private data
> + * @flags: A mask of FENCE_FLAG_* defined below
> + *
> + * the flags member must be manipulated and read using the appropriate
> + * atomic ops (bit_*), so taking the spinlock will not be needed most
> + * of the time.
> + *
> + * FENCE_FLAG_SIGNALED_BIT - fence is already signaled
> + * FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called*
> + * FENCE_FLAG_USER_BITS - start of the unused bits, can be used by the
> + * implementer of the fence for its own purposes. Can be used in different
> + * ways by different fence implementers, so do not rely on this.
> + *
> + * *) Since atomic bitops are used, this is not guaranteed to be the case.
> + * Particularly, if the bit was set, but fence_signal was called right
> + * before this bit was set, it would have been able to set the
> + * FENCE_FLAG_SIGNALED_BIT, before enable_signaling was called.
> + * Adding a check for FENCE_FLAG_SIGNALED_BIT after setting
> + * FENCE_FLAG_ENABLE_SIGNAL_BIT closes this race, and makes sure that
> + * after fence_signal was called, any enable_signaling call will have either
> + * been completed, or never called at all.
> + */
> +struct fence {
> +       struct kref refcount;
> +       const struct fence_ops *ops;
> +       struct list_head cb_list;
> +       spinlock_t *lock;
> +       unsigned context, seqno;
> +       unsigned long flags;
> +};
> +
> +enum fence_flag_bits {
> +       FENCE_FLAG_SIGNALED_BIT,
> +       FENCE_FLAG_ENABLE_SIGNAL_BIT,
> +       FENCE_FLAG_USER_BITS, /* must always be last member */
> +};
> +

It seems like this fence framework needs to add read/write flags.
In the case of two read operations, one might wait for the other one. But
the other is just a read operation, so we don't need to wait for it.
Shouldn't the fence-wait request be ignored? In this case, I think it's
enough to consider only the write operation.

For this, you could add the following,

enum fence_flag_bits {
        ...
        FENCE_FLAG_ACCESS_READ_BIT,
        FENCE_FLAG_ACCESS_WRITE_BIT,
        ...
};

And the producer could call fence_init() like below,
__fence_init(..., FENCE_FLAG_ACCESS_WRITE_BIT,...);

With this, fence->flags would have FENCE_FLAG_ACCESS_WRITE_BIT set for a
write operation, and the other sides (read or write operations) would wait
for the write operation to complete.
Likewise, a consumer would call that function with FENCE_FLAG_ACCESS_READ_BIT
so that other consumers could skip the fence-wait for read-only operations.

Thanks,
Inki Dae

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 4/7] fence: dma-buf cross-device synchronization (v11)
  2013-01-31  9:32   ` Inki Dae
@ 2013-01-31  9:53     ` Maarten Lankhorst
  2013-01-31  9:57     ` Daniel Vetter
  1 sibling, 0 replies; 32+ messages in thread
From: Maarten Lankhorst @ 2013-01-31  9:53 UTC (permalink / raw)
  To: Inki Dae
  Cc: Maarten Lankhorst, dri-devel, linux-kernel, linux-media, linaro-mm-sig

Op 31-01-13 10:32, Inki Dae schreef:
> Hi,
>
> below is my opinion.
>
>> +struct fence;
>> +struct fence_ops;
>> +struct fence_cb;
>> +
>> +/**
>> + * struct fence - software synchronization primitive
>> + * @refcount: refcount for this fence
>> + * @ops: fence_ops associated with this fence
>> + * @cb_list: list of all callbacks to call
>> + * @lock: spin_lock_irqsave used for locking
>> + * @priv: fence specific private data
>> + * @flags: A mask of FENCE_FLAG_* defined below
>> + *
>> + * the flags member must be manipulated and read using the appropriate
>> + * atomic ops (bit_*), so taking the spinlock will not be needed most
>> + * of the time.
>> + *
>> + * FENCE_FLAG_SIGNALED_BIT - fence is already signaled
>> + * FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called*
>> + * FENCE_FLAG_USER_BITS - start of the unused bits, can be used by the
>> + * implementer of the fence for its own purposes. Can be used in different
>> + * ways by different fence implementers, so do not rely on this.
>> + *
>> + * *) Since atomic bitops are used, this is not guaranteed to be the case.
>> + * Particularly, if the bit was set, but fence_signal was called right
>> + * before this bit was set, it would have been able to set the
>> + * FENCE_FLAG_SIGNALED_BIT, before enable_signaling was called.
>> + * Adding a check for FENCE_FLAG_SIGNALED_BIT after setting
>> + * FENCE_FLAG_ENABLE_SIGNAL_BIT closes this race, and makes sure that
>> + * after fence_signal was called, any enable_signaling call will have either
>> + * been completed, or never called at all.
>> + */
>> +struct fence {
>> +       struct kref refcount;
>> +       const struct fence_ops *ops;
>> +       struct list_head cb_list;
>> +       spinlock_t *lock;
>> +       unsigned context, seqno;
>> +       unsigned long flags;
>> +};
>> +
>> +enum fence_flag_bits {
>> +       FENCE_FLAG_SIGNALED_BIT,
>> +       FENCE_FLAG_ENABLE_SIGNAL_BIT,
>> +       FENCE_FLAG_USER_BITS, /* must always be last member */
>> +};
>> +
> It seems like this fence framework needs to add read/write flags.
> In the case of two read operations, one might wait for the other one. But
> the other is just a read operation, so we don't need to wait for it.
> Shouldn't the fence-wait request be ignored? In this case, I think it's
> enough to consider only the write operation.
>
> For this, you could add the following,
>
> enum fence_flag_bits {
>         ...
>         FENCE_FLAG_ACCESS_READ_BIT,
>         FENCE_FLAG_ACCESS_WRITE_BIT,
>         ...
> };
>
> And the producer could call fence_init() like below,
> __fence_init(..., FENCE_FLAG_ACCESS_WRITE_BIT,...);
>
> With this, fence->flags has FENCE_FLAG_ACCESS_WRITE_BIT as write
> operation and then other sides(read or write operation) would wait for
> the write operation completion.
> And also consumer calls that function with FENCE_FLAG_ACCESS_READ_BIT
> so that other consumers could ignore the fence-wait to any read
> operations.
>
You can't put that information in the fence. If you use a fence to fence off a hardware memcpy operation,
there would be one buffer for which you would attach the fence in read mode and another buffer where you need
write access.
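
As a sketch (using reservation_entry_init() from the reservation patch
later in this series; the exact name and 'shared' argument are
approximate, and src/dst stand for the two dma_bufs, each assumed to
carry a reservation_object as ->resv):

    /*
     * Sketch only: one hw memcpy job, one fence.  The read/write mode is a
     * property of how each buffer is reserved, not of the fence itself.
     */
    struct reservation_entry src_entry, dst_entry;

    reservation_entry_init(&src_entry, src->resv, true);  /* shared: only read */
    reservation_entry_init(&dst_entry, dst->resv, false); /* exclusive: written */

    /*
     * Both entries go on the same entries list, and the same fence gets
     * attached to both reservation objects later via ticket_commit().
     */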

~Maarten


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 4/7] fence: dma-buf cross-device synchronization (v11)
  2013-01-31  9:32   ` Inki Dae
  2013-01-31  9:53     ` Maarten Lankhorst
@ 2013-01-31  9:57     ` Daniel Vetter
  2013-01-31 14:38       ` Inki Dae
  1 sibling, 1 reply; 32+ messages in thread
From: Daniel Vetter @ 2013-01-31  9:57 UTC (permalink / raw)
  To: Inki Dae
  Cc: Maarten Lankhorst, linaro-mm-sig, linux-kernel, dri-devel, linux-media

On Thu, Jan 31, 2013 at 06:32:15PM +0900, Inki Dae wrote:
> Hi,
> 
> below is my opinion.
> 
> > +struct fence;
> > +struct fence_ops;
> > +struct fence_cb;
> > +
> > +/**
> > + * struct fence - software synchronization primitive
> > + * @refcount: refcount for this fence
> > + * @ops: fence_ops associated with this fence
> > + * @cb_list: list of all callbacks to call
> > + * @lock: spin_lock_irqsave used for locking
> > + * @priv: fence specific private data
> > + * @flags: A mask of FENCE_FLAG_* defined below
> > + *
> > + * the flags member must be manipulated and read using the appropriate
> > + * atomic ops (bit_*), so taking the spinlock will not be needed most
> > + * of the time.
> > + *
> > + * FENCE_FLAG_SIGNALED_BIT - fence is already signaled
> > + * FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called*
> > + * FENCE_FLAG_USER_BITS - start of the unused bits, can be used by the
> > + * implementer of the fence for its own purposes. Can be used in different
> > + * ways by different fence implementers, so do not rely on this.
> > + *
> > + * *) Since atomic bitops are used, this is not guaranteed to be the case.
> > + * Particularly, if the bit was set, but fence_signal was called right
> > + * before this bit was set, it would have been able to set the
> > + * FENCE_FLAG_SIGNALED_BIT, before enable_signaling was called.
> > + * Adding a check for FENCE_FLAG_SIGNALED_BIT after setting
> > + * FENCE_FLAG_ENABLE_SIGNAL_BIT closes this race, and makes sure that
> > + * after fence_signal was called, any enable_signaling call will have either
> > + * been completed, or never called at all.
> > + */
> > +struct fence {
> > +       struct kref refcount;
> > +       const struct fence_ops *ops;
> > +       struct list_head cb_list;
> > +       spinlock_t *lock;
> > +       unsigned context, seqno;
> > +       unsigned long flags;
> > +};
> > +
> > +enum fence_flag_bits {
> > +       FENCE_FLAG_SIGNALED_BIT,
> > +       FENCE_FLAG_ENABLE_SIGNAL_BIT,
> > +       FENCE_FLAG_USER_BITS, /* must always be last member */
> > +};
> > +
> 
> It seems like that this fence framework need to add read/write flags.
> In case of two read operations, one might wait for another one. But
> the another is just read operation so we doesn't need to wait for it.
> Shouldn't fence-wait-request be ignored? In this case, I think it's
> enough to consider just only write operation.
> 
> For this, you could add the following,
> 
> enum fence_flag_bits {
>         ...
>         FENCE_FLAG_ACCESS_READ_BIT,
>         FENCE_FLAG_ACCESS_WRITE_BIT,
>         ...
> };
> 
> And the producer could call fence_init() like below,
> __fence_init(..., FENCE_FLAG_ACCESS_WRITE_BIT,...);
> 
> With this, fence->flags has FENCE_FLAG_ACCESS_WRITE_BIT as write
> operation and then other sides(read or write operation) would wait for
> the write operation completion.
> And also consumer calls that function with FENCE_FLAG_ACCESS_READ_BIT
> so that other consumers could ignore the fence-wait to any read
> operations.

Fences here map more closely to the sync-point concept from the android
stuff. The idea is that they only signal when a hw operation completes.

Synchronization integration happens at the dma_buf level, where you can
specify whether the new operation you're doing is exclusive (which means
that you need to wait for all previous operations to complete), i.e. a
write. Or whether the operation is non-exclusive (i.e. just reading), in
which case you only need to wait for any still outstanding exclusive
fences attached to the dma_buf. But you _can_ attach more than one
non-exclusive fence to a dma_buf at the same time, and so e.g. read a
buffer object from different engines concurrently.

There's been some talk about whether we also need a non-exclusive write
attachment (i.e. allow multiple concurrent writers), but I don't yet fully
understand the use-case.

In short the proposed patches can do what you want to do, it's just that
read/write access isn't part of the fences, but how you attach fences to
dma_bufs.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/7] mutex: add support for reservation style locks
  2013-01-30 11:52       ` Rob Clark
@ 2013-01-31 13:38         ` Rob Clark
  0 siblings, 0 replies; 32+ messages in thread
From: Rob Clark @ 2013-01-31 13:38 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Maarten Lankhorst, linaro-mm-sig, linux-kernel, dri-devel, linux-media

On Wed, Jan 30, 2013 at 5:52 AM, Rob Clark <robdclark@gmail.com> wrote:
> On Wed, Jan 30, 2013 at 5:08 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Wed, Jan 30, 2013 at 2:07 AM, Rob Clark <robdclark@gmail.com> wrote:
>>> ==========================
>>> Basic problem statement:
>>> ----- ------- ---------
>>> GPU's do operations that commonly involve many buffers.  Those buffers
>>> can be shared across contexts/processes, exist in different memory
>>> domains (for example VRAM vs system memory), and so on.  And with
>>> PRIME / dmabuf, they can even be shared across devices.  So there are
>>> a handful of situations where the driver needs to wait for buffers to
>>> become ready.  If you think about this in terms of waiting on a buffer
>>> mutex for it to become available, this presents a problem because
>>> there is no way to guarantee that buffers appear in a execbuf/batch in
>>> the same order in all contexts.  That is directly under control of
>>> userspace, and a result of the sequence of GL calls that an
>>> application makes.  Which results in the potential for deadlock.  The
>>> problem gets more complex when you consider that the kernel may need
>>> to migrate the buffer(s) into VRAM before the GPU operates on the
>>> buffer(s), which main in turn require evicting some other buffers (and
>>> you don't want to evict other buffers which are already queued up to
>>> the GPU), but for a simplified understanding of the problem you can
>>> ignore this.
>>>
>>> The algorithm that TTM came up with for dealing with this problem is
>>> quite simple.  For each group of buffers (execbuf) that need to be
>>> locked, the caller would be assigned a unique reservation_id, from a
>>> global counter.  In case of deadlock in the process of locking all the
>>> buffers associated with a execbuf, the one with the lowest
>>> reservation_id wins, and the one with the higher reservation_id
>>> unlocks all of the buffers that it has already locked, and then tries
>>> again.
>>>
>>> Originally TTM implemented this algorithm on top of an event-queue and
>>> atomic-ops, but Maarten Lankhorst realized that by merging this with
>>> the mutex code we could take advantage of the existing mutex fast-path
>>> code and result in a simpler solution, and so ticket_mutex was born.
>>> (Well, there where also some additional complexities with the original
>>> implementation when you start adding in cross-device buffer sharing
>>> for PRIME.. Maarten could probably better explain.)
>>
>> I think the motivational writeup above is really nice, but the example
>> code below is a bit wrong
>>
>>> How it is used:
>>> --- -- -- -----
>>>
>>> A very simplified version:
>>>
>>>   int submit_execbuf(execbuf)
>>>   {
>>>       /* acquiring locks, before queuing up to GPU: */
>>>       seqno = assign_global_seqno();
>>>   retry:
>>>       for (buf in execbuf->buffers) {
>>>           ret = mutex_reserve_lock(&buf->lock, seqno);
>>>           switch (ret) {
>>>           case 0:
>>>               /* we got the lock */
>>>               break;
>>>           case -EAGAIN:
>>>               /* someone with a lower seqno, so unreserve and try again: */
>>>               for (buf2 in reverse order starting before buf in
>>> execbuf->buffers)
>>>                   mutex_unreserve_unlock(&buf2->lock);
>>>               goto retry;
>>>           default:
>>>               goto err;
>>>           }
>>>       }
>>>
>>>       /* now everything is good to go, submit job to GPU: */
>>>       ...
>>>   }
>>>
>>>   int finish_execbuf(execbuf)
>>>   {
>>>       /* when GPU is finished: */
>>>       for (buf in execbuf->buffers)
>>>           mutex_unreserve_unlock(&buf->lock);
>>>   }
>>> ==========================
>>
>> Since gpu command submission is all asnyc (hopefully at least) we
>> don't unlock once it completes, but right away after the commands are
>> submitted. Otherwise you wouldn't be able to submit new execbufs using
>> the same buffer objects (and besides, holding locks while going back
>> out to userspace is evil).
>
> right.. but I was trying to simplify the explanation for non-gpu
> folk.. maybe that was an over-simplification ;-)
>

Ok, here's a somewhat expanded version.. I meant to send this yesterday, but I forgot..

============================
Basic problem statement:
----- ------- ---------

GPUs do operations that commonly involve many buffers.  Those buffers
can be shared across contexts/processes, exist in different memory
domains (for example VRAM vs system memory), and so on.  And with
PRIME / dmabuf, they can even be shared across devices.  So there are
a handful of situations where the driver needs to wait for buffers to
become ready.  If you think about this in terms of waiting on a buffer
mutex for it to become available, this presents a problem because
there is no way to guarantee that buffers appear in an execbuf/batch in
the same order in all contexts.  That is directly under control of
userspace, and a result of the sequence of GL calls that an application
makes.  Which results in the potential for deadlock.  The problem gets
more complex when you consider that the kernel may need to migrate the
buffer(s) into VRAM before the GPU operates on the buffer(s), which
may in turn require evicting some other buffers (and you don't want to
evict other buffers which are already queued up to the GPU), but for a
simplified understanding of the problem you can ignore this.

The algorithm that TTM came up with for dealing with this problem is
quite simple.  For each group of buffers (execbuf) that need to be
locked, the caller would be assigned a unique reservation_id, from a
global counter.  In case of deadlock in the process of locking all the
buffers associated with an execbuf, the one with the lowest
reservation_id wins, and the one with the higher reservation_id
unlocks all of the buffers that it has already locked, and then tries
again.

Originally TTM implemented this algorithm on top of an event-queue and
atomic-ops, but Maarten Lankhorst realized that by merging this with
the mutex code we could take advantage of the existing mutex fast-path
code and result in a simpler solution, and so ticket_mutex was born.
(Well, there were also some additional complexities with the original
implementation when you start adding in cross-device buffer sharing
for PRIME.. Maarten could probably better explain.)


How it is used:
--- -- -- -----

A very simplified version:

    int lock_execbuf(execbuf)
    {
        struct buf *res_buf = NULL;

        /* acquiring locks, before queuing up to GPU: */
        seqno = assign_global_seqno();

    retry:
        for (buf in execbuf->buffers) {
            if (buf == res_buf) {
                /* already locked via the slowpath below; don't lock it twice */
                res_buf = NULL;
                continue;
            }
            ret = mutex_reserve_lock(&buf->lock, seqno);
            if (ret < 0)
                goto err;
        }

        /* now everything is good to go, submit job to GPU: */
        ...

        return 0;

    err:
        for (all buf2 before buf in execbuf->buffers)
            mutex_unreserve_unlock(&buf2->lock);
        if (res_buf)
            mutex_unreserve_unlock(&res_buf->lock);

        if (ret == -EAGAIN) {
            /* we lost out in a seqno race, lock and retry.. */
            mutex_reserve_lock_slow(&buf->lock, seqno);
            res_buf = buf;
            goto retry;
        }

        return ret;
    }

    int unlock_execbuf(execbuf)
    {
        /* when the GPU is finished: */
        for (buf in execbuf->buffers)
            mutex_unreserve_unlock(&buf->lock);
    }


What Really Happens:
---- ------ -------

(TODO maybe this should be Documentation/dma-fence-reservation.txt and
this file should just refer to it??  Well, we can shuffle things around
later..)

In real life, you want to keep the GPU operating asynchronously to the
CPU as much as possible, and not have to wait to queue up more work for
the GPU until the previous work is finished.  So in practice, you are
unlocking (unreserving) all the buffers once the execbuf is queued up
to the GPU.  The dma-buf fence objects, and the reservation code which
manages the fence objects (and is the primary user of ticket_mutex)
takes care of the synchronization of different buffer users from this
point.

If you really left the buffers locked until you got some irq back from
the GPU to let you know that the GPU was finished, then you would be
unable to queue up more rendering involving the same buffer(s), which
would be quite horrible for performance.

To just understand ticket_mutex, you can probably ignore this section.
If you want to make your driver share buffers with a GPU properly, then
you really need to be using reservation/fence, so you should read on.

NOTE: the reservation object and fence are split out from dma-buf so
      that a driver can use them both for its own internal buffers
      and for imported dma-bufs, without having to create a dma-buf
      for every internal buffer.

For each rendering command queued up to the GPU, a fence object is
created. You can think of the fence as a sort of waitqueue, except that
(if the other devices waiting on the same buffer support it), it
can be used for hw->hw signaling, so that CPU involvement is not
required.  A fence object is a transient, one-use object with two
states: initially it is created unsignaled, and later, after the hw
is done with the operation, it becomes signaled.

(TODO probably should refer to a different .txt with more details
about fences, hw->hw vs hw->sw vs sw->sw signaling, etc)

The same fence is attached to the reservation_object of all buffers
involved in a rendering command: in the fence_excl slot if the buffer
is being written to, otherwise in one of the fence_shared slots.  (A
buffer can safely have many readers at once.)
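
A minimal sketch of that attach step for a single buffer, modeled on
ticket_commit() from the reservation patch (reservation lock assumed
held; the WARN/overflow handling and list iteration are left out, and
the details are approximate):

    static void attach_fence(struct reservation_object *bo,
                             struct fence *fence, bool excl)
    {
        unsigned i;

        if (excl) {
            /* a writer replaces all previously attached fences */
            for (i = 0; i < bo->fence_shared_count; ++i)
                fence_put(bo->fence_shared[i]);
            bo->fence_shared_count = 0;
            if (bo->fence_excl)
                fence_put(bo->fence_excl);
            bo->fence_excl = fence;
        } else {
            /* a reader is added alongside any other readers */
            bo->fence_shared[bo->fence_shared_count++] = fence;
        }
        fence_get(fence);
    }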

If, when preparing to submit the rendering command, a buffer has an
unsignaled exclusive fence attached, then there must be some way to wait
for that fence to become signaled before the hw uses that buffer.  In
the simple case, if that fence isn't one that the driver understands
how to instruct the hw to wait for, then it must call fence_wait() to
block until other devices have finished writing to the buffer.  But if
the driver has a way to instruct the hw to wait until the fence is
signaled, it can just emit commands to instruct the GPU to wait in
order to avoid blocking.

NOTE: in actuality, if the fence is created on the same ring, and you
      therefore know that it will be signaled by the earlier render
      command before the hw sees the current render command, then
      inserting fence cmds to the hw can be skipped.
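
Putting that together, the per-buffer decision could look roughly like
this (sketch only: fence_wait() and FENCE_FLAG_SIGNALED_BIT are from
the fence patch, while my_ring, on_same_ring() and emit_hw_wait_cmd()
are made-up driver hooks, and error handling is simplified):

    static long sync_to_excl_fence(struct my_ring *ring, struct fence *excl)
    {
        if (!excl || test_bit(FENCE_FLAG_SIGNALED_BIT, &excl->flags))
            return 0;   /* nothing outstanding, just go ahead */

        if (on_same_ring(ring, excl))
            return 0;   /* ordered by the ring itself, no wait needed */

        if (emit_hw_wait_cmd(ring, excl))
            return 0;   /* the hw will stall until the fence signals */

        /* fallback: block on the CPU until the other device is done */
        return fence_wait(excl, true);
    }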

============================

It made me realize we also need some docs about fence/reservation..
not sure if I'll get a chance before fosdem, but I can take a stab at
that too

BR,
-R

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 4/7] fence: dma-buf cross-device synchronization (v11)
  2013-01-31  9:57     ` Daniel Vetter
@ 2013-01-31 14:38       ` Inki Dae
  2013-01-31 14:49         ` Daniel Vetter
  0 siblings, 1 reply; 32+ messages in thread
From: Inki Dae @ 2013-01-31 14:38 UTC (permalink / raw)
  To: Inki Dae, Maarten Lankhorst, linaro-mm-sig, linux-kernel,
	dri-devel, linux-media

2013/1/31 Daniel Vetter <daniel@ffwll.ch>:
> On Thu, Jan 31, 2013 at 06:32:15PM +0900, Inki Dae wrote:
>> Hi,
>>
>> below is my opinion.
>>
>> > +struct fence;
>> > +struct fence_ops;
>> > +struct fence_cb;
>> > +
>> > +/**
>> > + * struct fence - software synchronization primitive
>> > + * @refcount: refcount for this fence
>> > + * @ops: fence_ops associated with this fence
>> > + * @cb_list: list of all callbacks to call
>> > + * @lock: spin_lock_irqsave used for locking
>> > + * @priv: fence specific private data
>> > + * @flags: A mask of FENCE_FLAG_* defined below
>> > + *
>> > + * the flags member must be manipulated and read using the appropriate
>> > + * atomic ops (bit_*), so taking the spinlock will not be needed most
>> > + * of the time.
>> > + *
>> > + * FENCE_FLAG_SIGNALED_BIT - fence is already signaled
>> > + * FENCE_FLAG_ENABLE_SIGNAL_BIT - enable_signaling might have been called*
>> > + * FENCE_FLAG_USER_BITS - start of the unused bits, can be used by the
>> > + * implementer of the fence for its own purposes. Can be used in different
>> > + * ways by different fence implementers, so do not rely on this.
>> > + *
>> > + * *) Since atomic bitops are used, this is not guaranteed to be the case.
>> > + * Particularly, if the bit was set, but fence_signal was called right
>> > + * before this bit was set, it would have been able to set the
>> > + * FENCE_FLAG_SIGNALED_BIT, before enable_signaling was called.
>> > + * Adding a check for FENCE_FLAG_SIGNALED_BIT after setting
>> > + * FENCE_FLAG_ENABLE_SIGNAL_BIT closes this race, and makes sure that
>> > + * after fence_signal was called, any enable_signaling call will have either
>> > + * been completed, or never called at all.
>> > + */
>> > +struct fence {
>> > +       struct kref refcount;
>> > +       const struct fence_ops *ops;
>> > +       struct list_head cb_list;
>> > +       spinlock_t *lock;
>> > +       unsigned context, seqno;
>> > +       unsigned long flags;
>> > +};
>> > +
>> > +enum fence_flag_bits {
>> > +       FENCE_FLAG_SIGNALED_BIT,
>> > +       FENCE_FLAG_ENABLE_SIGNAL_BIT,
>> > +       FENCE_FLAG_USER_BITS, /* must always be last member */
>> > +};
>> > +
>>
>> It seems like that this fence framework need to add read/write flags.
>> In case of two read operations, one might wait for another one. But
>> the another is just read operation so we doesn't need to wait for it.
>> Shouldn't fence-wait-request be ignored? In this case, I think it's
>> enough to consider just only write operation.
>>
>> For this, you could add the following,
>>
>> enum fence_flag_bits {
>>         ...
>>         FENCE_FLAG_ACCESS_READ_BIT,
>>         FENCE_FLAG_ACCESS_WRITE_BIT,
>>         ...
>> };
>>
>> And the producer could call fence_init() like below,
>> __fence_init(..., FENCE_FLAG_ACCESS_WRITE_BIT,...);
>>
>> With this, fence->flags has FENCE_FLAG_ACCESS_WRITE_BIT as write
>> operation and then other sides(read or write operation) would wait for
>> the write operation completion.
>> And also consumer calls that function with FENCE_FLAG_ACCESS_READ_BIT
>> so that other consumers could ignore the fence-wait to any read
>> operations.
>
> Fences here match more to the sync-points concept from the android stuff.
> The idea is that they only signal when a hw operation completes.
>
> Synchronization integration happens at the dma_buf level, where you can
> specify whether the new operation you're doing is exclusive (which means
> that you need to wait for all previous operations to complete), i.e. a
> write. Or whether the operation is non-excluses (i.e. just reading) in
> which case you only need to wait for any still outstanding exclusive
> fences attached to the dma_buf. But you _can_ attach more than one
> non-exclusive fence to a dma_buf at the same time, and so e.g. read a
> buffer objects from different engines concurrently.
>
> There's been some talk whether we also need a non-exclusive write
> attachment (i.e. allow multiple concurrent writers), but I don't yet fully
> understand the use-case.
>
> In short the proposed patches can do what you want to do, it's just that
> read/write access isn't part of the fences, but how you attach fences to
> dma_bufs.
>

Thanks for comments, Maarten and Daniel.

I think I understand your comments, but I don't think I fully
understand the dma-fence mechanism yet, so I'd appreciate some advice.
In our case, I'm applying dma-fence to the mali (3d gpu) driver as
producer and the exynos drm (display controller) driver as consumer.

And the sequence is as follows.
On the producer side:
1. Call fence_wait to wait for the completion of any other dma access.
2. The producer then creates a fence and a new reservation entry.
3. It then points the new reservation entry at the given dmabuf's
resv (reservation_object).
4. It then adds the reservation entry to the entries list.
5. It then sets the fence on all dmabufs in the entries list; in
effect this sets the fence on the reservation_object of each dmabuf.
6. The producer's dma then starts.
7. Finally, when the dma has completed, we get the entries list from
a 3d job command (in the mali core's case, a pp job) and call
fence_signal() with each fence of each reservation entry.

From here, is there any point I'm missing?

And I thought the fence from the reservation entry at step 7 means
that the producer won't access the dmabuf this fence is attached to
anymore, so this step wakes up all blocked processes. So I understood
the fence as representing an owner accessing the given dmabuf, and
that we could tell from the fence's flags whether that owner committed
its fence to the given dmabuf for reading or for writing.

If you could give me some advice, I'd be happy.

Thanks,
Inki Dae

> Cheers, Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 4/7] fence: dma-buf cross-device synchronization (v11)
  2013-01-31 14:38       ` Inki Dae
@ 2013-01-31 14:49         ` Daniel Vetter
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2013-01-31 14:49 UTC (permalink / raw)
  To: Inki Dae
  Cc: Maarten Lankhorst, linaro-mm-sig, linux-kernel, dri-devel, linux-media

On Thu, Jan 31, 2013 at 3:38 PM, Inki Dae <inki.dae@samsung.com> wrote:
> I think I understand as your comment but I don't think that I
> understand fully the dma-fence mechanism. So I wish you to give me
> some advices for it. In our case, I'm applying the dma-fence to
> mali(3d gpu) driver as producer and exynos drm(display controller)
> driver as consumer.
>
> And the sequence is as the following:
> In case of producer,
> 1. call fence_wait to wait for the dma access completion of others.
> 2. And then the producer creates a fence and a new reservation entry.
> 3. And then it sets the given dmabuf's resv(reservation_object) to the
> new reservation entry.
> 4. And then it adds the reservation entry to entries list.
> 5. And then it sets the fence to all dmabufs of the entries list.
> Actually, this work is to set the fence to the reservaion_object of
> each dmabuf.
> 6. And then the producer's dma start.
> 7. Finally, when the dma start is completed, we get the entries list
> from a 3d job command(in case of mali core, pp job) and call
> fence_signal() with each fence of each reservation entry.
>
> From here, is there my missing point?

Yeah, more or less. Although you need to wrap everything into ticket
reservation locking so that you can atomically update fences if you
have support for some form of device2device signalling (i.e. without
blocking on the cpu until all the old users have completed). At least
that's the main point of Maarten's patches (and this does work with
prime between a few drivers by now), but of course you can use cpu
blocking as a fallback.
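
For a single buffer, that flow maps onto the interfaces in this series
roughly like this (sketch only: signatures and field names are
approximate, and the driver_* calls are made-up placeholders for your
mali job submission):

    static int producer_submit(struct dma_buf *dmabuf, bool write)
    {
        struct reservation_ticket ticket;
        struct reservation_entry entry;
        struct fence *fence;
        LIST_HEAD(entries);
        int ret;

        reservation_entry_init(&entry, dmabuf->resv, !write);
        list_add(&entry.head, &entries);

        /* take the reservation lock(s), deadlock-safe via the ticket seqno */
        ret = ticket_reserve(&ticket, &entries);
        if (ret)
            return ret;

        /*
         * Your step 1 (waiting for previous users) happens here, either by
         * emitting hw wait cmds or with a cpu fence_wait() fallback.
         */

        fence = driver_create_fence();    /* made-up driver hook */
        driver_start_dma(dmabuf, fence);  /* made-up: kick off the 3d job */

        /* publish the fence on the buffer(s) and drop the locks */
        ticket_commit(&ticket, &entries, fence);

        /*
         * Your step 7 happens later, from the job-done irq:
         *     fence_signal(fence);
         */
        return 0;
    }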

> And I thought the fence from reservation entry at step 7 means that
> the producer wouldn't access the dmabuf attaching this fence anymore
> so this step wakes up all processes blocked. So I understood that the
> fence means a owner accessing the given dmabuf and we could aware of
> whether the owner commited its own fence to the given dmabuf to read
> or write through the fence's flags.

The fence doesn't give ownership of the dma_buf object, but only
indicates when the dma access will have completed. The relationship
between dma_buf/reservation and the attached fences specifies whether
other hw engines can access the dma_buf, too (if the fence is
non-exclusive).

> If you give me some advices, I'd be happy.

Rob and Maarten are working on some howtos and documentation with
example code; I guess it'd be best to wait a bit until we have that.
Or just review the existing stuff Rob just posted and reply with
questions there.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 6/7] reservation: cross-device reservation support
  2013-01-15 12:34 ` [PATCH 6/7] reservation: cross-device reservation support Maarten Lankhorst
  2013-01-22 16:47   ` [Linaro-mm-sig] " Francesco Lavra
@ 2013-02-04  7:06   ` Inki Dae
  2013-02-04  9:57     ` Maarten Lankhorst
  1 sibling, 1 reply; 32+ messages in thread
From: Inki Dae @ 2013-02-04  7:06 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: dri-devel, linux-kernel, linux-media, linaro-mm-sig, Maarten Lankhorst

> +/**
> + * ticket_commit - commit a reservation with a new fence
> + * @ticket:    [in]    the reservation_ticket returned by
> + * ticket_reserve
> + * @entries:   [in]    a linked list of struct reservation_entry
> + * @fence:     [in]    the fence that indicates completion
> + *
> + * This function will call reservation_ticket_fini, no need
> + * to do it manually.
> + *
> + * This function should be called after a hardware command submission is
> + * completed succesfully. The fence is used to indicate completion of
> + * those commands.
> + */
> +void
> +ticket_commit(struct reservation_ticket *ticket,
> +                 struct list_head *entries, struct fence *fence)
> +{
> +       struct list_head *cur;
> +
> +       if (list_empty(entries))
> +               return;
> +
> +       if (WARN_ON(!fence)) {
> +               ticket_backoff(ticket, entries);
> +               return;
> +       }
> +
> +       list_for_each(cur, entries) {
> +               struct reservation_object *bo;
> +               bool shared;
> +
> +               reservation_entry_get(cur, &bo, &shared);
> +
> +               if (!shared) {
> +                       int i;
> +                       for (i = 0; i < bo->fence_shared_count; ++i) {
> +                               fence_put(bo->fence_shared[i]);
> +                               bo->fence_shared[i] = NULL;
> +                       }
> +                       bo->fence_shared_count = 0;
> +                       if (bo->fence_excl)
> +                               fence_put(bo->fence_excl);
> +
> +                       bo->fence_excl = fence;
> +               } else {
> +                       if (WARN_ON(bo->fence_shared_count >=
> +                                   ARRAY_SIZE(bo->fence_shared))) {
> +                               mutex_unreserve_unlock(&bo->lock);
> +                               continue;
> +                       }
> +
> +                       bo->fence_shared[bo->fence_shared_count++] = fence;
> +               }

Hi,

I have some questions about fence_excl and fence_shared. In the above
code, if bo->fence_excl is not NULL then it puts bo->fence_excl and
sets a new fence in its place. This looks like someone who commits a
new fence wants to access the given dmabuf exclusively, even if
someone else is already accessing the given dmabuf.

On the other hand, in the fence_shared case, someone wants to access
that dmabuf non-exclusively, so the given dmabuf could be accessed by
two or more devices. So I guess fence_excl could be used for write
access (which may need buffer sync, i.e. blocking) and fence_shared
for read access (which may not need buffer sync). I'm not sure I
understand these two things correctly, so could you please give me
more comments on them?

Thanks,
Inki Dae

> +               fence_get(fence);
> +
> +               mutex_unreserve_unlock(&bo->lock);
> +       }
> +       reservation_ticket_fini(ticket);
> +}
> +EXPORT_SYMBOL(ticket_commit);

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 6/7] reservation: cross-device reservation support
  2013-02-04  7:06   ` Inki Dae
@ 2013-02-04  9:57     ` Maarten Lankhorst
  2013-02-04 14:51       ` Daniel Vetter
  0 siblings, 1 reply; 32+ messages in thread
From: Maarten Lankhorst @ 2013-02-04  9:57 UTC (permalink / raw)
  To: Inki Dae
  Cc: Maarten Lankhorst, dri-devel, linux-kernel, linux-media, linaro-mm-sig

Op 04-02-13 08:06, Inki Dae schreef:
>> +/**
>> + * ticket_commit - commit a reservation with a new fence
>> + * @ticket:    [in]    the reservation_ticket returned by
>> + * ticket_reserve
>> + * @entries:   [in]    a linked list of struct reservation_entry
>> + * @fence:     [in]    the fence that indicates completion
>> + *
>> + * This function will call reservation_ticket_fini, no need
>> + * to do it manually.
>> + *
>> + * This function should be called after a hardware command submission is
>> + * completed succesfully. The fence is used to indicate completion of
>> + * those commands.
>> + */
>> +void
>> +ticket_commit(struct reservation_ticket *ticket,
>> +                 struct list_head *entries, struct fence *fence)
>> +{
>> +       struct list_head *cur;
>> +
>> +       if (list_empty(entries))
>> +               return;
>> +
>> +       if (WARN_ON(!fence)) {
>> +               ticket_backoff(ticket, entries);
>> +               return;
>> +       }
>> +
>> +       list_for_each(cur, entries) {
>> +               struct reservation_object *bo;
>> +               bool shared;
>> +
>> +               reservation_entry_get(cur, &bo, &shared);
>> +
>> +               if (!shared) {
>> +                       int i;
>> +                       for (i = 0; i < bo->fence_shared_count; ++i) {
>> +                               fence_put(bo->fence_shared[i]);
>> +                               bo->fence_shared[i] = NULL;
>> +                       }
>> +                       bo->fence_shared_count = 0;
>> +                       if (bo->fence_excl)
>> +                               fence_put(bo->fence_excl);
>> +
>> +                       bo->fence_excl = fence;
>> +               } else {
>> +                       if (WARN_ON(bo->fence_shared_count >=
>> +                                   ARRAY_SIZE(bo->fence_shared))) {
>> +                               mutex_unreserve_unlock(&bo->lock);
>> +                               continue;
>> +                       }
>> +
>> +                       bo->fence_shared[bo->fence_shared_count++] = fence;
>> +               }
> Hi,
>
> I got some questions to fence_excl and fence_shared. At the above
> code, if bo->fence_excl is not NULL then it puts bo->fence_excl and
> sets a new fence to it. This seems like that someone that committed a
> new fence, wants to access the given dmabuf exclusively even if
> someone is accessing the given dmabuf.
Yes; whoever attached the shared fences had to wait for the previous exclusive fence, so if you add
(possibly hardware) wait ops on those shared fences to your command stream, it follows that you have also
waited for the previous exclusive fence implicitly.

If there is only an exclusive fence, you have to issue some explicit wait op on it before you start the rest
of the commands.
> On the other hand, in case of fence_shared, someone wants to access
> that dmabuf non-exclusively. So this case seems like that the given
> dmabuf could be accessed by two more devices. So I guess that the
> fence_excl could be used for write access(may need buffer sync like
> blocking) and read access for the fence_shared(may not need buffer
> sync). I'm not sure that I understand these two things correctly so
> could you please give me more comments for them?
Correct, if you create a shared fence, you still have to emit a wait op for the exclusive fence.
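
So per buffer the wait rules boil down to something like this (sketch
only: reservation lock assumed held, cpu waits only, error handling
left out):

    static void sync_before_access(struct reservation_object *bo, bool write)
    {
        unsigned i;

        if (!write) {
            /* a reader only has to wait for the last writer */
            if (bo->fence_excl)
                fence_wait(bo->fence_excl, false);
            return;
        }

        if (bo->fence_shared_count) {
            /*
             * A writer waits for all readers; they already waited for the
             * previous exclusive fence, so that one is covered implicitly.
             */
            for (i = 0; i < bo->fence_shared_count; ++i)
                fence_wait(bo->fence_shared[i], false);
        } else if (bo->fence_excl) {
            /* no readers around: wait for the previous writer explicitly */
            fence_wait(bo->fence_excl, false);
        }
    }
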
> Thanks,
> Inki Dae
>
>> +               fence_get(fence);
>> +
>> +               mutex_unreserve_unlock(&bo->lock);
>> +       }
>> +       reservation_ticket_fini(ticket);
>> +}
>> +EXPORT_SYMBOL(ticket_commit);


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 6/7] reservation: cross-device reservation support
  2013-02-04  9:57     ` Maarten Lankhorst
@ 2013-02-04 14:51       ` Daniel Vetter
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel Vetter @ 2013-02-04 14:51 UTC (permalink / raw)
  To: Maarten Lankhorst
  Cc: Inki Dae, linaro-mm-sig, linux-kernel, dri-devel, linux-media

On Mon, Feb 04, 2013 at 10:57:22AM +0100, Maarten Lankhorst wrote:
> Op 04-02-13 08:06, Inki Dae schreef:
> >> +/**
> >> + * ticket_commit - commit a reservation with a new fence
> >> + * @ticket:    [in]    the reservation_ticket returned by
> >> + * ticket_reserve
> >> + * @entries:   [in]    a linked list of struct reservation_entry
> >> + * @fence:     [in]    the fence that indicates completion
> >> + *
> >> + * This function will call reservation_ticket_fini, no need
> >> + * to do it manually.
> >> + *
> >> + * This function should be called after a hardware command submission is
> >> + * completed succesfully. The fence is used to indicate completion of
> >> + * those commands.
> >> + */
> >> +void
> >> +ticket_commit(struct reservation_ticket *ticket,
> >> +                 struct list_head *entries, struct fence *fence)
> >> +{
> >> +       struct list_head *cur;
> >> +
> >> +       if (list_empty(entries))
> >> +               return;
> >> +
> >> +       if (WARN_ON(!fence)) {
> >> +               ticket_backoff(ticket, entries);
> >> +               return;
> >> +       }
> >> +
> >> +       list_for_each(cur, entries) {
> >> +               struct reservation_object *bo;
> >> +               bool shared;
> >> +
> >> +               reservation_entry_get(cur, &bo, &shared);
> >> +
> >> +               if (!shared) {
> >> +                       int i;
> >> +                       for (i = 0; i < bo->fence_shared_count; ++i) {
> >> +                               fence_put(bo->fence_shared[i]);
> >> +                               bo->fence_shared[i] = NULL;
> >> +                       }
> >> +                       bo->fence_shared_count = 0;
> >> +                       if (bo->fence_excl)
> >> +                               fence_put(bo->fence_excl);
> >> +
> >> +                       bo->fence_excl = fence;
> >> +               } else {
> >> +                       if (WARN_ON(bo->fence_shared_count >=
> >> +                                   ARRAY_SIZE(bo->fence_shared))) {
> >> +                               mutex_unreserve_unlock(&bo->lock);
> >> +                               continue;
> >> +                       }
> >> +
> >> +                       bo->fence_shared[bo->fence_shared_count++] = fence;
> >> +               }
> > Hi,
> >
> > I got some questions to fence_excl and fence_shared. At the above
> > code, if bo->fence_excl is not NULL then it puts bo->fence_excl and
> > sets a new fence to it. This seems like that someone that committed a
> > new fence, wants to access the given dmabuf exclusively even if
> > someone is accessing the given dmabuf.
> Yes, if there were shared fences, they had to issue a wait for the previous exclusive fence, so if you add
> (possibly hardware) wait ops on those shared fences to your command stream, it follows that you also
> waited for the previous exclusive fence implicitly.
> 
> If there is only an exclusive fence, you have to issue some explicit wait op  on it before you start the rest
> of the commands.
> > On the other hand, in case of fence_shared, someone wants to access
> > that dmabuf non-exclusively. So this case seems like that the given
> > dmabuf could be accessed by two more devices. So I guess that the
> > fence_excl could be used for write access(may need buffer sync like
> > blocking) and read access for the fence_shared(may not need buffer
> > sync). I'm not sure that I understand these two things correctly so
> > could you please give me more comments for them?
> Correct, if you create a shared fence, you still have to emit a wait op for the exclusive fence.

Just a quick note: we limit the number of non-exclusive fences to avoid
reallocating memory, which would be a bit of a pain. On the other hand,
if we support some form of fence timeline concept, the fence core code
could overwrite redundant fences, which would reasonably limit the total
attached fence count. Still, we'd need to thread potential -ENOMEM
failures through all callers.

Do you see a use-case for more than just a few non-exclusive fences?
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2013-02-04 14:57 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-15 12:33 [PATCH 0/7] cross-device reservation for dma-buf support Maarten Lankhorst
2013-01-15 12:33 ` [PATCH 1/7] arch: add __mutex_fastpath_lock_retval_arg to generic/sh/x86/powerpc/ia64 Maarten Lankhorst
2013-01-15 13:49   ` Maarten Lankhorst
2013-01-15 12:33 ` [PATCH 2/7] mutex: add support for reservation style locks Maarten Lankhorst
2013-01-15 13:43   ` Maarten Lankhorst
2013-01-30  1:07   ` Rob Clark
2013-01-30 11:08     ` Daniel Vetter
2013-01-30 11:52       ` Rob Clark
2013-01-31 13:38         ` Rob Clark
2013-01-30 11:16     ` Maarten Lankhorst
2013-01-15 12:34 ` [PATCH 3/7] sched: allow try_to_wake_up to be used internally outside of core.c Maarten Lankhorst
2013-01-15 12:34 ` [PATCH 4/7] fence: dma-buf cross-device synchronization (v11) Maarten Lankhorst
2013-01-22 15:13   ` [Linaro-mm-sig] " Francesco Lavra
2013-01-23 14:56     ` Maarten Lankhorst
2013-01-23 17:14       ` Francesco Lavra
2013-01-31  9:32   ` Inki Dae
2013-01-31  9:53     ` Maarten Lankhorst
2013-01-31  9:57     ` Daniel Vetter
2013-01-31 14:38       ` Inki Dae
2013-01-31 14:49         ` Daniel Vetter
2013-01-15 12:34 ` [PATCH 5/7] seqno-fence: Hardware dma-buf implementation of fencing (v4) Maarten Lankhorst
2013-01-16  6:28   ` [Linaro-mm-sig] " Inki Dae
2013-01-16 10:36     ` Maarten Lankhorst
2013-01-16 12:00       ` Inki Dae
2013-01-24 14:52       ` Inki Dae
2013-01-15 12:34 ` [PATCH 6/7] reservation: cross-device reservation support Maarten Lankhorst
2013-01-22 16:47   ` [Linaro-mm-sig] " Francesco Lavra
2013-01-22 17:04     ` Maarten Lankhorst
2013-02-04  7:06   ` Inki Dae
2013-02-04  9:57     ` Maarten Lankhorst
2013-02-04 14:51       ` Daniel Vetter
2013-01-15 12:34 ` [PATCH 7/7] reservation: Add lockdep annotation and selftests Maarten Lankhorst

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).