* [rfc PATCH] drm/i915: Simplify shrinker_lock
@ 2016-07-17 18:45 Davidlohr Bueso
2016-07-17 21:54 ` Chris Wilson
2016-07-19 7:23 ` ✗ Ro.CI.BAT: failure for " Patchwork
0 siblings, 2 replies; 5+ messages in thread
From: Davidlohr Bueso @ 2016-07-17 18:45 UTC (permalink / raw)
To: chris, daniel.vetter, jani.nikula; +Cc: intel-gfx, linux-kernel, dave
Currently i915_gem_shrinker_lock() suffers from a few issues:
(1) In the attempt to avoid recursive locking, the mutex owner is
consulted. This is obviously nasty and extends the standard mutex API
with artifacts like mutex_is_locked_by() deep in driver code.
(2) If the lock is taken by another task, the shrinking callbacks
either immediately abort or block until the lock is taken. This seems
harsh; we could instead wait for the lock. Of course, if it is highly
contended that could take a while, but it would still allow the
operation to take place (or block less in the uninterruptible cases)
and help relieve memory pressure.
While the right way of doing things would probably be to rework the
locking rules to completely avoid (1), which goes against our explicit
mutex rules, things seem to have been this way forever -- and i915
locking looks very nasty overall. As such, this patch moves the owner
logic into mutex.h (which, yes, is terrible in that it exposes that
logic :/) and prevents possible tearing issues.
In addition, we can simplify the overall function wrt (2) by first
checking if we are the lock owner, dropping the trylock, and dealing
with (2), the locked/contended case, via a traditional mutex_lock().
This should be safe considering that if current is the lock owner,
then we are guaranteed not to race with the counter->owner updates
(the counter is updated first, which makes the mutex visibly locked).
Finally, also get rid of the mutex_is_locked() call, which is completely
racy and makes things even messier.
Not-yet-signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
Hi,
I ran into this because I noticed the mutex owner being used outside of
core mutex code. Given that I have no idea about i915 (much less your
locking semantics), this patch could be completely bogus for both (1)
and (2). Ultimately, if someone knows a better way of dealing with this
while avoiding recursive locking, that would be great :)
Compile-tested only.
Thanks.
drivers/gpu/drm/i915/i915_gem_shrinker.c | 27 +++++++++------------------
include/linux/mutex.h | 12 ++++++++++++
2 files changed, 21 insertions(+), 18 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 66571466e9a8..d28d1a84787c 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -35,19 +35,6 @@
#include "i915_drv.h"
#include "i915_trace.h"
-static bool mutex_is_locked_by(struct mutex *mutex, struct task_struct *task)
-{
- if (!mutex_is_locked(mutex))
- return false;
-
-#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)
- return mutex->owner == task;
-#else
- /* Since UP may be pre-empted, we cannot assume that we own the lock */
- return false;
-#endif
-}
-
static int num_vma_bound(struct drm_i915_gem_object *obj)
{
struct i915_vma *vma;
@@ -226,16 +213,20 @@ unsigned long i915_gem_shrink_all(struct drm_i915_private *dev_priv)
static bool i915_gem_shrinker_lock(struct drm_device *dev, bool *unlock)
{
- if (!mutex_trylock(&dev->struct_mutex)) {
- if (!mutex_is_locked_by(&dev->struct_mutex, current))
- return false;
-
+	/*
+	 * Recursion detection: do we already hold the lock?
+	 * If so, then only we can drop it, hence there is
+	 * no racing with the actual lock counter and owner fields.
+	 */
+ if (mutex_owner(&dev->struct_mutex) == current) {
if (to_i915(dev)->mm.shrinker_no_lock_stealing)
return false;
*unlock = false;
- } else
+ } else {
*unlock = true;
+ mutex_lock(&dev->struct_mutex);
+ }
return true;
}
diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index 2cb7531e7d7a..c4363a01c48a 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -130,6 +130,18 @@ static inline int mutex_is_locked(struct mutex *lock)
return atomic_read(&lock->count) != 1;
}
+#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_MUTEX_SPIN_ON_OWNER)
+static inline struct task_struct *mutex_owner(struct mutex *lock)
+{
+ return READ_ONCE(lock->owner);
+}
+#else
+static inline struct task_struct *mutex_owner(struct mutex *lock)
+{
+ return NULL;
+}
+#endif
+
/*
* See kernel/locking/mutex.c for detailed documentation of these APIs.
* Also see Documentation/locking/mutex-design.txt.
--
2.6.6
* Re: [rfc PATCH] drm/i915: Simplify shrinker_lock
2016-07-17 18:45 [rfc PATCH] drm/i915: Simplify shrinker_lock Davidlohr Bueso
@ 2016-07-17 21:54 ` Chris Wilson
2016-07-19 7:23 ` ✗ Ro.CI.BAT: failure for " Patchwork
1 sibling, 0 replies; 5+ messages in thread
From: Chris Wilson @ 2016-07-17 21:54 UTC (permalink / raw)
To: Davidlohr Bueso; +Cc: daniel.vetter, jani.nikula, intel-gfx, linux-kernel
On Sun, Jul 17, 2016 at 11:45:44AM -0700, Davidlohr Bueso wrote:
> In addition, we can simplify the overall function wrt (2) by first
> checking if we are the lock owner, dropping the trylock, and dealing
> with (2), the locked/contended case, via a traditional mutex_lock().
> This should be safe considering that if current is the lock owner,
> then we are guaranteed not to race with the counter->owner updates
> (the counter is updated first, which makes the mutex visibly locked).
However, that is then subject to an indirect ABBA deadlock, between the
shrinker lock and the struct mutex (or at least that used to be the case
where the kswapd reclaim would be blocked on the mutex and an alloc
blocked on kswapd).
Unravelling the gross locking is an ongoing task, with one of the chief
goals being able to reclaim memory whenever required. It is not pretty
and often fails under pressure.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
* Re: [Intel-gfx] [rfc PATCH] drm/i915: Simplify shrinker_lock
2016-07-17 21:54 ` Chris Wilson
@ 2016-07-19 7:03 ` Daniel Vetter
-1 siblings, 0 replies; 5+ messages in thread
From: Daniel Vetter @ 2016-07-19 7:03 UTC (permalink / raw)
To: Chris Wilson, Davidlohr Bueso, daniel.vetter, jani.nikula,
intel-gfx, linux-kernel
On Sun, Jul 17, 2016 at 10:54:51PM +0100, Chris Wilson wrote:
> On Sun, Jul 17, 2016 at 11:45:44AM -0700, Davidlohr Bueso wrote:
> > In addition, we can simplify the overall function wrt (2) by first
> > checking if we are the lock owner, dropping the trylock, and dealing
> > with (2), the locked/contended case, via a traditional mutex_lock().
> > This should be safe considering that if current is the lock owner,
> > then we are guaranteed not to race with the counter->owner updates
> > (the counter is updated first, which makes the mutex visibly locked).
>
> However, that is then subject to an indirect ABBA deadlock, between the
> shrinker lock and the struct mutex (or at least that used to be the case
> where the kswapd reclaim would be blocked on the mutex and an alloc
> blocked on kswapd).
>
> Unravelling the gross locking is an ongoing task, with one of the chief
> goals being able to reclaim memory whenever required. It is not pretty
> and often fails under pressure.
Yeah, what we need is to split up the dev->struct_mutex Big Driver Lock to
separate concerns. What's probably needed is a low-level mm lock (under
which we never ever allocate anything, to avoid the deadlock with reclaim),
plus per-object locks (using ww_mutex) to protect buffers both from the
shrinker (which would trylock, considering locked objects busy) and from
threads contending with one another. We also might need per-submission
context locks to avoid havoc there, but I'm not sure.
The reason this is taking forever to get done is that compared to the
existing locking, this new scheme is even more complex ;-)
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* ✗ Ro.CI.BAT: failure for drm/i915: Simplify shrinker_lock
2016-07-17 18:45 [rfc PATCH] drm/i915: Simplify shrinker_lock Davidlohr Bueso
2016-07-17 21:54 ` Chris Wilson
@ 2016-07-19 7:23 ` Patchwork
1 sibling, 0 replies; 5+ messages in thread
From: Patchwork @ 2016-07-19 7:23 UTC (permalink / raw)
To: Davidlohr Bueso; +Cc: intel-gfx
== Series Details ==
Series: drm/i915: Simplify shrinker_lock
URL : https://patchwork.freedesktop.org/series/10016/
State : failure
== Summary ==
Series 10016v1 drm/i915: Simplify shrinker_lock
http://patchwork.freedesktop.org/api/1.0/series/10016/revisions/1/mbox
Test gem_exec_gttfill:
Subgroup basic:
pass -> DMESG-WARN (fi-snb-i7-2600)
Test gem_sync:
Subgroup basic-store-each:
pass -> DMESG-FAIL (ro-bdw-i7-5600u)
fi-bsw-n3050 total:243 pass:188 dwarn:0 dfail:0 fail:11 skip:44
fi-hsw-i7-4770k total:243 pass:210 dwarn:0 dfail:0 fail:13 skip:20
fi-kbl-qkkr total:243 pass:175 dwarn:27 dfail:0 fail:13 skip:28
fi-skl-i5-6260u total:243 pass:219 dwarn:0 dfail:0 fail:12 skip:12
fi-skl-i7-6700k total:243 pass:205 dwarn:0 dfail:0 fail:12 skip:26
fi-snb-i7-2600 total:243 pass:189 dwarn:1 dfail:0 fail:13 skip:40
ro-bdw-i5-5250u total:243 pass:214 dwarn:4 dfail:0 fail:12 skip:13
ro-bdw-i7-5557U total:243 pass:215 dwarn:0 dfail:0 fail:12 skip:16
ro-bdw-i7-5600u total:243 pass:199 dwarn:0 dfail:1 fail:11 skip:32
ro-bsw-n3050 total:218 pass:172 dwarn:0 dfail:0 fail:2 skip:43
ro-byt-n2820 total:243 pass:191 dwarn:0 dfail:0 fail:14 skip:38
ro-hsw-i3-4010u total:243 pass:206 dwarn:0 dfail:0 fail:13 skip:24
ro-hsw-i7-4770r total:243 pass:206 dwarn:0 dfail:0 fail:13 skip:24
ro-ilk-i7-620lm total:243 pass:166 dwarn:0 dfail:0 fail:14 skip:63
ro-ilk1-i5-650 total:238 pass:166 dwarn:0 dfail:0 fail:14 skip:58
ro-ivb-i7-3770 total:243 pass:197 dwarn:0 dfail:0 fail:13 skip:33
ro-skl3-i5-6260u total:243 pass:218 dwarn:1 dfail:0 fail:12 skip:12
ro-snb-i7-2620M total:243 pass:188 dwarn:0 dfail:0 fail:13 skip:42
Results at /archive/results/CI_IGT_test/RO_Patchwork_1523/
3e14755 drm-intel-nightly: 2016y-07m-18d-14h-26m-21s UTC integration manifest
0856152 drm/i915: Simplify shrinker_lock
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx