* __i915_spin_request() sucks
@ 2015-11-12 20:36 Jens Axboe
  2015-11-12 20:40 ` Jens Axboe
  0 siblings, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2015-11-12 20:36 UTC (permalink / raw)
  To: Daniel Vetter, chris; +Cc: DRI Development, LKML

Hi,

So a few months ago I got an XPS13 laptop, the one with the high res 
screen. GUI performance was never really that great, I attributed it to 
coming from a more powerful laptop, and the i915 driving a lot of 
pixels. But yesterday I browsed from my wife's macbook, and was blown 
away. Wow, scrolling in chrome SUCKS on the xps13. Not just scrolling, 
basically anything in chrome. Molasses. So I got sick of it, fired up a 
quick perf record, did a bunch of stuff in chrome. No super smoking 
guns, but one thing did stick out - the path leading to 
__i915_spin_request().

So today, I figured I'd try just killing that spin. If it fails, we'll 
punt to normal completions, so easy change. And wow, MASSIVE difference. 
I can now scroll in chrome and not rage! It's like the laptop is 10x 
faster now.

Ran git blame, and found:

commit 2def4ad99befa25775dd2f714fdd4d92faec6e34
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Apr 7 16:20:41 2015 +0100

     drm/i915: Optimistically spin for the request completion

and read the commit message. Doesn't sound that impressive. Especially 
not for something that screws up interactive performance by a LOT.

What's the deal? Revert?

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: __i915_spin_request() sucks
  2015-11-12 20:36 __i915_spin_request() sucks Jens Axboe
@ 2015-11-12 20:40 ` Jens Axboe
  2015-11-12 22:19     ` Chris Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2015-11-12 20:40 UTC (permalink / raw)
  To: Daniel Vetter, chris; +Cc: DRI Development, LKML

On 11/12/2015 01:36 PM, Jens Axboe wrote:
> Hi,
>
> So a few months ago I got an XPS13 laptop, the one with the high res
> screen. GUI performance was never really that great, I attributed it to
> coming from a more powerful laptop, and the i915 driving a lot of
> pixels. But yesterday I browsed from my wife's macbook, and was blown
> away. Wow, scrolling in chrome SUCKS on the xps13. Not just scrolling,
> basically anything in chrome. Molasses. So I got sick of it, fired up a
> quick perf record, did a bunch of stuff in chrome. No super smoking
> guns, but one thing did stick out - the path leading to
> __i915_spin_request().
>
> So today, I figured I'd try just killing that spin. If it fails, we'll
> punt to normal completions, so easy change. And wow, MASSIVE difference.
> I can now scroll in chrome and not rage! It's like the laptop is 10x
> faster now.
>
> Ran git blame, and found:
>
> commit 2def4ad99befa25775dd2f714fdd4d92faec6e34
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Tue Apr 7 16:20:41 2015 +0100
>
>      drm/i915: Optimistically spin for the request completion
>
> and read the commit message. Doesn't sound that impressive. Especially
> not for something that screws up interactive performance by a LOT.
>
> What's the deal? Revert?

BTW, this:

"Limit the spinning to a single jiffie (~1us) at most"

is totally wrong. I have HZ=100 on my laptop. That's 10ms. 10ms! Even if 
I had HZ=1000, that'd still be 1ms of spinning. That's seriously screwed 
up, guys.
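To spell out the arithmetic: one jiffy is 1000000/HZ microseconds, so a "single jiffie" spin bound depends entirely on the kernel's HZ configuration. A throwaway userspace sketch (the helper name is made up; this is not kernel code):

```c
/* One jiffy is 1000000/HZ microseconds, so a spin bounded by
 * "the next jiffy" can burn up to this much CPU time.
 * (Illustrative userspace helper, not kernel code.) */
static unsigned long spin_bound_us(unsigned long hz)
{
    return 1000000UL / hz;
}
```

HZ=100 gives 10000us (10ms) and even HZ=1000 still gives 1000us, three to four orders of magnitude above the ~1us the commit message claims.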

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: __i915_spin_request() sucks
  2015-11-12 20:40 ` Jens Axboe
@ 2015-11-12 22:19     ` Chris Wilson
  0 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2015-11-12 22:19 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Daniel Vetter, DRI Development, LKML

On Thu, Nov 12, 2015 at 01:40:33PM -0700, Jens Axboe wrote:
> On 11/12/2015 01:36 PM, Jens Axboe wrote:
> >Hi,
> >
> >So a few months ago I got an XPS13 laptop, the one with the high res
> >screen. GUI performance was never really that great, I attributed it to
> >coming from a more powerful laptop, and the i915 driving a lot of
> >pixels. But yesterday I browsed from my wife's macbook, and was blown
> >away. Wow, scrolling in chrome SUCKS on the xps13. Not just scrolling,
> >basically anything in chrome. Molasses. So I got sick of it, fired up a
> >quick perf record, did a bunch of stuff in chrome. No super smoking
> >guns, but one thing did stick out - the path leading to
> >__i915_spin_request().

The smoking gun normally points at the messenger.

> >So today, I figured I'd try just killing that spin. If it fails, we'll
> >punt to normal completions, so easy change. And wow, MASSIVE difference.
> >I can now scroll in chrome and not rage! It's like the laptop is 10x
> >faster now.
> >
> >Ran git blame, and found:
> >
> >commit 2def4ad99befa25775dd2f714fdd4d92faec6e34
> >Author: Chris Wilson <chris@chris-wilson.co.uk>
> >Date:   Tue Apr 7 16:20:41 2015 +0100
> >
> >     drm/i915: Optimistically spin for the request completion
> >
> >and read the commit message. Doesn't sound that impressive. Especially
> >not for something that screws up interactive performance by a LOT.
> >
> >What's the deal? Revert?

The tests that it improved the most were the latency sensitive tests and
since my Broadwell xps13 behaves itself, I'd like to understand how it
culminates in an interactivity loss.

1. Maybe it is the uninterruptible nature of the polling, making X's
SIGIO jerky:

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 19e8f5442cf8..8099c2a9f88e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1146,7 +1146,7 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
        return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
 }
 
-static int __i915_spin_request(struct drm_i915_gem_request *req)
+static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 {
        unsigned long timeout;
 
@@ -1161,6 +1161,9 @@ static int __i915_spin_request(struct drm_i915_gem_request *req)
                if (time_after_eq(jiffies, timeout))
                        break;
 
+               if (signal_pending_state(state, current))
+                       break;
+
                cpu_relax_lowlatency();
        }
        if (i915_gem_request_completed(req, false))
@@ -1197,6 +1200,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
        struct drm_i915_private *dev_priv = dev->dev_private;
        const bool irq_test_in_progress =
                ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_ring_flag(ring);
+       int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
        DEFINE_WAIT(wait);
        unsigned long timeout_expire;
        s64 before, now;
@@ -1221,7 +1225,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
        before = ktime_get_raw_ns();
 
        /* Optimistic spin for the next jiffie before touching IRQs */
-       ret = __i915_spin_request(req);
+       ret = __i915_spin_request(req, state);
        if (ret == 0)
                goto out;
 
@@ -1233,8 +1237,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
        for (;;) {
                struct timer_list timer;
 
-               prepare_to_wait(&ring->irq_queue, &wait,
-                               interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE);
+               prepare_to_wait(&ring->irq_queue, &wait, state);
 
                /* We need to check whether any gpu reset happened in between
                 * the caller grabbing the seqno and now ... */
@@ -1252,7 +1255,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
                        break;
                }
 
-               if (interruptible && signal_pending(current)) {
+               if (signal_pending_state(state, current)) {
                        ret = -ERESTARTSYS;
                        break;
                }

2. Or maybe it is increased mutex contention:

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a275b0478200..1e52a7444e0c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2973,9 +2973,11 @@ void __i915_add_request(struct drm_i915_gem_request *req,
        __i915_add_request(req, NULL, false)
 int __i915_wait_request(struct drm_i915_gem_request *req,
                        unsigned reset_counter,
-                       bool interruptible,
+                       unsigned flags,
                        s64 *timeout,
                        struct intel_rps_client *rps);
+#define WAIT_INTERRUPTIBLE 0x1
+#define WAIT_UNLOCKED 0x2
 int __must_check i915_wait_request(struct drm_i915_gem_request *req);
 int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf);
 int __must_check
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8099c2a9f88e..ce17d42f1c62 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1191,7 +1191,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
  */
 int __i915_wait_request(struct drm_i915_gem_request *req,
                        unsigned reset_counter,
-                       bool interruptible,
+                       unsigned flags,
                        s64 *timeout,
                        struct intel_rps_client *rps)
 {
@@ -1200,7 +1200,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
        struct drm_i915_private *dev_priv = dev->dev_private;
        const bool irq_test_in_progress =
                ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_ring_flag(ring);
-       int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
+       int state = flags & WAIT_INTERRUPTIBLE ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
        DEFINE_WAIT(wait);
        unsigned long timeout_expire;
        s64 before, now;
@@ -1225,9 +1225,11 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
        before = ktime_get_raw_ns();
 
        /* Optimistic spin for the next jiffie before touching IRQs */
-       ret = __i915_spin_request(req, state);
-       if (ret == 0)
-               goto out;
+       if (flags & WAIT_UNLOCKED) {
+               ret = __i915_spin_request(req, state);
+               if (ret == 0)
+                       goto out;
+       }
 
        if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) {
                ret = -ENODEV;
@@ -1244,7 +1246,8 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
                if (reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) {
                        /* ... but upgrade the -EAGAIN to an -EIO if the gpu
                         * is truely gone. */
-                       ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
+                       ret = i915_gem_check_wedge(&dev_priv->gpu_error,
+                                                  flags & WAIT_INTERRUPTIBLE);
                        if (ret == 0)
                                ret = -EAGAIN;
                        break;
@@ -1532,7 +1535,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 
        mutex_unlock(&dev->struct_mutex);
        for (i = 0; ret == 0 && i < n; i++)
-               ret = __i915_wait_request(requests[i], reset_counter, true,
+               ret = __i915_wait_request(requests[i], reset_counter, 0x3,
                                          NULL, rps);
        mutex_lock(&dev->struct_mutex);
 
@@ -3067,7 +3070,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 
        for (i = 0; i < n; i++) {
                if (ret == 0)
-                       ret = __i915_wait_request(req[i], reset_counter, true,
+                       ret = __i915_wait_request(req[i], reset_counter, 0x3,
                                                  args->timeout_ns > 0 ? &args->timeout_ns : NULL,
                                                  file->driver_priv);
                i915_gem_request_unreference__unlocked(req[i]);
@@ -4043,7 +4046,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
        if (target == NULL)
                return 0;
 
-       ret = __i915_wait_request(target, reset_counter, true, NULL, NULL);
+       ret = __i915_wait_request(target, reset_counter, 0x3, NULL, NULL);
        if (ret == 0)
                queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);


Or maybe it is an indirect effect, such as power balancing between the
CPU and GPU, or just thermal throttling, or it may be the task being
penalised for consuming its timeslice (for which any completion polling
seems susceptible).

> BTW, this:
> 
> "Limit the spinning to a single jiffie (~1us) at most"
> 
> is totally wrong. I have HZ=100 on my laptop. That's 10ms. 10ms!
> Even if I had HZ=1000, that'd still be 1ms of spinning. That's
> seriously screwed up, guys.

That's over and above the termination condition for blk_poll().
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: __i915_spin_request() sucks
  2015-11-12 22:19     ` Chris Wilson
@ 2015-11-12 22:52     ` Jens Axboe
  2015-11-12 22:59       ` Jens Axboe
  2015-11-13  9:15       ` Chris Wilson
  -1 siblings, 2 replies; 14+ messages in thread
From: Jens Axboe @ 2015-11-12 22:52 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, DRI Development, LKML

On 11/12/2015 03:19 PM, Chris Wilson wrote:
>>> So today, I figured I'd try just killing that spin. If it fails, we'll
>>> punt to normal completions, so easy change. And wow, MASSIVE difference.
>>> I can now scroll in chrome and not rage! It's like the laptop is 10x
>>> faster now.
>>>
>>> Ran git blame, and found:
>>>
>>> commit 2def4ad99befa25775dd2f714fdd4d92faec6e34
>>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>>> Date:   Tue Apr 7 16:20:41 2015 +0100
>>>
>>>      drm/i915: Optimistically spin for the request completion
>>>
>>> and read the commit message. Doesn't sound that impressive. Especially
>>> not for something that screws up interactive performance by a LOT.
>>>
>>> What's the deal? Revert?
>
> The tests that it improved the most were the latency sensitive tests and
> since my Broadwell xps13 behaves itself, I'd like to understand how it
> culminates in an interactivity loss.
>
> 1. Maybe it is the uninterruptible nature of the polling, making X's
> SIGIO jerky:

This one still feels bad.

> 2. Or maybe it is increased mutex contention:

And so does this one... I had to manually apply hunks 2-3, and after 
doing seat-of-the-pants testing for both variants, I confirmed with perf 
that we're still seeing a ton of time in __i915_wait_request() for both 
of them.

> Or maybe it is an indirect effect, such as power balancing between the
> CPU and GPU, or just thermal throttling, or it may be the task being
> penalised for consuming its timeslice (for which any completion polling
> seems susceptible).

Look, polls in the 1-10ms range are just insane. Either you botched the 
commit message and really meant "~1ms at most", in which case I'd 
suspect you of smoking something good, or you hacked it up wrong and 
used jiffies when you really wanted some other time check that really 
did give you 1us.

I'll take an IRQ over 10 msecs of busy looping on my laptop, thanks.

>> "Limit the spinning to a single jiffie (~1us) at most"
>>
>> is totally wrong. I have HZ=100 on my laptop. That's 10ms. 10ms!
>> Even if I had HZ=1000, that'd still be 1ms of spinning. That's
>> seriously screwed up, guys.
>
> That's over and above the termination condition for blk_poll().

?! And how is this related? That's comparing apples and oranges. One is 
an opt-in test feature for experimentation, the other is unconditionally 
enabled for everyone. I believe the commit even says so. See the 
difference? Would I use busy loop spinning to wait for rotating storage 
completions, which are in the 1-10ms range? No, with the reason being 
that the potential wins for spinning are in the usec range.
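The trade-off being argued here reduces to a break-even test: spinning pays off only when the expected completion latency is below the round-trip cost of arming and servicing an interrupt. A hedged userspace sketch; the constants in the comments are illustrative assumptions, not numbers from this thread:

```c
#include <stdbool.h>
#include <stdint.h>

/* Spinning wins only when the expected completion latency is below
 * the cost of setting up an irq and being woken by it.
 * (Illustrative sketch; both inputs must come from measurement.) */
static bool should_spin(uint64_t expected_completion_ns,
                        uint64_t irq_roundtrip_ns)
{
    return expected_completion_ns < irq_roundtrip_ns;
}
```

With an irq round trip of, say, ~10us, a ~5us GPU batch would qualify, while a multi-millisecond rotating-disk completion never does.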

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: __i915_spin_request() sucks
  2015-11-12 22:52     ` Jens Axboe
@ 2015-11-12 22:59       ` Jens Axboe
  2015-11-13  9:15       ` Chris Wilson
  1 sibling, 0 replies; 14+ messages in thread
From: Jens Axboe @ 2015-11-12 22:59 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, DRI Development, LKML

On 11/12/2015 03:52 PM, Jens Axboe wrote:
> On 11/12/2015 03:19 PM, Chris Wilson wrote:
>>>> So today, I figured I'd try just killing that spin. If it fails, we'll
>>>> punt to normal completions, so easy change. And wow, MASSIVE
>>>> difference.
>>>> I can now scroll in chrome and not rage! It's like the laptop is 10x
>>>> faster now.
>>>>
>>>> Ran git blame, and found:
>>>>
>>>> commit 2def4ad99befa25775dd2f714fdd4d92faec6e34
>>>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Date:   Tue Apr 7 16:20:41 2015 +0100
>>>>
>>>>      drm/i915: Optimistically spin for the request completion
>>>>
>>>> and read the commit message. Doesn't sound that impressive. Especially
>>>> not for something that screws up interactive performance by a LOT.
>>>>
>>>> What's the deal? Revert?
>>
>> The tests that it improved the most were the latency sensitive tests and
>> since my Broadwell xps13 behaves itself, I'd like to understand how it
>> culminates in an interactivity loss.
>>
>> 1. Maybe it is the uninterruptible nature of the polling, making X's
>> SIGIO jerky:
>
> This one still feels bad.
>
>> 2. Or maybe it is increased mutex contention:
>
> And so does this one... I had to manually apply hunks 2-3, and after
> doing seat-of-the-pants testing for both variants, I confirmed with perf
> that we're still seeing a ton of time in __i915_wait_request() for both
> of them.

I don't see how #2 could make any difference, you're passing in 0x3 hard 
coded for most call sites, so we poll. The ones that don't, pass a bool 
(?!).

I should note that with the basic patch of just never spinning, I don't 
see __i915_wait_request() in the profiles. At all.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: __i915_spin_request() sucks
  2015-11-12 22:52     ` Jens Axboe
  2015-11-12 22:59       ` Jens Axboe
@ 2015-11-13  9:15       ` Chris Wilson
  2015-11-13 15:12         ` Jens Axboe
  2015-11-13 15:36         ` Jens Axboe
  1 sibling, 2 replies; 14+ messages in thread
From: Chris Wilson @ 2015-11-13  9:15 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Daniel Vetter, DRI Development, LKML

On Thu, Nov 12, 2015 at 03:52:02PM -0700, Jens Axboe wrote:
> On 11/12/2015 03:19 PM, Chris Wilson wrote:
> >>>So today, I figured I'd try just killing that spin. If it fails, we'll
> >>>punt to normal completions, so easy change. And wow, MASSIVE difference.
> >>>I can now scroll in chrome and not rage! It's like the laptop is 10x
> >>>faster now.
> >>>
> >>>Ran git blame, and found:
> >>>
> >>>commit 2def4ad99befa25775dd2f714fdd4d92faec6e34
> >>>Author: Chris Wilson <chris@chris-wilson.co.uk>
> >>>Date:   Tue Apr 7 16:20:41 2015 +0100
> >>>
> >>>     drm/i915: Optimistically spin for the request completion
> >>>
> >>>and read the commit message. Doesn't sound that impressive. Especially
> >>>not for something that screws up interactive performance by a LOT.
> >>>
> >>>What's the deal? Revert?
> >
> >The tests that it improved the most were the latency sensitive tests and
> >since my Broadwell xps13 behaves itself, I'd like to understand how it
> >culminates in an interactivity loss.
> >
> >1. Maybe it is the uninterruptible nature of the polling, making X's
> >SIGIO jerky:
> 
> This one still feels bad.
> 
> >2. Or maybe it is increased mutex contention:
> 
> And so does this one... I had to manually apply hunks 2-3, and after
> doing seat-of-the-pants testing for both variants, I confirmed with
> perf that we're still seeing a ton of time in __i915_wait_request()
> for both of them.
> 
> >Or maybe it is an indirect effect, such as power balancing between the
> >CPU and GPU, or just thermal throttling, or it may be the task being
> >penalised for consuming its timeslice (for which any completion polling
> >seems susceptible).
> 
> Look, polls in the 1-10ms range are just insane. Either you botched
> the commit message and really meant "~1ms at most" and in which case
> I'd suspect you of smoking something good, or you hacked it up wrong
> and used jiffies when you really wanted to be using some other time
> check that really did give you 1us.

What other time check? I was under the impression that setting up an
hrtimer was expensive and that jiffies was what was available.
 
> I'll take an IRQ over 10 msecs of busy looping on my laptop, thanks.
> 
> >>"Limit the spinning to a single jiffie (~1us) at most"
> >>
> >>is totally wrong. I have HZ=100 on my laptop. That's 10ms. 10ms!
> >>Even if I had HZ=1000, that'd still be 1ms of spinning. That's
> >>seriously screwed up, guys.
> >
> >That's over and above the termination condition for blk_poll().
> 
> ?! And this is related, how? Comparing apples and oranges. One is a
> test opt-in feature for experimentation, the other is
> unconditionally enabled for everyone. I believe the commit even says
> so. See the difference? Would I use busy loop spinning waiting for
> rotating storage completions, which are in the 1-10ms range? No,
> with the reason being that the potential wins for spins are in the
> usec range.

Equally, I expect the service interval for a batch to be around 2-20us.
There are many workloads that execute 30-50k requests/s, and you can
appreciate that they are sensitive to the latency of setting up an irq
and receiving it; equally, leaving that irq enabled saturates a CPU
with simply executing the irq handler. So what mechanism do you use
to guard against either a very long queue depth or an abnormal request
causing msec+ spins?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: __i915_spin_request() sucks
  2015-11-13  9:15       ` Chris Wilson
@ 2015-11-13 15:12         ` Jens Axboe
  2015-11-13 15:36         ` Jens Axboe
  1 sibling, 0 replies; 14+ messages in thread
From: Jens Axboe @ 2015-11-13 15:12 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, DRI Development, LKML

On 11/13/2015 02:15 AM, Chris Wilson wrote:
> On Thu, Nov 12, 2015 at 03:52:02PM -0700, Jens Axboe wrote:
>> On 11/12/2015 03:19 PM, Chris Wilson wrote:
>>>>> So today, I figured I'd try just killing that spin. If it fails, we'll
>>>>> punt to normal completions, so easy change. And wow, MASSIVE difference.
>>>>> I can now scroll in chrome and not rage! It's like the laptop is 10x
>>>>> faster now.
>>>>>
>>>>> Ran git blame, and found:
>>>>>
>>>>> commit 2def4ad99befa25775dd2f714fdd4d92faec6e34
>>>>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Date:   Tue Apr 7 16:20:41 2015 +0100
>>>>>
>>>>>      drm/i915: Optimistically spin for the request completion
>>>>>
>>>>> and read the commit message. Doesn't sound that impressive. Especially
>>>>> not for something that screws up interactive performance by a LOT.
>>>>>
>>>>> What's the deal? Revert?
>>>
>>> The tests that it improved the most were the latency sensitive tests and
>>> since my Broadwell xps13 behaves itself, I'd like to understand how it
>>> culminates in an interactivity loss.
>>>
>>> 1. Maybe it is the uninterruptible nature of the polling, making X's
>>> SIGIO jerky:
>>
>> This one still feels bad.
>>
>>> 2. Or maybe it is increased mutex contention:
>>
>> And so does this one... I had to manually apply hunks 2-3, and after
>> doing seat-of-the-pants testing for both variants, I confirmed with
>> perf that we're still seeing a ton of time in __i915_wait_request()
>> for both of them.
>>
>>> Or maybe it is an indirect effect, such as power balancing between the
>>> CPU and GPU, or just thermal throttling, or it may be the task being
>>> penalised for consuming its timeslice (for which any completion polling
>>> seems susceptible).
>>
>> Look, polls in the 1-10ms range are just insane. Either you botched
>> the commit message and really meant "~1ms at most" and in which case
>> I'd suspect you of smoking something good, or you hacked it up wrong
>> and used jiffies when you really wanted to be using some other time
>> check that really did give you 1us.
>
> What other time check? I was under the impression that setting up an
> hrtimer was expensive and that jiffies was readily available.

Looping for 10ms is a lot more expensive :-). jiffies is always there, 
but it's WAY too coarse to be used for something like this.

You could use ktime_get(), there's a lot of helpers for checking 
time_after, adding msecs, etc. Something like the below, not tested here
yet.

>> I'll take an IRQ over 10 msecs of busy looping on my laptop, thanks.
>>
>>>> "Limit the spinning to a single jiffie (~1us) at most"
>>>>
>>>> is totally wrong. I have HZ=100 on my laptop. That's 10ms. 10ms!
>>>> Even if I had HZ=1000, that'd still be 1ms of spinning. That's
>>>> seriously screwed up, guys.
>>>
>>> That's over and above the termination condition for blk_poll().
>>
>> ?! And this is related, how? Comparing apples and oranges. One is a
>> test opt-in feature for experimentation, the other is
>> unconditionally enabled for everyone. I believe the commit even says
>> so. See the difference? Would I use busy loop spinning waiting for
>> rotating storage completions, which are in the 1-10ms range? No,
>> with the reason being that the potential wins for spins are in the
>> usec range.
>
> Equally I expect the service interval for a batch to be around 2-20us.
> There are many workloads that execute 30-50k requests/s, and you can
> appreciate that they are sensitive to the latency in setting up an irq
> and receiving it - just as equally leaving that irq enabled saturates a
> CPU with simply executing the irq handler. So what mechanism do you use
> to guard against either a very long queue depth or an abnormal request
> causing msec+ spins?

Not disputing that polling can work, but it needs to be a bit more 
clever. Do you know which requests are fast and which ones are not? 
Could you track it? Should we make this a module option?

20usec is too long to poll. If we look at the wins of polling, we're
talking anywhere from 1-2 usec to maybe 5 usec, depending on different
factors. So spinning between 1-3 usec should be a hard limit on most
platforms. And it's somewhat of a policy decision, since it does involve
throwing CPU at the problem. There's a crossover point where below it's
always a win, but that needs a lot more work than just optimistic
spinning for everything. You also do need a check for whether the task
has been woken up, that's also missing.

As for interrupt mitigation, I'd consider that a separate problem. It's
a lot simpler than the app-induced polling that i915 is doing here. So
if overloading a core with IRQs is an issue, I'd solve that separately,
similar to NAPI or blk-iopoll (not to be confused with blk_poll(), which
you referenced and which is app-induced polling).

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5cf4a1998273..658514e899b1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1148,17 +1148,19 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
 
 static int __i915_spin_request(struct drm_i915_gem_request *req)
 {
-	unsigned long timeout;
+	ktime_t start, end;
 
 	if (i915_gem_request_get_ring(req)->irq_refcount)
 		return -EBUSY;
 
-	timeout = jiffies + 1;
+	start = ktime_get();
+	end.tv64 = start.tv64;
+	ktime_add_us(end, 1);
 	while (!need_resched()) {
 		if (i915_gem_request_completed(req, true))
 			return 0;
 
-		if (time_after_eq(jiffies, timeout))
+		if (ktime_after(start, end))
 			break;
 
 		cpu_relax_lowlatency();

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: __i915_spin_request() sucks
  2015-11-13  9:15       ` Chris Wilson
  2015-11-13 15:12         ` Jens Axboe
@ 2015-11-13 15:36         ` Jens Axboe
  2015-11-13 16:13           ` Mike Galbraith
  1 sibling, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2015-11-13 15:36 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, DRI Development, LKML

Previous patch was obvious pre-coffee crap, this should work for using
ktime to spin max 1usec.

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5cf4a1998273..21192e55c33c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1148,17 +1148,18 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
 
 static int __i915_spin_request(struct drm_i915_gem_request *req)
 {
-	unsigned long timeout;
+	ktime_t timeout;
 
 	if (i915_gem_request_get_ring(req)->irq_refcount)
 		return -EBUSY;
 
-	timeout = jiffies + 1;
+	timeout = ktime_get();
+	timeout = ktime_add_us(timeout, 1);
 	while (!need_resched()) {
 		if (i915_gem_request_completed(req, true))
 			return 0;
 
-		if (time_after_eq(jiffies, timeout))
+		if (ktime_after(ktime_get(), timeout))
 			break;
 
 		cpu_relax_lowlatency();

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: __i915_spin_request() sucks
  2015-11-13 15:36         ` Jens Axboe
@ 2015-11-13 16:13           ` Mike Galbraith
  2015-11-13 16:22             ` Jens Axboe
  0 siblings, 1 reply; 14+ messages in thread
From: Mike Galbraith @ 2015-11-13 16:13 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Chris Wilson, Daniel Vetter, DRI Development, LKML

On Fri, 2015-11-13 at 08:36 -0700, Jens Axboe wrote:
> Previous patch was obvious pre-coffee crap, this should work for using
> ktime to spin max 1usec.

1us seems a tad low.  I doubt the little wooden gears and pulleys of my
core2 Toshiba Satellite lappy can get one loop ground out in a usec :)

> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 5cf4a1998273..21192e55c33c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1148,17 +1148,18 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
>  
>  static int __i915_spin_request(struct drm_i915_gem_request *req)
>  {
> -	unsigned long timeout;
> +	ktime_t timeout;
>  
>  	if (i915_gem_request_get_ring(req)->irq_refcount)
>  		return -EBUSY;
>  
> -	timeout = jiffies + 1;
> +	timeout = ktime_get();
> +	timeout = ktime_add_us(timeout, 1);
>  	while (!need_resched()) {
>  		if (i915_gem_request_completed(req, true))
>  			return 0;
>  
> -		if (time_after_eq(jiffies, timeout))
> +		if (ktime_after(ktime_get(), timeout))
>  			break;
>  
>  		cpu_relax_lowlatency();
> 



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: __i915_spin_request() sucks
  2015-11-13 16:13           ` Mike Galbraith
@ 2015-11-13 16:22             ` Jens Axboe
  2015-11-13 22:12                 ` Chris Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Jens Axboe @ 2015-11-13 16:22 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Chris Wilson, Daniel Vetter, DRI Development, LKML

On 11/13/2015 09:13 AM, Mike Galbraith wrote:
> On Fri, 2015-11-13 at 08:36 -0700, Jens Axboe wrote:
>> Previous patch was obvious pre-coffee crap, this should work for using
>> ktime to spin max 1usec.
>
> 1us seems a tad low.  I doubt the little wooden gears and pulleys of my
> core2 Toshiba Satellite lappy can get one loop ground out in a usec :)

Maybe it is, it's based off the original intent of the function, though. 
See the original commit referenced.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: __i915_spin_request() sucks
  2015-11-13 16:22             ` Jens Axboe
@ 2015-11-13 22:12                 ` Chris Wilson
  0 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2015-11-13 22:12 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Mike Galbraith, Daniel Vetter, DRI Development, LKML

On Fri, Nov 13, 2015 at 09:22:52AM -0700, Jens Axboe wrote:
> On 11/13/2015 09:13 AM, Mike Galbraith wrote:
> >On Fri, 2015-11-13 at 08:36 -0700, Jens Axboe wrote:
> >>Previous patch was obvious pre-coffee crap, this should work for using
> >>ktime to spin max 1usec.
> >
> >1us seems a tad low.  I doubt the little wooden gears and pulleys of my
> >core2 Toshiba Satellite lappy can get one loop ground out in a usec :)
> 
> Maybe it is, it's based off the original intent of the function,
> though. See the original commit referenced.

I've been looking at numbers from one laptop and I can set the timeout
at 2us before we see a steep decline in what is more or less synchronous
request handling (which affects a variety of rendering workloads).

Looking around, other busy loops seem to use local_clock() (i.e. rdtscll
with a fair wind). Is that worth using here?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 14+ messages in thread


* Re: __i915_spin_request() sucks
  2015-11-13 22:12                 ` Chris Wilson
  (?)
@ 2015-11-13 22:16                 ` Jens Axboe
  -1 siblings, 0 replies; 14+ messages in thread
From: Jens Axboe @ 2015-11-13 22:16 UTC (permalink / raw)
  To: Chris Wilson, Mike Galbraith, Daniel Vetter, DRI Development, LKML

On 11/13/2015 03:12 PM, Chris Wilson wrote:
> On Fri, Nov 13, 2015 at 09:22:52AM -0700, Jens Axboe wrote:
>> On 11/13/2015 09:13 AM, Mike Galbraith wrote:
>>> On Fri, 2015-11-13 at 08:36 -0700, Jens Axboe wrote:
>>>> Previous patch was obvious pre-coffee crap, this should work for using
>>>> ktime to spin max 1usec.
>>>
>>> 1us seems a tad low.  I doubt the little wooden gears and pulleys of my
>>> core2 Toshiba Satellite lappy can get one loop ground out in a usec :)
>>
>> Maybe it is, it's based off the original intent of the function,
>> though. See the original commit referenced.
>
> I've been looking at numbers from one laptop and I can set the timeout
> at 2us before we see a steep decline in what is more or less synchronous
> request handling (which affects a variety of rendering workloads).

Alright, at least that's a vast improvement from 10ms. If you send me 
something tested, I can try it here.

> Looking around, other busy loops seem to use local_clock() (i.e. rdstcll
> with a fair wind). Is that worth using here?

Honestly, don't think it matters too much for this case. You'd have to 
disable preempt to use local_clock(), fwiw. It is a faster variant 
though, but the RT people might hate you for 2us preempt disables :-)
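
For reference, a local_clock()-based spin would be shaped roughly like the
below (untested sketch, not a patch; the 2us figure is your number from
above, and the preempt_disable() is needed because local_clock() is only
guaranteed monotonic on one CPU):

	u64 end;

	preempt_disable();
	end = local_clock() + 2 * NSEC_PER_USEC;	/* local_clock() is in ns */
	while (!need_resched()) {
		if (i915_gem_request_completed(req, true)) {
			preempt_enable();
			return 0;
		}
		if (local_clock() >= end)
			break;
		cpu_relax_lowlatency();
	}
	preempt_enable();
	return -EBUSY;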

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2015-11-13 22:16 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-12 20:36 __i915_spin_request() sucks Jens Axboe
2015-11-12 20:40 ` Jens Axboe
2015-11-12 22:19   ` Chris Wilson
2015-11-12 22:19     ` Chris Wilson
2015-11-12 22:52     ` Jens Axboe
2015-11-12 22:59       ` Jens Axboe
2015-11-13  9:15       ` Chris Wilson
2015-11-13 15:12         ` Jens Axboe
2015-11-13 15:36         ` Jens Axboe
2015-11-13 16:13           ` Mike Galbraith
2015-11-13 16:22             ` Jens Axboe
2015-11-13 22:12               ` Chris Wilson
2015-11-13 22:12                 ` Chris Wilson
2015-11-13 22:16                 ` Jens Axboe
