From: Arun Siluvery <arun.siluvery@linux.intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Chris Wilson <chris@chris-wilson.co.uk>,
	intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH 05/21] drm/i915: Separate GPU hang waitqueue from advance
Date: Tue, 7 Jun 2016 17:41:20 +0530
Message-ID: <5756B9E8.402@linux.intel.com>
In-Reply-To: <575573D5.1040609@linux.intel.com>

On 06/06/2016 18:30, Tvrtko Ursulin wrote:
>
> On 03/06/16 17:08, Chris Wilson wrote:
>> Currently __i915_wait_request uses a per-engine wait_queue_t for the dual
>> purpose of waking after the GPU advances or for waking after an error.
>> In the future, we may add even more wake sources and require greater
>> separation, but for now we can conceptually simplify wakeups by separating
>> the two sources. In particular, this allows us to use different wait-queues
>> (e.g. one on the engine advancement, a global one for errors and one on
>> each request) without any hassle.
>
> + Arun
>
> I think this will conflict with the TDR work where one of the features
> is to make reset handling per engine. So I am not sure how beneficial in
> general, or painful for the TDR series, this patch might be.

Thanks Tvrtko.
Chris has given some comments on a related TDR patch based on these
changes. I am looking into how to update my changes accordingly.

The TDR code needs access to struct_mutex, so if a waiter is holding it
we should be able to ask that waiter to try again so that we can proceed
with recovery. The same applies when an engine reset is in progress.
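
To make the "ask the waiter to try again" idea concrete, here is a minimal
userspace sketch using pthreads. It is only an illustration under stated
assumptions: struct gpu_model and every function name below are hypothetical
and are not taken from the i915 or TDR code, and a single condition variable
stands in for the kernel wait queues, so it does not model the separate
hang and advance queues from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in i915. */
struct gpu_model {
	pthread_mutex_t lock;   /* protects the fields below */
	pthread_cond_t wake;    /* woken on seqno advance or reset request */
	uint32_t seqno;         /* last completed sequence number */
	bool reset_pending;     /* recovery wants the waiters out of the way */
};

static void gpu_model_init(struct gpu_model *gpu)
{
	pthread_mutex_init(&gpu->lock, NULL);
	pthread_cond_init(&gpu->wake, NULL);
	gpu->seqno = 0;
	gpu->reset_pending = false;
}

/* Wraparound-safe check that seqno has reached target. */
static bool seqno_passed(uint32_t seqno, uint32_t target)
{
	return (int32_t)(seqno - target) >= 0;
}

/*
 * Returns 0 once the target seqno has completed, or -EAGAIN if a reset
 * is pending, in which case the caller should drop its locks and retry.
 */
static int wait_request(struct gpu_model *gpu, uint32_t target)
{
	int ret = 0;

	pthread_mutex_lock(&gpu->lock);
	while (!seqno_passed(gpu->seqno, target)) {
		if (gpu->reset_pending) {
			ret = -EAGAIN;
			break;
		}
		/* Sleeps until either wake source broadcasts. */
		pthread_cond_wait(&gpu->wake, &gpu->lock);
	}
	pthread_mutex_unlock(&gpu->lock);
	return ret;
}

/* Wake source 1: the GPU made progress. */
static void gpu_advance(struct gpu_model *gpu, uint32_t seqno)
{
	pthread_mutex_lock(&gpu->lock);
	gpu->seqno = seqno;
	pthread_cond_broadcast(&gpu->wake);
	pthread_mutex_unlock(&gpu->lock);
}

/* Wake source 2: a hang was detected and recovery needs the waiters gone. */
static void gpu_request_reset(struct gpu_model *gpu)
{
	pthread_mutex_lock(&gpu->lock);
	gpu->reset_pending = true;
	pthread_cond_broadcast(&gpu->wake);
	pthread_mutex_unlock(&gpu->lock);
}

/* Recovery finished; later waits may block again. */
static void gpu_reset_done(struct gpu_model *gpu)
{
	pthread_mutex_lock(&gpu->lock);
	assert(gpu->reset_pending);
	gpu->reset_pending = false;
	pthread_mutex_unlock(&gpu->lock);
}
```

A caller that gets -EAGAIN from wait_request() would release whatever
outer lock it holds (the analogue of struct_mutex), let recovery run, and
then call wait_request() again.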

regards
Arun

>
> Regards,
>
> Tvrtko
>
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h |  6 ++++++
>>   drivers/gpu/drm/i915/i915_gem.c |  5 +++++
>>   drivers/gpu/drm/i915/i915_irq.c | 19 ++++---------------
>>   3 files changed, 15 insertions(+), 15 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h
>> b/drivers/gpu/drm/i915/i915_drv.h
>> index ceccc6d6b119..e399e97965e0 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -1401,6 +1401,12 @@ struct i915_gpu_error {
>>   #define I915_WEDGED            (1 << 31)
>>
>>       /**
>> +     * Waitqueue to signal when a hang is detected. Used for waiters
>> +     * to release the struct_mutex for the reset to proceed.
>> +     */
>> +    wait_queue_head_t wait_queue;
>> +
>> +    /**
>>        * Waitqueue to signal when the reset has completed. Used by
>> clients
>>        * that wait for dev_priv->mm.wedged to settle.
>>        */
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c
>> b/drivers/gpu/drm/i915/i915_gem.c
>> index 03256f096ab6..de4fb39312a4 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -1234,6 +1234,7 @@ int __i915_wait_request(struct
>> drm_i915_gem_request *req,
>>       const bool irq_test_in_progress =
>>           ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) &
>> intel_engine_flag(engine);
>>       int state = interruptible ? TASK_INTERRUPTIBLE :
>> TASK_UNINTERRUPTIBLE;
>> +    DEFINE_WAIT(reset);
>>       DEFINE_WAIT(wait);
>>       unsigned long timeout_expire;
>>       s64 before = 0; /* Only to silence a compiler warning. */
>> @@ -1278,6 +1279,7 @@ int __i915_wait_request(struct
>> drm_i915_gem_request *req,
>>           goto out;
>>       }
>>
>> +    add_wait_queue(&dev_priv->gpu_error.wait_queue, &reset);
>>       for (;;) {
>>           struct timer_list timer;
>>
>> @@ -1329,6 +1331,8 @@ int __i915_wait_request(struct
>> drm_i915_gem_request *req,
>>               destroy_timer_on_stack(&timer);
>>           }
>>       }
>> +    remove_wait_queue(&dev_priv->gpu_error.wait_queue, &reset);
>> +
>>       if (!irq_test_in_progress)
>>           engine->irq_put(engine);
>>
>> @@ -5026,6 +5030,7 @@ i915_gem_load_init(struct drm_device *dev)
>>                 i915_gem_retire_work_handler);
>>       INIT_DELAYED_WORK(&dev_priv->mm.idle_work,
>>                 i915_gem_idle_work_handler);
>> +    init_waitqueue_head(&dev_priv->gpu_error.wait_queue);
>>       init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
>>
>>       dev_priv->relative_constants_mode =
>> I915_EXEC_CONSTANTS_REL_GENERAL;
>> diff --git a/drivers/gpu/drm/i915/i915_irq.c
>> b/drivers/gpu/drm/i915/i915_irq.c
>> index 83cab14639b2..30127b94f26e 100644
>> --- a/drivers/gpu/drm/i915/i915_irq.c
>> +++ b/drivers/gpu/drm/i915/i915_irq.c
>> @@ -2488,11 +2488,8 @@ static irqreturn_t gen8_irq_handler(int irq,
>> void *arg)
>>       return ret;
>>   }
>>
>> -static void i915_error_wake_up(struct drm_i915_private *dev_priv,
>> -                   bool reset_completed)
>> +static void i915_error_wake_up(struct drm_i915_private *dev_priv)
>>   {
>> -    struct intel_engine_cs *engine;
>> -
>>       /*
>>        * Notify all waiters for GPU completion events that reset state
>> has
>>        * been changed, and that they need to restart their wait after
>> @@ -2501,18 +2498,10 @@ static void i915_error_wake_up(struct
>> drm_i915_private *dev_priv,
>>        */
>>
>>       /* Wake up __wait_seqno, potentially holding dev->struct_mutex. */
>> -    for_each_engine(engine, dev_priv)
>> -        wake_up_all(&engine->irq_queue);
>> +    wake_up_all(&dev_priv->gpu_error.wait_queue);
>>
>>       /* Wake up intel_crtc_wait_for_pending_flips, holding
>> crtc->mutex. */
>>       wake_up_all(&dev_priv->pending_flip_queue);
>> -
>> -    /*
>> -     * Signal tasks blocked in i915_gem_wait_for_error that the pending
>> -     * reset state is cleared.
>> -     */
>> -    if (reset_completed)
>> -        wake_up_all(&dev_priv->gpu_error.reset_queue);
>>   }
>>
>>   /**
>> @@ -2577,7 +2566,7 @@ static void i915_reset_and_wakeup(struct
>> drm_i915_private *dev_priv)
>>            * Note: The wake_up also serves as a memory barrier so that
>>            * waiters see the update value of the reset counter atomic_t.
>>            */
>> -        i915_error_wake_up(dev_priv, true);
>> +        wake_up_all(&dev_priv->gpu_error.reset_queue);
>>       }
>>   }
>>
>> @@ -2713,7 +2702,7 @@ void i915_handle_error(struct drm_i915_private
>> *dev_priv,
>>            * ensure that the waiters see the updated value of the reset
>>            * counter atomic_t.
>>            */
>> -        i915_error_wake_up(dev_priv, false);
>> +        i915_error_wake_up(dev_priv);
>>       }
>>
>>       i915_reset_and_wakeup(dev_priv);
>>
>
