From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>,
	intel-gfx@lists.freedesktop.org
Cc: Owen Zhang <owen.zhang@intel.com>
Subject: Re: [PATCH v2] drm/i915: fix SFC reset flow
Date: Thu, 19 Sep 2019 10:58:11 +0100
Message-ID: <2fa990de-e08a-be3e-8e03-24b8bad1a5f7@linux.intel.com>
In-Reply-To: <88d7cc08-a621-88b5-c043-44e5487e579d@linux.intel.com>


On 19/09/2019 10:34, Tvrtko Ursulin wrote:
> 
> On 19/09/2019 02:53, Daniele Ceraolo Spurio wrote:
>> Our assumption that we can ask the HW to lock the SFC even if not
>> currently in use does not match the HW commitment. The expectation from
>> the HW is that SW will not try to lock the SFC if the engine is not
>> using it and if we do that the behavior is undefined; on ICL the HW
>> ends up returning the ack and ignoring our lock request, but this is
>> not guaranteed and we shouldn't expect it going forward.
>>
>> Also, failing to get the ack while the SFC is in use means that we can't
>> cleanly reset it, so fail the engine reset in that scenario.
>>
>> v2: drop rmw change, keep the log as debug and handle failure (Chris),
>>      improve comments (Tvrtko).
>>
>> Reported-by: Owen Zhang <owen.zhang@intel.com>
>> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_reset.c | 51 +++++++++++++++++----------
>>   1 file changed, 33 insertions(+), 18 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_reset.c b/drivers/gpu/drm/i915/gt/intel_reset.c
>> index 8327220ac558..797cf50625cb 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_reset.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_reset.c
>> @@ -309,7 +309,7 @@ static int gen6_reset_engines(struct intel_gt *gt,
>>       return gen6_hw_domain_reset(gt, hw_mask);
>>   }
>> -static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
>> +static int gen11_lock_sfc(struct intel_engine_cs *engine, u32 *hw_mask)
>>   {
>>       struct intel_uncore *uncore = engine->uncore;
>>       u8 vdbox_sfc_access = RUNTIME_INFO(engine->i915)->vdbox_sfc_access;
>> @@ -318,6 +318,7 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
>>       i915_reg_t sfc_usage;
>>       u32 sfc_usage_bit;
>>       u32 sfc_reset_bit;
>> +    int ret;
>>       switch (engine->class) {
>>       case VIDEO_DECODE_CLASS:
>> @@ -352,28 +353,33 @@ static u32 gen11_lock_sfc(struct intel_engine_cs *engine)
>>       }
>>       /*
>> -     * Tell the engine that a software reset is going to happen. The engine
>> -     * will then try to force lock the SFC (if currently locked, it will
>> -     * remain so until we tell the engine it is safe to unlock; if currently
>> -     * unlocked, it will ignore this and all new lock requests). If SFC
>> -     * ends up being locked to the engine we want to reset, we have to reset
>> -     * it as well (we will unlock it once the reset sequence is completed).
>> +     * If the engine is using a SFC, tell the engine that a software reset
>> +     * is going to happen. The engine will then try to force lock the SFC.
>> +     * If SFC ends up being locked to the engine we want to reset, we have
>> +     * to reset it as well (we will unlock it once the reset sequence is
>> +     * completed).
>>        */
>> +    if (!(intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit))
>> +        return 0;
>> +
>>       rmw_set_fw(uncore, sfc_forced_lock, sfc_forced_lock_bit);
>> -    if (__intel_wait_for_register_fw(uncore,
>> -                     sfc_forced_lock_ack,
>> -                     sfc_forced_lock_ack_bit,
>> -                     sfc_forced_lock_ack_bit,
>> -                     1000, 0, NULL)) {
>> -        DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
>> +    ret = __intel_wait_for_register_fw(uncore,
>> +                       sfc_forced_lock_ack,
>> +                       sfc_forced_lock_ack_bit,
>> +                       sfc_forced_lock_ack_bit,
>> +                       1000, 0, NULL);
>> +
>> +    /* was the SFC released while we were trying to lock it? */
>> +    if (!(intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit))
>>           return 0;
>> -    }
>> -    if (intel_uncore_read_fw(uncore, sfc_usage) & sfc_usage_bit)
>> -        return sfc_reset_bit;
>> +    if (ret)
>> +        DRM_DEBUG_DRIVER("Wait for SFC forced lock ack failed\n");
>> +    else
>> +        *hw_mask |= sfc_reset_bit;
>> -    return 0;
>> +    return ret;
>>   }
>>   static void gen11_unlock_sfc(struct intel_engine_cs *engine)
>> @@ -430,12 +436,21 @@ static int gen11_reset_engines(struct intel_gt *gt,
>>           for_each_engine_masked(engine, gt->i915, engine_mask, tmp) {
>>               GEM_BUG_ON(engine->id >= ARRAY_SIZE(hw_engine_mask));
>>               hw_mask |= hw_engine_mask[engine->id];
>> -            hw_mask |= gen11_lock_sfc(engine);
>> +            ret = gen11_lock_sfc(engine, &hw_mask);
>> +            if (ret)
>> +                goto sfc_unlock;
> 
> Break on first failure looks unsafe to me. I think it would be more 
> robust to continue, no? Like if we have been asked to reset multiple 
> engines and only one failed, why not do the ones we can?

Chris corrected me on IRC, explaining that as long as we fail to reset 
one engine from engine_mask we fall back to full reset anyway. So this 
early return is immaterial to the end behavior and I have no further 
complaints. :)
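
To illustrate why bailing out on the first failure does not change the outcome, here is a minimal standalone sketch. It is not i915 code: every name in it (fake_engine, fake_lock_sfc, fake_reset_engines, fake_handle_hang) is a hypothetical stand-in, and it only models the fallback Chris described, under the assumption that any per-engine failure escalates to a full reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are i915 symbols. */
enum reset_path { RESET_PER_ENGINE, RESET_FULL_GPU };

struct fake_engine {
	bool sfc_lock_fails;	/* simulate a missing forced-lock ack */
	bool reset_done;
};

/* Returns 0 on success, -1 if the (simulated) lock ack never arrives. */
static int fake_lock_sfc(const struct fake_engine *e)
{
	return e->sfc_lock_fails ? -1 : 0;
}

/* Per-engine path: give up on the first lock failure, like the goto. */
static int fake_reset_engines(struct fake_engine *engines, size_t count)
{
	size_t i;

	assert(engines || !count);

	for (i = 0; i < count; i++)
		if (fake_lock_sfc(&engines[i]))
			return -1;	/* nothing has been reset yet */

	for (i = 0; i < count; i++)
		engines[i].reset_done = true;

	return 0;
}

/* Caller: any per-engine failure escalates to a full reset. */
static enum reset_path fake_handle_hang(struct fake_engine *engines,
					size_t count)
{
	size_t i;

	if (!fake_reset_engines(engines, count))
		return RESET_PER_ENGINE;

	/* The full reset covers every engine, including the failed one. */
	for (i = 0; i < count; i++)
		engines[i].reset_done = true;

	return RESET_FULL_GPU;
}
```

In this model every engine ends up reset whether or not the loop stops early; the only difference is which path does the work.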

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

> 
>>           }
>>       }
>>       ret = gen6_hw_domain_reset(gt, hw_mask);
>> +sfc_unlock:
>> +    /*
>> +     * we unlock the SFC based on the lock status and not the result of
>> +     * gen11_lock_sfc to make sure that we clean properly if something
>> +     * wrong happened during the lock (e.g. lock acquired after timeout
>> +     * expiration).
>> +     */
>>       if (engine_mask != ALL_ENGINES)
>>           for_each_engine_masked(engine, gt->i915, engine_mask, tmp)
>>               gen11_unlock_sfc(engine);
>>
> 
> So you decided not to read the register and cross check?
> 
> Regards,
> 
> Tvrtko
