From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
To: Chris Wilson <chris@chris-wilson.co.uk>, Intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [RFC 2/2] drm/i915: Use ABI engine class in error state ecode
Date: Thu, 5 Nov 2020 09:31:07 +0000
Message-ID: <19d84410-f2a9-58a7-a60e-bd03161c4078@linux.intel.com>
In-Reply-To: <160453261800.17472.1591297091381831846@build.alporthouse.com>


On 04/11/2020 23:30, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2020-11-04 13:47:43)
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Instead of printing out the internal engine mask, which can change between
>> kernel versions, making it difficult to map to actual engines, present a
>> bitmask of the hung engines' ABI classes. For example:
>>
>>    [drm] GPU HANG: ecode 9:24dffffd:8, in gem_exec_schedu [1334]
>>
>> Notice the swapped order of ecode and bitmask, which makes it obvious
>> whether a bug report uses the new or the old format.
>>
>> The engine ABI class is useful for quickly categorizing hangs (render vs
>> media etc) in bug reports. With virtual engines it is even more useful
>> than the current scheme.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gpu_error.c | 13 +++++++------
>>   1 file changed, 7 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
>> index 857db66cc4a3..e7d9af184d58 100644
>> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
>> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
>> @@ -1659,17 +1659,16 @@ static u32 generate_ecode(const struct intel_engine_coredump *ee)
>>   static const char *error_msg(struct i915_gpu_coredump *error)
>>   {
>>          struct intel_engine_coredump *first = NULL;
>> +       unsigned int hung_classes = 0;
>>          struct intel_gt_coredump *gt;
>> -       intel_engine_mask_t engines;
>>          int len;
>>   
>> -       engines = 0;
>>          for (gt = error->gt; gt; gt = gt->next) {
>>                  struct intel_engine_coredump *cs;
>>   
>>                  for (cs = gt->engine; cs; cs = cs->next) {
>>                          if (cs->hung) {
>> -                               engines |= cs->engine->mask;
>> +                               hung_classes |= BIT(cs->engine->uabi_class);
> 
> Your argument makes sense.
> 
>>                                  if (!first)
>>                                          first = cs;
>>                          }
>> @@ -1677,9 +1676,11 @@ static const char *error_msg(struct i915_gpu_coredump *error)
>>          }
>>   
>>          len = scnprintf(error->error_msg, sizeof(error->error_msg),
>> -                       "GPU HANG: ecode %d:%x:%08x",
>> -                       INTEL_GEN(error->i915), engines,
>> -                       generate_ecode(first));
>> +                       "GPU HANG: ecode %d:%08x:%x",
>> +                       INTEL_GEN(error->i915),
>> +                       generate_ecode(first),
>> +                       hung_classes);
> 
> I vote for keeping gen:engines:ecode order, for me that is biggest to
> smallest.

Then it would not be obvious what kind of bits the mask holds, for
example when reading the ecode in Bugzilla titles, and two different bugs
could be incorrectly marked as duplicates. Maybe instead of changing the
order we could change the separator(s)? Or add a prefix/suffix to the mask?
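
To make the two options concrete, here is a minimal userspace sketch, not the driver code. The struct, the enum names and both helper functions are hypothetical stand-ins; the enum values are assumed to mirror the uABI I915_ENGINE_CLASS_* numbering, and the 'c' prefix in the second format is only an illustration of "add a prefix to the mask", not something anyone has proposed literally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed to mirror the uABI values of I915_ENGINE_CLASS_*. */
enum engine_class {
	CLASS_RENDER = 0,
	CLASS_COPY = 1,
	CLASS_VIDEO = 2,
	CLASS_VIDEO_ENHANCE = 3,
};

/* Hypothetical stand-in for the per-engine coredump record. */
struct engine_dump {
	unsigned int uabi_class;
	int hung;
};

/* OR together one bit per ABI class that has at least one hung engine. */
static unsigned int hung_class_mask(const struct engine_dump *e, size_t n)
{
	unsigned int mask = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!e[i].hung)
			continue;
		assert(e[i].uabi_class < 32); /* class must fit in the mask */
		mask |= 1u << e[i].uabi_class;
	}

	return mask;
}

/* Order used in the patch: gen:ecode:classes. */
static int format_swapped(char *buf, size_t size, int gen,
			  unsigned int ecode, unsigned int classes)
{
	return snprintf(buf, size, "GPU HANG: ecode %d:%08x:%x",
			gen, ecode, classes);
}

/*
 * Hypothetical alternative: keep the gen:mask:ecode order, but tag the
 * mask with a 'c' prefix so it cannot be mistaken for the old engine mask.
 */
static int format_prefixed(char *buf, size_t size, int gen,
			   unsigned int ecode, unsigned int classes)
{
	return snprintf(buf, size, "GPU HANG: ecode %d:c%x:%08x",
			gen, classes, ecode);
}
```

With a single hung video-enhance engine on gen 9 and ecode 0x24dffffd, format_swapped() produces "GPU HANG: ecode 9:24dffffd:8", which matches the example in the commit message, while format_prefixed() would produce "GPU HANG: ecode 9:c8:24dffffd".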

Regards,

Tvrtko

