From: Carlos Santa <carlos.santa@intel.com>
To: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Chris Wilson <chris@chris-wilson.co.uk>,
	intel-gfx@lists.freedesktop.org
Cc: Michel Thierry <michel.thierry@intel.com>
Subject: Re: drm/i915: Watchdog timeout: IRQ handler for gen8+
Date: Thu, 10 Jan 2019 18:58:17 -0800
Message-ID: <c3fa6ed82c11459c82a841caf5ac68e53eb2c46e.camel@intel.com>
In-Reply-To: <3e3254dd-45cd-cd69-8d6d-176dad65ae8b@linux.intel.com>

On Mon, 2019-01-07 at 16:58 +0000, Tvrtko Ursulin wrote:
> On 07/01/2019 13:57, Chris Wilson wrote:
> > Quoting Tvrtko Ursulin (2019-01-07 13:43:29)
> > > 
> > > On 07/01/2019 11:58, Tvrtko Ursulin wrote:
> > > 
> > > [snip]
> > > 
> > > > > > Note about future interaction with preemption: Preemption could
> > > > > > happen in a command sequence prior to watchdog counter getting
> > > > > > disabled, resulting in watchdog being triggered following
> > > > > > preemption (e.g. when watchdog had been enabled in the low
> > > > > > priority batch). The driver will need to explicitly disable the
> > > > > > watchdog counter as part of the preemption sequence.
> > > > 
> > > > Does the series take care of preemption?
> > > 
> > > I did not find that it does.
> > 
> > Oh. I hoped that the watchdog was saved as part of the context... Then
> > despite preemption, the timeout would resume from where we left off as
> > soon as it was back on the gpu.
> >
> > If the timeout remaining was context saved it would be much simpler (at
> > least on first glance), please say it is.

The watchdog timeout gets saved as part of the register state context,
so it will still be enabled after coming back from preemption, but the
timeout value will be reset to the original MAX value it was programmed
with. At least that's what I remember from a discussion with Michel,
but I can check again...

Regards,
Carlos

> 
> I made my comments going only by the text from the commit message and
> the absence of any preemption special handling.
> 
> Having read the spec, the situation seems like this:
> 
>   * Watchdog control and threshold register are context saved and
>     restored.
>
>   * On a context switch watchdog counter is reset to zero and
>     automatically disabled until enabled by a context restore or
>     explicitly.
> 
> So it sounds like the commit message could be wrong that special
> handling is needed from this direction. But read till the end on the
> restriction listed.
> 
>   * Watchdog counter is reset to zero and is not accumulated across
>     multiple submissions of the same context (due to preemption).
> 
> I read this as - after preemption a context gets a new full timeout
> allocation. Or in other words, if a context is preempted N times, its
> cumulative watchdog timeout will be N * set value.
> 
> This could be theoretically exploitable to bypass the timeout. If a
> client sets up two contexts with prio -1 and -2, and keeps submitting
> periodic no-op batches against the prio -1 context, while prio -2 is
> its own hog, then the prio -2 context defeats the watchdog timer. I
> think... I would appreciate it if someone challenged this conclusion.
> 
> And finally there is one programming restriction which says:
> 
>   * SW must not preempt the workload which has watchdog enabled.
>     Either it must:
> 
> a) disable preemption for that workload completely, or
> b) disable the watchdog via mmio write before any write to ELSP
> 
> This seems in contradiction with the statement that the counter gets
> disabled on context switch and stays disabled.
>
> I did not spot anything like this in the series. So it would seem the
> commit message is correct after all.
> 
> It would be good if someone could re-read the bspec text on register
> 0x2178 to double check what I wrote.
> 
> Regards,
> 
> Tvrtko
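
As an illustration only, the reading quoted above (control and threshold
saved with the context, counter restarted from zero on every submission)
can be modelled in a few lines of standalone C. This is a toy model, not
i915 code, and it is not taken from the series or the bspec: every name
in it (toy_ctx, toy_ctx_restore, toy_ctx_run, toy_hog_with_preemption)
and the tick values are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of the quoted reading; all names are hypothetical. */
struct toy_ctx {
	uint32_t threshold;   /* assumed saved/restored with the context */
	bool     enabled;     /* control bit, assumed saved/restored too */
	uint32_t counter;     /* assumed NOT saved: restarts on each submission */
	uint64_t total_ticks; /* bookkeeping only: cumulative time on the engine */
};

/* Context restore: control and threshold survive, the counter restarts. */
static void toy_ctx_restore(struct toy_ctx *ctx)
{
	ctx->counter = 0;
}

/* Run for up to 'ticks' ticks; return true as soon as the watchdog fires. */
static bool toy_ctx_run(struct toy_ctx *ctx, uint32_t ticks)
{
	assert(ctx->threshold > 0);

	for (uint32_t i = 0; i < ticks; i++) {
		ctx->counter++;
		ctx->total_ticks++;
		if (ctx->enabled && ctx->counter >= ctx->threshold)
			return true;
	}
	return false;
}

/*
 * A hog that is preempted after every 'slice' ticks and resubmitted,
 * for 'submissions' submissions in total. Returns true if the watchdog
 * fired during any of them.
 */
static bool toy_hog_with_preemption(struct toy_ctx *ctx, uint32_t slice,
				    unsigned int submissions)
{
	for (unsigned int n = 0; n < submissions; n++) {
		toy_ctx_restore(ctx);
		if (toy_ctx_run(ctx, slice))
			return true;
	}
	return false;
}
```

With a threshold of 100 ticks, a hog preempted after every 90 ticks
never trips the watchdog in this model, even after 900 ticks on the
engine, while the same hog left alone trips it at tick 100. Whether the
hardware really behaves this way is exactly the open question in the
quoted text.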

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
