From: Rodrigo Vivi <rodrigo.vivi@intel.com>
To: Jani Nikula <jani.nikula@intel.com>
Cc: Ben Widawsky <ben@bwidawsk.net>,
	"Nikkanen, Kimmo" <kimmo.nikkanen@intel.com>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	intel-gfx@lists.freedesktop.org
Subject: Re: [PATCH v3] drm/i915: Remove unsafe i915.enable_rc6
Date: Thu, 2 Nov 2017 07:47:07 -0700
Message-ID: <20171102144707.mwgb4zqj7owq4evz@intel.com>
In-Reply-To: <87efphdrzu.fsf@intel.com>

On Thu, Nov 02, 2017 at 08:06:29AM +0000, Jani Nikula wrote:
> On Wed, 01 Nov 2017, Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
> > On Wed, Nov 01, 2017 at 04:21:08PM +0000, Ben Widawsky wrote:
> >> On 17-11-01 18:09:47, Joonas Lahtinen wrote:
> >> > + Kimmo and Paul
> >> > 
> >> > On Wed, 2017-11-01 at 07:43 -0700, Ben Widawsky wrote:
> >> > > On 17-11-01 14:07:28, Joonas Lahtinen wrote:
> >> > > > On Mon, 2017-10-30 at 10:48 -0700, Rodrigo Vivi wrote:
> >> > > > > On Mon, Oct 30, 2017 at 01:00:51PM +0000, David Weinehall wrote:
> >> > > > > > On Fri, Oct 27, 2017 at 01:57:09PM -0700, Daniele Ceraolo Spurio wrote:
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On 26/10/17 03:32, Chris Wilson wrote:
> >> > > > > > > > It has been many years since the last confirmed sighting (and fix) of an
> >> > > > > > > > RC6 related bug (usually a system hang). Remove the parameter to stop
> >> > > > > > > > users from setting dangerous values, as they often set it during triage
> >> > > > > > > > and end up disabling the entire runtime pm instead (the option is not a
> >> > > > > > > > fine scalpel!).
> >> > > > > > > >
> >> > > > > > > > Furthermore, it allows users to set known dangerous values which were
> >> > > > > > > > intended for testing and not for production use. For testing, we can
> >> > > > > > > > always patch in the required setting without having to expose ourselves
> >> > > > > > > > to random abuse.
> >> > > > > > > >
> >> > > > > > > > v2: Fixup NEEDS_WaRsDisableCoarsePowerGating fumble, and document the
> >> > > > > > > > lack of ilk support better.
> >> > > > > > > > v3: Clear intel_info->rc6p if we don't support rc6 itself.
> >> > > > > > > >
> >> > > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >> > > > > > > > Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >> > > > > > > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> >> > > > > > > > Cc: Jani Nikula <jani.nikula@intel.com>
> >> > > > > > > > Cc: Imre Deak <imre.deak@intel.com>
> >> > > > > > > > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >> > > > > > > > Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> >> > > > > > > > ---
> >> > > > > > >
> >> > > > > > > I think that for execution/debug on early silicon we might still want the
> >> > > > > > > ability to turn features like RC6 off. Maybe we can add a debug kconfig to
> >> > > > > > > force info->has_rc6 = 0? Not a blocker to this patch but worth considering
> >> > > > > > > IMO.
> >> > > > > >
> >> > > > > > Most of the BIOSes I've seen on our RVPs have had an option to disable
> >> > > > > > RC6.
> >> > > > >
> >> > > > > A BIOS option doesn't stop our code from running and setting some MMIOs.
> >> > > > > Not sure how the GPU will behave in such cases.
> >> > > > >
> >> > > > > I like the idea of removing some and keeping the parameters clean.
> >> > > > > But there are a few, like RC6 and disable_power_wells, that are very
> >> > > > > useful for platform enabling and also when assisting others to debug issues.
> >> > > > >
> >> > > > > For instance, right now, after we fixed RC6 on CNL, someone said that
> >> > > > > he believes he is seeing more hangs, so I immediately asked him to boot with
> >> > > > > i915.enable_rc6=0 to double check. It is easier and more straightforward
> >> > > > > to direct them to the unsafe param than to ask them to compile the code
> >> > > > > with different options or to use some BIOS options that we are not sure about.
> >> > > > >
> >> > > > > Also on bug triage some options like this are helpful.
> >> > > > >
> >> > > > > Also, BIOS and compile-time options are saved flags. So if you need to do a quick test
> >> > > > > you have to save it, and then unsave it later. Parameters are very convenient
> >> > > > > for a one-boot-only check.
> >> > > >
> >> > > > It's convenient for sure, but the unsafe module parameters seem to be
> >> > > > finding their way into way too many HOWTOs, and from there to some
> >> > > > "productized" use-cases. Chris states that it has been some years since
> >> > > > setting .enable_rc6=0 solved an issue on publicly shipping products,
> >> > > > so I don't see a need for carrying this.
> >> > > >
> >> > > > We shouldn't allow the convenience of not having to change one line and
> >> > > > recompile kernel during development to affect the end-users who are
> >> > > > Googling how to get the best performance out of their hardware (I could
> >> > > > mention some distro here).
> >> > > >
> >> > > > This seems like the best option, as I don't think introducing kernel
> >> > > > parameters that only exist on debug builds would be too convenient
> >> > > > either. It'd maybe just add more confusion.
> >> > > >
> >> > > > Regards, Joonas
> >> > > 
> >> > > I believe the ability to disable RC6 is valuable not just for debugging
> >> > > purposes. Folks with very latency sensitive workloads are often willing to
> >> > > forgo power savings. The real problem I see is that we don't test without rc6
> >> > > in our setup, which indeed makes it unsafe. As such, I see the other option here
> >> > > as going back to the ability to toggle rc6 after load (either a module parameter, or
> >> > > make it sysfs), and actually run some subset of our workloads with RC6. I
> >> > > suspect people will poop on that suggestion, but I figured I'd mention it.
> >> > 
> >> > I agree there, but by my understanding there's really no ask to support
> >> > the feature in upstream. And the original motive from Chris to drop the
> >> > feature is that it's unsafe as it currently is.
> >> > 
> >> > So unless we've got the resources to bring it back from the unsafe
> >> > zone, I think we should drop it like this patch proposes.
> >> > 
> >> > Regards, Joonas
> >> 
> >> Yep, I agree. One other option would be to move i915_forcewake_user to sysfs and
> >> let them use that.
> >
> > Well, I won't try to block that. I am just putting in my 2 cents: I believe it is a very
> > useful parameter.
> >
> > It wasn't that long ago that we last needed the flag to allow
> > end users to have a functional machine: https://plus.google.com/+JonMasters/posts/BqWLEjenLKv.
> >
> > or to debug a related issue:
> > https://bugzilla.redhat.com/show_bug.cgi?id=1440988
> > https://bugzilla.kernel.org/show_bug.cgi?id=116431
> >
> > Although the dates on a few seem to be more than a year old, we need to consider that
> > that was our latest new GPU... gen9.
> >
> > If products are recommending the use of enable_rc6=0, I can see them
> > adding a patch to disable it. The effect is the same and our convenience is gone.
> >
> > But again, just my view here. Not a nack ;)
> 
> I suppose the compromise would be to make it a boolean module parameter
> to only allow disabling rc6 on platforms where it's enabled by default,
> but not letting you enable rc6 where it's disabled by default. I.e. only
> support i915.enable_rc6=0 to be passed by the user.

+1. I like this approach.
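
To make the proposed semantics concrete, here is a minimal sketch in plain C. It is not the actual i915 code: `struct rc6_platform`, `rc6_default`, and `rc6_effective()` are hypothetical names used only for illustration. The assumption is that the parameter acts purely as a veto: it can turn RC6 off where the platform default is on, but can never turn it on where the default is off.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a platform; not a real i915 structure. */
struct rc6_platform {
	bool rc6_default;	/* does the driver enable RC6 by default here? */
};

/*
 * Combine the platform default with a boolean enable_rc6 parameter.
 * The parameter can only veto RC6; it cannot force it on.
 */
static bool rc6_effective(const struct rc6_platform *p, bool enable_rc6_param)
{
	return p->rc6_default && enable_rc6_param;
}

/* Walk through the four possible combinations. */
static void rc6_effective_examples(void)
{
	const struct rc6_platform on = { .rc6_default = true };
	const struct rc6_platform off = { .rc6_default = false };

	assert(rc6_effective(&on, true));	/* default kept */
	assert(!rc6_effective(&on, false));	/* user disables RC6 */
	assert(!rc6_effective(&off, true));	/* user cannot force RC6 on */
	assert(!rc6_effective(&off, false));	/* stays off */
}
```

With this shape, `i915.enable_rc6=0` keeps working for one-boot triage, while the known-dangerous "force on" values are no longer reachable from the command line.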

> 
> BR,
> Jani.
> 
> 
> >
> > Thanks,
> > Rodrigo.
> >
> >> 
> >> -- 
> >> Ben Widawsky, Intel Open Source Technology Center
> 
> -- 
> Jani Nikula, Intel Open Source Technology Center
