From: Chris Wilson <chris@chris-wilson.co.uk>
To: Robert Bragg <robert@sixbynine.org>
Cc: intel-gfx@lists.freedesktop.org,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	David Airlie <airlied@linux.ie>,
	linux-api@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Paul Mackerras <paulus@samba.org>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Daniel Vetter <daniel.vetter@intel.com>
Subject: Re: [Intel-gfx] [RFC PATCH 07/11] drm/i915: Expose PMU for Observation Architecture
Date: Thu, 7 May 2015 15:58:00 +0100
Message-ID: <20150507145800.GZ22099@nuc-i3427.alporthouse.com>
In-Reply-To: <1431008154-6833-8-git-send-email-robert@sixbynine.org>

On Thu, May 07, 2015 at 03:15:50PM +0100, Robert Bragg wrote:
> +	/* We bypass the default perf core perf_paranoid_cpu() ||
> +	 * CAP_SYS_ADMIN check by using the PERF_PMU_CAP_IS_DEVICE
> +	 * flag and instead authenticate based on whether the current
> +	 * pid owns the specified context, or require CAP_SYS_ADMIN
> +	 * when collecting cross-context metrics.
> +	 */
> +	dev_priv->oa_pmu.specific_ctx = NULL;
> +	if (oa_attr.single_context) {
> +		u32 ctx_id = oa_attr.ctx_id;
> +		unsigned int drm_fd = oa_attr.drm_fd;
> +		struct fd fd = fdget(drm_fd);
> +
> +		if (fd.file) {

Specifying a ctx without providing the right fd should be its own error,
either EBADF or EINVAL.

> +			dev_priv->oa_pmu.specific_ctx =
> +				lookup_context(dev_priv, fd.file, ctx_id);
> +		}

Missing fdput
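
To illustrate the two points above, here is a minimal userspace sketch,
not the driver code. The mock_* helpers, oa_resolve_ctx(), the choice of
descriptor 3 and context id 1, and the -ENOENT return for an unknown
context are all assumptions made for the example; only the fdget()/fdput()
pairing and the separate error for a bad fd come from the review.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for struct file / struct fd; not the kernel types. */
struct mock_file { int refcount; };
struct mock_fd { struct mock_file *file; };

static struct mock_file drm_file = { .refcount = 1 };

/* Models fdget(): in this sketch only descriptor 3 is open. */
static struct mock_fd mock_fdget(unsigned int fd)
{
	struct mock_fd f = { NULL };

	if (fd == 3) {
		drm_file.refcount++;
		f.file = &drm_file;
	}
	return f;
}

/* Models fdput(): drops the reference taken by mock_fdget(). */
static void mock_fdput(struct mock_fd f)
{
	if (f.file)
		f.file->refcount--;
}

/* Hypothetical lookup: only context id 1 exists on drm_file. */
static int mock_lookup_context(struct mock_file *file, unsigned int ctx_id)
{
	assert(file->refcount > 1);	/* caller must still hold its reference */
	return ctx_id == 1 ? 1 : -ENOENT;
}

/* Returns 0 and stores the context in *ctx, or a negative errno. */
static int oa_resolve_ctx(unsigned int drm_fd, unsigned int ctx_id, int *ctx)
{
	struct mock_fd fd = mock_fdget(drm_fd);
	int ret;

	if (!fd.file)
		return -EBADF;	/* bad fd gets its own error, not EACCES */

	ret = mock_lookup_context(fd.file, ctx_id);
	mock_fdput(fd);		/* paired with mock_fdget() on every path */
	if (ret < 0)
		return ret;

	*ctx = ret;
	return 0;
}
```

In the real driver the context would presumably need its own reference
before the file reference is dropped; the sketch does not model that.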

> +	}
> +
> +	if (!dev_priv->oa_pmu.specific_ctx && !capable(CAP_SYS_ADMIN))
> +		return -EACCES;
> +
> +	mutex_lock(&dev_priv->dev->struct_mutex);

Use i915_mutex_lock_interruptible() here; it is probably best to couple
into the GPU error handling as well, especially as init_oa_buffer() will
go on to touch GPU internals.
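
As a rough userspace model of the control flow being suggested (a sketch
only): struct mock_dev, mock_lock_interruptible(), mock_init_oa_buffer()
and oa_init_locked() are hypothetical names, and the -EIO/-EINTR values
are assumed stand-ins for whatever the real locking helper returns when
the GPU is in an error state or a signal is pending.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical device state; the real driver keeps this elsewhere. */
struct mock_dev {
	int gpu_wedged;		/* GPU error state that cannot be recovered */
	int signal_pending;	/* caller was interrupted while waiting */
	int locked;		/* stands in for struct_mutex being held */
};

/* Models an interruptible lock that first checks the GPU error state. */
static int mock_lock_interruptible(struct mock_dev *dev)
{
	if (dev->gpu_wedged)
		return -EIO;
	if (dev->signal_pending)
		return -EINTR;
	dev->locked = 1;
	return 0;
}

/* Stand-in for init_oa_buffer(): must only run with the lock held. */
static int mock_init_oa_buffer(struct mock_dev *dev)
{
	assert(dev->locked);
	return 0;
}

/* The lock failure is propagated instead of blocking uninterruptibly. */
static int oa_init_locked(struct mock_dev *dev)
{
	int ret = mock_lock_interruptible(dev);

	if (ret)
		return ret;

	ret = mock_init_oa_buffer(dev);
	dev->locked = 0;	/* unlock on both success and failure */
	return ret;
}
```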

> +	ret = init_oa_buffer(event);
> +	mutex_unlock(&dev_priv->dev->struct_mutex);
> +
> +	if (ret)
> +		return ret;
> +
> +	BUG_ON(dev_priv->oa_pmu.exclusive_event);
> +	dev_priv->oa_pmu.exclusive_event = event;
> +
> +	event->destroy = i915_oa_event_destroy;
> +
> +	/* PRM - observability performance counters:
> +	 *
> +	 *   OACONTROL, performance counter enable, note:
> +	 *
> +	 *   "When this bit is set, in order to have coherent counts,
> +	 *   RC6 power state and trunk clock gating must be disabled.
> +	 *   This can be achieved by programming MMIO registers as
> +	 *   0xA094=0 and 0xA090[31]=1"
> +	 *
> +	 *   In our case we are expected that taking pm + FORCEWAKE
> +	 *   references will effectively disable RC6 and trunk clock
> +	 *   gating.
> +	 */
> +	intel_runtime_pm_get(dev_priv);
> +	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);

That is a nuisance. Aside: Why isn't OA inside the powerctx? Is a subset
valid with forcewake? It does perturb the system greatly to disable rc6,
so I wonder if it could be made optional?

> +
> +	return 0;
> +}
> +
> +static void update_oacontrol(struct drm_i915_private *dev_priv)
> +{
> +	BUG_ON(!spin_is_locked(&dev_priv->oa_pmu.lock));
> +
> +	if (dev_priv->oa_pmu.event_active) {
> +		unsigned long ctx_id = 0;
> +		bool pinning_ok = false;
> +
> +		if (dev_priv->oa_pmu.specific_ctx) {
> +			struct intel_context *ctx =
> +				dev_priv->oa_pmu.specific_ctx;
> +			struct drm_i915_gem_object *obj =
> +				ctx->legacy_hw_ctx.rcs_state;

If only there was ctx->legacy_hw_ctx.rcs_vma...

> +
> +			if (i915_gem_obj_is_pinned(obj)) {
> +				ctx_id = i915_gem_obj_ggtt_offset(obj);
> +				pinning_ok = true;
> +			}
> +		}
> +
> +		if ((ctx_id == 0 || pinning_ok)) {
> +			bool periodic = dev_priv->oa_pmu.periodic;
> +			u32 period_exponent = dev_priv->oa_pmu.period_exponent;
> +			u32 report_format = dev_priv->oa_pmu.oa_buffer.format;
> +
> +			I915_WRITE(GEN7_OACONTROL,
> +				   (ctx_id & GEN7_OACONTROL_CTX_MASK) |
> +				   (period_exponent <<
> +				    GEN7_OACONTROL_TIMER_PERIOD_SHIFT) |
> +				   (periodic ?
> +				    GEN7_OACONTROL_TIMER_ENABLE : 0) |
> +				   (report_format <<
> +				    GEN7_OACONTROL_FORMAT_SHIFT) |
> +				   (ctx_id ?
> +				    GEN7_OACONTROL_PER_CTX_ENABLE : 0) |
> +				   GEN7_OACONTROL_ENABLE);

I notice you don't use any write barriers...
-Chris
-- 
Chris Wilson, Intel Open Source Technology Centre
