All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Wilson <chris@chris-wilson.co.uk>
To: Lionel Landwerlin <lionel.g.landwerlin@intel.com>,
	Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
	Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>,
	intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH] i915/query: Correlate engine and cpu timestamps with better accuracy
Date: Thu, 04 Mar 2021 08:58:06 +0000	[thread overview]
Message-ID: <161484828635.28586.889038613448637986@build.alporthouse.com> (raw)
In-Reply-To: <81d17b5e-5b32-69b9-67bb-00da8469d88a@intel.com>

Quoting Lionel Landwerlin (2021-03-04 08:28:59)
> On 04/03/2021 02:09, Chris Wilson wrote:
> > Quoting Umesh Nerlige Ramappa (2021-03-03 21:28:00)
> >> Perf measurements rely on CPU and engine timestamps to correlate
> >> events of interest across these time domains. Current mechanisms get
> >> these timestamps separately and the calculated delta between these
> >> timestamps lack enough accuracy.
> >>
> >> To improve the accuracy of these time measurements to within a few us,
> >> add a query that returns the engine and cpu timestamps captured as
> >> close to each other as possible.
> >>
> >> v2: (Tvrtko)
> >> - document clock reference used
> >> - return cpu timestamp always
> >> - capture cpu time just before lower dword of cs timestamp
> >>
> >> v3: (Chris)
> >> - use uncore-rpm
> >> - use __query_cs_timestamp helper
> >>
> >> v4: (Lionel)
> >> - Kernel perf subsytem allows users to specify the clock id to be used
> >>    in perf_event_open. This clock id is used by the perf subsystem to
> >>    return the appropriate cpu timestamp in perf events. Similarly, let
> >>    the user pass the clockid to this query so that cpu timestamp
> >>    corresponds to the clock id requested.
> >>
> >> v5: (Tvrtko)
> >> - Use normal ktime accessors instead of fast versions
> >> - Add more uApi documentation
> >>
> >> v6: (Lionel)
> >> - Move switch out of spinlock
> >>
> >> v7: (Chris)
> >> - cs_timestamp is a misnomer, use cs_cycles instead
> >> - return the cs cycle frequency as well in the query
> >>
> >> v8:
> >> - Add platform and engine specific checks
> >>
> >> v9: (Lionel)
> >> - Return 2 cpu timestamps in the query - captured before and after the
> >>    register read
> >>
> >> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
> >> ---
> >>   drivers/gpu/drm/i915/i915_query.c | 144 ++++++++++++++++++++++++++++++
> >>   include/uapi/drm/i915_drm.h       |  47 ++++++++++
> >>   2 files changed, 191 insertions(+)
> >>
> >> diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c
> >> index fed337ad7b68..acca22ee6014 100644
> >> --- a/drivers/gpu/drm/i915/i915_query.c
> >> +++ b/drivers/gpu/drm/i915/i915_query.c
> >> @@ -6,6 +6,8 @@
> >>   
> >>   #include <linux/nospec.h>
> >>   
> >> +#include "gt/intel_engine_pm.h"
> >> +#include "gt/intel_engine_user.h"
> >>   #include "i915_drv.h"
> >>   #include "i915_perf.h"
> >>   #include "i915_query.h"
> >> @@ -90,6 +92,147 @@ static int query_topology_info(struct drm_i915_private *dev_priv,
> >>          return total_length;
> >>   }
> >>   
> >> +typedef u64 (*__ktime_func_t)(void);
> >> +static __ktime_func_t __clock_id_to_func(clockid_t clk_id)
> >> +{
> >> +       /*
> >> +        * Use logic same as the perf subsystem to allow user to select the
> >> +        * reference clock id to be used for timestamps.
> >> +        */
> >> +       switch (clk_id) {
> >> +       case CLOCK_MONOTONIC:
> >> +               return &ktime_get_ns;
> >> +       case CLOCK_MONOTONIC_RAW:
> >> +               return &ktime_get_raw_ns;
> >> +       case CLOCK_REALTIME:
> >> +               return &ktime_get_real_ns;
> >> +       case CLOCK_BOOTTIME:
> >> +               return &ktime_get_boottime_ns;
> >> +       case CLOCK_TAI:
> >> +               return &ktime_get_clocktai_ns;
> >> +       default:
> >> +               return NULL;
> >> +       }
> >> +}
> >> +
> >> +static inline int
> >> +__read_timestamps(struct intel_uncore *uncore,
> >> +                 i915_reg_t lower_reg,
> >> +                 i915_reg_t upper_reg,
> >> +                 u64 *cs_ts,
> >> +                 u64 *cpu_ts,
> >> +                 __ktime_func_t cpu_clock)
> >> +{
> >> +       u32 upper, lower, old_upper, loop = 0;
> >> +
> >> +       upper = intel_uncore_read_fw(uncore, upper_reg);
> >> +       do {
> >> +               cpu_ts[0] = cpu_clock();
> >> +               lower = intel_uncore_read_fw(uncore, lower_reg);
> >> +               cpu_ts[1] = cpu_clock();
> >> +               old_upper = upper;
> >> +               upper = intel_uncore_read_fw(uncore, upper_reg);
> > Both register reads comprise the timestamp returned to userspace, so
> > presumably you want cpu_ts[] to wrap both.
> >
> >         do {
> >                 old_upper = upper;
> >
> >                 cpu_ts[0] = cpu_clock();
> >                 lower = intel_uncore_read_fw(uncore, lower_reg);
> >                 upper = intel_uncore_read_fw(uncore, upper_reg);
> >                 cpu_ts[1] = cpu_clock();
> >         } while (upper != old_upper && loop++ < 2);
> 
> Actually if we want the best accuracy we can just deal with the lower dword.

Accuracy of what? The lower dword read perhaps, or the accuracy of the
sample point for the combined reads for the timestamp, which is closer
to an external observer (cpu_clock() implies reference to an external
observer).

The two clock samples are not even necessarily closely related due to the
nmi adjustments. If you wanted an unadjusted elapsed time for the read
you can use local_clock() then return the chosen cpu_clock() before plus
the elapsed delta from around the read as the estimated error.

cpu_ts[1] = local_clock();
cpu_ts[0] = cpu_clock();
lower = intel_uncore_read_fw(uncore, lower_reg);
cpu_ts[1] = local_clock() - cpu_ts[1];
-Chris
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2021-03-04  8:58 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-03 21:28 [Intel-gfx] [PATCH] i915/query: Correlate engine and cpu timestamps with better accuracy Umesh Nerlige Ramappa
2021-03-03 23:12 ` [Intel-gfx] ✓ Fi.CI.BAT: success for i915/query: Correlate engine and cpu timestamps with better accuracy (rev2) Patchwork
2021-03-04  0:09 ` [Intel-gfx] [PATCH] i915/query: Correlate engine and cpu timestamps with better accuracy Chris Wilson
2021-03-04  8:28   ` Lionel Landwerlin
2021-03-04  8:58     ` Chris Wilson [this message]
2021-03-04  9:45       ` Lionel Landwerlin
2021-03-04  9:54         ` Chris Wilson
2021-03-04 10:27           ` Lionel Landwerlin
2021-03-04  1:52 ` [Intel-gfx] ✗ Fi.CI.IGT: failure for i915/query: Correlate engine and cpu timestamps with better accuracy (rev2) Patchwork
2021-03-04  8:32 ` [Intel-gfx] [PATCH] i915/query: Correlate engine and cpu timestamps with better accuracy Lionel Landwerlin
2021-03-04 12:23 ` Lionel Landwerlin
  -- strict thread matches above, loose matches on Subject: below --
2021-03-05 19:37 Umesh Nerlige Ramappa
2021-03-04  0:11 kernel test robot
2021-03-02 18:29 Umesh Nerlige Ramappa
2021-03-02 20:35 ` Lionel Landwerlin
2021-03-03  0:12   ` Umesh Nerlige Ramappa
2021-03-03  9:21     ` Lionel Landwerlin
2021-03-03 16:27       ` Umesh Nerlige Ramappa
2021-03-03 16:28         ` Lionel Landwerlin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=161484828635.28586.889038613448637986@build.alporthouse.com \
    --to=chris@chris-wilson.co.uk \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=lionel.g.landwerlin@intel.com \
    --cc=tvrtko.ursulin@linux.intel.com \
    --cc=umesh.nerlige.ramappa@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.