From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
To: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: intel-gfx@lists.freedesktop.org
Subject: Re: [Intel-gfx] [PATCH 01/19] drm/i915/perf: Fix OA filtering logic for GuC mode
Date: Tue, 6 Sep 2022 21:39:33 +0300
Message-ID: <f1e9e230-2626-0f6c-02a7-e063122b759b@intel.com>
In-Reply-To: <YxeF0b6ohtFcDXf6@unerlige-ril>

On 06/09/2022 20:39, Umesh Nerlige Ramappa wrote:
> On Tue, Sep 06, 2022 at 05:33:00PM +0300, Lionel Landwerlin wrote:
>> On 23/08/2022 23:41, Umesh Nerlige Ramappa wrote:
>>> With GuC mode of submission, GuC is in control of defining the context id field
>>> that is part of the OA reports. To filter reports, UMD and KMD must know what sw
>>> context id was chosen by GuC. There is no interface between KMD and GuC to
>>> determine this, so read the upper-dword of EXECLIST_STATUS to filter/squash OA
>>> reports for the specific context.
>>>
>>> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
>>
>>
>> I assume you checked with GuC that this doesn't change as the context 
>> is running?
>
> Correct.
>
>>
>> With i915/execlist submission mode, we had to ask i915 to pin the 
>> sw_id/ctx_id.
>>
>
> From GuC perspective, the context id can change once KMD de-registers 
> the context and that will not happen while the context is in use.
>
> Thanks,
> Umesh


Thanks Umesh,


Maybe I should have been more precise in my question:


Can the ID change while the i915-perf stream is open?

It makes sense that the ID does not change while the context is running.

But since the number of available IDs is limited (to 2k or so on Gfx12), 
the GuC may have to reuse IDs if too many apps want to run while 
i915-perf is active and filtering.


-Lionel


>
>>
>> If that's not the case then filtering is broken.
>>
>>
>> -Lionel
>>
>>
>>> ---
>>>  drivers/gpu/drm/i915/gt/intel_lrc.h |   2 +
>>>  drivers/gpu/drm/i915/i915_perf.c    | 141 ++++++++++++++++++++++++----
>>>  2 files changed, 124 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.h b/drivers/gpu/drm/i915/gt/intel_lrc.h
>>> index a390f0813c8b..7111bae759f3 100644
>>> --- a/drivers/gpu/drm/i915/gt/intel_lrc.h
>>> +++ b/drivers/gpu/drm/i915/gt/intel_lrc.h
>>> @@ -110,6 +110,8 @@ enum {
>>>  #define XEHP_SW_CTX_ID_WIDTH            16
>>>  #define XEHP_SW_COUNTER_SHIFT            58
>>>  #define XEHP_SW_COUNTER_WIDTH            6
>>> +#define GEN12_GUC_SW_CTX_ID_SHIFT        39
>>> +#define GEN12_GUC_SW_CTX_ID_WIDTH        16
>>>  static inline void lrc_runtime_start(struct intel_context *ce)
>>>  {
>>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>>> index f3c23fe9ad9c..735244a3aedd 100644
>>> --- a/drivers/gpu/drm/i915/i915_perf.c
>>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>>> @@ -1233,6 +1233,125 @@ static struct intel_context *oa_pin_context(struct i915_perf_stream *stream)
>>>      return stream->pinned_ctx;
>>>  }
>>> +static int
>>> +__store_reg_to_mem(struct i915_request *rq, i915_reg_t reg, u32 ggtt_offset)
>>> +{
>>> +    u32 *cs, cmd;
>>> +
>>> +    cmd = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
>>> +    if (GRAPHICS_VER(rq->engine->i915) >= 8)
>>> +        cmd++;
>>> +
>>> +    cs = intel_ring_begin(rq, 4);
>>> +    if (IS_ERR(cs))
>>> +        return PTR_ERR(cs);
>>> +
>>> +    *cs++ = cmd;
>>> +    *cs++ = i915_mmio_reg_offset(reg);
>>> +    *cs++ = ggtt_offset;
>>> +    *cs++ = 0;
>>> +
>>> +    intel_ring_advance(rq, cs);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +static int
>>> +__read_reg(struct intel_context *ce, i915_reg_t reg, u32 ggtt_offset)
>>> +{
>>> +    struct i915_request *rq;
>>> +    int err;
>>> +
>>> +    rq = i915_request_create(ce);
>>> +    if (IS_ERR(rq))
>>> +        return PTR_ERR(rq);
>>> +
>>> +    i915_request_get(rq);
>>> +
>>> +    err = __store_reg_to_mem(rq, reg, ggtt_offset);
>>> +
>>> +    i915_request_add(rq);
>>> +    if (!err && i915_request_wait(rq, 0, HZ / 2) < 0)
>>> +        err = -ETIME;
>>> +
>>> +    i915_request_put(rq);
>>> +
>>> +    return err;
>>> +}
>>> +
>>> +static int
>>> +gen12_guc_sw_ctx_id(struct intel_context *ce, u32 *ctx_id)
>>> +{
>>> +    struct i915_vma *scratch;
>>> +    u32 *val;
>>> +    int err;
>>> +
>>> +    scratch = __vm_create_scratch_for_read_pinned(&ce->engine->gt->ggtt->vm, 4);
>>> +    if (IS_ERR(scratch))
>>> +        return PTR_ERR(scratch);
>>> +
>>> +    err = i915_vma_sync(scratch);
>>> +    if (err)
>>> +        goto err_scratch;
>>> +
>>> +    err = __read_reg(ce, RING_EXECLIST_STATUS_HI(ce->engine->mmio_base),
>>> +             i915_ggtt_offset(scratch));
>>> +    if (err)
>>> +        goto err_scratch;
>>> +
>>> +    val = i915_gem_object_pin_map_unlocked(scratch->obj, I915_MAP_WB);
>>> +    if (IS_ERR(val)) {
>>> +        err = PTR_ERR(val);
>>> +        goto err_scratch;
>>> +    }
>>> +
>>> +    *ctx_id = *val;
>>> +    i915_gem_object_unpin_map(scratch->obj);
>>> +
>>> +err_scratch:
>>> +    i915_vma_unpin_and_release(&scratch, 0);
>>> +    return err;
>>> +}
>>> +
>>> +/*
>>> + * For execlist mode of submission, pick an unused context id
>>> + * 0 - (NUM_CONTEXT_TAG -1) are used by other contexts
>>> + * XXX_MAX_CONTEXT_HW_ID is used by idle context
>>> + *
>>> + * For GuC mode of submission read context id from the upper dword of the
>>> + * EXECLIST_STATUS register.
>>> + */
>>> +static int gen12_get_render_context_id(struct i915_perf_stream *stream)
>>> +{
>>> +    u32 ctx_id, mask;
>>> +    int ret;
>>> +
>>> +    if (intel_engine_uses_guc(stream->engine)) {
>>> +        ret = gen12_guc_sw_ctx_id(stream->pinned_ctx, &ctx_id);
>>> +        if (ret)
>>> +            return ret;
>>> +
>>> +        mask = ((1U << GEN12_GUC_SW_CTX_ID_WIDTH) - 1) <<
>>> +            (GEN12_GUC_SW_CTX_ID_SHIFT - 32);
>>> +    } else if (GRAPHICS_VER_FULL(stream->engine->i915) >= IP_VER(12, 50)) {
>>> +        ctx_id = (XEHP_MAX_CONTEXT_HW_ID - 1) <<
>>> +            (XEHP_SW_CTX_ID_SHIFT - 32);
>>> +
>>> +        mask = ((1U << XEHP_SW_CTX_ID_WIDTH) - 1) <<
>>> +            (XEHP_SW_CTX_ID_SHIFT - 32);
>>> +    } else {
>>> +        ctx_id = (GEN12_MAX_CONTEXT_HW_ID - 1) <<
>>> +             (GEN11_SW_CTX_ID_SHIFT - 32);
>>> +
>>> +        mask = ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) <<
>>> +            (GEN11_SW_CTX_ID_SHIFT - 32);
>>> +    }
>>> +    stream->specific_ctx_id = ctx_id & mask;
>>> +    stream->specific_ctx_id_mask = mask;
>>> +
>>> +    return 0;
>>> +}
>>> +
>>>  /**
>>>   * oa_get_render_ctx_id - determine and hold ctx hw id
>>>   * @stream: An i915-perf stream opened for OA metrics
>>> @@ -1246,6 +1365,7 @@ static struct intel_context *oa_pin_context(struct i915_perf_stream *stream)
>>>  static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>>>  {
>>>      struct intel_context *ce;
>>> +    int ret = 0;
>>>      ce = oa_pin_context(stream);
>>>      if (IS_ERR(ce))
>>> @@ -1292,24 +1412,7 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>>>      case 11:
>>>      case 12:
>>> -        if (GRAPHICS_VER_FULL(ce->engine->i915) >= IP_VER(12, 50)) {
>>> -            stream->specific_ctx_id_mask =
>>> -                ((1U << XEHP_SW_CTX_ID_WIDTH) - 1) <<
>>> -                (XEHP_SW_CTX_ID_SHIFT - 32);
>>> -            stream->specific_ctx_id =
>>> -                (XEHP_MAX_CONTEXT_HW_ID - 1) <<
>>> -                (XEHP_SW_CTX_ID_SHIFT - 32);
>>> -        } else {
>>> -            stream->specific_ctx_id_mask =
>>> -                ((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32);
>>> -            /*
>>> -             * Pick an unused context id
>>> -             * 0 - BITS_PER_LONG are used by other contexts
>>> -             * GEN12_MAX_CONTEXT_HW_ID (0x7ff) is used by idle context
>>> -             */
>>> -            stream->specific_ctx_id =
>>> -                (GEN12_MAX_CONTEXT_HW_ID - 1) << (GEN11_SW_CTX_ID_SHIFT - 32);
>>> -        }
>>> +        ret = gen12_get_render_context_id(stream);
>>>          break;
>>>      default:
>>> @@ -1323,7 +1426,7 @@ static int oa_get_render_ctx_id(struct i915_perf_stream *stream)
>>>          stream->specific_ctx_id,
>>>          stream->specific_ctx_id_mask);
>>> -    return 0;
>>> +    return ret;
>>>  }
>>>  /**
>>
>>
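
For reference, the mask arithmetic in the GuC branch of the quoted 
patch can be checked in isolation. This is a standalone sketch, not 
kernel code: the two constants are copied from the patch, while 
guc_ctx_id_mask_hi() and guc_ctx_id_from_status_hi() are hypothetical 
helper names. The shift is given relative to the 64-bit value, so 32 is 
subtracted when working on the upper dword alone.

```c
#include <stdint.h>

/* Values taken from the quoted patch (intel_lrc.h hunk). */
#define GEN12_GUC_SW_CTX_ID_SHIFT 39
#define GEN12_GUC_SW_CTX_ID_WIDTH 16

/* Mask of the sw context id field within the upper dword. */
static uint32_t guc_ctx_id_mask_hi(void)
{
	return ((1u << GEN12_GUC_SW_CTX_ID_WIDTH) - 1) <<
		(GEN12_GUC_SW_CTX_ID_SHIFT - 32);
}

/* Extract the raw sw context id from an upper-dword value. */
static uint32_t guc_ctx_id_from_status_hi(uint32_t status_hi)
{
	return (status_hi & guc_ctx_id_mask_hi()) >>
		(GEN12_GUC_SW_CTX_ID_SHIFT - 32);
}
```

With these constants the field occupies bits 7 to 22 of the upper 
dword, so the mask works out to 0x007fff80. Note that the patch stores 
the masked but unshifted value in specific_ctx_id; the shift back down 
here is only for readability.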

