All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
To: "Souza, Jose" <jose.souza@intel.com>,
	"Roper, Matthew D" <matthew.d.roper@intel.com>,
	"thomas.hellstrom@linux.intel.com"
	<thomas.hellstrom@linux.intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Subject: Re: [Intel-xe] [PATCH 2/3] drm/xe: Add missing TLB invalidation to emit_pipe_invalidate()
Date: Fri, 2 Jun 2023 11:20:14 +0300	[thread overview]
Message-ID: <08a53290-309d-379b-855e-99f32b8f6298@intel.com> (raw)
In-Reply-To: <3995749ec53beb559430390136ac86a15d2ed677.camel@intel.com>

On 02/06/2023 00:09, Souza, Jose wrote:
> On Thu, 2023-06-01 at 12:26 -0700, Matt Roper wrote:
>> On Wed, May 31, 2023 at 10:46:20AM +0200, Thomas Hellström wrote:
>>> On 5/30/23 20:40, Souza, Jose wrote:
>>>> On Mon, 2023-05-29 at 17:07 +0200, Thomas Hellström wrote:
>>>>> On 5/29/23 16:56, Thomas Hellström wrote:
>>>>>> On 5/29/23 16:48, Souza, Jose wrote:
>>>>>>> On Mon, 2023-05-29 at 11:08 +0200, Thomas Hellström wrote:
>>>>>>>> On Fri, 2023-05-26 at 12:06 -0700, José Roberto de Souza wrote:
>>>>>>>>> i915 invalidates TLB before emit BB start, so doing the same in Xe.
>>>>>>>> Hi, José,
>>>>>>>>
>>>>>>>> Do you see an issue because of missing TLB flushes? In that case that
>>>>>>>> needs to be added to the commit message. We do TLB flushes on unbind,
>>>>>>>> but not sure if we do them on rebind, so if that's the issue we need to
>>>>>>>> figure out whether we should do them also on rebind or, like in this
>>>>>>>> patch, on each exec.
>>>>>>> I have a group of tests that results flips randomly. It fails when
>>>>>>> the rendered buffer is compared to expected result.
>>>>>>> Anything that add a bit of delay after the exec fixes those tests so
>>>>>>> I was looking for any missing flush in Xe KMD and Mesa.
>>>>>>>
>>>>>>> This one did not fixed it but as i915 was doing it I thought would be
>>>>>>> good to do in Xe too.
>>>>>> I think missing TLB invalidations are more likely to cause random
>>>>>> overwrites of freed memory. Let me do a quick check on these. But the
>>>>>> problem you're describing indeed sounds more like a missing render
>>>>>> cache flush.
>>>>>>
>>>>> It indeed looks like we're doing proper TLB invalidation on both rebind
>>>>> and unbind, so this patch shouldn't really be needed.
>>>>>
>>>>> (look for "invalidation_fence_init()")
>>>> With this patch + PIPE_CONTROL flush at the end of batch buffer in Mesa, fixed the groups of the tests that were flipping results.
>>>> Do a XE_GUC_ACTION_TLB_INVALIDATION is the same as a PIPE_CONTROL_TLB_INVALIDATE?
>> +Cc Matt Brost and Niranjana
>>
>> It's worth noting that the removal of PIPE_CONTROL_TLB_INVALIDATE was a
>> deliberate act in this patch:
>>
>>          commit 41db3304cff95475c6f6667ae27fab3c144a49df
>>          Author:     Matthew Brost <matthew.brost@intel.com>
>>          AuthorDate: Thu Jan 26 10:40:41 2023 -0800
>>          Commit:     Rodrigo Vivi <rodrigo.vivi@intel.com>
>>          CommitDate: Tue May 23 14:13:47 2023 -0400
>>
>>              drm/xe: Drop TLB invalidation from ring operations
>>
>> so the expectation is that this specific flag shouldn't be necessary due
>> to the GuC-based invalidation we're doing.
>>
>> At the hardware level, there are a lot of different TLBs that cache page
>> table lookups.  Each engine has a TLB, the GuC itself has another TLB,
>> the OA unit has another one, and on igpu platforms, there are also
>> additional AuxCCS TLBs for the compression lookups that most engines can
>> do.  You can trigger TLB invalidations in various ways:  via GPU
>> instructions (like PIPE_CONTROL), MMIO registers (like
>> XEHP_GFX_TLB_INV_CR), or via a GuC request
>> (XE_GUC_ACTION_TLB_INVALIDATION).  The details of invalidation vary by
>> platform (e.g., you can do selective range-based invalidation on some
>> platforms, but not on others).  The GuC just provides a single
>> interface, but that interface lets you choose whether you're doing full
>> vs range invalidation, whether you're invalidating engine TLBs or
>> the GuC's own TLB, etc.  If using the GuC interface to invalidate engine
>> TLBs, it just requires a single request and takes care of invalidating
>> all of the individual engines on its end.
>>
> Huum then commit 1d759ab5967cfc ("drm/xe: Reinstate render / compute cache invalidation in ring ops") brought the emit_pipe_invalidate() back in
> __emit_job_gen12_render_compute() but not the rest.
>
> I reverted this patch in Xe kernel and started to emit a pipe_control at the end of batch buffer with:
>
> enum anv_pipe_bits flush = ANV_PIPE_FLUSH_BITS |
> 			   ANV_PIPE_L3_FABRIC_FLUSH |
> 			   ANV_PIPE_TLB_INVALIDATE;
>
> #define ANV_PIPE_FLUSH_BITS ( \
>     ANV_PIPE_DEPTH_CACHE_FLUSH_BIT | \
>     ANV_PIPE_DATA_CACHE_FLUSH_BIT | \
>     ANV_PIPE_HDC_PIPELINE_FLUSH_BIT | \
>     ANV_PIPE_UNTYPED_DATAPORT_CACHE_FLUSH_BIT | \
>     ANV_PIPE_RENDER_TARGET_CACHE_FLUSH_BIT | \
>     ANV_PIPE_TILE_CACHE_FLUSH_BIT)
>
> And it fixed the tests.
> TLB_INVALIDATE is definitely needed as I also tested with below set of bits and it did not worked:
>
>
> enum anv_pipe_bits flush = ANV_PIPE_FLUSH_BITS |
> 			   ANV_PIPE_L3_FABRIC_FLUSH |
> 			   ANV_PIPE_DEPTH_STALL_BIT |
>                             ANV_PIPE_CONTROL_FLUSH_ENABLE |
>                             ANV_PIPE_STORE_DATA_INDEX |
>                             ANV_PIPE_END_OF_PIPE_SYNC_BIT;
>
> enum anv_pipe_bits flush = ANV_PIPE_FLUSH_BITS |
> 			   ANV_PIPE_L3_FABRIC_FLUSH |
> 			   ANV_PIPE_DEPTH_STALL_BIT |
>                             ANV_PIPE_CONTROL_FLUSH_ENABLE |
>                             ANV_PIPE_STORE_DATA_INDEX |
>                             ANV_PIPE_END_OF_PIPE_SYNC_BIT |
>                             ANV_PIPE_COMMAND_CACHE_INVALIDATE_ENABLE |
>                             ANV_PIPE_CS_STALL_BIT |
>                             ANV_PIPE_INVALIDATE_BITS;
>
> So it might be leaving something in the engine TLB that could also be exploited by malicious software.
>
>> Are the mesa issues being seen on all platforms or just specific ones?
> I can reproduce on gfx12.0 platforms and DG2 but did not tested the fix in DG2 yet.


I just wanted to point out this PIPE_CONTROL bit has a bit of a 
misleading name.

It does more than invalidation :

     "If ENABLED, PIPE_CONTROL command will flush the in flight data 
written out by render engine to Global Observation point on flush done. 
Also Requires stall bit ([20] of DW1) set."


Not sure if that changes you mind about whether it should be in 
emit_flush or not :)


-Lionel


>
>>
>> Matt
>>
>>> I would think so, yes, except that the GUC TLB invalidation is GT-wide and
>>> I'm not sure whether PIPE_CONTROL_TLB_INVALIDATE is per hw engine or per-GT.
>>> But given this really has an impact, it might be that we need to invalidate
>>> TLB also after a bind where we previously pointed to the scratch page.
>>>
>>> If that's indeed the culprit we should look at issuing a
>>> PIPE_CONTROL_TLB_INVALIDATE on the exec following a bind or rebind, and
>>> leave the GuC TLB invalidations for unbinds, and then this patch makes sense
>>> as a start.
>>>
>>>
>>>> Do you see any PIPE_CONTROL flush at the end of batch buffers that i915 does but Xe don't?
>>> The emit_fini_breadcrumb() called from __i915_request_submit() indeed seems
>>> to emit the flushes needed, whereas the corresponding
>>> emit_pipe_imm_ggtt() in xe doesn't.
>>>
>>> /Thomas
>>>
>>>
>>>>> /Thomas
>>>>>
>>>>>> /Thomas
>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Thomas
>>>>>>>>
>>>>>>>>
>>>>>>>>> Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
>>>>>>>>> ---
>>>>>>>>>     drivers/gpu/drm/xe/regs/xe_gpu_commands.h | 1 +
>>>>>>>>>     drivers/gpu/drm/xe/xe_ring_ops.c          | 6 ++++--
>>>>>>>>>     2 files changed, 5 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
>>>>>>>>> b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
>>>>>>>>> index 0f9c5b0b8a3ba..7c7320efea739 100644
>>>>>>>>> --- a/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
>>>>>>>>> +++ b/drivers/gpu/drm/xe/regs/xe_gpu_commands.h
>>>>>>>>> @@ -73,6 +73,7 @@
>>>>>>>>>     #define
>>>>>>>>> PIPE_CONTROL_STORE_DATA_INDEX                        (1<<21)
>>>>>>>>>     #define
>>>>>>>>> PIPE_CONTROL_CS_STALL                                (1<<20)
>>>>>>>>>     #define PIPE_CONTROL_GLOBAL_SNAPSHOT_RESET           (1<<19)
>>>>>>>>> +#define PIPE_CONTROL_TLB_INVALIDATE                  (1<<18)
>>>>>>>>>     #define
>>>>>>>>> PIPE_CONTROL_PSD_SYNC                                (1<<17)
>>>>>>>>>     #define
>>>>>>>>> PIPE_CONTROL_QW_WRITE                                (1<<14)
>>>>>>>>>     #define PIPE_CONTROL_DEPTH_STALL                     (1<<13)
>>>>>>>>> diff --git a/drivers/gpu/drm/xe/xe_ring_ops.c
>>>>>>>>> b/drivers/gpu/drm/xe/xe_ring_ops.c
>>>>>>>>> index d2fa0b4c8bcc0..4f3ef39109b9b 100644
>>>>>>>>> --- a/drivers/gpu/drm/xe/xe_ring_ops.c
>>>>>>>>> +++ b/drivers/gpu/drm/xe/xe_ring_ops.c
>>>>>>>>> @@ -37,7 +37,8 @@
>>>>>>>>>                    PIPE_CONTROL_INDIRECT_STATE_DISABLE | \
>>>>>>>>>                    PIPE_CONTROL_FLUSH_ENABLE | \
>>>>>>>>>                    PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | \
>>>>>>>>> -               PIPE_CONTROL_DC_FLUSH_ENABLE)
>>>>>>>>> +               PIPE_CONTROL_DC_FLUSH_ENABLE | \
>>>>>>>>> +               PIPE_CONTROL_TLB_INVALIDATE)
>>>>>>>>>       static u32 preparser_disable(bool state)
>>>>>>>>>     {
>>>>>>>>> @@ -117,7 +118,8 @@ static int emit_pipe_invalidate(u32 mask_flags,
>>>>>>>>> u32 *dw, int i)
>>>>>>>>>                    PIPE_CONTROL_CONST_CACHE_INVALIDATE |
>>>>>>>>>                    PIPE_CONTROL_STATE_CACHE_INVALIDATE |
>>>>>>>>>                    PIPE_CONTROL_QW_WRITE |
>>>>>>>>> -               PIPE_CONTROL_STORE_DATA_INDEX;
>>>>>>>>> +               PIPE_CONTROL_STORE_DATA_INDEX |
>>>>>>>>> +               PIPE_CONTROL_TLB_INVALIDATE;
>>>>>>>>>              flags &= ~mask_flags;



  reply	other threads:[~2023-06-02  8:20 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-26 19:06 [Intel-xe] [PATCH 1/3] drm/xe: Disable interruptable wait on xe_bo_move() José Roberto de Souza
2023-05-26 19:06 ` [Intel-xe] [PATCH 2/3] drm/xe: Add missing TLB invalidation to emit_pipe_invalidate() José Roberto de Souza
2023-05-29  9:08   ` Thomas Hellström
2023-05-29 14:48     ` Souza, Jose
2023-05-29 14:56       ` Thomas Hellström
2023-05-29 15:07         ` Thomas Hellström
2023-05-30 18:40           ` Souza, Jose
2023-05-31  8:46             ` Thomas Hellström
2023-06-01 19:26               ` Matt Roper
2023-06-01 21:09                 ` Souza, Jose
2023-06-02  8:20                   ` Lionel Landwerlin [this message]
     [not found]               ` <1ac84b94994b26ee20881de276d6349f16907716.camel@intel.com>
2023-06-02  8:36                 ` [Intel-xe] Render cache flushes WAS " Thomas Hellström
2023-05-26 19:06 ` [Intel-xe] [PATCH 3/3] drm/xe: Replace PVC check by engine type check José Roberto de Souza
2023-05-29  9:10   ` Thomas Hellström
2023-05-26 19:08 ` [Intel-xe] ✓ CI.Patch_applied: success for series starting with [1/3] drm/xe: Disable interruptable wait on xe_bo_move() Patchwork
2023-05-26 19:10 ` [Intel-xe] ✓ CI.KUnit: " Patchwork
2023-05-26 19:14 ` [Intel-xe] ✓ CI.Build: " Patchwork
2023-05-26 19:42 ` [Intel-xe] ○ CI.BAT: info " Patchwork
2023-05-29  8:59 ` [Intel-xe] [PATCH 1/3] " Thomas Hellström
2023-05-29 10:31   ` Thomas Hellström
2023-05-29 14:40     ` Souza, Jose
2023-05-29 14:46       ` Thomas Hellström
2023-05-29 16:48         ` Souza, Jose

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=08a53290-309d-379b-855e-99f32b8f6298@intel.com \
    --to=lionel.g.landwerlin@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=jose.souza@intel.com \
    --cc=matthew.d.roper@intel.com \
    --cc=thomas.hellstrom@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.