RE: [PATCH] drm/amdgpu: Add graphics cache rinse packet for sdma 5.0

From: "Chen, Jiansong (Simon)" <Jiansong.Chen@amd.com>
To: Alex Deucher <alexdeucher@gmail.com>,
	"Deucher, Alexander" <Alexander.Deucher@amd.com>
Cc: amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: RE: [PATCH] drm/amdgpu: Add graphics cache rinse packet for sdma 5.0
Date: Thu, 29 Apr 2021 08:11:30 +0000	[thread overview]
Message-ID: <BY5PR12MB4885068A1AD3C120C8B9432DEA5F9@BY5PR12MB4885.namprd12.prod.outlook.com> (raw)
In-Reply-To: <CADnq5_O7OWgmX72VBOqSYKkq=A0dVmxGFmg9om54HCqen0r5EA@mail.gmail.com>

[AMD Official Use Only - Internal Distribution Only]

Hi Alex,
I notice there is already similar logic in sdma_v5_0_ring_emit_ib, do we need remove it?

Regards,
Jiansong

@@ -410,6 +410,18 @@ static void sdma_v5_0_ring_emit_ib(struct amdgpu_ring *ring,
        unsigned vmid = AMDGPU_JOB_GET_VMID(job);
        uint64_t csa_mc_addr = amdgpu_sdma_get_csa_mc_addr(ring, vmid);

+       /* Invalidate L2, because if we don't do it, we might get stale cache
+        * lines from previous IBs.
+        */
+       amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_GCR_REQ));
+       amdgpu_ring_write(ring, 0);
+       amdgpu_ring_write(ring, (SDMA_GCR_GL2_INV |
+                                SDMA_GCR_GL2_WB |
+                                SDMA_GCR_GLM_INV |
+                                SDMA_GCR_GLM_WB) << 16);
+       amdgpu_ring_write(ring, 0xffffff80);
+       amdgpu_ring_write(ring, 0xffff);
+

-----Original Message-----
From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of Alex Deucher
Sent: Thursday, April 29, 2021 11:41 AM
To: Deucher, Alexander <Alexander.Deucher@amd.com>
Cc: amd-gfx list <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdgpu: Add graphics cache rinse packet for sdma 5.0

Ping?

On Tue, Apr 20, 2021 at 3:28 PM Alex Deucher <alexander.deucher@amd.com> wrote:
>
> Add emit mem sync callback for sdma_v5_0
>
> In amdgpu sync object test, three threads created jobs to send GFX IB
> and SDMA IB in sequence. After the first GFX thread joined, sometimes
> the third thread will reuse the same physical page to store the SDMA
> IB. There will be a risk that SDMA will read GFX IB in the previous
> physical page. So it's better to flush the cache before commit sdma
> IB.
>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c | 28
> ++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> index 920fc6d4a127..d294ef6a625a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c
> @@ -437,6 +437,33 @@ static void sdma_v5_0_ring_emit_ib(struct amdgpu_ring *ring,
>         amdgpu_ring_write(ring, upper_32_bits(csa_mc_addr));  }
>
> +/**
> + * sdma_v5_0_ring_emit_mem_sync - flush the IB by graphics cache
> +rinse
> + *
> + * @ring: amdgpu ring pointer
> + * @job: job to retrieve vmid from
> + * @ib: IB object to schedule
> + *
> + * flush the IB by graphics cache rinse.
> + */
> +static void sdma_v5_0_ring_emit_mem_sync(struct amdgpu_ring *ring) {
> +    uint32_t gcr_cntl =
> +                   SDMA_GCR_GL2_INV | SDMA_GCR_GL2_WB | SDMA_GCR_GLM_INV |
> +                       SDMA_GCR_GL1_INV | SDMA_GCR_GLV_INV | SDMA_GCR_GLK_INV |
> +                       SDMA_GCR_GLI_INV(1);
> +
> +       /* flush entire cache L0/L1/L2, this can be optimized by performance requirement */
> +       amdgpu_ring_write(ring, SDMA_PKT_HEADER_OP(SDMA_OP_GCR_REQ));
> +       amdgpu_ring_write(ring, SDMA_PKT_GCR_REQ_PAYLOAD1_BASE_VA_31_7(0));
> +       amdgpu_ring_write(ring, SDMA_PKT_GCR_REQ_PAYLOAD2_GCR_CONTROL_15_0(gcr_cntl) |
> +                       SDMA_PKT_GCR_REQ_PAYLOAD2_BASE_VA_47_32(0));
> +       amdgpu_ring_write(ring, SDMA_PKT_GCR_REQ_PAYLOAD3_LIMIT_VA_31_7(0) |
> +                       SDMA_PKT_GCR_REQ_PAYLOAD3_GCR_CONTROL_18_16(gcr_cntl >> 16));
> +       amdgpu_ring_write(ring, SDMA_PKT_GCR_REQ_PAYLOAD4_LIMIT_VA_47_32(0) |
> +                       SDMA_PKT_GCR_REQ_PAYLOAD4_VMID(0));
> +}
> +
>  /**
>   * sdma_v5_0_ring_emit_hdp_flush - emit an hdp flush on the DMA ring
>   *
> @@ -1643,6 +1670,7 @@ static const struct amdgpu_ring_funcs sdma_v5_0_ring_funcs = {
>                 10 + 10 + 10, /* sdma_v5_0_ring_emit_fence x3 for user fence, vm fence */
>         .emit_ib_size = 5 + 7 + 6, /* sdma_v5_0_ring_emit_ib */
>         .emit_ib = sdma_v5_0_ring_emit_ib,
> +       .emit_mem_sync = sdma_v5_0_ring_emit_mem_sync,
>         .emit_fence = sdma_v5_0_ring_emit_fence,
>         .emit_pipeline_sync = sdma_v5_0_ring_emit_pipeline_sync,
>         .emit_vm_flush = sdma_v5_0_ring_emit_vm_flush,
> --
> 2.30.2
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flist
> s.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=04%7C01%7CJi
> ansong.Chen%40amd.com%7C1b777b80b17145712a7b08d90ac0a1f4%7C3dd8961fe48
> 84e608e11a82d994e183d%7C0%7C0%7C637552644738458840%7CUnknown%7CTWFpbGZ
> sb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3
> D%7C1000&amp;sdata=T9jZUIJIQHS2gY8bU%2F7uM1ealAP3qxelkg2Slj3JASA%3D&am
> p;reserved=0
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&amp;data=04%7C01%7CJiansong.Chen%40amd.com%7C1b777b80b17145712a7b08d90ac0a1f4%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637552644738458840%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=T9jZUIJIQHS2gY8bU%2F7uM1ealAP3qxelkg2Slj3JASA%3D&amp;reserved=0
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx