All of lore.kernel.org
 help / color / mirror / Atom feed
* why we unreserve MQD of KIQ ?
@ 2019-11-25  9:05 ` Liu, Monk
  0 siblings, 0 replies; 6+ messages in thread
From: Liu, Monk @ 2019-11-25  9:05 UTC (permalink / raw)
  To: Koenig, Christian; +Cc: amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1.1: Type: text/plain, Size: 3285 bytes --]

commit a9a8a788e5e946a9835a1365256fc4ce9e96ba2c
Author: Rex Zhu <Rex.Zhu-5C7GfCeVMHo@public.gmane.org>
Date:   Wed Aug 22 18:54:45 2018 +0800

    drm/amdgpu: Change kiq ring initialize sequence on gfx9

    1. initialize kiq before initialize gfx ring.
    2. set kiq ring ready immediately when kiq initialize
       successfully.
    3. split function gfx_v9_0_kiq_resume into two functions.
         gfx_v9_0_kiq_resume is for kiq initialize.
         gfx_v9_0_kcq_resume is for kcq initialize.

    Reviewed-by: Alex Deucher <alexander.deucher-5C7GfCeVMHo@public.gmane.org>
    Signed-off-by: Rex Zhu <Rex.Zhu-5C7GfCeVMHo@public.gmane.org>
    Signed-off-by: Alex Deucher <alexander.deucher-5C7GfCeVMHo@public.gmane.org>

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 21e66f8..3594704a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -2684,7 +2684,6 @@ static int gfx_v9_0_kiq_kcq_enable(struct amdgpu_device *adev)
                queue_mask |= (1ull << i);
        }

-       kiq_ring->ready = true;
        r = amdgpu_ring_alloc(kiq_ring, (7 * adev->gfx.num_compute_rings) + 8);
        if (r) {
                DRM_ERROR("Failed to lock KIQ (%d).\n", r);
@@ -3091,26 +3090,33 @@ static int gfx_v9_0_kcq_init_queue(struct amdgpu_ring *ring)

static int gfx_v9_0_kiq_resume(struct amdgpu_device *adev)
{
-       struct amdgpu_ring *ring = NULL;
-       int r = 0, i;
-
-       gfx_v9_0_cp_compute_enable(adev, true);
+       struct amdgpu_ring *ring;
+       int r;

        ring = &adev->gfx.kiq.ring;

        r = amdgpu_bo_reserve(ring->mqd_obj, false);
        if (unlikely(r != 0))
-               goto done;
+               return r;

        r = amdgpu_bo_kmap(ring->mqd_obj, (void **)&ring->mqd_ptr);
-       if (!r) {
-               r = gfx_v9_0_kiq_init_queue(ring);
-               amdgpu_bo_kunmap(ring->mqd_obj);
-               ring->mqd_ptr = NULL;
-       }
+       if (unlikely(r != 0))
+               return r;
+
+       gfx_v9_0_kiq_init_queue(ring);
+       amdgpu_bo_kunmap(ring->mqd_obj);  //not invoked before change
+       ring->mqd_ptr = NULL;                             //not invoked before change
        amdgpu_bo_unreserve(ring->mqd_obj); //not invoked before change
-       if (r)
-               goto done;
+       ring->ready = true;
+       return 0;
+}
+
+static int gfx_v9_0_kcq_resume(struct amdgpu_device *adev)
+{

Hi guys

I found in last year, Rex submitted a change which introduced additional "amdgpu_bo_kunmap() and amdgpu_bo_unreserve()"
During the kiq_resume() routine,

I'm wondering why we need this change and I'm also suspecting it has potential regression:
See that the KIQ's mqd  object is unreserved, so it now available in LRU which means TTM can evict it by will,

But the MQD of KIQ is supposed to be pinned otherwise incorrect memory access would introduced by a world switch
Since MEC always consider KIQ' MQD is pinned

@Koenig, Christian<mailto:Christian.Koenig-5C7GfCeVMHo@public.gmane.org> you know about that change's background ?

_____________________________________
Monk Liu|GPU Virtualization Team |AMD
[sig-cloud-gpu]


[-- Attachment #1.1.2: Type: text/html, Size: 10823 bytes --]

[-- Attachment #1.2: image001.png --]
[-- Type: image/png, Size: 12243 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* why we unreserve MQD of KIQ ?
@ 2019-11-25  9:05 ` Liu, Monk
  0 siblings, 0 replies; 6+ messages in thread
From: Liu, Monk @ 2019-11-25  9:05 UTC (permalink / raw)
  To: Koenig, Christian; +Cc: amd-gfx


[-- Attachment #1.1.1: Type: text/plain, Size: 3180 bytes --]

commit a9a8a788e5e946a9835a1365256fc4ce9e96ba2c
Author: Rex Zhu <Rex.Zhu@amd.com>
Date:   Wed Aug 22 18:54:45 2018 +0800

    drm/amdgpu: Change kiq ring initialize sequence on gfx9

    1. initialize kiq before initialize gfx ring.
    2. set kiq ring ready immediately when kiq initialize
       successfully.
    3. split function gfx_v9_0_kiq_resume into two functions.
         gfx_v9_0_kiq_resume is for kiq initialize.
         gfx_v9_0_kcq_resume is for kcq initialize.

    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 21e66f8..3594704a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -2684,7 +2684,6 @@ static int gfx_v9_0_kiq_kcq_enable(struct amdgpu_device *adev)
                queue_mask |= (1ull << i);
        }

-       kiq_ring->ready = true;
        r = amdgpu_ring_alloc(kiq_ring, (7 * adev->gfx.num_compute_rings) + 8);
        if (r) {
                DRM_ERROR("Failed to lock KIQ (%d).\n", r);
@@ -3091,26 +3090,33 @@ static int gfx_v9_0_kcq_init_queue(struct amdgpu_ring *ring)

static int gfx_v9_0_kiq_resume(struct amdgpu_device *adev)
{
-       struct amdgpu_ring *ring = NULL;
-       int r = 0, i;
-
-       gfx_v9_0_cp_compute_enable(adev, true);
+       struct amdgpu_ring *ring;
+       int r;

        ring = &adev->gfx.kiq.ring;

        r = amdgpu_bo_reserve(ring->mqd_obj, false);
        if (unlikely(r != 0))
-               goto done;
+               return r;

        r = amdgpu_bo_kmap(ring->mqd_obj, (void **)&ring->mqd_ptr);
-       if (!r) {
-               r = gfx_v9_0_kiq_init_queue(ring);
-               amdgpu_bo_kunmap(ring->mqd_obj);
-               ring->mqd_ptr = NULL;
-       }
+       if (unlikely(r != 0))
+               return r;
+
+       gfx_v9_0_kiq_init_queue(ring);
+       amdgpu_bo_kunmap(ring->mqd_obj);  //not invoked before change
+       ring->mqd_ptr = NULL;                             //not invoked before change
        amdgpu_bo_unreserve(ring->mqd_obj); //not invoked before change
-       if (r)
-               goto done;
+       ring->ready = true;
+       return 0;
+}
+
+static int gfx_v9_0_kcq_resume(struct amdgpu_device *adev)
+{

Hi guys

I found in last year, Rex submitted a change which introduced additional "amdgpu_bo_kunmap() and amdgpu_bo_unreserve()"
During the kiq_resume() routine,

I'm wondering why we need this change and I'm also suspecting it has potential regression:
See that the KIQ's mqd  object is unreserved, so it now available in LRU which means TTM can evict it by will,

But the MQD of KIQ is supposed to be pinned otherwise incorrect memory access would introduced by a world switch
Since MEC always consider KIQ' MQD is pinned

@Koenig, Christian<mailto:Christian.Koenig@amd.com> you know about that change's background ?

_____________________________________
Monk Liu|GPU Virtualization Team |AMD
[sig-cloud-gpu]


[-- Attachment #1.1.2: Type: text/html, Size: 10707 bytes --]

[-- Attachment #1.2: image001.png --]
[-- Type: image/png, Size: 12243 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: why we unreserve MQD of KIQ ?
@ 2019-11-25 20:34     ` Alex Deucher
  0 siblings, 0 replies; 6+ messages in thread
From: Alex Deucher @ 2019-11-25 20:34 UTC (permalink / raw)
  To: Liu, Monk; +Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1.1: Type: text/plain, Size: 4231 bytes --]

On Mon, Nov 25, 2019 at 4:05 AM Liu, Monk <Monk.Liu-5C7GfCeVMHo@public.gmane.org> wrote:

> commit a9a8a788e5e946a9835a1365256fc4ce9e96ba2c
>
> Author: Rex Zhu <Rex.Zhu-5C7GfCeVMHo@public.gmane.org>
>
> Date:   Wed Aug 22 18:54:45 2018 +0800
>
>
>
>     drm/amdgpu: Change kiq ring initialize sequence on gfx9
>
>
>
>     1. initialize kiq before initialize gfx ring.
>
>     2. set kiq ring ready immediately when kiq initialize
>
>        successfully.
>
>     3. split function gfx_v9_0_kiq_resume into two functions.
>
>          gfx_v9_0_kiq_resume is for kiq initialize.
>
>          gfx_v9_0_kcq_resume is for kcq initialize.
>
>
>
>     Reviewed-by: Alex Deucher <alexander.deucher-5C7GfCeVMHo@public.gmane.org>
>
>     Signed-off-by: Rex Zhu <Rex.Zhu-5C7GfCeVMHo@public.gmane.org>
>
>     Signed-off-by: Alex Deucher <alexander.deucher-5C7GfCeVMHo@public.gmane.org>
>
>
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>
> index 21e66f8..3594704a 100644
>
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>
> @@ -2684,7 +2684,6 @@ static int gfx_v9_0_kiq_kcq_enable(struct
> amdgpu_device *adev)
>
>                 queue_mask |= (1ull << i);
>
>         }
>
>
>
> -       kiq_ring->ready = true;
>
>         r = amdgpu_ring_alloc(kiq_ring, (7 * adev->gfx.num_compute_rings)
> + 8);
>
>         if (r) {
>
>                 DRM_ERROR("Failed to lock KIQ (%d).\n", r);
>
> @@ -3091,26 +3090,33 @@ static int gfx_v9_0_kcq_init_queue(struct
> amdgpu_ring *ring)
>
>
>
> static int gfx_v9_0_kiq_resume(struct amdgpu_device *adev)
>
> {
>
> -       struct amdgpu_ring *ring = NULL;
>
> -       int r = 0, i;
>
> -
>
> -       gfx_v9_0_cp_compute_enable(adev, true);
>
> +       struct amdgpu_ring *ring;
>
> +       int r;
>
>
>
>         ring = &adev->gfx.kiq.ring;
>
>
>
>         r = amdgpu_bo_reserve(ring->mqd_obj, false);
>
>         if (unlikely(r != 0))
>
> -               goto done;
>
> +               return r;
>
>
>
>         r = amdgpu_bo_kmap(ring->mqd_obj, (void **)&ring->mqd_ptr);
>
> -       if (!r) {
>
> -               r = gfx_v9_0_kiq_init_queue(ring);
>
> -               amdgpu_bo_kunmap(ring->mqd_obj);
>
> -               ring->mqd_ptr = NULL;
>
> -       }
>
> +       if (unlikely(r != 0))
>
> +               return r;
>
> +
>
> +       gfx_v9_0_kiq_init_queue(ring);
>
> +       amdgpu_bo_kunmap(ring->mqd_obj);  //not invoked before change
>
> +       ring->mqd_ptr = NULL;                             //not invoked
> before change
>
>         amdgpu_bo_unreserve(ring->mqd_obj); //not invoked before change
>
> -       if (r)
>
> -               goto done;
>
> +       ring->ready = true;
>
> +       return 0;
>
> +}
>
> +
>
> +static int gfx_v9_0_kcq_resume(struct amdgpu_device *adev)
>
> +{
>
>
>
> Hi guys
>
>
>
> I found in last year, Rex submitted a change which introduced additional
> “amdgpu_bo_kunmap() and amdgpu_bo_unreserve()”
>
> During the kiq_resume() routine,
>
>
>
> I’m wondering why we need this change and I’m also suspecting it has
> potential regression:
>
> See that the KIQ’s mqd  object is unreserved, so it now available in LRU
> which means TTM can evict it by will,
>
>
>
> But the MQD of KIQ is supposed to be pinned otherwise incorrect memory
> access would introduced by a world switch
>
> Since MEC always consider KIQ’ MQD is pinned
>
>
>
> @Koenig, Christian <Christian.Koenig-5C7GfCeVMHo@public.gmane.org> you know about that
> change’s background ?
>
>
>

I think the unreserve here is just to match the reserve at the top of this
function for the kmapping.  It should still be pinned because we call
amdgpu_bo_create_kernel() to allocate it.

Alex


> _____________________________________
>
> Monk Liu|GPU Virtualization Team |AMD
>
> [image: sig-cloud-gpu]
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[-- Attachment #1.1.2: Type: text/html, Size: 9409 bytes --]

[-- Attachment #1.2: image001.png --]
[-- Type: image/png, Size: 12243 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: why we unreserve MQD of KIQ ?
@ 2019-11-25 20:34     ` Alex Deucher
  0 siblings, 0 replies; 6+ messages in thread
From: Alex Deucher @ 2019-11-25 20:34 UTC (permalink / raw)
  To: Liu, Monk; +Cc: Koenig, Christian, amd-gfx


[-- Attachment #1.1.1: Type: text/plain, Size: 4077 bytes --]

On Mon, Nov 25, 2019 at 4:05 AM Liu, Monk <Monk.Liu@amd.com> wrote:

> commit a9a8a788e5e946a9835a1365256fc4ce9e96ba2c
>
> Author: Rex Zhu <Rex.Zhu@amd.com>
>
> Date:   Wed Aug 22 18:54:45 2018 +0800
>
>
>
>     drm/amdgpu: Change kiq ring initialize sequence on gfx9
>
>
>
>     1. initialize kiq before initialize gfx ring.
>
>     2. set kiq ring ready immediately when kiq initialize
>
>        successfully.
>
>     3. split function gfx_v9_0_kiq_resume into two functions.
>
>          gfx_v9_0_kiq_resume is for kiq initialize.
>
>          gfx_v9_0_kcq_resume is for kcq initialize.
>
>
>
>     Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
>
>     Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
>
>     Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>
>
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
> b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>
> index 21e66f8..3594704a 100644
>
> --- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>
> +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
>
> @@ -2684,7 +2684,6 @@ static int gfx_v9_0_kiq_kcq_enable(struct
> amdgpu_device *adev)
>
>                 queue_mask |= (1ull << i);
>
>         }
>
>
>
> -       kiq_ring->ready = true;
>
>         r = amdgpu_ring_alloc(kiq_ring, (7 * adev->gfx.num_compute_rings)
> + 8);
>
>         if (r) {
>
>                 DRM_ERROR("Failed to lock KIQ (%d).\n", r);
>
> @@ -3091,26 +3090,33 @@ static int gfx_v9_0_kcq_init_queue(struct
> amdgpu_ring *ring)
>
>
>
> static int gfx_v9_0_kiq_resume(struct amdgpu_device *adev)
>
> {
>
> -       struct amdgpu_ring *ring = NULL;
>
> -       int r = 0, i;
>
> -
>
> -       gfx_v9_0_cp_compute_enable(adev, true);
>
> +       struct amdgpu_ring *ring;
>
> +       int r;
>
>
>
>         ring = &adev->gfx.kiq.ring;
>
>
>
>         r = amdgpu_bo_reserve(ring->mqd_obj, false);
>
>         if (unlikely(r != 0))
>
> -               goto done;
>
> +               return r;
>
>
>
>         r = amdgpu_bo_kmap(ring->mqd_obj, (void **)&ring->mqd_ptr);
>
> -       if (!r) {
>
> -               r = gfx_v9_0_kiq_init_queue(ring);
>
> -               amdgpu_bo_kunmap(ring->mqd_obj);
>
> -               ring->mqd_ptr = NULL;
>
> -       }
>
> +       if (unlikely(r != 0))
>
> +               return r;
>
> +
>
> +       gfx_v9_0_kiq_init_queue(ring);
>
> +       amdgpu_bo_kunmap(ring->mqd_obj);  //not invoked before change
>
> +       ring->mqd_ptr = NULL;                             //not invoked
> before change
>
>         amdgpu_bo_unreserve(ring->mqd_obj); //not invoked before change
>
> -       if (r)
>
> -               goto done;
>
> +       ring->ready = true;
>
> +       return 0;
>
> +}
>
> +
>
> +static int gfx_v9_0_kcq_resume(struct amdgpu_device *adev)
>
> +{
>
>
>
> Hi guys
>
>
>
> I found in last year, Rex submitted a change which introduced additional
> “amdgpu_bo_kunmap() and amdgpu_bo_unreserve()”
>
> During the kiq_resume() routine,
>
>
>
> I’m wondering why we need this change and I’m also suspecting it has
> potential regression:
>
> See that the KIQ’s mqd  object is unreserved, so it now available in LRU
> which means TTM can evict it by will,
>
>
>
> But the MQD of KIQ is supposed to be pinned otherwise incorrect memory
> access would introduced by a world switch
>
> Since MEC always consider KIQ’ MQD is pinned
>
>
>
> @Koenig, Christian <Christian.Koenig@amd.com> you know about that
> change’s background ?
>
>
>

I think the unreserve here is just to match the reserve at the top of this
function for the kmapping.  It should still be pinned because we call
amdgpu_bo_create_kernel() to allocate it.

Alex


> _____________________________________
>
> Monk Liu|GPU Virtualization Team |AMD
>
> [image: sig-cloud-gpu]
>
>
> _______________________________________________
> amd-gfx mailing list
> amd-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/amd-gfx

[-- Attachment #1.1.2: Type: text/html, Size: 9149 bytes --]

[-- Attachment #1.2: image001.png --]
[-- Type: image/png, Size: 12243 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: why we unreserve MQD of KIQ ?
@ 2019-11-26  2:25         ` Liu, Monk
  0 siblings, 0 replies; 6+ messages in thread
From: Liu, Monk @ 2019-11-26  2:25 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Koenig, Christian, amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW


[-- Attachment #1.1.1: Type: text/plain, Size: 4484 bytes --]

Thanks Alex

_____________________________________
Monk Liu|GPU Virtualization Team |AMD
[sig-cloud-gpu]

From: Alex Deucher <alexdeucher@gmail.com>
Sent: Tuesday, November 26, 2019 4:34 AM
To: Liu, Monk <Monk.Liu@amd.com>
Cc: Koenig, Christian <Christian.Koenig@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: why we unreserve MQD of KIQ ?



On Mon, Nov 25, 2019 at 4:05 AM Liu, Monk <Monk.Liu@amd.com<mailto:Monk.Liu@amd.com>> wrote:
commit a9a8a788e5e946a9835a1365256fc4ce9e96ba2c
Author: Rex Zhu <Rex.Zhu@amd.com<mailto:Rex.Zhu@amd.com>>
Date:   Wed Aug 22 18:54:45 2018 +0800

    drm/amdgpu: Change kiq ring initialize sequence on gfx9

    1. initialize kiq before initialize gfx ring.
    2. set kiq ring ready immediately when kiq initialize
       successfully.
    3. split function gfx_v9_0_kiq_resume into two functions.
         gfx_v9_0_kiq_resume is for kiq initialize.
         gfx_v9_0_kcq_resume is for kcq initialize.

    Reviewed-by: Alex Deucher <alexander.deucher@amd.com<mailto:alexander.deucher@amd.com>>
    Signed-off-by: Rex Zhu <Rex.Zhu@amd.com<mailto:Rex.Zhu@amd.com>>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com<mailto:alexander.deucher@amd.com>>

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 21e66f8..3594704a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -2684,7 +2684,6 @@ static int gfx_v9_0_kiq_kcq_enable(struct amdgpu_device *adev)
                queue_mask |= (1ull << i);
        }

-       kiq_ring->ready = true;
        r = amdgpu_ring_alloc(kiq_ring, (7 * adev->gfx.num_compute_rings) + 8);
        if (r) {
                DRM_ERROR("Failed to lock KIQ (%d).\n", r);
@@ -3091,26 +3090,33 @@ static int gfx_v9_0_kcq_init_queue(struct amdgpu_ring *ring)

static int gfx_v9_0_kiq_resume(struct amdgpu_device *adev)
{
-       struct amdgpu_ring *ring = NULL;
-       int r = 0, i;
-
-       gfx_v9_0_cp_compute_enable(adev, true);
+       struct amdgpu_ring *ring;
+       int r;

        ring = &adev->gfx.kiq.ring;

        r = amdgpu_bo_reserve(ring->mqd_obj, false);
        if (unlikely(r != 0))
-               goto done;
+               return r;

        r = amdgpu_bo_kmap(ring->mqd_obj, (void **)&ring->mqd_ptr);
-       if (!r) {
-               r = gfx_v9_0_kiq_init_queue(ring);
-               amdgpu_bo_kunmap(ring->mqd_obj);
-               ring->mqd_ptr = NULL;
-       }
+       if (unlikely(r != 0))
+               return r;
+
+       gfx_v9_0_kiq_init_queue(ring);
+       amdgpu_bo_kunmap(ring->mqd_obj);  //not invoked before change
+       ring->mqd_ptr = NULL;                             //not invoked before change
        amdgpu_bo_unreserve(ring->mqd_obj); //not invoked before change
-       if (r)
-               goto done;
+       ring->ready = true;
+       return 0;
+}
+
+static int gfx_v9_0_kcq_resume(struct amdgpu_device *adev)
+{

Hi guys

I found in last year, Rex submitted a change which introduced additional “amdgpu_bo_kunmap() and amdgpu_bo_unreserve()”
During the kiq_resume() routine,

I’m wondering why we need this change and I’m also suspecting it has potential regression:
See that the KIQ’s mqd  object is unreserved, so it now available in LRU which means TTM can evict it by will,

But the MQD of KIQ is supposed to be pinned otherwise incorrect memory access would introduced by a world switch
Since MEC always consider KIQ’ MQD is pinned

@Koenig, Christian<mailto:Christian.Koenig@amd.com> you know about that change’s background ?


I think the unreserve here is just to match the reserve at the top of this function for the kmapping.  It should still be pinned because we call amdgpu_bo_create_kernel() to allocate it.

Alex

_____________________________________
Monk Liu|GPU Virtualization Team |AMD
[sig-cloud-gpu]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
https://lists.freedesktop.org/mailman/listinfo/amd-gfx<https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7CMonk.Liu%40amd.com%7C89db24b863e343d560d708d771e6e595%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637103108837716631&sdata=AvkmJmyFovEOff7%2FBowMktww%2BltLEfYlgBUwmnPHM2I%3D&reserved=0>

[-- Attachment #1.1.2: Type: text/html, Size: 18993 bytes --]

[-- Attachment #1.2: image001.png --]
[-- Type: image/png, Size: 12243 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* RE: why we unreserve MQD of KIQ ?
@ 2019-11-26  2:25         ` Liu, Monk
  0 siblings, 0 replies; 6+ messages in thread
From: Liu, Monk @ 2019-11-26  2:25 UTC (permalink / raw)
  To: Alex Deucher; +Cc: Koenig, Christian, amd-gfx


[-- Attachment #1.1.1: Type: text/plain, Size: 4484 bytes --]

Thanks Alex

_____________________________________
Monk Liu|GPU Virtualization Team |AMD
[sig-cloud-gpu]

From: Alex Deucher <alexdeucher@gmail.com>
Sent: Tuesday, November 26, 2019 4:34 AM
To: Liu, Monk <Monk.Liu@amd.com>
Cc: Koenig, Christian <Christian.Koenig@amd.com>; amd-gfx@lists.freedesktop.org
Subject: Re: why we unreserve MQD of KIQ ?



On Mon, Nov 25, 2019 at 4:05 AM Liu, Monk <Monk.Liu@amd.com<mailto:Monk.Liu@amd.com>> wrote:
commit a9a8a788e5e946a9835a1365256fc4ce9e96ba2c
Author: Rex Zhu <Rex.Zhu@amd.com<mailto:Rex.Zhu@amd.com>>
Date:   Wed Aug 22 18:54:45 2018 +0800

    drm/amdgpu: Change kiq ring initialize sequence on gfx9

    1. initialize kiq before initialize gfx ring.
    2. set kiq ring ready immediately when kiq initialize
       successfully.
    3. split function gfx_v9_0_kiq_resume into two functions.
         gfx_v9_0_kiq_resume is for kiq initialize.
         gfx_v9_0_kcq_resume is for kcq initialize.

    Reviewed-by: Alex Deucher <alexander.deucher@amd.com<mailto:alexander.deucher@amd.com>>
    Signed-off-by: Rex Zhu <Rex.Zhu@amd.com<mailto:Rex.Zhu@amd.com>>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com<mailto:alexander.deucher@amd.com>>

diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index 21e66f8..3594704a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -2684,7 +2684,6 @@ static int gfx_v9_0_kiq_kcq_enable(struct amdgpu_device *adev)
                queue_mask |= (1ull << i);
        }

-       kiq_ring->ready = true;
        r = amdgpu_ring_alloc(kiq_ring, (7 * adev->gfx.num_compute_rings) + 8);
        if (r) {
                DRM_ERROR("Failed to lock KIQ (%d).\n", r);
@@ -3091,26 +3090,33 @@ static int gfx_v9_0_kcq_init_queue(struct amdgpu_ring *ring)

static int gfx_v9_0_kiq_resume(struct amdgpu_device *adev)
{
-       struct amdgpu_ring *ring = NULL;
-       int r = 0, i;
-
-       gfx_v9_0_cp_compute_enable(adev, true);
+       struct amdgpu_ring *ring;
+       int r;

        ring = &adev->gfx.kiq.ring;

        r = amdgpu_bo_reserve(ring->mqd_obj, false);
        if (unlikely(r != 0))
-               goto done;
+               return r;

        r = amdgpu_bo_kmap(ring->mqd_obj, (void **)&ring->mqd_ptr);
-       if (!r) {
-               r = gfx_v9_0_kiq_init_queue(ring);
-               amdgpu_bo_kunmap(ring->mqd_obj);
-               ring->mqd_ptr = NULL;
-       }
+       if (unlikely(r != 0))
+               return r;
+
+       gfx_v9_0_kiq_init_queue(ring);
+       amdgpu_bo_kunmap(ring->mqd_obj);  //not invoked before change
+       ring->mqd_ptr = NULL;                             //not invoked before change
        amdgpu_bo_unreserve(ring->mqd_obj); //not invoked before change
-       if (r)
-               goto done;
+       ring->ready = true;
+       return 0;
+}
+
+static int gfx_v9_0_kcq_resume(struct amdgpu_device *adev)
+{

Hi guys

I found in last year, Rex submitted a change which introduced additional “amdgpu_bo_kunmap() and amdgpu_bo_unreserve()”
During the kiq_resume() routine,

I’m wondering why we need this change and I’m also suspecting it has potential regression:
See that the KIQ’s mqd  object is unreserved, so it now available in LRU which means TTM can evict it by will,

But the MQD of KIQ is supposed to be pinned otherwise incorrect memory access would introduced by a world switch
Since MEC always consider KIQ’ MQD is pinned

@Koenig, Christian<mailto:Christian.Koenig@amd.com> you know about that change’s background ?


I think the unreserve here is just to match the reserve at the top of this function for the kmapping.  It should still be pinned because we call amdgpu_bo_create_kernel() to allocate it.

Alex

_____________________________________
Monk Liu|GPU Virtualization Team |AMD
[sig-cloud-gpu]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>
https://lists.freedesktop.org/mailman/listinfo/amd-gfx<https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.freedesktop.org%2Fmailman%2Flistinfo%2Famd-gfx&data=02%7C01%7CMonk.Liu%40amd.com%7C89db24b863e343d560d708d771e6e595%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637103108837716631&sdata=AvkmJmyFovEOff7%2FBowMktww%2BltLEfYlgBUwmnPHM2I%3D&reserved=0>

[-- Attachment #1.1.2: Type: text/html, Size: 18993 bytes --]

[-- Attachment #1.2: image001.png --]
[-- Type: image/png, Size: 12243 bytes --]

[-- Attachment #2: Type: text/plain, Size: 153 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-11-26  2:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-25  9:05 why we unreserve MQD of KIQ ? Liu, Monk
2019-11-25  9:05 ` Liu, Monk
     [not found] ` <MN2PR12MB3933605478471296DB5C3618844A0-rweVpJHSKTq/67K4VYF1uAdYzm3356FpvxpqHgZTriW3zl9H0oFU5g@public.gmane.org>
2019-11-25 20:34   ` Alex Deucher
2019-11-25 20:34     ` Alex Deucher
     [not found]     ` <CADnq5_OATXAe9x+bMeeSWPSi4R68Jf4nOsZb2eYuteG2Mk347w-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2019-11-26  2:25       ` Liu, Monk
2019-11-26  2:25         ` Liu, Monk

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.