* [PATCH v2 1/2] drm/amd/amdgpu: fix psp tmr bo pin count leak in SRIOV
@ 2021-12-14  4:17 Jingwen Chen
  2021-12-14  4:17 ` [PATCH v2 2/2] drm/amd/amdgpu: fix gmc " Jingwen Chen
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Jingwen Chen @ 2021-12-14  4:17 UTC (permalink / raw)
  To: amd-gfx; +Cc: horace.chen, Jingwen Chen, monk.liu

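[Why]
In SRIOV, the PSP TMR BO is pinned both when amdgpu is loaded and on
every reset, but it is unpinned only when amdgpu is unloaded, so the
pin count leaks across resets.

[How]
Skip pinning the BO when running as an SRIOV VF (amdgpu_sriov_vf) while
a reset is in progress (amdgpu_in_reset).

v2: fix the incorrect condition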

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 103bcadbc8b8..4de46fcb486c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -2017,12 +2017,16 @@ static int psp_hw_start(struct psp_context *psp)
 		return ret;
 	}
 
+	if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
+		goto skip_pin_bo;
+
 	ret = psp_tmr_init(psp);
 	if (ret) {
 		DRM_ERROR("PSP tmr init failed!\n");
 		return ret;
 	}
 
+skip_pin_bo:
 	/*
 	 * For ASICs with DF Cstate management centralized
 	 * to PMFW, TMR setup should be performed after PMFW
-- 
2.30.2
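
The leak described in [Why] can be illustrated with a small user-space model. This is only a sketch under stated assumptions: toy_bo, toy_hw_start and toy_vf_lifecycle are hypothetical names invented for illustration, not amdgpu functions, and the model returns early where the real patch jumps to skip_pin_bo and continues with the rest of hw start.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer object; only the pin count is modelled. */
struct toy_bo {
	unsigned int pin_count;
};

static void toy_bo_pin(struct toy_bo *bo)
{
	bo->pin_count++;
}

static void toy_bo_unpin(struct toy_bo *bo)
{
	assert(bo->pin_count > 0);
	bo->pin_count--;
}

/*
 * Models a hw-start path that runs at driver load and again on every
 * reset. When 'patched' is true, the pin is skipped if both is_vf and
 * in_reset are true, mirroring the condition added by the patch.
 */
static void toy_hw_start(struct toy_bo *bo, bool is_vf, bool in_reset,
			 bool patched)
{
	if (patched && is_vf && in_reset)
		return;
	toy_bo_pin(bo);
}

/*
 * One VF lifecycle: load once, reset 'resets' times, unload once.
 * Returns the pin count left behind after unload (0 means balanced).
 */
static unsigned int toy_vf_lifecycle(unsigned int resets, bool patched)
{
	struct toy_bo bo = { .pin_count = 0 };
	unsigned int i;

	toy_hw_start(&bo, true, false, patched);	/* driver load */
	for (i = 0; i < resets; i++)
		toy_hw_start(&bo, true, true, patched);	/* VF reset */
	toy_bo_unpin(&bo);				/* driver unload */

	return bo.pin_count;
}
```

In this model, toy_vf_lifecycle(3, false) leaves three pins behind after unload, while toy_vf_lifecycle(3, true) ends balanced at zero.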



* [PATCH v2 2/2] drm/amd/amdgpu: fix gmc bo pin count leak in SRIOV
  2021-12-14  4:17 [PATCH v2 1/2] drm/amd/amdgpu: fix psp tmr bo pin count leak in SRIOV Jingwen Chen
@ 2021-12-14  4:17 ` Jingwen Chen
  2021-12-14  7:39   ` Chen, Horace
  2021-12-14 13:02   ` Christian König
  2021-12-14  7:39 ` [PATCH v2 1/2] drm/amd/amdgpu: fix psp tmr " Chen, Horace
  2021-12-14 16:09 ` Liu, Shaoyun
  2 siblings, 2 replies; 6+ messages in thread
From: Jingwen Chen @ 2021-12-14  4:17 UTC (permalink / raw)
  To: amd-gfx; +Cc: horace.chen, Jingwen Chen, monk.liu

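[Why]
In SRIOV, the GMC GART table BO is pinned both when amdgpu is loaded and
on every reset, but it is unpinned only when amdgpu is unloaded, so the
pin count leaks across resets.

[How]
Skip pinning the BO when running as an SRIOV VF (amdgpu_sriov_vf) while
a reset is in progress (amdgpu_in_reset).

v2: fix the incorrect condition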

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 4 ++++
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
index d696c4754bea..ae46eb35b3d7 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
@@ -992,10 +992,14 @@ static int gmc_v10_0_gart_enable(struct amdgpu_device *adev)
 		return -EINVAL;
 	}
 
+	if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
+		goto skip_pin_bo;
+
 	r = amdgpu_gart_table_vram_pin(adev);
 	if (r)
 		return r;
 
+skip_pin_bo:
 	r = adev->gfxhub.funcs->gart_enable(adev);
 	if (r)
 		return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index db2ec84f7237..d91eb7eb0ebe 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1717,10 +1717,14 @@ static int gmc_v9_0_gart_enable(struct amdgpu_device *adev)
 		return -EINVAL;
 	}
 
+	if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
+		goto skip_pin_bo;
+
 	r = amdgpu_gart_table_vram_pin(adev);
 	if (r)
 		return r;
 
+skip_pin_bo:
 	r = adev->gfxhub.funcs->gart_enable(adev);
 	if (r)
 		return r;
-- 
2.30.2



* RE: [PATCH v2 1/2] drm/amd/amdgpu: fix psp tmr bo pin count leak in SRIOV
  2021-12-14  4:17 [PATCH v2 1/2] drm/amd/amdgpu: fix psp tmr bo pin count leak in SRIOV Jingwen Chen
  2021-12-14  4:17 ` [PATCH v2 2/2] drm/amd/amdgpu: fix gmc " Jingwen Chen
@ 2021-12-14  7:39 ` Chen, Horace
  2021-12-14 16:09 ` Liu, Shaoyun
  2 siblings, 0 replies; 6+ messages in thread
From: Chen, Horace @ 2021-12-14  7:39 UTC (permalink / raw)
  To: Chen, JingWen, amd-gfx; +Cc: Liu, Monk

[AMD Official Use Only]

Reviewed-by: Horace Chen <horace.chen@amd.com>


* RE: [PATCH v2 2/2] drm/amd/amdgpu: fix gmc bo pin count leak in SRIOV
  2021-12-14  4:17 ` [PATCH v2 2/2] drm/amd/amdgpu: fix gmc " Jingwen Chen
@ 2021-12-14  7:39   ` Chen, Horace
  2021-12-14 13:02   ` Christian König
  1 sibling, 0 replies; 6+ messages in thread
From: Chen, Horace @ 2021-12-14  7:39 UTC (permalink / raw)
  To: Chen, JingWen, amd-gfx; +Cc: Liu, Monk

[AMD Official Use Only]

Reviewed-by: Horace Chen <horace.chen@amd.com>


* Re: [PATCH v2 2/2] drm/amd/amdgpu: fix gmc bo pin count leak in SRIOV
  2021-12-14  4:17 ` [PATCH v2 2/2] drm/amd/amdgpu: fix gmc " Jingwen Chen
  2021-12-14  7:39   ` Chen, Horace
@ 2021-12-14 13:02   ` Christian König
  1 sibling, 0 replies; 6+ messages in thread
From: Christian König @ 2021-12-14 13:02 UTC (permalink / raw)
  To: Jingwen Chen, amd-gfx; +Cc: horace.chen, monk.liu

On 14.12.21 at 05:17, Jingwen Chen wrote:
> [Why]
> gmc bo will be pinned during loading amdgpu and reset in SRIOV while
> only unpinned in unload amdgpu
>
> [How]
> add amdgpu_in_reset and sriov judgement to skip pin bo
>
> v2: fix wrong judgement
>
> Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>

Nirmoy already had a different patch set to stop the unpin/pin on
suspend/resume, removing those code paths altogether.

He's on parental leave right now, but I think those patches were
ready and just needed testing.

Regards,
Christian.

> ---
>   drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 4 ++++
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c  | 4 ++++
>   2 files changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> index d696c4754bea..ae46eb35b3d7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c
> @@ -992,10 +992,14 @@ static int gmc_v10_0_gart_enable(struct amdgpu_device *adev)
>   		return -EINVAL;
>   	}
>   
> +	if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
> +		goto skip_pin_bo;
> +
>   	r = amdgpu_gart_table_vram_pin(adev);
>   	if (r)
>   		return r;
>   
> +skip_pin_bo:
>   	r = adev->gfxhub.funcs->gart_enable(adev);
>   	if (r)
>   		return r;
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index db2ec84f7237..d91eb7eb0ebe 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -1717,10 +1717,14 @@ static int gmc_v9_0_gart_enable(struct amdgpu_device *adev)
>   		return -EINVAL;
>   	}
>   
> +	if (amdgpu_sriov_vf(adev) && amdgpu_in_reset(adev))
> +		goto skip_pin_bo;
> +
>   	r = amdgpu_gart_table_vram_pin(adev);
>   	if (r)
>   		return r;
>   
> +skip_pin_bo:
>   	r = adev->gfxhub.funcs->gart_enable(adev);
>   	if (r)
>   		return r;



* RE: [PATCH v2 1/2] drm/amd/amdgpu: fix psp tmr bo pin count leak in SRIOV
  2021-12-14  4:17 [PATCH v2 1/2] drm/amd/amdgpu: fix psp tmr bo pin count leak in SRIOV Jingwen Chen
  2021-12-14  4:17 ` [PATCH v2 2/2] drm/amd/amdgpu: fix gmc " Jingwen Chen
  2021-12-14  7:39 ` [PATCH v2 1/2] drm/amd/amdgpu: fix psp tmr " Chen, Horace
@ 2021-12-14 16:09 ` Liu, Shaoyun
  2 siblings, 0 replies; 6+ messages in thread
From: Liu, Shaoyun @ 2021-12-14 16:09 UTC (permalink / raw)
  To: Chen, JingWen, amd-gfx; +Cc: Chen, Horace, Chen, JingWen, Liu, Monk

[AMD Official Use Only]

This workaround code looks confusing. For the PSP TMR, I think the guest side should avoid loading it at all, since it is loaded on the host side. For the GART table, the current code path is probably OK, but I think that if we had a correct sequence in SRIOV, we wouldn't need these kinds of workarounds. For example, can we try calling ip_suspend for SRIOV in amdgpu_device_pre_asic_reset, so that we have the same logic as bare metal?

Regards
Shaoyun.liu
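
The sequencing suggested above can be sketched with a small user-space model. It assumes, as the discussion of unpin/pin on suspend/resume in this thread implies, that a suspend step drops the pin that hw init takes. demo_bo, demo_hw_init, demo_hw_fini and demo_reset_with_suspend are hypothetical names for illustration, not amdgpu functions.

```c
#include <assert.h>

/* Hypothetical stand-in for a buffer object; only the pin count is modelled. */
struct demo_bo {
	unsigned int pin_count;
};

/* Models a hw init step that pins the BO. */
static void demo_hw_init(struct demo_bo *bo)
{
	bo->pin_count++;
}

/* Models a suspend/fini step that drops the pin taken by hw init. */
static void demo_hw_fini(struct demo_bo *bo)
{
	assert(bo->pin_count > 0);
	bo->pin_count--;
}

/*
 * Load once, then for each reset suspend first and re-init afterwards,
 * then unload. Every pin is paired with an unpin, so no special case is
 * needed in the init path. Returns the pin count left after unload.
 */
static unsigned int demo_reset_with_suspend(unsigned int resets)
{
	struct demo_bo bo = { .pin_count = 0 };
	unsigned int i;

	demo_hw_init(&bo);		/* driver load */
	for (i = 0; i < resets; i++) {
		demo_hw_fini(&bo);	/* suspend before the reset */
		demo_hw_init(&bo);	/* re-init after the reset */
	}
	demo_hw_fini(&bo);		/* driver unload */

	return bo.pin_count;
}
```

With this ordering the count returns to zero for any number of resets, which is the property the skip_pin_bo check achieves by special-casing the init path instead.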


