* [PATCH v3 0/5] GPU workload hints for better performance @ 2022-09-26 21:40 Shashank Sharma 2022-09-26 21:40 ` [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl Shashank Sharma ` (5 more replies) 0 siblings, 6 replies; 76+ messages in thread From: Shashank Sharma @ 2022-09-26 21:40 UTC (permalink / raw) To: amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, Shashank Sharma AMDGPU SoCs support dynamic, workload-based power profiles, which can provide fine-tuned performance for a particular type of workload. This patch series adds an interface to set/reset these power profiles based on workload-type hints. A user can set a hint for the type of workload being submitted to the GPU, and the driver can dynamically switch to the power profile best suited to that workload. Currently supported workload profiles are: "None", "3D", "Video", "VR", "Compute" V2: This version addresses the review comment from Christian about changing the design to set the workload mode in a more dynamic way than only during context creation. V3: Addressed review comments from Christian; removed the get_workload() calls from the UAPI, keeping only the set_workload() call. 
Shashank Sharma (5): drm/amdgpu: add UAPI for workload hints to ctx ioctl drm/amdgpu: add new functions to set GPU power profile drm/amdgpu: set GPU workload via ctx IOCTL drm/amdgpu: switch GPU workload profile drm/amdgpu: switch workload context to/from compute drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 42 ++++++++- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 + .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 93 +++++++++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 15 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 + .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + include/uapi/drm/amdgpu_drm.h | 17 ++++ 12 files changed, 243 insertions(+), 6 deletions(-) create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c create mode 100644 drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h -- 2.34.1 ^ permalink raw reply [flat|nested] 76+ messages in thread
* [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2022-09-26 21:40 [PATCH v3 0/5] GPU workload hints for better performance Shashank Sharma @ 2022-09-26 21:40 ` Shashank Sharma 2022-09-27 6:07 ` Christian König ` (2 more replies) 2022-09-26 21:40 ` [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile Shashank Sharma ` (4 subsequent siblings) 5 siblings, 3 replies; 76+ messages in thread From: Shashank Sharma @ 2022-09-26 21:40 UTC (permalink / raw) To: amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, Shashank Sharma Allow the user to specify a workload hint to the kernel. We can use these to tweak the dpm heuristics to better match the workload for improved performance. V3: Create only set() workload UAPI (Christian) Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> --- include/uapi/drm/amdgpu_drm.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h index c2c9c674a223..23d354242699 100644 --- a/include/uapi/drm/amdgpu_drm.h +++ b/include/uapi/drm/amdgpu_drm.h @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list { #define AMDGPU_CTX_OP_QUERY_STATE2 4 #define AMDGPU_CTX_OP_GET_STABLE_PSTATE 5 #define AMDGPU_CTX_OP_SET_STABLE_PSTATE 6 +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7 /* GPU reset status */ #define AMDGPU_CTX_NO_RESET 0 @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list { #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK 3 #define AMDGPU_CTX_STABLE_PSTATE_PEAK 4 +/* GPU workload hints, flag bits 8-15 */ +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8 +#define AMDGPU_CTX_WORKLOAD_HINT_MASK (0xff << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_NONE (0 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_3D (1 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define 
AMDGPU_CTX_WORKLOAD_HINT_VR (3 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (4 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_MAX AMDGPU_CTX_WORKLOAD_HINT_COMPUTE +#define AMDGPU_CTX_WORKLOAD_INDEX(n) (n >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) + struct drm_amdgpu_ctx_in { /** AMDGPU_CTX_OP_* */ __u32 op; @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out { __u32 flags; __u32 _pad; } pstate; + + struct { + __u32 flags; + __u32 _pad; + } workload; }; union drm_amdgpu_ctx { -- 2.34.1 ^ permalink raw reply related [flat|nested] 76+ messages in thread
* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2022-09-26 21:40 ` [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl Shashank Sharma @ 2022-09-27 6:07 ` Christian König 2022-09-27 14:28 ` Felix Kuehling 2023-03-21 3:05 ` Marek Olšák 2 siblings, 0 replies; 76+ messages in thread From: Christian König @ 2022-09-27 6:07 UTC (permalink / raw) To: Shashank Sharma, amd-gfx; +Cc: alexander.deucher, amaranath.somalapuram Am 26.09.22 um 23:40 schrieb Shashank Sharma: > Allow the user to specify a workload hint to the kernel. > We can use these to tweak the dpm heuristics to better match > the workload for improved performance. > > V3: Create only set() workload UAPI (Christian) > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> > --- > include/uapi/drm/amdgpu_drm.h | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h > index c2c9c674a223..23d354242699 100644 > --- a/include/uapi/drm/amdgpu_drm.h > +++ b/include/uapi/drm/amdgpu_drm.h > @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list { > #define AMDGPU_CTX_OP_QUERY_STATE2 4 > #define AMDGPU_CTX_OP_GET_STABLE_PSTATE 5 > #define AMDGPU_CTX_OP_SET_STABLE_PSTATE 6 > +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7 > > /* GPU reset status */ > #define AMDGPU_CTX_NO_RESET 0 > @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list { > #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK 3 > #define AMDGPU_CTX_STABLE_PSTATE_PEAK 4 > > +/* GPU workload hints, flag bits 8-15 */ > +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8 > +#define AMDGPU_CTX_WORKLOAD_HINT_MASK (0xff << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_NONE (0 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_3D (1 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 << 
AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_VR (3 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (4 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_MAX AMDGPU_CTX_WORKLOAD_HINT_COMPUTE > +#define AMDGPU_CTX_WORKLOAD_INDEX(n) (n >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > + > struct drm_amdgpu_ctx_in { > /** AMDGPU_CTX_OP_* */ > __u32 op; > @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out { > __u32 flags; > __u32 _pad; > } pstate; > + > + struct { > + __u32 flags; > + __u32 _pad; > + } workload; > }; > > union drm_amdgpu_ctx { ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2022-09-26 21:40 ` [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl Shashank Sharma 2022-09-27 6:07 ` Christian König @ 2022-09-27 14:28 ` Felix Kuehling 2023-03-21 3:05 ` Marek Olšák 2 siblings, 0 replies; 76+ messages in thread From: Felix Kuehling @ 2022-09-27 14:28 UTC (permalink / raw) To: Shashank Sharma, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig Am 2022-09-26 um 17:40 schrieb Shashank Sharma: > Allow the user to specify a workload hint to the kernel. > We can use these to tweak the dpm heuristics to better match > the workload for improved performance. > > V3: Create only set() workload UAPI (Christian) > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> > --- > include/uapi/drm/amdgpu_drm.h | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h > index c2c9c674a223..23d354242699 100644 > --- a/include/uapi/drm/amdgpu_drm.h > +++ b/include/uapi/drm/amdgpu_drm.h > @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list { > #define AMDGPU_CTX_OP_QUERY_STATE2 4 > #define AMDGPU_CTX_OP_GET_STABLE_PSTATE 5 > #define AMDGPU_CTX_OP_SET_STABLE_PSTATE 6 > +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7 > > /* GPU reset status */ > #define AMDGPU_CTX_NO_RESET 0 > @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list { > #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK 3 > #define AMDGPU_CTX_STABLE_PSTATE_PEAK 4 > > +/* GPU workload hints, flag bits 8-15 */ > +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8 > +#define AMDGPU_CTX_WORKLOAD_HINT_MASK (0xff << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) 8 bits seems overkill for this. Are we ever going to have 256 different workload types? Maybe 4 bits would be enough. That would allow up to 16 types. 
> +#define AMDGPU_CTX_WORKLOAD_HINT_NONE (0 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_3D (1 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_VR (3 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (4 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_MAX AMDGPU_CTX_WORKLOAD_HINT_COMPUTE > +#define AMDGPU_CTX_WORKLOAD_INDEX(n) (n >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) The macro argument (n) should be wrapped in parentheses. Also, it may be a good idea to apply the AMDGPU_CTX_WORKLOAD_HINT_MASK when extracting the index, in case more flags are added at higher bits in the future: (((n) & AMDGPU_CTX_WORKLOAD_HINT_MASK) >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) Regards, Felix > + > struct drm_amdgpu_ctx_in { > /** AMDGPU_CTX_OP_* */ > __u32 op; > @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out { > __u32 flags; > __u32 _pad; > } pstate; > + > + struct { > + __u32 flags; > + __u32 _pad; > + } workload; > }; > > union drm_amdgpu_ctx { ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2022-09-26 21:40 ` [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl Shashank Sharma 2022-09-27 6:07 ` Christian König 2022-09-27 14:28 ` Felix Kuehling @ 2023-03-21 3:05 ` Marek Olšák 2023-03-21 13:00 ` Sharma, Shashank 2 siblings, 1 reply; 76+ messages in thread From: Marek Olšák @ 2023-03-21 3:05 UTC (permalink / raw) To: Shashank Sharma Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, amd-gfx [-- Attachment #1: Type: text/plain, Size: 2753 bytes --] I think we should do it differently because this interface will be mostly unused by open source userspace in its current form. Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will be immutable for the lifetime of the context. No other interface is needed. Marek On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma <shashank.sharma@amd.com> wrote: > Allow the user to specify a workload hint to the kernel. > We can use these to tweak the dpm heuristics to better match > the workload for improved performance. 
> > V3: Create only set() workload UAPI (Christian) > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> > --- > include/uapi/drm/amdgpu_drm.h | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h > index c2c9c674a223..23d354242699 100644 > --- a/include/uapi/drm/amdgpu_drm.h > +++ b/include/uapi/drm/amdgpu_drm.h > @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list { > #define AMDGPU_CTX_OP_QUERY_STATE2 4 > #define AMDGPU_CTX_OP_GET_STABLE_PSTATE 5 > #define AMDGPU_CTX_OP_SET_STABLE_PSTATE 6 > +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7 > > /* GPU reset status */ > #define AMDGPU_CTX_NO_RESET 0 > @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list { > #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK 3 > #define AMDGPU_CTX_STABLE_PSTATE_PEAK 4 > > +/* GPU workload hints, flag bits 8-15 */ > +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8 > +#define AMDGPU_CTX_WORKLOAD_HINT_MASK (0xff << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_NONE (0 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_3D (1 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_VR (3 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (4 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_MAX AMDGPU_CTX_WORKLOAD_HINT_COMPUTE > +#define AMDGPU_CTX_WORKLOAD_INDEX(n) (n >> > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > + > struct drm_amdgpu_ctx_in { > /** AMDGPU_CTX_OP_* */ > __u32 op; > @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out { > __u32 flags; > __u32 _pad; > } pstate; > + > + struct { > + __u32 flags; > + __u32 _pad; > + } workload; > }; > > union drm_amdgpu_ctx { > -- > 2.34.1 > > [-- Attachment #2: Type: text/html, Size: 3524 bytes --] ^ permalink raw reply [flat|nested] 76+ messages 
in thread
* RE: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2023-03-21 3:05 ` Marek Olšák @ 2023-03-21 13:00 ` Sharma, Shashank 2023-03-21 13:54 ` Christian König 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2023-03-21 13:00 UTC (permalink / raw) To: Marek Olšák Cc: Deucher, Alexander, Somalapuram, Amaranath, Koenig, Christian, amd-gfx [-- Attachment #1: Type: text/plain, Size: 3400 bytes --] [AMD Official Use Only - General] When we started this patch series, the workload hint was a part of the ctx_flag only, But we changed that after the design review, to make it more like how we are handling PSTATE. Details: https://patchwork.freedesktop.org/patch/496111/ Regards Shashank From: Marek Olšák <maraeo@gmail.com> Sent: 21 March 2023 04:05 To: Sharma, Shashank <Shashank.Sharma@amd.com> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher@amd.com>; Somalapuram, Amaranath <Amaranath.Somalapuram@amd.com>; Koenig, Christian <Christian.Koenig@amd.com> Subject: Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl I think we should do it differently because this interface will be mostly unused by open source userspace in its current form. Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will be immutable for the lifetime of the context. No other interface is needed. Marek On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma <shashank.sharma@amd.com<mailto:shashank.sharma@amd.com>> wrote: Allow the user to specify a workload hint to the kernel. We can use these to tweak the dpm heuristics to better match the workload for improved performance. 
V3: Create only set() workload UAPI (Christian) Signed-off-by: Alex Deucher <alexander.deucher@amd.com<mailto:alexander.deucher@amd.com>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com<mailto:shashank.sharma@amd.com>> --- include/uapi/drm/amdgpu_drm.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h index c2c9c674a223..23d354242699 100644 --- a/include/uapi/drm/amdgpu_drm.h +++ b/include/uapi/drm/amdgpu_drm.h @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list { #define AMDGPU_CTX_OP_QUERY_STATE2 4 #define AMDGPU_CTX_OP_GET_STABLE_PSTATE 5 #define AMDGPU_CTX_OP_SET_STABLE_PSTATE 6 +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7 /* GPU reset status */ #define AMDGPU_CTX_NO_RESET 0 @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list { #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK 3 #define AMDGPU_CTX_STABLE_PSTATE_PEAK 4 +/* GPU workload hints, flag bits 8-15 */ +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8 +#define AMDGPU_CTX_WORKLOAD_HINT_MASK (0xff << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_NONE (0 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_3D (1 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_VR (3 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (4 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_MAX AMDGPU_CTX_WORKLOAD_HINT_COMPUTE +#define AMDGPU_CTX_WORKLOAD_INDEX(n) (n >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) + struct drm_amdgpu_ctx_in { /** AMDGPU_CTX_OP_* */ __u32 op; @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out { __u32 flags; __u32 _pad; } pstate; + + struct { + __u32 flags; + __u32 _pad; + } workload; }; union drm_amdgpu_ctx { -- 2.34.1 [-- Attachment #2: Type: text/html, Size: 8237 bytes --] ^ permalink raw reply related [flat|nested] 76+ messages in thread
* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2023-03-21 13:00 ` Sharma, Shashank @ 2023-03-21 13:54 ` Christian König 2023-03-22 14:05 ` Marek Olšák 0 siblings, 1 reply; 76+ messages in thread From: Christian König @ 2023-03-21 13:54 UTC (permalink / raw) To: Sharma, Shashank, Marek Olšák Cc: Deucher, Alexander, Somalapuram, Amaranath, Koenig, Christian, amd-gfx [-- Attachment #1: Type: text/plain, Size: 4059 bytes --] Yes, I would like to avoid having multiple code paths for context creation. Setting it later on should be equally to specifying it on creation since we only need it during CS. Regards, Christian. Am 21.03.23 um 14:00 schrieb Sharma, Shashank: > > [AMD Official Use Only - General] > > When we started this patch series, the workload hint was a part of the > ctx_flag only, > > But we changed that after the design review, to make it more like how > we are handling PSTATE. > > Details: > > https://patchwork.freedesktop.org/patch/496111/ > > Regards > > Shashank > > *From:*Marek Olšák <maraeo@gmail.com> > *Sent:* 21 March 2023 04:05 > *To:* Sharma, Shashank <Shashank.Sharma@amd.com> > *Cc:* amd-gfx@lists.freedesktop.org; Deucher, Alexander > <Alexander.Deucher@amd.com>; Somalapuram, Amaranath > <Amaranath.Somalapuram@amd.com>; Koenig, Christian > <Christian.Koenig@amd.com> > *Subject:* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints > to ctx ioctl > > I think we should do it differently because this interface will be > mostly unused by open source userspace in its current form. > > Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will > be immutable for the lifetime of the context. No other interface is > needed. > > Marek > > On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma > <shashank.sharma@amd.com> wrote: > > Allow the user to specify a workload hint to the kernel. > We can use these to tweak the dpm heuristics to better match > the workload for improved performance. 
> > V3: Create only set() workload UAPI (Christian) > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> > --- > include/uapi/drm/amdgpu_drm.h | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/include/uapi/drm/amdgpu_drm.h > b/include/uapi/drm/amdgpu_drm.h > index c2c9c674a223..23d354242699 100644 > --- a/include/uapi/drm/amdgpu_drm.h > +++ b/include/uapi/drm/amdgpu_drm.h > @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list { > #define AMDGPU_CTX_OP_QUERY_STATE2 4 > #define AMDGPU_CTX_OP_GET_STABLE_PSTATE 5 > #define AMDGPU_CTX_OP_SET_STABLE_PSTATE 6 > +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7 > > /* GPU reset status */ > #define AMDGPU_CTX_NO_RESET 0 > @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list { > #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK 3 > #define AMDGPU_CTX_STABLE_PSTATE_PEAK 4 > > +/* GPU workload hints, flag bits 8-15 */ > +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8 > +#define AMDGPU_CTX_WORKLOAD_HINT_MASK (0xff << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_NONE (0 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_3D (1 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_VR (3 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (4 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_MAX AMDGPU_CTX_WORKLOAD_HINT_COMPUTE > +#define AMDGPU_CTX_WORKLOAD_INDEX(n) (n >> > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > + > struct drm_amdgpu_ctx_in { > /** AMDGPU_CTX_OP_* */ > __u32 op; > @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out { > __u32 flags; > __u32 _pad; > } pstate; > + > + struct { > + __u32 flags; > + __u32 _pad; > + } workload; > }; > > union drm_amdgpu_ctx { > -- > 2.34.1 > [-- Attachment #2: Type: text/html, Size: 10105 bytes --] ^ permalink raw reply [flat|nested] 76+ 
messages in thread
* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2023-03-21 13:54 ` Christian König @ 2023-03-22 14:05 ` Marek Olšák 2023-03-22 14:08 ` Christian König 0 siblings, 1 reply; 76+ messages in thread From: Marek Olšák @ 2023-03-22 14:05 UTC (permalink / raw) To: Christian König Cc: Deucher, Alexander, Somalapuram, Amaranath, Koenig, Christian, amd-gfx, Sharma, Shashank [-- Attachment #1: Type: text/plain, Size: 4307 bytes --] The option to change the hint after context creation and get the hint would be unused uapi, and AFAIK we are not supposed to add unused uapi. What I asked is to change it to a uapi that userspace will actually use. Marek On Tue, Mar 21, 2023 at 9:54 AM Christian König < ckoenig.leichtzumerken@gmail.com> wrote: > Yes, I would like to avoid having multiple code paths for context creation. > > Setting it later on should be equally to specifying it on creation since > we only need it during CS. > > Regards, > Christian. > > Am 21.03.23 um 14:00 schrieb Sharma, Shashank: > > [AMD Official Use Only - General] > > > > When we started this patch series, the workload hint was a part of the > ctx_flag only, > > But we changed that after the design review, to make it more like how we > are handling PSTATE. 
> > > > Details: > > https://patchwork.freedesktop.org/patch/496111/ > > > > Regards > > Shashank > > > > *From:* Marek Olšák <maraeo@gmail.com> <maraeo@gmail.com> > *Sent:* 21 March 2023 04:05 > *To:* Sharma, Shashank <Shashank.Sharma@amd.com> <Shashank.Sharma@amd.com> > *Cc:* amd-gfx@lists.freedesktop.org; Deucher, Alexander > <Alexander.Deucher@amd.com> <Alexander.Deucher@amd.com>; Somalapuram, > Amaranath <Amaranath.Somalapuram@amd.com> <Amaranath.Somalapuram@amd.com>; > Koenig, Christian <Christian.Koenig@amd.com> <Christian.Koenig@amd.com> > *Subject:* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to > ctx ioctl > > > > I think we should do it differently because this interface will be mostly > unused by open source userspace in its current form. > > > > Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will be > immutable for the lifetime of the context. No other interface is needed. > > > > Marek > > > > On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma <shashank.sharma@amd.com> > wrote: > > Allow the user to specify a workload hint to the kernel. > We can use these to tweak the dpm heuristics to better match > the workload for improved performance. 
> > V3: Create only set() workload UAPI (Christian) > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> > --- > include/uapi/drm/amdgpu_drm.h | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h > index c2c9c674a223..23d354242699 100644 > --- a/include/uapi/drm/amdgpu_drm.h > +++ b/include/uapi/drm/amdgpu_drm.h > @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list { > #define AMDGPU_CTX_OP_QUERY_STATE2 4 > #define AMDGPU_CTX_OP_GET_STABLE_PSTATE 5 > #define AMDGPU_CTX_OP_SET_STABLE_PSTATE 6 > +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7 > > /* GPU reset status */ > #define AMDGPU_CTX_NO_RESET 0 > @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list { > #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK 3 > #define AMDGPU_CTX_STABLE_PSTATE_PEAK 4 > > +/* GPU workload hints, flag bits 8-15 */ > +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8 > +#define AMDGPU_CTX_WORKLOAD_HINT_MASK (0xff << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_NONE (0 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_3D (1 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_VR (3 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (4 << > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > +#define AMDGPU_CTX_WORKLOAD_HINT_MAX AMDGPU_CTX_WORKLOAD_HINT_COMPUTE > +#define AMDGPU_CTX_WORKLOAD_INDEX(n) (n >> > AMDGPU_CTX_WORKLOAD_HINT_SHIFT) > + > struct drm_amdgpu_ctx_in { > /** AMDGPU_CTX_OP_* */ > __u32 op; > @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out { > __u32 flags; > __u32 _pad; > } pstate; > + > + struct { > + __u32 flags; > + __u32 _pad; > + } workload; > }; > > union drm_amdgpu_ctx { > -- > 2.34.1 > > > [-- Attachment #2: Type: text/html, Size: 8943 bytes --] ^ permalink raw reply [flat|nested] 76+ 
messages in thread
* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2023-03-22 14:05 ` Marek Olšák @ 2023-03-22 14:08 ` Christian König 2023-03-22 14:24 ` Marek Olšák 0 siblings, 1 reply; 76+ messages in thread From: Christian König @ 2023-03-22 14:08 UTC (permalink / raw) To: Marek Olšák, Christian König Cc: Deucher, Alexander, Somalapuram, Amaranath, amd-gfx, Sharma, Shashank [-- Attachment #1: Type: text/plain, Size: 5412 bytes --] Well completely agree that we shouldn't have unused API. That's why I said we should remove the getting the hint from the UAPI. But what's wrong with setting it after creating the context? Don't you know enough about the use case? I need to understand the background a bit better here. Christian. Am 22.03.23 um 15:05 schrieb Marek Olšák: > The option to change the hint after context creation and get the hint > would be unused uapi, and AFAIK we are not supposed to add unused > uapi. What I asked is to change it to a uapi that userspace will > actually use. > > Marek > > On Tue, Mar 21, 2023 at 9:54 AM Christian König > <ckoenig.leichtzumerken@gmail.com> wrote: > > Yes, I would like to avoid having multiple code paths for context > creation. > > Setting it later on should be equally to specifying it on creation > since we only need it during CS. > > Regards, > Christian. > > Am 21.03.23 um 14:00 schrieb Sharma, Shashank: >> >> [AMD Official Use Only - General] >> >> When we started this patch series, the workload hint was a part >> of the ctx_flag only, >> >> But we changed that after the design review, to make it more like >> how we are handling PSTATE. 
>> >> Details: >> >> https://patchwork.freedesktop.org/patch/496111/ >> >> Regards >> >> Shashank >> >> *From:*Marek Olšák <maraeo@gmail.com> <mailto:maraeo@gmail.com> >> *Sent:* 21 March 2023 04:05 >> *To:* Sharma, Shashank <Shashank.Sharma@amd.com> >> <mailto:Shashank.Sharma@amd.com> >> *Cc:* amd-gfx@lists.freedesktop.org; Deucher, Alexander >> <Alexander.Deucher@amd.com> <mailto:Alexander.Deucher@amd.com>; >> Somalapuram, Amaranath <Amaranath.Somalapuram@amd.com> >> <mailto:Amaranath.Somalapuram@amd.com>; Koenig, Christian >> <Christian.Koenig@amd.com> <mailto:Christian.Koenig@amd.com> >> *Subject:* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload >> hints to ctx ioctl >> >> I think we should do it differently because this interface will >> be mostly unused by open source userspace in its current form. >> >> Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that >> will be immutable for the lifetime of the context. No other >> interface is needed. >> >> Marek >> >> On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma >> <shashank.sharma@amd.com> wrote: >> >> Allow the user to specify a workload hint to the kernel. >> We can use these to tweak the dpm heuristics to better match >> the workload for improved performance. 
>> >> V3: Create only set() workload UAPI (Christian) >> >> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >> --- >> include/uapi/drm/amdgpu_drm.h | 17 +++++++++++++++++ >> 1 file changed, 17 insertions(+) >> >> diff --git a/include/uapi/drm/amdgpu_drm.h >> b/include/uapi/drm/amdgpu_drm.h >> index c2c9c674a223..23d354242699 100644 >> --- a/include/uapi/drm/amdgpu_drm.h >> +++ b/include/uapi/drm/amdgpu_drm.h >> @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list { >> #define AMDGPU_CTX_OP_QUERY_STATE2 4 >> #define AMDGPU_CTX_OP_GET_STABLE_PSTATE 5 >> #define AMDGPU_CTX_OP_SET_STABLE_PSTATE 6 >> +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7 >> >> /* GPU reset status */ >> #define AMDGPU_CTX_NO_RESET 0 >> @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list { >> #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK 3 >> #define AMDGPU_CTX_STABLE_PSTATE_PEAK 4 >> >> +/* GPU workload hints, flag bits 8-15 */ >> +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8 >> +#define AMDGPU_CTX_WORKLOAD_HINT_MASK (0xff << >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) >> +#define AMDGPU_CTX_WORKLOAD_HINT_NONE (0 << >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) >> +#define AMDGPU_CTX_WORKLOAD_HINT_3D (1 << >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) >> +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 << >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) >> +#define AMDGPU_CTX_WORKLOAD_HINT_VR (3 << >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) >> +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (4 << >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) >> +#define AMDGPU_CTX_WORKLOAD_HINT_MAX >> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE >> +#define AMDGPU_CTX_WORKLOAD_INDEX(n) (n >> >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) >> + >> struct drm_amdgpu_ctx_in { >> /** AMDGPU_CTX_OP_* */ >> __u32 op; >> @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out { >> __u32 flags; >> __u32 _pad; >> } pstate; >> + >> + struct { >> + __u32 flags; >> + __u32 _pad; >> + } workload; >> }; >> >> union drm_amdgpu_ctx { >> -- >> 2.34.1 >> > [-- Attachment #2: Type: text/html, 
Size: 11730 bytes --] ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2023-03-22 14:08 ` Christian König @ 2023-03-22 14:24 ` Marek Olšák 2023-03-22 14:29 ` Christian König 0 siblings, 1 reply; 76+ messages in thread From: Marek Olšák @ 2023-03-22 14:24 UTC (permalink / raw) To: Christian König Cc: Deucher, Alexander, Christian König, Somalapuram, Amaranath, amd-gfx, Sharma, Shashank [-- Attachment #1: Type: text/plain, Size: 5361 bytes --] The hint is static per API (one of graphics, video, compute, unknown). In the case of Vulkan, which exposes all queues, the hint is unknown, so Vulkan won't use it. (or make it based on the queue being used and not the uapi context state) GL won't use it because the default hint is already 3D. That makes VAAPI the only user that only sets the hint once, and maybe it's not worth even adding this uapi just for VAAPI. Marek On Wed, Mar 22, 2023 at 10:08 AM Christian König <christian.koenig@amd.com> wrote: > Well completely agree that we shouldn't have unused API. That's why I said > we should remove the getting the hint from the UAPI. > > But what's wrong with setting it after creating the context? Don't you > know enough about the use case? I need to understand the background a bit > better here. > > Christian. > > Am 22.03.23 um 15:05 schrieb Marek Olšák: > > The option to change the hint after context creation and get the hint > would be unused uapi, and AFAIK we are not supposed to add unused uapi. > What I asked is to change it to a uapi that userspace will actually use. > > Marek > > On Tue, Mar 21, 2023 at 9:54 AM Christian König < > ckoenig.leichtzumerken@gmail.com> wrote: > >> Yes, I would like to avoid having multiple code paths for context >> creation. >> >> Setting it later on should be equally to specifying it on creation since >> we only need it during CS. >> >> Regards, >> Christian. 
>> >> Am 21.03.23 um 14:00 schrieb Sharma, Shashank: >> >> [AMD Official Use Only - General] >> >> >> >> When we started this patch series, the workload hint was a part of the >> ctx_flag only, >> >> But we changed that after the design review, to make it more like how we >> are handling PSTATE. >> >> >> >> Details: >> >> https://patchwork.freedesktop.org/patch/496111/ >> >> >> >> Regards >> >> Shashank >> >> >> >> *From:* Marek Olšák <maraeo@gmail.com> <maraeo@gmail.com> >> *Sent:* 21 March 2023 04:05 >> *To:* Sharma, Shashank <Shashank.Sharma@amd.com> >> <Shashank.Sharma@amd.com> >> *Cc:* amd-gfx@lists.freedesktop.org; Deucher, Alexander >> <Alexander.Deucher@amd.com> <Alexander.Deucher@amd.com>; Somalapuram, >> Amaranath <Amaranath.Somalapuram@amd.com> <Amaranath.Somalapuram@amd.com>; >> Koenig, Christian <Christian.Koenig@amd.com> <Christian.Koenig@amd.com> >> *Subject:* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to >> ctx ioctl >> >> >> >> I think we should do it differently because this interface will be mostly >> unused by open source userspace in its current form. >> >> >> >> Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will be >> immutable for the lifetime of the context. No other interface is needed. >> >> >> >> Marek >> >> >> >> On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma <shashank.sharma@amd.com> >> wrote: >> >> Allow the user to specify a workload hint to the kernel. >> We can use these to tweak the dpm heuristics to better match >> the workload for improved performance. 
>> [snip: patch hunk identical to the one quoted in full earlier in the thread]
* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2023-03-22 14:24 ` Marek Olšák @ 2023-03-22 14:29 ` Christian König 2023-03-22 14:36 ` Marek Olšák 2023-03-22 14:38 ` Sharma, Shashank 0 siblings, 2 replies; 76+ messages in thread From: Christian König @ 2023-03-22 14:29 UTC (permalink / raw) To: Marek Olšák Cc: Deucher, Alexander, Christian König, Somalapuram, Amaranath, amd-gfx, Sharma, Shashank [-- Attachment #1: Type: text/plain, Size: 7013 bytes --] Well that sounds like being able to optionally set it after context creation is actually the right approach. VA-API could set it as soon as we know that this is a video codec application. Vulkan can set it depending on what features are used by the application. But yes, Shashank (or whoever requested that) should come up with some code for Mesa to actually use it. Otherwise we don't have the justification to push it into the kernel driver. Christian. Am 22.03.23 um 15:24 schrieb Marek Olšák: > The hint is static per API (one of graphics, video, compute, unknown). > In the case of Vulkan, which exposes all queues, the hint is unknown, > so Vulkan won't use it. (or make it based on the queue being used and > not the uapi context state) GL won't use it because the default hint > is already 3D. That makes VAAPI the only user that only sets the hint > once, and maybe it's not worth even adding this uapi just for VAAPI. > > Marek > > On Wed, Mar 22, 2023 at 10:08 AM Christian König > <christian.koenig@amd.com> wrote: > > Well completely agree that we shouldn't have unused API. That's > why I said we should remove the getting the hint from the UAPI. > > But what's wrong with setting it after creating the context? Don't > you know enough about the use case? I need to understand the > background a bit better here. > > Christian. 
> > Am 22.03.23 um 15:05 schrieb Marek Olšák: >> The option to change the hint after context creation and get the >> hint would be unused uapi, and AFAIK we are not supposed to add >> unused uapi. What I asked is to change it to a uapi that >> userspace will actually use. >> >> Marek >> >> On Tue, Mar 21, 2023 at 9:54 AM Christian König >> <ckoenig.leichtzumerken@gmail.com> wrote: >> >> Yes, I would like to avoid having multiple code paths for >> context creation. >> >> Setting it later on should be equally to specifying it on >> creation since we only need it during CS. >> >> Regards, >> Christian. >> >> Am 21.03.23 um 14:00 schrieb Sharma, Shashank: >>> >>> [AMD Official Use Only - General] >>> >>> When we started this patch series, the workload hint was a >>> part of the ctx_flag only, >>> >>> But we changed that after the design review, to make it more >>> like how we are handling PSTATE. >>> >>> Details: >>> >>> https://patchwork.freedesktop.org/patch/496111/ >>> >>> Regards >>> >>> Shashank >>> >>> *From:*Marek Olšák <maraeo@gmail.com> <mailto:maraeo@gmail.com> >>> *Sent:* 21 March 2023 04:05 >>> *To:* Sharma, Shashank <Shashank.Sharma@amd.com> >>> <mailto:Shashank.Sharma@amd.com> >>> *Cc:* amd-gfx@lists.freedesktop.org; Deucher, Alexander >>> <Alexander.Deucher@amd.com> >>> <mailto:Alexander.Deucher@amd.com>; Somalapuram, Amaranath >>> <Amaranath.Somalapuram@amd.com> >>> <mailto:Amaranath.Somalapuram@amd.com>; Koenig, Christian >>> <Christian.Koenig@amd.com> <mailto:Christian.Koenig@amd.com> >>> *Subject:* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for >>> workload hints to ctx ioctl >>> >>> I think we should do it differently because this interface >>> will be mostly unused by open source userspace in its >>> current form. >>> >>> Let's set the workload hint in drm_amdgpu_ctx_in::flags, and >>> that will be immutable for the lifetime of the context. No >>> other interface is needed. 
>>> Marek
>>>
>>> On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma <shashank.sharma@amd.com> wrote:
>>>
>>> Allow the user to specify a workload hint to the kernel.
>>> We can use these to tweak the dpm heuristics to better match
>>> the workload for improved performance.
>>>
>>> [snip: patch hunk identical to the one quoted in full earlier in the thread]
* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2023-03-22 14:29 ` Christian König @ 2023-03-22 14:36 ` Marek Olšák 2023-03-22 14:52 ` Alex Deucher 2023-03-22 14:38 ` Sharma, Shashank 1 sibling, 1 reply; 76+ messages in thread From: Marek Olšák @ 2023-03-22 14:36 UTC (permalink / raw) To: Christian König Cc: Deucher, Alexander, Christian König, Somalapuram, Amaranath, amd-gfx, Sharma, Shashank [-- Attachment #1: Type: text/plain, Size: 6336 bytes --] It sounds like the kernel should set the hint based on which queues are used, so that every UMD doesn't have to duplicate the same logic. Marek On Wed, Mar 22, 2023 at 10:29 AM Christian König <christian.koenig@amd.com> wrote: > Well that sounds like being able to optionally set it after context > creation is actually the right approach. > > VA-API could set it as soon as we know that this is a video codec > application. > > Vulkan can set it depending on what features are used by the application. > > But yes, Shashank (or whoever requested that) should come up with some > code for Mesa to actually use it. Otherwise we don't have the justification > to push it into the kernel driver. > > Christian. > > Am 22.03.23 um 15:24 schrieb Marek Olšák: > > The hint is static per API (one of graphics, video, compute, unknown). In > the case of Vulkan, which exposes all queues, the hint is unknown, so > Vulkan won't use it. (or make it based on the queue being used and not the > uapi context state) GL won't use it because the default hint is already 3D. > That makes VAAPI the only user that only sets the hint once, and maybe it's > not worth even adding this uapi just for VAAPI. > > Marek > > On Wed, Mar 22, 2023 at 10:08 AM Christian König <christian.koenig@amd.com> > wrote: > >> Well completely agree that we shouldn't have unused API. That's why I >> said we should remove the getting the hint from the UAPI. >> >> But what's wrong with setting it after creating the context? 
Don't you >> know enough about the use case? I need to understand the background a bit >> better here. >> >> Christian. >> >> Am 22.03.23 um 15:05 schrieb Marek Olšák: >> >> The option to change the hint after context creation and get the hint >> would be unused uapi, and AFAIK we are not supposed to add unused uapi. >> What I asked is to change it to a uapi that userspace will actually use. >> >> Marek >> >> On Tue, Mar 21, 2023 at 9:54 AM Christian König < >> ckoenig.leichtzumerken@gmail.com> wrote: >> >>> Yes, I would like to avoid having multiple code paths for context >>> creation. >>> >>> Setting it later on should be equally to specifying it on creation since >>> we only need it during CS. >>> >>> Regards, >>> Christian. >>> >>> Am 21.03.23 um 14:00 schrieb Sharma, Shashank: >>> >>> [AMD Official Use Only - General] >>> >>> >>> >>> When we started this patch series, the workload hint was a part of the >>> ctx_flag only, >>> >>> But we changed that after the design review, to make it more like how we >>> are handling PSTATE. >>> >>> >>> >>> Details: >>> >>> https://patchwork.freedesktop.org/patch/496111/ >>> >>> >>> >>> Regards >>> >>> Shashank >>> >>> >>> >>> *From:* Marek Olšák <maraeo@gmail.com> <maraeo@gmail.com> >>> *Sent:* 21 March 2023 04:05 >>> *To:* Sharma, Shashank <Shashank.Sharma@amd.com> >>> <Shashank.Sharma@amd.com> >>> *Cc:* amd-gfx@lists.freedesktop.org; Deucher, Alexander >>> <Alexander.Deucher@amd.com> <Alexander.Deucher@amd.com>; Somalapuram, >>> Amaranath <Amaranath.Somalapuram@amd.com> >>> <Amaranath.Somalapuram@amd.com>; Koenig, Christian >>> <Christian.Koenig@amd.com> <Christian.Koenig@amd.com> >>> *Subject:* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints >>> to ctx ioctl >>> >>> >>> >>> I think we should do it differently because this interface will be >>> mostly unused by open source userspace in its current form. 
>>> Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will
>>> be immutable for the lifetime of the context. No other interface is needed.
>>>
>>> Marek
>>>
>>> On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma <shashank.sharma@amd.com> wrote:
>>>
>>> Allow the user to specify a workload hint to the kernel.
>>> We can use these to tweak the dpm heuristics to better match
>>> the workload for improved performance.
>>>
>>> [snip: patch hunk identical to the one quoted in full earlier in the thread]
* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2023-03-22 14:36 ` Marek Olšák @ 2023-03-22 14:52 ` Alex Deucher 2023-03-22 15:11 ` Marek Olšák 0 siblings, 1 reply; 76+ messages in thread From: Alex Deucher @ 2023-03-22 14:52 UTC (permalink / raw) To: Marek Olšák Cc: Sharma, Shashank, Christian König, Somalapuram, Amaranath, amd-gfx, Deucher, Alexander, Christian König On Wed, Mar 22, 2023 at 10:37 AM Marek Olšák <maraeo@gmail.com> wrote: > > It sounds like the kernel should set the hint based on which queues are used, so that every UMD doesn't have to duplicate the same logic. Userspace has a better idea of what they are doing than the kernel. That said, we already set the video hint in the kernel when we submit work to VCN/UVD/VCE and we already set hint COMPUTE when user queues are active in ROCm because user queues don't go through the kernel. I guess we could just set 3D by default. On windows there is a separate API for fullscreen 3D games so 3D is only enabled in that case. I assumed UMDs would want to select a hint, but maybe we should just select the kernel set something. I figured vulkan or OpenGL would select 3D vs COMPUTE depending on what queues/extensions the app uses. Thinking about it more, if we do keep the hints, maybe it makes more sense to select the hint at context init. Then we can set the hint to the hardware at context init time. If multiple hints come in from different contexts we'll automatically select the most aggressive one. That would also be compatible with user mode queues. Alex > > Marek > > On Wed, Mar 22, 2023 at 10:29 AM Christian König <christian.koenig@amd.com> wrote: >> >> Well that sounds like being able to optionally set it after context creation is actually the right approach. >> >> VA-API could set it as soon as we know that this is a video codec application. >> >> Vulkan can set it depending on what features are used by the application. 
>> >> But yes, Shashank (or whoever requested that) should come up with some code for Mesa to actually use it. Otherwise we don't have the justification to push it into the kernel driver. >> >> Christian. >> >> Am 22.03.23 um 15:24 schrieb Marek Olšák: >> >> The hint is static per API (one of graphics, video, compute, unknown). In the case of Vulkan, which exposes all queues, the hint is unknown, so Vulkan won't use it. (or make it based on the queue being used and not the uapi context state) GL won't use it because the default hint is already 3D. That makes VAAPI the only user that only sets the hint once, and maybe it's not worth even adding this uapi just for VAAPI. >> >> Marek >> >> On Wed, Mar 22, 2023 at 10:08 AM Christian König <christian.koenig@amd.com> wrote: >>> >>> Well completely agree that we shouldn't have unused API. That's why I said we should remove the getting the hint from the UAPI. >>> >>> But what's wrong with setting it after creating the context? Don't you know enough about the use case? I need to understand the background a bit better here. >>> >>> Christian. >>> >>> Am 22.03.23 um 15:05 schrieb Marek Olšák: >>> >>> The option to change the hint after context creation and get the hint would be unused uapi, and AFAIK we are not supposed to add unused uapi. What I asked is to change it to a uapi that userspace will actually use. >>> >>> Marek >>> >>> On Tue, Mar 21, 2023 at 9:54 AM Christian König <ckoenig.leichtzumerken@gmail.com> wrote: >>>> >>>> Yes, I would like to avoid having multiple code paths for context creation. >>>> >>>> Setting it later on should be equally to specifying it on creation since we only need it during CS. >>>> >>>> Regards, >>>> Christian. 
>>>> >>>> Am 21.03.23 um 14:00 schrieb Sharma, Shashank: >>>> >>>> [AMD Official Use Only - General] >>>> >>>> >>>> >>>> When we started this patch series, the workload hint was a part of the ctx_flag only, >>>> >>>> But we changed that after the design review, to make it more like how we are handling PSTATE. >>>> >>>> >>>> >>>> Details: >>>> >>>> https://patchwork.freedesktop.org/patch/496111/ >>>> >>>> >>>> >>>> Regards >>>> >>>> Shashank >>>> >>>> >>>> >>>> From: Marek Olšák <maraeo@gmail.com> >>>> Sent: 21 March 2023 04:05 >>>> To: Sharma, Shashank <Shashank.Sharma@amd.com> >>>> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher@amd.com>; Somalapuram, Amaranath <Amaranath.Somalapuram@amd.com>; Koenig, Christian <Christian.Koenig@amd.com> >>>> Subject: Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl >>>> >>>> >>>> >>>> I think we should do it differently because this interface will be mostly unused by open source userspace in its current form. >>>> >>>> >>>> >>>> Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will be immutable for the lifetime of the context. No other interface is needed. >>>> >>>> >>>> >>>> Marek >>>> >>>> >>>> >>>> On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma <shashank.sharma@amd.com> wrote: >>>> >>>> Allow the user to specify a workload hint to the kernel. >>>> We can use these to tweak the dpm heuristics to better match >>>> the workload for improved performance. 
>>>> [snip: patch hunk identical to the one quoted in full earlier in the thread]
* Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2023-03-22 14:52 ` Alex Deucher @ 2023-03-22 15:11 ` Marek Olšák 0 siblings, 0 replies; 76+ messages in thread From: Marek Olšák @ 2023-03-22 15:11 UTC (permalink / raw) To: Alex Deucher Cc: Sharma, Shashank, Christian König, Somalapuram, Amaranath, amd-gfx, Deucher, Alexander, Christian König [-- Attachment #1: Type: text/plain, Size: 8326 bytes --] The uapi would make sense if somebody wrote and implemented a Vulkan extension exposing the hints and if we had customers who require that extension. Without that, userspace knows almost nothing. If anything, this effort should be led by our customers especially in the case of Vulkan (writing the extension spec, etc.) This is not a stack issue as much as it is an interface designed around Windows that doesn't fit Linux, and for that reason, putting into uapi in the current form doesn't seem to be a good idea. Marek On Wed, Mar 22, 2023 at 10:52 AM Alex Deucher <alexdeucher@gmail.com> wrote: > On Wed, Mar 22, 2023 at 10:37 AM Marek Olšák <maraeo@gmail.com> wrote: > > > > It sounds like the kernel should set the hint based on which queues are > used, so that every UMD doesn't have to duplicate the same logic. > > Userspace has a better idea of what they are doing than the kernel. > That said, we already set the video hint in the kernel when we submit > work to VCN/UVD/VCE and we already set hint COMPUTE when user queues > are active in ROCm because user queues don't go through the kernel. I > guess we could just set 3D by default. On windows there is a separate > API for fullscreen 3D games so 3D is only enabled in that case. I > assumed UMDs would want to select a hint, but maybe we should just > select the kernel set something. I figured vulkan or OpenGL would > select 3D vs COMPUTE depending on what queues/extensions the app uses. > > Thinking about it more, if we do keep the hints, maybe it makes more > sense to select the hint at context init. 
Then we can set the hint to > the hardware at context init time. If multiple hints come in from > different contexts we'll automatically select the most aggressive one. > That would also be compatible with user mode queues. > > Alex > > > > > Marek > > > > On Wed, Mar 22, 2023 at 10:29 AM Christian König < > christian.koenig@amd.com> wrote: > >> > >> Well that sounds like being able to optionally set it after context > creation is actually the right approach. > >> > >> VA-API could set it as soon as we know that this is a video codec > application. > >> > >> Vulkan can set it depending on what features are used by the > application. > >> > >> But yes, Shashank (or whoever requested that) should come up with some > code for Mesa to actually use it. Otherwise we don't have the justification > to push it into the kernel driver. > >> > >> Christian. > >> > >> Am 22.03.23 um 15:24 schrieb Marek Olšák: > >> > >> The hint is static per API (one of graphics, video, compute, unknown). > In the case of Vulkan, which exposes all queues, the hint is unknown, so > Vulkan won't use it. (or make it based on the queue being used and not the > uapi context state) GL won't use it because the default hint is already 3D. > That makes VAAPI the only user that only sets the hint once, and maybe it's > not worth even adding this uapi just for VAAPI. > >> > >> Marek > >> > >> On Wed, Mar 22, 2023 at 10:08 AM Christian König < > christian.koenig@amd.com> wrote: > >>> > >>> Well completely agree that we shouldn't have unused API. That's why I > said we should remove the getting the hint from the UAPI. > >>> > >>> But what's wrong with setting it after creating the context? Don't you > know enough about the use case? I need to understand the background a bit > better here. > >>> > >>> Christian. 
> >>> > >>> Am 22.03.23 um 15:05 schrieb Marek Olšák: > >>> > >>> The option to change the hint after context creation and get the hint > would be unused uapi, and AFAIK we are not supposed to add unused uapi. > What I asked is to change it to a uapi that userspace will actually use. > >>> > >>> Marek > >>> > >>> On Tue, Mar 21, 2023 at 9:54 AM Christian König < > ckoenig.leichtzumerken@gmail.com> wrote: > >>>> > >>>> Yes, I would like to avoid having multiple code paths for context > creation. > >>>> > >>>> Setting it later on should be equally to specifying it on creation > since we only need it during CS. > >>>> > >>>> Regards, > >>>> Christian. > >>>> > >>>> Am 21.03.23 um 14:00 schrieb Sharma, Shashank: > >>>> > >>>> [AMD Official Use Only - General] > >>>> > >>>> > >>>> > >>>> When we started this patch series, the workload hint was a part of > the ctx_flag only, > >>>> > >>>> But we changed that after the design review, to make it more like how > we are handling PSTATE. > >>>> > >>>> > >>>> > >>>> Details: > >>>> > >>>> https://patchwork.freedesktop.org/patch/496111/ > >>>> > >>>> > >>>> > >>>> Regards > >>>> > >>>> Shashank > >>>> > >>>> > >>>> > >>>> From: Marek Olšák <maraeo@gmail.com> > >>>> Sent: 21 March 2023 04:05 > >>>> To: Sharma, Shashank <Shashank.Sharma@amd.com> > >>>> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander < > Alexander.Deucher@amd.com>; Somalapuram, Amaranath < > Amaranath.Somalapuram@amd.com>; Koenig, Christian < > Christian.Koenig@amd.com> > >>>> Subject: Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints > to ctx ioctl > >>>> > >>>> > >>>> > >>>> I think we should do it differently because this interface will be > mostly unused by open source userspace in its current form. > >>>> > >>>> > >>>> > >>>> Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that > will be immutable for the lifetime of the context. No other interface is > needed. 
> >>>> Marek
> >>>>
> >>>> On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma <shashank.sharma@amd.com> wrote:
> >>>>
> >>>> Allow the user to specify a workload hint to the kernel.
> >>>> We can use these to tweak the dpm heuristics to better match
> >>>> the workload for improved performance.
> >>>>
> >>>> [snip: patch hunk identical to the one quoted in full earlier in the thread]
* RE: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl 2023-03-22 14:29 ` Christian König 2023-03-22 14:36 ` Marek Olšák @ 2023-03-22 14:38 ` Sharma, Shashank 1 sibling, 0 replies; 76+ messages in thread From: Sharma, Shashank @ 2023-03-22 14:38 UTC (permalink / raw) To: Koenig, Christian, Marek Olšák Cc: Deucher, Alexander, Christian König, Somalapuram, Amaranath, amd-gfx [-- Attachment #1: Type: text/plain, Size: 6767 bytes --] [AMD Official Use Only - General] From the exposed workload hints: +#define AMDGPU_CTX_WORKLOAD_HINT_NONE +#define AMDGPU_CTX_WORKLOAD_HINT_3D +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO +#define AMDGPU_CTX_WORKLOAD_HINT_VR +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE I guess the only option which we do not know how to use is HINT_VR, everything else is known. I find it a limitation of the stack that we can’t differentiate between a VR workload Vs 3D, coz at some time we might have to give high privilege or special attention to it when VR becomes more demanding, but for now, I can remove this one option from the patch: +#define AMDGPU_CTX_WORKLOAD_HINT_VR Regards Shashank From: Koenig, Christian <Christian.Koenig@amd.com> Sent: 22 March 2023 15:29 To: Marek Olšák <maraeo@gmail.com> Cc: Christian König <ckoenig.leichtzumerken@gmail.com>; Sharma, Shashank <Shashank.Sharma@amd.com>; Deucher, Alexander <Alexander.Deucher@amd.com>; Somalapuram, Amaranath <Amaranath.Somalapuram@amd.com>; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl Well that sounds like being able to optionally set it after context creation is actually the right approach. VA-API could set it as soon as we know that this is a video codec application. Vulkan can set it depending on what features are used by the application. But yes, Shashank (or whoever requested that) should come up with some code for Mesa to actually use it. Otherwise we don't have the justification to push it into the kernel driver. 
Christian. Am 22.03.23 um 15:24 schrieb Marek Olšák: The hint is static per API (one of graphics, video, compute, unknown). In the case of Vulkan, which exposes all queues, the hint is unknown, so Vulkan won't use it. (or make it based on the queue being used and not the uapi context state) GL won't use it because the default hint is already 3D. That makes VAAPI the only user that only sets the hint once, and maybe it's not worth even adding this uapi just for VAAPI. Marek On Wed, Mar 22, 2023 at 10:08 AM Christian König <christian.koenig@amd.com<mailto:christian.koenig@amd.com>> wrote: Well, I completely agree that we shouldn't have unused API. That's why I said we should remove getting the hint from the UAPI. But what's wrong with setting it after creating the context? Don't you know enough about the use case? I need to understand the background a bit better here. Christian. Am 22.03.23 um 15:05 schrieb Marek Olšák: The option to change the hint after context creation and get the hint would be unused uapi, and AFAIK we are not supposed to add unused uapi. What I asked is to change it to a uapi that userspace will actually use. Marek On Tue, Mar 21, 2023 at 9:54 AM Christian König <ckoenig.leichtzumerken@gmail.com<mailto:ckoenig.leichtzumerken@gmail.com>> wrote: Yes, I would like to avoid having multiple code paths for context creation. Setting it later on should be equivalent to specifying it on creation, since we only need it during CS. Regards, Christian. Am 21.03.23 um 14:00 schrieb Sharma, Shashank: [AMD Official Use Only - General] When we started this patch series, the workload hint was a part of the ctx_flag only, but we changed that after the design review to make it more like how we are handling PSTATE. 
Details: https://patchwork.freedesktop.org/patch/496111/ Regards Shashank From: Marek Olšák <maraeo@gmail.com><mailto:maraeo@gmail.com> Sent: 21 March 2023 04:05 To: Sharma, Shashank <Shashank.Sharma@amd.com><mailto:Shashank.Sharma@amd.com> Cc: amd-gfx@lists.freedesktop.org<mailto:amd-gfx@lists.freedesktop.org>; Deucher, Alexander <Alexander.Deucher@amd.com><mailto:Alexander.Deucher@amd.com>; Somalapuram, Amaranath <Amaranath.Somalapuram@amd.com><mailto:Amaranath.Somalapuram@amd.com>; Koenig, Christian <Christian.Koenig@amd.com><mailto:Christian.Koenig@amd.com> Subject: Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl I think we should do it differently because this interface will be mostly unused by open source userspace in its current form. Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will be immutable for the lifetime of the context. No other interface is needed. Marek On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma <shashank.sharma@amd.com<mailto:shashank.sharma@amd.com>> wrote: Allow the user to specify a workload hint to the kernel. We can use these to tweak the dpm heuristics to better match the workload for improved performance. 
V3: Create only set() workload UAPI (Christian) Signed-off-by: Alex Deucher <alexander.deucher@amd.com<mailto:alexander.deucher@amd.com>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com<mailto:shashank.sharma@amd.com>> --- include/uapi/drm/amdgpu_drm.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h index c2c9c674a223..23d354242699 100644 --- a/include/uapi/drm/amdgpu_drm.h +++ b/include/uapi/drm/amdgpu_drm.h @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list { #define AMDGPU_CTX_OP_QUERY_STATE2 4 #define AMDGPU_CTX_OP_GET_STABLE_PSTATE 5 #define AMDGPU_CTX_OP_SET_STABLE_PSTATE 6 +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7 /* GPU reset status */ #define AMDGPU_CTX_NO_RESET 0 @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list { #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK 3 #define AMDGPU_CTX_STABLE_PSTATE_PEAK 4 +/* GPU workload hints, flag bits 8-15 */ +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8 +#define AMDGPU_CTX_WORKLOAD_HINT_MASK (0xff << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_NONE (0 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_3D (1 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_VR (3 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (4 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT) +#define AMDGPU_CTX_WORKLOAD_HINT_MAX AMDGPU_CTX_WORKLOAD_HINT_COMPUTE +#define AMDGPU_CTX_WORKLOAD_INDEX(n) (n >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT) + struct drm_amdgpu_ctx_in { /** AMDGPU_CTX_OP_* */ __u32 op; @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out { __u32 flags; __u32 _pad; } pstate; + + struct { + __u32 flags; + __u32 _pad; + } workload; }; union drm_amdgpu_ctx { -- 2.34.1 [-- Attachment #2: Type: text/html, Size: 15820 bytes --] ^ permalink raw reply related [flat|nested] 76+ messages in thread
* [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-26 21:40 [PATCH v3 0/5] GPU workload hints for better performance Shashank Sharma 2022-09-26 21:40 ` [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl Shashank Sharma @ 2022-09-26 21:40 ` Shashank Sharma 2022-09-27 2:14 ` Quan, Evan ` (3 more replies) 2022-09-26 21:40 ` [PATCH v3 3/5] drm/amdgpu: set GPU workload via ctx IOCTL Shashank Sharma ` (3 subsequent siblings) 5 siblings, 4 replies; 76+ messages in thread From: Shashank Sharma @ 2022-09-26 21:40 UTC (permalink / raw) To: amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, Shashank Sharma This patch adds new functions which will allow a user to change the GPU power profile based on a GPU workload hint flag. Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 +++++++++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + 5 files changed, 158 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c create mode 100644 drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile index 5a283d12f8e1..34679c657ecc 100644 --- a/drivers/gpu/drm/amd/amdgpu/Makefile +++ b/drivers/gpu/drm/amd/amdgpu/Makefile @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o amdgpu_pll.o \ - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_ctx_workload.o amdgpu_sync.o \ amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o 
amdgpu_virt.o \ amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c new file mode 100644 index 000000000000..a11cf29bc388 --- /dev/null +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c @@ -0,0 +1,97 @@ +/* + * Copyright 2022 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + */ +#include <drm/drm.h> +#include "kgd_pp_interface.h" +#include "amdgpu_ctx_workload.h" + +static enum PP_SMC_POWER_PROFILE +amdgpu_workload_to_power_profile(uint32_t hint) +{ + switch (hint) { + case AMDGPU_CTX_WORKLOAD_HINT_NONE: + default: + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; + + case AMDGPU_CTX_WORKLOAD_HINT_3D: + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: + return PP_SMC_POWER_PROFILE_VIDEO; + case AMDGPU_CTX_WORKLOAD_HINT_VR: + return PP_SMC_POWER_PROFILE_VR; + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: + return PP_SMC_POWER_PROFILE_COMPUTE; + } +} + +int amdgpu_set_workload_profile(struct amdgpu_device *adev, + uint32_t hint) +{ + int ret = 0; + enum PP_SMC_POWER_PROFILE profile = + amdgpu_workload_to_power_profile(hint); + + if (adev->pm.workload_mode == hint) + return 0; + + mutex_lock(&adev->pm.smu_workload_lock); + + if (adev->pm.workload_mode == hint) + goto unlock; + + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); + if (!ret) + adev->pm.workload_mode = hint; + atomic_inc(&adev->pm.workload_switch_ref); + +unlock: + mutex_unlock(&adev->pm.smu_workload_lock); + return ret; +} + +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, + uint32_t hint) +{ + int ret = 0; + enum PP_SMC_POWER_PROFILE profile = + amdgpu_workload_to_power_profile(hint); + + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) + return 0; + + /* Do not reset GPU power profile if another reset is coming */ + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) + return 0; + + mutex_lock(&adev->pm.smu_workload_lock); + + if (adev->pm.workload_mode != hint) + goto unlock; + + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); + if (!ret) + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; + +unlock: + mutex_unlock(&adev->pm.smu_workload_lock); + return ret; +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index be7aff2d4a57..1f0f64662c04 100644 --- 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, mutex_init(&adev->psp.mutex); mutex_init(&adev->notifier_lock); mutex_init(&adev->pm.stable_pstate_ctx_lock); + mutex_init(&adev->pm.smu_workload_lock); mutex_init(&adev->benchmark_mutex); amdgpu_device_init_apu_flags(adev); diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h new file mode 100644 index 000000000000..6060fc53c3b0 --- /dev/null +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h @@ -0,0 +1,54 @@ +/* + * Copyright 2022 Advanced Micro Devices, Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. 
+ * + */ +#ifndef _AMDGPU_CTX_WL_H_ +#define _AMDGPU_CTX_WL_H_ +#include <drm/amdgpu_drm.h> +#include "amdgpu.h" + +/* Workload mode names */ +static const char * const amdgpu_workload_mode_name[] = { + "None", + "3D", + "Video", + "VR", + "Compute", + "Unknown", +}; + +static inline const +char *amdgpu_workload_profile_name(uint32_t profile) +{ + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; + + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; +} + +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, + uint32_t hint); + +int amdgpu_set_workload_profile(struct amdgpu_device *adev, + uint32_t hint); + +#endif diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h index 65624d091ed2..565131f789d0 100644 --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h @@ -361,6 +361,11 @@ struct amdgpu_pm { struct mutex stable_pstate_ctx_lock; struct amdgpu_ctx *stable_pstate_ctx; + /* SMU workload mode */ + struct mutex smu_workload_lock; + uint32_t workload_mode; + atomic_t workload_switch_ref; + struct config_table_setting config_table; /* runtime mode */ enum amdgpu_runpm_mode rpm_mode; -- 2.34.1 ^ permalink raw reply related [flat|nested] 76+ messages in thread
* RE: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-26 21:40 ` [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile Shashank Sharma @ 2022-09-27 2:14 ` Quan, Evan 2022-09-27 7:29 ` Sharma, Shashank 2022-09-27 6:08 ` Christian König ` (2 subsequent siblings) 3 siblings, 1 reply; 76+ messages in thread From: Quan, Evan @ 2022-09-27 2:14 UTC (permalink / raw) To: Sharma, Shashank, amd-gfx Cc: Deucher, Alexander, Somalapuram, Amaranath, Koenig, Christian, Sharma, Shashank [AMD Official Use Only - General] > -----Original Message----- > From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of > Shashank Sharma > Sent: Tuesday, September 27, 2022 5:40 AM > To: amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Somalapuram, > Amaranath <Amaranath.Somalapuram@amd.com>; Koenig, Christian > <Christian.Koenig@amd.com>; Sharma, Shashank > <Shashank.Sharma@amd.com> > Subject: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power > profile > > This patch adds new functions which will allow a user to > change the GPU power profile based a GPU workload hint > flag. 
> > Cc: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- > .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 > +++++++++++++++++++ > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + > .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ > drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + > 5 files changed, 158 insertions(+), 1 deletion(-) > create mode 100644 > drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > create mode 100644 > drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile > b/drivers/gpu/drm/amd/amdgpu/Makefile > index 5a283d12f8e1..34679c657ecc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/Makefile > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile > @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ > atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ > atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ > amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o > amdgpu_pll.o \ > - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ > + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o > amdgpu_ctx_workload.o amdgpu_sync.o \ > amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o > amdgpu_virt.o \ > amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ > amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > new file mode 100644 > index 000000000000..a11cf29bc388 > --- /dev/null > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > @@ -0,0 +1,97 @@ > +/* > + * Copyright 2022 Advanced Micro Devices, Inc. 
> + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the > "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO > EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, > DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR > OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR > THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. 
> + * > + */ > +#include <drm/drm.h> > +#include "kgd_pp_interface.h" > +#include "amdgpu_ctx_workload.h" > + > +static enum PP_SMC_POWER_PROFILE > +amdgpu_workload_to_power_profile(uint32_t hint) > +{ > + switch (hint) { > + case AMDGPU_CTX_WORKLOAD_HINT_NONE: > + default: > + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; > + > + case AMDGPU_CTX_WORKLOAD_HINT_3D: > + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; > + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: > + return PP_SMC_POWER_PROFILE_VIDEO; > + case AMDGPU_CTX_WORKLOAD_HINT_VR: > + return PP_SMC_POWER_PROFILE_VR; > + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: > + return PP_SMC_POWER_PROFILE_COMPUTE; > + } > +} > + > +int amdgpu_set_workload_profile(struct amdgpu_device *adev, > + uint32_t hint) > +{ > + int ret = 0; > + enum PP_SMC_POWER_PROFILE profile = > + amdgpu_workload_to_power_profile(hint); > + > + if (adev->pm.workload_mode == hint) > + return 0; > + > + mutex_lock(&adev->pm.smu_workload_lock); > + > + if (adev->pm.workload_mode == hint) > + goto unlock; [Quan, Evan] This seems redundant with code above. I saw you dropped this in Patch4. But I kind of feel this should be the one which needs to be kept. 
> + > + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); > + if (!ret) > + adev->pm.workload_mode = hint; > + atomic_inc(&adev->pm.workload_switch_ref); > + > +unlock: > + mutex_unlock(&adev->pm.smu_workload_lock); > + return ret; > +} > + > +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, > + uint32_t hint) > +{ > + int ret = 0; > + enum PP_SMC_POWER_PROFILE profile = > + amdgpu_workload_to_power_profile(hint); > + > + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) > + return 0; > + > + /* Do not reset GPU power profile if another reset is coming */ > + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) > + return 0; > + > + mutex_lock(&adev->pm.smu_workload_lock); > + > + if (adev->pm.workload_mode != hint) > + goto unlock; > + > + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); > + if (!ret) > + adev->pm.workload_mode = > AMDGPU_CTX_WORKLOAD_HINT_NONE; > + > +unlock: > + mutex_unlock(&adev->pm.smu_workload_lock); > + return ret; > +} [Quan, Evan] Instead of setting to AMDGPU_CTX_WORKLOAD_HINT_NONE, better to reset it back to original workload profile mode. That can make it compatible with existing sysfs interface which has similar functionality for setting workload profile mode. /** * DOC: pp_power_profile_mode * * The amdgpu driver provides a sysfs API for adjusting the heuristics * related to switching between power levels in a power state. The file * pp_power_profile_mode is used for this. * * Reading this file outputs a list of all of the predefined power profiles * and the relevant heuristics settings for that profile. * * To select a profile or create a custom profile, first select manual using * power_dpm_force_performance_level. Writing the number of a predefined * profile to pp_power_profile_mode will enable those heuristics. To * create a custom set of heuristics, write a string of numbers to the file * starting with the number of the custom profile along with a setting * for each heuristic parameter. 
Due to differences across asic families * the heuristic parameters vary from family to family. * */ BR Evan > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index be7aff2d4a57..1f0f64662c04 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device > *adev, > mutex_init(&adev->psp.mutex); > mutex_init(&adev->notifier_lock); > mutex_init(&adev->pm.stable_pstate_ctx_lock); > + mutex_init(&adev->pm.smu_workload_lock); > mutex_init(&adev->benchmark_mutex); > > amdgpu_device_init_apu_flags(adev); > diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > new file mode 100644 > index 000000000000..6060fc53c3b0 > --- /dev/null > +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > @@ -0,0 +1,54 @@ > +/* > + * Copyright 2022 Advanced Micro Devices, Inc. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the > "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO > EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, > DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR > OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR > THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. > + * > + */ > +#ifndef _AMDGPU_CTX_WL_H_ > +#define _AMDGPU_CTX_WL_H_ > +#include <drm/amdgpu_drm.h> > +#include "amdgpu.h" > + > +/* Workload mode names */ > +static const char * const amdgpu_workload_mode_name[] = { > + "None", > + "3D", > + "Video", > + "VR", > + "Compute", > + "Unknown", > +}; > + > +static inline const > +char *amdgpu_workload_profile_name(uint32_t profile) > +{ > + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && > + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) > + return > amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile > )]; > + > + return > amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; > +} > + > +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, > + uint32_t hint); > + > +int amdgpu_set_workload_profile(struct amdgpu_device *adev, > + uint32_t hint); > + > +#endif > diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > index 65624d091ed2..565131f789d0 100644 > --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > @@ -361,6 +361,11 @@ struct amdgpu_pm { > struct mutex stable_pstate_ctx_lock; > struct amdgpu_ctx *stable_pstate_ctx; > > + /* SMU workload mode */ > + struct mutex smu_workload_lock; > + uint32_t workload_mode; > + atomic_t workload_switch_ref; > + > struct config_table_setting config_table; > /* runtime mode */ > enum amdgpu_runpm_mode rpm_mode; > -- > 2.34.1 ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 2:14 ` Quan, Evan @ 2022-09-27 7:29 ` Sharma, Shashank 2022-09-27 9:29 ` Quan, Evan 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 7:29 UTC (permalink / raw) To: Quan, Evan, amd-gfx Cc: Deucher, Alexander, Somalapuram, Amaranath, Koenig, Christian Hello Evan, On 9/27/2022 4:14 AM, Quan, Evan wrote: > [AMD Official Use Only - General] > > > >> -----Original Message----- >> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of >> Shashank Sharma >> Sent: Tuesday, September 27, 2022 5:40 AM >> To: amd-gfx@lists.freedesktop.org >> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Somalapuram, >> Amaranath <Amaranath.Somalapuram@amd.com>; Koenig, Christian >> <Christian.Koenig@amd.com>; Sharma, Shashank >> <Shashank.Sharma@amd.com> >> Subject: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power >> profile >> >> This patch adds new functions which will allow a user to >> change the GPU power profile based a GPU workload hint >> flag. 
>> >> Cc: Alex Deucher <alexander.deucher@amd.com> >> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 >> +++++++++++++++++++ >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >> 5 files changed, 158 insertions(+), 1 deletion(-) >> create mode 100644 >> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> create mode 100644 >> drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >> b/drivers/gpu/drm/amd/amdgpu/Makefile >> index 5a283d12f8e1..34679c657ecc 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >> amdgpu_pll.o \ >> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ >> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >> amdgpu_ctx_workload.o amdgpu_sync.o \ >> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >> amdgpu_virt.o \ >> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> new file mode 100644 >> index 000000000000..a11cf29bc388 >> --- /dev/null >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> @@ -0,0 +1,97 @@ >> +/* >> + * Copyright 2022 Advanced Micro Devices, Inc. 
>> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice shall be included in >> + * all copies or substantial portions of the Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >> EVENT SHALL >> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >> DAMAGES OR >> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >> OTHERWISE, >> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >> THE USE OR >> + * OTHER DEALINGS IN THE SOFTWARE. 
>> + * >> + */ >> +#include <drm/drm.h> >> +#include "kgd_pp_interface.h" >> +#include "amdgpu_ctx_workload.h" >> + >> +static enum PP_SMC_POWER_PROFILE >> +amdgpu_workload_to_power_profile(uint32_t hint) >> +{ >> + switch (hint) { >> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >> + default: >> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >> + >> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >> + return PP_SMC_POWER_PROFILE_VIDEO; >> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >> + return PP_SMC_POWER_PROFILE_VR; >> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >> + return PP_SMC_POWER_PROFILE_COMPUTE; >> + } >> +} >> + >> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >> + uint32_t hint) >> +{ >> + int ret = 0; >> + enum PP_SMC_POWER_PROFILE profile = >> + amdgpu_workload_to_power_profile(hint); >> + >> + if (adev->pm.workload_mode == hint) >> + return 0; >> + >> + mutex_lock(&adev->pm.smu_workload_lock); >> + >> + if (adev->pm.workload_mode == hint) >> + goto unlock; > [Quan, Evan] This seems redundant with code above. I saw you dropped this in Patch4. > But I kind of feel this should be the one which needs to be kept. Yes, this shuffle happened during the rebase-testing of V3, will update this. 
>> + >> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >> + if (!ret) >> + adev->pm.workload_mode = hint; >> + atomic_inc(&adev->pm.workload_switch_ref); >> + >> +unlock: >> + mutex_unlock(&adev->pm.smu_workload_lock); >> + return ret; >> +} >> + >> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >> + uint32_t hint) >> +{ >> + int ret = 0; >> + enum PP_SMC_POWER_PROFILE profile = >> + amdgpu_workload_to_power_profile(hint); >> + >> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >> + return 0; >> + >> + /* Do not reset GPU power profile if another reset is coming */ >> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >> + return 0; >> + >> + mutex_lock(&adev->pm.smu_workload_lock); >> + >> + if (adev->pm.workload_mode != hint) >> + goto unlock; >> + >> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >> + if (!ret) >> + adev->pm.workload_mode = >> AMDGPU_CTX_WORKLOAD_HINT_NONE; >> + >> +unlock: >> + mutex_unlock(&adev->pm.smu_workload_lock); >> + return ret; >> +} > [Quan, Evan] Instead of setting to AMDGPU_CTX_WORKLOAD_HINT_NONE, better to reset it back to original workload profile mode. > That can make it compatible with existing sysfs interface which has similar functionality for setting workload profile mode. This API is specifically written to remove any workload profile applied, hence the name "clear_workload_profile"; the intention is a reset. As you can see in the next patch, the workload profile is set from job_run and reset again once the job execution is done. If there is another set() in progress, the reference counter takes care of that. So I would like to keep it this way. - Shashank > /** > * DOC: pp_power_profile_mode > * > * The amdgpu driver provides a sysfs API for adjusting the heuristics > * related to switching between power levels in a power state. The file > * pp_power_profile_mode is used for this. 
> * > * Reading this file outputs a list of all of the predefined power profiles > * and the relevant heuristics settings for that profile. > * > * To select a profile or create a custom profile, first select manual using > * power_dpm_force_performance_level. Writing the number of a predefined > * profile to pp_power_profile_mode will enable those heuristics. To > * create a custom set of heuristics, write a string of numbers to the file > * starting with the number of the custom profile along with a setting > * for each heuristic parameter. Due to differences across asic families > * the heuristic parameters vary from family to family. > * > */ > > BR > Evan >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> index be7aff2d4a57..1f0f64662c04 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device >> *adev, >> mutex_init(&adev->psp.mutex); >> mutex_init(&adev->notifier_lock); >> mutex_init(&adev->pm.stable_pstate_ctx_lock); >> + mutex_init(&adev->pm.smu_workload_lock); >> mutex_init(&adev->benchmark_mutex); >> >> amdgpu_device_init_apu_flags(adev); >> diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >> new file mode 100644 >> index 000000000000..6060fc53c3b0 >> --- /dev/null >> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >> @@ -0,0 +1,54 @@ >> +/* >> + * Copyright 2022 Advanced Micro Devices, Inc. 
>> + * >> + * Permission is hereby granted, free of charge, to any person obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without limitation >> + * the rights to use, copy, modify, merge, publish, distribute, sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice shall be included in >> + * all copies or substantial portions of the Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >> EVENT SHALL >> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >> DAMAGES OR >> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >> OTHERWISE, >> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >> THE USE OR >> + * OTHER DEALINGS IN THE SOFTWARE. 
>> + * >> + */ >> +#ifndef _AMDGPU_CTX_WL_H_ >> +#define _AMDGPU_CTX_WL_H_ >> +#include <drm/amdgpu_drm.h> >> +#include "amdgpu.h" >> + >> +/* Workload mode names */ >> +static const char * const amdgpu_workload_mode_name[] = { >> + "None", >> + "3D", >> + "Video", >> + "VR", >> + "Compute", >> + "Unknown", >> +}; >> + >> +static inline const >> +char *amdgpu_workload_profile_name(uint32_t profile) >> +{ >> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >> + return >> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile >> )]; >> + >> + return >> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >> +} >> + >> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >> + uint32_t hint); >> + >> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >> + uint32_t hint); >> + >> +#endif >> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >> index 65624d091ed2..565131f789d0 100644 >> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >> @@ -361,6 +361,11 @@ struct amdgpu_pm { >> struct mutex stable_pstate_ctx_lock; >> struct amdgpu_ctx *stable_pstate_ctx; >> >> + /* SMU workload mode */ >> + struct mutex smu_workload_lock; >> + uint32_t workload_mode; >> + atomic_t workload_switch_ref; >> + >> struct config_table_setting config_table; >> /* runtime mode */ >> enum amdgpu_runpm_mode rpm_mode; >> -- >> 2.34.1 ^ permalink raw reply [flat|nested] 76+ messages in thread
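[Editor's note] The set/clear discipline debated in this message — a mutex-guarded mode switch paired with an atomic reference count so that overlapping jobs do not prematurely reset the profile — can be sketched in plain C11 outside the kernel. All names here (`pm_state`, `set_profile`, `clear_profile`, the `hint` enum) are illustrative stand-ins, not the actual amdgpu symbols:

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical, simplified mirror of the amdgpu_pm fields in the patch. */
enum hint { HINT_NONE, HINT_3D, HINT_VIDEO, HINT_VR, HINT_COMPUTE };

struct pm_state {
	pthread_mutex_t lock;	/* plays the role of smu_workload_lock */
	enum hint mode;		/* plays the role of workload_mode */
	atomic_int refcount;	/* plays the role of workload_switch_ref */
};

/* Mirrors amdgpu_set_workload_profile(): switch the mode under the lock
 * and take a reference for every job that requested a profile. */
static int set_profile(struct pm_state *pm, enum hint hint)
{
	pthread_mutex_lock(&pm->lock);
	if (pm->mode != hint)
		pm->mode = hint;	/* the dpm switch would happen here */
	atomic_fetch_add(&pm->refcount, 1);
	pthread_mutex_unlock(&pm->lock);
	return 0;
}

/* Mirrors amdgpu_clear_workload_profile(): only the clear that drops the
 * last reference resets the mode, so the profile stays active until the
 * final overlapping job finishes. */
static int clear_profile(struct pm_state *pm, enum hint hint)
{
	if (hint == HINT_NONE)
		return 0;
	/* atomic_fetch_sub returns the old value: old > 1 means another
	 * job still holds a reference, matching atomic_dec_return(...) > 0. */
	if (atomic_fetch_sub(&pm->refcount, 1) > 1)
		return 0;
	pthread_mutex_lock(&pm->lock);
	if (pm->mode == hint)
		pm->mode = HINT_NONE;
	pthread_mutex_unlock(&pm->lock);
	return 0;
}
```

With two overlapping jobs using the same hint, the first clear leaves the profile in place and only the second clear resets it, which is the behavior Shashank describes above.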
* RE: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 7:29 ` Sharma, Shashank @ 2022-09-27 9:29 ` Quan, Evan 2022-09-27 10:00 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Quan, Evan @ 2022-09-27 9:29 UTC (permalink / raw) To: Sharma, Shashank, amd-gfx Cc: Deucher, Alexander, Somalapuram, Amaranath, Koenig, Christian [AMD Official Use Only - General] > -----Original Message----- > From: Sharma, Shashank <Shashank.Sharma@amd.com> > Sent: Tuesday, September 27, 2022 3:30 PM > To: Quan, Evan <Evan.Quan@amd.com>; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Somalapuram, > Amaranath <Amaranath.Somalapuram@amd.com>; Koenig, Christian > <Christian.Koenig@amd.com> > Subject: Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU > power profile > > Hello Evan, > > On 9/27/2022 4:14 AM, Quan, Evan wrote: > > [AMD Official Use Only - General] > > > > > > > >> -----Original Message----- > >> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of > >> Shashank Sharma > >> Sent: Tuesday, September 27, 2022 5:40 AM > >> To: amd-gfx@lists.freedesktop.org > >> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Somalapuram, > >> Amaranath <Amaranath.Somalapuram@amd.com>; Koenig, Christian > >> <Christian.Koenig@amd.com>; Sharma, Shashank > >> <Shashank.Sharma@amd.com> > >> Subject: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU > >> power profile > >> > >> This patch adds new functions which will allow a user to change the > >> GPU power profile based a GPU workload hint flag. 
> >> > >> Cc: Alex Deucher <alexander.deucher@amd.com> > >> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> > >> --- > >> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- > >> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 > >> +++++++++++++++++++ > >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + > >> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ > >> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + > >> 5 files changed, 158 insertions(+), 1 deletion(-) > >> create mode 100644 > >> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > >> create mode 100644 > >> drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > >> > >> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile > >> b/drivers/gpu/drm/amd/amdgpu/Makefile > >> index 5a283d12f8e1..34679c657ecc 100644 > >> --- a/drivers/gpu/drm/amd/amdgpu/Makefile > >> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile > >> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ > >> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ > >> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ > >> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o > >> amdgpu_pll.o \ > >> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ > >> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o > >> amdgpu_ctx_workload.o amdgpu_sync.o \ > >> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o > >> amdgpu_virt.o \ > >> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ > >> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ diff --git > >> a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > >> new file mode 100644 > >> index 000000000000..a11cf29bc388 > >> --- /dev/null > >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > >> @@ -0,0 +1,97 @@ > >> +/* > >> + * Copyright 2022 Advanced Micro Devices, Inc. 
> >> + * > >> + * Permission is hereby granted, free of charge, to any person > >> +obtaining a > >> + * copy of this software and associated documentation files (the > >> "Software"), > >> + * to deal in the Software without restriction, including without > >> + limitation > >> + * the rights to use, copy, modify, merge, publish, distribute, > >> + sublicense, > >> + * and/or sell copies of the Software, and to permit persons to whom > >> + the > >> + * Software is furnished to do so, subject to the following conditions: > >> + * > >> + * The above copyright notice and this permission notice shall be > >> + included in > >> + * all copies or substantial portions of the Software. > >> + * > >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY > KIND, > >> EXPRESS OR > >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > >> MERCHANTABILITY, > >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN > NO > >> EVENT SHALL > >> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, > >> DAMAGES OR > >> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR > >> OTHERWISE, > >> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR > >> THE USE OR > >> + * OTHER DEALINGS IN THE SOFTWARE. 
> >> + * > >> + */ > >> +#include <drm/drm.h> > >> +#include "kgd_pp_interface.h" > >> +#include "amdgpu_ctx_workload.h" > >> + > >> +static enum PP_SMC_POWER_PROFILE > >> +amdgpu_workload_to_power_profile(uint32_t hint) { > >> + switch (hint) { > >> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: > >> + default: > >> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; > >> + > >> + case AMDGPU_CTX_WORKLOAD_HINT_3D: > >> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; > >> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: > >> + return PP_SMC_POWER_PROFILE_VIDEO; > >> + case AMDGPU_CTX_WORKLOAD_HINT_VR: > >> + return PP_SMC_POWER_PROFILE_VR; > >> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: > >> + return PP_SMC_POWER_PROFILE_COMPUTE; > >> + } > >> +} > >> + > >> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, > >> + uint32_t hint) > >> +{ > >> + int ret = 0; > >> + enum PP_SMC_POWER_PROFILE profile = > >> + amdgpu_workload_to_power_profile(hint); > >> + > >> + if (adev->pm.workload_mode == hint) > >> + return 0; > >> + > >> + mutex_lock(&adev->pm.smu_workload_lock); > >> + > >> + if (adev->pm.workload_mode == hint) > >> + goto unlock; > > [Quan, Evan] This seems redundant with code above. I saw you dropped > this in Patch4. > > But I kind of feel this should be the one which needs to be kept. > > Yes, this shuffle happened during the rebase-testing of V3, will update this. 
> > >> + > >> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); > >> + if (!ret) > >> + adev->pm.workload_mode = hint; > >> + atomic_inc(&adev->pm.workload_switch_ref); > >> + > >> +unlock: > >> + mutex_unlock(&adev->pm.smu_workload_lock); > >> + return ret; > >> +} > >> + > >> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, > >> + uint32_t hint) > >> +{ > >> + int ret = 0; > >> + enum PP_SMC_POWER_PROFILE profile = > >> + amdgpu_workload_to_power_profile(hint); > >> + > >> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) > >> + return 0; > >> + > >> + /* Do not reset GPU power profile if another reset is coming */ > >> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) > >> + return 0; > >> + > >> + mutex_lock(&adev->pm.smu_workload_lock); > >> + > >> + if (adev->pm.workload_mode != hint) > >> + goto unlock; > >> + > >> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); > >> + if (!ret) > >> + adev->pm.workload_mode = > >> AMDGPU_CTX_WORKLOAD_HINT_NONE; > >> + > >> +unlock: > >> + mutex_unlock(&adev->pm.smu_workload_lock); > >> + return ret; > >> +} > > [Quan, Evan] Instead of setting to > AMDGPU_CTX_WORKLOAD_HINT_NONE, better to reset it back to original > workload profile mode. > > That can make it compatible with existing sysfs interface which has similar > functionality for setting workload profile mode. > > This API is specifically written to remove any workload profile applied, hense > named as "clear_workload_profile" and the intention is reset. As you can see > in the next patch, the work profile is being set from the job_run and reset > again once the job execution is done. > > If there is another set() in progress, the reference counter takes care of that. > So I would like to keep it this way. [Quan, Evan] What I meant is some case like below: 1. User sets a workload profile mode via sysfs interface (e.g. setting compute mode via "echo 5 > /sys/class/drm/card0/device/pp_power_profile_mode") 2. 
Then a job was launched with a different workload profile mode requested (e.g. 3D_FULL_SCREEN mode). 3. Finally, when the job ends, it is better to switch back to the original compute mode, not just reset it back to NONE. Does that make sense? BR Evan > > - Shashank > > > /** > > * DOC: pp_power_profile_mode > > * > > * The amdgpu driver provides a sysfs API for adjusting the heuristics > > * related to switching between power levels in a power state. The file > > * pp_power_profile_mode is used for this. > > * > > * Reading this file outputs a list of all of the predefined power profiles > > * and the relevant heuristics settings for that profile. > > * > > * To select a profile or create a custom profile, first select manual using > > * power_dpm_force_performance_level. Writing the number of a > predefined > > * profile to pp_power_profile_mode will enable those heuristics. To > > * create a custom set of heuristics, write a string of numbers to the file > > * starting with the number of the custom profile along with a setting > > * for each heuristic parameter. Due to differences across asic families > > * the heuristic parameters vary from family to family. 
> > * > > */ > > > > BR > > Evan > >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >> index be7aff2d4a57..1f0f64662c04 100644 > >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > >> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device > >> *adev, > >> mutex_init(&adev->psp.mutex); > >> mutex_init(&adev->notifier_lock); > >> mutex_init(&adev->pm.stable_pstate_ctx_lock); > >> + mutex_init(&adev->pm.smu_workload_lock); > >> mutex_init(&adev->benchmark_mutex); > >> > >> amdgpu_device_init_apu_flags(adev); > >> diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > >> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > >> new file mode 100644 > >> index 000000000000..6060fc53c3b0 > >> --- /dev/null > >> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > >> @@ -0,0 +1,54 @@ > >> +/* > >> + * Copyright 2022 Advanced Micro Devices, Inc. > >> + * > >> + * Permission is hereby granted, free of charge, to any person > >> +obtaining a > >> + * copy of this software and associated documentation files (the > >> "Software"), > >> + * to deal in the Software without restriction, including without > >> + limitation > >> + * the rights to use, copy, modify, merge, publish, distribute, > >> + sublicense, > >> + * and/or sell copies of the Software, and to permit persons to whom > >> + the > >> + * Software is furnished to do so, subject to the following conditions: > >> + * > >> + * The above copyright notice and this permission notice shall be > >> + included in > >> + * all copies or substantial portions of the Software. > >> + * > >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY > KIND, > >> EXPRESS OR > >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > >> MERCHANTABILITY, > >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN > NO > >> EVENT SHALL > >> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, > >> DAMAGES OR > >> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR > >> OTHERWISE, > >> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR > >> THE USE OR > >> + * OTHER DEALINGS IN THE SOFTWARE. > >> + * > >> + */ > >> +#ifndef _AMDGPU_CTX_WL_H_ > >> +#define _AMDGPU_CTX_WL_H_ > >> +#include <drm/amdgpu_drm.h> > >> +#include "amdgpu.h" > >> + > >> +/* Workload mode names */ > >> +static const char * const amdgpu_workload_mode_name[] = { > >> + "None", > >> + "3D", > >> + "Video", > >> + "VR", > >> + "Compute", > >> + "Unknown", > >> +}; > >> + > >> +static inline const > >> +char *amdgpu_workload_profile_name(uint32_t profile) { > >> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && > >> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) > >> + return > >> > amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile > >> )]; > >> + > >> + return > >> > amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; > >> +} > >> + > >> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, > >> + uint32_t hint); > >> + > >> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, > >> + uint32_t hint); > >> + > >> +#endif > >> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > >> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > >> index 65624d091ed2..565131f789d0 100644 > >> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > >> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > >> @@ -361,6 +361,11 @@ struct amdgpu_pm { > >> struct mutex stable_pstate_ctx_lock; > >> struct amdgpu_ctx *stable_pstate_ctx; > >> > >> + /* SMU workload mode */ > >> + struct mutex smu_workload_lock; > >> + uint32_t workload_mode; > >> + atomic_t workload_switch_ref; > >> + > >> struct config_table_setting config_table; > >> /* runtime mode */ > >> enum amdgpu_runpm_mode rpm_mode; > >> -- > >> 2.34.1 ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 9:29 ` Quan, Evan @ 2022-09-27 10:00 ` Sharma, Shashank 0 siblings, 0 replies; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 10:00 UTC (permalink / raw) To: Quan, Evan, amd-gfx Cc: Deucher, Alexander, Somalapuram, Amaranath, Koenig, Christian On 9/27/2022 11:29 AM, Quan, Evan wrote: > [AMD Official Use Only - General] > > > >> -----Original Message----- >> From: Sharma, Shashank <Shashank.Sharma@amd.com> >> Sent: Tuesday, September 27, 2022 3:30 PM >> To: Quan, Evan <Evan.Quan@amd.com>; amd-gfx@lists.freedesktop.org >> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Somalapuram, >> Amaranath <Amaranath.Somalapuram@amd.com>; Koenig, Christian >> <Christian.Koenig@amd.com> >> Subject: Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU >> power profile >> >> Hello Evan, >> >> On 9/27/2022 4:14 AM, Quan, Evan wrote: >>> [AMD Official Use Only - General] >>> >>> >>> >>>> -----Original Message----- >>>> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of >>>> Shashank Sharma >>>> Sent: Tuesday, September 27, 2022 5:40 AM >>>> To: amd-gfx@lists.freedesktop.org >>>> Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Somalapuram, >>>> Amaranath <Amaranath.Somalapuram@amd.com>; Koenig, Christian >>>> <Christian.Koenig@amd.com>; Sharma, Shashank >>>> <Shashank.Sharma@amd.com> >>>> Subject: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU >>>> power profile >>>> >>>> This patch adds new functions which will allow a user to change the >>>> GPU power profile based on a GPU workload hint flag. 
>>>> >>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>> --- >>>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >>>> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 >>>> +++++++++++++++++++ >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >>>> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >>>> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >>>> 5 files changed, 158 insertions(+), 1 deletion(-) >>>> create mode 100644 >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> create mode 100644 >>>> drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >>>> b/drivers/gpu/drm/amd/amdgpu/Makefile >>>> index 5a283d12f8e1..34679c657ecc 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >>>> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >>>> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >>>> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >>>> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >>>> amdgpu_pll.o \ >>>> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ >>>> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>> amdgpu_ctx_workload.o amdgpu_sync.o \ >>>> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >>>> amdgpu_virt.o \ >>>> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >>>> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ diff --git >>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> new file mode 100644 >>>> index 000000000000..a11cf29bc388 >>>> --- /dev/null >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> @@ -0,0 +1,97 @@ >>>> +/* >>>> + * Copyright 2022 Advanced Micro Devices, Inc. 
>>>> + * >>>> + * Permission is hereby granted, free of charge, to any person >>>> +obtaining a >>>> + * copy of this software and associated documentation files (the >>>> "Software"), >>>> + * to deal in the Software without restriction, including without >>>> + limitation >>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>> + sublicense, >>>> + * and/or sell copies of the Software, and to permit persons to whom >>>> + the >>>> + * Software is furnished to do so, subject to the following conditions: >>>> + * >>>> + * The above copyright notice and this permission notice shall be >>>> + included in >>>> + * all copies or substantial portions of the Software. >>>> + * >>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >> KIND, >>>> EXPRESS OR >>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>> MERCHANTABILITY, >>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN >> NO >>>> EVENT SHALL >>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>>> DAMAGES OR >>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>> OTHERWISE, >>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >>>> THE USE OR >>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>> + * >>>> + */ >>>> +#include <drm/drm.h> >>>> +#include "kgd_pp_interface.h" >>>> +#include "amdgpu_ctx_workload.h" >>>> + >>>> +static enum PP_SMC_POWER_PROFILE >>>> +amdgpu_workload_to_power_profile(uint32_t hint) { >>>> + switch (hint) { >>>> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >>>> + default: >>>> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >>>> + >>>> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >>>> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >>>> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >>>> + return PP_SMC_POWER_PROFILE_VIDEO; >>>> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >>>> + return PP_SMC_POWER_PROFILE_VR; >>>> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >>>> + return PP_SMC_POWER_PROFILE_COMPUTE; >>>> + } >>>> +} >>>> + >>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>> + uint32_t hint) >>>> +{ >>>> + int ret = 0; >>>> + enum PP_SMC_POWER_PROFILE profile = >>>> + amdgpu_workload_to_power_profile(hint); >>>> + >>>> + if (adev->pm.workload_mode == hint) >>>> + return 0; >>>> + >>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>> + >>>> + if (adev->pm.workload_mode == hint) >>>> + goto unlock; >>> [Quan, Evan] This seems redundant with code above. I saw you dropped >> this in Patch4. >>> But I kind of feel this should be the one which needs to be kept. >> >> Yes, this shuffle happened during the rebase-testing of V3, will update this. 
>> >>>> + >>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>>> + if (!ret) >>>> + adev->pm.workload_mode = hint; >>>> + atomic_inc(&adev->pm.workload_switch_ref); >>>> + >>>> +unlock: >>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>> + return ret; >>>> +} >>>> + >>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>> + uint32_t hint) >>>> +{ >>>> + int ret = 0; >>>> + enum PP_SMC_POWER_PROFILE profile = >>>> + amdgpu_workload_to_power_profile(hint); >>>> + >>>> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >>>> + return 0; >>>> + >>>> + /* Do not reset GPU power profile if another reset is coming */ >>>> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >>>> + return 0; >>>> + >>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>> + >>>> + if (adev->pm.workload_mode != hint) >>>> + goto unlock; >>>> + >>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >>>> + if (!ret) >>>> + adev->pm.workload_mode = >>>> AMDGPU_CTX_WORKLOAD_HINT_NONE; >>>> + >>>> +unlock: >>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>> + return ret; >>>> +} >>> [Quan, Evan] Instead of setting to >> AMDGPU_CTX_WORKLOAD_HINT_NONE, better to reset it back to original >> workload profile mode. >>> That can make it compatible with existing sysfs interface which has similar >> functionality for setting workload profile mode. >> >> This API is specifically written to remove any workload profile applied, hense >> named as "clear_workload_profile" and the intention is reset. As you can see >> in the next patch, the work profile is being set from the job_run and reset >> again once the job execution is done. >> >> If there is another set() in progress, the reference counter takes care of that. >> So I would like to keep it this way. > [Quan, Evan] What I meant is some case like below: > 1. User sets a workload profile mode via sysfs interface (e.g. setting compute mode via "echo 5 > /sys/class/drm/card0/device/pp_power_profile_mode") > 2. 
Then a job was launched with a different workload profile mode requested(e.g. 3D_FULL_SCREEN mode). > 3. Finally on the job ended, better to switch back to original compute mode, not just reset it back to NONE. Does that make sense? > > BR > Evan To be honest, once we have a proper UAPI to set the power profile, we should not use a sysfs interface at all (or use it mostly for debug purposes). Also I am not sure if you can read back the current power profile from FW/HW, can you ? - Shashank >> >> - Shashank >> >>> /** >>> * DOC: pp_power_profile_mode >>> * >>> * The amdgpu driver provides a sysfs API for adjusting the heuristics >>> * related to switching between power levels in a power state. The file >>> * pp_power_profile_mode is used for this. >>> * >>> * Reading this file outputs a list of all of the predefined power profiles >>> * and the relevant heuristics settings for that profile. >>> * >>> * To select a profile or create a custom profile, first select manual using >>> * power_dpm_force_performance_level. Writing the number of a >> predefined >>> * profile to pp_power_profile_mode will enable those heuristics. To >>> * create a custom set of heuristics, write a string of numbers to the file >>> * starting with the number of the custom profile along with a setting >>> * for each heuristic parameter. Due to differences across asic families >>> * the heuristic parameters vary from family to family. 
>>> * >>> */ >>> >>> BR >>> Evan >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>> index be7aff2d4a57..1f0f64662c04 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device >>>> *adev, >>>> mutex_init(&adev->psp.mutex); >>>> mutex_init(&adev->notifier_lock); >>>> mutex_init(&adev->pm.stable_pstate_ctx_lock); >>>> + mutex_init(&adev->pm.smu_workload_lock); >>>> mutex_init(&adev->benchmark_mutex); >>>> >>>> amdgpu_device_init_apu_flags(adev); >>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>> new file mode 100644 >>>> index 000000000000..6060fc53c3b0 >>>> --- /dev/null >>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>> @@ -0,0 +1,54 @@ >>>> +/* >>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>> + * >>>> + * Permission is hereby granted, free of charge, to any person >>>> +obtaining a >>>> + * copy of this software and associated documentation files (the >>>> "Software"), >>>> + * to deal in the Software without restriction, including without >>>> + limitation >>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>> + sublicense, >>>> + * and/or sell copies of the Software, and to permit persons to whom >>>> + the >>>> + * Software is furnished to do so, subject to the following conditions: >>>> + * >>>> + * The above copyright notice and this permission notice shall be >>>> + included in >>>> + * all copies or substantial portions of the Software. >>>> + * >>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >> KIND, >>>> EXPRESS OR >>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>> MERCHANTABILITY, >>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN >> NO >>>> EVENT SHALL >>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>>> DAMAGES OR >>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>> OTHERWISE, >>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >>>> THE USE OR >>>> + * OTHER DEALINGS IN THE SOFTWARE. >>>> + * >>>> + */ >>>> +#ifndef _AMDGPU_CTX_WL_H_ >>>> +#define _AMDGPU_CTX_WL_H_ >>>> +#include <drm/amdgpu_drm.h> >>>> +#include "amdgpu.h" >>>> + >>>> +/* Workload mode names */ >>>> +static const char * const amdgpu_workload_mode_name[] = { >>>> + "None", >>>> + "3D", >>>> + "Video", >>>> + "VR", >>>> + "Compute", >>>> + "Unknown", >>>> +}; >>>> + >>>> +static inline const >>>> +char *amdgpu_workload_profile_name(uint32_t profile) { >>>> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >>>> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >>>> + return >>>> >> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile >>>> )]; >>>> + >>>> + return >>>> >> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >>>> +} >>>> + >>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>> + uint32_t hint); >>>> + >>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>> + uint32_t hint); >>>> + >>>> +#endif >>>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>> index 65624d091ed2..565131f789d0 100644 >>>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>> @@ -361,6 +361,11 @@ struct amdgpu_pm { >>>> struct mutex stable_pstate_ctx_lock; >>>> struct amdgpu_ctx *stable_pstate_ctx; >>>> >>>> + /* SMU workload mode */ >>>> + struct mutex smu_workload_lock; >>>> + uint32_t workload_mode; >>>> + atomic_t workload_switch_ref; >>>> + >>>> struct config_table_setting config_table; >>>> /* runtime mode */ >>>> enum amdgpu_runpm_mode rpm_mode; >>>> -- >>>> 2.34.1 ^ permalink raw reply [flat|nested] 76+ messages in thread
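[Editor's note] The `amdgpu_workload_profile_name()` helper quoted repeatedly above bounds-checks the hint before indexing the mode-name table, falling back to a trailing "Unknown" entry. A standalone sketch of that lookup pattern is below; the `AMDGPU_CTX_WORKLOAD_INDEX()` macro is not shown in the thread, so direct indexing is assumed here, and all names are illustrative:

```c
#include <string.h>

/* Illustrative copy of the mode-name table from the header; the last
 * entry is the fallback for out-of-range hints. */
static const char * const mode_name[] = {
	"None", "3D", "Video", "VR", "Compute", "Unknown",
};

#define HINT_MAX 5U	/* index of the "Unknown" fallback entry */

/* Same guard as amdgpu_workload_profile_name(): any hint outside the
 * valid range maps to "Unknown" instead of reading past the end of the
 * table (the >= NONE half of the check is implicit for unsigned hint). */
static const char *profile_name(unsigned int hint)
{
	if (hint < HINT_MAX)
		return mode_name[hint];
	return mode_name[HINT_MAX];
}
```

Keeping the fallback inside the table rather than returning NULL means callers can always pass the result straight to printk-style logging without a NULL check.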
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-26 21:40 ` [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile Shashank Sharma 2022-09-27 2:14 ` Quan, Evan @ 2022-09-27 6:08 ` Christian König 2022-09-27 9:58 ` Lazar, Lijo 2022-09-27 15:20 ` Felix Kuehling 3 siblings, 0 replies; 76+ messages in thread From: Christian König @ 2022-09-27 6:08 UTC (permalink / raw) To: Shashank Sharma, amd-gfx; +Cc: alexander.deucher, amaranath.somalapuram Am 26.09.22 um 23:40 schrieb Shashank Sharma: > This patch adds new functions which will allow a user to > change the GPU power profile based a GPU workload hint > flag. > > Cc: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Alex needs to take a closer look at this stuff, but feel free to add my acked-by. Christian. > --- > drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- > .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 +++++++++++++++++++ > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + > .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ > drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + > 5 files changed, 158 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > create mode 100644 drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile > index 5a283d12f8e1..34679c657ecc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/Makefile > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile > @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ > atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ > atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ > amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o amdgpu_pll.o \ > - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ > + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_ctx_workload.o amdgpu_sync.o \ > amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o 
amdgpu_vram_mgr.o amdgpu_virt.o \ > amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ > amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > new file mode 100644 > index 000000000000..a11cf29bc388 > --- /dev/null > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > @@ -0,0 +1,97 @@ > +/* > + * Copyright 2022 Advanced Micro Devices, Inc. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. 
> + * > + */ > +#include <drm/drm.h> > +#include "kgd_pp_interface.h" > +#include "amdgpu_ctx_workload.h" > + > +static enum PP_SMC_POWER_PROFILE > +amdgpu_workload_to_power_profile(uint32_t hint) > +{ > + switch (hint) { > + case AMDGPU_CTX_WORKLOAD_HINT_NONE: > + default: > + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; > + > + case AMDGPU_CTX_WORKLOAD_HINT_3D: > + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; > + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: > + return PP_SMC_POWER_PROFILE_VIDEO; > + case AMDGPU_CTX_WORKLOAD_HINT_VR: > + return PP_SMC_POWER_PROFILE_VR; > + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: > + return PP_SMC_POWER_PROFILE_COMPUTE; > + } > +} > + > +int amdgpu_set_workload_profile(struct amdgpu_device *adev, > + uint32_t hint) > +{ > + int ret = 0; > + enum PP_SMC_POWER_PROFILE profile = > + amdgpu_workload_to_power_profile(hint); > + > + if (adev->pm.workload_mode == hint) > + return 0; > + > + mutex_lock(&adev->pm.smu_workload_lock); > + > + if (adev->pm.workload_mode == hint) > + goto unlock; > + > + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); > + if (!ret) > + adev->pm.workload_mode = hint; > + atomic_inc(&adev->pm.workload_switch_ref); > + > +unlock: > + mutex_unlock(&adev->pm.smu_workload_lock); > + return ret; > +} > + > +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, > + uint32_t hint) > +{ > + int ret = 0; > + enum PP_SMC_POWER_PROFILE profile = > + amdgpu_workload_to_power_profile(hint); > + > + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) > + return 0; > + > + /* Do not reset GPU power profile if another reset is coming */ > + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) > + return 0; > + > + mutex_lock(&adev->pm.smu_workload_lock); > + > + if (adev->pm.workload_mode != hint) > + goto unlock; > + > + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); > + if (!ret) > + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; > + > +unlock: > + mutex_unlock(&adev->pm.smu_workload_lock); > + return ret; > 
+} > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index be7aff2d4a57..1f0f64662c04 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, > mutex_init(&adev->psp.mutex); > mutex_init(&adev->notifier_lock); > mutex_init(&adev->pm.stable_pstate_ctx_lock); > + mutex_init(&adev->pm.smu_workload_lock); > mutex_init(&adev->benchmark_mutex); > > amdgpu_device_init_apu_flags(adev); > diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > new file mode 100644 > index 000000000000..6060fc53c3b0 > --- /dev/null > +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > @@ -0,0 +1,54 @@ > +/* > + * Copyright 2022 Advanced Micro Devices, Inc. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. > + * > + */ > +#ifndef _AMDGPU_CTX_WL_H_ > +#define _AMDGPU_CTX_WL_H_ > +#include <drm/amdgpu_drm.h> > +#include "amdgpu.h" > + > +/* Workload mode names */ > +static const char * const amdgpu_workload_mode_name[] = { > + "None", > + "3D", > + "Video", > + "VR", > + "Compute", > + "Unknown", > +}; > + > +static inline const > +char *amdgpu_workload_profile_name(uint32_t profile) > +{ > + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && > + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) > + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; > + > + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; > +} > + > +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, > + uint32_t hint); > + > +int amdgpu_set_workload_profile(struct amdgpu_device *adev, > + uint32_t hint); > + > +#endif > diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > index 65624d091ed2..565131f789d0 100644 > --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > @@ -361,6 +361,11 @@ struct amdgpu_pm { > struct mutex stable_pstate_ctx_lock; > struct amdgpu_ctx *stable_pstate_ctx; > > + /* SMU workload mode */ > + struct mutex smu_workload_lock; > + uint32_t workload_mode; > + atomic_t workload_switch_ref; > + > struct config_table_setting config_table; > /* runtime mode */ > enum amdgpu_runpm_mode rpm_mode; ^ permalink raw reply [flat|nested] 76+ messages in thread
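[For readers following the patch: the hint-to-profile mapping above is a plain switch statement, with the NONE hint doubling as the fallback for unknown values. The standalone C sketch below models only that mapping shape; the enum values here are stand-ins, since the real hints come from amdgpu_drm.h and the real PP_SMC_POWER_PROFILE_* values from kgd_pp_interface.h.]

```c
#include <stdint.h>

/* Stand-in enums: names and values are illustrative, not the real UAPI. */
enum wl_hint { WL_NONE, WL_3D, WL_VIDEO, WL_VR, WL_COMPUTE };
enum pp_profile {
	PP_BOOTUP_DEFAULT,
	PP_FULLSCREEN3D,
	PP_VIDEO,
	PP_VR,
	PP_COMPUTE,
};

/* Mirrors amdgpu_workload_to_power_profile(): each workload hint maps to
 * one power profile, and anything unrecognized falls back to the bootup
 * default rather than failing. */
static enum pp_profile wl_to_profile(uint32_t hint)
{
	switch (hint) {
	case WL_3D:      return PP_FULLSCREEN3D;
	case WL_VIDEO:   return PP_VIDEO;
	case WL_VR:      return PP_VR;
	case WL_COMPUTE: return PP_COMPUTE;
	case WL_NONE:
	default:         return PP_BOOTUP_DEFAULT;
	}
}
```

The fall-through of NONE into default means a malformed hint from userspace can never select an undefined profile.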
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-26 21:40 ` [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile Shashank Sharma 2022-09-27 2:14 ` Quan, Evan 2022-09-27 6:08 ` Christian König @ 2022-09-27 9:58 ` Lazar, Lijo 2022-09-27 11:41 ` Sharma, Shashank 2022-09-27 15:20 ` Felix Kuehling 3 siblings, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-27 9:58 UTC (permalink / raw) To: Shashank Sharma, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 3:10 AM, Shashank Sharma wrote: > This patch adds new functions which will allow a user to > change the GPU power profile based a GPU workload hint > flag. > > Cc: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- > .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 +++++++++++++++++++ > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + > .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ > drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + > 5 files changed, 158 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > create mode 100644 drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile > index 5a283d12f8e1..34679c657ecc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/Makefile > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile > @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ > atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ > atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ > amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o amdgpu_pll.o \ > - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ > + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_ctx_workload.o amdgpu_sync.o \ > amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o \ > 
amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ > amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > new file mode 100644 > index 000000000000..a11cf29bc388 > --- /dev/null > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > @@ -0,0 +1,97 @@ > +/* > + * Copyright 2022 Advanced Micro Devices, Inc. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. 
> + * > + */ > +#include <drm/drm.h> > +#include "kgd_pp_interface.h" > +#include "amdgpu_ctx_workload.h" > + > +static enum PP_SMC_POWER_PROFILE > +amdgpu_workload_to_power_profile(uint32_t hint) > +{ > + switch (hint) { > + case AMDGPU_CTX_WORKLOAD_HINT_NONE: > + default: > + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; > + > + case AMDGPU_CTX_WORKLOAD_HINT_3D: > + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; > + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: > + return PP_SMC_POWER_PROFILE_VIDEO; > + case AMDGPU_CTX_WORKLOAD_HINT_VR: > + return PP_SMC_POWER_PROFILE_VR; > + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: > + return PP_SMC_POWER_PROFILE_COMPUTE; > + } > +} > + > +int amdgpu_set_workload_profile(struct amdgpu_device *adev, > + uint32_t hint) > +{ > + int ret = 0; > + enum PP_SMC_POWER_PROFILE profile = > + amdgpu_workload_to_power_profile(hint); > + > + if (adev->pm.workload_mode == hint) > + return 0; > + > + mutex_lock(&adev->pm.smu_workload_lock); If it's all about pm subsystem variable accesses, this API should rather be inside the amd/pm subsystem. No need to expose the variable outside the pm subsystem. Also, all amdgpu_dpm* calls are currently protected under one mutex, so this extra lock won't be needed. > + > + if (adev->pm.workload_mode == hint) > + goto unlock; > + > + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); > + if (!ret) > + adev->pm.workload_mode = hint; > + atomic_inc(&adev->pm.workload_switch_ref); Why is this reference kept? The switching happens inside a lock, and there is already a check not to switch if the hint matches the current workload.
Thanks, Lijo > + > +unlock: > + mutex_unlock(&adev->pm.smu_workload_lock); > + return ret; > +} > + > +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, > + uint32_t hint) > +{ > + int ret = 0; > + enum PP_SMC_POWER_PROFILE profile = > + amdgpu_workload_to_power_profile(hint); > + > + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) > + return 0; > + > + /* Do not reset GPU power profile if another reset is coming */ > + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) > + return 0; > + > + mutex_lock(&adev->pm.smu_workload_lock); > + > + if (adev->pm.workload_mode != hint) > + goto unlock; > + > + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); > + if (!ret) > + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; > + > +unlock: > + mutex_unlock(&adev->pm.smu_workload_lock); > + return ret; > +} > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index be7aff2d4a57..1f0f64662c04 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, > mutex_init(&adev->psp.mutex); > mutex_init(&adev->notifier_lock); > mutex_init(&adev->pm.stable_pstate_ctx_lock); > + mutex_init(&adev->pm.smu_workload_lock); > mutex_init(&adev->benchmark_mutex); > > amdgpu_device_init_apu_flags(adev); > diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > new file mode 100644 > index 000000000000..6060fc53c3b0 > --- /dev/null > +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > @@ -0,0 +1,54 @@ > +/* > + * Copyright 2022 Advanced Micro Devices, Inc. 
> + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. 
> + * > + */ > +#ifndef _AMDGPU_CTX_WL_H_ > +#define _AMDGPU_CTX_WL_H_ > +#include <drm/amdgpu_drm.h> > +#include "amdgpu.h" > + > +/* Workload mode names */ > +static const char * const amdgpu_workload_mode_name[] = { > + "None", > + "3D", > + "Video", > + "VR", > + "Compute", > + "Unknown", > +}; > + > +static inline const > +char *amdgpu_workload_profile_name(uint32_t profile) > +{ > + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && > + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) > + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; > + > + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; > +} > + > +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, > + uint32_t hint); > + > +int amdgpu_set_workload_profile(struct amdgpu_device *adev, > + uint32_t hint); > + > +#endif > diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > index 65624d091ed2..565131f789d0 100644 > --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > @@ -361,6 +361,11 @@ struct amdgpu_pm { > struct mutex stable_pstate_ctx_lock; > struct amdgpu_ctx *stable_pstate_ctx; > > + /* SMU workload mode */ > + struct mutex smu_workload_lock; > + uint32_t workload_mode; > + atomic_t workload_switch_ref; > + > struct config_table_setting config_table; > /* runtime mode */ > enum amdgpu_runpm_mode rpm_mode; > ^ permalink raw reply [flat|nested] 76+ messages in thread
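[The locking pattern Lijo is commenting on is a check / lock / re-check: the unlocked comparison is only a fast path, and the decision must be repeated under the mutex because another thread may have switched the mode in between. A minimal userspace model of that control flow, using a pthread mutex in place of the kernel mutex; all names here are hypothetical and the "SMU switch" is reduced to a counter:]

```c
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t mode_lock = PTHREAD_MUTEX_INITIALIZER;
static uint32_t cur_mode;
static int switch_calls;  /* counts expensive profile-switch operations */

/* Models amdgpu_set_workload_profile()'s flow: the first check skips the
 * lock entirely on the common no-change path; the re-check under the lock
 * is the authoritative one, so a racing caller cannot trigger a second
 * redundant switch. */
static int set_mode(uint32_t hint)
{
	if (cur_mode == hint)            /* fast path, advisory only */
		return 0;

	pthread_mutex_lock(&mode_lock);
	if (cur_mode != hint) {          /* authoritative re-check */
		switch_calls++;          /* stands in for amdgpu_dpm_switch_power_profile() */
		cur_mode = hint;
	}
	pthread_mutex_unlock(&mode_lock);
	return 0;
}
```

Lijo's point is that if the dpm_* entry points already serialize internally, the outer mutex in this sketch becomes redundant; the sketch only shows why the patch re-checks after taking its lock.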
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 9:58 ` Lazar, Lijo @ 2022-09-27 11:41 ` Sharma, Shashank 2022-09-27 12:10 ` Lazar, Lijo 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 11:41 UTC (permalink / raw) To: Lazar, Lijo, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 11:58 AM, Lazar, Lijo wrote: > > > On 9/27/2022 3:10 AM, Shashank Sharma wrote: >> This patch adds new functions which will allow a user to >> change the GPU power profile based a GPU workload hint >> flag. >> >> Cc: Alex Deucher <alexander.deucher@amd.com> >> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 +++++++++++++++++++ >> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >> 5 files changed, 158 insertions(+), 1 deletion(-) >> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> create mode 100644 drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >> b/drivers/gpu/drm/amd/amdgpu/Makefile >> index 5a283d12f8e1..34679c657ecc 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >> amdgpu_pll.o \ >> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ >> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >> amdgpu_ctx_workload.o amdgpu_sync.o \ >> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >> amdgpu_virt.o \ >> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >> amdgpu_debugfs.o amdgpu_ids.o 
amdgpu_gmc.o \ >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> new file mode 100644 >> index 000000000000..a11cf29bc388 >> --- /dev/null >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> @@ -0,0 +1,97 @@ >> +/* >> + * Copyright 2022 Advanced Micro Devices, Inc. >> + * >> + * Permission is hereby granted, free of charge, to any person >> obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without >> limitation >> + * the rights to use, copy, modify, merge, publish, distribute, >> sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice shall be >> included in >> + * all copies or substantial portions of the Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT >> SHALL >> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >> DAMAGES OR >> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR >> + * OTHER DEALINGS IN THE SOFTWARE. 
>> + * >> + */ >> +#include <drm/drm.h> >> +#include "kgd_pp_interface.h" >> +#include "amdgpu_ctx_workload.h" >> + >> +static enum PP_SMC_POWER_PROFILE >> +amdgpu_workload_to_power_profile(uint32_t hint) >> +{ >> + switch (hint) { >> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >> + default: >> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >> + >> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >> + return PP_SMC_POWER_PROFILE_VIDEO; >> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >> + return PP_SMC_POWER_PROFILE_VR; >> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >> + return PP_SMC_POWER_PROFILE_COMPUTE; >> + } >> +} >> + >> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >> + uint32_t hint) >> +{ >> + int ret = 0; >> + enum PP_SMC_POWER_PROFILE profile = >> + amdgpu_workload_to_power_profile(hint); >> + >> + if (adev->pm.workload_mode == hint) >> + return 0; >> + >> + mutex_lock(&adev->pm.smu_workload_lock); > > If it's all about pm subsystem variable accesses, this API should rather > be inside the amd/pm subsystem. No need to expose the variable outside > the pm subsystem. Also, all amdgpu_dpm* calls are currently protected under one > mutex, so this extra lock won't be needed. > This is tricky; this is not all about the PM subsystem. Note that job > management and scheduling are handled in amdgpu_ctx, so the workload > hint is set in the context-management API. The API is consumed when the job > is actually run from the amdgpu_run() layer. So it's a joint interface > between context and PM. >> + >> + if (adev->pm.workload_mode == hint) >> + goto unlock; >> + >> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >> + if (!ret) >> + adev->pm.workload_mode = hint; >> + atomic_inc(&adev->pm.workload_switch_ref); > > Why is this reference kept? The switching happens inside a lock, and > there is already a check not to switch if the hint matches the current > workload.
> This reference is kept so that we would not reset the PM mode to DEFAULT > when some other context has switched the PP mode. If you see the 4th > patch, the PM mode will be changed when the job in that context is run, > and a pm_reset function will be scheduled when the job is done. But in > between, if another job from another context has changed the PM mode, the > reference count will prevent us from resetting the PM mode. > - Shashank > > Thanks, > > Lijo > > >> + >> +unlock: >> + mutex_unlock(&adev->pm.smu_workload_lock); >> + return ret; >> +} >> + >> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >> + uint32_t hint) >> +{ >> + int ret = 0; >> + enum PP_SMC_POWER_PROFILE profile = >> + amdgpu_workload_to_power_profile(hint); >> + >> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >> + return 0; >> + >> + /* Do not reset GPU power profile if another reset is coming */ >> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >> + return 0; >> + >> + mutex_lock(&adev->pm.smu_workload_lock); >> + >> + if (adev->pm.workload_mode != hint) >> + goto unlock; >> + >> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >> + if (!ret) >> + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >> + >> +unlock: >> + mutex_unlock(&adev->pm.smu_workload_lock); >> + return ret; >> +} >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> index be7aff2d4a57..1f0f64662c04 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, >> mutex_init(&adev->psp.mutex); >> mutex_init(&adev->notifier_lock); >> mutex_init(&adev->pm.stable_pstate_ctx_lock); >> + mutex_init(&adev->pm.smu_workload_lock); >> mutex_init(&adev->benchmark_mutex); >> amdgpu_device_init_apu_flags(adev); >> diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h
>> new file mode 100644 >> index 000000000000..6060fc53c3b0 >> --- /dev/null >> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >> @@ -0,0 +1,54 @@ >> +/* >> + * Copyright 2022 Advanced Micro Devices, Inc. >> + * >> + * Permission is hereby granted, free of charge, to any person >> obtaining a >> + * copy of this software and associated documentation files (the >> "Software"), >> + * to deal in the Software without restriction, including without >> limitation >> + * the rights to use, copy, modify, merge, publish, distribute, >> sublicense, >> + * and/or sell copies of the Software, and to permit persons to whom the >> + * Software is furnished to do so, subject to the following conditions: >> + * >> + * The above copyright notice and this permission notice shall be >> included in >> + * all copies or substantial portions of the Software. >> + * >> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >> EXPRESS OR >> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >> MERCHANTABILITY, >> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT >> SHALL >> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >> DAMAGES OR >> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, >> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR >> + * OTHER DEALINGS IN THE SOFTWARE. 
>> + * >> + */ >> +#ifndef _AMDGPU_CTX_WL_H_ >> +#define _AMDGPU_CTX_WL_H_ >> +#include <drm/amdgpu_drm.h> >> +#include "amdgpu.h" >> + >> +/* Workload mode names */ >> +static const char * const amdgpu_workload_mode_name[] = { >> + "None", >> + "3D", >> + "Video", >> + "VR", >> + "Compute", >> + "Unknown", >> +}; >> + >> +static inline const >> +char *amdgpu_workload_profile_name(uint32_t profile) >> +{ >> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >> + return >> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; >> + >> + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >> +} >> + >> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >> + uint32_t hint); >> + >> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >> + uint32_t hint); >> + >> +#endif >> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >> index 65624d091ed2..565131f789d0 100644 >> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >> @@ -361,6 +361,11 @@ struct amdgpu_pm { >> struct mutex stable_pstate_ctx_lock; >> struct amdgpu_ctx *stable_pstate_ctx; >> + /* SMU workload mode */ >> + struct mutex smu_workload_lock; >> + uint32_t workload_mode; >> + atomic_t workload_switch_ref; >> + >> struct config_table_setting config_table; >> /* runtime mode */ >> enum amdgpu_runpm_mode rpm_mode; >> ^ permalink raw reply [flat|nested] 76+ messages in thread
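[Sharma's explanation of the reference count can be made concrete with a small model: each job takes a reference when it applies a hint and drops it when it finishes, and the profile is reset back to NONE only when the last reference goes away, so an overlapping job from another context keeps its profile. A simplified single-threaded sketch; the real code pairs this with the mutex shown in the patch, and the names below are illustrative:]

```c
#include <stdint.h>

enum { HINT_NONE = 0, HINT_COMPUTE = 4 };  /* stand-in hint values */

static uint32_t mode = HINT_NONE;
static int refcount;

/* Job start: take a reference and apply the hint if it is not active. */
static void job_begin(uint32_t hint)
{
	refcount++;
	if (mode != hint)
		mode = hint;      /* stands in for the profile switch */
}

/* Job end: only the last job to finish resets the profile. While other
 * jobs still hold references, the mode is left untouched. */
static void job_end(uint32_t hint)
{
	if (--refcount > 0)
		return;           /* another in-flight job still needs it */
	if (mode == hint)
		mode = HINT_NONE;
}
```

This also illustrates Lijo's follow-up objection: the scheme only composes cleanly when overlapping jobs request the same mode; with different modes, the last writer wins and the earlier job's hint is silently overridden.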
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 11:41 ` Sharma, Shashank @ 2022-09-27 12:10 ` Lazar, Lijo 2022-09-27 12:23 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-27 12:10 UTC (permalink / raw) To: Sharma, Shashank, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 5:11 PM, Sharma, Shashank wrote: > > > On 9/27/2022 11:58 AM, Lazar, Lijo wrote: >> >> >> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>> This patch adds new functions which will allow a user to >>> change the GPU power profile based a GPU workload hint >>> flag. >>> >>> Cc: Alex Deucher <alexander.deucher@amd.com> >>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>> --- >>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >>> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 +++++++++++++++++++ >>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >>> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >>> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >>> 5 files changed, 158 insertions(+), 1 deletion(-) >>> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>> create mode 100644 drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >>> b/drivers/gpu/drm/amd/amdgpu/Makefile >>> index 5a283d12f8e1..34679c657ecc 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >>> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >>> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >>> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >>> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >>> amdgpu_pll.o \ >>> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ >>> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>> amdgpu_ctx_workload.o amdgpu_sync.o \ >>> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >>> amdgpu_virt.o \ >>> 
amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >>> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>> new file mode 100644 >>> index 000000000000..a11cf29bc388 >>> --- /dev/null >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>> @@ -0,0 +1,97 @@ >>> +/* >>> + * Copyright 2022 Advanced Micro Devices, Inc. >>> + * >>> + * Permission is hereby granted, free of charge, to any person >>> obtaining a >>> + * copy of this software and associated documentation files (the >>> "Software"), >>> + * to deal in the Software without restriction, including without >>> limitation >>> + * the rights to use, copy, modify, merge, publish, distribute, >>> sublicense, >>> + * and/or sell copies of the Software, and to permit persons to whom >>> the >>> + * Software is furnished to do so, subject to the following conditions: >>> + * >>> + * The above copyright notice and this permission notice shall be >>> included in >>> + * all copies or substantial portions of the Software. >>> + * >>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >>> EXPRESS OR >>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>> MERCHANTABILITY, >>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >>> EVENT SHALL >>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>> DAMAGES OR >>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>> OTHERWISE, >>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE >>> USE OR >>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>> + * >>> + */ >>> +#include <drm/drm.h> >>> +#include "kgd_pp_interface.h" >>> +#include "amdgpu_ctx_workload.h" >>> + >>> +static enum PP_SMC_POWER_PROFILE >>> +amdgpu_workload_to_power_profile(uint32_t hint) >>> +{ >>> + switch (hint) { >>> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >>> + default: >>> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >>> + >>> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >>> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >>> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >>> + return PP_SMC_POWER_PROFILE_VIDEO; >>> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >>> + return PP_SMC_POWER_PROFILE_VR; >>> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >>> + return PP_SMC_POWER_PROFILE_COMPUTE; >>> + } >>> +} >>> + >>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>> + uint32_t hint) >>> +{ >>> + int ret = 0; >>> + enum PP_SMC_POWER_PROFILE profile = >>> + amdgpu_workload_to_power_profile(hint); >>> + >>> + if (adev->pm.workload_mode == hint) >>> + return 0; >>> + >>> + mutex_lock(&adev->pm.smu_workload_lock); >> >> If it's all about pm subsystem variable accesses, this API should >> rather be inside amd/pm subsystem. No need to expose the variable >> outside pm subsytem. Also currently all amdgpu_dpm* calls are >> protected under one mutex. Then this extra lock won't be needed. >> > > This is tricky, this is not all about PM subsystem. Note that the job > management and scheduling is handled into amdgpu_ctx, so the workload > hint is set in context_management API. The API is consumed when the job > is actually run from amdgpu_run() layer. So its a joint interface > between context and PM. > If you take out amdgpu_workload_to_power_profile() line, everything else looks to touch only pm variables/functions. You could still keep a wrapper though. Also dpm_* functions are protected, so the extra mutex can be avoided as well. 
>>> + >>> + if (adev->pm.workload_mode == hint) >>> + goto unlock; >>> + >>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>> + if (!ret) >>> + adev->pm.workload_mode = hint; >>> + atomic_inc(&adev->pm.workload_switch_ref); >> >> Why is this reference kept? The switching happens inside a lock, and >> there is already a check not to switch if the hint matches >> the current workload. >> > > This reference is kept so that we would not reset the PM mode to DEFAULT > when some other context has switched the PP mode. If you see the 4th > patch, the PM mode will be changed when the job in that context is run, > and a pm_reset function will be scheduled when the job is done. But in > between, if another job from another context has changed the PM mode, the > reference count will prevent us from resetting the PM mode. > This helps only if multiple jobs request the same mode. If they request different modes, then it does not help much. It could be useful for profiling some apps under the assumption of exclusive access. However, in general, the API is not reliable from a user's point of view, as the requested mode can be overridden by some other job. A better thing to do, then, is to document that and avoid the extra machinery around it.
Thanks, Lijo > - Shashank > >> Thanks, >> Lijo >> >>> + >>> +unlock: >>> + mutex_unlock(&adev->pm.smu_workload_lock); >>> + return ret; >>> +} >>> + >>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>> + uint32_t hint) >>> +{ >>> + int ret = 0; >>> + enum PP_SMC_POWER_PROFILE profile = >>> + amdgpu_workload_to_power_profile(hint); >>> + >>> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >>> + return 0; >>> + >>> + /* Do not reset GPU power profile if another reset is coming */ >>> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >>> + return 0; >>> + >>> + mutex_lock(&adev->pm.smu_workload_lock); >>> + >>> + if (adev->pm.workload_mode != hint) >>> + goto unlock; >>> + >>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >>> + if (!ret) >>> + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >>> + >>> +unlock: >>> + mutex_unlock(&adev->pm.smu_workload_lock); >>> + return ret; >>> +} >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>> index be7aff2d4a57..1f0f64662c04 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, >>> mutex_init(&adev->psp.mutex); >>> mutex_init(&adev->notifier_lock); >>> mutex_init(&adev->pm.stable_pstate_ctx_lock); >>> + mutex_init(&adev->pm.smu_workload_lock); >>> mutex_init(&adev->benchmark_mutex); >>> amdgpu_device_init_apu_flags(adev); >>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>> new file mode 100644 >>> index 000000000000..6060fc53c3b0 >>> --- /dev/null >>> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>> @@ -0,0 +1,54 @@ >>> +/* >>> + * Copyright 2022 Advanced Micro Devices, Inc. 
>>> + * >>> + * Permission is hereby granted, free of charge, to any person >>> obtaining a >>> + * copy of this software and associated documentation files (the >>> "Software"), >>> + * to deal in the Software without restriction, including without >>> limitation >>> + * the rights to use, copy, modify, merge, publish, distribute, >>> sublicense, >>> + * and/or sell copies of the Software, and to permit persons to whom >>> the >>> + * Software is furnished to do so, subject to the following conditions: >>> + * >>> + * The above copyright notice and this permission notice shall be >>> included in >>> + * all copies or substantial portions of the Software. >>> + * >>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >>> EXPRESS OR >>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>> MERCHANTABILITY, >>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >>> EVENT SHALL >>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>> DAMAGES OR >>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>> OTHERWISE, >>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE >>> USE OR >>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>> + * >>> + */ >>> +#ifndef _AMDGPU_CTX_WL_H_ >>> +#define _AMDGPU_CTX_WL_H_ >>> +#include <drm/amdgpu_drm.h> >>> +#include "amdgpu.h" >>> + >>> +/* Workload mode names */ >>> +static const char * const amdgpu_workload_mode_name[] = { >>> + "None", >>> + "3D", >>> + "Video", >>> + "VR", >>> + "Compute", >>> + "Unknown", >>> +}; >>> + >>> +static inline const >>> +char *amdgpu_workload_profile_name(uint32_t profile) >>> +{ >>> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >>> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >>> + return >>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; >>> + >>> + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >>> +} >>> + >>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>> + uint32_t hint); >>> + >>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>> + uint32_t hint); >>> + >>> +#endif >>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>> index 65624d091ed2..565131f789d0 100644 >>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>> @@ -361,6 +361,11 @@ struct amdgpu_pm { >>> struct mutex stable_pstate_ctx_lock; >>> struct amdgpu_ctx *stable_pstate_ctx; >>> + /* SMU workload mode */ >>> + struct mutex smu_workload_lock; >>> + uint32_t workload_mode; >>> + atomic_t workload_switch_ref; >>> + >>> struct config_table_setting config_table; >>> /* runtime mode */ >>> enum amdgpu_runpm_mode rpm_mode; >>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 12:10 ` Lazar, Lijo @ 2022-09-27 12:23 ` Sharma, Shashank 2022-09-27 12:39 ` Lazar, Lijo 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 12:23 UTC (permalink / raw) To: Lazar, Lijo, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 2:10 PM, Lazar, Lijo wrote: > > > On 9/27/2022 5:11 PM, Sharma, Shashank wrote: >> >> >> On 9/27/2022 11:58 AM, Lazar, Lijo wrote: >>> >>> >>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>> This patch adds new functions which will allow a user to >>>> change the GPU power profile based a GPU workload hint >>>> flag. >>>> >>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>> --- >>>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >>>> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 >>>> +++++++++++++++++++ >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >>>> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >>>> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >>>> 5 files changed, 158 insertions(+), 1 deletion(-) >>>> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> create mode 100644 drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >>>> b/drivers/gpu/drm/amd/amdgpu/Makefile >>>> index 5a283d12f8e1..34679c657ecc 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >>>> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >>>> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >>>> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >>>> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >>>> amdgpu_pll.o \ >>>> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ >>>> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>> amdgpu_ctx_workload.o amdgpu_sync.o \ 
>>>> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >>>> amdgpu_virt.o \ >>>> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >>>> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> new file mode 100644 >>>> index 000000000000..a11cf29bc388 >>>> --- /dev/null >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> @@ -0,0 +1,97 @@ >>>> +/* >>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>> + * >>>> + * Permission is hereby granted, free of charge, to any person >>>> obtaining a >>>> + * copy of this software and associated documentation files (the >>>> "Software"), >>>> + * to deal in the Software without restriction, including without >>>> limitation >>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>> sublicense, >>>> + * and/or sell copies of the Software, and to permit persons to >>>> whom the >>>> + * Software is furnished to do so, subject to the following >>>> conditions: >>>> + * >>>> + * The above copyright notice and this permission notice shall be >>>> included in >>>> + * all copies or substantial portions of the Software. >>>> + * >>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >>>> EXPRESS OR >>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>> MERCHANTABILITY, >>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >>>> EVENT SHALL >>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>>> DAMAGES OR >>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>> OTHERWISE, >>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE >>>> USE OR >>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>> + * >>>> + */ >>>> +#include <drm/drm.h> >>>> +#include "kgd_pp_interface.h" >>>> +#include "amdgpu_ctx_workload.h" >>>> + >>>> +static enum PP_SMC_POWER_PROFILE >>>> +amdgpu_workload_to_power_profile(uint32_t hint) >>>> +{ >>>> + switch (hint) { >>>> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >>>> + default: >>>> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >>>> + >>>> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >>>> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >>>> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >>>> + return PP_SMC_POWER_PROFILE_VIDEO; >>>> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >>>> + return PP_SMC_POWER_PROFILE_VR; >>>> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >>>> + return PP_SMC_POWER_PROFILE_COMPUTE; >>>> + } >>>> +} >>>> + >>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>> + uint32_t hint) >>>> +{ >>>> + int ret = 0; >>>> + enum PP_SMC_POWER_PROFILE profile = >>>> + amdgpu_workload_to_power_profile(hint); >>>> + >>>> + if (adev->pm.workload_mode == hint) >>>> + return 0; >>>> + >>>> + mutex_lock(&adev->pm.smu_workload_lock); >>> >>> If it's all about pm subsystem variable accesses, this API should >>> rather be inside amd/pm subsystem. No need to expose the variable >>> outside pm subsytem. Also currently all amdgpu_dpm* calls are >>> protected under one mutex. Then this extra lock won't be needed. >>> >> >> This is tricky, this is not all about PM subsystem. Note that the job >> management and scheduling is handled into amdgpu_ctx, so the workload >> hint is set in context_management API. The API is consumed when the >> job is actually run from amdgpu_run() layer. So its a joint interface >> between context and PM. >> > > If you take out amdgpu_workload_to_power_profile() line, everything else > looks to touch only pm variables/functions. That's not just a line; that function converts an AMDGPU_CTX hint to a PP power profile.
So to avoid these conflicts, having a new file is a better idea. You could still keep a > wrapper though. Also dpm_* functions are protected, so the extra mutex > can be avoided as well. > The lock also protects pm.workload_mode writes. >>>> + >>>> + if (adev->pm.workload_mode == hint) >>>> + goto unlock; >>>> + >>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>>> + if (!ret) >>>> + adev->pm.workload_mode = hint; >>>> + atomic_inc(&adev->pm.workload_switch_ref); >>> >>> Why is this reference kept? The swtiching happens inside a lock and >>> there is already a check not to switch if the hint matches with >>> current workload. >>> >> >> This reference is kept so that we would not reset the PM mode to >> DEFAULT when some other context has switched the PP mode. If you see >> the 4th patch, the PM mode will be changed when the job in that >> context is run, and a pm_reset function will be scheduled when the >> job is done. But in between if another job from another context has >> changed the PM mode, the refrence count will prevent us from >> resetting the PM mode. >> > > This helps only if multiple jobs request the same mode. If they >> request different modes, then this is not helping much. No that's certainly not the case. It's a counter, whose aim is to allow a PP reset only when the counter is 0. Do note that the reset() happens only in the job_free_cb(), which gets scheduled later. If this counter is not zero, it means another job has changed the profile in between, and we should not reset it. >> >> It could be useful to profile some apps assuming it has exclusive access. >> >> However, in general, the API is not reliable from a user point as the >> mode requested can be overridden by some other job. Then a better >> thing to do is to document that and avoid the extra stuff around it. >> As I mentioned before, like any PM feature, the benefits can be seen > only while running consistent workloads for a long time.
I can still add a doc note in the UAPI page. - Shashank > Thanks, > Lijo > >> - Shashank >> >>> Thanks, >>> Lijo >>> >>>> + >>>> +unlock: >>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>> + return ret; >>>> +} >>>> + >>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>> + uint32_t hint) >>>> +{ >>>> + int ret = 0; >>>> + enum PP_SMC_POWER_PROFILE profile = >>>> + amdgpu_workload_to_power_profile(hint); >>>> + >>>> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >>>> + return 0; >>>> + >>>> + /* Do not reset GPU power profile if another reset is coming */ >>>> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >>>> + return 0; >>>> + >>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>> + >>>> + if (adev->pm.workload_mode != hint) >>>> + goto unlock; >>>> + >>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >>>> + if (!ret) >>>> + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >>>> + >>>> +unlock: >>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>> + return ret; >>>> +} >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>> index be7aff2d4a57..1f0f64662c04 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device >>>> *adev, >>>> mutex_init(&adev->psp.mutex); >>>> mutex_init(&adev->notifier_lock); >>>> mutex_init(&adev->pm.stable_pstate_ctx_lock); >>>> + mutex_init(&adev->pm.smu_workload_lock); >>>> mutex_init(&adev->benchmark_mutex); >>>> amdgpu_device_init_apu_flags(adev); >>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>> new file mode 100644 >>>> index 000000000000..6060fc53c3b0 >>>> --- /dev/null >>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>> @@ -0,0 +1,54 @@ >>>> +/* >>>> + * Copyright 2022 Advanced Micro Devices, Inc.
>>>> + * >>>> + * Permission is hereby granted, free of charge, to any person >>>> obtaining a >>>> + * copy of this software and associated documentation files (the >>>> "Software"), >>>> + * to deal in the Software without restriction, including without >>>> limitation >>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>> sublicense, >>>> + * and/or sell copies of the Software, and to permit persons to >>>> whom the >>>> + * Software is furnished to do so, subject to the following >>>> conditions: >>>> + * >>>> + * The above copyright notice and this permission notice shall be >>>> included in >>>> + * all copies or substantial portions of the Software. >>>> + * >>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >>>> EXPRESS OR >>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>> MERCHANTABILITY, >>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >>>> EVENT SHALL >>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>>> DAMAGES OR >>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>> OTHERWISE, >>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE >>>> USE OR >>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>> + * >>>> + */ >>>> +#ifndef _AMDGPU_CTX_WL_H_ >>>> +#define _AMDGPU_CTX_WL_H_ >>>> +#include <drm/amdgpu_drm.h> >>>> +#include "amdgpu.h" >>>> + >>>> +/* Workload mode names */ >>>> +static const char * const amdgpu_workload_mode_name[] = { >>>> + "None", >>>> + "3D", >>>> + "Video", >>>> + "VR", >>>> + "Compute", >>>> + "Unknown", >>>> +}; >>>> + >>>> +static inline const >>>> +char *amdgpu_workload_profile_name(uint32_t profile) >>>> +{ >>>> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >>>> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >>>> + return >>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; >>>> + >>>> + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >>>> +} >>>> + >>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>> + uint32_t hint); >>>> + >>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>> + uint32_t hint); >>>> + >>>> +#endif >>>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>> index 65624d091ed2..565131f789d0 100644 >>>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>> @@ -361,6 +361,11 @@ struct amdgpu_pm { >>>> struct mutex stable_pstate_ctx_lock; >>>> struct amdgpu_ctx *stable_pstate_ctx; >>>> + /* SMU workload mode */ >>>> + struct mutex smu_workload_lock; >>>> + uint32_t workload_mode; >>>> + atomic_t workload_switch_ref; >>>> + >>>> struct config_table_setting config_table; >>>> /* runtime mode */ >>>> enum amdgpu_runpm_mode rpm_mode; >>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 12:23 ` Sharma, Shashank @ 2022-09-27 12:39 ` Lazar, Lijo 2022-09-27 12:53 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-27 12:39 UTC (permalink / raw) To: Sharma, Shashank, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 5:53 PM, Sharma, Shashank wrote: > > > On 9/27/2022 2:10 PM, Lazar, Lijo wrote: >> >> >> On 9/27/2022 5:11 PM, Sharma, Shashank wrote: >>> >>> >>> On 9/27/2022 11:58 AM, Lazar, Lijo wrote: >>>> >>>> >>>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>>> This patch adds new functions which will allow a user to >>>>> change the GPU power profile based a GPU workload hint >>>>> flag. >>>>> >>>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>> --- >>>>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >>>>> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 >>>>> +++++++++++++++++++ >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >>>>> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >>>>> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >>>>> 5 files changed, 158 insertions(+), 1 deletion(-) >>>>> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>> create mode 100644 drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>> b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>> index 5a283d12f8e1..34679c657ecc 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >>>>> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >>>>> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >>>>> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >>>>> amdgpu_pll.o \ >>>>> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ 
>>>>> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>>> amdgpu_ctx_workload.o amdgpu_sync.o \ >>>>> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >>>>> amdgpu_virt.o \ >>>>> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >>>>> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>> new file mode 100644 >>>>> index 000000000000..a11cf29bc388 >>>>> --- /dev/null >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>> @@ -0,0 +1,97 @@ >>>>> +/* >>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>> + * >>>>> + * Permission is hereby granted, free of charge, to any person >>>>> obtaining a >>>>> + * copy of this software and associated documentation files (the >>>>> "Software"), >>>>> + * to deal in the Software without restriction, including without >>>>> limitation >>>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>>> sublicense, >>>>> + * and/or sell copies of the Software, and to permit persons to >>>>> whom the >>>>> + * Software is furnished to do so, subject to the following >>>>> conditions: >>>>> + * >>>>> + * The above copyright notice and this permission notice shall be >>>>> included in >>>>> + * all copies or substantial portions of the Software. >>>>> + * >>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >>>>> EXPRESS OR >>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>> MERCHANTABILITY, >>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >>>>> EVENT SHALL >>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>>>> DAMAGES OR >>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>> OTHERWISE, >>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE >>>>> USE OR >>>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>>> + * >>>>> + */ >>>>> +#include <drm/drm.h> >>>>> +#include "kgd_pp_interface.h" >>>>> +#include "amdgpu_ctx_workload.h" >>>>> + >>>>> +static enum PP_SMC_POWER_PROFILE >>>>> +amdgpu_workload_to_power_profile(uint32_t hint) >>>>> +{ >>>>> + switch (hint) { >>>>> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >>>>> + default: >>>>> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >>>>> + >>>>> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >>>>> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >>>>> + return PP_SMC_POWER_PROFILE_VIDEO; >>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >>>>> + return PP_SMC_POWER_PROFILE_VR; >>>>> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >>>>> + return PP_SMC_POWER_PROFILE_COMPUTE; >>>>> + } >>>>> +} >>>>> + >>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>> + uint32_t hint) >>>>> +{ >>>>> + int ret = 0; >>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>> + amdgpu_workload_to_power_profile(hint); >>>>> + >>>>> + if (adev->pm.workload_mode == hint) >>>>> + return 0; >>>>> + >>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>> >>>> If it's all about pm subsystem variable accesses, this API should >>>> rather be inside amd/pm subsystem. No need to expose the variable >>>> outside pm subsytem. Also currently all amdgpu_dpm* calls are >>>> protected under one mutex. Then this extra lock won't be needed. >>>> >>> >>> This is tricky, this is not all about PM subsystem. Note that the job >>> management and scheduling is handled into amdgpu_ctx, so the workload >>> hint is set in context_management API. The API is consumed when the >>> job is actually run from amdgpu_run() layer. So its a joint interface >>> between context and PM. >>> >> >> If you take out amdgpu_workload_to_power_profile() line, everything >> else looks to touch only pm variables/functions. > > That's not a line, that function converts a AMGPU_CTX hint to PPM > profile. 
And going by that logic, this whole code was kept in the > amdgpu_ctx.c file as well, coz this code is consuming the PM API. So to > avoid these conflicts and having a new file is a better idea. > > You could still keep a >> wrapper though. Also dpm_* functions are protected, so the extra mutex >> can be avoided as well. >> > The lock also protects pm.workload_mode writes. > >>>>> + >>>>> + if (adev->pm.workload_mode == hint) >>>>> + goto unlock; >>>>> + >>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>>>> + if (!ret) >>>>> + adev->pm.workload_mode = hint; >>>>> + atomic_inc(&adev->pm.workload_switch_ref); >>>> >>>> Why is this reference kept? The swtiching happens inside a lock and >>>> there is already a check not to switch if the hint matches with >>>> current workload. >>>> >>> >>> This reference is kept so that we would not reset the PM mode to >>> DEFAULT when some other context has switched the PP mode. If you see >>> the 4th patch, the PM mode will be changed when the job in that >>> context is run, and a pm_reset function will be scheduled when the >>> job is done. But in between if another job from another context has >>> changed the PM mode, the refrence count will prevent us from >>> resetting the PM mode. >>> >> >> This helps only if multiple jobs request the same mode. If they >> request different modes, then this is not helping much. > > No that's certainly not the case. It's a counter, whose aim is to allow > a PP reset only when the counter is 0. Do note that the reset() happens > only in the job_free_cb(), which gets schedule later. If this counter is > not zero, which means another work has changed the profile in between, > and we should not reset it. > >> >> It could be useful to profile some apps assuming it has exclusive access. >> >> However, in general, the API is not reliable from a user point as the >> mode requested can be overridden by some other job. 
Then a better >> thing to do is to document that and avoid the extra stuff around it. >> > As I mentioned before, like any PM feature, the benefits can be seen > only while running consistant workloads for long time. I an still add a > doc note in the UAPI page. > a) What is the goal of the API? Is it guaranteeing the job to run under a work profile mode or something else? b) If it's to guarantee a work profile mode, does it really guarantee that - the answer is NO when some other job is running. It may or may not work is the answer. c) What is the difference between one job resetting the profile mode to NONE vs another job changing the mode to, say, VIDEO when the original request is for COMPUTE? While that is the case, what is the use of any sort of 'pseudo-protection' other than running some code to do extra lock/unlock stuff. Thanks, Lijo > - Shashank > >> Thanks, >> Lijo >> >>> - Shashank >>> >>>> Thanks, >>>> Lijo >>>> >>>>> + >>>>> +unlock: >>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>> + return ret; >>>>> +} >>>>> + >>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>> + uint32_t hint) >>>>> +{ >>>>> + int ret = 0; >>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>> + amdgpu_workload_to_power_profile(hint); >>>>> + >>>>> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >>>>> + return 0; >>>>> + >>>>> + /* Do not reset GPU power profile if another reset is coming */ >>>>> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >>>>> + return 0; >>>>> + >>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>> + >>>>> + if (adev->pm.workload_mode != hint) >>>>> + goto unlock; >>>>> + >>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >>>>> + if (!ret) >>>>> + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >>>>> + >>>>> +unlock: >>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>> + return ret; >>>>> +} >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
>>>>> index be7aff2d4a57..1f0f64662c04 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device >>>>> *adev, >>>>> mutex_init(&adev->psp.mutex); >>>>> mutex_init(&adev->notifier_lock); >>>>> mutex_init(&adev->pm.stable_pstate_ctx_lock); >>>>> + mutex_init(&adev->pm.smu_workload_lock); >>>>> mutex_init(&adev->benchmark_mutex); >>>>> amdgpu_device_init_apu_flags(adev); >>>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>> new file mode 100644 >>>>> index 000000000000..6060fc53c3b0 >>>>> --- /dev/null >>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>> @@ -0,0 +1,54 @@ >>>>> +/* >>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>> + * >>>>> + * Permission is hereby granted, free of charge, to any person >>>>> obtaining a >>>>> + * copy of this software and associated documentation files (the >>>>> "Software"), >>>>> + * to deal in the Software without restriction, including without >>>>> limitation >>>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>>> sublicense, >>>>> + * and/or sell copies of the Software, and to permit persons to >>>>> whom the >>>>> + * Software is furnished to do so, subject to the following >>>>> conditions: >>>>> + * >>>>> + * The above copyright notice and this permission notice shall be >>>>> included in >>>>> + * all copies or substantial portions of the Software. >>>>> + * >>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, >>>>> EXPRESS OR >>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>> MERCHANTABILITY, >>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO >>>>> EVENT SHALL >>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>>>> DAMAGES OR >>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>> OTHERWISE, >>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE >>>>> USE OR >>>>> + * OTHER DEALINGS IN THE SOFTWARE. >>>>> + * >>>>> + */ >>>>> +#ifndef _AMDGPU_CTX_WL_H_ >>>>> +#define _AMDGPU_CTX_WL_H_ >>>>> +#include <drm/amdgpu_drm.h> >>>>> +#include "amdgpu.h" >>>>> + >>>>> +/* Workload mode names */ >>>>> +static const char * const amdgpu_workload_mode_name[] = { >>>>> + "None", >>>>> + "3D", >>>>> + "Video", >>>>> + "VR", >>>>> + "Compute", >>>>> + "Unknown", >>>>> +}; >>>>> + >>>>> +static inline const >>>>> +char *amdgpu_workload_profile_name(uint32_t profile) >>>>> +{ >>>>> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >>>>> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >>>>> + return >>>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; >>>>> + >>>>> + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >>>>> +} >>>>> + >>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>> + uint32_t hint); >>>>> + >>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>> + uint32_t hint); >>>>> + >>>>> +#endif >>>>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>> index 65624d091ed2..565131f789d0 100644 >>>>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>> @@ -361,6 +361,11 @@ struct amdgpu_pm { >>>>> struct mutex stable_pstate_ctx_lock; >>>>> struct amdgpu_ctx *stable_pstate_ctx; >>>>> + /* SMU workload mode */ >>>>> + struct mutex smu_workload_lock; >>>>> + uint32_t workload_mode; >>>>> + atomic_t workload_switch_ref; >>>>> + >>>>> struct config_table_setting config_table; >>>>> /* runtime mode */ >>>>> enum amdgpu_runpm_mode rpm_mode; >>>>> ^ permalink raw reply [flat|nested] 76+ 
messages in thread
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 12:39 ` Lazar, Lijo @ 2022-09-27 12:53 ` Sharma, Shashank 2022-09-27 13:29 ` Lazar, Lijo 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 12:53 UTC (permalink / raw) To: Lazar, Lijo, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 2:39 PM, Lazar, Lijo wrote: > > > On 9/27/2022 5:53 PM, Sharma, Shashank wrote: >> >> >> On 9/27/2022 2:10 PM, Lazar, Lijo wrote: >>> >>> >>> On 9/27/2022 5:11 PM, Sharma, Shashank wrote: >>>> >>>> >>>> On 9/27/2022 11:58 AM, Lazar, Lijo wrote: >>>>> >>>>> >>>>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>>>> This patch adds new functions which will allow a user to >>>>>> change the GPU power profile based a GPU workload hint >>>>>> flag. >>>>>> >>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>> --- >>>>>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >>>>>> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 >>>>>> +++++++++++++++++++ >>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >>>>>> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >>>>>> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >>>>>> 5 files changed, 158 insertions(+), 1 deletion(-) >>>>>> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>> create mode 100644 >>>>>> drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>> >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>> b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>> index 5a283d12f8e1..34679c657ecc 100644 >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >>>>>> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >>>>>> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >>>>>> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >>>>>> 
amdgpu_pll.o \ >>>>>> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ >>>>>> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>>>> amdgpu_ctx_workload.o amdgpu_sync.o \ >>>>>> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >>>>>> amdgpu_virt.o \ >>>>>> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >>>>>> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>> new file mode 100644 >>>>>> index 000000000000..a11cf29bc388 >>>>>> --- /dev/null >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>> @@ -0,0 +1,97 @@ >>>>>> +/* >>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>> + * >>>>>> + * Permission is hereby granted, free of charge, to any person >>>>>> obtaining a >>>>>> + * copy of this software and associated documentation files (the >>>>>> "Software"), >>>>>> + * to deal in the Software without restriction, including without >>>>>> limitation >>>>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>>>> sublicense, >>>>>> + * and/or sell copies of the Software, and to permit persons to >>>>>> whom the >>>>>> + * Software is furnished to do so, subject to the following >>>>>> conditions: >>>>>> + * >>>>>> + * The above copyright notice and this permission notice shall be >>>>>> included in >>>>>> + * all copies or substantial portions of the Software. >>>>>> + * >>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>> KIND, EXPRESS OR >>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>> MERCHANTABILITY, >>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO >>>>>> EVENT SHALL >>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>>>>> DAMAGES OR >>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>> OTHERWISE, >>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE >>>>>> USE OR >>>>>> + * OTHER DEALINGS IN THE SOFTWARE. >>>>>> + * >>>>>> + */ >>>>>> +#include <drm/drm.h> >>>>>> +#include "kgd_pp_interface.h" >>>>>> +#include "amdgpu_ctx_workload.h" >>>>>> + >>>>>> +static enum PP_SMC_POWER_PROFILE >>>>>> +amdgpu_workload_to_power_profile(uint32_t hint) >>>>>> +{ >>>>>> + switch (hint) { >>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >>>>>> + default: >>>>>> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >>>>>> + >>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >>>>>> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >>>>>> + return PP_SMC_POWER_PROFILE_VIDEO; >>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >>>>>> + return PP_SMC_POWER_PROFILE_VR; >>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >>>>>> + return PP_SMC_POWER_PROFILE_COMPUTE; >>>>>> + } >>>>>> +} >>>>>> + >>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>> + uint32_t hint) >>>>>> +{ >>>>>> + int ret = 0; >>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>> + >>>>>> + if (adev->pm.workload_mode == hint) >>>>>> + return 0; >>>>>> + >>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>> >>>>> If it's all about pm subsystem variable accesses, this API should >>>>> rather be inside amd/pm subsystem. No need to expose the variable >>>>> outside pm subsytem. Also currently all amdgpu_dpm* calls are >>>>> protected under one mutex. Then this extra lock won't be needed. >>>>> >>>> >>>> This is tricky, this is not all about PM subsystem. Note that the >>>> job management and scheduling is handled into amdgpu_ctx, so the >>>> workload hint is set in context_management API. 
The API is consumed >>>> when the job is actually run from the amdgpu_run() layer. So it's a joint >>>> interface between context and PM. >>>> >>> >>> If you take out the amdgpu_workload_to_power_profile() line, everything >>> else looks to touch only pm variables/functions. >> >> That's not a line, that function converts an AMDGPU_CTX hint to a PP >> profile. And going by that logic, this whole code was kept in the >> amdgpu_ctx.c file as well, because this code is consuming the PM API. So >> to avoid these conflicts, having a new file is a better idea. >> >> You could still keep a >>> wrapper though. Also dpm_* functions are protected, so the extra >>> mutex can be avoided as well. >>> >> The lock also protects pm.workload_mode writes. >> >>>>>> + >>>>>> + if (adev->pm.workload_mode == hint) >>>>>> + goto unlock; >>>>>> + >>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>>>>> + if (!ret) >>>>>> + adev->pm.workload_mode = hint; >>>>>> + atomic_inc(&adev->pm.workload_switch_ref); >>>>> >>>>> Why is this reference kept? The switching happens inside a lock and >>>>> there is already a check not to switch if the hint matches with >>>>> the current workload. >>>>> >>>> >>>> This reference is kept so that we would not reset the PM mode to >>>> DEFAULT when some other context has switched the PP mode. If you see >>>> the 4th patch, the PM mode will be changed when the job in that >>>> context is run, and a pm_reset function will be scheduled when the >>>> job is done. But in between, if another job from another context has >>>> changed the PM mode, the reference count will prevent us from >>>> resetting the PM mode. >>>> >>> >>> This helps only if multiple jobs request the same mode. If they >>> request different modes, then this is not helping much. >> >> No, that's certainly not the case. It's a counter, whose aim is to >> allow a PP reset only when the counter is 0. Do note that the reset() >> happens only in the job_free_cb(), which gets scheduled later.
If this >> counter is not zero, that means another job has changed the profile >> in between, and we should not reset it. >> >>> >>> It could be useful to profile some apps assuming it has exclusive >>> access. >>> >>> However, in general, the API is not reliable from a user point as the >>> mode requested can be overridden by some other job. Then a better >>> thing to do is to document that and avoid the extra stuff around it. >>> >> As I mentioned before, like any PM feature, the benefits can be seen >> only while running consistent workloads for a long time. I can still add >> a doc note in the UAPI page. >> > > > a) What is the goal of the API? Is it guaranteeing the job to run under > a workprofile mode or something else? No, it does not guarantee anything. If you see the cover letter, it just provides an interface for an app to submit a workload under a power profile which can be more suitable for its workload type. As I mentioned, it could be very useful for many scenarios like fullscreen 3D / fullscreen MM scenarios. It could also allow a system-gfx-manager to shift load balance towards one type of workload. There are many applications, once the UAPI is in place. > > b) If it's to guarantee work profile mode, does it really guarantee that > - the answer is NO when some other job is running. It may or may not > work is the answer. > > c) What is the difference between one job resetting the profile mode to > NONE vs another job changing the mode to say VIDEO when the original > request is for COMPUTE? While that is the case, what is the use of any > sort of 'pseudo-protection' other than running some code to do extra > lock/unlock stuff. > Your understanding of protection is wrong here. There is intentionally no protection against a job changing another job's set workload profile, because in that case we would end up serializing/bottlenecking workload submission until the PM profile is ready to be changed, which takes away the benefit of having multiple queues of parallel submission.
The protection provided by the ref counter is to avoid clearing the profile (to NONE) while another workload is in execution. The difference between NONE and VIDEO is still that NONE is the default profile without any fine tuning, and VIDEO is fine tuned for VIDEO-type workloads. In the end, *again*, the actual benefit comes when a consistent workload is submitted for a long time, like fullscreen 3D game playback, fullscreen Video movie playback, and so on. - Shashank > Thanks, > Lijo > >> - Shashank >> >>> Thanks, >>> Lijo >>> >>>> - Shashank >>>> >>>>> Thanks, >>>>> Lijo >>>>> >>>>>> + >>>>>> +unlock: >>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>> + return ret; >>>>>> +} >>>>>> + >>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>> + uint32_t hint) >>>>>> +{ >>>>>> + int ret = 0; >>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>> + >>>>>> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >>>>>> + return 0; >>>>>> + >>>>>> + /* Do not reset GPU power profile if another reset is coming */ >>>>>> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >>>>>> + return 0; >>>>>> + >>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>> + >>>>>> + if (adev->pm.workload_mode != hint) >>>>>> + goto unlock; >>>>>> + >>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >>>>>> + if (!ret) >>>>>> + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >>>>>> + >>>>>> +unlock: >>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>> + return ret; >>>>>> +} >>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>> index be7aff2d4a57..1f0f64662c04 100644 >>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device >>>>>> *adev, >>>>>> mutex_init(&adev->psp.mutex); >>>>>>
mutex_init(&adev->notifier_lock); >>>>>> mutex_init(&adev->pm.stable_pstate_ctx_lock); >>>>>> + mutex_init(&adev->pm.smu_workload_lock); >>>>>> mutex_init(&adev->benchmark_mutex); >>>>>> amdgpu_device_init_apu_flags(adev); >>>>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>> new file mode 100644 >>>>>> index 000000000000..6060fc53c3b0 >>>>>> --- /dev/null >>>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>> @@ -0,0 +1,54 @@ >>>>>> +/* >>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>> + * >>>>>> + * Permission is hereby granted, free of charge, to any person >>>>>> obtaining a >>>>>> + * copy of this software and associated documentation files (the >>>>>> "Software"), >>>>>> + * to deal in the Software without restriction, including without >>>>>> limitation >>>>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>>>> sublicense, >>>>>> + * and/or sell copies of the Software, and to permit persons to >>>>>> whom the >>>>>> + * Software is furnished to do so, subject to the following >>>>>> conditions: >>>>>> + * >>>>>> + * The above copyright notice and this permission notice shall be >>>>>> included in >>>>>> + * all copies or substantial portions of the Software. >>>>>> + * >>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>> KIND, EXPRESS OR >>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>> MERCHANTABILITY, >>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >>>>>> EVENT SHALL >>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>>>>> DAMAGES OR >>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>> OTHERWISE, >>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE >>>>>> USE OR >>>>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>>>> + * >>>>>> + */ >>>>>> +#ifndef _AMDGPU_CTX_WL_H_ >>>>>> +#define _AMDGPU_CTX_WL_H_ >>>>>> +#include <drm/amdgpu_drm.h> >>>>>> +#include "amdgpu.h" >>>>>> + >>>>>> +/* Workload mode names */ >>>>>> +static const char * const amdgpu_workload_mode_name[] = { >>>>>> + "None", >>>>>> + "3D", >>>>>> + "Video", >>>>>> + "VR", >>>>>> + "Compute", >>>>>> + "Unknown", >>>>>> +}; >>>>>> + >>>>>> +static inline const >>>>>> +char *amdgpu_workload_profile_name(uint32_t profile) >>>>>> +{ >>>>>> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >>>>>> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >>>>>> + return >>>>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; >>>>>> + >>>>>> + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >>>>>> +} >>>>>> + >>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>> + uint32_t hint); >>>>>> + >>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>> + uint32_t hint); >>>>>> + >>>>>> +#endif >>>>>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>> index 65624d091ed2..565131f789d0 100644 >>>>>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>> @@ -361,6 +361,11 @@ struct amdgpu_pm { >>>>>> struct mutex stable_pstate_ctx_lock; >>>>>> struct amdgpu_ctx *stable_pstate_ctx; >>>>>> + /* SMU workload mode */ >>>>>> + struct mutex smu_workload_lock; >>>>>> + uint32_t workload_mode; >>>>>> + atomic_t workload_switch_ref; >>>>>> + >>>>>> struct config_table_setting config_table; >>>>>> /* runtime mode */ >>>>>> enum amdgpu_runpm_mode rpm_mode; >>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 12:53 ` Sharma, Shashank @ 2022-09-27 13:29 ` Lazar, Lijo 2022-09-27 13:47 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-27 13:29 UTC (permalink / raw) To: Sharma, Shashank, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 6:23 PM, Sharma, Shashank wrote: > > > On 9/27/2022 2:39 PM, Lazar, Lijo wrote: >> >> >> On 9/27/2022 5:53 PM, Sharma, Shashank wrote: >>> >>> >>> On 9/27/2022 2:10 PM, Lazar, Lijo wrote: >>>> >>>> >>>> On 9/27/2022 5:11 PM, Sharma, Shashank wrote: >>>>> >>>>> >>>>> On 9/27/2022 11:58 AM, Lazar, Lijo wrote: >>>>>> >>>>>> >>>>>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>>>>> This patch adds new functions which will allow a user to >>>>>>> change the GPU power profile based a GPU workload hint >>>>>>> flag. >>>>>>> >>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>> --- >>>>>>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >>>>>>> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 >>>>>>> +++++++++++++++++++ >>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >>>>>>> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >>>>>>> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >>>>>>> 5 files changed, 158 insertions(+), 1 deletion(-) >>>>>>> create mode 100644 >>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>> create mode 100644 >>>>>>> drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>> b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>> index 5a283d12f8e1..34679c657ecc 100644 >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >>>>>>> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >>>>>>> 
atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >>>>>>> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >>>>>>> amdgpu_pll.o \ >>>>>>> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ >>>>>>> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>>>>> amdgpu_ctx_workload.o amdgpu_sync.o \ >>>>>>> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >>>>>>> amdgpu_virt.o \ >>>>>>> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >>>>>>> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>> new file mode 100644 >>>>>>> index 000000000000..a11cf29bc388 >>>>>>> --- /dev/null >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>> @@ -0,0 +1,97 @@ >>>>>>> +/* >>>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>>> + * >>>>>>> + * Permission is hereby granted, free of charge, to any person >>>>>>> obtaining a >>>>>>> + * copy of this software and associated documentation files (the >>>>>>> "Software"), >>>>>>> + * to deal in the Software without restriction, including >>>>>>> without limitation >>>>>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>>>>> sublicense, >>>>>>> + * and/or sell copies of the Software, and to permit persons to >>>>>>> whom the >>>>>>> + * Software is furnished to do so, subject to the following >>>>>>> conditions: >>>>>>> + * >>>>>>> + * The above copyright notice and this permission notice shall >>>>>>> be included in >>>>>>> + * all copies or substantial portions of the Software. >>>>>>> + * >>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>>> KIND, EXPRESS OR >>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>> MERCHANTABILITY, >>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO >>>>>>> EVENT SHALL >>>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>>>>>> DAMAGES OR >>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>>> OTHERWISE, >>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >>>>>>> THE USE OR >>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. >>>>>>> + * >>>>>>> + */ >>>>>>> +#include <drm/drm.h> >>>>>>> +#include "kgd_pp_interface.h" >>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>> + >>>>>>> +static enum PP_SMC_POWER_PROFILE >>>>>>> +amdgpu_workload_to_power_profile(uint32_t hint) >>>>>>> +{ >>>>>>> + switch (hint) { >>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >>>>>>> + default: >>>>>>> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >>>>>>> + >>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >>>>>>> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >>>>>>> + return PP_SMC_POWER_PROFILE_VIDEO; >>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >>>>>>> + return PP_SMC_POWER_PROFILE_VR; >>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >>>>>>> + return PP_SMC_POWER_PROFILE_COMPUTE; >>>>>>> + } >>>>>>> +} >>>>>>> + >>>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>>> + uint32_t hint) >>>>>>> +{ >>>>>>> + int ret = 0; >>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>> + >>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>> + return 0; >>>>>>> + >>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>> >>>>>> If it's all about pm subsystem variable accesses, this API should >>>>>> rather be inside amd/pm subsystem. No need to expose the variable >>>>>> outside pm subsytem. Also currently all amdgpu_dpm* calls are >>>>>> protected under one mutex. Then this extra lock won't be needed. >>>>>> >>>>> >>>>> This is tricky, this is not all about PM subsystem. 
Note that the >>>>> job management and scheduling is handled into amdgpu_ctx, so the >>>>> workload hint is set in context_management API. The API is consumed >>>>> when the job is actually run from amdgpu_run() layer. So its a >>>>> joint interface between context and PM. >>>>> >>>> >>>> If you take out amdgpu_workload_to_power_profile() line, everything >>>> else looks to touch only pm variables/functions. >>> >>> That's not a line, that function converts a AMGPU_CTX hint to PPM >>> profile. And going by that logic, this whole code was kept in the >>> amdgpu_ctx.c file as well, coz this code is consuming the PM API. So >>> to avoid these conflicts and having a new file is a better idea. >>> >>> You could still keep a >>>> wrapper though. Also dpm_* functions are protected, so the extra >>>> mutex can be avoided as well. >>>> >>> The lock also protects pm.workload_mode writes. >>> >>>>>>> + >>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>> + goto unlock; >>>>>>> + >>>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>>>>>> + if (!ret) >>>>>>> + adev->pm.workload_mode = hint; >>>>>>> + atomic_inc(&adev->pm.workload_switch_ref); >>>>>> >>>>>> Why is this reference kept? The swtiching happens inside a lock >>>>>> and there is already a check not to switch if the hint matches >>>>>> with current workload. >>>>>> >>>>> >>>>> This reference is kept so that we would not reset the PM mode to >>>>> DEFAULT when some other context has switched the PP mode. If you >>>>> see the 4th patch, the PM mode will be changed when the job in that >>>>> context is run, and a pm_reset function will be scheduled when the >>>>> job is done. But in between if another job from another context has >>>>> changed the PM mode, the refrence count will prevent us from >>>>> resetting the PM mode. >>>>> >>>> >>>> This helps only if multiple jobs request the same mode. If they >>>> request different modes, then this is not helping much. >>> >>> No that's certainly not the case. 
It's a counter, whose aim is to >>> allow a PP reset only when the counter is 0. Do note that the reset() >>> happens only in the job_free_cb(), which gets schedule later. If this >>> counter is not zero, which means another work has changed the profile >>> in between, and we should not reset it. >>> >>>> >>>> It could be useful to profile some apps assuming it has exclusive >>>> access. >>>> >>>> However, in general, the API is not reliable from a user point as >>>> the mode requested can be overridden by some other job. Then a >>>> better thing to do is to document that and avoid the extra stuff >>>> around it. >>>> >>> As I mentioned before, like any PM feature, the benefits can be seen >>> only while running consistant workloads for long time. I an still add >>> a doc note in the UAPI page. >>> >> >> >> a) What is the goal of the API? Is it guaranteeing the job to run >> under a workprofile mode or something else? > > No, it does not guarentee anything. If you see the cover letter, it just > provides an interface to an app to submit workload under a power profile > which can be more suitable for its workload type. As I mentioned, it > could be very useful for many scenarios like fullscreen 3D / fullscreen > MM scenarios. It could also allow a system-gfx-manager to shift load > balance towards one type of workload. There are many applications, once > the UAPI is in place. > >> >> b) If it's to guarantee work profile mode, does it really guarantee >> that - the answer is NO when some other job is running. It may or may >> not work is the answer. >> >> c) What is the difference between one job resetting the profile mode >> to NONE vs another job change the mode to say VIDEO when the original >> request is for COMPUTE? While that is the case, what is the use of any >> sort of 'pseudo-protection' other than running some code to do extra >> lock/unlock stuff. >> > > Your understanding of protection is wrong here. 
There is intentionally > no protection against a job changing another job's set workload profile, because > in that case we would end up serializing/bottlenecking workload submission > until the PM profile is ready to be changed, which takes away the benefit of > having multiple queues of parallel submission. > > The protection provided by the ref counter is to avoid clearing > the profile (to NONE) while another workload is in execution. The > difference between NONE and VIDEO is still that NONE is the default > profile without any fine tuning, and VIDEO is fine tuned for VIDEO-type > workloads. > Protection 1 is the mutex_lock(&adev->pm.smu_workload_lock); the line that follows is amdgpu_dpm_switch_power_profile() - this one allows only single-client use; two jobs won't be able to switch at the same time. All *dpm* APIs are protected like that. Protection 2 is the ref counter. It helps only in this kind of scenario, when two jobs request the same mode successively:

Scenario 1:
Job 1 requested compute
Job 2 requested compute
Job 1 ends (doesn't reset)

Scenario 2:
Job 1 requested compute
Job 2 requested compute
Job 3 requested 3D
Job 1 ends (doesn't reset; it continues in 3D)

In this mixed-scenario case, I would say NONE is much more optimized as it's under FW control. Actually, it does much more fine tuning because of its background data collection. > In the end, *again*, the actual benefit comes when a consistent workload is > submitted for a long time, like fullscreen 3D game playback, fullscreen > Video movie playback, and so on. > "Only under consistent" doesn't justify any software protection logic. Again, if the workload is consistent, most likely the PMFW could be managing it better.
Thanks, Lijo > - Shashank > >> Thanks, >> Lijo >> >>> - Shashank >>> >>>> Thanks, >>>> Lijo >>>> >>>>> - Shashank >>>>> >>>>>> Thanks, >>>>>> Lijo >>>>>> >>>>>>> + >>>>>>> +unlock: >>>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>>> + return ret; >>>>>>> +} >>>>>>> + >>>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>>> + uint32_t hint) >>>>>>> +{ >>>>>>> + int ret = 0; >>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>> + >>>>>>> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >>>>>>> + return 0; >>>>>>> + >>>>>>> + /* Do not reset GPU power profile if another reset is coming */ >>>>>>> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >>>>>>> + return 0; >>>>>>> + >>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>>> + >>>>>>> + if (adev->pm.workload_mode != hint) >>>>>>> + goto unlock; >>>>>>> + >>>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >>>>>>> + if (!ret) >>>>>>> + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >>>>>>> + >>>>>>> +unlock: >>>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>>> + return ret; >>>>>>> +} >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>> index be7aff2d4a57..1f0f64662c04 100644 >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device >>>>>>> *adev, >>>>>>> mutex_init(&adev->psp.mutex); >>>>>>> mutex_init(&adev->notifier_lock); >>>>>>> mutex_init(&adev->pm.stable_pstate_ctx_lock); >>>>>>> + mutex_init(&adev->pm.smu_workload_lock); >>>>>>> mutex_init(&adev->benchmark_mutex); >>>>>>> amdgpu_device_init_apu_flags(adev); >>>>>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>> new file mode 100644 >>>>>>> index 
000000000000..6060fc53c3b0 >>>>>>> --- /dev/null >>>>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>> @@ -0,0 +1,54 @@ >>>>>>> +/* >>>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>>> + * >>>>>>> + * Permission is hereby granted, free of charge, to any person >>>>>>> obtaining a >>>>>>> + * copy of this software and associated documentation files (the >>>>>>> "Software"), >>>>>>> + * to deal in the Software without restriction, including >>>>>>> without limitation >>>>>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>>>>> sublicense, >>>>>>> + * and/or sell copies of the Software, and to permit persons to >>>>>>> whom the >>>>>>> + * Software is furnished to do so, subject to the following >>>>>>> conditions: >>>>>>> + * >>>>>>> + * The above copyright notice and this permission notice shall >>>>>>> be included in >>>>>>> + * all copies or substantial portions of the Software. >>>>>>> + * >>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>>> KIND, EXPRESS OR >>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>> MERCHANTABILITY, >>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >>>>>>> EVENT SHALL >>>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, >>>>>>> DAMAGES OR >>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>>> OTHERWISE, >>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >>>>>>> THE USE OR >>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>>>>> + * >>>>>>> + */ >>>>>>> +#ifndef _AMDGPU_CTX_WL_H_ >>>>>>> +#define _AMDGPU_CTX_WL_H_ >>>>>>> +#include <drm/amdgpu_drm.h> >>>>>>> +#include "amdgpu.h" >>>>>>> + >>>>>>> +/* Workload mode names */ >>>>>>> +static const char * const amdgpu_workload_mode_name[] = { >>>>>>> + "None", >>>>>>> + "3D", >>>>>>> + "Video", >>>>>>> + "VR", >>>>>>> + "Compute", >>>>>>> + "Unknown", >>>>>>> +}; >>>>>>> + >>>>>>> +static inline const >>>>>>> +char *amdgpu_workload_profile_name(uint32_t profile) >>>>>>> +{ >>>>>>> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >>>>>>> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >>>>>>> + return >>>>>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; >>>>>>> + >>>>>>> + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >>>>>>> +} >>>>>>> + >>>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>>> + uint32_t hint); >>>>>>> + >>>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>>> + uint32_t hint); >>>>>>> + >>>>>>> +#endif >>>>>>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>> index 65624d091ed2..565131f789d0 100644 >>>>>>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>> @@ -361,6 +361,11 @@ struct amdgpu_pm { >>>>>>> struct mutex stable_pstate_ctx_lock; >>>>>>> struct amdgpu_ctx *stable_pstate_ctx; >>>>>>> + /* SMU workload mode */ >>>>>>> + struct mutex smu_workload_lock; >>>>>>> + uint32_t workload_mode; >>>>>>> + atomic_t workload_switch_ref; >>>>>>> + >>>>>>> struct config_table_setting config_table; >>>>>>> /* runtime mode */ >>>>>>> enum amdgpu_runpm_mode rpm_mode; >>>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 13:29 ` Lazar, Lijo @ 2022-09-27 13:47 ` Sharma, Shashank 2022-09-27 14:00 ` Lazar, Lijo 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 13:47 UTC (permalink / raw) To: Lazar, Lijo, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 3:29 PM, Lazar, Lijo wrote: > > > On 9/27/2022 6:23 PM, Sharma, Shashank wrote: >> >> >> On 9/27/2022 2:39 PM, Lazar, Lijo wrote: >>> >>> >>> On 9/27/2022 5:53 PM, Sharma, Shashank wrote: >>>> >>>> >>>> On 9/27/2022 2:10 PM, Lazar, Lijo wrote: >>>>> >>>>> >>>>> On 9/27/2022 5:11 PM, Sharma, Shashank wrote: >>>>>> >>>>>> >>>>>> On 9/27/2022 11:58 AM, Lazar, Lijo wrote: >>>>>>> >>>>>>> >>>>>>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>>>>>> This patch adds new functions which will allow a user to >>>>>>>> change the GPU power profile based a GPU workload hint >>>>>>>> flag. >>>>>>>> >>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>>> --- >>>>>>>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >>>>>>>> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 >>>>>>>> +++++++++++++++++++ >>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >>>>>>>> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >>>>>>>> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >>>>>>>> 5 files changed, 158 insertions(+), 1 deletion(-) >>>>>>>> create mode 100644 >>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>> create mode 100644 >>>>>>>> drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>> >>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>> b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>> index 5a283d12f8e1..34679c657ecc 100644 >>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o 
amdgpu_kms.o \ >>>>>>>> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >>>>>>>> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >>>>>>>> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >>>>>>>> amdgpu_pll.o \ >>>>>>>> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ >>>>>>>> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>>>>>> amdgpu_ctx_workload.o amdgpu_sync.o \ >>>>>>>> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >>>>>>>> amdgpu_virt.o \ >>>>>>>> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >>>>>>>> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ >>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>> new file mode 100644 >>>>>>>> index 000000000000..a11cf29bc388 >>>>>>>> --- /dev/null >>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>> @@ -0,0 +1,97 @@ >>>>>>>> +/* >>>>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>>>> + * >>>>>>>> + * Permission is hereby granted, free of charge, to any person >>>>>>>> obtaining a >>>>>>>> + * copy of this software and associated documentation files >>>>>>>> (the "Software"), >>>>>>>> + * to deal in the Software without restriction, including >>>>>>>> without limitation >>>>>>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>>>>>> sublicense, >>>>>>>> + * and/or sell copies of the Software, and to permit persons to >>>>>>>> whom the >>>>>>>> + * Software is furnished to do so, subject to the following >>>>>>>> conditions: >>>>>>>> + * >>>>>>>> + * The above copyright notice and this permission notice shall >>>>>>>> be included in >>>>>>>> + * all copies or substantial portions of the Software. 
>>>>>>>> + * >>>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>>>> KIND, EXPRESS OR >>>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>>> MERCHANTABILITY, >>>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO >>>>>>>> EVENT SHALL >>>>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY >>>>>>>> CLAIM, DAMAGES OR >>>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>>>> OTHERWISE, >>>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >>>>>>>> THE USE OR >>>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. >>>>>>>> + * >>>>>>>> + */ >>>>>>>> +#include <drm/drm.h> >>>>>>>> +#include "kgd_pp_interface.h" >>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>> + >>>>>>>> +static enum PP_SMC_POWER_PROFILE >>>>>>>> +amdgpu_workload_to_power_profile(uint32_t hint) >>>>>>>> +{ >>>>>>>> + switch (hint) { >>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >>>>>>>> + default: >>>>>>>> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >>>>>>>> + >>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >>>>>>>> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >>>>>>>> + return PP_SMC_POWER_PROFILE_VIDEO; >>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >>>>>>>> + return PP_SMC_POWER_PROFILE_VR; >>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >>>>>>>> + return PP_SMC_POWER_PROFILE_COMPUTE; >>>>>>>> + } >>>>>>>> +} >>>>>>>> + >>>>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>>>> + uint32_t hint) >>>>>>>> +{ >>>>>>>> + int ret = 0; >>>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>>> + >>>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>>> + return 0; >>>>>>>> + >>>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>>> >>>>>>> If it's all about pm subsystem variable accesses, this API should >>>>>>> rather be inside amd/pm subsystem. 
No need to expose the variable >>>>>>> outside pm subsytem. Also currently all amdgpu_dpm* calls are >>>>>>> protected under one mutex. Then this extra lock won't be needed. >>>>>>> >>>>>> >>>>>> This is tricky, this is not all about PM subsystem. Note that the >>>>>> job management and scheduling is handled into amdgpu_ctx, so the >>>>>> workload hint is set in context_management API. The API is >>>>>> consumed when the job is actually run from amdgpu_run() layer. So >>>>>> its a joint interface between context and PM. >>>>>> >>>>> >>>>> If you take out amdgpu_workload_to_power_profile() line, everything >>>>> else looks to touch only pm variables/functions. >>>> >>>> That's not a line, that function converts a AMGPU_CTX hint to PPM >>>> profile. And going by that logic, this whole code was kept in the >>>> amdgpu_ctx.c file as well, coz this code is consuming the PM API. So >>>> to avoid these conflicts and having a new file is a better idea. >>>> >>>> You could still keep a >>>>> wrapper though. Also dpm_* functions are protected, so the extra >>>>> mutex can be avoided as well. >>>>> >>>> The lock also protects pm.workload_mode writes. >>>> >>>>>>>> + >>>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>>> + goto unlock; >>>>>>>> + >>>>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>>>>>>> + if (!ret) >>>>>>>> + adev->pm.workload_mode = hint; >>>>>>>> + atomic_inc(&adev->pm.workload_switch_ref); >>>>>>> >>>>>>> Why is this reference kept? The swtiching happens inside a lock >>>>>>> and there is already a check not to switch if the hint matches >>>>>>> with current workload. >>>>>>> >>>>>> >>>>>> This reference is kept so that we would not reset the PM mode to >>>>>> DEFAULT when some other context has switched the PP mode. If you >>>>>> see the 4th patch, the PM mode will be changed when the job in >>>>>> that context is run, and a pm_reset function will be scheduled >>>>>> when the job is done. 
But in between if another job from another >>>>>> context has changed the PM mode, the refrence count will prevent >>>>>> us from resetting the PM mode. >>>>>> >>>>> >>>>> This helps only if multiple jobs request the same mode. If they >>>>> request different modes, then this is not helping much. >>>> >>>> No that's certainly not the case. It's a counter, whose aim is to >>>> allow a PP reset only when the counter is 0. Do note that the >>>> reset() happens only in the job_free_cb(), which gets schedule >>>> later. If this counter is not zero, which means another work has >>>> changed the profile in between, and we should not reset it. >>>> >>>>> >>>>> It could be useful to profile some apps assuming it has exclusive >>>>> access. >>>>> >>>>> However, in general, the API is not reliable from a user point as >>>>> the mode requested can be overridden by some other job. Then a >>>>> better thing to do is to document that and avoid the extra stuff >>>>> around it. >>>>> >>>> As I mentioned before, like any PM feature, the benefits can be seen >>>> only while running consistant workloads for long time. I an still >>>> add a doc note in the UAPI page. >>>> >>> >>> >>> a) What is the goal of the API? Is it guaranteeing the job to run >>> under a workprofile mode or something else? >> >> No, it does not guarentee anything. If you see the cover letter, it >> just provides an interface to an app to submit workload under a power >> profile which can be more suitable for its workload type. As I >> mentioned, it could be very useful for many scenarios like fullscreen >> 3D / fullscreen MM scenarios. It could also allow a system-gfx-manager >> to shift load balance towards one type of workload. There are many >> applications, once the UAPI is in place. >> >>> >>> b) If it's to guarantee work profile mode, does it really guarantee >>> that - the answer is NO when some other job is running. It may or may >>> not work is the answer. 
>>> >>> c) What is the difference between one job resetting the profile mode >>> to NONE vs another job change the mode to say VIDEO when the original >>> request is for COMPUTE? While that is the case, what is the use of >>> any sort of 'pseudo-protection' other than running some code to do >>> extra lock/unlock stuff. >>> >> >> Your understanding of protection is wrong here. There is intentionally >> no protection for a job changing another job's set workload profile, >> coz in that was we will end up seriazling/bottlenecking workload >> submission until PM profile is ready to be changed, which takes away >> benefit of having multiple queues of parallel submission. >> >> The protection provided by the ref counter is to avoid the clearing of >> the profile (to NONE), while another workload is in execution. The >> difference between NONE and VIDEO is still that NONE is the default >> profile without any fine tuning, and VIDEO is still fine tuned for >> VIDEO type of workloads. >> > > Protection 1 is - mutex_lock(&adev->pm.smu_workload_lock); > > The line that follows is amdgpu_dpm_switch_power_profile() - this one > will allow only single client use- two jobs won't be able to switch at > the same time. All *dpm* APIs are protected like that. > this also protects the pm.workload_mode variable which is being set after the amdgpu_dpm_switch_power_profile call is successful here: adev->pm.workload_mode = hint; > Protection 2 is - ref counter. > > It helps only in this kind of scenario when two jobs requested the same > mode successively - > Job 1 requested compute > Job 2 requested compute > Job 1 ends (doesnt't reset) > > Scenario - 2 > Job 1 requested compute > Job 2 requested compute > Job 3 requested 3D > Job 1 ends (doesnt't reset, it continues in 3D) > > In this mixed scenario case, I would say NONE is much more optimized as > it's under FW control. Actually, it does much more fine tuning because > of its background data collection. 
> It helps in mixed scenarios as well, consider this scenario: Job 1 requests: 3D Job 2 requests: Media Job 1 finishes, but job 2 is ongoing Job 1 calls reset(), but checks the counter is non-zero and doesn't reset So the media workload continues in Media mode, not None. - Shashank >> In the end, *again* the actual benefit comes when consistant workload >> is submitted for a long time, like fullscreen 3D game playback, >> fullscreen Video movie playback, and so on. >> > > "only under consistent", doesn't justify any software protection logic. > Again, if the workload is consistent most likely PMFW could be managing > it better. > > Thanks, > Lijo > >> - Shashank >> >>> Thanks, >>> Lijo >>> >>>> - Shashank >>>> >>>>> Thanks, >>>>> Lijo >>>>> >>>>>> - Shashank >>>>>> >>>>>>> Thanks, >>>>>>> Lijo >>>>>>> >>>>>>>> + >>>>>>>> +unlock: >>>>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>>>> + return ret; >>>>>>>> +} >>>>>>>> + >>>>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>>>> + uint32_t hint) >>>>>>>> +{ >>>>>>>> + int ret = 0; >>>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>>> + >>>>>>>> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >>>>>>>> + return 0; >>>>>>>> + >>>>>>>> + /* Do not reset GPU power profile if another reset is >>>>>>>> coming */ >>>>>>>> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >>>>>>>> + return 0; >>>>>>>> + >>>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>>>> + >>>>>>>> + if (adev->pm.workload_mode != hint) >>>>>>>> + goto unlock; >>>>>>>> + >>>>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >>>>>>>> + if (!ret) >>>>>>>> + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >>>>>>>> + >>>>>>>> +unlock: >>>>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>>>> + return ret; >>>>>>>> +} >>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c 
>>>>>>>> index be7aff2d4a57..1f0f64662c04 100644 >>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct >>>>>>>> amdgpu_device *adev, >>>>>>>> mutex_init(&adev->psp.mutex); >>>>>>>> mutex_init(&adev->notifier_lock); >>>>>>>> mutex_init(&adev->pm.stable_pstate_ctx_lock); >>>>>>>> + mutex_init(&adev->pm.smu_workload_lock); >>>>>>>> mutex_init(&adev->benchmark_mutex); >>>>>>>> amdgpu_device_init_apu_flags(adev); >>>>>>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>> new file mode 100644 >>>>>>>> index 000000000000..6060fc53c3b0 >>>>>>>> --- /dev/null >>>>>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>> @@ -0,0 +1,54 @@ >>>>>>>> +/* >>>>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>>>> + * >>>>>>>> + * Permission is hereby granted, free of charge, to any person >>>>>>>> obtaining a >>>>>>>> + * copy of this software and associated documentation files >>>>>>>> (the "Software"), >>>>>>>> + * to deal in the Software without restriction, including >>>>>>>> without limitation >>>>>>>> + * the rights to use, copy, modify, merge, publish, distribute, >>>>>>>> sublicense, >>>>>>>> + * and/or sell copies of the Software, and to permit persons to >>>>>>>> whom the >>>>>>>> + * Software is furnished to do so, subject to the following >>>>>>>> conditions: >>>>>>>> + * >>>>>>>> + * The above copyright notice and this permission notice shall >>>>>>>> be included in >>>>>>>> + * all copies or substantial portions of the Software. >>>>>>>> + * >>>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>>>> KIND, EXPRESS OR >>>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>>> MERCHANTABILITY, >>>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO >>>>>>>> EVENT SHALL >>>>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY >>>>>>>> CLAIM, DAMAGES OR >>>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>>>> OTHERWISE, >>>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >>>>>>>> THE USE OR >>>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. >>>>>>>> + * >>>>>>>> + */ >>>>>>>> +#ifndef _AMDGPU_CTX_WL_H_ >>>>>>>> +#define _AMDGPU_CTX_WL_H_ >>>>>>>> +#include <drm/amdgpu_drm.h> >>>>>>>> +#include "amdgpu.h" >>>>>>>> + >>>>>>>> +/* Workload mode names */ >>>>>>>> +static const char * const amdgpu_workload_mode_name[] = { >>>>>>>> + "None", >>>>>>>> + "3D", >>>>>>>> + "Video", >>>>>>>> + "VR", >>>>>>>> + "Compute", >>>>>>>> + "Unknown", >>>>>>>> +}; >>>>>>>> + >>>>>>>> +static inline const >>>>>>>> +char *amdgpu_workload_profile_name(uint32_t profile) >>>>>>>> +{ >>>>>>>> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >>>>>>>> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >>>>>>>> + return >>>>>>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; >>>>>>>> + >>>>>>>> + return >>>>>>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >>>>>>>> +} >>>>>>>> + >>>>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>>>> + uint32_t hint); >>>>>>>> + >>>>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>>>> + uint32_t hint); >>>>>>>> + >>>>>>>> +#endif >>>>>>>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>> index 65624d091ed2..565131f789d0 100644 >>>>>>>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>> @@ -361,6 +361,11 @@ struct amdgpu_pm { >>>>>>>> struct mutex stable_pstate_ctx_lock; >>>>>>>> struct amdgpu_ctx *stable_pstate_ctx; >>>>>>>> + /* SMU workload mode */ >>>>>>>> + struct mutex smu_workload_lock; >>>>>>>> + uint32_t workload_mode; >>>>>>>> + atomic_t workload_switch_ref; 
>>>>>>>> + >>>>>>>> struct config_table_setting config_table; >>>>>>>> /* runtime mode */ >>>>>>>> enum amdgpu_runpm_mode rpm_mode; >>>>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 13:47 ` Sharma, Shashank @ 2022-09-27 14:00 ` Lazar, Lijo 2022-09-27 14:20 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-27 14:00 UTC (permalink / raw) To: Sharma, Shashank, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 7:17 PM, Sharma, Shashank wrote: > > > On 9/27/2022 3:29 PM, Lazar, Lijo wrote: >> >> >> On 9/27/2022 6:23 PM, Sharma, Shashank wrote: >>> >>> >>> On 9/27/2022 2:39 PM, Lazar, Lijo wrote: >>>> >>>> >>>> On 9/27/2022 5:53 PM, Sharma, Shashank wrote: >>>>> >>>>> >>>>> On 9/27/2022 2:10 PM, Lazar, Lijo wrote: >>>>>> >>>>>> >>>>>> On 9/27/2022 5:11 PM, Sharma, Shashank wrote: >>>>>>> >>>>>>> >>>>>>> On 9/27/2022 11:58 AM, Lazar, Lijo wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>>>>>>> This patch adds new functions which will allow a user to >>>>>>>>> change the GPU power profile based a GPU workload hint >>>>>>>>> flag. 
>>>>>>>>> >>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>>>> --- >>>>>>>>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >>>>>>>>> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 >>>>>>>>> +++++++++++++++++++ >>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >>>>>>>>> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >>>>>>>>> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >>>>>>>>> 5 files changed, 158 insertions(+), 1 deletion(-) >>>>>>>>> create mode 100644 >>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>> create mode 100644 >>>>>>>>> drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>> index 5a283d12f8e1..34679c657ecc 100644 >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >>>>>>>>> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >>>>>>>>> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >>>>>>>>> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >>>>>>>>> amdgpu_pll.o \ >>>>>>>>> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ >>>>>>>>> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>>>>>>> amdgpu_ctx_workload.o amdgpu_sync.o \ >>>>>>>>> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >>>>>>>>> amdgpu_virt.o \ >>>>>>>>> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >>>>>>>>> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>> new file mode 100644 >>>>>>>>> index 000000000000..a11cf29bc388 >>>>>>>>> --- /dev/null >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>> @@ -0,0 +1,97 @@ >>>>>>>>> +/* 
>>>>>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>>>>> + * >>>>>>>>> + * Permission is hereby granted, free of charge, to any person >>>>>>>>> obtaining a >>>>>>>>> + * copy of this software and associated documentation files >>>>>>>>> (the "Software"), >>>>>>>>> + * to deal in the Software without restriction, including >>>>>>>>> without limitation >>>>>>>>> + * the rights to use, copy, modify, merge, publish, >>>>>>>>> distribute, sublicense, >>>>>>>>> + * and/or sell copies of the Software, and to permit persons >>>>>>>>> to whom the >>>>>>>>> + * Software is furnished to do so, subject to the following >>>>>>>>> conditions: >>>>>>>>> + * >>>>>>>>> + * The above copyright notice and this permission notice shall >>>>>>>>> be included in >>>>>>>>> + * all copies or substantial portions of the Software. >>>>>>>>> + * >>>>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>>>>> KIND, EXPRESS OR >>>>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>>>> MERCHANTABILITY, >>>>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN >>>>>>>>> NO EVENT SHALL >>>>>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY >>>>>>>>> CLAIM, DAMAGES OR >>>>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>>>>> OTHERWISE, >>>>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >>>>>>>>> THE USE OR >>>>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>>>>>>> + * >>>>>>>>> + */ >>>>>>>>> +#include <drm/drm.h> >>>>>>>>> +#include "kgd_pp_interface.h" >>>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>>> + >>>>>>>>> +static enum PP_SMC_POWER_PROFILE >>>>>>>>> +amdgpu_workload_to_power_profile(uint32_t hint) >>>>>>>>> +{ >>>>>>>>> + switch (hint) { >>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >>>>>>>>> + default: >>>>>>>>> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >>>>>>>>> + >>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >>>>>>>>> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >>>>>>>>> + return PP_SMC_POWER_PROFILE_VIDEO; >>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >>>>>>>>> + return PP_SMC_POWER_PROFILE_VR; >>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >>>>>>>>> + return PP_SMC_POWER_PROFILE_COMPUTE; >>>>>>>>> + } >>>>>>>>> +} >>>>>>>>> + >>>>>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>>>>> + uint32_t hint) >>>>>>>>> +{ >>>>>>>>> + int ret = 0; >>>>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>>>> + >>>>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>>>> + return 0; >>>>>>>>> + >>>>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>>>> >>>>>>>> If it's all about pm subsystem variable accesses, this API >>>>>>>> should rather be inside amd/pm subsystem. No need to expose the >>>>>>>> variable outside pm subsytem. Also currently all amdgpu_dpm* >>>>>>>> calls are protected under one mutex. Then this extra lock won't >>>>>>>> be needed. >>>>>>>> >>>>>>> >>>>>>> This is tricky, this is not all about PM subsystem. Note that the >>>>>>> job management and scheduling is handled into amdgpu_ctx, so the >>>>>>> workload hint is set in context_management API. The API is >>>>>>> consumed when the job is actually run from amdgpu_run() layer. So >>>>>>> its a joint interface between context and PM. 
>>>>>>> >>>>>> >>>>>> If you take out amdgpu_workload_to_power_profile() line, >>>>>> everything else looks to touch only pm variables/functions. >>>>> >>>>> That's not a line, that function converts a AMGPU_CTX hint to PPM >>>>> profile. And going by that logic, this whole code was kept in the >>>>> amdgpu_ctx.c file as well, coz this code is consuming the PM API. >>>>> So to avoid these conflicts and having a new file is a better idea. >>>>> >>>>> You could still keep a >>>>>> wrapper though. Also dpm_* functions are protected, so the extra >>>>>> mutex can be avoided as well. >>>>>> >>>>> The lock also protects pm.workload_mode writes. >>>>> >>>>>>>>> + >>>>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>>>> + goto unlock; >>>>>>>>> + >>>>>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>>>>>>>> + if (!ret) >>>>>>>>> + adev->pm.workload_mode = hint; >>>>>>>>> + atomic_inc(&adev->pm.workload_switch_ref); >>>>>>>> >>>>>>>> Why is this reference kept? The swtiching happens inside a lock >>>>>>>> and there is already a check not to switch if the hint matches >>>>>>>> with current workload. >>>>>>>> >>>>>>> >>>>>>> This reference is kept so that we would not reset the PM mode to >>>>>>> DEFAULT when some other context has switched the PP mode. If you >>>>>>> see the 4th patch, the PM mode will be changed when the job in >>>>>>> that context is run, and a pm_reset function will be scheduled >>>>>>> when the job is done. But in between if another job from another >>>>>>> context has changed the PM mode, the refrence count will prevent >>>>>>> us from resetting the PM mode. >>>>>>> >>>>>> >>>>>> This helps only if multiple jobs request the same mode. If they >>>>>> request different modes, then this is not helping much. >>>>> >>>>> No that's certainly not the case. It's a counter, whose aim is to >>>>> allow a PP reset only when the counter is 0. Do note that the >>>>> reset() happens only in the job_free_cb(), which gets schedule >>>>> later. 
If this counter is not zero, which means another work has >>>>> changed the profile in between, and we should not reset it. >>>>> >>>>>> >>>>>> It could be useful to profile some apps assuming it has exclusive >>>>>> access. >>>>>> >>>>>> However, in general, the API is not reliable from a user point as >>>>>> the mode requested can be overridden by some other job. Then a >>>>>> better thing to do is to document that and avoid the extra stuff >>>>>> around it. >>>>>> >>>>> As I mentioned before, like any PM feature, the benefits can be >>>>> seen only while running consistant workloads for long time. I an >>>>> still add a doc note in the UAPI page. >>>>> >>>> >>>> >>>> a) What is the goal of the API? Is it guaranteeing the job to run >>>> under a workprofile mode or something else? >>> >>> No, it does not guarentee anything. If you see the cover letter, it >>> just provides an interface to an app to submit workload under a power >>> profile which can be more suitable for its workload type. As I >>> mentioned, it could be very useful for many scenarios like fullscreen >>> 3D / fullscreen MM scenarios. It could also allow a >>> system-gfx-manager to shift load balance towards one type of >>> workload. There are many applications, once the UAPI is in place. >>> >>>> >>>> b) If it's to guarantee work profile mode, does it really guarantee >>>> that - the answer is NO when some other job is running. It may or >>>> may not work is the answer. >>>> >>>> c) What is the difference between one job resetting the profile mode >>>> to NONE vs another job change the mode to say VIDEO when the >>>> original request is for COMPUTE? While that is the case, what is the >>>> use of any sort of 'pseudo-protection' other than running some code >>>> to do extra lock/unlock stuff. >>>> >>> >>> Your understanding of protection is wrong here. 
There is >>> intentionally no protection for a job changing another job's set >>> workload profile, coz in that was we will end up >>> seriazling/bottlenecking workload submission until PM profile is >>> ready to be changed, which takes away benefit of having multiple >>> queues of parallel submission. >>> >>> The protection provided by the ref counter is to avoid the clearing >>> of the profile (to NONE), while another workload is in execution. The >>> difference between NONE and VIDEO is still that NONE is the default >>> profile without any fine tuning, and VIDEO is still fine tuned for >>> VIDEO type of workloads. >>> >> >> Protection 1 is - mutex_lock(&adev->pm.smu_workload_lock); >> >> The line that follows is amdgpu_dpm_switch_power_profile() - this one >> will allow only single client use- two jobs won't be able to switch at >> the same time. All *dpm* APIs are protected like that. >> > > this also protects the pm.workload_mode variable which is being set > after the amdgpu_dpm_switch_power_profile call is successful here: > adev->pm.workload_mode = hint; > >> Protection 2 is - ref counter. >> >> It helps only in this kind of scenario when two jobs requested the >> same mode successively - >> Job 1 requested compute >> Job 2 requested compute >> Job 1 ends (doesnt't reset) >> >> Scenario - 2 >> Job 1 requested compute >> Job 2 requested compute >> Job 3 requested 3D >> Job 1 ends (doesnt't reset, it continues in 3D) >> >> In this mixed scenario case, I would say NONE is much more optimized >> as it's under FW control. Actually, it does much more fine tuning >> because of its background data collection. >> > > > It helps in mixed scenarios as well, consider this scenario: > Job 1 requests: 3D > Job 2 requests: Media Ok, let's take this as the example. 
Protection case: Job 1 requests: 3D => adev->pm.workload_mode = 3D; and protected by mutex_lock(&adev->pm.smu_workload_lock) Job 2 requests => adev->pm.workload_mode = Media; What is the use of this variable then? Two jobs can come at different times and change it independently? Any use in keeping this? Some other job came in and changed to some other value. So, what is the use of this lock finally? Use case: Job 1 requests: 3D Job 2 requests: Media Job 1 now runs under Media. What is achieved considering the intent of the API and extra CPU cycles run to protect nothing? Thanks, Lijo > Job 1 finishes, but job 2 is ongoing > Job 1 calls reset(), but checks the counter is non-zero and doesn't reset > > So the media workload continues in Media mode, not None. > > - Shashank > >>> In the end, *again* the actual benefit comes when consistant workload >>> is submitted for a long time, like fullscreen 3D game playback, >>> fullscreen Video movie playback, and so on. >>> >> >> "only under consistent", doesn't justify any software protection >> logic. Again, if the workload is consistent most likely PMFW could be >> managing it better. 
>> >> Thanks, >> Lijo >> >>> - Shashank >>> >>>> Thanks, >>>> Lijo >>>> >>>>> - Shashank >>>>> >>>>>> Thanks, >>>>>> Lijo >>>>>> >>>>>>> - Shashank >>>>>>> >>>>>>>> Thanks, >>>>>>>> Lijo >>>>>>>> >>>>>>>>> + >>>>>>>>> +unlock: >>>>>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>>>>> + return ret; >>>>>>>>> +} >>>>>>>>> + >>>>>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>>>>> + uint32_t hint) >>>>>>>>> +{ >>>>>>>>> + int ret = 0; >>>>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>>>> + >>>>>>>>> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >>>>>>>>> + return 0; >>>>>>>>> + >>>>>>>>> + /* Do not reset GPU power profile if another reset is >>>>>>>>> coming */ >>>>>>>>> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >>>>>>>>> + return 0; >>>>>>>>> + >>>>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>>>>> + >>>>>>>>> + if (adev->pm.workload_mode != hint) >>>>>>>>> + goto unlock; >>>>>>>>> + >>>>>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >>>>>>>>> + if (!ret) >>>>>>>>> + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >>>>>>>>> + >>>>>>>>> +unlock: >>>>>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>>>>> + return ret; >>>>>>>>> +} >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>> index be7aff2d4a57..1f0f64662c04 100644 >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct >>>>>>>>> amdgpu_device *adev, >>>>>>>>> mutex_init(&adev->psp.mutex); >>>>>>>>> mutex_init(&adev->notifier_lock); >>>>>>>>> mutex_init(&adev->pm.stable_pstate_ctx_lock); >>>>>>>>> + mutex_init(&adev->pm.smu_workload_lock); >>>>>>>>> mutex_init(&adev->benchmark_mutex); >>>>>>>>> amdgpu_device_init_apu_flags(adev); >>>>>>>>> diff --git 
a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>> new file mode 100644 >>>>>>>>> index 000000000000..6060fc53c3b0 >>>>>>>>> --- /dev/null >>>>>>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>> @@ -0,0 +1,54 @@ >>>>>>>>> +/* >>>>>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>>>>> + * >>>>>>>>> + * Permission is hereby granted, free of charge, to any person >>>>>>>>> obtaining a >>>>>>>>> + * copy of this software and associated documentation files >>>>>>>>> (the "Software"), >>>>>>>>> + * to deal in the Software without restriction, including >>>>>>>>> without limitation >>>>>>>>> + * the rights to use, copy, modify, merge, publish, >>>>>>>>> distribute, sublicense, >>>>>>>>> + * and/or sell copies of the Software, and to permit persons >>>>>>>>> to whom the >>>>>>>>> + * Software is furnished to do so, subject to the following >>>>>>>>> conditions: >>>>>>>>> + * >>>>>>>>> + * The above copyright notice and this permission notice shall >>>>>>>>> be included in >>>>>>>>> + * all copies or substantial portions of the Software. >>>>>>>>> + * >>>>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>>>>> KIND, EXPRESS OR >>>>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>>>> MERCHANTABILITY, >>>>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN >>>>>>>>> NO EVENT SHALL >>>>>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY >>>>>>>>> CLAIM, DAMAGES OR >>>>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>>>>> OTHERWISE, >>>>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >>>>>>>>> THE USE OR >>>>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>>>>>>> + * >>>>>>>>> + */ >>>>>>>>> +#ifndef _AMDGPU_CTX_WL_H_ >>>>>>>>> +#define _AMDGPU_CTX_WL_H_ >>>>>>>>> +#include <drm/amdgpu_drm.h> >>>>>>>>> +#include "amdgpu.h" >>>>>>>>> + >>>>>>>>> +/* Workload mode names */ >>>>>>>>> +static const char * const amdgpu_workload_mode_name[] = { >>>>>>>>> + "None", >>>>>>>>> + "3D", >>>>>>>>> + "Video", >>>>>>>>> + "VR", >>>>>>>>> + "Compute", >>>>>>>>> + "Unknown", >>>>>>>>> +}; >>>>>>>>> + >>>>>>>>> +static inline const >>>>>>>>> +char *amdgpu_workload_profile_name(uint32_t profile) >>>>>>>>> +{ >>>>>>>>> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >>>>>>>>> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >>>>>>>>> + return >>>>>>>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; >>>>>>>>> + >>>>>>>>> + return >>>>>>>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >>>>>>>>> +} >>>>>>>>> + >>>>>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>>>>> + uint32_t hint); >>>>>>>>> + >>>>>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>>>>> + uint32_t hint); >>>>>>>>> + >>>>>>>>> +#endif >>>>>>>>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>> index 65624d091ed2..565131f789d0 100644 >>>>>>>>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>> @@ -361,6 +361,11 @@ struct amdgpu_pm { >>>>>>>>> struct mutex stable_pstate_ctx_lock; >>>>>>>>> struct amdgpu_ctx *stable_pstate_ctx; >>>>>>>>> + /* SMU workload mode */ >>>>>>>>> + struct mutex smu_workload_lock; >>>>>>>>> + uint32_t workload_mode; >>>>>>>>> + atomic_t workload_switch_ref; >>>>>>>>> + >>>>>>>>> struct config_table_setting config_table; >>>>>>>>> /* runtime mode */ >>>>>>>>> enum amdgpu_runpm_mode rpm_mode; >>>>>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 14:00 ` Lazar, Lijo @ 2022-09-27 14:20 ` Sharma, Shashank 2022-09-27 14:34 ` Lazar, Lijo 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 14:20 UTC (permalink / raw) To: Lazar, Lijo, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 4:00 PM, Lazar, Lijo wrote: > > > On 9/27/2022 7:17 PM, Sharma, Shashank wrote: >> >> >> On 9/27/2022 3:29 PM, Lazar, Lijo wrote: >>> >>> >>> On 9/27/2022 6:23 PM, Sharma, Shashank wrote: >>>> >>>> >>>> On 9/27/2022 2:39 PM, Lazar, Lijo wrote: >>>>> >>>>> >>>>> On 9/27/2022 5:53 PM, Sharma, Shashank wrote: >>>>>> >>>>>> >>>>>> On 9/27/2022 2:10 PM, Lazar, Lijo wrote: >>>>>>> >>>>>>> >>>>>>> On 9/27/2022 5:11 PM, Sharma, Shashank wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 9/27/2022 11:58 AM, Lazar, Lijo wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>>>>>>>> This patch adds new functions which will allow a user to >>>>>>>>>> change the GPU power profile based a GPU workload hint >>>>>>>>>> flag. 
>>>>>>>>>> >>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>>>>> --- >>>>>>>>>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >>>>>>>>>> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 >>>>>>>>>> +++++++++++++++++++ >>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >>>>>>>>>> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >>>>>>>>>> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >>>>>>>>>> 5 files changed, 158 insertions(+), 1 deletion(-) >>>>>>>>>> create mode 100644 >>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>>> create mode 100644 >>>>>>>>>> drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>> index 5a283d12f8e1..34679c657ecc 100644 >>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >>>>>>>>>> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >>>>>>>>>> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >>>>>>>>>> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >>>>>>>>>> amdgpu_pll.o \ >>>>>>>>>> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ >>>>>>>>>> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>>>>>>>> amdgpu_ctx_workload.o amdgpu_sync.o \ >>>>>>>>>> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >>>>>>>>>> amdgpu_virt.o \ >>>>>>>>>> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >>>>>>>>>> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ >>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>>> new file mode 100644 >>>>>>>>>> index 000000000000..a11cf29bc388 >>>>>>>>>> --- /dev/null >>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c 
>>>>>>>>>> @@ -0,0 +1,97 @@ >>>>>>>>>> +/* >>>>>>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>>>>>> + * >>>>>>>>>> + * Permission is hereby granted, free of charge, to any >>>>>>>>>> person obtaining a >>>>>>>>>> + * copy of this software and associated documentation files >>>>>>>>>> (the "Software"), >>>>>>>>>> + * to deal in the Software without restriction, including >>>>>>>>>> without limitation >>>>>>>>>> + * the rights to use, copy, modify, merge, publish, >>>>>>>>>> distribute, sublicense, >>>>>>>>>> + * and/or sell copies of the Software, and to permit persons >>>>>>>>>> to whom the >>>>>>>>>> + * Software is furnished to do so, subject to the following >>>>>>>>>> conditions: >>>>>>>>>> + * >>>>>>>>>> + * The above copyright notice and this permission notice >>>>>>>>>> shall be included in >>>>>>>>>> + * all copies or substantial portions of the Software. >>>>>>>>>> + * >>>>>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>>>>>> KIND, EXPRESS OR >>>>>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>>>>> MERCHANTABILITY, >>>>>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN >>>>>>>>>> NO EVENT SHALL >>>>>>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY >>>>>>>>>> CLAIM, DAMAGES OR >>>>>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>>>>>> OTHERWISE, >>>>>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >>>>>>>>>> THE USE OR >>>>>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>>>>>>>> + * >>>>>>>>>> + */ >>>>>>>>>> +#include <drm/drm.h> >>>>>>>>>> +#include "kgd_pp_interface.h" >>>>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>>>> + >>>>>>>>>> +static enum PP_SMC_POWER_PROFILE >>>>>>>>>> +amdgpu_workload_to_power_profile(uint32_t hint) >>>>>>>>>> +{ >>>>>>>>>> + switch (hint) { >>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >>>>>>>>>> + default: >>>>>>>>>> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >>>>>>>>>> + >>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >>>>>>>>>> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >>>>>>>>>> + return PP_SMC_POWER_PROFILE_VIDEO; >>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >>>>>>>>>> + return PP_SMC_POWER_PROFILE_VR; >>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >>>>>>>>>> + return PP_SMC_POWER_PROFILE_COMPUTE; >>>>>>>>>> + } >>>>>>>>>> +} >>>>>>>>>> + >>>>>>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>>>>>> + uint32_t hint) >>>>>>>>>> +{ >>>>>>>>>> + int ret = 0; >>>>>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>>>>> + >>>>>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>>>>> + return 0; >>>>>>>>>> + >>>>>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>>>>> >>>>>>>>> If it's all about pm subsystem variable accesses, this API >>>>>>>>> should rather be inside amd/pm subsystem. No need to expose the >>>>>>>>> variable outside pm subsytem. Also currently all amdgpu_dpm* >>>>>>>>> calls are protected under one mutex. Then this extra lock won't >>>>>>>>> be needed. >>>>>>>>> >>>>>>>> >>>>>>>> This is tricky, this is not all about PM subsystem. Note that >>>>>>>> the job management and scheduling is handled into amdgpu_ctx, so >>>>>>>> the workload hint is set in context_management API. The API is >>>>>>>> consumed when the job is actually run from amdgpu_run() layer. >>>>>>>> So its a joint interface between context and PM. 
>>>>>>>> >> >>>>>>> If you take out amdgpu_workload_to_power_profile() line, >>>>>>> everything else looks to touch only pm variables/functions. >>>>>> That's not a line, that function converts an AMDGPU_CTX hint to a PP >>>>>> profile. And going by that logic, this whole code was kept in the >>>>>> amdgpu_ctx.c file as well, coz this code is consuming the PM API. >>>>>> So to avoid these conflicts and having a new file is a better idea. >>>>>> You could still keep a >>>>>>> wrapper though. Also dpm_* functions are protected, so the extra >>>>>>> mutex can be avoided as well. >>>>>> The lock also protects pm.workload_mode writes. >>>>>>>>>> + >>>>>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>>>>> + goto unlock; >>>>>>>>>> + >>>>>>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>>>>>>>>> + if (!ret) >>>>>>>>>> + adev->pm.workload_mode = hint; >>>>>>>>>> + atomic_inc(&adev->pm.workload_switch_ref); >>>>>>>>> >>>>>>>>> Why is this reference kept? The switching happens inside a lock >>>>>>>>> and there is already a check not to switch if the hint matches >>>>>>>>> with current workload. >>>>>>>>> >>>>>>>> >>>>>>>> This reference is kept so that we would not reset the PM mode to >>>>>>>> DEFAULT when some other context has switched the PP mode. If you >>>>>>>> see the 4th patch, the PM mode will be changed when the job in >>>>>>>> that context is run, and a pm_reset function will be scheduled >>>>>>>> when the job is done. But in between if another job from another >>>>>>>> context has changed the PM mode, the reference count will prevent >>>>>>>> us from resetting the PM mode. >>>>>>>> >>>>>>> >>>>>>> This helps only if multiple jobs request the same mode. If they >>>>>>> request different modes, then this is not helping much. >>>>>> >>>>>> No that's certainly not the case. It's a counter, whose aim is to >>>>>> allow a PP reset only when the counter is 0.
Do note that the >>>>>> reset() happens only in the job_free_cb(), which gets scheduled >>>>>> later. If this counter is not zero, it means another work has >>>>>> changed the profile in between, and we should not reset it. >>>>>> >>>>>>> >>>>>>> It could be useful to profile some apps assuming it has exclusive >>>>>>> access. >>>>>>> >>>>>>> However, in general, the API is not reliable from a user point as >>>>>>> the mode requested can be overridden by some other job. Then a >>>>>>> better thing to do is to document that and avoid the extra stuff >>>>>>> around it. >>>>>>> >>>>>> As I mentioned before, like any PM feature, the benefits can be >>>>>> seen only while running consistent workloads for a long time. I can >>>>>> still add a doc note in the UAPI page. >>>>>> >>>>> >>>>> >>>>> a) What is the goal of the API? Is it guaranteeing the job to run >>>>> under a workprofile mode or something else? >>>> >>>> No, it does not guarantee anything. If you see the cover letter, it >>>> just provides an interface to an app to submit workload under a >>>> power profile which can be more suitable for its workload type. As I >>>> mentioned, it could be very useful for many scenarios like >>>> fullscreen 3D / fullscreen MM scenarios. It could also allow a >>>> system-gfx-manager to shift load balance towards one type of >>>> workload. There are many applications, once the UAPI is in place. >>>> >>>>> >>>>> b) If it's to guarantee work profile mode, does it really guarantee >>>>> that - the answer is NO when some other job is running. It may or >>>>> may not work is the answer. >>>>> >>>>> c) What is the difference between one job resetting the profile >>>>> mode to NONE vs another job change the mode to say VIDEO when the >>>>> original request is for COMPUTE? While that is the case, what is >>>>> the use of any sort of 'pseudo-protection' other than running some >>>>> code to do extra lock/unlock stuff. >>>>> >>>> >>>> Your understanding of protection is wrong here.
There is >>>> intentionally no protection for a job changing another job's set >>>> workload profile, coz in that way we will end up >>>> serializing/bottlenecking workload submission until the PM profile is >>>> ready to be changed, which takes away the benefit of having multiple >>>> queues of parallel submission. >>>> >>>> The protection provided by the ref counter is to avoid the clearing >>>> of the profile (to NONE), while another workload is in execution. >>>> The difference between NONE and VIDEO is still that NONE is the >>>> default profile without any fine tuning, and VIDEO is still fine >>>> tuned for VIDEO type of workloads. >>>> >>> >>> Protection 1 is - mutex_lock(&adev->pm.smu_workload_lock); >>> >>> The line that follows is amdgpu_dpm_switch_power_profile() - this one >>> will allow only single client use - two jobs won't be able to switch >>> at the same time. All *dpm* APIs are protected like that. >>> >> >> this also protects the pm.workload_mode variable which is being set >> after the amdgpu_dpm_switch_power_profile call is successful here: >> adev->pm.workload_mode = hint; >> >>> Protection 2 is - ref counter. >>> >>> It helps only in this kind of scenario when two jobs requested the >>> same mode successively - >>> Job 1 requested compute >>> Job 2 requested compute >>> Job 1 ends (doesn't reset) >>> >>> Scenario - 2 >>> Job 1 requested compute >>> Job 2 requested compute >>> Job 3 requested 3D >>> Job 1 ends (doesn't reset, it continues in 3D) >>> >>> In this mixed scenario case, I would say NONE is much more optimized >>> as it's under FW control. Actually, it does much more fine tuning >>> because of its background data collection. >>> >> >> >> It helps in mixed scenarios as well, consider this scenario: >> Job 1 requests: 3D >> Job 2 requests: Media > > Ok, let's take this as the example.
> > Protection case : > > Job 1 requests: 3D => adev->pm.workload_mode = 3D; and protected by > mutex_lock(&adev->pm.smu_workload_lock) > > Job 2 requests => adev->pm.workload_mode = Media; > > What is the use of this variable then? Two jobs can come at different > times and change it independently? Any use in keeping this? > Some other job came in and changed to some other value. So, what is the > use of this lock finally? > ?? The locks are not to save the variable from being changed, but to save the variable from being changed out of context. If two threads try to change it at the same time, one of them will have to wait until the other critical section is done executing. Do note that this variable is changed only when the amdgpu_dpm_switch_power_profile() call is successful. Going by the same logic, what is the use of having these pm locks inside the function dpm_switch_power_profile(), as Job 1 changed the power profile to 3D, and Job 2 changed it to media :) ? Using those locks does not prevent changing the PM profile, it makes sure that it happens in a serialized way. > Use case: > > Job 1 requests: 3D > Job 2 requests: Media > > Job 1 now runs under Media. What is achieved considering the intent of > the API and extra CPU cycles run to protect nothing? > This is how it is intended to work, I have explained this multiple times before that we do not want to block the change in PP from two different jobs. The lock is to protect the concurrency sequence, not a change in mode: without that lock, in the worst case scenario: Thread 1: Job 1 requests: 3D PM mode changed to: 3D just before writing (adev->pm.workload_mode = 3d) this thread schedules out Thread 2: Job 2 requests: Media PM mode changed to: Media adev->pm.workload_mode = media Thread 1 schedules in: adev->pm.workload_mode = 3d but PM mode is media. State machine broken here. So the lock is to provide sequential execution of the code.
If your suggestion is we should not let the mode get changed until one job is done executing, that's a different discussion and certainly not being reflected from what you wrote above. - Shashank > Thanks, > Lijo >> Job 1 finishes, but job 2 is ongoing >> Job 1 calls reset(), but checks the counter is non-zero and doesn't reset >> >> So the media workload continues in Media mode, not None. >> >> - Shashank >> >>>> In the end, *again* the actual benefit comes when a consistent >>>> workload is submitted for a long time, like fullscreen 3D game >>>> playback, fullscreen Video movie playback, and so on. >>>> >>> >>> "only under consistent", doesn't justify any software protection >>> logic. Again, if the workload is consistent most likely PMFW could be >>> managing it better. >>> >>> Thanks, >>> Lijo >>> >>>> - Shashank >>>> >>>>> Thanks, >>>>> Lijo >>>>> >>>>>> - Shashank >>>>>> >>>>>>> Thanks, >>>>>>> Lijo >>>>>>> >>>>>>>> - Shashank >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Lijo >>>>>>>>> >>>>>>>>>> + >>>>>>>>>> +unlock: >>>>>>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>>>>>> + return ret; >>>>>>>>>> +} >>>>>>>>>> + >>>>>>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>>>>>> + uint32_t hint) >>>>>>>>>> +{ >>>>>>>>>> + int ret = 0; >>>>>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>>>>> + >>>>>>>>>> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >>>>>>>>>> + return 0; >>>>>>>>>> + >>>>>>>>>> + /* Do not reset GPU power profile if another reset is >>>>>>>>>> coming */ >>>>>>>>>> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >>>>>>>>>> + return 0; >>>>>>>>>> + >>>>>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>>>>>> + >>>>>>>>>> + if (adev->pm.workload_mode != hint) >>>>>>>>>> + goto unlock; >>>>>>>>>> + >>>>>>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >>>>>>>>>> + if (!ret) >>>>>>>>>> + adev->pm.workload_mode =
AMDGPU_CTX_WORKLOAD_HINT_NONE; >>>>>>>>>> + >>>>>>>>>> +unlock: >>>>>>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>>>>>> + return ret; >>>>>>>>>> +} >>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>> index be7aff2d4a57..1f0f64662c04 100644 >>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct >>>>>>>>>> amdgpu_device *adev, >>>>>>>>>> mutex_init(&adev->psp.mutex); >>>>>>>>>> mutex_init(&adev->notifier_lock); >>>>>>>>>> mutex_init(&adev->pm.stable_pstate_ctx_lock); >>>>>>>>>> + mutex_init(&adev->pm.smu_workload_lock); >>>>>>>>>> mutex_init(&adev->benchmark_mutex); >>>>>>>>>> amdgpu_device_init_apu_flags(adev); >>>>>>>>>> diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>>> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>>> new file mode 100644 >>>>>>>>>> index 000000000000..6060fc53c3b0 >>>>>>>>>> --- /dev/null >>>>>>>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>>> @@ -0,0 +1,54 @@ >>>>>>>>>> +/* >>>>>>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. 
>>>>>>>>>> + * >>>>>>>>>> + * Permission is hereby granted, free of charge, to any >>>>>>>>>> person obtaining a >>>>>>>>>> + * copy of this software and associated documentation files >>>>>>>>>> (the "Software"), >>>>>>>>>> + * to deal in the Software without restriction, including >>>>>>>>>> without limitation >>>>>>>>>> + * the rights to use, copy, modify, merge, publish, >>>>>>>>>> distribute, sublicense, >>>>>>>>>> + * and/or sell copies of the Software, and to permit persons >>>>>>>>>> to whom the >>>>>>>>>> + * Software is furnished to do so, subject to the following >>>>>>>>>> conditions: >>>>>>>>>> + * >>>>>>>>>> + * The above copyright notice and this permission notice >>>>>>>>>> shall be included in >>>>>>>>>> + * all copies or substantial portions of the Software. >>>>>>>>>> + * >>>>>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>>>>>> KIND, EXPRESS OR >>>>>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>>>>> MERCHANTABILITY, >>>>>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN >>>>>>>>>> NO EVENT SHALL >>>>>>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY >>>>>>>>>> CLAIM, DAMAGES OR >>>>>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR >>>>>>>>>> OTHERWISE, >>>>>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR >>>>>>>>>> THE USE OR >>>>>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>>>>>>>> + * >>>>>>>>>> + */ >>>>>>>>>> +#ifndef _AMDGPU_CTX_WL_H_ >>>>>>>>>> +#define _AMDGPU_CTX_WL_H_ >>>>>>>>>> +#include <drm/amdgpu_drm.h> >>>>>>>>>> +#include "amdgpu.h" >>>>>>>>>> + >>>>>>>>>> +/* Workload mode names */ >>>>>>>>>> +static const char * const amdgpu_workload_mode_name[] = { >>>>>>>>>> + "None", >>>>>>>>>> + "3D", >>>>>>>>>> + "Video", >>>>>>>>>> + "VR", >>>>>>>>>> + "Compute", >>>>>>>>>> + "Unknown", >>>>>>>>>> +}; >>>>>>>>>> + >>>>>>>>>> +static inline const >>>>>>>>>> +char *amdgpu_workload_profile_name(uint32_t profile) >>>>>>>>>> +{ >>>>>>>>>> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >>>>>>>>>> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >>>>>>>>>> + return >>>>>>>>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; >>>>>>>>>> + >>>>>>>>>> + return >>>>>>>>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >>>>>>>>>> +} >>>>>>>>>> + >>>>>>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>>>>>> + uint32_t hint); >>>>>>>>>> + >>>>>>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>>>>>> + uint32_t hint); >>>>>>>>>> + >>>>>>>>>> +#endif >>>>>>>>>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>>> index 65624d091ed2..565131f789d0 100644 >>>>>>>>>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>>> @@ -361,6 +361,11 @@ struct amdgpu_pm { >>>>>>>>>> struct mutex stable_pstate_ctx_lock; >>>>>>>>>> struct amdgpu_ctx *stable_pstate_ctx; >>>>>>>>>> + /* SMU workload mode */ >>>>>>>>>> + struct mutex smu_workload_lock; >>>>>>>>>> + uint32_t workload_mode; >>>>>>>>>> + atomic_t workload_switch_ref; >>>>>>>>>> + >>>>>>>>>> struct config_table_setting config_table; >>>>>>>>>> /* runtime mode */ >>>>>>>>>> enum amdgpu_runpm_mode rpm_mode; >>>>>>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 14:20 ` Sharma, Shashank @ 2022-09-27 14:34 ` Lazar, Lijo 2022-09-27 14:50 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-27 14:34 UTC (permalink / raw) To: Sharma, Shashank, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 7:50 PM, Sharma, Shashank wrote: > > > On 9/27/2022 4:00 PM, Lazar, Lijo wrote: >> >> >> On 9/27/2022 7:17 PM, Sharma, Shashank wrote: >>> >>> >>> On 9/27/2022 3:29 PM, Lazar, Lijo wrote: >>>> >>>> >>>> On 9/27/2022 6:23 PM, Sharma, Shashank wrote: >>>>> >>>>> >>>>> On 9/27/2022 2:39 PM, Lazar, Lijo wrote: >>>>>> >>>>>> >>>>>> On 9/27/2022 5:53 PM, Sharma, Shashank wrote: >>>>>>> >>>>>>> >>>>>>> On 9/27/2022 2:10 PM, Lazar, Lijo wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 9/27/2022 5:11 PM, Sharma, Shashank wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 9/27/2022 11:58 AM, Lazar, Lijo wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>>>>>>>>> This patch adds new functions which will allow a user to >>>>>>>>>>> change the GPU power profile based a GPU workload hint >>>>>>>>>>> flag. 
>>>>>>>>>>> >>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>>>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>>>>>> --- >>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >>>>>>>>>>> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 >>>>>>>>>>> +++++++++++++++++++ >>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >>>>>>>>>>> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ >>>>>>>>>>> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >>>>>>>>>>> 5 files changed, 158 insertions(+), 1 deletion(-) >>>>>>>>>>> create mode 100644 >>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>>>> create mode 100644 >>>>>>>>>>> drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>>>> >>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>>> index 5a283d12f8e1..34679c657ecc 100644 >>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>>> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >>>>>>>>>>> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >>>>>>>>>>> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >>>>>>>>>>> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o >>>>>>>>>>> amdgpu_pll.o \ >>>>>>>>>>> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>>>>>>>>> amdgpu_sync.o \ >>>>>>>>>>> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>>>>>>>>> amdgpu_ctx_workload.o amdgpu_sync.o \ >>>>>>>>>>> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o >>>>>>>>>>> amdgpu_virt.o \ >>>>>>>>>>> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >>>>>>>>>>> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ >>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>>>> new file mode 100644 >>>>>>>>>>> index 000000000000..a11cf29bc388 >>>>>>>>>>> --- /dev/null >>>>>>>>>>> +++ 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>>>> @@ -0,0 +1,97 @@ >>>>>>>>>>> +/* >>>>>>>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>>>>>>> + * >>>>>>>>>>> + * Permission is hereby granted, free of charge, to any >>>>>>>>>>> person obtaining a >>>>>>>>>>> + * copy of this software and associated documentation files >>>>>>>>>>> (the "Software"), >>>>>>>>>>> + * to deal in the Software without restriction, including >>>>>>>>>>> without limitation >>>>>>>>>>> + * the rights to use, copy, modify, merge, publish, >>>>>>>>>>> distribute, sublicense, >>>>>>>>>>> + * and/or sell copies of the Software, and to permit persons >>>>>>>>>>> to whom the >>>>>>>>>>> + * Software is furnished to do so, subject to the following >>>>>>>>>>> conditions: >>>>>>>>>>> + * >>>>>>>>>>> + * The above copyright notice and this permission notice >>>>>>>>>>> shall be included in >>>>>>>>>>> + * all copies or substantial portions of the Software. >>>>>>>>>>> + * >>>>>>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>>>>>>> KIND, EXPRESS OR >>>>>>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>>>>>> MERCHANTABILITY, >>>>>>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN >>>>>>>>>>> NO EVENT SHALL >>>>>>>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY >>>>>>>>>>> CLAIM, DAMAGES OR >>>>>>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT >>>>>>>>>>> OR OTHERWISE, >>>>>>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE >>>>>>>>>>> OR THE USE OR >>>>>>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>>>>>>>>> + * >>>>>>>>>>> + */ >>>>>>>>>>> +#include <drm/drm.h> >>>>>>>>>>> +#include "kgd_pp_interface.h" >>>>>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>>>>> + >>>>>>>>>>> +static enum PP_SMC_POWER_PROFILE >>>>>>>>>>> +amdgpu_workload_to_power_profile(uint32_t hint) >>>>>>>>>>> +{ >>>>>>>>>>> + switch (hint) { >>>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >>>>>>>>>>> + default: >>>>>>>>>>> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >>>>>>>>>>> + >>>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >>>>>>>>>>> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >>>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >>>>>>>>>>> + return PP_SMC_POWER_PROFILE_VIDEO; >>>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >>>>>>>>>>> + return PP_SMC_POWER_PROFILE_VR; >>>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >>>>>>>>>>> + return PP_SMC_POWER_PROFILE_COMPUTE; >>>>>>>>>>> + } >>>>>>>>>>> +} >>>>>>>>>>> + >>>>>>>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>>>>>>> + uint32_t hint) >>>>>>>>>>> +{ >>>>>>>>>>> + int ret = 0; >>>>>>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>>>>>> + >>>>>>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>>>>>> + return 0; >>>>>>>>>>> + >>>>>>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>>>>>> >>>>>>>>>> If it's all about pm subsystem variable accesses, this API >>>>>>>>>> should rather be inside amd/pm subsystem. No need to expose >>>>>>>>>> the variable outside pm subsytem. Also currently all >>>>>>>>>> amdgpu_dpm* calls are protected under one mutex. Then this >>>>>>>>>> extra lock won't be needed. >>>>>>>>>> >>>>>>>>> >>>>>>>>> This is tricky, this is not all about PM subsystem. Note that >>>>>>>>> the job management and scheduling is handled into amdgpu_ctx, >>>>>>>>> so the workload hint is set in context_management API. The API >>>>>>>>> is consumed when the job is actually run from amdgpu_run() >>>>>>>>> layer. 
So its a joint interface between context and PM. >>>>>>>>> >>>>>>>> >>>>>>>> If you take out amdgpu_workload_to_power_profile() line, >>>>>>>> everything else looks to touch only pm variables/functions. >>>>>>> >>>>>>> That's not a line, that function converts a AMGPU_CTX hint to PPM >>>>>>> profile. And going by that logic, this whole code was kept in the >>>>>>> amdgpu_ctx.c file as well, coz this code is consuming the PM API. >>>>>>> So to avoid these conflicts and having a new file is a better idea. >>>>>>> >>>>>>> You could still keep a >>>>>>>> wrapper though. Also dpm_* functions are protected, so the extra >>>>>>>> mutex can be avoided as well. >>>>>>>> >>>>>>> The lock also protects pm.workload_mode writes. >>>>>>> >>>>>>>>>>> + >>>>>>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>>>>>> + goto unlock; >>>>>>>>>>> + >>>>>>>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>>>>>>>>>> + if (!ret) >>>>>>>>>>> + adev->pm.workload_mode = hint; >>>>>>>>>>> + atomic_inc(&adev->pm.workload_switch_ref); >>>>>>>>>> >>>>>>>>>> Why is this reference kept? The swtiching happens inside a >>>>>>>>>> lock and there is already a check not to switch if the hint >>>>>>>>>> matches with current workload. >>>>>>>>>> >>>>>>>>> >>>>>>>>> This reference is kept so that we would not reset the PM mode >>>>>>>>> to DEFAULT when some other context has switched the PP mode. If >>>>>>>>> you see the 4th patch, the PM mode will be changed when the job >>>>>>>>> in that context is run, and a pm_reset function will be >>>>>>>>> scheduled when the job is done. But in between if another job >>>>>>>>> from another context has changed the PM mode, the refrence >>>>>>>>> count will prevent us from resetting the PM mode. >>>>>>>>> >>>>>>>> >>>>>>>> This helps only if multiple jobs request the same mode. If they >>>>>>>> request different modes, then this is not helping much. >>>>>>> >>>>>>> No that's certainly not the case. 
It's a counter, whose aim is to >>>>>>> allow a PP reset only when the counter is 0. Do note that the >>>>>>> reset() happens only in the job_free_cb(), which gets schedule >>>>>>> later. If this counter is not zero, which means another work has >>>>>>> changed the profile in between, and we should not reset it. >>>>>>> >>>>>>>> >>>>>>>> It could be useful to profile some apps assuming it has >>>>>>>> exclusive access. >>>>>>>> >>>>>>>> However, in general, the API is not reliable from a user point >>>>>>>> as the mode requested can be overridden by some other job. Then >>>>>>>> a better thing to do is to document that and avoid the extra >>>>>>>> stuff around it. >>>>>>>> >>>>>>> As I mentioned before, like any PM feature, the benefits can be >>>>>>> seen only while running consistant workloads for long time. I an >>>>>>> still add a doc note in the UAPI page. >>>>>>> >>>>>> >>>>>> >>>>>> a) What is the goal of the API? Is it guaranteeing the job to run >>>>>> under a workprofile mode or something else? >>>>> >>>>> No, it does not guarentee anything. If you see the cover letter, it >>>>> just provides an interface to an app to submit workload under a >>>>> power profile which can be more suitable for its workload type. As >>>>> I mentioned, it could be very useful for many scenarios like >>>>> fullscreen 3D / fullscreen MM scenarios. It could also allow a >>>>> system-gfx-manager to shift load balance towards one type of >>>>> workload. There are many applications, once the UAPI is in place. >>>>> >>>>>> >>>>>> b) If it's to guarantee work profile mode, does it really >>>>>> guarantee that - the answer is NO when some other job is running. >>>>>> It may or may not work is the answer. >>>>>> >>>>>> c) What is the difference between one job resetting the profile >>>>>> mode to NONE vs another job change the mode to say VIDEO when the >>>>>> original request is for COMPUTE? 
While that is the case, what is >>>>>> the use of any sort of 'pseudo-protection' other than running some >>>>>> code to do extra lock/unlock stuff. >>>>>> >>>>> >>>>> Your understanding of protection is wrong here. There is >>>>> intentionally no protection for a job changing another job's set >>>>> workload profile, because in that case we will end up >>>>> serializing/bottlenecking workload submission until the PM profile is >>>>> ready to be changed, which takes away the benefit of having multiple >>>>> queues of parallel submission. >>>>> >>>>> The protection provided by the ref counter is to avoid the clearing >>>>> of the profile (to NONE), while another workload is in execution. >>>>> The difference between NONE and VIDEO is still that NONE is the >>>>> default profile without any fine tuning, and VIDEO is still fine >>>>> tuned for VIDEO type of workloads. >>>>> >>>> >>>> Protection 1 is - mutex_lock(&adev->pm.smu_workload_lock); >>>> >>>> The line that follows is amdgpu_dpm_switch_power_profile() - this >>>> one will allow only single-client use - two jobs won't be able to >>>> switch at the same time. All *dpm* APIs are protected like that. >>>> >>> >>> this also protects the pm.workload_mode variable which is being set >>> after the amdgpu_dpm_switch_power_profile call is successful here: >>> adev->pm.workload_mode = hint; >>> >>>> Protection 2 is - ref counter. >>>> >>>> It helps only in this kind of scenario when two jobs requested the >>>> same mode successively - >>>> Job 1 requested compute >>>> Job 2 requested compute >>>> Job 1 ends (doesn't reset) >>>> >>>> Scenario - 2 >>>> Job 1 requested compute >>>> Job 2 requested compute >>>> Job 3 requested 3D >>>> Job 1 ends (doesn't reset, it continues in 3D) >>>> >>>> In this mixed scenario case, I would say NONE is much more optimized >>>> as it's under FW control. Actually, it does much more fine tuning >>>> because of its background data collection. 
It helps in mixed scenarios as well; consider this scenario: >>> Job 1 requests: 3D >>> Job 2 requests: Media >> >> Ok, let's take this as the example. >> >> Protection case: >> >> Job 1 requests: 3D => adev->pm.workload_mode = 3D; and protected by >> mutex_lock(&adev->pm.smu_workload_lock) >> >> Job 2 requests => adev->pm.workload_mode = Media; >> >> What is the use of this variable then? Two jobs can come at different >> times and change it independently? Any use in keeping this? > >> Some other job came in and changed to some other value. So, what is >> the use of this lock finally? >> > ?? The locks are not there to save the variable from being changed, but to > prevent the variable from being changed out of order. If two threads try to > change it at the same time, one of them will have to wait until the > other's critical section is done executing. > > Do note that this variable is changed only when > the amdgpu_dpm_switch_power_profile() call is successful. Going by the same > logic, what is the use of having these pm locks inside the function > dpm_switch_power_profile(), as Job 1 changed the power profile to 3D, > and Job 2 changed it to media :) ? That lock is protecting the swsmu internal states from concurrent access and not the profile mode. Here I don't see the use of this variable. Using those locks does not prevent > changing the PM profile, it makes sure that it happens in a serialized way. > >> Use case: >> >> Job 1 requests: 3D >> Job 2 requests: Media >> >> Job 1 now runs under Media. What is achieved considering the intent of >> the API and extra CPU cycles run to protect nothing? >> > > This is how it is intended to work; I have explained multiple times > before that we do not want to block the change in PP from two different > jobs. 
The lock is to protect the concurrency sequence, not the change in mode: > > without that lock, in the worst-case scenario: > > Thread: 1 > Job 1 requests: 3D > PM mode changed to: 3D > just before writing (adev->pm.workload_mode = 3d) this thread schedules out > > Thread: 2 > Job 2 requests: Media > PM mode changed to: Media > adev->pm.workload_mode = media > > Thread 1 schedules in: > adev->pm.workload_mode = 3d but PM mode is media. > > State machine broken here. So the lock is to provide sequential > execution of the code. > > > If your suggestion is we should not let the mode get changed until one > job is done executing, that's a different discussion and certainly not > what is reflected in what you wrote above. My suggestion is not to waste extra CPU cycles/memory when the API doesn't give any guarantee about its intended purpose (which is to keep the profile mode as requested by a job). Let it be stateless and document the usage. Thanks, Lijo > > - Shashank > >> Thanks, >> Lijo >> >>> Job 1 finishes, but job 2 is ongoing >>> Job 1 calls reset(), but sees the counter is non-zero and doesn't >>> reset >>> >>> So the media workload continues in Media mode, not None. >>> >>> - Shashank >>> >>>>> In the end, *again* the actual benefit comes when a consistent >>>>> workload is submitted for a long time, like fullscreen 3D game >>>>> playback, fullscreen Video movie playback, and so on. 
>>>> >>>> Thanks, >>>> Lijo >>>> >>>>> - Shashank >>>>> >>>>>> Thanks, >>>>>> Lijo >>>>>> >>>>>>> - Shashank >>>>>>> >>>>>>>> Thanks, >>>>>>>> Lijo >>>>>>>> >>>>>>>>> - Shashank >>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Lijo >>>>>>>>>> >>>>>>>>>>> + >>>>>>>>>>> +unlock: >>>>>>>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>>>>>>> + return ret; >>>>>>>>>>> +} >>>>>>>>>>> + >>>>>>>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>>>>>>> + uint32_t hint) >>>>>>>>>>> +{ >>>>>>>>>>> + int ret = 0; >>>>>>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>>>>>> + >>>>>>>>>>> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >>>>>>>>>>> + return 0; >>>>>>>>>>> + >>>>>>>>>>> + /* Do not reset GPU power profile if another reset is >>>>>>>>>>> coming */ >>>>>>>>>>> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >>>>>>>>>>> + return 0; >>>>>>>>>>> + >>>>>>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>>>>>>> + >>>>>>>>>>> + if (adev->pm.workload_mode != hint) >>>>>>>>>>> + goto unlock; >>>>>>>>>>> + >>>>>>>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); >>>>>>>>>>> + if (!ret) >>>>>>>>>>> + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >>>>>>>>>>> + >>>>>>>>>>> +unlock: >>>>>>>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>>>>>>> + return ret; >>>>>>>>>>> +} >>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>>> index be7aff2d4a57..1f0f64662c04 100644 >>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c >>>>>>>>>>> @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct >>>>>>>>>>> amdgpu_device *adev, >>>>>>>>>>> mutex_init(&adev->psp.mutex); >>>>>>>>>>> mutex_init(&adev->notifier_lock); >>>>>>>>>>> mutex_init(&adev->pm.stable_pstate_ctx_lock); >>>>>>>>>>> + mutex_init(&adev->pm.smu_workload_lock); 
>>>>>>>>>>> mutex_init(&adev->benchmark_mutex); >>>>>>>>>>> amdgpu_device_init_apu_flags(adev); >>>>>>>>>>> diff --git >>>>>>>>>>> a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>>>> b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>>>> new file mode 100644 >>>>>>>>>>> index 000000000000..6060fc53c3b0 >>>>>>>>>>> --- /dev/null >>>>>>>>>>> +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>>>> @@ -0,0 +1,54 @@ >>>>>>>>>>> +/* >>>>>>>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>>>>>>> + * >>>>>>>>>>> + * Permission is hereby granted, free of charge, to any >>>>>>>>>>> person obtaining a >>>>>>>>>>> + * copy of this software and associated documentation files >>>>>>>>>>> (the "Software"), >>>>>>>>>>> + * to deal in the Software without restriction, including >>>>>>>>>>> without limitation >>>>>>>>>>> + * the rights to use, copy, modify, merge, publish, >>>>>>>>>>> distribute, sublicense, >>>>>>>>>>> + * and/or sell copies of the Software, and to permit persons >>>>>>>>>>> to whom the >>>>>>>>>>> + * Software is furnished to do so, subject to the following >>>>>>>>>>> conditions: >>>>>>>>>>> + * >>>>>>>>>>> + * The above copyright notice and this permission notice >>>>>>>>>>> shall be included in >>>>>>>>>>> + * all copies or substantial portions of the Software. >>>>>>>>>>> + * >>>>>>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY >>>>>>>>>>> KIND, EXPRESS OR >>>>>>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>>>>>> MERCHANTABILITY, >>>>>>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN >>>>>>>>>>> NO EVENT SHALL >>>>>>>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY >>>>>>>>>>> CLAIM, DAMAGES OR >>>>>>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT >>>>>>>>>>> OR OTHERWISE, >>>>>>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE >>>>>>>>>>> OR THE USE OR >>>>>>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>>>>>>>>> + * >>>>>>>>>>> + */ >>>>>>>>>>> +#ifndef _AMDGPU_CTX_WL_H_ >>>>>>>>>>> +#define _AMDGPU_CTX_WL_H_ >>>>>>>>>>> +#include <drm/amdgpu_drm.h> >>>>>>>>>>> +#include "amdgpu.h" >>>>>>>>>>> + >>>>>>>>>>> +/* Workload mode names */ >>>>>>>>>>> +static const char * const amdgpu_workload_mode_name[] = { >>>>>>>>>>> + "None", >>>>>>>>>>> + "3D", >>>>>>>>>>> + "Video", >>>>>>>>>>> + "VR", >>>>>>>>>>> + "Compute", >>>>>>>>>>> + "Unknown", >>>>>>>>>>> +}; >>>>>>>>>>> + >>>>>>>>>>> +static inline const >>>>>>>>>>> +char *amdgpu_workload_profile_name(uint32_t profile) >>>>>>>>>>> +{ >>>>>>>>>>> + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && >>>>>>>>>>> + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) >>>>>>>>>>> + return >>>>>>>>>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; >>>>>>>>>>> + >>>>>>>>>>> + return >>>>>>>>>>> amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; >>>>>>>>>>> +} >>>>>>>>>>> + >>>>>>>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>>>>>>> + uint32_t hint); >>>>>>>>>>> + >>>>>>>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>>>>>>> + uint32_t hint); >>>>>>>>>>> + >>>>>>>>>>> +#endif >>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>>>> b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>>>> index 65624d091ed2..565131f789d0 100644 >>>>>>>>>>> --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>>>> +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h >>>>>>>>>>> @@ -361,6 +361,11 @@ struct amdgpu_pm { >>>>>>>>>>> struct mutex stable_pstate_ctx_lock; >>>>>>>>>>> struct amdgpu_ctx *stable_pstate_ctx; >>>>>>>>>>> + /* SMU workload mode */ >>>>>>>>>>> + struct mutex smu_workload_lock; >>>>>>>>>>> + uint32_t workload_mode; >>>>>>>>>>> + atomic_t workload_switch_ref; >>>>>>>>>>> + >>>>>>>>>>> struct config_table_setting config_table; >>>>>>>>>>> /* runtime mode */ >>>>>>>>>>> enum amdgpu_runpm_mode rpm_mode; >>>>>>>>>>> ^ permalink raw reply [flat|nested] 76+ messages 
in thread
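The pattern being argued over above - a per-device workload mode guarded by a mutex, plus a reference count that defers the reset to NONE while other jobs are still in flight - can be sketched in plain userspace C. This is an illustrative model, not the driver code: the dpm call is stubbed out, the names only mirror the patch, and the set path takes a reference unconditionally (a simplification of the quoted hunk):

```c
/* Userspace model of the set/clear pattern from the patch under review.
 * Illustrative sketch only: dpm_switch() is a stub standing in for
 * amdgpu_dpm_switch_power_profile(). */
#include <pthread.h>
#include <stdatomic.h>

enum hint { HINT_NONE, HINT_3D, HINT_VIDEO, HINT_VR, HINT_COMPUTE };

static pthread_mutex_t workload_lock = PTHREAD_MUTEX_INITIALIZER;
static enum hint workload_mode = HINT_NONE;
static atomic_int workload_switch_ref;

/* Stubbed power-profile switch; always succeeds in this model. */
static int dpm_switch(enum hint h, int enable)
{
    (void)h; (void)enable;
    return 0;
}

int set_workload_profile(enum hint h)
{
    int ret = 0;

    pthread_mutex_lock(&workload_lock);
    /* Duplicate-call short circuit: mode already active, skip switch. */
    if (workload_mode != h) {
        ret = dpm_switch(h, 1);
        if (!ret)
            workload_mode = h;
    }
    /* Every set takes a reference; the matching clear drops it. */
    atomic_fetch_add(&workload_switch_ref, 1);
    pthread_mutex_unlock(&workload_lock);
    return ret;
}

int clear_workload_profile(enum hint h)
{
    int ret = 0;

    if (h == HINT_NONE)
        return 0;
    /* Another job still holds a reference: defer the reset to NONE. */
    if (atomic_fetch_sub(&workload_switch_ref, 1) - 1 > 0)
        return 0;

    pthread_mutex_lock(&workload_lock);
    /* Only reset if nobody replaced our mode in the meantime. */
    if (workload_mode == h) {
        ret = dpm_switch(h, 0);
        if (!ret)
            workload_mode = HINT_NONE;
    }
    pthread_mutex_unlock(&workload_lock);
    return ret;
}

enum hint current_mode(void)
{
    return workload_mode;
}
```

With this model, the mixed scenario from the thread holds: Job 1 sets 3D, Job 2 sets Video, Job 1's clear sees the outstanding reference and leaves the Video profile active; only Job 2's clear drops the mode back to NONE.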
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-27 14:34 ` Lazar, Lijo @ 2022-09-27 14:50 ` Sharma, Shashank 0 siblings, 0 replies; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 14:50 UTC (permalink / raw) To: Lazar, Lijo, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 4:34 PM, Lazar, Lijo wrote: > > > On 9/27/2022 7:50 PM, Sharma, Shashank wrote: >> >> >> On 9/27/2022 4:00 PM, Lazar, Lijo wrote: >>> >>> >>> On 9/27/2022 7:17 PM, Sharma, Shashank wrote: >>>> >>>> >>>> On 9/27/2022 3:29 PM, Lazar, Lijo wrote: >>>>> >>>>> >>>>> On 9/27/2022 6:23 PM, Sharma, Shashank wrote: >>>>>> >>>>>> >>>>>> On 9/27/2022 2:39 PM, Lazar, Lijo wrote: >>>>>>> >>>>>>> >>>>>>> On 9/27/2022 5:53 PM, Sharma, Shashank wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 9/27/2022 2:10 PM, Lazar, Lijo wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> On 9/27/2022 5:11 PM, Sharma, Shashank wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 9/27/2022 11:58 AM, Lazar, Lijo wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>>>>>>>>>> This patch adds new functions which will allow a user to >>>>>>>>>>>> change the GPU power profile based a GPU workload hint >>>>>>>>>>>> flag. 
>>>>>>>>>>>> >>>>>>>>>>>> Cc: Alex Deucher <alexander.deucher@amd.com> >>>>>>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>>>>>>> --- >>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- >>>>>>>>>>>> .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 >>>>>>>>>>>> +++++++++++++++++++ >>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + >>>>>>>>>>>> .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 >>>>>>>>>>>> +++++++++++ >>>>>>>>>>>> drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + >>>>>>>>>>>> 5 files changed, 158 insertions(+), 1 deletion(-) >>>>>>>>>>>> create mode 100644 >>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>>>>> create mode 100644 >>>>>>>>>>>> drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>>>> index 5a283d12f8e1..34679c657ecc 100644 >>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/Makefile >>>>>>>>>>>> @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ >>>>>>>>>>>> atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ >>>>>>>>>>>> atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ >>>>>>>>>>>> amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o >>>>>>>>>>>> amdgpu_ib.o amdgpu_pll.o \ >>>>>>>>>>>> - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>>>>>>>>>> amdgpu_sync.o \ >>>>>>>>>>>> + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o >>>>>>>>>>>> amdgpu_ctx_workload.o amdgpu_sync.o \ >>>>>>>>>>>> amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o >>>>>>>>>>>> amdgpu_vram_mgr.o amdgpu_virt.o \ >>>>>>>>>>>> amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >>>>>>>>>>>> amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ >>>>>>>>>>>> diff --git >>>>>>>>>>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>>>>> new file mode 100644 >>>>>>>>>>>> index 
000000000000..a11cf29bc388 >>>>>>>>>>>> --- /dev/null >>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>>>>>>>>> @@ -0,0 +1,97 @@ >>>>>>>>>>>> +/* >>>>>>>>>>>> + * Copyright 2022 Advanced Micro Devices, Inc. >>>>>>>>>>>> + * >>>>>>>>>>>> + * Permission is hereby granted, free of charge, to any >>>>>>>>>>>> person obtaining a >>>>>>>>>>>> + * copy of this software and associated documentation files >>>>>>>>>>>> (the "Software"), >>>>>>>>>>>> + * to deal in the Software without restriction, including >>>>>>>>>>>> without limitation >>>>>>>>>>>> + * the rights to use, copy, modify, merge, publish, >>>>>>>>>>>> distribute, sublicense, >>>>>>>>>>>> + * and/or sell copies of the Software, and to permit >>>>>>>>>>>> persons to whom the >>>>>>>>>>>> + * Software is furnished to do so, subject to the following >>>>>>>>>>>> conditions: >>>>>>>>>>>> + * >>>>>>>>>>>> + * The above copyright notice and this permission notice >>>>>>>>>>>> shall be included in >>>>>>>>>>>> + * all copies or substantial portions of the Software. >>>>>>>>>>>> + * >>>>>>>>>>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF >>>>>>>>>>>> ANY KIND, EXPRESS OR >>>>>>>>>>>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF >>>>>>>>>>>> MERCHANTABILITY, >>>>>>>>>>>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. >>>>>>>>>>>> IN NO EVENT SHALL >>>>>>>>>>>> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY >>>>>>>>>>>> CLAIM, DAMAGES OR >>>>>>>>>>>> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT >>>>>>>>>>>> OR OTHERWISE, >>>>>>>>>>>> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE >>>>>>>>>>>> OR THE USE OR >>>>>>>>>>>> + * OTHER DEALINGS IN THE SOFTWARE. 
>>>>>>>>>>>> + * >>>>>>>>>>>> + */ >>>>>>>>>>>> +#include <drm/drm.h> >>>>>>>>>>>> +#include "kgd_pp_interface.h" >>>>>>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>>>>>> + >>>>>>>>>>>> +static enum PP_SMC_POWER_PROFILE >>>>>>>>>>>> +amdgpu_workload_to_power_profile(uint32_t hint) >>>>>>>>>>>> +{ >>>>>>>>>>>> + switch (hint) { >>>>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_NONE: >>>>>>>>>>>> + default: >>>>>>>>>>>> + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; >>>>>>>>>>>> + >>>>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_3D: >>>>>>>>>>>> + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; >>>>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: >>>>>>>>>>>> + return PP_SMC_POWER_PROFILE_VIDEO; >>>>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_VR: >>>>>>>>>>>> + return PP_SMC_POWER_PROFILE_VR; >>>>>>>>>>>> + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: >>>>>>>>>>>> + return PP_SMC_POWER_PROFILE_COMPUTE; >>>>>>>>>>>> + } >>>>>>>>>>>> +} >>>>>>>>>>>> + >>>>>>>>>>>> +int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>>>>>>>>> + uint32_t hint) >>>>>>>>>>>> +{ >>>>>>>>>>>> + int ret = 0; >>>>>>>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>>>>>>> + >>>>>>>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>>>>>>> + return 0; >>>>>>>>>>>> + >>>>>>>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>>>>>>> >>>>>>>>>>> If it's all about pm subsystem variable accesses, this API >>>>>>>>>>> should rather be inside amd/pm subsystem. No need to expose >>>>>>>>>>> the variable outside pm subsytem. Also currently all >>>>>>>>>>> amdgpu_dpm* calls are protected under one mutex. Then this >>>>>>>>>>> extra lock won't be needed. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This is tricky, this is not all about PM subsystem. Note that >>>>>>>>>> the job management and scheduling is handled into amdgpu_ctx, >>>>>>>>>> so the workload hint is set in context_management API. 
The API >>>>>>>>>> is consumed when the job is actually run from amdgpu_run() >>>>>>>>>> layer. So its a joint interface between context and PM. >>>>>>>>>> >>>>>>>>> >>>>>>>>> If you take out amdgpu_workload_to_power_profile() line, >>>>>>>>> everything else looks to touch only pm variables/functions. >>>>>>>> >>>>>>>> That's not a line, that function converts a AMGPU_CTX hint to >>>>>>>> PPM profile. And going by that logic, this whole code was kept >>>>>>>> in the amdgpu_ctx.c file as well, coz this code is consuming the >>>>>>>> PM API. So to avoid these conflicts and having a new file is a >>>>>>>> better idea. >>>>>>>> >>>>>>>> You could still keep a >>>>>>>>> wrapper though. Also dpm_* functions are protected, so the >>>>>>>>> extra mutex can be avoided as well. >>>>>>>>> >>>>>>>> The lock also protects pm.workload_mode writes. >>>>>>>> >>>>>>>>>>>> + >>>>>>>>>>>> + if (adev->pm.workload_mode == hint) >>>>>>>>>>>> + goto unlock; >>>>>>>>>>>> + >>>>>>>>>>>> + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>>>>>>>>>>> + if (!ret) >>>>>>>>>>>> + adev->pm.workload_mode = hint; >>>>>>>>>>>> + atomic_inc(&adev->pm.workload_switch_ref); >>>>>>>>>>> >>>>>>>>>>> Why is this reference kept? The swtiching happens inside a >>>>>>>>>>> lock and there is already a check not to switch if the hint >>>>>>>>>>> matches with current workload. >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> This reference is kept so that we would not reset the PM mode >>>>>>>>>> to DEFAULT when some other context has switched the PP mode. >>>>>>>>>> If you see the 4th patch, the PM mode will be changed when the >>>>>>>>>> job in that context is run, and a pm_reset function will be >>>>>>>>>> scheduled when the job is done. But in between if another job >>>>>>>>>> from another context has changed the PM mode, the refrence >>>>>>>>>> count will prevent us from resetting the PM mode. >>>>>>>>>> >>>>>>>>> >>>>>>>>> This helps only if multiple jobs request the same mode. 
If they >>>>>>>>> request different modes, then this is not helping much. >>>>>>>> >>>>>>>> No that's certainly not the case. It's a counter, whose aim is >>>>>>>> to allow a PP reset only when the counter is 0. Do note that the >>>>>>>> reset() happens only in the job_free_cb(), which gets schedule >>>>>>>> later. If this counter is not zero, which means another work has >>>>>>>> changed the profile in between, and we should not reset it. >>>>>>>> >>>>>>>>> >>>>>>>>> It could be useful to profile some apps assuming it has >>>>>>>>> exclusive access. >>>>>>>>> >>>>>>>>> However, in general, the API is not reliable from a user point >>>>>>>>> as the mode requested can be overridden by some other job. Then >>>>>>>>> a better thing to do is to document that and avoid the extra >>>>>>>>> stuff around it. >>>>>>>>> >>>>>>>> As I mentioned before, like any PM feature, the benefits can be >>>>>>>> seen only while running consistant workloads for long time. I an >>>>>>>> still add a doc note in the UAPI page. >>>>>>>> >>>>>>> >>>>>>> >>>>>>> a) What is the goal of the API? Is it guaranteeing the job to run >>>>>>> under a workprofile mode or something else? >>>>>> >>>>>> No, it does not guarentee anything. If you see the cover letter, >>>>>> it just provides an interface to an app to submit workload under a >>>>>> power profile which can be more suitable for its workload type. As >>>>>> I mentioned, it could be very useful for many scenarios like >>>>>> fullscreen 3D / fullscreen MM scenarios. It could also allow a >>>>>> system-gfx-manager to shift load balance towards one type of >>>>>> workload. There are many applications, once the UAPI is in place. >>>>>> >>>>>>> >>>>>>> b) If it's to guarantee work profile mode, does it really >>>>>>> guarantee that - the answer is NO when some other job is running. >>>>>>> It may or may not work is the answer. 
>>>>>>> >>>>>>> c) What is the difference between one job resetting the profile >>>>>>> mode to NONE vs another job change the mode to say VIDEO when the >>>>>>> original request is for COMPUTE? While that is the case, what is >>>>>>> the use of any sort of 'pseudo-protection' other than running >>>>>>> some code to do extra lock/unlock stuff. >>>>>>> >>>>>> >>>>>> Your understanding of protection is wrong here. There is >>>>>> intentionally no protection for a job changing another job's set >>>>>> workload profile, coz in that was we will end up >>>>>> seriazling/bottlenecking workload submission until PM profile is >>>>>> ready to be changed, which takes away benefit of having multiple >>>>>> queues of parallel submission. >>>>>> >>>>>> The protection provided by the ref counter is to avoid the >>>>>> clearing of the profile (to NONE), while another workload is in >>>>>> execution. The difference between NONE and VIDEO is still that >>>>>> NONE is the default profile without any fine tuning, and VIDEO is >>>>>> still fine tuned for VIDEO type of workloads. >>>>>> >>>>> >>>>> Protection 1 is - mutex_lock(&adev->pm.smu_workload_lock); >>>>> >>>>> The line that follows is amdgpu_dpm_switch_power_profile() - this >>>>> one will allow only single client use- two jobs won't be able to >>>>> switch at the same time. All *dpm* APIs are protected like that. >>>>> >>>> >>>> this also protects the pm.workload_mode variable which is being set >>>> after the amdgpu_dpm_switch_power_profile call is successful here: >>>> adev->pm.workload_mode = hint; >>>> >>>>> Protection 2 is - ref counter. 
>>>>> >>>>> It helps only in this kind of scenario when two jobs requested the >>>>> same mode successively - >>>>> Job 1 requested compute >>>>> Job 2 requested compute >>>>> Job 1 ends (doesn't reset) >>>>> >>>>> Scenario - 2 >>>>> Job 1 requested compute >>>>> Job 2 requested compute >>>>> Job 3 requested 3D >>>>> Job 1 ends (doesn't reset, it continues in 3D) >>>>> >>>>> In this mixed scenario case, I would say NONE is much more >>>>> optimized as it's under FW control. Actually, it does much more >>>>> fine tuning because of its background data collection. >>>>> >>>> >>>> >>>> It helps in mixed scenarios as well, consider this scenario: >>>> Job 1 requests: 3D >>>> Job 2 requests: Media >>> >>> Ok, let's take this as the example. >>> >>> Protection case: >>> >>> Job 1 requests: 3D => adev->pm.workload_mode = 3D; and protected by >>> mutex_lock(&adev->pm.smu_workload_lock) >>> >>> Job 2 requests => adev->pm.workload_mode = Media; >>> >>> What is the use of this variable then? Two jobs can come at different >>> times and change it independently? Any use in keeping this? >> >>> Some other job came in and changed to some other value. So, what is >>> the use of this lock finally? >>> >> ?? The locks are not there to save the variable from being changed, but to >> prevent the variable from being changed out of order. If two threads try to >> change it at the same time, one of them will have to wait until the >> other's critical section is done executing. >> >> Do note that this variable is changed only when >> the amdgpu_dpm_switch_power_profile() call is successful. Going by the >> same logic, what is the use of having these pm locks inside the >> function dpm_switch_power_profile(), as Job 1 changed the power >> profile to 3D, and Job 2 changed it to media :) ? > > That lock is protecting the swsmu internal states from concurrent access > and not profile mode. That's the intention of the lock: to protect the state. It is not supposed to prevent a profile change. 
So it is doing its job. > Here I don't see the use of this variable. This variable is used to block duplicate calls to this function when we are already running in the same mode. Considering the number of jobs we submit, it is absolutely worth it. We have been talking about CPU cycles for some time, and this is exactly where it saves them. > Using those locks does not prevent >> changing the PM profile, it makes sure that it happens in a serialized >> way. >> >>> Use case: >>> >>> Job 1 requests: 3D >>> Job 2 requests: Media >>> >>> Job 1 now runs under Media. What is achieved considering the intent >>> of the API and extra CPU cycles run to protect nothing? >>> >> >> This is how it is intended to work; I have explained multiple >> times before that we do not want to block the change in PP from two >> different jobs. The lock is to protect the concurrency sequence, not >> the change in mode: >> >> without that lock, in the worst-case scenario: >> >> Thread: 1 >> Job 1 requests: 3D >> PM mode changed to: 3D >> just before writing (adev->pm.workload_mode = 3d) this thread >> schedules out >> >> Thread: 2 >> Job 2 requests: Media >> PM mode changed to: Media >> adev->pm.workload_mode = media >> >> Thread 1 schedules in: >> adev->pm.workload_mode = 3d but PM mode is media. >> >> State machine broken here. So the lock is to provide sequential >> execution of the code. >> >> >> If your suggestion is we should not let the mode get changed until one >> job is done executing, that's a different discussion and certainly not >> what is reflected in what you wrote above. > > My suggestion is not to waste extra CPU cycles/memory when the API > doesn't give any guarantee about its intended purpose (which is to keep > the profile mode as requested by a job). Let it be stateless and > document the usage. > From the comment above, the state helps to block duplicate calls, so it is saving CPU cycles rather than wasting them. 
Now, the guarantee of execution is something we can discuss; I am open to suggestions on a policy which can do better. - Shashank > Thanks, > Lijo > >> >> - Shashank >> >>> Thanks, >>> Lijo >>> >>>> Job 1 finishes, but job 2 is ongoing >>>> Job 1 calls reset(), but sees the counter is non-zero and doesn't >>>> reset >>>> >>>> So the media workload continues in Media mode, not None. >>>> >>>> - Shashank >>>> >>>>>> In the end, *again* the actual benefit comes when a consistent >>>>>> workload is submitted for a long time, like fullscreen 3D game >>>>>> playback, fullscreen Video movie playback, and so on. >>>>>> >>>>> >>>>> "only under consistent" doesn't justify any software protection >>>>> logic. Again, if the workload is consistent, most likely PMFW could >>>>> be managing it better. >>>>> >>>>> Thanks, >>>>> Lijo >>>>> >>>>>> - Shashank >>>>>> >>>>>>> Thanks, >>>>>>> Lijo >>>>>>> >>>>>>>> - Shashank >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Lijo >>>>>>>>> >>>>>>>>>> - Shashank >>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Lijo >>>>>>>>>>> >>>>>>>>>>>> + >>>>>>>>>>>> +unlock: >>>>>>>>>>>> + mutex_unlock(&adev->pm.smu_workload_lock); >>>>>>>>>>>> + return ret; >>>>>>>>>>>> +} >>>>>>>>>>>> + >>>>>>>>>>>> +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, >>>>>>>>>>>> + uint32_t hint) >>>>>>>>>>>> +{ >>>>>>>>>>>> + int ret = 0; >>>>>>>>>>>> + enum PP_SMC_POWER_PROFILE profile = >>>>>>>>>>>> + amdgpu_workload_to_power_profile(hint); >>>>>>>>>>>> + >>>>>>>>>>>> + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) >>>>>>>>>>>> + return 0; >>>>>>>>>>>> + >>>>>>>>>>>> + /* Do not reset GPU power profile if another reset is >>>>>>>>>>>> coming */ >>>>>>>>>>>> + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) >>>>>>>>>>>> + return 0; >>>>>>>>>>>> + >>>>>>>>>>>> + mutex_lock(&adev->pm.smu_workload_lock); >>>>>>>>>>>> + >>>>>>>>>>>> + if (adev->pm.workload_mode != hint) >>>>>>>>>>>> + goto unlock; >>>>>>>>>>>> + >>>>>>>>>>>> + ret = 
^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile 2022-09-26 21:40 ` [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile Shashank Sharma ` (2 preceding siblings ...) 2022-09-27 9:58 ` Lazar, Lijo @ 2022-09-27 15:20 ` Felix Kuehling 3 siblings, 0 replies; 76+ messages in thread From: Felix Kuehling @ 2022-09-27 15:20 UTC (permalink / raw) To: Shashank Sharma, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 2022-09-26 17:40, Shashank Sharma wrote: > This patch adds new functions which will allow a user to > change the GPU power profile based on a GPU workload hint > flag. > > Cc: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- > .../gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 97 +++++++++++++++++++ > drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 1 + > .../gpu/drm/amd/include/amdgpu_ctx_workload.h | 54 +++++++++++ > drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h | 5 + > 5 files changed, 158 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > create mode 100644 drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > > diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile > index 5a283d12f8e1..34679c657ecc 100644 > --- a/drivers/gpu/drm/amd/amdgpu/Makefile > +++ b/drivers/gpu/drm/amd/amdgpu/Makefile > @@ -50,7 +50,7 @@ amdgpu-y += amdgpu_device.o amdgpu_kms.o \ > atombios_dp.o amdgpu_afmt.o amdgpu_trace_points.o \ > atombios_encoders.o amdgpu_sa.o atombios_i2c.o \ > amdgpu_dma_buf.o amdgpu_vm.o amdgpu_vm_pt.o amdgpu_ib.o amdgpu_pll.o \ > - amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \ > + amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_ctx_workload.o amdgpu_sync.o \ > amdgpu_gtt_mgr.o amdgpu_preempt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o \ > amdgpu_atomfirmware.o amdgpu_vf_error.o amdgpu_sched.o \ >
amdgpu_debugfs.o amdgpu_ids.o amdgpu_gmc.o \ > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > new file mode 100644 > index 000000000000..a11cf29bc388 > --- /dev/null > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > @@ -0,0 +1,97 @@ > +/* > + * Copyright 2022 Advanced Micro Devices, Inc. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. 
> + * > + */ > +#include <drm/drm.h> > +#include "kgd_pp_interface.h" > +#include "amdgpu_ctx_workload.h" > + > +static enum PP_SMC_POWER_PROFILE > +amdgpu_workload_to_power_profile(uint32_t hint) > +{ > + switch (hint) { > + case AMDGPU_CTX_WORKLOAD_HINT_NONE: > + default: > + return PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT; > + > + case AMDGPU_CTX_WORKLOAD_HINT_3D: > + return PP_SMC_POWER_PROFILE_FULLSCREEN3D; > + case AMDGPU_CTX_WORKLOAD_HINT_VIDEO: > + return PP_SMC_POWER_PROFILE_VIDEO; > + case AMDGPU_CTX_WORKLOAD_HINT_VR: > + return PP_SMC_POWER_PROFILE_VR; > + case AMDGPU_CTX_WORKLOAD_HINT_COMPUTE: > + return PP_SMC_POWER_PROFILE_COMPUTE; > + } > +} > + > +int amdgpu_set_workload_profile(struct amdgpu_device *adev, > + uint32_t hint) > +{ > + int ret = 0; > + enum PP_SMC_POWER_PROFILE profile = > + amdgpu_workload_to_power_profile(hint); > + > + if (adev->pm.workload_mode == hint) > + return 0; > + > + mutex_lock(&adev->pm.smu_workload_lock); > + > + if (adev->pm.workload_mode == hint) > + goto unlock; > + > + ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); > + if (!ret) > + adev->pm.workload_mode = hint; > + atomic_inc(&adev->pm.workload_switch_ref); > + > +unlock: > + mutex_unlock(&adev->pm.smu_workload_lock); > + return ret; > +} > + > +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, > + uint32_t hint) > +{ > + int ret = 0; > + enum PP_SMC_POWER_PROFILE profile = > + amdgpu_workload_to_power_profile(hint); > + > + if (hint == AMDGPU_CTX_WORKLOAD_HINT_NONE) > + return 0; > + > + /* Do not reset GPU power profile if another reset is coming */ > + if (atomic_dec_return(&adev->pm.workload_switch_ref) > 0) > + return 0; > + > + mutex_lock(&adev->pm.smu_workload_lock); > + > + if (adev->pm.workload_mode != hint) > + goto unlock; > + > + ret = amdgpu_dpm_switch_power_profile(adev, profile, 0); > + if (!ret) > + adev->pm.workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; > + > +unlock: > + mutex_unlock(&adev->pm.smu_workload_lock); > + return ret; > 
+} > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > index be7aff2d4a57..1f0f64662c04 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c > @@ -3554,6 +3554,7 @@ int amdgpu_device_init(struct amdgpu_device *adev, > mutex_init(&adev->psp.mutex); > mutex_init(&adev->notifier_lock); > mutex_init(&adev->pm.stable_pstate_ctx_lock); > + mutex_init(&adev->pm.smu_workload_lock); > mutex_init(&adev->benchmark_mutex); > > amdgpu_device_init_apu_flags(adev); > diff --git a/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > new file mode 100644 > index 000000000000..6060fc53c3b0 > --- /dev/null > +++ b/drivers/gpu/drm/amd/include/amdgpu_ctx_workload.h > @@ -0,0 +1,54 @@ > +/* > + * Copyright 2022 Advanced Micro Devices, Inc. > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice shall be included in > + * all copies or substantial portions of the Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL > + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR > + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, > + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR > + * OTHER DEALINGS IN THE SOFTWARE. > + * > + */ > +#ifndef _AMDGPU_CTX_WL_H_ > +#define _AMDGPU_CTX_WL_H_ > +#include <drm/amdgpu_drm.h> > +#include "amdgpu.h" > + > +/* Workload mode names */ > +static const char * const amdgpu_workload_mode_name[] = { > + "None", > + "3D", > + "Video", > + "VR", > + "Compute", > + "Unknown", > +}; > + > +static inline const > +char *amdgpu_workload_profile_name(uint32_t profile) > +{ > + if (profile >= AMDGPU_CTX_WORKLOAD_HINT_NONE && > + profile < AMDGPU_CTX_WORKLOAD_HINT_MAX) > + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_INDEX(profile)]; > + > + return amdgpu_workload_mode_name[AMDGPU_CTX_WORKLOAD_HINT_MAX]; > +} > + > +int amdgpu_clear_workload_profile(struct amdgpu_device *adev, > + uint32_t hint); > + > +int amdgpu_set_workload_profile(struct amdgpu_device *adev, > + uint32_t hint); > + > +#endif > diff --git a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > index 65624d091ed2..565131f789d0 100644 > --- a/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > +++ b/drivers/gpu/drm/amd/pm/inc/amdgpu_dpm.h > @@ -361,6 +361,11 @@ struct amdgpu_pm { > struct mutex stable_pstate_ctx_lock; > struct amdgpu_ctx *stable_pstate_ctx; > > + /* SMU workload mode */ > + struct mutex smu_workload_lock; > + uint32_t workload_mode; > + atomic_t workload_switch_ref;
You have only one ref counter. I think you need one per profile. For example, imagine you have two contexts, C1 and C2. C1 wants COMPUTE, C2 wants VIDEO. They start and finish jobs in this order: C1 starts COMPUTE, which enables the COMPUTE profile; C2 starts VIDEO, which enables the VIDEO profile; ... C1 finishes COMPUTE, which does nothing because the refcount is not 0; C2 finishes VIDEO, which disables the VIDEO profile. Now the COMPUTE profile stays enabled indefinitely. Regards, Felix > + > struct config_table_setting config_table; > /* runtime mode */ > enum amdgpu_runpm_mode rpm_mode; ^ permalink raw reply [flat|nested] 76+ messages in thread
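Felix's scenario above can be reproduced in miniature. The sketch below is a self-contained userspace model (all names are hypothetical, not the driver's code) of the per-profile reference count he suggests: each profile is enabled by its first user and disabled by its last, so the C1/C2 interleaving no longer leaves the COMPUTE profile stuck on.

```c
#include <assert.h>

/* Hypothetical stand-ins for the workload hints from patch 1/5. */
enum wl_hint { WL_NONE, WL_3D, WL_VIDEO, WL_VR, WL_COMPUTE, WL_HINT_COUNT };

/* One reference count per profile, as suggested, instead of a single
 * shared workload_switch_ref. */
static int wl_ref[WL_HINT_COUNT];
static int wl_enabled[WL_HINT_COUNT];	/* models the SMU profile state */

static void wl_profile_get(enum wl_hint h)
{
	if (h == WL_NONE)
		return;
	if (wl_ref[h]++ == 0)
		wl_enabled[h] = 1;	/* first user enables this profile */
}

static void wl_profile_put(enum wl_hint h)
{
	if (h == WL_NONE)
		return;
	if (--wl_ref[h] == 0)
		wl_enabled[h] = 0;	/* last user disables this profile */
}
```

Replaying the example — C1 takes COMPUTE, C2 takes VIDEO, C1 drops COMPUTE, C2 drops VIDEO — each profile now turns off exactly when its own last reference is dropped.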
* [PATCH v3 3/5] drm/amdgpu: set GPU workload via ctx IOCTL 2022-09-26 21:40 [PATCH v3 0/5] GPU workload hints for better performance Shashank Sharma 2022-09-26 21:40 ` [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl Shashank Sharma 2022-09-26 21:40 ` [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile Shashank Sharma @ 2022-09-26 21:40 ` Shashank Sharma 2022-09-27 6:09 ` Christian König 2022-09-26 21:40 ` [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile Shashank Sharma ` (2 subsequent siblings) 5 siblings, 1 reply; 76+ messages in thread From: Shashank Sharma @ 2022-09-26 21:40 UTC (permalink / raw) To: amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, Shashank Sharma This patch adds new IOCTL flags in amdgpu_context_IOCTL to set GPU workload profile. These calls will allow a user to switch to a GPU power profile which might be better suited to its workload type. The currently supported workload types are: "None": Default workload profile "3D": Workload profile for 3D rendering work "Video": Workload profile for Media/Encode/Decode work "VR": Workload profile for VR rendering work "Compute": Workload profile for Compute work The workload hint flag is saved in GPU context, and then it is applied when we actually run the job.
V3: Create only set_workload interface, there is no need for get_workload (Christian) Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 42 +++++++++++++++++++++++-- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 + 2 files changed, 41 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index 8ee4e8491f39..937c294f8d84 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -27,6 +27,7 @@ #include "amdgpu.h" #include "amdgpu_sched.h" #include "amdgpu_ras.h" +#include "amdgpu_ctx_workload.h" #include <linux/nospec.h> #define to_amdgpu_ctx_entity(e) \ @@ -328,7 +329,7 @@ static int amdgpu_ctx_init(struct amdgpu_ctx_mgr *mgr, int32_t priority, return r; ctx->stable_pstate = current_stable_pstate; - + ctx->workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; return 0; } @@ -633,11 +634,34 @@ static int amdgpu_ctx_stable_pstate(struct amdgpu_device *adev, return r; } +static int amdgpu_ctx_set_workload_profile(struct amdgpu_device *adev, + struct amdgpu_fpriv *fpriv, uint32_t id, + u32 workload_hint) +{ + struct amdgpu_ctx *ctx; + struct amdgpu_ctx_mgr *mgr; + + if (!fpriv) + return -EINVAL; + + mgr = &fpriv->ctx_mgr; + mutex_lock(&mgr->lock); + ctx = idr_find(&mgr->ctx_handles, id); + if (!ctx) { + mutex_unlock(&mgr->lock); + return -EINVAL; + } + + ctx->workload_mode = workload_hint; + mutex_unlock(&mgr->lock); + return 0; +} + int amdgpu_ctx_ioctl(struct drm_device *dev, void *data, struct drm_file *filp) { int r; - uint32_t id, stable_pstate; + uint32_t id, stable_pstate, wl_hint; int32_t priority; union drm_amdgpu_ctx *args = data; @@ -681,6 +705,20 @@ int amdgpu_ctx_ioctl(struct drm_device *dev, void *data, return -EINVAL; r = amdgpu_ctx_stable_pstate(adev, fpriv, id, true, &stable_pstate); break; + case AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE: + if (args->in.flags & 
~AMDGPU_CTX_WORKLOAD_HINT_MASK) + return -EINVAL; + wl_hint = args->in.flags & AMDGPU_CTX_WORKLOAD_HINT_MASK; + if (wl_hint > AMDGPU_CTX_WORKLOAD_HINT_MAX) + return -EINVAL; + r = amdgpu_ctx_set_workload_profile(adev, fpriv, id, wl_hint); + if (r) + DRM_ERROR("Failed to set workload profile to %s\n", + amdgpu_workload_profile_name(wl_hint)); + else + DRM_DEBUG_DRIVER("Workload profile set to %s\n", + amdgpu_workload_profile_name(wl_hint)); + break; default: return -EINVAL; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h index cc7c8afff414..6c8032c3291a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h @@ -58,6 +58,7 @@ struct amdgpu_ctx { unsigned long ras_counter_ce; unsigned long ras_counter_ue; uint32_t stable_pstate; + uint32_t workload_mode; }; struct amdgpu_ctx_mgr { -- 2.34.1 ^ permalink raw reply related [flat|nested] 76+ messages in thread
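The flag validation in the new AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE case can be sketched on its own. The mask and maximum below are placeholder values (the real definitions come from the UAPI header added in patch 1/5, not shown here); the control flow mirrors the two -EINVAL checks in amdgpu_ctx_ioctl() above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Placeholder values; the real ones are defined by the UAPI header
 * added in patch 1/5. */
#define WL_HINT_MASK 0xffu
#define WL_HINT_MAX  5u

/* Mirrors the ioctl's validation: reject flag bits outside the hint
 * mask first, then reject hints beyond the supported range. */
static int wl_validate_flags(uint32_t flags, uint32_t *wl_hint)
{
	if (flags & ~WL_HINT_MASK)
		return -EINVAL;
	*wl_hint = flags & WL_HINT_MASK;
	if (*wl_hint > WL_HINT_MAX)
		return -EINVAL;
	return 0;
}
```

Checking the out-of-mask bits before masking is what lets the kernel reject unknown flags instead of silently ignoring them, which keeps those bits available for future extensions of the UAPI.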
* Re: [PATCH v3 3/5] drm/amdgpu: set GPU workload via ctx IOCTL 2022-09-26 21:40 ` [PATCH v3 3/5] drm/amdgpu: set GPU workload via ctx IOCTL Shashank Sharma @ 2022-09-27 6:09 ` Christian König 0 siblings, 0 replies; 76+ messages in thread From: Christian König @ 2022-09-27 6:09 UTC (permalink / raw) To: Shashank Sharma, amd-gfx; +Cc: alexander.deucher, amaranath.somalapuram On 26.09.22 23:40, Shashank Sharma wrote: > This patch adds new IOCTL flags in amdgpu_context_IOCTL to set > GPU workload profile. These calls will allow a user to switch > to a GPU power profile which might be better suited to its > workload type. The currently supported workload types are: > "None": Default workload profile > "3D": Workload profile for 3D rendering work > "Video": Workload profile for Media/Encode/Decode work > "VR": Workload profile for VR rendering work > "Compute": Workload profile for Compute work > > The workload hint flag is saved in GPU context, and then it is > applied when we actually run the job.
> > V3: Create only set_workload interface, there is no need for > get_workload (Christian) > > Cc: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 42 +++++++++++++++++++++++-- > drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 1 + > 2 files changed, 41 insertions(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c > index 8ee4e8491f39..937c294f8d84 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c > @@ -27,6 +27,7 @@ > #include "amdgpu.h" > #include "amdgpu_sched.h" > #include "amdgpu_ras.h" > +#include "amdgpu_ctx_workload.h" > #include <linux/nospec.h> > > #define to_amdgpu_ctx_entity(e) \ > @@ -328,7 +329,7 @@ static int amdgpu_ctx_init(struct amdgpu_ctx_mgr *mgr, int32_t priority, > return r; > > ctx->stable_pstate = current_stable_pstate; > - > + ctx->workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; > return 0; > } > > @@ -633,11 +634,34 @@ static int amdgpu_ctx_stable_pstate(struct amdgpu_device *adev, > return r; > } > > +static int amdgpu_ctx_set_workload_profile(struct amdgpu_device *adev, > + struct amdgpu_fpriv *fpriv, uint32_t id, > + u32 workload_hint) > +{ > + struct amdgpu_ctx *ctx; > + struct amdgpu_ctx_mgr *mgr; > + > + if (!fpriv) > + return -EINVAL; > + > + mgr = &fpriv->ctx_mgr; > + mutex_lock(&mgr->lock); > + ctx = idr_find(&mgr->ctx_handles, id); > + if (!ctx) { > + mutex_unlock(&mgr->lock); > + return -EINVAL; > + } > + > + ctx->workload_mode = workload_hint; > + mutex_unlock(&mgr->lock); > + return 0; > +} > + > int amdgpu_ctx_ioctl(struct drm_device *dev, void *data, > struct drm_file *filp) > { > int r; > - uint32_t id, stable_pstate; > + uint32_t id, stable_pstate, wl_hint; > int32_t priority; > > union drm_amdgpu_ctx *args = data; > @@ -681,6 +705,20 @@ int 
amdgpu_ctx_ioctl(struct drm_device *dev, void *data, > return -EINVAL; > r = amdgpu_ctx_stable_pstate(adev, fpriv, id, true, &stable_pstate); > break; > + case AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE: > + if (args->in.flags & ~AMDGPU_CTX_WORKLOAD_HINT_MASK) > + return -EINVAL; > + wl_hint = args->in.flags & AMDGPU_CTX_WORKLOAD_HINT_MASK; > + if (wl_hint > AMDGPU_CTX_WORKLOAD_HINT_MAX) > + return -EINVAL; > + r = amdgpu_ctx_set_workload_profile(adev, fpriv, id, wl_hint); > + if (r) > + DRM_ERROR("Failed to set workload profile to %s\n", > + amdgpu_workload_profile_name(wl_hint)); > + else > + DRM_DEBUG_DRIVER("Workload profile set to %s\n", > + amdgpu_workload_profile_name(wl_hint)); > + break; > default: > return -EINVAL; > } > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h > index cc7c8afff414..6c8032c3291a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h > @@ -58,6 +58,7 @@ struct amdgpu_ctx { > unsigned long ras_counter_ce; > unsigned long ras_counter_ue; > uint32_t stable_pstate; > + uint32_t workload_mode; > }; > > struct amdgpu_ctx_mgr { ^ permalink raw reply [flat|nested] 76+ messages in thread
* [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile 2022-09-26 21:40 [PATCH v3 0/5] GPU workload hints for better performance Shashank Sharma ` (2 preceding siblings ...) 2022-09-26 21:40 ` [PATCH v3 3/5] drm/amdgpu: set GPU workload via ctx IOCTL Shashank Sharma @ 2022-09-26 21:40 ` Shashank Sharma 2022-09-27 6:11 ` Christian König 2022-09-27 10:03 ` Lazar, Lijo 2022-09-26 21:40 ` [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute Shashank Sharma 2022-09-27 16:24 ` [PATCH v3 0/5] GPU workload hints for better performance Michel Dänzer 5 siblings, 2 replies; 76+ messages in thread From: Shashank Sharma @ 2022-09-26 21:40 UTC (permalink / raw) To: amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, Shashank Sharma This patch switches the GPU workload profile based on the workload hint information saved in the workload context. The workload profile is reset to NONE when the job is done. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 4 ---- drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 15 +++++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++ 4 files changed, 20 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index b7bae833c804..de906a42144f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -237,6 +237,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs goto free_all_kdata; } + p->job->workload_mode = p->ctx->workload_mode; + if (p->uf_entry.tv.bo) p->job->uf_addr = uf_offset; kvfree(chunk_array); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c index a11cf29bc388..625114804121 100644 ---
a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c @@ -55,15 +55,11 @@ int amdgpu_set_workload_profile(struct amdgpu_device *adev, mutex_lock(&adev->pm.smu_workload_lock); - if (adev->pm.workload_mode == hint) - goto unlock; - ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); if (!ret) adev->pm.workload_mode = hint; atomic_inc(&adev->pm.workload_switch_ref); -unlock: mutex_unlock(&adev->pm.smu_workload_lock); return ret; } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index c2fd6f3076a6..9300e86ee7c5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -30,6 +30,7 @@ #include "amdgpu.h" #include "amdgpu_trace.h" #include "amdgpu_reset.h" +#include "amdgpu_ctx_workload.h" static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) { @@ -144,6 +145,14 @@ void amdgpu_job_free_resources(struct amdgpu_job *job) static void amdgpu_job_free_cb(struct drm_sched_job *s_job) { struct amdgpu_job *job = to_amdgpu_job(s_job); + struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); + + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) { + if (amdgpu_clear_workload_profile(ring->adev, job->workload_mode)) + DRM_WARN("Failed to come out of workload profile %s\n", + amdgpu_workload_profile_name(job->workload_mode)); + job->workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; + } drm_sched_job_cleanup(s_job); @@ -256,6 +265,12 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job) DRM_ERROR("Error scheduling IBs (%d)\n", r); } + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) { + if (amdgpu_set_workload_profile(ring->adev, job->workload_mode)) + DRM_WARN("Failed to set workload profile to %s\n", + amdgpu_workload_profile_name(job->workload_mode)); + } + job->job_run_counter++; amdgpu_job_free_resources(job); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h index babc0af751c2..573e8692c814 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h @@ -68,6 +68,9 @@ struct amdgpu_job { /* job_run_counter >= 1 means a resubmit job */ uint32_t job_run_counter; + /* workload mode hint for pm */ + uint32_t workload_mode; + uint32_t num_ibs; struct amdgpu_ib ibs[]; }; -- 2.34.1 ^ permalink raw reply related [flat|nested] 76+ messages in thread
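The per-job lifecycle this patch wires up — the context's hint is copied into the job at CS time, applied in amdgpu_job_run(), and dropped again in the job's free callback — can be modeled with two small helpers. This is a simplified sketch with illustrative names, not the driver's code:

```c
#include <assert.h>

enum wl_hint { WL_NONE, WL_3D, WL_VIDEO, WL_VR, WL_COMPUTE };

struct job {
	enum wl_hint workload_mode;	/* copied from the context at CS time */
};

static enum wl_hint active_profile = WL_NONE;
static int switch_ref;			/* models adev->pm.workload_switch_ref */

/* Models amdgpu_job_run(): apply the job's hint and take a reference. */
static void job_run(struct job *j)
{
	if (j->workload_mode != WL_NONE) {
		active_profile = j->workload_mode;
		switch_ref++;
	}
}

/* Models amdgpu_job_free_cb(): drop the reference; the last job out
 * returns the device to the default (NONE) profile. */
static void job_free(struct job *j)
{
	if (j->workload_mode != WL_NONE) {
		if (--switch_ref == 0)
			active_profile = WL_NONE;
		j->workload_mode = WL_NONE;
	}
}
```

Note that, as with the driver's single workload_mode/workload_switch_ref pair, the most recent job to run simply overwrites the active profile — the behavior Felix and Lijo question elsewhere in the thread when concurrent jobs prefer different profiles.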
* Re: [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile 2022-09-26 21:40 ` [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile Shashank Sharma @ 2022-09-27 6:11 ` Christian König 2022-09-27 10:03 ` Lazar, Lijo 1 sibling, 0 replies; 76+ messages in thread From: Christian König @ 2022-09-27 6:11 UTC (permalink / raw) To: Shashank Sharma, amd-gfx; +Cc: alexander.deucher, amaranath.somalapuram On 26.09.22 23:40, Shashank Sharma wrote: > This patch and switches the GPU workload based profile based > on the workload hint information saved in the workload context. > The workload profile is reset to NONE when the job is done. > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 4 ---- > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 15 +++++++++++++++ > drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++ > 4 files changed, 20 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index b7bae833c804..de906a42144f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -237,6 +237,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs > goto free_all_kdata; > } > > + p->job->workload_mode = p->ctx->workload_mode; > + > if (p->uf_entry.tv.bo) > p->job->uf_addr = uf_offset; > kvfree(chunk_array); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > index a11cf29bc388..625114804121 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > @@ -55,15 +55,11 @@ int amdgpu_set_workload_profile(struct amdgpu_device *adev, > > - if
(adev->pm.workload_mode == hint) > - goto unlock; > - > ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); > if (!ret) > adev->pm.workload_mode = hint; > atomic_inc(&adev->pm.workload_switch_ref); > > -unlock: > mutex_unlock(&adev->pm.smu_workload_lock); > return ret; > } > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > index c2fd6f3076a6..9300e86ee7c5 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > @@ -30,6 +30,7 @@ > #include "amdgpu.h" > #include "amdgpu_trace.h" > #include "amdgpu_reset.h" > +#include "amdgpu_ctx_workload.h" > > static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) > { > @@ -144,6 +145,14 @@ void amdgpu_job_free_resources(struct amdgpu_job *job) > static void amdgpu_job_free_cb(struct drm_sched_job *s_job) > { > struct amdgpu_job *job = to_amdgpu_job(s_job); > + struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); > + > + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) { > + if (amdgpu_clear_workload_profile(ring->adev, job->workload_mode)) > + DRM_WARN("Failed to come out of workload profile %s\n", > + amdgpu_workload_profile_name(job->workload_mode)); > + job->workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; > + } > > drm_sched_job_cleanup(s_job); > > @@ -256,6 +265,12 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job) > DRM_ERROR("Error scheduling IBs (%d)\n", r); > } > > + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) { > + if (amdgpu_set_workload_profile(ring->adev, job->workload_mode)) > + DRM_WARN("Failed to set workload profile to %s\n", > + amdgpu_workload_profile_name(job->workload_mode)); > + } > + > job->job_run_counter++; > amdgpu_job_free_resources(job); > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h > index babc0af751c2..573e8692c814 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h > +++ 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h > @@ -68,6 +68,9 @@ struct amdgpu_job { > /* job_run_counter >= 1 means a resubmit job */ > uint32_t job_run_counter; > > + /* workload mode hint for pm */ > + uint32_t workload_mode; > + > uint32_t num_ibs; > struct amdgpu_ib ibs[]; > }; ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile 2022-09-26 21:40 ` [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile Shashank Sharma 2022-09-27 6:11 ` Christian König @ 2022-09-27 10:03 ` Lazar, Lijo 2022-09-27 11:47 ` Sharma, Shashank 1 sibling, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-27 10:03 UTC (permalink / raw) To: Shashank Sharma, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 3:10 AM, Shashank Sharma wrote: > This patch and switches the GPU workload based profile based > on the workload hint information saved in the workload context. > The workload profile is reset to NONE when the job is done. > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ > drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 4 ---- > drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 15 +++++++++++++++ > drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++ > 4 files changed, 20 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > index b7bae833c804..de906a42144f 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c > @@ -237,6 +237,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs > goto free_all_kdata; > } > > + p->job->workload_mode = p->ctx->workload_mode; > + > if (p->uf_entry.tv.bo) > p->job->uf_addr = uf_offset; > kvfree(chunk_array); > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > index a11cf29bc388..625114804121 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c > @@ -55,15 +55,11 @@ int amdgpu_set_workload_profile(struct amdgpu_device *adev, > > mutex_lock(&adev->pm.smu_workload_lock); > > - if 
(adev->pm.workload_mode == hint) > - goto unlock; > - What is the expectation when a GFX job and a VCN job run together (or, in general, two jobs run in separate schedulers) and each prefers a different workload type? FW will switch as requested. Thanks, Lijo > ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); > if (!ret) > adev->pm.workload_mode = hint; > atomic_inc(&adev->pm.workload_switch_ref); > > -unlock: > mutex_unlock(&adev->pm.smu_workload_lock); > return ret; > } > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > index c2fd6f3076a6..9300e86ee7c5 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c > @@ -30,6 +30,7 @@ > #include "amdgpu.h" > #include "amdgpu_trace.h" > #include "amdgpu_reset.h" > +#include "amdgpu_ctx_workload.h" > > static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) > { > @@ -144,6 +145,14 @@ void amdgpu_job_free_resources(struct amdgpu_job *job) > static void amdgpu_job_free_cb(struct drm_sched_job *s_job) > { > struct amdgpu_job *job = to_amdgpu_job(s_job); > + struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); > + > + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) { > + if (amdgpu_clear_workload_profile(ring->adev, job->workload_mode)) > + DRM_WARN("Failed to come out of workload profile %s\n", > + amdgpu_workload_profile_name(job->workload_mode)); > + job->workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; > + } > > drm_sched_job_cleanup(s_job); > > @@ -256,6 +265,12 @@ static struct dma_fence *amdgpu_job_run(struct drm_sched_job *sched_job) > DRM_ERROR("Error scheduling IBs (%d)\n", r); > } > > + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) { > + if (amdgpu_set_workload_profile(ring->adev, job->workload_mode)) > + DRM_WARN("Failed to set workload profile to %s\n", > + amdgpu_workload_profile_name(job->workload_mode)); > + } > + > job->job_run_counter++; >
amdgpu_job_free_resources(job); > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h > index babc0af751c2..573e8692c814 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h > @@ -68,6 +68,9 @@ struct amdgpu_job { > /* job_run_counter >= 1 means a resubmit job */ > uint32_t job_run_counter; > > + /* workload mode hint for pm */ > + uint32_t workload_mode; > + > uint32_t num_ibs; > struct amdgpu_ib ibs[]; > }; > ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile 2022-09-27 10:03 ` Lazar, Lijo @ 2022-09-27 11:47 ` Sharma, Shashank 2022-09-27 12:20 ` Lazar, Lijo 2022-09-27 16:33 ` Michel Dänzer 0 siblings, 2 replies; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 11:47 UTC (permalink / raw) To: Lazar, Lijo, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 12:03 PM, Lazar, Lijo wrote: > > > On 9/27/2022 3:10 AM, Shashank Sharma wrote: >> This patch and switches the GPU workload based profile based >> on the workload hint information saved in the workload context. >> The workload profile is reset to NONE when the job is done. >> >> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 4 ---- >> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 15 +++++++++++++++ >> drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++ >> 4 files changed, 20 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> index b7bae833c804..de906a42144f 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >> @@ -237,6 +237,8 @@ static int amdgpu_cs_parser_init(struct >> amdgpu_cs_parser *p, union drm_amdgpu_cs >> goto free_all_kdata; >> } >> + p->job->workload_mode = p->ctx->workload_mode; >> + >> if (p->uf_entry.tv.bo) >> p->job->uf_addr = uf_offset; >> kvfree(chunk_array); >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> index a11cf29bc388..625114804121 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >> @@ -55,15 +55,11 @@ int amdgpu_set_workload_profile(struct >> amdgpu_device *adev, >> 
mutex_lock(&adev->pm.smu_workload_lock); >> - if (adev->pm.workload_mode == hint) >> - goto unlock; >> - > > What is the expectation when a GFX job + VCN job together (or in general > two jobs running in separate schedulers) and each prefers a different > workload type? FW will switch as requested. Well, I guess the last switched mode will take over. Do note that, like most of the PM features, the real benefit of power profiles can be seen with consistent and similar workloads running for some time (like gaming, video playback, etc.). - Shashank > > Thanks, > Lijo > >> ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >> if (!ret) >> adev->pm.workload_mode = hint; >> atomic_inc(&adev->pm.workload_switch_ref); >> -unlock: >> mutex_unlock(&adev->pm.smu_workload_lock); >> return ret; >> } >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >> index c2fd6f3076a6..9300e86ee7c5 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >> @@ -30,6 +30,7 @@ >> #include "amdgpu.h" >> #include "amdgpu_trace.h" >> #include "amdgpu_reset.h" >> +#include "amdgpu_ctx_workload.h" >> static enum drm_gpu_sched_stat amdgpu_job_timedout(struct >> drm_sched_job *s_job) >> { >> @@ -144,6 +145,14 @@ void amdgpu_job_free_resources(struct amdgpu_job >> *job) >> static void amdgpu_job_free_cb(struct drm_sched_job *s_job) >> { >> struct amdgpu_job *job = to_amdgpu_job(s_job); >> + struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); >> + >> + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) { >> + if (amdgpu_clear_workload_profile(ring->adev, >> job->workload_mode)) >> + DRM_WARN("Failed to come out of workload profile %s\n", >> + amdgpu_workload_profile_name(job->workload_mode)); >> + job->workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >> + } >> drm_sched_job_cleanup(s_job); >> @@ -256,6 +265,12 @@ static struct dma_fence *amdgpu_job_run(struct >> drm_sched_job *sched_job) >>
DRM_ERROR("Error scheduling IBs (%d)\n", r); >> } >> + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) { >> + if (amdgpu_set_workload_profile(ring->adev, job->workload_mode)) >> + DRM_WARN("Failed to set workload profile to %s\n", >> + amdgpu_workload_profile_name(job->workload_mode)); >> + } >> + >> job->job_run_counter++; >> amdgpu_job_free_resources(job); >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >> index babc0af751c2..573e8692c814 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >> @@ -68,6 +68,9 @@ struct amdgpu_job { >> /* job_run_counter >= 1 means a resubmit job */ >> uint32_t job_run_counter; >> + /* workload mode hint for pm */ >> + uint32_t workload_mode; >> + >> uint32_t num_ibs; >> struct amdgpu_ib ibs[]; >> }; >> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile 2022-09-27 11:47 ` Sharma, Shashank @ 2022-09-27 12:20 ` Lazar, Lijo 2022-09-27 12:25 ` Sharma, Shashank 2022-09-27 16:33 ` Michel Dänzer 1 sibling, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-27 12:20 UTC (permalink / raw) To: Sharma, Shashank, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 5:17 PM, Sharma, Shashank wrote: > > > On 9/27/2022 12:03 PM, Lazar, Lijo wrote: >> >> >> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>> This patch and switches the GPU workload based profile based >>> on the workload hint information saved in the workload context. >>> The workload profile is reset to NONE when the job is done. >>> >>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 4 ---- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 15 +++++++++++++++ >>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++ >>> 4 files changed, 20 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> index b7bae833c804..de906a42144f 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> @@ -237,6 +237,8 @@ static int amdgpu_cs_parser_init(struct >>> amdgpu_cs_parser *p, union drm_amdgpu_cs >>> goto free_all_kdata; >>> } >>> + p->job->workload_mode = p->ctx->workload_mode; >>> + >>> if (p->uf_entry.tv.bo) >>> p->job->uf_addr = uf_offset; >>> kvfree(chunk_array); >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>> index a11cf29bc388..625114804121 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>> @@ -55,15 +55,11 @@ int 
amdgpu_set_workload_profile(struct >>> amdgpu_device *adev, >>> mutex_lock(&adev->pm.smu_workload_lock); >>> - if (adev->pm.workload_mode == hint) >>> - goto unlock; >>> - >> >> What is the expectation when a GFX job + VCN job together (or in >> general two jobs running in separate schedulers) and each prefers a >> different workload type? FW will switch as requested. > > Well, I guess the last switched mode will take over. Do note that like > most of the PM features, the real benefit of power profiles can be seen > with consistant and similar workloads running for some time (Like > gaming, video playback etc). > Yes, so the extra protection layer wrapping around this is really not helping (user doesn't know if the job is really run in the requested mode). I would suggest to avoid that and document the usage of this API as exclusive mode usage for some profiling use cases. Thanks, Lijo > - Shashank > >> >> Thanks, >> Lijo >> >>> ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>> if (!ret) >>> adev->pm.workload_mode = hint; >>> atomic_inc(&adev->pm.workload_switch_ref); >>> -unlock: >>> mutex_unlock(&adev->pm.smu_workload_lock); >>> return ret; >>> } >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>> index c2fd6f3076a6..9300e86ee7c5 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>> @@ -30,6 +30,7 @@ >>> #include "amdgpu.h" >>> #include "amdgpu_trace.h" >>> #include "amdgpu_reset.h" >>> +#include "amdgpu_ctx_workload.h" >>> static enum drm_gpu_sched_stat amdgpu_job_timedout(struct >>> drm_sched_job *s_job) >>> { >>> @@ -144,6 +145,14 @@ void amdgpu_job_free_resources(struct amdgpu_job >>> *job) >>> static void amdgpu_job_free_cb(struct drm_sched_job *s_job) >>> { >>> struct amdgpu_job *job = to_amdgpu_job(s_job); >>> + struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); >>> + >>> + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) 
{ >>> + if (amdgpu_clear_workload_profile(ring->adev, >>> job->workload_mode)) >>> + DRM_WARN("Failed to come out of workload profile %s\n", >>> + amdgpu_workload_profile_name(job->workload_mode)); >>> + job->workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >>> + } >>> drm_sched_job_cleanup(s_job); >>> @@ -256,6 +265,12 @@ static struct dma_fence *amdgpu_job_run(struct >>> drm_sched_job *sched_job) >>> DRM_ERROR("Error scheduling IBs (%d)\n", r); >>> } >>> + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) { >>> + if (amdgpu_set_workload_profile(ring->adev, >>> job->workload_mode)) >>> + DRM_WARN("Failed to set workload profile to %s\n", >>> + amdgpu_workload_profile_name(job->workload_mode)); >>> + } >>> + >>> job->job_run_counter++; >>> amdgpu_job_free_resources(job); >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >>> index babc0af751c2..573e8692c814 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >>> @@ -68,6 +68,9 @@ struct amdgpu_job { >>> /* job_run_counter >= 1 means a resubmit job */ >>> uint32_t job_run_counter; >>> + /* workload mode hint for pm */ >>> + uint32_t workload_mode; >>> + >>> uint32_t num_ibs; >>> struct amdgpu_ib ibs[]; >>> }; >>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile 2022-09-27 12:20 ` Lazar, Lijo @ 2022-09-27 12:25 ` Sharma, Shashank 0 siblings, 0 replies; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 12:25 UTC (permalink / raw) To: Lazar, Lijo, amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig On 9/27/2022 2:20 PM, Lazar, Lijo wrote: > > > On 9/27/2022 5:17 PM, Sharma, Shashank wrote: >> >> >> On 9/27/2022 12:03 PM, Lazar, Lijo wrote: >>> >>> >>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>> This patch and switches the GPU workload based profile based >>>> on the workload hint information saved in the workload context. >>>> The workload profile is reset to NONE when the job is done. >>>> >>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>> --- >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 4 ---- >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 15 +++++++++++++++ >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++ >>>> 4 files changed, 20 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>> index b7bae833c804..de906a42144f 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>> @@ -237,6 +237,8 @@ static int amdgpu_cs_parser_init(struct >>>> amdgpu_cs_parser *p, union drm_amdgpu_cs >>>> goto free_all_kdata; >>>> } >>>> + p->job->workload_mode = p->ctx->workload_mode; >>>> + >>>> if (p->uf_entry.tv.bo) >>>> p->job->uf_addr = uf_offset; >>>> kvfree(chunk_array); >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> index a11cf29bc388..625114804121 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> @@ 
-55,15 +55,11 @@ int amdgpu_set_workload_profile(struct >>>> amdgpu_device *adev, >>>> mutex_lock(&adev->pm.smu_workload_lock); >>>> - if (adev->pm.workload_mode == hint) >>>> - goto unlock; >>>> - >>> >>> What is the expectation when a GFX job + VCN job together (or in >>> general two jobs running in separate schedulers) and each prefers a >>> different workload type? FW will switch as requested. >> >> Well, I guess the last switched mode will take over. Do note that like >> most of the PM features, the real benefit of power profiles can be >> seen with consistant and similar workloads running for some time (Like >> gaming, video playback etc). >> > > Yes, so the extra protection layer wrapping around this is really not > helping (user doesn't know if the job is really run in the requested > mode). I would suggest to avoid that and document the usage of this API > as exclusive mode usage for some profiling use cases. > As I mentioned in the other comment, this extra protection is not meant to prevent changing the mode, but to prevent the PM reset from the job_cleanup thread while another job is still in progress.
- Shashank > Thanks, > Lijo > >> - Shashank >> >>> >>> Thanks, >>> Lijo >>> >>>> ret = amdgpu_dpm_switch_power_profile(adev, profile, 1); >>>> if (!ret) >>>> adev->pm.workload_mode = hint; >>>> atomic_inc(&adev->pm.workload_switch_ref); >>>> -unlock: >>>> mutex_unlock(&adev->pm.smu_workload_lock); >>>> return ret; >>>> } >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>>> index c2fd6f3076a6..9300e86ee7c5 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c >>>> @@ -30,6 +30,7 @@ >>>> #include "amdgpu.h" >>>> #include "amdgpu_trace.h" >>>> #include "amdgpu_reset.h" >>>> +#include "amdgpu_ctx_workload.h" >>>> static enum drm_gpu_sched_stat amdgpu_job_timedout(struct >>>> drm_sched_job *s_job) >>>> { >>>> @@ -144,6 +145,14 @@ void amdgpu_job_free_resources(struct >>>> amdgpu_job *job) >>>> static void amdgpu_job_free_cb(struct drm_sched_job *s_job) >>>> { >>>> struct amdgpu_job *job = to_amdgpu_job(s_job); >>>> + struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); >>>> + >>>> + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) { >>>> + if (amdgpu_clear_workload_profile(ring->adev, >>>> job->workload_mode)) >>>> + DRM_WARN("Failed to come out of workload profile %s\n", >>>> + amdgpu_workload_profile_name(job->workload_mode)); >>>> + job->workload_mode = AMDGPU_CTX_WORKLOAD_HINT_NONE; >>>> + } >>>> drm_sched_job_cleanup(s_job); >>>> @@ -256,6 +265,12 @@ static struct dma_fence *amdgpu_job_run(struct >>>> drm_sched_job *sched_job) >>>> DRM_ERROR("Error scheduling IBs (%d)\n", r); >>>> } >>>> + if (job->workload_mode != AMDGPU_CTX_WORKLOAD_HINT_NONE) { >>>> + if (amdgpu_set_workload_profile(ring->adev, >>>> job->workload_mode)) >>>> + DRM_WARN("Failed to set workload profile to %s\n", >>>> + amdgpu_workload_profile_name(job->workload_mode)); >>>> + } >>>> + >>>> job->job_run_counter++; >>>> amdgpu_job_free_resources(job); >>>> diff --git 
a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >>>> index babc0af751c2..573e8692c814 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h >>>> @@ -68,6 +68,9 @@ struct amdgpu_job { >>>> /* job_run_counter >= 1 means a resubmit job */ >>>> uint32_t job_run_counter; >>>> + /* workload mode hint for pm */ >>>> + uint32_t workload_mode; >>>> + >>>> uint32_t num_ibs; >>>> struct amdgpu_ib ibs[]; >>>> }; >>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile 2022-09-27 11:47 ` Sharma, Shashank 2022-09-27 12:20 ` Lazar, Lijo @ 2022-09-27 16:33 ` Michel Dänzer 2022-09-27 17:06 ` Sharma, Shashank 1 sibling, 1 reply; 76+ messages in thread From: Michel Dänzer @ 2022-09-27 16:33 UTC (permalink / raw) To: Sharma, Shashank, Lazar, Lijo Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, amd-gfx On 2022-09-27 13:47, Sharma, Shashank wrote: > On 9/27/2022 12:03 PM, Lazar, Lijo wrote: >> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>> This patch and switches the GPU workload based profile based >>> on the workload hint information saved in the workload context. >>> The workload profile is reset to NONE when the job is done. >>> >>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>> --- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 4 ---- >>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 15 +++++++++++++++ >>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++ >>> 4 files changed, 20 insertions(+), 4 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> index b7bae833c804..de906a42144f 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>> @@ -237,6 +237,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs >>> goto free_all_kdata; >>> } >>> + p->job->workload_mode = p->ctx->workload_mode; >>> + >>> if (p->uf_entry.tv.bo) >>> p->job->uf_addr = uf_offset; >>> kvfree(chunk_array); >>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>> index a11cf29bc388..625114804121 100644 >>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>> @@ -55,15 +55,11 @@ int 
amdgpu_set_workload_profile(struct amdgpu_device *adev, >>> mutex_lock(&adev->pm.smu_workload_lock); >>> - if (adev->pm.workload_mode == hint) >>> - goto unlock; >>> - >> >> What is the expectation when a GFX job + VCN job together (or in general two jobs running in separate schedulers) and each prefers a different workload type? FW will switch as requested. > > Well, I guess the last switched mode will take over. Do note that like most of the PM features, the real benefit of power profiles can be seen with consistant and similar workloads running for some time (Like gaming, video playback etc). Not sure how that's supposed to work on a general purpose system, where there are always expected to be multiple processes (one of which being the display server) using the GPU for different workloads. Even in special cases there may be multiple different kinds of workloads constantly being used at the same time, e.g. a fullscreen game with live streaming / recording using VCN. Have you guys considered letting the display server (DRM master) choose the profile instead? -- Earthling Michel Dänzer | https://redhat.com Libre software enthusiast | Mesa and Xwayland developer ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile 2022-09-27 16:33 ` Michel Dänzer @ 2022-09-27 17:06 ` Sharma, Shashank 2022-09-27 17:29 ` Michel Dänzer 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 17:06 UTC (permalink / raw) To: Michel Dänzer, Lazar, Lijo Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, amd-gfx On 9/27/2022 6:33 PM, Michel Dänzer wrote: > On 2022-09-27 13:47, Sharma, Shashank wrote: >> On 9/27/2022 12:03 PM, Lazar, Lijo wrote: >>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>> This patch and switches the GPU workload based profile based >>>> on the workload hint information saved in the workload context. >>>> The workload profile is reset to NONE when the job is done. >>>> >>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>> --- >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 4 ---- >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 15 +++++++++++++++ >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++ >>>> 4 files changed, 20 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>> index b7bae833c804..de906a42144f 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>> @@ -237,6 +237,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs >>>> goto free_all_kdata; >>>> } >>>> + p->job->workload_mode = p->ctx->workload_mode; >>>> + >>>> if (p->uf_entry.tv.bo) >>>> p->job->uf_addr = uf_offset; >>>> kvfree(chunk_array); >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> index a11cf29bc388..625114804121 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> +++ 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>> @@ -55,15 +55,11 @@ int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>> mutex_lock(&adev->pm.smu_workload_lock); >>>> - if (adev->pm.workload_mode == hint) >>>> - goto unlock; >>>> - >>> >>> What is the expectation when a GFX job + VCN job together (or in general two jobs running in separate schedulers) and each prefers a different workload type? FW will switch as requested. >> >> Well, I guess the last switched mode will take over. Do note that like most of the PM features, the real benefit of power profiles can be seen with consistant and similar workloads running for some time (Like gaming, video playback etc). > > Not sure how that's supposed to work on a general purpose system, where there are always expected to be multiple processes (one of which being the display server) using the GPU for different workloads. > > Even in special cases there may be multiple different kinds of workloads constantly being used at the same time, e.g. a fullscreen game with live streaming / recording using VCN. > It looks like we can accommodate that now, see the recent discussion with Felix in patch 5, where we see that "amdgpu_dpm_switch_power_profile enables and disables individual profiles, Disabling the 3D profile doesn't disable the compute profile at the same time". So I think we won't be overwriting but would be enabling/disabling individual profiles for compute/3D/MM etc. Of course I will have to update the patch series accordingly. > > Have you guys considered letting the display server (DRM master) choose the profile instead? > This seems to be very good input; in case of a further conflict in decision making, we might as well add this intelligence to the DRM master. Would you mind explaining a bit more how you think it should be done? - Shashank ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile 2022-09-27 17:06 ` Sharma, Shashank @ 2022-09-27 17:29 ` Michel Dänzer 0 siblings, 0 replies; 76+ messages in thread From: Michel Dänzer @ 2022-09-27 17:29 UTC (permalink / raw) To: Sharma, Shashank, Lazar, Lijo Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, amd-gfx On 2022-09-27 19:06, Sharma, Shashank wrote: > On 9/27/2022 6:33 PM, Michel Dänzer wrote: >> On 2022-09-27 13:47, Sharma, Shashank wrote: >>> On 9/27/2022 12:03 PM, Lazar, Lijo wrote: >>>> On 9/27/2022 3:10 AM, Shashank Sharma wrote: >>>>> This patch and switches the GPU workload based profile based >>>>> on the workload hint information saved in the workload context. >>>>> The workload profile is reset to NONE when the job is done. >>>>> >>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>> --- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 ++ >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c | 4 ---- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 15 +++++++++++++++ >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 3 +++ >>>>> 4 files changed, 20 insertions(+), 4 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>> index b7bae833c804..de906a42144f 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c >>>>> @@ -237,6 +237,8 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, union drm_amdgpu_cs >>>>> goto free_all_kdata; >>>>> } >>>>> + p->job->workload_mode = p->ctx->workload_mode; >>>>> + >>>>> if (p->uf_entry.tv.bo) >>>>> p->job->uf_addr = uf_offset; >>>>> kvfree(chunk_array); >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>> index a11cf29bc388..625114804121 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>> 
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx_workload.c >>>>> @@ -55,15 +55,11 @@ int amdgpu_set_workload_profile(struct amdgpu_device *adev, >>>>> mutex_lock(&adev->pm.smu_workload_lock); >>>>> - if (adev->pm.workload_mode == hint) >>>>> - goto unlock; >>>>> - >>>> >>>> What is the expectation when a GFX job + VCN job together (or in general two jobs running in separate schedulers) and each prefers a different workload type? FW will switch as requested. >>> >>> Well, I guess the last switched mode will take over. Do note that like most of the PM features, the real benefit of power profiles can be seen with consistant and similar workloads running for some time (Like gaming, video playback etc). >> >> Not sure how that's supposed to work on a general purpose system, where there are always expected to be multiple processes (one of which being the display server) using the GPU for different workloads. >> >> Even in special cases there may be multiple different kinds of workloads constantly being used at the same time, e.g. a fullscreen game with live streaming / recording using VCN. >> > It looks like we can accommodate that now, see the recent discussion with Felix in the patch 5, where we see that "amdgpu_dpm_switch_power_profile enables and disables individual profiles, Disabling the 3D profile doesn't disable the compute profile at the same time" > > So I think we won't be overwriting but would be enabling/disabling individual profiles for compute/3D/MM etc. Of course I will have to update the patch series accordingly. > >> >> Have you guys considered letting the display server (DRM master) choose the profile instead? >> > This seems to be very good input, in case of a further conflict in decision making, we might > > as well add this intelligence in DRM master. Would you mind explaining this a bit more on how do you think it should be done ? 
I don't have any specific ideas offhand; it was just an idea that happened to come to my mind, not sure it's a good one at all. Anyway, I think one important thing is that the same circumstances consistently result in the same profile being chosen. If it depends on luck / timing, that's a troubleshooting nightmare. -- Earthling Michel Dänzer | https://redhat.com Libre software enthusiast | Mesa and Xwayland developer ^ permalink raw reply [flat|nested] 76+ messages in thread
* [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-26 21:40 [PATCH v3 0/5] GPU workload hints for better performance Shashank Sharma ` (3 preceding siblings ...) 2022-09-26 21:40 ` [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile Shashank Sharma @ 2022-09-26 21:40 ` Shashank Sharma 2022-09-27 6:12 ` Christian König 2022-09-27 16:24 ` [PATCH v3 0/5] GPU workload hints for better performance Michel Dänzer 5 siblings, 1 reply; 76+ messages in thread From: Shashank Sharma @ 2022-09-26 21:40 UTC (permalink / raw) To: amd-gfx Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, Shashank Sharma This patch switches the GPU workload mode to/from compute mode, while submitting compute workload. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index 5e53a5293935..1caed319a448 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c @@ -34,6 +34,7 @@ #include "amdgpu_ras.h" #include "amdgpu_umc.h" #include "amdgpu_reset.h" +#include "amdgpu_ctx_workload.h" /* Total memory size in system memory and all GPU VRAM. Used to * estimate worst case amount of memory to reserve for page tables @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev, void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle) { - amdgpu_dpm_switch_power_profile(adev, - PP_SMC_POWER_PROFILE_COMPUTE, - !idle); + int ret; + + if (idle) + ret = amdgpu_clear_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); + else + ret = amdgpu_set_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); + + if (ret) + drm_warn(&adev->ddev, "Failed to %s power profile to compute mode\n", + idle ? 
"reset" : "set"); } bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid) -- 2.34.1 ^ permalink raw reply related [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-26 21:40 ` [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute Shashank Sharma @ 2022-09-27 6:12 ` Christian König 2022-09-27 14:48 ` Felix Kuehling 0 siblings, 1 reply; 76+ messages in thread From: Christian König @ 2022-09-27 6:12 UTC (permalink / raw) To: Shashank Sharma, amd-gfx, Kuehling, Felix Cc: alexander.deucher, amaranath.somalapuram Am 26.09.22 um 23:40 schrieb Shashank Sharma: > This patch switches the GPU workload mode to/from > compute mode, while submitting compute workload. > > Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Feel free to add my acked-by, but Felix should probably take a look as well. Christian. > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++--- > 1 file changed, 11 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > index 5e53a5293935..1caed319a448 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > @@ -34,6 +34,7 @@ > #include "amdgpu_ras.h" > #include "amdgpu_umc.h" > #include "amdgpu_reset.h" > +#include "amdgpu_ctx_workload.h" > > /* Total memory size in system memory and all GPU VRAM. Used to > * estimate worst case amount of memory to reserve for page tables > @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device *adev, > > void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, bool idle) > { > - amdgpu_dpm_switch_power_profile(adev, > - PP_SMC_POWER_PROFILE_COMPUTE, > - !idle); > + int ret; > + > + if (idle) > + ret = amdgpu_clear_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); > + else > + ret = amdgpu_set_workload_profile(adev, AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); > + > + if (ret) > + drm_warn(&adev->ddev, "Failed to %s power profile to compute mode\n", > + idle ? 
"reset" : "set"); > } > > bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid) ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute
  2022-09-27  6:12 ` Christian König
@ 2022-09-27 14:48 ` Felix Kuehling
  2022-09-27 14:58 ` Sharma, Shashank
  0 siblings, 1 reply; 76+ messages in thread
From: Felix Kuehling @ 2022-09-27 14:48 UTC (permalink / raw)
  To: Christian König, Shashank Sharma, amd-gfx
  Cc: alexander.deucher, amaranath.somalapuram

On 2022-09-27 02:12, Christian König wrote:
> On 26.09.22 23:40, Shashank Sharma wrote:
>> This patch switches the GPU workload mode to/from
>> compute mode, while submitting compute workload.
>>
>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
>
> Feel free to add my acked-by, but Felix should probably take a look as
> well.

This looks OK purely from a compute perspective. But I'm concerned about
the interaction of compute with graphics or multiple graphics contexts
submitting work concurrently. They would constantly override or disable
each other's workload hints.

For example, you have an amdgpu_ctx with
AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
process that also wants the compute profile. Those could be different
processes belonging to different users. Say, KFD enables the compute
profile first. Then the graphics context submits a job. At the start of
the job, the compute profile is enabled. That's a no-op because KFD
already enabled the compute profile. When the job finishes, it disables
the compute profile for everyone, including KFD. That's unexpected.

Or you have multiple VCN contexts. When context1 finishes a job, it
disables the VIDEO profile. But context2 still has a job on the other
VCN engine and wants the VIDEO profile to still be enabled.

Regards,
  Felix

>
> Christian.
> >> --- >> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++--- >> 1 file changed, 11 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >> index 5e53a5293935..1caed319a448 100644 >> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >> @@ -34,6 +34,7 @@ >> #include "amdgpu_ras.h" >> #include "amdgpu_umc.h" >> #include "amdgpu_reset.h" >> +#include "amdgpu_ctx_workload.h" >> /* Total memory size in system memory and all GPU VRAM. Used to >> * estimate worst case amount of memory to reserve for page tables >> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device >> *adev, >> void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, >> bool idle) >> { >> - amdgpu_dpm_switch_power_profile(adev, >> - PP_SMC_POWER_PROFILE_COMPUTE, >> - !idle); >> + int ret; >> + >> + if (idle) >> + ret = amdgpu_clear_workload_profile(adev, >> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >> + else >> + ret = amdgpu_set_workload_profile(adev, >> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >> + >> + if (ret) >> + drm_warn(&adev->ddev, "Failed to %s power profile to compute >> mode\n", >> + idle ? "reset" : "set"); >> } >> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid) > ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute
  2022-09-27 14:48 ` Felix Kuehling
@ 2022-09-27 14:58 ` Sharma, Shashank
  2022-09-27 15:23 ` Felix Kuehling
  0 siblings, 1 reply; 76+ messages in thread
From: Sharma, Shashank @ 2022-09-27 14:58 UTC (permalink / raw)
  To: Felix Kuehling, Christian König, amd-gfx
  Cc: alexander.deucher, amaranath.somalapuram

Hello Felix,

Thanks for the review comments.

On 9/27/2022 4:48 PM, Felix Kuehling wrote:
> On 2022-09-27 02:12, Christian König wrote:
>> On 26.09.22 23:40, Shashank Sharma wrote:
>>> This patch switches the GPU workload mode to/from
>>> compute mode, while submitting compute workload.
>>>
>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
>>
>> Feel free to add my acked-by, but Felix should probably take a look as
>> well.
>
> This look OK purely from a compute perspective. But I'm concerned about
> the interaction of compute with graphics or multiple graphics contexts
> submitting work concurrently. They would constantly override or disable
> each other's workload hints.
>
> For example, you have an amdgpu_ctx with
> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD
> process that also wants the compute profile. Those could be different
> processes belonging to different users. Say, KFD enables the compute
> profile first. Then the graphics context submits a job. At the start of
> the job, the compute profile is enabled. That's a no-op because KFD
> already enabled the compute profile. When the job finishes, it disables
> the compute profile for everyone, including KFD. That's unexpected.
>

In this case, it will not disable the compute profile, as the reference
counter will not be zero. The reset_profile() will only act if the
reference counter is 0.
But I would be happy to get any inputs about a policy which can be more
sustainable and gets better outputs, for example:
- should we not allow a profile change if a PP mode is already applied,
  and keep it on an early-bird basis?

For example: Policy A
- Job A sets the profile to compute
- Job B tries to set the profile to 3D, but we do not allow it as job A
  has not finished yet.

Or Policy B: Current one
- Job A sets the profile to compute
- Job B tries to set the profile to 3D, and we allow it. Job A also runs
  in PP 3D
- Job B finishes, but does not reset PP as the reference count is not
  zero due to compute
- Job A finishes, profile reset to NONE

Or anything else?

Regards
Shashank

> Or you have multiple VCN contexts. When context1 finishes a job, it
> disables the VIDEO profile. But context2 still has a job on the other
> VCN engine and wants the VIDEO profile to still be enabled.
>
> Regards,
>   Felix
>
>
>>
>> Christian.
>>
>>> ---
>>>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++---
>>>   1 file changed, 11 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> index 5e53a5293935..1caed319a448 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
>>> @@ -34,6 +34,7 @@
>>>   #include "amdgpu_ras.h"
>>>   #include "amdgpu_umc.h"
>>>   #include "amdgpu_reset.h"
>>> +#include "amdgpu_ctx_workload.h"
>>>   /* Total memory size in system memory and all GPU VRAM.
Used to >>> * estimate worst case amount of memory to reserve for page tables >>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct amdgpu_device >>> *adev, >>> void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, >>> bool idle) >>> { >>> - amdgpu_dpm_switch_power_profile(adev, >>> - PP_SMC_POWER_PROFILE_COMPUTE, >>> - !idle); >>> + int ret; >>> + >>> + if (idle) >>> + ret = amdgpu_clear_workload_profile(adev, >>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>> + else >>> + ret = amdgpu_set_workload_profile(adev, >>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>> + >>> + if (ret) >>> + drm_warn(&adev->ddev, "Failed to %s power profile to compute >>> mode\n", >>> + idle ? "reset" : "set"); >>> } >>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 vmid) >> ^ permalink raw reply [flat|nested] 76+ messages in thread
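[Editorial note] The single shared reference counter that Shashank describes above ("Policy B": last request overwrites the active profile, reset to NONE only when the count drops to zero) can be sketched as below. This is an illustrative userspace sketch, not the actual amdgpu implementation; all identifiers are hypothetical, and real driver code would take a lock around these updates.

```c
#include <assert.h>

/*
 * Hypothetical sketch of Policy B with one shared reference count:
 * every set_workload_profile() overwrites the active profile, and the
 * profile is reset to NONE only when the last job drops its reference.
 */
enum workload_hint { HINT_NONE, HINT_3D, HINT_VIDEO, HINT_VR, HINT_COMPUTE };

static int profile_refcount;                   /* shared by all profiles */
static enum workload_hint active = HINT_NONE;

static void set_workload_profile(enum workload_hint hint)
{
	profile_refcount++;
	active = hint;          /* a later job simply overwrites the profile */
}

static void clear_workload_profile(void)
{
	assert(profile_refcount > 0);
	if (--profile_refcount == 0)
		active = HINT_NONE;     /* reset only when nobody holds a ref */
}
```

Under this sketch, Job A's profile is silently changed when Job B sets a different one, which is exactly the cross-context interaction Felix is questioning.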
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-27 14:58 ` Sharma, Shashank @ 2022-09-27 15:23 ` Felix Kuehling 2022-09-27 15:38 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Felix Kuehling @ 2022-09-27 15:23 UTC (permalink / raw) To: Sharma, Shashank, Christian König, amd-gfx Cc: alexander.deucher, amaranath.somalapuram Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: > Hello Felix, > > Thank for the review comments. > > On 9/27/2022 4:48 PM, Felix Kuehling wrote: >> Am 2022-09-27 um 02:12 schrieb Christian König: >>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: >>>> This patch switches the GPU workload mode to/from >>>> compute mode, while submitting compute workload. >>>> >>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>> >>> Feel free to add my acked-by, but Felix should probably take a look >>> as well. >> >> This look OK purely from a compute perspective. But I'm concerned >> about the interaction of compute with graphics or multiple graphics >> contexts submitting work concurrently. They would constantly override >> or disable each other's workload hints. >> >> For example, you have an amdgpu_ctx with >> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD >> process that also wants the compute profile. Those could be different >> processes belonging to different users. Say, KFD enables the compute >> profile first. Then the graphics context submits a job. At the start >> of the job, the compute profile is enabled. That's a no-op because >> KFD already enabled the compute profile. When the job finishes, it >> disables the compute profile for everyone, including KFD. That's >> unexpected. >> > > In this case, it will not disable the compute profile, as the > reference counter will not be zero. The reset_profile() will only act > if the reference counter is 0. OK, I missed the reference counter. 
> > But I would be happy to get any inputs about a policy which can be > more sustainable and gets better outputs, for example: > - should we not allow a profile change, if a PP mode is already > applied and keep it Early bird basis ? > > For example: Policy A > - Job A sets the profile to compute > - Job B tries to set profile to 3D, but we do not allow it as job A is > not finished it yet. > > Or Policy B: Current one > - Job A sets the profile to compute > - Job B tries to set profile to 3D, and we allow it. Job A also runs > in PP 3D > - Job B finishes, but does not reset PP as reference count is not zero > due to compute > - Job A finishes, profile reset to NONE I think this won't work. As I understand it, the amdgpu_dpm_switch_power_profile enables and disables individual profiles. Disabling the 3D profile doesn't disable the compute profile at the same time. I think you'll need one refcount per profile. Regards, Felix > > > Or anything else ? > > REgards > Shashank > > >> Or you have multiple VCN contexts. When context1 finishes a job, it >> disables the VIDEO profile. But context2 still has a job on the other >> VCN engine and wants the VIDEO profile to still be enabled. >> >> Regards, >> Felix >> >> >>> >>> Christian. >>> >>>> --- >>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++--- >>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>> index 5e53a5293935..1caed319a448 100644 >>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>> @@ -34,6 +34,7 @@ >>>> #include "amdgpu_ras.h" >>>> #include "amdgpu_umc.h" >>>> #include "amdgpu_reset.h" >>>> +#include "amdgpu_ctx_workload.h" >>>> /* Total memory size in system memory and all GPU VRAM. 
Used to >>>> * estimate worst case amount of memory to reserve for page tables >>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct >>>> amdgpu_device *adev, >>>> void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, >>>> bool idle) >>>> { >>>> - amdgpu_dpm_switch_power_profile(adev, >>>> - PP_SMC_POWER_PROFILE_COMPUTE, >>>> - !idle); >>>> + int ret; >>>> + >>>> + if (idle) >>>> + ret = amdgpu_clear_workload_profile(adev, >>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>> + else >>>> + ret = amdgpu_set_workload_profile(adev, >>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>> + >>>> + if (ret) >>>> + drm_warn(&adev->ddev, "Failed to %s power profile to >>>> compute mode\n", >>>> + idle ? "reset" : "set"); >>>> } >>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 >>>> vmid) >>> ^ permalink raw reply [flat|nested] 76+ messages in thread
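[Editorial note] Felix's suggestion of one refcount per profile could look roughly like the sketch below. The names are hypothetical, and the actual calls into the dpm layer (amdgpu_dpm_switch_power_profile) and the required locking are omitted.

```c
#include <assert.h>

/*
 * Sketch of one reference count per power profile: dropping the last
 * reference on the 3D profile disables only 3D and leaves a compute
 * profile that still has users untouched.
 */
enum workload_hint { HINT_3D, HINT_VIDEO, HINT_VR, HINT_COMPUTE, HINT_COUNT };

static int profile_refcount[HINT_COUNT];

/* Returns 1 when the profile goes from disabled to enabled. */
static int get_workload_profile(enum workload_hint hint)
{
	return ++profile_refcount[hint] == 1;
}

/* Returns 1 when the last user is gone and the profile is disabled. */
static int put_workload_profile(enum workload_hint hint)
{
	assert(profile_refcount[hint] > 0);
	return --profile_refcount[hint] == 0;
}
```

The transition return values mark the points where the driver would actually have to send an enable/disable request for that profile.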
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-27 15:23 ` Felix Kuehling @ 2022-09-27 15:38 ` Sharma, Shashank 2022-09-27 20:40 ` Alex Deucher 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 15:38 UTC (permalink / raw) To: Felix Kuehling, Christian König, amd-gfx Cc: alexander.deucher, amaranath.somalapuram On 9/27/2022 5:23 PM, Felix Kuehling wrote: > Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: >> Hello Felix, >> >> Thank for the review comments. >> >> On 9/27/2022 4:48 PM, Felix Kuehling wrote: >>> Am 2022-09-27 um 02:12 schrieb Christian König: >>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: >>>>> This patch switches the GPU workload mode to/from >>>>> compute mode, while submitting compute workload. >>>>> >>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>> >>>> Feel free to add my acked-by, but Felix should probably take a look >>>> as well. >>> >>> This look OK purely from a compute perspective. But I'm concerned >>> about the interaction of compute with graphics or multiple graphics >>> contexts submitting work concurrently. They would constantly override >>> or disable each other's workload hints. >>> >>> For example, you have an amdgpu_ctx with >>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD >>> process that also wants the compute profile. Those could be different >>> processes belonging to different users. Say, KFD enables the compute >>> profile first. Then the graphics context submits a job. At the start >>> of the job, the compute profile is enabled. That's a no-op because >>> KFD already enabled the compute profile. When the job finishes, it >>> disables the compute profile for everyone, including KFD. That's >>> unexpected. >>> >> >> In this case, it will not disable the compute profile, as the >> reference counter will not be zero. 
The reset_profile() will only act >> if the reference counter is 0. > > OK, I missed the reference counter. > > >> >> But I would be happy to get any inputs about a policy which can be >> more sustainable and gets better outputs, for example: >> - should we not allow a profile change, if a PP mode is already >> applied and keep it Early bird basis ? >> >> For example: Policy A >> - Job A sets the profile to compute >> - Job B tries to set profile to 3D, but we do not allow it as job A is >> not finished it yet. >> >> Or Policy B: Current one >> - Job A sets the profile to compute >> - Job B tries to set profile to 3D, and we allow it. Job A also runs >> in PP 3D >> - Job B finishes, but does not reset PP as reference count is not zero >> due to compute >> - Job A finishes, profile reset to NONE > > I think this won't work. As I understand it, the > amdgpu_dpm_switch_power_profile enables and disables individual > profiles. Disabling the 3D profile doesn't disable the compute profile > at the same time. I think you'll need one refcount per profile. > > Regards, > Felix Thanks, This is exactly what I was looking for, I think Alex's initial idea was around it, but I was under the assumption that there is only one HW profile in SMU which keeps on getting overwritten. This can solve our problems, as I can create an array of reference counters, and will disable only the profile whose reference counter goes 0. - Shashank > > >> >> >> Or anything else ? >> >> REgards >> Shashank >> >> >>> Or you have multiple VCN contexts. When context1 finishes a job, it >>> disables the VIDEO profile. But context2 still has a job on the other >>> VCN engine and wants the VIDEO profile to still be enabled. >>> >>> Regards, >>> Felix >>> >>> >>>> >>>> Christian. 
>>>> >>>>> --- >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++--- >>>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>>> >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>> index 5e53a5293935..1caed319a448 100644 >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>> @@ -34,6 +34,7 @@ >>>>> #include "amdgpu_ras.h" >>>>> #include "amdgpu_umc.h" >>>>> #include "amdgpu_reset.h" >>>>> +#include "amdgpu_ctx_workload.h" >>>>> /* Total memory size in system memory and all GPU VRAM. Used to >>>>> * estimate worst case amount of memory to reserve for page tables >>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct >>>>> amdgpu_device *adev, >>>>> void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, >>>>> bool idle) >>>>> { >>>>> - amdgpu_dpm_switch_power_profile(adev, >>>>> - PP_SMC_POWER_PROFILE_COMPUTE, >>>>> - !idle); >>>>> + int ret; >>>>> + >>>>> + if (idle) >>>>> + ret = amdgpu_clear_workload_profile(adev, >>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>> + else >>>>> + ret = amdgpu_set_workload_profile(adev, >>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>> + >>>>> + if (ret) >>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to >>>>> compute mode\n", >>>>> + idle ? "reset" : "set"); >>>>> } >>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 >>>>> vmid) >>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-27 15:38 ` Sharma, Shashank @ 2022-09-27 20:40 ` Alex Deucher 2022-09-28 7:05 ` Lazar, Lijo 2022-09-28 8:56 ` Sharma, Shashank 0 siblings, 2 replies; 76+ messages in thread From: Alex Deucher @ 2022-09-27 20:40 UTC (permalink / raw) To: Sharma, Shashank Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank <shashank.sharma@amd.com> wrote: > > > > On 9/27/2022 5:23 PM, Felix Kuehling wrote: > > Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: > >> Hello Felix, > >> > >> Thank for the review comments. > >> > >> On 9/27/2022 4:48 PM, Felix Kuehling wrote: > >>> Am 2022-09-27 um 02:12 schrieb Christian König: > >>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: > >>>>> This patch switches the GPU workload mode to/from > >>>>> compute mode, while submitting compute workload. > >>>>> > >>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > >>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> > >>>> > >>>> Feel free to add my acked-by, but Felix should probably take a look > >>>> as well. > >>> > >>> This look OK purely from a compute perspective. But I'm concerned > >>> about the interaction of compute with graphics or multiple graphics > >>> contexts submitting work concurrently. They would constantly override > >>> or disable each other's workload hints. > >>> > >>> For example, you have an amdgpu_ctx with > >>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD > >>> process that also wants the compute profile. Those could be different > >>> processes belonging to different users. Say, KFD enables the compute > >>> profile first. Then the graphics context submits a job. At the start > >>> of the job, the compute profile is enabled. That's a no-op because > >>> KFD already enabled the compute profile. 
When the job finishes, it > >>> disables the compute profile for everyone, including KFD. That's > >>> unexpected. > >>> > >> > >> In this case, it will not disable the compute profile, as the > >> reference counter will not be zero. The reset_profile() will only act > >> if the reference counter is 0. > > > > OK, I missed the reference counter. > > > > > >> > >> But I would be happy to get any inputs about a policy which can be > >> more sustainable and gets better outputs, for example: > >> - should we not allow a profile change, if a PP mode is already > >> applied and keep it Early bird basis ? > >> > >> For example: Policy A > >> - Job A sets the profile to compute > >> - Job B tries to set profile to 3D, but we do not allow it as job A is > >> not finished it yet. > >> > >> Or Policy B: Current one > >> - Job A sets the profile to compute > >> - Job B tries to set profile to 3D, and we allow it. Job A also runs > >> in PP 3D > >> - Job B finishes, but does not reset PP as reference count is not zero > >> due to compute > >> - Job A finishes, profile reset to NONE > > > > I think this won't work. As I understand it, the > > amdgpu_dpm_switch_power_profile enables and disables individual > > profiles. Disabling the 3D profile doesn't disable the compute profile > > at the same time. I think you'll need one refcount per profile. > > > > Regards, > > Felix > > Thanks, This is exactly what I was looking for, I think Alex's initial > idea was around it, but I was under the assumption that there is only > one HW profile in SMU which keeps on getting overwritten. This can solve > our problems, as I can create an array of reference counters, and will > disable only the profile whose reference counter goes 0. It's been a while since I paged any of this code into my head, but I believe the actual workload message in the SMU is a mask where you can specify multiple workload types at the same time and the SMU will arbitrate between them internally. 
E.g., the most aggressive one will be selected out of the ones specified. I think in the driver we just set one bit at a time using the current interface. It might be better to change the interface and just ref count the hint types and then when we call the set function look at the ref counts for each hint type and set the mask as appropriate. Alex > > - Shashank > > > > > > >> > >> > >> Or anything else ? > >> > >> REgards > >> Shashank > >> > >> > >>> Or you have multiple VCN contexts. When context1 finishes a job, it > >>> disables the VIDEO profile. But context2 still has a job on the other > >>> VCN engine and wants the VIDEO profile to still be enabled. > >>> > >>> Regards, > >>> Felix > >>> > >>> > >>>> > >>>> Christian. > >>>> > >>>>> --- > >>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++--- > >>>>> 1 file changed, 11 insertions(+), 3 deletions(-) > >>>>> > >>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > >>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > >>>>> index 5e53a5293935..1caed319a448 100644 > >>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > >>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > >>>>> @@ -34,6 +34,7 @@ > >>>>> #include "amdgpu_ras.h" > >>>>> #include "amdgpu_umc.h" > >>>>> #include "amdgpu_reset.h" > >>>>> +#include "amdgpu_ctx_workload.h" > >>>>> /* Total memory size in system memory and all GPU VRAM. 
Used to > >>>>> * estimate worst case amount of memory to reserve for page tables > >>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct > >>>>> amdgpu_device *adev, > >>>>> void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, > >>>>> bool idle) > >>>>> { > >>>>> - amdgpu_dpm_switch_power_profile(adev, > >>>>> - PP_SMC_POWER_PROFILE_COMPUTE, > >>>>> - !idle); > >>>>> + int ret; > >>>>> + > >>>>> + if (idle) > >>>>> + ret = amdgpu_clear_workload_profile(adev, > >>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); > >>>>> + else > >>>>> + ret = amdgpu_set_workload_profile(adev, > >>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); > >>>>> + > >>>>> + if (ret) > >>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to > >>>>> compute mode\n", > >>>>> + idle ? "reset" : "set"); > >>>>> } > >>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 > >>>>> vmid) > >>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
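[Editorial note] The mask-based interface Alex describes could be sketched as below: keep a refcount per hint and derive the SMU workload mask from whichever refcounts are non-zero, letting the SMU arbitrate between the set bits (e.g. picking the most aggressive one). The bit positions and names here are illustrative, not the real SMU message layout.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch: the workload mask sent to the SMU has one bit per profile
 * that still has users; the SMU arbitrates between the set bits.
 */
enum { HINT_3D, HINT_VIDEO, HINT_VR, HINT_COMPUTE, HINT_COUNT };

static int hint_refcount[HINT_COUNT];

static uint32_t workload_mask(void)
{
	uint32_t mask = 0;
	int i;

	for (i = 0; i < HINT_COUNT; i++)
		if (hint_refcount[i] > 0)
			mask |= 1u << i;
	return mask;    /* what the set function would send to the SMU */
}
```

With this shape, a job finishing only clears its own bit once its refcount hits zero, so concurrent contexts with different hints never knock each other's profile out.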
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-27 20:40 ` Alex Deucher @ 2022-09-28 7:05 ` Lazar, Lijo 2022-09-28 8:56 ` Sharma, Shashank 1 sibling, 0 replies; 76+ messages in thread From: Lazar, Lijo @ 2022-09-28 7:05 UTC (permalink / raw) To: Alex Deucher, Sharma, Shashank Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/28/2022 2:10 AM, Alex Deucher wrote: > On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank > <shashank.sharma@amd.com> wrote: >> >> >> >> On 9/27/2022 5:23 PM, Felix Kuehling wrote: >>> Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: >>>> Hello Felix, >>>> >>>> Thank for the review comments. >>>> >>>> On 9/27/2022 4:48 PM, Felix Kuehling wrote: >>>>> Am 2022-09-27 um 02:12 schrieb Christian König: >>>>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: >>>>>>> This patch switches the GPU workload mode to/from >>>>>>> compute mode, while submitting compute workload. >>>>>>> >>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>> >>>>>> Feel free to add my acked-by, but Felix should probably take a look >>>>>> as well. >>>>> >>>>> This look OK purely from a compute perspective. But I'm concerned >>>>> about the interaction of compute with graphics or multiple graphics >>>>> contexts submitting work concurrently. They would constantly override >>>>> or disable each other's workload hints. >>>>> >>>>> For example, you have an amdgpu_ctx with >>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD >>>>> process that also wants the compute profile. Those could be different >>>>> processes belonging to different users. Say, KFD enables the compute >>>>> profile first. Then the graphics context submits a job. At the start >>>>> of the job, the compute profile is enabled. That's a no-op because >>>>> KFD already enabled the compute profile. 
When the job finishes, it >>>>> disables the compute profile for everyone, including KFD. That's >>>>> unexpected. >>>>> >>>> >>>> In this case, it will not disable the compute profile, as the >>>> reference counter will not be zero. The reset_profile() will only act >>>> if the reference counter is 0. >>> >>> OK, I missed the reference counter. >>> >>> >>>> >>>> But I would be happy to get any inputs about a policy which can be >>>> more sustainable and gets better outputs, for example: >>>> - should we not allow a profile change, if a PP mode is already >>>> applied and keep it Early bird basis ? >>>> >>>> For example: Policy A >>>> - Job A sets the profile to compute >>>> - Job B tries to set profile to 3D, but we do not allow it as job A is >>>> not finished it yet. >>>> >>>> Or Policy B: Current one >>>> - Job A sets the profile to compute >>>> - Job B tries to set profile to 3D, and we allow it. Job A also runs >>>> in PP 3D >>>> - Job B finishes, but does not reset PP as reference count is not zero >>>> due to compute >>>> - Job A finishes, profile reset to NONE >>> >>> I think this won't work. As I understand it, the >>> amdgpu_dpm_switch_power_profile enables and disables individual >>> profiles. Disabling the 3D profile doesn't disable the compute profile >>> at the same time. I think you'll need one refcount per profile. >>> >>> Regards, >>> Felix >> >> Thanks, This is exactly what I was looking for, I think Alex's initial >> idea was around it, but I was under the assumption that there is only >> one HW profile in SMU which keeps on getting overwritten. This can solve >> our problems, as I can create an array of reference counters, and will >> disable only the profile whose reference counter goes 0. > > It's been a while since I paged any of this code into my head, but I > believe the actual workload message in the SMU is a mask where you can > specify multiple workload types at the same time and the SMU will > arbitrate between them internally. 
E.g., the most aggressive one will > be selected out of the ones specified. I think in the driver we just > set one bit at a time using the current interface. Yes, this is how it works today. Only one profile is set at a time and so setting another one will overwrite the current driver preference. I think the current expectation of usage is from a system settings perspective like Gaming Mode (Full screen 3D) or Cinematic mode (Video) etc. This is also set through sysfs and there is also a Custom mode. It's not used in the sense of a per-job setting. It might be better > to change the interface and just ref count the hint types and then > when we call the set function look at the ref counts for each hint > type and set the mask as appropriate. > This means a pm subsytem level change and the ref counts need to be kept in pm layer to account for changes through sysfs or APIs. Thanks, Lijo > Alex > > >> >> - Shashank >> >>> >>> >>>> >>>> >>>> Or anything else ? >>>> >>>> REgards >>>> Shashank >>>> >>>> >>>>> Or you have multiple VCN contexts. When context1 finishes a job, it >>>>> disables the VIDEO profile. But context2 still has a job on the other >>>>> VCN engine and wants the VIDEO profile to still be enabled. >>>>> >>>>> Regards, >>>>> Felix >>>>> >>>>> >>>>>> >>>>>> Christian. >>>>>> >>>>>>> --- >>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++--- >>>>>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>> index 5e53a5293935..1caed319a448 100644 >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>> @@ -34,6 +34,7 @@ >>>>>>> #include "amdgpu_ras.h" >>>>>>> #include "amdgpu_umc.h" >>>>>>> #include "amdgpu_reset.h" >>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>> /* Total memory size in system memory and all GPU VRAM. 
Used to >>>>>>> * estimate worst case amount of memory to reserve for page tables >>>>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct >>>>>>> amdgpu_device *adev, >>>>>>> void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, >>>>>>> bool idle) >>>>>>> { >>>>>>> - amdgpu_dpm_switch_power_profile(adev, >>>>>>> - PP_SMC_POWER_PROFILE_COMPUTE, >>>>>>> - !idle); >>>>>>> + int ret; >>>>>>> + >>>>>>> + if (idle) >>>>>>> + ret = amdgpu_clear_workload_profile(adev, >>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>> + else >>>>>>> + ret = amdgpu_set_workload_profile(adev, >>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>> + >>>>>>> + if (ret) >>>>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to >>>>>>> compute mode\n", >>>>>>> + idle ? "reset" : "set"); >>>>>>> } >>>>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 >>>>>>> vmid) >>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-27 20:40 ` Alex Deucher 2022-09-28 7:05 ` Lazar, Lijo @ 2022-09-28 8:56 ` Sharma, Shashank 2022-09-28 9:00 ` Sharma, Shashank 2022-09-28 21:51 ` Alex Deucher 1 sibling, 2 replies; 76+ messages in thread From: Sharma, Shashank @ 2022-09-28 8:56 UTC (permalink / raw) To: Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/27/2022 10:40 PM, Alex Deucher wrote: > On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank > <shashank.sharma@amd.com> wrote: >> >> >> >> On 9/27/2022 5:23 PM, Felix Kuehling wrote: >>> Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: >>>> Hello Felix, >>>> >>>> Thank for the review comments. >>>> >>>> On 9/27/2022 4:48 PM, Felix Kuehling wrote: >>>>> Am 2022-09-27 um 02:12 schrieb Christian König: >>>>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: >>>>>>> This patch switches the GPU workload mode to/from >>>>>>> compute mode, while submitting compute workload. >>>>>>> >>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>> >>>>>> Feel free to add my acked-by, but Felix should probably take a look >>>>>> as well. >>>>> >>>>> This look OK purely from a compute perspective. But I'm concerned >>>>> about the interaction of compute with graphics or multiple graphics >>>>> contexts submitting work concurrently. They would constantly override >>>>> or disable each other's workload hints. >>>>> >>>>> For example, you have an amdgpu_ctx with >>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD >>>>> process that also wants the compute profile. Those could be different >>>>> processes belonging to different users. Say, KFD enables the compute >>>>> profile first. Then the graphics context submits a job. At the start >>>>> of the job, the compute profile is enabled. 
That's a no-op because >>>>> KFD already enabled the compute profile. When the job finishes, it >>>>> disables the compute profile for everyone, including KFD. That's >>>>> unexpected. >>>>> >>>> >>>> In this case, it will not disable the compute profile, as the >>>> reference counter will not be zero. The reset_profile() will only act >>>> if the reference counter is 0. >>> >>> OK, I missed the reference counter. >>> >>> >>>> >>>> But I would be happy to get any inputs about a policy which can be >>>> more sustainable and gets better outputs, for example: >>>> - should we not allow a profile change, if a PP mode is already >>>> applied and keep it Early bird basis ? >>>> >>>> For example: Policy A >>>> - Job A sets the profile to compute >>>> - Job B tries to set profile to 3D, but we do not allow it as job A is >>>> not finished it yet. >>>> >>>> Or Policy B: Current one >>>> - Job A sets the profile to compute >>>> - Job B tries to set profile to 3D, and we allow it. Job A also runs >>>> in PP 3D >>>> - Job B finishes, but does not reset PP as reference count is not zero >>>> due to compute >>>> - Job A finishes, profile reset to NONE >>> >>> I think this won't work. As I understand it, the >>> amdgpu_dpm_switch_power_profile enables and disables individual >>> profiles. Disabling the 3D profile doesn't disable the compute profile >>> at the same time. I think you'll need one refcount per profile. >>> >>> Regards, >>> Felix >> >> Thanks, This is exactly what I was looking for, I think Alex's initial >> idea was around it, but I was under the assumption that there is only >> one HW profile in SMU which keeps on getting overwritten. This can solve >> our problems, as I can create an array of reference counters, and will >> disable only the profile whose reference counter goes 0. 
> > It's been a while since I paged any of this code into my head, but I > believe the actual workload message in the SMU is a mask where you can > specify multiple workload types at the same time and the SMU will > arbitrate between them internally. E.g., the most aggressive one will > be selected out of the ones specified. I think in the driver we just > set one bit at a time using the current interface. It might be better > to change the interface and just ref count the hint types and then > when we call the set function look at the ref counts for each hint > type and set the mask as appropriate. > > Alex > Hey Alex, Thanks for your comment; if that is the case, this current patch series works straightforwardly, and no changes would be required. Please let me know if my understanding is correct: Assumption: Order of aggression: 3D > Media > Compute - Job 1: Requests mode compute: PP changed to compute, ref count 1 - Job 2: Requests mode media: PP changed to media, ref count 2 - Job 3: Requests mode 3D: PP changed to 3D, ref count 3 - Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0, PP still 3D - Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0, PP still 3D - Job 2 finishes, downs ref count to 0, PP changed to NONE. In this way, every job will be operating in the power profile of the desired aggression or higher, and this API guarantees execution at least in the desired power profile. - Shashank > >> >> - Shashank >> >>> >>> >>>> >>>> >>>> Or anything else ? >>>> >>>> REgards >>>> Shashank >>>> >>>> >>>>> Or you have multiple VCN contexts. When context1 finishes a job, it >>>>> disables the VIDEO profile. But context2 still has a job on the other >>>>> VCN engine and wants the VIDEO profile to still be enabled. >>>>> >>>>> Regards, >>>>> Felix >>>>> >>>>> >>>>>> >>>>>> Christian.
>>>>>> >>>>>>> --- >>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++--- >>>>>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>> index 5e53a5293935..1caed319a448 100644 >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>> @@ -34,6 +34,7 @@ >>>>>>> #include "amdgpu_ras.h" >>>>>>> #include "amdgpu_umc.h" >>>>>>> #include "amdgpu_reset.h" >>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>> /* Total memory size in system memory and all GPU VRAM. Used to >>>>>>> * estimate worst case amount of memory to reserve for page tables >>>>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct >>>>>>> amdgpu_device *adev, >>>>>>> void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, >>>>>>> bool idle) >>>>>>> { >>>>>>> - amdgpu_dpm_switch_power_profile(adev, >>>>>>> - PP_SMC_POWER_PROFILE_COMPUTE, >>>>>>> - !idle); >>>>>>> + int ret; >>>>>>> + >>>>>>> + if (idle) >>>>>>> + ret = amdgpu_clear_workload_profile(adev, >>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>> + else >>>>>>> + ret = amdgpu_set_workload_profile(adev, >>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>> + >>>>>>> + if (ret) >>>>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to >>>>>>> compute mode\n", >>>>>>> + idle ? "reset" : "set"); >>>>>>> } >>>>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 >>>>>>> vmid) >>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
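Felix's suggestion of one refcount per profile, combined with the aggression ordering from Shashank's walkthrough, can be sketched as follows. The hint names and the ordering Compute < Media < 3D are assumptions taken from the example in the mail above, not from the driver:

```c
#include <assert.h>

/* Hypothetical hint indices, ordered so a higher value means a more
 * aggressive profile (assumption for illustration only). */
enum hint { HINT_NONE = 0, HINT_COMPUTE, HINT_MEDIA, HINT_3D, HINT_COUNT };

static int refcount[HINT_COUNT];   /* one refcount per profile */

/* Effective profile = most aggressive hint with at least one user. */
static enum hint effective_profile(void)
{
	int h;

	for (h = HINT_COUNT - 1; h > HINT_NONE; h--)
		if (refcount[h] > 0)
			return (enum hint)h;
	return HINT_NONE;
}

static enum hint begin_job(enum hint h)
{
	refcount[h]++;
	return effective_profile();
}

static enum hint end_job(enum hint h)
{
	if (refcount[h] > 0)
		refcount[h]--;
	return effective_profile();
}
```

Unlike a single shared counter, this drops the profile to the next most aggressive active hint when the top one's last user finishes, which is the behaviour Felix's per-profile refcount gives.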
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-28 8:56 ` Sharma, Shashank @ 2022-09-28 9:00 ` Sharma, Shashank 2022-09-28 21:51 ` Alex Deucher 1 sibling, 0 replies; 76+ messages in thread From: Sharma, Shashank @ 2022-09-28 9:00 UTC (permalink / raw) To: Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx Small correction, On 9/28/2022 10:56 AM, Sharma, Shashank wrote: > > > On 9/27/2022 10:40 PM, Alex Deucher wrote: >> On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank >> <shashank.sharma@amd.com> wrote: >>> >>> >>> >>> On 9/27/2022 5:23 PM, Felix Kuehling wrote: >>>> Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: >>>>> Hello Felix, >>>>> >>>>> Thank for the review comments. >>>>> >>>>> On 9/27/2022 4:48 PM, Felix Kuehling wrote: >>>>>> Am 2022-09-27 um 02:12 schrieb Christian König: >>>>>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: >>>>>>>> This patch switches the GPU workload mode to/from >>>>>>>> compute mode, while submitting compute workload. >>>>>>>> >>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>> >>>>>>> Feel free to add my acked-by, but Felix should probably take a look >>>>>>> as well. >>>>>> >>>>>> This look OK purely from a compute perspective. But I'm concerned >>>>>> about the interaction of compute with graphics or multiple graphics >>>>>> contexts submitting work concurrently. They would constantly override >>>>>> or disable each other's workload hints. >>>>>> >>>>>> For example, you have an amdgpu_ctx with >>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD >>>>>> process that also wants the compute profile. Those could be different >>>>>> processes belonging to different users. Say, KFD enables the compute >>>>>> profile first. Then the graphics context submits a job. At the start >>>>>> of the job, the compute profile is enabled. 
That's a no-op because >>>>>> KFD already enabled the compute profile. When the job finishes, it >>>>>> disables the compute profile for everyone, including KFD. That's >>>>>> unexpected. >>>>>> >>>>> >>>>> In this case, it will not disable the compute profile, as the >>>>> reference counter will not be zero. The reset_profile() will only act >>>>> if the reference counter is 0. >>>> >>>> OK, I missed the reference counter. >>>> >>>> >>>>> >>>>> But I would be happy to get any inputs about a policy which can be >>>>> more sustainable and gets better outputs, for example: >>>>> - should we not allow a profile change, if a PP mode is already >>>>> applied and keep it Early bird basis ? >>>>> >>>>> For example: Policy A >>>>> - Job A sets the profile to compute >>>>> - Job B tries to set profile to 3D, but we do not allow it as job A is >>>>> not finished it yet. >>>>> >>>>> Or Policy B: Current one >>>>> - Job A sets the profile to compute >>>>> - Job B tries to set profile to 3D, and we allow it. Job A also runs >>>>> in PP 3D >>>>> - Job B finishes, but does not reset PP as reference count is not zero >>>>> due to compute >>>>> - Job A finishes, profile reset to NONE >>>> >>>> I think this won't work. As I understand it, the >>>> amdgpu_dpm_switch_power_profile enables and disables individual >>>> profiles. Disabling the 3D profile doesn't disable the compute profile >>>> at the same time. I think you'll need one refcount per profile. >>>> >>>> Regards, >>>> Felix >>> >>> Thanks, This is exactly what I was looking for, I think Alex's initial >>> idea was around it, but I was under the assumption that there is only >>> one HW profile in SMU which keeps on getting overwritten. This can solve >>> our problems, as I can create an array of reference counters, and will >>> disable only the profile whose reference counter goes 0. 
>> >> It's been a while since I paged any of this code into my head, but I >> believe the actual workload message in the SMU is a mask where you can >> specify multiple workload types at the same time and the SMU will >> arbitrate between them internally. E.g., the most aggressive one will >> be selected out of the ones specified. I think in the driver we just >> set one bit at a time using the current interface. It might be better >> to change the interface and just ref count the hint types and then >> when we call the set function look at the ref counts for each hint >> type and set the mask as appropriate. >> >> Alex >> > > Hey Alex, > Thanks for your comment, if that is the case, this current patch series > works straight forward, and no changes would be required. Only one change required would be to append the new power profile request in the existing power profile mask, instead of overwriting it. This is where the current state machine pm.workload_mode would be useful. - Shashank Please let me > know if my understanding is correct: > > Assumption: Order of aggression: 3D > Media > Compute > > - Job 1: Requests mode compute: PP changed to compute, ref count 1 > - Job 2: Requests mode media: PP changed to media, ref count 2 > - Job 3: requests mode 3D: PP changed to 3D, ref count 3 > - Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0, > PP still 3D > - Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0, > PP still 3D > - Job 2 finishes, downs ref count to 0, PP changed to NONE, > > In this way, every job will be operating in the Power profile of desired > aggression or higher, and this API guarantees the execution at-least in > the desired power profile. > > - Shashank > >> >>> >>> - Shashank >>> >>>> >>>> >>>>> >>>>> >>>>> Or anything else ? >>>>> >>>>> REgards >>>>> Shashank >>>>> >>>>> >>>>>> Or you have multiple VCN contexts. When context1 finishes a job, it >>>>>> disables the VIDEO profile. 
But context2 still has a job on the other >>>>>> VCN engine and wants the VIDEO profile to still be enabled. >>>>>> >>>>>> Regards, >>>>>> Felix >>>>>> >>>>>> >>>>>>> >>>>>>> Christian. >>>>>>> >>>>>>>> --- >>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++--- >>>>>>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>> index 5e53a5293935..1caed319a448 100644 >>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>> @@ -34,6 +34,7 @@ >>>>>>>> #include "amdgpu_ras.h" >>>>>>>> #include "amdgpu_umc.h" >>>>>>>> #include "amdgpu_reset.h" >>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>> /* Total memory size in system memory and all GPU VRAM. >>>>>>>> Used to >>>>>>>> * estimate worst case amount of memory to reserve for page >>>>>>>> tables >>>>>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct >>>>>>>> amdgpu_device *adev, >>>>>>>> void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device >>>>>>>> *adev, >>>>>>>> bool idle) >>>>>>>> { >>>>>>>> - amdgpu_dpm_switch_power_profile(adev, >>>>>>>> - PP_SMC_POWER_PROFILE_COMPUTE, >>>>>>>> - !idle); >>>>>>>> + int ret; >>>>>>>> + >>>>>>>> + if (idle) >>>>>>>> + ret = amdgpu_clear_workload_profile(adev, >>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>> + else >>>>>>>> + ret = amdgpu_set_workload_profile(adev, >>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>> + >>>>>>>> + if (ret) >>>>>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to >>>>>>>> compute mode\n", >>>>>>>> + idle ? "reset" : "set"); >>>>>>>> } >>>>>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 >>>>>>>> vmid) >>>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
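The one change Shashank mentions — appending a new request to the existing profile mask instead of overwriting it — might look roughly like this. Bit positions and helper names are illustrative, not the driver's actual interface:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-hint reference counters; the mask handed to the SMU
 * has one bit set per hint that currently has at least one user. */
#define N_HINTS 7

static int hint_refcount[N_HINTS];

static uint32_t workload_mask(void)
{
	uint32_t mask = 0;
	int i;

	for (i = 0; i < N_HINTS; i++)
		if (hint_refcount[i] > 0)
			mask |= 1u << i;
	return mask;
}

/* A request appends its bit to the mask; a release drops the bit only
 * when that hint's refcount reaches zero. */
static uint32_t request_hint(int h)
{
	hint_refcount[h]++;
	return workload_mask();
}

static uint32_t release_hint(int h)
{
	if (hint_refcount[h] > 0)
		hint_refcount[h]--;
	return workload_mask();
}
```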
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-28 8:56 ` Sharma, Shashank 2022-09-28 9:00 ` Sharma, Shashank @ 2022-09-28 21:51 ` Alex Deucher 2022-09-29 8:48 ` Sharma, Shashank 1 sibling, 1 reply; 76+ messages in thread From: Alex Deucher @ 2022-09-28 21:51 UTC (permalink / raw) To: Sharma, Shashank Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank <shashank.sharma@amd.com> wrote: > > > > On 9/27/2022 10:40 PM, Alex Deucher wrote: > > On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank > > <shashank.sharma@amd.com> wrote: > >> > >> > >> > >> On 9/27/2022 5:23 PM, Felix Kuehling wrote: > >>> Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: > >>>> Hello Felix, > >>>> > >>>> Thank for the review comments. > >>>> > >>>> On 9/27/2022 4:48 PM, Felix Kuehling wrote: > >>>>> Am 2022-09-27 um 02:12 schrieb Christian König: > >>>>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: > >>>>>>> This patch switches the GPU workload mode to/from > >>>>>>> compute mode, while submitting compute workload. > >>>>>>> > >>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> > >>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> > >>>>>> > >>>>>> Feel free to add my acked-by, but Felix should probably take a look > >>>>>> as well. > >>>>> > >>>>> This look OK purely from a compute perspective. But I'm concerned > >>>>> about the interaction of compute with graphics or multiple graphics > >>>>> contexts submitting work concurrently. They would constantly override > >>>>> or disable each other's workload hints. > >>>>> > >>>>> For example, you have an amdgpu_ctx with > >>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD > >>>>> process that also wants the compute profile. Those could be different > >>>>> processes belonging to different users. Say, KFD enables the compute > >>>>> profile first. Then the graphics context submits a job. 
At the start > >>>>> of the job, the compute profile is enabled. That's a no-op because > >>>>> KFD already enabled the compute profile. When the job finishes, it > >>>>> disables the compute profile for everyone, including KFD. That's > >>>>> unexpected. > >>>>> > >>>> > >>>> In this case, it will not disable the compute profile, as the > >>>> reference counter will not be zero. The reset_profile() will only act > >>>> if the reference counter is 0. > >>> > >>> OK, I missed the reference counter. > >>> > >>> > >>>> > >>>> But I would be happy to get any inputs about a policy which can be > >>>> more sustainable and gets better outputs, for example: > >>>> - should we not allow a profile change, if a PP mode is already > >>>> applied and keep it Early bird basis ? > >>>> > >>>> For example: Policy A > >>>> - Job A sets the profile to compute > >>>> - Job B tries to set profile to 3D, but we do not allow it as job A is > >>>> not finished it yet. > >>>> > >>>> Or Policy B: Current one > >>>> - Job A sets the profile to compute > >>>> - Job B tries to set profile to 3D, and we allow it. Job A also runs > >>>> in PP 3D > >>>> - Job B finishes, but does not reset PP as reference count is not zero > >>>> due to compute > >>>> - Job A finishes, profile reset to NONE > >>> > >>> I think this won't work. As I understand it, the > >>> amdgpu_dpm_switch_power_profile enables and disables individual > >>> profiles. Disabling the 3D profile doesn't disable the compute profile > >>> at the same time. I think you'll need one refcount per profile. > >>> > >>> Regards, > >>> Felix > >> > >> Thanks, This is exactly what I was looking for, I think Alex's initial > >> idea was around it, but I was under the assumption that there is only > >> one HW profile in SMU which keeps on getting overwritten. This can solve > >> our problems, as I can create an array of reference counters, and will > >> disable only the profile whose reference counter goes 0. 
> > > > It's been a while since I paged any of this code into my head, but I > > believe the actual workload message in the SMU is a mask where you can > > specify multiple workload types at the same time and the SMU will > > arbitrate between them internally. E.g., the most aggressive one will > > be selected out of the ones specified. I think in the driver we just > > set one bit at a time using the current interface. It might be better > > to change the interface and just ref count the hint types and then > > when we call the set function look at the ref counts for each hint > > type and set the mask as appropriate. > > > > Alex > > > > Hey Alex, > Thanks for your comment, if that is the case, this current patch series > works straight forward, and no changes would be required. Please let me > know if my understanding is correct: > > Assumption: Order of aggression: 3D > Media > Compute > > - Job 1: Requests mode compute: PP changed to compute, ref count 1 > - Job 2: Requests mode media: PP changed to media, ref count 2 > - Job 3: requests mode 3D: PP changed to 3D, ref count 3 > - Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0, > PP still 3D > - Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0, > PP still 3D > - Job 2 finishes, downs ref count to 0, PP changed to NONE, > > In this way, every job will be operating in the Power profile of desired > aggression or higher, and this API guarantees the execution at-least in > the desired power profile. I'm not entirely sure on the relative levels of aggression, but I believe the SMU priorities them by index. E.g. #define WORKLOAD_PPLIB_DEFAULT_BIT 0 #define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1 #define WORKLOAD_PPLIB_POWER_SAVING_BIT 2 #define WORKLOAD_PPLIB_VIDEO_BIT 3 #define WORKLOAD_PPLIB_VR_BIT 4 #define WORKLOAD_PPLIB_COMPUTE_BIT 5 #define WORKLOAD_PPLIB_CUSTOM_BIT 6 3D < video < VR < compute < custom VR and compute are the most aggressive. 
Custom takes preference because it's user customizable. Alex > > - Shashank > > > > >> > >> - Shashank > >> > >>> > >>> > >>>> > >>>> > >>>> Or anything else ? > >>>> > >>>> REgards > >>>> Shashank > >>>> > >>>> > >>>>> Or you have multiple VCN contexts. When context1 finishes a job, it > >>>>> disables the VIDEO profile. But context2 still has a job on the other > >>>>> VCN engine and wants the VIDEO profile to still be enabled. > >>>>> > >>>>> Regards, > >>>>> Felix > >>>>> > >>>>> > >>>>>> > >>>>>> Christian. > >>>>>> > >>>>>>> --- > >>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++--- > >>>>>>> 1 file changed, 11 insertions(+), 3 deletions(-) > >>>>>>> > >>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > >>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > >>>>>>> index 5e53a5293935..1caed319a448 100644 > >>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > >>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c > >>>>>>> @@ -34,6 +34,7 @@ > >>>>>>> #include "amdgpu_ras.h" > >>>>>>> #include "amdgpu_umc.h" > >>>>>>> #include "amdgpu_reset.h" > >>>>>>> +#include "amdgpu_ctx_workload.h" > >>>>>>> /* Total memory size in system memory and all GPU VRAM. 
Used to > >>>>>>> * estimate worst case amount of memory to reserve for page tables > >>>>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct > >>>>>>> amdgpu_device *adev, > >>>>>>> void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, > >>>>>>> bool idle) > >>>>>>> { > >>>>>>> - amdgpu_dpm_switch_power_profile(adev, > >>>>>>> - PP_SMC_POWER_PROFILE_COMPUTE, > >>>>>>> - !idle); > >>>>>>> + int ret; > >>>>>>> + > >>>>>>> + if (idle) > >>>>>>> + ret = amdgpu_clear_workload_profile(adev, > >>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); > >>>>>>> + else > >>>>>>> + ret = amdgpu_set_workload_profile(adev, > >>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); > >>>>>>> + > >>>>>>> + if (ret) > >>>>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to > >>>>>>> compute mode\n", > >>>>>>> + idle ? "reset" : "set"); > >>>>>>> } > >>>>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 > >>>>>>> vmid) > >>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
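If the SMU really arbitrates by bit index as Alex suggests, its selection from a mask reduces to taking the highest set bit. This is a guess at the firmware policy for illustration, not something verified against PMFW sources; the bit positions are the ones quoted in Alex's mail:

```c
#include <assert.h>
#include <stdint.h>

#define WORKLOAD_PPLIB_DEFAULT_BIT        0
#define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1
#define WORKLOAD_PPLIB_POWER_SAVING_BIT   2
#define WORKLOAD_PPLIB_VIDEO_BIT          3
#define WORKLOAD_PPLIB_VR_BIT             4
#define WORKLOAD_PPLIB_COMPUTE_BIT        5
#define WORKLOAD_PPLIB_CUSTOM_BIT         6

/* Assumed arbitration: the winning profile for a given mask is its
 * highest set bit, so 3D < video < VR < compute < custom. */
static int winning_profile(uint32_t mask)
{
	int win = WORKLOAD_PPLIB_DEFAULT_BIT;
	int bit;

	for (bit = 0; bit <= WORKLOAD_PPLIB_CUSTOM_BIT; bit++)
		if (mask & (1u << bit))
			win = bit;
	return win;
}
```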
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-28 21:51 ` Alex Deucher @ 2022-09-29 8:48 ` Sharma, Shashank 2022-09-29 11:10 ` Lazar, Lijo 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-29 8:48 UTC (permalink / raw) To: Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/28/2022 11:51 PM, Alex Deucher wrote: > On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank > <shashank.sharma@amd.com> wrote: >> >> >> >> On 9/27/2022 10:40 PM, Alex Deucher wrote: >>> On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank >>> <shashank.sharma@amd.com> wrote: >>>> >>>> >>>> >>>> On 9/27/2022 5:23 PM, Felix Kuehling wrote: >>>>> Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: >>>>>> Hello Felix, >>>>>> >>>>>> Thank for the review comments. >>>>>> >>>>>> On 9/27/2022 4:48 PM, Felix Kuehling wrote: >>>>>>> Am 2022-09-27 um 02:12 schrieb Christian König: >>>>>>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: >>>>>>>>> This patch switches the GPU workload mode to/from >>>>>>>>> compute mode, while submitting compute workload. >>>>>>>>> >>>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>>> >>>>>>>> Feel free to add my acked-by, but Felix should probably take a look >>>>>>>> as well. >>>>>>> >>>>>>> This look OK purely from a compute perspective. But I'm concerned >>>>>>> about the interaction of compute with graphics or multiple graphics >>>>>>> contexts submitting work concurrently. They would constantly override >>>>>>> or disable each other's workload hints. >>>>>>> >>>>>>> For example, you have an amdgpu_ctx with >>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD >>>>>>> process that also wants the compute profile. Those could be different >>>>>>> processes belonging to different users. Say, KFD enables the compute >>>>>>> profile first. 
Then the graphics context submits a job. At the start >>>>>>> of the job, the compute profile is enabled. That's a no-op because >>>>>>> KFD already enabled the compute profile. When the job finishes, it >>>>>>> disables the compute profile for everyone, including KFD. That's >>>>>>> unexpected. >>>>>>> >>>>>> >>>>>> In this case, it will not disable the compute profile, as the >>>>>> reference counter will not be zero. The reset_profile() will only act >>>>>> if the reference counter is 0. >>>>> >>>>> OK, I missed the reference counter. >>>>> >>>>> >>>>>> >>>>>> But I would be happy to get any inputs about a policy which can be >>>>>> more sustainable and gets better outputs, for example: >>>>>> - should we not allow a profile change, if a PP mode is already >>>>>> applied and keep it Early bird basis ? >>>>>> >>>>>> For example: Policy A >>>>>> - Job A sets the profile to compute >>>>>> - Job B tries to set profile to 3D, but we do not allow it as job A is >>>>>> not finished it yet. >>>>>> >>>>>> Or Policy B: Current one >>>>>> - Job A sets the profile to compute >>>>>> - Job B tries to set profile to 3D, and we allow it. Job A also runs >>>>>> in PP 3D >>>>>> - Job B finishes, but does not reset PP as reference count is not zero >>>>>> due to compute >>>>>> - Job A finishes, profile reset to NONE >>>>> >>>>> I think this won't work. As I understand it, the >>>>> amdgpu_dpm_switch_power_profile enables and disables individual >>>>> profiles. Disabling the 3D profile doesn't disable the compute profile >>>>> at the same time. I think you'll need one refcount per profile. >>>>> >>>>> Regards, >>>>> Felix >>>> >>>> Thanks, This is exactly what I was looking for, I think Alex's initial >>>> idea was around it, but I was under the assumption that there is only >>>> one HW profile in SMU which keeps on getting overwritten. 
This can solve >>>> our problems, as I can create an array of reference counters, and will >>>> disable only the profile whose reference counter goes 0. >>> >>> It's been a while since I paged any of this code into my head, but I >>> believe the actual workload message in the SMU is a mask where you can >>> specify multiple workload types at the same time and the SMU will >>> arbitrate between them internally. E.g., the most aggressive one will >>> be selected out of the ones specified. I think in the driver we just >>> set one bit at a time using the current interface. It might be better >>> to change the interface and just ref count the hint types and then >>> when we call the set function look at the ref counts for each hint >>> type and set the mask as appropriate. >>> >>> Alex >>> >> >> Hey Alex, >> Thanks for your comment, if that is the case, this current patch series >> works straight forward, and no changes would be required. Please let me >> know if my understanding is correct: >> >> Assumption: Order of aggression: 3D > Media > Compute >> >> - Job 1: Requests mode compute: PP changed to compute, ref count 1 >> - Job 2: Requests mode media: PP changed to media, ref count 2 >> - Job 3: requests mode 3D: PP changed to 3D, ref count 3 >> - Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0, >> PP still 3D >> - Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0, >> PP still 3D >> - Job 2 finishes, downs ref count to 0, PP changed to NONE, >> >> In this way, every job will be operating in the Power profile of desired >> aggression or higher, and this API guarantees the execution at-least in >> the desired power profile. > > I'm not entirely sure on the relative levels of aggression, but I > believe the SMU priorities them by index. E.g. 
> #define WORKLOAD_PPLIB_DEFAULT_BIT 0 > #define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1 > #define WORKLOAD_PPLIB_POWER_SAVING_BIT 2 > #define WORKLOAD_PPLIB_VIDEO_BIT 3 > #define WORKLOAD_PPLIB_VR_BIT 4 > #define WORKLOAD_PPLIB_COMPUTE_BIT 5 > #define WORKLOAD_PPLIB_CUSTOM_BIT 6 > > 3D < video < VR < compute < custom > > VR and compute are the most aggressive. Custom takes preference > because it's user customizable. > > Alex > Thanks, so this UAPI will guarantee the execution of the job in atleast the requested power profile, or a more aggressive one. I will do the one change required and send the updated one. - Shashank > > > >> >> - Shashank >> >>> >>>> >>>> - Shashank >>>> >>>>> >>>>> >>>>>> >>>>>> >>>>>> Or anything else ? >>>>>> >>>>>> REgards >>>>>> Shashank >>>>>> >>>>>> >>>>>>> Or you have multiple VCN contexts. When context1 finishes a job, it >>>>>>> disables the VIDEO profile. But context2 still has a job on the other >>>>>>> VCN engine and wants the VIDEO profile to still be enabled. >>>>>>> >>>>>>> Regards, >>>>>>> Felix >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Christian. >>>>>>>> >>>>>>>>> --- >>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 +++++++++++--- >>>>>>>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>>>>>>> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>> index 5e53a5293935..1caed319a448 100644 >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>> @@ -34,6 +34,7 @@ >>>>>>>>> #include "amdgpu_ras.h" >>>>>>>>> #include "amdgpu_umc.h" >>>>>>>>> #include "amdgpu_reset.h" >>>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>>> /* Total memory size in system memory and all GPU VRAM. 
Used to >>>>>>>>> * estimate worst case amount of memory to reserve for page tables >>>>>>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct >>>>>>>>> amdgpu_device *adev, >>>>>>>>> void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device *adev, >>>>>>>>> bool idle) >>>>>>>>> { >>>>>>>>> - amdgpu_dpm_switch_power_profile(adev, >>>>>>>>> - PP_SMC_POWER_PROFILE_COMPUTE, >>>>>>>>> - !idle); >>>>>>>>> + int ret; >>>>>>>>> + >>>>>>>>> + if (idle) >>>>>>>>> + ret = amdgpu_clear_workload_profile(adev, >>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>> + else >>>>>>>>> + ret = amdgpu_set_workload_profile(adev, >>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>> + >>>>>>>>> + if (ret) >>>>>>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to >>>>>>>>> compute mode\n", >>>>>>>>> + idle ? "reset" : "set"); >>>>>>>>> } >>>>>>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device *adev, u32 >>>>>>>>> vmid) >>>>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-29 8:48 ` Sharma, Shashank @ 2022-09-29 11:10 ` Lazar, Lijo 2022-09-29 13:20 ` Sharma, Shashank 2022-09-29 18:07 ` Felix Kuehling 0 siblings, 2 replies; 76+ messages in thread From: Lazar, Lijo @ 2022-09-29 11:10 UTC (permalink / raw) To: Sharma, Shashank, Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/29/2022 2:18 PM, Sharma, Shashank wrote: > > > On 9/28/2022 11:51 PM, Alex Deucher wrote: >> On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank >> <shashank.sharma@amd.com> wrote: >>> >>> >>> >>> On 9/27/2022 10:40 PM, Alex Deucher wrote: >>>> On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank >>>> <shashank.sharma@amd.com> wrote: >>>>> >>>>> >>>>> >>>>> On 9/27/2022 5:23 PM, Felix Kuehling wrote: >>>>>> Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: >>>>>>> Hello Felix, >>>>>>> >>>>>>> Thank for the review comments. >>>>>>> >>>>>>> On 9/27/2022 4:48 PM, Felix Kuehling wrote: >>>>>>>> Am 2022-09-27 um 02:12 schrieb Christian König: >>>>>>>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: >>>>>>>>>> This patch switches the GPU workload mode to/from >>>>>>>>>> compute mode, while submitting compute workload. >>>>>>>>>> >>>>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>>>> >>>>>>>>> Feel free to add my acked-by, but Felix should probably take a >>>>>>>>> look >>>>>>>>> as well. >>>>>>>> >>>>>>>> This look OK purely from a compute perspective. But I'm concerned >>>>>>>> about the interaction of compute with graphics or multiple graphics >>>>>>>> contexts submitting work concurrently. They would constantly >>>>>>>> override >>>>>>>> or disable each other's workload hints. 
>>>>>>>> >>>>>>>> For example, you have an amdgpu_ctx with >>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD >>>>>>>> process that also wants the compute profile. Those could be >>>>>>>> different >>>>>>>> processes belonging to different users. Say, KFD enables the >>>>>>>> compute >>>>>>>> profile first. Then the graphics context submits a job. At the >>>>>>>> start >>>>>>>> of the job, the compute profile is enabled. That's a no-op because >>>>>>>> KFD already enabled the compute profile. When the job finishes, it >>>>>>>> disables the compute profile for everyone, including KFD. That's >>>>>>>> unexpected. >>>>>>>> >>>>>>> >>>>>>> In this case, it will not disable the compute profile, as the >>>>>>> reference counter will not be zero. The reset_profile() will only >>>>>>> act >>>>>>> if the reference counter is 0. >>>>>> >>>>>> OK, I missed the reference counter. >>>>>> >>>>>> >>>>>>> >>>>>>> But I would be happy to get any inputs about a policy which can be >>>>>>> more sustainable and gets better outputs, for example: >>>>>>> - should we not allow a profile change, if a PP mode is already >>>>>>> applied and keep it Early bird basis ? >>>>>>> >>>>>>> For example: Policy A >>>>>>> - Job A sets the profile to compute >>>>>>> - Job B tries to set profile to 3D, but we do not allow it as job >>>>>>> A is >>>>>>> not finished it yet. >>>>>>> >>>>>>> Or Policy B: Current one >>>>>>> - Job A sets the profile to compute >>>>>>> - Job B tries to set profile to 3D, and we allow it. Job A also runs >>>>>>> in PP 3D >>>>>>> - Job B finishes, but does not reset PP as reference count is not >>>>>>> zero >>>>>>> due to compute >>>>>>> - Job A finishes, profile reset to NONE >>>>>> >>>>>> I think this won't work. As I understand it, the >>>>>> amdgpu_dpm_switch_power_profile enables and disables individual >>>>>> profiles. Disabling the 3D profile doesn't disable the compute >>>>>> profile >>>>>> at the same time. 
I think you'll need one refcount per profile. >>>>>> >>>>>> Regards, >>>>>> Felix >>>>> >>>>> Thanks, This is exactly what I was looking for, I think Alex's initial >>>>> idea was around it, but I was under the assumption that there is only >>>>> one HW profile in SMU which keeps on getting overwritten. This can >>>>> solve >>>>> our problems, as I can create an array of reference counters, and will >>>>> disable only the profile whose reference counter goes 0. >>>> >>>> It's been a while since I paged any of this code into my head, but I >>>> believe the actual workload message in the SMU is a mask where you can >>>> specify multiple workload types at the same time and the SMU will >>>> arbitrate between them internally. E.g., the most aggressive one will >>>> be selected out of the ones specified. I think in the driver we just >>>> set one bit at a time using the current interface. It might be better >>>> to change the interface and just ref count the hint types and then >>>> when we call the set function look at the ref counts for each hint >>>> type and set the mask as appropriate. >>>> >>>> Alex >>>> >>> >>> Hey Alex, >>> Thanks for your comment, if that is the case, this current patch series >>> works straight forward, and no changes would be required. 
Please let me >>> know if my understanding is correct: >>> >>> Assumption: Order of aggression: 3D > Media > Compute >>> >>> - Job 1: Requests mode compute: PP changed to compute, ref count 1 >>> - Job 2: Requests mode media: PP changed to media, ref count 2 >>> - Job 3: requests mode 3D: PP changed to 3D, ref count 3 >>> - Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref > 0, >>> PP still 3D >>> - Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref > 0, >>> PP still 3D >>> - Job 2 finishes, downs ref count to 0, PP changed to NONE, >>> >>> In this way, every job will be operating in the Power profile of desired >>> aggression or higher, and this API guarantees the execution at-least in >>> the desired power profile. >> >> I'm not entirely sure on the relative levels of aggression, but I >> believe the SMU priorities them by index. E.g. >> #define WORKLOAD_PPLIB_DEFAULT_BIT 0 >> #define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1 >> #define WORKLOAD_PPLIB_POWER_SAVING_BIT 2 >> #define WORKLOAD_PPLIB_VIDEO_BIT 3 >> #define WORKLOAD_PPLIB_VR_BIT 4 >> #define WORKLOAD_PPLIB_COMPUTE_BIT 5 >> #define WORKLOAD_PPLIB_CUSTOM_BIT 6 >> >> 3D < video < VR < compute < custom >> >> VR and compute are the most aggressive. Custom takes preference >> because it's user customizable. >> >> Alex >> > > Thanks, so this UAPI will guarantee the execution of the job in atleast > the requested power profile, or a more aggressive one. > Hi Shashank, This is not how the API works in the driver PM subsystem. In the final interface with PMFW, driver sets only one profile bit and doesn't set any mask. So it doesn't work the way as Felix explained. If there is more than one profile bit set, PMFW looks at the mask and picks the one with the highest priority. Note that for each update of workload mask, PMFW should get a message. Driver currently sets only bit as Alex explained earlier. 
For our current driver implementation, you can check this as an example - https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c#L1753 Also, the PM layer already stores the current workload profile for a *get* API (which also means a new PM workload variable is not needed). But that API works only as long as the driver sets a single profile bit; that way the driver is sure of the current profile mode - https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c#L1628 When there is more than one, the driver is not sure of the internal priority of the PMFW, though we can follow the bit order which Alex suggested (but sometimes the FW carries some workarounds inside, which means it doesn't necessarily follow the same order). There is an existing sysfs interface which allows changing the profile mode and adding custom settings. In summary, any handling of the change from a single bit to a mask needs to be done at the lower layer. The problem is that this behavior has been there throughout all legacy ASICs; not sure how much effort it takes and what all needs to be modified. Thanks, Lijo > I will do the one change required and send the updated one. > > - Shashank > >> >> >> >>> >>> - Shashank >>> >>>> >>>>> >>>>> - Shashank >>>>> >>>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>> Or anything else ? >>>>>>> >>>>>>> REgards >>>>>>> Shashank >>>>>>> >>>>>>> >>>>>>>> Or you have multiple VCN contexts. When context1 finishes a job, it >>>>>>>> disables the VIDEO profile. But context2 still has a job on the >>>>>>>> other >>>>>>>> VCN engine and wants the VIDEO profile to still be enabled. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Felix >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> Christian.
>>>>>>>>> >>>>>>>>>> --- >>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 >>>>>>>>>> +++++++++++--- >>>>>>>>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>>>>>>>> >>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>> index 5e53a5293935..1caed319a448 100644 >>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>> @@ -34,6 +34,7 @@ >>>>>>>>>> #include "amdgpu_ras.h" >>>>>>>>>> #include "amdgpu_umc.h" >>>>>>>>>> #include "amdgpu_reset.h" >>>>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>>>> /* Total memory size in system memory and all GPU VRAM. >>>>>>>>>> Used to >>>>>>>>>> * estimate worst case amount of memory to reserve for >>>>>>>>>> page tables >>>>>>>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct >>>>>>>>>> amdgpu_device *adev, >>>>>>>>>> void amdgpu_amdkfd_set_compute_idle(struct amdgpu_device >>>>>>>>>> *adev, >>>>>>>>>> bool idle) >>>>>>>>>> { >>>>>>>>>> - amdgpu_dpm_switch_power_profile(adev, >>>>>>>>>> - PP_SMC_POWER_PROFILE_COMPUTE, >>>>>>>>>> - !idle); >>>>>>>>>> + int ret; >>>>>>>>>> + >>>>>>>>>> + if (idle) >>>>>>>>>> + ret = amdgpu_clear_workload_profile(adev, >>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>>> + else >>>>>>>>>> + ret = amdgpu_set_workload_profile(adev, >>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>>> + >>>>>>>>>> + if (ret) >>>>>>>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to >>>>>>>>>> compute mode\n", >>>>>>>>>> + idle ? "reset" : "set"); >>>>>>>>>> } >>>>>>>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device >>>>>>>>>> *adev, u32 >>>>>>>>>> vmid) >>>>>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-29 11:10 ` Lazar, Lijo @ 2022-09-29 13:20 ` Sharma, Shashank 2022-09-29 13:37 ` Lazar, Lijo 2022-09-29 18:07 ` Felix Kuehling 1 sibling, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-29 13:20 UTC (permalink / raw) To: Lazar, Lijo, Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/29/2022 1:10 PM, Lazar, Lijo wrote: > > > On 9/29/2022 2:18 PM, Sharma, Shashank wrote: >> >> >> On 9/28/2022 11:51 PM, Alex Deucher wrote: >>> On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank >>> <shashank.sharma@amd.com> wrote: >>>> >>>> >>>> >>>> On 9/27/2022 10:40 PM, Alex Deucher wrote: >>>>> On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank >>>>> <shashank.sharma@amd.com> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 9/27/2022 5:23 PM, Felix Kuehling wrote: >>>>>>> Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: >>>>>>>> Hello Felix, >>>>>>>> >>>>>>>> Thank for the review comments. >>>>>>>> >>>>>>>> On 9/27/2022 4:48 PM, Felix Kuehling wrote: >>>>>>>>> Am 2022-09-27 um 02:12 schrieb Christian König: >>>>>>>>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: >>>>>>>>>>> This patch switches the GPU workload mode to/from >>>>>>>>>>> compute mode, while submitting compute workload. >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>>>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>>>>> >>>>>>>>>> Feel free to add my acked-by, but Felix should probably take a >>>>>>>>>> look >>>>>>>>>> as well. >>>>>>>>> >>>>>>>>> This look OK purely from a compute perspective. But I'm concerned >>>>>>>>> about the interaction of compute with graphics or multiple >>>>>>>>> graphics >>>>>>>>> contexts submitting work concurrently. They would constantly >>>>>>>>> override >>>>>>>>> or disable each other's workload hints. 
>>>>>>>>> >>>>>>>>> For example, you have an amdgpu_ctx with >>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD >>>>>>>>> process that also wants the compute profile. Those could be >>>>>>>>> different >>>>>>>>> processes belonging to different users. Say, KFD enables the >>>>>>>>> compute >>>>>>>>> profile first. Then the graphics context submits a job. At the >>>>>>>>> start >>>>>>>>> of the job, the compute profile is enabled. That's a no-op because >>>>>>>>> KFD already enabled the compute profile. When the job finishes, it >>>>>>>>> disables the compute profile for everyone, including KFD. That's >>>>>>>>> unexpected. >>>>>>>>> >>>>>>>> >>>>>>>> In this case, it will not disable the compute profile, as the >>>>>>>> reference counter will not be zero. The reset_profile() will >>>>>>>> only act >>>>>>>> if the reference counter is 0. >>>>>>> >>>>>>> OK, I missed the reference counter. >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> But I would be happy to get any inputs about a policy which can be >>>>>>>> more sustainable and gets better outputs, for example: >>>>>>>> - should we not allow a profile change, if a PP mode is already >>>>>>>> applied and keep it Early bird basis ? >>>>>>>> >>>>>>>> For example: Policy A >>>>>>>> - Job A sets the profile to compute >>>>>>>> - Job B tries to set profile to 3D, but we do not allow it as >>>>>>>> job A is >>>>>>>> not finished it yet. >>>>>>>> >>>>>>>> Or Policy B: Current one >>>>>>>> - Job A sets the profile to compute >>>>>>>> - Job B tries to set profile to 3D, and we allow it. Job A also >>>>>>>> runs >>>>>>>> in PP 3D >>>>>>>> - Job B finishes, but does not reset PP as reference count is >>>>>>>> not zero >>>>>>>> due to compute >>>>>>>> - Job A finishes, profile reset to NONE >>>>>>> >>>>>>> I think this won't work. As I understand it, the >>>>>>> amdgpu_dpm_switch_power_profile enables and disables individual >>>>>>> profiles. 
Disabling the 3D profile doesn't disable the compute >>>>>>> profile >>>>>>> at the same time. I think you'll need one refcount per profile. >>>>>>> >>>>>>> Regards, >>>>>>> Felix >>>>>> >>>>>> Thanks, This is exactly what I was looking for, I think Alex's >>>>>> initial >>>>>> idea was around it, but I was under the assumption that there is only >>>>>> one HW profile in SMU which keeps on getting overwritten. This can >>>>>> solve >>>>>> our problems, as I can create an array of reference counters, and >>>>>> will >>>>>> disable only the profile whose reference counter goes 0. >>>>> >>>>> It's been a while since I paged any of this code into my head, but I >>>>> believe the actual workload message in the SMU is a mask where you can >>>>> specify multiple workload types at the same time and the SMU will >>>>> arbitrate between them internally. E.g., the most aggressive one will >>>>> be selected out of the ones specified. I think in the driver we just >>>>> set one bit at a time using the current interface. It might be better >>>>> to change the interface and just ref count the hint types and then >>>>> when we call the set function look at the ref counts for each hint >>>>> type and set the mask as appropriate. >>>>> >>>>> Alex >>>>> >>>> >>>> Hey Alex, >>>> Thanks for your comment, if that is the case, this current patch series >>>> works straight forward, and no changes would be required. 
Please let me >>>> know if my understanding is correct: >>>> >>>> Assumption: Order of aggression: 3D > Media > Compute >>>> >>>> - Job 1: Requests mode compute: PP changed to compute, ref count 1 >>>> - Job 2: Requests mode media: PP changed to media, ref count 2 >>>> - Job 3: requests mode 3D: PP changed to 3D, ref count 3 >>>> - Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref >>>> > 0, >>>> PP still 3D >>>> - Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref >>>> > 0, >>>> PP still 3D >>>> - Job 2 finishes, downs ref count to 0, PP changed to NONE, >>>> >>>> In this way, every job will be operating in the Power profile of >>>> desired >>>> aggression or higher, and this API guarantees the execution at-least in >>>> the desired power profile. >>> >>> I'm not entirely sure on the relative levels of aggression, but I >>> believe the SMU priorities them by index. E.g. >>> #define WORKLOAD_PPLIB_DEFAULT_BIT 0 >>> #define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1 >>> #define WORKLOAD_PPLIB_POWER_SAVING_BIT 2 >>> #define WORKLOAD_PPLIB_VIDEO_BIT 3 >>> #define WORKLOAD_PPLIB_VR_BIT 4 >>> #define WORKLOAD_PPLIB_COMPUTE_BIT 5 >>> #define WORKLOAD_PPLIB_CUSTOM_BIT 6 >>> >>> 3D < video < VR < compute < custom >>> >>> VR and compute are the most aggressive. Custom takes preference >>> because it's user customizable. >>> >>> Alex >>> >> >> Thanks, so this UAPI will guarantee the execution of the job in >> atleast the requested power profile, or a more aggressive one. >> > > Hi Shashank, > > This is not how the API works in the driver PM subsystem. In the final > interface with PMFW, driver sets only one profile bit and doesn't set > any mask. So it doesn't work the way as Felix explained. If there is > more than one profile bit set, PMFW looks at the mask and picks the one > with the highest priority. Note that for each update of workload mask, > PMFW should get a message. > > Driver currently sets only bit as Alex explained earlier. 
For our > current driver implementation, you can check this as example - > > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c#L1753 If you see my last reply: since Alex's last message, we are very clear on this point. And also, as the PM FW is already picking the one with the highest priority, we don't have to worry about blocking profile change calls via different contexts. In this way, every job will be executed at least at the requested priority power profile, or higher. Current power profile: P0. Job1 comes, requests power profile P2 => PM FW changes profile to P2. Job2 comes, requests power profile P3 => if (P3 > P2): profile changed to P3, else it stays at P2. So Job2 will still execute at P2, which is more aggressive than P3. So we don't have to block the PP change request at all. > > Also, PM layer already stores the current workload profile for a *get* > API (which also means a new pm workload variable is not needed). But, > that API works as long as driver sets only one profile bit, that way > driver is sure of the current profile mode - > > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c#L1628 > Yes, I had seen this API and its high-level API while I was exploring the code, and I found it was written to support sysfs-based reads and writes, and is not useful for a context-based scenario. > > When there is more than one, driver is not sure of the internal priority > of PMFW though we can follow the bit order which Alex suggested (but > sometimes FW carry some workarounds inside which means it doesn't > necessarily follow the same order). > > There is an existing interface through sysfs through which allow to > change the profile mode and add custom settings. Same as above, this sysfs interface is very basic, and good for validation of power profile changes, but not for job-level PP changes. > In summary, any > handling of change from single bit to mask needs to be done at the lower > layer. > I still don't understand how this series handles and changes this mask. This part is still being done in the amdgpu_dpm_switch_power_profile() function, which is a DPM function only. Code in this series is just calling/consuming this function from the scheduler. > The problem is this behavior has been there throughout all legacy ASICs. > Not sure how much of effort it takes and what all needs to be modified. > As mentioned above, we are just consuming the amdgpu_dpm_switch_power_profile() function. So if this function is valid for all these ASICs, I think this wrapper will also be fine. - Shashank > Thanks, > Lijo > >> I will do the one change required and send the updated one. >> >> - Shashank >> >>> >>> >>> >>>> >>>> - Shashank >>>> >>>>> >>>>>> >>>>>> - Shashank >>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Or anything else ? >>>>>>>> >>>>>>>> REgards >>>>>>>> Shashank >>>>>>>> >>>>>>>> >>>>>>>>> Or you have multiple VCN contexts. When context1 finishes a >>>>>>>>> job, it >>>>>>>>> disables the VIDEO profile. But context2 still has a job on >>>>>>>>> the other >>>>>>>>> VCN engine and wants the VIDEO profile to still be enabled. >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Felix >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Christian.
>>>>>>>>>> >>>>>>>>>>> --- >>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 >>>>>>>>>>> +++++++++++--- >>>>>>>>>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>>>>>>>>> >>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>> index 5e53a5293935..1caed319a448 100644 >>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>> @@ -34,6 +34,7 @@ >>>>>>>>>>> #include "amdgpu_ras.h" >>>>>>>>>>> #include "amdgpu_umc.h" >>>>>>>>>>> #include "amdgpu_reset.h" >>>>>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>>>>> /* Total memory size in system memory and all GPU VRAM. >>>>>>>>>>> Used to >>>>>>>>>>> * estimate worst case amount of memory to reserve for >>>>>>>>>>> page tables >>>>>>>>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct >>>>>>>>>>> amdgpu_device *adev, >>>>>>>>>>> void amdgpu_amdkfd_set_compute_idle(struct >>>>>>>>>>> amdgpu_device *adev, >>>>>>>>>>> bool idle) >>>>>>>>>>> { >>>>>>>>>>> - amdgpu_dpm_switch_power_profile(adev, >>>>>>>>>>> - PP_SMC_POWER_PROFILE_COMPUTE, >>>>>>>>>>> - !idle); >>>>>>>>>>> + int ret; >>>>>>>>>>> + >>>>>>>>>>> + if (idle) >>>>>>>>>>> + ret = amdgpu_clear_workload_profile(adev, >>>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>>>> + else >>>>>>>>>>> + ret = amdgpu_set_workload_profile(adev, >>>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>>>> + >>>>>>>>>>> + if (ret) >>>>>>>>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to >>>>>>>>>>> compute mode\n", >>>>>>>>>>> + idle ? "reset" : "set"); >>>>>>>>>>> } >>>>>>>>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device >>>>>>>>>>> *adev, u32 >>>>>>>>>>> vmid) >>>>>>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-29 13:20 ` Sharma, Shashank @ 2022-09-29 13:37 ` Lazar, Lijo 2022-09-29 14:00 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-29 13:37 UTC (permalink / raw) To: Sharma, Shashank, Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/29/2022 6:50 PM, Sharma, Shashank wrote: > > > On 9/29/2022 1:10 PM, Lazar, Lijo wrote: >> >> >> On 9/29/2022 2:18 PM, Sharma, Shashank wrote: >>> >>> >>> On 9/28/2022 11:51 PM, Alex Deucher wrote: >>>> On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank >>>> <shashank.sharma@amd.com> wrote: >>>>> >>>>> >>>>> >>>>> On 9/27/2022 10:40 PM, Alex Deucher wrote: >>>>>> On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank >>>>>> <shashank.sharma@amd.com> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 9/27/2022 5:23 PM, Felix Kuehling wrote: >>>>>>>> Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: >>>>>>>>> Hello Felix, >>>>>>>>> >>>>>>>>> Thank for the review comments. >>>>>>>>> >>>>>>>>> On 9/27/2022 4:48 PM, Felix Kuehling wrote: >>>>>>>>>> Am 2022-09-27 um 02:12 schrieb Christian König: >>>>>>>>>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: >>>>>>>>>>>> This patch switches the GPU workload mode to/from >>>>>>>>>>>> compute mode, while submitting compute workload. >>>>>>>>>>>> >>>>>>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>>>>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>>>>>> >>>>>>>>>>> Feel free to add my acked-by, but Felix should probably take >>>>>>>>>>> a look >>>>>>>>>>> as well. >>>>>>>>>> >>>>>>>>>> This look OK purely from a compute perspective. But I'm concerned >>>>>>>>>> about the interaction of compute with graphics or multiple >>>>>>>>>> graphics >>>>>>>>>> contexts submitting work concurrently. They would constantly >>>>>>>>>> override >>>>>>>>>> or disable each other's workload hints. 
>>>>>>>>>> >>>>>>>>>> For example, you have an amdgpu_ctx with >>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD >>>>>>>>>> process that also wants the compute profile. Those could be >>>>>>>>>> different >>>>>>>>>> processes belonging to different users. Say, KFD enables the >>>>>>>>>> compute >>>>>>>>>> profile first. Then the graphics context submits a job. At the >>>>>>>>>> start >>>>>>>>>> of the job, the compute profile is enabled. That's a no-op >>>>>>>>>> because >>>>>>>>>> KFD already enabled the compute profile. When the job >>>>>>>>>> finishes, it >>>>>>>>>> disables the compute profile for everyone, including KFD. That's >>>>>>>>>> unexpected. >>>>>>>>>> >>>>>>>>> >>>>>>>>> In this case, it will not disable the compute profile, as the >>>>>>>>> reference counter will not be zero. The reset_profile() will >>>>>>>>> only act >>>>>>>>> if the reference counter is 0. >>>>>>>> >>>>>>>> OK, I missed the reference counter. >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> But I would be happy to get any inputs about a policy which can be >>>>>>>>> more sustainable and gets better outputs, for example: >>>>>>>>> - should we not allow a profile change, if a PP mode is already >>>>>>>>> applied and keep it Early bird basis ? >>>>>>>>> >>>>>>>>> For example: Policy A >>>>>>>>> - Job A sets the profile to compute >>>>>>>>> - Job B tries to set profile to 3D, but we do not allow it as >>>>>>>>> job A is >>>>>>>>> not finished it yet. >>>>>>>>> >>>>>>>>> Or Policy B: Current one >>>>>>>>> - Job A sets the profile to compute >>>>>>>>> - Job B tries to set profile to 3D, and we allow it. Job A also >>>>>>>>> runs >>>>>>>>> in PP 3D >>>>>>>>> - Job B finishes, but does not reset PP as reference count is >>>>>>>>> not zero >>>>>>>>> due to compute >>>>>>>>> - Job A finishes, profile reset to NONE >>>>>>>> >>>>>>>> I think this won't work. As I understand it, the >>>>>>>> amdgpu_dpm_switch_power_profile enables and disables individual >>>>>>>> profiles. 
Disabling the 3D profile doesn't disable the compute >>>>>>>> profile >>>>>>>> at the same time. I think you'll need one refcount per profile. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Felix >>>>>>> >>>>>>> Thanks, This is exactly what I was looking for, I think Alex's >>>>>>> initial >>>>>>> idea was around it, but I was under the assumption that there is >>>>>>> only >>>>>>> one HW profile in SMU which keeps on getting overwritten. This >>>>>>> can solve >>>>>>> our problems, as I can create an array of reference counters, and >>>>>>> will >>>>>>> disable only the profile whose reference counter goes 0. >>>>>> >>>>>> It's been a while since I paged any of this code into my head, but I >>>>>> believe the actual workload message in the SMU is a mask where you >>>>>> can >>>>>> specify multiple workload types at the same time and the SMU will >>>>>> arbitrate between them internally. E.g., the most aggressive one >>>>>> will >>>>>> be selected out of the ones specified. I think in the driver we just >>>>>> set one bit at a time using the current interface. It might be >>>>>> better >>>>>> to change the interface and just ref count the hint types and then >>>>>> when we call the set function look at the ref counts for each hint >>>>>> type and set the mask as appropriate. >>>>>> >>>>>> Alex >>>>>> >>>>> >>>>> Hey Alex, >>>>> Thanks for your comment, if that is the case, this current patch >>>>> series >>>>> works straight forward, and no changes would be required. 
Please >>>>> let me >>>>> know if my understanding is correct: >>>>> >>>>> Assumption: Order of aggression: 3D > Media > Compute >>>>> >>>>> - Job 1: Requests mode compute: PP changed to compute, ref count 1 >>>>> - Job 2: Requests mode media: PP changed to media, ref count 2 >>>>> - Job 3: requests mode 3D: PP changed to 3D, ref count 3 >>>>> - Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref >>>>> > 0, >>>>> PP still 3D >>>>> - Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref >>>>> > 0, >>>>> PP still 3D >>>>> - Job 2 finishes, downs ref count to 0, PP changed to NONE, >>>>> >>>>> In this way, every job will be operating in the Power profile of >>>>> desired >>>>> aggression or higher, and this API guarantees the execution >>>>> at-least in >>>>> the desired power profile. >>>> >>>> I'm not entirely sure on the relative levels of aggression, but I >>>> believe the SMU priorities them by index. E.g. >>>> #define WORKLOAD_PPLIB_DEFAULT_BIT 0 >>>> #define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1 >>>> #define WORKLOAD_PPLIB_POWER_SAVING_BIT 2 >>>> #define WORKLOAD_PPLIB_VIDEO_BIT 3 >>>> #define WORKLOAD_PPLIB_VR_BIT 4 >>>> #define WORKLOAD_PPLIB_COMPUTE_BIT 5 >>>> #define WORKLOAD_PPLIB_CUSTOM_BIT 6 >>>> >>>> 3D < video < VR < compute < custom >>>> >>>> VR and compute are the most aggressive. Custom takes preference >>>> because it's user customizable. >>>> >>>> Alex >>>> >>> >>> Thanks, so this UAPI will guarantee the execution of the job in >>> atleast the requested power profile, or a more aggressive one. >>> >> >> Hi Shashank, >> >> This is not how the API works in the driver PM subsystem. In the final >> interface with PMFW, driver sets only one profile bit and doesn't set >> any mask. So it doesn't work the way as Felix explained. If there is >> more than one profile bit set, PMFW looks at the mask and picks the >> one with the highest priority. > Note that for each update of workload mask, >> PMFW should get a message. 
>> >> Driver currently sets only bit as Alex explained earlier. For our >> current driver implementation, you can check this as example - >> >> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c#L1753 > > If you see my last reply, Since Alex's last message, we are very clear > on this point. And also as PM FW is already picking up the one with the > highest priority, we don't have to worry about blocking profile change > calls via different contexts. In this way, every job will be executed at > at-least the requested priority power profile, or more than that. > > current power profile P0. > Job1 came, requested power profile P2=> PM FW changed profile to P2. > Job2 came, requested power profile P3=> if (p3 > p2): profile changed to > P3, else it will stay at P2. So Job2 will still execute at P2, which is > more aggressive than P3. > To be clear about your understanding: nothing is automatic in PMFW. PMFW picks a priority based on the actual mask sent by the driver. Assuming lower bits correspond to higher priority: if the driver sends a mask with Bit3 and Bit0 set, PMFW will choose the profile that corresponds to Bit0. If the driver sends a mask with Bit4 and Bit2 set and the rest unset, PMFW will choose the profile that corresponds to Bit2. However, if the driver sends a mask with only a single bit set, it chooses that profile regardless of whatever the previous profile was. It doesn't check whether the existing profile is more aggressive than the newly requested one. That is the behavior. So if a job chooses a profile that corresponds to Bit0, the driver will send that. Next time, if another job chooses a profile that corresponds to Bit1, PMFW will receive that as the new profile and switch to it. It trusts the driver to send the proper workload mask. Hope that gives the picture. Thanks, Lijo > So we don't have to block the PP change request at all.
> >> >> Also, PM layer already stores the current workload profile for a *get* >> API (which also means a new pm workload variable is not needed). But, >> that API works as long as driver sets only one profile bit, that way >> driver is sure of the current profile mode - >> >> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c#L1628 >> > > Yes, I had seen this API and its high level API while I was exploring > the code, and I found this written to support sysfs based reads and > write, and is not useful for a context based scenario. > >> >> When there is more than one, driver is not sure of the internal >> priority of PMFW though we can follow the bit order which Alex >> suggested (but sometimes FW carry some workarounds inside which means >> it doesn't necessarily follow the same order). >> >> There is an existing interface through sysfs through which allow to >> change the profile mode and add custom settings. > > Same as above, this sysfs interface is very basic, and good for > validation of power profile change, but not for job level pp change. > > In summary, any >> handling of change from single bit to mask needs to be done at the >> lower layer. >> > > I still don't understand how does this series handle and change this > mask ? This part is still being done in > amdgpu_dpm_switch_power_profile() function, which is a dpm function > only. Code in this series is just calling/consuming this function from > the scheduler. > >> The problem is this behavior has been there throughout all legacy >> ASICs. Not sure how much of effort it takes and what all needs to be >> modified. >> > > As mentioned above, we are just consuming > amdgpu_dpm_switch_power_profile() function. So if this function is valid > for all these ASICs, I think this wrapper will also be fine. > > - Shashank > >> Thanks, >> Lijo >> >>> I will do the one change required and send the updated one. 
>>> >>> - Shashank >>> >>>> >>>> >>>> >>>>> >>>>> - Shashank >>>>> >>>>>> >>>>>>> >>>>>>> - Shashank >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Or anything else ? >>>>>>>>> >>>>>>>>> REgards >>>>>>>>> Shashank >>>>>>>>> >>>>>>>>> >>>>>>>>>> Or you have multiple VCN contexts. When context1 finishes a >>>>>>>>>> job, it >>>>>>>>>> disables the VIDEO profile. But context2 still has a job on >>>>>>>>>> the other >>>>>>>>>> VCN engine and wants the VIDEO profile to still be enabled. >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> Felix >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Christian. >>>>>>>>>>> >>>>>>>>>>>> --- >>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 >>>>>>>>>>>> +++++++++++--- >>>>>>>>>>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>>> index 5e53a5293935..1caed319a448 100644 >>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>>> @@ -34,6 +34,7 @@ >>>>>>>>>>>> #include "amdgpu_ras.h" >>>>>>>>>>>> #include "amdgpu_umc.h" >>>>>>>>>>>> #include "amdgpu_reset.h" >>>>>>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>>>>>> /* Total memory size in system memory and all GPU >>>>>>>>>>>> VRAM. 
Used to >>>>>>>>>>>> * estimate worst case amount of memory to reserve for >>>>>>>>>>>> page tables >>>>>>>>>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct >>>>>>>>>>>> amdgpu_device *adev, >>>>>>>>>>>> void amdgpu_amdkfd_set_compute_idle(struct >>>>>>>>>>>> amdgpu_device *adev, >>>>>>>>>>>> bool idle) >>>>>>>>>>>> { >>>>>>>>>>>> - amdgpu_dpm_switch_power_profile(adev, >>>>>>>>>>>> - PP_SMC_POWER_PROFILE_COMPUTE, >>>>>>>>>>>> - !idle); >>>>>>>>>>>> + int ret; >>>>>>>>>>>> + >>>>>>>>>>>> + if (idle) >>>>>>>>>>>> + ret = amdgpu_clear_workload_profile(adev, >>>>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>>>>> + else >>>>>>>>>>>> + ret = amdgpu_set_workload_profile(adev, >>>>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>>>>> + >>>>>>>>>>>> + if (ret) >>>>>>>>>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to >>>>>>>>>>>> compute mode\n", >>>>>>>>>>>> + idle ? "reset" : "set"); >>>>>>>>>>>> } >>>>>>>>>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device >>>>>>>>>>>> *adev, u32 >>>>>>>>>>>> vmid) >>>>>>>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-29 13:37 ` Lazar, Lijo @ 2022-09-29 14:00 ` Sharma, Shashank 2022-09-29 14:14 ` Lazar, Lijo 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-29 14:00 UTC (permalink / raw) To: Lazar, Lijo, Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/29/2022 3:37 PM, Lazar, Lijo wrote: > To be clear your understanding - > > Nothing is automatic in PMFW. PMFW picks a priority based on the actual > mask sent by driver. > > Assuming lower bits corresponds to highest priority - > > If driver sends a mask with Bit3 and Bit 0 set, PMFW will chose profile > that corresponds to Bit0. If driver sends a mask with Bit4 Bit2 set and > rest unset, PMFW will chose profile that corresponds to Bit2. However if > driver sends a mask only with a single bit set, it chooses the profile > regardless of whatever was the previous profile. t doesn't check if the > existing profile > newly requested one. That is the behavior. > > So if a job send chooses a profile that corresponds to Bit0, driver will > send that. Next time if another job chooses a profile that corresponds > to Bit1, PMFW will receive that as the new profile and switch to that. > It trusts the driver to send the proper workload mask. > > Hope that gives the picture. > Thanks, my understanding is also similar, referring to the core power switch profile function here: amd_powerplay.c::pp_dpm_switch_power_profile() *snip code* hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]); index = fls(hwmgr->workload_mask); index = index <= Workload_Policy_Max ? index - 1 : 0; workload = hwmgr->workload_setting[index]; *snip_code* hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0); Here I can see that the new workload mask is appended into the existing workload mask (not overwritten). 
So if we keep sending new workload_modes, they would be appended into the workload flags and finally the PM will pick the most aggressive one of all these flags, as per its policy.

Now, when we have a single workload:
-> Job1: requests profile P1 via UAPI, ref count = 1
-> driver sends flags for P1
-> PM FW applies profile P1
-> Job executes in profile P1
-> Job goes to reset function, ref_count = 0
-> Power profile resets

Now, we have conflicts only when we see multiple workloads (Job1 and Job2):
-> Job1: requests profile P1 via UAPI, ref count = 1
-> driver sends flags for P1
-> PM FW applies profile P1
-> Job executes in profile P1
-> Job2: requests profile P2 via UAPI, ref count = 2
-> driver sends flags for (P1|P2)
-> PM FW picks the more aggressive of the two (say P1, stays in P1)
-> Job1 goes to reset function, ref count = 1, Job1 does not reset power profile
-> Job2 goes to reset function, ref count = 0, Job2 resets power profile
-> Power profile resets to None

So this state machine looks like: if there is only one job, it will be executed in the desired mode. But if there are multiple, the most aggressive profile will be picked, and every job will be executed in at least the requested power profile mode or higher.

Do you find any problem so far?

- Shashank

> Thanks,
> Lijo

^ permalink raw reply	[flat|nested] 76+ messages in thread
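The mask-append plus fls() behavior discussed above can be modeled in a few lines of userspace C. This is a toy sketch for checking the state machine, not the real driver code; the helper names are illustrative:

```c
#include <assert.h>

/* Toy model of the pp_dpm_switch_power_profile() snippet quoted in the
 * thread: profile priority bits accumulate in a mask, and fls() (find
 * last set) picks the highest set bit as the active profile. Names are
 * illustrative, not the real amdgpu symbols. */

static unsigned int workload_mask;

/* 1-based index of the highest set bit, 0 when the mask is empty */
static int fls_u32(unsigned int x)
{
	int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/* enable appends the profile bit; disable clears it.
 * Returns the active profile index, or -1 when none is set. */
static int switch_power_profile(int prio_bit, int enable)
{
	if (enable)
		workload_mask |= 1u << prio_bit;
	else
		workload_mask &= ~(1u << prio_bit);

	return fls_u32(workload_mask) - 1;
}
```

With this model, two concurrent requests keep the higher bit active until its owner drops it, which is exactly the append-then-pick behavior described above.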
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-29 14:00 ` Sharma, Shashank @ 2022-09-29 14:14 ` Lazar, Lijo 2022-09-29 14:40 ` Sharma, Shashank 2022-09-29 18:32 ` Alex Deucher 0 siblings, 2 replies; 76+ messages in thread From: Lazar, Lijo @ 2022-09-29 14:14 UTC (permalink / raw) To: Sharma, Shashank, Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/29/2022 7:30 PM, Sharma, Shashank wrote: > > > On 9/29/2022 3:37 PM, Lazar, Lijo wrote: >> To be clear your understanding - >> >> Nothing is automatic in PMFW. PMFW picks a priority based on the >> actual mask sent by driver. >> >> Assuming lower bits corresponds to highest priority - >> >> If driver sends a mask with Bit3 and Bit 0 set, PMFW will chose >> profile that corresponds to Bit0. If driver sends a mask with Bit4 >> Bit2 set and rest unset, PMFW will chose profile that corresponds to >> Bit2. However if driver sends a mask only with a single bit set, it >> chooses the profile regardless of whatever was the previous profile. t >> doesn't check if the existing profile > newly requested one. That is >> the behavior. >> >> So if a job send chooses a profile that corresponds to Bit0, driver >> will send that. Next time if another job chooses a profile that >> corresponds to Bit1, PMFW will receive that as the new profile and >> switch to that. It trusts the driver to send the proper workload mask. >> >> Hope that gives the picture. >> > > > Thanks, my understanding is also similar, referring to the core power > switch profile function here: > amd_powerplay.c::pp_dpm_switch_power_profile() > *snip code* > hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]); > index = fls(hwmgr->workload_mask); > index = index <= Workload_Policy_Max ? 
index - 1 : 0;
> workload = hwmgr->workload_setting[index];
> *snip_code*
> hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0);
>
> Here I can see that the new workload mask is appended into the existing
> workload mask (not overwritten). So if we keep sending new
> workload_modes, they would be appended into the workload flags and
> finally the PM will pick the most aggressive one of all these flags, as
> per its policy.
>

Actually it's misleading -

The path for sienna is -
set_power_profile_mode -> sienna_cichlid_set_power_profile_mode

This code here is picking one based on a lookup table.

workload_type = smu_cmn_to_asic_specific_index(smu,
                                               CMN2ASIC_MAPPING_WORKLOAD,
                                               smu->power_profile_mode);

This is that lookup table -

static struct cmn2asic_mapping sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = {
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT,	WORKLOAD_PPLIB_DEFAULT_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D,		WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING,		WORKLOAD_PPLIB_POWER_SAVING_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO,		WORKLOAD_PPLIB_VIDEO_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR,			WORKLOAD_PPLIB_VR_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE,		WORKLOAD_PPLIB_COMPUTE_BIT),
	WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM,		WORKLOAD_PPLIB_CUSTOM_BIT),
};

And this is the place of interaction with PMFW. (1 << workload_type) is the mask being sent.

smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask,
                                1 << workload_type, NULL);

In the end, the driver implementation expects only one bit to be set.
Thanks, Lijo > Now, when we have a single workload: > -> Job1: requests profile P1 via UAPI, ref count = 1 > -> driver sends flags for p1 > -> PM FW applies profile P1 > -> Job executes in profile P1 > -> Job goes to reset function, ref_count = 0, > -> Power profile resets > > Now, we have conflicts only when we see multiple workloads (Job1 and Job 2) > -> Job1: requests profile P1 via UAPI, ref count = 1 > -> driver sends flags for p1 > -> PM FW applies profile P1 > -> Job executes in profile P1 > -> Job2: requests profile P2 via UAPI, refcount = 2 > -> driver sends flags for (P1|P2) > -> PM FW picks the more aggressive of the two (Say P1, stays in P1) > -> Job1 goes to reset function, ref_count = 1, job1 does not reset power > profile > -> Job2 goes to reset function, ref_counter = 2, job 2 resets Power profile > -> Power profile resets to None > > So this state machine looks like if there is only 1 job, it will be > executed in desired mode. But if there are multiple, the most aggressive > profile will be picked, and every job will be executed in atleast the > requested power profile mode or higher. > > Do you find any problem so far ? > > - Shashank > > >> Thanks, >> Lijo ^ permalink raw reply [flat|nested] 76+ messages in thread
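The single-bit point above — that the swsmu path sends exactly one bit to PMFW regardless of what accumulated upstream — can be sketched as follows. The enum values and bit numbers are made up for illustration:

```c
#include <assert.h>

/* Sketch of the sienna_cichlid path quoted in the thread: the driver
 * maps the single smu->power_profile_mode through a lookup table and
 * sends (1 << workload_type) to PMFW, so only one bit ever reaches
 * firmware. Profile names and bit numbers here are illustrative. */

enum pp_profile {
	PROFILE_FULLSCREEN3D,
	PROFILE_VIDEO,
	PROFILE_VR,
	PROFILE_COMPUTE,
	PROFILE_COUNT,
};

/* CMN2ASIC-style map: driver profile -> firmware workload bit */
static const int workload_map[PROFILE_COUNT] = { 1, 3, 4, 5 };

/* Parameter that would go out in a SetWorkloadMask-style message */
static unsigned int workload_mask_param(enum pp_profile p)
{
	return 1u << workload_map[p];
}

static int popcount(unsigned int x)
{
	int n = 0;

	for (; x; x &= x - 1)
		n++;
	return n;
}
```

Whatever mask the powerplay layer accumulated, the parameter built here always has population count 1, which is why the append behavior never reaches firmware on this path.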
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-29 14:14 ` Lazar, Lijo @ 2022-09-29 14:40 ` Sharma, Shashank 2022-09-29 18:32 ` Alex Deucher 1 sibling, 0 replies; 76+ messages in thread From: Sharma, Shashank @ 2022-09-29 14:40 UTC (permalink / raw) To: Lazar, Lijo, Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/29/2022 4:14 PM, Lazar, Lijo wrote: > > > On 9/29/2022 7:30 PM, Sharma, Shashank wrote: >> >> >> On 9/29/2022 3:37 PM, Lazar, Lijo wrote: >>> To be clear your understanding - >>> >>> Nothing is automatic in PMFW. PMFW picks a priority based on the >>> actual mask sent by driver. >>> >>> Assuming lower bits corresponds to highest priority - >>> >>> If driver sends a mask with Bit3 and Bit 0 set, PMFW will chose >>> profile that corresponds to Bit0. If driver sends a mask with Bit4 >>> Bit2 set and rest unset, PMFW will chose profile that corresponds to >>> Bit2. However if driver sends a mask only with a single bit set, it >>> chooses the profile regardless of whatever was the previous profile. >>> t doesn't check if the existing profile > newly requested one. That >>> is the behavior. >>> >>> So if a job send chooses a profile that corresponds to Bit0, driver >>> will send that. Next time if another job chooses a profile that >>> corresponds to Bit1, PMFW will receive that as the new profile and >>> switch to that. It trusts the driver to send the proper workload mask. >>> >>> Hope that gives the picture. >>> >> >> >> Thanks, my understanding is also similar, referring to the core power >> switch profile function here: >> amd_powerplay.c::pp_dpm_switch_power_profile() >> *snip code* >> hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]); >> index = fls(hwmgr->workload_mask); >> index = index <= Workload_Policy_Max ? 
index - 1 : 0; >> workload = hwmgr->workload_setting[index]; >> *snip_code* >> hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0); >> >> Here I can see that the new workload mask is appended into the >> existing workload mask (not overwritten). So if we keep sending new >> workload_modes, they would be appended into the workload flags and >> finally the PM will pick the most aggressive one of all these flags, >> as per its policy. >> > > Actually it's misleading - > > The path for sienna is - > set_power_profile_mode -> sienna_cichlid_set_power_profile_mode > > > This code here is a picking one based on lookup table. > > workload_type = smu_cmn_to_asic_specific_index(smu, > > CMN2ASIC_MAPPING_WORKLOAD, > > smu->power_profile_mode); > > This is that lookup table - > > static struct cmn2asic_mapping > sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = { > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT, > WORKLOAD_PPLIB_DEFAULT_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D, > WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING, > WORKLOAD_PPLIB_POWER_SAVING_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO, > WORKLOAD_PPLIB_VIDEO_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR, WORKLOAD_PPLIB_VR_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, > WORKLOAD_PPLIB_COMPUTE_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM, > WORKLOAD_PPLIB_CUSTOM_BIT), > }; > > > And this is the place of interaction with PMFW. (1 << workload_type) is > the mask being sent. > > smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, > 1 << workload_type, NULL); > > In the end, driver implementation expects only one bit to be set. > Well this seems like a bug here in the core functions, because the powerplay layer is doing the right thing by appending the workload flags keeping in mind that a profile_change can be requested while one profile is active, but the core functions are actually ignoring those flags. 
This brings us to look into actual PM FW expectations. If it expects only one flag to be set in the power_mode change message, we don't need to bother about this anymore. But if it can handle more than one flag but the core driver implementation is blocking it, we will have to fix that as well. @Alex: How can we get more information on this ? - Shashank > Thanks, > Lijo > >> Now, when we have a single workload: >> -> Job1: requests profile P1 via UAPI, ref count = 1 >> -> driver sends flags for p1 >> -> PM FW applies profile P1 >> -> Job executes in profile P1 >> -> Job goes to reset function, ref_count = 0, >> -> Power profile resets >> >> Now, we have conflicts only when we see multiple workloads (Job1 and >> Job 2) >> -> Job1: requests profile P1 via UAPI, ref count = 1 >> -> driver sends flags for p1 >> -> PM FW applies profile P1 >> -> Job executes in profile P1 >> -> Job2: requests profile P2 via UAPI, refcount = 2 >> -> driver sends flags for (P1|P2) >> -> PM FW picks the more aggressive of the two (Say P1, stays in P1) >> -> Job1 goes to reset function, ref_count = 1, job1 does not reset >> power profile >> -> Job2 goes to reset function, ref_counter = 2, job 2 resets Power >> profile >> -> Power profile resets to None >> >> So this state machine looks like if there is only 1 job, it will be >> executed in desired mode. But if there are multiple, the most >> aggressive profile will be picked, and every job will be executed in >> atleast the requested power profile mode or higher. >> >> Do you find any problem so far ? >> >> - Shashank >> >> >>> Thanks, >>> Lijo ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-29 14:14 ` Lazar, Lijo 2022-09-29 14:40 ` Sharma, Shashank @ 2022-09-29 18:32 ` Alex Deucher 2022-09-30 5:08 ` Lazar, Lijo 1 sibling, 1 reply; 76+ messages in thread From: Alex Deucher @ 2022-09-29 18:32 UTC (permalink / raw) To: Lazar, Lijo Cc: Sharma, Shashank, Felix Kuehling, amaranath.somalapuram, amd-gfx, alexander.deucher, Christian König On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo <lijo.lazar@amd.com> wrote: > > > > On 9/29/2022 7:30 PM, Sharma, Shashank wrote: > > > > > > On 9/29/2022 3:37 PM, Lazar, Lijo wrote: > >> To be clear your understanding - > >> > >> Nothing is automatic in PMFW. PMFW picks a priority based on the > >> actual mask sent by driver. > >> > >> Assuming lower bits corresponds to highest priority - > >> > >> If driver sends a mask with Bit3 and Bit 0 set, PMFW will chose > >> profile that corresponds to Bit0. If driver sends a mask with Bit4 > >> Bit2 set and rest unset, PMFW will chose profile that corresponds to > >> Bit2. However if driver sends a mask only with a single bit set, it > >> chooses the profile regardless of whatever was the previous profile. t > >> doesn't check if the existing profile > newly requested one. That is > >> the behavior. > >> > >> So if a job send chooses a profile that corresponds to Bit0, driver > >> will send that. Next time if another job chooses a profile that > >> corresponds to Bit1, PMFW will receive that as the new profile and > >> switch to that. It trusts the driver to send the proper workload mask. > >> > >> Hope that gives the picture. > >> > > > > > > Thanks, my understanding is also similar, referring to the core power > > switch profile function here: > > amd_powerplay.c::pp_dpm_switch_power_profile() > > *snip code* > > hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]); > > index = fls(hwmgr->workload_mask); > > index = index <= Workload_Policy_Max ? 
index - 1 : 0; > > workload = hwmgr->workload_setting[index]; > > *snip_code* > > hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0); > > > > Here I can see that the new workload mask is appended into the existing > > workload mask (not overwritten). So if we keep sending new > > workload_modes, they would be appended into the workload flags and > > finally the PM will pick the most aggressive one of all these flags, as > > per its policy. > > > > Actually it's misleading - > > The path for sienna is - > set_power_profile_mode -> sienna_cichlid_set_power_profile_mode > > > This code here is a picking one based on lookup table. > > workload_type = smu_cmn_to_asic_specific_index(smu, > > CMN2ASIC_MAPPING_WORKLOAD, > > smu->power_profile_mode); > > This is that lookup table - > > static struct cmn2asic_mapping > sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = { > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT, > WORKLOAD_PPLIB_DEFAULT_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D, > WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING, > WORKLOAD_PPLIB_POWER_SAVING_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO, > WORKLOAD_PPLIB_VIDEO_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR, > WORKLOAD_PPLIB_VR_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, > WORKLOAD_PPLIB_COMPUTE_BIT), > WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM, > WORKLOAD_PPLIB_CUSTOM_BIT), > }; > > > And this is the place of interaction with PMFW. (1 << workload_type) is > the mask being sent. > > smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, > 1 << workload_type, NULL); > > In the end, driver implementation expects only one bit to be set. Shashank and I had a discussion about this today. I think there are a few thing we can do to handle this better: 1. 
Set a flag that if the user changes the default via sysfs, that overrides any runtime setting via an application, since presumably that is what the user wants, and we won't change the hint at runtime.
2. Drop the GET API. There's no need for this, the hint is just a hint.
3. Have the driver arbitrate between the available workload profiles based on the numeric value of the hint (e.g., default < 3D < video < VR < compute) as the higher values are more aggressive in most cases. If requests come in for 3D and compute at the same time, the driver will select compute because its value is highest. Each hint type would be ref counted so we'd know what state to be in every time we go to set the state. If all of the clients requesting compute go away, and only 3D requestors remain, we can switch to 3D. If all refcounts go to 0, we go back to default. This will not require any change to the current workload API in the SMU code.

Alex

> > Thanks,
> > Lijo
> >
> > Now, when we have a single workload:
> > -> Job1: requests profile P1 via UAPI, ref count = 1
> > -> driver sends flags for p1
> > -> PM FW applies profile P1
> > -> Job executes in profile P1
> > -> Job goes to reset function, ref_count = 0,
> > -> Power profile resets
> >
> > Now, we have conflicts only when we see multiple workloads (Job1 and Job 2)
> > -> Job1: requests profile P1 via UAPI, ref count = 1
> > -> driver sends flags for p1
> > -> PM FW applies profile P1
> > -> Job executes in profile P1
> > -> Job2: requests profile P2 via UAPI, refcount = 2
> > -> driver sends flags for (P1|P2)
> > -> PM FW picks the more aggressive of the two (Say P1, stays in P1)
> > -> Job1 goes to reset function, ref_count = 1, job1 does not reset power
> > profile
> > -> Job2 goes to reset function, ref_counter = 2, job 2 resets Power profile
> > -> Power profile resets to None
> >
> > So this state machine looks like if there is only 1 job, it will be
> > executed in desired mode.
But if there are multiple, the most aggressive > > profile will be picked, and every job will be executed in atleast the > > requested power profile mode or higher. > > > > Do you find any problem so far ? > > > > - Shashank > > > > > >> Thanks, > >> Lijo ^ permalink raw reply [flat|nested] 76+ messages in thread
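The arbitration proposed above — one refcount per hint, highest active hint wins, default restored when all counts drop to zero — can be sketched in a few lines. All names here are hypothetical, not the eventual driver API:

```c
#include <assert.h>

/* Sketch of the proposed driver-side arbitration: hints are ordered
 * default < 3D < video < VR < compute, each hint carries a refcount,
 * and the highest hint with a nonzero refcount is the active one.
 * All names are hypothetical. */

enum workload_hint {
	HINT_DEFAULT,
	HINT_3D,
	HINT_VIDEO,
	HINT_VR,
	HINT_COMPUTE,
	HINT_COUNT,
};

static int hint_refcount[HINT_COUNT];

static enum workload_hint active_hint(void)
{
	int h;

	for (h = HINT_COUNT - 1; h > HINT_DEFAULT; h--)
		if (hint_refcount[h] > 0)
			return h;
	return HINT_DEFAULT;
}

/* A job begins: take a reference on its hint, return the new state */
static enum workload_hint hint_get(enum workload_hint h)
{
	hint_refcount[h]++;
	return active_hint();
}

/* A job finishes: drop the reference, return the new state */
static enum workload_hint hint_put(enum workload_hint h)
{
	if (hint_refcount[h] > 0)
		hint_refcount[h]--;
	return active_hint();
}
```

Note that with this scheme a 3D job arriving while a compute job is active does not disturb the compute profile, and the last put falls back to default — the behavior described in the proposal.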
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-29 18:32 ` Alex Deucher @ 2022-09-30 5:08 ` Lazar, Lijo 2022-09-30 8:37 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-30 5:08 UTC (permalink / raw) To: Alex Deucher Cc: Sharma, Shashank, Felix Kuehling, amaranath.somalapuram, amd-gfx, alexander.deucher, Christian König On 9/30/2022 12:02 AM, Alex Deucher wrote: > On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo <lijo.lazar@amd.com> wrote: >> >> >> >> On 9/29/2022 7:30 PM, Sharma, Shashank wrote: >>> >>> >>> On 9/29/2022 3:37 PM, Lazar, Lijo wrote: >>>> To be clear your understanding - >>>> >>>> Nothing is automatic in PMFW. PMFW picks a priority based on the >>>> actual mask sent by driver. >>>> >>>> Assuming lower bits corresponds to highest priority - >>>> >>>> If driver sends a mask with Bit3 and Bit 0 set, PMFW will chose >>>> profile that corresponds to Bit0. If driver sends a mask with Bit4 >>>> Bit2 set and rest unset, PMFW will chose profile that corresponds to >>>> Bit2. However if driver sends a mask only with a single bit set, it >>>> chooses the profile regardless of whatever was the previous profile. t >>>> doesn't check if the existing profile > newly requested one. That is >>>> the behavior. >>>> >>>> So if a job send chooses a profile that corresponds to Bit0, driver >>>> will send that. Next time if another job chooses a profile that >>>> corresponds to Bit1, PMFW will receive that as the new profile and >>>> switch to that. It trusts the driver to send the proper workload mask. >>>> >>>> Hope that gives the picture. >>>> >>> >>> >>> Thanks, my understanding is also similar, referring to the core power >>> switch profile function here: >>> amd_powerplay.c::pp_dpm_switch_power_profile() >>> *snip code* >>> hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]); >>> index = fls(hwmgr->workload_mask); >>> index = index <= Workload_Policy_Max ? 
index - 1 : 0; >>> workload = hwmgr->workload_setting[index]; >>> *snip_code* >>> hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0); >>> >>> Here I can see that the new workload mask is appended into the existing >>> workload mask (not overwritten). So if we keep sending new >>> workload_modes, they would be appended into the workload flags and >>> finally the PM will pick the most aggressive one of all these flags, as >>> per its policy. >>> >> >> Actually it's misleading - >> >> The path for sienna is - >> set_power_profile_mode -> sienna_cichlid_set_power_profile_mode >> >> >> This code here is a picking one based on lookup table. >> >> workload_type = smu_cmn_to_asic_specific_index(smu, >> >> CMN2ASIC_MAPPING_WORKLOAD, >> >> smu->power_profile_mode); >> >> This is that lookup table - >> >> static struct cmn2asic_mapping >> sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = { >> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT, >> WORKLOAD_PPLIB_DEFAULT_BIT), >> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D, >> WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT), >> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING, >> WORKLOAD_PPLIB_POWER_SAVING_BIT), >> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO, >> WORKLOAD_PPLIB_VIDEO_BIT), >> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR, >> WORKLOAD_PPLIB_VR_BIT), >> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, >> WORKLOAD_PPLIB_COMPUTE_BIT), >> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM, >> WORKLOAD_PPLIB_CUSTOM_BIT), >> }; >> >> >> And this is the place of interaction with PMFW. (1 << workload_type) is >> the mask being sent. >> >> smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, >> 1 << workload_type, NULL); >> >> In the end, driver implementation expects only one bit to be set. > > Shashank and I had a discussion about this today. I think there are a > few thing we can do to handle this better: > > 1. 
Set a flag that if the user changes the default via sysfs that > overrides any runtime setting via an application since presumably that > is what the user wants and we won't change the hint at runtime. > 2. Drop the GET API. There's no need for this, the hint is just a hint. Double checked again based on Felix's comments on API definition. Driver decides the priority instead of FW. That way we can still keep Get API. > 2. Have the driver arbitrate between the available workload profiles > based on the numeric value of the hint (e.g., default < 3D < video < > VR < compute) as the higher values are more aggressive in most cases. > If requests come in for 3D and compute at the same time, the driver > will select compute because it's value is highest. Each hint type > would be ref counted so we'd know what state to be in every time we go > to set the state. If all of the clients requesting compute go away, > and only 3D requestors remain, we can switch to 3D. If all refcounts > go to 0, we go back to default. This will not require any change to > the current workload API in the SMU code. Since PM layer decides priority, refcount can be kept at powerplay and swsmu layer instead of any higher level API. User API may keep something like req_power_profile (for any logging/debug purpose) for the job preference. 
Thanks, Lijo > > Alex > >> >> Thanks, >> Lijo >> >>> Now, when we have a single workload: >>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>> -> driver sends flags for p1 >>> -> PM FW applies profile P1 >>> -> Job executes in profile P1 >>> -> Job goes to reset function, ref_count = 0, >>> -> Power profile resets >>> >>> Now, we have conflicts only when we see multiple workloads (Job1 and Job 2) >>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>> -> driver sends flags for p1 >>> -> PM FW applies profile P1 >>> -> Job executes in profile P1 >>> -> Job2: requests profile P2 via UAPI, refcount = 2 >>> -> driver sends flags for (P1|P2) >>> -> PM FW picks the more aggressive of the two (Say P1, stays in P1) >>> -> Job1 goes to reset function, ref_count = 1, job1 does not reset power >>> profile >>> -> Job2 goes to reset function, ref_counter = 2, job 2 resets Power profile >>> -> Power profile resets to None >>> >>> So this state machine looks like if there is only 1 job, it will be >>> executed in desired mode. But if there are multiple, the most aggressive >>> profile will be picked, and every job will be executed in atleast the >>> requested power profile mode or higher. >>> >>> Do you find any problem so far ? >>> >>> - Shashank >>> >>> >>>> Thanks, >>>> Lijo ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-30 5:08 ` Lazar, Lijo @ 2022-09-30 8:37 ` Sharma, Shashank 2022-09-30 9:13 ` Lazar, Lijo 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-30 8:37 UTC (permalink / raw) To: Lazar, Lijo, Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/30/2022 7:08 AM, Lazar, Lijo wrote: > > > On 9/30/2022 12:02 AM, Alex Deucher wrote: >> On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo <lijo.lazar@amd.com> wrote: >>> >>> >>> >>> On 9/29/2022 7:30 PM, Sharma, Shashank wrote: >>>> >>>> >>>> On 9/29/2022 3:37 PM, Lazar, Lijo wrote: >>>>> To be clear your understanding - >>>>> >>>>> Nothing is automatic in PMFW. PMFW picks a priority based on the >>>>> actual mask sent by driver. >>>>> >>>>> Assuming lower bits corresponds to highest priority - >>>>> >>>>> If driver sends a mask with Bit3 and Bit 0 set, PMFW will chose >>>>> profile that corresponds to Bit0. If driver sends a mask with Bit4 >>>>> Bit2 set and rest unset, PMFW will chose profile that corresponds to >>>>> Bit2. However if driver sends a mask only with a single bit set, it >>>>> chooses the profile regardless of whatever was the previous profile. t >>>>> doesn't check if the existing profile > newly requested one. That is >>>>> the behavior. >>>>> >>>>> So if a job send chooses a profile that corresponds to Bit0, driver >>>>> will send that. Next time if another job chooses a profile that >>>>> corresponds to Bit1, PMFW will receive that as the new profile and >>>>> switch to that. It trusts the driver to send the proper workload mask. >>>>> >>>>> Hope that gives the picture. 
>>>>> >>>> >>>> >>>> Thanks, my understanding is also similar, referring to the core power >>>> switch profile function here: >>>> amd_powerplay.c::pp_dpm_switch_power_profile() >>>> *snip code* >>>> hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]); >>>> index = fls(hwmgr->workload_mask); >>>> index = index <= Workload_Policy_Max ? index - 1 : 0; >>>> workload = hwmgr->workload_setting[index]; >>>> *snip_code* >>>> hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0); >>>> >>>> Here I can see that the new workload mask is appended into the existing >>>> workload mask (not overwritten). So if we keep sending new >>>> workload_modes, they would be appended into the workload flags and >>>> finally the PM will pick the most aggressive one of all these flags, as >>>> per its policy. >>>> >>> >>> Actually it's misleading - >>> >>> The path for sienna is - >>> set_power_profile_mode -> sienna_cichlid_set_power_profile_mode >>> >>> >>> This code here is a picking one based on lookup table. >>> >>> workload_type = smu_cmn_to_asic_specific_index(smu, >>> >>> CMN2ASIC_MAPPING_WORKLOAD, >>> >>> smu->power_profile_mode); >>> >>> This is that lookup table - >>> >>> static struct cmn2asic_mapping >>> sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = { >>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT, >>> WORKLOAD_PPLIB_DEFAULT_BIT), >>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D, >>> WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT), >>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING, >>> WORKLOAD_PPLIB_POWER_SAVING_BIT), >>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO, >>> WORKLOAD_PPLIB_VIDEO_BIT), >>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR, >>> WORKLOAD_PPLIB_VR_BIT), >>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, >>> WORKLOAD_PPLIB_COMPUTE_BIT), >>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM, >>> WORKLOAD_PPLIB_CUSTOM_BIT), >>> }; >>> >>> >>> And this is the place of interaction with PMFW. (1 << workload_type) is >>> the mask being sent. 
>>> >>> smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, >>> 1 << workload_type, NULL); >>> >>> In the end, driver implementation expects only one bit to be set. >> >> Shashank and I had a discussion about this today. I think there are a >> few thing we can do to handle this better: >> >> 1. Set a flag that if the user changes the default via sysfs that >> overrides any runtime setting via an application since presumably that >> is what the user wants and we won't change the hint at runtime. >> 2. Drop the GET API. There's no need for this, the hint is just a hint. > > Double checked again based on Felix's comments on API definition. Driver > decides the priority instead of FW. That way we can still keep Get API. > >> 2. Have the driver arbitrate between the available workload profiles >> based on the numeric value of the hint (e.g., default < 3D < video < >> VR < compute) as the higher values are more aggressive in most cases. >> If requests come in for 3D and compute at the same time, the driver >> will select compute because it's value is highest. Each hint type >> would be ref counted so we'd know what state to be in every time we go >> to set the state. If all of the clients requesting compute go away, >> and only 3D requestors remain, we can switch to 3D. If all refcounts >> go to 0, we go back to default. This will not require any change to >> the current workload API in the SMU code. > > Since PM layer decides priority, refcount can be kept at powerplay and > swsmu layer instead of any higher level API. > > User API may keep something like req_power_profile (for any > logging/debug purpose) for the job preference. No, I think there has been enough confusion around this implementation so far, we will implement this just as Alex/Felix suggested: - No change will be done in pm/SMU layer. 
- The amdgpu_context_workload layer will keep the ref_counting and user_workload_hint management, and it will just call and consume the pm_switch_workload profile() like any other client. - We will add a force flag for calls coming from sysfs() interface, and it will take the highest priority. No state machine will be managed for sysfs, and it will work as it is working today. - Shashank > > Thanks, > Lijo > >> >> Alex >> >>> >>> Thanks, >>> Lijo >>> >>>> Now, when we have a single workload: >>>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>>> -> driver sends flags for p1 >>>> -> PM FW applies profile P1 >>>> -> Job executes in profile P1 >>>> -> Job goes to reset function, ref_count = 0, >>>> -> Power profile resets >>>> >>>> Now, we have conflicts only when we see multiple workloads (Job1 and >>>> Job 2) >>>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>>> -> driver sends flags for p1 >>>> -> PM FW applies profile P1 >>>> -> Job executes in profile P1 >>>> -> Job2: requests profile P2 via UAPI, refcount = 2 >>>> -> driver sends flags for (P1|P2) >>>> -> PM FW picks the more aggressive of the two (Say P1, stays in P1) >>>> -> Job1 goes to reset function, ref_count = 1, job1 does not reset >>>> power >>>> profile >>>> -> Job2 goes to reset function, ref_counter = 2, job 2 resets Power >>>> profile >>>> -> Power profile resets to None >>>> >>>> So this state machine looks like if there is only 1 job, it will be >>>> executed in desired mode. But if there are multiple, the most >>>> aggressive >>>> profile will be picked, and every job will be executed in atleast the >>>> requested power profile mode or higher. >>>> >>>> Do you find any problem so far ? >>>> >>>> - Shashank >>>> >>>> >>>>> Thanks, >>>>> Lijo ^ permalink raw reply [flat|nested] 76+ messages in thread
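The sysfs force flag described in the summary above can be sketched like this — a user-forced profile shadows whatever the refcounted runtime hints would pick, and releasing it hands control back. The names and the -1 sentinel are assumptions for this sketch:

```c
#include <assert.h>

/* Sketch of the force-flag idea: a profile set via sysfs overrides the
 * refcounted runtime hints; clearing it restores the hint state
 * machine's choice. Names and the sentinel are hypothetical. */

#define PROFILE_NONE	(-1)

static int sysfs_forced = PROFILE_NONE;

static void sysfs_force_profile(int profile)
{
	sysfs_forced = profile;
}

static void sysfs_release_profile(void)
{
	sysfs_forced = PROFILE_NONE;
}

/* Called wherever the driver would program PMFW: the sysfs value,
 * when set, takes priority over the refcounted runtime choice. */
static int effective_profile(int refcounted_choice)
{
	if (sysfs_forced != PROFILE_NONE)
		return sysfs_forced;
	return refcounted_choice;
}
```

This keeps the sysfs path working exactly as today while the hint state machine only ever decides the profile when no explicit user override is in place.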
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-30 8:37 ` Sharma, Shashank @ 2022-09-30 9:13 ` Lazar, Lijo 2022-09-30 9:22 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-30 9:13 UTC (permalink / raw) To: Sharma, Shashank, Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/30/2022 2:07 PM, Sharma, Shashank wrote: > > > On 9/30/2022 7:08 AM, Lazar, Lijo wrote: >> >> >> On 9/30/2022 12:02 AM, Alex Deucher wrote: >>> On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo <lijo.lazar@amd.com> wrote: >>>> >>>> >>>> >>>> On 9/29/2022 7:30 PM, Sharma, Shashank wrote: >>>>> >>>>> >>>>> On 9/29/2022 3:37 PM, Lazar, Lijo wrote: >>>>>> To be clear your understanding - >>>>>> >>>>>> Nothing is automatic in PMFW. PMFW picks a priority based on the >>>>>> actual mask sent by driver. >>>>>> >>>>>> Assuming lower bits corresponds to highest priority - >>>>>> >>>>>> If driver sends a mask with Bit3 and Bit 0 set, PMFW will chose >>>>>> profile that corresponds to Bit0. If driver sends a mask with Bit4 >>>>>> Bit2 set and rest unset, PMFW will chose profile that corresponds to >>>>>> Bit2. However if driver sends a mask only with a single bit set, it >>>>>> chooses the profile regardless of whatever was the previous >>>>>> profile. t >>>>>> doesn't check if the existing profile > newly requested one. That is >>>>>> the behavior. >>>>>> >>>>>> So if a job send chooses a profile that corresponds to Bit0, driver >>>>>> will send that. Next time if another job chooses a profile that >>>>>> corresponds to Bit1, PMFW will receive that as the new profile and >>>>>> switch to that. It trusts the driver to send the proper workload >>>>>> mask. >>>>>> >>>>>> Hope that gives the picture. 
>>>>>> >>>>> >>>>> >>>>> Thanks, my understanding is also similar, referring to the core power >>>>> switch profile function here: >>>>> amd_powerplay.c::pp_dpm_switch_power_profile() >>>>> *snip code* >>>>> hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]); >>>>> index = fls(hwmgr->workload_mask); >>>>> index = index <= Workload_Policy_Max ? index - 1 : 0; >>>>> workload = hwmgr->workload_setting[index]; >>>>> *snip_code* >>>>> hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0); >>>>> >>>>> Here I can see that the new workload mask is appended into the >>>>> existing >>>>> workload mask (not overwritten). So if we keep sending new >>>>> workload_modes, they would be appended into the workload flags and >>>>> finally the PM will pick the most aggressive one of all these >>>>> flags, as >>>>> per its policy. >>>>> >>>> >>>> Actually it's misleading - >>>> >>>> The path for sienna is - >>>> set_power_profile_mode -> sienna_cichlid_set_power_profile_mode >>>> >>>> >>>> This code here is a picking one based on lookup table. >>>> >>>> workload_type = smu_cmn_to_asic_specific_index(smu, >>>> >>>> CMN2ASIC_MAPPING_WORKLOAD, >>>> >>>> smu->power_profile_mode); >>>> >>>> This is that lookup table - >>>> >>>> static struct cmn2asic_mapping >>>> sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = { >>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT, >>>> WORKLOAD_PPLIB_DEFAULT_BIT), >>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D, >>>> WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT), >>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING, >>>> WORKLOAD_PPLIB_POWER_SAVING_BIT), >>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO, >>>> WORKLOAD_PPLIB_VIDEO_BIT), >>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR, >>>> WORKLOAD_PPLIB_VR_BIT), >>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, >>>> WORKLOAD_PPLIB_COMPUTE_BIT), >>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM, >>>> WORKLOAD_PPLIB_CUSTOM_BIT), >>>> }; >>>> >>>> >>>> And this is the place of interaction with PMFW. 
(1 << workload_type) is >>>> the mask being sent. >>>> >>>> smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, >>>> 1 << workload_type, NULL); >>>> >>>> In the end, driver implementation expects only one bit to be set. >>> >>> Shashank and I had a discussion about this today. I think there are a >>> few thing we can do to handle this better: >>> >>> 1. Set a flag that if the user changes the default via sysfs that >>> overrides any runtime setting via an application since presumably that >>> is what the user wants and we won't change the hint at runtime. >>> 2. Drop the GET API. There's no need for this, the hint is just a hint. >> >> Double checked again based on Felix's comments on API definition. >> Driver decides the priority instead of FW. That way we can still keep >> Get API. >> >>> 2. Have the driver arbitrate between the available workload profiles >>> based on the numeric value of the hint (e.g., default < 3D < video < >>> VR < compute) as the higher values are more aggressive in most cases. >>> If requests come in for 3D and compute at the same time, the driver >>> will select compute because it's value is highest. Each hint type >>> would be ref counted so we'd know what state to be in every time we go >>> to set the state. If all of the clients requesting compute go away, >>> and only 3D requestors remain, we can switch to 3D. If all refcounts >>> go to 0, we go back to default. This will not require any change to >>> the current workload API in the SMU code. >> >> Since PM layer decides priority, refcount can be kept at powerplay and >> swsmu layer instead of any higher level API. >> >> User API may keep something like req_power_profile (for any >> logging/debug purpose) for the job preference. > > No, I think there has been enough confusion around this implementation > so far, we will implement this just as Alex/Felix suggested: > - No change will be done in pm/SMU layer. Well, a confusion doesn't justify bad implementation. 
You could just keep the refcount in workload_setting. Another API that uses power profile indirectly also will need to take care of refcount and we don't need every other API to do that separately without knowing what is the final outcome. Thanks, Lijo > - The amdgpu_context_workload layer will keep the ref_counting and > user_workload_hint management, and it will just call and consume the > pm_switch_workload profile() like any other client. > - We will add a force flag for calls coming from sysfs() interface, and > it will take the highest priority. No state machine will be managed for > sysfs, and it will work as it is working today. > > - Shashank > >> >> Thanks, >> Lijo >> >>> >>> Alex >>> >>>> >>>> Thanks, >>>> Lijo >>>> >>>>> Now, when we have a single workload: >>>>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>>>> -> driver sends flags for p1 >>>>> -> PM FW applies profile P1 >>>>> -> Job executes in profile P1 >>>>> -> Job goes to reset function, ref_count = 0, >>>>> -> Power profile resets >>>>> >>>>> Now, we have conflicts only when we see multiple workloads (Job1 >>>>> and Job 2) >>>>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>>>> -> driver sends flags for p1 >>>>> -> PM FW applies profile P1 >>>>> -> Job executes in profile P1 >>>>> -> Job2: requests profile P2 via UAPI, refcount = 2 >>>>> -> driver sends flags for (P1|P2) >>>>> -> PM FW picks the more aggressive of the two (Say P1, stays in P1) >>>>> -> Job1 goes to reset function, ref_count = 1, job1 does not reset >>>>> power >>>>> profile >>>>> -> Job2 goes to reset function, ref_counter = 2, job 2 resets Power >>>>> profile >>>>> -> Power profile resets to None >>>>> >>>>> So this state machine looks like if there is only 1 job, it will be >>>>> executed in desired mode. But if there are multiple, the most >>>>> aggressive >>>>> profile will be picked, and every job will be executed in atleast the >>>>> requested power profile mode or higher. 
>>>>> >>>>> Do you find any problem so far ? >>>>> >>>>> - Shashank >>>>> >>>>> >>>>>> Thanks, >>>>>> Lijo ^ permalink raw reply [flat|nested] 76+ messages in thread
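For readers following the quoted pp_dpm_switch_power_profile() snippet, the fls()-based pick can be reproduced in a standalone sketch: the most significant set bit of the accumulated workload_mask (i.e. the highest-priority pending request) selects the profile index, which is what makes the "append" behavior of the mask choose the most aggressive pending profile. Here fls_sketch() is a userspace stand-in for the kernel's fls(), and the empty-mask guard is an addition of this sketch (the kernel path ORs a bit in just before picking).

```c
#include <assert.h>

/* Stand-in for the kernel's fls(): 1-based index of the most
 * significant set bit; 0 when no bit is set. */
static int fls_sketch(unsigned int x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Mirrors: index = fls(mask); index = index <= max ? index - 1 : 0;
 * with an extra guard for an empty mask, which the kernel path never
 * sees because a priority bit was just set. */
static int pick_profile_index(unsigned int workload_mask, int policy_max)
{
	int index = fls_sketch(workload_mask);

	return (index > 0 && index <= policy_max) ? index - 1 : 0;
}
```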
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-30 9:13 ` Lazar, Lijo @ 2022-09-30 9:22 ` Sharma, Shashank 2022-09-30 9:54 ` Lazar, Lijo 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-30 9:22 UTC (permalink / raw) To: Lazar, Lijo, Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/30/2022 11:13 AM, Lazar, Lijo wrote: > > > On 9/30/2022 2:07 PM, Sharma, Shashank wrote: >> >> >> On 9/30/2022 7:08 AM, Lazar, Lijo wrote: >>> >>> >>> On 9/30/2022 12:02 AM, Alex Deucher wrote: >>>> On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo <lijo.lazar@amd.com> >>>> wrote: >>>>> >>>>> >>>>> >>>>> On 9/29/2022 7:30 PM, Sharma, Shashank wrote: >>>>>> >>>>>> >>>>>> On 9/29/2022 3:37 PM, Lazar, Lijo wrote: >>>>>>> To be clear your understanding - >>>>>>> >>>>>>> Nothing is automatic in PMFW. PMFW picks a priority based on the >>>>>>> actual mask sent by driver. >>>>>>> >>>>>>> Assuming lower bits corresponds to highest priority - >>>>>>> >>>>>>> If driver sends a mask with Bit3 and Bit 0 set, PMFW will chose >>>>>>> profile that corresponds to Bit0. If driver sends a mask with Bit4 >>>>>>> Bit2 set and rest unset, PMFW will chose profile that corresponds to >>>>>>> Bit2. However if driver sends a mask only with a single bit set, it >>>>>>> chooses the profile regardless of whatever was the previous >>>>>>> profile. t >>>>>>> doesn't check if the existing profile > newly requested one. That is >>>>>>> the behavior. >>>>>>> >>>>>>> So if a job send chooses a profile that corresponds to Bit0, driver >>>>>>> will send that. Next time if another job chooses a profile that >>>>>>> corresponds to Bit1, PMFW will receive that as the new profile and >>>>>>> switch to that. It trusts the driver to send the proper workload >>>>>>> mask. >>>>>>> >>>>>>> Hope that gives the picture. 
>>>>>>> >>>>>> >>>>>> >>>>>> Thanks, my understanding is also similar, referring to the core power >>>>>> switch profile function here: >>>>>> amd_powerplay.c::pp_dpm_switch_power_profile() >>>>>> *snip code* >>>>>> hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]); >>>>>> index = fls(hwmgr->workload_mask); >>>>>> index = index <= Workload_Policy_Max ? index - 1 : 0; >>>>>> workload = hwmgr->workload_setting[index]; >>>>>> *snip_code* >>>>>> hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0); >>>>>> >>>>>> Here I can see that the new workload mask is appended into the >>>>>> existing >>>>>> workload mask (not overwritten). So if we keep sending new >>>>>> workload_modes, they would be appended into the workload flags and >>>>>> finally the PM will pick the most aggressive one of all these >>>>>> flags, as >>>>>> per its policy. >>>>>> >>>>> >>>>> Actually it's misleading - >>>>> >>>>> The path for sienna is - >>>>> set_power_profile_mode -> sienna_cichlid_set_power_profile_mode >>>>> >>>>> >>>>> This code here is a picking one based on lookup table. 
>>>>> >>>>> workload_type = smu_cmn_to_asic_specific_index(smu, >>>>> >>>>> CMN2ASIC_MAPPING_WORKLOAD, >>>>> >>>>> smu->power_profile_mode); >>>>> >>>>> This is that lookup table - >>>>> >>>>> static struct cmn2asic_mapping >>>>> sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = { >>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT, >>>>> WORKLOAD_PPLIB_DEFAULT_BIT), >>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D, >>>>> WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT), >>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING, >>>>> WORKLOAD_PPLIB_POWER_SAVING_BIT), >>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO, >>>>> WORKLOAD_PPLIB_VIDEO_BIT), >>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR, >>>>> WORKLOAD_PPLIB_VR_BIT), >>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, >>>>> WORKLOAD_PPLIB_COMPUTE_BIT), >>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM, >>>>> WORKLOAD_PPLIB_CUSTOM_BIT), >>>>> }; >>>>> >>>>> >>>>> And this is the place of interaction with PMFW. (1 << >>>>> workload_type) is >>>>> the mask being sent. >>>>> >>>>> smu_cmn_send_smc_msg_with_param(smu, SMU_MSG_SetWorkloadMask, >>>>> 1 << workload_type, NULL); >>>>> >>>>> In the end, driver implementation expects only one bit to be set. >>>> >>>> Shashank and I had a discussion about this today. I think there are a >>>> few thing we can do to handle this better: >>>> >>>> 1. Set a flag that if the user changes the default via sysfs that >>>> overrides any runtime setting via an application since presumably that >>>> is what the user wants and we won't change the hint at runtime. >>>> 2. Drop the GET API. There's no need for this, the hint is just a >>>> hint. >>> >>> Double checked again based on Felix's comments on API definition. >>> Driver decides the priority instead of FW. That way we can still keep >>> Get API. >>> >>>> 2. 
Have the driver arbitrate between the available workload profiles >>>> based on the numeric value of the hint (e.g., default < 3D < video < >>>> VR < compute) as the higher values are more aggressive in most cases. >>>> If requests come in for 3D and compute at the same time, the driver >>>> will select compute because it's value is highest. Each hint type >>>> would be ref counted so we'd know what state to be in every time we go >>>> to set the state. If all of the clients requesting compute go away, >>>> and only 3D requestors remain, we can switch to 3D. If all refcounts >>>> go to 0, we go back to default. This will not require any change to >>>> the current workload API in the SMU code. >>> >>> Since PM layer decides priority, refcount can be kept at powerplay >>> and swsmu layer instead of any higher level API. >>> >>> User API may keep something like req_power_profile (for any >>> logging/debug purpose) for the job preference. >> >> No, I think there has been enough confusion around this implementation >> so far, we will implement this just as Alex/Felix suggested: >> - No change will be done in pm/SMU layer. > > Well, a confusion doesn't justify bad implementation. You could just > keep the refcount in workload_setting. So far, none of us have any reason to believe it's a bad implementation. Why is it so, again? > > Another API that uses power profile indirectly also will need to take > care of refcount and we don't need every other API to do that separately > without knowing what is the final outcome. > And why? The dpm_switch_power_profile API was introduced to be used by a higher-level API, and if a consumer API wants to keep track of that, it's their own call. This doesn't affect internal PM APIs. The whole idea is to manage the PM calls without any change in PM APIs.
- Shashank > Thanks, > Lijo > >> - The amdgpu_context_workload layer will keep the ref_counting and >> user_workload_hint management, and it will just call and consume the >> pm_switch_workload profile() like any other client. > >> - We will add a force flag for calls coming from sysfs() interface, >> and it will take the highest priority. No state machine will be >> managed for sysfs, and it will work as it is working today. >> >> - Shashank >> >>> >>> Thanks, >>> Lijo >>> >>>> >>>> Alex >>>> >>>>> >>>>> Thanks, >>>>> Lijo >>>>> >>>>>> Now, when we have a single workload: >>>>>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>>>>> -> driver sends flags for p1 >>>>>> -> PM FW applies profile P1 >>>>>> -> Job executes in profile P1 >>>>>> -> Job goes to reset function, ref_count = 0, >>>>>> -> Power profile resets >>>>>> >>>>>> Now, we have conflicts only when we see multiple workloads (Job1 >>>>>> and Job 2) >>>>>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>>>>> -> driver sends flags for p1 >>>>>> -> PM FW applies profile P1 >>>>>> -> Job executes in profile P1 >>>>>> -> Job2: requests profile P2 via UAPI, refcount = 2 >>>>>> -> driver sends flags for (P1|P2) >>>>>> -> PM FW picks the more aggressive of the two (Say P1, stays in P1) >>>>>> -> Job1 goes to reset function, ref_count = 1, job1 does not reset >>>>>> power >>>>>> profile >>>>>> -> Job2 goes to reset function, ref_counter = 2, job 2 resets >>>>>> Power profile >>>>>> -> Power profile resets to None >>>>>> >>>>>> So this state machine looks like if there is only 1 job, it will be >>>>>> executed in desired mode. But if there are multiple, the most >>>>>> aggressive >>>>>> profile will be picked, and every job will be executed in atleast the >>>>>> requested power profile mode or higher. >>>>>> >>>>>> Do you find any problem so far ? >>>>>> >>>>>> - Shashank >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> Lijo ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-30 9:22 ` Sharma, Shashank @ 2022-09-30 9:54 ` Lazar, Lijo 2022-09-30 10:09 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Lazar, Lijo @ 2022-09-30 9:54 UTC (permalink / raw) To: Sharma, Shashank, Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/30/2022 2:52 PM, Sharma, Shashank wrote: > > > On 9/30/2022 11:13 AM, Lazar, Lijo wrote: >> >> >> On 9/30/2022 2:07 PM, Sharma, Shashank wrote: >>> >>> >>> On 9/30/2022 7:08 AM, Lazar, Lijo wrote: >>>> >>>> >>>> On 9/30/2022 12:02 AM, Alex Deucher wrote: >>>>> On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo <lijo.lazar@amd.com> >>>>> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 9/29/2022 7:30 PM, Sharma, Shashank wrote: >>>>>>> >>>>>>> >>>>>>> On 9/29/2022 3:37 PM, Lazar, Lijo wrote: >>>>>>>> To be clear your understanding - >>>>>>>> >>>>>>>> Nothing is automatic in PMFW. PMFW picks a priority based on the >>>>>>>> actual mask sent by driver. >>>>>>>> >>>>>>>> Assuming lower bits corresponds to highest priority - >>>>>>>> >>>>>>>> If driver sends a mask with Bit3 and Bit 0 set, PMFW will chose >>>>>>>> profile that corresponds to Bit0. If driver sends a mask with Bit4 >>>>>>>> Bit2 set and rest unset, PMFW will chose profile that >>>>>>>> corresponds to >>>>>>>> Bit2. However if driver sends a mask only with a single bit set, it >>>>>>>> chooses the profile regardless of whatever was the previous >>>>>>>> profile. t >>>>>>>> doesn't check if the existing profile > newly requested one. >>>>>>>> That is >>>>>>>> the behavior. >>>>>>>> >>>>>>>> So if a job send chooses a profile that corresponds to Bit0, driver >>>>>>>> will send that. Next time if another job chooses a profile that >>>>>>>> corresponds to Bit1, PMFW will receive that as the new profile and >>>>>>>> switch to that. It trusts the driver to send the proper workload >>>>>>>> mask. 
>>>>>>>> >>>>>>>> Hope that gives the picture. >>>>>>>> >>>>>>> >>>>>>> >>>>>>> Thanks, my understanding is also similar, referring to the core >>>>>>> power >>>>>>> switch profile function here: >>>>>>> amd_powerplay.c::pp_dpm_switch_power_profile() >>>>>>> *snip code* >>>>>>> hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]); >>>>>>> index = fls(hwmgr->workload_mask); >>>>>>> index = index <= Workload_Policy_Max ? index - 1 : 0; >>>>>>> workload = hwmgr->workload_setting[index]; >>>>>>> *snip_code* >>>>>>> hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0); >>>>>>> >>>>>>> Here I can see that the new workload mask is appended into the >>>>>>> existing >>>>>>> workload mask (not overwritten). So if we keep sending new >>>>>>> workload_modes, they would be appended into the workload flags and >>>>>>> finally the PM will pick the most aggressive one of all these >>>>>>> flags, as >>>>>>> per its policy. >>>>>>> >>>>>> >>>>>> Actually it's misleading - >>>>>> >>>>>> The path for sienna is - >>>>>> set_power_profile_mode -> sienna_cichlid_set_power_profile_mode >>>>>> >>>>>> >>>>>> This code here is a picking one based on lookup table. 
>>>>>> >>>>>> workload_type = smu_cmn_to_asic_specific_index(smu, >>>>>> >>>>>> CMN2ASIC_MAPPING_WORKLOAD, >>>>>> >>>>>> smu->power_profile_mode); >>>>>> >>>>>> This is that lookup table - >>>>>> >>>>>> static struct cmn2asic_mapping >>>>>> sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = { >>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT, >>>>>> WORKLOAD_PPLIB_DEFAULT_BIT), >>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D, >>>>>> WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT), >>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING, >>>>>> WORKLOAD_PPLIB_POWER_SAVING_BIT), >>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO, >>>>>> WORKLOAD_PPLIB_VIDEO_BIT), >>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR, >>>>>> WORKLOAD_PPLIB_VR_BIT), >>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, >>>>>> WORKLOAD_PPLIB_COMPUTE_BIT), >>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM, >>>>>> WORKLOAD_PPLIB_CUSTOM_BIT), >>>>>> }; >>>>>> >>>>>> >>>>>> And this is the place of interaction with PMFW. (1 << >>>>>> workload_type) is >>>>>> the mask being sent. >>>>>> >>>>>> smu_cmn_send_smc_msg_with_param(smu, >>>>>> SMU_MSG_SetWorkloadMask, >>>>>> 1 << workload_type, NULL); >>>>>> >>>>>> In the end, driver implementation expects only one bit to be set. >>>>> >>>>> Shashank and I had a discussion about this today. I think there are a >>>>> few thing we can do to handle this better: >>>>> >>>>> 1. Set a flag that if the user changes the default via sysfs that >>>>> overrides any runtime setting via an application since presumably that >>>>> is what the user wants and we won't change the hint at runtime. >>>>> 2. Drop the GET API. There's no need for this, the hint is just a >>>>> hint. >>>> >>>> Double checked again based on Felix's comments on API definition. >>>> Driver decides the priority instead of FW. That way we can still >>>> keep Get API. >>>> >>>>> 2. 
Have the driver arbitrate between the available workload profiles >>>>> based on the numeric value of the hint (e.g., default < 3D < video < >>>>> VR < compute) as the higher values are more aggressive in most cases. >>>>> If requests come in for 3D and compute at the same time, the driver >>>>> will select compute because it's value is highest. Each hint type >>>>> would be ref counted so we'd know what state to be in every time we go >>>>> to set the state. If all of the clients requesting compute go away, >>>>> and only 3D requestors remain, we can switch to 3D. If all refcounts >>>>> go to 0, we go back to default. This will not require any change to >>>>> the current workload API in the SMU code. >>>> >>>> Since PM layer decides priority, refcount can be kept at powerplay >>>> and swsmu layer instead of any higher level API. >>>> >>>> User API may keep something like req_power_profile (for any >>>> logging/debug purpose) for the job preference. >>> >>> No, I think there has been enough confusion around this >>> implementation so far, we will implement this just as Alex/Felix >>> suggested: >>> - No change will be done in pm/SMU layer. >> >> Well, a confusion doesn't justify bad implementation. You could just >> keep the refcount in workload_setting. > > So far, none of us have any reason to believe its a bad implementation. > Why is it so, again ? > It's only about keeping track of requests at client layer. >> >> Another API that uses power profile indirectly also will need to take >> care of refcount and we don't need every other API to do that >> separately without knowing what is the final outcome. >> > > And why ? The dpm_switch_power_profile API was introduced to be used by > a higher level API, and if a consumer API wants to keep track of that, > its their own call. This doesn't affect internal PM APIs. The whole idea > is to manage the PM calls without any change in PM APIs. 
> Just like per-job-switch-profile is a new usage, there could be other new cases as well. Also, there are other APIs which indirectly manipulates power profile other than sys. All I'm saying is keep the refcount at core layer so that regardless of wherever it comes from, it keeps the preference. So instead of this- smu->workload_mask &= ~(1 << smu->workload_prority[type]); Have something like this - smu->workload[type].reqcount--; if (!smu->workload[type].reqcount) smu->workload_mask &= ~(1 << smu->workload[type].priority); I guess, the count was not there because there was no usage of multiple clients preferring the same profile at the same time. Now that there is a case for this, fix it at where required rather than keeping a track of it at client layer. Thanks, Lijo > > - Shashank > >> Thanks, >> Lijo >> >>> - The amdgpu_context_workload layer will keep the ref_counting and >>> user_workload_hint management, and it will just call and consume the >>> pm_switch_workload profile() like any other client. >> >>> - We will add a force flag for calls coming from sysfs() interface, >>> and it will take the highest priority. No state machine will be >>> managed for sysfs, and it will work as it is working today. 
>>> >>> - Shashank >>> >>>> >>>> Thanks, >>>> Lijo >>>> >>>>> >>>>> Alex >>>>> >>>>>> >>>>>> Thanks, >>>>>> Lijo >>>>>> >>>>>>> Now, when we have a single workload: >>>>>>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>>>>>> -> driver sends flags for p1 >>>>>>> -> PM FW applies profile P1 >>>>>>> -> Job executes in profile P1 >>>>>>> -> Job goes to reset function, ref_count = 0, >>>>>>> -> Power profile resets >>>>>>> >>>>>>> Now, we have conflicts only when we see multiple workloads (Job1 >>>>>>> and Job 2) >>>>>>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>>>>>> -> driver sends flags for p1 >>>>>>> -> PM FW applies profile P1 >>>>>>> -> Job executes in profile P1 >>>>>>> -> Job2: requests profile P2 via UAPI, refcount = 2 >>>>>>> -> driver sends flags for (P1|P2) >>>>>>> -> PM FW picks the more aggressive of the two (Say P1, stays in P1) >>>>>>> -> Job1 goes to reset function, ref_count = 1, job1 does not >>>>>>> reset power >>>>>>> profile >>>>>>> -> Job2 goes to reset function, ref_counter = 2, job 2 resets >>>>>>> Power profile >>>>>>> -> Power profile resets to None >>>>>>> >>>>>>> So this state machine looks like if there is only 1 job, it will be >>>>>>> executed in desired mode. But if there are multiple, the most >>>>>>> aggressive >>>>>>> profile will be picked, and every job will be executed in atleast >>>>>>> the >>>>>>> requested power profile mode or higher. >>>>>>> >>>>>>> Do you find any problem so far ? >>>>>>> >>>>>>> - Shashank >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> Lijo ^ permalink raw reply [flat|nested] 76+ messages in thread
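Lijo's reqcount suggestion above, expanded into a compilable sketch. Note that the reqcount field and both helper names are hypothetical — the SMU code as it stands only tracks workload_mask and the priority array, with no per-type request count:

```c
#include <assert.h>

#define WL_TYPES 8	/* stand-in for the real profile count */

/* Hypothetical reshape of the SMU bookkeeping; the real struct has
 * no per-type reqcount today. */
struct smu_sketch {
	unsigned int workload_mask;
	struct {
		int reqcount;
		int priority;
	} workload[WL_TYPES];
};

/* A client requests profile `type`: count it, set its priority bit. */
static void smu_profile_get(struct smu_sketch *smu, int type)
{
	smu->workload[type].reqcount++;
	smu->workload_mask |= 1u << smu->workload[type].priority;
}

/* A client releases profile `type`: clear the bit only when the last
 * requester of this type goes away -- the suggested fix above. */
static void smu_profile_put(struct smu_sketch *smu, int type)
{
	if (smu->workload[type].reqcount > 0)
		smu->workload[type].reqcount--;
	if (!smu->workload[type].reqcount)
		smu->workload_mask &= ~(1u << smu->workload[type].priority);
}
```

The design point is that concurrent clients preferring the same profile cannot clear each other's request: the mask bit survives until the final put.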
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-30 9:54 ` Lazar, Lijo @ 2022-09-30 10:09 ` Sharma, Shashank 0 siblings, 0 replies; 76+ messages in thread From: Sharma, Shashank @ 2022-09-30 10:09 UTC (permalink / raw) To: Lazar, Lijo, Alex Deucher Cc: alexander.deucher, Felix Kuehling, amaranath.somalapuram, Christian König, amd-gfx On 9/30/2022 11:54 AM, Lazar, Lijo wrote: > > > On 9/30/2022 2:52 PM, Sharma, Shashank wrote: >> >> >> On 9/30/2022 11:13 AM, Lazar, Lijo wrote: >>> >>> >>> On 9/30/2022 2:07 PM, Sharma, Shashank wrote: >>>> >>>> >>>> On 9/30/2022 7:08 AM, Lazar, Lijo wrote: >>>>> >>>>> >>>>> On 9/30/2022 12:02 AM, Alex Deucher wrote: >>>>>> On Thu, Sep 29, 2022 at 10:14 AM Lazar, Lijo <lijo.lazar@amd.com> >>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 9/29/2022 7:30 PM, Sharma, Shashank wrote: >>>>>>>> >>>>>>>> >>>>>>>> On 9/29/2022 3:37 PM, Lazar, Lijo wrote: >>>>>>>>> To be clear your understanding - >>>>>>>>> >>>>>>>>> Nothing is automatic in PMFW. PMFW picks a priority based on the >>>>>>>>> actual mask sent by driver. >>>>>>>>> >>>>>>>>> Assuming lower bits corresponds to highest priority - >>>>>>>>> >>>>>>>>> If driver sends a mask with Bit3 and Bit 0 set, PMFW will chose >>>>>>>>> profile that corresponds to Bit0. If driver sends a mask with Bit4 >>>>>>>>> Bit2 set and rest unset, PMFW will chose profile that >>>>>>>>> corresponds to >>>>>>>>> Bit2. However if driver sends a mask only with a single bit >>>>>>>>> set, it >>>>>>>>> chooses the profile regardless of whatever was the previous >>>>>>>>> profile. t >>>>>>>>> doesn't check if the existing profile > newly requested one. >>>>>>>>> That is >>>>>>>>> the behavior. >>>>>>>>> >>>>>>>>> So if a job send chooses a profile that corresponds to Bit0, >>>>>>>>> driver >>>>>>>>> will send that. Next time if another job chooses a profile that >>>>>>>>> corresponds to Bit1, PMFW will receive that as the new profile and >>>>>>>>> switch to that. 
It trusts the driver to send the proper >>>>>>>>> workload mask. >>>>>>>>> >>>>>>>>> Hope that gives the picture. >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Thanks, my understanding is also similar, referring to the core >>>>>>>> power >>>>>>>> switch profile function here: >>>>>>>> amd_powerplay.c::pp_dpm_switch_power_profile() >>>>>>>> *snip code* >>>>>>>> hwmgr->workload_mask |= (1 << hwmgr->workload_prority[type]); >>>>>>>> index = fls(hwmgr->workload_mask); >>>>>>>> index = index <= Workload_Policy_Max ? index - 1 : 0; >>>>>>>> workload = hwmgr->workload_setting[index]; >>>>>>>> *snip_code* >>>>>>>> hwmgr->hwmgr_func->set_power_profile_mode(hwmgr, &workload, 0); >>>>>>>> >>>>>>>> Here I can see that the new workload mask is appended into the >>>>>>>> existing >>>>>>>> workload mask (not overwritten). So if we keep sending new >>>>>>>> workload_modes, they would be appended into the workload flags and >>>>>>>> finally the PM will pick the most aggressive one of all these >>>>>>>> flags, as >>>>>>>> per its policy. >>>>>>>> >>>>>>> >>>>>>> Actually it's misleading - >>>>>>> >>>>>>> The path for sienna is - >>>>>>> set_power_profile_mode -> sienna_cichlid_set_power_profile_mode >>>>>>> >>>>>>> >>>>>>> This code here is a picking one based on lookup table. 
>>>>>>> >>>>>>> workload_type = smu_cmn_to_asic_specific_index(smu, >>>>>>> >>>>>>> CMN2ASIC_MAPPING_WORKLOAD, >>>>>>> >>>>>>> smu->power_profile_mode); >>>>>>> >>>>>>> This is that lookup table - >>>>>>> >>>>>>> static struct cmn2asic_mapping >>>>>>> sienna_cichlid_workload_map[PP_SMC_POWER_PROFILE_COUNT] = { >>>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_BOOTUP_DEFAULT, >>>>>>> WORKLOAD_PPLIB_DEFAULT_BIT), >>>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_FULLSCREEN3D, >>>>>>> WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT), >>>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_POWERSAVING, >>>>>>> WORKLOAD_PPLIB_POWER_SAVING_BIT), >>>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VIDEO, >>>>>>> WORKLOAD_PPLIB_VIDEO_BIT), >>>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_VR, >>>>>>> WORKLOAD_PPLIB_VR_BIT), >>>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_COMPUTE, >>>>>>> WORKLOAD_PPLIB_COMPUTE_BIT), >>>>>>> WORKLOAD_MAP(PP_SMC_POWER_PROFILE_CUSTOM, >>>>>>> WORKLOAD_PPLIB_CUSTOM_BIT), >>>>>>> }; >>>>>>> >>>>>>> >>>>>>> And this is the place of interaction with PMFW. (1 << >>>>>>> workload_type) is >>>>>>> the mask being sent. >>>>>>> >>>>>>> smu_cmn_send_smc_msg_with_param(smu, >>>>>>> SMU_MSG_SetWorkloadMask, >>>>>>> 1 << workload_type, NULL); >>>>>>> >>>>>>> In the end, driver implementation expects only one bit to be set. >>>>>> >>>>>> Shashank and I had a discussion about this today. I think there >>>>>> are a >>>>>> few thing we can do to handle this better: >>>>>> >>>>>> 1. Set a flag that if the user changes the default via sysfs that >>>>>> overrides any runtime setting via an application since presumably >>>>>> that >>>>>> is what the user wants and we won't change the hint at runtime. >>>>>> 2. Drop the GET API. There's no need for this, the hint is just a >>>>>> hint. >>>>> >>>>> Double checked again based on Felix's comments on API definition. >>>>> Driver decides the priority instead of FW. That way we can still >>>>> keep Get API. >>>>> >>>>>> 2. 
Have the driver arbitrate between the available workload profiles >>>>>> based on the numeric value of the hint (e.g., default < 3D < video < >>>>>> VR < compute) as the higher values are more aggressive in most cases. >>>>>> If requests come in for 3D and compute at the same time, the driver >>>>>> will select compute because it's value is highest. Each hint type >>>>>> would be ref counted so we'd know what state to be in every time >>>>>> we go >>>>>> to set the state. If all of the clients requesting compute go away, >>>>>> and only 3D requestors remain, we can switch to 3D. If all refcounts >>>>>> go to 0, we go back to default. This will not require any change to >>>>>> the current workload API in the SMU code. >>>>> >>>>> Since PM layer decides priority, refcount can be kept at powerplay >>>>> and swsmu layer instead of any higher level API. >>>>> >>>>> User API may keep something like req_power_profile (for any >>>>> logging/debug purpose) for the job preference. >>>> >>>> No, I think there has been enough confusion around this >>>> implementation so far, we will implement this just as Alex/Felix >>>> suggested: >>>> - No change will be done in pm/SMU layer. >>> >>> Well, a confusion doesn't justify bad implementation. You could just >>> keep the refcount in workload_setting. >> >> So far, none of us have any reason to believe its a bad >> implementation. Why is it so, again ? >> > > It's only about keeping track of requests at client layer. > There is absolutely nothing bad or wrong with that, as a matter of fact, some driver designs prefer to keep it like this, and let the core API minimal and focused on core functionality. This is just about choice. >>> >>> Another API that uses power profile indirectly also will need to take >>> care of refcount and we don't need every other API to do that >>> separately without knowing what is the final outcome. >>> >> >> And why ? 
The dpm_switch_power_profile API was introduced to be used >> by a higher level API, and if a consumer API wants to keep track of >> that, it's their own call. This doesn't affect internal PM APIs. The >> whole idea is to manage the PM calls without any change in PM APIs. >> > > Just like per-job-switch-profile is a new usage, there could be other > new cases as well. Also, there are other APIs which indirectly > manipulate the power profile other than sysfs. > Understand that there was no reference counting for pm profile change so far, as it was probably written considering the sysfs interface and never considered a multi-client environment. Like workload context, if there are other current/future clients who could also use these APIs, it would be a very important reason to add this workload reference counting in a central pm structure (rather than multiple scattered places), so that every new API/consumer can understand and use this, and consider a system-wide scenario of DPM power profile, instead of a narrow view of its own thread. The central counter will indicate that more than one consumer/client can change the power profile, not only this thread. - Shashank > All I'm saying is keep the refcount at the core layer so that regardless of > wherever it comes from, it keeps the preference. > > So instead of this - > smu->workload_mask &= ~(1 << smu->workload_prority[type]); > > Have something like this - > > smu->workload[type].reqcount--; > if (!smu->workload[type].reqcount) > smu->workload_mask &= ~(1 << > smu->workload[type].priority); > > I guess, the count was not there because there was no usage of multiple > clients preferring the same profile at the same time. Now that there is > a case for this, fix it where required rather than keeping track of > it at the client layer.
> > Thanks, > Lijo > >> >> - Shashank >> >>> Thanks, >>> Lijo >>> >>>> - The amdgpu_context_workload layer will keep the ref_counting and >>>> user_workload_hint management, and it will just call and consume the >>>> pm_switch_workload profile() like any other client. >>> >>>> - We will add a force flag for calls coming from sysfs() interface, >>>> and it will take the highest priority. No state machine will be >>>> managed for sysfs, and it will work as it is working today. >>>> >>>> - Shashank >>>> >>>>> >>>>> Thanks, >>>>> Lijo >>>>> >>>>>> >>>>>> Alex >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Lijo >>>>>>> >>>>>>>> Now, when we have a single workload: >>>>>>>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>>>>>>> -> driver sends flags for p1 >>>>>>>> -> PM FW applies profile P1 >>>>>>>> -> Job executes in profile P1 >>>>>>>> -> Job goes to reset function, ref_count = 0, >>>>>>>> -> Power profile resets >>>>>>>> >>>>>>>> Now, we have conflicts only when we see multiple workloads (Job1 >>>>>>>> and Job 2) >>>>>>>> -> Job1: requests profile P1 via UAPI, ref count = 1 >>>>>>>> -> driver sends flags for p1 >>>>>>>> -> PM FW applies profile P1 >>>>>>>> -> Job executes in profile P1 >>>>>>>> -> Job2: requests profile P2 via UAPI, refcount = 2 >>>>>>>> -> driver sends flags for (P1|P2) >>>>>>>> -> PM FW picks the more aggressive of the two (Say P1, stays in P1) >>>>>>>> -> Job1 goes to reset function, ref_count = 1, job1 does not >>>>>>>> reset power >>>>>>>> profile >>>>>>>> -> Job2 goes to reset function, ref_counter = 2, job 2 resets >>>>>>>> Power profile >>>>>>>> -> Power profile resets to None >>>>>>>> >>>>>>>> So this state machine looks like if there is only 1 job, it will be >>>>>>>> executed in desired mode. But if there are multiple, the most >>>>>>>> aggressive >>>>>>>> profile will be picked, and every job will be executed in >>>>>>>> atleast the >>>>>>>> requested power profile mode or higher. >>>>>>>> >>>>>>>> Do you find any problem so far ? 
>>>>>>>> >>>>>>>> - Shashank >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Lijo ^ permalink raw reply [flat|nested] 76+ messages in thread
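A minimal standalone sketch of the refcounted workload mask Lijo describes above — requests counted per profile type, with the mask bit set by the first requester and cleared only by the last. The structure and function names (`workload_state`, `workload_get`, `workload_put`) are illustrative only and do not match the real `smu_context` layout:

```c
#include <assert.h>

#define PROFILE_COUNT 7 /* DEFAULT, 3D, POWERSAVING, VIDEO, VR, COMPUTE, CUSTOM */

struct workload_state {
	unsigned int refcount[PROFILE_COUNT]; /* active requests per profile */
	unsigned int priority[PROFILE_COUNT]; /* bit position used in the mask */
	unsigned int mask;                    /* what would go to PMFW via SetWorkloadMask */
};

/* A client (job, KFD process, sysfs, ...) requests a profile. */
static void workload_get(struct workload_state *ws, int type)
{
	if (ws->refcount[type]++ == 0)          /* first requester sets the bit */
		ws->mask |= 1u << ws->priority[type];
}

/* A client drops its request; the bit clears only when the last one leaves. */
static void workload_put(struct workload_state *ws, int type)
{
	assert(ws->refcount[type] > 0);
	if (--ws->refcount[type] == 0)          /* last requester clears the bit */
		ws->mask &= ~(1u << ws->priority[type]);
}
```

With this, two compute clients plus one video client leave both bits set, and dropping the video request does not disturb the compute bit — the multi-client behavior being debated in the thread.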
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-29 11:10 ` Lazar, Lijo 2022-09-29 13:20 ` Sharma, Shashank @ 2022-09-29 18:07 ` Felix Kuehling 2022-09-30 4:46 ` Lazar, Lijo 1 sibling, 1 reply; 76+ messages in thread From: Felix Kuehling @ 2022-09-29 18:07 UTC (permalink / raw) To: Lazar, Lijo, Sharma, Shashank, Alex Deucher Cc: alexander.deucher, amaranath.somalapuram, Christian König, amd-gfx On 2022-09-29 07:10, Lazar, Lijo wrote: > > > On 9/29/2022 2:18 PM, Sharma, Shashank wrote: >> >> >> On 9/28/2022 11:51 PM, Alex Deucher wrote: >>> On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank >>> <shashank.sharma@amd.com> wrote: >>>> >>>> >>>> >>>> On 9/27/2022 10:40 PM, Alex Deucher wrote: >>>>> On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank >>>>> <shashank.sharma@amd.com> wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 9/27/2022 5:23 PM, Felix Kuehling wrote: >>>>>>> Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: >>>>>>>> Hello Felix, >>>>>>>> >>>>>>>> Thank for the review comments. >>>>>>>> >>>>>>>> On 9/27/2022 4:48 PM, Felix Kuehling wrote: >>>>>>>>> Am 2022-09-27 um 02:12 schrieb Christian König: >>>>>>>>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: >>>>>>>>>>> This patch switches the GPU workload mode to/from >>>>>>>>>>> compute mode, while submitting compute workload. >>>>>>>>>>> >>>>>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>>>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>>>>> >>>>>>>>>> Feel free to add my acked-by, but Felix should probably take >>>>>>>>>> a look >>>>>>>>>> as well. >>>>>>>>> >>>>>>>>> This look OK purely from a compute perspective. But I'm concerned >>>>>>>>> about the interaction of compute with graphics or multiple >>>>>>>>> graphics >>>>>>>>> contexts submitting work concurrently. They would constantly >>>>>>>>> override >>>>>>>>> or disable each other's workload hints. 
>>>>>>>>> >>>>>>>>> For example, you have an amdgpu_ctx with >>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD >>>>>>>>> process that also wants the compute profile. Those could be >>>>>>>>> different >>>>>>>>> processes belonging to different users. Say, KFD enables the >>>>>>>>> compute >>>>>>>>> profile first. Then the graphics context submits a job. At the >>>>>>>>> start >>>>>>>>> of the job, the compute profile is enabled. That's a no-op >>>>>>>>> because >>>>>>>>> KFD already enabled the compute profile. When the job >>>>>>>>> finishes, it >>>>>>>>> disables the compute profile for everyone, including KFD. That's >>>>>>>>> unexpected. >>>>>>>>> >>>>>>>> >>>>>>>> In this case, it will not disable the compute profile, as the >>>>>>>> reference counter will not be zero. The reset_profile() will >>>>>>>> only act >>>>>>>> if the reference counter is 0. >>>>>>> >>>>>>> OK, I missed the reference counter. >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> But I would be happy to get any inputs about a policy which can be >>>>>>>> more sustainable and gets better outputs, for example: >>>>>>>> - should we not allow a profile change, if a PP mode is already >>>>>>>> applied and keep it Early bird basis ? >>>>>>>> >>>>>>>> For example: Policy A >>>>>>>> - Job A sets the profile to compute >>>>>>>> - Job B tries to set profile to 3D, but we do not allow it as >>>>>>>> job A is >>>>>>>> not finished it yet. >>>>>>>> >>>>>>>> Or Policy B: Current one >>>>>>>> - Job A sets the profile to compute >>>>>>>> - Job B tries to set profile to 3D, and we allow it. Job A also >>>>>>>> runs >>>>>>>> in PP 3D >>>>>>>> - Job B finishes, but does not reset PP as reference count is >>>>>>>> not zero >>>>>>>> due to compute >>>>>>>> - Job A finishes, profile reset to NONE >>>>>>> >>>>>>> I think this won't work. As I understand it, the >>>>>>> amdgpu_dpm_switch_power_profile enables and disables individual >>>>>>> profiles. 
Disabling the 3D profile doesn't disable the compute >>>>>>> profile >>>>>>> at the same time. I think you'll need one refcount per profile. >>>>>>> >>>>>>> Regards, >>>>>>> Felix >>>>>> >>>>>> Thanks, This is exactly what I was looking for, I think Alex's >>>>>> initial >>>>>> idea was around it, but I was under the assumption that there is >>>>>> only >>>>>> one HW profile in SMU which keeps on getting overwritten. This >>>>>> can solve >>>>>> our problems, as I can create an array of reference counters, and >>>>>> will >>>>>> disable only the profile whose reference counter goes 0. >>>>> >>>>> It's been a while since I paged any of this code into my head, but I >>>>> believe the actual workload message in the SMU is a mask where you >>>>> can >>>>> specify multiple workload types at the same time and the SMU will >>>>> arbitrate between them internally. E.g., the most aggressive one >>>>> will >>>>> be selected out of the ones specified. I think in the driver we just >>>>> set one bit at a time using the current interface. It might be >>>>> better >>>>> to change the interface and just ref count the hint types and then >>>>> when we call the set function look at the ref counts for each hint >>>>> type and set the mask as appropriate. >>>>> >>>>> Alex >>>>> >>>> >>>> Hey Alex, >>>> Thanks for your comment, if that is the case, this current patch >>>> series >>>> works straight forward, and no changes would be required. 
Please >>>> let me >>>> know if my understanding is correct: >>>> >>>> Assumption: Order of aggression: 3D > Media > Compute >>>> >>>> - Job 1: Requests mode compute: PP changed to compute, ref count 1 >>>> - Job 2: Requests mode media: PP changed to media, ref count 2 >>>> - Job 3: requests mode 3D: PP changed to 3D, ref count 3 >>>> - Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref >>>> > 0, >>>> PP still 3D >>>> - Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref >>>> > 0, >>>> PP still 3D >>>> - Job 2 finishes, downs ref count to 0, PP changed to NONE, >>>> >>>> In this way, every job will be operating in the Power profile of >>>> desired >>>> aggression or higher, and this API guarantees the execution >>>> at-least in >>>> the desired power profile. >>> >>> I'm not entirely sure on the relative levels of aggression, but I >>> believe the SMU priorities them by index. E.g. >>> #define WORKLOAD_PPLIB_DEFAULT_BIT 0 >>> #define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1 >>> #define WORKLOAD_PPLIB_POWER_SAVING_BIT 2 >>> #define WORKLOAD_PPLIB_VIDEO_BIT 3 >>> #define WORKLOAD_PPLIB_VR_BIT 4 >>> #define WORKLOAD_PPLIB_COMPUTE_BIT 5 >>> #define WORKLOAD_PPLIB_CUSTOM_BIT 6 >>> >>> 3D < video < VR < compute < custom >>> >>> VR and compute are the most aggressive. Custom takes preference >>> because it's user customizable. >>> >>> Alex >>> >> >> Thanks, so this UAPI will guarantee the execution of the job in >> atleast the requested power profile, or a more aggressive one. >> > > Hi Shashank, > > This is not how the API works in the driver PM subsystem. In the final > interface with PMFW, driver sets only one profile bit and doesn't set > any mask. So it doesn't work the way as Felix explained. I was not looking at the implementation but at the API: int amdgpu_dpm_switch_power_profile(struct amdgpu_device *adev, enum PP_SMC_POWER_PROFILE type, bool en) This API suggests, that we can enable and disable individual profiles. E.g. 
disabling PP_SMC_POWER_PROFILE_VIDEO should not change whether PP_SMC_POWER_PROFILE_COMPUTE is enabled. What we actually send to the HW when multiple profiles are enabled through this API is a different question. We have to choose one profile or the other. This can happen in the driver or the firmware. I don't care. But if disabling PP_SMC_POWER_PROFILE_VIDEO makes us forget that we ever enabled PP_SMC_POWER_PROFILE_COMPUTE then this API is broken and useless as an abstraction. Regards, Felix > If there is more than one profile bit set, PMFW looks at the mask and > picks the one with the highest priority. Note that for each update of > workload mask, PMFW should get a message. > > Driver currently sets only one bit as Alex explained earlier. For our > current driver implementation, you can check this as an example - > > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c#L1753 > > > Also, PM layer already stores the current workload profile for a *get* > API (which also means a new pm workload variable is not needed). But, > that API works as long as driver sets only one profile bit, that way > driver is sure of the current profile mode - > > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c#L1628 > > > When there is more than one, driver is not sure of the internal > priority of PMFW though we can follow the bit order which Alex > suggested (but sometimes FW carries some workarounds inside which means > it doesn't necessarily follow the same order). > > There is an existing interface through sysfs which allows changing > the profile mode and adding custom settings. In summary, any > handling of the change from single bit to mask needs to be done at the > lower layer. > > The problem is this behavior has been there throughout all legacy > ASICs. Not sure how much effort it takes and what all needs to be > modified. 
> > Thanks, > Lijo > >> I will do the one change required and send the updated one. >> >> - Shashank >> >>> >>> >>> >>>> >>>> - Shashank >>>> >>>>> >>>>>> >>>>>> - Shashank >>>>>> >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Or anything else ? >>>>>>>> >>>>>>>> REgards >>>>>>>> Shashank >>>>>>>> >>>>>>>> >>>>>>>>> Or you have multiple VCN contexts. When context1 finishes a >>>>>>>>> job, it >>>>>>>>> disables the VIDEO profile. But context2 still has a job on >>>>>>>>> the other >>>>>>>>> VCN engine and wants the VIDEO profile to still be enabled. >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Felix >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Christian. >>>>>>>>>> >>>>>>>>>>> --- >>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 >>>>>>>>>>> +++++++++++--- >>>>>>>>>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>>>>>>>>> >>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>> index 5e53a5293935..1caed319a448 100644 >>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>> @@ -34,6 +34,7 @@ >>>>>>>>>>> #include "amdgpu_ras.h" >>>>>>>>>>> #include "amdgpu_umc.h" >>>>>>>>>>> #include "amdgpu_reset.h" >>>>>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>>>>> /* Total memory size in system memory and all GPU >>>>>>>>>>> VRAM. 
Used to >>>>>>>>>>> * estimate worst case amount of memory to reserve for >>>>>>>>>>> page tables >>>>>>>>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct >>>>>>>>>>> amdgpu_device *adev, >>>>>>>>>>> void amdgpu_amdkfd_set_compute_idle(struct >>>>>>>>>>> amdgpu_device *adev, >>>>>>>>>>> bool idle) >>>>>>>>>>> { >>>>>>>>>>> - amdgpu_dpm_switch_power_profile(adev, >>>>>>>>>>> - PP_SMC_POWER_PROFILE_COMPUTE, >>>>>>>>>>> - !idle); >>>>>>>>>>> + int ret; >>>>>>>>>>> + >>>>>>>>>>> + if (idle) >>>>>>>>>>> + ret = amdgpu_clear_workload_profile(adev, >>>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>>>> + else >>>>>>>>>>> + ret = amdgpu_set_workload_profile(adev, >>>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>>>> + >>>>>>>>>>> + if (ret) >>>>>>>>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to >>>>>>>>>>> compute mode\n", >>>>>>>>>>> + idle ? "reset" : "set"); >>>>>>>>>>> } >>>>>>>>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device >>>>>>>>>>> *adev, u32 >>>>>>>>>>> vmid) >>>>>>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
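One way to picture the arbitration Felix and Alex discuss — several profiles enabled at once, one winner chosen by bit order — is this toy selector. The `WL_*` names mirror the `WORKLOAD_PPLIB_*` bit order quoted in the thread; the function itself is a hypothetical sketch, not actual driver code:

```c
/* Bit numbers per the WORKLOAD_PPLIB_* order quoted above: higher index wins. */
enum {
	WL_DEFAULT = 0,
	WL_FULLSCREEN_3D,
	WL_POWER_SAVING,
	WL_VIDEO,
	WL_VR,
	WL_COMPUTE,
	WL_CUSTOM,
	WL_COUNT
};

/*
 * Given a mask of profiles currently requested (refcount > 0), pick the
 * single profile to activate. This models the "highest-priority bit wins"
 * rule, whether it runs in the driver or inside PMFW.
 */
static int pick_active_profile(unsigned int mask)
{
	int best = -1;

	for (int bit = 0; bit < WL_COUNT; bit++)
		if (mask & (1u << bit))
			best = bit; /* later (higher) bits override earlier ones */

	return best; /* -1: nothing requested, fall back to the default profile */
}
```

So with VIDEO and COMPUTE both requested, COMPUTE is chosen, and dropping the VIDEO request afterwards leaves the choice unchanged — the behavior Felix expects from the abstraction.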
* Re: [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute 2022-09-29 18:07 ` Felix Kuehling @ 2022-09-30 4:46 ` Lazar, Lijo 0 siblings, 0 replies; 76+ messages in thread From: Lazar, Lijo @ 2022-09-30 4:46 UTC (permalink / raw) To: Felix Kuehling, Sharma, Shashank, Alex Deucher Cc: alexander.deucher, amaranath.somalapuram, Christian König, amd-gfx On 9/29/2022 11:37 PM, Felix Kuehling wrote: > On 2022-09-29 07:10, Lazar, Lijo wrote: >> >> >> On 9/29/2022 2:18 PM, Sharma, Shashank wrote: >>> >>> >>> On 9/28/2022 11:51 PM, Alex Deucher wrote: >>>> On Wed, Sep 28, 2022 at 4:57 AM Sharma, Shashank >>>> <shashank.sharma@amd.com> wrote: >>>>> >>>>> >>>>> >>>>> On 9/27/2022 10:40 PM, Alex Deucher wrote: >>>>>> On Tue, Sep 27, 2022 at 11:38 AM Sharma, Shashank >>>>>> <shashank.sharma@amd.com> wrote: >>>>>>> >>>>>>> >>>>>>> >>>>>>> On 9/27/2022 5:23 PM, Felix Kuehling wrote: >>>>>>>> Am 2022-09-27 um 10:58 schrieb Sharma, Shashank: >>>>>>>>> Hello Felix, >>>>>>>>> >>>>>>>>> Thank for the review comments. >>>>>>>>> >>>>>>>>> On 9/27/2022 4:48 PM, Felix Kuehling wrote: >>>>>>>>>> Am 2022-09-27 um 02:12 schrieb Christian König: >>>>>>>>>>> Am 26.09.22 um 23:40 schrieb Shashank Sharma: >>>>>>>>>>>> This patch switches the GPU workload mode to/from >>>>>>>>>>>> compute mode, while submitting compute workload. >>>>>>>>>>>> >>>>>>>>>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> >>>>>>>>>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> >>>>>>>>>>> >>>>>>>>>>> Feel free to add my acked-by, but Felix should probably take >>>>>>>>>>> a look >>>>>>>>>>> as well. >>>>>>>>>> >>>>>>>>>> This look OK purely from a compute perspective. But I'm concerned >>>>>>>>>> about the interaction of compute with graphics or multiple >>>>>>>>>> graphics >>>>>>>>>> contexts submitting work concurrently. They would constantly >>>>>>>>>> override >>>>>>>>>> or disable each other's workload hints. 
>>>>>>>>>> >>>>>>>>>> For example, you have an amdgpu_ctx with >>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (maybe Vulkan compute) and a KFD >>>>>>>>>> process that also wants the compute profile. Those could be >>>>>>>>>> different >>>>>>>>>> processes belonging to different users. Say, KFD enables the >>>>>>>>>> compute >>>>>>>>>> profile first. Then the graphics context submits a job. At the >>>>>>>>>> start >>>>>>>>>> of the job, the compute profile is enabled. That's a no-op >>>>>>>>>> because >>>>>>>>>> KFD already enabled the compute profile. When the job >>>>>>>>>> finishes, it >>>>>>>>>> disables the compute profile for everyone, including KFD. That's >>>>>>>>>> unexpected. >>>>>>>>>> >>>>>>>>> >>>>>>>>> In this case, it will not disable the compute profile, as the >>>>>>>>> reference counter will not be zero. The reset_profile() will >>>>>>>>> only act >>>>>>>>> if the reference counter is 0. >>>>>>>> >>>>>>>> OK, I missed the reference counter. >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> But I would be happy to get any inputs about a policy which can be >>>>>>>>> more sustainable and gets better outputs, for example: >>>>>>>>> - should we not allow a profile change, if a PP mode is already >>>>>>>>> applied and keep it Early bird basis ? >>>>>>>>> >>>>>>>>> For example: Policy A >>>>>>>>> - Job A sets the profile to compute >>>>>>>>> - Job B tries to set profile to 3D, but we do not allow it as >>>>>>>>> job A is >>>>>>>>> not finished it yet. >>>>>>>>> >>>>>>>>> Or Policy B: Current one >>>>>>>>> - Job A sets the profile to compute >>>>>>>>> - Job B tries to set profile to 3D, and we allow it. Job A also >>>>>>>>> runs >>>>>>>>> in PP 3D >>>>>>>>> - Job B finishes, but does not reset PP as reference count is >>>>>>>>> not zero >>>>>>>>> due to compute >>>>>>>>> - Job A finishes, profile reset to NONE >>>>>>>> >>>>>>>> I think this won't work. As I understand it, the >>>>>>>> amdgpu_dpm_switch_power_profile enables and disables individual >>>>>>>> profiles. 
Disabling the 3D profile doesn't disable the compute >>>>>>>> profile >>>>>>>> at the same time. I think you'll need one refcount per profile. >>>>>>>> >>>>>>>> Regards, >>>>>>>> Felix >>>>>>> >>>>>>> Thanks, This is exactly what I was looking for, I think Alex's >>>>>>> initial >>>>>>> idea was around it, but I was under the assumption that there is >>>>>>> only >>>>>>> one HW profile in SMU which keeps on getting overwritten. This >>>>>>> can solve >>>>>>> our problems, as I can create an array of reference counters, and >>>>>>> will >>>>>>> disable only the profile whose reference counter goes 0. >>>>>> >>>>>> It's been a while since I paged any of this code into my head, but I >>>>>> believe the actual workload message in the SMU is a mask where you >>>>>> can >>>>>> specify multiple workload types at the same time and the SMU will >>>>>> arbitrate between them internally. E.g., the most aggressive one >>>>>> will >>>>>> be selected out of the ones specified. I think in the driver we just >>>>>> set one bit at a time using the current interface. It might be >>>>>> better >>>>>> to change the interface and just ref count the hint types and then >>>>>> when we call the set function look at the ref counts for each hint >>>>>> type and set the mask as appropriate. >>>>>> >>>>>> Alex >>>>>> >>>>> >>>>> Hey Alex, >>>>> Thanks for your comment, if that is the case, this current patch >>>>> series >>>>> works straight forward, and no changes would be required. 
Please >>>>> let me >>>>> know if my understanding is correct: >>>>> >>>>> Assumption: Order of aggression: 3D > Media > Compute >>>>> >>>>> - Job 1: Requests mode compute: PP changed to compute, ref count 1 >>>>> - Job 2: Requests mode media: PP changed to media, ref count 2 >>>>> - Job 3: requests mode 3D: PP changed to 3D, ref count 3 >>>>> - Job 1 finishes, downs ref count to 2, doesn't reset the PP as ref >>>>> > 0, >>>>> PP still 3D >>>>> - Job 3 finishes, downs ref count to 1, doesn't reset the PP as ref >>>>> > 0, >>>>> PP still 3D >>>>> - Job 2 finishes, downs ref count to 0, PP changed to NONE, >>>>> >>>>> In this way, every job will be operating in the Power profile of >>>>> desired >>>>> aggression or higher, and this API guarantees the execution >>>>> at-least in >>>>> the desired power profile. >>>> >>>> I'm not entirely sure on the relative levels of aggression, but I >>>> believe the SMU priorities them by index. E.g. >>>> #define WORKLOAD_PPLIB_DEFAULT_BIT 0 >>>> #define WORKLOAD_PPLIB_FULL_SCREEN_3D_BIT 1 >>>> #define WORKLOAD_PPLIB_POWER_SAVING_BIT 2 >>>> #define WORKLOAD_PPLIB_VIDEO_BIT 3 >>>> #define WORKLOAD_PPLIB_VR_BIT 4 >>>> #define WORKLOAD_PPLIB_COMPUTE_BIT 5 >>>> #define WORKLOAD_PPLIB_CUSTOM_BIT 6 >>>> >>>> 3D < video < VR < compute < custom >>>> >>>> VR and compute are the most aggressive. Custom takes preference >>>> because it's user customizable. >>>> >>>> Alex >>>> >>> >>> Thanks, so this UAPI will guarantee the execution of the job in >>> atleast the requested power profile, or a more aggressive one. >>> >> >> Hi Shashank, >> >> This is not how the API works in the driver PM subsystem. In the final >> interface with PMFW, driver sets only one profile bit and doesn't set >> any mask. So it doesn't work the way as Felix explained. 
> > I was not looking at the implementation but at the API: > > int amdgpu_dpm_switch_power_profile(struct amdgpu_device *adev, > enum PP_SMC_POWER_PROFILE type, > bool en) > > This API suggests, that we can enable and disable individual profiles. > E.g. disabling PP_SMC_POWER_PROFILE_VIDEO should not change whether > PP_SMC_POWER_PROFILE_COMPUTE is enabled. What we actually send to the HW > when multiple profiles are enabled through this API is a different > question. We have to choose one profile or the other. This can happen in > the driver or the firmware. I don't care. > > But if disabling PP_SMC_POWER_PROFILE_VIDEO makes us forget that we ever > enabled PP_SMC_POWER_PROFILE_COMPUTE then this API is broken and useless > as an abstraction. > Checked again. Here driver decides the priority instead of FW. So the API works as you mentioned (except that there is no refcount done). Sorry for the confusion. Thanks, Lijo > Regards, > Felix > > >> If there is more than one profile bit set, PMFW looks at the mask and >> picks the one with the highest priority. Note that for each update of >> workload mask, PMFW should get a message. >> >> Driver currently sets only bit as Alex explained earlier. For our >> current driver implementation, you can check this as example - >> >> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c#L1753 >> >> >> Also, PM layer already stores the current workload profile for a *get* >> API (which also means a new pm workload variable is not needed). 
But, >> that API works as long as driver sets only one profile bit, that way >> driver is sure of the current profile mode - >> >> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c#L1628 >> >> >> When there is more than one, driver is not sure of the internal >> priority of PMFW though we can follow the bit order which Alex >> suggested (but sometimes FW carry some workarounds inside which means >> it doesn't necessarily follow the same order). >> >> There is an existing interface through sysfs through which allow to >> change the profile mode and add custom settings. In summary, any >> handling of change from single bit to mask needs to be done at the >> lower layer. >> >> The problem is this behavior has been there throughout all legacy >> ASICs. Not sure how much of effort it takes and what all needs to be >> modified. >> >> Thanks, >> Lijo >> >>> I will do the one change required and send the updated one. >>> >>> - Shashank >>> >>>> >>>> >>>> >>>>> >>>>> - Shashank >>>>> >>>>>> >>>>>>> >>>>>>> - Shashank >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> Or anything else ? >>>>>>>>> >>>>>>>>> REgards >>>>>>>>> Shashank >>>>>>>>> >>>>>>>>> >>>>>>>>>> Or you have multiple VCN contexts. When context1 finishes a >>>>>>>>>> job, it >>>>>>>>>> disables the VIDEO profile. But context2 still has a job on >>>>>>>>>> the other >>>>>>>>>> VCN engine and wants the VIDEO profile to still be enabled. >>>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> Felix >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Christian. 
>>>>>>>>>>> >>>>>>>>>>>> --- >>>>>>>>>>>> drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 14 >>>>>>>>>>>> +++++++++++--- >>>>>>>>>>>> 1 file changed, 11 insertions(+), 3 deletions(-) >>>>>>>>>>>> >>>>>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>>> index 5e53a5293935..1caed319a448 100644 >>>>>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c >>>>>>>>>>>> @@ -34,6 +34,7 @@ >>>>>>>>>>>> #include "amdgpu_ras.h" >>>>>>>>>>>> #include "amdgpu_umc.h" >>>>>>>>>>>> #include "amdgpu_reset.h" >>>>>>>>>>>> +#include "amdgpu_ctx_workload.h" >>>>>>>>>>>> /* Total memory size in system memory and all GPU >>>>>>>>>>>> VRAM. Used to >>>>>>>>>>>> * estimate worst case amount of memory to reserve for >>>>>>>>>>>> page tables >>>>>>>>>>>> @@ -703,9 +704,16 @@ int amdgpu_amdkfd_submit_ib(struct >>>>>>>>>>>> amdgpu_device *adev, >>>>>>>>>>>> void amdgpu_amdkfd_set_compute_idle(struct >>>>>>>>>>>> amdgpu_device *adev, >>>>>>>>>>>> bool idle) >>>>>>>>>>>> { >>>>>>>>>>>> - amdgpu_dpm_switch_power_profile(adev, >>>>>>>>>>>> - PP_SMC_POWER_PROFILE_COMPUTE, >>>>>>>>>>>> - !idle); >>>>>>>>>>>> + int ret; >>>>>>>>>>>> + >>>>>>>>>>>> + if (idle) >>>>>>>>>>>> + ret = amdgpu_clear_workload_profile(adev, >>>>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>>>>> + else >>>>>>>>>>>> + ret = amdgpu_set_workload_profile(adev, >>>>>>>>>>>> AMDGPU_CTX_WORKLOAD_HINT_COMPUTE); >>>>>>>>>>>> + >>>>>>>>>>>> + if (ret) >>>>>>>>>>>> + drm_warn(&adev->ddev, "Failed to %s power profile to >>>>>>>>>>>> compute mode\n", >>>>>>>>>>>> + idle ? "reset" : "set"); >>>>>>>>>>>> } >>>>>>>>>>>> bool amdgpu_amdkfd_is_kfd_vmid(struct amdgpu_device >>>>>>>>>>>> *adev, u32 >>>>>>>>>>>> vmid) >>>>>>>>>>> ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 0/5] GPU workload hints for better performance 2022-09-26 21:40 [PATCH v3 0/5] GPU workload hints for better performance Shashank Sharma ` (4 preceding siblings ...) 2022-09-26 21:40 ` [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute Shashank Sharma @ 2022-09-27 16:24 ` Michel Dänzer 2022-09-27 16:59 ` Sharma, Shashank 5 siblings, 1 reply; 76+ messages in thread From: Michel Dänzer @ 2022-09-27 16:24 UTC (permalink / raw) To: Shashank Sharma Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, amd-gfx On 2022-09-26 23:40, Shashank Sharma wrote: > AMDGPU SOCs support dynamic workload based power profiles, which can > provide fine-tuned performance for a particular type of workload. > This patch series adds an interface to set/reset these power profiles > based on the workload type hints. A user can set a hint of workload > type being submitted to GPU, and the driver can dynamically switch > the power profiles which is best suited to this kind of workload. > > Currently supported workload profiles are: > "None", "3D", "Video", "VR", "Compute" > > V2: This version addresses the review comment from Christian about > changing the design to set workload mode in a more dynamic method > than during the context creation. > > V3: Addressed review comment from Christian, removed the get_workload() > calls from UAPI, keeping only the set_workload() call. > > Shashank Sharma (5): > drm/amdgpu: add UAPI for workload hints to ctx ioctl > drm/amdgpu: add new functions to set GPU power profile > drm/amdgpu: set GPU workload via ctx IOCTL > drm/amdgpu: switch GPU workload profile > drm/amdgpu: switch workload context to/from compute Where are the corresponding Mesa changes? -- Earthling Michel Dänzer | https://redhat.com Libre software enthusiast | Mesa and Xwayland developer ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 0/5] GPU workload hints for better performance 2022-09-27 16:24 ` [PATCH v3 0/5] GPU workload hints for better performance Michel Dänzer @ 2022-09-27 16:59 ` Sharma, Shashank 2022-09-27 17:13 ` Michel Dänzer 0 siblings, 1 reply; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 16:59 UTC (permalink / raw) To: Michel Dänzer Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, amd-gfx Hey Michel, Thanks for the review comments. On 9/27/2022 6:24 PM, Michel Dänzer wrote: > On 2022-09-26 23:40, Shashank Sharma wrote: >> AMDGPU SOCs support dynamic workload based power profiles, which can >> provide fine-tuned performance for a particular type of workload. >> This patch series adds an interface to set/reset these power profiles >> based on the workload type hints. A user can set a hint of workload >> type being submitted to GPU, and the driver can dynamically switch >> the power profiles which is best suited to this kind of workload. >> >> Currently supported workload profiles are: >> "None", "3D", "Video", "VR", "Compute" >> >> V2: This version addresses the review comment from Christian about >> changing the design to set workload mode in a more dynamic method >> than during the context creation. >> >> V3: Addressed review comment from Christian, removed the get_workload() >> calls from UAPI, keeping only the set_workload() call. >> >> Shashank Sharma (5): >> drm/amdgpu: add UAPI for workload hints to ctx ioctl >> drm/amdgpu: add new functions to set GPU power profile >> drm/amdgpu: set GPU workload via ctx IOCTL >> drm/amdgpu: switch GPU workload profile >> drm/amdgpu: switch workload context to/from compute > > Where are the corresponding Mesa changes? > > This series here was to get the feedback on the kernel side design first. As you can see from the patch history, we have already changed the design once and this is V2. 
So I thought it would be a good idea to get the feedback on kernel UAPI, before starting to send patches to mesa. The mesa/libdrm changes are ready and I was using those mixed with libdrm/test/amdgpu stuff to validate the series. Now I will fine-tune them to match the feedback here, and send the updated series. - Shashank ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 0/5] GPU workload hints for better performance 2022-09-27 16:59 ` Sharma, Shashank @ 2022-09-27 17:13 ` Michel Dänzer 2022-09-27 17:25 ` Sharma, Shashank 0 siblings, 1 reply; 76+ messages in thread From: Michel Dänzer @ 2022-09-27 17:13 UTC (permalink / raw) To: Sharma, Shashank Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, amd-gfx On 2022-09-27 18:59, Sharma, Shashank wrote: > Hey Michel, > Thanks for the review comments. > > On 9/27/2022 6:24 PM, Michel Dänzer wrote: >> On 2022-09-26 23:40, Shashank Sharma wrote: >>> AMDGPU SOCs support dynamic workload-based power profiles, which can >>> provide fine-tuned performance for a particular type of workload. >>> This patch series adds an interface to set/reset these power profiles >>> based on the workload type hints. A user can set a hint of the workload >>> type being submitted to the GPU, and the driver can dynamically switch >>> to the power profile best suited to this kind of workload. >>> >>> Currently supported workload profiles are: >>> "None", "3D", "Video", "VR", "Compute" >>> >>> V2: This version addresses the review comment from Christian about >>> changing the design to set the workload mode in a more dynamic method >>> than during context creation. >>> >>> V3: Addressed review comment from Christian: removed the get_workload() >>> calls from the UAPI, keeping only the set_workload() call. >>> >>> Shashank Sharma (5): >>> drm/amdgpu: add UAPI for workload hints to ctx ioctl >>> drm/amdgpu: add new functions to set GPU power profile >>> drm/amdgpu: set GPU workload via ctx IOCTL >>> drm/amdgpu: switch GPU workload profile >>> drm/amdgpu: switch workload context to/from compute >> >> Where are the corresponding Mesa changes? >> >> > This series here was to get feedback on the kernel-side design first. As you can see from the patch history, we have already changed the design once and this is V2. 
> So I thought it would be a good idea to get feedback on the kernel UAPI before sending patches to Mesa. In general, it's not possible to review UAPI without the corresponding user-space code. I don't think this is an exception. -- Earthling Michel Dänzer | https://redhat.com Libre software enthusiast | Mesa and Xwayland developer ^ permalink raw reply [flat|nested] 76+ messages in thread
* Re: [PATCH v3 0/5] GPU workload hints for better performance 2022-09-27 17:13 ` Michel Dänzer @ 2022-09-27 17:25 ` Sharma, Shashank 0 siblings, 0 replies; 76+ messages in thread From: Sharma, Shashank @ 2022-09-27 17:25 UTC (permalink / raw) To: Michel Dänzer Cc: alexander.deucher, amaranath.somalapuram, christian.koenig, amd-gfx On 9/27/2022 7:13 PM, Michel Dänzer wrote: > On 2022-09-27 18:59, Sharma, Shashank wrote: >> Hey Michel, >> Thanks for the review comments. >> >> On 9/27/2022 6:24 PM, Michel Dänzer wrote: >>> On 2022-09-26 23:40, Shashank Sharma wrote: >>>> AMDGPU SOCs support dynamic workload-based power profiles, which can >>>> provide fine-tuned performance for a particular type of workload. >>>> This patch series adds an interface to set/reset these power profiles >>>> based on the workload type hints. A user can set a hint of the workload >>>> type being submitted to the GPU, and the driver can dynamically switch >>>> to the power profile best suited to this kind of workload. >>>> >>>> Currently supported workload profiles are: >>>> "None", "3D", "Video", "VR", "Compute" >>>> >>>> V2: This version addresses the review comment from Christian about >>>> changing the design to set the workload mode in a more dynamic method >>>> than during context creation. >>>> >>>> V3: Addressed review comment from Christian: removed the get_workload() >>>> calls from the UAPI, keeping only the set_workload() call. >>>> >>>> Shashank Sharma (5): >>>> drm/amdgpu: add UAPI for workload hints to ctx ioctl >>>> drm/amdgpu: add new functions to set GPU power profile >>>> drm/amdgpu: set GPU workload via ctx IOCTL >>>> drm/amdgpu: switch GPU workload profile >>>> drm/amdgpu: switch workload context to/from compute >>> >>> Where are the corresponding Mesa changes? >>> >>> >> This series here was to get feedback on the kernel-side design first. As you can see from the patch history, we have already changed the design once and this is V2. 
>> So I thought it would be a good idea to get feedback on the kernel UAPI before sending patches to Mesa. > > In general, it's not possible to review UAPI without the corresponding user-space code. I don't think this is an exception. > > Sure. Good that we have already got the kernel inputs we wanted; the next version will come with the corresponding Mesa changes. - Shashank ^ permalink raw reply [flat|nested] 76+ messages in thread
end of thread, other threads:[~2023-03-22 15:12 UTC | newest] Thread overview: 76+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-09-26 21:40 [PATCH v3 0/5] GPU workload hints for better performance Shashank Sharma 2022-09-26 21:40 ` [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl Shashank Sharma 2022-09-27 6:07 ` Christian König 2022-09-27 14:28 ` Felix Kuehling 2023-03-21 3:05 ` Marek Olšák 2023-03-21 13:00 ` Sharma, Shashank 2023-03-21 13:54 ` Christian König 2023-03-22 14:05 ` Marek Olšák 2023-03-22 14:08 ` Christian König 2023-03-22 14:24 ` Marek Olšák 2023-03-22 14:29 ` Christian König 2023-03-22 14:36 ` Marek Olšák 2023-03-22 14:52 ` Alex Deucher 2023-03-22 15:11 ` Marek Olšák 2023-03-22 14:38 ` Sharma, Shashank 2022-09-26 21:40 ` [PATCH v3 2/5] drm/amdgpu: add new functions to set GPU power profile Shashank Sharma 2022-09-27 2:14 ` Quan, Evan 2022-09-27 7:29 ` Sharma, Shashank 2022-09-27 9:29 ` Quan, Evan 2022-09-27 10:00 ` Sharma, Shashank 2022-09-27 6:08 ` Christian König 2022-09-27 9:58 ` Lazar, Lijo 2022-09-27 11:41 ` Sharma, Shashank 2022-09-27 12:10 ` Lazar, Lijo 2022-09-27 12:23 ` Sharma, Shashank 2022-09-27 12:39 ` Lazar, Lijo 2022-09-27 12:53 ` Sharma, Shashank 2022-09-27 13:29 ` Lazar, Lijo 2022-09-27 13:47 ` Sharma, Shashank 2022-09-27 14:00 ` Lazar, Lijo 2022-09-27 14:20 ` Sharma, Shashank 2022-09-27 14:34 ` Lazar, Lijo 2022-09-27 14:50 ` Sharma, Shashank 2022-09-27 15:20 ` Felix Kuehling 2022-09-26 21:40 ` [PATCH v3 3/5] drm/amdgpu: set GPU workload via ctx IOCTL Shashank Sharma 2022-09-27 6:09 ` Christian König 2022-09-26 21:40 ` [PATCH v3 4/5] drm/amdgpu: switch GPU workload profile Shashank Sharma 2022-09-27 6:11 ` Christian König 2022-09-27 10:03 ` Lazar, Lijo 2022-09-27 11:47 ` Sharma, Shashank 2022-09-27 12:20 ` Lazar, Lijo 2022-09-27 12:25 ` Sharma, Shashank 2022-09-27 16:33 ` Michel Dänzer 2022-09-27 17:06 ` Sharma, Shashank 2022-09-27 17:29 ` Michel Dänzer 
2022-09-26 21:40 ` [PATCH v3 5/5] drm/amdgpu: switch workload context to/from compute Shashank Sharma 2022-09-27 6:12 ` Christian König 2022-09-27 14:48 ` Felix Kuehling 2022-09-27 14:58 ` Sharma, Shashank 2022-09-27 15:23 ` Felix Kuehling 2022-09-27 15:38 ` Sharma, Shashank 2022-09-27 20:40 ` Alex Deucher 2022-09-28 7:05 ` Lazar, Lijo 2022-09-28 8:56 ` Sharma, Shashank 2022-09-28 9:00 ` Sharma, Shashank 2022-09-28 21:51 ` Alex Deucher 2022-09-29 8:48 ` Sharma, Shashank 2022-09-29 11:10 ` Lazar, Lijo 2022-09-29 13:20 ` Sharma, Shashank 2022-09-29 13:37 ` Lazar, Lijo 2022-09-29 14:00 ` Sharma, Shashank 2022-09-29 14:14 ` Lazar, Lijo 2022-09-29 14:40 ` Sharma, Shashank 2022-09-29 18:32 ` Alex Deucher 2022-09-30 5:08 ` Lazar, Lijo 2022-09-30 8:37 ` Sharma, Shashank 2022-09-30 9:13 ` Lazar, Lijo 2022-09-30 9:22 ` Sharma, Shashank 2022-09-30 9:54 ` Lazar, Lijo 2022-09-30 10:09 ` Sharma, Shashank 2022-09-29 18:07 ` Felix Kuehling 2022-09-30 4:46 ` Lazar, Lijo 2022-09-27 16:24 ` [PATCH v3 0/5] GPU workload hints for better performance Michel Dänzer 2022-09-27 16:59 ` Sharma, Shashank 2022-09-27 17:13 ` Michel Dänzer 2022-09-27 17:25 ` Sharma, Shashank