On Wed, Mar 22, 2023 at 10:52 AM Alex Deucher <alexdeucher@gmail.com> wrote:

On Wed, Mar 22, 2023 at 10:37 AM Marek Olšák <maraeo@gmail.com> wrote:
>
> It sounds like the kernel should set the hint based on which queues are used, so that every UMD doesn't have to duplicate the same logic.

Userspace has a better idea of what they are doing than the kernel.
That said, we already set the video hint in the kernel when we submit
work to VCN/UVD/VCE and we already set hint COMPUTE when user queues
are active in ROCm because user queues don't go through the kernel. I
guess we could just set 3D by default. On windows there is a separate
API for fullscreen 3D games so 3D is only enabled in that case. I
assumed UMDs would want to select a hint, but maybe we should just
select the kernel set something. I figured vulkan or OpenGL would
select 3D vs COMPUTE depending on what queues/extensions the app uses.

Thinking about it more, if we do keep the hints, maybe it makes more
sense to select the hint at context init. Then we can set the hint to
the hardware at context init time. If multiple hints come in from
different contexts we'll automatically select the most aggressive one.
That would also be compatible with user mode queues.

Alex

>
> Marek
>
> On Wed, Mar 22, 2023 at 10:29 AM Christian König <christian.koenig@amd.com> wrote:
>>
>> Well that sounds like being able to optionally set it after context creation is actually the right approach.
>>
>> VA-API could set it as soon as we know that this is a video codec application.
>>
>> Vulkan can set it depending on what features are used by the application.
>>
>> But yes, Shashank (or whoever requested that) should come up with some code for Mesa to actually use it. Otherwise we don't have the justification to push it into the kernel driver.
>>
>> Christian.
>>
>> Am 22.03.23 um 15:24 schrieb Marek Olšák:
>>
>> The hint is static per API (one of graphics, video, compute, unknown). In the case of Vulkan, which exposes all queues, the hint is unknown, so Vulkan won't use it. (or make it based on the queue being used and not the uapi context state) GL won't use it because the default hint is already 3D. That makes VAAPI the only user that only sets the hint once, and maybe it's not worth even adding this uapi just for VAAPI.
>>
>> Marek
>>
>> On Wed, Mar 22, 2023 at 10:08 AM Christian König <christian.koenig@amd.com> wrote:
>>>
>>> Well completely agree that we shouldn't have unused API. That's why I said we should remove the getting the hint from the UAPI.
>>>
>>> But what's wrong with setting it after creating the context? Don't you know enough about the use case? I need to understand the background a bit better here.
>>>
>>> Christian.
>>>
>>> Am 22.03.23 um 15:05 schrieb Marek Olšák:
>>>
>>> The option to change the hint after context creation and get the hint would be unused uapi, and AFAIK we are not supposed to add unused uapi. What I asked is to change it to a uapi that userspace will actually use.
>>>
>>> Marek
>>>
>>> On Tue, Mar 21, 2023 at 9:54 AM Christian König <ckoenig.leichtzumerken@gmail.com> wrote:
>>>>
>>>> Yes, I would like to avoid having multiple code paths for context creation.
>>>>
>>>> Setting it later on should be equally to specifying it on creation since we only need it during CS.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 21.03.23 um 14:00 schrieb Sharma, Shashank:
>>>>
>>>> [AMD Official Use Only - General]
>>>>
>>>>
>>>>
>>>> When we started this patch series, the workload hint was a part of the ctx_flag only,
>>>>
>>>> But we changed that after the design review, to make it more like how we are handling PSTATE.
>>>>
>>>>
>>>>
>>>> Details:
>>>>
>>>> https://patchwork.freedesktop.org/patch/496111/
>>>>
>>>>
>>>>
>>>> Regards
>>>>
>>>> Shashank
>>>>
>>>>
>>>>
>>>> From: Marek Olšák <maraeo@gmail.com>
>>>> Sent: 21 March 2023 04:05
>>>> To: Sharma, Shashank <Shashank.Sharma@amd.com>
>>>> Cc: amd-gfx@lists.freedesktop.org; Deucher, Alexander <Alexander.Deucher@amd.com>; Somalapuram, Amaranath <Amaranath.Somalapuram@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>
>>>> Subject: Re: [PATCH v3 1/5] drm/amdgpu: add UAPI for workload hints to ctx ioctl
>>>>
>>>>
>>>>
>>>> I think we should do it differently because this interface will be mostly unused by open source userspace in its current form.
>>>>
>>>>
>>>>
>>>> Let's set the workload hint in drm_amdgpu_ctx_in::flags, and that will be immutable for the lifetime of the context. No other interface is needed.
>>>>
>>>>
>>>>
>>>> Marek
>>>>
>>>>
>>>>
>>>> On Mon, Sep 26, 2022 at 5:41 PM Shashank Sharma <shashank.sharma@amd.com> wrote:
>>>>
>>>> Allow the user to specify a workload hint to the kernel.
>>>> We can use these to tweak the dpm heuristics to better match
>>>> the workload for improved performance.
>>>>
>>>> V3: Create only set() workload UAPI (Christian)
>>>>
>>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>>> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
>>>> ---
>>>> include/uapi/drm/amdgpu_drm.h | 17 +++++++++++++++++
>>>> 1 file changed, 17 insertions(+)
>>>>
>>>> diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/drm/amdgpu_drm.h
>>>> index c2c9c674a223..23d354242699 100644
>>>> --- a/include/uapi/drm/amdgpu_drm.h
>>>> +++ b/include/uapi/drm/amdgpu_drm.h
>>>> @@ -212,6 +212,7 @@ union drm_amdgpu_bo_list {
>>>> #define AMDGPU_CTX_OP_QUERY_STATE2 4
>>>> #define AMDGPU_CTX_OP_GET_STABLE_PSTATE 5
>>>> #define AMDGPU_CTX_OP_SET_STABLE_PSTATE 6
>>>> +#define AMDGPU_CTX_OP_SET_WORKLOAD_PROFILE 7
>>>>
>>>> /* GPU reset status */
>>>> #define AMDGPU_CTX_NO_RESET 0
>>>> @@ -252,6 +253,17 @@ union drm_amdgpu_bo_list {
>>>> #define AMDGPU_CTX_STABLE_PSTATE_MIN_MCLK 3
>>>> #define AMDGPU_CTX_STABLE_PSTATE_PEAK 4
>>>>
>>>> +/* GPU workload hints, flag bits 8-15 */
>>>> +#define AMDGPU_CTX_WORKLOAD_HINT_SHIFT 8
>>>> +#define AMDGPU_CTX_WORKLOAD_HINT_MASK (0xff << AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>>>> +#define AMDGPU_CTX_WORKLOAD_HINT_NONE (0 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>>>> +#define AMDGPU_CTX_WORKLOAD_HINT_3D (1 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>>>> +#define AMDGPU_CTX_WORKLOAD_HINT_VIDEO (2 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>>>> +#define AMDGPU_CTX_WORKLOAD_HINT_VR (3 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>>>> +#define AMDGPU_CTX_WORKLOAD_HINT_COMPUTE (4 << AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>>>> +#define AMDGPU_CTX_WORKLOAD_HINT_MAX AMDGPU_CTX_WORKLOAD_HINT_COMPUTE
>>>> +#define AMDGPU_CTX_WORKLOAD_INDEX(n) (n >> AMDGPU_CTX_WORKLOAD_HINT_SHIFT)
>>>> +
>>>> struct drm_amdgpu_ctx_in {
>>>> /** AMDGPU_CTX_OP_* */
>>>> __u32 op;
>>>> @@ -281,6 +293,11 @@ union drm_amdgpu_ctx_out {
>>>> __u32 flags;
>>>> __u32 _pad;
>>>> } pstate;
>>>> +
>>>> + struct {
>>>> + __u32 flags;
>>>> + __u32 _pad;
>>>> + } workload;
>>>> };
>>>>
>>>> union drm_amdgpu_ctx {
>>>> --
>>>> 2.34.1
>>>>
>>>>
>>>
>>