From: Luben Tuikov <luben.tuikov@amd.com>
To: Nirmoy <nirmodas@amd.com>,
	christian.koenig@amd.com,
	Andrey Grodzovsky <Andrey.Grodzovsky@amd.com>,
	amd-gfx@lists.freedesktop.org
Cc: alexander.deucher@amd.com, Boyuan.Zhang@amd.com,
	nirmoy.das@amd.com, Leo.Liu@amd.com, James.Zhu@amd.com
Subject: Re: [PATCH 1/1] drm/amdgpu: disable gpu_sched load balancer for vcn jobs
Date: Tue, 17 Mar 2020 16:46:24 -0400	[thread overview]
Message-ID: <4616f7cc-ce70-d74c-5a62-d736fec08085@amd.com> (raw)
In-Reply-To: <3fac4046-0c2c-7fa9-7c83-6af9149e50bf@amd.com>

On 2020-03-12 06:56, Nirmoy wrote:
> 
> On 3/12/20 9:50 AM, Christian König wrote:
>> Am 11.03.20 um 21:55 schrieb Nirmoy:
>>>
>>> On 3/11/20 9:35 PM, Andrey Grodzovsky wrote:
>>>>
>>>> On 3/11/20 4:32 PM, Nirmoy wrote:
>>>>>
>>>>> On 3/11/20 9:02 PM, Andrey Grodzovsky wrote:
>>>>>>
>>>>>> On 3/11/20 4:00 PM, Andrey Grodzovsky wrote:
>>>>>>>
>>>>>>> On 3/11/20 4:00 PM, Nirmoy Das wrote:
>>>>>>>> [SNIP]
>>>>>>>> @@ -1257,6 +1258,9 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
>>>>>>>>       priority = job->base.s_priority;
>>>>>>>>       drm_sched_entity_push_job(&job->base, entity);
>>>>>>>>
>>>>>>>> +    if (ring->funcs->no_gpu_sched_loadbalance)
>>>>>>>> +        amdgpu_ctx_disable_gpu_sched_load_balance(entity);
>>>>>>>> +
>>>>>>>
>>>>>>>
>>>>>>> Why does this need to be done each time a job is submitted, and not
>>>>>>> once in drm_sched_entity_init (same for amdgpu_job_submit below)?
>>>>>>>
>>>>>>> Andrey
>>>>>>
>>>>>>
>>>>>> My bad - not in drm_sched_entity_init but in relevant amdgpu code.
>>>>>
>>>>>
>>>>> Hi Andrey,
>>>>>
>>>>> Do you mean drm_sched_job_init() or after creating VCN entities?
>>>>>
>>>>>
>>>>> Nirmoy
>>>>
>>>>
>>>> I guess after creating the VCN entities (it has to be amdgpu-specific
>>>> code). I just don't get why it needs to be done each time a job is
>>>> submitted. Since you set .no_gpu_sched_loadbalance = true anyway, this
>>>> is always true, so shouldn't you just initialize the VCN entity with a
>>>> scheduler list consisting of one scheduler and that's it?
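
A minimal sketch of that idea, assuming the scheduler-list variant of
drm_sched_entity_init(); the variable names here are only illustrative, and
the instance would have to be chosen once, before the entity is created:

    /* Sketch: give the VCN entity a one-element scheduler list so the
     * scheduler core can never load balance it to another instance.
     */
    struct drm_gpu_scheduler *vcn_sched_list[] = { chosen_sched };

    r = drm_sched_entity_init(entity, priority, vcn_sched_list, 1, guilty);
    if (r)
            return r;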
>>>
>>>
>>> Assumption: if I understand correctly, we shouldn't be load balancing
>>> VCN jobs within the same context. Christian, James and Leo can clarify
>>> if I am wrong.
>>>
>>> But we can still load balance VCN jobs among multiple contexts. That
>>> load balancing decision happens in drm_sched_entity_init(). If we
>>> initialize the VCN entity with one scheduler, then all entities,
>>> irrespective of context, get that one scheduler, which means we are not
>>> utilizing the extra VCN instances.
>>
>> Andrey has a very good point here. So far we have only looked at this
>> from the hardware-requirement side, namely that we can't change the ring
>> any more after the first submission.
>>
>> But it is certainly valuable to keep the extra overhead out of the hot 
>> path during command submission.
> 
> 
> 
>>
>>> Ideally we should be calling
>>> amdgpu_ctx_disable_gpu_sched_load_balance() only once, after the first
>>> call to drm_sched_entity_init() for a VCN job. I am not sure how to do
>>> that efficiently.
>>>
>>> Another option might be to copy the logic of
>>> drm_sched_entity_get_free_sched() and choose a suitable VCN scheduler
>>> at/after VCN entity creation.
>>
>> Yes, but we should not copy the logic but rather refactor it :)
>>
>> Basically we need a drm_sched_pick_best() function which gets an array 
>> of drm_gpu_scheduler structures and returns the one with the least 
>> load on it.
>>
>> This function can then be used by VCN to pick one instance before 
>> initializing the entity as well as a replacement for 
>> drm_sched_entity_get_free_sched() to change the scheduler for load 
>> balancing.
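
A minimal sketch of what such a helper could look like; this is not the
actual implementation, it just reuses the least-loaded idea from
drm_sched_entity_get_free_sched() and assumes the per-scheduler num_jobs
counter as the load metric:

    /* Sketch: return the least-loaded scheduler from the array, or NULL
     * if the array is empty.
     */
    static struct drm_gpu_scheduler *
    drm_sched_pick_best(struct drm_gpu_scheduler **sched_list,
                        unsigned int num_sched_list)
    {
            struct drm_gpu_scheduler *best = NULL;
            unsigned int min_jobs = UINT_MAX, i;

            for (i = 0; i < num_sched_list; i++) {
                    unsigned int jobs = atomic_read(&sched_list[i]->num_jobs);

                    if (jobs < min_jobs) {
                            min_jobs = jobs;
                            best = sched_list[i];
                    }
            }

            return best;
    }

VCN could call this once before entity creation, and the load-balancing path
could call it in place of drm_sched_entity_get_free_sched().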
> 
> 
> This sounds like an optimal solution here.
> 
> Thanks Andrey and Christian. I will resend with suggested changes.

Note that this isn't an optimal solution. drm_sched_pick_best() and
drm_sched_entity_get_free_sched() (these names are too long) are similar in
what they do: they pick a scheduler, which is still centralized decision
making.

An optimal solution would be for each execution unit to pick work
when work is available, which is a decentralized decision model.

I'm not sure how the array would be used as proposed here. Would that be an
O(n) search through the array?

In any case, centralized decision making introduces a bottleneck. Decentralized
solutions are available for scheduling with O(1) time complexity.
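
For contrast, in a pull model the submitter enqueues work once and each
execution unit takes the next job itself when it becomes idle, so every
scheduling decision is O(1). A rough illustration; the queue API here is
hypothetical, not drm_sched code:

    /* Producer: one O(1) enqueue; no scheduler is picked here. */
    enqueue(&shared_fifo, job);

    /* Consumer: each execution unit pulls its own work when idle,
     * also O(1), so load spreads without a central decision.
     */
    while (running) {
            job = dequeue(&shared_fifo);    /* blocks until work arrives */
            run(job);
    }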

Regards,
Luben


> 
> 
>>
>> Regards,
>> Christian.
>>
>>>
>>>
>>> Regards,
>>>
>>> Nirmoy
>>>
>>

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
