From: "Christian König" <christian.koenig@amd.com>
To: Nirmoy <nirmodas@amd.com>, Nirmoy Das <nirmoy.aiemd@gmail.com>,
	amd-gfx@lists.freedesktop.org
Cc: alexander.deucher@amd.com, kenny.ho@amd.com, nirmoy.das@amd.com,
	pierre-eric.pelloux-prayer@amd.com
Subject: Re: [PATCH] drm/scheduler: fix race condition in load balancer
Date: Wed, 15 Jan 2020 13:52:00 +0100
Message-ID: <8b4d2ea2-a28d-6eb4-2d50-02b5c450922f@amd.com>
In-Reply-To: <862ad550-082d-7ece-1d4d-99801ab10428@amd.com>

Hi Nirmoy,

On 15.01.20 at 12:04, Nirmoy wrote:
> Hi Christian,
>
> On 1/14/20 5:01 PM, Christian König wrote:
>>
>>> Before this patch:
>>>
>>> sched_name     number of times it got scheduled
>>> ==========     ================================
>>> sdma0          314
>>> sdma1          32
>>> comp_1.0.0     56
>>> comp_1.1.0     0
>>> comp_1.1.1     0
>>> comp_1.2.0     0
>>> comp_1.2.1     0
>>> comp_1.3.0     0
>>> comp_1.3.1     0
>>>
>>> After this patch:
>>>
>>> sched_name     number of times it got scheduled
>>> ==========     ================================
>>> sdma1          243
>>> sdma0          164
>>> comp_1.0.1     14
>>> comp_1.1.0     11
>>> comp_1.1.1     10
>>> comp_1.2.0     15
>>> comp_1.2.1     14
>>> comp_1.3.0     10
>>> comp_1.3.1     10
>>
>> Well that is still rather nice to have, why does that happen?
>
> I think I know why it happens. At init, every entity's rq gets assigned
> to sched_list[0]. I added some prints to check what we compare in
> drm_sched_entity_get_free_sched.
>
> It turns out that most of the time it compares zero values
> (num_jobs(0) < min_jobs(0)), so most of the time the first rq
> (sdma0, comp_1.0.0) was picked by drm_sched_entity_get_free_sched.

Well, that is expected, because the unit tests always do submission,
wait, submission, wait, submission, wait, and so on. So the number of
jobs in the scheduler drops to zero in between.
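
To illustrate the tie-breaking described above, here is a small
user-space sketch. It is not the kernel code: struct toy_sched and
toy_pick_least_loaded() are hypothetical names, and the sketch only
models a minimum search with a strict '<' compare, like the
num_jobs < min_jobs check quoted above. When every count is zero, the
first entry in the list is always returned.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a GPU scheduler; not the kernel struct. */
struct toy_sched {
	const char *name;
	unsigned int num_jobs;	/* jobs currently queued on this scheduler */
};

/*
 * Return the index of the scheduler with the fewest queued jobs.
 * Because the compare is a strict '<', the earliest entry wins a tie.
 */
static size_t toy_pick_least_loaded(const struct toy_sched *list, size_t n)
{
	unsigned int min_jobs = UINT_MAX;
	size_t best = 0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; ++i) {
		if (list[i].num_jobs < min_jobs) {
			min_jobs = list[i].num_jobs;
			best = i;
		}
	}
	return best;
}
```

With a submit-then-wait workload every counter reads zero at pick
time, so a search of this shape keeps returning index 0.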

> This patch was not correct; it had an extra atomic_inc(num_jobs) in
> drm_sched_job_init. I think this probably added a bit of randomness,
> which helped to distribute the jobs better.

Mhm, that might not be a bad idea after all. We could rename num_jobs
to something like score and do a +1 in drm_sched_rq_add_entity()
and a -1 in drm_sched_rq_remove_entity().

That should have pretty much the effect we want to have.
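
A rough user-space sketch of that score idea, under stated
assumptions: struct score_sched and the score_*() helpers are
hypothetical names that only mirror the +1 on add and -1 on remove
described above. They are not the kernel's drm_sched_rq_add_entity()
or drm_sched_rq_remove_entity(), and C11 atomics stand in for the
kernel's atomic_t.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical scheduler model carrying a load score. */
struct score_sched {
	const char *name;
	atomic_uint score;	/* number of entities currently attached */
};

static void score_sched_init(struct score_sched *s, const char *name)
{
	s->name = name;
	atomic_init(&s->score, 0);
}

/* +1 when an entity is attached to this scheduler's run queue. */
static void score_rq_add_entity(struct score_sched *s)
{
	atomic_fetch_add(&s->score, 1);
}

/* -1 when an entity is detached again. */
static void score_rq_remove_entity(struct score_sched *s)
{
	atomic_fetch_sub(&s->score, 1);
}

/* Return the index of the lowest score; the earliest entry wins a tie. */
static size_t score_pick_lowest(struct score_sched *list, size_t n)
{
	unsigned int min_score = UINT_MAX;
	size_t best = 0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; ++i) {
		unsigned int score = atomic_load(&list[i].score);

		if (score < min_score) {
			min_score = score;
			best = i;
		}
	}
	return best;
}
```

In this model the score follows attached entities rather than
in-flight jobs, so it stays non-zero between a submission and the next
one for as long as the entity remains attached.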

> I've updated my previous RFC patch, which uses the time consumed by
> each sched for load balancing, with the twist of ignoring the
> previously scheduled sched/rq. Let me know what you think.

I haven't had time yet to wrap my head around that in detail, but at
least offhand Luben is right that the locking looks really awkward.

And I would rather avoid a larger change like this for a feature that
is only nice to have for testing.

Regards,
Christian.

>
>
> Regards,
>
> Nirmoy
>
>>
>> Christian.
>>
>>

