From: Nirmoy <nirmodas@amd.com>
To: "Christian König" <christian.koenig@amd.com>,
"Nirmoy Das" <nirmoy.aiemd@gmail.com>,
amd-gfx@lists.freedesktop.org
Cc: alexander.deucher@amd.com, kenny.ho@amd.com, nirmoy.das@amd.com,
pierre-eric.pelloux-prayer@amd.com
Subject: Re: [PATCH] drm/scheduler: fix race condition in load balancer
Date: Tue, 14 Jan 2020 17:27:44 +0100
Message-ID: <9ce31047-6b53-4a5c-7483-660d671afca3@amd.com>
In-Reply-To: <fec78b0f-809a-15de-5c54-996f480eb4eb@amd.com>

On 1/14/20 5:23 PM, Christian König wrote:
> On 14.01.20 at 17:20, Nirmoy wrote:
>>
>> On 1/14/20 5:01 PM, Christian König wrote:
>>> On 14.01.20 at 16:43, Nirmoy Das wrote:
>>>> Jobs submitted to an entity should execute in the order in which
>>>> they are submitted. We ensure that by checking entity->job_queue in
>>>> drm_sched_entity_select_rq(), so that we don't load-balance jobs
>>>> within an entity.
>>>>
>>>> But because we update entity->job_queue later, in
>>>> drm_sched_entity_push_job(), there remains an open window in which
>>>> entity->rq might get updated by drm_sched_entity_select_rq(), which
>>>> should not be allowed.
>>>
>>> NAK. Concurrent calls to
>>> drm_sched_job_init()/drm_sched_entity_push_job() are not allowed in
>>> the first place; otherwise we mess up the fence sequence order and
>>> risk memory corruption.
>> If I am not missing something, I don't see any lock protecting the
>> drm_sched_job_init()/drm_sched_entity_push_job() calls in
>> amdgpu_cs_submit().
>
> Look one step up the call chain, at the function amdgpu_cs_ioctl().
>
> This call locks the page tables, which also makes access to the
> context and entities mutually exclusive:
>> r = amdgpu_cs_parser_bos(&parser, data);
> ...
>> r = amdgpu_cs_submit(&parser, cs);
>>
>> out:
>
> And here the page tables are unlocked again:
>> amdgpu_cs_parser_fini(&parser, r, reserved_buffers);

Okay. Then something else is going on. Let me dig more.

Thanks,
Nirmoy
>
> Regards,
> Christian.
>
>>
>>
>> Regards,
>>
>> Nirmoy
>>
>
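
The locking pattern Christian describes above can be illustrated with a small user-space sketch. This is only an analogy under stated assumptions, not the amdgpu or drm_sched code: every demo_* name is hypothetical, a pthread mutex stands in for the page-table lock, and the "job queue" is a plain array of sequence numbers. It shows why one outer lock held across both the init step and the push step keeps the queue order identical to the sequence order, even with several submitting threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scheduler entity; not the real drm_sched types. */
struct demo_entity {
    pthread_mutex_t lock;    /* plays the role of the page-table lock */
    unsigned long next_seq;  /* next fence sequence number to hand out */
    unsigned long *queue;    /* sequence numbers in the order they were pushed */
    size_t queued;
};

struct demo_args {
    struct demo_entity *entity;
    int jobs;
};

/* Analogue of drm_sched_job_init(): hand out the next sequence number. */
static unsigned long demo_job_init(struct demo_entity *e)
{
    return e->next_seq++;
}

/* Analogue of drm_sched_entity_push_job(): append the job to the queue. */
static void demo_push_job(struct demo_entity *e, unsigned long seq)
{
    e->queue[e->queued++] = seq;
}

/* Analogue of amdgpu_cs_ioctl(): both steps run under one outer lock. */
static void demo_cs_ioctl(struct demo_entity *e)
{
    unsigned long seq;

    pthread_mutex_lock(&e->lock);    /* where amdgpu_cs_parser_bos() locks */
    seq = demo_job_init(e);
    demo_push_job(e, seq);
    pthread_mutex_unlock(&e->lock);  /* where amdgpu_cs_parser_fini() unlocks */
}

static void *demo_submitter(void *p)
{
    struct demo_args *a = p;

    for (int i = 0; i < a->jobs; i++)
        demo_cs_ioctl(a->entity);
    return NULL;
}

/*
 * Run nthreads submitters (both counts must be positive) and return how
 * many queue slots hold a sequence number that differs from their position.
 */
int demo_run(int nthreads, int jobs_per_thread)
{
    size_t total = (size_t)nthreads * (size_t)jobs_per_thread;
    struct demo_entity e = { .next_seq = 0, .queued = 0 };
    struct demo_args args = { &e, jobs_per_thread };
    pthread_t *threads = malloc(sizeof(*threads) * (size_t)nthreads);
    int out_of_order = 0;

    e.queue = malloc(sizeof(*e.queue) * total);
    assert(threads && e.queue);
    pthread_mutex_init(&e.lock, NULL);

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&threads[i], NULL, demo_submitter, &args);
        assert(rc == 0);
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);

    for (size_t i = 0; i < e.queued; i++)
        if (e.queue[i] != i)
            out_of_order++;

    pthread_mutex_destroy(&e.lock);
    free(e.queue);
    free(threads);
    return out_of_order;
}
```

Calling demo_run(4, 1000) from a main() (build with -pthread) should return 0, because no thread can take a sequence number without also pushing it before the next thread gets in. If the lock covered only one of the two steps, the queue order could diverge from the sequence order, which is the kind of problem the NAK above warns about.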
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx