From: "Christian König" <christian.koenig@amd.com>
To: Nirmoy <nirmodas@amd.com>, Nirmoy Das <nirmoy.aiemd@gmail.com>,
	amd-gfx@lists.freedesktop.org
Cc: alexander.deucher@amd.com, kenny.ho@amd.com, nirmoy.das@amd.com,
	pierre-eric.pelloux-prayer@amd.com
Subject: Re: [PATCH] drm/scheduler: fix race condition in load balancer
Date: Tue, 14 Jan 2020 17:23:26 +0100
Message-ID: <fec78b0f-809a-15de-5c54-996f480eb4eb@amd.com>
In-Reply-To: <04f6d680-02ff-7526-adb4-4d44e83712bc@amd.com>

On 14.01.20 at 17:20, Nirmoy wrote:
>
> On 1/14/20 5:01 PM, Christian König wrote:
>> On 14.01.20 at 16:43, Nirmoy Das wrote:
>>> Jobs submitted to an entity should execute in the order in which
>>> they are submitted. We ensure that by checking entity->job_queue in
>>> drm_sched_entity_select_rq(), so that we don't load-balance jobs
>>> within an entity.
>>>
>>> But because we update entity->job_queue later, in
>>> drm_sched_entity_push_job(), there remains an open window in which
>>> entity->rq might get updated by drm_sched_entity_select_rq(), which
>>> should not be allowed.
>>
>> NAK, concurrent calls to
>> drm_sched_job_init()/drm_sched_entity_push_job() are not allowed in
>> the first place; otherwise we mess up the fence sequence order and
>> risk memory corruption.
> Unless I am missing something, I don't see any lock protecting the
> drm_sched_job_init()/drm_sched_entity_push_job() calls in
> amdgpu_cs_submit().

Look one step up in the call chain, at amdgpu_cs_ioctl().

The following call locks the page tables, which also makes access to
the context and entities mutually exclusive:
>         r = amdgpu_cs_parser_bos(&parser, data);
...
>         r = amdgpu_cs_submit(&parser, cs);
>
> out:

And here the page tables are unlocked again:
>         amdgpu_cs_parser_fini(&parser, r, reserved_buffers);
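
To illustrate the point in isolation, here is a minimal user-space
sketch, not the amdgpu code. It assumes a pthread mutex standing in for
the page table lock, and every name in it (toy_entity, toy_job_init,
toy_push_job, toy_run_submitters) is hypothetical. The only thing it
shows is that when "init" (which hands out the sequence number) and
"push" (which queues the job) both happen under one lock taken by the
caller, the queue order always matches the sequence order, without any
extra lock inside the two helpers.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a scheduler entity; not the kernel struct. */
struct toy_entity {
	pthread_mutex_t lock;    /* plays the role of the lock taken by the caller */
	unsigned long next_seq;  /* next sequence number to hand out */
	unsigned long *queue;    /* sequence numbers in the order they were pushed */
	size_t queued;
	size_t capacity;
};

struct toy_submitter {
	struct toy_entity *entity;
	int jobs;
};

/* Step 1 of a submission: allocate the sequence number. No locking here. */
static unsigned long toy_job_init(struct toy_entity *e)
{
	return e->next_seq++;
}

/* Step 2 of a submission: append the job to the queue. No locking here. */
static void toy_push_job(struct toy_entity *e, unsigned long seq)
{
	assert(e->queued < e->capacity);
	e->queue[e->queued++] = seq;
}

/* Each submitter holds the outer lock across both steps. */
static void *toy_submit_thread(void *arg)
{
	struct toy_submitter *s = arg;

	for (int i = 0; i < s->jobs; i++) {
		pthread_mutex_lock(&s->entity->lock);
		unsigned long seq = toy_job_init(s->entity);
		toy_push_job(s->entity, seq);
		pthread_mutex_unlock(&s->entity->lock);
	}
	return NULL;
}

/*
 * Runs nthreads concurrent submitters and returns how many queue slots
 * hold a sequence number that differs from their position, or -1 if
 * memory allocation fails.
 */
long toy_run_submitters(int nthreads, int jobs_per_thread)
{
	struct toy_entity e = { .next_seq = 0, .queued = 0 };
	struct toy_submitter sub = { .entity = &e, .jobs = jobs_per_thread };
	long violations = 0;
	int started = 0;

	e.capacity = (size_t)nthreads * (size_t)jobs_per_thread;
	e.queue = calloc(e.capacity, sizeof(*e.queue));
	pthread_t *threads = calloc((size_t)nthreads, sizeof(*threads));
	if (!e.queue || !threads) {
		free(e.queue);
		free(threads);
		return -1;
	}

	pthread_mutex_init(&e.lock, NULL);
	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&threads[started], NULL, toy_submit_thread, &sub) == 0)
			started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(threads[i], NULL);

	/* Queue position i must hold sequence number i. */
	for (size_t i = 0; i < e.queued; i++) {
		if (e.queue[i] != i)
			violations++;
	}

	pthread_mutex_destroy(&e.lock);
	free(threads);
	free(e.queue);
	return violations;
}
```

Calling toy_run_submitters(4, 1000) from a small main() should report
zero violations. Dropping the lock/unlock pair in toy_submit_thread()
turns the sketch into a data race, which is the kind of concurrent use
that is not allowed here.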

Regards,
Christian.

>
>
> Regards,
>
> Nirmoy
>

