From: Boris Brezillon <boris.brezillon@collabora.com>
To: Matthew Brost <matthew.brost@intel.com>
Cc: robdclark@chromium.org, thomas.hellstrom@linux.intel.com,
sarah.walker@imgtec.com, ketil.johnsen@arm.com,
Liviu.Dudau@arm.com, mcanal@igalia.com,
dri-devel@lists.freedesktop.org, christian.koenig@amd.com,
luben.tuikov@amd.com, dakr@redhat.com, donald.robson@imgtec.com,
lina@asahilina.net, intel-xe@lists.freedesktop.org,
faith.ekstrand@collabora.com
Subject: Re: [PATCH v6 5/7] drm/sched: Split free_job into own work item
Date: Mon, 23 Oct 2023 14:16:06 +0200 [thread overview]
Message-ID: <20231023141606.3dd53c39@collabora.com> (raw)
In-Reply-To: <20231017150958.838613-6-matthew.brost@intel.com>
Hi,
On Tue, 17 Oct 2023 08:09:56 -0700
Matthew Brost <matthew.brost@intel.com> wrote:
> +static void drm_sched_run_job_work(struct work_struct *w)
> +{
> + struct drm_gpu_scheduler *sched =
> + container_of(w, struct drm_gpu_scheduler, work_run_job);
> + struct drm_sched_entity *entity;
> + struct dma_fence *fence;
> + struct drm_sched_fence *s_fence;
> + struct drm_sched_job *sched_job;
> + int r;
>
> - atomic_inc(&sched->hw_rq_count);
> - drm_sched_job_begin(sched_job);
> + if (READ_ONCE(sched->pause_submit))
> + return;
> +
> + entity = drm_sched_select_entity(sched, true);
> + if (!entity)
> + return;
>
> - trace_drm_run_job(sched_job, entity);
> - fence = sched->ops->run_job(sched_job);
> + sched_job = drm_sched_entity_pop_job(entity);
> + if (!sched_job) {
> complete_all(&entity->entity_idle);
> - drm_sched_fence_scheduled(s_fence, fence);
> + return; /* No more work */
> + }
>
> - if (!IS_ERR_OR_NULL(fence)) {
> - /* Drop for original kref_init of the fence */
> - dma_fence_put(fence);
> + s_fence = sched_job->s_fence;
>
> - r = dma_fence_add_callback(fence, &sched_job->cb,
> - drm_sched_job_done_cb);
> - if (r == -ENOENT)
> - drm_sched_job_done(sched_job, fence->error);
> - else if (r)
> - DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n",
> - r);
> - } else {
> - drm_sched_job_done(sched_job, IS_ERR(fence) ?
> - PTR_ERR(fence) : 0);
> - }
> + atomic_inc(&sched->hw_rq_count);
> + drm_sched_job_begin(sched_job);
>
> - wake_up(&sched->job_scheduled);
> + trace_drm_run_job(sched_job, entity);
> + fence = sched->ops->run_job(sched_job);
> + complete_all(&entity->entity_idle);
> + drm_sched_fence_scheduled(s_fence, fence);
> +
> + if (!IS_ERR_OR_NULL(fence)) {
> + /* Drop for original kref_init of the fence */
> + dma_fence_put(fence);
> +
> + r = dma_fence_add_callback(fence, &sched_job->cb,
> + drm_sched_job_done_cb);
> + if (r == -ENOENT)
> + drm_sched_job_done(sched_job, fence->error);
> + else if (r)
> + DRM_DEV_ERROR(sched->dev, "fence add callback failed (%d)\n", r);
> + } else {
> + drm_sched_job_done(sched_job, IS_ERR(fence) ?
> + PTR_ERR(fence) : 0);
> }
Just ran into a race condition when using a non-ordered workqueue
for drm_sched:
thread A thread B
drm_sched_run_job_work()
drm_sched_job_begin()
// inserts jobA in pending_list
drm_sched_free_job_work()
drm_sched_get_cleanup_job()
// check first job in pending list
// if s_fence->parent == NULL, consider
// for cleanup
->free_job(jobA)
drm_sched_job_cleanup()
// set sched_job->s_fence = NULL
->run_job()
drm_sched_fence_scheduled()
// sched_job->s_fence->parent = parent_fence
// BOOM => NULL pointer deref
For now, I'll just use a dedicated ordered wq, but if we claim
multi-threaded workqueues are supported, this is probably worth fixing.
I know there's been some discussions about when the timeout should be
started, and the job insertion in the pending_list is kinda related.
If we want this insertion to happen before ->run_job() is called, we
need a way to flag when a job is inserted, but not fully submitted yet.
Regards,
Boris
next prev parent reply other threads:[~2023-10-23 12:16 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-10-17 15:09 [PATCH v6 0/7] DRM scheduler changes for Xe Matthew Brost
2023-10-17 15:09 ` [PATCH v6 1/7] drm/sched: Add drm_sched_wqueue_* helpers Matthew Brost
2023-10-17 15:09 ` [PATCH v6 2/7] drm/sched: Convert drm scheduler to use a work queue rather than kthread Matthew Brost
2023-10-17 15:09 ` [PATCH v6 3/7] drm/sched: Move schedule policy to scheduler Matthew Brost
2023-10-24 3:34 ` Luben Tuikov
2023-10-17 15:09 ` [PATCH v6 4/7] drm/sched: Add DRM_SCHED_POLICY_SINGLE_ENTITY scheduling policy Matthew Brost
2023-10-24 3:50 ` Luben Tuikov
2023-10-25 15:13 ` Matthew Brost
2023-10-25 18:39 ` Luben Tuikov
2023-10-17 15:09 ` [PATCH v6 5/7] drm/sched: Split free_job into own work item Matthew Brost
2023-10-19 1:25 ` Luben Tuikov
2023-10-19 16:55 ` Matthew Brost
2023-10-22 1:27 ` Luben Tuikov
2023-10-23 12:16 ` Boris Brezillon [this message]
2023-10-23 12:39 ` Boris Brezillon
2023-10-23 13:54 ` Matthew Brost
2023-10-23 14:22 ` Boris Brezillon
2023-10-23 22:03 ` Matthew Brost
2023-10-17 15:09 ` [PATCH v6 6/7] drm/sched: Add drm_sched_start_timeout_unlocked helper Matthew Brost
2023-10-17 15:09 ` [PATCH v6 7/7] drm/sched: Add a helper to queue TDR immediately Matthew Brost
2023-10-18 16:18 ` Luben Tuikov
2023-10-18 8:24 ` [PATCH v6 0/7] DRM scheduler changes for Xe Daniel Vetter
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20231023141606.3dd53c39@collabora.com \
--to=boris.brezillon@collabora.com \
--cc=Liviu.Dudau@arm.com \
--cc=christian.koenig@amd.com \
--cc=dakr@redhat.com \
--cc=donald.robson@imgtec.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=faith.ekstrand@collabora.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=ketil.johnsen@arm.com \
--cc=lina@asahilina.net \
--cc=luben.tuikov@amd.com \
--cc=matthew.brost@intel.com \
--cc=mcanal@igalia.com \
--cc=robdclark@chromium.org \
--cc=sarah.walker@imgtec.com \
--cc=thomas.hellstrom@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).