From: "Christian König" <christian.koenig@amd.com>
To: Karol Herbst <kherbst@redhat.com>
Cc: "Asahi Lina" <lina@asahilina.net>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"David Airlie" <airlied@gmail.com>,
"Daniel Vetter" <daniel@ffwll.ch>,
"Miguel Ojeda" <ojeda@kernel.org>,
"Alex Gaynor" <alex.gaynor@gmail.com>,
"Wedson Almeida Filho" <wedsonaf@gmail.com>,
"Boqun Feng" <boqun.feng@gmail.com>,
"Gary Guo" <gary@garyguo.net>,
"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
"Sumit Semwal" <sumit.semwal@linaro.org>,
"Luben Tuikov" <luben.tuikov@amd.com>,
"Jarkko Sakkinen" <jarkko@kernel.org>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"Alyssa Rosenzweig" <alyssa@rosenzweig.io>,
"Ella Stanforth" <ella@iglunix.org>,
"Faith Ekstrand" <faith.ekstrand@collabora.com>,
Mary <mary@mary.zone>,
linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
rust-for-linux@vger.kernel.org, linux-media@vger.kernel.org,
linaro-mm-sig@lists.linaro.org, linux-sgx@vger.kernel.org,
asahi@lists.linux.dev
Subject: Re: [PATCH RFC 10/18] drm/scheduler: Add can_run_job callback
Date: Wed, 8 Mar 2023 16:02:37 +0100
Message-ID: <269c4fb7-60f6-6778-444d-15703febea58@amd.com>
In-Reply-To: <CACO55tumNMYrcJ0LhnLfTK4DmGLHD-bt3xpXyoPe98V2wmgXQg@mail.gmail.com>
On 08.03.23 at 15:43, Karol Herbst wrote:
> [SNIP]
> "further"? There was no discussion at all,
Yeah, well that is exactly what I wanted to achieve.
> you just started off like
> that. If you think somebody misses that connection, you can point out
> to documentation/videos whatever so the contributor can understand
> what's wrong with an approach. You did that, so that's fine. It's just
> starting off _any_ discussion with a "Well complete NAK" is terrible
> style. I'd feel uncomfortable if that happened to me and I'm sure
> there are enough people like that that we should be more reasonable
> with our replies. Just.. don't.
>
> We are all humans here and people react negatively to such things. And
> if people do it on purpose it just makes it worse.
I completely see your point, I just don't know how to improve it.
I don't stop people like this because I want to make them uncomfortable
but because I want to prevent further discussions on that topic.
In other words how can I make people notice that this is something
fundamental while still being polite?
>>>> This is clearly going against the idea of having jobs only depend on
>>>> fences and nothing else which is mandatory for correct memory management.
>>>>
>>> I'm sure it's all documented and there is a design document on how
>>> things have to look like you can point out? Might help to get a better
>>> understanding on how things should be.
>> Yeah, that's the problematic part. We have documented this very
>> extensively:
>> https://www.kernel.org/doc/html/v5.9/driver-api/dma-buf.html#indefinite-dma-fences
>>
>> And both Jason and Daniel gave talks about the underlying problem and
> fyi:
> s/Jason/Faith/g
+1. I wasn't aware of that.
>> try to come up with patches to raise warnings when that happens, but
>> people still keep coming up with the same idea over and over again.
>>
> Yes, and we'll have to tell them over and over again. Nothing wrong
> with that. That's just part of maintaining such a big subsystem. And
> that's definitely not a valid reason to phrase things like above.
>
>> It's just that the technical relationship between preventing jobs from
>> running and with that preventing dma_fences from signaling and the core
>> memory management with page faults and shrinkers waiting for those
>> fences is absolutely not obvious.
>>
>> We had at least 10 different teams from different companies falling into
>> the same trap already and either the patches were rejected out of hand or
>> had to be painfully reverted or mitigated later on.
>>
> Sure, but that's just part of the job. And pointing out fundamental
> mistakes early on is important, but the situation won't get any better
> by being like that. Yes, we'll have to repeat the same words over and
> over again, and yes that might be annoying, but that's just how it is.
Well, I have no problem explaining to people why a solution doesn't work.
But what usually happens is that people don't realize that they need to
back off from a design and completely start over.
Regards,
Christian.
>
>> Regards,
>> Christian.
>>
>>>> If the hw is busy with something you need to return the fence for this
>>>> from the prepare_job callback so that the scheduler can be notified when
>>>> the hw is available again.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> Signed-off-by: Asahi Lina <lina@asahilina.net>
>>>>> ---
>>>>> drivers/gpu/drm/scheduler/sched_main.c | 10 ++++++++++
>>>>> include/drm/gpu_scheduler.h | 8 ++++++++
>>>>> 2 files changed, 18 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> index 4e6ad6e122bc..5c0add2c7546 100644
>>>>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>>>>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>>>>> @@ -1001,6 +1001,16 @@ static int drm_sched_main(void *param)
>>>>>  		if (!entity)
>>>>>  			continue;
>>>>>
>>>>> +		if (sched->ops->can_run_job) {
>>>>> +			sched_job = to_drm_sched_job(spsc_queue_peek(&entity->job_queue));
>>>>> +			if (!sched_job) {
>>>>> +				complete_all(&entity->entity_idle);
>>>>> +				continue;
>>>>> +			}
>>>>> +			if (!sched->ops->can_run_job(sched_job))
>>>>> +				continue;
>>>>> +		}
>>>>> +
>>>>>  		sched_job = drm_sched_entity_pop_job(entity);
>>>>>
>>>>>  		if (!sched_job) {
>>>>> diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
>>>>> index 9db9e5e504ee..bd89ea9507b9 100644
>>>>> --- a/include/drm/gpu_scheduler.h
>>>>> +++ b/include/drm/gpu_scheduler.h
>>>>> @@ -396,6 +396,14 @@ struct drm_sched_backend_ops {
>>>>>  	struct dma_fence *(*prepare_job)(struct drm_sched_job *sched_job,
>>>>>  					 struct drm_sched_entity *s_entity);
>>>>>
>>>>> +	/**
>>>>> +	 * @can_run_job: Called before job execution to check whether the
>>>>> +	 * hardware is free enough to run the job. This can be used to
>>>>> +	 * implement more complex hardware resource policies than the
>>>>> +	 * hw_submission limit.
>>>>> +	 */
>>>>> +	bool (*can_run_job)(struct drm_sched_job *sched_job);
>>>>> +
>>>>>  	/**
>>>>>  	 * @run_job: Called to execute the job once all of the dependencies
>>>>>  	 * have been resolved. This may be called multiple times, if
>>>>>
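
The alternative suggested in the quoted review (return a fence from the
prepare_job callback when the hardware is busy, instead of answering a
yes/no can_run_job query) can be illustrated with a small userspace model.
This is only a sketch under loud assumptions: it is not kernel code, and
every name in it (toy_fence, toy_hw, toy_job, toy_prepare_job,
toy_sched_try, and so on) is hypothetical. The one property it tries to
show is that "hardware busy" is expressed as a fence the scheduler waits
on, so the job depends on fences and nothing else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a fence: a flag that is set when the event happens. */
struct toy_fence {
	bool signaled;
};

/* Toy hardware: a count of free slots and a fence signaled when one frees up. */
struct toy_hw {
	int free_slots;
	struct toy_fence slot_free;
};

struct toy_job {
	struct toy_hw *hw;
	bool ran;
};

/*
 * Modeled loosely on the idea behind prepare_job: return a fence the job
 * must still wait for, or NULL when nothing blocks it any more.
 */
static struct toy_fence *toy_prepare_job(struct toy_job *job)
{
	struct toy_hw *hw = job->hw;

	if (hw->free_slots > 0)
		return NULL;

	/* Hardware is busy: hand back a fence instead of just saying "no". */
	hw->slot_free.signaled = false;
	return &hw->slot_free;
}

static void toy_run_job(struct toy_job *job)
{
	assert(job->hw->free_slots > 0);
	job->hw->free_slots--;
	job->ran = true;
}

/* Hardware side: a slot became free, so signal the fence. */
static void toy_hw_complete(struct toy_hw *hw)
{
	hw->free_slots++;
	hw->slot_free.signaled = true;
}

/*
 * One scheduler pass. Returns true if the job was run, false if it is
 * still blocked on the fence stored in *pending.
 */
static bool toy_sched_try(struct toy_job *job, struct toy_fence **pending)
{
	if (*pending && !(*pending)->signaled)
		return false;

	*pending = toy_prepare_job(job);
	if (*pending)
		return false;

	toy_run_job(job);
	return true;
}
```

In this model a caller starts with a NULL pending fence and calls
toy_sched_try() again once the fence has signaled. A real scheduler would
not poll but would be woken by a callback on the fence, which is the
"notified when the hw is available again" part of the review comment.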