From: "Christian König" <christian.koenig@amd.com>
To: "Asahi Lina" <lina@asahilina.net>,
"Maarten Lankhorst" <maarten.lankhorst@linux.intel.com>,
"Maxime Ripard" <mripard@kernel.org>,
"Thomas Zimmermann" <tzimmermann@suse.de>,
"David Airlie" <airlied@gmail.com>,
"Daniel Vetter" <daniel@ffwll.ch>,
"Miguel Ojeda" <ojeda@kernel.org>,
"Alex Gaynor" <alex.gaynor@gmail.com>,
"Wedson Almeida Filho" <wedsonaf@gmail.com>,
"Boqun Feng" <boqun.feng@gmail.com>,
"Gary Guo" <gary@garyguo.net>,
"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
"Sumit Semwal" <sumit.semwal@linaro.org>,
"Luben Tuikov" <luben.tuikov@amd.com>,
"Jarkko Sakkinen" <jarkko@kernel.org>,
"Dave Hansen" <dave.hansen@linux.intel.com>
Cc: Alyssa Rosenzweig <alyssa@rosenzweig.io>,
Karol Herbst <kherbst@redhat.com>,
Ella Stanforth <ella@iglunix.org>,
Faith Ekstrand <faith.ekstrand@collabora.com>,
Mary <mary@mary.zone>,
linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
rust-for-linux@vger.kernel.org, linux-media@vger.kernel.org,
linaro-mm-sig@lists.linaro.org, linux-sgx@vger.kernel.org,
asahi@lists.linux.dev
Subject: Re: [PATCH RFC 11/18] drm/scheduler: Clean up jobs when the scheduler is torn down
Date: Wed, 8 Mar 2023 19:12:00 +0100 [thread overview]
Message-ID: <e18500b5-21a0-77fd-8434-86258cefce5a@amd.com> (raw)
In-Reply-To: <0f14c1ae-0c39-106c-9563-7c1c672154c0@asahilina.net>
Am 08.03.23 um 18:32 schrieb Asahi Lina:
> [SNIP]
> Yes but... none of this cleans up jobs that are already submitted by the
> scheduler and in its pending list, with registered completion callbacks,
> which were already popped off of the entities.
>
> *That* is the problem this patch fixes!
Ah! Yes that makes more sense now.
>> We could add a warning when users of this API doesn't do this
>> correctly, but cleaning up incorrect API use is clearly something we
>> don't want here.
> It is the job of the Rust abstractions to make incorrect API use that
> leads to memory unsafety impossible. So even if you don't want that in
> C, it's my job to do that for Rust... and right now, I just can't
> because drm_sched doesn't provide an API that can be safely wrapped
> without weird bits of babysitting functionality on top (like tracking
> jobs outside or awkwardly making jobs hold a reference to the scheduler
> and defer dropping it to another thread).
Yeah, that was discussed before but rejected.
The argument was that upper layer needs to wait for the hw to become
idle before the scheduler can be destroyed anyway.
>>> Right now, it is not possible to create a safe Rust abstraction for
>>> drm_sched without doing something like duplicating all job tracking in
>>> the abstraction, or the above backreference + deferred cleanup mess, or
>>> something equally silly. So let's just fix the C side please ^^
>> Nope, as far as I can see this is just not correctly tearing down the
>> objects in the right order.
> There's no API to clean up in-flight jobs in a drm_sched at all.
> Destroying an entity won't do it. So there is no reasonable way to do
> this at all...
Yes, this was removed.
>> So you are trying to do something which is not supposed to work in the
>> first place.
> I need to make things that aren't supposed to work impossible to do in
> the first place, or at least fail gracefully instead of just oopsing
> like drm_sched does today...
>
> If you're convinced there's a way to do this, can you tell me exactly
> what code sequence I need to run to safely shut down a scheduler
> assuming all entities are already destroyed? You can't ask me for a list
> of pending jobs (the scheduler knows this, it doesn't make any sense to
> duplicate that outside), and you can't ask me to just not do this until
> all jobs complete execution (because then we either end up with the
> messy deadlock situation I described if I take a reference, or more
> duplicative in-flight job count tracking and blocking in the free path
> of the Rust abstraction, which doesn't make any sense either).
Good question. We don't have anybody upstream which uses the scheduler
lifetime like this.
Essentially the job list in the scheduler is something we wanted to
remove because it causes tons of race conditions during hw recovery.
When you tear down the firmware queue how do you handle already
submitted jobs there?
Regards,
Christian.
>
> ~~ Lina
next prev parent reply other threads:[~2023-03-08 18:12 UTC|newest]
Thread overview: 122+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-07 14:25 [PATCH RFC 00/18] Rust DRM subsystem abstractions (& preview AGX driver) Asahi Lina
2023-03-07 14:25 ` [PATCH RFC 01/18] rust: drm: ioctl: Add DRM ioctl abstraction Asahi Lina
2023-03-07 14:48 ` Karol Herbst
2023-03-07 14:51 ` Karol Herbst
2023-03-07 15:32 ` Maíra Canal
2023-03-09 5:32 ` Asahi Lina
2023-03-09 6:15 ` Dave Airlie
2023-03-09 12:09 ` Maíra Canal
2023-03-07 17:34 ` Björn Roy Baron
2023-03-09 6:04 ` Asahi Lina
2023-03-09 20:24 ` Faith Ekstrand
2023-03-09 20:39 ` Karol Herbst
2023-03-10 6:21 ` Asahi Lina
2023-04-13 9:23 ` Daniel Vetter
2023-03-07 14:25 ` [PATCH RFC 02/18] rust: drm: Add Device and Driver abstractions Asahi Lina
2023-03-07 18:19 ` Björn Roy Baron
2023-03-09 6:10 ` Asahi Lina
2023-03-10 18:56 ` Boqun Feng
2023-03-11 5:41 ` Boqun Feng
2023-04-05 17:10 ` Daniel Vetter
2023-03-07 14:25 ` [PATCH RFC 03/18] rust: drm: file: Add File abstraction Asahi Lina
2023-03-09 21:16 ` Faith Ekstrand
2023-03-09 22:16 ` Asahi Lina
2023-03-13 17:49 ` Faith Ekstrand
2023-03-14 2:07 ` Boqun Feng
2023-04-05 11:25 ` Daniel Vetter
2023-03-07 14:25 ` [PATCH RFC 04/18] rust: drm: gem: Add GEM object abstraction Asahi Lina
2023-04-05 11:08 ` Daniel Vetter
2023-04-05 11:19 ` Miguel Ojeda
2023-04-05 11:22 ` Daniel Vetter
2023-04-05 12:32 ` Miguel Ojeda
2023-04-05 12:36 ` Daniel Vetter
2023-03-07 14:25 ` [PATCH RFC 05/18] drm/gem-shmem: Export VM ops functions Asahi Lina
2023-03-07 14:25 ` [PATCH RFC 06/18] rust: drm: gem: shmem: Add DRM shmem helper abstraction Asahi Lina
2023-03-08 13:38 ` Maíra Canal
2023-03-09 5:25 ` Asahi Lina
2023-03-09 11:47 ` Maíra Canal
2023-03-09 14:16 ` Asahi Lina
2023-03-07 14:25 ` [PATCH RFC 07/18] rust: drm: mm: Add DRM MM Range Allocator abstraction Asahi Lina
2023-04-06 14:15 ` Daniel Vetter
2023-04-06 15:28 ` Miguel Ojeda
2023-04-06 15:45 ` Daniel Vetter
2023-04-06 17:19 ` Miguel Ojeda
2023-04-06 15:53 ` Asahi Lina
2023-04-06 16:13 ` [Linaro-mm-sig] " Daniel Vetter
2023-04-06 16:39 ` Asahi Lina
2023-03-07 14:25 ` [PATCH RFC 08/18] rust: dma_fence: Add DMA Fence abstraction Asahi Lina
2023-04-05 11:10 ` Daniel Vetter
2023-03-07 14:25 ` [PATCH RFC 09/18] rust: drm: syncobj: Add DRM Sync Object abstraction Asahi Lina
2023-04-05 12:33 ` Daniel Vetter
2023-04-06 16:04 ` Asahi Lina
2023-03-07 14:25 ` [PATCH RFC 10/18] drm/scheduler: Add can_run_job callback Asahi Lina
2023-03-08 8:46 ` Christian König
2023-03-08 9:41 ` Asahi Lina
2023-03-08 10:00 ` Christian König
2023-03-08 14:53 ` Asahi Lina
2023-03-08 15:30 ` Christian König
2023-03-08 16:44 ` Asahi Lina
2023-03-08 17:57 ` Christian König
2023-03-08 19:05 ` Asahi Lina
2023-03-08 19:12 ` Christian König
2023-03-08 19:45 ` Asahi Lina
2023-03-08 20:14 ` Christian König
2023-03-09 6:30 ` Asahi Lina
2023-03-09 8:05 ` Christian König
2023-03-09 9:14 ` Asahi Lina
2023-03-09 18:50 ` Faith Ekstrand
2023-03-10 9:16 ` Asahi Lina
2023-03-08 12:39 ` Karol Herbst
2023-03-08 13:47 ` Christian König
2023-03-08 14:43 ` Karol Herbst
2023-03-08 15:02 ` Christian König
2023-03-08 15:19 ` Karol Herbst
2023-03-16 13:40 ` Daniel Vetter
2023-04-05 13:40 ` Daniel Vetter
2023-04-05 14:14 ` Christian König
2023-04-05 14:21 ` Daniel Vetter
2023-03-07 14:25 ` [PATCH RFC 11/18] drm/scheduler: Clean up jobs when the scheduler is torn down Asahi Lina
2023-03-08 9:57 ` Maarten Lankhorst
2023-03-08 10:03 ` Christian König
2023-03-08 15:18 ` Asahi Lina
2023-03-08 15:42 ` Christian König
2023-03-08 17:32 ` Asahi Lina
2023-03-08 18:12 ` Christian König [this message]
2023-03-08 19:37 ` Asahi Lina
2023-03-09 8:42 ` Christian König
2023-03-09 9:43 ` Asahi Lina
2023-03-09 11:47 ` Christian König
2023-03-09 13:48 ` Asahi Lina
2023-03-09 19:59 ` Faith Ekstrand
2023-03-10 9:58 ` Asahi Lina
2023-03-13 20:11 ` Faith Ekstrand
2023-03-08 17:39 ` alyssa
2023-03-08 17:44 ` Asahi Lina
2023-03-08 18:13 ` Christian König
2023-04-05 13:52 ` Daniel Vetter
2023-03-07 14:25 ` [PATCH RFC 12/18] rust: drm: sched: Add GPU scheduler abstraction Asahi Lina
2023-04-05 15:43 ` Daniel Vetter
2023-04-05 19:29 ` Daniel Vetter
2023-04-18 8:45 ` Daniel Vetter
2023-03-07 14:25 ` [PATCH RFC 13/18] drm/gem: Add a flag to control whether objects can be exported Asahi Lina
2023-04-05 14:55 ` Daniel Vetter
2023-03-07 14:25 ` [PATCH RFC 14/18] rust: drm: gem: Add set_exportable() method Asahi Lina
2023-03-07 14:25 ` [PATCH RFC 15/18] drm/asahi: Add the Asahi driver UAPI [DO NOT MERGE] Asahi Lina
2023-03-07 15:28 ` Karol Herbst
2023-03-07 14:25 ` [PATCH RFC 16/18] rust: bindings: Bind the Asahi DRM UAPI Asahi Lina
2023-03-07 14:25 ` [PATCH RFC 17/18] rust: macros: Add versions macro Asahi Lina
2023-03-07 16:17 ` [PATCH RFC 00/18] Rust DRM subsystem abstractions (& preview AGX driver) Asahi Lina
[not found] ` <20230307-rust-drm-v1-18-917ff5bc80a8@asahilina.net>
2023-04-05 14:44 ` [PATCH RFC 18/18] drm/asahi: Add the Asahi driver for Apple AGX GPUs Daniel Vetter
2023-04-06 5:02 ` Asahi Lina
2023-04-06 5:09 ` Asahi Lina
2023-04-06 11:25 ` [Linaro-mm-sig] " Daniel Vetter
2023-04-06 13:32 ` Asahi Lina
2023-04-06 13:54 ` Daniel Vetter
[not found] ` <ZC2HtBOaoUAzVCVH@phenom.ffwll.local>
2023-04-06 4:44 ` Asahi Lina
2023-04-06 5:09 ` Asahi Lina
2023-04-06 11:26 ` Daniel Vetter
2023-04-06 10:42 ` [Linaro-mm-sig] " Daniel Vetter
2023-04-06 11:55 ` Daniel Vetter
2023-04-06 13:15 ` Asahi Lina
2023-04-06 13:48 ` Daniel Vetter
2023-04-06 15:19 ` Asahi Lina
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e18500b5-21a0-77fd-8434-86258cefce5a@amd.com \
--to=christian.koenig@amd.com \
--cc=airlied@gmail.com \
--cc=alex.gaynor@gmail.com \
--cc=alyssa@rosenzweig.io \
--cc=asahi@lists.linux.dev \
--cc=bjorn3_gh@protonmail.com \
--cc=boqun.feng@gmail.com \
--cc=daniel@ffwll.ch \
--cc=dave.hansen@linux.intel.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=ella@iglunix.org \
--cc=faith.ekstrand@collabora.com \
--cc=gary@garyguo.net \
--cc=jarkko@kernel.org \
--cc=kherbst@redhat.com \
--cc=lina@asahilina.net \
--cc=linaro-mm-sig@lists.linaro.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-media@vger.kernel.org \
--cc=linux-sgx@vger.kernel.org \
--cc=luben.tuikov@amd.com \
--cc=maarten.lankhorst@linux.intel.com \
--cc=mary@mary.zone \
--cc=mripard@kernel.org \
--cc=ojeda@kernel.org \
--cc=rust-for-linux@vger.kernel.org \
--cc=sumit.semwal@linaro.org \
--cc=tzimmermann@suse.de \
--cc=wedsonaf@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).