From: Oded Gabbay <firstname.lastname@example.org>
To: Jason Gunthorpe <email@example.com>
Cc: Dave Airlie <firstname.lastname@example.org>,
Greg Kroah-Hartman <email@example.com>,
Yuji Ishikawa <firstname.lastname@example.org>,
Jiho Chu <email@example.com>, Arnd Bergmann <firstname.lastname@example.org>,
"Linux-Kernel@Vger.Kernel.Org" <email@example.com>
Subject: Re: New subsystem for acceleration devices
Date: Sun, 7 Aug 2022 14:25:33 +0300
Message-ID: <CAFCwf107tLxHKxkPqSRsOHVVp5s2tDEFOOy2oDZUz_KGmv-rDg@mail.gmail.com>
On Sun, Aug 7, 2022 at 9:43 AM Oded Gabbay <firstname.lastname@example.org> wrote:
> On Fri, Aug 5, 2022 at 3:22 AM Jason Gunthorpe <email@example.com> wrote:
> > On Thu, Aug 04, 2022 at 08:48:28PM +0300, Oded Gabbay wrote:
> > > > The flip is true of DRM - DRM is pretty general. I bet I could
> > > > implement an RDMA device under DRM - but that doesn't mean it should
> > > > be done.
> > > >
> > > > My biggest concern is that this subsystem not turn into a back door
> > > > for a bunch of abuse of kernel APIs going forward. Though things
> > > > are
> > >
> > > How do you suggest to make sure it won't happen ?
> > Well, if you launch the subsystem then it is your job to make sure it
> > doesn't happen - or endure the complaints :)
> Understood, I'll make sure there is no room for complaints.
> > Accelerators have this nasty tendency to become co-designed with their
> > SOCs in nasty intricate ways and then there is a desire to punch
> > through all the inconvenient layers to make it go faster.
> > > > better now, we still see this in DRM where expediency or performance
> > > > justifies hacky shortcuts instead of good in-kernel architecture. At
> > > > least DRM has reliable and experienced review these days.
> > > Definitely. DRM has some parts that are really well written. For
> > > example, the whole char device handling with sysfs/debugfs and managed
> > > resources code.
> > Arguably this should all be common code in the driver core/etc - what
> > value is a subsystem adding beyond that besides using it properly?
> I mainly see two things here:
> 1. If there is a subsystem which is responsible for creating and
> exposing the device character files, then there should be some code
> that connects each device driver to that subsystem.
> i.e. There should be functions that each driver should call from its
> probe and release callback functions.
> Those functions should take care of the following:
> - Create metadata for the device, the device's minor(s) and the
> driver's ioctls table and driver's callback for file operations (both
> are common for all the driver's devices). Save all that metadata with
> proper locking.
> - Create the device char files themselves and supply file operations
> that will be called per each open/close/mmap/etc.
> - Keep track of all these objects' lifetime in regard to the device
> driver's lifetime, with proper handling for release.
> - Add common handling and entries of sysfs/debugfs for these devices
> with the ability for each device driver to add their own unique entries.
> 2. I think that you underestimate (due to your experience) the "using
> it properly" part... It is not so easy to do this properly for
> inexperienced kernel people. If we provide all the code I mentioned
> above, the device driver writer doesn't need to be aware of all these
> kernel APIs.
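To make point 1 a bit more concrete, here is a minimal user-space sketch
of the bookkeeping I have in mind. It is not kernel code and not a
proposed API: every name in it (accel_register, accel_unregister,
struct accel_driver, struct accel_device, ACCEL_MAX_MINORS) is
hypothetical, and a pthread mutex stands in for whatever locking the
real code would use. It only shows the shape: one call from probe that
records the driver's callbacks and hands back a minor, and one call from
the release path that frees it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ACCEL_MAX_MINORS 8 /* arbitrary limit chosen for this sketch */

/* Hypothetical per-driver description, shared by all of a driver's devices. */
struct accel_driver {
	const char *name;
	int (*open)(void *dev_priv);     /* driver callback for open() */
	void (*release)(void *dev_priv); /* driver callback for close() */
};

/* Hypothetical per-device metadata kept by the subsystem. */
struct accel_device {
	const struct accel_driver *drv;
	void *dev_priv;
	int in_use;
};

static struct accel_device accel_minors[ACCEL_MAX_MINORS];
static pthread_mutex_t accel_lock = PTHREAD_MUTEX_INITIALIZER;

/* Would be called from the driver's probe(); returns a minor, or -1 if full. */
int accel_register(const struct accel_driver *drv, void *dev_priv)
{
	int minor = -1;

	assert(drv != NULL);
	pthread_mutex_lock(&accel_lock);
	for (int i = 0; i < ACCEL_MAX_MINORS; i++) {
		if (!accel_minors[i].in_use) {
			accel_minors[i].drv = drv;
			accel_minors[i].dev_priv = dev_priv;
			accel_minors[i].in_use = 1;
			minor = i;
			break;
		}
	}
	pthread_mutex_unlock(&accel_lock);
	return minor;
}

/* Would be called from the driver's release path; returns 0, or -1 if the
 * minor is out of range or was not registered. */
int accel_unregister(int minor)
{
	int ret = -1;

	if (minor < 0 || minor >= ACCEL_MAX_MINORS)
		return ret;
	pthread_mutex_lock(&accel_lock);
	if (accel_minors[minor].in_use) {
		accel_minors[minor] = (struct accel_device){0};
		ret = 0;
	}
	pthread_mutex_unlock(&accel_lock);
	return ret;
}
```

A driver would keep the returned minor in its own device structure and
pass it back on teardown. The sysfs/debugfs part is left out of the
sketch on purpose.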
Two more points I thought of as examples for added value to be done in common code:
1. Common code for handling dma-buf. Very similar to what was added a
year ago to rdma. This can be accompanied by a common ioctl to export
and import a dma-buf.
2. Common code to handle drivers that want to allow a single user at a
time to open the device char file.
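As a rough illustration of the second point, the single-user policy
could be as small as the following user-space sketch. The names
(struct accel_excl, accel_excl_open, accel_excl_release) are made up for
this mail, and C11 atomics stand in for the kernel primitives a real
implementation would use. The first opener wins, later openers get
-EBUSY until the device is released.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical per-device state for "one opener at a time". */
struct accel_excl {
	atomic_flag busy;
};

#define ACCEL_EXCL_INIT { ATOMIC_FLAG_INIT }

/* Common open path: returns 0 for the first opener, -EBUSY otherwise. */
int accel_excl_open(struct accel_excl *e)
{
	/* test_and_set returns the previous value: true means already open. */
	if (atomic_flag_test_and_set(&e->busy))
		return -EBUSY;
	return 0;
}

/* Common release path: lets the next opener in. */
void accel_excl_release(struct accel_excl *e)
{
	atomic_flag_clear(&e->busy);
}
```

Drivers that allow multiple users would simply not opt in to this check.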
> > It would be nice to at least identify something that could obviously
> > be common, like some kind of enumeration and metadata kind of stuff
> > (think like ethtool, devlink, rdma tool, nvemctl etc)
> Definitely. I think we can have at least one ioctl that will be common
> to all drivers from the start.
> A kind of information retrieval ioctl. There are many information
> points that I'm sure are common to most accelerators. We have an
> extensive information ioctl in the habanalabs driver and most of the
> information there is not habana specific imo.
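A possible shape for such a common information query, again only as a
user-space sketch. The opcodes and fields below (device memory size,
number of engines, firmware version) are my guesses at what might be
common; they are not taken from the habanalabs uAPI, and all names are
hypothetical. The point is only that one opcode-based entry point can
serve every driver, with unknown opcodes rejected.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a common "info" ioctl. */
enum accel_info_op {
	ACCEL_INFO_DEVICE_MEM_SIZE,
	ACCEL_INFO_NUM_ENGINES,
	ACCEL_INFO_FW_VERSION,
};

/* Hypothetical description each driver would fill in at probe time. */
struct accel_hw_desc {
	uint64_t device_mem_size; /* bytes */
	uint32_t num_engines;
	uint32_t fw_version;
};

/* Common handler: writes the requested value to *out.
 * Returns 0 on success, -EINVAL for a bad pointer or unknown opcode. */
int accel_info_query(const struct accel_hw_desc *hw, uint32_t op,
		     uint64_t *out)
{
	if (hw == NULL || out == NULL)
		return -EINVAL;

	switch (op) {
	case ACCEL_INFO_DEVICE_MEM_SIZE:
		*out = hw->device_mem_size;
		return 0;
	case ACCEL_INFO_NUM_ENGINES:
		*out = hw->num_engines;
		return 0;
	case ACCEL_INFO_FW_VERSION:
		*out = hw->fw_version;
		return 0;
	default:
		return -EINVAL;
	}
}
```

Driver-specific information would stay behind driver-specific opcodes or
ioctls; only the common subset would go through this path.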
> > > I think that it is clear from my previous email what I intended to
> > > provide. A clean, simple framework for devices to register with, get
> > > services for the most basic stuff such as device char handling,
> > > sysfs/debugfs.
> > This should all be trivially done by any driver using the core codes,
> > if you see gaps I'd ask why not improve the core code?
> > > Later on, add more simple stuff such as memory manager
> > > and uapi for memory handling. I guess someone can say all that exists
> > > in drm, but like I said it exists in other subsystems as well.
> > This makes sense to me, all accelerators need a way to set a memory
> > map, but on the other hand we are doing some very nifty stuff in this
> > area with iommufd and it might be very interesting to just have the
> > accelerator driver link to that API instead of building yet another
> > copy of pin_user_pages() code.. Especially with PASID likely becoming
> > part of any accelerator toolkit.
> Here I disagree with you. First of all, there are many relatively
> simple accelerators, especially in edge, where PASID is really not
> Second, even for the more sophisticated PCIe/CXL-based ones, PASID is
> not mandatory and I suspect that it won't be in 100% of those devices.
> But definitely that should be an alternative to the "classic" way of
> handling dma'able memory (pin_user_pages()).
> > > I want to be perfectly honest and say there is nothing special here
> > > for AI. It's actually the opposite, it is a generic framework for
> > > compute only. Think of it as an easier path to upstream if you just
> > > want to do compute acceleration. Maybe in the future it will be more,
> > > but I can't predict the future.
> > I can't either, and to be clear I'm only questioning the merit of a
> > "subsystem" e.g. with a struct class, rigorous uAPI, etc.
> > The idea that these kinds of accel drivers deserve specialized review
> > makes sense to me, even if they remain as unorganized char
> > devices. Just understand that is what you are signing up for :)
> I understand. That's why I'm taking all your points very seriously.
> This is not a decision that should be taken lightly and I want to be
> sure most agree this is the correct way forward.
> My next step is to talk to Dave about it in-depth. In his other email
> he wrote some interesting ideas which I want to discuss with him.
> Maybe this is something that should be discussed at the kernel summit?
> > Jason