From: Jason Gunthorpe <firstname.lastname@example.org>
To: Oded Gabbay <email@example.com>
Cc: Dave Airlie <firstname.lastname@example.org>,
Greg Kroah-Hartman <email@example.com>,
Yuji Ishikawa <firstname.lastname@example.org>,
Jiho Chu <email@example.com>, Arnd Bergmann <firstname.lastname@example.org>,
"Linux-Kernel@Vger. Kernel. Org" <email@example.com>
Subject: Re: New subsystem for acceleration devices
Date: Mon, 8 Aug 2022 14:46:27 -0300
Message-ID: <YvFL8/g+xbrhLzEr@nvidia.com>
On Sun, Aug 07, 2022 at 09:43:40AM +0300, Oded Gabbay wrote:
> 1. If there is a subsystem which is responsible for creating and
> exposing the device character files, then there should be some code
> that connects between each device driver to that subsystem.
> i.e. There should be functions that each driver should call from its
> probe and release callback functions.
> Those functions should take care of the following:
> - Create metadata for the device, the device's minor(s) and the
> driver's ioctls table and driver's callback for file operations (both
> are common for all the driver's devices). Save all that metadata with
> proper locking.
> - Create the device char files themselves and supply file operations
> that will be called per each open/close/mmap/etc.
> - Keep track of all these objects' lifetime in regard to the device
> driver's lifetime, with proper handling for release.
> - Add common handling and entries of sysfs/debugfs for these devices
> with the ability for each device driver to add their own unique
> 2. I think that you underestimate (due to your experience) the "using
> it properly" part... It is not so easy to do this properly for
> inexperienced kernel people. If we provide all the code I mentioned
> above, the device driver writer doesn't need to be aware of all these
> kernel APIs.
This may be, but it still seems weird to me to justify a subsystem as
"making existing APIs simpler so drivers don't mess them up". It
suggests perhaps we need additional core API helpers?
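
As a rough illustration of the kind of helper being discussed, here is a minimal userspace sketch, not kernel code. It keeps a locked table of minors and lets a driver register and unregister a device along with its shared callbacks. Every name in it (accel_register, accel_unregister, struct accel_driver_ops, ACCEL_MAX_MINORS) is hypothetical and exists only for this example.

```c
#include <pthread.h>
#include <stddef.h>

#define ACCEL_MAX_MINORS 8 /* hypothetical limit, chosen for the sketch */

/* Hypothetical callbacks, common to all of one driver's devices. */
struct accel_driver_ops {
    int (*open)(void *dev);
    void (*release)(void *dev);
};

/* One slot of per-device metadata. */
struct accel_minor {
    const struct accel_driver_ops *ops;
    void *dev;
    int in_use;
};

static struct accel_minor minors[ACCEL_MAX_MINORS];
static pthread_mutex_t minors_lock = PTHREAD_MUTEX_INITIALIZER;

/* Meant to be called from a driver's probe path.
 * Returns the allocated minor, or -1 if the table is full. */
int accel_register(const struct accel_driver_ops *ops, void *dev)
{
    int minor = -1;

    pthread_mutex_lock(&minors_lock);
    for (int i = 0; i < ACCEL_MAX_MINORS; i++) {
        if (!minors[i].in_use) {
            minors[i] = (struct accel_minor){ .ops = ops, .dev = dev, .in_use = 1 };
            minor = i;
            break;
        }
    }
    pthread_mutex_unlock(&minors_lock);
    return minor;
}

/* Meant to be called from a driver's remove path.
 * Returns 0 on success, -1 if the minor was not registered. */
int accel_unregister(int minor)
{
    int ret = -1;

    if (minor < 0 || minor >= ACCEL_MAX_MINORS)
        return -1;

    pthread_mutex_lock(&minors_lock);
    if (minors[minor].in_use) {
        minors[minor] = (struct accel_minor){0};
        ret = 0;
    }
    pthread_mutex_unlock(&minors_lock);
    return ret;
}
```

The sketch covers only the bookkeeping and locking. Creating the char device nodes and tying object lifetime to the driver's lifetime are left out.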
> > It would be nice to at least identify something that could obviously
> > be common, like some kind of enumeration and metadata kind of stuff
> > (think like ethtool, devlink, rdma tool, nvemctl etc)
> Definitely. I think we can have at least one ioctl that will be common
> to all drivers from the start.
Generally you don't want that as an ioctl, because you have to open the
device to execute it. There may be permission issues, for instance, or
if you have a single-open-at-a-time model like VFIO, then the two
don't work well together.
Usually this would be sysfs or netlink.
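
To make the difference concrete, here is a small userspace sketch of reading a metadata attribute as a plain text file, which is how sysfs attributes are consumed. The helper name read_attr and any attribute path passed to it are assumptions for illustration; nothing here refers to an existing accelerator sysfs layout.

```c
#include <stdio.h>
#include <string.h>

/* Read a one-line text attribute into buf and strip the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or read. */
int read_attr(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (len == 0)
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;

    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);

    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

A tool would call this on an attribute file under the device's sysfs directory. It never opens the device node itself, so it is not blocked by the node's permissions or by a single-open policy.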
> > This makes sense to me, all accelerators need a way to set a memory
> > map, but on the other hand we are doing some very nifty stuff in this
> > area with iommufd and it might be very interesting to just have the
> > accelerator driver link to that API instead of building yet another
> > copy of pin_user_pages() code.. Especially with PASID likely becoming
> > part of any accelerator toolkit.
> Here I disagree with you. First of all, there are many relatively
> simple accelerators, especially in edge, where PASID is really not
> Second, even for the more sophisticated PCIe/CXL-based ones, PASID is
> not mandatory and I suspect that it won't be in 100% of those devices.
> But definitely that should be an alternative to the "classic" way of
> handling dma'able memory (pin_user_pages()).
My point was that iommufd can do the pinning for you and dump the
result into an IOMMU-based PASID, or it can do the pinning for you and
allow the driver to translate it into its own page table format, e.g.
the ASID in the Habana device.
We don't need to have map/unmap APIs to manage address spaces in every
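
A toy userspace sketch of the second case, where a shared layer has already pinned the pages and the driver only translates the result into its own format. This is not the iommufd API. The types struct pinned_range and struct drv_pgtable, the function drv_map_pinned, and the flat PTE layout are all invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12   /* assumes 4 KiB pages */
#define DRV_PTE_VALID     1ull /* hypothetical "valid" bit in a driver PTE */

/* Hypothetical output of a shared pinning layer:
 * one physical address per pinned page, plus the device-visible start. */
struct pinned_range {
    uint64_t iova;
    const uint64_t *phys;
    size_t npages;
};

/* Hypothetical driver-private page table: a flat array indexed by
 * IOVA page number. Real devices use their own multi-level formats. */
struct drv_pgtable {
    uint64_t *pte;
    size_t nentries;
};

/* Driver-side step: consume the shared pin list and write driver PTEs.
 * The driver does no pinning of its own. Returns 0 on success, -1 if the
 * range does not fit in the table. */
int drv_map_pinned(struct drv_pgtable *pt, const struct pinned_range *r)
{
    size_t first = (size_t)(r->iova >> SKETCH_PAGE_SHIFT);

    if (first > pt->nentries || r->npages > pt->nentries - first)
        return -1;

    for (size_t i = 0; i < r->npages; i++)
        pt->pte[first + i] = r->phys[i] | DRV_PTE_VALID;

    return 0;
}
```

The pinning logic lives in one place, and each driver supplies only the translation into its own table format.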
> Maybe this is something that should be discussed in the kernel summit ?
Maybe. I expect to be at LPC at least.