linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Oded Gabbay <oded.gabbay@gmail.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Dave Airlie <airlied@gmail.com>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>,
	Jiho Chu <jiho.chu@samsung.com>, Arnd Bergmann <arnd@arndb.de>,
	"Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>
Subject: Re: New subsystem for acceleration devices
Date: Sun, 7 Aug 2022 14:25:33 +0300	[thread overview]
Message-ID: <CAFCwf107tLxHKxkPqSRsOHVVp5s2tDEFOOy2oDZUz_KGmv-rDg@mail.gmail.com> (raw)
In-Reply-To: <CAFCwf13QF_JdzNcpw==zzBoEQUYChMXfechotH31qmAfYZUGmg@mail.gmail.com>

On Sun, Aug 7, 2022 at 9:43 AM Oded Gabbay <oded.gabbay@gmail.com> wrote:
>
> On Fri, Aug 5, 2022 at 3:22 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Thu, Aug 04, 2022 at 08:48:28PM +0300, Oded Gabbay wrote:
> >
> > > > The flip is true of DRM - DRM is pretty general. I bet I could
> > > > implement an RDMA device under DRM - but that doesn't mean it should
> > > > be done.
> > > >
> > > > My biggest concern is that this subsystem not turn into a back door
> > > > for a bunch of abuse of kernel APIs going forward. Though things
> > > > are
> > >
> > > How do you suggest to make sure it won't happen ?
> >
> > Well, if you launch the subsystem then it is your job to make sure it
> > doesn't happen - or endure the complaints :)
> Understood, I'll make sure there is no room for complaints.
> >
> > Accelerators have this nasty tendancy to become co-designed with their
> > SOCs in nasty intricate ways and then there is a desire to punch
> > through all the inconvenient layers to make it go faster.
> >
> > > > better now, we still see this in DRM where expediency or performance
> > > > justifies hacky shortcuts instead of good in-kernel architecture. At
> > > > least DRM has reliable and experienced review these days.
> > > Definitely. DRM has some parts that are really well written. For
> > > example, the whole char device handling with sysfs/debugfs and managed
> > > resources code.
> >
> > Arguably this should all be common code in the driver core/etc - what
> > value is a subsystem adding beyond that besides using it properly?
>
> I mainly see two things here:
>
> 1. If there is a subsystem which is responsible for creating and
> exposing the device character files, then there should be some code
> that connects between each device driver to that subsystem.
> i.e. There should be functions that each driver should call from its
> probe and release callback functions.
>
> Those functions should take care of the following:
> - Create metadata for the device, the device's minor(s) and the
> driver's ioctls table and driver's callback for file operations (both
> are common for all the driver's devices). Save all that metadata with
> proper locking.
> - Create the device char files themselves and supply file operations
> that will be called per each open/close/mmap/etc.
> - Keep track of all these objects' lifetime in regard to the device
> driver's lifetime, with proper handling for release.
> - Add common handling and entries of sysfs/debugfs for these devices
> with the ability for each device driver to add their own unique
> entries.
>
> 2. I think that you underestimate (due to your experience) the "using
> it properly" part... It is not so easy to do this properly for
> inexperienced kernel people. If we provide all the code I mentioned
> above, the device driver writer doesn't need to be aware of all these
> kernel APIs.
>
Two more points I thought of as examples for added value to be done in
common code:
1. Common code for handling dma-buf. Very similar to what was added a
year ago to rdma. This can be accompanied by a common ioctl to export
and import a dma-buf.
2. Common code to handle drivers that want to allow a single user at a
time to run open the device char file.

Oded


> >
> > It would be nice to at least identify something that could obviously
> > be common, like some kind of enumeration and metadata kind of stuff
> > (think like ethtool, devlink, rdma tool, nvemctl etc)
> Definitely. I think we can have at least one ioctl that will be common
> to all drivers from the start.
> A kind of information retrieval ioctl. There are many information
> points that I'm sure are common to most accelerators. We have an
> extensive information ioctl in the habanalabs driver and most of the
> information there is not habana specific imo.
> >
> > > I think that it is clear from my previous email what I intended to
> > > provide. A clean, simple framework for devices to register with, get
> > > services for the most basic stuff such as device char handling,
> > > sysfs/debugfs.
> >
> > This should all be trivially done by any driver using the core codes,
> > if you see gaps I'd ask why not improve the core code?
> >
> > > Later on, add more simple stuff such as memory manager
> > > and uapi for memory handling. I guess someone can say all that exists
> > > in drm, but like I said it exists in other subsystems as well.
> >
> > This makes sense to me, all accelerators need a way to set a memory
> > map, but on the other hand we are doing some very nifty stuff in this
> > area with iommufd and it might be very interesting to just have the
> > accelerator driver link to that API instead of building yet another
> > copy of pin_user_pages() code.. Especially with PASID likely becoming
> > part of any accelerator toolkit.
> Here I disagree with you. First of all, there are many relatively
> simple accelerators, especially in edge, where PASID is really not
> relevant.
> Second, even for the more sophisticated PCIe/CXL-based ones, PASID is
> not mandatory and I suspect that it won't be in 100% of those devices.
> But definitely that should be an alternative to the "classic" way of
> handling dma'able memory (pin_user_pages()).
> >
> > > I want to be perfectly honest and say there is nothing special here
> > > for AI. It's actually the opposite, it is a generic framework for
> > > compute only. Think of it as an easier path to upstream if you just
> > > want to do compute acceleration. Maybe in the future it will be more,
> > > but I can't predict the future.
> >
> > I can't either, and to be clear I'm only questioning the merit of a
> > "subsystem" eg with a struct class, rigerous uAPI, etc.
> >
> > The idea that these kinds of accel drivers deserve specialized review
> > makes sense to me, even if they remain as unorganized char
> > devices. Just understand that is what you are signing up for :)
> I understand. That's why I'm taking all your points very seriously.
> This is not a decision that should be taken lightly and I want to be
> sure most agree this is the correct way forward.
> My next step is to talk to Dave about it in-depth. In his other email
> he wrote some interesting ideas which I want to discuss with him.
>
> Maybe this is something that should be discussed in the kernel summit ?
>
> >
> > Jason

  reply	other threads:[~2022-08-07 11:26 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20220731114605epcas1p1afff6b948f542e2062b60d49a8023f6f@epcas1p1.samsung.com>
2022-07-31 11:45 ` New subsystem for acceleration devices Oded Gabbay
2022-07-31 15:37   ` Greg Kroah-Hartman
2022-08-01  2:29     ` yuji2.ishikawa
2022-08-01  8:21       ` Oded Gabbay
2022-08-03  4:39         ` yuji2.ishikawa
2022-08-03  5:34           ` Greg KH
2022-08-03 20:28           ` Oded Gabbay
2022-08-02 17:25   ` Jiho Chu
2022-08-02 19:07     ` Oded Gabbay
2022-08-03 19:04   ` Dave Airlie
2022-08-03 20:20     ` Oded Gabbay
2022-08-03 23:31       ` Daniel Stone
2022-08-04  6:46         ` Oded Gabbay
2022-08-04  9:27           ` Jiho Chu
2022-08-03 23:54       ` Dave Airlie
2022-08-04  7:43         ` Oded Gabbay
2022-08-04 14:50           ` Jason Gunthorpe
2022-08-04 17:48             ` Oded Gabbay
2022-08-05  0:22               ` Jason Gunthorpe
2022-08-07  6:43                 ` Oded Gabbay
2022-08-07 11:25                   ` Oded Gabbay [this message]
2022-08-08  6:10                     ` Greg Kroah-Hartman
2022-08-08 17:55                       ` Jason Gunthorpe
2022-08-09  6:23                         ` Greg Kroah-Hartman
2022-08-09  8:04                           ` Christoph Hellwig
2022-08-09  8:32                             ` Arnd Bergmann
2022-08-09 12:18                               ` Jason Gunthorpe
2022-08-09 12:46                                 ` Arnd Bergmann
2022-08-09 14:22                                   ` Jason Gunthorpe
2022-08-09  8:45                             ` Greg Kroah-Hartman
2022-08-08 17:46                   ` Jason Gunthorpe
2022-08-08 20:26                     ` Oded Gabbay
2022-08-09 12:43                       ` Jason Gunthorpe
2022-08-05  3:02           ` Dave Airlie
2022-08-07  6:50             ` Oded Gabbay
2022-08-09 21:42               ` Oded Gabbay
2022-08-10  9:00                 ` Jiho Chu
2022-08-10 14:05                 ` yuji2.ishikawa
2022-08-10 14:37                   ` Oded Gabbay
2022-08-23 18:23                 ` Kevin Hilman
2022-08-23 20:45                   ` Oded Gabbay
2022-08-29 20:54                     ` Kevin Hilman
2022-09-23 16:21                       ` Oded Gabbay
2022-09-26  8:16                         ` Christoph Hellwig
2022-09-29  6:50                           ` Oded Gabbay
2022-08-04 12:00         ` Tvrtko Ursulin
2022-08-04 15:03           ` Jeffrey Hugo
2022-08-04 17:53             ` Oded Gabbay

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFCwf107tLxHKxkPqSRsOHVVp5s2tDEFOOy2oDZUz_KGmv-rDg@mail.gmail.com \
    --to=oded.gabbay@gmail.com \
    --cc=airlied@gmail.com \
    --cc=arnd@arndb.de \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gregkh@linuxfoundation.org \
    --cc=jgg@nvidia.com \
    --cc=jiho.chu@samsung.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=yuji2.ishikawa@toshiba.co.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).