ksummit.lists.linux.dev archive mirror
From: Leon Romanovsky <leon@kernel.org>
To: Greg KH <greg@kroah.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Josh Triplett <josh@joshtriplett.org>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Jonathan Corbet <corbet@lwn.net>,
	ksummit@lists.linux.dev
Subject: Re: [MAINTAINER SUMMIT] User-space requirements for accelerator drivers
Date: Sun, 12 Sep 2021 17:15:30 +0300
Message-ID: <YT4LgkK+ejUWljEh@unreal>
In-Reply-To: <YT3/5kJuhw/QVqrw@kroah.com>

On Sun, Sep 12, 2021 at 03:25:58PM +0200, Greg KH wrote:
> On Sun, Sep 12, 2021 at 11:29:45AM +0300, Leon Romanovsky wrote:
> > On Sun, Sep 12, 2021 at 09:26:57AM +0200, Greg KH wrote:
> > > On Sun, Sep 12, 2021 at 07:27:55AM +0300, Leon Romanovsky wrote:
> > > > On Sun, Sep 12, 2021 at 01:04:01AM +0300, Laurent Pinchart wrote:
> > > > > Hi Leon,
> > > > > 
> > > > > On Sat, Sep 11, 2021 at 03:04:07PM +0300, Leon Romanovsky wrote:
> > > > > > On Sat, Sep 11, 2021 at 02:41:52PM +0300, Laurent Pinchart wrote:
> > > > > > > On Sat, Sep 11, 2021 at 01:31:02PM +0300, Leon Romanovsky wrote:
> > > > > > > > On Sat, Sep 11, 2021 at 01:55:16AM +0200, Thomas Gleixner wrote:
> > > > > > > > > On Fri, Sep 10 2021 at 16:45, Josh Triplett wrote:
> > > > > > > > > 
> > > > > > > > > > On Sat, Sep 11, 2021 at 12:52:14AM +0200, Mauro Carvalho Chehab wrote:
> > > > > > > > > >> On media, enforcing userspace to always be open source would
> > > > > > > > > >> have been very bad, as it would prevent several videoconferencing 
> > > > > > > > > >> software to exist on Linux.
> > > > > > > > > >
> > > > > > > > > > I don't think we should enforce that all userspace users of an interface
> > > > > > > > > > be Open Source. I do think we should enforce that *some* userspace user
> > > > > > > > > > of an interface be Open Source before we add the interface.
> > > > > > > > > 
> > > > > > > > > The real question is whether the interface is documented in a way that
> > > > > > > > > an Open Source implementation is possible. It does not matter whether it
> > > > > > > > > exists at that point in time or not. Even if it exists there is no
> > > > > > > > > guarantee that it is feature complete.
> > > > > > > > > 
> > > > > > > > > Freely accessible documentation is really the key.
> > > > > > > > 
> > > > > > > > > I have a more radical view than you and think that documentation is far
> > > > > > > > > from being enough. I would like to see any userspace API used (or to be
> > > > > > > > > used) in any package which exists in Debian/Fedora/SuSE.
> > > > > > > 
> > > > > > > We probably need to add Android AOSP to that list, as we have
> > > > > > > Android-specific APIs (not that I believe we *should* have
> > > > > > > Android-specific APIs, there's been lots of efforts over the past years
> > > > > > > to develop standard APIs for use cases that stem from Android, slowly
> > > > > > > replacing Android-specific APIs in some area, but I don't believe we can
> > > > > > > > realistically bridge that gap completely overnight, if ever).
> > > > > > 
> > > > > > Maybe.
> > > > > > 
> > > > > > > > Only this will give us some sort of confidence that API and device are usable
> > > > > > > > to some level. As a side note, we will be able to estimate possible API
> > > > > > > > deprecation/fix/extension based on simple search in package databases.
> > > > > > > 
> > > > > > > Linux supports devices from very diverse markets, from very tiny
> > > > > > > embedded devices to supercomputers. We have drivers for devices that
> > > > > > > exist in data centres of a single company only, or for which only a
> > > > > > > handful of units exist through the world. The set of rules that we'll
> > > > > > > decide on, if any, should take this into account.
> > > > > > 
> > > > > > I'm part of that group (RDMA) who cares about enterprise, cloud and supercomputers. :)
> > > > > > So for us, working out-of-the box (distro packages and not github code drops) is
> > > > > > the key to the scalability.
> > > > > 
> > > > > What if we're dealing with a device that only exists in a handful of
> > > > > machines though ? Would distributions accept the burden of packaging
> > > > > corresponding userspace code, and maintaining the packages, when only a
> > > > > handful of people in the world will use it ? It's a genuine question.
> > > > 
> > > > Fedora, Debian and OpenSuSE are volunteer based distributions, they
> > > > accept new packages, which need to be prepared (or asked to be
> > > > prepared) by such vendors.
> > > > 
> > > > There is no "accept the burden of packaging corresponding userspace code,
> > > > and maintaining the packages", it is on package maintainer who can or
> > > > can't be associated with distribution.
> > > > 
> > > > > 
> > > > > > Regarding "embedded devices", I remind that we are talking about
> > > > > > userspace API and most likely busybox will be used for them, which is
> > > > > > > also part of a larger distro anyway, so falls under the category "exists in
> > > > > > Debian/Fedora/SuSE".
> > > > > 
> > > > > We're talking about APIs exposed by drivers, for devices such as GPUs,
> > > > > cameras or AI/ML accelerators. I don't think busybox will exercise those
> > > > > > :-) We have Mesa for GPUs, libcamera for cameras, and other frameworks
> > > > > I'm less familiar with for AI/ML accelerators, and I expect those to be
> > > > > packaged by distributions. There are however other kind of devices that
> > > > > don't fall in existing well-defined categories.
> > > > 
> > > > > I'm a little bit confused here. IMHO, you are trying to find a universal
> > > > solution for a problem that doesn't exist.
> > > > 
> > > > Above you asked how to deal with niche devices? Here you talk about mass
> > > > products devices for the enterprise while before you mentioned "embedded
> > > > devices".
> > > > 
> > > > 1. Niche devices - continue to do as they do it now, by supplying
> > > > out-of-tree solutions for their customers. Such devices and companies
> > > > rarely need upstream linux kernel support, because the burden to
> > > > upstream it is very high. We don't want them in the tree either, because
> > > > once they upstream it, the maintenance burden will be on us.
> > > 
> > > {sigh}
> > > 
> > > No, that is NOT our rule at all.
> > > 
> > > These devices and companies need to be upstream more than anything else
> > > as that way they become part of our community and are responsible for
> > > maintaining their code in the tree.  To force them to remain outside is
> > > to go against everything that many of us have been saying for _decades_
> > > now.
> > > 
> > > And how are you going to judge what is, and is not, a "niche" device?
> > 
> > I will leave to that company to decide. Again this is exactly how they
> > operate now, there is nothing new here. Every company calculates ROI
> > for working with upstream and small companies with niche devices are not
> > different here.
> > 
> > > The main idea is that I want to see a working userspace stack, and being in
> > > a distro sets a certain quality level. Am I asking too much?
> 
> Define "working userspace stack" and "distro" please.  Like others have
> said, many distros will not take userspace code unless it's already in
> the kernel tree first, as that ensures that the abi will not break.

Like I already answered
https://lore.kernel.org/all/YT2zryAKHc%2F5R2IH@unreal/
"To be used" means some open PR to existing package or request for
inclusion for new packages.

> 
> > > > > 2. Devices that hit a certain level of adoption - need to be
> > > > > integrated into a certain userspace stack, which needs to be part of
> > > > > a distro.
> > > 
> > > Distros are a very odd rule to rely on given that they are by far the
> > > minority of the usage in raw numbers for Linux in the world.
> > 
> > You can count Android as another distro, it is just semantics.
> 
> But how do you define Android's userspace?  Just one vendor?  2 vendors?
> 10 vendors?  There is major userspace fragmentation in Android userspace
> in many places, the user/kernel boundary being one of the big ones as
> many of us have found out over the past years.  And many of us are
> working to resolve this, but it's not so simple at times, and I have
> many examples if you want specifics.

Laurent suggested AOSP
https://lore.kernel.org/all/YTyWANV%2FmSkQbYhj@pendragon.ideasonboard.com/

> 
> > > > > And AI/ML is no different here, someone just needs to start building such
> > > > > a stack. Otherwise, we will continue to see more free riders like HabanaLabs
> > > > > which don't have any real benefit to the community.
> > > 
> > > Everyone contributes to Linux in a selfish manner, that's just how the
> > > > community works.  The work that companies like habanalabs do is NOT being a
> > > "free rider" at all, they have worked with us and done the hard work of
> > > actually getting their code merged into the tree.
> > 
> > I perfectly remember them trying to bypass netdev and RDMA communities
> > > by pretending to be a "misc" device.
> > 
> > https://lore.kernel.org/linux-rdma/20200915133556.21268811@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/
> > https://lore.kernel.org/linux-rdma/20200917171833.GJ8409@ziepe.ca/
> > 
> > Or DRM
> > https://lore.kernel.org/linux-rdma/CAKMK7uFOfoxbD2Z5mb-qHFnUe5rObGKQ6Ygh--HSH9M=9bziGg@mail.gmail.com/
> > 
> > So I can agree with the statement "worked hard", but not with the
> > relevant communities.
> 
> I point at these as doing exactly what we want vendors to be doing!
> Thank you for finding the good examples.  This is a vendor submitting
> patches and saying, "here is what we want to do, with a first cut at
> doing it."  It's up to us as a community to tell them if they are doing
> it the right way or not.
> 
> If we just let them all go their own ways, they will come up with
> horrible apis and interfaces, we have all seen that before.
> 
> So by working together, we both can learn from, and work together to
> solve the issue.  And that is what these driver authors and company has
> been doing!  They are part of our community, why are you saying they
> should now just go do their own thing away from us?

This is not what I said. I don't see Intel (habanalabs) as a company
that can't create a proper AI stack, and I think that it is our
responsibility to provide them with enough incentive to do it.

> 
> And as for "bypassing", that feels very mean.  We have had accelerator
> code in the char/misc and other parts of the kernel tree since at least
> 2018 if not earlier (I didn't look all that hard.)  Just because someone
> wanted to use the in-kernel apis that are there (why is dma-buf some
> magic thing?) does not mean that they suddenly need to move to a
> different subsystem.

Because the dma-buf API has specific semantics and was designed with a very
specific usage model in mind.

> 
> We get at least 1-2 new subsystems and major drivers that get added to
> the kernel tree that do things that have never been done before with
> custom user/kernel apis every kernel release.  Not everything can be a
> standard api no matter how much I, and others, wish it were.

So when will you draw a line and ask to create a proper subsystem
with standard APIs? After 2, 3 ... 100 similar (from our point of view)
and different (from the vendor's point of view) devices with custom APIs?

> 
> As examples, what about the hyperv blob api that was submitted recently
> going around the block layer?  What about the new Intel accelerator that
> added yet-another-set-of-custom-ioctls?  What about the rpi drivers?
> What about the virtualbox drivers?  Should all of those just live
> outside of the kernel for forever?
> 
> Of course not.

So what is your bar? Accept everything?

Thanks

> 
> thanks,
> 
> greg k-h
