From: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
To: Greg KH <greg@kroah.com>
Cc: Leon Romanovsky <leon@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Josh Triplett <josh@joshtriplett.org>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Jonathan Corbet <corbet@lwn.net>,
	ksummit@lists.linux.dev
Subject: Re: [MAINTAINER SUMMIT] User-space requirements for accelerator drivers
Date: Sun, 12 Sep 2021 19:41:35 +0300
Message-ID: <YT4tv+TXxI9m9WVj@pendragon.ideasonboard.com>
In-Reply-To: <YT4QCHwnqZL/rl0z@kroah.com>

Hi Greg,

On Sun, Sep 12, 2021 at 04:34:48PM +0200, Greg KH wrote:
> On Sun, Sep 12, 2021 at 05:15:30PM +0300, Leon Romanovsky wrote:
> > On Sun, Sep 12, 2021 at 03:25:58PM +0200, Greg KH wrote:
> > > > The main idea that I want to see working userspace stack, and being in
> > > > distro sets a certain quality level, am I asking too much?
> > > 
> > > Define "working userspace stack" and "distro" please.  Like others have
> > > said, many distros will not take userspace code unless it's already in
> > > the kernel tree first, as that ensures that the abi will not break.
> > 
> > Like I already answered
> > https://lore.kernel.org/all/YT2zryAKHc%2F5R2IH@unreal/
> > "To be used" means some open PR to existing package or request for
> > inclusion for new packages.
> 
> But again, distros will not take things that are not already in the
> kernel.

It's becoming difficult to follow the discussion as it has branched.
I've replied on this topic separately.

> > > > > > 2. Devices that hits the certain level of adoption - need to be
> > > > > > integrated into certain userspace stack, which needs to be part of
> > > > > > distro.
> > > > > 
> > > > > Distros are a very odd rule to rely on given that they are by far the
> > > > > minority of the usage in raw numbers for Linux in the world.
> > > > 
> > > > You can count Android as another distro, it is just semantics.
> > > 
> > > But how do you define Android's userspace?  Just one vendor?  2 vendors?
> > > 10 vendors?  There is major userspace fragmentation in Android userspace
> > > > in many places, the user/kernel boundary being one of the big ones as
> > > many of us have found out over the past years.  And many of us are
> > > working to resolve this, but it's not so simple at times, and I have
> > > many examples if you want specifics.
> > 
> > > Laurent suggested AOSP
> > https://lore.kernel.org/all/YTyWANV%2FmSkQbYhj@pendragon.ideasonboard.com/
> 
> Vendors can not get code into AOSP for various reasons that only Google
> understands.  There are many millions, if not billions of Android
> devices out there with user/kernel apis that are not upstream nor in
> AOSP because Google doesn't want to take them, or because the vendor can
> not go through those hoops (international law is tricky at times...)
> 
> So are we to just not be able to take drivers that add those new apis if
> AOSP can not take the userspace side, yet the userspace side is
> published somewhere else?

"Open userspace" and "packaged in distros" are two criteria that have
been proposed. There are more, such as "open documentation" for
instance. It's up to us to decide what to do (if anything), and I don't
believe we'll be able to find one-size-fits-them-all criteria that can
apply globally. There is, however, value in my opinion in carefully
designing a set of criteria and documenting them, so that subsystems
can, for instance, pick the ones that work best for the type of devices
they handle.

The "packaged in distros" criterion is, as I understand it, an attempt to
avoid code dumps on GitHub that would have been so badly designed that
they would be unmaintainable. It's a tricky area. What I think is
required is that vendors publish an open userspace implementation that
is serious enough, and not just a way to tick a box while circumventing
the spirit of the rule. Distro packaging may help achieve that, but
there are certainly other ways too. For me, at the end of the day it's
really about how to create a community starting from a single
implementation.

> > > > > > And AI/ML is no different here, someone just need to start build such
> > > > > > stack. Otherwise, we will continue to see more free riders like HabanaLabs
> > > > > > which don't have any real benefit to the community.
> > > > > 
> > > > > Everyone contributes to Linux in a selfish manner, that's just how the
> > > > > community works.  The work that companies like habanalabs is NOT being a
> > > > > "free rider" at all, they have worked with us and done the hard work of
> > > > > actually getting their code merged into the tree.
> > > > 
> > > > I perfectly remember them trying to bypass netdev and RDMA communities
> > > > by pretending "misc" device.
> > > > 
> > > > https://lore.kernel.org/linux-rdma/20200915133556.21268811@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/
> > > > https://lore.kernel.org/linux-rdma/20200917171833.GJ8409@ziepe.ca/
> > > > 
> > > > Or DRM
> > > > https://lore.kernel.org/linux-rdma/CAKMK7uFOfoxbD2Z5mb-qHFnUe5rObGKQ6Ygh--HSH9M=9bziGg@mail.gmail.com/
> > > > 
> > > > So I can agree with the statement "worked hard", but not with the
> > > > relevant communities.
> > > 
> > > I point at these as doing exactly what we want vendors to be doing!
> > > Thank you for finding the good examples.  This is a vendor submitting
> > > patches and saying, "here is what we want to do, with a first cut at
> > > doing it."  It's up to us as a community to tell them if they are doing
> > > it the right way or not.
> > > 
> > > If we just let them all go their own ways, they will come up with
> > > horrible apis and interfaces, we have all seen that before.
> > > 
> > > So by working together, we both can learn from, and work together to
> > > solve the issue.  And that is what these driver authors and company has
> > > been doing!  They are part of our community, why are you saying they
> > > should now just go do their own thing away from us?
> > 
> > This is not what I said. I don't see Intel (habanalabs) as a company
> > that can't create proper AI stack and think that this is our
> > responsibility to provide them enough incentive to do it.
> 
> So should we be forcing everyone to follow the IBM standard for
> accelerator drivers because they were in the kernel first all those
> years ago?  Or what other standard do we pick?
> 
> And why are we dictating new industry standards here?  Who are we to do
> that?  Who is going to take that responsibility on?
> 
> > > And as for "bypassing", that feels very mean.  We have had accelerator
> > > code in the char/misc and other parts of the kernel tree since at least
> > > 2018 if not earlier (I didn't look all that hard.)  Just because someone
> > > wanted to use the in-kernel apis that are there (why is dma-buf some
> > > magic thing?) does not mean that they suddenly need to move to a
> > > different subsystem.
> > 
> > Because dma-buf API has specific semantics and was designed with very
> > specific usage model in mind.
> 
> So will the IB patches usage be re-reviewed?
> 
> Anyway, we have apis that are used throughout the kernel all the time
> that don't end up on the various subsystem mailing list because people
> forget, or just do not know.  That's normal and something we have dealt
> with for forever.  As an example, I didn't realise that just using the
> dma-buf api required such a review.
> 
> Can we put that in the MAINTAINERS file somehow for apis?
> 
> > > We get at least 1-2 new subsystems and major drivers that get added to
> > > the kernel tree that do things that have never been done before with
> > > custom user/kernel apis every kernel release.  Not everything can be a
> > > standard api no matter how much I, and others, wish it were.
> > 
> > So when will you draw a line and ask to create proper subsystem
> > with standard APIs? After 2, 3 ... 100 similar (from our point of view)
> > and different (from vendor point of view) devices with custom API?
> 
> That is a great question and I do not have the answer to that.  Should
> we have done that after the first one went into the kernel all those
> years ago?  Maybe, but I seem to recall the answer being "our hardware
> works much differently, so our user api will be much different", and
> that's a valid answer.

And it's also the answer that all vendors will give, because it's an
easy way to avoid doing extra work. It may sometimes be true, but that's
an exception rather than a rule.

It reminds me of something I heard in a working group recently, when
someone mentioned a "key differentiating factor" that supposedly
warranted a free pass for vendors not to open the implementation, and a
few seconds later went on to say it was "available in all phones in the
market today". I won't call these lies; I believe that in most cases the
vendors actually believe it's true.

> If your standard can not handle new usage models and a way to handle
> that, then it isn't a good standard that companies will follow for new
> types of devices.
> 
> We have loads of char drivers with odd ioctl apis because we have loads
> of odd hardware devices out in the world.  We have been treating these
> accelerators like that for a long time now, except when they try to
> duplicate existing in-kernel code (like crypto or networking).

Going back to the "is an accelerator a GPU?" topic for a bit, DRM
doesn't prevent drivers from exposing custom features with custom API
elements. AI/ML accelerators aren't GPUs in the original sense of 3D
rendering accelerators (maybe that's a cause of misunderstanding, we're
not using the best terminology), but they fit pretty well within the
device model that DRM creates. The side effect of using DRM is that an
open userspace is required, and this is why some people in the community
believe Habanalabs tried to work around that rule by going for
drivers/misc/. I don't know enough about the history to know if they
were behaving in good faith or not, but maybe we could try to turn this
page by deciding on the right path forward together and forget about the
finger pointing and blaming.

> > > As examples, what about the hyperv blob api that was submitted recently
> > > going around the block layer?  What about the new Intel accelerator that
> > > added yet-another-set-of-custom-ioctls?  What about the rpi drivers?
> > > What about the virtualbox drivers?  Should all of those just live
> > > outside of the kernel for forever?
> > > 
> > > Of course not.
> > 
> > So what is your bar? Accept everything?
> 
> It's a hard line to draw, and for some reason, I seem to be the one
> having to review these types of drivers every kernel release.  If people
> wish to help me out, please do so, all the patches are on the lists.

This may be a controversial point, but could it be because vendors
perceive you as less likely to look closely and push back? If
drivers/misc/ is seen as a free-for-all and other subsystems are
likely to ask for more work, natural laziness will push vendors to
drivers/misc/.

> Right now I push back where I can and try to get semi-sane apis created
> that are "obviously not wrong" where I notice.  After that, I just need
> to trust that the maintainer of the driver knows what they are doing and
> will maintain the code going forward.  So far, it's worked out.
> 
> Do you have a better idea of what to do instead?

Is there a way we could push those drivers more strongly towards other
subsystems? There's certainly no way you will be able to foster the
creation of a dozen userspace frameworks and related communities from
drivers/misc/ by yourself.

-- 
Regards,

Laurent Pinchart

