ksummit.lists.linux.dev archive mirror
From: Greg KH <greg@kroah.com>
To: Leon Romanovsky <leon@kernel.org>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Josh Triplett <josh@joshtriplett.org>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Jonathan Corbet <corbet@lwn.net>,
	ksummit@lists.linux.dev
Subject: Re: [MAINTAINER SUMMIT] User-space requirements for accelerator drivers
Date: Sun, 12 Sep 2021 16:34:48 +0200
Message-ID: <YT4QCHwnqZL/rl0z@kroah.com>
In-Reply-To: <YT4LgkK+ejUWljEh@unreal>

On Sun, Sep 12, 2021 at 05:15:30PM +0300, Leon Romanovsky wrote:
> On Sun, Sep 12, 2021 at 03:25:58PM +0200, Greg KH wrote:
> > > The main idea is that I want to see a working userspace stack, and being in
> > > a distro sets a certain quality level. Am I asking too much?
> > 
> > Define "working userspace stack" and "distro" please.  Like others have
> > said, many distros will not take userspace code unless it's already in
> > the kernel tree first, as that ensures that the abi will not break.
> 
> Like I already answered
> https://lore.kernel.org/all/YT2zryAKHc%2F5R2IH@unreal/
> "To be used" means some open PR to existing package or request for
> inclusion for new packages.

But again, distros will not take things that are not already in the
kernel.

> > > > > 2. Devices that hit a certain level of adoption need to be
> > > > > integrated into a certain userspace stack, which needs to be part of
> > > > > a distro.
> > > > 
> > > > Distros are a very odd rule to rely on given that they are by far the
> > > > minority of the usage in raw numbers for Linux in the world.
> > > 
> > > You can count Android as another distro, it is just semantics.
> > 
> > But how do you define Android's userspace?  Just one vendor?  2 vendors?
> > 10 vendors?  There is major userspace fragmentation in Android userspace
> > in many places, the user/kernel boundary being one of the big ones as
> > many of us have found out over the past years.  And many of us are
> > working to resolve this, but it's not so simple at times, and I have
> > many examples if you want specifics.
> 
> Laurent suggested AOSP
> https://lore.kernel.org/all/YTyWANV%2FmSkQbYhj@pendragon.ideasonboard.com/

Vendors can not get code into AOSP for various reasons that only Google
understands.  There are many millions, if not billions of Android
devices out there with user/kernel apis that are not upstream nor in
AOSP because Google doesn't want to take them, or because the vendor can
not go through those hoops (international law is tricky at times...)

So are we to just not be able to take drivers that add those new apis if
AOSP can not take the userspace side, yet the userspace side is
published somewhere else?

> > > > > And AI/ML is no different here, someone just needs to start building such
> > > > > a stack. Otherwise, we will continue to see more free riders like HabanaLabs
> > > > > which don't bring any real benefit to the community.
> > > > 
> > > > Everyone contributes to Linux in a selfish manner, that's just how the
> > > > community works.  What companies like habanalabs do is NOT being a
> > > > "free rider" at all, they have worked with us and done the hard work of
> > > > actually getting their code merged into the tree.
> > > 
> > > I perfectly remember them trying to bypass netdev and RDMA communities
> > > by pretending to be a "misc" device.
> > > 
> > > https://lore.kernel.org/linux-rdma/20200915133556.21268811@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/
> > > https://lore.kernel.org/linux-rdma/20200917171833.GJ8409@ziepe.ca/
> > > 
> > > Or DRM
> > > https://lore.kernel.org/linux-rdma/CAKMK7uFOfoxbD2Z5mb-qHFnUe5rObGKQ6Ygh--HSH9M=9bziGg@mail.gmail.com/
> > > 
> > > So I can agree with the statement "worked hard", but not with the
> > > relevant communities.
> > 
> > I point at these as doing exactly what we want vendors to be doing!
> > Thank you for finding the good examples.  This is a vendor submitting
> > patches and saying, "here is what we want to do, with a first cut at
> > doing it."  It's up to us as a community to tell them if they are doing
> > it the right way or not.
> > 
> > If we just let them all go their own ways, they will come up with
> > horrible apis and interfaces, we have all seen that before.
> > 
> > So by working together, we both can learn from each other and
> > solve the issue.  And that is what these driver authors and their company have
> > been doing!  They are part of our community, why are you saying they
> > should now just go do their own thing away from us?
> 
> This is not what I said. I don't see Intel (habanalabs) as a company
> that can't create a proper AI stack, and I think that it is our
> responsibility to provide them enough incentive to do it.

So should we be forcing everyone to follow the IBM standard for
accelerator drivers because they were in the kernel first all those
years ago?  Or what other standard do we pick?

And why are we dictating new industry standards here?  Who are we to do
that?  Who is going to take that responsibility on?

> > And as for "bypassing", that feels very mean.  We have had accelerator
> > code in the char/misc and other parts of the kernel tree since at least
> > 2018 if not earlier (I didn't look all that hard.)  Just because someone
> > wanted to use the in-kernel apis that are there (why is dma-buf some
> > magic thing?) does not mean that they suddenly need to move to a
> > different subsystem.
> 
> Because the dma-buf API has specific semantics and was designed with a very
> specific usage model in mind.

So will the IB patches usage be re-reviewed?

Anyway, we have apis that are used throughout the kernel all the time
that don't end up on the various subsystem mailing list because people
forget, or just do not know.  That's normal and something we have dealt
with for forever.  As an example, I didn't realise that just using the
dma-buf api required such a review.

Can we put that in the MAINTAINERS file somehow for apis?

> > We get at least 1-2 new subsystems and major drivers that get added to
> > the kernel tree that do things that have never been done before with
> > custom user/kernel apis every kernel release.  Not everything can be a
> > standard api no matter how much I, and others, wish it were.
> 
> So when will you draw a line and ask to create a proper subsystem
> with standard APIs? After 2, 3 ... 100 similar (from our point of view)
> and different (from vendor point of view) devices with custom APIs?

That is a great question and I do not have the answer to that.  Should
we have done that after the first one went into the kernel all those
years ago?  Maybe, but I seem to recall the answer being "our hardware
works much differently, so our user api will be much different", and
that's a valid answer.

If your standard can not handle new usage models, or offer a way to
accommodate them, then it isn't a good standard that companies will follow
for new types of devices.

We have loads of char drivers with odd ioctl apis because we have loads
of odd hardware devices out in the world.  We have been treating these
accelerators like that for a long time now, except when they try to
duplicate existing in-kernel code (like crypto or networking).

> > As examples, what about the hyperv blob api that was submitted recently
> > going around the block layer?  What about the new Intel accelerator that
> > added yet-another-set-of-custom-ioctls?  What about the rpi drivers?
> > What about the virtualbox drivers?  Should all of those just live
> > outside of the kernel for forever?
> > 
> > Of course not.
> 
> So what is your bar? Accept everything?

It's a hard line to draw, and for some reason, I seem to be the one
having to review these types of drivers every kernel release.  If people
wish to help me out, please do so, all the patches are on the lists.

Right now I push back where I can and try to get semi-sane apis created
that are "obviously not wrong" where I notice.  After that, I just need
to trust that the maintainer of the driver knows what they are doing and
will maintain the code going forward.  So far, it's worked out.

Do you have a better idea of what to do instead?

thanks,

greg k-h
