linux-kernel.vger.kernel.org archive mirror
From: Olof Johansson <olof@lixom.net>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Dave Airlie <airlied@gmail.com>,
	Oded Gabbay <oded.gabbay@gmail.com>,
	Jerome Glisse <jglisse@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>,
	ogabbay@habana.ai
Subject: Re: [PATCH 00/15] Habana Labs kernel driver
Date: Fri, 25 Jan 2019 07:33:23 -0800
Message-ID: <CAOesGMgJUeUMoRXZ98=J8bpfCuZOOMo5pwpVXZX-o-y1dAQsHA@mail.gmail.com>
In-Reply-To: <20190125073745.GD11891@kroah.com>

Hi,

On Thu, Jan 24, 2019 at 11:37 PM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Thu, Jan 24, 2019 at 07:57:11AM +1000, Dave Airlie wrote:
> > On Wed, 23 Jan 2019 at 10:01, Oded Gabbay <oded.gabbay@gmail.com> wrote:
> > >
> > > Hello,
> > >
> > > For those who don't know me, my name is Oded Gabbay (Kernel Maintainer
> > > for AMD's amdkfd driver, worked at RedHat's Desktop group) and I work at
> > > Habana Labs since its inception two and a half years ago.
> >
> > Hey Oded,
> >
> > So this creates a driver with a userspace facing API via ioctls.
> > Although this isn't a "GPU" driver we have a rule in the graphics
> > drivers are for accelerators that we don't merge userspace API without an
> > appropriate userspace user.
> >
> > https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements
> >
> > I see nothing in these accelerator drivers that make me think we
> > should be treating them different.
>
> I understand that this is your position on when you accept drivers into
> the DRM layer, as you need to interact with common interfaces and a
> massive userspace stack at the same time.  And that's wonderful, it
> allows you to be able to move both sides of that stack forward without
> removing support for devices that worked on older kernels.
>
> But, that's not really the case with this new driver at all.  We add new
> driver subsystems, and individual drivers, with loads of new ioctls, in
> every new kernel release.  We don't impose on all of them the "your
> userspace code must be open" rule, so why is this new driver somehow
> different from them?
>
> Yes, there is the fun legal issue of "derivative works" when talking
> about a userspace program that is written to only interact with a
> specific kernel driver using a custom api like this one has, and how the
> license of the kernel side (GPLv2) affects the userspace side
> (whatever), but that is something that I leave up to the lawyers who
> like discussing and enforcing such things.
>
> When evaluating this driver (note, I saw it for a few revisions before
> Oded posted it here), all I did was try to make sure that it fit in
> properly with the kernel apis and methods of operations.  Given that
> there are no in-kernel drivers for this type of device, and that it
> really is a pretty small shim layer around the hardware, which means
> that userspace does a lot of the heavy lifting, it is going to be a
> very hardware-specific user/kernel api, and that shows.

I brought this up because there are sort of, if you squint, three-ish
of these already (the OpenCAPI/CAPI ones and now this). There is also
Jonathan's comment about CCIX pieces coming.

I've talked to a handful of vendors in this space and more or less all
of them hope that drivers/misc is still a suitable home for their
driver by the time it's ready to post, and I know that if we keep them
there, it'll only be one or two more drivers until we have this
discussion anyway.

That's really why I brought it up now, to get a clear stance on "Yeah,
we know these are all slightly different today, and we're willing to
give them a predictable home for the first while as we figure out
together how things should look". Keep in mind, this space is also
currently a bit of a gold rush, with many people working hard to get
to market with near-zero talk between them.

> Sidenote, this could have almost just been a UIO driver, which would
> have put _ALL_ of the logic in userspace.  At least this way we have a
> chance for the kernel code to be sane and not try to inflict problems on
> the rest of the system.

Yeah, and sharing hardware when you have userspace drivers tends to
need a bunch of synchronization and coordination which leads to taller
(closed?) stacks there. Having resource arbitration and sharing
assisted by the kernel is usually the right thing, and we should
encourage that.

> Going forward, it would be wonderful if we could come up with a unified
> api for how to interact with these types of hardware accelerators, but
> given the newness of this industry, and the vastly different ways people
> are trying to solve the problem, that is going to take a few attempts,
> and many years before we can get there.  Until then, taking drivers like
> this into the kernel tree makes sense as that way all of our users will
> be able to use that hardware, and better yet, the rest of us can learn
> more about how this stuff works so that we can help out with that api
> generation when it happens.

Exactly my viewpoint as well, combined with not pushing vendors
towards nvidia models by default even if they start out separate, and
having a chance to find inroads with them on engineering levels when
they come to us with the drivers.

The absolute best chance for success is _always_ when we can engage with
the vendor engineers as peers, instead of going up and down the
corporate charts first (you often need to do a bit of both, but having
allies in engineering makes it much easier).

> So for now, I have no objection to taking this type of driver into the
> tree.  Yes, it would be wonderful if we had an open userspace program to
> drive it so that we could actually test and make changes to the api over
> time, but I think that is something that the submitting company needs to
> realize will be better for them to do, as for right now, all of that
> testing and changes are their responsibility.

I do think requiring that the hardware can be exercised is really
valuable, and we should consider what such a requirement would look
like. For the Habana case that could be a separate
low-level library and a test workload on top. For FPGA cases it could
be a well-known reference bitstream with the vendor reference
control/communication path + similar low-level pieces.

> As for what directory the code should live in, I suggested "misc" as
> there was no other universal location, and I hate to see new subsystems
> be created with only one driver, as that's pretty sad.  But it's just a
> name/location, I have no dog in the fight, so I really don't care where
> it ends up in the tree, just as long as it gets merged somewhere :)

I'm usually one to push back against new subsystems too, especially
when I see a framework proposal with just one driver. In this case,
given that we all know more vendors will come along, I think it makes
sense to take the discussion and establish structure now. This should
give some clarity to those who are out there that we haven't seen yet,
and give them a chance to prepare for things such as the low-level
userspace pieces mentioned above.

So I think setting this up now is the right thing to do: we know there
will be more material here and having a common aggregation of it makes
sense.


-Olof
