From: Olof Johansson <olof@lixom.net>
To: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Jerome Glisse <jglisse@redhat.com>,
	Dave Airlie <airlied@gmail.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Daniel Vetter <daniel.vetter@ffwll.ch>,
	LKML <linux-kernel@vger.kernel.org>,
	ogabbay@habana.ai, Arnd Bergmann <arnd@arndb.de>,
	fbarrat@linux.ibm.com,
	Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Subject: Re: [PATCH 00/15] Habana Labs kernel driver
Date: Wed, 23 Jan 2019 15:41:45 -0800
Message-ID: <CAOesGMjU133pMS2CMb0GADuJgGrH75mtq8w0Eh3c29DbvYh_0w@mail.gmail.com>
In-Reply-To: <CAFCwf11PadVOM1jg+JnSzvLxro1PP6hEWX8WtRne9WVN26PaYQ@mail.gmail.com>

On Wed, Jan 23, 2019 at 3:35 PM Oded Gabbay <oded.gabbay@gmail.com> wrote:
>
> On Thu, Jan 24, 2019 at 1:20 AM Jerome Glisse <jglisse@redhat.com> wrote:
> >
> > On Wed, Jan 23, 2019 at 03:04:33PM -0800, Olof Johansson wrote:
> > > On Wed, Jan 23, 2019 at 2:45 PM Dave Airlie <airlied@gmail.com> wrote:
> > > >
> > > > On Thu, 24 Jan 2019 at 08:32, Oded Gabbay <oded.gabbay@gmail.com> wrote:
> > > > >
> > > > > On Thu, Jan 24, 2019 at 12:02 AM Dave Airlie <airlied@gmail.com> wrote:
> > > > > >
> > > > > > Adding Daniel as well.
> > > > > >
> > > > > > Dave.
> > > > > >
> > > > > > On Thu, 24 Jan 2019 at 07:57, Dave Airlie <airlied@gmail.com> wrote:
> > > > > > >
> > > > > > > On Wed, 23 Jan 2019 at 10:01, Oded Gabbay <oded.gabbay@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > For those who don't know me, my name is Oded Gabbay (Kernel Maintainer
> > > > > > > > > for AMD's amdkfd driver, worked at RedHat's Desktop group) and I have
> > > > > > > > > worked at Habana Labs since its inception two and a half years ago.
> > > > > > >
> > > > > > > Hey Oded,
> > > > > > >
> > > > > > > So this creates a driver with a userspace facing API via ioctls.
> > > > > > > > Although this isn't a "GPU" driver, we have a rule in the graphics
> > > > > > > > drivers area for accelerators that we don't merge userspace API without an
> > > > > > > > appropriate userspace user.
> > > > > > >
> > > > > > > https://dri.freedesktop.org/docs/drm/gpu/drm-uapi.html#open-source-userspace-requirements
> > > > > > >
> > > > > > > > I see nothing in these accelerator drivers that makes me think we
> > > > > > > > should be treating them differently.
> > > > > > >
> > > > > > > > Having large closed userspaces that we have no insight into means we
> > > > > > > > get suboptimal, locked-forever uAPIs. If someone in the future creates
> > > > > > > an open source userspace, we will end up in a place where they get
> > > > > > > suboptimal behaviour because they are locked into a uAPI that we can't
> > > > > > > change.
> > > > > > >
> > > > > > > Dave.
> > > > >
> > > > > Hi Dave,
> > > > > > While I always appreciate your opinion and am happy to hear it, I totally
> > > > > > disagree with you on this point.
> > > > >
> > > > > First of all, as you said, this device is NOT a GPU. Hence, I wasn't
> > > > > aware that this rule might apply to this driver or to any other driver
> > > > > outside of drm. Has this rule been applied to all the current drivers
> > > > > in the kernel tree with userspace facing API via IOCTLs, which are not
> > > > > in the drm subsystem ?  I see the logic for GPUs as they drive the
> > > > > display of the entire machine, but this is an accelerator for a
> > > > > specific purpose, not something generic as GPU. I just don't see how
> > > > > one can treat them in the same way.
> > > >
> > > > > The logic isn't there for GPUs for the reason that we have an
> > > > > established library or that GPUs are in laptops. They are just where
> > > > we learned the lessons of merging things whose primary reason for
> > > > being in the kernel is to execute stuff from misc userspace stacks,
> > > > where the uAPI has to remain stable indefinitely.
> > > >
> > > > a) security - without knowledge of what the accelerator can do how can
> > > > we know if the API you expose isn't just a giant root hole?
> > > >
> > > > > b) uAPI stability. Without a userspace for this, there is no way for
> > > > > anyone, even if in possession of the hardware, to validate that the uAPI you
> > > > > provide and are asking the kernel to commit to supporting indefinitely
> > > > > is optimal or secure. If an open source userspace appears, is it to be
> > > > > limited to the API the closed userspace has created? It limits the future
> > > > > unnecessarily.
> > > >
> > > > > There is no way that "someone" will create a userspace
> > > > > for our H/W without the intimate knowledge of the H/W or without the
> > > > > > ISA of our programmable cores. Maybe for large companies this request
> > > > > > is valid, but for startups complying with this request is not realistic.
> > > >
> > > > So what benefit does the Linux kernel get from having support for this
> > > > feature upstream?
> > > >
> > > > > If users can't access the necessary code to use it, why does this
> > > > > need to be maintained in the kernel?
> > > >
> > > > > > To conclude, I think this approach discourages other companies from
> > > > > > open sourcing their drivers and is counter-productive. I'm not sure
> > > > > you are aware of how difficult it is to convince startup management to
> > > > > opensource the code...
> > > >
> > > > Oh I am, but I'm also more aware how quickly startups go away and
> > > > leave the kernel holding a lot of code we don't know how to validate
> > > > or use.
> > > >
> > > > > I'm open to being convinced but I think defining new userspace
> > > > facing APIs is a task that we should take a lot more seriously going
> > > > forward to avoid mistakes of the past.
> > >
> > > I think the most important thing here is to know that things are
> > > likely to change quite a bit over the next couple of years, and that
> > > we don't know yet what we actually need. If we hold off picking up
> > > support for hardware while all of this is ironed out, we'll miss out
> > > on being exposed to it, and will have a very tall hill to climb once
> > > we try to convince vendors to come into the fold. It's also not been a
> > > requirement for the other two drivers we have merged, as far as I can
> > > tell (CAPI and OpenCAPI) so the cat's already out of the bag.
> > >
> > > I'd rather not get stuck in a stand-off needing the longterm solution
> > > to pick up the short term contribution. That way we can move over to a
> > > _new_ API once there's been a better chance of finding common grounds
> > > and once things settle down a bit, instead of trying to bring some
> > > larger legacy codebase for devices that people might no longer care
> > > much about over to the newer APIs.
> > >
> > > It's better to be exposed to the HW and drivers now, than having
> > > people build large elaborate out-of-tree software stacks for this.
> > > It's also better to get them to come and collaborate now, instead of
> > > pushing them away until things are perfect.
> > >
> > > Having a way to validate and exercise the userspace API is important,
> > > including ability to change it if needed. Would it be possible to open
> > > up the lowest userspace pieces (driver interactions), even if some
> > > other layers might not yet be, to exercise the device/kernel/userspace
> > > interfaces without "live" workload, etc?
> >
> > > Yes, and to exercise the userspace API you need at the very least to
> > > know the ISA so that you can write programs for the accelerator.
> > > You also need to know the set of commands the hardware has. The
> > > ioctl and how to create a userspace that interacts with the kernel
> > > is the easy part; the hard part is the compiler.
>
> So actually in my case in order to exercise the IOCTL API, you can
> give "work" to the device that will not trigger the compute parts, but
> only the different queues and the DMA engines.
> I think that is enough to validate that the IOCTLs won't break.
> > All the "commands" that you can give to the queue logic (QMAN) are
> > exposed in one of the files in the driver (goya_packets.h).
>
> I want to stress this - To validate the IOCTLs, it is enough to do DMA
> work. You will use ALL the 5 IOCTLs to do just that - give work to the
> DMA engines.

I personally think this is a reasonable trade-off, given that you have
a communication layer between. For hardware that doesn't have that,
and where device behavior and data movement depends on execution on
the compute parts, more would need to be open.


-Olof
