From: Jason Gunthorpe <jgg@mellanox.com>
To: Daniel Vetter <daniel@ffwll.ch>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Olof Johansson <olof.johansson@gmail.com>,
	Jeffrey Hugo <jhugo@codeaurora.org>,
	Dave Airlie <airlied@gmail.com>, Arnd Bergmann <arnd@arndb.de>,
	Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	wufan@codeaurora.org, pratanan@codeaurora.org,
	linux-arm-msm <linux-arm-msm@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 0/8] Qualcomm Cloud AI 100 driver
Date: Tue, 19 May 2020 20:26:36 -0300
Message-ID: <20200519232636.GA24561@mellanox.com>
In-Reply-To: <CAKMK7uG-oP-tcOcNz-ZzTmGondEo-17BCN1kpFBPwb7F8QcM5w@mail.gmail.com>

On Tue, May 19, 2020 at 10:41:15PM +0200, Daniel Vetter wrote:

> Get some consistency into your decision making as maintainer. And don't
> tell me or anyone else that this is complicated, gpu and rdma driver folks
> very much told you and Olof last year that this is what you're getting
> yourself into.

It is complicated!

One of the big mistakes we learned from in RDMA is that we must have a
canonical open userspace, that is at least the user side of the uABI
from the kernel. It doesn't have to do a lot but it does have to be
there and everyone must use it.

Some time ago it was all a fragmented mess where every HW had its own
library project with no community and that spilled into the kernel
where it became impossible to be sure everyone was playing nicely and
keeping their parts up to date. We are still digging out: I keep
finding stuff in the kernel that just never seemed to make it into any
userspace.

I feel this is an essential ingredient, and I think I gave this advice
at LPC as well - it is important to start as a proper subsystem with a
proper standard user space. IMHO a random collection of opaque misc
drivers for incredibly complex HW is not going to magically gel into a
subsystem.

Given the state of the industry the userspace doesn't have to do
a lot, and maybe that library exposes unique APIs for each HW, but it
is at least a rallying point to handle all these questions like: 'is
the proposed userspace enough?', give some consistency, and be ready
to add in those things that are common (like, say, IOMMU PASID setup).

The uacce stuff is sort of interesting here as it does seem to take
some of that approach, it is really simplistic, but the basic idea of
creating a generic DMA work ring is in there, and probably applies
just as well to several of these 'totally-not-a-GPU' drivers.

The other key is that the uABI from the kernel does need to be very
flexible as really any new HW can appear with any new strange need all
the time, and there will not be detailed commonality between HWs. RDMA
has made this mistake a lot in the past too.

The newer RDMA netlink-like API is actually turning out not bad for
this purpose (again, something a subsystem could provide).

Also the approach in this driver to directly connect the device to
userspace for control commands has worked for RDMA in the past few
years.

Jason


