From: Dave Airlie <airlied@gmail.com>
To: Arnd Bergmann <arnd@arndb.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>,
	Linus Walleij <linus.walleij@linaro.org>,
	 Greg KH <greg@kroah.com>, Leon Romanovsky <leon@kernel.org>,
	 Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	 Josh Triplett <josh@joshtriplett.org>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	 Jonathan Corbet <corbet@lwn.net>,
	ksummit@lists.linux.dev, dev@tvm.apache.org
Subject: Re: [MAINTAINER SUMMIT] User-space requirements for accelerator drivers
Date: Tue, 14 Sep 2021 09:33:17 +1000
Message-ID: <CAPM=9tw3WTUb6R5VaDR002P0QYbcZ0uPETC4r0MPBBqySLe09Q@mail.gmail.com>
In-Reply-To: <CAK8P3a2pvCwuSic9yevW0xmMy0-F1FEgfQ-_Rc7wWDs7PJEf_w@mail.gmail.com>

On Tue, 14 Sept 2021 at 08:05, Arnd Bergmann <arnd@arndb.de> wrote:
>
> > On Mon, Sep 13, 2021 at 3:54 PM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> > > One straightforward hardware independent low-level API would
> > > be the traditional BLAS GEMM call[1] for matrix multiplication
> > > and its variants (integer, float, bfloat16, ...).  Most of the frameworks
> > > are able to use SGEMM to do the actual calculation since that
> > > has optimized versions for most CPUs and GPUs, and most
> > > hardware accelerators should be able to provide an
> > > implementation of this that doesn't completely suck. This
> > > can be used for both inferencing and training.
> >
> > I think BLAS are too high-level for these. Sure, for perfect speed the
> > vendor probably wants to have their own BLAS thing, their own NN
> > optimizer and a heap of other things, but for the low-level userspace
> > we're talking about here that pretty much doesn't matter.
>
> I suppose high-level vs low-level is not the correct distinction here,
> it's more like fixed-function vs programmable.
>
> As a fixed-function interface, something like GEMM is probably as
> low-level as you would want to get, as it's big enough to make sense
> as a single atomic command, but small enough to be able to build on
> top of it.

The distinction is more about the programming model than fixed-function
vs programmable. In rough order of complexity:

a) device is MMIO-programmed and can process one thing at a time; the
kernel needs to mediate between exclusive users (big lock, initial drm
subsystem)
b) device has a queue that can process untrusted userspace commands
with no memory safety (old drm drivers, in-kernel command stream
parsing)
c) device has queues, contexts, memory safety and virtual address
spaces (newer drm drivers)
d) device has full preemption on all hw blocks, is fully coherent, can
trigger paging sanely, and userspace can submit directly (pipe dream).

What the device processes is of little consequence to the kernel
driver model. The uAPI of course needs to reflect the above, along with
what the device can be programmed to do, since there could be a queue
for a DMA device whose command stream isn't specified but which can be
programmed to DMA random system memory.

Devices in category (a) are the sort of thing that can need kernel
interfaces at a GEMM or BLAS level; however, there is no point having
an interface at that level for any of the (b)/(c)/(d) devices. That
interface needs to live in userspace somewhere, level0 or something
like it is probably where things will end up, and the type (a) devices
will die out.

> I realize that fixed-function is not fashionable on GPUs, but they
> are widely used in other areas (video codecs, crypto, ...) even when
> you are running precompiled code on the accelerator hardware.
> This would of course replace the question of open source user space
> with the question of open-source firmware, as the user side would
> become mostly while the accelerator goes from dynamically created
> to a firmware blob.

We have lots of fixed function on GPUs; video codecs are on most x86
GPUs. It's how you program them that matters: most of them sit behind
queues similar to the 3D engine, so you program them the same way.

What isn't fashionable on GPUs is programmable blocks that are
single-user, where only the kernel can program them and only one user
runs at a time, since hw has long since left that model behind. There
are some AI accelerators going down the same path, but eventually
they'll have to be shareable and catch up with GPU programming models
to remain competitive.

Dave.

