Message-ID: <03c63726404ad51449b2370d5d3cada976633eec.camel@HansenPartnership.com>
Subject: Re: [MAINTAINER SUMMIT] User-space requirements for accelerator drivers
From: James Bottomley
To:
 Arnd Bergmann, Linus Walleij
Cc: Dave Airlie, Daniel Vetter, Greg KH, Leon Romanovsky,
 Laurent Pinchart, Thomas Gleixner, Josh Triplett,
 Mauro Carvalho Chehab, Jonathan Corbet, ksummit@lists.linux.dev,
 dev@tvm.apache.org
Date: Mon, 13 Sep 2021 07:52:14 -0700
References: <87ilz8c7ff.ffs@tglx>
X-Mailing-List: ksummit@lists.linux.dev

On Mon, 2021-09-13 at 15:20 +0200, Arnd Bergmann wrote:
> On Mon, Sep 13, 2021 at 12:51 AM Linus Walleij
> <linus.walleij@linaro.org> wrote:
> > On Sun, Sep 12, 2021 at 11:13 PM Dave Airlie wrote:
> > >
> > > For userspace components as well these communities of experts
> > > need to exist for each domain, and we need to encourage upstream
> > > first processes across the board for these split kernel/userspace
> > > stacks.
> > >
> > > The habanalabs compiler backend is an LLVM fork, I'd like to see
> > > the effort to upstream that LLVM backend into LLVM proper.
> >
> > I couldn't agree more.
> >
> > A big part of the problem with inference engines / NPU:s is that of
> > no standardized userspace. Several of the machine learning
> > initiatives from some years back now have stale git repositories
> > and are visibly unmaintained, c.f. Caffe
> > https://github.com/BVLC/caffe last commit 2 years ago.
>
> Caffe as a standalone project was abandoned and merged into
> PyTorch, see https://caffe2.ai/. I think this is the kind of
> consolidation of those projects that you are looking for.
>
> > Habanalabs propose an LLVM fork as compiler, yet the Intel
> > logo is on the Apache TVM website, and no sign of integrating with
> > that project. They claim to support also TensorFlow.
> >
> > The way I perceive it is that there simply isn't any GCC/LLVM or
> > Gallium 3D of NPU:s, these people haven't yet decided that "here
> > is that userspace we are all going to use". Or have they?
> >
> > LLVM? TVM? TensorFlow? PyTorch? Some other one?
> >
> > What worries me is that I don't see one single developer being
> > able to say "this one definitely, and they will work with the
> > kernel community", and that is what we need to hear.
>
> I don't actually think this is a decision we can possibly wait for.
> The ones you listed all work on different levels, some build on top
> of others, and some may get replaced by new ones over time.

I cut all the interesting design stuff because there's a meta problem
here: we seem to be charting a course based on the idea that we have to
get the userspace API right the first time. We really don't; we have to
make a reasonable effort to get it right, but we can go around for a v2
if we fail ... that's the whole point about open source: fail fast and
redo. No-one can really design an API without seeing how the users
actually use it. When we do get it right the first time, it's more by
luck than judgment, so we should expect failure more often than not.

The trick to a successful API is usually finding what the minimal set
of operations is and implementing that. If you think about bells and
whistles first (as 95% of API design documents tend to) you usually
fail.

Completely new APIs with producer-consumer interlock always have this
failure problem, because in a blue sky environment, neither the
producer nor the consumer knows exactly what they want the first time
around ... they usually have to try a couple of times to figure out
what works and what doesn't. What we have to enable is this fast
iteration while they work it out. API versioning is usually a good
beginning to this ...

There's also nothing wrong with recommending existing interfaces and
seeing how that works, because existing patterns are there for a
reason.

James