Date: Wed, 30 Jan 2019 18:11:00 +0000
From: Will Deacon
To: Alexei Starovoitov
Cc: Peter Zijlstra, Alexei Starovoitov, davem@davemloft.net, daniel@iogearbox.net, jakub.kicinski@netronome.com, netdev@vger.kernel.org, kernel-team@fb.com, mingo@redhat.com, Paul McKenney, jannh@google.com
Subject: Re: bpf memory model. Was: [PATCH v4 bpf-next 1/9] bpf: introduce bpf_spin_lock
Message-ID: <20190130181100.GA18558@fuggles.cambridge.arm.com>
In-Reply-To: <20190128215623.6eqskzhklydhympa@ast-mbp>
X-Mailing-List: netdev@vger.kernel.org

Hi Alexei,

On Mon, Jan 28, 2019 at 01:56:24PM -0800, Alexei Starovoitov wrote:
> On Mon, Jan 28, 2019 at 10:24:08AM +0100, Peter Zijlstra wrote:
> > On Fri, Jan 25, 2019 at 04:17:26PM -0800, Alexei Starovoitov wrote:
> > > What I want to avoid is to define the whole execution ordering model upfront.
> > > We cannot say that BPF ISA is weakly ordered like alpha.
> > > Most of the bpf progs are written and running on x86. We shouldn't
> > > twist bpf developer's arm by artificially relaxing memory model.
> > > BPF memory model is equal to memory model of underlying architecture.
> > > What we can do is to make bpf progs a bit more portable with
> > > smp_rmb instructions, but we must not force weak execution on the developer.
> >
> > Well, I agree with only introducing bits you actually need, and my
> > smp_rmb() example might have been poorly chosen, smp_load_acquire() /
> > smp_store_release() might have been a far more useful example.
> >
> > But I disagree with the last part; we have to pick a model now;
> > otherwise you'll paint yourself into a corner.
> >
> > Also; Alpha isn't very relevant these days; however ARM64 does seem to
> > be gaining a lot of attention and that is very much a weak architecture.
> > Adding strongly ordered assumptions to BPF now, will penalize them in
> > the long run.
>
> arm64 is gaining attention just like riscV is gaining it too.
> BPF jit for arm64 is very solid, while BPF jit for riscV is being worked on.
> BPF is not picking sides in CPU HW and ISA battles.

It's not about picking a side, it's about providing an abstraction of the
various CPU architectures out there so that the programmer doesn't need to
worry about where their program may run. Hell, even if you just said "eBPF
follows x86 semantics" that would be better than saying nothing (and then we
could have a discussion about whether x86 semantics are really what you
want).

> Memory model is CPU HW design decision. BPF ISA cannot dictate HW design.
> We're not saying today that BPF is strongly ordered.
> BPF load/stores are behaving differently on x86 vs arm64.
> We can add new instructions, but we cannot 'define' how load/stores behave
> from memory model perspective.
> For example, take atomicity of single byte load/store.
> Not all archs have them atomic, but we cannot say to bpf programmers
> to always assume non-atomic byte loads.

Hmm, I don't think this is a good approach to take for the future of eBPF.
Assuming that a desirable property of an eBPF program is portability between
CPU architectures, then you're effectively forcing the programmer to "assume
the worst", where the worst is almost certainly unusable for practical
purposes.

One easy thing you could consider would be to allow tagging of an eBPF
program with its supported target architectures (the JIT will refuse to
accept it for other architectures). This would at least remove the illusion
of portability and force the programmer to be explicit.
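
To make the tagging idea concrete, here is a rough sketch in plain C of what
such a check could look like. Nothing here is an existing BPF interface: the
enum, the struct and prog_accepted() are all hypothetical names made up for
illustration, and a real design would need to decide where the tag lives
(for instance in the program's load attributes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bitmask of architectures a program declares support for. */
enum prog_arch {
	PROG_ARCH_X86_64  = 1u << 0,
	PROG_ARCH_ARM64   = 1u << 1,
	PROG_ARCH_RISCV64 = 1u << 2,
};

/* Hypothetical tag attached to a program at load time. */
struct prog_tag {
	uint32_t supported_archs;	/* OR of enum prog_arch values */
};

/*
 * Hypothetical load-time check: accept the program only if the host
 * architecture is one the author explicitly listed.
 */
static inline bool prog_accepted(const struct prog_tag *tag, uint32_t host_arch)
{
	return (tag->supported_archs & host_arch) != 0;
}
```

A program tagged only with PROG_ARCH_X86_64 would then be refused on an
arm64 host rather than silently running with different ordering behaviour.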

However, I think we'd be much better off if we defined some basic ordering
primitives such as relaxed and RCpc-style acquire/release operations
(including atomic RmW), single-copy atomic memory accesses up to the native
machine size and a full-fence instruction. If your program uses something
that the underlying arch doesn't support, then it is rejected (e.g. 64-bit
READ_ONCE on a 32-bit arch).

That should map straightforwardly to all modern architectures and allow for
efficient codegen on x86 and arm64. It would probably require a bunch of new
BPF instructions that would be defined to be atomic (you already have XADD
as a relaxed atomic add).

Apologies if this sounds patronising, but I'm keen to help figure out the
semantics *now* so that we don't end up having to infer them later on, which
is the usual painful case for memory models. I suspect Peter and Paul would
also prefer to attack it that way around.

I appreciate that the temptation is to avoid the problem by deferring to the
underlying hardware memory model, but I think that will create more problems
than it solves and we're here to help you get this right.

Will
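
For illustration only, the kind of primitives listed above (store-release,
load-acquire, a relaxed atomic RmW and a full fence) can be approximated in
userspace C with C11 atomics. This is a sketch under assumptions, not a
proposal for BPF instruction encodings: C11 atomics only stand in for the
proposed operations, and struct mailbox and the mailbox_*() helpers are
hypothetical names. In real use the producer and consumer would run on
different CPUs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared state: a payload guarded by a ready flag. */
struct mailbox {
	uint64_t payload;		/* plain data, published via the flag */
	atomic_uint ready;		/* 0 = empty, 1 = payload is valid */
	atomic_uint_fast64_t hits;	/* statistics counter */
};

/* Producer: write the payload, then publish it with a store-release. */
static inline void mailbox_publish(struct mailbox *mb, uint64_t value)
{
	mb->payload = value;
	atomic_store_explicit(&mb->ready, 1, memory_order_release);
}

/*
 * Consumer: load-acquire the flag and read the payload only if it is set.
 * The acquire pairs with the release above, so a consumer that sees
 * ready == 1 also sees the payload written before it.
 */
static inline bool mailbox_consume(struct mailbox *mb, uint64_t *out)
{
	if (!atomic_load_explicit(&mb->ready, memory_order_acquire))
		return false;
	*out = mb->payload;
	/* Relaxed atomic RmW: atomic, but imposes no ordering. */
	atomic_fetch_add_explicit(&mb->hits, 1, memory_order_relaxed);
	return true;
}

/* Full fence: orders all earlier accesses against all later ones. */
static inline void full_fence(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}
```

With only these four operations the programmer never has to know whether
the host is x86 or arm64; the compiler (or, in the BPF case, the JIT) picks
the cheapest instruction sequence that honours the requested ordering.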