Date: Wed, 30 Jan 2019 11:36:29 -0800
From: Alexei Starovoitov
To: Peter Zijlstra
Cc: Alexei Starovoitov, davem@davemloft.net, daniel@iogearbox.net,
 jakub.kicinski@netronome.com, netdev@vger.kernel.org, kernel-team@fb.com,
 mingo@redhat.com, will.deacon@arm.com, Paul McKenney, jannh@google.com
Subject: Re: bpf memory model.
 Was: [PATCH v4 bpf-next 1/9] bpf: introduce bpf_spin_lock
Message-ID: <20190130193628.edvaf6aai2w5b6xf@ast-mbp.dhcp.thefacebook.com>
In-Reply-To: <20190130085850.GA2278@hirez.programming.kicks-ass.net>

On Wed, Jan 30, 2019 at 09:58:50AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 29, 2019 at 06:32:13PM -0800, Alexei Starovoitov wrote:
> > On Tue, Jan 29, 2019 at 10:16:54AM +0100, Peter Zijlstra wrote:
> > > On Mon, Jan 28, 2019 at 01:56:24PM -0800, Alexei Starovoitov wrote:
> > > > On Mon, Jan 28, 2019 at 10:24:08AM +0100, Peter Zijlstra wrote:
> > >
> > > > > Ah, but the loop won't be in the BPF program itself. The BPF program
> > > > > would only have had the BPF_SPIN_LOCK instruction, the JIT then emits
> > > > > code similar to queued_spin_lock()/queued_spin_unlock() (or calls to
> > > > > out-of-line versions of them).
> > > >
> > > > As I said we considered exactly that and such an approach has a lot of downsides
> > > > compared with the helper approach.
> > > > Pretty much every time a new feature is added we're evaluating whether it
> > > > should be a new instruction or a new helper. 99% of the time we go with a new helper.
> > >
> > > Ah; it seems I'm confused on helper vs instruction. As in, I've no idea
As in, I've no idea > > > what a helper is. > > > > bpf helper is a normal kernel function that can be called from bpf program. > > In assembler it's a direct function call. > > Ah, it is what is otherwise known as a standard library, > > > > > > There isn't anything that mandates the JIT uses the exact same locking > > > > > routines the interpreter does, is there? > > > > > > > > sure. This bpf_spin_lock() helper can be optimized whichever way the kernel wants. > > > > Like bpf_map_lookup_elem() call is _inlined_ by the verifier for certain map types. > > > > JITs don't even need to do anything. It looks like function call from bpf prog > > > > point of view, but in JITed code it is a sequence of native instructions. > > > > > > > > Say tomorrow we find out that bpf_prog->bpf_spin_lock()->queued_spin_lock() > > > > takes too much time then we can inline fast path of queued_spin_lock > > > > directly into bpf prog and save function call cost. > > > > > > OK, so then the JIT can optimize helpers. Would it not make sense to > > > have the simple test-and-set spinlock in the generic code and have the > > > JITs use arch_spinlock_t where appropriate? > > > > I think that pretty much the same as what I have with qspinlock. > > Instead of taking a risk how JIT writers implement bpf_spin_lock optimization > > I'm using qspinlock on architectures that are known to support it. > > I see the argument for it... > > > So instead of starting with dumb test-and-set there will be faster > > qspinlock from the start on x86, arm64 and few others archs. > > Those are the archs we care about the most anyway. Other archs can take > > time to optimize it (if optimizations are necessary at all). > > In general hacking JITs is much harder and more error prone than > > changing core and adding helpers. Hence we avoid touching JITs > > as much as possible. > > So archs/JITs are not trivially able to override those helper functions? 
> Because for example ARM (32bit) doesn't do qspinlock but its
> arch_spinlock_t is _much_ better than a TAS lock.

JITs can override.
There is no 'ready to use' facility for all types of helpers to do that,
but it's easy enough to add.
Having said that, I'm going to reject arm32 JIT patches that try
to use arch_spinlock instead of the generic bpf_spin_lock.
The last thing the arm32 JIT needs is this type of optimization.
Other JITs are a different story.
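
For readers following the TAS vs. arch_spinlock_t point above, here is a
minimal userspace sketch in C11 atomics of the two ideas being compared.
It is only an illustration under stated assumptions: the names tas_lock,
ticket_lock and their functions are hypothetical, this is not the kernel's
bpf_spin_lock or any arch_spinlock_t implementation, and the ticket lock is
used merely as one common example of a fairer lock (the thread does not say
what a given architecture uses internally).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set (TAS) lock: one flag, no ordering of waiters. */
struct tas_lock {
	atomic_flag held;
};

static inline void tas_lock_init(struct tas_lock *l)
{
	atomic_flag_clear(&l->held);
}

/* Returns true if the lock was free and is now held by the caller. */
static inline bool tas_trylock(struct tas_lock *l)
{
	return !atomic_flag_test_and_set_explicit(&l->held,
						  memory_order_acquire);
}

static inline void tas_lock_acquire(struct tas_lock *l)
{
	/* Every waiter hammers the same flag; whoever wins the race gets in. */
	while (!tas_trylock(l))
		;
}

static inline void tas_unlock(struct tas_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical ticket lock: waiters are served in arrival (FIFO) order. */
struct ticket_lock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint owner;	/* ticket currently being served */
};

static inline void ticket_lock_init(struct ticket_lock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->owner, 0);
}

static inline void ticket_lock_acquire(struct ticket_lock *l)
{
	/* Take a ticket, then wait until that ticket is being served. */
	unsigned int me = atomic_fetch_add_explicit(&l->next, 1,
						    memory_order_relaxed);

	while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
		;
}

static inline void ticket_unlock(struct ticket_lock *l)
{
	/* Only the holder writes owner, so a plain load plus store suffices. */
	unsigned int cur = atomic_load_explicit(&l->owner,
						memory_order_relaxed);

	atomic_store_explicit(&l->owner, cur + 1, memory_order_release);
}
```

The TAS variant gives no fairness guarantee under contention, while the
ticket variant hands the lock over in arrival order. That is roughly the
kind of difference behind the "much better than a TAS lock" remark quoted
above.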