From: Jann Horn
Date: Fri, 25 Jan 2019 02:46:55 +0100
Subject: Re: [PATCH v4 bpf-next 1/9] bpf: introduce bpf_spin_lock
To: paulmck@linux.ibm.com
Cc: Alexei Starovoitov, Peter Zijlstra, Alexei Starovoitov, "David S. Miller", Daniel Borkmann, jakub.kicinski@netronome.com, Network Development, kernel-team@fb.com, Ingo Molnar, Will Deacon
In-Reply-To: <20190125012224.GZ4240@linux.ibm.com>
References: <20190124041403.2100609-1-ast@kernel.org>
 <20190124041403.2100609-2-ast@kernel.org>
 <20190124180109.GA27771@hirez.programming.kicks-ass.net>
 <20190124185652.GB17767@hirez.programming.kicks-ass.net>
 <20190124234232.GY4240@linux.ibm.com>
 <20190125000515.jizijxz4n735gclx@ast-mbp.dhcp.thefacebook.com>
 <20190125012224.GZ4240@linux.ibm.com>
X-Mailing-List: netdev@vger.kernel.org

On Fri, Jan 25, 2019 at 2:22 AM Paul E. McKenney wrote:
> On Thu, Jan 24, 2019 at 04:05:16PM -0800, Alexei Starovoitov wrote:
> > On Thu, Jan 24, 2019 at 03:42:32PM -0800, Paul E. McKenney wrote:
> > > On Thu, Jan 24, 2019 at 07:56:52PM +0100, Peter Zijlstra wrote:
> > > > On Thu, Jan 24, 2019 at 07:01:09PM +0100, Peter Zijlstra wrote:
> > > > >
> > > > > Thanks for having kernel/locking people on Cc...
> > > > >
> > > > > On Wed, Jan 23, 2019 at 08:13:55PM -0800, Alexei Starovoitov wrote:
> > > > >
> > > > > > Implementation details:
> > > > > > - on !SMP bpf_spin_lock() becomes nop
> > > > >
> > > > > Because no BPF program is preemptible? I don't see any assertions or
> > > > > even a comment that says this code is non-preemptible.
> > > > >
> > > > > AFAICT some of the BPF_RUN_PROG things are under rcu_read_lock() only,
> > > > > which is not sufficient.
> > > > >
> > > > > > - on architectures that don't support queued_spin_lock trivial lock is used.
> > > > > >   Note that arch_spin_lock cannot be used, since not all archs agree that
> > > > > >   zero == unlocked and sizeof(arch_spinlock_t) != sizeof(__u32).
> > > > >
> > > > > I really don't much like direct usage of qspinlock; esp. not as a
> > > > > surprise.
> > >
> > > Substituting the lightweight-reader SRCU as discussed earlier would allow
> > > use of a more generic locking primitive, for example, one that allowed
> > > blocking, at least in cases were the context allowed this.
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
> > > branch srcu-lr.2019.01.16a.
> > >
> > > One advantage of a more generic locking primitive would be keeping BPF
> > > programs independent of internal changes to spinlock primitives.
> >
> > Let's keep "srcu in bpf" discussion separate from bpf_spin_lock discussion.
> > bpf is not switching to srcu any time soon.
> > If/when it happens it will be only for certain prog+map types
> > like bpf syscall probes that need to be able to do copy_from_user
> > from bpf prog.
>
> Hmmm... What prevents BPF programs from looping infinitely within an
> RCU reader, and as you noted, preemption disabled?
>
> If BPF programs are in fact allowed to loop infinitely, it would be
> very good for the health of the kernel to have preemption enabled.
> And to be within an SRCU read-side critical section instead of an RCU
> read-side critical section.

The BPF verifier prevents loops; this is in push_insn() in
kernel/bpf/verifier.c, which errors out with -EINVAL when a back edge
is encountered. For non-root programs, that limits the maximum number
of instructions per eBPF engine execution to
BPF_MAXINSNS*MAX_TAIL_CALL_CNT==4096*32==131072 (but that includes
call instructions, which can cause relatively expensive operations
like hash table lookups). For programs created with CAP_SYS_ADMIN,
things get more tricky because you can create your own functions and
call them repeatedly; I'm not sure whether the pessimal runtime there
becomes exponential, or whether there is some check that catches this.
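
To make the back-edge argument concrete, here is a minimal userspace
sketch of a generic depth-first back-edge check. It is not the code in
kernel/bpf/verifier.c: struct toy_insn, toy_check_cfg(), toy_max_insns()
and the SKETCH_* macros are hypothetical names made up for this
illustration, and the instruction model (at most two successors per
instruction) is a simplifying assumption. Only the values 4096 and 32
and the -EINVAL result are taken from the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Values quoted in the message above; the macro names are illustrative. */
#define SKETCH_MAXINSNS       4096
#define SKETCH_MAX_TAIL_CALLS 32

/* Hypothetical toy instruction: up to two successors, -1 means "none". */
struct toy_insn {
	int next;   /* fall-through successor */
	int target; /* jump target */
};

enum { UNVISITED = 0, DISCOVERED, EXPLORED };

/*
 * Iterative depth-first walk starting at instruction 0. An edge to an
 * instruction that is still DISCOVERED (i.e. on the DFS stack) is a back
 * edge, which means the program contains a loop.
 * Returns 0 for a loop-free program, -EINVAL otherwise.
 */
int toy_check_cfg(const struct toy_insn *prog, int n)
{
	static unsigned char state[SKETCH_MAXINSNS];
	static int stack[SKETCH_MAXINSNS];
	int sp = 0;

	assert(n > 0 && n <= SKETCH_MAXINSNS);
	for (int i = 0; i < n; i++)
		state[i] = UNVISITED;

	state[0] = DISCOVERED;
	stack[sp++] = 0;

	while (sp > 0) {
		int cur = stack[sp - 1];
		int succ[2] = { prog[cur].next, prog[cur].target };
		int pushed = 0;

		for (int k = 0; k < 2 && !pushed; k++) {
			int w = succ[k];

			if (w < 0)
				continue;        /* no such edge */
			if (w >= n)
				return -EINVAL;  /* jump out of range */
			if (state[w] == DISCOVERED)
				return -EINVAL;  /* back edge: loop */
			if (state[w] == UNVISITED) {
				/* each instruction is pushed at most once */
				state[w] = DISCOVERED;
				stack[sp++] = w;
				pushed = 1;
			}
		}
		if (!pushed) {
			state[cur] = EXPLORED;   /* all successors handled */
			sp--;
		}
	}
	return 0;
}

/* Upper bound on executed instructions under the limits quoted above. */
long toy_max_insns(void)
{
	return (long)SKETCH_MAXINSNS * SKETCH_MAX_TAIL_CALLS;
}
```

With this sketch, a straight-line program or a forward-only diamond
passes, while any jump back to an instruction still on the DFS stack
(including a self-jump) is rejected. Edges into already EXPLORED
instructions are forward or cross edges and are allowed, which is why
branches that rejoin do not count as loops.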