Date: Tue, 8 Oct 2019 18:51:19 -0700
From: Alexei Starovoitov
To: Toke Høiland-Jørgensen
Cc: Daniel Borkmann, Alexei Starovoitov, Martin KaFai Lau, Song Liu, Yonghong Song, Marek Majkowski, Lorenz Bauer, Alan Maguire, Jesper Dangaard Brouer, David Miller, netdev@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next v3 1/5] bpf: Support chain calling multiple BPF programs after each other
Message-ID: <20191009015117.pldowv6n3k5p3ghr@ast-mbp.dhcp.thefacebook.com>
References: <157046883502.2092443.146052429591277809.stgit@alrua-x1> <157046883614.2092443.9861796174814370924.stgit@alrua-x1> <20191007204234.p2bh6sul2uakpmnp@ast-mbp.dhcp.thefacebook.com> <87sgo3lkx9.fsf@toke.dk>
In-Reply-To: <87sgo3lkx9.fsf@toke.dk>
X-Mailing-List: netdev@vger.kernel.org

On Tue, Oct 08, 2019 at 10:07:46AM +0200, Toke Høiland-Jørgensen wrote:
> Alexei Starovoitov writes:
>
> > On Mon, Oct 07, 2019 at 07:20:36PM +0200, Toke Høiland-Jørgensen wrote:
> >> From: Toke Høiland-Jørgensen
> >>
> >> This adds support for wrapping eBPF program dispatch in chain calling
> >> logic. The code injection is controlled by a flag at program load time; if
> >> the flag is set, the BPF program will carry a flag bit that changes the
> >> program dispatch logic to wrap it in a chain call loop.
> >>
> >> Ideally, it shouldn't be necessary to set the flag on program load time,
> >> but rather inject the calls when a chain call program is first loaded. The
> >> allocation logic sets the whole of struct bpf_prog to be read-only memory,
> >> so it can't immediately be modified, but conceivably we could just unlock
> >> the first page of the struct and flip the bit when a chain call program is
> >> first attached.
> >>
> >> Signed-off-by: Toke Høiland-Jørgensen
> >> ---
> >>  include/linux/bpf.h      |  3 +++
> >>  include/linux/filter.h   | 34 ++++++++++++++++++++++++++++++++--
> >>  include/uapi/linux/bpf.h |  6 ++++++
> >>  kernel/bpf/core.c        |  6 ++++++
> >>  kernel/bpf/syscall.c     |  4 +++-
> >>  5 files changed, 50 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> >> index 5b9d22338606..13e5f38cf5c6 100644
> >> --- a/include/linux/bpf.h
> >> +++ b/include/linux/bpf.h
> >> @@ -365,6 +365,8 @@ struct bpf_prog_stats {
> >>  	struct u64_stats_sync syncp;
> >>  };
> >>
> >> +#define BPF_NUM_CHAIN_SLOTS 8
> >> +
> >>  struct bpf_prog_aux {
> >>  	atomic_t refcnt;
> >>  	u32 used_map_cnt;
> >> @@ -383,6 +385,7 @@ struct bpf_prog_aux {
> >>  	struct list_head ksym_lnode;
> >>  	const struct bpf_prog_ops *ops;
> >>  	struct bpf_map **used_maps;
> >> +	struct bpf_prog *chain_progs[BPF_NUM_CHAIN_SLOTS];
> >>  	struct bpf_prog *prog;
> >>  	struct user_struct *user;
> >>  	u64 load_time; /* ns since boottime */
> >> diff --git a/include/linux/filter.h b/include/linux/filter.h
> >> index 2ce57645f3cd..3d1e4991e61d 100644
> >> --- a/include/linux/filter.h
> >> +++ b/include/linux/filter.h
> >> @@ -21,6 +21,7 @@
> >>  #include
> >>  #include
> >>  #include
> >> +#include
> >>
> >>  #include
> >>
> >> @@ -528,6 +529,7 @@ struct bpf_prog {
> >>  				is_func:1,	/* program is a bpf function */
> >>  				kprobe_override:1, /* Do we override a kprobe? */
> >>  				has_callchain_buf:1, /* callchain buffer allocated? */
> >> +				chain_calls:1, /* should this use the chain_call wrapper */
> >>  				enforce_expected_attach_type:1; /* Enforce expected_attach_type checking at attach time */
> >>  	enum bpf_prog_type type; /* Type of BPF program */
> >>  	enum bpf_attach_type expected_attach_type; /* For some prog types */
> >> @@ -551,6 +553,30 @@ struct sk_filter {
> >>  	struct bpf_prog *prog;
> >>  };
> >>
> >> +#define BPF_MAX_CHAIN_CALLS 32
> >> +static __always_inline unsigned int do_chain_calls(const struct bpf_prog *prog,
> >> +						    const void *ctx)
> >> +{
> >> +	int i = BPF_MAX_CHAIN_CALLS;
> >> +	int idx;
> >> +	u32 ret;
> >> +
> >> +	do {
> >> +		ret = (*(prog)->bpf_func)(ctx, prog->insnsi);
> >
> > This breaks program stats.
>
> Oh, right, silly me. Will fix.
>
> >> +
> >> +		if (ret + 1 >= BPF_NUM_CHAIN_SLOTS) {
> >> +			prog = prog->aux->chain_progs[0];
> >> +			continue;
> >> +		}
> >> +		idx = ret + 1;
> >> +		idx = array_index_nospec(idx, BPF_NUM_CHAIN_SLOTS);
> >> +
> >> +		prog = prog->aux->chain_progs[idx] ?: prog->aux->chain_progs[0];
> >> +	} while (prog && --i);
> >> +
> >> +	return ret;
> >> +}
> >> +
> >>  DECLARE_STATIC_KEY_FALSE(bpf_stats_enabled_key);
> >>
> >>  #define BPF_PROG_RUN(prog, ctx) ({ \
> >> @@ -559,14 +585,18 @@ DECLARE_STATIC_KEY_FALSE(bpf_stats_enabled_key);
> >>  	if (static_branch_unlikely(&bpf_stats_enabled_key)) { \
> >>  		struct bpf_prog_stats *stats; \
> >>  		u64 start = sched_clock(); \
> >> -		ret = (*(prog)->bpf_func)(ctx, (prog)->insnsi); \
> >> +		ret = prog->chain_calls ? \
> >> +			do_chain_calls(prog, ctx) : \
> >> +			(*(prog)->bpf_func)(ctx, (prog)->insnsi); \
> >
> > I thought you agreed on 'no performance regressions' rule?
>
> As I wrote in the cover letter I could not measure a performance
> impact from this, even with the simplest possible XDP program (where
> program setup time has the largest impact).
>
> This was the performance before/after patch (also in the cover letter):
>
> Before patch (XDP DROP program):  31.5 Mpps
> After patch (XDP DROP program):   32.0 Mpps
>
> So actually this *increases* performance ;)
> (Or rather, the difference is within the measurement uncertainty on my
> system).

I have a hard time believing such numbers.
If I wasn't clear before: Nack to such a hack in BPF_PROG_RUN.
Please implement proper indirect calls and jumps.
Apps have to cooperate with each other regardless, whereas the above is
a narrow solution to one problem.
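
For anyone following the thread without a kernel tree at hand, the
dispatch loop in the quoted do_chain_calls() can be modeled in plain
userspace C roughly as below. This is only a sketch under assumptions,
not kernel code: struct prog, run_chain(), the ret_* sample programs and
the run_cnt field are hypothetical names made up for illustration, and
the array_index_nospec() clamp from the patch is left out because it has
no userspace equivalent here. run_cnt exists only so the example can
observe how often each program ran; it is not how the kernel's stats
accounting works.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_CHAIN_SLOTS 8   /* mirrors BPF_NUM_CHAIN_SLOTS in the quoted patch */
#define MAX_CHAIN_CALLS 32  /* mirrors BPF_MAX_CHAIN_CALLS in the quoted patch */

/* Hypothetical stand-in for struct bpf_prog plus its aux->chain_progs[]. */
struct prog {
	uint32_t (*func)(const void *ctx);
	struct prog *chain[NUM_CHAIN_SLOTS]; /* slot 0 is the fallback */
	unsigned int run_cnt;                /* demo-only counter */
};

/* Sample "programs" that just return a fixed code. */
static uint32_t ret_zero(const void *ctx) { (void)ctx; return 0; }
static uint32_t ret_one(const void *ctx)  { (void)ctx; return 1; }
static uint32_t ret_two(const void *ctx)  { (void)ctx; return 2; }

/*
 * Run p, then keep following the chain: return code r selects slot
 * r + 1, and an out-of-range or empty slot falls back to slot 0.
 * The walk ends when no next program exists or after MAX_CHAIN_CALLS
 * programs have run. The last program's return code is returned.
 */
static uint32_t run_chain(struct prog *p, const void *ctx)
{
	int budget = MAX_CHAIN_CALLS;
	uint32_t ret;

	assert(p != NULL);
	do {
		struct prog *next = NULL;
		uint32_t idx;

		ret = p->func(ctx);
		p->run_cnt++;

		idx = ret + 1; /* unsigned, so UINT32_MAX wraps to slot 0 */
		if (idx < NUM_CHAIN_SLOTS)
			next = p->chain[idx];
		if (!next)
			next = p->chain[0];
		p = next;
	} while (p && --budget);

	return ret;
}
```

With a program returning 2 that has another program installed in slot 3,
run_chain() runs both once and returns the second program's code. A
program that lists itself in slot 0 is cut off after MAX_CHAIN_CALLS
runs, which is the only thing bounding the loop in this model.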