From: Stanislav Fomichev
Date: Thu, 8 Dec 2022 11:07:43 -0800
Subject: Re: [PATCH bpf-next v3 03/12] bpf: XDP metadata RX kfuncs
To: Jakub Kicinski
Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
    andrii@kernel.org, martin.lau@linux.dev, song@kernel.org, yhs@fb.com,
    john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com,
    jolsa@kernel.org, David Ahern, Willem de Bruijn,
    Jesper Dangaard Brouer, Anatoly Burakov, Alexander Lobakin,
    Magnus Karlsson, Maryam Tahhan, xdp-hints@xdp-project.net,
    netdev@vger.kernel.org
In-Reply-To: <20221207210019.41dc9b6b@kernel.org>
References: <20221206024554.3826186-1-sdf@google.com>
    <20221206024554.3826186-4-sdf@google.com>
    <20221207210019.41dc9b6b@kernel.org>

On Wed, Dec 7, 2022 at 9:00 PM Jakub Kicinski wrote:
>
> The offload tests still pass after this, right?

Yeah, had to bring them back in shape just for the purpose of making
sure they're still happy:
https://lore.kernel.org/bpf/20221206232739.2504890-1-sdf@google.com/

> TBH I don't remember this code well enough to spot major issues.

No worries!
Appreciate the review and the comments on consistency; I'm also
mostly unaware how this whole offloading works :-)

> On Mon, 5 Dec 2022 18:45:45 -0800 Stanislav Fomichev wrote:
> > There is an ndo handler per kfunc, the verifier replaces a call to the
> > generic kfunc with a call to the per-device one.
> >
> > For XDP, we define a new kfunc set (xdp_metadata_kfunc_ids) which
> > implements all possible metadata kfuncs. Not all devices have to
> > implement them. If kfunc is not supported by the target device,
> > the default implementation is called instead.
> >
> > Upon loading, if BPF_F_XDP_HAS_METADATA is passed via prog_flags,
> > we treat prog_index as target device for kfunc resolution.
>
> > @@ -2476,10 +2477,18 @@ void bpf_offload_dev_netdev_unregister(struct bpf_offload_dev *offdev,
> >                                         struct net_device *netdev);
> >  bool bpf_offload_dev_match(struct bpf_prog *prog, struct net_device *netdev);
> >
> > +void *bpf_offload_resolve_kfunc(struct bpf_prog *prog, u32 func_id);
>
> There seems to be some mis-naming going on. I expected:
>
>   offloaded =~ nfp
>   dev_bound == XDP w/ funcs
>
> *_offload_resolve_kfunc looks misnamed? Unless you want to resolve
> for HW offload?

Yeah, I had the same expectations, but I was also assuming that this
bpf_offload_resolve_kfunc might also at some point handle offloaded
metadata kfuncs.

But looking at it again, agree that the following looks a bit off:

if (bpf_prog_is_dev_bound()) {
        xxx = bpf_offload_resolve_kfunc()
}

Let me use the dev_bound prefix more consistently here and in the
other places you've pointed out.

> >  void unpriv_ebpf_notify(int new_state);
> >
> >  #if defined(CONFIG_NET) && defined(CONFIG_BPF_SYSCALL)
> >  int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr);
> > +void bpf_offload_bound_netdev_unregister(struct net_device *dev);
>
> ditto: offload_bound is a mix of terms no?

Ack, will do bpf_dev_bound_netdev_unregister here, thanks!
> > @@ -1611,6 +1612,10 @@ struct net_device_ops {
> >          ktime_t (*ndo_get_tstamp)(struct net_device *dev,
> >                                    const struct skb_shared_hwtstamps *hwtstamps,
> >                                    bool cycles);
> > +        bool (*ndo_xdp_rx_timestamp_supported)(const struct xdp_md *ctx);
> > +        u64 (*ndo_xdp_rx_timestamp)(const struct xdp_md *ctx);
> > +        bool (*ndo_xdp_rx_hash_supported)(const struct xdp_md *ctx);
> > +        u32 (*ndo_xdp_rx_hash)(const struct xdp_md *ctx);
> >  };
>
> Is this on the fast path? Can we do an indirection?

No, we resolve them at load time from "generic" bpf_xdp_metadata_rx_
to ndo_xdp_rx_.

> Put these ops in their own struct and add a pointer to that struct
> in net_device_ops? Purely for grouping reasons because the netdev
> ops are getting orders of magnitude past the size where you can
> actually find stuff in this struct.

Oh, great idea, will do!

> >          bpf_free_used_maps(aux);
> >          bpf_free_used_btfs(aux);
> > -        if (bpf_prog_is_offloaded(aux))
> > +        if (bpf_prog_is_dev_bound(aux))
> >                  bpf_prog_offload_destroy(aux->prog);
>
> This also looks a touch like a mix of terms (condition vs function
> called).

Here, not sure, open to suggestions. These
bpf_prog_offload_init/bpf_prog_offload_destroy are generic enough
(now) that I'm calling them for both dev_bound/offloaded.

The following paths trigger for both offloaded/dev_bound cases:

if (bpf_prog_is_dev_bound()) bpf_prog_offload_init();
if (bpf_prog_is_dev_bound()) bpf_prog_offload_destroy();

Do you think it's worth it having completely separate
dev_bound/offloaded paths? Or, alternatively, can rename to
bpf_prog_dev_bound_{init,destroy} but still handle both cases?

> > +static int __bpf_offload_init(void);
> > +static int __bpf_offload_dev_netdev_register(struct bpf_offload_dev *offdev,
> > +                                              struct net_device *netdev);
> > +static void __bpf_offload_dev_netdev_unregister(struct bpf_offload_dev *offdev,
> > +                                                struct net_device *netdev);
>
> fwd declarations are yuck

SG, will move them here instead.
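
For reference, the grouping suggestion and the load-time resolution
discussed above could look roughly like the sketch below. It is a
userspace model only: every name in it (xdp_ctx_model,
xdp_metadata_ops_model, net_device_ops_model, resolve_rx_*) is
hypothetical and not kernel API, and the default handlers returning 0
is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the XDP context; not the kernel's xdp_md. */
struct xdp_ctx_model {
	uint64_t hw_tstamp;
	uint32_t hw_hash;
};

typedef uint64_t (*rx_timestamp_fn)(const struct xdp_ctx_model *ctx);
typedef uint32_t (*rx_hash_fn)(const struct xdp_ctx_model *ctx);

/* Metadata callbacks grouped in their own struct. */
struct xdp_metadata_ops_model {
	rx_timestamp_fn rx_timestamp;
	rx_hash_fn rx_hash;
};

/* The large ops struct only carries a pointer to the group. */
struct net_device_ops_model {
	const struct xdp_metadata_ops_model *xdp_metadata_ops; /* may be NULL */
};

/* Defaults used when the device does not implement a callback
 * (returning 0 is an assumption of this sketch). */
static uint64_t default_rx_timestamp(const struct xdp_ctx_model *ctx)
{
	(void)ctx;
	return 0;
}

static uint32_t default_rx_hash(const struct xdp_ctx_model *ctx)
{
	(void)ctx;
	return 0;
}

/* A model driver that implements only the timestamp callback. */
static uint64_t drv_rx_timestamp(const struct xdp_ctx_model *ctx)
{
	return ctx->hw_tstamp;
}

static const struct xdp_metadata_ops_model drv_metadata_ops = {
	.rx_timestamp = drv_rx_timestamp,
	/* .rx_hash intentionally left NULL */
};

/* Resolution happens once, at "load time"; the per-packet path then
 * calls the returned pointer without consulting the ops table again. */
static rx_timestamp_fn
resolve_rx_timestamp(const struct net_device_ops_model *ops)
{
	assert(ops != NULL);
	if (ops->xdp_metadata_ops && ops->xdp_metadata_ops->rx_timestamp)
		return ops->xdp_metadata_ops->rx_timestamp;
	return default_rx_timestamp;
}

static rx_hash_fn
resolve_rx_hash(const struct net_device_ops_model *ops)
{
	assert(ops != NULL);
	if (ops->xdp_metadata_ops && ops->xdp_metadata_ops->rx_hash)
		return ops->xdp_metadata_ops->rx_hash;
	return default_rx_hash;
}
```

With this shape, a device without metadata support simply leaves the
pointer NULL and every lookup falls back to the default.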
> >  static int bpf_dev_offload_check(struct net_device *netdev)
> >  {
> >          if (!netdev)
> > @@ -87,13 +93,17 @@ int bpf_prog_offload_init(struct bpf_prog *prog, union bpf_attr *attr)
> >              attr->prog_type != BPF_PROG_TYPE_XDP)
> >                  return -EINVAL;
> >
> > -        if (attr->prog_flags)
> > +        if (attr->prog_flags & ~BPF_F_XDP_HAS_METADATA)
> >                  return -EINVAL;
> >
> >          offload = kzalloc(sizeof(*offload), GFP_USER);
> >          if (!offload)
> >                  return -ENOMEM;
> >
> > +        err = __bpf_offload_init();
> > +        if (err)
> > +                return err;
>
> leaks offload

Oops, let me actually move this to late_initcall as you suggest below.

> > @@ -209,6 +233,19 @@ bpf_prog_offload_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt)
> >          up_read(&bpf_devs_lock);
> >  }
> >
> > +static void maybe_remove_bound_netdev(struct net_device *dev)
> > +{
>
> func name prefix ?

Good point, will rename to bpf_dev_bound_try_remove_netdev.

> > -struct bpf_offload_dev *
> > -bpf_offload_dev_create(const struct bpf_prog_offload_ops *ops, void *priv)
> > +static int __bpf_offload_init(void)
> >  {
> > -        struct bpf_offload_dev *offdev;
> >          int err;
> >
> >          down_write(&bpf_devs_lock);
> > @@ -680,12 +740,25 @@ bpf_offload_dev_create(const struct bpf_prog_offload_ops *ops, void *priv)
> >                  err = rhashtable_init(&offdevs, &offdevs_params);
> >                  if (err) {
> >                          up_write(&bpf_devs_lock);
> > -                        return ERR_PTR(err);
> > +                        return err;
> >                  }
> >                  offdevs_inited = true;
> >          }
> >          up_write(&bpf_devs_lock);
> >
> > +        return 0;
> > +}
>
> Would late_initcall() or some such not work for this?

Agreed, let's move it to the initcall instead.
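
To illustrate the "leaks offload" remark above: the hunk returns early
after the allocation has succeeded, so the buffer is never freed. The
plan in this thread is to move the init into an initcall, which removes
the failing call from this path altogether. The sketch below shows the
other conventional fix, a goto-based unwind, in a userspace model. All
names (offload_model, fake_global_init, prog_offload_init_model,
live_allocs) are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Counts outstanding allocations so the cleanup can be observed. */
static int live_allocs;

struct offload_model {
	int placeholder;
};

/* Stand-in for a global init step that may fail. */
static int fake_global_init(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

static int prog_offload_init_model(int fail_init, struct offload_model **out)
{
	struct offload_model *offload;
	int err;

	assert(out != NULL);

	offload = calloc(1, sizeof(*offload));
	if (!offload)
		return -ENOMEM;
	live_allocs++;

	err = fake_global_init(fail_init);
	if (err)
		goto err_free; /* a bare "return err" here would leak offload */

	*out = offload;
	return 0;

err_free:
	free(offload);
	live_allocs--;
	return err;
}
```

On the failure path the caller's pointer is left untouched and the
allocation count returns to zero.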
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 5b221568dfd4..862e03fcffa6 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -9228,6 +9228,10 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack
> >                          NL_SET_ERR_MSG(extack, "Using device-bound program without HW_MODE flag is not supported");
>
> extack should get updated here, I reckon, maybe in previous patch

Oh, thanks for spotting, will fix.

> >                          return -EINVAL;
> >                  }
> > +                if (bpf_prog_is_dev_bound(new_prog->aux) && !bpf_offload_dev_match(new_prog, dev)) {
>
> bound_dev_match() ?

Right, so this is another case where it works for both cases. Maybe
rename to bpf_dev_bound_match and use for both offloaded/dev_bound?
Or do you prefer completely separate paths?

> > +                        NL_SET_ERR_MSG(extack, "Cannot attach to a different target device");
>
> different than.. ?

Borrowing from netdevsim, lmk if the following won't work here:
"Program bound to different device"

> > +                        return -EINVAL;
> > +                }
> >                  if (new_prog->expected_attach_type == BPF_XDP_DEVMAP) {
> >                          NL_SET_ERR_MSG(extack, "BPF_XDP_DEVMAP programs can not be attached to a device");
> >                          return -EINVAL;