From: Toke Høiland-Jørgensen
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer
Cc: Kumar Kartikeya Dwivedi, netdev@vger.kernel.org, bpf@vger.kernel.org, Freysteinn Alfredsson, Cong Wang, Toke Høiland-Jørgensen, Eric Dumazet, Paolo Abeni
Subject: [RFC PATCH 12/17] bpf: Add helper to schedule an interface for TX dequeue
Date: Wed, 13 Jul 2022 13:14:20 +0200
Message-Id: <20220713111430.134810-13-toke@redhat.com>
In-Reply-To: <20220713111430.134810-1-toke@redhat.com>
References: <20220713111430.134810-1-toke@redhat.com>

This adds a helper that a BPF program can call to schedule an interface
for transmission. The helper can be used both from a regular XDP program
(to schedule transmission after queueing a packet) and from a dequeue
program (to (re-)schedule transmission after a dequeue operation). In
particular, the latter use can be combined with BPF timers to schedule
delayed transmission, for instance to implement traffic shaping.

The helper always schedules transmission on the interface on the current
CPU. For cross-CPU operation, it is up to the BPF program to arrange for
the helper to be called on the appropriate CPU, either by configuring
hardware RSS appropriately, or by using a cpumap. Likewise, it is up to
the BPF programs to decide whether to use separate queues per CPU (by
using multiple maps to queue packets in), or to accept the lock
contention of using a single map across CPUs.

Signed-off-by: Toke Høiland-Jørgensen
---
 include/uapi/linux/bpf.h       | 11 +++++++
 net/core/filter.c              | 52 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 11 +++++++
 3 files changed, 74 insertions(+)
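[ Not part of the patch: a minimal usage sketch of the new helper from
  the XDP side. The IFINDEX constant, the xdp_enqueue program name and
  the elided enqueue step are placeholders for illustration; only
  bpf_schedule_iface_dequeue() itself comes from this patch, and the
  snippet assumes helper definitions regenerated from the updated UAPI
  header. ]

  #include <linux/bpf.h>
  #include <bpf/bpf_helpers.h>

  /* Hypothetical egress ifindex; a real program would get this from
   * configuration (e.g. a global variable set by user space).
   */
  #define IFINDEX 1

  SEC("xdp")
  int xdp_enqueue(struct xdp_md *ctx)
  {
          /* ... enqueue the packet into a queueing map here, using the
           * mechanism from earlier patches in this series ...
           */

          /* Kick the dequeue program on the egress interface; this
           * returns -ENOENT if no dequeue program is attached there.
           */
          if (bpf_schedule_iface_dequeue(ctx, IFINDEX, 0))
                  return XDP_ABORTED;

          /* Assumes the elided enqueue step set up a redirect. */
          return XDP_REDIRECT;
  }

  char _license[] SEC("license") = "GPL";

[ A dequeue program can call the same helper through its own context to
  re-arm transmission later, e.g. from a shaping schedule, as described
  in the commit message. ]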
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d44382644391..b352ecc280f4 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5358,6 +5358,16 @@ union bpf_attr {
  *		*bpf_packet_dequeue()* (and checked to not be NULL).
  *	Return
  *		This always succeeds and returns zero.
+ *
+ * long bpf_schedule_iface_dequeue(void *ctx, int ifindex, int flags)
+ *	Description
+ *		Schedule the interface with index *ifindex* for transmission from
+ *		its dequeue program as soon as possible. The *flags* argument
+ *		must be zero.
+ *
+ *	Return
+ *		Returns zero on success, or -ENOENT if no dequeue program is
+ *		loaded on the interface.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5570,6 +5580,7 @@ union bpf_attr {
 	FN(tcp_raw_check_syncookie_ipv6),	\
 	FN(packet_dequeue),		\
 	FN(packet_drop),		\
+	FN(schedule_iface_dequeue),	\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/net/core/filter.c b/net/core/filter.c
index 7c89eaa01c29..bb556d873b52 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4431,6 +4431,54 @@ static const struct bpf_func_proto bpf_xdp_redirect_map_proto = {
 	.arg3_type	= ARG_ANYTHING,
 };
 
+static int bpf_schedule_iface_dequeue(struct net *net, int ifindex, int flags)
+{
+	struct net_device *dev;
+	struct bpf_prog *prog;
+
+	if (flags)
+		return -EINVAL;
+
+	dev = dev_get_by_index_rcu(net, ifindex);
+	if (!dev)
+		return -ENODEV;
+
+	prog = rcu_dereference(dev->xdp_dequeue_prog);
+	if (!prog)
+		return -ENOENT;
+
+	dev_schedule_xdp_dequeue(dev);
+	return 0;
+}
+
+BPF_CALL_3(bpf_xdp_schedule_iface_dequeue, struct xdp_buff *, ctx, int, ifindex, int, flags)
+{
+	return bpf_schedule_iface_dequeue(dev_net(ctx->rxq->dev), ifindex, flags);
+}
+
+static const struct bpf_func_proto bpf_xdp_schedule_iface_dequeue_proto = {
+	.func		= bpf_xdp_schedule_iface_dequeue,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+	.arg2_type	= ARG_ANYTHING,
+	.arg3_type	= ARG_ANYTHING,
+};
+
+BPF_CALL_3(bpf_dequeue_schedule_iface_dequeue, struct dequeue_data *, ctx, int, ifindex, int, flags)
+{
+	return bpf_schedule_iface_dequeue(dev_net(ctx->txq->dev), ifindex, flags);
+}
+
+static const struct bpf_func_proto bpf_dequeue_schedule_iface_dequeue_proto = {
+	.func		= bpf_dequeue_schedule_iface_dequeue,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_CTX,
+	.arg2_type	= ARG_ANYTHING,
+	.arg3_type	= ARG_ANYTHING,
+};
+
 BTF_ID_LIST_SINGLE(xdp_md_btf_ids, struct, xdp_md)
 
 BPF_CALL_4(bpf_packet_dequeue, struct dequeue_data *, ctx, struct bpf_map *, map,
@@ -8068,6 +8116,8 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_xdp_fib_lookup_proto;
 	case BPF_FUNC_check_mtu:
 		return &bpf_xdp_check_mtu_proto;
+	case BPF_FUNC_schedule_iface_dequeue:
+		return &bpf_xdp_schedule_iface_dequeue_proto;
 #ifdef CONFIG_INET
 	case BPF_FUNC_sk_lookup_udp:
 		return &bpf_xdp_sk_lookup_udp_proto;
@@ -8105,6 +8155,8 @@ dequeue_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_packet_dequeue_proto;
 	case BPF_FUNC_packet_drop:
 		return &bpf_packet_drop_proto;
+	case BPF_FUNC_schedule_iface_dequeue:
+		return &bpf_dequeue_schedule_iface_dequeue_proto;
 	default:
 		return bpf_base_func_proto(func_id);
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 1dab68a89e18..9eb9a5b52c76 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5358,6 +5358,16 @@ union bpf_attr {
  *		*bpf_packet_dequeue()* (and checked to not be NULL).
  *	Return
  *		This always succeeds and returns zero.
+ *
+ * long bpf_schedule_iface_dequeue(void *ctx, int ifindex, int flags)
+ *	Description
+ *		Schedule the interface with index *ifindex* for transmission from
+ *		its dequeue program as soon as possible. The *flags* argument
+ *		must be zero.
+ *
+ *	Return
+ *		Returns zero on success, or -ENOENT if no dequeue program is
+ *		loaded on the interface.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5570,6 +5580,7 @@ union bpf_attr {
 	FN(tcp_raw_check_syncookie_ipv6),	\
 	FN(packet_dequeue),		\
 	FN(packet_drop),		\
+	FN(schedule_iface_dequeue),	\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.37.0