Subject: [PATCH bpf-next V16 0/7] bpf: New approach for BPF MTU handling
From: Jesper Dangaard Brouer
To: bpf@vger.kernel.org
Cc: Jesper Dangaard Brouer, netdev@vger.kernel.org, Daniel Borkmann,
 Alexei Starovoitov, maze@google.com, lmb@cloudflare.com, shaun@tigera.io,
 Lorenzo Bianconi, marek@cloudflare.com, John Fastabend, Jakub Kicinski,
 eyal.birger@gmail.com, colrack@gmail.com
Date: Tue, 09 Feb 2021 14:38:04 +0100
Message-ID: <161287779408.790810.15631860742170694244.stgit@firesoul>
User-Agent: StGit/0.19
X-Mailing-List: netdev@vger.kernel.org

This patchset drops all the MTU checks in TC BPF-helpers that limit
growing the packet size. This is done because these BPF-helpers don't
take redirect into account, which can result in their MTU check being
done against the wrong netdev.

The new approach is to give BPF-programs knowledge about the MTU at the
netdev (via ifindex) and fib route lookup level. This means some
BPF-helpers are added and extended to make it possible to do MTU checks
in the BPF-code.

If a BPF-prog doesn't comply with the MTU then the packet will eventually
get dropped at some other layer. In some cases the existing kernel MTU
checks will drop the packet, but there are also cases where BPF can
bypass these checks.
Specifically, this happens when doing a TC-redirect from the ingress step
(sch_handle_ingress) into the egress code path (basically calling
dev_queue_xmit()). It is left up to driver code to handle these kinds of
MTU violations.

One advantage of this approach is that an ingress BPF-prog can send
information to an egress BPF-prog via packet data. With the MTU checks
removed in the helpers, and also not done in the skb_do_redirect() call,
this allows an ingress BPF-prog to communicate with an egress BPF-prog
via packet data, as long as the egress BPF-prog removes this data prior
to transmitting the packet.

This patchset is primarily focused on TC-BPF, but I've made sure that
the MTU BPF-helpers also work for XDP BPF-programs.

V2: Change BPF-helper API from lookup to check.
V3: Drop enforcement of MTU in net-core, leave it to drivers.
V4: Keep sanity limit + netdev "up" checks + rename BPF-helper.
V5: Fix uninit variable + name struct output member mtu_result.
V6: Use bpf_check_mtu() in selftest
V7: Fix logic using tot_len and add another selftest
V8: Add better selftests for BPF-helper bpf_check_mtu
V9: Remove patch that use skb_set_redirected
V10: Fix selftests and 'tot_len' MTU check like XDP
V11: Fix nitpicks in selftests
V12: Adjustments requested by Daniel
V13: More adjustments requested by Daniel
V14: Improve man page for BPF-helper bpf_check_mtu
V15: Missing static for a function declaration
V16: Revert part of V13 in patch 2

---
Feel free to trim version comments before applying.
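
For illustration only, the idea described above (check the packet length
against the MTU of the netdev the packet will actually leave on, not the
one it arrived on) can be modelled in plain userspace C. This is a sketch
under assumptions, not the code in this patchset: struct model_netdev,
enum model_mtu_verdict and model_check_mtu() are hypothetical names, and
the real bpf_check_mtu helper has its own signature and semantics that
are not reproduced here. Only the output name mtu_result is borrowed from
the V5 changelog entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of a netdev; not a kernel structure. */
struct model_netdev {
	uint32_t ifindex;
	uint32_t mtu;	/* L3 MTU in bytes */
};

/* Hypothetical verdicts for the model below. */
enum model_mtu_verdict {
	MODEL_MTU_OK = 0,
	MODEL_MTU_FRAG_NEEDED = 1,
};

/*
 * Check whether a packet of l3_len bytes, after being grown (or shrunk)
 * by len_diff bytes, still fits the MTU of the target netdev.
 * If mtu_result is non-NULL, the MTU that was checked against is
 * written to it, so the caller can act on the value.
 */
static enum model_mtu_verdict
model_check_mtu(const struct model_netdev *target, uint32_t l3_len,
		int32_t len_diff, uint32_t *mtu_result)
{
	int64_t new_len;

	assert(target != NULL);

	if (mtu_result)
		*mtu_result = target->mtu;

	/* Use a wide signed type so a negative len_diff cannot wrap. */
	new_len = (int64_t)l3_len + len_diff;
	if (new_len > (int64_t)target->mtu)
		return MODEL_MTU_FRAG_NEEDED;

	return MODEL_MTU_OK;
}
```

In this model, a 1490 byte packet that is about to grow by 20 bytes passes
when checked against an ingress netdev with MTU 9000, but fails against an
egress netdev with MTU 1500. That difference is the "wrong netdev" problem
the cover text refers to.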

Jesper Dangaard Brouer (7):
      bpf: Remove MTU check in __bpf_skb_max_len
      bpf: fix bpf_fib_lookup helper MTU check for SKB ctx
      bpf: bpf_fib_lookup return MTU value as output when looked up
      bpf: add BPF-helper for MTU checking
      bpf: drop MTU check when doing TC-BPF redirect to ingress
      selftests/bpf: use bpf_check_mtu in selftest test_cls_redirect
      selftests/bpf: tests using bpf_check_mtu BPF-helper

 include/linux/netdevice.h                          |   32 +++
 include/uapi/linux/bpf.h                           |   86 ++++++++
 net/core/dev.c                                     |   32 +--
 net/core/filter.c                                  |  167 ++++++++++++++-
 tools/include/uapi/linux/bpf.h                     |   86 ++++++++
 tools/testing/selftests/bpf/prog_tests/check_mtu.c |  216 ++++++++++++++++++++
 tools/testing/selftests/bpf/progs/test_check_mtu.c |  198 ++++++++++++++++++
 .../selftests/bpf/progs/test_cls_redirect.c        |    7 +
 8 files changed, 779 insertions(+), 45 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/check_mtu.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_check_mtu.c

--