From: Willem de Bruijn
To: netdev@vger.kernel.org
Cc: Willem de Bruijn
Subject: [PATCH net-next RFC] ipv6: elide flowlabel check if no exclusive leases exist
Date: Fri, 17 May 2019 11:56:25 -0400
Message-Id: <20190517155625.117835-1-willemdebruijn.kernel@gmail.com>
X-Mailer: git-send-email 2.21.0.1020.gf2820cf01a-goog
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Willem de Bruijn

Processes can request ipv6 flowlabels with cmsg IPV6_FLOWINFO.
If not set, by default an autogenerated flowlabel is selected.

Explicit flowlabels require a control operation per label plus a
datapath check on every connection (every datagram if unconnected).
This is particularly expensive on unconnected sockets with many
connections, such as QUIC.

In the common case, where no lease is exclusive, the check can be
safely elided, as both lease request and check trivially succeed.
Indeed, autoflowlabel does the same (even with exclusive leases).

Elide the check if no process has requested an exclusive lease.

This is an optimization. Robust applications still have to revert to
requesting leases if the fast path fails due to an exclusive lease.

This is decidedly an RFC patch:

- need to update all fl6_sock_lookup callers, not just udp
- behavior should be per-netns isolated

Other approaches considered:

- a single "get all flowlabels, non-exclusive" flowlabel get request
  if set, elide fl6_sock_lookup and fail exclusive lease requests
- sysctls (only useful if on by default, with static_branch)
  A) "non-exclusive mode", failing all exclusive lease requests:
     processes already have to be robust against lease failure
  B) just bypass check in fl6_sock_lookup, like autoflowlabel

Signed-off-by: Willem de Bruijn
---
 include/net/ipv6.h       | 11 +++++++++++
 net/ipv6/ip6_flowlabel.c |  6 ++++++
 net/ipv6/udp.c           |  8 ++++----
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index daf80863d3a50..8881cee572410 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -343,7 +344,17 @@ static inline void txopt_put(struct ipv6_txoptions *opt)
 		kfree_rcu(opt, rcu);
 }
 
+extern struct static_key_false ipv6_flowlabel_exclusive;
 struct ip6_flowlabel *fl6_sock_lookup(struct sock *sk, __be32 label);
+static inline struct ip6_flowlabel *fl6_sock_verify(struct sock *sk,
+						    __be32 label)
+{
+	if (static_branch_unlikely(&ipv6_flowlabel_exclusive))
+		return fl6_sock_lookup(sk, label) ? : ERR_PTR(-ENOENT);
+
+	return NULL;
+}
+
 struct ipv6_txoptions *fl6_merge_options(struct ipv6_txoptions *opt_space,
 					 struct ip6_flowlabel *fl,
 					 struct ipv6_txoptions *fopt);
diff --git a/net/ipv6/ip6_flowlabel.c b/net/ipv6/ip6_flowlabel.c
index be5f3d7ceb966..d5f4233b04e0c 100644
--- a/net/ipv6/ip6_flowlabel.c
+++ b/net/ipv6/ip6_flowlabel.c
@@ -57,6 +57,8 @@ static DEFINE_SPINLOCK(ip6_fl_lock);
 
 static DEFINE_SPINLOCK(ip6_sk_fl_lock);
 
+DEFINE_STATIC_KEY_FALSE(ipv6_flowlabel_exclusive);
+
 #define for_each_fl_rcu(hash, fl)				\
 	for (fl = rcu_dereference_bh(fl_ht[(hash)]);		\
 	     fl != NULL;					\
@@ -98,6 +100,8 @@ static void fl_free_rcu(struct rcu_head *head)
 {
 	struct ip6_flowlabel *fl = container_of(head, struct ip6_flowlabel, rcu);
 
+	if (fl->share != IPV6_FL_S_NONE && fl->share != IPV6_FL_S_ANY)
+		static_branch_dec(&ipv6_flowlabel_exclusive);
 	if (fl->share == IPV6_FL_S_PROCESS)
 		put_pid(fl->owner.pid);
 	kfree(fl->opt);
@@ -423,6 +427,8 @@ fl_create(struct net *net, struct sock *sk, struct in6_flowlabel_req *freq,
 	}
 	fl->dst = freq->flr_dst;
 	atomic_set(&fl->users, 1);
+	if (fl->share != IPV6_FL_S_ANY)
+		static_branch_inc(&ipv6_flowlabel_exclusive);
 	switch (fl->share) {
 	case IPV6_FL_S_EXCL:
 	case IPV6_FL_S_ANY:
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 07fa579dfb96c..859a1cbd54906 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1331,8 +1331,8 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		if (np->sndflow) {
 			fl6.flowlabel = sin6->sin6_flowinfo&IPV6_FLOWINFO_MASK;
 			if (fl6.flowlabel&IPV6_FLOWLABEL_MASK) {
-				flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
-				if (!flowlabel)
+				flowlabel = fl6_sock_verify(sk, fl6.flowlabel);
+				if (IS_ERR(flowlabel))
 					return -EINVAL;
 			}
 		}
@@ -1383,8 +1383,8 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			return err;
 		}
 		if ((fl6.flowlabel&IPV6_FLOWLABEL_MASK) && !flowlabel) {
-			flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
-			if (!flowlabel)
+			flowlabel = fl6_sock_verify(sk, fl6.flowlabel);
+			if (IS_ERR(flowlabel))
 				return -EINVAL;
 		}
 		if (!(opt->opt_nflen|opt->opt_flen))
-- 
2.21.0.1020.gf2820cf01a-goog
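
For readers who want to poke at the fast path outside the kernel, the
logic of fl6_sock_verify() can be approximated by a small userspace
model. This is only a sketch under stated assumptions: a plain integer
counter stands in for the static key, a single global table stands in
for the per-socket lease list, and every name below (share_mode,
lease_add, lease_del, lease_lookup, label_verify, slow_lookups) is
hypothetical and does not exist in the kernel.

```c
/* Userspace model of the fast path. NOT kernel code; all names are hypothetical. */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum share_mode { SHARE_ANY, SHARE_EXCL };	/* stand-ins for IPV6_FL_S_ANY / IPV6_FL_S_EXCL */

struct lease {
	uint32_t label;
	enum share_mode share;
	bool in_use;
};

#define MAX_LEASES 8
static struct lease leases[MAX_LEASES];
static int exclusive_count;	/* plays the role of the static key */
static int slow_lookups;	/* counts how often the slow path ran */

/* Register a lease; bump the counter only for exclusive ones. */
static bool lease_add(uint32_t label, enum share_mode share)
{
	for (size_t i = 0; i < MAX_LEASES; i++) {
		if (leases[i].in_use)
			continue;
		leases[i] = (struct lease){ label, share, true };
		if (share != SHARE_ANY)
			exclusive_count++;
		return true;
	}
	return false;	/* table full */
}

/* Drop a lease; undo the counter bump made in lease_add(). */
static void lease_del(uint32_t label)
{
	for (size_t i = 0; i < MAX_LEASES; i++) {
		if (!leases[i].in_use || leases[i].label != label)
			continue;
		if (leases[i].share != SHARE_ANY)
			exclusive_count--;
		leases[i].in_use = false;
		return;
	}
}

/* Slow path: scan the table for the label. */
static const struct lease *lease_lookup(uint32_t label)
{
	slow_lookups++;
	for (size_t i = 0; i < MAX_LEASES; i++)
		if (leases[i].in_use && leases[i].label == label)
			return &leases[i];
	return NULL;
}

/* Models fl6_sock_verify(): 0 if the label may be used, -1 otherwise. */
static int label_verify(uint32_t label)
{
	if (exclusive_count > 0)
		return lease_lookup(label) ? 0 : -1;
	return 0;	/* no exclusive lease anywhere: skip the lookup */
}
```

While no exclusive lease is registered, label_verify() accepts any
label without touching the table. Once one exists, every label goes
through lease_lookup() again, which is the fallback a robust
application has to be prepared for.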