All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net V3] net: reevalulate autoflowlabel setting after sysctl setting
@ 2017-12-20 20:10 Shaohua Li
  2017-12-21 18:07 ` David Miller
  0 siblings, 1 reply; 2+ messages in thread
From: Shaohua Li @ 2017-12-20 20:10 UTC (permalink / raw)
  To: netdev, davem
  Cc: Kernel Team, Shaohua Li, Martin KaFai Lau, Eric Dumazet, Tom Herbert

From: Shaohua Li <shli@fb.com>

sysctl.ip6.auto_flowlabels is default 1. In our hosts, we set it to 2.
If sockopt doesn't set autoflowlabel, outcome packets from the hosts are
supposed to not include flowlabel. This is true for normal packet, but
not for reset packet.

The reason is ipv6_pinfo.autoflowlabel is set in sock creation. Later if
we change sysctl.ip6.auto_flowlabels, the ipv6_pinfo.autoflowlabel isn't
changed, so the sock will keep the old behavior in terms of auto
flowlabel. Reset packet is suffering from this problem, because reset
packet is sent from a special control socket, which is created at boot
time. Since sysctl.ipv6.auto_flowlabels is 1 by default, the control
socket will always have its ipv6_pinfo.autoflowlabel set, even after
user set sysctl.ipv6.auto_flowlabels to 1, so reset packset will always
have flowlabel. Normal sock created before sysctl setting suffers from
the same issue. We can't even turn off autoflowlabel unless we kill all
socks in the hosts.

To fix this, if IPV6_AUTOFLOWLABEL sockopt is used, we use the
autoflowlabel setting from user, otherwise we always call
ip6_default_np_autolabel() which has the new settings of sysctl.

Note, this changes behavior a little bit. Before commit 42240901f7c4
(ipv6: Implement different admin modes for automatic flow labels), the
autoflowlabel behavior of a sock isn't sticky, eg, if sysctl changes,
existing connection will change autoflowlabel behavior. After that
commit, autoflowlabel behavior is sticky in the whole life of the sock.
With this patch, the behavior isn't sticky again.

Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Tom Herbert <tom@quantonium.net>
Signed-off-by: Shaohua Li <shli@fb.com>
---
 include/linux/ipv6.h     |  3 ++-
 net/ipv6/af_inet6.c      |  1 -
 net/ipv6/ip6_output.c    | 12 ++++++++++--
 net/ipv6/ipv6_sockglue.c |  1 +
 4 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index cb18c62..8415bf1 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -273,7 +273,8 @@ struct ipv6_pinfo {
 						 * 100: prefer care-of address
 						 */
 				dontfrag:1,
-				autoflowlabel:1;
+				autoflowlabel:1,
+				autoflowlabel_set:1;
 	__u8			min_hopcount;
 	__u8			tclass;
 	__be32			rcv_flowinfo;
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index c26f712..c9441ca 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -210,7 +210,6 @@ static int inet6_create(struct net *net, struct socket *sock, int protocol,
 	np->mcast_hops	= IPV6_DEFAULT_MCASTHOPS;
 	np->mc_loop	= 1;
 	np->pmtudisc	= IPV6_PMTUDISC_WANT;
-	np->autoflowlabel = ip6_default_np_autolabel(net);
 	np->repflow	= net->ipv6.sysctl.flowlabel_reflect;
 	sk->sk_ipv6only	= net->ipv6.sysctl.bindv6only;
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 5110a41..f7dd51c 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -166,6 +166,14 @@ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
 			    !(IP6CB(skb)->flags & IP6SKB_REROUTED));
 }
 
+static bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np)
+{
+	if (!np->autoflowlabel_set)
+		return ip6_default_np_autolabel(net);
+	else
+		return np->autoflowlabel;
+}
+
 /*
  * xmit an sk_buff (used by TCP, SCTP and DCCP)
  * Note : socket lock is not held for SYNACK packets, but might be modified
@@ -230,7 +238,7 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
 		hlimit = ip6_dst_hoplimit(dst);
 
 	ip6_flow_hdr(hdr, tclass, ip6_make_flowlabel(net, skb, fl6->flowlabel,
-						     np->autoflowlabel, fl6));
+				ip6_autoflowlabel(net, np), fl6));
 
 	hdr->payload_len = htons(seg_len);
 	hdr->nexthdr = proto;
@@ -1626,7 +1634,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
 
 	ip6_flow_hdr(hdr, v6_cork->tclass,
 		     ip6_make_flowlabel(net, skb, fl6->flowlabel,
-					np->autoflowlabel, fl6));
+					ip6_autoflowlabel(net, np), fl6));
 	hdr->hop_limit = v6_cork->hop_limit;
 	hdr->nexthdr = proto;
 	hdr->saddr = fl6->saddr;
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index b9404fe..2d4680e 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -886,6 +886,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		break;
 	case IPV6_AUTOFLOWLABEL:
 		np->autoflowlabel = valbool;
+		np->autoflowlabel_set = 1;
 		retv = 0;
 		break;
 	case IPV6_RECVFRAGSIZE:
-- 
2.9.5

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: [PATCH net V3] net: reevalulate autoflowlabel setting after sysctl setting
  2017-12-20 20:10 [PATCH net V3] net: reevalulate autoflowlabel setting after sysctl setting Shaohua Li
@ 2017-12-21 18:07 ` David Miller
  0 siblings, 0 replies; 2+ messages in thread
From: David Miller @ 2017-12-21 18:07 UTC (permalink / raw)
  To: shli; +Cc: netdev, Kernel-team, shli, kafai, eric.dumazet, tom

From: Shaohua Li <shli@kernel.org>
Date: Wed, 20 Dec 2017 12:10:21 -0800

> From: Shaohua Li <shli@fb.com>
> 
> sysctl.ip6.auto_flowlabels is default 1. In our hosts, we set it to 2.
> If sockopt doesn't set autoflowlabel, outcome packets from the hosts are
> supposed to not include flowlabel. This is true for normal packet, but
> not for reset packet.
> 
> The reason is ipv6_pinfo.autoflowlabel is set in sock creation. Later if
> we change sysctl.ip6.auto_flowlabels, the ipv6_pinfo.autoflowlabel isn't
> changed, so the sock will keep the old behavior in terms of auto
> flowlabel. Reset packet is suffering from this problem, because reset
> packet is sent from a special control socket, which is created at boot
> time. Since sysctl.ipv6.auto_flowlabels is 1 by default, the control
> socket will always have its ipv6_pinfo.autoflowlabel set, even after
> user set sysctl.ipv6.auto_flowlabels to 1, so reset packset will always
> have flowlabel. Normal sock created before sysctl setting suffers from
> the same issue. We can't even turn off autoflowlabel unless we kill all
> socks in the hosts.
> 
> To fix this, if IPV6_AUTOFLOWLABEL sockopt is used, we use the
> autoflowlabel setting from user, otherwise we always call
> ip6_default_np_autolabel() which has the new settings of sysctl.
> 
> Note, this changes behavior a little bit. Before commit 42240901f7c4
> (ipv6: Implement different admin modes for automatic flow labels), the
> autoflowlabel behavior of a sock isn't sticky, eg, if sysctl changes,
> existing connection will change autoflowlabel behavior. After that
> commit, autoflowlabel behavior is sticky in the whole life of the sock.
> With this patch, the behavior isn't sticky again.
> 
> Cc: Martin KaFai Lau <kafai@fb.com>
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Tom Herbert <tom@quantonium.net>
> Signed-off-by: Shaohua Li <shli@fb.com>

This looks a lot better, applied, thanks.

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2017-12-21 18:07 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-20 20:10 [PATCH net V3] net: reevalulate autoflowlabel setting after sysctl setting Shaohua Li
2017-12-21 18:07 ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.