From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sabrina Dubroca
Subject: Re: [PATCH] net: take care of bonding in build_skb_flow_key (v3)
Date: Wed, 20 Jan 2016 16:18:20 +0100
Message-ID: <20160120151820.GA1765@bistromath.redhat.com>
References: <1453267933-25381-1-git-send-email-wen.gang.wang@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Cc: netdev@vger.kernel.org, jay.vosburgh@canonical.com
To: Wengang Wang
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:57969 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933132AbcATPSX (ORCPT ); Wed, 20 Jan 2016 10:18:23 -0500
Content-Disposition: inline
In-Reply-To: <1453267933-25381-1-git-send-email-wen.gang.wang@oracle.com>
Sender: netdev-owner@vger.kernel.org
List-ID:

2016-01-20, 13:32:13 +0800, Wengang Wang wrote:
> In a bonding setting, we determines fragment size according to MTU and
> PMTU associated to the bonding master. If the slave finds the fragment
> size is too big, it drops the fragment and calls ip_rt_update_pmtu(),
> passing _skb_ and _pmtu_, trying to update the path MTU.
> Problem is that the target device that function ip_rt_update_pmtu actually
> tries to update is the slave (skb->dev), not the master. Thus since no
> PMTU change happens on master, the fragment size for later packets doesn't
> change so all later fragments/packets are dropped too.
>
> The fix is letting build_skb_flow_key() take care of the transition of
> device index from bonding slave to the master. That makes the master become
> the target device that ip_rt_update_pmtu tries to update PMTU to.
>
> Signed-off-by: Wengang Wang
> ---
>  net/ipv4/route.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 85f184e..c59fb0d 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -523,10 +523,21 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
>  			       const struct sock *sk)
>  {
>  	const struct iphdr *iph = ip_hdr(skb);
> -	int oif = skb->dev->ifindex;
> +	struct net_device *master = NULL;
>  	u8 tos = RT_TOS(iph->tos);
>  	u8 prot = iph->protocol;
>  	u32 mark = skb->mark;
> +	int oif;
> +
> +	if (skb->dev->flags & IFF_SLAVE) {

Maybe use netif_is_bond_slave here instead, since you have this
problem with bonding slaves?

> +		rtnl_lock();
> +		master = netdev_master_upper_dev_get(skb->dev);
> +		rtnl_unlock();
> +	}

As zhuyj said, this is called from dev_queue_xmit, so you cannot take
rtnl_lock here.

> +	if (master)
> +		oif = master->ifindex;

You cannot dereference master after you release the rtnl lock.

So it would probably be best to use netdev_master_upper_dev_get_rcu,
as zhuyj suggested earlier, and make sure that you only use the result
between rcu_read_lock()/rcu_read_unlock():

	rcu_read_lock();
	master = netdev_master_upper_dev_get_rcu(skb->dev);
	if (master)
		oif = master->ifindex;
	rcu_read_unlock();


Thanks,

-- 
Sabrina
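
For illustration only, here is a userspace sketch of the pattern
suggested above: look up the master inside the read-side section and
copy ifindex out before leaving it. Everything in it is a stand-in and
an assumption, not kernel code: the struct net_device here has only two
fields, rcu_read_lock()/rcu_read_unlock() are counters rather than the
real primitives, and mock_master_upper_dev_get_rcu() and flow_oif() are
hypothetical names. The fallback to the device's own ifindex when there
is no master is also an assumption, since that part of the patch is not
quoted in this mail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in, NOT the kernel's struct net_device. */
struct net_device {
	int ifindex;
	struct net_device *master;	/* stand-in for the upper-device link */
};

/* Counter standing in for the RCU read-side critical section. */
static int rcu_depth;

static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/*
 * Mock of netdev_master_upper_dev_get_rcu(). The assert is part of the
 * mock only; it makes a call outside the read-side section fail loudly.
 */
static struct net_device *
mock_master_upper_dev_get_rcu(const struct net_device *dev)
{
	assert(rcu_depth > 0);
	return dev->master;
}

/* Hypothetical helper: pick the ifindex to use for the flow key. */
static int flow_oif(const struct net_device *dev)
{
	const struct net_device *master;
	int oif = dev->ifindex;	/* assumed fallback when there is no master */

	rcu_read_lock();
	master = mock_master_upper_dev_get_rcu(dev);
	if (master)
		oif = master->ifindex;	/* copy the value while still protected */
	rcu_read_unlock();

	/* master must not be dereferenced past this point; oif is a plain int */
	return oif;
}
```

The point of the sketch is that only the integer leaves the critical
section, so nothing depends on the master pointer staying valid after
rcu_read_unlock().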