From: Wengang Wang
Subject: Re: [PATCH] net: take care of bonding in build_skb_flow_key (v3)
Date: Thu, 21 Jan 2016 13:15:43 +0800
Message-ID: <56A0697F.9030703@oracle.com>
References: <1453267933-25381-1-git-send-email-wen.gang.wang@oracle.com> <20160120151820.GA1765@bistromath.redhat.com>
In-Reply-To: <20160120151820.GA1765@bistromath.redhat.com>
To: Sabrina Dubroca
Cc: netdev@vger.kernel.org, jay.vosburgh@canonical.com
Sender: netdev-owner@vger.kernel.org

On 2016-01-20 23:18, Sabrina Dubroca wrote:
> 2016-01-20, 13:32:13 +0800, Wengang Wang wrote:
>> In a bonding setting, we determine the fragment size according to the MTU and
>> PMTU associated with the bonding master. If the slave finds the fragment
>> size is too big, it drops the fragment and calls ip_rt_update_pmtu(),
>> passing _skb_ and _pmtu_, trying to update the path MTU.
>> The problem is that the target device that function ip_rt_update_pmtu actually
>> tries to update is the slave (skb->dev), not the master. Thus, since no
>> PMTU change happens on the master, the fragment size for later packets doesn't
>> change, so all later fragments/packets are dropped too.
>>
>> The fix is letting build_skb_flow_key() take care of the transition of the
>> device index from the bonding slave to the master. That makes the master become
>> the target device whose PMTU ip_rt_update_pmtu tries to update.
>>
>> Signed-off-by: Wengang Wang
>> ---
>>  net/ipv4/route.c | 13 ++++++++++++-
>>  1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
>> index 85f184e..c59fb0d 100644
>> --- a/net/ipv4/route.c
>> +++ b/net/ipv4/route.c
>> @@ -523,10 +523,21 @@ static void build_skb_flow_key(struct flowi4 *fl4, const struct sk_buff *skb,
>>                                 const struct sock *sk)
>>  {
>>      const struct iphdr *iph = ip_hdr(skb);
>> -    int oif = skb->dev->ifindex;
>> +    struct net_device *master = NULL;
>>      u8 tos = RT_TOS(iph->tos);
>>      u8 prot = iph->protocol;
>>      u32 mark = skb->mark;
>> +    int oif;
>> +
>> +    if (skb->dev->flags & IFF_SLAVE) {
> Maybe use netif_is_bond_slave here instead, since you have this
> problem with bonding slaves?
>
>
>> +        rtnl_lock();
>> +        master = netdev_master_upper_dev_get(skb->dev);
>> +        rtnl_unlock();
>> +    }
> As zhuyj said, this is called from dev_queue_xmit, so you cannot take
> rtnl_lock here.
>
>> +    if (master)
>> +        oif = master->ifindex;
> You cannot dereference master after you release the rtnl lock.
>
> So it would probably be best to use netdev_master_upper_dev_get_rcu,
> as zhuyj suggested earlier, and make sure that you only use the result
> between rcu_read_lock()/rcu_read_unlock():
>
>     rcu_read_lock();
>     master = netdev_master_upper_dev_get_rcu(skb->dev);
>     if (master)
>         oif = master->ifindex;
>     rcu_read_unlock();
>
OK, thanks for advising.

thanks,
wengang
> Thanks,
>
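
For reference, here is a rough user-space sketch of the RCU-based lookup suggested in the review. It is not the actual patch and not kernel code. The struct definitions, the `master` field, the no-op `rcu_read_lock()`/`rcu_read_unlock()` and the stub `netdev_master_upper_dev_get_rcu()` are simplified stand-ins so the logic compiles outside the kernel, and `skb_flow_oif()` is a hypothetical helper name. Defaulting `oif` to the device's own ifindex (the pre-patch behaviour) is also an assumption of this sketch.

```c
/* User-space sketch only: simplified stand-ins, not the kernel definitions. */
#include <assert.h>
#include <stddef.h>

#define IFF_SLAVE 0x800 /* same value as the kernel's IFF_SLAVE flag */

struct net_device {
    int ifindex;
    unsigned int flags;
    struct net_device *master; /* hypothetical field standing in for the upper-device link */
};

struct sk_buff {
    struct net_device *dev;
};

/* No-op stand-ins for the kernel's RCU read-side primitives. */
static void rcu_read_lock(void) {}
static void rcu_read_unlock(void) {}

/* Stub with the same name as the kernel helper; here it just reads the field. */
static struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev)
{
    return dev->master;
}

/* Hypothetical helper: choose the interface index to put in the flow key. */
static int skb_flow_oif(const struct sk_buff *skb)
{
    int oif;

    assert(skb != NULL && skb->dev != NULL);
    oif = skb->dev->ifindex; /* default: the device itself */

    if (skb->dev->flags & IFF_SLAVE) {
        struct net_device *master;

        rcu_read_lock();
        master = netdev_master_upper_dev_get_rcu(skb->dev);
        if (master)
            oif = master->ifindex; /* copy the value while still inside the read section */
        rcu_read_unlock();
        /* master must not be dereferenced past this point */
    }
    return oif;
}
```

The point of the pattern is that only the integer `oif` leaves the read-side critical section, and the `master` pointer is never used after `rcu_read_unlock()`.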