From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752523AbcAWI0m (ORCPT ); Sat, 23 Jan 2016 03:26:42 -0500
Received: from mail-wm0-f65.google.com ([74.125.82.65]:33789 "EHLO
	mail-wm0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752093AbcAWI0j (ORCPT );
	Sat, 23 Jan 2016 03:26:39 -0500
Date: Sat, 23 Jan 2016 09:26:36 +0100
From: Jiri Pirko
To: Jay Vosburgh
Cc: Jarod Wilson, linux-kernel@vger.kernel.org,
	"David S. Miller", Eric Dumazet, Jiri Pirko, Daniel Borkmann,
	Tom Herbert, Veaceslav Falico, Andy Gospodarek,
	netdev@vger.kernel.org
Subject: Re: [RFC PATCH net] net/core: don't increment rx_dropped on inactive slaves
Message-ID: <20160123082636.GC2193@nanopsycho.orion>
References: <1453489882-57948-1-git-send-email-jarod@redhat.com>
	<14563.1453496352@famine>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <14563.1453496352@famine>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Fri, Jan 22, 2016 at 09:59:12PM CET, jay.vosburgh@canonical.com wrote:
>Jarod Wilson wrote:
>
>>The network core tries to keep track of dropped packets, but some packets
>>you wouldn't really call dropped, so much as intentionally ignored, under
>>certain circumstances. One such case is that of bonding and team device
>>slaves that are currently inactive. Their respective rx_handler functions
>>return RX_HANDLER_EXACT (the only places in the kernel that return that),
>>which ends up tracking into the network core's __netif_receive_skb_core()
>>function's drop path, with no pt_prev set. On a noisy network, this can
>>result in a very rapidly incrementing rx_dropped counter, not only on the
>>inactive slave(s), but also on the master device, such as the following:
>[...]
>>In this scenario, p5p1, p5p2 and p7p1 are all inactive slaves in an
>>active-backup bond0, and you can see that all three have high drop counts,
>>with the master bond0 showing a tally of all three.
>>
>>I know that this was previously discussed some here:
>>
>> http://www.spinics.net/lists/netdev/msg226341.html
>>
>>It seems additional counters never came to fruition, but honestly, for
>>this particular case, I'm not even sure they're warranted, I'd be inclined
>>to say just silently drop these packets without incrementing a counter. At
>>least, that's probably what would make someone who has complained loudly
>>about this issue happy, as they have monitoring tools that are squawking
>>loudly at any increments to rx_dropped.

In this case, the skb is delivered with exact delivery according to the
per-device registered callback. We just have to prevent it from reaching
the bond. So this case is not "to drop", but rather "to block the skb
from getting where it does not belong".

>
>	I don't think the kernel should silently drop packets; there
>should be a counter somewhere. If a packet is being thrown away
>deliberately, it should not just vanish into the screaming void of
>space. Someday someone will try and track down where that packet is
>being dropped.
>
>	I've had that same conversation with customers who insist on
>accounting for every packet drop (from the "any drop is an error"
>mindset), so I understand the issue.
>
>	Thinking about the prior discussion, the rx_drop_inactive is
>still a good idea, but I'd actually today get good use from a
>"rx_drop_unforwardable" (or an equivalent but shorter name) counter that
>counts every time a packet is dropped due to is_skb_forwardable()
>returning false. __dev_forward_skb does this (and hits rx_dropped), as
>does the bridge (and does not count it).
>
>	-J
>
>>CC: "David S. Miller"
>>CC: Eric Dumazet
>>CC: Jiri Pirko
>>CC: Daniel Borkmann
>>CC: Tom Herbert
>>CC: Jay Vosburgh
>>CC: Veaceslav Falico
>>CC: Andy Gospodarek
>>CC: netdev@vger.kernel.org
>>Signed-off-by: Jarod Wilson
>>---
>> net/core/dev.c | 3 +++
>> 1 file changed, 3 insertions(+)
>>
>>diff --git a/net/core/dev.c b/net/core/dev.c
>>index 8cba3d8..1354c7b 100644
>>--- a/net/core/dev.c
>>+++ b/net/core/dev.c
>>@@ -4153,8 +4153,11 @@ ncls:
>> 		else
>> 			ret = pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
>> 	} else {
>>+		if (deliver_exact)
>>+			goto inactive; /* bond or team inactive slave */
>> drop:
>> 		atomic_long_inc(&skb->dev->rx_dropped);
>>+inactive:
>> 		kfree_skb(skb);
>> 		/* Jamal, now you will not able to escape explaining
>> 		 * me how you were going to use this. :-)
>>--
>>1.8.3.1
>>
>
>---
>	-Jay Vosburgh, jay.vosburgh@canonical.com
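
For anyone comparing the options raised in this thread, here is a rough
user-space sketch of the three accounting behaviours for the tail of
__netif_receive_skb_core(). It is not kernel code. struct fake_dev,
enum drop_policy, receive_tail() and simulate_inactive_slave() are
hypothetical names made up for illustration, and rx_drop_inactive is only
the counter proposed above, modelled here as a plain field. have_handler
stands in for "pt_prev is set", and deliver_exact for the flag set after
an rx_handler returns RX_HANDLER_EXACT.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-device counters; not struct net_device. */
struct fake_dev {
	long rx_dropped;       /* existing counter discussed in the thread */
	long rx_drop_inactive; /* proposed in the thread, modelled here only */
};

/* The three accounting policies raised in the discussion. */
enum drop_policy {
	POLICY_CURRENT,  /* unpatched: every unclaimed skb bumps rx_dropped */
	POLICY_SILENT,   /* RFC patch: skip the counter when deliver_exact */
	POLICY_SEPARATE  /* alternative: count exact-delivery discards apart */
};

/* Simplified model of what happens once packet-type delivery is decided. */
void receive_tail(struct fake_dev *dev, bool have_handler,
		  bool deliver_exact, enum drop_policy policy)
{
	if (have_handler)
		return; /* a protocol handler would take the skb here */

	if (deliver_exact && policy == POLICY_SILENT)
		return; /* skb freed without touching any counter */

	if (deliver_exact && policy == POLICY_SEPARATE) {
		dev->rx_drop_inactive++;
		return; /* skb freed, accounted in its own counter */
	}

	dev->rx_dropped++; /* ordinary drop path */
}

/* Feed n unclaimed packets to one inactive slave and return its counters. */
struct fake_dev simulate_inactive_slave(long n, enum drop_policy policy)
{
	struct fake_dev dev = { 0, 0 };
	long i;

	for (i = 0; i < n; i++)
		receive_tail(&dev, false, true, policy);
	return dev;
}
```

Under these assumptions, simulate_inactive_slave(1000, POLICY_CURRENT)
leaves rx_dropped at 1000, which is the behaviour the monitoring tools
complain about. POLICY_SILENT leaves both counters at zero, and
POLICY_SEPARATE moves the same 1000 packets into rx_drop_inactive so that
they stay traceable without inflating rx_dropped.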