From mboxrd@z Thu Jan 1 00:00:00 1970 From: Cong Wang Subject: Re: [Patch net v3] mlx5: force CHECKSUM_NONE for short ethernet frames Date: Mon, 3 Dec 2018 22:48:23 -0800 Message-ID: References: <20181204061405.16539-1-xiyou.wangcong@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Cc: Linux Kernel Network Developers , Saeed Mahameed , Tariq Toukan To: Eric Dumazet Return-path: Received: from mail-pl1-f194.google.com ([209.85.214.194]:46060 "EHLO mail-pl1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725994AbeLDGsg (ORCPT ); Tue, 4 Dec 2018 01:48:36 -0500 Received: by mail-pl1-f194.google.com with SMTP id a14so7782896plm.12 for ; Mon, 03 Dec 2018 22:48:35 -0800 (PST) In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: On Mon, Dec 3, 2018 at 10:34 PM Eric Dumazet wrote: > > On Mon, Dec 3, 2018 at 10:14 PM Cong Wang wrote: > > > > When an ethernet frame is padded to meet the minimum ethernet frame > > size, the padding octets are not covered by the hardware checksum. > > Fortunately the padding octets are ususally zero's, which don't affect > > checksum. However, we have a switch which pads non-zero octets, this > > causes kernel hardware checksum fault repeatedly. > > > > Prior to commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"), > > skb checksum was forced to be CHECKSUM_NONE when padding is detected. > > After it, we need to keep skb->csum updated, like what we do for RXFCS. > > However, fixing up CHECKSUM_COMPLETE requires to verify and parse IP > > headers, it is not worthy the effort as the packets are so small that > > CHECKSUM_COMPLETE can't save anything. > > > > I tested this patch with RXFCS on and off, it works fine without any > > warning in both cases. > > > > Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"), > > Cc: Saeed Mahameed > > Cc: Eric Dumazet > > Cc: Tariq Toukan > > Signed-off-by: Cong Wang > > --- > > .../net/ethernet/mellanox/mlx5/core/en_rx.c | 22 ++++++++++++++++++- > > 1 file changed, 21 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c > > index 624eed345b5d..1c153b8091da 100644 > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c > > @@ -732,6 +732,13 @@ static u8 get_ip_proto(struct sk_buff *skb, int network_depth, __be16 proto) > > ((struct ipv6hdr *)ip_p)->nexthdr; > > } > > > > +static bool is_short_frame(struct sk_buff *skb, bool has_fcs) > > +{ > > + u32 frame_len = has_fcs ? skb->len - ETH_FCS_LEN : skb->len; > > + > > + return frame_len <= ETH_ZLEN; > > +} > > + > > static inline void mlx5e_handle_csum(struct net_device *netdev, > > struct mlx5_cqe64 *cqe, > > struct mlx5e_rq *rq, > > @@ -755,9 +762,22 @@ static inline void mlx5e_handle_csum(struct net_device *netdev, > > goto csum_unnecessary; > > > > if (likely(is_last_ethertype_ip(skb, &network_depth, &proto))) { > > + bool has_fcs = !!(netdev->features & NETIF_F_RXFCS); > > + > > if (unlikely(get_ip_proto(skb, network_depth, proto) == IPPROTO_SCTP)) > > goto csum_unnecessary; > > > > + /* CQE csum doesn't cover padding octets in short ethernet > > + * frames. And the pad field is appended prior to calculating > > + * and appending the FCS field. > > + * > > + * Detecting these padded frames requires to verify and parse > > + * IP headers, so we simply force all those small frames to be > > + * CHECKSUM_NONE even if they are not padded. > > + */ > > + if (unlikely(is_short_frame(skb, has_fcs))) > > + goto csum_none; > > Should not this go to csum_unnecessary instead ? I don't see why we don't even want to validate the protocol checksum here. Any reason you are suggesting so? > > Probably not a big deal, but small UDP frames might hit this code path, > so ethtool -S would show a lot of csum_none which could confuse mlx5 owners. Why it is confusing? We intentionally bypass hardware checksum and let protocol layer validate it. > > BTW, > It looks like mlx5 prefers delivering skbs with CHECKSUM_COMPLETE instead of > CHECKSUM_UNNECESSARY but at least for ipv6 CHECKSUM_UNNECESSARY > would be slightly faster, by avoiding various csum_partial() costs > when headers are parsed. Sure, it is certainly faster if you don't want to validate L4 checksum. The only question is why we don't either validate hardware checksum or L4 checksum? Thanks.