From: Cong Wang <xiyou.wangcong@gmail.com>
To: Eric Dumazet <edumazet@google.com>
Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Saeed Mahameed <saeedm@mellanox.com>,
	Tariq Toukan <tariqt@mellanox.com>
Subject: Re: [Patch net v3] mlx5: force CHECKSUM_NONE for short ethernet frames
Date: Mon, 3 Dec 2018 22:48:23 -0800
Message-ID: <CAM_iQpUzVH9MkXTB5XNsQobodrN6ZKYj1RKNMcDNEMivZXfGfA@mail.gmail.com>
In-Reply-To: <CANn89iK7Z12KwFoH_R=Z5zkUUcG7en44pdtL4GxQ3YiFvDdzAw@mail.gmail.com>

On Mon, Dec 3, 2018 at 10:34 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Mon, Dec 3, 2018 at 10:14 PM Cong Wang <xiyou.wangcong@gmail.com> wrote:
> >
> > When an ethernet frame is padded to meet the minimum ethernet frame
> > size, the padding octets are not covered by the hardware checksum.
> > Fortunately the padding octets are usually zeros, which don't affect
> > the checksum. However, we have a switch which pads with non-zero
> > octets, and this repeatedly triggers kernel hardware checksum faults.
> >
> > Prior to commit 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends"),
> > skb checksum was forced to be CHECKSUM_NONE when padding is detected.
> > After it, we need to keep skb->csum updated, like what we do for RXFCS.
> > However, fixing up CHECKSUM_COMPLETE requires verifying and parsing IP
> > headers; it is not worth the effort, as the packets are so small that
> > CHECKSUM_COMPLETE can't save anything.
> >
> > I tested this patch with RXFCS on and off; it works fine without any
> > warning in both cases.
> >
> > Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
> > Cc: Saeed Mahameed <saeedm@mellanox.com>
> > Cc: Eric Dumazet <edumazet@google.com>
> > Cc: Tariq Toukan <tariqt@mellanox.com>
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> > ---
> >  .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 22 ++++++++++++++++++-
> >  1 file changed, 21 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> > index 624eed345b5d..1c153b8091da 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
> > @@ -732,6 +732,13 @@ static u8 get_ip_proto(struct sk_buff *skb, int network_depth, __be16 proto)
> >                                             ((struct ipv6hdr *)ip_p)->nexthdr;
> >  }
> >
> > +static bool is_short_frame(struct sk_buff *skb, bool has_fcs)
> > +{
> > +       u32 frame_len = has_fcs ? skb->len - ETH_FCS_LEN : skb->len;
> > +
> > +       return frame_len <= ETH_ZLEN;
> > +}
> > +
> >  static inline void mlx5e_handle_csum(struct net_device *netdev,
> >                                      struct mlx5_cqe64 *cqe,
> >                                      struct mlx5e_rq *rq,
> > @@ -755,9 +762,22 @@ static inline void mlx5e_handle_csum(struct net_device *netdev,
> >                 goto csum_unnecessary;
> >
> >         if (likely(is_last_ethertype_ip(skb, &network_depth, &proto))) {
> > +               bool has_fcs = !!(netdev->features & NETIF_F_RXFCS);
> > +
> >                 if (unlikely(get_ip_proto(skb, network_depth, proto) == IPPROTO_SCTP))
> >                         goto csum_unnecessary;
> >
> > +               /* CQE csum doesn't cover padding octets in short ethernet
> > +                * frames. And the pad field is appended prior to calculating
> > +                * and appending the FCS field.
> > +                *
> > +                * Detecting these padded frames requires to verify and parse
> > +                * IP headers, so we simply force all those small frames to be
> > +                * CHECKSUM_NONE even if they are not padded.
> > +                */
> > +               if (unlikely(is_short_frame(skb, has_fcs)))
> > +                       goto csum_none;
>
> Should not this go to csum_unnecessary instead ?

I don't see why we would skip validating even the protocol checksum
here.

Is there a reason you are suggesting that?
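
For reference, here is a userspace sketch of the short-frame test from
the patch above. ETH_ZLEN (60) and ETH_FCS_LEN (4) carry the values
from <linux/if_ether.h>; the helper name, the plain length argument in
place of an skb, and the assert() precondition are assumptions made
for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_ZLEN    60	/* minimum frame length without FCS */
#define ETH_FCS_LEN 4	/* length of the frame check sequence */

/* Hypothetical helper: same comparison as is_short_frame() in the
 * patch, but on a bare length so it can be built in userspace.
 */
static bool short_frame_len(uint32_t len, bool has_fcs)
{
	uint32_t frame_len;

	/* Assumption for this sketch: a frame that carries an FCS is
	 * at least ETH_FCS_LEN bytes long, so the subtraction below
	 * cannot wrap.
	 */
	assert(!has_fcs || len >= ETH_FCS_LEN);

	frame_len = has_fcs ? len - ETH_FCS_LEN : len;
	return frame_len <= ETH_ZLEN;
}
```

Because the comparison is "<=", a frame of exactly the minimum size
(60 bytes, or 64 with the FCS) is treated as short whether or not it
was actually padded.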


>
> Probably not a big deal, but small UDP frames might hit this code path,
> so ethtool -S would show a lot of csum_none which could confuse mlx5 owners.

Why is it confusing? We intentionally bypass the hardware checksum
and let the protocol layer validate it.
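
To make the padding problem concrete, below is a minimal userspace
sketch of a ones' complement sum in the style of RFC 1071. It is not
the kernel's csum_partial(); ones_sum() and csum_fold32() are
hypothetical names for illustration. It shows that zero padding leaves
the sum unchanged while non-zero padding changes it, so a sum that
does not cover the padding only matches the frame when the padding is
zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t csum_fold32(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones' complement sum over a byte buffer, big-endian 16-bit words,
 * not inverted. Intended for frame-sized buffers only, so the 32-bit
 * accumulator cannot overflow.
 */
static uint16_t ones_sum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(buf != NULL || len == 0);

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)
		sum += (uint32_t)buf[len - 1] << 8;	/* odd trailing byte */

	return csum_fold32(sum);
}
```

Summing a 46-byte payload alone and the same payload followed by 14
zero octets gives the same value; filling those 14 octets with 0xab
gives a different one.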


>
> BTW,
> It looks like mlx5 prefers delivering skbs with CHECKSUM_COMPLETE instead of
> CHECKSUM_UNNECESSARY but at least for ipv6 CHECKSUM_UNNECESSARY
> would be slightly faster, by avoiding various csum_partial() costs
> when headers are parsed.

Sure, it is certainly faster if you don't want to validate the L4
checksum. The only question is why we would validate neither the
hardware checksum nor the L4 checksum.
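
The distinction I am drawing between the three receive states can be
sketched as follows. This is a simplified userspace model, not kernel
code: the enum and both helpers are hypothetical names, and the model
only captures who ends up checking the L4 checksum in each state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the three RX checksum states discussed in
 * this thread.
 */
enum rx_csum_state {
	RX_CSUM_NONE,		/* device reported nothing usable */
	RX_CSUM_UNNECESSARY,	/* device says the checksum is good */
	RX_CSUM_COMPLETE,	/* device supplied a sum over the packet */
};

/* True when the stack itself still checks the L4 checksum, either by
 * summing the packet in software (NONE) or by checking against the
 * device-supplied sum (COMPLETE).
 */
static bool stack_validates_l4(enum rx_csum_state s)
{
	switch (s) {
	case RX_CSUM_NONE:
	case RX_CSUM_COMPLETE:
		return true;
	case RX_CSUM_UNNECESSARY:
		return false;
	}
	assert(!"unknown rx_csum_state");
	return false;
}

/* True when the check depends on a sum reported by the device, which
 * is the part that goes wrong when the padding is not covered.
 */
static bool uses_device_sum(enum rx_csum_state s)
{
	return s == RX_CSUM_COMPLETE;
}
```

In this model, RX_CSUM_NONE is the only state where the stack checks
the L4 checksum without depending on a sum from the device.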

Thanks.

Thread overview: 15+ messages
2018-12-04  6:14 [Patch net v3] mlx5: force CHECKSUM_NONE for short ethernet frames Cong Wang
2018-12-04  6:34 ` Eric Dumazet
2018-12-04  6:48   ` Cong Wang [this message]
     [not found]     ` <CANn89iK0j=2LYK=szVO+Fpg1-tX=wSz+ghZx8RnwZSEbxZjf5w@mail.gmail.com>
2018-12-04  7:09       ` Eric Dumazet
2018-12-04  7:29       ` Cong Wang
2018-12-04  7:51         ` Eric Dumazet
2018-12-04 19:17           ` Saeed Mahameed
2018-12-04 20:35             ` Cong Wang
2018-12-04 21:16               ` Eric Dumazet
2018-12-04 21:20                 ` Cong Wang
2018-12-05  0:59               ` Saeed Mahameed
2018-12-05  2:48                 ` Cong Wang
2018-12-04 20:31           ` Cong Wang
2018-12-04 19:02 ` Saeed Mahameed
2018-12-04 20:44 ` Cong Wang
