All of lore.kernel.org
 help / color / mirror / Atom feed
From: Saeed Mahameed <saeedm@mellanox.com>
To: "saeedm@dev.mellanox.co.il" <saeedm@dev.mellanox.co.il>,
	"xiyou.wangcong@gmail.com" <xiyou.wangcong@gmail.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Tariq Toukan <tariqt@mellanox.com>,
	"edumazet@google.com" <edumazet@google.com>
Subject: Re: [Patch net v3] mlx5: force CHECKSUM_NONE for short ethernet frames
Date: Wed, 5 Dec 2018 00:59:18 +0000	[thread overview]
Message-ID: <79d2b44fac14006c0f1dd819179f23a87abe9186.camel@mellanox.com> (raw)
In-Reply-To: <CAM_iQpUqhaJ0_hJZfZoJtu50K5bae0oR5f1jAUVR1F7id6YLzQ@mail.gmail.com>

On Tue, 2018-12-04 at 12:35 -0800, Cong Wang wrote:
> On Tue, Dec 4, 2018 at 11:17 AM Saeed Mahameed
> <saeedm@dev.mellanox.co.il> wrote:
> > On Mon, Dec 3, 2018 at 11:52 PM Eric Dumazet <edumazet@google.com>
> > wrote:
> > > On Mon, Dec 3, 2018 at 11:30 PM Cong Wang <
> > > xiyou.wangcong@gmail.com> wrote:
> > > > On Mon, Dec 3, 2018 at 11:08 PM Eric Dumazet <
> > > > edumazet@google.com> wrote:
> > > > > The hardware has probably validated the L3 & L4 checksum just
> > > > > fine.
> > > > > 
> > > > > Note that if ip_summed is CHECKSUM_UNNECESSARY, the padding
> > > > > bytes (if any)
> > > > > have no impact on the csum that has been verified by the NIC.
> > > > 
> > > > Why? Why does the hardware validates L3/L4 checksum when it
> > > > supplies a full-packet checksum? What's its point here?
> > > 
> > > The point is that the driver author can decide what is best.
> > > 
> > > For native IP+TCP or IP+UDP, the NIC has the ability to fully
> > > understand the packet and fully validate the checksum.
> > 
> > Also for Native IP4 and IP6 plain L3 packets.
> > The Hardware validates the csum when it can, and always provides
> > checksum complete for all packets.
> > One of the reason to validate is that sometimes we want to skip
> > checksum complete, but still leverage the hw validation,
> > like in your patch :), or LRO case, or many other cases in other
> > kernels/OSes/drivers.
> 
> This sounds wrong to me too.
> 
> If the HW already validates it, the software doesn't need to do it,
> therefore must skip hw csum for performance gain.,
> 

Cong, you are missing some important details, hardware can only parse a
handful of protocols IP/TCP/UDP some tunneling like vxlan,GRE, etc.. 
but for complicated protocols and new generic tunneling protocols,
which the hardware wasn't built to understand, the checksum complete
comes to the rescue.

not only that, imagine you are doing some proprietary tunneling and you
want to encapsulate the rx traffic and send it back to wire, how would
you want to recalculate the csum before txing ? manually on the whole
new packet or use the csum complete and just figure out the differences
? i am sure you gonna want to use the checksum complete of the original
packet.

So again csum complete is not only for validation, it is more powerful
than that.

> 
> > So i agree with Eric, let's jump to checksum_unnecessary for short
> > packets.
> 
> This is odd, if Eric is right, then we should completely get rid of
> CHECKSUM_COMPLETE. Short packets are not exceptions.
> 

short packets are exception, 
1. because of the small packet padding issue
2. because they are short enough not to feel the performance hit of
recalculating their whole csum (if necessary) .

Again still it is nice to report csum unnecessary for those packets, if
possible.

for large packets csum complete is favorable because of the above
reasons.


if you don't like checksum complete you have a way to totally disable
it in mlx5 via ethtool private flags.

I hope this helps.


> I still don't understand why people including Eric kept fixing this
> thing which could be just removed from the very beginning.
> Sounds like nobody even looked into it until my patch.
> 
> Thanks.

  parent reply	other threads:[~2018-12-05  0:59 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-04  6:14 [Patch net v3] mlx5: force CHECKSUM_NONE for short ethernet frames Cong Wang
2018-12-04  6:34 ` Eric Dumazet
2018-12-04  6:48   ` Cong Wang
     [not found]     ` <CANn89iK0j=2LYK=szVO+Fpg1-tX=wSz+ghZx8RnwZSEbxZjf5w@mail.gmail.com>
2018-12-04  7:09       ` Eric Dumazet
2018-12-04  7:29       ` Cong Wang
2018-12-04  7:51         ` Eric Dumazet
2018-12-04 19:17           ` Saeed Mahameed
2018-12-04 20:35             ` Cong Wang
2018-12-04 21:16               ` Eric Dumazet
2018-12-04 21:20                 ` Cong Wang
2018-12-05  0:59               ` Saeed Mahameed [this message]
2018-12-05  2:48                 ` Cong Wang
2018-12-04 20:31           ` Cong Wang
2018-12-04 19:02 ` Saeed Mahameed
2018-12-04 20:44 ` Cong Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=79d2b44fac14006c0f1dd819179f23a87abe9186.camel@mellanox.com \
    --to=saeedm@mellanox.com \
    --cc=edumazet@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@dev.mellanox.co.il \
    --cc=tariqt@mellanox.com \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.