* CHECKSUM_COMPLETE question
@ 2020-02-12  9:52 Grygorii Strashko
  2020-02-19 17:03 ` Grygorii Strashko
  0 siblings, 1 reply; 5+ messages in thread
From: Grygorii Strashko @ 2020-02-12  9:52 UTC
  To: netdev, David S . Miller; +Cc: Hideaki YOSHIFUJI

Hi All,

I'd like to ask for expert opinion and clarify a few points about HW RX checksum offload.

1) CHECKSUM_COMPLETE - from description in <linux/skbuff.h>
  * CHECKSUM_COMPLETE:
  *
  *   This is the most generic way. The device supplied checksum of the _whole_
  *   packet as seen by netif_rx() and fills out in skb->csum. Meaning, the
  *   hardware doesn't need to parse L3/L4 headers to implement this.

My understanding from the above is that the HW, to be fully compatible with Linux, should produce a csum
starting from the first byte after the EtherType field:
  (6 DST_MAC) (6 SRC_MAC) (2 EtherType) (...                   ...)
                                         ^                       ^
                                         | start csum            | end csum
and ending at the last byte of the Ethernet frame data.
- if the packet is VLAN tagged, then the VLAN TCI and the real EtherType are included in the csum,
   but the first VLAN TPID is not
- pad bytes may or may not be included in the csum
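
The coverage described above can be illustrated with a minimal userspace sketch. It assumes an untagged frame with a 14-byte Ethernet header; csum16() and rx_complete_csum() are hypothetical helper names for illustration, not kernel or driver functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN 14 /* 6 dst MAC + 6 src MAC + 2 EtherType */

/* Ones'-complement sum of len bytes, read as big-endian 16-bit words and
 * folded to 16 bits (not inverted). An odd trailing byte is zero-padded.
 * The 32-bit accumulator is sufficient for frame-sized buffers. */
static uint16_t csum16(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (i < len)
		sum += (uint32_t)p[i] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Hypothetical: the value a device would report under the reading above,
 * covering everything after the EtherType up to the end of the frame. */
static uint16_t rx_complete_csum(const uint8_t *frame, size_t frame_len)
{
	assert(frame_len >= ETH_HDR_LEN);
	return csum16(frame + ETH_HDR_LEN, frame_len - ETH_HDR_LEN);
}
```

In this model the MAC addresses and the EtherType do not influence the result; only the bytes that follow them do.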


2) I've found a difference between IPv4 and IPv6 csum processing of fragmented packets

Fragmented IPv4 UDP packet:
  - driver fills skb->csum and sets skb->ip_summed = CHECKSUM_COMPLETE for every fragment
  - in ip_frag_queue() the ip_hdr is parsed and pulled, and the packet is queued, but
    there is *no* csum correction in this function for the pulled ip_hdr (no csum_sub() or similar calls)
  - as a result, in inet_frag_reasm_finish() the skb->csum field is seen unmodified
  - if the same packet is sent over VLAN, the skb->csum in inet_frag_reasm_finish() is seen as modified due to
    skb_vlan_untag()->skb_pull_rcsum()

Fragmented IPv6 UDP packet:
  - driver fills skb->csum and sets skb->ip_summed = CHECKSUM_COMPLETE for every fragment
  - in ip6_frag_queue() the ipv6_hdr is parsed and pulled, and the packet is queued,
    *and the csum is corrected*
  
  ip6_frag_queue()
  ...
	if (skb->ip_summed == CHECKSUM_COMPLETE) {
		const unsigned char *nh = skb_network_header(skb);
		skb->csum = csum_sub(skb->csum,
				     csum_partial(nh, (u8 *)(fhdr + 1) - nh,
						  0));
	}
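
The adjustment in the excerpt above removes the contribution of the pulled header bytes from a whole-packet sum. A minimal userspace sketch of that arithmetic follows; csum16(), csum16_sub() and csum16_pull() are hypothetical stand-ins for illustration and are not the kernel's csum_partial() or csum_sub().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones'-complement sum of len bytes as big-endian 16-bit words, folded to
 * 16 bits (not inverted). An odd trailing byte is zero-padded. */
static uint16_t csum16(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (i < len)
		sum += (uint32_t)p[i] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones'-complement subtraction: a - b is computed as a + ~b. */
static uint16_t csum16_sub(uint16_t a, uint16_t b)
{
	uint32_t sum = (uint32_t)a + (uint16_t)~b;

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Remove the first hdr_len bytes of buf from a sum that covered all of buf.
 * hdr_len must be even so the remaining bytes keep their 16-bit alignment. */
static uint16_t csum16_pull(uint16_t whole, const uint8_t *buf, size_t hdr_len)
{
	assert(hdr_len % 2 == 0);
	return csum16_sub(whole, csum16(buf, hdr_len));
}
```

With a 40-byte IPv6 header plus an 8-byte fragment header, csum16_pull(whole, buf, 48) yields the same value as summing only the bytes after offset 48.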

Are there any reasons for such difference between IPv4 and IPv6?

3) A few words about the new HW I'm working with.
The HW can parse IPv4/IPv6 and UDP/TCP headers and generate a csum that includes the pseudo header checksum,
which works pretty well for non-fragmented packets.

For fragmented packets (UDP for example):
- First fragments have the UDP header (including pseudo header) and data included in the count.
- Middle and last fragments have only data included in the count

As a result, SUM_ALL(frag->csum) == 0xFFFF means the packet csum is correct.
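
The SUM_ALL() check above can be modelled in userspace as follows. This is a sketch under stated assumptions: IPv4/UDP, every non-final fragment carrying an even number of L4 bytes, and a device that folds the pseudo header into the first fragment's value. All helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones'-complement sum of len bytes as big-endian 16-bit words, folded to
 * 16 bits (not inverted). An odd trailing byte is zero-padded. */
static uint16_t csum16(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (i < len)
		sum += (uint32_t)p[i] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones'-complement addition of two folded sums. */
static uint16_t csum16_add(uint16_t a, uint16_t b)
{
	uint32_t sum = (uint32_t)a + b;

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Sum of the IPv4 pseudo header for UDP: saddr, daddr, zero, proto 17,
 * UDP length. */
static uint16_t udp4_pseudo_csum(const uint8_t saddr[4],
				 const uint8_t daddr[4], uint16_t udp_len)
{
	const uint8_t ph[12] = {
		saddr[0], saddr[1], saddr[2], saddr[3],
		daddr[0], daddr[1], daddr[2], daddr[3],
		0, 17, (uint8_t)(udp_len >> 8), (uint8_t)(udp_len & 0xff),
	};

	return csum16(ph, sizeof(ph));
}

/* Returns 1 if the per-fragment values add up to 0xffff, i.e. the UDP
 * checksum of the reassembled datagram is correct in this model. */
static int frag_sums_ok(const uint16_t *frag_csum, size_t nfrags)
{
	uint16_t total = 0;
	size_t i;

	assert(nfrags > 0);
	for (i = 0; i < nfrags; i++)
		total = csum16_add(total, frag_csum[i]);
	return total == 0xffff;
}
```

Here frag_csum[0] would be csum16_add(pseudo, csum16(first_fragment_l4_bytes)) and every later entry the plain sum of that fragment's L4 bytes.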

The above does not seem to work out of the box, at least not without csum manipulations similar
to what is done in mellanox/mlx4/en_rx.c (check_csum(), get_fixed_ipv4_csum(), get_fixed_ipv6_csum())
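
The general idea behind such a manipulation is to subtract the pseudo header from the hardware value so that the result covers only real packet bytes. The following is a userspace sketch of that idea for IPv4, not the mlx4 code; the helper names are hypothetical, and which bytes remain covered afterwards depends on the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones'-complement sum of len bytes as big-endian 16-bit words, folded to
 * 16 bits (not inverted). An odd trailing byte is zero-padded. */
static uint16_t csum16(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (i < len)
		sum += (uint32_t)p[i] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones'-complement addition and subtraction of folded sums. */
static uint16_t csum16_add(uint16_t a, uint16_t b)
{
	uint32_t sum = (uint32_t)a + b;

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

static uint16_t csum16_sub(uint16_t a, uint16_t b)
{
	return csum16_add(a, (uint16_t)~b);
}

/* Sum of the IPv4 pseudo header: saddr, daddr, zero, protocol, L4 length. */
static uint16_t pseudo4_csum(const uint8_t saddr[4], const uint8_t daddr[4],
			     uint8_t proto, uint16_t l4_len)
{
	const uint8_t ph[12] = {
		saddr[0], saddr[1], saddr[2], saddr[3],
		daddr[0], daddr[1], daddr[2], daddr[3],
		0, proto, (uint8_t)(l4_len >> 8), (uint8_t)(l4_len & 0xff),
	};

	return csum16(ph, sizeof(ph));
}

/* Given a hardware sum over pseudo header + L4 bytes, return the sum over
 * the L4 bytes alone. */
static uint16_t strip_pseudo_csum(uint16_t hw_csum, const uint8_t saddr[4],
				  const uint8_t daddr[4], uint8_t proto,
				  uint16_t l4_len)
{
	/* A folded sum that includes a nonzero protocol field is never 0. */
	assert(hw_csum != 0);
	return csum16_sub(hw_csum, pseudo4_csum(saddr, daddr, proto, l4_len));
}
```

For a value that starts at the network header, as skb->csum is expected to after the Ethernet header is pulled, the sum of the IP header bytes would additionally have to be added back.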


Thank you.

-- 
Best regards,
grygorii


* Re: CHECKSUM_COMPLETE question
  2020-02-12  9:52 CHECKSUM_COMPLETE question Grygorii Strashko
@ 2020-02-19 17:03 ` Grygorii Strashko
  2020-02-19 22:59   ` Willem de Bruijn
  0 siblings, 1 reply; 5+ messages in thread
From: Grygorii Strashko @ 2020-02-19 17:03 UTC
  To: netdev, David S . Miller, Alexey Kuznetsov, Eric Dumazet, John Fastabend
  Cc: Hideaki YOSHIFUJI

Hi All,

On 12/02/2020 11:52, Grygorii Strashko wrote:
> Hi All,
> 
> I'd like to ask expert opinion and clarify few points about HW RX checksum offload.
> 
> 1) CHECKSUM_COMPLETE - from description in <linux/skbuff.h>
>   * CHECKSUM_COMPLETE:
>   *
>   *   This is the most generic way. The device supplied checksum of the _whole_
>   *   packet as seen by netif_rx() and fills out in skb->csum. Meaning, the
>   *   hardware doesn't need to parse L3/L4 headers to implement this.
> 
> My understanding from above is that HW, to be fully compatible with Linux, should produce CSUM
> starting from first byte after EtherType field:
>   (6 DST_MAC) (6 SRC_MAC) (2 EtherType) (...                   ...)
>                                          ^                       ^
>                                          | start csum            | end csum
> and ending at the last byte of Ethernet frame data.
> - if packet is VLAN tagged then VLAN TCI and real EtherType included in CSUM,
>    but first VLAN TPID doesn't
> - pad bytes may/may not be included in csum
> 
> 
> 2) I've found some difference between IPv4 and IPv6 csum processing of fragmented packets
> 
> Fragmented IPv4 UDP packet:
>   - driver fills skb->csum and sets skb->ip_summed = CHECKSUM_COMPLETE for every fragment
>   - in ip_frag_queue() the ip_hdr is parsed, and polled, and packet queued, but
>     there is *no* csum correction in this function for polled ip_hdr (no csum_sub() or similar calls)
> ^^^^
>   - as result, in inet_frag_reasm_finish() the skb->csum field can be seen unmodified
>   - if the same packet is sent over VLAN the skb->csum in inet_frag_reasm_finish() will be seen as modified due to
>     skb_vlan_untag()->skb_pull_rcsum()
> 
> Fragmented IPv6 UDP packet:
>   - driver fills skb->csum and sets skb->ip_summed = CHECKSUM_COMPLETE for every fragment
>   - in ip6_frag_queue() the ipv6_hdr s parsed, and polled, and packet queued,
>     *and csum corrected*
> 
>   ip6_frag_queue()
>   ...
>          if (skb->ip_summed == CHECKSUM_COMPLETE) {
>          const unsigned char *nh = skb_network_header(skb);
>          skb->csum = csum_sub(skb->csum,
>                       csum_partial(nh, (u8 *)(fhdr + 1) - nh,
> 
> Are there any reasons for such difference between IPv4 and IPv6?

I'm very sorry for disturbing you, but could anybody help with the above two questions?


> 
> 3) few words about new HW i'm working with.
> The HW can parse IP4/IP6 and UDP/TCP headers and generate csum including pseudo header checksum
> which is working pretty well for non fragmented packets.
> 
> For fragmented packets (UDP for example):
> - First fragments have the UDP header (including pseudo header) and data included in the count.
> - Middle and last fragments have only data included in the count
> 
> As result when SUM_ALL(frag->csum) == 0xFFFF means packet csum is correct.
> 
> Above seems will not be working out of the box, at least not without csum manipulations simialar
> to what is done in mellanox/mlx4/en_rx.c (check_csum(),get_fixed_ipv4_csum(), get_fixed_ipv6_csum())
> 
> 
> Thanks you.
> 

-- 
Best regards,
grygorii


* Re: CHECKSUM_COMPLETE question
  2020-02-19 17:03 ` Grygorii Strashko
@ 2020-02-19 22:59   ` Willem de Bruijn
  2020-02-20  0:25     ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Willem de Bruijn @ 2020-02-19 22:59 UTC
  To: Grygorii Strashko
  Cc: netdev, David S . Miller, Alexey Kuznetsov, Eric Dumazet,
	John Fastabend, Hideaki YOSHIFUJI

On Wed, Feb 19, 2020 at 9:04 AM Grygorii Strashko
<grygorii.strashko@ti.com> wrote:
>
> Hi All,
>
> On 12/02/2020 11:52, Grygorii Strashko wrote:
> > Hi All,
> >
> > I'd like to ask expert opinion and clarify few points about HW RX checksum offload.
> >
> > 1) CHECKSUM_COMPLETE - from description in <linux/skbuff.h>
> >   * CHECKSUM_COMPLETE:
> >   *
> >   *   This is the most generic way. The device supplied checksum of the _whole_
> >   *   packet as seen by netif_rx() and fills out in skb->csum. Meaning, the
> >   *   hardware doesn't need to parse L3/L4 headers to implement this.
> >
> > My understanding from above is that HW, to be fully compatible with Linux, should produce CSUM
> > starting from first byte after EtherType field:
> >   (6 DST_MAC) (6 SRC_MAC) (2 EtherType) (...                   ...)
> >                                          ^                       ^
> >                                          | start csum            | end csum
> > and ending at the last byte of Ethernet frame data.
> > - if packet is VLAN tagged then VLAN TCI and real EtherType included in CSUM,
> >    but first VLAN TPID doesn't
> > - pad bytes may/may not be included in csum

Based on commit 88078d98d1bb ("net: pskb_trim_rcsum() and
CHECKSUM_COMPLETE are friends") these bytes are expected to be covered
by skb->csum.
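
If pad bytes are covered by the sum, trimming them later means their contribution has to be removed from it. A minimal userspace sketch of that arithmetic follows, with hypothetical helper names; it is not the kernel implementation of pskb_trim_rcsum().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones'-complement sum of len bytes as big-endian 16-bit words, folded to
 * 16 bits (not inverted). An odd trailing byte is zero-padded. */
static uint16_t csum16(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (i < len)
		sum += (uint32_t)p[i] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Ones'-complement subtraction: a - b is computed as a + ~b. */
static uint16_t csum16_sub(uint16_t a, uint16_t b)
{
	uint32_t sum = (uint32_t)a + (uint16_t)~b;

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Given a sum over buf[0..len), return the sum over buf[0..new_len). */
static uint16_t csum16_trim(uint16_t whole, const uint8_t *buf,
			    size_t len, size_t new_len)
{
	uint16_t tail;

	assert(new_len <= len);
	tail = csum16(buf + new_len, len - new_len);
	/* A tail that starts at an odd offset sits in the opposite byte
	 * lanes of the whole-buffer sum, so swap before subtracting. */
	if (new_len & 1)
		tail = (uint16_t)((tail << 8) | (tail >> 8));
	return csum16_sub(whole, tail);
}
```

When the padding is all zeroes, its contribution is zero and the sum is unchanged by the trim; the subtraction only matters for non-zero trailing bytes.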

Not sure about that ipv4 header pull without csum adjust.


* Re: CHECKSUM_COMPLETE question
  2020-02-19 22:59   ` Willem de Bruijn
@ 2020-02-20  0:25     ` Jakub Kicinski
  2020-02-20  6:41       ` Willem de Bruijn
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2020-02-20  0:25 UTC
  To: Willem de Bruijn
  Cc: Grygorii Strashko, netdev, David S . Miller, Alexey Kuznetsov,
	Eric Dumazet, John Fastabend, Hideaki YOSHIFUJI

On Wed, 19 Feb 2020 14:59:16 -0800 Willem de Bruijn wrote:
> On Wed, Feb 19, 2020 at 9:04 AM Grygorii Strashko
> <grygorii.strashko@ti.com> wrote:
> >
> > Hi All,
> >
> > On 12/02/2020 11:52, Grygorii Strashko wrote:  
> > > Hi All,
> > >
> > > I'd like to ask expert opinion and clarify few points about HW RX checksum offload.
> > >
> > > 1) CHECKSUM_COMPLETE - from description in <linux/skbuff.h>
> > >   * CHECKSUM_COMPLETE:
> > >   *
> > >   *   This is the most generic way. The device supplied checksum of the _whole_
> > >   *   packet as seen by netif_rx() and fills out in skb->csum. Meaning, the
> > >   *   hardware doesn't need to parse L3/L4 headers to implement this.
> > >
> > > My understanding from above is that HW, to be fully compatible with Linux, should produce CSUM
> > > starting from first byte after EtherType field:
> > >   (6 DST_MAC) (6 SRC_MAC) (2 EtherType) (...                   ...)
> > >                                          ^                       ^
> > >                                          | start csum            | end csum
> > > and ending at the last byte of Ethernet frame data.
> > > - if packet is VLAN tagged then VLAN TCI and real EtherType included in CSUM,
> > >    but first VLAN TPID doesn't
> > > - pad bytes may/may not be included in csum  
> 
> Based on commit 88078d98d1bb ("net: pskb_trim_rcsum() and
> CHECKSUM_COMPLETE are friends") these bytes are expected to be covered
> by skb->csum.
> 
> Not sure about that ipv4 header pull without csum adjust.

Isn't it just because IPv4 has a header checksum and therefore what's
pulled off adds up to 0 anyway? IPv6 does not have a header csum, hence
the adjustment?
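
This argument can be checked with a small userspace sketch: a header whose checksum field is valid sums to 0xffff, which is zero in ones'-complement arithmetic, so including or excluding it leaves a whole-packet sum unchanged. csum16() and ipv4_set_hdr_csum() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Ones'-complement sum of len bytes as big-endian 16-bit words, folded to
 * 16 bits (not inverted). An odd trailing byte is zero-padded. */
static uint16_t csum16(const uint8_t *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (i < len)
		sum += (uint32_t)p[i] << 8;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Fill in the IPv4 header checksum field (bytes 10 and 11). */
static void ipv4_set_hdr_csum(uint8_t *hdr, size_t ihl_bytes)
{
	uint16_t c;

	assert(ihl_bytes >= 20 && ihl_bytes % 4 == 0);
	hdr[10] = 0;
	hdr[11] = 0;
	c = (uint16_t)~csum16(hdr, ihl_bytes);
	hdr[10] = (uint8_t)(c >> 8);
	hdr[11] = (uint8_t)(c & 0xff);
}
```

After ipv4_set_hdr_csum(pkt, 20), csum16(pkt, 20) is 0xffff and csum16() over header plus payload equals csum16() over the payload alone, so no correction is needed when the header is pulled. An IPv6 header carries no such field, so its bytes have to be subtracted explicitly.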


* Re: CHECKSUM_COMPLETE question
  2020-02-20  0:25     ` Jakub Kicinski
@ 2020-02-20  6:41       ` Willem de Bruijn
  0 siblings, 0 replies; 5+ messages in thread
From: Willem de Bruijn @ 2020-02-20  6:41 UTC
  To: Jakub Kicinski
  Cc: Grygorii Strashko, netdev, David S . Miller, Alexey Kuznetsov,
	Eric Dumazet, John Fastabend, Hideaki YOSHIFUJI

On Wed, Feb 19, 2020 at 4:26 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 19 Feb 2020 14:59:16 -0800 Willem de Bruijn wrote:
> > On Wed, Feb 19, 2020 at 9:04 AM Grygorii Strashko
> > <grygorii.strashko@ti.com> wrote:
> > >
> > > Hi All,
> > >
> > > On 12/02/2020 11:52, Grygorii Strashko wrote:
> > > > Hi All,
> > > >
> > > > I'd like to ask expert opinion and clarify few points about HW RX checksum offload.
> > > >
> > > > 1) CHECKSUM_COMPLETE - from description in <linux/skbuff.h>
> > > >   * CHECKSUM_COMPLETE:
> > > >   *
> > > >   *   This is the most generic way. The device supplied checksum of the _whole_
> > > >   *   packet as seen by netif_rx() and fills out in skb->csum. Meaning, the
> > > >   *   hardware doesn't need to parse L3/L4 headers to implement this.
> > > >
> > > > My understanding from above is that HW, to be fully compatible with Linux, should produce CSUM
> > > > starting from first byte after EtherType field:
> > > >   (6 DST_MAC) (6 SRC_MAC) (2 EtherType) (...                   ...)
> > > >                                          ^                       ^
> > > >                                          | start csum            | end csum
> > > > and ending at the last byte of Ethernet frame data.
> > > > - if packet is VLAN tagged then VLAN TCI and real EtherType included in CSUM,
> > > >    but first VLAN TPID doesn't
> > > > - pad bytes may/may not be included in csum
> >
> > Based on commit 88078d98d1bb ("net: pskb_trim_rcsum() and
> > CHECKSUM_COMPLETE are friends") these bytes are expected to be covered
> > by skb->csum.
> >
> > Not sure about that ipv4 header pull without csum adjust.
>
> Isn't it just because IPv4 has a header checksum and therefore what's
> pulled off adds up to 0 anyway? IPv6 does not have a header csum, hence
> the adjustment?

Ah yes, of course!


