netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v2] af-packet: new flag to indicate all csums are good
@ 2020-06-02  8:05 Victor Julien
  2020-06-02  8:11 ` Victor Julien
  2020-06-02 14:29 ` Willem de Bruijn
  0 siblings, 2 replies; 17+ messages in thread
From: Victor Julien @ 2020-06-02  8:05 UTC (permalink / raw)
  To: netdev
  Cc: victor, David S. Miller, Jakub Kicinski, Jonathan Corbet,
	Eric Dumazet, Willem de Bruijn, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel

Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
that the driver has completely validated the checksums in the packet.

The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
in that the new flag will only be set if all the layers are valid,
while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.

The name is derived from the skb->ip_summed setting CHECKSUM_UNNECESSARY.

Security tools such as Suricata, Snort, Zeek/Bro need to know not
only that a packet has not been corrupted, but also that the
checksums are correct. Without this an attacker could send a packet,
for example a TCP RST packet, that would be accepted by the
security tool, but rejected by the end host creating an impendance
mismatch.

To avoid this scenario tools currently will have to (re)calcultate/validate
the checksums as well. With this patch this becomes unnecessary for many
of the packets.

This patch has been tested with Suricata with the virtio driver,
where it reduced the ammount of time spent in the Suricata TCP
checksum validation to about half.

Signed-off-by: Victor Julien <victor@inliniac.net>
---
 Documentation/networking/packet_mmap.rst | 80 +++++++++++++-----------
 include/uapi/linux/if_packet.h           |  1 +
 net/packet/af_packet.c                   | 11 ++--
 3 files changed, 52 insertions(+), 40 deletions(-)

diff --git a/Documentation/networking/packet_mmap.rst b/Documentation/networking/packet_mmap.rst
index 6c009ceb1183..1711be47d61d 100644
--- a/Documentation/networking/packet_mmap.rst
+++ b/Documentation/networking/packet_mmap.rst
@@ -437,42 +437,50 @@ and the following flags apply:
 Capture process
 ^^^^^^^^^^^^^^^
 
-     from include/linux/if_packet.h
-
-     #define TP_STATUS_COPY          (1 << 1)
-     #define TP_STATUS_LOSING        (1 << 2)
-     #define TP_STATUS_CSUMNOTREADY  (1 << 3)
-     #define TP_STATUS_CSUM_VALID    (1 << 7)
-
-======================  =======================================================
-TP_STATUS_COPY		This flag indicates that the frame (and associated
-			meta information) has been truncated because it's
-			larger than tp_frame_size. This packet can be
-			read entirely with recvfrom().
-
-			In order to make this work it must to be
-			enabled previously with setsockopt() and
-			the PACKET_COPY_THRESH option.
-
-			The number of frames that can be buffered to
-			be read with recvfrom is limited like a normal socket.
-			See the SO_RCVBUF option in the socket (7) man page.
-
-TP_STATUS_LOSING	indicates there were packet drops from last time
-			statistics where checked with getsockopt() and
-			the PACKET_STATISTICS option.
-
-TP_STATUS_CSUMNOTREADY	currently it's used for outgoing IP packets which
-			its checksum will be done in hardware. So while
-			reading the packet we should not try to check the
-			checksum.
-
-TP_STATUS_CSUM_VALID	This flag indicates that at least the transport
-			header checksum of the packet has been already
-			validated on the kernel side. If the flag is not set
-			then we are free to check the checksum by ourselves
-			provided that TP_STATUS_CSUMNOTREADY is also not set.
-======================  =======================================================
+from include/linux/if_packet.h::
+
+     #define TP_STATUS_COPY		(1 << 1)
+     #define TP_STATUS_LOSING		(1 << 2)
+     #define TP_STATUS_CSUMNOTREADY	(1 << 3)
+     #define TP_STATUS_CSUM_VALID	(1 << 7)
+     #define TP_STATUS_CSUM_UNNECESSARY	(1 << 8)
+
+==========================  =====================================================
+TP_STATUS_COPY		    This flag indicates that the frame (and associated
+			    meta information) has been truncated because it's
+			    larger than tp_frame_size. This packet can be
+			    read entirely with recvfrom().
+
+			    In order to make this work it must to be
+			    enabled previously with setsockopt() and
+			    the PACKET_COPY_THRESH option.
+
+			    The number of frames that can be buffered to
+			    be read with recvfrom is limited like a normal socket.
+			    See the SO_RCVBUF option in the socket (7) man page.
+
+TP_STATUS_LOSING	    indicates there were packet drops from last time
+			    statistics where checked with getsockopt() and
+			    the PACKET_STATISTICS option.
+
+TP_STATUS_CSUMNOTREADY	    currently it's used for outgoing IP packets which
+			    its checksum will be done in hardware. So while
+			    reading the packet we should not try to check the
+			    checksum.
+
+TP_STATUS_CSUM_VALID	    This flag indicates that at least the transport
+			    header checksum of the packet has been already
+			    validated on the kernel side. If the flag is not set
+			    then we are free to check the checksum by ourselves
+			    provided that TP_STATUS_CSUMNOTREADY is also not set.
+
+TP_STATUS_CSUM_UNNECESSARY  This flag indicates that the driver validated all
+			    the packets csums. If it is not set it might be that
+			    the driver doesn't support this, or that one of the
+			    layers csums is bad. TP_STATUS_CSUM_VALID may still
+			    be set if the transport layer csum is correct or
+			    if the driver supports only this mode.
+==========================  =====================================================
 
 for convenience there are also the following defines::
 
diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
index 3d884d68eb30..76a5c762e2e0 100644
--- a/include/uapi/linux/if_packet.h
+++ b/include/uapi/linux/if_packet.h
@@ -113,6 +113,7 @@ struct tpacket_auxdata {
 #define TP_STATUS_BLK_TMO		(1 << 5)
 #define TP_STATUS_VLAN_TPID_VALID	(1 << 6) /* auxdata has valid tp_vlan_tpid */
 #define TP_STATUS_CSUM_VALID		(1 << 7)
+#define TP_STATUS_CSUM_UNNECESSARY	(1 << 8)
 
 /* Tx ring - header status */
 #define TP_STATUS_AVAILABLE	      0
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 29bd405adbbd..94e213537646 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2215,10 +2215,13 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 
 	if (skb->ip_summed == CHECKSUM_PARTIAL)
 		status |= TP_STATUS_CSUMNOTREADY;
-	else if (skb->pkt_type != PACKET_OUTGOING &&
-		 (skb->ip_summed == CHECKSUM_COMPLETE ||
-		  skb_csum_unnecessary(skb)))
-		status |= TP_STATUS_CSUM_VALID;
+	else if (skb->pkt_type != PACKET_OUTGOING) {
+		if (skb->ip_summed == CHECKSUM_UNNECESSARY)
+			status |= TP_STATUS_CSUM_UNNECESSARY | TP_STATUS_CSUM_VALID;
+		else if (skb->ip_summed == CHECKSUM_COMPLETE ||
+			 skb_csum_unnecessary(skb))
+			status |= TP_STATUS_CSUM_VALID;
+	}
 
 	if (snaplen > res)
 		snaplen = res;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02  8:05 [PATCH net-next v2] af-packet: new flag to indicate all csums are good Victor Julien
@ 2020-06-02  8:11 ` Victor Julien
  2020-06-02 14:29 ` Willem de Bruijn
  1 sibling, 0 replies; 17+ messages in thread
From: Victor Julien @ 2020-06-02  8:11 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Jonathan Corbet, Eric Dumazet,
	Willem de Bruijn, Mao Wenan, Arnd Bergmann, Neil Horman,
	linux-doc, linux-kernel

Need to learn how to properly do git sendpatch, apologies if i'm not
following proper procedures.

This v2 fixes up the doc. The output is now tested with rst2pdf and
looks good. Added in a trivial related doc fixup.

I wasn't completely sure if I was supposed to use tabs or spaces in rst,
so did tabs with spaces to fix alignment.

Fixed the code path to not check for the packet direction multiple times.

Please let me know if the patch makes sense fundamentally as well. My
knowledge of skb's and linux checksum handling in general is very
superficial.

Thanks,
Victor

On 02-06-2020 10:05, Victor Julien wrote:
> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
> that the driver has completely validated the checksums in the packet.
> 
> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
> in that the new flag will only be set if all the layers are valid,
> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
> 
> The name is derived from the skb->ip_summed setting CHECKSUM_UNNECESSARY.
> 
> Security tools such as Suricata, Snort, Zeek/Bro need to know not
> only that a packet has not been corrupted, but also that the
> checksums are correct. Without this an attacker could send a packet,
> for example a TCP RST packet, that would be accepted by the
> security tool, but rejected by the end host creating an impendance
> mismatch.
> 
> To avoid this scenario tools currently will have to (re)calcultate/validate
> the checksums as well. With this patch this becomes unnecessary for many
> of the packets.
> 
> This patch has been tested with Suricata with the virtio driver,
> where it reduced the ammount of time spent in the Suricata TCP
> checksum validation to about half.
> 
> Signed-off-by: Victor Julien <victor@inliniac.net>
> ---
>  Documentation/networking/packet_mmap.rst | 80 +++++++++++++-----------
>  include/uapi/linux/if_packet.h           |  1 +
>  net/packet/af_packet.c                   | 11 ++--
>  3 files changed, 52 insertions(+), 40 deletions(-)
> 
> diff --git a/Documentation/networking/packet_mmap.rst b/Documentation/networking/packet_mmap.rst
> index 6c009ceb1183..1711be47d61d 100644
> --- a/Documentation/networking/packet_mmap.rst
> +++ b/Documentation/networking/packet_mmap.rst
> @@ -437,42 +437,50 @@ and the following flags apply:
>  Capture process
>  ^^^^^^^^^^^^^^^
>  
> -     from include/linux/if_packet.h
> -
> -     #define TP_STATUS_COPY          (1 << 1)
> -     #define TP_STATUS_LOSING        (1 << 2)
> -     #define TP_STATUS_CSUMNOTREADY  (1 << 3)
> -     #define TP_STATUS_CSUM_VALID    (1 << 7)
> -
> -======================  =======================================================
> -TP_STATUS_COPY		This flag indicates that the frame (and associated
> -			meta information) has been truncated because it's
> -			larger than tp_frame_size. This packet can be
> -			read entirely with recvfrom().
> -
> -			In order to make this work it must to be
> -			enabled previously with setsockopt() and
> -			the PACKET_COPY_THRESH option.
> -
> -			The number of frames that can be buffered to
> -			be read with recvfrom is limited like a normal socket.
> -			See the SO_RCVBUF option in the socket (7) man page.
> -
> -TP_STATUS_LOSING	indicates there were packet drops from last time
> -			statistics where checked with getsockopt() and
> -			the PACKET_STATISTICS option.
> -
> -TP_STATUS_CSUMNOTREADY	currently it's used for outgoing IP packets which
> -			its checksum will be done in hardware. So while
> -			reading the packet we should not try to check the
> -			checksum.
> -
> -TP_STATUS_CSUM_VALID	This flag indicates that at least the transport
> -			header checksum of the packet has been already
> -			validated on the kernel side. If the flag is not set
> -			then we are free to check the checksum by ourselves
> -			provided that TP_STATUS_CSUMNOTREADY is also not set.
> -======================  =======================================================
> +from include/linux/if_packet.h::
> +
> +     #define TP_STATUS_COPY		(1 << 1)
> +     #define TP_STATUS_LOSING		(1 << 2)
> +     #define TP_STATUS_CSUMNOTREADY	(1 << 3)
> +     #define TP_STATUS_CSUM_VALID	(1 << 7)
> +     #define TP_STATUS_CSUM_UNNECESSARY	(1 << 8)
> +
> +==========================  =====================================================
> +TP_STATUS_COPY		    This flag indicates that the frame (and associated
> +			    meta information) has been truncated because it's
> +			    larger than tp_frame_size. This packet can be
> +			    read entirely with recvfrom().
> +
> +			    In order to make this work it must to be
> +			    enabled previously with setsockopt() and
> +			    the PACKET_COPY_THRESH option.
> +
> +			    The number of frames that can be buffered to
> +			    be read with recvfrom is limited like a normal socket.
> +			    See the SO_RCVBUF option in the socket (7) man page.
> +
> +TP_STATUS_LOSING	    indicates there were packet drops from last time
> +			    statistics where checked with getsockopt() and
> +			    the PACKET_STATISTICS option.
> +
> +TP_STATUS_CSUMNOTREADY	    currently it's used for outgoing IP packets which
> +			    its checksum will be done in hardware. So while
> +			    reading the packet we should not try to check the
> +			    checksum.
> +
> +TP_STATUS_CSUM_VALID	    This flag indicates that at least the transport
> +			    header checksum of the packet has been already
> +			    validated on the kernel side. If the flag is not set
> +			    then we are free to check the checksum by ourselves
> +			    provided that TP_STATUS_CSUMNOTREADY is also not set.
> +
> +TP_STATUS_CSUM_UNNECESSARY  This flag indicates that the driver validated all
> +			    the packets csums. If it is not set it might be that
> +			    the driver doesn't support this, or that one of the
> +			    layers csums is bad. TP_STATUS_CSUM_VALID may still
> +			    be set if the transport layer csum is correct or
> +			    if the driver supports only this mode.
> +==========================  =====================================================
>  
>  for convenience there are also the following defines::
>  
> diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
> index 3d884d68eb30..76a5c762e2e0 100644
> --- a/include/uapi/linux/if_packet.h
> +++ b/include/uapi/linux/if_packet.h
> @@ -113,6 +113,7 @@ struct tpacket_auxdata {
>  #define TP_STATUS_BLK_TMO		(1 << 5)
>  #define TP_STATUS_VLAN_TPID_VALID	(1 << 6) /* auxdata has valid tp_vlan_tpid */
>  #define TP_STATUS_CSUM_VALID		(1 << 7)
> +#define TP_STATUS_CSUM_UNNECESSARY	(1 << 8)
>  
>  /* Tx ring - header status */
>  #define TP_STATUS_AVAILABLE	      0
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 29bd405adbbd..94e213537646 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2215,10 +2215,13 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
>  
>  	if (skb->ip_summed == CHECKSUM_PARTIAL)
>  		status |= TP_STATUS_CSUMNOTREADY;
> -	else if (skb->pkt_type != PACKET_OUTGOING &&
> -		 (skb->ip_summed == CHECKSUM_COMPLETE ||
> -		  skb_csum_unnecessary(skb)))
> -		status |= TP_STATUS_CSUM_VALID;
> +	else if (skb->pkt_type != PACKET_OUTGOING) {
> +		if (skb->ip_summed == CHECKSUM_UNNECESSARY)
> +			status |= TP_STATUS_CSUM_UNNECESSARY | TP_STATUS_CSUM_VALID;
> +		else if (skb->ip_summed == CHECKSUM_COMPLETE ||
> +			 skb_csum_unnecessary(skb))
> +			status |= TP_STATUS_CSUM_VALID;
> +	}
>  
>  	if (snaplen > res)
>  		snaplen = res;
> 


-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02  8:05 [PATCH net-next v2] af-packet: new flag to indicate all csums are good Victor Julien
  2020-06-02  8:11 ` Victor Julien
@ 2020-06-02 14:29 ` Willem de Bruijn
  2020-06-02 17:03   ` Victor Julien
  1 sibling, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2020-06-02 14:29 UTC (permalink / raw)
  To: Victor Julien
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Jonathan Corbet, Eric Dumazet, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel, Alexander Drozdov,
	Tom Herbert

On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
>
> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
> that the driver has completely validated the checksums in the packet.
>
> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
> in that the new flag will only be set if all the layers are valid,
> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.

transport, not ip checksum.

But as I understand it drivers set CHECKSUM_COMPLETE if they fill in
skb->csum over the full length of the packet. This does not
necessarily imply that any of the checksum fields in the packet are
valid yet (see also skb->csum_valid). Protocol code later compares
checksum fields against this using __skb_checksum_validate_complete and friends.

But packet sockets may be called before any of this, however. So I wonder
how valid the checksum really is right now when setting TP_STATUS_CSUM_VALID.
I assume it's correct, but don't fully understand where the validation
has taken place..

Similar to commit 682f048bd494 ("af_packet: pass checksum validation
status to the user"), please update tpacket_rcv and packet_rcv.

Note also that net-next is currently closed.




>  for convenience there are also the following defines::
>
> diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
> index 3d884d68eb30..76a5c762e2e0 100644
> --- a/include/uapi/linux/if_packet.h
> +++ b/include/uapi/linux/if_packet.h
> @@ -113,6 +113,7 @@ struct tpacket_auxdata {
>  #define TP_STATUS_BLK_TMO              (1 << 5)
>  #define TP_STATUS_VLAN_TPID_VALID      (1 << 6) /* auxdata has valid tp_vlan_tpid */
>  #define TP_STATUS_CSUM_VALID           (1 << 7)
> +#define TP_STATUS_CSUM_UNNECESSARY     (1 << 8)
>
>  /* Tx ring - header status */
>  #define TP_STATUS_AVAILABLE          0
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 29bd405adbbd..94e213537646 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2215,10 +2215,13 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
>
>         if (skb->ip_summed == CHECKSUM_PARTIAL)
>                 status |= TP_STATUS_CSUMNOTREADY;
> -       else if (skb->pkt_type != PACKET_OUTGOING &&
> -                (skb->ip_summed == CHECKSUM_COMPLETE ||
> -                 skb_csum_unnecessary(skb)))
> -               status |= TP_STATUS_CSUM_VALID;
> +       else if (skb->pkt_type != PACKET_OUTGOING) {
> +               if (skb->ip_summed == CHECKSUM_UNNECESSARY)
> +                       status |= TP_STATUS_CSUM_UNNECESSARY | TP_STATUS_CSUM_VALID;
> +               else if (skb->ip_summed == CHECKSUM_COMPLETE ||
> +                        skb_csum_unnecessary(skb))
> +                       status |= TP_STATUS_CSUM_VALID;
> +       }
>
>         if (snaplen > res)
>                 snaplen = res;
> --
> 2.17.1
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 14:29 ` Willem de Bruijn
@ 2020-06-02 17:03   ` Victor Julien
  2020-06-02 17:37     ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: Victor Julien @ 2020-06-02 17:03 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Jonathan Corbet, Eric Dumazet, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel, Alexander Drozdov,
	Tom Herbert

On 02-06-2020 16:29, Willem de Bruijn wrote:
> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
>>
>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
>> that the driver has completely validated the checksums in the packet.
>>
>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
>> in that the new flag will only be set if all the layers are valid,
>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
> 
> transport, not ip checksum.

Allow me a n00b question: what does transport refer to here? Things like
ethernet? It isn't clear to me from the doc.

(happy to follow up with a patch to clarify the doc when I understand
things better)

> But as I understand it drivers set CHECKSUM_COMPLETE if they fill in
> skb->csum over the full length of the packet. This does not
> necessarily imply that any of the checksum fields in the packet are
> valid yet (see also skb->csum_valid). Protocol code later compares
> checksum fields against this using __skb_checksum_validate_complete and friends.
> 
> But packet sockets may be called before any of this, however. So I wonder
> how valid the checksum really is right now when setting TP_STATUS_CSUM_VALID.
> I assume it's correct, but don't fully understand where the validation
> has taken place..

I guess I'm more confused now about what TP_STATUS_CSUM_VALID actually
means. It sounds almost like the opposite of TP_STATUS_CSUMNOTREADY, but
I'm not sure I understand what the value would be.

It would be great if someone could help clear this up. Everything I
thought I knew/understood so far has been proven wrong, so I'm not too
confident about my patch anymore...

> Similar to commit 682f048bd494 ("af_packet: pass checksum validation
> status to the user"), please update tpacket_rcv and packet_rcv.

Ah yes, good catch. Will add it there as well.

> Note also that net-next is currently closed.

Should I hold off on sending a v3 until it reopens?

Regards / Groeten,
Victor


> 
> 
> 
>>  for convenience there are also the following defines::
>>
>> diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
>> index 3d884d68eb30..76a5c762e2e0 100644
>> --- a/include/uapi/linux/if_packet.h
>> +++ b/include/uapi/linux/if_packet.h
>> @@ -113,6 +113,7 @@ struct tpacket_auxdata {
>>  #define TP_STATUS_BLK_TMO              (1 << 5)
>>  #define TP_STATUS_VLAN_TPID_VALID      (1 << 6) /* auxdata has valid tp_vlan_tpid */
>>  #define TP_STATUS_CSUM_VALID           (1 << 7)
>> +#define TP_STATUS_CSUM_UNNECESSARY     (1 << 8)
>>
>>  /* Tx ring - header status */
>>  #define TP_STATUS_AVAILABLE          0
>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>> index 29bd405adbbd..94e213537646 100644
>> --- a/net/packet/af_packet.c
>> +++ b/net/packet/af_packet.c
>> @@ -2215,10 +2215,13 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
>>
>>         if (skb->ip_summed == CHECKSUM_PARTIAL)
>>                 status |= TP_STATUS_CSUMNOTREADY;
>> -       else if (skb->pkt_type != PACKET_OUTGOING &&
>> -                (skb->ip_summed == CHECKSUM_COMPLETE ||
>> -                 skb_csum_unnecessary(skb)))
>> -               status |= TP_STATUS_CSUM_VALID;
>> +       else if (skb->pkt_type != PACKET_OUTGOING) {
>> +               if (skb->ip_summed == CHECKSUM_UNNECESSARY)
>> +                       status |= TP_STATUS_CSUM_UNNECESSARY | TP_STATUS_CSUM_VALID;
>> +               else if (skb->ip_summed == CHECKSUM_COMPLETE ||
>> +                        skb_csum_unnecessary(skb))
>> +                       status |= TP_STATUS_CSUM_VALID;
>> +       }
>>
>>         if (snaplen > res)
>>                 snaplen = res;
>> --
>> 2.17.1
>>


-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 17:03   ` Victor Julien
@ 2020-06-02 17:37     ` Willem de Bruijn
  2020-06-02 18:31       ` Victor Julien
  0 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2020-06-02 17:37 UTC (permalink / raw)
  To: Victor Julien
  Cc: Willem de Bruijn, Network Development, David S. Miller,
	Jakub Kicinski, Jonathan Corbet, Eric Dumazet, Mao Wenan,
	Arnd Bergmann, Neil Horman, linux-doc, linux-kernel,
	Alexander Drozdov, Tom Herbert

On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@inliniac.net> wrote:
>
> On 02-06-2020 16:29, Willem de Bruijn wrote:
> > On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
> >>
> >> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
> >> that the driver has completely validated the checksums in the packet.
> >>
> >> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
> >> in that the new flag will only be set if all the layers are valid,
> >> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
> >
> > transport, not ip checksum.
>
> Allow me a n00b question: what does transport refer to here? Things like
> ethernet? It isn't clear to me from the doc.

The TCP/UDP/.. transport protocol checksum.

> (happy to follow up with a patch to clarify the doc when I understand
> things better)
>
> > But as I understand it drivers set CHECKSUM_COMPLETE if they fill in
> > skb->csum over the full length of the packet. This does not
> > necessarily imply that any of the checksum fields in the packet are
> > valid yet (see also skb->csum_valid). Protocol code later compares
> > checksum fields against this using __skb_checksum_validate_complete and friends.
> >
> > But packet sockets may be called before any of this, however. So I wonder
> > how valid the checksum really is right now when setting TP_STATUS_CSUM_VALID.
> > I assume it's correct, but don't fully understand where the validation
> > has taken place..
>
> I guess I'm more confused now about what TP_STATUS_CSUM_VALID actually
> means. It sounds almost like the opposite of TP_STATUS_CSUMNOTREADY, but
> I'm not sure I understand what the value would be.
>
> It would be great if someone could help clear this up. Everything I
> thought I knew/understood so far has been proven wrong, so I'm not too
> confident about my patch anymore...

Agreed that we should clear this up.

> > Similar to commit 682f048bd494 ("af_packet: pass checksum validation
> > status to the user"), please update tpacket_rcv and packet_rcv.
>
> Ah yes, good catch. Will add it there as well.
>
> > Note also that net-next is currently closed.
>
> Should I hold off on sending a v3 until it reopens?

Yep, thanks. You can always check
http://vger.kernel.org/~davem/net-next.html when unsure.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 17:37     ` Willem de Bruijn
@ 2020-06-02 18:31       ` Victor Julien
  2020-06-02 19:03         ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: Victor Julien @ 2020-06-02 18:31 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Jonathan Corbet, Eric Dumazet, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel, Alexander Drozdov,
	Tom Herbert

Hi Willem,

On 02-06-2020 19:37, Willem de Bruijn wrote:
> On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@inliniac.net> wrote:
>>
>> On 02-06-2020 16:29, Willem de Bruijn wrote:
>>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
>>>>
>>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
>>>> that the driver has completely validated the checksums in the packet.
>>>>
>>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
>>>> in that the new flag will only be set if all the layers are valid,
>>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
>>>
>>> transport, not ip checksum.
>>
>> Allow me a n00b question: what does transport refer to here? Things like
>> ethernet? It isn't clear to me from the doc.
> 
> The TCP/UDP/.. transport protocol checksum.

Hmm that is what I thought originally, but then it didn't seem to work.
Hence my patch.

However I just redid my testing. I took the example tpacketv3 program
and added the status flag checks to the 'display()' func:

                if (ppd->tp_status & TP_STATUS_CSUM_VALID) {
                        printf("TP_STATUS_CSUM_VALID, ");
                }
                if (ppd->tp_status & (1<<8)) {
                        printf("TP_STATUS_CSUM_UNNECESSARY, ");

                }

Then using scapy sent some packets in 2 variants:
- default (good csums)
- deliberately bad csums
(then also added a few things like ip6 over ip)


srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(),
iface="enp1s0") // good csums

srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1),
iface="enp1s0") //bad tcp

1.2.3.4 -> 5.6.7.8, TP_STATUS_CSUM_VALID, TP_STATUS_CSUM_UNNECESSARY,
rxhash: 0x81ad5744
1.2.3.4 -> 5.6.7.8, rxhash: 0x81ad5744

So this suggests that what you're saying is correct, that it sets
TP_STATUS_CSUM_VALID if the TCP/UDP csum (and IPv4 csum) is valid, and
does not set it when either of them are invalid.

I'll also re-evaluate things in Suricata.


One thing I wonder if what this "at least" from the 682f048bd494 commit
message means:

"Introduce TP_STATUS_CSUM_VALID tp_status flag to tell the
 af_packet user that at least the transport header checksum
 has been already validated."

For TCP/UDP there wouldn't be a higher layer with csums, right? And
based on my testing it seems lower levels (at least IP) is also
included. Or would that perhaps refer to something like VXLAN or Geneve
over UDP? That the csums of packets on top of those layers aren't
(necessarily) considered?

Thanks,
Victor


>> (happy to follow up with a patch to clarify the doc when I understand
>> things better)
>>
>>> But as I understand it drivers set CHECKSUM_COMPLETE if they fill in
>>> skb->csum over the full length of the packet. This does not
>>> necessarily imply that any of the checksum fields in the packet are
>>> valid yet (see also skb->csum_valid). Protocol code later compares
>>> checksum fields against this using __skb_checksum_validate_complete and friends.
>>>
>>> But packet sockets may be called before any of this, however. So I wonder
>>> how valid the checksum really is right now when setting TP_STATUS_CSUM_VALID.
>>> I assume it's correct, but don't fully understand where the validation
>>> has taken place..
>>
>> I guess I'm more confused now about what TP_STATUS_CSUM_VALID actually
>> means. It sounds almost like the opposite of TP_STATUS_CSUMNOTREADY, but
>> I'm not sure I understand what the value would be.
>>
>> It would be great if someone could help clear this up. Everything I
>> thought I knew/understood so far has been proven wrong, so I'm not too
>> confident about my patch anymore...
> 
> Agreed that we should clear this up.
> 
>>> Similar to commit 682f048bd494 ("af_packet: pass checksum validation
>>> status to the user"), please update tpacket_rcv and packet_rcv.
>>
>> Ah yes, good catch. Will add it there as well.
>>
>>> Note also that net-next is currently closed.
>>
>> Should I hold off on sending a v3 until it reopens?
> 
> Yep, thanks. You can always check
> http://vger.kernel.org/~davem/net-next.html when unsure.
> 


-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 18:31       ` Victor Julien
@ 2020-06-02 19:03         ` Willem de Bruijn
  2020-06-02 19:22           ` Victor Julien
  0 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2020-06-02 19:03 UTC (permalink / raw)
  To: Victor Julien
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Jonathan Corbet, Eric Dumazet, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel, Alexander Drozdov,
	Tom Herbert

On Tue, Jun 2, 2020 at 2:31 PM Victor Julien <victor@inliniac.net> wrote:
>
> Hi Willem,
>
> On 02-06-2020 19:37, Willem de Bruijn wrote:
> > On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@inliniac.net> wrote:
> >>
> >> On 02-06-2020 16:29, Willem de Bruijn wrote:
> >>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
> >>>>
> >>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
> >>>> that the driver has completely validated the checksums in the packet.
> >>>>
> >>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
> >>>> in that the new flag will only be set if all the layers are valid,
> >>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
> >>>
> >>> transport, not ip checksum.
> >>
> >> Allow me a n00b question: what does transport refer to here? Things like
> >> ethernet? It isn't clear to me from the doc.
> >
> > The TCP/UDP/.. transport protocol checksum.
>
> Hmm that is what I thought originally, but then it didn't seem to work.
> Hence my patch.
>
> However I just redid my testing. I took the example tpacketv3 program
> and added the status flag checks to the 'display()' func:
>
>                 if (ppd->tp_status & TP_STATUS_CSUM_VALID) {
>                         printf("TP_STATUS_CSUM_VALID, ");
>                 }
>                 if (ppd->tp_status & (1<<8)) {
>                         printf("TP_STATUS_CSUM_UNNECESSARY, ");
>
>                 }
>
> Then using scapy sent some packets in 2 variants:
> - default (good csums)
> - deliberately bad csums
> (then also added a few things like ip6 over ip)
>
>
> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(),
> iface="enp1s0") // good csums
>
> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1),
> iface="enp1s0") //bad tcp

Is this a test between two machines? What is the device driver of the
machine receiving and printing the packet? It would be helpful to know
whether this uses CHECKSUM_COMPLETE or CHECKSUM_UNNECESSARY.

>
> 1.2.3.4 -> 5.6.7.8, TP_STATUS_CSUM_VALID, TP_STATUS_CSUM_UNNECESSARY,
> rxhash: 0x81ad5744
> 1.2.3.4 -> 5.6.7.8, rxhash: 0x81ad5744
>
> So this suggests that what you're saying is correct, that it sets
> TP_STATUS_CSUM_VALID if the TCP/UDP csum (and IPv4 csum) is valid, and
> does not set it when either of them are invalid.

That's not exactly what I said. It looks to me that a device that sets
CHECKSUM_COMPLETE will return TP_STATUS_CSUM_VALID from
__netif_receive_skb_core even if the TCP checksum turns out to be bad.
If a driver would insert such packets into the stack, that is.

> I'll also re-evaluate things in Suricata.
>
>
> One thing I wonder if what this "at least" from the 682f048bd494 commit
> message means:
>
> "Introduce TP_STATUS_CSUM_VALID tp_status flag to tell the
>  af_packet user that at least the transport header checksum
>  has been already validated."
>
> For TCP/UDP there wouldn't be a higher layer with csums, right? And
> based on my testing it seems lower levels (at least IP) is also
> included. Or would that perhaps refer to something like VXLAN or Geneve
> over UDP? That the csums of packets on top of those layers aren't
> (necessarily) considered?

The latter. All these checksums are about transport layer checksums
(the ip header checksum is cheap to compute). Multiple checksums
refers to packets encapsulated in other protocols with checksum, such
as GRE or UDP based like Geneve.

>
> Thanks,
> Victor
>
>
> >> (happy to follow up with a patch to clarify the doc when I understand
> >> things better)
> >>
> >>> But as I understand it drivers set CHECKSUM_COMPLETE if they fill in
> >>> skb->csum over the full length of the packet. This does not
> >>> necessarily imply that any of the checksum fields in the packet are
> >>> valid yet (see also skb->csum_valid). Protocol code later compares
> >>> checksum fields against this using __skb_checksum_validate_complete and friends.
> >>>
> >>> But packet sockets may be called before any of this, however. So I wonder
> >>> how valid the checksum really is right now when setting TP_STATUS_CSUM_VALID.
> >>> I assume it's correct, but don't fully understand where the validation
> >>> has taken place..
> >>
> >> I guess I'm more confused now about what TP_STATUS_CSUM_VALID actually
> >> means. It sounds almost like the opposite of TP_STATUS_CSUMNOTREADY, but
> >> I'm not sure I understand what the value would be.
> >>
> >> It would be great if someone could help clear this up. Everything I
> >> thought I knew/understood so far has been proven wrong, so I'm not too
> >> confident about my patch anymore...
> >
> > Agreed that we should clear this up.
> >
> >>> Similar to commit 682f048bd494 ("af_packet: pass checksum validation
> >>> status to the user"), please update tpacket_rcv and packet_rcv.
> >>
> >> Ah yes, good catch. Will add it there as well.
> >>
> >>> Note also that net-next is currently closed.
> >>
> >> Should I hold off on sending a v3 until it reopens?
> >
> > Yep, thanks. You can always check
> > http://vger.kernel.org/~davem/net-next.html when unsure.
> >
>
>
> --
> ---------------------------------------------
> Victor Julien
> http://www.inliniac.net/
> PGP: http://www.inliniac.net/victorjulien.asc
> ---------------------------------------------
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 19:03         ` Willem de Bruijn
@ 2020-06-02 19:22           ` Victor Julien
  2020-06-02 19:29             ` Jakub Kicinski
  2020-06-02 19:38             ` Willem de Bruijn
  0 siblings, 2 replies; 17+ messages in thread
From: Victor Julien @ 2020-06-02 19:22 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Jonathan Corbet, Eric Dumazet, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel, Alexander Drozdov,
	Tom Herbert

On 02-06-2020 21:03, Willem de Bruijn wrote:
> On Tue, Jun 2, 2020 at 2:31 PM Victor Julien <victor@inliniac.net> wrote:
>> On 02-06-2020 19:37, Willem de Bruijn wrote:
>>> On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@inliniac.net> wrote:
>>>>
>>>> On 02-06-2020 16:29, Willem de Bruijn wrote:
>>>>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
>>>>>>
>>>>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
>>>>>> that the driver has completely validated the checksums in the packet.
>>>>>>
>>>>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
>>>>>> in that the new flag will only be set if all the layers are valid,
>>>>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
>>>>>
>>>>> transport, not ip checksum.
>>>>
>>>> Allow me a n00b question: what does transport refer to here? Things like
>>>> ethernet? It isn't clear to me from the doc.
>>>
>>> The TCP/UDP/.. transport protocol checksum.
>>
>> Hmm that is what I thought originally, but then it didn't seem to work.
>> Hence my patch.
>>
>> However I just redid my testing. I took the example tpacketv3 program
>> and added the status flag checks to the 'display()' func:
>>
>>                 if (ppd->tp_status & TP_STATUS_CSUM_VALID) {
>>                         printf("TP_STATUS_CSUM_VALID, ");
>>                 }
>>                 if (ppd->tp_status & (1<<8)) {
>>                         printf("TP_STATUS_CSUM_UNNECESSARY, ");
>>
>>                 }
>>
>> Then using scapy sent some packets in 2 variants:
>> - default (good csums)
>> - deliberately bad csums
>> (then also added a few things like ip6 over ip)
>>
>>
>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(),
>> iface="enp1s0") // good csums
>>
>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1),
>> iface="enp1s0") //bad tcp
> 
> Is this a test between two machines? What is the device driver of the
> machine receiving and printing the packet? It would be helpful to know
> whether this uses CHECKSUM_COMPLETE or CHECKSUM_UNNECESSARY.

Yes 2 machines, or actually 2 machines and a VM. The receiving Linux
sits in a kvm vm with network pass through and uses the virtio driver
(host uses e1000e). Based on a quick 'git grep CHECKSUM_UNNECESSARY'
virtio seems to support that.

I've done some more tests. In a pcap replay that I know contains packet
with bad TCP csums (but good IP csums for those pkts), to a physical
host running Ubuntu Linux kernel 5.3:

- receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for
every packet, including the bad TCP ones
- receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad
packets.

Again purely based on 'git grep' it seems nfp does not support
UNNECESSARY, while ixgbe does.

(my original testing was with the nfp only, so now I at least understand
my original thinking)


>>
>> 1.2.3.4 -> 5.6.7.8, TP_STATUS_CSUM_VALID, TP_STATUS_CSUM_UNNECESSARY,
>> rxhash: 0x81ad5744
>> 1.2.3.4 -> 5.6.7.8, rxhash: 0x81ad5744
>>
>> So this suggests that what you're saying is correct, that it sets
>> TP_STATUS_CSUM_VALID if the TCP/UDP csum (and IPv4 csum) is valid, and
>> does not set it when either of them are invalid.
> 
> That's not exactly what I said. It looks to me that a device that sets
> CHECKSUM_COMPLETE will return TP_STATUS_CSUM_VALID from
> __netif_receive_skb_core even if the TCP checksum turns out to be bad.
> If a driver would insert such packets into the stack, that is.

Ok, this might be confirmed by my nfp vs virtio/ixgbe observations
mentioned above.


>> I'll also re-evaluate things in Suricata.
>>
>>
>> One thing I wonder if what this "at least" from the 682f048bd494 commit
>> message means:
>>
>> "Introduce TP_STATUS_CSUM_VALID tp_status flag to tell the
>>  af_packet user that at least the transport header checksum
>>  has been already validated."
>>
>> For TCP/UDP there wouldn't be a higher layer with csums, right? And
>> based on my testing it seems lower levels (at least IP) is also
>> included. Or would that perhaps refer to something like VXLAN or Geneve
>> over UDP? That the csums of packets on top of those layers aren't
>> (necessarily) considered?
> 
> The latter. All these checksums are about transport layer checksums
> (the ip header checksum is cheap to compute). Multiple checksums
> refers to packets encapsulated in other protocols with checksum, such
> as GRE or UDP based like Geneve.

If nothing else comes from this I'll at least propose doc patch to
clarify this for ppl like myself.

Thanks,
Victor

-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 19:22           ` Victor Julien
@ 2020-06-02 19:29             ` Jakub Kicinski
  2020-06-02 19:47               ` Victor Julien
  2020-06-02 19:38             ` Willem de Bruijn
  1 sibling, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2020-06-02 19:29 UTC (permalink / raw)
  To: Victor Julien
  Cc: Willem de Bruijn, Network Development, David S. Miller,
	Jonathan Corbet, Eric Dumazet, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel, Alexander Drozdov,
	Tom Herbert

On Tue, 2 Jun 2020 21:22:11 +0200 Victor Julien wrote:
> - receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for
> every packet, including the bad TCP ones
> - receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad
> packets.
> 
> Again purely based on 'git grep' it seems nfp does not support
> UNNECESSARY, while ixgbe does.
> 
> (my original testing was with the nfp only, so now I at least understand
> my original thinking)

FWIW nfp defaults to CHECKSUM_COMPLETE if the device supports it (see
if you have RXCSUM_COMPLETE in the probe logs). It supports UNNECESSARY
as well, but IDK if there is a way to choose  the preferred checksum
types in the stack :( You'd have to edit the driver and remove the
NFP_NET_CFG_CTRL_CSUM_COMPLETE from the NFP_NET_CFG_CTRL_RXCSUM_ANY
mask to switch to using UNNECESSARY.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 19:22           ` Victor Julien
  2020-06-02 19:29             ` Jakub Kicinski
@ 2020-06-02 19:38             ` Willem de Bruijn
  2020-06-02 20:05               ` Victor Julien
  1 sibling, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2020-06-02 19:38 UTC (permalink / raw)
  To: Victor Julien
  Cc: Willem de Bruijn, Network Development, David S. Miller,
	Jakub Kicinski, Jonathan Corbet, Eric Dumazet, Mao Wenan,
	Arnd Bergmann, Neil Horman, linux-doc, linux-kernel,
	Alexander Drozdov, Tom Herbert

On Tue, Jun 2, 2020 at 3:22 PM Victor Julien <victor@inliniac.net> wrote:
>
> On 02-06-2020 21:03, Willem de Bruijn wrote:
> > On Tue, Jun 2, 2020 at 2:31 PM Victor Julien <victor@inliniac.net> wrote:
> >> On 02-06-2020 19:37, Willem de Bruijn wrote:
> >>> On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@inliniac.net> wrote:
> >>>>
> >>>> On 02-06-2020 16:29, Willem de Bruijn wrote:
> >>>>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
> >>>>>>
> >>>>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
> >>>>>> that the driver has completely validated the checksums in the packet.
> >>>>>>
> >>>>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
> >>>>>> in that the new flag will only be set if all the layers are valid,
> >>>>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
> >>>>>
> >>>>> transport, not ip checksum.
> >>>>
> >>>> Allow me a n00b question: what does transport refer to here? Things like
> >>>> ethernet? It isn't clear to me from the doc.
> >>>
> >>> The TCP/UDP/.. transport protocol checksum.
> >>
> >> Hmm that is what I thought originally, but then it didn't seem to work.
> >> Hence my patch.
> >>
> >> However I just redid my testing. I took the example tpacketv3 program
> >> and added the status flag checks to the 'display()' func:
> >>
> >>                 if (ppd->tp_status & TP_STATUS_CSUM_VALID) {
> >>                         printf("TP_STATUS_CSUM_VALID, ");
> >>                 }
> >>                 if (ppd->tp_status & (1<<8)) {
> >>                         printf("TP_STATUS_CSUM_UNNECESSARY, ");
> >>
> >>                 }
> >>
> >> Then using scapy sent some packets in 2 variants:
> >> - default (good csums)
> >> - deliberately bad csums
> >> (then also added a few things like ip6 over ip)
> >>
> >>
> >> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(),
> >> iface="enp1s0") // good csums
> >>
> >> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1),
> >> iface="enp1s0") //bad tcp
> >
> > Is this a test between two machines? What is the device driver of the
> > machine receiving and printing the packet? It would be helpful to know
> > whether this uses CHECKSUM_COMPLETE or CHECKSUM_UNNECESSARY.
>
> Yes 2 machines, or actually 2 machines and a VM. The receiving Linux
> sits in a kvm vm with network pass through and uses the virtio driver
> (host uses e1000e). Based on a quick 'git grep CHECKSUM_UNNECESSARY'
> virtio seems to support that.
>
> I've done some more tests. In a pcap replay that I know contains packet
> with bad TCP csums (but good IP csums for those pkts), to a physical
> host running Ubuntu Linux kernel 5.3:
>
> - receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for
> every packet, including the bad TCP ones
> - receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad
> packets.

Great. Thanks a lot for running all these experiments.

We might have to drop the TP_STATUS_CSUM_VALID with CHECKSUM_COMPLETE
unless skb->csum_valid.

For packets with multiple transport layer checksums,
CHECKSUM_UNNECESSARY should mean that all have been verified.

I believe that in the case of multiple transport headers, csum_valid
similarly ensures all checksums up to csum_start are valid. Will need
to double check.

If so, there probably is no need for a separate new TP_STATUS.
TP_STATUS_CSUM_VALID is reported only when all checksums are valid.

> Again purely based on 'git grep' it seems nfp does not support
> UNNECESSARY, while ixgbe does.
>
> (my original testing was with the nfp only, so now I at least understand
> my original thinking)
>
>
> >>
> >> 1.2.3.4 -> 5.6.7.8, TP_STATUS_CSUM_VALID, TP_STATUS_CSUM_UNNECESSARY,
> >> rxhash: 0x81ad5744
> >> 1.2.3.4 -> 5.6.7.8, rxhash: 0x81ad5744
> >>
> >> So this suggests that what you're saying is correct, that it sets
> >> TP_STATUS_CSUM_VALID if the TCP/UDP csum (and IPv4 csum) is valid, and
> >> does not set it when either of them are invalid.
> >
> > That's not exactly what I said. It looks to me that a device that sets
> > CHECKSUM_COMPLETE will return TP_STATUS_CSUM_VALID from
> > __netif_receive_skb_core even if the TCP checksum turns out to be bad.
> > If a driver would insert such packets into the stack, that is.
>
> Ok, this might be confirmed by my nfp vs virtio/ixgbe observations
> mentioned above.
>
>
> >> I'll also re-evaluate things in Suricata.
> >>
> >>
> >> One thing I wonder if what this "at least" from the 682f048bd494 commit
> >> message means:
> >>
> >> "Introduce TP_STATUS_CSUM_VALID tp_status flag to tell the
> >>  af_packet user that at least the transport header checksum
> >>  has been already validated."
> >>
> >> For TCP/UDP there wouldn't be a higher layer with csums, right? And
> >> based on my testing it seems lower levels (at least IP) is also
> >> included. Or would that perhaps refer to something like VXLAN or Geneve
> >> over UDP? That the csums of packets on top of those layers aren't
> >> (necessarily) considered?
> >
> > The latter. All these checksums are about transport layer checksums
> > (the ip header checksum is cheap to compute). Multiple checksums
> > refers to packets encapsulated in other protocols with checksum, such
> > as GRE or UDP based like Geneve.
>
> If nothing else comes from this I'll at least propose doc patch to
> clarify this for ppl like myself.
>
> Thanks,
> Victor
>
> --
> ---------------------------------------------
> Victor Julien
> http://www.inliniac.net/
> PGP: http://www.inliniac.net/victorjulien.asc
> ---------------------------------------------
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 19:29             ` Jakub Kicinski
@ 2020-06-02 19:47               ` Victor Julien
  0 siblings, 0 replies; 17+ messages in thread
From: Victor Julien @ 2020-06-02 19:47 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Willem de Bruijn, Network Development, David S. Miller,
	Jonathan Corbet, Eric Dumazet, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel, Alexander Drozdov,
	Tom Herbert

On 02-06-2020 21:29, Jakub Kicinski wrote:
> On Tue, 2 Jun 2020 21:22:11 +0200 Victor Julien wrote:
>> - receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for
>> every packet, including the bad TCP ones
>> - receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad
>> packets.
>>
>> Again purely based on 'git grep' it seems nfp does not support
>> UNNECESSARY, while ixgbe does.
>>
>> (my original testing was with the nfp only, so now I at least understand
>> my original thinking)
> 
> FWIW nfp defaults to CHECKSUM_COMPLETE if the device supports it (see
> if you have RXCSUM_COMPLETE in the probe logs). It supports UNNECESSARY
> as well, but IDK if there is a way to choose  the preferred checksum
> types in the stack :( You'd have to edit the driver and remove the
> NFP_NET_CFG_CTRL_CSUM_COMPLETE from the NFP_NET_CFG_CTRL_RXCSUM_ANY
> mask to switch to using UNNECESSARY.
> 

I indeed have RXCSUM_COMPLETE, so that should mean it uses
CHECKSUM_COMPLETE indeed.

nfp 0000:43:00.0 eth0: CAP: 0x78140233 PROMISC RXCSUM TXCSUM GATHER TSO2
RSS2 AUTOMASK IRQMOD RXCSUM_COMPLETE

-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 19:38             ` Willem de Bruijn
@ 2020-06-02 20:05               ` Victor Julien
  2020-06-02 20:18                 ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: Victor Julien @ 2020-06-02 20:05 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Jonathan Corbet, Eric Dumazet, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel, Alexander Drozdov,
	Tom Herbert

On 02-06-2020 21:38, Willem de Bruijn wrote:
> On Tue, Jun 2, 2020 at 3:22 PM Victor Julien <victor@inliniac.net> wrote:
>>
>> On 02-06-2020 21:03, Willem de Bruijn wrote:
>>> On Tue, Jun 2, 2020 at 2:31 PM Victor Julien <victor@inliniac.net> wrote:
>>>> On 02-06-2020 19:37, Willem de Bruijn wrote:
>>>>> On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@inliniac.net> wrote:
>>>>>>
>>>>>> On 02-06-2020 16:29, Willem de Bruijn wrote:
>>>>>>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
>>>>>>>>
>>>>>>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
>>>>>>>> that the driver has completely validated the checksums in the packet.
>>>>>>>>
>>>>>>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
>>>>>>>> in that the new flag will only be set if all the layers are valid,
>>>>>>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
>>>>>>>
>>>>>>> transport, not ip checksum.
>>>>>>
>>>>>> Allow me a n00b question: what does transport refer to here? Things like
>>>>>> ethernet? It isn't clear to me from the doc.
>>>>>
>>>>> The TCP/UDP/.. transport protocol checksum.
>>>>
>>>> Hmm that is what I thought originally, but then it didn't seem to work.
>>>> Hence my patch.
>>>>
>>>> However I just redid my testing. I took the example tpacketv3 program
>>>> and added the status flag checks to the 'display()' func:
>>>>
>>>>                 if (ppd->tp_status & TP_STATUS_CSUM_VALID) {
>>>>                         printf("TP_STATUS_CSUM_VALID, ");
>>>>                 }
>>>>                 if (ppd->tp_status & (1<<8)) {
>>>>                         printf("TP_STATUS_CSUM_UNNECESSARY, ");
>>>>
>>>>                 }
>>>>
>>>> Then using scapy sent some packets in 2 variants:
>>>> - default (good csums)
>>>> - deliberately bad csums
>>>> (then also added a few things like ip6 over ip)
>>>>
>>>>
>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(),
>>>> iface="enp1s0") // good csums
>>>>
>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1),
>>>> iface="enp1s0") //bad tcp
>>>
>>> Is this a test between two machines? What is the device driver of the
>>> machine receiving and printing the packet? It would be helpful to know
>>> whether this uses CHECKSUM_COMPLETE or CHECKSUM_UNNECESSARY.
>>
>> Yes 2 machines, or actually 2 machines and a VM. The receiving Linux
>> sits in a kvm vm with network pass through and uses the virtio driver
>> (host uses e1000e). Based on a quick 'git grep CHECKSUM_UNNECESSARY'
>> virtio seems to support that.
>>
>> I've done some more tests. In a pcap replay that I know contains packet
>> with bad TCP csums (but good IP csums for those pkts), to a physical
>> host running Ubuntu Linux kernel 5.3:
>>
>> - receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for
>> every packet, including the bad TCP ones
>> - receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad
>> packets.
> 
> Great. Thanks a lot for running all these experiments.
> 
> We might have to drop the TP_STATUS_CSUM_VALID with CHECKSUM_COMPLETE
> unless skb->csum_valid.
> 
> For packets with multiple transport layer checksums,
> CHECKSUM_UNNECESSARY should mean that all have been verified.
> 
> I believe that in the case of multiple transport headers, csum_valid
> similarly ensures all checksums up to csum_start are valid. Will need
> to double check.
> 
> If so, there probably is no need for a separate new TP_STATUS.
> TP_STATUS_CSUM_VALID is reported only when all checksums are valid.

So if I understand you correctly the key may be in the call to
`skb_csum_unnecessary`:

That reads:

static inline int skb_csum_unnecessary(const struct sk_buff *skb)
{
        return ((skb->ip_summed == CHECKSUM_UNNECESSARY) ||
                skb->csum_valid ||
                (skb->ip_summed == CHECKSUM_PARTIAL &&
                 skb_checksum_start_offset(skb) >= 0));
}

But really only the first 2 conditions are reachable, as we already know
skb->ip_summed is not CHECKSUM_PARTIAL when we call it.

So our unmodified check is:

        else if (skb->pkt_type != PACKET_OUTGOING &&
                (skb->ip_summed == CHECKSUM_COMPLETE ||
		 skb->ip_summed == CHECKSUM_UNNECESSARY ||
		 skb->csum_valid))

Should this become something like:

        else if (skb->pkt_type != PACKET_OUTGOING &&
                (skb->ip_summed == CHECKSUM_COMPLETE &&
                 skb->csum_valid) ||
		 skb->ip_summed == CHECKSUM_UNNECESSARY)

Is this what you had in mind?

-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 20:05               ` Victor Julien
@ 2020-06-02 20:18                 ` Willem de Bruijn
  2020-06-02 20:29                   ` Victor Julien
  2020-06-04  9:46                   ` Victor Julien
  0 siblings, 2 replies; 17+ messages in thread
From: Willem de Bruijn @ 2020-06-02 20:18 UTC (permalink / raw)
  To: Victor Julien
  Cc: Willem de Bruijn, Network Development, David S. Miller,
	Jakub Kicinski, Jonathan Corbet, Eric Dumazet, Mao Wenan,
	Arnd Bergmann, Neil Horman, linux-doc, linux-kernel,
	Alexander Drozdov, Tom Herbert

On Tue, Jun 2, 2020 at 4:05 PM Victor Julien <victor@inliniac.net> wrote:
>
> On 02-06-2020 21:38, Willem de Bruijn wrote:
> > On Tue, Jun 2, 2020 at 3:22 PM Victor Julien <victor@inliniac.net> wrote:
> >>
> >> On 02-06-2020 21:03, Willem de Bruijn wrote:
> >>> On Tue, Jun 2, 2020 at 2:31 PM Victor Julien <victor@inliniac.net> wrote:
> >>>> On 02-06-2020 19:37, Willem de Bruijn wrote:
> >>>>> On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@inliniac.net> wrote:
> >>>>>>
> >>>>>> On 02-06-2020 16:29, Willem de Bruijn wrote:
> >>>>>>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
> >>>>>>>>
> >>>>>>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
> >>>>>>>> that the driver has completely validated the checksums in the packet.
> >>>>>>>>
> >>>>>>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
> >>>>>>>> in that the new flag will only be set if all the layers are valid,
> >>>>>>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
> >>>>>>>
> >>>>>>> transport, not ip checksum.
> >>>>>>
> >>>>>> Allow me a n00b question: what does transport refer to here? Things like
> >>>>>> ethernet? It isn't clear to me from the doc.
> >>>>>
> >>>>> The TCP/UDP/.. transport protocol checksum.
> >>>>
> >>>> Hmm that is what I thought originally, but then it didn't seem to work.
> >>>> Hence my patch.
> >>>>
> >>>> However I just redid my testing. I took the example tpacketv3 program
> >>>> and added the status flag checks to the 'display()' func:
> >>>>
> >>>>                 if (ppd->tp_status & TP_STATUS_CSUM_VALID) {
> >>>>                         printf("TP_STATUS_CSUM_VALID, ");
> >>>>                 }
> >>>>                 if (ppd->tp_status & (1<<8)) {
> >>>>                         printf("TP_STATUS_CSUM_UNNECESSARY, ");
> >>>>
> >>>>                 }
> >>>>
> >>>> Then using scapy sent some packets in 2 variants:
> >>>> - default (good csums)
> >>>> - deliberately bad csums
> >>>> (then also added a few things like ip6 over ip)
> >>>>
> >>>>
> >>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(),
> >>>> iface="enp1s0") // good csums
> >>>>
> >>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1),
> >>>> iface="enp1s0") //bad tcp
> >>>
> >>> Is this a test between two machines? What is the device driver of the
> >>> machine receiving and printing the packet? It would be helpful to know
> >>> whether this uses CHECKSUM_COMPLETE or CHECKSUM_UNNECESSARY.
> >>
> >> Yes 2 machines, or actually 2 machines and a VM. The receiving Linux
> >> sits in a kvm vm with network pass through and uses the virtio driver
> >> (host uses e1000e). Based on a quick 'git grep CHECKSUM_UNNECESSARY'
> >> virtio seems to support that.
> >>
> >> I've done some more tests. In a pcap replay that I know contains packet
> >> with bad TCP csums (but good IP csums for those pkts), to a physical
> >> host running Ubuntu Linux kernel 5.3:
> >>
> >> - receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for
> >> every packet, including the bad TCP ones
> >> - receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad
> >> packets.
> >
> > Great. Thanks a lot for running all these experiments.
> >
> > We might have to drop the TP_STATUS_CSUM_VALID with CHECKSUM_COMPLETE
> > unless skb->csum_valid.
> >
> > For packets with multiple transport layer checksums,
> > CHECKSUM_UNNECESSARY should mean that all have been verified.
> >
> > I believe that in the case of multiple transport headers, csum_valid
> > similarly ensures all checksums up to csum_start are valid. Will need
> > to double check.
> >
> > If so, there probably is no need for a separate new TP_STATUS.
> > TP_STATUS_CSUM_VALID is reported only when all checksums are valid.
>
> So if I understand you correctly the key may be in the call to
> `skb_csum_unnecessary`:
>
> That reads:
>
> static inline int skb_csum_unnecessary(const struct sk_buff *skb)
> {
>         return ((skb->ip_summed == CHECKSUM_UNNECESSARY) ||
>                 skb->csum_valid ||
>                 (skb->ip_summed == CHECKSUM_PARTIAL &&
>                  skb_checksum_start_offset(skb) >= 0));
> }
>
> But really only the first 2 conditions are reachable

.. from this codepath. That function is called in other codepaths as well.

> , as we already know
> skb->ip_summed is not CHECKSUM_PARTIAL when we call it.
>
> So our unmodified check is:
>
>         else if (skb->pkt_type != PACKET_OUTGOING &&
>                 (skb->ip_summed == CHECKSUM_COMPLETE ||
>                  skb->ip_summed == CHECKSUM_UNNECESSARY ||
>                  skb->csum_valid))
>
> Should this become something like:
>
>         else if (skb->pkt_type != PACKET_OUTGOING &&
>                 (skb->ip_summed == CHECKSUM_COMPLETE &&
>                  skb->csum_valid) ||
>                  skb->ip_summed == CHECKSUM_UNNECESSARY)
>
> Is this what you had in mind?

I don't suggest modifying skb_csum_unnecessary probably. Certainly not
until I've looked at all other callers of it.

But in case of packet sockets, yes, adding that csum_valid check is my
first rough approximation.

That said, first let's give others more familiar with
TP_STATUS_CSUM_VALID some time to comment.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 20:18                 ` Willem de Bruijn
@ 2020-06-02 20:29                   ` Victor Julien
  2020-06-04  9:46                   ` Victor Julien
  1 sibling, 0 replies; 17+ messages in thread
From: Victor Julien @ 2020-06-02 20:29 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Jonathan Corbet, Eric Dumazet, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel, Alexander Drozdov,
	Tom Herbert

On 02-06-2020 22:18, Willem de Bruijn wrote:
> On Tue, Jun 2, 2020 at 4:05 PM Victor Julien <victor@inliniac.net> wrote:
>>
>> On 02-06-2020 21:38, Willem de Bruijn wrote:
>>> On Tue, Jun 2, 2020 at 3:22 PM Victor Julien <victor@inliniac.net> wrote:
>>>>
>>>> On 02-06-2020 21:03, Willem de Bruijn wrote:
>>>>> On Tue, Jun 2, 2020 at 2:31 PM Victor Julien <victor@inliniac.net> wrote:
>>>>>> On 02-06-2020 19:37, Willem de Bruijn wrote:
>>>>>>> On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@inliniac.net> wrote:
>>>>>>>>
>>>>>>>> On 02-06-2020 16:29, Willem de Bruijn wrote:
>>>>>>>>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
>>>>>>>>>>
>>>>>>>>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
>>>>>>>>>> that the driver has completely validated the checksums in the packet.
>>>>>>>>>>
>>>>>>>>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
>>>>>>>>>> in that the new flag will only be set if all the layers are valid,
>>>>>>>>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
>>>>>>>>>
>>>>>>>>> transport, not ip checksum.
>>>>>>>>
>>>>>>>> Allow me a n00b question: what does transport refer to here? Things like
>>>>>>>> ethernet? It isn't clear to me from the doc.
>>>>>>>
>>>>>>> The TCP/UDP/.. transport protocol checksum.
>>>>>>
>>>>>> Hmm that is what I thought originally, but then it didn't seem to work.
>>>>>> Hence my patch.
>>>>>>
>>>>>> However I just redid my testing. I took the example tpacketv3 program
>>>>>> and added the status flag checks to the 'display()' func:
>>>>>>
>>>>>>                 if (ppd->tp_status & TP_STATUS_CSUM_VALID) {
>>>>>>                         printf("TP_STATUS_CSUM_VALID, ");
>>>>>>                 }
>>>>>>                 if (ppd->tp_status & (1<<8)) {
>>>>>>                         printf("TP_STATUS_CSUM_UNNECESSARY, ");
>>>>>>
>>>>>>                 }
>>>>>>
>>>>>> Then using scapy sent some packets in 2 variants:
>>>>>> - default (good csums)
>>>>>> - deliberately bad csums
>>>>>> (then also added a few things like ip6 over ip)
>>>>>>
>>>>>>
>>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(),
>>>>>> iface="enp1s0") // good csums
>>>>>>
>>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1),
>>>>>> iface="enp1s0") //bad tcp
>>>>>
>>>>> Is this a test between two machines? What is the device driver of the
>>>>> machine receiving and printing the packet? It would be helpful to know
>>>>> whether this uses CHECKSUM_COMPLETE or CHECKSUM_UNNECESSARY.
>>>>
>>>> Yes 2 machines, or actually 2 machines and a VM. The receiving Linux
>>>> sits in a kvm vm with network pass through and uses the virtio driver
>>>> (host uses e1000e). Based on a quick 'git grep CHECKSUM_UNNECESSARY'
>>>> virtio seems to support that.
>>>>
>>>> I've done some more tests. In a pcap replay that I know contains packet
>>>> with bad TCP csums (but good IP csums for those pkts), to a physical
>>>> host running Ubuntu Linux kernel 5.3:
>>>>
>>>> - receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for
>>>> every packet, including the bad TCP ones
>>>> - receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad
>>>> packets.
>>>
>>> Great. Thanks a lot for running all these experiments.
>>>
>>> We might have to drop the TP_STATUS_CSUM_VALID with CHECKSUM_COMPLETE
>>> unless skb->csum_valid.
>>>
>>> For packets with multiple transport layer checksums,
>>> CHECKSUM_UNNECESSARY should mean that all have been verified.
>>>
>>> I believe that in the case of multiple transport headers, csum_valid
>>> similarly ensures all checksums up to csum_start are valid. Will need
>>> to double check.
>>>
>>> If so, there probably is no need for a separate new TP_STATUS.
>>> TP_STATUS_CSUM_VALID is reported only when all checksums are valid.
>>
>> So if I understand you correctly the key may be in the call to
>> `skb_csum_unnecessary`:
>>
>> That reads:
>>
>> static inline int skb_csum_unnecessary(const struct sk_buff *skb)
>> {
>>         return ((skb->ip_summed == CHECKSUM_UNNECESSARY) ||
>>                 skb->csum_valid ||
>>                 (skb->ip_summed == CHECKSUM_PARTIAL &&
>>                  skb_checksum_start_offset(skb) >= 0));
>> }
>>
>> But really only the first 2 conditions are reachable
> 
> .. from this codepath. That function is called in other codepaths as well.
> 
>> , as we already know
>> skb->ip_summed is not CHECKSUM_PARTIAL when we call it.
>>
>> So our unmodified check is:
>>
>>         else if (skb->pkt_type != PACKET_OUTGOING &&
>>                 (skb->ip_summed == CHECKSUM_COMPLETE ||
>>                  skb->ip_summed == CHECKSUM_UNNECESSARY ||
>>                  skb->csum_valid))
>>
>> Should this become something like:
>>
>>         else if (skb->pkt_type != PACKET_OUTGOING &&
>>                 (skb->ip_summed == CHECKSUM_COMPLETE &&
>>                  skb->csum_valid) ||
>>                  skb->ip_summed == CHECKSUM_UNNECESSARY)
>>
>> Is this what you had in mind?
> 
> I don't suggest modifying skb_csum_unnecessary probably. Certainly not
> until I've looked at all other callers of it.

Yeah didn't read it that way. Just wanted to make clear what the actual
check for afpacket is (for my own understanding mostly)

> But in case of packet sockets, yes, adding that csum_valid check is my
> first rough approximation.
> 
> That said, first let's give others more familiar with
> TP_STATUS_CSUM_VALID some time to comment.
> 

Thanks a lot for your comments.

-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-02 20:18                 ` Willem de Bruijn
  2020-06-02 20:29                   ` Victor Julien
@ 2020-06-04  9:46                   ` Victor Julien
  2020-06-04 13:48                     ` Willem de Bruijn
  1 sibling, 1 reply; 17+ messages in thread
From: Victor Julien @ 2020-06-04  9:46 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Jonathan Corbet, Eric Dumazet, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel, Alexander Drozdov,
	Tom Herbert

On 02-06-2020 22:18, Willem de Bruijn wrote:
> On Tue, Jun 2, 2020 at 4:05 PM Victor Julien <victor@inliniac.net> wrote:
>>
>> On 02-06-2020 21:38, Willem de Bruijn wrote:
>>> On Tue, Jun 2, 2020 at 3:22 PM Victor Julien <victor@inliniac.net> wrote:
>>>>
>>>> On 02-06-2020 21:03, Willem de Bruijn wrote:
>>>>> On Tue, Jun 2, 2020 at 2:31 PM Victor Julien <victor@inliniac.net> wrote:
>>>>>> On 02-06-2020 19:37, Willem de Bruijn wrote:
>>>>>>> On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@inliniac.net> wrote:
>>>>>>>>
>>>>>>>> On 02-06-2020 16:29, Willem de Bruijn wrote:
>>>>>>>>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
>>>>>>>>>>
>>>>>>>>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
>>>>>>>>>> that the driver has completely validated the checksums in the packet.
>>>>>>>>>>
>>>>>>>>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
>>>>>>>>>> in that the new flag will only be set if all the layers are valid,
>>>>>>>>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
>>>>>>>>>
>>>>>>>>> transport, not ip checksum.
>>>>>>>>
>>>>>>>> Allow me a n00b question: what does transport refer to here? Things like
>>>>>>>> ethernet? It isn't clear to me from the doc.
>>>>>>>
>>>>>>> The TCP/UDP/.. transport protocol checksum.
>>>>>>
>>>>>> Hmm that is what I thought originally, but then it didn't seem to work.
>>>>>> Hence my patch.
>>>>>>
>>>>>> However I just redid my testing. I took the example tpacketv3 program
>>>>>> and added the status flag checks to the 'display()' func:
>>>>>>
>>>>>>                 if (ppd->tp_status & TP_STATUS_CSUM_VALID) {
>>>>>>                         printf("TP_STATUS_CSUM_VALID, ");
>>>>>>                 }
>>>>>>                 if (ppd->tp_status & (1<<8)) {
>>>>>>                         printf("TP_STATUS_CSUM_UNNECESSARY, ");
>>>>>>
>>>>>>                 }
>>>>>>
>>>>>> Then using scapy sent some packets in 2 variants:
>>>>>> - default (good csums)
>>>>>> - deliberately bad csums
>>>>>> (then also added a few things like ip6 over ip)
>>>>>>
>>>>>>
>>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(),
>>>>>> iface="enp1s0") // good csums
>>>>>>
>>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1),
>>>>>> iface="enp1s0") //bad tcp
>>>>>
>>>>> Is this a test between two machines? What is the device driver of the
>>>>> machine receiving and printing the packet? It would be helpful to know
>>>>> whether this uses CHECKSUM_COMPLETE or CHECKSUM_UNNECESSARY.
>>>>
>>>> Yes 2 machines, or actually 2 machines and a VM. The receiving Linux
>>>> sits in a kvm vm with network pass through and uses the virtio driver
>>>> (host uses e1000e). Based on a quick 'git grep CHECKSUM_UNNECESSARY'
>>>> virtio seems to support that.
>>>>
>>>> I've done some more tests. In a pcap replay that I know contains packet
>>>> with bad TCP csums (but good IP csums for those pkts), to a physical
>>>> host running Ubuntu Linux kernel 5.3:
>>>>
>>>> - receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for
>>>> every packet, including the bad TCP ones
>>>> - receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad
>>>> packets.
>>>
>>> Great. Thanks a lot for running all these experiments.
>>>
>>> We might have to drop the TP_STATUS_CSUM_VALID with CHECKSUM_COMPLETE
>>> unless skb->csum_valid.
>>>
>>> For packets with multiple transport layer checksums,
>>> CHECKSUM_UNNECESSARY should mean that all have been verified.
>>>
>>> I believe that in the case of multiple transport headers, csum_valid
>>> similarly ensures all checksums up to csum_start are valid. Will need
>>> to double check.
>>>
>>> If so, there probably is no need for a separate new TP_STATUS.
>>> TP_STATUS_CSUM_VALID is reported only when all checksums are valid.
>>
>> So if I understand you correctly the key may be in the call to
>> `skb_csum_unnecessary`:
>>
>> That reads:
>>
>> static inline int skb_csum_unnecessary(const struct sk_buff *skb)
>> {
>>         return ((skb->ip_summed == CHECKSUM_UNNECESSARY) ||
>>                 skb->csum_valid ||
>>                 (skb->ip_summed == CHECKSUM_PARTIAL &&
>>                  skb_checksum_start_offset(skb) >= 0));
>> }
>>
>> But really only the first 2 conditions are reachable
> 
> .. from this codepath. That function is called in other codepaths as well.
> 
>> , as we already know
>> skb->ip_summed is not CHECKSUM_PARTIAL when we call it.
>>
>> So our unmodified check is:
>>
>>         else if (skb->pkt_type != PACKET_OUTGOING &&
>>                 (skb->ip_summed == CHECKSUM_COMPLETE ||
>>                  skb->ip_summed == CHECKSUM_UNNECESSARY ||
>>                  skb->csum_valid))
>>
>> Should this become something like:
>>
>>         else if (skb->pkt_type != PACKET_OUTGOING &&
>>                 (skb->ip_summed == CHECKSUM_COMPLETE &&
>>                  skb->csum_valid) ||
>>                  skb->ip_summed == CHECKSUM_UNNECESSARY)
>>
>> Is this what you had in mind?
> 
> I don't suggest modifying skb_csum_unnecessary probably. Certainly not
> until I've looked at all other callers of it.
> 
> But in case of packet sockets, yes, adding that csum_valid check is my
> first rough approximation.
> 
> That said, first let's give others more familiar with
> TP_STATUS_CSUM_VALID some time to comment.
> 

I did some more experiments, on real hw this time. I made the following
change to 5.7.0 (wasn't brave enough to remote upgrade a box to netnext):

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 29bd405adbbd..3afb1913837a 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2216,8 +2216,8 @@ static int tpacket_rcv(struct sk_buff *skb, struct
net_device *dev,
        if (skb->ip_summed == CHECKSUM_PARTIAL)
                status |= TP_STATUS_CSUMNOTREADY;
        else if (skb->pkt_type != PACKET_OUTGOING &&
-                (skb->ip_summed == CHECKSUM_COMPLETE ||
-                 skb_csum_unnecessary(skb)))
+                ((skb->ip_summed == CHECKSUM_COMPLETE &&
skb->csum_valid) ||
+                  skb->ip_summed == CHECKSUM_UNNECESSARY))
                status |= TP_STATUS_CSUM_VALID;

        if (snaplen > res)

With this change it seems the TP_STATUS_CSUM_VALID flag is *never* set
for the nfp driver.

The capture on the ixgbe driver looks unchanged, but that seems to make
sense as I think it uses CHECKSUM_UNNECESSARY and not CHECKSUM_COMPLETE.

For the nfp driver I have these settings:

root@z820:~# ethtool -k ens3np0|grep rx
rx-checksumming: on
rx-vlan-offload: off [fixed]
rx-vlan-filter: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
rx-gro-list: off

Since the driver suggests 'rx-checksumming' is enabled, I'm wondering
how we can actually get a result from it? Jakub, do you know?

-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-04  9:46                   ` Victor Julien
@ 2020-06-04 13:48                     ` Willem de Bruijn
  2020-06-05 12:38                       ` Victor Julien
  0 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2020-06-04 13:48 UTC (permalink / raw)
  To: Victor Julien
  Cc: Willem de Bruijn, Network Development, David S. Miller,
	Jakub Kicinski, Jonathan Corbet, Eric Dumazet, Mao Wenan,
	Arnd Bergmann, Neil Horman, linux-doc, linux-kernel,
	Alexander Drozdov, Tom Herbert

On Thu, Jun 4, 2020 at 5:47 AM Victor Julien <victor@inliniac.net> wrote:
>
> On 02-06-2020 22:18, Willem de Bruijn wrote:
> > On Tue, Jun 2, 2020 at 4:05 PM Victor Julien <victor@inliniac.net> wrote:
> >>
> >> On 02-06-2020 21:38, Willem de Bruijn wrote:
> >>> On Tue, Jun 2, 2020 at 3:22 PM Victor Julien <victor@inliniac.net> wrote:
> >>>>
> >>>> On 02-06-2020 21:03, Willem de Bruijn wrote:
> >>>>> On Tue, Jun 2, 2020 at 2:31 PM Victor Julien <victor@inliniac.net> wrote:
> >>>>>> On 02-06-2020 19:37, Willem de Bruijn wrote:
> >>>>>>> On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@inliniac.net> wrote:
> >>>>>>>>
> >>>>>>>> On 02-06-2020 16:29, Willem de Bruijn wrote:
> >>>>>>>>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
> >>>>>>>>>> that the driver has completely validated the checksums in the packet.
> >>>>>>>>>>
> >>>>>>>>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
> >>>>>>>>>> in that the new flag will only be set if all the layers are valid,
> >>>>>>>>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
> >>>>>>>>>
> >>>>>>>>> transport, not ip checksum.
> >>>>>>>>
> >>>>>>>> Allow me a n00b question: what does transport refer to here? Things like
> >>>>>>>> ethernet? It isn't clear to me from the doc.
> >>>>>>>
> >>>>>>> The TCP/UDP/.. transport protocol checksum.
> >>>>>>
> >>>>>> Hmm that is what I thought originally, but then it didn't seem to work.
> >>>>>> Hence my patch.
> >>>>>>
> >>>>>> However I just redid my testing. I took the example tpacketv3 program
> >>>>>> and added the status flag checks to the 'display()' func:
> >>>>>>
> >>>>>>                 if (ppd->tp_status & TP_STATUS_CSUM_VALID) {
> >>>>>>                         printf("TP_STATUS_CSUM_VALID, ");
> >>>>>>                 }
> >>>>>>                 if (ppd->tp_status & (1<<8)) {
> >>>>>>                         printf("TP_STATUS_CSUM_UNNECESSARY, ");
> >>>>>>
> >>>>>>                 }
> >>>>>>
> >>>>>> Then using scapy sent some packets in 2 variants:
> >>>>>> - default (good csums)
> >>>>>> - deliberately bad csums
> >>>>>> (then also added a few things like ip6 over ip)
> >>>>>>
> >>>>>>
> >>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(),
> >>>>>> iface="enp1s0") // good csums
> >>>>>>
> >>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1),
> >>>>>> iface="enp1s0") //bad tcp
> >>>>>
> >>>>> Is this a test between two machines? What is the device driver of the
> >>>>> machine receiving and printing the packet? It would be helpful to know
> >>>>> whether this uses CHECKSUM_COMPLETE or CHECKSUM_UNNECESSARY.
> >>>>
> >>>> Yes 2 machines, or actually 2 machines and a VM. The receiving Linux
> >>>> sits in a kvm vm with network pass through and uses the virtio driver
> >>>> (host uses e1000e). Based on a quick 'git grep CHECKSUM_UNNECESSARY'
> >>>> virtio seems to support that.
> >>>>
> >>>> I've done some more tests. In a pcap replay that I know contains packet
> >>>> with bad TCP csums (but good IP csums for those pkts), to a physical
> >>>> host running Ubuntu Linux kernel 5.3:
> >>>>
> >>>> - receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for
> >>>> every packet, including the bad TCP ones
> >>>> - receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad
> >>>> packets.
> >>>
> >>> Great. Thanks a lot for running all these experiments.
> >>>
> >>> We might have to drop the TP_STATUS_CSUM_VALID with CHECKSUM_COMPLETE
> >>> unless skb->csum_valid.
> >>>
> >>> For packets with multiple transport layer checksums,
> >>> CHECKSUM_UNNECESSARY should mean that all have been verified.
> >>>
> >>> I believe that in the case of multiple transport headers, csum_valid
> >>> similarly ensures all checksums up to csum_start are valid. Will need
> >>> to double check.
> >>>
> >>> If so, there probably is no need for a separate new TP_STATUS.
> >>> TP_STATUS_CSUM_VALID is reported only when all checksums are valid.
> >>
> >> So if I understand you correctly the key may be in the call to
> >> `skb_csum_unnecessary`:
> >>
> >> That reads:
> >>
> >> static inline int skb_csum_unnecessary(const struct sk_buff *skb)
> >> {
> >>         return ((skb->ip_summed == CHECKSUM_UNNECESSARY) ||
> >>                 skb->csum_valid ||
> >>                 (skb->ip_summed == CHECKSUM_PARTIAL &&
> >>                  skb_checksum_start_offset(skb) >= 0));
> >> }
> >>
> >> But really only the first 2 conditions are reachable
> >
> > .. from this codepath. That function is called in other codepaths as well.
> >
> >> , as we already know
> >> skb->ip_summed is not CHECKSUM_PARTIAL when we call it.
> >>
> >> So our unmodified check is:
> >>
> >>         else if (skb->pkt_type != PACKET_OUTGOING &&
> >>                 (skb->ip_summed == CHECKSUM_COMPLETE ||
> >>                  skb->ip_summed == CHECKSUM_UNNECESSARY ||
> >>                  skb->csum_valid))
> >>
> >> Should this become something like:
> >>
> >>         else if (skb->pkt_type != PACKET_OUTGOING &&
> >>                 (skb->ip_summed == CHECKSUM_COMPLETE &&
> >>                  skb->csum_valid) ||
> >>                  skb->ip_summed == CHECKSUM_UNNECESSARY)
> >>
> >> Is this what you had in mind?
> >
> > I don't suggest modifying skb_csum_unnecessary probably. Certainly not
> > until I've looked at all other callers of it.
> >
> > But in case of packet sockets, yes, adding that csum_valid check is my
> > first rough approximation.
> >
> > That said, first let's give others more familiar with
> > TP_STATUS_CSUM_VALID some time to comment.
> >
>
> I did some more experiments, on real hw this time. I made the following
> change to 5.7.0 (wasn't brave enough to remote upgrade a box to netnext):
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 29bd405adbbd..3afb1913837a 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -2216,8 +2216,8 @@ static int tpacket_rcv(struct sk_buff *skb, struct
> net_device *dev,
>         if (skb->ip_summed == CHECKSUM_PARTIAL)
>                 status |= TP_STATUS_CSUMNOTREADY;
>         else if (skb->pkt_type != PACKET_OUTGOING &&
> -                (skb->ip_summed == CHECKSUM_COMPLETE ||
> -                 skb_csum_unnecessary(skb)))
> +                ((skb->ip_summed == CHECKSUM_COMPLETE &&
> skb->csum_valid) ||
> +                  skb->ip_summed == CHECKSUM_UNNECESSARY))
>                 status |= TP_STATUS_CSUM_VALID;
>
>         if (snaplen > res)
>
> With this change it seems the TP_STATUS_CSUM_VALID flag is *never* set
> for the nfp driver.

I was mistaken. skb->csum_valid only signals whether the skb->csum
field is initialized. As of commit 573e8fca255a ("net: skb_gro_checksum_*
functions") skb->csum_valid it is always set if CHECKSUM_COMPLETE.
This does not imply that the checksum field in the header is correct.

The checksum field may get checked against the known checksum of
the payload in skb->csum before __netif_receive_skb_core and thus
before packet sockets during GRO when that is enabled. But not
always. Not if the packet gets flushed, for instance, see tcp4_gro_receive.

Commit 662880f44203 ("net: Allow GRO to use and set levels of checksum
unnecessary") indicates that the original assumption in this patch
that CHECKSUM_UNNECESSARY implies all checksums being valid does not
necessarily hold. Drivers are expected to set up skb->csum_level when
they have verified more than just the inner transport header.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net-next v2] af-packet: new flag to indicate all csums are good
  2020-06-04 13:48                     ` Willem de Bruijn
@ 2020-06-05 12:38                       ` Victor Julien
  0 siblings, 0 replies; 17+ messages in thread
From: Victor Julien @ 2020-06-05 12:38 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Jonathan Corbet, Eric Dumazet, Mao Wenan, Arnd Bergmann,
	Neil Horman, linux-doc, linux-kernel, Alexander Drozdov,
	Tom Herbert

On 04-06-2020 15:48, Willem de Bruijn wrote:
> On Thu, Jun 4, 2020 at 5:47 AM Victor Julien <victor@inliniac.net> wrote:
>>
>> On 02-06-2020 22:18, Willem de Bruijn wrote:
>>> On Tue, Jun 2, 2020 at 4:05 PM Victor Julien <victor@inliniac.net> wrote:
>>>>
>>>> On 02-06-2020 21:38, Willem de Bruijn wrote:
>>>>> On Tue, Jun 2, 2020 at 3:22 PM Victor Julien <victor@inliniac.net> wrote:
>>>>>>
>>>>>> On 02-06-2020 21:03, Willem de Bruijn wrote:
>>>>>>> On Tue, Jun 2, 2020 at 2:31 PM Victor Julien <victor@inliniac.net> wrote:
>>>>>>>> On 02-06-2020 19:37, Willem de Bruijn wrote:
>>>>>>>>> On Tue, Jun 2, 2020 at 1:03 PM Victor Julien <victor@inliniac.net> wrote:
>>>>>>>>>>
>>>>>>>>>> On 02-06-2020 16:29, Willem de Bruijn wrote:
>>>>>>>>>>> On Tue, Jun 2, 2020 at 4:05 AM Victor Julien <victor@inliniac.net> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Introduce a new flag (TP_STATUS_CSUM_UNNECESSARY) to indicate
>>>>>>>>>>>> that the driver has completely validated the checksums in the packet.
>>>>>>>>>>>>
>>>>>>>>>>>> The TP_STATUS_CSUM_UNNECESSARY flag differs from TP_STATUS_CSUM_VALID
>>>>>>>>>>>> in that the new flag will only be set if all the layers are valid,
>>>>>>>>>>>> while TP_STATUS_CSUM_VALID is set as well if only the IP layer is valid.
>>>>>>>>>>>
>>>>>>>>>>> transport, not ip checksum.
>>>>>>>>>>
>>>>>>>>>> Allow me a n00b question: what does transport refer to here? Things like
>>>>>>>>>> ethernet? It isn't clear to me from the doc.
>>>>>>>>>
>>>>>>>>> The TCP/UDP/.. transport protocol checksum.
>>>>>>>>
>>>>>>>> Hmm that is what I thought originally, but then it didn't seem to work.
>>>>>>>> Hence my patch.
>>>>>>>>
>>>>>>>> However I just redid my testing. I took the example tpacketv3 program
>>>>>>>> and added the status flag checks to the 'display()' func:
>>>>>>>>
>>>>>>>>                 if (ppd->tp_status & TP_STATUS_CSUM_VALID) {
>>>>>>>>                         printf("TP_STATUS_CSUM_VALID, ");
>>>>>>>>                 }
>>>>>>>>                 if (ppd->tp_status & (1<<8)) {
>>>>>>>>                         printf("TP_STATUS_CSUM_UNNECESSARY, ");
>>>>>>>>
>>>>>>>>                 }
>>>>>>>>
>>>>>>>> Then using scapy sent some packets in 2 variants:
>>>>>>>> - default (good csums)
>>>>>>>> - deliberately bad csums
>>>>>>>> (then also added a few things like ip6 over ip)
>>>>>>>>
>>>>>>>>
>>>>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(),
>>>>>>>> iface="enp1s0") // good csums
>>>>>>>>
>>>>>>>> srp1(Ether()/IP(src="1.2.3.4", dst="5.6.7.8")/IPv6()/TCP(chksum=1),
>>>>>>>> iface="enp1s0") //bad tcp
>>>>>>>
>>>>>>> Is this a test between two machines? What is the device driver of the
>>>>>>> machine receiving and printing the packet? It would be helpful to know
>>>>>>> whether this uses CHECKSUM_COMPLETE or CHECKSUM_UNNECESSARY.
>>>>>>
>>>>>> Yes 2 machines, or actually 2 machines and a VM. The receiving Linux
>>>>>> sits in a kvm vm with network pass through and uses the virtio driver
>>>>>> (host uses e1000e). Based on a quick 'git grep CHECKSUM_UNNECESSARY'
>>>>>> virtio seems to support that.
>>>>>>
>>>>>> I've done some more tests. In a pcap replay that I know contains packet
>>>>>> with bad TCP csums (but good IP csums for those pkts), to a physical
>>>>>> host running Ubuntu Linux kernel 5.3:
>>>>>>
>>>>>> - receiver uses nfp (netronome) driver: TP_STATUS_CSUM_VALID set for
>>>>>> every packet, including the bad TCP ones
>>>>>> - receiver uses ixgbe driver: TP_STATUS_CSUM_VALID not set for the bad
>>>>>> packets.
>>>>>
>>>>> Great. Thanks a lot for running all these experiments.
>>>>>
>>>>> We might have to drop the TP_STATUS_CSUM_VALID with CHECKSUM_COMPLETE
>>>>> unless skb->csum_valid.
>>>>>
>>>>> For packets with multiple transport layer checksums,
>>>>> CHECKSUM_UNNECESSARY should mean that all have been verified.
>>>>>
>>>>> I believe that in the case of multiple transport headers, csum_valid
>>>>> similarly ensures all checksums up to csum_start are valid. Will need
>>>>> to double check.
>>>>>
>>>>> If so, there probably is no need for a separate new TP_STATUS.
>>>>> TP_STATUS_CSUM_VALID is reported only when all checksums are valid.
>>>>
>>>> So if I understand you correctly the key may be in the call to
>>>> `skb_csum_unnecessary`:
>>>>
>>>> That reads:
>>>>
>>>> static inline int skb_csum_unnecessary(const struct sk_buff *skb)
>>>> {
>>>>         return ((skb->ip_summed == CHECKSUM_UNNECESSARY) ||
>>>>                 skb->csum_valid ||
>>>>                 (skb->ip_summed == CHECKSUM_PARTIAL &&
>>>>                  skb_checksum_start_offset(skb) >= 0));
>>>> }
>>>>
>>>> But really only the first 2 conditions are reachable
>>>
>>> .. from this codepath. That function is called in other codepaths as well.
>>>
>>>> , as we already know
>>>> skb->ip_summed is not CHECKSUM_PARTIAL when we call it.
>>>>
>>>> So our unmodified check is:
>>>>
>>>>         else if (skb->pkt_type != PACKET_OUTGOING &&
>>>>                 (skb->ip_summed == CHECKSUM_COMPLETE ||
>>>>                  skb->ip_summed == CHECKSUM_UNNECESSARY ||
>>>>                  skb->csum_valid))
>>>>
>>>> Should this become something like:
>>>>
>>>>         else if (skb->pkt_type != PACKET_OUTGOING &&
>>>>                 (skb->ip_summed == CHECKSUM_COMPLETE &&
>>>>                  skb->csum_valid) ||
>>>>                  skb->ip_summed == CHECKSUM_UNNECESSARY)
>>>>
>>>> Is this what you had in mind?
>>>
>>> I don't suggest modifying skb_csum_unnecessary probably. Certainly not
>>> until I've looked at all other callers of it.
>>>
>>> But in case of packet sockets, yes, adding that csum_valid check is my
>>> first rough approximation.
>>>
>>> That said, first let's give others more familiar with
>>> TP_STATUS_CSUM_VALID some time to comment.
>>>
>>
>> I did some more experiments, on real hw this time. I made the following
>> change to 5.7.0 (wasn't brave enough to remote upgrade a box to netnext):
>>
>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>> index 29bd405adbbd..3afb1913837a 100644
>> --- a/net/packet/af_packet.c
>> +++ b/net/packet/af_packet.c
>> @@ -2216,8 +2216,8 @@ static int tpacket_rcv(struct sk_buff *skb, struct
>> net_device *dev,
>>         if (skb->ip_summed == CHECKSUM_PARTIAL)
>>                 status |= TP_STATUS_CSUMNOTREADY;
>>         else if (skb->pkt_type != PACKET_OUTGOING &&
>> -                (skb->ip_summed == CHECKSUM_COMPLETE ||
>> -                 skb_csum_unnecessary(skb)))
>> +                ((skb->ip_summed == CHECKSUM_COMPLETE &&
>> skb->csum_valid) ||
>> +                  skb->ip_summed == CHECKSUM_UNNECESSARY))
>>                 status |= TP_STATUS_CSUM_VALID;
>>
>>         if (snaplen > res)
>>
>> With this change it seems the TP_STATUS_CSUM_VALID flag is *never* set
>> for the nfp driver.
> 
> I was mistaken. skb->csum_valid only signals whether the skb->csum
> field is initialized. As of commit 573e8fca255a ("net: skb_gro_checksum_*
> functions") skb->csum_valid it is always set if CHECKSUM_COMPLETE.
> This does not imply that the checksum field in the header is correct.
> 
> The checksum field may get checked against the known checksum of
> the payload in skb->csum before __netif_receive_skb_core and thus
> before packet sockets during GRO when that is enabled. But not
> always. Not if the packet gets flushed, for instance, see tcp4_gro_receive.
> 
> Commit 662880f44203 ("net: Allow GRO to use and set levels of checksum
> unnecessary") indicates that the original assumption in this patch
> that CHECKSUM_UNNECESSARY implies all checksums being valid does not
> necessarily hold. Drivers are expected to set up skb->csum_level when
> they have verified more than just the inner transport header.
> 

I think I found another case in the kernel that does seem to assume we
can rely on skb_csum_unnecessary.

496e4ae7dc94 ("netfilter: nf_queue: add NFQA_SKB_CSUM_NOTVERIFIED info
flag") seems to try to do what I'm after for nfqueue, but with an
inverted flag. I assume that if the flag is not set (and neither
NFQA_SKB_CSUMNOTREADY) it means we should be able to infer that the
csums are valid. Otherwise, what would be the point of the flag.

The logic seems to come down to:

                csum_verify = !skb_csum_unnecessary(entskb);

(for ip_summed != CHECKSUM_PARTIAL)

The it's passed to userspace:

        if (packet->ip_summed == CHECKSUM_PARTIAL)
                flags = NFQA_SKB_CSUMNOTREADY;
        else if (csum_verify)
                flags = NFQA_SKB_CSUM_NOTVERIFIED;

So according to this code, if skb_csum_unnecessary returns false the
csums is not verified, implying that it is when skb_csum_unnecessary
returns true.

I have no idea if this can be mapped directly to af-packet like this.

Despite reading 77cffe23c1f8 ("net: Clarification of
CHECKSUM_UNNECESSARY") multiple times I'm still not sure. If we get a
straightforward IPv4/TCP or IPv6/UDP does it mean that if
CHECKSUM_UNNECESSARY is set we can trust the csums of those layers are
validated?

If properly documented that would cover all the use cases I initially
care about, although it would of course be nice if the kernel already
knows the VXLAN encapsulated traffic was also verified that we can pass
this on as well.

-- 
---------------------------------------------
Victor Julien
http://www.inliniac.net/
PGP: http://www.inliniac.net/victorjulien.asc
---------------------------------------------


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-06-05 12:38 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-02  8:05 [PATCH net-next v2] af-packet: new flag to indicate all csums are good Victor Julien
2020-06-02  8:11 ` Victor Julien
2020-06-02 14:29 ` Willem de Bruijn
2020-06-02 17:03   ` Victor Julien
2020-06-02 17:37     ` Willem de Bruijn
2020-06-02 18:31       ` Victor Julien
2020-06-02 19:03         ` Willem de Bruijn
2020-06-02 19:22           ` Victor Julien
2020-06-02 19:29             ` Jakub Kicinski
2020-06-02 19:47               ` Victor Julien
2020-06-02 19:38             ` Willem de Bruijn
2020-06-02 20:05               ` Victor Julien
2020-06-02 20:18                 ` Willem de Bruijn
2020-06-02 20:29                   ` Victor Julien
2020-06-04  9:46                   ` Victor Julien
2020-06-04 13:48                     ` Willem de Bruijn
2020-06-05 12:38                       ` Victor Julien

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).