[Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx path

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx path
@ 2015-11-05  1:02 Mark D Rustad
  2015-11-13 20:55 ` Schmitt, Phillip J
  2015-11-13 21:26 ` Alexander Duyck
  0 siblings, 2 replies; 7+ messages in thread
From: Mark D Rustad @ 2015-11-05  1:02 UTC (permalink / raw)
  To: intel-wired-lan

Check for and handle IPv6 extended headers so that tx checksum
offload can be done. Thanks to Tom Herbert for noticing this
problem. Note that the goto back to process the final protocol
value can never result in a loop, because it cannot be yet
another extended header. Handling them in this manner avoids
adding further checks to the non-extended header hot path.

Reported-by: Tom Herbert <tom@herbertland.com>
Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 1ffbe85eab7b..f5730ddbee96 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -7261,6 +7261,7 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
 			}
 		}
 
+again:
 		switch (l4_hdr) {
 		case IPPROTO_TCP:
 			type_tucmd |= IXGBE_ADVTXD_TUCMD_L4T_TCP;
@@ -7277,7 +7278,23 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
 					IXGBE_ADVTXD_L4LEN_SHIFT;
 			break;
 		default:
-			if (unlikely(net_ratelimit())) {
+			if (network_hdr.ipv4->version == 6 &&
+			    ipv6_ext_hdr(l4_hdr)) {
+				unsigned int offset = 0;
+				int ret;
+
+				ret = ipv6_find_hdr(skb, &offset, -1, NULL,
+						    NULL);
+				if (ret > 0) {
+					l4_hdr = ret;
+					goto again;
+				}
+				if (unlikely(net_ratelimit())) {
+					dev_warn(tx_ring->dev,
+						 "ipv6_find_hdr returned %d\n",
+						 ret);
+				}
+			} else if (unlikely(net_ratelimit())) {
 				dev_warn(tx_ring->dev,
 				 "partial checksum but l4 proto=%x!\n",
 				 l4_hdr);


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx path
  2015-11-05  1:02 [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx path Mark D Rustad
@ 2015-11-13 20:55 ` Schmitt, Phillip J
  2015-11-13 21:18   ` Alexander Duyck
  2015-11-13 21:26 ` Alexander Duyck
  1 sibling, 1 reply; 7+ messages in thread
From: Schmitt, Phillip J @ 2015-11-13 20:55 UTC (permalink / raw)
  To: intel-wired-lan



> -----Original Message-----
> From: Intel-wired-lan [mailto:intel-wired-lan-bounces at lists.osuosl.org] On
> Behalf Of Mark D Rustad
> Sent: Wednesday, November 04, 2015 5:02 PM
> To: intel-wired-lan at lists.osuosl.org
> Subject: [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx
> path
> 
> Check for and handle IPv6 extended headers so that tx checksum offload can be
> done. Thanks to Tom Herbert for noticing this problem. Note that the goto back
> to process the final protocol value can never result in a loop, because it cannot
> be yet another extended header. Handling them in this manner avoids adding
> further checks to the non-extended header hot path.
> 
> Reported-by: Tom Herbert <tom@herbertland.com>
> Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>

Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx path
  2015-11-13 20:55 ` Schmitt, Phillip J
@ 2015-11-13 21:18   ` Alexander Duyck
  2015-11-13 22:14     ` Rustad, Mark D
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Duyck @ 2015-11-13 21:18 UTC (permalink / raw)
  To: intel-wired-lan

On 11/13/2015 12:55 PM, Schmitt, Phillip J wrote:
>
>> -----Original Message-----
>> From: Intel-wired-lan [mailto:intel-wired-lan-bounces at lists.osuosl.org] On
>> Behalf Of Mark D Rustad
>> Sent: Wednesday, November 04, 2015 5:02 PM
>> To: intel-wired-lan at lists.osuosl.org
>> Subject: [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx
>> path
>>
>> Check for and handle IPv6 extended headers so that tx checksum offload can be
>> done. Thanks to Tom Herbert for noticing this problem. Note that the goto back
>> to process the final protocol value can never result in a loop, because it cannot
>> be yet another extended header. Handling them in this manner avoids adding
>> further checks to the non-extended header hot path.
>>
>> Reported-by: Tom Herbert <tom@herbertland.com>
>> Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>

What protocol types did you test this over?  Just curious as it seems it 
is using ip6_find_header without taking into account if we want the 
inner or outer IPv6 header.

I believe in order to get that correct there should be an offset taken 
into account so that the inner header could be found instead of the 
outer in case skb->encapsulation is set.

- Alex

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx path
  2015-11-05  1:02 [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx path Mark D Rustad
  2015-11-13 20:55 ` Schmitt, Phillip J
@ 2015-11-13 21:26 ` Alexander Duyck
  1 sibling, 0 replies; 7+ messages in thread
From: Alexander Duyck @ 2015-11-13 21:26 UTC (permalink / raw)
  To: intel-wired-lan

On 11/04/2015 05:02 PM, Mark D Rustad wrote:
> Check for and handle IPv6 extended headers so that tx checksum
> offload can be done. Thanks to Tom Herbert for noticing this
> problem. Note that the goto back to process the final protocol
> value can never result in a loop, because it cannot be yet
> another extended header. Handling them in this manner avoids
> adding further checks to the non-extended header hot path.
>
> Reported-by: Tom Herbert <tom@herbertland.com>
> Signed-off-by: Mark Rustad <mark.d.rustad@intel.com>
> ---
>   drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |   19 ++++++++++++++++++-
>   1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 1ffbe85eab7b..f5730ddbee96 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -7261,6 +7261,7 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>   			}
>   		}
>   
> +again:
>   		switch (l4_hdr) {
>   		case IPPROTO_TCP:
>   			type_tucmd |= IXGBE_ADVTXD_TUCMD_L4T_TCP;
> @@ -7277,7 +7278,23 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>   					IXGBE_ADVTXD_L4LEN_SHIFT;
>   			break;
>   		default:
> -			if (unlikely(net_ratelimit())) {
> +			if (network_hdr.ipv4->version == 6 &&
> +			    ipv6_ext_hdr(l4_hdr)) {

Instead of testing for ipv6_ext_hdr why not just test with
    if ((transport_hdr.raw - network_hdr.raw) != sizeof(struct ipv6hdr))

> +				unsigned int offset = 0;
> +				int ret;
> +
> +				ret = ipv6_find_hdr(skb, &offset, -1, NULL,
> +						    NULL);
> +				if (ret > 0) {
> +					l4_hdr = ret;
> +					goto again;
> +				}
> +				if (unlikely(net_ratelimit())) {
> +					dev_warn(tx_ring->dev,
> +						 "ipv6_find_hdr returned %d\n",
> +						 ret);
> +				}
> +			} else if (unlikely(net_ratelimit())) {
>   				dev_warn(tx_ring->dev,
>   				 "partial checksum but l4 proto=%x!\n",
>   				 l4_hdr);
>

You could also avoid most of this loop code if you just moved all this 
up into the IPv6 portion of the L3 processing.

- Alex

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx path
  2015-11-13 21:18   ` Alexander Duyck
@ 2015-11-13 22:14     ` Rustad, Mark D
  2015-11-14  1:24       ` Alexander Duyck
  0 siblings, 1 reply; 7+ messages in thread
From: Rustad, Mark D @ 2015-11-13 22:14 UTC (permalink / raw)
  To: intel-wired-lan

Alexander Duyck <alexander.duyck@gmail.com> wrote:

> What protocol types did you test this over?  Just curious as it seems it is using ip6_find_header without taking into account if we want the inner or outer IPv6 header.

You are right. The call to ipv6_find_header here will always look at the outer header, which is not correct in an encapsulated case.

> I believe in order to get that correct there should be an offset taken into account so that the inner header could be found instead of the outer in case skb->encapsulation is set.

I'll have to work that out.

>> @@ -7277,7 +7278,23 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>>  					IXGBE_ADVTXD_L4LEN_SHIFT;
>>  			break;
>>  		default:
>> -			if (unlikely(net_ratelimit())) {
>> +			if (network_hdr.ipv4->version == 6 &&
>> +			    ipv6_ext_hdr(l4_hdr)) {
> 
> Instead of testing for ipv6_ext_hdr why not just test with
>   if ((transport_hdr.raw - network_hdr.raw) != sizeof(struct ipv6hdr))

That does seem like it would work, but see below.

>> +				unsigned int offset = 0;
>> +				int ret;
>> +
>> +				ret = ipv6_find_hdr(skb, &offset, -1, NULL,
>> +						    NULL);
>> +				if (ret > 0) {
>> +					l4_hdr = ret;
>> +					goto again;
>> +				}
>> +				if (unlikely(net_ratelimit())) {
>> +					dev_warn(tx_ring->dev,
>> +						 "ipv6_find_hdr returned %d\n",
>> +						 ret);
>> +				}
>> +			} else if (unlikely(net_ratelimit())) {
>>  				dev_warn(tx_ring->dev,
>>  				 "partial checksum but l4 proto=%x!\n",
>>  				 l4_hdr);
> 
> You could also avoid most of this loop code if you just moved all this up into the IPv6 portion of the L3 processing.

I considered that, but really did not want to add any checks at all in what would normally (the non-extended header case) be the IPv6 hot path. And of course it isn't really a loop. It looks to me like if I eliminated the ipv6_ext_hdr call above, that, in this incarnation, it could be an infinite loop!

So I think the elimination of the ipv6_ext_hdr call should only be done if the code is put into the L3 processing. Is it worth another check in that hot path for something that, apparently, has never yet been used and is very unlikely to be a common case in most environments?

I prefer the current arrangement, because it puts the overhead on the less common path and no burden on the common one. I do need to fix the offset for the ipv6_find_hdr in any case - thanks for catching that!

--
Mark Rustad, Networking Division, Intel Corporation
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 841 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <http://lists.osuosl.org/pipermail/intel-wired-lan/attachments/20151113/ad7240a2/attachment.asc>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx path
  2015-11-13 22:14     ` Rustad, Mark D
@ 2015-11-14  1:24       ` Alexander Duyck
  2015-11-16 18:42         ` Rustad, Mark D
  0 siblings, 1 reply; 7+ messages in thread
From: Alexander Duyck @ 2015-11-14  1:24 UTC (permalink / raw)
  To: intel-wired-lan

On 11/13/2015 02:14 PM, Rustad, Mark D wrote:
> Alexander Duyck <alexander.duyck@gmail.com> wrote:
>
>> What protocol types did you test this over?  Just curious as it seems it is using ip6_find_header without taking into account if we want the inner or outer IPv6 header.
> You are right. The call to ipv6_find_header here will always look at the outer header, which is not correct in an encapsulated case.
>
>> I believe in order to get that correct there should be an offset taken into account so that the inner header could be found instead of the outer in case skb->encapsulation is set.
> I'll have to work that out.

If nothing else I think the offset is pretty easy to calculate since it 
should be network_hdr.raw - skb->data.

>>> @@ -7277,7 +7278,23 @@ static void ixgbe_tx_csum(struct ixgbe_ring *tx_ring,
>>>   					IXGBE_ADVTXD_L4LEN_SHIFT;
>>>   			break;
>>>   		default:
>>> -			if (unlikely(net_ratelimit())) {
>>> +			if (network_hdr.ipv4->version == 6 &&
>>> +			    ipv6_ext_hdr(l4_hdr)) {
>> Instead of testing for ipv6_ext_hdr why not just test with
>>    if ((transport_hdr.raw - network_hdr.raw) != sizeof(struct ipv6hdr))
> That does seem like it would work, but see below.
>
>>> +				unsigned int offset = 0;
>>> +				int ret;
>>> +
>>> +				ret = ipv6_find_hdr(skb, &offset, -1, NULL,
>>> +						    NULL);

I was doing some further looking into this and is there a reason for 
using ipv6_find_hdr instead of ipv6_skip_hdr?  It seems like the way you 
are using ipv6_find_hdr you aren't doing fragment verification and as a 
result you could end up trying to do a checksum on a fragment.  With 
ipv6_skip_hdr you would at least be forced to return the fragment 
pointer offset which you could test for to verify if this is a 
fragmented IPv6 frame.

>>> +				if (ret > 0) {
>>> +					l4_hdr = ret;
>>> +					goto again;
>>> +				}
>>> +				if (unlikely(net_ratelimit())) {
>>> +					dev_warn(tx_ring->dev,
>>> +						 "ipv6_find_hdr returned %d\n",
>>> +						 ret);
>>> +				}
>>> +			} else if (unlikely(net_ratelimit())) {
>>>   				dev_warn(tx_ring->dev,
>>>   				 "partial checksum but l4 proto=%x!\n",
>>>   				 l4_hdr);
>> You could also avoid most of this loop code if you just moved all this up into the IPv6 portion of the L3 processing.
>
> I considered that, but really did not want to add any checks at all in what would normally (the non-extended header case) be the IPv6 hot path. And of course it isn't really a loop. It looks to me like if I eliminated the ipv6_ext_hdr call above, that, in this incarnation, it could be an infinite loop!
>
> So I think the elimination of the ipv6_ext_hdr call should only be done if the code is put into the L3 processing. Is it worth another check in that hot path for something that, apparently, has never yet been used and is very unlikely to be a common case in most environments?
>
> I prefer the current arrangement, because it puts the overhead on the less common path and no burden on the common one. I do need to fix the offset for the ipv6_find_hdr in any case - thanks for catching that!

My thought was if you were to compare the offset, which if I am not 
mistaken you already have to compute for ip header length in the Tx 
descriptor above, then you could do an if/else where you either use the 
old approach which just returns the value or ipv6_find_hdr in the else 
case.  The total cost added to the IPv6 hot path should consist of a 
comparison and an unused jump for the case without extended headers.

Also if I am not mistaken this does have the potential for an infinite 
loop as well, as all it would take is a malformed skb with an IPv6 
extension header ending with NEXTHDR_NONE requesting a checksum offload 
wouldn't it?  From what I can tell ipv6_find_hdr can return that as a 
value if you are using the -1 target value, and it would result in you 
looping forever.  It isn't the kind of thing you would find with 
standard testing, but a fuzzer might be able to trigger it on a raw socket.

- Alex

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx path
  2015-11-14  1:24       ` Alexander Duyck
@ 2015-11-16 18:42         ` Rustad, Mark D
  0 siblings, 0 replies; 7+ messages in thread
From: Rustad, Mark D @ 2015-11-16 18:42 UTC (permalink / raw)
  To: intel-wired-lan

Alexander Duyck <alexander.duyck@gmail.com> wrote:

> On 11/13/2015 02:14 PM, Rustad, Mark D wrote:
>> Alexander Duyck <alexander.duyck@gmail.com> wrote:
>> 
>>> What protocol types did you test this over?  Just curious as it seems it is using ip6_find_header without taking into account if we want the inner or outer IPv6 header.
>> You are right. The call to ipv6_find_header here will always look at the outer header, which is not correct in an encapsulated case.
>> 
>>> I believe in order to get that correct there should be an offset taken into account so that the inner header could be found instead of the outer in case skb->encapsulation is set.
>> I'll have to work that out.
> 
> If nothing else I think the offset is pretty easy to calculate since it should be network_hdr.raw - skb->data.

> I was doing some further looking into this and is there a reason for using ipv6_find_hdr instead of ipv6_skip_hdr?  It seems like the way you are using ipv6_find_hdr you aren't doing fragment verification and as a result you could end up trying to do a checksum on a fragment.  With ipv6_skip_hdr you would at least be forced to return the fragment pointer offset which you could test for to verify if this is a fragmented IPv6 frame.

You are absolutely right about this.

> My thought was if you were to compare the offset, which if I am not mistaken you already have to compute for ip header length in the Tx descriptor above, then you could do an if/else where you either use the old approach which just returns the value or ipv6_find_hdr in the else case.  The total cost added to the IPv6 hot path should consist of a comparison and an unused jump for the case without extended headers.

Yes, it is an inexpensive check, I was just trying to avoid even that, but see below.

> Also if I am not mistaken this does have the potential for an infinite loop as well, as all it would take is a malformed skb with an IPv6 extension header ending with NEXTHDR_NONE requesting a checksum offload wouldn't it?  From what I can tell ipv6_find_hdr can return that as a value if you are using the -1 target value, and it would result in you looping forever.  It isn't the kind of thing you would find with standard testing, but a fuzzer might be able to trigger it on a raw socket.

And you are absolutely right about this too. I was sure that I had looked at the case, but you know, I think I was looking at the wrong (right) function.

I will make a new version in line with what you are suggesting. Thanks for all of your thought and effort on this. I'll be sure to add a tunneled IPv6 session in my test.

--
Mark Rustad, Networking Division, Intel Corporation
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 841 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <http://lists.osuosl.org/pipermail/intel-wired-lan/attachments/20151116/a2ef512d/attachment.asc>

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-11-16 18:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-05  1:02 [Intel-wired-lan] [PATCH] ixgbe: Handle extended IPv6 headers in tx path Mark D Rustad
2015-11-13 20:55 ` Schmitt, Phillip J
2015-11-13 21:18   ` Alexander Duyck
2015-11-13 22:14     ` Rustad, Mark D
2015-11-14  1:24       ` Alexander Duyck
2015-11-16 18:42         ` Rustad, Mark D
2015-11-13 21:26 ` Alexander Duyck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.