linux-kernel.vger.kernel.org archive mirror
* i40e X722 RSS problem with NAT-Traversal IPsec packets
@ 2019-05-01 20:52 Lennart Sorensen
  2019-05-01 22:52 ` [Intel-wired-lan] " Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-01 20:52 UTC (permalink / raw)
  To: linux-kernel; +Cc: netdev, intel-wired-lan, Jeff Kirsher, Len Sorensen

We have hit a strange problem with RSS on the X722 on our new servers
(S2600WFT based).

The RSS hash is distributing most packets across cores quite nicely, with
one exception.  ESP encapsulated in UDP is always going to queue 0 no
matter what the hash key is set to or how many cores have queues assigned.
So if terminating IPsec tunnels that are using NAT-Traversal, all packets
arrive on the same core, which clearly isn't good for scalability.
Other UDP packets are fine, TCP is fine, ICMP, ESP, etc have no problem
that we have seen, only the ESP in UDP packets.

Given the packets are UDP packets I would have hoped they would just
be distributed using the source and destination IP and port values as
other UDP packets seem to be, but they are not.  I vaguely suspect the
card's UDP tunnel handling.  For certain supported types of UDP
encapsulated IP traffic it claims to use the inner packet's values for
RSS rather than those of the UDP packet itself, but ESP in UDP is not
one of those types.  So perhaps it sees an IP packet inside a UDP packet,
decides to try to parse it, doesn't know how to handle it, and stops
without assigning any RSS value to the packet at all rather than falling
back to treating it as a plain UDP packet.  But that's just guessing
based on the documentation of the hardware capabilities.

Here is an example of a packet that always hits queue 0:

14:48:09.014360 54:ee:75:30:f1:e1 > a4:bf:01:4e:0c:87, ethertype IPv4 (0x0800), length 174: (tos 0x0, ttl 64, id 3312, offset 0, flags [DF], proto UDP (17), length 160)
    1.99.99.2.4500 > 1.99.99.1.4500: [no cksum] UDP-encap: ESP(spi=0xac11cadf,seq=0x480), length 132
        0x0000:  4500 00a0 0cf0 4000 4011 6494 0163 6302  E.....@.@.d..cc.
        0x0010:  0163 6301 1194 1194 008c 0000 ac11 cadf  .cc.............
        0x0020:  0000 0480 901d 3b39 e884 0616 fed4 3e37  ......;9......>7
        0x0030:  bb67 bca2 adac e519 c7a9 ced9 00bf 263e  .g............&>
        0x0040:  28a6 ba38 1e8c e6e3 bbf9 e093 1c49 8154  (..8.........I.T
        0x0050:  0d66 c1d5 2416 f4d2 26ec f5a1 773f 4ae2  .f..$...&...w?J.
        0x0060:  8e26 0ed8 0e5f daab 06b2 aa51 2f2f e16e  .&..._.....Q//.n
        0x0070:  22ca dd94 f499 027b 11d0 de7b 4d9d 7af1  "......{...{M.z.
        0x0080:  f468 ae0d ad41 5c96 577d 7b44 1cc4 0ba3  .h...A\.W}{D....
        0x0090:  9ff7 142f b159 c9d0 38e1 c460 120f f4bb  .../.Y..8..`....
14:48:09.014439 a4:bf:01:4e:0c:87 > 54:ee:75:30:f1:e1, ethertype IPv4 (0x0800), length 174: (tos 0x0, ttl 64, id 43796, offset 0, flags [none], proto UDP (17), length 160)
    1.99.99.1.4500 > 1.99.99.2.4500: [no cksum] UDP-encap: ESP(spi=0x47f5919c,seq=0x480), length 132
        0x0000:  4500 00a0 ab14 0000 4011 0670 0163 6301  E.......@..p.cc.
        0x0010:  0163 6302 1194 1194 008c 0000 47f5 919c  .cc.........G...
        0x0020:  0000 0480 106b cafb 14ee f75b 3533 16fb  .....k.....[53..
        0x0030:  87f5 9d90 a73b 8daf 481f 22b7 2b30 b482  .....;..H.".+0..
        0x0040:  a330 1fe4 59da a394 b48b ac77 5a96 dfac  .0..Y......wZ...
        0x0050:  4798 793a ca7e 1af2 a9a8 2f7b 9327 d5b9  G.y:.~..../{.'..
        0x0060:  f8d0 e761 c7b3 a85c c843 ec25 62b2 e083  ...a...\.C.%b...
        0x0070:  f0d5 1097 736b 051a b15d e7de 7f0e b5b7  ....sk...]......
        0x0080:  209b 4d1d af37 c1a1 09a0 a6c9 71cf 7d54  ..M..7......q.}T
        0x0090:  55c3 2797 e622 581f 09cf 9483 2ba5 e64a  U.'.."X.....+..J
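
For reference, the outer headers of the first packet above can be decoded
with a short Python sketch.  It copies the first 36 bytes of that dump
(IPv4 header, UDP header, start of the UDP payload) and classifies the
port 4500 payload the way RFC 3948 describes: four zero bytes mark IKE,
a single 0xFF byte is a NAT keepalive, anything else is ESP starting with
the SPI.  The function name classify_natt is made up for illustration and
is not part of any driver or tool mentioned in this thread.

```python
import socket
import struct

# First 36 bytes of the first packet in the dump above:
# IPv4 header (20), UDP header (8), first 8 bytes of the UDP payload.
PACKET = bytes.fromhex(
    "4500 00a0 0cf0 4000 4011 6494 0163 6302"
    "0163 6301 1194 1194 008c 0000 ac11 cadf"
    "0000 0480"
)


def classify_natt(packet: bytes):
    """Return (src, sport, dst, dport, kind, spi) for an IPv4/UDP packet.

    'kind' follows the RFC 3948 rules for traffic on UDP port 4500.
    This is an illustrative helper, not code from i40e or libreswan.
    """
    ihl = (packet[0] & 0x0F) * 4          # IPv4 header length in bytes
    if packet[9] != 17:
        raise ValueError("not a UDP packet")
    src = socket.inet_ntoa(packet[12:16])
    dst = socket.inet_ntoa(packet[16:20])
    sport, dport, udp_len, _cksum = struct.unpack("!HHHH", packet[ihl:ihl + 8])
    payload = packet[ihl + 8:]

    if 4500 not in (sport, dport):
        return src, sport, dst, dport, "plain-udp", None
    if udp_len == 9 and payload[:1] == b"\xff":
        return src, sport, dst, dport, "nat-keepalive", None
    if payload[:4] == b"\x00\x00\x00\x00":
        return src, sport, dst, dport, "ike", None   # non-ESP marker
    spi = struct.unpack("!I", payload[:4])[0]
    return src, sport, dst, dport, "esp", spi


if __name__ == "__main__":
    print(classify_natt(PACKET))
```

The outer 4-tuple recovered this way is what a plain UDP RSS hash would
be expected to use; nothing in it depends on the encrypted payload.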

This was done on 4.19.28 kernel with the i40e driver in that kernel with
libreswan for IPsec using netkey in the kernel and nat-traversal in use.
The packets are a ping echo and reply pair.  NVM version 3.49 and 4.00
tried so far.

No other network interfaces we have used have had this problem.  RSS has
always just worked until now.

-- 
Len Sorensen


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-01 20:52 i40e X722 RSS problem with NAT-Traversal IPsec packets Lennart Sorensen
@ 2019-05-01 22:52 ` Alexander Duyck
  2019-05-02 15:11   ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-01 22:52 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: LKML, Netdev, intel-wired-lan

On Wed, May 1, 2019 at 2:03 PM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
>
> We have hit a strange problem with RSS on the X722 on our new servers
> (S2600WFT based).
>
> The RSS hash is distributing most packets across cores quite nicely, with
> one exception.  ESP encapsulated in UDP is always going to queue 0 no
> matter what the hash key is set to or how many cores have queues assigned.
> So if terminating IPsec tunnels that are using NAT-Traversal, all packets
> arrive on the same core, which clearly isn't good for scalability.
> Other UDP packets are fine, TCP is fine, ICMP, ESP, etc have no problem
> that we have seen, only the ESP in UDP packets.
>
> Given the packets are UDP packets I would have hoped they would just
> be distributed using the source and destination ip and port values as
> other UDP packets seem to be, but they are not.  I vaguely suspect the
> UDP tunnel handling support the card has for this since it claims to
> use the internal packet's values for RSS rather than the UDP packet
> itself for certain supported types of UDP encapsulated IP traffic, but
> not ESP in UDP, so perhaps it sees an IP packet inside a UDP packet,
> and decides to try and parse it instead, doesn't know how to handle it
> and stops without assigning any RSS value to the packet at all rather
> than falling back to treating it as a plain UDP packet.  But that's just
> guessing based on the documentation of the hardware capabilities.
>
> Here is an example of a packet that always hits queue 0:
>
> 14:48:09.014360 54:ee:75:30:f1:e1 > a4:bf:01:4e:0c:87, ethertype IPv4 (0x0800), length 174: (tos 0x0, ttl 64, id 3312, offset 0, flags [DF], proto UDP (17), length 160)
>     1.99.99.2.4500 > 1.99.99.1.4500: [no cksum] UDP-encap: ESP(spi=0xac11cadf,seq=0x480), length 132
>         0x0000:  4500 00a0 0cf0 4000 4011 6494 0163 6302  E.....@.@.d..cc.
>         0x0010:  0163 6301 1194 1194 008c 0000 ac11 cadf  .cc.............
>         0x0020:  0000 0480 901d 3b39 e884 0616 fed4 3e37  ......;9......>7
>         0x0030:  bb67 bca2 adac e519 c7a9 ced9 00bf 263e  .g............&>
>         0x0040:  28a6 ba38 1e8c e6e3 bbf9 e093 1c49 8154  (..8.........I.T
>         0x0050:  0d66 c1d5 2416 f4d2 26ec f5a1 773f 4ae2  .f..$...&...w?J.
>         0x0060:  8e26 0ed8 0e5f daab 06b2 aa51 2f2f e16e  .&..._.....Q//.n
>         0x0070:  22ca dd94 f499 027b 11d0 de7b 4d9d 7af1  "......{...{M.z.
>         0x0080:  f468 ae0d ad41 5c96 577d 7b44 1cc4 0ba3  .h...A\.W}{D....
>         0x0090:  9ff7 142f b159 c9d0 38e1 c460 120f f4bb  .../.Y..8..`....
> 14:48:09.014439 a4:bf:01:4e:0c:87 > 54:ee:75:30:f1:e1, ethertype IPv4 (0x0800), length 174: (tos 0x0, ttl 64, id 43796, offset 0, flags [none], proto UDP (17), length 160)
>     1.99.99.1.4500 > 1.99.99.2.4500: [no cksum] UDP-encap: ESP(spi=0x47f5919c,seq=0x480), length 132
>         0x0000:  4500 00a0 ab14 0000 4011 0670 0163 6301  E.......@..p.cc.
>         0x0010:  0163 6302 1194 1194 008c 0000 47f5 919c  .cc.........G...
>         0x0020:  0000 0480 106b cafb 14ee f75b 3533 16fb  .....k.....[53..
>         0x0030:  87f5 9d90 a73b 8daf 481f 22b7 2b30 b482  .....;..H.".+0..
>         0x0040:  a330 1fe4 59da a394 b48b ac77 5a96 dfac  .0..Y......wZ...
>         0x0050:  4798 793a ca7e 1af2 a9a8 2f7b 9327 d5b9  G.y:.~..../{.'..
>         0x0060:  f8d0 e761 c7b3 a85c c843 ec25 62b2 e083  ...a...\.C.%b...
>         0x0070:  f0d5 1097 736b 051a b15d e7de 7f0e b5b7  ....sk...]......
>         0x0080:  209b 4d1d af37 c1a1 09a0 a6c9 71cf 7d54  ..M..7......q.}T
>         0x0090:  55c3 2797 e622 581f 09cf 9483 2ba5 e64a  U.'.."X.....+..J
>
> This was done on 4.19.28 kernel with the i40e driver in that kernel with
> libreswan for IPsec using netkey in the kernel and nat-traversal in use.
> The packets are a ping echo and reply pair.  NVM version 3.49 and 4.00
> tried so far.
>
> No other network interfaces we have used have had this problem.  RSS has
> always just worked until now.
>
> --
> Len Sorensen

I'm not sure how much RSS will do for you here. The source IP address
is basically your only source of entropy for RSS: the destination IP
should always be the same if you are acting as a server and terminating
packets on the local system, and the ports in your example are 4500 for
both the source and the destination.

In your testing are you only looking at a point to point connection
between two systems, or do you have multiple systems accessing the
system you are testing? I ask as the only way this should do any
traffic spreading via RSS would be if the source IPs are different and
that would require multiple client systems accessing the server.

In the case of other encapsulation types over UDP, such as VXLAN, I
know that a hash value is stored in the UDP source port location
instead of the true source port number. This allows the RSS hashing to
occur on this extra information which would allow for a greater
diversity in hash results. Depending on how you are generating the ESP
encapsulation you might look at seeing if it would be possible to have
a hash on the inner data used as the UDP source port in the outgoing
packets. This would help to resolve this sort of issue.
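
The source port trick described here can be sketched in a few lines of
Python.  This is only an illustration under stated assumptions: zlib.crc32
stands in for whatever flow hash a real encapsulator uses, the function
name entropy_source_port is hypothetical, and the 49152-65535 range is the
dynamic port range that RFC 7348 recommends for VXLAN.  It says nothing
about whether NAT-T peers would tolerate a varying source port.

```python
import socket
import struct
import zlib


def entropy_source_port(inner_src, inner_dst, inner_proto,
                        inner_sport, inner_dport,
                        low=49152, high=65535):
    """Map an inner flow to an outer UDP source port.

    All packets of one inner flow get the same port, so they stay on one
    receive queue, while different flows spread across the port range.
    crc32 is a stand-in hash chosen for this sketch only.
    """
    flow = (socket.inet_aton(inner_src) + socket.inet_aton(inner_dst)
            + struct.pack("!BHH", inner_proto, inner_sport, inner_dport))
    return low + zlib.crc32(flow) % (high - low + 1)


if __name__ == "__main__":
    # Two inner TCP flows between hypothetical hosts in 10.0.0.0/24.
    print(entropy_source_port("10.0.0.1", "10.0.0.2", 6, 40000, 443))
    print(entropy_source_port("10.0.0.1", "10.0.0.2", 6, 40001, 443))
```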

Thanks.

- Alex


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-01 22:52 ` [Intel-wired-lan] " Alexander Duyck
@ 2019-05-02 15:11   ` Lennart Sorensen
  2019-05-02 17:03     ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-02 15:11 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: LKML, Netdev, intel-wired-lan

On Wed, May 01, 2019 at 03:52:57PM -0700, Alexander Duyck wrote:
> I'm not sure how RSS will do much for you here. Basically you only
> have the source IP address as your only source of entropy when it
> comes to RSS since the destination IP should always be the same if you
> are performing a server role and terminating packets on the local
> system and as far as the ports in your example you seem to only be
> using 4500 for both the source and the destination.

I have thousands of IPsec clients connecting.  Simply treating them as
normal UDP packets would work.  The IP address is different, and often
the port too.

> In your testing are you only looking at a point to point connection
> between two systems, or do you have multiple systems accessing the
> system you are testing? I ask as the only way this should do any
> traffic spreading via RSS would be if the source IPs are different and
> that would require multiple client systems accessing the server.

I tried changing the client IP address and the RSS hash key.  It never
changed to another queue.  Something is broken.

> In the case of other encapsulation types over UDP, such as VXLAN, I
> know that a hash value is stored in the UDP source port location
> instead of the true source port number. This allows the RSS hashing to
> occur on this extra information which would allow for a greater
> diversity in hash results. Depending on how you are generating the ESP
> encapsulation you might look at seeing if it would be possible to have
> a hash on the inner data used as the UDP source port in the outgoing
> packets. This would help to resolve this sort of issue.

Well it works on every other network card except this one.  Every other
intel card in the past we have used had no problem doing this right.

You want all the packets for a given ipsec tunnel to go to the same queue.
That is not a problem here.  What you don't want is every ipsec packet
from everyone going to the same queue (always queue 0).  So simply
treating them as UDP packets with a source and destination IP and port
would work perfectly fine.  The X722 isn't doing that.  It is always
assigning a hash value of 0 to these packets.
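
To make the link between "hash value of 0" and "always queue 0" concrete,
here is a minimal Python sketch of a Toeplitz-style RSS hash over the
outer IPv4/UDP 4-tuple, followed by an indirection table lookup.  The key
is an arbitrary 40-byte value chosen for illustration, the 128-entry
table spread over 8 queues is hypothetical, and none of this is taken
from the X722 or the i40e driver.

```python
import socket
import struct

EXAMPLE_KEY = bytes(range(1, 41))          # arbitrary key, illustration only
TABLE = [i % 8 for i in range(128)]        # hypothetical table: 8 queues


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash: XOR a sliding 32-bit key window per set bit."""
    key_bits = len(key) * 8
    if len(data) * 8 + 31 > key_bits:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def rss_input_ipv4_udp(src, sport, dst, dport) -> bytes:
    """Hash input: source IP, destination IP, source port, destination port."""
    return (socket.inet_aton(src) + socket.inet_aton(dst)
            + struct.pack("!HH", sport, dport))


def queue_for_hash(hash_value: int, table) -> int:
    """Pick a queue from the low-order bits of the hash."""
    return table[hash_value % len(table)]


if __name__ == "__main__":
    for client in ("1.99.99.2", "1.99.99.3", "1.99.99.4"):
        h = toeplitz_hash(EXAMPLE_KEY,
                          rss_input_ipv4_udp(client, 4500, "1.99.99.1", 4500))
        print(client, hex(h), "-> queue", queue_for_hash(h, TABLE))
    print("no hash computed -> queue", queue_for_hash(0, TABLE))
```

A hash of 0 always selects table entry 0, which matches what is seen when
the key and the client address change but the queue does not.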

-- 
Len Sorensen


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-02 15:11   ` Lennart Sorensen
@ 2019-05-02 17:03     ` Alexander Duyck
  2019-05-02 17:16       ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-02 17:03 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: LKML, Netdev, intel-wired-lan

On Thu, May 2, 2019 at 8:11 AM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
>
> On Wed, May 01, 2019 at 03:52:57PM -0700, Alexander Duyck wrote:
> > I'm not sure how RSS will do much for you here. Basically you only
> > have the source IP address as your only source of entropy when it
> > comes to RSS since the destination IP should always be the same if you
> > are performing a server role and terminating packets on the local
> > system and as far as the ports in your example you seem to only be
> > using 4500 for both the source and the destination.
>
> I have thousands of IPsec clients connecting.  Simply treating them as
> normal UDP packets would work.  The IP address is different, and often
> the port too.

Thanks for the clarification. I just wanted to verify, since we
have had similar complaints in the past that turned out to involve
only one set of IP addresses.

> > In your testing are you only looking at a point to point connection
> > between two systems, or do you have multiple systems accessing the
> > system you are testing? I ask as the only way this should do any
> > traffic spreading via RSS would be if the source IPs are different and
> > that would require multiple client systems accessing the server.
>
> I tried changing the client IP address and the RSS hash key.  It never
> changed to another queue.  Something is broken.

Okay, so if changing the RSS hash key has no effect then it is likely
not being used.

> > In the case of other encapsulation types over UDP, such as VXLAN, I
> > know that a hash value is stored in the UDP source port location
> > instead of the true source port number. This allows the RSS hashing to
> > occur on this extra information which would allow for a greater
> > diversity in hash results. Depending on how you are generating the ESP
> > encapsulation you might look at seeing if it would be possible to have
> > a hash on the inner data used as the UDP source port in the outgoing
> > packets. This would help to resolve this sort of issue.
>
> Well it works on every other network card except this one.  Every other
> intel card in the past we have used had no problem doing this right.

The question is what is different about this card, and I don't have an
immediate answer so we would need to do some investigation.

> You want all the packets for a given ipsec tunnel to go to the same queue.
> That is not a problem here.  What you don't want is every ipsec packet
> from everyone going to the same queue (always queue 0).  So simply
> treating them as UDP packets with a source and destination IP and port
> would work perfectly fine.  The X722 isn't doing that.  It is always
> assigning a hash value of 0 to these packets.

You had stated in your earlier email that "Other UDP packets are
fine". Perhaps we need to do some further isolation to identify why
the ESP over UDP packets are not being hashed on while other UDP
packets are.

Would it be possible to provide a couple of raw Ethernet frames
instead of IP packets for us to examine? I noticed the two packets you
sent earlier didn't start until the IP header. One possibility would
be that if we had any extra outer headers or trailers added to the
packet that could possibly cause issues since that might either make
the packet not parsable or possibly flag it as some sort of length
error when the size of the packet doesn't match what is reported in
the headers.

One other thing we may want to look at doing is trying to identify the
particular part of the packets that might be causing the hash to not
be generated. One way to do that would be to use something like
netperf to generate packets and send them toward your test system.
Something like the command line below could be used to send packets
that should be similar to the ones you provided earlier:
     netperf -H <target IP> -t UDP_STREAM -N -- -P 4500,4500 -m 132

If the packets generated by netperf were not hashed that would tell us
then it may be some sort of issue with how UDP packets are being
parsed, and from there we could narrow things down by modifying port
numbers and changing packet sizes. If that does get hashed then we
need to start looking outside of the IP/UDP header parsing for
possible issues since there is likely something else causing the
issue.
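
If netperf is not at hand, the same kind of probe traffic can be
approximated with a small Python sender.  This is a sketch, not a netperf
replacement: the function names are made up, the default payload is a
non-zero fill byte so that it does not start with four zero bytes (which
RFC 3948 reserves for IKE on port 4500), and the optional prefix is there
to test whether ESP-looking leading bytes change the result.  Binding to
source port 4500 will fail if an IKE daemon already owns that port.

```python
import socket


def build_probe_payload(length: int = 132, prefix: bytes = b"",
                        fill: int = 0xA5) -> bytes:
    """Return 'length' bytes: optional prefix, then a constant fill byte."""
    if len(prefix) > length:
        raise ValueError("prefix longer than payload")
    return prefix + bytes([fill]) * (length - len(prefix))


def send_probes(target: str, dport: int = 4500, sport: int = 4500,
                count: int = 100, payload: bytes = None) -> int:
    """Send 'count' UDP datagrams from sport to target:dport."""
    if payload is None:
        payload = build_probe_payload()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", sport))
        for _ in range(count):
            sock.sendto(payload, (target, dport))
    return count


# Example (192.0.2.1 is a documentation address, replace with the target):
#   send_probes("192.0.2.1")
#   send_probes("192.0.2.1",
#               payload=build_probe_payload(prefix=bytes.fromhex("ac11cadf00000480")))
```

Comparing which receive queue the two variants land on would separate a
problem with the outer UDP headers from one triggered by payload content.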

Thanks.

- Alex


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-02 17:03     ` Alexander Duyck
@ 2019-05-02 17:16       ` Lennart Sorensen
  2019-05-02 17:28         ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-02 17:16 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: LKML, Netdev, intel-wired-lan

On Thu, May 02, 2019 at 10:03:23AM -0700, Alexander Duyck wrote:
> On Thu, May 2, 2019 at 8:11 AM Lennart Sorensen
> <lsorense@csclub.uwaterloo.ca> wrote:
> >
> > On Wed, May 01, 2019 at 03:52:57PM -0700, Alexander Duyck wrote:
> > > I'm not sure how RSS will do much for you here. Basically you only
> > > have the source IP address as your only source of entropy when it
> > > comes to RSS since the destination IP should always be the same if you
> > > are performing a server role and terminating packets on the local
> > > system and as far as the ports in your example you seem to only be
> > > using 4500 for both the source and the destination.
> >
> > I have thousands of IPsec clients connecting.  Simply treating them as
> > normal UDP packets would work.  The IP address is different, and often
> > the port too.
> 
> Thanks for the clarification. I just wanted to verify, since we
> have had similar complaints in the past that turned out to involve
> only one set of IP addresses.
> 
> > > In your testing are you only looking at a point to point connection
> > > between two systems, or do you have multiple systems accessing the
> > > system you are testing? I ask as the only way this should do any
> > > traffic spreading via RSS would be if the source IPs are different and
> > > that would require multiple client systems accessing the server.
> >
> > I tried changing the client IP address and the RSS hash key.  It never
> > changed to another queue.  Something is broken.
> 
> Okay, so if changing the RSS hash key has no effect then it is likely
> not being used.
> 
> > > In the case of other encapsulation types over UDP, such as VXLAN, I
> > > know that a hash value is stored in the UDP source port location
> > > instead of the true source port number. This allows the RSS hashing to
> > > occur on this extra information which would allow for a greater
> > > diversity in hash results. Depending on how you are generating the ESP
> > > encapsulation you might look at seeing if it would be possible to have
> > > a hash on the inner data used as the UDP source port in the outgoing
> > > packets. This would help to resolve this sort of issue.
> >
> > Well it works on every other network card except this one.  Every other
> > intel card in the past we have used had no problem doing this right.
> 
> The question is what is different about this card, and I don't have an
> immediate answer so we would need to do some investigation.

I think the firmware has a bug. :)  My first email has my speculation
of where the bug could be.

> You had stated in your earlier email that "Other UDP packets are
> fine". Perhaps we need to do some further isolation to identify why
> the ESP over UDP packets are not being hashed on while other UDP
> packets are.

Well they are IP packets encapsulated in UDP, while other UDP packets
are not IP packets encapsulated in UDP, and there is special handling
for some IP types inside UDP on this card, which is an unusual feature.
For the supported IP in UDP types, it actually is supposed to use the IP
packet inside the UDP packet to generate the RSS value, so it pretends it
wasn't even encapsulated.  But it does not handle ESP in UDP specifically,
and hence I suspect that is the problem.  I think it tries to handle the
IP in UDP and since it doesn't support ESP in UDP it fails to fall back
to using the original UDP packet for the RSS value.  That would at least
explain why regular UDP packets that don't contain an IP packet inside
are fine, but this particular type of packet is being handled wrong.

> Would it be possible to provide a couple of raw Ethernet frames
> instead of IP packets for us to examine? I noticed the two packets you
> sent earlier didn't start until the IP header. One possibility would
> be that if we had any extra outer headers or trailers added to the
> packet that could possibly cause issues since that might either make
> the packet not parsable or possibly flag it as some sort of length
> error when the size of the packet doesn't match what is reported in
> the headers.

Oh did I forget the option for that?  I can try and capture some today
with the full headers.

> One other thing we may want to look at doing is trying to identify the
> particular part of the packets that might be causing the hash to not
> be generated. One way to do that would be to use something like
> netperf to generate packets and send them toward your test system.
> Something like the command line below could be used to send packets
> that should be similar to the ones you provided earlier:
>      netperf -H <target IP> -t UDP_STREAM -N -- -P 4500,4500 -m 132
> 
> If the packets generated by netperf were not hashed that would tell us
> then it may be some sort of issue with how UDP packets are being
> parsed, and from there we could narrow things down by modifying port
> numbers and changing packet sizes. If that does get hashed then we
> need to start looking outside of the IP/UDP header parsing for
> possible issues since there is likely something else causing the
> issue.

I will see what I can do with that.

-- 
Len Sorensen


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-02 17:16       ` Lennart Sorensen
@ 2019-05-02 17:28         ` Alexander Duyck
  2019-05-02 17:55           ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-02 17:28 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: LKML, Netdev, intel-wired-lan

On Thu, May 2, 2019 at 10:16 AM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
>
> On Thu, May 02, 2019 at 10:03:23AM -0700, Alexander Duyck wrote:
> > On Thu, May 2, 2019 at 8:11 AM Lennart Sorensen
> > <lsorense@csclub.uwaterloo.ca> wrote:
> > >
> > > On Wed, May 01, 2019 at 03:52:57PM -0700, Alexander Duyck wrote:
> > > > I'm not sure how RSS will do much for you here. Basically you only
> > > > have the source IP address as your only source of entropy when it
> > > > comes to RSS since the destination IP should always be the same if you
> > > > are performing a server role and terminating packets on the local
> > > > system and as far as the ports in your example you seem to only be
> > > > using 4500 for both the source and the destination.
> > >
> > > I have thousands of IPsec clients connecting.  Simply treating them as
> > > normal UDP packets would work.  The IP address is different, and often
> > > the port too.
> >
> > Thanks for the clarification. I just wanted to verify, since we
> > have had similar complaints in the past that turned out to involve
> > only one set of IP addresses.
> >
> > > > In your testing are you only looking at a point to point connection
> > > > between two systems, or do you have multiple systems accessing the
> > > > system you are testing? I ask as the only way this should do any
> > > > traffic spreading via RSS would be if the source IPs are different and
> > > > that would require multiple client systems accessing the server.
> > >
> > > I tried changing the client IP address and the RSS hash key.  It never
> > > changed to another queue.  Something is broken.
> >
> > Okay, so if changing the RSS hash key has no effect then it is likely
> > not being used.
> >
> > > > In the case of other encapsulation types over UDP, such as VXLAN, I
> > > > know that a hash value is stored in the UDP source port location
> > > > instead of the true source port number. This allows the RSS hashing to
> > > > occur on this extra information which would allow for a greater
> > > > diversity in hash results. Depending on how you are generating the ESP
> > > > encapsulation you might look at seeing if it would be possible to have
> > > > a hash on the inner data used as the UDP source port in the outgoing
> > > > packets. This would help to resolve this sort of issue.
> > >
> > > Well it works on every other network card except this one.  Every other
> > > intel card in the past we have used had no problem doing this right.
> >
> > The question is what is different about this card, and I don't have an
> > immediate answer so we would need to do some investigation.
>
> I think the firmware has a bug. :)  My first email has my speculation
> of where the bug could be.

The thing is the firmware has to have some idea what it is dealing
with. As far as I know, port number 4500 is not being
auto-flagged as any special type. In the case of the other tunnel
types such as VXLAN, NVGRE, and GENEVE the driver has to set a port
value indicating that the port will receive special handling. If it
isn't added via i40e_udp_tunnel_add then the firmware/hardware
shouldn't know anything about the tunnel.
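
The expected behaviour described here can be written down as a toy model
in Python.  It is purely conceptual: the class and method names are
invented for this sketch (only loosely echoing i40e_udp_tunnel_add), and
it models the behaviour one would expect, not what the X722 firmware
actually does.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTunnelParser:
    """Toy model: a port gets tunnel handling only if it was registered."""
    tunnel_ports: dict = field(default_factory=dict)

    def udp_tunnel_add(self, port: int, kind: str) -> None:
        # The driver tells the device which UDP port carries which tunnel.
        self.tunnel_ports[port] = kind

    def rss_tuple(self, outer: tuple, inner: tuple = None) -> tuple:
        """Return the tuple that would feed the RSS hash.

        outer is (src, sport, dst, dport).  The inner tuple is used only
        when the outer destination port was registered as a tunnel port.
        """
        dport = outer[3]
        if dport in self.tunnel_ports and inner is not None:
            return inner
        return outer


if __name__ == "__main__":
    parser = ToyTunnelParser()
    parser.udp_tunnel_add(4789, "vxlan")
    natt = ("1.99.99.2", 4500, "1.99.99.1", 4500)
    # Port 4500 was never registered, so the outer tuple is expected.
    print(parser.rss_tuple(natt))
```

Under this model an unregistered port 4500 falls through to the outer
4-tuple, so a hash of 0 for these packets would mean the real parser
departs from the model somewhere.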

> > You had stated in your earlier email that "Other UDP packets are
> > fine". Perhaps we need to do some further isolation to identify why
> > the ESP over UDP packets are not being hashed on while other UDP
> > packets are.
>
> Well they are IP packets encapsulated in UDP, while other UDP packets
> are not IP packets encapsulated in UDP, and there is special handling
> for some IP types inside UDP on this card, which is an unusual feature.

It really isn't that unusual of a feature. Many NICs have this
functionality now. In order to support it we usually have to populate
the port values for the device so the internal parser knows to expect
them.

> For the supported IP in UDP types, it actually is supposed to use the IP
> packet inside the UDP packet to generate the RSS value, so it pretends it
> wasn't even encapsulated.  But it does not handle ESP in UDP specifically,
> and hence I suspect that is the problem.  I think it tries to handle the
> IP in UDP and since it doesn't support ESP in UDP it fails to fall back
> to using the original UDP packet for the RSS value.  That would at least
> explain why regular UDP packets that don't contain an IP packet inside
> are fine, but this particular type of packet is being handled wrong.

That is one of the reasons I suggested testing with netperf as I did
below. Basically if we construct all the outer headers the same as
your packet we can see if some specific combination is causing a
parsing issue. I tested the netperf approach on an XL710 and didn't
see any issues, but perhaps the X722 is doing something differently.

> > Would it be possible to provide a couple of raw Ethernet frames
> > instead of IP packets for us to examine? I noticed the two packets you
> > sent earlier didn't start until the IP header. One possibility would
> > be that if we had any extra outer headers or trailers added to the
> > packet that could possibly cause issues since that might either make
> > the packet not parsable or possibly flag it as some sort of length
> > error when the size of the packet doesn't match what is reported in
> > the headers.
>
> Oh did I forget the option for that?  I can try and capture some today
> with the full headers.

Thanks. If nothing else it should make it possible to just use
tcpreplay if needed to reproduce the issue.

> > One other thing we may want to look at doing is trying to identify the
> > particular part of the packets that might be causing the hash to not
> > be generated. One way to do that would be to use something like
> > netperf to generate packets and send them toward your test system.
> > Something like the command line below could be used to send packets
> > that should be similar to the ones you provided earlier:
> >      netperf -H <target IP> -t UDP_STREAM -N -- -P 4500,4500 -m 132
> >
> > If the packets generated by netperf were not hashed that would tell us
> > then it may be some sort of issue with how UDP packets are being
> > parsed, and from there we could narrow things down by modifying port
> > numbers and changing packet sizes. If that does get hashed then we
> > need to start looking outside of the IP/UDP header parsing for
> > possible issues since there is likely something else causing the
> > issue.
>
> I will see what I can do with that.
>
> --
> Len Sorensen


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-02 17:28         ` Alexander Duyck
@ 2019-05-02 17:55           ` Lennart Sorensen
  2019-05-02 18:52             ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-02 17:55 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: LKML, Netdev, intel-wired-lan

On Thu, May 02, 2019 at 10:28:22AM -0700, Alexander Duyck wrote:
> The thing is the firmware has to have some idea what it is dealing
> with. As far as I know, port number 4500 is not being
> auto-flagged as any special type. In the case of the other tunnel
> types such as VXLAN, NVGRE, and GENEVE the driver has to set a port
> value indicating that the port will receive special handling. If it
> isn't added via i40e_udp_tunnel_add then the firmware/hardware
> shouldn't know anything about the tunnel.

Well that makes some sense.  I was wondering why there didn't seem to
be an on/off switch for that feature.

> It really isn't that unusual of a feature. Many NICs have this
> functionality now. In order to support it we usually have to populate
> the port values for the device so the internal parser knows to expect
> them.
> 
> That is one of the reasons I suggested testing with netperf as I did
> below. Basically if we construct all the outer headers the same as
> your packet we can see if some specific combination is causing a
> parsing issue. I tested the netperf approach on an XL710 and didn't
> see any issues, but perhaps the X722 is doing something differently.
> 
> Thanks. If nothing else it should make it possible to just use
> tcpreplay if needed to reproduce the issue.

Here is the same kind of packet pair as before, this time with the link
level header included (I forgot to use -XX rather than -X):

13:43:49.081567 54:ee:75:30:f1:e1 > a4:bf:01:4e:0c:87, ethertype IPv4 (0x0800), length 174: (tos 0x0, ttl 64, id 21783, offset 0, flags [DF], proto UDP (17), length 160)
    1.99.99.2.4500 > 1.99.99.1.4500: [no cksum] UDP-encap: ESP(spi=0x8de82290,seq=0x6a56), length 132
        0x0000:  a4bf 014e 0c87 54ee 7530 f1e1 0800 4500  ...N..T.u0....E.
        0x0010:  00a0 5517 4000 4011 1c6d 0163 6302 0163  ..U.@.@..m.cc..c
        0x0020:  6301 1194 1194 008c 0000 8de8 2290 0000  c..........."...
        0x0030:  6a56 72da 0734 52f6 406e 9346 f946 c698  jVr..4R.@n.F.F..
        0x0040:  a38c 280c 94da 53e1 91e0 35bf 812a 4500  ..(...S...5..*E.
        0x0050:  6003 ca7d 6872 a50b d41a 5c4d 7c22 3fb8  `..}hr....\M|"?.
        0x0060:  56d8 2a0f bc3f d3a6 5853 682c 914c c1b1  V.*..?..XSh,.L..
        0x0070:  c5c3 94e8 4789 d8b4 4ab4 e5f9 d20a e5ef  ....G...J.......
        0x0080:  de1d 05dd e98a 996b 5c11 6657 b667 6af1  .......k\.fW.gj.
        0x0090:  2a97 694b 16de 74e2 f8fe 13a3 d45e e3e9  *.iK..t......^..
        0x00a0:  f0b1 b83b 99e3 55cb b40b 5ba8 9c23       ...;..U...[..#
13:43:49.081658 a4:bf:01:4e:0c:87 > 54:ee:75:30:f1:e1, ethertype IPv4 (0x0800), length 174: (tos 0x0, ttl 64, id 44552, offset 0, flags [none], proto UDP (17), length 160)
    1.99.99.1.4500 > 1.99.99.2.4500: [no cksum] UDP-encap: ESP(spi=0x1d4ecfdf,seq=0x6a56), length 132
        0x0000:  54ee 7530 f1e1 a4bf 014e 0c87 0800 4500  T.u0.....N....E.
        0x0010:  00a0 ae08 0000 4011 037c 0163 6301 0163  ......@..|.cc..c
        0x0020:  6302 1194 1194 008c 0000 1d4e cfdf 0000  c..........N....
        0x0030:  6a56 28ca 4809 8933 911d f2be 4510 e757  jV(.H..3....E..W
        0x0040:  3885 7d26 5238 8c58 38e3 6c07 2f8e 335a  8.}&R8.X8.l./.3Z
        0x0050:  6d48 2a72 4619 e8a3 c421 bc54 48b2 6239  mH*rF....!.TH.b9
        0x0060:  5e07 7e89 a68e 0161 4e6a 5b6f 8b89 9f53  ^.~....aNj[o...S
        0x0070:  4c40 1c6c d159 60f8 68e7 24db 8b21 2ec2  L@.l.Y`.h.$..!..
        0x0080:  4b67 9b83 643b b0ac 6e2d bf4f 1ee1 9508  Kg..d;..n-.O....
        0x0090:  d1bd dcd4 74ee e4dc 78d0 578a 5905 1f4d  ....t...x.W.Y..M
        0x00a0:  74be e643 910b b4d3 f428 8822 e22b       t..C.....(.".+

I will try to see what I can do with netperf.

-- 
Len Sorensen

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-02 17:55           ` Lennart Sorensen
@ 2019-05-02 18:52             ` Lennart Sorensen
  2019-05-02 20:59               ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-02 18:52 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: LKML, Netdev, intel-wired-lan

On Thu, May 02, 2019 at 01:55:13PM -0400, Lennart Sorensen wrote:
> Here are the same packets as before with the link level header included
> (I forgot to use -XX rather than -X):
> 
> 13:43:49.081567 54:ee:75:30:f1:e1 > a4:bf:01:4e:0c:87, ethertype IPv4 (0x0800), length 174: (tos 0x0, ttl 64, id 21783, offset 0, flags [DF], proto UDP (17), length 160)
>     1.99.99.2.4500 > 1.99.99.1.4500: [no cksum] UDP-encap: ESP(spi=0x8de82290,seq=0x6a56), length 132
>         0x0000:  a4bf 014e 0c87 54ee 7530 f1e1 0800 4500  ...N..T.u0....E.
>         0x0010:  00a0 5517 4000 4011 1c6d 0163 6302 0163  ..U.@.@..m.cc..c
>         0x0020:  6301 1194 1194 008c 0000 8de8 2290 0000  c..........."...
>         0x0030:  6a56 72da 0734 52f6 406e 9346 f946 c698  jVr..4R.@n.F.F..
>         0x0040:  a38c 280c 94da 53e1 91e0 35bf 812a 4500  ..(...S...5..*E.
>         0x0050:  6003 ca7d 6872 a50b d41a 5c4d 7c22 3fb8  `..}hr....\M|"?.
>         0x0060:  56d8 2a0f bc3f d3a6 5853 682c 914c c1b1  V.*..?..XSh,.L..
>         0x0070:  c5c3 94e8 4789 d8b4 4ab4 e5f9 d20a e5ef  ....G...J.......
>         0x0080:  de1d 05dd e98a 996b 5c11 6657 b667 6af1  .......k\.fW.gj.
>         0x0090:  2a97 694b 16de 74e2 f8fe 13a3 d45e e3e9  *.iK..t......^..
>         0x00a0:  f0b1 b83b 99e3 55cb b40b 5ba8 9c23       ...;..U...[..#
> 13:43:49.081658 a4:bf:01:4e:0c:87 > 54:ee:75:30:f1:e1, ethertype IPv4 (0x0800), length 174: (tos 0x0, ttl 64, id 44552, offset 0, flags [none], proto UDP (17), length 160)
>     1.99.99.1.4500 > 1.99.99.2.4500: [no cksum] UDP-encap: ESP(spi=0x1d4ecfdf,seq=0x6a56), length 132
>         0x0000:  54ee 7530 f1e1 a4bf 014e 0c87 0800 4500  T.u0.....N....E.
>         0x0010:  00a0 ae08 0000 4011 037c 0163 6301 0163  ......@..|.cc..c
>         0x0020:  6302 1194 1194 008c 0000 1d4e cfdf 0000  c..........N....
>         0x0030:  6a56 28ca 4809 8933 911d f2be 4510 e757  jV(.H..3....E..W
>         0x0040:  3885 7d26 5238 8c58 38e3 6c07 2f8e 335a  8.}&R8.X8.l./.3Z
>         0x0050:  6d48 2a72 4619 e8a3 c421 bc54 48b2 6239  mH*rF....!.TH.b9
>         0x0060:  5e07 7e89 a68e 0161 4e6a 5b6f 8b89 9f53  ^.~....aNj[o...S
>         0x0070:  4c40 1c6c d159 60f8 68e7 24db 8b21 2ec2  L@.l.Y`.h.$..!..
>         0x0080:  4b67 9b83 643b b0ac 6e2d bf4f 1ee1 9508  Kg..d;..n-.O....
>         0x0090:  d1bd dcd4 74ee e4dc 78d0 578a 5905 1f4d  ....t...x.W.Y..M
>         0x00a0:  74be e643 910b b4d3 f428 8822 e22b       t..C.....(.".+
> 
> I will try to see what I can do with netperf.

Hmm, maybe UDP isn't doing as well as I thought.

Playing with packit doing this:

packit -t UDP -d 1.99.99.1 -D 32432 -S 4500 -i enp0s25 -h -p "0x 00 11 22 33 44 55 66 77 88 99 00 11 22 33 44 55 66 77 88 99 00 11 22 33 44 55 66 77 88 99" -c 5

I have played with the source and destination port numbers, and so far
I have only managed to hit queues 0, 1 and 2 (mostly 0 and 2).  No port
number I have tried has made it hit any other queue.  That is weird.
Making random changes ought to distribute more than that.  And changing
the hkey certainly ought to make a difference, and so far it doesn't
seem to for these packets (I know I saw icmp move around just fine before
when changing the hkey).

-- 
Len Sorensen

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-02 18:52             ` Lennart Sorensen
@ 2019-05-02 20:59               ` Alexander Duyck
  2019-05-03 15:14                 ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-02 20:59 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: LKML, Netdev, intel-wired-lan

On Thu, May 2, 2019 at 11:52 AM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
>
> On Thu, May 02, 2019 at 01:55:13PM -0400, Lennart Sorensen wrote:
> > Here are the same packets as before with the link level header included
> > (I forgot to use -XX rather than -X):
> >
> > 13:43:49.081567 54:ee:75:30:f1:e1 > a4:bf:01:4e:0c:87, ethertype IPv4 (0x0800), length 174: (tos 0x0, ttl 64, id 21783, offset 0, flags [DF], proto UDP (17), length 160)
> >     1.99.99.2.4500 > 1.99.99.1.4500: [no cksum] UDP-encap: ESP(spi=0x8de82290,seq=0x6a56), length 132
> >         0x0000:  a4bf 014e 0c87 54ee 7530 f1e1 0800 4500  ...N..T.u0....E.
> >         0x0010:  00a0 5517 4000 4011 1c6d 0163 6302 0163  ..U.@.@..m.cc..c
> >         0x0020:  6301 1194 1194 008c 0000 8de8 2290 0000  c..........."...
> >         0x0030:  6a56 72da 0734 52f6 406e 9346 f946 c698  jVr..4R.@n.F.F..
> >         0x0040:  a38c 280c 94da 53e1 91e0 35bf 812a 4500  ..(...S...5..*E.
> >         0x0050:  6003 ca7d 6872 a50b d41a 5c4d 7c22 3fb8  `..}hr....\M|"?.
> >         0x0060:  56d8 2a0f bc3f d3a6 5853 682c 914c c1b1  V.*..?..XSh,.L..
> >         0x0070:  c5c3 94e8 4789 d8b4 4ab4 e5f9 d20a e5ef  ....G...J.......
> >         0x0080:  de1d 05dd e98a 996b 5c11 6657 b667 6af1  .......k\.fW.gj.
> >         0x0090:  2a97 694b 16de 74e2 f8fe 13a3 d45e e3e9  *.iK..t......^..
> >         0x00a0:  f0b1 b83b 99e3 55cb b40b 5ba8 9c23       ...;..U...[..#
> > 13:43:49.081658 a4:bf:01:4e:0c:87 > 54:ee:75:30:f1:e1, ethertype IPv4 (0x0800), length 174: (tos 0x0, ttl 64, id 44552, offset 0, flags [none], proto UDP (17), length 160)
> >     1.99.99.1.4500 > 1.99.99.2.4500: [no cksum] UDP-encap: ESP(spi=0x1d4ecfdf,seq=0x6a56), length 132
> >         0x0000:  54ee 7530 f1e1 a4bf 014e 0c87 0800 4500  T.u0.....N....E.
> >         0x0010:  00a0 ae08 0000 4011 037c 0163 6301 0163  ......@..|.cc..c
> >         0x0020:  6302 1194 1194 008c 0000 1d4e cfdf 0000  c..........N....
> >         0x0030:  6a56 28ca 4809 8933 911d f2be 4510 e757  jV(.H..3....E..W
> >         0x0040:  3885 7d26 5238 8c58 38e3 6c07 2f8e 335a  8.}&R8.X8.l./.3Z
> >         0x0050:  6d48 2a72 4619 e8a3 c421 bc54 48b2 6239  mH*rF....!.TH.b9
> >         0x0060:  5e07 7e89 a68e 0161 4e6a 5b6f 8b89 9f53  ^.~....aNj[o...S
> >         0x0070:  4c40 1c6c d159 60f8 68e7 24db 8b21 2ec2  L@.l.Y`.h.$..!..
> >         0x0080:  4b67 9b83 643b b0ac 6e2d bf4f 1ee1 9508  Kg..d;..n-.O....
> >         0x0090:  d1bd dcd4 74ee e4dc 78d0 578a 5905 1f4d  ....t...x.W.Y..M
> >         0x00a0:  74be e643 910b b4d3 f428 8822 e22b       t..C.....(.".+
> >
> > I will try to see what I can do with netperf.
>
> Hmm, maybe UDP isn't doing as well as I thought.
>
> Playing with packit doing this:
>
> packit -t UDP -d 1.99.99.1 -D 32432 -S 4500 -i enp0s25 -h -p "0x 00 11 22 33 44 55 66 77 88 99 00 11 22 33 44 55 66 77 88 99 00 11 22 33 44 55 66 77 88 99" -c 5
>
> I have played with the source and destination port numbers, and so far
> I have only managed to hit queues 0, 1 and 2 (mostly 0 and 2).  No port
> number I have tried has made it hit any other queue.  That is weird.
> Making random changes ought to distribute more than that.  And changing
> the hkey certainly ought to make a difference, and so far it doesn't
> seem to for these packets (I know I saw icmp move around just fine before
> when changing the hkey).
>
> --
> Len Sorensen

If I recall correctly RSS is only using something like the lower 9
bits (indirection table size of 512) of the resultant hash on the
X722, even fewer if you have fewer queues that are a power of 2 and
happen to program the indirection table in a round robin fashion. So
for example on my system setup with 32 queues it is technically only
using the lower 5 bits of the hash.

One issue as a result of that is that you can end up with swaths of
bits that don't really seem to impact the hash all that much since it
will never actually change those bits of the resultant hash. In order
to guarantee that every bit in the input impacts the hash you have to
make certain you have no gaps in the key wider than the bits you
examine in the final hash.

A quick and dirty way to verify that the hash key is part of the issue
would be to use something like a simple repeating value such as AA:55
as your hash key. With something like that every bit you change in the
UDP port number should result in a change in the final RSS hash for
queue counts of 3 or greater. The downside is the upper 16 bits of the
hash are identical to the lower 16 so the actual hash value itself
isn't as useful.
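
The AA:55 suggestion above can be tried offline. What follows is only a
minimal sketch: it assumes the textbook Toeplitz construction and a plain
source IP, destination IP, source port, destination port input order, which
may not match the field order the X722 really feeds into its hash. The
helper names (toeplitz_hash, udp_tuple, lut_index) are made up for
illustration and are not part of the driver.

```python
import socket
import struct


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash: XOR the 32-bit key window at every set input bit."""
    key_bits = len(key) * 8
    data_bits = len(data) * 8
    if key_bits < data_bits + 31:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i in range(data_bits):
        # Input bits are consumed most significant bit first.
        if data[i // 8] & (0x80 >> (i % 8)):
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def udp_tuple(src_ip: str, dst_ip: str, sport: int, dport: int) -> bytes:
    """Assumed hash input: src IP, dst IP, src port, dst port (network order)."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", sport, dport))


def lut_index(hash_value: int) -> int:
    """Index into a 512-entry indirection table: the low 9 bits of the hash."""
    return hash_value & 0x1FF


# 52-byte key made of the repeating AA:55 pattern from the message above.
KEY = bytes([0xAA, 0x55] * 26)

base_hash = toeplitz_hash(KEY, udp_tuple("1.99.99.2", "1.99.99.1", 4500, 4500))

# Flip each bit of the source port and record whether the table index moves.
index_changed = []
for bit in range(16):
    flipped = udp_tuple("1.99.99.2", "1.99.99.1", 4500 ^ (1 << bit), 4500)
    index_changed.append(lut_index(toeplitz_hash(KEY, flipped))
                         != lut_index(base_hash))

print("hash 0x%08x, upper half equals lower half: %s"
      % (base_hash, (base_hash >> 16) == (base_hash & 0xFFFF)))
print("every source port bit moves the table index:", all(index_changed))
```

With this key every 32-bit key window is a shifted copy of the same 16-bit
pattern, which is why the two halves of the hash match and why no single
input bit can leave the low bits untouched. A different table index does
not always mean a different queue, since several entries share a queue.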

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-02 20:59               ` Alexander Duyck
@ 2019-05-03 15:14                 ` Lennart Sorensen
  2019-05-03 17:19                   ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-03 15:14 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: LKML, Netdev, intel-wired-lan

On Thu, May 02, 2019 at 01:59:46PM -0700, Alexander Duyck wrote:
> If I recall correctly RSS is only using something like the lower 9
> bits (indirection table size of 512) of the resultant hash on the
> X722, even fewer if you have fewer queues that are a power of 2 and
> happen to program the indirection table in a round robin fashion. So
> for example on my system setup with 32 queues it is technically only
> using the lower 5 bits of the hash.
> 
> One issue as a result of that is that you can end up with swaths of
> bits that don't really seem to impact the hash all that much since it
> will never actually change those bits of the resultant hash. In order
> to guarantee that every bit in the input impacts the hash you have to
> make certain you have no gaps in the key wider than the bits you
> examine in the final hash.
> 
> A quick and dirty way to verify that the hash key is part of the issue
> would be to use something like a simple repeating value such as AA:55
> as your hash key. With something like that every bit you change in the
> UDP port number should result in a change in the final RSS hash for
> queue counts of 3 or greater. The downside is the upper 16 bits of the
> hash are identical to the lower 16 so the actual hash value itself
> isn't as useful.

OK I set the hkey to
aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55
and still only see queue 0 and 2 getting hit with a couple of dozen
different UDP port numbers I picked.  Changing the hash with ethtool to
that didn't even move where the tcp packets for my ssh connection are
going (they are always on queue 2 it seems).

Does it just not hash UDP packets correctly?  Is it even doing RSS?
(the register I checked claimed it is).

This system has 40 queues assigned by default since that is how many
CPUs there are.  Changing it to a lower number didn't make a difference
(I tried 32 and 8).

-- 
Len Sorensen

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-03 15:14                 ` Lennart Sorensen
@ 2019-05-03 17:19                   ` Alexander Duyck
  2019-05-03 20:59                     ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-03 17:19 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: LKML, Netdev, intel-wired-lan

On Fri, May 3, 2019 at 8:14 AM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
>
> On Thu, May 02, 2019 at 01:59:46PM -0700, Alexander Duyck wrote:
> > If I recall correctly RSS is only using something like the lower 9
> > bits (indirection table size of 512) of the resultant hash on the
> > X722, even fewer if you have fewer queues that are a power of 2 and
> > happen to program the indirection table in a round robin fashion. So
> > for example on my system setup with 32 queues it is technically only
> > using the lower 5 bits of the hash.
> >
> > One issue as a result of that is that you can end up with swaths of
> > bits that don't really seem to impact the hash all that much since it
> > will never actually change those bits of the resultant hash. In order
> > to guarantee that every bit in the input impacts the hash you have to
> > make certain you have no gaps in the key wider than the bits you
> > examine in the final hash.
> >
> > A quick and dirty way to verify that the hash key is part of the issue
> > would be to use something like a simple repeating value such as AA:55
> > as your hash key. With something like that every bit you change in the
> > UDP port number should result in a change in the final RSS hash for
> > queue counts of 3 or greater. The downside is the upper 16 bits of the
> > hash are identical to the lower 16 so the actual hash value itself
> > isn't as useful.
>
> OK I set the hkey to
> aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55
> and still only see queue 0 and 2 getting hit with a couple of dozen
> different UDP port numbers I picked.  Changing the hash with ethtool to
> that didn't even move where the tcp packets for my ssh connection are
> going (they are always on queue 2 it seems).

The TCP flow could be bypassing RSS and may be using ATR to decide
where the Rx packets are processed. Now that I think about it there is
a possibility that ATR could be interfering with the queue selection.
You might try disabling it by running:
    ethtool --set-priv-flags <iface> flow-director-atr off

> Does it just not hash UDP packets correctly?  Is it even doing RSS?
> (the register I checked claimed it is).

The problem is RSS can be bypassed for queue selection by things like
ATR which I called out above. One possibility is that the
encryption you were using was leaving the skb->encapsulation flag set,
and the NIC might have misidentified the packets as something it could
parse and set up a bunch of rules that were rerouting incoming traffic
based on outgoing traffic. Disabling the feature should switch off
that behavior if that is in fact the case.

> This system has 40 queues assigned by default since that is how many
> CPUs there are.  Changing it to a lower number didn't make a difference
> (I tried 32 and 8).

You are probably fine using 40 queues. That isn't an even power of two
so it would actually improve the entropy a bit since the lower bits
don't have a many:1 mapping to queues.
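
The point about queue counts can be illustrated with a small model of the
indirection table. This is a sketch only, assuming a 512-entry table filled
round robin (entry i maps to queue i modulo the queue count); the function
names are hypothetical.

```python
from collections import Counter

LUT_SIZE = 512  # indirection table size mentioned earlier in the thread


def round_robin_lut(num_queues: int, size: int = LUT_SIZE) -> list:
    """Entry i maps to queue i % num_queues."""
    return [i % num_queues for i in range(size)]


def depends_only_on_low_bits(lut: list, bits: int) -> bool:
    """True if the queue is fully determined by the low `bits` bits of the index."""
    mask = (1 << bits) - 1
    return all(lut[i] == lut[i & mask] for i in range(len(lut)))


lut32 = round_robin_lut(32)
lut40 = round_robin_lut(40)

# With 32 queues only 5 hash bits matter; with 40 queues the upper index
# bits also influence the chosen queue.
print("32 queues, low 5 bits decide:", depends_only_on_low_bits(lut32, 5))
print("40 queues, low 5 bits decide:", depends_only_on_low_bits(lut40, 5))

# 512 is not a multiple of 40, so some queues own one more entry than others.
entries_per_queue = Counter(lut40)
print("entries for queue 0:", entries_per_queue[0])
print("entries for queue 39:", entries_per_queue[39])
```

The cost of a queue count that does not divide 512 is a slightly uneven
share of table entries per queue, which this model also shows.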

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-03 17:19                   ` Alexander Duyck
@ 2019-05-03 20:59                     ` Lennart Sorensen
  2019-05-13 16:55                       ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-03 20:59 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: LKML, Netdev, intel-wired-lan

On Fri, May 03, 2019 at 10:19:47AM -0700, Alexander Duyck wrote:
> The TCP flow could be bypassing RSS and may be using ATR to decide
> where the Rx packets are processed. Now that I think about it there is
> a possibility that ATR could be interfering with the queue selection.
> You might try disabling it by running:
>     ethtool --set-priv-flags <iface> flow-director-atr off

Hmm, I thought I had killed ATR (I certainly meant to), but it appears
I had not.  I will experiment to see if that makes a difference.

> The problem is RSS can be bypassed for queue selection by things like
> ATR which I called out above. One possibility is that the
> encryption you were using was leaving the skb->encapsulation flag set,
> and the NIC might have misidentified the packets as something it could
> parse and set up a bunch of rules that were rerouting incoming traffic
> based on outgoing traffic. Disabling the feature should switch off
> that behavior if that is in fact the case.
> 
> You are probably fine using 40 queues. That isn't an even power of two
> so it would actually improve the entropy a bit since the lower bits
> don't have a many:1 mapping to queues.

I will let you know Monday how my tests go with atr off.  I really
thought that was off already since it was supposed to be.  We always
try to turn that off because it does not work well.

-- 
Len Sorensen

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-03 20:59                     ` Lennart Sorensen
@ 2019-05-13 16:55                       ` Lennart Sorensen
  2019-05-13 19:04                         ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-13 16:55 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: LKML, Netdev, intel-wired-lan

On Fri, May 03, 2019 at 04:59:35PM -0400, Lennart Sorensen wrote:
> On Fri, May 03, 2019 at 10:19:47AM -0700, Alexander Duyck wrote:
> > The TCP flow could be bypassing RSS and may be using ATR to decide
> > where the Rx packets are processed. Now that I think about it there is
> > a possibility that ATR could be interfering with the queue selection.
> > You might try disabling it by running:
> >     ethtool --set-priv-flags <iface> flow-director-atr off
> 
> Hmm, I thought I had killed ATR (I certainly meant to), but it appears
> I had not.  I will experiment to see if that makes a difference.
> 
> > The problem is RSS can be bypassed for queue selection by things like
> > ATR which I called out above. One possibility is that the
> > encryption you were using was leaving the skb->encapsulation flag set,
> > and the NIC might have misidentified the packets as something it could
> > parse and set up a bunch of rules that were rerouting incoming traffic
> > based on outgoing traffic. Disabling the feature should switch off
> > that behavior if that is in fact the case.
> > 
> > You are probably fine using 40 queues. That isn't an even power of two
> > so it would actually improve the entropy a bit since the lower bits
> > don't have a many:1 mapping to queues.
> 
> I will let you know Monday how my tests go with atr off.  I really
> thought that was off already since it was supposed to be.  We always
> try to turn that off because it does not work well.

OK it took a while to try a bunch of stuff to make sure ATR really really
was off.

I still see the problem it seems.

# ethtool --show-priv-flags eth2
Private flags for eth2:
MFP              : off
LinkPolling      : off
flow-director-atr: off
veb-stats        : off
hw-atr-eviction  : on
legacy-rx        : off

# ethtool -i eth2
driver: i40e
version: 2.1.7-k
firmware-version: 4.00 0x80001577 1.1767.0
expansion-rom-version: 
bus-info: 0000:3d:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes


Here are two packets that for some reason both go to queue 0 which
seems odd.  As far as I can tell all of the packets for UDP port 4500
traffic from any IP is going to queue 0.

UDP from 10.249.1.50:4500 to 10.249.1.1:4500 encapsulating ESP:

a4bf 014e 0c88 001f 45ff f410 0800 45e0 
0060 166e 4000 4011 0b1b 0af9 0132 0af9 
0101 1194 1194 004c 0000 0000 0201 0000 
0000 4eaf 2f76 58cd aae0 4d92 8cb7 0835 
1141 7a23 9f06 f323 b816 1a2b c88d 322c 
5f16 d4a6 ba72 7c89 2258 9d20 085e d6ed 
c7a4 5cc1 3ef2 0753 783d b691 e9d6 

UDP from 10.249.1.51:4500 to 10.249.1.1:4500 encapsulating ESP:

a4bf 014e 0c88 20f3 99ae c688 0800 45e0 
0060 1671 4000 4011 0b17 0af9 0133 0af9 
0101 1194 1194 004c 0000 0000 0200 0000 
0000 4ec5 253f 27f1 7fdd 4d82 0697 bef2 
45bd 281f 8ecf ac4f 06ed 79ba 3cbb 5eaf 
494b 146e a013 8b93 1c38 8aef da3f a73d 
6f13 5f80 e946 82e2 7da7 21e8 9d03 


# ethtool -x eth2 
RX flow hash indirection table for eth2 with 12 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11     0     1     2     3
   16:      4     5     6     7     8     9    10    11
...
  488:      8     9    10    11     0     1     2     3
  496:      4     5     6     7     8     9    10    11
  504:      0     1     2     3     4     5     6     7
RSS hash key:
60:56:66:39:8e:70:46:02:5d:33:5e:9c:5f:f6:fa:9d:ac:50:63:7c:ca:01:23:22:07:a3:8a:23:98:fd:38:5b:74:96:7e:72:0c:aa:83:fc:10:aa:6d:35:bb:8c:4e:eb:46:03:07:6a

Changing the key to:

aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55

makes no change in the queue the packets are going to.

-- 
Len Sorensen

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-13 16:55                       ` Lennart Sorensen
@ 2019-05-13 19:04                         ` Alexander Duyck
  2019-05-14 16:34                           ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-13 19:04 UTC (permalink / raw)
  To: Lennart Sorensen, Jeff Kirsher; +Cc: LKML, Netdev, intel-wired-lan

On Mon, May 13, 2019 at 9:55 AM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
>
> On Fri, May 03, 2019 at 04:59:35PM -0400, Lennart Sorensen wrote:
> > On Fri, May 03, 2019 at 10:19:47AM -0700, Alexander Duyck wrote:
> > > The TCP flow could be bypassing RSS and may be using ATR to decide
> > > where the Rx packets are processed. Now that I think about it there is
> > > a possibility that ATR could be interfering with the queue selection.
> > > You might try disabling it by running:
> > >     ethtool --set-priv-flags <iface> flow-director-atr off
> >
> > Hmm, I thought I had killed ATR (I certainly meant to), but it appears
> > I had not.  I will experiment to see if that makes a difference.
> >
> > > The problem is RSS can be bypassed for queue selection by things like
> > > ATR which I called out above. One possibility is that the
> > > encryption you were using was leaving the skb->encapsulation flag set,
> > > and the NIC might have misidentified the packets as something it could
> > > parse and set up a bunch of rules that were rerouting incoming traffic
> > > based on outgoing traffic. Disabling the feature should switch off
> > > that behavior if that is in fact the case.
> > >
> > > You are probably fine using 40 queues. That isn't an even power of two
> > > so it would actually improve the entropy a bit since the lower bits
> > > don't have a many:1 mapping to queues.
> >
> > I will let you know Monday how my tests go with atr off.  I really
> > thought that was off already since it was supposed to be.  We always
> > try to turn that off because it does not work well.
>
> OK it took a while to try a bunch of stuff to make sure ATR really really
> was off.
>
> I still see the problem it seems.
>
> # ethtool --show-priv-flags eth2
> Private flags for eth2:
> MFP              : off
> LinkPolling      : off
> flow-director-atr: off
> veb-stats        : off
> hw-atr-eviction  : on
> legacy-rx        : off
>
> # ethtool -i eth2
> driver: i40e
> version: 2.1.7-k
> firmware-version: 4.00 0x80001577 1.1767.0
> expansion-rom-version:
> bus-info: 0000:3d:00.1
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
>
>
> Here are two packets that for some reason both go to queue 0 which
> seems odd.  As far as I can tell all of the packets for UDP port 4500
> traffic from any IP is going to queue 0.
>
> UDP from 10.249.1.50:4500 to 10.249.1.1:4500 encapsulating ESP:
>
> a4bf 014e 0c88 001f 45ff f410 0800 45e0
> 0060 166e 4000 4011 0b1b 0af9 0132 0af9
> 0101 1194 1194 004c 0000 0000 0201 0000
> 0000 4eaf 2f76 58cd aae0 4d92 8cb7 0835
> 1141 7a23 9f06 f323 b816 1a2b c88d 322c
> 5f16 d4a6 ba72 7c89 2258 9d20 085e d6ed
> c7a4 5cc1 3ef2 0753 783d b691 e9d6
>
> UDP from 10.249.1.51:4500 to 10.249.1.1:4500 encapsulating ESP:
>
> a4bf 014e 0c88 20f3 99ae c688 0800 45e0
> 0060 1671 4000 4011 0b17 0af9 0133 0af9
> 0101 1194 1194 004c 0000 0000 0200 0000
> 0000 4ec5 253f 27f1 7fdd 4d82 0697 bef2
> 45bd 281f 8ecf ac4f 06ed 79ba 3cbb 5eaf
> 494b 146e a013 8b93 1c38 8aef da3f a73d
> 6f13 5f80 e946 82e2 7da7 21e8 9d03
>
>
> # ethtool -x eth2
> RX flow hash indirection table for eth2 with 12 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11     0     1     2     3
>    16:      4     5     6     7     8     9    10    11
> ...
>   488:      8     9    10    11     0     1     2     3
>   496:      4     5     6     7     8     9    10    11
>   504:      0     1     2     3     4     5     6     7
> RSS hash key:
> 60:56:66:39:8e:70:46:02:5d:33:5e:9c:5f:f6:fa:9d:ac:50:63:7c:ca:01:23:22:07:a3:8a:23:98:fd:38:5b:74:96:7e:72:0c:aa:83:fc:10:aa:6d:35:bb:8c:4e:eb:46:03:07:6a
>
> Changing the key to:
>
> aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55:aa:55
>
> makes no change in the queue the packets are going to.
>
> --
> Len Sorensen

So I recreated the first packet you listed via text2pcap, replayed it
on my test system via tcpreplay, updated my configuration to 12
queues, and used the 2 hash keys you listed. I ended up seeing the
traffic bounce between queues 4 and 8 with an X710 I had to test with
when I was changing the key value.
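
Before replaying a dump like this it can help to check that the hex really
decodes to the intended headers. The following is a rough sketch that parses
the first quoted packet and also prints it as offset-prefixed hex bytes, the
style of input text2pcap is generally fed. The helper names are invented for
this example, and the exact text2pcap invocation is left out on purpose.

```python
import ipaddress
import struct

# First packet from the quoted message, copied verbatim.
HEX_DUMP = """
a4bf 014e 0c88 001f 45ff f410 0800 45e0
0060 166e 4000 4011 0b1b 0af9 0132 0af9
0101 1194 1194 004c 0000 0000 0201 0000
0000 4eaf 2f76 58cd aae0 4d92 8cb7 0835
1141 7a23 9f06 f323 b816 1a2b c88d 322c
5f16 d4a6 ba72 7c89 2258 9d20 085e d6ed
c7a4 5cc1 3ef2 0753 783d b691 e9d6
"""

frame = bytes.fromhex("".join(HEX_DUMP.split()))


def ip_checksum_ok(header: bytes) -> bool:
    """One's complement sum over a valid IPv4 header folds to 0xFFFF."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total == 0xFFFF


def to_text2pcap(data: bytes) -> str:
    """Format bytes as '<hex offset> <byte> <byte> ...', 16 bytes per line."""
    lines = []
    for off in range(0, len(data), 16):
        chunk = data[off:off + 16]
        lines.append("%06x %s" % (off, " ".join("%02x" % b for b in chunk)))
    return "\n".join(lines)


ethertype = struct.unpack("!H", frame[12:14])[0]
ihl = (frame[14] & 0x0F) * 4                 # IPv4 header length in bytes
ip_header = frame[14:14 + ihl]
src = ipaddress.IPv4Address(ip_header[12:16])
dst = ipaddress.IPv4Address(ip_header[16:20])
udp_off = 14 + ihl
sport, dport, udp_len, udp_csum = struct.unpack("!HHHH",
                                                frame[udp_off:udp_off + 8])
spi, seq = struct.unpack("!II", frame[udp_off + 8:udp_off + 16])

print("ethertype 0x%04x, IP checksum ok: %s"
      % (ethertype, ip_checksum_ok(ip_header)))
print("%s:%d -> %s:%d, UDP length %d" % (src, sport, dst, dport, udp_len))
print("ESP spi=0x%08x seq=%d" % (spi, seq))
print(to_text2pcap(frame))
```

The dump decodes to 10.249.1.50 and 10.249.1.1 with a valid IPv4 header
checksum, with both UDP ports at 4500.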

Unfortunately I don't have an X722 to test with. I'm suspecting that
there may be some difference in the RSS setup, specifically it seems
like values in the PFQF_HENA register were changed for the X722 part
that may be causing the issues we are seeing.

I will see if I can get someone from the networking division to take a
look at this since I don't have access to the part in question nor a
datasheet for it so I am not sure if I can help much more.

Thanks.

- Alex

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-13 19:04                         ` Alexander Duyck
@ 2019-05-14 16:34                           ` Lennart Sorensen
  2019-05-16 17:10                             ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-14 16:34 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan

On Mon, May 13, 2019 at 12:04:00PM -0700, Alexander Duyck wrote:
> So I recreated the first packet you listed via text2pcap, replayed it
> on my test system via tcpreplay, updated my configuration to 12
> queues, and used the 2 hash keys you listed. I ended up seeing the
> traffic bounce between queues 4 and 8 with an X710 I had to test with
> when I was changing the key value.
> 
> Unfortunately I don't have an X722 to test with. I'm suspecting that
> there may be some difference in the RSS setup, specifically it seems
> like values in the PFQF_HENA register were changed for the X722 part
> that may be causing the issues we are seeing.
> 
> I will see if I can get someone from the networking division to take a
> look at this since I don't have access to the part in question nor a
> datasheet for it so I am not sure if I can help much more.

Great.  I hope someone can figure this out because it is working very
badly so far.

-- 
Len Sorensen

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-14 16:34                           ` Lennart Sorensen
@ 2019-05-16 17:10                             ` Alexander Duyck
  2019-05-16 18:34                               ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-16 17:10 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan

[-- Attachment #1: Type: text/plain, Size: 4225 bytes --]

On Tue, May 14, 2019 at 9:34 AM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
>
> On Mon, May 13, 2019 at 12:04:00PM -0700, Alexander Duyck wrote:
> > So I recreated the first packet you listed via text2pcap, replayed it
> > on my test system via tcpreplay, updated my configuration to 12
> > queues, and used the 2 hash keys you listed. I ended up seeing the
> > traffic bounce between queues 4 and 8 with an X710 I had to test with
> > when I was changing the key value.
> >
> > Unfortunately I don't have an X722 to test with. I'm suspecting that
> > there may be some difference in the RSS setup, specifically it seems
> > like values in the PFQF_HENA register were changed for the X722 part
> > that may be causing the issues we are seeing.
> >
> > I will see if I can get someone from the networking division to take a
> > look at this since I don't have access to the part in question nor a
> > datasheet for it so I am not sure if I can help much more.
>
> Great.  I hope someone can figure this out because it is working very
> badly so far.
>
> --
> Len Sorensen

So I was sent a link to the datasheet for the part and I have a
working theory that what we may be seeing is a problem in the firmware
for the part.

Can you try applying the attached patch and send the output from the
dmesg? Specifically I would want anything with the name "i40e" in it.
What I am looking for is something like the following:
[  294.383416] i40e 0000:81:00.1: fw 5.0.40043 api 1.5 nvm 5.04 0x800024cd 0.0.0
[  294.675039] i40e 0000:81:00.1: MAC address: 68:05:ca:37:c7:99
[  294.685941] i40e 0000:81:00.1: flow_type: 63 input_mask:0x0000000000004000
[  294.686056] i40e 0000:81:00.1: flow_type: 46 input_mask:0x0007fff800000000
[  294.686170] i40e 0000:81:00.1: flow_type: 45 input_mask:0x0007fff800000000
[  294.686284] i40e 0000:81:00.1: flow_type: 44 input_mask:0x0007ffff80000000
[  294.686399] i40e 0000:81:00.1: flow_type: 43 input_mask:0x0007fffe00000000
[  294.686513] i40e 0000:81:00.1: flow_type: 41 input_mask:0x0007fffe00000000
[  294.686628] i40e 0000:81:00.1: flow_type: 36 input_mask:0x0001801800000000
[  294.686743] i40e 0000:81:00.1: flow_type: 35 input_mask:0x0001801800000000
[  294.686858] i40e 0000:81:00.1: flow_type: 34 input_mask:0x0001801f80000000
[  294.686973] i40e 0000:81:00.1: flow_type: 33 input_mask:0x0001801e00000000
[  294.687087] i40e 0000:81:00.1: flow_type: 31 input_mask:0x0001801e00000000
[  294.691906] i40e 0000:81:00.1 ens5f1: renamed from eth0
[  294.711173] i40e 0000:81:00.1 ens5f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
[  294.759061] i40e 0000:81:00.1: PCI-Express: Speed 8.0GT/s Width x8
[  294.863363] i40e 0000:81:00.1: Features: PF-id[1] VFs: 32 VSIs: 34 QP: 32 RSS FD_ATR FD_SB NTUPLE VxLAN Geneve PTP VEPA

With that we can tell what flow types are enabled, and what input
fields are enabled for each flow type. My suspicion is that the two
new flow types added to the X722 for UDP, 29 and 30, may not match type
31, which is the current flow type supported on the X710.
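
As an illustration of how those debug lines come about, here is a
minimal userspace sketch (not part of the patch). It assumes a
hypothetical struct inset_regs standing in for the two 32-bit
I40E_GLQF_HASH_INSET registers of one flow type; the helper names are
invented for this example, and the real driver reads the registers with
i40e_read_rx_ctl().

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the two 32-bit hash input-set registers of
 * one flow type. The driver reads them with i40e_read_rx_ctl(). */
struct inset_regs {
	uint32_t lo;	/* index 0: mask bits 31..0 */
	uint32_t hi;	/* index 1: mask bits 63..32 */
};

/* Combine the two halves into one 64-bit input mask. */
static uint64_t inset_value(const struct inset_regs *r)
{
	return ((uint64_t)r->hi << 32) | r->lo;
}

/* Walk the enable bitmap from bit 63 down to bit 0, the same order as
 * the debug loop, storing each enabled flow type in out[] (which must
 * have room for 64 entries). Returns the number of enabled types. */
static int enabled_flow_types(uint64_t hena, int *out)
{
	int n = 0;

	assert(out != NULL);
	for (int ft = 64; ft--;) {
		if (hena & (1ULL << ft))
			out[n++] = ft;
	}
	return n;
}

/* Format one line like the debug patch does: high word first, then the
 * low word, each as eight hex digits. Returns snprintf()'s result. */
static int format_line(char *buf, size_t len, int ft,
		       const struct inset_regs *r)
{
	return snprintf(buf, len,
			"flow_type: %d input_mask:0x%08" PRIx32 "%08" PRIx32,
			ft, r->hi, r->lo);
}
```

Printing the high word before the low word is what makes the two
"%08x" fields read as a single 64-bit hex value, and the descending
loop is why flow type 63 is listed first in the output.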

I have included a copy inline below in case the patch is stripped,
however I suspect it will not apply cleanly as the mail client I am
using usually ends up causing white space mangling by replacing tabs
with spaces.

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 65c2b9d2652b..0c93859f8184 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -10998,6 +10998,15 @@ static int i40e_pf_config_rss(struct i40e_pf *pf)
                ((u64)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1)) << 32);
        hena |= i40e_pf_get_default_rss_hena(pf);

+       for (ret = 64; ret--;) {
+               if (!(hena & (1ull << ret)))
+                       continue;
+               dev_info(&pf->pdev->dev, "flow_type: %d input_mask:0x%08x%08x\n",
+                        ret,
+                        i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(1, ret)),
+                        i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(0, ret)));
+       }
+
        i40e_write_rx_ctl(hw, I40E_PFQF_HENA(0), (u32)hena);
        i40e_write_rx_ctl(hw, I40E_PFQF_HENA(1), (u32)(hena >> 32));

[-- Attachment #2: i40e-debug-hash-inputs.patch --]
[-- Type: text/x-patch, Size: 999 bytes --]

i40e: Debug hash inputs

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>


---
 drivers/net/ethernet/intel/i40e/i40e_main.c |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 65c2b9d2652b..0c93859f8184 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -10998,6 +10998,15 @@ static int i40e_pf_config_rss(struct i40e_pf *pf)
 		((u64)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1)) << 32);
 	hena |= i40e_pf_get_default_rss_hena(pf);
 
+	for (ret = 64; ret--;) {
+		if (!(hena & (1ull << ret)))
+			continue;
+		dev_info(&pf->pdev->dev, "flow_type: %d input_mask:0x%08x%08x\n",
+			 ret,
+			 i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(1, ret)),
+			 i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(0, ret)));
+	}
+
 	i40e_write_rx_ctl(hw, I40E_PFQF_HENA(0), (u32)hena);
 	i40e_write_rx_ctl(hw, I40E_PFQF_HENA(1), (u32)(hena >> 32));
 


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-16 17:10                             ` Alexander Duyck
@ 2019-05-16 18:34                               ` Lennart Sorensen
  2019-05-16 18:37                                 ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-16 18:34 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan

On Thu, May 16, 2019 at 10:10:55AM -0700, Alexander Duyck wrote:
> So I was sent a link to the datasheet for the part and I have a
> working theory that what we may be seeing is a problem in the firmware
> for the part.
> 
> Can you try applying the attached patch and send the output from the
> dmesg? Specifically I would want anything with the name "i40e" in it.
> What I am looking for is something like the following:
> [  294.383416] i40e 0000:81:00.1: fw 5.0.40043 api 1.5 nvm 5.04 0x800024cd 0.0.0
> [  294.675039] i40e 0000:81:00.1: MAC address: 68:05:ca:37:c7:99
> [  294.685941] i40e 0000:81:00.1: flow_type: 63 input_mask:0x0000000000004000
> [  294.686056] i40e 0000:81:00.1: flow_type: 46 input_mask:0x0007fff800000000
> [  294.686170] i40e 0000:81:00.1: flow_type: 45 input_mask:0x0007fff800000000
> [  294.686284] i40e 0000:81:00.1: flow_type: 44 input_mask:0x0007ffff80000000
> [  294.686399] i40e 0000:81:00.1: flow_type: 43 input_mask:0x0007fffe00000000
> [  294.686513] i40e 0000:81:00.1: flow_type: 41 input_mask:0x0007fffe00000000
> [  294.686628] i40e 0000:81:00.1: flow_type: 36 input_mask:0x0001801800000000
> [  294.686743] i40e 0000:81:00.1: flow_type: 35 input_mask:0x0001801800000000
> [  294.686858] i40e 0000:81:00.1: flow_type: 34 input_mask:0x0001801f80000000
> [  294.686973] i40e 0000:81:00.1: flow_type: 33 input_mask:0x0001801e00000000
> [  294.687087] i40e 0000:81:00.1: flow_type: 31 input_mask:0x0001801e00000000
> [  294.691906] i40e 0000:81:00.1 ens5f1: renamed from eth0
> [  294.711173] i40e 0000:81:00.1 ens5f1: NIC Link is Up, 10 Gbps Full Duplex, Flow Control: None
> [  294.759061] i40e 0000:81:00.1: PCI-Express: Speed 8.0GT/s Width x8
> [  294.863363] i40e 0000:81:00.1: Features: PF-id[1] VFs: 32 VSIs: 34 QP: 32 RSS FD_ATR FD_SB NTUPLE VxLAN Geneve PTP VEPA
> 
> With that we can tell what flow types are enabled, and what input
> fields are enabled for each flow type. My suspicion is that the two
> new flow types added to the X722 for UDP, 29 and 30, may not match type
> 31, which is the current flow type supported on the X710.
> 
> I have included a copy inline below in case the patch is stripped,
> however I suspect it will not apply cleanly as the mail client I am
> using usually ends up causing white space mangling by replacing tabs
> with spaces.
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index 65c2b9d2652b..0c93859f8184 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -10998,6 +10998,15 @@ static int i40e_pf_config_rss(struct i40e_pf *pf)
>                 ((u64)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1)) << 32);
>         hena |= i40e_pf_get_default_rss_hena(pf);
> 
> +       for (ret = 64; ret--;) {
> +               if (!(hena & (1ull << ret)))
> +                       continue;
> +               dev_info(&pf->pdev->dev, "flow_type: %d input_mask:0x%08x%08x\n",
> +                        ret,
> +                        i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(1, ret)),
> +                        i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(0, ret)));
> +       }
> +
>         i40e_write_rx_ctl(hw, I40E_PFQF_HENA(0), (u32)hena);
>         i40e_write_rx_ctl(hw, I40E_PFQF_HENA(1), (u32)(hena >> 32));

> i40e: Debug hash inputs

Here is what I see:

i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
i40e: Copyright (c) 2013 - 2014 Intel Corporation.
i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
i40e 0000:3d:00.0: Features: PF-id[0] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
i40e 0000:3d:00.1: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
i40e 0000:3d:00.1: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
i40e 0000:3d:00.1: MAC address: a4:bf:01:4e:0c:88
i40e 0000:3d:00.1: flow_type: 63 input_mask:0x0000000000004000
i40e 0000:3d:00.1: flow_type: 46 input_mask:0x0007fff800000000
i40e 0000:3d:00.1: flow_type: 45 input_mask:0x0007fff800000000
i40e 0000:3d:00.1: flow_type: 44 input_mask:0x0007ffff80000000
i40e 0000:3d:00.1: flow_type: 43 input_mask:0x0007fffe00000000
i40e 0000:3d:00.1: flow_type: 42 input_mask:0x0007fffe00000000
i40e 0000:3d:00.1: flow_type: 41 input_mask:0x0007fffe00000000
i40e 0000:3d:00.1: flow_type: 40 input_mask:0x0007fffe00000000
i40e 0000:3d:00.1: flow_type: 39 input_mask:0x0007fffe00000000
i40e 0000:3d:00.1: flow_type: 36 input_mask:0x0006060000000000
i40e 0000:3d:00.1: flow_type: 35 input_mask:0x0006060000000000
i40e 0000:3d:00.1: flow_type: 34 input_mask:0x0006060780000000
i40e 0000:3d:00.1: flow_type: 33 input_mask:0x0006060600000000
i40e 0000:3d:00.1: flow_type: 32 input_mask:0x0006060600000000
i40e 0000:3d:00.1: flow_type: 31 input_mask:0x0006060600000000
i40e 0000:3d:00.1: flow_type: 30 input_mask:0x0006060600000000
i40e 0000:3d:00.1: flow_type: 29 input_mask:0x0006060600000000
i40e 0000:3d:00.1: Features: PF-id[1] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
i40e 0000:3d:00.1 eth2: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None
i40e_ioctl: power down: eth1
i40e_ioctl: power down: eth2

-- 
Len Sorensen


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-16 18:34                               ` Lennart Sorensen
@ 2019-05-16 18:37                                 ` Lennart Sorensen
  2019-05-16 23:32                                   ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-16 18:37 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan

On Thu, May 16, 2019 at 02:34:08PM -0400, Lennart Sorensen wrote:
> Here is what I see:
> 
> i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
> i40e: Copyright (c) 2013 - 2014 Intel Corporation.
> i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
> i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
> i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
> i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
> i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
> i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
> i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
> i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
> i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: Features: PF-id[0] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
> i40e 0000:3d:00.1: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> i40e 0000:3d:00.1: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> i40e 0000:3d:00.1: MAC address: a4:bf:01:4e:0c:88
> i40e 0000:3d:00.1: flow_type: 63 input_mask:0x0000000000004000
> i40e 0000:3d:00.1: flow_type: 46 input_mask:0x0007fff800000000
> i40e 0000:3d:00.1: flow_type: 45 input_mask:0x0007fff800000000
> i40e 0000:3d:00.1: flow_type: 44 input_mask:0x0007ffff80000000
> i40e 0000:3d:00.1: flow_type: 43 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.1: flow_type: 42 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.1: flow_type: 41 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.1: flow_type: 40 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.1: flow_type: 39 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.1: flow_type: 36 input_mask:0x0006060000000000
> i40e 0000:3d:00.1: flow_type: 35 input_mask:0x0006060000000000
> i40e 0000:3d:00.1: flow_type: 34 input_mask:0x0006060780000000
> i40e 0000:3d:00.1: flow_type: 33 input_mask:0x0006060600000000
> i40e 0000:3d:00.1: flow_type: 32 input_mask:0x0006060600000000
> i40e 0000:3d:00.1: flow_type: 31 input_mask:0x0006060600000000
> i40e 0000:3d:00.1: flow_type: 30 input_mask:0x0006060600000000
> i40e 0000:3d:00.1: flow_type: 29 input_mask:0x0006060600000000
> i40e 0000:3d:00.1: Features: PF-id[1] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
> i40e 0000:3d:00.1 eth2: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None
> i40e_ioctl: power down: eth1
> i40e_ioctl: power down: eth2

Those last two lines are something I added, so ignore those.

-- 
Len Sorensen


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-16 18:37                                 ` Lennart Sorensen
@ 2019-05-16 23:32                                   ` Alexander Duyck
  2019-05-17 16:42                                     ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-16 23:32 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan

On Thu, May 16, 2019 at 11:37 AM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
>
> On Thu, May 16, 2019 at 02:34:08PM -0400, Lennart Sorensen wrote:
> > Here is what I see:
> >
> > i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
> > i40e: Copyright (c) 2013 - 2014 Intel Corporation.
> > i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> > i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> > i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
> > i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
> > i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
> > i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
> > i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
> > i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
> > i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
> > i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
> > i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
> > i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
> > i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
> > i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
> > i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
> > i40e 0000:3d:00.0: Features: PF-id[0] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
> > i40e 0000:3d:00.1: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> > i40e 0000:3d:00.1: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> > i40e 0000:3d:00.1: MAC address: a4:bf:01:4e:0c:88
> > i40e 0000:3d:00.1: flow_type: 63 input_mask:0x0000000000004000
> > i40e 0000:3d:00.1: flow_type: 46 input_mask:0x0007fff800000000
> > i40e 0000:3d:00.1: flow_type: 45 input_mask:0x0007fff800000000
> > i40e 0000:3d:00.1: flow_type: 44 input_mask:0x0007ffff80000000
> > i40e 0000:3d:00.1: flow_type: 43 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.1: flow_type: 42 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.1: flow_type: 41 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.1: flow_type: 40 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.1: flow_type: 39 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.1: flow_type: 36 input_mask:0x0006060000000000
> > i40e 0000:3d:00.1: flow_type: 35 input_mask:0x0006060000000000
> > i40e 0000:3d:00.1: flow_type: 34 input_mask:0x0006060780000000
> > i40e 0000:3d:00.1: flow_type: 33 input_mask:0x0006060600000000
> > i40e 0000:3d:00.1: flow_type: 32 input_mask:0x0006060600000000
> > i40e 0000:3d:00.1: flow_type: 31 input_mask:0x0006060600000000
> > i40e 0000:3d:00.1: flow_type: 30 input_mask:0x0006060600000000
> > i40e 0000:3d:00.1: flow_type: 29 input_mask:0x0006060600000000
> > i40e 0000:3d:00.1: Features: PF-id[1] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
> > i40e 0000:3d:00.1 eth2: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None
> > i40e_ioctl: power down: eth1
> > i40e_ioctl: power down: eth2
>
> Those last two lines are something I added, so ignore those.

No problem.

Just looking at the data provided, my guess is that IPv6 with UDP
likely works without any issues and that only IPv4 is the problem. When
you compare my UDP setup with yours, it looks like for some reason
somebody swapped around the input bits for the L3 source and
destination fields. I'm basing that on the input set masks in the
i40e_txrx.h header:
/* INPUT SET MASK for RSS, flow director, and flexible payload */
#define I40E_L3_SRC_SHIFT               47
#define I40E_L3_SRC_MASK                (0x3ULL << I40E_L3_SRC_SHIFT)
#define I40E_L3_V6_SRC_SHIFT            43
#define I40E_L3_V6_SRC_MASK             (0xFFULL << I40E_L3_V6_SRC_SHIFT)
#define I40E_L3_DST_SHIFT               35
#define I40E_L3_DST_MASK                (0x3ULL << I40E_L3_DST_SHIFT)
#define I40E_L3_V6_DST_SHIFT            35
#define I40E_L3_V6_DST_MASK             (0xFFULL << I40E_L3_V6_DST_SHIFT)
#define I40E_L4_SRC_SHIFT               34
#define I40E_L4_SRC_MASK                (0x1ULL << I40E_L4_SRC_SHIFT)
#define I40E_L4_DST_SHIFT               33
#define I40E_L4_DST_MASK                (0x1ULL << I40E_L4_DST_SHIFT)
#define I40E_VERIFY_TAG_SHIFT           31
#define I40E_VERIFY_TAG_MASK            (0x3ULL << I40E_VERIFY_TAG_SHIFT)

The easiest way to verify would be to rewrite the registers for
flow_type 29, 30, and 31 to match the value that I had shown earlier
from my dump:
[  294.687087] i40e 0000:81:00.1: flow_type: 31 input_mask:0x0001801e00000000
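
To make the comparison concrete, here is a small userspace sketch that
decodes an input mask against the field definitions above. The macro
values are copied from the header excerpt (with the I40E_ prefix
dropped); the helper names set_bits() and field_full() are invented
for this illustration and do not exist in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field masks copied from the i40e_txrx.h excerpt above. */
#define L3_SRC_MASK	(0x3ULL << 47)
#define L3_V6_SRC_MASK	(0xFFULL << 43)
#define L3_DST_MASK	(0x3ULL << 35)
#define L3_V6_DST_MASK	(0xFFULL << 35)
#define L4_SRC_MASK	(0x1ULL << 34)
#define L4_DST_MASK	(0x1ULL << 33)

/* Store the set bit positions of inset in out[], highest first (out
 * must have room for 64 entries). Returns how many bits are set. */
static int set_bits(uint64_t inset, int *out)
{
	int n = 0;

	assert(out != NULL);
	for (int bit = 64; bit--;) {
		if (inset & (1ULL << bit))
			out[n++] = bit;
	}
	return n;
}

/* Nonzero if every bit of field is selected in inset. */
static int field_full(uint64_t inset, uint64_t field)
{
	return (inset & field) == field;
}
```

With the two flow_type 31 values from the dumps, 0x0001801e00000000
has bits 48, 47, 36, 35, 34 and 33 set, which covers L3_SRC_MASK,
L3_DST_MASK and both L4 port bits. 0x0006060600000000 has bits 50, 49,
42, 41, 34 and 33 set: the L4 bits match, but the L3 bits sit in the
top two positions of each IPv6 range instead of the IPv4 positions.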

I will take a look at putting together a patch that can be tested to
verify if this is actually the issue tomorrow.

Thanks.

- Alex


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-16 23:32                                   ` Alexander Duyck
@ 2019-05-17 16:42                                     ` Alexander Duyck
  2019-05-17 17:23                                       ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-17 16:42 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan

[-- Attachment #1: Type: text/plain, Size: 8786 bytes --]

On Thu, May 16, 2019 at 4:32 PM Alexander Duyck
<alexander.duyck@gmail.com> wrote:
>
> On Thu, May 16, 2019 at 11:37 AM Lennart Sorensen
> <lsorense@csclub.uwaterloo.ca> wrote:
> >
> > On Thu, May 16, 2019 at 02:34:08PM -0400, Lennart Sorensen wrote:
> > > Here is what I see:
> > >
> > > i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
> > > i40e: Copyright (c) 2013 - 2014 Intel Corporation.
> > > i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> > > i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> > > i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
> > > i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
> > > i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
> > > i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
> > > i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
> > > i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
> > > i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
> > > i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
> > > i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
> > > i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
> > > i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
> > > i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
> > > i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
> > > i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
> > > i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
> > > i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
> > > i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
> > > i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
> > > i40e 0000:3d:00.0: Features: PF-id[0] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
> > > i40e 0000:3d:00.1: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> > > i40e 0000:3d:00.1: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> > > i40e 0000:3d:00.1: MAC address: a4:bf:01:4e:0c:88
> > > i40e 0000:3d:00.1: flow_type: 63 input_mask:0x0000000000004000
> > > i40e 0000:3d:00.1: flow_type: 46 input_mask:0x0007fff800000000
> > > i40e 0000:3d:00.1: flow_type: 45 input_mask:0x0007fff800000000
> > > i40e 0000:3d:00.1: flow_type: 44 input_mask:0x0007ffff80000000
> > > i40e 0000:3d:00.1: flow_type: 43 input_mask:0x0007fffe00000000
> > > i40e 0000:3d:00.1: flow_type: 42 input_mask:0x0007fffe00000000
> > > i40e 0000:3d:00.1: flow_type: 41 input_mask:0x0007fffe00000000
> > > i40e 0000:3d:00.1: flow_type: 40 input_mask:0x0007fffe00000000
> > > i40e 0000:3d:00.1: flow_type: 39 input_mask:0x0007fffe00000000
> > > i40e 0000:3d:00.1: flow_type: 36 input_mask:0x0006060000000000
> > > i40e 0000:3d:00.1: flow_type: 35 input_mask:0x0006060000000000
> > > i40e 0000:3d:00.1: flow_type: 34 input_mask:0x0006060780000000
> > > i40e 0000:3d:00.1: flow_type: 33 input_mask:0x0006060600000000
> > > i40e 0000:3d:00.1: flow_type: 32 input_mask:0x0006060600000000
> > > i40e 0000:3d:00.1: flow_type: 31 input_mask:0x0006060600000000
> > > i40e 0000:3d:00.1: flow_type: 30 input_mask:0x0006060600000000
> > > i40e 0000:3d:00.1: flow_type: 29 input_mask:0x0006060600000000
> > > i40e 0000:3d:00.1: Features: PF-id[1] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
> > > i40e 0000:3d:00.1 eth2: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None
> > > i40e_ioctl: power down: eth1
> > > i40e_ioctl: power down: eth2
> >
> > Those last two lines are something I added, so ignore those.
>
> No problem.
>
> Just looking at the data provided, my guess is that IPv6 with UDP
> likely works without any issues and that only IPv4 is the problem. When
> you compare my UDP setup with yours, it looks like for some reason
> somebody swapped around the input bits for the L3 source and
> destination fields. I'm basing that on the input set masks in the
> i40e_txrx.h header:
> /* INPUT SET MASK for RSS, flow director, and flexible payload */
> #define I40E_L3_SRC_SHIFT               47
> #define I40E_L3_SRC_MASK                (0x3ULL << I40E_L3_SRC_SHIFT)
> #define I40E_L3_V6_SRC_SHIFT            43
> #define I40E_L3_V6_SRC_MASK             (0xFFULL << I40E_L3_V6_SRC_SHIFT)
> #define I40E_L3_DST_SHIFT               35
> #define I40E_L3_DST_MASK                (0x3ULL << I40E_L3_DST_SHIFT)
> #define I40E_L3_V6_DST_SHIFT            35
> #define I40E_L3_V6_DST_MASK             (0xFFULL << I40E_L3_V6_DST_SHIFT)
> #define I40E_L4_SRC_SHIFT               34
> #define I40E_L4_SRC_MASK                (0x1ULL << I40E_L4_SRC_SHIFT)
> #define I40E_L4_DST_SHIFT               33
> #define I40E_L4_DST_MASK                (0x1ULL << I40E_L4_DST_SHIFT)
> #define I40E_VERIFY_TAG_SHIFT           31
> #define I40E_VERIFY_TAG_MASK            (0x3ULL << I40E_VERIFY_TAG_SHIFT)
>
> The easiest way to verify would be to rewrite the registers for
> flow_type 29, 30, and 31 to match the value that I had shown earlier
> from my dump:
> [  294.687087] i40e 0000:81:00.1: flow_type: 31 input_mask:0x0001801e00000000
>
> I will take a look at putting together a patch that can be tested to
> verify if this is actually the issue tomorrow.
>
> Thanks.
>
> - Alex

So the patch below/attached should resolve the issues you are seeing
with your system in terms of UDPv4 RSS. What you should see with this
patch is the first function to come up will display some "update input
mask" messages, and then the remaining functions shouldn't make any
noise about it since the registers being updated are global to the
device.
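
The per-flow-type fix-up can also be tried in userspace without
touching hardware. The following is a sketch of the same decision
logic as a pure function, assuming the mask values from the header
excerpt quoted above (I40E_ prefix dropped); fix_range() and
fix_hash_inset() are names invented for this illustration, not driver
functions.

```c
#include <assert.h>
#include <stdint.h>

/* Field masks copied from the i40e_txrx.h excerpt quoted above. */
#define L3_SRC_MASK	(0x3ULL << 47)
#define L3_V6_SRC_MASK	(0xFFULL << 43)
#define L3_DST_MASK	(0x3ULL << 35)
#define L3_V6_DST_MASK	(0xFFULL << 35)

/* If the bits selected inside an IPv6 address range are neither the
 * whole range, nor exactly the IPv4 bits, nor empty, replace them with
 * the IPv4 bits. Otherwise leave that range untouched. */
static uint64_t fix_range(uint64_t inset, uint64_t v6_mask, uint64_t v4_mask)
{
	uint64_t sel = inset & v6_mask;

	/* The IPv4 bits are expected to lie inside the IPv6 range. */
	assert((v4_mask & ~v6_mask) == 0);

	if (sel == v6_mask || sel == v4_mask || sel == 0)
		return inset;
	return (inset & ~v6_mask) | v4_mask;
}

/* Pure-function version of the fix-up: returns the value that would be
 * written back, or the input unchanged if it already looks sane. */
static uint64_t fix_hash_inset(uint64_t orig)
{
	uint64_t update = orig;

	update = fix_range(update, L3_V6_SRC_MASK, L3_SRC_MASK);
	update = fix_range(update, L3_V6_DST_MASK, L3_DST_MASK);
	return update;
}
```

Fed the values from the X722 dump, 0x0006060600000000 becomes
0x0001801e00000000, 0x0006060780000000 becomes 0x0001801f80000000 and
0x0006060000000000 becomes 0x0001801800000000, which are the values
the X710 dump shows for the corresponding flow types. The IPv6 masks
(0x0007ff...) and flow type 63 pass through unchanged.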

If you can test this and see if it resolves the UDPv4 RSS issues I
would appreciate it.

Thanks.

- Alex

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 65c2b9d2652b..c0a7f66babd9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -10998,6 +10998,58 @@ static int i40e_pf_config_rss(struct i40e_pf *pf)
                ((u64)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1)) << 32);
        hena |= i40e_pf_get_default_rss_hena(pf);

+       for (ret = 64; ret--;) {
+               u64 hash_inset_orig, hash_inset_update;
+
+               if (!(hena & (1ull << ret)))
+                       continue;
+
+               /* Read initial input set value for flow type */
+               hash_inset_orig = i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(1, ret));
+               hash_inset_orig <<= 32;
+               hash_inset_orig |= i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(0, ret));
+
+               /* Copy value so we can compare later */
+               hash_inset_update = hash_inset_orig;
+
+               /* We should be looking at either the entire IPv6 or IPv4
+                * mask being set. If only part of the IPv6 mask is set, but
+                * the IPv4 mask is not then we have a garbage mask value
+                * and need to reset it.
+                */
+               switch (hash_inset_orig & I40E_L3_V6_SRC_MASK) {
+               case I40E_L3_V6_SRC_MASK:
+               case I40E_L3_SRC_MASK:
+               case 0:
+                       break;
+               default:
+                       hash_inset_update &= ~I40E_L3_V6_SRC_MASK;
+                       hash_inset_update |= I40E_L3_SRC_MASK;
+               }
+
+               switch (hash_inset_orig & I40E_L3_V6_DST_MASK) {
+               case I40E_L3_V6_DST_MASK:
+               case I40E_L3_DST_MASK:
+               case 0:
+                       break;
+               default:
+                       hash_inset_update &= ~I40E_L3_V6_DST_MASK;
+                       hash_inset_update |= I40E_L3_DST_MASK;
+               }
+
+               if (hash_inset_update != hash_inset_orig) {
+                       dev_warn(&pf->pdev->dev,
+                                "flow type: %d update input mask from:0x%016llx, to:0x%016llx\n",
+                                ret,
+                                hash_inset_orig, hash_inset_update);
+                       i40e_write_rx_ctl(hw, I40E_GLQF_HASH_INSET(0, ret),
+                                         (u32)hash_inset_update);
+                       hash_inset_update >>= 32;
+                       i40e_write_rx_ctl(hw, I40E_GLQF_HASH_INSET(1, ret),
+                                         (u32)hash_inset_update);
+               }
+       }
+
        i40e_write_rx_ctl(hw, I40E_PFQF_HENA(0), (u32)hena);
        i40e_write_rx_ctl(hw, I40E_PFQF_HENA(1), (u32)(hena >> 32));

[-- Attachment #2: i40e-debug-hash-inputs.patch --]
[-- Type: text/x-patch, Size: 2297 bytes --]

i40e: Debug hash inputs

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>


---
 drivers/net/ethernet/intel/i40e/i40e_main.c |   52 +++++++++++++++++++++++++++
 1 file changed, 52 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 65c2b9d2652b..c0a7f66babd9 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -10998,6 +10998,58 @@ static int i40e_pf_config_rss(struct i40e_pf *pf)
 		((u64)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1)) << 32);
 	hena |= i40e_pf_get_default_rss_hena(pf);
 
+	for (ret = 64; ret--;) {
+		u64 hash_inset_orig, hash_inset_update;
+
+		if (!(hena & (1ull << ret)))
+			continue;
+
+		/* Read initial input set value for flow type */
+		hash_inset_orig = i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(1, ret));
+		hash_inset_orig <<= 32;
+		hash_inset_orig |= i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(0, ret));
+
+		/* Copy value so we can compare later */
+		hash_inset_update = hash_inset_orig;
+
+		/* We should be looking at either the entire IPv6 or IPv4
+		 * mask being set. If only part of the IPv6 mask is set, but
+		 * the IPv4 mask is not then we have a garbage mask value
+		 * and need to reset it.
+		 */
+		switch (hash_inset_orig & I40E_L3_V6_SRC_MASK) {
+		case I40E_L3_V6_SRC_MASK:
+		case I40E_L3_SRC_MASK:
+		case 0:
+			break;
+		default:
+			hash_inset_update &= ~I40E_L3_V6_SRC_MASK;
+			hash_inset_update |= I40E_L3_SRC_MASK;
+		}
+
+		switch (hash_inset_orig & I40E_L3_V6_DST_MASK) {
+		case I40E_L3_V6_DST_MASK:
+		case I40E_L3_DST_MASK:
+		case 0:
+			break;
+		default:
+			hash_inset_update &= ~I40E_L3_V6_DST_MASK;
+			hash_inset_update |= I40E_L3_DST_MASK;
+		}
+
+		if (hash_inset_update != hash_inset_orig) {
+			dev_warn(&pf->pdev->dev,
+				 "flow type: %d update input mask from:0x%016llx, to:0x%016llx\n",
+				 ret,
+				 hash_inset_orig, hash_inset_update);
+			i40e_write_rx_ctl(hw, I40E_GLQF_HASH_INSET(0, ret),
+					  (u32)hash_inset_update);
+			hash_inset_update >>= 32;
+			i40e_write_rx_ctl(hw, I40E_GLQF_HASH_INSET(1, ret),
+					  (u32)hash_inset_update);
+		}
+	}
+
 	i40e_write_rx_ctl(hw, I40E_PFQF_HENA(0), (u32)hena);
 	i40e_write_rx_ctl(hw, I40E_PFQF_HENA(1), (u32)(hena >> 32));
 


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-17 16:42                                     ` Alexander Duyck
@ 2019-05-17 17:23                                       ` Lennart Sorensen
  2019-05-17 22:20                                         ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-17 17:23 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan

On Fri, May 17, 2019 at 09:42:19AM -0700, Alexander Duyck wrote:
> So the patch below/attached should resolve the issues you are seeing
> with your system in terms of UDPv4 RSS. What you should see with this
> patch is the first function to come up will display some "update input
> mask" messages, and then the remaining functions shouldn't make any
> noise about it since the registers being updated are global to the
> device.
> 
> If you can test this and see if it resolves the UDPv4 RSS issues I
> would appreciate it.
> 
> Thanks.
> 
> - Alex
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index 65c2b9d2652b..c0a7f66babd9 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -10998,6 +10998,58 @@ static int i40e_pf_config_rss(struct i40e_pf *pf)
>                 ((u64)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1)) << 32);
>         hena |= i40e_pf_get_default_rss_hena(pf);
> 
> +       for (ret = 64; ret--;) {
> +               u64 hash_inset_orig, hash_inset_update;
> +
> +               if (!(hena & (1ull << ret)))
> +                       continue;
> +
> +               /* Read initial input set value for flow type */
> +               hash_inset_orig = i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(1, ret));
> +               hash_inset_orig <<= 32;
> +               hash_inset_orig |= i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(0, ret));
> +
> +               /* Copy value so we can compare later */
> +               hash_inset_update = hash_inset_orig;
> +
> +               /* We should be looking at either the entire IPv6 or IPv4
> +                * mask being set. If only part of the IPv6 mask is set, but
> +                * the IPv4 mask is not then we have a garbage mask value
> +                * and need to reset it.
> +                */
> +               switch (hash_inset_orig & I40E_L3_V6_SRC_MASK) {
> +               case I40E_L3_V6_SRC_MASK:
> +               case I40E_L3_SRC_MASK:
> +               case 0:
> +                       break;
> +               default:
> +                       hash_inset_update &= ~I40E_L3_V6_SRC_MASK;
> +                       hash_inset_update |= I40E_L3_SRC_MASK;
> +               }
> +
> +               switch (hash_inset_orig & I40E_L3_V6_DST_MASK) {
> +               case I40E_L3_V6_DST_MASK:
> +               case I40E_L3_DST_MASK:
> +               case 0:
> +                       break;
> +               default:
> +                       hash_inset_update &= ~I40E_L3_V6_DST_MASK;
> +                       hash_inset_update |= I40E_L3_DST_MASK;
> +               }
> +
> +               if (hash_inset_update != hash_inset_orig) {
> +                       dev_warn(&pf->pdev->dev,
> +                                "flow type: %d update input mask from:0x%016llx, to:0x%016llx\n",
> +                                ret,
> +                                hash_inset_orig, hash_inset_update);
> +                       i40e_write_rx_ctl(hw, I40E_GLQF_HASH_INSET(0, ret),
> +                                         (u32)hash_inset_update);
> +                       hash_inset_update >>= 32;
> +                       i40e_write_rx_ctl(hw, I40E_GLQF_HASH_INSET(1, ret),
> +                                         (u32)hash_inset_update);
> +               }
> +       }
> +
>         i40e_write_rx_ctl(hw, I40E_PFQF_HENA(0), (u32)hena);
>         i40e_write_rx_ctl(hw, I40E_PFQF_HENA(1), (u32)(hena >> 32));

> i40e: Debug hash inputs
> 
> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> 
> 
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c |   52 +++++++++++++++++++++++++++
>  1 file changed, 52 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index 65c2b9d2652b..c0a7f66babd9 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -10998,6 +10998,58 @@ static int i40e_pf_config_rss(struct i40e_pf *pf)
>  		((u64)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1)) << 32);
>  	hena |= i40e_pf_get_default_rss_hena(pf);
>  
> +	for (ret = 64; ret--;) {
> +		u64 hash_inset_orig, hash_inset_update;
> +
> +		if (!(hena & (1ull << ret)))
> +			continue;
> +
> +		/* Read initial input set value for flow type */
> +		hash_inset_orig = i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(1, ret));
> +		hash_inset_orig <<= 32;
> +		hash_inset_orig |= i40e_read_rx_ctl(hw, I40E_GLQF_HASH_INSET(0, ret));
> +
> +		/* Copy value so we can compare later */
> +		hash_inset_update = hash_inset_orig;
> +
> +		/* We should be looking at either the entire IPv6 or IPv4
> +		 * mask being set. If only part of the IPv6 mask is set, but
> +		 * the IPv4 mask is not then we have a garbage mask value
> +		 * and need to reset it.
> +		 */
> +		switch (hash_inset_orig & I40E_L3_V6_SRC_MASK) {
> +		case I40E_L3_V6_SRC_MASK:
> +		case I40E_L3_SRC_MASK:
> +		case 0:
> +			break;
> +		default:
> +			hash_inset_update &= ~I40E_L3_V6_SRC_MASK;
> +			hash_inset_update |= I40E_L3_SRC_MASK;
> +		}
> +
> +		switch (hash_inset_orig & I40E_L3_V6_DST_MASK) {
> +		case I40E_L3_V6_DST_MASK:
> +		case I40E_L3_DST_MASK:
> +		case 0:
> +			break;
> +		default:
> +			hash_inset_update &= ~I40E_L3_V6_DST_MASK;
> +			hash_inset_update |= I40E_L3_DST_MASK;
> +		}
> +
> +		if (hash_inset_update != hash_inset_orig) {
> +			dev_warn(&pf->pdev->dev,
> +				 "flow type: %d update input mask from:0x%016llx, to:0x%016llx\n",
> +				 ret,
> +				 hash_inset_orig, hash_inset_update);
> +			i40e_write_rx_ctl(hw, I40E_GLQF_HASH_INSET(0, ret),
> +					  (u32)hash_inset_update);
> +			hash_inset_update >>= 32;
> +			i40e_write_rx_ctl(hw, I40E_GLQF_HASH_INSET(1, ret),
> +					  (u32)hash_inset_update);
> +		}
> +	}
> +
>  	i40e_write_rx_ctl(hw, I40E_PFQF_HENA(0), (u32)hena);
>  	i40e_write_rx_ctl(hw, I40E_PFQF_HENA(1), (u32)(hena >> 32));
>  

OK I applied that and see this:

i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
i40e: Copyright (c) 2013 - 2014 Intel Corporation.
i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
i40e 0000:3d:00.0: flow type: 36 update input mask from:0x0006060000000000, to:0x0001801800000000
i40e 0000:3d:00.0: flow type: 35 update input mask from:0x0006060000000000, to:0x0001801800000000
i40e 0000:3d:00.0: flow type: 34 update input mask from:0x0006060780000000, to:0x0001801f80000000
i40e 0000:3d:00.0: flow type: 33 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 32 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 31 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 30 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 29 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: Features: PF-id[0] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
i40e 0000:3d:00.1: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
i40e 0000:3d:00.1: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
i40e 0000:3d:00.1: MAC address: a4:bf:01:4e:0c:88
i40e 0000:3d:00.1: Features: PF-id[1] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
i40e 0000:3d:00.1 eth2: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None

Unfortunately (much to my disappointment, I hoped it would work) I see
no change in behaviour.

-- 
Len Sorensen


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-17 17:23                                       ` Lennart Sorensen
@ 2019-05-17 22:20                                         ` Alexander Duyck
  2019-05-21 15:15                                           ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-17 22:20 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan

On Fri, May 17, 2019 at 10:23 AM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
> OK I applied that and see this:
>
> i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
> i40e: Copyright (c) 2013 - 2014 Intel Corporation.
> i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
> i40e 0000:3d:00.0: flow type: 36 update input mask from:0x0006060000000000, to:0x0001801800000000
> i40e 0000:3d:00.0: flow type: 35 update input mask from:0x0006060000000000, to:0x0001801800000000
> i40e 0000:3d:00.0: flow type: 34 update input mask from:0x0006060780000000, to:0x0001801f80000000
> i40e 0000:3d:00.0: flow type: 33 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 32 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 31 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 30 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 29 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: Features: PF-id[0] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
> i40e 0000:3d:00.1: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> i40e 0000:3d:00.1: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> i40e 0000:3d:00.1: MAC address: a4:bf:01:4e:0c:88
> i40e 0000:3d:00.1: Features: PF-id[1] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
> i40e 0000:3d:00.1 eth2: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None
>
> Unfortunately (much to my disappointment, I hoped it would work) I see
> no change in behaviour.
>
> --
> Len Sorensen

I was hoping it would work too. It seemed like it should have been the
answer since it definitely didn't seem right. Now it has me wondering
about some of the other code in the driver.

By any chance have you run anything like DPDK on any of the X722
interfaces on this system recently? I ask because it occurs to me that
if you had and it loaded something like a custom parsing profile it
could cause issues similar to this.

A debugging step you might try would be to revert back to my earlier
patch that only displayed the input mask instead of changing it. Once
you have done that you could look at doing a full power cycle on the
system by either physically disconnecting the power, or using the
power switch on the power supply itself if one is available. It is
necessary to disconnect the motherboard/NIC from power in order to
fully clear the global state stored in the device as it is retained
when the system is in standby.

What I want to verify is if the input mask that we have run into is
the natural power-on input mask or if that is something that was
overridden by something else. The mask change I made should be reset
if the system loses power, and then it will either default back to the
value with the 6's if that is its natural state, or it will match
what I had if it was not.

Other than that I really can't think of much else. I suppose there
is the possibility of the NVM either setting up a DCB setting or
HREGION register causing an override that is limiting the queues to 1.
However, the likelihood of that should be really low.

- Alex


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-17 22:20                                         ` Alexander Duyck
@ 2019-05-21 15:15                                           ` Lennart Sorensen
  2019-05-21 16:51                                             ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-21 15:15 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan

On Fri, May 17, 2019 at 03:20:02PM -0700, Alexander Duyck wrote:
> I was hoping it would work too. It seemed like it should have been the
> answer since it definitely didn't seem right. Now it has me wondering
> about some of the other code in the driver.
> 
> By any chance have you run anything like DPDK on any of the X722
> interfaces on this system recently? I ask because it occurs to me that
> if you had and it loaded something like a custom parsing profile it
> could cause issues similar to this.

I have never used DPDK on anything.  I was hoping never to do so. :)

This system has so far booted Debian (with a 4.19 kernel) and our own OS
(which has a 4.9 kernel).

> A debugging step you might try would be to revert back to my earlier
> patch that only displayed the input mask instead of changing it. Once
> you have done that you could look at doing a full power cycle on the
> system by either physically disconnecting the power, or using the
> power switch on the power supply itself if one is available. It is
> necessary to disconnect the motherboard/NIC from power in order to
> fully clear the global state stored in the device as it is retained
> when the system is in standby.
> 
> What I want to verify is if the input mask that we have run into is
> the natural power-on input mask or if that is something that was
> overridden by something else. The mask change I made should be reset
> if the system loses power, and then it will either default back to the
> value with the 6's if that is its natural state, or it will match
> what I had if it was not.
> 
> Other than that I really can't think up too much else. I suppose there
> is the possibility of the NVM either setting up a DCB setting or
> HREGION register causing an override that is limiting the queues to 1.
> However, the likelihood of that should be really low.

Here is the register dump after a full power off:

i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
i40e: Copyright (c) 2013 - 2014 Intel Corporation.
i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
i40e 0000:3d:00.0: Features: PF-id[0] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
i40e 0000:3d:00.1: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
i40e 0000:3d:00.1: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
i40e 0000:3d:00.1: MAC address: a4:bf:01:4e:0c:88
i40e 0000:3d:00.1: flow_type: 63 input_mask:0x0000000000004000
i40e 0000:3d:00.1: flow_type: 46 input_mask:0x0007fff800000000
i40e 0000:3d:00.1: flow_type: 45 input_mask:0x0007fff800000000
i40e 0000:3d:00.1: flow_type: 44 input_mask:0x0007ffff80000000
i40e 0000:3d:00.1: flow_type: 43 input_mask:0x0007fffe00000000
i40e 0000:3d:00.1: flow_type: 42 input_mask:0x0007fffe00000000
i40e 0000:3d:00.1: flow_type: 41 input_mask:0x0007fffe00000000
i40e 0000:3d:00.1: flow_type: 40 input_mask:0x0007fffe00000000
i40e 0000:3d:00.1: flow_type: 39 input_mask:0x0007fffe00000000
i40e 0000:3d:00.1: flow_type: 36 input_mask:0x0006060000000000
i40e 0000:3d:00.1: flow_type: 35 input_mask:0x0006060000000000
i40e 0000:3d:00.1: flow_type: 34 input_mask:0x0006060780000000
i40e 0000:3d:00.1: flow_type: 33 input_mask:0x0006060600000000
i40e 0000:3d:00.1: flow_type: 32 input_mask:0x0006060600000000
i40e 0000:3d:00.1: flow_type: 31 input_mask:0x0006060600000000
i40e 0000:3d:00.1: flow_type: 30 input_mask:0x0006060600000000
i40e 0000:3d:00.1: flow_type: 29 input_mask:0x0006060600000000
i40e 0000:3d:00.1: Features: PF-id[1] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
i40e 0000:3d:00.1 eth2: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None

Pretty sure that is identical to before.

If I dump the registers after doing the update I see this (just did a
reboot this time, not a power cycle):

i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
i40e: Copyright (c) 2013 - 2014 Intel Corporation.
i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow type: 36 update input mask from:0x0006060000000000, to:0x0001801800000000
i40e 0000:3d:00.0: flow type: 35 update input mask from:0x0006060000000000, to:0x0001801800000000
i40e 0000:3d:00.0: flow type: 34 update input mask from:0x0006060780000000, to:0x0001801f80000000
i40e 0000:3d:00.0: flow type: 33 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 32 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 31 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 30 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 29 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow_type after update: 63 input_mask:0x0000000000004000
i40e 0000:3d:00.0: flow_type after update: 46 input_mask:0x0007fff800000000
i40e 0000:3d:00.0: flow_type after update: 45 input_mask:0x0007fff800000000
i40e 0000:3d:00.0: flow_type after update: 44 input_mask:0x0007ffff80000000
i40e 0000:3d:00.0: flow_type after update: 43 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type after update: 42 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type after update: 41 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type after update: 40 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type after update: 39 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type after update: 36 input_mask:0x0001801800000000
i40e 0000:3d:00.0: flow_type after update: 35 input_mask:0x0001801800000000
i40e 0000:3d:00.0: flow_type after update: 34 input_mask:0x0001801f80000000
i40e 0000:3d:00.0: flow_type after update: 33 input_mask:0x0001801e00000000
i40e 0000:3d:00.0: flow_type after update: 32 input_mask:0x0001801e00000000
i40e 0000:3d:00.0: flow_type after update: 31 input_mask:0x0001801e00000000
i40e 0000:3d:00.0: flow_type after update: 30 input_mask:0x0001801e00000000
i40e 0000:3d:00.0: flow_type after update: 29 input_mask:0x0001801e00000000
i40e 0000:3d:00.0: Features: PF-id[0] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA

So at least it appears the update did apply.
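
For anyone following the mask values, here is a minimal standalone
sketch of the fixup the patch applies to each input set. The field
positions are assumptions inferred from the "update input mask" lines
above (they are not copied from the driver headers), and the helper
names fixup_field() and normalize_hash_inset() are made up for this
illustration. A field is left alone if it is the full IPv6 mask, the
IPv4 mask, or zero; anything else is replaced with the IPv4 mask.

```c
#include <stdint.h>

/*
 * Assumed field layout, inferred from the log output in this thread.
 * Check the driver headers before relying on these values.
 */
#define L3_SRC_MASK	(0x3ULL << 47)	/* IPv4 source address words */
#define L3_V6_SRC_MASK	(0xFFULL << 43)	/* IPv6 source address words */
#define L3_DST_MASK	(0x3ULL << 35)	/* IPv4 destination address words */
#define L3_V6_DST_MASK	(0xFFULL << 35)	/* IPv6 destination address words */

/* Replace a partially set address field with the IPv4 mask. */
static uint64_t fixup_field(uint64_t inset, uint64_t v6_mask, uint64_t v4_mask)
{
	uint64_t field = inset & v6_mask;

	/* Full IPv6 mask, exact IPv4 mask, or unused: nothing to do. */
	if (field == v6_mask || field == v4_mask || field == 0)
		return inset;

	return (inset & ~v6_mask) | v4_mask;
}

/* Apply the fixup to the source and destination fields in turn. */
uint64_t normalize_hash_inset(uint64_t inset)
{
	inset = fixup_field(inset, L3_V6_SRC_MASK, L3_SRC_MASK);
	inset = fixup_field(inset, L3_V6_DST_MASK, L3_DST_MASK);
	return inset;
}
```

Under those assumed masks, flow types 39 to 46 carry the full IPv6
fields and pass through unchanged, while the values with the 6's only
cover part of each field and get rewritten, which matches the dump.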

-- 
Len Sorensen


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-21 15:15                                           ` Lennart Sorensen
@ 2019-05-21 16:51                                             ` Alexander Duyck
  2019-05-21 17:54                                               ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-21 16:51 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan

On Tue, May 21, 2019 at 8:15 AM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
>
> On Fri, May 17, 2019 at 03:20:02PM -0700, Alexander Duyck wrote:
> > I was hoping it would work too. It seemed like it should have been the
> > answer since it definitely didn't seem right. Now it has me wondering
> > about some of the other code in the driver.
> >
> > By any chance have you run anything like DPDK on any of the X722
> > interfaces on this system recently? I ask because it occurs to me that
> > if you had and it loaded something like a custom parsing profile it
> > could cause issues similar to this.
>
> I have never used DPDK on anything.  I was hoping never to do so. :)
>
> This system has so far booted Debian (with a 4.19 kernel) and our own OS
> (which has a 4.9 kernel).
>
> > A debugging step you might try would be to revert back to my earlier
> > patch that only displayed the input mask instead of changing it. Once
> > you have done that you could look at doing a full power cycle on the
> > system by either physically disconnecting the power, or using the
> > power switch on the power supply itself if one is available. It is
> > necessary to disconnect the motherboard/NIC from power in order to
> > fully clear the global state stored in the device as it is retained
> > when the system is in standby.
> >
> > What I want to verify is if the input mask that we have run into is
> > the natural power-on input mask or if that is something that was
> > overridden by something else. The mask change I made should be reset
> > if the system loses power, and then it will either default back to the
> > value with the 6's if that is its natural state, or it will match
> > what I had if it was not.
> >
> > Other than that I really can't think up too much else. I suppose there
> > is the possibility of the NVM either setting up a DCB setting or
> > HREGION register causing an override that is limiting the queues to 1.
> > However, the likelihood of that should be really low.
>
> Here is the register dump after a full power off:
>
> i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
> i40e: Copyright (c) 2013 - 2014 Intel Corporation.
> i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
> i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
> i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
> i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
> i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
> i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
> i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
> i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
> i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: Features: PF-id[0] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
> i40e 0000:3d:00.1: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> i40e 0000:3d:00.1: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> i40e 0000:3d:00.1: MAC address: a4:bf:01:4e:0c:88
> i40e 0000:3d:00.1: flow_type: 63 input_mask:0x0000000000004000
> i40e 0000:3d:00.1: flow_type: 46 input_mask:0x0007fff800000000
> i40e 0000:3d:00.1: flow_type: 45 input_mask:0x0007fff800000000
> i40e 0000:3d:00.1: flow_type: 44 input_mask:0x0007ffff80000000
> i40e 0000:3d:00.1: flow_type: 43 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.1: flow_type: 42 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.1: flow_type: 41 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.1: flow_type: 40 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.1: flow_type: 39 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.1: flow_type: 36 input_mask:0x0006060000000000
> i40e 0000:3d:00.1: flow_type: 35 input_mask:0x0006060000000000
> i40e 0000:3d:00.1: flow_type: 34 input_mask:0x0006060780000000
> i40e 0000:3d:00.1: flow_type: 33 input_mask:0x0006060600000000
> i40e 0000:3d:00.1: flow_type: 32 input_mask:0x0006060600000000
> i40e 0000:3d:00.1: flow_type: 31 input_mask:0x0006060600000000
> i40e 0000:3d:00.1: flow_type: 30 input_mask:0x0006060600000000
> i40e 0000:3d:00.1: flow_type: 29 input_mask:0x0006060600000000
> i40e 0000:3d:00.1: Features: PF-id[1] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
> i40e 0000:3d:00.1 eth2: NIC Link is Up, 1000 Mbps Full Duplex, Flow Control: None
>
> Pretty sure that is identical to before.
>
> If I dump the registers after doing the update I see this (just did a
> reboot this time, not a power cycle):
>
> i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
> i40e: Copyright (c) 2013 - 2014 Intel Corporation.
> i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
> i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
> i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
> i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
> i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
> i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
> i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
> i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
> i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow type: 36 update input mask from:0x0006060000000000, to:0x0001801800000000
> i40e 0000:3d:00.0: flow type: 35 update input mask from:0x0006060000000000, to:0x0001801800000000
> i40e 0000:3d:00.0: flow type: 34 update input mask from:0x0006060780000000, to:0x0001801f80000000
> i40e 0000:3d:00.0: flow type: 33 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 32 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 31 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 30 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 29 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow_type after update: 63 input_mask:0x0000000000004000
> i40e 0000:3d:00.0: flow_type after update: 46 input_mask:0x0007fff800000000
> i40e 0000:3d:00.0: flow_type after update: 45 input_mask:0x0007fff800000000
> i40e 0000:3d:00.0: flow_type after update: 44 input_mask:0x0007ffff80000000
> i40e 0000:3d:00.0: flow_type after update: 43 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type after update: 42 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type after update: 41 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type after update: 40 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type after update: 39 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type after update: 36 input_mask:0x0001801800000000
> i40e 0000:3d:00.0: flow_type after update: 35 input_mask:0x0001801800000000
> i40e 0000:3d:00.0: flow_type after update: 34 input_mask:0x0001801f80000000
> i40e 0000:3d:00.0: flow_type after update: 33 input_mask:0x0001801e00000000
> i40e 0000:3d:00.0: flow_type after update: 32 input_mask:0x0001801e00000000
> i40e 0000:3d:00.0: flow_type after update: 31 input_mask:0x0001801e00000000
> i40e 0000:3d:00.0: flow_type after update: 30 input_mask:0x0001801e00000000
> i40e 0000:3d:00.0: flow_type after update: 29 input_mask:0x0001801e00000000
> i40e 0000:3d:00.0: Features: PF-id[0] VSIs: 34 QP: 12 TXQ: 13 RSS VxLAN Geneve VEPA
>
> So at least it appears the update did apply.
>
> --
> Len Sorensen

I think we need to narrow this down a bit more. Let's try forcing the
entire lookup table to one value and see if traffic is still going to
queue 0.

Specifically, what we need to do is run the following command to try and
force all RSS traffic to queue 8. You can verify the result with
"ethtool -x":
ethtool -X <iface> weight 0 0 0 0 0 0 0 0 1

If that works and the IPSec traffic goes to queue 8 then we are likely
looking at some sort of input issue, either in the parsing or the
population of things like the input mask that we can then debug
further.

If traffic still goes to queue 0 then that tells us the output of the
RSS hash and lookup table are being ignored. This would imply that
either some other filter is rerouting the traffic or something is
directing us to limit the queue index to 0 bits.
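
To make the test above concrete, here is a rough sketch of how a weight
list turns into an indirection table and how a hash then picks a queue.
It is not ethtool's or the driver's code: fill_lut_by_weight() and
rss_queue() are hypothetical helpers, and the modulo lookup is an
assumption about how the hash indexes the table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: give each queue a share of LUT entries that is
 * proportional to its weight. Returns 0 on success, -1 if all weights
 * are zero (or the list is empty). */
static int fill_lut_by_weight(uint8_t *lut, size_t lut_size,
                              const unsigned int *weight, size_t nweights)
{
	uint64_t sum = 0, partial;
	size_t i, q = 0;

	for (i = 0; i < nweights; i++)
		sum += weight[i];
	if (sum == 0)
		return -1;

	partial = weight[0];
	for (i = 0; i < lut_size; i++) {
		/* Advance to the queue whose cumulative share covers entry i. */
		while ((uint64_t)i * sum >= (uint64_t)lut_size * partial)
			partial += weight[++q];
		lut[i] = (uint8_t)q;
	}
	return 0;
}

/* Assumed lookup: the hash selects a LUT entry, the entry names the queue. */
static unsigned int rss_queue(const uint8_t *lut, size_t lut_size,
                              uint32_t hash)
{
	return lut[hash % lut_size];
}
```

With the weights "0 0 0 0 0 0 0 0 1" every entry of the table names
queue 8, so under this model any packet that is actually hashed has to
land on queue 8 whatever its hash value is.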

Thanks.

- Alex

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-21 16:51                                             ` Alexander Duyck
@ 2019-05-21 17:54                                               ` Lennart Sorensen
  2019-05-21 23:22                                                 ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-21 17:54 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan

On Tue, May 21, 2019 at 09:51:33AM -0700, Alexander Duyck wrote:
> I think we need to narrow this down a bit more. Let's try forcing the
> lookup table all to one value and see if traffic is still going to
> queue 0.
> 
> Specifically what we need to do is run the following command to try and
> force all RSS traffic to queue 8, you can verify the result with
> "ethtool -x":
> ethtool -X <iface> weight 0 0 0 0 0 0 0 0 1
> 
> If that works and the IPSec traffic goes to queue 8 then we are likely
> looking at some sort of input issue, either in the parsing or the
> population of things like the input mask that we can then debug
> further.
> 
> If traffic still goes to queue 0 then that tells us the output of the
> RSS hash and lookup table are being ignored. This would imply that
> either some other filter is rerouting the traffic or something is
> directing us to limit the queue index to 0 bits.

# ethtool -x eth2
RX flow hash indirection table for eth2 with 12 RX ring(s):
    0:      7     7     7     7     7     7     7     7
    8:      7     7     7     7     7     7     7     7
   16:      7     7     7     7     7     7     7     7
   24:      7     7     7     7     7     7     7     7
   32:      7     7     7     7     7     7     7     7
...
  472:      7     7     7     7     7     7     7     7
  480:      7     7     7     7     7     7     7     7
  488:      7     7     7     7     7     7     7     7
  496:      7     7     7     7     7     7     7     7
  504:      7     7     7     7     7     7     7     7
RSS hash key:
0b:1f:ae:ed:60:04:7d:e5:8a:2b:43:3f:1d:ee:6c:99:89:29:94:b0:25:db:c7:4b:fa:da:4d:3f:e8:cc:bc:00:ad:32:01:d6:1c:30:3f:f8:79:3e:f4:48:04:1f:51:d2:5a:39:f0:90
root@ECA:~# ethtool --show-priv-flags eth2
Private flags for eth2:
MFP              : off
LinkPolling      : off
flow-director-atr: off
veb-stats        : off
hw-atr-eviction  : on
legacy-rx        : off

All ipsec packets are still hitting queue 0.

Seems it is completely ignoring RSS for these packets.  That is
impressively weird.

-- 
Len Sorensen

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-21 17:54                                               ` Lennart Sorensen
@ 2019-05-21 23:22                                                 ` Alexander Duyck
  2019-05-22 14:39                                                   ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Alexander Duyck @ 2019-05-21 23:22 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan, e1000-devel

[-- Attachment #1: Type: text/plain, Size: 3403 bytes --]

On Tue, May 21, 2019 at 10:55 AM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
>
> On Tue, May 21, 2019 at 09:51:33AM -0700, Alexander Duyck wrote:
> > I think we need to narrow this down a bit more. Let's try forcing the
> > lookup table all to one value and see if traffic is still going to
> > queue 0.
> >
> > Specifically what we need to do is run the following command to try and
> > force all RSS traffic to queue 8, you can verify the result with
> > "ethtool -x":
> > ethtool -X <iface> weight 0 0 0 0 0 0 0 0 1
> >
> > If that works and the IPSec traffic goes to queue 8 then we are likely
> > looking at some sort of input issue, either in the parsing or the
> > population of things like the input mask that we can then debug
> > further.
> >
> > If traffic still goes to queue 0 then that tells us the output of the
> > RSS hash and lookup table are being ignored. This would imply that
> > either some other filter is rerouting the traffic or something is
> > directing us to limit the queue index to 0 bits.
>
> # ethtool -x eth2
> RX flow hash indirection table for eth2 with 12 RX ring(s):
>     0:      7     7     7     7     7     7     7     7
>     8:      7     7     7     7     7     7     7     7
>    16:      7     7     7     7     7     7     7     7
>    24:      7     7     7     7     7     7     7     7
>    32:      7     7     7     7     7     7     7     7
> ...
>   472:      7     7     7     7     7     7     7     7
>   480:      7     7     7     7     7     7     7     7
>   488:      7     7     7     7     7     7     7     7
>   496:      7     7     7     7     7     7     7     7
>   504:      7     7     7     7     7     7     7     7
> RSS hash key:
> 0b:1f:ae:ed:60:04:7d:e5:8a:2b:43:3f:1d:ee:6c:99:89:29:94:b0:25:db:c7:4b:fa:da:4d:3f:e8:cc:bc:00:ad:32:01:d6:1c:30:3f:f8:79:3e:f4:48:04:1f:51:d2:5a:39:f0:90
> root@ECA:~# ethtool --show-priv-flags eth2
> Private flags for eth2:
> MFP              : off
> LinkPolling      : off
> flow-director-atr: off
> veb-stats        : off
> hw-atr-eviction  : on
> legacy-rx        : off
>
> All ipsec packets are still hitting queue 0.
>
> Seems it is completely ignoring RSS for these packets.  That is
> impressively weird.
>
> --
> Len Sorensen

So we are either using 0 bits of the LUT or we are just not performing
hashing because this is somehow being parsed into a type that doesn't
support it.

I have attached 2 more patches we can test. The first enables hashing
on what are called "OAM" (Operations, Administration & Management)
packets. The thing is, we shouldn't be identifying these packets as
"OAM", as normally a packet would have to be recognized as a tunnel
first and then have a specific flag set in either the GENEVE or
VXLAN-GPE header. The second patch will dump the contents of the
HREGION registers. They should all be 0; however, I thought it best to
dump the contents and verify that, since I know that these registers
can be used to change the traffic class of a given packet type. If we
are encountering that, it might be moving the packet to an
uninitialized TC, which would be using queue offset 0 with 0 bits of
the LUT.

These last 2 patches would pretty much rule out the entire RSS
subsystem as the cause. If we don't see HREGION values set and the OAM
flags have no effect I can only assume there is something going on with
the parser in the NIC since it isn't recognizing the packet type.
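
The two failure modes described above can be put into a small model.
This is only a sketch of the reasoning, not the X722 data path: struct
rss_model, its fields and model_rx_queue() are hypothetical names, and
the fall-back to the first queue of the traffic class when hashing is
not enabled for a packet type is an assumption.

```c
#include <stdint.h>

/* Simplified, hypothetical model of RSS queue selection. */
struct rss_model {
	uint64_t hena;       /* one hash-enable bit per packet classifier type */
	const uint8_t *lut;  /* indirection table */
	uint32_t lut_size;   /* number of LUT entries, must be non-zero */
	uint32_t tc_offset;  /* first queue of the traffic class */
	uint32_t tc_bits;    /* how many bits of the LUT output the TC uses */
};

static uint32_t model_rx_queue(const struct rss_model *m,
                               unsigned int pctype, uint32_t hash)
{
	uint32_t idx;

	/* Case 1 (assumed): hashing is not enabled for this packet type,
	 * so no hash is used and the packet lands on the first queue. */
	if (pctype >= 64 || !(m->hena & (1ULL << pctype)))
		return m->tc_offset;

	idx = m->lut[hash % m->lut_size];

	/* Case 2: with tc_bits == 0 the mask is 0, so the LUT output is
	 * thrown away and every packet lands on tc_offset. */
	if (m->tc_bits < 32)
		idx &= (1u << m->tc_bits) - 1u;

	return m->tc_offset + idx;
}
```

In this model a table filled with 7s sends a hashed packet to queue 7,
while a packet type without its enable bit, or a traffic class with 0
LUT bits, ends up on queue 0. That matches what the two patches are
meant to tell apart.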

Thanks.

- Alex

[-- Attachment #2: i40e-enable-oam-flag-tunnel.patch --]
[-- Type: text/x-patch, Size: 1199 bytes --]

i40e: Enable OAM flag tunnel hashing

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

Add support for hashing packet types 26 and 27 on X722 adapters. The
default input set is supposed to be source outer UDP and VNI. For now all I
care about is enabling hashing on this to see if we can get ESP traffic to
not hit queue 0 for everything.
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.h |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.h b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
index 100e92d2982f..ad3e16e8cd7a 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.h
@@ -95,7 +95,8 @@ enum i40e_dyn_idx_t {
 	BIT_ULL(I40E_FILTER_PCTYPE_NONF_MULTICAST_IPV4_UDP) | \
 	BIT_ULL(I40E_FILTER_PCTYPE_NONF_IPV6_TCP_SYN_NO_ACK) | \
 	BIT_ULL(I40E_FILTER_PCTYPE_NONF_UNICAST_IPV6_UDP) | \
-	BIT_ULL(I40E_FILTER_PCTYPE_NONF_MULTICAST_IPV6_UDP))
+	BIT_ULL(I40E_FILTER_PCTYPE_NONF_MULTICAST_IPV6_UDP) | \
+	BIT_ULL(26) | BIT_ULL(27)) /* Added bits for tunnel OAM */
 
 #define i40e_pf_get_default_rss_hena(pf) \
 	(((pf)->hw_features & I40E_HW_MULTIPLE_TCP_UDP_RSS_PCTYPE) ? \

[-- Attachment #3: i40e-dump-extra-hregion.patch --]
[-- Type: text/x-patch, Size: 933 bytes --]

i40e: Dump HREGION entries

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>


---
 drivers/net/ethernet/intel/i40e/i40e_main.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 320562b39686..370f66df4e4f 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11094,6 +11094,14 @@ static int i40e_pf_config_rss(struct i40e_pf *pf)
 	u64 hena;
 	int ret;
 
+	/* These should all be 0, dump them to verify they are */
+	for (ret = 8; ret--;) {
+		reg_val = i40e_read_rx_ctl(hw, I40E_PFQF_HREGION(ret));
+
+		dev_info(&pf->pdev->dev,
+			 "PFQF_HREGION[%d]: 0x%08x\n", ret, reg_val);
+	}
+
 	/* By default we enable TCP/UDP with IPv4/IPv6 ptypes */
 	hena = (u64)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(0)) |
 		((u64)i40e_read_rx_ctl(hw, I40E_PFQF_HENA(1)) << 32);

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-21 23:22                                                 ` Alexander Duyck
@ 2019-05-22 14:39                                                   ` Lennart Sorensen
  2019-06-07 14:39                                                     ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-05-22 14:39 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan, e1000-devel

On Tue, May 21, 2019 at 04:22:17PM -0700, Alexander Duyck wrote:
> So we are either using 0 bits of the LUT or we are just not performing
> hashing because this is somehow being parsed into a type that doesn't
> support it.
> 
> I have attached 2 more patches we can test. The first enables hashing
> on what are called "OAM" (Operations, Administration & Management)
> packets. The thing is, we shouldn't be identifying these packets as
> "OAM", as normally a packet would have to be recognized as a tunnel
> first and then have a specific flag set in either the GENEVE or
> VXLAN-GPE header. The second patch will dump the contents of the
> HREGION registers. They should all be 0; however, I thought it best to
> dump the contents and verify that, since I know that these registers
> can be used to change the traffic class of a given packet type. If we
> are encountering that, it might be moving the packet to an
> uninitialized TC, which would be using queue offset 0 with 0 bits of
> the LUT.
> 
> These last 2 patches would pretty much rule out the entire RSS
> subsystem as the cause. If we don't see HREGION values set and the OAM
> flags have no effect I can only assume there is something going on with
> the parser in the NIC since it isn't recognizing the packet type.
> 
> Thanks.
> 
> - Alex

OK I applied those two patches and get this:

i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
i40e: Copyright (c) 2013 - 2014 Intel Corporation.
i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
i40e 0000:3d:00.0: PFQF_HREGION[7]: 0x00000000
i40e 0000:3d:00.0: PFQF_HREGION[6]: 0x00000000
i40e 0000:3d:00.0: PFQF_HREGION[5]: 0x00000000
i40e 0000:3d:00.0: PFQF_HREGION[4]: 0x00000000
i40e 0000:3d:00.0: PFQF_HREGION[3]: 0x00000000
i40e 0000:3d:00.0: PFQF_HREGION[2]: 0x00000000
i40e 0000:3d:00.0: PFQF_HREGION[1]: 0x00000000
i40e 0000:3d:00.0: PFQF_HREGION[0]: 0x00000000
i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
i40e 0000:3d:00.0: flow_type: 27 input_mask:0x00000000002c0000
i40e 0000:3d:00.0: flow_type: 26 input_mask:0x00000000002c0000
i40e 0000:3d:00.0: flow type: 36 update input mask from:0x0006060000000000, to:0x0001801800000000
i40e 0000:3d:00.0: flow type: 35 update input mask from:0x0006060000000000, to:0x0001801800000000
i40e 0000:3d:00.0: flow type: 34 update input mask from:0x0006060780000000, to:0x0001801f80000000
i40e 0000:3d:00.0: flow type: 33 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 32 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 31 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 30 update input mask from:0x0006060600000000, to:0x0001801e00000000
i40e 0000:3d:00.0: flow type: 29 update input mask from:0x0006060600000000, to:0x0001801e00000000
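
The input masks in the log are 64-bit bitmaps, and the "from" and "to"
values are easier to compare as lists of bit positions. A minimal
sketch for that follows; mask_bits() is a hypothetical helper and says
nothing about what each bit means to the hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Write the positions of the set bits of mask into out, lowest bit
 * first, and return how many were found. out needs room for 64 entries. */
static size_t mask_bits(uint64_t mask, unsigned int *out)
{
	size_t n = 0;

	for (unsigned int bit = 0; bit < 64; bit++)
		if (mask & (1ULL << bit))
			out[n++] = bit;
	return n;
}
```

For 0x0006060600000000 this yields bits 33, 34, 41, 42, 49 and 50, and
for 0x0001801e00000000 it yields bits 33 to 36, 47 and 48, so the
update keeps six bits set but moves most of them.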

So seems the regions are all 0.
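
For reference, a dump like this could also be checked in code. The
sketch below assumes that each HREGION register holds eight 4-bit
slots, with bit 0 of a slot as an override enable and bits 1 to 3 as
the region number. That layout is an assumption and should be checked
against the register definitions in the driver; hregion_overrides()
and hregion_region() are hypothetical helpers.

```c
#include <stdint.h>

/* Count the slots with the (assumed) override-enable bit set across
 * the eight dumped register values. */
static unsigned int hregion_overrides(const uint32_t regs[8])
{
	unsigned int count = 0;

	for (int r = 0; r < 8; r++)
		for (int slot = 0; slot < 8; slot++)
			if ((regs[r] >> (slot * 4)) & 0x1u)
				count++;
	return count;
}

/* Extract the (assumed) 3-bit region number of one slot. */
static unsigned int hregion_region(uint32_t reg, unsigned int slot)
{
	return (reg >> (slot * 4 + 1)) & 0x7u;
}
```

With all eight registers reading 0x00000000, as in the dump above, no
override is enabled under this layout.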

All ipsec packets still hitting queue 0.

-- 
Len Sorensen

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-05-22 14:39                                                   ` Lennart Sorensen
@ 2019-06-07 14:39                                                     ` Lennart Sorensen
  2019-06-07 19:32                                                       ` Alexander Duyck
  0 siblings, 1 reply; 33+ messages in thread
From: Lennart Sorensen @ 2019-06-07 14:39 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan, e1000-devel

On Wed, May 22, 2019 at 10:39:56AM -0400, Lennart Sorensen wrote:
> OK I applied those two patches and get this:
> 
> i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
> i40e: Copyright (c) 2013 - 2014 Intel Corporation.
> i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
> i40e 0000:3d:00.0: PFQF_HREGION[7]: 0x00000000
> i40e 0000:3d:00.0: PFQF_HREGION[6]: 0x00000000
> i40e 0000:3d:00.0: PFQF_HREGION[5]: 0x00000000
> i40e 0000:3d:00.0: PFQF_HREGION[4]: 0x00000000
> i40e 0000:3d:00.0: PFQF_HREGION[3]: 0x00000000
> i40e 0000:3d:00.0: PFQF_HREGION[2]: 0x00000000
> i40e 0000:3d:00.0: PFQF_HREGION[1]: 0x00000000
> i40e 0000:3d:00.0: PFQF_HREGION[0]: 0x00000000
> i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
> i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
> i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
> i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
> i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
> i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
> i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
> i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
> i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
> i40e 0000:3d:00.0: flow_type: 27 input_mask:0x00000000002c0000
> i40e 0000:3d:00.0: flow_type: 26 input_mask:0x00000000002c0000
> i40e 0000:3d:00.0: flow type: 36 update input mask from:0x0006060000000000, to:0x0001801800000000
> i40e 0000:3d:00.0: flow type: 35 update input mask from:0x0006060000000000, to:0x0001801800000000
> i40e 0000:3d:00.0: flow type: 34 update input mask from:0x0006060780000000, to:0x0001801f80000000
> i40e 0000:3d:00.0: flow type: 33 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 32 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 31 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 30 update input mask from:0x0006060600000000, to:0x0001801e00000000
> i40e 0000:3d:00.0: flow type: 29 update input mask from:0x0006060600000000, to:0x0001801e00000000
> 
> So seems the regions are all 0.
> 
> All ipsec packets still hitting queue 0.

So any news or more ideas to try or are we stuck hoping someone can fix
the firmware?

-- 
Len Sorensen

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-06-07 14:39                                                     ` Lennart Sorensen
@ 2019-06-07 19:32                                                       ` Alexander Duyck
  2019-06-07 20:49                                                         ` [E1000-devel] " Hisashi T Fujinaka
  2020-02-07 21:51                                                         ` Lennart Sorensen
  0 siblings, 2 replies; 33+ messages in thread
From: Alexander Duyck @ 2019-06-07 19:32 UTC (permalink / raw)
  To: Lennart Sorensen; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan, e1000-devel

On Fri, Jun 7, 2019 at 7:39 AM Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
>
> On Wed, May 22, 2019 at 10:39:56AM -0400, Lennart Sorensen wrote:
> > OK I applied those two patches and get this:
> >
> > i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
> > i40e: Copyright (c) 2013 - 2014 Intel Corporation.
> > i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
> > i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
> > i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
> > i40e 0000:3d:00.0: PFQF_HREGION[7]: 0x00000000
> > i40e 0000:3d:00.0: PFQF_HREGION[6]: 0x00000000
> > i40e 0000:3d:00.0: PFQF_HREGION[5]: 0x00000000
> > i40e 0000:3d:00.0: PFQF_HREGION[4]: 0x00000000
> > i40e 0000:3d:00.0: PFQF_HREGION[3]: 0x00000000
> > i40e 0000:3d:00.0: PFQF_HREGION[2]: 0x00000000
> > i40e 0000:3d:00.0: PFQF_HREGION[1]: 0x00000000
> > i40e 0000:3d:00.0: PFQF_HREGION[0]: 0x00000000
> > i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
> > i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
> > i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
> > i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
> > i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
> > i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
> > i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
> > i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
> > i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
> > i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
> > i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
> > i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
> > i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
> > i40e 0000:3d:00.0: flow_type: 27 input_mask:0x00000000002c0000
> > i40e 0000:3d:00.0: flow_type: 26 input_mask:0x00000000002c0000
> > i40e 0000:3d:00.0: flow type: 36 update input mask from:0x0006060000000000, to:0x0001801800000000
> > i40e 0000:3d:00.0: flow type: 35 update input mask from:0x0006060000000000, to:0x0001801800000000
> > i40e 0000:3d:00.0: flow type: 34 update input mask from:0x0006060780000000, to:0x0001801f80000000
> > i40e 0000:3d:00.0: flow type: 33 update input mask from:0x0006060600000000, to:0x0001801e00000000
> > i40e 0000:3d:00.0: flow type: 32 update input mask from:0x0006060600000000, to:0x0001801e00000000
> > i40e 0000:3d:00.0: flow type: 31 update input mask from:0x0006060600000000, to:0x0001801e00000000
> > i40e 0000:3d:00.0: flow type: 30 update input mask from:0x0006060600000000, to:0x0001801e00000000
> > i40e 0000:3d:00.0: flow type: 29 update input mask from:0x0006060600000000, to:0x0001801e00000000
> >
> > So seems the regions are all 0.
> >
> > All ipsec packets still hitting queue 0.
>
> So any news or more ideas to try or are we stuck hoping someone can fix
> the firmware?

I had reached out to some folks over in the networking division hoping
that they can get a reproduction as I don't have the hardware that you
are seeing the issue on so I have no way to reproduce it.

Maybe someone from that group can reply and tell us where they are on that?

Thanks.

- Alex

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [E1000-devel] [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-06-07 19:32                                                       ` Alexander Duyck
@ 2019-06-07 20:49                                                         ` Hisashi T Fujinaka
  2019-06-07 22:08                                                           ` Fujinaka, Todd
  2020-02-07 21:51                                                         ` Lennart Sorensen
  1 sibling, 1 reply; 33+ messages in thread
From: Hisashi T Fujinaka @ 2019-06-07 20:49 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Lennart Sorensen, e1000-devel, Netdev, intel-wired-lan, LKML

On Fri, 7 Jun 2019, Alexander Duyck wrote:

> On Fri, Jun 7, 2019 at 7:39 AM Lennart Sorensen
> <lsorense@csclub.uwaterloo.ca> wrote:
>>
>> On Wed, May 22, 2019 at 10:39:56AM -0400, Lennart Sorensen wrote:
>>> OK I applied those two patches and get this:
>>>
>>> i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
>>> i40e: Copyright (c) 2013 - 2014 Intel Corporation.
>>> i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
>>> i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
>>> i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
>>> i40e 0000:3d:00.0: PFQF_HREGION[7]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[6]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[5]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[4]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[3]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[2]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[1]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[0]: 0x00000000
>>> i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
>>> i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
>>> i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
>>> i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
>>> i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
>>> i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
>>> i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
>>> i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
>>> i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
>>> i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
>>> i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
>>> i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
>>> i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
>>> i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
>>> i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
>>> i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
>>> i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
>>> i40e 0000:3d:00.0: flow_type: 27 input_mask:0x00000000002c0000
>>> i40e 0000:3d:00.0: flow_type: 26 input_mask:0x00000000002c0000
>>> i40e 0000:3d:00.0: flow type: 36 update input mask from:0x0006060000000000, to:0x0001801800000000
>>> i40e 0000:3d:00.0: flow type: 35 update input mask from:0x0006060000000000, to:0x0001801800000000
>>> i40e 0000:3d:00.0: flow type: 34 update input mask from:0x0006060780000000, to:0x0001801f80000000
>>> i40e 0000:3d:00.0: flow type: 33 update input mask from:0x0006060600000000, to:0x0001801e00000000
>>> i40e 0000:3d:00.0: flow type: 32 update input mask from:0x0006060600000000, to:0x0001801e00000000
>>> i40e 0000:3d:00.0: flow type: 31 update input mask from:0x0006060600000000, to:0x0001801e00000000
>>> i40e 0000:3d:00.0: flow type: 30 update input mask from:0x0006060600000000, to:0x0001801e00000000
>>> i40e 0000:3d:00.0: flow type: 29 update input mask from:0x0006060600000000, to:0x0001801e00000000
>>>
>>> So seems the regions are all 0.
>>>
>>> All ipsec packets still hitting queue 0.
>>
>> So any news or more ideas to try or are we stuck hoping someone can fix
>> the firmware?
>
> I had reached out to some folks over in the networking division hoping
> that they can get a reproduction as I don't have the hardware that you
> are seeing the issue on so I have no way to reproduce it.
>
> Maybe someone from that group can reply and tell us where they are on that?
>
> Thanks.
>
> - Alex

For some reason this isn't showing up in my work email. We had an
internal conference this week and I think people are away. I'll see if I
can chase some people down if they're still here and not on the way
home.

-- 
Hisashi T Fujinaka - htodd@twofifty.com

^ permalink raw reply	[flat|nested] 33+ messages in thread

* RE: [E1000-devel] [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-06-07 20:49                                                         ` [E1000-devel] " Hisashi T Fujinaka
@ 2019-06-07 22:08                                                           ` Fujinaka, Todd
  2019-06-10 19:01                                                             ` Lennart Sorensen
  0 siblings, 1 reply; 33+ messages in thread
From: Fujinaka, Todd @ 2019-06-07 22:08 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: e1000-devel, Netdev, intel-wired-lan, LKML, Lennart Sorensen

Just a quick update with the response I got and I'll make sure this is in our internal bug database.

Here's what I got back, and it looks like you guys have tried this already:

Have they tried these steps to configure RSS:

RSS Hash Flow
-------------

Allows you to set the hash bytes per flow type and any combination of one or
more options for Receive Side Scaling (RSS) hash byte configuration.

#ethtool -N <dev> rx-flow-hash <type> <option>

Where <type> is:
  tcp4  signifying TCP over IPv4
  udp4  signifying UDP over IPv4
  tcp6  signifying TCP over IPv6
  udp6  signifying UDP over IPv6
And <option> is one or more of:
  s Hash on the IP source address of the rx packet.
  d Hash on the IP destination address of the rx packet.
  f Hash on bytes 0 and 1 of the Layer 4 header of the rx packet.
  n Hash on bytes 2 and 3 of the Layer 4 header of the rx packet.
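
As an illustration of the option letters listed above, the sketch below
maps them to a bitmask. The SKETCH_RXH_* constants are local
definitions meant to mirror the RXH_* flags in linux/ethtool.h, and
parse_rxfh_options() is a hypothetical helper, not part of ethtool.

```c
#include <stdint.h>

/* Local copies, assumed to match RXH_IP_SRC, RXH_IP_DST, RXH_L4_B_0_1
 * and RXH_L4_B_2_3 from linux/ethtool.h. */
#define SKETCH_RXH_IP_SRC   (1u << 4)
#define SKETCH_RXH_IP_DST   (1u << 5)
#define SKETCH_RXH_L4_B_0_1 (1u << 6)
#define SKETCH_RXH_L4_B_2_3 (1u << 7)

/* Translate a string such as "sdfn" into a flag mask.
 * Returns -1 if an unknown option letter is found. */
static int32_t parse_rxfh_options(const char *opts)
{
	uint32_t flags = 0;

	for (; *opts; opts++) {
		switch (*opts) {
		case 's': flags |= SKETCH_RXH_IP_SRC;   break;
		case 'd': flags |= SKETCH_RXH_IP_DST;   break;
		case 'f': flags |= SKETCH_RXH_L4_B_0_1; break;
		case 'n': flags |= SKETCH_RXH_L4_B_2_3; break;
		default:  return -1;
		}
	}
	return (int32_t)flags;
}
```

Hashing UDP over IPv4 on both addresses and both ports would then be
requested as "ethtool -N <dev> rx-flow-hash udp4 sdfn", which this
sketch turns into the mask 0xf0.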

Also, looks like the driver needs to be updated to latest version:
>>> i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.

Out of tree: https://sourceforge.net/projects/e1000/files/i40e%20stable/

Todd Fujinaka
Software Application Engineer
Datacenter Engineering Group
Intel Corporation
todd.fujinaka@intel.com


-----Original Message-----
From: Hisashi T Fujinaka [mailto:htodd@twofifty.com] 
Sent: Friday, June 7, 2019 1:50 PM
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: e1000-devel@lists.sourceforge.net; Netdev <netdev@vger.kernel.org>; intel-wired-lan <intel-wired-lan@lists.osuosl.org>; LKML <linux-kernel@vger.kernel.org>; Lennart Sorensen <lsorense@csclub.uwaterloo.ca>
Subject: Re: [E1000-devel] [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets

On Fri, 7 Jun 2019, Alexander Duyck wrote:

> On Fri, Jun 7, 2019 at 7:39 AM Lennart Sorensen 
> <lsorense@csclub.uwaterloo.ca> wrote:
>>
>> On Wed, May 22, 2019 at 10:39:56AM -0400, Lennart Sorensen wrote:
>>> OK I applied those two patches and get this:
>>>
>>> i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.7-k
>>> i40e: Copyright (c) 2013 - 2014 Intel Corporation.
>>> i40e 0000:3d:00.0: fw 3.10.52896 api 1.6 nvm 4.00 0x80001577 1.1767.0
>>> i40e 0000:3d:00.0: The driver for the device detected a newer version of the NVM image than expected. Please install the most recent version of the network driver.
>>> i40e 0000:3d:00.0: MAC address: a4:bf:01:4e:0c:87
>>> i40e 0000:3d:00.0: PFQF_HREGION[7]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[6]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[5]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[4]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[3]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[2]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[1]: 0x00000000
>>> i40e 0000:3d:00.0: PFQF_HREGION[0]: 0x00000000
>>> i40e 0000:3d:00.0: flow_type: 63 input_mask:0x0000000000004000
>>> i40e 0000:3d:00.0: flow_type: 46 input_mask:0x0007fff800000000
>>> i40e 0000:3d:00.0: flow_type: 45 input_mask:0x0007fff800000000
>>> i40e 0000:3d:00.0: flow_type: 44 input_mask:0x0007ffff80000000
>>> i40e 0000:3d:00.0: flow_type: 43 input_mask:0x0007fffe00000000
>>> i40e 0000:3d:00.0: flow_type: 42 input_mask:0x0007fffe00000000
>>> i40e 0000:3d:00.0: flow_type: 41 input_mask:0x0007fffe00000000
>>> i40e 0000:3d:00.0: flow_type: 40 input_mask:0x0007fffe00000000
>>> i40e 0000:3d:00.0: flow_type: 39 input_mask:0x0007fffe00000000
>>> i40e 0000:3d:00.0: flow_type: 36 input_mask:0x0006060000000000
>>> i40e 0000:3d:00.0: flow_type: 35 input_mask:0x0006060000000000
>>> i40e 0000:3d:00.0: flow_type: 34 input_mask:0x0006060780000000
>>> i40e 0000:3d:00.0: flow_type: 33 input_mask:0x0006060600000000
>>> i40e 0000:3d:00.0: flow_type: 32 input_mask:0x0006060600000000
>>> i40e 0000:3d:00.0: flow_type: 31 input_mask:0x0006060600000000
>>> i40e 0000:3d:00.0: flow_type: 30 input_mask:0x0006060600000000
>>> i40e 0000:3d:00.0: flow_type: 29 input_mask:0x0006060600000000
>>> i40e 0000:3d:00.0: flow_type: 27 input_mask:0x00000000002c0000
>>> i40e 0000:3d:00.0: flow_type: 26 input_mask:0x00000000002c0000
>>> i40e 0000:3d:00.0: flow type: 36 update input mask from:0x0006060000000000, to:0x0001801800000000
>>> i40e 0000:3d:00.0: flow type: 35 update input mask from:0x0006060000000000, to:0x0001801800000000
>>> i40e 0000:3d:00.0: flow type: 34 update input mask from:0x0006060780000000, to:0x0001801f80000000
>>> i40e 0000:3d:00.0: flow type: 33 update input mask from:0x0006060600000000, to:0x0001801e00000000
>>> i40e 0000:3d:00.0: flow type: 32 update input mask from:0x0006060600000000, to:0x0001801e00000000
>>> i40e 0000:3d:00.0: flow type: 31 update input mask from:0x0006060600000000, to:0x0001801e00000000
>>> i40e 0000:3d:00.0: flow type: 30 update input mask from:0x0006060600000000, to:0x0001801e00000000
>>> i40e 0000:3d:00.0: flow type: 29 update input mask from:0x0006060600000000, to:0x0001801e00000000
>>>
>>> So seems the regions are all 0.
>>>
>>> All ipsec packets still hitting queue 0.
>>
>> So any news or more ideas to try or are we stuck hoping someone can 
>> fix the firmware?
>
> I had reached out to some folks over in the networking division hoping 
> that they can get a reproduction as I don't have the hardware that you 
> are seeing the issue on so I have no way to reproduce it.
>
> Maybe someone from that group can reply and tell us where they are on that?
>
> Thanks.
>
> - Alex

For some reason this isn't showing up in my work email. We had an internal conference this week and I think people are away. I'll see if I can chase some people down if they're still here and not on the way home.

--
Hisashi T Fujinaka - htodd@twofifty.com


_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired


* Re: [E1000-devel] [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-06-07 22:08                                                           ` Fujinaka, Todd
@ 2019-06-10 19:01                                                             ` Lennart Sorensen
  0 siblings, 0 replies; 33+ messages in thread
From: Lennart Sorensen @ 2019-06-10 19:01 UTC (permalink / raw)
  To: Fujinaka, Todd
  Cc: Alexander Duyck, e1000-devel, Netdev, intel-wired-lan, LKML

On Fri, Jun 07, 2019 at 10:08:31PM +0000, Fujinaka, Todd wrote:
> Just a quick update with the response I got and I'll make sure this is in our internal bug database.
> 
> Here's what I got back, and it looks like you guys have tried this already:
> 
> Have they tried these steps to configure RSS:
> 
> RSS Hash Flow
> -------------
> 
> Allows you to set the hash bytes per flow type and any combination of one or
> more options for Receive Side Scaling (RSS) hash byte configuration.
> 
> #ethtool -N <dev> rx-flow-hash <type> <option>
> 
> Where <type> is:
>   tcp4  signifying TCP over IPv4
>   udp4  signifying UDP over IPv4
>   tcp6  signifying TCP over IPv6
>   udp6  signifying UDP over IPv6
> And <option> is one or more of:
>   s Hash on the IP source address of the rx packet.
>   d Hash on the IP destination address of the rx packet.
>   f Hash on bytes 0 and 1 of the Layer 4 header of the rx packet.
>   n Hash on bytes 2 and 3 of the Layer 4 header of the rx packet.

With potentially 10000 IPsec connections, we don't even want to look at
creating manual flow entries.  There isn't enough room for that.  We just
wanted RSS to do its job the way it has on every other NIC in the past.
After years of using mostly Intel NICs that just worked, this one has
been quite the surprise.
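
For reference, the rx-flow-hash options quoted above select which header
fields feed the hash for a whole flow type.  The sketch below only
illustrates that selection for the UDP header of the captured packet
(1.99.99.2.4500 > 1.99.99.1.4500).  The helper name hash_input and the
fixed s, d, f, n concatenation order are assumptions made for the example,
not something taken from the driver or the hardware documentation.

```python
import ipaddress
import struct


def hash_input(src_ip, dst_ip, l4_header, options):
    """Return the bytes selected by rx-flow-hash style option letters.

    Hypothetical helper: the concatenation order (s, d, f, n) is an
    assumption for illustration, not the documented hardware order.
    """
    fields = {
        "s": ipaddress.IPv4Address(src_ip).packed,  # IP source address
        "d": ipaddress.IPv4Address(dst_ip).packed,  # IP destination address
        "f": l4_header[0:2],  # L4 bytes 0-1 (UDP/TCP source port)
        "n": l4_header[2:4],  # L4 bytes 2-3 (UDP/TCP destination port)
    }
    return b"".join(fields[letter] for letter in "sdfn" if letter in options)


# UDP header of the captured packet: ports 4500 -> 4500,
# length 8 + 132 payload bytes, checksum 0 ("no cksum").
udp_header = struct.pack("!HHHH", 4500, 4500, 140, 0)

selected = hash_input("1.99.99.2", "1.99.99.1", udp_header, "sdfn")
addresses_only = hash_input("1.99.99.2", "1.99.99.1", udp_header, "sd")
print(selected.hex())
print(addresses_only.hex())
```

With "sdfn" the selection covers both addresses and both ports, which is
the 4-tuple other UDP traffic appears to be spread on.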

> Also, looks like the driver needs to be updated to latest version:
> >>> 1.1767.0 i40e 0000:3d:00.0: The driver for the device detected a
> >>> newer version of the NVM image than expected. Please install the
> >>> most recent version of the network driver.
> 
> Out of tree: https://sourceforge.net/projects/e1000/files/i40e%20stable/

Already tried with the 4.19 kernel, whose i40e driver is essentially
identical to the latest out-of-tree driver (I diffed them and found no
functional differences at all), and it didn't help.  Well, it was
essentially identical to the latest out-of-tree driver a few weeks ago.
It seems there is now a newer one with some changes, although nothing in
the list of changes sounds relevant.

We do not want to use the out-of-tree driver, and even trying it out is
a lot of work.  We used to use it in the past for some NIC types but
stopped due to the hassle of maintaining the integration.  If any problems
exist in the in-kernel driver we will patch it, but so far that does not
appear to be the problem.  The tests we did so far indicate the firmware
isn't applying an RSS value to certain packet types.  Even mapping every
RSS value to queue 7 still saw these packets arrive on queue 0, which
should of course be impossible if the firmware were working.  Now if
there is anything in the out-of-tree driver that you think can explain
this problem, I will look at it and consider trying it, but so far I
see nothing that makes that worth the effort.  It just doesn't look like
a driver problem.  If someone has access to an S2600WFT board (or some
other C620-series based board) it should be simple enough to try replaying
the captured packet and see what RSS queue it hits (with ATR disabled
of course).
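
The reasoning behind the queue 7 test can be sketched as follows.  This
is a simplified model, not the X722's actual lookup: the function name
rss_queue, the 512-entry table size, and the sample hash values are all
assumptions for illustration.

```python
def rss_queue(hash_value, indirection_table):
    """Pick the receive queue for a packet from its 32-bit RSS hash.

    Simplified model: the hash indexes the indirection table and the
    entry found there is the queue number.
    """
    return indirection_table[hash_value % len(indirection_table)]


TABLE_SIZE = 512  # assumed size, for illustration only
all_to_seven = [7] * TABLE_SIZE  # every entry points at queue 7

# Whatever hash the hardware computes, including 0, the lookup yields 7.
sample_hashes = (0, 1, 0xAC11CADF, 0xFFFFFFFF)
queues = {rss_queue(h, all_to_seven) for h in sample_hashes}
print(queues)  # {7}
```

So a packet landing on queue 0 with this table suggests the lookup was
skipped for that packet, rather than the hash merely coming out as 0.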

The message is because we tried installing an NVM update to see if it
fixed anything, and it did not.  We could put the old version back,
but since neither version works I didn't bother yet.

-- 
Len Sorensen


* Re: [Intel-wired-lan] i40e X722 RSS problem with NAT-Traversal IPsec packets
  2019-06-07 19:32                                                       ` Alexander Duyck
  2019-06-07 20:49                                                         ` [E1000-devel] " Hisashi T Fujinaka
@ 2020-02-07 21:51                                                         ` Lennart Sorensen
  1 sibling, 0 replies; 33+ messages in thread
From: Lennart Sorensen @ 2020-02-07 21:51 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Jeff Kirsher, LKML, Netdev, intel-wired-lan, e1000-devel

On Fri, Jun 07, 2019 at 12:32:51PM -0700, Alexander Duyck wrote:
> I had reached out to some folks over in the networking division hoping
> that they can get a reproduction as I don't have the hardware that you
> are seeing the issue on so I have no way to reproduce it.
> 
> Maybe someone from that group can reply and tell us where they are on that?

Well, I still have not heard anything from anyone.  I just installed the
4.10 firmware in case that security fix (the only change to happen in over
12 months) did something, but no.

So all UDP encapsulated IPsec packets still always have an RSS value of 0.

I am tempted to write a test to see if all UDP encapsulated IP packets
that are not of one of the explicitly handled types have this problem
since I have a suspicion they do.
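
A rough sketch of how frames for such a test could be generated, using
only the Python standard library.  The addresses, ports, IP id, SPI and
sequence number come from the capture at the start of this thread; the
helper names, the zero-filled payloads, and the "plain-udp" comparison
case on port 5000 are assumptions for illustration.  Sending the frames
(for example through a raw socket or a replay tool) is left out.

```python
import struct


def ipv4_checksum(header):
    """Standard one's complement checksum over the IPv4 header."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def mac(text):
    return bytes.fromhex(text.replace(":", ""))


def ip(text):
    return bytes(int(part) for part in text.split("."))


def udp_frame(src_mac, dst_mac, src_ip, dst_ip, sport, dport, payload, ip_id=0):
    """Build an Ethernet/IPv4/UDP frame around an arbitrary payload."""
    # UDP checksum 0 means "no checksum", as in the capture.
    udp = struct.pack("!HHHH", sport, dport, 8 + len(payload), 0) + payload
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(udp), ip_id,
        0x4000,  # flags: DF set, fragment offset 0
        64, 17, 0,  # ttl, protocol UDP, checksum placeholder
        ip(src_ip), ip(dst_ip),
    )
    header = header[:10] + struct.pack("!H", ipv4_checksum(header)) + header[12:]
    ethernet = mac(dst_mac) + mac(src_mac) + struct.pack("!H", 0x0800)
    return ethernet + header + udp


# ESP header (SPI, sequence number) followed by filler; a real payload
# would be encrypted data.
esp = struct.pack("!II", 0xAC11CADF, 0x480) + bytes(124)

candidates = {
    "esp-in-udp": (4500, esp),
    "plain-udp": (5000, bytes(132)),  # hypothetical comparison case
}

frames = {
    name: udp_frame("54:ee:75:30:f1:e1", "a4:bf:01:4e:0c:87",
                    "1.99.99.2", "1.99.99.1", port, port, payload, ip_id=3312)
    for name, (port, payload) in candidates.items()
}
print(len(frames["esp-in-udp"]))  # 174, same as the captured frame
```

Further payload variants can be added to the candidates table to check
which ones end up on queue 0.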

-- 
Len Sorensen

Thread overview: 33+ messages
2019-05-01 20:52 i40e X722 RSS problem with NAT-Traversal IPsec packets Lennart Sorensen
2019-05-01 22:52 ` [Intel-wired-lan] " Alexander Duyck
2019-05-02 15:11   ` Lennart Sorensen
2019-05-02 17:03     ` Alexander Duyck
2019-05-02 17:16       ` Lennart Sorensen
2019-05-02 17:28         ` Alexander Duyck
2019-05-02 17:55           ` Lennart Sorensen
2019-05-02 18:52             ` Lennart Sorensen
2019-05-02 20:59               ` Alexander Duyck
2019-05-03 15:14                 ` Lennart Sorensen
2019-05-03 17:19                   ` Alexander Duyck
2019-05-03 20:59                     ` Lennart Sorensen
2019-05-13 16:55                       ` Lennart Sorensen
2019-05-13 19:04                         ` Alexander Duyck
2019-05-14 16:34                           ` Lennart Sorensen
2019-05-16 17:10                             ` Alexander Duyck
2019-05-16 18:34                               ` Lennart Sorensen
2019-05-16 18:37                                 ` Lennart Sorensen
2019-05-16 23:32                                   ` Alexander Duyck
2019-05-17 16:42                                     ` Alexander Duyck
2019-05-17 17:23                                       ` Lennart Sorensen
2019-05-17 22:20                                         ` Alexander Duyck
2019-05-21 15:15                                           ` Lennart Sorensen
2019-05-21 16:51                                             ` Alexander Duyck
2019-05-21 17:54                                               ` Lennart Sorensen
2019-05-21 23:22                                                 ` Alexander Duyck
2019-05-22 14:39                                                   ` Lennart Sorensen
2019-06-07 14:39                                                     ` Lennart Sorensen
2019-06-07 19:32                                                       ` Alexander Duyck
2019-06-07 20:49                                                         ` [E1000-devel] " Hisashi T Fujinaka
2019-06-07 22:08                                                           ` Fujinaka, Todd
2019-06-10 19:01                                                             ` Lennart Sorensen
2020-02-07 21:51                                                         ` Lennart Sorensen
