ixgbe: driver drops packets routed from an IPSec interface with a "bad sa

netdev.vger.kernel.org archive mirror

* ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
@ 2019-09-06 18:13 Michael Marley
  2019-09-09 18:21 ` Shannon Nelson
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Marley @ 2019-09-06 18:13 UTC (permalink / raw)
  To: netdev

(This is also reported at 
https://bugzilla.kernel.org/show_bug.cgi?id=204551, but it was 
recommended that I send it to this list as well.)

I have put together a router that routes traffic from several local 
subnets from a switch attached to an i82599ES card through an IPSec VPN 
interface set up with StrongSwan.  (The VPN is running on an unrelated 
second interface with a different driver.)  Traffic from the local 
interfaces to the VPN works as it should and eventually makes it through 
the VPN server and out to the Internet.  The return traffic makes it 
back to the router and tcpdump shows it leaving by the i82599, but the 
traffic never actually makes it onto the wire and I instead get one of

enp1s0: ixgbe_ipsec_tx: bad sa_idx=64512 handle=0

for each packet that should be transmitted.  (The sa_idx and handle 
values are always the same.)

I realized this was probably related to ixgbe's IPSec offloading 
feature, so I tried with the motherboard's integrated e1000e device and 
didn't have the problem.  I tried using ethtool to disable all the 
IPSec-related offloads (tx-esp-segmentation, esp-hw-offload, 
esp-tx-csum-hw-offload), but the problem persisted.  I then tried 
recompiling the kernel with CONFIG_IXGBE_IPSEC=n and that worked around 
the problem.

I was also able to find another instance of the same problem reported in 
Debian at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=930443.  
That person seems to be having exactly the same issue as me, down to the 
sa_idx and handle values being the same.

If there are any more details I can provide to make this easier to track 
down, please let me know.

Thanks,

Michael Marley


* Re: ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
  2019-09-06 18:13 ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error Michael Marley
@ 2019-09-09 18:21 ` Shannon Nelson
  2019-09-09 18:45   ` Michael Marley
  0 siblings, 1 reply; 9+ messages in thread
From: Shannon Nelson @ 2019-09-09 18:21 UTC (permalink / raw)
  To: Michael Marley, netdev, Jeff Kirsher, steffen.klassert

Hi Michael,

Thanks for pointing this out.  The error message is complaining that 
the handle given to the driver is a bad value.  The handle is what helps 
the driver find the right encryption information; in this case it is an 
index into an array, one array for Rx and one for Tx, each of which has 
up to 1024 entries.  In order to encode both into a single value, 1024 is 
added to the Tx index to make the handle, and 1024 is subtracted to 
recover the index later.  Note that the bad sa_idx is 64512, which is 
-1024 when viewed as a 16-bit value; if the Tx handle given to ixgbe for 
xmit is 0, we subtract 1024 from that and get this bad sa_idx value.
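
The arithmetic described above can be sketched in user-space C.  This is 
only an illustration under stated assumptions: the helper names and the 
SA_TABLE_SIZE constant are hypothetical and are not the driver's actual 
identifiers, and the index is assumed to be held in a 16-bit unsigned 
variable, which is what makes 0 - 1024 show up as 64512.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed table size, taken from the "up to 1024 entries" description. */
#define SA_TABLE_SIZE 1024u

/* Hypothetical helper: encode a Tx table index as a handle by adding 1024. */
static unsigned long tx_index_to_handle(uint16_t idx)
{
	assert(idx < SA_TABLE_SIZE);	/* only valid table slots get a handle */
	return (unsigned long)idx + SA_TABLE_SIZE;
}

/* Hypothetical helper: recover the index by subtracting 1024.
 * The result is truncated to 16 bits, so a handle of 0 wraps around.
 */
static uint16_t tx_handle_to_index(unsigned long handle)
{
	return (uint16_t)(handle - SA_TABLE_SIZE);
}

/* An index is usable only if it falls inside the table. */
static int tx_index_is_valid(uint16_t idx)
{
	return idx < SA_TABLE_SIZE;
}
```

With these helpers, a handle built from index 5 decodes back to 5, while 
a handle of 0 (one that was never set up for Tx offload) decodes to 
64512, which fails the range check.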

That handle is supposed to be an opaque value only used by the driver.  
It looks to me like either (a) the driver is not setting up the handle 
correctly when the SA is first set up, or (b) something in the upper 
levels of the ipsec code is clearing the handle value. We would need to 
know more about all the bits in your SA setup to have a better idea of 
what parts of the ipsec code are being exercised when this problem happens.

I currently don't have access to a good ixgbe setup on which to 
test/debug this, and I haven't been paying much attention lately to 
what's happening in the upper ipsec layers, so my help will be somewhat 
limited.  I'm hoping the Intel folks can add a little help, so I've 
copied Jeff Kirsher on this (they'll probably point back to me since I 
wrote this chunk :-) ).  I've also copied Steffen Klassert for his ipsec 
thoughts.

In the meantime, can you give more details on the exact ipsec rules that 
are used here, and are there any error messages coming from ixgbe 
regarding the ipsec rule setup that might help us identify what's happening?

Thanks,
sln


* Re: ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
  2019-09-09 18:21 ` Shannon Nelson
@ 2019-09-09 18:45   ` Michael Marley
  2019-09-10 21:43     ` Shannon Nelson
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Marley @ 2019-09-09 18:45 UTC (permalink / raw)
  To: Shannon Nelson; +Cc: netdev, Jeff Kirsher, steffen.klassert

Hi Shannon,

Thanks for your response!  I apologize; I am a bit of a newbie to IPSec 
myself, so I'm not 100% sure of the best way to provide the 
information you need, but here is the (slightly redacted) output of 
swanctl --list-sas, first from the server and then from the client:

<servername>: #24, ESTABLISHED, IKEv2, 3cb75c180ee5dc68_i cc7dae551b603bb7_r*
   local  '<serverip>' @ <serverip>[4500]
   remote '<clientip>' @ <clientip>[4500]
   AES_GCM_16-256/PRF_HMAC_SHA2_512/ECP_384
   established 174180s ago
   <servername>: #110, reqid 12, INSTALLED, TUNNEL-in-UDP, ESP:AES_GCM_16-256/ECP_384
     installed 469s ago
     in  c51a0f11 (-|0x00000064), 1548864 bytes, 19575 packets,     6s ago
     out c3bd9741 (-|0x00000064), 23618807 bytes, 22865 packets,     7s ago
     local  0.0.0.0/0 ::/0
     remote 0.0.0.0/0 ::/0

<clientname>: #1, ESTABLISHED, IKEv2, 3cb75c180ee5dc68_i* cc7dae551b603bb7_r
   local  '<clientip>' @ <clientip>[4500]
   remote '<serverip>' @ <serverip>[4500]
   AES_GCM_16-256/PRF_HMAC_SHA2_512/ECP_384
   established 174013s ago
   <clientname>: #54, reqid 1, INSTALLED, TUNNEL-in-UDP, ESP:AES_GCM_16-256/ECP_384
     installed 303s ago, rekeying in 2979s, expires in 3657s
     in  c3bd9741 (-|0x00000064), 23178523 bytes, 20725 packets,     0s ago
     out c51a0f11 (-|0x00000064), 1429124 bytes, 17719 packets,     0s ago
     local  0.0.0.0/0 ::/0
     remote 0.0.0.0/0 ::/0

It might also be worth mentioning that I am using an xfrm interface to 
do "regular" routing rather than the policy-based routing that 
StrongSwan/IPSec normally uses. If there is anything else that would 
help more, I would be happy to provide it.

Just to be clear though, I'm not trying to run IPSec on the ixgbe 
interface at all.  The ixgbe adapter is being used to connect the router 
to the switch on the LAN side of the network.  IPSec is running on the 
WAN interface without any hardware acceleration (besides AES-NI).  The 
problem occurs when a computer on the LAN tries to access the WAN.  The 
outgoing packets work as expected and the incoming packets are routed 
back out through the ixgbe device toward the LAN client, but the driver 
drops the packets with the sa_idx error.

I hope this helps.

Thanks,

Michael


* Re: ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
  2019-09-09 18:45   ` Michael Marley
@ 2019-09-10 21:43     ` Shannon Nelson
  2019-09-10 22:53       ` Michael Marley
  0 siblings, 1 reply; 9+ messages in thread
From: Shannon Nelson @ 2019-09-10 21:43 UTC (permalink / raw)
  To: Michael Marley; +Cc: netdev, Jeff Kirsher, steffen.klassert

I'm not familiar with StrongSwan and its configurations, but I'm 
guessing that if you didn't expressly enable it, perhaps StrongSwan 
enabled the ipsec offload capability.  I would suggest turning it off to 
at least get you past the immediate issue.  If there isn't an obvious 
configuration knob in StrongSwan, perhaps you can at least use ethtool 
to disable the offload, which should be off by default anyway.

You can check it with "ethtool -k ethX | grep esp-hw-offload" and see if 
it is set.  You can disable it with "ethtool -K ethX esp-hw-offload off"

Meanwhile, can you please send the output of the following commands:
uname -a
ip xfrm s
ip xfrm p
dmesg | grep ixgbe

Please also include any /var/log/syslog or /var/log/messages entries that 
look suspicious and might give more insight into what's happening.

Thanks,
sln



* Re: ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
  2019-09-10 21:43     ` Shannon Nelson
@ 2019-09-10 22:53       ` Michael Marley
  2019-09-11  6:15         ` Steffen Klassert
  0 siblings, 1 reply; 9+ messages in thread
From: Michael Marley @ 2019-09-10 22:53 UTC (permalink / raw)
  To: Shannon Nelson; +Cc: netdev, Jeff Kirsher, steffen.klassert

StrongSwan has hardware offload disabled by default, and I didn't enable
it explicitly.  I also already tried turning off all those switches with
ethtool and it had no effect.  This doesn't surprise me though, because
as I said, I don't actually have the IPSec connection running over the
ixgbe device.  The IPSec connection runs over another network adapter
that doesn't support IPSec offload at all.  The problem comes when
traffic received over the IPSec interface is then routed back out
(unencrypted) through the ixgbe device into the local network.

Here is the rest of the data for which you asked:

michael@soapstone:~$ uname -a
Linux soapstone 5.3.0-10-generic #11-Ubuntu SMP Mon Sep 9 15:12:17 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
michael@soapstone:~$ sudo ip xfrm s
src <srcip> dst <dstip>
        proto esp spi 0xcf6f90d3 reqid 1 mode tunnel
        replay-window 0 flag af-unspec
        aead rfc4106(gcm(aes)) 0x254c6298b27ad65f61387c39e060698db777a335081d145ca6706d65b6be95770d2622b4 128
        encap type espinudp sport 4500 dport 4500 addr 0.0.0.0
        anti-replay context: seq 0x0, oseq 0xaaac, bitmap 0x00000000
        if_id 0x64
src <dstip> dst <srcip>
        proto esp spi 0xc6e02140 reqid 1 mode tunnel
        replay-window 32 flag af-unspec
        aead rfc4106(gcm(aes)) 0x05473bd76e1b7268b54825b019d19c13a360193bc9aa20137204ea566409356da47fc7d7 128
        encap type espinudp sport 4500 dport 4500 addr 0.0.0.0
        anti-replay context: seq 0xab11, oseq 0x0, bitmap 0xffffffff
        if_id 0x64
michael@soapstone:~$ sudo ip xfrm p
src ::/0 dst ::/0
        dir out priority 399999
        tmpl src <srcip> dst <dstip>
                proto esp spi 0xcf6f90d3 reqid 1 mode tunnel
        if_id 0x64
src 0.0.0.0/0 dst 0.0.0.0/0
        dir out priority 399999
        tmpl src <srcip> dst <dstip>
                proto esp spi 0xcf6f90d3 reqid 1 mode tunnel
        if_id 0x64
src ::/0 dst ::/0
        dir fwd priority 399999
        tmpl src <dstip> dst <srcip>
                proto esp reqid 1 mode tunnel
        if_id 0x64
src ::/0 dst ::/0
        dir in priority 399999
        tmpl src <dstip> dst <srcip>
                proto esp reqid 1 mode tunnel
        if_id 0x64
src 0.0.0.0/0 dst 0.0.0.0/0
        dir fwd priority 399999
        tmpl src <dstip> dst <srcip>
                proto esp reqid 1 mode tunnel
        if_id 0x64
src 0.0.0.0/0 dst 0.0.0.0/0
        dir in priority 399999
        tmpl src <dstip> dst <srcip>
                proto esp reqid 1 mode tunnel
        if_id 0x64
src 0.0.0.0/0 dst 0.0.0.0/0
        socket in priority 0
src 0.0.0.0/0 dst 0.0.0.0/0
        socket out priority 0
src 0.0.0.0/0 dst 0.0.0.0/0
        socket in priority 0
src 0.0.0.0/0 dst 0.0.0.0/0
        socket out priority 0
src ::/0 dst ::/0
        socket in priority 0
src ::/0 dst ::/0
        socket out priority 0
src ::/0 dst ::/0
        socket in priority 0
src ::/0 dst ::/0
        socket out priority 0
michael@soapstone:~$ dmesg | grep -i ixgbe
[    0.780400] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 5.1.0-k
[    0.781606] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[    0.954093] ixgbe 0000:01:00.0: Multiqueue Enabled: Rx Queue count = 8, Tx Queue count = 8 XDP Queue count = 0
[    0.955081] ixgbe 0000:01:00.0: 32.000 Gb/s available PCIe bandwidth (5 GT/s x8 link)
[    0.955860] ixgbe 0000:01:00.0: MAC: 2, PHY: 14, SFP+: 3, PBA No: Unknown
[    0.956519] ixgbe 0000:01:00.0: 00:1b:21:c0:00:1e
[    0.958079] ixgbe 0000:01:00.0: Intel(R) 10 Gigabit Network Connection
[    0.958884] libphy: ixgbe-mdio: probed
[    0.960220] ixgbe 0000:01:00.0 enp1s0: renamed from eth0
[    2.788208] ixgbe 0000:01:00.0: registered PHC device on enp1s0
[    2.966290] ixgbe 0000:01:00.0 enp1s0: detected SFP+: 3
[    3.110132] ixgbe 0000:01:00.0 enp1s0: NIC Link is Up 10 Gbps, Flow Control: RX/TX

Thanks,

Michael




* Re: ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
  2019-09-10 22:53       ` Michael Marley
@ 2019-09-11  6:15         ` Steffen Klassert
  2019-09-11  7:17           ` Shannon Nelson
  2019-09-11 14:50           ` Michael Marley
  0 siblings, 2 replies; 9+ messages in thread
From: Steffen Klassert @ 2019-09-11  6:15 UTC (permalink / raw)
  To: Michael Marley; +Cc: Shannon Nelson, netdev, Jeff Kirsher

On Tue, Sep 10, 2019 at 06:53:30PM -0400, Michael Marley wrote:
> 
> StrongSwan has hardware offload disabled by default, and I didn't enable
> it explicitly.  I also already tried turning off all those switches with
> ethtool and it has no effect.  This doesn't surprise me though, because
> as I said, I don't actually have the IPSec connection running over the
> ixgbe device.  The IPSec connection runs over another network adapter
> that doesn't support IPSec offload at all.  The problem comes when
> traffic received over the IPSec interface is then routed back out
> (unencrypted) through the ixgbe device into the local network.


Seems like the ixgbe driver tries to use the sec_path
from RX to set up an offload at the TX side.

Can you please try this (completely untested) patch?

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 9bcae44e9883..ae31bd57127c 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -36,6 +36,7 @@
 #include <net/vxlan.h>
 #include <net/mpls.h>
 #include <net/xdp_sock.h>
+#include <net/xfrm.h>
 
 #include "ixgbe.h"
 #include "ixgbe_common.h"
@@ -8696,7 +8697,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
 #endif /* IXGBE_FCOE */
 
 #ifdef CONFIG_IXGBE_IPSEC
-	if (secpath_exists(skb) &&
+	if (xfrm_offload(skb) &&
 	    !ixgbe_ipsec_tx(tx_ring, first, &ipsec_tx))
 		goto out_drop;
 #endif


* Re: ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
  2019-09-11  6:15         ` Steffen Klassert
@ 2019-09-11  7:17           ` Shannon Nelson
  2019-09-11 14:50           ` Michael Marley
  1 sibling, 0 replies; 9+ messages in thread
From: Shannon Nelson @ 2019-09-11  7:17 UTC (permalink / raw)
  To: Steffen Klassert, Michael Marley; +Cc: netdev, Jeff Kirsher

On 9/10/19 11:15 PM, Steffen Klassert wrote:
> On Tue, Sep 10, 2019 at 06:53:30PM -0400, Michael Marley wrote:
>> StrongSwan has hardware offload disabled by default, and I didn't enable
>> it explicitly.  I also already tried turning off all those switches with
>> ethtool and it has no effect.  This doesn't surprise me though, because
>> as I said, I don't actually have the IPSec connection running over the
>> ixgbe device.  The IPSec connection runs over another network adapter
>> that doesn't support IPSec offload at all.  The problem comes when
>> traffic received over the IPSec interface is then routed back out
>> (unencrypted) through the ixgbe device into the local network.
>
> Seems like the ixgbe driver tries to use the sec_path
> from RX to set up an offload at the TX side.
>
> Can you please try this (completely untested) patch?
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 9bcae44e9883..ae31bd57127c 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -36,6 +36,7 @@
>   #include <net/vxlan.h>
>   #include <net/mpls.h>
>   #include <net/xdp_sock.h>
> +#include <net/xfrm.h>
>   
>   #include "ixgbe.h"
>   #include "ixgbe_common.h"
> @@ -8696,7 +8697,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>   #endif /* IXGBE_FCOE */
>   
>   #ifdef CONFIG_IXGBE_IPSEC
> -	if (secpath_exists(skb) &&
> +	if (xfrm_offload(skb) &&
>   	    !ixgbe_ipsec_tx(tx_ring, first, &ipsec_tx))
>   		goto out_drop;
>   #endif

Thanks, Steffen, that looks right to me.

sln




* Re: ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
  2019-09-11  6:15         ` Steffen Klassert
  2019-09-11  7:17           ` Shannon Nelson
@ 2019-09-11 14:50           ` Michael Marley
  2019-09-11 18:45             ` Jeff Kirsher
  1 sibling, 1 reply; 9+ messages in thread
From: Michael Marley @ 2019-09-11 14:50 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Shannon Nelson, netdev, Jeff Kirsher

On 2019-09-11 02:15, Steffen Klassert wrote:
> On Tue, Sep 10, 2019 at 06:53:30PM -0400, Michael Marley wrote:
>> 
>> StrongSwan has hardware offload disabled by default, and I didn't enable
>> it explicitly.  I also already tried turning off all those switches with
>> ethtool and it has no effect.  This doesn't surprise me though, because
>> as I said, I don't actually have the IPSec connection running over the
>> ixgbe device.  The IPSec connection runs over another network adapter
>> that doesn't support IPSec offload at all.  The problem comes when
>> traffic received over the IPSec interface is then routed back out
>> (unencrypted) through the ixgbe device into the local network.
> 
> 
> Seems like the ixgbe driver tries to use the sec_path
> from RX to set up an offload at the TX side.
> 
> Can you please try this (completely untested) patch?
> 
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 9bcae44e9883..ae31bd57127c 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -36,6 +36,7 @@
>  #include <net/vxlan.h>
>  #include <net/mpls.h>
>  #include <net/xdp_sock.h>
> +#include <net/xfrm.h>
> 
>  #include "ixgbe.h"
>  #include "ixgbe_common.h"
> @@ -8696,7 +8697,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
>  #endif /* IXGBE_FCOE */
> 
>  #ifdef CONFIG_IXGBE_IPSEC
> -	if (secpath_exists(skb) &&
> +	if (xfrm_offload(skb) &&
>  	    !ixgbe_ipsec_tx(tx_ring, first, &ipsec_tx))
>  		goto out_drop;
>  #endif
With the patch, the problem is gone.  Thanks!

Michael


* Re: ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error
  2019-09-11 14:50           ` Michael Marley
@ 2019-09-11 18:45             ` Jeff Kirsher
  0 siblings, 0 replies; 9+ messages in thread
From: Jeff Kirsher @ 2019-09-11 18:45 UTC (permalink / raw)
  To: Michael Marley, Steffen Klassert; +Cc: Shannon Nelson, netdev

On Wed, 2019-09-11 at 10:50 -0400, Michael Marley wrote:
> On 2019-09-11 02:15, Steffen Klassert wrote:
> > On Tue, Sep 10, 2019 at 06:53:30PM -0400, Michael Marley wrote:
> > > StrongSwan has hardware offload disabled by default, and I didn't enable
> > > it explicitly.  I also already tried turning off all those switches with
> > > ethtool and it has no effect.  This doesn't surprise me though, because
> > > as I said, I don't actually have the IPSec connection running over the
> > > ixgbe device.  The IPSec connection runs over another network adapter
> > > that doesn't support IPSec offload at all.  The problem comes when
> > > traffic received over the IPSec interface is then routed back out
> > > (unencrypted) through the ixgbe device into the local network.
> > 
> > Seems like the ixgbe driver tries to use the sec_path
> > from RX to set up an offload at the TX side.
> > 
> > Can you please try this (completely untested) patch?

Steffen, can you send your patch to the intel-wired-lan@lists.osuosl.org
mailing list?

> > diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > index 9bcae44e9883..ae31bd57127c 100644
> > --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> > @@ -36,6 +36,7 @@
> >  #include <net/vxlan.h>
> >  #include <net/mpls.h>
> >  #include <net/xdp_sock.h>
> > +#include <net/xfrm.h>
> > 
> >  #include "ixgbe.h"
> >  #include "ixgbe_common.h"
> > @@ -8696,7 +8697,7 @@ netdev_tx_t ixgbe_xmit_frame_ring(struct sk_buff *skb,
> >  #endif /* IXGBE_FCOE */
> > 
> >  #ifdef CONFIG_IXGBE_IPSEC
> > -	if (secpath_exists(skb) &&
> > +	if (xfrm_offload(skb) &&
> >  	    !ixgbe_ipsec_tx(tx_ring, first, &ipsec_tx))
> >  		goto out_drop;
> >  #endif
> With the patch, the problem is gone.  Thanks!
> 
> Michael



Thread overview: 9+ messages
2019-09-06 18:13 ixgbe: driver drops packets routed from an IPSec interface with a "bad sa_idx" error Michael Marley
2019-09-09 18:21 ` Shannon Nelson
2019-09-09 18:45   ` Michael Marley
2019-09-10 21:43     ` Shannon Nelson
2019-09-10 22:53       ` Michael Marley
2019-09-11  6:15         ` Steffen Klassert
2019-09-11  7:17           ` Shannon Nelson
2019-09-11 14:50           ` Michael Marley
2019-09-11 18:45             ` Jeff Kirsher
