Cyclic hardware reset for e1000e

* Cyclic hardware reset for e1000e
@ 2019-02-18 12:36 Per Oberg
  2019-02-18 12:43 ` Jan Kiszka
From: Per Oberg @ 2019-02-18 12:36 UTC
  To: xenomai

Hello list

I have an issue where my e1000e network card gets into some kind of cyclic hardware reset during operation. The weird thing is that this only happens when I let systemd start the application. If it's started manually, it always works as intended.

I am running Xenomai 3.0.7 with a Linux 4.9.38 kernel, and I use the network connection in Linux non-RT mode. I use systemd and NetworkManager.

I do realize that once the card gets into the reset it will keep resetting, because I keep flooding the buffers. My issue is that it -never- happens when I start my process manually, only when systemd starts it. Because the network goes down quite badly, I cannot log in and disable the service once it happens, and therefore I cannot really try starting it manually after letting the network recover.

There is some information from Intel in [1] below. It talks about a power management function, the EEPROM, etc. They specifically write:

"82573(V/L/E) TX Unit Hang Messages
Several adapters with the 82573 chipset display "TX unit hang" messages during normal operation with the e1000 driver. The issue appears both with TSO enabled and disabled, and is caused by a power management function that is enabled in the EEPROM. Early releases of the chipsets to vendors had the EEPROM bit that enabled the feature. After the issue was discovered newer adapters were released with the feature disabled in the EEPROM."

I also read that disabling GRO/TSO/GSO helped some people.
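For reference, a minimal sketch of what disabling those offloads could look like, assuming the standard ethtool utility is installed and the interface is called eth0 (the interface name and the helper function names are placeholders, not taken from my setup). I have not verified that this cures the reset, it only shows the mechanism:

```python
import subprocess


def offload_off_command(interface, features=("gro", "tso", "gso")):
    """Build the ethtool argv that switches the given offloads off."""
    cmd = ["ethtool", "-K", interface]
    for feature in features:
        cmd += [feature, "off"]
    return cmd


def disable_offloads(interface):
    # Needs root and the ethtool binary; raises CalledProcessError on failure.
    subprocess.run(offload_off_command(interface), check=True)


if __name__ == "__main__":
    # "eth0" is an assumed interface name; only print the command here.
    print(" ".join(offload_off_command("eth0")))
```

Running the printed command by hand before the service starts would be one way to test whether the offloads matter at all.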

My questions to the list are: 

1. Do any of you have experience with this?
2. Would I be better off using the RTnet drivers?
3. What could cause the issue to trigger only when run by systemd? (I thought about timing issues and NetworkManager, but how do I debug this?)
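On question 3, one thing that could be checked is whether the process sees a different environment when systemd starts it. Below is a rough sketch, assuming both variants can be observed at some point (for example on a test machine) and that their PIDs are known. It relies on the Linux /proc/<pid>/environ file, which holds NUL-separated KEY=VALUE entries; the function names are made up for illustration, and reading another user's process needs root:

```python
def parse_environ(raw):
    """Parse the NUL-separated KEY=VALUE format of /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def read_environ(pid):
    """Read and parse the environment of a running process (Linux only)."""
    with open(f"/proc/{pid}/environ", "rb") as f:
        return parse_environ(f.read())


def diff_settings(manual, service):
    """Return {key: (manual_value, service_value)} for every differing key."""
    keys = set(manual) | set(service)
    return {
        key: (manual.get(key), service.get(key))
        for key in sorted(keys)
        if manual.get(key) != service.get(key)
    }
```

Calling diff_settings(read_environ(pid_manual), read_environ(pid_service)) would list the variables that differ. This only covers the environment; start-up ordering relative to NetworkManager would have to be checked separately.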

[1] https://serverfault.com/questions/193114/linux-e1000e-intel-networking-driver-problems-galore-where-do-i-start

Thoughts anyone?

Regards
Per Öberg 
