* Cyclic hardware reset for e1000e
@ 2019-02-18 12:36 Per Oberg
  2019-02-18 12:43 ` Jan Kiszka
  0 siblings, 1 reply; 14+ messages in thread
From: Per Oberg @ 2019-02-18 12:36 UTC (permalink / raw)
  To: xenomai

Hello list

I have this issue where my e1000e network card gets into some kind of cyclic hardware reset during operation. The weird thing is that this only happens when I let systemd start the application. If it's started manually it always works as intended. 

I am running Xenomai 3.0.7 with a linux-4.9.38 kernel and I use the network connection in Linux non-RT mode. I use systemd and NetworkManager.

I do realize that once I get into the reset it will continue resetting because I keep flooding the buffers. My issue is that it -never- happens when I start my process manually, only when systemd starts it. Because the network goes down quite badly I cannot log in and disable the service once it happens and therefore I cannot really try starting it manually after letting the network recover.  

There is some information from Intel in [1] below. It mentions a power management function, the EEPROM, etc. They specifically write:

"82573(V/L/E) TX Unit Hang Messages
Several adapters with the 82573 chipset display "TX unit hang" messages during normal operation with the e1000 driver. The issue appears both with TSO enabled and disabled, and is caused by a power management function that is enabled in the EEPROM. Early releases of the chipsets to vendors had the EEPROM bit that enabled the feature. After the issue was discovered newer adapters were released with the feature disabled in the EEPROM."


I also read something about disabling GRO/TSO/GSO that helped some people. 
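
For reference, these offloads are toggled per interface with `ethtool -K`. A minimal sketch, assuming an interface name passed as an argument (none is given at this point in the thread) and a hypothetical helper called `offload_off_cmd`; it only prints the command so it can be reviewed before being run as root:

```shell
#!/bin/sh
# Hypothetical helper: print the ethtool invocation that disables
# GRO, GSO and TSO on the given interface. Nothing is changed here.
offload_off_cmd() {
    printf 'ethtool -K %s gro off gso off tso off\n' "$1"
}

# Example: show the command for an interface called eth0 (assumed name).
offload_off_cmd eth0
```

Running `ethtool -k <iface>` (lowercase k) lists the current offload settings, which helps confirm whether a change took effect.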

My questions to the list are: 

1. Have you guys any experience with this?
2. Would I be better off using the RTnet drivers?
3. What could cause the issue to trigger only when run by systemd? (I thought about timing issues and NetworkManager, but how do I debug this?)

[1] https://serverfault.com/questions/193114/linux-e1000e-intel-networking-driver-problems-galore-where-do-i-start

Thoughts anyone?

Regards
Per Öberg 



* Re: Cyclic hardware reset for e1000e
  2019-02-18 12:36 Cyclic hardware reset for e1000e Per Oberg
@ 2019-02-18 12:43 ` Jan Kiszka
  2019-02-18 13:08   ` Per Oberg
  0 siblings, 1 reply; 14+ messages in thread
From: Jan Kiszka @ 2019-02-18 12:43 UTC (permalink / raw)
  To: Per Oberg, xenomai

On 18.02.19 13:36, Per Oberg via Xenomai wrote:
> Hello list
> 
> I have this issue where my e1000e network card gets into some kind of cyclic hardware reset during operation. The weird thing is that this only happens when I let systemd start the application. If it's started manually it always works as intended.
> 
> I am running  xenomai 3.0.7 with a linux-4.9.38 kernel and I use the network connection in Linux non-rt mode. I use systemd and NetworkManager.
> 
> I do realize that once I get into the reset it will continue resetting because I keep flooding the buffers. My issue is that it -never- happens when I start my process manually, only when systemd starts it. Because the network goes down quite badly I cannot log in and disable the service once it happens and therefore I cannot really try starting it manually after letting the network recover.
> 
> There is some information from intel in [1] below. There is talk about power management function and EPROM etc. They specifically write:
> 
> "82573(V/L/E) TX Unit Hang Messages
> Several adapters with the 82573 chipset display "TX unit hang" messages during normal operation with the e1000 driver. The issue appears both with TSO enabled and disabled, and is caused by a power management function that is enabled in the EEPROM. Early releases of the chipsets to vendors had the EEPROM bit that enabled the feature. After the issue was discovered newer adapters were released with the feature disabled in the EEPROM."
> 
> 
> I also read something about disabling GRO/TSO/GSO that helped some people.
> 
> My questions to the list are:
> 
> 1. Have you guys any experience with this?
> 2. Would I be better of using the RT Net drivers?
> 3. What could cause the issue to trigger only when run by systemd. (I thought about timing issues and NetworkManager, but how do I debug this?)
> 
> [1] https://serverfault.com/questions/193114/linux-e1000e-intel-networking-driver-problems-galore-where-do-i-start
> 
> Thoughts anyone?

Are you giving Linux enough time to work (no 100% RT domination of any core for 
hundreds of milliseconds or longer)?

Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux



* Re: Cyclic hardware reset for e1000e
  2019-02-18 12:43 ` Jan Kiszka
@ 2019-02-18 13:08   ` Per Oberg
  2019-03-13  8:53     ` Per Oberg
  0 siblings, 1 reply; 14+ messages in thread
From: Per Oberg @ 2019-02-18 13:08 UTC (permalink / raw)
  To: xenomai


----- On 18 Feb 2019, at 13:43, Jan Kiszka jan.kiszka@siemens.com wrote:

> Are you giving Linux enough time to work (no 100% RT domination of any core for
> hundreds of milliseconds or longer)?

I am not sure yet. I have a logging function that reports back to me when I lose samples. Losing samples currently makes the software try to catch up, which means 100% CPU until it does. I do see this being logged around the time of the reset, but I'm not sure if it's much worse than "usual". If the hardware reset happens because Linux gets starved, I can easily see this going cyclic.

Per Öberg 



* Re: Cyclic hardware reset for e1000e
  2019-02-18 13:08   ` Per Oberg
@ 2019-03-13  8:53     ` Per Oberg
  2019-03-13 17:06       ` Cobalt compatible distribution Don Newbold
  2019-03-18  8:29       ` Cyclic hardware reset for e1000e Per Oberg
  0 siblings, 2 replies; 14+ messages in thread
From: Per Oberg @ 2019-03-13  8:53 UTC (permalink / raw)
  To: xenomai



----- On 18 Feb 2019, at 14:08, Per Öberg pero@wolfram.com wrote:

> ----- On 18 Feb 2019, at 13:43, Jan Kiszka jan.kiszka@siemens.com wrote:

> > Are you giving Linux enough time to work (no 100% RT domination of any core for
> > hundreds of milliseconds or longer)?

> I am not sure, yet. I have this logging function for reporting back to me when I
> loose samples. Loosing samples would currently make the software try to catch
> up and this would mean 100% cpu till it does. I do see this being logged around
> the time it resets but I'm not sure if it's much worse than "usual". If for
> some reason the hardware reset happens because linux gets starved I can easily
> see this going cyclic.

> Per Öberg

So, I have managed to do some checking.

It looks like the cyclic resets are about 80-100 seconds apart. 
Before the first reset we are most likely holding the CPUs for about 3-4ms.
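
The spacing between resets can be read off the kernel log. A rough sketch, assuming the log lines carry the usual `[seconds.micros]` timestamp prefix and the e1000e "Reset adapter unexpectedly" message that appears in the trace; `reset_intervals` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: read kernel log lines on stdin and print the number
# of seconds between consecutive "Reset adapter unexpectedly" messages.
reset_intervals() {
    awk '/Reset adapter unexpectedly/ {
        ts = $0
        sub(/^\[ */, "", ts)     # drop the leading "[" and any padding
        sub(/\].*/, "", ts)      # drop everything from the closing "]"
        if (seen) printf "%.1f\n", ts - prev
        prev = ts
        seen = 1
    }'
}

# Typical use (needs permission to read the kernel log):
#   dmesg | reset_intervals
```

With only one reset in the log nothing is printed, since there is no interval to report.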

I managed to get hold of a kernel message saying: 
[...] WARNING: CPU: 0 PID: 3 at net/sched/sch_generic.c:316 dev_watchdog+0x215/0x220
[...] NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out

The full trace is shown below.

One difference that I have found is that I am running with "--cpu-affinity=2,3" when running manually, but not when using systemd to start the program. Can this have an impact?
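
One way to see whether the two start methods really end up on different CPUs is to compare the CPU set the kernel reports for each running instance. A small sketch, assuming a Linux `/proc` filesystem; `allowed_cpus` is a hypothetical helper and the PID is whatever the application runs as:

```shell
#!/bin/sh
# Hypothetical helper: print the CPUs a process is allowed to run on,
# as reported by the kernel. Defaults to the calling process.
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "/proc/${1:-self}/status"
}

# Example: show the CPU list of the current shell.
allowed_cpus "$$"
```

Individual threads can be inspected the same way under `/proc/<pid>/task/<tid>/status`. If the systemd-started instance reports a wider set than the manual one, the `--cpu-affinity=2,3` option could be added to the unit's `ExecStart=` line, or systemd's `CPUAffinity=` setting could be tried; which of the two suits this application better is an open question.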


--------------------  DMESG TRACE -----------------------------------------

[31865.706967] ------------[ cut here ]------------
[31865.706973] WARNING: CPU: 0 PID: 3 at net/sched/sch_generic.c:316 dev_watchdog+0x215/0x220
[31865.706974] NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out
[31865.706974] Modules linked in: iTCO_wdt iTCO_vendor_support ppdev i915 intel_rapl intel_powerclamp coretemp kvm_intel kvm drm_kms_helper irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel drm intel_gtt aesni_intel agpgart aes_x86_64 fb_sys_fops lrw gf128mul glue_helper e1000e ablk_helper syscopyarea cryptd sysfillrect sysimgblt efi_pstore igb xhci_pci psmouse xhci_hcd dca pcspkr i2c_algo_bit serio_raw ptp efivars pps_core xeno_can_peak_pci xeno_can_sja1000 xeno_can i2c_i801 shpchp i2c_smbus hci_uart btbcm btintel bluetooth parport_pc parport pinctrl_sunrisepoint pinctrl_intel i2c_hid tpm_tis tpm_tis_core tpm sch_fq_codel efivarfs ipv6 crc_ccitt
[31865.707329] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.9.38-xenomai+ #6
[31865.707330] Hardware name: Default string Default string/SKYBAY, BIOS 5.11 09/22/2016
[31865.707331] I-pipe domain: Linux
[31865.707333]  ffffc90000033c80 ffffffff813e0324 ffffc90000033cd0 0000000000000000
[31865.707336]  ffffc90000033cc0 ffffffff81054b67 0000013c6dc2eb00 0000000000000000
[31865.707517]  ffff88026048fc80 0000000000000000 ffff88025ed74000 0000000000000001
[31865.707520] Call Trace:
[31865.707524]  [<ffffffff813e0324>] dump_stack+0x96/0xc2
[31865.707526]  [<ffffffff81054b67>] __warn+0xc7/0xf0
[31865.707527]  [<ffffffff81054bda>] warn_slowpath_fmt+0x4a/0x50
[31865.707529]  [<ffffffff81a04be0>] ? dev_graft_qdisc+0x70/0x70
[31865.707568]  [<ffffffff81a04df5>] dev_watchdog+0x215/0x220
[31865.707569]  [<ffffffff81a04be0>] ? dev_graft_qdisc+0x70/0x70
[31865.707571]  [<ffffffff81a04be0>] ? dev_graft_qdisc+0x70/0x70
[31865.707573]  [<ffffffff810a6d47>] call_timer_fn.isra.25+0x17/0x70
[31865.707575]  [<ffffffff810a6e47>] expire_timers+0xa7/0xd0
[31865.707576]  [<ffffffff810a6eec>] run_timer_softirq+0x7c/0x160
[31865.707578]  [<ffffffff81aae546>] ? _raw_spin_unlock_irq+0x16/0x30
[31865.707581]  [<ffffffff810595b6>] __do_softirq+0xe6/0x1e0
[31865.707583]  [<ffffffff810596e2>] run_ksoftirqd+0x32/0x40
[31865.707584]  [<ffffffff81073ff5>] smpboot_thread_fn+0x165/0x230
[31865.707611]  [<ffffffff81073e90>] ? sort_range+0x20/0x20
[31865.707827]  [<ffffffff81070962>] kthread+0xd2/0xf0
[31865.707829]  [<ffffffff81070890>] ? kthread_park+0x60/0x60
[31865.707831]  [<ffffffff81aaed33>] ret_from_fork+0x23/0x30
[31865.707834] ---[ end trace 111a72a07d1d2f26 ]---
[31865.743096] e1000e 0000:00:1f.6 enp0s31f6: Reset adapter unexpectedly
[31867.827820] e1000e: enp0s31f6 NIC Link is Up 100 Mbps Full Duplex, Flow Control: Rx/Tx




Per Öberg 



* Cobalt compatible distribution
  2019-03-13  8:53     ` Per Oberg
@ 2019-03-13 17:06       ` Don Newbold
       [not found]         ` <192645678.5721329.1552685329163@mail.yahoo.com>
  2019-03-18  8:29       ` Cyclic hardware reset for e1000e Per Oberg
  1 sibling, 1 reply; 14+ messages in thread
From: Don Newbold @ 2019-03-13 17:06 UTC (permalink / raw)
  To: xenomai

Xenomai,

I'm tasked with writing an RTDM driver for the Cobalt configuration. 
Unfortunately I'm having a hard time finding a Linux distribution with a 
kernel version supported by Cobalt. Can you help me out?

Don





* Re: Cobalt compatible distribution
       [not found]         ` <192645678.5721329.1552685329163@mail.yahoo.com>
@ 2019-03-15 21:29           ` Alec Ari
       [not found]           ` <cece8f69-d8c5-7165-e918-444398bea154@gmail.com>
  1 sibling, 0 replies; 14+ messages in thread
From: Alec Ari @ 2019-03-15 21:29 UTC (permalink / raw)
  To: Xenomai--- via Xenomai

Hi.

I don't think any distro ships with a Cobalt-enabled kernel. If compiling a kernel is too big a task, don't bother with writing any drivers.

Alec



* Re: Cobalt compatible distribution
       [not found]           ` <cece8f69-d8c5-7165-e918-444398bea154@gmail.com>
@ 2019-03-16  7:44             ` Alec Ari
  2019-03-18 18:00               ` Don Newbold
  0 siblings, 1 reply; 14+ messages in thread
From: Alec Ari @ 2019-03-16  7:44 UTC (permalink / raw)
  To: Don Newbold, Xenomai--- via Xenomai

Hi,

Cobalt enabled vs cobalt supported, what is the difference here? Cobalt is part of Xenomai, you patch the kernel using prepare-kernel.sh and you enable the Cobalt kernel config option via Kconfig menu. If you want to write an RTDM driver, the kernel must be patched and configured appropriately.

The Xenomai/Cobalt stuff is all distro-independent, ipipe and all is kernel space. You won't find anything about Cobalt on distrowatch.

I didn't insult you, I said that if you're looking for a distribution with a Xenomai kernel shipped with it because building from scratch is too big a task, you're better off avoiding writing a driver.

If you're serious about doing this, just build the kernel yourself and work on your driver with whatever distro you want. Some distros/desktop environments offer lower latency than others by the time you're all done, but that's really about it. LXDE/LXQt might give you better scores than let's say GNOME/KDE.

Does this answer your question? If not, let me know what I'm missing as I'm doing my best to help.

Alec



* Re: Cyclic hardware reset for e1000e
  2019-03-13  8:53     ` Per Oberg
  2019-03-13 17:06       ` Cobalt compatible distribution Don Newbold
@ 2019-03-18  8:29       ` Per Oberg
  1 sibling, 0 replies; 14+ messages in thread
From: Per Oberg @ 2019-03-18  8:29 UTC (permalink / raw)
  To: xenomai

----- On 13 Mar 2019, at 9:53, Per Öberg pero@wolfram.com wrote:

> > ----- On 18 Feb 2019, at 13:43, Jan Kiszka jan.kiszka@siemens.com wrote:

> > > Are you giving Linux enough time to work (no 100% RT domination of any core for
> > > hundreds of milliseconds or longer)?

> > I am not sure, yet. I have this logging function for reporting back to me when I
> > loose samples. Loosing samples would currently make the software try to catch
> > up and this would mean 100% cpu till it does. I do see this being logged around
> > the time it resets but I'm not sure if it's much worse than "usual". If for
> > some reason the hardware reset happens because linux gets starved I can easily
> > see this going cyclic.

> > Per Öberg

> So, I have managed to do some checking

> It looks like the cyclic resets are about 80-100 seconds apart.
> Before the first reset we are most likely holding the CPUs for about 3-4ms.

> I managed to get hold of a kernel message saying:
> [...] WARNING: CPU: 0 PID: 3 at net/sched/sch_generic.c:316
> dev_watchdog+0x215/0x220
> [...] NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out

> The full trace is shown below.

> One difference that I have found is that I am running with "--cpu-affinity=2,3"
> when running manually, but not when using systemd to start the program. Can
> this have an impact?

> [... dmesg trace as posted on 2019-03-13, trimmed ...]


Does anyone know what causes:
"NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out"

Is it only me hogging all resources or are there other possibilities? 


Does anyone know if I would benefit from using "--cpu-affinity=2,3" ? My assumption is that perhaps if I schedule stuff on a core that is not used for handling interrupts, remembering the "WARNING: CPU: 0" part of the error, it would somehow help. 
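
Whether CPU 0 is the one servicing the NIC can be checked in `/proc/interrupts`. A sketch, assuming the interrupt is registered under a name containing the interface name (this depends on the driver and interrupt mode, so the pattern may need adjusting); `irq_cpu_counts` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: for every line of an interrupts table that matches
# the given pattern, print the IRQ number and the per-CPU interrupt counts.
# $1 = pattern (e.g. interface name), $2 = file (default /proc/interrupts)
irq_cpu_counts() {
    awk -v pat="$1" '
        NR == 1 { ncpu = NF; next }          # header row: one column per CPU
        $0 ~ pat {
            line = $1
            for (i = 1; i <= ncpu; i++)
                line = line " CPU" (i - 1) "=" $(i + 1)
            print line
        }' "${2:-/proc/interrupts}"
}

# Typical use:
#   irq_cpu_counts enp0s31f6
```

If the counts only grow on CPU 0, `/proc/irq/<n>/smp_affinity_list` shows which CPUs that IRQ may be delivered to. Keeping the real-time load off those CPUs is the idea being asked about here, not something this thread confirms as a fix.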


Per Öberg



* Re: Cobalt compatible distribution
  2019-03-16  7:44             ` Alec Ari
@ 2019-03-18 18:00               ` Don Newbold
       [not found]                 ` <1723459381.6926353.1552936352861@mail.yahoo.com>
  0 siblings, 1 reply; 14+ messages in thread
From: Don Newbold @ 2019-03-18 18:00 UTC (permalink / raw)
  To: Alec Ari, Xenomai--- via Xenomai

Alec,

I do realize the kernel is modified as part of Cobalt installation. This 
is certainly why only specific kernels are supported. Which is why I was 
looking for a distribution which shipped with a supported kernel.

I was neither expecting nor looking to distrowatch for a distribution 
which included Cobalt. All I was looking to distrowatch for was a 
distribution which shipped with a kernel version supported by Cobalt. I 
found none, which is why I went to the community.

In the absence of any such distribution that ships with a supported 
kernel, I will fall back to selecting a distribution whose kernel can 
easily be updated to a Cobalt supported kernel version. At the moment, 
Ubuntu seems a very likely source. If you would suggest a specific 
distribution and version I would be most appreciative.

Thank you for your assistance.

Don

On 3/16/2019 2:44 AM, Alec Ari wrote:
> Hi,
> 
> Cobalt enabled vs cobalt supported, what is the difference here? Cobalt is part of Xenomai, you patch the kernel using prepare-kernel.sh and you enable the Cobalt kernel config option via Kconfig menu. If you want to write an RTDM driver, the kernel must be patched and configured appropriately.
> 
> The Xenomai/Cobalt stuff is all distro-independent, ipipe and all is kernel space. You won't find anything about Cobalt on distrowatch.
> 
> I didn't insult you, I said that if you're looking for a distribution with a Xenomai kernel shipped with it because building from scratch is too big a task, you're better off avoiding writing a driver.
> 
> If you're serious about doing this, just build the kernel yourself and work on your driver with whatever distro you want. Some distros/desktop environments offer lower latency than others by the time you're all done, but that's really about it. LXDE/LXQt might give you better scores than let's say GNOME/KDE.
> 
> Does this answer your question? If not, let me know what I'm missing as I'm doing my best to help.
> 
> Alec
> 



* Re: Cobalt compatible distribution
       [not found]                 ` <1723459381.6926353.1552936352861@mail.yahoo.com>
@ 2019-03-18 19:13                   ` Alec Ari
  2019-03-18 20:59                     ` Don Newbold
  0 siblings, 1 reply; 14+ messages in thread
From: Alec Ari @ 2019-03-18 19:13 UTC (permalink / raw)
  To: Xenomai--- via Xenomai

>All I was looking to distriwatch for was a
>distribution which shipped with a kernel version supported by Cobalt.

Ahhhh, got it!!! Kernel version and distro don't matter at all. People run 4.14 kernels on distros shipped with 2.6.32 and it works fine. As the old saying goes: kernel space always breaks, user space never breaks. Kernel.org does a good job of providing legacy/obsolete calls and functions, so it all stays backwards compatible with old tools, whether it be I2C, ACPI, SCSI, etc.

That being said, I do not know where the rumor that the kernel version must match the distro's kernel version started, but I've been seeing it more lately. That is just propaganda and can safely be ignored.

Alec



* Re: Cobalt compatible distribution
  2019-03-18 19:13                   ` Alec Ari
@ 2019-03-18 20:59                     ` Don Newbold
  2019-03-18 23:42                       ` Alec Ari
  0 siblings, 1 reply; 14+ messages in thread
From: Don Newbold @ 2019-03-18 20:59 UTC (permalink / raw)
  To: Alec Ari, Xenomai--- via Xenomai

Alec,

The source of this info might come from pages such as the below which 
includes kernel version numbers. This is what led me down the path to 
looking for a specific kernel version number.

> https://gitlab.denx.de/Xenomai/ipipe/tags

Even the main ipipe page below gives a kernel version number. Yes, I see 
the note "Interrupt pipeline support for legacy kernel releases (up to 
4.9.x series)". This though, tells me it works only with 4.9.x series 
kernels.

> https://gitlab.denx.de/Xenomai/ipipe

Is there documentation somewhere that makes it clear what kernel 
versions each Cobalt release supports?

Don

On 3/18/2019 2:13 PM, Alec Ari via Xenomai wrote:
>> All I was looking to distriwatch for was a
>> distribution which shipped with a kernel version supported by Cobalt.
> 
> Ahhhh, got it!!! Kernel version and distro don't matter at all. People run 4.14 kernels on distros shipped with 2.6.32 and it works fine. The old saying, kernel space always breaks, user space never breaks. Kernel.org does a good job at providing legacy/obsolete calls+functions so it is all backwards compatible with old tools, whether it be I2C, ACPI SCSI, etc.
> 
> That being said, I do not know where this kernel version/distro kernel version must match rumor started but I've been seeing it more lately. That is just propaganda and can safely be ignored.
> 
> Alec
> 
> 



* Re: Cobalt compatible distribution
  2019-03-18 20:59                     ` Don Newbold
@ 2019-03-18 23:42                       ` Alec Ari
  2019-03-19 16:08                         ` Don Newbold
  0 siblings, 1 reply; 14+ messages in thread
From: Alec Ari @ 2019-03-18 23:42 UTC (permalink / raw)
  To: Don Newbold, Xenomai--- via Xenomai

Oh boy, just when I thought I understood your question.

Xenomai git sources currently support the 4.4 series, 4.9 series, and 4.14.
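
As a quick sanity check before patching, a kernel version string can be compared against the series named above. This is a sketch only: the list reflects this message as of its date, not an authoritative compatibility matrix, and `cobalt_series_ok` is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the version string belongs to one of the
# series mentioned in this message (4.4, 4.9, 4.14), fail otherwise.
cobalt_series_ok() {
    case "$1" in
        4.4|4.4.*|4.9|4.9.*|4.14|4.14.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with the kernel version used elsewhere in this thread.
cobalt_series_ok 4.9.38 && echo "4.9.38: series listed"

# Inside a kernel source tree, the version can be taken from the Makefile:
#   cobalt_series_ok "$(make -s kernelversion)" && echo "series listed"
```

The check applies to the kernel sources being patched, not to the kernel the distribution happens to ship.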


If you want 4.14 support, use this one:
https://gitlab.denx.de/Xenomai/ipipe-x86

Then follow this guide, starting at "Installing the Cobalt core:"

https://gitlab.denx.de/Xenomai/xenomai/wikis/Installing_Xenomai_3

Basically the steps are, for 4.14:

$ git clone https://gitlab.denx.de/Xenomai/ipipe-x86
$ git clone https://gitlab.denx.de/Xenomai/xenomai


$ cd ipipe-x86 && bash ../xenomai/scripts/prepare-kernel.sh --arch=x86
$ make menuconfig

Enable IPIPE, Cobalt core, etc. configure to your needs. I'm going off memory here but I'm pretty sure that's it, kernel side anyway.

When you're done:

$ cd ../xenomai && ./scripts/bootstrap && ./configure && make && sudo make install

For configure parameters:

$ ./configure --help


Alright, am I missing anything?


Alec



* Re: Cobalt compatible distribution
  2019-03-18 23:42                       ` Alec Ari
@ 2019-03-19 16:08                         ` Don Newbold
  2019-03-20 18:43                           ` Alec Ari
  0 siblings, 1 reply; 14+ messages in thread
From: Don Newbold @ 2019-03-19 16:08 UTC (permalink / raw)
  To: Alec Ari, Xenomai--- via Xenomai

Alec,

That'll get me going. Thank you.

Don

On 3/18/2019 6:42 PM, Alec Ari wrote:
> Oh boy, just when I thought I understood your question.
> 
> Xenomai git sources currently supports the 4.4 series, 4.9 series, and 4.14.
> 
> 
> If you want 4.14 support, use this one:
> https://gitlab.denx.de/Xenomai/ipipe-x86
> 
> Then follow this guide, starting at "Installing the Cobalt core:"
> 
> https://gitlab.denx.de/Xenomai/xenomai/wikis/Installing_Xenomai_3
> 
> Basically the steps are, for 4.14:
> 
> $ git clone https://gitlab.denx.de/Xenomai/ipipe-x86
> $ git clone https://gitlab.denx.de/Xenomai/xenomai
> 
> 
> $ cd ipipe-x86 && bash ../xenomai/scripts/prepare-kernel.sh --arch=x86
> $ make menuconfig
> 
> Enable IPIPE, Cobalt core, etc. configure to your needs. I'm going off memory here but I'm pretty sure that's it, kernel side anyway.
> 
> When you're done:
> 
> $ cd xenomai && ./scripts/bootstrap && ./configure && make && sudo make install
> 
> For configure parameters:
> 
> $ ./configure --help
> 
> 
> Alright, am I missing anything?
> 
> 
> Alec
> 



* Re: Cobalt compatible distribution
  2019-03-19 16:08                         ` Don Newbold
@ 2019-03-20 18:43                           ` Alec Ari
  0 siblings, 0 replies; 14+ messages in thread
From: Alec Ari @ 2019-03-20 18:43 UTC (permalink / raw)
  To: Don Newbold, Xenomai--- via Xenomai

Hi Don,

Glad I could help. Sorry for all the confusion!

Alec


