* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5
       [not found] <d1530cba-1a72-cae8-6a04-ed8ec0f82e6e@gmail.com>
@ 2023-01-19 10:17 ` Paul Menzel
  2023-01-19 10:22   ` [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network performance " Paul Menzel
  2023-01-19 12:24   ` [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance " Bartek Kois
  0 siblings, 2 replies; 14+ messages in thread
From: Paul Menzel @ 2023-01-19 10:17 UTC (permalink / raw)
  To: Bartek Kois; +Cc: intel-wired-lan, regressions

#regzbot ^introduced: 4.9.88..5.10.149

Dear Bartek,


On 14.01.23 at 11:23, Bartek Kois wrote:

> After moving from Debian 9.7 to 11.5 as soon as I perform "ip link set
> enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel 82599EN based 10G
> adapter) I am experiencing high cpu load (even if no traffic is passing
> through the adapter) and network performance is low (when network is
> connected).

How do you test the network performance? Please give exact numbers for
comparison.

> The cpu load is oscillating between 0.1 and 0.3 on vanilla system
> with no network attached. The problem can be observed on the
> following platforms: Supermicro X9SCL (Intel C202 PCH) and
> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the Supermicro
> X11SSL-F (Intel® C232 chipset) everything is working well.
> > Tested environments: > Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 > (2018-05-07) x86_64 GNU/Linux [all platforms working well with no > problems: Supermicro X9SCL (Intel C202 PCH), Supermicro X10SLL+-F (Intel > C222 Express PCH), Supermicro X11SSL-F (Intel® C232 chipset)] > Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 > (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro X9SCL (Intel > C202 PCH), Supermicro X10SLL+-F (Intel C222 Express PCH) behave > problematic as described above | newer platform: Supermicro X11SSL-F > (Intel® C232 chipset) working well with no problems] Maybe create a bug at the Linux kernel bug tracker [1], where you can attach all the logs (`dmesg`, `lspci -nnk -s …`, …). > So far to solve the problem I was trying to upgrade system to the newest > stable version, upgrade kernel to version 6.x, upgrade ixgbe driver to > the newest version but with no luck. Thank you for checking that. Too bad it’s still present. To rule out some user space problem, could you test Debian 9.7 with a stable Linux release, currently 6.1.7? What does `sudo perf top --sort comm,dso` show, where the time is spent? > Supermicro support suggested as follows: > it might be kernel related debian 11.5 has kernel 5.10 which is a > recent kernel it might not properly support the chipsets for X9 > therefore i suggest to use RHEL or CentOS as they use much older kernel > versions. I expect that with ubuntu 20.04 you see the same problem it > uses kernel 5.4 Testing another GNU/Linux distribution for another data point, might be a good idea. As nobody has responded yet, bisecting the issue is probably the fastest way to get to the bottom of this. Luckily the problem seems reproducible and you seem to be able to build a Linux kernel yourself, so that should work. (For testing purposes you could also test with Ubuntu, as they provide Linux kernel builds for (almost) all releases in their Linux kernel mainline PPA [2].) 
Kind regards, Paul [1]: https://bugzilla.kernel.org/ [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/ ^ permalink raw reply [flat|nested] 14+ messages in thread
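
One way to produce the exact numbers requested above is to sample the aggregate `cpu` line of `/proc/stat` twice and compare the counters, which also shows how much of the time went to softirq context, where network receive processing runs. What follows is only a minimal Python sketch, not a tool used in this thread; the function names (`parse_cpu_line`, `cpu_shares`, `sample`) and the sampling interval are illustrative assumptions.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat, as described in proc(5).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest", "guest_nice")


def parse_cpu_line(line):
    """Turn a line such as 'cpu  1 2 3 ...' into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return dict(zip(FIELDS, (int(v) for v in parts[1:])))


def cpu_shares(before, after):
    """Return (busy, softirq) as fractions of the time between two samples."""
    delta = {key: after[key] - before[key] for key in before}
    # guest and guest_nice are already accounted for in user and nice.
    total = sum(v for k, v in delta.items() if k not in ("guest", "guest_nice"))
    if total <= 0:
        return 0.0, 0.0
    idle = delta["idle"] + delta["iowait"]
    return (total - idle) / total, delta["softirq"] / total


def sample(interval, path="/proc/stat"):
    """Measure the busy and softirq shares over `interval` seconds (hypothetical helper)."""
    with open(path) as f:
        first = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_line(f.readline())
    return cpu_shares(first, second)
```

Running something like `sample(10)` on an otherwise idle machine, once with the link down and once after `ip link set enp1s0 up`, would give two pairs of figures that can be compared directly between the 4.9 and 5.10 kernels.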
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network performance after moving to Debian 11.5 2023-01-19 10:17 ` [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5 Paul Menzel @ 2023-01-19 10:22 ` Paul Menzel 2023-01-19 12:24 ` [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance " Bartek Kois 1 sibling, 0 replies; 14+ messages in thread From: Paul Menzel @ 2023-01-19 10:22 UTC (permalink / raw) To: Bartek Kois; +Cc: intel-wired-lan, regressions Dear Bartek, Am 19.01.23 um 11:17 schrieb Paul Menzel: > #regzbot ^introduced: 4.9.88..5.10.149 > Am 14.01.23 um 11:23 schrieb Bartek Kois: > >> After moving from Debian 9.7 to 11.5 as soon as I perform "ip link set >> enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel 82599EN based 10G >> adapter) I am experiencing high cpu load (even if no traffic is >> passing through the adapter) and network performance is low (when >> network is connected). > > How do you test the network performance? Please give exact numbers for > comparison. > >> The cpu load is oscillating between 0.1 and 0.3 on vanilla system >> with no network attached. The problem can be observed on the following >> platforms: Supermicro X9SCL (Intel C202 PCH) and >> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the Supermicro >> X11SSL-F (Intel® C232 chipset) everything is working well. 
>> >> Tested environments: >> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 >> (2018-05-07) x86_64 GNU/Linux [all platforms working well with no >> problems: Supermicro X9SCL (Intel C202 PCH), Supermicro X10SLL+-F >> (Intel C222 Express PCH), Supermicro X11SSL-F (Intel® C232 chipset)] > >> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 >> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro X9SCL >> (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222 Express PCH) behave >> problematic as described above | newer platform: Supermicro X11SSL-F >> (Intel® C232 chipset) working well with no problems] > > Maybe create a bug at the Linux kernel bug tracker [1], where you can > attach all the logs (`dmesg`, `lspci -nnk -s …`, …). > >> So far to solve the problem I was trying to upgrade system to the >> newest stable version, upgrade kernel to version 6.x, upgrade ixgbe >> driver to the newest version but with no luck. > > Thank you for checking that. Too bad it’s still present. To rule out > some user space problem, could you test Debian 9.7 with a stable Linux > release, currently 6.1.7? > > What does `sudo perf top --sort comm,dso` show, where the time is spent? > >> Supermicro support suggested as follows: >> it might be kernel related debian 11.5 has kernel 5.10 which is a >> recent kernel it might not properly support the chipsets for X9 >> therefore i suggest to use RHEL or CentOS as they use much older >> kernel versions. I expect that with ubuntu 20.04 you see the same >> problem it uses kernel 5.4 > > Testing another GNU/Linux distribution for another data point, might be > a good idea. > > As nobody has responded yet, bisecting the issue is probably the fastest > way to get to the bottom of this. Luckily the problem seems reproducible > and you seem to be able to build a Linux kernel yourself, so that should > work. 
(For testing purposes you could also test with Ubuntu, as they
> provide Linux kernel builds for (almost) all releases in their Linux
> kernel mainline PPA [2].)

You could also try to do that in a virtual machine by passing the
network device through to the VM. If that reproduces the issue, it makes
for a quick bisection setup, because the VM boots very fast. (For
example, you can pass the Linux kernel directly to a QEMU VM with the
`-kernel` switch.)


Kind regards,

Paul


> [1]: https://bugzilla.kernel.org/
> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5
From: Bartek Kois @ 2023-01-19 12:24 UTC
  To: Paul Menzel; +Cc: intel-wired-lan, regressions

On 19.01.2023 at 11:17, Paul Menzel wrote:
>
> #regzbot ^introduced: 4.9.88..5.10.149
>
> Dear Bartek,
>
>
> On 14.01.23 at 11:23, Bartek Kois wrote:
>
>> After moving from Debian 9.7 to 11.5 as soon as I perform "ip link
>> set enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel 82599EN based
>> 10G adapter) I am experiencing high cpu load (even if no traffic is
>> passing through the adapter) and network performance is low (when
>> network is connected).
>
> How do you test the network performance? Please give exact numbers for
> comparison.
>
I am using this server as a router for my subscribers, with iptables
(for NAT and firewall) and hfsc (for QoS). I first encountered this
problem while migrating from Debian 9.7 to 11.5. Routers based on the
Supermicro X11SSL-F (Intel® C232 chipset) work without problems after
the migration, but routers based on the Supermicro X9SCL (Intel C202
PCH) and Supermicro X10SLL+-F (Intel C222 Express PCH) start behaving
strangely: the CPU load is high (0.5-0.8, where before it was around
0.0-0.1) and subscribers cannot fully use their plans. I stripped the
setup down and ended up with a clean system, with no iptables or hfsc
rules, that shows the same higher load right after the 10G link is set
up, even if no traffic is passing through.

>> The cpu load is oscillating between 0.1 and 0.3 on vanilla system
>> with no network attached. The problem can be observed on the
>> following platforms: Supermicro X9SCL (Intel C202 PCH) and
>> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the Supermicro
>> X11SSL-F (Intel® C232 chipset) everything is working well.
>>
>> Tested environments:
>> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1
>> (2018-05-07) x86_64 GNU/Linux [all platforms working well with no
>> problems: Supermicro X9SCL (Intel C202 PCH), Supermicro X10SLL+-F
>> (Intel C222 Express PCH), Supermicro X11SSL-F (Intel® C232 chipset)]
>
>> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2
>> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro X9SCL
>> (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222 Express PCH)
>> behave problematic as described above | newer platform: Supermicro
>> X11SSL-F (Intel® C232 chipset) working well with no problems]
>
> Maybe create a bug at the Linux kernel bug tracker [1], where you can
> attach all the logs (`dmesg`, `lspci -nnk -s …`, …).
>
I have already reported this to the Debian team
(https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024763), but so far
nobody has picked it up.

>> So far to solve the problem I was trying to upgrade system to the
>> newest stable version, upgrade kernel to version 6.x, upgrade ixgbe
>> driver to the newest version but with no luck.
>
> Thank you for checking that. Too bad it’s still present. To rule out
> some user space problem, could you test Debian 9.7 with a stable Linux
> release, currently 6.1.7?
>
> What does `sudo perf top --sort comm,dso` show, where the time is spent?

During my first test in a real environment with subscribers I gathered
the following data with perf:

    27.83%  [kernel]  [k] strncpy
    14.80%  [kernel]  [k] nft_do_chain
     7.61%  [kernel]  [k] memcmp
     5.63%  [kernel]  [k] nft_meta_get_eval
     3.14%  [kernel]  [k] nft_cmp_eval
     2.79%  [kernel]  [k] asm_exc_nmi
     1.07%  [kernel]  [k] module_get_kallsym
     0.92%  [kernel]  [k] kallsyms_expand_symbol.constprop.0
     0.85%  [kernel]  [k] ixgbe_poll
     0.75%  [kernel]  [k] format_decode
     0.61%  [kernel]  [k] number
     0.56%  [kernel]  [k] menu_select
     0.54%  [kernel]  [k] clflush_cache_range
     0.52%  [kernel]  [k] cpuidle_enter_state
     0.51%  [kernel]  [k] vsnprintf
     0.50%  [kernel]  [k] u32_classify
     0.49%  [kernel]  [k] fib_table_lookup
     0.40%  [kernel]  [k] dma_pte_clear_level
     0.39%  [kernel]  [k] domain_mapping
     0.36%  [kernel]  [k] ixgbe_xmit_fram


   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    18 root      20   0       0      0      0 S  28.2   0.0   7:06.27 ksoftirqd/1
    12 root      20   0       0      0      0 R  12.0   0.0   4:10.88 ksoftirqd/0
    23 root      20   0       0      0      0 S   6.0   0.0   4:36.08 ksoftirqd/2
    28 root      20   0       0      0      0 S   5.3   0.0   6:46.47 ksoftirqd/3
846449 root      20   0       0      0      0 I   1.0   0.0   0:01.61 kworker/0:0-events_power_efficient
    13 root      20   0       0      0      0 I   0.3   0.0   0:13.50 rcu_sched
  8264 root      20   0  101536   6944   4824 S   0.3   0.2   0:07.77 dhcpd
     1 root      20   0  164048  10184   7672 S   0.0   0.3   0:04.52 systemd
     2 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kthreadd
     3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
     4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp
     6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-events_highpri
     9 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_percpu_wq
    10 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_tasks_rude_
    11 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_tasks_trace
    14 root      rt   0       0      0      0 S   0.0   0.0   0:00.26 migration/0

>
>> Supermicro support suggested as follows:
>> it might be kernel related debian 11.5 has kernel 5.10 which is a
>> recent kernel it might not properly support the chipsets for X9
>> therefore i suggest to use RHEL or CentOS as they use much older
>> kernel versions.
I expect that with ubuntu 20.04 you see the same
>> problem it uses kernel 5.4
> Testing another GNU/Linux distribution for another data point, might
> be a good idea.
>
> As nobody has responded yet, bisecting the issue is probably the
> fastest way to get to the bottom of this. Luckily the problem seems
> reproducible and you seem to be able to build a Linux kernel yourself,
> so that should work. (For testing purposes you could also test with
> Ubuntu, as they provide Linux kernel builds for (almost) all releases
> in their Linux kernel mainline PPA [2].)
>
Of course I can try Ubuntu and report how it is working.

Best regards
Bartek Kois

>
> Kind regards,
>
> Paul
>
>
> [1]: https://bugzilla.kernel.org/
> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5 2023-01-19 12:24 ` [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance " Bartek Kois @ 2023-01-19 16:58 ` Bartek Kois 2023-01-19 17:09 ` Paul Menzel 0 siblings, 1 reply; 14+ messages in thread From: Bartek Kois @ 2023-01-19 16:58 UTC (permalink / raw) To: Paul Menzel; +Cc: intel-wired-lan, regressions W dniu 19.01.2023 o 13:24, Bartek Kois pisze: > > W dniu 19.01.2023 o 11:17, Paul Menzel pisze: >> >> #regzbot ^introduced: 4.9.88..5.10.149 >> >> Dear Bartek, >> >> >> Am 14.01.23 um 11:23 schrieb Bartek Kois: >> >>> After moving from Debian 9.7 to 11.5 as soon as I perform "ip link >>> set enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel 82599EN >>> based 10G adapter) I am experiencing high cpu load (even if no >>> traffic is passing through the adapter) and network performance is >>> low (when network is connected). >> >> How do you test the network performance? Please give exact numbers >> for comparison. >> > I am using this server as a router for my subscribers with iptables > (for NAT and firewall) and hfsc (for QoS). First I encountered this > problem while migrating form Debian 9.7 to 11.5. Routers based on > Supermicro X11SSL-F (Intel® C232 chipset) works with no problems after > that migration, but routers based on Supermicro X9SCL (Intel C202 PCH) > and Supermicro X10SLL+-F (Intel C222 Express PCH) starts behaving > strangely with high cpu load (0.5-0.8 while before it was around > 0.0-0.1) and subscribers not being able to utilize their plans. I > tried to strip down the problem and ends up with clean system with no > iptables or hfsc rules behaving the same (higher load) right after > setting the 10G link upeven if no traffic is passing by. > >>> The cpu load is oscillating between 0.1 and 0.3 on vanilla system >>> with no network attached. 
The problem can be observed on the >>> following platforms: Supermicro X9SCL (Intel C202 PCH) and >>> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the Supermicro >>> X11SSL-F (Intel® C232 chipset) everything is working well. >>> >>> Tested environments: >>> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 >>> (2018-05-07) x86_64 GNU/Linux [all platforms working well with no >>> problems: Supermicro X9SCL (Intel C202 PCH), Supermicro X10SLL+-F >>> (Intel C222 Express PCH), Supermicro X11SSL-F (Intel® C232 chipset)] >> >>> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 >>> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro X9SCL >>> (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222 Express PCH) >>> behave problematic as described above | newer platform: Supermicro >>> X11SSL-F (Intel® C232 chipset) working well with no problems] >> >> Maybe create a bug at the Linux kernel bug tracker [1], where you can >> attach all the logs (`dmesg`, `lspci -nnk -s …`, …). >> > I`ve already reported that to the Debian team > ttps://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024763, but so far > nobody took care of this issue so far. > >>> So far to solve the problem I was trying to upgrade system to the >>> newest stable version, upgrade kernel to version 6.x, upgrade ixgbe >>> driver to the newest version but with no luck. >> >> Thank you for checking that. Too bad it’s still present. To rule out >> some user space problem, could you test Debian 9.7 with a stable >> Linux release, currently 6.1.7? >> >> What does `sudo perf top --sort comm,dso` show, where the time is spent? 
> > During my first test in real enviroment with subscribers I gether the > following data through the perf: > > 27.83% [kernel] [k] strncpy > 14.80% [kernel] [k] nft_do_chain > 7.61% [kernel] [k] memcmp > 5.63% [kernel] [k] nft_meta_get_eval > 3.14% [kernel] [k] nft_cmp_eval > 2.79% [kernel] [k] asm_exc_nmi > 1.07% [kernel] [k] module_get_kallsym > 0.92% [kernel] [k] > kallsyms_expand_symbol.constprop.0 > 0.85% [kernel] [k] ixgbe_poll > 0.75% [kernel] [k] format_decode > 0.61% [kernel] [k] number > 0.56% [kernel] [k] menu_select > 0.54% [kernel] [k] clflush_cache_range > 0.52% [kernel] [k] cpuidle_enter_state > 0.51% [kernel] [k] vsnprintf > 0.50% [kernel] [k] u32_classify > 0.49% [kernel] [k] fib_table_lookup > 0.40% [kernel] [k] dma_pte_clear_level > 0.39% [kernel] [k] domain_mapping > 0.36% [kernel] [k] ixgbe_xmit_fram > > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ > COMMAND > 18 root 20 0 0 0 0 S 28.2 0.0 7:06.27 > ksoftirqd/1 > 12 root 20 0 0 0 0 R 12.0 0.0 4:10.88 > ksoftirqd/0 > 23 root 20 0 0 0 0 S 6.0 0.0 4:36.08 > ksoftirqd/2 > 28 root 20 0 0 0 0 S 5.3 0.0 6:46.47 > ksoftirqd/3 > 846449 root 20 0 0 0 0 I 1.0 0.0 0:01.61 > kworker/0:0-events_power_efficient > 13 root 20 0 0 0 0 I 0.3 0.0 0:13.50 > rcu_sched > 8264 root 20 0 101536 6944 4824 S 0.3 0.2 0:07.77 > dhcpd > 1 root 20 0 164048 10184 7672 S 0.0 0.3 0:04.52 > systemd > 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 > kthreadd > 3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 > rcu_gp > 4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 > rcu_par_gp > 6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 > kworker/0:0H-events_highpri > 9 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 > mm_percpu_wq > 10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 > rcu_tasks_rude_ > 11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 > rcu_tasks_trace > 14 root rt 0 0 0 0 S 0.0 0.0 0:00.26 > migration/0 > >> >>> Supermicro support suggested as follows: >>> it might be kernel related debian 11.5 has kernel 5.10 which is a >>> recent kernel it might not properly support the chipsets for X9 >>> 
therefore i suggest to use RHEL or CentOS as they use much older
>>> kernel versions. I expect that with ubuntu 20.04 you see the same
>>> problem it uses kernel 5.4
>> Testing another GNU/Linux distribution for another data point, might
>> be a good idea.
>>
>> As nobody has responded yet, bisecting the issue is probably the
>> fastest way to get to the bottom of this. Luckily the problem seems
>> reproducible and you seem to be able to build a Linux kernel
>> yourself, so that should work. (For testing purposes you could also
>> test with Ubuntu, as they provide Linux kernel builds for (almost)
>> all releases in their Linux kernel mainline PPA [2].)
>>
> Of course I can try Ubuntu and report how it is working.
>
Ubuntu (5.15.0-43-generic) behaves the same way: the load rises after
executing "ip link set enp1s0 up".

Best regards
Bartek Kois

> Best regards
>
> Bartek Kois
>
>>
>> Kind regards,
>>
>> Paul
>>
>>
>> [1]: https://bugzilla.kernel.org/
>> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5 2023-01-19 16:58 ` Bartek Kois @ 2023-01-19 17:09 ` Paul Menzel 2023-01-19 17:17 ` Bartek Kois 0 siblings, 1 reply; 14+ messages in thread From: Paul Menzel @ 2023-01-19 17:09 UTC (permalink / raw) To: Bartek Kois; +Cc: intel-wired-lan, regressions Dear Bartek, Am 19.01.23 um 17:58 schrieb Bartek Kois: > W dniu 19.01.2023 o 13:24, Bartek Kois pisze: >> >> W dniu 19.01.2023 o 11:17, Paul Menzel pisze: >>> >>> #regzbot ^introduced: 4.9.88..5.10.149 >>> Am 14.01.23 um 11:23 schrieb Bartek Kois: >>> >>>> After moving from Debian 9.7 to 11.5 as soon as I perform "ip link >>>> set enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel 82599EN >>>> based 10G adapter) I am experiencing high cpu load (even if no >>>> traffic is passing through the adapter) and network performance is >>>> low (when network is connected). >>> >>> How do you test the network performance? Please give exact numbers >>> for comparison. >>> >> I am using this server as a router for my subscribers with iptables >> (for NAT and firewall) and hfsc (for QoS). First I encountered this >> problem while migrating form Debian 9.7 to 11.5. Routers based on >> Supermicro X11SSL-F (Intel® C232 chipset) works with no problems after >> that migration, but routers based on Supermicro X9SCL (Intel C202 PCH) >> and Supermicro X10SLL+-F (Intel C222 Express PCH) starts behaving >> strangely with high cpu load (0.5-0.8 while before it was around >> 0.0-0.1) and subscribers not being able to utilize their plans. I >> tried to strip down the problem and ends up with clean system with no >> iptables or hfsc rules behaving the same (higher load) right after >> setting the 10G link upeven if no traffic is passing by. >> >>>> The cpu load is oscillating between 0.1 and 0.3 on vanilla system >>>> with no network attached. 
The problem can be observed on the >>>> following platforms: Supermicro X9SCL (Intel C202 PCH) and >>>> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the Supermicro >>>> X11SSL-F (Intel® C232 chipset) everything is working well. >>>> >>>> Tested environments: >>>> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 >>>> (2018-05-07) x86_64 GNU/Linux [all platforms working well with no >>>> problems: Supermicro X9SCL (Intel C202 PCH), Supermicro X10SLL+-F >>>> (Intel C222 Express PCH), Supermicro X11SSL-F (Intel® C232 chipset)] >>> >>>> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 >>>> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro X9SCL >>>> (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222 Express PCH) >>>> behave problematic as described above | newer platform: Supermicro >>>> X11SSL-F (Intel® C232 chipset) working well with no problems] >>> >>> Maybe create a bug at the Linux kernel bug tracker [1], where you can >>> attach all the logs (`dmesg`, `lspci -nnk -s …`, …). >>> >> I`ve already reported that to the Debian team >> ttps://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024763, but so far >> nobody took care of this issue so far. >> >>>> So far to solve the problem I was trying to upgrade system to the >>>> newest stable version, upgrade kernel to version 6.x, upgrade ixgbe >>>> driver to the newest version but with no luck. >>> >>> Thank you for checking that. Too bad it’s still present. To rule out >>> some user space problem, could you test Debian 9.7 with a stable >>> Linux release, currently 6.1.7? >>> >>> What does `sudo perf top --sort comm,dso` show, where the time is spent? 
>> >> During my first test in real enviroment with subscribers I gether the >> following data through the perf: >> >> 27.83% [kernel] [k] strncpy >> 14.80% [kernel] [k] nft_do_chain >> 7.61% [kernel] [k] memcmp >> 5.63% [kernel] [k] nft_meta_get_eval >> 3.14% [kernel] [k] nft_cmp_eval >> 2.79% [kernel] [k] asm_exc_nmi >> 1.07% [kernel] [k] module_get_kallsym >> 0.92% [kernel] [k] kallsyms_expand_symbol.constprop.0 >> 0.85% [kernel] [k] ixgbe_poll >> 0.75% [kernel] [k] format_decode >> 0.61% [kernel] [k] number >> 0.56% [kernel] [k] menu_select >> 0.54% [kernel] [k] clflush_cache_range >> 0.52% [kernel] [k] cpuidle_enter_state >> 0.51% [kernel] [k] vsnprintf >> 0.50% [kernel] [k] u32_classify >> 0.49% [kernel] [k] fib_table_lookup >> 0.40% [kernel] [k] dma_pte_clear_level >> 0.39% [kernel] [k] domain_mapping >> 0.36% [kernel] [k] ixgbe_xmit_fram >> >> >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND >> 18 root 20 0 0 0 0 S 28.2 0.0 7:06.27 ksoftirqd/1 >> 12 root 20 0 0 0 0 R 12.0 0.0 4:10.88 ksoftirqd/0 […] Do you see different behavior in `/proc/interrupts`? >>>> Supermicro support suggested as follows: >>>> it might be kernel related debian 11.5 has kernel 5.10 which is a >>>> recent kernel it might not properly support the chipsets for X9 >>>> therefore i suggest to use RHEL or CentOS as they use much older >>>> kernel versions. I expect that with ubuntu 20.04 you see the same >>>> problem it uses kernel 5.4 >>> >>> Testing another GNU/Linux distribution for another data point, might >>> be a good idea. >>> >>> As nobody has responded yet, bisecting the issue is probably the >>> fastest way to get to the bottom of this. Luckily the problem seems >>> reproducible and you seem to be able to build a Linux kernel >>> yourself, so that should work. (For testing purposes you could also >>> test with Ubuntu, as they provide Linux kernel builds for (almost) >>> all releases in their Linux kernel mainline PPA [2].) 
>>>
>> Of course I can try Ubuntu and report how it is working.
>>
> Ubuntu (5.15.0-43-generic) seems to be working in the same way
> generating higher load after executing "ip link set enp1s0 up".

That is good to know. (Is this Ubuntu 22.04?) What about Ubuntu 20.04
with Linux 5.4, and Ubuntu 18.04 with 4.15?

Anyway, I think you won’t get around bisecting. Another hint: first make
sure that you can build a 4.9 Linux kernel yourself that does not
exhibit the issue.


Kind regards,

Paul


>>> [1]: https://bugzilla.kernel.org/
>>> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/
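
The `/proc/interrupts` question raised in this message could be answered by saving the file twice, a few seconds apart, and looking at which interrupt lines advanced in between. The following is a hedged Python sketch of that comparison rather than anything posted in the thread; `parse_interrupts` and `interrupt_deltas` are hypothetical helper names, and the parser assumes the usual layout of a `CPU0 CPU1 ...` header followed by `label: counts... description` rows.

```python
def parse_interrupts(text):
    """Map each interrupt label to (total count over all CPUs, description)."""
    lines = text.strip("\n").splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break  # row with fewer counters than CPUs
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), description)
    return table


def interrupt_deltas(before, after):
    """Return (delta, label, description) for lines that fired, busiest first."""
    rows = []
    for label, (count, description) in after.items():
        delta = count - before.get(label, (0, ""))[0]
        if delta > 0:
            rows.append((delta, label, description))
    return sorted(rows, reverse=True)
```

Feeding it two snapshots taken on the 4.9 kernel and two taken on the 5.10 kernel, each pair with the link up and no traffic, would show whether the adapter's interrupt lines advance at a different rate on the affected boards.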
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5 2023-01-19 17:09 ` Paul Menzel @ 2023-01-19 17:17 ` Bartek Kois 2023-01-22 20:28 ` Paul Menzel 0 siblings, 1 reply; 14+ messages in thread From: Bartek Kois @ 2023-01-19 17:17 UTC (permalink / raw) To: Paul Menzel; +Cc: intel-wired-lan, regressions W dniu 19.01.2023 o 18:09, Paul Menzel pisze: > Dear Bartek, > > > Am 19.01.23 um 17:58 schrieb Bartek Kois: >> W dniu 19.01.2023 o 13:24, Bartek Kois pisze: >>> >>> W dniu 19.01.2023 o 11:17, Paul Menzel pisze: >>>> >>>> #regzbot ^introduced: 4.9.88..5.10.149 > >>>> Am 14.01.23 um 11:23 schrieb Bartek Kois: >>>> >>>>> After moving from Debian 9.7 to 11.5 as soon as I perform "ip link >>>>> set enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel 82599EN >>>>> based 10G adapter) I am experiencing high cpu load (even if no >>>>> traffic is passing through the adapter) and network performance is >>>>> low (when network is connected). >>>> >>>> How do you test the network performance? Please give exact numbers >>>> for comparison. >>>> >>> I am using this server as a router for my subscribers with iptables >>> (for NAT and firewall) and hfsc (for QoS). First I encountered this >>> problem while migrating form Debian 9.7 to 11.5. Routers based on >>> Supermicro X11SSL-F (Intel® C232 chipset) works with no problems >>> after that migration, but routers based on Supermicro X9SCL (Intel >>> C202 PCH) and Supermicro X10SLL+-F (Intel C222 Express PCH) starts >>> behaving strangely with high cpu load (0.5-0.8 while before it was >>> around 0.0-0.1) and subscribers not being able to utilize their >>> plans. I tried to strip down the problem and ends up with clean >>> system with no iptables or hfsc rules behaving the same (higher >>> load) right after setting the 10G link upeven if no traffic is >>> passing by. 
>>> >>>>> The cpu load is oscillating between 0.1 and 0.3 on vanilla system >>>>> with no network attached. The problem can be observed on the >>>>> following platforms: Supermicro X9SCL (Intel C202 PCH) and >>>>> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the Supermicro >>>>> X11SSL-F (Intel® C232 chipset) everything is working well. >>>>> >>>>> Tested environments: >>>>> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 >>>>> (2018-05-07) x86_64 GNU/Linux [all platforms working well with no >>>>> problems: Supermicro X9SCL (Intel C202 PCH), Supermicro X10SLL+-F >>>>> (Intel C222 Express PCH), Supermicro X11SSL-F (Intel® C232 chipset)] >>>> >>>>> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 >>>>> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro X9SCL >>>>> (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222 Express PCH) >>>>> behave problematic as described above | newer platform: Supermicro >>>>> X11SSL-F (Intel® C232 chipset) working well with no problems] >>>> >>>> Maybe create a bug at the Linux kernel bug tracker [1], where you >>>> can attach all the logs (`dmesg`, `lspci -nnk -s …`, …). >>>> >>> I`ve already reported that to the Debian team >>> ttps://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024763, but so far >>> nobody took care of this issue so far. >>> >>>>> So far to solve the problem I was trying to upgrade system to the >>>>> newest stable version, upgrade kernel to version 6.x, upgrade >>>>> ixgbe driver to the newest version but with no luck. >>>> >>>> Thank you for checking that. Too bad it’s still present. To rule >>>> out some user space problem, could you test Debian 9.7 with a >>>> stable Linux release, currently 6.1.7? >>>> >>>> What does `sudo perf top --sort comm,dso` show, where the time is >>>> spent? 
>>> >>> During my first test in real enviroment with subscribers I gether >>> the following data through the perf: >>> >>> 27.83% [kernel] [k] strncpy >>> 14.80% [kernel] [k] nft_do_chain >>> 7.61% [kernel] [k] memcmp >>> 5.63% [kernel] [k] nft_meta_get_eval >>> 3.14% [kernel] [k] nft_cmp_eval >>> 2.79% [kernel] [k] asm_exc_nmi >>> 1.07% [kernel] [k] module_get_kallsym >>> 0.92% [kernel] [k] >>> kallsyms_expand_symbol.constprop.0 >>> 0.85% [kernel] [k] ixgbe_poll >>> 0.75% [kernel] [k] format_decode >>> 0.61% [kernel] [k] number >>> 0.56% [kernel] [k] menu_select >>> 0.54% [kernel] [k] clflush_cache_range >>> 0.52% [kernel] [k] cpuidle_enter_state >>> 0.51% [kernel] [k] vsnprintf >>> 0.50% [kernel] [k] u32_classify >>> 0.49% [kernel] [k] fib_table_lookup >>> 0.40% [kernel] [k] dma_pte_clear_level >>> 0.39% [kernel] [k] domain_mapping >>> 0.36% [kernel] [k] ixgbe_xmit_fram >>> >>> >>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ >>> COMMAND >>> 18 root 20 0 0 0 0 S 28.2 0.0 7:06.27 >>> ksoftirqd/1 >>> 12 root 20 0 0 0 0 R 12.0 0.0 4:10.88 >>> ksoftirqd/0 > > […] > > Do you see different behavior in `/proc/interrupts`? 
>
This is how it looks for Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP
Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux on Supermicro X10SLL+-F
(Intel C222 Express PCH):

     1 root      20   0  163948  10288   7696 S   0.0   0.1   0:39.58 systemd
     2 root      20   0       0      0      0 S   0.0   0.0   0:00.17 kthreadd
     3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
     4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp
     6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-kblockd
     9 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_percpu_wq
    10 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_tasks_rude_
    11 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_tasks_trace
    12 root      20   0       0      0      0 S   0.0   0.0   6:07.13 ksoftirqd/0
    13 root      20   0       0      0      0 I   0.0   0.0   4:15.28 rcu_sched
    14 root      rt   0       0      0      0 S   0.0   0.0   0:03.20 migration/0
    15 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/0
    16 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/1
    17 root      rt   0       0      0      0 S   0.0   0.0   0:02.75 migration/1
    18 root      20   0       0      0      0 S   0.0   0.0   4:35.84 ksoftirqd/1
    20 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/1:0H-events_highpri
    21 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/2
    22 root      rt   0       0      0      0 S   0.0   0.0   0:01.37 migration/2
    23 root      20   0       0      0      0 S   0.0   0.0   8:18.23 ksoftirqd/2
    25 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/2:0H-events_highpri
    26 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/3
    27 root      rt   0       0      0      0 S   0.0   0.0   0:01.76 migration/3
    28 root      20   0       0      0      0 S   0.0   0.0   8:45.46 ksoftirqd/3
    30 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/3:0H-events_highpri
    31 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/4
    32 root      rt   0       0      0      0 S   0.0   0.0   0:04.39 migration/4
    33 root      20   0       0      0      0 S   0.0   0.0   3:44.08 ksoftirqd/4
    35 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/4:0H-events_highpri
    36 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/5
    37 root      rt   0       0      0      0 S   0.0   0.0   0:02.44 migration/5
    38 root      20   0       0      0      0 S   0.0   0.0   4:04.34 ksoftirqd/5
    40 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/5:0H-events_highpri
    41 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/6
    42 root      rt   0       0      0      0 S   0.0   0.0   0:01.95 migration/6
    43 root      20   0       0      0      0 S   0.0   0.0   3:35.38 ksoftirqd/6
    45 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/6:0H-kblockd
    46 root      20   0       0      0      0 S   0.0   0.0   0:00.00 cpuhp/7
    47 root      rt   0       0      0      0 S   0.0   0.0   0:01.07 migration/7
    48 root      20   0       0      0      0 S   0.0   0.0   0:00.16 ksoftirqd/7
    50 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/7:0H-kblockd
    59 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kdevtmpfs
    60 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 netns
    61 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kauditd
    62 root      20   0       0      0      0 S   0.0   0.0   0:00.09 khungtaskd
    63 root      20   0       0      0      0 S   0.0   0.0   0:00.00 oom_reaper
    64 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 writeback
    65 root      20   0       0      0      0 S   0.0   0.0   0:07.72 kcompactd0
    66 root      25   5       0      0      0 S   0.0   0.0   0:00.00 ksmd
    67 root      39  19       0      0      0 S   0.0   0.0   0:01.19 khugepaged
    85 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kintegrityd
    86 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kblockd
    87 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 blkcg_punt_bio
    88 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 edac-poller
    89 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 devfreq_wq
    91 root       0 -20       0      0      0 I   0.0   0.0   0:02.57 kworker/1:1H-kblockd
    92 root      20   0       0      0      0 S   0.0   0.0   0:00.00 kswapd0
    93 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kthrotld
    94 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 acpi_thermal_pm
    96 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 ipv6_addrconf
   104 root       0 -20       0      0      0 I   0.0   0.0   0:00.68 kworker/2:1H-kblockd
   109 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kstrp
   112 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 zswap-shrink
   113 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/u17:0

and for Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1
on Supermicro X10SLL+-F (Intel C222 Express PCH):

 31659 root      20   0       0      0      0 S   0.3   0.0   0:00.92 kworker/7:0
     1 root      20   0   57032   6736   5256 S   0.0   0.1   2:28.14 systemd
     2 root      20   0       0      0      0 S   0.0   0.0   0:00.19 kthreadd
     3 root      20   0       0      0      0 S   0.0   0.0   0:35.42 ksoftirqd/0
     5 root       0 -20       0      0      0 S   0.0   0.0   0:00.00 kworker/0:0H
     7 root      20   0       0      0      0 S   0.0   0.0   2:36.16 rcu_sched
     8 root      20   0       0      0      0 S   0.0   0.0   0:00.00 rcu_bh
     9 root      rt   0       0      0      0 S   0.0   0.0   0:00.28 migration/0
    10 root       0 -20       0      0      0 S   0.0   0.0   0:00.00 lru-add-drain
    11 root      rt   0       0      0      0 S   0.0   0.0   0:00.25 watchdog/0
    12 root      20   0       0      0      0 S   0.0   0.0   0:00.00
cpuhp/0 13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/1 14 root rt 0 0 0 0 S 0.0 0.0 0:00.31 watchdog/1 15 root rt 0 0 0 0 S 0.0 0.0 0:25.69 migration/1 16 root 20 0 0 0 0 S 0.0 0.0 1:10.62 ksoftirqd/1 18 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/1:0H 19 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/2 20 root rt 0 0 0 0 S 0.0 0.0 0:00.26 watchdog/2 21 root rt 0 0 0 0 S 0.0 0.0 0:10.18 migration/2 22 root 20 0 0 0 0 S 0.0 0.0 0:51.08 ksoftirqd/2 24 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/2:0H 25 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/3 26 root rt 0 0 0 0 S 0.0 0.0 0:00.23 watchdog/3 27 root rt 0 0 0 0 S 0.0 0.0 0:00.32 migration/3 28 root 20 0 0 0 0 S 0.0 0.0 0:48.46 ksoftirqd/3 30 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/3:0H 31 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/4 32 root rt 0 0 0 0 S 0.0 0.0 0:00.21 watchdog/4 33 root rt 0 0 0 0 S 0.0 0.0 0:00.25 migration/4 34 root 20 0 0 0 0 S 0.0 0.0 0:36.35 ksoftirqd/4 36 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/4:0H 37 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/5 38 root rt 0 0 0 0 S 0.0 0.0 0:00.22 watchdog/5 39 root rt 0 0 0 0 S 0.0 0.0 0:04.02 migration/5 40 root 20 0 0 0 0 S 0.0 0.0 0:41.43 ksoftirqd/5 42 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/5:0H 43 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/6 44 root rt 0 0 0 0 S 0.0 0.0 0:00.22 watchdog/6 45 root rt 0 0 0 0 S 0.0 0.0 0:01.53 migration/6 46 root 20 0 0 0 0 S 0.0 0.0 0:41.66 ksoftirqd/6 48 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/6:0H 49 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/7 50 root rt 0 0 0 0 S 0.0 0.0 0:00.24 watchdog/7 51 root rt 0 0 0 0 S 0.0 0.0 0:00.27 migration/7 52 root 20 0 0 0 0 S 0.0 0.0 0:46.13 ksoftirqd/7 54 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/7:0H 55 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs 56 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 netns 57 root 20 0 0 0 0 S 0.0 0.0 0:00.07 khungtaskd 58 root 20 0 0 0 0 S 0.0 0.0 0:00.00 oom_reaper 59 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 writeback 60 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kcompactd0 62 root 25 5 0 
0 0 S 0.0 0.0 0:00.00 ksmd 63 root 39 19 0 0 0 S 0.0 0.0 0:00.00 khugepaged 64 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 crypto 65 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kintegrityd 66 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 bioset 67 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kblockd 75 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 devfreq_wq 76 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 watchdogd 77 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kswapd0 78 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 vmstat 90 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kthrotld 91 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 ipv6_addrconf 121 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 acpi_thermal_pm 130 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 ata_sff 139 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 ixgbe 166 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_0 167 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 scsi_tmf_0 168 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_1 169 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 scsi_tmf_1 170 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_2 171 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 scsi_tmf_2 172 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_3 173 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 scsi_tmf_3 174 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_4 175 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 scsi_tmf_4 176 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_5 >>>>> Supermicro support suggested as follows: >>>>> it might be kernel related debian 11.5 has kernel 5.10 which is a >>>>> recent kernel it might not properly support the chipsets for X9 >>>>> therefore i suggest to use RHEL or CentOS as they use much older >>>>> kernel versions. I expect that with ubuntu 20.04 you see the same >>>>> problem it uses kernel 5.4 >>>> >>> Testing another GNU/Linux distribution for another data point, >>>> might >>>> be a good idea. >>>> >>>> As nobody has responded yet, bisecting the issue is probably the >>>> fastest way to get to the bottom of this. Luckily the problem seems >>>> reproducible and you seem to be able to build a Linux kernel >>>> yourself, so that should work. 
(For testing purposes you could also >>>> test with Ubuntu, as they provide Linux kernel builds for (almost) >>>> all releases in their Linux kernel mainline PPA [2].) >>>> >>> Of course I can try Ubuntu and report how it is working. >>> >> Ubuntu (5.15.0-43-generic) seems to be working in the same way >> generating higher load after executing "ip link set enp1s0 up". > > That is good to know. (Is this Ubuntu 22.04?) What about Ubuntu 20.04 > with Linux 5.4, and Ubuntu 18.04 with 4.15? > > Anyway, I think, you won’t come around bisecting. Another hint, make > sure that you can build a 4.9 Linux kernel yourself, that does not > exhibit that issue. >

That`s right, it is 22.04. I don`t have to build it. Standard kernel Linux 4.9.0-6-amd64 from Debian 9.7 worked without problems for past 4 years.

Best regards
Bartek Kois

> > Kind regards, > > Paul > > >>>> [1]: https://bugzilla.kernel.org/ >>>> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5 2023-01-19 17:17 ` Bartek Kois @ 2023-01-22 20:28 ` Paul Menzel 2023-01-23 18:38 ` Bartek Kois 0 siblings, 1 reply; 14+ messages in thread From: Paul Menzel @ 2023-01-22 20:28 UTC (permalink / raw) To: Bartek Kois; +Cc: intel-wired-lan, regressions Dear Bartek, Am 19.01.23 um 18:17 schrieb Bartek Kois: > W dniu 19.01.2023 o 18:09, Paul Menzel pisze: >> Am 19.01.23 um 17:58 schrieb Bartek Kois: >>> W dniu 19.01.2023 o 13:24, Bartek Kois pisze: >>>> >>>> W dniu 19.01.2023 o 11:17, Paul Menzel pisze: >>>>> >>>>> #regzbot ^introduced: 4.9.88..5.10.149 >> >>>>> Am 14.01.23 um 11:23 schrieb Bartek Kois: >>>>> >>>>>> After moving from Debian 9.7 to 11.5 as soon as I perform "ip link >>>>>> set enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel 82599EN >>>>>> based 10G adapter) I am experiencing high cpu load (even if no >>>>>> traffic is passing through the adapter) and network performance is >>>>>> low (when network is connected). >>>>> >>>>> How do you test the network performance? Please give exact numbers >>>>> for comparison. >>>>> >>>> I am using this server as a router for my subscribers with iptables >>>> (for NAT and firewall) and hfsc (for QoS). First I encountered this >>>> problem while migrating form Debian 9.7 to 11.5. Routers based on >>>> Supermicro X11SSL-F (Intel® C232 chipset) works with no problems >>>> after that migration, but routers based on Supermicro X9SCL (Intel >>>> C202 PCH) and Supermicro X10SLL+-F (Intel C222 Express PCH) starts >>>> behaving strangely with high cpu load (0.5-0.8 while before it was >>>> around 0.0-0.1) and subscribers not being able to utilize their >>>> plans. I tried to strip down the problem and ends up with clean >>>> system with no iptables or hfsc rules behaving the same (higher >>>> load) right after setting the 10G link upeven if no traffic is >>>> passing by. 
>>>> >>>>>> The cpu load is oscillating between 0.1 and 0.3 on vanilla system >>>>>> with no network attached. The problem can be observed on the >>>>>> following platforms: Supermicro X9SCL (Intel C202 PCH) and >>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the Supermicro >>>>>> X11SSL-F (Intel® C232 chipset) everything is working well. >>>>>> >>>>>> Tested environments: >>>>>> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 >>>>>> (2018-05-07) x86_64 GNU/Linux [all platforms working well with no >>>>>> problems: Supermicro X9SCL (Intel C202 PCH), Supermicro X10SLL+-F >>>>>> (Intel C222 Express PCH), Supermicro X11SSL-F (Intel® C232 chipset)] >>>>> >>>>>> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 >>>>>> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro X9SCL >>>>>> (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222 Express PCH) >>>>>> behave problematic as described above | newer platform: Supermicro >>>>>> X11SSL-F (Intel® C232 chipset) working well with no problems] >>>>> >>>>> Maybe create a bug at the Linux kernel bug tracker [1], where you >>>>> can attach all the logs (`dmesg`, `lspci -nnk -s …`, …). >>>>> >>>> I`ve already reported that to the Debian team >>>> ttps://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024763, but so far >>>> nobody took care of this issue so far. >>>> >>>>>> So far to solve the problem I was trying to upgrade system to the >>>>>> newest stable version, upgrade kernel to version 6.x, upgrade >>>>>> ixgbe driver to the newest version but with no luck. >>>>> >>>>> Thank you for checking that. Too bad it’s still present. To rule >>>>> out some user space problem, could you test Debian 9.7 with a >>>>> stable Linux release, currently 6.1.7? >>>>> >>>>> What does `sudo perf top --sort comm,dso` show, where the time is >>>>> spent? 
>>>> >>>> During my first test in real enviroment with subscribers I gether >>>> the following data through the perf: >>>> >>>> 27.83% [kernel] [k] strncpy >>>> 14.80% [kernel] [k] nft_do_chain >>>> 7.61% [kernel] [k] memcmp >>>> 5.63% [kernel] [k] nft_meta_get_eval >>>> 3.14% [kernel] [k] nft_cmp_eval >>>> 2.79% [kernel] [k] asm_exc_nmi >>>> 1.07% [kernel] [k] module_get_kallsym >>>> 0.92% [kernel] [k] kallsyms_expand_symbol.constprop.0 >>>> 0.85% [kernel] [k] ixgbe_poll >>>> 0.75% [kernel] [k] format_decode >>>> 0.61% [kernel] [k] number >>>> 0.56% [kernel] [k] menu_select >>>> 0.54% [kernel] [k] clflush_cache_range >>>> 0.52% [kernel] [k] cpuidle_enter_state >>>> 0.51% [kernel] [k] vsnprintf >>>> 0.50% [kernel] [k] u32_classify >>>> 0.49% [kernel] [k] fib_table_lookup >>>> 0.40% [kernel] [k] dma_pte_clear_level >>>> 0.39% [kernel] [k] domain_mapping >>>> 0.36% [kernel] [k] ixgbe_xmit_fram >>>> >>>> >>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND >>>> 18 root 20 0 0 0 0 S 28.2 0.0 7:06.27 ksoftirqd/1 >>>> 12 root 20 0 0 0 0 R 12.0 0.0 4:10.88 ksoftirqd/0 >> >> […] >> >> Do you see different behavior in `/proc/interrupts`? >> > This is how it looks like for Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP > Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux on Supermicro X10SLL+-F > (Intel C222 Express PCH): > > 1 root 20 0 163948 10288 7696 S 0.0 0.1 0:39.58 systemd […] The content of `/proc/interrupts` has a different format on my system. 
``` $ head -3 /proc/interrupts CPU0 CPU1 CPU2 CPU3 1: 55560 0 113 0 IR-IO-APIC 1-edge i8042 8: 0 0 0 0 IR-IO-APIC 8-edge rtc0 ``` […] > and for Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 > on Supermicro X10SLL+-F (Intel C222 Express PCH) > > 31659 root 20 0 0 0 0 S 0.3 0.0 0:00.92 > kworker/7:0 > 1 root 20 0 57032 6736 5256 S 0.0 0.1 2:28.14 systemd […] >>>>>> Supermicro support suggested as follows: >>>>>> it might be kernel related debian 11.5 has kernel 5.10 which is a >>>>>> recent kernel it might not properly support the chipsets for X9 >>>>>> therefore i suggest to use RHEL or CentOS as they use much older >>>>>> kernel versions. I expect that with ubuntu 20.04 you see the same >>>>>> problem it uses kernel 5.4 >>>>> >>> Testing another GNU/Linux distribution for another data point, >>>>> might >>>>> be a good idea. >>>>> >>>>> As nobody has responded yet, bisecting the issue is probably the >>>>> fastest way to get to the bottom of this. Luckily the problem seems >>>>> reproducible and you seem to be able to build a Linux kernel >>>>> yourself, so that should work. (For testing purposes you could also >>>>> test with Ubuntu, as they provide Linux kernel builds for (almost) >>>>> all releases in their Linux kernel mainline PPA [2].) >>>>> >>>> Of course I can try Ubuntu and report how it is working. >>>> >>> Ubuntu (5.15.0-43-generic) seems to be working in the same way >>> generating higher load after executing "ip link set enp1s0 up". >> >> That is good to know. (Is this Ubuntu 22.04?) What about Ubuntu 20.04 >> with Linux 5.4, and Ubuntu 18.04 with 4.15? >> >> Anyway, I think, you won’t come around bisecting. Another hint, make >> sure that you can build a 4.9 Linux kernel yourself, that does not >> exhibit that issue. >> > That`s right, it is 22.04. I don`t have to build it. Standard kernel > Linux 4.9.0-6-amd64 from Debian 9.7 worked without problems for past 4 > years. 
If nobody of the developers/maintainers is going to step up, you are on your own. Again, as you can reproduce this easily, the fastest way is to bisect the issue, which you can do on your own. Kind regards, Paul >>>>> [1]: https://bugzilla.kernel.org/ >>>>> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/ ^ permalink raw reply [flat|nested] 14+ messages in thread
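Note on the `/proc/interrupts` question in the message above: the listings pasted in reply are process lists, not interrupt counters. A minimal shell sketch for totalling the per-CPU counts of matching IRQ lines could look like this. The helper name `irq_totals` and its arguments are illustrative assumptions and do not come from the thread.

```shell
# Sum the per-CPU interrupt counts of every IRQ line matching a pattern.
# Usage: irq_totals PATTERN [FILE]   (FILE defaults to /proc/interrupts)
irq_totals() {
    pattern=$1
    file=${2:-/proc/interrupts}
    awk -v pat="$pattern" '
        NR == 1 { ncpu = NF; next }          # header row: one column per CPU
        $0 ~ pat {
            total = 0
            for (i = 2; i <= ncpu + 1; i++)  # fields 2..ncpu+1 hold the counts
                total += $i
            print $1, total, $NF             # IRQ number, summed count, last label
        }
    ' "$file"
}
```

Calling it twice a few seconds apart on each kernel, with a pattern that matches the adapter's IRQ lines (assumed here to contain the interface or driver name), would show whether the interrupt rate differs between 4.9 and 5.10.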
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5 2023-01-22 20:28 ` Paul Menzel @ 2023-01-23 18:38 ` Bartek Kois 2023-01-23 18:53 ` Paul Menzel 0 siblings, 1 reply; 14+ messages in thread From: Bartek Kois @ 2023-01-23 18:38 UTC (permalink / raw) To: Paul Menzel; +Cc: intel-wired-lan, regressions W dniu 22.01.2023 o 21:28, Paul Menzel pisze: > Dear Bartek, > > > Am 19.01.23 um 18:17 schrieb Bartek Kois: >> W dniu 19.01.2023 o 18:09, Paul Menzel pisze: > >>> Am 19.01.23 um 17:58 schrieb Bartek Kois: >>>> W dniu 19.01.2023 o 13:24, Bartek Kois pisze: >>>>> >>>>> W dniu 19.01.2023 o 11:17, Paul Menzel pisze: >>>>>> >>>>>> #regzbot ^introduced: 4.9.88..5.10.149 >>> >>>>>> Am 14.01.23 um 11:23 schrieb Bartek Kois: >>>>>> >>>>>>> After moving from Debian 9.7 to 11.5 as soon as I perform "ip >>>>>>> link set enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel >>>>>>> 82599EN based 10G adapter) I am experiencing high cpu load (even >>>>>>> if no traffic is passing through the adapter) and network >>>>>>> performance is low (when network is connected). >>>>>> >>>>>> How do you test the network performance? Please give exact >>>>>> numbers for comparison. >>>>>> >>>>> I am using this server as a router for my subscribers with >>>>> iptables (for NAT and firewall) and hfsc (for QoS). First I >>>>> encountered this problem while migrating form Debian 9.7 to 11.5. >>>>> Routers based on Supermicro X11SSL-F (Intel® C232 chipset) works >>>>> with no problems after that migration, but routers based on >>>>> Supermicro X9SCL (Intel C202 PCH) and Supermicro X10SLL+-F (Intel >>>>> C222 Express PCH) starts behaving strangely with high cpu load >>>>> (0.5-0.8 while before it was around 0.0-0.1) and subscribers not >>>>> being able to utilize their plans. 
I tried to strip down the >>>>> problem and ends up with clean system with no iptables or hfsc >>>>> rules behaving the same (higher load) right after setting the 10G >>>>> link upeven if no traffic is passing by. >>>>> >>>>>>> The cpu load is oscillating between 0.1 and 0.3 on vanilla system >>>>>>> with no network attached. The problem can be observed on the >>>>>>> following platforms: Supermicro X9SCL (Intel C202 PCH) and >>>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the >>>>>>> Supermicro >>>>>>> X11SSL-F (Intel® C232 chipset) everything is working well. >>>>>>> >>>>>>> Tested environments: >>>>>>> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 >>>>>>> (2018-05-07) x86_64 GNU/Linux [all platforms working well with >>>>>>> no problems: Supermicro X9SCL (Intel C202 PCH), Supermicro >>>>>>> X10SLL+-F (Intel C222 Express PCH), Supermicro X11SSL-F (Intel® >>>>>>> C232 chipset)] >>>>>> >>>>>>> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 >>>>>>> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro X9SCL >>>>>>> (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222 Express PCH) >>>>>>> behave problematic as described above | newer platform: >>>>>>> Supermicro X11SSL-F (Intel® C232 chipset) working well with no >>>>>>> problems] >>>>>> >>>>>> Maybe create a bug at the Linux kernel bug tracker [1], where you >>>>>> can attach all the logs (`dmesg`, `lspci -nnk -s …`, …). >>>>>> >>>>> I`ve already reported that to the Debian team >>>>> ttps://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024763, but so >>>>> far nobody took care of this issue so far. >>>>> >>>>>>> So far to solve the problem I was trying to upgrade system to >>>>>>> the newest stable version, upgrade kernel to version 6.x, >>>>>>> upgrade ixgbe driver to the newest version but with no luck. >>>>>> >>>>>> Thank you for checking that. Too bad it’s still present. 
To rule >>>>>> out some user space problem, could you test Debian 9.7 with a >>>>>> stable Linux release, currently 6.1.7? >>>>>> >>>>>> What does `sudo perf top --sort comm,dso` show, where the time is >>>>>> spent? >>>>> >>>>> During my first test in real enviroment with subscribers I gether >>>>> the following data through the perf: >>>>> >>>>> 27.83% [kernel] [k] strncpy >>>>> 14.80% [kernel] [k] nft_do_chain >>>>> 7.61% [kernel] [k] memcmp >>>>> 5.63% [kernel] [k] nft_meta_get_eval >>>>> 3.14% [kernel] [k] nft_cmp_eval >>>>> 2.79% [kernel] [k] asm_exc_nmi >>>>> 1.07% [kernel] [k] module_get_kallsym >>>>> 0.92% [kernel] [k] >>>>> kallsyms_expand_symbol.constprop.0 >>>>> 0.85% [kernel] [k] ixgbe_poll >>>>> 0.75% [kernel] [k] format_decode >>>>> 0.61% [kernel] [k] number >>>>> 0.56% [kernel] [k] menu_select >>>>> 0.54% [kernel] [k] clflush_cache_range >>>>> 0.52% [kernel] [k] cpuidle_enter_state >>>>> 0.51% [kernel] [k] vsnprintf >>>>> 0.50% [kernel] [k] u32_classify >>>>> 0.49% [kernel] [k] fib_table_lookup >>>>> 0.40% [kernel] [k] dma_pte_clear_level >>>>> 0.39% [kernel] [k] domain_mapping >>>>> 0.36% [kernel] [k] ixgbe_xmit_fram >>>>> >>>>> >>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ >>>>> COMMAND >>>>> 18 root 20 0 0 0 0 S 28.2 0.0 7:06.27 >>>>> ksoftirqd/1 >>>>> 12 root 20 0 0 0 0 R 12.0 0.0 4:10.88 >>>>> ksoftirqd/0 >>> >>> […] >>> >>> Do you see different behavior in `/proc/interrupts`? >>> >> This is how it looks like for Debian 11.5 - Linux 5.10.0-19-amd64 #1 >> SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux on Supermicro >> X10SLL+-F (Intel C222 Express PCH): >> >> 1 root 20 0 163948 10288 7696 S 0.0 0.1 0:39.58 >> systemd > > […] > > The content of `/proc/interrupts` has a different format on my system. 
> > ``` > $ head -3 /proc/interrupts > CPU0 CPU1 CPU2 CPU3 > 1: 55560 0 113 0 IR-IO-APIC 1-edge > i8042 > 8: 0 0 0 0 IR-IO-APIC 8-edge > rtc0 > ``` > […] > >> and for Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian >> 4.9.88-1+deb9u1 on Supermicro X10SLL+-F (Intel C222 Express PCH) >> >> 31659 root 20 0 0 0 0 S 0.3 0.0 0:00.92 >> kworker/7:0 >> 1 root 20 0 57032 6736 5256 S 0.0 0.1 2:28.14 >> systemd > > […] >>>>>>> Supermicro support suggested as follows: >>>>>>> it might be kernel related debian 11.5 has kernel 5.10 which is >>>>>>> a recent kernel it might not properly support the chipsets for >>>>>>> X9 therefore i suggest to use RHEL or CentOS as they use much >>>>>>> older kernel versions. I expect that with ubuntu 20.04 you see >>>>>>> the same problem it uses kernel 5.4 >>>>>> >>> Testing another GNU/Linux distribution for another data >>>>>> point, might >>>>>> be a good idea. >>>>>> >>>>>> As nobody has responded yet, bisecting the issue is probably the >>>>>> fastest way to get to the bottom of this. Luckily the problem >>>>>> seems reproducible and you seem to be able to build a Linux >>>>>> kernel yourself, so that should work. (For testing purposes you >>>>>> could also test with Ubuntu, as they provide Linux kernel builds >>>>>> for (almost) all releases in their Linux kernel mainline PPA [2].) >>>>>> >>>>> Of course I can try Ubuntu and report how it is working. >>>>> >>>> Ubuntu (5.15.0-43-generic) seems to be working in the same way >>>> generating higher load after executing "ip link set enp1s0 up". >>> >>> That is good to know. (Is this Ubuntu 22.04?) What about Ubuntu >>> 20.04 with Linux 5.4, and Ubuntu 18.04 with 4.15? >>> >>> Anyway, I think, you won’t come around bisecting. Another hint, make >>> sure that you can build a 4.9 Linux kernel yourself, that does not >>> exhibit that issue. >>> >> That`s right, it is 22.04. I don`t have to build it. 
Standard kernel >> Linux 4.9.0-6-amd64 from Debian 9.7 worked without problems for past >> 4 years. > > If nobody of the developers/maintainers is going to step up, you are > on your own. Again, as you can reproduce this easily, the fastest way > is to bisect the issue, which you can do on your own.

How can I investigate that further? I thought about trying to change some of the parameters related to ixgbe driver and observe if anything is changing, but when I am trying to do:

sudo modprobe ixgbe IntMode=0

I get the following error in the dmesg:

[ 2137.324772] ixgbe: unknown parameter 'IntMode' ignored <<<<<<<<<
[ 2137.324848] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver
[ 2137.324848] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[ 2138.505751] ixgbe 0000:02:00.0: Multiqueue Enabled: Rx Queue count = 4, Tx Queue count = 4 XDP Queue count = 0
[ 2138.506049] ixgbe 0000:02:00.0: 32.000 Gb/s available PCIe bandwidth (5.0 GT/s PCIe x8 link)
[ 2138.506134] ixgbe 0000:02:00.0: MAC: 2, PHY: 1, PBA No: 0210FF-0FF
[ 2138.506137] ixgbe 0000:02:00.0: ac:1f:6b:ab:fa:70
[ 2138.510537] ixgbe 0000:02:00.0 enp2s0: renamed from eth0
[ 2138.537452] ixgbe 0000:02:00.0: Intel(R) 10 Gigabit Network Connection

How should I use those parameters?

Best regards
Bartek Kois

> > Kind regards, > > Paul > > >>>>>> [1]: https://bugzilla.kernel.org/ >>>>>> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/
^ permalink raw reply [flat|nested] 14+ messages in thread
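Note on the "unknown parameter" message above: it indicates that the loaded ixgbe module does not define an option with that name. A small shell sketch for checking a name before passing it to `modprobe` follows. `has_param` is a hypothetical helper, and the sketch assumes that `modinfo -F parm` prints one `name:description` entry per line.

```shell
# Succeed if the parameter named in $1 appears in the
# `modinfo -F parm MODULE` output supplied on stdin.
has_param() {
    grep -q "^$1:"
}

# Example (requires the ixgbe module to be installed on the system):
#   modinfo -F parm ixgbe | has_param IntMode \
#       || echo "IntMode is not accepted by this ixgbe build"
```

Only options that pass this check can be set on the `modprobe` command line as `name=value`.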
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5 2023-01-23 18:38 ` Bartek Kois @ 2023-01-23 18:53 ` Paul Menzel 2023-01-23 18:58 ` Bartek Kois 0 siblings, 1 reply; 14+ messages in thread From: Paul Menzel @ 2023-01-23 18:53 UTC (permalink / raw) To: Bartek Kois; +Cc: intel-wired-lan, regressions Dear Bartek, Am 23.01.23 um 19:38 schrieb Bartek Kois: > > W dniu 22.01.2023 o 21:28, Paul Menzel pisze: >> Dear Bartek, >> >> >> Am 19.01.23 um 18:17 schrieb Bartek Kois: >>> W dniu 19.01.2023 o 18:09, Paul Menzel pisze: >> >>>> Am 19.01.23 um 17:58 schrieb Bartek Kois: >>>>> W dniu 19.01.2023 o 13:24, Bartek Kois pisze: >>>>>> >>>>>> W dniu 19.01.2023 o 11:17, Paul Menzel pisze: >>>>>>> >>>>>>> #regzbot ^introduced: 4.9.88..5.10.149 >>>> >>>>>>> Am 14.01.23 um 11:23 schrieb Bartek Kois: >>>>>>> >>>>>>>> After moving from Debian 9.7 to 11.5 as soon as I perform "ip >>>>>>>> link set enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel >>>>>>>> 82599EN based 10G adapter) I am experiencing high cpu load (even >>>>>>>> if no traffic is passing through the adapter) and network >>>>>>>> performance is low (when network is connected). >>>>>>> >>>>>>> How do you test the network performance? Please give exact >>>>>>> numbers for comparison. >>>>>>> >>>>>> I am using this server as a router for my subscribers with >>>>>> iptables (for NAT and firewall) and hfsc (for QoS). First I >>>>>> encountered this problem while migrating form Debian 9.7 to 11.5. >>>>>> Routers based on Supermicro X11SSL-F (Intel® C232 chipset) works >>>>>> with no problems after that migration, but routers based on >>>>>> Supermicro X9SCL (Intel C202 PCH) and Supermicro X10SLL+-F (Intel >>>>>> C222 Express PCH) starts behaving strangely with high cpu load >>>>>> (0.5-0.8 while before it was around 0.0-0.1) and subscribers not >>>>>> being able to utilize their plans. 
I tried to strip down the >>>>>> problem and ends up with clean system with no iptables or hfsc >>>>>> rules behaving the same (higher load) right after setting the 10G >>>>>> link upeven if no traffic is passing by. >>>>>> >>>>>>>> The cpu load is oscillating between 0.1 and 0.3 on vanilla system >>>>>>>> with no network attached. The problem can be observed on the >>>>>>>> following platforms: Supermicro X9SCL (Intel C202 PCH) and >>>>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the >>>>>>>> Supermicro >>>>>>>> X11SSL-F (Intel® C232 chipset) everything is working well. >>>>>>>> >>>>>>>> Tested environments: >>>>>>>> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 >>>>>>>> (2018-05-07) x86_64 GNU/Linux [all platforms working well with >>>>>>>> no problems: Supermicro X9SCL (Intel C202 PCH), Supermicro >>>>>>>> X10SLL+-F (Intel C222 Express PCH), Supermicro X11SSL-F (Intel® >>>>>>>> C232 chipset)] >>>>>>> >>>>>>>> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 >>>>>>>> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro X9SCL >>>>>>>> (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222 Express PCH) >>>>>>>> behave problematic as described above | newer platform: >>>>>>>> Supermicro X11SSL-F (Intel® C232 chipset) working well with no >>>>>>>> problems] >>>>>>> >>>>>>> Maybe create a bug at the Linux kernel bug tracker [1], where you >>>>>>> can attach all the logs (`dmesg`, `lspci -nnk -s …`, …). >>>>>>> >>>>>> I`ve already reported that to the Debian team >>>>>> ttps://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024763, but so >>>>>> far nobody took care of this issue so far. >>>>>> >>>>>>>> So far to solve the problem I was trying to upgrade system to >>>>>>>> the newest stable version, upgrade kernel to version 6.x, >>>>>>>> upgrade ixgbe driver to the newest version but with no luck. >>>>>>> >>>>>>> Thank you for checking that. Too bad it’s still present. 
To rule >>>>>>> out some user space problem, could you test Debian 9.7 with a >>>>>>> stable Linux release, currently 6.1.7? >>>>>>> >>>>>>> What does `sudo perf top --sort comm,dso` show, where the time is >>>>>>> spent? >>>>>> >>>>>> During my first test in real enviroment with subscribers I gether >>>>>> the following data through the perf: >>>>>> >>>>>> 27.83% [kernel] [k] strncpy >>>>>> 14.80% [kernel] [k] nft_do_chain >>>>>> 7.61% [kernel] [k] memcmp >>>>>> 5.63% [kernel] [k] nft_meta_get_eval >>>>>> 3.14% [kernel] [k] nft_cmp_eval >>>>>> 2.79% [kernel] [k] asm_exc_nmi >>>>>> 1.07% [kernel] [k] module_get_kallsym >>>>>> 0.92% [kernel] [k] >>>>>> kallsyms_expand_symbol.constprop.0 >>>>>> 0.85% [kernel] [k] ixgbe_poll >>>>>> 0.75% [kernel] [k] format_decode >>>>>> 0.61% [kernel] [k] number >>>>>> 0.56% [kernel] [k] menu_select >>>>>> 0.54% [kernel] [k] clflush_cache_range >>>>>> 0.52% [kernel] [k] cpuidle_enter_state >>>>>> 0.51% [kernel] [k] vsnprintf >>>>>> 0.50% [kernel] [k] u32_classify >>>>>> 0.49% [kernel] [k] fib_table_lookup >>>>>> 0.40% [kernel] [k] dma_pte_clear_level >>>>>> 0.39% [kernel] [k] domain_mapping >>>>>> 0.36% [kernel] [k] ixgbe_xmit_fram >>>>>> >>>>>> >>>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND >>>>>> 18 root 20 0 0 0 0 S 28.2 0.0 7:06.27 ksoftirqd/1 >>>>>> 12 root 20 0 0 0 0 R 12.0 0.0 4:10.88 ksoftirqd/0 >>>> >>>> […] >>>> >>>> Do you see different behavior in `/proc/interrupts`? >>>> >>> This is how it looks like for Debian 11.5 - Linux 5.10.0-19-amd64 #1 >>> SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux on Supermicro >>> X10SLL+-F (Intel C222 Express PCH): >>> >>> 1 root 20 0 163948 10288 7696 S 0.0 0.1 0:39.58 systemd >> >> […] >> >> The content of `/proc/interrupts` has a different format on my system. 
>> >> ``` >> $ head -3 /proc/interrupts >> CPU0 CPU1 CPU2 CPU3 >> 1: 55560 0 113 0 IR-IO-APIC 1-edge i8042 >> 8: 0 0 0 0 IR-IO-APIC 8-edge rtc0 >> ``` >> […] >> >>> and for Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian >>> 4.9.88-1+deb9u1 on Supermicro X10SLL+-F (Intel C222 Express PCH) >>> >>> 31659 root 20 0 0 0 0 S 0.3 0.0 0:00.92 kworker/7:0 >>> 1 root 20 0 57032 6736 5256 S 0.0 0.1 2:28.14 systemd >> >> […] >>>>>>>> Supermicro support suggested as follows: >>>>>>>> it might be kernel related debian 11.5 has kernel 5.10 which is >>>>>>>> a recent kernel it might not properly support the chipsets for >>>>>>>> X9 therefore i suggest to use RHEL or CentOS as they use much >>>>>>>> older kernel versions. I expect that with ubuntu 20.04 you see >>>>>>>> the same problem it uses kernel 5.4 >>>>>>> >>> Testing another GNU/Linux distribution for another data >>>>>>> point, might be a good idea. >>>>>>> >>>>>>> As nobody has responded yet, bisecting the issue is probably the >>>>>>> fastest way to get to the bottom of this. Luckily the problem >>>>>>> seems reproducible and you seem to be able to build a Linux >>>>>>> kernel yourself, so that should work. (For testing purposes you >>>>>>> could also test with Ubuntu, as they provide Linux kernel builds >>>>>>> for (almost) all releases in their Linux kernel mainline PPA [2].) >>>>>>> >>>>>> Of course I can try Ubuntu and report how it is working. >>>>>> >>>>> Ubuntu (5.15.0-43-generic) seems to be working in the same way >>>>> generating higher load after executing "ip link set enp1s0 up". >>>> >>>> That is good to know. (Is this Ubuntu 22.04?) What about Ubuntu >>>> 20.04 with Linux 5.4, and Ubuntu 18.04 with 4.15? >>>> >>>> Anyway, I think, you won’t come around bisecting. Another hint, make >>>> sure that you can build a 4.9 Linux kernel yourself, that does not >>>> exhibit that issue. >>>> >>> That`s right, it is 22.04. I don`t have to build it. 
Standard kernel >>> Linux 4.9.0-6-amd64 from Debian 9.7 worked without problems for past >>> 4 years. >> >> If nobody of the developers/maintainers is going to step up, you are >> on your own. Again, as you can reproduce this easily, the fastest way >> is to bisect the issue, which you can do on your own. > > How can I investigate that further? I repeat myself, please bisect the issue. It’s the fastest way. > I thought about trying to change some > of the parameters related to ixgbe driver and observe if anything is > changing, but when I am trying to do: > > sudo modprobe ixgbe IntMode=0 > > I get the following error in the dmesg: > > [ 2137.324772] ixgbe: unknown parameter 'IntMode' ignored <<<<<<<<< […] `modinfo ixgbe` shows the supported parameters. Kind regards, Paul PS: If you need help bisecting, please ask. Otherwise, I am out of this thread. >>>>>>> [1]: https://bugzilla.kernel.org/ >>>>>>> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/ ^ permalink raw reply [flat|nested] 14+ messages in thread
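Note on the bisection advice above: a rough shell sketch of the manual loop is given here. It assumes a mainline kernel checkout and that the tags v4.9 and v5.10 bracket the regression (the thread only names the Debian kernels 4.9.88 and 5.10.149). `load_verdict` is a hypothetical helper, and its default threshold of 0.1 is an assumption derived from the idle loads reported earlier, so it may need tuning.

```shell
# Manual bisect loop, run inside a mainline kernel tree.
# Every step needs a kernel build, install, and reboot:
#   git bisect start
#   git bisect good v4.9
#   git bisect bad v5.10
#   ... build, install, reboot, run "ip link set enp1s0 up", wait a few minutes ...
#   git bisect "$(load_verdict)"

# Print "good" or "bad" based on the 1-minute load average.
# $1 = loadavg file (default /proc/loadavg)
# $2 = threshold    (default 0.1, an assumption; see the note above)
load_verdict() {
    awk -v t="${2:-0.1}" \
        '{ if ($1 + 0 > t + 0) print "bad"; else print "good" }' \
        "${1:-/proc/loadavg}"
}
```

Measuring on an otherwise idle system keeps the verdict tied to the link-up behavior described in the original report.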
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5 2023-01-23 18:53 ` Paul Menzel @ 2023-01-23 18:58 ` Bartek Kois 2023-01-23 19:03 ` Paul Menzel 0 siblings, 1 reply; 14+ messages in thread From: Bartek Kois @ 2023-01-23 18:58 UTC (permalink / raw) To: Paul Menzel; +Cc: intel-wired-lan, regressions W dniu 23.01.2023 o 19:53, Paul Menzel pisze: > Dear Bartek, > > > Am 23.01.23 um 19:38 schrieb Bartek Kois: >> >> W dniu 22.01.2023 o 21:28, Paul Menzel pisze: >>> Dear Bartek, >>> >>> >>> Am 19.01.23 um 18:17 schrieb Bartek Kois: >>>> W dniu 19.01.2023 o 18:09, Paul Menzel pisze: >>> >>>>> Am 19.01.23 um 17:58 schrieb Bartek Kois: >>>>>> W dniu 19.01.2023 o 13:24, Bartek Kois pisze: >>>>>>> >>>>>>> W dniu 19.01.2023 o 11:17, Paul Menzel pisze: >>>>>>>> >>>>>>>> #regzbot ^introduced: 4.9.88..5.10.149 >>>>> >>>>>>>> Am 14.01.23 um 11:23 schrieb Bartek Kois: >>>>>>>> >>>>>>>>> After moving from Debian 9.7 to 11.5 as soon as I perform "ip >>>>>>>>> link set enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel >>>>>>>>> 82599EN based 10G adapter) I am experiencing high cpu load >>>>>>>>> (even if no traffic is passing through the adapter) and >>>>>>>>> network performance is low (when network is connected). >>>>>>>> >>>>>>>> How do you test the network performance? Please give exact >>>>>>>> numbers for comparison. >>>>>>>> >>>>>>> I am using this server as a router for my subscribers with >>>>>>> iptables (for NAT and firewall) and hfsc (for QoS). First I >>>>>>> encountered this problem while migrating form Debian 9.7 to >>>>>>> 11.5. 
Routers based on Supermicro X11SSL-F (Intel® C232 >>>>>>> chipset) works with no problems after that migration, but >>>>>>> routers based on Supermicro X9SCL (Intel C202 PCH) and >>>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH) starts behaving >>>>>>> strangely with high cpu load (0.5-0.8 while before it was around >>>>>>> 0.0-0.1) and subscribers not being able to utilize their plans. >>>>>>> I tried to strip down the problem and ends up with clean system >>>>>>> with no iptables or hfsc rules behaving the same (higher load) >>>>>>> right after setting the 10G link upeven if no traffic is passing >>>>>>> by. >>>>>>> >>>>>>>>> The cpu load is oscillating between 0.1 and 0.3 on vanilla system >>>>>>>>> with no network attached. The problem can be observed on the >>>>>>>>> following platforms: Supermicro X9SCL (Intel C202 PCH) and >>>>>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the >>>>>>>>> Supermicro >>>>>>>>> X11SSL-F (Intel® C232 chipset) everything is working well. >>>>>>>>> >>>>>>>>> Tested environments: >>>>>>>>> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 >>>>>>>>> (2018-05-07) x86_64 GNU/Linux [all platforms working well with >>>>>>>>> no problems: Supermicro X9SCL (Intel C202 PCH), Supermicro >>>>>>>>> X10SLL+-F (Intel C222 Express PCH), Supermicro X11SSL-F >>>>>>>>> (Intel® C232 chipset)] >>>>>>>> >>>>>>>>> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 >>>>>>>>> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro >>>>>>>>> X9SCL (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222 >>>>>>>>> Express PCH) behave problematic as described above | newer >>>>>>>>> platform: Supermicro X11SSL-F (Intel® C232 chipset) working >>>>>>>>> well with no problems] >>>>>>>> >>>>>>>> Maybe create a bug at the Linux kernel bug tracker [1], where >>>>>>>> you can attach all the logs (`dmesg`, `lspci -nnk -s …`, …). 
>>>>>>>> >>>>>>> I`ve already reported that to the Debian team >>>>>>> ttps://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024763, but so >>>>>>> far nobody took care of this issue so far. >>>>>>> >>>>>>>>> So far to solve the problem I was trying to upgrade system to >>>>>>>>> the newest stable version, upgrade kernel to version 6.x, >>>>>>>>> upgrade ixgbe driver to the newest version but with no luck. >>>>>>>> >>>>>>>> Thank you for checking that. Too bad it’s still present. To >>>>>>>> rule out some user space problem, could you test Debian 9.7 >>>>>>>> with a stable Linux release, currently 6.1.7? >>>>>>>> >>>>>>>> What does `sudo perf top --sort comm,dso` show, where the time >>>>>>>> is spent? >>>>>>> >>>>>>> During my first test in real enviroment with subscribers I >>>>>>> gether the following data through the perf: >>>>>>> >>>>>>> 27.83% [kernel] [k] strncpy >>>>>>> 14.80% [kernel] [k] nft_do_chain >>>>>>> 7.61% [kernel] [k] memcmp >>>>>>> 5.63% [kernel] [k] nft_meta_get_eval >>>>>>> 3.14% [kernel] [k] nft_cmp_eval >>>>>>> 2.79% [kernel] [k] asm_exc_nmi >>>>>>> 1.07% [kernel] [k] module_get_kallsym >>>>>>> 0.92% [kernel] [k] >>>>>>> kallsyms_expand_symbol.constprop.0 >>>>>>> 0.85% [kernel] [k] ixgbe_poll >>>>>>> 0.75% [kernel] [k] format_decode >>>>>>> 0.61% [kernel] [k] number >>>>>>> 0.56% [kernel] [k] menu_select >>>>>>> 0.54% [kernel] [k] clflush_cache_range >>>>>>> 0.52% [kernel] [k] cpuidle_enter_state >>>>>>> 0.51% [kernel] [k] vsnprintf >>>>>>> 0.50% [kernel] [k] u32_classify >>>>>>> 0.49% [kernel] [k] fib_table_lookup >>>>>>> 0.40% [kernel] [k] dma_pte_clear_level >>>>>>> 0.39% [kernel] [k] domain_mapping >>>>>>> 0.36% [kernel] [k] ixgbe_xmit_fram >>>>>>> >>>>>>> >>>>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM >>>>>>> TIME+ COMMAND >>>>>>> 18 root 20 0 0 0 0 S 28.2 0.0 >>>>>>> 7:06.27 ksoftirqd/1 >>>>>>> 12 root 20 0 0 0 0 R 12.0 0.0 >>>>>>> 4:10.88 ksoftirqd/0 >>>>> >>>>> […] >>>>> >>>>> Do you see different behavior in `/proc/interrupts`? 
>>>>> >>>> This is how it looks like for Debian 11.5 - Linux 5.10.0-19-amd64 >>>> #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux on >>>> Supermicro X10SLL+-F (Intel C222 Express PCH): >>>> >>>> 1 root 20 0 163948 10288 7696 S 0.0 0.1 0:39.58 >>>> systemd >>> >>> […] >>> >>> The content of `/proc/interrupts` has a different format on my system. >>> >>> ``` >>> $ head -3 /proc/interrupts >>> CPU0 CPU1 CPU2 CPU3 >>> 1: 55560 0 113 0 IR-IO-APIC 1-edge >>> i8042 >>> 8: 0 0 0 0 IR-IO-APIC 8-edge >>> rtc0 >>> ``` >>> […] >>> >>>> and for Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian >>>> 4.9.88-1+deb9u1 on Supermicro X10SLL+-F (Intel C222 Express PCH) >>>> >>>> 31659 root 20 0 0 0 0 S 0.3 0.0 0:00.92 >>>> kworker/7:0 >>>> 1 root 20 0 57032 6736 5256 S 0.0 0.1 2:28.14 >>>> systemd >>> >>> […] >>>>>>>>> Supermicro support suggested as follows: >>>>>>>>> it might be kernel related debian 11.5 has kernel 5.10 which >>>>>>>>> is a recent kernel it might not properly support the chipsets >>>>>>>>> for X9 therefore i suggest to use RHEL or CentOS as they use >>>>>>>>> much older kernel versions. I expect that with ubuntu 20.04 >>>>>>>>> you see the same problem it uses kernel 5.4 >>>>>>>> >>> Testing another GNU/Linux distribution for another data >>>>>>>> point, might be a good idea. >>>>>>>> >>>>>>>> As nobody has responded yet, bisecting the issue is probably >>>>>>>> the fastest way to get to the bottom of this. Luckily the >>>>>>>> problem seems reproducible and you seem to be able to build a >>>>>>>> Linux kernel yourself, so that should work. (For testing >>>>>>>> purposes you could also test with Ubuntu, as they provide Linux >>>>>>>> kernel builds for (almost) all releases in their Linux kernel >>>>>>>> mainline PPA [2].) >>>>>>>> >>>>>>> Of course I can try Ubuntu and report how it is working. >>>>>>> >>>>>> Ubuntu (5.15.0-43-generic) seems to be working in the same way >>>>>> generating higher load after executing "ip link set enp1s0 up". 
>>>>>
>>>>> That is good to know. (Is this Ubuntu 22.04?) What about Ubuntu
>>>>> 20.04 with Linux 5.4, and Ubuntu 18.04 with 4.15?
>>>>>
>>>>> Anyway, I think, you won’t come around bisecting. Another hint,
>>>>> make sure that you can build a 4.9 Linux kernel yourself, that
>>>>> does not exhibit that issue.
>>>>>
>>>> That`s right, it is 22.04. I don`t have to build it. Standard
>>>> kernel Linux 4.9.0-6-amd64 from Debian 9.7 worked without problems
>>>> for past 4 years.
>>>
>>> If nobody of the developers/maintainers is going to step up, you are
>>> on your own. Again, as you can reproduce this easily, the fastest
>>> way is to bisect the issue, which you can do on your own.
>>
>> How can I investigate that further?
>
> I repeat myself, please bisect the issue. It’s the fastest way.
>
>> I thought about trying to change some of the parameters related to
>> ixgbe driver and observe if anything is changing, but when I am
>> trying to do:
>>
>> sudo modprobe ixgbe IntMode=0
>>
>> I get the following error in the dmesg:
>>
>> [ 2137.324772] ixgbe: unknown parameter 'IntMode' ignored <<<<<<<<<
>
> […]
>
> `modinfo ixgbe` shows the supported parameters.
>
>
> Kind regards,
>
> Paul
>
>
> PS: If you need help bisecting, please ask. Otherwise, I am out of
> this thread.

OK, how exactly can I bisect this issue?

Best regards
Bartek Kois

>
>>>>>>>> [1]: https://bugzilla.kernel.org/
>>>>>>>> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5 2023-01-23 18:58 ` Bartek Kois @ 2023-01-23 19:03 ` Paul Menzel 2023-01-24 9:33 ` Linux kernel regression tracking (Thorsten Leemhuis) 0 siblings, 1 reply; 14+ messages in thread From: Paul Menzel @ 2023-01-23 19:03 UTC (permalink / raw) To: Bartek Kois; +Cc: intel-wired-lan, regressions Dear Bartek, Am 23.01.23 um 19:58 schrieb Bartek Kois: > W dniu 23.01.2023 o 19:53, Paul Menzel pisze: >> Am 23.01.23 um 19:38 schrieb Bartek Kois: >>> >>> W dniu 22.01.2023 o 21:28, Paul Menzel pisze: >>>> Dear Bartek, >>>> >>>> >>>> Am 19.01.23 um 18:17 schrieb Bartek Kois: >>>>> W dniu 19.01.2023 o 18:09, Paul Menzel pisze: >>>> >>>>>> Am 19.01.23 um 17:58 schrieb Bartek Kois: >>>>>>> W dniu 19.01.2023 o 13:24, Bartek Kois pisze: >>>>>>>> >>>>>>>> W dniu 19.01.2023 o 11:17, Paul Menzel pisze: >>>>>>>>> >>>>>>>>> #regzbot ^introduced: 4.9.88..5.10.149 >>>>>> >>>>>>>>> Am 14.01.23 um 11:23 schrieb Bartek Kois: >>>>>>>>> >>>>>>>>>> After moving from Debian 9.7 to 11.5 as soon as I perform "ip >>>>>>>>>> link set enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel >>>>>>>>>> 82599EN based 10G adapter) I am experiencing high cpu load >>>>>>>>>> (even if no traffic is passing through the adapter) and >>>>>>>>>> network performance is low (when network is connected). >>>>>>>>> >>>>>>>>> How do you test the network performance? Please give exact >>>>>>>>> numbers for comparison. >>>>>>>>> >>>>>>>> I am using this server as a router for my subscribers with >>>>>>>> iptables (for NAT and firewall) and hfsc (for QoS). First I >>>>>>>> encountered this problem while migrating form Debian 9.7 to >>>>>>>> 11.5. 
Routers based on Supermicro X11SSL-F (Intel® C232 >>>>>>>> chipset) works with no problems after that migration, but >>>>>>>> routers based on Supermicro X9SCL (Intel C202 PCH) and >>>>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH) starts behaving >>>>>>>> strangely with high cpu load (0.5-0.8 while before it was around >>>>>>>> 0.0-0.1) and subscribers not being able to utilize their plans. >>>>>>>> I tried to strip down the problem and ends up with clean system >>>>>>>> with no iptables or hfsc rules behaving the same (higher load) >>>>>>>> right after setting the 10G link upeven if no traffic is passing >>>>>>>> by. >>>>>>>> >>>>>>>>>> The cpu load is oscillating between 0.1 and 0.3 on vanilla system >>>>>>>>>> with no network attached. The problem can be observed on the >>>>>>>>>> following platforms: Supermicro X9SCL (Intel C202 PCH) and >>>>>>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the >>>>>>>>>> Supermicro >>>>>>>>>> X11SSL-F (Intel® C232 chipset) everything is working well. >>>>>>>>>> >>>>>>>>>> Tested environments: >>>>>>>>>> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1 >>>>>>>>>> (2018-05-07) x86_64 GNU/Linux [all platforms working well with >>>>>>>>>> no problems: Supermicro X9SCL (Intel C202 PCH), Supermicro >>>>>>>>>> X10SLL+-F (Intel C222 Express PCH), Supermicro X11SSL-F >>>>>>>>>> (Intel® C232 chipset)] >>>>>>>>> >>>>>>>>>> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 >>>>>>>>>> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro >>>>>>>>>> X9SCL (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222 >>>>>>>>>> Express PCH) behave problematic as described above | newer >>>>>>>>>> platform: Supermicro X11SSL-F (Intel® C232 chipset) working >>>>>>>>>> well with no problems] >>>>>>>>> >>>>>>>>> Maybe create a bug at the Linux kernel bug tracker [1], where >>>>>>>>> you can attach all the logs (`dmesg`, `lspci -nnk -s …`, …). 
>>>>>>>>> >>>>>>>> I`ve already reported that to the Debian team >>>>>>>> ttps://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024763, but so >>>>>>>> far nobody took care of this issue so far. >>>>>>>> >>>>>>>>>> So far to solve the problem I was trying to upgrade system to >>>>>>>>>> the newest stable version, upgrade kernel to version 6.x, >>>>>>>>>> upgrade ixgbe driver to the newest version but with no luck. >>>>>>>>> >>>>>>>>> Thank you for checking that. Too bad it’s still present. To >>>>>>>>> rule out some user space problem, could you test Debian 9.7 >>>>>>>>> with a stable Linux release, currently 6.1.7? >>>>>>>>> >>>>>>>>> What does `sudo perf top --sort comm,dso` show, where the time >>>>>>>>> is spent? >>>>>>>> >>>>>>>> During my first test in real enviroment with subscribers I >>>>>>>> gether the following data through the perf: >>>>>>>> >>>>>>>> 27.83% [kernel] [k] strncpy >>>>>>>> 14.80% [kernel] [k] nft_do_chain >>>>>>>> 7.61% [kernel] [k] memcmp >>>>>>>> 5.63% [kernel] [k] nft_meta_get_eval >>>>>>>> 3.14% [kernel] [k] nft_cmp_eval >>>>>>>> 2.79% [kernel] [k] asm_exc_nmi >>>>>>>> 1.07% [kernel] [k] module_get_kallsym >>>>>>>> 0.92% [kernel] [k] >>>>>>>> kallsyms_expand_symbol.constprop.0 >>>>>>>> 0.85% [kernel] [k] ixgbe_poll >>>>>>>> 0.75% [kernel] [k] format_decode >>>>>>>> 0.61% [kernel] [k] number >>>>>>>> 0.56% [kernel] [k] menu_select >>>>>>>> 0.54% [kernel] [k] clflush_cache_range >>>>>>>> 0.52% [kernel] [k] cpuidle_enter_state >>>>>>>> 0.51% [kernel] [k] vsnprintf >>>>>>>> 0.50% [kernel] [k] u32_classify >>>>>>>> 0.49% [kernel] [k] fib_table_lookup >>>>>>>> 0.40% [kernel] [k] dma_pte_clear_level >>>>>>>> 0.39% [kernel] [k] domain_mapping >>>>>>>> 0.36% [kernel] [k] ixgbe_xmit_fram >>>>>>>> >>>>>>>> >>>>>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM >>>>>>>> TIME+ COMMAND >>>>>>>> 18 root 20 0 0 0 0 S 28.2 0.0 >>>>>>>> 7:06.27 ksoftirqd/1 >>>>>>>> 12 root 20 0 0 0 0 R 12.0 0.0 >>>>>>>> 4:10.88 ksoftirqd/0 >>>>>> >>>>>> […] >>>>>> >>>>>> Do you 
see different behavior in `/proc/interrupts`? >>>>>> >>>>> This is how it looks like for Debian 11.5 - Linux 5.10.0-19-amd64 >>>>> #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux on >>>>> Supermicro X10SLL+-F (Intel C222 Express PCH): >>>>> >>>>> 1 root 20 0 163948 10288 7696 S 0.0 0.1 0:39.58 >>>>> systemd >>>> >>>> […] >>>> >>>> The content of `/proc/interrupts` has a different format on my system. >>>> >>>> ``` >>>> $ head -3 /proc/interrupts >>>> CPU0 CPU1 CPU2 CPU3 >>>> 1: 55560 0 113 0 IR-IO-APIC 1-edge >>>> i8042 >>>> 8: 0 0 0 0 IR-IO-APIC 8-edge >>>> rtc0 >>>> ``` >>>> […] >>>> >>>>> and for Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian >>>>> 4.9.88-1+deb9u1 on Supermicro X10SLL+-F (Intel C222 Express PCH) >>>>> >>>>> 31659 root 20 0 0 0 0 S 0.3 0.0 0:00.92 >>>>> kworker/7:0 >>>>> 1 root 20 0 57032 6736 5256 S 0.0 0.1 2:28.14 >>>>> systemd >>>> >>>> […] >>>>>>>>>> Supermicro support suggested as follows: >>>>>>>>>> it might be kernel related debian 11.5 has kernel 5.10 which >>>>>>>>>> is a recent kernel it might not properly support the chipsets >>>>>>>>>> for X9 therefore i suggest to use RHEL or CentOS as they use >>>>>>>>>> much older kernel versions. I expect that with ubuntu 20.04 >>>>>>>>>> you see the same problem it uses kernel 5.4 >>>>>>>>> >>> Testing another GNU/Linux distribution for another data >>>>>>>>> point, might be a good idea. >>>>>>>>> >>>>>>>>> As nobody has responded yet, bisecting the issue is probably >>>>>>>>> the fastest way to get to the bottom of this. Luckily the >>>>>>>>> problem seems reproducible and you seem to be able to build a >>>>>>>>> Linux kernel yourself, so that should work. (For testing >>>>>>>>> purposes you could also test with Ubuntu, as they provide Linux >>>>>>>>> kernel builds for (almost) all releases in their Linux kernel >>>>>>>>> mainline PPA [2].) >>>>>>>>> >>>>>>>> Of course I can try Ubuntu and report how it is working. 
>>>>>>>> >>>>>>> Ubuntu (5.15.0-43-generic) seems to be working in the same way >>>>>>> generating higher load after executing "ip link set enp1s0 up". >>>>>> >>>>>> That is good to know. (Is this Ubuntu 22.04?) What about Ubuntu >>>>>> 20.04 with Linux 5.4, and Ubuntu 18.04 with 4.15? >>>>>> >>>>>> Anyway, I think, you won’t come around bisecting. Another hint, >>>>>> make sure that you can build a 4.9 Linux kernel yourself, that >>>>>> does not exhibit that issue. >>>>>> >>>>> That`s right, it is 22.04. I don`t have to build it. Standard >>>>> kernel Linux 4.9.0-6-amd64 from Debian 9.7 worked without problems >>>>> for past 4 years. >>>> >>>> If nobody of the developers/maintainers is going to step up, you are >>>> on your own. Again, as you can reproduce this easily, the fastest >>>> way is to bisect the issue, which you can do on your own. >>> >>> How can I investigate that further? >> >> I repeat myself, please bisect the issue. It’s the fastest way. >> >>> I thought about trying to change some of the parameters related to >>> ixgbe driver and observe if anything is changing, but when I am >>> trying to do: >>> >>> sudo modprobe ixgbe IntMode=0 >>> >>> I get the following error in the dmesg: >>> >>> [ 2137.324772] ixgbe: unknown parameter 'IntMode' ignored <<<<<<<<< >> >> […] >> >> `modinfo ixgbe` shows the supported parameters. >> PS: If you need help bisecting, please ask. Otherwise, I am out of >> this thread. > > Ok, how exactly I can bisect this issue? What have you tried so far? As written in the past, I’d first try more distributions, for example, older Ubuntu versions. Then, if you have some range, I’d use the Ubuntu PPA, and then between the release candidate versions, only then start doing `git bisect` as documented in the documentation [3]. 
Kind regards,

Paul


>>>>>>>>> [1]: https://bugzilla.kernel.org/
>>>>>>>>> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/
[3]: https://www.kernel.org/doc/html/latest/admin-guide/bug-bisect.html
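The final `git bisect` step of the procedure above could look roughly like the sketch below. It rests on assumptions not confirmed in the thread: that a mainline clone is available at a path of your choosing, and that the mainline tags `v4.9` (good) and `v5.10` (bad) behave like the Debian kernels 4.9.88 and 5.10.149. The function names are hypothetical.

```shell
#!/bin/sh
# Rough estimate of how many kernels must be built and booted when
# bisecting a range of n commits: ceil(log2(n)).
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# Start a bisection in the mainline clone at path $1.
# Assumption: v4.9 is good and v5.10 is bad (see the lead-in).
start_bisect() {
    (
        cd "$1" || exit 1
        git bisect start &&
        git bisect bad v5.10 &&
        git bisect good v4.9
    )
}

# After each step: build and boot the kernel that git checked out, run
# "ip link set enp1s0 up", watch the load, then report the result:
#   git bisect good    # load stays low
#   git bisect bad     # load rises
# Once git names the first bad commit, save the output of
# "git bisect log" and finish with "git bisect reset".
```

Because every step halves the remaining range, even a range of a few hundred thousand commits needs about 20 test builds or fewer, which is why narrowing the range first with prebuilt distribution or PPA kernels pays off.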
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5 2023-01-23 19:03 ` Paul Menzel @ 2023-01-24 9:33 ` Linux kernel regression tracking (Thorsten Leemhuis) 2023-01-24 9:40 ` Bartek Kois 0 siblings, 1 reply; 14+ messages in thread From: Linux kernel regression tracking (Thorsten Leemhuis) @ 2023-01-24 9:33 UTC (permalink / raw) To: Paul Menzel, Bartek Kois; +Cc: intel-wired-lan, regressions On 23.01.23 20:03, Paul Menzel wrote: > Am 23.01.23 um 19:58 schrieb Bartek Kois: > >> W dniu 23.01.2023 o 19:53, Paul Menzel pisze: > >>> Am 23.01.23 um 19:38 schrieb Bartek Kois: >>>> >>>> W dniu 22.01.2023 o 21:28, Paul Menzel pisze: >>>>> Dear Bartek, >>>>> >>>>> >>>>> Am 19.01.23 um 18:17 schrieb Bartek Kois: >>>>>> W dniu 19.01.2023 o 18:09, Paul Menzel pisze: >>>>> >>>>>>> Am 19.01.23 um 17:58 schrieb Bartek Kois: >>>>>>>> W dniu 19.01.2023 o 13:24, Bartek Kois pisze: >>>>>>>>> >>>>>>>>> W dniu 19.01.2023 o 11:17, Paul Menzel pisze: >>>>>>>>>> >>>>>>>>>> #regzbot ^introduced: 4.9.88..5.10.149 >>>>>>> >>>>>>>>>> Am 14.01.23 um 11:23 schrieb Bartek Kois: >>>>>>>>>> >>>>>>>>>>> After moving from Debian 9.7 to 11.5 as soon as I perform "ip >>>>>>>>>>> link set enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel >>>>>>>>>>> 82599EN based 10G adapter) I am experiencing high cpu load >>>>>>>>>>> (even if no traffic is passing through the adapter) and >>>>>>>>>>> network performance is low (when network is connected). >>>>>>>>>> >>>>>>>>>> How do you test the network performance? Please give exact >>>>>>>>>> numbers for comparison. >>>>>>>>>> >>>>>>>>> I am using this server as a router for my subscribers with >>>>>>>>> iptables (for NAT and firewall) and hfsc (for QoS). First I >>>>>>>>> encountered this problem while migrating form Debian 9.7 to >>>>>>>>> 11.5. 
Routers based on Supermicro X11SSL-F (Intel® C232 >>>>>>>>> chipset) works with no problems after that migration, but >>>>>>>>> routers based on Supermicro X9SCL (Intel C202 PCH) and >>>>>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH) starts behaving >>>>>>>>> strangely with high cpu load (0.5-0.8 while before it was >>>>>>>>> around 0.0-0.1) and subscribers not being able to utilize their >>>>>>>>> plans. I tried to strip down the problem and ends up with clean >>>>>>>>> system with no iptables or hfsc rules behaving the same (higher >>>>>>>>> load) right after setting the 10G link upeven if no traffic is >>>>>>>>> passing by. >>>>>>>>> >>>>>>>>>>> The cpu load is oscillating between 0.1 and 0.3 on vanilla >>>>>>>>>>> system >>>>>>>>>>> with no network attached. The problem can be observed on the >>>>>>>>>>> following platforms: Supermicro X9SCL (Intel C202 PCH) and >>>>>>>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH), but for the >>>>>>>>>>> Supermicro >>>>>>>>>>> X11SSL-F (Intel® C232 chipset) everything is working well. >>>>>>>>>>> >>>>>>>>>>> Tested environments: >>>>>>>>>>> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian >>>>>>>>>>> 4.9.88-1+deb9u1 (2018-05-07) x86_64 GNU/Linux [all platforms >>>>>>>>>>> working well with no problems: Supermicro X9SCL (Intel C202 >>>>>>>>>>> PCH), Supermicro X10SLL+-F (Intel C222 Express PCH), >>>>>>>>>>> Supermicro X11SSL-F (Intel® C232 chipset)] >>>>>>>>>> >>>>>>>>>>> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 >>>>>>>>>>> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro >>>>>>>>>>> X9SCL (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222 >>>>>>>>>>> Express PCH) behave problematic as described above | newer >>>>>>>>>>> platform: Supermicro X11SSL-F (Intel® C232 chipset) working >>>>>>>>>>> well with no problems] >>>>>>>>>> >>>>>>>>>> Maybe create a bug at the Linux kernel bug tracker [1], where >>>>>>>>>> you can attach all the logs (`dmesg`, `lspci -nnk -s …`, …). 
>>>>>>>>>> >>>>>>>>> I`ve already reported that to the Debian team >>>>>>>>> ttps://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024763, but >>>>>>>>> so far nobody took care of this issue so far. >>>>>>>>> >>>>>>>>>>> So far to solve the problem I was trying to upgrade system to >>>>>>>>>>> the newest stable version, upgrade kernel to version 6.x, >>>>>>>>>>> upgrade ixgbe driver to the newest version but with no luck. >>>>>>>>>> >>>>>>>>>> Thank you for checking that. Too bad it’s still present. To >>>>>>>>>> rule out some user space problem, could you test Debian 9.7 >>>>>>>>>> with a stable Linux release, currently 6.1.7? >>>>>>>>>> >>>>>>>>>> What does `sudo perf top --sort comm,dso` show, where the time >>>>>>>>>> is spent? >>>>>>>>> >>>>>>>>> During my first test in real enviroment with subscribers I >>>>>>>>> gether the following data through the perf: >>>>>>>>> >>>>>>>>> 27.83% [kernel] [k] strncpy >>>>>>>>> 14.80% [kernel] [k] nft_do_chain >>>>>>>>> 7.61% [kernel] [k] memcmp >>>>>>>>> 5.63% [kernel] [k] nft_meta_get_eval >>>>>>>>> 3.14% [kernel] [k] nft_cmp_eval >>>>>>>>> 2.79% [kernel] [k] asm_exc_nmi >>>>>>>>> 1.07% [kernel] [k] module_get_kallsym >>>>>>>>> 0.92% [kernel] [k] >>>>>>>>> kallsyms_expand_symbol.constprop.0 >>>>>>>>> 0.85% [kernel] [k] ixgbe_poll >>>>>>>>> 0.75% [kernel] [k] format_decode >>>>>>>>> 0.61% [kernel] [k] number >>>>>>>>> 0.56% [kernel] [k] menu_select >>>>>>>>> 0.54% [kernel] [k] clflush_cache_range >>>>>>>>> 0.52% [kernel] [k] cpuidle_enter_state >>>>>>>>> 0.51% [kernel] [k] vsnprintf >>>>>>>>> 0.50% [kernel] [k] u32_classify >>>>>>>>> 0.49% [kernel] [k] fib_table_lookup >>>>>>>>> 0.40% [kernel] [k] dma_pte_clear_level >>>>>>>>> 0.39% [kernel] [k] domain_mapping >>>>>>>>> 0.36% [kernel] [k] ixgbe_xmit_fram >>>>>>>>> >>>>>>>>> >>>>>>>>> PID USER PR NI VIRT RES SHR S %CPU %MEM >>>>>>>>> TIME+ COMMAND >>>>>>>>> 18 root 20 0 0 0 0 S 28.2 0.0 >>>>>>>>> 7:06.27 ksoftirqd/1 >>>>>>>>> 12 root 20 0 0 0 0 R 12.0 0.0 >>>>>>>>> 4:10.88 
ksoftirqd/0 >>>>>>> >>>>>>> […] >>>>>>> >>>>>>> Do you see different behavior in `/proc/interrupts`? >>>>>>> >>>>>> This is how it looks like for Debian 11.5 - Linux 5.10.0-19-amd64 >>>>>> #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux on >>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH): >>>>>> >>>>>> 1 root 20 0 163948 10288 7696 S 0.0 0.1 >>>>>> 0:39.58 systemd >>>>> >>>>> […] >>>>> >>>>> The content of `/proc/interrupts` has a different format on my system. >>>>> >>>>> ``` >>>>> $ head -3 /proc/interrupts >>>>> CPU0 CPU1 CPU2 CPU3 >>>>> 1: 55560 0 113 0 IR-IO-APIC 1-edge >>>>> i8042 >>>>> 8: 0 0 0 0 IR-IO-APIC 8-edge >>>>> rtc0 >>>>> ``` >>>>> […] >>>>> >>>>>> and for Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian >>>>>> 4.9.88-1+deb9u1 on Supermicro X10SLL+-F (Intel C222 Express PCH) >>>>>> >>>>>> 31659 root 20 0 0 0 0 S 0.3 0.0 0:00.92 >>>>>> kworker/7:0 >>>>>> 1 root 20 0 57032 6736 5256 S 0.0 0.1 2:28.14 >>>>>> systemd >>>>> >>>>> […] >>>>>>>>>>> Supermicro support suggested as follows: >>>>>>>>>>> it might be kernel related debian 11.5 has kernel 5.10 which >>>>>>>>>>> is a recent kernel it might not properly support the chipsets >>>>>>>>>>> for X9 therefore i suggest to use RHEL or CentOS as they use >>>>>>>>>>> much older kernel versions. I expect that with ubuntu 20.04 >>>>>>>>>>> you see the same problem it uses kernel 5.4 >>>>>>>>>> >>> Testing another GNU/Linux distribution for another data >>>>>>>>>> point, might be a good idea. >>>>>>>>>> >>>>>>>>>> As nobody has responded yet, bisecting the issue is probably >>>>>>>>>> the fastest way to get to the bottom of this. Luckily the >>>>>>>>>> problem seems reproducible and you seem to be able to build a >>>>>>>>>> Linux kernel yourself, so that should work. (For testing >>>>>>>>>> purposes you could also test with Ubuntu, as they provide >>>>>>>>>> Linux kernel builds for (almost) all releases in their Linux >>>>>>>>>> kernel mainline PPA [2].) 
>>>>>>>>>> >>>>>>>>> Of course I can try Ubuntu and report how it is working. >>>>>>>>> >>>>>>>> Ubuntu (5.15.0-43-generic) seems to be working in the same way >>>>>>>> generating higher load after executing "ip link set enp1s0 up". >>>>>>> >>>>>>> That is good to know. (Is this Ubuntu 22.04?) What about Ubuntu >>>>>>> 20.04 with Linux 5.4, and Ubuntu 18.04 with 4.15? >>>>>>> >>>>>>> Anyway, I think, you won’t come around bisecting. Another hint, >>>>>>> make sure that you can build a 4.9 Linux kernel yourself, that >>>>>>> does not exhibit that issue. >>>>>>> >>>>>> That`s right, it is 22.04. I don`t have to build it. Standard >>>>>> kernel Linux 4.9.0-6-amd64 from Debian 9.7 worked without problems >>>>>> for past 4 years. >>>>> >>>>> If nobody of the developers/maintainers is going to step up, you >>>>> are on your own. Again, as you can reproduce this easily, the >>>>> fastest way is to bisect the issue, which you can do on your own. >>>> >>>> How can I investigate that further? >>> >>> I repeat myself, please bisect the issue. It’s the fastest way. >>> >>>> I thought about trying to change some of the parameters related to >>>> ixgbe driver and observe if anything is changing, but when I am >>>> trying to do: >>>> >>>> sudo modprobe ixgbe IntMode=0 >>>> >>>> I get the following error in the dmesg: >>>> >>>> [ 2137.324772] ixgbe: unknown parameter 'IntMode' ignored <<<<<<<<< >>> >>> […] >>> >>> `modinfo ixgbe` shows the supported parameters. > >>> PS: If you need help bisecting, please ask. Otherwise, I am out of >>> this thread. >> >> Ok, how exactly I can bisect this issue? > > What have you tried so far? As written in the past, I’d first try more > distributions, for example, older Ubuntu versions. Then, if you have > some range, I’d use the Ubuntu PPA, and then between the release > candidate versions, only then start doing `git bisect` as documented in > the documentation [3]. Hmmm. 
I'm not an expert in that area, but if you follow Paul's advice, keep in
mind that a deliberate config change by the distro might have an impact
here. Hence it might be a good idea to rule that out first by taking a
config from a working kernel and using it (with the help of "make
olddefconfig") to build your own kernel from the version that is known
to fail. But over such a wide range of versions this can be tricky. :-/

But apart from that Paul is right afaics: nobody has had an idea yet what
might cause this regression, hence we need a bisection to pinpoint the
problem.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

>>>>>>>>>> [1]: https://bugzilla.kernel.org/
>>>>>>>>>> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/
> [3]: https://www.kernel.org/doc/html/latest/admin-guide/bug-bisect.html
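The config check described above could be sketched like this. It is a minimal illustration under assumptions: the known-good config is taken from Debian's usual location (`/boot/config-4.9.0-6-amd64` for the kernel named in the thread), the source tree is already checked out at the failing version, and the helper name `build_with_good_config` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: configure and build kernel tree $2 using the
# known-good config file $1. MAKE may be overridden; default is "make".
build_with_good_config() {
    good_config=$1
    tree=$2
    cp "$good_config" "$tree/.config" || return 1
    # olddefconfig keeps the old answers and takes the default for
    # every option that is new in this kernel version.
    "${MAKE:-make}" -C "$tree" olddefconfig || return 1
    # bindeb-pkg builds installable .deb packages (Debian, Ubuntu).
    "${MAKE:-make}" -C "$tree" -j"$(nproc)" bindeb-pkg
}

# Usage (both paths are assumptions):
#   build_with_good_config /boot/config-4.9.0-6-amd64 ./linux
```

If the kernel built this way no longer shows the higher load, a distribution config change is the likely cause and comparing the two config files is quicker than a full bisection; if the load is still there, the bisection is needed.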
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5
  2023-01-24 9:33 ` Linux kernel regression tracking (Thorsten Leemhuis)
@ 2023-01-24 9:40 ` Bartek Kois
  2023-03-23 13:46 ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 14+ messages in thread
From: Bartek Kois @ 2023-01-24 9:40 UTC
To: Linux regressions mailing list, Paul Menzel; +Cc: intel-wired-lan

On 24.01.2023 at 10:33, Linux kernel regression tracking (Thorsten Leemhuis) wrote:
> On 23.01.23 20:03, Paul Menzel wrote:
>> On 23.01.23 at 19:58, Bartek Kois wrote:
>>
>>> On 23.01.2023 at 19:53, Paul Menzel wrote:
>>>> On 23.01.23 at 19:38, Bartek Kois wrote:
>>>>> On 22.01.2023 at 21:28, Paul Menzel wrote:
>>>>>> Dear Bartek,
>>>>>>
>>>>>>
>>>>>> On 19.01.23 at 18:17, Bartek Kois wrote:
>>>>>>> On 19.01.2023 at 18:09, Paul Menzel wrote:
>>>>>>>> On 19.01.23 at 17:58, Bartek Kois wrote:
>>>>>>>>> On 19.01.2023 at 13:24, Bartek Kois wrote:
>>>>>>>>>> On 19.01.2023 at 11:17, Paul Menzel wrote:
>>>>>>>>>>> #regzbot ^introduced: 4.9.88..5.10.149
>>>>>>>>>>> On 14.01.23 at 11:23, Bartek Kois wrote:
>>>>>>>>>>>
>>>>>>>>>>>> After moving from Debian 9.7 to 11.5 as soon as I perform "ip
>>>>>>>>>>>> link set enp1s0 up" for my 10G adapter (AOC-STGN-I1S - Intel
>>>>>>>>>>>> 82599EN based 10G adapter) I am experiencing high cpu load
>>>>>>>>>>>> (even if no traffic is passing through the adapter) and
>>>>>>>>>>>> network performance is low (when network is connected).
>>>>>>>>>>> How do you test the network performance? Please give exact
>>>>>>>>>>> numbers for comparison.
>>>>>>>>>>>
>>>>>>>>>> I am using this server as a router for my subscribers with
>>>>>>>>>> iptables (for NAT and firewall) and hfsc (for QoS). I first
>>>>>>>>>> encountered this problem while migrating from Debian 9.7 to
>>>>>>>>>> 11.5. Routers based on Supermicro X11SSL-F (Intel® C232
>>>>>>>>>> chipset) work with no problems after that migration, but
>>>>>>>>>> routers based on Supermicro X9SCL (Intel C202 PCH) and
>>>>>>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH) started behaving
>>>>>>>>>> strangely, with high cpu load (0.5-0.8, while before it was
>>>>>>>>>> around 0.0-0.1) and subscribers not being able to utilize their
>>>>>>>>>> plans. I tried to strip down the problem and ended up with a
>>>>>>>>>> clean system with no iptables or hfsc rules behaving the same
>>>>>>>>>> (higher load) right after setting the 10G link up, even if no
>>>>>>>>>> traffic is passing by.
>>>>>>>>>>
>>>>>>>>>>>> The cpu load is oscillating between 0.1 and 0.3 on vanilla
>>>>>>>>>>>> system with no network attached. The problem can be observed
>>>>>>>>>>>> on the following platforms: Supermicro X9SCL (Intel C202 PCH)
>>>>>>>>>>>> and Supermicro X10SLL+-F (Intel C222 Express PCH), but for the
>>>>>>>>>>>> Supermicro X11SSL-F (Intel® C232 chipset) everything is
>>>>>>>>>>>> working well.
>>>>>>>>>>>>
>>>>>>>>>>>> Tested environments:
>>>>>>>>>>>> Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian
>>>>>>>>>>>> 4.9.88-1+deb9u1 (2018-05-07) x86_64 GNU/Linux [all platforms
>>>>>>>>>>>> working well with no problems: Supermicro X9SCL (Intel C202
>>>>>>>>>>>> PCH), Supermicro X10SLL+-F (Intel C222 Express PCH),
>>>>>>>>>>>> Supermicro X11SSL-F (Intel® C232 chipset)]
>>>>>>>>>>>> Debian 11.5 - Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2
>>>>>>>>>>>> (2022-10-21) x86_64 GNU/Linux [older platforms: Supermicro
>>>>>>>>>>>> X9SCL (Intel C202 PCH), Supermicro X10SLL+-F (Intel C222
>>>>>>>>>>>> Express PCH) behave problematically as described above | newer
>>>>>>>>>>>> platform: Supermicro X11SSL-F (Intel® C232 chipset) working
>>>>>>>>>>>> well with no problems]
>>>>>>>>>>> Maybe create a bug at the Linux kernel bug tracker [1], where
>>>>>>>>>>> you can attach all the logs (`dmesg`, `lspci -nnk -s …`, …).
>>>>>>>>>>>
>>>>>>>>>> I've already reported that to the Debian team
>>>>>>>>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1024763, but
>>>>>>>>>> so far nobody has taken care of this issue.
>>>>>>>>>>
>>>>>>>>>>>> So far to solve the problem I was trying to upgrade system to
>>>>>>>>>>>> the newest stable version, upgrade kernel to version 6.x,
>>>>>>>>>>>> upgrade ixgbe driver to the newest version but with no luck.
>>>>>>>>>>> Thank you for checking that. Too bad it’s still present. To
>>>>>>>>>>> rule out some user space problem, could you test Debian 9.7
>>>>>>>>>>> with a stable Linux release, currently 6.1.7?
>>>>>>>>>>>
>>>>>>>>>>> What does `sudo perf top --sort comm,dso` show, where the time
>>>>>>>>>>> is spent?
>>>>>>>>>> During my first test in a real environment with subscribers I
>>>>>>>>>> gathered the following data through perf:
>>>>>>>>>>
>>>>>>>>>>   27.83%  [kernel]  [k] strncpy
>>>>>>>>>>   14.80%  [kernel]  [k] nft_do_chain
>>>>>>>>>>    7.61%  [kernel]  [k] memcmp
>>>>>>>>>>    5.63%  [kernel]  [k] nft_meta_get_eval
>>>>>>>>>>    3.14%  [kernel]  [k] nft_cmp_eval
>>>>>>>>>>    2.79%  [kernel]  [k] asm_exc_nmi
>>>>>>>>>>    1.07%  [kernel]  [k] module_get_kallsym
>>>>>>>>>>    0.92%  [kernel]  [k] kallsyms_expand_symbol.constprop.0
>>>>>>>>>>    0.85%  [kernel]  [k] ixgbe_poll
>>>>>>>>>>    0.75%  [kernel]  [k] format_decode
>>>>>>>>>>    0.61%  [kernel]  [k] number
>>>>>>>>>>    0.56%  [kernel]  [k] menu_select
>>>>>>>>>>    0.54%  [kernel]  [k] clflush_cache_range
>>>>>>>>>>    0.52%  [kernel]  [k] cpuidle_enter_state
>>>>>>>>>>    0.51%  [kernel]  [k] vsnprintf
>>>>>>>>>>    0.50%  [kernel]  [k] u32_classify
>>>>>>>>>>    0.49%  [kernel]  [k] fib_table_lookup
>>>>>>>>>>    0.40%  [kernel]  [k] dma_pte_clear_level
>>>>>>>>>>    0.39%  [kernel]  [k] domain_mapping
>>>>>>>>>>    0.36%  [kernel]  [k] ixgbe_xmit_fram
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>   PID USER PR NI VIRT RES SHR S %CPU %MEM   TIME+ COMMAND
>>>>>>>>>>    18 root 20  0    0   0   0 S 28.2  0.0 7:06.27 ksoftirqd/1
>>>>>>>>>>    12 root 20  0    0   0   0 R 12.0  0.0 4:10.88 ksoftirqd/0
>>>>>>>> […]
>>>>>>>>
>>>>>>>> Do you see different behavior in `/proc/interrupts`?
>>>>>>>>
>>>>>>> This is what it looks like for Debian 11.5 - Linux 5.10.0-19-amd64
>>>>>>> #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux on
>>>>>>> Supermicro X10SLL+-F (Intel C222 Express PCH):
>>>>>>>
>>>>>>>     1 root 20 0 163948 10288 7696 S 0.0 0.1 0:39.58 systemd
>>>>>> […]
>>>>>>
>>>>>> The content of `/proc/interrupts` has a different format on my system.
>>>>>>
>>>>>> ```
>>>>>> $ head -3 /proc/interrupts
>>>>>>        CPU0  CPU1  CPU2  CPU3
>>>>>>   1:  55560     0   113     0  IR-IO-APIC  1-edge  i8042
>>>>>>   8:      0     0     0     0  IR-IO-APIC  8-edge  rtc0
>>>>>> ```
>>>>>> […]
>>>>>>
>>>>>>> and for Debian 9.7 - Linux 4.9.0-6-amd64 #1 SMP Debian
>>>>>>> 4.9.88-1+deb9u1 on Supermicro X10SLL+-F (Intel C222 Express PCH)
>>>>>>>
>>>>>>> 31659 root 20 0     0    0    0 S 0.3 0.0 0:00.92 kworker/7:0
>>>>>>>     1 root 20 0 57032 6736 5256 S 0.0 0.1 2:28.14 systemd
>>>>>> […]
>>>>>>>>>>>> Supermicro support suggested as follows:
>>>>>>>>>>>> it might be kernel related debian 11.5 has kernel 5.10 which
>>>>>>>>>>>> is a recent kernel it might not properly support the chipsets
>>>>>>>>>>>> for X9 therefore i suggest to use RHEL or CentOS as they use
>>>>>>>>>>>> much older kernel versions. I expect that with ubuntu 20.04
>>>>>>>>>>>> you see the same problem it uses kernel 5.4
>>>>>>>>>>> Testing another GNU/Linux distribution for another data
>>>>>>>>>>> point, might be a good idea.
>>>>>>>>>>>
>>>>>>>>>>> As nobody has responded yet, bisecting the issue is probably
>>>>>>>>>>> the fastest way to get to the bottom of this. Luckily the
>>>>>>>>>>> problem seems reproducible and you seem to be able to build a
>>>>>>>>>>> Linux kernel yourself, so that should work. (For testing
>>>>>>>>>>> purposes you could also test with Ubuntu, as they provide
>>>>>>>>>>> Linux kernel builds for (almost) all releases in their Linux
>>>>>>>>>>> kernel mainline PPA [2].)
>>>>>>>>>>>
>>>>>>>>>> Of course I can try Ubuntu and report how it is working.
>>>>>>>>>>
>>>>>>>>> Ubuntu (5.15.0-43-generic) seems to be working in the same way
>>>>>>>>> generating higher load after executing "ip link set enp1s0 up".
>>>>>>>> That is good to know. (Is this Ubuntu 22.04?) What about Ubuntu
>>>>>>>> 20.04 with Linux 5.4, and Ubuntu 18.04 with 4.15?
>>>>>>>>
>>>>>>>> Anyway, I think, you won’t get around bisecting. Another hint,
>>>>>>>> make sure that you can build a 4.9 Linux kernel yourself, that
>>>>>>>> does not exhibit that issue.
>>>>>>>>
>>>>>>> That's right, it is 22.04. I don't have to build it. Standard
>>>>>>> kernel Linux 4.9.0-6-amd64 from Debian 9.7 worked without problems
>>>>>>> for the past 4 years.
>>>>>> If none of the developers/maintainers is going to step up, you
>>>>>> are on your own. Again, as you can reproduce this easily, the
>>>>>> fastest way is to bisect the issue, which you can do on your own.
>>>>> How can I investigate that further?
>>>> I repeat myself, please bisect the issue. It’s the fastest way.
>>>>
>>>>> I thought about trying to change some of the parameters related to
>>>>> ixgbe driver and observe if anything is changing, but when I am
>>>>> trying to do:
>>>>>
>>>>> sudo modprobe ixgbe IntMode=0
>>>>>
>>>>> I get the following error in the dmesg:
>>>>>
>>>>> [ 2137.324772] ixgbe: unknown parameter 'IntMode' ignored <<<<<<<<<
>>>> […]
>>>>
>>>> `modinfo ixgbe` shows the supported parameters.
>>>> PS: If you need help bisecting, please ask. Otherwise, I am out of
>>>> this thread.
>>> OK, how exactly can I bisect this issue?
>> What have you tried so far? As written in the past, I’d first try more
>> distributions, for example, older Ubuntu versions. Then, if you have
>> some range, I’d use the Ubuntu PPA, and then between the release
>> candidate versions, only then start doing `git bisect` as documented in
>> the documentation [3].
> Hmmm. I'm not an expert in that area, but if you follow Paul's advice
> keep in mind that a deliberate config change by the distro might have an
> impact here. Hence it might be a good idea to rule that out first by
> taking a config from a working kernel and using it (with the help of
> "make olddefconfig") to build your own kernel from the version that is
> known to fail. But over such a wide range of versions this can be
> tricky. :-/
>
> But apart from that Paul is right afaics: nobody yet had an idea what
> might cause this regression, hence we need a bisection to pin-point the
> problem.

Thanks for the advice. I'll try my best to find out which commit caused
the problem, but it will take me some time, as I have never done a
bisection, especially on that scale. What puzzles me most is that nobody
has reported this issue so far, given that these platforms together with
Debian and the Intel 82599EN NIC are, I think, quite a common
configuration.

Best regards
Bartek Kois

> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
>
>
>>>>>>>>>>> [1]: https://bugzilla.kernel.org/
>>>>>>>>>>> [2]: https://kernel.ubuntu.com/~kernel-ppa/mainline/
>> [3]: https://www.kernel.org/doc/html/latest/admin-guide/bug-bisect.html
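The bisection workflow suggested in this message (reuse the config of a working kernel via "make olddefconfig", then narrow the range with `git bisect`) could be sketched roughly as follows. This is only a sketch under stated assumptions: the helper names `classify_load` and `read_load1`, the 0.2 load threshold, the settle time, the `v4.9`/`v5.10` endpoints and the Debian config path are illustrative and not taken from the thread. The load figures reported above (around 0.0-0.1 on the working kernel, 0.5-0.8 on the failing one) are what motivates using the 1-minute load average as the good/bad signal.

```shell
#!/bin/sh
# Hypothetical helpers for a manual kernel bisection: decide whether the
# currently booted kernel counts as "good" or "bad" from the load average.

# classify_load LOAD THRESHOLD
# Prints "good" if LOAD <= THRESHOLD, otherwise "bad".
# awk is used because plain sh cannot compare floating-point numbers.
classify_load() {
    awk -v load="$1" -v limit="$2" \
        'BEGIN { if (load + 0 <= limit + 0) print "good"; else print "bad" }'
}

# read_load1 [FILE]
# Prints the 1-minute load average, i.e. the first field of /proc/loadavg
# (or of FILE, which makes the helper easy to try on a sample).
read_load1() {
    cut -d ' ' -f 1 "${1:-/proc/loadavg}"
}

# Manual session, left as comments because it needs a kernel tree, root
# privileges and a reboot per step (endpoints and paths are assumptions):
#
#   git bisect start
#   git bisect bad  v5.10
#   git bisect good v4.9
#   cp /boot/config-4.9.0-6-amd64 .config   # config of the working kernel
#   make olddefconfig                       # new options get their defaults
#   ... build, install, reboot into the kernel under test ...
#   ip link set enp1s0 up; sleep 300        # let the load average settle
#   git bisect "$(classify_load "$(read_load1)" 0.2)"
```

Because every step needs a reboot, `git bisect run` cannot drive this automatically; the helpers only make the good/bad decision repeatable between steps.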
* Re: [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5
  2023-01-24 9:40 ` Bartek Kois
@ 2023-03-23 13:46 ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 0 replies; 14+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-03-23 13:46 UTC
To: Bartek Kois, Linux regressions mailing list, Paul Menzel; +Cc: intel-wired-lan

On 24.01.23 10:40, Bartek Kois wrote:
> On 24.01.2023 at 10:33, Linux kernel regression tracking (Thorsten
> Leemhuis) wrote:
>> On 23.01.23 20:03, Paul Menzel wrote:
>>> […]
>>> What have you tried so far? As written in the past, I’d first try more
>>> distributions, for example, older Ubuntu versions. Then, if you have
>>> some range, I’d use the Ubuntu PPA, and then between the release
>>> candidate versions, only then start doing `git bisect` as documented in
>>> the documentation [3].
>> Hmmm. I'm not an expert in that area, but if you follow Paul's advice
>> keep in mind that a deliberate config change by the distro might have an
>> impact here. Hence it might be a good idea to rule that out first by
>> taking a config from a working kernel and using it (with the help of
>> "make olddefconfig") to build your own kernel from the version that is
>> known to fail. But over such a wide range of versions this can be
>> tricky. :-/
>>
>> But apart from that Paul is right afaics: nobody yet had an idea what
>> might cause this regression, hence we need a bisection to pin-point the
>> problem.
>
> Thanks for the advice. I'll try my best to find out which commit caused
> the problem, but it will take me some time, as I have never done a
> bisection, especially on that scale.

Did you ever get closer to the root of the problem?

> What puzzles me most is that nobody has reported this issue so far,
> given that these platforms together with Debian and the Intel 82599EN
> NIC are, I think, quite a common configuration.

I guess the answer is the usual: the problem only shows up in some
environments using that NIC -- for example if the firmware of the
motherboard or the configuration somehow directly or indirectly triggers
the problem.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

P.S.:

#regzbot backburner: need bisection that will take some time to get done
#regzbot poke
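Since the regression apparently depends on the environment, one cheap data point to collect on both a working and a failing kernel is the interrupt count of the NIC, which was asked for earlier in the thread via `/proc/interrupts`. The following is a minimal sketch, assuming the file has the layout quoted earlier in the thread (a header row with one column per CPU, then one row per IRQ). The function name `irq_total` is hypothetical, and matching rows by the interface name `enp1s0` is an assumption about how the driver labels its vectors on a given system.

```shell
#!/bin/sh
# Hypothetical helper: sum the per-CPU interrupt counts of every row in
# /proc/interrupts whose text contains NAME (e.g. an interface name).

# irq_total NAME [FILE]
irq_total() {
    awk -v name="$1" '
        NR == 1 { ncpu = NF; next }        # header row: one column per CPU
        index($0, name) {                  # row mentions NAME somewhere
            for (i = 2; i <= ncpu + 1; i++)
                total += $i                # fields 2..ncpu+1 hold the counts
        }
        END { print total + 0 }            # prints 0 when nothing matched
    ' "${2:-/proc/interrupts}"
}

# Possible use on a live system (commented out, the interface name is an
# assumption): sample twice and compare the delta between kernels.
#
#   before=$(irq_total enp1s0); sleep 10; after=$(irq_total enp1s0)
#   echo "interrupts in 10 s: $((after - before))"
```

A large difference in the interrupt rate between the 4.9 and 5.10 kernels on an idle link would be worth adding to the bug report, even before a bisection result is available.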
end of thread, other threads:[~2023-03-23 13:46 UTC | newest]

Thread overview: 14+ messages
[not found] <d1530cba-1a72-cae8-6a04-ed8ec0f82e6e@gmail.com>
2023-01-19 10:17 ` [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance after moving to Debian 11.5 Paul Menzel
2023-01-19 10:22 ` [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network performance " Paul Menzel
2023-01-19 12:24 ` [Intel-wired-lan] Supermicro AOC-STGN-I1S (Intel 82599EN based 10G adapter) - poor network perfomance " Bartek Kois
2023-01-19 16:58 ` Bartek Kois
2023-01-19 17:09 ` Paul Menzel
2023-01-19 17:17 ` Bartek Kois
2023-01-22 20:28 ` Paul Menzel
2023-01-23 18:38 ` Bartek Kois
2023-01-23 18:53 ` Paul Menzel
2023-01-23 18:58 ` Bartek Kois
2023-01-23 19:03 ` Paul Menzel
2023-01-24 9:33 ` Linux kernel regression tracking (Thorsten Leemhuis)
2023-01-24 9:40 ` Bartek Kois
2023-03-23 13:46 ` Linux regression tracking (Thorsten Leemhuis)