Hi Julia, > Have you tested on a vanilla (non-RT) kernel? I doubt there is anything RT specific > about what you are seeing, but it might be nice to get confirmation. Also, bisection > would probably be easier if you confirm on a vanilla kernel. > > I find it unlikely that it's a kernel config option that changed which regressed you, but > instead was a code change to a driver. Which driver is now the question, and the > surface area is still big (processor mapping attributes for this region, PCI root > complex configuration, PCI brige configuration, igb driver itself, etc.). > > Big enough that I'd recommend a bisection. It looks like a bisection between 3.18 > and 4.8 would take you about 18 tries to narrow down, assuming all goes well. > I have now repeated my tests using the vanilla kernel. There I got the very same issue. Using kernel 4.0 is fine, however starting with kernel 4.1, the issue appears. Here is my exact (reproducible) test description: I applied the following patch to the kernel to get the igb trace. This patch instruments the igb_rd32() function to measure the call to readl() which is used to access registers of the igb NIC. ++++++++++++++++++ BEGIN PATCH ++++++++++++++++++++++++++++++++++++ Index: linux-4.8/drivers/net/ethernet/intel/igb/trace.h =================================================================== --- /dev/null +++ linux-4.8/drivers/net/ethernet/intel/igb/trace.h @@ -0,0 +1,34 @@ + +#if !defined(_TRACE_IGB_H_) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_IGB_H_ + +#include +#undef TRACE_SYSTEM +#define TRACE_SYSTEM igb + + +#define _TRACE_H_ + + +TRACE_EVENT(igb, + TP_PROTO(u32 val), + TP_ARGS(val), + TP_STRUCT__entry( + __field(u32, val) + ), + TP_fast_assign( + __entry->val = val; + ), + TP_printk("val: %u", + __entry->val) +); + + +#endif + +#undef TRACE_INCLUDE_PATH +#define TRACE_INCLUDE_PATH drivers/net/ethernet/intel/igb +#undef TRACE_INCLUDE_FILE +#define TRACE_INCLUDE_FILE trace + +#include Index: linux-4.8/drivers/net/ethernet/intel/igb/Makefile =================================================================== --- linux-4.8.orig/drivers/net/ethernet/intel/igb/Makefile +++ linux-4.8/drivers/net/ethernet/intel/igb/Makefile @@ -28,6 +28,7 @@ # # Makefile for the Intel(R) 82575 PCI-Express ethernet driver # +ccflags-y += -I. obj-$(CONFIG_IGB) += igb.o Index: linux-4.8/drivers/net/ethernet/intel/igb/igb_main.c =================================================================== --- linux-4.8.orig/drivers/net/ethernet/intel/igb/igb_main.c +++ linux-4.8/drivers/net/ethernet/intel/igb/igb_main.c @@ -57,6 +57,9 @@ #include #include "igb.h" +#define CREATE_TRACE_POINTS +#include "trace.h" + #define MAJ 5 #define MIN 3 #define BUILD 0 @@ -753,7 +756,9 @@ u32 igb_rd32(struct e1000_hw *hw, u32 re if (E1000_REMOVED(hw_addr)) return ~value; + trace_igb(801); value = readl(&hw_addr[reg]); + trace_igb(802); /* reads should not return all F's */ if (!(~value) && (!reg || !(~readl(hw_addr)))) { ++++++++++++++++++ END PATCH ++++++++++++++++++++++++++++++++++++ I build the kernel with this patch applied, rebooted the PC to run this kernel and used the following script for my test. ++++++++++++++++++ BEGIN SCRIPT +++++++++++++++++++++++++++++++++ #!/bin/bash for f in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor ; do if [ -w $f ]; then echo "performance" > $f fi done if true; then rmmod igb modprobe igb ethtool -L eth2 combined 1 ifconfig eth2 up 192.168.100.111 fi ifconfig mount /sys/kernel/debug ( cd /sys/kernel/debug/tracing echo 0 > tracing_on echo 0 > events/enable echo 1 > events/igb/enable echo "print-parent" > trace_options echo "latency-format" > trace_options echo 1 > tracing_on sleep 4 cat trace ) ++++++++++++++++++ END SCRIPT +++++++++++++++++++++++++++++++++ The results of this for kernel 4.0: [...] kworker/-1239 3...1 49699046us : igb: val: 801 kworker/-1239 3...1 49699047us : igb: val: 802 kworker/-1239 3...1 49699047us : igb: val: 801 kworker/-1239 3...1 49699048us+: igb: val: 802 kworker/-1239 3...1 49699099us : igb: val: 801 kworker/-1239 3...1 49699100us : igb: val: 802 kworker/-1239 3...1 49699100us : igb: val: 801 kworker/-1239 3...1 49699102us : igb: val: 802 kworker/-1239 3...1 49699102us : igb: val: 801 kworker/-1239 3...1 49699103us : igb: val: 802 kworker/-1239 3...1 49699103us : igb: val: 801 kworker/-1239 3...1 49699104us : igb: val: 802 kworker/-1239 3...1 49699104us : igb: val: 801 kworker/-1239 3...1 49699105us : igb: val: 802 kworker/-1239 3...1 49699105us : igb: val: 801 kworker/-1239 3...1 49699107us : igb: val: 802 kworker/-1239 3...1 49699107us : igb: val: 801 kworker/-1239 3...1 49699108us : igb: val: 802 kworker/-1239 3...1 49699108us : igb: val: 801 kworker/-1239 3...1 49699109us : igb: val: 802 kworker/-1239 3...1 49699109us : igb: val: 801 kworker/-1239 3...1 49699110us : igb: val: 802 kworker/-1239 3...1 49699110us : igb: val: 801 kworker/-1239 3...1 49699111us : igb: val: 802 kworker/-1239 3...1 49699111us : igb: val: 801 kworker/-1239 3...1 49699113us+: igb: val: 802 kworker/-1239 3...1 49699163us : igb: val: 801 kworker/-1239 3...1 49699164us : igb: val: 802 kworker/-1239 3...1 49699164us : igb: val: 801 kworker/-1239 3...1 49699166us : igb: val: 802 kworker/-1239 3...1 49699166us : igb: val: 801 kworker/-1239 3...1 49699167us : igb: val: 802 kworker/-1239 3...1 49699167us : igb: val: 801 kworker/-1239 3...1 49699168us : igb: val: 802 kworker/-1239 3...1 49699168us : igb: val: 801 kworker/-1239 3...1 49699169us : igb: val: 802 kworker/-1239 3...1 49699169us : igb: val: 801 kworker/-1239 3...1 49699170us : igb: val: 802 kworker/-1239 3...1 49699171us : igb: val: 801 kworker/-1239 3...1 49699173us : igb: val: 802 kworker/-1239 3...1 49699173us : igb: val: 801 kworker/-1239 3...1 49699174us : igb: val: 802 The results of this for kernel 4.1: [...] kworker/-1333 2...1 75697302us+: igb: val: 801 kworker/-1333 2...1 75697323us+: igb: val: 802 kworker/-1333 2...1 75697373us+: igb: val: 801 kworker/-1333 2...1 75697395us : igb: val: 802 kworker/-1333 2...1 75697395us+: igb: val: 801 kworker/-1333 2...1 75697415us : igb: val: 802 kworker/-1333 2...1 75697416us+: igb: val: 801 kworker/-1333 2...1 75697436us : igb: val: 802 kworker/-1333 2...1 75697436us+: igb: val: 801 kworker/-1333 2...1 75697457us : igb: val: 802 kworker/-1333 2...1 75697457us+: igb: val: 801 kworker/-1333 2...1 75697478us : igb: val: 802 kworker/-1333 2...1 75697478us+: igb: val: 801 kworker/-1333 2...1 75697499us : igb: val: 802 kworker/-1333 2...1 75697499us+: igb: val: 801 kworker/-1333 2...1 75697520us : igb: val: 802 kworker/-1333 2...1 75697520us+: igb: val: 801 kworker/-1333 2...1 75697541us : igb: val: 802 kworker/-1333 2...1 75697541us+: igb: val: 801 kworker/-1333 2...1 75697562us : igb: val: 802 kworker/-1333 2...1 75697562us+: igb: val: 801 kworker/-1333 2...1 75697583us : igb: val: 802 kworker/-1333 2...1 75697583us+: igb: val: 801 kworker/-1333 2...1 75697604us+: igb: val: 802 kworker/-1333 2...1 75697654us+: igb: val: 801 kworker/-1333 2...1 75697675us : igb: val: 802 kworker/-1333 2...1 75697675us+: igb: val: 801 kworker/-1333 2...1 75697696us : igb: val: 802 kworker/-1333 2...1 75697696us+: igb: val: 801 kworker/-1333 2...1 75697717us : igb: val: 802 kworker/-1333 2...1 75697717us+: igb: val: 801 kworker/-1333 2...1 75697738us : igb: val: 802 kworker/-1333 2...1 75697738us+: igb: val: 801 kworker/-1333 2...1 75697759us : igb: val: 802 kworker/-1333 2...1 75697759us+: igb: val: 801 kworker/-1333 2...1 75697780us : igb: val: 802 kworker/-1333 2...1 75697781us+: igb: val: 801 kworker/-1333 2...1 75697801us : igb: val: 802 kworker/-1333 2...1 75697801us+: igb: val: 801 kworker/-1333 2...1 75697822us : igb: val: 802 It can be cleary seen that with the kernel 4.1 the time for the execution of the "readl" function is much slower! It tasks always about 21us! In comparison to that the kernel 4.0 is fast. Here the time for "readl" is about 1-2 us. All measurements have been done on the very same PC. The PC is an HP EliteDesk 800 G1, a dual port Intel i350-T2 server adapter has been plugged in. CPU: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz x86_64 bit mode. Please find attached the kernel configuration for the 4.1 kernel. I had a look at the generated .o file of igb_main.o. However, in both cases - kernel 4.0 and kernel 4.1 - the code for the igb_rd32 function looks the same. Especially the call to the readl() function is the very same. This means, that I think that some other stuff in kernel 4.1 has changed, which has impact on the igb accesses. Any idea what component could cause this kind of issue? Thanks for any feedback! Regards Mathias