From the register dump we can see that your device is 5720-A0. There is a patch that was applied long time ago: 4d95847381228639844c7197deb8b2211274ef22- tg3: Workaround rx_discards stat bug This patch could be making the rx_discards count not report the actual discards. You can see the actual discards if you monitor the register at 0x2250, which I think matches with your application report. Can you test this and confirm? We can also see from your register dump shared earlier that the Flow Attention Register at 0x3C48 is indicating bit 6 set. That means the MBUF allocation state machine has reached mbuf low water threshold. In summary at this point it appears like the host cannot keep up with incoming traffic. On Mon, May 23, 2022 at 9:31 PM David Laight wrote: > > Ok todays test is calling tracing_off() when the napi callback > has more than 1000 packets queued. > > This is rather informative. > Just taking the trace for cpu 32 and shortening the lines somewhat > gives the following. > > -0 6.594695: irq_handler_entry: irq=87 name=em2-rx-2 > -0 6.594695: napi_schedule_prep <-tg3_msi_1shot > -0 6.594695: __napi_schedule <-tg3_msi_1shot > -0 6.594695: irq_handler_exit: irq=87 ret=handled > -0 6.594695: softirq_entry: vec=3 [action=NET_RX] > -0 6.594696: napi_schedule_prep <-tg3_rx > -0 6.594696: __napi_schedule <-tg3_rx > -0 6.594697: napi_poll: napi poll on napi struct 06e44eda for device em2 work 1 budget 64 > -0 6.594697: softirq_exit: vec=3 [action=NET_RX] > -0 6.594697: softirq_entry: vec=3 [action=NET_RX] > -0 6.594698: napi_poll: napi poll on napi struct 909def03 for device em2 work 0 budget 64 > -0 6.594698: softirq_exit: vec=3 [action=NET_RX] > -0 6.594700: irq_handler_entry: irq=87 name=em2-rx-2 > -0 6.594701: napi_schedule_prep <-tg3_msi_1shot > -0 6.594701: __napi_schedule <-tg3_msi_1shot > -0 6.594701: irq_handler_exit: irq=87 ret=handled > -0 6.594701: softirq_entry: vec=3 [action=NET_RX] > -0 6.594704: napi_schedule_prep <-tg3_rx > -0 6.594704: __napi_schedule <-tg3_rx > -0 6.594705: napi_poll: napi poll on napi struct 06e44eda for device em2 work 3 budget 64 > -0 6.594706: softirq_exit: vec=3 [action=NET_RX] > -0 6.594710: irq_handler_entry: irq=87 name=em2-rx-2 > -0 6.594710: napi_schedule_prep <-tg3_msi_1shot > -0 6.594710: __napi_schedule <-tg3_msi_1shot > -0 6.594710: irq_handler_exit: irq=87 ret=handled > -0 6.594712: sched_switch: prev_pid=0 prev_prio=120 prev_state=R ==> next_pid=2275 next_prio=49 > pid-2275 6.594720: sys_futex(uaddr: 7fbd2bfe3a88, op: 81, val: 1, utime: 1064720, uaddr2: 0, val3: 27bea) > pid-2275 6.594721: sys_futex -> 0x0 > pid-2275 6.598067: sys_epoll_wait(epfd: 61, events: 7fbd2bfe3300, maxevents: 80, timeout: 0) > pid-2275 6.598747: sched_switch: prev_pid=2275 prev_prio=49 prev_state=S ==> next_pid=175 next_prio=120 > pid-175 6.598759: sched_switch: prev_pid=175 prev_prio=120 prev_state=R ==> next_pid=819 next_prio=120 > pid-819 6.598763: sched_switch: prev_pid=819 prev_prio=120 prev_state=I ==> next_pid=175 next_prio=120 > pid-175 6.598765: softirq_entry: vec=3 [action=NET_RX] > pid-175 6.598770: __napi_schedule <-napi_complete_done > pid-175 6.598771: napi_poll: napi poll on napi struct 909def03 for device em2 work 0 budget 64 > > The first few interrupts look fine. > Then there is a 4ms delay between the irq_handler caling napi_schedule() > and the NET_RX function actually being called. > > Except that isn't the actual delay. > There are 3 relevant napi structures. > 06e44eda (horrid hash confusing things) for the rx ring of irq=87. > xxxxxxxx (not in the above trace) for the other active rx ring. > 909def03 for em2-rx-1 that is also used to copy used rx descriptors > back to the 'free buffer' ring. > > So the normal sequence is the hard_irq schedules the 'napi' for that ring. > Max 64 packets are processed (RPS cross schedules them). > Then the 'napi' for ring em2-rx-0 is scheduled. > It is this napi that is being reported complete. > Then the em2-rx-2 napi is called and finds 1000+ items on the status ring. > > During the 4ms gap em2-rx-4 is being interrupted and the napi is > processing receive packets - but the shared napi (909def03) isn't run. > (Earlier in the trace it gets scheduled by both ISR.) > > pid 175 is ksoftirqd/32 and 819 kworker/32:2. > pid 2275 is part of our application. > It is possible that it looped in userspace for a few ms. > > But the napi_schedule() at 6.594704 should have called back as > 909def03 immediately. > > I've got a lot more ftrace - but it is large and mostly boring. > > David > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > Registration No: 1397386 (Wales)