From: David Laight <David.Laight@ACULAB.COM>
To: "netdev@vger.kernel.org" <netdev@vger.kernel.org>
Cc: "'mchan@broadcom.com'" <mchan@broadcom.com>,
	David Miller <davem@davemloft.net>
Subject: tg3 dropping packets at high packet rates
Date: Wed, 18 May 2022 16:08:47 +0000
Message-ID: <70a20d8f91664412ae91e401391e17cb@AcuMS.aculab.com>

I'm trying to see why the tg3 driver is dropping a lot of
receive packets.

(This driver is making my head hurt...)

I think that the rx_packets count (sum of rx_[umb]cast_packets)
covers all the packets, but a smaller number are actually
processed by tg3_rx().
None of the error counts get increased either.

It is as if it has lost almost all the receive buffers.

If I read /sys/class/net/em2/statistics/rx_packets every second
delaying with:
  syscall(SYS_clock_nanosleep, CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, NULL);
about every 43 seconds I get a zero increment.
This really doesn't help!
I've put a count into tg3_rx() that seems to match what IP/UDP
and the application see.
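
For reference, the once-a-second sampling described above can be
sketched roughly as below. This is a minimal sketch, not the actual
test program: it uses the libc clock_nanosleep() wrapper rather than
the raw syscall, and the helper names read_counter(),
advance_deadline() and print_deltas() are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <time.h>

/* Read one unsigned decimal counter from a sysfs-style file.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
static int read_counter(const char *path, unsigned long long *val)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%llu", val) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Move an absolute deadline forward by ns nanoseconds,
 * keeping tv_nsec in the range [0, 1e9). */
static void advance_deadline(struct timespec *ts, long ns)
{
	assert(ns >= 0);
	ts->tv_sec += ns / 1000000000L;
	ts->tv_nsec += ns % 1000000000L;
	if (ts->tv_nsec >= 1000000000L) {
		ts->tv_sec++;
		ts->tv_nsec -= 1000000000L;
	}
}

/* Print the counter delta for each of 'samples' periods.
 * Absolute deadlines are used so the sample times do not drift. */
static int print_deltas(const char *path, long period_ns, int samples)
{
	struct timespec ts;
	unsigned long long prev, cur;

	if (read_counter(path, &prev))
		return -1;
	clock_gettime(CLOCK_MONOTONIC, &ts);
	while (samples-- > 0) {
		advance_deadline(&ts, period_ns);
		/* clock_nanosleep() returns the error number directly. */
		while (clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME,
				       &ts, NULL) == EINTR)
			;
		if (read_counter(path, &cur))
			return -1;
		printf("%llu\n", cur - prev);
		prev = cur;
	}
	return 0;
}
```

Calling print_deltas("/sys/class/net/em2/statistics/rx_packets",
1000000000L, 60) would print sixty one-second deltas, which is where
a zero increment would show up.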

The traffic flow is pretty horrid (but could be worse).
There are 8000 small UDP packets every 20ms.
These are reasonably spread through the 20ms (not back to back).
All the destination ports are different (8000 receiving sockets).
(The receiving application handles this fine (now).)
The packets come from two different systems.

Firstly RSS doesn't seem to work very well.
With the current driver I think everything hits 2 rings.
With the 3.10 RHEL driver it all ends up in one.

Anyway after a hint from Eric I enabled RPS.
This offloads the IP and UDP processing enough to stop
any of the CPUs (only 40 of them) from reporting even 50% busy.

I've also increased the rx ring size to 2047.
Changing the coalescing parameters seems to have no effect.

I think there should be 2047 receive buffers.
So 4 interrupts every 20ms or 200/sec might be enough
to receive all the frames.
The actual interrupt rate (deltas on /proc/interrupts)
is actually over 80000/sec.
So it doesn't look as though the driver is ever processing
many packets/interrupt.
If the driver were getting behind I'd expect a smaller number
of interrupts.

This would be consistent with there only being (say) 8 active
receive buffers.
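
The arithmetic behind those figures can be checked with a short
sketch. The function names are hypothetical, and the "minimum"
interrupt rate assumes the idealised case where each interrupt
drains a completely full ring.

```c
#include <assert.h>

/* Packet rate for a burst of pkts packets repeated every burst_ms. */
static unsigned long packets_per_sec(unsigned long pkts,
				     unsigned long burst_ms)
{
	assert(burst_ms > 0);
	return pkts * 1000UL / burst_ms;
}

/* Fewest interrupts per second that could drain every burst if
 * each interrupt emptied a full ring of ring_size buffers. */
static unsigned long min_irqs_per_sec(unsigned long pkts,
				      unsigned long burst_ms,
				      unsigned long ring_size)
{
	unsigned long irqs_per_burst;

	assert(burst_ms > 0 && ring_size > 0);
	irqs_per_burst = (pkts + ring_size - 1) / ring_size; /* round up */
	return irqs_per_burst * 1000UL / burst_ms;
}

/* Average packets handled per interrupt at an observed rate. */
static double packets_per_irq(unsigned long pps,
			      unsigned long irqs_per_sec)
{
	assert(irqs_per_sec > 0);
	return (double)pps / (double)irqs_per_sec;
}
```

With 8000 packets every 20ms this gives 400000 packets/sec, a
minimum of 200 interrupts/sec for a 2047 entry ring, and an average
of 5 packets per interrupt at 80000 interrupts/sec.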

The device in question identifies as:

tg3 0000:02:00.0 eth0: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address xx
tg3 0000:02:00.0 eth0: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
tg3 0000:02:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
tg3 0000:02:00.0 eth0: dma_rwctrl[00000001] dma_mask[64-bit]

Any idea where to look?

Or should I just use different Ethernet hardware?
(Although the interrupt coalescing parameters for igb are
also completely broken for this traffic flow.)

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)