netdev.vger.kernel.org archive mirror
From: Arnd Bergmann <arnd@arndb.de>
To: "Rafał Miłecki" <zajec5@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>, Arnd Bergmann <arnd@arndb.de>,
	Alexander Lobakin <alexandr.lobakin@intel.com>,
	Network Development <netdev@vger.kernel.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	Russell King <linux@armlinux.org.uk>,
	Felix Fietkau <nbd@nbd.name>,
	"openwrt-devel@lists.openwrt.org"
	<openwrt-devel@lists.openwrt.org>,
	Florian Fainelli <f.fainelli@gmail.com>
Subject: Re: Optimizing kernel compilation / alignments for network performance
Date: Fri, 6 May 2022 10:45:29 +0200
Message-ID: <CAK8P3a0Rouw8jHHqGhKtMu-ks--bqpVYj_+u4-Pt9VoFOK7nMw@mail.gmail.com>
In-Reply-To: <510bd08b-3d46-2fc8-3974-9d99fd53430e@gmail.com>

On Fri, May 6, 2022 at 9:44 AM Rafał Miłecki <zajec5@gmail.com> wrote:
>
> On 5.05.2022 18:04, Andrew Lunn wrote:
> >> you'll see that most used functions are:
> >> v7_dma_inv_range
> >> __irqentry_text_end
> >> l2c210_inv_range
> >> v7_dma_clean_range
> >> bcma_host_soc_read32
> >> __netif_receive_skb_core
> >> arch_cpu_idle
> >> l2c210_clean_range
> >> fib_table_lookup
> >
> > There is a lot of cache management functions here.

Indeed, so optimizing the coherency management (see Felix' reply)
is likely to help most in making the driver faster, but that does not
explain why the alignment of the object code has such a big impact
on performance.

To investigate the alignment further, what I was actually looking for
is a comparison of the profile of the slow and fast case. Here I would
expect that the slow case spends more time in one of the functions
that don't deal with cache management (maybe fib_table_lookup or
__netif_receive_skb_core).

A few other thoughts:

- bcma_host_soc_read32() is a fundamentally slow operation, maybe
  some of the calls can be turned into a relaxed read, like the readback
  in bgmac_chip_intrs_off() or the 'poll again' at the end of bgmac_poll(),
  though obviously not the one in bgmac_dma_rx_read().
  It may even be possible to avoid some of the reads entirely: checking
  for more data in bgmac_poll() may actually be counterproductive,
  depending on the workload.

- The higher-end networking SoCs are usually cache-coherent and
  can avoid the cache management entirely. There is a slim chance
  that this chip is designed that way and it just needs to be enabled
  properly. Most low-end chips don't implement the coherent
  interconnect though, and I suppose you have checked this already.

- bgmac_dma_rx_update_index() and bgmac_dma_tx_add() appear
  to have an extraneous dma_wmb(), which should be implied by the
  non-relaxed writel() in bgmac_write().

- accesses to the DMA descriptor don't show up in the profile here,
  but look like they can get misoptimized by the compiler. I would
  generally use READ_ONCE() and WRITE_ONCE() for these to
  ensure that you don't end up with extra or out-of-order accesses.
  This also makes it clearer to the reader that something special
  happens here.
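
To illustrate the last point, here is a minimal userspace sketch of
descriptor accesses that go through READ_ONCE()/WRITE_ONCE(). It is
not the bgmac code: the macros are simplified stand-ins for the kernel
ones, and struct demo_dma_desc, DEMO_CTL1_LEN_MASK and both helpers
are hypothetical names made up for this example. It only shows that
each field is loaded or stored exactly once, in program order as far
as the compiler is concerned. Ordering against the device still has to
come from a barrier, such as the one implied by the non-relaxed
writel().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel macros (assumption: the real
 * definitions in the kernel tree handle more cases than these). */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical descriptor layout and length mask, illustration only. */
#define DEMO_CTL1_LEN_MASK 0x1fffu

struct demo_dma_desc {
	uint32_t ctl0;
	uint32_t ctl1;
	uint32_t addr_low;
	uint32_t addr_high;
};

/* Fill one descriptor. Each field is written with a single volatile
 * store, so the compiler cannot tear, merge or reorder these stores
 * relative to each other. */
static void demo_desc_fill(struct demo_dma_desc *d, uint64_t dma_addr,
			   uint32_t ctl0, uint32_t ctl1)
{
	assert(d != NULL);

	WRITE_ONCE(d->addr_low, (uint32_t)dma_addr);
	WRITE_ONCE(d->addr_high, (uint32_t)(dma_addr >> 32));
	WRITE_ONCE(d->ctl0, ctl0);
	WRITE_ONCE(d->ctl1, ctl1);
}

/* Read the length field with a single volatile load of ctl1. */
static uint32_t demo_desc_len(const struct demo_dma_desc *d)
{
	assert(d != NULL);

	return READ_ONCE(d->ctl1) & DEMO_CTL1_LEN_MASK;
}
```

In a real driver the descriptor fields would be little-endian and the
values would go through cpu_to_le32()/le32_to_cpu(); that is left out
here to keep the sketch buildable in userspace.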

> > Might sound odd,
> > but have you tried disabling SMP? These cache functions need to
> > operate across all CPUs, and the communication between CPUs can slow
> > them down. If there is only one CPU, these cache functions get simpler
> > and faster.
> >
> > It just depends on your workload. If you have 1 CPU loaded to 100% and
> > the other 3 idle, you might see an improvement. If you actually need
> > more than one CPU, it will probably be worse.
>
> It seems to lower my NAT speed from ~362 Mb/s to 320 Mb/s but it feels
> more stable now (lower variations). Let me spend some time on more
> testing.
>
>
> FWIW during all my tests I was using:
> echo 2 > /sys/class/net/eth0/queues/rx-0/rps_cpus
> that is what I need to get similar speeds across iperf sessions
>
> With
> echo 0 > /sys/class/net/eth0/queues/rx-0/rps_cpus
> my NAT speeds were jumping between 4 speeds:
> 273 Mbps / 315 Mbps / 353 Mbps / 425 Mbps
> (every time I started iperf kernel jumped into one state and kept the
>   same iperf speed until stopping it and starting another session)
>
> With
> echo 1 > /sys/class/net/eth0/queues/rx-0/rps_cpus
> my NAT speeds were jumping between 2 speeds:
> 284 Mbps / 408 Mbps

Can you try using 'numactl -C' to pin the iperf processes to
a particular CPU core? This may be related to the locality of
the user process relative to where the interrupts end up.
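
If numactl is not available on the target, the same pinning can be
done from a small wrapper. The following is only a sketch, assuming
Linux with glibc; pin_to_cpu() and is_pinned_to() are hypothetical
helper names, built on sched_setaffinity(2) and sched_getaffinity(2).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* needed for cpu_set_t and the CPU_* macros */
#endif
#include <sched.h>
#include <sys/types.h>

/* Restrict process 'pid' (0 means the calling process) to a single
 * CPU. Returns 0 on success, -1 on error with errno set. */
static int pin_to_cpu(pid_t pid, int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(pid, sizeof(set), &set);
}

/* Return 1 if 'pid' is allowed to run on 'cpu' and nowhere else,
 * 0 otherwise (including when the affinity cannot be read). */
static int is_pinned_to(pid_t pid, int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	if (sched_getaffinity(pid, sizeof(set), &set) != 0)
		return 0;
	return CPU_ISSET(cpu, &set) && CPU_COUNT(&set) == 1;
}
```

A wrapper would call pin_to_cpu(0, cpu) and then exec iperf, since the
affinity mask is kept across execve(). Running it once per core should
show whether the different speed levels follow the core the user
process runs on.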

        Arnd
