From: "Rafał Miłecki" <zajec5@gmail.com>
To: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>,
	Felix Fietkau <nbd@nbd.name>
Cc: Toshiaki Makita <toshiaki.makita1@gmail.com>,
	netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	Stefano Brivio <sbrivio@redhat.com>,
	Sabrina Dubroca <sd@queasysnail.net>,
	David Ahern <dsahern@gmail.com>, Jo-Philipp Wich <jo@mein.io>,
	Koen Vandeputte <koen.vandeputte@ncentric.com>
Subject: Re: NAT performance regression caused by vlan GRO support
Date: Fri, 5 Apr 2019 10:24:29 +0200
Message-ID: <f23dd3f7-691a-6f9b-fca0-7a78f44fcb97@gmail.com>
In-Reply-To: <31acd23f-6973-1912-7fcc-575a5d4e00e7@gmail.com>

On 05.04.2019 10:12, Rafał Miłecki wrote:
> On 05.04.2019 09:58, Toshiaki Makita wrote:
>> On 2019/04/05 16:14, Felix Fietkau wrote:
>>> On 2019-04-05 09:11, Rafał Miłecki wrote:
>>>> I guess it's GRO + csum_partial() that is to blame for this performance drop.
>>>>
>>>> Maybe csum_partial() is very fast on your powerful machine and a few extra calls
>>>> don't make a difference? I can imagine it affecting a much slower home router with
>>>> ARM cores.
>>> Most high-performance Ethernet devices implement hardware checksum
>>> offload, which completely gets rid of this overhead.
>>> Unfortunately, the BCM53xx/47xx Ethernet MAC doesn't have this, which is
>>> why you're getting such crappy performance.
>>
>> Hmm... I have now disabled rx checksum offload and run the test again, and
>> indeed I see csum_partial from the GRO path. But I also see csum_partial even
>> without GRO, from nf_conntrack_in -> tcp_packet -> __skb_checksum_complete.
>> Perhaps Rafał disabled the nf_conntrack_checksum sysctl knob?
>>
>> But anyway, even with rx csum offload disabled my machine has better
>> performance with GRO. I'm sure in some cases GRO should be disabled, but
>> I guess it's difficult to determine automatically whether we should
>> disable GRO when csum offload is not available.
> 
> A few test results:
> 
> 1) ethtool -K eth0 gro off; echo 0 > /proc/sys/net/netfilter/nf_conntrack_checksum
> [  6]  0.0-60.0 sec  6.57 GBytes   940 Mbits/sec
> 
> 2) ethtool -K eth0 gro off; echo 1 > /proc/sys/net/netfilter/nf_conntrack_checksum
> [  6]  0.0-60.0 sec  4.65 GBytes   666 Mbits/sec

For this case (GRO off and nf_conntrack_checksum enabled) I can confirm that
csum_partial() shows up in the perf output. It accounts for 13,14% of cycles
here, though, rather than the 25,46% seen when using GRO.

Samples: 38K of event 'cycles', Event count (approx.): 12209908413
   Overhead  Command          Shared Object           Symbol
+   13,14%  ksoftirqd/1      [kernel.kallsyms]       [k] csum_partial
+   10,16%  swapper          [kernel.kallsyms]       [k] v7_dma_inv_range
+    6,36%  swapper          [kernel.kallsyms]       [k] l2c210_inv_range
+    4,89%  swapper          [kernel.kallsyms]       [k] __irqentry_text_end
+    4,12%  ksoftirqd/1      [kernel.kallsyms]       [k] v7_dma_clean_range
+    3,78%  swapper          [kernel.kallsyms]       [k] bcma_host_soc_read32
+    2,76%  swapper          [kernel.kallsyms]       [k] arch_cpu_idle
+    2,45%  ksoftirqd/1      [kernel.kallsyms]       [k] __netif_receive_skb_core
+    2,37%  ksoftirqd/1      [kernel.kallsyms]       [k] l2c210_clean_range
+    1,76%  ksoftirqd/1      [kernel.kallsyms]       [k] bgmac_start_xmit
+    1,66%  swapper          [kernel.kallsyms]       [k] bgmac_poll
+    1,55%  ksoftirqd/1      [kernel.kallsyms]       [k] __dev_queue_xmit
+    1,11%  ksoftirqd/1      [kernel.kallsyms]       [k] skb_vlan_untag
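
For context on why csum_partial() is expensive without hardware offload: it
sums the packet data in software, so the CPU has to touch every received byte.
Below is a minimal Python sketch of that kind of ones'-complement checksum
(the RFC 1071 Internet checksum). It is an illustration only, not the kernel's
implementation, and the function name internet_checksum is made up here.

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum over data (illustrative sketch)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a trailing zero byte
    total = 0
    for i in range(0, len(data), 2):
        # Add each 16-bit big-endian word; this loop visits every byte.
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        # Fold any carries above 16 bits back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Example words taken from RFC 1071: 0001 f203 f4f5 f6f7
sample = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(sample)))
```

The work grows linearly with packet length, which is consistent with the cost
being far more visible on a slow ARM core than on a fast machine.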


> 3) ethtool -K eth0 gro on; echo 0 > /proc/sys/net/netfilter/nf_conntrack_checksum
> [  6]  0.0-60.0 sec  4.02 GBytes   575 Mbits/sec
> 
> 4) ethtool -K eth0 gro on; echo 1 > /proc/sys/net/netfilter/nf_conntrack_checksum
> [  6]  0.0-60.0 sec  4.04 GBytes   579 Mbits/sec
