From: "Rafał Miłecki" <zajec5@gmail.com>
To: netdev@vger.kernel.org, "David S. Miller" <davem@davemloft.net>,
	"Toshiaki Makita" <makita.toshiaki@lab.ntt.co.jp>,
	"Toke Høiland-Jørgensen" <toke@redhat.com>,
	"Florian Westphal" <fw@strlen.de>,
	"Eric Dumazet" <eric.dumazet@gmail.com>
Cc: Stefano Brivio <sbrivio@redhat.com>,
	Sabrina Dubroca <sd@queasysnail.net>,
	David Ahern <dsahern@gmail.com>, Felix Fietkau <nbd@nbd.name>,
	Jo-Philipp Wich <jo@mein.io>,
	Koen Vandeputte <koen.vandeputte@ncentric.com>
Subject: Re: NAT performance regression caused by vlan GRO support
Date: Sun, 7 Apr 2019 13:54:59 +0200
Message-ID: <2149862a-b12e-4025-c51d-6857d26b9a77@gmail.com>
In-Reply-To: <ff7160de-2ad3-e807-e695-497c8418b318@gmail.com>

Now I have some questions regarding possible optimizations. Note I'm not too
familiar with the net subsystem, so I may have gotten some ideas wrong.

On 07.04.2019 13:53, Rafał Miłecki wrote:
> On 04.04.2019 14:57, Rafał Miłecki wrote:
>> Long story short, starting with the commit 66e5133f19e9 ("vlan: Add GRO support
>> for non hardware accelerated vlan") - which first hit kernel 4.2 - NAT
>> performance of my router dropped by 30% - 40%.
> 
> I'll try to provide some summary for this issue. I'll focus on TCP traffic as
> that's what I happened to test.
> 
> Basically all slowdowns are related to the csum_partial(). Calculating checksum
> has a significant impact on NAT performance on less CPU powerful devices.
> 
> **********
> 
> GRO disabled
> 
> Without GRO a csum_partial() is used only when validating TCP packets in the
> nf_conntrack_tcp_packet() (known as tcp_packet() in kernels older than 5.1).
> 
> Simplified forward trace for that case:
> nf_conntrack_in
>      nf_conntrack_tcp_packet
>          tcp_error
>              if (state->net->ct.sysctl_checksum)
>                  nf_checksum
>                      nf_ip_checksum
>                          __skb_checksum_complete
> 
> That validation can be disabled using nf_conntrack_checksum sysfs and it bumps
> NAT speed for me from 666 Mb/s to 940 Mb/s (+41%).
> 
> **********
> 
> GRO enabled
> 
> First of all GRO also includes TCP validation that requires calculating a
> checksum.
> 
> Simplified forward trace for that case:
> vlan_gro_receive
>      call_gro_receive
>          inet_gro_receive
>              indirect_call_gro_receive
>                  tcp4_gro_receive
>                      skb_gro_checksum_validate
>                      tcp_gro_receive
> 
> *If* we had a way to disable that validation it *would* result in bumping NAT
> speed for me from 577 Mb/s to 825 Mb/s (+43%).

Could we have tcp4_gro_receive() behave similarly to tcp_error() and make it
respect the nf_conntrack_checksum sysctl value?

Could we simply add something like:
if (dev_net(skb->dev)->ct.sysctl_checksum)
to it (to additionally protect a skb_gro_checksum_validate() call)?


> Secondly using GRO means we need to calculate a checksum before transmitting
> packets (applies to devices without HW checksum offloading). I think it's
> related to packets merging in the skb_gro_receive() and then setting
> CHECKSUM_PARTIAL:
> 
> vlan_gro_complete
>      inet_gro_complete
>          tcp4_gro_complete
>              tcp_gro_complete
>                  skb->ip_summed = CHECKSUM_PARTIAL;
> 
> That results in bgmac calculating a checksum from the scratch, take a look at
> the bgmac_dma_tx_add() which does:
> 
> if (skb->ip_summed == CHECKSUM_PARTIAL)
>      skb_checksum_help(skb);
> 
> Performing that whole checksum calculation will always result in GRO slowing
> down NAT for me when using BCM47094 SoC with that not-so-powerful ARM CPUs.

Is it possible to avoid CHECKSUM_PARTIAL & skb_checksum_help(), which has to
calculate a whole checksum? It's definitely possible to *update* a checksum
after simple packet changes (e.g. amending an IP address or port). Would it be
possible to use a similar method when dealing with packets with GRO enabled?
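
To illustrate what I mean by *updating* a checksum: the one's-complement
adjustment from RFC 1624 patches an existing checksum when a single 16-bit word
changes, without walking the whole packet again. Below is a minimal userspace
sketch of that idea. The names csum_full() and csum_update16() are made up for
this example and are not kernel APIs (the kernel has its own helpers for this,
e.g. csum_replace2()). It only shows the arithmetic; whether it can be applied
to merged GRO packets is exactly the open question.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: full Internet checksum (RFC 1071) over a buffer,
 * reading 16-bit words in network byte order. Cost grows with len.
 */
static uint16_t csum_full(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (len & 1)			/* pad a trailing odd byte with zero */
		sum += (uint32_t)buf[len - 1] << 8;
	while (sum >> 16)		/* fold carries back into 16 bits */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/*
 * Hypothetical helper: incremental update, RFC 1624 eqn. 3:
 *     HC' = ~(~HC + ~m + m')
 * where HC is the old checksum, m the old word and m' the new word.
 * Cost is constant, independent of the packet length.
 */
static uint16_t csum_update16(uint16_t check, uint16_t old_word,
			      uint16_t new_word)
{
	uint32_t sum = (uint16_t)~check;

	sum += (uint16_t)~old_word;
	sum += new_word;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Changing one word and calling csum_update16() with the old checksum gives the
same value as running csum_full() over the modified buffer, at a fraction of
the cost for large packets.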

If not, maybe we really need to think about some good & clever condition for
disabling GRO by default on hw without checksum offloading.
