Subject: Re: NAT performance regression caused by vlan GRO support
From: Rafał Miłecki
To: Toshiaki Makita
Cc: Toshiaki Makita, netdev@vger.kernel.org, "David S. Miller",
    Stefano Brivio, Sabrina Dubroca, David Ahern, Felix Fietkau,
    Jo-Philipp Wich, Koen Vandeputte
Date: Fri, 5 Apr 2019 09:11:23 +0200

On 05.04.2019 07:48, Rafał Miłecki wrote:
> On 05.04.2019 06:26, Toshiaki Makita wrote:
>> My test results:
>>
>> Receiving packets from eth0.10, forwarding them to eth0.20 and applying
>> MASQUERADE on eth0.20, using i40e 25G NIC on kernel 4.20.13.
>> Disabled rxvlan by ethtool -K to exercise vlan_gro_receive().
>> Measured TCP throughput by netperf.
>>
>> GRO on : 17 Gbps
>> GRO off:  5 Gbps
>>
>> So I failed to reproduce your problem.
>
> :( Thanks for trying & checking that!
>
>
>> Would you check the CPU usage by "mpstat -P ALL" or similar (like "sar
>> -u ALL -P ALL") to check if the traffic is able to consume 100% CPU on
>> your machine?
>
> 1) ethtool -K eth0 gro on + iperf running (577 Mb/s)
> root@OpenWrt:/# mpstat -P ALL 10 3
> Linux 5.1.0-rc3+ (OpenWrt)      03/27/19        _armv7l_        (2 CPU)
>
> 16:33:40     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> 16:33:50     all    0.00    0.00    0.00    0.00    0.00   58.79    0.00    0.00   41.21
> 16:33:50       0    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00
> 16:33:50       1    0.00    0.00    0.00    0.00    0.00   17.58    0.00    0.00   82.42
>
> 16:33:50     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> 16:34:00     all    0.00    0.00    0.05    0.00    0.00   59.44    0.00    0.00   40.51
> 16:34:00       0    0.00    0.00    0.10    0.00    0.00   99.90    0.00    0.00    0.00
> 16:34:00       1    0.00    0.00    0.00    0.00    0.00   18.98    0.00    0.00   81.02
>
> 16:34:00     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> 16:34:10     all    0.00    0.00    0.00    0.00    0.00   59.59    0.00    0.00   40.41
> 16:34:10       0    0.00    0.00    0.00    0.00    0.00  100.00    0.00    0.00    0.00
> 16:34:10       1    0.00    0.00    0.00    0.00    0.00   19.18    0.00    0.00   80.82
>
> Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> Average:     all    0.00    0.00    0.02    0.00    0.00   59.27    0.00    0.00   40.71
> Average:       0    0.00    0.00    0.03    0.00    0.00   99.97    0.00    0.00    0.00
> Average:       1    0.00    0.00    0.00    0.00    0.00   18.58    0.00    0.00   81.42
>
>
> 2) ethtool -K eth0 gro off + iperf running (941 Mb/s)
> root@OpenWrt:/# mpstat -P ALL 10 3
> Linux 5.1.0-rc3+ (OpenWrt)      03/27/19        _armv7l_        (2 CPU)
>
> 16:34:39     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> 16:34:49     all    0.00    0.00    0.05    0.00    0.00   86.91    0.00    0.00   13.04
> 16:34:49       0    0.00    0.00    0.10    0.00    0.00   78.22    0.00    0.00   21.68
> 16:34:49       1    0.00    0.00    0.00    0.00    0.00   95.60    0.00    0.00    4.40
>
> 16:34:49     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> 16:34:59     all    0.00    0.00    0.10    0.00    0.00   87.06    0.00    0.00   12.84
> 16:34:59       0    0.00    0.00    0.20    0.00    0.00   79.72    0.00    0.00   20.08
> 16:34:59       1    0.00    0.00    0.00    0.00    0.00   94.41    0.00    0.00    5.59
>
> 16:34:59     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> 16:35:09     all    0.00    0.00    0.05    0.00    0.00   85.71    0.00    0.00   14.24
> 16:35:09       0    0.00    0.00    0.10    0.00    0.00   79.42    0.00    0.00   20.48
> 16:35:09       1    0.00    0.00    0.00    0.00    0.00   92.01    0.00    0.00    7.99
>
> Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> Average:     all    0.00    0.00    0.07    0.00    0.00   86.56    0.00    0.00   13.37
> Average:       0    0.00    0.00    0.13    0.00    0.00   79.12    0.00    0.00   20.75
> Average:       1    0.00    0.00    0.00    0.00    0.00   94.01    0.00    0.00    5.99
>
>
> 3) System idle (no iperf)
> root@OpenWrt:/# mpstat -P ALL 10 1
> Linux 5.1.0-rc3+ (OpenWrt)      03/27/19        _armv7l_        (2 CPU)
>
> 16:35:31     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> 16:35:41     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 16:35:41       0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> 16:35:41       1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
>
> Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle
> Average:     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> Average:       0    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
> Average:       1    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
>
>
>> If CPU is 100%, perf may help us analyze your problem. If it's
>> available, try running below while testing:
>> # perf record -a -g -- sleep 5
>>
>> And then run this after testing:
>> # perf report --no-child
>
> I can see my CPU 0 is fully loaded when using "gro on". I'll try perf now.

I guess it's GRO + csum_partial() that is to blame for this performance
drop.

Maybe csum_partial() is very fast on your powerful machine and a few
extra calls don't make a difference? I can imagine it affecting a much
slower home router with ARM cores.


1) ethtool -K eth0 gro on

Samples: 34K of event 'cycles', Event count (approx.): 10041345370
  Overhead  Command      Shared Object      Symbol
+   25,46%  ksoftirqd/0  [kernel.kallsyms]  [k] csum_partial
+    8,82%  ksoftirqd/0  [kernel.kallsyms]  [k] v7_dma_inv_range
+    6,03%  swapper      [kernel.kallsyms]  [k] arch_cpu_idle
+    4,08%  ksoftirqd/0  [kernel.kallsyms]  [k] v7_dma_clean_range
+    3,82%  ksoftirqd/0  [kernel.kallsyms]  [k] l2c210_inv_range
+    3,14%  swapper      [kernel.kallsyms]  [k] rcu_idle_exit
+    3,00%  ksoftirqd/0  [kernel.kallsyms]  [k] l2c210_clean_range
+    2,43%  ksoftirqd/0  [kernel.kallsyms]  [k] bgmac_start_xmit
+    1,24%  swapper      [kernel.kallsyms]  [k] csum_partial
+    1,20%  swapper      [kernel.kallsyms]  [k] do_idle
+    1,19%  swapper      [kernel.kallsyms]  [k] skb_segment
+    1,19%  ksoftirqd/0  [kernel.kallsyms]  [k] arm_dma_unmap_page
+    1,00%  ksoftirqd/0  [kernel.kallsyms]  [k] bgmac_poll
+    0,95%  ksoftirqd/0  [kernel.kallsyms]  [k] __slab_free.constprop.3
+    0,80%  ksoftirqd/0  [kernel.kallsyms]  [k] skb_release_data
+    0,77%  swapper      [kernel.kallsyms]  [k] __dev_queue_xmit
+    0,73%  ksoftirqd/0  [kernel.kallsyms]  [k] build_skb
+    0,68%  ksoftirqd/0  [kernel.kallsyms]  [k] skb_segment
+    0,66%  ksoftirqd/0  [kernel.kallsyms]  [k] mmiocpy
+    0,66%  ksoftirqd/0  [kernel.kallsyms]  [k] skb_checksum_help
+    0,65%  ksoftirqd/0  [kernel.kallsyms]  [k] dev_gro_receive
+    0,64%  ksoftirqd/0  [kernel.kallsyms]  [k] page_address
+    0,62%  ksoftirqd/0  [kernel.kallsyms]  [k] __qdisc_run
+    0,62%  ksoftirqd/0  [kernel.kallsyms]  [k] dma_cache_maint_page
+    0,59%  swapper      [kernel.kallsyms]  [k] __kmalloc_track_caller
+    0,59%  swapper      [kernel.kallsyms]  [k] mmiocpy
+    0,58%  ksoftirqd/0  [kernel.kallsyms]  [k] sch_direct_xmit
+    0,55%  ksoftirqd/0  [kernel.kallsyms]  [k] mmioset
+    0,52%  ksoftirqd/0  [kernel.kallsyms]  [k] inet_gro_receive
     0,49%  ksoftirqd/0  [kernel.kallsyms]  [k] netdev_alloc_frag
     0,47%  swapper      [kernel.kallsyms]  [k] __netif_receive_skb_core
     0,45%  swapper      [kernel.kallsyms]  [k] kmem_cache_alloc
     0,45%  ksoftirqd/0  [kernel.kallsyms]  [k] __skb_checksum
     0,43%  swapper      [kernel.kallsyms]  [k] v7_dma_clean_range
     0,39%  ksoftirqd/0  [kernel.kallsyms]  [k] kmem_cache_alloc
     0,36%  ksoftirqd/0  [kernel.kallsyms]  [k] qdisc_dequeue_head
     0,36%  ksoftirqd/0  [kernel.kallsyms]  [k] arm_dma_map_page
     0,35%  swapper      [kernel.kallsyms]  [k] mmioset
     0,34%  ksoftirqd/0  [kernel.kallsyms]  [k] tcp_gro_receive
     0,33%  swapper      [kernel.kallsyms]  [k] __copy_skb_header
     0,33%  ksoftirqd/0  [kernel.kallsyms]  [k] kmem_cache_free
     0,32%  ksoftirqd/0  [kernel.kallsyms]  [k] netif_skb_features
     0,30%  swapper      [kernel.kallsyms]  [k] netif_skb_features
     0,30%  ksoftirqd/0  [kernel.kallsyms]  [k] __skb_flow_dissect


2) ethtool -K eth0 gro off

Samples: 39K of event 'cycles', Event count (approx.): 13065826851
  Overhead  Command      Shared Object      Symbol
+   11,09%  swapper      [kernel.kallsyms]  [k] v7_dma_inv_range
+    5,86%  ksoftirqd/1  [kernel.kallsyms]  [k] v7_dma_clean_range
+    5,77%  swapper      [kernel.kallsyms]  [k] l2c210_inv_range
+    5,38%  swapper      [kernel.kallsyms]  [k] __irqentry_text_end
+    4,44%  swapper      [kernel.kallsyms]  [k] bcma_host_soc_read32
+    3,28%  ksoftirqd/1  [kernel.kallsyms]  [k] __netif_receive_skb_core
+    3,25%  ksoftirqd/1  [kernel.kallsyms]  [k] l2c210_clean_range
+    2,70%  swapper      [kernel.kallsyms]  [k] arch_cpu_idle
+    2,25%  swapper      [kernel.kallsyms]  [k] bgmac_poll
+    2,14%  ksoftirqd/1  [kernel.kallsyms]  [k] bgmac_start_xmit
+    1,79%  ksoftirqd/1  [kernel.kallsyms]  [k] __dev_queue_xmit
+    1,36%  ksoftirqd/1  [kernel.kallsyms]  [k] skb_vlan_untag
+    1,11%  swapper      [kernel.kallsyms]  [k] __skb_flow_dissect
+    1,07%  ksoftirqd/1  [kernel.kallsyms]  [k] netif_skb_features
+    0,98%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_rcv_core.constprop.3
+    0,92%  ksoftirqd/1  [kernel.kallsyms]  [k] sch_direct_xmit
+    0,90%  ksoftirqd/1  [kernel.kallsyms]  [k] __local_bh_enable_ip
+    0,86%  ksoftirqd/1  [kernel.kallsyms]  [k] nf_hook_slow
+    0,82%  swapper      [kernel.kallsyms]  [k] net_rx_action
+    0,80%  ksoftirqd/1  [kernel.kallsyms]  [k] validate_xmit_skb.constprop.30
+    0,75%  swapper      [kernel.kallsyms]  [k] build_skb
+    0,72%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_forward
+    0,71%  ksoftirqd/1  [kernel.kallsyms]  [k] br_handle_frame_finish
+    0,71%  ksoftirqd/1  [kernel.kallsyms]  [k] skb_pull_rcsum
+    0,65%  swapper      [kernel.kallsyms]  [k] arm_dma_unmap_page
+    0,59%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_finish_output2
+    0,59%  swapper      [kernel.kallsyms]  [k] __skb_get_hash
+    0,58%  swapper      [kernel.kallsyms]  [k] dma_cache_maint_page
+    0,55%  ksoftirqd/1  [kernel.kallsyms]  [k] fdb_find_rcu
+    0,54%  swapper      [kernel.kallsyms]  [k] bcma_host_soc_write32
+    0,53%  ksoftirqd/1  [kernel.kallsyms]  [k] vlan_do_receive
+    0,52%  ksoftirqd/1  [kernel.kallsyms]  [k] memmove
+    0,52%  swapper      [kernel.kallsyms]  [k] rcu_idle_exit
+    0,51%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_rcv
+    0,51%  ksoftirqd/1  [kernel.kallsyms]  [k] dev_hard_start_xmit
     0,49%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_output
     0,46%  ksoftirqd/1  [kernel.kallsyms]  [k] vlan_dev_hard_start_xmit
     0,45%  swapper      [kernel.kallsyms]  [k] enqueue_to_backlog
     0,42%  swapper      [kernel.kallsyms]  [k] netdev_alloc_frag
     0,42%  swapper      [kernel.kallsyms]  [k] skb_release_data
     0,41%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_forward_finish
     0,40%  ksoftirqd/1  [kernel.kallsyms]  [k] br_handle_frame
     0,37%  ksoftirqd/1  [kernel.kallsyms]  [k] mmiocpy
     0,37%  ksoftirqd/1  [kernel.kallsyms]  [k] page_address
     0,36%  ksoftirqd/0  [kernel.kallsyms]  [k] v7_dma_inv_range
     0,36%  ksoftirqd/1  [kernel.kallsyms]  [k] memcmp
     0,36%  ksoftirqd/1  [kernel.kallsyms]  [k] netif_receive_skb_internal
     0,34%  swapper      [kernel.kallsyms]  [k] page_address
     0,34%  swapper      [kernel.kallsyms]  [k] mmioset
     0,33%  ksoftirqd/1  [kernel.kallsyms]  [k] br_pass_frame_up
     0,33%  ksoftirqd/1  [kernel.kallsyms]  [k] neigh_connected_output
     0,33%  swapper      [kernel.kallsyms]  [k] kmem_cache_alloc
     0,31%  ksoftirqd/1  [kernel.kallsyms]  [k] mmioset
     0,30%  ksoftirqd/1  [kernel.kallsyms]  [k] ip_finish_output
     0,30%  ksoftirqd/1  [kernel.kallsyms]  [k] bcma_bgmac_write
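
For anyone reading the profiles without the kernel source at hand: csum_partial() accumulates the one's-complement sum that the IP/TCP/UDP checksums are built on, so it has to read every byte it covers and its cost grows with the amount of payload. In the "gro on" profile it shows up together with skb_segment and skb_checksum_help, which are absent from the "gro off" one. Below is a minimal, portable C sketch of that kind of sum, in the style of RFC 1071. It is only an illustration, not the kernel's code (the real routine is optimized per architecture), and the names csum_accumulate() and csum_finish() are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): add the bytes of `data` to a
 * running one's-complement sum, treating them as big-endian 16-bit words.
 * Every byte is read once, so the cost is linear in `len`.
 * When chaining calls, every chunk except the last must have even length.
 */
static uint32_t csum_accumulate(const uint8_t *data, size_t len, uint32_t sum)
{
    uint64_t acc = sum;

    assert(data != NULL || len == 0);

    while (len >= 2) {
        acc += (uint32_t)data[0] << 8 | data[1];
        data += 2;
        len -= 2;
    }
    if (len)                       /* odd trailing byte, padded with zero */
        acc += (uint32_t)data[0] << 8;

    /* Fold carries back in so the partial sum fits in 32 bits. */
    while (acc >> 32)
        acc = (acc & 0xffffffffu) + (acc >> 32);

    return (uint32_t)acc;
}

/* Hypothetical helper: fold a partial sum to 16 bits and complement it. */
static uint16_t csum_finish(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffffu) + (sum >> 16);

    return (uint16_t)~sum;
}
```

With the sample bytes from RFC 1071 (00 01 f2 03 f4 f5 f6 f7), csum_finish(csum_accumulate(buf, 8, 0)) gives 0x220d. A checksum done this way on the CPU touches the whole payload, which is cheap on a fast x86 box but may be noticeable on a small ARM core.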