From: "Paweł Staszewski" <pstaszewski@itcare.pl>
To: David Ahern <dsahern@gmail.com>,
netdev@vger.kernel.org,
Jesper Dangaard Brouer <brouer@redhat.com>
Subject: Re: Linux kernel - 5.4.0+ (net-next from 27.11.2019) routing/network performance
Date: Mon, 2 Dec 2019 11:09:29 +0100 [thread overview]
Message-ID: <8e17a844-e98b-59b1-5a0e-669562b3178c@itcare.pl> (raw)
In-Reply-To: <589d2715-80ae-0478-7e31-342060519320@gmail.com>
W dniu 01.12.2019 o 17:05, David Ahern pisze:
> On 11/29/19 4:00 PM, Paweł Staszewski wrote:
>> As always - each year i need to summarize network performance for
>> routing applications like linux router on native Linux kernel (without
>> xdp/dpdk/vpp etc) :)
>>
> Do you keep past profiles? How does this profile (and traffic rates)
> compare to older kernels - e.g., 5.0 or 4.19?
>
>
Yes - so for 4.19:
Max bandwidth was about 40-42Gbit/s RX / 40-42Gbit/s TX of
forwarded(routed) traffic
And after "order-0 pages" patches - max was 50Gbit/s RX + 50Gbit/s TX
(forwarding - bandwidth max)
(current kernel almost doubled this)
And also old perf top (from kernel 4.19) - before "order-0 pages patch":
PerfTop: 108490 irqs/sec kernel:99.6% exact: 0.0% [4000Hz
cycles], (all, 56 CPUs)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
26.78% [kernel] [k] queued_spin_lock_slowpath
9.09% [kernel] [k] mlx5e_skb_from_cqe_linear
4.94% [kernel] [k] mlx5e_sq_xmit
3.63% [kernel] [k] memcpy_erms
3.30% [kernel] [k] fib_table_lookup
3.26% [kernel] [k] build_skb
2.41% [kernel] [k] mlx5e_poll_tx_cq
2.11% [kernel] [k] get_page_from_freelist
1.51% [kernel] [k] vlan_do_receive
1.51% [kernel] [k] _raw_spin_lock
1.43% [kernel] [k] __dev_queue_xmit
1.41% [kernel] [k] dev_gro_receive
1.34% [kernel] [k] mlx5e_poll_rx_cq
1.26% [kernel] [k] tcp_gro_receive
1.21% [kernel] [k] free_one_page
1.13% [kernel] [k] swiotlb_map_page
1.13% [kernel] [k] mlx5e_post_rx_wqes
1.05% [kernel] [k] pfifo_fast_dequeue
1.05% [kernel] [k] mlx5e_handle_rx_cqe
1.03% [kernel] [k] ip_finish_output2
1.02% [kernel] [k] ipt_do_table
0.96% [kernel] [k] inet_gro_receive
0.91% [kernel] [k] mlx5_eq_int
0.88% [kernel] [k] __slab_free.isra.79
0.86% [kernel] [k] __build_skb
0.84% [kernel] [k] page_frag_free
0.76% [kernel] [k] skb_release_data
0.75% [kernel] [k] __netif_receive_skb_core
0.75% [kernel] [k] irq_entries_start
0.71% [kernel] [k] ip_route_input_rcu
0.65% [kernel] [k] vlan_dev_hard_start_xmit
0.56% [kernel] [k] ip_forward
0.56% [kernel] [k] __memcpy
0.52% [kernel] [k] kmem_cache_alloc
0.52% [kernel] [k] kmem_cache_free_bulk
0.49% [kernel] [k] mlx5e_page_release
0.47% [kernel] [k] netif_skb_features
0.47% [kernel] [k] mlx5e_build_rx_skb
0.47% [kernel] [k] dev_hard_start_xmit
0.43% [kernel] [k] __page_pool_put_page
0.43% [kernel] [k] __netif_schedule
0.43% [kernel] [k] mlx5e_xmit
0.41% [kernel] [k] __qdisc_run
0.41% [kernel] [k] validate_xmit_skb.isra.142
0.41% [kernel] [k] swiotlb_unmap_page
0.40% [kernel] [k] inet_lookup_ifaddr_rcu
0.34% [kernel] [k] ip_rcv_core.isra.20.constprop.25
0.34% [kernel] [k] tcp4_gro_receive
0.29% [kernel] [k] _raw_spin_lock_irqsave
0.29% [kernel] [k] napi_consume_skb
0.29% [kernel] [k] skb_gro_receive
0.29% [kernel] [k] ___slab_alloc.isra.80
0.27% [kernel] [k] eth_type_trans
0.26% [kernel] [k] __free_pages_ok
0.26% [kernel] [k] __get_xps_queue_idx
0.24% [kernel] [k] _raw_spin_trylock
0.23% [kernel] [k] __local_bh_enable_ip
0.22% [kernel] [k] pfifo_fast_enqueue
0.21% [kernel] [k] tasklet_action_common.isra.21
0.21% [kernel] [k] sch_direct_xmit
0.21% [kernel] [k] skb_network_protocol
0.21% [kernel] [k] kmem_cache_free
0.20% [kernel] [k] netdev_pick_tx
0.18% [kernel] [k] napi_gro_complete
0.18% [kernel] [k] __sched_text_start
0.18% [kernel] [k] mlx5e_xdp_handle
0.17% [kernel] [k] ip_finish_output
0.16% [kernel] [k] napi_gro_flush
0.16% [kernel] [k] vlan_passthru_hard_header
0.16% [kernel] [k] skb_segment
0.15% [kernel] [k] __alloc_pages_nodemask
0.15% [kernel] [k] mlx5e_features_check
0.15% [kernel] [k] mlx5e_napi_poll
0.15% [kernel] [k] napi_gro_receive
0.14% [kernel] [k] fib_validate_source
0.14% [kernel] [k] _raw_spin_lock_irq
0.14% [kernel] [k] inet_gro_complete
0.14% [kernel] [k] get_partial_node.isra.78
0.13% [kernel] [k] napi_complete_done
0.13% [kernel] [k] ip_rcv_finish_core.isra.17
0.13% [kernel] [k] cmd_exec
After "order-0 pages" patch
PerfTop: 104692 irqs/sec kernel:99.5% exact: 0.0% [4000Hz
cycles], (all, 56 CPUs)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
9.06% [kernel] [k] mlx5e_skb_from_cqe_mpwrq_linear
6.43% [kernel] [k] tasklet_action_common.isra.21
5.68% [kernel] [k] fib_table_lookup
4.89% [kernel] [k] irq_entries_start
4.53% [kernel] [k] mlx5_eq_int
4.10% [kernel] [k] build_skb
3.39% [kernel] [k] mlx5e_poll_tx_cq
3.38% [kernel] [k] mlx5e_sq_xmit
2.73% [kernel] [k] mlx5e_poll_rx_cq
2.18% [kernel] [k] __dev_queue_xmit
2.13% [kernel] [k] vlan_do_receive
2.12% [kernel] [k] mlx5e_handle_rx_cqe_mpwrq
2.00% [kernel] [k] ip_finish_output2
1.87% [kernel] [k] mlx5e_post_rx_mpwqes
1.86% [kernel] [k] memcpy_erms
1.85% [kernel] [k] ipt_do_table
1.70% [kernel] [k] dev_gro_receive
1.39% [kernel] [k] __netif_receive_skb_core
1.31% [kernel] [k] inet_gro_receive
1.21% [kernel] [k] ip_route_input_rcu
1.21% [kernel] [k] tcp_gro_receive
1.13% [kernel] [k] _raw_spin_lock
1.08% [kernel] [k] __build_skb
1.06% [kernel] [k] kmem_cache_free_bulk
1.05% [kernel] [k] __softirqentry_text_start
1.03% [kernel] [k] vlan_dev_hard_start_xmit
0.98% [kernel] [k] pfifo_fast_dequeue
0.95% [kernel] [k] mlx5e_xmit
0.95% [kernel] [k] page_frag_free
0.88% [kernel] [k] ip_forward
0.81% [kernel] [k] dev_hard_start_xmit
0.78% [kernel] [k] rcu_irq_exit
0.77% [kernel] [k] netif_skb_features
0.72% [kernel] [k] napi_complete_done
0.72% [kernel] [k] kmem_cache_alloc
0.68% [kernel] [k] validate_xmit_skb.isra.142
0.66% [kernel] [k] ip_rcv_core.isra.20.constprop.25
0.58% [kernel] [k] swiotlb_map_page
0.57% [kernel] [k] __qdisc_run
0.56% [kernel] [k] tasklet_action
0.54% [kernel] [k] __get_xps_queue_idx
0.54% [kernel] [k] inet_lookup_ifaddr_rcu
0.50% [kernel] [k] tcp4_gro_receive
0.49% [kernel] [k] skb_release_data
0.47% [kernel] [k] eth_type_trans
0.40% [kernel] [k] sch_direct_xmit
0.40% [kernel] [k] net_rx_action
0.39% [kernel] [k] __local_bh_enable_ip
>> HW setup:
>>
>> Server (Supermicro SYS-1019P-WTR)
>>
>> 1x Intel 6146
>>
>> 2x Mellanox connect-x 5 (100G) (installed in two different x16 pcie
>> gen3.1 slots)
>>
>> 6x 8GB DDR4 2666 (it really matters cause 100G is about 12.5GB/s of
>> memory bandwidth one direction)
>>
>>
>> And here it is:
>>
>> perf top at 72Gbit.s RX and 72Gbit/s TX (at same time)
>>
>> PerfTop: 91202 irqs/sec kernel:99.7% exact: 100.0% [4000Hz
>> cycles:ppp], (all, 24 CPUs)
>> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>>
>>
>> 7.56% [kernel] [k] __dev_queue_xmit
>> 5.27% [kernel] [k] build_skb
>> 4.41% [kernel] [k] rr_transmit
>> 4.17% [kernel] [k] fib_table_lookup
>> 3.83% [kernel] [k] mlx5e_skb_from_cqe_mpwrq_linear
>> 3.30% [kernel] [k] mlx5e_sq_xmit
>> 3.14% [kernel] [k] __netif_receive_skb_core
>> 2.48% [kernel] [k] netif_skb_features
>> 2.36% [kernel] [k] _raw_spin_trylock
>> 2.27% [kernel] [k] dev_hard_start_xmit
--
Paweł Staszewski
next prev parent reply other threads:[~2019-12-02 10:10 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-29 22:00 Linux kernel - 5.4.0+ (net-next from 27.11.2019) routing/network performance Paweł Staszewski
2019-11-29 22:13 ` Paweł Staszewski
2019-12-01 16:05 ` David Ahern
2019-12-02 10:09 ` Paweł Staszewski [this message]
2019-12-02 10:53 ` Paolo Abeni
2019-12-02 16:23 ` Paweł Staszewski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8e17a844-e98b-59b1-5a0e-669562b3178c@itcare.pl \
--to=pstaszewski@itcare.pl \
--cc=brouer@redhat.com \
--cc=dsahern@gmail.com \
--cc=netdev@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.