From: Eric Dumazet <eric.dumazet@gmail.com>
To: "Íñigo Huguet" <ihuguet@redhat.com>,
	"Edward Cree" <ecree.xilinx@gmail.com>,
	habetsm.xilinx@gmail.com
Cc: netdev@vger.kernel.org, Dinan Gunawardena <dinang@xilinx.com>,
	Pablo Cascon <pabloc@xilinx.com>
Subject: Re: Bad performance in RX with sfc 40G
Date: Thu, 18 Nov 2021 09:19:39 -0800
Message-ID: <beef3b28-6818-df7b-eaad-8569cac5d79b@gmail.com>
In-Reply-To: <CACT4oudChHDKecLfDdA7R8jpQv2Nmz5xBS3hH_jFWeS37CnQGg@mail.gmail.com>



On 11/18/21 7:14 AM, Íñigo Huguet wrote:
> Hello,
> 
> Doing some tests a few weeks ago, I noticed very low RX performance
> with 40G Solarflare NICs. In tests with iperf3 I got more than 30Gbps
> in TX, but only around 15Gbps in RX. NICs from other vendors could
> send and receive over 30Gbps.
> 
> I ran the tests with multiple parallel streams in iperf3 (-P 8).
> 
> The models used are SFC9140 and SFC9220.
> 
> Perf showed that most of the time was being spent in
> `native_queued_spin_lock_slowpath`. Tracing the calls to it with
> bpftrace, I found that most of them came from __napi_poll > efx_poll >
> efx_fast_push_rx_descriptors > __alloc_pages > get_page_from_freelist > ...
> 
> Please can you help me investigate the issue? At first sight, it looks
> like a suboptimal memory allocation strategy, or maybe a failure in the
> page recycling strategy...
> 
> This is the bpftrace output for the two call chains that repeat the
> most, both from sfc:
> 
> @[
>     native_queued_spin_lock_slowpath+1
>     _raw_spin_lock+26
>     rmqueue_bulk+76
>     get_page_from_freelist+2295
>     __alloc_pages+214
>     efx_fast_push_rx_descriptors+640
>     efx_poll+660
>     __napi_poll+42
>     net_rx_action+547
>     __softirqentry_text_start+208
>     __irq_exit_rcu+179
>     common_interrupt+131
>     asm_common_interrupt+30
>     cpuidle_enter_state+199
>     cpuidle_enter+41
>     do_idle+462
>     cpu_startup_entry+25
>     start_kernel+2465
>     secondary_startup_64_no_verify+194
> ]: 2650
> @[
>     native_queued_spin_lock_slowpath+1
>     _raw_spin_lock+26
>     rmqueue_bulk+76
>     get_page_from_freelist+2295
>     __alloc_pages+214
>     efx_fast_push_rx_descriptors+640
>     efx_poll+660
>     __napi_poll+42
>     net_rx_action+547
>     __softirqentry_text_start+208
>     __irq_exit_rcu+179
>     common_interrupt+131
>     asm_common_interrupt+30
>     cpuidle_enter_state+199
>     cpuidle_enter+41
>     do_idle+462
>     cpu_startup_entry+25
>     secondary_startup_64_no_verify+194
> ]: 17119
> 
> --
> Íñigo Huguet
> 
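(For reference: a multi-stream run like the one described above can be
reproduced with iperf3 along these lines; the address is a placeholder
and the exact invocation used is an assumption.)

  # On the machine with the sfc NIC (the server receives by default):
  iperf3 -s

  # On the peer: -P 8 opens 8 parallel streams; the default direction
  # exercises the sfc machine's RX path. Add -R to reverse and test TX.
  iperf3 -c <sfc-host-ip> -P 8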
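(Likewise, a per-stack call-count aggregation like the one quoted above
can be collected with a bpftrace one-liner such as the following; the
exact invocation used is an assumption based on the output format.)

  bpftrace -e 'kprobe:native_queued_spin_lock_slowpath { @[kstack] = count(); }'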


You could try to:

Make the RX ring buffers bigger (ethtool -G eth0 rx 8192)
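
(The current and maximum supported ring sizes can be checked first with
ethtool's query form; the maximum is NIC/driver dependent.)

  ethtool -g eth0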

and/or

Make sure your TCP socket receive buffer is smaller than the number of frames in the RX ring buffer:

echo "4096 131072 2097152" >/proc/sys/net/ipv4/tcp_rmem

You can also try the latest net-next, as TCP recently gained a change that helps this case:

f35f821935d8df76f9c92e2431a225bdff938169 tcp: defer skb freeing after socket lock is released
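
(To check whether a given kernel tree already contains that commit,
something like this should work from a git checkout of the tree:)

  git merge-base --is-ancestor f35f821935d8 HEAD \
      && echo "defer-skb-freeing change present"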

Thread overview: 15+ messages
2021-11-18 15:14 Bad performance in RX with sfc 40G Íñigo Huguet
2021-11-18 17:19 ` Eric Dumazet [this message]
2021-12-02 14:26   ` Íñigo Huguet
2021-11-20  8:31 ` Martin Habets
2021-12-09 12:06   ` Íñigo Huguet
2021-12-23 13:18     ` Íñigo Huguet
2022-01-02  9:22       ` Martin Habets
2022-01-10  8:58         ` [PATCH net-next] sfc: The size of the RX recycle ring should be more flexible Martin Habets
2022-01-10  9:31           ` Íñigo Huguet
2022-01-12  9:05             ` Martin Habets
2022-01-31 11:08             ` Martin Habets
2022-01-10 17:22           ` Jakub Kicinski
2022-01-12  9:08             ` Martin Habets
2022-01-31 11:10           ` [PATCH V2 " Martin Habets
2022-02-02  5:10             ` patchwork-bot+netdevbpf
