linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tariq Toukan <ttoukan.linux@gmail.com>
To: David Howells <dhowells@redhat.com>, netdev@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
	David Ahern <dsahern@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Christoph Hellwig <hch@infradead.org>,
	Jens Axboe <axboe@kernel.dk>, Jeff Layton <jlayton@kernel.org>,
	Christian Brauner <brauner@kernel.org>,
	Chuck Lever III <chuck.lever@oracle.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Boris Pismenny <borisp@nvidia.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Gal Pressman <gal@nvidia.com>,
	ranro@nvidia.com, samiram@nvidia.com, drort@nvidia.com,
	Tariq Toukan <tariqt@nvidia.com>
Subject: Re: [PATCH net-next v10 08/16] tls: Inline do_tcp_sendpages()
Date: Wed, 7 Jun 2023 17:17:30 +0300	[thread overview]
Message-ID: <4c49176f-147a-4283-f1b1-32aac7b4b996@gmail.com> (raw)
In-Reply-To: <20230522121125.2595254-9-dhowells@redhat.com>



On 22/05/2023 15:11, David Howells wrote:
> do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(),
> so inline it, allowing do_tcp_sendpages() to be removed.  This is part of
> replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Boris Pismenny <borisp@nvidia.com>
> cc: John Fastabend <john.fastabend@gmail.com>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Eric Dumazet <edumazet@google.com>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: netdev@vger.kernel.org
> ---

Hi,

My team spotted a new degradation in TLS TX device offload, bisected to 
this patch.

 From a quick look at the patch, it's not clear to me what's going wrong.
Please let us know of any helpful information that we can provide to 
help in the debug.

Regards,
Tariq

Reproduce Flow:
client / server test using nginx and  wrk (nothing special/custom about 
the apps used).

client:
/opt/mellanox/iproute2/sbin/ip link set dev eth3 up
/opt/mellanox/iproute2/sbin/ip addr add 11.141.46.9/16 dev eth3

server:
/opt/mellanox/iproute2/sbin/ip link set dev eth3 up
/opt/mellanox/iproute2/sbin/ip addr add 11.141.46.10/16 dev eth3

client:
/auto/sw/regression/sw_net_ver_tools/ktls/tools/x86_64/nginx_openssl_3_0_0 
-p /usr/bin/drivertest_rpms/ktls/nginx/
/opt/mellanox/iproute2/sbin/ss -i src [11.141.46.9]

server:
/auto/sw/regression/sw_net_ver_tools/ktls/tools/x86_64/wrk_openssl_3_0_0 
-b11.141.46.10 -t4 -c874 -d14 --timeout 5s 
https://11.141.46.9:20443/256000b.img

client:
dmesg
/auto/sw/regression/sw_net_ver_tools/ktls/tools/x86_64/nginx_openssl_3_0_0 
-p /usr/bin/drivertest_rpms/ktls/nginx/ -s stop


[root@c-141-46-1-009 ~]# dmesg
------------[ cut here ]------------
WARNING: CPU: 1 PID: 977 at net/core/skbuff.c:6957 
skb_splice_from_iter+0x102/0x300
Modules linked in: rpcrdma rdma_ucm ib_iser libiscsi 
scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib 
ib_uverbs ib_core xt_conntrack xt_MASQUERADE nf_conntrack_netlink 
nfnetlink xt_addrtype iptable_nat nf_nat br_netfilter rpcsec_gss_krb5 
auth_rpcgss oid_registry overlay mlx5_core zram zsmalloc fuse
CPU: 1 PID: 977 Comm: nginx_openssl_3 Not tainted 
6.4.0-rc3_for_upstream_min_debug_2023_06_01_23_04 #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
RIP: 0010:skb_splice_from_iter+0x102/0x300
Code: ef 48 8b 55 08 f6 c2 01 0f 85 54 01 00 00 8b 0d 98 cf 5f 01 48 89 
ea 85 c9 0f 8f 4c 01 00 00 48 8b 12 80 e6 02 74 48 49 89 dd <0f> 0b 48 
c7 c1 fb ff ff ff 45 01 65 70 45 01 65 74 45 01 a5 d0 00
RSP: 0018:ffff8881045abaa0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88814370fe00 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffea00051123c0 RDI: ffff88814370fe00
RBP: ffffea0005112400 R08: 0000000000000011 R09: 0000000000003ffd
R10: 0000000000003ffd R11: 0000000000000008 R12: 0000000000002e6e
R13: ffff88814370fe00 R14: ffff8881045abae8 R15: 000000000000118f
FS:  00007f6e23043740(0000) GS:ffff88852c880000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000009c6c00 CR3: 000000013b791001 CR4: 0000000000370ea0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:

  ? kmalloc_reserve+0x86/0xe0
  tcp_sendmsg_locked+0x33e/0xd40
  tls_push_sg+0xdd/0x230
  tls_push_data+0x673/0x920
  tls_device_sendmsg+0x6e/0xc0
  sock_sendmsg+0x38/0x60
  sock_write_iter+0x97/0x100
  vfs_write+0x2df/0x380
  ksys_write+0xa7/0xe0
  do_syscall_64+0x3d/0x90
  entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f6e22f018b7
Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e 
fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 
f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
RSP: 002b:00007ffdb528a2f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000004000 RCX: 00007f6e22f018b7
RDX: 0000000000004000 RSI: 00000000025cdef0 RDI: 0000000000000028
RBP: 00000000020103c0 R08: 00007ffdb5289a90 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000025cdef0
R13: 000000000204fca0 R14: 0000000000004000 R15: 0000000000004000

---[ end trace 0000000000000000 ]---



>   include/net/tls.h  |  2 +-
>   net/tls/tls_main.c | 24 +++++++++++++++---------
>   2 files changed, 16 insertions(+), 10 deletions(-)
> 
> diff --git a/include/net/tls.h b/include/net/tls.h
> index 6056ce5a2aa5..5791ca7a189c 100644
> --- a/include/net/tls.h
> +++ b/include/net/tls.h
> @@ -258,7 +258,7 @@ struct tls_context {
>   	struct scatterlist *partially_sent_record;
>   	u16 partially_sent_offset;
>   
> -	bool in_tcp_sendpages;
> +	bool splicing_pages;
>   	bool pending_open_record_frags;
>   
>   	struct mutex tx_lock; /* protects partially_sent_* fields and
> diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
> index f2e7302a4d96..3d45fdb5c4e9 100644
> --- a/net/tls/tls_main.c
> +++ b/net/tls/tls_main.c
> @@ -125,7 +125,10 @@ int tls_push_sg(struct sock *sk,
>   		u16 first_offset,
>   		int flags)
>   {
> -	int sendpage_flags = flags | MSG_SENDPAGE_NOTLAST;
> +	struct bio_vec bvec;
> +	struct msghdr msg = {
> +		.msg_flags = MSG_SENDPAGE_NOTLAST | MSG_SPLICE_PAGES | flags,
> +	};
>   	int ret = 0;
>   	struct page *p;
>   	size_t size;
> @@ -134,16 +137,19 @@ int tls_push_sg(struct sock *sk,
>   	size = sg->length - offset;
>   	offset += sg->offset;
>   
> -	ctx->in_tcp_sendpages = true;
> +	ctx->splicing_pages = true;
>   	while (1) {
>   		if (sg_is_last(sg))
> -			sendpage_flags = flags;
> +			msg.msg_flags = flags;
>   
>   		/* is sending application-limited? */
>   		tcp_rate_check_app_limited(sk);
>   		p = sg_page(sg);
>   retry:
> -		ret = do_tcp_sendpages(sk, p, offset, size, sendpage_flags);
> +		bvec_set_page(&bvec, p, size, offset);
> +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> +
> +		ret = tcp_sendmsg_locked(sk, &msg, size);
>   
>   		if (ret != size) {
>   			if (ret > 0) {
> @@ -155,7 +161,7 @@ int tls_push_sg(struct sock *sk,
>   			offset -= sg->offset;
>   			ctx->partially_sent_offset = offset;
>   			ctx->partially_sent_record = (void *)sg;
> -			ctx->in_tcp_sendpages = false;
> +			ctx->splicing_pages = false;
>   			return ret;
>   		}
>   
> @@ -169,7 +175,7 @@ int tls_push_sg(struct sock *sk,
>   		size = sg->length;
>   	}
>   
> -	ctx->in_tcp_sendpages = false;
> +	ctx->splicing_pages = false;
>   
>   	return 0;
>   }
> @@ -247,11 +253,11 @@ static void tls_write_space(struct sock *sk)
>   {
>   	struct tls_context *ctx = tls_get_ctx(sk);
>   
> -	/* If in_tcp_sendpages call lower protocol write space handler
> +	/* If splicing_pages call lower protocol write space handler
>   	 * to ensure we wake up any waiting operations there. For example
> -	 * if do_tcp_sendpages where to call sk_wait_event.
> +	 * if splicing pages where to call sk_wait_event.
>   	 */
> -	if (ctx->in_tcp_sendpages) {
> +	if (ctx->splicing_pages) {
>   		ctx->sk_write_space(sk);
>   		return;
>   	}
> 
> 

  reply	other threads:[~2023-06-07 14:17 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-22 12:11 [PATCH net-next v10 00/16] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES), part 1 David Howells
2023-05-22 12:11 ` [PATCH net-next v10 01/16] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
2023-05-22 12:11 ` [PATCH net-next v10 02/16] net: Pass max frags into skb_append_pagefrags() David Howells
2023-05-22 12:11 ` [PATCH net-next v10 03/16] net: Add a function to splice pages into an skbuff for MSG_SPLICE_PAGES David Howells
2023-05-24 12:24   ` Yunsheng Lin
2023-05-24 13:21   ` David Howells
2023-05-22 12:11 ` [PATCH net-next v10 04/16] tcp: Support MSG_SPLICE_PAGES David Howells
2023-05-22 12:11 ` [PATCH net-next v10 05/16] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES David Howells
2023-05-22 12:11 ` [PATCH net-next v10 06/16] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg David Howells
2023-05-22 12:11 ` [PATCH net-next v10 07/16] espintcp: Inline do_tcp_sendpages() David Howells
2023-05-22 12:11 ` [PATCH net-next v10 08/16] tls: " David Howells
2023-06-07 14:17   ` Tariq Toukan [this message]
2023-06-07 15:03   ` David Howells
2023-06-13 11:15     ` Tariq Toukan
2023-06-19  8:23       ` Tariq Toukan
2023-06-19  9:35       ` David Howells
2023-06-27 16:49         ` Tariq Toukan
2023-06-30 17:21           ` Jakub Kicinski
2023-07-04 20:06             ` Tariq Toukan
2023-07-05 16:19               ` Jakub Kicinski
2023-07-23  6:35                 ` Tariq Toukan
2023-07-26  0:30                   ` Jakub Kicinski
2023-07-26 19:20                     ` Tariq Toukan
2023-07-26 20:08                       ` Jakub Kicinski
2023-08-03 11:52                         ` Tariq Toukan
2023-08-03 11:47                     ` Tariq Toukan
2023-08-04  3:12                       ` Jakub Kicinski
2023-08-08  7:29                         ` Tariq Toukan
2023-07-26 10:51                 ` David Howells
2023-07-26 11:43                   ` Tariq Toukan
2023-07-26 14:57                     ` Jakub Kicinski
2023-08-10 13:07             ` David Howells
2023-06-27 16:55         ` David Howells
2023-06-27 17:06         ` David Howells
2023-05-22 12:11 ` [PATCH net-next v10 09/16] siw: " David Howells
2023-05-22 12:11 ` [PATCH net-next v10 10/16] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked() David Howells
2023-05-22 12:11 ` [PATCH net-next v10 11/16] ip, udp: Support MSG_SPLICE_PAGES David Howells
2023-05-22 12:11 ` [PATCH net-next v10 12/16] ip6, udp6: " David Howells
2023-05-22 12:11 ` [PATCH net-next v10 13/16] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES David Howells
2023-05-22 12:11 ` [PATCH net-next v10 14/16] ip: Remove ip_append_page() David Howells
2023-05-22 12:11 ` [PATCH net-next v10 15/16] af_unix: Support MSG_SPLICE_PAGES David Howells
2023-05-22 12:11 ` [PATCH net-next v10 16/16] unix: Convert unix_stream_sendpage() to use MSG_SPLICE_PAGES David Howells
2023-05-24  4:20 ` [PATCH net-next v10 00/16] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES), part 1 patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4c49176f-147a-4283-f1b1-32aac7b4b996@gmail.com \
    --to=ttoukan.linux@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=borisp@nvidia.com \
    --cc=brauner@kernel.org \
    --cc=chuck.lever@oracle.com \
    --cc=davem@davemloft.net \
    --cc=dhowells@redhat.com \
    --cc=drort@nvidia.com \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=gal@nvidia.com \
    --cc=hch@infradead.org \
    --cc=jlayton@kernel.org \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=ranro@nvidia.com \
    --cc=samiram@nvidia.com \
    --cc=tariqt@nvidia.com \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).