sparclinux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mina Almasry <almasrymina@google.com>
To: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	 linux-doc@vger.kernel.org, linux-alpha@vger.kernel.org,
	 linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
	 sparclinux@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	 linux-arch@vger.kernel.org, bpf@vger.kernel.org,
	 linux-kselftest@vger.kernel.org, linux-media@vger.kernel.org,
	 dri-devel@lists.freedesktop.org
Cc: "Mina Almasry" <almasrymina@google.com>,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Ivan Kokshaysky" <ink@jurassic.park.msu.ru>,
	"Matt Turner" <mattst88@gmail.com>,
	"Thomas Bogendoerfer" <tsbogend@alpha.franken.de>,
	"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
	"Helge Deller" <deller@gmx.de>,
	"Andreas Larsson" <andreas@gaisler.com>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Masami Hiramatsu" <mhiramat@kernel.org>,
	"Mathieu Desnoyers" <mathieu.desnoyers@efficios.com>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Alexei Starovoitov" <ast@kernel.org>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Andrii Nakryiko" <andrii@kernel.org>,
	"Martin KaFai Lau" <martin.lau@linux.dev>,
	"Eduard Zingerman" <eddyz87@gmail.com>,
	"Song Liu" <song@kernel.org>,
	"Yonghong Song" <yonghong.song@linux.dev>,
	"John Fastabend" <john.fastabend@gmail.com>,
	"KP Singh" <kpsingh@kernel.org>,
	"Stanislav Fomichev" <sdf@google.com>,
	"Hao Luo" <haoluo@google.com>, "Jiri Olsa" <jolsa@kernel.org>,
	"Steffen Klassert" <steffen.klassert@secunet.com>,
	"Herbert Xu" <herbert@gondor.apana.org.au>,
	"David Ahern" <dsahern@kernel.org>,
	"Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Amritha Nambiar" <amritha.nambiar@intel.com>,
	"Maciej Fijalkowski" <maciej.fijalkowski@intel.com>,
	"Alexander Mikhalitsyn" <alexander@mihalicyn.com>,
	"Kaiyuan Zhang" <kaiyuanz@google.com>,
	"Christian Brauner" <brauner@kernel.org>,
	"Simon Horman" <horms@kernel.org>,
	"David Howells" <dhowells@redhat.com>,
	"Florian Westphal" <fw@strlen.de>,
	"Yunsheng Lin" <linyunsheng@huawei.com>,
	"Kuniyuki Iwashima" <kuniyu@amazon.com>,
	"Jens Axboe" <axboe@kernel.dk>,
	"Arseniy Krasnov" <avkrasnov@salutedevices.com>,
	"Aleksander Lobakin" <aleksander.lobakin@intel.com>,
	"Michael Lass" <bevan@bi-co.net>, "Jiri Pirko" <jiri@resnulli.us>,
	"Sebastian Andrzej Siewior" <bigeasy@linutronix.de>,
	"Lorenzo Bianconi" <lorenzo@kernel.org>,
	"Richard Gobert" <richardbgobert@gmail.com>,
	"Sridhar Samudrala" <sridhar.samudrala@intel.com>,
	"Xuan Zhuo" <xuanzhuo@linux.alibaba.com>,
	"Johannes Berg" <johannes.berg@intel.com>,
	"Abel Wu" <wuyun.abel@bytedance.com>,
	"Breno Leitao" <leitao@debian.org>,
	"Pavel Begunkov" <asml.silence@gmail.com>,
	"David Wei" <dw@davidwei.uk>, "Jason Gunthorpe" <jgg@ziepe.ca>,
	"Shailend Chand" <shailend@google.com>,
	"Harshitha Ramamurthy" <hramamurthy@google.com>,
	"Shakeel Butt" <shakeel.butt@linux.dev>,
	"Jeroen de Borst" <jeroendb@google.com>,
	"Praveen Kaligineedi" <pkaligineedi@google.com>,
	"Willem de Bruijn" <willemb@google.com>
Subject: [RFC PATCH net-next v8 12/14] net: add SO_DEVMEM_DONTNEED setsockopt to release RX frags
Date: Tue,  2 Apr 2024 17:20:49 -0700	[thread overview]
Message-ID: <20240403002053.2376017-13-almasrymina@google.com> (raw)
In-Reply-To: <20240403002053.2376017-1-almasrymina@google.com>

Add an interface for the user to notify the kernel that it is done
reading the devmem dmabuf frags returned as cmsg. The kernel will
drop the reference on the frags to make them available for reuse.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Kaiyuan Zhang <kaiyuanz@google.com>
Signed-off-by: Mina Almasry <almasrymina@google.com>

---

v7:
- Updated SO_DEVMEM_* uapi to use the next available entry (Arnd).

v6:
- Squash in locking optimizations from edumazet@google.com. With his
  changes we lock the xarray once per sock_devmem_dontneed operation
  rather than once per frag.

Changes in v1:
- devmemtoken -> dmabuf_token (David).
- Use napi_pp_put_page() for refcounting (Yunsheng).
- Fix build error with missing socket options on other asms.

---
 arch/alpha/include/uapi/asm/socket.h  |  1 +
 arch/mips/include/uapi/asm/socket.h   |  1 +
 arch/parisc/include/uapi/asm/socket.h |  1 +
 arch/sparc/include/uapi/asm/socket.h  |  1 +
 include/uapi/asm-generic/socket.h     |  1 +
 include/uapi/linux/uio.h              |  4 ++
 net/core/sock.c                       | 61 +++++++++++++++++++++++++++
 7 files changed, 70 insertions(+)

diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
index ef4656a41058..251b73c5481e 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -144,6 +144,7 @@
 #define SCM_DEVMEM_LINEAR	SO_DEVMEM_LINEAR
 #define SO_DEVMEM_DMABUF	79
 #define SCM_DEVMEM_DMABUF	SO_DEVMEM_DMABUF
+#define SO_DEVMEM_DONTNEED	80
 
 #if !defined(__KERNEL__)
 
diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
index 414807d55e33..8ab7582291ab 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -155,6 +155,7 @@
 #define SCM_DEVMEM_LINEAR	SO_DEVMEM_LINEAR
 #define SO_DEVMEM_DMABUF	79
 #define SCM_DEVMEM_DMABUF	SO_DEVMEM_DMABUF
+#define SO_DEVMEM_DONTNEED	80
 
 #if !defined(__KERNEL__)
 
diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
index 2b817efd4544..38fc0b188e08 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -136,6 +136,7 @@
 #define SCM_DEVMEM_LINEAR	SO_DEVMEM_LINEAR
 #define SO_DEVMEM_DMABUF	79
 #define SCM_DEVMEM_DMABUF	SO_DEVMEM_DMABUF
+#define SO_DEVMEM_DONTNEED	80
 
 #if !defined(__KERNEL__)
 
diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
index 00248fc68977..57084ed2f3c4 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -137,6 +137,7 @@
 #define SCM_DEVMEM_LINEAR        SO_DEVMEM_LINEAR
 #define SO_DEVMEM_DMABUF         0x0058
 #define SCM_DEVMEM_DMABUF        SO_DEVMEM_DMABUF
+#define SO_DEVMEM_DONTNEED       0x0059
 
 #if !defined(__KERNEL__)
 
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index 25a2f5255f52..1acb77780f10 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -135,6 +135,7 @@
 #define SO_PASSPIDFD		76
 #define SO_PEERPIDFD		77
 
+#define SO_DEVMEM_DONTNEED	97
 #define SO_DEVMEM_LINEAR	98
 #define SCM_DEVMEM_LINEAR	SO_DEVMEM_LINEAR
 #define SO_DEVMEM_DMABUF	99
diff --git a/include/uapi/linux/uio.h b/include/uapi/linux/uio.h
index 3a22ddae376a..d17f8fcd93ec 100644
--- a/include/uapi/linux/uio.h
+++ b/include/uapi/linux/uio.h
@@ -33,6 +33,10 @@ struct dmabuf_cmsg {
 				 */
 };
 
+struct dmabuf_token {
+	__u32 token_start;
+	__u32 token_count;
+};
 /*
  *	UIO_MAXIOV shall be at least 16 1003.1g (5.4.1.1)
  */
diff --git a/net/core/sock.c b/net/core/sock.c
index 5ed411231fc7..04e14f9c4e24 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1049,6 +1049,63 @@ static int sock_reserve_memory(struct sock *sk, int bytes)
 	return 0;
 }
 
+#ifdef CONFIG_PAGE_POOL
+static noinline_for_stack int
+sock_devmem_dontneed(struct sock *sk, sockptr_t optval, unsigned int optlen)
+{
+	unsigned int num_tokens, i, j, k, netmem_num = 0;
+	struct dmabuf_token *tokens;
+	netmem_ref netmems[16];
+	int ret;
+
+	if (sk->sk_type != SOCK_STREAM || sk->sk_protocol != IPPROTO_TCP)
+		return -EBADF;
+
+	if (optlen % sizeof(struct dmabuf_token) ||
+	    optlen > sizeof(*tokens) * 128)
+		return -EINVAL;
+
+	tokens = kvmalloc_array(128, sizeof(*tokens), GFP_KERNEL);
+	if (!tokens)
+		return -ENOMEM;
+
+	num_tokens = optlen / sizeof(struct dmabuf_token);
+	if (copy_from_sockptr(tokens, optval, optlen))
+		return -EFAULT;
+
+	ret = 0;
+
+	xa_lock_bh(&sk->sk_user_frags);
+	for (i = 0; i < num_tokens; i++) {
+		for (j = 0; j < tokens[i].token_count; j++) {
+			netmem_ref netmem = (__force netmem_ref)__xa_erase(
+				&sk->sk_user_frags, tokens[i].token_start + j);
+
+			if (netmem &&
+			    !WARN_ON_ONCE(!netmem_is_net_iov(netmem))) {
+				netmems[netmem_num++] = netmem;
+				if (netmem_num == ARRAY_SIZE(netmems)) {
+					xa_unlock_bh(&sk->sk_user_frags);
+					for (k = 0; k < netmem_num; k++)
+						WARN_ON_ONCE(!napi_pp_put_page(netmems[k],
+									       false));
+					netmem_num = 0;
+					xa_lock_bh(&sk->sk_user_frags);
+				}
+				ret++;
+			}
+		}
+	}
+
+	xa_unlock_bh(&sk->sk_user_frags);
+	for (k = 0; k < netmem_num; k++)
+		WARN_ON_ONCE(!napi_pp_put_page(netmems[k], false));
+
+	kvfree(tokens);
+	return ret;
+}
+#endif
+
 void sockopt_lock_sock(struct sock *sk)
 {
 	/* When current->bpf_ctx is set, the setsockopt is called from
@@ -1200,6 +1257,10 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
 			ret = -EOPNOTSUPP;
 		return ret;
 		}
+#ifdef CONFIG_PAGE_POOL
+	case SO_DEVMEM_DONTNEED:
+		return sock_devmem_dontneed(sk, optval, optlen);
+#endif
 	}
 
 	sockopt_lock_sock(sk);
-- 
2.44.0.478.gd926399ef9-goog


  parent reply	other threads:[~2024-04-03  0:21 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-03  0:20 [RFC PATCH net-next v8 00/14] Device Memory TCP Mina Almasry
2024-04-03  0:20 ` [RFC PATCH net-next v8 01/14] queue_api: define queue api Mina Almasry
2024-04-03  0:20 ` [RFC PATCH net-next v8 02/14] net: page_pool: create hooks for custom page providers Mina Almasry
2024-05-01  7:54   ` Christoph Hellwig
2024-05-03 20:10     ` Mina Almasry
2024-05-06 12:04       ` Christoph Hellwig
2024-05-07 16:05         ` Pavel Begunkov
2024-05-07 16:18           ` Jason Gunthorpe
2024-05-07 16:23             ` Christoph Hellwig
2024-05-07 16:42               ` Mina Almasry
2024-05-07 16:48                 ` Jason Gunthorpe
2024-05-07 17:19                   ` Daniel Vetter
2024-05-07 17:25                   ` Pavel Begunkov
2024-05-07 17:56                     ` Jason Gunthorpe
2024-05-07 19:35                       ` Pavel Begunkov
2024-05-07 23:32                         ` Jason Gunthorpe
2024-05-08  7:16                           ` Daniel Vetter
2024-05-08 11:35                             ` Pavel Begunkov
2024-05-08 15:34                               ` Daniel Vetter
2024-05-08 15:51                               ` Christoph Hellwig
2024-05-08 17:02                                 ` Pavel Begunkov
2024-05-09  4:49                                   ` Christoph Hellwig
2024-05-08 11:30                           ` Pavel Begunkov
2024-05-08 14:25                             ` Jason Gunthorpe
2024-05-08 15:44                               ` Pavel Begunkov
2024-05-08 15:58                                 ` Jason Gunthorpe
2024-05-08 16:13                                   ` Pavel Begunkov
2024-05-07 17:17                 ` Pavel Begunkov
2024-05-07 16:55               ` Pavel Begunkov
2024-05-07 17:15                 ` Mina Almasry
2024-05-07 17:34                   ` Pavel Begunkov
2024-04-03  0:20 ` [RFC PATCH net-next v8 03/14] net: netdev netlink api to bind dma-buf to a net device Mina Almasry
2024-04-03  0:20 ` [RFC PATCH net-next v8 04/14] netdev: support binding dma-buf to netdevice Mina Almasry
2024-04-24 17:35   ` David Wei
2024-04-24 22:11     ` Mina Almasry
2024-04-03  0:20 ` [RFC PATCH net-next v8 05/14] netdev: netdevice devmem allocator Mina Almasry
2024-04-03  0:20 ` [RFC PATCH net-next v8 06/14] page_pool: convert to use netmem Mina Almasry
2024-04-03 17:27   ` Simon Horman
2024-04-03  0:20 ` [RFC PATCH net-next v8 07/14] page_pool: devmem support Mina Almasry
2024-04-27  0:17   ` David Wei
2024-04-27  2:11     ` Mina Almasry
2024-04-30 13:31       ` Pavel Begunkov
2024-04-30 13:45       ` Jens Axboe
2024-04-30 18:29         ` Mina Almasry
2024-04-30 18:55           ` Jens Axboe
2024-04-30 19:19             ` Mina Almasry
2024-05-01 13:58             ` Jesper Dangaard Brouer
2024-05-01  7:55     ` Christoph Hellwig
2024-05-06  0:29       ` David Wei
2024-04-03  0:20 ` [RFC PATCH net-next v8 08/14] memory-provider: dmabuf devmem memory provider Mina Almasry
2024-04-03  0:20 ` [RFC PATCH net-next v8 09/14] net: support non paged skb frags Mina Almasry
2024-04-03  0:20 ` [RFC PATCH net-next v8 10/14] net: add support for skbs with unreadable frags Mina Almasry
2024-04-03  0:20 ` [RFC PATCH net-next v8 11/14] tcp: RX path for devmem TCP Mina Almasry
2024-04-03  0:20 ` Mina Almasry [this message]
2024-04-03  0:20 ` [RFC PATCH net-next v8 13/14] net: add devmem TCP documentation Mina Almasry
2024-05-03 13:14   ` Bagas Sanjaya
2024-04-03  0:20 ` [RFC PATCH net-next v8 14/14] selftests: add ncdevmem, netcat for devmem TCP Mina Almasry
2024-04-08 15:57   ` Cong Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240403002053.2376017-13-almasrymina@google.com \
    --to=almasrymina@google.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=aleksander.lobakin@intel.com \
    --cc=alexander@mihalicyn.com \
    --cc=amritha.nambiar@intel.com \
    --cc=andreas@gaisler.com \
    --cc=andrii@kernel.org \
    --cc=arnd@arndb.de \
    --cc=asml.silence@gmail.com \
    --cc=ast@kernel.org \
    --cc=avkrasnov@salutedevices.com \
    --cc=axboe@kernel.dk \
    --cc=bevan@bi-co.net \
    --cc=bigeasy@linutronix.de \
    --cc=bpf@vger.kernel.org \
    --cc=brauner@kernel.org \
    --cc=christian.koenig@amd.com \
    --cc=corbet@lwn.net \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=deller@gmx.de \
    --cc=dhowells@redhat.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=dsahern@kernel.org \
    --cc=dw@davidwei.uk \
    --cc=eddyz87@gmail.com \
    --cc=edumazet@google.com \
    --cc=fw@strlen.de \
    --cc=haoluo@google.com \
    --cc=hawk@kernel.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=horms@kernel.org \
    --cc=hramamurthy@google.com \
    --cc=ilias.apalodimas@linaro.org \
    --cc=ink@jurassic.park.msu.ru \
    --cc=jeroendb@google.com \
    --cc=jgg@ziepe.ca \
    --cc=jiri@resnulli.us \
    --cc=johannes.berg@intel.com \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kaiyuanz@google.com \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=kuniyu@amazon.com \
    --cc=leitao@debian.org \
    --cc=linux-alpha@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-mips@vger.kernel.org \
    --cc=linux-parisc@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=linyunsheng@huawei.com \
    --cc=lorenzo@kernel.org \
    --cc=maciej.fijalkowski@intel.com \
    --cc=martin.lau@linux.dev \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mattst88@gmail.com \
    --cc=mhiramat@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pkaligineedi@google.com \
    --cc=richard.henderson@linaro.org \
    --cc=richardbgobert@gmail.com \
    --cc=rostedt@goodmis.org \
    --cc=sdf@google.com \
    --cc=shailend@google.com \
    --cc=shakeel.butt@linux.dev \
    --cc=shuah@kernel.org \
    --cc=song@kernel.org \
    --cc=sparclinux@vger.kernel.org \
    --cc=sridhar.samudrala@intel.com \
    --cc=steffen.klassert@secunet.com \
    --cc=sumit.semwal@linaro.org \
    --cc=tsbogend@alpha.franken.de \
    --cc=willemb@google.com \
    --cc=willemdebruijn.kernel@gmail.com \
    --cc=wuyun.abel@bytedance.com \
    --cc=xuanzhuo@linux.alibaba.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).