All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Howells <dhowells@redhat.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	netdev@vger.kernel.org, dhowells@redhat.com,
	Christoph Hellwig <hch@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	Jens Axboe <axboe@kernel.dk>, Jan Kara <jack@suse.cz>,
	Jeff Layton <jlayton@kernel.org>,
	Logan Gunthorpe <logang@deltatee.com>,
	linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH v6 34/34] net: [RFC][WIP] Make __zerocopy_sg_from_iter() correctly pin or leave pages unref'd
Date: Mon, 16 Jan 2023 23:12:10 +0000	[thread overview]
Message-ID: <167391073019.2311931.11127613443740355536.stgit@warthog.procyon.org.uk> (raw)
In-Reply-To: <167391047703.2311931.8115712773222260073.stgit@warthog.procyon.org.uk>

Make __zerocopy_sg_from_iter() call iov_iter_extract_pages() to get pages
that have been ref'd, pinned or left alone as appropriate.  As this is only
used for source buffers, pinning isn't an option, but being unref'd is.

The way __zerocopy_sg_from_iter() merges fragments is also altered, such
that fragments must also match their cleanup modes to be merged.

An extra helper and wrapper, folio_put_unpin_sub() and page_put_unpin_sub()
are added to allow multiple refs to be put/unpinned.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: netdev@vger.kernel.org
---

 include/linux/mm.h  |    2 ++
 mm/gup.c            |   25 +++++++++++++++++++++++++
 net/core/datagram.c |   23 +++++++++++++----------
 3 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f14edb192394..e3923b89c75e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1368,7 +1368,9 @@ static inline bool is_cow_mapping(vm_flags_t flags)
 #endif
 
 void folio_put_unpin(struct folio *folio, unsigned int flags);
+void folio_put_unpin_sub(struct folio *folio, unsigned int flags, unsigned int refs);
 void page_put_unpin(struct page *page, unsigned int flags);
+void page_put_unpin_sub(struct page *page, unsigned int flags, unsigned int refs);
 
 /*
  * The identification function is mainly used by the buddy allocator for
diff --git a/mm/gup.c b/mm/gup.c
index 3ee4b4c7e0cb..49dd27ba6c13 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -213,6 +213,31 @@ void page_put_unpin(struct page *page, unsigned int flags)
 }
 EXPORT_SYMBOL_GPL(page_put_unpin);
 
+/**
+ * folio_put_unpin_sub - Unpin/put a folio as appropriate
+ * @folio: The folio to release
+ * @flags: gup flags indicating the mode of release (FOLL_*)
+ * @refs: Number of refs/pins to drop
+ *
+ * Release a folio according to the flags.  If FOLL_GET is set, the folio has a
+ * ref dropped; if FOLL_PIN is set, it is unpinned; otherwise it is left
+ * unaltered.
+ */
+void folio_put_unpin_sub(struct folio *folio, unsigned int flags,
+			 unsigned int refs)
+{
+	if (flags & (FOLL_GET | FOLL_PIN))
+		gup_put_folio(folio, refs, flags);
+}
+EXPORT_SYMBOL_GPL(folio_put_unpin_sub);
+
+void page_put_unpin_sub(struct page *page, unsigned int flags,
+			unsigned int refs)
+{
+	folio_put_unpin_sub(page_folio(page), flags, refs);
+}
+EXPORT_SYMBOL_GPL(page_put_unpin_sub);
+
 /**
  * try_grab_page() - elevate a page's refcount by a flag-dependent amount
  * @page:    pointer to page to be grabbed
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 122bfb144d32..63ea1f8817e0 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -614,6 +614,7 @@ int __zerocopy_sg_from_iter(struct msghdr *msg, struct sock *sk,
 			    struct sk_buff *skb, struct iov_iter *from,
 			    size_t length)
 {
+	unsigned int cleanup_mode = iov_iter_extract_mode(from, FOLL_SOURCE_BUF);
 	int frag;
 
 	if (msg && msg->msg_ubuf && msg->sg_from_iter)
@@ -622,7 +623,7 @@ int __zerocopy_sg_from_iter(struct msghdr *msg, struct sock *sk,
 	frag = skb_shinfo(skb)->nr_frags;
 
 	while (length && iov_iter_count(from)) {
-		struct page *pages[MAX_SKB_FRAGS];
+		struct page *pages[MAX_SKB_FRAGS], **ppages = pages;
 		struct page *last_head = NULL;
 		size_t start;
 		ssize_t copied;
@@ -632,9 +633,9 @@ int __zerocopy_sg_from_iter(struct msghdr *msg, struct sock *sk,
 		if (frag == MAX_SKB_FRAGS)
 			return -EMSGSIZE;
 
-		copied = iov_iter_get_pages(from, pages, length,
-					    MAX_SKB_FRAGS - frag, &start,
-					    FOLL_SOURCE_BUF);
+		copied = iov_iter_extract_pages(from, &ppages, length,
+						MAX_SKB_FRAGS - frag,
+						FOLL_SOURCE_BUF, &start);
 		if (copied < 0)
 			return -EFAULT;
 
@@ -662,12 +663,14 @@ int __zerocopy_sg_from_iter(struct msghdr *msg, struct sock *sk,
 				skb_frag_t *last = &skb_shinfo(skb)->frags[frag - 1];
 
 				if (head == skb_frag_page(last) &&
+				    cleanup_mode == skb_frag_cleanup(last) &&
 				    start == skb_frag_off(last) + skb_frag_size(last)) {
 					skb_frag_size_add(last, size);
 					/* We combined this page, we need to release
-					 * a reference. Since compound pages refcount
-					 * is shared among many pages, batch the refcount
-					 * adjustments to limit false sharing.
+					 * a reference or a pin.  Since compound pages
+					 * refcount is shared among many pages, batch
+					 * the refcount adjustments to limit false
+					 * sharing.
 					 */
 					last_head = head;
 					refs++;
@@ -675,14 +678,14 @@ int __zerocopy_sg_from_iter(struct msghdr *msg, struct sock *sk,
 				}
 			}
 			if (refs) {
-				page_ref_sub(last_head, refs);
+				page_put_unpin_sub(last_head, cleanup_mode, refs);
 				refs = 0;
 			}
 			skb_fill_page_desc_noacc(skb, frag++, head, start, size,
-						 FOLL_GET);
+						 cleanup_mode);
 		}
 		if (refs)
-			page_ref_sub(last_head, refs);
+			page_put_unpin_sub(last_head, cleanup_mode, refs);
 	}
 	return 0;
 }



  parent reply	other threads:[~2023-01-16 23:22 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-16 23:07 [PATCH v6 00/34] iov_iter: Improve page extraction (ref, pin or just list) David Howells
2023-01-16 23:08 ` [PATCH v6 01/34] vfs: Unconditionally set IOCB_WRITE in call_write_iter() David Howells
2023-01-17  7:52   ` Christoph Hellwig
2023-01-18 22:11     ` Al Viro
2023-01-19  5:44       ` Christoph Hellwig
2023-01-19 11:34       ` David Howells
2023-01-19 16:48         ` Christoph Hellwig
2023-01-19 21:14         ` David Howells
2023-01-17  8:28   ` David Howells
2023-01-17  8:44     ` Christoph Hellwig
2023-01-17 11:11   ` David Howells
2023-01-17 11:11     ` Christoph Hellwig
2023-01-18 22:05   ` Al Viro
2023-01-19  5:41     ` Christoph Hellwig
2023-01-19 10:01   ` David Howells
2023-01-19 16:46     ` Christoph Hellwig
2023-01-19 11:06   ` David Howells
2023-01-16 23:08 ` [PATCH v6 02/34] iov_iter: Use IOCB/IOMAP_WRITE/op_is_write rather than iterator direction David Howells
2023-01-17  7:55   ` Christoph Hellwig
2023-01-18 22:18     ` Al Viro
2023-01-16 23:08 ` [PATCH v6 03/34] iov_iter: Pass I/O direction into iov_iter_get_pages*() David Howells
2023-01-17  7:57   ` Christoph Hellwig
2023-01-17  8:07     ` David Hildenbrand
2023-01-17  8:09       ` Christoph Hellwig
2023-01-18 23:03     ` Al Viro
2023-01-19  0:15       ` Al Viro
2023-01-19  2:11         ` Al Viro
2023-01-19  5:47           ` Christoph Hellwig
2023-01-19  5:47         ` Christoph Hellwig
2023-01-17  8:44   ` David Howells
2023-01-17  8:46     ` Christoph Hellwig
2023-01-17  8:47     ` David Hildenbrand
2023-01-16 23:08 ` [PATCH v6 04/34] iov_iter: Remove iov_iter_get_pages2/pages_alloc2() David Howells
2023-01-16 23:08 ` [PATCH v6 05/34] iov_iter: Change the direction macros into an enum David Howells
2023-01-18 23:14   ` Al Viro
2023-01-18 23:17   ` David Howells
2023-01-18 23:19     ` Al Viro
2023-01-18 23:24     ` David Howells
2023-01-16 23:08 ` [PATCH v6 06/34] iov_iter: Use the direction in the iterator functions David Howells
2023-01-17  7:58   ` Christoph Hellwig
2023-01-18 22:28     ` Al Viro
2023-01-19  5:45       ` Christoph Hellwig
2023-01-18 23:15   ` Al Viro
2023-01-16 23:08 ` [PATCH v6 07/34] iov_iter: Add a function to extract a page list from an iterator David Howells
2023-01-17  8:01   ` Christoph Hellwig
2023-01-17  8:19   ` David Howells
2023-01-16 23:08 ` [PATCH v6 08/34] mm: Provide a helper to drop a pin/ref on a page David Howells
2023-01-17  8:02   ` Christoph Hellwig
2023-01-17  8:21   ` David Howells
2023-01-16 23:09 ` [PATCH v6 09/34] bio: Rename BIO_NO_PAGE_REF to BIO_PAGE_REFFED and invert the meaning David Howells
2023-01-17  8:02   ` Christoph Hellwig
2023-01-16 23:09 ` [PATCH v6 10/34] mm, block: Make BIO_PAGE_REFFED/PINNED the same as FOLL_GET/PIN numerically David Howells
2023-01-17  8:03   ` Christoph Hellwig
2023-01-16 23:09 ` [PATCH v6 11/34] iov_iter, block: Make bio structs pin pages rather than ref'ing if appropriate David Howells
2023-01-17  8:07   ` Christoph Hellwig
2023-01-17  8:26   ` David Howells
2023-01-17  8:44     ` Christoph Hellwig
2023-01-16 23:09 ` [PATCH v6 12/34] bio: Fix bio_flagged() so that gcc can better optimise it David Howells
2023-01-17  8:07   ` Christoph Hellwig
2023-01-16 23:09 ` [PATCH v6 13/34] netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator David Howells
2023-01-16 23:09 ` [PATCH v6 14/34] netfs: Add a function to extract an iterator into a scatterlist David Howells
2023-01-16 23:09 ` [PATCH v6 15/34] af_alg: Pin pages rather than ref'ing if appropriate David Howells
2023-01-16 23:09 ` [PATCH v6 16/34] af_alg: [RFC] Use netfs_extract_iter_to_sg() to create scatterlists David Howells
2023-01-16 23:10 ` [PATCH v6 17/34] scsi: [RFC] Use netfs_extract_iter_to_sg() David Howells
2023-01-16 23:10 ` [PATCH v6 18/34] dio: Pin pages rather than ref'ing if appropriate David Howells
2023-01-19  5:04   ` Al Viro
2023-01-19  5:51     ` Christoph Hellwig
2023-01-16 23:10 ` [PATCH v6 19/34] fuse: " David Howells
2023-01-16 23:10 ` [PATCH v6 20/34] vfs: Make splice use iov_iter_extract_pages() David Howells
2023-01-19  2:31   ` Al Viro
2023-01-16 23:10 ` [PATCH v6 21/34] 9p: Pin pages rather than ref'ing if appropriate David Howells
2023-01-19  2:52   ` Al Viro
2023-01-19 16:44   ` David Howells
2023-01-19 16:51     ` Christoph Hellwig
2023-01-16 23:10 ` [PATCH v6 22/34] nfs: " David Howells
2023-01-16 23:10 ` [PATCH v6 23/34] cifs: Implement splice_read to pass down ITER_BVEC not ITER_PIPE David Howells
2023-01-16 23:10 ` [PATCH v6 24/34] cifs: Add a function to build an RDMA SGE list from an iterator David Howells
2023-01-16 23:11 ` [PATCH v6 25/34] cifs: Add a function to Hash the contents of " David Howells
2023-01-16 23:11 ` [PATCH v6 26/34] cifs: Add some helper functions David Howells
2023-01-16 23:11 ` [PATCH v6 27/34] cifs: Add a function to read into an iter from a socket David Howells
2023-01-16 23:11 ` [PATCH v6 28/34] cifs: Change the I/O paths to use an iterator rather than a page list David Howells
2023-01-16 23:11 ` [PATCH v6 29/34] cifs: Build the RDMA SGE list directly from an iterator David Howells
2023-01-16 23:11 ` [PATCH v6 30/34] cifs: Remove unused code David Howells
2023-01-16 23:11 ` [PATCH v6 31/34] cifs: Fix problem with encrypted RDMA data read David Howells
2023-01-19 16:25   ` Stefan Metzmacher
2023-01-16 23:11 ` [PATCH v6 32/34] cifs: DIO to/from KVEC-type iterators should now work David Howells
2023-01-16 23:12 ` [PATCH v6 33/34] net: [RFC][WIP] Mark each skb_frags as to how they should be cleaned up David Howells
2023-01-16 23:12 ` David Howells [this message]
2023-01-17  7:46 ` [PATCH v6 00/34] iov_iter: Improve page extraction (ref, pin or just list) Christoph Hellwig
2023-01-18 14:00 ` [PATCH v6 09/34] bio: Rename BIO_NO_PAGE_REF to BIO_PAGE_REFFED and invert the meaning David Howells
2023-01-18 14:09   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=167391073019.2311931.11127613443740355536.stgit@warthog.procyon.org.uk \
    --to=dhowells@redhat.com \
    --cc=axboe@kernel.dk \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=jlayton@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=logang@deltatee.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.