* [RFC PATCH 00/11] ceph: convert to new fscache API
@ 2020-07-22 10:55 Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 01/11] ceph: break out writeback of incompatible snap context to separate function Jeff Layton
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

Here's my first pass at a conversion to the new fscache API. It starts
with a few cleanup/reorganization patches to prepare the code. I then
rip out most of the old ceph fscache helpers and replace them with new
ones for the new API.

The last patches then plug buffered read and write support back into the
new infrastructure, with most of the read-side routines using the new
fscache_read_helper routines.

This passes xfstests' quick group run with the cache disabled. With the
cache enabled, it passes most of the tests, but I hit some OOM kills on
generic/531. I'm still tracking that down, but I suspect the problem is
in the fscache/cachefiles code and not in these patches.

Note that this depends on some patches that were not in David's last
posting of the fscache update, so the series as a whole is not yet ready
for merge. We could probably take a few of the initial cleanup patches
in earlier, though.

Jeff Layton (11):
  ceph: break out writeback of incompatible snap context to separate
    function
  ceph: don't call ceph_update_writeable_page from page_mkwrite
  ceph: fold ceph_sync_readpages into ceph_readpage
  ceph: fold ceph_sync_writepages into writepage_nounlock
  ceph: fold ceph_update_writeable_page into ceph_write_begin
  ceph: conversion to new fscache API
  ceph: convert readpage to fscache read helper
  ceph: plug write_begin into read helper
  ceph: convert readpages to fscache_read_helper
  ceph: add fscache writeback support
  ceph: re-enable fscache support

 fs/ceph/Kconfig |   3 +-
 fs/ceph/addr.c  | 939 +++++++++++++++++++++++++++---------------------
 fs/ceph/cache.c | 290 ++++-----------
 fs/ceph/cache.h | 106 ++----
 fs/ceph/caps.c  |  11 +-
 fs/ceph/file.c  |  13 +-
 fs/ceph/inode.c |  14 +-
 fs/ceph/super.h |   1 -
 8 files changed, 644 insertions(+), 733 deletions(-)

-- 
2.26.2


* [RFC PATCH 01/11] ceph: break out writeback of incompatible snap context to separate function
  2020-07-22 10:55 [RFC PATCH 00/11] ceph: convert to new fscache API Jeff Layton
@ 2020-07-22 10:55 ` Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 02/11] ceph: don't call ceph_update_writeable_page from page_mkwrite Jeff Layton
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

When dirtying a page, we have to flush any incompatible snap contexts
first. Move that logic out into a separate helper, ceph_find_incompatible,
so it can be reused by other callers.
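
The helper returns NULL when the page can safely be dirtied, an ERR_PTR
on a hard error, or a referenced snap context that the caller must wait
on. Callers are expected to handle it roughly like this (a condensed
sketch of the reworked ceph_update_writeable_page below):

    snapc = ceph_find_incompatible(inode, page);
    if (IS_ERR(snapc))
            return PTR_ERR(snapc);          /* hard error */
    if (snapc) {
            /* dirty in an older snap context: kick writeback and wait */
            unlock_page(page);
            ceph_queue_writeback(inode);
            wait_event_killable(ci->i_cap_wq,
                    context_is_writeable_or_written(inode, snapc));
            ceph_put_snap_context(snapc);
            return -EAGAIN;                 /* caller retries */
    }
    /* NULL: safe to dirty the page in the current snap context */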

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c | 96 +++++++++++++++++++++++++++++++-------------------
 1 file changed, 60 insertions(+), 36 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 01ad09733ac7..4057725e1864 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1300,40 +1300,34 @@ static int context_is_writeable_or_written(struct inode *inode,
 	return ret;
 }
 
-/*
- * We are only allowed to write into/dirty the page if the page is
- * clean, or already dirty within the same snap context.
+/**
+ * ceph_find_incompatible - find an incompatible context and return it
+ * @inode: inode associated with page
+ * @page: page being dirtied
  *
- * called with page locked.
- * return success with page locked,
- * or any failure (incl -EAGAIN) with page unlocked.
+ * Returns NULL on success, negative error code on error, and a snapc ref that should be
+ * waited on otherwise.
  */
-static int ceph_update_writeable_page(struct file *file,
-			    loff_t pos, unsigned len,
-			    struct page *page)
+static struct ceph_snap_context *
+ceph_find_incompatible(struct inode *inode, struct page *page)
 {
-	struct inode *inode = file_inode(file);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
 	struct ceph_inode_info *ci = ceph_inode(inode);
-	loff_t page_off = pos & PAGE_MASK;
-	int pos_in_page = pos & ~PAGE_MASK;
-	int end_in_page = pos_in_page + len;
-	loff_t i_size;
-	int r;
-	struct ceph_snap_context *snapc, *oldest;
 
 	if (READ_ONCE(fsc->mount_state) == CEPH_MOUNT_SHUTDOWN) {
 		dout(" page %p forced umount\n", page);
-		unlock_page(page);
-		return -EIO;
+		return ERR_PTR(-EIO);
 	}
 
-retry_locked:
-	/* writepages currently holds page lock, but if we change that later, */
-	wait_on_page_writeback(page);
+	for (;;) {
+		struct ceph_snap_context *snapc, *oldest;
+
+		wait_on_page_writeback(page);
+
+		snapc = page_snap_context(page);
+		if (!snapc || snapc == ci->i_head_snapc)
+			break;
 
-	snapc = page_snap_context(page);
-	if (snapc && snapc != ci->i_head_snapc) {
 		/*
 		 * this page is already dirty in another (older) snap
 		 * context!  is it writeable now?
@@ -1348,26 +1342,56 @@ static int ceph_update_writeable_page(struct file *file,
 			 * be writeable or written
 			 */
 			snapc = ceph_get_snap_context(snapc);
-			unlock_page(page);
-			ceph_queue_writeback(inode);
-			r = wait_event_killable(ci->i_cap_wq,
-			       context_is_writeable_or_written(inode, snapc));
-			ceph_put_snap_context(snapc);
-			if (r == -ERESTARTSYS)
-				return r;
-			return -EAGAIN;
+			return snapc;
 		}
 		ceph_put_snap_context(oldest);
 
 		/* yay, writeable, do it now (without dropping page lock) */
 		dout(" page %p snapc %p not current, but oldest\n",
 		     page, snapc);
-		if (!clear_page_dirty_for_io(page))
-			goto retry_locked;
-		r = writepage_nounlock(page, NULL);
-		if (r < 0)
+		if (clear_page_dirty_for_io(page)) {
+			int r = writepage_nounlock(page, NULL);
+			if (r < 0)
+				return ERR_PTR(r);
+		}
+	}
+	return NULL;
+}
+
+/*
+ * We are only allowed to write into/dirty the page if the page is
+ * clean, or already dirty within the same snap context.
+ *
+ * called with page locked.
+ * return success with page locked,
+ * or any failure (incl -EAGAIN) with page unlocked.
+ */
+static int ceph_update_writeable_page(struct file *file,
+			    loff_t pos, unsigned len,
+			    struct page *page)
+{
+	struct inode *inode = file_inode(file);
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	struct ceph_snap_context *snapc;
+	loff_t page_off = pos & PAGE_MASK;
+	int pos_in_page = pos & ~PAGE_MASK;
+	int end_in_page = pos_in_page + len;
+	loff_t i_size;
+	int r;
+
+retry_locked:
+	snapc = ceph_find_incompatible(inode, page);
+	if (snapc) {
+		if (IS_ERR(snapc)) {
+			r = PTR_ERR(snapc);
 			goto fail_unlock;
-		goto retry_locked;
+		}
+		unlock_page(page);
+		ceph_queue_writeback(inode);
+		r = wait_event_killable(ci->i_cap_wq,
+					context_is_writeable_or_written(inode, snapc));
+		ceph_put_snap_context(snapc);
+		return -EAGAIN;
 	}
 
 	if (PageUptodate(page)) {
-- 
2.26.2


* [RFC PATCH 02/11] ceph: don't call ceph_update_writeable_page from page_mkwrite
  2020-07-22 10:55 [RFC PATCH 00/11] ceph: convert to new fscache API Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 01/11] ceph: break out writeback of incompatible snap context to separate function Jeff Layton
@ 2020-07-22 10:55 ` Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 03/11] ceph: fold ceph_sync_readpages into ceph_readpage Jeff Layton
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

page_mkwrite should only ever be called on Uptodate pages, so there is
no need to read the page in; we only need to flush any incompatible snap
contexts before dirtying it.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 4057725e1864..4f017d552196 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1693,6 +1693,8 @@ static vm_fault_t ceph_page_mkwrite(struct vm_fault *vmf)
 	inode_inc_iversion_raw(inode);
 
 	do {
+		struct ceph_snap_context *snapc;
+
 		lock_page(page);
 
 		if (page_mkwrite_check_truncate(page, inode) < 0) {
@@ -1701,13 +1703,26 @@ static vm_fault_t ceph_page_mkwrite(struct vm_fault *vmf)
 			break;
 		}
 
-		err = ceph_update_writeable_page(vma->vm_file, off, len, page);
-		if (err >= 0) {
+		snapc = ceph_find_incompatible(inode, page);
+		if (!snapc) {
 			/* success.  we'll keep the page locked. */
 			set_page_dirty(page);
 			ret = VM_FAULT_LOCKED;
+			break;
 		}
-	} while (err == -EAGAIN);
+
+		unlock_page(page);
+
+		if (IS_ERR(snapc)) {
+			ret = VM_FAULT_SIGBUS;
+			break;
+		}
+
+		ceph_queue_writeback(inode);
+		err = wait_event_killable(ci->i_cap_wq,
+				context_is_writeable_or_written(inode, snapc));
+		ceph_put_snap_context(snapc);
+	} while (err == 0);
 
 	if (ret == VM_FAULT_LOCKED ||
 	    ci->i_inline_version != CEPH_INLINE_NONE) {
-- 
2.26.2


* [RFC PATCH 03/11] ceph: fold ceph_sync_readpages into ceph_readpage
  2020-07-22 10:55 [RFC PATCH 00/11] ceph: convert to new fscache API Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 01/11] ceph: break out writeback of incompatible snap context to separate function Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 02/11] ceph: don't call ceph_update_writeable_page from page_mkwrite Jeff Layton
@ 2020-07-22 10:55 ` Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 04/11] ceph: fold ceph_sync_writepages into writepage_nounlock Jeff Layton
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

This is the only caller of ceph_sync_readpages, and folding the two
together will make the upcoming reorganization easier.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c | 79 ++++++++++++++++----------------------------------
 1 file changed, 25 insertions(+), 54 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 4f017d552196..3b464b8c7ce6 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -182,58 +182,15 @@ static int ceph_releasepage(struct page *page, gfp_t g)
 	return !PagePrivate(page);
 }
 
-/*
- * Read some contiguous pages.  If we cross a stripe boundary, shorten
- * *plen.  Return number of bytes read, or error.
- */
-static int ceph_sync_readpages(struct ceph_fs_client *fsc,
-			       struct ceph_vino vino,
-			       struct ceph_file_layout *layout,
-			       u64 off, u64 *plen,
-			       u32 truncate_seq, u64 truncate_size,
-			       struct page **pages, int num_pages,
-			       int page_align)
-{
-	struct ceph_osd_client *osdc = &fsc->client->osdc;
-	struct ceph_osd_request *req;
-	int rc = 0;
-
-	dout("readpages on ino %llx.%llx on %llu~%llu\n", vino.ino,
-	     vino.snap, off, *plen);
-	req = ceph_osdc_new_request(osdc, layout, vino, off, plen, 0, 1,
-				    CEPH_OSD_OP_READ, CEPH_OSD_FLAG_READ,
-				    NULL, truncate_seq, truncate_size,
-				    false);
-	if (IS_ERR(req))
-		return PTR_ERR(req);
-
-	/* it may be a short read due to an object boundary */
-	osd_req_op_extent_osd_data_pages(req, 0,
-				pages, *plen, page_align, false, false);
-
-	dout("readpages  final extent is %llu~%llu (%llu bytes align %d)\n",
-	     off, *plen, *plen, page_align);
-
-	rc = ceph_osdc_start_request(osdc, req, false);
-	if (!rc)
-		rc = ceph_osdc_wait_request(osdc, req);
-
-	ceph_update_read_latency(&fsc->mdsc->metric, req->r_start_latency,
-				 req->r_end_latency, rc);
-
-	ceph_osdc_put_request(req);
-	dout("readpages result %d\n", rc);
-	return rc;
-}
-
-/*
- * read a single page, without unlocking it.
- */
+/* read a single page, without unlocking it. */
 static int ceph_do_readpage(struct file *filp, struct page *page)
 {
 	struct inode *inode = file_inode(filp);
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct ceph_osd_client *osdc = &fsc->client->osdc;
+	struct ceph_osd_request *req;
+	struct ceph_vino vino = ceph_vino(inode);
 	int err = 0;
 	u64 off = page_offset(page);
 	u64 len = PAGE_SIZE;
@@ -260,12 +217,27 @@ static int ceph_do_readpage(struct file *filp, struct page *page)
 	if (err == 0)
 		return -EINPROGRESS;
 
-	dout("readpage inode %p file %p page %p index %lu\n",
-	     inode, filp, page, page->index);
-	err = ceph_sync_readpages(fsc, ceph_vino(inode),
-				  &ci->i_layout, off, &len,
-				  ci->i_truncate_seq, ci->i_truncate_size,
-				  &page, 1, 0);
+	dout("readpage ino %llx.%llx file %p off %llu len %llu page %p index %lu\n",
+	     vino.ino, vino.snap, filp, off, len, page, page->index);
+	req = ceph_osdc_new_request(osdc, &ci->i_layout, vino, off, &len, 0, 1,
+				    CEPH_OSD_OP_READ, CEPH_OSD_FLAG_READ, NULL,
+				    ci->i_truncate_seq, ci->i_truncate_size,
+				    false);
+	if (IS_ERR(req))
+		return PTR_ERR(req);
+
+	osd_req_op_extent_osd_data_pages(req, 0, &page, len, 0, false, false);
+
+	err = ceph_osdc_start_request(osdc, req, false);
+	if (!err)
+		err = ceph_osdc_wait_request(osdc, req);
+
+	ceph_update_read_latency(&fsc->mdsc->metric, req->r_start_latency,
+				 req->r_end_latency, err);
+
+	ceph_osdc_put_request(req);
+	dout("readpage result %d\n", err);
+
 	if (err == -ENOENT)
 		err = 0;
 	if (err < 0) {
@@ -283,7 +255,6 @@ static int ceph_do_readpage(struct file *filp, struct page *page)
 
 	SetPageUptodate(page);
 	ceph_readpage_to_fscache(inode, page);
-
 out:
 	return err < 0 ? err : 0;
 }
-- 
2.26.2


* [RFC PATCH 04/11] ceph: fold ceph_sync_writepages into writepage_nounlock
  2020-07-22 10:55 [RFC PATCH 00/11] ceph: convert to new fscache API Jeff Layton
                   ` (2 preceding siblings ...)
  2020-07-22 10:55 ` [RFC PATCH 03/11] ceph: fold ceph_sync_readpages into ceph_readpage Jeff Layton
@ 2020-07-22 10:55 ` Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 05/11] ceph: fold ceph_update_writeable_page into ceph_write_begin Jeff Layton
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

This is the only caller of ceph_sync_writepages, and folding the two
together will make the code easier to refactor.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c | 93 +++++++++++++++++++-------------------------------
 1 file changed, 35 insertions(+), 58 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 3b464b8c7ce6..42fb38a112b8 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -590,50 +590,6 @@ static u64 get_writepages_data_length(struct inode *inode,
 	return end > start ? end - start : 0;
 }
 
-/*
- * do a synchronous write on N pages
- */
-static int ceph_sync_writepages(struct ceph_fs_client *fsc,
-				struct ceph_vino vino,
-				struct ceph_file_layout *layout,
-				struct ceph_snap_context *snapc,
-				u64 off, u64 len,
-				u32 truncate_seq, u64 truncate_size,
-				struct timespec64 *mtime,
-				struct page **pages, int num_pages)
-{
-	struct ceph_osd_client *osdc = &fsc->client->osdc;
-	struct ceph_osd_request *req;
-	int rc = 0;
-	int page_align = off & ~PAGE_MASK;
-
-	req = ceph_osdc_new_request(osdc, layout, vino, off, &len, 0, 1,
-				    CEPH_OSD_OP_WRITE, CEPH_OSD_FLAG_WRITE,
-				    snapc, truncate_seq, truncate_size,
-				    true);
-	if (IS_ERR(req))
-		return PTR_ERR(req);
-
-	/* it may be a short write due to an object boundary */
-	osd_req_op_extent_osd_data_pages(req, 0, pages, len, page_align,
-				false, false);
-	dout("writepages %llu~%llu (%llu bytes)\n", off, len, len);
-
-	req->r_mtime = *mtime;
-	rc = ceph_osdc_start_request(osdc, req, true);
-	if (!rc)
-		rc = ceph_osdc_wait_request(osdc, req);
-
-	ceph_update_write_latency(&fsc->mdsc->metric, req->r_start_latency,
-				  req->r_end_latency, rc);
-
-	ceph_osdc_put_request(req);
-	if (rc == 0)
-		rc = len;
-	dout("writepages result %d\n", rc);
-	return rc;
-}
-
 /*
  * Write a single page, but leave the page locked.
  *
@@ -642,20 +598,19 @@ static int ceph_sync_writepages(struct ceph_fs_client *fsc,
  */
 static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
 {
-	struct inode *inode;
-	struct ceph_inode_info *ci;
-	struct ceph_fs_client *fsc;
+	struct inode *inode = page->mapping->host;
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
 	struct ceph_snap_context *snapc, *oldest;
 	loff_t page_off = page_offset(page);
-	int err, len = PAGE_SIZE;
+	int err;
+	loff_t len = PAGE_SIZE;
 	struct ceph_writeback_ctl ceph_wbc;
+	struct ceph_osd_client *osdc = &fsc->client->osdc;
+	struct ceph_osd_request *req;
 
 	dout("writepage %p idx %lu\n", page, page->index);
 
-	inode = page->mapping->host;
-	ci = ceph_inode(inode);
-	fsc = ceph_inode_to_client(inode);
-
 	/* verify this is a writeable snap context */
 	snapc = page_snap_context(page);
 	if (!snapc) {
@@ -684,7 +639,7 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
 	if (ceph_wbc.i_size < page_off + len)
 		len = ceph_wbc.i_size - page_off;
 
-	dout("writepage %p page %p index %lu on %llu~%u snapc %p seq %lld\n",
+	dout("writepage %p page %p index %lu on %llu~%llu snapc %p seq %lld\n",
 	     inode, page, page->index, page_off, len, snapc, snapc->seq);
 
 	if (atomic_long_inc_return(&fsc->writeback_count) >
@@ -692,11 +647,33 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
 		set_bdi_congested(inode_to_bdi(inode), BLK_RW_ASYNC);
 
 	set_page_writeback(page);
-	err = ceph_sync_writepages(fsc, ceph_vino(inode),
-				   &ci->i_layout, snapc, page_off, len,
-				   ceph_wbc.truncate_seq,
-				   ceph_wbc.truncate_size,
-				   &inode->i_mtime, &page, 1);
+	req = ceph_osdc_new_request(osdc, &ci->i_layout, ceph_vino(inode), page_off, &len, 0, 1,
+				    CEPH_OSD_OP_WRITE, CEPH_OSD_FLAG_WRITE, snapc,
+				    ceph_wbc.truncate_seq, ceph_wbc.truncate_size,
+				    true);
+	if (IS_ERR(req)) {
+		redirty_page_for_writepage(wbc, page);
+		end_page_writeback(page);
+		return PTR_ERR(req);
+	}
+
+	/* it may be a short write due to an object boundary */
+	WARN_ON_ONCE(len > PAGE_SIZE);
+	osd_req_op_extent_osd_data_pages(req, 0, &page, len, 0, false, false);
+	dout("writepage %llu~%llu (%llu bytes)\n", page_off, len, len);
+
+	req->r_mtime = inode->i_mtime;
+	err = ceph_osdc_start_request(osdc, req, true);
+	if (!err)
+		err = ceph_osdc_wait_request(osdc, req);
+
+	ceph_update_write_latency(&fsc->mdsc->metric, req->r_start_latency,
+				  req->r_end_latency, err);
+
+	ceph_osdc_put_request(req);
+	if (err == 0)
+		err = len;
+
 	if (err < 0) {
 		struct writeback_control tmp_wbc;
 		if (!wbc)
-- 
2.26.2


* [RFC PATCH 05/11] ceph: fold ceph_update_writeable_page into ceph_write_begin
  2020-07-22 10:55 [RFC PATCH 00/11] ceph: convert to new fscache API Jeff Layton
                   ` (3 preceding siblings ...)
  2020-07-22 10:55 ` [RFC PATCH 04/11] ceph: fold ceph_sync_writepages into writepage_nounlock Jeff Layton
@ 2020-07-22 10:55 ` Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 06/11] ceph: conversion to new fscache API Jeff Layton
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

Fold ceph_update_writeable_page into ceph_write_begin, which is now its
only caller, and reorganize the loop for better clarity.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c | 141 ++++++++++++++++++++++---------------------------
 1 file changed, 62 insertions(+), 79 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 42fb38a112b8..b43431c0c95c 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1253,6 +1253,8 @@ static int context_is_writeable_or_written(struct inode *inode,
  * @inode: inode associated with page
  * @page: page being dirtied
  *
+ * We are only allowed to write into/dirty the page if the page is
+ * clean, or already dirty within the same snap context.
  * Returns NULL on success, negative error code on error, and a snapc ref that should be
  * waited on otherwise.
  */
@@ -1309,104 +1311,85 @@ ceph_find_incompatible(struct inode *inode, struct page *page)
 /*
  * We are only allowed to write into/dirty the page if the page is
  * clean, or already dirty within the same snap context.
- *
- * called with page locked.
- * return success with page locked,
- * or any failure (incl -EAGAIN) with page unlocked.
  */
-static int ceph_update_writeable_page(struct file *file,
-			    loff_t pos, unsigned len,
-			    struct page *page)
+static int ceph_write_begin(struct file *file, struct address_space *mapping,
+			    loff_t pos, unsigned len, unsigned flags,
+			    struct page **pagep, void **fsdata)
 {
 	struct inode *inode = file_inode(file);
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_snap_context *snapc;
+	struct page *page = NULL;
+	pgoff_t index = pos >> PAGE_SHIFT;
 	loff_t page_off = pos & PAGE_MASK;
 	int pos_in_page = pos & ~PAGE_MASK;
 	int end_in_page = pos_in_page + len;
 	loff_t i_size;
 	int r;
+refind:
+	/* get a page */
+	page = grab_cache_page_write_begin(mapping, index, 0);
+	if (!page)
+		return -ENOMEM;
 
-retry_locked:
-	snapc = ceph_find_incompatible(inode, page);
-	if (snapc) {
-		if (IS_ERR(snapc)) {
-			r = PTR_ERR(snapc);
-			goto fail_unlock;
+	dout("write_begin file %p inode %p page %p %d~%d\n", file,
+	     inode, page, (int)pos, (int)len);
+
+	for (;;) {
+		snapc = ceph_find_incompatible(inode, page);
+		if (snapc) {
+			if (IS_ERR(snapc)) {
+				r = PTR_ERR(snapc);
+				break;
+			}
+			unlock_page(page);
+			ceph_queue_writeback(inode);
+			r = wait_event_killable(ci->i_cap_wq,
+						context_is_writeable_or_written(inode, snapc));
+			ceph_put_snap_context(snapc);
+			put_page(page);
+			goto refind;
 		}
-		unlock_page(page);
-		ceph_queue_writeback(inode);
-		r = wait_event_killable(ci->i_cap_wq,
-					context_is_writeable_or_written(inode, snapc));
-		ceph_put_snap_context(snapc);
-		return -EAGAIN;
-	}
 
-	if (PageUptodate(page)) {
-		dout(" page %p already uptodate\n", page);
-		return 0;
-	}
+		if (PageUptodate(page)) {
+			dout(" page %p already uptodate\n", page);
+			break;
+		}
 
-	/* full page? */
-	if (pos_in_page == 0 && len == PAGE_SIZE)
-		return 0;
+		/* full page? */
+		if (pos_in_page == 0 && len == PAGE_SIZE)
+			break;
 
-	/* past end of file? */
-	i_size = i_size_read(inode);
-
-	if (page_off >= i_size ||
-	    (pos_in_page == 0 && (pos+len) >= i_size &&
-	     end_in_page - pos_in_page != PAGE_SIZE)) {
-		dout(" zeroing %p 0 - %d and %d - %d\n",
-		     page, pos_in_page, end_in_page, (int)PAGE_SIZE);
-		zero_user_segments(page,
-				   0, pos_in_page,
-				   end_in_page, PAGE_SIZE);
-		return 0;
-	}
+		/* past end of file? */
+		i_size = i_size_read(inode);
+		if (page_off >= i_size ||
+		    (pos_in_page == 0 && (pos+len) >= i_size &&
+		     end_in_page - pos_in_page != PAGE_SIZE)) {
+			dout(" zeroing %p 0 - %d and %d - %d\n",
+			     page, pos_in_page, end_in_page, (int)PAGE_SIZE);
+			zero_user_segments(page,
+					   0, pos_in_page,
+					   end_in_page, PAGE_SIZE);
+			break;
+		}
 
-	/* we need to read it. */
-	r = ceph_do_readpage(file, page);
-	if (r < 0) {
-		if (r == -EINPROGRESS)
-			return -EAGAIN;
-		goto fail_unlock;
+		/* we need to read it. */
+		r = ceph_do_readpage(file, page);
+		if (r) {
+			if (r == -EINPROGRESS)
+				continue;
+			break;
+		}
 	}
-	goto retry_locked;
-fail_unlock:
-	unlock_page(page);
-	return r;
-}
 
-/*
- * We are only allowed to write into/dirty the page if the page is
- * clean, or already dirty within the same snap context.
- */
-static int ceph_write_begin(struct file *file, struct address_space *mapping,
-			    loff_t pos, unsigned len, unsigned flags,
-			    struct page **pagep, void **fsdata)
-{
-	struct inode *inode = file_inode(file);
-	struct page *page;
-	pgoff_t index = pos >> PAGE_SHIFT;
-	int r;
-
-	do {
-		/* get a page */
-		page = grab_cache_page_write_begin(mapping, index, 0);
-		if (!page)
-			return -ENOMEM;
-
-		dout("write_begin file %p inode %p page %p %d~%d\n", file,
-		     inode, page, (int)pos, (int)len);
-
-		r = ceph_update_writeable_page(file, pos, len, page);
-		if (r < 0)
+	if (r < 0) {
+		if (page) {
+			unlock_page(page);
 			put_page(page);
-		else
-			*pagep = page;
-	} while (r == -EAGAIN);
-
+		}
+	} else {
+		*pagep = page;
+	}
 	return r;
 }
 
-- 
2.26.2


* [RFC PATCH 06/11] ceph: conversion to new fscache API
  2020-07-22 10:55 [RFC PATCH 00/11] ceph: convert to new fscache API Jeff Layton
                   ` (4 preceding siblings ...)
  2020-07-22 10:55 ` [RFC PATCH 05/11] ceph: fold ceph_update_writeable_page into ceph_write_begin Jeff Layton
@ 2020-07-22 10:55 ` Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 07/11] ceph: convert readpage to fscache read helper Jeff Layton
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

Now that the fscache API has been reworked and simplified, start
changing ceph over to use it. Drop functions that are no longer needed
and rework some of the others to conform to the new API.

Change how cookies are managed as well. With the old API, we would only
instantiate a cookie when the file was open for reads. Change it to
instantiate the cookie when the inode is instantiated and call use/unuse
when the file is opened/closed. This will allow us to plumb in write
support in subsequent patches.
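
Concretely, the cookie hooks added here land at these points in the
inode and file lifecycle (a condensed view of the changes below; the
relinquish on inode teardown goes through the existing unregister path):

    /* inode instantiation (ceph_fill_inode, S_IFREG case) */
    ceph_fscache_register_inode_cookie(inode);

    /* file open (ceph_init_file, S_IFREG case) */
    ceph_fscache_use_cookie(inode, file->f_mode & FMODE_WRITE);

    /* file release (ceph_release) */
    ceph_fscache_unuse_cookie(inode, file->f_mode & FMODE_WRITE);

    /* inode teardown */
    ceph_fscache_unregister_inode_cookie(ci);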

For now, just rip most of the code out of the read+write codepaths, as
subsequent patches will rework that code to use new helper functions.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c  |  47 +++-----
 fs/ceph/cache.c | 290 +++++++++++++-----------------------------------
 fs/ceph/cache.h | 106 ++++--------------
 fs/ceph/caps.c  |  11 +-
 fs/ceph/file.c  |  13 ++-
 fs/ceph/inode.c |  14 ++-
 fs/ceph/super.h |   1 -
 7 files changed, 135 insertions(+), 347 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index b43431c0c95c..1dbc4adbfc72 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -155,19 +155,18 @@ static void ceph_invalidatepage(struct page *page, unsigned int offset,
 		return;
 	}
 
-	ceph_invalidate_fscache_page(inode, page);
-
 	WARN_ON(!PageLocked(page));
-	if (!PagePrivate(page))
-		return;
+	if (PagePrivate(page)) {
+		dout("%p invalidatepage %p idx %lu full dirty page\n",
+		     inode, page, page->index);
 
-	dout("%p invalidatepage %p idx %lu full dirty page\n",
-	     inode, page, page->index);
+		ceph_put_wrbuffer_cap_refs(ci, 1, snapc);
+		ceph_put_snap_context(snapc);
+		page->private = 0;
+		ClearPagePrivate(page);
+	}
 
-	ceph_put_wrbuffer_cap_refs(ci, 1, snapc);
-	ceph_put_snap_context(snapc);
-	page->private = 0;
-	ClearPagePrivate(page);
+	ceph_wait_on_page_fscache(page);
 }
 
 static int ceph_releasepage(struct page *page, gfp_t g)
@@ -175,11 +174,12 @@ static int ceph_releasepage(struct page *page, gfp_t g)
 	dout("%p releasepage %p idx %lu (%sdirty)\n", page->mapping->host,
 	     page, page->index, PageDirty(page) ? "" : "not ");
 
-	/* Can we release the page from the cache? */
-	if (!ceph_release_fscache_page(page, g))
+	if (PagePrivate(page))
 		return 0;
 
-	return !PagePrivate(page);
+	ceph_wait_on_page_fscache(page);
+
+	return 1;
 }
 
 /* read a single page, without unlocking it. */
@@ -213,10 +213,6 @@ static int ceph_do_readpage(struct file *filp, struct page *page)
 		return 0;
 	}
 
-	err = ceph_readpage_from_fscache(inode, page);
-	if (err == 0)
-		return -EINPROGRESS;
-
 	dout("readpage ino %llx.%llx file %p off %llu len %llu page %p index %lu\n",
 	     vino.ino, vino.snap, filp, off, len, page, page->index);
 	req = ceph_osdc_new_request(osdc, &ci->i_layout, vino, off, &len, 0, 1,
@@ -242,7 +238,6 @@ static int ceph_do_readpage(struct file *filp, struct page *page)
 		err = 0;
 	if (err < 0) {
 		SetPageError(page);
-		ceph_fscache_readpage_cancel(inode, page);
 		if (err == -EBLACKLISTED)
 			fsc->blacklisted = true;
 		goto out;
@@ -254,7 +249,6 @@ static int ceph_do_readpage(struct file *filp, struct page *page)
 		flush_dcache_page(page);
 
 	SetPageUptodate(page);
-	ceph_readpage_to_fscache(inode, page);
 out:
 	return err < 0 ? err : 0;
 }
@@ -294,10 +288,8 @@ static void finish_read(struct ceph_osd_request *req)
 	for (i = 0; i < num_pages; i++) {
 		struct page *page = osd_data->pages[i];
 
-		if (rc < 0 && rc != -ENOENT) {
-			ceph_fscache_readpage_cancel(inode, page);
+		if (rc < 0 && rc != -ENOENT)
 			goto unlock;
-		}
 		if (bytes < (int)PAGE_SIZE) {
 			/* zero (remainder of) page */
 			int s = bytes < 0 ? 0 : bytes;
@@ -307,7 +299,6 @@ static void finish_read(struct ceph_osd_request *req)
 		     page->index);
 		flush_dcache_page(page);
 		SetPageUptodate(page);
-		ceph_readpage_to_fscache(inode, page);
 unlock:
 		unlock_page(page);
 		put_page(page);
@@ -408,7 +399,6 @@ static int start_read(struct inode *inode, struct ceph_rw_context *rw_ctx,
 		     page->index);
 		if (add_to_page_cache_lru(page, &inode->i_data, page->index,
 					  GFP_KERNEL)) {
-			ceph_fscache_uncache_page(inode, page);
 			put_page(page);
 			dout("start_read %p add_to_page_cache failed %p\n",
 			     inode, page);
@@ -441,7 +431,6 @@ static int start_read(struct inode *inode, struct ceph_rw_context *rw_ctx,
 
 out_pages:
 	for (i = 0; i < nr_pages; ++i) {
-		ceph_fscache_readpage_cancel(inode, pages[i]);
 		unlock_page(pages[i]);
 	}
 	ceph_put_page_vector(pages, nr_pages, false);
@@ -471,12 +460,6 @@ static int ceph_readpages(struct file *file, struct address_space *mapping,
 	if (ceph_inode(inode)->i_inline_version != CEPH_INLINE_NONE)
 		return -EINVAL;
 
-	rc = ceph_readpages_from_fscache(mapping->host, mapping, page_list,
-					 &nr_pages);
-
-	if (rc == 0)
-		goto out;
-
 	rw_ctx = ceph_find_rw_context(fi);
 	max = fsc->mount_options->rsize >> PAGE_SHIFT;
 	dout("readpages %p file %p ctx %p nr_pages %d max %d\n",
@@ -487,8 +470,6 @@ static int ceph_readpages(struct file *file, struct address_space *mapping,
 			goto out;
 	}
 out:
-	ceph_fscache_readpages_cancel(inode, page_list);
-
 	dout("readpages %p file %p ret %d\n", inode, file, rc);
 	return rc;
 }
diff --git a/fs/ceph/cache.c b/fs/ceph/cache.c
index 2f5cb6bc78e1..67f1dad09c19 100644
--- a/fs/ceph/cache.c
+++ b/fs/ceph/cache.c
@@ -35,11 +35,6 @@ struct ceph_fscache_entry {
 	char uniquifier[];
 };
 
-static const struct fscache_cookie_def ceph_fscache_fsid_object_def = {
-	.name		= "CEPH.fsid",
-	.type		= FSCACHE_COOKIE_TYPE_INDEX,
-};
-
 int __init ceph_fscache_register(void)
 {
 	return fscache_register_netfs(&ceph_cache_netfs);
@@ -50,6 +45,51 @@ void ceph_fscache_unregister(void)
 	fscache_unregister_netfs(&ceph_cache_netfs);
 }
 
+void ceph_fscache_use_cookie(struct inode *inode, bool will_modify)
+{
+	struct ceph_inode_info *ci = ceph_inode(inode);
+
+	if (ci->fscache)
+		fscache_use_cookie(ci->fscache, will_modify);
+}
+
+void ceph_fscache_unuse_cookie(struct inode *inode, bool update)
+{
+	struct ceph_inode_info *ci = ceph_inode(inode);
+
+	if (!ci->fscache)
+		return;
+
+	if (update) {
+		struct ceph_aux_inode aux = {
+						.version = ci->i_version,
+						.mtime_sec = inode->i_mtime.tv_sec,
+						.mtime_nsec = inode->i_mtime.tv_nsec,
+					    };
+		loff_t i_size = i_size_read(inode);
+
+		fscache_unuse_cookie(ci->fscache, &aux, &i_size);
+	} else {
+		fscache_unuse_cookie(ci->fscache, NULL, NULL);
+	}
+}
+
+void ceph_fscache_update_cookie(struct inode *inode)
+{
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	struct ceph_aux_inode aux;
+	loff_t i_size = i_size_read(inode);
+
+	if (!ci->fscache)
+		return;
+
+	aux.version = ci->i_version;
+	aux.mtime_sec = inode->i_mtime.tv_sec;
+	aux.mtime_nsec = inode->i_mtime.tv_nsec;
+
+	fscache_update_cookie(ci->fscache, &aux, &i_size);
+}
+
 int ceph_fscache_register_fs(struct ceph_fs_client* fsc, struct fs_context *fc)
 {
 	const struct ceph_fsid *fsid = &fsc->client->fsid;
@@ -86,225 +126,70 @@ int ceph_fscache_register_fs(struct ceph_fs_client* fsc, struct fs_context *fc)
 	}
 
 	fsc->fscache = fscache_acquire_cookie(ceph_cache_netfs.primary_index,
-					      &ceph_fscache_fsid_object_def,
-					      &ent->fsid, sizeof(ent->fsid) + uniq_len,
-					      NULL, 0,
-					      fsc, 0, true);
-
+					      FSCACHE_COOKIE_TYPE_INDEX,
+					      "CEPH.fsid", 0, NULL, &ent->fsid,
+					      sizeof(ent->fsid) + uniq_len, NULL, 0, 0);
 	if (fsc->fscache) {
 		ent->fscache = fsc->fscache;
 		list_add_tail(&ent->list, &ceph_fscache_list);
 	} else {
+		pr_warn("Unable to set primary index for fscache! Disabling it.\n");
 		kfree(ent);
-		errorfc(fc, "unable to register fscache cookie for fsid %pU",
-		       fsid);
-		/* all other fs ignore this error */
 	}
 out_unlock:
 	mutex_unlock(&ceph_fscache_lock);
 	return err;
 }
 
-static enum fscache_checkaux ceph_fscache_inode_check_aux(
-	void *cookie_netfs_data, const void *data, uint16_t dlen,
-	loff_t object_size)
-{
-	struct ceph_aux_inode aux;
-	struct ceph_inode_info* ci = cookie_netfs_data;
-	struct inode* inode = &ci->vfs_inode;
-
-	if (dlen != sizeof(aux) ||
-	    i_size_read(inode) != object_size)
-		return FSCACHE_CHECKAUX_OBSOLETE;
-
-	memset(&aux, 0, sizeof(aux));
-	aux.version = ci->i_version;
-	aux.mtime_sec = inode->i_mtime.tv_sec;
-	aux.mtime_nsec = inode->i_mtime.tv_nsec;
-
-	if (memcmp(data, &aux, sizeof(aux)) != 0)
-		return FSCACHE_CHECKAUX_OBSOLETE;
-
-	dout("ceph inode 0x%p cached okay\n", ci);
-	return FSCACHE_CHECKAUX_OKAY;
-}
-
-static const struct fscache_cookie_def ceph_fscache_inode_object_def = {
-	.name		= "CEPH.inode",
-	.type		= FSCACHE_COOKIE_TYPE_DATAFILE,
-	.check_aux	= ceph_fscache_inode_check_aux,
-};
-
 void ceph_fscache_register_inode_cookie(struct inode *inode)
 {
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
 	struct ceph_aux_inode aux;
 
+	/* Register only new inodes */
+	if (!(inode->i_state & I_NEW))
+		return;
+
 	/* No caching for filesystem */
 	if (!fsc->fscache)
 		return;
 
-	/* Only cache for regular files that are read only */
-	if (!S_ISREG(inode->i_mode))
-		return;
+	WARN_ON_ONCE(ci->fscache);
 
-	inode_lock_nested(inode, I_MUTEX_CHILD);
-	if (!ci->fscache) {
-		memset(&aux, 0, sizeof(aux));
-		aux.version = ci->i_version;
-		aux.mtime_sec = inode->i_mtime.tv_sec;
-		aux.mtime_nsec = inode->i_mtime.tv_nsec;
-		ci->fscache = fscache_acquire_cookie(fsc->fscache,
-						     &ceph_fscache_inode_object_def,
-						     &ci->i_vino, sizeof(ci->i_vino),
-						     &aux, sizeof(aux),
-						     ci, i_size_read(inode), false);
-	}
-	inode_unlock(inode);
+	memset(&aux, 0, sizeof(aux));
+	aux.version = ci->i_version;
+	aux.mtime_sec = inode->i_mtime.tv_sec;
+	aux.mtime_nsec = inode->i_mtime.tv_nsec;
+	ci->fscache = fscache_acquire_cookie(fsc->fscache,
+					     FSCACHE_COOKIE_TYPE_DATAFILE,
+					     "CEPH.inode", 0, NULL,
+					     &ci->i_vino,
+					     sizeof(ci->i_vino),
+					     &aux, sizeof(aux),
+					     i_size_read(inode));
 }
 
 void ceph_fscache_unregister_inode_cookie(struct ceph_inode_info* ci)
 {
-	struct fscache_cookie* cookie;
-
-	if ((cookie = ci->fscache) == NULL)
-		return;
-
-	ci->fscache = NULL;
-
-	fscache_uncache_all_inode_pages(cookie, &ci->vfs_inode);
-	fscache_relinquish_cookie(cookie, &ci->i_vino, false);
-}
-
-static bool ceph_fscache_can_enable(void *data)
-{
-	struct inode *inode = data;
-	return !inode_is_open_for_write(inode);
-}
-
-void ceph_fscache_file_set_cookie(struct inode *inode, struct file *filp)
-{
-	struct ceph_inode_info *ci = ceph_inode(inode);
-
-	if (!fscache_cookie_valid(ci->fscache))
-		return;
-
-	if (inode_is_open_for_write(inode)) {
-		dout("fscache_file_set_cookie %p %p disabling cache\n",
-		     inode, filp);
-		fscache_disable_cookie(ci->fscache, &ci->i_vino, false);
-		fscache_uncache_all_inode_pages(ci->fscache, inode);
-	} else {
-		fscache_enable_cookie(ci->fscache, &ci->i_vino, i_size_read(inode),
-				      ceph_fscache_can_enable, inode);
-		if (fscache_cookie_enabled(ci->fscache)) {
-			dout("fscache_file_set_cookie %p %p enabling cache\n",
-			     inode, filp);
-		}
-	}
-}
-
-static void ceph_readpage_from_fscache_complete(struct page *page, void *data, int error)
-{
-	if (!error)
-		SetPageUptodate(page);
+	struct fscache_cookie* cookie = xchg(&ci->fscache, NULL);
 
-	unlock_page(page);
-}
-
-static inline bool cache_valid(struct ceph_inode_info *ci)
-{
-	return ci->i_fscache_gen == ci->i_rdcache_gen;
+	if (cookie)
+		fscache_relinquish_cookie(cookie, false);
 }
 
-
-/* Atempt to read from the fscache,
- *
- * This function is called from the readpage_nounlock context. DO NOT attempt to
- * unlock the page here (or in the callback).
- */
-int ceph_readpage_from_fscache(struct inode *inode, struct page *page)
+void ceph_fscache_invalidate(struct inode *inode, unsigned int flags)
 {
 	struct ceph_inode_info *ci = ceph_inode(inode);
-	int ret;
-
-	if (!cache_valid(ci))
-		return -ENOBUFS;
-
-	ret = fscache_read_or_alloc_page(ci->fscache, page,
-					 ceph_readpage_from_fscache_complete, NULL,
-					 GFP_KERNEL);
-
-	switch (ret) {
-		case 0: /* Page found */
-			dout("page read submitted\n");
-			return 0;
-		case -ENOBUFS: /* Pages were not found, and can't be */
-		case -ENODATA: /* Pages were not found */
-			dout("page/inode not in cache\n");
-			return ret;
-		default:
-			dout("%s: unknown error ret = %i\n", __func__, ret);
-			return ret;
-	}
-}
+	struct ceph_aux_inode aux = { .version		= ci->i_version,
+				      .mtime_sec	= inode->i_mtime.tv_sec,
+				      .mtime_nsec	= inode->i_mtime.tv_nsec };
 
-int ceph_readpages_from_fscache(struct inode *inode,
-				  struct address_space *mapping,
-				  struct list_head *pages,
-				  unsigned *nr_pages)
-{
-	struct ceph_inode_info *ci = ceph_inode(inode);
-	int ret;
-
-	if (!cache_valid(ci))
-		return -ENOBUFS;
-
-	ret = fscache_read_or_alloc_pages(ci->fscache, mapping, pages, nr_pages,
-					  ceph_readpage_from_fscache_complete,
-					  NULL, mapping_gfp_mask(mapping));
-
-	switch (ret) {
-		case 0: /* All pages found */
-			dout("all-page read submitted\n");
-			return 0;
-		case -ENOBUFS: /* Some pages were not found, and can't be */
-		case -ENODATA: /* some pages were not found */
-			dout("page/inode not in cache\n");
-			return ret;
-		default:
-			dout("%s: unknown error ret = %i\n", __func__, ret);
-			return ret;
-	}
-}
-
-void ceph_readpage_to_fscache(struct inode *inode, struct page *page)
-{
-	struct ceph_inode_info *ci = ceph_inode(inode);
-	int ret;
-
-	if (!PageFsCache(page))
-		return;
-
-	if (!cache_valid(ci))
-		return;
-
-	ret = fscache_write_page(ci->fscache, page, i_size_read(inode),
-				 GFP_KERNEL);
-	if (ret)
-		 fscache_uncache_page(ci->fscache, page);
-}
-
-void ceph_invalidate_fscache_page(struct inode* inode, struct page *page)
-{
-	struct ceph_inode_info *ci = ceph_inode(inode);
-
-	if (!PageFsCache(page))
-		return;
+	aux.version = ci->i_version;
+	aux.mtime_sec = inode->i_mtime.tv_sec;
+	aux.mtime_nsec = inode->i_mtime.tv_nsec;
 
-	fscache_wait_on_page_write(ci->fscache, page);
-	fscache_uncache_page(ci->fscache, page);
+	fscache_invalidate(ceph_inode(inode)->fscache, &aux, i_size_read(inode), flags);
 }
 
 void ceph_fscache_unregister_fs(struct ceph_fs_client* fsc)
@@ -325,28 +210,7 @@ void ceph_fscache_unregister_fs(struct ceph_fs_client* fsc)
 		WARN_ON_ONCE(!found);
 		mutex_unlock(&ceph_fscache_lock);
 
-		__fscache_relinquish_cookie(fsc->fscache, NULL, false);
+		__fscache_relinquish_cookie(fsc->fscache, false);
 	}
 	fsc->fscache = NULL;
 }
-
-/*
- * caller should hold CEPH_CAP_FILE_{RD,CACHE}
- */
-void ceph_fscache_revalidate_cookie(struct ceph_inode_info *ci)
-{
-	if (cache_valid(ci))
-		return;
-
-	/* resue i_truncate_mutex. There should be no pending
-	 * truncate while the caller holds CEPH_CAP_FILE_RD */
-	mutex_lock(&ci->i_truncate_mutex);
-	if (!cache_valid(ci)) {
-		if (fscache_check_consistency(ci->fscache, &ci->i_vino))
-			fscache_invalidate(ci->fscache);
-		spin_lock(&ci->i_ceph_lock);
-		ci->i_fscache_gen = ci->i_rdcache_gen;
-		spin_unlock(&ci->i_ceph_lock);
-	}
-	mutex_unlock(&ci->i_truncate_mutex);
-}
diff --git a/fs/ceph/cache.h b/fs/ceph/cache.h
index 89dbdd1eb14a..4d5777cdf86f 100644
--- a/fs/ceph/cache.h
+++ b/fs/ceph/cache.h
@@ -21,63 +21,36 @@ void ceph_fscache_unregister_fs(struct ceph_fs_client* fsc);
 
 void ceph_fscache_register_inode_cookie(struct inode *inode);
 void ceph_fscache_unregister_inode_cookie(struct ceph_inode_info* ci);
-void ceph_fscache_file_set_cookie(struct inode *inode, struct file *filp);
-void ceph_fscache_revalidate_cookie(struct ceph_inode_info *ci);
 
-int ceph_readpage_from_fscache(struct inode *inode, struct page *page);
-int ceph_readpages_from_fscache(struct inode *inode,
-				struct address_space *mapping,
-				struct list_head *pages,
-				unsigned *nr_pages);
-void ceph_readpage_to_fscache(struct inode *inode, struct page *page);
-void ceph_invalidate_fscache_page(struct inode* inode, struct page *page);
+void ceph_fscache_use_cookie(struct inode *inode, bool will_modify);
+void ceph_fscache_unuse_cookie(struct inode *inode, bool update);
+
+void ceph_fscache_update_cookie(struct inode *inode);
+void ceph_fscache_invalidate(struct inode *inode, unsigned int flags);
 
 static inline void ceph_fscache_inode_init(struct ceph_inode_info *ci)
 {
 	ci->fscache = NULL;
-	ci->i_fscache_gen = 0;
-}
-
-static inline void ceph_fscache_invalidate(struct inode *inode)
-{
-	fscache_invalidate(ceph_inode(inode)->fscache);
-}
-
-static inline void ceph_fscache_uncache_page(struct inode *inode,
-					     struct page *page)
-{
-	struct ceph_inode_info *ci = ceph_inode(inode);
-	return fscache_uncache_page(ci->fscache, page);
 }
 
-static inline int ceph_release_fscache_page(struct page *page, gfp_t gfp)
+static inline struct fscache_cookie *ceph_fscache_cookie(struct ceph_inode_info *ci)
 {
-	struct inode* inode = page->mapping->host;
-	struct ceph_inode_info *ci = ceph_inode(inode);
-	return fscache_maybe_release_page(ci->fscache, page, gfp);
+	return ci->fscache;
 }
 
-static inline void ceph_fscache_readpage_cancel(struct inode *inode,
-						struct page *page)
+static inline void ceph_wait_on_page_fscache(struct page *page)
 {
-	struct ceph_inode_info *ci = ceph_inode(inode);
-	if (fscache_cookie_valid(ci->fscache) && PageFsCache(page))
-		__fscache_uncache_page(ci->fscache, page);
+	wait_on_page_fscache(page);
 }
 
-static inline void ceph_fscache_readpages_cancel(struct inode *inode,
-						 struct list_head *pages)
+static inline void ceph_fscache_resize_cookie(struct ceph_inode_info *ci, loff_t new_size)
 {
-	struct ceph_inode_info *ci = ceph_inode(inode);
-	return fscache_readpages_cancel(ci->fscache, pages);
-}
+	struct fscache_cookie *cookie = ceph_fscache_cookie(ci);
 
-static inline void ceph_disable_fscache_readpage(struct ceph_inode_info *ci)
-{
-	ci->i_fscache_gen = ci->i_rdcache_gen - 1;
+	if (cookie)
+		fscache_resize_cookie(cookie, new_size);
 }
-
-#else
+#else /* CONFIG_CEPH_FSCACHE */
 
 static inline int ceph_fscache_register(void)
 {
@@ -110,67 +83,34 @@ static inline void ceph_fscache_unregister_inode_cookie(struct ceph_inode_info*
 {
 }
 
-static inline void ceph_fscache_file_set_cookie(struct inode *inode,
-						struct file *filp)
+static inline void ceph_fscache_use_cookie(struct inode *inode, bool will_modify)
 {
 }
 
-static inline void ceph_fscache_revalidate_cookie(struct ceph_inode_info *ci)
+static inline void ceph_fscache_unuse_cookie(struct inode *inode, bool update)
 {
 }
 
-static inline void ceph_fscache_uncache_page(struct inode *inode,
-					     struct page *pages)
+static inline void ceph_fscache_update_cookie(struct inode *inode)
 {
 }
 
-static inline int ceph_readpage_from_fscache(struct inode* inode,
-					     struct page *page)
+static inline void ceph_fscache_invalidate(struct inode *inode, unsigned int flags)
 {
-	return -ENOBUFS;
 }
 
-static inline int ceph_readpages_from_fscache(struct inode *inode,
-					      struct address_space *mapping,
-					      struct list_head *pages,
-					      unsigned *nr_pages)
+static inline struct fscache_cookie *ceph_fscache_cookie(struct ceph_inode_info *ci)
 {
-	return -ENOBUFS;
+	return NULL;
 }
 
-static inline void ceph_readpage_to_fscache(struct inode *inode,
-					    struct page *page)
+static inline void ceph_wait_on_page_fscache(struct page *page)
 {
 }
 
-static inline void ceph_fscache_invalidate(struct inode *inode)
+static inline void ceph_fscache_resize_cookie(struct ceph_inode_info *ci, loff_t new_size)
 {
 }
-
-static inline void ceph_invalidate_fscache_page(struct inode *inode,
-						struct page *page)
-{
-}
-
-static inline int ceph_release_fscache_page(struct page *page, gfp_t gfp)
-{
-	return 1;
-}
-
-static inline void ceph_fscache_readpage_cancel(struct inode *inode,
-						struct page *page)
-{
-}
-
-static inline void ceph_fscache_readpages_cancel(struct inode *inode,
-						 struct list_head *pages)
-{
-}
-
-static inline void ceph_disable_fscache_readpage(struct ceph_inode_info *ci)
-{
-}
-
-#endif
+#endif /* CONFIG_CEPH_FSCACHE */
 
 #endif
diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index f8d0ee19f01d..a4428a40c67c 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1984,7 +1984,7 @@ void ceph_check_caps(struct ceph_inode_info *ci, int flags,
 	    !(ci->i_wb_ref || ci->i_wrbuffer_ref) &&   /* no dirty pages... */
 	    inode->i_data.nrpages &&		/* have cached pages */
 	    (revoking & (CEPH_CAP_FILE_CACHE|
-			 CEPH_CAP_FILE_LAZYIO)) && /*  or revoking cache */
+			 CEPH_CAP_FILE_LAZYIO)) && /* revoking Fc */
 	    !tried_invalidate) {
 		dout("check_caps trying to invalidate on %p\n", inode);
 		if (try_nonblocking_invalidate(inode) < 0) {
@@ -2722,10 +2722,6 @@ static int try_get_cap_refs(struct inode *inode, int need, int want,
 				*got = need | want;
 			else
 				*got = need;
-			if (S_ISREG(inode->i_mode) &&
-			    (need & CEPH_CAP_FILE_RD) &&
-			    !(*got & CEPH_CAP_FILE_CACHE))
-				ceph_disable_fscache_readpage(ci);
 			ceph_take_cap_refs(ci, *got, true);
 			ret = 1;
 		}
@@ -2975,11 +2971,6 @@ int ceph_get_caps(struct file *filp, int need, int want,
 		}
 		break;
 	}
-
-	if (S_ISREG(ci->vfs_inode.i_mode) &&
-	    (_got & CEPH_CAP_FILE_RD) && (_got & CEPH_CAP_FILE_CACHE))
-		ceph_fscache_revalidate_cookie(ci);
-
 	*got = _got;
 	return 0;
 }
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index d51c3f2fdca0..de5dc8cbcd4d 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -250,8 +250,7 @@ static int ceph_init_file(struct inode *inode, struct file *file, int fmode)
 
 	switch (inode->i_mode & S_IFMT) {
 	case S_IFREG:
-		ceph_fscache_register_inode_cookie(inode);
-		ceph_fscache_file_set_cookie(inode, file);
+		ceph_fscache_use_cookie(inode, file->f_mode & FMODE_WRITE);
 		/* fall through */
 	case S_IFDIR:
 		ret = ceph_init_file_info(inode, file, fmode,
@@ -803,6 +802,7 @@ int ceph_release(struct inode *inode, struct file *file)
 		dout("release inode %p regular file %p\n", inode, file);
 		WARN_ON(!list_empty(&fi->rw_contexts));
 
+		ceph_fscache_unuse_cookie(inode, file->f_mode & FMODE_WRITE);
 		ceph_put_fmode(ci, fi->fmode, 1);
 
 		kmem_cache_free(ceph_file_cachep, fi);
@@ -1221,7 +1221,11 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter,
 	     snapc, snapc ? snapc->seq : 0);
 
 	if (write) {
-		int ret2 = invalidate_inode_pages2_range(inode->i_mapping,
+		int ret2;
+
+		ceph_fscache_invalidate(inode, FSCACHE_INVAL_DIO_WRITE);
+
+		ret2 = invalidate_inode_pages2_range(inode->i_mapping,
 					pos >> PAGE_SHIFT,
 					(pos + count - 1) >> PAGE_SHIFT);
 		if (ret2 < 0)
@@ -1432,6 +1436,8 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos,
 	if (ret < 0)
 		return ret;
 
+	/* FIXME: write to cache instead? */
+	ceph_fscache_invalidate(inode, 0);
 	ret = invalidate_inode_pages2_range(inode->i_mapping,
 					    pos >> PAGE_SHIFT,
 					    (pos + count - 1) >> PAGE_SHIFT);
@@ -2381,6 +2387,7 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off,
 		goto out_caps;
 
 	/* Drop dst file cached pages */
+	ceph_fscache_invalidate(dst_inode, 0);
 	ret = invalidate_inode_pages2_range(dst_inode->i_mapping,
 					    dst_off >> PAGE_SHIFT,
 					    (dst_off + len) >> PAGE_SHIFT);
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 357c937699d5..c98830a5be6f 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -623,6 +623,7 @@ int ceph_fill_file_size(struct inode *inode, int issued,
 		}
 		i_size_write(inode, size);
 		inode->i_blocks = calc_inode_blocks(size);
+		ceph_fscache_update_cookie(inode);
 		ci->i_reported_size = size;
 		if (truncate_seq != ci->i_truncate_seq) {
 			dout("truncate_seq %u -> %u\n",
@@ -655,10 +656,6 @@ int ceph_fill_file_size(struct inode *inode, int issued,
 		     truncate_size);
 		ci->i_truncate_size = truncate_size;
 	}
-
-	if (queue_trunc)
-		ceph_fscache_invalidate(inode);
-
 	return queue_trunc;
 }
 
@@ -927,6 +924,7 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
 		inode->i_op = &ceph_file_iops;
 		break;
 	case S_IFREG:
+		ceph_fscache_register_inode_cookie(inode);
 		inode->i_op = &ceph_file_iops;
 		inode->i_fop = &ceph_file_fops;
 		break;
@@ -1790,11 +1788,13 @@ bool ceph_inode_set_size(struct inode *inode, loff_t size)
 	spin_lock(&ci->i_ceph_lock);
 	dout("set_size %p %llu -> %llu\n", inode, inode->i_size, size);
 	i_size_write(inode, size);
+	ceph_fscache_update_cookie(inode);
 	inode->i_blocks = calc_inode_blocks(size);
 
 	ret = __ceph_should_report_size(ci);
 
 	spin_unlock(&ci->i_ceph_lock);
+
 	return ret;
 }
 
@@ -1866,6 +1866,7 @@ void ceph_queue_vmtruncate(struct inode *inode)
 	set_bit(CEPH_I_WORK_VMTRUNCATE, &ci->i_work_mask);
 
 	ihold(inode);
+	ceph_fscache_invalidate(inode, 0);
 	if (queue_work(ceph_inode_to_client(inode)->inode_wq,
 		       &ci->i_work)) {
 		dout("ceph_queue_vmtruncate %p\n", inode);
@@ -1975,6 +1976,10 @@ void __ceph_do_pending_vmtruncate(struct inode *inode)
 	spin_unlock(&ci->i_ceph_lock);
 
 	truncate_pagecache(inode, to);
+	ceph_fscache_use_cookie(inode, true);
+	ceph_fscache_resize_cookie(ci, to);
+	ceph_fscache_unuse_cookie(inode, true);
+
 
 	spin_lock(&ci->i_ceph_lock);
 	if (to == ci->i_truncate_size) {
@@ -2136,6 +2141,7 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr)
 		if ((issued & CEPH_CAP_FILE_EXCL) &&
 		    attr->ia_size > inode->i_size) {
 			i_size_write(inode, attr->ia_size);
+			ceph_fscache_update_cookie(inode);
 			inode->i_blocks = calc_inode_blocks(attr->ia_size);
 			ci->i_reported_size = attr->ia_size;
 			dirtied |= CEPH_CAP_FILE_EXCL;
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 65981a73ff2a..0e69770bcfc2 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -429,7 +429,6 @@ struct ceph_inode_info {
 
 #ifdef CONFIG_CEPH_FSCACHE
 	struct fscache_cookie *fscache;
-	u32 i_fscache_gen;
 #endif
 	errseq_t i_meta_err;
 
-- 
2.26.2


* [RFC PATCH 07/11] ceph: convert readpage to fscache read helper
  2020-07-22 10:55 [RFC PATCH 00/11] ceph: convert to new fscache API Jeff Layton
                   ` (5 preceding siblings ...)
  2020-07-22 10:55 ` [RFC PATCH 06/11] ceph: conversion to new fscache API Jeff Layton
@ 2020-07-22 10:55 ` Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 08/11] ceph: plug write_begin into " Jeff Layton
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

Create a new ceph_fscache_req structure that holds a fscache_io_request
and a refcount_t, and change the readpage code to use the new
infrastructure. Have Kconfig select FSCACHE_SUPPORT so that the read
helpers remain available even when FSCache proper is disabled.
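
With that in place, ceph_readpage reduces to wrapping the page in a
request and handing it off to the read helper (condensed from the diff
below; the inline-data case and error handling are omitted):

    req = ceph_fsreq_alloc();
    fscache_init_io_request(&req->fscache_req, ceph_fscache_cookie(ci),
                            &ceph_readpage_fsreq_ops);
    req->fscache_req.mapping = inode->i_mapping;

    err = fscache_read_helper_locked_page(&req->fscache_req, page,
                            fsc->mount_options->rsize >> PAGE_SHIFT);
    ceph_fsreq_put(&req->fscache_req);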

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/Kconfig |   1 +
 fs/ceph/addr.c  | 204 +++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 195 insertions(+), 10 deletions(-)

diff --git a/fs/ceph/Kconfig b/fs/ceph/Kconfig
index e955a38be3c8..734d08b57ef9 100644
--- a/fs/ceph/Kconfig
+++ b/fs/ceph/Kconfig
@@ -6,6 +6,7 @@ config CEPH_FS
 	select LIBCRC32C
 	select CRYPTO_AES
 	select CRYPTO
+	select FSCACHE_SUPPORT
 	default n
 	help
 	  Choose Y or M here to include support for mounting the
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 1dbc4adbfc72..98f5edbf70b6 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -12,6 +12,7 @@
 #include <linux/signal.h>
 #include <linux/iversion.h>
 #include <linux/ktime.h>
+#include <linux/fscache.h>
 
 #include "super.h"
 #include "mds_client.h"
@@ -182,6 +183,199 @@ static int ceph_releasepage(struct page *page, gfp_t g)
 	return 1;
 }
 
+struct ceph_fscache_req {
+	struct fscache_io_request	fscache_req;
+	refcount_t			ref;
+};
+
+static struct ceph_fscache_req *ceph_fsreq_alloc(void)
+{
+	struct ceph_fscache_req *req = kzalloc(sizeof(*req), GFP_NOFS);
+
+	if (req)
+		refcount_set(&req->ref, 1);
+	return req;
+}
+
+static void ceph_fsreq_done(struct fscache_io_request *fsreq)
+{
+}
+
+static void ceph_fsreq_get(struct fscache_io_request *fsreq)
+{
+	struct ceph_fscache_req *req = container_of(fsreq, struct ceph_fscache_req, fscache_req);
+
+	refcount_inc(&req->ref);
+}
+
+static void ceph_fsreq_put(struct fscache_io_request *fsreq)
+{
+	struct ceph_fscache_req *req = container_of(fsreq, struct ceph_fscache_req, fscache_req);
+
+	if (refcount_dec_and_test(&req->ref)) {
+		fscache_free_io_request(fsreq);
+		kfree(req);
+	}
+}
+
+static void ceph_fsreq_reshape(struct fscache_io_request *fsreq,
+			       struct fscache_request_shape *shape)
+{
+	struct inode *inode = fsreq->mapping->host;
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	u64 objno, objoff;
+	u32 xlen;
+
+	/* Truncate the extent at the end of the current object */
+	ceph_calc_file_object_mapping(&ci->i_layout, shape->actual_start << PAGE_SHIFT,
+				      shape->actual_nr_pages << PAGE_SHIFT, &objno, &objoff, &xlen);
+	shape->actual_nr_pages = xlen >> PAGE_SHIFT;
+}
+
+static void finish_fsreq_read(struct ceph_osd_request *req)
+{
+	struct ceph_fs_client *fsc = ceph_inode_to_client(req->r_inode);
+	struct ceph_osd_data *osd_data = osd_req_op_extent_osd_data(req, 0);
+	struct fscache_io_request *fsreq = req->r_priv;
+	int num_pages;
+	int err = req->r_result;
+
+	ceph_update_read_latency(&fsc->mdsc->metric, req->r_start_latency,
+				 req->r_end_latency, err);
+
+	/* no object means success but no data */
+	if (err == -ENOENT)
+		err = 0;
+	else if (err == -EBLACKLISTED)
+		fsc->blacklisted = true;
+
+	dout("%s: result %d\n", __func__, err);
+	if (err >= 0)
+		fsreq->transferred = err;
+	else
+		fsreq->error = err;
+
+	if (fsreq->io_done)
+		fsreq->io_done(fsreq);
+
+	num_pages = calc_pages_for(osd_data->alignment, osd_data->length);
+	ceph_put_page_vector(osd_data->pages, num_pages, false);
+	ceph_fsreq_put(fsreq);
+	iput(req->r_inode);
+}
+
+static void ceph_fsreq_issue_op(struct fscache_io_request *fsreq)
+{
+	struct inode *inode = fsreq->mapping->host;
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct ceph_osd_request *req = NULL;
+	struct ceph_vino vino = ceph_vino(inode);
+	struct iov_iter iter;
+	struct page **pages;
+	size_t page_off;
+	int err = 0;
+	u64 len = fsreq->len;
+
+	req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout, vino, fsreq->pos, &len,
+			0, 1, CEPH_OSD_OP_READ,
+			CEPH_OSD_FLAG_READ | fsc->client->osdc.client->options->read_from_replica,
+			NULL, ci->i_truncate_seq, ci->i_truncate_size, false);
+	if (IS_ERR(req)) {
+		err = PTR_ERR(req);
+		goto out;
+	}
+
+	dout("%s: pos=%llu orig_len=%llu len=%llu\n", __func__, fsreq->pos, fsreq->len, len);
+	iov_iter_mapping(&iter, READ, fsreq->mapping, fsreq->pos, len);
+	len = iov_iter_get_pages_alloc(&iter, &pages, len, &page_off);
+	if (len < 0) {
+		err = len;
+		dout("%s: iov_iter_get_pages_alloc returned %d\n", __func__, err);
+		goto out;
+	}
+
+	/* fscache should always give us a page-aligned read */
+	WARN_ON_ONCE(page_off);
+
+	osd_req_op_extent_osd_data_pages(req, 0, pages, len, 0, false, false);
+	req->r_callback = finish_fsreq_read;
+	req->r_priv = fsreq;
+	ceph_fsreq_get(fsreq);
+	req->r_inode = inode;
+	ihold(inode);
+
+	err = ceph_osdc_start_request(req->r_osdc, req, false);
+	if (err) {
+		iput(inode);
+		ceph_fsreq_put(fsreq);
+	}
+out:
+	if (req)
+		ceph_osdc_put_request(req);
+
+	if (err) {
+		fsreq->error = err;
+		if (fsreq->io_done)
+			fsreq->io_done(fsreq);
+	}
+	dout("%s: result %d\n", __func__, fsreq->error);
+}
+
+const struct fscache_io_request_ops ceph_readpage_fsreq_ops = {
+	.issue_op	= ceph_fsreq_issue_op,
+	.reshape	= ceph_fsreq_reshape,
+	.done		= ceph_fsreq_done,
+	.get		= ceph_fsreq_get,
+	.put		= ceph_fsreq_put,
+};
+
+/* read a single page via the fscache read helper */
+static int ceph_readpage(struct file *filp, struct page *page)
+{
+	struct inode *inode = file_inode(filp);
+	struct ceph_inode_info *ci = ceph_inode(inode);
+	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct ceph_fscache_req *req;
+	struct ceph_vino vino = ceph_vino(inode);
+	struct fscache_cookie *cookie = ceph_fscache_cookie(ci);
+	int err = 0;
+	u64 off = page_offset(page);
+	u64 len = PAGE_SIZE;
+
+	if (ci->i_inline_version != CEPH_INLINE_NONE) {
+		/*
+		 * Uptodate inline data should have been added
+		 * into page cache while getting Fcr caps.
+		 */
+		if (off == 0) {
+			unlock_page(page);
+			return -EINVAL;
+		}
+		zero_user_segment(page, 0, PAGE_SIZE);
+		SetPageUptodate(page);
+		unlock_page(page);
+		return 0;
+	}
+
+	dout("readpage ino %llx.%llx file %p off %llu len %llu page %p index %lu\n",
+	     vino.ino, vino.snap, filp, off, len, page, page->index);
+
+	req = ceph_fsreq_alloc();
+	if (!req) {
+		unlock_page(page);
+		return -ENOMEM;
+	}
+
+	fscache_init_io_request(&req->fscache_req, cookie, &ceph_readpage_fsreq_ops);
+	req->fscache_req.mapping = inode->i_mapping;
+
+	err = fscache_read_helper_locked_page(&req->fscache_req, page,
+					      fsc->mount_options->rsize >> PAGE_SHIFT);
+	ceph_fsreq_put(&req->fscache_req);
+	return err;
+}
+
 /* read a single page, without unlocking it. */
 static int ceph_do_readpage(struct file *filp, struct page *page)
 {
@@ -253,16 +447,6 @@ static int ceph_do_readpage(struct file *filp, struct page *page)
 	return err < 0 ? err : 0;
 }
 
-static int ceph_readpage(struct file *filp, struct page *page)
-{
-	int r = ceph_do_readpage(filp, page);
-	if (r != -EINPROGRESS)
-		unlock_page(page);
-	else
-		r = 0;
-	return r;
-}
-
 /*
  * Finish an async read(ahead) op.
  */
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH 08/11] ceph: plug write_begin into read helper
  2020-07-22 10:55 [RFC PATCH 00/11] ceph: convert to new fscache API Jeff Layton
                   ` (6 preceding siblings ...)
  2020-07-22 10:55 ` [RFC PATCH 07/11] ceph: convert readpage to fscache read helper Jeff Layton
@ 2020-07-22 10:55 ` Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 09/11] ceph: convert readpages to fscache_read_helper Jeff Layton
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

Plug write_begin into the read helper routine. This requires adding a
new is_req_valid op that we can use to vet whether there is an
incompatible snap context that needs to be flushed before we can fill
the page.
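
Roughly, write_begin now loops on the read helper and treats -EAGAIN from
->is_req_valid as "incompatible snapc found, flush it and retry". A condensed
sketch of that loop (page-reference juggling and the no-read fast paths are
omitted, the fscache names follow this series' proposed API, and the sketch_*
function is illustrative only):

static int sketch_read_for_write(struct inode *inode, struct page **pagep,
				 pgoff_t index, unsigned int max_pages)
{
	struct ceph_inode_info *ci = ceph_inode(inode);
	int r;

	for (;;) {
		struct ceph_fscache_req *req = ceph_fsreq_alloc();

		if (!req)
			return -ENOMEM;
		fscache_init_io_request(&req->fscache_req,
					ceph_fscache_cookie(ci),
					&ceph_read_for_write_fsreq_ops);
		req->fscache_req.mapping = inode->i_mapping;

		r = fscache_read_helper_for_write(&req->fscache_req, pagep,
						  index, max_pages, 0);
		if (r != -EAGAIN) {
			if (r == 0)	/* read submitted; wait for completion */
				r = wait_on_bit(&req->fscache_req.flags,
						FSCACHE_IO_READ_IN_PROGRESS,
						TASK_KILLABLE);
			ceph_fsreq_put(&req->fscache_req);
			return r;
		}

		/* ->is_req_valid stashed the offending snapc; flush and retry */
		ceph_queue_writeback(inode);
		r = wait_event_killable(ci->i_cap_wq,
				context_is_writeable_or_written(inode, req->snapc));
		ceph_fsreq_put(&req->fscache_req);
		if (r)
			return r;
	}
}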

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c | 251 +++++++++++++++++++++++++------------------------
 1 file changed, 130 insertions(+), 121 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 98f5edbf70b6..cc10bf17b65b 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -185,6 +185,7 @@ static int ceph_releasepage(struct page *page, gfp_t g)
 
 struct ceph_fscache_req {
 	struct fscache_io_request	fscache_req;
+	struct ceph_snap_context	*snapc;
 	refcount_t			ref;
 };
 
@@ -376,77 +377,6 @@ static int ceph_readpage(struct file *filp, struct page *page)
 	return err;
 }
 
-/* read a single page, without unlocking it. */
-static int ceph_do_readpage(struct file *filp, struct page *page)
-{
-	struct inode *inode = file_inode(filp);
-	struct ceph_inode_info *ci = ceph_inode(inode);
-	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
-	struct ceph_osd_client *osdc = &fsc->client->osdc;
-	struct ceph_osd_request *req;
-	struct ceph_vino vino = ceph_vino(inode);
-	int err = 0;
-	u64 off = page_offset(page);
-	u64 len = PAGE_SIZE;
-
-	if (off >= i_size_read(inode)) {
-		zero_user_segment(page, 0, PAGE_SIZE);
-		SetPageUptodate(page);
-		return 0;
-	}
-
-	if (ci->i_inline_version != CEPH_INLINE_NONE) {
-		/*
-		 * Uptodate inline data should have been added
-		 * into page cache while getting Fcr caps.
-		 */
-		if (off == 0)
-			return -EINVAL;
-		zero_user_segment(page, 0, PAGE_SIZE);
-		SetPageUptodate(page);
-		return 0;
-	}
-
-	dout("readpage ino %llx.%llx file %p off %llu len %llu page %p index %lu\n",
-	     vino.ino, vino.snap, filp, off, len, page, page->index);
-	req = ceph_osdc_new_request(osdc, &ci->i_layout, vino, off, &len, 0, 1,
-				    CEPH_OSD_OP_READ, CEPH_OSD_FLAG_READ, NULL,
-				    ci->i_truncate_seq, ci->i_truncate_size,
-				    false);
-	if (IS_ERR(req))
-		return PTR_ERR(req);
-
-	osd_req_op_extent_osd_data_pages(req, 0, &page, len, 0, false, false);
-
-	err = ceph_osdc_start_request(osdc, req, false);
-	if (!err)
-		err = ceph_osdc_wait_request(osdc, req);
-
-	ceph_update_read_latency(&fsc->mdsc->metric, req->r_start_latency,
-				 req->r_end_latency, err);
-
-	ceph_osdc_put_request(req);
-	dout("readpage result %d\n", err);
-
-	if (err == -ENOENT)
-		err = 0;
-	if (err < 0) {
-		SetPageError(page);
-		if (err == -EBLACKLISTED)
-			fsc->blacklisted = true;
-		goto out;
-	}
-	if (err < PAGE_SIZE)
-		/* zero fill remainder of page */
-		zero_user_segment(page, err, PAGE_SIZE);
-	else
-		flush_dcache_page(page);
-
-	SetPageUptodate(page);
-out:
-	return err < 0 ? err : 0;
-}
-
 /*
  * Finish an async read(ahead) op.
  */
@@ -1473,6 +1403,30 @@ ceph_find_incompatible(struct inode *inode, struct page *page)
 	return NULL;
 }
 
+static int ceph_fsreq_is_req_valid(struct fscache_io_request *fsreq)
+{
+	struct ceph_snap_context *snapc;
+	struct ceph_fscache_req *req = container_of(fsreq, struct ceph_fscache_req, fscache_req);
+
+	snapc = ceph_find_incompatible(fsreq->mapping->host, fsreq->no_unlock_page);
+	if (snapc) {
+		if (IS_ERR(snapc))
+			return PTR_ERR(snapc);
+		req->snapc = snapc;
+		return -EAGAIN;
+	}
+	return 0;
+}
+
+const struct fscache_io_request_ops ceph_read_for_write_fsreq_ops = {
+	.issue_op	= ceph_fsreq_issue_op,
+	.reshape	= ceph_fsreq_reshape,
+	.is_req_valid	= ceph_fsreq_is_req_valid,
+	.done		= ceph_fsreq_done,
+	.get		= ceph_fsreq_get,
+	.put		= ceph_fsreq_put,
+};
+
 /*
  * We are only allowed to write into/dirty the page if the page is
  * clean, or already dirty within the same snap context.
@@ -1483,76 +1437,131 @@ static int ceph_write_begin(struct file *file, struct address_space *mapping,
 {
 	struct inode *inode = file_inode(file);
 	struct ceph_inode_info *ci = ceph_inode(inode);
-	struct ceph_snap_context *snapc;
+	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct fscache_cookie *cookie = ceph_fscache_cookie(ci);
 	struct page *page = NULL;
 	pgoff_t index = pos >> PAGE_SHIFT;
-	loff_t page_off = pos & PAGE_MASK;
 	int pos_in_page = pos & ~PAGE_MASK;
-	int end_in_page = pos_in_page + len;
-	loff_t i_size;
 	int r;
-refind:
-	/* get a page */
-	page = grab_cache_page_write_begin(mapping, index, 0);
-	if (!page)
-		return -ENOMEM;
 
-	dout("write_begin file %p inode %p page %p %d~%d\n", file,
-	     inode, page, (int)pos, (int)len);
+	if (ci->i_inline_version != CEPH_INLINE_NONE) {
+		/*
+		 * In principle, we should never get here, as the inode should have been uninlined
+		 * before we're allowed to write to the page (in write_iter or page_mkwrite).
+		 */
+		WARN_ONCE(1, "ceph: write_begin called on still-inlined inode!\n");
 
-	for (;;) {
-		snapc = ceph_find_incompatible(inode, page);
-		if (snapc) {
-			if (IS_ERR(snapc)) {
-				r = PTR_ERR(snapc);
-				break;
-			}
-			unlock_page(page);
-			ceph_queue_writeback(inode);
-			r = wait_event_killable(ci->i_cap_wq,
-						context_is_writeable_or_written(inode, snapc));
-			ceph_put_snap_context(snapc);
-			put_page(page);
-			goto refind;
+		/*
+		 * Uptodate inline data should have been added
+		 * into page cache while getting Fcr caps.
+		 */
+		if (index == 0) {
+			r = -EINVAL;
+			goto out;
 		}
 
-		if (PageUptodate(page)) {
-			dout(" page %p already uptodate\n", page);
-			break;
+		page = grab_cache_page_write_begin(mapping, index, 0);
+		if (!page)
+			return -ENOMEM;
+
+		zero_user_segment(page, 0, PAGE_SIZE);
+		SetPageUptodate(page);
+		r = 0;
+		goto out;
+	}
+
+	do {
+		struct ceph_fscache_req *req;
+		struct ceph_snap_context *snapc = NULL;
+
+		page = pagecache_get_page(mapping, index, FGP_WRITE, 0);
+		if (page) {
+			r = 0;
+			if (PageUptodate(page)) {
+				lock_page(page);
+				if (PageUptodate(page))
+					goto out;
+				unlock_page(page);
+			}
 		}
 
-		/* full page? */
-		if (pos_in_page == 0 && len == PAGE_SIZE)
-			break;
+		/*
+		 * In some cases we don't need to read at all:
+		 * - full page write
+		 * - write that lies completely beyond EOF
+		 * - write that covers the page from start to EOF or beyond it
+		 */
+		if ((pos_in_page == 0 && len == PAGE_SIZE) ||
+		    (pos >= i_size_read(inode)) ||
+		    (pos_in_page == 0 && (pos + len) >= i_size_read(inode))) {
+			if (!page) {
+				page = grab_cache_page_write_begin(mapping, index, 0);
+				if (!page) {
+					r = -ENOMEM;
+					break;
+				}
+			} else {
+				lock_page(page);
+			}
+
+			snapc = ceph_find_incompatible(inode, page);
+			if (!snapc) {
+				zero_user_segments(page, 0, pos_in_page,
+							 pos_in_page + len, PAGE_SIZE);
+				r = 0;
+				goto out;
+			}
+
+			unlock_page(page);
 
-		/* past end of file? */
-		i_size = i_size_read(inode);
-		if (page_off >= i_size ||
-		    (pos_in_page == 0 && (pos+len) >= i_size &&
-		     end_in_page - pos_in_page != PAGE_SIZE)) {
-			dout(" zeroing %p 0 - %d and %d - %d\n",
-			     page, pos_in_page, end_in_page, (int)PAGE_SIZE);
-			zero_user_segments(page,
-					   0, pos_in_page,
-					   end_in_page, PAGE_SIZE);
+			if (IS_ERR(snapc)) {
+				r = PTR_ERR(snapc);
+				goto out;
+			}
+			goto flush_incompat;
+		}
+
+		req = ceph_fsreq_alloc();
+		if (!req) {
+			unlock_page(page);
+			r = -ENOMEM;
 			break;
 		}
 
-		/* we need to read it. */
-		r = ceph_do_readpage(file, page);
-		if (r) {
-			if (r == -EINPROGRESS)
-				continue;
+		/*
+		 * Do the read. If we find out that we need to wait on writeback, then kick that
+		 * off, wait for it and then resubmit the read.
+		 */
+		fscache_init_io_request(&req->fscache_req, cookie, &ceph_read_for_write_fsreq_ops);
+		req->fscache_req.mapping = inode->i_mapping;
+
+		r = fscache_read_helper_for_write(&req->fscache_req, &page, index,
+						  fsc->mount_options->rsize >> PAGE_SHIFT, 0);
+		if (r != -EAGAIN) {
+			if (r == 0)
+				r = wait_on_bit(&req->fscache_req.flags,
+						FSCACHE_IO_READ_IN_PROGRESS, TASK_KILLABLE);
+			ceph_fsreq_put(&req->fscache_req);
 			break;
 		}
-	}
 
+		BUG_ON(!req->snapc);
+		snapc = ceph_get_snap_context(req->snapc);
+		ceph_fsreq_put(&req->fscache_req);
+flush_incompat:
+		put_page(page);
+		page = NULL;
+		ceph_queue_writeback(inode);
+		r = wait_event_killable(ci->i_cap_wq,
+					context_is_writeable_or_written(inode, snapc));
+		ceph_put_snap_context(snapc);
+	} while (r == 0);
+out:
 	if (r < 0) {
-		if (page) {
-			unlock_page(page);
+		if (page)
 			put_page(page);
-		}
 	} else {
+		WARN_ON_ONCE(!PageLocked(page));
 		*pagep = page;
 	}
 	return r;
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH 09/11] ceph: convert readpages to fscache_read_helper
  2020-07-22 10:55 [RFC PATCH 00/11] ceph: convert to new fscache API Jeff Layton
                   ` (7 preceding siblings ...)
  2020-07-22 10:55 ` [RFC PATCH 08/11] ceph: plug write_begin into " Jeff Layton
@ 2020-07-22 10:55 ` Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 10/11] ceph: add fscache writeback support Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 11/11] ceph: re-enable fscache support Jeff Layton
  10 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

Convert ceph_readpages to use the fscache_read_helper. With this we can
rip out a lot of the old readpage/readpages infrastructure.
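
The core of the conversion is just a loop that keeps feeding the remaining
pagelist to the read helper in rsize-limited chunks. A condensed sketch of
that loop (cap handling and the inline-data check are omitted, the fscache
names follow this series' proposed API, and the sketch_* function is
illustrative only):

static int sketch_readpages(struct inode *inode, struct list_head *page_list,
			    unsigned int max_pages)
{
	struct fscache_cookie *cookie = ceph_fscache_cookie(ceph_inode(inode));
	int ret = 0;

	/* Each pass submits one read covering up to max_pages contiguous pages. */
	while (ret >= 0 && !list_empty(page_list)) {
		struct ceph_fscache_req *req = ceph_fsreq_alloc();

		if (!req)
			return -ENOMEM;
		fscache_init_io_request(&req->fscache_req, cookie,
					&ceph_readpage_fsreq_ops);
		req->fscache_req.mapping = inode->i_mapping;
		ret = fscache_read_helper_page_list(&req->fscache_req,
						    page_list, max_pages);
		ceph_fsreq_put(&req->fscache_req);
	}
	return ret;
}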

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c | 209 +++++++------------------------------------------
 1 file changed, 28 insertions(+), 181 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index cc10bf17b65b..435374373c90 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -377,76 +377,23 @@ static int ceph_readpage(struct file *filp, struct page *page)
 	return err;
 }
 
-/*
- * Finish an async read(ahead) op.
- */
-static void finish_read(struct ceph_osd_request *req)
-{
-	struct inode *inode = req->r_inode;
-	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
-	struct ceph_osd_data *osd_data;
-	int rc = req->r_result <= 0 ? req->r_result : 0;
-	int bytes = req->r_result >= 0 ? req->r_result : 0;
-	int num_pages;
-	int i;
-
-	dout("finish_read %p req %p rc %d bytes %d\n", inode, req, rc, bytes);
-	if (rc == -EBLACKLISTED)
-		ceph_inode_to_client(inode)->blacklisted = true;
-
-	/* unlock all pages, zeroing any data we didn't read */
-	osd_data = osd_req_op_extent_osd_data(req, 0);
-	BUG_ON(osd_data->type != CEPH_OSD_DATA_TYPE_PAGES);
-	num_pages = calc_pages_for((u64)osd_data->alignment,
-					(u64)osd_data->length);
-	for (i = 0; i < num_pages; i++) {
-		struct page *page = osd_data->pages[i];
-
-		if (rc < 0 && rc != -ENOENT)
-			goto unlock;
-		if (bytes < (int)PAGE_SIZE) {
-			/* zero (remainder of) page */
-			int s = bytes < 0 ? 0 : bytes;
-			zero_user_segment(page, s, PAGE_SIZE);
-		}
- 		dout("finish_read %p uptodate %p idx %lu\n", inode, page,
-		     page->index);
-		flush_dcache_page(page);
-		SetPageUptodate(page);
-unlock:
-		unlock_page(page);
-		put_page(page);
-		bytes -= PAGE_SIZE;
-	}
-
-	ceph_update_read_latency(&fsc->mdsc->metric, req->r_start_latency,
-				 req->r_end_latency, rc);
-
-	kfree(osd_data->pages);
-}
-
-/*
- * start an async read(ahead) operation.  return nr_pages we submitted
- * a read for on success, or negative error code.
- */
-static int start_read(struct inode *inode, struct ceph_rw_context *rw_ctx,
-		      struct list_head *page_list, int max)
+static int ceph_readpages(struct file *file, struct address_space *mapping,
+			  struct list_head *page_list, unsigned nr_pages)
 {
-	struct ceph_osd_client *osdc =
-		&ceph_inode_to_client(inode)->client->osdc;
+	struct inode *inode = file_inode(file);
 	struct ceph_inode_info *ci = ceph_inode(inode);
-	struct page *page = lru_to_page(page_list);
-	struct ceph_vino vino;
-	struct ceph_osd_request *req;
-	u64 off;
-	u64 len;
-	int i;
-	struct page **pages;
-	pgoff_t next_index;
-	int nr_pages = 0;
+	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct ceph_file_info *fi = file->private_data;
+	struct ceph_rw_context *rw_ctx;
+	struct fscache_cookie *cookie = ceph_fscache_cookie(ci);
 	int got = 0;
 	int ret = 0;
+	int max = fsc->mount_options->rsize >> PAGE_SHIFT;
+
+	if (ceph_inode(inode)->i_inline_version != CEPH_INLINE_NONE)
+		return -EINVAL;
 
+	rw_ctx = ceph_find_rw_context(fi);
 	if (!rw_ctx) {
 		/* caller of readpages does not hold buffer and read caps
 		 * (fadvise, madvise and readahead cases) */
@@ -459,133 +406,33 @@ static int start_read(struct inode *inode, struct ceph_rw_context *rw_ctx,
 			dout("start_read %p, no cache cap\n", inode);
 			ret = 0;
 		}
-		if (ret <= 0) {
-			if (got)
-				ceph_put_cap_refs(ci, got);
-			while (!list_empty(page_list)) {
-				page = lru_to_page(page_list);
-				list_del(&page->lru);
-				put_page(page);
-			}
-			return ret;
-		}
+		if (ret <= 0)
+			goto out;
 	}
 
-	off = (u64) page_offset(page);
+	dout("readpages %p file %p ctx %p nr_pages %d max %d\n",
+	     inode, file, rw_ctx, nr_pages, max);
 
-	/* count pages */
-	next_index = page->index;
-	list_for_each_entry_reverse(page, page_list, lru) {
-		if (page->index != next_index)
-			break;
-		nr_pages++;
-		next_index++;
-		if (max && nr_pages == max)
-			break;
-	}
-	len = nr_pages << PAGE_SHIFT;
-	dout("start_read %p nr_pages %d is %lld~%lld\n", inode, nr_pages,
-	     off, len);
-	vino = ceph_vino(inode);
-	req = ceph_osdc_new_request(osdc, &ci->i_layout, vino, off, &len,
-				    0, 1, CEPH_OSD_OP_READ,
-				    CEPH_OSD_FLAG_READ, NULL,
-				    ci->i_truncate_seq, ci->i_truncate_size,
-				    false);
-	if (IS_ERR(req)) {
-		ret = PTR_ERR(req);
-		goto out;
-	}
+	while (ret >= 0 && !list_empty(page_list)) {
+		struct ceph_fscache_req *req = ceph_fsreq_alloc();
 
-	/* build page vector */
-	nr_pages = calc_pages_for(0, len);
-	pages = kmalloc_array(nr_pages, sizeof(*pages), GFP_KERNEL);
-	if (!pages) {
-		ret = -ENOMEM;
-		goto out_put;
-	}
-	for (i = 0; i < nr_pages; ++i) {
-		page = list_entry(page_list->prev, struct page, lru);
-		BUG_ON(PageLocked(page));
-		list_del(&page->lru);
-
- 		dout("start_read %p adding %p idx %lu\n", inode, page,
-		     page->index);
-		if (add_to_page_cache_lru(page, &inode->i_data, page->index,
-					  GFP_KERNEL)) {
-			put_page(page);
-			dout("start_read %p add_to_page_cache failed %p\n",
-			     inode, page);
-			nr_pages = i;
-			if (nr_pages > 0) {
-				len = nr_pages << PAGE_SHIFT;
-				osd_req_op_extent_update(req, 0, len);
-				break;
-			}
-			goto out_pages;
+		if (!req) {
+			ret = -ENOMEM;
+			break;
 		}
-		pages[i] = page;
-	}
-	osd_req_op_extent_osd_data_pages(req, 0, pages, len, 0, false, false);
-	req->r_callback = finish_read;
-	req->r_inode = inode;
-
-	dout("start_read %p starting %p %lld~%lld\n", inode, req, off, len);
-	ret = ceph_osdc_start_request(osdc, req, false);
-	if (ret < 0)
-		goto out_pages;
-	ceph_osdc_put_request(req);
-
-	/* After adding locked pages to page cache, the inode holds cache cap.
-	 * So we can drop our cap refs. */
-	if (got)
-		ceph_put_cap_refs(ci, got);
-
-	return nr_pages;
+		fscache_init_io_request(&req->fscache_req, cookie, &ceph_readpage_fsreq_ops);
+		req->fscache_req.mapping = inode->i_mapping;
 
-out_pages:
-	for (i = 0; i < nr_pages; ++i) {
-		unlock_page(pages[i]);
+		ret = fscache_read_helper_page_list(&req->fscache_req, page_list, max);
+		ceph_fsreq_put(&req->fscache_req);
 	}
-	ceph_put_page_vector(pages, nr_pages, false);
-out_put:
-	ceph_osdc_put_request(req);
 out:
+	/* After adding locked pages to page cache, the inode holds Fc refs. We can drop ours. */
 	if (got)
 		ceph_put_cap_refs(ci, got);
-	return ret;
-}
 
-
-/*
- * Read multiple pages.  Leave pages we don't read + unlock in page_list;
- * the caller (VM) cleans them up.
- */
-static int ceph_readpages(struct file *file, struct address_space *mapping,
-			  struct list_head *page_list, unsigned nr_pages)
-{
-	struct inode *inode = file_inode(file);
-	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
-	struct ceph_file_info *fi = file->private_data;
-	struct ceph_rw_context *rw_ctx;
-	int rc = 0;
-	int max = 0;
-
-	if (ceph_inode(inode)->i_inline_version != CEPH_INLINE_NONE)
-		return -EINVAL;
-
-	rw_ctx = ceph_find_rw_context(fi);
-	max = fsc->mount_options->rsize >> PAGE_SHIFT;
-	dout("readpages %p file %p ctx %p nr_pages %d max %d\n",
-	     inode, file, rw_ctx, nr_pages, max);
-	while (!list_empty(page_list)) {
-		rc = start_read(inode, rw_ctx, page_list, max);
-		if (rc < 0)
-			goto out;
-	}
-out:
-	dout("readpages %p file %p ret %d\n", inode, file, rc);
-	return rc;
+	dout("readpages %p file %p ret %d\n", inode, file, ret);
+	return ret;
 }
 
 struct ceph_writeback_ctl
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH 10/11] ceph: add fscache writeback support
  2020-07-22 10:55 [RFC PATCH 00/11] ceph: convert to new fscache API Jeff Layton
                   ` (8 preceding siblings ...)
  2020-07-22 10:55 ` [RFC PATCH 09/11] ceph: convert readpages to fscache_read_helper Jeff Layton
@ 2020-07-22 10:55 ` Jeff Layton
  2020-07-22 10:55 ` [RFC PATCH 11/11] ceph: re-enable fscache support Jeff Layton
  10 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

When updating the backing store from the pagecache (à la writepage or
writepages), write to the cache first. This allows us to keep caching
files even when they are open for write.
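
The per-page protocol mirrors PG_writeback: each page headed to the OSDs is
also flagged PG_fscache, a cache write is started for the covered range, and
the completion handler clears the flag again (invalidating the cache copy on
a hard error). A minimal sketch of the two sides, using the helpers this
patch adds (the sketch_* functions are illustrative only):

static void sketch_start_cached_writeback(struct inode *inode,
					  struct page *page, u64 off, u64 len)
{
	set_page_writeback(page);		/* OSD write in flight */
	if (TestSetPageFsCache(page))		/* cache write in flight */
		BUG();
	/* async: ceph_write_to_cache_done() clears PG_fscache for the range */
	ceph_write_to_cache(inode, off, len);
}

static void sketch_wait_before_reuse(struct page *page)
{
	/* anyone redirtying or rewriting the page waits for both bits */
	wait_on_page_writeback(page);
	ceph_wait_on_page_fscache(page);
}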

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c | 126 ++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 119 insertions(+), 7 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 435374373c90..b518c2f8309e 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -435,6 +435,98 @@ static int ceph_readpages(struct file *file, struct address_space *mapping,
 	return ret;
 }
 
+#ifdef CONFIG_CEPH_FSCACHE
+const struct fscache_io_request_ops ceph_write_fsreq_ops = {
+	.get		= ceph_fsreq_get,
+	.put		= ceph_fsreq_put,
+};
+
+/*
+ * Clear the PG_fscache flag from a sequence of pages and wake up anyone who's
+ * waiting.  The last page is included in the sequence. Poached from afs_clear_fscache_bits.
+ */
+static void ceph_clear_fscache_bits(struct address_space *mapping,
+				   pgoff_t start, pgoff_t last)
+{
+	struct page *page;
+
+	XA_STATE(xas, &mapping->i_pages, start);
+
+	rcu_read_lock();
+	xas_for_each(&xas, page, last) {
+		unlock_page_fscache(page);
+	}
+	rcu_read_unlock();
+}
+
+static void ceph_write_to_cache_done(struct fscache_io_request *fsreq)
+{
+	pgoff_t start = fsreq->pos >> PAGE_SHIFT;
+	pgoff_t last = start + fsreq->nr_pages - 1;
+
+	ceph_clear_fscache_bits(fsreq->mapping, start, last);
+	if (fsreq->error && fsreq->error != -ENOBUFS)
+		ceph_fscache_invalidate(fsreq->mapping->host, 0);
+}
+
+static void ceph_write_to_cache(struct inode *inode, u64 off, u64 len)
+{
+	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
+	struct iov_iter iter;
+	struct ceph_fscache_req *req;
+	struct fscache_cookie *cookie = ceph_inode(inode)->fscache;
+	pgoff_t start = off >> PAGE_SHIFT;
+	struct fscache_request_shape shape = {
+		.proposed_start		= start,
+		.proposed_nr_pages	= calc_pages_for(off, len),
+		.max_io_pages		= fsc->mount_options->wsize >> PAGE_SHIFT,
+		.i_size			= i_size_read(inode),
+		.for_write		= true,
+	};
+	pgoff_t last = start + shape.proposed_nr_pages - 1;
+
+	/* Don't do anything if cache is disabled */
+	if (!fscache_cookie_enabled(cookie))
+		goto abandon;
+
+	fscache_shape_request(cookie, &shape);
+	if (!(shape.to_be_done & FSCACHE_WRITE_TO_CACHE) ||
+	    shape.actual_nr_pages == 0 ||
+	    shape.actual_start != shape.proposed_start)
+		goto abandon;
+
+	if (shape.actual_nr_pages < shape.proposed_nr_pages) {
+		ceph_clear_fscache_bits(inode->i_mapping, start + shape.actual_nr_pages,
+					start + shape.proposed_nr_pages - 1);
+		last = shape.proposed_start + shape.actual_nr_pages - 1;
+		len = (u64)(last + 1 - start) << PAGE_SHIFT;
+	}
+
+	req = ceph_fsreq_alloc();
+	if (!req)
+		goto abandon;
+
+	fscache_init_io_request(&req->fscache_req, cookie, &ceph_write_fsreq_ops);
+	req->fscache_req.pos = round_down(off, shape.dio_block_size);
+	req->fscache_req.len = round_up(len, shape.dio_block_size);
+	req->fscache_req.nr_pages = shape.actual_nr_pages;
+	req->fscache_req.mapping = inode->i_mapping;
+	req->fscache_req.io_done = ceph_write_to_cache_done;
+
+	iov_iter_mapping(&iter, WRITE, inode->i_mapping, req->fscache_req.pos,
+			 req->fscache_req.len);
+	fscache_write(&req->fscache_req, &iter);
+	ceph_fsreq_put(&req->fscache_req);
+	return;
+abandon:
+	ceph_clear_fscache_bits(inode->i_mapping, start, last);
+}
+#else
+static inline void ceph_write_to_cache(struct inode *inode, u64 off, u64 len)
+{
+}
+#endif /* CONFIG_CEPH_FSCACHE */
+
 struct ceph_writeback_ctl
 {
 	loff_t i_size;
@@ -588,16 +680,17 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
 	    CONGESTION_ON_THRESH(fsc->mount_options->congestion_kb))
 		set_bdi_congested(inode_to_bdi(inode), BLK_RW_ASYNC);
 
-	set_page_writeback(page);
 	req = ceph_osdc_new_request(osdc, &ci->i_layout, ceph_vino(inode), page_off, &len, 0, 1,
 				    CEPH_OSD_OP_WRITE, CEPH_OSD_FLAG_WRITE, snapc,
 				    ceph_wbc.truncate_seq, ceph_wbc.truncate_size,
 				    true);
-	if (IS_ERR(req)) {
-		redirty_page_for_writepage(wbc, page);
-		end_page_writeback(page);
+	if (IS_ERR(req))
 		return PTR_ERR(req);
-	}
+
+	set_page_writeback(page);
+	if (TestSetPageFsCache(page))
+		BUG();
+	ceph_write_to_cache(inode, page_off, len);
 
 	/* it may be a short write due to an object boundary */
 	WARN_ON_ONCE(len > PAGE_SIZE);
@@ -656,6 +749,9 @@ static int ceph_writepage(struct page *page, struct writeback_control *wbc)
 	struct inode *inode = page->mapping->host;
 	BUG_ON(!inode);
 	ihold(inode);
+
+	ceph_wait_on_page_fscache(page);
+
 	err = writepage_nounlock(page, wbc);
 	if (err == -ERESTARTSYS) {
 		/* direct memory reclaimer was killed by SIGKILL. return 0
@@ -902,7 +998,7 @@ static int ceph_writepages_start(struct address_space *mapping,
 				unlock_page(page);
 				break;
 			}
-			if (PageWriteback(page)) {
+			if (PageWriteback(page) || PageFsCache(page)) {
 				if (wbc->sync_mode == WB_SYNC_NONE) {
 					dout("%p under writeback\n", page);
 					unlock_page(page);
@@ -910,6 +1006,7 @@ static int ceph_writepages_start(struct address_space *mapping,
 				}
 				dout("waiting on writeback %p\n", page);
 				wait_on_page_writeback(page);
+				ceph_wait_on_page_fscache(page);
 			}
 
 			if (!clear_page_dirty_for_io(page)) {
@@ -1042,9 +1139,19 @@ static int ceph_writepages_start(struct address_space *mapping,
 		op_idx = 0;
 		for (i = 0; i < locked_pages; i++) {
 			u64 cur_offset = page_offset(pages[i]);
+			/*
+			 * Discontinuity in page range? Ceph can handle that by just passing
+			 * multiple extents in the write op.
+			 */
 			if (offset + len != cur_offset) {
+				/* If it's full, stop here */
 				if (op_idx + 1 == req->r_num_ops)
 					break;
+
+				/* Kick off an fscache write with what we have so far. */
+				ceph_write_to_cache(inode, offset, len);
+
+				/* Start a new extent */
 				osd_req_op_extent_dup_last(req, op_idx,
 							   cur_offset - offset);
 				dout("writepages got pages at %llu~%llu\n",
@@ -1061,8 +1168,11 @@ static int ceph_writepages_start(struct address_space *mapping,
 			}
 
 			set_page_writeback(pages[i]);
+			if (TestSetPageFsCache(pages[i]))
+				BUG();
 			len += PAGE_SIZE;
 		}
+		ceph_write_to_cache(inode, offset, len);
 
 		if (ceph_wbc.size_stable) {
 			len = min(len, ceph_wbc.i_size - offset);
@@ -1385,9 +1495,11 @@ static int ceph_write_begin(struct file *file, struct address_space *mapping,
 		r = fscache_read_helper_for_write(&req->fscache_req, &page, index,
 						  fsc->mount_options->rsize >> PAGE_SHIFT, 0);
 		if (r != -EAGAIN) {
-			if (r == 0)
+			if (r == 0) {
 				r = wait_on_bit(&req->fscache_req.flags,
 						FSCACHE_IO_READ_IN_PROGRESS, TASK_KILLABLE);
+				ceph_wait_on_page_fscache(page);
+			}
 			ceph_fsreq_put(&req->fscache_req);
 			break;
 		}
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC PATCH 11/11] ceph: re-enable fscache support
  2020-07-22 10:55 [RFC PATCH 00/11] ceph: convert to new fscache API Jeff Layton
                   ` (9 preceding siblings ...)
  2020-07-22 10:55 ` [RFC PATCH 10/11] ceph: add fscache writeback support Jeff Layton
@ 2020-07-22 10:55 ` Jeff Layton
  10 siblings, 0 replies; 12+ messages in thread
From: Jeff Layton @ 2020-07-22 10:55 UTC (permalink / raw)
  To: ceph-devel; +Cc: dhowells, dwysocha, smfrench

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/Kconfig b/fs/ceph/Kconfig
index 734d08b57ef9..bc5656966aca 100644
--- a/fs/ceph/Kconfig
+++ b/fs/ceph/Kconfig
@@ -21,7 +21,7 @@ config CEPH_FS
 if CEPH_FS
 config CEPH_FSCACHE
 	bool "Enable Ceph client caching support"
-	depends on CEPH_FS=m && FSCACHE_OLD || CEPH_FS=y && FSCACHE_OLD=y
+	depends on CEPH_FS=m && FSCACHE || CEPH_FS=y && FSCACHE=y
 	help
 	  Choose Y here to enable persistent, read-only local
 	  caching support for Ceph clients using FS-Cache
-- 
2.26.2

^ permalink raw reply related	[flat|nested] 12+ messages in thread
