linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/10] AFS fixes
@ 2020-10-27 13:49 David Howells
  2020-10-27 13:50 ` [PATCH 01/10] afs: Fix copy_file_range() David Howells
                   ` (9 more replies)
  0 siblings, 10 replies; 12+ messages in thread
From: David Howells @ 2020-10-27 13:49 UTC (permalink / raw)
  To: linux-afs
  Cc: Dan Carpenter, Christoph Hellwig, Matthew Wilcox (Oracle),
	Colin Ian King, Nick Piggin, dhowells, linux-fsdevel,
	linux-kernel


Here's a set of fixes for AFS:

 (1) Fix copy_file_range() to an afs file now returning EINVAL if the
     splice_write file op isn't supplied.

 (2) Fix a deref-before-check in afs_unuse_cell().

 (3) Fix a use-after-free in afs_xattr_get_acl().

 (4) Fix afs to not try to clear PG_writeback when laundering a page.

 (5) Fix afs to take a ref on a page that it sets PG_private on and to drop
     that ref when clearing PG_private.

 (6) Fix a page leak if write_begin() fails.

 (7) Fix afs_write_begin() to not alter the dirty region info stored in
     page->private, but rather do this in afs_write_end() instead when we
     know what we actually changed.

 (8) Fix afs_invalidatepage() to alter the dirty region info on a page when
     partial page invalidation occurs so that we don't inadvertantly
     include a span of zeros that will get written back if a page gets
     laundered due to a remote 3rd-party induced invalidation.

     We mustn't, however, reduce the dirty region if the page has been seen
     to be mapped (ie. we got called through the page_mkwrite vector) as
     the page might still be mapped and we might lose data if the file is
     extended again.

To aid (8), two additional patches are included:

 (*) Wrap the manipulations of the dirty region info stored in
     page->private into helper functions.

 (*) Alter the encoding of the dirty region so that the region bounds can
     be stored with one fewer bit, making a bit available for the
     indication of mappedness.

The patches can be found here:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=afs-fixes

David
---
Dan Carpenter (1):
      afs: Fix a use after free in afs_xattr_get_acl()

David Howells (9):
      afs: Fix copy_file_range()
      afs: Fix tracing deref-before-check
      afs: Fix afs_launder_page to not clear PG_writeback
      afs: Fix to take ref on page when PG_private is set
      afs: Fix page leak on afs_write_begin() failure
      afs: Fix where page->private is set during write
      afs: Wrap page->private manipulations in inline functions
      afs: Alter dirty range encoding in page->private
      afs: Fix afs_invalidatepage to adjust the dirty region


 fs/afs/cell.c              |  3 +-
 fs/afs/dir.c               |  3 ++
 fs/afs/dir_edit.c          |  1 +
 fs/afs/file.c              | 74 ++++++++++++++++++++++++-----
 fs/afs/internal.h          | 41 ++++++++++++++++
 fs/afs/write.c             | 95 ++++++++++++++++++++++----------------
 fs/afs/xattr.c             |  2 +-
 include/linux/page-flags.h |  1 +
 include/trace/events/afs.h | 20 ++------
 9 files changed, 171 insertions(+), 69 deletions(-)



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 01/10] afs: Fix copy_file_range()
  2020-10-27 13:49 [PATCH 00/10] AFS fixes David Howells
@ 2020-10-27 13:50 ` David Howells
  2020-10-27 16:02   ` Christoph Hellwig
  2020-10-27 13:50 ` [PATCH 02/10] afs: Fix tracing deref-before-check David Howells
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 12+ messages in thread
From: David Howells @ 2020-10-27 13:50 UTC (permalink / raw)
  To: linux-afs; +Cc: Christoph Hellwig, dhowells, linux-fsdevel, linux-kernel

The prevention of splice-write without explicit ops made the
copy_file_write() syscall to an afs file (as done by the generic/112
xfstest) fail with EINVAL.

Fix by using iter_file_splice_write() for afs.

Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
---

 fs/afs/file.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index 371d1488cc54..91225421ad37 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -33,6 +33,7 @@ const struct file_operations afs_file_operations = {
 	.write_iter	= afs_file_write,
 	.mmap		= afs_file_mmap,
 	.splice_read	= generic_file_splice_read,
+	.splice_write	= iter_file_splice_write,
 	.fsync		= afs_fsync,
 	.lock		= afs_lock,
 	.flock		= afs_flock,



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 02/10] afs: Fix tracing deref-before-check
  2020-10-27 13:49 [PATCH 00/10] AFS fixes David Howells
  2020-10-27 13:50 ` [PATCH 01/10] afs: Fix copy_file_range() David Howells
@ 2020-10-27 13:50 ` David Howells
  2020-10-27 13:50 ` [PATCH 03/10] afs: Fix a use after free in afs_xattr_get_acl() David Howells
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2020-10-27 13:50 UTC (permalink / raw)
  To: linux-afs
  Cc: Dan Carpenter, Dan Carpenter, dhowells, linux-fsdevel, linux-kernel

The patch dca54a7bbb8c: "afs: Add tracing for cell refcount and active user
count" from Oct 13, 2020, leads to the following Smatch complaint:

    fs/afs/cell.c:596 afs_unuse_cell()
    warn: variable dereferenced before check 'cell' (see line 592)

Fix this by moving the retrieval of the cell debug ID to after the check of
the validity of the cell pointer.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: dca54a7bbb8c ("afs: Add tracing for cell refcount and active user count")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Dan Carpenter <dan.carpenter@oracle.com>
---

 fs/afs/cell.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/afs/cell.c b/fs/afs/cell.c
index 52233fa6195f..887b673f6223 100644
--- a/fs/afs/cell.c
+++ b/fs/afs/cell.c
@@ -589,7 +589,7 @@ struct afs_cell *afs_use_cell(struct afs_cell *cell, enum afs_cell_trace reason)
  */
 void afs_unuse_cell(struct afs_net *net, struct afs_cell *cell, enum afs_cell_trace reason)
 {
-	unsigned int debug_id = cell->debug_id;
+	unsigned int debug_id;
 	time64_t now, expire_delay;
 	int u, a;
 
@@ -604,6 +604,7 @@ void afs_unuse_cell(struct afs_net *net, struct afs_cell *cell, enum afs_cell_tr
 	if (cell->vl_servers->nr_servers)
 		expire_delay = afs_cell_gc_delay;
 
+	debug_id = cell->debug_id;
 	u = atomic_read(&cell->ref);
 	a = atomic_dec_return(&cell->active);
 	trace_afs_cell(debug_id, u, a, reason);



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 03/10] afs: Fix a use after free in afs_xattr_get_acl()
  2020-10-27 13:49 [PATCH 00/10] AFS fixes David Howells
  2020-10-27 13:50 ` [PATCH 01/10] afs: Fix copy_file_range() David Howells
  2020-10-27 13:50 ` [PATCH 02/10] afs: Fix tracing deref-before-check David Howells
@ 2020-10-27 13:50 ` David Howells
  2020-10-27 13:50 ` [PATCH 04/10] afs: Fix afs_launder_page to not clear PG_writeback David Howells
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2020-10-27 13:50 UTC (permalink / raw)
  To: linux-afs
  Cc: Dan Carpenter, Colin Ian King, dhowells, linux-fsdevel, linux-kernel

From: Dan Carpenter <dan.carpenter@oracle.com>

The "op" pointer is freed earlier when we call afs_put_operation().

Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Colin Ian King <colin.king@canonical.com>
---

 fs/afs/xattr.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/afs/xattr.c b/fs/afs/xattr.c
index 84f3c4f57531..38884d6c57cd 100644
--- a/fs/afs/xattr.c
+++ b/fs/afs/xattr.c
@@ -85,7 +85,7 @@ static int afs_xattr_get_acl(const struct xattr_handler *handler,
 			if (acl->size <= size)
 				memcpy(buffer, acl->data, acl->size);
 			else
-				op->error = -ERANGE;
+				ret = -ERANGE;
 		}
 	}
 



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 04/10] afs: Fix afs_launder_page to not clear PG_writeback
  2020-10-27 13:49 [PATCH 00/10] AFS fixes David Howells
                   ` (2 preceding siblings ...)
  2020-10-27 13:50 ` [PATCH 03/10] afs: Fix a use after free in afs_xattr_get_acl() David Howells
@ 2020-10-27 13:50 ` David Howells
  2020-10-27 13:50 ` [PATCH 05/10] afs: Fix to take ref on page when PG_private is set David Howells
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2020-10-27 13:50 UTC (permalink / raw)
  To: linux-afs; +Cc: dhowells, linux-fsdevel, linux-kernel

Fix afs_launder_page() to not clear PG_writeback on the page it is
laundering as the flag isn't set in this case.

Fixes: 4343d00872e1 ("afs: Get rid of the afs_writeback record")
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/internal.h |    1 +
 fs/afs/write.c    |   10 ++++++----
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 81b0485fd22a..289f5dffa46f 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -812,6 +812,7 @@ struct afs_operation {
 			pgoff_t		last;		/* last page in mapping to deal with */
 			unsigned	first_offset;	/* offset into mapping[first] */
 			unsigned	last_to;	/* amount of mapping[last] */
+			bool		laundering;	/* Laundering page, PG_writeback not set */
 		} store;
 		struct {
 			struct iattr	*attr;
diff --git a/fs/afs/write.c b/fs/afs/write.c
index da12abd6db21..b937ec047ec9 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -396,7 +396,8 @@ static void afs_store_data_success(struct afs_operation *op)
 	op->ctime = op->file[0].scb.status.mtime_client;
 	afs_vnode_commit_status(op, &op->file[0]);
 	if (op->error == 0) {
-		afs_pages_written_back(vnode, op->store.first, op->store.last);
+		if (!op->store.laundering)
+			afs_pages_written_back(vnode, op->store.first, op->store.last);
 		afs_stat_v(vnode, n_stores);
 		atomic_long_add((op->store.last * PAGE_SIZE + op->store.last_to) -
 				(op->store.first * PAGE_SIZE + op->store.first_offset),
@@ -415,7 +416,7 @@ static const struct afs_operation_ops afs_store_data_operation = {
  */
 static int afs_store_data(struct address_space *mapping,
 			  pgoff_t first, pgoff_t last,
-			  unsigned offset, unsigned to)
+			  unsigned offset, unsigned to, bool laundering)
 {
 	struct afs_vnode *vnode = AFS_FS_I(mapping->host);
 	struct afs_operation *op;
@@ -448,6 +449,7 @@ static int afs_store_data(struct address_space *mapping,
 	op->store.last = last;
 	op->store.first_offset = offset;
 	op->store.last_to = to;
+	op->store.laundering = laundering;
 	op->mtime = vnode->vfs_inode.i_mtime;
 	op->flags |= AFS_OPERATION_UNINTR;
 	op->ops = &afs_store_data_operation;
@@ -601,7 +603,7 @@ static int afs_write_back_from_locked_page(struct address_space *mapping,
 	if (end > i_size)
 		to = i_size & ~PAGE_MASK;
 
-	ret = afs_store_data(mapping, first, last, offset, to);
+	ret = afs_store_data(mapping, first, last, offset, to, false);
 	switch (ret) {
 	case 0:
 		ret = count;
@@ -921,7 +923,7 @@ int afs_launder_page(struct page *page)
 
 		trace_afs_page_dirty(vnode, tracepoint_string("launder"),
 				     page->index, priv);
-		ret = afs_store_data(mapping, page->index, page->index, t, f);
+		ret = afs_store_data(mapping, page->index, page->index, t, f, true);
 	}
 
 	trace_afs_page_dirty(vnode, tracepoint_string("laundered"),



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 05/10] afs: Fix to take ref on page when PG_private is set
  2020-10-27 13:49 [PATCH 00/10] AFS fixes David Howells
                   ` (3 preceding siblings ...)
  2020-10-27 13:50 ` [PATCH 04/10] afs: Fix afs_launder_page to not clear PG_writeback David Howells
@ 2020-10-27 13:50 ` David Howells
  2020-10-27 13:50 ` [PATCH 06/10] afs: Fix page leak on afs_write_begin() failure David Howells
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2020-10-27 13:50 UTC (permalink / raw)
  To: linux-afs; +Cc: Matthew Wilcox (Oracle), dhowells, linux-fsdevel, linux-kernel

Fix afs to take a ref on a page when it sets PG_private on it and to drop
the ref when removing the flag.

Note that in afs_write_begin(), a lot of the time, PG_private is already
set on a page to which we're going to add some data.  In such a case, we
leave the bit set and mustn't increment the page count.  To this end, make
TestSetPagePrivate() available.

Fixes: 31143d5d515e ("AFS: implement basic file write support")
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/dir.c               |    3 +++
 fs/afs/dir_edit.c          |    1 +
 fs/afs/file.c              |    2 ++
 fs/afs/write.c             |    9 +++++++--
 include/linux/page-flags.h |    1 +
 5 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index 1d2e61e0ab04..064eb66c33e9 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -283,6 +283,7 @@ static struct afs_read *afs_read_dir(struct afs_vnode *dvnode, struct key *key)
 
 			set_page_private(req->pages[i], 1);
 			SetPagePrivate(req->pages[i]);
+			get_page(req->pages[i]);
 			unlock_page(req->pages[i]);
 			i++;
 		} else {
@@ -1977,6 +1978,7 @@ static int afs_dir_releasepage(struct page *page, gfp_t gfp_flags)
 
 	set_page_private(page, 0);
 	ClearPagePrivate(page);
+	put_page(page);
 
 	/* The directory will need reloading. */
 	if (test_and_clear_bit(AFS_VNODE_DIR_VALID, &dvnode->flags))
@@ -2006,5 +2008,6 @@ static void afs_dir_invalidatepage(struct page *page, unsigned int offset,
 	if (offset == 0 && length == PAGE_SIZE) {
 		set_page_private(page, 0);
 		ClearPagePrivate(page);
+		put_page(page);
 	}
 }
diff --git a/fs/afs/dir_edit.c b/fs/afs/dir_edit.c
index b108528bf010..997f6798beee 100644
--- a/fs/afs/dir_edit.c
+++ b/fs/afs/dir_edit.c
@@ -246,6 +246,7 @@ void afs_edit_dir_add(struct afs_vnode *vnode,
 			if (!PagePrivate(page)) {
 				set_page_private(page, 1);
 				SetPagePrivate(page);
+				get_page(page);
 			}
 			dir_page = kmap(page);
 		}
diff --git a/fs/afs/file.c b/fs/afs/file.c
index 91225421ad37..7dafa2266048 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -632,6 +632,7 @@ static void afs_invalidatepage(struct page *page, unsigned int offset,
 					     page->index, priv);
 			set_page_private(page, 0);
 			ClearPagePrivate(page);
+			put_page(page);
 		}
 	}
 
@@ -666,6 +667,7 @@ static int afs_releasepage(struct page *page, gfp_t gfp_flags)
 				     page->index, priv);
 		set_page_private(page, 0);
 		ClearPagePrivate(page);
+		put_page(page);
 	}
 
 	/* indicate that the page can be released */
diff --git a/fs/afs/write.c b/fs/afs/write.c
index b937ec047ec9..29685947324e 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -151,7 +151,8 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 	priv |= f;
 	trace_afs_page_dirty(vnode, tracepoint_string("begin"),
 			     page->index, priv);
-	SetPagePrivate(page);
+	if (!TestSetPagePrivate(page))
+		get_page(page);
 	set_page_private(page, priv);
 	_leave(" = 0");
 	return 0;
@@ -338,6 +339,8 @@ static void afs_pages_written_back(struct afs_vnode *vnode,
 			trace_afs_page_dirty(vnode, tracepoint_string("clear"),
 					     pv.pages[loop]->index, priv);
 			set_page_private(pv.pages[loop], 0);
+			ClearPagePrivate(pv.pages[loop]);
+			put_page(pv.pages[loop]);
 			end_page_writeback(pv.pages[loop]);
 		}
 		first += count;
@@ -863,7 +866,8 @@ vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 	priv |= 0; /* From */
 	trace_afs_page_dirty(vnode, tracepoint_string("mkwrite"),
 			     vmf->page->index, priv);
-	SetPagePrivate(vmf->page);
+	if (!TestSetPagePrivate(vmf->page))
+		get_page(vmf->page);
 	set_page_private(vmf->page, priv);
 	file_update_time(file);
 
@@ -930,6 +934,7 @@ int afs_launder_page(struct page *page)
 			     page->index, priv);
 	set_page_private(page, 0);
 	ClearPagePrivate(page);
+	put_page(page);
 
 #ifdef CONFIG_AFS_FSCACHE
 	if (PageFsCache(page)) {
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 4f6ba9379112..37d65b55a6c6 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -365,6 +365,7 @@ PAGEFLAG(SwapBacked, swapbacked, PF_NO_TAIL)
  */
 PAGEFLAG(Private, private, PF_ANY) __SETPAGEFLAG(Private, private, PF_ANY)
 	__CLEARPAGEFLAG(Private, private, PF_ANY)
+	TESTSETFLAG(Private, private, PF_ANY)
 PAGEFLAG(Private2, private_2, PF_ANY) TESTSCFLAG(Private2, private_2, PF_ANY)
 PAGEFLAG(OwnerPriv1, owner_priv_1, PF_ANY)
 	TESTCLEARFLAG(OwnerPriv1, owner_priv_1, PF_ANY)



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 06/10] afs: Fix page leak on afs_write_begin() failure
  2020-10-27 13:49 [PATCH 00/10] AFS fixes David Howells
                   ` (4 preceding siblings ...)
  2020-10-27 13:50 ` [PATCH 05/10] afs: Fix to take ref on page when PG_private is set David Howells
@ 2020-10-27 13:50 ` David Howells
  2020-10-27 13:50 ` [PATCH 07/10] afs: Fix where page->private is set during write David Howells
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2020-10-27 13:50 UTC (permalink / raw)
  To: linux-afs; +Cc: Nick Piggin, dhowells, linux-fsdevel, linux-kernel

Fix the leak of the target page in afs_write_begin() when it fails.

Fixes: 15b4650e55e0 ("afs: convert to new aops")
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Nick Piggin <npiggin@gmain.com>
---

 fs/afs/write.c |   23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index 29685947324e..16a896096ccf 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -76,7 +76,7 @@ static int afs_fill_page(struct afs_vnode *vnode, struct key *key,
  */
 int afs_write_begin(struct file *file, struct address_space *mapping,
 		    loff_t pos, unsigned len, unsigned flags,
-		    struct page **pagep, void **fsdata)
+		    struct page **_page, void **fsdata)
 {
 	struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
 	struct page *page;
@@ -110,9 +110,6 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 		SetPageUptodate(page);
 	}
 
-	/* page won't leak in error case: it eventually gets cleaned off LRU */
-	*pagep = page;
-
 try_again:
 	/* See if this page is already partially written in a way that we can
 	 * merge the new write with.
@@ -154,6 +151,7 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 	if (!TestSetPagePrivate(page))
 		get_page(page);
 	set_page_private(page, priv);
+	*_page = page;
 	_leave(" = 0");
 	return 0;
 
@@ -163,17 +161,18 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 flush_conflicting_write:
 	_debug("flush conflict");
 	ret = write_one_page(page);
-	if (ret < 0) {
-		_leave(" = %d", ret);
-		return ret;
-	}
+	if (ret < 0)
+		goto error;
 
 	ret = lock_page_killable(page);
-	if (ret < 0) {
-		_leave(" = %d", ret);
-		return ret;
-	}
+	if (ret < 0)
+		goto error;
 	goto try_again;
+
+error:
+	put_page(page);
+	_leave(" = %d", ret);
+	return ret;
 }
 
 /*



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 07/10] afs: Fix where page->private is set during write
  2020-10-27 13:49 [PATCH 00/10] AFS fixes David Howells
                   ` (5 preceding siblings ...)
  2020-10-27 13:50 ` [PATCH 06/10] afs: Fix page leak on afs_write_begin() failure David Howells
@ 2020-10-27 13:50 ` David Howells
  2020-10-27 13:50 ` [PATCH 08/10] afs: Wrap page->private manipulations in inline functions David Howells
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2020-10-27 13:50 UTC (permalink / raw)
  To: linux-afs; +Cc: dhowells, linux-fsdevel, linux-kernel

In afs, page->private is set to indicate the dirty region of a page.  This
is done in afs_write_begin(), but that can't take account of whether the
copy into the page actually worked.

Fix this by moving the change of page->private into afs_write_end().

Fixes: 4343d00872e1 ("afs: Get rid of the afs_writeback record")
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/write.c |   41 +++++++++++++++++++++++++++--------------
 1 file changed, 27 insertions(+), 14 deletions(-)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index 16a896096ccf..5ed5df906744 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -135,22 +135,8 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 		if (!test_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags) &&
 		    (to < f || from > t))
 			goto flush_conflicting_write;
-		if (from < f)
-			f = from;
-		if (to > t)
-			t = to;
-	} else {
-		f = from;
-		t = to;
 	}
 
-	priv = (unsigned long)t << AFS_PRIV_SHIFT;
-	priv |= f;
-	trace_afs_page_dirty(vnode, tracepoint_string("begin"),
-			     page->index, priv);
-	if (!TestSetPagePrivate(page))
-		get_page(page);
-	set_page_private(page, priv);
 	*_page = page;
 	_leave(" = 0");
 	return 0;
@@ -184,6 +170,9 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 {
 	struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
 	struct key *key = afs_file_key(file);
+	unsigned long priv;
+	unsigned int f, from = pos & (PAGE_SIZE - 1);
+	unsigned int t, to = from + copied;
 	loff_t i_size, maybe_i_size;
 	int ret;
 
@@ -215,6 +204,30 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 		SetPageUptodate(page);
 	}
 
+	if (PagePrivate(page)) {
+		priv = page_private(page);
+		f = priv & AFS_PRIV_MAX;
+		t = priv >> AFS_PRIV_SHIFT;
+		if (from < f)
+			f = from;
+		if (to > t)
+			t = to;
+		priv = (unsigned long)t << AFS_PRIV_SHIFT;
+		priv |= f;
+		trace_afs_page_dirty(vnode, tracepoint_string("dirty+"),
+				     page->index, priv);
+	} else {
+		SetPagePrivate(page);
+		get_page(page);
+		f = from;
+		t = to;
+		priv = (unsigned long)t << AFS_PRIV_SHIFT;
+		priv |= f;
+		trace_afs_page_dirty(vnode, tracepoint_string("dirty"),
+				     page->index, priv);
+	}
+
+	set_page_private(page, priv);
 	set_page_dirty(page);
 	if (PageDirty(page))
 		_debug("dirtied");



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 08/10] afs: Wrap page->private manipulations in inline functions
  2020-10-27 13:49 [PATCH 00/10] AFS fixes David Howells
                   ` (6 preceding siblings ...)
  2020-10-27 13:50 ` [PATCH 07/10] afs: Fix where page->private is set during write David Howells
@ 2020-10-27 13:50 ` David Howells
  2020-10-27 13:51 ` [PATCH 09/10] afs: Alter dirty range encoding in page->private David Howells
  2020-10-27 13:51 ` [PATCH 10/10] afs: Fix afs_invalidatepage to adjust the dirty region David Howells
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2020-10-27 13:50 UTC (permalink / raw)
  To: linux-afs; +Cc: dhowells, linux-fsdevel, linux-kernel

The afs filesystem uses page->private to store the dirty range within a
page such that in the event of a conflicting 3rd-party write to the server,
we write back just the bits that got changed locally.

However, there are a couple of problems with this:

 (1) I need a bit to note if the page might be mapped so that partial
     invalidation doesn't shrink the range.

 (2) There aren't necessarily sufficient bits to store the entire range of
     data altered (say it's a 32-bit system with 64KiB pages or transparent
     huge pages are in use).

So wrap the accesses in inline functions so that future commits can change
how this works.

Also move them out of the tracing header into the in-directory header.
There's not really any need for them to be in the tracing header.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/internal.h          |   28 ++++++++++++++++++++++++++++
 fs/afs/write.c             |   31 +++++++++++++------------------
 include/trace/events/afs.h |   19 +++----------------
 3 files changed, 44 insertions(+), 34 deletions(-)

diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 289f5dffa46f..acefa6bf0a4d 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -858,6 +858,34 @@ struct afs_vnode_cache_aux {
 	u64			data_version;
 } __packed;
 
+/*
+ * We use page->private to hold the amount of the page that we've written to,
+ * splitting the field into two parts.  However, we need to represent a range
+ * 0...PAGE_SIZE inclusive, so we can't support 64K pages on a 32-bit system.
+ */
+#if PAGE_SIZE > 32768
+#define __AFS_PAGE_PRIV_MASK	0xffffffff
+#define __AFS_PAGE_PRIV_SHIFT	32
+#else
+#define __AFS_PAGE_PRIV_MASK	0xffff
+#define __AFS_PAGE_PRIV_SHIFT	16
+#endif
+
+static inline unsigned int afs_page_dirty_from(unsigned long priv)
+{
+	return priv & __AFS_PAGE_PRIV_MASK;
+}
+
+static inline unsigned int afs_page_dirty_to(unsigned long priv)
+{
+	return (priv >> __AFS_PAGE_PRIV_SHIFT) & __AFS_PAGE_PRIV_MASK;
+}
+
+static inline unsigned long afs_page_dirty(unsigned int from, unsigned int to)
+{
+	return ((unsigned long)to << __AFS_PAGE_PRIV_SHIFT) | from;
+}
+
 #include <trace/events/afs.h>
 
 /*****************************************************************************/
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 5ed5df906744..91bc2cb2cad1 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -117,8 +117,8 @@ int afs_write_begin(struct file *file, struct address_space *mapping,
 	t = f = 0;
 	if (PagePrivate(page)) {
 		priv = page_private(page);
-		f = priv & AFS_PRIV_MAX;
-		t = priv >> AFS_PRIV_SHIFT;
+		f = afs_page_dirty_from(priv);
+		t = afs_page_dirty_to(priv);
 		ASSERTCMP(f, <=, t);
 	}
 
@@ -206,23 +206,19 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 
 	if (PagePrivate(page)) {
 		priv = page_private(page);
-		f = priv & AFS_PRIV_MAX;
-		t = priv >> AFS_PRIV_SHIFT;
+		f = afs_page_dirty_from(priv);
+		t = afs_page_dirty_to(priv);
 		if (from < f)
 			f = from;
 		if (to > t)
 			t = to;
-		priv = (unsigned long)t << AFS_PRIV_SHIFT;
-		priv |= f;
+		priv = afs_page_dirty(f, t);
 		trace_afs_page_dirty(vnode, tracepoint_string("dirty+"),
 				     page->index, priv);
 	} else {
 		SetPagePrivate(page);
 		get_page(page);
-		f = from;
-		t = to;
-		priv = (unsigned long)t << AFS_PRIV_SHIFT;
-		priv |= f;
+		priv = afs_page_dirty(from, to);
 		trace_afs_page_dirty(vnode, tracepoint_string("dirty"),
 				     page->index, priv);
 	}
@@ -526,8 +522,8 @@ static int afs_write_back_from_locked_page(struct address_space *mapping,
 	 */
 	start = primary_page->index;
 	priv = page_private(primary_page);
-	offset = priv & AFS_PRIV_MAX;
-	to = priv >> AFS_PRIV_SHIFT;
+	offset = afs_page_dirty_from(priv);
+	to = afs_page_dirty_to(priv);
 	trace_afs_page_dirty(vnode, tracepoint_string("store"),
 			     primary_page->index, priv);
 
@@ -572,8 +568,8 @@ static int afs_write_back_from_locked_page(struct address_space *mapping,
 			}
 
 			priv = page_private(page);
-			f = priv & AFS_PRIV_MAX;
-			t = priv >> AFS_PRIV_SHIFT;
+			f = afs_page_dirty_from(priv);
+			t = afs_page_dirty_to(priv);
 			if (f != 0 &&
 			    !test_bit(AFS_VNODE_NEW_CONTENT, &vnode->flags)) {
 				unlock_page(page);
@@ -874,8 +870,7 @@ vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 	 */
 	wait_on_page_writeback(vmf->page);
 
-	priv = (unsigned long)PAGE_SIZE << AFS_PRIV_SHIFT; /* To */
-	priv |= 0; /* From */
+	priv = afs_page_dirty(0, PAGE_SIZE);
 	trace_afs_page_dirty(vnode, tracepoint_string("mkwrite"),
 			     vmf->page->index, priv);
 	if (!TestSetPagePrivate(vmf->page))
@@ -933,8 +928,8 @@ int afs_launder_page(struct page *page)
 		f = 0;
 		t = PAGE_SIZE;
 		if (PagePrivate(page)) {
-			f = priv & AFS_PRIV_MAX;
-			t = priv >> AFS_PRIV_SHIFT;
+			f = afs_page_dirty_from(priv);
+			t = afs_page_dirty_to(priv);
 		}
 
 		trace_afs_page_dirty(vnode, tracepoint_string("launder"),
diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h
index 8eb49231c6bb..e718ae17ad91 100644
--- a/include/trace/events/afs.h
+++ b/include/trace/events/afs.h
@@ -966,19 +966,6 @@ TRACE_EVENT(afs_dir_check_failed,
 		      __entry->vnode, __entry->off, __entry->i_size)
 	    );
 
-/*
- * We use page->private to hold the amount of the page that we've written to,
- * splitting the field into two parts.  However, we need to represent a range
- * 0...PAGE_SIZE inclusive, so we can't support 64K pages on a 32-bit system.
- */
-#if PAGE_SIZE > 32768
-#define AFS_PRIV_MAX	0xffffffff
-#define AFS_PRIV_SHIFT	32
-#else
-#define AFS_PRIV_MAX	0xffff
-#define AFS_PRIV_SHIFT	16
-#endif
-
 TRACE_EVENT(afs_page_dirty,
 	    TP_PROTO(struct afs_vnode *vnode, const char *where,
 		     pgoff_t page, unsigned long priv),
@@ -999,10 +986,10 @@ TRACE_EVENT(afs_page_dirty,
 		    __entry->priv = priv;
 			   ),
 
-	    TP_printk("vn=%p %lx %s %lu-%lu",
+	    TP_printk("vn=%p %lx %s %x-%x",
 		      __entry->vnode, __entry->page, __entry->where,
-		      __entry->priv & AFS_PRIV_MAX,
-		      __entry->priv >> AFS_PRIV_SHIFT)
+		      afs_page_dirty_from(__entry->priv),
+		      afs_page_dirty_to(__entry->priv))
 	    );
 
 TRACE_EVENT(afs_call_state,



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 09/10] afs: Alter dirty range encoding in page->private
  2020-10-27 13:49 [PATCH 00/10] AFS fixes David Howells
                   ` (7 preceding siblings ...)
  2020-10-27 13:50 ` [PATCH 08/10] afs: Wrap page->private manipulations in inline functions David Howells
@ 2020-10-27 13:51 ` David Howells
  2020-10-27 13:51 ` [PATCH 10/10] afs: Fix afs_invalidatepage to adjust the dirty region David Howells
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2020-10-27 13:51 UTC (permalink / raw)
  To: linux-afs; +Cc: dhowells, linux-fsdevel, linux-kernel

Currently, page->private on an afs page is used to store the range of
dirtied data within the page, where the range includes the lower bound, but
excludes the upper bound (e.g. 0-1 is a range covering a single byte).

This, however, requires a superfluous bit for the last-byte bound so that
on a 4KiB page, it can say 0-4096 to indicate the whole page, the idea
being that having both numbers the same would indicate an empty range.
This is unnecessary as the PG_private bit is clear if it's an empty range
(as is PG_dirty).

Alter the way the dirty range is encoded in page->private such that the
upper bound is reduced by 1 (e.g. 0-0 is then specified the same single
byte range mentioned above).

Applying this to both bounds frees up two bits, one of which can be used in
a future commit.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/internal.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index acefa6bf0a4d..0ff1088a7c87 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -878,12 +878,12 @@ static inline unsigned int afs_page_dirty_from(unsigned long priv)
 
 static inline unsigned int afs_page_dirty_to(unsigned long priv)
 {
-	return (priv >> __AFS_PAGE_PRIV_SHIFT) & __AFS_PAGE_PRIV_MASK;
+	return ((priv >> __AFS_PAGE_PRIV_SHIFT) & __AFS_PAGE_PRIV_MASK) + 1;
 }
 
 static inline unsigned long afs_page_dirty(unsigned int from, unsigned int to)
 {
-	return ((unsigned long)to << __AFS_PAGE_PRIV_SHIFT) | from;
+	return ((unsigned long)(to - 1) << __AFS_PAGE_PRIV_SHIFT) | from;
 }
 
 #include <trace/events/afs.h>



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 10/10] afs: Fix afs_invalidatepage to adjust the dirty region
  2020-10-27 13:49 [PATCH 00/10] AFS fixes David Howells
                   ` (8 preceding siblings ...)
  2020-10-27 13:51 ` [PATCH 09/10] afs: Alter dirty range encoding in page->private David Howells
@ 2020-10-27 13:51 ` David Howells
  9 siblings, 0 replies; 12+ messages in thread
From: David Howells @ 2020-10-27 13:51 UTC (permalink / raw)
  To: linux-afs; +Cc: dhowells, linux-fsdevel, linux-kernel

Fix afs_invalidatepage() to adjust the dirty region recorded in
page->private when truncating a page.  If the dirty region is entirely
removed, then the private data is cleared and the page dirty state is
cleared.

Without this, if the page is truncated and then expanded again by truncate,
zeros from the expanded, but no-longer dirty region may get written back to
the server if the page gets laundered due to a conflicting 3rd-party write.

It mustn't, however, shorten the dirty region of the page if that page is
still mmapped and has been marked dirty by afs_page_mkwrite(), so a flag is
stored in page->private to record this.

Fixes: 4343d00872e1 ("afs: Get rid of the afs_writeback record")
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/afs/file.c              |   74 +++++++++++++++++++++++++++++++++++++-------
 fs/afs/internal.h          |   16 ++++++++--
 fs/afs/write.c             |    1 +
 include/trace/events/afs.h |    5 ++-
 4 files changed, 80 insertions(+), 16 deletions(-)

diff --git a/fs/afs/file.c b/fs/afs/file.c
index 7dafa2266048..e3cec86cc6ef 100644
--- a/fs/afs/file.c
+++ b/fs/afs/file.c
@@ -601,6 +601,65 @@ static int afs_readpages(struct file *file, struct address_space *mapping,
 	return ret;
 }
 
+/*
+ * Adjust the dirty region of the page on truncation or full invalidation,
+ * getting rid of the markers altogether if the region is entirely invalidated.
+ */
+static void afs_invalidate_dirty(struct page *page, unsigned int offset,
+				 unsigned int length)
+{
+	struct afs_vnode *vnode = AFS_FS_I(page->mapping->host);
+	unsigned long priv;
+	unsigned int f, t, end = offset + length;
+
+	priv = page_private(page);
+
+	/* we clean up only if the entire page is being invalidated */
+	if (offset == 0 && length == thp_size(page))
+		goto full_invalidate;
+
+	 /* If the page was dirtied by page_mkwrite(), the PTE stays writable
+	  * and we don't get another notification to tell us to expand it
+	  * again.
+	  */
+	if (afs_is_page_dirty_mmapped(priv))
+		return;
+
+	/* We may need to shorten the dirty region */
+	f = afs_page_dirty_from(priv);
+	t = afs_page_dirty_to(priv);
+
+	if (t <= offset || f >= end)
+		return; /* Doesn't overlap */
+
+	if (f < offset && t > end)
+		return; /* Splits the dirty region - just absorb it */
+
+	if (f >= offset && t <= end)
+		goto undirty;
+
+	if (f < offset)
+		t = offset;
+	else
+		f = end;
+	if (f == t)
+		goto undirty;
+
+	priv = afs_page_dirty(f, t);
+	set_page_private(page, priv);
+	trace_afs_page_dirty(vnode, tracepoint_string("trunc"), page->index, priv);
+	return;
+
+undirty:
+	trace_afs_page_dirty(vnode, tracepoint_string("undirty"), page->index, priv);
+	clear_page_dirty_for_io(page);
+full_invalidate:
+	trace_afs_page_dirty(vnode, tracepoint_string("inval"), page->index, priv);
+	set_page_private(page, 0);
+	ClearPagePrivate(page);
+	put_page(page);
+}
+
 /*
  * invalidate part or all of a page
  * - release a page and clean up its private data if offset is 0 (indicating
@@ -609,9 +668,6 @@ static int afs_readpages(struct file *file, struct address_space *mapping,
 static void afs_invalidatepage(struct page *page, unsigned int offset,
 			       unsigned int length)
 {
-	struct afs_vnode *vnode = AFS_FS_I(page->mapping->host);
-	unsigned long priv;
-
 	_enter("{%lu},%u,%u", page->index, offset, length);
 
 	BUG_ON(!PageLocked(page));
@@ -625,17 +681,11 @@ static void afs_invalidatepage(struct page *page, unsigned int offset,
 			fscache_uncache_page(vnode->cache, page);
 		}
 #endif
-
-		if (PagePrivate(page)) {
-			priv = page_private(page);
-			trace_afs_page_dirty(vnode, tracepoint_string("inval"),
-					     page->index, priv);
-			set_page_private(page, 0);
-			ClearPagePrivate(page);
-			put_page(page);
-		}
 	}
 
+	if (PagePrivate(page))
+		afs_invalidate_dirty(page, offset, length);
+
 	_leave("");
 }
 
diff --git a/fs/afs/internal.h b/fs/afs/internal.h
index 0ff1088a7c87..bf8fb9863b0e 100644
--- a/fs/afs/internal.h
+++ b/fs/afs/internal.h
@@ -864,11 +864,13 @@ struct afs_vnode_cache_aux {
  * 0...PAGE_SIZE inclusive, so we can't support 64K pages on a 32-bit system.
  */
 #if PAGE_SIZE > 32768
-#define __AFS_PAGE_PRIV_MASK	0xffffffff
+#define __AFS_PAGE_PRIV_MASK	0x7fffffff
 #define __AFS_PAGE_PRIV_SHIFT	32
+#define __AFS_PRIV_MMAPPED	0x80000000
 #else
-#define __AFS_PAGE_PRIV_MASK	0xffff
+#define __AFS_PAGE_PRIV_MASK	0x7fff
 #define __AFS_PAGE_PRIV_SHIFT	16
+#define __AFS_PRIV_MMAPPED	0x8000
 #endif
 
 static inline unsigned int afs_page_dirty_from(unsigned long priv)
@@ -886,6 +888,16 @@ static inline unsigned long afs_page_dirty(unsigned int from, unsigned int to)
 	return ((unsigned long)(to - 1) << __AFS_PAGE_PRIV_SHIFT) | from;
 }
 
+static inline unsigned long afs_page_dirty_mmapped(unsigned long priv)
+{
+	return priv | __AFS_PRIV_MMAPPED;
+}
+
+static inline bool afs_is_page_dirty_mmapped(unsigned long priv)
+{
+	return priv & __AFS_PRIV_MMAPPED;
+}
+
 #include <trace/events/afs.h>
 
 /*****************************************************************************/
diff --git a/fs/afs/write.c b/fs/afs/write.c
index 91bc2cb2cad1..057d02fd4d02 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -871,6 +871,7 @@ vm_fault_t afs_page_mkwrite(struct vm_fault *vmf)
 	wait_on_page_writeback(vmf->page);
 
 	priv = afs_page_dirty(0, PAGE_SIZE);
+	priv = afs_page_dirty_mmapped(priv);
 	trace_afs_page_dirty(vnode, tracepoint_string("mkwrite"),
 			     vmf->page->index, priv);
 	if (!TestSetPagePrivate(vmf->page))
diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h
index e718ae17ad91..91d515cb3aed 100644
--- a/include/trace/events/afs.h
+++ b/include/trace/events/afs.h
@@ -986,10 +986,11 @@ TRACE_EVENT(afs_page_dirty,
 		    __entry->priv = priv;
 			   ),
 
-	    TP_printk("vn=%p %lx %s %x-%x",
+	    TP_printk("vn=%p %lx %s %x-%x%s",
 		      __entry->vnode, __entry->page, __entry->where,
 		      afs_page_dirty_from(__entry->priv),
-		      afs_page_dirty_to(__entry->priv))
+		      afs_page_dirty_to(__entry->priv),
+		      afs_is_page_dirty_mmapped(__entry->priv) ? " M" : "")
 	    );
 
 TRACE_EVENT(afs_call_state,



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 01/10] afs: Fix copy_file_range()
  2020-10-27 13:50 ` [PATCH 01/10] afs: Fix copy_file_range() David Howells
@ 2020-10-27 16:02   ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2020-10-27 16:02 UTC (permalink / raw)
  To: David Howells; +Cc: linux-afs, Christoph Hellwig, linux-fsdevel, linux-kernel

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-10-27 16:16 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-27 13:49 [PATCH 00/10] AFS fixes David Howells
2020-10-27 13:50 ` [PATCH 01/10] afs: Fix copy_file_range() David Howells
2020-10-27 16:02   ` Christoph Hellwig
2020-10-27 13:50 ` [PATCH 02/10] afs: Fix tracing deref-before-check David Howells
2020-10-27 13:50 ` [PATCH 03/10] afs: Fix a use after free in afs_xattr_get_acl() David Howells
2020-10-27 13:50 ` [PATCH 04/10] afs: Fix afs_launder_page to not clear PG_writeback David Howells
2020-10-27 13:50 ` [PATCH 05/10] afs: Fix to take ref on page when PG_private is set David Howells
2020-10-27 13:50 ` [PATCH 06/10] afs: Fix page leak on afs_write_begin() failure David Howells
2020-10-27 13:50 ` [PATCH 07/10] afs: Fix where page->private is set during write David Howells
2020-10-27 13:50 ` [PATCH 08/10] afs: Wrap page->private manipulations in inline functions David Howells
2020-10-27 13:51 ` [PATCH 09/10] afs: Alter dirty range encoding in page->private David Howells
2020-10-27 13:51 ` [PATCH 10/10] afs: Fix afs_invalidatepage to adjust the dirty region David Howells

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).