linux-btrfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount
@ 2018-04-27  9:21 Nikolay Borisov
  2018-04-27  9:21 ` [PATCH 1/5] btrfs: Unexport btrfs_alloc_delalloc_work Nikolay Borisov
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Nikolay Borisov @ 2018-04-27  9:21 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Nikolay Borisov

After investigating crashes on generic/176 it turned that the culprit in fact
is the random failure induced by generic/019. As it happens, if on unmount the 
filesystem is in BTRFS_FS_STATE_ERROR then btrfs_error_commit_super is called. 
This unveiled 2 bugs:
 1. btrfs_destroy_delalloc_inodes's implementation was completely bogus, since
 it only called btrfs_invalidate_inodes which only pruned dentries and didn't 
 do anything to free any inodes with pending delalloc bytes. Once this is fixed 
 with the use of invalide_inode_pages2 the second bug transpired. 
 2. The last call ot run_delayed_iputs is made before btrfs_cleanup_transaction
 is called. The latter in turn could queue up more delayed iputs resulting from 
 invalidates_inode_pages2. 

This series fixes the problem by first fixing btrfs_destroy_delalloc_inode to 
properly cleanup delalloc inodes and as a result cleans up the code a bit. 

I've given it a good bashing through xfstest (4 full xfstest cycles + 100 
iterations of generic/475 since it was hitting some early assertion failures,
which are fixed in the final version) so am pretty confident in the change. 

Nikolay Borisov (5):
  btrfs: Unexport btrfs_alloc_delalloc_work
  btrfs: Split btrfs_del_delalloc_inode into 2 functions
  btrfs: Add assert in __btrfs_del_delalloc_inode
  btrfs: Fix delalloc inodes invalidation during transaction abort
  btrfs: Unexport and rename btrfs_invalidate_inodes

 fs/btrfs/ctree.h   |  12 +----
 fs/btrfs/disk-io.c |  22 ++++----
 fs/btrfs/inode.c   | 152 ++++++++++++++++++++++++++++++-----------------------
 3 files changed, 99 insertions(+), 87 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/5] btrfs: Unexport btrfs_alloc_delalloc_work
  2018-04-27  9:21 [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount Nikolay Borisov
@ 2018-04-27  9:21 ` Nikolay Borisov
  2018-04-27  9:21 ` [PATCH 2/5] btrfs: Split btrfs_del_delalloc_inode into 2 functions Nikolay Borisov
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Nikolay Borisov @ 2018-04-27  9:21 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Nikolay Borisov

It's used only in inode.c so makes no sense to have it exported. Also
move the definition of btrfs_delalloc_work to inode.c since it's used
only this file.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/ctree.h | 9 ---------
 fs/btrfs/inode.c | 8 ++++++++
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 789c7da2721c..5b7080d2fbff 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3167,15 +3167,6 @@ void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode,
 				     struct extent_map *em);
 
 /* inode.c */
-struct btrfs_delalloc_work {
-	struct inode *inode;
-	struct completion completion;
-	struct list_head list;
-	struct btrfs_work work;
-};
-
-struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode);
-
 struct extent_map *btrfs_get_extent_fiemap(struct btrfs_inode *inode,
 		struct page *page, size_t pg_offset, u64 start,
 		u64 len, int create);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 9b8f2904ad55..078e02c21825 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -10145,6 +10145,13 @@ static int btrfs_rename2(struct inode *old_dir, struct dentry *old_dentry,
 	return btrfs_rename(old_dir, old_dentry, new_dir, new_dentry, flags);
 }
 
+struct btrfs_delalloc_work {
+	struct inode *inode;
+	struct completion completion;
+	struct list_head list;
+	struct btrfs_work work;
+};
+
 static void btrfs_run_delalloc_work(struct btrfs_work *work)
 {
 	struct btrfs_delalloc_work *delalloc_work;
@@ -10162,6 +10169,7 @@ static void btrfs_run_delalloc_work(struct btrfs_work *work)
 	complete(&delalloc_work->completion);
 }
 
+static
 struct btrfs_delalloc_work *btrfs_alloc_delalloc_work(struct inode *inode)
 {
 	struct btrfs_delalloc_work *work;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/5] btrfs: Split btrfs_del_delalloc_inode into 2 functions
  2018-04-27  9:21 [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount Nikolay Borisov
  2018-04-27  9:21 ` [PATCH 1/5] btrfs: Unexport btrfs_alloc_delalloc_work Nikolay Borisov
@ 2018-04-27  9:21 ` Nikolay Borisov
  2018-05-11  5:44   ` Anand Jain
  2018-04-27  9:21 ` [PATCH 3/5] btrfs: Add assert in __btrfs_del_delalloc_inode Nikolay Borisov
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Nikolay Borisov @ 2018-04-27  9:21 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Nikolay Borisov

This is in preparation of fixing delalloc inodes leakage on transaction
abort. Also export the new function.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/ctree.h |  2 ++
 fs/btrfs/inode.c | 15 ++++++++++++---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 5b7080d2fbff..27aa9b58b001 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3174,6 +3174,8 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len,
 			      u64 *orig_start, u64 *orig_block_len,
 			      u64 *ram_bytes);
 
+void __btrfs_del_delalloc_inode(struct btrfs_root *root,
+				struct btrfs_inode *inode);
 struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry);
 int btrfs_set_inode_index(struct btrfs_inode *dir, u64 *index);
 int btrfs_unlink_inode(struct btrfs_trans_handle *trans,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 078e02c21825..164950d96c8a 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1742,12 +1742,12 @@ static void btrfs_add_delalloc_inodes(struct btrfs_root *root,
 	spin_unlock(&root->delalloc_lock);
 }
 
-static void btrfs_del_delalloc_inode(struct btrfs_root *root,
-				     struct btrfs_inode *inode)
+
+void __btrfs_del_delalloc_inode(struct btrfs_root *root,
+				struct btrfs_inode *inode)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->vfs_inode.i_sb);
 
-	spin_lock(&root->delalloc_lock);
 	if (!list_empty(&inode->delalloc_inodes)) {
 		list_del_init(&inode->delalloc_inodes);
 		clear_bit(BTRFS_INODE_IN_DELALLOC_LIST,
@@ -1760,6 +1760,15 @@ static void btrfs_del_delalloc_inode(struct btrfs_root *root,
 			spin_unlock(&fs_info->delalloc_root_lock);
 		}
 	}
+}
+
+
+static void btrfs_del_delalloc_inode(struct btrfs_root *root,
+				     struct btrfs_inode *inode)
+{
+
+	spin_lock(&root->delalloc_lock);
+	__btrfs_del_delalloc_inode(root, inode);
 	spin_unlock(&root->delalloc_lock);
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 3/5] btrfs: Add assert in __btrfs_del_delalloc_inode
  2018-04-27  9:21 [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount Nikolay Borisov
  2018-04-27  9:21 ` [PATCH 1/5] btrfs: Unexport btrfs_alloc_delalloc_work Nikolay Borisov
  2018-04-27  9:21 ` [PATCH 2/5] btrfs: Split btrfs_del_delalloc_inode into 2 functions Nikolay Borisov
@ 2018-04-27  9:21 ` Nikolay Borisov
  2018-04-27  9:21 ` [PATCH 4/5] btrfs: Fix delalloc inodes invalidation during transaction abort Nikolay Borisov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Nikolay Borisov @ 2018-04-27  9:21 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Nikolay Borisov

The invariant is that when nr_delalloc_inodes is 0 then the root
mustn't have any inodes on its delalloc inodes list.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/inode.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 164950d96c8a..d35992b8bb5e 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1754,6 +1754,7 @@ void __btrfs_del_delalloc_inode(struct btrfs_root *root,
 			  &inode->runtime_flags);
 		root->nr_delalloc_inodes--;
 		if (!root->nr_delalloc_inodes) {
+			ASSERT(list_empty(&root->delalloc_inodes));
 			spin_lock(&fs_info->delalloc_root_lock);
 			BUG_ON(list_empty(&root->delalloc_root));
 			list_del_init(&root->delalloc_root);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 4/5] btrfs: Fix delalloc inodes invalidation during transaction abort
  2018-04-27  9:21 [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount Nikolay Borisov
                   ` (2 preceding siblings ...)
  2018-04-27  9:21 ` [PATCH 3/5] btrfs: Add assert in __btrfs_del_delalloc_inode Nikolay Borisov
@ 2018-04-27  9:21 ` Nikolay Borisov
  2018-05-10 15:35   ` David Sterba
  2018-04-27  9:21 ` [PATCH 5/5] btrfs: Unexport and rename btrfs_invalidate_inodes Nikolay Borisov
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 20+ messages in thread
From: Nikolay Borisov @ 2018-04-27  9:21 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Nikolay Borisov

When a transaction is aborted btrfs_cleanup_transaction is called to
cleanup all the various in-flight bits and pieces which migth be
active. One of those is delalloc inodes - inodes which have dirty
pages which haven't been persisted yet. Currently the process of
freeing such delalloc inodes in exceptional circumstances such as
transaction abort boiled down to calling btrfs_invalidate_inodes whose
sole job is to invalidate the dentries for all inodes related to a
root. This is in fact wrong and insufficient since such delalloc inodes
will likely have pending pages or ordered-extents and will be linked to
the sb->s_inode_list. This means that unmounting a btrfs instance with
an aborted transaction could potentially lead inodes/their pages
visible to the system long after their superblock has been freed. This
in turn leads to a "use-after-free" situation once page shrink is
triggered. This situation could be simulated by running generic/019
which would cause such inodes to be left hanging, followed by
generic/176 which causes memory pressure and page eviction which lead
to touching the freed super block instance. This situation is
additionally detected by the unmount code of VFS with the following
message:

"VFS: Busy inodes after unmount of Self-destruct in 5 seconds.  Have a nice day..."

Additionally btrfs hits WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
in free_fs_root for the same reason.

This patch aims to rectify the sitaution by doing the following:

1. Change btrfs_destroy_delalloc_inodes so that it calls
invalidate_inode_pages2 for every inode on the delalloc list, this
ensures that all the pages of the inode are released. This function
boils down to calling btrfs_releasepage. During test I observed cases
where inodes on the delalloc list were having an i_count of 0, so this
necessitates using igrab to be sure we are working on a non-freed inode.

2. Since calling btrfs_releasepage might queue delayed iputs move the
call out to btrfs_cleanup_transaction in btrfs_error_commit_super before
calling run_delayed_iputs for the last time. This is necessary to ensure
that delayed iputs are run.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/disk-io.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index f50786e19a7a..0f88a551862a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3816,6 +3816,7 @@ void close_ctree(struct btrfs_fs_info *fs_info)
 	set_bit(BTRFS_FS_CLOSING_DONE, &fs_info->flags);
 
 	btrfs_free_qgroup_config(fs_info);
+	ASSERT(list_empty(&fs_info->delalloc_roots));
 
 	if (percpu_counter_sum(&fs_info->delalloc_bytes)) {
 		btrfs_info(fs_info, "at unmount delalloc count %lld",
@@ -4123,15 +4124,15 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info)
 
 static void btrfs_error_commit_super(struct btrfs_fs_info *fs_info)
 {
+	/* cleanup FS via transaction */
+	btrfs_cleanup_transaction(fs_info);
+
 	mutex_lock(&fs_info->cleaner_mutex);
 	btrfs_run_delayed_iputs(fs_info);
 	mutex_unlock(&fs_info->cleaner_mutex);
 
 	down_write(&fs_info->cleanup_work_sem);
 	up_write(&fs_info->cleanup_work_sem);
-
-	/* cleanup FS via transaction */
-	btrfs_cleanup_transaction(fs_info);
 }
 
 static void btrfs_destroy_ordered_extents(struct btrfs_root *root)
@@ -4256,19 +4257,19 @@ static void btrfs_destroy_delalloc_inodes(struct btrfs_root *root)
 	list_splice_init(&root->delalloc_inodes, &splice);
 
 	while (!list_empty(&splice)) {
+		struct inode *inode = NULL;
 		btrfs_inode = list_first_entry(&splice, struct btrfs_inode,
 					       delalloc_inodes);
-
-		list_del_init(&btrfs_inode->delalloc_inodes);
-		clear_bit(BTRFS_INODE_IN_DELALLOC_LIST,
-			  &btrfs_inode->runtime_flags);
+		__btrfs_del_delalloc_inode(root, btrfs_inode);
 		spin_unlock(&root->delalloc_lock);
 
-		btrfs_invalidate_inodes(btrfs_inode->root);
-
+		inode = igrab(&btrfs_inode->vfs_inode);
+		if (inode) {
+			invalidate_inode_pages2(inode->i_mapping);
+			iput(inode);
+		}
 		spin_lock(&root->delalloc_lock);
 	}
-
 	spin_unlock(&root->delalloc_lock);
 }
 
@@ -4284,7 +4285,6 @@ static void btrfs_destroy_all_delalloc_inodes(struct btrfs_fs_info *fs_info)
 	while (!list_empty(&splice)) {
 		root = list_first_entry(&splice, struct btrfs_root,
 					 delalloc_root);
-		list_del_init(&root->delalloc_root);
 		root = btrfs_grab_fs_root(root);
 		BUG_ON(!root);
 		spin_unlock(&fs_info->delalloc_root_lock);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/5] btrfs: Unexport and rename btrfs_invalidate_inodes
  2018-04-27  9:21 [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount Nikolay Borisov
                   ` (3 preceding siblings ...)
  2018-04-27  9:21 ` [PATCH 4/5] btrfs: Fix delalloc inodes invalidation during transaction abort Nikolay Borisov
@ 2018-04-27  9:21 ` Nikolay Borisov
  2018-04-27 11:36   ` [PATCH v2] " Nikolay Borisov
  2018-05-01 14:02 ` [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount David Sterba
  2018-05-07 22:58 ` David Sterba
  6 siblings, 1 reply; 20+ messages in thread
From: Nikolay Borisov @ 2018-04-27  9:21 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Nikolay Borisov

This function is no longer used outside of inode.c so just make it
static. At the same time give a more becoming name, since it's not
really invalidating the inodes but just calling d_prune_alias. Last,
but not least - move the function above the sole caller to avoid
introducing yet-another-pointless forward declaration.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/ctree.h |   1 -
 fs/btrfs/inode.c | 128 ++++++++++++++++++++++++++++---------------------------
 2 files changed, 65 insertions(+), 64 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 27aa9b58b001..9536872f2ec4 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3231,7 +3231,6 @@ int btrfs_orphan_cleanup(struct btrfs_root *root);
 void btrfs_orphan_commit_root(struct btrfs_trans_handle *trans,
 			      struct btrfs_root *root);
 int btrfs_cont_expand(struct inode *inode, loff_t oldsize, loff_t size);
-void btrfs_invalidate_inodes(struct btrfs_root *root);
 void btrfs_add_delayed_iput(struct inode *inode);
 void btrfs_run_delayed_iputs(struct btrfs_fs_info *fs_info);
 int btrfs_prealloc_file_range(struct inode *inode, int mode,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d35992b8bb5e..ba29e68a5fdd 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4384,6 +4384,71 @@ static noinline int may_destroy_subvol(struct btrfs_root *root)
 	return ret;
 }
 
+/* Delete all dentries for inodes belonging to the root */
+static void btrfs_prune_dentries(struct btrfs_root *root)
+{
+	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct rb_node *node;
+	struct rb_node *prev;
+	struct btrfs_inode *entry;
+	struct inode *inode;
+	u64 objectid = 0;
+
+	if (!test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
+		WARN_ON(btrfs_root_refs(&root->root_item) != 0);
+
+	spin_lock(&root->inode_lock);
+again:
+	node = root->inode_tree.rb_node;
+	prev = NULL;
+	while (node) {
+		prev = node;
+		entry = rb_entry(node, struct btrfs_inode, rb_node);
+
+		if (objectid < btrfs_ino(BTRFS_I(&entry->vfs_inode)))
+			node = node->rb_left;
+		else if (objectid > btrfs_ino(BTRFS_I(&entry->vfs_inode)))
+			node = node->rb_right;
+		else
+			break;
+	}
+	if (!node) {
+		while (prev) {
+			entry = rb_entry(prev, struct btrfs_inode, rb_node);
+			if (objectid <= btrfs_ino(BTRFS_I(&entry->vfs_inode))) {
+				node = prev;
+				break;
+			}
+			prev = rb_next(prev);
+		}
+	}
+	while (node) {
+		entry = rb_entry(node, struct btrfs_inode, rb_node);
+		objectid = btrfs_ino(BTRFS_I(&entry->vfs_inode)) + 1;
+		inode = igrab(&entry->vfs_inode);
+		if (inode) {
+			spin_unlock(&root->inode_lock);
+			if (atomic_read(&inode->i_count) > 1)
+				d_prune_aliases(inode);
+			/*
+			 * btrfs_drop_inode will have it removed from
+			 * the inode cache when its usage count
+			 * hits zero.
+			 */
+			iput(inode);
+			cond_resched();
+			spin_lock(&root->inode_lock);
+			goto again;
+		}
+
+		if (cond_resched_lock(&root->inode_lock))
+			goto again;
+
+		node = rb_next(node);
+	}
+	spin_unlock(&root->inode_lock);
+}
+
 int btrfs_delete_subvolume(struct inode *dir, struct dentry *dentry)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(dentry->d_sb);
@@ -5823,69 +5888,6 @@ static void inode_tree_del(struct inode *inode)
 	}
 }
 
-void btrfs_invalidate_inodes(struct btrfs_root *root)
-{
-	struct btrfs_fs_info *fs_info = root->fs_info;
-	struct rb_node *node;
-	struct rb_node *prev;
-	struct btrfs_inode *entry;
-	struct inode *inode;
-	u64 objectid = 0;
-
-	if (!test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
-		WARN_ON(btrfs_root_refs(&root->root_item) != 0);
-
-	spin_lock(&root->inode_lock);
-again:
-	node = root->inode_tree.rb_node;
-	prev = NULL;
-	while (node) {
-		prev = node;
-		entry = rb_entry(node, struct btrfs_inode, rb_node);
-
-		if (objectid < btrfs_ino(BTRFS_I(&entry->vfs_inode)))
-			node = node->rb_left;
-		else if (objectid > btrfs_ino(BTRFS_I(&entry->vfs_inode)))
-			node = node->rb_right;
-		else
-			break;
-	}
-	if (!node) {
-		while (prev) {
-			entry = rb_entry(prev, struct btrfs_inode, rb_node);
-			if (objectid <= btrfs_ino(BTRFS_I(&entry->vfs_inode))) {
-				node = prev;
-				break;
-			}
-			prev = rb_next(prev);
-		}
-	}
-	while (node) {
-		entry = rb_entry(node, struct btrfs_inode, rb_node);
-		objectid = btrfs_ino(BTRFS_I(&entry->vfs_inode)) + 1;
-		inode = igrab(&entry->vfs_inode);
-		if (inode) {
-			spin_unlock(&root->inode_lock);
-			if (atomic_read(&inode->i_count) > 1)
-				d_prune_aliases(inode);
-			/*
-			 * btrfs_drop_inode will have it removed from
-			 * the inode cache when its usage count
-			 * hits zero.
-			 */
-			iput(inode);
-			cond_resched();
-			spin_lock(&root->inode_lock);
-			goto again;
-		}
-
-		if (cond_resched_lock(&root->inode_lock))
-			goto again;
-
-		node = rb_next(node);
-	}
-	spin_unlock(&root->inode_lock);
-}
 
 static int btrfs_init_locked_inode(struct inode *inode, void *p)
 {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2] btrfs: Unexport and rename btrfs_invalidate_inodes
  2018-04-27  9:21 ` [PATCH 5/5] btrfs: Unexport and rename btrfs_invalidate_inodes Nikolay Borisov
@ 2018-04-27 11:36   ` Nikolay Borisov
  2018-04-28 16:33     ` kbuild test robot
                       ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Nikolay Borisov @ 2018-04-27 11:36 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Nikolay Borisov

This function is no longer used outside of inode.c so just make it
static. At the same time give a more becoming name, since it's not
really invalidating the inodes but just calling d_prune_alias. Last,
but not least - move the function above the sole caller to avoid
introducing yet-another-pointless forward declaration.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---

Forgot to rename the function at the actual call site. 
 fs/btrfs/ctree.h |   1 -
 fs/btrfs/inode.c | 130 ++++++++++++++++++++++++++++---------------------------
 2 files changed, 66 insertions(+), 65 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 27aa9b58b001..9536872f2ec4 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -3231,7 +3231,6 @@ int btrfs_orphan_cleanup(struct btrfs_root *root);
 void btrfs_orphan_commit_root(struct btrfs_trans_handle *trans,
 			      struct btrfs_root *root);
 int btrfs_cont_expand(struct inode *inode, loff_t oldsize, loff_t size);
-void btrfs_invalidate_inodes(struct btrfs_root *root);
 void btrfs_add_delayed_iput(struct inode *inode);
 void btrfs_run_delayed_iputs(struct btrfs_fs_info *fs_info);
 int btrfs_prealloc_file_range(struct inode *inode, int mode,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index d35992b8bb5e..d42cd8e6e8a6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4384,6 +4384,71 @@ static noinline int may_destroy_subvol(struct btrfs_root *root)
 	return ret;
 }
 
+/* Delete all dentries for inodes belonging to the root */
+static void btrfs_prune_dentries(struct btrfs_root *root)
+{
+	struct btrfs_fs_info *fs_info = root->fs_info;
+	struct rb_node *node;
+	struct rb_node *prev;
+	struct btrfs_inode *entry;
+	struct inode *inode;
+	u64 objectid = 0;
+
+	if (!test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
+		WARN_ON(btrfs_root_refs(&root->root_item) != 0);
+
+	spin_lock(&root->inode_lock);
+again:
+	node = root->inode_tree.rb_node;
+	prev = NULL;
+	while (node) {
+		prev = node;
+		entry = rb_entry(node, struct btrfs_inode, rb_node);
+
+		if (objectid < btrfs_ino(BTRFS_I(&entry->vfs_inode)))
+			node = node->rb_left;
+		else if (objectid > btrfs_ino(BTRFS_I(&entry->vfs_inode)))
+			node = node->rb_right;
+		else
+			break;
+	}
+	if (!node) {
+		while (prev) {
+			entry = rb_entry(prev, struct btrfs_inode, rb_node);
+			if (objectid <= btrfs_ino(BTRFS_I(&entry->vfs_inode))) {
+				node = prev;
+				break;
+			}
+			prev = rb_next(prev);
+		}
+	}
+	while (node) {
+		entry = rb_entry(node, struct btrfs_inode, rb_node);
+		objectid = btrfs_ino(BTRFS_I(&entry->vfs_inode)) + 1;
+		inode = igrab(&entry->vfs_inode);
+		if (inode) {
+			spin_unlock(&root->inode_lock);
+			if (atomic_read(&inode->i_count) > 1)
+				d_prune_aliases(inode);
+			/*
+			 * btrfs_drop_inode will have it removed from
+			 * the inode cache when its usage count
+			 * hits zero.
+			 */
+			iput(inode);
+			cond_resched();
+			spin_lock(&root->inode_lock);
+			goto again;
+		}
+
+		if (cond_resched_lock(&root->inode_lock))
+			goto again;
+
+		node = rb_next(node);
+	}
+	spin_unlock(&root->inode_lock);
+}
+
 int btrfs_delete_subvolume(struct inode *dir, struct dentry *dentry)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(dentry->d_sb);
@@ -4510,7 +4575,7 @@ int btrfs_delete_subvolume(struct inode *dir, struct dentry *dentry)
 		spin_unlock(&dest->root_item_lock);
 	} else {
 		d_invalidate(dentry);
-		btrfs_invalidate_inodes(dest);
+		btrfs_prune_dentries(dest);
 		ASSERT(dest->send_in_progress == 0);
 
 		/* the last ref */
@@ -5823,69 +5888,6 @@ static void inode_tree_del(struct inode *inode)
 	}
 }
 
-void btrfs_invalidate_inodes(struct btrfs_root *root)
-{
-	struct btrfs_fs_info *fs_info = root->fs_info;
-	struct rb_node *node;
-	struct rb_node *prev;
-	struct btrfs_inode *entry;
-	struct inode *inode;
-	u64 objectid = 0;
-
-	if (!test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state))
-		WARN_ON(btrfs_root_refs(&root->root_item) != 0);
-
-	spin_lock(&root->inode_lock);
-again:
-	node = root->inode_tree.rb_node;
-	prev = NULL;
-	while (node) {
-		prev = node;
-		entry = rb_entry(node, struct btrfs_inode, rb_node);
-
-		if (objectid < btrfs_ino(BTRFS_I(&entry->vfs_inode)))
-			node = node->rb_left;
-		else if (objectid > btrfs_ino(BTRFS_I(&entry->vfs_inode)))
-			node = node->rb_right;
-		else
-			break;
-	}
-	if (!node) {
-		while (prev) {
-			entry = rb_entry(prev, struct btrfs_inode, rb_node);
-			if (objectid <= btrfs_ino(BTRFS_I(&entry->vfs_inode))) {
-				node = prev;
-				break;
-			}
-			prev = rb_next(prev);
-		}
-	}
-	while (node) {
-		entry = rb_entry(node, struct btrfs_inode, rb_node);
-		objectid = btrfs_ino(BTRFS_I(&entry->vfs_inode)) + 1;
-		inode = igrab(&entry->vfs_inode);
-		if (inode) {
-			spin_unlock(&root->inode_lock);
-			if (atomic_read(&inode->i_count) > 1)
-				d_prune_aliases(inode);
-			/*
-			 * btrfs_drop_inode will have it removed from
-			 * the inode cache when its usage count
-			 * hits zero.
-			 */
-			iput(inode);
-			cond_resched();
-			spin_lock(&root->inode_lock);
-			goto again;
-		}
-
-		if (cond_resched_lock(&root->inode_lock))
-			goto again;
-
-		node = rb_next(node);
-	}
-	spin_unlock(&root->inode_lock);
-}
 
 static int btrfs_init_locked_inode(struct inode *inode, void *p)
 {
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] btrfs: Unexport and rename btrfs_invalidate_inodes
  2018-04-27 11:36   ` [PATCH v2] " Nikolay Borisov
@ 2018-04-28 16:33     ` kbuild test robot
  2018-04-28 16:44     ` kbuild test robot
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 20+ messages in thread
From: kbuild test robot @ 2018-04-28 16:33 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: kbuild-all, linux-btrfs, Nikolay Borisov

[-- Attachment #1: Type: text/plain, Size: 3255 bytes --]

Hi Nikolay,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20180424]
[cannot apply to v4.17-rc2 v4.17-rc1 v4.16 v4.17-rc2]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nikolay-Borisov/btrfs-Unexport-and-rename-btrfs_invalidate_inodes/20180428-234332
config: x86_64-randconfig-x013-201816 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   fs/btrfs/disk-io.c: In function 'btrfs_destroy_delalloc_inodes':
>> fs/btrfs/disk-io.c:4317:3: error: implicit declaration of function 'btrfs_invalidate_inodes'; did you mean 'btrfs_update_inode'? [-Werror=implicit-function-declaration]
      btrfs_invalidate_inodes(btrfs_inode->root);
      ^~~~~~~~~~~~~~~~~~~~~~~
      btrfs_update_inode
   cc1: some warnings being treated as errors

vim +4317 fs/btrfs/disk-io.c

acce952b0 liubo        2011-01-06  4297  
143bede52 Jeff Mahoney 2012-03-01  4298  static void btrfs_destroy_delalloc_inodes(struct btrfs_root *root)
acce952b0 liubo        2011-01-06  4299  {
acce952b0 liubo        2011-01-06  4300  	struct btrfs_inode *btrfs_inode;
acce952b0 liubo        2011-01-06  4301  	struct list_head splice;
acce952b0 liubo        2011-01-06  4302  
acce952b0 liubo        2011-01-06  4303  	INIT_LIST_HEAD(&splice);
acce952b0 liubo        2011-01-06  4304  
eb73c1b7c Miao Xie     2013-05-15  4305  	spin_lock(&root->delalloc_lock);
eb73c1b7c Miao Xie     2013-05-15  4306  	list_splice_init(&root->delalloc_inodes, &splice);
acce952b0 liubo        2011-01-06  4307  
acce952b0 liubo        2011-01-06  4308  	while (!list_empty(&splice)) {
eb73c1b7c Miao Xie     2013-05-15  4309  		btrfs_inode = list_first_entry(&splice, struct btrfs_inode,
acce952b0 liubo        2011-01-06  4310  					       delalloc_inodes);
acce952b0 liubo        2011-01-06  4311  
acce952b0 liubo        2011-01-06  4312  		list_del_init(&btrfs_inode->delalloc_inodes);
df0af1a57 Miao Xie     2013-01-29  4313  		clear_bit(BTRFS_INODE_IN_DELALLOC_LIST,
df0af1a57 Miao Xie     2013-01-29  4314  			  &btrfs_inode->runtime_flags);
eb73c1b7c Miao Xie     2013-05-15  4315  		spin_unlock(&root->delalloc_lock);
acce952b0 liubo        2011-01-06  4316  
acce952b0 liubo        2011-01-06 @4317  		btrfs_invalidate_inodes(btrfs_inode->root);
b216cbfb5 Miao Xie     2013-05-15  4318  
eb73c1b7c Miao Xie     2013-05-15  4319  		spin_lock(&root->delalloc_lock);
acce952b0 liubo        2011-01-06  4320  	}
acce952b0 liubo        2011-01-06  4321  
eb73c1b7c Miao Xie     2013-05-15  4322  	spin_unlock(&root->delalloc_lock);
acce952b0 liubo        2011-01-06  4323  }
acce952b0 liubo        2011-01-06  4324  

:::::: The code at line 4317 was first introduced by commit
:::::: acce952b0263825da32cf10489413dec78053347 Btrfs: forced readonly mounts on errors

:::::: TO: liubo <liubo2009@cn.fujitsu.com>
:::::: CC: Chris Mason <chris.mason@oracle.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28542 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] btrfs: Unexport and rename btrfs_invalidate_inodes
  2018-04-27 11:36   ` [PATCH v2] " Nikolay Borisov
  2018-04-28 16:33     ` kbuild test robot
@ 2018-04-28 16:44     ` kbuild test robot
  2018-04-28 18:04     ` kbuild test robot
  2018-05-11  5:38     ` Anand Jain
  3 siblings, 0 replies; 20+ messages in thread
From: kbuild test robot @ 2018-04-28 16:44 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: kbuild-all, linux-btrfs, Nikolay Borisov

[-- Attachment #1: Type: text/plain, Size: 3196 bytes --]

Hi Nikolay,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on next-20180424]
[cannot apply to v4.17-rc2 v4.17-rc1 v4.16 v4.17-rc2]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nikolay-Borisov/btrfs-Unexport-and-rename-btrfs_invalidate_inodes/20180428-234332
config: i386-randconfig-a0-201816 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   fs//btrfs/disk-io.c: In function 'btrfs_destroy_delalloc_inodes':
>> fs//btrfs/disk-io.c:4317:3: error: implicit declaration of function 'btrfs_invalidate_inodes' [-Werror=implicit-function-declaration]
      btrfs_invalidate_inodes(btrfs_inode->root);
      ^
   cc1: some warnings being treated as errors

vim +/btrfs_invalidate_inodes +4317 fs//btrfs/disk-io.c

acce952b0 liubo        2011-01-06  4297  
143bede52 Jeff Mahoney 2012-03-01  4298  static void btrfs_destroy_delalloc_inodes(struct btrfs_root *root)
acce952b0 liubo        2011-01-06  4299  {
acce952b0 liubo        2011-01-06  4300  	struct btrfs_inode *btrfs_inode;
acce952b0 liubo        2011-01-06  4301  	struct list_head splice;
acce952b0 liubo        2011-01-06  4302  
acce952b0 liubo        2011-01-06  4303  	INIT_LIST_HEAD(&splice);
acce952b0 liubo        2011-01-06  4304  
eb73c1b7c Miao Xie     2013-05-15  4305  	spin_lock(&root->delalloc_lock);
eb73c1b7c Miao Xie     2013-05-15  4306  	list_splice_init(&root->delalloc_inodes, &splice);
acce952b0 liubo        2011-01-06  4307  
acce952b0 liubo        2011-01-06  4308  	while (!list_empty(&splice)) {
eb73c1b7c Miao Xie     2013-05-15  4309  		btrfs_inode = list_first_entry(&splice, struct btrfs_inode,
acce952b0 liubo        2011-01-06  4310  					       delalloc_inodes);
acce952b0 liubo        2011-01-06  4311  
acce952b0 liubo        2011-01-06  4312  		list_del_init(&btrfs_inode->delalloc_inodes);
df0af1a57 Miao Xie     2013-01-29  4313  		clear_bit(BTRFS_INODE_IN_DELALLOC_LIST,
df0af1a57 Miao Xie     2013-01-29  4314  			  &btrfs_inode->runtime_flags);
eb73c1b7c Miao Xie     2013-05-15  4315  		spin_unlock(&root->delalloc_lock);
acce952b0 liubo        2011-01-06  4316  
acce952b0 liubo        2011-01-06 @4317  		btrfs_invalidate_inodes(btrfs_inode->root);
b216cbfb5 Miao Xie     2013-05-15  4318  
eb73c1b7c Miao Xie     2013-05-15  4319  		spin_lock(&root->delalloc_lock);
acce952b0 liubo        2011-01-06  4320  	}
acce952b0 liubo        2011-01-06  4321  
eb73c1b7c Miao Xie     2013-05-15  4322  	spin_unlock(&root->delalloc_lock);
acce952b0 liubo        2011-01-06  4323  }
acce952b0 liubo        2011-01-06  4324  

:::::: The code at line 4317 was first introduced by commit
:::::: acce952b0263825da32cf10489413dec78053347 Btrfs: forced readonly mounts on errors

:::::: TO: liubo <liubo2009@cn.fujitsu.com>
:::::: CC: Chris Mason <chris.mason@oracle.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 32619 bytes --]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] btrfs: Unexport and rename btrfs_invalidate_inodes
  2018-04-27 11:36   ` [PATCH v2] " Nikolay Borisov
  2018-04-28 16:33     ` kbuild test robot
  2018-04-28 16:44     ` kbuild test robot
@ 2018-04-28 18:04     ` kbuild test robot
  2018-05-11  5:38     ` Anand Jain
  3 siblings, 0 replies; 20+ messages in thread
From: kbuild test robot @ 2018-04-28 18:04 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: kbuild-all, linux-btrfs, Nikolay Borisov

Hi Nikolay,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on next-20180424]
[cannot apply to v4.17-rc2 v4.17-rc1 v4.16 v4.17-rc2]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Nikolay-Borisov/btrfs-Unexport-and-rename-btrfs_invalidate_inodes/20180428-234332
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

   fs/btrfs/disk-io.c:4317:17: sparse: undefined identifier 'btrfs_invalidate_inodes'
   fs/btrfs/disk-io.c:295:27: sparse: expression using sizeof(void)
   fs/btrfs/disk-io.c:295:27: sparse: expression using sizeof(void)
   fs/btrfs/disk-io.c:2257:39: sparse: expression using sizeof(void)
   fs/btrfs/disk-io.c:2257:39: sparse: expression using sizeof(void)
   fs/btrfs/disk-io.c:2298:39: sparse: expression using sizeof(void)
   fs/btrfs/disk-io.c:2298:39: sparse: expression using sizeof(void)
   fs/btrfs/disk-io.c:2560:37: sparse: expression using sizeof(void)
   fs/btrfs/disk-io.c:2788:31: sparse: expression using sizeof(void)
   fs/btrfs/disk-io.c:3191:17: sparse: incompatible types in comparison expression (different address spaces)
   fs/btrfs/disk-io.c:3523:33: sparse: expression using sizeof(void)
   fs/btrfs/disk-io.c:3523:33: sparse: expression using sizeof(void)
   fs/btrfs/disk-io.c:3532:33: sparse: expression using sizeof(void)
   fs/btrfs/disk-io.c:3532:33: sparse: expression using sizeof(void)
>> fs/btrfs/disk-io.c:4317:40: sparse: call with no type!
   fs/btrfs/disk-io.c: In function 'btrfs_destroy_delalloc_inodes':
   fs/btrfs/disk-io.c:4317:3: error: implicit declaration of function 'btrfs_invalidate_inodes'; did you mean 'btrfs_update_inode'? [-Werror=implicit-function-declaration]
      btrfs_invalidate_inodes(btrfs_inode->root);
      ^~~~~~~~~~~~~~~~~~~~~~~
      btrfs_update_inode
   cc1: some warnings being treated as errors

vim +4317 fs/btrfs/disk-io.c

acce952b0 liubo        2011-01-06  4297  
143bede52 Jeff Mahoney 2012-03-01  4298  static void btrfs_destroy_delalloc_inodes(struct btrfs_root *root)
acce952b0 liubo        2011-01-06  4299  {
acce952b0 liubo        2011-01-06  4300  	struct btrfs_inode *btrfs_inode;
acce952b0 liubo        2011-01-06  4301  	struct list_head splice;
acce952b0 liubo        2011-01-06  4302  
acce952b0 liubo        2011-01-06  4303  	INIT_LIST_HEAD(&splice);
acce952b0 liubo        2011-01-06  4304  
eb73c1b7c Miao Xie     2013-05-15  4305  	spin_lock(&root->delalloc_lock);
eb73c1b7c Miao Xie     2013-05-15  4306  	list_splice_init(&root->delalloc_inodes, &splice);
acce952b0 liubo        2011-01-06  4307  
acce952b0 liubo        2011-01-06  4308  	while (!list_empty(&splice)) {
eb73c1b7c Miao Xie     2013-05-15  4309  		btrfs_inode = list_first_entry(&splice, struct btrfs_inode,
acce952b0 liubo        2011-01-06  4310  					       delalloc_inodes);
acce952b0 liubo        2011-01-06  4311  
acce952b0 liubo        2011-01-06  4312  		list_del_init(&btrfs_inode->delalloc_inodes);
df0af1a57 Miao Xie     2013-01-29  4313  		clear_bit(BTRFS_INODE_IN_DELALLOC_LIST,
df0af1a57 Miao Xie     2013-01-29  4314  			  &btrfs_inode->runtime_flags);
eb73c1b7c Miao Xie     2013-05-15  4315  		spin_unlock(&root->delalloc_lock);
acce952b0 liubo        2011-01-06  4316  
acce952b0 liubo        2011-01-06 @4317  		btrfs_invalidate_inodes(btrfs_inode->root);
b216cbfb5 Miao Xie     2013-05-15  4318  
eb73c1b7c Miao Xie     2013-05-15  4319  		spin_lock(&root->delalloc_lock);
acce952b0 liubo        2011-01-06  4320  	}
acce952b0 liubo        2011-01-06  4321  
eb73c1b7c Miao Xie     2013-05-15  4322  	spin_unlock(&root->delalloc_lock);
acce952b0 liubo        2011-01-06  4323  }
acce952b0 liubo        2011-01-06  4324  

:::::: The code at line 4317 was first introduced by commit
:::::: acce952b0263825da32cf10489413dec78053347 Btrfs: forced readonly mounts on errors

:::::: TO: liubo <liubo2009@cn.fujitsu.com>
:::::: CC: Chris Mason <chris.mason@oracle.com>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount
  2018-04-27  9:21 [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount Nikolay Borisov
                   ` (4 preceding siblings ...)
  2018-04-27  9:21 ` [PATCH 5/5] btrfs: Unexport and rename btrfs_invalidate_inodes Nikolay Borisov
@ 2018-05-01 14:02 ` David Sterba
  2018-05-16 15:36   ` David Sterba
  2018-05-07 22:58 ` David Sterba
  6 siblings, 1 reply; 20+ messages in thread
From: David Sterba @ 2018-05-01 14:02 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-btrfs

On Fri, Apr 27, 2018 at 12:21:49PM +0300, Nikolay Borisov wrote:
> After investigating crashes on generic/176 it turned that the culprit in fact
> is the random failure induced by generic/019. As it happens, if on unmount the 
> filesystem is in BTRFS_FS_STATE_ERROR then btrfs_error_commit_super is called. 
> This unveiled 2 bugs:
>  1. btrfs_destroy_delalloc_inodes's implementation was completely bogus, since
>  it only called btrfs_invalidate_inodes which only pruned dentries and didn't 
>  do anything to free any inodes with pending delalloc bytes. Once this is fixed 
>  with the use of invalide_inode_pages2 the second bug transpired. 
>  2. The last call ot run_delayed_iputs is made before btrfs_cleanup_transaction
>  is called. The latter in turn could queue up more delayed iputs resulting from 
>  invalidates_inode_pages2. 
> 
> This series fixes the problem by first fixing btrfs_destroy_delalloc_inode to 
> properly cleanup delalloc inodes and as a result cleans up the code a bit. 
> 
> I've given it a good bashing through xfstest (4 full xfstest cycles + 100 
> iterations of generic/475 since it was hitting some early assertion failures,
> which are fixed in the final version) so am pretty confident in the change. 

Thanks. I'll add it as topic branch to next, this needs some testing
exposure. The plan is to push the core patches to some rc, possibly rc5.

Review of patch 3 is required.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount
  2018-04-27  9:21 [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount Nikolay Borisov
                   ` (5 preceding siblings ...)
  2018-05-01 14:02 ` [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount David Sterba
@ 2018-05-07 22:58 ` David Sterba
  2018-05-08  5:16   ` Nikolay Borisov
  6 siblings, 1 reply; 20+ messages in thread
From: David Sterba @ 2018-05-07 22:58 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-btrfs

On Fri, Apr 27, 2018 at 12:21:49PM +0300, Nikolay Borisov wrote:
> After investigating crashes on generic/176 it turned that the culprit in fact
> is the random failure induced by generic/019. As it happens, if on unmount the 
> filesystem is in BTRFS_FS_STATE_ERROR then btrfs_error_commit_super is called. 
> This unveiled 2 bugs:
>  1. btrfs_destroy_delalloc_inodes's implementation was completely bogus, since
>  it only called btrfs_invalidate_inodes which only pruned dentries and didn't 
>  do anything to free any inodes with pending delalloc bytes. Once this is fixed 
>  with the use of invalide_inode_pages2 the second bug transpired. 
>  2. The last call ot run_delayed_iputs is made before btrfs_cleanup_transaction
>  is called. The latter in turn could queue up more delayed iputs resulting from 
>  invalidates_inode_pages2. 
> 
> This series fixes the problem by first fixing btrfs_destroy_delalloc_inode to 
> properly cleanup delalloc inodes and as a result cleans up the code a bit. 
> 
> I've given it a good bashing through xfstest (4 full xfstest cycles + 100 
> iterations of generic/475 since it was hitting some early assertion failures,
> which are fixed in the final version) so am pretty confident in the change. 

One qemu testmachine complains.

The branch was ext/nikbor/delalloc-invalidate in my github repo. Other
tests seem "fine", unlikely to be related to this patchset.

The error here is a null pointer deref in end bio callback, which
matches a use-after-free scenario, so I think there's still something
left to fix.

generic/335             [22:34:50]
[26281.970322] run fstests generic/335 at 2018-05-07 22:34:50
[26282.440728] BUG: unable to handle kernel NULL pointer dereference at0000000000000000
[26282.445060] PGD 0 P4D 0
[26282.446526] Oops: 0000 [#1] PREEMPT SMP
[26282.448562] Modules linked in: btrfs libcrc32c xor zstd_decompresszstd_compress xxhash raid6_pq loop af_packet [last unloaded: libcrc32c]
[26282.454384] CPU: 2 PID: 30005 Comm: btrfs-endio-met Tainted: GW         4.17.0-rc4-default+ #73
[26282.457247] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
[26282.459386] RIP: 0010:__queue_work+0x189/0x3f0
[26282.460342] RSP: 0018:ffff8ea47fd03f00 EFLAGS: 00010046
[26282.461506] RAX: 0000000000000000 RBX: 0000000000000000 RCX:0000000000000000
[26282.463061] RDX: ffff8ea47fbda640 RSI: 000000007fffffff RDI:ffff8ea47fbda640
[26282.464606] RBP: 000000000000000d R08: 0000000000000000 R09:0000000000000001
[26282.466169] R10: 0000000000001000 R11: ffff8ea469244000 R12:ffff8ea40819c800
[26282.467697] R13: 0000000000000002 R14: 0000000000000200 R15:ffff8ea47fbda640
[26282.469205] FS:  0000000000000000(0000) GS:ffff8ea47fd00000(0000)knlGS:0000000000000000
[26282.470971] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[26282.472173] CR2: 0000000000000000 CR3: 0000000042ed4000 CR4:00000000000006e0
[26282.473674] Call Trace:
[26282.474300]  <IRQ>
[26282.474848]  queue_work_on+0x34/0x40
[26282.475717]  btrfs_end_bio+0x71/0x110 [btrfs]
[26282.476750]  blk_update_request+0x78/0x2d0
[26282.477675]  blk_mq_end_request+0x18/0x70
[26282.478599]  flush_smp_call_function_queue+0x6f/0xe0
[26282.479690]  smp_call_function_single_interrupt+0x2c/0xe0
[26282.480825]  call_function_single_interrupt+0xf/0x20
[26282.481888]  </IRQ>
[26282.482423] RIP: 0010:exit_shm+0x0/0x1c0
[26282.483281] RSP: 0018:ffffa9a1c4787ea0 EFLAGS: 00000292 ORIG_RAX:ffffffffffffff04
[26282.484944] RAX: ffffffffb2e37960 RBX: ffff8ea47ccc1c00 RCX:0000000000000000
[26282.486423] RDX: ffff8ea413270e40 RSI: 0000000000000282 RDI:ffff8ea47ccc1c00
[26282.487870] RBP: 0000000000000000 R08: ffff8ea413297630 R09:0000000000000000
[26282.489104] R10: 0000000000000000 R11: 0000000000000256 R12:0000000000000000
[26282.490306] R13: ffff8ea47ccc2301 R14: 0000000000000001 R15:ffff8ea47ccc1c00
[26282.491500]  do_exit+0x274/0xb00
[26282.492198]  ? rescuer_thread+0x2be/0x310
[26282.492990]  ? worker_thread+0x380/0x380
[26282.493795]  kthread+0xe0/0x130
[26282.494443]  ? kthread_create_worker_on_cpu+0x50/0x50
[26282.495395]  ret_from_fork+0x1f/0x30
[26282.499564] RIP: __queue_work+0x189/0x3f0 RSP: ffff8ea47fd03f00
[26282.500614] CR2: 0000000000000000
[26282.501248] ---[ end trace f7e701988bc2b82f ]---
[26282.502141] Kernel panic - not syncing: Fatal exception in interrupt
[26282.503419] Kernel Offset: 0x31000000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[26282.505348] Rebooting in 90 seconds..

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount
  2018-05-07 22:58 ` David Sterba
@ 2018-05-08  5:16   ` Nikolay Borisov
  2018-05-08 10:41     ` David Sterba
  0 siblings, 1 reply; 20+ messages in thread
From: Nikolay Borisov @ 2018-05-08  5:16 UTC (permalink / raw)
  To: dsterba, linux-btrfs



On  8.05.2018 01:58, David Sterba wrote:
> On Fri, Apr 27, 2018 at 12:21:49PM +0300, Nikolay Borisov wrote:
>> After investigating crashes on generic/176 it turned that the culprit in fact
>> is the random failure induced by generic/019. As it happens, if on unmount the 
>> filesystem is in BTRFS_FS_STATE_ERROR then btrfs_error_commit_super is called. 
>> This unveiled 2 bugs:
>>  1. btrfs_destroy_delalloc_inodes's implementation was completely bogus, since
>>  it only called btrfs_invalidate_inodes which only pruned dentries and didn't 
>>  do anything to free any inodes with pending delalloc bytes. Once this is fixed 
>>  with the use of invalide_inode_pages2 the second bug transpired. 
>>  2. The last call ot run_delayed_iputs is made before btrfs_cleanup_transaction
>>  is called. The latter in turn could queue up more delayed iputs resulting from 
>>  invalidates_inode_pages2. 
>>
>> This series fixes the problem by first fixing btrfs_destroy_delalloc_inode to 
>> properly cleanup delalloc inodes and as a result cleans up the code a bit. 
>>
>> I've given it a good bashing through xfstest (4 full xfstest cycles + 100 
>> iterations of generic/475 since it was hitting some early assertion failures,
>> which are fixed in the final version) so am pretty confident in the change. 
> 
> One qemu testmachine complains.
> 
> The branch was ext/nikbor/delalloc-invalidate in my github repo. Other
> tests seem "fine", unlikely to be related to this patchset.
> 
> The error here is a null pointer deref in end bio callback, which
> matches a use-after-free scenario, so I think there's still something
> left to fix.
> 
> generic/335             [22:34:50]

How easy is to repro this on this particular test? Like every other run
or is it spurious?

> [26281.970322] run fstests generic/335 at 2018-05-07 22:34:50
> [26282.440728] BUG: unable to handle kernel NULL pointer dereference at0000000000000000
> [26282.445060] PGD 0 P4D 0
> [26282.446526] Oops: 0000 [#1] PREEMPT SMP
> [26282.448562] Modules linked in: btrfs libcrc32c xor zstd_decompresszstd_compress xxhash raid6_pq loop af_packet [last unloaded: libcrc32c]
> [26282.454384] CPU: 2 PID: 30005 Comm: btrfs-endio-met Tainted: GW         4.17.0-rc4-default+ #73
> [26282.457247] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
> [26282.459386] RIP: 0010:__queue_work+0x189/0x3f0
> [26282.460342] RSP: 0018:ffff8ea47fd03f00 EFLAGS: 00010046
> [26282.461506] RAX: 0000000000000000 RBX: 0000000000000000 RCX:0000000000000000
> [26282.463061] RDX: ffff8ea47fbda640 RSI: 000000007fffffff RDI:ffff8ea47fbda640
> [26282.464606] RBP: 000000000000000d R08: 0000000000000000 R09:0000000000000001
> [26282.466169] R10: 0000000000001000 R11: ffff8ea469244000 R12:ffff8ea40819c800
> [26282.467697] R13: 0000000000000002 R14: 0000000000000200 R15:ffff8ea47fbda640
> [26282.469205] FS:  0000000000000000(0000) GS:ffff8ea47fd00000(0000)knlGS:0000000000000000
> [26282.470971] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [26282.472173] CR2: 0000000000000000 CR3: 0000000042ed4000 CR4:00000000000006e0
> [26282.473674] Call Trace:
> [26282.474300]  <IRQ>
> [26282.474848]  queue_work_on+0x34/0x40
> [26282.475717]  btrfs_end_bio+0x71/0x110 [btrfs]
> [26282.476750]  blk_update_request+0x78/0x2d0
> [26282.477675]  blk_mq_end_request+0x18/0x70
> [26282.478599]  flush_smp_call_function_queue+0x6f/0xe0
> [26282.479690]  smp_call_function_single_interrupt+0x2c/0xe0
> [26282.480825]  call_function_single_interrupt+0xf/0x20
> [26282.481888]  </IRQ>
> [26282.482423] RIP: 0010:exit_shm+0x0/0x1c0
> [26282.483281] RSP: 0018:ffffa9a1c4787ea0 EFLAGS: 00000292 ORIG_RAX:ffffffffffffff04
> [26282.484944] RAX: ffffffffb2e37960 RBX: ffff8ea47ccc1c00 RCX:0000000000000000
> [26282.486423] RDX: ffff8ea413270e40 RSI: 0000000000000282 RDI:ffff8ea47ccc1c00
> [26282.487870] RBP: 0000000000000000 R08: ffff8ea413297630 R09:0000000000000000
> [26282.489104] R10: 0000000000000000 R11: 0000000000000256 R12:0000000000000000
> [26282.490306] R13: ffff8ea47ccc2301 R14: 0000000000000001 R15:ffff8ea47ccc1c00
> [26282.491500]  do_exit+0x274/0xb00
> [26282.492198]  ? rescuer_thread+0x2be/0x310
> [26282.492990]  ? worker_thread+0x380/0x380
> [26282.493795]  kthread+0xe0/0x130
> [26282.494443]  ? kthread_create_worker_on_cpu+0x50/0x50
> [26282.495395]  ret_from_fork+0x1f/0x30
> [26282.499564] RIP: __queue_work+0x189/0x3f0 RSP: ffff8ea47fd03f00
> [26282.500614] CR2: 0000000000000000
> [26282.501248] ---[ end trace f7e701988bc2b82f ]---
> [26282.502141] Kernel panic - not syncing: Fatal exception in interrupt
> [26282.503419] Kernel Offset: 0x31000000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [26282.505348] Rebooting in 90 seconds..
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount
  2018-05-08  5:16   ` Nikolay Borisov
@ 2018-05-08 10:41     ` David Sterba
  0 siblings, 0 replies; 20+ messages in thread
From: David Sterba @ 2018-05-08 10:41 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-btrfs

On Tue, May 08, 2018 at 08:16:37AM +0300, Nikolay Borisov wrote:
> >> properly cleanup delalloc inodes and as a result cleans up the code a bit. 
> >>
> >> I've given it a good bashing through xfstest (4 full xfstest cycles + 100 
> >> iterations of generic/475 since it was hitting some early assertion failures,
> >> which are fixed in the final version) so am pretty confident in the change. 
> > 
> > One qemu testmachine complains.
> > 
> > The branch was ext/nikbor/delalloc-invalidate in my github repo. Other
> > tests seem "fine", unlikely to be related to this patchset.
> > 
> > The error here is a null pointer deref in end bio callback, which
> > matches a use-after-free scenario, so I think there's still something
> > left to fix.
> > 
> > generic/335             [22:34:50]
> 
> How easy is to repro this on this particular test? Like every other run
> or is it spurious?

It happened once, on first run of the testing setup, so I don't have
enough data to answer. I'll start another round and we'll see.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 4/5] btrfs: Fix delalloc inodes invalidation during transaction abort
  2018-04-27  9:21 ` [PATCH 4/5] btrfs: Fix delalloc inodes invalidation during transaction abort Nikolay Borisov
@ 2018-05-10 15:35   ` David Sterba
  0 siblings, 0 replies; 20+ messages in thread
From: David Sterba @ 2018-05-10 15:35 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: linux-btrfs

On Fri, Apr 27, 2018 at 12:21:53PM +0300, Nikolay Borisov wrote:
> When a transaction is aborted btrfs_cleanup_transaction is called to
> cleanup all the various in-flight bits and pieces which migth be
> active. One of those is delalloc inodes - inodes which have dirty
> pages which haven't been persisted yet. Currently the process of
> freeing such delalloc inodes in exceptional circumstances such as
> transaction abort boiled down to calling btrfs_invalidate_inodes whose
> sole job is to invalidate the dentries for all inodes related to a
> root. This is in fact wrong and insufficient since such delalloc inodes
> will likely have pending pages or ordered-extents and will be linked to
> the sb->s_inode_list. This means that unmounting a btrfs instance with
> an aborted transaction could potentially lead inodes/their pages
> visible to the system long after their superblock has been freed. This
> in turn leads to a "use-after-free" situation once page shrink is
> triggered. This situation could be simulated by running generic/019
> which would cause such inodes to be left hanging, followed by
> generic/176 which causes memory pressure and page eviction which lead
> to touching the freed super block instance. This situation is
> additionally detected by the unmount code of VFS with the following
> message:
> 
> "VFS: Busy inodes after unmount of Self-destruct in 5 seconds.  Have a nice day..."
> 
> Additionally btrfs hits WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
> in free_fs_root for the same reason.
> 
> This patch aims to rectify the sitaution by doing the following:
> 
> 1. Change btrfs_destroy_delalloc_inodes so that it calls
> invalidate_inode_pages2 for every inode on the delalloc list, this
> ensures that all the pages of the inode are released. This function
> boils down to calling btrfs_releasepage. During test I observed cases
> where inodes on the delalloc list were having an i_count of 0, so this
> necessitates using igrab to be sure we are working on a non-freed inode.
> 
> 2. Since calling btrfs_releasepage might queue delayed iputs move the
> call out to btrfs_cleanup_transaction in btrfs_error_commit_super before
> calling run_delayed_iputs for the last time. This is necessary to ensure
> that delayed iputs are run.
> 
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> ---
>  fs/btrfs/disk-io.c | 22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index f50786e19a7a..0f88a551862a 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3816,6 +3816,7 @@ void close_ctree(struct btrfs_fs_info *fs_info)
>  	set_bit(BTRFS_FS_CLOSING_DONE, &fs_info->flags);
>  
>  	btrfs_free_qgroup_config(fs_info);
> +	ASSERT(list_empty(&fs_info->delalloc_roots));
>  
>  	if (percpu_counter_sum(&fs_info->delalloc_bytes)) {
>  		btrfs_info(fs_info, "at unmount delalloc count %lld",
> @@ -4123,15 +4124,15 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info)
>  
>  static void btrfs_error_commit_super(struct btrfs_fs_info *fs_info)
>  {
> +	/* cleanup FS via transaction */
> +	btrfs_cleanup_transaction(fs_info);
> +
>  	mutex_lock(&fs_info->cleaner_mutex);
>  	btrfs_run_delayed_iputs(fs_info);
>  	mutex_unlock(&fs_info->cleaner_mutex);
>  
>  	down_write(&fs_info->cleanup_work_sem);
>  	up_write(&fs_info->cleanup_work_sem);
> -
> -	/* cleanup FS via transaction */
> -	btrfs_cleanup_transaction(fs_info);

I wonder if we still run the cleanup here. The function calls several
others but they all seem to be ok and do the initial checks if there's
any work left to do.


btrfs_error_commit_super is called only once after filesystem error is
found in close_ctree, and then the kthreads are stopped. So there should
not be any work left, but I'm not sure if this holds 100% after the
btrfs_run_delayed_iputs is still called.

More invariant assertions would be good

>  }
>  
>  static void btrfs_destroy_ordered_extents(struct btrfs_root *root)
> @@ -4256,19 +4257,19 @@ static void btrfs_destroy_delalloc_inodes(struct btrfs_root *root)
>  	list_splice_init(&root->delalloc_inodes, &splice);
>  
>  	while (!list_empty(&splice)) {
> +		struct inode *inode = NULL;
>  		btrfs_inode = list_first_entry(&splice, struct btrfs_inode,
>  					       delalloc_inodes);
> -
> -		list_del_init(&btrfs_inode->delalloc_inodes);
> -		clear_bit(BTRFS_INODE_IN_DELALLOC_LIST,
> -			  &btrfs_inode->runtime_flags);
> +		__btrfs_del_delalloc_inode(root, btrfs_inode);
>  		spin_unlock(&root->delalloc_lock);
>  
> -		btrfs_invalidate_inodes(btrfs_inode->root);
> -
> +		inode = igrab(&btrfs_inode->vfs_inode);

A comment would be good here, similar to the explanation in the
changelog.

Other than that I think were good to go with this patch. I've merged the
preparatory (1-3) patches to misc-next. This one after a few more
testing rounds.

> +		if (inode) {
> +			invalidate_inode_pages2(inode->i_mapping);
> +			iput(inode);
> +		}
>  		spin_lock(&root->delalloc_lock);
>  	}
> -
>  	spin_unlock(&root->delalloc_lock);
>  }
>  

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2] btrfs: Unexport and rename btrfs_invalidate_inodes
  2018-04-27 11:36   ` [PATCH v2] " Nikolay Borisov
                       ` (2 preceding siblings ...)
  2018-04-28 18:04     ` kbuild test robot
@ 2018-05-11  5:38     ` Anand Jain
  3 siblings, 0 replies; 20+ messages in thread
From: Anand Jain @ 2018-05-11  5:38 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs



On 04/27/2018 07:36 PM, Nikolay Borisov wrote:
> This function is no longer used outside of inode.c so just make it
> static. At the same time give a more becoming name, since it's not
> really invalidating the inodes but just calling d_prune_alias. Last,
> but not least - move the function above the sole caller to avoid
> introducing yet-another-pointless forward declaration.
> 
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>

Reviewed-by: Anand Jain <anand.jain@oracle.com>

Thanks, Anand

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/5] btrfs: Split btrfs_del_delalloc_inode into 2 functions
  2018-04-27  9:21 ` [PATCH 2/5] btrfs: Split btrfs_del_delalloc_inode into 2 functions Nikolay Borisov
@ 2018-05-11  5:44   ` Anand Jain
  2018-05-11  8:39     ` Nikolay Borisov
  0 siblings, 1 reply; 20+ messages in thread
From: Anand Jain @ 2018-05-11  5:44 UTC (permalink / raw)
  To: Nikolay Borisov, linux-btrfs



On 04/27/2018 05:21 PM, Nikolay Borisov wrote:
> This is in preparation of fixing delalloc inodes leakage on transaction
> abort. Also export the new function.
> 
> Signed-off-by: Nikolay Borisov <nborisov@suse.com>

  nit: I think we are reserving function prefix __ for some special
  functions. I am not sure if the function name should prefix with __
  here.

Reviewed-by: Anand Jain <anand.jain@orcle.com>

Thanks, Anand

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/5] btrfs: Split btrfs_del_delalloc_inode into 2 functions
  2018-05-11  5:44   ` Anand Jain
@ 2018-05-11  8:39     ` Nikolay Borisov
  2018-05-16 15:27       ` David Sterba
  0 siblings, 1 reply; 20+ messages in thread
From: Nikolay Borisov @ 2018-05-11  8:39 UTC (permalink / raw)
  To: Anand Jain, linux-btrfs



On 11.05.2018 08:44, Anand Jain wrote:
> 
> 
> On 04/27/2018 05:21 PM, Nikolay Borisov wrote:
>> This is in preparation of fixing delalloc inodes leakage on transaction
>> abort. Also export the new function.
>>
>> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> 
>  nit: I think we are reserving function prefix __ for some special
>  functions. I am not sure if the function name should prefix with __
>  here.

Generally __ prefix is used for some internal function. In this case the
gist of the function (with no locking) is behind the __ prefixed
function, whereas the non __ version adds the necessary locking. I think
this is a fairly well-established pattern in the kernel.
> 
> Reviewed-by: Anand Jain <anand.jain@orcle.com>
> 
> Thanks, Anand
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/5] btrfs: Split btrfs_del_delalloc_inode into 2 functions
  2018-05-11  8:39     ` Nikolay Borisov
@ 2018-05-16 15:27       ` David Sterba
  0 siblings, 0 replies; 20+ messages in thread
From: David Sterba @ 2018-05-16 15:27 UTC (permalink / raw)
  To: Nikolay Borisov; +Cc: Anand Jain, linux-btrfs

On Fri, May 11, 2018 at 11:39:56AM +0300, Nikolay Borisov wrote:
> On 11.05.2018 08:44, Anand Jain wrote:
> > On 04/27/2018 05:21 PM, Nikolay Borisov wrote:
> >> This is in preparation of fixing delalloc inodes leakage on transaction
> >> abort. Also export the new function.
> >>
> >> Signed-off-by: Nikolay Borisov <nborisov@suse.com>
> > 
> >  nit: I think we are reserving function prefix __ for some special
> >  functions. I am not sure if the function name should prefix with __
> >  here.
> 
> Generally __ prefix is used for some internal function. In this case the
> gist of the function (with no locking) is behind the __ prefixed
> function, whereas the non __ version adds the necessary locking. I think
> this is a fairly well-established pattern in the kernel.

Yes, this is one of few patterns where __ is ok.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount
  2018-05-01 14:02 ` [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount David Sterba
@ 2018-05-16 15:36   ` David Sterba
  0 siblings, 0 replies; 20+ messages in thread
From: David Sterba @ 2018-05-16 15:36 UTC (permalink / raw)
  To: dsterba, Nikolay Borisov, linux-btrfs

On Tue, May 01, 2018 at 04:02:34PM +0200, David Sterba wrote:
> On Fri, Apr 27, 2018 at 12:21:49PM +0300, Nikolay Borisov wrote:
> > After investigating crashes on generic/176 it turned that the culprit in fact
> > is the random failure induced by generic/019. As it happens, if on unmount the 
> > filesystem is in BTRFS_FS_STATE_ERROR then btrfs_error_commit_super is called. 
> > This unveiled 2 bugs:
> >  1. btrfs_destroy_delalloc_inodes's implementation was completely bogus, since
> >  it only called btrfs_invalidate_inodes which only pruned dentries and didn't 
> >  do anything to free any inodes with pending delalloc bytes. Once this is fixed 
> >  with the use of invalide_inode_pages2 the second bug transpired. 
> >  2. The last call ot run_delayed_iputs is made before btrfs_cleanup_transaction
> >  is called. The latter in turn could queue up more delayed iputs resulting from 
> >  invalidates_inode_pages2. 
> > 
> > This series fixes the problem by first fixing btrfs_destroy_delalloc_inode to 
> > properly cleanup delalloc inodes and as a result cleans up the code a bit. 
> > 
> > I've given it a good bashing through xfstest (4 full xfstest cycles + 100 
> > iterations of generic/475 since it was hitting some early assertion failures,
> > which are fixed in the final version) so am pretty confident in the change. 
> 
> Thanks. I'll add it as topic branch to next, this needs some testing
> exposure. The plan is to push the core patches to some rc, possibly rc5.
> 
> Review of patch 3 is required.

The whole series is now in misc-next. I did not see another occurence of
the crash, so that was probably unrelated to this patchset but still
worth analyzing.

As there are going to be more patches in post-rc5, this patch is in the
pool.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2018-05-16 15:39 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-27  9:21 [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount Nikolay Borisov
2018-04-27  9:21 ` [PATCH 1/5] btrfs: Unexport btrfs_alloc_delalloc_work Nikolay Borisov
2018-04-27  9:21 ` [PATCH 2/5] btrfs: Split btrfs_del_delalloc_inode into 2 functions Nikolay Borisov
2018-05-11  5:44   ` Anand Jain
2018-05-11  8:39     ` Nikolay Borisov
2018-05-16 15:27       ` David Sterba
2018-04-27  9:21 ` [PATCH 3/5] btrfs: Add assert in __btrfs_del_delalloc_inode Nikolay Borisov
2018-04-27  9:21 ` [PATCH 4/5] btrfs: Fix delalloc inodes invalidation during transaction abort Nikolay Borisov
2018-05-10 15:35   ` David Sterba
2018-04-27  9:21 ` [PATCH 5/5] btrfs: Unexport and rename btrfs_invalidate_inodes Nikolay Borisov
2018-04-27 11:36   ` [PATCH v2] " Nikolay Borisov
2018-04-28 16:33     ` kbuild test robot
2018-04-28 16:44     ` kbuild test robot
2018-04-28 18:04     ` kbuild test robot
2018-05-11  5:38     ` Anand Jain
2018-05-01 14:02 ` [PATCH 0/5] Fix delalloc inodes leaking on btrfs unmount David Sterba
2018-05-16 15:36   ` David Sterba
2018-05-07 22:58 ` David Sterba
2018-05-08  5:16   ` Nikolay Borisov
2018-05-08 10:41     ` David Sterba

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).