All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method.
@ 2015-01-30  9:20 Qu Wenruo
  2015-01-30  9:20 ` [PATCH v5 1/9] Revert "btrfs: add support for processing pending changes" related commits Qu Wenruo
                   ` (9 more replies)
  0 siblings, 10 replies; 22+ messages in thread
From: Qu Wenruo @ 2015-01-30  9:20 UTC (permalink / raw)
  To: linux-btrfs

Patchset to solve the previous found freeze/sysfs deadlock.

Unlike previous pending_changes, which uses transaction commits to
ensure mount option doesn't change during transaction.
This idea is great in concept, but will introduce extra and somewhat
duplicated ro/freeze check, and the original synchronized behavior is
also changed.

[Per-transaction mount option]
This patch use the RCU-like concept, which will copy the mount_opt from
fs_info into btrfs_transaction, and each btrfs_test_opt() for specific
mount option bit(SPACE_CACHE/INODE_MAP_CACHE) should be changed to
btrfs_test_trans_opt().
So mount option during transaction won't be changed, and also, no extra
read_only or freeze check is needed, the commit routine is also
untouched.

[Use VFS protect for sysfs change]
The 6th patch will introduce a new help function sb_want_write() to
claim write permission to a superblock.
With this, we are able to do write protection like mnt_want_write() but
only needs to ensure that the superblock is writeable.
This also keeps the same synchronized behavior using ioctl, which will
block on frozen fs until it is unfrozen.

Changelog:
v1:
   Only use cheap freeze check to avoid deadlock.
v2:
   Fix the never changed pending_changes bug and handle transaction in
   btrfs_freeze()
   Revert sysfs only functions.
v3:
   Add atomic mount option change and per-trans mount option.
   Revert all pending changes functions.
v4:
   Add mnt_want_write() in sysfs handler.
v5:
   Change VFS helper name to sb_want_write() and sb_drop_write().
   Fix a free-n-use bug where sysfs can start a transaction but
   transaction thread is freed before btrfs sysfs interfaces.

Qu Wenruo (9):
  Revert "btrfs: add support for processing pending changes" related    
    commits
  btrfs: Make btrfs_parse_options() parse mount option in a atomic way
  btrfs: Introduce per-transaction mount_opt to keep mount option    
    consistent during transaction.
  btrfs: Use btrfs_test_trans_opt() to handle SPACE_CACHE if it's under 
       transaction protect.
  btrfs: Use btrfs_test_trans_opt() to handle INODE_CACHE if it's under 
       transaction protect.
  vfs: Add sb_want_write() function to get vfsmount from a given sb.
  btrfs: Move btrfs_sysfs_remove_one(fs_info) before transcation thread 
       cleanup.
  btrfs: Use sb_want_write() to protect label change.
  btrfs: Use sb_want_write() to protect sysfs feature change.

 fs/btrfs/ctree.h            |  64 +++-----------------
 fs/btrfs/disk-io.c          |  14 ++---
 fs/btrfs/extent-tree.c      |   2 +-
 fs/btrfs/free-space-cache.c |   2 +-
 fs/btrfs/inode-map.c        |   5 +-
 fs/btrfs/super.c            | 139 ++++++++++++++++++++------------------------
 fs/btrfs/sysfs.c            |  56 ++++++++++++------
 fs/btrfs/transaction.c      |  43 ++------------
 fs/btrfs/transaction.h      |   6 +-
 fs/namespace.c              |  36 ++++++++++++
 include/linux/mount.h       |   2 +
 11 files changed, 169 insertions(+), 200 deletions(-)

-- 
2.2.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v5 1/9] Revert "btrfs: add support for processing pending changes" related commits
  2015-01-30  9:20 [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method Qu Wenruo
@ 2015-01-30  9:20 ` Qu Wenruo
  2015-01-30 19:14   ` David Sterba
  2015-01-30  9:20 ` [PATCH v5 2/9] btrfs: Make btrfs_parse_options() parse mount option in a atomic way Qu Wenruo
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2015-01-30  9:20 UTC (permalink / raw)
  To: linux-btrfs

This reverts commit 572d9ab7845 ~ a6f69dc8018.

This pending commits patches introduce deadlock with freeze, and fix for
it will introduce extra checks on freeze and read only case.

For mount option change, later patches will introduce per-transaction
mount option to keep mount options consistent during transaction.

For sysfs interface to change label/features, it will use
mnt_want_write() to do the ro/freeze check and keep the 'btrfs pro set'
synchronized behavior.

Revert them to a clean base for later changes.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

Conflicts:
	fs/btrfs/ctree.h
	fs/btrfs/super.c
---
changelog:
v2:
  Newly introduced.
v3~v5:
  None
---
 fs/btrfs/ctree.h       | 49 +------------------------------------------------
 fs/btrfs/disk-io.c     |  8 +++-----
 fs/btrfs/inode-map.c   |  2 +-
 fs/btrfs/super.c       | 24 +++---------------------
 fs/btrfs/sysfs.c       | 34 +++++++++++++++++++++-------------
 fs/btrfs/transaction.c | 38 ++++++--------------------------------
 fs/btrfs/transaction.h |  2 --
 7 files changed, 35 insertions(+), 122 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 0b18070..08741e8 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -1412,11 +1412,6 @@ struct btrfs_fs_info {
 	 */
 	u64 last_trans_log_full_commit;
 	unsigned long mount_opt;
-	/*
-	 * Track requests for actions that need to be done during transaction
-	 * commit (like for some mount options).
-	 */
-	unsigned long pending_changes;
 	unsigned long compress_type:4;
 	int commit_interval;
 	/*
@@ -2114,6 +2109,7 @@ struct btrfs_ioctl_defrag_range_args {
 #define BTRFS_MOUNT_CHECK_INTEGRITY_INCLUDING_EXTENT_DATA (1 << 21)
 #define BTRFS_MOUNT_PANIC_ON_FATAL_ERROR	(1 << 22)
 #define BTRFS_MOUNT_RESCAN_UUID_TREE	(1 << 23)
+#define	BTRFS_MOUNT_CHANGE_INODE_CACHE	(1 << 24)
 
 #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
 #define BTRFS_DEFAULT_MAX_INLINE	(8192)
@@ -2139,49 +2135,6 @@ struct btrfs_ioctl_defrag_range_args {
 }
 
 /*
- * Requests for changes that need to be done during transaction commit.
- *
- * Internal mount options that are used for special handling of the real
- * mount options (eg. cannot be set during remount and have to be set during
- * transaction commit)
- */
-
-#define BTRFS_PENDING_SET_INODE_MAP_CACHE	(0)
-#define BTRFS_PENDING_CLEAR_INODE_MAP_CACHE	(1)
-#define BTRFS_PENDING_COMMIT			(2)
-
-#define btrfs_test_pending(info, opt)	\
-	test_bit(BTRFS_PENDING_##opt, &(info)->pending_changes)
-#define btrfs_set_pending(info, opt)	\
-	set_bit(BTRFS_PENDING_##opt, &(info)->pending_changes)
-#define btrfs_clear_pending(info, opt)	\
-	clear_bit(BTRFS_PENDING_##opt, &(info)->pending_changes)
-
-/*
- * Helpers for setting pending mount option changes.
- *
- * Expects corresponding macros
- * BTRFS_PENDING_SET_ and CLEAR_ + short mount option name
- */
-#define btrfs_set_pending_and_info(info, opt, fmt, args...)            \
-do {                                                                   \
-       if (!btrfs_raw_test_opt((info)->mount_opt, opt)) {              \
-               btrfs_info((info), fmt, ##args);                        \
-               btrfs_set_pending((info), SET_##opt);                   \
-               btrfs_clear_pending((info), CLEAR_##opt);               \
-       }                                                               \
-} while(0)
-
-#define btrfs_clear_pending_and_info(info, opt, fmt, args...)          \
-do {                                                                   \
-       if (btrfs_raw_test_opt((info)->mount_opt, opt)) {               \
-               btrfs_info((info), fmt, ##args);                        \
-               btrfs_set_pending((info), CLEAR_##opt);                 \
-               btrfs_clear_pending((info), SET_##opt);                 \
-       }                                                               \
-} while(0)
-
-/*
  * Inode flags
  */
 #define BTRFS_INODE_NODATASUM		(1 << 0)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 8c63419..2d3c8b7 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2832,11 +2832,9 @@ retry_root_backup:
 		btrfs_set_opt(fs_info->mount_opt, SSD);
 	}
 
-	/*
-	 * Mount does not set all options immediatelly, we can do it now and do
-	 * not have to wait for transaction commit
-	 */
-	btrfs_apply_pending_changes(fs_info);
+	/* Set the real inode map cache flag */
+	if (btrfs_test_opt(tree_root, CHANGE_INODE_CACHE))
+		btrfs_set_opt(tree_root->fs_info->mount_opt, INODE_MAP_CACHE);
 
 #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
 	if (btrfs_test_opt(tree_root, CHECK_INTEGRITY)) {
diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
index 74faea3..81efd83 100644
--- a/fs/btrfs/inode-map.c
+++ b/fs/btrfs/inode-map.c
@@ -178,7 +178,7 @@ static void start_caching(struct btrfs_root *root)
 			  root->root_key.objectid);
 	if (IS_ERR(tsk)) {
 		btrfs_warn(root->fs_info, "failed to start inode caching task");
-		btrfs_clear_pending_and_info(root->fs_info, INODE_MAP_CACHE,
+		btrfs_clear_and_info(root, CHANGE_INODE_CACHE,
 				"disabling inode map caching");
 	}
 }
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 6f49b28..b0c45b2 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -993,27 +993,9 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
 	trans = btrfs_attach_transaction_barrier(root);
 	if (IS_ERR(trans)) {
 		/* no transaction, don't bother */
-		if (PTR_ERR(trans) == -ENOENT) {
-			/*
-			 * Exit unless we have some pending changes
-			 * that need to go through commit
-			 */
-			if (fs_info->pending_changes == 0)
-				return 0;
-			/*
-			 * A non-blocking test if the fs is frozen. We must not
-			 * start a new transaction here otherwise a deadlock
-			 * happens. The pending operations are delayed to the
-			 * next commit after thawing.
-			 */
-			if (__sb_start_write(sb, SB_FREEZE_WRITE, false))
-				__sb_end_write(sb, SB_FREEZE_WRITE);
-			else
-				return 0;
-			trans = btrfs_start_transaction(root, 0);
-		}
-		if (IS_ERR(trans))
-			return PTR_ERR(trans);
+		if (PTR_ERR(trans) == -ENOENT)
+			return 0;
+		return PTR_ERR(trans);
 	}
 	return btrfs_commit_transaction(trans, root);
 }
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 92db3f6..b2e7bb4 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -111,6 +111,7 @@ static ssize_t btrfs_feature_attr_store(struct kobject *kobj,
 {
 	struct btrfs_fs_info *fs_info;
 	struct btrfs_feature_attr *fa = to_btrfs_feature_attr(a);
+	struct btrfs_trans_handle *trans;
 	u64 features, set, clear;
 	unsigned long val;
 	int ret;
@@ -152,6 +153,10 @@ static ssize_t btrfs_feature_attr_store(struct kobject *kobj,
 	btrfs_info(fs_info, "%s %s feature flag",
 		   val ? "Setting" : "Clearing", fa->kobj_attr.attr.name);
 
+	trans = btrfs_start_transaction(fs_info->fs_root, 0);
+	if (IS_ERR(trans))
+		return PTR_ERR(trans);
+
 	spin_lock(&fs_info->super_lock);
 	features = get_features(fs_info, fa->feature_set);
 	if (val)
@@ -161,11 +166,9 @@ static ssize_t btrfs_feature_attr_store(struct kobject *kobj,
 	set_features(fs_info, fa->feature_set, features);
 	spin_unlock(&fs_info->super_lock);
 
-	/*
-	 * We don't want to do full transaction commit from inside sysfs
-	 */
-	btrfs_set_pending(fs_info, COMMIT);
-	wake_up_process(fs_info->transaction_kthread);
+	ret = btrfs_commit_transaction(trans, fs_info->fs_root);
+	if (ret)
+		return ret;
 
 	return count;
 }
@@ -369,6 +372,9 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
 				 const char *buf, size_t len)
 {
 	struct btrfs_fs_info *fs_info = to_fs_info(kobj);
+	struct btrfs_trans_handle *trans;
+	struct btrfs_root *root = fs_info->fs_root;
+	int ret;
 	size_t p_len;
 
 	if (fs_info->sb->s_flags & MS_RDONLY)
@@ -383,18 +389,20 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
 	if (p_len >= BTRFS_LABEL_SIZE)
 		return -EINVAL;
 
-	spin_lock(&fs_info->super_lock);
+	trans = btrfs_start_transaction(root, 0);
+	if (IS_ERR(trans))
+		return PTR_ERR(trans);
+
+	spin_lock(&root->fs_info->super_lock);
 	memset(fs_info->super_copy->label, 0, BTRFS_LABEL_SIZE);
 	memcpy(fs_info->super_copy->label, buf, p_len);
-	spin_unlock(&fs_info->super_lock);
+	spin_unlock(&root->fs_info->super_lock);
+	ret = btrfs_commit_transaction(trans, root);
 
-	/*
-	 * We don't want to do full transaction commit from inside sysfs
-	 */
-	btrfs_set_pending(fs_info, COMMIT);
-	wake_up_process(fs_info->transaction_kthread);
+	if (!ret)
+		return len;
 
-	return len;
+	return ret;
 }
 BTRFS_ATTR_RW(label, btrfs_label_show, btrfs_label_store);
 
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index e88b59d..295a135 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1938,10 +1938,13 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 	}
 
 	/*
-	 * Since the transaction is done, we can apply the pending changes
-	 * before the next transaction.
+	 * Since the transaction is done, we should set the inode map cache flag
+	 * before any other comming transaction.
 	 */
-	btrfs_apply_pending_changes(root->fs_info);
+	if (btrfs_test_opt(root, CHANGE_INODE_CACHE))
+		btrfs_set_opt(root->fs_info->mount_opt, INODE_MAP_CACHE);
+	else
+		btrfs_clear_opt(root->fs_info->mount_opt, INODE_MAP_CACHE);
 
 	/* commit_fs_roots gets rid of all the tree log roots, it is now
 	 * safe to free the root of tree log roots
@@ -2112,32 +2115,3 @@ int btrfs_clean_one_deleted_snapshot(struct btrfs_root *root)
 
 	return (ret < 0) ? 0 : 1;
 }
-
-void btrfs_apply_pending_changes(struct btrfs_fs_info *fs_info)
-{
-	unsigned long prev;
-	unsigned long bit;
-
-	prev = xchg(&fs_info->pending_changes, 0);
-	if (!prev)
-		return;
-
-	bit = 1 << BTRFS_PENDING_SET_INODE_MAP_CACHE;
-	if (prev & bit)
-		btrfs_set_opt(fs_info->mount_opt, INODE_MAP_CACHE);
-	prev &= ~bit;
-
-	bit = 1 << BTRFS_PENDING_CLEAR_INODE_MAP_CACHE;
-	if (prev & bit)
-		btrfs_clear_opt(fs_info->mount_opt, INODE_MAP_CACHE);
-	prev &= ~bit;
-
-	bit = 1 << BTRFS_PENDING_COMMIT;
-	if (prev & bit)
-		btrfs_debug(fs_info, "pending commit done");
-	prev &= ~bit;
-
-	if (prev)
-		btrfs_warn(fs_info,
-			"unknown pending changes left 0x%lx, ignoring", prev);
-}
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index 00ed29c..fd400a3 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -170,6 +170,4 @@ int btrfs_wait_marked_extents(struct btrfs_root *root,
 int btrfs_transaction_blocked(struct btrfs_fs_info *info);
 int btrfs_transaction_in_commit(struct btrfs_fs_info *info);
 void btrfs_put_transaction(struct btrfs_transaction *transaction);
-void btrfs_apply_pending_changes(struct btrfs_fs_info *fs_info);
-
 #endif
-- 
2.2.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 2/9] btrfs: Make btrfs_parse_options() parse mount option in a atomic way
  2015-01-30  9:20 [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method Qu Wenruo
  2015-01-30  9:20 ` [PATCH v5 1/9] Revert "btrfs: add support for processing pending changes" related commits Qu Wenruo
@ 2015-01-30  9:20 ` Qu Wenruo
  2015-01-30 18:28   ` David Sterba
  2015-01-30  9:20 ` [PATCH v5 3/9] btrfs: Introduce per-transaction mount_opt to keep mount option consistent during transaction Qu Wenruo
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2015-01-30  9:20 UTC (permalink / raw)
  To: linux-btrfs

Current btrfs_parse_options() is not atomic, which can set and clear a
bit, especially for nospace_cache case.

For example, if a fs is mounted with nospace_cache,
btrfs_parse_options() will set SPACE_CACHE bit first(since
cache_generation is non-zeo) and clear the SPACE_CACHE bit due to
nospace_cache mount option.
So under heavy operations and remount a nospace_cache btrfs, there is a
windows for commit to create space cache.

This bug can be reproduced by fstest/btrfs/071 073 074 with
nospace_cache mount option. It has about 50% chance to create space
cache, and about 10% chance to create wrong space cache, which can't
pass btrfsck.

This patch will do the mount option parse in a copy-and-update method.
First copy the mount_opt from fs_info to new_opt, and only update
options in new_opt. At last, copy the new_opt back to
fs_info->mount_opt.

This patch is already good enough to fix the above nospace_cache +
remount bug, but need later patch to make sure mount options does not
change during transaction.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
changelog:
v3:
  Newly introduced
v4:
  None
v5:
  Add ACCESS_ONE() macro to avoid aggressive compiler optimization to
  not read fs_info->mount_opt to new_opt but always use fs_info->mount_opt.
---
 fs/btrfs/ctree.h |  16 ++++----
 fs/btrfs/super.c | 115 +++++++++++++++++++++++++++++--------------------------
 2 files changed, 69 insertions(+), 62 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 08741e8..0d44ea9 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2120,18 +2120,18 @@ struct btrfs_ioctl_defrag_range_args {
 #define btrfs_test_opt(root, opt)	((root)->fs_info->mount_opt & \
 					 BTRFS_MOUNT_##opt)
 
-#define btrfs_set_and_info(root, opt, fmt, args...)			\
+#define btrfs_set_and_info(fs_info, val, opt, fmt, args...)		\
 {									\
-	if (!btrfs_test_opt(root, opt))					\
-		btrfs_info(root->fs_info, fmt, ##args);			\
-	btrfs_set_opt(root->fs_info->mount_opt, opt);			\
+	if (!btrfs_raw_test_opt(val, opt))				\
+		btrfs_info(fs_info, fmt, ##args);			\
+	btrfs_set_opt(val, opt);					\
 }
 
-#define btrfs_clear_and_info(root, opt, fmt, args...)			\
+#define btrfs_clear_and_info(fs_info, val, opt, fmt, args...)		\
 {									\
-	if (btrfs_test_opt(root, opt))					\
-		btrfs_info(root->fs_info, fmt, ##args);			\
-	btrfs_clear_opt(root->fs_info->mount_opt, opt);			\
+	if (btrfs_raw_test_opt(val, opt))				\
+		btrfs_info(fs_info, fmt, ##args);			\
+	btrfs_clear_opt(val, opt);					\
 }
 
 /*
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index b0c45b2..3ea3ce3 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -395,10 +395,13 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 	int ret = 0;
 	char *compress_type;
 	bool compress_force = false;
+	unsigned long new_opt;
+
+	new_opt = ACCESS_ONCE(info->mount_opt);
 
 	cache_gen = btrfs_super_cache_generation(root->fs_info->super_copy);
 	if (cache_gen)
-		btrfs_set_opt(info->mount_opt, SPACE_CACHE);
+		btrfs_set_opt(new_opt, SPACE_CACHE);
 
 	if (!options)
 		goto out;
@@ -422,7 +425,7 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 		switch (token) {
 		case Opt_degraded:
 			btrfs_info(root->fs_info, "allowing degraded mounts");
-			btrfs_set_opt(info->mount_opt, DEGRADED);
+			btrfs_set_opt(new_opt, DEGRADED);
 			break;
 		case Opt_subvol:
 		case Opt_subvolid:
@@ -434,7 +437,7 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 			 */
 			break;
 		case Opt_nodatasum:
-			btrfs_set_and_info(root, NODATASUM,
+			btrfs_set_and_info(info, new_opt, NODATASUM,
 					   "setting nodatasum");
 			break;
 		case Opt_datasum:
@@ -444,8 +447,8 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 				else
 					btrfs_info(root->fs_info, "setting datasum");
 			}
-			btrfs_clear_opt(info->mount_opt, NODATACOW);
-			btrfs_clear_opt(info->mount_opt, NODATASUM);
+			btrfs_clear_opt(new_opt, NODATACOW);
+			btrfs_clear_opt(new_opt, NODATASUM);
 			break;
 		case Opt_nodatacow:
 			if (!btrfs_test_opt(root, NODATACOW)) {
@@ -457,13 +460,13 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 					btrfs_info(root->fs_info, "setting nodatacow");
 				}
 			}
-			btrfs_clear_opt(info->mount_opt, COMPRESS);
-			btrfs_clear_opt(info->mount_opt, FORCE_COMPRESS);
-			btrfs_set_opt(info->mount_opt, NODATACOW);
-			btrfs_set_opt(info->mount_opt, NODATASUM);
+			btrfs_clear_opt(new_opt, COMPRESS);
+			btrfs_clear_opt(new_opt, FORCE_COMPRESS);
+			btrfs_set_opt(new_opt, NODATACOW);
+			btrfs_set_opt(new_opt, NODATASUM);
 			break;
 		case Opt_datacow:
-			btrfs_clear_and_info(root, NODATACOW,
+			btrfs_clear_and_info(info, new_opt, NODATACOW,
 					     "setting datacow");
 			break;
 		case Opt_compress_force:
@@ -477,20 +480,20 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 			    strcmp(args[0].from, "zlib") == 0) {
 				compress_type = "zlib";
 				info->compress_type = BTRFS_COMPRESS_ZLIB;
-				btrfs_set_opt(info->mount_opt, COMPRESS);
-				btrfs_clear_opt(info->mount_opt, NODATACOW);
-				btrfs_clear_opt(info->mount_opt, NODATASUM);
+				btrfs_set_opt(new_opt, COMPRESS);
+				btrfs_clear_opt(new_opt, NODATACOW);
+				btrfs_clear_opt(new_opt, NODATASUM);
 			} else if (strcmp(args[0].from, "lzo") == 0) {
 				compress_type = "lzo";
 				info->compress_type = BTRFS_COMPRESS_LZO;
-				btrfs_set_opt(info->mount_opt, COMPRESS);
-				btrfs_clear_opt(info->mount_opt, NODATACOW);
-				btrfs_clear_opt(info->mount_opt, NODATASUM);
+				btrfs_set_opt(new_opt, COMPRESS);
+				btrfs_clear_opt(new_opt, NODATACOW);
+				btrfs_clear_opt(new_opt, NODATASUM);
 				btrfs_set_fs_incompat(info, COMPRESS_LZO);
 			} else if (strncmp(args[0].from, "no", 2) == 0) {
 				compress_type = "no";
-				btrfs_clear_opt(info->mount_opt, COMPRESS);
-				btrfs_clear_opt(info->mount_opt, FORCE_COMPRESS);
+				btrfs_clear_opt(new_opt, COMPRESS);
+				btrfs_clear_opt(new_opt, FORCE_COMPRESS);
 				compress_force = false;
 			} else {
 				ret = -EINVAL;
@@ -498,7 +501,8 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 			}
 
 			if (compress_force) {
-				btrfs_set_and_info(root, FORCE_COMPRESS,
+				btrfs_set_and_info(info, new_opt,
+						   FORCE_COMPRESS,
 						   "force %s compression",
 						   compress_type);
 			} else {
@@ -512,29 +516,29 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 				 * flag, otherwise, there is no way for users
 				 * to disable forcible compression separately.
 				 */
-				btrfs_clear_opt(info->mount_opt, FORCE_COMPRESS);
+				btrfs_clear_opt(new_opt, FORCE_COMPRESS);
 			}
 			break;
 		case Opt_ssd:
-			btrfs_set_and_info(root, SSD,
+			btrfs_set_and_info(info, new_opt, SSD,
 					   "use ssd allocation scheme");
 			break;
 		case Opt_ssd_spread:
-			btrfs_set_and_info(root, SSD_SPREAD,
+			btrfs_set_and_info(info, new_opt, SSD_SPREAD,
 					   "use spread ssd allocation scheme");
-			btrfs_set_opt(info->mount_opt, SSD);
+			btrfs_set_opt(new_opt, SSD);
 			break;
 		case Opt_nossd:
-			btrfs_set_and_info(root, NOSSD,
-					     "not using ssd allocation scheme");
-			btrfs_clear_opt(info->mount_opt, SSD);
+			btrfs_set_and_info(info, new_opt, NOSSD,
+					   "not using ssd allocation scheme");
+			btrfs_clear_opt(new_opt, SSD);
 			break;
 		case Opt_barrier:
-			btrfs_clear_and_info(root, NOBARRIER,
+			btrfs_clear_and_info(info, new_opt, NOBARRIER,
 					     "turning on barriers");
 			break;
 		case Opt_nobarrier:
-			btrfs_set_and_info(root, NOBARRIER,
+			btrfs_set_and_info(info, new_opt, NOBARRIER,
 					   "turning off barriers");
 			break;
 		case Opt_thread_pool:
@@ -594,19 +598,19 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 			root->fs_info->sb->s_flags &= ~MS_POSIXACL;
 			break;
 		case Opt_notreelog:
-			btrfs_set_and_info(root, NOTREELOG,
+			btrfs_set_and_info(info, new_opt, NOTREELOG,
 					   "disabling tree log");
 			break;
 		case Opt_treelog:
-			btrfs_clear_and_info(root, NOTREELOG,
+			btrfs_clear_and_info(info, new_opt, NOTREELOG,
 					     "enabling tree log");
 			break;
 		case Opt_flushoncommit:
-			btrfs_set_and_info(root, FLUSHONCOMMIT,
+			btrfs_set_and_info(info, new_opt, FLUSHONCOMMIT,
 					   "turning on flush-on-commit");
 			break;
 		case Opt_noflushoncommit:
-			btrfs_clear_and_info(root, FLUSHONCOMMIT,
+			btrfs_clear_and_info(info, new_opt, FLUSHONCOMMIT,
 					     "turning off flush-on-commit");
 			break;
 		case Opt_ratio:
@@ -623,71 +627,71 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 			}
 			break;
 		case Opt_discard:
-			btrfs_set_and_info(root, DISCARD,
+			btrfs_set_and_info(info, new_opt, DISCARD,
 					   "turning on discard");
 			break;
 		case Opt_nodiscard:
-			btrfs_clear_and_info(root, DISCARD,
+			btrfs_clear_and_info(info, new_opt, DISCARD,
 					     "turning off discard");
 			break;
 		case Opt_space_cache:
-			btrfs_set_and_info(root, SPACE_CACHE,
+			btrfs_set_and_info(info, new_opt, SPACE_CACHE,
 					   "enabling disk space caching");
 			break;
 		case Opt_rescan_uuid_tree:
-			btrfs_set_opt(info->mount_opt, RESCAN_UUID_TREE);
+			btrfs_set_opt(new_opt, RESCAN_UUID_TREE);
 			break;
 		case Opt_no_space_cache:
-			btrfs_clear_and_info(root, SPACE_CACHE,
+			btrfs_clear_and_info(info, new_opt, SPACE_CACHE,
 					     "disabling disk space caching");
 			break;
 		case Opt_inode_cache:
-			btrfs_set_pending_and_info(info, INODE_MAP_CACHE,
+			btrfs_set_and_info(info, new_opt, INODE_MAP_CACHE,
 					   "enabling inode map caching");
 			break;
 		case Opt_noinode_cache:
-			btrfs_clear_pending_and_info(info, INODE_MAP_CACHE,
+			btrfs_clear_and_info(info, new_opt, INODE_MAP_CACHE,
 					     "disabling inode map caching");
 			break;
 		case Opt_clear_cache:
-			btrfs_set_and_info(root, CLEAR_CACHE,
+			btrfs_set_and_info(info, new_opt, CLEAR_CACHE,
 					   "force clearing of disk cache");
 			break;
 		case Opt_user_subvol_rm_allowed:
-			btrfs_set_opt(info->mount_opt, USER_SUBVOL_RM_ALLOWED);
+			btrfs_set_opt(new_opt, USER_SUBVOL_RM_ALLOWED);
 			break;
 		case Opt_enospc_debug:
-			btrfs_set_opt(info->mount_opt, ENOSPC_DEBUG);
+			btrfs_set_opt(new_opt, ENOSPC_DEBUG);
 			break;
 		case Opt_noenospc_debug:
-			btrfs_clear_opt(info->mount_opt, ENOSPC_DEBUG);
+			btrfs_clear_opt(new_opt, ENOSPC_DEBUG);
 			break;
 		case Opt_defrag:
-			btrfs_set_and_info(root, AUTO_DEFRAG,
+			btrfs_set_and_info(info, new_opt, AUTO_DEFRAG,
 					   "enabling auto defrag");
 			break;
 		case Opt_nodefrag:
-			btrfs_clear_and_info(root, AUTO_DEFRAG,
+			btrfs_clear_and_info(info, new_opt, AUTO_DEFRAG,
 					     "disabling auto defrag");
 			break;
 		case Opt_recovery:
 			btrfs_info(root->fs_info, "enabling auto recovery");
-			btrfs_set_opt(info->mount_opt, RECOVERY);
+			btrfs_set_opt(new_opt, RECOVERY);
 			break;
 		case Opt_skip_balance:
-			btrfs_set_opt(info->mount_opt, SKIP_BALANCE);
+			btrfs_set_opt(new_opt, SKIP_BALANCE);
 			break;
 #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
 		case Opt_check_integrity_including_extent_data:
 			btrfs_info(root->fs_info,
 				   "enabling check integrity including extent data");
-			btrfs_set_opt(info->mount_opt,
+			btrfs_set_opt(new_opt,
 				      CHECK_INTEGRITY_INCLUDING_EXTENT_DATA);
-			btrfs_set_opt(info->mount_opt, CHECK_INTEGRITY);
+			btrfs_set_opt(new_opt, CHECK_INTEGRITY);
 			break;
 		case Opt_check_integrity:
 			btrfs_info(root->fs_info, "enabling check integrity");
-			btrfs_set_opt(info->mount_opt, CHECK_INTEGRITY);
+			btrfs_set_opt(new_opt, CHECK_INTEGRITY);
 			break;
 		case Opt_check_integrity_print_mask:
 			ret = match_int(&args[0], &intarg);
@@ -713,10 +717,10 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 #endif
 		case Opt_fatal_errors:
 			if (strcmp(args[0].from, "panic") == 0)
-				btrfs_set_opt(info->mount_opt,
+				btrfs_set_opt(new_opt,
 					      PANIC_ON_FATAL_ERROR);
 			else if (strcmp(args[0].from, "bug") == 0)
-				btrfs_clear_opt(info->mount_opt,
+				btrfs_clear_opt(new_opt,
 					      PANIC_ON_FATAL_ERROR);
 			else {
 				ret = -EINVAL;
@@ -752,8 +756,11 @@ int btrfs_parse_options(struct btrfs_root *root, char *options)
 		}
 	}
 out:
-	if (!ret && btrfs_test_opt(root, SPACE_CACHE))
-		btrfs_info(root->fs_info, "disk space caching is enabled");
+	if (!ret) {
+		if (btrfs_raw_test_opt(new_opt, SPACE_CACHE))
+			btrfs_info(info, "disk space caching is enabled");
+		ACCESS_ONCE(info->mount_opt) = new_opt;
+	}
 	kfree(orig);
 	return ret;
 }
-- 
2.2.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 3/9] btrfs: Introduce per-transaction mount_opt to keep mount option consistent during transaction.
  2015-01-30  9:20 [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method Qu Wenruo
  2015-01-30  9:20 ` [PATCH v5 1/9] Revert "btrfs: add support for processing pending changes" related commits Qu Wenruo
  2015-01-30  9:20 ` [PATCH v5 2/9] btrfs: Make btrfs_parse_options() parse mount option in a atomic way Qu Wenruo
@ 2015-01-30  9:20 ` Qu Wenruo
  2015-01-30 19:12   ` David Sterba
  2015-01-30  9:20 ` [PATCH v5 4/9] btrfs: Use btrfs_test_trans_opt() to handle SPACE_CACHE if it's under transaction protect Qu Wenruo
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2015-01-30  9:20 UTC (permalink / raw)
  To: linux-btrfs

Before this patch, mount_opt is not consistent during a transaction.
btrfs_parse_options() can race with transaction.

Now each transaction will keep a copy of fs_info->mount_opt upon
creation, and new btrfs_test_trans_opt() macro is introduced to get the
mount_opt in the transaction.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
changelog:
v3:
  Newly introduced.
v4~v5:
  None
---
 fs/btrfs/transaction.c | 1 +
 fs/btrfs/transaction.h | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 295a135..846e1b8 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -210,6 +210,7 @@ loop:
 		return -EROFS;
 	}
 
+	cur_trans->mount_opt = fs_info->mount_opt;
 	atomic_set(&cur_trans->num_writers, 1);
 	extwriter_counter_init(cur_trans, type);
 	init_waitqueue_head(&cur_trans->writer_wait);
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index fd400a3..4052879 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -52,6 +52,7 @@ struct btrfs_transaction {
 	struct list_head list;
 	struct extent_io_tree dirty_pages;
 	unsigned long start_time;
+	unsigned long mount_opt;
 	wait_queue_head_t writer_wait;
 	wait_queue_head_t commit_wait;
 	struct list_head pending_snapshots;
@@ -126,6 +127,9 @@ struct btrfs_pending_snapshot {
 	struct list_head list;
 };
 
+#define btrfs_test_trans_opt(trans, opt)	\
+	(btrfs_raw_test_opt(trans->transaction->mount_opt, opt))
+
 static inline void btrfs_set_inode_last_trans(struct btrfs_trans_handle *trans,
 					      struct inode *inode)
 {
-- 
2.2.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 4/9] btrfs: Use btrfs_test_trans_opt() to handle SPACE_CACHE if it's under transaction protect.
  2015-01-30  9:20 [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method Qu Wenruo
                   ` (2 preceding siblings ...)
  2015-01-30  9:20 ` [PATCH v5 3/9] btrfs: Introduce per-transaction mount_opt to keep mount option consistent during transaction Qu Wenruo
@ 2015-01-30  9:20 ` Qu Wenruo
  2015-01-30 19:12   ` David Sterba
  2015-01-30  9:20 ` [PATCH v5 5/9] btrfs: Use btrfs_test_trans_opt() to handle INODE_CACHE " Qu Wenruo
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2015-01-30  9:20 UTC (permalink / raw)
  To: linux-btrfs

Convert btrfs_test_opt() to btrfs_test_trans_opt() if it's called under
transaction protection.
This will ensure SPACE_CACHE bit is consistent during transaction.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
changelog:
v3:
  Newly introduced.
v4~v5:
  None
---
 fs/btrfs/extent-tree.c | 2 +-
 fs/btrfs/transaction.c | 7 ++++---
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a684086..fde502c 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3260,7 +3260,7 @@ again:
 
 	spin_lock(&block_group->lock);
 	if (block_group->cached != BTRFS_CACHE_FINISHED ||
-	    !btrfs_test_opt(root, SPACE_CACHE) ||
+	    !btrfs_test_trans_opt(trans, SPACE_CACHE) ||
 	    block_group->delalloc_bytes) {
 		/*
 		 * don't bother trying to write stuff out _if_
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 846e1b8..aec5a5a 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1507,7 +1507,8 @@ static noinline int create_pending_snapshots(struct btrfs_trans_handle *trans,
 	return ret;
 }
 
-static void update_super_roots(struct btrfs_root *root)
+static void update_super_roots(struct btrfs_trans_handle *trans,
+			       struct btrfs_root *root)
 {
 	struct btrfs_root_item *root_item;
 	struct btrfs_super_block *super;
@@ -1523,7 +1524,7 @@ static void update_super_roots(struct btrfs_root *root)
 	super->root = root_item->bytenr;
 	super->generation = root_item->generation;
 	super->root_level = root_item->level;
-	if (btrfs_test_opt(root, SPACE_CACHE))
+	if (btrfs_test_trans_opt(trans, SPACE_CACHE))
 		super->cache_generation = root_item->generation;
 	if (root->fs_info->update_uuid_tree_gen)
 		super->uuid_tree_generation = root_item->generation;
@@ -1987,7 +1988,7 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 	switch_commit_roots(cur_trans, root->fs_info);
 
 	assert_qgroups_uptodate(trans);
-	update_super_roots(root);
+	update_super_roots(trans, root);
 
 	btrfs_set_super_log_root(root->fs_info->super_copy, 0);
 	btrfs_set_super_log_root_level(root->fs_info->super_copy, 0);
-- 
2.2.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 5/9] btrfs: Use btrfs_test_trans_opt() to handle INODE_CACHE if it's under transaction protect.
  2015-01-30  9:20 [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method Qu Wenruo
                   ` (3 preceding siblings ...)
  2015-01-30  9:20 ` [PATCH v5 4/9] btrfs: Use btrfs_test_trans_opt() to handle SPACE_CACHE if it's under transaction protect Qu Wenruo
@ 2015-01-30  9:20 ` Qu Wenruo
  2015-01-30 19:12   ` David Sterba
  2015-01-30  9:20 ` [PATCH v5 6/9] vfs: Add sb_want_write() function to get vfsmount from a given sb Qu Wenruo
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2015-01-30  9:20 UTC (permalink / raw)
  To: linux-btrfs

Convert btrfs_test_opt() to btrfs_test_trans_opt() if it's called under
transaction protection.

This will ensure SPACE_CACHE bit is consistent during transaction.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
Changelog:
v3:
  Newly introduced.
v4~v5:
  None
---
 fs/btrfs/ctree.h            | 1 -
 fs/btrfs/disk-io.c          | 4 ----
 fs/btrfs/free-space-cache.c | 2 +-
 fs/btrfs/inode-map.c        | 5 +++--
 fs/btrfs/transaction.c      | 9 ---------
 5 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 0d44ea9..f381ae8 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -2109,7 +2109,6 @@ struct btrfs_ioctl_defrag_range_args {
 #define BTRFS_MOUNT_CHECK_INTEGRITY_INCLUDING_EXTENT_DATA (1 << 21)
 #define BTRFS_MOUNT_PANIC_ON_FATAL_ERROR	(1 << 22)
 #define BTRFS_MOUNT_RESCAN_UUID_TREE	(1 << 23)
-#define	BTRFS_MOUNT_CHANGE_INODE_CACHE	(1 << 24)
 
 #define BTRFS_DEFAULT_COMMIT_INTERVAL	(30)
 #define BTRFS_DEFAULT_MAX_INLINE	(8192)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 2d3c8b7..f4d168d 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2832,10 +2832,6 @@ retry_root_backup:
 		btrfs_set_opt(fs_info->mount_opt, SSD);
 	}
 
-	/* Set the real inode map cache flag */
-	if (btrfs_test_opt(tree_root, CHANGE_INODE_CACHE))
-		btrfs_set_opt(tree_root->fs_info->mount_opt, INODE_MAP_CACHE);
-
 #ifdef CONFIG_BTRFS_FS_CHECK_INTEGRITY
 	if (btrfs_test_opt(tree_root, CHECK_INTEGRITY)) {
 		ret = btrfsic_mount(tree_root, fs_devices,
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index d6c03f7..2b9cabf 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -3348,7 +3348,7 @@ int btrfs_write_out_ino_cache(struct btrfs_root *root,
 	struct btrfs_free_space_ctl *ctl = root->free_ino_ctl;
 	int ret;
 
-	if (!btrfs_test_opt(root, INODE_MAP_CACHE))
+	if (!btrfs_test_trans_opt(trans, INODE_MAP_CACHE))
 		return 0;
 
 	ret = __btrfs_write_out_cache(root, inode, ctl, NULL, trans, path, 0);
diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
index 81efd83..49b089c 100644
--- a/fs/btrfs/inode-map.c
+++ b/fs/btrfs/inode-map.c
@@ -178,7 +178,8 @@ static void start_caching(struct btrfs_root *root)
 			  root->root_key.objectid);
 	if (IS_ERR(tsk)) {
 		btrfs_warn(root->fs_info, "failed to start inode caching task");
-		btrfs_clear_and_info(root, CHANGE_INODE_CACHE,
+		btrfs_clear_and_info(root->fs_info, root->fs_info->mount_opt,
+				INODE_MAP_CACHE,
 				"disabling inode map caching");
 	}
 }
@@ -405,7 +406,7 @@ int btrfs_save_ino_cache(struct btrfs_root *root,
 	if (btrfs_root_refs(&root->root_item) == 0)
 		return 0;
 
-	if (!btrfs_test_opt(root, INODE_MAP_CACHE))
+	if (!btrfs_test_trans_opt(trans, INODE_MAP_CACHE))
 		return 0;
 
 	path = btrfs_alloc_path();
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index aec5a5a..dd9e7af 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -1939,15 +1939,6 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans,
 		goto scrub_continue;
 	}
 
-	/*
-	 * Since the transaction is done, we should set the inode map cache flag
-	 * before any other comming transaction.
-	 */
-	if (btrfs_test_opt(root, CHANGE_INODE_CACHE))
-		btrfs_set_opt(root->fs_info->mount_opt, INODE_MAP_CACHE);
-	else
-		btrfs_clear_opt(root->fs_info->mount_opt, INODE_MAP_CACHE);
-
 	/* commit_fs_roots gets rid of all the tree log roots, it is now
 	 * safe to free the root of tree log roots
 	 */
-- 
2.2.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 6/9] vfs: Add sb_want_write() function to get vfsmount from a given sb.
  2015-01-30  9:20 [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method Qu Wenruo
                   ` (4 preceding siblings ...)
  2015-01-30  9:20 ` [PATCH v5 5/9] btrfs: Use btrfs_test_trans_opt() to handle INODE_CACHE " Qu Wenruo
@ 2015-01-30  9:20 ` Qu Wenruo
  2015-01-30 19:11   ` David Sterba
  2015-01-30  9:20 ` [PATCH v5 7/9] btrfs: Move btrfs_sysfs_remove_one(fs_info) before transcation thread cleanup Qu Wenruo
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2015-01-30  9:20 UTC (permalink / raw)
  To: linux-btrfs; +Cc: linux-fsdevel

There are sysfs interfaces in some fs, only btrfs yet, which will modify
on-disk data.
Unlike normal file operation routine we can use mnt_want_write_file() to
protect the operation, change through sysfs won't to be binded to any file
in the filesystem.

So introduce new sb_want_write() to do the protection agains a super
block, which acts much like mnt_want_write() but will return success if
the super block is read-write.

Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>
Signed-off-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
Changelog:
v4:
  Newly introduced.
v5:
  Change name to sb_want_write() and receive sb and parameter.
---
 fs/namespace.c        | 36 ++++++++++++++++++++++++++++++++++++
 include/linux/mount.h |  2 ++
 2 files changed, 38 insertions(+)

diff --git a/fs/namespace.c b/fs/namespace.c
index cd1e968..79bc762 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1105,6 +1105,42 @@ struct vfsmount *mntget(struct vfsmount *mnt)
 }
 EXPORT_SYMBOL(mntget);
 
+/**
+ * sb_want_write - get write acess to a super block
+ * @sb: the superblock of the filesystem
+ *
+ * This tells the low-level filesystem that a write is about to be performed to
+ * it, and makes sure that the writes are allowed (superblock is read-write,
+ * filesystem is not frozen) before returning success.
+ * When the write operation is finished, sb_drop_write() must be called.
+ * This is much like mnt_want_write() as a refcount, but only needs
+ * the superblock to be read-write.
+ */
+int sb_want_write(struct super_block *sb)
+{
+	sb_start_write(sb);
+	if (sb->s_readonly_remount || sb->s_flags & MS_RDONLY) {
+		sb_end_write(sb);
+		return -EROFS;
+	}
+	return 0;
+}
+EXPORT_SYMBOL(sb_want_write);
+
+/**
+ * sb_drop_write - give up write acess to a super block
+ * @sb: the superblock on which to give up write access
+ *
+ * Tells the low-level filesystem that we are done performing writes to it and
+ * also allows filesystem to be frozen again. Must be matched with
+ * sb_want_write() call above.
+ */
+void sb_drop_write(struct super_block *sb)
+{
+	sb_end_write(sb);
+}
+EXPORT_SYMBOL(sb_drop_write);
+
 struct vfsmount *mnt_clone_internal(struct path *path)
 {
 	struct mount *p;
diff --git a/include/linux/mount.h b/include/linux/mount.h
index c2c561d..abf4495 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -74,8 +74,10 @@ struct path;
 extern int mnt_want_write(struct vfsmount *mnt);
 extern int mnt_want_write_file(struct file *file);
 extern int mnt_clone_write(struct vfsmount *mnt);
+extern int sb_want_write(struct super_block *sb);
 extern void mnt_drop_write(struct vfsmount *mnt);
 extern void mnt_drop_write_file(struct file *file);
+extern void sb_drop_write(struct super_block *sb);
 extern void mntput(struct vfsmount *mnt);
 extern struct vfsmount *mntget(struct vfsmount *mnt);
 extern struct vfsmount *mnt_clone_internal(struct path *path);
-- 
2.2.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 7/9] btrfs: Move btrfs_sysfs_remove_one(fs_info) before transcation thread cleanup.
  2015-01-30  9:20 [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method Qu Wenruo
                   ` (5 preceding siblings ...)
  2015-01-30  9:20 ` [PATCH v5 6/9] vfs: Add sb_want_write() function to get vfsmount from a given sb Qu Wenruo
@ 2015-01-30  9:20 ` Qu Wenruo
  2015-01-30 18:56   ` David Sterba
  2015-01-30  9:20 ` [PATCH v5 8/9] btrfs: Use sb_want_write() to protect label change Qu Wenruo
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2015-01-30  9:20 UTC (permalink / raw)
  To: linux-btrfs

Since btrfs sysfs interfaces can start new transaction, we need to do it
before transaction thread cleanup.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
v5:
  Newly introduced.
---
 fs/btrfs/disk-io.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index f4d168d..7c185a0 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3630,6 +3630,12 @@ void close_ctree(struct btrfs_root *root)
 	fs_info->closing = 1;
 	smp_mb();
 
+	/*
+	 * Remove btrfs sysfs interfaces first,
+	 * since it can start new transaction.
+	 */
+	btrfs_sysfs_remove_one(fs_info);
+
 	/* wait for the uuid_scan task to finish */
 	down(&fs_info->uuid_tree_rescan_sem);
 	/* avoid complains from lockdep et al., set sem back to initial state */
@@ -3673,8 +3679,6 @@ void close_ctree(struct btrfs_root *root)
 		       percpu_counter_sum(&fs_info->delalloc_bytes));
 	}
 
-	btrfs_sysfs_remove_one(fs_info);
-
 	btrfs_free_fs_roots(fs_info);
 
 	btrfs_put_block_group_cache(fs_info);
-- 
2.2.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 8/9] btrfs: Use sb_want_write() to protect label change.
  2015-01-30  9:20 [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method Qu Wenruo
                   ` (6 preceding siblings ...)
  2015-01-30  9:20 ` [PATCH v5 7/9] btrfs: Move btrfs_sysfs_remove_one(fs_info) before transcation thread cleanup Qu Wenruo
@ 2015-01-30  9:20 ` Qu Wenruo
  2015-01-30 18:34   ` David Sterba
  2015-01-30  9:20 ` [PATCH v5 9/9] btrfs: Use sb_want_write() to protect sysfs feature change Qu Wenruo
  2015-01-30 19:17 ` [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method David Sterba
  9 siblings, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2015-01-30  9:20 UTC (permalink / raw)
  To: linux-btrfs

Use the new vfs API sb_want_write() to do the write protection of the
label change transaction.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
v4:
  Newly introduced.
v5:
  Change to use sb_want_write().
---
 fs/btrfs/sysfs.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index b2e7bb4..3218245 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -25,6 +25,7 @@
 #include <linux/bug.h>
 #include <linux/genhd.h>
 #include <linux/debugfs.h>
+#include <linux/mount.h>
 
 #include "ctree.h"
 #include "disk-io.h"
@@ -377,8 +378,6 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
 	int ret;
 	size_t p_len;
 
-	if (fs_info->sb->s_flags & MS_RDONLY)
-		return -EROFS;
 
 	/*
 	 * p_len is the len until the first occurrence of either
@@ -389,9 +388,15 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
 	if (p_len >= BTRFS_LABEL_SIZE)
 		return -EINVAL;
 
+	ret = sb_want_write(fs_info->sb);
+	if (ret)
+		return ret;
+
 	trans = btrfs_start_transaction(root, 0);
-	if (IS_ERR(trans))
-		return PTR_ERR(trans);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		goto out;
+	}
 
 	spin_lock(&root->fs_info->super_lock);
 	memset(fs_info->super_copy->label, 0, BTRFS_LABEL_SIZE);
@@ -399,9 +404,10 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
 	spin_unlock(&root->fs_info->super_lock);
 	ret = btrfs_commit_transaction(trans, root);
 
+out:
+	sb_drop_write(fs_info->sb);
 	if (!ret)
 		return len;
-
 	return ret;
 }
 BTRFS_ATTR_RW(label, btrfs_label_show, btrfs_label_store);
-- 
2.2.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v5 9/9] btrfs: Use sb_want_write() to protect sysfs feature change.
  2015-01-30  9:20 [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method Qu Wenruo
                   ` (7 preceding siblings ...)
  2015-01-30  9:20 ` [PATCH v5 8/9] btrfs: Use sb_want_write() to protect label change Qu Wenruo
@ 2015-01-30  9:20 ` Qu Wenruo
  2015-01-30 18:35   ` David Sterba
  2015-01-30 19:17 ` [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method David Sterba
  9 siblings, 1 reply; 22+ messages in thread
From: Qu Wenruo @ 2015-01-30  9:20 UTC (permalink / raw)
  To: linux-btrfs

Just like label change, use sb_want_write() to do a correct protection.

Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
---
v4:
  Newly introduced.
v5:
  Change to sb_want_write().
---
 fs/btrfs/sysfs.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 3218245..64f876e 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -154,9 +154,15 @@ static ssize_t btrfs_feature_attr_store(struct kobject *kobj,
 	btrfs_info(fs_info, "%s %s feature flag",
 		   val ? "Setting" : "Clearing", fa->kobj_attr.attr.name);
 
+	ret = sb_want_write(fs_info->sb);
+	if (ret)
+		return ret;
+
 	trans = btrfs_start_transaction(fs_info->fs_root, 0);
-	if (IS_ERR(trans))
-		return PTR_ERR(trans);
+	if (IS_ERR(trans)) {
+		ret = PTR_ERR(trans);
+		goto out;
+	}
 
 	spin_lock(&fs_info->super_lock);
 	features = get_features(fs_info, fa->feature_set);
@@ -168,10 +174,12 @@ static ssize_t btrfs_feature_attr_store(struct kobject *kobj,
 	spin_unlock(&fs_info->super_lock);
 
 	ret = btrfs_commit_transaction(trans, fs_info->fs_root);
-	if (ret)
-		return ret;
 
-	return count;
+out:
+	sb_drop_write(fs_info->sb);
+	if (!ret)
+		return count;
+	return ret;
 }
 
 static umode_t btrfs_feature_visible(struct kobject *kobj,
-- 
2.2.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 2/9] btrfs: Make btrfs_parse_options() parse mount option in a atomic way
  2015-01-30  9:20 ` [PATCH v5 2/9] btrfs: Make btrfs_parse_options() parse mount option in a atomic way Qu Wenruo
@ 2015-01-30 18:28   ` David Sterba
  0 siblings, 0 replies; 22+ messages in thread
From: David Sterba @ 2015-01-30 18:28 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Jan 30, 2015 at 05:20:47PM +0800, Qu Wenruo wrote:
> v5:
>   Add ACCESS_ONE() macro to avoid aggressive compiler optimization to
>   not read fs_info->mount_opt to new_opt but always use fs_info->mount_opt.

This holds only up to this patch, ie. that there are other threads that
could change value of fs_info->mount_opt concurrently. But then the
per-transaction mount options will avoid that as space cache status will
be unmodified by remounts (through parse_options).

So either drop ACCESS_ONCE or please write a comment and explain why
it's still needed.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 8/9] btrfs: Use sb_want_write() to protect label change.
  2015-01-30  9:20 ` [PATCH v5 8/9] btrfs: Use sb_want_write() to protect label change Qu Wenruo
@ 2015-01-30 18:34   ` David Sterba
  0 siblings, 0 replies; 22+ messages in thread
From: David Sterba @ 2015-01-30 18:34 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Jan 30, 2015 at 05:20:53PM +0800, Qu Wenruo wrote:
> Use the new vfs API sb_want_write() to do the write protection of the
> label change transaction.
> 
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
> ---
> v4:
>   Newly introduced.
> v5:
>   Change to use sb_want_write().
> ---
>  fs/btrfs/sysfs.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index b2e7bb4..3218245 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -25,6 +25,7 @@
>  #include <linux/bug.h>
>  #include <linux/genhd.h>
>  #include <linux/debugfs.h>
> +#include <linux/mount.h>
>  
>  #include "ctree.h"
>  #include "disk-io.h"
> @@ -377,8 +378,6 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
>  	int ret;
>  	size_t p_len;
>  
> -	if (fs_info->sb->s_flags & MS_RDONLY)
> -		return -EROFS;
>  
>  	/*
>  	 * p_len is the len until the first occurrence of either
> @@ -389,9 +388,15 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
>  	if (p_len >= BTRFS_LABEL_SIZE)
>  		return -EINVAL;
>  
> +	ret = sb_want_write(fs_info->sb);
> +	if (ret)
> +		return ret;

Please move the check to the beginning of the function, where the
original MS_RDONLY check was.

> +
>  	trans = btrfs_start_transaction(root, 0);
> -	if (IS_ERR(trans))
> -		return PTR_ERR(trans);
> +	if (IS_ERR(trans)) {
> +		ret = PTR_ERR(trans);
> +		goto out;
> +	}
>  
>  	spin_lock(&root->fs_info->super_lock);
>  	memset(fs_info->super_copy->label, 0, BTRFS_LABEL_SIZE);
> @@ -399,9 +404,10 @@ static ssize_t btrfs_label_store(struct kobject *kobj,
>  	spin_unlock(&root->fs_info->super_lock);
>  	ret = btrfs_commit_transaction(trans, root);
>  
> +out:
> +	sb_drop_write(fs_info->sb);
>  	if (!ret)
>  		return len;
> -
>  	return ret;
>  }
>  BTRFS_ATTR_RW(label, btrfs_label_show, btrfs_label_store);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 9/9] btrfs: Use sb_want_write() to protect sysfs feature change.
  2015-01-30  9:20 ` [PATCH v5 9/9] btrfs: Use sb_want_write() to protect sysfs feature change Qu Wenruo
@ 2015-01-30 18:35   ` David Sterba
  0 siblings, 0 replies; 22+ messages in thread
From: David Sterba @ 2015-01-30 18:35 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Jan 30, 2015 at 05:20:54PM +0800, Qu Wenruo wrote:
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -154,9 +154,15 @@ static ssize_t btrfs_feature_attr_store(struct kobject *kobj,
>  	btrfs_info(fs_info, "%s %s feature flag",
>  		   val ? "Setting" : "Clearing", fa->kobj_attr.attr.name);
>  
> +	ret = sb_want_write(fs_info->sb);
> +	if (ret)
> +		return ret;

Same here, move it to the beginning, right after fs_info is validated.

> +
>  	trans = btrfs_start_transaction(fs_info->fs_root, 0);
> -	if (IS_ERR(trans))
> -		return PTR_ERR(trans);
> +	if (IS_ERR(trans)) {
> +		ret = PTR_ERR(trans);
> +		goto out;
> +	}
>  

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 7/9] btrfs: Move btrfs_sysfs_remove_one(fs_info) before transcation thread cleanup.
  2015-01-30  9:20 ` [PATCH v5 7/9] btrfs: Move btrfs_sysfs_remove_one(fs_info) before transcation thread cleanup Qu Wenruo
@ 2015-01-30 18:56   ` David Sterba
  0 siblings, 0 replies; 22+ messages in thread
From: David Sterba @ 2015-01-30 18:56 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Jan 30, 2015 at 05:20:52PM +0800, Qu Wenruo wrote:
> Since btrfs sysfs interfaces can start new transaction, we need to do it
> before transaction thread cleanup.
> 
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
> ---
> v5:
>   Newly introduced.
> ---
>  fs/btrfs/disk-io.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index f4d168d..7c185a0 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -3630,6 +3630,12 @@ void close_ctree(struct btrfs_root *root)
>  	fs_info->closing = 1;
>  	smp_mb();
>  
> +	/*
> +	 * Remove btrfs sysfs interfaces first,
> +	 * since it can start new transaction.

Please use the whole width of the line to format comments.


> +	 */
> +	btrfs_sysfs_remove_one(fs_info);
> +
>  	/* wait for the uuid_scan task to finish */
>  	down(&fs_info->uuid_tree_rescan_sem);
>  	/* avoid complains from lockdep et al., set sem back to initial state */

A few lines below, there are

btrfs_pause_balance
btrfs_dev_replace_suspend_for_unmount
btrfs_scrub_cancel

scrub and balance seem ok regarding sysfs, but dev-replace may replace
the sysfs entries if btrfs_dev_replace_finishing is called between

btrfs_sysfs_remove_one
...
btrfs_dev_replace_finishing <- wants to add/remove sysfs entries
...
btrfs_dev_replace_suspend_for_unmount

But all uses of sysfs objects or the btrfs_kobj objects do a NULL check
first and don't continue so this is safe and IMO best what we can have
to keep the code sane and lets us tear down sysfs exactly where you've
put it.

Reviewed-by: David Sterba <dsterba@suse.cz>

> @@ -3673,8 +3679,6 @@ void close_ctree(struct btrfs_root *root)
>  		       percpu_counter_sum(&fs_info->delalloc_bytes));
>  	}
>  
> -	btrfs_sysfs_remove_one(fs_info);
> -
>  	btrfs_free_fs_roots(fs_info);
>  
>  	btrfs_put_block_group_cache(fs_info);

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 6/9] vfs: Add sb_want_write() function to get vfsmount from a given sb.
  2015-01-30  9:20 ` [PATCH v5 6/9] vfs: Add sb_want_write() function to get vfsmount from a given sb Qu Wenruo
@ 2015-01-30 19:11   ` David Sterba
  0 siblings, 0 replies; 22+ messages in thread
From: David Sterba @ 2015-01-30 19:11 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs, linux-fsdevel

On Fri, Jan 30, 2015 at 05:20:51PM +0800, Qu Wenruo wrote:
> So introduce new sb_want_write() to do the protection agains a super
> block, which acts much like mnt_want_write() but will return success if
> the super block is read-write.

I like the new interface and think that it could be used in the per-fs
background threads instead of the raw MS_RDONLY checks. From a brief
look it may not apply everywhere as sb_start_write could wait, but there
are a few candidates.

> 
> Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>
> Signed-off-by: Al Viro <viro@ZenIV.linux.org.uk>
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

Reviewed-by: David Sterba <dsterba@suse.cz>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 5/9] btrfs: Use btrfs_test_trans_opt() to handle INODE_CACHE if it's under transaction protect.
  2015-01-30  9:20 ` [PATCH v5 5/9] btrfs: Use btrfs_test_trans_opt() to handle INODE_CACHE " Qu Wenruo
@ 2015-01-30 19:12   ` David Sterba
  0 siblings, 0 replies; 22+ messages in thread
From: David Sterba @ 2015-01-30 19:12 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Jan 30, 2015 at 05:20:50PM +0800, Qu Wenruo wrote:
> Convert btrfs_test_opt() to btrfs_test_trans_opt() if it's called under
> transaction protection.
> 
> This will ensure SPACE_CACHE bit is consistent during transaction.
> 
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

Reviewed-by: David Sterba <dsterba@suse.cz>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 4/9] btrfs: Use btrfs_test_trans_opt() to handle SPACE_CACHE if it's under transaction protect.
  2015-01-30  9:20 ` [PATCH v5 4/9] btrfs: Use btrfs_test_trans_opt() to handle SPACE_CACHE if it's under transaction protect Qu Wenruo
@ 2015-01-30 19:12   ` David Sterba
  0 siblings, 0 replies; 22+ messages in thread
From: David Sterba @ 2015-01-30 19:12 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Jan 30, 2015 at 05:20:49PM +0800, Qu Wenruo wrote:
> Convert btrfs_test_opt() to btrfs_test_trans_opt() if it's called under
> transaction protection.
> This will ensure SPACE_CACHE bit is consistent during transaction.
> 
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

Reviewed-by: David Sterba <dsterba@suse.cz>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 3/9] btrfs: Introduce per-transaction mount_opt to keep mount option consistent during transaction.
  2015-01-30  9:20 ` [PATCH v5 3/9] btrfs: Introduce per-transaction mount_opt to keep mount option consistent during transaction Qu Wenruo
@ 2015-01-30 19:12   ` David Sterba
  0 siblings, 0 replies; 22+ messages in thread
From: David Sterba @ 2015-01-30 19:12 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Jan 30, 2015 at 05:20:48PM +0800, Qu Wenruo wrote:
> Before this patch, mount_opt is not consistent during a transaction.
> btrfs_parse_options() can race with transaction.
> 
> Now each transaction will keep a copy of fs_info->mount_opt upon
> creation, and new btrfs_test_trans_opt() macro is introduced to get the
> mount_opt in the transaction.
> 
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

Reviewed-by: David Sterba <dsterba@suse.cz>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 1/9] Revert "btrfs: add support for processing pending changes" related commits
  2015-01-30  9:20 ` [PATCH v5 1/9] Revert "btrfs: add support for processing pending changes" related commits Qu Wenruo
@ 2015-01-30 19:14   ` David Sterba
  0 siblings, 0 replies; 22+ messages in thread
From: David Sterba @ 2015-01-30 19:14 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Jan 30, 2015 at 05:20:46PM +0800, Qu Wenruo wrote:
> This reverts commit 572d9ab7845 ~ a6f69dc8018.
> 
> This pending commits patches introduce deadlock with freeze, and fix for
> it will introduce extra checks on freeze and read only case.
> 
> For mount option change, later patches will introduce per-transaction
> mount option to keep mount options consistent during transaction.
> 
> For sysfs interface to change label/features, it will use
> mnt_want_write() to do the ro/freeze check and keep the 'btrfs pro set'
> synchronized behavior.
> 
> Revert them to a clean base for later changes.
> 
> Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>

With the improved fix in place

Acked-by: David Sterba <dsterba@suse.cz>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method.
  2015-01-30  9:20 [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method Qu Wenruo
                   ` (8 preceding siblings ...)
  2015-01-30  9:20 ` [PATCH v5 9/9] btrfs: Use sb_want_write() to protect sysfs feature change Qu Wenruo
@ 2015-01-30 19:17 ` David Sterba
  2015-01-31  2:30   ` Miao Xie
  9 siblings, 1 reply; 22+ messages in thread
From: David Sterba @ 2015-01-30 19:17 UTC (permalink / raw)
  To: Qu Wenruo; +Cc: linux-btrfs

On Fri, Jan 30, 2015 at 05:20:45PM +0800, Qu Wenruo wrote:
> [Use VFS protect for sysfs change]
> The 6th patch will introduce a new help function sb_want_write() to
> claim write permission to a superblock.
> With this, we are able to do write protection like mnt_want_write() but
> only needs to ensure that the superblock is writeable.
> This also keeps the same synchronized behavior using ioctl, which will
> block on frozen fs until it is unfrozen.

You know what I think abuot the commit inside sysfs, but it looks better
to me now with the sb_* protections so I give it a go.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method.
  2015-01-30 19:17 ` [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method David Sterba
@ 2015-01-31  2:30   ` Miao Xie
  2015-02-02  1:36     ` Qu Wenruo
  0 siblings, 1 reply; 22+ messages in thread
From: Miao Xie @ 2015-01-31  2:30 UTC (permalink / raw)
  To: dsterba, Qu Wenruo, linux-btrfs

On Fri, 30 Jan 2015 20:17:49 +0100, David Sterba wrote:
> On Fri, Jan 30, 2015 at 05:20:45PM +0800, Qu Wenruo wrote:
>> [Use VFS protect for sysfs change]
>> The 6th patch will introduce a new help function sb_want_write() to
>> claim write permission to a superblock.
>> With this, we are able to do write protection like mnt_want_write() but
>> only needs to ensure that the superblock is writeable.
>> This also keeps the same synchronized behavior using ioctl, which will
>> block on frozen fs until it is unfrozen.
> 
> You know what I think abuot the commit inside sysfs, but it looks better
> to me now with the sb_* protections so I give it a go.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

I worried about the following case

# fsfreeze btrfs
# echo "new label" > btrfs_sysfs
It should be hangup


On the other terminal
# umount btrfs


Because the 2nd echo command didn't increase mount reference, so umount
would not know someone still blocked on the fs, it would not go back and
return EBUSY like someone access the fs by common fs interface, it would
deactive fs directly and then blocked on sysfs removal.


Thanks
Miao

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method.
  2015-01-31  2:30   ` Miao Xie
@ 2015-02-02  1:36     ` Qu Wenruo
  0 siblings, 0 replies; 22+ messages in thread
From: Qu Wenruo @ 2015-02-02  1:36 UTC (permalink / raw)
  To: Miao Xie, dsterba, linux-btrfs; +Cc: Al Viro


-------- Original Message --------
Subject: Re: [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better 
method.
From: Miao Xie <miaoxie@huawei.com>
To: <dsterba@suse.cz>, Qu Wenruo <quwenruo@cn.fujitsu.com>, 
<linux-btrfs@vger.kernel.org>
Date: 2015年01月31日 10:30
> On Fri, 30 Jan 2015 20:17:49 +0100, David Sterba wrote:
>> On Fri, Jan 30, 2015 at 05:20:45PM +0800, Qu Wenruo wrote:
>>> [Use VFS protect for sysfs change]
>>> The 6th patch will introduce a new help function sb_want_write() to
>>> claim write permission to a superblock.
>>> With this, we are able to do write protection like mnt_want_write() but
>>> only needs to ensure that the superblock is writeable.
>>> This also keeps the same synchronized behavior using ioctl, which will
>>> block on frozen fs until it is unfrozen.
>> You know what I think abuot the commit inside sysfs, but it looks better
>> to me now with the sb_* protections so I give it a go.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> I worried about the following case
>
> # fsfreeze btrfs
> # echo "new label" > btrfs_sysfs
> It should be hangup
Yes, same as "btrfs pro set <mnt> label "
>
>
> On the other terminal
> # umount btrfs
Oh, that's a problem.
It can be umounted, but....
>
>
> Because the 2nd echo command didn't increase mount reference, so umount
> would not know someone still blocked on the fs, it would not go back and
> return EBUSY like someone access the fs by common fs interface, it would
> deactive fs directly and then blocked on sysfs removal.
The btrfs module can't be removed, but not blocked. Just return -EBUSY.
So at least it won't cause annoying block problem.

Although hard for sysadmin to find pinpoint the real block reason, it is 
still resolvable, mount again and unfreeze
can resolve it.
But it never seems to be a reasonable behavior for me either.

Just as you mentioned, in sysfs, we only have sb, no vfsmount, so 
without huge VFS change(*),
we could either accept the wrong method, or drop all write sysfs support 
for transaction-invoking features.

*: Maybe we can add ref counts to sb, and umount syscall checks for the 
sb ref counts if the mount point
is the last vfsmount of it. I'm not sure whether it is worthy but it 
seems valid.

Thanks,
Qu

>
>
> Thanks
> Miao


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2015-02-02  1:34 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-30  9:20 [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method Qu Wenruo
2015-01-30  9:20 ` [PATCH v5 1/9] Revert "btrfs: add support for processing pending changes" related commits Qu Wenruo
2015-01-30 19:14   ` David Sterba
2015-01-30  9:20 ` [PATCH v5 2/9] btrfs: Make btrfs_parse_options() parse mount option in a atomic way Qu Wenruo
2015-01-30 18:28   ` David Sterba
2015-01-30  9:20 ` [PATCH v5 3/9] btrfs: Introduce per-transaction mount_opt to keep mount option consistent during transaction Qu Wenruo
2015-01-30 19:12   ` David Sterba
2015-01-30  9:20 ` [PATCH v5 4/9] btrfs: Use btrfs_test_trans_opt() to handle SPACE_CACHE if it's under transaction protect Qu Wenruo
2015-01-30 19:12   ` David Sterba
2015-01-30  9:20 ` [PATCH v5 5/9] btrfs: Use btrfs_test_trans_opt() to handle INODE_CACHE " Qu Wenruo
2015-01-30 19:12   ` David Sterba
2015-01-30  9:20 ` [PATCH v5 6/9] vfs: Add sb_want_write() function to get vfsmount from a given sb Qu Wenruo
2015-01-30 19:11   ` David Sterba
2015-01-30  9:20 ` [PATCH v5 7/9] btrfs: Move btrfs_sysfs_remove_one(fs_info) before transcation thread cleanup Qu Wenruo
2015-01-30 18:56   ` David Sterba
2015-01-30  9:20 ` [PATCH v5 8/9] btrfs: Use sb_want_write() to protect label change Qu Wenruo
2015-01-30 18:34   ` David Sterba
2015-01-30  9:20 ` [PATCH v5 9/9] btrfs: Use sb_want_write() to protect sysfs feature change Qu Wenruo
2015-01-30 18:35   ` David Sterba
2015-01-30 19:17 ` [PATCH v5 0/9] btrfs: Fix freeze/sysfs deadlock in better method David Sterba
2015-01-31  2:30   ` Miao Xie
2015-02-02  1:36     ` Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.