linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -next v4 0/3] Fix some issues about mmp
@ 2021-10-19 12:39 Ye Bin
  2021-10-19 12:39 ` [PATCH -next v4 1/3] ext4: compare to local seq and nodename when check conflict Ye Bin
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Ye Bin @ 2021-10-19 12:39 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: linux-kernel, jack, Ye Bin

I test mmp function as follow steps:
1. Inject delay 5s in ext4_multi_mount_protect function after
"skip:" label.
2. Share HostA block device(sda) with HostB(nbd0) by NBD.
3. Enable mmp feature when mkfs.ext4 sda.
4. Mount sda and nbd0 at the same time.

I found kmmpd never trigger detect multi-mount. Reason is as follow:
1. Kmmpd init seq with 0, if two host have same nodename, will lead to
detect confliction very slow, even never detect confliction.
2. When detect confliction in kmmpd, we get 'check_bh' is same with 'bh'.
so we compare data with itself.
3. We only trigger detect when ”diff > mmp_check_interval * HZ“,
'mmp_check_interval' is double of 'mmp_update_interval', 'diff' is
about 'mmp_update_interval'. So 'diff' is little than 'mmp_check_interval * HZ'
normaly. As Jan Kara explain as follows:
"I think the check is there only for the case where write_mmp_block() +
sleep took longer than mmp_check_interval. I agree that should rarely
happen but on a really busy system it is possible and in that case we would
miss updating mmp block for too long and so another node could have started
using the filesystem. "

v1->v2:
Fix 'last_check_time' not initialized before checking.

v2->v3:
1. drop commit "ext4: introduce last_check_time record previous check time"
As Ted explain as follows:
"I'd like Andreas to comment here.  My understanding is that MMP
originally intended as a safety mechanism which would be used as part
of a primary/backup high availability system, but not as the *primary*
system where you might try to have two servers simultaneously try to
mount the file system and use MMP as the "election" mechanism to
decide which server is going to be the primary system, and which would
be the backup system.

The cost of being able to handle this particular race is it would slow
down the mounts of cleanly unmounted systems.

There *are* better systems to implement leader elections[1] than using
MMP.  Most of these more efficient leader elections assume that you
have a working IP network, and so if you have a separate storage
network (including a shared SCSI bus) from your standard IP network,
then MMP is a useful failsafe in the face of a network partition of
your IP network.  The question is whether MMP should be useful for
more than that.  And if it isn't, then we should probably document
what MMP is and isn't good for, and give advice in the form of an
application note for how MMP should be used in the context of a larger
system."
2. drop commit "ext4: fix possible store wrong check interval value in disk when umount"
3. simplify read_mmp_block fucntion to avoid UAF

v3->v4:
1. drop commit "ext4: init 'seq' with the value which set in 'ext4_multi_mount_protect'"
2. merge "ext4: get buffer head before read_mmp_block" to
"ext4: simplify read_mmp_block fucntion"
3. rename "ext4: avoid to re-read mmp check data get from page cache" to
"ext4: remove useless bh_check variable"
4. reorder "ext4: remove useless bh_check variable" and 
"ext4: simplify read_mmp_block fucntion"

Ye Bin (3):
  ext4: compare to local seq and nodename when check conflict
  ext4: remove useless bh_check variable
  ext4: simplify read_mmp_block fucntion

 fs/ext4/ext4.h |  5 +++-
 fs/ext4/mmp.c  | 64 +++++++++++++++++++++-----------------------------
 2 files changed, 31 insertions(+), 38 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH -next v4 1/3] ext4: compare to local seq and nodename when check conflict
  2021-10-19 12:39 [PATCH -next v4 0/3] Fix some issues about mmp Ye Bin
@ 2021-10-19 12:39 ` Ye Bin
  2021-10-19 12:39 ` [PATCH -next v4 2/3] ext4: remove useless bh_check variable Ye Bin
  2021-10-19 12:39 ` [PATCH -next v4 3/3] ext4: simplify read_mmp_block fucntion Ye Bin
  2 siblings, 0 replies; 6+ messages in thread
From: Ye Bin @ 2021-10-19 12:39 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: linux-kernel, jack, Ye Bin

As mmp and check_mmp is point to the same data, so there will never
detect conflict.
To solve this issue just compare to local data.

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/ext4.h | 5 ++++-
 fs/ext4/mmp.c  | 9 +++++----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 404dd50856e5..9a487a558787 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2601,6 +2601,8 @@ struct ext4_features {
 #define EXT4_MMP_SEQ_FSCK  0xE24D4D50U /* mmp_seq value when being fscked */
 #define EXT4_MMP_SEQ_MAX   0xE24D4D4FU /* maximum valid mmp_seq value */
 
+#define EXT4_MMP_NODENAME_LEN   64 /* mmp_nodename length */
+
 struct mmp_struct {
 	__le32	mmp_magic;		/* Magic number for MMP */
 	__le32	mmp_seq;		/* Sequence no. updated periodically */
@@ -2610,7 +2612,8 @@ struct mmp_struct {
 	 * purposes and do not affect the correctness of the algorithm
 	 */
 	__le64	mmp_time;		/* Time last updated */
-	char	mmp_nodename[64];	/* Node which last updated MMP block */
+	/* Node which last updated MMP block */
+	char	mmp_nodename[EXT4_MMP_NODENAME_LEN];
 	char	mmp_bdevname[32];	/* Bdev which last updated MMP block */
 
 	/*
diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c
index cebea4270817..97d5a8136eb2 100644
--- a/fs/ext4/mmp.c
+++ b/fs/ext4/mmp.c
@@ -138,6 +138,7 @@ static int kmmpd(void *data)
 	unsigned mmp_check_interval;
 	unsigned long last_update_time;
 	unsigned long diff;
+	char nodename[EXT4_MMP_NODENAME_LEN];
 	int retval = 0;
 
 	mmp_block = le64_to_cpu(es->s_mmp_block);
@@ -153,8 +154,8 @@ static int kmmpd(void *data)
 	BUILD_BUG_ON(sizeof(mmp->mmp_bdevname) < BDEVNAME_SIZE);
 	bdevname(bh->b_bdev, mmp->mmp_bdevname);
 
-	memcpy(mmp->mmp_nodename, init_utsname()->nodename,
-	       sizeof(mmp->mmp_nodename));
+	memcpy(nodename, init_utsname()->nodename, sizeof(nodename));
+	memcpy(mmp->mmp_nodename, nodename, sizeof(mmp->mmp_nodename));
 
 	while (!kthread_should_stop() && !sb_rdonly(sb)) {
 		if (!ext4_has_feature_mmp(sb)) {
@@ -206,8 +207,8 @@ static int kmmpd(void *data)
 			}
 
 			mmp_check = (struct mmp_struct *)(bh_check->b_data);
-			if (mmp->mmp_seq != mmp_check->mmp_seq ||
-			    memcmp(mmp->mmp_nodename, mmp_check->mmp_nodename,
+			if (seq != mmp_check->mmp_seq ||
+			    memcmp(nodename, mmp_check->mmp_nodename,
 				   sizeof(mmp->mmp_nodename))) {
 				dump_mmp_msg(sb, mmp_check,
 					     "Error while updating MMP info. "
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH -next v4 2/3] ext4: remove useless bh_check variable
  2021-10-19 12:39 [PATCH -next v4 0/3] Fix some issues about mmp Ye Bin
  2021-10-19 12:39 ` [PATCH -next v4 1/3] ext4: compare to local seq and nodename when check conflict Ye Bin
@ 2021-10-19 12:39 ` Ye Bin
  2021-10-19 12:39 ` [PATCH -next v4 3/3] ext4: simplify read_mmp_block fucntion Ye Bin
  2 siblings, 0 replies; 6+ messages in thread
From: Ye Bin @ 2021-10-19 12:39 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: linux-kernel, jack, Ye Bin

Since we initialize 'bh_check' to NULL and pass it to read_mmp_block(), that
function will just call sb_getblk() which will just return the buffer_head
we have in 'bh'. So just remove the pointless 'bh_check' variable and use
'bh' directly.

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/mmp.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c
index 97d5a8136eb2..9788c617e593 100644
--- a/fs/ext4/mmp.c
+++ b/fs/ext4/mmp.c
@@ -195,10 +195,7 @@ static int kmmpd(void *data)
 		 */
 		diff = jiffies - last_update_time;
 		if (diff > mmp_check_interval * HZ) {
-			struct buffer_head *bh_check = NULL;
-			struct mmp_struct *mmp_check;
-
-			retval = read_mmp_block(sb, &bh_check, mmp_block);
+			retval = read_mmp_block(sb, &bh, mmp_block);
 			if (retval) {
 				ext4_error_err(sb, -retval,
 					       "error reading MMP data: %d",
@@ -206,20 +203,18 @@ static int kmmpd(void *data)
 				goto wait_to_exit;
 			}
 
-			mmp_check = (struct mmp_struct *)(bh_check->b_data);
-			if (seq != mmp_check->mmp_seq ||
-			    memcmp(nodename, mmp_check->mmp_nodename,
-				   sizeof(mmp->mmp_nodename))) {
-				dump_mmp_msg(sb, mmp_check,
+			mmp = (struct mmp_struct *)(bh->b_data);
+			if (seq != le32_to_cpu(mmp->mmp_seq) ||
+			    memcmp(nodename, mmp->mmp_nodename,
+				    sizeof(nodename))) {
+				dump_mmp_msg(sb, mmp,
 					     "Error while updating MMP info. "
 					     "The filesystem seems to have been"
 					     " multiply mounted.");
 				ext4_error_err(sb, EBUSY, "abort");
-				put_bh(bh_check);
 				retval = -EBUSY;
 				goto wait_to_exit;
 			}
-			put_bh(bh_check);
 		}
 
 		 /*
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH -next v4 3/3] ext4: simplify read_mmp_block fucntion
  2021-10-19 12:39 [PATCH -next v4 0/3] Fix some issues about mmp Ye Bin
  2021-10-19 12:39 ` [PATCH -next v4 1/3] ext4: compare to local seq and nodename when check conflict Ye Bin
  2021-10-19 12:39 ` [PATCH -next v4 2/3] ext4: remove useless bh_check variable Ye Bin
@ 2021-10-19 12:39 ` Ye Bin
  2021-10-19 13:49   ` Jan Kara
  2 siblings, 1 reply; 6+ messages in thread
From: Ye Bin @ 2021-10-19 12:39 UTC (permalink / raw)
  To: tytso, adilger.kernel, linux-ext4; +Cc: linux-kernel, jack, Ye Bin

This patch is according to Jan Kara's suggestion:
I guess I would just get rid of sb_getblk() in read_mmp_block() and always
expect valid bh passed. The only place that passes NULL bh after this
patch is one case in ext4_multi_mount_protect() and that can call
sb_getblk() on its own. That way we can also simplify read_mmp_block()
prototype to:

static int read_mmp_block(struct super_block *sb, struct buffer_head *bh);

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/mmp.c | 44 +++++++++++++++++++-------------------------
 1 file changed, 19 insertions(+), 25 deletions(-)

diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c
index 9788c617e593..3a7370c88934 100644
--- a/fs/ext4/mmp.c
+++ b/fs/ext4/mmp.c
@@ -64,33 +64,26 @@ static int write_mmp_block(struct super_block *sb, struct buffer_head *bh)
 /*
  * Read the MMP block. It _must_ be read from disk and hence we clear the
  * uptodate flag on the buffer.
+ * Caller must ensure pass valid 'bh'.
  */
-static int read_mmp_block(struct super_block *sb, struct buffer_head **bh,
-			  ext4_fsblk_t mmp_block)
+static int read_mmp_block(struct super_block *sb, struct buffer_head *bh)
 {
 	struct mmp_struct *mmp;
 	int ret;
 
-	if (*bh)
-		clear_buffer_uptodate(*bh);
-
-	/* This would be sb_bread(sb, mmp_block), except we need to be sure
-	 * that the MD RAID device cache has been bypassed, and that the read
-	 * is not blocked in the elevator. */
-	if (!*bh) {
-		*bh = sb_getblk(sb, mmp_block);
-		if (!*bh) {
-			ret = -ENOMEM;
-			goto warn_exit;
-		}
+	if (!bh) {
+		ret = -EINVAL;
+		goto warn_exit;
 	}
 
-	lock_buffer(*bh);
-	ret = ext4_read_bh(*bh, REQ_META | REQ_PRIO, NULL);
+	clear_buffer_uptodate(bh);
+
+	lock_buffer(bh);
+	ret = ext4_read_bh(bh, REQ_META | REQ_PRIO, NULL);
 	if (ret)
 		goto warn_exit;
 
-	mmp = (struct mmp_struct *)((*bh)->b_data);
+	mmp = (struct mmp_struct *)((bh)->b_data);
 	if (le32_to_cpu(mmp->mmp_magic) != EXT4_MMP_MAGIC) {
 		ret = -EFSCORRUPTED;
 		goto warn_exit;
@@ -101,10 +94,7 @@ static int read_mmp_block(struct super_block *sb, struct buffer_head **bh,
 	}
 	return 0;
 warn_exit:
-	brelse(*bh);
-	*bh = NULL;
-	ext4_warning(sb, "Error %d while reading MMP block %llu",
-		     ret, mmp_block);
+	ext4_warning(sb, "Error %d while reading MMP block", ret);
 	return ret;
 }
 
@@ -195,7 +185,7 @@ static int kmmpd(void *data)
 		 */
 		diff = jiffies - last_update_time;
 		if (diff > mmp_check_interval * HZ) {
-			retval = read_mmp_block(sb, &bh, mmp_block);
+			retval = read_mmp_block(sb, bh);
 			if (retval) {
 				ext4_error_err(sb, -retval,
 					       "error reading MMP data: %d",
@@ -289,7 +279,11 @@ int ext4_multi_mount_protect(struct super_block *sb,
 		goto failed;
 	}
 
-	retval = read_mmp_block(sb, &bh, mmp_block);
+	bh = sb_getblk(sb, mmp_block);
+	if (bh)
+		goto failed;
+
+	retval = read_mmp_block(sb, bh);
 	if (retval)
 		goto failed;
 
@@ -327,7 +321,7 @@ int ext4_multi_mount_protect(struct super_block *sb,
 		goto failed;
 	}
 
-	retval = read_mmp_block(sb, &bh, mmp_block);
+	retval = read_mmp_block(sb, bh);
 	if (retval)
 		goto failed;
 	mmp = (struct mmp_struct *)(bh->b_data);
@@ -356,7 +350,7 @@ int ext4_multi_mount_protect(struct super_block *sb,
 		goto failed;
 	}
 
-	retval = read_mmp_block(sb, &bh, mmp_block);
+	retval = read_mmp_block(sb, bh);
 	if (retval)
 		goto failed;
 	mmp = (struct mmp_struct *)(bh->b_data);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH -next v4 3/3] ext4: simplify read_mmp_block fucntion
  2021-10-19 12:39 ` [PATCH -next v4 3/3] ext4: simplify read_mmp_block fucntion Ye Bin
@ 2021-10-19 13:49   ` Jan Kara
  2021-10-20  2:53     ` yebin
  0 siblings, 1 reply; 6+ messages in thread
From: Jan Kara @ 2021-10-19 13:49 UTC (permalink / raw)
  To: Ye Bin; +Cc: tytso, adilger.kernel, linux-ext4, linux-kernel, jack

On Tue 19-10-21 20:39:31, Ye Bin wrote:
> This patch is according to Jan Kara's suggestion:
> I guess I would just get rid of sb_getblk() in read_mmp_block() and always
> expect valid bh passed. The only place that passes NULL bh after this
> patch is one case in ext4_multi_mount_protect() and that can call
> sb_getblk() on its own. That way we can also simplify read_mmp_block()
> prototype to:
> 
> static int read_mmp_block(struct super_block *sb, struct buffer_head *bh);
> 
> Signed-off-by: Ye Bin <yebin10@huawei.com>
> Reviewed-by: Jan Kara <jack@suse.cz>

...

> @@ -289,7 +279,11 @@ int ext4_multi_mount_protect(struct super_block *sb,
>  		goto failed;
>  	}
>  
> -	retval = read_mmp_block(sb, &bh, mmp_block);
> +	bh = sb_getblk(sb, mmp_block);
> +	if (bh)
	^^^^^^

!bh here, please.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH -next v4 3/3] ext4: simplify read_mmp_block fucntion
  2021-10-19 13:49   ` Jan Kara
@ 2021-10-20  2:53     ` yebin
  0 siblings, 0 replies; 6+ messages in thread
From: yebin @ 2021-10-20  2:53 UTC (permalink / raw)
  To: Jan Kara; +Cc: tytso, adilger.kernel, linux-ext4, linux-kernel



On 2021/10/19 21:49, Jan Kara wrote:
> On Tue 19-10-21 20:39:31, Ye Bin wrote:
>> This patch is according to Jan Kara's suggestion:
>> I guess I would just get rid of sb_getblk() in read_mmp_block() and always
>> expect valid bh passed. The only place that passes NULL bh after this
>> patch is one case in ext4_multi_mount_protect() and that can call
>> sb_getblk() on its own. That way we can also simplify read_mmp_block()
>> prototype to:
>>
>> static int read_mmp_block(struct super_block *sb, struct buffer_head *bh);
>>
>> Signed-off-by: Ye Bin <yebin10@huawei.com>
>> Reviewed-by: Jan Kara <jack@suse.cz>
> ...
>
>> @@ -289,7 +279,11 @@ int ext4_multi_mount_protect(struct super_block *sb,
>>   		goto failed;
>>   	}
>>   
>> -	retval = read_mmp_block(sb, &bh, mmp_block);
>> +	bh = sb_getblk(sb, mmp_block);
>> +	if (bh)
> 	^^^^^^
>
> !bh here, please.
Sorry,it's my fault. I will fix it and test  this patch set base on 
linux mainline.
>
> 								Honza
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-10-20  2:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-19 12:39 [PATCH -next v4 0/3] Fix some issues about mmp Ye Bin
2021-10-19 12:39 ` [PATCH -next v4 1/3] ext4: compare to local seq and nodename when check conflict Ye Bin
2021-10-19 12:39 ` [PATCH -next v4 2/3] ext4: remove useless bh_check variable Ye Bin
2021-10-19 12:39 ` [PATCH -next v4 3/3] ext4: simplify read_mmp_block fucntion Ye Bin
2021-10-19 13:49   ` Jan Kara
2021-10-20  2:53     ` yebin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).