From: "Theodore Ts'o" <tytso@mit.edu>
To: Jan Kara <jack@suse.cz>
Cc: yebin <yebin10@huawei.com>,
	adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH -next v2 2/6] ext4: introduce last_check_time record previous check time
Date: Wed, 13 Oct 2021 17:41:12 -0400
Message-ID: <YWdSeMuosYio7TFv@mit.edu>
In-Reply-To: <20211013093847.GB19200@quack2.suse.cz>

On Wed, Oct 13, 2021 at 11:38:47AM +0200, Jan Kara wrote:
> 
> OK, I see. So the race in ext4_multi_mount_protect() goes like:
> 
> hostA				hostB
> 
> read_mmp_block()		read_mmp_block()
> - sees EXT4_MMP_SEQ_CLEAN	- sees EXT4_MMP_SEQ_CLEAN
> write_mmp_block()
> wait_time == 0 -> no wait
> read_mmp_block()
>   - all OK, mount
> 				write_mmp_block()
> 				wait_time == 0 -> no wait
> 				read_mmp_block()
> 				  - all OK, mount
> 
> Do I get it right? Actually, if we passed seq we wrote in
> ext4_multi_mount_protect() to kmmpd (probably in sb), then kmmpd would
> notice the conflict on its first invocation but still that would be a bit
> late because there would be a time window where hostA and hostB would be
> both using the fs.
> 
> We could reduce the likelihood of this race by always waiting in
> ext4_multi_mount_protect() between write & read but I guess that is
> undesirable as it would slow down all clean mounts. Ted?

I'd like Andreas to comment here.  My understanding is that MMP was
originally intended as a safety mechanism to be used as part of a
primary/backup high availability system, not as the *primary*
mechanism, where two servers might simultaneously try to mount the
file system and rely on MMP as the "election" mechanism to decide
which server becomes the primary system and which becomes the backup
system.

The cost of handling this particular race is that it would slow down
mounts of cleanly unmounted file systems.
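
To make the trade-off concrete, here is a toy simulation of the
interleaving Jan describes.  It is only a sketch and not kernel code:
the names mount(), simulate(), write_delay and wait_time are
hypothetical, CLEAN is a stand-in for EXT4_MMP_SEQ_CLEAN rather than
the real on-disk value, and time is modelled as abstract ticks.  Each
host reads the block, writes its own sequence number, optionally
waits, and then re-reads to check that its value is still there.

```python
import heapq

CLEAN = "CLEAN"  # stand-in for EXT4_MMP_SEQ_CLEAN, not the real on-disk value


def mount(disk, seq, write_delay, wait_time):
    """One host's mount attempt; each yield is a sleep of that many ticks."""
    seen = disk["seq"]              # first read_mmp_block()
    if seen != CLEAN:
        return False                # someone else owns the fs, refuse to mount
    yield write_delay               # hypothetical I/O skew before our write lands
    disk["seq"] = seq               # write_mmp_block()
    yield wait_time                 # wait_time == 0 models "no wait"
    return disk["seq"] == seq       # second read_mmp_block(): is it still ours?


def simulate(wait_time):
    """Run hostA and hostB against one shared MMP block; return who mounted."""
    disk = {"seq": CLEAN}
    hosts = {
        "hostA": mount(disk, seq=1, write_delay=1, wait_time=wait_time),
        "hostB": mount(disk, seq=2, write_delay=3, wait_time=wait_time),
    }
    queue = [(0, name) for name in hosts]   # (wake-up tick, host name)
    heapq.heapify(queue)
    mounted = []
    while queue:
        now, name = heapq.heappop(queue)
        try:
            delay = next(hosts[name])       # run the host up to its next sleep
            heapq.heappush(queue, (now + delay, name))
        except StopIteration as done:       # generator returned its verdict
            if done.value:
                mounted.append(name)
    return mounted


if __name__ == "__main__":
    print(simulate(wait_time=0))   # no wait: both hosts believe they own the fs
    print(simulate(wait_time=1))   # wait shorter than the write skew: still both
    print(simulate(wait_time=5))   # wait longer than the skew: only one mounts
```

In this model the wait only helps when it is longer than the skew
between the two writes, which is why it reduces the likelihood of the
race rather than eliminating it, and every clean mount pays for the
wait whether or not a second host is present.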

There *are* better systems to implement leader elections[1] than using
MMP.  Most of these more efficient leader elections assume that you
have a working IP network, and so if you have a separate storage
network (including a shared SCSI bus) from your standard IP network,
then MMP is a useful failsafe in the face of a network partition of
your IP network.  The question is whether MMP should be useful for
more than that.  And if it isn't, then we should probably document
what MMP is and isn't good for, and give advice in the form of an
application note for how MMP should be used in the context of a larger
system.

[1] https://en.wikipedia.org/wiki/Leader_election

						- Ted
