linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Al Viro <viro@ZenIV.linux.org.uk>
To: Jan Kara <jack@suse.cz>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	Mikulas Patocka <mpatocka@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: linux-next: build failure after merge of the vfs tree
Date: Wed, 4 Jan 2012 02:17:54 +0000	[thread overview]
Message-ID: <20120104021754.GD23916@ZenIV.linux.org.uk> (raw)
In-Reply-To: <20120103144531.GA23916@ZenIV.linux.org.uk>

On Tue, Jan 03, 2012 at 02:45:32PM +0000, Al Viro wrote:
> On Tue, Jan 03, 2012 at 02:39:42PM +0100, Jan Kara wrote:
> 
> >   Thanks Stephen! Al, how shall we resolve this? You wrote you can provide
> > a VFS helper like get_super() which will also guarantee that the fs is
> > unfrozen.  That could be used in quotactl_block() and fsync_bdev(). If you
> > plan to do this for 3.3 then I can just remove the quota fix and let you
> > do it.
> 
> I started digging in that area and I really don't like what I'm seeing.
> sget() race fix from Aug 2010 (MS_BORN one) had not covered all cases.
> The thing is, we can get hit with this:
> 	1) mount(2) does sget(), etc. and fails very late in the game - with
> ->s_root already allocated.  For some filesystems such failure exits are
> possible.
> 	2) something crawling through the superblock list finds our new
> sb before we realize it's doomed.  Tries to grab s_umount, gets blocked.
> 	3) in the meanwhile *another* mount(2) does sget() that catches
> the same sb and decides to pick it.  ->s_active is grabbed, we get blocked
> on attempt to get ->s_umount exclusive.
> 	4) the original mount(2) gets to the failure point and does
> deactivate_locked_super().  ->s_active is decremented, ->s_umount unlocked.
> However, because of (3) ->s_active does not reach 0 yet.  Guy stuck in (2)
> gets to run.  ->s_root is non-NULL here.  And fs is not in a good shape...
> 	5) sget() from (3) gets to ->s_umount, notices that MS_BORN hadn't
> been set and does deactivate_locked_super().  Now ->s_active is 0 and
> we get around to shutting the sucker down.  ->kill_sb() gets called, ->s_root
> is dropped, etc. - the whole nine yards.  Caller of sget() had been saved from
> the race.  However, whoever that had been in (2) and (4) still got hit.
> 
> IOW, MS_BORN check is needed in the places that go through the superblock
> list, grab ->s_umount and check ->s_root.  That will close the hole for
> good.
> 
> We also have a problem in get_active_super() caller; again, the missing MS_BORN
> check (in freeze_super(), after getting ->s_umount).

These should be fixed by the patch below; I don't think that it needs
to go in 3.2-final (the three-way race in question is neither new nor
easy to hit and it's a non-issue for most of the filesystem types), but it's
definitely a -stable fodder.  Bugs on failure exits in ->mount() instances
(at the very least, ext4 and minixfs are leaking dentry on some of those)
are separate story; so's dealing with get_super() and user_get_super()
vs. thaw.  Mikulas: adding an extra argument deciding whether we wait for
thaw is no-go - it's worse than splitting get_super() in two functions;
the effect is the same, but misuses are harder to spot.  I'm still not
sure about ->statfs(), BTW - any input on that would be welcome.  Can
it end up blocked on a frozen fs until said fs is thawed?  We either need
to convert ustat(2) to "wait for thaw" semantics (should be interruptible,
BTW) or document that ->statfs() is not allowed to wait for thawing.
It's far too subtle to leave undocumented...

commit a33a3874084823990acc2a857e54a93e0611a7ff
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Tue Jan 3 21:01:29 2012 -0500

    vfs: fix the rest of sget() races
    
    unfortunately, just checking MS_BORN after having grabbed ->s_umount in
    sget() is not enough; places that pick superblock from a list and
    grab s_umount shared need the same check in addition to checking for
    ->s_root; otherwise three-way race between failing mount, sget() and
    such list-walker can leave us with list-walker coming *second*, when
    temporary active ref grabbed by sget() (to be dropped when sget()
    notices that original mount has failed by checking MS_BORN) has
    lead to deactivate_locked_super() from failing ->mount() *not* doing
    ->kill_sb() and just releasing ->s_umount.  Once sget() gets through
    and notices that MS_BORN had never been set it will drop the active
    ref and fs will be shut down and kicked out of all lists, but it's
    too late for something like sync_supers().
    
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

diff --git a/fs/super.c b/fs/super.c
index e25b2e8..1b0c814 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -340,7 +340,7 @@ bool grab_super_passive(struct super_block *sb)
 	spin_unlock(&sb_lock);
 
 	if (down_read_trylock(&sb->s_umount)) {
-		if (sb->s_root)
+		if (sb->s_root && (sb->s_flags & MS_BORN))
 			return true;
 		up_read(&sb->s_umount);
 	}
@@ -508,7 +508,7 @@ void sync_supers(void)
 			spin_unlock(&sb_lock);
 
 			down_read(&sb->s_umount);
-			if (sb->s_root && sb->s_dirt)
+			if (sb->s_root && sb->s_dirt && (sb->s_flags & MS_BORN))
 				sb->s_op->write_super(sb);
 			up_read(&sb->s_umount);
 
@@ -543,7 +543,7 @@ void iterate_supers(void (*f)(struct super_block *, void *), void *arg)
 		spin_unlock(&sb_lock);
 
 		down_read(&sb->s_umount);
-		if (sb->s_root)
+		if (sb->s_root && (sb->s_flags & MS_BORN))
 			f(sb, arg);
 		up_read(&sb->s_umount);
 
@@ -578,7 +578,7 @@ void iterate_supers_type(struct file_system_type *type,
 		spin_unlock(&sb_lock);
 
 		down_read(&sb->s_umount);
-		if (sb->s_root)
+		if (sb->s_root && (sb->s_flags & MS_BORN))
 			f(sb, arg);
 		up_read(&sb->s_umount);
 
@@ -619,7 +619,7 @@ rescan:
 			spin_unlock(&sb_lock);
 			down_read(&sb->s_umount);
 			/* still alive? */
-			if (sb->s_root)
+			if (sb->s_root && (sb->s_flags & MS_BORN))
 				return sb;
 			up_read(&sb->s_umount);
 			/* nope, got unmounted */
@@ -679,7 +679,7 @@ rescan:
 			spin_unlock(&sb_lock);
 			down_read(&sb->s_umount);
 			/* still alive? */
-			if (sb->s_root)
+			if (sb->s_root && (sb->s_flags & MS_BORN))
 				return sb;
 			up_read(&sb->s_umount);
 			/* nope, got unmounted */
@@ -776,7 +776,8 @@ static void do_emergency_remount(struct work_struct *work)
 		sb->s_count++;
 		spin_unlock(&sb_lock);
 		down_write(&sb->s_umount);
-		if (sb->s_root && sb->s_bdev && !(sb->s_flags & MS_RDONLY)) {
+		if (sb->s_root && sb->s_bdev && (sb->s_flags & MS_BORN) &&
+		    !(sb->s_flags & MS_RDONLY)) {
 			/*
 			 * What lock protects sb->s_flags??
 			 */
@@ -1159,6 +1160,11 @@ int freeze_super(struct super_block *sb)
 		return -EBUSY;
 	}
 
+	if (!(sb->s_flags & MS_BORN)) {
+		up_write(&sb->s_umount);
+		return 0;	/* sic - it's "nothing to do" */
+	}
+
 	if (sb->s_flags & MS_RDONLY) {
 		sb->s_frozen = SB_FREEZE_TRANS;
 		smp_wmb();

  reply	other threads:[~2012-01-04  2:18 UTC|newest]

Thread overview: 138+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-01-03  1:43 linux-next: build failure after merge of the vfs tree Stephen Rothwell
2012-01-03 13:39 ` Jan Kara
2012-01-03 14:45   ` Al Viro
2012-01-04  2:17     ` Al Viro [this message]
2012-01-04  2:50       ` Dave Chinner
2012-01-04 18:00         ` Jan Kara
2012-01-04 18:47           ` Christoph Hellwig
2012-01-04 22:26             ` Jan Kara
  -- strict thread matches above, loose matches on Subject: below --
2021-01-04 22:36 Stephen Rothwell
2021-01-04 23:28 ` Al Viro
2020-10-27  4:14 Stephen Rothwell
2020-10-27  4:59 ` Al Viro
2020-11-10 19:00   ` Al Viro
2020-11-10 21:24     ` Stephen Rothwell
2020-09-28  1:31 Stephen Rothwell
2020-09-28  6:05 ` Christoph Hellwig
2020-09-24  8:30 Stephen Rothwell
2020-09-24 20:08 ` Al Viro
2020-09-25 12:01   ` Stephen Rothwell
2020-09-25 13:38     ` Al Viro
2020-09-29  4:10       ` Josh Poimboeuf
2020-10-06 14:30         ` Josh Poimboeuf
2020-10-06 21:04           ` Stephen Rothwell
2020-10-07 15:46             ` Josh Poimboeuf
2020-07-29  1:56 Stephen Rothwell
2020-07-29  6:33 ` Christoph Hellwig
2020-07-29 19:19   ` Al Viro
2020-07-27 12:06 Stephen Rothwell
2020-05-07  0:39 Stephen Rothwell
2020-05-07  2:35 ` Al Viro
2020-05-07 15:07   ` Jens Axboe
2020-01-10  6:57 Stephen Rothwell
2020-01-10 10:00 ` Carlos Maiolino
2020-01-10 11:03 ` Carlos Maiolino
2020-01-10 22:44   ` Stephen Rothwell
2020-01-13  9:28     ` Carlos Maiolino
2020-01-24  2:41 ` Stephen Rothwell
2020-01-29 22:40   ` Stephen Rothwell
2019-01-02  4:01 Stephen Rothwell
2019-01-30  3:45 ` Stephen Rothwell
2018-10-03  0:32 Stephen Rothwell
2018-10-16  0:17 ` Stephen Rothwell
2018-10-16 16:37   ` Jaegeuk Kim
2018-10-16 20:45     ` Stephen Rothwell
2018-09-10  3:59 Stephen Rothwell
2018-09-10  3:35 Stephen Rothwell
2018-09-18 21:38 ` Stephen Rothwell
2018-09-18 22:17 ` David Howells
2018-09-18 23:49   ` Stephen Rothwell
2018-09-19  7:17     ` Geert Uytterhoeven
2018-09-19  6:01   ` David Howells
2018-09-19  6:31     ` Stephen Rothwell
2018-09-20 10:48       ` Michael Ellerman
2018-09-20 16:20       ` David Howells
2018-09-20 10:44   ` Michael Ellerman
2018-10-29  4:33 ` Stephen Rothwell
2018-10-29  9:07   ` Stephen Rothwell
2018-10-29  9:21   ` David Howells
2018-10-29 10:29     ` Stephen Rothwell
2018-09-06  2:28 Stephen Rothwell
2018-08-07 10:58 Stephen Rothwell
2018-08-07  1:11 Stephen Rothwell
2018-08-06  0:37 Stephen Rothwell
2018-08-06 12:24 ` Stephen Rothwell
2018-08-07  0:59   ` Stephen Rothwell
2018-08-07  2:20     ` Masahiro Yamada
2018-06-19  1:47 Stephen Rothwell
2018-03-19  6:06 Stephen Rothwell
2018-03-19 19:56 ` Mateusz Guzik
2018-04-03  2:26 ` Stephen Rothwell
2018-04-08  2:19   ` Al Viro
2018-04-08  2:55     ` Stephen Rothwell
2017-12-03 23:16 Stephen Rothwell
2017-07-11  0:55 Stephen Rothwell
2017-07-11  9:21 ` David Howells
2017-07-10  2:15 Stephen Rothwell
2017-07-10  2:34 ` Al Viro
2017-02-27  0:27 Stephen Rothwell
2017-02-27  8:31 ` David Howells
2016-07-29  1:19 Stephen Rothwell
2016-07-29  4:18 ` Al Viro
2016-05-02  1:25 Stephen Rothwell
2016-05-02  1:31 ` Al Viro
2016-05-02  4:48   ` Abhijith Das
2015-12-10  0:18 Stephen Rothwell
2015-12-10  0:23 ` Stephen Rothwell
2015-12-10  0:48   ` Al Viro
2015-12-10 15:44     ` Mike Marshall
2015-12-21  0:23 ` Stephen Rothwell
2016-01-07  0:42   ` Stephen Rothwell
2016-01-07  2:09     ` Al Viro
2015-12-09  5:58 Stephen Rothwell
2015-12-09  1:19 Stephen Rothwell
2015-12-09 21:30 ` Mike Marshall
2015-12-09 22:20   ` Stephen Rothwell
2015-12-09 22:53     ` Andreas Grünbacher
2015-12-07 22:42 Stephen Rothwell
2015-05-11  1:26 Stephen Rothwell
2015-05-13  2:26 ` Stephen Rothwell
2015-03-13  1:02 Stephen Rothwell
2015-03-24  3:24 ` Stephen Rothwell
2015-03-24 10:44   ` Christoph Hellwig
2014-12-10  7:45 Stephen Rothwell
2014-12-11  2:32 ` Al Viro
2014-04-22  1:26 Stephen Rothwell
2014-04-23  0:33 ` Stephen Rothwell
2013-11-07  0:30 Stephen Rothwell
2013-09-09  2:33 Stephen Rothwell
2013-09-09  8:54 ` Ian Kent
2013-06-24  1:35 Stephen Rothwell
2013-06-24  9:34 ` Al Viro
2013-05-01  2:22 Stephen Rothwell
2013-05-01 13:13 ` J. Bruce Fields
2013-04-08  1:15 Stephen Rothwell
2013-04-09 15:49 ` Stephen Rothwell
2013-04-03  0:22 Stephen Rothwell
2013-04-03  1:14 ` Al Viro
2013-04-02  0:26 Stephen Rothwell
2013-04-02  0:39 ` Al Viro
2012-07-16  0:59 Stephen Rothwell
2012-05-31  0:51 Stephen Rothwell
2012-05-31  1:02 ` Al Viro
2011-12-22  0:15 Stephen Rothwell
2011-12-20  0:31 Stephen Rothwell
2011-12-19  1:06 Stephen Rothwell
2011-12-19  1:12 ` Al Viro
2011-07-16  6:44 Stephen Rothwell
2011-07-25  3:20 ` Stephen Rothwell
2011-07-25 18:26   ` Trond Myklebust
2011-07-16  6:36 Stephen Rothwell
2010-07-19  0:25 Stephen Rothwell
2010-08-04  1:47 ` Stephen Rothwell
2010-07-12  2:24 Stephen Rothwell
2010-07-12  5:31 ` Ryusuke Konishi
2010-06-22  1:22 Stephen Rothwell
2010-08-04  1:50 ` Stephen Rothwell
2010-05-28  1:45 Stephen Rothwell
2010-05-28  1:51 ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120104021754.GD23916@ZenIV.linux.org.uk \
    --to=viro@zeniv.linux.org.uk \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mpatocka@redhat.com \
    --cc=sfr@canb.auug.org.au \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).