All of lore.kernel.org
 help / color / mirror / Atom feed
* How should e2fsck clear s_errno/j_errno on an ro mount?
@ 2012-07-20 23:13 Eric Sandeen
  2012-07-21  0:39 ` Theodore Ts'o
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2012-07-20 23:13 UTC (permalink / raw)
  To: ext4 development

I'm looking at a situation where a root filesystem encountered an error and shut down, and therefore the error was stored in the journal.

But for the root fs, it seems that nothing can clear it.

If we do e2fsck -fy on a readonly mounted filesystem, then remount,rw the error persists:

[25124.319387] EXT4-fs warning (device loop3): ext4_clear_journal_err:4281: Filesystem error recorded from previous mount: IO failure
[25124.331140] EXT4-fs warning (device loop3): ext4_clear_journal_err:4282: Marking fs in need of filesystem check.

ad infinitum.

It may be my fever-addled brain this week but I'm having a hard time following how this error is supposed to get set & cleared, especially if e2fsck has modified it while mounted ro.

As soon as I mount rw again, load_superblock() sees the journal superblock has an error set, and copies it back into the journal->j_errno.

After Ted's "e2fsck: correctly propagate error from journal to superblock" in e2fsprogs, at least an unmounted fs gets cleaned up, but I'm not sure what to do to fix this when it's mounted.  

Am I missing something obvious?

Thanks,
-Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How should e2fsck clear s_errno/j_errno on an ro mount?
  2012-07-20 23:13 How should e2fsck clear s_errno/j_errno on an ro mount? Eric Sandeen
@ 2012-07-21  0:39 ` Theodore Ts'o
  2012-07-23 16:28   ` Eric Sandeen
  0 siblings, 1 reply; 9+ messages in thread
From: Theodore Ts'o @ 2012-07-21  0:39 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: ext4 development

On Fri, Jul 20, 2012 at 06:13:50PM -0500, Eric Sandeen wrote:
> I'm looking at a situation where a root filesystem encountered an error and shut down, and therefore the error was stored in the journal.
> 
> But for the root fs, it seems that nothing can clear it.
> 
> If we do e2fsck -fy on a readonly mounted filesystem, then remount,rw the error persists:
> 
> [25124.319387] EXT4-fs warning (device loop3): ext4_clear_journal_err:4281: Filesystem error recorded from previous mount: IO failure
> [25124.331140] EXT4-fs warning (device loop3): ext4_clear_journal_err:4282: Marking fs in need of filesystem check.
> 
> ad infinitum.
> 
> It may be my fever-addled brain this week but I'm having a hard time following how this error is supposed to get set & cleared, especially if e2fsck has modified it while mounted ro.
> 
> As soon as I mount rw again, load_superblock() sees the journal superblock has an error set, and copies it back into the journal->j_errno.
> 
> After Ted's "e2fsck: correctly propagate error from journal to superblock" in e2fsprogs, at least an unmounted fs gets cleaned up, but I'm not sure what to do to fix this when it's mounted.  

The problem you describe should have been fixed by commit 6d75685e:
e2fsck: handle an already recovered journal with a non-zero s_error
field, which is in e2fsprogs 1.42.4.

					- Ted

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How should e2fsck clear s_errno/j_errno on an ro mount?
  2012-07-21  0:39 ` Theodore Ts'o
@ 2012-07-23 16:28   ` Eric Sandeen
  2012-07-23 19:14     ` Theodore Ts'o
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2012-07-23 16:28 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: ext4 development

On 7/20/12 7:39 PM, Theodore Ts'o wrote:
> On Fri, Jul 20, 2012 at 06:13:50PM -0500, Eric Sandeen wrote:
>> I'm looking at a situation where a root filesystem encountered an error and shut down, and therefore the error was stored in the journal.
>>
>> But for the root fs, it seems that nothing can clear it.
>>
>> If we do e2fsck -fy on a readonly mounted filesystem, then remount,rw the error persists:
>>
>> [25124.319387] EXT4-fs warning (device loop3): ext4_clear_journal_err:4281: Filesystem error recorded from previous mount: IO failure
>> [25124.331140] EXT4-fs warning (device loop3): ext4_clear_journal_err:4282: Marking fs in need of filesystem check.
>>
>> ad infinitum.
>>
>> It may be my fever-addled brain this week but I'm having a hard time following how this error is supposed to get set & cleared, especially if e2fsck has modified it while mounted ro.
>>
>> As soon as I mount rw again, load_superblock() sees the journal superblock has an error set, and copies it back into the journal->j_errno.
>>
>> After Ted's "e2fsck: correctly propagate error from journal to superblock" in e2fsprogs, at least an unmounted fs gets cleaned up, but I'm not sure what to do to fix this when it's mounted.  
> 
> The problem you describe should have been fixed by commit 6d75685e:
> e2fsck: handle an already recovered journal with a non-zero s_error
> field, which is in e2fsprogs 1.42.4.

Well, I tested e2fsprogs from git and saw the same trouble.  Let me look more closely at that commit.

-Eric

> 					- Ted
> 



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How should e2fsck clear s_errno/j_errno on an ro mount?
  2012-07-23 16:28   ` Eric Sandeen
@ 2012-07-23 19:14     ` Theodore Ts'o
  2012-07-23 20:21       ` Eric Sandeen
  0 siblings, 1 reply; 9+ messages in thread
From: Theodore Ts'o @ 2012-07-23 19:14 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: ext4 development

> nWell, I tested e2fsprogs from git and saw the same trouble.  Let me
> look more closely at that commit.

Hmm, would it be possible to send me /tmp/loop3.qcow.bz2 after running:

     e2image -Q /dev/loop3 /tmp/loop3.qcow
     bzip2 /tmp/loop3.qcow

Regardless of what the problem ends up being, this looks like
something for our regression test suite....

					- Ted

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: How should e2fsck clear s_errno/j_errno on an ro mount?
  2012-07-23 19:14     ` Theodore Ts'o
@ 2012-07-23 20:21       ` Eric Sandeen
  2012-07-29  3:51         ` Theodore Ts'o
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Sandeen @ 2012-07-23 20:21 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: ext4 development

On 7/23/12 2:14 PM, Theodore Ts'o wrote:
>> nWell, I tested e2fsprogs from git and saw the same trouble.  Let me
>> look more closely at that commit.
> 
> Hmm, would it be possible to send me /tmp/loop3.qcow.bz2 after running:
> 
>      e2image -Q /dev/loop3 /tmp/loop3.qcow
>      bzip2 /tmp/loop3.qcow
> 
> Regardless of what the problem ends up being, this looks like
> something for our regression test suite....
> 
> 					- Ted
> 

Sure I can do that, will do it offline.

FWIW, the commit you mentioned changes e2fsck_check_ext3_journal, and
we only get there from main() like this:

        if ((ctx->mount_flags & EXT2_MF_MOUNTED) &&
            !(sb->s_feature_incompat & EXT3_FEATURE_INCOMPAT_RECOVER))
                goto skip_journal;

        retval = e2fsck_check_ext3_journal(ctx);

In my case I am mounted ro and recovery is done, so
EXT3_FEATURE_INCOMPAT_RECOVER is not, set, and so we skip over it with
the goto.

or else from here:

        if (sb->s_feature_incompat & EXT3_FEATURE_INCOMPAT_RECOVER) {
                if (ctx->options & E2F_OPT_READONLY) {
			...
                } else {
			...
                        retval = e2fsck_run_ext3_journal(ctx);

And again, I've already done recovery and am mounted RO so we won't go that way.

If make the first test above a little later:



diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index 94260bd..73aa028 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -1424,10 +1424,6 @@ failure:
 		fprintf(ctx->logf, "Filesystem UUID: %s\n",
 			e2p_uuid2str(sb->s_uuid));
 
-	if ((ctx->mount_flags & EXT2_MF_MOUNTED) &&
-	    !(sb->s_feature_incompat & EXT3_FEATURE_INCOMPAT_RECOVER))
-		goto skip_journal;

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: How should e2fsck clear s_errno/j_errno on an ro mount?
  2012-07-23 20:21       ` Eric Sandeen
@ 2012-07-29  3:51         ` Theodore Ts'o
  2012-07-29  3:52           ` [PATCH] ext4: make sure the journal sb is written in ext4_clear_journal_err() Theodore Ts'o
  2012-07-29  4:13           ` [PATCH] Revert "e2fsck: Skip journal checks if the fs is mounted and doesn't need recovery" Theodore Ts'o
  0 siblings, 2 replies; 9+ messages in thread
From: Theodore Ts'o @ 2012-07-29  3:51 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: ext4 development

On Mon, Jul 23, 2012 at 03:21:45PM -0500, Eric Sandeen wrote:
> 
> If make the first test above a little later:
> 
> ...
> 
> 
> it clears it up for me but TBH I'm not totally clear on what all is going
> on here.

Your patch effectively reverts commit 47c1b8e1666: e2fsck: Skip journal
checks if the fs is mounted and doesn't need recovery

The reason why this patch was added was that it slightly speeded up
the HDD boots.  I've looked at this closely, and it's really not that
big of a speed up.  It only saves a single seek and read (to read the
journal superblock).  I've measured what it would be on a 1TB laptop
drive, and the difference is 60ms (70ms versus 10ms).

So I'm going to revert that commit, which should fix the problem that
we're seeing.  But it also turns out that the kernel is not quite
doing the right thing.  It should have never left the journal
superblock with a non-zero s_error indicator after it transfered the
error indicator to the superblock.  The problem was that we weren't
actually flushing the cleared errno field out to the disk for the
journal superblock.

							- Ted


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] ext4: make sure the journal sb is written in ext4_clear_journal_err()
  2012-07-29  3:51         ` Theodore Ts'o
@ 2012-07-29  3:52           ` Theodore Ts'o
  2012-07-29  4:13           ` [PATCH] Revert "e2fsck: Skip journal checks if the fs is mounted and doesn't need recovery" Theodore Ts'o
  1 sibling, 0 replies; 9+ messages in thread
From: Theodore Ts'o @ 2012-07-29  3:52 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: sandeen, Theodore Ts'o, stable

After we transfer set the EXT4_ERROR_FS bit in the file system
superblock, it's not enough to call jbd2_journal_clear_err() to clear
the error indication from journal superblock --- we need to call
jbd2_journal_update_sb_errno() as well.  Otherwise, when the root file
system is mounted read-only, the journal is replayed, and the error
indicator is transferred to the superblock --- but the s_errno field
in the jbd2 superblock is left set (since although we cleared it in
memory, we never flushed it out to disk).

This can end up confusing e2fsck.  We should make e2fsck more robust
in this case, but the kernel shouldn't be leaving things in this
confused state, either.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: stable@kernel.org
---
 fs/ext4/super.c      | 1 +
 fs/jbd2/journal.c    | 3 ++-
 include/linux/jbd2.h | 1 +
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index a2a5979..ca09462 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4444,6 +4444,7 @@ static void ext4_clear_journal_err(struct super_block *sb,
 		ext4_commit_super(sb, 1);
 
 		jbd2_journal_clear_err(journal);
+		jbd2_journal_update_sb_errno(journal);
 	}
 }
 
diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
index e9a3c4c..bd23f2e 100644
--- a/fs/jbd2/journal.c
+++ b/fs/jbd2/journal.c
@@ -1377,7 +1377,7 @@ static void jbd2_mark_journal_empty(journal_t *journal)
  * Update a journal's errno.  Write updated superblock to disk waiting for IO
  * to complete.
  */
-static void jbd2_journal_update_sb_errno(journal_t *journal)
+void jbd2_journal_update_sb_errno(journal_t *journal)
 {
 	journal_superblock_t *sb = journal->j_superblock;
 
@@ -1390,6 +1390,7 @@ static void jbd2_journal_update_sb_errno(journal_t *journal)
 
 	jbd2_write_superblock(journal, WRITE_SYNC);
 }
+EXPORT_SYMBOL(jbd2_journal_update_sb_errno);
 
 /*
  * Read the superblock for a given journal, performing initial
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index f334c7f..3efc43f 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1125,6 +1125,7 @@ extern int	   jbd2_journal_destroy    (journal_t *);
 extern int	   jbd2_journal_recover    (journal_t *journal);
 extern int	   jbd2_journal_wipe       (journal_t *, int);
 extern int	   jbd2_journal_skip_recovery	(journal_t *);
+extern void	   jbd2_journal_update_sb_errno(journal_t *);
 extern void	   jbd2_journal_update_sb_log_tail	(journal_t *, tid_t,
 				unsigned long, int);
 extern void	   __jbd2_journal_abort_hard	(journal_t *);
-- 
1.7.12.rc0.22.gcdd159b


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH] Revert "e2fsck: Skip journal checks if the fs is mounted and doesn't need recovery"
  2012-07-29  3:51         ` Theodore Ts'o
  2012-07-29  3:52           ` [PATCH] ext4: make sure the journal sb is written in ext4_clear_journal_err() Theodore Ts'o
@ 2012-07-29  4:13           ` Theodore Ts'o
  2012-07-29  4:19             ` [PATCH] e2fsck: check a file system mounted read-only if forced Theodore Ts'o
  1 sibling, 1 reply; 9+ messages in thread
From: Theodore Ts'o @ 2012-07-29  4:13 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: sandeen, Theodore Ts'o

This reverts commit 47c1b8e16668daa6e74cee3c7b8bdf237ffefe70.

The original reason for this commit was to speed up boots for hard
drives.  However, I've measured the time difference on a 1TB laptop
drive, and it's not significant: 70ms vs 10ms when running e2fsck on a
clean file system.

The problem with this optimization is that we don't notice if the
journal superblock has a non-zero s_errno field.  If we don't transfer
the error indicator from the journal superblock to the file system
superblock, then the kernel will transfer it when the file system is
remounted read-write, causing scary messages to appear in the syslog.
(And since there was a bug in the kernel code which didn't clear the
error indicator in the journal superblock, it would never get
cleared.)

Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 e2fsck/unix.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index 94260bd..f71a125 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -1424,10 +1424,6 @@ failure:
 		fprintf(ctx->logf, "Filesystem UUID: %s\n",
 			e2p_uuid2str(sb->s_uuid));
 
-	if ((ctx->mount_flags & EXT2_MF_MOUNTED) &&
-	    !(sb->s_feature_incompat & EXT3_FEATURE_INCOMPAT_RECOVER))
-		goto skip_journal;

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH] e2fsck: check a file system mounted read-only if forced
  2012-07-29  4:13           ` [PATCH] Revert "e2fsck: Skip journal checks if the fs is mounted and doesn't need recovery" Theodore Ts'o
@ 2012-07-29  4:19             ` Theodore Ts'o
  0 siblings, 0 replies; 9+ messages in thread
From: Theodore Ts'o @ 2012-07-29  4:19 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: sandeen, Theodore Ts'o

Previously e2fsck would only allow a mounted file system to be checked
if it was the root file system and it was mounted read-only.  Now
allow any file system mounted read-only if the -f option is specified.

This makes it easier to test how e2fsck handles checking file systems
which are mounted without having to test on the root file system.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
---
 e2fsck/unix.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/e2fsck/unix.c b/e2fsck/unix.c
index f71a125..da41f69 100644
--- a/e2fsck/unix.c
+++ b/e2fsck/unix.c
@@ -225,7 +225,9 @@ static void check_mount(e2fsck_t ctx)
 	     !(ctx->options & E2F_OPT_WRITECHECK)))
 		return;
 
-	if ((ctx->options & E2F_OPT_READONLY) &&
+	if (((ctx->options & E2F_OPT_READONLY) ||
+	     ((ctx->options & E2F_OPT_FORCE) &&
+	      (ctx->mount_flags & EXT2_MF_READONLY))) &&
 	    !(ctx->options & E2F_OPT_WRITECHECK)) {
 		log_out(ctx, _("Warning!  %s is %s.\n"),
 			ctx->filesystem_name,
@@ -1226,6 +1228,9 @@ restart:
 		if (!(ctx->mount_flags & EXT2_MF_ISROOT &&
 		      ctx->mount_flags & EXT2_MF_READONLY))
 			flags |= EXT2_FLAG_EXCLUSIVE;
+		if ((ctx->mount_flags & EXT2_MF_READONLY) &&
+		    (ctx->options & E2F_OPT_FORCE))
+			flags &= ~EXT2_FLAG_EXCLUSIVE;
 	}
 
 	retval = try_open_fs(ctx, flags, io_ptr, &fs);
-- 
1.7.12.rc0.22.gcdd159b


^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2012-07-29  4:20 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-20 23:13 How should e2fsck clear s_errno/j_errno on an ro mount? Eric Sandeen
2012-07-21  0:39 ` Theodore Ts'o
2012-07-23 16:28   ` Eric Sandeen
2012-07-23 19:14     ` Theodore Ts'o
2012-07-23 20:21       ` Eric Sandeen
2012-07-29  3:51         ` Theodore Ts'o
2012-07-29  3:52           ` [PATCH] ext4: make sure the journal sb is written in ext4_clear_journal_err() Theodore Ts'o
2012-07-29  4:13           ` [PATCH] Revert "e2fsck: Skip journal checks if the fs is mounted and doesn't need recovery" Theodore Ts'o
2012-07-29  4:19             ` [PATCH] e2fsck: check a file system mounted read-only if forced Theodore Ts'o

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.