On Tue 12-03-19 13:21:06, Petr Mladek wrote:
> On Tue 2019-03-12 10:49:06, Jan Kara wrote:
> > When admin calls "reboot -f" - i.e., does a hard system reboot by
> > directly calling reboot(2) - ext4 filesystem mounted with errors=panic
> > can panic the system. This happens because the underlying device gets
> > disabled without unmounting the filesystem and thus some syscall running
> > in parallel to reboot(2) can result in the filesystem getting IO errors.
> > 
> > This is somewhat surprising to the users so try improve the behavior by
> > switching to errors=remount-ro behavior when the system is running
> > reboot(2).
> > 
> > Signed-off-by: Jan Kara <jack@suse.cz>
> > ---
> >  fs/ext4/super.c | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > What do people think about this? Arguably this change has some potential
> > for breakage if e.g. an ext4 error would somehow block reboot(2). I don't
> > see how that would be possible but I wanted to raise this just in case...
> > 
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 60da0a6e4d86..f7b6f95fa8dd 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -430,6 +430,12 @@ static void ext4_journal_commit_callback(journal_t *journal, transaction_t *txn)
> >  	spin_unlock(&sbi->s_md_lock);
> >  }
> >  
> > +static bool system_going_down(void)
> > +{
> > +	return system_state == SYSTEM_HALT || system_state == SYSTEM_POWER_OFF
> > +		|| system_state == SYSTEM_RESTART;
> > +}
> > +
> >  /* Deal with the reporting of failure conditions on a filesystem such as
> >   * inconsistencies detected or read IO failures.
> >   *
> > @@ -460,7 +466,12 @@ static void ext4_handle_error(struct super_block *sb)
> >  		if (journal)
> >  			jbd2_journal_abort(journal, -EIO);
> >  	}
> > -	if (test_opt(sb, ERRORS_RO)) {
> > +	/*
> > +	 * We force ERRORS_RO behavior when system is rebooting. Otherwise we
> > +	 * could panic during 'reboot -f' as the underlying device got already
> > +	 * disabled.
> > +	 */
> > +	if (test_opt(sb, ERRORS_RO) || system_going_down()) {
> 
> If I read the code correctly then this will not avoid the panic().
> 
> There is no return in this branch. Therefore it still continues
> with the ERRORS_PANIC check...
> 
> Or do I miss anything, please?

No, you are right. Attached is a fixed up patch.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR