From: "Darrick J. Wong"
Subject: Re: Question: errors=continue behaviour for failed external journal device
Date: Mon, 28 Jul 2014 09:09:34 -0700
Message-ID: <20140728160934.GP8628@birch.djwong.org>
References: <20140727000733.GV6725@thunk.org> <20140728131742.GP6725@thunk.org>
In-Reply-To: <20140728131742.GP6725@thunk.org>
To: "Theodore Ts'o"
Cc: Lukáš Czerner, Vlad Dobrotescu, linux-ext4@vger.kernel.org

On Mon, Jul 28, 2014 at 09:17:42AM -0400, Theodore Ts'o wrote:
> On Mon, Jul 28, 2014 at 11:11:45AM +0200, Lukáš Czerner wrote:
> >
> > I very much agree with that, that's why I was quite surprised that I
> > found out recently that this is the default. I was living in the
> > delusion that the default was ERRORS_RO for as long as I can remember.
> > So my question is, should we change it ? This really does not seem
> > like a sane default.
>
> Yeah, I've been thinking that this would be a good thing to change for
> 1.43.
>
> The only reason that errors=continue was the default was for
> historical reasons. I could imagine some system administrators being
> surprised when all of a sudden their production systems start getting
> lots of EROFS errors getting reported by applications. So I could
> potentially imagine some Help Desks / Support folks at distributions
> not being enthusiastic about such a change.
>
> Hmm.... we are starting to have some errors where we can allow the
> system to stagger on, even if we need to disallow new allocations in
> some block groups.
> I wonder if it is worthwhile to have a "continue
> for correctable errors". The danger, of course, is that some errors,
> even if they are correctable, (such as freeing a block which is
> already freed), could be a hint that there are other fs corruptions,
> not yet detected, that might lead to data loss if we reboot and fsck,
> or remount readonly right away. So the question is while there is
> some value, is it worth the added complexity to add an
> "errors=continue-correctable" option?

Back in the earlier 3.15 days when I was trying to figure out what was going on
with that corruption bug that Eric Whitney found, it was useful for the kernel
to be able to stumble on with the non-broken block groups long enough to save
the logs of what had happened. (Laptops don't seem to have serial consoles...)
In general I think it's worth the effort.

(I'd shovel crash reports into pstore if I wasn't afraid of bricking UEFI.)

--D

>
> - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html