From: Jannis Achstetter
Date: Wed, 24 Oct 2012 23:04:50 +0200
To: "Theodore Ts'o"
CC: linux-kernel@vger.kernel.org
Subject: Re: Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?)
Message-ID: <508857F2.9000206@web.de>
In-Reply-To: <20121023221913.GC28626@thunk.org>

On 24.10.2012 00:19, Theodore Ts'o wrote:
> The reason why the problem happens rarely is that the effect of the
> buggy commit is that if the journal's starting block is zero, we fail
> to truncate the journal when we unmount the file system.  This can
> happen if we mount and then unmount the file system fairly quickly,
> before the log has a chance to wrap.
> After the first time this has
> happened, it's not a disaster, since when we replay the journal, we'll
> just replay some extra transactions.  But if this happens twice, the
> oldest valid transaction will still not have gotten updated, but some
> of the newer transactions from the last mount session will have gotten
> written by the very latest transactions, and when we then try to do
> the extra transaction replays, the metadata blocks can end up getting
> very scrambled indeed.

Repost. Sorry, I don't mean to spam; I just don't see my first mail
(sent via gmane.org) anywhere, so ...

As a "normal Linux user" I'm interested in the practical things to do
now to avoid data loss. I'm running several systems with 3.6.2 and
ext4. Fearing loss of data:

- Is there a way to see whether the journal of a specific partition has
  wrapped (since mounting), so that unmounting and mounting (or
  rebooting to downgrade the kernel) is safe?
- Is there a way to "force" a journal wrap? Run a filesystem benchmark?
  Which one, with what parameters? Or is that unwise, since I might
  corrupt data even further if I have already hit the bug?
- Is it wise to unmount now and run e2fsck, or might I corrupt my files
  just by unmounting now if the journal hasn't wrapped yet?
- How do you define "fairly quickly"? Of course servers run 24/7, but I
  might be using my PC 2-5 hrs a day... Does that count as rebooting
  too soon after booting?
- Any more advice you can give the ordinary user to avoid fs
  corruption? Don't shut down machines for some days? Better to
  downgrade or upgrade the kernel?

Best regards,
Jannis Achstetter