All of lore.kernel.org
 help / color / mirror / Atom feed
From: Nix <nix@esperi.org.uk>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: "Eric Sandeen" <sandeen@redhat.com>,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org,
	"J. Bruce Fields" <bfields@fieldses.org>,
	"Bryan Schumaker" <bjschuma@netapp.com>,
	"Peng Tao" <bergwolf@gmail.com>,
	Trond.Myklebust@netapp.com, gregkh@linuxfoundation.org,
	"Toralf Förster" <toralf.foerster@gmx.de>,
	nick.cheng@areca.com.tw
Subject: Re: Apparent serious progressive ext4 data corruption bug in 3.6 (when rebooting during umount) (possibly blockdev / arcmsr at fault??)
Date: Fri, 26 Oct 2012 01:22:22 +0100	[thread overview]
Message-ID: <871ugm2ibl.fsf_-_@spindle.srvr.nix> (raw)
In-Reply-To: <20121025011056.GC4559@thunk.org> (Theodore Ts'o's message of "Wed, 24 Oct 2012 21:10:56 -0400")

On 25 Oct 2012, Theodore Ts'o told this:
> If that does make the problem go away, that will be a very interesting
> data point....

I'll be looking at this tomorrow, but as sod's law would have it I have
another user on this machine who didn't want it mega-rebooted tonight,
so I was reduced to trying to reproduce the problem in virtualization
under qemu.

I failed, for one very simple reason: on 3.6.3, even with a umount -l
still in the process of unmounting the fs and flushing changes, even on
an fs mounted nobarrier,journal_async_commit, even when mounted atop
LVM, reboot(2) will block until umount's writeout is complete (and lvm
vgchange refuses to deactivate the volume group while that is happening,
but I don't bother deactivating volume groups on the afflicted machine
so I know that can't be related). Obviously, this suffices to ensure
that a reboot is not possible while umounts are underway -- though a
power cut is still possible, I suppose.

On the afflicted machine (with a block device stack running LVM, then
libata/arcmsr), as far as I can tell reboot(8) is *not* blocking if a
unmount is underway: it shoots down everything and restarts at once. I
have no direct proof of this yet, but during the last week I've
routinely seen it reboot with lots of writes underway and umount -l log
messages streaming up the screen: it certainly doesn't wait for all the
umount -l's to be done the way it is in virtualization. I have no idea
how this can be possible: I thought fses on a block device had to be
quiesced (thus, in the case of an ongoing umount, unmounted and flushed)
before any attempt at all was made to shut the underlying block device
down, and I'd be fairly surprised if a flush wasn't done even if
nobarrier was active (it certainly seems to be for virtio-blk, but that
may well be a special case). But arcmsr (or libata? I should test with a
simulated libata rather than virtio-blk next) appears to be getting
around that somehow. This would probably explain all sorts of horrible
corruption if umounting during a reboot, right?

So maybe it's the stack of block devices that's at fault, and not the
filesystem at all! I'll admit I don't really understand what happens at
system halt time well enough to be sure, and getting log info from a
machine in the middle of reboot(8) appears likely to be a complete sod
(maybe halt(8) would be better: at least I could take a photo of the
screen then). If that's true, it would *certainly* explain why nobody
else can see this problem: only arcmsr users who also do umount -l's
would have a chance, and that population probably has a size of one.


I'll try to prove this tomorrow by writing a few gigs of junk to a temp
filesytem held open by a temporary cat /dev/null, umount -l'ing it and
killing off the cat the instant before the reboot -f call. If I don't
see the reboot call blocking, the hypothesis is proved. (This is much
what I did in virtualization, where I observe reboot blocking.)

(Another blockdev-related possibility, if reboot *is* observed to block,
is that arcmsr may be throwing away very-recently-written data when the
adapter is shut down right before reboot.)


Argh. How can rebooting a system be so damn complicated. Bring back the
C64 or BBC Master where I could just pull the power lead out and stick
it back in. :)

  parent reply	other threads:[~2012-10-26  0:22 UTC|newest]

Thread overview: 105+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-22 16:17 Heads-up: 3.6.2 / 3.6.3 NFS server panic: 3.6.2+ regression? Nix
2012-10-23  1:33 ` J. Bruce Fields
2012-10-23 14:07   ` Nix
2012-10-23 14:30     ` J. Bruce Fields
2012-10-23 16:32       ` Heads-up: 3.6.2 / 3.6.3 NFS server oops: 3.6.2+ regression? (also an unrelated ext4 data loss bug) Nix
2012-10-23 16:46         ` J. Bruce Fields
2012-10-23 16:54           ` J. Bruce Fields
2012-10-23 16:56           ` Myklebust, Trond
2012-10-23 16:56             ` Myklebust, Trond
2012-10-23 17:05             ` Nix
2012-10-23 17:36               ` Nix
2012-10-23 17:43                 ` J. Bruce Fields
2012-10-23 17:44                 ` Myklebust, Trond
2012-10-23 17:57                   ` Myklebust, Trond
2012-10-23 17:57                     ` Myklebust, Trond
     [not found]                   ` <1351015039.4622.23.camel@lade.trondhjem.org>
2012-10-23 18:23                     ` Myklebust, Trond
2012-10-23 18:23                       ` Myklebust, Trond
2012-10-23 19:49                       ` Nix
2012-10-24 10:18                         ` [PATCH] lockd: fix races in per-net NSM client handling Stanislav Kinsbursky
2012-10-23 20:57         ` Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?) Nix
2012-10-23 20:57           ` Nix
2012-10-23 22:19           ` Theodore Ts'o
2012-10-23 22:47             ` Nix
2012-10-23 23:16               ` Theodore Ts'o
2012-10-23 23:06             ` Nix
2012-10-23 23:28               ` Theodore Ts'o
2012-10-23 23:34                 ` Nix
2012-10-24  0:57             ` Eric Sandeen
2012-10-24 20:17               ` Jan Kara
2012-10-26 15:25                 ` Eric Sandeen
2012-10-24 19:13             ` Jannis Achstetter
2012-10-24 19:13               ` Jannis Achstetter
2012-10-24 21:31               ` Theodore Ts'o
2012-10-24 22:05                 ` Jannis Achstetter
2012-10-24 23:47                 ` Nix
2012-10-25 17:02                 ` Felipe Contreras
2012-10-24 21:04             ` Jannis Achstetter
2012-10-24  1:13           ` Eric Sandeen
2012-10-24  1:13             ` Eric Sandeen
2012-10-24  4:15             ` Nix
2012-10-24  4:27               ` Eric Sandeen
2012-10-24  5:23                 ` Theodore Ts'o
2012-10-24  7:00                   ` Hugh Dickins
2012-10-24 11:46                     ` Nix
2012-10-24 11:45                   ` Nix
2012-10-24 17:22                   ` Eric Sandeen
2012-10-24 19:49                   ` Nix
2012-10-24 19:54                     ` Nix
2012-10-24 20:30                     ` Eric Sandeen
2012-10-24 20:34                       ` Nix
2012-10-24 20:45                     ` Nix
2012-10-24 21:08                     ` Theodore Ts'o
2012-10-24 23:27                       ` Apparent serious progressive ext4 data corruption bug in 3.6 (when rebooting during umount) Nix
2012-10-24 23:42                         ` Nix
2012-10-25  1:10                         ` Theodore Ts'o
2012-10-25  1:45                           ` Nix
2012-10-25  1:45                             ` Nix
2012-10-25 14:12                             ` Theodore Ts'o
2012-10-25 14:15                               ` Nix
2012-10-25 17:39                                 ` Nix
2012-10-25 11:06                           ` Nix
2012-10-26  0:22                           ` Nix [this message]
2012-10-26  0:11               ` Apparent serious progressive ext4 data corruption bug in 3.6.3 (and other stable branches?) Ric Wheeler
2012-10-26  0:43                 ` Theodore Ts'o
2012-10-26 12:12                   ` Nix
2012-10-26 20:35           ` Eric Sandeen
2012-10-26 20:37             ` Nix
2012-10-26 20:56               ` Theodore Ts'o
2012-10-26 20:56                 ` Theodore Ts'o
2012-10-26 20:59                 ` Nix
2012-10-26 20:59                   ` Nix
2012-10-26 21:15                   ` Theodore Ts'o
2012-10-26 21:15                     ` Theodore Ts'o
2012-10-26 21:19                     ` Nix
2012-10-27  0:22                       ` Theodore Ts'o
2012-10-27  0:22                         ` Theodore Ts'o
2012-10-27 12:45                         ` Nix
2012-10-27 17:55                           ` Theodore Ts'o
2012-10-27 18:47                             ` Nix
2012-10-27 21:19                               ` Eric Sandeen
2012-10-27 21:21                                 ` Nix
2012-10-27 21:23                                   ` Eric Sandeen
2012-10-27 21:29                                     ` Nix
2012-10-27 21:34                                       ` Eric Sandeen
2012-10-27 21:40                                         ` Nix
     [not found]                                         ` <09758CEA-74B5-48D0-8075-BB723A2CABBB@dilger.ca>
2012-10-29  2:09                                           ` Eric Sandeen
2012-10-27 22:42                                 ` Eric Sandeen
2012-10-29  1:00                                   ` Theodore Ts'o
2012-10-29  1:04                                     ` Nix
2012-10-29  2:24                                     ` Eric Sandeen
2012-10-29  2:34                                       ` Theodore Ts'o
2012-10-29  2:35                                         ` Eric Sandeen
2012-10-29  2:42                                           ` Theodore Ts'o
2012-10-27 18:30                           ` Eric Sandeen
2012-10-27  3:11                     ` Jim Rees
2012-10-27  3:11                       ` Jim Rees
2012-10-27  8:01             ` Testing ext4's journal via simulating a reboot via KVM Theodore Ts'o
2012-10-28  4:23           ` [PATCH] ext4: fix unjournaled inode bitmap modification Eric Sandeen
2012-10-28  4:23             ` Eric Sandeen
2012-10-28 13:59             ` Nix
2012-10-29  2:30             ` [PATCH -v3] " Theodore Ts'o
2012-10-29  2:30               ` Theodore Ts'o
2012-10-29  3:24               ` Eric Sandeen
2012-10-29  5:07               ` Andreas Dilger
2012-10-29 17:08               ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=871ugm2ibl.fsf_-_@spindle.srvr.nix \
    --to=nix@esperi.org.uk \
    --cc=Trond.Myklebust@netapp.com \
    --cc=bergwolf@gmail.com \
    --cc=bfields@fieldses.org \
    --cc=bjschuma@netapp.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=nick.cheng@areca.com.tw \
    --cc=sandeen@redhat.com \
    --cc=toralf.foerster@gmx.de \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.