* [Bug 20992] New: Data corruption triggers ext4 oops
@ 2010-10-23 14:39 bugzilla-daemon
  2010-10-23 14:56 ` [Bug 20992] " bugzilla-daemon
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: bugzilla-daemon @ 2010-10-23 14:39 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=20992

           Summary: Data corruption triggers ext4 oops
           Product: File System
           Version: 2.5
    Kernel Version: 2.6.35.7
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: ext4
        AssignedTo: fs_ext4@kernel-bugs.osdl.org
        ReportedBy: bart.vanassche@gmail.com
        Regression: No


While running I/O performance tests I accidentally overwrote an ext4
filesystem. The next access of that filesystem triggered a kernel oops. I don't
think that should happen?

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are watching the assignee of the bug.


* [Bug 20992] Data corruption triggers ext4 oops
  2010-10-23 14:39 [Bug 20992] New: Data corruption triggers ext4 oops bugzilla-daemon
@ 2010-10-23 14:56 ` bugzilla-daemon
  2010-10-23 14:58 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2010-10-23 14:56 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=20992


Theodore Tso <tytso@mit.edu> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu




--- Comment #1 from Theodore Tso <tytso@mit.edu>  2010-10-23 14:56:03 ---
Can you send the oops message, complete with the stack trace?

Thanks!!


* [Bug 20992] Data corruption triggers ext4 oops
  2010-10-23 14:39 [Bug 20992] New: Data corruption triggers ext4 oops bugzilla-daemon
  2010-10-23 14:56 ` [Bug 20992] " bugzilla-daemon
@ 2010-10-23 14:58 ` bugzilla-daemon
  2010-10-23 15:20 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2010-10-23 14:58 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=20992





--- Comment #2 from Bart Van Assche <bart.vanassche@gmail.com>  2010-10-23 14:58:48 ---
Created an attachment (id=34542)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=34542)
Kernel oops

Attached backtrace.


* [Bug 20992] Data corruption triggers ext4 oops
  2010-10-23 14:39 [Bug 20992] New: Data corruption triggers ext4 oops bugzilla-daemon
  2010-10-23 14:56 ` [Bug 20992] " bugzilla-daemon
  2010-10-23 14:58 ` bugzilla-daemon
@ 2010-10-23 15:20 ` bugzilla-daemon
  2010-10-23 15:27 ` bugzilla-daemon
  2013-12-10 22:22 ` bugzilla-daemon
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2010-10-23 15:20 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=20992





--- Comment #3 from Theodore Tso <tytso@mit.edu>  2010-10-23 15:20:43 ---
Yep, looks like a bug alright. 

From what I can tell, you were in the middle of async I/O at the time when the
disk was corrupted.  The problem seemed to come after the I/O was completed,
and ext4_convert_unwritten_extents() was trying to set the initialized bit on
the extent tree.  At that point the extent tree must have gotten corrupted on
disk, and this seriously confused the extent conversion code, which ended up
passing 0 to ext4_ext_put_in_cache() as the length of the extent, and that
tripped the BUG_ON in ext4_ext_put_in_cache().

How did you corrupt the file system while it was mounted?   Was it via some dd
to the disk device directly?

We do have code that checks to make sure the extent tree is sane, but we skip
it if the data was already in the buffer cache, to save CPU costs.  But if you
wrote to the disk device directly, it would have gone through the buffer cache.
Since the extent tree was already cached, we would have skipped the validation
step, and that could be the explanation for how the bug got triggered.
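
A minimal userspace sketch of the "validate only on a cache miss" pattern
described above. It is not the ext4 code: struct cached_block,
read_block_checked(), block_is_sane() and SKETCH_MAGIC are hypothetical names
made up for illustration, and the "disk read" is just a value passed in by the
caller. It only shows why contents that change after a block is cached are
never re-checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a header value the sanity check expects. */
#define SKETCH_MAGIC 0xF30Au

/* Hypothetical, simplified model of a cached metadata block. */
struct cached_block {
    bool uptodate;   /* true once the block has been read and validated */
    unsigned magic;  /* stand-in for the block's header contents */
};

/* The (assumed) sanity check: only looks at the header value. */
bool block_is_sane(const struct cached_block *b)
{
    return b->magic == SKETCH_MAGIC;
}

/*
 * Returns 0 on success, -1 if validation caught corruption.
 * on_disk_magic models what a real read from disk would return.
 */
int read_block_checked(struct cached_block *b, unsigned on_disk_magic)
{
    assert(b != NULL);

    if (b->uptodate)
        return 0;            /* cache hit: contents trusted, check skipped */

    b->magic = on_disk_magic; /* cache miss: "read" the block from disk */
    if (!block_is_sane(b))
        return -1;           /* corruption caught on the cold read */

    b->uptodate = true;
    return 0;
}
```

In this sketch a cold read of a bad block fails with -1, but once the block is
marked up to date, a later change to its contents passes straight through the
cache-hit path without being validated.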

If so, I'm loath to turn on the validation check unconditionally, since that
would kill performance.  I can probably change the BUG_ON in
ext4_ext_put_in_cache() to instead set the cache state to "invalid", which
would at least prevent the BUG_ON.  The filesystem was probably well and truly
trashed, though, so sooner or later the ext4 fs code would have hit something
to cause it to be very unhappy.  Hopefully it would be an ext4_error() call to
mark the file system as corrupted, as opposed to another BUG_ON.
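
A rough userspace sketch of the idea of marking the cache invalid instead of
asserting on a zero-length extent. This is not the kernel function and not an
actual patch: struct extent_cache, enum cache_type and put_in_cache() are
hypothetical names, and the field layout is assumed for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache states; the names are made up for this sketch. */
enum cache_type { CACHE_INVALID = 0, CACHE_EXTENT = 1 };

/* Hypothetical single-entry extent cache. */
struct extent_cache {
    uint32_t block;        /* first logical block covered */
    uint32_t len;          /* number of blocks covered */
    uint64_t start;        /* first physical block */
    enum cache_type type;  /* whether the entry may be used */
};

void put_in_cache(struct extent_cache *c, uint32_t block,
                  uint32_t len, uint64_t start)
{
    assert(c != NULL);

    if (len == 0) {
        /*
         * A zero-length extent can only come from bad input (for
         * example a corrupted extent tree). Rather than crashing,
         * mark the entry invalid so later lookups ignore it.
         */
        c->type = CACHE_INVALID;
        return;
    }

    c->block = block;
    c->len = len;
    c->start = start;
    c->type = CACHE_EXTENT;
}
```

With this shape, a caller that passes a length of 0 leaves the cache in the
invalid state, and any consumer has to check the type field before trusting
the cached range.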


* [Bug 20992] Data corruption triggers ext4 oops
  2010-10-23 14:39 [Bug 20992] New: Data corruption triggers ext4 oops bugzilla-daemon
                   ` (2 preceding siblings ...)
  2010-10-23 15:20 ` bugzilla-daemon
@ 2010-10-23 15:27 ` bugzilla-daemon
  2013-12-10 22:22 ` bugzilla-daemon
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2010-10-23 15:27 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=20992





--- Comment #4 from Bart Van Assche <bart.vanassche@gmail.com>  2010-10-23 15:27:06 ---
(In reply to comment #3)
> How did you corrupt the file system while it was mounted?   Was it via some dd
> to the disk device directly?

Indeed - the filesystem was corrupted while mounted by overwriting the entire
contents with dd (dd if=/dev/zero of=/dev/sd... oflag=direct).


* [Bug 20992] Data corruption triggers ext4 oops
  2010-10-23 14:39 [Bug 20992] New: Data corruption triggers ext4 oops bugzilla-daemon
                   ` (3 preceding siblings ...)
  2010-10-23 15:27 ` bugzilla-daemon
@ 2013-12-10 22:22 ` bugzilla-daemon
  4 siblings, 0 replies; 6+ messages in thread
From: bugzilla-daemon @ 2013-12-10 22:22 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=20992

Alan <alan@lxorguk.ukuu.org.uk> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |alan@lxorguk.ukuu.org.uk
         Resolution|---                         |OBSOLETE

