* Ext4 corruption on linux-next since 5.2 merge window
From: Peter Geis @ 2019-05-22 23:40 UTC
  To: linux-ext4

Good Evening,

Since the 5.2 merge window, I've been encountering EXT4 corruption 
periodically.
The board is the rk3328-roc-cc.
The device is a USB 3.0 Samsung SSD.
$ lsusb
Bus 005 Device 002: ID 04e8:61f5 Samsung Electronics Co., Ltd Portable 
SSD T5
$ lsusb --tree
/:  Bus 05.Port 1: Dev 1, Class=root_hub, Driver=xhci-hcd/1p, 5000M
     |__ Port 1: Dev 2, If 0, Class=Mass Storage, Driver=uas, 5000M
Currently running:
$ uname -a
Linux firefly 5.2.0-rc1-next-20190521test-14384-gea7592a68ff9 #64 SMP 
PREEMPT Tue May 21 14:40:53 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux

The error received is:
[12546.303907] EXT4-fs error (device sda1): ext4_find_extent:909: inode 
#8: comm jbd2/sda1-8: pblk 60850175 bad header/extent: invalid extent 
entries - magic f30a, entries 8, max 340(340), depth 0(0)

This immediately knocks the filesystem to RO.

It is easily reproducible during kernel compilation.

I'm at a loss as to where to begin, considering the number of changes in 
various subsystems.
Is there some way I can enable more ext4 debugging?
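
A sketch of two generic starting points for this situation — more
runtime output via dynamic debug, and a bisect to narrow the window.
This assumes CONFIG_DYNAMIC_DEBUG is built in (the deeper ext_debug
tracing additionally needs CONFIG_EXT4_DEBUG), and the bisect
endpoints are taken from the uname above:

# Enable pr_debug-level messages from ext4 and jbd2:
$ echo 'module ext4 +p' > /sys/kernel/debug/dynamic_debug/control
$ echo 'module jbd2 +p' > /sys/kernel/debug/dynamic_debug/control

# Bisect between the last known-good release and the failing -next
# snapshot:
$ git bisect start
$ git bisect bad next-20190521
$ git bisect good v5.1
# ...then build, boot, and test each candidate, marking it with
# `git bisect good` or `git bisect bad` until git names the first
# bad commit.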


* Re: Ext4 corruption on linux-next since 5.2 merge window
From: Theodore Ts'o @ 2019-05-24  4:37 UTC
  To: Peter Geis; +Cc: linux-ext4

On Wed, May 22, 2019 at 07:40:26PM -0400, Peter Geis wrote:
> Good Evening,
> 
> Since the 5.2 merge window, I've been encountering EXT4 corruption
> periodically.

Yeah, sorry.  The fix is in the ext4.git tree, on the dev branch.
I'll be pushing it to Linus shortly.
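
A minimal sketch of pulling that branch in for testing, assuming the
usual kernel.org location of the ext4 tree:

$ git remote add ext4 git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git
$ git fetch ext4 dev
# Either build the dev branch directly, or cherry-pick just the fix
# quoted below onto the affected -next build:
$ git cherry-pick 0a944e8a6c66ca04c7afbaa17e22bf208a8b37f0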

It's not actually a file system corruption, but rather a false
positive that shows up when there is sufficient memory pressure to
push the extent status tree entries for the journal out of the cache.

The fix is:

commit 0a944e8a6c66ca04c7afbaa17e22bf208a8b37f0
Author: Theodore Ts'o <tytso@mit.edu>
Date:   Wed May 22 10:27:01 2019 -0400

    ext4: don't perform block validity checks on the journal inode
    
    Since the journal inode is already checked when we added it to the
    block validity's system zone, if we check it again, we'll just trigger
    a failure.
    
    This was causing failures like this:
    
    [   53.897001] EXT4-fs error (device sda): ext4_find_extent:909: inode
    #8: comm jbd2/sda-8: pblk 121667583 bad header/extent: invalid extent
    entries - magic f30a, entries 8, max 340(340), depth 0(0)
    [   53.931430] jbd2_journal_bmap: journal block not found at offset 49 on sda-8
    [   53.938480] Aborting journal on device sda-8.
    
    ... but only if the system was under enough memory pressure that
    logical->physical mapping for the journal inode gets pushed out of the
    extent cache.  (This is why it wasn't noticed earlier.)
    
    Fixes: 345c0dbf3a30 ("ext4: protect journal inode's blocks using block_validity")
    Reported-by: Dan Rue <dan.rue@linaro.org>
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>

						- Ted
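
Since the false positive only triggers once the journal's cached
extent mappings are evicted, a hypothetical way to reproduce it on
demand is to run the compile inside a tight memory cgroup. The path
and limit below are illustrative, assuming cgroup v2 is mounted at
/sys/fs/cgroup:

$ mkdir /sys/fs/cgroup/ext4-repro
$ echo 512M > /sys/fs/cgroup/ext4-repro/memory.max
$ echo $$ > /sys/fs/cgroup/ext4-repro/cgroup.procs
# A kernel build on the affected ext4 filesystem should now hit the
# (spurious) ext4_find_extent error much more reliably:
$ make -j$(nproc)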

