help with bug_on on ext4 mount

* help with bug_on on ext4 mount
@ 2014-07-01  6:44 Dolev Raviv
  2014-07-01 15:39 ` Theodore Ts'o
  0 siblings, 1 reply; 2+ messages in thread
From: Dolev Raviv @ 2014-07-01  6:44 UTC
  To: linux-ext4; +Cc: Tanya Brokhman, Maya Erez, kdorfman, lsusman

Hi All,

I’m working on a crash originating in the ext4 mount path. I’m running a
3.10-based kernel.

Crash description:
I saw a BUG_ON assertion failure in ext4_clear_journal_err(). The
BUG_ON condition that triggers is:  !EXT4_HAS_COMPAT_FEATURE(sb,
EXT4_FEATURE_COMPAT_HAS_JOURNAL).
The strange thing is that the same BUG_ON check is made at the start of
ext4_load_journal(), the function that calls ext4_clear_journal_err().
This means the feature flag must have changed inside ext4_load_journal(),
before the call to ext4_clear_journal_err().
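To make the reasoning concrete, here is a minimal userspace sketch of the same pattern. It is not kernel code: struct model_sb, model_load_journal() and the replay callbacks are hypothetical names used only for illustration, and the checks return an error code instead of calling BUG_ON. Only the flag value 0x0004 mirrors the kernel's EXT4_FEATURE_COMPAT_HAS_JOURNAL.

```c
#include <stddef.h>
#include <stdint.h>

/* Same value as the kernel's EXT4_FEATURE_COMPAT_HAS_JOURNAL. */
#define MODEL_COMPAT_HAS_JOURNAL 0x0004u

/* Hypothetical stand-in for the superblock state; not the kernel struct. */
struct model_sb {
    uint32_t feature_compat;
};

enum model_result {
    MODEL_OK = 0,
    MODEL_FAIL_AT_ENTRY,     /* first check, at the start of the load step */
    MODEL_FAIL_AFTER_REPLAY  /* second check, in the clear-error step */
};

/* Hypothetical hook standing in for whatever runs between the two checks. */
typedef void (*model_replay_fn)(struct model_sb *sb);

static int model_has_journal(const struct model_sb *sb)
{
    return (sb->feature_compat & MODEL_COMPAT_HAS_JOURNAL) != 0;
}

/* Two identical checks with a replay step in between, as in the report. */
static enum model_result model_load_journal(struct model_sb *sb,
                                            model_replay_fn replay)
{
    if (!model_has_journal(sb))
        return MODEL_FAIL_AT_ENTRY;
    if (replay != NULL)
        replay(sb);
    if (!model_has_journal(sb))
        return MODEL_FAIL_AFTER_REPLAY;
    return MODEL_OK;
}

/* A replay step that leaves the feature flags alone. */
static void model_replay_keeps_flag(struct model_sb *sb)
{
    (void)sb;
}

/* A replay step that clears the journal feature flag. */
static void model_replay_clears_flag(struct model_sb *sb)
{
    sb->feature_compat &= ~MODEL_COMPAT_HAS_JOURNAL;
}
```

With the flag set on entry, model_load_journal(&sb, model_replay_keeps_flag) returns MODEL_OK, while model_load_journal(&sb, model_replay_clears_flag) returns MODEL_FAIL_AFTER_REPLAY. The second check can only fail if something between the two checks modifies the flag.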

Unfortunately, I’m not too familiar with the ext4 code. From analyzing the
journal path I came to the following conclusions:
This scenario is possible if, during journal replay, the super_block is
restored or overwritten from the journal.
I have noticed a case where the sb is marked dirty and is later
evicted through the address_space_operations .writepage = ext4_writepage
callback. This callback uses the journal and can cause the dirty sb to
appear in the journal. If a power cut occurs during the journal write and
the sb copy in the journal is corrupted, it may cause the BUG_ON assertion
failure above.
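The suspected sequence can be modeled with a toy replay loop. This is only a sketch under strong assumptions: the block size, the field offset, the sim_* names and the all-zeros "corruption" are all made up for illustration, and it says nothing about whether jbd2 would actually replay such a block.

```c
#include <stdint.h>
#include <string.h>

#define SIM_BLOCK_SIZE         64      /* toy block size, not a real one */
#define SIM_COMPAT_OFFSET      8       /* arbitrary offset in the toy block */
#define SIM_COMPAT_HAS_JOURNAL 0x0004u /* same value as the kernel flag */

/* Hypothetical journal record: a full block image plus its target block. */
struct sim_journal_entry {
    unsigned target_block;
    uint8_t data[SIM_BLOCK_SIZE];
};

static uint32_t sim_read_compat(const uint8_t *block)
{
    uint32_t v;
    memcpy(&v, block + SIM_COMPAT_OFFSET, sizeof v);
    return v;
}

static void sim_write_compat(uint8_t *block, uint32_t v)
{
    memcpy(block + SIM_COMPAT_OFFSET, &v, sizeof v);
}

/* Replay: copy every logged block image over its on-disk target. */
static void sim_replay(uint8_t disk[][SIM_BLOCK_SIZE],
                       const struct sim_journal_entry *log, size_t n)
{
    for (size_t i = 0; i < n; i++)
        memcpy(disk[log[i].target_block], log[i].data, SIM_BLOCK_SIZE);
}

/* Returns 1 if the journal flag is still set after replay, 0 otherwise. */
static int sim_flag_survives_replay(int logged_copy_is_corrupt)
{
    uint8_t disk[2][SIM_BLOCK_SIZE] = {{0}};
    struct sim_journal_entry entry = { .target_block = 0 };

    /* Block 0 plays the role of the superblock, with the flag set. */
    sim_write_compat(disk[0], SIM_COMPAT_HAS_JOURNAL);

    /* The dirty superblock image is logged to the journal. */
    memcpy(entry.data, disk[0], SIM_BLOCK_SIZE);

    /* Assumption: a power cut leaves the logged copy zeroed. */
    if (logged_copy_is_corrupt)
        memset(entry.data, 0, SIM_BLOCK_SIZE);

    sim_replay(disk, &entry, 1);
    return (sim_read_compat(disk[0]) & SIM_COMPAT_HAS_JOURNAL) != 0;
}
```

Here sim_flag_survives_replay(0) returns 1 and sim_flag_survives_replay(1) returns 0: once a bad copy of the superblock is replayed over the good one, a later check of the feature flag fails even though the earlier check passed.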

Is the scenario described above even possible (or am I missing something)?
Has anyone encountered similar issues? Are there any known fixes for this?

Thanks,
Dolev
-- 
QUALCOMM ISRAEL, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
