linux-kernel.vger.kernel.org archive mirror
From: "Theodore Y. Ts'o" <tytso@mit.edu>
To: Sodagudi Prasad <psodagud@codeaurora.org>
Cc: adilger.kernel@dilger.ca, wen.xu@gatech.edu,
	linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: Remounting filesystem read-only
Date: Fri, 27 Jul 2018 20:18:23 -0400
Message-ID: <20180728001823.GA28432@thunk.org>
In-Reply-To: <c121825198eba667c09fc53fa9b4fd3a@codeaurora.org>

On Fri, Jul 27, 2018 at 01:34:31PM -0700, Sodagudi Prasad wrote:
> > The error should be pretty clear: "Inode table for bg 0 marked as
> > needing zeroing".  That should never happen.
> 
> Can you provide any debug patch to detect when this corruption is happening?
> What is the source of this corruption, and how is this partition getting corrupted?
> Or which file system operation led to this corruption?

Do you have a reliable repro?  If it's a one-off, it can be caused by
*anything*.  Crappy hardware, a bug in some proprietary, binary-only
GPU driver dereferencing some wild pointer that corrupts kernel
memory, etc.

Asking for a debug patch is like asking, "Can you create technology
that can detect when a cockroach enters my house?"

So if you have a reliable repro, then we know what operations might be
triggering the corruption, and then you work on creating a minimal
repro.  Only *then*, when we have a restricted set of possibilities
that might be the cause, does a debug patch make sense.  (For example,
if removing a GPU call makes the problem go away, then the patch would
need to be in the proprietary GPU driver....)

> I am digging code a bit around this warning to understand more.

The warning means that a flag in block group descriptor #0 is set
that should never be set.  How did the flag get set?  There are any
number of things that could cause that.

You might want to look at the block group descriptor via dumpe2fs or
debugfs, to see if it's just a single bit getting flipped, or if the
entire block group descriptor is garbage.  Note that under normal code
paths, the flag *never* gets set by ext4 kernel code.  The flag will
get set by ext4 only on block group descriptors other than the one for
block group 0, and the ext4 kernel code will only clear the flag.
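
If dumpe2fs or debugfs is not handy, the same check can be roughly
sketched in Python.  This is only an illustration, assuming the
standard ext4 on-disk layout (superblock at byte 1024, group
descriptor table in the block after the superblock, bg_flags at offset
0x12 of each descriptor) and ignoring features such as meta_bg.  The
function names read_bg0_flags and decode_bg_flags are made up for this
example.

```python
import struct

EXT4_MAGIC = 0xEF53          # s_magic value for ext2/3/4
SUPERBLOCK_OFFSET = 1024     # the primary superblock starts at byte 1024

# bg_flags bits, named the way dumpe2fs prints them
BG_FLAGS = {
    0x0001: "INODE_UNINIT",
    0x0002: "BLOCK_UNINIT",
    0x0004: "ITABLE_ZEROED",
}


def read_bg0_flags(image: bytes) -> int:
    """Return the raw bg_flags field of block group descriptor 0.

    `image` must hold the start of the filesystem, at least through the
    first block of the group descriptor table.
    """
    (magic,) = struct.unpack_from("<H", image, SUPERBLOCK_OFFSET + 0x38)
    if magic != EXT4_MAGIC:
        raise ValueError("no ext2/3/4 superblock magic found")

    # s_first_data_block (0x14) and s_log_block_size (0x18) are adjacent.
    first_data_block, log_block_size = struct.unpack_from(
        "<II", image, SUPERBLOCK_OFFSET + 0x14
    )
    block_size = 1024 << log_block_size

    # Assumption: the descriptor table starts in the block right after
    # the one holding the primary superblock.
    gdt_offset = (first_data_block + 1) * block_size
    (flags,) = struct.unpack_from("<H", image, gdt_offset + 0x12)
    return flags


def decode_bg_flags(flags: int) -> list:
    """Translate bg_flags into names; unknown bits hint at garbage."""
    names = [name for bit, name in BG_FLAGS.items() if flags & bit]
    unknown = flags & ~sum(BG_FLAGS)
    if unknown:
        names.append(f"UNKNOWN(0x{unknown:04x})")
    return names
```

To try it, read the first 128 KiB or so of the (unmounted) partition
or of an image file into a bytes object and pass it to
read_bg0_flags().  A result with only known bits set is consistent
with a single flipped bit; unknown bits suggest the whole descriptor is
garbage, at which point the other fields are worth comparing against
dumpe2fs output.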

Of course, if there is a bug in some driver that dereferences a
wild pointer, all bets are off.

					- Ted

Thread overview: 6+ messages
     [not found] <366cf3ac534bbadaaa61714a43006ac7@codeaurora.org>
2018-07-27 19:26 ` Remounting filesystem read-only Sodagudi Prasad
2018-07-27 19:52   ` Theodore Y. Ts'o
2018-07-27 20:34     ` Sodagudi Prasad
2018-07-28  0:18       ` Theodore Y. Ts'o [this message]
2018-07-28  7:47         ` Darrick J. Wong
2018-08-02  2:23           ` Sodagudi Prasad
