From: Chris Mason <clm@fb.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [GIT PULL] Btrfs
Date: Mon, 21 Mar 2016 22:15:33 -0400
Message-ID: <20160322021533.xqalzzk5itglqu3x@floor.thefacebook.com>
In-Reply-To: <CA+55aFzkbk=EVjqp6cx+OQXPadFGfFvaHHV59+CZFStFB8gGvQ@mail.gmail.com>

On Mon, Mar 21, 2016 at 06:16:54PM -0700, Linus Torvalds wrote:
> On Mon, Mar 21, 2016 at 5:24 PM, Chris Mason <clm@fb.com> wrote:
> >
> > I waited an extra day to send this one out because I hit a crash late
> > last week with CONFIG_DEBUG_PAGEALLOC enabled (fixed in the top commit).
> 
> Hmm. If that commit helps, it will spit out a warning.
> 
> So is it actually fixed, or just hacked around to the point where you
> don't get a page fault?
> 
> That WARN_ON_ONCE kind of implies it's a "this happens, but we don't know why".

Hi Linus,

	while (bio_index < bio->bi_vcnt) {
		count = find some crcs
		...
		while (count--) {
			...
			page_bytes_left -= root->sectorsize;
			if (!page_bytes_left) {
				bio_index++;
				/*
				 * make sure we're still inside the
				 * bio before we update page_bytes_left
				 */
				if (bio_index >= bio->bi_vcnt) {
					WARN_ON_ONCE(count);
					goto done;
				}
				bvec++;
				page_bytes_left = bvec->bv_len;
				^^^^^ this was the line that crashed
				      before
			}

		}
	}

done:
	cleanup;
	return;

What should be happening here is we'll goto done when count is zero and
we've walked past the end of the bio.  IOW, both the outer and inner
loops are doing the right tests and the right math, but the inner loop
is improperly accessing a bogus bvec->bv_len because it didn't realize
the outer loop was now completely done.
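
To make the control flow above concrete, here is a minimal userspace
sketch of the same two-level walk. It is only a model under stated
assumptions, not the btrfs code: struct seg, struct walk_result,
walk_segs() and the batch parameter are hypothetical names invented for
the illustration, each segment length is assumed to be a non-zero
multiple of sectorsize, and "find some crcs" is modelled as "take at
most batch of the sectors that remain". The leftover field records what
the WARN_ON_ONCE(count) above would see on the early exit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a bio_vec: only the length matters here. */
struct seg {
	unsigned int len;
};

/* Hypothetical result type: sectors consumed, and the value of count
 * at the early exit (what WARN_ON_ONCE(count) would test). */
struct walk_result {
	unsigned int sectors;
	int leftover;
};

struct walk_result walk_segs(const struct seg *segs, size_t nsegs,
			     unsigned int sectorsize, int batch)
{
	struct walk_result res = { 0, 0 };
	unsigned int remaining = 0;	/* sectors not yet consumed */
	unsigned int left;		/* bytes left in current segment */
	size_t idx = 0;
	size_t i;
	int count;

	assert(sectorsize > 0 && batch > 0);
	for (i = 0; i < nsegs; i++) {
		/* Assumption: segments are whole, non-empty sector runs. */
		assert(segs[i].len > 0 && segs[i].len % sectorsize == 0);
		remaining += segs[i].len / sectorsize;
	}
	if (nsegs == 0)
		return res;

	left = segs[0].len;
	while (idx < nsegs) {
		/* Model of "find some crcs": never more than what remains. */
		count = (unsigned int)batch < remaining ? batch : (int)remaining;
		while (count--) {
			res.sectors++;
			remaining--;
			left -= sectorsize;
			if (left == 0) {
				idx++;
				/*
				 * Check the index before touching the next
				 * segment; without this, segs[idx].len is
				 * read one element past the end.
				 */
				if (idx >= nsegs) {
					res.leftover = count;
					goto done;
				}
				left = segs[idx].len;
			}
		}
	}
done:
	return res;
}
```

In this model count can never exceed the sectors that remain, so the
early exit is always taken with leftover == 0. That mirrors the argument
above, but only for the model; it says nothing about crc lookups that
behave differently from the "at most batch" assumption.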

I don't see a way for it to happen when count != 0, and I ran xfstests
on a few machines to triple-check that.  If there are new bugs
hiding here, we'll have EIOs returned up to userland because this
function didn't properly fetch the crcs.  If anyone reported the EIOs,
they would send in the WARN_ON output too, so we'd know right away not
to blame their hardware.

I also ran for days with heavy read/write loads without seeing the crc
errors.  I didn't have the WARN_ON or CONFIG_DEBUG_PAGEALLOC on that
box, but if other things were wrong, we'd have done a lot worse than poke
into bvec->bv_len, and the crc errors would have stopped the test.

-chris
