All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vegard Nossum <vegard.nossum@gmail.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>,
	Dave Jones <davej@codemonkey.org.uk>, Chris Mason <clm@fb.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jens Axboe <axboe@fb.com>, Al Viro <viro@zeniv.linux.org.uk>,
	Josef Bacik <jbacik@fb.com>, David Sterba <dsterba@suse.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Dave Chinner <david@fromorbit.com>
Subject: Re: bio linked list corruption.
Date: Mon, 5 Dec 2016 19:26:04 +0100	[thread overview]
Message-ID: <CAOMGZ=HBsKuqhgoq8nKcE21Z4AiCtNfQgWyk_JmzbwVWfb7GiA@mail.gmail.com> (raw)
In-Reply-To: <CALCETrX9+eS3KYdN2UBvmBsMhKr3f-PBSC8UHqrnK8WhE6_guw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4179 bytes --]

On 5 December 2016 at 19:11, Andy Lutomirski <luto@kernel.org> wrote:
> On Sun, Dec 4, 2016 at 3:04 PM, Vegard Nossum <vegard.nossum@gmail.com> wrote:
>> On 23 November 2016 at 20:58, Dave Jones <davej@codemonkey.org.uk> wrote:
>>> On Wed, Nov 23, 2016 at 02:34:19PM -0500, Dave Jones wrote:
>>>
>>>  > [  317.689216] BUG: Bad page state in process kworker/u8:8  pfn:4d8fd4
>>>  > trace from just before this happened. Does this shed any light ?
>>>  >
>>>  > https://codemonkey.org.uk/junk/trace.txt
>>>
>>> crap, I just noticed the timestamps in the trace come from quite a bit
>>> later.  I'll tweak the code to do the taint checking/ftrace stop after
>>> every syscall, that should narrow the window some more.
>>
>> FWIW I hit this as well:
>>
>> BUG: unable to handle kernel paging request at ffffffff81ff08b7
>
> We really ought to improve this message.  If nothing else, it should
> say whether it was a read, a write, or an instruction fetch.
>
>> IP: [<ffffffff8135f2ea>] __lock_acquire.isra.32+0xda/0x1a30
>> PGD 461e067 PUD 461f063
>> PMD 1e001e1
>
> Too lazy to manually decode this right now, but I don't think it matters.
>
>> Oops: 0003 [#1] PREEMPT SMP KASAN
>
> Note this is SMP, but that just means CONFIG_SMP=y.  Vegard, how many
> CPUs does your kernel think you have?

My first crash was running on a 1-CPU guest (not intentionally, but
because of a badly configured qemu). I'm honestly surprised it
triggered at all with 1 CPU, but I guess it shows that it's not a true
concurrency issue at least!

>
>> RIP: 0010:[<ffffffff8135f2ea>]  [<ffffffff8135f2ea>]
>> __lock_acquire.isra.32+0xda/0x1a30
>> RSP: 0018:ffff8801bab8f730  EFLAGS: 00010082
>> RAX: ffffffff81ff071f RBX: 0000000000000000 RCX: 0000000000000000
>
> RAX points to kernel text.

Yes, it's somewhere in the middle of iov_iter_init() -- other crashes
also had put_prev_entity(), a null pointer, and some garbage values I
couldn't identify.

>
>> Code: 89 4d b8 44 89 45 c0 89 4d c8 4c 89 55 d0 e8 ee c3 ff ff 48 85
>> c0 4c 8b 55 d0 8b 4d c8 44 8b 45 c0 4c 8b 4d b8 0f 84 c6 01 00 00 <3e>
>> ff 80 98 01 00 00 49 8d be 48 07 00 00 48 ba 00 00 00 00 00
>
>   2b:    3e ff 80 98 01 00 00     incl   %ds:*0x198(%rax)        <--
> trapping instruction
>
> That's very strange.  I think this is:
>
> atomic_inc((atomic_t *)&class->ops);
>
> but my kernel contains:
>
>     3cb4:       f0 ff 80 98 01 00 00    lock incl 0x198(%rax)
>
> So your kernel has been smp-alternatived.  That 3e comes from
> alternatives_smp_unlock.  If you're running on SMP with UP
> alternatives, things will break.

Yes, indeed. It was running on 1 CPU by mistake and still triggered the bug.

The crashes started really pouring in once I got my qemu fixed. Just
to reassure you, here's another crash which shows it's using the
correct instruction on an actual multicore guest:

BUG: unable to handle kernel paging request at 00000003030001de
IP: [<ffffffff8135f2ea>] __lock_acquire.isra.32+0xda/0x1a30
PGD 183fd2067 PUD 0

Oops: 0002 [#1] PREEMPT SMP KASAN
Dumping ftrace buffer:
  (ftrace buffer empty)
CPU: 23 PID: 9584 Comm: trinity-c104 Not tainted 4.9.0-rc7+ #219
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
task: ffff880189f68000 task.stack: ffff88017fe50000
RIP: 0010:[<ffffffff8135f2ea>]  [<ffffffff8135f2ea>]
__lock_acquire.isra.32+0xda/0x1a30
RSP: 0018:ffff88017fe575e0  EFLAGS: 00010002
RAX: 0000000303000046 RBX: 0000000000000000 RCX: 0000000000000000
[...]
Code: 89 4d b8 44 89 45 c0 89 4d c8 4c 89 55 d0 e8 ee c3 ff ff 48 85
c0 4c 8b 55 d0 8b 4d c8 44 8b 45 c0 4c 8b 4d b8 0f 84 c6 01 00 00 <f0>
ff 80 98 01 00 00 49 8d be 48 07 00 00 48 ba 00 00 00 00 00
RIP  [<ffffffff8135f2ea>] __lock_acquire.isra.32+0xda/0x1a30
RSP <ffff88017fe575e0>
CR2: 00000003030001de
---[ end trace 2846425104eb6141 ]---
Kernel panic - not syncing: Fatal exception
------------[ cut here ]------------

  2b:*  f0 ff 80 98 01 00 00    lock incl 0x198(%rax)           <--
trapping instruction

> What's your kernel command line?  Can we have your entire kernel log from boot?

Just in case you still want this, I've attached the boot log for the
"true SMP" guest above.


Vegard

[-- Attachment #2: 5.txt.gz --]
[-- Type: application/x-gzip, Size: 17767 bytes --]

  parent reply	other threads:[~2016-12-05 18:26 UTC|newest]

Thread overview: 118+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-11 14:45 btrfs bio linked list corruption Dave Jones
2016-10-11 15:11 ` Al Viro
2016-10-11 15:19   ` Dave Jones
2016-10-11 15:20     ` Chris Mason
2016-10-11 15:49       ` Dave Jones
2016-10-11 15:54 ` Chris Mason
2016-10-11 16:25   ` Dave Jones
2016-10-12 13:47   ` Dave Jones
2016-10-12 14:40     ` Dave Jones
2016-10-12 14:42       ` Chris Mason
2016-10-13 18:16         ` Dave Jones
2016-10-13 21:18           ` Chris Mason
2016-10-13 21:56             ` Dave Jones
2016-10-16  0:42             ` Dave Jones
2016-10-18  1:07               ` Chris Mason
2016-10-18 22:42 ` Dave Jones
2016-10-18 23:12   ` Jens Axboe
2016-10-18 23:31     ` Chris Mason
2016-10-18 23:36       ` Jens Axboe
2016-10-18 23:39       ` Linus Torvalds
2016-10-18 23:42         ` Chris Mason
2016-10-19  0:10           ` Linus Torvalds
2016-10-19  0:19             ` Chris Mason
2016-10-19  0:28             ` Linus Torvalds
2016-10-20 22:48               ` Dave Jones
2016-10-19  1:05             ` Andy Lutomirski
2016-10-20 22:50               ` Dave Jones
2016-10-20 23:01                 ` Andy Lutomirski
2016-10-20 23:03                   ` Dave Jones
2016-10-20 23:23                     ` Andy Lutomirski
2016-10-21 20:02                       ` Dave Jones
2016-10-21 20:17                         ` Chris Mason
2016-10-21 20:23                           ` Dave Jones
2016-10-21 20:38                             ` Chris Mason
2016-10-21 20:41                               ` Josef Bacik
2016-10-21 21:11                                 ` Dave Jones
2016-10-22 15:20                         ` Dave Jones
2016-10-23 21:32                           ` Chris Mason
2016-10-24  4:40                             ` Dave Jones
2016-10-24 13:42                               ` Chris Mason
2016-10-26  0:27                                 ` Dave Jones
2016-10-26  1:33                                   ` Linus Torvalds
2016-10-26  1:39                                     ` Linus Torvalds
2016-10-26 16:30                                       ` Dave Jones
2016-10-26 16:48                                         ` Linus Torvalds
2016-10-26 18:18                                           ` Dave Jones
2016-10-26 18:42                                           ` Dave Jones
2016-10-26 19:06                                             ` Linus Torvalds
2016-10-26 20:00                                               ` Chris Mason
2016-10-26 21:52                                                 ` Chris Mason
2016-10-26 22:21                                                   ` Linus Torvalds
2016-10-26 22:40                                                     ` Dave Jones
2016-10-26 22:51                                                       ` Linus Torvalds
2016-10-26 22:55                                                         ` Jens Axboe
2016-10-26 22:58                                                         ` Linus Torvalds
2016-10-26 23:03                                                           ` Jens Axboe
2016-10-26 23:07                                                             ` Dave Jones
2016-10-26 23:08                                                             ` Linus Torvalds
2016-10-26 23:20                                                               ` Jens Axboe
2016-10-26 23:38                                                                 ` Chris Mason
2016-10-26 23:47                                                                   ` Dave Jones
2016-10-27  0:00                                                                     ` Jens Axboe
2016-10-27 13:33                                                                       ` Chris Mason
2016-10-31 18:55                                                                     ` Dave Jones
2016-10-31 19:35                                                                       ` Linus Torvalds
2016-10-31 19:44                                                                         ` Chris Mason
2016-11-06 16:55                                                                           ` btrfs btree_ctree_super fault Dave Jones
2016-11-08 14:59                                                                             ` Dave Jones
2016-11-08 15:08                                                                               ` Chris Mason
2016-11-10 14:35                                                                                 ` Dave Jones
2016-11-10 15:27                                                                                   ` Chris Mason
2016-11-23 19:34                                                                           ` bio linked list corruption Dave Jones
2016-11-23 19:58                                                                             ` Dave Jones
2016-12-01 15:32                                                                               ` btrfs_destroy_inode warn (outstanding extents) Dave Jones
2016-12-03 16:48                                                                                 ` Dave Jones
2016-12-07 16:15                                                                                   ` Dave Jones
2016-12-09 21:12                                                                                 ` Steven Rostedt
2016-12-04 23:04                                                                               ` bio linked list corruption Vegard Nossum
2016-12-05 11:10                                                                                 ` Vegard Nossum
2016-12-05 17:09                                                                                   ` Vegard Nossum
2016-12-05 17:21                                                                                     ` Dave Jones
2016-12-05 17:55                                                                                     ` Linus Torvalds
2016-12-05 19:11                                                                                       ` Vegard Nossum
2016-12-05 20:10                                                                                         ` Linus Torvalds
2016-12-05 20:35                                                                                           ` Linus Torvalds
2016-12-05 21:33                                                                                             ` Vegard Nossum
2016-12-06  8:42                                                                                               ` Vegard Nossum
2016-12-06  8:16                                                                                             ` Peter Zijlstra
2016-12-06  8:36                                                                                               ` Ingo Molnar
2016-12-06 16:33                                                                                               ` Linus Torvalds
2016-12-05 20:10                                                                                         ` Vegard Nossum
2016-12-05 18:11                                                                                 ` Andy Lutomirski
2016-12-05 18:25                                                                                   ` Linus Torvalds
2016-12-05 18:26                                                                                   ` Vegard Nossum [this message]
2016-10-26 23:19                                                             ` Chris Mason
2016-10-26 23:21                                                               ` Jens Axboe
2016-10-27  6:33                                                             ` Christoph Hellwig
2016-10-27 16:34                                                               ` Linus Torvalds
2016-10-27 16:36                                                                 ` Jens Axboe
2016-10-26 23:01                                                         ` Dave Jones
2016-10-26 23:05                                                           ` Jens Axboe
2016-10-26 22:52                                                       ` Jens Axboe
2016-10-26 22:07                                                 ` Linus Torvalds
2016-10-26 22:54                                                   ` Chris Mason
2016-10-27  5:41                                   ` Dave Chinner
2016-10-27 17:23                                     ` Dave Jones
2016-10-24 20:06                               ` Andy Lutomirski
2016-10-24 20:46                                 ` Linus Torvalds
2016-10-24 21:17                                   ` Linus Torvalds
2016-10-24 21:50                                     ` Linus Torvalds
2016-10-24 22:02                                       ` Chris Mason
2016-10-24 22:42                                   ` Andy Lutomirski
2016-10-25  0:00                                     ` Linus Torvalds
2016-10-25  1:09                                       ` Andy Lutomirski
2016-10-19 17:09           ` Philipp Hahn
2016-10-19 17:43             ` Linus Torvalds
2016-10-20  6:52               ` Ingo Molnar
2016-10-20  7:17                 ` Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOMGZ=HBsKuqhgoq8nKcE21Z4AiCtNfQgWyk_JmzbwVWfb7GiA@mail.gmail.com' \
    --to=vegard.nossum@gmail.com \
    --cc=axboe@fb.com \
    --cc=bp@alien8.de \
    --cc=clm@fb.com \
    --cc=davej@codemonkey.org.uk \
    --cc=david@fromorbit.com \
    --cc=dsterba@suse.com \
    --cc=jbacik@fb.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.