From: Anand Jain <anand.jain@oracle.com>
To: dsterba@suse.cz, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 13/13] btrfs: optimize check for stale device
Date: Wed, 23 Mar 2016 00:43:30 +0800
Message-ID: <56F17632.8080901@oracle.com>
In-Reply-To: <20160322122119.GJ8095@twin.jikos.cz>



On 03/22/2016 08:21 PM, David Sterba wrote:
> On Fri, Feb 19, 2016 at 03:10:16PM +0800, Anand Jain wrote:
>>> I see crashes with btrfs/011 on a non-debugging config
>>>
>>> [  641.714363] BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
>>> [  641.716057] IP: [<ffffffffa0152eb6>] scrub_setup_ctx.isra.19+0x1f6/0x260 [btrfs]
>>> [  641.717036] PGD 720c1067 PUD 720c2067 PMD 0
>>> [  641.717749] Oops: 0000 [#1] PREEMPT SMP
>> ::
>>> [  641.723163] CPU: 0 PID: 27766 Comm: btrfs Not tainted 4.5.0-rc3-next-20160212-1.g38290f0-vanilla #1
>>> [  641.724420] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
>>> [  641.725723] task: ffff8800742481c0 ti: ffff880071d10000 task.ti: ffff880071d10000
>>> [  641.726954] RIP: 0010:[<ffffffffa0152eb6>]  [<ffffffffa0152eb6>] scrub_setup_ctx.isra.19+0x1f6/0x260 [btrfs]
>>> [  641.728404] RSP: 0018:ffff880071d13ce8  EFLAGS: 00010202
>>> [  641.729413] RAX: ffff88007231e800 RBX: ffff88007231e800 RCX: 0000000000000000
>>> [  641.730610] RDX: ffffffffa0195638 RSI: ffffffffa017c5a8 RDI: ffff88007231ea80
>>> [  641.731832] RBP: ffff880071d13d18 R08: 0000000000000000 R09: ffff88007204ea00
>>> [  641.733085] R10: 0000000000000008 R11: 0000000000000000 R12: 0000000000000000
>>> [  641.734307] R13: 0000000000000001 R14: ffff88007231e9f8 R15: 000000000000003f
>>> [  641.735544] FS:  00007f03ed36d8c0(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
>>> [  641.736883] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  641.738022] CR2: 0000000000000068 CR3: 00000000720c0000 CR4: 00000000000006f0
>>> [  641.739325] Stack:
>>> [  641.740156]  ffff8800724d4000 ffff8800724d4000 0000000000000000 ffff8800722ef000
>>> [  641.741735]  0000000000000000 ffff8800724d4fc8 ffff880071d13d98 ffffffffa01566fd
>>> [  641.743163]  ffff88007b127000 0000001900000000 ffff8800724d4ce8 0000000000000000
>>> [  641.744599] Call Trace:
>>> [  641.745553]  [<ffffffffa01566fd>] btrfs_scrub_dev+0x13d/0x510 [btrfs]
>>> [  641.746894]  [<ffffffffa0169ca9>] btrfs_dev_replace_start+0x279/0x3f0 [btrfs]
>>> [  641.748282]  [<ffffffffa0132839>] btrfs_ioctl+0x1869/0x2070 [btrfs]
>>> [  641.749587]  [<ffffffff8106d553>] ? pte_alloc_one+0x33/0x40
>>> [  641.750850]  [<ffffffff81222516>] do_vfs_ioctl+0x96/0x590
>>> [  641.752128]  [<ffffffff810682d1>] ? __do_page_fault+0x181/0x450
>>> [  641.753432]  [<ffffffff81222a89>] SyS_ioctl+0x79/0x90
>>> [  641.754663]  [<ffffffff816d4336>] entry_SYSCALL_64_fastpath+0x1e/0xa8
>>> [  641.756037] Code: 00 48 c7 c2 38 56 19 a0 48 c7 c6 a8 c5 17 a0 e8 21 39 f7 e0 45 85 ed 48 c7 83 68 02 00 00 00 00 00 00 48 89 d8 0f 84 03 ff ff ff <49> 83 7c 24 68 00 74 40 c7 83 78 02 00 00 20 00 00 00 4c 89 a3
>>> [  641.760392] RIP  [<ffffffffa0152eb6>] scrub_setup_ctx.isra.19+0x1f6/0x260 [btrfs]
>>> [  641.761970]  RSP <ffff880071d13ce8>
>>> [  641.763190] CR2: 0000000000000068
>>> [  641.767218] ---[ end trace f46d4e6a90bda310 ]---
>>>
>>> the dereference happens at offset 0x68 which matches bdev in
>>> btrfs_device, so this patch is my best guess at the moment. I'm not able
>>> to reproduce it directly so I need to wait for a rebuild and repeat.
>>
>>
>>     Looks like dev was fine when find_device was called, but
>>     later it was null when ->bdev was accessed.
>>
>>     I couldn't reproduce it here. There are 10 workouts within btrfs/011;
>>     any idea which workout caused this? As of now I am guessing:
>>
>>     workout "-m dup -d single" 1 cancel quick
>>
>>     digging more.
>
> I have not been able to reproduce the crash since. All is ok on a
> physical machine; in a virtual machine under kvm the test runs for a
> long time and then freezes (serial console, ssh). The kvm process eats
> 100% cpu, so it is not possible to debug it directly. The branch stays
> in my for-next and is on the way to 4.7, we'll see if we can reproduce it.

Agreed. Thanks Dave.
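
For readers following the oops quoted above, the reasoning that the fault is a NULL base pointer plus offset 0x68 can be cross-checked from the log itself. The sketch below is only an illustration: the helper names are made up, the decoder handles just the single instruction encoding seen at the marked byte in the Code: line (and only base registers r8-r15), and it is not a general disassembler. It does not check that 0x68 is the offset of bdev in struct btrfs_device; that observation is David's.

```python
import re

# Lines copied from the oops quoted above (timestamps dropped, Code: line
# shortened to the bytes around the marked faulting instruction).
OOPS = """\
BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
R10: 0000000000000008 R11: 0000000000000000 R12: 0000000000000000
CR2: 0000000000000068
Code: 48 89 d8 0f 84 03 ff ff ff <49> 83 7c 24 68 00 74 40 c7 83 78 02
"""


def register(name, text):
    """Return the value of a 16-hex-digit register field such as 'R12: ...'."""
    match = re.search(rf"\b{name}: ([0-9a-f]{{16}})\b", text)
    return int(match.group(1), 16)


def faulting_bytes(text):
    """Return the instruction bytes starting at the <..> marked byte."""
    code = re.search(r"Code: (.*)", text).group(1)
    marked = code.split("<", 1)[1].replace(">", "")
    return [int(byte, 16) for byte in marked.split()]


def decode_cmp_disp8(insn):
    """Decode 'REX 83 /7 modrm sib disp8 imm8' and return (base_reg, disp)."""
    rex, opcode, modrm, sib, disp8 = insn[:5]
    assert rex & 0xF0 == 0x40 and opcode == 0x83
    # mod == 01 means an 8-bit displacement; rm == 100 means a SIB byte follows.
    assert modrm >> 6 == 0b01 and modrm & 0b111 == 0b100
    # index == 100 (with REX.X clear) means no index register is used.
    assert (sib >> 3) & 0b111 == 0b100 and not rex & 0b010
    base = (sib & 0b111) | ((rex & 0b001) << 3)  # REX.B extends the base
    disp = disp8 - 256 if disp8 >= 0x80 else disp8  # disp8 is signed
    return base, disp


base_reg, disp = decode_cmp_disp8(faulting_bytes(OOPS))
fault_addr = register(f"R{base_reg}", OOPS) + disp
print(f"base=r{base_reg} disp={disp:#x} "
      f"fault={fault_addr:#x} cr2={register('CR2', OOPS):#x}")
```

With the values from this oops, the base register is r12, which the register dump shows as zero, so the computed address equals the CR2 value of 0x68.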


Thread overview: 27+ messages
2016-02-13  2:01 [PATCH resend 00/13] misc patches plus Introduce device delete by devid Anand Jain
2016-02-13  2:01 ` [PATCH v2 01/13] btrfs: pass the error code to the btrfs_std_error and log ret Anand Jain
2016-02-13  2:01 ` [PATCH 02/13] btrfs: create a helper function to read the disk super Anand Jain
2016-02-13  2:01 ` [PATCH v2 03/13] btrfs: maintain consistency in logging to help debugging Anand Jain
2016-02-13  2:01 ` [PATCH v2 04/13] btrfs: device path change must be logged Anand Jain
2016-02-13  2:01 ` [PATCH 05/13] Btrfs: fix fs logging for multi device Anand Jain
2016-02-13  2:01 ` [PATCH v2 06/13] btrfs: create helper function __check_raid_min_devices() Anand Jain
2016-02-15 14:51   ` David Sterba
2016-02-13  2:01 ` [PATCH 07/13] btrfs: clean up and optimize __check_raid_min_device() Anand Jain
2016-02-13  2:01 ` [PATCH v2 08/13] btrfs: create helper btrfs_find_device_by_user_input() Anand Jain
2016-02-13  2:01 ` [PATCH 09/13] btrfs: make use of btrfs_find_device_by_user_input() Anand Jain
2016-02-15 16:47   ` David Sterba
2016-02-15 16:53     ` David Sterba
2016-02-13  2:01 ` [PATCH v2 10/13] btrfs: enhance btrfs_find_device_by_user_input() to check device path Anand Jain
2016-02-13  2:01 ` [PATCH v2 11/13] btrfs: make use of btrfs_scratch_superblocks() in btrfs_rm_device() Anand Jain
2016-02-13  2:01 ` [PATCH v4 12/13] btrfs: introduce device delete by devid Anand Jain
2016-02-17 10:49   ` David Sterba
2016-02-18  6:59     ` Anand Jain
2016-02-18  9:53       ` David Sterba
2016-02-13  2:01 ` [PATCH 13/13] btrfs: optimize check for stale device Anand Jain
2016-02-18 15:13   ` David Sterba
2016-02-19  7:10     ` Anand Jain
2016-02-19  9:15       ` Anand Jain
2016-03-22 12:21       ` David Sterba
2016-03-22 16:43         ` Anand Jain [this message]
2016-03-09  9:54     ` Anand Jain
2016-03-09 16:33       ` David Sterba
