From: CAI Qian <caiqian@redhat.com>
To: Rob Herring <robh@kernel.org>, Jiri Olsa <jolsa@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Kan Liang <kan.liang@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>
Subject: [4.9-rc1+] intel_uncore builtin + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic
Date: Wed, 19 Oct 2016 10:45:31 -0400 (EDT)
Message-ID: <1035662571.647973.1476888331396.JavaMail.zimbra@redhat.com>
In-Reply-To: <CAL_JsqJkSqV00esLTfoGJkf31kuA9F4daRx6Gsf27ZPs-AAAZQ@mail.gmail.com>

It turns out this is reproducible only when intel_uncore is compiled as a builtin, i.e.,
not as a module. It can still be reproduced with yesterday's mainline.

Here is some information about the system:

Intel Platform: Grantley-R Wildcat Pass CPU: Broadwell-EP, B0.
Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz

[   66.349263] PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
[   66.356672] software IO TLB [mem 0x71c7d000-0x75c7d000] (64MB) mapped at [ffff880071c7d000-ffff880075c7cfff]
[   66.369911] Intel CQM monitoring enabled
[   66.374445] Intel MBM enabled
[   66.385708] RAPL PMU: API unit is 2^-32 Joules, 4 fixed counters, 655360 ms ovfl timer
[   66.394564] RAPL PMU: hw unit of domain pp0-core 2^-14 Joules
[   66.400991] RAPL PMU: hw unit of domain package 2^-14 Joules
[   66.407317] RAPL PMU: hw unit of domain dram 2^-14 Joules
[   66.413358] RAPL PMU: hw unit of domain pp1-gpu 2^-14 Joules
[   66.434040] ================================================================================
[   66.443462] UBSAN: Undefined behaviour in drivers/base/core.c:1251:17
[   66.450653] member access within null pointer of type 'struct device'
[   66.457845] CPU: 68 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc1-lockfix+ #48
[   66.465809] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRRFSDP1.86B.0271.R00.1510301446 10/30/2015
[   66.477168]  ffff880847aff798 ffffffff81d370b4 0000000041b58ab3 ffffffff83348dcf
[   66.485469]  ffffffff81d36ff4 ffff880847aff7c0 ffff880847aff770 ffff880e3f9d8000
[   66.493770]  ffffffff82ff8a00 ffffffff8309c5c0 00000000000004e3 000000009091f309
[   66.502073] Call Trace:
[   66.504811]  [<ffffffff81d370b4>] dump_stack+0xc0/0x12c
[   66.510644]  [<ffffffff81d36ff4>] ? _atomic_dec_and_lock+0xc4/0xc4
[   66.517548]  [<ffffffff81e5ac85>] ubsan_epilogue+0xd/0x8a
[   66.523574]  [<ffffffff81e5ae68>] __ubsan_handle_type_mismatch+0x166/0x434
[   66.531253]  [<ffffffff813294dd>] ? get_lock_stats+0x1d/0x120
[   66.537667]  [<ffffffff81e5ad02>] ? ubsan_epilogue+0x8a/0x8a
[   66.543985]  [<ffffffff82241acc>] device_del+0x6fc/0x860
[   66.549917]  [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
[   66.557494]  [<ffffffff822413d0>] ? cleanup_glue_dir+0x140/0x140
[   66.564202]  [<ffffffff8160a6f2>] perf_pmu_unregister+0x142/0x6d0
[   66.571006]  [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
[   66.577619]  [<ffffffff810559f7>] uncore_pmu_unregister+0x67/0xd0
[   66.584422]  [<ffffffff8105ae6c>] uncore_pci_remove+0x32c/0x510
[   66.591025]  [<ffffffff81ec8392>] pci_device_remove+0xb2/0x240
[   66.597539]  [<ffffffff8224fe76>] driver_probe_device+0x146/0xfc0
[   66.604340]  [<ffffffff82250cf0>] ? driver_probe_device+0xfc0/0xfc0
[   66.611334]  [<ffffffff82250ea5>] __driver_attach+0x1b5/0x230
[   66.617749]  [<ffffffff82248e60>] bus_for_each_dev+0x130/0x200
[   66.624264]  [<ffffffff81353300>] ? do_raw_spin_trylock+0x110/0x110
[   66.631258]  [<ffffffff82248d30>] ? subsys_dev_iter_init+0x100/0x100
[   66.638349]  [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
[   66.644959]  [<ffffffff8224eaa2>] driver_attach+0x42/0x70
[   66.650976]  [<ffffffff8224d846>] bus_add_driver+0x406/0x870
[   66.657292]  [<ffffffff822535b9>] driver_register+0x1a9/0x3d0
[   66.663704]  [<ffffffff81352942>] ? __raw_spin_lock_init+0x32/0x120
[   66.670700]  [<ffffffff81ec2a1d>] __pci_register_driver+0x1ad/0x2b0
[   66.677694]  [<ffffffff81ec2870>] ? pci_pm_runtime_idle+0x180/0x180
[   66.684694]  [<ffffffff858f57b5>] intel_uncore_init+0x58d/0x64c
[   66.691300]  [<ffffffff858ed56d>] ? amd_iommu_pc_init+0x16/0x344
[   66.698006]  [<ffffffff858f5228>] ? uncore_type_init+0x5cb/0x5cb
[   66.704710]  [<ffffffff81000587>] do_one_initcall+0xb7/0x2a0
[   66.711025]  [<ffffffff810004d0>] ? initcall_blacklisted+0x1a0/0x1a0
[   66.718116]  [<ffffffff8132687d>] ? up_write+0x7d/0x120
[   66.723949]  [<ffffffff81326800>] ? up_read+0x40/0x40
[   66.729587]  [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
[   66.737165]  [<ffffffff8130db04>] ? __wake_up+0x44/0x50
[   66.743000]  [<ffffffff858e71b9>] kernel_init_freeable+0x68a/0x768
[   66.749900]  [<ffffffff858e6b2f>] ? start_kernel+0x751/0x751
[   66.756219]  [<ffffffff81075ec0>] ? compat_start_thread+0xa0/0xa0
[   66.763013]  [<ffffffff82c704c0>] ? rest_init+0x190/0x190
[   66.769039]  [<ffffffff82c704d3>] kernel_init+0x13/0x140
[   66.774967]  [<ffffffff82c704c0>] ? rest_init+0x190/0x190
[   66.780993]  [<ffffffff82c8b0d7>] ret_from_fork+0x27/0x40
[   66.787019] ================================================================================
[   66.796479] kasan: CONFIG_KASAN_INLINE enabled
[   66.801450] kasan: GPF could be caused by NULL-ptr deref or user memory access
[   66.809525] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[   66.817878] Modules linked in:
[   66.821295] CPU: 68 PID: 1 Comm: swapper/0 Not tainted 4.9.0-rc1-lockfix+ #48
[   66.829260] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS GRRFSDP1.86B.0271.R00.1510301446 10/30/2015
[   66.840618] task: ffff880e3f9d8000 task.stack: ffff880847af8000
[   66.847225] RIP: 0010:[<ffffffff82241466>]  [<ffffffff82241466>] device_del+0x96/0x860
[   66.856076] RSP: 0000:ffff880847aff868  EFLAGS: 00010246
[   66.862002] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[   66.869967] RDX: 0000000000000000 RSI: ffffffff82ea0cc0 RDI: ffffed0108f5ff06
[   66.877931] RBP: ffff880847aff920 R08: ffff880e3f9d8000 R09: 0000000000000007
[   66.885894] R10: 0000000000000000 R11: 0000000000000006 R12: ffff880844094930
[   66.893859] R13: 0000000000000001 R14: ffff880844094800 R15: ffff880844095258
[   66.901824] FS:  0000000000000000(0000) GS:ffff880e54e00000(0000) knlGS:0000000000000000
[   66.910853] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   66.917265] CR2: 0000000000000000 CR3: 000000000360a000 CR4: 00000000003406e0
[   66.925228] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   66.933191] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   66.941154] Stack:
[   66.943396]  ffffffff82c8a5d2 ffff881077f705c0 1ffff10108f5ff13 ffff880847aff920
[   66.951698]  0000000000000000 ffffffff86d346c8 0000000041b58ab3 ffffffff8338e870
[   66.959997]  ffffffff822413d0 ffff880e00000044 ffffffff00000000 ffff880847aff8c0
[   66.968296] Call Trace:
[   66.971025]  [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
[   66.978603]  [<ffffffff822413d0>] ? cleanup_glue_dir+0x140/0x140
[   66.985309]  [<ffffffff8160a6f2>] perf_pmu_unregister+0x142/0x6d0
[   66.992111]  [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
[   66.998720]  [<ffffffff810559f7>] uncore_pmu_unregister+0x67/0xd0
[   67.005523]  [<ffffffff8105ae6c>] uncore_pci_remove+0x32c/0x510
[   67.012131]  [<ffffffff81ec8392>] pci_device_remove+0xb2/0x240
[   67.018641]  [<ffffffff8224fe76>] driver_probe_device+0x146/0xfc0
[   67.025442]  [<ffffffff82250cf0>] ? driver_probe_device+0xfc0/0xfc0
[   67.032437]  [<ffffffff82250ea5>] __driver_attach+0x1b5/0x230
[   67.038852]  [<ffffffff82248e60>] bus_for_each_dev+0x130/0x200
[   67.045361]  [<ffffffff81353300>] ? do_raw_spin_trylock+0x110/0x110
[   67.052357]  [<ffffffff82248d30>] ? subsys_dev_iter_init+0x100/0x100
[   67.059450]  [<ffffffff81278cae>] ? preempt_count_sub+0x5e/0xe0
[   67.066056]  [<ffffffff8224eaa2>] driver_attach+0x42/0x70
[   67.072081]  [<ffffffff8224d846>] bus_add_driver+0x406/0x870
[   67.078397]  [<ffffffff822535b9>] driver_register+0x1a9/0x3d0
[   67.084809]  [<ffffffff81352942>] ? __raw_spin_lock_init+0x32/0x120
[   67.091803]  [<ffffffff81ec2a1d>] __pci_register_driver+0x1ad/0x2b0
[   67.098798]  [<ffffffff81ec2870>] ? pci_pm_runtime_idle+0x180/0x180
[   67.105792]  [<ffffffff858f57b5>] intel_uncore_init+0x58d/0x64c
[   67.112399]  [<ffffffff858ed56d>] ? amd_iommu_pc_init+0x16/0x344
[   67.119103]  [<ffffffff858f5228>] ? uncore_type_init+0x5cb/0x5cb
[   67.125806]  [<ffffffff81000587>] do_one_initcall+0xb7/0x2a0
[   67.132124]  [<ffffffff810004d0>] ? initcall_blacklisted+0x1a0/0x1a0
[   67.139215]  [<ffffffff8132687d>] ? up_write+0x7d/0x120
[   67.145046]  [<ffffffff81326800>] ? up_read+0x40/0x40
[   67.150684]  [<ffffffff82c8a5d2>] ? _raw_spin_unlock_irqrestore+0x42/0x70
[   67.158262]  [<ffffffff8130db04>] ? __wake_up+0x44/0x50
[   67.164094]  [<ffffffff858e71b9>] kernel_init_freeable+0x68a/0x768
[   67.170992]  [<ffffffff858e6b2f>] ? start_kernel+0x751/0x751
[   67.177310]  [<ffffffff81075ec0>] ? compat_start_thread+0xa0/0xa0
[   67.184111]  [<ffffffff82c704c0>] ? rest_init+0x190/0x190
[   67.190137]  [<ffffffff82c704d3>] kernel_init+0x13/0x140
[   67.196064]  [<ffffffff82c704c0>] ? rest_init+0x190/0x190
[   67.202090]  [<ffffffff82c8b0d7>] ret_from_fork+0x27/0x40
[   67.208115] Code: f3 f3 65 48 8b 04 25 28 00 00 00 48 89 45 d0 31 c0 48 85 ff 0f 84 69 06 00 00 48 89 da 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 41 06 00 00 48 8b 03 48 89 85 68 ff ff ff 48 
[   67.229872] RIP  [<ffffffff82241466>] device_del+0x96/0x860
[   67.236101]  RSP <ffff880847aff868>
[   67.240059] ---[ end trace 69358e866a1e3f6c ]---
[   67.245377] Kernel panic - not syncing: Fatal exception
[   67.251271] ---[ end Kernel panic - not syncing: Fatal exception


----- Original Message -----
> From: "Rob Herring" <robh@kernel.org>
> To: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>
> Cc: "CAI Qian" <caiqian@redhat.com>, "linux-kernel" <linux-kernel@vger.kernel.org>
> Sent: Monday, October 10, 2016 2:15:29 PM
> Subject: Re: kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic
> 
> On Mon, Oct 10, 2016 at 12:20 PM, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> > On Mon, Oct 10, 2016 at 11:37:27AM -0400, CAI Qian wrote:
> >> Not sure if anyone reported this before. With this kernel config, it is
> >> 100% kernel panic so far with today's
> >> mainline master HEAD.
> >>
> >> http://people.redhat.com/qcai/tmp/config-kasan-remove
> >
> > Oh it breaks things with kasan disabled as well :)
> >
> > See Laszlo's bug report already a few hours ago, Rob is on it...
> 
> I think this one is different though. It has a remove() hook.
> 
> Rob
>

Thread overview: 18+ messages
     [not found] <907882571.66590.1476113724660.JavaMail.zimbra@redhat.com>
2016-10-10 15:37 ` kasan inline + CONFIG_DEBUG_TEST_DRIVER_REMOVE kernel panic CAI Qian
2016-10-10 17:09   ` Rob Herring
2016-10-10 18:25     ` CAI Qian
2016-10-10 17:20   ` Greg Kroah-Hartman
2016-10-10 18:15     ` Rob Herring
2016-10-10 18:22       ` CAI Qian
2016-10-10 19:34         ` Rob Herring
2016-10-10 20:09           ` CAI Qian
2016-10-19 14:45       ` CAI Qian [this message]
2016-10-19 19:19         ` [4.9-rc1+] intel_uncore builtin " Jiri Olsa
2016-10-19 20:18           ` CAI Qian
2016-10-20  5:39           ` Peter Zijlstra
2016-10-20  8:58             ` Jiri Olsa
2016-10-20  9:04               ` Peter Zijlstra
2016-10-20  9:42                 ` Jiri Olsa
2016-10-20 11:10                   ` [PATCH] perf: Protect pmu device removal with pmu_bus_running check " Jiri Olsa
2016-10-20 14:30                     ` CAI Qian
2016-10-28 10:10                     ` [tip:perf/urgent] perf/core: Protect PMU device removal with a 'pmu_bus_running' check, to fix CONFIG_DEBUG_TEST_DRIVER_REMOVE=y " tip-bot for Jiri Olsa
