From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751294AbdIPMq5 (ORCPT ); Sat, 16 Sep 2017 08:46:57 -0400
Received: from mga01.intel.com ([192.55.52.88]:23499 "EHLO mga01.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751185AbdIPMq4 (ORCPT ); Sat, 16 Sep 2017 08:46:56 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.42,402,1500966000"; d="scan'208";a="1015102101"
Date: Sat, 16 Sep 2017 20:46:52 +0800
From: Fengguang Wu
To: Thomas Gleixner
Cc: LKP, LKML, Don Zickus, Ingo Molnar, Peter Zijlstra
Subject: Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208
Message-ID: <20170916124652.jpjoj4zgosw2af2z@wfg-t540p.sh.intel.com>
References: <59baf8db.Rfy+1ZsQ37PfCiRH%fengguang.wu@intel.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="uwutss6cizgiv7op"
Content-Disposition: inline
In-Reply-To:
User-Agent: NeoMutt/20161104 (1.7.1)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

--uwutss6cizgiv7op
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Disposition: inline

On Fri, Sep 15, 2017 at 06:24:20PM +0200, Thomas Gleixner wrote:
>On Fri, 15 Sep 2017, Thomas Gleixner wrote:
>
>> On Fri, 15 Sep 2017, Thomas Gleixner wrote:
>>
>> > On Fri, 15 Sep 2017, kernel test robot wrote:
>> > > [ 0.035023] CPU: Intel Common KVM processor (family: 0xf, model: 0x6, stepping: 0x1)
>> > > [ 0.042302] Performance Events: unsupported Netburst CPU model 6 no PMU driver, software events only.
>> >
>> > Cute. So there is no supported PMU, but for some unknown reason the lockup
>> > detector can create an event, otherwise the perf availaibility check in
>> > lockup_detector_init() would fail ....
>> >
>> > Peter???
>>
>> In my VM the corresponding dmesg is:
>>
>> [ 0.038086] Performance Events: unsupported p6 CPU model 61 no PMU driver, software events only.

What's your host CPU? I can reproduce it in Nehalem, Haswell and Sandy
Bridge machines with the attached script.

>> [ 0.041031] Hierarchical SRCU implementation.
>> [ 0.046210] NMI watchdog: Perf event create on CPU 0 failed with -2
>> [ 0.046980] NMI watchdog: Perf NMI watchdog permanetely disabled
>>
>> Confused
>
>I still can't reproduce. Can you please apply the debug patch below and
>provide the output?

OK. I'll try and report back tomorrow.

Thanks,
Fengguang

>8<-----------------
>
>diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
>index b2931154b5f2..e6c9ca516945 100644
>--- a/kernel/watchdog_hld.c
>+++ b/kernel/watchdog_hld.c
>@@ -171,6 +171,7 @@ static int hardlockup_detector_event_create(void)
> 	/* Try to register using hardware perf events */
> 	evt = perf_event_create_kernel_counter(wd_attr, cpu, NULL,
> 					       watchdog_overflow_callback, NULL);
>+	pr_info("EVT create on CPU %u returned %p\n", cpu, evt);
> 	if (IS_ERR(evt)) {
> 		pr_info("Perf event create on CPU %d failed with %ld\n", cpu,
> 			PTR_ERR(evt));
>@@ -221,7 +222,10 @@ void hardlockup_detector_perf_cleanup(void)
> 		struct perf_event *event = per_cpu(watchdog_ev, cpu);
>
> 		per_cpu(watchdog_ev, cpu) = NULL;
>-		perf_event_release_kernel(event);
>+		pr_info("EVT on CPU %u in dead mask: %p\n", cpu, event);
>+		if (event)
>+			perf_event_release_kernel(event);
>+
> 	}
> 	cpumask_clear(&dead_events_mask);
> }

--uwutss6cizgiv7op
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="reproduce-quantal-vp-36:20170916063837:i386-randconfig-b0-09160153:4.13.0-11828-gd57108d:1"

#!/bin/bash

kernel=$1

kvm=(
	qemu-system-x86_64
	-enable-kvm
	-cpu kvm64
	-kernel $kernel
	-m 399
	-smp 2
	-device e1000,netdev=net0
	-netdev user,id=net0
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-watchdog-action debug
	-rtc base=localtime
	-serial stdio
	-display none
	-monitor null
)

append=(
	root=/dev/ram0
	hung_task_panic=1
	debug
	apic=debug
	sysrq_always_enabled
	rcupdate.rcu_cpu_stall_timeout=100
	net.ifnames=0
	printk.devkmsg=on
	panic=-1
	softlockup_panic=1
	nmi_watchdog=panic
	oops=panic
	load_ramdisk=2
	prompt_ramdisk=0
	drbd.minor_count=8
	systemd.log_level=err
	ignore_loglevel
	console=tty0
	earlyprintk=ttyS0,115200
	console=ttyS0,115200
	vga=normal
	rw
	drbd.minor_count=8
)

"${kvm[@]}" -append "${append[*]}"

--uwutss6cizgiv7op--