All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Fengguang Wu <fengguang.wu@intel.com>
Cc: LKP <lkp@01.org>, LKML <linux-kernel@vger.kernel.org>,
	Don Zickus <dzickus@redhat.com>, Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208
Date: Sat, 16 Sep 2017 19:35:10 +0200 (CEST)	[thread overview]
Message-ID: <alpine.DEB.2.20.1709161850010.2105@nanos> (raw)
In-Reply-To: <20170916124652.jpjoj4zgosw2af2z@wfg-t540p.sh.intel.com>

On Sat, 16 Sep 2017, Fengguang Wu wrote:
> > > [    0.038086] Performance Events: unsupported p6 CPU model 61 no PMU
> > > driver, software events only.
> 
> What's your host CPU? I can reproduce it in Nehalem, Haswell and Sandy
> Bridge machines with the attached script.

My bad. I booted the wrong config ....

> > > [    0.041031] Hierarchical SRCU implementation.
> > > [    0.046210] NMI watchdog: Perf event create on CPU 0 failed with -2
> > > [    0.046980] NMI watchdog: Perf NMI watchdog permanetely disabled
> > > 
> > > Confused
> > 
> > I still can't reproduce. Can you please apply the debug patch below and
> > provide the output?
> 
> OK. I'll try and report back tomorrow.

Don't bother. I found it already. On UP we have:

#define for_each_cpu(cpu, mask)               \
        for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

which is a total fail as it breaks any code which uses for_each_cpu() or
any of the other variants on UP by assuming that all cpumask have bit 0
set.

That means any code which does not have conditional code for some of the
cpumask functions is potentially broken. Sigh.

The simple cure for the watchdog is below.

Thanks,

	tglx
8<------------------

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index b2931154b5f2..d4c0f75b189e 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -221,7 +221,12 @@ void hardlockup_detector_perf_cleanup(void)
 		struct perf_event *event = per_cpu(watchdog_ev, cpu);
 
 		per_cpu(watchdog_ev, cpu) = NULL;
-		perf_event_release_kernel(event);
+		/*
+		 * Check the event, because on UP for_each_cpu() assumes
+		 * idiotically that all masks handed in have bit 0 set.
+		 */
+		if (event)
+			perf_event_release_kernel(event);
 	}
 	cpumask_clear(&dead_events_mask);
 }

WARNING: multiple messages have this Message-ID (diff)
From: Thomas Gleixner <tglx@linutronix.de>
To: lkp@lists.01.org
Subject: Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208
Date: Sat, 16 Sep 2017 19:35:10 +0200	[thread overview]
Message-ID: <alpine.DEB.2.20.1709161850010.2105@nanos> (raw)
In-Reply-To: <20170916124652.jpjoj4zgosw2af2z@wfg-t540p.sh.intel.com>

[-- Attachment #1: Type: text/plain, Size: 1855 bytes --]

On Sat, 16 Sep 2017, Fengguang Wu wrote:
> > > [    0.038086] Performance Events: unsupported p6 CPU model 61 no PMU
> > > driver, software events only.
> 
> What's your host CPU? I can reproduce it in Nehalem, Haswell and Sandy
> Bridge machines with the attached script.

My bad. I booted the wrong config ....

> > > [    0.041031] Hierarchical SRCU implementation.
> > > [    0.046210] NMI watchdog: Perf event create on CPU 0 failed with -2
> > > [    0.046980] NMI watchdog: Perf NMI watchdog permanetely disabled
> > > 
> > > Confused
> > 
> > I still can't reproduce. Can you please apply the debug patch below and
> > provide the output?
> 
> OK. I'll try and report back tomorrow.

Don't bother. I found it already. On UP we have:

#define for_each_cpu(cpu, mask)               \
        for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

which is a total fail as it breaks any code which uses for_each_cpu() or
any of the other variants on UP by assuming that all cpumask have bit 0
set.

That means any code which does not have conditional code for some of the
cpumask functions is potentially broken. Sigh.

The simple cure for the watchdog is below.

Thanks,

	tglx
8<------------------

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index b2931154b5f2..d4c0f75b189e 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -221,7 +221,12 @@ void hardlockup_detector_perf_cleanup(void)
 		struct perf_event *event = per_cpu(watchdog_ev, cpu);
 
 		per_cpu(watchdog_ev, cpu) = NULL;
-		perf_event_release_kernel(event);
+		/*
+		 * Check the event, because on UP for_each_cpu() assumes
+		 * idiotically that all masks handed in have bit 0 set.
+		 */
+		if (event)
+			perf_event_release_kernel(event);
 	}
 	cpumask_clear(&dead_events_mask);
 }










  reply	other threads:[~2017-09-16 17:35 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-14 21:47 d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG: unable to handle kernel NULL pointer dereference at 0000000000000208 kernel test robot
2017-09-14 21:47 ` kernel test robot
2017-09-15  7:50 ` Thomas Gleixner
2017-09-15  7:50   ` Thomas Gleixner
2017-09-15 12:54   ` Thomas Gleixner
2017-09-15 12:54     ` Thomas Gleixner
2017-09-15 16:24     ` Thomas Gleixner
2017-09-15 16:24       ` Thomas Gleixner
2017-09-16 12:46       ` Fengguang Wu
2017-09-16 12:46         ` Fengguang Wu
2017-09-16 17:35         ` Thomas Gleixner [this message]
2017-09-16 17:35           ` Thomas Gleixner
2017-09-16 17:55           ` Linus Torvalds
2017-09-16 17:55             ` Linus Torvalds
2017-09-16 18:12             ` Thomas Gleixner
2017-09-16 18:12               ` Thomas Gleixner
2017-09-16 18:23               ` Linus Torvalds
2017-09-16 18:23                 ` Linus Torvalds
2017-09-16 21:47                 ` Thomas Gleixner
2017-09-16 21:47                   ` Thomas Gleixner
2017-09-16 22:02                   ` Linus Torvalds
2017-09-16 22:02                     ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.20.1709161850010.2105@nanos \
    --to=tglx@linutronix.de \
    --cc=dzickus@redhat.com \
    --cc=fengguang.wu@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lkp@01.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.