From: Thomas-Mich Richter <tmricht@linux.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>, Kees Cook <keescook@chromium.org>
Cc: acme@redhat.com,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Hendrik Brueckner <brueckner@linux.ibm.com>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>
Subject: Re: WARN_ON_ONCE() hit at kernel/events/core.c:330
Date: Thu, 4 Apr 2019 11:15:39 +0200
Message-ID: <eb3cbf9b-6851-72c6-2648-3e6eaa78fb7d@linux.ibm.com>
In-Reply-To: <20190403104103.GE4038@hirez.programming.kicks-ass.net>

On 4/3/19 12:41 PM, Peter Zijlstra wrote:
> On Wed, Apr 03, 2019 at 11:47:00AM +0200, Thomas-Mich Richter wrote:
>> I use linux 5.1.0-rc3 on s390 and got this WARN_ON_ONCE message:
>>
>> WARNING: CPU: 15 PID: 0 at kernel/events/core.c:330
>>                 event_function_local.constprop.79+0xe2/0xe8
>>
>> which was introduced with
>>    commit cca2094605ef ("perf/core: Fix event_function_local()").
>> ..snip.... 
>>
>> Any ideas or hints how to avoid/fix this warning?
> 
> Some thoughts here:
> 
>   https://lkml.kernel.org/r/20190213101644.GN32534@hirez.programming.kicks-ass.net
> 
> tl;dr, I've no frigging clue.
> 

I have read this thread and at the end you mentioned:

    Humm, but in that case:

   context_switch()
     prepare_task_switch()
       perf_event_task_sched_out()
         __perf_event_task_sched_out()
           perf_event_context_sched_out()
             task_ctx_sched_out()
               ctx_sched_out()
                 group_sched_out()
                   event_sched_out()
                     if (event->pending_disable)

   Would have already cleared the pending_disable state, so the IPI would
   not have run perf_event_disable_local() in the first place.

Our test system is configured to panic on WARN_ON_ONCE(). I looked
at the dump. The event that triggered the WARN_ON_ONCE() shows:

crash> struct perf_event.oncpu 0x1f9b24800
  oncpu = 0
crash> struct perf_event.state 0x1f9b24800
  state = PERF_EVENT_STATE_ACTIVE
crash> 

This means the code in 
static void event_sched_out(....)
{
        ....
        event->pmu->del(event, 0);
        event->oncpu = -1;

        if (event->pending_disable) {
                event->pending_disable = 0;
                state = PERF_EVENT_STATE_OFF;
        }
        perf_event_set_state(event, state);
        ...
}

had not yet finished and returned when the warning fired. So the task
was not completely context-switched out from CPU 0 while the interrupt
handler was executing on CPU 15:

static void perf_pending_event(...)
{
        ....
        if (event->pending_disable) {
                event->pending_disable = 0;
                perf_event_disable_local(event);  <--- Causes the WARN_ON_ONCE()
        }
        .....
}

I think there is a race, especially when the interrupt handler on CPU 15
was invoked via a timer interrupt and runs on a different CPU than the
one the event is scheduled on.

> 
> Does it reproduce on x86 without virt on? I don't have a PPC LPAR to
> test things on.
> 

s390 LPARs always run under hypervisor control; there is no way to run an OS without it.

-- 
Thomas Richter, Dept 3252, IBM s390 Linux Development, Boeblingen, Germany
--
Chairman of the Supervisory Board: Matthias Hartmann
Management: Dirk Wittkopp
Registered office: Böblingen / Registration court: Amtsgericht Stuttgart, HRB 243294

