From: Andy Lutomirski <luto@amacapital.net>
To: Sasha Levin <sasha.levin@oracle.com>
Cc: "Paul McKenney" <paulmck@linux.vnet.ibm.com>,
	"Borislav Petkov" <bp@alien8.de>, "X86 ML" <x86@kernel.org>,
	"Linus Torvalds" <torvalds@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Oleg Nesterov" <oleg@redhat.com>,
	"Tony Luck" <tony.luck@intel.com>,
	"Andi Kleen" <andi@firstfloor.org>,
	"Josh Triplett" <josh@joshtriplett.org>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>
Subject: Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context
Date: Sat, 31 Jan 2015 04:50:12 -0800
Message-ID: <CALCETrX8XACPcTdFgLcP050W0s5wrT178wCk=3n4zwtZKOVyHg@mail.gmail.com>
In-Reply-To: <CALCETrVbThoEKpC-+jpSP4MQbb9svzcXcMy2VMLmSOmRb32C4w@mail.gmail.com>

On Fri, Jan 30, 2015 at 7:12 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Fri, Jan 30, 2015 at 5:28 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
>> On 01/30/2015 02:57 PM, Sasha Levin wrote:
>>> On 01/28/2015 04:02 PM, Andy Lutomirski wrote:
>>>> On Wed, Jan 28, 2015 at 9:48 AM, Paul E. McKenney
>>>> <paulmck@linux.vnet.ibm.com> wrote:
>>>>> On Wed, Jan 28, 2015 at 08:33:06AM -0800, Andy Lutomirski wrote:
>>>>>> On Fri, Jan 23, 2015 at 5:25 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>>>>>>> On Fri, Jan 23, 2015 at 12:48 PM, Sasha Levin <sasha.levin@oracle.com> wrote:
>>>>>>>> On 01/23/2015 01:34 PM, Andy Lutomirski wrote:
>>>>>>>>> On Fri, Jan 23, 2015 at 10:04 AM, Borislav Petkov <bp@alien8.de> wrote:
>>>>>>>>>> On Fri, Jan 23, 2015 at 09:58:01AM -0800, Andy Lutomirski wrote:
>>>>>>>>>>>> [  543.999079] Call Trace:
>>>>>>>>>>>> [  543.999079] dump_stack (lib/dump_stack.c:52)
>>>>>>>>>>>> [  543.999079] lockdep_rcu_suspicious (kernel/locking/lockdep.c:4259)
>>>>>>>>>>>> [  543.999079] atomic_notifier_call_chain (include/linux/rcupdate.h:892 kernel/notifier.c:182 kernel/notifier.c:193)
>>>>>>>>>>>> [  543.999079] ? atomic_notifier_call_chain (kernel/notifier.c:192)
>>>>>>>>>>>> [  543.999079] notify_die (kernel/notifier.c:538)
>>>>>>>>>>>> [  543.999079] ? atomic_notifier_call_chain (kernel/notifier.c:538)
>>>>>>>>>>>> [  543.999079] ? debug_smp_processor_id (lib/smp_processor_id.c:57)
>>>>>>>>>>>> [  543.999079] do_debug (arch/x86/kernel/traps.c:652)
>>>>>>>>>>>> [  543.999079] ? trace_hardirqs_on (kernel/locking/lockdep.c:2609)
>>>>>>>>>>>> [  543.999079] ? do_int3 (arch/x86/kernel/traps.c:610)
>>>>>>>>>>>> [  543.999079] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2554 kernel/locking/lockdep.c:2601)
>>>>>>>>>>>> [  543.999079] debug (arch/x86/kernel/entry_64.S:1310)
>>>>>>>>>>>
>>>>>>>>>>> I don't know how to read this stack trace.  Are we in do_int3,
>>>>>>>>>>> do_debug, or both?  I didn't change do_debug at all.
>>>>>>>>>>
>>>>>>>>>> It looks like we're in do_debug. do_int3 is only on the stack but not
>>>>>>>>>> part of the current frame if I can trust the '?' ...
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> It's possible that an int3 happened and I did something wrong on
>>>>>>>>> return that caused a subsequent do_debug to screw up, but I don't see
>>>>>>>>> how my patch would have caused that.
>>>>>>>>>
>>>>>>>>> Were there any earlier log messages?
>>>>>>>>
>>>>>>>> Nope, nothing odd before or after.
>>>>>>>
>>>>>>> Trinity just survived for a decent amount of time for me with my
>>>>>>> patches, other than a bunch of apparently expected OOM kills.  I have
>>>>>>> no idea how to tell trinity how much memory to use.
>>>>>>
>>>>>> A longer trinity run on a larger VM survived (still with some OOM
>>>>>> kills, but no taint) with these patches.  I suspect that it's a
>>>>>> regression somewhere else in the RCU changes.  I have
>>>>>> CONFIG_PROVE_RCU=y, so I should have seen the failure if it was there,
>>>>>> I think.
>>>>>
>>>>> If by "RCU changes" you mean my changes to the RCU infrastructure, I am
>>>>> going to need more of a hint than I see in this thread thus far.  ;-)
>>>>>
>>>>
>>>> I can't help much, since I can't reproduce the problem.  Presumably if
>>>> it's a bug in -tip, someone else will trigger it, too.
>>>
>>> I'm not sure what to tell you here, I'm not using any weird options for trinity
>>> to reproduce it.
>>>
>>> It doesn't happen too frequently, but I still see it happening.
>>>
>>> Would you like me to try a debug patch or something similar?
>>
>> After talking with Paul we know what's going on here:
>>
>> do_debug() calls ist_enter() to indicate we're running on the interrupt
>> stack. The first thing ist_enter() does is:
>
> I wonder whether there's an easy way to trigger this.  Probably a
> watchpoint on the user stack would do the trick.

This is embarrassing.  I just stuck an assertion in do_int3 and I can
reproduce it with int3 from user space.  Patch coming.

>
>>
>>         preempt_count_add(HARDIRQ_OFFSET);
>>
>> After this, as far as the kernel is concerned, we're in interrupt mode
>> so in_interrupt() will return true.
>>
>> Next, we'll call exception_enter() which won't do anything since:
>>
>>         void context_tracking_user_exit(void)
>>         {
>>                 unsigned long flags;
>>
>>                 if (!context_tracking_is_enabled())
>>                         return;
>>
>>                 if (in_interrupt())  <=== This returns true, so nothing else gets done
>>                         return;
>>
>> At this stage we never tell RCU that we exited user mode, but then we
>> try to use it when calling the notifiers, which explains the warnings I'm seeing.
>>
>
> Is fixing this as simple as calling exception_enter before
> incrementing the preempt count?  I'll try to have a tested patch
> tomorrow.
>
> Thanks for tracking this down!  I've been out of town since you
> reported this, so I haven't had enough time to track it down myself.
>
> --Andy



-- 
Andy Lutomirski
AMA Capital Management, LLC
