xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: "Jan Beulich" <JBeulich@suse.com>
To: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH] x86: show remote CPU state upon fatal NMI
Date: Wed, 15 Jun 2016 06:59:44 -0600	[thread overview]
Message-ID: <57616D6002000078000F5537@prv-mh.provo.novell.com> (raw)
In-Reply-To: <60164dc5-a615-beaa-7f41-f7d6375486b9@citrix.com>

>>> On 15.06.16 at 13:03, <andrew.cooper3@citrix.com> wrote:
> On 15/06/16 08:55, Jan Beulich wrote:
>>>> @@ -570,6 +600,15 @@ void fatal_trap(const struct cpu_user_re
>>>>              printk("Faulting linear address: %p\n", _p(cr2));
>>>>              show_page_walk(cr2);
>>>>          }
>>>> +        else if ( trapnr == TRAP_nmi )
>>>> +        {
>>>> +            cpumask_andnot(&nmi_show_state_mask, &cpu_online_map,
>>>> +                           cpumask_of(smp_processor_id()));
>>>> +            set_nmi_callback(nmi_show_execution_state);
>>>> +            smp_send_nmi_allbutself();
>>> This would cause far less spinlock contention as
>>>
>>> for_each_cpu( cpu, nmi_show_state_mask )
>>>     smp_send_nmi(cpu);
>>>
>>> I realise this involves introducing a new smp function, but it would
>>> substantially reduce contention on the console lock.
>> Again, I don't see why lock contention would matter here. And then
>> I also don't see how sending the IPIs individually would make matters
>> significantly better: The sending will surely finish much faster than
>> the printing.
> 
> Contention is a problem because you have replaced the NMI callback, and
> the watchdog is still running.  Especially if sync_console is in effect,
> you are liable to incur a further timeout, queueing up more NMIs.
> 
> Although now I think of it, that won't matter so long as the NMIs don't
> nest.
> 
> The one advantage of sending the NMIs in order is that the information
> dump will happen in order, which is slightly more use than having them
> in a random order on a large machine.

How that? All the NMIs will still arrive at about the same time, so
while some low numbered CPUs may indeed get their state printed
in order, higher numbered ones may still make it into the lock region
in any order. (And no, building upon ticket locks making randomness
much less likely is neither an option, nor would it really help: Just
think of a lower numbered CPU first having to come out of a deep
C-state or running at a much lower P-state than a higher numbered
one.)

>>> I would recommend moving this clause into nmi_watchdog_tick(), so that
>>> it doesn't get involved for non-watchdog NMIs.  IOCK/SERR NMIs won't
>>> have anything interesting to print from here.  I would also recommend
>>> disabling the watchdog before IPI'ing.
>> And indeed I would have wanted it there, but I can't see how it can
>> reasonably be put there: fatal_trap() doesn't return, so we can't put
>> it after. And we definitely want to get state of the local CPU out
>> before we try to log state of any of the remote CPUs. So the only
>> options I see would be to
>> - somehow specially flag the regs structure, but that feels hackish
>>   (among other aspects nmi_watchdog_tick() has that parameter
>>   const qualified for the very reason that it isn't supposed to fiddle
>>   with it),
>> - introduce a global flag.
> 
> How about a boolean flag to fatal_trap()?  It doesn't have many callers,
> and this kind of printing might also be useful for some MCEs.

Ah, right, there indeed aren't that many. Can you qualify "some"
a bit better, so that maybe I can have the patch pass true there
right away?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  reply	other threads:[~2016-06-15 12:59 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-14 14:33 [PATCH] x86: show remote CPU state upon fatal NMI Jan Beulich
2016-06-14 15:03 ` Andrew Cooper
2016-06-15  7:55   ` Jan Beulich
2016-06-15 11:03     ` Andrew Cooper
2016-06-15 12:59       ` Jan Beulich [this message]
2016-06-15 13:15         ` Andrew Cooper

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57616D6002000078000F5537@prv-mh.provo.novell.com \
    --to=jbeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).