All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dario Faggioli <dfaggioli@suse.com>
To: Mirela Simonovic <mirela.simonovic@aggios.com>
Cc: "Edgar E. Iglesias" <edgar.iglesias@xilinx.com>,
	Julien Grall <julien.grall@arm.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Davorin Mista <dm@aggios.com>,
	Xen Devel <xen-devel@lists.xen.org>
Subject: Re: [PATCH v3 10/10] xen/arm: Enable errata for secondary CPU on hotplug after the boot
Date: Fri, 11 May 2018 15:01:01 +0200	[thread overview]
Message-ID: <7544f7eaf9f2c079fe8d1294fbabe935c8e99f59.camel@suse.com> (raw)
In-Reply-To: <CAKPH-NiZtBwioY1hfzx2PSgzKB6eQxjw0hXUzLg94+u+OVpL+Q@mail.gmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 3759 bytes --]

On Fri, 2018-05-11 at 12:41 +0200, Mirela Simonovic wrote:
> Hi Dario,
> 
Hi,

> On Thu, May 10, 2018 at 6:24 PM, Dario Faggioli <dfaggioli@suse.com>
> wrote:
> > I may very well be missing or misunderstanding something, but I
> > continue to think that the problem here is that CPU_STARTING can't,
> > right now, fail, while you need it to be able to.
> > 
> > If that is the case, giving different priorities to the notifier,
> > is
> > not a solution.
> > 
> Let me try to clarify. The assumption is that the starting CPU can
> fail. Additional requirement set by Julien is that panic or BUG_ON is
> not acceptable. 
>
So, currently, in Xen, CPU bringup can't fail at the CPU_STARTING
phase. This is the point of the BUG_ON().

It is you that need it to change this, and make it possible for CPU
bringup to fail during CPU_STARTING. This is fine, but require changes,
which, IMO, are not limited to removing the BUG_ON() or trading it with
something else.

> There are 2 ways to deal with this scenario:
> 
> 1) Ignore and report the error, and let the erroneous CPU become
> online.
> This cannot be done without changing the logic in either scheduler or
> notify_cpu_starting(), or I least I don't see how would that be done.
> In previous email when I said "escalating this to who knows where" I
> did not refer to error escalation but the escalation of the scope of
> this series.
> 
How can that be an option? If CPU bringup failed, how is it possible to
let it go online?

To me, it's not that "we can't let a CPU for which bringup failed
continue without changing the scheduler or notify_cpu_starting()". It's
rather "we must not a CPU for which bringup faile continue. Period.".

Which is to say, you need to change things (in common code!) in such a
way that CPU bringup can fail during the CPU_STARTING phase.

> 2) Stop the erroneous CPU.
> Taking this approach requires setting the priority for the notifier.
>
No! Stop the CPU if the bringup fails duting the CPU_STARTING phase
does not require setting the priorities of the CPU_STARTING notifier. 

It requires to changing things (in common code) in such a way that CPU
bringup can fail during the CPU_STARTING phase. (Did I say that
already? :-) )

I understand that, if you set the priority, your series work. But that
does not mean that it is a proper solution to the problem. It means
that it is an hack that...well... makes your series work. :-)

> The key point is that notify_cpu_starting() and scheduler do not have
> to be changed. If errata notifier has higher priority than
> scheduler's
> notifier in the case of an error the CPU will not return into
> notify_cpu_starting() and it will never execute BUG_ON because it
> will
> be stopped. The rest of the system will continue to function without
> that CPU.
> 
Right. But now we have and architecture (ARM) for which CPU bringup can
fail at the CPU_STARTING phase. And yet, looking at common code, we see
that the CPU_STARTING notifier has a BUG_ON() if it ever fails. How
would people looking at this in 2 years time from now make sense of
this?

No, I don't think there really is a sensible workaround here. If you
need the CPU_STARTING phase of CPU bringup to be able to fail, you need
to make all the changes required to make that happen properly.

Which does involve changing common code, and which should therefore be
discussed with the other arches and core hypervisor maintainers.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

  parent reply	other threads:[~2018-05-11 13:01 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-27 17:12 [PATCH v3 00/10] xen/arm64: Suspend preconditions and CPU hotplug fixes Mirela Simonovic
2018-04-27 17:12 ` [PATCH v3 01/10] xen/arm64: Added handling of the trapped access to OSLSR register Mirela Simonovic
2018-04-27 17:12 ` [PATCH v3 02/10] xen/arm: Ignore write to GICD_ISACTIVERn registers (vgic-v2) Mirela Simonovic
2018-04-30 14:32   ` Julien Grall
2018-04-27 17:12 ` [PATCH v3 03/10] xen/arm: Implement CPU_OFF PSCI call (physical interface) Mirela Simonovic
2018-04-30 14:36   ` Julien Grall
2018-04-27 17:12 ` [PATCH v3 04/10] xen/arm: Remove __initdata and __init to enable CPU hotplug Mirela Simonovic
2018-04-27 17:12 ` [PATCH v3 05/10] xen/arm: Setup virtual paging for non-boot CPUs on hotplug/resume Mirela Simonovic
2018-04-30 14:47   ` Julien Grall
2018-05-07 14:55     ` Mirela Simonovic
2018-05-08 14:14       ` Julien Grall
2018-05-08 14:28         ` Mirela Simonovic
2018-05-09 10:32           ` Julien Grall
2018-04-27 17:12 ` [PATCH v3 06/10] xen/common: Restore IRQ affinity when hotplugging a pCPU Mirela Simonovic
2018-05-10  7:33   ` Dario Faggioli
2018-04-27 17:12 ` [PATCH v3 07/10] xen/arm: Release maintenance interrupt when CPU is hot-unplugged Mirela Simonovic
2018-04-30 14:51   ` Julien Grall
2018-04-27 17:12 ` [PATCH v3 08/10] xen/arm: Release timer interrupts " Mirela Simonovic
2018-04-30 15:58   ` Julien Grall
2018-05-07 15:33     ` Mirela Simonovic
2018-04-27 17:12 ` [PATCH v3 09/10] xen/arm: Free memory allocated for sibling/core maps on CPU hot-unplug Mirela Simonovic
2018-04-30 15:55   ` Julien Grall
2018-04-27 17:12 ` [PATCH v3 10/10] xen/arm: Enable errata for secondary CPU on hotplug after the boot Mirela Simonovic
2018-04-30 16:09   ` Julien Grall
2018-05-09 15:48     ` Mirela Simonovic
2018-05-09 16:32       ` Julien Grall
2018-05-10 11:57         ` Mirela Simonovic
2018-05-10 13:24           ` Mirela Simonovic
2018-05-10 13:29             ` Julien Grall
2018-05-10 14:12               ` Mirela Simonovic
2018-05-10 14:35                 ` Julien Grall
2018-05-10 14:25             ` Dario Faggioli
2018-05-10 15:00               ` Mirela Simonovic
2018-05-10 15:13                 ` Julien Grall
2018-05-10 15:49                   ` Mirela Simonovic
2018-05-10 16:02                     ` Julien Grall
2018-05-10 16:21                       ` Dario Faggioli
2018-05-10 16:24                     ` Dario Faggioli
2018-05-11 10:41                       ` Mirela Simonovic
2018-05-11 10:54                         ` Julien Grall
2018-05-11 12:07                           ` Mirela Simonovic
2018-05-11 12:20                             ` Mirela Simonovic
2018-05-11 13:08                               ` Julien Grall
2018-05-11 13:31                                 ` Dario Faggioli
     [not found]                                   ` <CAKPH-Nj2znmdcjZEfFf83WmrcBS_u4R33yPOxAPWw37RHVZ38g@mail.gmail.com>
2018-05-11 14:14                                     ` Dario Faggioli
2018-05-11 21:47                                   ` Stefano Stabellini
2018-05-14  9:44                                     ` Julien Grall
2018-05-14 16:59                                       ` Stefano Stabellini
2018-05-15 11:45                                         ` Mirela Simonovic
2018-05-11 13:12                           ` Dario Faggioli
2018-05-11 13:01                         ` Dario Faggioli [this message]
2018-05-10 16:12                   ` Dario Faggioli
2018-05-10 13:33           ` Dario Faggioli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7544f7eaf9f2c079fe8d1294fbabe935c8e99f59.camel@suse.com \
    --to=dfaggioli@suse.com \
    --cc=dm@aggios.com \
    --cc=edgar.iglesias@xilinx.com \
    --cc=julien.grall@arm.com \
    --cc=mirela.simonovic@aggios.com \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.