All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] xen/arm: CPU hotplug fixes
@ 2018-04-11 13:19 Mirela Simonovic
  2018-04-11 13:19 ` [PATCH 1/7] xen/arm: Added handling of the trapped access to OSLSR register Mirela Simonovic
                   ` (8 more replies)
  0 siblings, 9 replies; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-11 13:19 UTC (permalink / raw)
  To: xen-devel; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, dm, julien.grall

This patch set contains fixes required to enable CPU hotplug for secondary CPUs.
CPU hotplug of secondary CPUs will be used for suspend to RAM support for ARM.

With these patches calling disable_nonboot_cpus() from the boot CPU will cause
all secondary CPUs to be stopped. When a CPU is stopped it will issue the PSCI
CPU_OFF call to the EL3. The calling CPU will be powered down if the underlying
firmware allows so. If the CPU_OFF is not supported by the underlying firmware
the calling CPU will stay in infinite WFI loop. When secondary CPUs are stopped
all interrupts targeted to them are migrated to the boot CPU.
Calling enable_nonboot_cpus() from the boot CPU will cause all secondary CPUs
to be hotplugged. When a CPU is hotplugged the interrupts' affinity is restored.

Calls to enable/disable_nonboot_cpus() functions currently don't exist in
Xen ARM code. This will be added with the suspend to RAM support for ARM.

The code is tested on Xilinx Zynq UltraScale+ MPSoC/ZCU102 board (includes
physical power down/up of secondary CPUs).

Mirela Simonovic (7):
  xen/arm: Added handling of the trapped access to OSLSR register
  xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers
  xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
  xen/arm: When CPU dies, free percpu area immediatelly
  xen/arm: Remove __initdata and __init to enable CPU hotplug
  xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
  xen/arm: Restore IRQ affinity after hotplugging a CPU

 xen/arch/arm/arm64/smpboot.c   |  2 +-
 xen/arch/arm/arm64/vsysreg.c   |  3 ++-
 xen/arch/arm/irq.c             |  2 +-
 xen/arch/arm/p2m.c             | 10 ++++++++--
 xen/arch/arm/percpu.c          |  2 +-
 xen/arch/arm/processor.c       |  2 +-
 xen/arch/arm/psci.c            |  5 +++++
 xen/arch/arm/smpboot.c         | 14 ++++++++++++--
 xen/arch/arm/vgic-v2.c         |  3 +--
 xen/common/schedule.c          |  4 ++++
 xen/include/asm-arm/p2m.h      |  3 +++
 xen/include/asm-arm/procinfo.h |  4 ++--
 xen/include/asm-arm/psci.h     |  1 +
 13 files changed, 42 insertions(+), 13 deletions(-)

-- 
2.13.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH 1/7] xen/arm: Added handling of the trapped access to OSLSR register
  2018-04-11 13:19 [PATCH 0/7] xen/arm: CPU hotplug fixes Mirela Simonovic
@ 2018-04-11 13:19 ` Mirela Simonovic
  2018-04-11 14:39   ` Julien Grall
  2018-04-11 13:19 ` [PATCH 2/7] xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers Mirela Simonovic
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-11 13:19 UTC (permalink / raw)
  To: xen-devel; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, dm, julien.grall

Linux/dom0 accesses OSLSR register when saving CPU context during the
suspend procedure. Xen traps access to this register, but has no handling
for it. Consequently, Xen injects undef exception to linux, causing it to
crash. This patch adds handling of the trapped access to OSLSR as ro/raz.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
---
 xen/arch/arm/arm64/vsysreg.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
index c57ac12503..8f80e1735e 100644
--- a/xen/arch/arm/arm64/vsysreg.c
+++ b/xen/arch/arm/arm64/vsysreg.c
@@ -57,13 +57,14 @@ void do_sysreg(struct cpu_user_regs *regs,
      * ARMv8 (DDI 0487A.d): D1-1509 Table D1-58
      *
      * Unhandled:
-     *    OSLSR_EL1
      *    DBGPRCR_EL1
      */
     case HSR_SYSREG_OSLAR_EL1:
         return handle_wo_wi(regs, regidx, hsr.sysreg.read, hsr, 1);
     case HSR_SYSREG_OSDLR_EL1:
         return handle_raz_wi(regs, regidx, hsr.sysreg.read, hsr, 1);
+    case HSR_SYSREG_OSLSR_EL1:
+        return handle_ro_raz(regs, regidx, hsr.sysreg.read, hsr, 1);
 
     /*
      * MDCR_EL2.TDA
-- 
2.13.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 2/7] xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers
  2018-04-11 13:19 [PATCH 0/7] xen/arm: CPU hotplug fixes Mirela Simonovic
  2018-04-11 13:19 ` [PATCH 1/7] xen/arm: Added handling of the trapped access to OSLSR register Mirela Simonovic
@ 2018-04-11 13:19 ` Mirela Simonovic
  2018-04-11 14:43   ` Julien Grall
  2018-04-11 13:19 ` [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface) Mirela Simonovic
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-11 13:19 UTC (permalink / raw)
  To: xen-devel; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, dm, julien.grall

Guests attempt to write into these registers on resume (for example Linux).
Without this patch a data abort exception will be raised to the guest.
This patch handles the write access by ignoring it. This should be fine for
now because reading these registers is already handled as 'read as zero'.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
---
 xen/arch/arm/vgic-v2.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index 646d1f3d12..b088376ed0 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -484,11 +484,10 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
         return 0;
 
     case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN):
-        if ( dabt.size != DABT_WORD ) goto bad_width;
         printk(XENLOG_G_ERR
                "%pv: vGICD: unhandled word write %#"PRIregister" to ISACTIVER%d\n",
                v, r, gicd_reg - GICD_ISACTIVER);
-        return 0;
+        goto write_ignore_32;
 
     case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN):
         printk(XENLOG_G_ERR
-- 
2.13.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
  2018-04-11 13:19 [PATCH 0/7] xen/arm: CPU hotplug fixes Mirela Simonovic
  2018-04-11 13:19 ` [PATCH 1/7] xen/arm: Added handling of the trapped access to OSLSR register Mirela Simonovic
  2018-04-11 13:19 ` [PATCH 2/7] xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers Mirela Simonovic
@ 2018-04-11 13:19 ` Mirela Simonovic
  2018-04-11 14:46   ` Julien Grall
  2018-04-11 13:19 ` [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly Mirela Simonovic
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-11 13:19 UTC (permalink / raw)
  To: xen-devel; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, dm, julien.grall

This patch adds the PSCI CPU_OFF call to the EL3 in order to
trigger powering down of the calling CPU when the CPU is stopped.
If CPU_OFF call fails for some reason, e.g. EL3 does not implement
the PSCI CPU_OFF function, the calling CPU will loop in the infinite
while/wfi, as it was looping before this change.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
---
 xen/arch/arm/psci.c        | 5 +++++
 xen/arch/arm/smpboot.c     | 7 +++++++
 xen/include/asm-arm/psci.h | 1 +
 3 files changed, 13 insertions(+)

diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
index 94b616df9b..e9e756e56b 100644
--- a/xen/arch/arm/psci.c
+++ b/xen/arch/arm/psci.c
@@ -46,6 +46,11 @@ int call_psci_cpu_on(int cpu)
     return call_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary), 0);
 }
 
+int call_psci_cpu_off(void)
+{
+    return call_smc(PSCI_0_2_FN32_CPU_OFF, 0, 0, 0);
+}
+
 void call_psci_system_off(void)
 {
     if ( psci_ver > PSCI_VERSION(0, 1) )
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index b2116f0d2d..5666efcd3a 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -390,11 +390,18 @@ void __cpu_disable(void)
 
 void stop_cpu(void)
 {
+    int errno;
     local_irq_disable();
     cpu_is_dead = true;
     /* Make sure the write happens before we sleep forever */
     dsb(sy);
     isb();
+    /* PSCI cpu off call will return only in case of an error */
+    errno = call_psci_cpu_off();
+    printk(XENLOG_DEBUG "PSCI cpu off call failed for CPU#%d err=%d\n",
+           get_processor_id(), errno);
+    isb();
+    /* If CPU_OFF PSCI call failed stay in the WFI loop */
     while ( 1 )
         wfi();
 }
diff --git a/xen/include/asm-arm/psci.h b/xen/include/asm-arm/psci.h
index 9ac820e94a..50d668a296 100644
--- a/xen/include/asm-arm/psci.h
+++ b/xen/include/asm-arm/psci.h
@@ -20,6 +20,7 @@ extern uint32_t psci_ver;
 
 int psci_init(void);
 int call_psci_cpu_on(int cpu);
+int call_psci_cpu_off(void);
 void call_psci_system_off(void);
 void call_psci_system_reset(void);
 
-- 
2.13.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-11 13:19 [PATCH 0/7] xen/arm: CPU hotplug fixes Mirela Simonovic
                   ` (2 preceding siblings ...)
  2018-04-11 13:19 ` [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface) Mirela Simonovic
@ 2018-04-11 13:19 ` Mirela Simonovic
  2018-04-11 14:53   ` Julien Grall
  2018-04-11 13:19 ` [PATCH 5/7] xen/arm: Remove __initdata and __init to enable CPU hotplug Mirela Simonovic
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-11 13:19 UTC (permalink / raw)
  To: xen-devel; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, dm, julien.grall

Freeing percpu area is done when a non-boot CPU is disabled upon suspend.
This use to be scheduled for execution after a period of time, what caused
the following racing issues. If CPU is enabled after it is disabled and
before the freeing of percpu area is performed, Xen would crash upon
initializing percpu area because per cpu offset is not marked as
INVALID_PERCPU_AREA (this suppose to happen when cpu area is freed).
To resolve the racing issue, free percpu area right away instead
scheduling it for later.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
---
 xen/arch/arm/percpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/percpu.c b/xen/arch/arm/percpu.c
index 25442c48fe..e4e8405f43 100644
--- a/xen/arch/arm/percpu.c
+++ b/xen/arch/arm/percpu.c
@@ -46,7 +46,7 @@ static void free_percpu_area(unsigned int cpu)
 {
     struct free_info *info = &per_cpu(free_info, cpu);
     info->cpu = cpu;
-    call_rcu(&info->rcu, _free_percpu_area);
+    _free_percpu_area(&info->rcu);
 }
 
 static int cpu_percpu_callback(
-- 
2.13.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 5/7] xen/arm: Remove __initdata and __init to enable CPU hotplug
  2018-04-11 13:19 [PATCH 0/7] xen/arm: CPU hotplug fixes Mirela Simonovic
                   ` (3 preceding siblings ...)
  2018-04-11 13:19 ` [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly Mirela Simonovic
@ 2018-04-11 13:19 ` Mirela Simonovic
  2018-04-12  0:07   ` Stefano Stabellini
  2018-04-11 13:19 ` [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario Mirela Simonovic
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-11 13:19 UTC (permalink / raw)
  To: xen-devel; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, dm, julien.grall

CPU up flow is currently used during the initial boot to start secondary
CPUs. However, the same flow should be used for CPU hotplug, e.g. when
hotplugging secondary CPUs within the resume procedure (resume from the
suspend to RAM). Therefore, prefixes __initdata and __init had to be removed
from few data structures and functions that are used within the cpu up flow.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
---
 xen/arch/arm/arm64/smpboot.c   | 2 +-
 xen/arch/arm/irq.c             | 2 +-
 xen/arch/arm/processor.c       | 2 +-
 xen/arch/arm/smpboot.c         | 4 ++--
 xen/include/asm-arm/procinfo.h | 4 ++--
 5 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/xen/arch/arm/arm64/smpboot.c b/xen/arch/arm/arm64/smpboot.c
index 4fd0ac68b7..694fbf67e6 100644
--- a/xen/arch/arm/arm64/smpboot.c
+++ b/xen/arch/arm/arm64/smpboot.c
@@ -104,7 +104,7 @@ int __init arch_cpu_init(int cpu, struct dt_device_node *dn)
         return smp_psci_init(cpu);
 }
 
-int __init arch_cpu_up(int cpu)
+int arch_cpu_up(int cpu)
 {
     if ( !smp_enable_ops[cpu].prepare_cpu )
         return -ENODEV;
diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
index aa4e832cae..098281f8ab 100644
--- a/xen/arch/arm/irq.c
+++ b/xen/arch/arm/irq.c
@@ -65,7 +65,7 @@ irq_desc_t *__irq_to_desc(int irq)
     return &irq_desc[irq-NR_LOCAL_IRQS];
 }
 
-int __init arch_init_one_irq_desc(struct irq_desc *desc)
+int arch_init_one_irq_desc(struct irq_desc *desc)
 {
     desc->arch.type = IRQ_TYPE_INVALID;
     return 0;
diff --git a/xen/arch/arm/processor.c b/xen/arch/arm/processor.c
index ce4385064a..acad8b31d6 100644
--- a/xen/arch/arm/processor.c
+++ b/xen/arch/arm/processor.c
@@ -20,7 +20,7 @@
 
 static DEFINE_PER_CPU(struct processor *, processor);
 
-void __init processor_setup(void)
+void processor_setup(void)
 {
     const struct proc_info_list *procinfo;
 
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 5666efcd3a..d15ea8df5e 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -52,8 +52,8 @@ nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 static unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
        __attribute__((__aligned__(STACK_SIZE)));
 
-/* Initial boot cpu data */
-struct init_info __initdata init_data =
+/* Boot cpu data */
+struct init_info init_data =
 {
     .stack = cpu0_boot_stack,
 };
diff --git a/xen/include/asm-arm/procinfo.h b/xen/include/asm-arm/procinfo.h
index 26306b35f8..02be56e348 100644
--- a/xen/include/asm-arm/procinfo.h
+++ b/xen/include/asm-arm/procinfo.h
@@ -35,9 +35,9 @@ struct proc_info_list {
     struct processor    *processor;
 };
 
-const __init struct proc_info_list *lookup_processor_type(void);
+const struct proc_info_list *lookup_processor_type(void);
 
-void __init processor_setup(void);
+void processor_setup(void);
 void processor_vcpu_initialise(struct vcpu *v);
 
 #endif
-- 
2.13.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
  2018-04-11 13:19 [PATCH 0/7] xen/arm: CPU hotplug fixes Mirela Simonovic
                   ` (4 preceding siblings ...)
  2018-04-11 13:19 ` [PATCH 5/7] xen/arm: Remove __initdata and __init to enable CPU hotplug Mirela Simonovic
@ 2018-04-11 13:19 ` Mirela Simonovic
  2018-04-11 15:11   ` Julien Grall
  2018-04-11 13:19 ` [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU Mirela Simonovic
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-11 13:19 UTC (permalink / raw)
  To: xen-devel; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, dm, julien.grall

In existing code the paging for secondary CPUs is setup only in boot flow.
The setup is triggered from start_xen function after all CPUs are brought
online. In other words, the initialization of VTCR_EL2 register is done
out of the cpu_up/start_secondary control flow. However, the cpu_up flow
should be self-contained - it should fully initialize a secondary CPU,
because the cpu_up is used not only to bring a secondary CPU online on
boot, but also to hotplug a CPU during the system resume.
With this patch the setting of paging is triggered from start_secondary
function if the current system state is not boot. This way, the paging
will be setup in non-boot scenarios, while the setup in boot scenario
remains unchanged.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
---
 xen/arch/arm/p2m.c        | 10 ++++++++--
 xen/arch/arm/smpboot.c    |  3 +++
 xen/include/asm-arm/p2m.h |  3 +++
 3 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index d43c3aa896..d041b4d24b 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1451,9 +1451,15 @@ err:
     return page;
 }
 
-static void __init setup_virt_paging_one(void *data)
+void setup_virt_paging_one(void *data)
 {
-    unsigned long val = (unsigned long)data;
+    static unsigned long vtcr_data;
+    unsigned long val;
+
+    if ( data )
+        vtcr_data = (unsigned long)data;
+    val = vtcr_data;
+
     WRITE_SYSREG32(val, VTCR_EL2);
     isb();
 }
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index d15ea8df5e..dae4427320 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -360,6 +360,9 @@ void start_secondary(unsigned long boot_phys_offset,
 
     check_local_cpu_errata();
 
+    if ( system_state != SYS_STATE_boot )
+        setup_virt_paging_one(NULL);
+
     printk(XENLOG_DEBUG "CPU %u booted.\n", smp_processor_id());
 
     startup_cpu_idle_loop();
diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
index 8823707c17..6cc36ebbc3 100644
--- a/xen/include/asm-arm/p2m.h
+++ b/xen/include/asm-arm/p2m.h
@@ -153,6 +153,9 @@ void p2m_altp2m_check(struct vcpu *v, uint16_t idx)
 /* Second stage paging setup, to be called on all CPUs */
 void setup_virt_paging(void);
 
+/* Second stage paging setup, to be called by one CPU out of the boot flow */
+void setup_virt_paging_one(void *data);
+
 /* Init the datastructures for later use by the p2m code */
 int p2m_init(struct domain *d);
 
-- 
2.13.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU
  2018-04-11 13:19 [PATCH 0/7] xen/arm: CPU hotplug fixes Mirela Simonovic
                   ` (5 preceding siblings ...)
  2018-04-11 13:19 ` [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario Mirela Simonovic
@ 2018-04-11 13:19 ` Mirela Simonovic
  2018-04-11 14:55   ` Julien Grall
                     ` (2 more replies)
  2018-04-11 15:07 ` [PATCH 0/7] xen/arm: CPU hotplug fixes Julien Grall
  2018-04-11 15:13 ` Julien Grall
  8 siblings, 3 replies; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-11 13:19 UTC (permalink / raw)
  To: xen-devel; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, dm, julien.grall

Secondary pCPUs will be offlined on system suspend and hotplugged
on resume. When offlining secondary CPUs all interrupts targeted
to those CPUs will be routed to the boot CPU. The boot CPU
is responsible for finalizing suspend procedure. All wake-up
interrupts are therefore targeted to the boot CPU. Existing code
was missing the restoration of interrupts affinity after
hotplugging a CPU. This patch restores the IRQ affinity after
a CPU is hotplugged.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
---
 xen/common/schedule.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/common/schedule.c b/xen/common/schedule.c
index 343ab6306e..e3956019bc 100644
--- a/xen/common/schedule.c
+++ b/xen/common/schedule.c
@@ -692,6 +692,7 @@ void restore_vcpu_affinity(struct domain *d)
     for_each_vcpu ( d, v )
     {
         spinlock_t *lock;
+        bool affinity_was_broken = v->affinity_broken;
 
         ASSERT(!vcpu_runnable(v));
 
@@ -724,6 +725,9 @@ void restore_vcpu_affinity(struct domain *d)
         lock = vcpu_schedule_lock_irq(v);
         v->processor = SCHED_OP(vcpu_scheduler(v), pick_cpu, v);
         spin_unlock_irq(lock);
+
+        if ( affinity_was_broken )
+            sched_move_irqs(v);
     }
 
     domain_update_node_affinity(d);
-- 
2.13.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH 1/7] xen/arm: Added handling of the trapped access to OSLSR register
  2018-04-11 13:19 ` [PATCH 1/7] xen/arm: Added handling of the trapped access to OSLSR register Mirela Simonovic
@ 2018-04-11 14:39   ` Julien Grall
  2018-04-11 23:28     ` Stefano Stabellini
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-11 14:39 UTC (permalink / raw)
  To: Mirela Simonovic, Xen-devel; +Cc: edgar.iglesias, sstabellini, dm

Hi,

You seem to have used a wrong address for me.

Title: This patch is only adding the arm64 side. Please make it clear in it.

Cheers,

On 04/11/2018 02:19 PM, Mirela Simonovic wrote:
> Linux/dom0 accesses OSLSR register when saving CPU context during the
> suspend procedure. Xen traps access to this register, but has no handling
> for it. Consequently, Xen injects undef exception to linux, causing it to
> crash. This patch adds handling of the trapped access to OSLSR as ro/raz.
>
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> ---
>   xen/arch/arm/arm64/vsysreg.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
> index c57ac12503..8f80e1735e 100644
> --- a/xen/arch/arm/arm64/vsysreg.c
> +++ b/xen/arch/arm/arm64/vsysreg.c
> @@ -57,13 +57,14 @@ void do_sysreg(struct cpu_user_regs *regs,
>        * ARMv8 (DDI 0487A.d): D1-1509 Table D1-58
>        *
>        * Unhandled:
> -     *    OSLSR_EL1
>        *    DBGPRCR_EL1
>        */
>       case HSR_SYSREG_OSLAR_EL1:
>           return handle_wo_wi(regs, regidx, hsr.sysreg.read, hsr, 1);
>       case HSR_SYSREG_OSDLR_EL1:
>           return handle_raz_wi(regs, regidx, hsr.sysreg.read, hsr, 1);
> +    case HSR_SYSREG_OSLSR_EL1:
> +        return handle_ro_raz(regs, regidx, hsr.sysreg.read, hsr, 1);
>
>       /*
>        * MDCR_EL2.TDA
>

--
Julien Grall
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 2/7] xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers
  2018-04-11 13:19 ` [PATCH 2/7] xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers Mirela Simonovic
@ 2018-04-11 14:43   ` Julien Grall
  2018-04-11 23:31     ` Stefano Stabellini
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-11 14:43 UTC (permalink / raw)
  To: Mirela Simonovic, xen-devel; +Cc: edgar.iglesias, sstabellini, dm

Hi,

On 11/04/18 14:19, Mirela Simonovic wrote:
> Guests attempt to write into these registers on resume (for example Linux).
> Without this patch a data abort exception will be raised to the guest.
> This patch handles the write access by ignoring it. This should be fine for
> now because reading these registers is already handled as 'read as zero'.

I think this patch is wrong. It is not mandatory for the guest to write 
exactly the same value as read. Assuming the guest will always write 0, 
then what you want to do is checking the write is actually 0. In that 
case you can ignore it. For all the other case, you should still fail.

Cheers,

> 
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> ---
>   xen/arch/arm/vgic-v2.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
> index 646d1f3d12..b088376ed0 100644
> --- a/xen/arch/arm/vgic-v2.c
> +++ b/xen/arch/arm/vgic-v2.c
> @@ -484,11 +484,10 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
>           return 0;
>   
>       case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN):
> -        if ( dabt.size != DABT_WORD ) goto bad_width;
>           printk(XENLOG_G_ERR
>                  "%pv: vGICD: unhandled word write %#"PRIregister" to ISACTIVER%d\n",
>                  v, r, gicd_reg - GICD_ISACTIVER);
> -        return 0;
> +        goto write_ignore_32;
>   
>       case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN):
>           printk(XENLOG_G_ERR
> 

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
  2018-04-11 13:19 ` [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface) Mirela Simonovic
@ 2018-04-11 14:46   ` Julien Grall
  2018-04-12 11:33     ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-11 14:46 UTC (permalink / raw)
  To: Mirela Simonovic, xen-devel; +Cc: edgar.iglesias, sstabellini, dm, julien.grall

Hi,

On 11/04/18 14:19, Mirela Simonovic wrote:
> This patch adds the PSCI CPU_OFF call to the EL3 in order to
> trigger powering down of the calling CPU when the CPU is stopped.
> If CPU_OFF call fails for some reason, e.g. EL3 does not implement
> the PSCI CPU_OFF function, the calling CPU will loop in the infinite > while/wfi, as it was looping before this change.

I am afraid that the example you give is wrong. That call should exist 
for all implementation of PSCI 0.2 and above. For 0.1, this should never 
be call as the ID is not reserved.

> 
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> ---
>   xen/arch/arm/psci.c        | 5 +++++
>   xen/arch/arm/smpboot.c     | 7 +++++++
>   xen/include/asm-arm/psci.h | 1 +
>   3 files changed, 13 insertions(+)
> 
> diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
> index 94b616df9b..e9e756e56b 100644
> --- a/xen/arch/arm/psci.c
> +++ b/xen/arch/arm/psci.c
> @@ -46,6 +46,11 @@ int call_psci_cpu_on(int cpu)
>       return call_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary), 0);
>   }
>   
> +int call_psci_cpu_off(void)
> +{

You have to check the PSCI version here before calling the function.

> +    return call_smc(PSCI_0_2_FN32_CPU_OFF, 0, 0, 0);
> +}
> +
>   void call_psci_system_off(void)
>   {
>       if ( psci_ver > PSCI_VERSION(0, 1) )
> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> index b2116f0d2d..5666efcd3a 100644
> --- a/xen/arch/arm/smpboot.c
> +++ b/xen/arch/arm/smpboot.c
> @@ -390,11 +390,18 @@ void __cpu_disable(void)
>   
>   void stop_cpu(void)
>   {
> +    int errno;

newline.

>       local_irq_disable();
>       cpu_is_dead = true;
>       /* Make sure the write happens before we sleep forever */
>       dsb(sy);
>       isb();
> +    /* PSCI cpu off call will return only in case of an error */
> +    errno = call_psci_cpu_off();
> +    printk(XENLOG_DEBUG "PSCI cpu off call failed for CPU#%d err=%d\n",
> +           get_processor_id(), errno);
> +    isb();

What are you trying to achieve with the isb() here?

Cheers,

> +    /* If CPU_OFF PSCI call failed stay in the WFI loop */
>       while ( 1 )
>           wfi();
>   }
> diff --git a/xen/include/asm-arm/psci.h b/xen/include/asm-arm/psci.h
> index 9ac820e94a..50d668a296 100644
> --- a/xen/include/asm-arm/psci.h
> +++ b/xen/include/asm-arm/psci.h
> @@ -20,6 +20,7 @@ extern uint32_t psci_ver;
>   
>   int psci_init(void);
>   int call_psci_cpu_on(int cpu);
> +int call_psci_cpu_off(void);
>   void call_psci_system_off(void);
>   void call_psci_system_reset(void);
>   
> 

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-11 13:19 ` [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly Mirela Simonovic
@ 2018-04-11 14:53   ` Julien Grall
  2018-04-11 23:46     ` Stefano Stabellini
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-11 14:53 UTC (permalink / raw)
  To: Mirela Simonovic, xen-devel; +Cc: edgar.iglesias, sstabellini, dm

Hi,

On 11/04/18 14:19, Mirela Simonovic wrote:
> Freeing percpu area is done when a non-boot CPU is disabled upon suspend.
> This use to be scheduled for execution after a period of time, what caused
> the following racing issues. If CPU is enabled after it is disabled and
> before the freeing of percpu area is performed, Xen would crash upon
> initializing percpu area because per cpu offset is not marked as
> INVALID_PERCPU_AREA (this suppose to happen when cpu area is freed).
> To resolve the racing issue, free percpu area right away instead
> scheduling it for later.

The reason of using the RCU is you want to make sure that none of the 
other CPUs will access that percpu data before freeing it. So I don't 
think this patch is valid.

It looks like to me a rcu barrier is missing after calling cpu_down 
somewhere in the CPU off path. I am not entirely sure where.

Cheers,

> 
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> ---
>   xen/arch/arm/percpu.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/percpu.c b/xen/arch/arm/percpu.c
> index 25442c48fe..e4e8405f43 100644
> --- a/xen/arch/arm/percpu.c
> +++ b/xen/arch/arm/percpu.c
> @@ -46,7 +46,7 @@ static void free_percpu_area(unsigned int cpu)
>   {
>       struct free_info *info = &per_cpu(free_info, cpu);
>       info->cpu = cpu;
> -    call_rcu(&info->rcu, _free_percpu_area);
> +    _free_percpu_area(&info->rcu);
>   }
>   
>   static int cpu_percpu_callback(
> 

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU
  2018-04-11 13:19 ` [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU Mirela Simonovic
@ 2018-04-11 14:55   ` Julien Grall
  2018-04-12  0:20   ` Stefano Stabellini
  2018-04-12 16:49   ` Dario Faggioli
  2 siblings, 0 replies; 61+ messages in thread
From: Julien Grall @ 2018-04-11 14:55 UTC (permalink / raw)
  To: Mirela Simonovic, xen-devel
  Cc: edgar.iglesias, George Dunlap, sstabellini, dm,
	dfaggioli@suse.com >> Dario Faggioli

Hi,

Please CC relevant maintainers for the code you modify. You can use 
scripts/get_maintainers.pl for that. I have CCed them for you this time.

Cheers,

On 11/04/18 14:19, Mirela Simonovic wrote:
> Secondary pCPUs will be offlined on system suspend and hotplugged
> on resume. When offlining secondary CPUs all interrupts targeted
> to those CPUs will be routed to the boot CPU. The boot CPU
> is responsible for finalizing suspend procedure. All wake-up
> interrupts are therefore targeted to the boot CPU. Existing code
> was missing the restoration of interrupts affinity after
> hotplugging a CPU. This patch restores the IRQ affinity after
> a CPU is hotplugged.
> 
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> ---
>   xen/common/schedule.c | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 343ab6306e..e3956019bc 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -692,6 +692,7 @@ void restore_vcpu_affinity(struct domain *d)
>       for_each_vcpu ( d, v )
>       {
>           spinlock_t *lock;
> +        bool affinity_was_broken = v->affinity_broken;
>   
>           ASSERT(!vcpu_runnable(v));
>   
> @@ -724,6 +725,9 @@ void restore_vcpu_affinity(struct domain *d)
>           lock = vcpu_schedule_lock_irq(v);
>           v->processor = SCHED_OP(vcpu_scheduler(v), pick_cpu, v);
>           spin_unlock_irq(lock);
> +
> +        if ( affinity_was_broken )
> +            sched_move_irqs(v);
>       }
>   
>       domain_update_node_affinity(d);
> 

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 0/7] xen/arm: CPU hotplug fixes
  2018-04-11 13:19 [PATCH 0/7] xen/arm: CPU hotplug fixes Mirela Simonovic
                   ` (6 preceding siblings ...)
  2018-04-11 13:19 ` [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU Mirela Simonovic
@ 2018-04-11 15:07 ` Julien Grall
  2018-04-11 15:58   ` Mirela Simonovic
  2018-04-11 15:13 ` Julien Grall
  8 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-11 15:07 UTC (permalink / raw)
  To: Mirela Simonovic, xen-devel; +Cc: edgar.iglesias, sstabellini, dm

Hi Mirela,

Thank you for sending the series.

On 11/04/18 14:19, Mirela Simonovic wrote:
> This patch set contains fixes required to enable CPU hotplug for secondary CPUs.
> CPU hotplug of secondary CPUs will be used for suspend to RAM support for ARM.
> 
> With these patches calling disable_nonboot_cpus() from the boot CPU will cause
> all secondary CPUs to be stopped. When a CPU is stopped it will issue the PSCI
> CPU_OFF call to the EL3. The calling CPU will be powered down if the underlying
> firmware allows so. If the CPU_OFF is not supported by the underlying firmware
> the calling CPU will stay in infinite WFI loop. When secondary CPUs are stopped
> all interrupts targeted to them are migrated to the boot CPU.
> Calling enable_nonboot_cpus() from the boot CPU will cause all secondary CPUs
> to be hotplugged. When a CPU is hotplugged the interrupts' affinity is restored.

I think this patch series is not complete. While it may work for you 
there are a few corner cases that will result to leak some memory 
everytime you turn off/on the CPU. This is the case of page-table, 
action associated to local IRQ.... I don't have a full list, but 
basically you need to ensured that anything that was initialized during 
the CPU boot are freed/reverted. You probably want to look at my answer 
you on your design document [1] for more details.

I also don't see any code migrating Xen interrupt from the CPU that is 
going to be turned off to the other. Is it something you are going to 
plan for the future?

> 
> Calls to enable/disable_nonboot_cpus() functions currently don't exist in
> Xen ARM code. This will be added with the suspend to RAM support for ARM.

What would be the way to test that series?

> 
> The code is tested on Xilinx Zynq UltraScale+ MPSoC/ZCU102 board (includes
> physical power down/up of secondary CPUs).
> 
> Mirela Simonovic (7):
>    xen/arm: Added handling of the trapped access to OSLSR register
>    xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers
>    xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
>    xen/arm: When CPU dies, free percpu area immediatelly
>    xen/arm: Remove __initdata and __init to enable CPU hotplug
>    xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
>    xen/arm: Restore IRQ affinity after hotplugging a CPU
> 
>   xen/arch/arm/arm64/smpboot.c   |  2 +-
>   xen/arch/arm/arm64/vsysreg.c   |  3 ++-
>   xen/arch/arm/irq.c             |  2 +-
>   xen/arch/arm/p2m.c             | 10 ++++++++--
>   xen/arch/arm/percpu.c          |  2 +-
>   xen/arch/arm/processor.c       |  2 +-
>   xen/arch/arm/psci.c            |  5 +++++
>   xen/arch/arm/smpboot.c         | 14 ++++++++++++--
>   xen/arch/arm/vgic-v2.c         |  3 +--
>   xen/common/schedule.c          |  4 ++++
>   xen/include/asm-arm/p2m.h      |  3 +++
>   xen/include/asm-arm/procinfo.h |  4 ++--
>   xen/include/asm-arm/psci.h     |  1 +
>   13 files changed, 42 insertions(+), 13 deletions(-)
> 

Cheers,

[1] See [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
  2018-04-11 13:19 ` [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario Mirela Simonovic
@ 2018-04-11 15:11   ` Julien Grall
  2018-04-17 12:54     ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-11 15:11 UTC (permalink / raw)
  To: Mirela Simonovic, xen-devel; +Cc: edgar.iglesias, sstabellini, dm

Hi,

On 11/04/18 14:19, Mirela Simonovic wrote:
> In existing code the paging for secondary CPUs is setup only in boot flow.
> The setup is triggered from start_xen function after all CPUs are brought
> online. In other words, the initialization of VTCR_EL2 register is done
> out of the cpu_up/start_secondary control flow. However, the cpu_up flow
> should be self-contained - it should fully initialize a secondary CPU,
> because the cpu_up is used not only to bring a secondary CPU online on
> boot, but also to hotplug a CPU during the system resume.
> With this patch the setting of paging is triggered from start_secondary
> function if the current system state is not boot. This way, the paging
> will be setup in non-boot scenarios, while the setup in boot scenario
> remains unchanged.

I am afraid that this is not correct. You can't assume that value chosen 
for VTCR by Xen at boot will fit this new CPU. So you have to check it 
is fine or park the CPU if there are any issue.

For more details have a look at [1].

[1] 
https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg02482.html

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 0/7] xen/arm: CPU hotplug fixes
  2018-04-11 13:19 [PATCH 0/7] xen/arm: CPU hotplug fixes Mirela Simonovic
                   ` (7 preceding siblings ...)
  2018-04-11 15:07 ` [PATCH 0/7] xen/arm: CPU hotplug fixes Julien Grall
@ 2018-04-11 15:13 ` Julien Grall
  8 siblings, 0 replies; 61+ messages in thread
From: Julien Grall @ 2018-04-11 15:13 UTC (permalink / raw)
  To: Mirela Simonovic, xen-devel; +Cc: edgar.iglesias, sstabellini, dm

Hi,

I forgot to mention that the title of most of the patch gives the 
impression you fixes hotplug for both Arm32 and Arm64. After a deeper 
look, it is only arm64. Please make clear over commit message and cover 
letter.

Cheers,

On 11/04/18 14:19, Mirela Simonovic wrote:
> This patch set contains fixes required to enable CPU hotplug for secondary CPUs.
> CPU hotplug of secondary CPUs will be used for suspend to RAM support for ARM.
> 
> With these patches calling disable_nonboot_cpus() from the boot CPU will cause
> all secondary CPUs to be stopped. When a CPU is stopped it will issue the PSCI
> CPU_OFF call to the EL3. The calling CPU will be powered down if the underlying
> firmware allows so. If the CPU_OFF is not supported by the underlying firmware
> the calling CPU will stay in infinite WFI loop. When secondary CPUs are stopped
> all interrupts targeted to them are migrated to the boot CPU.
> Calling enable_nonboot_cpus() from the boot CPU will cause all secondary CPUs
> to be hotplugged. When a CPU is hotplugged the interrupts' affinity is restored.
> 
> Calls to enable/disable_nonboot_cpus() functions currently don't exist in
> Xen ARM code. This will be added with the suspend to RAM support for ARM.
> 
> The code is tested on Xilinx Zynq UltraScale+ MPSoC/ZCU102 board (includes
> physical power down/up of secondary CPUs).
> 
> Mirela Simonovic (7):
>    xen/arm: Added handling of the trapped access to OSLSR register
>    xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers
>    xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
>    xen/arm: When CPU dies, free percpu area immediatelly
>    xen/arm: Remove __initdata and __init to enable CPU hotplug
>    xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
>    xen/arm: Restore IRQ affinity after hotplugging a CPU
> 
>   xen/arch/arm/arm64/smpboot.c   |  2 +-
>   xen/arch/arm/arm64/vsysreg.c   |  3 ++-
>   xen/arch/arm/irq.c             |  2 +-
>   xen/arch/arm/p2m.c             | 10 ++++++++--
>   xen/arch/arm/percpu.c          |  2 +-
>   xen/arch/arm/processor.c       |  2 +-
>   xen/arch/arm/psci.c            |  5 +++++
>   xen/arch/arm/smpboot.c         | 14 ++++++++++++--
>   xen/arch/arm/vgic-v2.c         |  3 +--
>   xen/common/schedule.c          |  4 ++++
>   xen/include/asm-arm/p2m.h      |  3 +++
>   xen/include/asm-arm/procinfo.h |  4 ++--
>   xen/include/asm-arm/psci.h     |  1 +
>   13 files changed, 42 insertions(+), 13 deletions(-)
> 

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 0/7] xen/arm: CPU hotplug fixes
  2018-04-11 15:07 ` [PATCH 0/7] xen/arm: CPU hotplug fixes Julien Grall
@ 2018-04-11 15:58   ` Mirela Simonovic
  2018-04-11 16:02     ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-11 15:58 UTC (permalink / raw)
  To: Julien Grall, xen-devel; +Cc: edgar.iglesias, sstabellini, dm

Hi Julien,


Thank you for the feedback.


On 04/11/2018 05:07 PM, Julien Grall wrote:
> Hi Mirela,
>
> Thank you for sending the series.
>
> On 11/04/18 14:19, Mirela Simonovic wrote:
>> This patch set contains fixes required to enable CPU hotplug for 
>> secondary CPUs.
>> CPU hotplug of secondary CPUs will be used for suspend to RAM support 
>> for ARM.
>>
>> With these patches calling disable_nonboot_cpus() from the boot CPU 
>> will cause
>> all secondary CPUs to be stopped. When a CPU is stopped it will issue 
>> the PSCI
>> CPU_OFF call to the EL3. The calling CPU will be powered down if the 
>> underlying
>> firmware allows so. If the CPU_OFF is not supported by the underlying 
>> firmware
>> the calling CPU will stay in infinite WFI loop. When secondary CPUs 
>> are stopped
>> all interrupts targeted to them are migrated to the boot CPU.
>> Calling enable_nonboot_cpus() from the boot CPU will cause all 
>> secondary CPUs
>> to be hotplugged. When a CPU is hotplugged the interrupts' affinity 
>> is restored.
>
> I think this patch series is not complete. While it may work for you 
> there are a few corner cases that will result to leak some memory 
> everytime you turn off/on the CPU. This is the case of page-table, 
> action associated to local IRQ.... I don't have a full list, but 
> basically you need to ensured that anything that was initialized 
> during the CPU boot are freed/reverted. You probably want to look at 
> my answer you on your design document [1] for more details.
>

Thanks, I'll check.

> I also don't see any code migrating Xen interrupt from the CPU that is 
> going to be turned off to the other. Is it something you are going to 
> plan for the future?
>

Migrating interrupts when turning off a CPU already works. However, when 
a CPU is turned back on there is no interrupt migration back to the 
hotplugged CPU - all interrupts will remain routed to the CPU#0.
Patch 7/7 fixes this.

>>
>> Calls to enable/disable_nonboot_cpus() functions currently don't 
>> exist in
>> Xen ARM code. This will be added with the suspend to RAM support for 
>> ARM.
>
> What would be the way to test that series?
>

I tested/developed the series on Xilinx Zynq UltraScale+ MPSoC (ZCU102 
board), but how would others test that - I don't know. We will have to 
ask Xilinx for detailed answer to this question.

Thanks,
Mirela

>>
>> The code is tested on Xilinx Zynq UltraScale+ MPSoC/ZCU102 board 
>> (includes
>> physical power down/up of secondary CPUs).
>>
>> Mirela Simonovic (7):
>>    xen/arm: Added handling of the trapped access to OSLSR register
>>    xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers
>>    xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
>>    xen/arm: When CPU dies, free percpu area immediatelly
>>    xen/arm: Remove __initdata and __init to enable CPU hotplug
>>    xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
>>    xen/arm: Restore IRQ affinity after hotplugging a CPU
>>
>>   xen/arch/arm/arm64/smpboot.c   |  2 +-
>>   xen/arch/arm/arm64/vsysreg.c   |  3 ++-
>>   xen/arch/arm/irq.c             |  2 +-
>>   xen/arch/arm/p2m.c             | 10 ++++++++--
>>   xen/arch/arm/percpu.c          |  2 +-
>>   xen/arch/arm/processor.c       |  2 +-
>>   xen/arch/arm/psci.c            |  5 +++++
>>   xen/arch/arm/smpboot.c         | 14 ++++++++++++--
>>   xen/arch/arm/vgic-v2.c         |  3 +--
>>   xen/common/schedule.c          |  4 ++++
>>   xen/include/asm-arm/p2m.h      |  3 +++
>>   xen/include/asm-arm/procinfo.h |  4 ++--
>>   xen/include/asm-arm/psci.h     |  1 +
>>   13 files changed, 42 insertions(+), 13 deletions(-)
>>
>
> Cheers,
>
> [1] See [RFC v2] xen/arm: Suspend to RAM Support in Xen for ARM
>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 0/7] xen/arm: CPU hotplug fixes
  2018-04-11 15:58   ` Mirela Simonovic
@ 2018-04-11 16:02     ` Julien Grall
  2018-04-11 16:37       ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-11 16:02 UTC (permalink / raw)
  To: Mirela Simonovic, xen-devel; +Cc: edgar.iglesias, sstabellini, dm

Hi,

On 11/04/18 16:58, Mirela Simonovic wrote:
> On 04/11/2018 05:07 PM, Julien Grall wrote:
>> On 11/04/18 14:19, Mirela Simonovic wrote:
> Migrating interrupts when turning off a CPU already works. However, when 
> a CPU is turned back on there is no interrupt migration back to the 
> hotplugged CPU - all interrupts will remain routed to the CPU#0.
> Patch 7/7 fixes this

What do you mean by all interrupts? Interrupts routed to guest will 
always follow the vCPU. So are you sure they are going to be migrated 
when that vCPU is paused/off?

Can you give the path in Xen doing that?

> 
>>>
>>> Calls to enable/disable_nonboot_cpus() functions currently don't 
>>> exist in
>>> Xen ARM code. This will be added with the suspend to RAM support for 
>>> ARM.
>>
>> What would be the way to test that series?
>>
> 
> I tested/developed the series on Xilinx Zynq UltraScale+ MPSoC (ZCU102 
> board), but how would others test that - I don't know. We will have to 
> ask Xilinx for detailed answer to this question.

I am not sure to understand your answer. What I asked, is how did you 
did turned off the CPU? Did you hack Xen or do you rely on another series?

If the former, it would be nice to get the code you used. If the latter, 
then having a hack patch to test that code would be nice. Ideally, you 
want to plug that in the SYSCTL interface for out-of-box testing.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 0/7] xen/arm: CPU hotplug fixes
  2018-04-11 16:02     ` Julien Grall
@ 2018-04-11 16:37       ` Mirela Simonovic
  2018-04-12  8:43         ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-11 16:37 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel


[-- Attachment #1.1: Type: text/plain, Size: 3461 bytes --]

Hi Julien,

On Wed, Apr 11, 2018 at 6:02 PM, Julien Grall <julien.grall@arm.com> wrote:

> Hi,
>
> On 11/04/18 16:58, Mirela Simonovic wrote:
>
>> On 04/11/2018 05:07 PM, Julien Grall wrote:
>>
>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>
>> Migrating interrupts when turning off a CPU already works. However, when
>> a CPU is turned back on there is no interrupt migration back to the
>> hotplugged CPU - all interrupts will remain routed to the CPU#0.
>> Patch 7/7 fixes this
>>
>
> What do you mean by all interrupts? Interrupts routed to guest will always
> follow the vCPU. So are you sure they are going to be migrated when that
> vCPU is paused/off?
>
>
Just to make sure we're on the same page - this is about hotplugging
physical CPUs. Hotplugging vCPUs using virtual PSCI CPU_OFF interface is
already implemented and unrelated to this series.

Assuming that system has 2 pCPUs by 'all interrupts' I mean interrupts that
were targeted to the pCPU#0 and pCPU#1 prior to doing any hotplug.

For example, if a guest is pinned to pCPU#1 an interrupt of a device it
owns will be targeted to pCPU#1.
When pCPU#1 is turned off that interrupt will be migrated to pCPU#0. pCPU#0
finalizes the suspend and receives wake-up interrupts. However, when CPU#1
is turned back on that interrupt will remain targeted to the CPU#0, which I
assumed is wrong.
The scenario described here is also how I tested this.



> Can you give the path in Xen doing that?
>
>
Sure, here is a backtrace (dumped on the CPU being turned off):

    0  0x2603dc arch_move_irqs(): vgic.c, line 309
    1  0x22ee58 sched_move_irqs()+20: schedule.c, line 303
    2  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
    3  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
    4  0x25aff8 __cpu_disable()+96: smpboot.c, line 386
    5  0x201608 take_cpu_down()+52: cpu.c, line 75
    6  0x23426c stopmachine_action()+188: stop_machine.c, line 159
    7  0x235858 do_tasklet_work()+176: tasklet.c, line 94
    8  0x235c80 do_tasklet()+104: tasklet.c, line 126
    9  0x24daec idle_loop()+144: domain.c, line 72
   10  0x25b1f8 start_secondary()+404: smpboot.c, line 368



>
>>
>>>> Calls to enable/disable_nonboot_cpus() functions currently don't exist
>>>> in
>>>> Xen ARM code. This will be added with the suspend to RAM support for
>>>> ARM.
>>>>
>>>
>>> What would be the way to test that series?
>>>
>>>
>> I tested/developed the series on Xilinx Zynq UltraScale+ MPSoC (ZCU102
>> board), but how would others test that - I don't know. We will have to ask
>> Xilinx for detailed answer to this question.
>>
>
> I am not sure to understand your answer. What I asked, is how did you did
> turned off the CPU? Did you hack Xen or do you rely on another series?
>
>
I rely on the suspend to RAM series that will be submitted after this one
is done. Not complete suspend to RAM support is needed in order to do the
testing.
Essentially, I have triggered suspend to RAM from Dom0 and the minimum in
Xen is to have the suspend support for secondary CPUs (calls to
disable/enable_nonboot_cpus).


> If the former, it would be nice to get the code you used. If the latter,
> then having a hack patch to test that code would be nice. Ideally, you want
> to plug that in the SYSCTL interface for out-of-box testing.
>
>
Ok, I have never used that but I'll try to figure it out. I may come up
with additional questions.

Thanks,
Mirela



> Cheers,
>
> --
> Julien Grall
>

[-- Attachment #1.2: Type: text/html, Size: 5757 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 1/7] xen/arm: Added handling of the trapped access to OSLSR register
  2018-04-11 14:39   ` Julien Grall
@ 2018-04-11 23:28     ` Stefano Stabellini
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Stabellini @ 2018-04-11 23:28 UTC (permalink / raw)
  To: Julien Grall; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, dm, Xen-devel

On Wed, 11 Apr 2018, Julien Grall wrote:
> Hi,
> 
> You seem to have used a wrong address for me.
> 
> Title: This patch is only adding the arm64 side. Please make it clear in it.

With that:

Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>


> Cheers,
> 
> On 04/11/2018 02:19 PM, Mirela Simonovic wrote:
> > Linux/dom0 accesses OSLSR register when saving CPU context during the
> > suspend procedure. Xen traps access to this register, but has no handling
> > for it. Consequently, Xen injects undef exception to linux, causing it to
> > crash. This patch adds handling of the trapped access to OSLSR as ro/raz.
> > 
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > ---
> >   xen/arch/arm/arm64/vsysreg.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/xen/arch/arm/arm64/vsysreg.c b/xen/arch/arm/arm64/vsysreg.c
> > index c57ac12503..8f80e1735e 100644
> > --- a/xen/arch/arm/arm64/vsysreg.c
> > +++ b/xen/arch/arm/arm64/vsysreg.c
> > @@ -57,13 +57,14 @@ void do_sysreg(struct cpu_user_regs *regs,
> >        * ARMv8 (DDI 0487A.d): D1-1509 Table D1-58
> >        *
> >        * Unhandled:
> > -     *    OSLSR_EL1
> >        *    DBGPRCR_EL1
> >        */
> >       case HSR_SYSREG_OSLAR_EL1:
> >           return handle_wo_wi(regs, regidx, hsr.sysreg.read, hsr, 1);
> >       case HSR_SYSREG_OSDLR_EL1:
> >           return handle_raz_wi(regs, regidx, hsr.sysreg.read, hsr, 1);
> > +    case HSR_SYSREG_OSLSR_EL1:
> > +        return handle_ro_raz(regs, regidx, hsr.sysreg.read, hsr, 1);
> > 
> >       /*
> >        * MDCR_EL2.TDA
> > 
> 
> --
> Julien Grall
> IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended
> recipient, please notify the sender immediately and do not disclose the
> contents to any other person, use it for any purpose, or store or copy the
> information in any medium. Thank you.
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xenproject.org
> https://lists.xenproject.org/mailman/listinfo/xen-devel

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 2/7] xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers
  2018-04-11 14:43   ` Julien Grall
@ 2018-04-11 23:31     ` Stefano Stabellini
  2018-04-12 11:15       ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Stefano Stabellini @ 2018-04-11 23:31 UTC (permalink / raw)
  To: Julien Grall; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, dm, xen-devel

On Wed, 11 Apr 2018, Julien Grall wrote:
> On 11/04/18 14:19, Mirela Simonovic wrote:
> > Guests attempt to write into these registers on resume (for example Linux).
> > Without this patch a data abort exception will be raised to the guest.
> > This patch handles the write access by ignoring it. This should be fine for
> > now because reading these registers is already handled as 'read as zero'.
> 
> I think this patch is wrong. It is not mandatory for the guest to write
> exactly the same value as read. Assuming the guest will always write 0, then
> what you want to do is checking the write is actually 0. In that case you can
> ignore it. For all the other case, you should still fail.

Yes indeed, and that should be fine because GICD_ISACTIVER is already
implemented as read_as_zero.


> > 
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > ---
> >   xen/arch/arm/vgic-v2.c | 3 +--
> >   1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
> > index 646d1f3d12..b088376ed0 100644
> > --- a/xen/arch/arm/vgic-v2.c
> > +++ b/xen/arch/arm/vgic-v2.c
> > @@ -484,11 +484,10 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v,
> > mmio_info_t *info,
> >           return 0;
> >         case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN):
> > -        if ( dabt.size != DABT_WORD ) goto bad_width;
> >           printk(XENLOG_G_ERR
> >                  "%pv: vGICD: unhandled word write %#"PRIregister" to
> > ISACTIVER%d\n",
> >                  v, r, gicd_reg - GICD_ISACTIVER);
> > -        return 0;
> > +        goto write_ignore_32;
> >         case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN):
> >           printk(XENLOG_G_ERR
> > 
> 
> -- 
> Julien Grall
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-11 14:53   ` Julien Grall
@ 2018-04-11 23:46     ` Stefano Stabellini
  2018-04-12  8:53       ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Stefano Stabellini @ 2018-04-11 23:46 UTC (permalink / raw)
  To: Julien Grall; +Cc: edgar.iglesias, sstabellini, Mirela Simonovic, dm, xen-devel

On Wed, 11 Apr 2018, Julien Grall wrote:
> On 11/04/18 14:19, Mirela Simonovic wrote:
> > Freeing percpu area is done when a non-boot CPU is disabled upon suspend.
> > This use to be scheduled for execution after a period of time, what caused
> > the following racing issues. If CPU is enabled after it is disabled and
> > before the freeing of percpu area is performed, Xen would crash upon
> > initializing percpu area because per cpu offset is not marked as
> > INVALID_PERCPU_AREA (this suppose to happen when cpu area is freed).
> > To resolve the racing issue, free percpu area right away instead
> > scheduling it for later.
> 
> The reason of using the RCU is you want to make sure that none of the other
> CPUs will access that percpu data before freeing it. So I don't think this
> patch is valid.
> 
> It looks like to me a rcu barrier is missing after calling cpu_down somewhere
> in the CPU off path. I am not entirely sure where.

We need a rcu_barrier(). Perhaps, it could be added on cpu_on before
initializing the percpu area?


> > 
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > ---
> >   xen/arch/arm/percpu.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/xen/arch/arm/percpu.c b/xen/arch/arm/percpu.c
> > index 25442c48fe..e4e8405f43 100644
> > --- a/xen/arch/arm/percpu.c
> > +++ b/xen/arch/arm/percpu.c
> > @@ -46,7 +46,7 @@ static void free_percpu_area(unsigned int cpu)
> >   {
> >       struct free_info *info = &per_cpu(free_info, cpu);
> >       info->cpu = cpu;
> > -    call_rcu(&info->rcu, _free_percpu_area);
> > +    _free_percpu_area(&info->rcu);
> >   }
> >     static int cpu_percpu_callback(
> > 
> 
> -- 
> Julien Grall
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 5/7] xen/arm: Remove __initdata and __init to enable CPU hotplug
  2018-04-11 13:19 ` [PATCH 5/7] xen/arm: Remove __initdata and __init to enable CPU hotplug Mirela Simonovic
@ 2018-04-12  0:07   ` Stefano Stabellini
  2018-04-12  9:03     ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Stefano Stabellini @ 2018-04-12  0:07 UTC (permalink / raw)
  To: Mirela Simonovic; +Cc: edgar.iglesias, sstabellini, julien.grall, dm, xen-devel

On Wed, 11 Apr 2018, Mirela Simonovic wrote:
> CPU up flow is currently used during the initial boot to start secondary
> CPUs. However, the same flow should be used for CPU hotplug, e.g. when
> hotplugging secondary CPUs within the resume procedure (resume from the
> suspend to RAM). Therefore, prefixes __initdata and __init had to be removed
> from few data structures and functions that are used within the cpu up flow.
> 
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> ---
>  xen/arch/arm/arm64/smpboot.c   | 2 +-
>  xen/arch/arm/irq.c             | 2 +-
>  xen/arch/arm/processor.c       | 2 +-
>  xen/arch/arm/smpboot.c         | 4 ++--
>  xen/include/asm-arm/procinfo.h | 4 ++--
>  5 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/xen/arch/arm/arm64/smpboot.c b/xen/arch/arm/arm64/smpboot.c
> index 4fd0ac68b7..694fbf67e6 100644
> --- a/xen/arch/arm/arm64/smpboot.c
> +++ b/xen/arch/arm/arm64/smpboot.c
> @@ -104,7 +104,7 @@ int __init arch_cpu_init(int cpu, struct dt_device_node *dn)
>          return smp_psci_init(cpu);
>  }
>  
> -int __init arch_cpu_up(int cpu)
> +int arch_cpu_up(int cpu)
>  {
>      if ( !smp_enable_ops[cpu].prepare_cpu )
>          return -ENODEV;
> diff --git a/xen/arch/arm/irq.c b/xen/arch/arm/irq.c
> index aa4e832cae..098281f8ab 100644
> --- a/xen/arch/arm/irq.c
> +++ b/xen/arch/arm/irq.c
> @@ -65,7 +65,7 @@ irq_desc_t *__irq_to_desc(int irq)
>      return &irq_desc[irq-NR_LOCAL_IRQS];
>  }
>  
> -int __init arch_init_one_irq_desc(struct irq_desc *desc)
> +int arch_init_one_irq_desc(struct irq_desc *desc)
>  {
>      desc->arch.type = IRQ_TYPE_INVALID;
>      return 0;
> diff --git a/xen/arch/arm/processor.c b/xen/arch/arm/processor.c
> index ce4385064a..acad8b31d6 100644
> --- a/xen/arch/arm/processor.c
> +++ b/xen/arch/arm/processor.c
> @@ -20,7 +20,7 @@
>  
>  static DEFINE_PER_CPU(struct processor *, processor);
>  
> -void __init processor_setup(void)
> +void processor_setup(void)
>  {
>      const struct proc_info_list *procinfo;
>  
> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> index 5666efcd3a..d15ea8df5e 100644
> --- a/xen/arch/arm/smpboot.c
> +++ b/xen/arch/arm/smpboot.c
> @@ -52,8 +52,8 @@ nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>  static unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
>         __attribute__((__aligned__(STACK_SIZE)));
>  
> -/* Initial boot cpu data */
> -struct init_info __initdata init_data =
> +/* Boot cpu data */
> +struct init_info init_data =
>  {
>      .stack = cpu0_boot_stack,
>  };

Don't you also want to remove __initdata from cpu0_boot_stack?


> diff --git a/xen/include/asm-arm/procinfo.h b/xen/include/asm-arm/procinfo.h
> index 26306b35f8..02be56e348 100644
> --- a/xen/include/asm-arm/procinfo.h
> +++ b/xen/include/asm-arm/procinfo.h
> @@ -35,9 +35,9 @@ struct proc_info_list {
>      struct processor    *processor;
>  };
>  
> -const __init struct proc_info_list *lookup_processor_type(void);
> +const struct proc_info_list *lookup_processor_type(void);
>  
> -void __init processor_setup(void);
> +void processor_setup(void);
>  void processor_vcpu_initialise(struct vcpu *v);
>  
>  #endif

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU
  2018-04-11 13:19 ` [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU Mirela Simonovic
  2018-04-11 14:55   ` Julien Grall
@ 2018-04-12  0:20   ` Stefano Stabellini
  2018-04-12  7:38     ` Mirela Simonovic
  2018-04-12 16:49   ` Dario Faggioli
  2 siblings, 1 reply; 61+ messages in thread
From: Stefano Stabellini @ 2018-04-12  0:20 UTC (permalink / raw)
  To: Mirela Simonovic; +Cc: edgar.iglesias, sstabellini, julien.grall, dm, xen-devel

On Wed, 11 Apr 2018, Mirela Simonovic wrote:
> Secondary pCPUs will be offlined on system suspend and hotplugged
> on resume. When offlining secondary CPUs all interrupts targeted
> to those CPUs will be routed to the boot CPU. The boot CPU
> is responsible for finalizing suspend procedure. All wake-up
> interrupts are therefore targeted to the boot CPU. Existing code
> was missing the restoration of interrupts affinity after
> hotplugging a CPU. This patch restores the IRQ affinity after
> a CPU is hotplugged.
> 
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> ---
>  xen/common/schedule.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 343ab6306e..e3956019bc 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -692,6 +692,7 @@ void restore_vcpu_affinity(struct domain *d)
>      for_each_vcpu ( d, v )
>      {
>          spinlock_t *lock;
> +        bool affinity_was_broken = v->affinity_broken;
>  
>          ASSERT(!vcpu_runnable(v));
>  
> @@ -724,6 +725,9 @@ void restore_vcpu_affinity(struct domain *d)
>          lock = vcpu_schedule_lock_irq(v);
>          v->processor = SCHED_OP(vcpu_scheduler(v), pick_cpu, v);
>          spin_unlock_irq(lock);
> +
> +        if ( affinity_was_broken )
> +            sched_move_irqs(v);
>      }
>  
>      domain_update_node_affinity(d);

I am no expert of this code, but it looks correct to me. You might want
to move the setting of affinity_was_broken to inside the existing if (
v->affinity_broken ) check.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU
  2018-04-12  0:20   ` Stefano Stabellini
@ 2018-04-12  7:38     ` Mirela Simonovic
  0 siblings, 0 replies; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-12  7:38 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Edgar E. Iglesias, julien.grall, Davorin Mista, Xen Devel


[-- Attachment #1.1: Type: text/plain, Size: 1961 bytes --]

Hi Stefano,

v->processor has to be updated before the sched_move_irqs is called. Since
updating v->processor is done after the 'if (v->affinity_broken)' check I
had to put sched_move_irqs below, where it is.

Thanks,
Mirela

On Thu, Apr 12, 2018 at 2:20 AM, Stefano Stabellini <sstabellini@kernel.org>
wrote:

> On Wed, 11 Apr 2018, Mirela Simonovic wrote:
> > Secondary pCPUs will be offlined on system suspend and hotplugged
> > on resume. When offlining secondary CPUs all interrupts targeted
> > to those CPUs will be routed to the boot CPU. The boot CPU
> > is responsible for finalizing suspend procedure. All wake-up
> > interrupts are therefore targeted to the boot CPU. Existing code
> > was missing the restoration of interrupts affinity after
> > hotplugging a CPU. This patch restores the IRQ affinity after
> > a CPU is hotplugged.
> >
> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > ---
> >  xen/common/schedule.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> > index 343ab6306e..e3956019bc 100644
> > --- a/xen/common/schedule.c
> > +++ b/xen/common/schedule.c
> > @@ -692,6 +692,7 @@ void restore_vcpu_affinity(struct domain *d)
> >      for_each_vcpu ( d, v )
> >      {
> >          spinlock_t *lock;
> > +        bool affinity_was_broken = v->affinity_broken;
> >
> >          ASSERT(!vcpu_runnable(v));
> >
> > @@ -724,6 +725,9 @@ void restore_vcpu_affinity(struct domain *d)
> >          lock = vcpu_schedule_lock_irq(v);
> >          v->processor = SCHED_OP(vcpu_scheduler(v), pick_cpu, v);
> >          spin_unlock_irq(lock);
> > +
> > +        if ( affinity_was_broken )
> > +            sched_move_irqs(v);
> >      }
> >
> >      domain_update_node_affinity(d);
>
> I am no expert of this code, but it looks correct to me. You might want
> to move the setting of affinity_was_broken to inside the existing if (
> v->affinity_broken ) check.
>

[-- Attachment #1.2: Type: text/html, Size: 8432 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 0/7] xen/arm: CPU hotplug fixes
  2018-04-11 16:37       ` Mirela Simonovic
@ 2018-04-12  8:43         ` Julien Grall
  2018-04-13 10:19           ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-12  8:43 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel



On 11/04/18 17:37, Mirela Simonovic wrote:
> Hi Julien,

Hi,

May I ask you to configure your mail client to use > for quoting and use 
plain text? Otherwise, this is going to be really difficult to follow 
the discussion after few round (see already below).

> On Wed, Apr 11, 2018 at 6:02 PM, Julien Grall <julien.grall@arm.com 
> <mailto:julien.grall@arm.com>> wrote:
> 
>     Hi,
> 
>     On 11/04/18 16:58, Mirela Simonovic wrote:
> 
>         On 04/11/2018 05:07 PM, Julien Grall wrote:
> 
>             On 11/04/18 14:19, Mirela Simonovic wrote:
> 
>         Migrating interrupts when turning off a CPU already works.
>         However, when a CPU is turned back on there is no interrupt
>         migration back to the hotplugged CPU - all interrupts will
>         remain routed to the CPU#0.
>         Patch 7/7 fixes this
> 
> 
>     What do you mean by all interrupts? Interrupts routed to guest will
>     always follow the vCPU. So are you sure they are going to be
>     migrated when that vCPU is paused/off?
> 
> 
> Just to make sure we're on the same page - this is about hotplugging 
> physical CPUs. Hotplugging vCPUs using virtual PSCI CPU_OFF interface is 
> already implemented and unrelated to this series.

Yes, we are on the same page :). I was just wondering what happen to 
interrupt routed to that pCPU.

> 
> Assuming that system has 2 pCPUs by 'all interrupts' I mean interrupts 
> that were targeted to the pCPU#0 and pCPU#1 prior to doing any hotplug.
> 
> For example, if a guest is pinned to pCPU#1 an interrupt of a device it 
> owns will be targeted to pCPU#1.
> When pCPU#1 is turned off that interrupt will be migrated to pCPU#0. 
> pCPU#0 finalizes the suspend and receives wake-up interrupts. However, 
> when CPU#1 is turned back on that interrupt will remain targeted to the 
> CPU#0, which I assumed is wrong.
> The scenario described here is also how I tested this.
> 
>     Can you give the path in Xen doing that?
> 
> 
> Sure, here is a backtrace (dumped on the CPU being turned off):
>      0  0x2603dc arch_move_irqs(): vgic.c, line 309
>      1  0x22ee58 sched_move_irqs()+20: schedule.c, line 303
>      2  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
>      3  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
>      4  0x25aff8 __cpu_disable()+96: smpboot.c, line 386
>      5  0x201608 take_cpu_down()+52: cpu.c, line 75
>      6  0x23426c stopmachine_action()+188: stop_machine.c, line 159
>      7  0x235858 do_tasklet_work()+176: tasklet.c, line 94
>      8  0x235c80 do_tasklet()+104: tasklet.c, line 126
>      9  0x24daec idle_loop()+144: domain.c, line 72
>     10  0x25b1f8 start_secondary()+404: smpboot.c, line 368


So this cover interrupt routed to a virtual CPU. However, this does not 
handle interrupts used by Xen. How do you handle them?

For instance SMMUs IRQ might be routed to other interrupt than CPU #0.

[...]

> 
>     If the former, it would be nice to get the code you used. If the
>     latter, then having a hack patch to test that code would be nice.
>     Ideally, you want to plug that in the SYSCTL interface for
>     out-of-box testing.
> 
> 
> Ok, I have never used that but I'll try to figure it out. I may come up 
> with additional questions.

You might want to have a look at the x86 version XEN_SYSCTL_cpu_hotplug 
in x86/systcl.c.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-11 23:46     ` Stefano Stabellini
@ 2018-04-12  8:53       ` Julien Grall
  2018-04-12 21:31         ` Stefano Stabellini
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-12  8:53 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: edgar.iglesias, Mirela Simonovic, dm, xen-devel



On 12/04/18 00:46, Stefano Stabellini wrote:
> On Wed, 11 Apr 2018, Julien Grall wrote:
>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>> Freeing percpu area is done when a non-boot CPU is disabled upon suspend.
>>> This use to be scheduled for execution after a period of time, what caused
>>> the following racing issues. If CPU is enabled after it is disabled and
>>> before the freeing of percpu area is performed, Xen would crash upon
>>> initializing percpu area because per cpu offset is not marked as
>>> INVALID_PERCPU_AREA (this suppose to happen when cpu area is freed).
>>> To resolve the racing issue, free percpu area right away instead
>>> scheduling it for later.
>>
>> The reason of using the RCU is you want to make sure that none of the other
>> CPUs will access that percpu data before freeing it. So I don't think this
>> patch is valid.
>>
>> It looks like to me a rcu barrier is missing after calling cpu_down somewhere
>> in the CPU off path. I am not entirely sure where.
> 
> We need a rcu_barrier(). Perhaps, it could be added on cpu_on before
> initializing the percpu area?

Do you mind giving a bit more details on your thought? cpu_up looks a 
strange place as no one should access the percpu area after the CPU is 
down. So it feels the rcu_barrier should be somewhere before 
PSCI_cpu_off is called.

Cheers,

> 
> 
>>>
>>> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
>>> ---
>>>    xen/arch/arm/percpu.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/xen/arch/arm/percpu.c b/xen/arch/arm/percpu.c
>>> index 25442c48fe..e4e8405f43 100644
>>> --- a/xen/arch/arm/percpu.c
>>> +++ b/xen/arch/arm/percpu.c
>>> @@ -46,7 +46,7 @@ static void free_percpu_area(unsigned int cpu)
>>>    {
>>>        struct free_info *info = &per_cpu(free_info, cpu);
>>>        info->cpu = cpu;
>>> -    call_rcu(&info->rcu, _free_percpu_area);
>>> +    _free_percpu_area(&info->rcu);
>>>    }
>>>      static int cpu_percpu_callback(
>>>
>>
>> -- 
>> Julien Grall
>>

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 5/7] xen/arm: Remove __initdata and __init to enable CPU hotplug
  2018-04-12  0:07   ` Stefano Stabellini
@ 2018-04-12  9:03     ` Julien Grall
  2018-04-12 12:50       ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-12  9:03 UTC (permalink / raw)
  To: Stefano Stabellini, Mirela Simonovic
  Cc: edgar.iglesias, julien.grall, dm, xen-devel

Hi,

On 12/04/18 01:07, Stefano Stabellini wrote:
> On Wed, 11 Apr 2018, Mirela Simonovic wrote:
>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>> index 5666efcd3a..d15ea8df5e 100644
>> --- a/xen/arch/arm/smpboot.c
>> +++ b/xen/arch/arm/smpboot.c
>> @@ -52,8 +52,8 @@ nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>>   static unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
>>          __attribute__((__aligned__(STACK_SIZE)));
>>   
>> -/* Initial boot cpu data */
>> -struct init_info __initdata init_data =
>> +/* Boot cpu data */
>> +struct init_info init_data =
>>   {
>>       .stack = cpu0_boot_stack,
>>   };
> 
> Don't you also want to remove __initdata from cpu0_boot_stack?

I am not sure about this. When you go idle, you could re-use the 
idle_vcpu[0]->arch.stack. So you save 12K in resident memory.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 2/7] xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers
  2018-04-11 23:31     ` Stefano Stabellini
@ 2018-04-12 11:15       ` Mirela Simonovic
  0 siblings, 0 replies; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-12 11:15 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Edgar E. Iglesias, Julien Grall, Davorin Mista, Xen Devel

Hi,


On Thu, Apr 12, 2018 at 1:31 AM, Stefano Stabellini
<sstabellini@kernel.org> wrote:
> On Wed, 11 Apr 2018, Julien Grall wrote:
>> On 11/04/18 14:19, Mirela Simonovic wrote:
>> > Guests attempt to write into these registers on resume (for example Linux).
>> > Without this patch a data abort exception will be raised to the guest.
>> > This patch handles the write access by ignoring it. This should be fine for
>> > now because reading these registers is already handled as 'read as zero'.
>>
>> I think this patch is wrong. It is not mandatory for the guest to write
>> exactly the same value as read. Assuming the guest will always write 0, then
>> what you want to do is checking the write is actually 0. In that case you can
>> ignore it. For all the other case, you should still fail.
>
> Yes indeed, and that should be fine because GICD_ISACTIVER is already
> implemented as read_as_zero.
>

Thanks, I fixed this (will be submitted in v2 series).

>
>> >
>> > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
>> > ---
>> >   xen/arch/arm/vgic-v2.c | 3 +--
>> >   1 file changed, 1 insertion(+), 2 deletions(-)
>> >
>> > diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
>> > index 646d1f3d12..b088376ed0 100644
>> > --- a/xen/arch/arm/vgic-v2.c
>> > +++ b/xen/arch/arm/vgic-v2.c
>> > @@ -484,11 +484,10 @@ static int vgic_v2_distr_mmio_write(struct vcpu *v,
>> > mmio_info_t *info,
>> >           return 0;
>> >         case VRANGE32(GICD_ISACTIVER, GICD_ISACTIVERN):
>> > -        if ( dabt.size != DABT_WORD ) goto bad_width;
>> >           printk(XENLOG_G_ERR
>> >                  "%pv: vGICD: unhandled word write %#"PRIregister" to
>> > ISACTIVER%d\n",
>> >                  v, r, gicd_reg - GICD_ISACTIVER);
>> > -        return 0;
>> > +        goto write_ignore_32;
>> >         case VRANGE32(GICD_ICACTIVER, GICD_ICACTIVERN):
>> >           printk(XENLOG_G_ERR
>> >
>>
>> --
>> Julien Grall
>>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
  2018-04-11 14:46   ` Julien Grall
@ 2018-04-12 11:33     ` Mirela Simonovic
  2018-04-12 13:31       ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-12 11:33 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista,
	julien.grall, Xen Devel

Hi Julien,

On Wed, Apr 11, 2018 at 4:46 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi,
>
> On 11/04/18 14:19, Mirela Simonovic wrote:
>>
>> This patch adds the PSCI CPU_OFF call to the EL3 in order to
>> trigger powering down of the calling CPU when the CPU is stopped.
>> If CPU_OFF call fails for some reason, e.g. EL3 does not implement
>> the PSCI CPU_OFF function, the calling CPU will loop in the infinite >
>> while/wfi, as it was looping before this change.
>
>
> I am afraid that the example you give is wrong. That call should exist for
> all implementation of PSCI 0.2 and above. For 0.1, this should never be call
> as the ID is not reserved.

Thanks.

>
>>
>> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
>> ---
>>   xen/arch/arm/psci.c        | 5 +++++
>>   xen/arch/arm/smpboot.c     | 7 +++++++
>>   xen/include/asm-arm/psci.h | 1 +
>>   3 files changed, 13 insertions(+)
>>
>> diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
>> index 94b616df9b..e9e756e56b 100644
>> --- a/xen/arch/arm/psci.c
>> +++ b/xen/arch/arm/psci.c
>> @@ -46,6 +46,11 @@ int call_psci_cpu_on(int cpu)
>>       return call_smc(psci_cpu_on_nr, cpu_logical_map(cpu),
>> __pa(init_secondary), 0);
>>   }
>>   +int call_psci_cpu_off(void)
>> +{
>
>
> You have to check the PSCI version here before calling the function.
>

Thanks, I fixed this.

>> +    return call_smc(PSCI_0_2_FN32_CPU_OFF, 0, 0, 0);
>> +}
>> +
>>   void call_psci_system_off(void)
>>   {
>>       if ( psci_ver > PSCI_VERSION(0, 1) )
>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>> index b2116f0d2d..5666efcd3a 100644
>> --- a/xen/arch/arm/smpboot.c
>> +++ b/xen/arch/arm/smpboot.c
>> @@ -390,11 +390,18 @@ void __cpu_disable(void)
>>     void stop_cpu(void)
>>   {
>> +    int errno;
>
>
> newline.

Got it, thanks

>
>>       local_irq_disable();
>>       cpu_is_dead = true;
>>       /* Make sure the write happens before we sleep forever */
>>       dsb(sy);
>>       isb();
>> +    /* PSCI cpu off call will return only in case of an error */
>> +    errno = call_psci_cpu_off();
>> +    printk(XENLOG_DEBUG "PSCI cpu off call failed for CPU#%d err=%d\n",
>> +           get_processor_id(), errno);
>> +    isb();
>
>
> What are you trying to achieve with the isb() here?
>

I use to have a problem that the wfi below gets executed before the
call_psci_cpu_off(). Adding isb() fixed the issue. However, I tried
now to reproduce the problem and it doesn't show up. I still believe
isb() should be here, please let me know if you disagree (I obviously
can't prove the claim now).

Thanks,
Mirela

> Cheers,
>
>> +    /* If CPU_OFF PSCI call failed stay in the WFI loop */
>>       while ( 1 )
>>           wfi();
>>   }
>> diff --git a/xen/include/asm-arm/psci.h b/xen/include/asm-arm/psci.h
>> index 9ac820e94a..50d668a296 100644
>> --- a/xen/include/asm-arm/psci.h
>> +++ b/xen/include/asm-arm/psci.h
>> @@ -20,6 +20,7 @@ extern uint32_t psci_ver;
>>     int psci_init(void);
>>   int call_psci_cpu_on(int cpu);
>> +int call_psci_cpu_off(void);
>>   void call_psci_system_off(void);
>>   void call_psci_system_reset(void);
>>
>
>
> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 5/7] xen/arm: Remove __initdata and __init to enable CPU hotplug
  2018-04-12  9:03     ` Julien Grall
@ 2018-04-12 12:50       ` Mirela Simonovic
  2018-04-12 12:56         ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-12 12:50 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Xen Devel, Davorin Mista,
	julien.grall

Hi,

On Thu, Apr 12, 2018 at 11:03 AM, Julien Grall <julien.grall@arm.com> wrote:
> Hi,
>
> On 12/04/18 01:07, Stefano Stabellini wrote:
>>
>> On Wed, 11 Apr 2018, Mirela Simonovic wrote:
>>>
>>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>>> index 5666efcd3a..d15ea8df5e 100644
>>> --- a/xen/arch/arm/smpboot.c
>>> +++ b/xen/arch/arm/smpboot.c
>>> @@ -52,8 +52,8 @@ nodemask_t __read_mostly node_online_map = { { [0] =
>>> 1UL } };
>>>   static unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
>>>          __attribute__((__aligned__(STACK_SIZE)));
>>>   -/* Initial boot cpu data */
>>> -struct init_info __initdata init_data =
>>> +/* Boot cpu data */
>>> +struct init_info init_data =
>>>   {
>>>       .stack = cpu0_boot_stack,
>>>   };
>>
>>
>> Don't you also want to remove __initdata from cpu0_boot_stack?
>

Somehow I didn't observe this as a problem... After taking a deeper
look now I understand that secondary CPUs reuse this stack to boot. So
I agree, __initdata from cpu0_boot_stack should be removed.

>
> I am not sure about this. When you go idle, you could re-use the
> idle_vcpu[0]->arch.stack. So you save 12K in resident memory.
>

I'm not sure I follow this, maybe Stefano can comment.

Thanks,
Mirela

> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 5/7] xen/arm: Remove __initdata and __init to enable CPU hotplug
  2018-04-12 12:50       ` Mirela Simonovic
@ 2018-04-12 12:56         ` Julien Grall
  2018-04-12 13:55           ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-12 12:56 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Xen Devel, Davorin Mista,
	julien.grall

Hi,

On 12/04/18 13:50, Mirela Simonovic wrote:
> Hi,
> 
> On Thu, Apr 12, 2018 at 11:03 AM, Julien Grall <julien.grall@arm.com> wrote:
>> Hi,
>>
>> On 12/04/18 01:07, Stefano Stabellini wrote:
>>>
>>> On Wed, 11 Apr 2018, Mirela Simonovic wrote:
>>>>
>>>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>>>> index 5666efcd3a..d15ea8df5e 100644
>>>> --- a/xen/arch/arm/smpboot.c
>>>> +++ b/xen/arch/arm/smpboot.c
>>>> @@ -52,8 +52,8 @@ nodemask_t __read_mostly node_online_map = { { [0] =
>>>> 1UL } };
>>>>    static unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
>>>>           __attribute__((__aligned__(STACK_SIZE)));
>>>>    -/* Initial boot cpu data */
>>>> -struct init_info __initdata init_data =
>>>> +/* Boot cpu data */
>>>> +struct init_info init_data =
>>>>    {
>>>>        .stack = cpu0_boot_stack,
>>>>    };
>>>
>>>
>>> Don't you also want to remove __initdata from cpu0_boot_stack?
>>
> 
> Somehow I didn't observe this as a problem... After taking a deeper
> look now I understand that secondary CPUs reuse this stack to boot. So
> I agree, __initdata from cpu0_boot_stack should be removed.

No it should not be removed. cpu0_boot_stack is only used for Xen is 
booted (e.g CPU0 jumping at _start). In the suspend/resume case you are 
not going to use that patch for CPU0.

> 
>>
>> I am not sure about this. When you go idle, you could re-use the
>> idle_vcpu[0]->arch.stack. So you save 12K in resident memory.
>>
> 
> I'm not sure I follow this, maybe Stefano can comment.

Each CPU have an associated idle vCPU used for context switch and 
running idle mode. That idle vCPU contains the stack that is used for 
boot CPU.

In the case of CPU0, you can not use idle vCPU stack when booting 
because it is not initialized. However during suspend/resume case, you 
will already have the idle_vcpu[0]->stack in hand. So there are no need 
to use cpu0_boot_stack.

However, do you really need to setup the stack on resume?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
  2018-04-12 11:33     ` Mirela Simonovic
@ 2018-04-12 13:31       ` Julien Grall
  2018-04-16 10:02         ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-12 13:31 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista,
	julien.grall, Xen Devel



On 12/04/18 12:33, Mirela Simonovic wrote:
> On Wed, Apr 11, 2018 at 4:46 PM, Julien Grall <julien.grall@arm.com> wrote:
>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>
>>>        local_irq_disable();
>>>        cpu_is_dead = true;
>>>        /* Make sure the write happens before we sleep forever */
>>>        dsb(sy);
>>>        isb();
>>> +    /* PSCI cpu off call will return only in case of an error */
>>> +    errno = call_psci_cpu_off();
>>> +    printk(XENLOG_DEBUG "PSCI cpu off call failed for CPU#%d err=%d\n",
>>> +           get_processor_id(), errno);
>>> +    isb();
>>
>>
>> What are you trying to achieve with the isb() here?
>>
> 
> I use to have a problem that the wfi below gets executed before the
> call_psci_cpu_off(). Adding isb() fixed the issue. However, I tried
> now to reproduce the problem and it doesn't show up. I still believe
> isb() should be here, please let me know if you disagree (I obviously
> can't prove the claim now).

The problem you describe can't be possible with the code you have 
because call_psci_cpu_off() is issuing a SMC. SMC will lead to change 
exception level and therefore have a context-synchronization barrier.

This is obviously based on the assumption you don't have an errata on 
your CPU exposing the behavior you describe. For that you would need to 
check errata notice for your CPU and/or try to reproduce.

However, what you would need is a dsb(sy); isb(); to drain the write 
buffer if you print a message.

Furthermore, now on platform without CPU off support (e.g non-PSCI 
platform and PSCI 0.1) you will log an error message that may worry 
people. In reality, PSCI cpu_off will unlikely fail, so you probably 
want to add a panic in call_psci_cpu_off instead.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 5/7] xen/arm: Remove __initdata and __init to enable CPU hotplug
  2018-04-12 12:56         ` Julien Grall
@ 2018-04-12 13:55           ` Mirela Simonovic
  0 siblings, 0 replies; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-12 13:55 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Xen Devel, Davorin Mista,
	julien.grall

Hi Julien,

On Thu, Apr 12, 2018 at 2:56 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi,
>
> On 12/04/18 13:50, Mirela Simonovic wrote:
>>
>> Hi,
>>
>> On Thu, Apr 12, 2018 at 11:03 AM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> On 12/04/18 01:07, Stefano Stabellini wrote:
>>>>
>>>>
>>>> On Wed, 11 Apr 2018, Mirela Simonovic wrote:
>>>>>
>>>>>
>>>>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>>>>> index 5666efcd3a..d15ea8df5e 100644
>>>>> --- a/xen/arch/arm/smpboot.c
>>>>> +++ b/xen/arch/arm/smpboot.c
>>>>> @@ -52,8 +52,8 @@ nodemask_t __read_mostly node_online_map = { { [0] =
>>>>> 1UL } };
>>>>>    static unsigned char __initdata cpu0_boot_stack[STACK_SIZE]
>>>>>           __attribute__((__aligned__(STACK_SIZE)));
>>>>>    -/* Initial boot cpu data */
>>>>> -struct init_info __initdata init_data =
>>>>> +/* Boot cpu data */
>>>>> +struct init_info init_data =
>>>>>    {
>>>>>        .stack = cpu0_boot_stack,
>>>>>    };
>>>>
>>>>
>>>>
>>>> Don't you also want to remove __initdata from cpu0_boot_stack?
>>>
>>>
>>
>> Somehow I didn't observe this as a problem... After taking a deeper
>> look now I understand that secondary CPUs reuse this stack to boot. So
>> I agree, __initdata from cpu0_boot_stack should be removed.
>
>
> No it should not be removed. cpu0_boot_stack is only used for Xen is booted
> (e.g CPU0 jumping at _start). In the suspend/resume case you are not going
> to use that patch for CPU0.
>

Thanks for clarification. I need to correct myself - I missed the fact
that the boot CPU updated init_data.stack for a secondary CPU to point
to its idle_vcpu's stack (in __cpu_up). So you're right,
cpu0_boot_stack will never be used again after the CPU0 boots and
therefore the __initdata should not be removed.

>>
>>>
>>> I am not sure about this. When you go idle, you could re-use the
>>> idle_vcpu[0]->arch.stack. So you save 12K in resident memory.
>>>
>>
>> I'm not sure I follow this, maybe Stefano can comment.
>
>
> Each CPU have an associated idle vCPU used for context switch and running
> idle mode. That idle vCPU contains the stack that is used for boot CPU.
>
> In the case of CPU0, you can not use idle vCPU stack when booting because it
> is not initialized. However during suspend/resume case, you will already
> have the idle_vcpu[0]->stack in hand. So there are no need to use
> cpu0_boot_stack.
>
> However, do you really need to setup the stack on resume?
>

There is no need to use cpu0_boot_stack for CPU0 on resume. The
idle_vcpu's stack is used and it has to be used if we want to resume
execution from where we suspended. We just save/restore SP on
suspend/resume.

Thanks,
Mirela

> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU
  2018-04-11 13:19 ` [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU Mirela Simonovic
  2018-04-11 14:55   ` Julien Grall
  2018-04-12  0:20   ` Stefano Stabellini
@ 2018-04-12 16:49   ` Dario Faggioli
  2018-04-13 10:11     ` Mirela Simonovic
  2 siblings, 1 reply; 61+ messages in thread
From: Dario Faggioli @ 2018-04-12 16:49 UTC (permalink / raw)
  To: Mirela Simonovic, xen-devel; +Cc: edgar.iglesias, sstabellini, dm, julien.grall


[-- Attachment #1.1: Type: text/plain, Size: 2739 bytes --]

On Wed, 2018-04-11 at 15:19 +0200, Mirela Simonovic wrote:
> Secondary pCPUs will be offlined on system suspend and hotplugged
> on resume. When offlining secondary CPUs all interrupts targeted
> to those CPUs will be routed to the boot CPU. The boot CPU
> is responsible for finalizing suspend procedure. All wake-up
> interrupts are therefore targeted to the boot CPU. 
>
There is no wake-up interrupt involved in the process of unpausing
domains during resume. So, I that that what you mean is "the interrups
that the vcpus receive once they're running again. And even in that
case, rather than "All", is it "The interrupts received until the first
time the vcpu goes through vcpu_migrate()", as vcpu_migrate() does call
sched_move_irq()?

Note that I'm not (trying to) being picky, I'm just trying to
undestand. :-)

> Existing code
> was missing the restoration of interrupts affinity after
> hotplugging a CPU.
>
Either use hot-unplug and hotplug, or offline and online. I think the
latter is better in this case.

>  This patch restores the IRQ affinity after
> a CPU is hotplugged.
> 
> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>

> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
> index 343ab6306e..e3956019bc 100644
> --- a/xen/common/schedule.c
> +++ b/xen/common/schedule.c
> @@ -724,6 +725,9 @@ void restore_vcpu_affinity(struct domain *d)
>          lock = vcpu_schedule_lock_irq(v);
>          v->processor = SCHED_OP(vcpu_scheduler(v), pick_cpu, v);
>          spin_unlock_irq(lock);
> +
> +        if ( affinity_was_broken )
> +            sched_move_irqs(v);
>
But I guess that, more than on whether or not the affinity was broken,
you are interested in whether or not v->processor changed.

In fact, for the vcpus that had 0 in v->processor at the beginning of
this function, and also have 0 in there now, there is no need to call
sched_move_irq(), is there?

Similarly, if the affinity of a vcpu has not been broken, but pick_cpu
end up selecting a different v->processor, you do want to call
sched_move_irq(), I think.

If I'm right, I think it would be better to do, at the beginning of the
for:

 unsigned int old_cpu = v->processor;

And here:

 if (old_cpu != v->processor)
   sched_move_irqs(v);

And I'd also add a comment (above the if()), briefly saying how this is
necessary to match/undo the call work of vcpu_move_nosched() in
cpu_disable_scheduler(). 

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-12  8:53       ` Julien Grall
@ 2018-04-12 21:31         ` Stefano Stabellini
  2018-04-16 13:14           ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Stefano Stabellini @ 2018-04-12 21:31 UTC (permalink / raw)
  To: Julien Grall
  Cc: edgar.iglesias, Stefano Stabellini, Mirela Simonovic, dm, xen-devel

On Thu, 12 Apr 2018, Julien Grall wrote:
> On 12/04/18 00:46, Stefano Stabellini wrote:
> > On Wed, 11 Apr 2018, Julien Grall wrote:
> > > On 11/04/18 14:19, Mirela Simonovic wrote:
> > > > Freeing percpu area is done when a non-boot CPU is disabled upon
> > > > suspend.
> > > > This use to be scheduled for execution after a period of time, what
> > > > caused
> > > > the following racing issues. If CPU is enabled after it is disabled and
> > > > before the freeing of percpu area is performed, Xen would crash upon
> > > > initializing percpu area because per cpu offset is not marked as
> > > > INVALID_PERCPU_AREA (this suppose to happen when cpu area is freed).
> > > > To resolve the racing issue, free percpu area right away instead
> > > > scheduling it for later.
> > > 
> > > The reason of using the RCU is you want to make sure that none of the
> > > other
> > > CPUs will access that percpu data before freeing it. So I don't think this
> > > patch is valid.
> > > 
> > > It looks like to me a rcu barrier is missing after calling cpu_down
> > > somewhere
> > > in the CPU off path. I am not entirely sure where.
> > 
> > We need a rcu_barrier(). Perhaps, it could be added on cpu_on before
> > initializing the percpu area?
> 
> Do you mind giving a bit more details on your thought? cpu_up looks a strange
> place as no one should access the percpu area after the CPU is down. So it
> feels the rcu_barrier should be somewhere before PSCI_cpu_off is called.

Yes, it feels strange to do it on cpu_on, it would be more obvious on
cpu_off, but we don't actually need to _free_percpu_area on cpu_off,
right? We only need to make sure it is done before cpu_percpu_callback
is called on cpu_on.

My suggestion would be to evaluate if it is possible to introduce the
rcu_barrier() on the resume path before cpu_percpu_callback, maybe in
start_secondary.

I was also looking at xen/arch/x86/acpi/power.c:enter_state and noticed
that they chose to call rcu_barrier() on enable_cpu before
enable_nonboot_cpus().


 
> > > > 
> > > > Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
> > > > ---
> > > >    xen/arch/arm/percpu.c | 2 +-
> > > >    1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/xen/arch/arm/percpu.c b/xen/arch/arm/percpu.c
> > > > index 25442c48fe..e4e8405f43 100644
> > > > --- a/xen/arch/arm/percpu.c
> > > > +++ b/xen/arch/arm/percpu.c
> > > > @@ -46,7 +46,7 @@ static void free_percpu_area(unsigned int cpu)
> > > >    {
> > > >        struct free_info *info = &per_cpu(free_info, cpu);
> > > >        info->cpu = cpu;
> > > > -    call_rcu(&info->rcu, _free_percpu_area);
> > > > +    _free_percpu_area(&info->rcu);
> > > >    }
> > > >      static int cpu_percpu_callback(
> > > > 
> > > 
> > > -- 
> > > Julien Grall
> > > 
> 
> -- 
> Julien Grall
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU
  2018-04-12 16:49   ` Dario Faggioli
@ 2018-04-13 10:11     ` Mirela Simonovic
  2018-04-16 12:38       ` Dario Faggioli
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-13 10:11 UTC (permalink / raw)
  To: Dario Faggioli
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista,
	julien.grall, Xen Devel

Hi Dario,

On Thu, Apr 12, 2018 at 6:49 PM, Dario Faggioli <dfaggioli@suse.com> wrote:
> On Wed, 2018-04-11 at 15:19 +0200, Mirela Simonovic wrote:
>> Secondary pCPUs will be offlined on system suspend and hotplugged
>> on resume. When offlining secondary CPUs all interrupts targeted
>> to those CPUs will be routed to the boot CPU. The boot CPU
>> is responsible for finalizing suspend procedure. All wake-up
>> interrupts are therefore targeted to the boot CPU.
>>

Maybe I should have made the last sentence more understandable, like
"Interrupts that could wake-up the system from suspend to RAM state
are therefore targeted to the boot CPU."

> There is no wake-up interrupt involved in the process of unpausing
> domains during resume. So, I that that what you mean is "the interrups
> that the vcpus receive once they're running again. And even in that
> case, rather than "All", is it "The interrupts received until the first
> time the vcpu goes through vcpu_migrate()", as vcpu_migrate() does call
> sched_move_irq()?
>
> Note that I'm not (trying to) being picky, I'm just trying to
> undestand. :-)

No worries, my commit message should be more understandable and your
answer helps me identify what's unclear. I'll try to explain below.

This is about suspend/resume implementation for ARM that is based on
PSCI and targeted for embedded systems as well. Each guest could have
its own wake-up devices/interrupts (passthrough) that could trigger
the resume. So the wake-up interrupt in this context triggers the
resume. By 'all interrupts' I meant interrupts that are left enabled
by guests upon suspend (those interrupts could wake-up the system).
Here is the PSCI-based suspend to RAM design spec:
https://lists.xenproject.org/archives/html/xen-devel/2017-12/msg01574.html

This patch is required to ensure that interrupts are migrated back to
secondary pCPUs on resume if that is needed. When a vCPU is pinned to
a secondary pCPU the vcpu migration will not happen on resume (and may
never happen), so the interrupt will remain targeted to the boot CPU.
I'll try to fix the commit message to better explain this.

>
>> Existing code
>> was missing the restoration of interrupts affinity after
>> hotplugging a CPU.
>>
> Either use hot-unplug and hotplug, or offline and online. I think the
> latter is better in this case.
>
>>  This patch restores the IRQ affinity after
>> a CPU is hotplugged.
>>
>> Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
>
>> diff --git a/xen/common/schedule.c b/xen/common/schedule.c
>> index 343ab6306e..e3956019bc 100644
>> --- a/xen/common/schedule.c
>> +++ b/xen/common/schedule.c
>> @@ -724,6 +725,9 @@ void restore_vcpu_affinity(struct domain *d)
>>          lock = vcpu_schedule_lock_irq(v);
>>          v->processor = SCHED_OP(vcpu_scheduler(v), pick_cpu, v);
>>          spin_unlock_irq(lock);
>> +
>> +        if ( affinity_was_broken )
>> +            sched_move_irqs(v);
>>
> But I guess that, more than on whether or not the affinity was broken,
> you are interested in whether or not v->processor changed.
>
> In fact, for the vcpus that had 0 in v->processor at the beginning of
> this function, and also have 0 in there now, there is no need to call
> sched_move_irq(), is there?
>
> Similarly, if the affinity of a vcpu has not been broken, but pick_cpu
> end up selecting a different v->processor, you do want to call
> sched_move_irq(), I think.
>
> If I'm right, I think it would be better to do, at the beginning of the
> for:
>
>  unsigned int old_cpu = v->processor;
>
> And here:
>
>  if (old_cpu != v->processor)
>    sched_move_irqs(v);
>
> And I'd also add a comment (above the if()), briefly saying how this is
> necessary to match/undo the call work of vcpu_move_nosched() in
> cpu_disable_scheduler().
>

I though affinity check was sufficient but your proposal is better and
I'll fix the patch accordingly. Thanks for the feedback!

Regards,
Mirela

> Regards,
> Dario
> --
> <<This happens because I choose it to happen!>> (Raistlin Majere)
> -----------------------------------------------------------------
> Dario Faggioli, Ph.D, http://about.me/dario.faggioli
> Software Engineer @ SUSE https://www.suse.com/

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 0/7] xen/arm: CPU hotplug fixes
  2018-04-12  8:43         ` Julien Grall
@ 2018-04-13 10:19           ` Mirela Simonovic
  2018-04-16 11:33             ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-13 10:19 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel

Hi Julien,

On Thu, Apr 12, 2018 at 10:43 AM, Julien Grall <julien.grall@arm.com> wrote:
>
>
> On 11/04/18 17:37, Mirela Simonovic wrote:
>>
>> Hi Julien,
>
>
> Hi,
>
> May I ask you to configure your mail client to use > for quoting and use
> plain text? Otherwise, this is going to be really difficult to follow the
> discussion after few round (see already below).
>
>> On Wed, Apr 11, 2018 at 6:02 PM, Julien Grall <julien.grall@arm.com
>> <mailto:julien.grall@arm.com>> wrote:
>>
>>     Hi,
>>
>>     On 11/04/18 16:58, Mirela Simonovic wrote:
>>
>>         On 04/11/2018 05:07 PM, Julien Grall wrote:
>>
>>             On 11/04/18 14:19, Mirela Simonovic wrote:
>>
>>         Migrating interrupts when turning off a CPU already works.
>>         However, when a CPU is turned back on there is no interrupt
>>         migration back to the hotplugged CPU - all interrupts will
>>         remain routed to the CPU#0.
>>         Patch 7/7 fixes this
>>
>>
>>     What do you mean by all interrupts? Interrupts routed to guest will
>>     always follow the vCPU. So are you sure they are going to be
>>     migrated when that vCPU is paused/off?
>>
>>
>> Just to make sure we're on the same page - this is about hotplugging
>> physical CPUs. Hotplugging vCPUs using virtual PSCI CPU_OFF interface is
>> already implemented and unrelated to this series.
>
>
> Yes, we are on the same page :). I was just wondering what happen to
> interrupt routed to that pCPU.
>
>>
>> Assuming that system has 2 pCPUs by 'all interrupts' I mean interrupts
>> that were targeted to the pCPU#0 and pCPU#1 prior to doing any hotplug.
>>
>> For example, if a guest is pinned to pCPU#1 an interrupt of a device it
>> owns will be targeted to pCPU#1.
>> When pCPU#1 is turned off that interrupt will be migrated to pCPU#0.
>> pCPU#0 finalizes the suspend and receives wake-up interrupts. However, when
>> CPU#1 is turned back on that interrupt will remain targeted to the CPU#0,
>> which I assumed is wrong.
>> The scenario described here is also how I tested this.
>>
>>     Can you give the path in Xen doing that?
>>
>>
>> Sure, here is a backtrace (dumped on the CPU being turned off):
>>      0  0x2603dc arch_move_irqs(): vgic.c, line 309
>>      1  0x22ee58 sched_move_irqs()+20: schedule.c, line 303
>>      2  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
>>      3  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
>>      4  0x25aff8 __cpu_disable()+96: smpboot.c, line 386
>>      5  0x201608 take_cpu_down()+52: cpu.c, line 75
>>      6  0x23426c stopmachine_action()+188: stop_machine.c, line 159
>>      7  0x235858 do_tasklet_work()+176: tasklet.c, line 94
>>      8  0x235c80 do_tasklet()+104: tasklet.c, line 126
>>      9  0x24daec idle_loop()+144: domain.c, line 72
>>     10  0x25b1f8 start_secondary()+404: smpboot.c, line 368
>
>
>
> So this cover interrupt routed to a virtual CPU. However, this does not
> handle interrupts used by Xen. How do you handle them?
>
> For instance SMMUs IRQ might be routed to other interrupt than CPU #0.

Interrupts used by Xen should not wake-up the system and will be
disabled when we suspend the devices used by Xen.
However, I need to double check that such interrupts get enabled on
the right CPU on resume. Could you please tell me which mechanism in
Xen is used to target such an interrupt to a secondary CPU only? Is
that even possible and why would that be used?
Currently, I have no way to try out SMMU but I could try UART.

Thanks,
Mirela

>
> [...]
>
>>
>>     If the former, it would be nice to get the code you used. If the
>>     latter, then having a hack patch to test that code would be nice.
>>     Ideally, you want to plug that in the SYSCTL interface for
>>     out-of-box testing.
>>
>>
>> Ok, I have never used that but I'll try to figure it out. I may come up
>> with additional questions.
>
>
> You might want to have a look at the x86 version XEN_SYSCTL_cpu_hotplug in
> x86/systcl.c.
>
> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
  2018-04-12 13:31       ` Julien Grall
@ 2018-04-16 10:02         ` Mirela Simonovic
  2018-04-16 14:26           ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-16 10:02 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista,
	julien.grall, Xen Devel

Hi Julien,


On Thu, Apr 12, 2018 at 3:31 PM, Julien Grall <julien.grall@arm.com> wrote:
>
>
> On 12/04/18 12:33, Mirela Simonovic wrote:
>>
>> On Wed, Apr 11, 2018 at 4:46 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>
>>>>        local_irq_disable();
>>>>        cpu_is_dead = true;
>>>>        /* Make sure the write happens before we sleep forever */
>>>>        dsb(sy);
>>>>        isb();
>>>> +    /* PSCI cpu off call will return only in case of an error */
>>>> +    errno = call_psci_cpu_off();
>>>> +    printk(XENLOG_DEBUG "PSCI cpu off call failed for CPU#%d err=%d\n",
>>>> +           get_processor_id(), errno);
>>>> +    isb();
>>>
>>>
>>>
>>> What are you trying to achieve with the isb() here?
>>>
>>
>> I use to have a problem that the wfi below gets executed before the
>> call_psci_cpu_off(). Adding isb() fixed the issue. However, I tried
>> now to reproduce the problem and it doesn't show up. I still believe
>> isb() should be here, please let me know if you disagree (I obviously
>> can't prove the claim now).
>
>
> The problem you describe can't be possible with the code you have because
> call_psci_cpu_off() is issuing a SMC. SMC will lead to change exception
> level and therefore have a context-synchronization barrier.
>
> This is obviously based on the assumption you don't have an errata on your
> CPU exposing the behavior you describe. For that you would need to check
> errata notice for your CPU and/or try to reproduce.
>
> However, what you would need is a dsb(sy); isb(); to drain the write buffer
> if you print a message.
>
> Furthermore, now on platform without CPU off support (e.g non-PSCI platform
> and PSCI 0.1) you will log an error message that may worry people. In
> reality, PSCI cpu_off will unlikely fail, so you probably want to add a
> panic in call_psci_cpu_off instead.
>

Even if PSCI cpu_off call fails, what is unlikely to happen, the
system is still functional. Enabling that pCPU later will fail, but
Xen can handle this error and continue running properly on the boot
pCPU (I've tested this in 2 pCPUs config).
Therefore, I believe panic may not be necessary in this case. I
suggest that we dump the error message and continue to run. Please let
me know if you disagree.

Thanks,
Mirela

> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 0/7] xen/arm: CPU hotplug fixes
  2018-04-13 10:19           ` Mirela Simonovic
@ 2018-04-16 11:33             ` Julien Grall
  2018-04-16 14:06               ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-16 11:33 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel

Hi,

On 13/04/18 11:19, Mirela Simonovic wrote:
> On Thu, Apr 12, 2018 at 10:43 AM, Julien Grall <julien.grall@arm.com> wrote:
>>
>>
>> On 11/04/18 17:37, Mirela Simonovic wrote:
>>>
>>> Hi Julien,
>>
>>
>> Hi,
>>
>> May I ask you to configure your mail client to use > for quoting and use
>> plain text? Otherwise, this is going to be really difficult to follow the
>> discussion after few round (see already below).
>>
>>> On Wed, Apr 11, 2018 at 6:02 PM, Julien Grall <julien.grall@arm.com
>>> <mailto:julien.grall@arm.com>> wrote:
>>>
>>>      Hi,
>>>
>>>      On 11/04/18 16:58, Mirela Simonovic wrote:
>>>
>>>          On 04/11/2018 05:07 PM, Julien Grall wrote:
>>>
>>>              On 11/04/18 14:19, Mirela Simonovic wrote:
>>>
>>>          Migrating interrupts when turning off a CPU already works.
>>>          However, when a CPU is turned back on there is no interrupt
>>>          migration back to the hotplugged CPU - all interrupts will
>>>          remain routed to the CPU#0.
>>>          Patch 7/7 fixes this
>>>
>>>
>>>      What do you mean by all interrupts? Interrupts routed to guest will
>>>      always follow the vCPU. So are you sure they are going to be
>>>      migrated when that vCPU is paused/off?
>>>
>>>
>>> Just to make sure we're on the same page - this is about hotplugging
>>> physical CPUs. Hotplugging vCPUs using virtual PSCI CPU_OFF interface is
>>> already implemented and unrelated to this series.
>>
>>
>> Yes, we are on the same page :). I was just wondering what happen to
>> interrupt routed to that pCPU.
>>
>>>
>>> Assuming that system has 2 pCPUs by 'all interrupts' I mean interrupts
>>> that were targeted to the pCPU#0 and pCPU#1 prior to doing any hotplug.
>>>
>>> For example, if a guest is pinned to pCPU#1 an interrupt of a device it
>>> owns will be targeted to pCPU#1.
>>> When pCPU#1 is turned off that interrupt will be migrated to pCPU#0.
>>> pCPU#0 finalizes the suspend and receives wake-up interrupts. However, when
>>> CPU#1 is turned back on that interrupt will remain targeted to the CPU#0,
>>> which I assumed is wrong.
>>> The scenario described here is also how I tested this.
>>>
>>>      Can you give the path in Xen doing that?
>>>
>>>
>>> Sure, here is a backtrace (dumped on the CPU being turned off):
>>>       0  0x2603dc arch_move_irqs(): vgic.c, line 309
>>>       1  0x22ee58 sched_move_irqs()+20: schedule.c, line 303
>>>       2  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
>>>       3  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
>>>       4  0x25aff8 __cpu_disable()+96: smpboot.c, line 386
>>>       5  0x201608 take_cpu_down()+52: cpu.c, line 75
>>>       6  0x23426c stopmachine_action()+188: stop_machine.c, line 159
>>>       7  0x235858 do_tasklet_work()+176: tasklet.c, line 94
>>>       8  0x235c80 do_tasklet()+104: tasklet.c, line 126
>>>       9  0x24daec idle_loop()+144: domain.c, line 72
>>>      10  0x25b1f8 start_secondary()+404: smpboot.c, line 368
>>
>>
>>
>> So this cover interrupt routed to a virtual CPU. However, this does not
>> handle interrupts used by Xen. How do you handle them?
>>
>> For instance SMMUs IRQ might be routed to other interrupt than CPU #0.
> 
> Interrupts used by Xen should not wake-up the system and will be
> disabled when we suspend the devices used by Xen.
Here you only speak about the suspend use case. While I understand your 
ultimate goal is suspend/resume, this series is about CPU hotplug.

IHMO, the suspend/resume case is no more than a superset of CPU up/down. 
If you solve the problem for up/down, likely you are going to solve it 
for suspend/resume.

So, what would happen to interrupts routed to the CPU going offline?

> However, I need to double check that such interrupts get enabled on
> the right CPU on resume. Could you please tell me which mechanism in
> Xen is used to target such an interrupt to a secondary CPU only? Is
> that even possible and why would that be used?

SPIs will be routed to the CPU calling setup_irq. It may not always be 
CPU#0. For instance, this is the case context interrupt for the SMMU 
because they are setup when the device is assigned.

I guess this decision is arguable. If you move all the interrupts to 
CPU#0 it will potentially disrupt vCPU running on it. I am thinking in 
the case of SMMU fault that could be triggered easily by another domain.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU
  2018-04-13 10:11     ` Mirela Simonovic
@ 2018-04-16 12:38       ` Dario Faggioli
  0 siblings, 0 replies; 61+ messages in thread
From: Dario Faggioli @ 2018-04-16 12:38 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista,
	julien.grall, Xen Devel


[-- Attachment #1.1: Type: text/plain, Size: 1509 bytes --]

On Fri, 2018-04-13 at 12:11 +0200, Mirela Simonovic wrote:
> Hi Dario,
> 
Hi,

> On Thu, Apr 12, 2018 at 6:49 PM, Dario Faggioli <dfaggioli@suse.com>
> wrote:
> No worries, my commit message should be more understandable and your
> answer helps me identify what's unclear. I'll try to explain below.
> 
> This is about suspend/resume implementation for ARM that is based on
> PSCI and targeted for embedded systems as well. Each guest could have
> its own wake-up devices/interrupts (passthrough) that could trigger
> the resume. So the wake-up interrupt in this context triggers the
> resume. By 'all interrupts' I meant interrupts that are left enabled
> by guests upon suspend (those interrupts could wake-up the system).
> Here is the PSCI-based suspend to RAM design spec:
> https://lists.xenproject.org/archives/html/xen-devel/2017-12/msg01574
> .html
> 
Ah, so you are talking about suspend/resume of _the_guest_! This was
actually the bit of information that I was missing. :-)

This was, probbly, partly due to me not being familiar with ARM, and to
the fact that I haven't really looked at the rest of the series
probably.

Still, I think it would not hurt for the changelog to be a bit more
explitic and clear.

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

[-- Attachment #1.2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-12 21:31         ` Stefano Stabellini
@ 2018-04-16 13:14           ` Julien Grall
  2018-04-16 13:41             ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-16 13:14 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: edgar.iglesias, dm, Andrew Cooper, xen-devel, Jan Beulich,
	Mirela Simonovic

Hi,

On 12/04/18 22:31, Stefano Stabellini wrote:
> On Thu, 12 Apr 2018, Julien Grall wrote:
>> On 12/04/18 00:46, Stefano Stabellini wrote:
>>> On Wed, 11 Apr 2018, Julien Grall wrote:
>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>> Freeing percpu area is done when a non-boot CPU is disabled upon
>>>>> suspend.
>>>>> This use to be scheduled for execution after a period of time, what
>>>>> caused
>>>>> the following racing issues. If CPU is enabled after it is disabled and
>>>>> before the freeing of percpu area is performed, Xen would crash upon
>>>>> initializing percpu area because per cpu offset is not marked as
>>>>> INVALID_PERCPU_AREA (this suppose to happen when cpu area is freed).
>>>>> To resolve the racing issue, free percpu area right away instead
>>>>> scheduling it for later.
>>>>
>>>> The reason of using the RCU is you want to make sure that none of the
>>>> other
>>>> CPUs will access that percpu data before freeing it. So I don't think this
>>>> patch is valid.
>>>>
>>>> It looks like to me a rcu barrier is missing after calling cpu_down
>>>> somewhere
>>>> in the CPU off path. I am not entirely sure where.
>>>
>>> We need a rcu_barrier(). Perhaps, it could be added on cpu_on before
>>> initializing the percpu area?
>>
>> Do you mind giving a bit more details on your thought? cpu_up looks a strange
>> place as no one should access the percpu area after the CPU is down. So it
>> feels the rcu_barrier should be somewhere before PSCI_cpu_off is called.
> 
> Yes, it feels strange to do it on cpu_on, it would be more obvious on
> cpu_off, but we don't actually need to _free_percpu_area on cpu_off,
> right? We only need to make sure it is done before cpu_percpu_callback
> is called on cpu_on.
> 
> My suggestion would be to evaluate if it is possible to introduce the
> rcu_barrier() on the resume path before cpu_percpu_callback, maybe in
> start_secondary.

Well, cpu_percpu_callback is not called by start_secondary. It is called 
when preparing the CPU from another CPU. So anything in start_secondary 
will not work.

> 
> I was also looking at xen/arch/x86/acpi/power.c:enter_state and noticed
> that they chose to call rcu_barrier() on enable_cpu before
> enable_nonboot_cpus().

I guess the rcu_barrier() in the function handling suspend/resume works. 
But that doesn't cover the hotplug case. Looking at x86, suspend/resume 
case. For the hotplug case, there are an rcu_barrier in 
cpu_{up,down}_helper but they are only present in the case of 
cpu_{up,down} failed. I am not entirely sure how this is handled in x86

Andrew, Jan, do you know when the percpu will be free on hotplug? It is 
call to call_rcu(...) but I am not sure when this is going to be executed.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-16 13:14           ` Julien Grall
@ 2018-04-16 13:41             ` Mirela Simonovic
  2018-04-16 15:21               ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-16 13:41 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista,
	Andrew Cooper, Xen Devel, Jan Beulich

Hi Julien, Stefano,

Thanks for the feedback and suggestions.

On Mon, Apr 16, 2018 at 3:14 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi,
>
>
> On 12/04/18 22:31, Stefano Stabellini wrote:
>>
>> On Thu, 12 Apr 2018, Julien Grall wrote:
>>>
>>> On 12/04/18 00:46, Stefano Stabellini wrote:
>>>>
>>>> On Wed, 11 Apr 2018, Julien Grall wrote:
>>>>>
>>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>>>
>>>>>> Freeing percpu area is done when a non-boot CPU is disabled upon
>>>>>> suspend.
>>>>>> This use to be scheduled for execution after a period of time, what
>>>>>> caused
>>>>>> the following racing issues. If CPU is enabled after it is disabled
>>>>>> and
>>>>>> before the freeing of percpu area is performed, Xen would crash upon
>>>>>> initializing percpu area because per cpu offset is not marked as
>>>>>> INVALID_PERCPU_AREA (this suppose to happen when cpu area is freed).
>>>>>> To resolve the racing issue, free percpu area right away instead
>>>>>> scheduling it for later.
>>>>>
>>>>>
>>>>> The reason of using the RCU is you want to make sure that none of the
>>>>> other
>>>>> CPUs will access that percpu data before freeing it. So I don't think
>>>>> this
>>>>> patch is valid.
>>>>>
>>>>> It looks like to me a rcu barrier is missing after calling cpu_down
>>>>> somewhere
>>>>> in the CPU off path. I am not entirely sure where.
>>>>
>>>>
>>>> We need a rcu_barrier(). Perhaps, it could be added on cpu_on before
>>>> initializing the percpu area?
>>>
>>>
>>> Do you mind giving a bit more details on your thought? cpu_up looks a
>>> strange
>>> place as no one should access the percpu area after the CPU is down. So
>>> it
>>> feels the rcu_barrier should be somewhere before PSCI_cpu_off is called.
>>
>>
>> Yes, it feels strange to do it on cpu_on, it would be more obvious on
>> cpu_off, but we don't actually need to _free_percpu_area on cpu_off,
>> right? We only need to make sure it is done before cpu_percpu_callback
>> is called on cpu_on.
>>
>> My suggestion would be to evaluate if it is possible to introduce the
>> rcu_barrier() on the resume path before cpu_percpu_callback, maybe in
>> start_secondary.
>
>
> Well, cpu_percpu_callback is not called by start_secondary. It is called
> when preparing the CPU from another CPU. So anything in start_secondary will
> not work.
>

I have also confirmed that in start_secondary it doesn't work, it's too late.

>>
>> I was also looking at xen/arch/x86/acpi/power.c:enter_state and noticed
>> that they chose to call rcu_barrier() on enable_cpu before
>> enable_nonboot_cpus().
>

Before the enable_nonboot_cpus() gets invoked seems to be a good place
for rcu_barrier(), as it's done for x86.

>
> I guess the rcu_barrier() in the function handling suspend/resume works. But
> that doesn't cover the hotplug case. Looking at x86, suspend/resume case.
> For the hotplug case, there are an rcu_barrier in cpu_{up,down}_helper but
> they are only present in the case of cpu_{up,down} failed. I am not entirely
> sure how this is handled in x86
>
> Andrew, Jan, do you know when the percpu will be free on hotplug? It is call
> to call_rcu(...) but I am not sure when this is going to be executed.
>

AFAIK disable/enable_nonboot_cpus() is the only way to do the hotplug
and rcu_barrier() is not included in the flow.
I suggest to invoke rcu_barrier() before enable_nonboot_cpus() and
eventually this could be moved into enable_nonboot_cpus() itself.

Thanks,
Mirela

> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 0/7] xen/arm: CPU hotplug fixes
  2018-04-16 11:33             ` Julien Grall
@ 2018-04-16 14:06               ` Mirela Simonovic
  2018-04-16 15:05                 ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-16 14:06 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel

Hi Julien,

On Mon, Apr 16, 2018 at 1:33 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi,
>
>
> On 13/04/18 11:19, Mirela Simonovic wrote:
>>
>> On Thu, Apr 12, 2018 at 10:43 AM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>>
>>>
>>> On 11/04/18 17:37, Mirela Simonovic wrote:
>>>>
>>>>
>>>> Hi Julien,
>>>
>>>
>>>
>>> Hi,
>>>
>>> May I ask you to configure your mail client to use > for quoting and use
>>> plain text? Otherwise, this is going to be really difficult to follow the
>>> discussion after few round (see already below).
>>>
>>>> On Wed, Apr 11, 2018 at 6:02 PM, Julien Grall <julien.grall@arm.com
>>>> <mailto:julien.grall@arm.com>> wrote:
>>>>
>>>>      Hi,
>>>>
>>>>      On 11/04/18 16:58, Mirela Simonovic wrote:
>>>>
>>>>          On 04/11/2018 05:07 PM, Julien Grall wrote:
>>>>
>>>>              On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>
>>>>          Migrating interrupts when turning off a CPU already works.
>>>>          However, when a CPU is turned back on there is no interrupt
>>>>          migration back to the hotplugged CPU - all interrupts will
>>>>          remain routed to the CPU#0.
>>>>          Patch 7/7 fixes this
>>>>
>>>>
>>>>      What do you mean by all interrupts? Interrupts routed to guest will
>>>>      always follow the vCPU. So are you sure they are going to be
>>>>      migrated when that vCPU is paused/off?
>>>>
>>>>
>>>> Just to make sure we're on the same page - this is about hotplugging
>>>> physical CPUs. Hotplugging vCPUs using virtual PSCI CPU_OFF interface is
>>>> already implemented and unrelated to this series.
>>>
>>>
>>>
>>> Yes, we are on the same page :). I was just wondering what happen to
>>> interrupt routed to that pCPU.
>>>
>>>>
>>>> Assuming that system has 2 pCPUs by 'all interrupts' I mean interrupts
>>>> that were targeted to the pCPU#0 and pCPU#1 prior to doing any hotplug.
>>>>
>>>> For example, if a guest is pinned to pCPU#1 an interrupt of a device it
>>>> owns will be targeted to pCPU#1.
>>>> When pCPU#1 is turned off that interrupt will be migrated to pCPU#0.
>>>> pCPU#0 finalizes the suspend and receives wake-up interrupts. However,
>>>> when
>>>> CPU#1 is turned back on that interrupt will remain targeted to the
>>>> CPU#0,
>>>> which I assumed is wrong.
>>>> The scenario described here is also how I tested this.
>>>>
>>>>      Can you give the path in Xen doing that?
>>>>
>>>>
>>>> Sure, here is a backtrace (dumped on the CPU being turned off):
>>>>       0  0x2603dc arch_move_irqs(): vgic.c, line 309
>>>>       1  0x22ee58 sched_move_irqs()+20: schedule.c, line 303
>>>>       2  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
>>>>       3  0x2318e8 cpu_disable_scheduler()+1000: schedule.c, line 586
>>>>       4  0x25aff8 __cpu_disable()+96: smpboot.c, line 386
>>>>       5  0x201608 take_cpu_down()+52: cpu.c, line 75
>>>>       6  0x23426c stopmachine_action()+188: stop_machine.c, line 159
>>>>       7  0x235858 do_tasklet_work()+176: tasklet.c, line 94
>>>>       8  0x235c80 do_tasklet()+104: tasklet.c, line 126
>>>>       9  0x24daec idle_loop()+144: domain.c, line 72
>>>>      10  0x25b1f8 start_secondary()+404: smpboot.c, line 368
>>>
>>>
>>>
>>>
>>> So this cover interrupt routed to a virtual CPU. However, this does not
>>> handle interrupts used by Xen. How do you handle them?
>>>
>>> For instance SMMUs IRQ might be routed to other interrupt than CPU #0.
>>
>>
>> Interrupts used by Xen should not wake-up the system and will be
>> disabled when we suspend the devices used by Xen.
>
> Here you only speak about the suspend use case. While I understand your
> ultimate goal is suspend/resume, this series is about CPU hotplug.
>

AFAIK, the only way and occasion to hotplug a CPU is using
disable/enable_nonboot_cpus() within the Xen suspend/resume procedure.
We are implementing CPU hotplug only to enable Xen suspend/resume.
This is how it is also done for x86 and we wanted to implement the
equivalent behavior for ARM.
If the cover-letter is misleading please let me know what would be
more appropriate title.

However, I absolutely agree that the interrupt routing before and
after the hotplug has to be the same.

> IHMO, the suspend/resume case is no more than a superset of CPU up/down. If
> you solve the problem for up/down, likely you are going to solve it for
> suspend/resume.
>
> So, what would happen to interrupts routed to the CPU going offline?
>
>> However, I need to double check that such interrupts get enabled on
>> the right CPU on resume. Could you please tell me which mechanism in
>> Xen is used to target such an interrupt to a secondary CPU only? Is
>> that even possible and why would that be used?
>
>
> SPIs will be routed to the CPU calling setup_irq. It may not always be
> CPU#0. For instance, this is the case context interrupt for the SMMU because
> they are setup when the device is assigned.
>
> I guess this decision is arguable. If you move all the interrupts to CPU#0
> it will potentially disrupt vCPU running on it. I am thinking in the case of
> SMMU fault that could be triggered easily by another domain.
>

I'll come back to this, need to do some research/debugging to better
understand what's going on.

Thanks,
Mirela

> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
  2018-04-16 10:02         ` Mirela Simonovic
@ 2018-04-16 14:26           ` Julien Grall
  2018-04-16 16:52             ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-16 14:26 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista,
	julien.grall, Xen Devel

Hi Mirela,

On 16/04/18 11:02, Mirela Simonovic wrote:
> On Thu, Apr 12, 2018 at 3:31 PM, Julien Grall <julien.grall@arm.com> wrote:
>>
>>
>> On 12/04/18 12:33, Mirela Simonovic wrote:
>>>
>>> On Wed, Apr 11, 2018 at 4:46 PM, Julien Grall <julien.grall@arm.com>
>>> wrote:
>>>>
>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>
>>>>>         local_irq_disable();
>>>>>         cpu_is_dead = true;
>>>>>         /* Make sure the write happens before we sleep forever */
>>>>>         dsb(sy);
>>>>>         isb();
>>>>> +    /* PSCI cpu off call will return only in case of an error */
>>>>> +    errno = call_psci_cpu_off();
>>>>> +    printk(XENLOG_DEBUG "PSCI cpu off call failed for CPU#%d err=%d\n",
>>>>> +           get_processor_id(), errno);
>>>>> +    isb();
>>>>
>>>>
>>>>
>>>> What are you trying to achieve with the isb() here?
>>>>
>>>
>>> I use to have a problem that the wfi below gets executed before the
>>> call_psci_cpu_off(). Adding isb() fixed the issue. However, I tried
>>> now to reproduce the problem and it doesn't show up. I still believe
>>> isb() should be here, please let me know if you disagree (I obviously
>>> can't prove the claim now).
>>
>>
>> The problem you describe can't be possible with the code you have because
>> call_psci_cpu_off() is issuing a SMC. SMC will lead to change exception
>> level and therefore have a context-synchronization barrier.
>>
>> This is obviously based on the assumption you don't have an errata on your
>> CPU exposing the behavior you describe. For that you would need to check
>> errata notice for your CPU and/or try to reproduce.
>>
>> However, what you would need is a dsb(sy); isb(); to drain the write buffer
>> if you print a message.
>>
>> Furthermore, now on platform without CPU off support (e.g non-PSCI platform
>> and PSCI 0.1) you will log an error message that may worry people. In
>> reality, PSCI cpu_off will unlikely fail, so you probably want to add a
>> panic in call_psci_cpu_off instead.
>>
> 
> Even if PSCI cpu_off call fails, what is unlikely to happen, the
> system is still functional.

I disagree here, if you are unable to turn off a CPU via PSCI then 
something is definitely wrong. This means that CPU will forever spin in 
Xen code with no way to exit. This could bring interesting issue with 
anything potentially modifying Xen code (i.e livepatching).

IHMO, the forever sleep in stop_cpu() is just a temporary solution to 
cater shutdown of the platform. The state of secondary CPU does not much 
matter at that time. In case of suspend/resume you want really want to 
be able to turn off those CPUs correctly otherwise they are not going to 
come up again.

> Enabling that pCPU later will fail, but
> Xen can handle this error and continue running properly on the boot
> pCPU (I've tested this in 2 pCPUs config).

I don't consider that as xen running properly. You lost a pCPU so your 
workload is completely different. Imagine you are using the NULL 
scheduler (e.g only one vCPU is pinned to a specific pCPU), what are you 
going to do with the vCPU?

> Therefore, I believe panic may not be necessary in this case. I
> suggest that we dump the error message and continue to run. Please let
> me know if you disagree.

This is a bad idea, a failure should at least be logged to show 
something gone wrong.

PSCI CPU off will, as you said, unlikely failed. Looking at the spec, 
the only possible reason is your are trying to turn off a CPU where 
Trusted OS is resident. This means something far more wrong is happening 
in Xen code and I don't think it would be safe to continue to run.

Hence why I suggested a BUG_ON/panic because this is something that is 
not meant to happen.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 0/7] xen/arm: CPU hotplug fixes
  2018-04-16 14:06               ` Mirela Simonovic
@ 2018-04-16 15:05                 ` Julien Grall
  0 siblings, 0 replies; 61+ messages in thread
From: Julien Grall @ 2018-04-16 15:05 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel

Hi Mirela,

On 16/04/18 15:06, Mirela Simonovic wrote:
> On Mon, Apr 16, 2018 at 1:33 PM, Julien Grall <julien.grall@arm.com> wrote:
>> On 13/04/18 11:19, Mirela Simonovic wrote:
>>> On Thu, Apr 12, 2018 at 10:43 AM, Julien Grall <julien.grall@arm.com>
>>>> On 11/04/18 17:37, Mirela Simonovic wrote:
>>>>> On Wed, Apr 11, 2018 at 6:02 PM, Julien Grall <julien.grall@arm.com
>>>>> <mailto:julien.grall@arm.com>> wrote:
>>>> So this cover interrupt routed to a virtual CPU. However, this does not
>>>> handle interrupts used by Xen. How do you handle them?
>>>>
>>>> For instance SMMUs IRQ might be routed to other interrupt than CPU #0.
>>>
>>>
>>> Interrupts used by Xen should not wake-up the system and will be
>>> disabled when we suspend the devices used by Xen.
>>
>> Here you only speak about the suspend use case. While I understand your
>> ultimate goal is suspend/resume, this series is about CPU hotplug.
>>
> 
> AFAIK, the only way and occasion to hotplug a CPU is using > disable/enable_nonboot_cpus() within the Xen suspend/resume procedure.

As I mention in a previous e-mail, suspend/resume is not the only way to 
hotplug a CPU. There are an interface (see XEN_SYSCTL_cpu_hotplug) to 
allow that from the userland.

But I agree that it is not implemented on Arm, so suspend/resume would 
be the only way to so far. To be clear, I am not asking you to implement 
XEN_SYSCTL_cpu_hotplug. Althougth it should be easy to do it and easy 
for testing CPU offline/online.

> We are implementing CPU hotplug only to enable Xen suspend/resume.
 >
> This is how it is also done for x86 and we wanted to implement the
> equivalent behavior for ARM.
> If the cover-letter is misleading please let me know what would be
> more appropriate title.

We have multiple places in Xen that could use cpu_up/cpu_down helpers. 
 From the cover letter, it is quite unclear that you are only looking 
after suspend/resume.

The SYSCTL interface is one of the other use case. While most of the 
code can be shared, some of them may not make sense.

For instance, in the suspend/resume case routing the interrupt assigned 
to Xen to CPU#0 may not be necessary because all CPUs should come back 
later on. However, this should be necessary when offlining a vCPU with 
SYSCTL interface.

That's why I asked it and I think it should be clarified in the cover 
letter.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-16 13:41             ` Mirela Simonovic
@ 2018-04-16 15:21               ` Julien Grall
  2018-04-17 10:52                 ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-16 15:21 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista,
	Andrew Cooper, Xen Devel, Jan Beulich



On 16/04/18 14:41, Mirela Simonovic wrote:
> On Mon, Apr 16, 2018 at 3:14 PM, Julien Grall <julien.grall@arm.com> wrote:
>> On 12/04/18 22:31, Stefano Stabellini wrote:
>>> On Thu, 12 Apr 2018, Julien Grall wrote:
>>>> On 12/04/18 00:46, Stefano Stabellini wrote:
>>>>> On Wed, 11 Apr 2018, Julien Grall wrote:
>>>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>> I guess the rcu_barrier() in the function handling suspend/resume works. But
>> that doesn't cover the hotplug case. Looking at x86, suspend/resume case.
>> For the hotplug case, there are an rcu_barrier in cpu_{up,down}_helper but
>> they are only present in the case of cpu_{up,down} failed. I am not entirely
>> sure how this is handled in x86
>>
>> Andrew, Jan, do you know when the percpu will be free on hotplug? It is call
>> to call_rcu(...) but I am not sure when this is going to be executed.
>>
> 
> AFAIK disable/enable_nonboot_cpus() is the only way to do the hotplug
> and rcu_barrier() is not included in the flow.

That's not the only way. I clearly specified one in my previous answer 
(see cpu_{up,down}_helper) and there are other place (look for cpu_up).

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
  2018-04-16 14:26           ` Julien Grall
@ 2018-04-16 16:52             ` Mirela Simonovic
  2018-04-16 17:02               ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-16 16:52 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel

Hi Julien,

On Mon, Apr 16, 2018 at 4:26 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi Mirela,
>
>
> On 16/04/18 11:02, Mirela Simonovic wrote:
>>
>> On Thu, Apr 12, 2018 at 3:31 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>>
>>>
>>> On 12/04/18 12:33, Mirela Simonovic wrote:
>>>>
>>>>
>>>> On Wed, Apr 11, 2018 at 4:46 PM, Julien Grall <julien.grall@arm.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>>
>>>>>>         local_irq_disable();
>>>>>>         cpu_is_dead = true;
>>>>>>         /* Make sure the write happens before we sleep forever */
>>>>>>         dsb(sy);
>>>>>>         isb();
>>>>>> +    /* PSCI cpu off call will return only in case of an error */
>>>>>> +    errno = call_psci_cpu_off();
>>>>>> +    printk(XENLOG_DEBUG "PSCI cpu off call failed for CPU#%d
>>>>>> err=%d\n",
>>>>>> +           get_processor_id(), errno);
>>>>>> +    isb();
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> What are you trying to achieve with the isb() here?
>>>>>
>>>>
>>>> I use to have a problem that the wfi below gets executed before the
>>>> call_psci_cpu_off(). Adding isb() fixed the issue. However, I tried
>>>> now to reproduce the problem and it doesn't show up. I still believe
>>>> isb() should be here, please let me know if you disagree (I obviously
>>>> can't prove the claim now).
>>>
>>>
>>>
>>> The problem you describe can't be possible with the code you have because
>>> call_psci_cpu_off() is issuing a SMC. SMC will lead to change exception
>>> level and therefore have a context-synchronization barrier.
>>>
>>> This is obviously based on the assumption you don't have an errata on
>>> your
>>> CPU exposing the behavior you describe. For that you would need to check
>>> errata notice for your CPU and/or try to reproduce.
>>>
>>> However, what you would need is a dsb(sy); isb(); to drain the write
>>> buffer
>>> if you print a message.
>>>
>>> Furthermore, now on platform without CPU off support (e.g non-PSCI
>>> platform
>>> and PSCI 0.1) you will log an error message that may worry people. In
>>> reality, PSCI cpu_off will unlikely fail, so you probably want to add a
>>> panic in call_psci_cpu_off instead.
>>>
>>
>> Even if PSCI cpu_off call fails, what is unlikely to happen, the
>> system is still functional.
>
>
> I disagree here, if you are unable to turn off a CPU via PSCI then something
> is definitely wrong. This means that CPU will forever spin in Xen code with
> no way to exit. This could bring interesting issue with anything potentially
> modifying Xen code (i.e livepatching).
>
> IHMO, the forever sleep in stop_cpu() is just a temporary solution to cater
> shutdown of the platform. The state of secondary CPU does not much matter at
> that time. In case of suspend/resume you want really want to be able to turn
> off those CPUs correctly otherwise they are not going to come up again.
>

If we follow that logic the CPU will not be able to exit WFI state
either. So we should raise panic in that case as well and in cases
where the system is suspending + the CPU is stopped + cpu off doesn't
work so the CPU cannot be enabled again.
However, raising panic makes no sense for shutdown scenario.
How about we do it something like this:

void stop_cpu(void)
{
    ...
    if ( system_state == SYS_STATE_suspend )
        call_psci_cpu_off();

    while ( 1 )
        wfi();
}

void call_psci_cpu_off(void)
{
    int errno;

    /* If successfull the cpu_off call doesn't return */
    errno = call_smc(PSCI_0_2_FN32_CPU_OFF, 0, 0, 0);
    if ( errno )
        panic("PSCI cpu off failed for CPU%d err=%d\n", get_processor_id(),
                errno);
}

call_psci_cpu_off should not check for PSCI version because we need to
panic regardless.

Thanks,
Mirela


>> Enabling that pCPU later will fail, but
>> Xen can handle this error and continue running properly on the boot
>> pCPU (I've tested this in 2 pCPUs config).
>
>
> I don't consider that as xen running properly. You lost a pCPU so your
> workload is completely different. Imagine you are using the NULL scheduler
> (e.g only one vCPU is pinned to a specific pCPU), what are you going to do
> with the vCPU?
>
>> Therefore, I believe panic may not be necessary in this case. I
>> suggest that we dump the error message and continue to run. Please let
>> me know if you disagree.
>
>
> This is a bad idea, a failure should at least be logged to show something
> gone wrong.
>
> PSCI CPU off will, as you said, unlikely failed. Looking at the spec, the
> only possible reason is your are trying to turn off a CPU where Trusted OS
> is resident. This means something far more wrong is happening in Xen code
> and I don't think it would be safe to continue to run.
>
> Hence why I suggested a BUG_ON/panic because this is something that is not
> meant to happen.
>
> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface)
  2018-04-16 16:52             ` Mirela Simonovic
@ 2018-04-16 17:02               ` Julien Grall
  0 siblings, 0 replies; 61+ messages in thread
From: Julien Grall @ 2018-04-16 17:02 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel



On 16/04/18 17:52, Mirela Simonovic wrote:
> On Mon, Apr 16, 2018 at 4:26 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hi Mirela,
>>
>>
>> On 16/04/18 11:02, Mirela Simonovic wrote:
>>>
>>> On Thu, Apr 12, 2018 at 3:31 PM, Julien Grall <julien.grall@arm.com>
>>> wrote:
>>>> On 12/04/18 12:33, Mirela Simonovic wrote:
>>>>> On Wed, Apr 11, 2018 at 4:46 PM, Julien Grall <julien.grall@arm.com>
>>>>> wrote:
>>>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>
>> I disagree here, if you are unable to turn off a CPU via PSCI then something
>> is definitely wrong. This means that CPU will forever spin in Xen code with
>> no way to exit. This could bring interesting issue with anything potentially
>> modifying Xen code (i.e livepatching).
>>
>> IHMO, the forever sleep in stop_cpu() is just a temporary solution to cater
>> shutdown of the platform. The state of secondary CPU does not much matter at
>> that time. In case of suspend/resume you want really want to be able to turn
>> off those CPUs correctly otherwise they are not going to come up again.
>>
> 
> If we follow that logic the CPU will not be able to exit WFI state
> either. So we should raise panic in that case as well and in cases
> where the system is suspending + the CPU is stopped + cpu off doesn't
> work so the CPU cannot be enabled again.

In case of suspend/resume you can check before hand whether we cpu_off 
is implemented for that platform. If not, you deny suspend.

This will avoid people using suspend/resume on those platforms.

> However, raising panic makes no sense for shutdown scenario.

I agree and that's why I suggested to move the panic in one of my 
previous e-mail:

"Furthermore, now on platform without CPU off support (e.g non-PSCI 
platform and PSCI 0.1) you will log an error message that may worry 
people. In reality, PSCI cpu_off will unlikely fail, so you probably 
want to add a panic in call_psci_cpu_off instead."

> How about we do it something like this:
> 
> void stop_cpu(void)
> {
>      ...
>      if ( system_state == SYS_STATE_suspend )
>          call_psci_cpu_off();
> 
>      while ( 1 )
>          wfi();
> }
> 
> void call_psci_cpu_off(void)
> {
>      int errno;
> 
>      /* If successfull the cpu_off call doesn't return */
>      errno = call_smc(PSCI_0_2_FN32_CPU_OFF, 0, 0, 0);
>      if ( errno )
>          panic("PSCI cpu off failed for CPU%d err=%d\n", get_processor_id(),
>                  errno);
> }
> 
> call_psci_cpu_off should not check for PSCI version because we need to
> panic regardless.

Yes. There are no need for the check because it will never return on 
success.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-16 15:21               ` Julien Grall
@ 2018-04-17 10:52                 ` Mirela Simonovic
  2018-04-17 11:02                   ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-17 10:52 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista,
	Andrew Cooper, Xen Devel, Jan Beulich

Hi Julien,

On Mon, Apr 16, 2018 at 5:21 PM, Julien Grall <julien.grall@arm.com> wrote:
>
>
> On 16/04/18 14:41, Mirela Simonovic wrote:
>>
>> On Mon, Apr 16, 2018 at 3:14 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>> On 12/04/18 22:31, Stefano Stabellini wrote:
>>>>
>>>> On Thu, 12 Apr 2018, Julien Grall wrote:
>>>>>
>>>>> On 12/04/18 00:46, Stefano Stabellini wrote:
>>>>>>
>>>>>> On Wed, 11 Apr 2018, Julien Grall wrote:
>>>>>>>
>>>>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>
>>> I guess the rcu_barrier() in the function handling suspend/resume works.
>>> But
>>> that doesn't cover the hotplug case. Looking at x86, suspend/resume case.
>>> For the hotplug case, there are an rcu_barrier in cpu_{up,down}_helper
>>> but
>>> they are only present in the case of cpu_{up,down} failed. I am not
>>> entirely
>>> sure how this is handled in x86
>>>
>>> Andrew, Jan, do you know when the percpu will be free on hotplug? It is
>>> call
>>> to call_rcu(...) but I am not sure when this is going to be executed.
>>>
>>
>> AFAIK disable/enable_nonboot_cpus() is the only way to do the hotplug
>> and rcu_barrier() is not included in the flow.
>
>
> That's not the only way. I clearly specified one in my previous answer (see
> cpu_{up,down}_helper) and there are other place (look for cpu_up).
>

I've looked at cpu_{up,down}_helper and cpu_up and I'm convinced now
that adding rcu_barrier() prior to calling enable_nonboot_cpus() is
the right approch.

cpu_{up,down}_helper functions exist only for x86. cpu_up_helper()
does call rcu_barrier() prior to calling cpu_up().
So calling rcu_barrier() is expected to be done prior to calling
cpu_up() (or enable_nonboot_cpus(), which is just a wrapper for
cpu_up()).

I believe this is right way to do because cpu_up() is used for
enabling non-boot CPUs in both boot and suspend/hotplug scenarios,
while rcu_barrier() is not required in boot scenario.
Therefore, I'll add rcu_barrier() prior to calling
enable_nonboot_cpus(). If I missed something please let me know.

Thanks,
Mirela

> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-17 10:52                 ` Mirela Simonovic
@ 2018-04-17 11:02                   ` Julien Grall
  2018-04-18 22:52                     ` Stefano Stabellini
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-17 11:02 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista,
	Andrew Cooper, Xen Devel, Jan Beulich



On 17/04/18 11:52, Mirela Simonovic wrote:
> Hi Julien,

Hi Mirela,

> On Mon, Apr 16, 2018 at 5:21 PM, Julien Grall <julien.grall@arm.com> wrote:
>>
>>
>> On 16/04/18 14:41, Mirela Simonovic wrote:
>>>
>>> On Mon, Apr 16, 2018 at 3:14 PM, Julien Grall <julien.grall@arm.com>
>>> wrote:
>>>>
>>>> On 12/04/18 22:31, Stefano Stabellini wrote:
>>>>>
>>>>> On Thu, 12 Apr 2018, Julien Grall wrote:
>>>>>>
>>>>>> On 12/04/18 00:46, Stefano Stabellini wrote:
>>>>>>>
>>>>>>> On Wed, 11 Apr 2018, Julien Grall wrote:
>>>>>>>>
>>>>>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>
>>>> I guess the rcu_barrier() in the function handling suspend/resume works.
>>>> But
>>>> that doesn't cover the hotplug case. Looking at x86, suspend/resume case.
>>>> For the hotplug case, there are an rcu_barrier in cpu_{up,down}_helper
>>>> but
>>>> they are only present in the case of cpu_{up,down} failed. I am not
>>>> entirely
>>>> sure how this is handled in x86
>>>>
>>>> Andrew, Jan, do you know when the percpu will be free on hotplug? It is
>>>> call
>>>> to call_rcu(...) but I am not sure when this is going to be executed.
>>>>
>>>
>>> AFAIK disable/enable_nonboot_cpus() is the only way to do the hotplug
>>> and rcu_barrier() is not included in the flow.
>>
>>
>> That's not the only way. I clearly specified one in my previous answer (see
>> cpu_{up,down}_helper) and there are other place (look for cpu_up).
>>
> 
> I've looked at cpu_{up,down}_helper and cpu_up and I'm convinced now
> that adding rcu_barrier() prior to calling enable_nonboot_cpus() is
> the right approch.
> 
> cpu_{up,down}_helper functions exist only for x86.

They have nothing very x86 specific AFAICT so they could potentially be 
used for Arm when XEN_SYSCTL_hotplug will be implemented.

> cpu_up_helper()
> does call rcu_barrier() prior to calling cpu_up().

That's not true. Below the code for cpu_up_helper():

     int ret = cpu_up(cpu); <- First call
     if ( ret == -EBUSY )
     {
         rcu_barrier();	   <- RCU barrier
         ret = cpu_up(cpu); <- Second call
     }
     return ret;

So the rcu_barrier is called after cpu_up() in case it returns -EBUSY.

> So calling rcu_barrier() is expected to be done prior to calling
> cpu_up() (or enable_nonboot_cpus(), which is just a wrapper for
> cpu_up()).
> 
> I believe this is right way to do because cpu_up() is used for
> enabling non-boot CPUs in both boot and suspend/hotplug scenarios,
> while rcu_barrier() is not required in boot scenario.
> Therefore, I'll add rcu_barrier() prior to calling
> enable_nonboot_cpus(). If I missed something please let me know.

See above, this is exactly why I asked Andrew & Jan input on how rcu 
work is flushed when using cpu_up_helper/cpu_down_helper. Because I 
don't understand if it is meant to work.

So I would like to see whether it would make sense to put the 
rcu_barrier() somewhere else to cover every call of cpu_up().

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
  2018-04-11 15:11   ` Julien Grall
@ 2018-04-17 12:54     ` Mirela Simonovic
  2018-04-17 14:11       ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-17 12:54 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel

Hi Julien,


On Wed, Apr 11, 2018 at 5:11 PM, Julien Grall <julien.grall@arm.com> wrote:
> Hi,
>
> On 11/04/18 14:19, Mirela Simonovic wrote:
>>
>> In existing code the paging for secondary CPUs is setup only in boot flow.
>> The setup is triggered from start_xen function after all CPUs are brought
>> online. In other words, the initialization of VTCR_EL2 register is done
>> out of the cpu_up/start_secondary control flow. However, the cpu_up flow
>> should be self-contained - it should fully initialize a secondary CPU,
>> because the cpu_up is used not only to bring a secondary CPU online on
>> boot, but also to hotplug a CPU during the system resume.
>> With this patch the setting of paging is triggered from start_secondary
>> function if the current system state is not boot. This way, the paging
>> will be setup in non-boot scenarios, while the setup in boot scenario
>> remains unchanged.
>
>
> I am afraid that this is not correct. You can't assume that value chosen for
> VTCR by Xen at boot will fit this new CPU. So you have to check it is fine
> or park the CPU if there are any issue.
>

This is not a new CPU. This CPU already went through its boot sequence
and it reached the resume point because it does fit the value chosen
for VTCR by Xen.
If it wouldn't fit the chosen value for VTCR it would be parked so it
wouldn't participate in suspend/resume. Please let me know if I
misunderstood your comment.

AFAIU the value chosen by Xen for VTCR config has to be common for all
online CPUs. Since this value is also used in the resume path I
suggest to make global (static in the p2m.c) the 'val' variable which
is currently local in setup_virt_paging() and passed as argument to
setup_virt_paging_one(). Then setup_virt_paging_one() would not
receive an argument.
I need to access this value on resume, so I would call
setup_virt_paging_one() without argument from start_secondary() if the
system state is not boot.
This seems to me a bit cleaner compared to what I submitted in this
patch, but fundamentally the functionality is the same.

Thanks,
Mirela

> For more details have a look at [1].
>
> [1]
> https://lists.xenproject.org/archives/html/xen-devel/2018-01/msg02482.html
>
> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
  2018-04-17 12:54     ` Mirela Simonovic
@ 2018-04-17 14:11       ` Julien Grall
  2018-04-17 15:22         ` Mirela Simonovic
  2018-04-18 10:45         ` Mirela Simonovic
  0 siblings, 2 replies; 61+ messages in thread
From: Julien Grall @ 2018-04-17 14:11 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel



On 17/04/18 13:54, Mirela Simonovic wrote:
> Hi Julien,

Hi,

>
> On Wed, Apr 11, 2018 at 5:11 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hi,
>>
>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>
>>> In existing code the paging for secondary CPUs is setup only in boot flow.
>>> The setup is triggered from start_xen function after all CPUs are brought
>>> online. In other words, the initialization of VTCR_EL2 register is done
>>> out of the cpu_up/start_secondary control flow. However, the cpu_up flow
>>> should be self-contained - it should fully initialize a secondary CPU,
>>> because the cpu_up is used not only to bring a secondary CPU online on
>>> boot, but also to hotplug a CPU during the system resume.
>>> With this patch the setting of paging is triggered from start_secondary
>>> function if the current system state is not boot. This way, the paging
>>> will be setup in non-boot scenarios, while the setup in boot scenario
>>> remains unchanged.
>>
>>
>> I am afraid that this is not correct. You can't assume that value chosen for
>> VTCR by Xen at boot will fit this new CPU. So you have to check it is fine
>> or park the CPU if there are any issue.
>>
>
> This is not a new CPU. This CPU already went through its boot sequence
> and it reached the resume point because it does fit the value chosen
> for VTCR by Xen.
> If it wouldn't fit the chosen value for VTCR it would be parked so it
> wouldn't participate in suspend/resume. Please let me know if I
> misunderstood your comment.

This is not a new CPU for your use case. However your commit message
speak about "non-boot" CPU bring-up. So for me this is more than
suspend/resume, it is about bringing-up CPU at any time.

As those CPUs can't participate to the decision (it is too late), you
need to make sure the VTCR will fit on that CPU.

>
> AFAIU the value chosen by Xen for VTCR config has to be common for all
> online CPUs. Since this value is also used in the resume path I
> suggest to make global (static in the p2m.c) the 'val' variable which
> is currently local in setup_virt_paging() and passed as argument to
> setup_virt_paging_one(). Then setup_virt_paging_one() would not
> receive an argument.
> I need to access this value on resume, so I would call
> setup_virt_paging_one() without argument from start_secondary() if the
> system state is not boot.
> This seems to me a bit cleaner compared to what I submitted in this
> patch, but fundamentally the functionality is the same.

You don't need to introduce a static variable it. I believe you can
re-create it based on the information we already have in global
variables. So what I would do is moving the creation of vtcr value in
that function.

Cheers,

--
Julien Grall
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
  2018-04-17 14:11       ` Julien Grall
@ 2018-04-17 15:22         ` Mirela Simonovic
  2018-04-18  9:48           ` Julien Grall
  2018-04-18 10:45         ` Mirela Simonovic
  1 sibling, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-17 15:22 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel

Hi Julien,

On Tue, Apr 17, 2018 at 4:11 PM, Julien Grall <julien.grall@arm.com> wrote:
>
>
> On 17/04/18 13:54, Mirela Simonovic wrote:
>>
>> Hi Julien,
>
>
> Hi,
>
>>
>> On Wed, Apr 11, 2018 at 5:11 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>
>>>>
>>>> In existing code the paging for secondary CPUs is setup only in boot
>>>> flow.
>>>> The setup is triggered from start_xen function after all CPUs are
>>>> brought
>>>> online. In other words, the initialization of VTCR_EL2 register is done
>>>> out of the cpu_up/start_secondary control flow. However, the cpu_up flow
>>>> should be self-contained - it should fully initialize a secondary CPU,
>>>> because the cpu_up is used not only to bring a secondary CPU online on
>>>> boot, but also to hotplug a CPU during the system resume.
>>>> With this patch the setting of paging is triggered from start_secondary
>>>> function if the current system state is not boot. This way, the paging
>>>> will be setup in non-boot scenarios, while the setup in boot scenario
>>>> remains unchanged.
>>>
>>>
>>>
>>> I am afraid that this is not correct. You can't assume that value chosen
>>> for
>>> VTCR by Xen at boot will fit this new CPU. So you have to check it is
>>> fine
>>> or park the CPU if there are any issue.
>>>
>>
>> This is not a new CPU. This CPU already went through its boot sequence
>> and it reached the resume point because it does fit the value chosen
>> for VTCR by Xen.
>> If it wouldn't fit the chosen value for VTCR it would be parked so it
>> wouldn't participate in suspend/resume. Please let me know if I
>> misunderstood your comment.
>
>
> This is not a new CPU for your use case. However your commit message
> speak about "non-boot" CPU bring-up. So for me this is more than
> suspend/resume, it is about bringing-up CPU at any time.
>

Use case you're trying to cover is hotplugging a CPU after the boot is
done in bit.LITTLE system, and that CPU wasn't initially brought
online (on boot). Right?

> As those CPUs can't participate to the decision (it is too late), you
> need to make sure the VTCR will fit on that CPU.
>

Could you please point me to the location in sources where this is
done on boot? I mean checking compliance with chosen VTCR value and
parking CPU if it doesn't fit.

Thanks,
Mirela

>>
>> AFAIU the value chosen by Xen for VTCR config has to be common for all
>> online CPUs. Since this value is also used in the resume path I
>> suggest to make global (static in the p2m.c) the 'val' variable which
>> is currently local in setup_virt_paging() and passed as argument to
>> setup_virt_paging_one(). Then setup_virt_paging_one() would not
>> receive an argument.
>> I need to access this value on resume, so I would call
>> setup_virt_paging_one() without argument from start_secondary() if the
>> system state is not boot.
>> This seems to me a bit cleaner compared to what I submitted in this
>> patch, but fundamentally the functionality is the same.
>
>
> You don't need to introduce a static variable it. I believe you can
> re-create it based on the information we already have in global
> variables. So what I would do is moving the creation of vtcr value in
> that function.
>
> Cheers,
>
> --
> Julien Grall
> IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended
> recipient, please notify the sender immediately and do not disclose the
> contents to any other person, use it for any purpose, or store or copy the
> information in any medium. Thank you.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
  2018-04-17 15:22         ` Mirela Simonovic
@ 2018-04-18  9:48           ` Julien Grall
  2018-04-18 10:34             ` Mirela Simonovic
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-18  9:48 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel



On 17/04/18 16:22, Mirela Simonovic wrote:
> Hi Julien,
> 
> On Tue, Apr 17, 2018 at 4:11 PM, Julien Grall <julien.grall@arm.com> wrote:
>>
>>
>> On 17/04/18 13:54, Mirela Simonovic wrote:
>>>
>>> Hi Julien,
>>
>>
>> Hi,
>>
>>>
>>> On Wed, Apr 11, 2018 at 5:11 PM, Julien Grall <julien.grall@arm.com>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>>
>>>>>
>>>>> In existing code the paging for secondary CPUs is setup only in boot
>>>>> flow.
>>>>> The setup is triggered from start_xen function after all CPUs are
>>>>> brought
>>>>> online. In other words, the initialization of VTCR_EL2 register is done
>>>>> out of the cpu_up/start_secondary control flow. However, the cpu_up flow
>>>>> should be self-contained - it should fully initialize a secondary CPU,
>>>>> because the cpu_up is used not only to bring a secondary CPU online on
>>>>> boot, but also to hotplug a CPU during the system resume.
>>>>> With this patch the setting of paging is triggered from start_secondary
>>>>> function if the current system state is not boot. This way, the paging
>>>>> will be setup in non-boot scenarios, while the setup in boot scenario
>>>>> remains unchanged.
>>>>
>>>>
>>>>
>>>> I am afraid that this is not correct. You can't assume that value chosen
>>>> for
>>>> VTCR by Xen at boot will fit this new CPU. So you have to check it is
>>>> fine
>>>> or park the CPU if there are any issue.
>>>>
>>>
>>> This is not a new CPU. This CPU already went through its boot sequence
>>> and it reached the resume point because it does fit the value chosen
>>> for VTCR by Xen.
>>> If it wouldn't fit the chosen value for VTCR it would be parked so it
>>> wouldn't participate in suspend/resume. Please let me know if I
>>> misunderstood your comment.
>>
>>
>> This is not a new CPU for your use case. However your commit message
>> speak about "non-boot" CPU bring-up. So for me this is more than
>> suspend/resume, it is about bringing-up CPU at any time.
>>
> 
> Use case you're trying to cover is hotplugging a CPU after the boot is
> done in bit.LITTLE system, and that CPU wasn't initially brought
> online (on boot). Right?

That's right. It is how I understood your commit title.

> 
>> As those CPUs can't participate to the decision (it is too late), you
>> need to make sure the VTCR will fit on that CPU.
>>
> 
> Could you please point me to the location in sources where this is
> done on boot? I mean checking compliance with chosen VTCR value and
> parking CPU if it doesn't fit.

At the moment all CPUs are brought up during Xen boot. So what we do is 
find the small PA range that will accommodate all the onlined CPUs (see 
setup_virt_paging).

So there are no parking required at the moment. However, if you start 
bringing-up new CPUs after boot. Then you have to check they will be 
able to handle the values chosen at boot.

As you pointed out, this might not be necessary for the suspend/resume 
case. So I am not asking you to implement that. However, please make 
sure the commit message clearly spell out that you bringing up secondary 
CPU after Xen has boot only covers the suspend/resume case.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
  2018-04-18  9:48           ` Julien Grall
@ 2018-04-18 10:34             ` Mirela Simonovic
  0 siblings, 0 replies; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-18 10:34 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel

Hi Julien,

On Wed, Apr 18, 2018 at 11:48 AM, Julien Grall <julien.grall@arm.com> wrote:
>
>
> On 17/04/18 16:22, Mirela Simonovic wrote:
>>
>> Hi Julien,
>>
>> On Tue, Apr 17, 2018 at 4:11 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>>
>>>
>>> On 17/04/18 13:54, Mirela Simonovic wrote:
>>>>
>>>>
>>>> Hi Julien,
>>>
>>>
>>>
>>> Hi,
>>>
>>>>
>>>> On Wed, Apr 11, 2018 at 5:11 PM, Julien Grall <julien.grall@arm.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> In existing code the paging for secondary CPUs is setup only in boot
>>>>>> flow.
>>>>>> The setup is triggered from start_xen function after all CPUs are
>>>>>> brought
>>>>>> online. In other words, the initialization of VTCR_EL2 register is
>>>>>> done
>>>>>> out of the cpu_up/start_secondary control flow. However, the cpu_up
>>>>>> flow
>>>>>> should be self-contained - it should fully initialize a secondary CPU,
>>>>>> because the cpu_up is used not only to bring a secondary CPU online on
>>>>>> boot, but also to hotplug a CPU during the system resume.
>>>>>> With this patch the setting of paging is triggered from
>>>>>> start_secondary
>>>>>> function if the current system state is not boot. This way, the paging
>>>>>> will be setup in non-boot scenarios, while the setup in boot scenario
>>>>>> remains unchanged.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I am afraid that this is not correct. You can't assume that value
>>>>> chosen
>>>>> for
>>>>> VTCR by Xen at boot will fit this new CPU. So you have to check it is
>>>>> fine
>>>>> or park the CPU if there are any issue.
>>>>>
>>>>
>>>> This is not a new CPU. This CPU already went through its boot sequence
>>>> and it reached the resume point because it does fit the value chosen
>>>> for VTCR by Xen.
>>>> If it wouldn't fit the chosen value for VTCR it would be parked so it
>>>> wouldn't participate in suspend/resume. Please let me know if I
>>>> misunderstood your comment.
>>>
>>>
>>>
>>> This is not a new CPU for your use case. However your commit message
>>> speak about "non-boot" CPU bring-up. So for me this is more than
>>> suspend/resume, it is about bringing-up CPU at any time.
>>>
>>
>> Use case you're trying to cover is hotplugging a CPU after the boot is
>> done in bit.LITTLE system, and that CPU wasn't initially brought
>> online (on boot). Right?
>
>
> That's right. It is how I understood your commit title.
>
>>
>>> As those CPUs can't participate to the decision (it is too late), you
>>> need to make sure the VTCR will fit on that CPU.
>>>
>>
>> Could you please point me to the location in sources where this is
>> done on boot? I mean checking compliance with chosen VTCR value and
>> parking CPU if it doesn't fit.
>
>
> At the moment all CPUs are brought up during Xen boot. So what we do is find
> the small PA range that will accommodate all the onlined CPUs (see
> setup_virt_paging).
>
> So there are no parking required at the moment. However, if you start
> bringing-up new CPUs after boot. Then you have to check they will be able to
> handle the values chosen at boot.
>
> As you pointed out, this might not be necessary for the suspend/resume case.
> So I am not asking you to implement that. However, please make sure the
> commit message clearly spell out that you bringing up secondary CPU after
> Xen has boot only covers the suspend/resume case.
>

I'll fix the commit title, thanks.

> Cheers,
>
> --
> Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
  2018-04-17 14:11       ` Julien Grall
  2018-04-17 15:22         ` Mirela Simonovic
@ 2018-04-18 10:45         ` Mirela Simonovic
  2018-04-18 10:58           ` Julien Grall
  1 sibling, 1 reply; 61+ messages in thread
From: Mirela Simonovic @ 2018-04-18 10:45 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel

On Tue, Apr 17, 2018 at 4:11 PM, Julien Grall <julien.grall@arm.com> wrote:
>
>
> On 17/04/18 13:54, Mirela Simonovic wrote:
>>
>> Hi Julien,
>
>
> Hi,
>
>>
>> On Wed, Apr 11, 2018 at 5:11 PM, Julien Grall <julien.grall@arm.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>
>>>>
>>>> In existing code the paging for secondary CPUs is setup only in boot
>>>> flow.
>>>> The setup is triggered from start_xen function after all CPUs are
>>>> brought
>>>> online. In other words, the initialization of VTCR_EL2 register is done
>>>> out of the cpu_up/start_secondary control flow. However, the cpu_up flow
>>>> should be self-contained - it should fully initialize a secondary CPU,
>>>> because the cpu_up is used not only to bring a secondary CPU online on
>>>> boot, but also to hotplug a CPU during the system resume.
>>>> With this patch the setting of paging is triggered from start_secondary
>>>> function if the current system state is not boot. This way, the paging
>>>> will be setup in non-boot scenarios, while the setup in boot scenario
>>>> remains unchanged.
>>>
>>>
>>>
>>> I am afraid that this is not correct. You can't assume that value chosen
>>> for
>>> VTCR by Xen at boot will fit this new CPU. So you have to check it is
>>> fine
>>> or park the CPU if there are any issue.
>>>
>>
>> This is not a new CPU. This CPU already went through its boot sequence
>> and it reached the resume point because it does fit the value chosen
>> for VTCR by Xen.
>> If it wouldn't fit the chosen value for VTCR it would be parked so it
>> wouldn't participate in suspend/resume. Please let me know if I
>> misunderstood your comment.
>
>
> This is not a new CPU for your use case. However your commit message
> speak about "non-boot" CPU bring-up. So for me this is more than
> suspend/resume, it is about bringing-up CPU at any time.
>
> As those CPUs can't participate to the decision (it is too late), you
> need to make sure the VTCR will fit on that CPU.
>
>>
>> AFAIU the value chosen by Xen for VTCR config has to be common for all
>> online CPUs. Since this value is also used in the resume path I
>> suggest to make global (static in the p2m.c) the 'val' variable which
>> is currently local in setup_virt_paging() and passed as argument to
>> setup_virt_paging_one(). Then setup_virt_paging_one() would not
>> receive an argument.
>> I need to access this value on resume, so I would call
>> setup_virt_paging_one() without argument from start_secondary() if the
>> system state is not boot.
>> This seems to me a bit cleaner compared to what I submitted in this
>> patch, but fundamentally the functionality is the same.
>
>
> You don't need to introduce a static variable it. I believe you can
> re-create it based on the information we already have in global
> variables. So what I would do is moving the creation of vtcr value in
> that function.

Using this approach each CPU will need to recalculate the value which
is already known prior to executing the function.
I believe this is sub-optimal and contrary to existing implementation
where only one CPU performs the calculation.
Is there any benefit of recalculating the value?
Is there any disadvantage of remembering the value into a static variable?

Please let me know, I just need to understand whether there is any
particular reason why you want to repeat the calculation. If no
particular reasons, saving the calculated value is definitely a better
approach.

Thanks,
Mirela

>
> Cheers,
>
> --
> Julien Grall
> IMPORTANT NOTICE: The contents of this email and any attachments are
> confidential and may also be privileged. If you are not the intended
> recipient, please notify the sender immediately and do not disclose the
> contents to any other person, use it for any purpose, or store or copy the
> information in any medium. Thank you.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
  2018-04-18 10:45         ` Mirela Simonovic
@ 2018-04-18 10:58           ` Julien Grall
  2018-04-18 22:56             ` Stefano Stabellini
  0 siblings, 1 reply; 61+ messages in thread
From: Julien Grall @ 2018-04-18 10:58 UTC (permalink / raw)
  To: Mirela Simonovic
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista, Xen Devel



On 18/04/18 11:45, Mirela Simonovic wrote:
> On Tue, Apr 17, 2018 at 4:11 PM, Julien Grall <julien.grall@arm.com> wrote:
>>
>>
>> On 17/04/18 13:54, Mirela Simonovic wrote:
>>>
>>> Hi Julien,
>>
>>
>> Hi,
>>
>>>
>>> On Wed, Apr 11, 2018 at 5:11 PM, Julien Grall <julien.grall@arm.com>
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>>
>>>>>
>>>>> In existing code the paging for secondary CPUs is setup only in boot
>>>>> flow.
>>>>> The setup is triggered from start_xen function after all CPUs are
>>>>> brought
>>>>> online. In other words, the initialization of VTCR_EL2 register is done
>>>>> out of the cpu_up/start_secondary control flow. However, the cpu_up flow
>>>>> should be self-contained - it should fully initialize a secondary CPU,
>>>>> because the cpu_up is used not only to bring a secondary CPU online on
>>>>> boot, but also to hotplug a CPU during the system resume.
>>>>> With this patch the setting of paging is triggered from start_secondary
>>>>> function if the current system state is not boot. This way, the paging
>>>>> will be setup in non-boot scenarios, while the setup in boot scenario
>>>>> remains unchanged.
>>>>
>>>>
>>>>
>>>> I am afraid that this is not correct. You can't assume that value chosen
>>>> for
>>>> VTCR by Xen at boot will fit this new CPU. So you have to check it is
>>>> fine
>>>> or park the CPU if there are any issue.
>>>>
>>>
>>> This is not a new CPU. This CPU already went through its boot sequence
>>> and it reached the resume point because it does fit the value chosen
>>> for VTCR by Xen.
>>> If it wouldn't fit the chosen value for VTCR it would be parked so it
>>> wouldn't participate in suspend/resume. Please let me know if I
>>> misunderstood your comment.
>>
>>
>> This is not a new CPU for your use case. However your commit message
>> speak about "non-boot" CPU bring-up. So for me this is more than
>> suspend/resume, it is about bringing-up CPU at any time.
>>
>> As those CPUs can't participate to the decision (it is too late), you
>> need to make sure the VTCR will fit on that CPU.
>>
>>>
>>> AFAIU the value chosen by Xen for VTCR config has to be common for all
>>> online CPUs. Since this value is also used in the resume path I
>>> suggest to make global (static in the p2m.c) the 'val' variable which
>>> is currently local in setup_virt_paging() and passed as argument to
>>> setup_virt_paging_one(). Then setup_virt_paging_one() would not
>>> receive an argument.
>>> I need to access this value on resume, so I would call
>>> setup_virt_paging_one() without argument from start_secondary() if the
>>> system state is not boot.
>>> This seems to me a bit cleaner compared to what I submitted in this
>>> patch, but fundamentally the functionality is the same.
>>
>>
>> You don't need to introduce a static variable it. I believe you can
>> re-create it based on the information we already have in global
>> variables. So what I would do is moving the creation of vtcr value in
>> that function.
> 
> Using this approach each CPU will need to recalculate the value which
> is already known prior to executing the function.
> I believe this is sub-optimal and contrary to existing implementation
> where only one CPU performs the calculation.

The implementation usually evolves with the requirements. In the 
existing implementation, the value could be calculated on a single pCPU 
and then scratched afterwards.

> Is there any benefit of recalculating the value? > Is there any disadvantage of remembering the value into a static 
variable?

I know that it is only a 32-bit value, but I would rather avoid 
spreading static variable when a value can be recompute in a few steps 
with what we have.

Stefano do you have any opinions?

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-17 11:02                   ` Julien Grall
@ 2018-04-18 22:52                     ` Stefano Stabellini
  2018-04-19  9:32                       ` Julien Grall
  0 siblings, 1 reply; 61+ messages in thread
From: Stefano Stabellini @ 2018-04-18 22:52 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Davorin Mista,
	Andrew Cooper, Xen Devel, Jan Beulich, Mirela Simonovic

On Tue, 17 Apr 2018, Julien Grall wrote:
> On 17/04/18 11:52, Mirela Simonovic wrote:
> > Hi Julien,
> 
> Hi Mirela,
> 
> > On Mon, Apr 16, 2018 at 5:21 PM, Julien Grall <julien.grall@arm.com> wrote:
> > > 
> > > 
> > > On 16/04/18 14:41, Mirela Simonovic wrote:
> > > > 
> > > > On Mon, Apr 16, 2018 at 3:14 PM, Julien Grall <julien.grall@arm.com>
> > > > wrote:
> > > > > 
> > > > > On 12/04/18 22:31, Stefano Stabellini wrote:
> > > > > > 
> > > > > > On Thu, 12 Apr 2018, Julien Grall wrote:
> > > > > > > 
> > > > > > > On 12/04/18 00:46, Stefano Stabellini wrote:
> > > > > > > > 
> > > > > > > > On Wed, 11 Apr 2018, Julien Grall wrote:
> > > > > > > > > 
> > > > > > > > > On 11/04/18 14:19, Mirela Simonovic wrote:
> > > > > 
> > > > > I guess the rcu_barrier() in the function handling suspend/resume
> > > > > works.
> > > > > But
> > > > > that doesn't cover the hotplug case. Looking at x86, suspend/resume
> > > > > case.
> > > > > For the hotplug case, there are an rcu_barrier in cpu_{up,down}_helper
> > > > > but
> > > > > they are only present in the case of cpu_{up,down} failed. I am not
> > > > > entirely
> > > > > sure how this is handled in x86
> > > > > 
> > > > > Andrew, Jan, do you know when the percpu will be free on hotplug? It
> > > > > is
> > > > > call
> > > > > to call_rcu(...) but I am not sure when this is going to be executed.
> > > > > 
> > > > 
> > > > AFAIK disable/enable_nonboot_cpus() is the only way to do the hotplug
> > > > and rcu_barrier() is not included in the flow.
> > > 
> > > 
> > > That's not the only way. I clearly specified one in my previous answer
> > > (see
> > > cpu_{up,down}_helper) and there are other place (look for cpu_up).
> > > 
> > 
> > I've looked at cpu_{up,down}_helper and cpu_up and I'm convinced now
> > that adding rcu_barrier() prior to calling enable_nonboot_cpus() is
> > the right approch.
> > 
> > cpu_{up,down}_helper functions exist only for x86.
> 
> They have nothing very x86 specific AFAICT so they could potentially be used
> for Arm when XEN_SYSCTL_hotplug will be implemented.
> 
> > cpu_up_helper()
> > does call rcu_barrier() prior to calling cpu_up().
> 
> That's not true. Below the code for cpu_up_helper():
> 
>     int ret = cpu_up(cpu); <- First call
>     if ( ret == -EBUSY )
>     {
>         rcu_barrier();	   <- RCU barrier
>         ret = cpu_up(cpu); <- Second call
>     }
>     return ret;
> 
> So the rcu_barrier is called after cpu_up() in case it returns -EBUSY.
> 
> > So calling rcu_barrier() is expected to be done prior to calling
> > cpu_up() (or enable_nonboot_cpus(), which is just a wrapper for
> > cpu_up()).
> > 
> > I believe this is right way to do because cpu_up() is used for
> > enabling non-boot CPUs in both boot and suspend/hotplug scenarios,
> > while rcu_barrier() is not required in boot scenario.
> > Therefore, I'll add rcu_barrier() prior to calling
> > enable_nonboot_cpus(). If I missed something please let me know.
> 
> See above, this is exactly why I asked Andrew & Jan input on how rcu work is
> flushed when using cpu_up_helper/cpu_down_helper. Because I don't understand
> if it is meant to work.
> 
> So I would like to see whether it would make sense to put the rcu_barrier()
> somewhere else to cover every call of cpu_up().

I thought that for the specific problem we are talking about, we only
need a rcu_barrier() after a cpu has been brought offline (and before
it is brought online again). I don't understand what other use-cases you
are thinking about that might require an rcu_barrier().

Is this the question you are asking Andrew and Jan?

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario
  2018-04-18 10:58           ` Julien Grall
@ 2018-04-18 22:56             ` Stefano Stabellini
  0 siblings, 0 replies; 61+ messages in thread
From: Stefano Stabellini @ 2018-04-18 22:56 UTC (permalink / raw)
  To: Julien Grall
  Cc: Edgar E. Iglesias, Stefano Stabellini, Mirela Simonovic,
	Davorin Mista, Xen Devel

On Wed, 18 Apr 2018, Julien Grall wrote:
> On 18/04/18 11:45, Mirela Simonovic wrote:
> > On Tue, Apr 17, 2018 at 4:11 PM, Julien Grall <julien.grall@arm.com> wrote:
> > > 
> > > 
> > > On 17/04/18 13:54, Mirela Simonovic wrote:
> > > > 
> > > > Hi Julien,
> > > 
> > > 
> > > Hi,
> > > 
> > > > 
> > > > On Wed, Apr 11, 2018 at 5:11 PM, Julien Grall <julien.grall@arm.com>
> > > > wrote:
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > On 11/04/18 14:19, Mirela Simonovic wrote:
> > > > > > 
> > > > > > 
> > > > > > In existing code the paging for secondary CPUs is setup only in boot
> > > > > > flow.
> > > > > > The setup is triggered from start_xen function after all CPUs are
> > > > > > brought
> > > > > > online. In other words, the initialization of VTCR_EL2 register is
> > > > > > done
> > > > > > out of the cpu_up/start_secondary control flow. However, the cpu_up
> > > > > > flow
> > > > > > should be self-contained - it should fully initialize a secondary
> > > > > > CPU,
> > > > > > because the cpu_up is used not only to bring a secondary CPU online
> > > > > > on
> > > > > > boot, but also to hotplug a CPU during the system resume.
> > > > > > With this patch the setting of paging is triggered from
> > > > > > start_secondary
> > > > > > function if the current system state is not boot. This way, the
> > > > > > paging
> > > > > > will be setup in non-boot scenarios, while the setup in boot
> > > > > > scenario
> > > > > > remains unchanged.
> > > > > 
> > > > > 
> > > > > 
> > > > > I am afraid that this is not correct. You can't assume that value
> > > > > chosen
> > > > > for
> > > > > VTCR by Xen at boot will fit this new CPU. So you have to check it is
> > > > > fine
> > > > > or park the CPU if there are any issue.
> > > > > 
> > > > 
> > > > This is not a new CPU. This CPU already went through its boot sequence
> > > > and it reached the resume point because it does fit the value chosen
> > > > for VTCR by Xen.
> > > > If it wouldn't fit the chosen value for VTCR it would be parked so it
> > > > wouldn't participate in suspend/resume. Please let me know if I
> > > > misunderstood your comment.
> > > 
> > > 
> > > This is not a new CPU for your use case. However your commit message
> > > speak about "non-boot" CPU bring-up. So for me this is more than
> > > suspend/resume, it is about bringing-up CPU at any time.
> > > 
> > > As those CPUs can't participate to the decision (it is too late), you
> > > need to make sure the VTCR will fit on that CPU.
> > > 
> > > > 
> > > > AFAIU the value chosen by Xen for VTCR config has to be common for all
> > > > online CPUs. Since this value is also used in the resume path I
> > > > suggest to make global (static in the p2m.c) the 'val' variable which
> > > > is currently local in setup_virt_paging() and passed as argument to
> > > > setup_virt_paging_one(). Then setup_virt_paging_one() would not
> > > > receive an argument.
> > > > I need to access this value on resume, so I would call
> > > > setup_virt_paging_one() without argument from start_secondary() if the
> > > > system state is not boot.
> > > > This seems to me a bit cleaner compared to what I submitted in this
> > > > patch, but fundamentally the functionality is the same.
> > > 
> > > 
> > > You don't need to introduce a static variable it. I believe you can
> > > re-create it based on the information we already have in global
> > > variables. So what I would do is moving the creation of vtcr value in
> > > that function.
> > 
> > Using this approach each CPU will need to recalculate the value which
> > is already known prior to executing the function.
> > I believe this is sub-optimal and contrary to existing implementation
> > where only one CPU performs the calculation.
> 
> The implementation usually evolves with the requirements. In the existing
> implementation, the value could be calculated on a single pCPU and then
> scratched afterwards.
> 
> > Is there any benefit of recalculating the value? > Is there any disadvantage
> > of remembering the value into a static 
> variable?
> 
> I know that it is only a 32-bit value, but I would rather avoid spreading
> static variable when a value can be recompute in a few steps with what we
> have.
> 
> Stefano do you have any opinions?

Either way it is of very little consequence. I would let Mirela do as
she prefer.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly
  2018-04-18 22:52                     ` Stefano Stabellini
@ 2018-04-19  9:32                       ` Julien Grall
  0 siblings, 0 replies; 61+ messages in thread
From: Julien Grall @ 2018-04-19  9:32 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Edgar E. Iglesias, Davorin Mista, Andrew Cooper, Xen Devel,
	Jan Beulich, Mirela Simonovic



On 18/04/18 23:52, Stefano Stabellini wrote:
> On Tue, 17 Apr 2018, Julien Grall wrote:
>> On 17/04/18 11:52, Mirela Simonovic wrote:
>>> Hi Julien,
>>
>> Hi Mirela,
>>
>>> On Mon, Apr 16, 2018 at 5:21 PM, Julien Grall <julien.grall@arm.com> wrote:
>>>>
>>>>
>>>> On 16/04/18 14:41, Mirela Simonovic wrote:
>>>>>
>>>>> On Mon, Apr 16, 2018 at 3:14 PM, Julien Grall <julien.grall@arm.com>
>>>>> wrote:
>>>>>>
>>>>>> On 12/04/18 22:31, Stefano Stabellini wrote:
>>>>>>>
>>>>>>> On Thu, 12 Apr 2018, Julien Grall wrote:
>>>>>>>>
>>>>>>>> On 12/04/18 00:46, Stefano Stabellini wrote:
>>>>>>>>>
>>>>>>>>> On Wed, 11 Apr 2018, Julien Grall wrote:
>>>>>>>>>>
>>>>>>>>>> On 11/04/18 14:19, Mirela Simonovic wrote:
>>>>>>
>>>>>> I guess the rcu_barrier() in the function handling suspend/resume
>>>>>> works.
>>>>>> But
>>>>>> that doesn't cover the hotplug case. Looking at x86, suspend/resume
>>>>>> case.
>>>>>> For the hotplug case, there are an rcu_barrier in cpu_{up,down}_helper
>>>>>> but
>>>>>> they are only present in the case of cpu_{up,down} failed. I am not
>>>>>> entirely
>>>>>> sure how this is handled in x86
>>>>>>
>>>>>> Andrew, Jan, do you know when the percpu will be free on hotplug? It
>>>>>> is
>>>>>> call
>>>>>> to call_rcu(...) but I am not sure when this is going to be executed.
>>>>>>
>>>>>
>>>>> AFAIK disable/enable_nonboot_cpus() is the only way to do the hotplug
>>>>> and rcu_barrier() is not included in the flow.
>>>>
>>>>
>>>> That's not the only way. I clearly specified one in my previous answer
>>>> (see
>>>> cpu_{up,down}_helper) and there are other place (look for cpu_up).
>>>>
>>>
>>> I've looked at cpu_{up,down}_helper and cpu_up and I'm convinced now
>>> that adding rcu_barrier() prior to calling enable_nonboot_cpus() is
>>> the right approch.
>>>
>>> cpu_{up,down}_helper functions exist only for x86.
>>
>> They have nothing very x86 specific AFAICT so they could potentially be used
>> for Arm when XEN_SYSCTL_hotplug will be implemented.
>>
>>> cpu_up_helper()
>>> does call rcu_barrier() prior to calling cpu_up().
>>
>> That's not true. Below the code for cpu_up_helper():
>>
>>      int ret = cpu_up(cpu); <- First call
>>      if ( ret == -EBUSY )
>>      {
>>          rcu_barrier();	   <- RCU barrier
>>          ret = cpu_up(cpu); <- Second call
>>      }
>>      return ret;
>>
>> So the rcu_barrier is called after cpu_up() in case it returns -EBUSY.
>>
>>> So calling rcu_barrier() is expected to be done prior to calling
>>> cpu_up() (or enable_nonboot_cpus(), which is just a wrapper for
>>> cpu_up()).
>>>
>>> I believe this is right way to do because cpu_up() is used for
>>> enabling non-boot CPUs in both boot and suspend/hotplug scenarios,
>>> while rcu_barrier() is not required in boot scenario.
>>> Therefore, I'll add rcu_barrier() prior to calling
>>> enable_nonboot_cpus(). If I missed something please let me know.
>>
>> See above, this is exactly why I asked Andrew & Jan input on how rcu work is
>> flushed when using cpu_up_helper/cpu_down_helper. Because I don't understand
>> if it is meant to work.
>>
>> So I would like to see whether it would make sense to put the rcu_barrier()
>> somewhere else to cover every call of cpu_up().
> 
> I thought that for the specific problem we are talking about, we only
> need a rcu_barrier() after a cpu has been brought offline (and before
> it is brought online again). I don't understand what other use-cases you
> are thinking about that might require an rcu_barrier().
> 
> Is this the question you are asking Andrew and Jan?

There are 2 use cases:
	- suspend/resume
	- XEN_SYSCTL_hotplug (not implemented yet on Arm)

Mirela suggestion only apply for suspend/resume. However, there are 
clearly something undocumented in the code about the expectation on 
calling cpu_up/cpu_down.

I would like to understand where is the rcu_barrier when the CPU is 
offlined using the hypercall XEN_SYSCTL_hotplug. As I pointed out there 
the only rcu_barrier call I can find in that path is when cpu_down() fails.

The answer may be: "It is missing and would need to be fixed on x86" and 
that would be fine by me.

In any case, we should probably document the expectation if we rely on 
the caller of cpu_{up,down} to call rcu_barrier(). Otherwise, this is 
yet another way to rediscover the problem with a different path.

Cheers,

-- 
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2018-04-19  9:32 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-11 13:19 [PATCH 0/7] xen/arm: CPU hotplug fixes Mirela Simonovic
2018-04-11 13:19 ` [PATCH 1/7] xen/arm: Added handling of the trapped access to OSLSR register Mirela Simonovic
2018-04-11 14:39   ` Julien Grall
2018-04-11 23:28     ` Stefano Stabellini
2018-04-11 13:19 ` [PATCH 2/7] xen/arm/vgic-v2: Ignore write to GICD_ISACTIVERn registers Mirela Simonovic
2018-04-11 14:43   ` Julien Grall
2018-04-11 23:31     ` Stefano Stabellini
2018-04-12 11:15       ` Mirela Simonovic
2018-04-11 13:19 ` [PATCH 3/7] xen/arm/psci: Implement CPU_OFF PSCI call (physical interface) Mirela Simonovic
2018-04-11 14:46   ` Julien Grall
2018-04-12 11:33     ` Mirela Simonovic
2018-04-12 13:31       ` Julien Grall
2018-04-16 10:02         ` Mirela Simonovic
2018-04-16 14:26           ` Julien Grall
2018-04-16 16:52             ` Mirela Simonovic
2018-04-16 17:02               ` Julien Grall
2018-04-11 13:19 ` [PATCH 4/7] xen/arm: When CPU dies, free percpu area immediatelly Mirela Simonovic
2018-04-11 14:53   ` Julien Grall
2018-04-11 23:46     ` Stefano Stabellini
2018-04-12  8:53       ` Julien Grall
2018-04-12 21:31         ` Stefano Stabellini
2018-04-16 13:14           ` Julien Grall
2018-04-16 13:41             ` Mirela Simonovic
2018-04-16 15:21               ` Julien Grall
2018-04-17 10:52                 ` Mirela Simonovic
2018-04-17 11:02                   ` Julien Grall
2018-04-18 22:52                     ` Stefano Stabellini
2018-04-19  9:32                       ` Julien Grall
2018-04-11 13:19 ` [PATCH 5/7] xen/arm: Remove __initdata and __init to enable CPU hotplug Mirela Simonovic
2018-04-12  0:07   ` Stefano Stabellini
2018-04-12  9:03     ` Julien Grall
2018-04-12 12:50       ` Mirela Simonovic
2018-04-12 12:56         ` Julien Grall
2018-04-12 13:55           ` Mirela Simonovic
2018-04-11 13:19 ` [PATCH 6/7] xen/arm: Setup virtual paging for secondary CPUs in non-boot scenario Mirela Simonovic
2018-04-11 15:11   ` Julien Grall
2018-04-17 12:54     ` Mirela Simonovic
2018-04-17 14:11       ` Julien Grall
2018-04-17 15:22         ` Mirela Simonovic
2018-04-18  9:48           ` Julien Grall
2018-04-18 10:34             ` Mirela Simonovic
2018-04-18 10:45         ` Mirela Simonovic
2018-04-18 10:58           ` Julien Grall
2018-04-18 22:56             ` Stefano Stabellini
2018-04-11 13:19 ` [PATCH 7/7] xen/arm: Restore IRQ affinity after hotplugging a CPU Mirela Simonovic
2018-04-11 14:55   ` Julien Grall
2018-04-12  0:20   ` Stefano Stabellini
2018-04-12  7:38     ` Mirela Simonovic
2018-04-12 16:49   ` Dario Faggioli
2018-04-13 10:11     ` Mirela Simonovic
2018-04-16 12:38       ` Dario Faggioli
2018-04-11 15:07 ` [PATCH 0/7] xen/arm: CPU hotplug fixes Julien Grall
2018-04-11 15:58   ` Mirela Simonovic
2018-04-11 16:02     ` Julien Grall
2018-04-11 16:37       ` Mirela Simonovic
2018-04-12  8:43         ` Julien Grall
2018-04-13 10:19           ` Mirela Simonovic
2018-04-16 11:33             ` Julien Grall
2018-04-16 14:06               ` Mirela Simonovic
2018-04-16 15:05                 ` Julien Grall
2018-04-11 15:13 ` Julien Grall

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.