xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] x86: more power-efficient CPU parking
@ 2018-08-01 14:22 Jan Beulich
  2018-08-01 14:31 ` [PATCH 1/5] x86/cpuidle: replace a pointless NULL check Jan Beulich
                   ` (6 more replies)
  0 siblings, 7 replies; 33+ messages in thread
From: Jan Beulich @ 2018-08-01 14:22 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

When putting CPUs to sleep permanently, we should try to put them into
the most power conserving state possible. For now it is unclear whether,
especially in a deep C-state, the P-state also matters, so this series only
arranges for the C-state side of things (plus some cleanup).

1: x86/cpuidle: replace a pointless NULL check
2: x86/idle: re-arrange dead-idle handling
3: x86/cpuidle: push parked CPUs into deeper sleep states when possible
4: x86/cpuidle: clean up Cx dumping
5: x86: place non-parked CPUs into wait-for-SIPI state after offlining

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/5] x86/cpuidle: replace a pointless NULL check
  2018-08-01 14:22 [PATCH 0/5] x86: more power-efficient CPU parking Jan Beulich
@ 2018-08-01 14:31 ` Jan Beulich
  2018-08-01 14:33   ` Andrew Cooper
  2018-08-01 14:31 ` [PATCH 2/5] x86/idle: re-arrange dead-idle handling Jan Beulich
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2018-08-01 14:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

The address of an array slot can't be NULL. Instead add a bounds check
to make sure the array indexing is valid (the check is against 2 since
slot zero of the array - corresponding to C0 - is of no interest here).

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -759,11 +759,11 @@ void acpi_dead_idle(void)
     struct acpi_processor_power *power;
     struct acpi_processor_cx *cx;
 
-    if ( (power = processor_powers[smp_processor_id()]) == NULL )
+    if ( (power = processor_powers[smp_processor_id()]) == NULL ||
+         power->count < 2 )
         goto default_halt;
 
-    if ( (cx = &power->states[power->count-1]) == NULL )
-        goto default_halt;
+    cx = &power->states[power->count - 1];
 
     if ( cx->entry_method == ACPI_CSTATE_EM_FFH )
     {



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 2/5] x86/idle: re-arrange dead-idle handling
  2018-08-01 14:22 [PATCH 0/5] x86: more power-efficient CPU parking Jan Beulich
  2018-08-01 14:31 ` [PATCH 1/5] x86/cpuidle: replace a pointless NULL check Jan Beulich
@ 2018-08-01 14:31 ` Jan Beulich
  2018-09-07 17:08   ` Andrew Cooper
  2018-08-01 14:32 ` [PATCH 3/5] x86/cpuidle: push parked CPUs into deeper sleep states when possible Jan Beulich
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2018-08-01 14:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

In order to be able to wake parked CPUs from default_dead_idle(), the
function must not itself loop. Move the loop into play_dead(), and use
play_dead() as well on the AP boot error path.

Furthermore, not the least considering the comment in play_dead(),
make sure NMI raised (for now this would be a bug elsewhere, but that's
about to change) against a parked or fully offline CPU won't invoke the
actual, full-blown NMI handler.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -98,14 +98,19 @@ void default_dead_idle(void)
      */
     spec_ctrl_enter_idle(get_cpu_info());
     wbinvd();
-    for ( ; ; )
-        halt();
+    halt();
 }
 
-static void play_dead(void)
+void play_dead(void)
 {
+    unsigned int cpu = smp_processor_id();
+
     local_irq_disable();
 
+    /* Change the NMI handler to a nop (see comment below). */
+    _set_gate_lower(&idt_tables[cpu][TRAP_nmi], SYS_DESC_irq_gate, 0,
+                    &trap_nop);
+
     /*
      * NOTE: After cpu_exit_clear, per-cpu variables may no longer accessible,
      * as they may be freed at any time if offline CPUs don't get parked. In
@@ -116,9 +121,10 @@ static void play_dead(void)
      * Consider very carefully when adding code to *dead_idle. Most hypervisor
      * subsystems are unsafe to call.
      */
-    cpu_exit_clear(smp_processor_id());
+    cpu_exit_clear(cpu);
 
-    (*dead_idle)();
+    for ( ; ; )
+        dead_idle();
 }
 
 static void idle_loop(void)
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -33,6 +33,7 @@
 #include <xen/serial.h>
 #include <xen/numa.h>
 #include <xen/cpu.h>
+#include <asm/cpuidle.h>
 #include <asm/current.h>
 #include <asm/mc146818rtc.h>
 #include <asm/desc.h>
@@ -209,8 +210,7 @@ static void smp_callin(void)
     halt:
         clear_local_APIC();
         spin_debug_enable();
-        cpu_exit_clear(cpu);
-        (*dead_idle)();
+        play_dead();
     }
 
     /* Allow the master to continue. */
--- a/xen/include/asm-x86/cpuidle.h
+++ b/xen/include/asm-x86/cpuidle.h
@@ -20,6 +20,7 @@ int mwait_idle_init(struct notifier_bloc
 int cpuidle_init_cpu(unsigned int cpu);
 void default_dead_idle(void);
 void acpi_dead_idle(void);
+void play_dead(void);
 void trace_exit_reason(u32 *irq_traced);
 void update_idle_stats(struct acpi_processor_power *,
                        struct acpi_processor_cx *, uint64_t, uint64_t);





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 3/5] x86/cpuidle: push parked CPUs into deeper sleep states when possible
  2018-08-01 14:22 [PATCH 0/5] x86: more power-efficient CPU parking Jan Beulich
  2018-08-01 14:31 ` [PATCH 1/5] x86/cpuidle: replace a pointless NULL check Jan Beulich
  2018-08-01 14:31 ` [PATCH 2/5] x86/idle: re-arrange dead-idle handling Jan Beulich
@ 2018-08-01 14:32 ` Jan Beulich
  2018-10-26 10:56   ` Ping: " Jan Beulich
  2018-08-01 14:33 ` [PATCH 4/5] x86/cpuidle: clean up Cx dumping Jan Beulich
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2018-08-01 14:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

When the mwait-idle driver isn't used, C-state information becomes
available only in the course of Dom0 starting up. Use the provided data
to allow parked CPUs to sleep in a more energy efficient way, by waking
them briefly (via NMI) once the data has been recorded.

This involves re-arranging how/when the governor's ->enable() hook gets
invoked. The changes there include addition of so far missing error
handling in the respective CPU notifier handlers.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -351,12 +351,22 @@ static void dump_cx(unsigned char key)
     unsigned int cpu;
 
     printk("'%c' pressed -> printing ACPI Cx structures\n", key);
-    for_each_online_cpu ( cpu )
-        if (processor_powers[cpu])
-        {
-            print_acpi_power(cpu, processor_powers[cpu]);
-            process_pending_softirqs();
-        }
+    for_each_present_cpu ( cpu )
+    {
+        struct acpi_processor_power *power = processor_powers[cpu];
+
+        if ( !power )
+            continue;
+
+        if ( cpu_online(cpu) )
+            print_acpi_power(cpu, power);
+        else if ( park_offline_cpus )
+            printk("CPU%u parked in state %u (C%u)\n", cpu,
+                   power->last_state ? power->last_state->idx : 1,
+                   power->last_state ? power->last_state->type : 1);
+
+        process_pending_softirqs();
+    }
 }
 
 static int __init cpu_idle_key_init(void)
@@ -764,6 +774,7 @@ void acpi_dead_idle(void)
         goto default_halt;
 
     cx = &power->states[power->count - 1];
+    power->last_state = cx;
 
     if ( cx->entry_method == ACPI_CSTATE_EM_FFH )
     {
@@ -1216,9 +1227,30 @@ long set_cx_pminfo(uint32_t acpi_id, str
         set_cx(acpi_power, &xen_cx);
     }
 
-    if ( cpuidle_current_governor->enable &&
-         cpuidle_current_governor->enable(acpi_power) )
-        return -EFAULT;
+    if ( !cpu_online(cpu_id) )
+    {
+        uint32_t apic_id = x86_cpu_to_apicid[cpu_id];
+
+        /*
+         * If we've just learned of more available C states, wake the CPU if
+         * it's parked, so it can go back to sleep in perhaps a deeper state.
+         */
+        if ( park_offline_cpus && apic_id != BAD_APICID )
+        {
+            unsigned long flags;
+
+            local_irq_save(flags);
+            apic_wait_icr_idle();
+            apic_icr_write(APIC_DM_NMI | APIC_DEST_PHYSICAL, apic_id);
+            local_irq_restore(flags);
+        }
+    }
+    else if ( cpuidle_current_governor->enable )
+    {
+        ret = cpuidle_current_governor->enable(acpi_power);
+        if ( ret < 0 )
+            return ret;
+    }
 
     /* FIXME: C-state dependency is not supported by far */
 
@@ -1378,19 +1410,22 @@ static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
     unsigned int cpu = (unsigned long)hcpu;
+    int rc = 0;
 
-    /* Only hook on CPU_ONLINE because a dead cpu may utilize the info to
-     * to enter deep C-state */
+    /*
+     * Only hook on CPU_UP_PREPARE because a dead cpu may utilize the info
+     * to enter deep C-state.
+     */
     switch ( action )
     {
-    case CPU_ONLINE:
-        (void)cpuidle_init_cpu(cpu);
-        break;
-    default:
+    case CPU_UP_PREPARE:
+        rc = cpuidle_init_cpu(cpu);
+        if ( !rc && cpuidle_current_governor->enable )
+            rc = cpuidle_current_governor->enable(processor_powers[cpu]);
         break;
     }
 
-    return NOTIFY_DONE;
+    return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
 }
 
 static struct notifier_block cpu_nfb = {
@@ -1405,6 +1440,7 @@ static int __init cpuidle_presmp_init(vo
         return 0;
 
     mwait_idle_init(&cpu_nfb);
+    cpu_nfb.notifier_call(&cpu_nfb, CPU_UP_PREPARE, cpu);
     cpu_nfb.notifier_call(&cpu_nfb, CPU_ONLINE, cpu);
     register_cpu_notifier(&cpu_nfb);
     return 0;
--- a/xen/arch/x86/acpi/cpuidle_menu.c
+++ b/xen/arch/x86/acpi/cpuidle_menu.c
@@ -277,9 +277,6 @@ static void menu_reflect(struct acpi_pro
 
 static int menu_enable_device(struct acpi_processor_power *power)
 {
-    if (!cpu_online(power->cpu))
-        return -1;
-
     memset(&per_cpu(menu_devices, power->cpu), 0, sizeof(struct menu_device));
 
     return 0;
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -1162,12 +1162,17 @@ static int mwait_idle_cpu_init(struct no
 	struct acpi_processor_power *dev = processor_powers[cpu];
 
 	switch (action) {
+		int rc;
+
 	default:
 		return NOTIFY_DONE;
 
 	case CPU_UP_PREPARE:
-		cpuidle_init_cpu(cpu);
-		return NOTIFY_DONE;
+		rc = cpuidle_init_cpu(cpu);
+		dev = processor_powers[cpu];
+		if (!rc && cpuidle_current_governor->enable)
+			rc = cpuidle_current_governor->enable(dev);
+		return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
 
 	case CPU_ONLINE:
 		if (!dev)
@@ -1256,8 +1261,6 @@ int __init mwait_idle_init(struct notifi
 	}
 	if (!err) {
 		nfb->notifier_call = mwait_idle_cpu_init;
-		mwait_idle_cpu_init(nfb, CPU_UP_PREPARE, NULL);
-
 		pm_idle_save = pm_idle;
 		pm_idle = mwait_idle;
 		dead_idle = acpi_dead_idle;




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 4/5] x86/cpuidle: clean up Cx dumping
  2018-08-01 14:22 [PATCH 0/5] x86: more power-efficient CPU parking Jan Beulich
                   ` (2 preceding siblings ...)
  2018-08-01 14:32 ` [PATCH 3/5] x86/cpuidle: push parked CPUs into deeper sleep states when possible Jan Beulich
@ 2018-08-01 14:33 ` Jan Beulich
  2018-08-01 14:40   ` Andrew Cooper
  2018-08-01 14:33 ` [PATCH 5/5] x86: place non-parked CPUs into wait-for-SIPI state after offlining Jan Beulich
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2018-08-01 14:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

Don't log the same global information once per CPU. Don't log the same
information (here: the currently active state) twice. Don't prefix
decimal numbers with zeros (giving the impression they're octal). Use
format specifiers matching the type of the corresponding expressions.
Don't split printk()-s without intervening new-lines.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -304,9 +304,6 @@ static void print_acpi_power(uint32_t cp
 
     printk("==cpu%d==\n", cpu);
     last_state_idx = power->last_state ? power->last_state->idx : -1;
-    printk("active state:\t\tC%d\n", last_state_idx);
-    printk("max_cstate:\t\tC%d\n", max_cstate);
-    printk("states:\n");
 
     spin_lock_irq(&power->stat_lock);
     current_tick = cpuidle_get_tick();
@@ -331,16 +328,14 @@ static void print_acpi_power(uint32_t cp
         idle_usage += usage[i];
         idle_res += tick_to_ns(res_tick[i]);
 
-        printk((last_state_idx == i) ? "   *" : "    ");
-        printk("C%d:\t", i);
-        printk("type[C%d] ", power->states[i].type);
-        printk("latency[%03d] ", power->states[i].latency);
-        printk("usage[%08"PRIu64"] ", usage[i]);
-        printk("method[%5s] ", acpi_cstate_method_name[power->states[i].entry_method]);
-        printk("duration[%"PRIu64"]\n", tick_to_ns(res_tick[i]));
+        printk("   %cC%u:\ttype[C%d] latency[%3u] usage[%8"PRIu64"] method[%5s] duration[%"PRIu64"]\n",
+               (last_state_idx == i) ? '*' : ' ', i,
+               power->states[i].type, power->states[i].latency, usage[i],
+               acpi_cstate_method_name[power->states[i].entry_method],
+               tick_to_ns(res_tick[i]));
     }
-    printk((last_state_idx == 0) ? "   *" : "    ");
-    printk("C0:\tusage[%08"PRIu64"] duration[%"PRIu64"]\n",
+    printk("   %cC0:\tusage[%8"PRIu64"] duration[%"PRIu64"]\n",
+           (last_state_idx == 0) ? '*' : ' ',
            usage[0] + idle_usage, current_stime - idle_res);
 
     print_hw_residencies(cpu);
@@ -351,6 +346,7 @@ static void dump_cx(unsigned char key)
     unsigned int cpu;
 
     printk("'%c' pressed -> printing ACPI Cx structures\n", key);
+    printk("max cstate: C%u\n", max_cstate);
     for_each_present_cpu ( cpu )
     {
         struct acpi_processor_power *power = processor_powers[cpu];





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/5] x86/cpuidle: replace a pointless NULL check
  2018-08-01 14:31 ` [PATCH 1/5] x86/cpuidle: replace a pointless NULL check Jan Beulich
@ 2018-08-01 14:33   ` Andrew Cooper
  2018-08-01 15:12     ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: Andrew Cooper @ 2018-08-01 14:33 UTC (permalink / raw)
  To: Jan Beulich, xen-devel

On 01/08/18 15:31, Jan Beulich wrote:
> The address of an array slot can't be NULL. Instead add a bounds check
> to make sure the array indexing is valid (the check is against 2 since
> slot zero of the array - corresponding to C0 - is of no interest here).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Wouldn't an expression involving ARRAY_SIZE() be more appropriate?

Either way, Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 5/5] x86: place non-parked CPUs into wait-for-SIPI state after offlining
  2018-08-01 14:22 [PATCH 0/5] x86: more power-efficient CPU parking Jan Beulich
                   ` (3 preceding siblings ...)
  2018-08-01 14:33 ` [PATCH 4/5] x86/cpuidle: clean up Cx dumping Jan Beulich
@ 2018-08-01 14:33 ` Jan Beulich
  2018-08-29  7:08 ` Ping: [PATCH 0/5] x86: more power-efficient CPU parking Jan Beulich
       [not found] ` <5B61C21202000000000FC1F1@prv1-mh.provo.novell.com>
  6 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2018-08-01 14:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

This is presumably more power efficient than keeping them in a HLT loop,
and I supposed also the state in which they're being handed off by
firmware.

Split off from wakeup_secondary_cpu() the code to assert/deassert INIT
(and in turn also recurring wait-for-send-completion code), and re-use
it from __cpu_die(). Take the opportunity and add the previously
missing apic_wait_icr_idle() at the start of the sequence.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -415,10 +415,25 @@ void start_secondary(void *unused)
 
 extern void *stack_start;
 
-static int wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip)
+static unsigned int wait_send(void)
 {
-    unsigned long send_status = 0, accept_status = 0;
-    int maxlvt, timeout, i;
+    unsigned int send_status, timeout = 0;
+
+    Dprintk("Waiting for send to finish...\n");
+    do {
+        Dprintk("+");
+        udelay(100);
+        send_status = apic_read(APIC_ICR) & APIC_ICR_BUSY;
+    } while ( send_status && (timeout++ < 1000) );
+
+    return send_status;
+}
+
+static unsigned int init_secondary_cpu(unsigned int phys_apicid)
+{
+    unsigned int send_status = 0;
+
+    apic_wait_icr_idle();
 
     /*
      * Be paranoid about clearing APIC errors.
@@ -436,13 +451,7 @@ static int wakeup_secondary_cpu(int phys
 
     if ( !x2apic_enabled )
     {
-        Dprintk("Waiting for send to finish...\n");
-        timeout = 0;
-        do {
-            Dprintk("+");
-            udelay(100);
-            send_status = apic_read(APIC_ICR) & APIC_ICR_BUSY;
-        } while ( send_status && (timeout++ < 1000) );
+        send_status = wait_send();
 
         mdelay(10);
 
@@ -450,13 +459,7 @@ static int wakeup_secondary_cpu(int phys
 
         apic_icr_write(APIC_INT_LEVELTRIG | APIC_DM_INIT, phys_apicid);
 
-        Dprintk("Waiting for send to finish...\n");
-        timeout = 0;
-        do {
-            Dprintk("+");
-            udelay(100);
-            send_status = apic_read(APIC_ICR) & APIC_ICR_BUSY;
-        } while ( send_status && (timeout++ < 1000) );
+        send_status = wait_send();
     }
     else if ( tboot_in_measured_env() )
     {
@@ -471,7 +474,16 @@ static int wakeup_secondary_cpu(int phys
         udelay(10);
     }
 
-    maxlvt = get_maxlvt();
+    return send_status;
+}
+
+static unsigned int wakeup_secondary_cpu(unsigned int phys_apicid,
+                                         unsigned int start_eip)
+{
+    unsigned int send_status, accept_status = 0;
+    unsigned int maxlvt = get_maxlvt(), i;
+
+    send_status = init_secondary_cpu(phys_apicid);
 
     for ( i = 0; i < 2; i++ )
     {
@@ -491,15 +503,9 @@ static int wakeup_secondary_cpu(int phys
             /* Give the other CPU some time to accept the IPI. */
             udelay(300);
 
-            Dprintk("Startup point 1.\n");
+            Dprintk("Startup point %u.\n", i + 1);
 
-            Dprintk("Waiting for send to finish...\n");
-            timeout = 0;
-            do {
-                Dprintk("+");
-                udelay(100);
-                send_status = apic_read(APIC_ICR) & APIC_ICR_BUSY;
-            } while ( send_status && (timeout++ < 1000) );
+            send_status = wait_send();
 
             /* Give the other CPU some time to accept the IPI. */
             udelay(200);
@@ -519,7 +525,7 @@ static int wakeup_secondary_cpu(int phys
     if ( send_status )
         printk("APIC never delivered???\n");
     if ( accept_status )
-        printk("APIC delivery error (%lx).\n", accept_status);
+        printk("APIC delivery error (%x).\n", accept_status);
 
     return (send_status | accept_status);
 }
@@ -1238,6 +1244,9 @@ void __cpu_die(unsigned int cpu)
         if ( (++i % 10) == 0 )
             printk(KERN_ERR "CPU %u still not dead...\n", cpu);
     }
+
+    if ( !park_offline_cpus )
+        init_secondary_cpu(x86_cpu_to_apicid[cpu]);
 }
 
 int cpu_add(uint32_t apic_id, uint32_t acpi_id, uint32_t pxm)




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 4/5] x86/cpuidle: clean up Cx dumping
  2018-08-01 14:33 ` [PATCH 4/5] x86/cpuidle: clean up Cx dumping Jan Beulich
@ 2018-08-01 14:40   ` Andrew Cooper
  0 siblings, 0 replies; 33+ messages in thread
From: Andrew Cooper @ 2018-08-01 14:40 UTC (permalink / raw)
  To: Jan Beulich, xen-devel

On 01/08/18 15:33, Jan Beulich wrote:
> Don't log the same global information once per CPU. Don't log the same
> information (here: the currently active state) twice. Don't prefix
> decimal numbers with zeros (giving the impression they're octal). Use
> format specifiers matching the type of the corresponding expressions.
> Don't split printk()-s without intervening new-lines.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/5] x86/cpuidle: replace a pointless NULL check
  2018-08-01 14:33   ` Andrew Cooper
@ 2018-08-01 15:12     ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2018-08-01 15:12 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 01.08.18 at 16:33, <andrew.cooper3@citrix.com> wrote:
> On 01/08/18 15:31, Jan Beulich wrote:
>> The address of an array slot can't be NULL. Instead add a bounds check
>> to make sure the array indexing is valid (the check is against 2 since
>> slot zero of the array - corresponding to C0 - is of no interest here).
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Wouldn't an expression involving ARRAY_SIZE() be more appropriate?

No, as this is a lower bounds check; ->count will never exceed the
upper bound.

> Either way, Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks, Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Ping: [PATCH 0/5] x86: more power-efficient CPU parking
  2018-08-01 14:22 [PATCH 0/5] x86: more power-efficient CPU parking Jan Beulich
                   ` (4 preceding siblings ...)
  2018-08-01 14:33 ` [PATCH 5/5] x86: place non-parked CPUs into wait-for-SIPI state after offlining Jan Beulich
@ 2018-08-29  7:08 ` Jan Beulich
  2018-08-29 17:01   ` Andrew Cooper
       [not found] ` <5B61C21202000000000FC1F1@prv1-mh.provo.novell.com>
  6 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2018-08-29  7:08 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 01.08.18 at 16:22,  wrote:
> When putting CPUs to sleep permanently, we should try to put them into
> the most power conserving state possible. For now it is unclear whether,
> especially in a deep C-state, the P-state also matters, so this series only
> arranges for the C-state side of things (plus some cleanup).
> 
> 1: x86/cpuidle: replace a pointless NULL check
> 2: x86/idle: re-arrange dead-idle handling
> 3: x86/cpuidle: push parked CPUs into deeper sleep states when possible
> 4: x86/cpuidle: clean up Cx dumping
> 5: x86: place non-parked CPUs into wait-for-SIPI state after offlining
> 
> Jan
> 
> 





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Ping: [PATCH 0/5] x86: more power-efficient CPU parking
  2018-08-29  7:08 ` Ping: [PATCH 0/5] x86: more power-efficient CPU parking Jan Beulich
@ 2018-08-29 17:01   ` Andrew Cooper
  2018-08-30  7:29     ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: Andrew Cooper @ 2018-08-29 17:01 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On 29/08/18 08:08, Jan Beulich wrote:
>>>> On 01.08.18 at 16:22,  wrote:
>> When putting CPUs to sleep permanently, we should try to put them into
>> the most power conserving state possible. For now it is unclear whether,
>> especially in a deep C-state, the P-state also matters, so this series only
>> arranges for the C-state side of things (plus some cleanup).
>>
>> 1: x86/cpuidle: replace a pointless NULL check
>> 2: x86/idle: re-arrange dead-idle handling
>> 3: x86/cpuidle: push parked CPUs into deeper sleep states when possible
>> 4: x86/cpuidle: clean up Cx dumping
>> 5: x86: place non-parked CPUs into wait-for-SIPI state after offlining

I don't have a problem in principle, but I'm afraid that you're going to
need either positive documentation, or a positive statement from Intel
and AMD that wait-for-SIPI is the most power efficient state to park a
CPU in.

At a guess, I'd expect that wait-for-SIPI isn't an optimised state
(because the vendors wouldn't expect hardware to be in this state after
boot), and from the various discussions around L1TF, there are
definitely differences between wait-for-SIPI and mwait.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Ping: [PATCH 0/5] x86: more power-efficient CPU parking
  2018-08-29 17:01   ` Andrew Cooper
@ 2018-08-30  7:29     ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2018-08-30  7:29 UTC (permalink / raw)
  To: Brian Woods, Suravee Suthikulpanit, Andrew Cooper
  Cc: xen-devel, Kevin Tian, Jun Nakajima

>>> On 29.08.18 at 19:01, <andrew.cooper3@citrix.com> wrote:
> On 29/08/18 08:08, Jan Beulich wrote:
>>>>> On 01.08.18 at 16:22,  wrote:
>>> When putting CPUs to sleep permanently, we should try to put them into
>>> the most power conserving state possible. For now it is unclear whether,
>>> especially in a deep C-state, the P-state also matters, so this series only
>>> arranges for the C-state side of things (plus some cleanup).
>>>
>>> 1: x86/cpuidle: replace a pointless NULL check
>>> 2: x86/idle: re-arrange dead-idle handling
>>> 3: x86/cpuidle: push parked CPUs into deeper sleep states when possible
>>> 4: x86/cpuidle: clean up Cx dumping
>>> 5: x86: place non-parked CPUs into wait-for-SIPI state after offlining
> 
> I don't have a problem in principle, but I'm afraid that you're going to
> need either positive documentation, or a positive statement from Intel
> and AMD that wait-for-SIPI is the most power efficient state to park a
> CPU in.
> 
> At a guess, I'd expect that wait-for-SIPI isn't an optimised state
> (because the vendors wouldn't expect hardware to be in this state after
> boot), and from the various discussions around L1TF, there are
> definitely differences between wait-for-SIPI and mwait.

Well, that's a comment on patch 5 alone, which for that very reason
I've put last in the series. Do I imply your response to mean patches
2 and 3 can have your ack (4 already has and 1 has already gone in)?

Also you're comparing to MWAIT, but such CPUs would more likely get
put in a loop over HLT (as patch 5's description also says). That's
(part of) the background of why I'd like AMD to provide data to enable
mwait-idle for their CPUs.

As to the actual question raised: AMD (first and foremost), Intel?

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] x86/idle: re-arrange dead-idle handling
  2018-08-01 14:31 ` [PATCH 2/5] x86/idle: re-arrange dead-idle handling Jan Beulich
@ 2018-09-07 17:08   ` Andrew Cooper
  2018-09-10 10:13     ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: Andrew Cooper @ 2018-09-07 17:08 UTC (permalink / raw)
  To: Jan Beulich, xen-devel

On 01/08/18 15:31, Jan Beulich wrote:
> In order to be able to wake parked CPUs from default_dead_idle(), the
> function must not itself loop. Move the loop into play_dead(), and use
> play_dead() as well on the AP boot error path.
>
> Furthermore, not the least considering the comment in play_dead(),
> make sure NMI raised (for now this would be a bug elsewhere, but that's
> about to change) against a parked or fully offline CPU won't invoke the
> actual, full-blown NMI handler.

I'm concerned by this.  It isn't necessarily a bug elsewhere, because
the CPU can still participate in SMM, and still have the SMM handler
raise an NMI.

Equally, it may still be able to service #MC's, so I can't see how it is
safe for us to ever free the percpu data.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] x86/idle: re-arrange dead-idle handling
  2018-09-07 17:08   ` Andrew Cooper
@ 2018-09-10 10:13     ` Jan Beulich
  2018-10-26 10:55       ` Ping: " Jan Beulich
  2018-12-05 20:33       ` Andrew Cooper
  0 siblings, 2 replies; 33+ messages in thread
From: Jan Beulich @ 2018-09-10 10:13 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 07.09.18 at 19:08, <andrew.cooper3@citrix.com> wrote:
> On 01/08/18 15:31, Jan Beulich wrote:
>> In order to be able to wake parked CPUs from default_dead_idle(), the
>> function must not itself loop. Move the loop into play_dead(), and use
>> play_dead() as well on the AP boot error path.
>>
>> Furthermore, not the least considering the comment in play_dead(),
>> make sure NMI raised (for now this would be a bug elsewhere, but that's
>> about to change) against a parked or fully offline CPU won't invoke the
>> actual, full-blown NMI handler.
> 
> I'm concerned by this.  It isn't necessarily a bug elsewhere, because
> the CPU can still participate in SMM, and still have the SMM handler
> raise an NMI.

What significance does an NMI have when raised towards an offline
CPU? If SMM does this, then I'd very much guess this is happening in
a broadcast fashion (otherwise they'd surely raised it against CPU 0),
and in that case NOP-ing the effect for parked CPUs seems quite
appropriate to me.

> Equally, it may still be able to service #MC's, so I can't see how it is
> safe for us to ever free the percpu data.

I'm having trouble seeing how this remark relates to the series here.
Plus it's a theoretical problem at present only anyway:
- physical hot remove is not implemented (there's no source of the
  new CPU_REMOVE notification),
- Intel CPUs get parked, i.e. never have their per-CPU data freed,
- AMD CPUs don't broadcast #MC.

Bottom line for both parts of your reply: I don't see any proposal /
request of what you think needs changing, and how. Unless,
perhaps, you're suggesting the entire series is rubbish, in which
case I'd like you to propose an alternative approach to address
the problem of parking CPUs in C1 instead of deepest possible
C-states.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Ping: Re: [PATCH 2/5] x86/idle: re-arrange dead-idle handling
  2018-09-10 10:13     ` Jan Beulich
@ 2018-10-26 10:55       ` Jan Beulich
  2018-12-05 20:33       ` Andrew Cooper
  1 sibling, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2018-10-26 10:55 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 10.09.18 at 12:13,  wrote:
>>>> On 07.09.18 at 19:08, <andrew.cooper3@citrix.com> wrote:
> > On 01/08/18 15:31, Jan Beulich wrote:
> >> In order to be able to wake parked CPUs from default_dead_idle(), the
> >> function must not itself loop. Move the loop into play_dead(), and use
> >> play_dead() as well on the AP boot error path.
> >>
> >> Furthermore, not the least considering the comment in play_dead(),
> >> make sure NMI raised (for now this would be a bug elsewhere, but that's
> >> about to change) against a parked or fully offline CPU won't invoke the
> >> actual, full-blown NMI handler.
> > 
> > I'm concerned by this.  It isn't necessarily a bug elsewhere, because
> > the CPU can still participate in SMM, and still have the SMM handler
> > raise an NMI.
> 
> What significance does an NMI have when raised towards an offline
> CPU? If SMM does this, then I'd very much guess this is happening in
> a broadcast fashion (otherwise they'd surely raised it against CPU 0),
> and in that case NOP-ing the effect for parked CPUs seems quite
> appropriate to me.
> 
> > Equally, it may still be able to service #MC's, so I can't see how it is
> > safe for us to ever free the percpu data.
> 
> I'm having trouble seeing how this remark relates to the series here.
> Plus it's a theoretical problem at present only anyway:
> - physical hot remove is not implemented (there's no source of the
>   new CPU_REMOVE notification),
> - Intel CPUs get parked, i.e. never have their per-CPU data freed,
> - AMD CPUs don't broadcast #MC.
> 
> Bottom line for both parts of your reply: I don't see any proposal /
> request of what you think needs changing, and how. Unless,
> perhaps, you're suggesting the entire series is rubbish, in which
> case I'd like you to propose an alternative approach to address
> the problem of parking CPUs in C1 instead of deepest possible
> C-states.
> 
> Jan
> 
> 





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Ping: [PATCH 3/5] x86/cpuidle: push parked CPUs into deeper sleep states when possible
  2018-08-01 14:32 ` [PATCH 3/5] x86/cpuidle: push parked CPUs into deeper sleep states when possible Jan Beulich
@ 2018-10-26 10:56   ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2018-10-26 10:56 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 01.08.18 at 16:32,  wrote:
> When the mwait-idle driver isn't used, C-state information becomes
> available only in the course of Dom0 starting up. Use the provided data
> to allow parked CPUs to sleep in a more energy efficient way, by waking
> them briefly (via NMI) once the data has been recorded.
> 
> This involves re-arranging how/when the governor's ->enable() hook gets
> invoked. The changes there include addition of so far missing error
> handling in the respective CPU notifier handlers.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/arch/x86/acpi/cpu_idle.c
> +++ b/xen/arch/x86/acpi/cpu_idle.c
> @@ -351,12 +351,22 @@ static void dump_cx(unsigned char key)
>      unsigned int cpu;
>  
>      printk("'%c' pressed -> printing ACPI Cx structures\n", key);
> -    for_each_online_cpu ( cpu )
> -        if (processor_powers[cpu])
> -        {
> -            print_acpi_power(cpu, processor_powers[cpu]);
> -            process_pending_softirqs();
> -        }
> +    for_each_present_cpu ( cpu )
> +    {
> +        struct acpi_processor_power *power = processor_powers[cpu];
> +
> +        if ( !power )
> +            continue;
> +
> +        if ( cpu_online(cpu) )
> +            print_acpi_power(cpu, power);
> +        else if ( park_offline_cpus )
> +            printk("CPU%u parked in state %u (C%u)\n", cpu,
> +                   power->last_state ? power->last_state->idx : 1,
> +                   power->last_state ? power->last_state->type : 1);
> +
> +        process_pending_softirqs();
> +    }
>  }
>  
>  static int __init cpu_idle_key_init(void)
> @@ -764,6 +774,7 @@ void acpi_dead_idle(void)
>          goto default_halt;
>  
>      cx = &power->states[power->count - 1];
> +    power->last_state = cx;
>  
>      if ( cx->entry_method == ACPI_CSTATE_EM_FFH )
>      {
> @@ -1216,9 +1227,30 @@ long set_cx_pminfo(uint32_t acpi_id, str
>          set_cx(acpi_power, &xen_cx);
>      }
>  
> -    if ( cpuidle_current_governor->enable &&
> -         cpuidle_current_governor->enable(acpi_power) )
> -        return -EFAULT;
> +    if ( !cpu_online(cpu_id) )
> +    {
> +        uint32_t apic_id = x86_cpu_to_apicid[cpu_id];
> +
> +        /*
> +         * If we've just learned of more available C states, wake the CPU 
> if
> +         * it's parked, so it can go back to sleep in perhaps a deeper 
> state.
> +         */
> +        if ( park_offline_cpus && apic_id != BAD_APICID )
> +        {
> +            unsigned long flags;
> +
> +            local_irq_save(flags);
> +            apic_wait_icr_idle();
> +            apic_icr_write(APIC_DM_NMI | APIC_DEST_PHYSICAL, apic_id);
> +            local_irq_restore(flags);
> +        }
> +    }
> +    else if ( cpuidle_current_governor->enable )
> +    {
> +        ret = cpuidle_current_governor->enable(acpi_power);
> +        if ( ret < 0 )
> +            return ret;
> +    }
>  
>      /* FIXME: C-state dependency is not supported by far */
>  
> @@ -1378,19 +1410,22 @@ static int cpu_callback(
>      struct notifier_block *nfb, unsigned long action, void *hcpu)
>  {
>      unsigned int cpu = (unsigned long)hcpu;
> +    int rc = 0;
>  
> -    /* Only hook on CPU_ONLINE because a dead cpu may utilize the info to
> -     * to enter deep C-state */
> +    /*
> +     * Only hook on CPU_UP_PREPARE because a dead cpu may utilize the info
> +     * to enter deep C-state.
> +     */
>      switch ( action )
>      {
> -    case CPU_ONLINE:
> -        (void)cpuidle_init_cpu(cpu);
> -        break;
> -    default:
> +    case CPU_UP_PREPARE:
> +        rc = cpuidle_init_cpu(cpu);
> +        if ( !rc && cpuidle_current_governor->enable )
> +            rc = cpuidle_current_governor->enable(processor_powers[cpu]);
>          break;
>      }
>  
> -    return NOTIFY_DONE;
> +    return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
>  }
>  
>  static struct notifier_block cpu_nfb = {
> @@ -1405,6 +1440,7 @@ static int __init cpuidle_presmp_init(vo
>          return 0;
>  
>      mwait_idle_init(&cpu_nfb);
> +    cpu_nfb.notifier_call(&cpu_nfb, CPU_UP_PREPARE, cpu);
>      cpu_nfb.notifier_call(&cpu_nfb, CPU_ONLINE, cpu);
>      register_cpu_notifier(&cpu_nfb);
>      return 0;
> --- a/xen/arch/x86/acpi/cpuidle_menu.c
> +++ b/xen/arch/x86/acpi/cpuidle_menu.c
> @@ -277,9 +277,6 @@ static void menu_reflect(struct acpi_pro
>  
>  static int menu_enable_device(struct acpi_processor_power *power)
>  {
> -    if (!cpu_online(power->cpu))
> -        return -1;
> -
>      memset(&per_cpu(menu_devices, power->cpu), 0, sizeof(struct 
> menu_device));
>  
>      return 0;
> --- a/xen/arch/x86/cpu/mwait-idle.c
> +++ b/xen/arch/x86/cpu/mwait-idle.c
> @@ -1162,12 +1162,17 @@ static int mwait_idle_cpu_init(struct no
>  	struct acpi_processor_power *dev = processor_powers[cpu];
>  
>  	switch (action) {
> +		int rc;
> +
>  	default:
>  		return NOTIFY_DONE;
>  
>  	case CPU_UP_PREPARE:
> -		cpuidle_init_cpu(cpu);
> -		return NOTIFY_DONE;
> +		rc = cpuidle_init_cpu(cpu);
> +		dev = processor_powers[cpu];
> +		if (!rc && cpuidle_current_governor->enable)
> +			rc = cpuidle_current_governor->enable(dev);
> +		return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
>  
>  	case CPU_ONLINE:
>  		if (!dev)
> @@ -1256,8 +1261,6 @@ int __init mwait_idle_init(struct notifi
>  	}
>  	if (!err) {
>  		nfb->notifier_call = mwait_idle_cpu_init;
> -		mwait_idle_cpu_init(nfb, CPU_UP_PREPARE, NULL);
> -
>  		pm_idle_save = pm_idle;
>  		pm_idle = mwait_idle;
>  		dead_idle = acpi_dead_idle;
> 
> 
> 
> 




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] x86/idle: re-arrange dead-idle handling
  2018-09-10 10:13     ` Jan Beulich
  2018-10-26 10:55       ` Ping: " Jan Beulich
@ 2018-12-05 20:33       ` Andrew Cooper
  2018-12-06  8:16         ` Jan Beulich
  1 sibling, 1 reply; 33+ messages in thread
From: Andrew Cooper @ 2018-12-05 20:33 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On 10/09/2018 11:13, Jan Beulich wrote:
>
>> Equally, it may still be able to service #MC's, so I can't see how it is
>> safe for us to ever free the percpu data.
> I'm having trouble seeing how this remark relates to the series here.

Because you've tried to make NMIs safe, but not made equivalent
adjustments to the #MC side of things.

> Plus it's a theoretical problem at present only anyway:
> - physical hot remove is not implemented (there's no source of the
>   new CPU_REMOVE notification),
> - Intel CPUs get parked, i.e. never have their per-CPU data freed,
> - AMD CPUs don't broadcast #MC.

Ignoring MCE's is never an option, but whenever CR4.MCE is set, we must
be prepared to handle #MC.  Just because an AMD CPU is playing dead
doesn't mean it is immune to receiving #MC's.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] x86/idle: re-arrange dead-idle handling
  2018-12-05 20:33       ` Andrew Cooper
@ 2018-12-06  8:16         ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2018-12-06  8:16 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 05.12.18 at 21:33, <andrew.cooper3@citrix.com> wrote:
> On 10/09/2018 11:13, Jan Beulich wrote:
>>
>>> Equally, it may still be able to service #MC's, so I can't see how it is
>>> safe for us to ever free the percpu data.
>> I'm having trouble seeing how this remark relates to the series here.
> 
> Because you've tried to make NMIs safe, but not made equivalent
> adjustments to the #MC side of things.

Explain to me how this is getting worse with the patch in question.
It doesn't alter under what conditions per-CPU data gets freed.
Of course I can short-circuit the #MC handler just like I do for the
NMI one, but that's only going to delay shutdown of the core
until a second #MC surfaces (as the first one would never get
dealt with).

>> Plus it's a theoretical problem at present only anyway:
>> - physical hot remove is not implemented (there's no source of the
>>   new CPU_REMOVE notification),
>> - Intel CPUs get parked, i.e. never have their per-CPU data freed,
>> - AMD CPUs don't broadcast #MC.
> 
> Ignoring MCE's is never an option, but whenever CR4.MCE is set, we must
> be prepared to handle #MC.  Just because an AMD CPU is playing dead
> doesn't mean it is immune to receiving #MC's.

Again - this is nothing the patch here changes in any way. It's
not clear to me whether clearing CR4.MCE is an option on AMD
CPUs.

Anything beyond wiring #MC into trap_nop() (please let me
know if that's what you want to see added) should imo not
be part of this patch, and again imo doesn't even have to be
part of this series.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Ping: [PATCH 0/5] x86: more power-efficient CPU parking
       [not found]           ` <5B61C2120200007800224310@prv1-mh.provo.novell.com>
@ 2019-04-03 10:12             ` Jan Beulich
  2019-04-03 11:14               ` Andrew Cooper
       [not found]             ` <5B61C2120200000000101EDC@prv1-mh.provo.novell.com>
  1 sibling, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2019-04-03 10:12 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monne

>>> On 01.08.18 at 16:22,  wrote:
> When putting CPUs to sleep permanently, we should try to put them into
> the most power conserving state possible. For now it is unclear whether,
> especially in a deep C-state, the P-state also matters, so this series only
> arranges for the C-state side of things (plus some cleanup).
> 
> 1: x86/cpuidle: replace a pointless NULL check
> 2: x86/idle: re-arrange dead-idle handling
> 3: x86/cpuidle: push parked CPUs into deeper sleep states when possible
> 4: x86/cpuidle: clean up Cx dumping
> 5: x86: place non-parked CPUs into wait-for-SIPI state after offlining

So patch 5 is understandably controversial, and I'm explicitly
excluding it from the ping. Patch 1 has gone in long ago and
patch 4 has been acked. While there was some feedback on
2 (albeit stalled then after my reply), I don't think I've had any
substantial feedback on 3 though, yet _they_ are supposed to
provide the main improvement here. Patch 5 really was meant
to be more optional than I had expressed, hence its placement
even after the pure cleanup patch 4.

The series has been pending for well over half a year now,
yet other than for the indirect call patching series (where it
already doesn't feel really well) I don't want to propose
simply committing this in the absences of sufficient feedback.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Ping: [PATCH 0/5] x86: more power-efficient CPU parking
  2019-04-03 10:12             ` Jan Beulich
@ 2019-04-03 11:14               ` Andrew Cooper
  2019-04-03 12:43                 ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: Andrew Cooper @ 2019-04-03 11:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Roger Pau Monne

On 03/04/2019 11:12, Jan Beulich wrote:
>>>> On 01.08.18 at 16:22,  wrote:
>> When putting CPUs to sleep permanently, we should try to put them into
>> the most power conserving state possible. For now it is unclear whether,
>> especially in a deep C-state, the P-state also matters, so this series only
>> arranges for the C-state side of things (plus some cleanup).
>>
>> 1: x86/cpuidle: replace a pointless NULL check
>> 2: x86/idle: re-arrange dead-idle handling
>> 3: x86/cpuidle: push parked CPUs into deeper sleep states when possible
>> 4: x86/cpuidle: clean up Cx dumping
>> 5: x86: place non-parked CPUs into wait-for-SIPI state after offlining
> So patch 5 is understandably controversial, and I'm explicitly
> excluding it from the ping.

Considering that it causes EFI firmware to explode in several
interesting ways, I'm afraid it is a complete nonstarter.

> Patch 1 has gone in long ago and
> patch 4 has been acked. While there was some feedback on
> 2 (albeit stalled then after my reply), I don't think I've had any
> substantial feedback on 3 though, yet _they_ are supposed to
> provide the main improvement here. Patch 5 really was meant
> to be more optional than I had expressed, hence its placement
> even after the pure cleanup patch 4.

Some of the patches have cycled out of my inbox, so I'm reviewing from
the mail archive.

2) Now that default_dead_idle() isn't a terminal function, you must use
spec_ctrl_enter_idle() on the way back out for safety.

I'm still going to insist on something about #MC, even if it is just a
note in the commit message saying "this doesn't yet make #MC any safer
for parked CPUs".

The real problem is that your comment about NMI now makes the code read
as if everything is safe, and while this patch is an improvement (and
therefore ok on its own merits), it doesn't actually make the result #MC
safe.

3) Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Ping: [PATCH 0/5] x86: more power-efficient CPU parking
  2019-04-03 11:14               ` Andrew Cooper
@ 2019-04-03 12:43                 ` Jan Beulich
  2019-04-03 14:44                   ` Andrew Cooper
  0 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2019-04-03 12:43 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monne

>>> On 03.04.19 at 13:14, <andrew.cooper3@citrix.com> wrote:
> On 03/04/2019 11:12, Jan Beulich wrote:
>>>>> On 01.08.18 at 16:22,  wrote:
>>> When putting CPUs to sleep permanently, we should try to put them into
>>> the most power conserving state possible. For now it is unclear whether,
>>> especially in a deep C-state, the P-state also matters, so this series only
>>> arranges for the C-state side of things (plus some cleanup).
>>>
>>> 1: x86/cpuidle: replace a pointless NULL check
>>> 2: x86/idle: re-arrange dead-idle handling
>>> 3: x86/cpuidle: push parked CPUs into deeper sleep states when possible
>>> 4: x86/cpuidle: clean up Cx dumping
>>> 5: x86: place non-parked CPUs into wait-for-SIPI state after offlining
>> So patch 5 is understandably controversial, and I'm explicitly
>> excluding it from the ping.
> 
> Considering that it causes EFI firmware to explode in several
> interesting ways, I'm afraid it is a complete nonstarter.

I didn't know this - neither of my two EFI boxes have exploded in
any way during the last half year. Care to share details?

>> Patch 1 has gone in long ago and
>> patch 4 has been acked. While there was some feedback on
>> 2 (albeit stalled then after my reply), I don't think I've had any
>> substantial feedback on 3 though, yet _they_ are supposed to
>> provide the main improvement here. Patch 5 really was meant
>> to be more optional than I had expressed, hence its placement
>> even after the pure cleanup patch 4.
> 
> Some of the patches have cycled out of my inbox, so I'm reviewing from
> the mail archive.
> 
> 2) Now that default_dead_idle() isn't a terminal function, you must use
> spec_ctrl_enter_idle() on the way back out for safety.

Ah, yes, perhaps better to add it, than to implicitly rely on callers
(there's really exactly one) to be terminal. (I take it you mean
spec_ctrl_exit_idle().)

> I'm still going to insist on something about #MC, even if it is just a
> note in the commit message saying "this doesn't yet make #MC any safer
> for parked CPUs".

I can add something like this, but what is it that's known unsafe
for parked CPUs? They in particular retain their per-CPU data. If
anything I see a problem with fully offline CPUs. For now I'll add
a similar sentence, but with "parked" replaced.

One other question (I apparently forgot about this aspect
between putting together the series and posting it):
acpi_dead_idle() has built-in loops as well. While it's not
expected for a CPU to need waking from there (as no "even
better" dead-idle handler could get installed) I wonder whether
for consistency we wouldn't better drop the loops there too.
The downside of doing so would be added overhead in case
of spurious wakeups (which ought to have a small chance of
being possible in particular in the MWAIT case).

> 3) Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Thanks!

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Ping: [PATCH 0/5] x86: more power-efficient CPU parking
  2019-04-03 12:43                 ` Jan Beulich
@ 2019-04-03 14:44                   ` Andrew Cooper
  2019-04-03 15:20                     ` Jan Beulich
  0 siblings, 1 reply; 33+ messages in thread
From: Andrew Cooper @ 2019-04-03 14:44 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Roger Pau Monne


[-- Attachment #1.1: Type: text/plain, Size: 4065 bytes --]

On 03/04/2019 13:43, Jan Beulich wrote:
>>>> On 03.04.19 at 13:14, <andrew.cooper3@citrix.com> wrote:
>> On 03/04/2019 11:12, Jan Beulich wrote:
>>>>>> On 01.08.18 at 16:22,  wrote:
>>>> When putting CPUs to sleep permanently, we should try to put them into
>>>> the most power conserving state possible. For now it is unclear whether,
>>>> especially in a deep C-state, the P-state also matters, so this series only
>>>> arranges for the C-state side of things (plus some cleanup).
>>>>
>>>> 1: x86/cpuidle: replace a pointless NULL check
>>>> 2: x86/idle: re-arrange dead-idle handling
>>>> 3: x86/cpuidle: push parked CPUs into deeper sleep states when possible
>>>> 4: x86/cpuidle: clean up Cx dumping
>>>> 5: x86: place non-parked CPUs into wait-for-SIPI state after offlining
>>> So patch 5 is understandably controversial, and I'm explicitly
>>> excluding it from the ping.
>> Considering that it causes EFI firmware to explode in several
>> interesting ways, I'm afraid it is a complete nonstarter.
> I didn't know this - neither of my two EFI boxes have exploded in
> any way during the last half year. Care to share details?

It was an assertion failure when the CPU failed to call into the SMM
rendezvous.

LogLibaErrorLogSmmLib.c(276): ((BOOLEAN)(0==1))

This is a production Dell system IIRC (or maybe Supermicro, but either
way, a production firmware).

In retrospect, fully offlining a CPU behind the back of the firmware is
an extremely antisocial thing to do, and I'm not surprised that the
firmware doesn't tolerate it.

>>> Patch 1 has gone in long ago and
>>> patch 4 has been acked. While there was some feedback on
>>> 2 (albeit stalled then after my reply), I don't think I've had any
>>> substantial feedback on 3 though, yet _they_ are supposed to
>>> provide the main improvement here. Patch 5 really was meant
>>> to be more optional than I had expressed, hence its placement
>>> even after the pure cleanup patch 4.
>> Some of the patches have cycled out of my inbox, so I'm reviewing from
>> the mail archive.
>>
>> 2) Now that default_dead_idle() isn't a terminal function, you must use
>> spec_ctrl_enter_idle() on the way back out for safety.
> Ah, yes, perhaps better to add it, than to implicitly rely on callers
> (there's really exactly one) to be terminal. (I take it you mean
> spec_ctrl_exit_idle().)

I did, sorry.

>> I'm still going to insist on something about #MC, even if it is just a
>> note in the commit message saying "this doesn't yet make #MC any safer
>> for parked CPUs".
> I can add something like this, but what is it that's known unsafe
> for parked CPUs? They in particular retain their per-CPU data. If
> anything I see a problem with fully offline CPUs. For now I'll add
> a similar sentence, but with "parked" replaced.

Ah ok - my mistake.  Basically, for any CPU we have ever started, we
need to maintain a valid stack because we can take NMIs and MCEs.

This is also a strict requirement placed on the BIOS even for APs by the
various Bios Writers Guides.  At a minimum, this is the IVT and enough
stack to handle NMIs with just an IRET, and an SMM stack.

Even for AMD CPUs which don't broadcast MCEs, we have a non-zero chance
of legitimately taking an #MC on a core we have recently downed, for a
previous action which has taken a while to propagate back.

> One other question (I apparently forgot about this aspect
> between putting together the series and posting it):
> acpi_dead_idle() has built-in loops as well. While it's not
> expected for a CPU to need waking from there (as no "even
> better" dead-idle handler could get installed) I wonder whether
> for consistency we wouldn't better drop the loops there too.

I think that would be a good idea, along with a similar speculative
adjustment.

> The downside of doing so would be added overhead in case
> of spurious wakeups (which ought to have a small chance of
> being possible in particular in the MWAIT case).

I really don't think that is a concern.  I don't think you'll be able to
measure the difference in the noise.

~Andrew

[-- Attachment #1.2: Type: text/html, Size: 6469 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: Ping: [PATCH 0/5] x86: more power-efficient CPU parking
  2019-04-03 14:44                   ` Andrew Cooper
@ 2019-04-03 15:20                     ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2019-04-03 15:20 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monne

>>> On 03.04.19 at 16:44, <andrew.cooper3@citrix.com> wrote:
> On 03/04/2019 13:43, Jan Beulich wrote:
>>>>> On 03.04.19 at 13:14, <andrew.cooper3@citrix.com> wrote:
>>> On 03/04/2019 11:12, Jan Beulich wrote:
>>>>>>> On 01.08.18 at 16:22,  wrote:
>>>>> When putting CPUs to sleep permanently, we should try to put them into
>>>>> the most power conserving state possible. For now it is unclear whether,
>>>>> especially in a deep C-state, the P-state also matters, so this series only
>>>>> arranges for the C-state side of things (plus some cleanup).
>>>>>
>>>>> 1: x86/cpuidle: replace a pointless NULL check
>>>>> 2: x86/idle: re-arrange dead-idle handling
>>>>> 3: x86/cpuidle: push parked CPUs into deeper sleep states when possible
>>>>> 4: x86/cpuidle: clean up Cx dumping
>>>>> 5: x86: place non-parked CPUs into wait-for-SIPI state after offlining
>>>> So patch 5 is understandably controversial, and I'm explicitly
>>>> excluding it from the ping.
>>> Considering that it causes EFI firmware to explode in several
>>> interesting ways, I'm afraid it is a complete nonstarter.
>> I didn't know this - neither of my two EFI boxes have exploded in
>> any way during the last half year. Care to share details?
> 
> It was an assertion failure when the CPU failed to call into the SMM
> rendezvous.
> 
> LogLibaErrorLogSmmLib.c(276): ((BOOLEAN)(0==1))
> 
> This is a production Dell system IIRC (or maybe Supermicro, but either
> way, a production firmware).
> 
> In retrospect, fully offlining a CPU behind the back of the firmware is
> an extremely antisocial thing to do, and I'm not surprised that the
> firmware doesn't tolerate it.

Oh, I see. I withdraw this patch then.

>> One other question (I apparently forgot about this aspect
>> between putting together the series and posting it):
>> acpi_dead_idle() has built-in loops as well. While it's not
>> expected for a CPU to need waking from there (as no "even
>> better" dead-idle handler could get installed) I wonder whether
>> for consistency we wouldn't better drop the loops there too.
> 
> I think that would be a good idea, along with a similar speculative
> adjustment.

Will do.

>> The downside of doing so would be added overhead in case
>> of spurious wakeups (which ought to have a small chance of
>> being possible in particular in the MWAIT case).
> 
> I really don't think that is a concern.  I don't think you'll be able to
> measure the difference in the noise.

Well, not over any extended period of time. But a single WBINVD
may be quite noticeable to at least the other thread while it
executes.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 0/3] x86: more power-efficient CPU parking
       [not found]               ` <5B61C212020000780022FF0D@prv1-mh.provo.novell.com>
@ 2019-05-17 10:10                 ` Jan Beulich
  2019-05-17 10:10                   ` [Xen-devel] " Jan Beulich
                                     ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Jan Beulich @ 2019-05-17 10:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

When putting CPUs to sleep permanently, we should try to put them into
the most power conserving state possible. For now it is unclear whether,
especially in a deep C-state, the P-state also matters, so this series only
arranges for the C-state side of things (plus some cleanup).

1: x86/idle: re-arrange dead-idle handling
2: x86/cpuidle: push parked CPUs into deeper sleep states when possible
3: x86/cpuidle: clean up Cx dumping

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Xen-devel] [PATCH v2 0/3] x86: more power-efficient CPU parking
  2019-05-17 10:10                 ` [PATCH v2 0/3] " Jan Beulich
@ 2019-05-17 10:10                   ` Jan Beulich
  2019-05-17 10:11                   ` [PATCH v2 1/3] x86/idle: re-arrange dead-idle handling Jan Beulich
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2019-05-17 10:10 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

When putting CPUs to sleep permanently, we should try to put them into
the most power conserving state possible. For now it is unclear whether,
especially in a deep C-state, the P-state also matters, so this series only
arranges for the C-state side of things (plus some cleanup).

1: x86/idle: re-arrange dead-idle handling
2: x86/cpuidle: push parked CPUs into deeper sleep states when possible
3: x86/cpuidle: clean up Cx dumping

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 1/3] x86/idle: re-arrange dead-idle handling
  2019-05-17 10:10                 ` [PATCH v2 0/3] " Jan Beulich
  2019-05-17 10:10                   ` [Xen-devel] " Jan Beulich
@ 2019-05-17 10:11                   ` Jan Beulich
  2019-05-17 10:11                     ` [Xen-devel] " Jan Beulich
  2019-05-20 14:25                     ` Andrew Cooper
  2019-05-17 10:12                   ` [PATCH v2 2/3] x86/cpuidle: push parked CPUs into deeper sleep states when possible Jan Beulich
  2019-05-17 10:12                   ` [PATCH v2 3/3] x86/cpuidle: clean up Cx dumping Jan Beulich
  3 siblings, 2 replies; 33+ messages in thread
From: Jan Beulich @ 2019-05-17 10:11 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

In order to be able to wake parked CPUs from default_dead_idle() (for
them to then enter a different dead-idle routine), the function should
not itself loop. Move the loop into play_dead(), and use play_dead() as
well on the AP boot error path.

Furthermore, not the least considering the comment in play_dead(),
make sure NMI raised (for now this would be a bug elsewhere, but that's
about to change) against a parked or fully offline CPU won't invoke the
actual, full-blown NMI handler.

Note however that this doesn't make #MC any safer for fully offline
CPUs.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Add spec_ctrl_exit_idle() to default_dead_idle(). Add #MC related
    remark to description.
---
Note: I had to drop the discussed acpi_dead_idle() adjustment again, as
      it breaks booting with "smt=0" and "maxcpus=" on at least one of
      my systems. I've not yet managed to understand why that would be.

--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -100,14 +100,20 @@ void default_dead_idle(void)
      */
     spec_ctrl_enter_idle(get_cpu_info());
     wbinvd();
-    for ( ; ; )
-        halt();
+    halt();
+    spec_ctrl_exit_idle(get_cpu_info());
 }
 
-static void play_dead(void)
+void play_dead(void)
 {
+    unsigned int cpu = smp_processor_id();
+
     local_irq_disable();
 
+    /* Change the NMI handler to a nop (see comment below). */
+    _set_gate_lower(&idt_tables[cpu][TRAP_nmi], SYS_DESC_irq_gate, 0,
+                    &trap_nop);
+
     /*
      * NOTE: After cpu_exit_clear, per-cpu variables may no longer accessible,
      * as they may be freed at any time if offline CPUs don't get parked. In
@@ -118,9 +124,10 @@ static void play_dead(void)
      * Consider very carefully when adding code to *dead_idle. Most hypervisor
      * subsystems are unsafe to call.
      */
-    cpu_exit_clear(smp_processor_id());
+    cpu_exit_clear(cpu);
 
-    (*dead_idle)();
+    for ( ; ; )
+        dead_idle();
 }
 
 static void idle_loop(void)
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -33,6 +33,7 @@
 #include <xen/serial.h>
 #include <xen/numa.h>
 #include <xen/cpu.h>
+#include <asm/cpuidle.h>
 #include <asm/current.h>
 #include <asm/mc146818rtc.h>
 #include <asm/desc.h>
@@ -209,8 +210,7 @@ static void smp_callin(void)
     halt:
         clear_local_APIC();
         spin_debug_enable();
-        cpu_exit_clear(cpu);
-        (*dead_idle)();
+        play_dead();
     }
 
     /* Allow the master to continue. */
--- a/xen/include/asm-x86/cpuidle.h
+++ b/xen/include/asm-x86/cpuidle.h
@@ -20,6 +20,7 @@ int mwait_idle_init(struct notifier_bloc
 int cpuidle_init_cpu(unsigned int cpu);
 void default_dead_idle(void);
 void acpi_dead_idle(void);
+void play_dead(void);
 void trace_exit_reason(u32 *irq_traced);
 void update_idle_stats(struct acpi_processor_power *,
                        struct acpi_processor_cx *, uint64_t, uint64_t);





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Xen-devel] [PATCH v2 1/3] x86/idle: re-arrange dead-idle handling
  2019-05-17 10:11                   ` [PATCH v2 1/3] x86/idle: re-arrange dead-idle handling Jan Beulich
@ 2019-05-17 10:11                     ` Jan Beulich
  2019-05-20 14:25                     ` Andrew Cooper
  1 sibling, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2019-05-17 10:11 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

In order to be able to wake parked CPUs from default_dead_idle() (for
them to then enter a different dead-idle routine), the function should
not itself loop. Move the loop into play_dead(), and use play_dead() as
well on the AP boot error path.

Furthermore, not the least considering the comment in play_dead(),
make sure NMI raised (for now this would be a bug elsewhere, but that's
about to change) against a parked or fully offline CPU won't invoke the
actual, full-blown NMI handler.

Note however that this doesn't make #MC any safer for fully offline
CPUs.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Add spec_ctrl_exit_idle() to default_dead_idle(). Add #MC related
    remark to description.
---
Note: I had to drop the discussed acpi_dead_idle() adjustment again, as
      it breaks booting with "smt=0" and "maxcpus=" on at least one of
      my systems. I've not yet managed to understand why that would be.

--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -100,14 +100,20 @@ void default_dead_idle(void)
      */
     spec_ctrl_enter_idle(get_cpu_info());
     wbinvd();
-    for ( ; ; )
-        halt();
+    halt();
+    spec_ctrl_exit_idle(get_cpu_info());
 }
 
-static void play_dead(void)
+void play_dead(void)
 {
+    unsigned int cpu = smp_processor_id();
+
     local_irq_disable();
 
+    /* Change the NMI handler to a nop (see comment below). */
+    _set_gate_lower(&idt_tables[cpu][TRAP_nmi], SYS_DESC_irq_gate, 0,
+                    &trap_nop);
+
     /*
      * NOTE: After cpu_exit_clear, per-cpu variables may no longer accessible,
      * as they may be freed at any time if offline CPUs don't get parked. In
@@ -118,9 +124,10 @@ static void play_dead(void)
      * Consider very carefully when adding code to *dead_idle. Most hypervisor
      * subsystems are unsafe to call.
      */
-    cpu_exit_clear(smp_processor_id());
+    cpu_exit_clear(cpu);
 
-    (*dead_idle)();
+    for ( ; ; )
+        dead_idle();
 }
 
 static void idle_loop(void)
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -33,6 +33,7 @@
 #include <xen/serial.h>
 #include <xen/numa.h>
 #include <xen/cpu.h>
+#include <asm/cpuidle.h>
 #include <asm/current.h>
 #include <asm/mc146818rtc.h>
 #include <asm/desc.h>
@@ -209,8 +210,7 @@ static void smp_callin(void)
     halt:
         clear_local_APIC();
         spin_debug_enable();
-        cpu_exit_clear(cpu);
-        (*dead_idle)();
+        play_dead();
     }
 
     /* Allow the master to continue. */
--- a/xen/include/asm-x86/cpuidle.h
+++ b/xen/include/asm-x86/cpuidle.h
@@ -20,6 +20,7 @@ int mwait_idle_init(struct notifier_bloc
 int cpuidle_init_cpu(unsigned int cpu);
 void default_dead_idle(void);
 void acpi_dead_idle(void);
+void play_dead(void);
 void trace_exit_reason(u32 *irq_traced);
 void update_idle_stats(struct acpi_processor_power *,
                        struct acpi_processor_cx *, uint64_t, uint64_t);





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 2/3] x86/cpuidle: push parked CPUs into deeper sleep states when possible
  2019-05-17 10:10                 ` [PATCH v2 0/3] " Jan Beulich
  2019-05-17 10:10                   ` [Xen-devel] " Jan Beulich
  2019-05-17 10:11                   ` [PATCH v2 1/3] x86/idle: re-arrange dead-idle handling Jan Beulich
@ 2019-05-17 10:12                   ` Jan Beulich
  2019-05-17 10:12                     ` [Xen-devel] " Jan Beulich
  2019-05-17 10:12                   ` [PATCH v2 3/3] x86/cpuidle: clean up Cx dumping Jan Beulich
  3 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2019-05-17 10:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

When the mwait-idle driver isn't used, C-state information becomes
available only in the course of Dom0 starting up. Use the provided data
to allow parked CPUs to sleep in a more energy efficient way, by waking
them briefly (via NMI) once the data has been recorded.

This involves re-arranging how/when the governor's ->enable() hook gets
invoked. The changes there include addition of so far missing error
handling in the respective CPU notifier handlers.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -351,12 +351,22 @@ static void dump_cx(unsigned char key)
     unsigned int cpu;
 
     printk("'%c' pressed -> printing ACPI Cx structures\n", key);
-    for_each_online_cpu ( cpu )
-        if (processor_powers[cpu])
-        {
-            print_acpi_power(cpu, processor_powers[cpu]);
-            process_pending_softirqs();
-        }
+    for_each_present_cpu ( cpu )
+    {
+        struct acpi_processor_power *power = processor_powers[cpu];
+
+        if ( !power )
+            continue;
+
+        if ( cpu_online(cpu) )
+            print_acpi_power(cpu, power);
+        else if ( park_offline_cpus )
+            printk("CPU%u parked in state %u (C%u)\n", cpu,
+                   power->last_state ? power->last_state->idx : 1,
+                   power->last_state ? power->last_state->type : 1);
+
+        process_pending_softirqs();
+    }
 }
 
 static int __init cpu_idle_key_init(void)
@@ -765,6 +775,7 @@ void acpi_dead_idle(void)
         goto default_halt;
 
     cx = &power->states[power->count - 1];
+    power->last_state = cx;
 
     if ( cx->entry_method == ACPI_CSTATE_EM_FFH )
     {
@@ -1217,9 +1228,30 @@ long set_cx_pminfo(uint32_t acpi_id, str
         set_cx(acpi_power, &xen_cx);
     }
 
-    if ( cpuidle_current_governor->enable &&
-         cpuidle_current_governor->enable(acpi_power) )
-        return -EFAULT;
+    if ( !cpu_online(cpu_id) )
+    {
+        uint32_t apic_id = x86_cpu_to_apicid[cpu_id];
+
+        /*
+         * If we've just learned of more available C states, wake the CPU if
+         * it's parked, so it can go back to sleep in perhaps a deeper state.
+         */
+        if ( park_offline_cpus && apic_id != BAD_APICID )
+        {
+            unsigned long flags;
+
+            local_irq_save(flags);
+            apic_wait_icr_idle();
+            apic_icr_write(APIC_DM_NMI | APIC_DEST_PHYSICAL, apic_id);
+            local_irq_restore(flags);
+        }
+    }
+    else if ( cpuidle_current_governor->enable )
+    {
+        ret = cpuidle_current_governor->enable(acpi_power);
+        if ( ret < 0 )
+            return ret;
+    }
 
     /* FIXME: C-state dependency is not supported by far */
 
@@ -1379,19 +1411,22 @@ static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
     unsigned int cpu = (unsigned long)hcpu;
+    int rc = 0;
 
-    /* Only hook on CPU_ONLINE because a dead cpu may utilize the info to
-     * to enter deep C-state */
+    /*
+     * Only hook on CPU_UP_PREPARE because a dead cpu may utilize the info
+     * to enter deep C-state.
+     */
     switch ( action )
     {
-    case CPU_ONLINE:
-        (void)cpuidle_init_cpu(cpu);
-        break;
-    default:
+    case CPU_UP_PREPARE:
+        rc = cpuidle_init_cpu(cpu);
+        if ( !rc && cpuidle_current_governor->enable )
+            rc = cpuidle_current_governor->enable(processor_powers[cpu]);
         break;
     }
 
-    return NOTIFY_DONE;
+    return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
 }
 
 static struct notifier_block cpu_nfb = {
@@ -1406,6 +1441,7 @@ static int __init cpuidle_presmp_init(vo
         return 0;
 
     mwait_idle_init(&cpu_nfb);
+    cpu_nfb.notifier_call(&cpu_nfb, CPU_UP_PREPARE, cpu);
     cpu_nfb.notifier_call(&cpu_nfb, CPU_ONLINE, cpu);
     register_cpu_notifier(&cpu_nfb);
     return 0;
--- a/xen/arch/x86/acpi/cpuidle_menu.c
+++ b/xen/arch/x86/acpi/cpuidle_menu.c
@@ -277,9 +277,6 @@ static void menu_reflect(struct acpi_pro
 
 static int menu_enable_device(struct acpi_processor_power *power)
 {
-    if (!cpu_online(power->cpu))
-        return -1;
-
     memset(&per_cpu(menu_devices, power->cpu), 0, sizeof(struct menu_device));
 
     return 0;
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -1166,12 +1166,17 @@ static int mwait_idle_cpu_init(struct no
 	struct acpi_processor_power *dev = processor_powers[cpu];
 
 	switch (action) {
+		int rc;
+
 	default:
 		return NOTIFY_DONE;
 
 	case CPU_UP_PREPARE:
-		cpuidle_init_cpu(cpu);
-		return NOTIFY_DONE;
+		rc = cpuidle_init_cpu(cpu);
+		dev = processor_powers[cpu];
+		if (!rc && cpuidle_current_governor->enable)
+			rc = cpuidle_current_governor->enable(dev);
+		return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
 
 	case CPU_ONLINE:
 		if (!dev)
@@ -1260,8 +1265,6 @@ int __init mwait_idle_init(struct notifi
 	}
 	if (!err) {
 		nfb->notifier_call = mwait_idle_cpu_init;
-		mwait_idle_cpu_init(nfb, CPU_UP_PREPARE, NULL);
-
 		pm_idle_save = pm_idle;
 		pm_idle = mwait_idle;
 		dead_idle = acpi_dead_idle;




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Xen-devel] [PATCH v2 2/3] x86/cpuidle: push parked CPUs into deeper sleep states when possible
  2019-05-17 10:12                   ` [PATCH v2 2/3] x86/cpuidle: push parked CPUs into deeper sleep states when possible Jan Beulich
@ 2019-05-17 10:12                     ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2019-05-17 10:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

When the mwait-idle driver isn't used, C-state information becomes
available only in the course of Dom0 starting up. Use the provided data
to allow parked CPUs to sleep in a more energy efficient way, by waking
them briefly (via NMI) once the data has been recorded.

This involves re-arranging how/when the governor's ->enable() hook gets
invoked. The changes there include addition of so far missing error
handling in the respective CPU notifier handlers.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -351,12 +351,22 @@ static void dump_cx(unsigned char key)
     unsigned int cpu;
 
     printk("'%c' pressed -> printing ACPI Cx structures\n", key);
-    for_each_online_cpu ( cpu )
-        if (processor_powers[cpu])
-        {
-            print_acpi_power(cpu, processor_powers[cpu]);
-            process_pending_softirqs();
-        }
+    for_each_present_cpu ( cpu )
+    {
+        struct acpi_processor_power *power = processor_powers[cpu];
+
+        if ( !power )
+            continue;
+
+        if ( cpu_online(cpu) )
+            print_acpi_power(cpu, power);
+        else if ( park_offline_cpus )
+            printk("CPU%u parked in state %u (C%u)\n", cpu,
+                   power->last_state ? power->last_state->idx : 1,
+                   power->last_state ? power->last_state->type : 1);
+
+        process_pending_softirqs();
+    }
 }
 
 static int __init cpu_idle_key_init(void)
@@ -765,6 +775,7 @@ void acpi_dead_idle(void)
         goto default_halt;
 
     cx = &power->states[power->count - 1];
+    power->last_state = cx;
 
     if ( cx->entry_method == ACPI_CSTATE_EM_FFH )
     {
@@ -1217,9 +1228,30 @@ long set_cx_pminfo(uint32_t acpi_id, str
         set_cx(acpi_power, &xen_cx);
     }
 
-    if ( cpuidle_current_governor->enable &&
-         cpuidle_current_governor->enable(acpi_power) )
-        return -EFAULT;
+    if ( !cpu_online(cpu_id) )
+    {
+        uint32_t apic_id = x86_cpu_to_apicid[cpu_id];
+
+        /*
+         * If we've just learned of more available C states, wake the CPU if
+         * it's parked, so it can go back to sleep in perhaps a deeper state.
+         */
+        if ( park_offline_cpus && apic_id != BAD_APICID )
+        {
+            unsigned long flags;
+
+            local_irq_save(flags);
+            apic_wait_icr_idle();
+            apic_icr_write(APIC_DM_NMI | APIC_DEST_PHYSICAL, apic_id);
+            local_irq_restore(flags);
+        }
+    }
+    else if ( cpuidle_current_governor->enable )
+    {
+        ret = cpuidle_current_governor->enable(acpi_power);
+        if ( ret < 0 )
+            return ret;
+    }
 
     /* FIXME: C-state dependency is not supported by far */
 
@@ -1379,19 +1411,22 @@ static int cpu_callback(
     struct notifier_block *nfb, unsigned long action, void *hcpu)
 {
     unsigned int cpu = (unsigned long)hcpu;
+    int rc = 0;
 
-    /* Only hook on CPU_ONLINE because a dead cpu may utilize the info to
-     * to enter deep C-state */
+    /*
+     * Only hook on CPU_UP_PREPARE because a dead cpu may utilize the info
+     * to enter deep C-state.
+     */
     switch ( action )
     {
-    case CPU_ONLINE:
-        (void)cpuidle_init_cpu(cpu);
-        break;
-    default:
+    case CPU_UP_PREPARE:
+        rc = cpuidle_init_cpu(cpu);
+        if ( !rc && cpuidle_current_governor->enable )
+            rc = cpuidle_current_governor->enable(processor_powers[cpu]);
         break;
     }
 
-    return NOTIFY_DONE;
+    return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
 }
 
 static struct notifier_block cpu_nfb = {
@@ -1406,6 +1441,7 @@ static int __init cpuidle_presmp_init(vo
         return 0;
 
     mwait_idle_init(&cpu_nfb);
+    cpu_nfb.notifier_call(&cpu_nfb, CPU_UP_PREPARE, cpu);
     cpu_nfb.notifier_call(&cpu_nfb, CPU_ONLINE, cpu);
     register_cpu_notifier(&cpu_nfb);
     return 0;
--- a/xen/arch/x86/acpi/cpuidle_menu.c
+++ b/xen/arch/x86/acpi/cpuidle_menu.c
@@ -277,9 +277,6 @@ static void menu_reflect(struct acpi_pro
 
 static int menu_enable_device(struct acpi_processor_power *power)
 {
-    if (!cpu_online(power->cpu))
-        return -1;
-
     memset(&per_cpu(menu_devices, power->cpu), 0, sizeof(struct menu_device));
 
     return 0;
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -1166,12 +1166,17 @@ static int mwait_idle_cpu_init(struct no
 	struct acpi_processor_power *dev = processor_powers[cpu];
 
 	switch (action) {
+		int rc;
+
 	default:
 		return NOTIFY_DONE;
 
 	case CPU_UP_PREPARE:
-		cpuidle_init_cpu(cpu);
-		return NOTIFY_DONE;
+		rc = cpuidle_init_cpu(cpu);
+		dev = processor_powers[cpu];
+		if (!rc && cpuidle_current_governor->enable)
+			rc = cpuidle_current_governor->enable(dev);
+		return !rc ? NOTIFY_DONE : notifier_from_errno(rc);
 
 	case CPU_ONLINE:
 		if (!dev)
@@ -1260,8 +1265,6 @@ int __init mwait_idle_init(struct notifi
 	}
 	if (!err) {
 		nfb->notifier_call = mwait_idle_cpu_init;
-		mwait_idle_cpu_init(nfb, CPU_UP_PREPARE, NULL);
-
 		pm_idle_save = pm_idle;
 		pm_idle = mwait_idle;
 		dead_idle = acpi_dead_idle;




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 3/3] x86/cpuidle: clean up Cx dumping
  2019-05-17 10:10                 ` [PATCH v2 0/3] " Jan Beulich
                                     ` (2 preceding siblings ...)
  2019-05-17 10:12                   ` [PATCH v2 2/3] x86/cpuidle: push parked CPUs into deeper sleep states when possible Jan Beulich
@ 2019-05-17 10:12                   ` Jan Beulich
  2019-05-17 10:12                     ` [Xen-devel] " Jan Beulich
  3 siblings, 1 reply; 33+ messages in thread
From: Jan Beulich @ 2019-05-17 10:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Don't log the same global information once per CPU. Don't log the same
information (here: the currently active state) twice. Don't prefix
decimal numbers with zeros (giving the impression they're octal). Use
format specifiers matching the type of the corresponding expressions.
Don't split printk()-s without intervening new-lines.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -304,9 +304,6 @@ static void print_acpi_power(uint32_t cp
 
     printk("==cpu%d==\n", cpu);
     last_state_idx = power->last_state ? power->last_state->idx : -1;
-    printk("active state:\t\tC%d\n", last_state_idx);
-    printk("max_cstate:\t\tC%d\n", max_cstate);
-    printk("states:\n");
 
     spin_lock_irq(&power->stat_lock);
     current_tick = cpuidle_get_tick();
@@ -331,16 +328,14 @@ static void print_acpi_power(uint32_t cp
         idle_usage += usage[i];
         idle_res += tick_to_ns(res_tick[i]);
 
-        printk((last_state_idx == i) ? "   *" : "    ");
-        printk("C%d:\t", i);
-        printk("type[C%d] ", power->states[i].type);
-        printk("latency[%03d] ", power->states[i].latency);
-        printk("usage[%08"PRIu64"] ", usage[i]);
-        printk("method[%5s] ", acpi_cstate_method_name[power->states[i].entry_method]);
-        printk("duration[%"PRIu64"]\n", tick_to_ns(res_tick[i]));
+        printk("   %cC%u:\ttype[C%d] latency[%3u] usage[%8"PRIu64"] method[%5s] duration[%"PRIu64"]\n",
+               (last_state_idx == i) ? '*' : ' ', i,
+               power->states[i].type, power->states[i].latency, usage[i],
+               acpi_cstate_method_name[power->states[i].entry_method],
+               tick_to_ns(res_tick[i]));
     }
-    printk((last_state_idx == 0) ? "   *" : "    ");
-    printk("C0:\tusage[%08"PRIu64"] duration[%"PRIu64"]\n",
+    printk("   %cC0:\tusage[%8"PRIu64"] duration[%"PRIu64"]\n",
+           (last_state_idx == 0) ? '*' : ' ',
            usage[0] + idle_usage, current_stime - idle_res);
 
     print_hw_residencies(cpu);
@@ -351,6 +346,7 @@ static void dump_cx(unsigned char key)
     unsigned int cpu;
 
     printk("'%c' pressed -> printing ACPI Cx structures\n", key);
+    printk("max cstate: C%u\n", max_cstate);
     for_each_present_cpu ( cpu )
     {
         struct acpi_processor_power *power = processor_powers[cpu];





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Xen-devel] [PATCH v2 3/3] x86/cpuidle: clean up Cx dumping
  2019-05-17 10:12                   ` [PATCH v2 3/3] x86/cpuidle: clean up Cx dumping Jan Beulich
@ 2019-05-17 10:12                     ` Jan Beulich
  0 siblings, 0 replies; 33+ messages in thread
From: Jan Beulich @ 2019-05-17 10:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monne

Don't log the same global information once per CPU. Don't log the same
information (here: the currently active state) twice. Don't prefix
decimal numbers with zeros (giving the impression they're octal). Use
format specifiers matching the type of the corresponding expressions.
Don't split printk()-s without intervening new-lines.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -304,9 +304,6 @@ static void print_acpi_power(uint32_t cp
 
     printk("==cpu%d==\n", cpu);
     last_state_idx = power->last_state ? power->last_state->idx : -1;
-    printk("active state:\t\tC%d\n", last_state_idx);
-    printk("max_cstate:\t\tC%d\n", max_cstate);
-    printk("states:\n");
 
     spin_lock_irq(&power->stat_lock);
     current_tick = cpuidle_get_tick();
@@ -331,16 +328,14 @@ static void print_acpi_power(uint32_t cp
         idle_usage += usage[i];
         idle_res += tick_to_ns(res_tick[i]);
 
-        printk((last_state_idx == i) ? "   *" : "    ");
-        printk("C%d:\t", i);
-        printk("type[C%d] ", power->states[i].type);
-        printk("latency[%03d] ", power->states[i].latency);
-        printk("usage[%08"PRIu64"] ", usage[i]);
-        printk("method[%5s] ", acpi_cstate_method_name[power->states[i].entry_method]);
-        printk("duration[%"PRIu64"]\n", tick_to_ns(res_tick[i]));
+        printk("   %cC%u:\ttype[C%d] latency[%3u] usage[%8"PRIu64"] method[%5s] duration[%"PRIu64"]\n",
+               (last_state_idx == i) ? '*' : ' ', i,
+               power->states[i].type, power->states[i].latency, usage[i],
+               acpi_cstate_method_name[power->states[i].entry_method],
+               tick_to_ns(res_tick[i]));
     }
-    printk((last_state_idx == 0) ? "   *" : "    ");
-    printk("C0:\tusage[%08"PRIu64"] duration[%"PRIu64"]\n",
+    printk("   %cC0:\tusage[%8"PRIu64"] duration[%"PRIu64"]\n",
+           (last_state_idx == 0) ? '*' : ' ',
            usage[0] + idle_usage, current_stime - idle_res);
 
     print_hw_residencies(cpu);
@@ -351,6 +346,7 @@ static void dump_cx(unsigned char key)
     unsigned int cpu;
 
     printk("'%c' pressed -> printing ACPI Cx structures\n", key);
+    printk("max cstate: C%u\n", max_cstate);
     for_each_present_cpu ( cpu )
     {
         struct acpi_processor_power *power = processor_powers[cpu];





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 1/3] x86/idle: re-arrange dead-idle handling
  2019-05-17 10:11                   ` [PATCH v2 1/3] x86/idle: re-arrange dead-idle handling Jan Beulich
  2019-05-17 10:11                     ` [Xen-devel] " Jan Beulich
@ 2019-05-20 14:25                     ` Andrew Cooper
  2019-05-20 14:25                       ` [Xen-devel] " Andrew Cooper
  1 sibling, 1 reply; 33+ messages in thread
From: Andrew Cooper @ 2019-05-20 14:25 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:11, Jan Beulich wrote:
> In order to be able to wake parked CPUs from default_dead_idle() (for
> them to then enter a different dead-idle routine), the function should
> not itself loop. Move the loop into play_dead(), and use play_dead() as
> well on the AP boot error path.
>
> Furthermore, not the least considering the comment in play_dead(),
> make sure NMI raised (for now this would be a bug elsewhere, but that's
> about to change) against a parked or fully offline CPU won't invoke the
> actual, full-blown NMI handler.
>
> Note however that this doesn't make #MC any safer for fully offline
> CPUs.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Xen-devel] [PATCH v2 1/3] x86/idle: re-arrange dead-idle handling
  2019-05-20 14:25                     ` Andrew Cooper
@ 2019-05-20 14:25                       ` Andrew Cooper
  0 siblings, 0 replies; 33+ messages in thread
From: Andrew Cooper @ 2019-05-20 14:25 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monne

On 17/05/2019 11:11, Jan Beulich wrote:
> In order to be able to wake parked CPUs from default_dead_idle() (for
> them to then enter a different dead-idle routine), the function should
> not itself loop. Move the loop into play_dead(), and use play_dead() as
> well on the AP boot error path.
>
> Furthermore, not the least considering the comment in play_dead(),
> make sure NMI raised (for now this would be a bug elsewhere, but that's
> about to change) against a parked or fully offline CPU won't invoke the
> actual, full-blown NMI handler.
>
> Note however that this doesn't make #MC any safer for fully offline
> CPUs.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2019-05-20 14:26 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-01 14:22 [PATCH 0/5] x86: more power-efficient CPU parking Jan Beulich
2018-08-01 14:31 ` [PATCH 1/5] x86/cpuidle: replace a pointless NULL check Jan Beulich
2018-08-01 14:33   ` Andrew Cooper
2018-08-01 15:12     ` Jan Beulich
2018-08-01 14:31 ` [PATCH 2/5] x86/idle: re-arrange dead-idle handling Jan Beulich
2018-09-07 17:08   ` Andrew Cooper
2018-09-10 10:13     ` Jan Beulich
2018-10-26 10:55       ` Ping: " Jan Beulich
2018-12-05 20:33       ` Andrew Cooper
2018-12-06  8:16         ` Jan Beulich
2018-08-01 14:32 ` [PATCH 3/5] x86/cpuidle: push parked CPUs into deeper sleep states when possible Jan Beulich
2018-10-26 10:56   ` Ping: " Jan Beulich
2018-08-01 14:33 ` [PATCH 4/5] x86/cpuidle: clean up Cx dumping Jan Beulich
2018-08-01 14:40   ` Andrew Cooper
2018-08-01 14:33 ` [PATCH 5/5] x86: place non-parked CPUs into wait-for-SIPI state after offlining Jan Beulich
2018-08-29  7:08 ` Ping: [PATCH 0/5] x86: more power-efficient CPU parking Jan Beulich
2018-08-29 17:01   ` Andrew Cooper
2018-08-30  7:29     ` Jan Beulich
     [not found] ` <5B61C21202000000000FC1F1@prv1-mh.provo.novell.com>
     [not found]   ` <5B61C21202000078001F8805@prv1-mh.provo.novell.com>
     [not found]     ` <5B61C21202000000000FC6BD@prv1-mh.provo.novell.com>
     [not found]       ` <5B61C212020000780020B6D8@prv1-mh.provo.novell.com>
     [not found]         ` <5B61C21202000000000FF27E@prv1-mh.provo.novell.com>
     [not found]           ` <5B61C2120200007800224310@prv1-mh.provo.novell.com>
2019-04-03 10:12             ` Jan Beulich
2019-04-03 11:14               ` Andrew Cooper
2019-04-03 12:43                 ` Jan Beulich
2019-04-03 14:44                   ` Andrew Cooper
2019-04-03 15:20                     ` Jan Beulich
     [not found]             ` <5B61C2120200000000101EDC@prv1-mh.provo.novell.com>
     [not found]               ` <5B61C212020000780022FF0D@prv1-mh.provo.novell.com>
2019-05-17 10:10                 ` [PATCH v2 0/3] " Jan Beulich
2019-05-17 10:10                   ` [Xen-devel] " Jan Beulich
2019-05-17 10:11                   ` [PATCH v2 1/3] x86/idle: re-arrange dead-idle handling Jan Beulich
2019-05-17 10:11                     ` [Xen-devel] " Jan Beulich
2019-05-20 14:25                     ` Andrew Cooper
2019-05-20 14:25                       ` [Xen-devel] " Andrew Cooper
2019-05-17 10:12                   ` [PATCH v2 2/3] x86/cpuidle: push parked CPUs into deeper sleep states when possible Jan Beulich
2019-05-17 10:12                     ` [Xen-devel] " Jan Beulich
2019-05-17 10:12                   ` [PATCH v2 3/3] x86/cpuidle: clean up Cx dumping Jan Beulich
2019-05-17 10:12                     ` [Xen-devel] " Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).