* cpu hotplug issue
@ 2011-07-19 17:40 Vasilis Liaskovitis
  2011-07-20  8:35 ` Gleb Natapov
  0 siblings, 1 reply; 42+ messages in thread
From: Vasilis Liaskovitis @ 2011-07-19 17:40 UTC (permalink / raw)
  To: kvm

Hello,

I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.

host os: x86_64 debian-squeeze (2.6.32-5-amd64 kernel)
qemu-kvm: 0.14.1 or latest, both with the following (not-accepted)
patch to allow hotplug on the main_system_bus applied:
http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
guest os: x86_64 debian-testing with 2.6.39-2-amd64 kernel or debian
stable-squeeze with 2.6.32-5-amd64.

example qemu-kvm command line (others have also been tried, this is
the simplest):
/opt/qemu-kvm/bin/qemu-system-x86_64 -enable-kvm -M pc -smp
1,maxcpus=2 -drive file=/opt/images/wheezy.qcow2 -monitor stdio

My first issue concerns ACPI event delivery: I am running
"udevadm monitor" in the guest OS.
When I issue the CPU hotplug command in the qemu monitor:

qemu_monitor: cpu_set 1 online

No udev-acpi event appears in the udevadm monitor of the guest.

If I issue a PCI-device hotplug command afterwards, e.g. adding a NIC:

qemu_monitor: pci_add auto nic

then the cpu-hotplug acpi/udev events appear in the udev monitor:

KERNEL[1311086666.114132] add
/devices/LNXSYSTM:00/device:00/LNXCPU:01 (acpi)
KERNEL[1311086666.119475] add      /devices/system/cpu/cpu1 (cpu)
KERNEL[1311086666.120699] add
/devices/virtual/thermal/cooling_device1 (thermal)
UDEV  [1311086666.158445] add      /devices/system/cpu/cpu1 (cpu)
UDEV  [1311086666.159925] add
/devices/virtual/thermal/cooling_device1 (thermal)
UDEV  [1311086666.174848] add
/devices/LNXSYSTM:00/device:00/LNXCPU:01 (acpi)

So only then is the /sys/devices/system/cpu/cpu1/ directory created.

Is the ACPI event supposed to be delivered to the guest OS immediately
after the cpu_set command, or is what I described expected behaviour?
I am probably missing something, as issuing a pci_add command seems
irrelevant to the delivery of a CPU-hotplug ACPI event.

My second issue: after the successful creation of the cpu1 directory I
try to online the cpu:

echo 1 > /sys/devices/system/cpu/cpu1/online
bash: echo: write error: Input/output error

in the guest, dmesg reports:

[ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
[ 2325.376357] smpboot cpu 1: start_ip = 9a000
[ 2330.821306] CPU1: Not responding.

Subsequent writes to /sys/devices/system/cpu/cpu1/online return
"Invalid argument".
Is there an obvious problem with the setup I have described?

The same problem was observed with other -smp/maxcpus combinations,
e.g. -smp 2,maxcpus=4 and trying to online cpu3.
I have also tried specifying "-cpu phenom" on the qemu command line,
with no effect.

qemu-0.14.1 ships seabios 0.6.1 or 0.6.2, I believe, both of which
include CPU hotplug support. I have also tried the newest seabios
(from the repository) to be on the safe side, with the same result,
so this does not seem to be a BIOS problem.

thanks for any help,

- Vasilis

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: cpu hotplug issue
  2011-07-19 17:40 cpu hotplug issue Vasilis Liaskovitis
@ 2011-07-20  8:35 ` Gleb Natapov
  2011-07-21 11:06   ` [PATCH] " Vasilis Liaskovitis
  0 siblings, 1 reply; 42+ messages in thread
From: Gleb Natapov @ 2011-07-20  8:35 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: kvm

On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
> Hello,
> 
> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
> 
You are doing everything right; it's qemu that is buggy. Since qemu needs
a patch for CPU hotplug not to crash, nobody tests it, so the code bit-rots.

> host os: x86_64 debian-squeeze (2.6.32-5-amd64 kernel)
> qemu-kvm: 0.14.1 or latest, both with the following (not-accepted)
> patch to allow hotplug on the main_system_bus applied:
> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> guest os: x86_64 debian-testing with 2.6.39-2-amd64 kernel or debian
> stable-squeeze with 2.6.32-5-amd64.
> 
> example qemu-kvm command line (others have also been tried, this is
> the simplest):
> /opt/qemu-kvm/bin/qemu-system-x86_64 -enable-kvm -M pc -smp
> 1,maxcpus=2 -drive file=/opt/images/wheezy.qcow2 -monitor stdio
> 
> My first issue is regarding the acpi event delivery: I am running
> "udevadm monitor" in my guest OS.
> When I issue the cpu-hotplug on my qemu-monitor:
> 
> qemu_monitor: cpu_set 1 online
> 
> No udev-acpi event appears in the udevadm monitor of the guest.
> 
> if i issue a pci-device hotplug command afterwards e.g. adding a NIC:
> 
> qemu_monitor: pci_add auto nic
> 
> then the cpu-hotplug acpi/udev events appear in the udev monitor:
> 
> KERNEL[1311086666.114132] add
> /devices/LNXSYSTM:00/device:00/LNXCPU:01 (acpi)
> KERNEL[1311086666.119475] add      /devices/system/cpu/cpu1 (cpu)
> KERNEL[1311086666.120699] add
> /devices/virtual/thermal/cooling_device1 (thermal)
> UDEV  [1311086666.158445] add      /devices/system/cpu/cpu1 (cpu)
> UDEV  [1311086666.159925] add
> /devices/virtual/thermal/cooling_device1 (thermal)
> UDEV  [1311086666.174848] add
> /devices/LNXSYSTM:00/device:00/LNXCPU:01 (acpi)
> 
> So only then is the /sys/devices/system/cpu/cpu1/ directory created.
> 
> Is the acpi event supposed to be delivered to the guest OS immediately
> after the cpu_set command? Or is what I
> described expected behaviour? I am probably missing something, as
> issuing a pci_add device command seems
> irrelevant for the purpose of a CPU-hotplug acpi event.
> 
> My second issue: after the successful creation of the cpu1 directory I
> try to online the cpu:
> 
> echo 1 > /sys/devices/system/cpu/cpu1/online
> bash: echo: write error: Input/output error
> 
> in the guest, dmesg reports:
> 
> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
> [ 2330.821306] CPU1: Not responding.
> 
> and subsequent writes to /sys/devices/system/cpu/cpu1/online return
> invalid argument.
> Is there an obvious problem with the setup I have described?
> 
> The same problem was observed with other -smp,maxcpus options as well
> e.g. -smp 2,maxcpus=4 and trying to online cpu3.
> I have also tried to specify "-cpu phenom" in the qemu command line,
> with no effect.
> 
> qemu-0.14.1 has seabios-0.6.1 or 0.6.2 I believe, both of which
> include cpu hotplug support. I have also tried the newest
> seabios (from repository) to be on the safe side with the same result,
> so this doesn't seem to be a bios problem.
> 
> thanks for any help,
> 
> - Vasilis

--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-07-20  8:35 ` Gleb Natapov
@ 2011-07-21 11:06   ` Vasilis Liaskovitis
  2011-07-21 11:33     ` Gleb Natapov
                       ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Vasilis Liaskovitis @ 2011-07-21 11:06 UTC (permalink / raw)
  To: Gleb Natapov, jan.kiszka; +Cc: kvm

Hi,

On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
>> Hello,
>>
>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
>>
> You are doing everything right; it's qemu that is buggy. Since qemu needs
> a patch for CPU hotplug not to crash, nobody tests it, so the code bit-rots.

thanks for your reply.

As I mentioned in the original email, onlining a hotplugged-cpu with
qemu-kvm/master results in:

>> echo 1 > /sys/devices/system/cpu/cpu1/online
>> bash: echo: write error: Input/output error
>>
>> in the guest, dmesg reports:
>>
>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
>> [ 2330.821306] CPU1: Not responding.

I tried to git-bisect between qemu-kvm-0.13.0 (the last known version
where CPU hotplug works correctly for me) and qemu-kvm/master.

More precisely: to enable CPU hotplug at each bisect step, I apply
this patch, derived from:
http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html

diff --git a/hw/qdev.c b/hw/qdev.c
index 1aa1ea0..aed48ce 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
    if (!main_system_bus) {
        main_system_bus = qbus_create(&system_bus_info, NULL,
                                      "main-system-bus");
+       main_system_bus->allow_hotplug = 1;
    }
    return main_system_bus;
}

and test cpu hotplug functionality.
The commit that appears to break CPU hotplug is:

commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date:   Mon Feb 21 12:28:07 2011 +0100
   qemu-kvm: Mark VCPU state dirty on creation

Is it possible that kvm_vcpu_dirty should not be set to 1 for a CPU
that is being hot-plugged? I.e. when kvm_cpu_exec() is called for the
first time during initialization of a hot-plugged CPU, we shouldn't
try to restore state with kvm_arch_put_registers().

The following patch enables hotplug and solves the non-responsive
hotplugged-CPU problem by not setting the vcpu_dirty state when
hot-plugging. Enabling hotplug is still done through the
main_system_bus (see above).
Tested on an amd64 host / amd64 guest combination. Comments?

Note that there is probably another bug in qemu-kvm/master regarding
ACPI/udev event delivery for a CPU-hotplug event (cpu_set in the qemu
monitor no longer triggers the event in the guest; see the first issue
in my original mail). This patch does not address that issue.

Signed-off-by: Vasilis Liaskovitis <vliaskov@gmail.com>
---

 hw/acpi_piix4.c   |    2 +-
 hw/pc.c           |   13 +++++++++++--
 hw/qdev.c         |    1 +
 kvm-all.c         |   22 +++++++++++++++++++++-
 kvm.h             |    3 +++
 target-i386/cpu.h |    2 +-
 6 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index c30a050..b0cac60 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -584,7 +584,7 @@ void qemu_system_cpu_hot_add(int cpu, int state)
     PIIX4PMState *s = global_piix4_pm_state;

     if (state && !qemu_get_cpu(cpu)) {
-        env = pc_new_cpu(global_cpu_model);
+        env = pc_new_cpu(global_cpu_model, 1);
         if (!env) {
             fprintf(stderr, "cpu %d creation failed\n", cpu);
             return;
diff --git a/hw/pc.c b/hw/pc.c
index c0a88e1..8cfbf27 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -924,7 +924,7 @@ static void pc_cpu_reset(void *opaque)
     env->halted = !cpu_is_bsp(env);
 }

-CPUState *pc_new_cpu(const char *cpu_model)
+CPUState *pc_new_cpu(const char *cpu_model, int hotplugged)
 {
     CPUState *env;

@@ -936,7 +936,16 @@ CPUState *pc_new_cpu(const char *cpu_model)
 #endif
     }

+    if (hotplugged) {
+        kvm_set_cpuhotplug_req();
+    }
+
     env = cpu_init(cpu_model);
+
+    if (hotplugged) {
+        kvm_reset_cpuhotplug_req();
+    }
+
     if (!env) {
         fprintf(stderr, "Unable to find x86 CPU definition\n");
         exit(1);
@@ -956,7 +965,7 @@ void pc_cpus_init(const char *cpu_model)

     /* init CPUs */
     for(i = 0; i < smp_cpus; i++) {
-        pc_new_cpu(cpu_model);
+        pc_new_cpu(cpu_model, 0);
     }
 }

diff --git a/hw/qdev.c b/hw/qdev.c
index 292b52f..f85ac0f 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
     if (!main_system_bus) {
         main_system_bus = qbus_create(&system_bus_info, NULL,
                                       "main-system-bus");
+	main_system_bus->allow_hotplug = 1;
     }
     return main_system_bus;
 }
diff --git a/kvm-all.c b/kvm-all.c
index 3ad2459..8aae1d7 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -85,6 +85,7 @@ struct KVMState
 #endif
     void *used_gsi_bitmap;
     int max_gsi;
+    int cpu_hotplug_req;
 };

 KVMState *kvm_state;
@@ -220,7 +221,9 @@ int kvm_init_vcpu(CPUState *env)

     env->kvm_fd = ret;
     env->kvm_state = s;
-    env->kvm_vcpu_dirty = 1;
+    if (!kvm_has_cpuhotplug_req()) {
+        env->kvm_vcpu_dirty = 1;
+    }

     mmap_size = kvm_ioctl(s, KVM_GET_VCPU_MMAP_SIZE, 0);
     if (mmap_size < 0) {
@@ -797,6 +800,8 @@ int kvm_init(void)

     s->pit_in_kernel = kvm_pit;

+    s->cpu_hotplug_req = 0;
+
     ret = kvm_arch_init(s);
     if (ret < 0) {
         goto err;
@@ -1104,6 +1109,21 @@ int kvm_has_vcpu_events(void)
     return kvm_state->vcpu_events;
 }

+int kvm_has_cpuhotplug_req(void)
+{
+    return kvm_state->cpu_hotplug_req;
+}
+
+void kvm_set_cpuhotplug_req(void)
+{
+    kvm_state->cpu_hotplug_req = 1;
+}
+
+void kvm_reset_cpuhotplug_req(void)
+{
+    kvm_state->cpu_hotplug_req = 0;
+}
+
 int kvm_has_robust_singlestep(void)
 {
     return kvm_state->robust_singlestep;
diff --git a/kvm.h b/kvm.h
index b15e1dd..7fa72fd 100644
--- a/kvm.h
+++ b/kvm.h
@@ -52,6 +52,9 @@ int kvm_has_xsave(void);
 int kvm_has_xcrs(void);
 int kvm_has_many_ioeventfds(void);
 int kvm_has_pit_state2(void);
+int kvm_has_cpuhotplug_req(void);
+void kvm_set_cpuhotplug_req(void);
+void kvm_reset_cpuhotplug_req(void);

 #ifdef NEED_CPU_H
 int kvm_init_vcpu(CPUState *env);
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 935d08a..7e7839b 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -923,7 +923,7 @@ void cpu_x86_update_cr4(CPUX86State *env, uint32_t new_cr4);
 /* hw/pc.c */
 void cpu_smm_update(CPUX86State *env);
 uint64_t cpu_get_tsc(CPUX86State *env);
-CPUState *pc_new_cpu(const char *cpu_model);
+CPUState *pc_new_cpu(const char *cpu_model, int hotplugged);

 /* used to debug */
 #define X86_DUMP_FPU  0x0001 /* dump FPU state too */


* Re: [PATCH] cpu hotplug issue
  2011-07-21 11:06   ` [PATCH] " Vasilis Liaskovitis
@ 2011-07-21 11:33     ` Gleb Natapov
  2011-07-21 11:42       ` Jan Kiszka
  2011-07-21 13:08       ` Vasilis Liaskovitis
  2011-07-21 11:36     ` Jan Kiszka
  2011-07-21 12:22     ` Jan Kiszka
  2 siblings, 2 replies; 42+ messages in thread
From: Gleb Natapov @ 2011-07-21 11:33 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: jan.kiszka, kvm

On Thu, Jul 21, 2011 at 01:06:41PM +0200, Vasilis Liaskovitis wrote:
> Hi,
> 
> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
> > On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
> >> Hello,
> >>
> >> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
> >>
> > You do everything right. It's qemu who is buggy. Since qemu need a patch
> > for cpu hotplug to not crash it nobody tests it, so code bit rots.
> 
> thanks for your reply.
> 
> As I mentioned in the original email, onlining a hotplugged-cpu with
> qemu-kvm/master results in:
> 
> >> echo 1 > /sys/devices/system/cpu/cpu1/online
> >> bash: echo: write error: Input/output error
> >>
> >> in the guest, dmesg reports:
> >>
> >> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
> >> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
> >> [ 2330.821306] CPU1: Not responding.
> 
> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
> where cpu hotplug works correctly
> for me) and qemu-kvm/master.
> 
> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> this patch derived from:
> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> 
> diff --git a/hw/qdev.c b/hw/qdev.c
> index 1aa1ea0..aed48ce 100644
> --- a/hw/qdev.c
> +++ b/hw/qdev.c
> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>     if (!main_system_bus) {
>         main_system_bus = qbus_create(&system_bus_info, NULL,
>                                       "main-system-bus");
> +       main_system_bus->allow_hotplug = 1;
>     }
>     return main_system_bus;
> }
> 
> and test cpu hotplug functionality.
> The commit that appears to break CPU hotplug is:
> 
Thank you for going through the pain of bisecting it! Bisects that
require a patch to be applied at each step are especially annoying.

> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
> Author: Jan Kiszka <jan.kiszka@siemens.com>
> Date:   Mon Feb 21 12:28:07 2011 +0100
>    qemu-kvm: Mark VCPU state dirty on creation
> 
Jan can you look at this please?

> Is it possible that kvm_vcpu_dirty should not be set to 1 for a CPU
> that's being hot-plugged?
> I.e. when kvm_cpu_exec() is called for the first time during
> initialization of a hotplugged-CPU,
> we shouldn't try to restore state with kvm_arch_put_registers().
> 
> The following patch enables hotplug and solves the non-responsive
> hotplugged CPU problem,
> by not setting the vcpu_dirty state when hotplugging. Enabling hotplug
> is still done through
> the main_system_bus (see above).
> Tested on amd64host/amd64guest combination. Comments?
> 
> Note that there is probably another bug in qemu-kvm/master regarding
> acpi-udev event delivery for
> a cpu-hotplug event (cpu_set in the qemu_monitor no longer triggers
> the event in the guest, see
> first issue in my original mail). This patch does not address that issue.
> 
Did this work in qemu-0.13?

> [...]

--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-07-21 11:06   ` [PATCH] " Vasilis Liaskovitis
  2011-07-21 11:33     ` Gleb Natapov
@ 2011-07-21 11:36     ` Jan Kiszka
  2011-07-21 12:22     ` Jan Kiszka
  2 siblings, 0 replies; 42+ messages in thread
From: Jan Kiszka @ 2011-07-21 11:36 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: Gleb Natapov, kvm

On 2011-07-21 13:06, Vasilis Liaskovitis wrote:
> Hi,
> 
> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
>> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
>>> Hello,
>>>
>>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
>>>
>> You do everything right. It's qemu who is buggy. Since qemu need a patch
>> for cpu hotplug to not crash it nobody tests it, so code bit rots.
> 
> thanks for your reply.
> 
> As I mentioned in the original email, onlining a hotplugged-cpu with
> qemu-kvm/master results in:
> 
>>> echo 1 > /sys/devices/system/cpu/cpu1/online
>>> bash: echo: write error: Input/output error
>>>
>>> in the guest, dmesg reports:
>>>
>>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
>>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
>>> [ 2330.821306] CPU1: Not responding.
> 
> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
> where cpu hotplug works correctly
> for me) and qemu-kvm/master.
> 
> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> this patch derived from:
> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> 
> diff --git a/hw/qdev.c b/hw/qdev.c
> index 1aa1ea0..aed48ce 100644
> --- a/hw/qdev.c
> +++ b/hw/qdev.c
> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>     if (!main_system_bus) {
>         main_system_bus = qbus_create(&system_bus_info, NULL,
>                                       "main-system-bus");
> +       main_system_bus->allow_hotplug = 1;
>     }
>     return main_system_bus;
> }
> 
> and test cpu hotplug functionality.
> The commit that appears to break CPU hotplug is:
> 
> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
> Author: Jan Kiszka <jan.kiszka@siemens.com>
> Date:   Mon Feb 21 12:28:07 2011 +0100
>    qemu-kvm: Mark VCPU state dirty on creation
> 
> Is it possible that kvm_vcpu_dirty should not be set to 1 for a CPU
> that's being hot-plugged?
> I.e. when kvm_cpu_exec() is called for the first time during
> initialization of a hotplugged-CPU,
> we shouldn't try to restore state with kvm_arch_put_registers().

We should, because user space defines the CPU state on creation or after
reset, and that state has to be transferred to the kernel. If you get
further by skipping this, we likely create the wrong state in user space
while the kernel happens to have a working one. So the state needs
fixing, not the write-back (IOW, your workaround likely papers over the
real issue).

So much for a high-level analysis without digging into this dirt on my own. :)

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: [PATCH] cpu hotplug issue
  2011-07-21 11:33     ` Gleb Natapov
@ 2011-07-21 11:42       ` Jan Kiszka
  2011-07-21 11:51         ` Gleb Natapov
  2011-07-21 13:08       ` Vasilis Liaskovitis
  1 sibling, 1 reply; 42+ messages in thread
From: Jan Kiszka @ 2011-07-21 11:42 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Vasilis Liaskovitis, kvm

On 2011-07-21 13:33, Gleb Natapov wrote:
> On Thu, Jul 21, 2011 at 01:06:41PM +0200, Vasilis Liaskovitis wrote:
>> Hi,
>>
>> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
>>> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
>>>> Hello,
>>>>
>>>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
>>>>
>>> You do everything right. It's qemu who is buggy. Since qemu need a patch
>>> for cpu hotplug to not crash it nobody tests it, so code bit rots.
>>
>> thanks for your reply.
>>
>> As I mentioned in the original email, onlining a hotplugged-cpu with
>> qemu-kvm/master results in:
>>
>>>> echo 1 > /sys/devices/system/cpu/cpu1/online
>>>> bash: echo: write error: Input/output error
>>>>
>>>> in the guest, dmesg reports:
>>>>
>>>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
>>>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
>>>> [ 2330.821306] CPU1: Not responding.
>>
>> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
>> where cpu hotplug works correctly
>> for me) and qemu-kvm/master.
>>
>> More precisely: To enable cpu-hotplug at each bisect stage, I apply
>> this patch derived from:
>> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
>>
>> diff --git a/hw/qdev.c b/hw/qdev.c
>> index 1aa1ea0..aed48ce 100644
>> --- a/hw/qdev.c
>> +++ b/hw/qdev.c
>> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>>     if (!main_system_bus) {
>>         main_system_bus = qbus_create(&system_bus_info, NULL,
>>                                       "main-system-bus");
>> +       main_system_bus->allow_hotplug = 1;
>>     }
>>     return main_system_bus;
>> }
>>
>> and test cpu hotplug functionality.
>> The commit that appears to break CPU hotplug is:
>>
> Thank you for going through the pain of bisecting it! Bisects that
> require patch to be applied between each step are especially annoying.
> 
>> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>> Date:   Mon Feb 21 12:28:07 2011 +0100
>>    qemu-kvm: Mark VCPU state dirty on creation
>>
> Jan can you look at this please?

I can't promise to do the debugging myself.

Also, as I never succeeded in getting anything working with CPU hotplug,
even back in the days when it was supposed to work, I'm a bit clueless
w.r.t. the right test cases.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: [PATCH] cpu hotplug issue
  2011-07-21 11:42       ` Jan Kiszka
@ 2011-07-21 11:51         ` Gleb Natapov
  2011-07-21 11:55           ` Jan Kiszka
  2011-07-21 12:45           ` Gleb Natapov
  0 siblings, 2 replies; 42+ messages in thread
From: Gleb Natapov @ 2011-07-21 11:51 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Vasilis Liaskovitis, kvm

On Thu, Jul 21, 2011 at 01:42:08PM +0200, Jan Kiszka wrote:
> On 2011-07-21 13:33, Gleb Natapov wrote:
> > On Thu, Jul 21, 2011 at 01:06:41PM +0200, Vasilis Liaskovitis wrote:
> >> Hi,
> >>
> >> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
> >>> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
> >>>> Hello,
> >>>>
> >>>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
> >>>>
> >>> You do everything right. It's qemu who is buggy. Since qemu need a patch
> >>> for cpu hotplug to not crash it nobody tests it, so code bit rots.
> >>
> >> thanks for your reply.
> >>
> >> As I mentioned in the original email, onlining a hotplugged-cpu with
> >> qemu-kvm/master results in:
> >>
> >>>> echo 1 > /sys/devices/system/cpu/cpu1/online
> >>>> bash: echo: write error: Input/output error
> >>>>
> >>>> in the guest, dmesg reports:
> >>>>
> >>>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
> >>>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
> >>>> [ 2330.821306] CPU1: Not responding.
> >>
> >> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
> >> where cpu hotplug works correctly
> >> for me) and qemu-kvm/master.
> >>
> >> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> >> this patch derived from:
> >> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> >>
> >> diff --git a/hw/qdev.c b/hw/qdev.c
> >> index 1aa1ea0..aed48ce 100644
> >> --- a/hw/qdev.c
> >> +++ b/hw/qdev.c
> >> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
> >>     if (!main_system_bus) {
> >>         main_system_bus = qbus_create(&system_bus_info, NULL,
> >>                                       "main-system-bus");
> >> +       main_system_bus->allow_hotplug = 1;
> >>     }
> >>     return main_system_bus;
> >> }
> >>
> >> and test cpu hotplug functionality.
> >> The commit that appears to break CPU hotplug is:
> >>
> > Thank you for going through the pain of bisecting it! Bisects that
> > require patch to be applied between each step are especially annoying.
> > 
> >> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
> >> Author: Jan Kiszka <jan.kiszka@siemens.com>
> >> Date:   Mon Feb 21 12:28:07 2011 +0100
> >>    qemu-kvm: Mark VCPU state dirty on creation
> >>
> > Jan can you look at this please?
> 
> I can't promise to do debugging myself.
> 
> Also, as I never succeeded in getting anything working with CPU hotplug,
> even back in the days it was supposed to work, I'm a bit clueless /wrt
> to the right test cases.
> 
CPU hotplug for Linux is supposed to be easy (with the allow_hotplug
patch applied), but we currently have two bugs. One is that the ACPI
interrupt is not sent when a CPU is onlined (at least this appears to
be the case); I will look at that one. The other is that after the new
CPU is detected, it can't be onlined.

After fixing the first bug, the test should look like this:
1. start the VM with -smp 1,maxcpus=2
2. wait for it to boot
3. do "cpu_set 1 online" in the monitor
4. do "echo 1 > /sys/devices/system/cpu/cpu1/online" in the guest

Step 4 should succeed. It fails now.

--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-07-21 11:51         ` Gleb Natapov
@ 2011-07-21 11:55           ` Jan Kiszka
  2011-07-21 12:00             ` Gleb Natapov
  2011-07-21 12:18             ` Avi Kivity
  2011-07-21 12:45           ` Gleb Natapov
  1 sibling, 2 replies; 42+ messages in thread
From: Jan Kiszka @ 2011-07-21 11:55 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Vasilis Liaskovitis, kvm

On 2011-07-21 13:51, Gleb Natapov wrote:
> On Thu, Jul 21, 2011 at 01:42:08PM +0200, Jan Kiszka wrote:
>> On 2011-07-21 13:33, Gleb Natapov wrote:
>>> On Thu, Jul 21, 2011 at 01:06:41PM +0200, Vasilis Liaskovitis wrote:
>>>> Hi,
>>>>
>>>> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
>>>>> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
>>>>>>
>>>>> You are doing everything right. It's qemu that is buggy. Since qemu needs a
>>>>> patch for cpu hotplug not to crash, nobody tests it, so the code bit-rots.
>>>>
>>>> thanks for your reply.
>>>>
>>>> As I mentioned in the original email, onlining a hotplugged-cpu with
>>>> qemu-kvm/master results in:
>>>>
>>>>>> echo 1 > /sys/devices/system/cpu/cpu1/online
>>>>>> bash: echo: write error: Input/output error
>>>>>>
>>>>>> in the guest, dmesg reports:
>>>>>>
>>>>>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
>>>>>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
>>>>>> [ 2330.821306] CPU1: Not responding.
>>>>
>>>> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
>>>> where cpu hotplug works correctly
>>>> for me) and qemu-kvm/master.
>>>>
>>>> More precisely: To enable cpu-hotplug at each bisect stage, I apply
>>>> this patch derived from:
>>>> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
>>>>
>>>> diff --git a/hw/qdev.c b/hw/qdev.c
>>>> index 1aa1ea0..aed48ce 100644
>>>> --- a/hw/qdev.c
>>>> +++ b/hw/qdev.c
>>>> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>>>>     if (!main_system_bus) {
>>>>         main_system_bus = qbus_create(&system_bus_info, NULL,
>>>>                                       "main-system-bus");
>>>> +       main_system_bus->allow_hotplug = 1;
>>>>     }
>>>>     return main_system_bus;
>>>> }
>>>>
>>>> and test cpu hotplug functionality.
>>>> The commit that appears to break CPU hotplug is:
>>>>
>>> Thank you for going through the pain of bisecting it! Bisects that
>>> require a patch to be applied at each step are especially annoying.
>>>
>>>> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>> Date:   Mon Feb 21 12:28:07 2011 +0100
>>>>    qemu-kvm: Mark VCPU state dirty on creation
>>>>
>>> Jan can you look at this please?
>>
>> I can't promise to do debugging myself.
>>
>> Also, as I never succeeded in getting anything working with CPU hotplug,
>> even back in the days when it was supposed to work, I'm a bit clueless
>> w.r.t. the right test cases.
>>
> CPU hotplug for Linux is supposed to be easy (with the allow_hotplug patch
> applied). But we currently have two bugs. One is that the ACPI interrupt

I bet we have multiple bugs for quite a while now, which accelerated bit
rotting even further.

> is not sent when a cpu is onlined (at least this appears to be the case).
> I will look at that one. The other is that after a new cpu is detected it
> can't be onlined.
> 
> After fixing the first bug the test should look like this:
> 1. start vm with -smp 1,maxcpus=2
> 2. wait for it to boot
> 3. do "cpu 1 online" in monitor.
> 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
> 
> Step 4 should succeed; it fails now.

Should the above work even with the current ACPI issue unfixed?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: [PATCH] cpu hotplug issue
  2011-07-21 11:55           ` Jan Kiszka
@ 2011-07-21 12:00             ` Gleb Natapov
  2011-07-21 12:18             ` Avi Kivity
  1 sibling, 0 replies; 42+ messages in thread
From: Gleb Natapov @ 2011-07-21 12:00 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Vasilis Liaskovitis, kvm

On Thu, Jul 21, 2011 at 01:55:26PM +0200, Jan Kiszka wrote:
> On 2011-07-21 13:51, Gleb Natapov wrote:
> > On Thu, Jul 21, 2011 at 01:42:08PM +0200, Jan Kiszka wrote:
> >> On 2011-07-21 13:33, Gleb Natapov wrote:
> >>> On Thu, Jul 21, 2011 at 01:06:41PM +0200, Vasilis Liaskovitis wrote:
> >>>> Hi,
> >>>>
> >>>> On Wed, Jul 20, 2011 at 10:35 AM, Gleb Natapov <gleb@redhat.com> wrote:
> >>>>> On Tue, Jul 19, 2011 at 07:40:55PM +0200, Vasilis Liaskovitis wrote:
> >>>>>> Hello,
> >>>>>>
> >>>>>> I have encountered a problem trying to hotplug a CPU in my x86_64 guest setup.
> >>>>>>
> >>>>> You are doing everything right. It's qemu that is buggy. Since qemu needs a
> >>>>> patch for cpu hotplug not to crash, nobody tests it, so the code bit-rots.
> >>>>
> >>>> thanks for your reply.
> >>>>
> >>>> As I mentioned in the original email, onlining a hotplugged-cpu with
> >>>> qemu-kvm/master results in:
> >>>>
> >>>>>> echo 1 > /sys/devices/system/cpu/cpu1/online
> >>>>>> bash: echo: write error: Input/output error
> >>>>>>
> >>>>>> in the guest, dmesg reports:
> >>>>>>
> >>>>>> [ 2325.376355] Booting Node 0 Processor 1 APIC 0x1
> >>>>>> [ 2325.376357] smpboot cpu 1: start_ip = 9a000
> >>>>>> [ 2330.821306] CPU1: Not responding.
> >>>>
> >>>> I tried to git-bisect between qemu-kvm-0.13.0 (last known version
> >>>> where cpu hotplug works correctly
> >>>> for me) and qemu-kvm/master.
> >>>>
> >>>> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> >>>> this patch derived from:
> >>>> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> >>>>
> >>>> diff --git a/hw/qdev.c b/hw/qdev.c
> >>>> index 1aa1ea0..aed48ce 100644
> >>>> --- a/hw/qdev.c
> >>>> +++ b/hw/qdev.c
> >>>> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
> >>>>     if (!main_system_bus) {
> >>>>         main_system_bus = qbus_create(&system_bus_info, NULL,
> >>>>                                       "main-system-bus");
> >>>> +       main_system_bus->allow_hotplug = 1;
> >>>>     }
> >>>>     return main_system_bus;
> >>>> }
> >>>>
> >>>> and test cpu hotplug functionality.
> >>>> The commit that appears to break CPU hotplug is:
> >>>>
> >>> Thank you for going through the pain of bisecting it! Bisects that
> >>> require a patch to be applied at each step are especially annoying.
> >>>
> >>>> commit f4de8c1451f2265148ff4d895a27e21c0a8788aa
> >>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
> >>>> Date:   Mon Feb 21 12:28:07 2011 +0100
> >>>>    qemu-kvm: Mark VCPU state dirty on creation
> >>>>
> >>> Jan can you look at this please?
> >>
> >> I can't promise to do debugging myself.
> >>
> >> Also, as I never succeeded in getting anything working with CPU hotplug,
> >> even back in the days when it was supposed to work, I'm a bit clueless
> >> w.r.t. the right test cases.
> >>
> > CPU hotplug for Linux is supposed to be easy (with the allow_hotplug patch
> > applied). But we currently have two bugs. One is that the ACPI interrupt
> 
> I bet we have multiple bugs for quite a while now, which accelerated bit
> rotting even further.
> 
> > is not sent when a cpu is onlined (at least this appears to be the case).
> > I will look at that one. The other is that after a new cpu is detected it
> > can't be onlined.
> > 
> > After fixing the first bug the test should look like this:
> > 1. start vm with -smp 1,maxcpus=2
> > 2. wait for it to boot
> > 3. do "cpu 1 online" in monitor.
> > 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
> > 
> > Step 4 should succeed; it fails now.
> 
> Should the above work even with the current ACPI issue unfixed?
> 
Last I checked, cpu hotplug worked; cpu unplug didn't. ACPI is only
responsible for making the guest notice the new cpu, not for starting it,
and according to Vasilis's testing this still works (kind of).
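That split can be made concrete with two guest-side checks (a sketch;
`SYSFS` is parameterized only so the hypothetical helpers can be exercised
outside a real guest):

```shell
# ACPI hot-add only makes the guest *notice* the cpu (its sysfs node
# appears); starting it is a separate, guest-initiated "online" write.
SYSFS=${SYSFS:-/sys}
cpu_noticed()   { [ -d "$SYSFS/devices/system/cpu/cpu$1" ]; }
cpu_is_online() { [ "$(cat "$SYSFS/devices/system/cpu/cpu$1/online" 2>/dev/null)" = 1 ]; }
cpu_bring_up()  { echo 1 > "$SYSFS/devices/system/cpu/cpu$1/online"; }
```

After the hotplug event, `cpu_noticed 1` should hold; `cpu_is_online 1`
only after the explicit `cpu_bring_up 1` step.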

--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-07-21 11:55           ` Jan Kiszka
  2011-07-21 12:00             ` Gleb Natapov
@ 2011-07-21 12:18             ` Avi Kivity
  2011-07-21 12:22               ` Gleb Natapov
                                 ` (2 more replies)
  1 sibling, 3 replies; 42+ messages in thread
From: Avi Kivity @ 2011-07-21 12:18 UTC (permalink / raw)
  To: Jan Kiszka
  Cc: Gleb Natapov, Vasilis Liaskovitis, kvm, Lucas Meneghel Rodrigues,
	Anthony Liguori

On 07/21/2011 02:55 PM, Jan Kiszka wrote:
> >>
> >  CPU hotplug for Linux is supposed to be easy (with the allow_hotplug patch
> >  applied). But we currently have two bugs. One is that the ACPI interrupt
>
> I bet we have multiple bugs for quite a while now, which accelerated bit
> rotting even further.
>

kvm-autotest to the rescue!

Lucas, do we have a cpu hotplug test in kvm-autotest?

If so, I promise to add it to my autotest run once it is fixed, to 
prevent the spread of the disease.

btw, Anthony, are you running kvm-autotest these days? With qemu-kvm 
diverging less and less from qemu.git, it's becoming less effective to 
run it downstream.

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH] cpu hotplug issue
  2011-07-21 11:06   ` [PATCH] " Vasilis Liaskovitis
  2011-07-21 11:33     ` Gleb Natapov
  2011-07-21 11:36     ` Jan Kiszka
@ 2011-07-21 12:22     ` Jan Kiszka
  2011-07-21 12:25       ` Gleb Natapov
  2 siblings, 1 reply; 42+ messages in thread
From: Jan Kiszka @ 2011-07-21 12:22 UTC (permalink / raw)
  To: Vasilis Liaskovitis, Gleb Natapov; +Cc: kvm

On 2011-07-21 13:06, Vasilis Liaskovitis wrote:
> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> this patch derived from:
> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> 
> diff --git a/hw/qdev.c b/hw/qdev.c
> index 1aa1ea0..aed48ce 100644
> --- a/hw/qdev.c
> +++ b/hw/qdev.c
> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>     if (!main_system_bus) {
>         main_system_bus = qbus_create(&system_bus_info, NULL,
>                                       "main-system-bus");
> +       main_system_bus->allow_hotplug = 1;

BTW, this reminds me why "fixing" CPU hotplug won't help a lot. We
finally need to _design_ the infrastructure required for CPU hotplugging
and stop hacking away reasonable bits that do not fit in the current
simplistic CPU interface model.

The questions are:
 - How should the CPU-APIC-chipset topology look in an ideal
   (qdev reworked or qdev replaced with X) world?
 - How can we make a useful step in that direction given the current
   mechanisms (e.g. a hotpluggable APIC bus created by the chipset)?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: [PATCH] cpu hotplug issue
  2011-07-21 12:18             ` Avi Kivity
@ 2011-07-21 12:22               ` Gleb Natapov
  2011-07-21 12:39               ` Jan Kiszka
  2011-07-21 13:27               ` Lucas Meneghel Rodrigues
  2 siblings, 0 replies; 42+ messages in thread
From: Gleb Natapov @ 2011-07-21 12:22 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Jan Kiszka, Vasilis Liaskovitis, kvm, Lucas Meneghel Rodrigues,
	Anthony Liguori

On Thu, Jul 21, 2011 at 03:18:03PM +0300, Avi Kivity wrote:
> On 07/21/2011 02:55 PM, Jan Kiszka wrote:
> >>>
> >>  CPU hotplug for Linux is supposed to be easy (with the allow_hotplug patch
> >>  applied). But we currently have two bugs. One is that the ACPI interrupt
> >
> >I bet we have multiple bugs for quite a while now, which accelerated bit
> >rotting even further.
> >
> 
> kvm-autotest to the rescue!
> 
+1, but cpu hotplug does not work without the patch in
http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html.
The patch was rejected and no alternative was proposed.

> Lucas, do we have a cpu hotplug test in kvm-autotest?
> 
> If so, I promise to add it to my autotest run once it is fixed, to
> prevent the spread of the disease.
> 
> btw, Anthony, are you running kvm-autotest these days? with qemu-kvm
> diverging less and less from qemu.git these days, it's becoming less
> effective to run it downstream.
> 
> -- 
> error compiling committee.c: too many arguments to function

--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-07-21 12:22     ` Jan Kiszka
@ 2011-07-21 12:25       ` Gleb Natapov
  2011-07-21 12:35         ` Jan Kiszka
  0 siblings, 1 reply; 42+ messages in thread
From: Gleb Natapov @ 2011-07-21 12:25 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Vasilis Liaskovitis, kvm

On Thu, Jul 21, 2011 at 02:22:07PM +0200, Jan Kiszka wrote:
> On 2011-07-21 13:06, Vasilis Liaskovitis wrote:
> > More precisely: To enable cpu-hotplug at each bisect stage, I apply
> > this patch derived from:
> > http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> > 
> > diff --git a/hw/qdev.c b/hw/qdev.c
> > index 1aa1ea0..aed48ce 100644
> > --- a/hw/qdev.c
> > +++ b/hw/qdev.c
> > @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
> >     if (!main_system_bus) {
> >         main_system_bus = qbus_create(&system_bus_info, NULL,
> >                                       "main-system-bus");
> > +       main_system_bus->allow_hotplug = 1;
> 
> BTW, this reminds me why "fixing" CPU hotplug won't help a lot. We
I do not see the relation between the qdev-not-allowing-cpu-hotplug bug and
the cpu-created-with-incorrect-state bug. Fixing the former will not magically
make the latter disappear. Both should be solved.

> finally need to _design_ the infrastructure required for CPU hotplugging
> and stop hacking away reasonable bits that do not fit in the current
> simplistic CPU interface model.
> 
> The questions are:
>  - How should the CPU-APIC-chipset topology look in an ideal
>    (qdev reworked or qdev replaced with X) world?
>  - How can we make a useful step in that direction given the current
>    mechanisms (e.g. a hotpluggable APIC bus created by the chipset)?
> 
> Jan
> 
> -- 
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux

--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-07-21 12:25       ` Gleb Natapov
@ 2011-07-21 12:35         ` Jan Kiszka
  2011-07-21 12:40           ` Gleb Natapov
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kiszka @ 2011-07-21 12:35 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Vasilis Liaskovitis, kvm

On 2011-07-21 14:25, Gleb Natapov wrote:
> On Thu, Jul 21, 2011 at 02:22:07PM +0200, Jan Kiszka wrote:
>> On 2011-07-21 13:06, Vasilis Liaskovitis wrote:
>>> More precisely: To enable cpu-hotplug at each bisect stage, I apply
>>> this patch derived from:
>>> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
>>>
>>> diff --git a/hw/qdev.c b/hw/qdev.c
>>> index 1aa1ea0..aed48ce 100644
>>> --- a/hw/qdev.c
>>> +++ b/hw/qdev.c
>>> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
>>>     if (!main_system_bus) {
>>>         main_system_bus = qbus_create(&system_bus_info, NULL,
>>>                                       "main-system-bus");
>>> +       main_system_bus->allow_hotplug = 1;
>>
>> BTW, this reminds me why "fixing" CPU hotplug won't help a lot. We
> I do not see the relation between the qdev-not-allowing-cpu-hotplug bug

It's not a bug, it's a feature: The system bus we attach APICs to so far
is a statically configured beast, intentionally hotplug-incapable.
Hacking this away in qemu-kvm would be a step back towards where we came
from and will break again sooner or later. We need a CPU/APIC bus, also for
other reasons, that can then be made hotplug-capable.

> and the cpu-created-with-incorrect-state bug. Fixing the former will not
> magically make the latter disappear. Both should be solved.

The issues hidden by the topology design problem should be solved
nevertheless, that's true.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: [PATCH] cpu hotplug issue
  2011-07-21 12:18             ` Avi Kivity
  2011-07-21 12:22               ` Gleb Natapov
@ 2011-07-21 12:39               ` Jan Kiszka
  2011-07-21 13:27               ` Lucas Meneghel Rodrigues
  2 siblings, 0 replies; 42+ messages in thread
From: Jan Kiszka @ 2011-07-21 12:39 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Gleb Natapov, Vasilis Liaskovitis, kvm, Lucas Meneghel Rodrigues,
	Anthony Liguori

On 2011-07-21 14:18, Avi Kivity wrote:
> On 07/21/2011 02:55 PM, Jan Kiszka wrote:
>>>>
>>>  CPU hotplug for Linux is supposed to be easy (with the allow_hotplug patch
>>>  applied). But we currently have two bugs. One is that the ACPI interrupt
>>
>> I bet we have multiple bugs for quite a while now, which accelerated bit
>> rotting even further.
>>
> 
> kvm-autotest to the rescue!

CPU hotplug primarily suffers from lacking integration with QEMU
(a hotpluggable CPU bus, qdev'ified CPUs to enable "device_add/del cpu",
and all that upstream). That needs to be solved first; then regression
testing will indeed be very important.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: [PATCH] cpu hotplug issue
  2011-07-21 12:35         ` Jan Kiszka
@ 2011-07-21 12:40           ` Gleb Natapov
  0 siblings, 0 replies; 42+ messages in thread
From: Gleb Natapov @ 2011-07-21 12:40 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Vasilis Liaskovitis, kvm

On Thu, Jul 21, 2011 at 02:35:09PM +0200, Jan Kiszka wrote:
> On 2011-07-21 14:25, Gleb Natapov wrote:
> > On Thu, Jul 21, 2011 at 02:22:07PM +0200, Jan Kiszka wrote:
> >> On 2011-07-21 13:06, Vasilis Liaskovitis wrote:
> >>> More precisely: To enable cpu-hotplug at each bisect stage, I apply
> >>> this patch derived from:
> >>> http://lists.gnu.org/archive/html/qemu-devel/2010-08/msg00850.html
> >>>
> >>> diff --git a/hw/qdev.c b/hw/qdev.c
> >>> index 1aa1ea0..aed48ce 100644
> >>> --- a/hw/qdev.c
> >>> +++ b/hw/qdev.c
> >>> @@ -327,6 +327,7 @@ BusState *sysbus_get_default(void)
> >>>     if (!main_system_bus) {
> >>>         main_system_bus = qbus_create(&system_bus_info, NULL,
> >>>                                       "main-system-bus");
> >>> +       main_system_bus->allow_hotplug = 1;
> >>
> >> BTW, this reminds me why "fixing" CPU hotplug won't help a lot. We
> > I do not see the relation between the qdev-not-allowing-cpu-hotplug bug
> 
> It's not a bug, it's a feature: The system bus we attach APICs to so far
> is a statically configured beast, intentionally hotplug-incapable.
> Hacking this away in qemu-kvm would be a step back towards where we came
> from and will break again sooner or later. We need a CPU/APIC bus, also for
> other reasons, that can then be made hotplug-capable.
> 
Agreed. I was calling the whole situation the "bug", not this missing line
in qdev in particular.

> > and the cpu-created-with-incorrect-state bug. Fixing the former will not
> > magically make the latter disappear. Both should be solved.
> 
> The issues hidden by the topology design problem should be solved
> nevertheless, that's true.
> 
> Jan
> 
> -- 
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux

--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-07-21 11:51         ` Gleb Natapov
  2011-07-21 11:55           ` Jan Kiszka
@ 2011-07-21 12:45           ` Gleb Natapov
  2011-07-22 10:56             ` Jan Kiszka
  1 sibling, 1 reply; 42+ messages in thread
From: Gleb Natapov @ 2011-07-21 12:45 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Vasilis Liaskovitis, kvm

On Thu, Jul 21, 2011 at 02:51:18PM +0300, Gleb Natapov wrote:
> > > Jan can you look at this please?
> > 
> > I can't promise to do debugging myself.
> > 
> > Also, as I never succeeded in getting anything working with CPU hotplug,
> > even back in the days when it was supposed to work, I'm a bit clueless
> > w.r.t. the right test cases.
> > 
> CPU hotplug for Linux is supposed to be easy (with the allow_hotplug patch
> applied). But we currently have two bugs. One is that the ACPI interrupt
> is not sent when a cpu is onlined (at least this appears to be the case).
> I will look at that one. The other is that after a new cpu is detected it
> can't be onlined.
> 
> After fixing the first bug the test should look like this:
> 1. start vm with -smp 1,maxcpus=2
> 2. wait for it to boot
> 3. do "cpu 1 online" in monitor.
> 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
> 
> Step 4 should succeed; it fails now.
> 
The first one was easy to solve. See patch below. Step 3 should be
"cpu_set 1 online".

---

Trigger sci interrupt after cpu hotplug/unplug event.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index c30a050..40f3fcd 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -92,7 +92,8 @@ static void pm_update_sci(PIIX4PMState *s)
                    ACPI_BITMASK_POWER_BUTTON_ENABLE |
                    ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
                    ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
-        (((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
+        (((s->gpe.sts[0] & s->gpe.en[0]) &
+	  (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0); 
 
     qemu_set_irq(s->irq, sci_level);
     /* schedule a timer interruption if needed */
--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-07-21 11:33     ` Gleb Natapov
  2011-07-21 11:42       ` Jan Kiszka
@ 2011-07-21 13:08       ` Vasilis Liaskovitis
  2011-07-21 13:11         ` Gleb Natapov
  2011-07-21 13:15         ` Avi Kivity
  1 sibling, 2 replies; 42+ messages in thread
From: Vasilis Liaskovitis @ 2011-07-21 13:08 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: jan.kiszka, kvm

Hi,

thanks for looking at this closer.

On Thu, Jul 21, 2011 at 1:33 PM, Gleb Natapov <gleb@redhat.com> wrote:

>> Note that there is probably another bug in qemu-kvm/master regarding
>> acpi-udev event delivery for
>> a cpu-hotplug event (cpu_set in the qemu_monitor no longer triggers
>> the event in the guest, see
>> first issue in my original mail). This patch does not address that issue.
>>
> Did this work in qemu-0.13?

yes, the acpi-event delivery worked fine for me in 0.13.0

I also tried to separately bisect the acpi-udev delivery issue between
0.13.0 and qemu-kvm/master, but ended up with a big merge commit as the
culprit. Not sure if it helps.

FWIW, here's my "git bisect log" for the acpi-udev delivery issue:

git bisect start
# good: [2e1e73f3d746feffc1c12e9463b2fc04e10df860] Merge remote branch
'upstream/stable-0.13' into stable-0.13
git bisect good 2e1e73f3d746feffc1c12e9463b2fc04e10df860
# bad: [fda19064e889d4419dd3dc69ca8e6e7a1535fdf5] Merge branch 'upstream-merge'
git bisect bad fda19064e889d4419dd3dc69ca8e6e7a1535fdf5
# good: [01ac6428a04576ae6f84f07d82c98da304b9ac77] Fix memory leak in
register save load due to xsave support
git bisect good 01ac6428a04576ae6f84f07d82c98da304b9ac77
# good: [f14224a695f51576e33d96a4bc26b9a67899dbb9] kvm_stat: add 'x'
key for enabling/disabling "drilldown"
git bisect good f14224a695f51576e33d96a4bc26b9a67899dbb9
# good: [f14224a695f51576e33d96a4bc26b9a67899dbb9] kvm_stat: add 'x'
key for enabling/disabling "drilldown"
git bisect good f14224a695f51576e33d96a4bc26b9a67899dbb9
# bad: [f47b14bbbe43fbaef9137e2abea8d300a6e6287f] Merge commit
'45fe15c25a5c9feea6e0f78434f5e9f632de9d94' into upstream-merge
git bisect bad f47b14bbbe43fbaef9137e2abea8d300a6e6287f
# good: [d23948b15a9920fb7f6374b55a6db1ecff81f3ee] lm32: add Milkymist
VGAFB support
git bisect good d23948b15a9920fb7f6374b55a6db1ecff81f3ee
# good: [47c0143cdde080101e97a1b39f3ff13e33b5726c] target-i386: add
CPU86_LDouble <-> double conversion functions
git bisect good 47c0143cdde080101e97a1b39f3ff13e33b5726c
# good: [d2d979c628e4b2c4a3cb71a31841875795c79043] NBD: Avoid leaking
a couple of strings when the NBD device is closed
git bisect good d2d979c628e4b2c4a3cb71a31841875795c79043
# bad: [70757dcaa40e14978bf287084d8fab9efb815a2d] qemu-kvm: Limit MSI
vector walk to actual array size
git bisect bad 70757dcaa40e14978bf287084d8fab9efb815a2d
# good: [913f5649dc778a1635e3440edba3429d84caec89] Merge commit
'a9f8ad8f2acdb2398da5d32a5efc19cb0196d79f' into upstream-merge
git bisect good 913f5649dc778a1635e3440edba3429d84caec89
# bad: [ad16018ee64433754914b73782da94fe2bcc0dac] qemu-kvm: Move gsi
bits from kvm_msix_vector_add to kvm_msi_add_message
git bisect bad ad16018ee64433754914b73782da94fe2bcc0dac
# bad: [98e934f0ab47583e82e3cabaa1c6a48c816e7bd0] qemu-kvm:
Synchronize states before reset
git bisect bad 98e934f0ab47583e82e3cabaa1c6a48c816e7bd0
# bad: [dc080800baef853f5a4d383d4f00d6afa7a46ff4] Merge branch 'upstream-merge'
git bisect bad dc080800baef853f5a4d383d4f00d6afa7a46ff4
# bad: [7d821420734b3e1fed438ae41b5791eb281f9448] Merge commit
'4d9ad7f793605abd9806fc932b3e04e028894565' into upstream-merge
git bisect bad 7d821420734b3e1fed438ae41b5791eb281f9448
# bad: [fc58948644fe3975c541f8452c63dd2d257587bd] Merge commit
'23910d3f669d46073b403876e30a7314599633af' into upstream-merge
git bisect bad fc58948644fe3975c541f8452c63dd2d257587bd

fc58948644fe3975c541f8452c63dd2d257587bd is the first bad commit

git show fc58948644fe3975c541f8452c63dd2d257587bd

commit fc58948644fe3975c541f8452c63dd2d257587bd
Merge: 913f564 23910d3
Author: Marcelo Tosatti <mtosatti@redhat.com>
Date:   Fri Apr 15 08:35:38 2011 -0300

    Merge commit '23910d3f669d46073b403876e30a7314599633af' into upstream-merge

    * commit '23910d3f669d46073b403876e30a7314599633af': (109 commits)
      acpi, acpi_piix: factor out GPE logic
      arm: basic support for ARMv4/ARMv4T emulation
      Fix conversions from pointer to tcg_target_long
      vnc: tight: Fix crash after 2GB of output
      smbus_eeprom: consolidate smbus eeprom creation oc pc_piix,
mips_mapta, mips_fulong
      lan9118: Ignore write to MAC_VLAN1 register
      acpi, acpi_piix, vt82c686: factor out PM1_CNT logic
      acpi, acpi_piix, vt82c686: factor out PM1a EVT logic
      acpi, acpi_piix, vt82c686: factor out PM_TMR logic
      hw/pflash_cfi02: Fix lazy reset of ROMD mode
      configure: avoid basename usage message
      mpc85xx_pci_map_irq: change "unknow" to "unknown"
      event: trivial coding style fixes
      multiboot: Quote filename in error message
      ppce500_mpc8544ds: Fix compile with --enable-debug and --disable-kvm
      Use existing helper function to implement popcntd instruction
      Delay creation of pseries device tree until reset
      pseries: Abolish envs array
      spapr_vscsi: Set uninitialized variable
      Don't call cpu_synchronize_state() from machine init.
      ...

    Conflicts:
        hw/acpi_piix4.c

    Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

diff --cc Makefile.objs
index 1fc3708,44ce368..895d416
--- a/Makefile.objs
+++ b/Makefile.objs
@@@ -14,10 -14,9 +14,10 @@@ oslib-obj-$(CONFIG_POSIX) += oslib-posi
  # block-obj-y is code used by both qemu system emulation and qemu-img

  block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o
module.o async.o
- block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o
+ block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o
qemu-progress.o qemu-sockets.o
  block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
  block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
 +block-obj-$(CONFIG_POSIX) += compatfd.o

  block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o
bochs.o vpc.o vvfat.o
  block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o
qcow2-snapshot.o qcow2-cache.o
diff --cc Makefile.target
index 3e70627,d5761b7..3a37205
--- a/Makefile.target



- Vasilis


* Re: [PATCH] cpu hotplug issue
  2011-07-21 13:08       ` Vasilis Liaskovitis
@ 2011-07-21 13:11         ` Gleb Natapov
  2011-07-21 13:12           ` Vasilis Liaskovitis
  2011-07-21 13:15         ` Avi Kivity
  1 sibling, 1 reply; 42+ messages in thread
From: Gleb Natapov @ 2011-07-21 13:11 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: jan.kiszka, kvm

On Thu, Jul 21, 2011 at 03:08:37PM +0200, Vasilis Liaskovitis wrote:
> Hi,
> 
> thanks for looking at this closer.
> 
> On Thu, Jul 21, 2011 at 1:33 PM, Gleb Natapov <gleb@redhat.com> wrote:
> 
> >> Note that there is probably another bug in qemu-kvm/master regarding
> >> acpi-udev event delivery for
> >> a cpu-hotplug event (cpu_set in the qemu_monitor no longer triggers
> >> the event in the guest, see
> >> first issue in my original mail). This patch does not address that issue.
> >>
> > Did this work in qemu-0.13?
> 
> yes, the acpi-event delivery worked fine for me in 0.13.0
> 
I've already sent a patch to fix this problem in one of the emails in this thread.

> I also tried to separately bisect the acpi-udev delivery issue between
> 0.13.0 and qemu-kvm/master but ended up
> with a big merge commit as the culprit. Not sure if it would help.
> 
> FWIW, here's my "git bisect log" for the acpi-udev delivery issue:
> 
> git bisect start
> # good: [2e1e73f3d746feffc1c12e9463b2fc04e10df860] Merge remote branch
> 'upstream/stable-0.13' into stable-0.13
> git bisect good 2e1e73f3d746feffc1c12e9463b2fc04e10df860
> # bad: [fda19064e889d4419dd3dc69ca8e6e7a1535fdf5] Merge branch 'upstream-merge'
> git bisect bad fda19064e889d4419dd3dc69ca8e6e7a1535fdf5
> # good: [01ac6428a04576ae6f84f07d82c98da304b9ac77] Fix memory leak in
> register save load due to xsave support
> git bisect good 01ac6428a04576ae6f84f07d82c98da304b9ac77
> # good: [f14224a695f51576e33d96a4bc26b9a67899dbb9] kvm_stat: add 'x'
> key for enabling/disabling "drilldown"
> git bisect good f14224a695f51576e33d96a4bc26b9a67899dbb9
> # good: [f14224a695f51576e33d96a4bc26b9a67899dbb9] kvm_stat: add 'x'
> key for enabling/disabling "drilldown"
> git bisect good f14224a695f51576e33d96a4bc26b9a67899dbb9
> # bad: [f47b14bbbe43fbaef9137e2abea8d300a6e6287f] Merge commit
> '45fe15c25a5c9feea6e0f78434f5e9f632de9d94' into upstream-merge
> git bisect bad f47b14bbbe43fbaef9137e2abea8d300a6e6287f
> # good: [d23948b15a9920fb7f6374b55a6db1ecff81f3ee] lm32: add Milkymist
> VGAFB support
> git bisect good d23948b15a9920fb7f6374b55a6db1ecff81f3ee
> # good: [47c0143cdde080101e97a1b39f3ff13e33b5726c] target-i386: add
> CPU86_LDouble <-> double conversion functions
> git bisect good 47c0143cdde080101e97a1b39f3ff13e33b5726c
> # good: [d2d979c628e4b2c4a3cb71a31841875795c79043] NBD: Avoid leaking
> a couple of strings when the NBD device is closed
> git bisect good d2d979c628e4b2c4a3cb71a31841875795c79043
> # bad: [70757dcaa40e14978bf287084d8fab9efb815a2d] qemu-kvm: Limit MSI
> vector walk to actual array size
> git bisect bad 70757dcaa40e14978bf287084d8fab9efb815a2d
> # good: [913f5649dc778a1635e3440edba3429d84caec89] Merge commit
> 'a9f8ad8f2acdb2398da5d32a5efc19cb0196d79f' into upstream-merge
> git bisect good 913f5649dc778a1635e3440edba3429d84caec89
> # bad: [ad16018ee64433754914b73782da94fe2bcc0dac] qemu-kvm: Move gsi
> bits from kvm_msix_vector_add to kvm_msi_add_message
> git bisect bad ad16018ee64433754914b73782da94fe2bcc0dac
> # bad: [98e934f0ab47583e82e3cabaa1c6a48c816e7bd0] qemu-kvm:
> Synchronize states before reset
> git bisect bad 98e934f0ab47583e82e3cabaa1c6a48c816e7bd0
> # bad: [dc080800baef853f5a4d383d4f00d6afa7a46ff4] Merge branch 'upstream-merge'
> git bisect bad dc080800baef853f5a4d383d4f00d6afa7a46ff4
> # bad: [7d821420734b3e1fed438ae41b5791eb281f9448] Merge commit
> '4d9ad7f793605abd9806fc932b3e04e028894565' into upstream-merge
> git bisect bad 7d821420734b3e1fed438ae41b5791eb281f9448
> # bad: [fc58948644fe3975c541f8452c63dd2d257587bd] Merge commit
> '23910d3f669d46073b403876e30a7314599633af' into upstream-merge
> git bisect bad fc58948644fe3975c541f8452c63dd2d257587bd
> 
> fc58948644fe3975c541f8452c63dd2d257587bd is the first bad commit
> 
> git show fc58948644fe3975c541f8452c63dd2d257587bd
> 
> commit fc58948644fe3975c541f8452c63dd2d257587bd
> Merge: 913f564 23910d3
> Author: Marcelo Tosatti <mtosatti@redhat.com>
> Date:   Fri Apr 15 08:35:38 2011 -0300
> 
>     Merge commit '23910d3f669d46073b403876e30a7314599633af' into upstream-merge
> 
>     * commit '23910d3f669d46073b403876e30a7314599633af': (109 commits)
>       acpi, acpi_piix: factor out GPE logic
>       arm: basic support for ARMv4/ARMv4T emulation
>       Fix conversions from pointer to tcg_target_long
>       vnc: tight: Fix crash after 2GB of output
>       smbus_eeprom: consolidate smbus eeprom creation oc pc_piix,
> mips_malta, mips_fulong
>       lan9118: Ignore write to MAC_VLAN1 register
>       acpi, acpi_piix, vt82c686: factor out PM1_CNT logic
>       acpi, acpi_piix, vt82c686: factor out PM1a EVT logic
>       acpi, acpi_piix, vt82c686: factor out PM_TMR logic
>       hw/pflash_cfi02: Fix lazy reset of ROMD mode
>       configure: avoid basename usage message
>       mpc85xx_pci_map_irq: change "unknow" to "unknown"
>       event: trivial coding style fixes
>       multiboot: Quote filename in error message
>       ppce500_mpc8544ds: Fix compile with --enable-debug and --disable-kvm
>       Use existing helper function to implement popcntd instruction
>       Delay creation of pseries device tree until reset
>       pseries: Abolish envs array
>       spapr_vscsi: Set uninitialized variable
>       Don't call cpu_synchronize_state() from machine init.
>       ...
> 
>     Conflicts:
>         hw/acpi_piix4.c
> 
>     Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> diff --cc Makefile.objs
> index 1fc3708,44ce368..895d416
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@@ -14,10 -14,9 +14,10 @@@ oslib-obj-$(CONFIG_POSIX) += oslib-posi
>   # block-obj-y is code used by both qemu system emulation and qemu-img
> 
>   block-obj-y = cutils.o cache-utils.o qemu-malloc.o qemu-option.o
> module.o async.o
> - block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o
> + block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o
> qemu-progress.o qemu-sockets.o
>   block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
>   block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>  +block-obj-$(CONFIG_POSIX) += compatfd.o
> 
>   block-nested-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o
> bochs.o vpc.o vvfat.o
>   block-nested-y += qcow2.o qcow2-refcount.o qcow2-cluster.o
> qcow2-snapshot.o qcow2-cache.o
> diff --cc Makefile.target
> index 3e70627,d5761b7..3a37205
> --- a/Makefile.target
> 
> 
> 
> - Vasilis

--
			Gleb.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-07-21 13:11         ` Gleb Natapov
@ 2011-07-21 13:12           ` Vasilis Liaskovitis
  2011-07-21 13:13             ` Gleb Natapov
  0 siblings, 1 reply; 42+ messages in thread
From: Vasilis Liaskovitis @ 2011-07-21 13:12 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: jan.kiszka, kvm

On Thu, Jul 21, 2011 at 3:11 PM, Gleb Natapov <gleb@redhat.com> wrote:
> On Thu, Jul 21, 2011 at 03:08:37PM +0200, Vasilis Liaskovitis wrote:
> I've already sent a patch to fix this problem in one of the emails in this thread.

oops sorry I missed your last mail/patch. Got it.
thanks again.

- Vasilis


* Re: [PATCH] cpu hotplug issue
  2011-07-21 13:12           ` Vasilis Liaskovitis
@ 2011-07-21 13:13             ` Gleb Natapov
  0 siblings, 0 replies; 42+ messages in thread
From: Gleb Natapov @ 2011-07-21 13:13 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: jan.kiszka, kvm

On Thu, Jul 21, 2011 at 03:12:39PM +0200, Vasilis Liaskovitis wrote:
> On Thu, Jul 21, 2011 at 3:11 PM, Gleb Natapov <gleb@redhat.com> wrote:
> > On Thu, Jul 21, 2011 at 03:08:37PM +0200, Vasilis Liaskovitis wrote:
> > I've already sent a patch to fix this problem in one of the emails in this thread.
> 
> oops sorry I missed your last mail/patch. Got it.
> thanks again.
> 
Why sorry? Thank you for your work.

--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-07-21 13:08       ` Vasilis Liaskovitis
  2011-07-21 13:11         ` Gleb Natapov
@ 2011-07-21 13:15         ` Avi Kivity
  2011-07-21 13:15           ` Avi Kivity
  1 sibling, 1 reply; 42+ messages in thread
From: Avi Kivity @ 2011-07-21 13:15 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: Gleb Natapov, jan.kiszka, kvm

On 07/21/2011 04:08 PM, Vasilis Liaskovitis wrote:
> Hi,
>
> thanks for looking at this closer.
>
> On Thu, Jul 21, 2011 at 1:33 PM, Gleb Natapov<gleb@redhat.com>  wrote:
>
> >>  Note that there is probably another bug in qemu-kvm/master regarding
> >>  acpi-udev event delivery for
> >>  a cpu-hotplug event (cpu_set in the qemu_monitor no longer triggers
> >>  the event in the guest, see
> >>  first issue in my original mail). This patch does not address that issue.
> >>
> >  Did this work in qemu-0.13?
>
> yes, the acpi-event delivery worked fine for me in 0.13.0
>
> I also tried to separately bisect the acpi-udev delivery issue between
> 0.13.0 and qemu-kvm/master but ended up
> with a big merge commit as the culprit. Not sure if it would help.

You can bisect it further.  If you tag the bad commit "bad", then you do

$ git bisect start bad^2 $(git merge-base bad^1 bad^2)

   (for each commit bisect feeds you:)
   $ git merge --no-commit bad^1
<test>
   $ git checkout -f
   $ git bisect <good|bad>

What this does is bisect all commits added by the merge, but first 
re-does the merge before testing.  You shouldn't have any conflicts 
during these merges.

It is recommended to set up ccache to reduce build time, as those merges 
force a full rebuild every time.

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH] cpu hotplug issue
  2011-07-21 13:15         ` Avi Kivity
@ 2011-07-21 13:15           ` Avi Kivity
  0 siblings, 0 replies; 42+ messages in thread
From: Avi Kivity @ 2011-07-21 13:15 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: Gleb Natapov, jan.kiszka, kvm

On 07/21/2011 04:15 PM, Avi Kivity wrote:
>> I also tried to separately bisect the acpi-udev delivery issue between
>> 0.13.0 and qemu-kvm/master but ended up
>> with a big merge commit as the culprit. Not sure if it would help.
>
>
> You can bisect it further.  If you tag the bad commit "bad", then you do

Err, I now see this is moot.

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH] cpu hotplug issue
  2011-07-21 12:18             ` Avi Kivity
  2011-07-21 12:22               ` Gleb Natapov
  2011-07-21 12:39               ` Jan Kiszka
@ 2011-07-21 13:27               ` Lucas Meneghel Rodrigues
  2 siblings, 0 replies; 42+ messages in thread
From: Lucas Meneghel Rodrigues @ 2011-07-21 13:27 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Jan Kiszka, Gleb Natapov, Vasilis Liaskovitis, kvm, Anthony Liguori

On 07/21/2011 09:18 AM, Avi Kivity wrote:
> On 07/21/2011 02:55 PM, Jan Kiszka wrote:
>> >>
>> > CPU hotplug for Linux suppose to be easy (with allow_hotplug patch
>> > applied). But we have two bugs currently. One is that ACPI interrupt
>>
>> I bet we have multiple bugs for quite a while now, which accelerated bit
>> rotting even further.
>>
>
> kvm-autotest to the rescue!
>
> Lucas, do we have a cpu hotplug test in kvm-autotest?

Hey Avi, I just read the thread, and let me tell you guys, I have worked
on a cpu_set test that was supposed to exercise CPU hotplug. I created
the entire test, but I could never get it to pass because the
infrastructure in qemu is broken. I revisited the test in late 2010
and verified that although cpu_set no longer crashes
qemu, the guest OS has no idea whatsoever that a new CPU was
added. I remember that one of the problems we had was lack of
functionality in Seabios...

I can gladly get back to it (I was reluctant to push it
upstream because I never got a successful test result),
put it into shape, and get it committed.

Cheers,

Lucas

> If so, I promise to add it to my autotest run once it is fixed, to
> prevent the spread of the disease.
>
> btw, Anthony, are you running kvm-autotest these days? with qemu-kvm
> diverging less and less from qemu.git these days, it's becoming less
> effective to run it downstream.
>



* Re: [PATCH] cpu hotplug issue
  2011-07-21 12:45           ` Gleb Natapov
@ 2011-07-22 10:56             ` Jan Kiszka
  2011-07-24 11:56               ` Gleb Natapov
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kiszka @ 2011-07-22 10:56 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Vasilis Liaskovitis, kvm, Markus Armbruster

On 2011-07-21 14:45, Gleb Natapov wrote:
> On Thu, Jul 21, 2011 at 02:51:18PM +0300, Gleb Natapov wrote:
>>>> Jan can you look at this please?
>>>
>>> I can't promise to do debugging myself.
>>>
>>> Also, as I never succeeded in getting anything working with CPU hotplug,
>>> even back in the days it was supposed to work, I'm a bit clueless /wrt
>>> to the right test cases.
>>>
>> CPU hotplug for Linux is supposed to be easy (with the allow_hotplug patch
>> applied). But we have two bugs currently. One is that the ACPI interrupt
>> is not sent when a cpu is onlined (at least this appears to be the case).
>> I will look at that one. Another is that after a new cpu is detected it
>> can't be onlined.
>>
>> After fixing the first bug the test should look like this:
>> 1. start vm with -smp 1,maxcpus=2
>> 2. wait for it to boot
>> 3. do "cpu 1 online" in monitor.
>> 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
>>
>> Step 4 should succeed. It fails now.
>>
> The first one was easy to solve. See patch below. Step 3 should be
> "cpu_set 1 online".
> 
> ---
> 
> Trigger sci interrupt after cpu hotplug/unplug event.
> 
> Signed-off-by: Gleb Natapov <gleb@redhat.com>
> diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> index c30a050..40f3fcd 100644
> --- a/hw/acpi_piix4.c
> +++ b/hw/acpi_piix4.c
> @@ -92,7 +92,8 @@ static void pm_update_sci(PIIX4PMState *s)
>                     ACPI_BITMASK_POWER_BUTTON_ENABLE |
>                     ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
>                     ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
> -        (((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
> +        (((s->gpe.sts[0] & s->gpe.en[0]) &
> +	  (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0); 
>  
>      qemu_set_irq(s->irq, sci_level);
>      /* schedule a timer interruption if needed */
> --
> 			Gleb.

I had a closer look and identified two further issues, one generic, one
CPU-hotplug-specific:
 - (qdev) devices that are hotplugged do not receive any reset. That
   does not only apply to the APIC in case of CPU hotplugging, it is
   also broken for NICs, storage controllers, etc. when doing PCI
   hot-add as I just checked via gdb.
 - CPU hotplugging was always (or at least for a fairly long time),
   well, fragile as it failed to make CPU thread creation and CPU
   initialization atomic against APIC addition and other initialization
   steps. IOW, we need to create CPUs stopped, finish all init work,
   sync their states completely to the kernel
   (cpu_synchronize_post_init), and then kick them off. Actually I'm
   considering to stop all CPUs during that short phase to make things
   simpler and future-proof (when we reduce qemu_global_mutex
   dependencies).

Still, something else must be different for hotplugged CPUs as they fail
to come up properly every 2 or 3 system resets or online transitions of
the Linux guest. Will try to understand that once time permits.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: [PATCH] cpu hotplug issue
  2011-07-22 10:56             ` Jan Kiszka
@ 2011-07-24 11:56               ` Gleb Natapov
  2011-07-24 16:11                 ` Jan Kiszka
  0 siblings, 1 reply; 42+ messages in thread
From: Gleb Natapov @ 2011-07-24 11:56 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Vasilis Liaskovitis, kvm, Markus Armbruster

On Fri, Jul 22, 2011 at 12:56:58PM +0200, Jan Kiszka wrote:
> On 2011-07-21 14:45, Gleb Natapov wrote:
> > On Thu, Jul 21, 2011 at 02:51:18PM +0300, Gleb Natapov wrote:
> >>>> Jan can you look at this please?
> >>>
> >>> I can't promise to do debugging myself.
> >>>
> >>> Also, as I never succeeded in getting anything working with CPU hotplug,
> >>> even back in the days it was supposed to work, I'm a bit clueless /wrt
> >>> to the right test cases.
> >>>
> >> CPU hotplug for Linux is supposed to be easy (with the allow_hotplug patch
> >> applied). But we have two bugs currently. One is that the ACPI interrupt
> >> is not sent when a cpu is onlined (at least this appears to be the case).
> >> I will look at that one. Another is that after a new cpu is detected it
> >> can't be onlined.
> >>
> >> After fixing the first bug the test should look like this:
> >> 1. start vm with -smp 1,maxcpus=2
> >> 2. wait for it to boot
> >> 3. do "cpu 1 online" in monitor.
> >> 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
> >>
> >> Step 4 should succeed. It fails now.
> >>
> > The first one was easy to solve. See patch below. Step 3 should be
> > "cpu_set 1 online".
> > 
> > ---
> > 
> > Trigger sci interrupt after cpu hotplug/unplug event.
> > 
> > Signed-off-by: Gleb Natapov <gleb@redhat.com>
> > diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> > index c30a050..40f3fcd 100644
> > --- a/hw/acpi_piix4.c
> > +++ b/hw/acpi_piix4.c
> > @@ -92,7 +92,8 @@ static void pm_update_sci(PIIX4PMState *s)
> >                     ACPI_BITMASK_POWER_BUTTON_ENABLE |
> >                     ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
> >                     ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
> > -        (((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
> > +        (((s->gpe.sts[0] & s->gpe.en[0]) &
> > +	  (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0); 
> >  
> >      qemu_set_irq(s->irq, sci_level);
> >      /* schedule a timer interruption if needed */
> > --
> > 			Gleb.
> 
> I had a closer look and identified two further issues, one generic, one
> CPU-hotplug-specific:
>  - (qdev) devices that are hotplugged do not receive any reset. That
>    does not only apply to the APIC in case of CPU hotplugging, it is
>    also broken for NICs, storage controllers, etc. when doing PCI
>    hot-add as I just checked via gdb.
>  - CPU hotplugging was always (or at least for a fairly long time),
>    well, fragile as it failed to make CPU thread creation and CPU
>    initialization atomic against APIC addition and other initialization
>    steps. IOW, we need to create CPUs stopped, finish all init work,
>    sync their states completely to the kernel
>    (cpu_synchronize_post_init), and then kick them off. Actually I'm
Syncing the state to the kernel should be done by the vcpu thread, so it
cannot be stopped while the sync is done. Maybe I misunderstood what
you mean here.

>    considering to stop all CPUs during that short phase to make things
>    simpler and future-proof (when we reduce qemu_global_mutex
>    dependencies).
> 
> Still, something else must be different for hotplugged CPUs as they fail
> to come up properly every 2 or 3 system resets or online transitions of
> the Linux guest. Will try to understand that once time permits.
> 
> Jan
> 
> -- 
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux

--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-07-24 11:56               ` Gleb Natapov
@ 2011-07-24 16:11                 ` Jan Kiszka
  2011-07-25 13:18                   ` Jan Kiszka
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kiszka @ 2011-07-24 16:11 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Vasilis Liaskovitis, kvm, Markus Armbruster


On 2011-07-24 13:56, Gleb Natapov wrote:
> On Fri, Jul 22, 2011 at 12:56:58PM +0200, Jan Kiszka wrote:
>> On 2011-07-21 14:45, Gleb Natapov wrote:
>>> On Thu, Jul 21, 2011 at 02:51:18PM +0300, Gleb Natapov wrote:
>>>>>> Jan can you look at this please?
>>>>>
>>>>> I can't promise to do debugging myself.
>>>>>
>>>>> Also, as I never succeeded in getting anything working with CPU hotplug,
>>>>> even back in the days it was supposed to work, I'm a bit clueless /wrt
>>>>> to the right test cases.
>>>>>
>>>> CPU hotplug for Linux is supposed to be easy (with the allow_hotplug patch
>>>> applied). But we have two bugs currently. One is that the ACPI interrupt
>>>> is not sent when a cpu is onlined (at least this appears to be the case).
>>>> I will look at that one. Another is that after a new cpu is detected it
>>>> can't be onlined.
>>>>
>>>> After fixing the first bug the test should look like this:
>>>> 1. start vm with -smp 1,maxcpus=2
>>>> 2. wait for it to boot
>>>> 3. do "cpu 1 online" in monitor.
>>>> 4. do "echo 1 > /sys/devices/system/cpu/cpu1/online"
>>>>
>>>> Step 4 should succeed. It fails now.
>>>>
>>> The first one was easy to solve. See patch below. Step 3 should be
>>> "cpu_set 1 online".
>>>
>>> ---
>>>
>>> Trigger sci interrupt after cpu hotplug/unplug event.
>>>
>>> Signed-off-by: Gleb Natapov <gleb@redhat.com>
>>> diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
>>> index c30a050..40f3fcd 100644
>>> --- a/hw/acpi_piix4.c
>>> +++ b/hw/acpi_piix4.c
>>> @@ -92,7 +92,8 @@ static void pm_update_sci(PIIX4PMState *s)
>>>                     ACPI_BITMASK_POWER_BUTTON_ENABLE |
>>>                     ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
>>>                     ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
>>> -        (((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
>>> +        (((s->gpe.sts[0] & s->gpe.en[0]) &
>>> +	  (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0); 
>>>  
>>>      qemu_set_irq(s->irq, sci_level);
>>>      /* schedule a timer interruption if needed */
>>> --
>>> 			Gleb.
>>
>> I had a closer look and identified two further issues, one generic, one
>> CPU-hotplug-specific:
>>  - (qdev) devices that are hotplugged do not receive any reset. That
>>    does not only apply to the APIC in case of CPU hotplugging, it is
>>    also broken for NICs, storage controllers, etc. when doing PCI
>>    hot-add as I just checked via gdb.
>>  - CPU hotplugging was always (or at least for a fairly long time),
>>    well, fragile as it failed to make CPU thread creation and CPU
>>    initialization atomic against APIC addition and other initialization
>>    steps. IOW, we need to create CPUs stopped, finish all init work,
>>    sync their states completely to the kernel
>>    (cpu_synchronize_post_init), and then kick them off. Actually I'm
> Syncing the state to the kernel should be done by the vcpu thread, so it
> cannot be stopped while the sync is done. Maybe I misunderstood what
> you mean here.

Stopped first of all means not entering kvm_cpu_exec before the whole
setup and the initial sync are done.

Syncing the initial state may also happen over the creating context as
long as the vcpus are stopped (analogously to
kvm_cpu_synchronize_post_init).

Jan




* Re: [PATCH] cpu hotplug issue
  2011-07-24 16:11                 ` Jan Kiszka
@ 2011-07-25 13:18                   ` Jan Kiszka
  2011-07-25 13:21                     ` Gleb Natapov
  2011-07-27 16:35                     ` Vasilis Liaskovitis
  0 siblings, 2 replies; 42+ messages in thread
From: Jan Kiszka @ 2011-07-25 13:18 UTC (permalink / raw)
  To: Gleb Natapov, Vasilis Liaskovitis; +Cc: kvm, Markus Armbruster

On 2011-07-24 18:11, Jan Kiszka wrote:
>>> I had a closer look and identified two further issues, one generic, one
>>> CPU-hotplug-specific:
>>>  - (qdev) devices that are hotplugged do not receive any reset. That
>>>    does not only apply to the APIC in case of CPU hotplugging, it is
>>>    also broken for NICs, storage controllers, etc. when doing PCI
>>>    hot-add as I just checked via gdb.
>>>  - CPU hotplugging was always (or at least for a fairly long time),
>>>    well, fragile as it failed to make CPU thread creation and CPU
>>>    initialization atomic against APIC addition and other initialization
>>>    steps. IOW, we need to create CPUs stopped, finish all init work,
>>>    sync their states completely to the kernel
>>>    (cpu_synchronize_post_init), and then kick them off. Actually I'm
>> Syncing the state to the kernel should be done by the vcpu thread, so it
>> cannot be stopped while the sync is done. Maybe I misunderstood what
>> you mean here.
> 
> Stopped first of all means not entering kvm_cpu_exec before the whole
> setup and the initial sync are done.
> 
> Syncing the initial state may also happen over the creating context as
> long as the vcpus are stopped (analogously to
> kvm_cpu_synchronize_post_init).

OK, hacks below plus the following three patches make CPU hotplug work
again - with some exceptions. Here are the patches:

1. http://thread.gmane.org/gmane.comp.emulators.kvm.devel/76484
2. http://thread.gmane.org/gmane.comp.emulators.qemu/110272
3. http://thread.gmane.org/gmane.comp.emulators.qemu/110426

And here are the hacks (well, the first hunk is clearly a fix, the last
one clearly a hack, /me still undecided about the rest):

diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index c30a050..f650250 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -92,7 +92,8 @@ static void pm_update_sci(PIIX4PMState *s)
                    ACPI_BITMASK_POWER_BUTTON_ENABLE |
                    ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
                    ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
-        (((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
+        (((s->gpe.sts[0] & s->gpe.en[0]) &
+          (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0);
 
     qemu_set_irq(s->irq, sci_level);
     /* schedule a timer interruption if needed */
diff --git a/hw/pc.c b/hw/pc.c
index c0a88e1..e5371be 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -42,6 +42,7 @@
 #include "kvm.h"
 #include "blockdev.h"
 #include "ui/qemu-spice.h"
+#include "cpus.h"
 
 /* output Bochs bios info messages */
 //#define DEBUG_BIOS
@@ -936,6 +937,10 @@ CPUState *pc_new_cpu(const char *cpu_model)
 #endif
     }
 
+    if (vm_running) {
+        pause_all_vcpus();
+    }
+
     env = cpu_init(cpu_model);
     if (!env) {
         fprintf(stderr, "Unable to find x86 CPU definition\n");
@@ -947,6 +952,11 @@ CPUState *pc_new_cpu(const char *cpu_model)
     }
     qemu_register_reset(pc_cpu_reset, env);
     pc_cpu_reset(env);
+
+    cpu_synchronize_post_init(env);
+    if (vm_running) {
+        resume_all_vcpus();
+    }
     return env;
 }
 
diff --git a/hw/qdev.c b/hw/qdev.c
index 1626131..b91e2c2 100644
--- a/hw/qdev.c
+++ b/hw/qdev.c
@@ -330,6 +330,7 @@ BusState *sysbus_get_default(void)
     if (!main_system_bus) {
         main_system_bus = qbus_create(&system_bus_info, NULL,
                                       "main-system-bus");
+        main_system_bus->allow_hotplug = 1;
     }
     return main_system_bus;
 }

I see two remaining problems:
 - kvmclock is somehow broken, either in my guest kernel (OpenSUSE HEAD
   3.0.0-2) or the host, -cpu host,-kvmclock works around sporadic
   guest lockups on echo 1 > /sys...
 - Seabios tends to lock up once every few system_reset after some
   CPU has been hot-added - also in TCG mode. It seems to dislike any
   setup of #CPUs > smp_cpus (whatever that implies in details).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


* Re: [PATCH] cpu hotplug issue
  2011-07-25 13:18                   ` Jan Kiszka
@ 2011-07-25 13:21                     ` Gleb Natapov
  2011-07-25 13:26                       ` Jan Kiszka
  2011-07-27 16:35                     ` Vasilis Liaskovitis
  1 sibling, 1 reply; 42+ messages in thread
From: Gleb Natapov @ 2011-07-25 13:21 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Vasilis Liaskovitis, kvm, Markus Armbruster

On Mon, Jul 25, 2011 at 03:18:19PM +0200, Jan Kiszka wrote:
> On 2011-07-24 18:11, Jan Kiszka wrote:
> >>> I had a closer look and identified two further issues, one generic, one
> >>> CPU-hotplug-specific:
> >>>  - (qdev) devices that are hotplugged do not receive any reset. That
> >>>    does not only apply to the APIC in case of CPU hotplugging, it is
> >>>    also broken for NICs, storage controllers, etc. when doing PCI
> >>>    hot-add as I just checked via gdb.
> >>>  - CPU hotplugging was always (or at least for a fairly long time),
> >>>    well, fragile as it failed to make CPU thread creation and CPU
> >>>    initialization atomic against APIC addition and other initialization
> >>>    steps. IOW, we need to create CPUs stopped, finish all init work,
> >>>    sync their states completely to the kernel
> >>>    (cpu_synchronize_post_init), and then kick them off. Actually I'm
> >> Syncing the state to the kernel should be done by the vcpu thread, so it
> >> cannot be stopped while the sync is done. Maybe I misunderstood what
> >> you mean here.
> > 
> > Stopped first of all means not entering kvm_cpu_exec before the whole
> > setup and the initial sync are done.
> > 
> > Syncing the initial state may also happen over the creating context as
> > long as the vcpus are stopped (analogously to
> > kvm_cpu_synchronize_post_init).
> 
> OK, hacks below plus the following three patches make CPU hotplug work
> again - with some exceptions. Here are the patches:
> 
> 1. http://thread.gmane.org/gmane.comp.emulators.kvm.devel/76484
> 2. http://thread.gmane.org/gmane.comp.emulators.qemu/110272
> 3. http://thread.gmane.org/gmane.comp.emulators.qemu/110426
> 
> And here are the hacks (well, the first hunk is clearly a fix, the last
> one clearly a hack, /me still undecided about the rest):
> 
> diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
> index c30a050..f650250 100644
> --- a/hw/acpi_piix4.c
> +++ b/hw/acpi_piix4.c
> @@ -92,7 +92,8 @@ static void pm_update_sci(PIIX4PMState *s)
>                     ACPI_BITMASK_POWER_BUTTON_ENABLE |
>                     ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
>                     ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
> -        (((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
> +        (((s->gpe.sts[0] & s->gpe.en[0]) &
> +          (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0);
>  
>      qemu_set_irq(s->irq, sci_level);
>      /* schedule a timer interruption if needed */
> diff --git a/hw/pc.c b/hw/pc.c
> index c0a88e1..e5371be 100644
> --- a/hw/pc.c
> +++ b/hw/pc.c
> @@ -42,6 +42,7 @@
>  #include "kvm.h"
>  #include "blockdev.h"
>  #include "ui/qemu-spice.h"
> +#include "cpus.h"
>  
>  /* output Bochs bios info messages */
>  //#define DEBUG_BIOS
> @@ -936,6 +937,10 @@ CPUState *pc_new_cpu(const char *cpu_model)
>  #endif
>      }
>  
> +    if (vm_running) {
> +        pause_all_vcpus();
> +    }
> +
>      env = cpu_init(cpu_model);
>      if (!env) {
>          fprintf(stderr, "Unable to find x86 CPU definition\n");
> @@ -947,6 +952,11 @@ CPUState *pc_new_cpu(const char *cpu_model)
>      }
>      qemu_register_reset(pc_cpu_reset, env);
>      pc_cpu_reset(env);
> +
> +    cpu_synchronize_post_init(env);
> +    if (vm_running) {
> +        resume_all_vcpus();
> +    }
>      return env;
>  }
>  
> diff --git a/hw/qdev.c b/hw/qdev.c
> index 1626131..b91e2c2 100644
> --- a/hw/qdev.c
> +++ b/hw/qdev.c
> @@ -330,6 +330,7 @@ BusState *sysbus_get_default(void)
>      if (!main_system_bus) {
>          main_system_bus = qbus_create(&system_bus_info, NULL,
>                                        "main-system-bus");
> +        main_system_bus->allow_hotplug = 1;
>      }
>      return main_system_bus;
>  }
> 
> I see two remaining problems:
>  - kvmclock is somehow broken, either in my guest kernel (OpenSUSE HEAD
>    3.0.0-2) or the host, -cpu host,-kvmclock works around sporadic
>    guest lockups on echo 1 > /sys...
>  - Seabios tends to lock up once every few system_reset after some
>    CPU has been hot-added - also in TCG mode. It seems to dislike any
>    setup of #CPUs > smp_cpus (whatever that implies in details).
> 
Have you specified maxcpus? Something like -smp 1,maxcpus=4.

--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-07-25 13:21                     ` Gleb Natapov
@ 2011-07-25 13:26                       ` Jan Kiszka
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Kiszka @ 2011-07-25 13:26 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Vasilis Liaskovitis, kvm, Markus Armbruster

On 2011-07-25 15:21, Gleb Natapov wrote:
> On Mon, Jul 25, 2011 at 03:18:19PM +0200, Jan Kiszka wrote:
>> On 2011-07-24 18:11, Jan Kiszka wrote:
>>>>> I had a closer look and identified two further issues, one generic, one
>>>>> CPU-hotplug-specific:
>>>>>  - (qdev) devices that are hotplugged do not receive any reset. That
>>>>>    does not only apply to the APIC in case of CPU hotplugging, it is
>>>>>    also broken for NICs, storage controllers, etc. when doing PCI
>>>>>    hot-add as I just checked via gdb.
>>>>>  - CPU hotplugging was always (or at least for a fairly long time),
>>>>>    well, fragile as it failed to make CPU thread creation and CPU
>>>>>    initialization atomic against APIC addition and other initialization
>>>>>    steps. IOW, we need to create CPUs stopped, finish all init work,
>>>>>    sync their states completely to the kernel
>>>>>    (cpu_synchronize_post_init), and then kick them off. Actually I'm
>>>> Syncing the state to the kernel should be done by the vcpu thread, so it
>>>> cannot be stopped while the sync is done. Maybe I misunderstood what
>>>> you mean here.
>>>
>>> Stopped first of all means not entering kvm_cpu_exec before the whole
>>> setup and the initial sync are done.
>>>
>>> Syncing the initial state may also happen over the creating context as
>>> long as the vcpus are stopped (analogously to
>>> kvm_cpu_synchronize_post_init).
>>
>> OK, hacks below plus the following three patches make CPU hotplug work
>> again - with some exceptions. Here are the patches:
>>
>> 1. http://thread.gmane.org/gmane.comp.emulators.kvm.devel/76484
>> 2. http://thread.gmane.org/gmane.comp.emulators.qemu/110272
>> 3. http://thread.gmane.org/gmane.comp.emulators.qemu/110426
>>
>> And here are the hacks (well, the first hunk is clearly a fix, the last
>> one clearly a hack, /me still undecided about the rest):
>>
>> diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
>> index c30a050..f650250 100644
>> --- a/hw/acpi_piix4.c
>> +++ b/hw/acpi_piix4.c
>> @@ -92,7 +92,8 @@ static void pm_update_sci(PIIX4PMState *s)
>>                     ACPI_BITMASK_POWER_BUTTON_ENABLE |
>>                     ACPI_BITMASK_GLOBAL_LOCK_ENABLE |
>>                     ACPI_BITMASK_TIMER_ENABLE)) != 0) ||
>> -        (((s->gpe.sts[0] & s->gpe.en[0]) & PIIX4_PCI_HOTPLUG_STATUS) != 0);
>> +        (((s->gpe.sts[0] & s->gpe.en[0]) &
>> +          (PIIX4_PCI_HOTPLUG_STATUS | PIIX4_CPU_HOTPLUG_STATUS)) != 0);
>>  
>>      qemu_set_irq(s->irq, sci_level);
>>      /* schedule a timer interruption if needed */
>> diff --git a/hw/pc.c b/hw/pc.c
>> index c0a88e1..e5371be 100644
>> --- a/hw/pc.c
>> +++ b/hw/pc.c
>> @@ -42,6 +42,7 @@
>>  #include "kvm.h"
>>  #include "blockdev.h"
>>  #include "ui/qemu-spice.h"
>> +#include "cpus.h"
>>  
>>  /* output Bochs bios info messages */
>>  //#define DEBUG_BIOS
>> @@ -936,6 +937,10 @@ CPUState *pc_new_cpu(const char *cpu_model)
>>  #endif
>>      }
>>  
>> +    if (vm_running) {
>> +        pause_all_vcpus();
>> +    }
>> +
>>      env = cpu_init(cpu_model);
>>      if (!env) {
>>          fprintf(stderr, "Unable to find x86 CPU definition\n");
>> @@ -947,6 +952,11 @@ CPUState *pc_new_cpu(const char *cpu_model)
>>      }
>>      qemu_register_reset(pc_cpu_reset, env);
>>      pc_cpu_reset(env);
>> +
>> +    cpu_synchronize_post_init(env);
>> +    if (vm_running) {
>> +        resume_all_vcpus();
>> +    }
>>      return env;
>>  }
>>  
>> diff --git a/hw/qdev.c b/hw/qdev.c
>> index 1626131..b91e2c2 100644
>> --- a/hw/qdev.c
>> +++ b/hw/qdev.c
>> @@ -330,6 +330,7 @@ BusState *sysbus_get_default(void)
>>      if (!main_system_bus) {
>>          main_system_bus = qbus_create(&system_bus_info, NULL,
>>                                        "main-system-bus");
>> +        main_system_bus->allow_hotplug = 1;
>>      }
>>      return main_system_bus;
>>  }
>>
>> I see two remaining problems:
>>  - kvmclock is somehow broken, either in my guest kernel (OpenSUSE HEAD
>>    3.0.0-2) or the host, -cpu host,-kvmclock works around sporadic
>>    guest lockups on echo 1 > /sys...
>>  - Seabios tends to lock up once every few system_reset after some
>>    CPU has been hot-added - also in TCG mode. It seems to dislike any
>>    setup of #CPUs > smp_cpus (whatever that implies in details).
>>
> Have you specified maxcpus? Something like -smp 1,maxcpus=4.

Yes, for sure.

BTW, cpu_set completely lacks any parameter sanity checks. That
interface looks like "for gurus only". Hope we can do better via qdev
and properties.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-07-25 13:18                   ` Jan Kiszka
  2011-07-25 13:21                     ` Gleb Natapov
@ 2011-07-27 16:35                     ` Vasilis Liaskovitis
  2011-07-28 16:52                       ` Jan Kiszka
  1 sibling, 1 reply; 42+ messages in thread
From: Vasilis Liaskovitis @ 2011-07-27 16:35 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gleb Natapov, kvm, Markus Armbruster

Hi,

On Mon, Jul 25, 2011 at 3:18 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>
> OK, hacks below plus the following three patches make CPU hotplug work
> again - with some exceptions. Here are the patches:
>
> 1. http://thread.gmane.org/gmane.comp.emulators.kvm.devel/76484
> 2. http://thread.gmane.org/gmane.comp.emulators.qemu/110272
> 3. http://thread.gmane.org/gmane.comp.emulators.qemu/110426
>
> And here are the hacks (well, the first hunk is clearly a fix, the last
> one clearly a hack, /me still undecided about the rest):

Thanks for these. Testing a bit late here...
Applying these 3 patches + hacks/fix on master doesn't solve the
initial "CPU: not responding" dmesg hotplug issue on my setup.
The ACPI event is delivered correctly, but "echo 1 >
/sys/devices/system/cpu/cpuX/online" still fails; tested with
"-smp 1,maxcpus=2", for example.

Could you share your qemu-kvm command line? I have been trying a
2.6.39 guest kernel (as opposed to 3.0.2), but I doubt that's the
issue.
Can someone else confirm whether this solves the issue for their setup or not?

> I see two remaining problems:
>  - kvmclock is somehow broken, either in my guest kernel (OpenSUSE HEAD
>    3.0.0-2) or the host, -cpu host,-kvmclock works around sporadic
>    guest lockups on echo 1 > /sys...

Could you explain this a bit further? Specifying "-cpu host,-kvmclock"
avoids the lockups? I have tried adding this to my command line, but it
didn't help (then again, my problem is not guest lockups: the onlining
fails).

>  - Seabios tends to lock up once every few system_reset after some
>    CPU has been hot-added - also in TCG mode. It seems to dislike any
>    setup of #CPUs > smp_cpus (whatever that implies in details).

I always get a locked-up SeaBIOS when rebooting after a hotplug attempt
with these patches (or with my initial vcpu_dirty workaround, FWIW).

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-07-27 16:35                     ` Vasilis Liaskovitis
@ 2011-07-28 16:52                       ` Jan Kiszka
  2011-08-02  9:46                         ` Vasilis Liaskovitis
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kiszka @ 2011-07-28 16:52 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: Gleb Natapov, kvm, Markus Armbruster

On 2011-07-27 18:35, Vasilis Liaskovitis wrote:
> Hi,
> 
> On Mon, Jul 25, 2011 at 3:18 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>
>> OK, hacks below plus the following three patches make CPU hotplug work
>> again - with some exceptions. Here are the patches:
>>
>> 1. http://thread.gmane.org/gmane.comp.emulators.kvm.devel/76484
>> 2. http://thread.gmane.org/gmane.comp.emulators.qemu/110272
>> 3. http://thread.gmane.org/gmane.comp.emulators.qemu/110426
>>
>> And here are the hacks (well, the first hunk is clearly a fix, the last
>> one clearly a hack, /me still undecided about the rest):
> 
> thanks for these. Testing a bit late here...
> Applying these 3 patches + hacks/fix on master doesn't solve the
> initial "CPU: not responding" dmesg hotplug issue on my setup.
> The ACPI event is delivered correctly, but "echo 1 >
> /sys/devices/system/cpu/cpuX/online" still fails.
> e.g. tested with "-smp 1,maxcpus=2"
> 
> Could you share your qemu-kvm command line?

Nothing special:

qemu-system-x86_64 /my/image -m 1G -snapshot -smp 2,maxcpus=3 \
	-cpu host,-kvmclock

> I have been trying 2.6.39
> guest kernel (as opposed to 3.0.2) but
> I doubt that's the issue.
> Can someone else confirm whether this solves the issue for their setup or not?
> 
>> I see two remaining problems:
>>  - kvmclock is somehow broken, either in my guest kernel (OpenSUSE HEAD
>>    3.0.0-2) or the host, -cpu host,-kvmclock works around sporadic
>>    guest lockups on echo 1 > /sys...
> 
> could you explain this a bit further? specifying "-cpu host,-kvmclock"
> avoids the lockups? I have tried

Yes - apparently. I haven't checked if kvmclock is actually the problem
or just changing the boundary conditions so that the issue no longer
shows up here.

> adding this to my command line but didn't help (but then again my
> problem is not guest lockups: the onlining fails)
> 
>>  - Seabios tends to lock up once every few system_reset after some
>>    CPU has been hot-added - also in TCG mode. It seems to dislike any
>>    setup of #CPUs > smp_cpus (whatever that implies in details).
> 
> I always get a locked-up SeaBIOS when rebooting after a hotplug attempt
> with these patches (or with my initial vcpu_dirty workaround, FWIW)

Yes, this problem is most likely independent of the above. I even got
those lockups when creating smp_cpus+1 vcpus in pc_cpus_init.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-07-28 16:52                       ` Jan Kiszka
@ 2011-08-02  9:46                         ` Vasilis Liaskovitis
  2011-08-02 10:24                           ` Jan Kiszka
  0 siblings, 1 reply; 42+ messages in thread
From: Vasilis Liaskovitis @ 2011-08-02  9:46 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gleb Natapov, kvm, Markus Armbruster

Hi,

On Thu, Jul 28, 2011 at 6:52 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
> On 2011-07-27 18:35, Vasilis Liaskovitis wrote:
>> Hi,
>>
>> On Mon, Jul 25, 2011 at 3:18 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> Applying these 3 patches + hacks/fix on master doesn't solve the
>> initial "CPU: not responding" dmesg hotplug issue on my setup.
>> The ACPI event is delivered correctly, but "echo 1 >
>> /sys/devices/system/cpu/cpuX/online" still fails.
>> e.g. tested with "-smp 1,maxcpus=2"
>>
>> Could you share your qemu-kvm command line?
>
> Nothing special:
>
> qemu-system-x86_64 /my/image -m 1G -snapshot -smp 2,maxcpus=3 \
>        -cpu host,-kvmclock
>

OK, I have tested the patched qemu-kvm (guest OS Ubuntu 10.10 with a
2.6.35-24-generic kernel) with:

qemu-system-x86_64 /my/image -m 1G -smp 1,maxcpus=2 \
        -cpu host,-kvmclock

on 2 different physical hosts. On the first one:

vendor_id       : GenuineIntel
cpu family      : 6
model           : 37
model name      : Intel(R) Core(TM) i3 CPU       M 370  @ 2.40GHz

cpu hotplug works ok. When onlining (echo 1 >
/sys/devices/system/cpu/cpu1/online) dmesg shows:

Booting Node 0 Processor 1 APIC 0x1
TSC synchronization [CPU#0 -> CPU#1]:
Measured 3573498937792 cycles TSC warp between CPUs, turning off TSC clock.
Marking TSC unstable due to check_tsc_sync_source failed
kvm-clock: cpu 1, msr 0:1f15901, secondary cpu clock

on the second system:

vendor_id	: AuthenticAMD
cpu family	: 16
model		: 4
model name	: AMD Phenom(tm) II X4 955 Processor

onlining fails:

Booting Node 0 Processor 1 APIC 0x1
CPU1: Not Responding.

When I specify -kvmclock on the qemu command line for the AMD system,
it seems that the clocksource on the guest is hpet.
Dmesg on guest shows "Switching to clocksource hpet" and
/sys/devices/system/clocksource/clocksource0/current_clocksource
confirms.

When I don't specify -kvmclock, then the guest clocksource is actually
kvm-clock. But the onlining problem persists.

I have also tried a newer guestOS (debian-testing with 2.6.39 or 3.0.0
kernel) on the second system, same problem.

Also: what physical host have you used for testing?

thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-08-02  9:46                         ` Vasilis Liaskovitis
@ 2011-08-02 10:24                           ` Jan Kiszka
  2011-08-02 13:41                             ` Vasilis Liaskovitis
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kiszka @ 2011-08-02 10:24 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: Gleb Natapov, kvm, Markus Armbruster

On 2011-08-02 11:46, Vasilis Liaskovitis wrote:
> Hi,
> 
> On Thu, Jul 28, 2011 at 6:52 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> On 2011-07-27 18:35, Vasilis Liaskovitis wrote:
>>> Hi,
>>>
>>> On Mon, Jul 25, 2011 at 3:18 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>>> Applying these 3 patches + hacks/fix on master doesn't solve the
>>> initial "CPU: not responding" dmesg hotplug issue on my setup.
>>> The ACPI event is delivered correctly, but "echo 1 >
>>> /sys/devices/system/cpu/cpuX/online" still fails.
>>> e.g. tested with "-smp 1,maxcpus=2"
>>>
>>> Could you share your qemu-kvm command line?
>>
>> Nothing special:
>>
>> qemu-system-x86_64 /my/image -m 1G -snapshot -smp 2,maxcpus=3 \
>>        -cpu host,-kvmclock
>>
> 
> ok, I have tested the patched qemu-kvm (guestOS ubuntu 10.10 with
> 2.6.35-24-generic kernel)  with:
> 
> qemu-system-x86_64 /my/image -m 1G -smp 1,maxcpus=2 \
>         -cpu host,-kvmclock
> 
> on 2 different physical hosts. On the first one:
> 
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 37
> model name      : Intel(R) Core(TM) i3 CPU       M 370  @ 2.40GHz
> 
> cpu hotplug works ok. When onlining (echo 1 >
> /sys/devices/system/cpu/cpu1/online) dmesg shows:
> 
> Booting Node 0 Processor 1 APIC 0x1
> TSC synchronization [CPU#0 -> CPU#1]:
> Measured 3573498937792 cycles TSC warp between CPUs, turning off TSC clock.
> Marking TSC unstable due to check_tsc_sync_source failed
> kvm-clock: cpu 1, msr 0:1f15901, secondary cpu clock

Strange. Why does it still talk about kvm-clock?

> 
> on the second system:
> 
> vendor_id	: AuthenticAMD
> cpu family	: 16
> model		: 4
> model name	: AMD Phenom(tm) II X4 955 Processor
> 
> onlining fails:
> 
> Booting Node 0 Processor 1 APIC 0x1
> CPU1: Not Responding.
> 
> When I specify -kvmclock on the qemu command line for the AMD system,
> it seems that the clocksource on the guest is hpet.
> Dmesg on guest shows "Switching to clocksource hpet" and
> /sys/devices/system/clocksource/clocksource0/current_clocksource
> confirms.
> 
> When I don't specify -kvmclock, then the guest clocksource is actually
> kvm-clock. But the onlining problem persists.
> 
> I have also tried a newer guestOS (debian-testing with 2.6.39 or 3.0.0
> kernel) on the second system, same problem.
> 
> Also: what physical host have you used for testing?

Intel(R) Core(TM) i7 CPU       M 620  @ 2.67GHz

FWIW, I've pushed my tree here:

git://git.kiszka.org/qemu-kvm.git queues/cpu-hotplug

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-08-02 10:24                           ` Jan Kiszka
@ 2011-08-02 13:41                             ` Vasilis Liaskovitis
  2011-08-03 10:07                               ` Vasilis Liaskovitis
  0 siblings, 1 reply; 42+ messages in thread
From: Vasilis Liaskovitis @ 2011-08-02 13:41 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Gleb Natapov, kvm, Markus Armbruster

On Tue, Aug 2, 2011 at 12:24 PM, Jan Kiszka <jan.kiszka@siemens.com> wrote:

> FWIW, I've pushed my tree here:
>
> git://git.kiszka.org/qemu-kvm.git queues/cpu-hotplug
>

Thanks. Your cpu-hotplug tree or today's git-master (commit
dacdc4b10bafbb21120e1c24a9665444768ef999) works.

Previously I tested the second system with master from the middle of
last week (commit 497d8129ad4bca8a3b319ee84229d9a0f9cdd15c), which was
probably missing something else in between merges. Apologies.

>
> Strange. Why does it still talk about kvm-clock?
>

I do get kvmclock dmesg output when onlining e.g. on the second system as well:

[   89.062098] Booting Node 0 Processor 1 APIC 0x1
[   89.062100] smpboot cpu 1: start_ip = 98000
[   89.074229] kvm-clock: cpu 1, msr 0:3fd12f81, secondary cpu clock
[   89.075038] NMI watchdog disabled (cpu1): hardware events not enabled
[   89.076103] Switched to NOHz mode on CPU #1

kvm_register_clock() prints the kvm-clock line; isn't this expected to
be called when the new CPU is onlined?
thanks,

- Vasilis

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-08-02 13:41                             ` Vasilis Liaskovitis
@ 2011-08-03 10:07                               ` Vasilis Liaskovitis
  2011-08-03 10:37                                 ` Jan Kiszka
  0 siblings, 1 reply; 42+ messages in thread
From: Vasilis Liaskovitis @ 2011-08-03 10:07 UTC (permalink / raw)
  To: Jan Kiszka, seabios, kvm; +Cc: gleb, armbru

When rebooting after a CPU hotplug in qemu-kvm, Seabios can get stuck in smp_probe().
Normally cmos_smp_count is read from the CMOS and smp_ap_boot_code is run on all CPUs
except the bootstrap one. The expected result is CountCPUs == cmos_smp_count + 1. After a
CPU hotplug, more than cmos_smp_count CPUs are active, so we get a situation where
CountCPUs > cmos_smp_count + 1 and Seabios keeps looping forever in smp_probe(). In some
cases, the while loop exit condition is tested before CountCPUs gets larger (i.e.
before smp_ap_boot_code runs on all CPUs), so the hang is not always reproducible.

This patch introduces a new fw_cfg variable called hotplugged_cpus that gets updated from
the qemu-kvm hotplug code. Seabios reads this variable on each call to smp_probe() and adjusts
the expected number of online CPUs.

The qemu-kvm part of this patch is against Jan Kiszka's cpu-hotplug tree:
git://git.kiszka.org/qemu-kvm.git queues/cpu-hotplug
tested with qemu-system-x86_64.

An alternative to this patch would be to update the smp_cpus variable in qemu-kvm and 
do a "cmos update" to 0x5f from the cpu-hotplug code. Access to the rtc_state (cmos device) 
would be required in hw/acpi_piix4.c.  This way no change to Seabios would be required.

---

 hw/acpi_piix4.c |    6 ++++++
 hw/fw_cfg.c     |   14 ++++++++++++++
 hw/fw_cfg.h     |    2 ++
 hw/loader.c     |    2 +-
 sysemu.h        |    1 +
 vl.c            |    1 +
 6 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c
index bbb9edd..54d98ae 100644
--- a/hw/acpi_piix4.c
+++ b/hw/acpi_piix4.c
@@ -24,6 +24,7 @@
 #include "sysemu.h"
 #include "range.h"
 #include "ioport.h"
+#include "fw_cfg.h"
 
 //#define DEBUG
 
@@ -539,6 +540,7 @@ static void pcirmv_write(void *opaque, uint32_t addr, uint32_t val)
 }
 
 extern const char *global_cpu_model;
+extern FWCfgState *fw_cfg;
 
 static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev,
                                 PCIHotplugState state);
@@ -580,6 +582,8 @@ static void enable_processor(PIIX4PMState *s, int cpu)
 
     *gpe->sts = *gpe->sts | PIIX4_CPU_HOTPLUG_STATUS;
     g->cpus_sts[cpu/8] |= (1 << (cpu%8));
+    hotplugged_cpus++;
+    fw_cfg_update_i16(fw_cfg, FW_CFG_HPLUG_CPUS, (uint16_t)hotplugged_cpus);
 }
 
 static void disable_processor(PIIX4PMState *s, int cpu)
@@ -589,6 +593,8 @@ static void disable_processor(PIIX4PMState *s, int cpu)
 
     *gpe->sts = *gpe->sts | PIIX4_CPU_HOTPLUG_STATUS;
     g->cpus_sts[cpu/8] &= ~(1 << (cpu%8));
+    hotplugged_cpus--;
+    fw_cfg_update_i16(fw_cfg, FW_CFG_HPLUG_CPUS, (uint16_t)hotplugged_cpus);
 }
 
 void qemu_system_cpu_hot_add(int cpu, int state)
diff --git a/hw/fw_cfg.c b/hw/fw_cfg.c
index a29db90..228f517 100644
--- a/hw/fw_cfg.c
+++ b/hw/fw_cfg.c
@@ -395,6 +395,19 @@ int fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value)
     return fw_cfg_add_bytes(s, key, (uint8_t *)copy, sizeof(value));
 }
 
+int fw_cfg_update_i16(FWCfgState *s, uint16_t key, uint16_t data)
+{
+    int arch = !!(key & FW_CFG_ARCH_LOCAL);
+    uint16_t *p;
+
+    key &= FW_CFG_ENTRY_MASK;
+
+    p = (uint16_t*)s->entries[arch][key].data;
+    *p = cpu_to_le16(data);
+    
+    return 1;
+}
+
 int fw_cfg_add_callback(FWCfgState *s, uint16_t key, FWCfgCallback callback,
                         void *callback_opaque, uint8_t *data, size_t len)
 {
@@ -489,6 +502,7 @@ FWCfgState *fw_cfg_init(uint32_t ctl_port, uint32_t data_port,
     fw_cfg_add_i16(s, FW_CFG_NOGRAPHIC, (uint16_t)(display_type == DT_NOGRAPHIC));
     fw_cfg_add_i16(s, FW_CFG_NB_CPUS, (uint16_t)smp_cpus);
     fw_cfg_add_i16(s, FW_CFG_MAX_CPUS, (uint16_t)max_cpus);
+    fw_cfg_add_i16(s, FW_CFG_HPLUG_CPUS, (uint16_t)hotplugged_cpus);
     fw_cfg_add_i16(s, FW_CFG_BOOT_MENU, (uint16_t)boot_menu);
     fw_cfg_bootsplash(s);
 
diff --git a/hw/fw_cfg.h b/hw/fw_cfg.h
index 856bf91..95d74b4 100644
--- a/hw/fw_cfg.h
+++ b/hw/fw_cfg.h
@@ -27,6 +27,7 @@
 #define FW_CFG_SETUP_SIZE       0x17
 #define FW_CFG_SETUP_DATA       0x18
 #define FW_CFG_FILE_DIR         0x19
+#define FW_CFG_HPLUG_CPUS       0x1a
 
 #define FW_CFG_FILE_FIRST       0x20
 #define FW_CFG_FILE_SLOTS       0x10
@@ -56,6 +57,7 @@ typedef void (*FWCfgCallback)(void *opaque, uint8_t *data);
 typedef struct FWCfgState FWCfgState;
 int fw_cfg_add_bytes(FWCfgState *s, uint16_t key, uint8_t *data, uint32_t len);
 int fw_cfg_add_i16(FWCfgState *s, uint16_t key, uint16_t value);
+int fw_cfg_update_i16(FWCfgState *s, uint16_t key, uint16_t data);
 int fw_cfg_add_i32(FWCfgState *s, uint16_t key, uint32_t value);
 int fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value);
 int fw_cfg_add_callback(FWCfgState *s, uint16_t key, FWCfgCallback callback,
diff --git a/hw/loader.c b/hw/loader.c
index 35d792e..2a017d1 100644
--- a/hw/loader.c
+++ b/hw/loader.c
@@ -536,7 +536,7 @@ struct Rom {
     QTAILQ_ENTRY(Rom) next;
 };
 
-static FWCfgState *fw_cfg;
+FWCfgState *fw_cfg;
 static QTAILQ_HEAD(, Rom) roms = QTAILQ_HEAD_INITIALIZER(roms);
 
 static void rom_insert(Rom *rom)
diff --git a/sysemu.h b/sysemu.h
index 59fae48..a0658fd 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -115,6 +115,7 @@ extern int ctrl_grab;
 extern int usb_enabled;
 extern int smp_cpus;
 extern int max_cpus;
+extern int hotplugged_cpus;
 extern int cursor_hide;
 extern int graphic_rotate;
 extern int no_quit;
diff --git a/vl.c b/vl.c
index 3fa1711..4ab8c63 100644
--- a/vl.c
+++ b/vl.c
@@ -204,6 +204,7 @@ int rtc_td_hack = 0;
 int usb_enabled = 0;
 int singlestep = 0;
 int smp_cpus = 1;
+int hotplugged_cpus = 0;
 int max_cpus = 0;
 int smp_cores = 1;
 int smp_threads = 1;



 src/paravirt.c |   12 ++++++++++++
 src/paravirt.h |    2 ++
 src/smp.c      |    6 ++++--
 3 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/src/paravirt.c b/src/paravirt.c
index 9cf77de..3367609 100644
--- a/src/paravirt.c
+++ b/src/paravirt.c
@@ -305,6 +305,18 @@ u16 qemu_cfg_get_max_cpus(void)
     return cnt;
 }
 
+u16 qemu_cfg_get_hplug_cpus(void)
+{
+    u16 cnt;
+
+    if (!qemu_cfg_present)
+        return 0;
+
+    qemu_cfg_read_entry(&cnt, QEMU_CFG_HPLUG_CPUS, sizeof(cnt));
+
+    return cnt;
+}
+
 static QemuCfgFile LastFile;
 
 static u32
diff --git a/src/paravirt.h b/src/paravirt.h
index 4a370a0..1a3a914 100644
--- a/src/paravirt.h
+++ b/src/paravirt.h
@@ -33,6 +33,7 @@ static inline int kvm_para_available(void)
 #define QEMU_CFG_BOOT_MENU		0x0e
 #define QEMU_CFG_MAX_CPUS		0x0f
 #define QEMU_CFG_FILE_DIR               0x19
+#define QEMU_CFG_HPLUG_CPUS             0x1a
 #define QEMU_CFG_ARCH_LOCAL		0x8000
 #define QEMU_CFG_ACPI_TABLES		(QEMU_CFG_ARCH_LOCAL + 0)
 #define QEMU_CFG_SMBIOS_ENTRIES		(QEMU_CFG_ARCH_LOCAL + 1)
@@ -55,6 +56,7 @@ int qemu_cfg_smbios_load_external(int type, char **p, unsigned *nr_structs,
 int qemu_cfg_get_numa_nodes(void);
 void qemu_cfg_get_numa_data(u64 *data, int n);
 u16 qemu_cfg_get_max_cpus(void);
+u16 qemu_cfg_get_hplug_cpus(void);
 
 typedef struct QemuCfgFile {
     u32  size;        /* file size */
diff --git a/src/smp.c b/src/smp.c
index 8c077a1..b035870 100644
--- a/src/smp.c
+++ b/src/smp.c
@@ -36,6 +36,7 @@ wrmsr_smp(u32 index, u64 val)
 
 u32 CountCPUs VAR16VISIBLE;
 u32 MaxCountCPUs VAR16VISIBLE;
+u32 HotPlugCPUs VAR16VISIBLE;
 extern void smp_ap_boot_code(void);
 ASM16(
     "  .global smp_ap_boot_code\n"
@@ -114,8 +115,9 @@ smp_probe(void)
     if (CONFIG_COREBOOT) {
         msleep(10);
     } else {
-        u8 cmos_smp_count = inb_cmos(CMOS_BIOS_SMP_COUNT);
-        while (cmos_smp_count + 1 != readl(&CountCPUs))
+        u32 cmos_smp_count = (u32)inb_cmos(CMOS_BIOS_SMP_COUNT);
+        HotPlugCPUs = qemu_cfg_get_hplug_cpus();
+        while (cmos_smp_count + HotPlugCPUs + 1 != readl(&CountCPUs))
             yield();
     }
 

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-08-03 10:07                               ` Vasilis Liaskovitis
@ 2011-08-03 10:37                                 ` Jan Kiszka
  2011-08-03 10:38                                   ` Gleb Natapov
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kiszka @ 2011-08-03 10:37 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: seabios, kvm, gleb, armbru

On 2011-08-03 12:07, Vasilis Liaskovitis wrote:
> When rebooting after a CPU hotplug in qemu-kvm, Seabios can get stuck in smp_probe().
> Normally cmos_smp_count is read from cmos and the smp_ap_boot_code is run on all cpus 
> except bootstrap. The expected result is CountCPUs == cmos_smp_count + 1.  After a 
> cpu hotplug, more than cmos_smp_count cpus are active, so we get a situation where 
> CountCPUs > cmos_smp_count + 1 and Seabios keeps looping forever in smp_probe. In some
> cases, the while loop exit condition is tested before CountCPUs gets larger (i.e. 
> before smp_ap_boot_code runs on all CPUs), so the hang is not always reproducible.
> 
> This patch introduces a new fw_cfg variable called hotplugged_cpus that gets updated from
> the qemu-kvm hotplug code. Seabios reads this variable on each call to smp_probe() and adjusts
> the expected number of online CPUs.
> 
> The qemu-kvm part of this patch is against Jan Kiszka's cpu-hotplug tree:
> git://git.kiszka.org/qemu-kvm.git queues/cpu-hotplug
> tested with qemu-system-x86_64.
> 
> An alternative to this patch would be to update the smp_cpus variable in qemu-kvm and 
> do a "cmos update" to 0x5f from the cpu-hotplug code. Access to the rtc_state (cmos device) 
> would be required in hw/acpi_piix4.c.  This way no change to Seabios would be required.
> 

...

>  src/paravirt.c |   12 ++++++++++++
>  src/paravirt.h |    2 ++
>  src/smp.c      |    6 ++++--
>  3 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/src/paravirt.c b/src/paravirt.c
> index 9cf77de..3367609 100644
> --- a/src/paravirt.c
> +++ b/src/paravirt.c
> @@ -305,6 +305,18 @@ u16 qemu_cfg_get_max_cpus(void)
>      return cnt;
>  }
>  
> +u16 qemu_cfg_get_hplug_cpus(void)
> +{
> +    u16 cnt;
> +
> +    if (!qemu_cfg_present)
> +        return 0;
> +
> +    qemu_cfg_read_entry(&cnt, QEMU_CFG_HPLUG_CPUS, sizeof(cnt));

Why can't Seabios read the true number of online CPUs from the PIIX4 device?
The information is there already, no need for additional PV here.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-08-03 10:37                                 ` Jan Kiszka
@ 2011-08-03 10:38                                   ` Gleb Natapov
  2011-08-03 10:42                                     ` Jan Kiszka
  0 siblings, 1 reply; 42+ messages in thread
From: Gleb Natapov @ 2011-08-03 10:38 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: seabios, armbru, kvm

On Wed, Aug 03, 2011 at 12:37:13PM +0200, Jan Kiszka wrote:
> On 2011-08-03 12:07, Vasilis Liaskovitis wrote:
> > When rebooting after a CPU hotplug in qemu-kvm, Seabios can get stuck in smp_probe().
> > Normally cmos_smp_count is read from cmos and the smp_ap_boot_code is run on all cpus 
> > except bootstrap. The expected result is CountCPUs == cmos_smp_count + 1.  After a 
> > cpu hotplug, more than cmos_smp_count cpus are active, so we get a situation where 
> > CountCPUs > cmos_smp_count + 1 and Seabios keeps looping forever in smp_probe. In some
> > cases, the while loop exit condition is tested before CountCPUs gets larger (i.e. 
> > before smp_ap_boot_code runs on all CPUs), so the hang is not always reproducible.
> > 
> > This patch introduces a new fw_cfg variable called hotplugged_cpus that gets updated from
> > the qemu-kvm hotplug code. Seabios reads this variable on each call to smp_probe() and adjusts
> > the expected number of online CPUs.
> > 
> > The qemu-kvm part of this patch is against Jan Kiszka's cpu-hotplug tree:
> > git://git.kiszka.org/qemu-kvm.git queues/cpu-hotplug
> > tested with qemu-system-x86_64.
> > 
> > An alternative to this patch would be to update the smp_cpus variable in qemu-kvm and 
> > do a "cmos update" to 0x5f from the cpu-hotplug code. Access to the rtc_state (cmos device) 
> > would be required in hw/acpi_piix4.c.  This way no change to Seabios would be required.
> > 
> 
> ...
> 
> >  src/paravirt.c |   12 ++++++++++++
> >  src/paravirt.h |    2 ++
> >  src/smp.c      |    6 ++++--
> >  3 files changed, 18 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/paravirt.c b/src/paravirt.c
> > index 9cf77de..3367609 100644
> > --- a/src/paravirt.c
> > +++ b/src/paravirt.c
> > @@ -305,6 +305,18 @@ u16 qemu_cfg_get_max_cpus(void)
> >      return cnt;
> >  }
> >  
> > +u16 qemu_cfg_get_hplug_cpus(void)
> > +{
> > +    u16 cnt;
> > +
> > +    if (!qemu_cfg_present)
> > +        return 0;
> > +
> > +    qemu_cfg_read_entry(&cnt, QEMU_CFG_HPLUG_CPUS, sizeof(cnt));
> 
> Why can't Seabios read the true number of online CPUs from the PIIX4 device?
> The information is there already, no need for additional PV here.
> 
Where is it in PIIX4 device?

--
			Gleb.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-08-03 10:38                                   ` Gleb Natapov
@ 2011-08-03 10:42                                     ` Jan Kiszka
  2011-08-03 16:25                                       ` Vasilis Liaskovitis
  0 siblings, 1 reply; 42+ messages in thread
From: Jan Kiszka @ 2011-08-03 10:42 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Vasilis Liaskovitis, seabios, kvm, armbru

On 2011-08-03 12:38, Gleb Natapov wrote:
> On Wed, Aug 03, 2011 at 12:37:13PM +0200, Jan Kiszka wrote:
>> On 2011-08-03 12:07, Vasilis Liaskovitis wrote:
>>> When rebooting after a CPU hotplug in qemu-kvm, Seabios can get stuck in smp_probe().
>>> Normally cmos_smp_count is read from cmos and the smp_ap_boot_code is run on all cpus 
>>> except bootstrap. The expected result is CountCPUs == cmos_smp_count + 1.  After a 
>>> cpu hotplug, more than cmos_smp_count cpus are active, so we get a situation where 
>>> CountCPUs > cmos_smp_count + 1 and Seabios keeps looping forever in smp_probe. In some
>>> cases, the while loop exit condition is tested before CountCPUs gets larger (i.e. 
>>> before smp_ap_boot_code runs on all CPUs), so the hang is not always reproducible.
>>>
>>> This patch introduces a new fw_cfg variable called hotplugged_cpus that gets updated from
>>> the qemu-kvm hotplug code. Seabios reads this variable on each call to smp_probe() and adjusts
>>> the expected number of online CPUs.
>>>
>>> The qemu-kvm part of this patch is against Jan Kiszka's cpu-hotplug tree:
>>> git://git.kiszka.org/qemu-kvm.git queues/cpu-hotplug
>>> tested with qemu-system-x86_64.
>>>
>>> An alternative to this patch would be to update the smp_cpus variable in qemu-kvm and 
>>> do a "cmos update" to 0x5f from the cpu-hotplug code. Access to the rtc_state (cmos device) 
>>> would be required in hw/acpi_piix4.c.  This way no change to Seabios would be required.
>>>
>>
>> ...
>>
>>>  src/paravirt.c |   12 ++++++++++++
>>>  src/paravirt.h |    2 ++
>>>  src/smp.c      |    6 ++++--
>>>  3 files changed, 18 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/paravirt.c b/src/paravirt.c
>>> index 9cf77de..3367609 100644
>>> --- a/src/paravirt.c
>>> +++ b/src/paravirt.c
>>> @@ -305,6 +305,18 @@ u16 qemu_cfg_get_max_cpus(void)
>>>      return cnt;
>>>  }
>>>  
>>> +u16 qemu_cfg_get_hplug_cpus(void)
>>> +{
>>> +    u16 cnt;
>>> +
>>> +    if (!qemu_cfg_present)
>>> +        return 0;
>>> +
>>> +    qemu_cfg_read_entry(&cnt, QEMU_CFG_HPLUG_CPUS, sizeof(cnt));
>>
>> Why can't Seabios read the true number of online CPUs from the PIIX4 device?
>> The information is there already, no need for additional PV here.
>>
> Where is it in PIIX4 device?

PROC registers (or however they are called).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-08-03 10:42                                     ` Jan Kiszka
@ 2011-08-03 16:25                                       ` Vasilis Liaskovitis
  2011-08-04  8:01                                         ` Gleb Natapov
  0 siblings, 1 reply; 42+ messages in thread
From: Vasilis Liaskovitis @ 2011-08-03 16:25 UTC (permalink / raw)
  To: Jan Kiszka, gleb; +Cc: seabios, kvm, armbru

On Wed, Aug 03, 2011 at 12:42:11PM +0200, Jan Kiszka wrote:
> >> Why can't Seabios read the true number of online CPUs from the PIIX4 device?
> >> The information is there already, no need for additional PV here.
> >>
> > Where is it in PIIX4 device?
> 
> PROC registers (or however they are called).

In qemu-kvm, the cpus_sts bitmap array in PIIX4PMState/ACPIGPE holds the true
number of online CPUs. This is accessed from the DSDT hotplug method in Seabios
as an "OperationRegion SystemIO" at address 0xaf00. Is this I/O address in the
PIIX4 spec? How can it be accessed from the rest of SeaBIOS? It seems to reside
in ACPI_PM space.

- Vasilis



^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH] cpu hotplug issue
  2011-08-03 16:25                                       ` Vasilis Liaskovitis
@ 2011-08-04  8:01                                         ` Gleb Natapov
  2011-08-04  8:40                                           ` Jan Kiszka
  0 siblings, 1 reply; 42+ messages in thread
From: Gleb Natapov @ 2011-08-04  8:01 UTC (permalink / raw)
  To: Vasilis Liaskovitis; +Cc: Jan Kiszka, seabios, armbru, kvm

On Wed, Aug 03, 2011 at 06:25:07PM +0200, Vasilis Liaskovitis wrote:
> On Wed, Aug 03, 2011 at 12:42:11PM +0200, Jan Kiszka wrote:
> > >> Why can't Seabios read the true number of online CPUs from the PIIX4 device?
> > >> The information is there already, no need for additional PV here.
> > >>
> > > Where is it in PIIX4 device?
> > 
> > PROC registers (or however they are called).
> 
> In qemu-kvm, the cpus_sts bitmap array in PIIX4PMState/ACPIGPE holds the true
> number of online CPUs. This is accessed from the DSDT hotplug method in Seabios
> as an "OperationRegion SystemIO" at address 0xaf00. Is this I/O address in the
> PIIX4 spec? How can it be accessed from the rest of SeaBIOS? It seems to reside
> in ACPI_PM space.
> 
0xaf00 is not part of PIIX4. PIIX4 supports neither CPU hotplug
nor PCI hotplug. I haven't found any PROC register in the PIIX4 spec
so far.

--
			Gleb.


* Re: [PATCH] cpu hotplug issue
  2011-08-04  8:01                                         ` Gleb Natapov
@ 2011-08-04  8:40                                           ` Jan Kiszka
  0 siblings, 0 replies; 42+ messages in thread
From: Jan Kiszka @ 2011-08-04  8:40 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Vasilis Liaskovitis, seabios, kvm, armbru

On 2011-08-04 10:01, Gleb Natapov wrote:
> On Wed, Aug 03, 2011 at 06:25:07PM +0200, Vasilis Liaskovitis wrote:
>> On Wed, Aug 03, 2011 at 12:42:11PM +0200, Jan Kiszka wrote:
>>>>> Why can't Seabios read the true number of online CPUs from the PIIX4 device?
>>>>> The information is there already, no need for additional PV here.
>>>>>
>>>> Where is it in PIIX4 device?
>>>
>>> PROC registers (or however they are called).
>>
>> In qemu-kvm, the cpus_sts bitmap array in PIIX4PMState/ACPIGPE has the true
>> number of online CPUs. This is accessed from the DSDT hotplug method in SeaBIOS
>> as an "OperationRegion SystemIO" at address 0xaf00. Is this I/O address in the
>> PIIX4 spec? How can it be accessed from the rest of SeaBIOS? It seems to reside
>> in ACPI_PM space.
>>
> 0xaf00 is not part of PIIX4. PIIX4 supports neither CPU hotplug
> nor PCI hotplug. I haven't found any PROC register in the PIIX4 spec
> so far.

So it is some "board" extension that is declared to the BIOS/OS via ACPI? In
any case, the channel we need for reading the number of online CPUs already
exists.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


end of thread, other threads:[~2011-08-04  8:40 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-07-19 17:40 cpu hotplug issue Vasilis Liaskovitis
2011-07-20  8:35 ` Gleb Natapov
2011-07-21 11:06   ` [PATCH] " Vasilis Liaskovitis
2011-07-21 11:33     ` Gleb Natapov
2011-07-21 11:42       ` Jan Kiszka
2011-07-21 11:51         ` Gleb Natapov
2011-07-21 11:55           ` Jan Kiszka
2011-07-21 12:00             ` Gleb Natapov
2011-07-21 12:18             ` Avi Kivity
2011-07-21 12:22               ` Gleb Natapov
2011-07-21 12:39               ` Jan Kiszka
2011-07-21 13:27               ` Lucas Meneghel Rodrigues
2011-07-21 12:45           ` Gleb Natapov
2011-07-22 10:56             ` Jan Kiszka
2011-07-24 11:56               ` Gleb Natapov
2011-07-24 16:11                 ` Jan Kiszka
2011-07-25 13:18                   ` Jan Kiszka
2011-07-25 13:21                     ` Gleb Natapov
2011-07-25 13:26                       ` Jan Kiszka
2011-07-27 16:35                     ` Vasilis Liaskovitis
2011-07-28 16:52                       ` Jan Kiszka
2011-08-02  9:46                         ` Vasilis Liaskovitis
2011-08-02 10:24                           ` Jan Kiszka
2011-08-02 13:41                             ` Vasilis Liaskovitis
2011-08-03 10:07                               ` Vasilis Liaskovitis
2011-08-03 10:37                                 ` Jan Kiszka
2011-08-03 10:38                                   ` Gleb Natapov
2011-08-03 10:42                                     ` Jan Kiszka
2011-08-03 16:25                                       ` Vasilis Liaskovitis
2011-08-04  8:01                                         ` Gleb Natapov
2011-08-04  8:40                                           ` Jan Kiszka
2011-07-21 13:08       ` Vasilis Liaskovitis
2011-07-21 13:11         ` Gleb Natapov
2011-07-21 13:12           ` Vasilis Liaskovitis
2011-07-21 13:13             ` Gleb Natapov
2011-07-21 13:15         ` Avi Kivity
2011-07-21 13:15           ` Avi Kivity
2011-07-21 11:36     ` Jan Kiszka
2011-07-21 12:22     ` Jan Kiszka
2011-07-21 12:25       ` Gleb Natapov
2011-07-21 12:35         ` Jan Kiszka
2011-07-21 12:40           ` Gleb Natapov
