All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Fix hvm vcpu hotplug bug
@ 2010-08-07 17:39 Liu, Jinsong
  2010-08-11 13:54 ` Ian Jackson
  0 siblings, 1 reply; 11+ messages in thread
From: Liu, Jinsong @ 2010-08-07 17:39 UTC (permalink / raw)
  To: Keir Fraser, xen-devel; +Cc: Jiang, Yunhong, Li, Xin

[-- Attachment #1: Type: text/plain, Size: 342 bytes --]

When hotplug hvm vcpu by 'xm vcpu-set' command, if it add/remove
many vcpus by 1 'xm vcpu-set' command, it has a bug that it cannot
add/remove all vcpus that want to be added/removed.
This patch is to fix the bug. It delays trigger sci until all xenstore
cpu node status are watched.

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>

[-- Attachment #2: fix-qemu-vcpu-hotplug.patch --]
[-- Type: application/octet-stream, Size: 3334 bytes --]

From a596641aed36f06dabde97155e60f9266d8f7bfb Mon Sep 17 00:00:00 2001
From: Liu Jinsong <jinsong.liu@intel.com>
Date: Sun, 8 Aug 2010 01:12:11 +0800
Subject: [PATCH] Fix hvm vcpu hotplug bug

When hotplug hvm vcpu by 'xm vcpu-set' command, if it add/remove
many vcpus by 1 'xm vcpu-set' command, it has a bug that it cannot
add/remove all vcpus that want to be added/removed.
This patch is to fix the bug. It delays trigger sci until all xenstore
cpu node status are watched.

Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
---
 hw/piix4acpi.c |   31 +++++++++++++++++++------------
 sysemu.h       |    2 ++
 xenstore.c     |   10 ++++++++--
 3 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/hw/piix4acpi.c b/hw/piix4acpi.c
index 1efa77d..8698bbd 100644
--- a/hw/piix4acpi.c
+++ b/hw/piix4acpi.c
@@ -747,6 +747,22 @@ static int disable_processor(GPEState *g, int cpu)
     return 1;
 }
 
+int qemu_cpu_state(int cpu, int state)
+{
+    if (state)
+        return enable_processor(&gpe_state, cpu);
+    else
+        return disable_processor(&gpe_state, cpu);
+}
+
+void qemu_cpu_sci(void)
+{
+    if (gpe_state.gpe0_en[0] & 4) {
+        qemu_set_irq(sci_irq, 1);
+        qemu_set_irq(sci_irq, 0);
+    }
+}
+
 void qemu_cpu_add_remove(int cpu, int state)
 {
     if ((cpu <0) || (cpu >= vcpus)) {
@@ -754,17 +770,8 @@ void qemu_cpu_add_remove(int cpu, int state)
         return;
     }
 
-    if (state) {
-        if (!enable_processor(&gpe_state, cpu))
-            return;
-    } else {
-        if (!disable_processor(&gpe_state, cpu))
-            return;
-    }
-    fprintf(logfile, "%s vcpu %d\n", state ? "Add" : "Remove", cpu);
+    if (!qemu_cpu_state(cpu, state))
+        return;
 
-    if (gpe_state.gpe0_en[0] & 4) {
-        qemu_set_irq(sci_irq, 1);
-        qemu_set_irq(sci_irq, 0);
-    }
+    qemu_cpu_sci();
 }
diff --git a/sysemu.h b/sysemu.h
index 87b278b..7a9f989 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -173,6 +173,8 @@ extern int drive_add(const char *file, const char *fmt, ...);
 extern int drive_init(struct drive_opt *arg, int snapshot, void *machine);
 
 /* acpi */
+void qemu_cpu_sci(void);
+int qemu_cpu_state(int cpu, int state);
 void qemu_cpu_add_remove(int cpu, int state);
 void qemu_system_hot_add_init(void);
 void qemu_system_device_hot_add(int pcibus, int slot, int state);
diff --git a/xenstore.c b/xenstore.c
index 4a35f55..2b3c3f2 100644
--- a/xenstore.c
+++ b/xenstore.c
@@ -972,6 +972,7 @@ static void xenstore_process_vcpu_set_event(char **vec)
     char *act = NULL;
     char *vcpustr, *node = vec[XS_WATCH_PATH];
     unsigned int vcpu, len;
+    static int done = 0;
 
     vcpustr = strstr(node, "cpu/");
     if (!vcpustr) {
@@ -987,12 +988,17 @@ static void xenstore_process_vcpu_set_event(char **vec)
     }
 
     if (!strncmp(act, "online", len))
-        qemu_cpu_add_remove(vcpu, 1);
+        qemu_cpu_state(vcpu, 1);
     else if (!strncmp(act, "offline", len))
-        qemu_cpu_add_remove(vcpu, 0);
+        qemu_cpu_state(vcpu, 0);
     else
         fprintf(stderr, "vcpu-set: command error.\n");
 
+    if ((++done) == vcpus) {
+        done = 0;
+        qemu_cpu_sci();
+    }
+
     free(act);
     return;
 }
-- 
1.5.6.4


[-- Attachment #3: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] Fix hvm vcpu hotplug bug
  2010-08-07 17:39 [PATCH] Fix hvm vcpu hotplug bug Liu, Jinsong
@ 2010-08-11 13:54 ` Ian Jackson
  2010-08-18  4:57   ` Liu, Jinsong
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Jackson @ 2010-08-11 13:54 UTC (permalink / raw)
  To: Liu, Jinsong; +Cc: Li, Xin, xen-devel, Keir Fraser, Jiang, Yunhong

Liu, Jinsong writes ("[Xen-devel] [PATCH] Fix hvm vcpu hotplug bug"):
> When hotplug hvm vcpu by 'xm vcpu-set' command, if it add/remove
> many vcpus by 1 'xm vcpu-set' command, it has a bug that it cannot
> add/remove all vcpus that want to be added/removed.
> This patch is to fix the bug. It delays trigger sci until all xenstore
> cpu node status are watched.

This patch seems to arrange to take multiple CPU hot-add/remove events
and coalesce them into a single event.  It is obvious how this avoids
triggering a race, but I'm not convinced that it's a correct fix.

The core problem seems to be that somehow the SCI IRQ is lost ?
Perhaps the real problem is this code:
        qemu_set_irq(sci_irq, 1);
        qemu_set_irq(sci_irq, 0);

I'm not familiar with the way SCI is supposed to work but clearing the
irq in the qemu add/remove function seems wrong.  Surely the host
should clear the interrupt when it has serviced the interrupt.

Can you explain what I'm missing ?

Ian.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Fix hvm vcpu hotplug bug
  2010-08-11 13:54 ` Ian Jackson
@ 2010-08-18  4:57   ` Liu, Jinsong
  2010-08-18 13:53     ` Ian Jackson
  0 siblings, 1 reply; 11+ messages in thread
From: Liu, Jinsong @ 2010-08-18  4:57 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Li, Xin, xen-devel, Keir Fraser, Jiang, Yunhong

Ian Jackson wrote:
> Liu, Jinsong writes ("[Xen-devel] [PATCH] Fix hvm vcpu hotplug bug"):
>> When hotplug hvm vcpu by 'xm vcpu-set' command, if it add/remove
>> many vcpus by 1 'xm vcpu-set' command, it has a bug that it cannot
>> add/remove all vcpus that want to be added/removed.
>> This patch is to fix the bug. It delays trigger sci until all
>> xenstore cpu node status are watched.
> 
> This patch seems to arrange to take multiple CPU hot-add/remove events
> and coalesce them into a single event.  It is obvious how this avoids
> triggering a race, but I'm not convinced that it's a correct fix.

It's used to avoid inconsistency of cpu status map (producer: qemu watch xenstore cpu nodes; customer: SCI \_L02 control method), so it delay trigger SCI until all cpu node are watched.

> 
> The core problem seems to be that somehow the SCI IRQ is lost ?
> Perhaps the real problem is this code:
>         qemu_set_irq(sci_irq, 1);
>         qemu_set_irq(sci_irq, 0);
> 
> I'm not familiar with the way SCI is supposed to work but clearing the
> irq in the qemu add/remove function seems wrong.  Surely the host
> should clear the interrupt when it has serviced the interrupt.
> 
> Can you explain what I'm missing ?
> 
> Ian.

Yes, you are right. 
In fact, it make me puzzling how and when to drop sci_irq to 0.

According to acpi spec, there are 2 level logic: gpe and sci.
1). gpe_en and gpe_sts 'AND' to trigger a gpe event (like pci-hotplug, cpu-hotplug);
2). multi gpe event 'OR' to trigger sci, which now wired to i8259[9];

Current qemu-xen implement gpe logic, and directly wired gpe to i8259[9].
Since qemu-xen now only support pci-hotplug event, it can work doing so.
However, if we want to support multi hotplug event, qemu-xen now didn't have 'gpe events OR to trigger sci' logic.
I think qemu-xen should add this logic level, so that it can support more gpe events in the future.

BTW, at qemu-kvm, it do same as our current patch:
         qemu_set_irq(sci_irq, 1);
         qemu_set_irq(sci_irq, 0);
it triggers a sci pulse and works fine.

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Fix hvm vcpu hotplug bug
  2010-08-18  4:57   ` Liu, Jinsong
@ 2010-08-18 13:53     ` Ian Jackson
  2010-08-19  2:20       ` Liu, Jinsong
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Jackson @ 2010-08-18 13:53 UTC (permalink / raw)
  To: Liu, Jinsong; +Cc: Jiang, Yunhong, xen-devel, Li, Xin, Keir Fraser

Liu, Jinsong writes ("RE: [Xen-devel] [PATCH] Fix hvm vcpu hotplug bug"):
> I think qemu-xen should add this logic level, so that it can support
> more gpe events in the future.

Yes.  But what clears the interrupt ?  Is it edge or level triggered ?

> BTW, at qemu-kvm, it do same as our current patch:
>          qemu_set_irq(sci_irq, 1);
>          qemu_set_irq(sci_irq, 0);
> it triggers a sci pulse and works fine.

If this works fine it is just lucky, I think.  It has the same race as
the one you've identified.  Batching up the different events is just a
way to make the race probably not happen in that particular case.

Ian.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Fix hvm vcpu hotplug bug
  2010-08-18 13:53     ` Ian Jackson
@ 2010-08-19  2:20       ` Liu, Jinsong
  2010-08-19 14:42         ` Ian Jackson
  0 siblings, 1 reply; 11+ messages in thread
From: Liu, Jinsong @ 2010-08-19  2:20 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jiang, Yunhong, xen-devel, Li, Xin, Keir Fraser

Ian Jackson wrote:
> Liu, Jinsong writes ("RE: [Xen-devel] [PATCH] Fix hvm vcpu hotplug
> bug"): 
>> I think qemu-xen should add this logic level, so that it can support
>> more gpe events in the future.
> 
> Yes.  But what clears the interrupt ?  Is it edge or level triggered ?
> 

SCI is a shareable level interrupt.
Per my understanding, sci-gpe logic is like

                                                  |------- gpe_sts x
                                        |-----AND
                                        |         |------- gpe_en x
sci int                               |
------------------------------------ OR    ......
(wired to i8259[9] etc)         |
                                        |         |------- gpe_sts y
                                        |-----AND
                                                  |------- gpe_en y

Currently qemu-xen didn't have 'gep event x OR gpe event y --> sci' logic, it should add this logic level.

To clear the interrupt:
1). gpe_sts x & gpe_en x --> low gpe event x
2). all gpe events low --> low sci
(low is of logic meaning here, not real electric value)

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Fix hvm vcpu hotplug bug
  2010-08-19  2:20       ` Liu, Jinsong
@ 2010-08-19 14:42         ` Ian Jackson
  2010-08-20  3:29           ` Liu, Jinsong
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Jackson @ 2010-08-19 14:42 UTC (permalink / raw)
  To: Liu, Jinsong; +Cc: Jiang, Yunhong, xen-devel, Li, Xin, Keir Fraser

Liu, Jinsong writes ("RE: [Xen-devel] [PATCH] Fix hvm vcpu hotplug bug"):
> Per my understanding, sci-gpe logic is like
> [diagram]
>
> Currently qemu-xen didn't have 'gep event x OR gpe event y --> sci'
> logic, it should add this logic level.

Yes, thanks, I'd understood that.  But:

> To clear the interrupt:
> 1). gpe_sts x & gpe_en x --> low gpe event x

I think you mean "gpe_sts goes low" ?  Since your diagram is in terms
of levels and I guess "sts" is the actual interrupt source, and
"sts_en" is the interrupt enable flag.

So in real hardware, what makes gpe_sts go low ?

Ian.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Fix hvm vcpu hotplug bug
  2010-08-19 14:42         ` Ian Jackson
@ 2010-08-20  3:29           ` Liu, Jinsong
  2010-08-20 13:23             ` Ian Jackson
  0 siblings, 1 reply; 11+ messages in thread
From: Liu, Jinsong @ 2010-08-20  3:29 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jiang, Yunhong, xen-devel, Li, Xin, Keir Fraser

Ian Jackson wrote:
> Liu, Jinsong writes ("RE: [Xen-devel] [PATCH] Fix hvm vcpu hotplug
> bug"): 
>> Per my understanding, sci-gpe logic is like
>> [diagram]
>> 
>> Currently qemu-xen didn't have 'gep event x OR gpe event y --> sci'
>> logic, it should add this logic level.
> 
> Yes, thanks, I'd understood that.  But:
> 
>> To clear the interrupt:
>> 1). gpe_sts x & gpe_en x --> low gpe event x
> 
> I think you mean "gpe_sts goes low" ?  Since your diagram is in terms
> of levels and I guess "sts" is the actual interrupt source, and
> "sts_en" is the interrupt enable flag.

Yes.

> 
> So in real hardware, what makes gpe_sts go low ?
> 
> Ian.

ospm will do. Its sequence is (i.e. linux 2.6.32, ospm dispatch control method case)
1). clear pge_en x by writing '0' to the bit;
2). asynchronic call control method;
3). clear gpe_sts x by writing '1' to the bit;
4). re-enable gpe_en x by writing '1' to the bit;


Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Fix hvm vcpu hotplug bug
  2010-08-20  3:29           ` Liu, Jinsong
@ 2010-08-20 13:23             ` Ian Jackson
  2010-08-22  5:00               ` Liu, Jinsong
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Jackson @ 2010-08-20 13:23 UTC (permalink / raw)
  To: Liu, Jinsong; +Cc: Jiang, Yunhong, Keir Fraser, xen-devel, Li, Xin

Liu, Jinsong writes ("RE: [Xen-devel] [PATCH] Fix hvm vcpu hotplug bug"):
> ospm will do. Its sequence is (i.e. linux 2.6.32, ospm dispatch
> control method case)

"ospm" is "OS power management" ie in our case the guest kernel I take
it ?

> 1). clear pge_en x by writing '0' to the bit;
> 2). asynchronic call control method;
> 3). clear gpe_sts x by writing '1' to the bit;
> 4). re-enable gpe_en x by writing '1' to the bit;

So the code in qemu should never clear gpe_sts itself.

Ian.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Fix hvm vcpu hotplug bug
  2010-08-20 13:23             ` Ian Jackson
@ 2010-08-22  5:00               ` Liu, Jinsong
  2010-08-24 14:19                 ` Ian Jackson
  0 siblings, 1 reply; 11+ messages in thread
From: Liu, Jinsong @ 2010-08-22  5:00 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jiang, Yunhong, Keir Fraser, xen-devel, Li, Xin

Ian Jackson wrote:
> Liu, Jinsong writes ("RE: [Xen-devel] [PATCH] Fix hvm vcpu hotplug
> bug"): 
>> ospm will do. Its sequence is (i.e. linux 2.6.32, ospm dispatch
>> control method case)
> 
> "ospm" is "OS power management" ie in our case the guest kernel I take
> it ?

hmm, 'ospm' is just a generial speaking, more accurate speaking is 'acpica'.
i.e. hvm linux 2.6.32 handle sci/gpe at:
drivers/acpi/acpica/evsci.c
drivers/acpi/acpica/evgpe.c

> 
>> 1). clear pge_en x by writing '0' to the bit;
>> 2). asynchronic call control method;
>> 3). clear gpe_sts x by writing '1' to the bit;
>> 4). re-enable gpe_en x by writing '1' to the bit;
> 
> So the code in qemu should never clear gpe_sts itself.
> 
> Ian.

No, that's just what qemu should do, to simulate hardware behavior.

Thanks,
Jinsong

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Fix hvm vcpu hotplug bug
  2010-08-22  5:00               ` Liu, Jinsong
@ 2010-08-24 14:19                 ` Ian Jackson
  2010-08-25  4:53                   ` Liu, Jinsong
  0 siblings, 1 reply; 11+ messages in thread
From: Ian Jackson @ 2010-08-24 14:19 UTC (permalink / raw)
  To: Liu, Jinsong; +Cc: Keir Fraser, Jiang, Yunhong, Ian Jackson, xen-devel, Li, Xin

Liu, Jinsong writes ("RE: [Xen-devel] [PATCH] Fix hvm vcpu hotplug bug"):
> Ian Jackson wrote:
> > Liu, Jinsong writes ("RE: [Xen-devel] [PATCH] Fix hvm vcpu hotplug bug"): 
> >> [linux/drivers/acpi/acpica/ will:]
> >> 1). clear pge_en x by writing '0' to the bit;
> >> 2). asynchronic call control method;
> >> 3). clear gpe_sts x by writing '1' to the bit;
> >> 4). re-enable gpe_en x by writing '1' to the bit;
> > 
> > So the code in qemu should never clear gpe_sts itself.
> 
> No, that's just what qemu should do, to simulate hardware behavior.

Obviously I still haven't understood.  If the guest kernel driver is
supposed to clear this bit as you seem to say above, then it should
not be cleared automatically by qemu-dm as part of the hotplug
notification.

Obviously the register ought to be emulated by qemu and the bit ought
to be cleared in qemu when the kernel driver writes an inactive value
to it.

None of this seems to be implemented in qemu right now AFAICT ?

Ian.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH] Fix hvm vcpu hotplug bug
  2010-08-24 14:19                 ` Ian Jackson
@ 2010-08-25  4:53                   ` Liu, Jinsong
  0 siblings, 0 replies; 11+ messages in thread
From: Liu, Jinsong @ 2010-08-25  4:53 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Jiang, Yunhong, Keir Fraser, xen-devel, Li, Xin

Ian Jackson wrote:
> Liu, Jinsong writes ("RE: [Xen-devel] [PATCH] Fix hvm vcpu hotplug
> bug"): 
>> Ian Jackson wrote:
>>> Liu, Jinsong writes ("RE: [Xen-devel] [PATCH] Fix hvm vcpu hotplug
>>> bug"): 
>>>> [linux/drivers/acpi/acpica/ will:]
>>>> 1). clear pge_en x by writing '0' to the bit;
>>>> 2). asynchronic call control method;
>>>> 3). clear gpe_sts x by writing '1' to the bit;
>>>> 4). re-enable gpe_en x by writing '1' to the bit;
>>> 
>>> So the code in qemu should never clear gpe_sts itself.
>> 
>> No, that's just what qemu should do, to simulate hardware behavior.
> 
> Obviously I still haven't understood.  If the guest kernel driver is
> supposed to clear this bit as you seem to say above, then it should
> not be cleared automatically by qemu-dm as part of the hotplug
> notification.

Of course hotplug notification will not clear gpe_sts.
At qemu side, 'hotplug notification' and 'clear gpe_sts' are asynchronic:
1) qemu: hotplug notification trigger sci, 
2) guest kernel: 
	a). clear pge_en x by writing '0' to the bit;
	b). asynchronically clear gpe_sts x by writing '1' to the bit;
	c). re-enable gpe_en x by writing '1' to the bit;
3) qemu: gpe_en_write/gpe_sts_write will simulate the hardware action to set or clear gpe_en/gpe_sts.

What I mean is, currently at qmeu side, it need add the logic of 'multi gpe --> OR --> sci' logic.
(i.e, at current qemu hw/piix4acpi.c, gpe_sts_write and gpe_en_write cannot handle multi gpe).
Only after add the logic, it could be clean to handle 'vcpu hotplug' at qmeu.

Thanks,
Jinsong

> 
> Obviously the register ought to be emulated by qemu and the bit ought
> to be cleared in qemu when the kernel driver writes an inactive value
> to it.
> 
> None of this seems to be implemented in qemu right now AFAICT ?
> 
> Ian.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2010-08-25  4:53 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-07 17:39 [PATCH] Fix hvm vcpu hotplug bug Liu, Jinsong
2010-08-11 13:54 ` Ian Jackson
2010-08-18  4:57   ` Liu, Jinsong
2010-08-18 13:53     ` Ian Jackson
2010-08-19  2:20       ` Liu, Jinsong
2010-08-19 14:42         ` Ian Jackson
2010-08-20  3:29           ` Liu, Jinsong
2010-08-20 13:23             ` Ian Jackson
2010-08-22  5:00               ` Liu, Jinsong
2010-08-24 14:19                 ` Ian Jackson
2010-08-25  4:53                   ` Liu, Jinsong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.