All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen
@ 2010-03-01  9:38 Sheng Yang
  2010-03-01  9:38   ` Sheng Yang
                   ` (7 more replies)
  0 siblings, 8 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-01  9:38 UTC (permalink / raw)
  To: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt
  Cc: Ian Campbell, Stefano Stabellini, xen-devel, linux-kernel

Hi, Jeremy & Ian

Here is the forth version of patchset to enable PV extension of HVM support
in Linux kernel of Xen.

I'd like to have your comments on this patch series. We think the patches are
prepared for check in pv_ops domU tree.

The PV extension of HVM is started from real mode like HVM guest, but also with a
a range of PV features(e.g. PV halt, PV timer, event channel, as well as PV
drivers). So guest with this feature can takes the advantages of both H/W
virtualization and Para-Virtualization.

The first two of the patchset imported several header file from Jeremy's tree
and Xen tree, respect to Jeremy and Keir's works.

The whole patchset based on Linux upstream.

You need a line like:

cpuid = [ '0x40000002:edx=0x3' ]

in HVM configuration file to expose hybrid feature to guest, and

CONFIG_XEN

in the guest kernel configuration file to enable the hybrid support.

And the compiled image can be used as native/pv domU/hvm guest/pv feature hvm kernel.

Current the patchset support x86_64 only.

I've discussed the vector-evtchn mapping with Stefano, and we agreed on that
it's more complex than we thought, and can't brought much benefit for
pvops domU. So there is no major change in this update.

Change from v3:
1. Rebase to Linux 2.6.33 release.
2. change the name to "PV extension of HVM"
3. Some minor coding polishing.

Change from v2:
1. change the name "hybrid" to "PV featured HVM".
2. Unified the PV driver's judgement of xen_domain() to xen_evtchn_enabled().
3. Move the function(evtchn) initialize hypercall near the real enabling place,
rather than a unified place before function enabled.
4. Remove the reserved E820 region for grant table. Use QEmu Xen platform
device's MMIO instead.

The major change from v1:
1. SMP support.
2. Modify the entrance point to avoid most of genernic kernel modification.
3. Binding PV timer with event channel mechanism.

--
regards
Yang, Sheng

 arch/x86/include/asm/xen/cpuid.h     |   73 +++++++++++++
 arch/x86/include/asm/xen/hypercall.h |    6 +
 arch/x86/kernel/setup.c              |    8 ++
 arch/x86/xen/enlighten.c             |  188 ++++++++++++++++++++++++++++++++++
 arch/x86/xen/irq.c                   |   54 ++++++++++
 arch/x86/xen/smp.c                   |  144 ++++++++++++++++++++++++++-
 arch/x86/xen/xen-head.S              |    6 +
 arch/x86/xen/xen-ops.h               |    4 +
 drivers/block/xen-blkfront.c         |    3 -
 drivers/input/xen-kbdfront.c         |    6 +-
 drivers/net/xen-netfront.c           |    5 -
 drivers/video/xen-fbfront.c          |    6 +-
 drivers/xen/events.c                 |   66 +++++++++++-
 drivers/xen/grant-table.c            |   59 ++++++++++-
 drivers/xen/xenbus/xenbus_probe.c    |   22 +++-
 include/xen/events.h                 |    1 +
 include/xen/hvm.h                    |   28 +++++
 include/xen/interface/hvm/hvm_op.h   |   79 ++++++++++++++
 include/xen/interface/hvm/params.h   |  111 ++++++++++++++++++++
 include/xen/interface/xen.h          |    6 +-
 include/xen/xen.h                    |   11 ++
 include/xen/xenbus.h                 |    3 +
 22 files changed, 861 insertions(+), 28 deletions(-)

^ permalink raw reply	[flat|nested] 53+ messages in thread

* [PATCH 1/7] xen: add support for hvm_op
  2010-03-01  9:38 [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Sheng Yang
@ 2010-03-01  9:38   ` Sheng Yang
  2010-03-01  9:38 ` [PATCH 2/7] xen: Import cpuid.h from Xen Sheng Yang
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-01  9:38 UTC (permalink / raw)
  To: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt
  Cc: Ian Campbell, Stefano Stabellini, xen-devel, linux-kernel,
	Jeremy Fitzhardinge, Sheng Yang

From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

Add support for hvm_op hypercall.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
---
 arch/x86/include/asm/xen/hypercall.h |    6 ++
 include/xen/hvm.h                    |   23 +++++++
 include/xen/interface/hvm/hvm_op.h   |   72 ++++++++++++++++++++++
 include/xen/interface/hvm/params.h   |  111 ++++++++++++++++++++++++++++++++++
 4 files changed, 212 insertions(+), 0 deletions(-)
 create mode 100644 include/xen/hvm.h
 create mode 100644 include/xen/interface/hvm/hvm_op.h
 create mode 100644 include/xen/interface/hvm/params.h

diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index 9c371e4..47c2ebb 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -417,6 +417,12 @@ HYPERVISOR_nmi_op(unsigned long op, unsigned long arg)
 	return _hypercall2(int, nmi_op, op, arg);
 }
 
+static inline unsigned long __must_check
+HYPERVISOR_hvm_op(int op, void *arg)
+{
+	return _hypercall2(unsigned long, hvm_op, op, arg);
+}
+
 static inline void
 MULTI_fpu_taskswitch(struct multicall_entry *mcl, int set)
 {
diff --git a/include/xen/hvm.h b/include/xen/hvm.h
new file mode 100644
index 0000000..4ea8887
--- /dev/null
+++ b/include/xen/hvm.h
@@ -0,0 +1,23 @@
+/* Simple wrappers around HVM functions */
+#ifndef XEN_HVM_H__
+#define XEN_HVM_H__
+
+#include <xen/interface/hvm/params.h>
+
+static inline unsigned long hvm_get_parameter(int idx)
+{
+	struct xen_hvm_param xhv;
+	int r;
+
+	xhv.domid = DOMID_SELF;
+	xhv.index = idx;
+	r = HYPERVISOR_hvm_op(HVMOP_get_param, &xhv);
+	if (r < 0) {
+		printk(KERN_ERR "cannot get hvm parameter %d: %d.\n",
+		       idx, r);
+		return 0;
+	}
+	return xhv.value;
+}
+
+#endif /* XEN_HVM_H__ */
diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h
new file mode 100644
index 0000000..7c74ba4
--- /dev/null
+++ b/include/xen/interface/hvm/hvm_op.h
@@ -0,0 +1,72 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_HVM_HVM_OP_H__
+#define __XEN_PUBLIC_HVM_HVM_OP_H__
+
+/* Get/set subcommands: extra argument == pointer to xen_hvm_param struct. */
+#define HVMOP_set_param           0
+#define HVMOP_get_param           1
+struct xen_hvm_param {
+    domid_t  domid;    /* IN */
+    uint32_t index;    /* IN */
+    uint64_t value;    /* IN/OUT */
+};
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_param);
+
+/* Set the logical level of one of a domain's PCI INTx wires. */
+#define HVMOP_set_pci_intx_level  2
+struct xen_hvm_set_pci_intx_level {
+    /* Domain to be updated. */
+    domid_t  domid;
+    /* PCI INTx identification in PCI topology (domain:bus:device:intx). */
+    uint8_t  domain, bus, device, intx;
+    /* Assertion level (0 = unasserted, 1 = asserted). */
+    uint8_t  level;
+};
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_set_pci_intx_level);
+
+/* Set the logical level of one of a domain's ISA IRQ wires. */
+#define HVMOP_set_isa_irq_level   3
+struct xen_hvm_set_isa_irq_level {
+    /* Domain to be updated. */
+    domid_t  domid;
+    /* ISA device identification, by ISA IRQ (0-15). */
+    uint8_t  isa_irq;
+    /* Assertion level (0 = unasserted, 1 = asserted). */
+    uint8_t  level;
+};
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_set_isa_irq_level);
+
+#define HVMOP_set_pci_link_route  4
+struct xen_hvm_set_pci_link_route {
+    /* Domain to be updated. */
+    domid_t  domid;
+    /* PCI link identifier (0-3). */
+    uint8_t  link;
+    /* ISA IRQ (1-15), or 0 (disable link). */
+    uint8_t  isa_irq;
+};
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_set_pci_link_route);
+
+/* Flushes all VCPU TLBs: @arg must be NULL. */
+#define HVMOP_flush_tlbs          5
+
+#endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
diff --git a/include/xen/interface/hvm/params.h b/include/xen/interface/hvm/params.h
new file mode 100644
index 0000000..15d828f
--- /dev/null
+++ b/include/xen/interface/hvm/params.h
@@ -0,0 +1,111 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_HVM_PARAMS_H__
+#define __XEN_PUBLIC_HVM_PARAMS_H__
+
+#include "hvm_op.h"
+
+/*
+ * Parameter space for HVMOP_{set,get}_param.
+ */
+
+/*
+ * How should CPU0 event-channel notifications be delivered?
+ * val[63:56] == 0: val[55:0] is a delivery GSI (Global System Interrupt).
+ * val[63:56] == 1: val[55:0] is a delivery PCI INTx line, as follows:
+ *                  Domain = val[47:32], Bus  = val[31:16],
+ *                  DevFn  = val[15: 8], IntX = val[ 1: 0]
+ * If val == 0 then CPU0 event-channel notifications are not delivered.
+ */
+#define HVM_PARAM_CALLBACK_IRQ 0
+
+/*
+ * These are not used by Xen. They are here for convenience of HVM-guest
+ * xenbus implementations.
+ */
+#define HVM_PARAM_STORE_PFN    1
+#define HVM_PARAM_STORE_EVTCHN 2
+
+#define HVM_PARAM_PAE_ENABLED  4
+
+#define HVM_PARAM_IOREQ_PFN    5
+
+#define HVM_PARAM_BUFIOREQ_PFN 6
+
+#ifdef __ia64__
+
+#define HVM_PARAM_NVRAM_FD     7
+#define HVM_PARAM_VHPT_SIZE    8
+#define HVM_PARAM_BUFPIOREQ_PFN	9
+
+#elif defined(__i386__) || defined(__x86_64__)
+
+/* Expose Viridian interfaces to this HVM guest? */
+#define HVM_PARAM_VIRIDIAN     9
+
+#endif
+
+/*
+ * Set mode for virtual timers (currently x86 only):
+ *  delay_for_missed_ticks (default):
+ *   Do not advance a vcpu's time beyond the correct delivery time for
+ *   interrupts that have been missed due to preemption. Deliver missed
+ *   interrupts when the vcpu is rescheduled and advance the vcpu's virtual
+ *   time stepwise for each one.
+ *  no_delay_for_missed_ticks:
+ *   As above, missed interrupts are delivered, but guest time always tracks
+ *   wallclock (i.e., real) time while doing so.
+ *  no_missed_ticks_pending:
+ *   No missed interrupts are held pending. Instead, to ensure ticks are
+ *   delivered at some non-zero rate, if we detect missed ticks then the
+ *   internal tick alarm is not disabled if the VCPU is preempted during the
+ *   next tick period.
+ *  one_missed_tick_pending:
+ *   Missed interrupts are collapsed together and delivered as one 'late tick'.
+ *   Guest time always tracks wallclock (i.e., real) time.
+ */
+#define HVM_PARAM_TIMER_MODE   10
+#define HVMPTM_delay_for_missed_ticks    0
+#define HVMPTM_no_delay_for_missed_ticks 1
+#define HVMPTM_no_missed_ticks_pending   2
+#define HVMPTM_one_missed_tick_pending   3
+
+/* Boolean: Enable virtual HPET (high-precision event timer)? (x86-only) */
+#define HVM_PARAM_HPET_ENABLED 11
+
+/* Identity-map page directory used by Intel EPT when CR0.PG=0. */
+#define HVM_PARAM_IDENT_PT     12
+
+/* Device Model domain, defaults to 0. */
+#define HVM_PARAM_DM_DOMAIN    13
+
+/* ACPI S state: currently support S0 and S3 on x86. */
+#define HVM_PARAM_ACPI_S_STATE 14
+
+/* TSS used on Intel when CR0.PE=0. */
+#define HVM_PARAM_VM86_TSS     15
+
+/* Boolean: Enable aligning all periodic vpts to reduce interrupts */
+#define HVM_PARAM_VPT_ALIGN    16
+
+#define HVM_NR_PARAMS          17
+
+#endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 1/7] xen: add support for hvm_op
@ 2010-03-01  9:38   ` Sheng Yang
  0 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-01  9:38 UTC (permalink / raw)
  To: Keir Fraser, Ian Pratt
  Cc: Ian Campbell, Stefano Stabellini, xen-devel, linux-kernel,
	Jeremy Fitzhardinge, Sheng Yang

From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

Add support for hvm_op hypercall.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
---
 arch/x86/include/asm/xen/hypercall.h |    6 ++
 include/xen/hvm.h                    |   23 +++++++
 include/xen/interface/hvm/hvm_op.h   |   72 ++++++++++++++++++++++
 include/xen/interface/hvm/params.h   |  111 ++++++++++++++++++++++++++++++++++
 4 files changed, 212 insertions(+), 0 deletions(-)
 create mode 100644 include/xen/hvm.h
 create mode 100644 include/xen/interface/hvm/hvm_op.h
 create mode 100644 include/xen/interface/hvm/params.h

diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index 9c371e4..47c2ebb 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -417,6 +417,12 @@ HYPERVISOR_nmi_op(unsigned long op, unsigned long arg)
 	return _hypercall2(int, nmi_op, op, arg);
 }
 
+static inline unsigned long __must_check
+HYPERVISOR_hvm_op(int op, void *arg)
+{
+	return _hypercall2(unsigned long, hvm_op, op, arg);
+}
+
 static inline void
 MULTI_fpu_taskswitch(struct multicall_entry *mcl, int set)
 {
diff --git a/include/xen/hvm.h b/include/xen/hvm.h
new file mode 100644
index 0000000..4ea8887
--- /dev/null
+++ b/include/xen/hvm.h
@@ -0,0 +1,23 @@
+/* Simple wrappers around HVM functions */
+#ifndef XEN_HVM_H__
+#define XEN_HVM_H__
+
+#include <xen/interface/hvm/params.h>
+
+static inline unsigned long hvm_get_parameter(int idx)
+{
+	struct xen_hvm_param xhv;
+	int r;
+
+	xhv.domid = DOMID_SELF;
+	xhv.index = idx;
+	r = HYPERVISOR_hvm_op(HVMOP_get_param, &xhv);
+	if (r < 0) {
+		printk(KERN_ERR "cannot get hvm parameter %d: %d.\n",
+		       idx, r);
+		return 0;
+	}
+	return xhv.value;
+}
+
+#endif /* XEN_HVM_H__ */
diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h
new file mode 100644
index 0000000..7c74ba4
--- /dev/null
+++ b/include/xen/interface/hvm/hvm_op.h
@@ -0,0 +1,72 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_HVM_HVM_OP_H__
+#define __XEN_PUBLIC_HVM_HVM_OP_H__
+
+/* Get/set subcommands: extra argument == pointer to xen_hvm_param struct. */
+#define HVMOP_set_param           0
+#define HVMOP_get_param           1
+struct xen_hvm_param {
+    domid_t  domid;    /* IN */
+    uint32_t index;    /* IN */
+    uint64_t value;    /* IN/OUT */
+};
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_param);
+
+/* Set the logical level of one of a domain's PCI INTx wires. */
+#define HVMOP_set_pci_intx_level  2
+struct xen_hvm_set_pci_intx_level {
+    /* Domain to be updated. */
+    domid_t  domid;
+    /* PCI INTx identification in PCI topology (domain:bus:device:intx). */
+    uint8_t  domain, bus, device, intx;
+    /* Assertion level (0 = unasserted, 1 = asserted). */
+    uint8_t  level;
+};
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_set_pci_intx_level);
+
+/* Set the logical level of one of a domain's ISA IRQ wires. */
+#define HVMOP_set_isa_irq_level   3
+struct xen_hvm_set_isa_irq_level {
+    /* Domain to be updated. */
+    domid_t  domid;
+    /* ISA device identification, by ISA IRQ (0-15). */
+    uint8_t  isa_irq;
+    /* Assertion level (0 = unasserted, 1 = asserted). */
+    uint8_t  level;
+};
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_set_isa_irq_level);
+
+#define HVMOP_set_pci_link_route  4
+struct xen_hvm_set_pci_link_route {
+    /* Domain to be updated. */
+    domid_t  domid;
+    /* PCI link identifier (0-3). */
+    uint8_t  link;
+    /* ISA IRQ (1-15), or 0 (disable link). */
+    uint8_t  isa_irq;
+};
+DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_set_pci_link_route);
+
+/* Flushes all VCPU TLBs: @arg must be NULL. */
+#define HVMOP_flush_tlbs          5
+
+#endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
diff --git a/include/xen/interface/hvm/params.h b/include/xen/interface/hvm/params.h
new file mode 100644
index 0000000..15d828f
--- /dev/null
+++ b/include/xen/interface/hvm/params.h
@@ -0,0 +1,111 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef __XEN_PUBLIC_HVM_PARAMS_H__
+#define __XEN_PUBLIC_HVM_PARAMS_H__
+
+#include "hvm_op.h"
+
+/*
+ * Parameter space for HVMOP_{set,get}_param.
+ */
+
+/*
+ * How should CPU0 event-channel notifications be delivered?
+ * val[63:56] == 0: val[55:0] is a delivery GSI (Global System Interrupt).
+ * val[63:56] == 1: val[55:0] is a delivery PCI INTx line, as follows:
+ *                  Domain = val[47:32], Bus  = val[31:16],
+ *                  DevFn  = val[15: 8], IntX = val[ 1: 0]
+ * If val == 0 then CPU0 event-channel notifications are not delivered.
+ */
+#define HVM_PARAM_CALLBACK_IRQ 0
+
+/*
+ * These are not used by Xen. They are here for convenience of HVM-guest
+ * xenbus implementations.
+ */
+#define HVM_PARAM_STORE_PFN    1
+#define HVM_PARAM_STORE_EVTCHN 2
+
+#define HVM_PARAM_PAE_ENABLED  4
+
+#define HVM_PARAM_IOREQ_PFN    5
+
+#define HVM_PARAM_BUFIOREQ_PFN 6
+
+#ifdef __ia64__
+
+#define HVM_PARAM_NVRAM_FD     7
+#define HVM_PARAM_VHPT_SIZE    8
+#define HVM_PARAM_BUFPIOREQ_PFN	9
+
+#elif defined(__i386__) || defined(__x86_64__)
+
+/* Expose Viridian interfaces to this HVM guest? */
+#define HVM_PARAM_VIRIDIAN     9
+
+#endif
+
+/*
+ * Set mode for virtual timers (currently x86 only):
+ *  delay_for_missed_ticks (default):
+ *   Do not advance a vcpu's time beyond the correct delivery time for
+ *   interrupts that have been missed due to preemption. Deliver missed
+ *   interrupts when the vcpu is rescheduled and advance the vcpu's virtual
+ *   time stepwise for each one.
+ *  no_delay_for_missed_ticks:
+ *   As above, missed interrupts are delivered, but guest time always tracks
+ *   wallclock (i.e., real) time while doing so.
+ *  no_missed_ticks_pending:
+ *   No missed interrupts are held pending. Instead, to ensure ticks are
+ *   delivered at some non-zero rate, if we detect missed ticks then the
+ *   internal tick alarm is not disabled if the VCPU is preempted during the
+ *   next tick period.
+ *  one_missed_tick_pending:
+ *   Missed interrupts are collapsed together and delivered as one 'late tick'.
+ *   Guest time always tracks wallclock (i.e., real) time.
+ */
+#define HVM_PARAM_TIMER_MODE   10
+#define HVMPTM_delay_for_missed_ticks    0
+#define HVMPTM_no_delay_for_missed_ticks 1
+#define HVMPTM_no_missed_ticks_pending   2
+#define HVMPTM_one_missed_tick_pending   3
+
+/* Boolean: Enable virtual HPET (high-precision event timer)? (x86-only) */
+#define HVM_PARAM_HPET_ENABLED 11
+
+/* Identity-map page directory used by Intel EPT when CR0.PG=0. */
+#define HVM_PARAM_IDENT_PT     12
+
+/* Device Model domain, defaults to 0. */
+#define HVM_PARAM_DM_DOMAIN    13
+
+/* ACPI S state: currently support S0 and S3 on x86. */
+#define HVM_PARAM_ACPI_S_STATE 14
+
+/* TSS used on Intel when CR0.PE=0. */
+#define HVM_PARAM_VM86_TSS     15
+
+/* Boolean: Enable aligning all periodic vpts to reduce interrupts */
+#define HVM_PARAM_VPT_ALIGN    16
+
+#define HVM_NR_PARAMS          17
+
+#endif /* __XEN_PUBLIC_HVM_PARAMS_H__ */
-- 
1.5.4.5

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 2/7] xen: Import cpuid.h from Xen
  2010-03-01  9:38 [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Sheng Yang
  2010-03-01  9:38   ` Sheng Yang
@ 2010-03-01  9:38 ` Sheng Yang
  2010-03-01  9:38 ` [PATCH 3/7] xen/hvm: Xen PV extension of HVM initialization Sheng Yang
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-01  9:38 UTC (permalink / raw)
  To: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt
  Cc: Ian Campbell, Stefano Stabellini, xen-devel, linux-kernel,
	Keir Fraser, Sheng Yang

From: Keir Fraser <keir.fraser@citrix.com>

Which would be used by CPUID detection later

Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
---
 arch/x86/include/asm/xen/cpuid.h |   68 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 68 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/include/asm/xen/cpuid.h

diff --git a/arch/x86/include/asm/xen/cpuid.h b/arch/x86/include/asm/xen/cpuid.h
new file mode 100644
index 0000000..8787f03
--- /dev/null
+++ b/arch/x86/include/asm/xen/cpuid.h
@@ -0,0 +1,68 @@
+/******************************************************************************
+ * arch/include/asm/xen/cpuid.h
+ *
+ * CPUID interface to Xen.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2007 Citrix Systems, Inc.
+ *
+ * Authors:
+ *    Keir Fraser <keir.fraser@citrix.com>
+ */
+
+#ifndef __ASM_X86_XEN_CPUID_H__
+#define __ASM_X86_XEN_CPUID_H__
+
+/* Xen identification leaves start at 0x40000000. */
+#define XEN_CPUID_FIRST_LEAF 0x40000000
+#define XEN_CPUID_LEAF(i)    (XEN_CPUID_FIRST_LEAF + (i))
+
+/*
+ * Leaf 1 (0x40000000)
+ * EAX: Largest Xen-information leaf. All leaves up to an including @EAX
+ *      are supported by the Xen host.
+ * EBX-EDX: "XenVMMXenVMM" signature, allowing positive identification
+ *      of a Xen host.
+ */
+#define XEN_CPUID_SIGNATURE_EBX 0x566e6558 /* "XenV" */
+#define XEN_CPUID_SIGNATURE_ECX 0x65584d4d /* "MMXe" */
+#define XEN_CPUID_SIGNATURE_EDX 0x4d4d566e /* "nVMM" */
+
+/*
+ * Leaf 2 (0x40000001)
+ * EAX[31:16]: Xen major version.
+ * EAX[15: 0]: Xen minor version.
+ * EBX-EDX: Reserved (currently all zeroes).
+ */
+
+/*
+ * Leaf 3 (0x40000002)
+ * EAX: Number of hypercall transfer pages. This register is always guaranteed
+ *      to specify one hypercall page.
+ * EBX: Base address of Xen-specific MSRs.
+ * ECX: Features 1. Unused bits are set to zero.
+ * EDX: Features 2. Unused bits are set to zero.
+ */
+
+/* Does the host support MMU_PT_UPDATE_PRESERVE_AD for this guest? */
+#define _XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD 0
+#define XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD  (1u<<0)
+
+#endif /* __XEN_PUBLIC_ARCH_X86_CPUID_H__ */
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 3/7] xen/hvm: Xen PV extension of HVM initialization
  2010-03-01  9:38 [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Sheng Yang
  2010-03-01  9:38   ` Sheng Yang
  2010-03-01  9:38 ` [PATCH 2/7] xen: Import cpuid.h from Xen Sheng Yang
@ 2010-03-01  9:38 ` Sheng Yang
  2010-03-02  1:02     ` Jeremy Fitzhardinge
  2010-03-01  9:38 ` [PATCH 4/7] xen: The entrance for PV extension of HVM Sheng Yang
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 53+ messages in thread
From: Sheng Yang @ 2010-03-01  9:38 UTC (permalink / raw)
  To: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt
  Cc: Ian Campbell, Stefano Stabellini, xen-devel, linux-kernel,
	Sheng Yang, Yaozu (Eddie) Dong

The PV extension of HVM(once known as Hybrid) is started from real mode like
HVM guest, but also with a component based PV feature selection(e.g. PV halt,
PV timer, event channel, then PV drivers). So guest can takes the advantages
of both H/W virtualization and Para-Virtualization.

This patch introduced the PV extension of HVM guest initialization.

Guest would detect the capability using CPUID 0x40000002.edx, then call
HVMOP_enable_pv hypercall to enable pv support in hypervisor.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
Signed-off-by: Yaozu (Eddie) Dong <eddie.dong@intel.com>
---
 arch/x86/include/asm/xen/cpuid.h   |    5 ++
 arch/x86/xen/enlighten.c           |  115 ++++++++++++++++++++++++++++++++++++
 arch/x86/xen/irq.c                 |   21 +++++++
 arch/x86/xen/xen-head.S            |    6 ++
 arch/x86/xen/xen-ops.h             |    1 +
 include/xen/interface/hvm/hvm_op.h |    7 ++
 include/xen/xen.h                  |    9 +++
 7 files changed, 164 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/xen/cpuid.h b/arch/x86/include/asm/xen/cpuid.h
index 8787f03..a93c851 100644
--- a/arch/x86/include/asm/xen/cpuid.h
+++ b/arch/x86/include/asm/xen/cpuid.h
@@ -65,4 +65,9 @@
 #define _XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD 0
 #define XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD  (1u<<0)
 
+#define _XEN_CPUID_FEAT2_HVM_PV 0
+#define XEN_CPUID_FEAT2_HVM_PV (1u<<0)
+#define _XEN_CPUID_FEAT2_HVM_PV_EVTCHN 1
+#define XEN_CPUID_FEAT2_HVM_PV_EVTCHN (1u<<1)
+
 #endif /* __XEN_PUBLIC_ARCH_X86_CPUID_H__ */
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 36daccb..fdb9664 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -34,6 +34,8 @@
 #include <xen/interface/version.h>
 #include <xen/interface/physdev.h>
 #include <xen/interface/vcpu.h>
+#include <xen/interface/memory.h>
+#include <xen/interface/hvm/hvm_op.h>
 #include <xen/features.h>
 #include <xen/page.h>
 #include <xen/hvc-console.h>
@@ -43,6 +45,7 @@
 #include <asm/page.h>
 #include <asm/xen/hypercall.h>
 #include <asm/xen/hypervisor.h>
+#include <asm/xen/cpuid.h>
 #include <asm/fixmap.h>
 #include <asm/processor.h>
 #include <asm/proto.h>
@@ -1198,3 +1201,115 @@ asmlinkage void __init xen_start_kernel(void)
 	x86_64_start_reservations((char *)__pa_symbol(&boot_params));
 #endif
 }
+
+static void __init xen_hvm_pv_banner(void)
+{
+	unsigned version = HYPERVISOR_xen_version(XENVER_version, NULL);
+	struct xen_extraversion extra;
+	HYPERVISOR_xen_version(XENVER_extraversion, &extra);
+
+	printk(KERN_INFO "Booting PV featured HVM kernel on %s\n",
+		pv_info.name);
+	printk(KERN_INFO "Xen version: %d.%d%s\n",
+		version >> 16, version & 0xffff, extra.extraversion);
+}
+
+static int xen_para_available(void)
+{
+	uint32_t eax, ebx, ecx, edx;
+	cpuid(XEN_CPUID_LEAF(0), &eax, &ebx, &ecx, &edx);
+
+	if (ebx == XEN_CPUID_SIGNATURE_EBX &&
+	    ecx == XEN_CPUID_SIGNATURE_ECX &&
+	    edx == XEN_CPUID_SIGNATURE_EDX &&
+	    ((eax - XEN_CPUID_LEAF(0)) >= 2))
+		return 1;
+
+	return 0;
+}
+
+u32 xen_hvm_pv_status;
+EXPORT_SYMBOL_GPL(xen_hvm_pv_status);
+
+static int enable_hvm_pv(u64 flags)
+{
+	struct xen_hvm_pv_type a;
+
+	a.domid = DOMID_SELF;
+	a.flags = flags;
+	return HYPERVISOR_hvm_op(HVMOP_enable_pv, &a);
+}
+
+static int init_hvm_pv_info(void)
+{
+	uint32_t ecx, edx, pages, msr;
+	u64 pfn;
+
+	if (!xen_para_available())
+		return -EINVAL;
+
+	cpuid(XEN_CPUID_LEAF(2), &pages, &msr, &ecx, &edx);
+
+	/* Check if hvm_pv mode is supported */
+	if (!(edx & XEN_CPUID_FEAT2_HVM_PV))
+		return -ENODEV;
+
+	xen_hvm_pv_status = XEN_HVM_PV_ENABLED;
+
+	/* We only support 1 page of hypercall for now */
+	if (pages != 1)
+		return -ENOMEM;
+
+	pfn = __pa(hypercall_page);
+	wrmsrl(msr, pfn);
+
+	xen_setup_features();
+
+	x86_init.oem.banner = xen_hvm_pv_banner;
+	pv_info = xen_info;
+	pv_info.kernel_rpl = 0;
+
+	return 0;
+}
+
+extern struct shared_info shared_info_page;
+
+static void __init init_shared_info(void)
+{
+	struct xen_add_to_physmap xatp;
+
+	xatp.domid = DOMID_SELF;
+	xatp.idx = 0;
+	xatp.space = XENMAPSPACE_shared_info;
+	xatp.gpfn = __pa(&shared_info_page) >> PAGE_SHIFT;
+	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
+		BUG();
+
+	HYPERVISOR_shared_info = (struct shared_info *)&shared_info_page;
+
+	/* Don't do the full vcpu_info placement stuff until we have a
+	   possible map and a non-dummy shared_info. */
+	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
+}
+
+void __init xen_guest_init(void)
+{
+#ifdef CONFIG_X86_32
+	return;
+#else
+	int r;
+
+	/* Ensure the we won't confused with PV */
+	if (xen_domain_type == XEN_PV_DOMAIN)
+		return;
+
+	r = init_hvm_pv_info();
+	if (r < 0)
+		return;
+
+	init_shared_info();
+
+	xen_hvm_pv_init_irq_ops();
+#endif
+}
+
diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index 9d30105..fadaa97 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -131,3 +131,24 @@ void __init xen_init_irq_ops()
 	pv_irq_ops = xen_irq_ops;
 	x86_init.irqs.intr_init = xen_init_IRQ;
 }
+
+static void xen_hvm_pv_safe_halt(void)
+{
+	/* Do local_irq_enable() explicitly in hvm_pv guest */
+	local_irq_enable();
+	xen_safe_halt();
+}
+
+static void xen_hvm_pv_halt(void)
+{
+	if (irqs_disabled())
+		HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id(), NULL);
+	else
+		xen_hvm_pv_safe_halt();
+}
+
+void __init xen_hvm_pv_init_irq_ops(void)
+{
+	pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
+	pv_irq_ops.halt = xen_hvm_pv_halt;
+}
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index 1a5ff24..26041ce 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -33,6 +33,12 @@ ENTRY(hypercall_page)
 	.skip PAGE_SIZE_asm
 .popsection
 
+.pushsection .data
+	.align PAGE_SIZE_asm
+ENTRY(shared_info_page)
+	.skip PAGE_SIZE_asm
+.popsection
+
 	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz "linux")
 	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION,  .asciz "2.6")
 	ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION,    .asciz "xen-3.0")
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index f9153a3..cc00760 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -41,6 +41,7 @@ void xen_vcpu_restore(void);
 void __init xen_build_dynamic_phys_to_machine(void);
 
 void xen_init_irq_ops(void);
+void xen_hvm_pv_init_irq_ops(void);
 void xen_setup_timer(int cpu);
 void xen_setup_runstate_info(int cpu);
 void xen_teardown_timer(int cpu);
diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h
index 7c74ba4..0ce8a26 100644
--- a/include/xen/interface/hvm/hvm_op.h
+++ b/include/xen/interface/hvm/hvm_op.h
@@ -69,4 +69,11 @@ DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_set_pci_link_route);
 /* Flushes all VCPU TLBs: @arg must be NULL. */
 #define HVMOP_flush_tlbs          5
 
+#define HVMOP_enable_pv 9
+struct xen_hvm_pv_type {
+	domid_t domid;
+	uint32_t flags;
+#define HVM_PV_EVTCHN (1ull<<1)
+};
+
 #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
diff --git a/include/xen/xen.h b/include/xen/xen.h
index a164024..9bb92e5 100644
--- a/include/xen/xen.h
+++ b/include/xen/xen.h
@@ -9,6 +9,7 @@ enum xen_domain_type {
 
 #ifdef CONFIG_XEN
 extern enum xen_domain_type xen_domain_type;
+extern void xen_guest_init(void);
 #else
 #define xen_domain_type		XEN_NATIVE
 #endif
@@ -19,6 +20,14 @@ extern enum xen_domain_type xen_domain_type;
 #define xen_hvm_domain()	(xen_domain() &&			\
 				 xen_domain_type == XEN_HVM_DOMAIN)
 
+#define XEN_HVM_PV_ENABLED	    (1u << 0)
+#define XEN_HVM_PV_EVTCHN_ENABLED   (1u << 1)
+extern u32 xen_hvm_pv_status;
+
+#define xen_hvm_pv_enabled() (xen_hvm_pv_status & XEN_HVM_PV_ENABLED)
+#define xen_hvm_pv_evtchn_enabled() (xen_hvm_pv_enabled() && \
+		(xen_hvm_pv_status & XEN_HVM_PV_EVTCHN_ENABLED))
+
 #ifdef CONFIG_XEN_DOM0
 #include <xen/interface/xen.h>
 #include <asm/xen/hypervisor.h>
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 4/7] xen: The entrance for PV extension of HVM
  2010-03-01  9:38 [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Sheng Yang
                   ` (2 preceding siblings ...)
  2010-03-01  9:38 ` [PATCH 3/7] xen/hvm: Xen PV extension of HVM initialization Sheng Yang
@ 2010-03-01  9:38 ` Sheng Yang
  2010-03-02  1:05   ` [Xen-devel] " Jeremy Fitzhardinge
  2010-03-01  9:38   ` Sheng Yang
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 53+ messages in thread
From: Sheng Yang @ 2010-03-01  9:38 UTC (permalink / raw)
  To: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt
  Cc: Ian Campbell, Stefano Stabellini, xen-devel, linux-kernel,
	Sheng Yang, Ingo Molnar

xen_guest_init() would setup the environment.

Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Sheng Yang <sheng@linux.intel.com>
---
 arch/x86/kernel/setup.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 5d9e40c..2b61d46 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -113,6 +113,10 @@
 #endif
 #include <asm/mce.h>
 
+#ifdef CONFIG_XEN
+#include <xen/xen.h>
+#endif
+
 /*
  * end_pfn only includes RAM, while max_pfn_mapped includes all e820 entries.
  * The direct mapping extends to max_pfn_mapped, so that we can directly access
@@ -740,6 +744,10 @@ void __init setup_arch(char **cmdline_p)
 
 	x86_init.oem.arch_setup();
 
+#ifdef CONFIG_XEN
+	xen_guest_init();
+#endif
+
 	setup_memory_map();
 	parse_setup_data();
 	/* update the e820_saved too */
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 5/7] xen: Make event channel work with PV extension of HVM
  2010-03-01  9:38 [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Sheng Yang
@ 2010-03-01  9:38   ` Sheng Yang
  2010-03-01  9:38 ` [PATCH 2/7] xen: Import cpuid.h from Xen Sheng Yang
                     ` (6 subsequent siblings)
  7 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-01  9:38 UTC (permalink / raw)
  To: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt
  Cc: Ian Campbell, Stefano Stabellini, xen-devel, linux-kernel, Sheng Yang

We mapped each IOAPIC pin to a VIRQ, so that we can deliver interrupt through
these VIRQs.

We used X86_PLATFORM_IPI_VECTOR as the noficiation vector for hypervisor
to notify guest about the event.

The Xen PV timer is used to provide guest a reliable timer.

The patch also enabled SMP support, then we can support IPI through evtchn as well.

Then we don't use IOAPIC/LAPIC.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
---
 arch/x86/xen/enlighten.c    |   72 +++++++++++++++++++++
 arch/x86/xen/irq.c          |   37 ++++++++++-
 arch/x86/xen/smp.c          |  144 ++++++++++++++++++++++++++++++++++++++++++-
 arch/x86/xen/xen-ops.h      |    3 +
 drivers/xen/events.c        |   66 ++++++++++++++++++-
 include/xen/events.h        |    1 +
 include/xen/hvm.h           |    5 ++
 include/xen/interface/xen.h |    6 ++-
 8 files changed, 326 insertions(+), 8 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index fdb9664..f617639 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -58,6 +58,9 @@
 #include <asm/reboot.h>
 #include <asm/stackprotector.h>
 
+#include <xen/hvm.h>
+#include <xen/events.h>
+
 #include "xen-ops.h"
 #include "mmu.h"
 #include "multicalls.h"
@@ -1212,6 +1215,8 @@ static void __init xen_hvm_pv_banner(void)
 		pv_info.name);
 	printk(KERN_INFO "Xen version: %d.%d%s\n",
 		version >> 16, version & 0xffff, extra.extraversion);
+	if (xen_hvm_pv_evtchn_enabled())
+		printk(KERN_INFO "PV feature: Event channel enabled\n");
 }
 
 static int xen_para_available(void)
@@ -1256,6 +1261,9 @@ static int init_hvm_pv_info(void)
 
 	xen_hvm_pv_status = XEN_HVM_PV_ENABLED;
 
+	if (edx & XEN_CPUID_FEAT2_HVM_PV_EVTCHN)
+		xen_hvm_pv_status |= XEN_HVM_PV_EVTCHN_ENABLED;
+
 	/* We only support 1 page of hypercall for now */
 	if (pages != 1)
 		return -ENOMEM;
@@ -1292,12 +1300,42 @@ static void __init init_shared_info(void)
 	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
 }
 
+static int set_callback_via(uint64_t via)
+{
+	struct xen_hvm_param a;
+
+	a.domid = DOMID_SELF;
+	a.index = HVM_PARAM_CALLBACK_IRQ;
+	a.value = via;
+	return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
+}
+
+void do_hvm_pv_evtchn_intr(void)
+{
+#ifdef CONFIG_X86_64
+	per_cpu(irq_count, smp_processor_id())++;
+#endif
+	xen_evtchn_do_upcall(get_irq_regs());
+#ifdef CONFIG_X86_64
+	per_cpu(irq_count, smp_processor_id())--;
+#endif
+}
+
+#ifdef CONFIG_X86_LOCAL_APIC
+static void xen_hvm_pv_evtchn_apic_write(u32 reg, u32 val)
+{
+	/* The only one reached here should be EOI */
+	WARN_ON(reg != APIC_EOI);
+}
+#endif
+
 void __init xen_guest_init(void)
 {
 #ifdef CONFIG_X86_32
 	return;
 #else
 	int r;
+	uint64_t callback_via;
 
 	/* Ensure the we won't confused with PV */
 	if (xen_domain_type == XEN_PV_DOMAIN)
@@ -1310,6 +1348,40 @@ void __init xen_guest_init(void)
 	init_shared_info();
 
 	xen_hvm_pv_init_irq_ops();
+
+	if (xen_hvm_pv_evtchn_enabled()) {
+		if (enable_hvm_pv(HVM_PV_EVTCHN))
+			return;
+
+		pv_time_ops = xen_time_ops;
+
+		x86_init.timers.timer_init = xen_time_init;
+		x86_init.timers.setup_percpu_clockev = x86_init_noop;
+		x86_cpuinit.setup_percpu_clockev = x86_init_noop;
+
+		x86_platform.calibrate_tsc = xen_tsc_khz;
+		x86_platform.get_wallclock = xen_get_wallclock;
+		x86_platform.set_wallclock = xen_set_wallclock;
+
+		pv_apic_ops = xen_apic_ops;
+#ifdef CONFIG_X86_LOCAL_APIC
+		/*
+		 * set up the basic apic ops.
+		 */
+		set_xen_basic_apic_ops();
+		apic->write = xen_hvm_pv_evtchn_apic_write;
+#endif
+
+		callback_via = HVM_CALLBACK_VECTOR(X86_PLATFORM_IPI_VECTOR);
+		set_callback_via(callback_via);
+
+		x86_platform_ipi_callback = do_hvm_pv_evtchn_intr;
+
+		disable_acpi();
+
+		xen_hvm_pv_smp_init();
+		machine_ops = xen_machine_ops;
+	}
 #endif
 }
 
diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index fadaa97..7827a6d 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -5,6 +5,7 @@
 #include <xen/interface/xen.h>
 #include <xen/interface/sched.h>
 #include <xen/interface/vcpu.h>
+#include <xen/xen.h>
 
 #include <asm/xen/hypercall.h>
 #include <asm/xen/hypervisor.h>
@@ -132,6 +133,20 @@ void __init xen_init_irq_ops()
 	x86_init.irqs.intr_init = xen_init_IRQ;
 }
 
+static void xen_hvm_pv_evtchn_disable(void)
+{
+	native_irq_disable();
+	xen_irq_disable();
+}
+PV_CALLEE_SAVE_REGS_THUNK(xen_hvm_pv_evtchn_disable);
+
+static void xen_hvm_pv_evtchn_enable(void)
+{
+	native_irq_enable();
+	xen_irq_enable();
+}
+PV_CALLEE_SAVE_REGS_THUNK(xen_hvm_pv_evtchn_enable);
+
 static void xen_hvm_pv_safe_halt(void)
 {
 	/* Do local_irq_enable() explicitly in hvm_pv guest */
@@ -147,8 +162,26 @@ static void xen_hvm_pv_halt(void)
 		xen_hvm_pv_safe_halt();
 }
 
+static const struct pv_irq_ops xen_hvm_pv_irq_ops __initdata = {
+	.save_fl = __PV_IS_CALLEE_SAVE(native_save_fl),
+	.restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl),
+	.irq_disable = PV_CALLEE_SAVE(xen_hvm_pv_evtchn_disable),
+	.irq_enable = PV_CALLEE_SAVE(xen_hvm_pv_evtchn_enable),
+
+	.safe_halt = xen_hvm_pv_safe_halt,
+	.halt = xen_hvm_pv_halt,
+#ifdef CONFIG_X86_64
+	.adjust_exception_frame = paravirt_nop,
+#endif
+};
+
 void __init xen_hvm_pv_init_irq_ops(void)
 {
-	pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
-	pv_irq_ops.halt = xen_hvm_pv_halt;
+	if (xen_hvm_pv_evtchn_enabled()) {
+		pv_irq_ops = xen_hvm_pv_irq_ops;
+		x86_init.irqs.intr_init = xen_hvm_pv_evtchn_init_IRQ;
+	} else {
+		pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
+		pv_irq_ops.halt = xen_hvm_pv_halt;
+	}
 }
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 563d205..a85d0b6 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -15,20 +15,26 @@
 #include <linux/sched.h>
 #include <linux/err.h>
 #include <linux/smp.h>
+#include <linux/nmi.h>
 
 #include <asm/paravirt.h>
 #include <asm/desc.h>
 #include <asm/pgtable.h>
 #include <asm/cpu.h>
+#include <asm/trampoline.h>
+#include <asm/tlbflush.h>
+#include <asm/mtrr.h>
 
 #include <xen/interface/xen.h>
 #include <xen/interface/vcpu.h>
 
 #include <asm/xen/interface.h>
 #include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
 
 #include <xen/page.h>
 #include <xen/events.h>
+#include <xen/xen.h>
 
 #include "xen-ops.h"
 #include "mmu.h"
@@ -171,7 +177,8 @@ static void __init xen_smp_prepare_boot_cpu(void)
 
 	/* We've switched to the "real" per-cpu gdt, so make sure the
 	   old memory can be recycled */
-	make_lowmem_page_readwrite(xen_initial_gdt);
+	if (xen_pv_domain())
+		make_lowmem_page_readwrite(xen_initial_gdt);
 
 	xen_setup_vcpu_info_placement();
 }
@@ -480,3 +487,138 @@ void __init xen_smp_init(void)
 	xen_fill_possible_map();
 	xen_init_spinlocks();
 }
+
+static __cpuinit void xen_hvm_pv_start_secondary(void)
+{
+	int cpu = smp_processor_id();
+
+	cpu_init();
+	touch_nmi_watchdog();
+	preempt_disable();
+
+	/* otherwise gcc will move up smp_processor_id before the cpu_init */
+	barrier();
+	/*
+	 * Check TSC synchronization with the BSP:
+	 */
+	check_tsc_sync_target();
+
+	/* Done in smp_callin(), move it here */
+	set_mtrr_aps_delayed_init();
+	smp_store_cpu_info(cpu);
+
+	/* This must be done before setting cpu_online_mask */
+	set_cpu_sibling_map(cpu);
+	wmb();
+
+	set_cpu_online(smp_processor_id(), true);
+	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
+
+	/* enable local interrupts */
+	local_irq_enable();
+
+	xen_setup_cpu_clockevents();
+
+	wmb();
+	cpu_idle();
+}
+
+static __cpuinit int
+hvm_pv_cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
+{
+	struct vcpu_guest_context *ctxt;
+	unsigned long start_ip;
+
+	if (cpumask_test_and_set_cpu(cpu, xen_cpu_initialized_map))
+		return 0;
+
+	ctxt = kzalloc(sizeof(*ctxt), GFP_KERNEL);
+	if (ctxt == NULL)
+		return -ENOMEM;
+
+	early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu);
+	initial_code = (unsigned long)xen_hvm_pv_start_secondary;
+	stack_start.sp = (void *) idle->thread.sp;
+
+	/* start_ip had better be page-aligned! */
+	start_ip = setup_trampoline();
+
+	/* only start_ip is what we want */
+	ctxt->flags = VGCF_HVM_GUEST;
+	ctxt->user_regs.eip = start_ip;
+
+	printk(KERN_INFO "Booting processor %d ip 0x%lx\n", cpu, start_ip);
+
+	if (HYPERVISOR_vcpu_op(VCPUOP_initialise, cpu, ctxt))
+		BUG();
+
+	kfree(ctxt);
+	return 0;
+}
+
+static int __init xen_hvm_pv_cpu_up(unsigned int cpu)
+{
+	struct task_struct *idle = idle_task(cpu);
+	int rc;
+	unsigned long flags;
+
+	per_cpu(current_task, cpu) = idle;
+
+#ifdef CONFIG_X86_32
+	irq_ctx_init(cpu);
+#else
+	clear_tsk_thread_flag(idle, TIF_FORK);
+	initial_gs = per_cpu_offset(cpu);
+	per_cpu(kernel_stack, cpu) =
+		(unsigned long)task_stack_page(idle) -
+		KERNEL_STACK_OFFSET + THREAD_SIZE;
+#endif
+
+	xen_setup_timer(cpu);
+	xen_init_lock_cpu(cpu);
+
+	per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
+
+	rc = hvm_pv_cpu_initialize_context(cpu, idle);
+	if (rc)
+		return rc;
+
+	if (num_online_cpus() == 1)
+		alternatives_smp_switch(1);
+
+	rc = xen_smp_intr_init(cpu);
+	if (rc)
+		return rc;
+
+	rc = HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL);
+	BUG_ON(rc);
+
+	/*
+	 * Check TSC synchronization with the AP (keep irqs disabled
+	 * while doing so):
+	 */
+	local_irq_save(flags);
+	check_tsc_sync_source(cpu);
+	local_irq_restore(flags);
+
+	while (!cpu_online(cpu)) {
+		cpu_relax();
+		touch_nmi_watchdog();
+	}
+
+	return 0;
+}
+
+static void xen_hvm_pv_flush_tlb_others(const struct cpumask *cpumask,
+					struct mm_struct *mm, unsigned long va)
+{
+	/* TODO Make it more specific */
+	flush_tlb_all();
+}
+
+void __init xen_hvm_pv_smp_init(void)
+{
+	smp_ops = xen_smp_ops;
+	smp_ops.cpu_up = xen_hvm_pv_cpu_up;
+	pv_mmu_ops.flush_tlb_others = xen_hvm_pv_flush_tlb_others;
+}
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index cc00760..a8cc129 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -34,6 +34,7 @@ void xen_reserve_top(void);
 char * __init xen_memory_setup(void);
 void __init xen_arch_setup(void);
 void __init xen_init_IRQ(void);
+void __init xen_hvm_pv_evtchn_init_IRQ(void);
 void xen_enable_sysenter(void);
 void xen_enable_syscall(void);
 void xen_vcpu_restore(void);
@@ -61,10 +62,12 @@ void xen_setup_vcpu_info_placement(void);
 
 #ifdef CONFIG_SMP
 void xen_smp_init(void);
+void xen_hvm_pv_smp_init(void);
 
 extern cpumask_var_t xen_cpu_initialized_map;
 #else
 static inline void xen_smp_init(void) {}
+static inline void xen_hvm_pv_smp_init(void) {}
 #endif
 
 #ifdef CONFIG_PARAVIRT_SPINLOCKS
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index ce602dd..e1835db 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -37,9 +37,12 @@
 
 #include <xen/xen-ops.h>
 #include <xen/events.h>
+#include <xen/xen.h>
 #include <xen/interface/xen.h>
 #include <xen/interface/event_channel.h>
 
+#include <asm/desc.h>
+
 /*
  * This lock protects updates to the following mapping and reference-count
  * arrays. The lock does not need to be acquired to read the mapping tables.
@@ -624,8 +627,13 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
  	unsigned count;
 
-	exit_idle();
-	irq_enter();
+	/*
+	 * If is PV featured HVM, these have already been done
+	 */
+	if (likely(!xen_hvm_pv_evtchn_enabled())) {
+		exit_idle();
+		irq_enter();
+	}
 
 	do {
 		unsigned long pending_words;
@@ -662,8 +670,10 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 	} while(count != 1);
 
 out:
-	irq_exit();
-	set_irq_regs(old_regs);
+	if (likely(!xen_hvm_pv_evtchn_enabled())) {
+		irq_exit();
+		set_irq_regs(old_regs);
+	}
 
 	put_cpu();
 }
@@ -944,3 +954,51 @@ void __init xen_init_IRQ(void)
 
 	irq_ctx_init(smp_processor_id());
 }
+
+void __init xen_hvm_pv_evtchn_init_IRQ(void)
+{
+	int i;
+
+	xen_init_IRQ();
+	for (i = 0; i < NR_IRQS_LEGACY; i++) {
+		struct evtchn_bind_virq bind_virq;
+		struct irq_desc *desc = irq_to_desc(i);
+		int virq, evtchn;
+
+		virq = i + VIRQ_EMUL_PIN_START;
+		bind_virq.virq = virq;
+		bind_virq.vcpu = 0;
+
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
+						&bind_virq) != 0)
+			BUG();
+
+		evtchn = bind_virq.port;
+		evtchn_to_irq[evtchn] = i;
+		irq_info[i] = mk_virq_info(evtchn, virq);
+
+		desc->status = IRQ_DISABLED;
+		desc->action = NULL;
+		desc->depth = 1;
+
+		/*
+		 * 16 old-style INTA-cycle interrupts:
+		 */
+		set_irq_chip_and_handler_name(i, &xen_dynamic_chip,
+					handle_level_irq, "event");
+	}
+
+	/*
+	 * Cover the whole vector space, no vector can escape
+	 * us. (some of these will be overridden and become
+	 * 'special' SMP interrupts)
+	 */
+	for (i = 0; i < (NR_VECTORS - FIRST_EXTERNAL_VECTOR); i++) {
+		int vector = FIRST_EXTERNAL_VECTOR + i;
+		if (vector != IA32_SYSCALL_VECTOR)
+			set_intr_gate(vector, interrupt[i]);
+	}
+
+	/* generic IPI for platform specific use, now used for HVM evtchn */
+	alloc_intr_gate(X86_PLATFORM_IPI_VECTOR, x86_platform_ipi);
+}
diff --git a/include/xen/events.h b/include/xen/events.h
index e68d59a..91755db 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -56,4 +56,5 @@ void xen_poll_irq(int irq);
 /* Determine the IRQ which is bound to an event channel */
 unsigned irq_from_evtchn(unsigned int evtchn);
 
+void xen_evtchn_do_upcall(struct pt_regs *regs);
 #endif	/* _XEN_EVENTS_H */
diff --git a/include/xen/hvm.h b/include/xen/hvm.h
index 4ea8887..c66d788 100644
--- a/include/xen/hvm.h
+++ b/include/xen/hvm.h
@@ -20,4 +20,9 @@ static inline unsigned long hvm_get_parameter(int idx)
 	return xhv.value;
 }
 
+#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
+#define HVM_CALLBACK_VIA_TYPE_SHIFT 56
+#define HVM_CALLBACK_VECTOR(x) (((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR)<<\
+				HVM_CALLBACK_VIA_TYPE_SHIFT | (x))
+
 #endif /* XEN_HVM_H__ */
diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
index 2befa3e..9282ff7 100644
--- a/include/xen/interface/xen.h
+++ b/include/xen/interface/xen.h
@@ -90,7 +90,11 @@
 #define VIRQ_ARCH_6    22
 #define VIRQ_ARCH_7    23
 
-#define NR_VIRQS       24
+#define VIRQ_EMUL_PIN_START 24
+#define VIRQ_EMUL_PIN_NUM 16
+
+#define NR_VIRQS       40
+
 /*
  * MMU-UPDATE REQUESTS
  *
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 5/7] xen: Make event channel work with PV extension of HVM
@ 2010-03-01  9:38   ` Sheng Yang
  0 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-01  9:38 UTC (permalink / raw)
  To: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt
  Cc: linux-kernel, xen-devel, Ian Campbell, Sheng Yang, Stefano Stabellini

We mapped each IOAPIC pin to a VIRQ, so that we can deliver interrupt through
these VIRQs.

We used X86_PLATFORM_IPI_VECTOR as the noficiation vector for hypervisor
to notify guest about the event.

The Xen PV timer is used to provide guest a reliable timer.

The patch also enabled SMP support, then we can support IPI through evtchn as well.

Then we don't use IOAPIC/LAPIC.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
---
 arch/x86/xen/enlighten.c    |   72 +++++++++++++++++++++
 arch/x86/xen/irq.c          |   37 ++++++++++-
 arch/x86/xen/smp.c          |  144 ++++++++++++++++++++++++++++++++++++++++++-
 arch/x86/xen/xen-ops.h      |    3 +
 drivers/xen/events.c        |   66 ++++++++++++++++++-
 include/xen/events.h        |    1 +
 include/xen/hvm.h           |    5 ++
 include/xen/interface/xen.h |    6 ++-
 8 files changed, 326 insertions(+), 8 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index fdb9664..f617639 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -58,6 +58,9 @@
 #include <asm/reboot.h>
 #include <asm/stackprotector.h>
 
+#include <xen/hvm.h>
+#include <xen/events.h>
+
 #include "xen-ops.h"
 #include "mmu.h"
 #include "multicalls.h"
@@ -1212,6 +1215,8 @@ static void __init xen_hvm_pv_banner(void)
 		pv_info.name);
 	printk(KERN_INFO "Xen version: %d.%d%s\n",
 		version >> 16, version & 0xffff, extra.extraversion);
+	if (xen_hvm_pv_evtchn_enabled())
+		printk(KERN_INFO "PV feature: Event channel enabled\n");
 }
 
 static int xen_para_available(void)
@@ -1256,6 +1261,9 @@ static int init_hvm_pv_info(void)
 
 	xen_hvm_pv_status = XEN_HVM_PV_ENABLED;
 
+	if (edx & XEN_CPUID_FEAT2_HVM_PV_EVTCHN)
+		xen_hvm_pv_status |= XEN_HVM_PV_EVTCHN_ENABLED;
+
 	/* We only support 1 page of hypercall for now */
 	if (pages != 1)
 		return -ENOMEM;
@@ -1292,12 +1300,42 @@ static void __init init_shared_info(void)
 	per_cpu(xen_vcpu, 0) = &HYPERVISOR_shared_info->vcpu_info[0];
 }
 
+static int set_callback_via(uint64_t via)
+{
+	struct xen_hvm_param a;
+
+	a.domid = DOMID_SELF;
+	a.index = HVM_PARAM_CALLBACK_IRQ;
+	a.value = via;
+	return HYPERVISOR_hvm_op(HVMOP_set_param, &a);
+}
+
+void do_hvm_pv_evtchn_intr(void)
+{
+#ifdef CONFIG_X86_64
+	per_cpu(irq_count, smp_processor_id())++;
+#endif
+	xen_evtchn_do_upcall(get_irq_regs());
+#ifdef CONFIG_X86_64
+	per_cpu(irq_count, smp_processor_id())--;
+#endif
+}
+
+#ifdef CONFIG_X86_LOCAL_APIC
+static void xen_hvm_pv_evtchn_apic_write(u32 reg, u32 val)
+{
+	/* The only one reached here should be EOI */
+	WARN_ON(reg != APIC_EOI);
+}
+#endif
+
 void __init xen_guest_init(void)
 {
 #ifdef CONFIG_X86_32
 	return;
 #else
 	int r;
+	uint64_t callback_via;
 
 	/* Ensure the we won't confused with PV */
 	if (xen_domain_type == XEN_PV_DOMAIN)
@@ -1310,6 +1348,40 @@ void __init xen_guest_init(void)
 	init_shared_info();
 
 	xen_hvm_pv_init_irq_ops();
+
+	if (xen_hvm_pv_evtchn_enabled()) {
+		if (enable_hvm_pv(HVM_PV_EVTCHN))
+			return;
+
+		pv_time_ops = xen_time_ops;
+
+		x86_init.timers.timer_init = xen_time_init;
+		x86_init.timers.setup_percpu_clockev = x86_init_noop;
+		x86_cpuinit.setup_percpu_clockev = x86_init_noop;
+
+		x86_platform.calibrate_tsc = xen_tsc_khz;
+		x86_platform.get_wallclock = xen_get_wallclock;
+		x86_platform.set_wallclock = xen_set_wallclock;
+
+		pv_apic_ops = xen_apic_ops;
+#ifdef CONFIG_X86_LOCAL_APIC
+		/*
+		 * set up the basic apic ops.
+		 */
+		set_xen_basic_apic_ops();
+		apic->write = xen_hvm_pv_evtchn_apic_write;
+#endif
+
+		callback_via = HVM_CALLBACK_VECTOR(X86_PLATFORM_IPI_VECTOR);
+		set_callback_via(callback_via);
+
+		x86_platform_ipi_callback = do_hvm_pv_evtchn_intr;
+
+		disable_acpi();
+
+		xen_hvm_pv_smp_init();
+		machine_ops = xen_machine_ops;
+	}
 #endif
 }
 
diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
index fadaa97..7827a6d 100644
--- a/arch/x86/xen/irq.c
+++ b/arch/x86/xen/irq.c
@@ -5,6 +5,7 @@
 #include <xen/interface/xen.h>
 #include <xen/interface/sched.h>
 #include <xen/interface/vcpu.h>
+#include <xen/xen.h>
 
 #include <asm/xen/hypercall.h>
 #include <asm/xen/hypervisor.h>
@@ -132,6 +133,20 @@ void __init xen_init_irq_ops()
 	x86_init.irqs.intr_init = xen_init_IRQ;
 }
 
+static void xen_hvm_pv_evtchn_disable(void)
+{
+	native_irq_disable();
+	xen_irq_disable();
+}
+PV_CALLEE_SAVE_REGS_THUNK(xen_hvm_pv_evtchn_disable);
+
+static void xen_hvm_pv_evtchn_enable(void)
+{
+	native_irq_enable();
+	xen_irq_enable();
+}
+PV_CALLEE_SAVE_REGS_THUNK(xen_hvm_pv_evtchn_enable);
+
 static void xen_hvm_pv_safe_halt(void)
 {
 	/* Do local_irq_enable() explicitly in hvm_pv guest */
@@ -147,8 +162,26 @@ static void xen_hvm_pv_halt(void)
 		xen_hvm_pv_safe_halt();
 }
 
+static const struct pv_irq_ops xen_hvm_pv_irq_ops __initdata = {
+	.save_fl = __PV_IS_CALLEE_SAVE(native_save_fl),
+	.restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl),
+	.irq_disable = PV_CALLEE_SAVE(xen_hvm_pv_evtchn_disable),
+	.irq_enable = PV_CALLEE_SAVE(xen_hvm_pv_evtchn_enable),
+
+	.safe_halt = xen_hvm_pv_safe_halt,
+	.halt = xen_hvm_pv_halt,
+#ifdef CONFIG_X86_64
+	.adjust_exception_frame = paravirt_nop,
+#endif
+};
+
 void __init xen_hvm_pv_init_irq_ops(void)
 {
-	pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
-	pv_irq_ops.halt = xen_hvm_pv_halt;
+	if (xen_hvm_pv_evtchn_enabled()) {
+		pv_irq_ops = xen_hvm_pv_irq_ops;
+		x86_init.irqs.intr_init = xen_hvm_pv_evtchn_init_IRQ;
+	} else {
+		pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
+		pv_irq_ops.halt = xen_hvm_pv_halt;
+	}
 }
diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
index 563d205..a85d0b6 100644
--- a/arch/x86/xen/smp.c
+++ b/arch/x86/xen/smp.c
@@ -15,20 +15,26 @@
 #include <linux/sched.h>
 #include <linux/err.h>
 #include <linux/smp.h>
+#include <linux/nmi.h>
 
 #include <asm/paravirt.h>
 #include <asm/desc.h>
 #include <asm/pgtable.h>
 #include <asm/cpu.h>
+#include <asm/trampoline.h>
+#include <asm/tlbflush.h>
+#include <asm/mtrr.h>
 
 #include <xen/interface/xen.h>
 #include <xen/interface/vcpu.h>
 
 #include <asm/xen/interface.h>
 #include <asm/xen/hypercall.h>
+#include <asm/xen/hypervisor.h>
 
 #include <xen/page.h>
 #include <xen/events.h>
+#include <xen/xen.h>
 
 #include "xen-ops.h"
 #include "mmu.h"
@@ -171,7 +177,8 @@ static void __init xen_smp_prepare_boot_cpu(void)
 
 	/* We've switched to the "real" per-cpu gdt, so make sure the
 	   old memory can be recycled */
-	make_lowmem_page_readwrite(xen_initial_gdt);
+	if (xen_pv_domain())
+		make_lowmem_page_readwrite(xen_initial_gdt);
 
 	xen_setup_vcpu_info_placement();
 }
@@ -480,3 +487,138 @@ void __init xen_smp_init(void)
 	xen_fill_possible_map();
 	xen_init_spinlocks();
 }
+
+static __cpuinit void xen_hvm_pv_start_secondary(void)
+{
+	int cpu = smp_processor_id();
+
+	cpu_init();
+	touch_nmi_watchdog();
+	preempt_disable();
+
+	/* otherwise gcc will move up smp_processor_id before the cpu_init */
+	barrier();
+	/*
+	 * Check TSC synchronization with the BSP:
+	 */
+	check_tsc_sync_target();
+
+	/* Done in smp_callin(), move it here */
+	set_mtrr_aps_delayed_init();
+	smp_store_cpu_info(cpu);
+
+	/* This must be done before setting cpu_online_mask */
+	set_cpu_sibling_map(cpu);
+	wmb();
+
+	set_cpu_online(smp_processor_id(), true);
+	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
+
+	/* enable local interrupts */
+	local_irq_enable();
+
+	xen_setup_cpu_clockevents();
+
+	wmb();
+	cpu_idle();
+}
+
+static __cpuinit int
+hvm_pv_cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
+{
+	struct vcpu_guest_context *ctxt;
+	unsigned long start_ip;
+
+	if (cpumask_test_and_set_cpu(cpu, xen_cpu_initialized_map))
+		return 0;
+
+	ctxt = kzalloc(sizeof(*ctxt), GFP_KERNEL);
+	if (ctxt == NULL)
+		return -ENOMEM;
+
+	early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu);
+	initial_code = (unsigned long)xen_hvm_pv_start_secondary;
+	stack_start.sp = (void *) idle->thread.sp;
+
+	/* start_ip had better be page-aligned! */
+	start_ip = setup_trampoline();
+
+	/* only start_ip is what we want */
+	ctxt->flags = VGCF_HVM_GUEST;
+	ctxt->user_regs.eip = start_ip;
+
+	printk(KERN_INFO "Booting processor %d ip 0x%lx\n", cpu, start_ip);
+
+	if (HYPERVISOR_vcpu_op(VCPUOP_initialise, cpu, ctxt))
+		BUG();
+
+	kfree(ctxt);
+	return 0;
+}
+
+static int __init xen_hvm_pv_cpu_up(unsigned int cpu)
+{
+	struct task_struct *idle = idle_task(cpu);
+	int rc;
+	unsigned long flags;
+
+	per_cpu(current_task, cpu) = idle;
+
+#ifdef CONFIG_X86_32
+	irq_ctx_init(cpu);
+#else
+	clear_tsk_thread_flag(idle, TIF_FORK);
+	initial_gs = per_cpu_offset(cpu);
+	per_cpu(kernel_stack, cpu) =
+		(unsigned long)task_stack_page(idle) -
+		KERNEL_STACK_OFFSET + THREAD_SIZE;
+#endif
+
+	xen_setup_timer(cpu);
+	xen_init_lock_cpu(cpu);
+
+	per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
+
+	rc = hvm_pv_cpu_initialize_context(cpu, idle);
+	if (rc)
+		return rc;
+
+	if (num_online_cpus() == 1)
+		alternatives_smp_switch(1);
+
+	rc = xen_smp_intr_init(cpu);
+	if (rc)
+		return rc;
+
+	rc = HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL);
+	BUG_ON(rc);
+
+	/*
+	 * Check TSC synchronization with the AP (keep irqs disabled
+	 * while doing so):
+	 */
+	local_irq_save(flags);
+	check_tsc_sync_source(cpu);
+	local_irq_restore(flags);
+
+	while (!cpu_online(cpu)) {
+		cpu_relax();
+		touch_nmi_watchdog();
+	}
+
+	return 0;
+}
+
+static void xen_hvm_pv_flush_tlb_others(const struct cpumask *cpumask,
+					struct mm_struct *mm, unsigned long va)
+{
+	/* TODO Make it more specific */
+	flush_tlb_all();
+}
+
+void __init xen_hvm_pv_smp_init(void)
+{
+	smp_ops = xen_smp_ops;
+	smp_ops.cpu_up = xen_hvm_pv_cpu_up;
+	pv_mmu_ops.flush_tlb_others = xen_hvm_pv_flush_tlb_others;
+}
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index cc00760..a8cc129 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -34,6 +34,7 @@ void xen_reserve_top(void);
 char * __init xen_memory_setup(void);
 void __init xen_arch_setup(void);
 void __init xen_init_IRQ(void);
+void __init xen_hvm_pv_evtchn_init_IRQ(void);
 void xen_enable_sysenter(void);
 void xen_enable_syscall(void);
 void xen_vcpu_restore(void);
@@ -61,10 +62,12 @@ void xen_setup_vcpu_info_placement(void);
 
 #ifdef CONFIG_SMP
 void xen_smp_init(void);
+void xen_hvm_pv_smp_init(void);
 
 extern cpumask_var_t xen_cpu_initialized_map;
 #else
 static inline void xen_smp_init(void) {}
+static inline void xen_hvm_pv_smp_init(void) {}
 #endif
 
 #ifdef CONFIG_PARAVIRT_SPINLOCKS
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index ce602dd..e1835db 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -37,9 +37,12 @@
 
 #include <xen/xen-ops.h>
 #include <xen/events.h>
+#include <xen/xen.h>
 #include <xen/interface/xen.h>
 #include <xen/interface/event_channel.h>
 
+#include <asm/desc.h>
+
 /*
  * This lock protects updates to the following mapping and reference-count
  * arrays. The lock does not need to be acquired to read the mapping tables.
@@ -624,8 +627,13 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
  	unsigned count;
 
-	exit_idle();
-	irq_enter();
+	/*
+	 * If is PV featured HVM, these have already been done
+	 */
+	if (likely(!xen_hvm_pv_evtchn_enabled())) {
+		exit_idle();
+		irq_enter();
+	}
 
 	do {
 		unsigned long pending_words;
@@ -662,8 +670,10 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
 	} while(count != 1);
 
 out:
-	irq_exit();
-	set_irq_regs(old_regs);
+	if (likely(!xen_hvm_pv_evtchn_enabled())) {
+		irq_exit();
+		set_irq_regs(old_regs);
+	}
 
 	put_cpu();
 }
@@ -944,3 +954,51 @@ void __init xen_init_IRQ(void)
 
 	irq_ctx_init(smp_processor_id());
 }
+
+void __init xen_hvm_pv_evtchn_init_IRQ(void)
+{
+	int i;
+
+	xen_init_IRQ();
+	for (i = 0; i < NR_IRQS_LEGACY; i++) {
+		struct evtchn_bind_virq bind_virq;
+		struct irq_desc *desc = irq_to_desc(i);
+		int virq, evtchn;
+
+		virq = i + VIRQ_EMUL_PIN_START;
+		bind_virq.virq = virq;
+		bind_virq.vcpu = 0;
+
+		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
+						&bind_virq) != 0)
+			BUG();
+
+		evtchn = bind_virq.port;
+		evtchn_to_irq[evtchn] = i;
+		irq_info[i] = mk_virq_info(evtchn, virq);
+
+		desc->status = IRQ_DISABLED;
+		desc->action = NULL;
+		desc->depth = 1;
+
+		/*
+		 * 16 old-style INTA-cycle interrupts:
+		 */
+		set_irq_chip_and_handler_name(i, &xen_dynamic_chip,
+					handle_level_irq, "event");
+	}
+
+	/*
+	 * Cover the whole vector space, no vector can escape
+	 * us. (some of these will be overridden and become
+	 * 'special' SMP interrupts)
+	 */
+	for (i = 0; i < (NR_VECTORS - FIRST_EXTERNAL_VECTOR); i++) {
+		int vector = FIRST_EXTERNAL_VECTOR + i;
+		if (vector != IA32_SYSCALL_VECTOR)
+			set_intr_gate(vector, interrupt[i]);
+	}
+
+	/* generic IPI for platform specific use, now used for HVM evtchn */
+	alloc_intr_gate(X86_PLATFORM_IPI_VECTOR, x86_platform_ipi);
+}
diff --git a/include/xen/events.h b/include/xen/events.h
index e68d59a..91755db 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -56,4 +56,5 @@ void xen_poll_irq(int irq);
 /* Determine the IRQ which is bound to an event channel */
 unsigned irq_from_evtchn(unsigned int evtchn);
 
+void xen_evtchn_do_upcall(struct pt_regs *regs);
 #endif	/* _XEN_EVENTS_H */
diff --git a/include/xen/hvm.h b/include/xen/hvm.h
index 4ea8887..c66d788 100644
--- a/include/xen/hvm.h
+++ b/include/xen/hvm.h
@@ -20,4 +20,9 @@ static inline unsigned long hvm_get_parameter(int idx)
 	return xhv.value;
 }
 
+#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
+#define HVM_CALLBACK_VIA_TYPE_SHIFT 56
+#define HVM_CALLBACK_VECTOR(x) (((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR)<<\
+				HVM_CALLBACK_VIA_TYPE_SHIFT | (x))
+
 #endif /* XEN_HVM_H__ */
diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
index 2befa3e..9282ff7 100644
--- a/include/xen/interface/xen.h
+++ b/include/xen/interface/xen.h
@@ -90,7 +90,11 @@
 #define VIRQ_ARCH_6    22
 #define VIRQ_ARCH_7    23
 
-#define NR_VIRQS       24
+#define VIRQ_EMUL_PIN_START 24
+#define VIRQ_EMUL_PIN_NUM 16
+
+#define NR_VIRQS       40
+
 /*
  * MMU-UPDATE REQUESTS
  *
-- 
1.5.4.5

^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 6/7] xen: Unified checking for Xen of PV drivers to xenbus_register_frontend()
  2010-03-01  9:38 [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Sheng Yang
                   ` (4 preceding siblings ...)
  2010-03-01  9:38   ` Sheng Yang
@ 2010-03-01  9:38 ` Sheng Yang
  2010-03-02  1:45   ` [Xen-devel] " Jeremy Fitzhardinge
  2010-03-01  9:38 ` [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM Sheng Yang
  2010-03-02  0:42 ` [Xen-devel] [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Jeremy Fitzhardinge
  7 siblings, 1 reply; 53+ messages in thread
From: Sheng Yang @ 2010-03-01  9:38 UTC (permalink / raw)
  To: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt
  Cc: Ian Campbell, Stefano Stabellini, xen-devel, linux-kernel, Sheng Yang


Signed-off-by: Sheng Yang <sheng@linux.intel.com>
---
 drivers/block/xen-blkfront.c |    3 ---
 drivers/input/xen-kbdfront.c |    3 ---
 drivers/net/xen-netfront.c   |    5 -----
 drivers/video/xen-fbfront.c  |    3 ---
 include/xen/xenbus.h         |    3 +++
 5 files changed, 3 insertions(+), 14 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 05a31e5..d6465c1 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1068,9 +1068,6 @@ static struct xenbus_driver blkfront = {
 
 static int __init xlblk_init(void)
 {
-	if (!xen_domain())
-		return -ENODEV;
-
 	if (register_blkdev(XENVBD_MAJOR, DEV_NAME)) {
 		printk(KERN_WARNING "xen_blk: can't get major %d with name %s\n",
 		       XENVBD_MAJOR, DEV_NAME);
diff --git a/drivers/input/xen-kbdfront.c b/drivers/input/xen-kbdfront.c
index c721c0a..febffc3 100644
--- a/drivers/input/xen-kbdfront.c
+++ b/drivers/input/xen-kbdfront.c
@@ -338,9 +338,6 @@ static struct xenbus_driver xenkbd_driver = {
 
 static int __init xenkbd_init(void)
 {
-	if (!xen_domain())
-		return -ENODEV;
-
 	/* Nothing to do if running in dom0. */
 	if (xen_initial_domain())
 		return -ENODEV;
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index a869b45..d89fd0b 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1804,14 +1804,9 @@ static struct xenbus_driver netfront_driver = {
 
 static int __init netif_init(void)
 {
-	if (!xen_domain())
-		return -ENODEV;
-
 	if (xen_initial_domain())
 		return 0;
 
-	printk(KERN_INFO "Initialising Xen virtual ethernet driver.\n");
-
 	return xenbus_register_frontend(&netfront_driver);
 }
 module_init(netif_init);
diff --git a/drivers/video/xen-fbfront.c b/drivers/video/xen-fbfront.c
index 603598f..daff72f 100644
--- a/drivers/video/xen-fbfront.c
+++ b/drivers/video/xen-fbfront.c
@@ -683,9 +683,6 @@ static struct xenbus_driver xenfb_driver = {
 
 static int __init xenfb_init(void)
 {
-	if (!xen_domain())
-		return -ENODEV;
-
 	/* Nothing to do if running in dom0. */
 	if (xen_initial_domain())
 		return -ENODEV;
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index b9763ba..9f68cf5 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -43,6 +43,7 @@
 #include <xen/interface/grant_table.h>
 #include <xen/interface/io/xenbus.h>
 #include <xen/interface/io/xs_wire.h>
+#include <xen/xen.h>
 
 /* Register callback to watch this node. */
 struct xenbus_watch
@@ -112,6 +113,8 @@ static inline int __must_check
 xenbus_register_frontend(struct xenbus_driver *drv)
 {
 	WARN_ON(drv->owner != THIS_MODULE);
+	if (!xen_domain())
+		return -ENODEV;
 	return __xenbus_register_frontend(drv, THIS_MODULE, KBUILD_MODNAME);
 }
 
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM
  2010-03-01  9:38 [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Sheng Yang
                   ` (5 preceding siblings ...)
  2010-03-01  9:38 ` [PATCH 6/7] xen: Unified checking for Xen of PV drivers to xenbus_register_frontend() Sheng Yang
@ 2010-03-01  9:38 ` Sheng Yang
  2010-03-01 17:38     ` Konrad Rzeszutek Wilk
  2010-03-02  0:42 ` [Xen-devel] [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Jeremy Fitzhardinge
  7 siblings, 1 reply; 53+ messages in thread
From: Sheng Yang @ 2010-03-01  9:38 UTC (permalink / raw)
  To: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt
  Cc: Ian Campbell, Stefano Stabellini, xen-devel, linux-kernel, Sheng Yang

Now pv drivers(vnif, vbd) can work.

Signed-off-by: Sheng Yang <sheng@linux.intel.com>
---
 arch/x86/xen/enlighten.c          |    1 +
 drivers/input/xen-kbdfront.c      |    3 ++
 drivers/video/xen-fbfront.c       |    3 ++
 drivers/xen/grant-table.c         |   59 +++++++++++++++++++++++++++++++++++--
 drivers/xen/xenbus/xenbus_probe.c |   22 ++++++++++---
 include/xen/xen.h                 |    2 +
 include/xen/xenbus.h              |    2 +-
 7 files changed, 83 insertions(+), 9 deletions(-)

diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index f617639..3bc9ff0 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1277,6 +1277,7 @@ static int init_hvm_pv_info(void)
 	pv_info = xen_info;
 	pv_info.kernel_rpl = 0;
 
+	xen_domain_type = XEN_HVM_DOMAIN;
 	return 0;
 }
 
diff --git a/drivers/input/xen-kbdfront.c b/drivers/input/xen-kbdfront.c
index febffc3..8e58812 100644
--- a/drivers/input/xen-kbdfront.c
+++ b/drivers/input/xen-kbdfront.c
@@ -342,6 +342,9 @@ static int __init xenkbd_init(void)
 	if (xen_initial_domain())
 		return -ENODEV;
 
+	if (xen_hvm_domain())
+		return -ENODEV;
+
 	return xenbus_register_frontend(&xenkbd_driver);
 }
 
diff --git a/drivers/video/xen-fbfront.c b/drivers/video/xen-fbfront.c
index daff72f..b173474 100644
--- a/drivers/video/xen-fbfront.c
+++ b/drivers/video/xen-fbfront.c
@@ -687,6 +687,9 @@ static int __init xenfb_init(void)
 	if (xen_initial_domain())
 		return -ENODEV;
 
+	if (xen_hvm_domain())
+		return -ENODEV;
+
 	return xenbus_register_frontend(&xenfb_driver);
 }
 
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 4c6c0bd..abffda5 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -46,6 +46,8 @@
 #include <asm/pgtable.h>
 #include <asm/sync_bitops.h>
 
+#include <xen/interface/memory.h>
+#include <linux/io.h>
 
 /* External tools reserve first few grant table entries. */
 #define NR_RESERVED_ENTRIES 8
@@ -441,12 +443,33 @@ static inline unsigned int max_nr_grant_frames(void)
 	return xen_max;
 }
 
+static unsigned long hvm_pv_resume_frames;
+
 static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
 {
 	struct gnttab_setup_table setup;
 	unsigned long *frames;
 	unsigned int nr_gframes = end_idx + 1;
 	int rc;
+	struct xen_add_to_physmap xatp;
+	unsigned int i = end_idx;
+
+	if (xen_hvm_pv_evtchn_enabled()) {
+		/*
+		 * Loop backwards, so that the first hypercall has the largest
+		 * index, ensuring that the table will grow only once.
+		 */
+		do {
+			xatp.domid = DOMID_SELF;
+			xatp.idx = i;
+			xatp.space = XENMAPSPACE_grant_table;
+			xatp.gpfn = (hvm_pv_resume_frames >> PAGE_SHIFT) + i;
+			if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
+				BUG();
+		} while (i-- > start_idx);
+
+		return 0;
+	}
 
 	frames = kmalloc(nr_gframes * sizeof(unsigned long), GFP_ATOMIC);
 	if (!frames)
@@ -473,11 +496,41 @@ static int gnttab_map(unsigned int start_idx, unsigned int end_idx)
 	return 0;
 }
 
+/* The region reserved by QEmu for Xen platform device */
+#define GNTTAB_START	    0xf2000000ul
+#define GNTTAB_SIZE	    0x20000ul
+
 int gnttab_resume(void)
 {
-	if (max_nr_grant_frames() < nr_grant_frames)
+	unsigned int max_nr_gframes;
+
+	max_nr_gframes = max_nr_grant_frames();
+	if (max_nr_gframes < nr_grant_frames)
 		return -ENOSYS;
-	return gnttab_map(0, nr_grant_frames - 1);
+
+	if (xen_pv_domain())
+		return gnttab_map(0, nr_grant_frames - 1);
+
+	/* Xen PV featured HVM */
+	if (!hvm_pv_resume_frames) {
+		if (PAGE_SIZE * max_nr_gframes > GNTTAB_SIZE) {
+			printk(KERN_WARNING
+				"Grant table size exceed the limit!\n");
+			return -EINVAL;
+		}
+		hvm_pv_resume_frames = GNTTAB_START;
+		shared = ioremap(hvm_pv_resume_frames,
+				 PAGE_SIZE * max_nr_gframes);
+		if (shared == NULL) {
+			printk(KERN_WARNING
+				"Fail to ioremap gnttab share frames\n");
+			return -ENOMEM;
+		}
+	}
+
+	gnttab_map(0, nr_grant_frames - 1);
+
+	return 0;
 }
 
 int gnttab_suspend(void)
@@ -510,7 +563,7 @@ static int __devinit gnttab_init(void)
 	unsigned int max_nr_glist_frames, nr_glist_frames;
 	unsigned int nr_init_grefs;
 
-	if (!xen_domain())
+	if (!xen_evtchn_enabled())
 		return -ENODEV;
 
 	nr_grant_frames = 1;
diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 2f7aaa9..2704f0c 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -55,6 +55,8 @@
 #include <xen/events.h>
 #include <xen/page.h>
 
+#include <xen/hvm.h>
+
 #include "xenbus_comms.h"
 #include "xenbus_probe.h"
 
@@ -786,7 +788,8 @@ static int __init xenbus_probe_init(void)
 	DPRINTK("");
 
 	err = -ENODEV;
-	if (!xen_domain())
+
+	if (!xen_evtchn_enabled())
 		goto out_error;
 
 	/* Register ourselves with the kernel bus subsystem */
@@ -805,10 +808,19 @@ static int __init xenbus_probe_init(void)
 		/* dom0 not yet supported */
 	} else {
 		xenstored_ready = 1;
-		xen_store_evtchn = xen_start_info->store_evtchn;
-		xen_store_mfn = xen_start_info->store_mfn;
+		if (xen_hvm_pv_evtchn_enabled()) {
+			xen_store_evtchn =
+				hvm_get_parameter(HVM_PARAM_STORE_EVTCHN);
+			xen_store_mfn =
+				hvm_get_parameter(HVM_PARAM_STORE_PFN);
+			xen_store_interface =
+				ioremap(xen_store_mfn << PAGE_SHIFT, PAGE_SIZE);
+		} else {
+			xen_store_evtchn = xen_start_info->store_evtchn;
+			xen_store_mfn = xen_start_info->store_mfn;
+			xen_store_interface = mfn_to_virt(xen_store_mfn);
+		}
 	}
-	xen_store_interface = mfn_to_virt(xen_store_mfn);
 
 	/* Initialize the interface to xenstore. */
 	err = xs_init();
@@ -922,7 +934,7 @@ static void wait_for_devices(struct xenbus_driver *xendrv)
 	struct device_driver *drv = xendrv ? &xendrv->driver : NULL;
 	unsigned int seconds_waited = 0;
 
-	if (!ready_to_wait_for_devices || !xen_domain())
+	if (!ready_to_wait_for_devices || !xen_evtchn_enabled())
 		return;
 
 	while (exists_connecting_device(drv)) {
diff --git a/include/xen/xen.h b/include/xen/xen.h
index 9bb92e5..f949aee 100644
--- a/include/xen/xen.h
+++ b/include/xen/xen.h
@@ -28,6 +28,8 @@ extern u32 xen_hvm_pv_status;
 #define xen_hvm_pv_evtchn_enabled() (xen_hvm_pv_enabled() && \
 		(xen_hvm_pv_status & XEN_HVM_PV_EVTCHN_ENABLED))
 
+#define xen_evtchn_enabled() (xen_pv_domain() || xen_hvm_pv_evtchn_enabled())
+
 #ifdef CONFIG_XEN_DOM0
 #include <xen/interface/xen.h>
 #include <asm/xen/hypervisor.h>
diff --git a/include/xen/xenbus.h b/include/xen/xenbus.h
index 9f68cf5..0e79644 100644
--- a/include/xen/xenbus.h
+++ b/include/xen/xenbus.h
@@ -113,7 +113,7 @@ static inline int __must_check
 xenbus_register_frontend(struct xenbus_driver *drv)
 {
 	WARN_ON(drv->owner != THIS_MODULE);
-	if (!xen_domain())
+	if (!xen_evtchn_enabled())
 		return -ENODEV;
 	return __xenbus_register_frontend(drv, THIS_MODULE, KBUILD_MODNAME);
 }
-- 
1.5.4.5


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [LKML] [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM
  2010-03-01  9:38 ` [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM Sheng Yang
@ 2010-03-01 17:38     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 53+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-03-01 17:38 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, Ian Campbell,
	Stefano Stabellini, xen-devel, linux-kernel

> +/* The region reserved by QEmu for Xen platform device */
> +#define GNTTAB_START	    0xf2000000ul
> +#define GNTTAB_SIZE	    0x20000ul

I thought that in the earlier review you said:

"> > +#define GNTTAB_START           0xfbfe0000ul
> > +#define GNTTAB_SIZE            0x20000ul 
> 
> Is it possible that there would be a PCI device that would be
> passed in the guest that would conflict with the above mentioned
> E820 region?
>
I would change them to a dedicated PCI MMIO address in the next version.
Thanks.

"
?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LKML] [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM
@ 2010-03-01 17:38     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 53+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-03-01 17:38 UTC (permalink / raw)
  To: Sheng Yang
  Cc: xen-devel, Ian Campbell, Stefano Stabellini, linux-kernel,
	Ian Pratt, Jeremy Fitzhardinge, Keir Fraser

> +/* The region reserved by QEmu for Xen platform device */
> +#define GNTTAB_START	    0xf2000000ul
> +#define GNTTAB_SIZE	    0x20000ul

I thought that in the earlier review you said:

"> > +#define GNTTAB_START           0xfbfe0000ul
> > +#define GNTTAB_SIZE            0x20000ul 
> 
> Is it possible that there would be a PCI device that would be
> passed in the guest that would conflict with the above mentioned
> E820 region?
>
I would change them to a dedicated PCI MMIO address in the next version.
Thanks.

"
?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen
  2010-03-01  9:38 [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Sheng Yang
                   ` (6 preceding siblings ...)
  2010-03-01  9:38 ` [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM Sheng Yang
@ 2010-03-02  0:42 ` Jeremy Fitzhardinge
  2010-03-02  1:26   ` Sheng Yang
  7 siblings, 1 reply; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-02  0:42 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, linux-kernel,
	xen-devel, Ian Campbell, Stefano Stabellini

On 03/01/2010 01:38 AM, Sheng Yang wrote:
> I'd like to have your comments on this patch series. We think the patches are
> prepared for check in pv_ops domU tree.
>
> The PV extension of HVM is started from real mode like HVM guest, but also with a
> a range of PV features(e.g. PV halt, PV timer, event channel, as well as PV
> drivers). So guest with this feature can takes the advantages of both H/W
> virtualization and Para-Virtualization.
>    

The PV device side of this has a lot of overlap with the older PV driver 
for HVM domains code.  Aside perhaps for the details of how 
interrupts/events are actually delivered, that seems like a completely 
distinct piece of work which could be done separately.

Also, I'd like to see a clearer statement of what you're specifically 
trying to optimise for here.  What is inefficient in an unmodified HVM 
domain, and how do your patches address these things.  What measurements 
have you made?

> The first two of the patchset imported several header file from Jeremy's tree
> and Xen tree, respect to Jeremy and Keir's works.
>
> The whole patchset based on Linux upstream.
>
> You need a line like:
>
> cpuid = [ '0x40000002:edx=0x3' ]
>
> in HVM configuration file to expose hybrid feature to guest, and
>
> CONFIG_XEN
>
> in the guest kernel configuration file to enable the hybrid support.
>
> And the compiled image can be used as native/pv domU/hvm guest/pv feature hvm kernel.
>
> Current the patchset support x86_64 only.
>    

Please add a new config option for the new code, and make it dependent 
on x86-64, rather than scattering random #ifdef CONFIG_X86_64s around 
the code.  Make sure all the code is controlled by this CONFIG variable 
so it disappears in a 32-bit build (ideally by putting it into a new 
file where possible, and only building it when enabled).

> I've discussed the vector-evtchn mapping with Stefano, and we agreed on that
> it's more complex than we thought, and can't brought much benefit for
> pvops domU. So there is no major change in this update.
>
> Change from v3:
> 1. Rebase to Linux 2.6.33 release.
>    

Why 2.6.33?  I'm currently working on 2.6.32; I intend to move to .33 
shortly, but .32 is a much more convenient base (particularly if you're 
interested in getting these changes into any distro kernels).

I rebased it to .32 without much effort.


I'll make specific comments in the patches.

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 3/7] xen/hvm: Xen PV extension of HVM initialization
  2010-03-01  9:38 ` [PATCH 3/7] xen/hvm: Xen PV extension of HVM initialization Sheng Yang
@ 2010-03-02  1:02     ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-02  1:02 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, xen-devel,
	Ian Campbell, Stefano Stabellini, Yaozu (Eddie) Dong,
	linux-kernel

On 03/01/2010 01:38 AM, Sheng Yang wrote:
> The PV extension of HVM(once known as Hybrid) is started from real mode like
> HVM guest, but also with a component based PV feature selection(e.g. PV halt,
> PV timer, event channel, then PV drivers). So guest can takes the advantages
> of both H/W virtualization and Para-Virtualization.
>
> This patch introduced the PV extension of HVM guest initialization.
>
> Guest would detect the capability using CPUID 0x40000002.edx, then call
> HVMOP_enable_pv hypercall to enable pv support in hypervisor.
>
> Signed-off-by: Sheng Yang<sheng@linux.intel.com>
> Signed-off-by: Yaozu (Eddie) Dong<eddie.dong@intel.com>
> ---
>   arch/x86/include/asm/xen/cpuid.h   |    5 ++
>   arch/x86/xen/enlighten.c           |  115 ++++++++++++++++++++++++++++++++++++
>   arch/x86/xen/irq.c                 |   21 +++++++
>   arch/x86/xen/xen-head.S            |    6 ++
>   arch/x86/xen/xen-ops.h             |    1 +
>   include/xen/interface/hvm/hvm_op.h |    7 ++
>   include/xen/xen.h                  |    9 +++
>   7 files changed, 164 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/include/asm/xen/cpuid.h b/arch/x86/include/asm/xen/cpuid.h
> index 8787f03..a93c851 100644
> --- a/arch/x86/include/asm/xen/cpuid.h
> +++ b/arch/x86/include/asm/xen/cpuid.h
> @@ -65,4 +65,9 @@
>   #define _XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD 0
>   #define XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD  (1u<<0)
>
> +#define _XEN_CPUID_FEAT2_HVM_PV 0
> +#define XEN_CPUID_FEAT2_HVM_PV (1u<<0)
> +#define _XEN_CPUID_FEAT2_HVM_PV_EVTCHN 1
> +#define XEN_CPUID_FEAT2_HVM_PV_EVTCHN (1u<<1)
> +
>   #endif /* __XEN_PUBLIC_ARCH_X86_CPUID_H__ */
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 36daccb..fdb9664 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -34,6 +34,8 @@
>   #include<xen/interface/version.h>
>   #include<xen/interface/physdev.h>
>   #include<xen/interface/vcpu.h>
> +#include<xen/interface/memory.h>
> +#include<xen/interface/hvm/hvm_op.h>
>   #include<xen/features.h>
>   #include<xen/page.h>
>   #include<xen/hvc-console.h>
> @@ -43,6 +45,7 @@
>   #include<asm/page.h>
>   #include<asm/xen/hypercall.h>
>   #include<asm/xen/hypervisor.h>
> +#include<asm/xen/cpuid.h>
>   #include<asm/fixmap.h>
>   #include<asm/processor.h>
>   #include<asm/proto.h>
> @@ -1198,3 +1201,115 @@ asmlinkage void __init xen_start_kernel(void)
>   	x86_64_start_reservations((char *)__pa_symbol(&boot_params));
>   #endif
>   }
> +
> +static void __init xen_hvm_pv_banner(void)
> +{
> +	unsigned version = HYPERVISOR_xen_version(XENVER_version, NULL);
> +	struct xen_extraversion extra;
> +	HYPERVISOR_xen_version(XENVER_extraversion,&extra);
> +
> +	printk(KERN_INFO "Booting PV featured HVM kernel on %s\n",
> +		pv_info.name);
> +	printk(KERN_INFO "Xen version: %d.%d%s\n",
> +		version>>  16, version&  0xffff, extra.extraversion);
> +}
> +
> +static int xen_para_available(void)
> +{
> +	uint32_t eax, ebx, ecx, edx;
> +	cpuid(XEN_CPUID_LEAF(0),&eax,&ebx,&ecx,&edx);
> +
> +	if (ebx == XEN_CPUID_SIGNATURE_EBX&&
> +	    ecx == XEN_CPUID_SIGNATURE_ECX&&
> +	    edx == XEN_CPUID_SIGNATURE_EDX&&
> +	    ((eax - XEN_CPUID_LEAF(0))>= 2))
> +		return 1;
> +
> +	return 0;
> +}
> +
> +u32 xen_hvm_pv_status;
> +EXPORT_SYMBOL_GPL(xen_hvm_pv_status);
> +
> +static int enable_hvm_pv(u64 flags)
> +{
> +	struct xen_hvm_pv_type a;
> +
> +	a.domid = DOMID_SELF;
> +	a.flags = flags;
> +	return HYPERVISOR_hvm_op(HVMOP_enable_pv,&a);
> +}
> +
> +static int init_hvm_pv_info(void)
> +{
> +	uint32_t ecx, edx, pages, msr;
> +	u64 pfn;
> +
> +	if (!xen_para_available())
> +		return -EINVAL;
> +
> +	cpuid(XEN_CPUID_LEAF(2),&pages,&msr,&ecx,&edx);
> +
> +	/* Check if hvm_pv mode is supported */
> +	if (!(edx&  XEN_CPUID_FEAT2_HVM_PV))
> +		return -ENODEV;
> +
> +	xen_hvm_pv_status = XEN_HVM_PV_ENABLED;
>    

Why use this?  Why not just set the domain type to HVM, and leave the 
"status" field as a bitset of available paravirtualizations?

> +
> +	/* We only support 1 page of hypercall for now */
> +	if (pages != 1)
> +		return -ENOMEM;
> +
> +	pfn = __pa(hypercall_page);
> +	wrmsrl(msr, pfn);
> +
> +	xen_setup_features();
> +
> +	x86_init.oem.banner = xen_hvm_pv_banner;
> +	pv_info = xen_info;
> +	pv_info.kernel_rpl = 0;
> +
> +	return 0;
> +}
> +
> +extern struct shared_info shared_info_page;
> +
> +static void __init init_shared_info(void)
> +{
> +	struct xen_add_to_physmap xatp;
> +
> +	xatp.domid = DOMID_SELF;
> +	xatp.idx = 0;
> +	xatp.space = XENMAPSPACE_shared_info;
> +	xatp.gpfn = __pa(&shared_info_page)>>  PAGE_SHIFT;
> +	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap,&xatp))
> +		BUG();
> +
> +	HYPERVISOR_shared_info = (struct shared_info *)&shared_info_page;
> +
> +	/* Don't do the full vcpu_info placement stuff until we have a
> +	   possible map and a non-dummy shared_info. */
>    

Is this comment meaningful here?  This is the real shared info at this 
point, no?  Are you going to support vcpu_info placement?

> +	per_cpu(xen_vcpu, 0) =&HYPERVISOR_shared_info->vcpu_info[0];
> +}
> +
> +void __init xen_guest_init(void)
> +{
> +#ifdef CONFIG_X86_32
> +	return;
> +#else
> +	int r;
> +
> +	/* Ensure the we won't confused with PV */
> +	if (xen_domain_type == XEN_PV_DOMAIN)
> +		return;
>    

Aren't you specifically testing for xen_domain_type == NATIVE here?  If 
its anything else, then it means we're either PV, or have become an HVM 
domain some other way (like probing for the Xen platform PCI device).



> +
> +	r = init_hvm_pv_info();
> +	if (r<  0)
> +		return;
> +
> +	init_shared_info();
> +
> +	xen_hvm_pv_init_irq_ops();
> +#endif
> +}
>    

Can you split all this off into a new file.  It doesn't seem to have any 
dependencies on the rest of enlighten.c, and I've been trying to 
disaggregate it anyway.

> +
> diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
> index 9d30105..fadaa97 100644
> --- a/arch/x86/xen/irq.c
> +++ b/arch/x86/xen/irq.c
> @@ -131,3 +131,24 @@ void __init xen_init_irq_ops()
>   	pv_irq_ops = xen_irq_ops;
>   	x86_init.irqs.intr_init = xen_init_IRQ;
>   }
> +
> +static void xen_hvm_pv_safe_halt(void)
> +{
> +	/* Do local_irq_enable() explicitly in hvm_pv guest */
> +	local_irq_enable();
> +	xen_safe_halt();
> +}
> +
> +static void xen_hvm_pv_halt(void)
> +{
> +	if (irqs_disabled())
> +		HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id(), NULL);
> +	else
> +		xen_hvm_pv_safe_halt();
> +}
> +
> +void __init xen_hvm_pv_init_irq_ops(void)
> +{
> +	pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
> +	pv_irq_ops.halt = xen_hvm_pv_halt;
> +}
>    

It would be better to make this patch purely common infrastructure, and 
make specific features (like hvm+pv halt) separate patches (one per patch).

> diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
> index 1a5ff24..26041ce 100644
> --- a/arch/x86/xen/xen-head.S
> +++ b/arch/x86/xen/xen-head.S
> @@ -33,6 +33,12 @@ ENTRY(hypercall_page)
>   	.skip PAGE_SIZE_asm
>   .popsection
>
> +.pushsection .data
> +	.align PAGE_SIZE_asm
> +ENTRY(shared_info_page)
> +	.skip PAGE_SIZE_asm
> +.popsection
>    

Why does this need to be defined in asm?  Can't it be either allocated 
or defined in C?

> +
>   	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz "linux")
>   	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION,  .asciz "2.6")
>   	ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION,    .asciz "xen-3.0")
> diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
> index f9153a3..cc00760 100644
> --- a/arch/x86/xen/xen-ops.h
> +++ b/arch/x86/xen/xen-ops.h
> @@ -41,6 +41,7 @@ void xen_vcpu_restore(void);
>   void __init xen_build_dynamic_phys_to_machine(void);
>
>   void xen_init_irq_ops(void);
> +void xen_hvm_pv_init_irq_ops(void);
>   void xen_setup_timer(int cpu);
>   void xen_setup_runstate_info(int cpu);
>   void xen_teardown_timer(int cpu);
> diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h
> index 7c74ba4..0ce8a26 100644
> --- a/include/xen/interface/hvm/hvm_op.h
> +++ b/include/xen/interface/hvm/hvm_op.h
> @@ -69,4 +69,11 @@ DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_set_pci_link_route);
>   /* Flushes all VCPU TLBs: @arg must be NULL. */
>   #define HVMOP_flush_tlbs          5
>
> +#define HVMOP_enable_pv 9
> +struct xen_hvm_pv_type {
> +	domid_t domid;
> +	uint32_t flags;
> +#define HVM_PV_EVTCHN (1ull<<1)
> +};
> +
>   #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
> diff --git a/include/xen/xen.h b/include/xen/xen.h
> index a164024..9bb92e5 100644
> --- a/include/xen/xen.h
> +++ b/include/xen/xen.h
> @@ -9,6 +9,7 @@ enum xen_domain_type {
>
>   #ifdef CONFIG_XEN
>   extern enum xen_domain_type xen_domain_type;
> +extern void xen_guest_init(void);
>   #else
>   #define xen_domain_type		XEN_NATIVE
>   #endif
> @@ -19,6 +20,14 @@ extern enum xen_domain_type xen_domain_type;
>   #define xen_hvm_domain()	(xen_domain()&&			\
>   				 xen_domain_type == XEN_HVM_DOMAIN)
>
> +#define XEN_HVM_PV_ENABLED	    (1u<<  0)
>    

Why have this?  We already have xen_domain_type which will either be 
XEN_NATIVE (ie, either real native, or on some fully emulated 
environment we have no specific optimisations for), or XEN_HVM_DOMAIN 
(we know we're running under Xen as an HVM domain).

> +#define XEN_HVM_PV_EVTCHN_ENABLED   (1u<<  1)
> +extern u32 xen_hvm_pv_status;
>    

I think "status" is a misnomer here.  Isn't it specifically a set of PV 
features which are active?

> +
> +#define xen_hvm_pv_enabled() (xen_hvm_pv_status&  XEN_HVM_PV_ENABLED)
> +#define xen_hvm_pv_evtchn_enabled() (xen_hvm_pv_enabled()&&  \
> +		(xen_hvm_pv_status&  XEN_HVM_PV_EVTCHN_ENABLED))
>    

Testing for xen_hvm_pv_enabled() should be redundant; surely the 
status/feature flag won't be set unless the environment supports the 
feature, and if it does it doesn't really matter what the domain type is.

> +
>   #ifdef CONFIG_XEN_DOM0
>   #include<xen/interface/xen.h>
>   #include<asm/xen/hypervisor.h>
>    

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 3/7] xen/hvm: Xen PV extension of HVM initialization
@ 2010-03-02  1:02     ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-02  1:02 UTC (permalink / raw)
  To: Sheng Yang
  Cc: xen-devel, Ian Campbell, Stefano Stabellini, Yaozu (Eddie) Dong,
	linux-kernel, Ian Pratt, Jeremy Fitzhardinge, Keir Fraser

On 03/01/2010 01:38 AM, Sheng Yang wrote:
> The PV extension of HVM(once known as Hybrid) is started from real mode like
> HVM guest, but also with a component based PV feature selection(e.g. PV halt,
> PV timer, event channel, then PV drivers). So guest can takes the advantages
> of both H/W virtualization and Para-Virtualization.
>
> This patch introduced the PV extension of HVM guest initialization.
>
> Guest would detect the capability using CPUID 0x40000002.edx, then call
> HVMOP_enable_pv hypercall to enable pv support in hypervisor.
>
> Signed-off-by: Sheng Yang<sheng@linux.intel.com>
> Signed-off-by: Yaozu (Eddie) Dong<eddie.dong@intel.com>
> ---
>   arch/x86/include/asm/xen/cpuid.h   |    5 ++
>   arch/x86/xen/enlighten.c           |  115 ++++++++++++++++++++++++++++++++++++
>   arch/x86/xen/irq.c                 |   21 +++++++
>   arch/x86/xen/xen-head.S            |    6 ++
>   arch/x86/xen/xen-ops.h             |    1 +
>   include/xen/interface/hvm/hvm_op.h |    7 ++
>   include/xen/xen.h                  |    9 +++
>   7 files changed, 164 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/include/asm/xen/cpuid.h b/arch/x86/include/asm/xen/cpuid.h
> index 8787f03..a93c851 100644
> --- a/arch/x86/include/asm/xen/cpuid.h
> +++ b/arch/x86/include/asm/xen/cpuid.h
> @@ -65,4 +65,9 @@
>   #define _XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD 0
>   #define XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD  (1u<<0)
>
> +#define _XEN_CPUID_FEAT2_HVM_PV 0
> +#define XEN_CPUID_FEAT2_HVM_PV (1u<<0)
> +#define _XEN_CPUID_FEAT2_HVM_PV_EVTCHN 1
> +#define XEN_CPUID_FEAT2_HVM_PV_EVTCHN (1u<<1)
> +
>   #endif /* __XEN_PUBLIC_ARCH_X86_CPUID_H__ */
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index 36daccb..fdb9664 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -34,6 +34,8 @@
>   #include<xen/interface/version.h>
>   #include<xen/interface/physdev.h>
>   #include<xen/interface/vcpu.h>
> +#include<xen/interface/memory.h>
> +#include<xen/interface/hvm/hvm_op.h>
>   #include<xen/features.h>
>   #include<xen/page.h>
>   #include<xen/hvc-console.h>
> @@ -43,6 +45,7 @@
>   #include<asm/page.h>
>   #include<asm/xen/hypercall.h>
>   #include<asm/xen/hypervisor.h>
> +#include<asm/xen/cpuid.h>
>   #include<asm/fixmap.h>
>   #include<asm/processor.h>
>   #include<asm/proto.h>
> @@ -1198,3 +1201,115 @@ asmlinkage void __init xen_start_kernel(void)
>   	x86_64_start_reservations((char *)__pa_symbol(&boot_params));
>   #endif
>   }
> +
> +static void __init xen_hvm_pv_banner(void)
> +{
> +	unsigned version = HYPERVISOR_xen_version(XENVER_version, NULL);
> +	struct xen_extraversion extra;
> +	HYPERVISOR_xen_version(XENVER_extraversion,&extra);
> +
> +	printk(KERN_INFO "Booting PV featured HVM kernel on %s\n",
> +		pv_info.name);
> +	printk(KERN_INFO "Xen version: %d.%d%s\n",
> +		version>>  16, version&  0xffff, extra.extraversion);
> +}
> +
> +static int xen_para_available(void)
> +{
> +	uint32_t eax, ebx, ecx, edx;
> +	cpuid(XEN_CPUID_LEAF(0),&eax,&ebx,&ecx,&edx);
> +
> +	if (ebx == XEN_CPUID_SIGNATURE_EBX&&
> +	    ecx == XEN_CPUID_SIGNATURE_ECX&&
> +	    edx == XEN_CPUID_SIGNATURE_EDX&&
> +	    ((eax - XEN_CPUID_LEAF(0))>= 2))
> +		return 1;
> +
> +	return 0;
> +}
> +
> +u32 xen_hvm_pv_status;
> +EXPORT_SYMBOL_GPL(xen_hvm_pv_status);
> +
> +static int enable_hvm_pv(u64 flags)
> +{
> +	struct xen_hvm_pv_type a;
> +
> +	a.domid = DOMID_SELF;
> +	a.flags = flags;
> +	return HYPERVISOR_hvm_op(HVMOP_enable_pv,&a);
> +}
> +
> +static int init_hvm_pv_info(void)
> +{
> +	uint32_t ecx, edx, pages, msr;
> +	u64 pfn;
> +
> +	if (!xen_para_available())
> +		return -EINVAL;
> +
> +	cpuid(XEN_CPUID_LEAF(2),&pages,&msr,&ecx,&edx);
> +
> +	/* Check if hvm_pv mode is supported */
> +	if (!(edx&  XEN_CPUID_FEAT2_HVM_PV))
> +		return -ENODEV;
> +
> +	xen_hvm_pv_status = XEN_HVM_PV_ENABLED;
>    

Why use this?  Why not just set the domain type to HVM, and leave the 
"status" field as a bitset of available paravirtualizations?

> +
> +	/* We only support 1 page of hypercall for now */
> +	if (pages != 1)
> +		return -ENOMEM;
> +
> +	pfn = __pa(hypercall_page);
> +	wrmsrl(msr, pfn);
> +
> +	xen_setup_features();
> +
> +	x86_init.oem.banner = xen_hvm_pv_banner;
> +	pv_info = xen_info;
> +	pv_info.kernel_rpl = 0;
> +
> +	return 0;
> +}
> +
> +extern struct shared_info shared_info_page;
> +
> +static void __init init_shared_info(void)
> +{
> +	struct xen_add_to_physmap xatp;
> +
> +	xatp.domid = DOMID_SELF;
> +	xatp.idx = 0;
> +	xatp.space = XENMAPSPACE_shared_info;
> +	xatp.gpfn = __pa(&shared_info_page)>>  PAGE_SHIFT;
> +	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap,&xatp))
> +		BUG();
> +
> +	HYPERVISOR_shared_info = (struct shared_info *)&shared_info_page;
> +
> +	/* Don't do the full vcpu_info placement stuff until we have a
> +	   possible map and a non-dummy shared_info. */
>    

Is this comment meaningful here?  This is the real shared info at this 
point, no?  Are you going to support vcpu_info placement?

> +	per_cpu(xen_vcpu, 0) =&HYPERVISOR_shared_info->vcpu_info[0];
> +}
> +
> +void __init xen_guest_init(void)
> +{
> +#ifdef CONFIG_X86_32
> +	return;
> +#else
> +	int r;
> +
> +	/* Ensure the we won't confused with PV */
> +	if (xen_domain_type == XEN_PV_DOMAIN)
> +		return;
>    

Aren't you specifically testing for xen_domain_type == NATIVE here?  If 
its anything else, then it means we're either PV, or have become an HVM 
domain some other way (like probing for the Xen platform PCI device).



> +
> +	r = init_hvm_pv_info();
> +	if (r<  0)
> +		return;
> +
> +	init_shared_info();
> +
> +	xen_hvm_pv_init_irq_ops();
> +#endif
> +}
>    

Can you split all this off into a new file.  It doesn't seem to have any 
dependencies on the rest of enlighten.c, and I've been trying to 
disaggregate it anyway.

> +
> diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
> index 9d30105..fadaa97 100644
> --- a/arch/x86/xen/irq.c
> +++ b/arch/x86/xen/irq.c
> @@ -131,3 +131,24 @@ void __init xen_init_irq_ops()
>   	pv_irq_ops = xen_irq_ops;
>   	x86_init.irqs.intr_init = xen_init_IRQ;
>   }
> +
> +static void xen_hvm_pv_safe_halt(void)
> +{
> +	/* Do local_irq_enable() explicitly in hvm_pv guest */
> +	local_irq_enable();
> +	xen_safe_halt();
> +}
> +
> +static void xen_hvm_pv_halt(void)
> +{
> +	if (irqs_disabled())
> +		HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id(), NULL);
> +	else
> +		xen_hvm_pv_safe_halt();
> +}
> +
> +void __init xen_hvm_pv_init_irq_ops(void)
> +{
> +	pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
> +	pv_irq_ops.halt = xen_hvm_pv_halt;
> +}
>    

It would be better to make this patch purely common infrastructure, and 
make specific features (like hvm+pv halt) separate patches (one per patch).

> diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
> index 1a5ff24..26041ce 100644
> --- a/arch/x86/xen/xen-head.S
> +++ b/arch/x86/xen/xen-head.S
> @@ -33,6 +33,12 @@ ENTRY(hypercall_page)
>   	.skip PAGE_SIZE_asm
>   .popsection
>
> +.pushsection .data
> +	.align PAGE_SIZE_asm
> +ENTRY(shared_info_page)
> +	.skip PAGE_SIZE_asm
> +.popsection
>    

Why does this need to be defined in asm?  Can't it be either allocated 
or defined in C?

> +
>   	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz "linux")
>   	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION,  .asciz "2.6")
>   	ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION,    .asciz "xen-3.0")
> diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
> index f9153a3..cc00760 100644
> --- a/arch/x86/xen/xen-ops.h
> +++ b/arch/x86/xen/xen-ops.h
> @@ -41,6 +41,7 @@ void xen_vcpu_restore(void);
>   void __init xen_build_dynamic_phys_to_machine(void);
>
>   void xen_init_irq_ops(void);
> +void xen_hvm_pv_init_irq_ops(void);
>   void xen_setup_timer(int cpu);
>   void xen_setup_runstate_info(int cpu);
>   void xen_teardown_timer(int cpu);
> diff --git a/include/xen/interface/hvm/hvm_op.h b/include/xen/interface/hvm/hvm_op.h
> index 7c74ba4..0ce8a26 100644
> --- a/include/xen/interface/hvm/hvm_op.h
> +++ b/include/xen/interface/hvm/hvm_op.h
> @@ -69,4 +69,11 @@ DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_set_pci_link_route);
>   /* Flushes all VCPU TLBs: @arg must be NULL. */
>   #define HVMOP_flush_tlbs          5
>
> +#define HVMOP_enable_pv 9
> +struct xen_hvm_pv_type {
> +	domid_t domid;
> +	uint32_t flags;
> +#define HVM_PV_EVTCHN (1ull<<1)
> +};
> +
>   #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
> diff --git a/include/xen/xen.h b/include/xen/xen.h
> index a164024..9bb92e5 100644
> --- a/include/xen/xen.h
> +++ b/include/xen/xen.h
> @@ -9,6 +9,7 @@ enum xen_domain_type {
>
>   #ifdef CONFIG_XEN
>   extern enum xen_domain_type xen_domain_type;
> +extern void xen_guest_init(void);
>   #else
>   #define xen_domain_type		XEN_NATIVE
>   #endif
> @@ -19,6 +20,14 @@ extern enum xen_domain_type xen_domain_type;
>   #define xen_hvm_domain()	(xen_domain()&&			\
>   				 xen_domain_type == XEN_HVM_DOMAIN)
>
> +#define XEN_HVM_PV_ENABLED	    (1u<<  0)
>    

Why have this?  We already have xen_domain_type which will either be 
XEN_NATIVE (ie, either real native, or on some fully emulated 
environment we have no specific optimisations for), or XEN_HVM_DOMAIN 
(we know we're running under Xen as an HVM domain).

> +#define XEN_HVM_PV_EVTCHN_ENABLED   (1u<<  1)
> +extern u32 xen_hvm_pv_status;
>    

I think "status" is a misnomer here.  Isn't it specifically a set of PV 
features which are active?

> +
> +#define xen_hvm_pv_enabled() (xen_hvm_pv_status&  XEN_HVM_PV_ENABLED)
> +#define xen_hvm_pv_evtchn_enabled() (xen_hvm_pv_enabled()&&  \
> +		(xen_hvm_pv_status&  XEN_HVM_PV_EVTCHN_ENABLED))
>    

Testing for xen_hvm_pv_enabled() should be redundant; surely the 
status/feature flag won't be set unless the environment supports the 
feature, and if it does it doesn't really matter what the domain type is.

> +
>   #ifdef CONFIG_XEN_DOM0
>   #include<xen/interface/xen.h>
>   #include<asm/xen/hypervisor.h>
>    

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 4/7] xen: The entrance for PV extension of HVM
  2010-03-01  9:38 ` [PATCH 4/7] xen: The entrance for PV extension of HVM Sheng Yang
@ 2010-03-02  1:05   ` Jeremy Fitzhardinge
  2010-03-02  1:41     ` Sheng Yang
  0 siblings, 1 reply; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-02  1:05 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, xen-devel,
	Ian Campbell, Stefano Stabellini, linux-kernel, Ingo Molnar

On 03/01/2010 01:38 AM, Sheng Yang wrote:
> xen_guest_init() would setup the environment.
>
> Cc: Ingo Molnar<mingo@elte.hu>
> Signed-off-by: Sheng Yang<sheng@linux.intel.com>
> ---
>   arch/x86/kernel/setup.c |    8 ++++++++
>   1 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 5d9e40c..2b61d46 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -113,6 +113,10 @@
>   #endif
>   #include<asm/mce.h>
>
> +#ifdef CONFIG_XEN
> +#include<xen/xen.h>
> +#endif
>    

No #ifdefs; put them in xen.h if necessary (which they aren't).  This 
probably isn't the right header anyway; this is specifically for testing 
the presence of Xen and/or the current domain type.  It shouldn't have 
any other prototypes in it, or include anything else (it probably 
already includes too much).  Given this is already x86-specific code, 
include asm/xen/something.h.

> +
>   /*
>    * end_pfn only includes RAM, while max_pfn_mapped includes all e820 entries.
>    * The direct mapping extends to max_pfn_mapped, so that we can directly access
> @@ -740,6 +744,10 @@ void __init setup_arch(char **cmdline_p)
>
>   	x86_init.oem.arch_setup();
>
> +#ifdef CONFIG_XEN
> +	xen_guest_init();
> +#endif
>    

Again, no #ifdefs here.  Put an #ifdeffed stub in an appropriate header.

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LKML] [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM
  2010-03-01 17:38     ` Konrad Rzeszutek Wilk
@ 2010-03-02  1:13       ` Sheng Yang
  -1 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-02  1:13 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, Ian Campbell,
	Stefano Stabellini, xen-devel, linux-kernel

On Tuesday 02 March 2010 01:38:58 Konrad Rzeszutek Wilk wrote:
> > +/* The region reserved by QEmu for Xen platform device */
> > +#define GNTTAB_START	    0xf2000000ul
> > +#define GNTTAB_SIZE	    0x20000ul
> 
> I thought that in the earlier review you said:
> 
> "> > +#define GNTTAB_START           0xfbfe0000ul
> 
> > > +#define GNTTAB_SIZE            0x20000ul
> >
> > Is it possible that there would be a PCI device that would be
> > passed in the guest that would conflict with the above mentioned
> > E820 region?
> 
> I would change them to a dedicated PCI MMIO address in the next version.
> Thanks.
> 

Oops... Thanks for reminder. :)

-- 
regards
Yang, Sheng

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LKML] [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM
@ 2010-03-02  1:13       ` Sheng Yang
  0 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-02  1:13 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, Ian Campbell, Stefano Stabellini, linux-kernel,
	Ian Pratt, Jeremy Fitzhardinge, Keir Fraser

On Tuesday 02 March 2010 01:38:58 Konrad Rzeszutek Wilk wrote:
> > +/* The region reserved by QEmu for Xen platform device */
> > +#define GNTTAB_START	    0xf2000000ul
> > +#define GNTTAB_SIZE	    0x20000ul
> 
> I thought that in the earlier review you said:
> 
> "> > +#define GNTTAB_START           0xfbfe0000ul
> 
> > > +#define GNTTAB_SIZE            0x20000ul
> >
> > Is it possible that there would be a PCI device that would be
> > passed in the guest that would conflict with the above mentioned
> > E820 region?
> 
> I would change them to a dedicated PCI MMIO address in the next version.
> Thanks.
> 

Oops... Thanks for reminder. :)

-- 
regards
Yang, Sheng

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LKML] [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM
  2010-03-01 17:38     ` Konrad Rzeszutek Wilk
@ 2010-03-02  1:21       ` Sheng Yang
  -1 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-02  1:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, Ian Campbell,
	Stefano Stabellini, xen-devel, linux-kernel

On Tuesday 02 March 2010 01:38:58 Konrad Rzeszutek Wilk wrote:
> > +/* The region reserved by QEmu for Xen platform device */
> > +#define GNTTAB_START	    0xf2000000ul
> > +#define GNTTAB_SIZE	    0x20000ul
> 
> I thought that in the earlier review you said:
> 
> "> > +#define GNTTAB_START           0xfbfe0000ul
> 
> > > +#define GNTTAB_SIZE            0x20000ul
> >
> > Is it possible that there would be a PCI device that would be
> > passed in the guest that would conflict with the above mentioned
> > E820 region?
> 
> I would change them to a dedicated PCI MMIO address in the next version.
> Thanks.
> 
> "
> ?

And yes, this is the dedicated PCI MMIO address I mentioned.. I would update 
the comments to get it more clear.

I don't think it's very clear solution, because the real good way to do this 
is probe pci device and find out with one is the platform pci device then use 
it. But the grant table initialization is quite earlier compared to the 
possible probing now... I hardcode the position now, and hunting for a better 
idea.

Comments?

-- 
regards
Yang, Sheng
 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LKML] [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM
@ 2010-03-02  1:21       ` Sheng Yang
  0 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-02  1:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, Ian Campbell, Stefano Stabellini, linux-kernel,
	Ian Pratt, Jeremy Fitzhardinge, Keir Fraser

On Tuesday 02 March 2010 01:38:58 Konrad Rzeszutek Wilk wrote:
> > +/* The region reserved by QEmu for Xen platform device */
> > +#define GNTTAB_START	    0xf2000000ul
> > +#define GNTTAB_SIZE	    0x20000ul
> 
> I thought that in the earlier review you said:
> 
> "> > +#define GNTTAB_START           0xfbfe0000ul
> 
> > > +#define GNTTAB_SIZE            0x20000ul
> >
> > Is it possible that there would be a PCI device that would be
> > passed in the guest that would conflict with the above mentioned
> > E820 region?
> 
> I would change them to a dedicated PCI MMIO address in the next version.
> Thanks.
> 
> "
> ?

And yes, this is the dedicated PCI MMIO address I mentioned.. I would update 
the comments to get it more clear.

I don't think it's very clear solution, because the real good way to do this 
is probe pci device and find out with one is the platform pci device then use 
it. But the grant table initialization is quite earlier compared to the 
possible probing now... I hardcode the position now, and hunting for a better 
idea.

Comments?

-- 
regards
Yang, Sheng

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen
  2010-03-02  0:42 ` [Xen-devel] [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Jeremy Fitzhardinge
@ 2010-03-02  1:26   ` Sheng Yang
  2010-03-02  1:32       ` Jeremy Fitzhardinge
  2010-03-02  3:20       ` Dong, Eddie
  0 siblings, 2 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-02  1:26 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, linux-kernel,
	xen-devel, Ian Campbell, Stefano Stabellini

On Tuesday 02 March 2010 08:42:01 Jeremy Fitzhardinge wrote:
> On 03/01/2010 01:38 AM, Sheng Yang wrote:
> > I'd like to have your comments on this patch series. We think the patches
> > are prepared for check in pv_ops domU tree.
> >
> > The PV extension of HVM is started from real mode like HVM guest, but
> > also with a a range of PV features(e.g. PV halt, PV timer, event channel,
> > as well as PV drivers). So guest with this feature can takes the
> > advantages of both H/W virtualization and Para-Virtualization.

Thanks for the comments!
> 
> The PV device side of this has a lot of overlap with the older PV driver
> for HVM domains code.  Aside perhaps for the details of how
> interrupts/events are actually delivered, that seems like a completely
> distinct piece of work which could be done separately.

Yes. PV driver is not our core target. So leave it a little late is fine.

> Also, I'd like to see a clearer statement of what you're specifically
> trying to optimise for here.  What is inefficient in an unmodified HVM
> domain, and how do your patches address these things.  What measurements
> have you made?

The key issue we want to address is the overhead of LAPIC, especially EOI and 
other actions resulted in explicit vmexit. That is the overhead we want to 
eliminate. And evtchn seems like a perfect choice to us.

> 
> > The first two of the patchset imported several header file from Jeremy's
> > tree and Xen tree, respect to Jeremy and Keir's works.
> >
> > The whole patchset based on Linux upstream.
> >
> > You need a line like:
> >
> > cpuid = [ '0x40000002:edx=0x3' ]
> >
> > in HVM configuration file to expose hybrid feature to guest, and
> >
> > CONFIG_XEN
> >
> > in the guest kernel configuration file to enable the hybrid support.
> >
> > And the compiled image can be used as native/pv domU/hvm guest/pv feature
> > hvm kernel.
> >
> > Current the patchset support x86_64 only.
> 
> Please add a new config option for the new code, and make it dependent
> on x86-64, rather than scattering random #ifdef CONFIG_X86_64s around
> the code.  Make sure all the code is controlled by this CONFIG variable
> so it disappears in a 32-bit build (ideally by putting it into a new
> file where possible, and only building it when enabled).

Sure.
> 
> > I've discussed the vector-evtchn mapping with Stefano, and we agreed on
> > that it's more complex than we thought, and can't brought much benefit
> > for pvops domU. So there is no major change in this update.
> >
> > Change from v3:
> > 1. Rebase to Linux 2.6.33 release.
> 
> Why 2.6.33?  I'm currently working on 2.6.32; I intend to move to .33
> shortly, but .32 is a much more convenient base (particularly if you're
> interested in getting these changes into any distro kernels).
> 
> I rebased it to .32 without much effort.

OK.
> 
> 
> I'll make specific comments in the patches.

Thanks

-- 
regards
Yang, Sheng

> 
>      J
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen
  2010-03-02  1:26   ` Sheng Yang
@ 2010-03-02  1:32       ` Jeremy Fitzhardinge
  2010-03-02  3:20       ` Dong, Eddie
  1 sibling, 0 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-02  1:32 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, linux-kernel,
	xen-devel, Ian Campbell, Stefano Stabellini

On 03/01/2010 05:26 PM, Sheng Yang wrote:
>> Why 2.6.33?  I'm currently working on 2.6.32; I intend to move to .33
>> shortly, but .32 is a much more convenient base (particularly if you're
>> interested in getting these changes into any distro kernels).
>>
>> I rebased it to .32 without much effort.
>>      
> OK.
>    

Hm, looks like the x86_platform_ipi stuff is missing in .32.

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen
@ 2010-03-02  1:32       ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-02  1:32 UTC (permalink / raw)
  To: Sheng Yang
  Cc: xen-devel, Ian Campbell, Stefano Stabellini, linux-kernel,
	Ian Pratt, Jeremy Fitzhardinge, Keir Fraser

On 03/01/2010 05:26 PM, Sheng Yang wrote:
>> Why 2.6.33?  I'm currently working on 2.6.32; I intend to move to .33
>> shortly, but .32 is a much more convenient base (particularly if you're
>> interested in getting these changes into any distro kernels).
>>
>> I rebased it to .32 without much effort.
>>      
> OK.
>    

Hm, looks like the x86_platform_ipi stuff is missing in .32.

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen
  2010-03-02  1:32       ` Jeremy Fitzhardinge
  (?)
@ 2010-03-02  1:34       ` Sheng Yang
  -1 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-02  1:34 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, linux-kernel,
	xen-devel, Ian Campbell, Stefano Stabellini

On Tuesday 02 March 2010 09:32:20 Jeremy Fitzhardinge wrote:
> On 03/01/2010 05:26 PM, Sheng Yang wrote:
> >> Why 2.6.33?  I'm currently working on 2.6.32; I intend to move to .33
> >> shortly, but .32 is a much more convenient base (particularly if you're
> >> interested in getting these changes into any distro kernels).
> >>
> >> I rebased it to .32 without much effort.
> >
> > OK.
> 
> Hm, looks like the x86_platform_ipi stuff is missing in .32.
> 
I thought that one should still called smp_generic_interrupt in .32?

-- 
regards
Yang, Sheng

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 5/7] xen: Make event channel work with PV extension of HVM
  2010-03-01  9:38   ` Sheng Yang
  (?)
@ 2010-03-02  1:38   ` Jeremy Fitzhardinge
  2010-03-02  5:48     ` Sheng Yang
  2010-03-04  5:37       ` Sheng Yang
  -1 siblings, 2 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-02  1:38 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, linux-kernel,
	xen-devel, Ian Campbell, Stefano Stabellini

On 03/01/2010 01:38 AM, Sheng Yang wrote:
> We mapped each IOAPIC pin to a VIRQ, so that we can deliver interrupt through
> these VIRQs.
>
> We used X86_PLATFORM_IPI_VECTOR as the noficiation vector for hypervisor
> to notify guest about the event.
>    

I don't understand this aspect of the patch.  Can you explain the 
interrupt delivery path when HVM event channels are enabled?

> The Xen PV timer is used to provide guest a reliable timer.
>
> The patch also enabled SMP support, then we can support IPI through evtchn as well.
>
> Then we don't use IOAPIC/LAPIC.
>
> Signed-off-by: Sheng Yang<sheng@linux.intel.com>
> ---
>   arch/x86/xen/enlighten.c    |   72 +++++++++++++++++++++
>   arch/x86/xen/irq.c          |   37 ++++++++++-
>   arch/x86/xen/smp.c          |  144 ++++++++++++++++++++++++++++++++++++++++++-
>   arch/x86/xen/xen-ops.h      |    3 +
>   drivers/xen/events.c        |   66 ++++++++++++++++++-
>   include/xen/events.h        |    1 +
>   include/xen/hvm.h           |    5 ++
>   include/xen/interface/xen.h |    6 ++-
>   8 files changed, 326 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index fdb9664..f617639 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -58,6 +58,9 @@
>   #include<asm/reboot.h>
>   #include<asm/stackprotector.h>
>
> +#include<xen/hvm.h>
> +#include<xen/events.h>
> +
>   #include "xen-ops.h"
>   #include "mmu.h"
>   #include "multicalls.h"
> @@ -1212,6 +1215,8 @@ static void __init xen_hvm_pv_banner(void)
>   		pv_info.name);
>   	printk(KERN_INFO "Xen version: %d.%d%s\n",
>   		version>>  16, version&  0xffff, extra.extraversion);
> +	if (xen_hvm_pv_evtchn_enabled())
> +		printk(KERN_INFO "PV feature: Event channel enabled\n");
>   }
>
>   static int xen_para_available(void)
> @@ -1256,6 +1261,9 @@ static int init_hvm_pv_info(void)
>
>   	xen_hvm_pv_status = XEN_HVM_PV_ENABLED;
>
> +	if (edx&  XEN_CPUID_FEAT2_HVM_PV_EVTCHN)
> +		xen_hvm_pv_status |= XEN_HVM_PV_EVTCHN_ENABLED;
> +
>   	/* We only support 1 page of hypercall for now */
>   	if (pages != 1)
>   		return -ENOMEM;
> @@ -1292,12 +1300,42 @@ static void __init init_shared_info(void)
>   	per_cpu(xen_vcpu, 0) =&HYPERVISOR_shared_info->vcpu_info[0];
>   }
>
> +static int set_callback_via(uint64_t via)
> +{
> +	struct xen_hvm_param a;
> +
> +	a.domid = DOMID_SELF;
> +	a.index = HVM_PARAM_CALLBACK_IRQ;
> +	a.value = via;
> +	return HYPERVISOR_hvm_op(HVMOP_set_param,&a);
> +}
> +
> +void do_hvm_pv_evtchn_intr(void)
> +{
> +#ifdef CONFIG_X86_64
> +	per_cpu(irq_count, smp_processor_id())++;
> +#endif
> +	xen_evtchn_do_upcall(get_irq_regs());
> +#ifdef CONFIG_X86_64
> +	per_cpu(irq_count, smp_processor_id())--;
> +#endif
> +}
> +
> +#ifdef CONFIG_X86_LOCAL_APIC
> +static void xen_hvm_pv_evtchn_apic_write(u32 reg, u32 val)
> +{
> +	/* The only one reached here should be EOI */
> +	WARN_ON(reg != APIC_EOI);
> +}
> +#endif
>    

> +
>   void __init xen_guest_init(void)
>   {
>   #ifdef CONFIG_X86_32
>   	return;
>   #else
>   	int r;
> +	uint64_t callback_via;
>
>   	/* Ensure the we won't confused with PV */
>   	if (xen_domain_type == XEN_PV_DOMAIN)
> @@ -1310,6 +1348,40 @@ void __init xen_guest_init(void)
>   	init_shared_info();
>
>   	xen_hvm_pv_init_irq_ops();
> +
> +	if (xen_hvm_pv_evtchn_enabled()) {
> +		if (enable_hvm_pv(HVM_PV_EVTCHN))
>    

If this fails, shouldn't it also clear XEN_HVM_PV_EVTCHN_ENABLED?

> +			return;
> +
> +		pv_time_ops = xen_time_ops;
> +
> +		x86_init.timers.timer_init = xen_time_init;
> +		x86_init.timers.setup_percpu_clockev = x86_init_noop;
> +		x86_cpuinit.setup_percpu_clockev = x86_init_noop;
>    

Presumably even if we don't have PV_EVTCHN available/enabled, the Xen 
clocksource would be available for getting time?

> +
> +		x86_platform.calibrate_tsc = xen_tsc_khz;
> +		x86_platform.get_wallclock = xen_get_wallclock;
> +		x86_platform.set_wallclock = xen_set_wallclock;
> +
> +		pv_apic_ops = xen_apic_ops;
> +#ifdef CONFIG_X86_LOCAL_APIC
> +		/*
> +		 * set up the basic apic ops.
> +		 */
> +		set_xen_basic_apic_ops();
> +		apic->write = xen_hvm_pv_evtchn_apic_write;
>    

I'd just change the xen_apic_write to remove the WARN_ON, since you 
don't seem to care about it either.

> +#endif
> +
> +		callback_via = HVM_CALLBACK_VECTOR(X86_PLATFORM_IPI_VECTOR);
> +		set_callback_via(callback_via);
> +
> +		x86_platform_ipi_callback = do_hvm_pv_evtchn_intr;
> +
> +		disable_acpi();
> +
> +		xen_hvm_pv_smp_init();
> +		machine_ops = xen_machine_ops;
> +	}
>   #endif
>   }
>
> diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
> index fadaa97..7827a6d 100644
> --- a/arch/x86/xen/irq.c
> +++ b/arch/x86/xen/irq.c
> @@ -5,6 +5,7 @@
>   #include<xen/interface/xen.h>
>   #include<xen/interface/sched.h>
>   #include<xen/interface/vcpu.h>
> +#include<xen/xen.h>
>
>   #include<asm/xen/hypercall.h>
>   #include<asm/xen/hypervisor.h>
> @@ -132,6 +133,20 @@ void __init xen_init_irq_ops()
>   	x86_init.irqs.intr_init = xen_init_IRQ;
>   }
>
> +static void xen_hvm_pv_evtchn_disable(void)
> +{
> +	native_irq_disable();
> +	xen_irq_disable();
> +}
> +PV_CALLEE_SAVE_REGS_THUNK(xen_hvm_pv_evtchn_disable);
> +
> +static void xen_hvm_pv_evtchn_enable(void)
> +{
> +	native_irq_enable();
> +	xen_irq_enable();
> +}
> +PV_CALLEE_SAVE_REGS_THUNK(xen_hvm_pv_evtchn_enable);
> +
>   static void xen_hvm_pv_safe_halt(void)
>   {
>   	/* Do local_irq_enable() explicitly in hvm_pv guest */
> @@ -147,8 +162,26 @@ static void xen_hvm_pv_halt(void)
>   		xen_hvm_pv_safe_halt();
>   }
>
> +static const struct pv_irq_ops xen_hvm_pv_irq_ops __initdata = {
> +	.save_fl = __PV_IS_CALLEE_SAVE(native_save_fl),
> +	.restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl),
> +	.irq_disable = PV_CALLEE_SAVE(xen_hvm_pv_evtchn_disable),
> +	.irq_enable = PV_CALLEE_SAVE(xen_hvm_pv_evtchn_enable),
> +
> +	.safe_halt = xen_hvm_pv_safe_halt,
> +	.halt = xen_hvm_pv_halt,
> +#ifdef CONFIG_X86_64
> +	.adjust_exception_frame = paravirt_nop,
> +#endif
> +};
> +
>   void __init xen_hvm_pv_init_irq_ops(void)
>   {
> -	pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
> -	pv_irq_ops.halt = xen_hvm_pv_halt;
> +	if (xen_hvm_pv_evtchn_enabled()) {
> +		pv_irq_ops = xen_hvm_pv_irq_ops;
> +		x86_init.irqs.intr_init = xen_hvm_pv_evtchn_init_IRQ;
> +	} else {
> +		pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
> +		pv_irq_ops.halt = xen_hvm_pv_halt;
> +	}
>   }
> diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> index 563d205..a85d0b6 100644
> --- a/arch/x86/xen/smp.c
> +++ b/arch/x86/xen/smp.c
> @@ -15,20 +15,26 @@
>   #include<linux/sched.h>
>   #include<linux/err.h>
>   #include<linux/smp.h>
> +#include<linux/nmi.h>
>
>   #include<asm/paravirt.h>
>   #include<asm/desc.h>
>   #include<asm/pgtable.h>
>   #include<asm/cpu.h>
> +#include<asm/trampoline.h>
> +#include<asm/tlbflush.h>
> +#include<asm/mtrr.h>
>
>   #include<xen/interface/xen.h>
>   #include<xen/interface/vcpu.h>
>
>   #include<asm/xen/interface.h>
>   #include<asm/xen/hypercall.h>
> +#include<asm/xen/hypervisor.h>
>
>   #include<xen/page.h>
>   #include<xen/events.h>
> +#include<xen/xen.h>
>
>   #include "xen-ops.h"
>   #include "mmu.h"
> @@ -171,7 +177,8 @@ static void __init xen_smp_prepare_boot_cpu(void)
>
>   	/* We've switched to the "real" per-cpu gdt, so make sure the
>   	   old memory can be recycled */
> -	make_lowmem_page_readwrite(xen_initial_gdt);
> +	if (xen_pv_domain())
> +		make_lowmem_page_readwrite(xen_initial_gdt);
>    

Test for xen_feature(XENFEAT_writable_descriptor_tables).

>
>   	xen_setup_vcpu_info_placement();
>   }
> @@ -480,3 +487,138 @@ void __init xen_smp_init(void)
>   	xen_fill_possible_map();
>   	xen_init_spinlocks();
>   }
> +
> +static __cpuinit void xen_hvm_pv_start_secondary(void)
> +{
> +	int cpu = smp_processor_id();
> +
> +	cpu_init();
> +	touch_nmi_watchdog();
> +	preempt_disable();
> +
> +	/* otherwise gcc will move up smp_processor_id before the cpu_init */
> +	barrier();
> +	/*
> +	 * Check TSC synchronization with the BSP:
> +	 */
> +	check_tsc_sync_target();
> +
> +	/* Done in smp_callin(), move it here */
> +	set_mtrr_aps_delayed_init();
> +	smp_store_cpu_info(cpu);
> +
> +	/* This must be done before setting cpu_online_mask */
> +	set_cpu_sibling_map(cpu);
> +	wmb();
> +
> +	set_cpu_online(smp_processor_id(), true);
> +	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
> +
> +	/* enable local interrupts */
> +	local_irq_enable();
> +
> +	xen_setup_cpu_clockevents();
>    

How much of this is necessary?  At this point, isn't CPU bringup the 
same as PV?

> +
> +	wmb();
> +	cpu_idle();
> +}
> +
> +static __cpuinit int
> +hvm_pv_cpu_initialize_context(unsigned int cpu, struct task_struct *idle)
> +{
> +	struct vcpu_guest_context *ctxt;
> +	unsigned long start_ip;
> +
> +	if (cpumask_test_and_set_cpu(cpu, xen_cpu_initialized_map))
> +		return 0;
> +
> +	ctxt = kzalloc(sizeof(*ctxt), GFP_KERNEL);
> +	if (ctxt == NULL)
> +		return -ENOMEM;
> +
> +	early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu);
> +	initial_code = (unsigned long)xen_hvm_pv_start_secondary;
> +	stack_start.sp = (void *) idle->thread.sp;
> +
> +	/* start_ip had better be page-aligned! */
> +	start_ip = setup_trampoline();
> +
> +	/* only start_ip is what we want */
> +	ctxt->flags = VGCF_HVM_GUEST;
> +	ctxt->user_regs.eip = start_ip;
> +
> +	printk(KERN_INFO "Booting processor %d ip 0x%lx\n", cpu, start_ip);
> +
> +	if (HYPERVISOR_vcpu_op(VCPUOP_initialise, cpu, ctxt))
> +		BUG();
> +
> +	kfree(ctxt);
> +	return 0;
> +}
> +
> +static int __init xen_hvm_pv_cpu_up(unsigned int cpu)
> +{
> +	struct task_struct *idle = idle_task(cpu);
> +	int rc;
> +	unsigned long flags;
> +
> +	per_cpu(current_task, cpu) = idle;
> +
> +#ifdef CONFIG_X86_32
> +	irq_ctx_init(cpu);
> +#else
> +	clear_tsk_thread_flag(idle, TIF_FORK);
> +	initial_gs = per_cpu_offset(cpu);
> +	per_cpu(kernel_stack, cpu) =
> +		(unsigned long)task_stack_page(idle) -
> +		KERNEL_STACK_OFFSET + THREAD_SIZE;
> +#endif
> +
> +	xen_setup_timer(cpu);
> +	xen_init_lock_cpu(cpu);
> +
> +	per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
>    

Can you put all this duplicated code into a common function?

> +
> +	rc = hvm_pv_cpu_initialize_context(cpu, idle);
> +	if (rc)
> +		return rc;
> +
> +	if (num_online_cpus() == 1)
> +		alternatives_smp_switch(1);
> +
> +	rc = xen_smp_intr_init(cpu);
> +	if (rc)
> +		return rc;
> +
> +	rc = HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL);
> +	BUG_ON(rc);
> +
> +	/*
> +	 * Check TSC synchronization with the AP (keep irqs disabled
> +	 * while doing so):
> +	 */
> +	local_irq_save(flags);
> +	check_tsc_sync_source(cpu);
> +	local_irq_restore(flags);
> +
> +	while (!cpu_online(cpu)) {
> +		cpu_relax();
> +		touch_nmi_watchdog();
> +	}
> +
> +	return 0;
> +}
> +
> +static void xen_hvm_pv_flush_tlb_others(const struct cpumask *cpumask,
> +					struct mm_struct *mm, unsigned long va)
> +{
> +	/* TODO Make it more specific */
> +	flush_tlb_all();
>    
Is the MMUEXT_TLB_FLUSH/INVLPG_MULTI hypercall not currently available 
to HVM?

> +}
> +
> +void __init xen_hvm_pv_smp_init(void)
> +{
>    

It's probably safest to have this do its own test for 
xen_hvm_pv_evtchn_enabled(), rather than assuming the caller is getting 
it right.

> +	smp_ops = xen_smp_ops;
> +	smp_ops.cpu_up = xen_hvm_pv_cpu_up;
> +	pv_mmu_ops.flush_tlb_others = xen_hvm_pv_flush_tlb_others;
> +}
> diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
> index cc00760..a8cc129 100644
> --- a/arch/x86/xen/xen-ops.h
> +++ b/arch/x86/xen/xen-ops.h
> @@ -34,6 +34,7 @@ void xen_reserve_top(void);
>   char * __init xen_memory_setup(void);
>   void __init xen_arch_setup(void);
>   void __init xen_init_IRQ(void);
> +void __init xen_hvm_pv_evtchn_init_IRQ(void);
>   void xen_enable_sysenter(void);
>   void xen_enable_syscall(void);
>   void xen_vcpu_restore(void);
> @@ -61,10 +62,12 @@ void xen_setup_vcpu_info_placement(void);
>
>   #ifdef CONFIG_SMP
>   void xen_smp_init(void);
> +void xen_hvm_pv_smp_init(void);
>
>   extern cpumask_var_t xen_cpu_initialized_map;
>   #else
>   static inline void xen_smp_init(void) {}
> +static inline void xen_hvm_pv_smp_init(void) {}
>   #endif
>
>   #ifdef CONFIG_PARAVIRT_SPINLOCKS
> diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> index ce602dd..e1835db 100644
> --- a/drivers/xen/events.c
> +++ b/drivers/xen/events.c
> @@ -37,9 +37,12 @@
>
>   #include<xen/xen-ops.h>
>   #include<xen/events.h>
> +#include<xen/xen.h>
>   #include<xen/interface/xen.h>
>   #include<xen/interface/event_channel.h>
>
> +#include<asm/desc.h>
> +
>   /*
>    * This lock protects updates to the following mapping and reference-count
>    * arrays. The lock does not need to be acquired to read the mapping tables.
> @@ -624,8 +627,13 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
>   	struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
>    	unsigned count;
>
> -	exit_idle();
> -	irq_enter();
> +	/*
> +	 * If is PV featured HVM, these have already been done
> +	 */
> +	if (likely(!xen_hvm_pv_evtchn_enabled())) {
> +		exit_idle();
> +		irq_enter();
> +	}
>    

In that case, rather than putting this conditional in the hot path, make 
an inner __xen_evtchn_do_upcall which is wrapped by the PV and HVM 
variants which do the appropriate things.  (And drop the pt_regs arg, I 
think.)

>
>   	do {
>   		unsigned long pending_words;
> @@ -662,8 +670,10 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
>   	} while(count != 1);
>
>   out:
> -	irq_exit();
> -	set_irq_regs(old_regs);
> +	if (likely(!xen_hvm_pv_evtchn_enabled())) {
> +		irq_exit();
> +		set_irq_regs(old_regs);
> +	}
>
>   	put_cpu();
>   }
> @@ -944,3 +954,51 @@ void __init xen_init_IRQ(void)
>
>   	irq_ctx_init(smp_processor_id());
>   }
> +
> +void __init xen_hvm_pv_evtchn_init_IRQ(void)
> +{
> +	int i;
> +
> +	xen_init_IRQ();
> +	for (i = 0; i<  NR_IRQS_LEGACY; i++) {
> +		struct evtchn_bind_virq bind_virq;
> +		struct irq_desc *desc = irq_to_desc(i);
> +		int virq, evtchn;
> +
> +		virq = i + VIRQ_EMUL_PIN_START;
> +		bind_virq.virq = virq;
> +		bind_virq.vcpu = 0;
> +
> +		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
> +						&bind_virq) != 0)
> +			BUG();
> +
> +		evtchn = bind_virq.port;
> +		evtchn_to_irq[evtchn] = i;
> +		irq_info[i] = mk_virq_info(evtchn, virq);
> +
> +		desc->status = IRQ_DISABLED;
> +		desc->action = NULL;
> +		desc->depth = 1;
> +
> +		/*
> +		 * 16 old-style INTA-cycle interrupts:
> +		 */
> +		set_irq_chip_and_handler_name(i,&xen_dynamic_chip,
> +					handle_level_irq, "event");
> +	}
> +
> +	/*
> +	 * Cover the whole vector space, no vector can escape
> +	 * us. (some of these will be overridden and become
> +	 * 'special' SMP interrupts)
> +	 */
> +	for (i = 0; i<  (NR_VECTORS - FIRST_EXTERNAL_VECTOR); i++) {
> +		int vector = FIRST_EXTERNAL_VECTOR + i;
> +		if (vector != IA32_SYSCALL_VECTOR)
> +			set_intr_gate(vector, interrupt[i]);
> +	}
> +
> +	/* generic IPI for platform specific use, now used for HVM evtchn */
> +	alloc_intr_gate(X86_PLATFORM_IPI_VECTOR, x86_platform_ipi);
> +}
> diff --git a/include/xen/events.h b/include/xen/events.h
> index e68d59a..91755db 100644
> --- a/include/xen/events.h
> +++ b/include/xen/events.h
> @@ -56,4 +56,5 @@ void xen_poll_irq(int irq);
>   /* Determine the IRQ which is bound to an event channel */
>   unsigned irq_from_evtchn(unsigned int evtchn);
>
> +void xen_evtchn_do_upcall(struct pt_regs *regs);
>   #endif	/* _XEN_EVENTS_H */
> diff --git a/include/xen/hvm.h b/include/xen/hvm.h
> index 4ea8887..c66d788 100644
> --- a/include/xen/hvm.h
> +++ b/include/xen/hvm.h
> @@ -20,4 +20,9 @@ static inline unsigned long hvm_get_parameter(int idx)
>   	return xhv.value;
>   }
>
> +#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
> +#define HVM_CALLBACK_VIA_TYPE_SHIFT 56
> +#define HVM_CALLBACK_VECTOR(x) (((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR)<<\
> +				HVM_CALLBACK_VIA_TYPE_SHIFT | (x))
> +
>   #endif /* XEN_HVM_H__ */
> diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
> index 2befa3e..9282ff7 100644
> --- a/include/xen/interface/xen.h
> +++ b/include/xen/interface/xen.h
> @@ -90,7 +90,11 @@
>   #define VIRQ_ARCH_6    22
>   #define VIRQ_ARCH_7    23
>
> -#define NR_VIRQS       24
> +#define VIRQ_EMUL_PIN_START 24
> +#define VIRQ_EMUL_PIN_NUM 16
> +
> +#define NR_VIRQS       40
>    

(VIRQ_EMUL_PIN_START + VIRQ_EMUL_PIN_NUM)?

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 3/7] xen/hvm: Xen PV extension of  HVM initialization
  2010-03-02  1:02     ` Jeremy Fitzhardinge
@ 2010-03-02  1:38       ` Sheng Yang
  -1 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-02  1:38 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, xen-devel,
	Ian Campbell, Stefano Stabellini, Yaozu (Eddie) Dong,
	linux-kernel

On Tuesday 02 March 2010 09:02:40 Jeremy Fitzhardinge wrote:
> On 03/01/2010 01:38 AM, Sheng Yang wrote:
> > The PV extension of HVM(once known as Hybrid) is started from real mode
> > like HVM guest, but also with a component based PV feature selection(e.g.
> > PV halt, PV timer, event channel, then PV drivers). So guest can takes
> > the advantages of both H/W virtualization and Para-Virtualization.
> >
> > This patch introduced the PV extension of HVM guest initialization.
> >
> > Guest would detect the capability using CPUID 0x40000002.edx, then call
> > HVMOP_enable_pv hypercall to enable pv support in hypervisor.
> >
> > Signed-off-by: Sheng Yang<sheng@linux.intel.com>
> > Signed-off-by: Yaozu (Eddie) Dong<eddie.dong@intel.com>
> > ---
> >   arch/x86/include/asm/xen/cpuid.h   |    5 ++
> >   arch/x86/xen/enlighten.c           |  115
> > ++++++++++++++++++++++++++++++++++++ arch/x86/xen/irq.c                 |
> >   21 +++++++
> >   arch/x86/xen/xen-head.S            |    6 ++
> >   arch/x86/xen/xen-ops.h             |    1 +
> >   include/xen/interface/hvm/hvm_op.h |    7 ++
> >   include/xen/xen.h                  |    9 +++
> >   7 files changed, 164 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/xen/cpuid.h
> > b/arch/x86/include/asm/xen/cpuid.h index 8787f03..a93c851 100644
> > --- a/arch/x86/include/asm/xen/cpuid.h
> > +++ b/arch/x86/include/asm/xen/cpuid.h
> > @@ -65,4 +65,9 @@
> >   #define _XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD 0
> >   #define XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD  (1u<<0)
> >
> > +#define _XEN_CPUID_FEAT2_HVM_PV 0
> > +#define XEN_CPUID_FEAT2_HVM_PV (1u<<0)
> > +#define _XEN_CPUID_FEAT2_HVM_PV_EVTCHN 1
> > +#define XEN_CPUID_FEAT2_HVM_PV_EVTCHN (1u<<1)
> > +
> >   #endif /* __XEN_PUBLIC_ARCH_X86_CPUID_H__ */
> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > index 36daccb..fdb9664 100644
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -34,6 +34,8 @@
> >   #include<xen/interface/version.h>
> >   #include<xen/interface/physdev.h>
> >   #include<xen/interface/vcpu.h>
> > +#include<xen/interface/memory.h>
> > +#include<xen/interface/hvm/hvm_op.h>
> >   #include<xen/features.h>
> >   #include<xen/page.h>
> >   #include<xen/hvc-console.h>
> > @@ -43,6 +45,7 @@
> >   #include<asm/page.h>
> >   #include<asm/xen/hypercall.h>
> >   #include<asm/xen/hypervisor.h>
> > +#include<asm/xen/cpuid.h>
> >   #include<asm/fixmap.h>
> >   #include<asm/processor.h>
> >   #include<asm/proto.h>
> > @@ -1198,3 +1201,115 @@ asmlinkage void __init xen_start_kernel(void)
> >   	x86_64_start_reservations((char *)__pa_symbol(&boot_params));
> >   #endif
> >   }
> > +
> > +static void __init xen_hvm_pv_banner(void)
> > +{
> > +	unsigned version = HYPERVISOR_xen_version(XENVER_version, NULL);
> > +	struct xen_extraversion extra;
> > +	HYPERVISOR_xen_version(XENVER_extraversion,&extra);
> > +
> > +	printk(KERN_INFO "Booting PV featured HVM kernel on %s\n",
> > +		pv_info.name);
> > +	printk(KERN_INFO "Xen version: %d.%d%s\n",
> > +		version>>  16, version&  0xffff, extra.extraversion);
> > +}
> > +
> > +static int xen_para_available(void)
> > +{
> > +	uint32_t eax, ebx, ecx, edx;
> > +	cpuid(XEN_CPUID_LEAF(0),&eax,&ebx,&ecx,&edx);
> > +
> > +	if (ebx == XEN_CPUID_SIGNATURE_EBX&&
> > +	    ecx == XEN_CPUID_SIGNATURE_ECX&&
> > +	    edx == XEN_CPUID_SIGNATURE_EDX&&
> > +	    ((eax - XEN_CPUID_LEAF(0))>= 2))
> > +		return 1;
> > +
> > +	return 0;
> > +}
> > +
> > +u32 xen_hvm_pv_status;
> > +EXPORT_SYMBOL_GPL(xen_hvm_pv_status);
> > +
> > +static int enable_hvm_pv(u64 flags)
> > +{
> > +	struct xen_hvm_pv_type a;
> > +
> > +	a.domid = DOMID_SELF;
> > +	a.flags = flags;
> > +	return HYPERVISOR_hvm_op(HVMOP_enable_pv,&a);
> > +}
> > +
> > +static int init_hvm_pv_info(void)
> > +{
> > +	uint32_t ecx, edx, pages, msr;
> > +	u64 pfn;
> > +
> > +	if (!xen_para_available())
> > +		return -EINVAL;
> > +
> > +	cpuid(XEN_CPUID_LEAF(2),&pages,&msr,&ecx,&edx);
> > +
> > +	/* Check if hvm_pv mode is supported */
> > +	if (!(edx&  XEN_CPUID_FEAT2_HVM_PV))
> > +		return -ENODEV;
> > +
> > +	xen_hvm_pv_status = XEN_HVM_PV_ENABLED;
> 
> Why use this?  Why not just set the domain type to HVM, and leave the
> "status" field as a bitset of available paravirtualizations?

A annoy thing in pv drivers is that it would test if the domain type is _NOT_ 
XEN_NATIVE. So set the domain to XEN_HVM_DOMAIN would result in PV driver 
initialization then probably panic. Maybe we can do something to PV drivers, 
as patch 6 and a part of patch 7.
> 
> > +
> > +	/* We only support 1 page of hypercall for now */
> > +	if (pages != 1)
> > +		return -ENOMEM;
> > +
> > +	pfn = __pa(hypercall_page);
> > +	wrmsrl(msr, pfn);
> > +
> > +	xen_setup_features();
> > +
> > +	x86_init.oem.banner = xen_hvm_pv_banner;
> > +	pv_info = xen_info;
> > +	pv_info.kernel_rpl = 0;
> > +
> > +	return 0;
> > +}
> > +
> > +extern struct shared_info shared_info_page;
> > +
> > +static void __init init_shared_info(void)
> > +{
> > +	struct xen_add_to_physmap xatp;
> > +
> > +	xatp.domid = DOMID_SELF;
> > +	xatp.idx = 0;
> > +	xatp.space = XENMAPSPACE_shared_info;
> > +	xatp.gpfn = __pa(&shared_info_page)>>  PAGE_SHIFT;
> > +	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap,&xatp))
> > +		BUG();
> > +
> > +	HYPERVISOR_shared_info = (struct shared_info *)&shared_info_page;
> > +
> > +	/* Don't do the full vcpu_info placement stuff until we have a
> > +	   possible map and a non-dummy shared_info. */
> 
> Is this comment meaningful here?  This is the real shared info at this
> point, no?  Are you going to support vcpu_info placement?

Would discard this..
 
> > +	per_cpu(xen_vcpu, 0) =&HYPERVISOR_shared_info->vcpu_info[0];
> > +}
> > +
> > +void __init xen_guest_init(void)
> > +{
> > +#ifdef CONFIG_X86_32
> > +	return;
> > +#else
> > +	int r;
> > +
> > +	/* Ensure the we won't confused with PV */
> > +	if (xen_domain_type == XEN_PV_DOMAIN)
> > +		return;
> 
> Aren't you specifically testing for xen_domain_type == NATIVE here?  If
> its anything else, then it means we're either PV, or have become an HVM
> domain some other way (like probing for the Xen platform PCI device).

Yes, that's better.

> > +
> > +	r = init_hvm_pv_info();
> > +	if (r<  0)
> > +		return;
> > +
> > +	init_shared_info();
> > +
> > +	xen_hvm_pv_init_irq_ops();
> > +#endif
> > +}
> 
> Can you split all this off into a new file.  It doesn't seem to have any
> dependencies on the rest of enlighten.c, and I've been trying to
> disaggregate it anyway.

Part of pv_ops are overlapped. I would try if a new file would bring much 
duplicate.
 
> > +
> > diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
> > index 9d30105..fadaa97 100644
> > --- a/arch/x86/xen/irq.c
> > +++ b/arch/x86/xen/irq.c
> > @@ -131,3 +131,24 @@ void __init xen_init_irq_ops()
> >   	pv_irq_ops = xen_irq_ops;
> >   	x86_init.irqs.intr_init = xen_init_IRQ;
> >   }
> > +
> > +static void xen_hvm_pv_safe_halt(void)
> > +{
> > +	/* Do local_irq_enable() explicitly in hvm_pv guest */
> > +	local_irq_enable();
> > +	xen_safe_halt();
> > +}
> > +
> > +static void xen_hvm_pv_halt(void)
> > +{
> > +	if (irqs_disabled())
> > +		HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id(), NULL);
> > +	else
> > +		xen_hvm_pv_safe_halt();
> > +}
> > +
> > +void __init xen_hvm_pv_init_irq_ops(void)
> > +{
> > +	pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
> > +	pv_irq_ops.halt = xen_hvm_pv_halt;
> > +}
> 
> It would be better to make this patch purely common infrastructure, and
> make specific features (like hvm+pv halt) separate patches (one per patch).

OK. (In fact I merged them in second edition...)
> 
> > diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
> > index 1a5ff24..26041ce 100644
> > --- a/arch/x86/xen/xen-head.S
> > +++ b/arch/x86/xen/xen-head.S
> > @@ -33,6 +33,12 @@ ENTRY(hypercall_page)
> >   	.skip PAGE_SIZE_asm
> >   .popsection
> >
> > +.pushsection .data
> > +	.align PAGE_SIZE_asm
> > +ENTRY(shared_info_page)
> > +	.skip PAGE_SIZE_asm
> > +.popsection
> 
> Why does this need to be defined in asm?  Can't it be either allocated
> or defined in C?

I think we need a aligned page, as hypercall page.
> 
> > +
> >   	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz "linux")
> >   	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION,  .asciz "2.6")
> >   	ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION,    .asciz "xen-3.0")
> > diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
> > index f9153a3..cc00760 100644
> > --- a/arch/x86/xen/xen-ops.h
> > +++ b/arch/x86/xen/xen-ops.h
> > @@ -41,6 +41,7 @@ void xen_vcpu_restore(void);
> >   void __init xen_build_dynamic_phys_to_machine(void);
> >
> >   void xen_init_irq_ops(void);
> > +void xen_hvm_pv_init_irq_ops(void);
> >   void xen_setup_timer(int cpu);
> >   void xen_setup_runstate_info(int cpu);
> >   void xen_teardown_timer(int cpu);
> > diff --git a/include/xen/interface/hvm/hvm_op.h
> > b/include/xen/interface/hvm/hvm_op.h index 7c74ba4..0ce8a26 100644
> > --- a/include/xen/interface/hvm/hvm_op.h
> > +++ b/include/xen/interface/hvm/hvm_op.h
> > @@ -69,4 +69,11 @@
> > DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_set_pci_link_route); /* Flushes all
> > VCPU TLBs: @arg must be NULL. */
> >   #define HVMOP_flush_tlbs          5
> >
> > +#define HVMOP_enable_pv 9
> > +struct xen_hvm_pv_type {
> > +	domid_t domid;
> > +	uint32_t flags;
> > +#define HVM_PV_EVTCHN (1ull<<1)
> > +};
> > +
> >   #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
> > diff --git a/include/xen/xen.h b/include/xen/xen.h
> > index a164024..9bb92e5 100644
> > --- a/include/xen/xen.h
> > +++ b/include/xen/xen.h
> > @@ -9,6 +9,7 @@ enum xen_domain_type {
> >
> >   #ifdef CONFIG_XEN
> >   extern enum xen_domain_type xen_domain_type;
> > +extern void xen_guest_init(void);
> >   #else
> >   #define xen_domain_type		XEN_NATIVE
> >   #endif
> > @@ -19,6 +20,14 @@ extern enum xen_domain_type xen_domain_type;
> >   #define xen_hvm_domain()	(xen_domain()&&			\
> >   				 xen_domain_type == XEN_HVM_DOMAIN)
> >
> > +#define XEN_HVM_PV_ENABLED	    (1u<<  0)
> 
> Why have this?  We already have xen_domain_type which will either be
> XEN_NATIVE (ie, either real native, or on some fully emulated
> environment we have no specific optimisations for), or XEN_HVM_DOMAIN
> (we know we're running under Xen as an HVM domain).

Something may need to change in pv driver, as I said above.
> 
> > +#define XEN_HVM_PV_EVTCHN_ENABLED   (1u<<  1)
> > +extern u32 xen_hvm_pv_status;
> 
> I think "status" is a misnomer here.  Isn't it specifically a set of PV
> features which are active?

Could you give a suggestion of the name? I am not a native English speaker...
> 
> > +
> > +#define xen_hvm_pv_enabled() (xen_hvm_pv_status&  XEN_HVM_PV_ENABLED)
> > +#define xen_hvm_pv_evtchn_enabled() (xen_hvm_pv_enabled()&&  \
> > +		(xen_hvm_pv_status&  XEN_HVM_PV_EVTCHN_ENABLED))
> 
> Testing for xen_hvm_pv_enabled() should be redundant; surely the
> status/feature flag won't be set unless the environment supports the
> feature, and if it does it doesn't really matter what the domain type is.

Sure

-- 
regards
Yang, Sheng

> > +
> >   #ifdef CONFIG_XEN_DOM0
> >   #include<xen/interface/xen.h>
> >   #include<asm/xen/hypervisor.h>
> 
>      J
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 3/7] xen/hvm: Xen PV extension of HVM initialization
@ 2010-03-02  1:38       ` Sheng Yang
  0 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-02  1:38 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, xen-devel,
	Ian Campbell, Stefano Stabellini, Yaozu (Eddie) Dong,
	linux-kernel

On Tuesday 02 March 2010 09:02:40 Jeremy Fitzhardinge wrote:
> On 03/01/2010 01:38 AM, Sheng Yang wrote:
> > The PV extension of HVM(once known as Hybrid) is started from real mode
> > like HVM guest, but also with a component based PV feature selection(e.g.
> > PV halt, PV timer, event channel, then PV drivers). So guest can takes
> > the advantages of both H/W virtualization and Para-Virtualization.
> >
> > This patch introduced the PV extension of HVM guest initialization.
> >
> > Guest would detect the capability using CPUID 0x40000002.edx, then call
> > HVMOP_enable_pv hypercall to enable pv support in hypervisor.
> >
> > Signed-off-by: Sheng Yang<sheng@linux.intel.com>
> > Signed-off-by: Yaozu (Eddie) Dong<eddie.dong@intel.com>
> > ---
> >   arch/x86/include/asm/xen/cpuid.h   |    5 ++
> >   arch/x86/xen/enlighten.c           |  115
> > ++++++++++++++++++++++++++++++++++++ arch/x86/xen/irq.c                 |
> >   21 +++++++
> >   arch/x86/xen/xen-head.S            |    6 ++
> >   arch/x86/xen/xen-ops.h             |    1 +
> >   include/xen/interface/hvm/hvm_op.h |    7 ++
> >   include/xen/xen.h                  |    9 +++
> >   7 files changed, 164 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/xen/cpuid.h
> > b/arch/x86/include/asm/xen/cpuid.h index 8787f03..a93c851 100644
> > --- a/arch/x86/include/asm/xen/cpuid.h
> > +++ b/arch/x86/include/asm/xen/cpuid.h
> > @@ -65,4 +65,9 @@
> >   #define _XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD 0
> >   #define XEN_CPUID_FEAT1_MMU_PT_UPDATE_PRESERVE_AD  (1u<<0)
> >
> > +#define _XEN_CPUID_FEAT2_HVM_PV 0
> > +#define XEN_CPUID_FEAT2_HVM_PV (1u<<0)
> > +#define _XEN_CPUID_FEAT2_HVM_PV_EVTCHN 1
> > +#define XEN_CPUID_FEAT2_HVM_PV_EVTCHN (1u<<1)
> > +
> >   #endif /* __XEN_PUBLIC_ARCH_X86_CPUID_H__ */
> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > index 36daccb..fdb9664 100644
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -34,6 +34,8 @@
> >   #include<xen/interface/version.h>
> >   #include<xen/interface/physdev.h>
> >   #include<xen/interface/vcpu.h>
> > +#include<xen/interface/memory.h>
> > +#include<xen/interface/hvm/hvm_op.h>
> >   #include<xen/features.h>
> >   #include<xen/page.h>
> >   #include<xen/hvc-console.h>
> > @@ -43,6 +45,7 @@
> >   #include<asm/page.h>
> >   #include<asm/xen/hypercall.h>
> >   #include<asm/xen/hypervisor.h>
> > +#include<asm/xen/cpuid.h>
> >   #include<asm/fixmap.h>
> >   #include<asm/processor.h>
> >   #include<asm/proto.h>
> > @@ -1198,3 +1201,115 @@ asmlinkage void __init xen_start_kernel(void)
> >   	x86_64_start_reservations((char *)__pa_symbol(&boot_params));
> >   #endif
> >   }
> > +
> > +static void __init xen_hvm_pv_banner(void)
> > +{
> > +	unsigned version = HYPERVISOR_xen_version(XENVER_version, NULL);
> > +	struct xen_extraversion extra;
> > +	HYPERVISOR_xen_version(XENVER_extraversion,&extra);
> > +
> > +	printk(KERN_INFO "Booting PV featured HVM kernel on %s\n",
> > +		pv_info.name);
> > +	printk(KERN_INFO "Xen version: %d.%d%s\n",
> > +		version>>  16, version&  0xffff, extra.extraversion);
> > +}
> > +
> > +static int xen_para_available(void)
> > +{
> > +	uint32_t eax, ebx, ecx, edx;
> > +	cpuid(XEN_CPUID_LEAF(0),&eax,&ebx,&ecx,&edx);
> > +
> > +	if (ebx == XEN_CPUID_SIGNATURE_EBX&&
> > +	    ecx == XEN_CPUID_SIGNATURE_ECX&&
> > +	    edx == XEN_CPUID_SIGNATURE_EDX&&
> > +	    ((eax - XEN_CPUID_LEAF(0))>= 2))
> > +		return 1;
> > +
> > +	return 0;
> > +}
> > +
> > +u32 xen_hvm_pv_status;
> > +EXPORT_SYMBOL_GPL(xen_hvm_pv_status);
> > +
> > +static int enable_hvm_pv(u64 flags)
> > +{
> > +	struct xen_hvm_pv_type a;
> > +
> > +	a.domid = DOMID_SELF;
> > +	a.flags = flags;
> > +	return HYPERVISOR_hvm_op(HVMOP_enable_pv,&a);
> > +}
> > +
> > +static int init_hvm_pv_info(void)
> > +{
> > +	uint32_t ecx, edx, pages, msr;
> > +	u64 pfn;
> > +
> > +	if (!xen_para_available())
> > +		return -EINVAL;
> > +
> > +	cpuid(XEN_CPUID_LEAF(2),&pages,&msr,&ecx,&edx);
> > +
> > +	/* Check if hvm_pv mode is supported */
> > +	if (!(edx&  XEN_CPUID_FEAT2_HVM_PV))
> > +		return -ENODEV;
> > +
> > +	xen_hvm_pv_status = XEN_HVM_PV_ENABLED;
> 
> Why use this?  Why not just set the domain type to HVM, and leave the
> "status" field as a bitset of available paravirtualizations?

A annoy thing in pv drivers is that it would test if the domain type is _NOT_ 
XEN_NATIVE. So set the domain to XEN_HVM_DOMAIN would result in PV driver 
initialization then probably panic. Maybe we can do something to PV drivers, 
as patch 6 and a part of patch 7.
> 
> > +
> > +	/* We only support 1 page of hypercall for now */
> > +	if (pages != 1)
> > +		return -ENOMEM;
> > +
> > +	pfn = __pa(hypercall_page);
> > +	wrmsrl(msr, pfn);
> > +
> > +	xen_setup_features();
> > +
> > +	x86_init.oem.banner = xen_hvm_pv_banner;
> > +	pv_info = xen_info;
> > +	pv_info.kernel_rpl = 0;
> > +
> > +	return 0;
> > +}
> > +
> > +extern struct shared_info shared_info_page;
> > +
> > +static void __init init_shared_info(void)
> > +{
> > +	struct xen_add_to_physmap xatp;
> > +
> > +	xatp.domid = DOMID_SELF;
> > +	xatp.idx = 0;
> > +	xatp.space = XENMAPSPACE_shared_info;
> > +	xatp.gpfn = __pa(&shared_info_page)>>  PAGE_SHIFT;
> > +	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap,&xatp))
> > +		BUG();
> > +
> > +	HYPERVISOR_shared_info = (struct shared_info *)&shared_info_page;
> > +
> > +	/* Don't do the full vcpu_info placement stuff until we have a
> > +	   possible map and a non-dummy shared_info. */
> 
> Is this comment meaningful here?  This is the real shared info at this
> point, no?  Are you going to support vcpu_info placement?

Would discard this..
 
> > +	per_cpu(xen_vcpu, 0) =&HYPERVISOR_shared_info->vcpu_info[0];
> > +}
> > +
> > +void __init xen_guest_init(void)
> > +{
> > +#ifdef CONFIG_X86_32
> > +	return;
> > +#else
> > +	int r;
> > +
> > +	/* Ensure the we won't confused with PV */
> > +	if (xen_domain_type == XEN_PV_DOMAIN)
> > +		return;
> 
> Aren't you specifically testing for xen_domain_type == NATIVE here?  If
> its anything else, then it means we're either PV, or have become an HVM
> domain some other way (like probing for the Xen platform PCI device).

Yes, that's better.

> > +
> > +	r = init_hvm_pv_info();
> > +	if (r<  0)
> > +		return;
> > +
> > +	init_shared_info();
> > +
> > +	xen_hvm_pv_init_irq_ops();
> > +#endif
> > +}
> 
> Can you split all this off into a new file.  It doesn't seem to have any
> dependencies on the rest of enlighten.c, and I've been trying to
> disaggregate it anyway.

Part of pv_ops are overlapped. I would try if a new file would bring much 
duplicate.
 
> > +
> > diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
> > index 9d30105..fadaa97 100644
> > --- a/arch/x86/xen/irq.c
> > +++ b/arch/x86/xen/irq.c
> > @@ -131,3 +131,24 @@ void __init xen_init_irq_ops()
> >   	pv_irq_ops = xen_irq_ops;
> >   	x86_init.irqs.intr_init = xen_init_IRQ;
> >   }
> > +
> > +static void xen_hvm_pv_safe_halt(void)
> > +{
> > +	/* Do local_irq_enable() explicitly in hvm_pv guest */
> > +	local_irq_enable();
> > +	xen_safe_halt();
> > +}
> > +
> > +static void xen_hvm_pv_halt(void)
> > +{
> > +	if (irqs_disabled())
> > +		HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id(), NULL);
> > +	else
> > +		xen_hvm_pv_safe_halt();
> > +}
> > +
> > +void __init xen_hvm_pv_init_irq_ops(void)
> > +{
> > +	pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
> > +	pv_irq_ops.halt = xen_hvm_pv_halt;
> > +}
> 
> It would be better to make this patch purely common infrastructure, and
> make specific features (like hvm+pv halt) separate patches (one per patch).

OK. (In fact I merged them in second edition...)
> 
> > diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
> > index 1a5ff24..26041ce 100644
> > --- a/arch/x86/xen/xen-head.S
> > +++ b/arch/x86/xen/xen-head.S
> > @@ -33,6 +33,12 @@ ENTRY(hypercall_page)
> >   	.skip PAGE_SIZE_asm
> >   .popsection
> >
> > +.pushsection .data
> > +	.align PAGE_SIZE_asm
> > +ENTRY(shared_info_page)
> > +	.skip PAGE_SIZE_asm
> > +.popsection
> 
> Why does this need to be defined in asm?  Can't it be either allocated
> or defined in C?

I think we need a aligned page, as hypercall page.
> 
> > +
> >   	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_OS,       .asciz "linux")
> >   	ELFNOTE(Xen, XEN_ELFNOTE_GUEST_VERSION,  .asciz "2.6")
> >   	ELFNOTE(Xen, XEN_ELFNOTE_XEN_VERSION,    .asciz "xen-3.0")
> > diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
> > index f9153a3..cc00760 100644
> > --- a/arch/x86/xen/xen-ops.h
> > +++ b/arch/x86/xen/xen-ops.h
> > @@ -41,6 +41,7 @@ void xen_vcpu_restore(void);
> >   void __init xen_build_dynamic_phys_to_machine(void);
> >
> >   void xen_init_irq_ops(void);
> > +void xen_hvm_pv_init_irq_ops(void);
> >   void xen_setup_timer(int cpu);
> >   void xen_setup_runstate_info(int cpu);
> >   void xen_teardown_timer(int cpu);
> > diff --git a/include/xen/interface/hvm/hvm_op.h
> > b/include/xen/interface/hvm/hvm_op.h index 7c74ba4..0ce8a26 100644
> > --- a/include/xen/interface/hvm/hvm_op.h
> > +++ b/include/xen/interface/hvm/hvm_op.h
> > @@ -69,4 +69,11 @@
> > DEFINE_GUEST_HANDLE_STRUCT(xen_hvm_set_pci_link_route); /* Flushes all
> > VCPU TLBs: @arg must be NULL. */
> >   #define HVMOP_flush_tlbs          5
> >
> > +#define HVMOP_enable_pv 9
> > +struct xen_hvm_pv_type {
> > +	domid_t domid;
> > +	uint32_t flags;
> > +#define HVM_PV_EVTCHN (1ull<<1)
> > +};
> > +
> >   #endif /* __XEN_PUBLIC_HVM_HVM_OP_H__ */
> > diff --git a/include/xen/xen.h b/include/xen/xen.h
> > index a164024..9bb92e5 100644
> > --- a/include/xen/xen.h
> > +++ b/include/xen/xen.h
> > @@ -9,6 +9,7 @@ enum xen_domain_type {
> >
> >   #ifdef CONFIG_XEN
> >   extern enum xen_domain_type xen_domain_type;
> > +extern void xen_guest_init(void);
> >   #else
> >   #define xen_domain_type		XEN_NATIVE
> >   #endif
> > @@ -19,6 +20,14 @@ extern enum xen_domain_type xen_domain_type;
> >   #define xen_hvm_domain()	(xen_domain()&&			\
> >   				 xen_domain_type == XEN_HVM_DOMAIN)
> >
> > +#define XEN_HVM_PV_ENABLED	    (1u<<  0)
> 
> Why have this?  We already have xen_domain_type which will either be
> XEN_NATIVE (ie, either real native, or on some fully emulated
> environment we have no specific optimisations for), or XEN_HVM_DOMAIN
> (we know we're running under Xen as an HVM domain).

Something may need to change in pv driver, as I said above.
> 
> > +#define XEN_HVM_PV_EVTCHN_ENABLED   (1u<<  1)
> > +extern u32 xen_hvm_pv_status;
> 
> I think "status" is a misnomer here.  Isn't it specifically a set of PV
> features which are active?

Could you give a suggestion of the name? I am not a native English speaker...
> 
> > +
> > +#define xen_hvm_pv_enabled() (xen_hvm_pv_status&  XEN_HVM_PV_ENABLED)
> > +#define xen_hvm_pv_evtchn_enabled() (xen_hvm_pv_enabled()&&  \
> > +		(xen_hvm_pv_status&  XEN_HVM_PV_EVTCHN_ENABLED))
> 
> Testing for xen_hvm_pv_enabled() should be redundant; surely the
> status/feature flag won't be set unless the environment supports the
> feature, and if it does it doesn't really matter what the domain type is.

Sure

-- 
regards
Yang, Sheng

> > +
> >   #ifdef CONFIG_XEN_DOM0
> >   #include<xen/interface/xen.h>
> >   #include<asm/xen/hypervisor.h>
> 
>      J
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 4/7] xen: The entrance for PV extension of HVM
  2010-03-02  1:05   ` [Xen-devel] " Jeremy Fitzhardinge
@ 2010-03-02  1:41     ` Sheng Yang
  0 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-02  1:41 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, xen-devel,
	Ian Campbell, Stefano Stabellini, linux-kernel, Ingo Molnar

On Tuesday 02 March 2010 09:05:36 Jeremy Fitzhardinge wrote:
> On 03/01/2010 01:38 AM, Sheng Yang wrote:
> > xen_guest_init() would setup the environment.
> >
> > Cc: Ingo Molnar<mingo@elte.hu>
> > Signed-off-by: Sheng Yang<sheng@linux.intel.com>
> > ---
> >   arch/x86/kernel/setup.c |    8 ++++++++
> >   1 files changed, 8 insertions(+), 0 deletions(-)
> >
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index 5d9e40c..2b61d46 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -113,6 +113,10 @@
> >   #endif
> >   #include<asm/mce.h>
> >
> > +#ifdef CONFIG_XEN
> > +#include<xen/xen.h>
> > +#endif
> 
> No #ifdefs; put them in xen.h if necessary (which they aren't).  This
> probably isn't the right header anyway; this is specifically for testing
> the presence of Xen and/or the current domain type.  It shouldn't have
> any other prototypes in it, or include anything else (it probably
> already includes too much).  Given this is already x86-specific code,
> include asm/xen/something.h.

Would update it. Thanks.

-- 
regards
Yang, Sheng

> 
> > +
> >   /*
> >    * end_pfn only includes RAM, while max_pfn_mapped includes all e820
> > entries. * The direct mapping extends to max_pfn_mapped, so that we can
> > directly access @@ -740,6 +744,10 @@ void __init setup_arch(char
> > **cmdline_p)
> >
> >   	x86_init.oem.arch_setup();
> >
> > +#ifdef CONFIG_XEN
> > +	xen_guest_init();
> > +#endif
> 
> Again, no #ifdefs here.  Put an #ifdeffed stub in an appropriate header.
> 
>      J
> 
	

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 3/7] xen/hvm: Xen PV extension of  HVM initialization
  2010-03-02  1:38       ` Sheng Yang
  (?)
@ 2010-03-02  1:43       ` Jeremy Fitzhardinge
  -1 siblings, 0 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-02  1:43 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, xen-devel,
	Ian Campbell, Stefano Stabellini, Yaozu (Eddie) Dong,
	linux-kernel

On 03/01/2010 05:38 PM, Sheng Yang wrote:
>> Why use this?  Why not just set the domain type to HVM, and leave the
>> "status" field as a bitset of available paravirtualizations?
>>      
> A annoy thing in pv drivers is that it would test if the domain type is _NOT_
> XEN_NATIVE. So set the domain to XEN_HVM_DOMAIN would result in PV driver
> initialization then probably panic. Maybe we can do something to PV drivers,
> as patch 6 and a part of patch 7.
>    

In that case, change them to specifically test for PV until they're 
ready to deal with an HVM environment.

>
> Part of pv_ops are overlapped. I would try if a new file would bring much
> duplicate.
>    

It shouldn't do; un-static things if necessary.

>>> +	.align PAGE_SIZE_asm
>>> +ENTRY(shared_info_page)
>>> +	.skip PAGE_SIZE_asm
>>> +.popsection
>>>        
>> Why does this need to be defined in asm?  Can't it be either allocated
>> or defined in C?
>>      
> I think we need a aligned page, as hypercall page.
>    

That can be declared in C with the __page_aligned_bss annotation.  And 
couldn't you dynamically allocate it with the bootmem allocator?

>> I think "status" is a misnomer here.  Isn't it specifically a set of PV
>> features which are active?
>>      
> Could you give a suggestion of the name? I am not a native English speaker...
>    

xen_hvm_pv_features.

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 6/7] xen: Unified checking for Xen of PV drivers to xenbus_register_frontend()
  2010-03-01  9:38 ` [PATCH 6/7] xen: Unified checking for Xen of PV drivers to xenbus_register_frontend() Sheng Yang
@ 2010-03-02  1:45   ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-02  1:45 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, linux-kernel,
	xen-devel, Ian Campbell, Stefano Stabellini

On 03/01/2010 01:38 AM, Sheng Yang wrote:
> Signed-off-by: Sheng Yang<sheng@linux.intel.com>
>    

We covered this before; just change the tests so they run in PV domains 
so they don't crash on HVM+PV, and we can address these later.

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [Xen-devel] [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen
  2010-03-02  1:26   ` Sheng Yang
@ 2010-03-02  3:20       ` Dong, Eddie
  2010-03-02  3:20       ` Dong, Eddie
  1 sibling, 0 replies; 53+ messages in thread
From: Dong, Eddie @ 2010-03-02  3:20 UTC (permalink / raw)
  To: Sheng Yang, Jeremy Fitzhardinge
  Cc: xen-devel, Ian Campbell, Stefano Stabellini, linux-kernel,
	Ian Pratt, Jeremy Fitzhardinge, Keir Fraser, Dong, Eddie

> 
>> Also, I'd like to see a clearer statement of what you're specifically
>> trying to optimise for here.  What is inefficient in an unmodified
>> HVM domain, and how do your patches address these things.  What
>> measurements have you made?
> 
> The key issue we want to address is the overhead of LAPIC, especially
> EOI and other actions resulted in explicit vmexit. That is the
> overhead we want to eliminate. And evtchn seems like a perfect choice
> to us. 
> 

That is not complete. Linux timer in HVM is always annoying, and time to time make surprise to endusers. PV timer does help here. Eventchannel is nice to HVM as well. 

Thx, Eddie

^ permalink raw reply	[flat|nested] 53+ messages in thread

* RE: [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen
@ 2010-03-02  3:20       ` Dong, Eddie
  0 siblings, 0 replies; 53+ messages in thread
From: Dong, Eddie @ 2010-03-02  3:20 UTC (permalink / raw)
  To: Sheng Yang, Jeremy Fitzhardinge
  Cc: xen-devel, Jeremy, Ian Campbell, Stefano Stabellini, Dong, Eddie,
	linux-kernel, Ian Pratt, Fitzhardinge, Keir Fraser

> 
>> Also, I'd like to see a clearer statement of what you're specifically
>> trying to optimise for here.  What is inefficient in an unmodified
>> HVM domain, and how do your patches address these things.  What
>> measurements have you made?
> 
> The key issue we want to address is the overhead of LAPIC, especially
> EOI and other actions resulted in explicit vmexit. That is the
> overhead we want to eliminate. And evtchn seems like a perfect choice
> to us. 
> 

That is not complete. Linux timer in HVM is always annoying, and time to time make surprise to endusers. PV timer does help here. Eventchannel is nice to HVM as well. 

Thx, Eddie

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 5/7] xen: Make event channel work with PV extension of HVM
  2010-03-02  1:38   ` [Xen-devel] " Jeremy Fitzhardinge
@ 2010-03-02  5:48     ` Sheng Yang
  2010-03-03 18:31       ` Jeremy Fitzhardinge
  2010-03-04  5:37       ` Sheng Yang
  1 sibling, 1 reply; 53+ messages in thread
From: Sheng Yang @ 2010-03-02  5:48 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, linux-kernel,
	xen-devel, Ian Campbell, Stefano Stabellini

On Tuesday 02 March 2010 09:38:21 Jeremy Fitzhardinge wrote:
> On 03/01/2010 01:38 AM, Sheng Yang wrote:
> > We mapped each IOAPIC pin to a VIRQ, so that we can deliver interrupt
> > through these VIRQs.
> >
> > We used X86_PLATFORM_IPI_VECTOR as the noficiation vector for hypervisor
> > to notify guest about the event.
> 
> I don't understand this aspect of the patch.  Can you explain the
> interrupt delivery path when HVM event channels are enabled?

The goal is to eliminate LAPIC overhead(especially EOI and other operation 
which caused VMExit when using LAPIC) in the current interrupt delivery 
system, so that we use evtchn to delivery interrupt.

For the emulated device, we allocate one VIRQ for each IOAPIC pin. When IOAPIC 
pin should be assert, instead we inject a VIRQ.

The X86_PLATFORM_IPI_VECTOR is the entry point for event handling. When we 
want to inject event(VIRQ or others) to guest, we inject a 
X86_PLATFORM_IPI_VECTOR, so that the handler of vector in the guest can handle 
the events well.
 
> > The Xen PV timer is used to provide guest a reliable timer.
> >
> > The patch also enabled SMP support, then we can support IPI through
> > evtchn as well.
> >
> > Then we don't use IOAPIC/LAPIC.
> >
> > Signed-off-by: Sheng Yang<sheng@linux.intel.com>
> > ---
> >   arch/x86/xen/enlighten.c    |   72 +++++++++++++++++++++
> >   arch/x86/xen/irq.c          |   37 ++++++++++-
> >   arch/x86/xen/smp.c          |  144
> > ++++++++++++++++++++++++++++++++++++++++++- arch/x86/xen/xen-ops.h      |
> >    3 +
> >   drivers/xen/events.c        |   66 ++++++++++++++++++-
> >   include/xen/events.h        |    1 +
> >   include/xen/hvm.h           |    5 ++
> >   include/xen/interface/xen.h |    6 ++-
> >   8 files changed, 326 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> > index fdb9664..f617639 100644
> > --- a/arch/x86/xen/enlighten.c
> > +++ b/arch/x86/xen/enlighten.c
> > @@ -58,6 +58,9 @@
> >   #include<asm/reboot.h>
> >   #include<asm/stackprotector.h>
> >
> > +#include<xen/hvm.h>
> > +#include<xen/events.h>
> > +
> >   #include "xen-ops.h"
> >   #include "mmu.h"
> >   #include "multicalls.h"
> > @@ -1212,6 +1215,8 @@ static void __init xen_hvm_pv_banner(void)
> >   		pv_info.name);
> >   	printk(KERN_INFO "Xen version: %d.%d%s\n",
> >   		version>>  16, version&  0xffff, extra.extraversion);
> > +	if (xen_hvm_pv_evtchn_enabled())
> > +		printk(KERN_INFO "PV feature: Event channel enabled\n");
> >   }
> >
> >   static int xen_para_available(void)
> > @@ -1256,6 +1261,9 @@ static int init_hvm_pv_info(void)
> >
> >   	xen_hvm_pv_status = XEN_HVM_PV_ENABLED;
> >
> > +	if (edx&  XEN_CPUID_FEAT2_HVM_PV_EVTCHN)
> > +		xen_hvm_pv_status |= XEN_HVM_PV_EVTCHN_ENABLED;
> > +
> >   	/* We only support 1 page of hypercall for now */
> >   	if (pages != 1)
> >   		return -ENOMEM;
> > @@ -1292,12 +1300,42 @@ static void __init init_shared_info(void)
> >   	per_cpu(xen_vcpu, 0) =&HYPERVISOR_shared_info->vcpu_info[0];
> >   }
> >
> > +static int set_callback_via(uint64_t via)
> > +{
> > +	struct xen_hvm_param a;
> > +
> > +	a.domid = DOMID_SELF;
> > +	a.index = HVM_PARAM_CALLBACK_IRQ;
> > +	a.value = via;
> > +	return HYPERVISOR_hvm_op(HVMOP_set_param,&a);
> > +}
> > +
> > +void do_hvm_pv_evtchn_intr(void)
> > +{
> > +#ifdef CONFIG_X86_64
> > +	per_cpu(irq_count, smp_processor_id())++;
> > +#endif
> > +	xen_evtchn_do_upcall(get_irq_regs());
> > +#ifdef CONFIG_X86_64
> > +	per_cpu(irq_count, smp_processor_id())--;
> > +#endif
> > +}
> > +
> > +#ifdef CONFIG_X86_LOCAL_APIC
> > +static void xen_hvm_pv_evtchn_apic_write(u32 reg, u32 val)
> > +{
> > +	/* The only one reached here should be EOI */
> > +	WARN_ON(reg != APIC_EOI);
> > +}
> > +#endif
> >
> >
> > +
> >   void __init xen_guest_init(void)
> >   {
> >   #ifdef CONFIG_X86_32
> >   	return;
> >   #else
> >   	int r;
> > +	uint64_t callback_via;
> >
> >   	/* Ensure the we won't confused with PV */
> >   	if (xen_domain_type == XEN_PV_DOMAIN)
> > @@ -1310,6 +1348,40 @@ void __init xen_guest_init(void)
> >   	init_shared_info();
> >
> >   	xen_hvm_pv_init_irq_ops();
> > +
> > +	if (xen_hvm_pv_evtchn_enabled()) {
> > +		if (enable_hvm_pv(HVM_PV_EVTCHN))
> 
> If this fails, shouldn't it also clear XEN_HVM_PV_EVTCHN_ENABLED?

Yes...
 
> > +			return;
> > +
> > +		pv_time_ops = xen_time_ops;
> > +
> > +		x86_init.timers.timer_init = xen_time_init;
> > +		x86_init.timers.setup_percpu_clockev = x86_init_noop;
> > +		x86_cpuinit.setup_percpu_clockev = x86_init_noop;
> 
> Presumably even if we don't have PV_EVTCHN available/enabled, the Xen
> clocksource would be available for getting time?

I think currently Xen pv clocksource and clockevent are binding... Not sure if 
a single line "clocksource_register(&xen_clocksource)" can work. I would give 
it a try, maybe add a new PV feature.
 
> > +
> > +		x86_platform.calibrate_tsc = xen_tsc_khz;
> > +		x86_platform.get_wallclock = xen_get_wallclock;
> > +		x86_platform.set_wallclock = xen_set_wallclock;
> > +
> > +		pv_apic_ops = xen_apic_ops;
> > +#ifdef CONFIG_X86_LOCAL_APIC
> > +		/*
> > +		 * set up the basic apic ops.
> > +		 */
> > +		set_xen_basic_apic_ops();
> > +		apic->write = xen_hvm_pv_evtchn_apic_write;
> 
> I'd just change the xen_apic_write to remove the WARN_ON, since you
> don't seem to care about it either.

:)
> 
> > +#endif
> > +
> > +		callback_via = HVM_CALLBACK_VECTOR(X86_PLATFORM_IPI_VECTOR);
> > +		set_callback_via(callback_via);
> > +
> > +		x86_platform_ipi_callback = do_hvm_pv_evtchn_intr;
> > +
> > +		disable_acpi();
> > +
> > +		xen_hvm_pv_smp_init();
> > +		machine_ops = xen_machine_ops;
> > +	}
> >   #endif
> >   }
> >
> > diff --git a/arch/x86/xen/irq.c b/arch/x86/xen/irq.c
> > index fadaa97..7827a6d 100644
> > --- a/arch/x86/xen/irq.c
> > +++ b/arch/x86/xen/irq.c
> > @@ -5,6 +5,7 @@
> >   #include<xen/interface/xen.h>
> >   #include<xen/interface/sched.h>
> >   #include<xen/interface/vcpu.h>
> > +#include<xen/xen.h>
> >
> >   #include<asm/xen/hypercall.h>
> >   #include<asm/xen/hypervisor.h>
> > @@ -132,6 +133,20 @@ void __init xen_init_irq_ops()
> >   	x86_init.irqs.intr_init = xen_init_IRQ;
> >   }
> >
> > +static void xen_hvm_pv_evtchn_disable(void)
> > +{
> > +	native_irq_disable();
> > +	xen_irq_disable();
> > +}
> > +PV_CALLEE_SAVE_REGS_THUNK(xen_hvm_pv_evtchn_disable);
> > +
> > +static void xen_hvm_pv_evtchn_enable(void)
> > +{
> > +	native_irq_enable();
> > +	xen_irq_enable();
> > +}
> > +PV_CALLEE_SAVE_REGS_THUNK(xen_hvm_pv_evtchn_enable);
> > +
> >   static void xen_hvm_pv_safe_halt(void)
> >   {
> >   	/* Do local_irq_enable() explicitly in hvm_pv guest */
> > @@ -147,8 +162,26 @@ static void xen_hvm_pv_halt(void)
> >   		xen_hvm_pv_safe_halt();
> >   }
> >
> > +static const struct pv_irq_ops xen_hvm_pv_irq_ops __initdata = {
> > +	.save_fl = __PV_IS_CALLEE_SAVE(native_save_fl),
> > +	.restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl),
> > +	.irq_disable = PV_CALLEE_SAVE(xen_hvm_pv_evtchn_disable),
> > +	.irq_enable = PV_CALLEE_SAVE(xen_hvm_pv_evtchn_enable),
> > +
> > +	.safe_halt = xen_hvm_pv_safe_halt,
> > +	.halt = xen_hvm_pv_halt,
> > +#ifdef CONFIG_X86_64
> > +	.adjust_exception_frame = paravirt_nop,
> > +#endif
> > +};
> > +
> >   void __init xen_hvm_pv_init_irq_ops(void)
> >   {
> > -	pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
> > -	pv_irq_ops.halt = xen_hvm_pv_halt;
> > +	if (xen_hvm_pv_evtchn_enabled()) {
> > +		pv_irq_ops = xen_hvm_pv_irq_ops;
> > +		x86_init.irqs.intr_init = xen_hvm_pv_evtchn_init_IRQ;
> > +	} else {
> > +		pv_irq_ops.safe_halt = xen_hvm_pv_safe_halt;
> > +		pv_irq_ops.halt = xen_hvm_pv_halt;
> > +	}
> >   }
> > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c
> > index 563d205..a85d0b6 100644
> > --- a/arch/x86/xen/smp.c
> > +++ b/arch/x86/xen/smp.c
> > @@ -15,20 +15,26 @@
> >   #include<linux/sched.h>
> >   #include<linux/err.h>
> >   #include<linux/smp.h>
> > +#include<linux/nmi.h>
> >
> >   #include<asm/paravirt.h>
> >   #include<asm/desc.h>
> >   #include<asm/pgtable.h>
> >   #include<asm/cpu.h>
> > +#include<asm/trampoline.h>
> > +#include<asm/tlbflush.h>
> > +#include<asm/mtrr.h>
> >
> >   #include<xen/interface/xen.h>
> >   #include<xen/interface/vcpu.h>
> >
> >   #include<asm/xen/interface.h>
> >   #include<asm/xen/hypercall.h>
> > +#include<asm/xen/hypervisor.h>
> >
> >   #include<xen/page.h>
> >   #include<xen/events.h>
> > +#include<xen/xen.h>
> >
> >   #include "xen-ops.h"
> >   #include "mmu.h"
> > @@ -171,7 +177,8 @@ static void __init xen_smp_prepare_boot_cpu(void)
> >
> >   	/* We've switched to the "real" per-cpu gdt, so make sure the
> >   	   old memory can be recycled */
> > -	make_lowmem_page_readwrite(xen_initial_gdt);
> > +	if (xen_pv_domain())
> > +		make_lowmem_page_readwrite(xen_initial_gdt);
> 
> Test for xen_feature(XENFEAT_writable_descriptor_tables).

Sure.
> 
> >   	xen_setup_vcpu_info_placement();
> >   }
> > @@ -480,3 +487,138 @@ void __init xen_smp_init(void)
> >   	xen_fill_possible_map();
> >   	xen_init_spinlocks();
> >   }
> > +
> > +static __cpuinit void xen_hvm_pv_start_secondary(void)
> > +{
> > +	int cpu = smp_processor_id();
> > +
> > +	cpu_init();
> > +	touch_nmi_watchdog();
> > +	preempt_disable();
> > +
> > +	/* otherwise gcc will move up smp_processor_id before the cpu_init */
> > +	barrier();
> > +	/*
> > +	 * Check TSC synchronization with the BSP:
> > +	 */
> > +	check_tsc_sync_target();
> > +
> > +	/* Done in smp_callin(), move it here */
> > +	set_mtrr_aps_delayed_init();
> > +	smp_store_cpu_info(cpu);
> > +
> > +	/* This must be done before setting cpu_online_mask */
> > +	set_cpu_sibling_map(cpu);
> > +	wmb();
> > +
> > +	set_cpu_online(smp_processor_id(), true);
> > +	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
> > +
> > +	/* enable local interrupts */
> > +	local_irq_enable();
> > +
> > +	xen_setup_cpu_clockevents();
> 
> How much of this is necessary?  At this point, isn't CPU bringup the
> same as PV?

Xen_enable_sysenter/syscall is not needed for this. And we have a TSC sync 
here - but it seems unnecessary for PV. But set_mtrr_aps_delayed_init() is 
needed. Reuse the cpu_bring_up() is fine?
 
> > +
> > +	wmb();
> > +	cpu_idle();
> > +}
> > +
> > +static __cpuinit int
> > +hvm_pv_cpu_initialize_context(unsigned int cpu, struct task_struct
> > *idle) +{
> > +	struct vcpu_guest_context *ctxt;
> > +	unsigned long start_ip;
> > +
> > +	if (cpumask_test_and_set_cpu(cpu, xen_cpu_initialized_map))
> > +		return 0;
> > +
> > +	ctxt = kzalloc(sizeof(*ctxt), GFP_KERNEL);
> > +	if (ctxt == NULL)
> > +		return -ENOMEM;
> > +
> > +	early_gdt_descr.address = (unsigned long)get_cpu_gdt_table(cpu);
> > +	initial_code = (unsigned long)xen_hvm_pv_start_secondary;
> > +	stack_start.sp = (void *) idle->thread.sp;
> > +
> > +	/* start_ip had better be page-aligned! */
> > +	start_ip = setup_trampoline();
> > +
> > +	/* only start_ip is what we want */
> > +	ctxt->flags = VGCF_HVM_GUEST;
> > +	ctxt->user_regs.eip = start_ip;
> > +
> > +	printk(KERN_INFO "Booting processor %d ip 0x%lx\n", cpu, start_ip);
> > +
> > +	if (HYPERVISOR_vcpu_op(VCPUOP_initialise, cpu, ctxt))
> > +		BUG();
> > +
> > +	kfree(ctxt);
> > +	return 0;
> > +}
> > +
> > +static int __init xen_hvm_pv_cpu_up(unsigned int cpu)
> > +{
> > +	struct task_struct *idle = idle_task(cpu);
> > +	int rc;
> > +	unsigned long flags;
> > +
> > +	per_cpu(current_task, cpu) = idle;
> > +
> > +#ifdef CONFIG_X86_32
> > +	irq_ctx_init(cpu);
> > +#else
> > +	clear_tsk_thread_flag(idle, TIF_FORK);
> > +	initial_gs = per_cpu_offset(cpu);
> > +	per_cpu(kernel_stack, cpu) =
> > +		(unsigned long)task_stack_page(idle) -
> > +		KERNEL_STACK_OFFSET + THREAD_SIZE;
> > +#endif
> > +
> > +	xen_setup_timer(cpu);
> > +	xen_init_lock_cpu(cpu);
> > +
> > +	per_cpu(cpu_state, cpu) = CPU_UP_PREPARE;
> 
> Can you put all this duplicated code into a common function?

Sure.
> 
> > +
> > +	rc = hvm_pv_cpu_initialize_context(cpu, idle);
> > +	if (rc)
> > +		return rc;
> > +
> > +	if (num_online_cpus() == 1)
> > +		alternatives_smp_switch(1);
> > +
> > +	rc = xen_smp_intr_init(cpu);
> > +	if (rc)
> > +		return rc;
> > +
> > +	rc = HYPERVISOR_vcpu_op(VCPUOP_up, cpu, NULL);
> > +	BUG_ON(rc);
> > +
> > +	/*
> > +	 * Check TSC synchronization with the AP (keep irqs disabled
> > +	 * while doing so):
> > +	 */
> > +	local_irq_save(flags);
> > +	check_tsc_sync_source(cpu);
> > +	local_irq_restore(flags);
> > +
> > +	while (!cpu_online(cpu)) {
> > +		cpu_relax();
> > +		touch_nmi_watchdog();
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static void xen_hvm_pv_flush_tlb_others(const struct cpumask *cpumask,
> > +					struct mm_struct *mm, unsigned long va)
> > +{
> > +	/* TODO Make it more specific */
> > +	flush_tlb_all();
> 
> Is the MMUEXT_TLB_FLUSH/INVLPG_MULTI hypercall not currently available
> to HVM?

I think they are different. These hypercalls flushed native's TLB, but HVM 
want to flush guest one, especially when using shadow, HVM need do something 
for it.
 
> > +}
> > +
> > +void __init xen_hvm_pv_smp_init(void)
> > +{
> 
> It's probably safest to have this do its own test for
> xen_hvm_pv_evtchn_enabled(), rather than assuming the caller is getting
> it right.

Yes.
> 
> > +	smp_ops = xen_smp_ops;
> > +	smp_ops.cpu_up = xen_hvm_pv_cpu_up;
> > +	pv_mmu_ops.flush_tlb_others = xen_hvm_pv_flush_tlb_others;
> > +}
> > diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
> > index cc00760..a8cc129 100644
> > --- a/arch/x86/xen/xen-ops.h
> > +++ b/arch/x86/xen/xen-ops.h
> > @@ -34,6 +34,7 @@ void xen_reserve_top(void);
> >   char * __init xen_memory_setup(void);
> >   void __init xen_arch_setup(void);
> >   void __init xen_init_IRQ(void);
> > +void __init xen_hvm_pv_evtchn_init_IRQ(void);
> >   void xen_enable_sysenter(void);
> >   void xen_enable_syscall(void);
> >   void xen_vcpu_restore(void);
> > @@ -61,10 +62,12 @@ void xen_setup_vcpu_info_placement(void);
> >
> >   #ifdef CONFIG_SMP
> >   void xen_smp_init(void);
> > +void xen_hvm_pv_smp_init(void);
> >
> >   extern cpumask_var_t xen_cpu_initialized_map;
> >   #else
> >   static inline void xen_smp_init(void) {}
> > +static inline void xen_hvm_pv_smp_init(void) {}
> >   #endif
> >
> >   #ifdef CONFIG_PARAVIRT_SPINLOCKS
> > diff --git a/drivers/xen/events.c b/drivers/xen/events.c
> > index ce602dd..e1835db 100644
> > --- a/drivers/xen/events.c
> > +++ b/drivers/xen/events.c
> > @@ -37,9 +37,12 @@
> >
> >   #include<xen/xen-ops.h>
> >   #include<xen/events.h>
> > +#include<xen/xen.h>
> >   #include<xen/interface/xen.h>
> >   #include<xen/interface/event_channel.h>
> >
> > +#include<asm/desc.h>
> > +
> >   /*
> >    * This lock protects updates to the following mapping and
> > reference-count * arrays. The lock does not need to be acquired to read
> > the mapping tables. @@ -624,8 +627,13 @@ void xen_evtchn_do_upcall(struct
> > pt_regs *regs) struct vcpu_info *vcpu_info = __get_cpu_var(xen_vcpu);
> >    	unsigned count;
> >
> > -	exit_idle();
> > -	irq_enter();
> > +	/*
> > +	 * If is PV featured HVM, these have already been done
> > +	 */
> > +	if (likely(!xen_hvm_pv_evtchn_enabled())) {
> > +		exit_idle();
> > +		irq_enter();
> > +	}
> 
> In that case, rather than putting this conditional in the hot path, make
> an inner __xen_evtchn_do_upcall which is wrapped by the PV and HVM
> variants which do the appropriate things.  (And drop the pt_regs arg, I
> think.)

Sure.
 
-- 
regards
Yang, Sheng

> >   	do {
> >   		unsigned long pending_words;
> > @@ -662,8 +670,10 @@ void xen_evtchn_do_upcall(struct pt_regs *regs)
> >   	} while(count != 1);
> >
> >   out:
> > -	irq_exit();
> > -	set_irq_regs(old_regs);
> > +	if (likely(!xen_hvm_pv_evtchn_enabled())) {
> > +		irq_exit();
> > +		set_irq_regs(old_regs);
> > +	}
> >
> >   	put_cpu();
> >   }
> > @@ -944,3 +954,51 @@ void __init xen_init_IRQ(void)
> >
> >   	irq_ctx_init(smp_processor_id());
> >   }
> > +
> > +void __init xen_hvm_pv_evtchn_init_IRQ(void)
> > +{
> > +	int i;
> > +
> > +	xen_init_IRQ();
> > +	for (i = 0; i<  NR_IRQS_LEGACY; i++) {
> > +		struct evtchn_bind_virq bind_virq;
> > +		struct irq_desc *desc = irq_to_desc(i);
> > +		int virq, evtchn;
> > +
> > +		virq = i + VIRQ_EMUL_PIN_START;
> > +		bind_virq.virq = virq;
> > +		bind_virq.vcpu = 0;
> > +
> > +		if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_virq,
> > +						&bind_virq) != 0)
> > +			BUG();
> > +
> > +		evtchn = bind_virq.port;
> > +		evtchn_to_irq[evtchn] = i;
> > +		irq_info[i] = mk_virq_info(evtchn, virq);
> > +
> > +		desc->status = IRQ_DISABLED;
> > +		desc->action = NULL;
> > +		desc->depth = 1;
> > +
> > +		/*
> > +		 * 16 old-style INTA-cycle interrupts:
> > +		 */
> > +		set_irq_chip_and_handler_name(i,&xen_dynamic_chip,
> > +					handle_level_irq, "event");
> > +	}
> > +
> > +	/*
> > +	 * Cover the whole vector space, no vector can escape
> > +	 * us. (some of these will be overridden and become
> > +	 * 'special' SMP interrupts)
> > +	 */
> > +	for (i = 0; i<  (NR_VECTORS - FIRST_EXTERNAL_VECTOR); i++) {
> > +		int vector = FIRST_EXTERNAL_VECTOR + i;
> > +		if (vector != IA32_SYSCALL_VECTOR)
> > +			set_intr_gate(vector, interrupt[i]);
> > +	}
> > +
> > +	/* generic IPI for platform specific use, now used for HVM evtchn */
> > +	alloc_intr_gate(X86_PLATFORM_IPI_VECTOR, x86_platform_ipi);
> > +}
> > diff --git a/include/xen/events.h b/include/xen/events.h
> > index e68d59a..91755db 100644
> > --- a/include/xen/events.h
> > +++ b/include/xen/events.h
> > @@ -56,4 +56,5 @@ void xen_poll_irq(int irq);
> >   /* Determine the IRQ which is bound to an event channel */
> >   unsigned irq_from_evtchn(unsigned int evtchn);
> >
> > +void xen_evtchn_do_upcall(struct pt_regs *regs);
> >   #endif	/* _XEN_EVENTS_H */
> > diff --git a/include/xen/hvm.h b/include/xen/hvm.h
> > index 4ea8887..c66d788 100644
> > --- a/include/xen/hvm.h
> > +++ b/include/xen/hvm.h
> > @@ -20,4 +20,9 @@ static inline unsigned long hvm_get_parameter(int idx)
> >   	return xhv.value;
> >   }
> >
> > +#define HVM_CALLBACK_VIA_TYPE_VECTOR 0x2
> > +#define HVM_CALLBACK_VIA_TYPE_SHIFT 56
> > +#define HVM_CALLBACK_VECTOR(x)
> > (((uint64_t)HVM_CALLBACK_VIA_TYPE_VECTOR)<<\
> > +				HVM_CALLBACK_VIA_TYPE_SHIFT | (x))
> > +
> >   #endif /* XEN_HVM_H__ */
> > diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h
> > index 2befa3e..9282ff7 100644
> > --- a/include/xen/interface/xen.h
> > +++ b/include/xen/interface/xen.h
> > @@ -90,7 +90,11 @@
> >   #define VIRQ_ARCH_6    22
> >   #define VIRQ_ARCH_7    23
> >
> > -#define NR_VIRQS       24
> > +#define VIRQ_EMUL_PIN_START 24
> > +#define VIRQ_EMUL_PIN_NUM 16
> > +
> > +#define NR_VIRQS       40
> 
> (VIRQ_EMUL_PIN_START + VIRQ_EMUL_PIN_NUM)?
> 
>      J
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 3/7] xen/hvm: Xen PV extension of  HVM initialization
  2010-03-02  1:38       ` Sheng Yang
  (?)
  (?)
@ 2010-03-02  9:22       ` Ian Campbell
  2010-03-02 20:17           ` Jeremy Fitzhardinge
  -1 siblings, 1 reply; 53+ messages in thread
From: Ian Campbell @ 2010-03-02  9:22 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Jeremy Fitzhardinge, Keir Fraser, Jeremy Fitzhardinge, Ian Pratt,
	xen-devel, Stefano Stabellini, Yaozu (Eddie) Dong, linux-kernel

On Tue, 2010-03-02 at 01:38 +0000, Sheng Yang wrote:
> 
> A annoy thing in pv drivers is that it would test if the domain type
> is _NOT_ XEN_NATIVE. So set the domain to XEN_HVM_DOMAIN would result
> in PV driver initialization then probably panic. 

What _actually_ panics?

Registration of the frontend devices should be completely harmless
(apart from a little wasted RAM) unless a xenbus driver manages to come
up and enumerate the xen bus and cause the ->probe function run.

You should be gating the xenbus startup on the availability of PV
functionality not the individual driver registrations. This keeps the
test in a single easy to maintain place.

Compare with pci_register_driver and all of the callers of that function
-- not a single one of them has an "is_pci_available" test anywhere.

Ian.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LKML] [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM
  2010-03-02  1:21       ` Sheng Yang
  (?)
@ 2010-03-02 13:41       ` Konrad Rzeszutek Wilk
  2010-03-02 14:09           ` Ian Campbell
  -1 siblings, 1 reply; 53+ messages in thread
From: Konrad Rzeszutek Wilk @ 2010-03-02 13:41 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, Ian Campbell,
	Stefano Stabellini, xen-devel, linux-kernel

On Tue, Mar 02, 2010 at 09:21:43AM +0800, Sheng Yang wrote:
> On Tuesday 02 March 2010 01:38:58 Konrad Rzeszutek Wilk wrote:
> > > +/* The region reserved by QEmu for Xen platform device */
> > > +#define GNTTAB_START	    0xf2000000ul
> > > +#define GNTTAB_SIZE	    0x20000ul
> > 
> > I thought that in the earlier review you said:
> > 
> > "> > +#define GNTTAB_START           0xfbfe0000ul
> > 
> > > > +#define GNTTAB_SIZE            0x20000ul
> > >
> > > Is it possible that there would be a PCI device that would be
> > > passed in the guest that would conflict with the above mentioned
> > > E820 region?
> > 
> > I would change them to a dedicated PCI MMIO address in the next version.
> > Thanks.
> > 
> > "
> > ?
> 
> And yes, this is the dedicated PCI MMIO address I mentioned.. I would update 
> the comments to get it more clear.
> 
> I don't think it's very clear solution, because the real good way to do this 
> is probe pci device and find out with one is the platform pci device then use 
> it. But the grant table initialization is quite earlier compared to the 
> possible probing now... I hardcode the position now, and hunting for a better 

Would it be possible to move the grant table initialization to later
phase? Past the PCI loading/initialization?

> idea.
> 
> Comments?
> 
> -- 
> regards
> Yang, Sheng
>  

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LKML] [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM
  2010-03-02 13:41       ` Konrad Rzeszutek Wilk
@ 2010-03-02 14:09           ` Ian Campbell
  0 siblings, 0 replies; 53+ messages in thread
From: Ian Campbell @ 2010-03-02 14:09 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Sheng Yang, Keir Fraser, Jeremy Fitzhardinge, Ian Pratt,
	Stefano Stabellini, xen-devel, linux-kernel

On Tue, 2010-03-02 at 13:41 +0000, Konrad Rzeszutek Wilk wrote:
> On Tue, Mar 02, 2010 at 09:21:43AM +0800, Sheng Yang wrote:
> > On Tuesday 02 March 2010 01:38:58 Konrad Rzeszutek Wilk wrote:
> > > > +/* The region reserved by QEmu for Xen platform device */
> > > > +#define GNTTAB_START	    0xf2000000ul
> > > > +#define GNTTAB_SIZE	    0x20000ul
> > > 
> > > I thought that in the earlier review you said:
> > > 
> > > "> > +#define GNTTAB_START           0xfbfe0000ul
> > > 
> > > > > +#define GNTTAB_SIZE            0x20000ul
> > > >
> > > > Is it possible that there would be a PCI device that would be
> > > > passed in the guest that would conflict with the above mentioned
> > > > E820 region?
> > > 
> > > I would change them to a dedicated PCI MMIO address in the next version.
> > > Thanks.
> > > 
> > > "
> > > ?
> > 
> > And yes, this is the dedicated PCI MMIO address I mentioned.. I would update 
> > the comments to get it more clear.
> > 
> > I don't think it's very clear solution, because the real good way to do this 
> > is probe pci device and find out with one is the platform pci device then use 
> > it. But the grant table initialization is quite earlier compared to the 
> > possible probing now... I hardcode the position now, and hunting for a better 
> 
> Would it be possible to move the grant table initialization to later
> phase? Past the PCI loading/initialization?

Or provide the address via an MSR, hypervisor specific CPUID leaf or IO
port or by using the early_pci infrastructure in the kernel. I don't
think we are short of options ;-)

Ian.

> 
> > idea.
> > 
> > Comments?
> > 
> > -- 
> > regards
> > Yang, Sheng
> >  



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [LKML] [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM
@ 2010-03-02 14:09           ` Ian Campbell
  0 siblings, 0 replies; 53+ messages in thread
From: Ian Campbell @ 2010-03-02 14:09 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: xen-devel, Stefano Stabellini, linux-kernel, Ian Pratt,
	Jeremy Fitzhardinge, Keir Fraser, Sheng Yang

On Tue, 2010-03-02 at 13:41 +0000, Konrad Rzeszutek Wilk wrote:
> On Tue, Mar 02, 2010 at 09:21:43AM +0800, Sheng Yang wrote:
> > On Tuesday 02 March 2010 01:38:58 Konrad Rzeszutek Wilk wrote:
> > > > +/* The region reserved by QEmu for Xen platform device */
> > > > +#define GNTTAB_START	    0xf2000000ul
> > > > +#define GNTTAB_SIZE	    0x20000ul
> > > 
> > > I thought that in the earlier review you said:
> > > 
> > > "> > +#define GNTTAB_START           0xfbfe0000ul
> > > 
> > > > > +#define GNTTAB_SIZE            0x20000ul
> > > >
> > > > Is it possible that there would be a PCI device that would be
> > > > passed in the guest that would conflict with the above mentioned
> > > > E820 region?
> > > 
> > > I would change them to a dedicated PCI MMIO address in the next version.
> > > Thanks.
> > > 
> > > "
> > > ?
> > 
> > And yes, this is the dedicated PCI MMIO address I mentioned.. I would update 
> > the comments to get it more clear.
> > 
> > I don't think it's very clear solution, because the real good way to do this 
> > is probe pci device and find out with one is the platform pci device then use 
> > it. But the grant table initialization is quite earlier compared to the 
> > possible probing now... I hardcode the position now, and hunting for a better 
> 
> Would it be possible to move the grant table initialization to later
> phase? Past the PCI loading/initialization?

Or provide the address via an MSR, hypervisor specific CPUID leaf or IO
port or by using the early_pci infrastructure in the kernel. I don't
think we are short of options ;-)

Ian.

> 
> > idea.
> > 
> > Comments?
> > 
> > -- 
> > regards
> > Yang, Sheng
> >  

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 3/7] xen/hvm: Xen PV extension of  HVM initialization
  2010-03-02  9:22       ` Ian Campbell
@ 2010-03-02 20:17           ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-02 20:17 UTC (permalink / raw)
  To: Ian Campbell
  Cc: Sheng Yang, Keir Fraser, Jeremy Fitzhardinge, Ian Pratt,
	xen-devel, Stefano Stabellini, Yaozu (Eddie) Dong, linux-kernel

On 03/02/2010 01:22 AM, Ian Campbell wrote:
> On Tue, 2010-03-02 at 01:38 +0000, Sheng Yang wrote:
>    
>> A annoy thing in pv drivers is that it would test if the domain type
>> is _NOT_ XEN_NATIVE. So set the domain to XEN_HVM_DOMAIN would result
>> in PV driver initialization then probably panic.
>>      
> What _actually_ panics?
>
> Registration of the frontend devices should be completely harmless
> (apart from a little wasted RAM) unless a xenbus driver manages to come
> up and enumerate the xen bus and cause the ->probe function run.
>
> You should be gating the xenbus startup on the availability of PV
> functionality not the individual driver registrations. This keeps the
> test in a single easy to maintain place.
>
> Compare with pci_register_driver and all of the callers of that function
> -- not a single one of them has an "is_pci_available" test anywhere.
>    

The problem is that it currently assumes xenbus is initialized by the 
time the PV drivers init, because in a PV boot xenbus gets initted very 
early.  We need to change it so that it can cope with drivers being 
initialized and registering with xenbus before it has been initialized.

We have the same problem with plain PV-on-HVM drivers as xenbus only 
comes up as a result of probing the Xen PCI platform device, which may 
be after the PV drivers' init functions have been called.

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 3/7] xen/hvm: Xen PV extension of HVM initialization
@ 2010-03-02 20:17           ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-02 20:17 UTC (permalink / raw)
  To: Ian Campbell
  Cc: xen-devel, Stefano Stabellini, Yaozu (Eddie) Dong, linux-kernel,
	Ian Pratt, Jeremy Fitzhardinge, Keir Fraser, Sheng Yang

On 03/02/2010 01:22 AM, Ian Campbell wrote:
> On Tue, 2010-03-02 at 01:38 +0000, Sheng Yang wrote:
>    
>> A annoy thing in pv drivers is that it would test if the domain type
>> is _NOT_ XEN_NATIVE. So set the domain to XEN_HVM_DOMAIN would result
>> in PV driver initialization then probably panic.
>>      
> What _actually_ panics?
>
> Registration of the frontend devices should be completely harmless
> (apart from a little wasted RAM) unless a xenbus driver manages to come
> up and enumerate the xen bus and cause the ->probe function run.
>
> You should be gating the xenbus startup on the availability of PV
> functionality not the individual driver registrations. This keeps the
> test in a single easy to maintain place.
>
> Compare with pci_register_driver and all of the callers of that function
> -- not a single one of them has an "is_pci_available" test anywhere.
>    

The problem is that it currently assumes xenbus is initialized by the 
time the PV drivers init, because in a PV boot xenbus gets initted very 
early.  We need to change it so that it can cope with drivers being 
initialized and registering with xenbus before it has been initialized.

We have the same problem with plain PV-on-HVM drivers as xenbus only 
comes up as a result of probing the Xen PCI platform device, which may 
be after the PV drivers' init functions have been called.

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 3/7] xen/hvm: Xen PV extension of  HVM initialization
  2010-03-02 20:17           ` Jeremy Fitzhardinge
@ 2010-03-03 11:35             ` Stefano Stabellini
  -1 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2010-03-03 11:35 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Ian Campbell, Sheng Yang, Keir Fraser, Jeremy Fitzhardinge,
	Ian Pratt, xen-devel, Stefano Stabellini, Yaozu (Eddie) Dong,
	linux-kernel

On Tue, 2 Mar 2010, Jeremy Fitzhardinge wrote:
> On 03/02/2010 01:22 AM, Ian Campbell wrote:
> > On Tue, 2010-03-02 at 01:38 +0000, Sheng Yang wrote:
> >    
> >> A annoy thing in pv drivers is that it would test if the domain type
> >> is _NOT_ XEN_NATIVE. So set the domain to XEN_HVM_DOMAIN would result
> >> in PV driver initialization then probably panic.
> >>      
> > What _actually_ panics?
> >
> > Registration of the frontend devices should be completely harmless
> > (apart from a little wasted RAM) unless a xenbus driver manages to come
> > up and enumerate the xen bus and cause the ->probe function run.
> >
> > You should be gating the xenbus startup on the availability of PV
> > functionality not the individual driver registrations. This keeps the
> > test in a single easy to maintain place.
> >
> > Compare with pci_register_driver and all of the callers of that function
> > -- not a single one of them has an "is_pci_available" test anywhere.
> >    
> 
> The problem is that it currently assumes xenbus is initialized by the 
> time the PV drivers init, because in a PV boot xenbus gets initted very 
> early.  We need to change it so that it can cope with drivers being 
> initialized and registering with xenbus before it has been initialized.
> 
> We have the same problem with plain PV-on-HVM drivers as xenbus only 
> comes up as a result of probing the Xen PCI platform device, which may 
> be after the PV drivers' init functions have been called.

Give a look at the fourth patch of the series I just posted: it
introduces a simple driver for the Xen PCI platform device to initialize
xenbus and gran table later on when running in a HVM domain, at the same
time leaving the PV case as it is.



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 3/7] xen/hvm: Xen PV extension of  HVM initialization
@ 2010-03-03 11:35             ` Stefano Stabellini
  0 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2010-03-03 11:35 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: Fitzhardinge

On Tue, 2 Mar 2010, Jeremy Fitzhardinge wrote:
> On 03/02/2010 01:22 AM, Ian Campbell wrote:
> > On Tue, 2010-03-02 at 01:38 +0000, Sheng Yang wrote:
> >    
> >> A annoy thing in pv drivers is that it would test if the domain type
> >> is _NOT_ XEN_NATIVE. So set the domain to XEN_HVM_DOMAIN would result
> >> in PV driver initialization then probably panic.
> >>      
> > What _actually_ panics?
> >
> > Registration of the frontend devices should be completely harmless
> > (apart from a little wasted RAM) unless a xenbus driver manages to come
> > up and enumerate the xen bus and cause the ->probe function run.
> >
> > You should be gating the xenbus startup on the availability of PV
> > functionality not the individual driver registrations. This keeps the
> > test in a single easy to maintain place.
> >
> > Compare with pci_register_driver and all of the callers of that function
> > -- not a single one of them has an "is_pci_available" test anywhere.
> >    
> 
> The problem is that it currently assumes xenbus is initialized by the 
> time the PV drivers init, because in a PV boot xenbus gets initted very 
> early.  We need to change it so that it can cope with drivers being 
> initialized and registering with xenbus before it has been initialized.
> 
> We have the same problem with plain PV-on-HVM drivers as xenbus only 
> comes up as a result of probing the Xen PCI platform device, which may 
> be after the PV drivers' init functions have been called.

Give a look at the fourth patch of the series I just posted: it
introduces a simple driver for the Xen PCI platform device to initialize
xenbus and gran table later on when running in a HVM domain, at the same
time leaving the PV case as it is.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 3/7] xen/hvm: Xen PV extension of  HVM initialization
  2010-03-03 11:35             ` Stefano Stabellini
  (?)
@ 2010-03-03 17:35             ` Jeremy Fitzhardinge
  2010-03-03 17:41                 ` Stefano Stabellini
  -1 siblings, 1 reply; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-03 17:35 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Ian Campbell, Sheng Yang, Keir Fraser, Jeremy Fitzhardinge,
	Ian Pratt, xen-devel, Yaozu (Eddie) Dong, linux-kernel

On 03/03/2010 03:35 AM, Stefano Stabellini wrote:
> Give a look at the fourth patch of the series I just posted: it
> introduces a simple driver for the Xen PCI platform device to initialize
> xenbus and gran table later on when running in a HVM domain, at the same
> time leaving the PV case as it is.
>    

Yes, I noticed that.  Do the PV drivers cope with having the PCI 
probe/xenbus setup after their init functions have been called, or do 
you make sure the PCI probe happens early?

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 3/7] xen/hvm: Xen PV extension of  HVM initialization
  2010-03-03 17:35             ` [Xen-devel] " Jeremy Fitzhardinge
@ 2010-03-03 17:41                 ` Stefano Stabellini
  0 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2010-03-03 17:41 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Stefano Stabellini, Ian Campbell, Sheng Yang, Keir Fraser,
	Jeremy Fitzhardinge, Ian Pratt, xen-devel, Yaozu (Eddie) Dong,
	linux-kernel

On Wed, 3 Mar 2010, Jeremy Fitzhardinge wrote:
> On 03/03/2010 03:35 AM, Stefano Stabellini wrote:
> > Give a look at the fourth patch of the series I just posted: it
> > introduces a simple driver for the Xen PCI platform device to initialize
> > xenbus and gran table later on when running in a HVM domain, at the same
> > time leaving the PV case as it is.
> >    
> 
> Yes, I noticed that.  Do the PV drivers cope with having the PCI 
> probe/xenbus setup after their init functions have been called, or do 
> you make sure the PCI probe happens early?
> 

Nope, they seem to cope fine so far.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 3/7] xen/hvm: Xen PV extension of  HVM initialization
@ 2010-03-03 17:41                 ` Stefano Stabellini
  0 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2010-03-03 17:41 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Jeremy Fitzhardinge, Stefano Stabellini, Yaozu (Eddie) Dong,
	linux-kernel, Ian Campbell, Ian Pratt, xen-devel, Keir Fraser,
	Sheng Yang

On Wed, 3 Mar 2010, Jeremy Fitzhardinge wrote:
> On 03/03/2010 03:35 AM, Stefano Stabellini wrote:
> > Give a look at the fourth patch of the series I just posted: it
> > introduces a simple driver for the Xen PCI platform device to initialize
> > xenbus and gran table later on when running in a HVM domain, at the same
> > time leaving the PV case as it is.
> >    
> 
> Yes, I noticed that.  Do the PV drivers cope with having the PCI 
> probe/xenbus setup after their init functions have been called, or do 
> you make sure the PCI probe happens early?
> 

Nope, they seem to cope fine so far.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 5/7] xen: Make event channel work with PV extension of HVM
  2010-03-02  5:48     ` Sheng Yang
@ 2010-03-03 18:31       ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-03 18:31 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, linux-kernel,
	xen-devel, Ian Campbell, Stefano Stabellini

On 03/01/2010 09:48 PM, Sheng Yang wrote:
>> Presumably even if we don't have PV_EVTCHN available/enabled, the Xen
>> clocksource would be available for getting time?
>>      
> I think currently Xen pv clocksource and clockevent are binding... Not sure if
> a single line "clocksource_register(&xen_clocksource)" can work. I would give
> it a try, maybe add a new PV feature.
>    

There should be no strong binding between them, but there may be some 
sloppy assumptions in xen/time.c which should be fixed.  Linux itself 
treats clocksources and eventsources as completely distinct entities, 
and doesn't assume they're running on the same timebase (for example).

Having a PV clocksource even if the timer interrupts are emulated would 
make sense and be useful.

>>>    	xen_setup_vcpu_info_placement();
>>>    }
>>> @@ -480,3 +487,138 @@ void __init xen_smp_init(void)
>>>    	xen_fill_possible_map();
>>>    	xen_init_spinlocks();
>>>    }
>>> +
>>> +static __cpuinit void xen_hvm_pv_start_secondary(void)
>>> +{
>>> +	int cpu = smp_processor_id();
>>> +
>>> +	cpu_init();
>>> +	touch_nmi_watchdog();
>>> +	preempt_disable();
>>> +
>>> +	/* otherwise gcc will move up smp_processor_id before the cpu_init */
>>> +	barrier();
>>> +	/*
>>> +	 * Check TSC synchronization with the BSP:
>>> +	 */
>>> +	check_tsc_sync_target();
>>> +
>>> +	/* Done in smp_callin(), move it here */
>>> +	set_mtrr_aps_delayed_init();
>>> +	smp_store_cpu_info(cpu);
>>> +
>>> +	/* This must be done before setting cpu_online_mask */
>>> +	set_cpu_sibling_map(cpu);
>>> +	wmb();
>>> +
>>> +	set_cpu_online(smp_processor_id(), true);
>>> +	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
>>> +
>>> +	/* enable local interrupts */
>>> +	local_irq_enable();
>>> +
>>> +	xen_setup_cpu_clockevents();
>>>        
>> How much of this is necessary?  At this point, isn't CPU bringup the
>> same as PV?
>>      
> Xen_enable_sysenter/syscall is not needed for this. And we have a TSC sync
> here - but it seems unnecessary for PV. But set_mtrr_aps_delayed_init() is
> needed. Reuse the cpu_bring_up() is fine?
>    

Doesn't Xen arrange for the tscs to be synced anyway?

>> Is the MMUEXT_TLB_FLUSH/INVLPG_MULTI hypercall not currently available
>> to HVM?
>>      
> I think they are different. These hypercalls flushed native's TLB, but HVM
> want to flush guest one, especially when using shadow, HVM need do something
> for it.
>    

I see.

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 5/7] xen: Make event channel work with PV extension of HVM
  2010-03-02  1:38   ` [Xen-devel] " Jeremy Fitzhardinge
@ 2010-03-04  5:37       ` Sheng Yang
  2010-03-04  5:37       ` Sheng Yang
  1 sibling, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-04  5:37 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Keir Fraser, Jeremy Fitzhardinge, Ian Pratt, linux-kernel,
	xen-devel, Ian Campbell, Stefano Stabellini

On Tuesday 02 March 2010 09:38:21 Jeremy Fitzhardinge wrote:
> On 03/01/2010 01:38 AM, Sheng Yang wrote:
> > +
> > +		x86_platform.calibrate_tsc = xen_tsc_khz;
> > +		x86_platform.get_wallclock = xen_get_wallclock;
> > +		x86_platform.set_wallclock = xen_set_wallclock;
> > +
> > +		pv_apic_ops = xen_apic_ops;
> > +#ifdef CONFIG_X86_LOCAL_APIC
> > +		/*
> > +		 * set up the basic apic ops.
> > +		 */
> > +		set_xen_basic_apic_ops();
> > +		apic->write = xen_hvm_pv_evtchn_apic_write;
> 
> I'd just change the xen_apic_write to remove the WARN_ON, since you
> don't seem to care about it either.

So which code base I should make these patches against? We expect the patchset 
can be accepted in the Linux upstream soon after you pick it up.
> 
> >
> > -	exit_idle();
> > -	irq_enter();
> > +	/*
> > +	 * If is PV featured HVM, these have already been done
> > +	 */
> > +	if (likely(!xen_hvm_pv_evtchn_enabled())) {
> > +		exit_idle();
> > +		irq_enter();
> > +	}
> 
> In that case, rather than putting this conditional in the hot path, make
> an inner __xen_evtchn_do_upcall which is wrapped by the PV and HVM
> variants which do the appropriate things.  (And drop the pt_regs arg, I
> think.)

Seems we still need pt_regs to for handle_irq()?

-- 
regards
Yang, Sheng

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 5/7] xen: Make event channel work with PV extension of HVM
@ 2010-03-04  5:37       ` Sheng Yang
  0 siblings, 0 replies; 53+ messages in thread
From: Sheng Yang @ 2010-03-04  5:37 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: xen-devel, Ian Campbell, Stefano Stabellini, linux-kernel,
	Ian Pratt, Jeremy Fitzhardinge, Keir Fraser

On Tuesday 02 March 2010 09:38:21 Jeremy Fitzhardinge wrote:
> On 03/01/2010 01:38 AM, Sheng Yang wrote:
> > +
> > +		x86_platform.calibrate_tsc = xen_tsc_khz;
> > +		x86_platform.get_wallclock = xen_get_wallclock;
> > +		x86_platform.set_wallclock = xen_set_wallclock;
> > +
> > +		pv_apic_ops = xen_apic_ops;
> > +#ifdef CONFIG_X86_LOCAL_APIC
> > +		/*
> > +		 * set up the basic apic ops.
> > +		 */
> > +		set_xen_basic_apic_ops();
> > +		apic->write = xen_hvm_pv_evtchn_apic_write;
> 
> I'd just change the xen_apic_write to remove the WARN_ON, since you
> don't seem to care about it either.

So which code base I should make these patches against? We expect the patchset 
can be accepted in the Linux upstream soon after you pick it up.
> 
> >
> > -	exit_idle();
> > -	irq_enter();
> > +	/*
> > +	 * If is PV featured HVM, these have already been done
> > +	 */
> > +	if (likely(!xen_hvm_pv_evtchn_enabled())) {
> > +		exit_idle();
> > +		irq_enter();
> > +	}
> 
> In that case, rather than putting this conditional in the hot path, make
> an inner __xen_evtchn_do_upcall which is wrapped by the PV and HVM
> variants which do the appropriate things.  (And drop the pt_regs arg, I
> think.)

Seems we still need pt_regs to for handle_irq()?

-- 
regards
Yang, Sheng

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 3/7] xen/hvm: Xen PV extension of  HVM initialization
  2010-03-03 17:41                 ` Stefano Stabellini
@ 2010-03-04 10:18                   ` Ian Campbell
  -1 siblings, 0 replies; 53+ messages in thread
From: Ian Campbell @ 2010-03-04 10:18 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jeremy Fitzhardinge, Sheng Yang, Keir Fraser,
	Jeremy Fitzhardinge, Ian Pratt, xen-devel, Yaozu (Eddie) Dong,
	linux-kernel

On Wed, 2010-03-03 at 17:41 +0000, Stefano Stabellini wrote:
> On Wed, 3 Mar 2010, Jeremy Fitzhardinge wrote:
> > On 03/03/2010 03:35 AM, Stefano Stabellini wrote:
> > > Give a look at the fourth patch of the series I just posted: it
> > > introduces a simple driver for the Xen PCI platform device to initialize
> > > xenbus and gran table later on when running in a HVM domain, at the same
> > > time leaving the PV case as it is.
> > >    
> > 
> > Yes, I noticed that.  Do the PV drivers cope with having the PCI 
> > probe/xenbus setup after their init functions have been called, or do 
> > you make sure the PCI probe happens early?
> > 
> 
> Nope, they seem to cope fine so far.

xenbus_frontend_register is basically just driver_register so it should
be safe to call even before the register_bus(xenbusfoo) call. Things
shouldn't kick off until xen_probe() is called which either happens
explicitly in xenbus_probe_init() or via probe_work which contains a
xenstored_ready check.

Ian.


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 3/7] xen/hvm: Xen PV extension of  HVM initialization
@ 2010-03-04 10:18                   ` Ian Campbell
  0 siblings, 0 replies; 53+ messages in thread
From: Ian Campbell @ 2010-03-04 10:18 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Jeremy Fitzhardinge, xen-devel, Yaozu (Eddie) Dong, linux-kernel,
	Ian Pratt, Jeremy Fitzhardinge, Keir Fraser, Sheng Yang

On Wed, 2010-03-03 at 17:41 +0000, Stefano Stabellini wrote:
> On Wed, 3 Mar 2010, Jeremy Fitzhardinge wrote:
> > On 03/03/2010 03:35 AM, Stefano Stabellini wrote:
> > > Give a look at the fourth patch of the series I just posted: it
> > > introduces a simple driver for the Xen PCI platform device to initialize
> > > xenbus and gran table later on when running in a HVM domain, at the same
> > > time leaving the PV case as it is.
> > >    
> > 
> > Yes, I noticed that.  Do the PV drivers cope with having the PCI 
> > probe/xenbus setup after their init functions have been called, or do 
> > you make sure the PCI probe happens early?
> > 
> 
> Nope, they seem to cope fine so far.

xenbus_frontend_register is basically just driver_register so it should
be safe to call even before the register_bus(xenbusfoo) call. Things
shouldn't kick off until xen_probe() is called which either happens
explicitly in xenbus_probe_init() or via probe_work which contains a
xenstored_ready check.

Ian.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 5/7] xen: Make event channel work with PV extension of HVM
  2010-03-04  5:37       ` Sheng Yang
@ 2010-03-04 11:58         ` Stefano Stabellini
  -1 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2010-03-04 11:58 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Jeremy Fitzhardinge, Keir Fraser, Jeremy Fitzhardinge, Ian Pratt,
	linux-kernel, xen-devel, Ian Campbell, Stefano Stabellini

On Thu, 4 Mar 2010, Sheng Yang wrote:
> On Tuesday 02 March 2010 09:38:21 Jeremy Fitzhardinge wrote:
> > On 03/01/2010 01:38 AM, Sheng Yang wrote:
> > > +
> > > +		x86_platform.calibrate_tsc = xen_tsc_khz;
> > > +		x86_platform.get_wallclock = xen_get_wallclock;
> > > +		x86_platform.set_wallclock = xen_set_wallclock;
> > > +
> > > +		pv_apic_ops = xen_apic_ops;
> > > +#ifdef CONFIG_X86_LOCAL_APIC
> > > +		/*
> > > +		 * set up the basic apic ops.
> > > +		 */
> > > +		set_xen_basic_apic_ops();
> > > +		apic->write = xen_hvm_pv_evtchn_apic_write;
> > 
> > I'd just change the xen_apic_write to remove the WARN_ON, since you
> > don't seem to care about it either.
> 
> So which code base I should make these patches against? We expect the patchset 
> can be accepted in the Linux upstream soon after you pick it up.

I think that once we agree on which approach we take we should work
together on this.

Jeremy, can we have our own temporary branch so that we can work
together on a common code base to finish this patch series?


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 5/7] xen: Make event channel work with PV extension of HVM
@ 2010-03-04 11:58         ` Stefano Stabellini
  0 siblings, 0 replies; 53+ messages in thread
From: Stefano Stabellini @ 2010-03-04 11:58 UTC (permalink / raw)
  To: Sheng Yang
  Cc: Jeremy Fitzhardinge, xen-devel, Stefano Stabellini, linux-kernel,
	Ian Campbell, Ian Pratt, Jeremy Fitzhardinge, Keir Fraser

On Thu, 4 Mar 2010, Sheng Yang wrote:
> On Tuesday 02 March 2010 09:38:21 Jeremy Fitzhardinge wrote:
> > On 03/01/2010 01:38 AM, Sheng Yang wrote:
> > > +
> > > +		x86_platform.calibrate_tsc = xen_tsc_khz;
> > > +		x86_platform.get_wallclock = xen_get_wallclock;
> > > +		x86_platform.set_wallclock = xen_set_wallclock;
> > > +
> > > +		pv_apic_ops = xen_apic_ops;
> > > +#ifdef CONFIG_X86_LOCAL_APIC
> > > +		/*
> > > +		 * set up the basic apic ops.
> > > +		 */
> > > +		set_xen_basic_apic_ops();
> > > +		apic->write = xen_hvm_pv_evtchn_apic_write;
> > 
> > I'd just change the xen_apic_write to remove the WARN_ON, since you
> > don't seem to care about it either.
> 
> So which code base I should make these patches against? We expect the patchset 
> can be accepted in the Linux upstream soon after you pick it up.

I think that once we agree on which approach we take we should work
together on this.

Jeremy, can we have our own temporary branch so that we can work
together on a common code base to finish this patch series?

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Xen-devel] [PATCH 5/7] xen: Make event channel work with PV extension of HVM
  2010-03-04 11:58         ` Stefano Stabellini
@ 2010-03-08 22:31           ` Jeremy Fitzhardinge
  -1 siblings, 0 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-08 22:31 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Sheng Yang, Keir Fraser, Jeremy Fitzhardinge, Ian Pratt,
	linux-kernel, xen-devel, Ian Campbell

On 03/04/2010 03:58 AM, Stefano Stabellini wrote:
> Jeremy, can we have our own temporary branch so that we can work
> together on a common code base to finish this patch series?
>    

I'm not sure quite what you mean, but I've committed Sheng's work to 
xen/pvhvm-sheng for now, and I'll commit your patches to a similar 
branch once I know where to apply them.

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH 5/7] xen: Make event channel work with PV extension of HVM
@ 2010-03-08 22:31           ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 53+ messages in thread
From: Jeremy Fitzhardinge @ 2010-03-08 22:31 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: xen-devel, linux-kernel, Ian Campbell, Ian Pratt,
	Jeremy Fitzhardinge, Keir Fraser, Sheng Yang

On 03/04/2010 03:58 AM, Stefano Stabellini wrote:
> Jeremy, can we have our own temporary branch so that we can work
> together on a common code base to finish this patch series?
>    

I'm not sure quite what you mean, but I've committed Sheng's work to 
xen/pvhvm-sheng for now, and I'll commit your patches to a similar 
branch once I know where to apply them.

     J

^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2010-03-08 22:32 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-01  9:38 [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Sheng Yang
2010-03-01  9:38 ` [PATCH 1/7] xen: add support for hvm_op Sheng Yang
2010-03-01  9:38   ` Sheng Yang
2010-03-01  9:38 ` [PATCH 2/7] xen: Import cpuid.h from Xen Sheng Yang
2010-03-01  9:38 ` [PATCH 3/7] xen/hvm: Xen PV extension of HVM initialization Sheng Yang
2010-03-02  1:02   ` [Xen-devel] " Jeremy Fitzhardinge
2010-03-02  1:02     ` Jeremy Fitzhardinge
2010-03-02  1:38     ` [Xen-devel] " Sheng Yang
2010-03-02  1:38       ` Sheng Yang
2010-03-02  1:43       ` Jeremy Fitzhardinge
2010-03-02  9:22       ` Ian Campbell
2010-03-02 20:17         ` Jeremy Fitzhardinge
2010-03-02 20:17           ` Jeremy Fitzhardinge
2010-03-03 11:35           ` [Xen-devel] " Stefano Stabellini
2010-03-03 11:35             ` Stefano Stabellini
2010-03-03 17:35             ` [Xen-devel] " Jeremy Fitzhardinge
2010-03-03 17:41               ` Stefano Stabellini
2010-03-03 17:41                 ` Stefano Stabellini
2010-03-04 10:18                 ` [Xen-devel] " Ian Campbell
2010-03-04 10:18                   ` Ian Campbell
2010-03-01  9:38 ` [PATCH 4/7] xen: The entrance for PV extension of HVM Sheng Yang
2010-03-02  1:05   ` [Xen-devel] " Jeremy Fitzhardinge
2010-03-02  1:41     ` Sheng Yang
2010-03-01  9:38 ` [PATCH 5/7] xen: Make event channel work with " Sheng Yang
2010-03-01  9:38   ` Sheng Yang
2010-03-02  1:38   ` [Xen-devel] " Jeremy Fitzhardinge
2010-03-02  5:48     ` Sheng Yang
2010-03-03 18:31       ` Jeremy Fitzhardinge
2010-03-04  5:37     ` Sheng Yang
2010-03-04  5:37       ` Sheng Yang
2010-03-04 11:58       ` [Xen-devel] " Stefano Stabellini
2010-03-04 11:58         ` Stefano Stabellini
2010-03-08 22:31         ` [Xen-devel] " Jeremy Fitzhardinge
2010-03-08 22:31           ` Jeremy Fitzhardinge
2010-03-01  9:38 ` [PATCH 6/7] xen: Unified checking for Xen of PV drivers to xenbus_register_frontend() Sheng Yang
2010-03-02  1:45   ` [Xen-devel] " Jeremy Fitzhardinge
2010-03-01  9:38 ` [PATCH 7/7] xen: Enable grant table and xenbus for PV extension of HVM Sheng Yang
2010-03-01 17:38   ` [LKML] " Konrad Rzeszutek Wilk
2010-03-01 17:38     ` Konrad Rzeszutek Wilk
2010-03-02  1:13     ` Sheng Yang
2010-03-02  1:13       ` Sheng Yang
2010-03-02  1:21     ` Sheng Yang
2010-03-02  1:21       ` Sheng Yang
2010-03-02 13:41       ` Konrad Rzeszutek Wilk
2010-03-02 14:09         ` Ian Campbell
2010-03-02 14:09           ` Ian Campbell
2010-03-02  0:42 ` [Xen-devel] [PATCH 0/7][v4] PV extension of HVM (Hybrid) for Xen Jeremy Fitzhardinge
2010-03-02  1:26   ` Sheng Yang
2010-03-02  1:32     ` Jeremy Fitzhardinge
2010-03-02  1:32       ` Jeremy Fitzhardinge
2010-03-02  1:34       ` [Xen-devel] " Sheng Yang
2010-03-02  3:20     ` Dong, Eddie
2010-03-02  3:20       ` Dong, Eddie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.