linux-kernel.vger.kernel.org archive mirror
* [GIT PULL] Xen APIC hooks (with io_apic_ops)
@ 2009-05-12 23:25 Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 01/17] xen/dom0: handle acpi lapic parsing in Xen dom0 Jeremy Fitzhardinge
                   ` (17 more replies)
  0 siblings, 18 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel

Hi Ingo,

Here's a revised set of the Xen APIC changes which adds io_apic_ops
to allow Xen to intercept IO APIC access operations.

Thanks,
	J

The following changes since commit ce791368bb4a53d05e78e1588bac0aacde8db84c:
  Jeremy Fitzhardinge (1):
        xen/i386: make sure initial VGA/ISA mappings are not overridden

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen.git for-ingo/xen/dom0/apic-ops

Gerd Hoffmann (2):
      xen: set pirq name to something useful.
      xen: fix legacy irq setup, make ioapic-less machines work.

Ian Campbell (1):
      xen: pre-initialize legacy irqs early

Jeremy Fitzhardinge (14):
      xen/dom0: handle acpi lapic parsing in Xen dom0
      x86: add io_apic_ops to allow interception
      xen: implement io_apic_ops
      xen: create dummy ioapic mapping
      xen: implement pirq type event channels
      x86/io_apic: add get_nr_irqs_gsi()
      xen/apic: identity map gsi->irqs
      xen: direct irq registration to pirq event channels
      xen: bind pirq to vector and event channel
      xen: don't setup acpi interrupt unless there is one
      xen: use acpi_get_override_irq() to get triggering for legacy irqs
      xen: initialize irq 0 too
      xen: dynamically allocate irq & event structures
      xen: disable MSI

 arch/x86/include/asm/io_apic.h |   10 ++
 arch/x86/include/asm/xen/pci.h |   13 ++
 arch/x86/kernel/acpi/boot.c    |   18 +++-
 arch/x86/kernel/apic/io_apic.c |   55 ++++++++-
 arch/x86/xen/Kconfig           |   11 ++
 arch/x86/xen/Makefile          |    3 +-
 arch/x86/xen/apic.c            |   69 ++++++++++
 arch/x86/xen/enlighten.c       |    2 +
 arch/x86/xen/mmu.c             |   10 ++
 arch/x86/xen/pci.c             |   86 +++++++++++++
 arch/x86/xen/xen-ops.h         |    6 +
 drivers/pci/pci.h              |    2 -
 drivers/xen/events.c           |  273 ++++++++++++++++++++++++++++++++++++++--
 include/linux/pci.h            |    6 +
 include/xen/events.h           |   19 +++
 15 files changed, 568 insertions(+), 15 deletions(-)
 create mode 100644 arch/x86/include/asm/xen/pci.h
 create mode 100644 arch/x86/xen/apic.c
 create mode 100644 arch/x86/xen/pci.c



* [PATCH 01/17] xen/dom0: handle acpi lapic parsing in Xen dom0
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 02/17] x86: add io_apic_ops to allow interception Jeremy Fitzhardinge
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge, Jeremy Fitzhardinge

When running in Xen dom0, we still want to parse the ACPI tables to
find out about local and IO apics, but we don't want to actually use
the lapics.

Add a couple of Xen checks to prevent lapics from being mapped or
accessed.  This is very Xen-specific behaviour, so there didn't seem to
be any point in adding more indirection.

[ Impact: ignore local apics, which are not usable under Xen ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reviewed-by: "H. Peter Anvin" <hpa@zytor.com>
---
 arch/x86/kernel/acpi/boot.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 723989d..4147e0c 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -41,6 +41,8 @@
 #include <asm/mpspec.h>
 #include <asm/smp.h>
 
+#include <asm/xen/hypervisor.h>
+
 static int __initdata acpi_force = 0;
 u32 acpi_rsdt_forced;
 #ifdef	CONFIG_ACPI
@@ -218,6 +220,10 @@ static void __cpuinit acpi_register_lapic(int id, u8 enabled)
 {
 	unsigned int ver = 0;
 
+	/* We don't want to register lapics when in Xen dom0 */
+	if (xen_initial_domain())
+		return;
+
 	if (!enabled) {
 		++disabled_cpus;
 		return;
@@ -802,6 +808,10 @@ static int __init acpi_parse_fadt(struct acpi_table_header *table)
 
 static void __init acpi_register_lapic_address(unsigned long address)
 {
+	/* Xen dom0 doesn't have usable lapics */
+	if (xen_initial_domain())
+		return;
+
 	mp_lapic_addr = address;
 
 	set_fixmap_nocache(FIX_APIC_BASE, address);
-- 
1.6.0.6



* [PATCH 02/17] x86: add io_apic_ops to allow interception
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 01/17] xen/dom0: handle acpi lapic parsing in Xen dom0 Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-25  3:54   ` Ingo Molnar
  2009-05-12 23:25 ` [PATCH 03/17] xen: implement io_apic_ops Jeremy Fitzhardinge
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge

From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

Xen dom0 needs to paravirtualize IO operations to the IO APIC, so add
an io_apic_ops structure that lets it intercept them.  Do this as an
ops structure because there's at least some chance that another
paravirtualized environment may want to intercept these.
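
As a rough userspace illustration of the indirection described above,
the sketch below routes every access through an ops structure whose
default entries are the native accessors, and lets an interceptor
replace them by copying a struct over the defaults.  All sketch_*,
native_* and pv_* names, and the register array standing in for
hardware, are assumptions made for the example; they are not kernel
symbols.

```c
/* Same shape as struct io_apic_ops in the patch; init and modify omitted. */
struct sketch_ioapic_ops {
	unsigned int (*read)(unsigned int apic, unsigned int reg);
	void (*write)(unsigned int apic, unsigned int reg, unsigned int value);
};

/* Hypothetical register file standing in for real IO APIC hardware. */
static unsigned int sketch_regs[2][64];
/* Counts accesses that went through the interceptor. */
static unsigned int sketch_intercepted;

static unsigned int native_read(unsigned int apic, unsigned int reg)
{
	return sketch_regs[apic][reg];
}

static void native_write(unsigned int apic, unsigned int reg,
			 unsigned int value)
{
	sketch_regs[apic][reg] = value;
}

/* The defaults point at the native accessors. */
static struct sketch_ioapic_ops sketch_ops = {
	.read = native_read,
	.write = native_write,
};

/* Copy by value, so the caller's structure can be discarded afterwards. */
static void sketch_set_ops(const struct sketch_ioapic_ops *ops)
{
	sketch_ops = *ops;
}

/* Callers only ever use these wrappers, never the accessors directly. */
static unsigned int sketch_io_apic_read(unsigned int apic, unsigned int reg)
{
	return sketch_ops.read(apic, reg);
}

static void sketch_io_apic_write(unsigned int apic, unsigned int reg,
				 unsigned int value)
{
	sketch_ops.write(apic, reg, value);
}

/* A hypothetical interceptor: count the access, then forward it. */
static unsigned int pv_read(unsigned int apic, unsigned int reg)
{
	sketch_intercepted++;
	return native_read(apic, reg);
}

static void pv_write(unsigned int apic, unsigned int reg, unsigned int value)
{
	sketch_intercepted++;
	native_write(apic, reg, value);
}

static const struct sketch_ioapic_ops pv_ops = {
	.read = pv_read,
	.write = pv_write,
};
```

Calling sketch_set_ops(&pv_ops) once at startup is enough to redirect
every later sketch_io_apic_read()/sketch_io_apic_write() call; the
callers themselves do not change.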

[Impact: indirect IO APIC access via io_apic_ops]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/include/asm/io_apic.h |    9 +++++++
 arch/x86/kernel/apic/io_apic.c |   50 +++++++++++++++++++++++++++++++++++++--
 2 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index 9d826e4..8cbfe73 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -21,6 +21,15 @@
 #define IO_APIC_REDIR_LEVEL_TRIGGER	(1 << 15)
 #define IO_APIC_REDIR_MASKED		(1 << 16)
 
+struct io_apic_ops {
+	void (*init)(void);
+	unsigned int (*read)(unsigned int apic, unsigned int reg);
+	void (*write)(unsigned int apic, unsigned int reg, unsigned int value);
+	void (*modify)(unsigned int apic, unsigned int reg, unsigned int value);
+};
+
+void __init set_io_apic_ops(const struct io_apic_ops *);
+
 /*
  * The structure of the IO-APIC:
  */
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 30da617..c24f116 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -66,6 +66,25 @@
 
 #define __apicdebuginit(type) static type __init
 
+static void __init __ioapic_init_mappings(void);
+static unsigned int __io_apic_read(unsigned int apic, unsigned int reg);
+static void __io_apic_write(unsigned int apic, unsigned int reg,
+			    unsigned int val);
+static void __io_apic_modify(unsigned int apic, unsigned int reg,
+			     unsigned int val);
+
+static struct io_apic_ops io_apic_ops = {
+	.init = __ioapic_init_mappings,
+	.read = __io_apic_read,
+	.write = __io_apic_write,
+	.modify = __io_apic_modify,
+};
+
+void __init set_io_apic_ops(const struct io_apic_ops *ops)
+{
+	io_apic_ops = *ops;
+}
+
 /*
  *      Is the SiS APIC rmw bug present ?
  *      -1 = don't know, 0 = no, 1 = yes
@@ -385,6 +404,24 @@ set_extra_move_desc(struct irq_desc *desc, const struct cpumask *mask)
 }
 #endif
 
+static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg)
+{
+	return io_apic_ops.read(apic, reg);
+}
+
+static inline void io_apic_write(unsigned int apic, unsigned int reg,
+				 unsigned int value)
+{
+	io_apic_ops.write(apic, reg, value);
+}
+
+static inline void io_apic_modify(unsigned int apic, unsigned int reg,
+				  unsigned int value)
+{
+	io_apic_ops.modify(apic, reg, value);
+}
+
+
 struct io_apic {
 	unsigned int index;
 	unsigned int unused[3];
@@ -405,14 +442,15 @@ static inline void io_apic_eoi(unsigned int apic, unsigned int vector)
 	writel(vector, &io_apic->eoi);
 }
 
-static inline unsigned int io_apic_read(unsigned int apic, unsigned int reg)
+static unsigned int __io_apic_read(unsigned int apic, unsigned int reg)
 {
 	struct io_apic __iomem *io_apic = io_apic_base(apic);
 	writel(reg, &io_apic->index);
 	return readl(&io_apic->data);
 }
 
-static inline void io_apic_write(unsigned int apic, unsigned int reg, unsigned int value)
+static void __io_apic_write(unsigned int apic, unsigned int reg,
+			    unsigned int value)
 {
 	struct io_apic __iomem *io_apic = io_apic_base(apic);
 	writel(reg, &io_apic->index);
@@ -425,7 +463,8 @@ static inline void io_apic_write(unsigned int apic, unsigned int reg, unsigned i
  *
  * Older SiS APIC requires we rewrite the index register
  */
-static inline void io_apic_modify(unsigned int apic, unsigned int reg, unsigned int value)
+static void __io_apic_modify(unsigned int apic, unsigned int reg,
+			     unsigned int value)
 {
 	struct io_apic __iomem *io_apic = io_apic_base(apic);
 
@@ -4141,6 +4180,11 @@ static struct resource * __init ioapic_setup_resources(void)
 
 void __init ioapic_init_mappings(void)
 {
+	io_apic_ops.init();
+}
+
+static void __init __ioapic_init_mappings(void)
+{
 	unsigned long ioapic_phys, idx = FIX_IO_APIC_BASE_0;
 	struct resource *ioapic_res;
 	int i;
-- 
1.6.0.6



* [PATCH 03/17] xen: implement io_apic_ops
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 01/17] xen/dom0: handle acpi lapic parsing in Xen dom0 Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 02/17] x86: add io_apic_ops to allow interception Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 04/17] xen: create dummy ioapic mapping Jeremy Fitzhardinge
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge, Jeremy Fitzhardinge

IO APIC reads and writes are paravirtualized via hypercalls, so
implement the appropriate operations.
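
The request/response shape of such an access can be sketched in
userspace as follows.  The structure fields mirror the ones used in
the patch (apic_physbase, reg, value), but sketch_physdev_op() is a
made-up stand-in for the hypercall, the base address and register
array are assumptions, and the sketch returns an error code where the
patch calls BUG().

```c
#include <stdint.h>

/* Fields follow the usage in the patch; the type itself is a stand-in. */
struct sketch_physdev_apic {
	unsigned long apic_physbase;
	uint32_t reg;
	uint32_t value;
};

enum sketch_op { SKETCH_APIC_READ, SKETCH_APIC_WRITE };

/* Hypothetical physical base of the only IO APIC the fake hypervisor knows. */
#define SKETCH_IOAPIC_BASE	0xfec00000UL
#define SKETCH_NR_REGS		64

static uint32_t sketch_hv_regs[SKETCH_NR_REGS];

/* Fake hypercall: 0 on success, -1 for an unknown base or register. */
static int sketch_physdev_op(enum sketch_op op, struct sketch_physdev_apic *a)
{
	if (a->apic_physbase != SKETCH_IOAPIC_BASE || a->reg >= SKETCH_NR_REGS)
		return -1;

	if (op == SKETCH_APIC_READ)
		a->value = sketch_hv_regs[a->reg];
	else
		sketch_hv_regs[a->reg] = a->value;
	return 0;
}

/* Stand-in for the mp_ioapics[] lookup: apic index -> physical base. */
static const unsigned long sketch_apic_addr[] = { SKETCH_IOAPIC_BASE };

/* Only apic index 0 is valid in this sketch. */
static int sketch_xen_read(unsigned int apic, unsigned int reg, uint32_t *out)
{
	struct sketch_physdev_apic op;

	op.apic_physbase = sketch_apic_addr[apic];
	op.reg = reg;
	op.value = 0;
	if (sketch_physdev_op(SKETCH_APIC_READ, &op))
		return -1;
	*out = op.value;
	return 0;
}

static int sketch_xen_write(unsigned int apic, unsigned int reg,
			    uint32_t value)
{
	struct sketch_physdev_apic op;

	op.apic_physbase = sketch_apic_addr[apic];
	op.reg = reg;
	op.value = value;
	return sketch_physdev_op(SKETCH_APIC_WRITE, &op);
}
```

The point of the sketch is that the guest never touches the register
window itself: it names the APIC by physical base, and the other side
performs the access and reports success or failure.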

[ Impact: implement Xen interface for io_apic_ops ]
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/Makefile    |    2 +-
 arch/x86/xen/apic.c      |   64 ++++++++++++++++++++++++++++++++++++++++++++++
 arch/x86/xen/enlighten.c |    2 +
 arch/x86/xen/xen-ops.h   |    6 ++++
 4 files changed, 73 insertions(+), 1 deletions(-)
 create mode 100644 arch/x86/xen/apic.c

diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index c4cda96..73ecb74 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -11,4 +11,4 @@ obj-y		:= enlighten.o setup.o multicalls.o mmu.o irq.o \
 
 obj-$(CONFIG_SMP)		+= smp.o spinlock.o
 obj-$(CONFIG_XEN_DEBUG_FS)	+= debugfs.o
-obj-$(CONFIG_XEN_DOM0)		+= vga.o
+obj-$(CONFIG_XEN_DOM0)		+= vga.o apic.o
diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c
new file mode 100644
index 0000000..8ae563c
--- /dev/null
+++ b/arch/x86/xen/apic.c
@@ -0,0 +1,64 @@
+#include <linux/kernel.h>
+#include <linux/threads.h>
+#include <linux/bitmap.h>
+
+#include <asm/io_apic.h>
+#include <asm/acpi.h>
+
+#include <asm/xen/hypervisor.h>
+#include <asm/xen/hypercall.h>
+
+#include <xen/interface/xen.h>
+#include <xen/interface/physdev.h>
+
+static void __init xen_io_apic_init(void)
+{
+}
+
+static unsigned int xen_io_apic_read(unsigned apic, unsigned reg)
+{
+	struct physdev_apic apic_op;
+	int ret;
+
+	apic_op.apic_physbase = mp_ioapics[apic].apicaddr;
+	apic_op.reg = reg;
+	ret = HYPERVISOR_physdev_op(PHYSDEVOP_apic_read, &apic_op);
+	if (ret)
+		BUG();
+	return apic_op.value;
+}
+
+
+static void xen_io_apic_write(unsigned int apic, unsigned int reg, unsigned int value)
+{
+	struct physdev_apic apic_op;
+
+	apic_op.apic_physbase = mp_ioapics[apic].apicaddr;
+	apic_op.reg = reg;
+	apic_op.value = value;
+	if (HYPERVISOR_physdev_op(PHYSDEVOP_apic_write, &apic_op))
+		BUG();
+}
+
+static struct io_apic_ops __initdata xen_ioapic_ops = {
+	.init = xen_io_apic_init,
+	.read = xen_io_apic_read,
+	.write = xen_io_apic_write,
+	.modify = xen_io_apic_write,
+};
+
+void xen_init_apic(void)
+{
+	if (!xen_initial_domain())
+		return;
+
+	set_io_apic_ops(&xen_ioapic_ops);
+
+#ifdef CONFIG_ACPI
+	/*
+	 * Pretend ACPI found our lapic even though we've disabled it,
+	 * to prevent MP tables from setting up lapics.
+	 */
+	acpi_lapic = 1;
+#endif
+}
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 12e4d9c..3a4932a 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1085,6 +1085,8 @@ asmlinkage void __init xen_start_kernel(void)
 		set_iopl.iopl = 1;
 		if (HYPERVISOR_physdev_op(PHYSDEVOP_set_iopl, &set_iopl) == -1)
 			BUG();
+
+		xen_init_apic();
 	}
 
 	/* set the limit of our address space */
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 40abcef..0853949 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -76,13 +76,19 @@ struct dom0_vga_console_info;
 
 #ifdef CONFIG_XEN_DOM0
 void xen_init_vga(const struct dom0_vga_console_info *, size_t size);
+void xen_init_apic(void);
 #else
 static inline void xen_init_vga(const struct dom0_vga_console_info *info,
 				size_t size)
 {
 }
+
+static inline void xen_init_apic(void)
+{
+}
 #endif
 
+
 /* Declare an asm function, along with symbols needed to make it
    inlineable */
 #define DECL_ASM(ret, name, ...)		\
-- 
1.6.0.6



* [PATCH 04/17] xen: create dummy ioapic mapping
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (2 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 03/17] xen: implement io_apic_ops Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 05/17] xen: implement pirq type event channels Jeremy Fitzhardinge
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge, Jeremy Fitzhardinge

We don't allow direct access to the IO APIC, so make sure that any
request to map it just "maps" non-present pages.  We should see any
attempts at direct access explode nicely.

[ Impact: debuggability (make failures obvious) ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/mmu.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 331e52d..139c8de 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -1919,6 +1919,16 @@ static void xen_set_fixmap(unsigned idx, phys_addr_t phys, pgprot_t prot)
 		pte = pfn_pte(phys, prot);
 		break;
 
+#ifdef CONFIG_X86_IO_APIC
+	case FIX_IO_APIC_BASE_0 ... FIX_IO_APIC_BASE_END:
+		/*
+		 * We just don't map the IO APIC - all access is via
+		 * hypercalls.  Keep the address in the pte for reference.
+		 */
+		pte = pfn_pte(phys, PAGE_NONE);
+		break;
+#endif
+
 	case FIX_PARAVIRT_BOOTMAP:
 		/* This is an MFN, but it isn't an IO mapping from the
 		   IO domain */
-- 
1.6.0.6



* [PATCH 05/17] xen: implement pirq type event channels
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (3 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 04/17] xen: create dummy ioapic mapping Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 06/17] x86/io_apic: add get_nr_irqs_gsi() Jeremy Fitzhardinge
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge, Jeremy Fitzhardinge

A privileged PV Xen domain can get direct access to hardware.  In
order for this to be useful, it must be able to get hardware
interrupts.

Because this is a PV Xen domain, all interrupts are delivered as event
channels.  PIRQ event channels are bound to a pirq number and an
interrupt vector.  When an IO APIC raises a hardware interrupt on that
vector, it is delivered as an event channel, which we can deliver to
the appropriate device driver(s).

This patch simply implements the infrastructure for dealing with pirq
event channels.
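
One part of that infrastructure, the "needs EOI" bookkeeping, can be
modelled in userspace roughly as below.  This is a simplified sketch:
the sketch_* names are invented for the example, a counter replaces
the EOI hypercall, and a bool replaces the event channel mask bit.

```c
#include <stdbool.h>

#define SKETCH_PIRQ_NEEDS_EOI	(1u << 0)	/* models PIRQ_NEEDS_EOI */
#define SKETCH_STAT_NEEDS_EOI	(1u << 0)	/* models XENIRQSTAT_needs_eoi */

struct sketch_pirq {
	unsigned char flags;
	int evtchn;	/* 0 means "no event channel bound" */
	bool masked;
};

/* Counts EOIs that would have been sent to the hypervisor. */
static int sketch_eoi_count;

/* Models pirq_query_unmask(): refresh the cached needs-EOI flag. */
static void sketch_query_unmask(struct sketch_pirq *p, unsigned int hv_status)
{
	p->flags &= ~SKETCH_PIRQ_NEEDS_EOI;
	if (hv_status & SKETCH_STAT_NEEDS_EOI)
		p->flags |= SKETCH_PIRQ_NEEDS_EOI;
}

/* Models pirq_unmask_notify(): only send an EOI when it is required. */
static void sketch_unmask_notify(struct sketch_pirq *p)
{
	if (p->flags & SKETCH_PIRQ_NEEDS_EOI)
		sketch_eoi_count++;
}

/* Models ack_pirq(): mask the channel while the handler runs. */
static void sketch_ack(struct sketch_pirq *p)
{
	if (p->evtchn)
		p->masked = true;
}

/* Models the common path of end_pirq(): unmask, then notify. */
static void sketch_end(struct sketch_pirq *p)
{
	if (p->evtchn) {
		p->masked = false;
		sketch_unmask_notify(p);
	}
}
```

Caching the flag at bind time means the ack/end path only pays for an
EOI on interrupts for which the hypervisor asked for one.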

[ Impact: integrate hardware interrupts into Xen's event scheme ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 drivers/xen/events.c |  245 +++++++++++++++++++++++++++++++++++++++++++++++++-
 include/xen/events.h |   11 +++
 2 files changed, 253 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index 97f4b39..fd98c19 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -16,7 +16,7 @@
  *    (typically dom0).
  * 2. VIRQs, typically used for timers.  These are per-cpu events.
  * 3. IPIs.
- * 4. Hardware interrupts. Not supported at present.
+ * 4. PIRQs - Hardware interrupts.
  *
  * Jeremy Fitzhardinge <jeremy@xensource.com>, XenSource Inc, 2007
  */
@@ -40,6 +40,9 @@
 #include <xen/interface/xen.h>
 #include <xen/interface/event_channel.h>
 
+/* Leave low irqs free for identity mapping */
+#define LEGACY_IRQS	16
+
 /*
  * This lock protects updates to the following mapping and reference-count
  * arrays. The lock does not need to be acquired to read the mapping tables.
@@ -83,10 +86,12 @@ struct irq_info
 		enum ipi_vector ipi;
 		struct {
 			unsigned short gsi;
-			unsigned short vector;
+			unsigned char vector;
+			unsigned char flags;
 		} pirq;
 	} u;
 };
+#define PIRQ_NEEDS_EOI	(1 << 0)
 
 static struct irq_info irq_info[NR_IRQS];
 
@@ -106,6 +111,7 @@ static inline unsigned long *cpu_evtchn_mask(int cpu)
 #define VALID_EVTCHN(chn)	((chn) != 0)
 
 static struct irq_chip xen_dynamic_chip;
+static struct irq_chip xen_pirq_chip;
 
 /* Constructor for packed IRQ information. */
 static struct irq_info mk_unbound_info(void)
@@ -218,6 +224,15 @@ static unsigned int cpu_from_evtchn(unsigned int evtchn)
 	return ret;
 }
 
+static bool pirq_needs_eoi(unsigned irq)
+{
+	struct irq_info *info = info_for_irq(irq);
+
+	BUG_ON(info->type != IRQT_PIRQ);
+
+	return info->u.pirq.flags & PIRQ_NEEDS_EOI;
+}
+
 static inline unsigned long active_evtchns(unsigned int cpu,
 					   struct shared_info *sh,
 					   unsigned int idx)
@@ -334,7 +349,7 @@ static int find_unbound_irq(void)
 	int irq;
 	struct irq_desc *desc;
 
-	for (irq = 0; irq < nr_irqs; irq++)
+	for (irq = LEGACY_IRQS; irq < nr_irqs; irq++)
 		if (irq_info[irq].type == IRQT_UNBOUND)
 			break;
 
@@ -350,6 +365,210 @@ static int find_unbound_irq(void)
 	return irq;
 }
 
+static bool identity_mapped_irq(unsigned irq)
+{
+	/* only identity map legacy irqs */
+	return irq < LEGACY_IRQS;
+}
+
+static void pirq_unmask_notify(int irq)
+{
+	struct physdev_eoi eoi = { .irq = irq };
+
+	if (unlikely(pirq_needs_eoi(irq))) {
+		int rc = HYPERVISOR_physdev_op(PHYSDEVOP_eoi, &eoi);
+		WARN_ON(rc);
+	}
+}
+
+static void pirq_query_unmask(int irq)
+{
+	struct physdev_irq_status_query irq_status;
+	struct irq_info *info = info_for_irq(irq);
+
+	BUG_ON(info->type != IRQT_PIRQ);
+
+	irq_status.irq = irq;
+	if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status))
+		irq_status.flags = 0;
+
+	info->u.pirq.flags &= ~PIRQ_NEEDS_EOI;
+	if (irq_status.flags & XENIRQSTAT_needs_eoi)
+		info->u.pirq.flags |= PIRQ_NEEDS_EOI;
+}
+
+static bool probing_irq(int irq)
+{
+	struct irq_desc *desc = irq_to_desc(irq);
+
+	return desc && desc->action == NULL;
+}
+
+static unsigned int startup_pirq(unsigned int irq)
+{
+	struct evtchn_bind_pirq bind_pirq;
+	struct irq_info *info = info_for_irq(irq);
+	int evtchn = evtchn_from_irq(irq);
+
+	BUG_ON(info->type != IRQT_PIRQ);
+
+	if (VALID_EVTCHN(evtchn))
+		goto out;
+
+	bind_pirq.pirq = irq;
+	/* NB. We are happy to share unless we are probing. */
+	bind_pirq.flags = probing_irq(irq) ? 0 : BIND_PIRQ__WILL_SHARE;
+	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq) != 0) {
+		if (!probing_irq(irq))
+			printk(KERN_INFO "Failed to obtain physical IRQ %d\n",
+			       irq);
+		return 0;
+	}
+	evtchn = bind_pirq.port;
+
+	pirq_query_unmask(irq);
+
+	evtchn_to_irq[evtchn] = irq;
+	bind_evtchn_to_cpu(evtchn, 0);
+	info->evtchn = evtchn;
+
+ out:
+	unmask_evtchn(evtchn);
+	pirq_unmask_notify(irq);
+
+	return 0;
+}
+
+static void shutdown_pirq(unsigned int irq)
+{
+	struct evtchn_close close;
+	struct irq_info *info = info_for_irq(irq);
+	int evtchn = evtchn_from_irq(irq);
+
+	BUG_ON(info->type != IRQT_PIRQ);
+
+	if (!VALID_EVTCHN(evtchn))
+		return;
+
+	mask_evtchn(evtchn);
+
+	close.port = evtchn;
+	if (HYPERVISOR_event_channel_op(EVTCHNOP_close, &close) != 0)
+		BUG();
+
+	bind_evtchn_to_cpu(evtchn, 0);
+	evtchn_to_irq[evtchn] = -1;
+	info->evtchn = 0;
+}
+
+static void enable_pirq(unsigned int irq)
+{
+	startup_pirq(irq);
+}
+
+static void disable_pirq(unsigned int irq)
+{
+}
+
+static void ack_pirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+
+	move_native_irq(irq);
+
+	if (VALID_EVTCHN(evtchn)) {
+		mask_evtchn(evtchn);
+		clear_evtchn(evtchn);
+	}
+}
+
+static void end_pirq(unsigned int irq)
+{
+	int evtchn = evtchn_from_irq(irq);
+	struct irq_desc *desc = irq_to_desc(irq);
+
+	if (WARN_ON(!desc))
+		return;
+
+	if ((desc->status & (IRQ_DISABLED|IRQ_PENDING)) ==
+	    (IRQ_DISABLED|IRQ_PENDING)) {
+		shutdown_pirq(irq);
+	} else if (VALID_EVTCHN(evtchn)) {
+		unmask_evtchn(evtchn);
+		pirq_unmask_notify(irq);
+	}
+}
+
+static int find_irq_by_gsi(unsigned gsi)
+{
+	int irq;
+
+	for (irq = 0; irq < NR_IRQS; irq++) {
+		struct irq_info *info = info_for_irq(irq);
+
+		if (info == NULL || info->type != IRQT_PIRQ)
+			continue;
+
+		if (gsi_from_irq(irq) == gsi)
+			return irq;
+	}
+
+	return -1;
+}
+
+/*
+ * Allocate a physical irq, along with a vector.  We don't assign an
+ * event channel until the irq is actually started up.  Return an
+ * existing irq if we've already got one for the gsi.
+ */
+int xen_allocate_pirq(unsigned gsi)
+{
+	int irq;
+	struct physdev_irq irq_op;
+
+	spin_lock(&irq_mapping_update_lock);
+
+	irq = find_irq_by_gsi(gsi);
+	if (irq != -1) {
+		printk(KERN_INFO "xen_allocate_pirq: returning irq %d for gsi %u\n",
+		       irq, gsi);
+		goto out;	/* XXX need refcount? */
+	}
+
+	if (identity_mapped_irq(gsi)) {
+		irq = gsi;
+		dynamic_irq_init(irq);
+	} else
+		irq = find_unbound_irq();
+
+	set_irq_chip_and_handler_name(irq, &xen_pirq_chip,
+				      handle_level_irq, "pirq");
+
+	irq_op.irq = irq;
+	if (HYPERVISOR_physdev_op(PHYSDEVOP_alloc_irq_vector, &irq_op)) {
+		dynamic_irq_cleanup(irq);
+		irq = -ENOSPC;
+		goto out;
+	}
+
+	irq_info[irq] = mk_pirq_info(0, gsi, irq_op.vector);
+
+out:
+	spin_unlock(&irq_mapping_update_lock);
+
+	return irq;
+}
+
+int xen_vector_from_irq(unsigned irq)
+{
+	return vector_from_irq(irq);
+}
+
+int xen_gsi_from_irq(unsigned irq)
+{
+	return gsi_from_irq(irq);
+}
+
 int bind_evtchn_to_irq(unsigned int evtchn)
 {
 	int irq;
@@ -922,6 +1141,26 @@ static struct irq_chip xen_dynamic_chip __read_mostly = {
 	.retrigger	= retrigger_dynirq,
 };
 
+static struct irq_chip xen_pirq_chip __read_mostly = {
+	.name		= "xen-pirq",
+
+	.startup	= startup_pirq,
+	.shutdown	= shutdown_pirq,
+
+	.enable		= enable_pirq,
+	.unmask		= enable_pirq,
+
+	.disable	= disable_pirq,
+	.mask		= disable_pirq,
+
+	.ack		= ack_pirq,
+	.end		= end_pirq,
+
+	.set_affinity	= set_affinity_irq,
+
+	.retrigger	= retrigger_dynirq,
+};
+
 void __init xen_init_IRQ(void)
 {
 	int i;
diff --git a/include/xen/events.h b/include/xen/events.h
index 9f24b64..e5b541d 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -58,4 +58,15 @@ void xen_poll_irq(int irq);
 /* Determine the IRQ which is bound to an event channel */
 unsigned irq_from_evtchn(unsigned int evtchn);
 
+/* Allocate an irq for a physical interrupt, given a gsi.  "Legacy"
+   GSIs are identity mapped; others are dynamically allocated as
+   usual. */
+int xen_allocate_pirq(unsigned gsi);
+
+/* Return vector allocated to pirq */
+int xen_vector_from_irq(unsigned pirq);
+
+/* Return gsi allocated to pirq */
+int xen_gsi_from_irq(unsigned pirq);
+
 #endif	/* _XEN_EVENTS_H */
-- 
1.6.0.6



* [PATCH 06/17] x86/io_apic: add get_nr_irqs_gsi()
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (4 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 05/17] xen: implement pirq type event channels Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 07/17] xen/apic: identity map gsi->irqs Jeremy Fitzhardinge
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge, Jeremy Fitzhardinge

From: Jeremy Fitzhardinge <jeremy@f9-builder.(none)>

Add get_nr_irqs_gsi() to return nr_irqs_gsi.  Xen will use this to
determine how many irqs it needs to reserve for hardware irqs.

[ Impact: new interface to get max GSI ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Reviewed-by: "H. Peter Anvin" <hpa@zytor.com>
---
 arch/x86/include/asm/io_apic.h |    1 +
 arch/x86/kernel/apic/io_apic.c |    5 +++++
 2 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index 8cbfe73..e33ccb7 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -181,6 +181,7 @@ extern void reinit_intr_remapped_IO_APIC(int intr_remapping,
 #endif
 
 extern void probe_nr_irqs_gsi(void);
+extern int get_nr_irqs_gsi(void);
 
 extern int setup_ioapic_entry(int apic, int irq,
 			      struct IO_APIC_route_entry *entry,
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index c24f116..07dc530 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3917,6 +3917,11 @@ void __init probe_nr_irqs_gsi(void)
 	printk(KERN_DEBUG "nr_irqs_gsi: %d\n", nr_irqs_gsi);
 }
 
+int get_nr_irqs_gsi(void)
+{
+	return nr_irqs_gsi;
+}
+
 #ifdef CONFIG_SPARSE_IRQ
 int __init arch_probe_nr_irqs(void)
 {
-- 
1.6.0.6



* [PATCH 07/17] xen/apic: identity map gsi->irqs
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (5 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 06/17] x86/io_apic: add get_nr_irqs_gsi() Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 08/17] xen: direct irq registration to pirq event channels Jeremy Fitzhardinge
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge, Jeremy Fitzhardinge

From: Jeremy Fitzhardinge <jeremy@f9-builder.(none)>

Reserve the lower irq range for hardware interrupts so we can
identity-map them.
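
The allocation policy can be illustrated with the userspace sketch
below.  The table size, the value chosen for the number of hardware
irqs and all sketch_* names are assumptions for the example, and the
sketch returns -1 when no irq is free rather than following the
kernel's handling of that case.

```c
#include <stdbool.h>

#define SKETCH_NR_IRQS	32

enum sketch_type { SKETCH_UNBOUND = 0, SKETCH_PIRQ, SKETCH_EVTCHN };

static enum sketch_type sketch_irq_type[SKETCH_NR_IRQS];

/* Hypothetical value standing in for get_nr_irqs_gsi(). */
static int sketch_nr_hw_irqs = 24;

/* GSIs below the hardware limit keep their own number as the irq. */
static bool sketch_identity_mapped(unsigned int gsi)
{
	return gsi < (unsigned int)sketch_nr_hw_irqs;
}

/* Dynamic allocation starts above the reserved hardware range. */
static int sketch_find_unbound_irq(void)
{
	for (int irq = sketch_nr_hw_irqs; irq < SKETCH_NR_IRQS; irq++)
		if (sketch_irq_type[irq] == SKETCH_UNBOUND)
			return irq;
	return -1;
}

/* Allocate an irq for a hardware interrupt source. */
static int sketch_alloc_pirq(unsigned int gsi)
{
	int irq = sketch_identity_mapped(gsi) ? (int)gsi
					      : sketch_find_unbound_irq();

	if (irq >= 0)
		sketch_irq_type[irq] = SKETCH_PIRQ;
	return irq;
}

/* Allocate an irq for a software event (VIRQ, IPI, ...). */
static int sketch_alloc_dynamic(void)
{
	int irq = sketch_find_unbound_irq();

	if (irq >= 0)
		sketch_irq_type[irq] = SKETCH_EVTCHN;
	return irq;
}
```

Because software events never dip into the reserved range, a hardware
interrupt can always claim irq == gsi no matter how many dynamic irqs
were handed out before it.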

[ Impact: preserve compat with native ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 drivers/xen/events.c |   23 +++++++++++++++++------
 1 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index fd98c19..88395bb 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -31,6 +31,7 @@
 #include <asm/ptrace.h>
 #include <asm/irq.h>
 #include <asm/idle.h>
+#include <asm/io_apic.h>
 #include <asm/sync_bitops.h>
 #include <asm/xen/hypercall.h>
 #include <asm/xen/hypervisor.h>
@@ -40,9 +41,6 @@
 #include <xen/interface/xen.h>
 #include <xen/interface/event_channel.h>
 
-/* Leave low irqs free for identity mapping */
-#define LEGACY_IRQS	16
-
 /*
  * This lock protects updates to the following mapping and reference-count
  * arrays. The lock does not need to be acquired to read the mapping tables.
@@ -344,12 +342,24 @@ static void unmask_evtchn(int port)
 	put_cpu();
 }
 
+static int get_nr_hw_irqs(void)
+{
+	int ret = 1;
+
+#ifdef CONFIG_X86_IO_APIC
+	ret = get_nr_irqs_gsi();
+#endif
+
+	return ret;
+}
+
 static int find_unbound_irq(void)
 {
 	int irq;
 	struct irq_desc *desc;
+	int start = get_nr_hw_irqs();
 
-	for (irq = LEGACY_IRQS; irq < nr_irqs; irq++)
+	for (irq = start; irq < nr_irqs; irq++)
 		if (irq_info[irq].type == IRQT_UNBOUND)
 			break;
 
@@ -367,8 +377,8 @@ static int find_unbound_irq(void)
 
 static bool identity_mapped_irq(unsigned irq)
 {
-	/* only identity map legacy irqs */
-	return irq < LEGACY_IRQS;
+	/* identity map all the hardware irqs */
+	return irq < get_nr_hw_irqs();
 }
 
 static void pirq_unmask_notify(int irq)
@@ -537,6 +547,7 @@ int xen_allocate_pirq(unsigned gsi)
 
 	if (identity_mapped_irq(gsi)) {
 		irq = gsi;
+		irq_to_desc_alloc_cpu(irq, 0);
 		dynamic_irq_init(irq);
 	} else
 		irq = find_unbound_irq();
-- 
1.6.0.6



* [PATCH 08/17] xen: direct irq registration to pirq event channels
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (6 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 07/17] xen/apic: identity map gsi->irqs Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 09/17] xen: bind pirq to vector and event channel Jeremy Fitzhardinge
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge, Jeremy Fitzhardinge

This patch puts the hooks into place so that when the interrupt
subsystem registers an irq, it gets routed via Xen (if we're running
under Xen).

The first step is to get a gsi for a particular device+pin.  We use
the normal acpi interrupt routing to do the mapping.

We reserve enough irq space to fit the hardware interrupt sources in,
so we can allocate the irq == gsi, as we do in the native case;
software events will get allocated irqs above that.

Having allocated an irq, we ask Xen to allocate a vector, and then
bind that pirq/vector to an event channel.  When the hardware raises
an interrupt on a vector, Xen signals us on the corresponding event
channel, which gets routed to the irq and delivered to the appropriate
device driver.

This patch does everything except set up the IO APIC pin routing to
the vector.
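
The "try Xen first, then fall back" shape of the hook in
acpi_register_gsi() can be sketched in userspace like this.  The
behaviour of the Xen-side function here (return irq == gsi in dom0,
a negative value otherwise) is an assumption for illustration, as are
all sketch_* names; the offset added on the native path exists only to
make the chosen path visible.

```c
#include <stdbool.h>

/* Hypothetical switch standing in for "running as Xen dom0". */
static bool sketch_dom0;

/*
 * Stand-in for the Xen registration hook: a non-negative return is the
 * irq to use, a negative return means "not handled, use the native path".
 */
static int sketch_xen_register_gsi(unsigned int gsi)
{
	if (!sketch_dom0)
		return -1;
	return (int)gsi;	/* irq == gsi, as in the native case */
}

/* Stand-in for the native path; the +100 is purely a marker. */
static int sketch_native_register_gsi(unsigned int gsi)
{
	return (int)gsi + 100;
}

/* Mirrors the control flow added to acpi_register_gsi(). */
static int sketch_register_gsi(unsigned int gsi)
{
	int irq = sketch_xen_register_gsi(gsi);

	if (irq >= 0)
		return irq;
	return sketch_native_register_gsi(gsi);
}
```

Using a signed return value for the hook is what makes the fallback
cheap: when the Xen support is compiled out, the stub can return a
negative value and the native code runs unchanged.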

[ Impact: route hardware interrupts via Xen ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/include/asm/xen/pci.h |   13 +++++++++++
 arch/x86/kernel/acpi/boot.c    |    8 ++++++-
 arch/x86/xen/Kconfig           |   11 +++++++++
 arch/x86/xen/Makefile          |    1 +
 arch/x86/xen/pci.c             |   47 ++++++++++++++++++++++++++++++++++++++++
 drivers/xen/events.c           |    6 ++++-
 include/xen/events.h           |    8 ++++++
 7 files changed, 92 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/include/asm/xen/pci.h
 create mode 100644 arch/x86/xen/pci.c

diff --git a/arch/x86/include/asm/xen/pci.h b/arch/x86/include/asm/xen/pci.h
new file mode 100644
index 0000000..0563fc6
--- /dev/null
+++ b/arch/x86/include/asm/xen/pci.h
@@ -0,0 +1,13 @@
+#ifndef _ASM_X86_XEN_PCI_H
+#define _ASM_X86_XEN_PCI_H
+
+#ifdef CONFIG_XEN_DOM0_PCI
+int xen_register_gsi(u32 gsi, int triggering, int polarity);
+#else
+static inline int xen_register_gsi(u32 gsi, int triggering, int polarity)
+{
+	return -1;
+}
+#endif
+
+#endif	/* _ASM_X86_XEN_PCI_H */
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 4147e0c..d4de1c2 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -41,6 +41,8 @@
 #include <asm/mpspec.h>
 #include <asm/smp.h>
 
+#include <asm/xen/pci.h>
+
 #include <asm/xen/hypervisor.h>
 
 static int __initdata acpi_force = 0;
@@ -530,9 +532,13 @@ int acpi_gsi_to_irq(u32 gsi, unsigned int *irq)
  */
 int acpi_register_gsi(u32 gsi, int triggering, int polarity)
 {
-	unsigned int irq;
+	int irq;
 	unsigned int plat_gsi = gsi;
 
+	irq = xen_register_gsi(gsi, triggering, polarity);
+	if (irq >= 0)
+		return irq;
+
 #ifdef CONFIG_PCI
 	/*
 	 * Make sure all (legacy) PCI IRQs are set as level-triggered.
diff --git a/arch/x86/xen/Kconfig b/arch/x86/xen/Kconfig
index fe69286..42e9f0a 100644
--- a/arch/x86/xen/Kconfig
+++ b/arch/x86/xen/Kconfig
@@ -37,6 +37,17 @@ config XEN_DEBUG_FS
 	  Enable statistics output and various tuning options in debugfs.
 	  Enabling this option may incur a significant performance overhead.
 
+config XEN_PCI_PASSTHROUGH
+       bool #"Enable support for Xen PCI passthrough devices"
+       depends on XEN && PCI
+       help
+         Enable support for passing PCI devices through to
+	 unprivileged domains. (COMPLETELY UNTESTED)
+
+config XEN_DOM0_PCI
+       def_bool y
+       depends on XEN_DOM0 && PCI
+
 config XEN_DOM0
 	bool "Enable Xen privileged domain support"
 	depends on XEN && X86_IO_APIC && ACPI
diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile
index 73ecb74..639965a 100644
--- a/arch/x86/xen/Makefile
+++ b/arch/x86/xen/Makefile
@@ -12,3 +12,4 @@ obj-y		:= enlighten.o setup.o multicalls.o mmu.o irq.o \
 obj-$(CONFIG_SMP)		+= smp.o spinlock.o
 obj-$(CONFIG_XEN_DEBUG_FS)	+= debugfs.o
 obj-$(CONFIG_XEN_DOM0)		+= vga.o apic.o
+obj-$(CONFIG_XEN_DOM0_PCI)	+= pci.o
\ No newline at end of file
diff --git a/arch/x86/xen/pci.c b/arch/x86/xen/pci.c
new file mode 100644
index 0000000..f450007
--- /dev/null
+++ b/arch/x86/xen/pci.c
@@ -0,0 +1,47 @@
+#include <linux/kernel.h>
+#include <linux/acpi.h>
+#include <linux/pci.h>
+
+#include <asm/pci_x86.h>
+
+#include <asm/xen/hypervisor.h>
+
+#include <xen/interface/xen.h>
+#include <xen/events.h>
+
+#include "xen-ops.h"
+
+int xen_register_gsi(u32 gsi, int triggering, int polarity)
+{
+	int irq;
+
+	if (!xen_domain())
+		return -1;
+
+	printk(KERN_DEBUG "xen: registering gsi %u triggering %d polarity %d\n",
+	       gsi, triggering, polarity);
+
+	irq = xen_allocate_pirq(gsi);
+
+	printk(KERN_DEBUG "xen: --> irq=%d\n", irq);
+
+	return irq;
+}
+
+void __init xen_setup_pirqs(void)
+{
+#ifdef CONFIG_ACPI
+	int irq;
+
+	/*
+	 * Set up acpi interrupt in acpi_gbl_FADT.sci_interrupt.
+	 */
+	irq = xen_allocate_pirq(acpi_gbl_FADT.sci_interrupt);
+
+	printk(KERN_INFO "xen: allocated irq %d for acpi %d\n",
+	       irq, acpi_gbl_FADT.sci_interrupt);
+
+	/* Blerk. */
+	acpi_gbl_FADT.sci_interrupt = irq;
+#endif
+}
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index 88395bb..968e927 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -419,6 +419,7 @@ static unsigned int startup_pirq(unsigned int irq)
 	struct evtchn_bind_pirq bind_pirq;
 	struct irq_info *info = info_for_irq(irq);
 	int evtchn = evtchn_from_irq(irq);
+	int rc;
 
 	BUG_ON(info->type != IRQT_PIRQ);
 
@@ -428,7 +429,8 @@ static unsigned int startup_pirq(unsigned int irq)
 	bind_pirq.pirq = irq;
 	/* NB. We are happy to share unless we are probing. */
 	bind_pirq.flags = probing_irq(irq) ? 0 : BIND_PIRQ__WILL_SHARE;
-	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq) != 0) {
+	rc = HYPERVISOR_event_channel_op(EVTCHNOP_bind_pirq, &bind_pirq);
+	if (rc != 0) {
 		if (!probing_irq(irq))
 			printk(KERN_INFO "Failed to obtain physical IRQ %d\n",
 			       irq);
@@ -1187,4 +1189,6 @@ void __init xen_init_IRQ(void)
 		mask_evtchn(i);
 
 	irq_ctx_init(smp_processor_id());
+
+	xen_setup_pirqs();
 }
diff --git a/include/xen/events.h b/include/xen/events.h
index e5b541d..6fe4863 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -69,4 +69,12 @@ int xen_vector_from_irq(unsigned pirq);
 /* Return gsi allocated to pirq */
 int xen_gsi_from_irq(unsigned pirq);
 
+#ifdef CONFIG_XEN_DOM0_PCI
+void xen_setup_pirqs(void);
+#else
+static inline void xen_setup_pirqs(void)
+{
+}
+#endif
+
 #endif	/* _XEN_EVENTS_H */
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH 09/17] xen: bind pirq to vector and event channel
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (7 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 08/17] xen: direct irq registration to pirq event channels Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 10/17] xen: pre-initialize legacy irqs early Jeremy Fitzhardinge
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge, Jeremy Fitzhardinge

Having converted a dev+pin to a gsi, and that gsi to an irq, and
allocated a vector for the irq, we must program the IO APIC to deliver
an interrupt on a pin to the vector, so Xen can deliver it as an event
channel.

Given the pirq, we can get the gsi and vector.  We map the gsi to a
specific IO APIC's pin, and set the routing entry.
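
The gsi-to-pin step can be pictured with the following sketch. It assumes
each IO APIC covers a contiguous range of gsis; the table contents and
the toy_* names are hypothetical and only stand in for what
mp_find_ioapic() and mp_find_ioapic_pin() are used for in this patch.

```c
#include <assert.h>

/* Hypothetical description of the gsi range served by one IO APIC. */
struct toy_ioapic_range {
	unsigned gsi_base;
	unsigned gsi_end;	/* inclusive */
};

const struct toy_ioapic_range toy_ioapics[] = {
	{  0, 23 },
	{ 24, 47 },
};
#define TOY_NR_IOAPICS (sizeof(toy_ioapics) / sizeof(toy_ioapics[0]))

/* Return the index of the IO APIC that owns this gsi, or -1 if none does. */
int toy_find_ioapic(unsigned gsi)
{
	unsigned i;

	for (i = 0; i < TOY_NR_IOAPICS; i++) {
		if (gsi >= toy_ioapics[i].gsi_base &&
		    gsi <= toy_ioapics[i].gsi_end)
			return (int)i;
	}
	return -1;
}

/* The pin is the gsi's offset within its IO APIC's range. */
int toy_find_ioapic_pin(int ioapic, unsigned gsi)
{
	assert(ioapic >= 0 && (unsigned)ioapic < TOY_NR_IOAPICS);
	return (int)(gsi - toy_ioapics[ioapic].gsi_base);
}
```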

(We were passing the ACPI triggering and polarity levels directly into
the apic - but they have reversed values.  The result was that
all the level-triggered interrupts were edge, and vice-versa.
It's surprising that anything worked at all, but now AHCI works
for me.

Thanks to Gerd Hoffmann for noticing this.)
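
To make the translation explicit, here is a small sketch. The TOY_ACPI_*
constants are stand-ins with assumed values, not the real ACPI
definitions; the only encodings relied on are those of the IO APIC
routing entry (trigger: 0 = edge, 1 = level; polarity: 0 = active high,
1 = active low).

```c
#include <assert.h>

/* Stand-ins for the ACPI-side encodings (assumed values, illustration only). */
enum toy_acpi_trigger  { TOY_ACPI_LEVEL = 0, TOY_ACPI_EDGE = 1 };
enum toy_acpi_polarity { TOY_ACPI_HIGH = 0, TOY_ACPI_LOW = 1 };

/* IO APIC routing entry trigger mode: 0 = edge, 1 = level. */
int toy_ioapic_trigger(int acpi_trigger)
{
	assert(acpi_trigger == TOY_ACPI_LEVEL || acpi_trigger == TOY_ACPI_EDGE);
	return acpi_trigger == TOY_ACPI_EDGE ? 0 : 1;
}

/* IO APIC routing entry polarity: 0 = active high, 1 = active low. */
int toy_ioapic_polarity(int acpi_polarity)
{
	assert(acpi_polarity == TOY_ACPI_HIGH || acpi_polarity == TOY_ACPI_LOW);
	return acpi_polarity == TOY_ACPI_HIGH ? 0 : 1;
}
```

Comparing against the named constant, rather than passing the raw value
through, keeps the result correct whatever numbers the two encodings
happen to use.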

[ Impact: program IO APICs under Xen ]

Diagnosed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/apic.c |    2 ++
 arch/x86/xen/pci.c  |   33 +++++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c
index 8ae563c..35a8af7 100644
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -4,6 +4,7 @@
 
 #include <asm/io_apic.h>
 #include <asm/acpi.h>
+#include <asm/hw_irq.h>
 
 #include <asm/xen/hypervisor.h>
 #include <asm/xen/hypercall.h>
@@ -13,6 +14,7 @@
 
 static void __init xen_io_apic_init(void)
 {
+	enable_IO_APIC();
 }
 
 static unsigned int xen_io_apic_read(unsigned apic, unsigned reg)
diff --git a/arch/x86/xen/pci.c b/arch/x86/xen/pci.c
index f450007..af4e898 100644
--- a/arch/x86/xen/pci.c
+++ b/arch/x86/xen/pci.c
@@ -2,6 +2,8 @@
 #include <linux/acpi.h>
 #include <linux/pci.h>
 
+#include <asm/mpspec.h>
+#include <asm/io_apic.h>
 #include <asm/pci_x86.h>
 
 #include <asm/xen/hypervisor.h>
@@ -11,6 +13,32 @@
 
 #include "xen-ops.h"
 
+static void xen_set_io_apic_routing(int irq, int trigger, int polarity)
+{
+	int ioapic, ioapic_pin;
+	int vector, gsi;
+	struct IO_APIC_route_entry entry;
+
+	gsi = xen_gsi_from_irq(irq);
+	vector = xen_vector_from_irq(irq);
+
+	ioapic = mp_find_ioapic(gsi);
+	if (ioapic == -1) {
+		printk(KERN_WARNING "xen_set_ioapic_routing: irq %d gsi %d ioapic %d\n",
+			irq, gsi, ioapic);
+		return;
+	}
+
+	ioapic_pin = mp_find_ioapic_pin(ioapic, gsi);
+
+	printk(KERN_INFO "xen_set_ioapic_routing: irq %d gsi %d vector %d ioapic %d pin %d triggering %d polarity %d\n",
+		irq, gsi, vector, ioapic, ioapic_pin, trigger, polarity);
+
+	setup_ioapic_entry(ioapic, -1, &entry, ~0, trigger, polarity, vector,
+			   ioapic_pin);
+	ioapic_write_entry(ioapic, ioapic_pin, entry);
+}
+
 int xen_register_gsi(u32 gsi, int triggering, int polarity)
 {
 	int irq;
@@ -25,6 +53,11 @@ int xen_register_gsi(u32 gsi, int triggering, int polarity)
 
 	printk(KERN_DEBUG "xen: --> irq=%d\n", irq);
 
+	if (irq > 0)
+		xen_set_io_apic_routing(irq,
+					triggering == ACPI_EDGE_SENSITIVE ? 0 : 1,
+					polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
+
 	return irq;
 }
 
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH 10/17] xen: pre-initialize legacy irqs early
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (8 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 09/17] xen: bind pirq to vector and event channel Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 11/17] xen: don't setup acpi interrupt unless there is one Jeremy Fitzhardinge
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Ian Campbell, Jeremy Fitzhardinge

From: Ian Campbell <ian.campbell@citrix.com>

Various legacy devices, such as IDE, assume their legacy interrupts are
already initialized and are immediately usable.  Pre-initialize all the
legacy interrupts.

[ Impact: ISA/legacy device compat ]

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/pci.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/x86/xen/pci.c b/arch/x86/xen/pci.c
index af4e898..402a5bd 100644
--- a/arch/x86/xen/pci.c
+++ b/arch/x86/xen/pci.c
@@ -63,9 +63,9 @@ int xen_register_gsi(u32 gsi, int triggering, int polarity)
 
 void __init xen_setup_pirqs(void)
 {
-#ifdef CONFIG_ACPI
 	int irq;
 
+#ifdef CONFIG_ACPI
 	/*
 	 * Set up acpi interrupt in acpi_gbl_FADT.sci_interrupt.
 	 */
@@ -77,4 +77,8 @@ void __init xen_setup_pirqs(void)
 	/* Blerk. */
 	acpi_gbl_FADT.sci_interrupt = irq;
 #endif
+
+	/* Pre-allocate legacy irqs */
+	for (irq = 0; irq < NR_IRQS_LEGACY; irq++)
+		xen_allocate_pirq(irq);
 }
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH 11/17] xen: don't setup acpi interrupt unless there is one
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (9 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 10/17] xen: pre-initialize legacy irqs early Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 12/17] xen: use acpi_get_override_irq() to get triggering for legacy irqs Jeremy Fitzhardinge
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge

From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

If the SCI hasn't been set, then presumably we're not running
with ACPI, so don't bother setting up the interrupt.

[ Impact: compatibility with pre-ACPI machines ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/pci.c |   11 +++++------
 1 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/xen/pci.c b/arch/x86/xen/pci.c
index 402a5bd..00ad6df 100644
--- a/arch/x86/xen/pci.c
+++ b/arch/x86/xen/pci.c
@@ -69,13 +69,12 @@ void __init xen_setup_pirqs(void)
 	/*
 	 * Set up acpi interrupt in acpi_gbl_FADT.sci_interrupt.
 	 */
-	irq = xen_allocate_pirq(acpi_gbl_FADT.sci_interrupt);
+	if (acpi_gbl_FADT.sci_interrupt > 0) {
+		irq = xen_allocate_pirq(acpi_gbl_FADT.sci_interrupt);
 
-	printk(KERN_INFO "xen: allocated irq %d for acpi %d\n",
-	       irq, acpi_gbl_FADT.sci_interrupt);
-
-	/* Blerk. */
-	acpi_gbl_FADT.sci_interrupt = irq;
+		printk(KERN_INFO "xen: allocated irq %d for acpi %d\n",
+		       irq, acpi_gbl_FADT.sci_interrupt);
+	}
 #endif
 
 	/* Pre-allocate legacy irqs */
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH 12/17] xen: use acpi_get_override_irq() to get triggering for legacy irqs
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (10 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 11/17] xen: don't setup acpi interrupt unless there is one Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 13/17] xen: initialize irq 0 too Jeremy Fitzhardinge
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge

From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

We need to set up proper IO apic entries for legacy irqs, which are
not normally configured by either normal acpi interrupt routing or
PNP.

This also generalizes the acpi interrupt setup, so we can remove it
as a special case.

[ Impact: compatibility with legacy/ISA hardware ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/pci.c |   24 ++++++++++--------------
 1 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/arch/x86/xen/pci.c b/arch/x86/xen/pci.c
index 00ad6df..db0c74c 100644
--- a/arch/x86/xen/pci.c
+++ b/arch/x86/xen/pci.c
@@ -65,19 +65,15 @@ void __init xen_setup_pirqs(void)
 {
 	int irq;
 
-#ifdef CONFIG_ACPI
-	/*
-	 * Set up acpi interrupt in acpi_gbl_FADT.sci_interrupt.
-	 */
-	if (acpi_gbl_FADT.sci_interrupt > 0) {
-		irq = xen_allocate_pirq(acpi_gbl_FADT.sci_interrupt);
-
-		printk(KERN_INFO "xen: allocated irq %d for acpi %d\n",
-		       irq, acpi_gbl_FADT.sci_interrupt);
-	}
-#endif
-
 	/* Pre-allocate legacy irqs */
-	for (irq = 0; irq < NR_IRQS_LEGACY; irq++)
-		xen_allocate_pirq(irq);
+	for (irq = 0; irq < NR_IRQS_LEGACY; irq++) {
+		int trigger, polarity;
+
+		if (acpi_get_override_irq(irq, &trigger, &polarity) == -1)
+			continue;
+
+		xen_register_gsi(irq,
+			trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE,
+			polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH);
+	}
 }
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH 13/17] xen: initialize irq 0 too
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (11 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 12/17] xen: use acpi_get_override_irq() to get triggering for legacy irqs Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 14/17] xen: dynamically allocate irq & event structures Jeremy Fitzhardinge
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge

From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

IRQ 0 is valid, so make sure it gets initialized properly too.
(Though in practice it doesn't matter, because it's the timer
interrupt, which we don't use under Xen.)

[ Impact: theoretical bugfix, cleanup ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/pci.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/xen/pci.c b/arch/x86/xen/pci.c
index db0c74c..381b7ab 100644
--- a/arch/x86/xen/pci.c
+++ b/arch/x86/xen/pci.c
@@ -53,7 +53,7 @@ int xen_register_gsi(u32 gsi, int triggering, int polarity)
 
 	printk(KERN_DEBUG "xen: --> irq=%d\n", irq);
 
-	if (irq > 0)
+	if (irq >= 0)
 		xen_set_io_apic_routing(irq,
 					triggering == ACPI_EDGE_SENSITIVE ? 0 : 1,
 					polarity == ACPI_ACTIVE_HIGH ? 0 : 1);
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH 14/17] xen: dynamically allocate irq & event structures
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (12 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 13/17] xen: initialize irq 0 too Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 15/17] xen: set pirq name to something useful Jeremy Fitzhardinge
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge

From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

Dynamically allocate the irq_info and evtchn_to_irq arrays, so that
1) the irq_info array scales to the actual number of possible irqs,
and 2) we don't needlessly increase the static size of the kernel
when we aren't running under Xen.
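
A userspace sketch of the same idea follows, using calloc()/malloc()
where the patch uses alloc_bootmem(). The toy_* names and
TOY_NR_EVENT_CHANNELS are hypothetical, and the struct is a placeholder
rather than the real irq_info layout.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_NR_EVENT_CHANNELS 1024	/* hypothetical channel count */

struct toy_irq_info {
	int type;
	int evtchn;
};

static struct toy_irq_info *toy_irq_info;
static int *toy_evtchn_to_irq;

/* Size the tables at init time; returns 0 on success, -1 on failure. */
int toy_init_tables(unsigned nr_irqs)
{
	unsigned i;

	assert(nr_irqs > 0);

	toy_irq_info = calloc(nr_irqs, sizeof(*toy_irq_info));
	toy_evtchn_to_irq = malloc(TOY_NR_EVENT_CHANNELS *
				   sizeof(*toy_evtchn_to_irq));
	if (toy_irq_info == NULL || toy_evtchn_to_irq == NULL) {
		free(toy_irq_info);
		free(toy_evtchn_to_irq);
		toy_irq_info = NULL;
		toy_evtchn_to_irq = NULL;
		return -1;
	}

	/* A static initializer is no longer possible, so fill in -1 by hand. */
	for (i = 0; i < TOY_NR_EVENT_CHANNELS; i++)
		toy_evtchn_to_irq[i] = -1;

	return 0;
}

/* Look up the irq bound to an event channel; -1 means unbound. */
int toy_irq_from_evtchn(unsigned evtchn)
{
	assert(evtchn < TOY_NR_EVENT_CHANNELS);
	return toy_evtchn_to_irq[evtchn];
}
```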

Derived from a patch by Mike Travis <travis@sgi.com>.

[ Impact: reduce memory usage ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 drivers/xen/events.c |   15 +++++++++------
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index 968e927..e6ddf78 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -27,6 +27,7 @@
 #include <linux/module.h>
 #include <linux/string.h>
 #include <linux/bootmem.h>
+#include <linux/irqnr.h>
 
 #include <asm/ptrace.h>
 #include <asm/irq.h>
@@ -91,11 +92,9 @@ struct irq_info
 };
 #define PIRQ_NEEDS_EOI	(1 << 0)
 
-static struct irq_info irq_info[NR_IRQS];
+static struct irq_info *irq_info;
 
-static int evtchn_to_irq[NR_EVENT_CHANNELS] = {
-	[0 ... NR_EVENT_CHANNELS-1] = -1
-};
+static int *evtchn_to_irq;
 struct cpu_evtchn_s {
 	unsigned long bits[NR_EVENT_CHANNELS/BITS_PER_LONG];
 };
@@ -515,7 +514,7 @@ static int find_irq_by_gsi(unsigned gsi)
 {
 	int irq;
 
-	for (irq = 0; irq < NR_IRQS; irq++) {
+	for (irq = 0; irq < nr_irqs; irq++) {
 		struct irq_info *info = info_for_irq(irq);
 
 		if (info == NULL || info->type != IRQT_PIRQ)
@@ -1180,7 +1179,11 @@ void __init xen_init_IRQ(void)
 	size_t size = nr_cpu_ids * sizeof(struct cpu_evtchn_s);
 
 	cpu_evtchn_mask_p = alloc_bootmem(size);
-	BUG_ON(cpu_evtchn_mask_p == NULL);
+	irq_info = alloc_bootmem(nr_irqs * sizeof(*irq_info));
+
+	evtchn_to_irq = alloc_bootmem(NR_EVENT_CHANNELS * sizeof(*evtchn_to_irq));
+	for (i = 0; i < NR_EVENT_CHANNELS; i++)
+		evtchn_to_irq[i] = -1;
 
 	init_evtchn_cpu_bindings();
 
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH 15/17] xen: set pirq name to something useful.
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (13 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 14/17] xen: dynamically allocate irq & event structures Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 16/17] xen: fix legacy irq setup, make ioapic-less machines work Jeremy Fitzhardinge
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Gerd Hoffmann, Jeremy Fitzhardinge

From: Gerd Hoffmann <kraxel@xeni.home.kraxel.org>

Make pirq show useful information in /proc/interrupts.

[ Impact: better output in /proc/interrupts ]

Signed-off-by: Gerd Hoffmann <kraxel@xeni.home.kraxel.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/pci.c   |    3 ++-
 drivers/xen/events.c |    4 ++--
 include/xen/events.h |    2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/xen/pci.c b/arch/x86/xen/pci.c
index 381b7ab..4b286f1 100644
--- a/arch/x86/xen/pci.c
+++ b/arch/x86/xen/pci.c
@@ -49,7 +49,8 @@ int xen_register_gsi(u32 gsi, int triggering, int polarity)
 	printk(KERN_DEBUG "xen: registering gsi %u triggering %d polarity %d\n",
 	       gsi, triggering, polarity);
 
-	irq = xen_allocate_pirq(gsi);
+	irq = xen_allocate_pirq(gsi, (triggering == ACPI_EDGE_SENSITIVE)
+				     ? "ioapic-edge" : "ioapic-level");
 
 	printk(KERN_DEBUG "xen: --> irq=%d\n", irq);
 
diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index e6ddf78..f84d13b 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -532,7 +532,7 @@ static int find_irq_by_gsi(unsigned gsi)
  * event channel until the irq actually started up.  Return an
  * existing irq if we've already got one for the gsi.
  */
-int xen_allocate_pirq(unsigned gsi)
+int xen_allocate_pirq(unsigned gsi, char *name)
 {
 	int irq;
 	struct physdev_irq irq_op;
@@ -554,7 +554,7 @@ int xen_allocate_pirq(unsigned gsi)
 		irq = find_unbound_irq();
 
 	set_irq_chip_and_handler_name(irq, &xen_pirq_chip,
-				      handle_level_irq, "pirq");
+				      handle_level_irq, name);
 
 	irq_op.irq = irq;
 	if (HYPERVISOR_physdev_op(PHYSDEVOP_alloc_irq_vector, &irq_op)) {
diff --git a/include/xen/events.h b/include/xen/events.h
index 6fe4863..4b19b9c 100644
--- a/include/xen/events.h
+++ b/include/xen/events.h
@@ -61,7 +61,7 @@ unsigned irq_from_evtchn(unsigned int evtchn);
 /* Allocate an irq for a physical interrupt, given a gsi.  "Legacy"
    GSIs are identity mapped; others are dynamically allocated as
    usual. */
-int xen_allocate_pirq(unsigned gsi);
+int xen_allocate_pirq(unsigned gsi, char *name);
 
 /* Return vector allocated to pirq */
 int xen_vector_from_irq(unsigned pirq);
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH 16/17] xen: fix legacy irq setup, make ioapic-less machines work.
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (14 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 15/17] xen: set pirq name to something useful Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-12 23:25 ` [PATCH 17/17] xen: disable MSI Jeremy Fitzhardinge
  2009-05-19 12:35 ` [GIT PULL] Xen APIC hooks (with io_apic_ops) Ingo Molnar
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Gerd Hoffmann, Jeremy Fitzhardinge

From: Gerd Hoffmann <kraxel@xeni.home.kraxel.org>

If the machine has no IO APICs, then just allocate a set of legacy
interrupts.

[ Impact: fix Xen compatibility with old machines ]

Signed-off-by: Gerd Hoffmann <kraxel@xeni.home.kraxel.org>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/pci.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/arch/x86/xen/pci.c b/arch/x86/xen/pci.c
index 4b286f1..07b59fe 100644
--- a/arch/x86/xen/pci.c
+++ b/arch/x86/xen/pci.c
@@ -66,6 +66,12 @@ void __init xen_setup_pirqs(void)
 {
 	int irq;
 
+	if (0 == nr_ioapics) {
+		for (irq = 0; irq < NR_IRQS_LEGACY; irq++)
+			xen_allocate_pirq(irq, "xt-pic");
+		return;
+	}
+
 	/* Pre-allocate legacy irqs */
 	for (irq = 0; irq < NR_IRQS_LEGACY; irq++) {
 		int trigger, polarity;
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* [PATCH 17/17] xen: disable MSI
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (15 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 16/17] xen: fix legacy irq setup, make ioapic-less machines work Jeremy Fitzhardinge
@ 2009-05-12 23:25 ` Jeremy Fitzhardinge
  2009-05-19 12:35 ` [GIT PULL] Xen APIC hooks (with io_apic_ops) Ingo Molnar
  17 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-12 23:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge

From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

Disable MSI until we support it properly.

[ Impact: prevent MSI subsystem from crashing ]

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/apic.c |    3 +++
 drivers/pci/pci.h   |    2 --
 include/linux/pci.h |    6 ++++++
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c
index 35a8af7..fece57a 100644
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -1,6 +1,7 @@
 #include <linux/kernel.h>
 #include <linux/threads.h>
 #include <linux/bitmap.h>
+#include <linux/pci.h>
 
 #include <asm/io_apic.h>
 #include <asm/acpi.h>
@@ -54,6 +55,8 @@ void xen_init_apic(void)
 	if (!xen_initial_domain())
 		return;
 
+	pci_no_msi();
+
 	set_io_apic_ops(&xen_ioapic_ops);
 
 #ifdef CONFIG_ACPI
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index d03f6b9..79ada7b 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -111,10 +111,8 @@ extern struct rw_semaphore pci_bus_sem;
 extern unsigned int pci_pm_d3_delay;
 
 #ifdef CONFIG_PCI_MSI
-void pci_no_msi(void);
 extern void pci_msi_init_pci_dev(struct pci_dev *dev);
 #else
-static inline void pci_no_msi(void) { }
 static inline void pci_msi_init_pci_dev(struct pci_dev *dev) { }
 #endif
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 72698d8..724d030 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1253,5 +1253,11 @@ static inline irqreturn_t pci_sriov_migration(struct pci_dev *dev)
 }
 #endif
 
+#ifdef CONFIG_PCI_MSI
+void pci_no_msi(void);
+#else
+static inline void pci_no_msi(void) { }
+#endif
+
 #endif /* __KERNEL__ */
 #endif /* LINUX_PCI_H */
-- 
1.6.0.6


^ permalink raw reply related	[flat|nested] 104+ messages in thread

* Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
                   ` (16 preceding siblings ...)
  2009-05-12 23:25 ` [PATCH 17/17] xen: disable MSI Jeremy Fitzhardinge
@ 2009-05-19 12:35 ` Ingo Molnar
  2009-05-20 17:57   ` Jeremy Fitzhardinge
  2009-05-24 20:10   ` Avi Kivity
  17 siblings, 2 replies; 104+ messages in thread
From: Ingo Molnar @ 2009-05-19 12:35 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel


* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> Hi Ingo,
> 
> Here's a revised set of the Xen APIC changes which adds 
> io_apic_ops to allow Xen to intercept IO APIC access operations.

In a previous discussion you said:

> IO APIC operations are not even slightly performance critical? Are 
> they ever used on the interrupt delivery path?

Since they are not performance critical, then why doesn't Xen catch 
the IO-APIC accesses and virtualize the device?

If you want to hook into the IO-APIC code at such a low level, why 
don't you hook into the _hardware_ API - i.e. catch those 
setup/routing modifications to the IO-APIC space? No Linux changes 
are needed in that case.

	Ingo

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-19 12:35 ` [GIT PULL] Xen APIC hooks (with io_apic_ops) Ingo Molnar
@ 2009-05-20 17:57   ` Jeremy Fitzhardinge
  2009-05-25  4:10     ` Ingo Molnar
  2009-05-24 20:10   ` Avi Kivity
  1 sibling, 1 reply; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-20 17:57 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Keir Fraser

Ingo Molnar wrote:
> Since they are not performance critical, then why doesn't Xen catch 
> the IO-APIC accesses and virtualize the device?
>
> If you want to hook into the IO-APIC code at such a low level, why 
> don't you hook into the _hardware_ API - i.e. catch those 
> setup/routing modifications to the IO-APIC space? No Linux changes 
> are needed in that case.
>   

Yes, these changes aren't for a performance reason.  It's a case where a 
few lines of change in Linux save many hundreds or thousands of lines 
of change in Xen.

Xen doesn't have an internal mechanism for emulating devices via 
pagefaults (that's generally handled by a qemu instance running as part 
of a guest domain), so there's no mechanism to map and emulate the 
io-apic.  Putting such support into Xen would mean adding a pile of new 
infrastructure to support this case.

Unlike the mtrr discussion, where the msr read/write ops would allow us 
to emulate the mtrr within the Xen-specific parts of the kernel, the 
io-apic ops are just accessed via normal memory writes which we can't 
hook, so it would have to be done within Xen.

The other thing I thought about was putting a hook in the Linux 
pagefault handler, so we could emulate the ioapic at that level.  But 
putting a hook in a very hot path to avoid code changes in a cold path 
doesn't make any sense.  (Same applies to doing PF emulation within Xen; 
that's an even hotter path than Linux's.)
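
For readers without patch 02 in front of them, the shape of the hook is
roughly an ops table that sits in front of the memory-mapped accesses.
The sketch below is an illustration only: the toy_* names, the two-entry
struct and the counting backend are hypothetical and are not the
io_apic_ops definition from the series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table: every IO APIC register access goes through it. */
struct toy_io_apic_ops {
	unsigned int (*read)(unsigned int apic, unsigned int reg);
	void (*write)(unsigned int apic, unsigned int reg, unsigned int value);
};

#define TOY_NR_APICS 2
#define TOY_NR_REGS  64

/* "Native" backend: a plain array standing in for the mapped registers. */
static unsigned int toy_regs[TOY_NR_APICS][TOY_NR_REGS];

static unsigned int toy_native_read(unsigned int apic, unsigned int reg)
{
	assert(apic < TOY_NR_APICS && reg < TOY_NR_REGS);
	return toy_regs[apic][reg];
}

static void toy_native_write(unsigned int apic, unsigned int reg,
			     unsigned int value)
{
	assert(apic < TOY_NR_APICS && reg < TOY_NR_REGS);
	toy_regs[apic][reg] = value;
}

/* "Hypervisor" backend: counts accesses where a real one would hypercall. */
unsigned int toy_intercepted;

static unsigned int toy_hv_read(unsigned int apic, unsigned int reg)
{
	(void)apic;
	(void)reg;
	toy_intercepted++;
	return 0;
}

static void toy_hv_write(unsigned int apic, unsigned int reg,
			 unsigned int value)
{
	(void)apic;
	(void)reg;
	(void)value;
	toy_intercepted++;
}

const struct toy_io_apic_ops toy_native_ops = { toy_native_read, toy_native_write };
const struct toy_io_apic_ops toy_hv_ops = { toy_hv_read, toy_hv_write };

static const struct toy_io_apic_ops *toy_ops = &toy_native_ops;

/* Swap the backend once at boot; callers never see the difference. */
void toy_set_io_apic_ops(const struct toy_io_apic_ops *ops)
{
	assert(ops != NULL && ops->read != NULL && ops->write != NULL);
	toy_ops = ops;
}

unsigned int toy_io_apic_read(unsigned int apic, unsigned int reg)
{
	return toy_ops->read(apic, reg);
}

void toy_io_apic_write(unsigned int apic, unsigned int reg, unsigned int value)
{
	toy_ops->write(apic, reg, value);
}
```

The indirection costs one pointer dereference on a cold path, which is
the trade being argued for above.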

    J

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-19 12:35 ` [GIT PULL] Xen APIC hooks (with io_apic_ops) Ingo Molnar
  2009-05-20 17:57   ` Jeremy Fitzhardinge
@ 2009-05-24 20:10   ` Avi Kivity
  2009-05-25  3:51     ` Ingo Molnar
  1 sibling, 1 reply; 104+ messages in thread
From: Avi Kivity @ 2009-05-24 20:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeremy Fitzhardinge, the arch/x86 maintainers,
	Linux Kernel Mailing List, Xen-devel

Ingo Molnar wrote:
>> IO APIC operations are not even slightly performance critical? Are 
>> they ever used on the interrupt delivery path?
>>     
>
> Since they are not performance critical, then why doesn't Xen catch 
> the IO-APIC accesses and virtualize the device?
>
> If you want to hook into the IO-APIC code at such a low level, why 
> don't you hook into the _hardware_ API - i.e. catch those 
> setup/routing modifications to the IO-APIC space? No Linux changes 
> are needed in that case.
>   

When x2apic is enabled, and EOI broadcast is disabled, then the io apic 
does become a hot path - it needs to be written for each level-triggered 
interrupt EOI.  In this case I might want to paravirtualize  the EOI 
write to exit only if an interrupt is pending; otherwise communicate via 
shared memory.

We do something similar for Windows (by patching it) very successfully; 
Windows likes to touch the APIC TPR ~ 100,000 times per second, usually 
without triggering an interrupt.  We hijack these writes, do the checks 
in guest context, and only exit if the TPR write would trigger an interrupt.
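
The kind of in-guest check meant here could look like the sketch below.
It is a simplified model and not kvm's actual code: it ignores in-service
interrupts and only compares the priority class (upper four bits) of the
highest pending vector with that of the new TPR value. The function name
and the "-1 means nothing pending" convention are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether a TPR write must exit to the hypervisor.
 * highest_pending_vector is -1 when no interrupt is pending.
 * Simplification: in-service interrupts are not taken into account.
 */
bool toy_tpr_write_needs_exit(unsigned new_tpr, int highest_pending_vector)
{
	assert(new_tpr <= 0xff);
	assert(highest_pending_vector >= -1 && highest_pending_vector <= 0xff);

	if (highest_pending_vector < 0)
		return false;	/* nothing pending, handle the write locally */

	/* Deliverable only if its priority class is above the new TPR's class. */
	return ((unsigned)highest_pending_vector >> 4) > (new_tpr >> 4);
}
```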

(kvm will likely gain x2apic support in 2.6.32; patches have already 
been posted)

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-24 20:10   ` Avi Kivity
@ 2009-05-25  3:51     ` Ingo Molnar
  2009-05-25  4:55       ` Avi Kivity
  0 siblings, 1 reply; 104+ messages in thread
From: Ingo Molnar @ 2009-05-25  3:51 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Jeremy Fitzhardinge, the arch/x86 maintainers,
	Linux Kernel Mailing List, Xen-devel


* Avi Kivity <avi@redhat.com> wrote:

> Ingo Molnar wrote:
>>> IO APIC operations are not even slightly performance critical? Are  
>>> they ever used on the interrupt delivery path?
>>>     
>>
>> Since they are not performance critical, then why doesn't Xen catch the 
>> IO-APIC accesses and virtualize the device?
>>
>> If you want to hook into the IO-APIC code at such a low level, why  
>> don't you hook into the _hardware_ API - i.e. catch those setup/routing 
>> modifications to the IO-APIC space? No Linux changes are needed in that 
>> case.
>>   
>
> When x2apic is enabled, and EOI broadcast is disabled, then the io 
> apic does become a hot path - it needs to be written for each 
> level-triggered interrupt EOI.  In this case I might want to 
> paravirtualize the EOI write to exit only if an interrupt is 
> pending; otherwise communicate via shared memory.
>
> We do something similar for Windows (by patching it) very 
> successfully; Windows likes to touch the APIC TPR ~ 100,000 times 
> per second, usually without triggering an interrupt.  We hijack 
> these writes, do the checks in guest context, and only exit if the 
> TPR write would trigger an interrupt.

I suspect you are aware that this is about the io-apic, not the local 
APIC. The local apic methods are already driver-ized - and they sit 
closer to the CPU so they matter more to performance.

> (kvm will likely gain x2apic support in 2.6.32; patches have 
> already been posted)

ok. This points in the direction of the io-apic driver abstraction 
from Jeremy being the right long-term approach. We already have a 
few quirks that could be cleaned up by using a proper driver 
interface.

	Ingo

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH 02/17] x86: add io_apic_ops to allow interception
  2009-05-12 23:25 ` [PATCH 02/17] x86: add io_apic_ops to allow interception Jeremy Fitzhardinge
@ 2009-05-25  3:54   ` Ingo Molnar
  2009-05-27  7:17     ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 104+ messages in thread
From: Ingo Molnar @ 2009-05-25  3:54 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge


* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
> 
> Xen dom0 needs to paravirtualize IO operations to the IO APIC, so add
> a io_apic_ops for it to intercept.  Do this as ops structure because
> there's at least some chance that another paravirtualized environment
> may want to intercept these.
> 
> [Impact: indirect IO APIC access via io_apic_ops]
> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
> ---
>  arch/x86/include/asm/io_apic.h |    9 +++++++
>  arch/x86/kernel/apic/io_apic.c |   50 +++++++++++++++++++++++++++++++++++++--
>  2 files changed, 56 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
> index 9d826e4..8cbfe73 100644
> --- a/arch/x86/include/asm/io_apic.h
> +++ b/arch/x86/include/asm/io_apic.h
> @@ -21,6 +21,15 @@
>  #define IO_APIC_REDIR_LEVEL_TRIGGER	(1 << 15)
>  #define IO_APIC_REDIR_MASKED		(1 << 16)
>  
> +struct io_apic_ops {
> +	void (*init)(void);
> +	unsigned int (*read)(unsigned int apic, unsigned int reg);
> +	void (*write)(unsigned int apic, unsigned int reg, unsigned int value);
> +	void (*modify)(unsigned int apic, unsigned int reg, unsigned int value);
> +};
> +
> +void __init set_io_apic_ops(const struct io_apic_ops *);

ok, could you please turn the whole IO-APIC code into a driver 
framework? I.e. all IO-APIC calls outside of 
arch/x86/kernel/apic/io_apic.c should be to some io_apic-> method.

The advantage will be a proper abstraction for all IO-APIC details - 
not just a minimalistic one for Xen's need.

Also, please name it 'struct io_apic' - similar to the 'struct apic' 
naming we have for the local APIC driver structure.
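
Roughly, the dispatch pattern under discussion looks like the userspace
sketch below.  It reuses the io_apic_ops layout from the patch, but the
fake register array, the native_* helpers and the counting_* ops are
stand-ins invented for illustration, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the ops structure in the quoted patch. */
struct io_apic_ops {
	void (*init)(void);
	unsigned int (*read)(unsigned int apic, unsigned int reg);
	void (*write)(unsigned int apic, unsigned int reg, unsigned int value);
	void (*modify)(unsigned int apic, unsigned int reg, unsigned int value);
};

/* Assumption: a plain array stands in for the real register window. */
#define NR_FAKE_REGS 64
static unsigned int fake_regs[NR_FAKE_REGS];

static void native_init(void)
{
}

static unsigned int native_read(unsigned int apic, unsigned int reg)
{
	(void)apic;
	return fake_regs[reg % NR_FAKE_REGS];
}

static void native_write(unsigned int apic, unsigned int reg, unsigned int value)
{
	(void)apic;
	fake_regs[reg % NR_FAKE_REGS] = value;
}

/* Default ops: direct access, as on bare hardware. */
static struct io_apic_ops io_apic_ops = {
	.init   = native_init,
	.read   = native_read,
	.write  = native_write,
	.modify = native_write,
};

static void set_io_apic_ops(const struct io_apic_ops *ops)
{
	assert(ops != NULL);
	io_apic_ops = *ops;
}

/* Callers go through the ops table and never touch registers directly. */
static unsigned int io_apic_read(unsigned int apic, unsigned int reg)
{
	return io_apic_ops.read(apic, reg);
}

static void io_apic_write(unsigned int apic, unsigned int reg, unsigned int value)
{
	io_apic_ops.write(apic, reg, value);
}

/* Hypothetical intercepting ops: count each access, then forward it. */
static unsigned int intercepted;

static unsigned int counting_read(unsigned int apic, unsigned int reg)
{
	intercepted++;
	return native_read(apic, reg);
}

static void counting_write(unsigned int apic, unsigned int reg, unsigned int value)
{
	intercepted++;
	native_write(apic, reg, value);
}

static const struct io_apic_ops counting_ops = {
	.init   = native_init,
	.read   = counting_read,
	.write  = counting_write,
	.modify = counting_write,
};
```

A paravirtualized environment would install its own table once at boot
with set_io_apic_ops(&counting_ops); native kernels keep the defaults
and pay one indirect call per access.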

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-20 17:57   ` Jeremy Fitzhardinge
@ 2009-05-25  4:10     ` Ingo Molnar
  2009-05-26 12:46       ` [Xen-devel] " George Dunlap
  0 siblings, 1 reply; 104+ messages in thread
From: Ingo Molnar @ 2009-05-25  4:10 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Keir Fraser, Linus Torvalds, Avi Kivity


* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> Ingo Molnar wrote:
>> Since they are not performance critical, then why doesnt Xen catch the 
>> IO-APIC accesses, and virtualizes the device?
>>
>> If you want to hook into the IO-APIC code at such a low level, why  
>> dont you hook into the _hardware_ API - i.e. catch those setup/routing 
>> modifications to the IO-APIC space. No Linux changes are needed in that 
>> case.
>>   
>
> Yes, these changes aren't for a performance reason.  It's a case 
> where a few lines change in Linux saves many hundreds or thousands 
> of lines change in Xen.
>
> Xen doesn't have an internal mechanism for emulating devices via 
> pagefaults (that's generally handled by a qemu instance running as 
> part of a guest domain), so there's no mechanism to map and 
> emulate the io-apic.  Putting such support into Xen would mean 
> adding a pile of new infrastructure to support this case.

Note that this design problem has been created by Xen, 
intentionally, and Xen is now suffering under those bad technical 
choices made years ago. It's not Linux's problem.

The whole Xen design is messed up really: you have taken off bits of 
the Linux kernel you found interesting, turned them into a 
micro-kernel in essence and renamed it to 'Xen'.

But drivers and proper architecture are apparently boring (and 
fragile and hard and expensive to write and support in a 
micro-kernel setup) so you came up with this DOM0 piece of cr*p that 
ties Linux to Xen even closer (along an _ABI_), where Linux does 
most of the real work while Xen still stays 'separate' on paper.

Xen isnt actually useful _at all_ without Linux/DOM0. Without Dom0 
Xen is slow and native hardware support within Xen is virtually 
non-existent, as you point out above.

This is proof that you should have done all that work within Linux - 
instead of duplicating a lot of code.

> Unlike the mtrr discussion, where the msr read/write ops would 
> allow us to emulate the mtrr within the Xen-specific parts of the 
> kernel, the io-apic ops are just accessed via normal memory writes 
> which we can't hook, so it would have to be done within Xen.
>
> The other thing I thought about was putting a hook in the Linux 
> pagefault handler, so we could emulate the ioapic at that level.  
> But putting a hook in a very hot path to avoid code changes in a 
> cold path doesn't make any sense.  (Same applies to doing PF 
> emulation within Xen; that's an even hotter path than Linux's.)

We already have various page fault notifiers, you could reuse them 
if you wanted to.

Anyway, i'll pull the IO-APIC driver-ization changes if it's 
complete, thorough and clean, because that will obviously help Linux 
too. But the influx of paravirt overhead slowing down the native 
kernel has to stop really.

	Ingo

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-25  3:51     ` Ingo Molnar
@ 2009-05-25  4:55       ` Avi Kivity
  2009-05-25  5:06         ` Ingo Molnar
  0 siblings, 1 reply; 104+ messages in thread
From: Avi Kivity @ 2009-05-25  4:55 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeremy Fitzhardinge, the arch/x86 maintainers,
	Linux Kernel Mailing List, Xen-devel

Ingo Molnar wrote:
>> We do something similar for Windows (by patching it) very 
>> successfully; Windows likes to touch the APIC TPR ~ 100,000 times 
>> per second, usually without triggering an interrupt.  We hijack 
>> these writes, do the checks in guest context, and only exit if the 
>> TPR write would trigger an interrupt.
>>     
>
> I suspect you are aware that this is about the io-apic, not the local 
> APIC. The local apic methods are already driver-ized - and they sit 
> closer to the CPU so they matter more to performance.
>   

Yeah, I gave this as an example.  It's very different -- io-apic vs. 
local apic, paravirtualization vs. patching the guest behind its back, 
Linux vs. Windows.

Of course if we hook the io-apic EOI we'll want to hook the local apic 
EOI as well.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-25  4:55       ` Avi Kivity
@ 2009-05-25  5:06         ` Ingo Molnar
  2009-05-25  5:12           ` Avi Kivity
  0 siblings, 1 reply; 104+ messages in thread
From: Ingo Molnar @ 2009-05-25  5:06 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Jeremy Fitzhardinge, the arch/x86 maintainers,
	Linux Kernel Mailing List, Xen-devel


* Avi Kivity <avi@redhat.com> wrote:

> Ingo Molnar wrote:
>>> We do something similar for Windows (by patching it) very  
>>> successfully; Windows likes to touch the APIC TPR ~ 100,000 times  
>>> per second, usually without triggering an interrupt.  We hijack  
>>> these writes, do the checks in guest context, and only exit if the  
>>> TPR write would trigger an interrupt.
>>>     
>>
>> I suspect you are aware that this is about the io-apic, not the local  
>> APIC. The local apic methods are already driver-ized - and they sit  
>> closer to the CPU so they matter more to performance.
>>   
>
> Yeah, I gave this as an example.  It's very different -- io-apic 
> vs.  local apic, paravirtualization vs. patching the guest behind 
> its back, Linux vs. Windows.
>
> Of course if we hook the io-apic EOI we'll want to hook the local 
> apic EOI as well.

Yeah. Eventually anything that matters to performance will be 
accelerated by hardware (and properly virtualized), which in turn 
will be faster than any hypercall based approach, right?

	Ingo

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-25  5:06         ` Ingo Molnar
@ 2009-05-25  5:12           ` Avi Kivity
  2009-05-25  5:19             ` Ingo Molnar
  0 siblings, 1 reply; 104+ messages in thread
From: Avi Kivity @ 2009-05-25  5:12 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeremy Fitzhardinge, the arch/x86 maintainers,
	Linux Kernel Mailing List, Xen-devel

Ingo Molnar wrote:
> * Avi Kivity <avi@redhat.com> wrote:
>
>   
>> Ingo Molnar wrote:
>>     
>>>> We do something similar for Windows (by patching it) very  
>>>> successfully; Windows likes to touch the APIC TPR ~ 100,000 times  
>>>> per second, usually without triggering an interrupt.  We hijack  
>>>> these writes, do the checks in guest context, and only exit if the  
>>>> TPR write would trigger an interrupt.
>>>>     
>>>>         
>>> I suspect you are aware that this is about the io-apic, not the local  
>>> APIC. The local apic methods are already driver-ized - and they sit  
>>> closer to the CPU so they matter more to performance.
>>>   
>>>       
>> Yeah, I gave this as an example.  It's very different -- io-apic 
>> vs.  local apic, paravirtualization vs. patching the guest behind 
>> its back, Linux vs. Windows.
>>
>> Of course if we hook the io-apic EOI we'll want to hook the local 
>> apic EOI as well.
>>     
>
> Yeah. Eventually anything that matters to performance will be 
> accelerated by hardware (and properly virtualized), which in turn 
> will be faster than any hypercall based approach, right?
>   

Right.  That's already happened to the TPR (Intel processors accelerate 
that 4-bit register but ignore everything else in the local apic).  As 
another example, we have mmu paravirtualization in kvm, but 
automatically disable it when the hardware does nested paging.  The 
problem is that hardware support has a long pipeline, and even when 
support does appear, there's a massive installed base to care about.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-25  5:12           ` Avi Kivity
@ 2009-05-25  5:19             ` Ingo Molnar
  0 siblings, 0 replies; 104+ messages in thread
From: Ingo Molnar @ 2009-05-25  5:19 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Jeremy Fitzhardinge, the arch/x86 maintainers,
	Linux Kernel Mailing List, Xen-devel


* Avi Kivity <avi@redhat.com> wrote:

> Ingo Molnar wrote:
>> * Avi Kivity <avi@redhat.com> wrote:
>>
>>   
>>> Ingo Molnar wrote:
>>>     
>>>>> We do something similar for Windows (by patching it) very   
>>>>> successfully; Windows likes to touch the APIC TPR ~ 100,000 times 
>>>>>  per second, usually without triggering an interrupt.  We hijack  
>>>>> these writes, do the checks in guest context, and only exit if 
>>>>> the  TPR write would trigger an interrupt.
>>>>>             
>>>> I suspect you are aware that this is about the io-apic, not the local 
>>>>  APIC. The local apic methods are already driver-ized - and they 
>>>> sit  closer to the CPU so they matter more to performance.
>>>>         
>>> Yeah, I gave this as an example.  It's very different -- io-apic vs.  
>>> local apic, paravirtualization vs. patching the guest behind its 
>>> back, Linux vs. Windows.
>>>
>>> Of course if we hook the io-apic EOI we'll want to hook the local  
>>> apic EOI as well.
>>>     
>>
>> Yeah. Eventually anything that matters to performance will be 
>> accelerated by hardware (and properly virtualized), which in turn 
>> will be faster than any hypercall based approach, right?
>
> Right.  That's already happened to the TPR (Intel processors 
> accelerate that 4-bit register but ignore everything else in the 
> local apic).  As another example, we have mmu paravirtualization 
> in kvm, but automatically disable it when the hardware does nested 
> paging.  The problem is that hardware support has a long pipeline, 
> and even when support does appear, there's a massive installed 
> base to care about.

Yeah. Btw., i also think that in-kernel IO-APIC and APIC emulation 
could have uses elsewhere as well - such as in testing. Currently 
you actually have to own a big box to be able to test certain 
hardware limits. This has a negative effect on test coverage and a 
subsequent negative effect on kernel quality. If KVM provided clean 
code to emulate certain hw environments we could check out limits 
(and our bugs) far more effectively.

	Ingo

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-25  4:10     ` Ingo Molnar
@ 2009-05-26 12:46       ` George Dunlap
  2009-05-26 18:26         ` Avi Kivity
  2009-05-26 21:19         ` [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops) Gerd Hoffmann
  0 siblings, 2 replies; 104+ messages in thread
From: George Dunlap @ 2009-05-26 12:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jeremy Fitzhardinge, Xen-devel, the arch/x86 maintainers,
	Linux Kernel Mailing List, Avi Kivity, Linus Torvalds,
	Keir Fraser

On Mon, May 25, 2009 at 5:10 AM, Ingo Molnar <mingo@elte.hu> wrote:
> Note that this design problem has been created by Xen,
> intentionally, and Xen is now suffering under those bad technical
> choices made years ago. It's not Linux's problem.

I'd like to respectfully disagree with this.  I think I can see your
point of view: you're being asked to make changes to accommodate a
project you're not involved in, and whose fundamental design you
disagree with.  And no one disagrees with the stance that changes to
accommodate Xen must not impact native performance.  But I think the
current design (with dom0 running linux-as-hypervisor-component) is
the best one, and it's one we would make over again if we had to start
from scratch.

Basically, there are three ways to approach the hypervisor problem wrt Linux:
1. Make Linux into a hypervisor (linux-as-hypervisor). This is the KVM approach.
2. Fork Linux, stealing all the device drivers, and making a
monolithic hypervisor.
3. Make a small, lean hypervisor, but leverage Linux to run the
devices and control stack (linux-as-hypervisor-component).

I've worked a bit at both kernel and hypervisor level (although
admittedly much more in-depth at the hypervisor level).  It seems to
me that being a hypervisor is a much different thing than being a
kernel.  I don't believe that one piece of software can do both well.
And I believe that, when it begins to mature more, KVM will run into
the very same issue.  KVM developers will really want to start to make
the kernel into a hypervisor, and there will be a disagreement between
those who want the kernel to be just a kernel, and those who want the
kernel also to be a hypervisor.  The result will be either a heavily
modified Linux (much more than linux-as-hypervisor-component) or a
really sucky hypervisor.

As a simple example, take scheduling.  I'm about to re-write the Xen
scheduler, and in the process I took a good look at the scheduler you
wrote.  I think it's got a lot of really good ideas, which I plan to
steal. :-)  However, I'm going to have to make some key changes in
order for it to function well as a hypervisor scheduler.  If KVM is
used on a production server with 20 or 30 multi-vcpu VMs, I predict
the current scheduler will do very poorly, because it wasn't designed
with VMs in mind, but with processes.  Changes that make VMs run
better will fundamentally make processes run less well.

Forking Linux, drivers and all, is not a good idea; anyone would have
to be a fool to try it.  I think if you think seriously about it,
you'd never do something like that.  I don't believe any such
project would have a snowball's chance in hell of attracting anywhere
near the required number of hardware developers to make it an
enterprise-class system.  If, somehow, it did manage to attract a
critical mass to make it viable, then the result would be two much
weaker projects, wasting millions of man-hours of labor doing
unnecessary duplication.

No, I think the best option, and the option the Xen project would take
again if we were to start from scratch, would be what we have done:
To build a hypervisor to be a hypervisor, and let the kernel be a
kernel: but leverage the millions of man-hours still being done in
hardware support for Linux.

Either way, time will tell in the end.  If I'm wrong, and KVM can
become an enterprise-class hypervisor while playing well with
linux-as-kernel, then eventually it will dominate and Xen will die
out.  You can say "I told you so" and remove all the crap you've been
objecting to.  If I'm right, however, then having Xen around will be
critical, not just for open-source virtualization, but for the kernel
as well.  You'll be happy to be able to tell people, "Don't put this
hypervisor crap in here.  If you want a hypervisor, go to Xen." :-)

Until things are shown clearly one way or the other, the best thing to
do is hedge your bets, and allow both projects to develop.

[That's my main point; in-line responses below.]

> The whole Xen design is messed up really: you have taken off bits of
> the Linux kernel you found interesting, turned them into a
> micro-kernel in essence and renamed it to 'Xen'.

That's how Xen started, and that's really the beauty of open-source.
(After all, KVM has stolen some ideas from the Xen shadow code.)  But
since then, basically all of the code has been replaced with
Xen-written code.  I think if you did an SCO-style audit comparing
Linux and Xen 3.4, you'd find a lot less in common than you think.

> But drivers and proper architecture are apparently boring (and
> fragile and hard and expensive to write and support in a
> micro-kernel setup) so you came up with this DOM0 piece of cr*p that
> ties Linux to Xen even closer (along an _ABI_), where Linux does
> most of the real work while Xen still stays 'separate' on paper.

It's not boring, it's just a colossal waste of time and resources to
duplicate all that effort.  "Real work" is done by all of the
components: Xen does the "real work" of scheduling and resource
management; Linux does the "real work" of process-level stuff,
filesystems, and so on and (in the case of dom0) hardware support;
qemu does the "real work" of doing device emulation.  All of them are
unique, difficult, and interesting to somebody.  Reducing duplication
means everyone can work on what interests them the most, and minimizes
the total "busy work" for all involved.

How many KVM developers are working on device drivers?  And how would
Xen duplicating all the driver development help Linux?  Linux would
still have to do everything, there'd just be fewer developers to do it
(since some people would be working on Xen drivers instead).

> Xen isnt actually useful _at all_ without Linux/DOM0. Without Dom0
> Xen is slow and native hardware support within Xen is virtually
> non-existent, as you point out above.

And qemu-kvm isn't useful _at_all_ without Linux either; and Linux-KVM
isn't useful _at_all_ without qemu.  Your point?

Xen will run without dom0?  I wasn't aware of that... ;-)

> This is proof that you should have done all that work within Linux -
> instead of duplicating a lot of code.

See above.

 -George Dunlap

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-26 12:46       ` [Xen-devel] " George Dunlap
@ 2009-05-26 18:26         ` Avi Kivity
  2009-05-26 19:18           ` Dan Magenheimer
  2009-05-26 21:19         ` [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops) Gerd Hoffmann
  1 sibling, 1 reply; 104+ messages in thread
From: Avi Kivity @ 2009-05-26 18:26 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ingo Molnar, Jeremy Fitzhardinge, Xen-devel,
	the arch/x86 maintainers, Linux Kernel Mailing List,
	Linus Torvalds, Keir Fraser

George Dunlap wrote:
> As a simple example, take scheduling.  I'm about to re-write the Xen
> scheduler, and in the process I took a good look at the scheduler you
> wrote.  I think it's got a lot of really good ideas, which I plan to
> steal. :-)  However, I'm going to have to make some key changes in
> order for it to function well as a hypervisor scheduler.  If KVM is
> used on a production server with 20 or 30 multi-vcpu VMs, I predict
> the current scheduler will do very poorly, because it wasn't designed
> with VMs in mind, but with processes.  Changes that make VMs run
> better will fundamentally make processes run less well.
>   

The Linux scheduler already supports multiple scheduling classes.  If we 
find that none of them will fit our needs, we'll propose a new one.  
There are also multiple I/O schedulers, multiple allocators (perhaps a 
bad example), and multiple filesystems.

When the need can be demonstrated to be real, and the implementation can 
be clean, Linux can usually be adapted.

I think the Xen design has merit if it can truly make dom0 a guest -- 
that is, if it can survive dom0 failure.  Until then, you're just taking 
a large interdependent codebase and splitting it at some random point, 
but you don't get any stability or security in return.  It will also be 
interesting to see how far Xen can get along without real memory 
management (overcommit).

>> The whole Xen design is messed up really: you have taken off bits of
>> the Linux kernel you found interesting, turned them into a
>> micro-kernel in essence and renamed it to 'Xen'.
>>     
>
> That's how Xen started, and that's really the beauty of open-source.
> (After all, KVM has stolen some ideas from the Xen shadow code.)  But
> since then, basically all of the code has been replaced with
> Xen-written code.  I think if you did an SCO-style audit comparing
> Linux and Xen 3.4, you'd find a lot less in common than you think.
>   

A lot of the arch code is derived from Linux.

>> Xen isnt actually useful _at all_ without Linux/DOM0. Without Dom0
>> Xen is slow and native hardware support within Xen is virtually
>> non-existent, as you point out above.
>>     
>
> And qemu-kvm isn't useful _at_all_ without Linux either; and Linux-KVM
> isn't useful _at_all_ without qemu.  Your point?
>   

kvm is actually being used by other userspaces.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* RE: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-26 18:26         ` Avi Kivity
@ 2009-05-26 19:18           ` Dan Magenheimer
  2009-05-26 19:41             ` Avi Kivity
  2009-05-28  0:13             ` Ingo Molnar
  0 siblings, 2 replies; 104+ messages in thread
From: Dan Magenheimer @ 2009-05-26 19:18 UTC (permalink / raw)
  To: Avi Kivity, George Dunlap
  Cc: Jeremy Fitzhardinge, Xen-devel, the arch/x86 maintainers,
	Linux Kernel Mailing List, Keir Fraser, Ingo Molnar,
	Linus Torvalds

> It will also be 
> interesting to see how far Xen can get along without real memory 
> management (overcommit).

Several implementations of "classic" memory overcommit have been
done for Xen, most recently the Difference Engine work at UCSD.
It is true that none have been merged yet, in part because,
in many real world environments, "generalized" overcommit
often leads to hypervisor swapping, and performance becomes
unacceptable.  (In other words, except in certain limited customer
use models, memory overcommit is a "marketing feature".)

There's also a novel approach, Transcendent Memory (aka "tmem"
see http://oss.oracle.com/projects/tmem).  Though tmem requires the
guest to participate in memory management decisions (thus requiring
a Linux patch), system-wide physical memory efficiency may
improve vs memory deduplication, and hypervisor-based swapping
is not necessary.

> The Linux scheduler already supports multiple scheduling 
> classes.  If we 
> find that none of them will fit our needs, we'll propose a new one.  
> When the need can be demonstrated to be real, and the 
> implementation can 
> be clean, Linux can usually be adapted.

But that's exactly George and Jeremy's point.  KVM will
eventually require changes that clutter Linux for purposes
that are relevant only to a hypervisor.

> > I think if you did an SCO-style audit comparing
> > Linux and Xen 3.4, you'd find a lot less in common than you think.  
> 
> A lot of the arch code is derived from Linux.

Indeed it is, but the operative word is "derived".  In
many cases, the code has been modified to be more applicable
to a hypervisor.  For example, in Xen, tmem uses radix trees
in a way that is similar to Linux but different enough that
the changes would not likely be acceptable in Linux.  The
separation between Xen and Linux allows this diversity
without cluttering Linux.

I think we can all agree that drawing boundaries between
"hypervisor" functionality and "operating system"
functionality is a work in progress and may take many
more years to settle.  In the meantime, there should be
room (and support) for different approaches.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-26 19:18           ` Dan Magenheimer
@ 2009-05-26 19:41             ` Avi Kivity
  2009-05-28  0:13             ` Ingo Molnar
  1 sibling, 0 replies; 104+ messages in thread
From: Avi Kivity @ 2009-05-26 19:41 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: George Dunlap, Jeremy Fitzhardinge, Xen-devel,
	the arch/x86 maintainers, Linux Kernel Mailing List, Keir Fraser,
	Ingo Molnar, Linus Torvalds

Dan Magenheimer wrote:
>> It will also be 
>> interesting to see how far Xen can get along without real memory 
>> management (overcommit).
>>     
>
> Several implementations of "classic" memory overcommit have been
> done for Xen, most recently the Difference Engine work at UCSD.
> It is true that none have been merged yet, in part because,
> in many real world environments, "generalized" overcommit
> often leads to hypervisor swapping, and performance becomes
> unacceptable.  (In other words, except in certain limited customer
> use models, memory overcommit is a "marketing feature".)
>   

Swapping indeed drags performance down horribly.  I regard it as a last 
resort solution used when everything else (page sharing, compression, 
ballooning, live migration) has failed.  By having that last resort you 
can actually use the other methods without fearing an out-of-memory 
condition eventually.

Note that with SSDs, disks have started to narrow the gap between memory 
and secondary storage access times, so swapping will actually start 
improving rather than regressing as it has done in recent times.

> There's also a novel approach, Transcendent Memory (aka "tmem"
> see http://oss.oracle.com/projects/tmem).  Though tmem requires the
> guest to participate in memory management decisions (thus requiring
> a Linux patch), system-wide physical memory efficiency may
> improve vs memory deduplication, and hypervisor-based swapping
> is not necessary.
>   

Yes, I've seen that.  Another tool in the memory management arsenal.

>   
>> The Linux scheduler already supports multiple scheduling 
>> classes.  If we 
>> find that none of them will fit our needs, we'll propose a new one.  
>> When the need can be demonstrated to be real, and the 
>> implementation can 
>> be clean, Linux can usually be adapted.
>>     
>
> But that's exactly George and Jeremy's point.  KVM will
> eventually require changes that clutter Linux for purposes
> that are relevant only to a hypervisor.
>   

kvm has already made changes to Linux.  Preemption notifiers allow us to 
have a lightweight exit path, and mmu notifiers allow the Linux mmu to 
control the kvm mmu.  And in fact mmu notifiers have proven useful to 
device drivers.

It also works the other way around; for example work on cpu controllers 
will benefit kvm, and the real-time scheduler will also apply to kvm 
guests.  In fact many scheduler and memory management features 
immediately apply to kvm, usually without any need for integration.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-26 12:46       ` [Xen-devel] " George Dunlap
  2009-05-26 18:26         ` Avi Kivity
@ 2009-05-26 21:19         ` Gerd Hoffmann
  2009-05-27 10:14           ` George Dunlap
  1 sibling, 1 reply; 104+ messages in thread
From: Gerd Hoffmann @ 2009-05-26 21:19 UTC (permalink / raw)
  To: George Dunlap
  Cc: Ingo Molnar, Jeremy Fitzhardinge, Xen-devel,
	the arch/x86 maintainers, Linux Kernel Mailing List, Avi Kivity,
	Linus Torvalds, Keir Fraser

On 05/26/09 14:46, George Dunlap wrote:
> On Mon, May 25, 2009 at 5:10 AM, Ingo Molnar<mingo@elte.hu>  wrote:
>> Note that this design problem has been created by Xen,
>> intentionally, and Xen is now suffering under those bad technical
>> choices made years ago. It's not Linux's problem.
>
> I'd like to respectfully disagree with this.

Well.  Xen *does* suffer from bad technical choices made years ago.  I'm 
pretty sure Xen would look radically different if it were rewritten from 
scratch today.

One reason is that Xen predates vt and svm.  With that in mind some of 
the xen interface bits don't look *that* odd any more.  Back then it did 
make sense to handle things that way.  The ioapic hypercalls discussed 
in this thread belong in that group IMHO.

Another reason is that Xen wasn't "designed".  Xen was "hacked up".  As 
far as I know there is no document which describes the overall design of 
the guest/xen ABI.  Also there is no documentation (other than code) 
which describes all details of the guest/xen ABI.  Simple reason:  The 
ABI wasn't designed.  It was hammered into shape until it worked.  On 
x86.  The guys who attempted (and failed) to port xen to ppc had a lot of 
*ahem* fun with that stuff.  For example: Passing guest virtual 
addresses in (some) hypercalls.  Also, direct paging mode is very 
x86-ish and is the reason for a number of ia64-ifdefs in places where 
you don't expect them ...

cheers,
   Gerd

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [PATCH 02/17] x86: add io_apic_ops to allow interception
  2009-05-25  3:54   ` Ingo Molnar
@ 2009-05-27  7:17     ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-27  7:17 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: the arch/x86 maintainers, Linux Kernel Mailing List, Xen-devel,
	Jeremy Fitzhardinge, Greg KH, Jens Axboe

Ingo Molnar wrote:
> ok, could you please turn the whole IO-APIC code into a driver 
> framework? I.e. all IO-APIC calls outside of 
> arch/x86/kernel/apic/io_apic.c should be to some io_apic-> method.
>
> The advantage will be a proper abstraction for all IO-APIC details - 
> not just a minimalistic one for Xen's need.
>
> Also, please name it 'struct io_apic' - similar to the 'struct apic' 
> naming we have for the local APIC driver structure.

OK, I'll have a look at it.  I think it could turn out quite nicely, and 
possibly remove the need for some other Xen hooks around the 
place, as well as make the path for some other upcoming things 
clearer.

But in the meantime, would you consider taking the minimal ops approach 
for this next merge window, and the full api in the next dev cycle?
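
To make the idea quoted above concrete, here is a minimal user-space
sketch of an ops table that callers go through instead of touching
IO-APIC registers directly.  It is only an illustration: the struct
layout, the function names, the fake register array and the "hv"
backend are all assumptions made for this example, not the code in the
patch series or in arch/x86/kernel/apic/io_apic.c.

```c
#include <assert.h>
#include <stdint.h>

#define NR_IOAPIC_REGS 64	/* hypothetical size of the simulated register file */

/* Hypothetical driver structure, named after the 'struct io_apic' suggestion. */
struct io_apic {
	const char *name;
	uint32_t (*read)(unsigned int apic, unsigned int reg);
	void (*write)(unsigned int apic, unsigned int reg, uint32_t val);
};

/* Stand-in for the memory-mapped registers of a single IO-APIC. */
static uint32_t fake_regs[NR_IOAPIC_REGS];

static uint32_t native_read(unsigned int apic, unsigned int reg)
{
	(void)apic;		/* only one simulated IO-APIC in this sketch */
	assert(reg < NR_IOAPIC_REGS);
	return fake_regs[reg];
}

static void native_write(unsigned int apic, unsigned int reg, uint32_t val)
{
	(void)apic;
	assert(reg < NR_IOAPIC_REGS);
	fake_regs[reg] = val;
}

/*
 * Counts intercepted accesses.  A real hypervisor backend would issue a
 * hypercall at this point; here we just count and fall through to the
 * simulated registers.
 */
static unsigned int intercepted;

static uint32_t hv_read(unsigned int apic, unsigned int reg)
{
	intercepted++;
	return native_read(apic, reg);
}

static void hv_write(unsigned int apic, unsigned int reg, uint32_t val)
{
	intercepted++;
	native_write(apic, reg, val);
}

static const struct io_apic native_io_apic = {
	.name = "native", .read = native_read, .write = native_write,
};

static const struct io_apic hv_io_apic = {
	.name = "hv", .read = hv_read, .write = hv_write,
};

/* The active driver; platform setup code would select it once at boot. */
static const struct io_apic *io_apic = &native_io_apic;

/* Everything outside the driver calls these wrappers, never the registers. */
static uint32_t io_apic_read(unsigned int apic, unsigned int reg)
{
	return io_apic->read(apic, reg);
}

static void io_apic_write(unsigned int apic, unsigned int reg, uint32_t val)
{
	io_apic->write(apic, reg, val);
}
```

Switching backends is then a single pointer assignment (io_apic =
&hv_io_apic), and none of the callers of io_apic_read() or
io_apic_write() need to know which one is active.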

Thanks,
    J


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-26 21:19         ` [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops) Gerd Hoffmann
@ 2009-05-27 10:14           ` George Dunlap
  0 siblings, 0 replies; 104+ messages in thread
From: George Dunlap @ 2009-05-27 10:14 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Jeremy Fitzhardinge, Xen-devel, the arch/x86 maintainers,
	Linux Kernel Mailing List, Avi Kivity, Ingo Molnar,
	Linus Torvalds, Keir Fraser

On Tue, May 26, 2009 at 10:19 PM, Gerd Hoffmann <kraxel@redhat.com> wrote:
> Well.  Xen *does* suffer from bad technical choices made years ago.  I'm
> pretty sure Xen would look radically different when being rewritten from
> scratch today.

That may be.  I don't know enough about the specific issues you raise
below to comment.  But Ingo wasn't bringing up those issues: he was
disagreeing with the whole idea of including dom0 Linux as a key
component of the Xen system.  If the Xen project were to start over
from scratch, we might make a lot of different decisions; but running
Linux as the hypervisor (as KVM does) or forking Linux (as Ingo seemed
to suggest) are not among them.

 -George

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-26 19:18           ` Dan Magenheimer
  2009-05-26 19:41             ` Avi Kivity
@ 2009-05-28  0:13             ` Ingo Molnar
  2009-05-28  0:49               ` Jeremy Fitzhardinge
                                 ` (3 more replies)
  1 sibling, 4 replies; 104+ messages in thread
From: Ingo Molnar @ 2009-05-28  0:13 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Avi Kivity, George Dunlap, Jeremy Fitzhardinge, Xen-devel,
	the arch/x86 maintainers, Linux Kernel Mailing List, Keir Fraser,
	Linus Torvalds


* Dan Magenheimer <dan.magenheimer@oracle.com> wrote:

> > The Linux scheduler already supports multiple scheduling 
> > classes.  If we find that none of them will fit our needs, we'll 
> > propose a new one.  When the need can be demonstrated to be 
> > real, and the implementation can be clean, Linux can usually be 
> > adapted.
> 
> But that's exactly George and Jeremy's point.  KVM will eventually 
> require changes that clutter Linux for purposes that are relevant 
> only to a hypervisor.

That's wrong. Any such scheduler classes would also help: control 
groups, containers, vserver, UML and who knows what other isolation 
project. Many such mechanisms are already implemented as well.

I rarely see any KVM-only feature in generic kernel code, and that's 
good.

Xen changes - especially dom0 - are overwhelmingly not about 
improving Linux, but about having some special hook and extra 
treatment in random places - and that's really bad.

I also find it pretty telling that you cut out the most important 
point of Avi's reply:

> > I think the Xen design has merit if it can truly make dom0 a 
> > guest -- that is, if it can survive dom0 failure.  Until then, 
> > you're just taking a large interdependent codebase and splitting 
> > it at some random point, but you don't get any stability or 
> > security in return.

that crucial question really has to be answered honestly and 
upfront.

	Ingo

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-28  0:13             ` Ingo Molnar
@ 2009-05-28  0:49               ` Jeremy Fitzhardinge
  2009-05-28  3:47               ` Dan Magenheimer
                                 ` (2 subsequent siblings)
  3 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-28  0:49 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dan Magenheimer, Avi Kivity, George Dunlap, Xen-devel,
	the arch/x86 maintainers, Linux Kernel Mailing List, Keir Fraser,
	Linus Torvalds

Ingo Molnar wrote:
> I also find it pretty telling that you cut out the most important 
> point of Avi's reply:
>
>   
>>> I think the Xen design has merit if it can truly make dom0 a 
>>> guest -- that is, if it can survive dom0 failure.  Until then, 
>>> you're just taking a large interdependent codebase and splitting 
>>> it at some random point, but you don't get any stability or 
>>> security in return.
>>>       
>
> that crucial question really has to be answered honestly and 
> upfront.

Xen, the hypervisor itself, doesn't require any services from dom0. From 
its perspective, dom0 is just another guest domain, though with enough 
privileges to access hardware.  Dom0's job is to provide device access 
to other less privileged domains.

There is currently some system-wide information which is stored in a 
usermode daemon in dom0. Recovering from its loss is hard, but there is 
a prototype to pull that daemon out into its own special-purpose 
domain.  At that point, dom0 can reboot without affecting any of the 
other domains or Xen itself.

If dom0 goes away, the other domains will get a disconnect and 
temporarily lose access to their devices, but they can cope with that.  
From their perspective, it would look like they'd just been 
save/restored or migrated to another machine.  When dom0 comes back, 
they'll reconnect and carry on.

The disaggregation of dom0's functions is something that the Xen 
development community is actively pursuing.

    J

^ permalink raw reply	[flat|nested] 104+ messages in thread

* RE: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-28  0:13             ` Ingo Molnar
  2009-05-28  0:49               ` Jeremy Fitzhardinge
@ 2009-05-28  3:47               ` Dan Magenheimer
  2009-05-28 14:26               ` George Dunlap
  2009-05-29  0:45               ` Xen is a feature Jeremy Fitzhardinge
  3 siblings, 0 replies; 104+ messages in thread
From: Dan Magenheimer @ 2009-05-28  3:47 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Avi Kivity, George Dunlap, Jeremy Fitzhardinge, Xen-devel,
	the arch/x86 maintainers, Linux Kernel Mailing List, Keir Fraser,
	Linus Torvalds

> * Dan Magenheimer <dan.magenheimer@oracle.com> wrote:
> 
> > > The Linux scheduler already supports multiple scheduling 
> > > classes.  If we find that none of them will fit our needs, we'll 
> > > propose a new one.  When the need can be demonstrated to be 
> > > real, and the implementation can be clean, Linux can usually be 
> > > adapted.
> > 
> > But that's exactly George and Jeremy's point.  KVM will eventually 
> > require changes that clutter Linux for purposes that are relevant 
> > only to a hypervisor.
> 
> That's wrong. Any such scheduler classes would also help: control 
> groups, containers, vserver, UML and who knows what other isolation 
> project. Many such mechanisms are already implemented as well.

I think you are missing the point.  Yes, certainly, generic
scheduler code can be written that applies to all of these
uses.  But will that be the same code that is best for KVM to
succeed in an enterprise-class virtual data center?
I agree with George that it will not; generic code and optimal
code are rarely the same thing.  What's best for an operating
system is not always what's best for a hypervisor.

But we are both speculating.  I guess only time will tell.

> I also find it pretty telling that you cut out the most important 
> point of Avi's reply:
> 
> > > I think the Xen design has merit if it can truly make dom0 a 
> > > guest -- that is, if it can survive dom0 failure.  Until then, 
> > > you're just taking a large interdependent codebase and splitting 
> > > it at some random point, but you don't get any stability or 
> > > security in return.
> 
> that crucial question really has to be answered honestly and 
> upfront.

I cut it out because I thought others would be more qualified
to answer, but since nobody else has, I will.  Absolutely there
is work going on to survive failure of dom0 (or any domain)!
This is a must for enterprise-grade availability and security,
such as is needed for huge corporate data centers and "clouds".
However, the majority of users (individuals and small businesses)
will probably be most happy with their distro (and distro kernel)
as dom0 since it is convenient and familiar.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
  2009-05-28  0:13             ` Ingo Molnar
  2009-05-28  0:49               ` Jeremy Fitzhardinge
  2009-05-28  3:47               ` Dan Magenheimer
@ 2009-05-28 14:26               ` George Dunlap
  2009-05-29  0:45               ` Xen is a feature Jeremy Fitzhardinge
  3 siblings, 0 replies; 104+ messages in thread
From: George Dunlap @ 2009-05-28 14:26 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dan Magenheimer, Jeremy Fitzhardinge, Xen-devel,
	the arch/x86 maintainers, Linux Kernel Mailing List, Avi Kivity,
	Linus Torvalds, Keir Fraser

On Thu, May 28, 2009 at 1:13 AM, Ingo Molnar <mingo@elte.hu> wrote:
>> > I think the Xen design has merit if it can truly make dom0 a
>> > guest -- that is, if it can survive dom0 failure.  Until then,
>> > you're just taking a large interdependent codebase and splitting
>> > it at some random point, but you don't get any stability or
>> > security in return.

Let me turn this around: are you (Ingo) saying that if a Xen system
could successfully survive a dom0 failure, then you would consider
that a valid reason for this design choice, and would be willing to
support and pursue changes required to allow mainline linux to run as
dom0?  If not then this line of discussion is just a distraction.

I personally think the strongest argument for an interdependent
codebase is the ability to have a separate piece of software as a
dedicated hypervisor. I also think Xen provides extra security and
stability as it is right now.  The code is much smaller and simpler
than the kernel.  The number of hypercalls is smaller than the number
of system calls, and the complexity of hypercalls is much lower than
the complexity of system calls in general.  Driver domains, in which a
driver runs in a domain other than dom0 and can fail and reboot, have
been supported in Xen for years.  The ability to survive dom0 failure
is just an added benefit.

As Dan and Jeremy said, the Xen community is actively pursuing the
changes required to allow dom0 to panic / reboot without requiring a
reboot of Xen and other guests.  I'm sure if that would make members
of the linux community actively support inclusion of dom0 support, we
could make that work a priority.

 -George

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Xen is a feature
  2009-05-28  0:13             ` Ingo Molnar
                                 ` (2 preceding siblings ...)
  2009-05-28 14:26               ` George Dunlap
@ 2009-05-29  0:45               ` Jeremy Fitzhardinge
  2009-05-29  1:27                 ` Greg KH
                                   ` (2 more replies)
  3 siblings, 3 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-29  0:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dan Magenheimer, Avi Kivity, George Dunlap, Xen-devel,
	the arch/x86 maintainers, Linux Kernel Mailing List, Keir Fraser,
	Linus Torvalds, Greg KH, Kurt C. Hackel, Ian Pratt, xen-users,
	Ky Srinivasan, Eric Anderson, Wim Coekaerts, Stephen Spector,
	Jens Axboe, Nick Piggin

Ingo Molnar wrote:
> Xen changes - especially dom0 - are overwhelmingly not about 
> improving Linux, but about having some special hook and extra 
> treatment in random places - and that's really bad.
>   

You've made this argument a few times now, and I take exception to it.

It seems to be predicated on the idea that Xen has some kind of niche 
usage, with barely more users than Voyager.  Or that it is a parasite 
sitting on the side of Linux, being a pure drain.

Neither is true.  Xen is very widely used.  There are at least 500k 
servers running Xen in commercial user sites (and untold numbers of 
smaller sites and personal users), running millions of virtual guest 
domains.  If you browse the net at all widely, you're likely to be using 
a Xen-based server; all of Amazon runs on Xen, for example.  Mozilla and 
Debian are hosted on Xen systems.

Hardware vendors like Dell and HP are shipping servers with Xen built 
into the firmware, and increasingly, desktops and laptops.  Many laptop 
"instant-on/instant-access" features are based on a combination of Xen 
and Linux.

All major Linux distributions support running as a Xen guest, and many 
support running as a Xen host.

For these users, Xen support is an active feature of Linux; Linux 
without Xen support would be much less useful to them, and better Xen 
support would be more useful.  For them, Xen support is no different 
from any other kind of platform support.  They are being actively 
hampered by the fact that the only dom0 support is available in the form 
of either ancient or very patched kernels. 

To them, improved Xen support *is* "improving Linux".

Your view appears to be that virtualization is either useless, or a neat 
trick useful for doing a quick kernel test (which is why kvm got early 
traction in this community; it is well suited to this use-case).  But 
that is a very parochial kernel-dev view.  For many users, 
virtualization (in general, but commonly on Xen) has become an 
absolutely essential part of their computing infrastructure, and they 
would no more go without it than they would go without ethernet.

We're taking your technical critiques very seriously, of course, and I 
appreciate any constructive comment.  But your baseline position of 
animosity towards Xen is unreasonable, unfair and unnecessary.

    J

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-05-29  0:45               ` Xen is a feature Jeremy Fitzhardinge
@ 2009-05-29  1:27                 ` Greg KH
  2009-05-29  4:05                 ` David Miller
  2009-05-30  2:19                 ` [Xen-devel] " Andy Burns
  2 siblings, 0 replies; 104+ messages in thread
From: Greg KH @ 2009-05-29  1:27 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Ingo Molnar, Dan Magenheimer, Avi Kivity, George Dunlap,
	Xen-devel, the arch/x86 maintainers, Linux Kernel Mailing List,
	Keir Fraser, Linus Torvalds, Kurt C. Hackel, Ian Pratt,
	xen-users, Ky Srinivasan, Eric Anderson, Wim Coekaerts,
	Stephen Spector, Jens Axboe, Nick Piggin

On Thu, May 28, 2009 at 05:45:34PM -0700, Jeremy Fitzhardinge wrote:
> Mozilla and Debian are hosted on Xen systems.

A tiny data point about these domains.  They are hosted by osuosl.org,
which uses xen systems running with the current dom0 patch set.  Because
those patches are out-of-tree, they have a hard time updating kernel
versions, and generally lag kernel.org releases by a lot, which is not
always a good thing.

So getting the dom0 patches into mainline will make their lives much
easier, and more secure.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-05-29  0:45               ` Xen is a feature Jeremy Fitzhardinge
  2009-05-29  1:27                 ` Greg KH
@ 2009-05-29  4:05                 ` David Miller
  2009-05-29  6:37                   ` Jaswinder Singh Rajput
  2009-05-29 12:01                   ` George Dunlap
  2009-05-30  2:19                 ` [Xen-devel] " Andy Burns
  2 siblings, 2 replies; 104+ messages in thread
From: David Miller @ 2009-05-29  4:05 UTC (permalink / raw)
  To: jeremy
  Cc: mingo, dan.magenheimer, avi, George.Dunlap, xen-devel, x86,
	linux-kernel, keir.fraser, torvalds, gregkh, kurt.hackel,
	Ian.Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	stephen.spector, jens.axboe, npiggin

From: Jeremy Fitzhardinge <jeremy@goop.org>
Date: Thu, 28 May 2009 17:45:34 -0700

> Ingo Molnar wrote:
>> Xen changes - especially dom0 - are overwhelmingly not about improving
>> Linux, but about having some special hook and extra treatment in
>> random places - and that's really bad.
>>   
> 
> You've made this argument a few times now, and I take exception to it.
> 
> It seems to be predicated on the idea that Xen has some kind of niche
> usage, with barely more users than Voyager.  Or that it is a parasite
> sitting on the side of Linux, being a pure drain.

I don't see Ingo's comments, whether I agree with them or not, as
an implication of Xen being niche.  Rather I see his comments as
an opposition to how Xen is implemented.

> We're taking your technical critiques very seriously, of course, and I
> appreciate any constructive comment.  But your baseline position of
> animosity towards Xen is unreasonable, unfair and unnecessary.

I don't see any animosity at all in what Ingo has said.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-05-29  4:05                 ` David Miller
@ 2009-05-29  6:37                   ` Jaswinder Singh Rajput
  2009-05-29  6:51                     ` David Miller
  2009-05-29 12:01                   ` George Dunlap
  1 sibling, 1 reply; 104+ messages in thread
From: Jaswinder Singh Rajput @ 2009-05-29  6:37 UTC (permalink / raw)
  To: David Miller
  Cc: jeremy, mingo, dan.magenheimer, avi, George.Dunlap, xen-devel,
	x86, linux-kernel, keir.fraser, torvalds, gregkh, kurt.hackel,
	Ian.Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	stephen.spector, jens.axboe, npiggin

Hi Dave,

On Thu, 2009-05-28 at 21:05 -0700, David Miller wrote:
> From: Jeremy Fitzhardinge <jeremy@goop.org>
> Date: Thu, 28 May 2009 17:45:34 -0700
> 
> > Ingo Molnar wrote:
> >> Xen changes - especially dom0 - are overwhelmingly not about improving
> >> Linux, but about having some special hook and extra treatment in
> >> random places - and that's really bad.
> >>   
> > 
> > You've made this argument a few times now, and I take exception to it.
> > 
> > It seems to be predicated on the idea that Xen has some kind of niche
> > usage, with barely more users than Voyager.  Or that it is a parasite
> > sitting on the side of Linux, being a pure drain.
> 
> I don't see Ingo's comments, whether I agree with them or not, as
> an implication of Xen being niche.  Rather I see his comments as
> an opposition to how Xen is implemented.
> 

You can see Ingo's comments and whole thread under subject :

Re: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)

http://lkml.org/lkml/2009/5/27/758

--
JSR


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-05-29  6:37                   ` Jaswinder Singh Rajput
@ 2009-05-29  6:51                     ` David Miller
  0 siblings, 0 replies; 104+ messages in thread
From: David Miller @ 2009-05-29  6:51 UTC (permalink / raw)
  To: jaswinder
  Cc: jeremy, mingo, dan.magenheimer, avi, George.Dunlap, xen-devel,
	x86, linux-kernel, keir.fraser, torvalds, gregkh, kurt.hackel,
	Ian.Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	stephen.spector, jens.axboe, npiggin

From: Jaswinder Singh Rajput <jaswinder@kernel.org>
Date: Fri, 29 May 2009 12:07:32 +0530

> Hi Dave,
> 
> On Thu, 2009-05-28 at 21:05 -0700, David Miller wrote:
>> From: Jeremy Fitzhardinge <jeremy@goop.org>
>> Date: Thu, 28 May 2009 17:45:34 -0700
>> 
>> > Ingo Molnar wrote:
>> >> Xen changes - especially dom0 - are overwhelmingly not about improving
>> >> Linux, but about having some special hook and extra treatment in
>> >> random places - and that's really bad.
>> >>   
>> > 
>> > You've made this argument a few times now, and I take exception to it.
>> > 
>> > It seems to be predicated on the idea that Xen has some kind of niche
>> > usage, with barely more users than Voyager.  Or that it is a parasite
>> > sitting on the side of Linux, being a pure drain.
>> 
>> I don't see Ingo's comments, whether I agree with them or not, as
>> an implication of Xen being niche.  Rather I see his comments as
>> an opposition to how Xen is implemented.
>> 
> 
> You can see Ingo's comments and whole thread under subject :
> 
> Re: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
> 
> http://lkml.org/lkml/2009/5/27/758

Jeremy is specifically commenting on Ingo's quoted "argument".
And that "argument" is what he takes "exception to".

And that's the scope of what I'm commenting on too.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-05-29  4:05                 ` David Miller
  2009-05-29  6:37                   ` Jaswinder Singh Rajput
@ 2009-05-29 12:01                   ` George Dunlap
  2009-05-29 14:14                     ` Pasi Kärkkäinen
                                       ` (3 more replies)
  1 sibling, 4 replies; 104+ messages in thread
From: George Dunlap @ 2009-05-29 12:01 UTC (permalink / raw)
  To: David Miller
  Cc: jeremy, mingo, Dan Magenheimer, avi, xen-devel, x86,
	linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

David Miller wrote:
> I don't see Ingo's comments, whether I agree with them or not, as
> an implication of Xen being niche.  Rather I see his comments as
> an opposition to how Xen is implemented.
>   
It's in his definition of "improving Linux".  Jeremy is saying that 
allowing Linux to run as dom0 *is* improving Linux.  The lack of dom0 
support is at this moment making life more difficult for a huge number 
of Linux users who use Xen, including Mozilla, Debian, and Amazon.    
Adding dom0 support would make Linux even more useful to a wide variety 
of people not using Xen at the moment. 

Saying that dom0 support is "not about improving Linux" completely 
ignores the cost people are paying right now, and the benefits people 
could have.  That (if I understand him) is what Jeremy meant by saying it 
was treating it as if it was some kind of "niche usage, with barely more 
users than Voyager", and "being a pure drain".
> I don't see any animosity at all in what Ingo has said.
>   
The last few paragraphs of the e-mail weren't about that particular 
argument, but about the sum of the interaction with Ingo over dom0 
support for the last 6 months.  If you read the various threads, it's 
pretty clear that Ingo is resistant to accepting dom0 changes, for 
whatever reason, and has been looking for reasons not to include it. 

If we take him at his word, that the root issue is that he fundamentally 
dislikes the design choice of running Linux-as-hypervisor-component, 
then we have a difference of opinion and we're just going to have to 
agree to disagree.  But there are reasons to include it anyway, 
including benefits to existing Xen users and potential Xen users (who 
have decided not to use KVM for whatever reason), and the idea of 
survival-of-the-fittest: Xen and KVM have made different design choices, 
let's let them both grow and see which one thrives.  If KVM's design is 
unilaterally superior, eventually Xen will die off.  But I suspect that 
there's significant demand in the OSS virtualization ecology for both 
approaches, and the world will be the worse for dom0 support being 
out-of-tree.

In any case, making unreasonable or inconsistent technical objections, 
when the root issue is actually something else, is a waste of time 
and energy for everyone involved.

 -George

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-05-29 12:01                   ` George Dunlap
@ 2009-05-29 14:14                     ` Pasi Kärkkäinen
  2009-05-29 21:29                       ` David Miller
       [not found]                     ` <87tz33ep1b.fsf@basil.nowhere.org>
                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 104+ messages in thread
From: Pasi Kärkkäinen @ 2009-05-29 14:14 UTC (permalink / raw)
  To: George Dunlap
  Cc: David Miller, jeremy, mingo, Dan Magenheimer, avi, xen-devel,
	x86, linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

On Fri, May 29, 2009 at 01:01:18PM +0100, George Dunlap wrote:
> David Miller wrote:
> >I don't see Ingo's comments, whether I agree with them or not, as
> >an implication of Xen being niche.  Rather I see his comments as
> >an opposition to how Xen is implemented.
> >  
> It's in his definition of "improving Linux".  Jeremy is saying that 
> allowing Linux to run as dom0 *is* improving Linux.  The lack of dom0 
> support is at this moment making life more difficult for a huge number 
> of Linux users who use Xen, including Mozilla, Debian, and Amazon.    
> Adding dom0 support would make Linux even more useful to a wide variety 
> of people not using Xen at the moment. 
> 

As already stated earlier, there is a huge amount of Xen in use all around
the globe for server/datacenter virtualization. Personally I know many Xen 
installations in production, but not a single KVM installation (I'm sure those 
exist as well, but personally I haven't seen those).

At the moment it's pretty painful for the distro developers to ship dom0
enabled kernels (most of the distros do ship or are waiting for upstream
dom0 enabled kernel), and also for many advanced users who build their custom Xen
based solutions.

The current situation is not good for anyone. We really need Xen dom0
support in mainline Linux.

Just my 2 eurocents.

-- Pasi

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-05-29 14:14                     ` Pasi Kärkkäinen
@ 2009-05-29 21:29                       ` David Miller
  0 siblings, 0 replies; 104+ messages in thread
From: David Miller @ 2009-05-29 21:29 UTC (permalink / raw)
  To: pasik
  Cc: george.dunlap, jeremy, mingo, dan.magenheimer, avi, xen-devel,
	x86, linux-kernel, Keir.Fraser, torvalds, gregkh, kurt.hackel,
	Ian.Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	stephen.spector, jens.axboe, npiggin

From: Pasi Kärkkäinen <pasik@iki.fi>
Date: Fri, 29 May 2009 17:14:39 +0300

> We really need Xen dom0 support in mainline Linux.

Whether we want a feature is separate from making sure its
implementation is up to snuff and doesn't suck.

But the concentration of the talk seems to be on wanting the feature,
and that's only half the story.

I'm getting sick of hearing over and over how many people use Xen,
that point has been made succinctly so let's move on, ok?


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-devel] Re: Xen is a feature
       [not found]                     ` <87tz33ep1b.fsf@basil.nowhere.org>
@ 2009-05-29 21:31                       ` Jeremy Fitzhardinge
  2009-05-29 23:09                       ` Nakajima, Jun
  1 sibling, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-29 21:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: George Dunlap, xen-devel, Keir Fraser, x86, linux-kernel

Andi Kleen wrote:
> George Dunlap <george.dunlap@eu.citrix.com> writes:
>
> cc list from hell trimmed. 
>
>   
>> allowing Linux to run as dom0 *is* improving Linux.  The lack of dom0
>> support is at this moment making life more difficult for a huge number
>> of Linux users who use Xen, including Mozilla, Debian, and Amazon.
>> Adding dom0 support would make Linux even more useful to a wide
>> variety of people not using Xen at the moment.
>>     
>
> Perhaps one way to address this problem would be to make the Dom0
> interface less intrusive for the host OS?
>   

I'm certainly not deaf to criticism along those lines, and I'm looking 
at ways of cleaning up/decoupling those interactions.

But my frustration arises from the fact that there's been a total stall 
on merging any of the pieces, even the ones which are either 
uncontroversial, or purely xen-internal changes.

    J

^ permalink raw reply	[flat|nested] 104+ messages in thread

* RE: [Xen-devel] Re: Xen is a feature
       [not found]                     ` <87tz33ep1b.fsf@basil.nowhere.org>
  2009-05-29 21:31                       ` [Xen-devel] " Jeremy Fitzhardinge
@ 2009-05-29 23:09                       ` Nakajima, Jun
  2009-05-29 23:26                         ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 104+ messages in thread
From: Nakajima, Jun @ 2009-05-29 23:09 UTC (permalink / raw)
  To: Andi Kleen, George Dunlap
  Cc: jeremy, xen-devel, Keir Fraser, x86, linux-kernel

On 5/29/2009 11:34:40 AM, Andi Kleen wrote:
> George Dunlap <george.dunlap@eu.citrix.com> writes:
>
> cc list from hell trimmed.
>
> > allowing Linux to run as dom0 *is* improving Linux.  The lack of
> > dom0 support is at this moment making life more difficult for a huge
> > number of Linux users who use Xen, including Mozilla, Debian, and Amazon.
> > Adding dom0 support would make Linux even more useful to a wide
> > variety of people not using Xen at the moment.
>
> Perhaps one way to address this problem would be to make the Dom0
> interface less intrusive for the host OS?
>
> My impression last time I looked was that there was huge potential
> of improvement in this area. For example the PAT issue recently
> discussed was completely unnecessary.  Or if you added a "VT/SVM only"
> Dom0 mode I'm sure the interface would be significantly cleaner too.
> If you can come up with a slim clean interface the chances for actual
> integration would be likely much higher.

I think we still need some (or all?) of additional dom0 PV ops even for HVM (Hardware-based VM) dom0. Hardware-based virtualization can significantly clean up the CPU-related PV ops (including some for local APIC), but they have nothing to do with dom0.

Some hooks in the host could be removed by reusing the HVM-specific code with modifications to the virtualization logic, but I think people need to tell which specific ones are intrusive, to be fair.

             .
Jun Nakajima | Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-devel] Re: Xen is a feature
  2009-05-29 23:09                       ` Nakajima, Jun
@ 2009-05-29 23:26                         ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 104+ messages in thread
From: Jeremy Fitzhardinge @ 2009-05-29 23:26 UTC (permalink / raw)
  To: Nakajima, Jun
  Cc: Andi Kleen, George Dunlap, x86, xen-devel, Keir Fraser,
	linux-kernel, Ingo Molnar, Jiang, Yunhong

Nakajima, Jun wrote:
> I think we still need some (or all?) of additional dom0 PV ops even for HVM (Hardware-based VM) dom0. Hardware-based virtualization can significantly clean up the CPU-related PV ops (including some for local APIC), but they have nothing to do with dom0.
>
> Some hooks in the host could be removed by reusing the HVM-specific code with modifications to the virtualization logic, but I think people need to tell which specific ones are intrusive, to be fair.
>   

I think two things will significantly clean up the dom0 apic patches:

    One is to adjust the LAPIC and IOAPIC probing code so that it
    behaves correctly if the APIC cpuid flag is clear.  That would
    remove a lot of the init-time ad-hoc Xen changes I made.

    The other is to implement Ingo's suggestion of a proper ioapic
    driver layer.  I think that would not only resolve the low-level
    IO-APIC register access issue, but probably clean up a lot of the
    vector allocation/handling, and make a clear path for MSI support. 
    With luck it will also clean up things like x2apic support

I'm planning on putting some time into investigating these next week.

Once we've nailed down the details of how to make PAT work for PV guests 
on the Xen side, we should be able to implement that fairly easily in 
Linux with no core x86 changes.

I really don't think emulating MTRR register writes is the right way to 
implement Xen MTRR support, given that a much more semantically 
appropriate interface already exists, but we can do that if nothing else 
gets merged.

IanC is restructuring the swiotlb changes in a way that I hope will be 
acceptable to all.

At that point, I think we really will have resolved all the high-level 
concerns expressed about the overall architecture of the patches, and 
maybe we can finally see some progress.

    J

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-devel] Xen is a feature
  2009-05-29  0:45               ` Xen is a feature Jeremy Fitzhardinge
  2009-05-29  1:27                 ` Greg KH
  2009-05-29  4:05                 ` David Miller
@ 2009-05-30  2:19                 ` Andy Burns
  2 siblings, 0 replies; 104+ messages in thread
From: Andy Burns @ 2009-05-30  2:19 UTC (permalink / raw)
  To: Jeremy Fitzhardinge
  Cc: Ingo Molnar, Nick Piggin, Dan Magenheimer, Xen-devel,
	Wim Coekaerts, Ian Pratt, Stephen Spector, George Dunlap,
	Kurt C. Hackel, the arch/x86 maintainers,
	Linux Kernel Mailing List, xen-users, Avi Kivity, Eric Anderson,
	Jens Axboe, Ky Srinivasan, Linus Torvalds, Greg KH, Keir Fraser

2009/5/29 Jeremy Fitzhardinge <jeremy@goop.org>:

> Ingo Molnar wrote:
>>
>> Xen changes - especially dom0 - are overwhelmingly not about improving
>> Linux, but about having some special hook and extra treatment in random
>> places - and that's really bad.
>>
>
> You've made this argument a few times now, and I take exception to it.
>
> There are at least 500k servers
> running Xen in commercial user sites (and untold numbers of smaller sites
> and personal users), running millions of virtual guest domains.
> To them, improved Xen support *is* "improving Linux".

Well said. I use xen both personally and in my business as a dozen or
so of those unseen millions of domUs, I've bitten my tongue for months
while watching xen developers jump through the hoops in order to get
pv_ops dom0 into the mainstream, only to be knocked back or left until
the next merge window and the next and the next.

Sure there were "the bad old days" of xen's history, but having been
asked to go the pv_ops route, I feel it is not just failing to
improve linux by keeping dom0 out of mainstream, but actually hurting
users and trapping them on ancient kernels which are missing newer
hardware support.

Sure, I wouldn't like to see any old rubbish merged into the kernel,
but I'm amazed at Jeremy's patience over this.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-05-29 12:01                   ` George Dunlap
  2009-05-29 14:14                     ` Pasi Kärkkäinen
       [not found]                     ` <87tz33ep1b.fsf@basil.nowhere.org>
@ 2009-06-02 15:23                     ` Thomas Gleixner
  2009-06-02 16:41                       ` George Dunlap
  2009-06-03 19:49                       ` Bill Davidsen
  2009-06-02 22:40                     ` Steven Rostedt
  3 siblings, 2 replies; 104+ messages in thread
From: Thomas Gleixner @ 2009-06-02 15:23 UTC (permalink / raw)
  To: George Dunlap
  Cc: David Miller, jeremy, mingo, Dan Magenheimer, avi, xen-devel,
	x86, linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

On Fri, 29 May 2009, George Dunlap wrote:
> David Miller wrote:
> > I don't see Ingo's comments, whether I agree with them or not, as
> > an implication of Xen being niche.  Rather I see his comments as
> > an opposition to how Xen is implemented.
> >   
> It's in his definition of "improving Linux".  Jeremy is saying that allowing
> Linux to run as dom0 *is* improving Linux.  The lack of dom0 support is at
> this moment making life more difficult for a huge number of Linux users who

Exactly that's the point. Adding dom0 makes life easier for a group of
users who decided to use Xen some time ago, but what Ingo wants is
technical improvement of the kernel.

There are many features which have been widely used in the distro
world where developers tried to push support into the kernel with the
same line of arguments.

The kernel policy always was and still is to accept only those
features which have a technical benefit to the code base.

I'm just picking a few examples:

Aside of the paravirt, which seems to expand through arch/x86 like a
hydra, the new patches sprinkle "if (xen_...)" all over the
place. These extra xen dependencies are no improvement, they are a
royal pain in the ... They are sticky once they got merged simply
because the hypervisor relies on them and we need to provide
compatibility for a long time.

Aside of that it grows interfaces like pat_disable() just because the
CPU model of Xen is obviously not able to kill the PAT flags in the
CPUid emulation. Why for heaven's sake do we have a cpuid paravirt op
when we need to disable stuff separately which can be disabled by
paravirt functionality already? I don't see this as an improvement
either, it's simple sloppy hackery.
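
To illustrate the cpuid point, here is a minimal sketch, not kernel
code. It assumes a hypothetical cpuid hook (guest_cpuid) wrapped
around a stand-in for the native instruction (fake_native_cpuid); all
names are invented for illustration. The only hardware fact relied on
is that PAT is reported in CPUID leaf 1, EDX bit 16. If the hook
clears that bit, generic feature detection sees "no PAT" and no
separate disable call is needed:

```c
#include <stdint.h>

#define CPUID_LEAF_FEATURES 1u
#define CPUID_EDX_MTRR      (1u << 12)
#define CPUID_EDX_PAT       (1u << 16)

struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Stand-in for the native CPUID instruction (hypothetical): reports
 * MTRR and PAT in leaf 1 and nothing else. */
void fake_native_cpuid(uint32_t leaf, struct cpuid_regs *r)
{
	r->eax = r->ebx = r->ecx = 0;
	r->edx = (leaf == CPUID_LEAF_FEATURES)
		? (CPUID_EDX_MTRR | CPUID_EDX_PAT) : 0;
}

/* Hypothetical paravirt-style hook: forward to the native query, then
 * hide the features the platform cannot support. */
void guest_cpuid(uint32_t leaf, struct cpuid_regs *r)
{
	fake_native_cpuid(leaf, r);
	if (leaf == CPUID_LEAF_FEATURES)
		r->edx &= ~CPUID_EDX_PAT;
}

/* Generic detection code: it only asks the cpuid callback and never
 * needs to know which platform it runs on. */
int cpu_has_pat(void (*cpuid)(uint32_t, struct cpuid_regs *))
{
	struct cpuid_regs r;

	cpuid(CPUID_LEAF_FEATURES, &r);
	return (r.edx & CPUID_EDX_PAT) != 0;
}
```

With the native callback cpu_has_pat() reports the feature; with the
hooked callback it does not, while the MTRR bit is left untouched.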

The changelogs of the patches are partially confusing as hell:

commit 7d2b03ff4ae27b7c9e99a421a5b965f20e4bfaab

    x86: fix up flush_tlb_all
    
    - initialize the locks before the first use
    - make sure preemption is disabled
    
    [ Impact: Bug fixes: boot time warning, and crash ]

This patch is in the Xen queue and I assume it's XEN related as we
have not seen anywhere a boot time warning and crash with the current
code AFAICT, but the changelog reads like this is some generic BUG in
the SMP boot code. There is neither a hint to Xen nor to another patch
which caused that problem. While the patch itself is harmless I do not
see what is improved and why the change was necessary in the first
place.

That's what maintainers have to look at and not who is using the code
already and wants to see it merged.

> use Xen, including Mozilla, Debian, and Amazon. Adding dom0 support would
> make Linux even more useful to a wide variety of people not using Xen at the
> moment. 

I really have a hard time to see why dom0 support makes Linux more
useful to people who do not use it. It does not improve the Linux
experience of Joe User at all.

In fact it could be harmful to the average user, if it's merged in a
crappy way that increases overhead, has a performance cost and draws
away development and maintenance resources from other areas of the
kernel.

Aside of that it can also hinder the development of a properly
designed hypervisor in Linux: 'why bother with that new stuff, it
might be cleaner and nicer, but we have this Xen dom0 stuff
already?'.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-02 15:23                     ` Thomas Gleixner
@ 2009-06-02 16:41                       ` George Dunlap
  2009-06-02 17:28                         ` Chris Friesen
                                           ` (2 more replies)
  2009-06-03 19:49                       ` Bill Davidsen
  1 sibling, 3 replies; 104+ messages in thread
From: George Dunlap @ 2009-06-02 16:41 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: David Miller, jeremy, mingo, Dan Magenheimer, avi, xen-devel,
	x86, linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

Thomas Gleixner wrote:
> Exactly that's the point. Adding dom0 makes life easier for a group of
> users who decided to use Xen some time ago, but what Ingo wants is
> technical improvement of the kernel.
>
> There are many features which have been widely used in the distro
> world where developers tried to push support into the kernel with the
> same line of arguments.
>
> The kernel policy always was and still is to accept only those
> features which have a technical benefit to the code base.
>   
I can appreciate the idea of resisting the pushing of random features.  
Still, your definition of "improving Linux" is still lacking.  Obviously 
a new scheduler is taking something that's existing and improving it.  
But adding a new filesystem, a new driver, or adding a new feature, such 
as notifications, AIO, a new hardware architecture, or even KVM: How do 
those classify as "technical improvement to the kernel" or "features 
which have technical benefit to the code base" in a way that Xen does not?

If you mean "increases Linux's technical capability", and define Xen as 
outside of Linux, then I think the definition is too small.  After all, 
allowing Linux to run on an ARM processor isn't increasing Linux' 
technical capability, it's just allowing a new group of people (people 
with ARM chips) to use Linux.  It's the same with Xen.

No one disputes the idea that changes shouldn't be ugly; no one disputes 
the idea that changes shouldn't introduce performance regressions.  But 
there are patchqueues that are ready, signed-off by other maintainers, 
and which Ingo admits that he has no technical objections to, but 
refuses to merge. 

(His most recent "objection" is that he claims the currently existing 
pv_ops infrastructure (which KVM and others benefit from as well as Xen) 
introduces almost a 1% overhead on native in an mm-heavy 
microbenchmark.  So he refuses to merge feature Y (dom0 support) until 
the Xen community helps technically unrelated existing feature X 
(pv_ops) meet some criteria.  So it has nothing to do with the quality 
of the patches themselves.)

[Not qualified to speak to the specific technical objections.]
> I really have a hard time to see why dom0 support makes Linux more
> useful to people who do not use it. It does not improve the Linux
> experience of Joe User at all.
>   
If Joe User uses Amazon, he benefits.  If Joe User downloads an Ubuntu 
or Debian distro, and the hosting providers were more secure and had to 
do less work because dom0 was inlined, then he benefits because of the 
lower cost / resources freed to do other things.

But what I was actually talking about is the number of people who don't 
use it now but would use it if it were merged in.  There are hundreds of 
thousands of instances running now, and more people are choosing to use 
it at the moment, even though those who use it have the devil's choice 
between doing patching or using a 3-year old kernel.  How many more 
would use it if it were in mainline?
> In fact it could be harmful to the average user, if it's merged in a
> crappy way that increases overhead, has a performance cost and draws
> away development and maintenance resources from other areas of the
> kernel.
>   
No one is asking for something to be merged in a crappy way, or with 
unacceptable performance cost.  There are a number of patchqueues that 
Ingo has no technical objections to, but which he still refuses to merge.

"Drawing away development and maintenance resources" is a cost/benefits 
question, and Jeremy's main point was that there is a *high* benefit for 
dom0 being merged into mainline.  The same could be said of almost 
anything: are you suggesting not accepting any more KVM code because it 
might "draw away development and maintenance resources from other areas 
of the kernel"?
> Aside of that it can also hinder the development of a properly
> designed hypervisor in Linux: 'why bother with that new stuff, it
> might be cleaner and nicer, but we have this Xen dom0 stuff
> already?'.
>   
This argument doesn't make any sense.  Would you advocate only having 
one filesystem for fear that people would somehow be discouraged from 
working on a new filesystem?

Even if that were a valid argument, it wouldn't apply in this situation. 
KVM has plenty of mind-share, and the support of RedHat.  Also, I'd 
wager that it's a lot easier for a Linux kernel developer to get 
involved in KVM than in Xen, because they're already familiar with 
Linux.  I don't think anyone working on KVM will be tempted to give up 
just because Xen is also available, unless it becomes clear that 
linux-as-hypervisor isn't the best technical solution; in which case, 
moving to Xen would be the right thing to do anyway.  Merging dom0 Xen 
will in no way interfere with the development of KVM or other 
linux-as-hypervisor projects.

The main point of Jeremy's e-mail was NOT to say, "Lots of people use 
this so you should merge it."  He was responding to Xen being treated 
like it had no benefit.  It does have a benefit; it is a feature.

 -George

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-02 16:41                       ` George Dunlap
@ 2009-06-02 17:28                         ` Chris Friesen
  2009-06-02 17:46                         ` Linus Torvalds
  2009-06-02 18:59                         ` Thomas Gleixner
  2 siblings, 0 replies; 104+ messages in thread
From: Chris Friesen @ 2009-06-02 17:28 UTC (permalink / raw)
  To: George Dunlap
  Cc: Thomas Gleixner, David Miller, jeremy, mingo, Dan Magenheimer,
	avi, xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe, npiggin

George Dunlap wrote:
> Thomas Gleixner wrote:

> No one disputes the idea that changes shouldn't be ugly; no one disputes 
> the idea that changes shouldn't introduce performance regressions.  But 
> there are patchqueues that are ready, signed-off by other maintainers, 
> and which Ingo admits that he has no technical objections to, but 
> refuses to merge.

I can't comment on this part, but if so that seems unfortunate.

> The main point of Jeremy's e-mail was NOT to say, "Lots of people use 
> this so you should merge it."  He was responding to Xen being treated 
> like it had no benefit.  It does have a benefit; it is a feature.

I don't know about others, but I certainly interpreted a number of posts
saying exactly that--that it's useful so it should be included.

I don't think anyone is arguing that Xen is not useful or that it should
not ever be included, rather the question is whether the current set of
patches is suitable for addition or whether they are too messy and
should be cleaned up first.

Chris

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-02 16:41                       ` George Dunlap
  2009-06-02 17:28                         ` Chris Friesen
@ 2009-06-02 17:46                         ` Linus Torvalds
  2009-06-02 18:02                           ` Linus Torvalds
  2009-06-04 14:02                           ` [Xen-users] " Thomas Goirand
  2009-06-02 18:59                         ` Thomas Gleixner
  2 siblings, 2 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-06-02 17:46 UTC (permalink / raw)
  To: George Dunlap
  Cc: Thomas Gleixner, David Miller, jeremy, mingo, Dan Magenheimer,
	avi, xen-devel, x86, linux-kernel, Keir Fraser, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe, npiggin



On Tue, 2 Jun 2009, George Dunlap wrote:
>
> idea that changes shouldn't introduce performance regressions.  But there are
> patchqueues that are ready, signed-off by other maintainers, and which Ingo
> admits that he has no technical objections to, but refuses to merge. 

I've seen technical objections in this thread. The whole thing _started_ with 
one, and Thomas brought up others.

As a top-level maintainer, I can also very much sympathise with the "don't 
merge new stuff if there are known problems and no known solutions to 
those issues". Is Ingo supposed to just continue to merge crap, when it's 
admitted that it has problems and pollutes code that he has to maintain?

The fact is (and this is a _fact_): Xen is a total mess from a development 
standpoint. I talked about this in private with Jeremy. Xen pollutes the 
architecture code in ways that NO OTHER subsystem does. And I have never 
EVER seen the Xen developers really acknowledge that and try to fix it.

Thomas pointed to patches that add _explicitly_ Xen-related special cases 
that aren't even trying to make sense. See the local apic thing. 

So quite frankly, I wish some of the Xen people looked themselves in the 
mirror, and then asked themselves "would _I_ merge something ugly like 
that, if it was filling my subsystem with totally unrelated hacks for some 
other crap"?

Seriously.

If it was just the local APIC, fine. But it may be just the local APIC 
code this time around, next time it will be something else. It's been TLB, 
it's been entry_*.S, it's been all over. Some of them are performance 
issues.

I dunno. I just do know that I pointed out the statistics for how 
mindlessly incestuous the Xen patches have historically been to Jeremy. He 
admitted it. I've not seen _anybody_ say that things will improve. 

Xen has been painful. If you give maintainers pain, don't expect them to 
love you or respect you.

So I would really suggest that Xen people should look at _why_ they are 
giving maintainers so much pain.

		Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-02 17:46                         ` Linus Torvalds
@ 2009-06-02 18:02                           ` Linus Torvalds
  2009-06-02 18:59                             ` Avi Kivity
  2009-06-04 14:02                           ` [Xen-users] " Thomas Goirand
  1 sibling, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-06-02 18:02 UTC (permalink / raw)
  To: George Dunlap
  Cc: Thomas Gleixner, David Miller, jeremy, mingo, Dan Magenheimer,
	avi, xen-devel, x86, linux-kernel, Keir Fraser, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe, npiggin



On Tue, 2 Jun 2009, Linus Torvalds wrote:
> 
> I dunno. I just do know that I pointed out the statistics for how 
> mindlessly incestuous the Xen patches have historically been to Jeremy. He 
> admitted it. I've not seen _anybody_ say that things will improve. 

In case people want to look at this on their own, get a git tree, and run 
the examples I asked Jeremy to run:

        git log --pretty=oneline --full-diff --stat arch/x86/kvm/ |
                grep -v '/kvm' |
                less -S

and then go ahead and do the same except with "xen" instead of "kvm".

Now, once you've done that, ask yourself which one is going to be merged 
easily and without any pushback.

Btw, this is NOT meant to be a "xen vs kvm" thing. Before you react to the 
"kvm" part, replace "arch/x86/kvm" above with "drivers/scsi" or something.

The point? Xen really is horribly badly separated out. It gets way more 
incestuous with other systems than it should. It's entirely possible that 
this is very fundamental to both paravirtualization and to hypervisor 
behavior, but it doesn't matter - it just means that I can well see that 
Xen is a f*cking pain to merge.

So please, Xen people, look at your track record, and look at the issues 
from the standpoint of somebody merging your code, rather than just from 
the standpoint of somebody who whines "I want my code to be merged".

IOW, if you have trouble getting your code merged, ask yourself what _you_ 
are doing wrong.

			Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-02 16:41                       ` George Dunlap
  2009-06-02 17:28                         ` Chris Friesen
  2009-06-02 17:46                         ` Linus Torvalds
@ 2009-06-02 18:59                         ` Thomas Gleixner
  2 siblings, 0 replies; 104+ messages in thread
From: Thomas Gleixner @ 2009-06-02 18:59 UTC (permalink / raw)
  To: George Dunlap
  Cc: David Miller, jeremy, mingo, Dan Magenheimer, avi, xen-devel,
	x86, linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

On Tue, 2 Jun 2009, George Dunlap wrote:

> Thomas Gleixner wrote:
> > Exactly that's the point. Adding dom0 makes life easier for a group of
> > users who decided to use Xen some time ago, but what Ingo wants is
> > technical improvement of the kernel.
> > 
> > There are many features which have been widely used in the distro
> > world where developers tried to push support into the kernel with the
> > same line of arguments.
> > 
> > The kernel policy always was and still is to accept only those
> > features which have a technical benefit to the code base.
> >   
> I can appreciate the idea of resisting the pushing of random features.  Still,
> your definition of "improving Linux" is still lacking.  Obviously a new
> scheduler is taking something that's existing and improving it.  But adding a
> new filesystem, a new driver, or adding a new feature, such as notifications,
> AIO, a new hardware architecture, or even KVM: How do those classify as
> "technical improvement to the kernel" or "features which have technical
> benefit to the code base" in a way that Xen does not?

There is a huge difference between new filesystems, drivers,
architectures and Xen.

A new filesystem is not intrusive to the filesystem layers, it's not
adding its special cases all over the place. There is no single "if
(fs_whatever)" hackery in the code base. Neither does a driver nor a
new architecture.

If the new functionality needs some extension to the generic code base
then this is carefully added with the maintainers of that code and the
extension is usually useful to other (filesystems, drivers,
architectures) as well. If it's necessary to add some special case for
one architecture then this is done by proper abstraction to keep the
burden and the maintenance cost down.
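
The kind of abstraction described above can be sketched as follows.
This is an illustration only, not the kernel's actual interface: the
structure demo_ioapic_ops, the backends and the register arrays are
all hypothetical. The idea is that generic code calls through one
table of function pointers, and a platform installs its own table
once at init instead of scattering "if (platform)" checks through
the callers:

```c
#include <stdint.h>

/* Hypothetical ops table; none of these names are kernel API. */
struct demo_ioapic_ops {
	uint32_t (*read)(unsigned int apic, unsigned int reg);
	void (*write)(unsigned int apic, unsigned int reg, uint32_t val);
};

/* Stand-ins for real hardware registers and for hypervisor state. */
static uint32_t native_regs[2][4];
static uint32_t hv_regs[2][4];
unsigned int hv_calls;	/* counts how often the hypervisor path ran */

static uint32_t native_read(unsigned int apic, unsigned int reg)
{
	return native_regs[apic][reg];
}

static void native_write(unsigned int apic, unsigned int reg, uint32_t val)
{
	native_regs[apic][reg] = val;
}

static uint32_t hv_read(unsigned int apic, unsigned int reg)
{
	hv_calls++;	/* a real backend would issue a hypercall here */
	return hv_regs[apic][reg];
}

static void hv_write(unsigned int apic, unsigned int reg, uint32_t val)
{
	hv_calls++;
	hv_regs[apic][reg] = val;
}

const struct demo_ioapic_ops native_ops = { native_read, native_write };
const struct demo_ioapic_ops hv_ops = { hv_read, hv_write };

/* Generic code below knows only about the table, never the platform. */
static const struct demo_ioapic_ops *active_ops = &native_ops;

void ioapic_set_ops(const struct demo_ioapic_ops *ops)
{
	active_ops = ops;
}

uint32_t ioapic_read(unsigned int apic, unsigned int reg)
{
	return active_ops->read(apic, reg);
}

void ioapic_write(unsigned int apic, unsigned int reg, uint32_t val)
{
	active_ops->write(apic, reg, val);
}
```

Callers of ioapic_read() and ioapic_write() stay identical on every
platform; only the one ioapic_set_ops() call at init differs. The
cost of this design is an indirect call per access.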

There is no #ifdef ARCH_ARM in mm/ fs/ kernel/ block/ .....

Talking about KVM, there is not a single "if (kvm)" line in the
arch/x86 code base. There is _ONE_ lonely #ifdef CONFIG_KVM_CLOCK
(which could be eliminated) in the whole x86 codebase, but at least 10
CONFIG_XEN* ones all over the place. The KVM developers went to great
lengths to avoid adding restrictions to the existing code base.

I'm not saying that the Xen folks did not listen to us, they improved
lots of their code base and Jeremy was particularly helpful to unify
the 32/64bit code.

But right now I see a big code dump with subtle details where some of
them are just not acceptable to me.

> If you mean "increases Linux's technical capability", and define Xen as
> outside of Linux, then I think the definition is too small.  After all,
> allowing Linux to run on an ARM processor isn't increasing Linux' technical
> capability, it's just allowing a new group of people (people with ARM chips)
> to use Linux.  It's the same with Xen.

No, it's not. ARM does not interfere with anything and it keeps its
architecture specific limitations confined in arch/arm.

Xen injects its design limitation workarounds into the arch/x86
codebase and burdens developers and maintainers with it.

> No one disputes the idea that changes shouldn't be ugly; no one disputes the
> idea that changes shouldn't introduce performance regressions.  But there are
> patchqueues that are ready, signed-off by other maintainers, and which Ingo
> admits that he has no technical objections to, but refuses to merge. 
> (His most recent "objection" is that he claims the currently existing pv_ops
> infrastructure (which KVM and others benefit from as well as Xen) introduces
> almost a 1% overhead on native in an mm-heavy microbenchmark.  So he refuses
> to merge feature Y (dom0 support) until the Xen community helps technically
> unrelated existing feature X (pv_ops) meet some criteria.  So it has nothing
> to do with the quality of the patches themselves.)

Oh well. It has a lot to do with the quality of the patches. The
design is part of the quality and right now the shortcomings of the
design are papered over by adding Xen restrictions into the x86 code
base.

> [Not qualified to speak to the specific technical objections.]
> > I really have a hard time to see why dom0 support makes Linux more
> > useful to people who do not use it. It does not improve the Linux
> > experience of Joe User at all.
> >   
> If Joe User uses Amazon, he benefits.  If Joe User downloads an Ubuntu or
> Debian distro, and the hosting providers were more secure and had to do less
> work because dom0 was inlined, then he benefits because of the lower cost /
> resources freed to do other things.

Right, then they can concentrate on adding another bunch of out-of-tree
patches to their kernels. Next time you stand up and tell me the same
argument for AppArmor, ndiswrapper or whatever people like to use.

> But what I was actually talking about is the number of people who don't use it
> now but would use it if it were merged in.  There are hundreds of thousands of
> instances running now, and more people are choosing to use it at the moment,
> even though those who use it have the devil's choice between doing patching or
> using a 3-year old kernel.  How many more would use it if it were in mainline?

How many more would use ndiswrapper if it were in mainline ?

> > In fact it could be harmful to the average user, if it's merged in a
> > crappy way that increases overhead, has a performance cost and draws
> > away development and maintenance resources from other areas of the
> > kernel.
> >   
> No one is asking for something to be merged in a crappy way, or with
> unacceptable performance cost.  There are a number of patchqueues that Ingo
> has no technical objections to, but which he still refuses to merge.

Right, because the lineup of patches is not completely untangled and
we still have objections against the overall outcome and design of the
Dom0 integration into the kernel proper.

It's not our fault that the Dom0 design decisions were made in total
disconnect to the kernel community and now a "swallow them as is"
policy is imposed on us with the argument that the newer kernels need
to run on ancient hypervisors as well.

You whine about users having to use 3 year old kernels, but 3 years
old hypervisors are fine, right ?

I'm not against merging dom0 in general, I'm opposing that we need to
buy inferior technical solutions which we can not change for a long
time. Once we merged them the "you can not break existing hypervisors"
argument will be used to prevent any design change and cleanup.

> The main point of Jeremy's e-mail was NOT to say, "Lots of people use this so
> you should merge it."  He was responding to Xen being treated like it had no
> benefit.  It does have a benefit; it is a feature.

Right, a feature which comes with cost. The cost is the de facto
injection of a dom0 ABI into the arch/x86 code base. A new driver is
a feature as well, but it just adds the feature w/o impact to the
general system.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-02 18:02                           ` Linus Torvalds
@ 2009-06-02 18:59                             ` Avi Kivity
  2009-06-07  9:13                               ` Ingo Molnar
  0 siblings, 1 reply; 104+ messages in thread
From: Avi Kivity @ 2009-06-02 18:59 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: George Dunlap, Thomas Gleixner, David Miller, jeremy, mingo,
	Dan Magenheimer, xen-devel, x86, linux-kernel, Keir Fraser,
	gregkh, kurt.hackel, Ian Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, Stephen Spector, jens.axboe, npiggin

Linus Torvalds wrote:
> The point? Xen really is horribly badly separated out. It gets way more 
> incestuous with other systems than it should. It's entirely possible that 
> this is very fundamental to both paravirtualization and to hypervisor 
> behavior, but it doesn't matter - it just means that I can well see that 
> Xen is a f*cking pain to merge.
>
> So please, Xen people, look at your track record, and look at the issues 
> from the standpoint of somebody merging your code, rather than just from 
> the standpoint of somebody who whines "I want my code to be merged".
>
> IOW, if you have trouble getting your code merged, ask yourself what _you_ 
> are doing wrong.
>   

There is in fact a way to get dom0 support with nearly no changes to 
Linux, but it involves massive changes to Xen itself and requires 
hardware support: run dom0 as a fully virtualized guest, and assign it 
all the resources dom0 can access.  It's probably a massive effort though.

I've considered it for kvm when faced with the "I want a thin 
hypervisor" question: compile the hypervisor kernel with PCI support but 
nothing else (no CONFIG_BLOCK or CONFIG_NET, no device drivers), load 
userspace from initramfs, and assign host devices to one or more 
privileged guests.  You could probably run the host with a heavily 
stripped configuration, and enjoy the slimness while every interrupt 
invokes the scheduler, a context switch, and maybe an IPI for good measure.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-05-29 12:01                   ` George Dunlap
                                       ` (2 preceding siblings ...)
  2009-06-02 15:23                     ` Thomas Gleixner
@ 2009-06-02 22:40                     ` Steven Rostedt
  2009-06-02 23:28                       ` Merge Xen (the hypervisor) into Linux Ingo Molnar
  2009-06-02 23:41                       ` Xen is a feature Thomas Gleixner
  3 siblings, 2 replies; 104+ messages in thread
From: Steven Rostedt @ 2009-06-02 22:40 UTC (permalink / raw)
  To: George Dunlap
  Cc: David Miller, jeremy, mingo, Dan Magenheimer, avi, xen-devel,
	x86, linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

On Fri, May 29, 2009 at 01:01:18PM +0100, George Dunlap wrote:
>
> If we take him at his word, that the root issue is that he fundamentally  
> dislikes the design choice of running Linux-as-hypervisor-component,  
> then we have a difference of opinion and we're just going to have to  
> agree to disagree.  But there are reasons to include it anyway,  
> including benefits to existing Xen users and potential Xen users (who  
> have decided not to use KVM for whatever reason), and the idea of  
> survival-of-the-fittest: Xen and KVM have made different design choices,  
> let's let them both grow and see which one thrives.  If KVM's design is  
> unilaterally superior, eventually Xen will die off.  But I suspect that  
> there's significant demand in the OSS virtualization ecology for both  
> approaches, and the world will be the worse for dom0 support being  
> out-of-tree.
>

Three years ago, when I was hired by Red Hat, I was put on the Virt team,
and I had to work on Xen. I found it an awkward community to say the least.
But I'll refrain from talking about that experience.

Before I was hired, I was developing the -rt patch full time. I was accustomed
to the way Linux development worked, and felt comfortable with it. I was
very pleased when I left the virt team to go back to work on the -rt patch.
Just before I left, KVM came out. I started playing with it and I once again
felt comfortable in that development. I probably would not have minded working
in the virt team if it was KVM that I was working on. I guess the point I'm
trying to make here is that KVM is developed in a Linux community, Xen is not.

The major difference between KVM and Xen is that KVM _is_ part of Linux. Xen
is not. The reason that this matters is that if we need to make a change to
the way Linux works we can simply make KVM handle the change. That is, you
could think of it as Dom0 and the hypervisor would always be in sync.

If we were to break an interface with Dom0 for Xen then we would have a bunch
of people crying foul about us breaking a defined API. One of Thomas's complaints
(and a valid one) is that once Linux supports an external API it must always
keep it compatible. This will hamper new development in Linux if the APIs are
scattered throughout the kernel without much thought.

Now here's a crazy solution. Merge the Xen hypervisor into Linux ;-)

Give full ownership of Xen to the Linux community. One of your people could be
a maintainer. This way the API between Dom0 and the hypervisor would be an internal
one. If you needed to upgrade Dom0, you also must upgrade the hypervisor, but that
would be fine since the hypervisor would also be in the Kernel proper.

This may not solve all the issues that the x86 maintainers have with the Dom0
patches, but it may help solve the API one.

Yeah, I know, I'll be having snowball fights with Saddam before that happens.

-- Steve


* Merge Xen (the hypervisor) into Linux
  2009-06-02 22:40                     ` Steven Rostedt
@ 2009-06-02 23:28                       ` Ingo Molnar
  2009-06-03  0:00                         ` Dan Magenheimer
                                           ` (3 more replies)
  2009-06-02 23:41                       ` Xen is a feature Thomas Gleixner
  1 sibling, 4 replies; 104+ messages in thread
From: Ingo Molnar @ 2009-06-02 23:28 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: George Dunlap, David Miller, jeremy, Dan Magenheimer, avi,
	xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe, npiggin


* Steven Rostedt <rostedt@goodmis.org> wrote:

> Now here's a crazy solution. Merge the Xen hypervisor into Linux 
> ;-)

That's not that crazy - it's the right technical solution if DOM0 is 
desired for upstream. From what i've seen in DOM0 land the incestuous 
dependencies are really only long-term manageable if the whole thing 
is in a single tree.

A lot of Xen legacies could be dropped: the crazy ring1 hack on 
32-bit, the various wide interfaces to make pure-software 
virtualization limp along. All major CPUs shipped with hardware 
virtualization support in the past 2-3 years, so the availability of 
VMX and SVM can be taken for granted for such a project.

That cuts down on a fair amount of crap. A lot of code on the Linux 
side could be reused, and a pure CONFIG_PCI=y (all other things 
disabled) would provide a "slim hypervisor" instance with a very 
small and concentrated code base. (That 'slim hypervisor' might even 
be built with CONFIG_NOMMU.)

That way dom0 would be a natural extension: a minimal interface 
between Linux-Xen-minimal and the dom0 guest instance.

It's a sane technical model IMO, and makes dom0 a lot more 
palatable. Having in-tree competition to KVM would also obviously be 
good to Linux in general.

	Ingo

* Re: Xen is a feature
  2009-06-02 22:40                     ` Steven Rostedt
  2009-06-02 23:28                       ` Merge Xen (the hypervisor) into Linux Ingo Molnar
@ 2009-06-02 23:41                       ` Thomas Gleixner
  1 sibling, 0 replies; 104+ messages in thread
From: Thomas Gleixner @ 2009-06-02 23:41 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: George Dunlap, David Miller, jeremy, mingo, Dan Magenheimer, avi,
	xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe, npiggin

On Tue, 2 Jun 2009, Steven Rostedt wrote:
> If we were to break an interface with Dom0 for Xen then we would have a bunch
> of people crying foul about us breaking a defined API. One of Thomas's complaints
> (and a valid one) is that once Linux supports an external API it must always
> keep it compatible. This will hamper new development in Linux if the APIs are
> scattered throughout the kernel without much thought.
> 
> Now here's a crazy solution. Merge the Xen hypervisor into Linux ;-)

Not as crazy as you might think.
 
> Give full ownership of Xen to the Linux community. One of your people could be
> a maintainer. This way the API between Dom0 and the hypervisor would be an internal

s/API/ABI/ :) 

> one. If you needed to upgrade Dom0, you also must upgrade the hypervisor, but that
> would be fine since the hypervisor would also be in the Kernel proper.
> 
> This may not solve all the issues that the x86 maintainers have with the Dom0
> patches, but it may help solve the API one.

In fact it would resolve the ABI problem once and forever as we could
fix hypervisor / dom0 in sync. hypervisor and dom0 need to run in
lock-step anyway if you want to make useful progress aside from
maintaining versioned interfaces which are known to bloat rapidly.

It's not a big deal to set a flag day which says: update hypervisor
and (dom0) kernel in one go.
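
A minimal sketch of the trade-off described above, assuming a made-up
request format: every name here (abi_request, handle_versioned,
handle_lockstep, the ABI_V* constants) is hypothetical and is not taken
from Xen or Linux. It only contrasts a handler that must keep a branch
for each revision ever shipped with one that accepts the current
revision and nothing else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface revisions; not taken from Xen or Linux. */
enum { ABI_V1 = 1, ABI_V2 = 2, ABI_CURRENT = ABI_V2 };

/* Hypothetical request: v1 callers pass a 32-bit count, v2 callers a 64-bit one. */
struct abi_request {
    uint32_t version;
    uint32_t count_v1;
    uint64_t count_v2;
};

/* Versioned model: each revision ever shipped keeps its own compat branch. */
int handle_versioned(const struct abi_request *req, uint64_t *count_out)
{
    assert(req != NULL && count_out != NULL);
    switch (req->version) {
    case ABI_V1:
        *count_out = req->count_v1;   /* widen the legacy field */
        return 0;
    case ABI_V2:
        *count_out = req->count_v2;
        return 0;
    default:
        return -1;                    /* unknown revision */
    }
}

/* Lock-step model: only the current revision exists; anything else is refused. */
int handle_lockstep(const struct abi_request *req, uint64_t *count_out)
{
    assert(req != NULL && count_out != NULL);
    if (req->version != ABI_CURRENT)
        return -1;                    /* mismatched pair: update both together */
    *count_out = req->count_v2;
    return 0;
}
```

In the versioned handler the switch grows by one case per revision and
none can be dropped; in the lock-step handler a mismatch is simply an
error that tells the user to update both sides.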

Thanks,

	tglx

* RE: Merge Xen (the hypervisor) into Linux
  2009-06-02 23:28                       ` Merge Xen (the hypervisor) into Linux Ingo Molnar
@ 2009-06-03  0:00                         ` Dan Magenheimer
  2009-06-03  0:32                           ` Thomas Gleixner
  2009-06-03  2:43                           ` Theodore Tso
  2009-06-03  1:00                         ` Joel Becker
                                           ` (2 subsequent siblings)
  3 siblings, 2 replies; 104+ messages in thread
From: Dan Magenheimer @ 2009-06-03  0:00 UTC (permalink / raw)
  To: Ingo Molnar, Steven Rostedt
  Cc: George Dunlap, David Miller, jeremy, avi, xen-devel, x86,
	linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

That sound you heard was 10000 xen-users@lists.xensource.com
all having heart attacks at once.

Need I say more.

> -----Original Message-----
> From: Ingo Molnar [mailto:mingo@elte.hu]
> Sent: Tuesday, June 02, 2009 5:29 PM
> To: Steven Rostedt
> Cc: George Dunlap; David Miller; jeremy@goop.org; Dan Magenheimer;
> avi@redhat.com; xen-devel@lists.xensource.com; x86@kernel.org;
> linux-kernel@vger.kernel.org; Keir Fraser;
> torvalds@linux-foundation.org; gregkh@suse.de; Kurt Hackel; Ian Pratt;
> xen-users@lists.xensource.com; ksrinivasan; EAnderson@novell.com;
> wimcoekaerts@wimmekes.net; Stephen Spector; Jens Axboe; 
> npiggin@suse.de
> Subject: Merge Xen (the hypervisor) into Linux
> 
> 
> 
> * Steven Rostedt <rostedt@goodmis.org> wrote:
> 
> > Now here's a crazy solution. Merge the Xen hypervisor into Linux 
> > ;-)
> 
> That's not that crazy - it's the right technical solution if DOM0 is 
> desired for upstream. From what i've seen in DOM0 land the incestuous 
> dependencies are really only long-term manageable if the whole thing 
> is in a single tree.
> 
> A lot of Xen legacies could be dropped: the crazy ring1 hack on 
> 32-bit, the various wide interfaces to make pure-software 
> virtualization limp along. All major CPUs shipped with hardware 
> virtualization support in the past 2-3 years, so the availability of 
> VMX and SVM can be taken for granted for such a project.
> 
> That cuts down on a fair amount of crap. A lot of code on the Linux 
> side could be reused, and a pure CONFIG_PCI=y (all other things 
> disabled) would provide a "slim hypervisor" instance with a very 
> small and concentrated code base. (That 'slim hypervisor' might even 
> be built with CONFIG_NOMMU.)
> 
> That way dom0 would be a natural extension: a minimal interface 
> between Linux-Xen-minimal and the dom0 guest instance.
> 
> It's a sane technical model IMO, and makes dom0 a lot more 
> palatable. Having in-tree competition to KVM would also obviously be 
> good to Linux in general.
> 
> 	Ingo
>

* RE: Merge Xen (the hypervisor) into Linux
  2009-06-03  0:00                         ` Dan Magenheimer
@ 2009-06-03  0:32                           ` Thomas Gleixner
  2009-06-03  2:43                           ` Theodore Tso
  1 sibling, 0 replies; 104+ messages in thread
From: Thomas Gleixner @ 2009-06-03  0:32 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Ingo Molnar, Steven Rostedt, George Dunlap, David Miller, jeremy,
	avi, xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe, npiggin

On Tue, 2 Jun 2009, Dan Magenheimer wrote:

> That sound you heard was 10000 xen-users@lists.xensource.com
> all having heart attacks at once.
> 
> Need I say more.

Well, you might answer the question whether you are the only survivor
of that mass heart attack. In case you are the only one we can simply
assume that 99.99% of the user base is gone and we can stop the merge
discussion completely. Otherwise we try to find the survivors who
have more to contribute than the tabloid pattern.

Thanks,

	tglx

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-02 23:28                       ` Merge Xen (the hypervisor) into Linux Ingo Molnar
  2009-06-03  0:00                         ` Dan Magenheimer
@ 2009-06-03  1:00                         ` Joel Becker
  2009-06-03  2:00                           ` david
  2009-06-03  7:59                           ` Alan Cox
  2009-06-03  8:07                         ` Christian Tramnitz
  2009-06-03 17:31                         ` Chris Friesen
  3 siblings, 2 replies; 104+ messages in thread
From: Joel Becker @ 2009-06-03  1:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, George Dunlap, David Miller, jeremy,
	Dan Magenheimer, avi, xen-devel, x86, linux-kernel, Keir Fraser,
	torvalds, gregkh, kurt.hackel, Ian Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, Stephen Spector, jens.axboe, npiggin

[ Speaking as me, no regard to $EMPLOYER ]

On Wed, Jun 03, 2009 at 01:28:43AM +0200, Ingo Molnar wrote:
> A lot of Xen legacies could be dropped: the crazy ring1 hack on 
> 32-bit, the various wide interfaces to make pure-software 
> virtualization limp along. All major CPUs shipped with hardware 
> virtualization support in the past 2-3 years, so the availability of 
> VMX and SVM can be taken for granted for such a project.

	The biggest reason I personally want Xen to be in mainline is
PVM.  Dropping PVM is, to me, pretty much saying "let's merge Xen
without taking the useful parts."
	I have only two large machines I control.  They're too big to
run as single hosts - it's a waste - but I can leverage cluster testing
by virtualizing them.
	The first machine has HVM support.  The early kind.  It's about
2 years old.  It's so dreadfully slow that I had to go to PVM.  That
runs at very good speeds and I've stopped noticing the virtualization.
The only problem I have is managing the hypervisor bits, because they're
out of tree.
	Now, perhaps that could be fixed.  Someone told me that older
HVM boxen can't be fixed; you need a very recent VMX/SVM to perform
well.  But if it is fixable, then perhaps future plans shouldn't worry
about it.
	The second machine is pre-HVM by a short period.  It is not even
three years old.  I can't run HVM on it, at all.  I can either run PVM
or I can't virtualize.  It has fast CPUs and many GB of RAM.  I can do
an entire four node cluster test on it, with serious (read, memory
intensive) software.  In a PVM-less world, this machine becomes a
single cluster node, and I have to go find three more machines.  Of
course, if I had infinite machines, I wouldn't be worrying about this at
all.
	So I want to see PVM continue for a long time.  I'd like it to
be something I can get with mainline Linux.  I don't care if it is dom0,
dom0 and the hypervisor, whatever.  I just don't want to have to be
patching out-of-tree patches for a pretty basic functionality.
	I don't see 2-3 years as a time frame to assume "everyone has
one."  Otherwise, why does Linux have code for x86_32?  Everyone's had a
64bit system for at least that long.  Sure, that's a straw man.  It goes
both ways.
	Like Chris said, if we have technical hurdles for Xen to cross,
let's get them out in the open and fixed.  If previous Xen developer
interaction has left a bad taste in people's mouths, then the current
crew has to make it up to us.  But we have to be willing to notice
they're doing so.
	At the end of the day, I want to use Linux on my systems.

Joel

-- 

"I almost ran over an angel
 He had a nice big fat cigar.
 'In a sense,' he said, 'You're alone here
 So if you jump, you'd best jump far.'"

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  1:00                         ` Joel Becker
@ 2009-06-03  2:00                           ` david
  2009-06-03  7:59                           ` Alan Cox
  1 sibling, 0 replies; 104+ messages in thread
From: david @ 2009-06-03  2:00 UTC (permalink / raw)
  To: Joel Becker
  Cc: Ingo Molnar, Steven Rostedt, George Dunlap, David Miller, jeremy,
	Dan Magenheimer, avi, xen-devel, x86, linux-kernel, Keir Fraser,
	torvalds, gregkh, kurt.hackel, Ian Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, Stephen Spector, jens.axboe, npiggin

On Tue, 2 Jun 2009, Joel Becker wrote:

> [ Speaking as me, no regard to $EMPLOYER ]
>
> On Wed, Jun 03, 2009 at 01:28:43AM +0200, Ingo Molnar wrote:
>> A lot of Xen legacies could be dropped: the crazy ring1 hack on
>> 32-bit, the various wide interfaces to make pure-software
>> virtualization limp along. All major CPUs shipped with hardware
>> virtualization support in the past 2-3 years, so the availability of
>> VMX and SVM can be taken for granted for such a project.
>
> 	The biggest reason I personally want Xen to be in mainline is
> PVM.  Dropping PVM is, to me, pretty much saying "let's merge Xen
> without taking the useful parts."


> 	So I want to see PVM continue for a long time.  I'd like it to
> be something I can get with mainline Linux.  I don't care if it is dom0,
> dom0 and the hypervisor, whatever.  I just don't want to have to be
> patching out-of-tree patches for a pretty basic functionality.
> 	I don't see 2-3 years as a time frame to assume "everyone has
> one."  Otherwise, why does Linux have code for x86_32?  Everyone's had a
> 64bit system for at least that long.  Sure, that's a straw man.  It goes
> both ways.

it's always easier to continue to support stuff that you already have in 
place than it is to add new things.

if the non PVM stuff could be added to the kernel, how much would that 
simplify the code needed to support PVM? would that reduce the amount of 
effort that the Xen people need to spend to something that would mean that 
they would be able to keep up with fairly recent kernels?

or what about getting the non PVM version in, and then making the separate 
argument to add PVM support with a different config option ('xen support 
for older CPUs, note there is a performance degradation if this option is 
selected'), distros could support Xen in their main kernel package on new 
hardware, and users like you could enable the slower version.

David Lang

note: I am not an approver in this process, just an interested observer 
(who doesn't use Xen)

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  0:00                         ` Dan Magenheimer
  2009-06-03  0:32                           ` Thomas Gleixner
@ 2009-06-03  2:43                           ` Theodore Tso
  2009-06-03  3:42                             ` Steven Rostedt
  2009-06-03  7:28                             ` Gerd Hoffmann
  1 sibling, 2 replies; 104+ messages in thread
From: Theodore Tso @ 2009-06-03  2:43 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Ingo Molnar, Steven Rostedt, George Dunlap, David Miller, jeremy,
	avi, xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe, npiggin

On Tue, Jun 02, 2009 at 05:00:21PM -0700, Dan Magenheimer wrote:
> That sound you heard was 10000 xen-users@lists.xensource.com
> all having heart attacks at once.
> 
> Need I say more.

So maybe I'm stupid, but why would they be having heart attacks?

It seems like a decent solution to me.  What's being proposed would
make the dom0/hypervisor interface an internal one, always subject to
change.  What's wrong with that?  Presumably the domU/hypervisor
interface would have to remain stable, but why does the
dom0/hypervisor interface have to be sacred and unchanging?  I don't
understand the concern.

			       	     	    - Ted

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  2:43                           ` Theodore Tso
@ 2009-06-03  3:42                             ` Steven Rostedt
  2009-06-03  4:49                               ` Dan Magenheimer
  2009-06-03  7:28                             ` Gerd Hoffmann
  1 sibling, 1 reply; 104+ messages in thread
From: Steven Rostedt @ 2009-06-03  3:42 UTC (permalink / raw)
  To: Theodore Tso
  Cc: Dan Magenheimer, Ingo Molnar, George Dunlap, David Miller,
	jeremy, avi, xen-devel, x86, linux-kernel, Keir Fraser, torvalds,
	gregkh, kurt.hackel, Ian Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, Stephen Spector, jens.axboe, npiggin


On Tue, 2 Jun 2009, Theodore Tso wrote:

> On Tue, Jun 02, 2009 at 05:00:21PM -0700, Dan Magenheimer wrote:
> > That sound you heard was 10000 xen-users@lists.xensource.com
> > all having heart attacks at once.
> > 
> > Need I say more.
> 
> So maybe I'm stupid, but why would they be having heart attacks?

Maybe because they asked for an apple and got an apple pie?

That is, they are pushing hard for an interface for Dom0, and Ingo just 
agreed to take it along with the entire Xen hypervisor ;-)

> 
> It seems like a decent solution to me.  What's being proposed would
> make the dom0/hypervisor interface an internal one, always subject to
> change.  What's wrong with that?  Presumably the domU/hypervisor
> interface would have to remain stable, but why does the
> dom0/hypervisor interface have to be sacred and unchanging?  I don't
> understand the concern.

I know I said it was a crazy idea, but the craziness was not with the 
technical side, or even if it is the correct thing to do. I just don't see 
the Xen team cooperating with the Linux team. But maybe those are the old 
days. Perhaps the rightful place for the Xen hypervisor is in Linux. Xen 
is GPL right? Thus we could do this even without the permission from 
Citrix.

The Dom0 push of Xen just seems too much like Linux being Xen's sex 
slave, when it should be the other way around. With Linux acquiring the Xen 
hypervisor, I can imagine much more progress in the area of Xen. KVM 
may be a competitor, but the two may also be able to share code thus both 
could benefit.

I'm not as turned off by Paravirt as others (although I've had my cursing 
at it), but with Xen inside Linux, we can tame the damage. Progress of Xen 
would speed up since there would be no barrier between the changes in Linux 
and the changes in Xen. That is, they will always be compatible.

-- Steve


* RE: Merge Xen (the hypervisor) into Linux
  2009-06-03  3:42                             ` Steven Rostedt
@ 2009-06-03  4:49                               ` Dan Magenheimer
  2009-06-03  4:58                                 ` David Miller
  2009-06-03  5:22                                 ` Steven Rostedt
  0 siblings, 2 replies; 104+ messages in thread
From: Dan Magenheimer @ 2009-06-03  4:49 UTC (permalink / raw)
  To: Steven Rostedt, Theodore Tso
  Cc: Ingo Molnar, George Dunlap, David Miller, jeremy, avi, xen-devel,
	x86, linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

> > On Tue, Jun 02, 2009 at 05:00:21PM -0700, Dan Magenheimer wrote:
> > > That sound you heard was 10000 xen-users@lists.xensource.com
> > > all having heart attacks at once.
> > > 
> > > Need I say more.
> > 
> > So maybe I'm stupid, but why would they be having heart attacks?
> 
> Maybe because they asked for an apple and got an apple pie?
> 
> That is, they are pushing hard for an interface for Dom0, and 
> Ingo just 
> agreed to take it along with the entire Xen hypervisor ;-)

Um, no, he did not.  He and Avi suggested that Xen be completely
rearchitected to suit Linux's preferences. 

A hypervisor is not an operating system.  Yes there is
similarity in a number of pieces of code.  But there's
some similarity between Java and Linux too...

> Perhaps the rightful place for the Xen hypervisor is in 
> Linux. Xen 
> is GPL right? Thus we could do this even without the permission from 
> Citrix.

(tongue firmly in cheek in case you might assume otherwise)
Linux is GPL right?  Perhaps the rightful place for the Linux
operating system is part of Java.  Thus we could do this even
without the permission from Ingo.

> I just don't see 
> the Xen team cooperating with the Linux team.  But maybe those 
> are the old days. 

Yes, let's fix that.  Let's start turning this discussion towards
how we can cooperate better.

> The Dom0 push of Xen just seems too much like Linux being Xen's sex 
> slave, when it should be the other way around.

I can certainly see how it might feel that way, but it needn't
be... nor the other way around.  But in the end, only the end users
matter.  If we can't cooperate, we simply cede the war to Windows
and Hyper-V.

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  4:49                               ` Dan Magenheimer
@ 2009-06-03  4:58                                 ` David Miller
  2009-06-03  5:07                                   ` Steven Rostedt
  2009-06-03  5:22                                 ` Steven Rostedt
  1 sibling, 1 reply; 104+ messages in thread
From: David Miller @ 2009-06-03  4:58 UTC (permalink / raw)
  To: dan.magenheimer
  Cc: rostedt, tytso, mingo, george.dunlap, jeremy, avi, xen-devel,
	x86, linux-kernel, Keir.Fraser, torvalds, gregkh, kurt.hackel,
	Ian.Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	stephen.spector, jens.axboe, npiggin

From: Dan Magenheimer <dan.magenheimer@oracle.com>
Date: Tue, 2 Jun 2009 21:49:58 -0700 (PDT)

> A hypervisor is not an operating system.

This is a pretty bogus statement if you ask me.

A hypervisor is a software system that provides separation between
protection realms.

It also handles exceptions and "system calls" on behalf of the other
protection realms.

I personally don't see the difference at all.  And since many
hypervisors even do cpu scheduling, the fundamental differences
converge to almost nothing.

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  4:58                                 ` David Miller
@ 2009-06-03  5:07                                   ` Steven Rostedt
  0 siblings, 0 replies; 104+ messages in thread
From: Steven Rostedt @ 2009-06-03  5:07 UTC (permalink / raw)
  To: David Miller
  Cc: dan.magenheimer, tytso, mingo, george.dunlap, jeremy, avi,
	xen-devel, x86, linux-kernel, Keir.Fraser, torvalds, gregkh,
	kurt.hackel, Ian.Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, stephen.spector, jens.axboe, npiggin


On Tue, 2 Jun 2009, David Miller wrote:

> From: Dan Magenheimer <dan.magenheimer@oracle.com>
> Date: Tue, 2 Jun 2009 21:49:58 -0700 (PDT)
> 
> > A hypervisor is not an operating system.
> 
> This is a pretty bogus statement if you ask me.
> 
> A hypervisor is a software system that provides separation between
> protection realms.
> 
> It also handles exceptions and "system calls" on behalf of the other
> protection realms.
> 
> I personally don't see the difference at all.  And since many
> hypervisors even do cpu scheduling, the fundamental differences
> converge to almost nothing.

I recently sat in an Operating Systems class where the Professor was an 
old IBM retiree who worked on the 390 system way back when. He would 
argue the point that an Operating System must do at least two things, 
schedule tasks and manage paging. The Xen hypervisor does both, thus in 
his eyes, it is indeed an Operating System.

-- Steve

P.S. he also thought that filesystem management does not have to be a 
duty of the OS and he hated the fact he had to teach it ;-)


* RE: Merge Xen (the hypervisor) into Linux
  2009-06-03  4:49                               ` Dan Magenheimer
  2009-06-03  4:58                                 ` David Miller
@ 2009-06-03  5:22                                 ` Steven Rostedt
  2009-06-03 12:03                                   ` George Dunlap
  1 sibling, 1 reply; 104+ messages in thread
From: Steven Rostedt @ 2009-06-03  5:22 UTC (permalink / raw)
  To: Dan Magenheimer
  Cc: Theodore Tso, Ingo Molnar, George Dunlap, David Miller, jeremy,
	avi, xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe, npiggin


On Tue, 2 Jun 2009, Dan Magenheimer wrote:

> > > On Tue, Jun 02, 2009 at 05:00:21PM -0700, Dan Magenheimer wrote:
> > > > That sound you heard was 10000 xen-users@lists.xensource.com
> > > > all having heart attacks at once.
> > > > 
> > > > Need I say more.
> > > 
> > > So maybe I'm stupid, but why would they be having heart attacks?
> > 
> > Maybe because they asked for an apple and got an apple pie?
> > 
> > That is, they are pushing hard for an interface for Dom0, and 
> > Ingo just 
> > agreed to take it along with the entire Xen hypervisor ;-)
> 
> Um, no, he did not.  He and Avi suggested that Xen be completely
> rearchitected to suit Linux's preferences. 

I was being a bit tongue in cheek with that comment too.

> 
> A hypervisor is not an operating system.

You say potato I say potato (Hmm, that doesn't work in text)

>  Yes there is
> similarity in a number of pieces of code.  But there's
> some similarity between Java and Linux too...

Java can run on hardware?

> 
> > Perhaps the rightful place for the Xen hypervisor is in 
> > Linux. Xen 
> > is GPL right? Thus we could do this even without the permission from 
> > Citrix.
> 
> (tongue firmly in cheek in case you might assume otherwise)
> Linux is GPL right?  Perhaps the rightful place for the Linux
> operating system is part of Java.  Thus we could do this even
> without the permission from Ingo.

If Java became GPL it could very well do that.

> 
> > I just don't see 
> > the Xen team cooperating with the Linux team.  But maybe those 
> > are the old days. 
> 
> Yes, let's fix that.  Let's start turning this discussion towards
> how we can cooperate better.

Sure.

> 
> > The Dom0 push of Xen just seems too much like Linux being Xen's sex 
> > slave, when it should be the other way around.
> 
> I can certainly see how it might feel that way, but it needn't
> be... nor the other way around.  But in the end, only the end users
> matter.  If we can't cooperate, we simply cede the war to Windows
> and Hyper-V.

When I suggested that Xen be merged into Linux, I did not mean it had to be 
like KVM or lguest where Linux would boot up and run Xen. I mean that 
Xen could still be a micro kernel. The difference would be that its source 
would live in the kernel proper. linux.git/xen?   This way the ABI between 
Xen and Dom0 would always be in sync.

We could even link it in to the vmlinuz, instead of needing the separate 
xen.gz to load first. The vmlinuz could then expand into a Xen 
hypervisor, and also load the Dom0 with it. One image for both entities.

If you want Dom0 ABI in, you have to expect it to change without notice. 
If this breaks Xen, then we don't want to hear any complaints. This means 
that users of Xen would need to make sure that they have the most 
recent hypervisor and kernel and hope that they match.

With the combined image we then get the two to always be together, and no 
problems with the users.

What's the issue with this? You get to keep your "micro hypervisor" design 
that has been stated to be the superior method.

-- Steve


* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  2:43                           ` Theodore Tso
  2009-06-03  3:42                             ` Steven Rostedt
@ 2009-06-03  7:28                             ` Gerd Hoffmann
  2009-06-03  8:47                               ` Alan Cox
  1 sibling, 1 reply; 104+ messages in thread
From: Gerd Hoffmann @ 2009-06-03  7:28 UTC (permalink / raw)
  To: Theodore Tso, Dan Magenheimer, Ingo Molnar, Steven Rostedt,
	George Dunlap, David Miller, jeremy, avi, xen-devel, x86,
	linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

   Hi,

> It seems like a decent solution to me.  What's being proposed would
> make the dom0/hypervisor interface an internal one, always subject to
> change.  What's wrong with that?

Linux is not the only player here.  NetBSD can run as dom0 guest. 
Solaris can run as dom0 guest too.  Thus making the dom0/xen interface 
private to linux and xen isn't going to fly.

cheers,
   Gerd

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  1:00                         ` Joel Becker
  2009-06-03  2:00                           ` david
@ 2009-06-03  7:59                           ` Alan Cox
  1 sibling, 0 replies; 104+ messages in thread
From: Alan Cox @ 2009-06-03  7:59 UTC (permalink / raw)
  To: Joel Becker
  Cc: Ingo Molnar, Steven Rostedt, George Dunlap, David Miller, jeremy,
	Dan Magenheimer, avi, xen-devel, x86, linux-kernel, Keir Fraser,
	torvalds, gregkh, kurt.hackel, Ian Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, Stephen Spector, jens.axboe, npiggin

> 	The biggest reason I personally want Xen to be in mainline is
> PVM.  Dropping PVM is, to me, pretty much saying "let's merge Xen
> without taking the useful parts."

PVM is and has been for a long time a messaging parallel machine. Can you
not misuse the abbreviation in confusing ways (especially in email I read
in the morning ;))

Merging just hardware assisted vm support initially might be a perfectly
sensible path.

> 	Like Chris said, if we have technical hurdles for Xen to cross,
> let's get them out in the open and fixed.  If previous Xen developer
> interaction has left a bad taste in people's mouths, then the current
> crew has to make it up to us.  But we have to be willing to notice
> they're doing so.

Start by changing the mentality. Right now much of the patched code looks
like  "We made a decision years ago when creating Xen. Now we need to
force that code we wrote into Linux somehow".

Stuff gets merged a lot better if the thinking is "how do we make the
minimal changes to the existing kernel, cleanly and with minimal
inter-relationships". Only after that do you worry about whether
the existing in kernel interfaces are right.

There is a simple reason for this: Changing an interface in the kernel is
a consensus finding process around all visible users of the interface.
It's much easier to do that as a follow up. That way you can bench
alternatives, test if it harms any of the users and merge change sets
that span all the various users of the interface in one go.

It's also frequently the case that when you have a simple clean interface
that doesn't fit some in tree users it becomes blindingly obvious what it
should look like.

So I would suggest the path is
- Use existing interfaces
- Merge chunks of the Xen code without worrying too much about performance
  in Xen but worry in detail about bare metal performance
- Don't worry about "hard" problems initially - eg with PAE just use the
  paravirt CPUID hook and deny having PAE to begin with
- Where there isn't a clean simple interface try as hard as possible to
  build some glue code using existing interfaces in the kernel
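
As a rough illustration of the "deny the feature via the CPUID hook" item
in the list above: a hook that filters CPUID results can hide a feature
bit, so that code testing for the feature simply takes its existing
"not present" path. This is a minimal user-space sketch, not kernel code.
filter_cpuid_edx() and has_feature() are hypothetical names, and which
bits to deny is a policy choice made by the caller; only the bit positions
(CPUID leaf 1, register EDX) are architectural.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural feature bits in CPUID leaf 1, register EDX. */
#define CPUID1_EDX_PAE   (1u << 6)
#define CPUID1_EDX_MTRR  (1u << 12)
#define CPUID1_EDX_PAT   (1u << 16)

/*
 * Hypothetical hook: given the raw EDX value for a CPUID leaf, return the
 * value the rest of the kernel should see. Only leaf 1 is filtered; the
 * bits in 'denied' are reported as absent.
 */
static uint32_t filter_cpuid_edx(uint32_t leaf, uint32_t raw_edx,
				 uint32_t denied)
{
	if (leaf == 1)
		return raw_edx & ~denied;
	return raw_edx;
}

/* Test a single feature bit in an (already filtered) EDX value. */
static int has_feature(uint32_t edx, uint32_t bit)
{
	/* Callers must pass exactly one bit. */
	assert(bit != 0 && (bit & (bit - 1)) == 0);
	return (edx & bit) != 0;
}
```

A caller would pass the raw EDX value through filter_cpuid_edx() once and
use has_feature() on the result; nothing else needs to know that a filter
is in place.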

When it works, doesn't harm bare metal performance and is merged then go
back and worry about the harder stuff, optimisation and fine tuning. It
doesn't even need to be able to run all guests or all configurations
initially.

Also please can folks get out of the "how do we merge Xen" mentality into
the "How do we create dom0 functionality for Xen in Linux" - don't
pre-suppose the existing implementation is right.

Alan

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-02 23:28                       ` Merge Xen (the hypervisor) into Linux Ingo Molnar
  2009-06-03  0:00                         ` Dan Magenheimer
  2009-06-03  1:00                         ` Joel Becker
@ 2009-06-03  8:07                         ` Christian Tramnitz
  2009-06-04 18:53                           ` Linus Torvalds
  2009-06-03 17:31                         ` Chris Friesen
  3 siblings, 1 reply; 104+ messages in thread
From: Christian Tramnitz @ 2009-06-03  8:07 UTC (permalink / raw)
  To: linux-kernel; +Cc: xen-devel, xen-users

Ingo Molnar wrote:
> A lot of Xen legacies could be dropped: the crazy ring1 hack on 
> 32-bit, the various wide interfaces to make pure-software 
> virtualization limp along. All major CPUs shipped with hardware 
> virtualization support in the past 2-3 years, so the availability of 
> VMX and SVM can be taken for granted for such a project.

What a great idea, and while we're doing this let's also drop support
for legacy stuff like PATA and i8042 in mainline. No one will need it
anyway because their successors have been on the market for years... let's
just take it for granted that everyone is using SATA and USB nowadays!


Best regards,
   Christian


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  7:28                             ` Gerd Hoffmann
@ 2009-06-03  8:47                               ` Alan Cox
  2009-06-03  9:09                                 ` Gerd Hoffmann
  0 siblings, 1 reply; 104+ messages in thread
From: Alan Cox @ 2009-06-03  8:47 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Theodore Tso, Dan Magenheimer, Ingo Molnar, Steven Rostedt,
	George Dunlap, David Miller, jeremy, avi, xen-devel, x86,
	linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

> Linux is not the only player here.  NetBSD can run as dom0 guest. 
> Solaris can run as dom0 guest too.  Thus making the dom0/xen interface 
> private to linux and xen isn't going to fly.

It does not however preclude fixing the dom0 interface.

Anyway we deal with unfixable interfaces on a regular basis with device
hardware. What we don't do is screw up the kernel handling garbage
hardware. We dump the adaption on the driver.

Same with Xen, impedance matching Xen's interface with the kernel is (at
least initially) something that belongs entirely in the Xen glue, or to
get started initially by just turning off stuff.

MTRR, PAE etc can all be turned off for the purpose of an initial merge.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  8:47                               ` Alan Cox
@ 2009-06-03  9:09                                 ` Gerd Hoffmann
  2009-06-03  9:20                                   ` Keir Fraser
  2009-06-03 11:15                                   ` Theodore Tso
  0 siblings, 2 replies; 104+ messages in thread
From: Gerd Hoffmann @ 2009-06-03  9:09 UTC (permalink / raw)
  To: Alan Cox
  Cc: Theodore Tso, Dan Magenheimer, Ingo Molnar, Steven Rostedt,
	George Dunlap, David Miller, jeremy, avi, xen-devel, x86,
	linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

On 06/03/09 10:47, Alan Cox wrote:
>> Linux is not the only player here.  NetBSD can run as dom0 guest.
>> Solaris can run as dom0 guest too.  Thus making the dom0/xen interface
>> private to linux and xen isn't going to fly.
>
> It does not however preclude fixing the dom0 interface.

It wasn't my intention to imply that.  The interface can be extended 
when needed.  PAT support will probably be such a case.  Changing it in 
incompatible ways isn't going to work though.

> MTRR, PAE etc can all be turned off for the purpose of an initial merge.

s/PAE/PAT/?  PAE is mandatory ...

Having not-yet supported stuff disabled initially is sensible IMHO.  Can 
be done for MTRR and PAT.  Is already done for MSI ;)

The lapic/ioapic stuff must be sorted though because otherwise you can't 
boot the box at all.  I think the same is true for the swiotlb bits.

cheers,
   Gerd

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  9:09                                 ` Gerd Hoffmann
@ 2009-06-03  9:20                                   ` Keir Fraser
  2009-06-03 11:15                                   ` Theodore Tso
  1 sibling, 0 replies; 104+ messages in thread
From: Keir Fraser @ 2009-06-03  9:20 UTC (permalink / raw)
  To: Gerd Hoffmann, Alan Cox
  Cc: Theodore Tso, Dan Magenheimer, Ingo Molnar, Steven Rostedt,
	George Dunlap, David Miller, jeremy, avi, xen-devel, x86,
	linux-kernel, torvalds, gregkh, kurt.hackel, Ian Pratt,
	xen-users, ksrinivasan, EAnderson, wimcoekaerts, Stephen Spector,
	jens.axboe, npiggin

On 03/06/2009 10:09, "Gerd Hoffmann" <kraxel@redhat.com> wrote:

> On 06/03/09 10:47, Alan Cox wrote:
>>> Linux is not the only player here.  NetBSD can run as dom0 guest.
>>> Solaris can run as dom0 guest too.  Thus making the dom0/xen interface
>>> private to linux and xen isn't going to fly.
>> 
>> It does not however preclude fixing the dom0 interface.
> 
> It wasn't my intention to imply that.  The interface can be extended
> when needed.  PAT support will probably be such a case.  Changing it in
> incompatible ways isn't going to work though.

We're happy to change interfaces where we agree that makes sense.
Compatibility is our own (Xen's) problem of course, and it's generally not
an insurmountable problem -- worst case we can launch dom0 in a varying
environment dependent on a Xen-specific elf note, for example.

 -- Keir



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  9:09                                 ` Gerd Hoffmann
  2009-06-03  9:20                                   ` Keir Fraser
@ 2009-06-03 11:15                                   ` Theodore Tso
  2009-06-03 11:39                                     ` Keir Fraser
  2009-06-03 11:41                                     ` Gerd Hoffmann
  1 sibling, 2 replies; 104+ messages in thread
From: Theodore Tso @ 2009-06-03 11:15 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Alan Cox, Dan Magenheimer, Ingo Molnar, Steven Rostedt,
	George Dunlap, David Miller, jeremy, avi, xen-devel, x86,
	linux-kernel, Keir Fraser, torvalds, gregkh, kurt.hackel,
	Ian Pratt, xen-users, ksrinivasan, EAnderson, wimcoekaerts,
	Stephen Spector, jens.axboe, npiggin

On Wed, Jun 03, 2009 at 11:09:39AM +0200, Gerd Hoffmann wrote:
> On 06/03/09 10:47, Alan Cox wrote:
>>> Linux is not the only player here.  NetBSD can run as dom0 guest.
>>> Solaris can run as dom0 guest too.  Thus making the dom0/xen interface
>>> private to linux and xen isn't going to fly.
>>
>> It does not however preclude fixing the dom0 interface.
>
> It wasn't my intention to imply that.  The interface can be extended  
> when needed.  PAT support will probably be such a case.  Changing it in  
> incompatible ways isn't going to work though.

But that means that if there is some fundamentally broken piece of
dom0 design, that the Linux kernel will be stuck with it ***forever***
and it will contaminate code paths and make the code harder to
maintain ***forever*** if we consent to the Xen merge?  Is that really
what you are saying?   Be careful how you answer that....

     	     	       	  	      	  - Ted

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03 11:15                                   ` Theodore Tso
@ 2009-06-03 11:39                                     ` Keir Fraser
  2009-06-03 11:41                                     ` Gerd Hoffmann
  1 sibling, 0 replies; 104+ messages in thread
From: Keir Fraser @ 2009-06-03 11:39 UTC (permalink / raw)
  To: Theodore Tso, Gerd Hoffmann
  Cc: Alan Cox, Dan Magenheimer, Ingo Molnar, Steven Rostedt,
	George Dunlap, David Miller, jeremy, avi, xen-devel, x86,
	linux-kernel, torvalds, gregkh, kurt.hackel, Ian Pratt,
	xen-users, ksrinivasan, EAnderson, wimcoekaerts, Stephen Spector,
	jens.axboe, npiggin

On 03/06/2009 12:15, "Theodore Tso" <tytso@mit.edu> wrote:

>>> It does not however preclude fixing the dom0 interface.
>> 
>> It wasn't my intention to imply that.  The interface can be extended
>> when needed.  PAT support will probably be such a case.  Changing it in
>> incompatible ways isn't going to work though.
> 
> But that means that if there is some fundamentally broken piece of
> dom0 design, that the Linux kernel will be stuck with it ***forever***
> and it will contaminate code paths and make the code harder to
> maintain ***forever*** if we consent to the Xen merge?  Is that really
> what you are saying?   Be careful how you answer that....

It's not true, if you are prepared for a new dom0 kernel to require a new
version of Xen (which seems not unreasonable). We're happy to make
reasonable interface changes, and deal with compatibility issues as
necessary within Xen.

 -- Keir



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03 11:15                                   ` Theodore Tso
  2009-06-03 11:39                                     ` Keir Fraser
@ 2009-06-03 11:41                                     ` Gerd Hoffmann
  1 sibling, 0 replies; 104+ messages in thread
From: Gerd Hoffmann @ 2009-06-03 11:41 UTC (permalink / raw)
  To: Theodore Tso, Alan Cox, Dan Magenheimer, Ingo Molnar,
	Steven Rostedt, George Dunlap, David Miller, jeremy, avi,
	xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe, npiggin

On 06/03/09 13:15, Theodore Tso wrote:
> On Wed, Jun 03, 2009 at 11:09:39AM +0200, Gerd Hoffmann wrote:
>> It wasn't my intention to imply that.  The interface can be extended
>> when needed.  PAT support will probably be such a case.  Changing it in
>> incompatible ways isn't going to work though.
>
> But that means that if there is some fundamentally broken piece of
> dom0 design, that the Linux kernel will be stuck with it ***forever***
> and it will contaminate code paths and make the code harder to
> maintain ***forever*** if we consent to the Xen merge?

No.  Xen is stuck with it forever (or at least for a few releases). 
Even when adding new & better dom0/xen interfaces in the merge process 
Xen has to keep the old ones to handle the other dom0 guests (NetBSD, 
Solaris, old 2.6.18 out-of-tree linux kernel).  Pretty much like the 
linux kernel has to keep old syscalls to not break the ABI for the 
applications, xen has to maintain old hypercalls[1].

Other way around:  Apps can use new system calls only when running on
recent kernels, and they have to deal with -ENOSYS.  Likewise it might
be that the pv_ops-based dom0 kernel can provide some features only when 
running on a recent hypervisor.  That will likely be the case for PAT.

cheers,
   Gerd

[1] and other interfaces like trap'n'emulate certain instructions.
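
The -ENOSYS analogy above can be sketched as a probe-and-fall-back
pattern: try the newer call, and if the other side reports that it does
not implement it, stay on the older path. This is a self-contained
simulation under stated assumptions, not real Xen or Linux code. The
operation numbers, old_hypervisor(), new_hypervisor() and probe_mode() are
all hypothetical; only the -ENOSYS convention comes from the discussion.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operation numbers for the simulated interface. */
enum { OP_LEGACY = 1, OP_NEW = 2 };

enum mode { MODE_LEGACY, MODE_ENHANCED };

/* A simulated call into the hypervisor: returns 0 or a negative errno. */
typedef long (*hypercall_fn)(int op, unsigned long arg);

/* Simulated older hypervisor: only knows the legacy operation. */
static long old_hypervisor(int op, unsigned long arg)
{
	(void)arg;
	return op == OP_LEGACY ? 0 : -ENOSYS;
}

/* Simulated newer hypervisor: knows both operations. */
static long new_hypervisor(int op, unsigned long arg)
{
	(void)arg;
	return (op == OP_LEGACY || op == OP_NEW) ? 0 : -ENOSYS;
}

/*
 * Probe once for the newer operation. -ENOSYS means "not implemented
 * here", so the caller keeps using the legacy path; success selects the
 * enhanced path.
 */
static enum mode probe_mode(hypercall_fn hc)
{
	assert(hc != NULL);
	if (hc(OP_NEW, 0) == -ENOSYS)
		return MODE_LEGACY;
	return MODE_ENHANCED;
}
```

With these definitions, probe_mode(old_hypervisor) selects the legacy
path and probe_mode(new_hypervisor) selects the enhanced one, which is the
same shape as an application coping with a missing system call.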

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  5:22                                 ` Steven Rostedt
@ 2009-06-03 12:03                                   ` George Dunlap
  2009-06-03 19:05                                     ` Theodore Tso
  0 siblings, 1 reply; 104+ messages in thread
From: George Dunlap @ 2009-06-03 12:03 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Dan Magenheimer, Theodore Tso, Ingo Molnar, David Miller, jeremy,
	avi, xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe, npiggin

Steven Rostedt wrote:
> What's the issue with this? You get to keep your "micro hypervisor" design 
> that has been stated to be the superior method.
>   
It is a very interesting idea, but it would still be basically a 
completely new project.  If someone started such a project, they could 
probably cannibalize a lot of Xen's existing code (a funny boomerang, 
since Xen cannibalized Linux's code when it started), but it would still 
require a lot of work and re-writing, and the result would be a lot 
different than Xen is now.  It would be years before it was ready to be 
used in a production system.  It's not really realistic to expect all 
the Xen developers and users to drop Xen development, shift gears into 
this new project, and wait until it's ready to be used.  (That's not to 
say that the idea has no merit, just that Xen as it is wouldn't go away 
until this hypothetical linux hypervisor component was mature enough 
for users and developers to jump onto.)

Yeah, lots of interesting implications for such a project.

Having a separate component to be a hypervisor, even if in the same 
tree, would mean we could have dedicated hypervisor schedulers, &c.  
They could (conceivably) work more closely with the dom0 scheduler to 
make things more efficient.

As others have said, it would limit the ability of such a hypervisor to 
be used with other dom0 operating systems.  Fixing the ABI sufficiently 
so that others can use it might be possible, but it seems to me unlikely 
to meet with much success without a lot of commitment on both sides 
(i.e., w/in Linux and within other OS communities).

I'm not sure that it would turn out quite the way some people expect, 
though.  From a technical perspective, I'm not sure getting rid of the 
"ring 1 hack" or requiring HVM support would be the best design choice 
for such a project.  And it's hard to predict what kinds of technical, 
political, or cultural issues, directions, or potential dead-ends a 
project might take. 

From all angles, it's too risky to just abandon the current Xen 
codebase until this hypothetical linux hypervisor component has shown 
itself to be viable.

-George

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-02 23:28                       ` Merge Xen (the hypervisor) into Linux Ingo Molnar
                                           ` (2 preceding siblings ...)
  2009-06-03  8:07                         ` Christian Tramnitz
@ 2009-06-03 17:31                         ` Chris Friesen
  2009-06-03 17:36                           ` Alan Cox
  3 siblings, 1 reply; 104+ messages in thread
From: Chris Friesen @ 2009-06-03 17:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Steven Rostedt, George Dunlap, David Miller, jeremy,
	Dan Magenheimer, avi, xen-devel, x86, linux-kernel, Keir Fraser,
	torvalds, gregkh, kurt.hackel, Ian Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, Stephen Spector, jens.axboe, npiggin

Ingo Molnar wrote:

> A lot of Xen legacies could be dropped: the crazy ring1 hack on 
> 32-bit, the various wide interfaces to make pure-software 
> virtualization limp along. All major CPUs shipped with hardware 
> virtualization support in the past 2-3 years, so the availability of 
> VMX and SVM can be taken for granted for such a project.

That's a pretty bold statement.  I have five x86 machines in my house
currently being used, and none of them support VMX/SVM.

At least some Lenovo laptops disable VMX in the BIOS with no way to
enable it.  Some of the Core2Duo chips don't support VMX at all.

I think Xen without paravirtualization would be a serious degradation of
usefulness.

Chris

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03 17:31                         ` Chris Friesen
@ 2009-06-03 17:36                           ` Alan Cox
  0 siblings, 0 replies; 104+ messages in thread
From: Alan Cox @ 2009-06-03 17:36 UTC (permalink / raw)
  To: Chris Friesen
  Cc: Ingo Molnar, Steven Rostedt, George Dunlap, David Miller, jeremy,
	Dan Magenheimer, avi, xen-devel, x86, linux-kernel, Keir Fraser,
	torvalds, gregkh, kurt.hackel, Ian Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, Stephen Spector, jens.axboe, npiggin

On Wed, 03 Jun 2009 11:31:13 -0600
"Chris Friesen" <cfriesen@nortel.com> wrote:

> Ingo Molnar wrote:
> 
> > A lot of Xen legacies could be dropped: the crazy ring1 hack on 
> > 32-bit, the various wide interfaces to make pure-software 
> > virtualization limp along. All major CPUs shipped with hardware 
> > virtualization support in the past 2-3 years, so the availability of 
> > VMX and SVM can be taken for granted for such a project.
> 
> That's a pretty bold statement.  I have five x86 machines in my house
> currently being used, and none of them support VMX/SVM.
> 
> At least some Lenovo laptops disable VMX in the BIOS with no way to
> enable it.  Some of the Core2Duo chips don't support VMX at all.

Ditto some Atom cpus which in turn means you can't run kvm on all the
netbooks right now - which is one place it's very useful.

> I think Xen without paravirtualization would be a serious degradation of
> usefulness.

At that point you can just use kvm anyway.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03 12:03                                   ` George Dunlap
@ 2009-06-03 19:05                                     ` Theodore Tso
       [not found]                                       ` <4A27CF94.1050903@gmx.de>
  0 siblings, 1 reply; 104+ messages in thread
From: Theodore Tso @ 2009-06-03 19:05 UTC (permalink / raw)
  To: George Dunlap
  Cc: Steven Rostedt, Dan Magenheimer, Ingo Molnar, David Miller,
	jeremy, avi, xen-devel, x86, linux-kernel, Keir Fraser, torvalds,
	gregkh, kurt.hackel, Ian Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, Stephen Spector, jens.axboe, npiggin

On Wed, Jun 03, 2009 at 01:03:51PM +0100, George Dunlap wrote:
> It is a very interesting idea, but it would still be basically a  
> completely new project.  If someone started such a project, they could  
> probably cannibalize a lot of Xen's existing code (a funny boomerang,  
> since Xen cannibalized Linux's code when it started), but it would still  
> require a lot of work and re-writing, and the result would be a lot  
> different than Xen is now.  It would be years before it was ready to be  
> used in a production system.  

You might be surprised; if we started with a working dom0/xen pair,
and there were people working on it to clean up dom0/xen interface,
treating it as an internal Linux interface with an eye towards
minimizing contamination of core kernel code, the Linux model of
development can go pretty fast.  Compare and contrast it with the
***years*** of calendar time and decades of wasted man-years of
engineering effort needed to port and backport and maintain dom0
support with Linux.  Given that experience, I could easily see how
some might assume that it would take years to significantly improve
things, but I suspect if xen were merged into mainline with the
assumption that it could be arbitrarily changed to make things sane,
with the primary interface that needed backwards compatibility care
being the xen/domU interface, I expect things would go pretty fast.

What would be lost is dom0 support for other OS's, but really, is that
such a major loss?  Linux has far better device driver support than
Solaris or FreeBSD, so is there really that much gain in using some
other OS for dom0?

						- Ted

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-02 15:23                     ` Thomas Gleixner
  2009-06-02 16:41                       ` George Dunlap
@ 2009-06-03 19:49                       ` Bill Davidsen
  2009-06-03 20:20                         ` Thomas Gleixner
  1 sibling, 1 reply; 104+ messages in thread
From: Bill Davidsen @ 2009-06-03 19:49 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: George Dunlap, David Miller, jeremy, mingo, Dan Magenheimer, avi,
	xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe

Thomas Gleixner wrote:
> On Fri, 29 May 2009, George Dunlap wrote:
>> David Miller wrote:
>>> I don't see Ingo's comments, whether I agree with them or not, as
>>> an implication of Xen being niche.  Rather I see his comments as
>>> an opposition to how Xen is implemented.
>>>   
>> It's in his definition of "improving Linux".  Jeremy is saying that allowing
>> Linux to run as dom0 *is* improving Linux.  The lack of dom0 support is at
>> this moment making life more difficult for a huge number of Linux users who
> 
> Exactly that's the point. Adding dom0 makes life easier for a group of
> users who decided to use Xen some time ago, but what Ingo wants is
> technical improvement of the kernel.
> 
> There are many features which have been widely used in the distro
> world where developers tried to push support into the kernel with the
> same line of arguments.
> 
> The kernel policy always was and still is to accept only those
> features which have a technical benefit to the code base.
> 
> I'm just picking a few examples:
> 
> Aside of the paravirt, which seems to expand through arch/x86 like a
> hydra, the new patches sprinkle "if (xen_...)" all over the
> place. These extra xen dependencies are no improvement, they are a
> royal pain in the ... They are sticky once they got merged simply
> because the hypervisor relies on them and we need to provide
> compatibility for a long time.
> 
Wait, let's not classify something as "no improvement" when you mean "I don't 
need it." The fact that processors without hardware VM can run virtual machines 
is a non-trivial benefit for many users, and in future embedded applications, 
where both hvm and 64 bit capability may not justify their power requirements. 
And the improved PV performance over full virtualization is an improvement, even 
though it certainly isn't night and day.

Having replaced some systems with new hardware just so I could use KVM does not 
make me forget that I used xen for some time, and that PV is still a savings, 
even with the latest hardware.

Let's stick to technical issues, and not deny that there are a number of users 
who really will have expanded capability. The technical points are valid, but as 
a former and probable future xen (CentOS) user, so are the benefits.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-03 19:49                       ` Bill Davidsen
@ 2009-06-03 20:20                         ` Thomas Gleixner
  2009-06-03 22:37                           ` Bill Davidsen
  0 siblings, 1 reply; 104+ messages in thread
From: Thomas Gleixner @ 2009-06-03 20:20 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: George Dunlap, David Miller, jeremy, mingo, Dan Magenheimer, avi,
	xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe

On Wed, 3 Jun 2009, Bill Davidsen wrote:
> Thomas Gleixner wrote:
> > Aside of the paravirt, which seems to expand through arch/x86 like a
> > hydra, the new patches sprinkle "if (xen_...)" all over the
> > place. These extra xen dependencies are no improvement, they are a
> > royal pain in the ... They are sticky once they got merged simply
> > because the hypervisor relies on them and we need to provide
> > compatibility for a long time.
> > 
> Wait, let's not classify something as "no improvement" when you mean "I don't
> need it."

It's not about "I don't need it.". It's about having Xen dependencies
in the code all over the place which make maintenance harder. I have
to balance the users benefit (xen dom0 support) vs. the impact on
maintainability and the restrictions which are going to be set almost
in stone by merging it.

> Let's stick to technical issues, and not deny that there are a number of users
> who really will have expanded capability. The technical points are valid, but
> as a former and probable future xen (CentOS) user, so are the benefits.

Refusing random "if (xen...)" dependencies is a purely technical
decision. I have said more than once that I'm not against merging dom0
in general, I'm just frightened by the technical impact of a de facto
ABI which we swallow with it.

We have enough problems with real silicon and BIOS/ACPI already, why
should we add artificial and _avoidable_ virtual silicon horror ?

Thanks,

	tglx
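
The contrast drawn above, between scattered "if (xen...)" tests and a
single interception point, can be sketched with a table of function
pointers, in the spirit of the io_apic_ops idea from the pull request that
started this thread. This is a user-space illustration only and not the
code from the patch series: struct ioapic_ops_sketch, the native_* and
hv_* functions, and set_ioapic_ops() are hypothetical names, and the
"registers" are plain arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table: one place where a platform can hook access. */
struct ioapic_ops_sketch {
	unsigned int (*read)(unsigned int reg);
	void (*write)(unsigned int reg, unsigned int value);
};

#define NR_REGS 4

/* "Bare metal" backend: direct access to simulated registers. */
static unsigned int native_regs[NR_REGS];

static unsigned int native_read(unsigned int reg)
{
	return native_regs[reg % NR_REGS];
}

static void native_write(unsigned int reg, unsigned int value)
{
	native_regs[reg % NR_REGS] = value;
}

/* "Hypervisor" backend: every access is counted as a simulated call. */
static unsigned int hv_regs[NR_REGS];
static unsigned int hv_calls;

static unsigned int hv_read(unsigned int reg)
{
	hv_calls++;
	return hv_regs[reg % NR_REGS];
}

static void hv_write(unsigned int reg, unsigned int value)
{
	hv_calls++;
	hv_regs[reg % NR_REGS] = value;
}

static const struct ioapic_ops_sketch native_ops = {
	.read = native_read,
	.write = native_write,
};

static const struct ioapic_ops_sketch hv_ops = {
	.read = hv_read,
	.write = hv_write,
};

/* The native backend is the default; a platform may replace it at boot. */
static const struct ioapic_ops_sketch *active_ops = &native_ops;

static void set_ioapic_ops(const struct ioapic_ops_sketch *ops)
{
	assert(ops != NULL && ops->read != NULL && ops->write != NULL);
	active_ops = ops;
}

/* Call sites use these wrappers and contain no platform conditionals. */
static unsigned int ioapic_read(unsigned int reg)
{
	return active_ops->read(reg);
}

static void ioapic_write(unsigned int reg, unsigned int value)
{
	active_ops->write(reg, value);
}
```

The design trade-off is the one argued in this thread: the indirection
keeps platform knowledge out of the call sites, but the table itself
becomes an interface that has to be kept stable once something depends on
it.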

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-03 20:20                         ` Thomas Gleixner
@ 2009-06-03 22:37                           ` Bill Davidsen
  2009-06-03 23:29                             ` Frans Pop
  0 siblings, 1 reply; 104+ messages in thread
From: Bill Davidsen @ 2009-06-03 22:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: George Dunlap, David Miller, jeremy, mingo, Dan Magenheimer, avi,
	xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe

Thomas Gleixner wrote:
> On Wed, 3 Jun 2009, Bill Davidsen wrote:
>   
>> Thomas Gleixner wrote:
>>     
>>> Aside of the paravirt, which seems to expand through arch/x86 like a
>>> hydra, the new patches sprinkle "if (xen_...)" all over the
>>> place. These extra xen dependencies are no improvement, they are a
>>> royal pain in the ... They are sticky once they got merged simply
>>> because the hypervisor relies on them and we need to provide
>>> compatibility for a long time.
>>>
>>>       
>> Wait, let's not classify something as "no improvement" when you mean "I don't
>> need it."
>>     
>
> It's not about "I don't need it.". It's about having Xen dependencies
> in the code all over the place which make maintenance harder. I have
> to balance the users benefit (xen dom0 support) vs. the impact on
> maintainability and the restrictions which are going to be set almost
> in stone by merging it.
>
>   
>> Let's stick to technical issues, and not deny that there are a number of users
>> who really will have expanded capability. The technical points are valid, but
>> as a former and probable future xen (CentOS) user, so are the benefits.
>>     
>
> Refusing random "if (xen...)" dependencies is a purely technical
> decision. I have said more than once that I'm not against merging dom0
> in general, I'm just frightened by the technical impact of a de facto
> ABI which we swallow with it.
>
>   
I was referring to your "no benefit" comment, I don't dispute the 
technical issues. I think the idea of moving the hypervisor into the 
kernel and letting xen folks do the external parts as they please.

> We have enough problems with real silicon and BIOS/ACPI already, why
> should we add artificial and _avoidable_ virtual silicon horror ?
>   

I guess my point wasn't clear, sorry, it's just that I felt as though 
the features lacking in KVM (support for old/small/BIOS-limited CPUs) 
might be hidden in the smoke due to the technical issues.

-- 
Bill Davidsen <davidsen@tmr.com>
  Even purely technical things can appear to be magic, if the documentation is
obscure enough. For example, PulseAudio is configured by dancing naked around a
fire at midnight, shaking a rattle with one hand and a LISP manual with the
other, while reciting the GNU manifesto in hexadecimal. The documentation fails
to note that you must circle the fire counter-clockwise in the southern
hemisphere.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-03 22:37                           ` Bill Davidsen
@ 2009-06-03 23:29                             ` Frans Pop
  2009-06-04 13:21                               ` George Dunlap
  2009-06-05  4:14                               ` Bill Davidsen
  0 siblings, 2 replies; 104+ messages in thread
From: Frans Pop @ 2009-06-03 23:29 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: tglx, george.dunlap, davem, jeremy, mingo, dan.magenheimer, avi,
	xen-devel, x86, linux-kernel, Keir.Fraser, torvalds, gregkh,
	kurt.hackel, Ian.Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, stephen.spector, jens.axboe

Bill Davidsen wrote:
> I was referring to your "no benefit" comment, I don't dispute the
> technical issues. I think the idea of moving the hypervisor into the
> kernel and letting xen folks do the external parts as they please.

Where does that come from? AFAICT Thomas never made a "no benefit" comment 
other than limited to the context of the technical implementation.
I've always understood his meaning in this thread to be: "the proposed 
patch set does not improve the technical standard of the linux kernel, 
but would instead lower it considerably".
Thomas has been extremely correct in this thread and IMO does not deserve 
this attack.

Let's look at his exact comments (emphasis mine).

! The kernel policy always was and still is to accept only those
! features which have a technical benefit **to the code base**.

and

! Aside of the paravirt, which seems to expand through arch/x86 like a
! hydra, the new patches sprinkle "if (xen_...)" all over the
! place. These extra xen dependencies are no improvement, they are a
! royal pain in the ...

Also clearly limited to technical implementation.

! I really have a hard time to see why dom0 support makes Linux more
! useful **to people who do not use it**. It does not improve the Linux
! experience **of Joe User** at all.

Or has Thomas made some "no benefit" comment I've missed?

Cheers,
FJP

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-03 23:29                             ` Frans Pop
@ 2009-06-04 13:21                               ` George Dunlap
  2009-06-04 15:10                                 ` Theodore Tso
  2009-06-04 15:31                                 ` Chris Friesen
  2009-06-05  4:14                               ` Bill Davidsen
  1 sibling, 2 replies; 104+ messages in thread
From: George Dunlap @ 2009-06-04 13:21 UTC (permalink / raw)
  To: Frans Pop
  Cc: Bill Davidsen, tglx, davem, jeremy, mingo, Dan Magenheimer, avi,
	xen-devel, x86, linux-kernel, Keir Fraser, torvalds, gregkh,
	kurt.hackel, Ian Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, Stephen Spector, jens.axboe

Frans Pop wrote:
> ! The kernel policy always was and still is to accept only those
> ! features which have a technical benefit **to the code base**.
>   
Yes, I think I understood him better after I responded to his e-mail 
(unfortunately).  When people say things like "dom0 adds all these hooks 
but doesn't add anything to Linux", they mean something like this 
(please correct me anyone, if I'm wrong).

Kernel developers want Linux, as a project, to have cool things in it.  
They want it to be cool.  Adding new features, new capabilities, new 
technical code, makes it cooler.  Sometimes adding new features to make 
it cooler has some cost in terms of adding things to other parts of the 
code, possibly making it a little less clean or a little more 
convoluted.  But if the coolness is cool enough, it's worth the cost.

The feeling is that adding a bunch of these dom0 hooks (especially of 
the type, "if(xen) { foo; }"), are a cost to Linux.  They make the code 
ugly.  They do allow a new kind of coolness, a (linux-dom0 + Xen) 
coolness.  But none of the coolness actually happens in Linux; it all 
happens in Xen.  So coolness may happen, and world happiness might 
increase marginally, but Linux itself doesn't seem any cooler, it just 
has the cost of all these ugly hooks.  Thus the "Linux is Xen's sex 
slave" analogy. :-)

If (hypothetically) we merged Xen into Linux, then (people are 
suggesting) the coolness of Xen would actually contribute to the 
coolness of Linux ("add technical benefit to the code base").  People 
would feel like working on the interface between linux-xen and the rest 
of linux would be making their own piece of software, Linux, work 
better, rather than feeling like they have to work with some foreign 
project that doesn't make their code any cooler.

Is that a pretty accurate representation of the "adding features which 
have a technical benefit to the code base" argument?

 -George

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-users] Re: Xen is a feature
  2009-06-02 17:46                         ` Linus Torvalds
  2009-06-02 18:02                           ` Linus Torvalds
@ 2009-06-04 14:02                           ` Thomas Goirand
  1 sibling, 0 replies; 104+ messages in thread
From: Thomas Goirand @ 2009-06-04 14:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: George Dunlap, jens.axboe, npiggin, Dan Magenheimer, xen-devel,
	wimcoekaerts, gregkh, ksrinivasan, linux-kernel, x86, jeremy,
	David Miller, Ian Pratt, Stephen Spector, avi, EAnderson,
	kurt.hackel, Thomas Gleixner, xen-users, mingo, Keir Fraser

Linus Torvalds wrote:
> Seriously.
> 
> If it was just the local APIC, fine. But it may be just the local APIC 
> code this time around, next time it will be something else. It's been TLB, 
> it's been entry_*.S, it's been all over. Some of them are performance 
> issues.
> 
> I dunno. I just do know that I pointed out the statistics for how 
> mindlessly incestuous the Xen patches have historically been to Jeremy. He 
> admitted it. I've not seen _anybody_ say that things will improve. 
> 
> Xen has been painful. If you give maintainers pain, don't expect them to 
> love you or respect you.
> 
> So I would really suggest that Xen people should look at _why_ they are 
> giving maintainers so much pain.
> 
> 		Linus

Seriously, reading this is discouraging. I had to stop myself from
criticizing this opinion too much here, but it's kind of hard to read
"mindless", "painful" and such considering the consequences of the
current state.

As time passes, it's becoming more and more unmaintainable to manage the
dom0 patch on one side, and the mainline kernel on the other, even from a
user/admin point of view. THIS is years of mindless and painful
administration/patching tasks. We've all been waiting too long already.
We need the Xen dom0 "feature" NOW! Not tomorrow, not in one week, not
in 10 years...

As a developer myself (not on the kernel though), I can perfectly
understand the standpoint about ugliness of the code. However, refusing
to merge gives bad headaches to hundreds of people trying to maintain
production systems and deal with the issues it creates.

I stand on Steven Rostedt's side (and many others too). Merging WILL
make it possible to have Xen going the way you wish. Otherwise, it's
again a cathedral type of development. Keir Fraser and others seem to
be willing to do the changes in the API if needed. It's just not right
to say they don't want to. And if there is such need for ABI/API
compatibility, why not just add a config option "compatibility to old
style Xen (dirty ugly slow feature)" if there are some issues?

Now, about merging the Xen hypervisor, that's another discussion that
can happen later on, IMHO. What's URGENT (I insist here) is dom0 support
(including with 64 bits).

Thomas

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: [Xen-users] Re: Merge Xen (the hypervisor) into Linux
       [not found]                                       ` <4A27CF94.1050903@gmx.de>
@ 2009-06-04 14:03                                         ` Steven Rostedt
  0 siblings, 0 replies; 104+ messages in thread
From: Steven Rostedt @ 2009-06-04 14:03 UTC (permalink / raw)
  To: Florian Manschwetus
  Cc: Theodore Tso, George Dunlap, Dan Magenheimer, Ingo Molnar,
	David Miller, jeremy, avi, xen-devel, x86, linux-kernel,
	Keir Fraser, torvalds, gregkh, kurt.hackel, Ian Pratt, xen-users,
	ksrinivasan, EAnderson, wimcoekaerts, Stephen Spector,
	jens.axboe, npiggin


On Thu, 4 Jun 2009, Florian Manschwetus wrote:
> 
> On the other side why use linux as dom0?
> just take a second to mind about OpenSolaris as dom0 (release state would
> close up soon to current state of xen), it gifts you zfs.

Let's turn this around a bit. Can we get Xen to keep a rock solid stable 
ABI?  Where the interface to Xen from Dom0 is never expected to break? All 
old Dom0's will always work on Xen?

Document this interface, and that it will always work. If it is a clean 
interface, then perhaps Linux could work with it. But it would need to be 
non intrusive. I'll have to take some time to look at the Dom0 patches to 
see what exactly it requires. Perhaps there's better ways to accomplish 
what is being asked for.

-- Steve


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-04 13:21                               ` George Dunlap
@ 2009-06-04 15:10                                 ` Theodore Tso
  2009-06-04 15:31                                 ` Chris Friesen
  1 sibling, 0 replies; 104+ messages in thread
From: Theodore Tso @ 2009-06-04 15:10 UTC (permalink / raw)
  To: George Dunlap
  Cc: Frans Pop, Bill Davidsen, tglx, davem, jeremy, mingo,
	Dan Magenheimer, avi, xen-devel, x86, linux-kernel, Keir Fraser,
	torvalds, gregkh, kurt.hackel, Ian Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, Stephen Spector, jens.axboe

On Thu, Jun 04, 2009 at 02:21:08PM +0100, George Dunlap wrote:
> If (hypothetically) we merged Xen into Linux, then (people are  
> suggesting) the coolness of Xen would actually contribute to the  
> coolness of Linux ("add technical benefit to the code base").  People  
> would feel like working on the interface between linux-xen and the rest  
> of linux would be making their own piece of software, Linux, work  
> better, rather than feeling like they have to work with some foreign  
> project that doesn't make their code any cooler.
>
> Is that a pretty accurate representation of the "adding features which  
> have a technical benefit to the code base" argument?

The other argument is that by merging Xen into Linux, it becomes
easier for kernel developers to understand *why* "if (xen) ..." shows
up in random places in core kernel code, and it becomes easier to
clean that up.

If Xen isn't merged, it becomes much harder to believe that those
cleanups will occur, since the Xen developers might stonewall such
cleanups for reasons that Linux developers might not consider valid.
So the threshold for accepting patches might be much higher, since the
subsystem maintainers involved might decide to NAK patches as
uglifying the Linux kernel codebase with no real benefit to the Linux
codebase --- and not much hope that said ugly hacks will get cleaned
up later.  Historically, once code with warts gets merged, we lose all
leverage towards fixing those warts afterwards; this is true in
general, and not a statement of a lack of trust of Xen developers
specifically.

This doesn't make merging Xen *impossible*, but probably makes it
harder, since each of those objections will have to be cleared,
possibly by refactoring the code so that it adds benefits not just for
Xen, but some other in-kernel user of that abstraction (i.e., like
KVM, lguest, etc.) or by cleaning up the code in general, in order to
clear NAK's by the relevant developers.  

If Xen is merged, then ultimately Linus gets to make the call about
whether something gets fixed, even at the cost of making a change to
the hypervisor/dom0 interface.  So this would likely decrease the
threshold of what has to be fixed before people are willing to ACK a
Xen merge, since there's better confidence that these warts will be
cleaned up.  An example of that might be XFS, which had all sorts of
Irix warts which have been gradually cleaned up over the years.  Of
course, there might still be some hideous abstraction violations that
would have to be cleaned up first; but that's up to the relevant
subsystem maintainers.

	       	   		  	     - Ted

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-04 13:21                               ` George Dunlap
  2009-06-04 15:10                                 ` Theodore Tso
@ 2009-06-04 15:31                                 ` Chris Friesen
  1 sibling, 0 replies; 104+ messages in thread
From: Chris Friesen @ 2009-06-04 15:31 UTC (permalink / raw)
  To: George Dunlap
  Cc: Frans Pop, Bill Davidsen, tglx, davem, jeremy, mingo,
	Dan Magenheimer, avi, xen-devel, x86, linux-kernel, Keir Fraser,
	torvalds, gregkh, kurt.hackel, Ian Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, Stephen Spector, jens.axboe

George Dunlap wrote:
> Frans Pop wrote:
> 
>>! The kernel policy always was and still is to accept only those
>>! features which have a technical benefit **to the code base**.

> If (hypothetically) we merged Xen into Linux, then (people are 
> suggesting) the coolness of Xen would actually contribute to the 
> coolness of Linux ("add technical benefit to the code base").  People 
> would feel like working on the interface between linux-xen and the rest 
> of linux would be making their own piece of software, Linux, work 
> better, rather than feeling like they have to work with some foreign 
> project that doesn't make their code any cooler.

I suspect that there is an element of this.

There is also the factor that if Xen was merged into linux, we would
then be able to work towards a sane(r) virtualization layer that would
be useful for KVM, Xen, and possibly others.  This provides a technical
benefit to the code base by introducing a more logical organization
rather than having ad-hoc changes sprinkled all over.
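
As a rough illustration of the difference between ad-hoc "if (xen)" checks
and a shared abstraction, here is a minimal user-space sketch of an ops
table in the spirit of the io_apic_ops idea from this series.  Every name
in it (demo_io_apic_ops, demo_select_platform, the counters) is
hypothetical and invented for the example; it is not the kernel's actual
interface, and the "Xen" backend only counts calls instead of issuing a
real hypercall.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ops table; names and signatures are illustrative only. */
struct demo_io_apic_ops {
	uint32_t (*read)(unsigned int apic, unsigned int reg);
	void (*write)(unsigned int apic, unsigned int reg, uint32_t value);
};

#define DEMO_NR_REGS 16

/* Stand-in for memory-mapped IO APIC registers on bare hardware. */
static uint32_t demo_native_regs[DEMO_NR_REGS];

/* Counts how often the pretend hypervisor backend was entered. */
static unsigned int demo_hypercalls;

static uint32_t demo_native_read(unsigned int apic, unsigned int reg)
{
	(void)apic;
	assert(reg < DEMO_NR_REGS);
	return demo_native_regs[reg];
}

static void demo_native_write(unsigned int apic, unsigned int reg,
			      uint32_t value)
{
	(void)apic;
	assert(reg < DEMO_NR_REGS);
	demo_native_regs[reg] = value;
}

/* Pretend backend: a real one would ask the hypervisor instead. */
static uint32_t demo_xen_read(unsigned int apic, unsigned int reg)
{
	(void)apic;
	(void)reg;
	demo_hypercalls++;
	return 0xffffffffu;
}

static void demo_xen_write(unsigned int apic, unsigned int reg,
			   uint32_t value)
{
	(void)apic;
	(void)reg;
	(void)value;
	demo_hypercalls++;
}

static const struct demo_io_apic_ops demo_native_ops = {
	.read = demo_native_read,
	.write = demo_native_write,
};

static const struct demo_io_apic_ops demo_xen_ops = {
	.read = demo_xen_read,
	.write = demo_xen_write,
};

/* The platform decision is made once, here, not at every call site. */
static const struct demo_io_apic_ops *demo_ops = &demo_native_ops;

static void demo_select_platform(int running_on_xen)
{
	demo_ops = running_on_xen ? &demo_xen_ops : &demo_native_ops;
}

/* Generic code calls these and never tests for the platform itself. */
static uint32_t demo_io_apic_read(unsigned int apic, unsigned int reg)
{
	return demo_ops->read(apic, reg);
}

static void demo_io_apic_write(unsigned int apic, unsigned int reg,
			       uint32_t value)
{
	demo_ops->write(apic, reg, value);
}
```

The point of the sketch is only that callers of demo_io_apic_read() and
demo_io_apic_write() contain no platform checks, and that a second backend
could fill in the same table without touching them.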

Chris


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-03  8:07                         ` Christian Tramnitz
@ 2009-06-04 18:53                           ` Linus Torvalds
  2009-06-05  0:09                             ` Samuel Thibault
  0 siblings, 1 reply; 104+ messages in thread
From: Linus Torvalds @ 2009-06-04 18:53 UTC (permalink / raw)
  To: Christian Tramnitz
  Cc: Ingo Molnar, Steven Rostedt, George Dunlap, David Miller, jeremy,
	Dan Magenheimer, avi, xen-devel, x86, linux-kernel, Keir Fraser,
	gregkh, kurt.hackel, Ian Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, Stephen Spector, jens.axboe



On Wed, 3 Jun 2009, Christian Tramnitz wrote:
>
> What a great idea, and while we're doing this let's also drop support
> for legacy stuff like PATA and i8042 in mainline. No one will need it
> anyway because their successors are on the market for years... let's
> just take it for granted that everyone is using SATA and USB nowadays!

Have you noticed how PATA and i8042 don't screw up anything else? 

You're totally missing the problem. If Xen was a single driver thing, we 
wouldn't have this discussion. But as is, Xen craps all over OTHER PEOPLES 
CODE. When those people then aren't interested in Xen, why is anybody 
surprised that people aren't excited?

			Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-04 18:53                           ` Linus Torvalds
@ 2009-06-05  0:09                             ` Samuel Thibault
  2009-06-05  0:18                               ` David Miller
  2009-06-05  0:54                               ` Linus Torvalds
  0 siblings, 2 replies; 104+ messages in thread
From: Samuel Thibault @ 2009-06-05  0:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christian Tramnitz, Ingo Molnar, Steven Rostedt, George Dunlap,
	David Miller, jeremy, Dan Magenheimer, avi, xen-devel, x86,
	linux-kernel, Keir Fraser, gregkh, kurt.hackel, Ian Pratt,
	xen-users, ksrinivasan, EAnderson, wimcoekaerts, Stephen Spector,
	jens.axboe

Linus Torvalds, le Thu 04 Jun 2009 11:53:45 -0700, a écrit :
> On Wed, 3 Jun 2009, Christian Tramnitz wrote:
> >
> > What a great idea, and while we're doing this let's also drop support
> > for legacy stuff like PATA and i8042 in mainline. No one will need it
> > anyway because their successors are on the market for years... let's
> > just take it for granted that everyone is using SATA and USB nowadays!
> 
> Have you noticed how PATA and i8042 don't screw up anything else? 

Right.  We should get rid of all the HIGHMEM kmap crap that cripples all
the code.

Samuel

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-05  0:09                             ` Samuel Thibault
@ 2009-06-05  0:18                               ` David Miller
  2009-06-05  0:54                               ` Linus Torvalds
  1 sibling, 0 replies; 104+ messages in thread
From: David Miller @ 2009-06-05  0:18 UTC (permalink / raw)
  To: samuel.thibault
  Cc: torvalds, christian, mingo, rostedt, george.dunlap, jeremy,
	dan.magenheimer, avi, xen-devel, x86, linux-kernel, Keir.Fraser,
	gregkh, kurt.hackel, Ian.Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, stephen.spector, jens.axboe

From: Samuel Thibault <samuel.thibault@ens-lyon.org>
Date: Fri, 5 Jun 2009 02:09:10 +0200

> Linus Torvalds, le Thu 04 Jun 2009 11:53:45 -0700, a écrit :
>> On Wed, 3 Jun 2009, Christian Tramnitz wrote:
>> >
>> > What a great idea, and while we're doing this let's also drop support
>> > for legacy stuff like PATA and i8042 in mainline. No one will need it
>> > anyway because their successors are on the market for years... let's
>> > just take it for granted that everyone is using SATA and USB nowadays!
>> 
>> Have you noticed how PATA and i8042 don't screw up anything else? 
> 
> Right.  We should get rid of all the HIGHMEM kmap crap that cripples all
> the code.

The kmap interfaces are pretty damn clean if you ask me.  Especially
compared to the abortion Xen plops into the x86 platform code.

So, keep searching for an argument where none exists.

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Merge Xen (the hypervisor) into Linux
  2009-06-05  0:09                             ` Samuel Thibault
  2009-06-05  0:18                               ` David Miller
@ 2009-06-05  0:54                               ` Linus Torvalds
  1 sibling, 0 replies; 104+ messages in thread
From: Linus Torvalds @ 2009-06-05  0:54 UTC (permalink / raw)
  To: Samuel Thibault
  Cc: Christian Tramnitz, Ingo Molnar, Steven Rostedt, George Dunlap,
	David Miller, jeremy, Dan Magenheimer, avi, xen-devel, x86,
	linux-kernel, Keir Fraser, gregkh, kurt.hackel, Ian Pratt,
	xen-users, ksrinivasan, EAnderson, wimcoekaerts, Stephen Spector,
	jens.axboe



On Fri, 5 Jun 2009, Samuel Thibault wrote:
> 
> Right.  We should get rid of all the HIGHMEM kmap crap that cripples all
> the code.

Now you're starting to understand.

However, the difference between Xen and highmem (which I do hate, and 
which took a long time and lots of effort to get done) is how many people 
care. And in particular how many kernel developers do.

Until you can face these obvious facts, please just shut up. Ok?

			Linus

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-03 23:29                             ` Frans Pop
  2009-06-04 13:21                               ` George Dunlap
@ 2009-06-05  4:14                               ` Bill Davidsen
  2009-06-05  4:55                                 ` Chris Friesen
  1 sibling, 1 reply; 104+ messages in thread
From: Bill Davidsen @ 2009-06-05  4:14 UTC (permalink / raw)
  To: Frans Pop
  Cc: tglx, george.dunlap, davem, jeremy, mingo, dan.magenheimer, avi,
	xen-devel, x86, linux-kernel, Keir.Fraser, torvalds, gregkh,
	kurt.hackel, Ian.Pratt, xen-users, ksrinivasan, EAnderson,
	wimcoekaerts, stephen.spector, jens.axboe

Frans Pop wrote:
> Bill Davidsen wrote:
>   
>> I was referring to your "no benefit" comment, I don't dispute the
>> technical issues. I think the idea of moving the hypervisor into the
>> kernel and letting xen folks do the external parts as they please.
>>     
>
> Where does that come from? AFAICT Thomas never made a "no benefit" comment 
> other than limited to the context of the technical implementation.
>
>   
Where it comes from is his very recent statement, which contains those 
very words. You may interpret what he said in any way you choose, but 
denying that he said it shows that you didn't follow the link back. I 
never denied the ugliness of the code, nor does the author, but it adds 
a great deal of value for many people, and that's the point I was making.

-- 
Bill Davidsen <davidsen@tmr.com>
  Even purely technical things can appear to be magic, if the documentation is
obscure enough. For example, PulseAudio is configured by dancing naked around a
fire at midnight, shaking a rattle with one hand and a LISP manual with the
other, while reciting the GNU manifesto in hexadecimal. The documentation fails
to note that you must circle the fire counter-clockwise in the southern
hemisphere.



^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-05  4:14                               ` Bill Davidsen
@ 2009-06-05  4:55                                 ` Chris Friesen
  0 siblings, 0 replies; 104+ messages in thread
From: Chris Friesen @ 2009-06-05  4:55 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: Frans Pop, tglx, george.dunlap, davem, jeremy, mingo,
	dan.magenheimer, avi, xen-devel, x86, linux-kernel, Keir.Fraser,
	torvalds, gregkh, kurt.hackel, Ian.Pratt, xen-users, ksrinivasan,
	EAnderson, wimcoekaerts, stephen.spector, jens.axboe

Bill Davidsen wrote:

> Where it comes from is his very recent statement, which contains those 
> very words. You may interpret what he said in any way you choose, but 
> denying that he said it shows that you didn't follow the link back. I 
> never denied the ugliness of the code, nor does the author, but it adds 
> a great deal of value for many people, and that's the point I was making.

Lots of code could be said to add a great deal of value for many people
(semi-closed video card drivers, ndiswrapper, etc.), but it's never
going to be accepted into the kernel.

The maintainers get to decide whether the perceived benefit outweighs
the perceived cost.  So far, they've decided that Xen isn't worth it.

The most likely way to get Xen merged is to lower the cost (reduce the
churn and ugliness), increase the benefit (improve the virtualization
layer, thus cleaning up other code as well), or both.

Chris

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-02 18:59                             ` Avi Kivity
@ 2009-06-07  9:13                               ` Ingo Molnar
  2009-06-07 10:01                                 ` Avi Kivity
  0 siblings, 1 reply; 104+ messages in thread
From: Ingo Molnar @ 2009-06-07  9:13 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Linus Torvalds, George Dunlap, Thomas Gleixner, David Miller,
	jeremy, Dan Magenheimer, xen-devel, x86, linux-kernel,
	Keir Fraser, gregkh, kurt.hackel, Ian Pratt, xen-users,
	ksrinivasan, EAnderson, wimcoekaerts, Stephen Spector,
	jens.axboe, npiggin


* Avi Kivity <avi@redhat.com> wrote:

> Linus Torvalds wrote:
>> The point? Xen really is horribly badly separated out. It gets way more 
>> incestuous with other systems than it should. It's entirely possible 
>> that this is very fundamental to both paravirtualization and to 
>> hypervisor behavior, but it doesn't matter - it just means that I can 
>> well see that Xen is a f*cking pain to merge.
>>
>> So please, Xen people, look at your track record, and look at the 
>> issues from the standpoint of somebody merging your code, rather 
>> than just from the standpoint of somebody who whines "I want my 
>> code to be merged".
>>
>> IOW, if you have trouble getting your code merged, ask yourself 
>> what _you_ are doing wrong.
>
> There is in fact a way to get dom0 support with nearly no changes 
> to Linux, but it involves massive changes to Xen itself and 
> requires hardware support: run dom0 as a fully virtualized guest, 
> and assign it all the resources dom0 can access.  It's probably a 
> massive effort though.
>
> I've considered it for kvm when faced with the "I want a thin 
> hypervisor" question: compile the hypervisor kernel with PCI 
> support but nothing else (no CONFIG_BLOCK or CONFIG_NET, no device 
> drivers), load userspace from initramfs, and assign host devices 
> to one or more privileged guests.  You could probably run the host 
> with a heavily stripped configuration, and enjoy the slimness 
> while every interrupt invokes the scheduler, a context switch, and 
> maybe an IPI for good measure.

This would be an acceptable model i suspect, if someone wants a 
'slim hypervisor'.

We can context switch way faster than we handle IRQs. Plus in a 
slimmed-down config we could intentionally slim down aspects of the 
scheduler as well, if it ever became a measurable performance issue. 
The hypervisor would run a minimal user-space and most of the 
context-switching overhead relates to having a full-fledged 
user-space with rich requirements. So there's no real conceptual 
friction between a 'lean and mean' hypervisor and a full-featured 
native kernel.

This would certainly be an utterly clean design, and it would be 
interesting to see a Linux/Xen + Linux/Dom0 combo engineered in such 
a way - if people really find this layered kernel approach 
interesting. So the door is not closed to dom0 at all - but it has 
to be designed cleanly without messing up the native kernel.

	Ingo

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-07  9:13                               ` Ingo Molnar
@ 2009-06-07 10:01                                 ` Avi Kivity
  2009-06-07 10:35                                   ` Ingo Molnar
  0 siblings, 1 reply; 104+ messages in thread
From: Avi Kivity @ 2009-06-07 10:01 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, George Dunlap, Thomas Gleixner, David Miller,
	jeremy, Dan Magenheimer, xen-devel, x86, linux-kernel,
	Keir Fraser, gregkh, kurt.hackel, Ian Pratt, xen-users,
	ksrinivasan, EAnderson, wimcoekaerts, Stephen Spector,
	jens.axboe, npiggin

Ingo Molnar wrote:
>> There is in fact a way to get dom0 support with nearly no changes 
>> to Linux, but it involves massive changes to Xen itself and 
>> requires hardware support: run dom0 as a fully virtualized guest, 
>> and assign it all the resources dom0 can access.  It's probably a 
>> massive effort though.
>>
>> I've considered it for kvm when faced with the "I want a thin 
>> hypervisor" question: compile the hypervisor kernel with PCI 
>> support but nothing else (no CONFIG_BLOCK or CONFIG_NET, no device 
>> drivers), load userspace from initramfs, and assign host devices 
>> to one or more privileged guests.  You could probably run the host 
>> with a heavily stripped configuration, and enjoy the slimness 
>> while every interrupt invokes the scheduler, a context switch, and 
>> maybe an IPI for good measure.
>>     
>
> This would be an acceptable model i suspect, if someone wants a 
> 'slim hypervisor'.
>
> We can context switch way faster than we handle IRQs. Plus in a 
> slimmed-down config we could intentionally slim down aspects of the 
> scheduler as well, if it ever became a measurable performance issue. 
> The hypervisor would run a minimal user-space and most of the 
> context-switching overhead relates to having a full-fledged 
> user-space with rich requirements. So there's no real conceptual 
> friction between a 'lean and mean' hypervisor and a full-featured 
> native kernel.
>   

The context switch would be taken by the Xen scheduler, not the Linux 
scheduler.  It's how interrupts work under Xen: an interrupt is taken, 
Xen schedules the domain that owns the interrupts (dom0 usually), which 
then handles the interrupt.  The Linux scheduler would only be involved 
if you thread your interrupt handlers.

This context switch is necessary regardless of how dom0 is integrated 
into Linux; it's simply a side effect of implementing device drivers 
outside the kernel (in this context, the kernel is Xen, and dom0 is just 
another userspace, albeit with elevated privileges).  The Linux 
equivalent to dom0 is a process that uses uio.
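
The interrupt path described above can be modelled with a small user-space
toy, under loud assumptions: the structures and names (toy_hypervisor,
toy_take_interrupt, and so on) are invented for illustration and do not
correspond to Xen's real data structures or scheduler.  The model only
captures the ordering: the hypervisor takes the interrupt, marks it
pending for the owning domain, switches to that domain if it is not
already running, and the domain then handles it.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_IRQS    8
#define TOY_NR_DOMAINS 4

/* A domain only tracks which interrupts are pending and how many it handled. */
struct toy_domain {
	bool pending[TOY_NR_IRQS];
	unsigned int handled;
};

/* Hypothetical hypervisor state; not modelled on Xen's real structures. */
struct toy_hypervisor {
	struct toy_domain domains[TOY_NR_DOMAINS];
	unsigned int irq_owner[TOY_NR_IRQS];	/* irq -> owning domain */
	unsigned int current_domain;
	unsigned int context_switches;
};

/* The scheduled domain drains and handles its pending interrupts. */
static void toy_domain_run(struct toy_domain *dom)
{
	for (unsigned int irq = 0; irq < TOY_NR_IRQS; irq++) {
		if (dom->pending[irq]) {
			dom->pending[irq] = false;
			dom->handled++;
		}
	}
}

/* The hypervisor takes the interrupt first, then schedules the owner. */
static void toy_take_interrupt(struct toy_hypervisor *hv, unsigned int irq)
{
	unsigned int owner;

	assert(irq < TOY_NR_IRQS);
	owner = hv->irq_owner[irq];
	assert(owner < TOY_NR_DOMAINS);

	hv->domains[owner].pending[irq] = true;

	/* A switch is only needed when another domain is currently running. */
	if (hv->current_domain != owner) {
		hv->current_domain = owner;
		hv->context_switches++;
	}

	toy_domain_run(&hv->domains[owner]);
}
```

In this toy, an interrupt that arrives while its owner (domain 0 by
default, standing in for dom0) is not running costs one switch; a second
interrupt arriving while the owner is still current costs none.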

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-07 10:01                                 ` Avi Kivity
@ 2009-06-07 10:35                                   ` Ingo Molnar
  2009-06-07 12:46                                     ` Avi Kivity
  0 siblings, 1 reply; 104+ messages in thread
From: Ingo Molnar @ 2009-06-07 10:35 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Linus Torvalds, George Dunlap, Thomas Gleixner, David Miller,
	jeremy, Dan Magenheimer, xen-devel, x86, linux-kernel,
	Keir Fraser, gregkh, kurt.hackel, Ian Pratt, xen-users,
	ksrinivasan, EAnderson, wimcoekaerts, Stephen Spector,
	jens.axboe, npiggin


* Avi Kivity <avi@redhat.com> wrote:

> Ingo Molnar wrote:
>>> There is in fact a way to get dom0 support with nearly no changes to 
>>> Linux, but it involves massive changes to Xen itself and requires 
>>> hardware support: run dom0 as a fully virtualized guest, and assign 
>>> it all the resources dom0 can access.  It's probably a massive effort 
>>> though.
>>>
>>> I've considered it for kvm when faced with the "I want a thin  
>>> hypervisor" question: compile the hypervisor kernel with PCI support 
>>> but nothing else (no CONFIG_BLOCK or CONFIG_NET, no device drivers), 
>>> load userspace from initramfs, and assign host devices to one or more 
>>> privileged guests.  You could probably run the host with a heavily 
>>> stripped configuration, and enjoy the slimness while every interrupt 
>>> invokes the scheduler, a context switch, and maybe an IPI for good 
>>> measure.
>>>     
>>
>> This would be an acceptable model i suspect, if someone wants a 'slim 
>> hypervisor'.
>>
>> We can context switch way faster than we handle IRQs. Plus in a  
>> slimmed-down config we could intentionally slim down aspects of the  
>> scheduler as well, if it ever became a measurable performance issue.  
>> The hypervisor would run a minimal user-space and most of the  
>> context-switching overhead relates to having a full-fledged user-space 
>> with rich requirements. So there's no real conceptual friction between 
>> a 'lean and mean' hypervisor and a full-featured native kernel.
>>   
>
> The context switch would be taken by the Xen scheduler, not the Linux  
> scheduler. [...]

The 'slim hypervisor' model i was suggesting was a slimmed down 
_Linux_ kernel.

	Ingo

^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-07 10:35                                   ` Ingo Molnar
@ 2009-06-07 12:46                                     ` Avi Kivity
  2009-06-07 13:02                                       ` Jaswinder Singh Rajput
  0 siblings, 1 reply; 104+ messages in thread
From: Avi Kivity @ 2009-06-07 12:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, George Dunlap, Thomas Gleixner, David Miller,
	jeremy, Dan Magenheimer, xen-devel, x86, linux-kernel,
	Keir Fraser, gregkh, kurt.hackel, Ian Pratt, xen-users,
	ksrinivasan, EAnderson, wimcoekaerts, Stephen Spector,
	jens.axboe, npiggin

Ingo Molnar wrote:
> * Avi Kivity <avi@redhat.com> wrote:
>
>   
>> Ingo Molnar wrote:
>>     
>>>> There is in fact a way to get dom0 support with nearly no changes to 
>>>> Linux, but it involves massive changes to Xen itself and requires 
>>>> hardware support: run dom0 as a fully virtualized guest, and assign 
>>>> it all the resources dom0 can access.  It's probably a massive effort 
>>>> though.
>>>>
>>>> I've considered it for kvm when faced with the "I want a thin  
>>>> hypervisor" question: compile the hypervisor kernel with PCI support 
>>>> but nothing else (no CONFIG_BLOCK or CONFIG_NET, no device drivers), 
>>>> load userspace from initramfs, and assign host devices to one or more 
>>>> privileged guests.  You could probably run the host with a heavily 
>>>> stripped configuration, and enjoy the slimness while every interrupt 
>>>> invokes the scheduler, a context switch, and maybe an IPI for good 
>>>> measure.
>>>>     
>>>>         
>>> This would be an acceptable model i suspect, if someone wants a 'slim 
>>> hypervisor'.
>>>
>>> We can context switch way faster than we handle IRQs. Plus in a  
>>> slimmed-down config we could intentionally slim down aspects of the  
>>> scheduler as well, if it ever became a measurable performance issue.  
>>> The hypervisor would run a minimal user-space and most of the  
>>> context-switching overhead relates to having a full-fledged user-space 
>>> with rich requirements. So there's no real conceptual friction between 
>>> a 'lean and mean' hypervisor and a full-featured native kernel.
>>>   
>>>       
>> The context switch would be taken by the Xen scheduler, not the Linux  
>> scheduler. [...]
>>     
>
> The 'slim hypervisor' model i was suggesting was a slimmed down 
> _Linux_ kernel.
>   

Yeah, I lost the context.  I should reduce my own context switching.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 104+ messages in thread

* Re: Xen is a feature
  2009-06-07 12:46                                     ` Avi Kivity
@ 2009-06-07 13:02                                       ` Jaswinder Singh Rajput
  0 siblings, 0 replies; 104+ messages in thread
From: Jaswinder Singh Rajput @ 2009-06-07 13:02 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Ingo Molnar, Linus Torvalds, George Dunlap, Thomas Gleixner,
	David Miller, jeremy, Dan Magenheimer, xen-devel, x86,
	linux-kernel, Keir Fraser, gregkh, kurt.hackel, Ian Pratt,
	xen-users, ksrinivasan, EAnderson, wimcoekaerts, Stephen Spector,
	jens.axboe, npiggin

On Sun, 2009-06-07 at 15:46 +0300, Avi Kivity wrote:
> Ingo Molnar wrote:
> >
> > The 'slim hypervisor' model i was suggesting was a slimmed down 
> > _Linux_ kernel.
> >   
> 
> Yeah, I lost the context.  I should reduce my own context switching.
> 

It would be better to monitor the switching, entry/exit, and other
useful parameters (both counts and frequencies) through debugfs, and
use that data to improve performance.

Thanks,
--
JSR



end of thread, other threads:[~2009-06-07 13:03 UTC | newest]

Thread overview: 104+ messages
2009-05-12 23:25 [GIT PULL] Xen APIC hooks (with io_apic_ops) Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 01/17] xen/dom0: handle acpi lapic parsing in Xen dom0 Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 02/17] x86: add io_apic_ops to allow interception Jeremy Fitzhardinge
2009-05-25  3:54   ` Ingo Molnar
2009-05-27  7:17     ` Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 03/17] xen: implement io_apic_ops Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 04/17] xen: create dummy ioapic mapping Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 05/17] xen: implement pirq type event channels Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 06/17] x86/io_apic: add get_nr_irqs_gsi() Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 07/17] xen/apic: identity map gsi->irqs Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 08/17] xen: direct irq registration to pirq event channels Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 09/17] xen: bind pirq to vector and event channel Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 10/17] xen: pre-initialize legacy irqs early Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 11/17] xen: don't setup acpi interrupt unless there is one Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 12/17] xen: use acpi_get_override_irq() to get triggering for legacy irqs Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 13/17] xen: initialize irq 0 too Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 14/17] xen: dynamically allocate irq & event structures Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 15/17] xen: set pirq name to something useful Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 16/17] xen: fix legacy irq setup, make ioapic-less machines work Jeremy Fitzhardinge
2009-05-12 23:25 ` [PATCH 17/17] xen: disable MSI Jeremy Fitzhardinge
2009-05-19 12:35 ` [GIT PULL] Xen APIC hooks (with io_apic_ops) Ingo Molnar
2009-05-20 17:57   ` Jeremy Fitzhardinge
2009-05-25  4:10     ` Ingo Molnar
2009-05-26 12:46       ` [Xen-devel] " George Dunlap
2009-05-26 18:26         ` Avi Kivity
2009-05-26 19:18           ` Dan Magenheimer
2009-05-26 19:41             ` Avi Kivity
2009-05-28  0:13             ` Ingo Molnar
2009-05-28  0:49               ` Jeremy Fitzhardinge
2009-05-28  3:47               ` Dan Magenheimer
2009-05-28 14:26               ` George Dunlap
2009-05-29  0:45               ` Xen is a feature Jeremy Fitzhardinge
2009-05-29  1:27                 ` Greg KH
2009-05-29  4:05                 ` David Miller
2009-05-29  6:37                   ` Jaswinder Singh Rajput
2009-05-29  6:51                     ` David Miller
2009-05-29 12:01                   ` George Dunlap
2009-05-29 14:14                     ` Pasi Kärkkäinen
2009-05-29 21:29                       ` David Miller
     [not found]                     ` <87tz33ep1b.fsf@basil.nowhere.org>
2009-05-29 21:31                       ` [Xen-devel] " Jeremy Fitzhardinge
2009-05-29 23:09                       ` Nakajima, Jun
2009-05-29 23:26                         ` Jeremy Fitzhardinge
2009-06-02 15:23                     ` Thomas Gleixner
2009-06-02 16:41                       ` George Dunlap
2009-06-02 17:28                         ` Chris Friesen
2009-06-02 17:46                         ` Linus Torvalds
2009-06-02 18:02                           ` Linus Torvalds
2009-06-02 18:59                             ` Avi Kivity
2009-06-07  9:13                               ` Ingo Molnar
2009-06-07 10:01                                 ` Avi Kivity
2009-06-07 10:35                                   ` Ingo Molnar
2009-06-07 12:46                                     ` Avi Kivity
2009-06-07 13:02                                       ` Jaswinder Singh Rajput
2009-06-04 14:02                           ` [Xen-users] " Thomas Goirand
2009-06-02 18:59                         ` Thomas Gleixner
2009-06-03 19:49                       ` Bill Davidsen
2009-06-03 20:20                         ` Thomas Gleixner
2009-06-03 22:37                           ` Bill Davidsen
2009-06-03 23:29                             ` Frans Pop
2009-06-04 13:21                               ` George Dunlap
2009-06-04 15:10                                 ` Theodore Tso
2009-06-04 15:31                                 ` Chris Friesen
2009-06-05  4:14                               ` Bill Davidsen
2009-06-05  4:55                                 ` Chris Friesen
2009-06-02 22:40                     ` Steven Rostedt
2009-06-02 23:28                       ` Merge Xen (the hypervisor) into Linux Ingo Molnar
2009-06-03  0:00                         ` Dan Magenheimer
2009-06-03  0:32                           ` Thomas Gleixner
2009-06-03  2:43                           ` Theodore Tso
2009-06-03  3:42                             ` Steven Rostedt
2009-06-03  4:49                               ` Dan Magenheimer
2009-06-03  4:58                                 ` David Miller
2009-06-03  5:07                                   ` Steven Rostedt
2009-06-03  5:22                                 ` Steven Rostedt
2009-06-03 12:03                                   ` George Dunlap
2009-06-03 19:05                                     ` Theodore Tso
     [not found]                                       ` <4A27CF94.1050903@gmx.de>
2009-06-04 14:03                                         ` [Xen-users] " Steven Rostedt
2009-06-03  7:28                             ` Gerd Hoffmann
2009-06-03  8:47                               ` Alan Cox
2009-06-03  9:09                                 ` Gerd Hoffmann
2009-06-03  9:20                                   ` Keir Fraser
2009-06-03 11:15                                   ` Theodore Tso
2009-06-03 11:39                                     ` Keir Fraser
2009-06-03 11:41                                     ` Gerd Hoffmann
2009-06-03  1:00                         ` Joel Becker
2009-06-03  2:00                           ` david
2009-06-03  7:59                           ` Alan Cox
2009-06-03  8:07                         ` Christian Tramnitz
2009-06-04 18:53                           ` Linus Torvalds
2009-06-05  0:09                             ` Samuel Thibault
2009-06-05  0:18                               ` David Miller
2009-06-05  0:54                               ` Linus Torvalds
2009-06-03 17:31                         ` Chris Friesen
2009-06-03 17:36                           ` Alan Cox
2009-06-02 23:41                       ` Xen is a feature Thomas Gleixner
2009-05-30  2:19                 ` [Xen-devel] " Andy Burns
2009-05-26 21:19         ` [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops) Gerd Hoffmann
2009-05-27 10:14           ` George Dunlap
2009-05-24 20:10   ` Avi Kivity
2009-05-25  3:51     ` Ingo Molnar
2009-05-25  4:55       ` Avi Kivity
2009-05-25  5:06         ` Ingo Molnar
2009-05-25  5:12           ` Avi Kivity
2009-05-25  5:19             ` Ingo Molnar
