* [Qemu-devel] [PATCH 00/40] RFC: Xenner
@ 2010-11-01 15:01 Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 01/40] elf: Move translate_fn to helper struct Alexander Graf
                   ` (39 more replies)
  0 siblings, 40 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Some of you might remember Gerd's xenner project. The basic motivation is to
run Xen PV guests in KVM with the normal KVM architecture.

In order to achieve this, Xenner consists of two pieces:

  1) Xenner Qemu pieces
  2) Xenner guest kernel

Part 1 is partially in qemu already. The xen support framework that Gerd pushed
a while back can be used just as well for xenner. Other parts, such as a special
PV device to communicate with xenner, mechanisms to instantiate a VM, and
replacements for xen infrastructure, are provided in the patches here.

Part 2 is a completely self-contained piece of code. The xenner guest kernel
runs in the VM's CPL0 context. It translates guest hypercalls to hardware calls
that KVM implements, like CR3 modifications or LAPIC accesses.

This patch set tries to revive Gerd's code by integrating as much as possible
into the qemu code base. My ultimate goal is to isolate the qemu xenner code
well enough to be able to run an i386 xen pv guest with tcg on powerpc.

I'm sending this set out in the hope of receiving feedback. Do you think this is
a good idea? Can you spot any glitches in the code that I overlooked? See
the list below for things I already know to be broken.

Missing pieces:

  - VMstate
  - Full qdev check
  - endianness check
  - build qemu w/o xen headers
  - find a replacement for qemu_map_foreign_batch


Usage:

  (pv-grub)

  qemu-system-x86_64 -M xenpv -kernel /usr/lib/xen/boot/pv-grub-x86_64 \
    -drive file=/images/xen.raw,if=xen -nographic -enable-kvm

  (direct kernel boot)

  qemu-system-x86_64 -M xenpv -kernel /boot/vmlinux-xen -initrd \
    /boot/initrd-xen -append xencons=tty -drive file=/images/xen.raw,if=xen \
    -nographic -enable-kvm

  (with graphics)

  qemu-system-x86_64 -M xenpv -kernel /boot/vmlinux-xen -initrd \
    /boot/initrd-xen -drive file=/images/xen.raw,if=xen -vga xenfb -vnc :0 \
    -enable-kvm


I'm very much looking forward to constructive feedback!

Alex


Alexander Graf (39):
  elf: Move translate_fn to helper struct
  elf: Add notes implementation
  elf: add header notification
  elf: add section analyzer
  xen-disk: disable aio
  xenner: kernel: 32 bit files
  xenner: kernel: 64-bit files
  xenner: kernel: Global data
  xenner: kernel: Hypercall handler (i386)
  xenner: kernel: Hypercall handler (x86_64)
  xenner: kernel: Hypercall handler (generic)
  xenner: kernel: Headers
  xenner: kernel: Instruction emulator
  xenner: kernel: lapic code
  xenner: kernel: Main (i386)
  xenner: kernel: Main (x86_64)
  xenner: kernel: Main
  xenner: kernel: Makefile
  xenner: kernel: mmu support for 32-bit PAE
  xenner: kernel: mmu support for 32-bit normal
  xenner: kernel: mmu support for 64-bit
  xenner: kernel: generic MM functionality
  xenner: kernel: printk
  xenner: kernel: KVM PV code
  xenner: kernel: xen-names
  xenner: add xc_dom.h
  xenner: libxc emu: evtchn
  xenner: libxc emu: grant tables
  xenner: libxc emu: memory mapping
  xenner: libxc emu: xenstore
  xenner: emudev
  xenner: core
  xenner: PV machine
  xenner: Domain Builder
  xen: only create dummy env when necessary
  Add xenner binaries
  xenner: integrate into build system
  xenner: integrate into xen pv machine
  xen: add sysrq support

Gerd Hoffmann (1):
  qdev-ify: xen backends

 Makefile                        |    6 +
 Makefile.target                 |   16 +-
 configure                       |   21 +-
 hmp-commands.hx                 |   24 +
 hw/an5206.c                     |    2 +-
 hw/arm_boot.c                   |    2 +-
 hw/armv7m.c                     |    2 +-
 hw/cris-boot.c                  |    4 +-
 hw/dummy_m68k.c                 |    2 +-
 hw/elf_ops.h                    |  105 ++++-
 hw/loader.c                     |   50 ++-
 hw/loader.h                     |   24 +-
 hw/mcf5208.c                    |    2 +-
 hw/mips_fulong2e.c              |    4 +-
 hw/mips_malta.c                 |    4 +-
 hw/mips_mipssim.c               |    6 +-
 hw/mips_r4k.c                   |    6 +-
 hw/multiboot.c                  |    2 +-
 hw/petalogix_s3adsp1800_mmu.c   |    8 +-
 hw/ppc440_bamboo.c              |    2 +-
 hw/ppc_newworld.c               |    6 +-
 hw/ppc_oldworld.c               |    6 +-
 hw/ppce500_mpc8544ds.c          |    2 +-
 hw/sun4m.c                      |   10 +-
 hw/sun4u.c                      |    7 +-
 hw/virtex_ml507.c               |    2 +-
 hw/xc_dom.h                     |  273 +++++++++++
 hw/xen.h                        |    2 +
 hw/xen_backend.c                |  176 +++++---
 hw/xen_backend.h                |   10 +-
 hw/xen_console.c                |   10 +-
 hw/xen_disk.c                   |   12 +-
 hw/xen_domainbuild.c            |    8 +
 hw/xen_interfaces.c             |  108 ++++
 hw/xen_interfaces.h             |  111 +++++
 hw/xen_machine_pv.c             |   44 ++-
 hw/xen_nic.c                    |   10 +-
 hw/xen_redirect.h               |   56 +++
 hw/xenfb.c                      |   14 +-
 hw/xenner.h                     |   52 ++
 hw/xenner_core.c                |  224 +++++++++
 hw/xenner_dom_builder.c         |  406 +++++++++++++++
 hw/xenner_emudev.c              |  107 ++++
 hw/xenner_emudev.h              |  108 ++++
 hw/xenner_guest_store.c         |  494 +++++++++++++++++++
 hw/xenner_libxc_evtchn.c        |  467 ++++++++++++++++++
 hw/xenner_libxc_gnttab.c        |   91 ++++
 hw/xenner_libxc_if.c            |  124 +++++
 hw/xenner_libxenstore.c         |  709 +++++++++++++++++++++++++++
 hw/xenner_pv.c                  |  135 +++++
 monitor.c                       |    8 +
 pc-bios/xenner/Makefile         |   72 +++
 pc-bios/xenner/apicdef.h        |  173 +++++++
 pc-bios/xenner/cpufeature.h     |  129 +++++
 pc-bios/xenner/list.h           |  169 +++++++
 pc-bios/xenner/msr-index.h      |  278 +++++++++++
 pc-bios/xenner/printk.c         |  682 ++++++++++++++++++++++++++
 pc-bios/xenner/processor.h      |  326 ++++++++++++
 pc-bios/xenner/shared.h         |  188 +++++++
 pc-bios/xenner/xen-names.c      |  141 ++++++
 pc-bios/xenner/xen-names.h      |   68 +++
 pc-bios/xenner/xenner-data.c    |  142 ++++++
 pc-bios/xenner/xenner-emudev.h  |   57 +++
 pc-bios/xenner/xenner-hcall.c   | 1031 +++++++++++++++++++++++++++++++++++++++
 pc-bios/xenner/xenner-hcall32.c |  299 +++++++++++
 pc-bios/xenner/xenner-hcall64.c |  323 ++++++++++++
 pc-bios/xenner/xenner-instr.c   |  405 +++++++++++++++
 pc-bios/xenner/xenner-lapic.c   |  622 +++++++++++++++++++++++
 pc-bios/xenner/xenner-main.c    |  875 +++++++++++++++++++++++++++++++++
 pc-bios/xenner/xenner-main32.c  |  390 +++++++++++++++
 pc-bios/xenner/xenner-main64.c  |  412 ++++++++++++++++
 pc-bios/xenner/xenner-mm.c      |  105 ++++
 pc-bios/xenner/xenner-mm32.c    |  314 ++++++++++++
 pc-bios/xenner/xenner-mm64.c    |  369 ++++++++++++++
 pc-bios/xenner/xenner-mmpae.c   |  444 +++++++++++++++++
 pc-bios/xenner/xenner-pv.c      |  186 +++++++
 pc-bios/xenner/xenner.h         |  684 ++++++++++++++++++++++++++
 pc-bios/xenner/xenner32-pae.lds |   37 ++
 pc-bios/xenner/xenner32.S       |  441 +++++++++++++++++
 pc-bios/xenner/xenner32.h       |  191 ++++++++
 pc-bios/xenner/xenner32.lds     |   37 ++
 pc-bios/xenner/xenner64.S       |  400 +++++++++++++++
 pc-bios/xenner/xenner64.h       |  117 +++++
 pc-bios/xenner/xenner64.lds     |   38 ++
 pc-bios/xenner32-pae.elf        |  Bin 0 -> 310874 bytes
 pc-bios/xenner32.elf            |  Bin 0 -> 296741 bytes
 pc-bios/xenner64.elf            |  Bin 0 -> 342244 bytes
 87 files changed, 14089 insertions(+), 140 deletions(-)
 create mode 100644 hw/xc_dom.h
 create mode 100644 hw/xen_interfaces.c
 create mode 100644 hw/xen_interfaces.h
 create mode 100644 hw/xen_redirect.h
 create mode 100644 hw/xenner.h
 create mode 100644 hw/xenner_core.c
 create mode 100644 hw/xenner_dom_builder.c
 create mode 100644 hw/xenner_emudev.c
 create mode 100644 hw/xenner_emudev.h
 create mode 100644 hw/xenner_guest_store.c
 create mode 100644 hw/xenner_libxc_evtchn.c
 create mode 100644 hw/xenner_libxc_gnttab.c
 create mode 100644 hw/xenner_libxc_if.c
 create mode 100644 hw/xenner_libxenstore.c
 create mode 100644 hw/xenner_pv.c
 create mode 100644 pc-bios/xenner/Makefile
 create mode 100644 pc-bios/xenner/apicdef.h
 create mode 100644 pc-bios/xenner/cpufeature.h
 create mode 100644 pc-bios/xenner/list.h
 create mode 100644 pc-bios/xenner/msr-index.h
 create mode 100644 pc-bios/xenner/printk.c
 create mode 100644 pc-bios/xenner/processor.h
 create mode 100644 pc-bios/xenner/shared.h
 create mode 100644 pc-bios/xenner/xen-names.c
 create mode 100644 pc-bios/xenner/xen-names.h
 create mode 100644 pc-bios/xenner/xenner-data.c
 create mode 100644 pc-bios/xenner/xenner-emudev.h
 create mode 100644 pc-bios/xenner/xenner-hcall.c
 create mode 100644 pc-bios/xenner/xenner-hcall32.c
 create mode 100644 pc-bios/xenner/xenner-hcall64.c
 create mode 100644 pc-bios/xenner/xenner-instr.c
 create mode 100644 pc-bios/xenner/xenner-lapic.c
 create mode 100644 pc-bios/xenner/xenner-main.c
 create mode 100644 pc-bios/xenner/xenner-main32.c
 create mode 100644 pc-bios/xenner/xenner-main64.c
 create mode 100644 pc-bios/xenner/xenner-mm.c
 create mode 100644 pc-bios/xenner/xenner-mm32.c
 create mode 100644 pc-bios/xenner/xenner-mm64.c
 create mode 100644 pc-bios/xenner/xenner-mmpae.c
 create mode 100644 pc-bios/xenner/xenner-pv.c
 create mode 100644 pc-bios/xenner/xenner.h
 create mode 100644 pc-bios/xenner/xenner32-pae.lds
 create mode 100644 pc-bios/xenner/xenner32.S
 create mode 100644 pc-bios/xenner/xenner32.h
 create mode 100644 pc-bios/xenner/xenner32.lds
 create mode 100644 pc-bios/xenner/xenner64.S
 create mode 100644 pc-bios/xenner/xenner64.h
 create mode 100644 pc-bios/xenner/xenner64.lds
 create mode 100755 pc-bios/xenner32-pae.elf
 create mode 100755 pc-bios/xenner32.elf
 create mode 100755 pc-bios/xenner64.elf


* [Qemu-devel] [PATCH 01/40] elf: Move translate_fn to helper struct
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 02/40] elf: Add notes implementation Alexander Graf
                   ` (38 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

The elf loader takes a direct parameter for a callback that enables users
of load_elf to translate addresses on the fly.

While this is nice to have, it's really inflexible. We need to add some
more callbacks to elf, and listing every single one in the function call
just doesn't scale.

So let's move that one over to a struct that can easily be extended. This
way we can add more callbacks later on and don't have to worry about
compatibility.
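
As a rough illustration of the resulting calling convention, here is a
minimal standalone sketch. The ElfHandlers struct and the default handler
follow the definitions in this patch; translate_with_base(),
resolve_load_addr() and handlers_example() are hypothetical names made up
for the example and are not part of the qemu code base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the ElfHandlers struct this patch adds to hw/loader.h. */
typedef struct ElfHandlers {
    uint64_t (*translate_fn)(void *opaque, uint64_t address);
    void *translate_opaque;
} ElfHandlers;

/* Identity translation, as in elf_default_translate() from the patch. */
static uint64_t elf_default_translate(void *opaque, uint64_t addr)
{
    (void)opaque;
    return addr;
}

static ElfHandlers elf_default_handlers = {
    .translate_fn = elf_default_translate,
    .translate_opaque = NULL,
};

/* Hypothetical callback: relocate by the base address passed as opaque. */
static uint64_t translate_with_base(void *opaque, uint64_t addr)
{
    return *(uint64_t *)opaque + addr;
}

/* Hypothetical stand-in for the address lookup that load_elf() performs. */
static uint64_t resolve_load_addr(ElfHandlers *handlers, uint64_t paddr)
{
    if (!handlers) {
        /* NULL means "use the defaults", mirroring the check in load_elf(). */
        handlers = &elf_default_handlers;
    }
    return handlers->translate_fn(handlers->translate_opaque, paddr);
}

/* Caller pattern used in the converted boards: copy defaults, override. */
static void handlers_example(void)
{
    uint64_t base = 0x1000;
    ElfHandlers handlers = elf_default_handlers;

    handlers.translate_fn = translate_with_base;
    handlers.translate_opaque = &base;

    assert(resolve_load_addr(NULL, 0x42) == 0x42);
    assert(resolve_load_addr(&handlers, 0x42) == 0x1042);
}
```

Because callers start from a copy of elf_default_handlers, a callback added
to the struct later gets a sane default in every existing caller without
touching the call sites again.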

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/an5206.c                   |    2 +-
 hw/arm_boot.c                 |    2 +-
 hw/armv7m.c                   |    2 +-
 hw/cris-boot.c                |    4 +++-
 hw/dummy_m68k.c               |    2 +-
 hw/elf_ops.h                  |   10 +++-------
 hw/loader.c                   |   29 ++++++++++++++++++++++-------
 hw/loader.h                   |   16 ++++++++++++----
 hw/mcf5208.c                  |    2 +-
 hw/mips_fulong2e.c            |    4 +++-
 hw/mips_malta.c               |    4 +++-
 hw/mips_mipssim.c             |    6 ++++--
 hw/mips_r4k.c                 |    6 ++++--
 hw/multiboot.c                |    2 +-
 hw/petalogix_s3adsp1800_mmu.c |    8 +++++---
 hw/ppc440_bamboo.c            |    2 +-
 hw/ppc_newworld.c             |    6 ++++--
 hw/ppc_oldworld.c             |    6 ++++--
 hw/ppce500_mpc8544ds.c        |    2 +-
 hw/sun4m.c                    |   10 ++++++++--
 hw/sun4u.c                    |    7 +++++--
 hw/virtex_ml507.c             |    2 +-
 22 files changed, 89 insertions(+), 45 deletions(-)

diff --git a/hw/an5206.c b/hw/an5206.c
index b9f19a9..1797337 100644
--- a/hw/an5206.c
+++ b/hw/an5206.c
@@ -68,7 +68,7 @@ static void an5206_init(ram_addr_t ram_size,
         exit(1);
     }
 
-    kernel_size = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+    kernel_size = load_elf(kernel_filename, NULL, &elf_entry,
                            NULL, NULL, 1, ELF_MACHINE, 0);
     entry = elf_entry;
     if (kernel_size < 0) {
diff --git a/hw/arm_boot.c b/hw/arm_boot.c
index 620550b..54bd9f5 100644
--- a/hw/arm_boot.c
+++ b/hw/arm_boot.c
@@ -226,7 +226,7 @@ void arm_load_kernel(CPUState *env, struct arm_boot_info *info)
 #endif
 
     /* Assume that raw images are linux kernels, and ELF images are not.  */
-    kernel_size = load_elf(info->kernel_filename, NULL, NULL, &elf_entry,
+    kernel_size = load_elf(info->kernel_filename, NULL, &elf_entry,
                            NULL, NULL, big_endian, ELF_MACHINE, 1);
     entry = elf_entry;
     if (kernel_size < 0) {
diff --git a/hw/armv7m.c b/hw/armv7m.c
index 588ec98..57a1515 100644
--- a/hw/armv7m.c
+++ b/hw/armv7m.c
@@ -222,7 +222,7 @@ qemu_irq *armv7m_init(int flash_size, int sram_size,
     big_endian = 0;
 #endif
 
-    image_size = load_elf(kernel_filename, NULL, NULL, &entry, &lowaddr,
+    image_size = load_elf(kernel_filename, NULL, &entry, &lowaddr,
                           NULL, big_endian, ELF_MACHINE, 1);
     if (image_size < 0) {
         image_size = load_image_targphys(kernel_filename, 0, flash_size);
diff --git a/hw/cris-boot.c b/hw/cris-boot.c
index 2ef17f6..b6eec35 100644
--- a/hw/cris-boot.c
+++ b/hw/cris-boot.c
@@ -66,11 +66,13 @@ void cris_load_image(CPUState *env, struct cris_load_info *li)
     uint64_t entry, high;
     int kcmdline_len;
     int image_size;
+    ElfHandlers handlers = elf_default_handlers;
 
     env->load_info = li;
     /* Boots a kernel elf binary, os/linux-2.6/vmlinux from the axis 
        devboard SDK.  */
-    image_size = load_elf(li->image_filename, translate_kernel_address, NULL,
+    handlers.translate_fn = translate_kernel_address;
+    image_size = load_elf(li->image_filename, &handlers,
                           &entry, NULL, &high, 0, ELF_MACHINE, 0);
     li->entry = entry;
     if (image_size < 0) {
diff --git a/hw/dummy_m68k.c b/hw/dummy_m68k.c
index 61efb39..db49e31 100644
--- a/hw/dummy_m68k.c
+++ b/hw/dummy_m68k.c
@@ -43,7 +43,7 @@ static void dummy_m68k_init(ram_addr_t ram_size,
 
     /* Load kernel.  */
     if (kernel_filename) {
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+        kernel_size = load_elf(kernel_filename, NULL, &elf_entry,
                                NULL, NULL, 1, ELF_MACHINE, 0);
         entry = elf_entry;
         if (kernel_size < 0) {
diff --git a/hw/elf_ops.h b/hw/elf_ops.h
index 0bd7235..8b63dfc 100644
--- a/hw/elf_ops.h
+++ b/hw/elf_ops.h
@@ -190,8 +190,7 @@ static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
 }
 
 static int glue(load_elf, SZ)(const char *name, int fd,
-                              uint64_t (*translate_fn)(void *, uint64_t),
-                              void *translate_opaque,
+                              ElfHandlers *handlers,
                               int must_swab, uint64_t *pentry,
                               uint64_t *lowaddr, uint64_t *highaddr,
                               int elf_machine, int clear_lsb)
@@ -265,11 +264,8 @@ static int glue(load_elf, SZ)(const char *name, int fd,
             }
             /* address_offset is hack for kernel images that are
                linked at the wrong physical address.  */
-            if (translate_fn) {
-                addr = translate_fn(translate_opaque, ph->p_paddr);
-            } else {
-                addr = ph->p_paddr;
-            }
+            addr = handlers->translate_fn(handlers->translate_opaque,
+                                          ph->p_paddr);
 
             snprintf(label, sizeof(label), "phdr #%d: %s", i, name);
             rom_add_blob_fixed(label, data, mem_size, addr);
diff --git a/hw/loader.c b/hw/loader.c
index 49ac1fa..50b43a0 100644
--- a/hw/loader.c
+++ b/hw/loader.c
@@ -229,6 +229,17 @@ int load_aout(const char *filename, target_phys_addr_t addr, int max_sz,
 
 /* ELF loader */
 
+static uint64_t elf_default_translate(void *opaque, uint64_t addr)
+{
+    return addr;
+}
+
+ElfHandlers elf_default_handlers = {
+    .translate_fn = elf_default_translate,
+    .translate_opaque = NULL,
+};
+
+
 static void *load_at(int fd, int offset, int size)
 {
     void *ptr;
@@ -276,9 +287,9 @@ static void *load_at(int fd, int offset, int size)
 #include "elf_ops.h"
 
 /* return < 0 if error, otherwise the number of bytes loaded in memory */
-int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
-             void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
-             uint64_t *highaddr, int big_endian, int elf_machine, int clear_lsb)
+int load_elf(const char *filename, ElfHandlers *handlers,
+             uint64_t *pentry, uint64_t *lowaddr, uint64_t *highaddr,
+             int big_endian, int elf_machine, int clear_lsb)
 {
     int fd, data_order, target_data_order, must_swab, ret;
     uint8_t e_ident[EI_NIDENT];
@@ -310,13 +321,17 @@ int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
     if (target_data_order != e_ident[EI_DATA])
         return -1;
 
+    if (!handlers) {
+        handlers = &elf_default_handlers;
+    }
+
     lseek(fd, 0, SEEK_SET);
     if (e_ident[EI_CLASS] == ELFCLASS64) {
-        ret = load_elf64(filename, fd, translate_fn, translate_opaque, must_swab,
-                         pentry, lowaddr, highaddr, elf_machine, clear_lsb);
+        ret = load_elf64(filename, fd, handlers, must_swab, pentry, lowaddr,
+                         highaddr, elf_machine, clear_lsb);
     } else {
-        ret = load_elf32(filename, fd, translate_fn, translate_opaque, must_swab,
-                         pentry, lowaddr, highaddr, elf_machine, clear_lsb);
+        ret = load_elf32(filename, fd, handlers, must_swab, pentry, lowaddr,
+                         highaddr, elf_machine, clear_lsb);
     }
 
     close(fd);
diff --git a/hw/loader.h b/hw/loader.h
index 1f82fc5..27a2c36 100644
--- a/hw/loader.h
+++ b/hw/loader.h
@@ -5,10 +5,18 @@
 int get_image_size(const char *filename);
 int load_image(const char *filename, uint8_t *addr); /* deprecated */
 int load_image_targphys(const char *filename, target_phys_addr_t, int max_sz);
-int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
-             void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
-             uint64_t *highaddr, int big_endian, int elf_machine,
-             int clear_lsb);
+
+typedef struct ElfHandlers {
+    uint64_t (*translate_fn)(void *opaque, uint64_t address);
+    void *translate_opaque;
+} ElfHandlers;
+
+extern ElfHandlers elf_default_handlers;
+
+int load_elf(const char *filename, ElfHandlers *handlers,
+             uint64_t *pentry, uint64_t *lowaddr, uint64_t *highaddr,
+             int big_endian, int elf_machine, int clear_lsb);
+
 int load_aout(const char *filename, target_phys_addr_t addr, int max_sz,
               int bswap_needed, target_phys_addr_t target_page_size);
 int load_uimage(const char *filename, target_phys_addr_t *ep,
diff --git a/hw/mcf5208.c b/hw/mcf5208.c
index 38645f7..3a27574 100644
--- a/hw/mcf5208.c
+++ b/hw/mcf5208.c
@@ -270,7 +270,7 @@ static void mcf5208evb_init(ram_addr_t ram_size,
         exit(1);
     }
 
-    kernel_size = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+    kernel_size = load_elf(kernel_filename, NULL, &elf_entry,
                            NULL, NULL, 1, ELF_MACHINE, 0);
     entry = elf_entry;
     if (kernel_size < 0) {
diff --git a/hw/mips_fulong2e.c b/hw/mips_fulong2e.c
index 07eb9ee..5e259c6 100644
--- a/hw/mips_fulong2e.c
+++ b/hw/mips_fulong2e.c
@@ -106,8 +106,10 @@ static int64_t load_kernel (CPUState *env)
     ram_addr_t initrd_offset;
     uint32_t *prom_buf;
     long prom_size;
+    ElfHandlers handlers = elf_default_handlers;
 
-    if (load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys, NULL,
+    handlers.translate_fn = cpu_mips_kseg0_to_phys;
+    if (load_elf(loaderparams.kernel_filename, &handlers,
                  (uint64_t *)&kernel_entry, (uint64_t *)&kernel_low,
                  (uint64_t *)&kernel_high, 0, ELF_MACHINE, 1) < 0) {
         fprintf(stderr, "qemu: could not load kernel '%s'\n",
diff --git a/hw/mips_malta.c b/hw/mips_malta.c
index 8026071..42da041 100644
--- a/hw/mips_malta.c
+++ b/hw/mips_malta.c
@@ -686,6 +686,7 @@ static int64_t load_kernel (void)
     uint32_t *prom_buf;
     long prom_size;
     int prom_index = 0;
+    ElfHandlers handlers = elf_default_handlers;
 
 #ifdef TARGET_WORDS_BIGENDIAN
     big_endian = 1;
@@ -693,7 +694,8 @@ static int64_t load_kernel (void)
     big_endian = 0;
 #endif
 
-    if (load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys, NULL,
+    handlers.translate_fn = cpu_mips_kseg0_to_phys;
+    if (load_elf(loaderparams.kernel_filename, &handlers,
                  (uint64_t *)&kernel_entry, NULL, (uint64_t *)&kernel_high,
                  big_endian, ELF_MACHINE, 1) < 0) {
         fprintf(stderr, "qemu: could not load kernel '%s'\n",
diff --git a/hw/mips_mipssim.c b/hw/mips_mipssim.c
index 111c759..55bfc25 100644
--- a/hw/mips_mipssim.c
+++ b/hw/mips_mipssim.c
@@ -55,6 +55,7 @@ static int64_t load_kernel(void)
     long initrd_size;
     ram_addr_t initrd_offset;
     int big_endian;
+    ElfHandlers handlers = elf_default_handlers;
 
 #ifdef TARGET_WORDS_BIGENDIAN
     big_endian = 1;
@@ -62,8 +63,9 @@ static int64_t load_kernel(void)
     big_endian = 0;
 #endif
 
-    kernel_size = load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys,
-                           NULL, (uint64_t *)&entry, NULL,
+    handlers.translate_fn = cpu_mips_kseg0_to_phys;
+    kernel_size = load_elf(loaderparams.kernel_filename, &handlers,
+                           (uint64_t *)&entry, NULL,
                            (uint64_t *)&kernel_high, big_endian,
                            ELF_MACHINE, 1);
     if (kernel_size >= 0) {
diff --git a/hw/mips_r4k.c b/hw/mips_r4k.c
index aa34890..bd5f340 100644
--- a/hw/mips_r4k.c
+++ b/hw/mips_r4k.c
@@ -81,14 +81,16 @@ static int64_t load_kernel(void)
     ram_addr_t initrd_offset;
     uint32_t *params_buf;
     int big_endian;
+    ElfHandlers handlers = elf_default_handlers;
 
 #ifdef TARGET_WORDS_BIGENDIAN
     big_endian = 1;
 #else
     big_endian = 0;
 #endif
-    kernel_size = load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys,
-                           NULL, (uint64_t *)&entry, NULL,
+    handlers.translate_fn = cpu_mips_kseg0_to_phys;
+    kernel_size = load_elf(loaderparams.kernel_filename, &handlers,
+                           (uint64_t *)&entry, NULL,
                            (uint64_t *)&kernel_high, big_endian,
                            ELF_MACHINE, 1);
     if (kernel_size >= 0) {
diff --git a/hw/multiboot.c b/hw/multiboot.c
index f9097a2..97b891a 100644
--- a/hw/multiboot.c
+++ b/hw/multiboot.c
@@ -171,7 +171,7 @@ int load_multiboot(void *fw_cfg,
         uint64_t elf_low, elf_high;
         int kernel_size;
         fclose(f);
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+        kernel_size = load_elf(kernel_filename, NULL, &elf_entry,
                                &elf_low, &elf_high, 0, ELF_MACHINE, 0);
         if (kernel_size < 0) {
             fprintf(stderr, "Error while loading elf kernel\n");
diff --git a/hw/petalogix_s3adsp1800_mmu.c b/hw/petalogix_s3adsp1800_mmu.c
index 42de459..6a5416e 100644
--- a/hw/petalogix_s3adsp1800_mmu.c
+++ b/hw/petalogix_s3adsp1800_mmu.c
@@ -167,15 +167,17 @@ petalogix_s3adsp1800_init(ram_addr_t ram_size,
     if (kernel_filename) {
         uint64_t entry, low, high;
         uint32_t base32;
+        ElfHandlers handlers = elf_default_handlers;
 
         /* Boots a kernel elf binary.  */
-        kernel_size = load_elf(kernel_filename, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL,
                                &entry, &low, &high,
                                1, ELF_MACHINE, 0);
         base32 = entry;
         if (base32 == 0xc0000000) {
-            kernel_size = load_elf(kernel_filename, translate_kernel_address,
-                                   NULL, &entry, NULL, NULL,
+            handlers.translate_fn = translate_kernel_address;
+            kernel_size = load_elf(kernel_filename, &handlers,
+                                   &entry, NULL, NULL,
                                    1, ELF_MACHINE, 0);
         }
         /* Always boot into physical ram.  */
diff --git a/hw/ppc440_bamboo.c b/hw/ppc440_bamboo.c
index 34ddf45..8616700 100644
--- a/hw/ppc440_bamboo.c
+++ b/hw/ppc440_bamboo.c
@@ -123,7 +123,7 @@ static void bamboo_init(ram_addr_t ram_size,
     if (kernel_filename) {
         success = load_uimage(kernel_filename, &entry, &loadaddr, NULL);
         if (success < 0) {
-            success = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+            success = load_elf(kernel_filename, NULL, &elf_entry,
                                &elf_lowaddr, NULL, 1, ELF_MACHINE, 0);
             entry = elf_entry;
             loadaddr = elf_lowaddr;
diff --git a/hw/ppc_newworld.c b/hw/ppc_newworld.c
index 4369337..f479289 100644
--- a/hw/ppc_newworld.c
+++ b/hw/ppc_newworld.c
@@ -181,7 +181,7 @@ static void ppc_core99_init (ram_addr_t ram_size,
 
     /* Load OpenBIOS (ELF) */
     if (filename) {
-        bios_size = load_elf(filename, NULL, NULL, NULL,
+        bios_size = load_elf(filename, NULL, NULL,
                              NULL, NULL, 1, ELF_MACHINE, 0);
 
         qemu_free(filename);
@@ -196,6 +196,7 @@ static void ppc_core99_init (ram_addr_t ram_size,
     if (linux_boot) {
         uint64_t lowaddr = 0;
         int bswap_needed;
+        ElfHandlers handlers = elf_default_handlers;
 
 #ifdef BSWAP_NEEDED
         bswap_needed = 1;
@@ -204,7 +205,8 @@ static void ppc_core99_init (ram_addr_t ram_size,
 #endif
         kernel_base = KERNEL_LOAD_ADDR;
 
-        kernel_size = load_elf(kernel_filename, translate_kernel_address, NULL,
+        handlers.translate_fn = translate_kernel_address;
+        kernel_size = load_elf(kernel_filename, &handlers,
                                NULL, &lowaddr, NULL, 1, ELF_MACHINE, 0);
         if (kernel_size < 0)
             kernel_size = load_aout(kernel_filename, kernel_base,
diff --git a/hw/ppc_oldworld.c b/hw/ppc_oldworld.c
index a2f9ddf..f24bded 100644
--- a/hw/ppc_oldworld.c
+++ b/hw/ppc_oldworld.c
@@ -119,7 +119,7 @@ static void ppc_heathrow_init (ram_addr_t ram_size,
 
     /* Load OpenBIOS (ELF) */
     if (filename) {
-        bios_size = load_elf(filename, 0, NULL, NULL, NULL, NULL,
+        bios_size = load_elf(filename, NULL, NULL, NULL, NULL,
                              1, ELF_MACHINE, 0);
         qemu_free(filename);
     } else {
@@ -133,14 +133,16 @@ static void ppc_heathrow_init (ram_addr_t ram_size,
     if (linux_boot) {
         uint64_t lowaddr = 0;
         int bswap_needed;
+        ElfHandlers handlers = elf_default_handlers;
 
 #ifdef BSWAP_NEEDED
         bswap_needed = 1;
 #else
         bswap_needed = 0;
 #endif
+        handlers.translate_fn = translate_kernel_address;
         kernel_base = KERNEL_LOAD_ADDR;
-        kernel_size = load_elf(kernel_filename, translate_kernel_address, NULL,
+        kernel_size = load_elf(kernel_filename, &handlers,
                                NULL, &lowaddr, NULL, 1, ELF_MACHINE, 0);
         if (kernel_size < 0)
             kernel_size = load_aout(kernel_filename, kernel_base,
diff --git a/hw/ppce500_mpc8544ds.c b/hw/ppce500_mpc8544ds.c
index 59d20d3..1d066ab 100644
--- a/hw/ppce500_mpc8544ds.c
+++ b/hw/ppce500_mpc8544ds.c
@@ -233,7 +233,7 @@ static void mpc8544ds_init(ram_addr_t ram_size,
     if (kernel_filename) {
         kernel_size = load_uimage(kernel_filename, &entry, &loadaddr, NULL);
         if (kernel_size < 0) {
-            kernel_size = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+            kernel_size = load_elf(kernel_filename, NULL, &elf_entry,
                                    &elf_lowaddr, NULL, 1, ELF_MACHINE, 0);
             entry = elf_entry;
             loadaddr = elf_lowaddr;
diff --git a/hw/sun4m.c b/hw/sun4m.c
index 0392109..265ac49 100644
--- a/hw/sun4m.c
+++ b/hw/sun4m.c
@@ -322,13 +322,15 @@ static unsigned long sun4m_load_kernel(const char *kernel_filename,
     kernel_size = 0;
     if (linux_boot) {
         int bswap_needed;
+        ElfHandlers handlers = elf_default_handlers;
 
 #ifdef BSWAP_NEEDED
         bswap_needed = 1;
 #else
         bswap_needed = 0;
 #endif
-        kernel_size = load_elf(kernel_filename, translate_kernel_address, NULL,
+        handlers.translate_fn = translate_kernel_address;
+        kernel_size = load_elf(kernel_filename, &handlers,
                                NULL, NULL, NULL, 1, ELF_MACHINE, 0);
         if (kernel_size < 0)
             kernel_size = load_aout(kernel_filename, KERNEL_LOAD_ADDR,
@@ -677,7 +679,11 @@ static void prom_init(target_phys_addr_t addr, const char *bios_name)
     }
     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
     if (filename) {
-        ret = load_elf(filename, translate_prom_address, &addr, NULL,
+        ElfHandlers handlers = elf_default_handlers;
+
+        handlers.translate_fn = translate_prom_address;
+        handlers.translate_opaque = &addr;
+        ret = load_elf(filename, &handlers, NULL,
                        NULL, NULL, 1, ELF_MACHINE, 0);
         if (ret < 0 || ret > PROM_SIZE_MAX) {
             ret = load_image_targphys(filename, addr, PROM_SIZE_MAX);
diff --git a/hw/sun4u.c b/hw/sun4u.c
index 45a46d6..b07c1e1 100644
--- a/hw/sun4u.c
+++ b/hw/sun4u.c
@@ -194,7 +194,7 @@ static unsigned long sun4u_load_kernel(const char *kernel_filename,
 #else
         bswap_needed = 0;
 #endif
-        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL,
                                NULL, NULL, 1, ELF_MACHINE, 0);
         if (kernel_size < 0)
             kernel_size = load_aout(kernel_filename, KERNEL_LOAD_ADDR,
@@ -597,6 +597,7 @@ static void prom_init(target_phys_addr_t addr, const char *bios_name)
     SysBusDevice *s;
     char *filename;
     int ret;
+    ElfHandlers handlers = elf_default_handlers;
 
     dev = qdev_create(NULL, "openprom");
     qdev_init_nofail(dev);
@@ -610,7 +611,9 @@ static void prom_init(target_phys_addr_t addr, const char *bios_name)
     }
     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
     if (filename) {
-        ret = load_elf(filename, translate_prom_address, &addr,
+        handlers.translate_fn = translate_prom_address;
+        handlers.translate_opaque = &addr;
+        ret = load_elf(filename, &handlers,
                        NULL, NULL, NULL, 1, ELF_MACHINE, 0);
         if (ret < 0 || ret > PROM_SIZE_MAX) {
             ret = load_image_targphys(filename, addr, PROM_SIZE_MAX);
diff --git a/hw/virtex_ml507.c b/hw/virtex_ml507.c
index fa60515..3ef7487 100644
--- a/hw/virtex_ml507.c
+++ b/hw/virtex_ml507.c
@@ -238,7 +238,7 @@ static void virtex_init(ram_addr_t ram_size,
         target_phys_addr_t boot_offset;
 
         /* Boots a kernel elf binary.  */
-        kernel_size = load_elf(kernel_filename, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL,
                                &entry, &low, &high, 1, ELF_MACHINE, 0);
         boot_info.bootstrap_pc = entry & 0x00ffffff;
 
-- 
1.6.0.2

* [Qemu-devel] [PATCH 02/40] elf: Add notes implementation
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 01/40] elf: Move translate_fn to helper struct Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 18:29   ` Blue Swirl
  2010-11-01 18:41   ` [Qemu-devel] " Paolo Bonzini
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 03/40] elf: add header notification Alexander Graf
                   ` (37 subsequent siblings)
  39 siblings, 2 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

---
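Note (not part of the patch): a minimal standalone sketch of the note
walk that elf_read_notes performs, written to show the layout the loop
relies on: three 32-bit words (namesz, descsz, type) followed by the name
and the descriptor, each padded to a 4-byte boundary. The names walk_notes,
note_cb, note_seen and record_note are hypothetical, and the sketch assumes
the notes are already in host byte order (the patch swaps them when
must_swab is set). Unlike the patch, it passes the unpadded sizes to the
callback and checks that a full 12-byte header is available before
reading it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type, modelled on note_fn from the patch. */
typedef void (*note_cb)(void *opaque, const uint8_t *name, uint32_t namesz,
                        const uint8_t *desc, uint32_t descsz, uint32_t type);

/* Hypothetical record of the last note seen, used by record_note below. */
struct note_seen {
    const uint8_t *name;
    const uint8_t *desc;
    uint32_t namesz, descsz, type;
};

/* Example callback: remember the most recent note. */
static void record_note(void *opaque, const uint8_t *name, uint32_t namesz,
                        const uint8_t *desc, uint32_t descsz, uint32_t type)
{
    struct note_seen *seen = opaque;

    seen->name = name;
    seen->namesz = namesz;
    seen->desc = desc;
    seen->descsz = descsz;
    seen->type = type;
}

/* Walk a PT_NOTE blob in host byte order; returns the notes delivered. */
static int walk_notes(const uint8_t *data, size_t len, note_cb cb, void *opaque)
{
    size_t off = 0;
    int count = 0;

    /* Stop as soon as a full header no longer fits. */
    while (len - off >= 3 * sizeof(uint32_t)) {
        uint32_t hdr[3];            /* namesz, descsz, type */
        size_t name_pad, desc_pad;

        memcpy(hdr, data + off, sizeof(hdr));
        off += sizeof(hdr);

        /* Reject sizes that cannot fit before rounding them up. */
        if (hdr[0] > len - off || hdr[1] > len - off) {
            break;
        }
        name_pad = ((size_t)hdr[0] + 3) & ~(size_t)3;
        desc_pad = ((size_t)hdr[1] + 3) & ~(size_t)3;
        if (name_pad > len - off || desc_pad > len - off - name_pad) {
            break;
        }

        cb(opaque, data + off, hdr[0], data + off + name_pad, hdr[1], hdr[2]);
        off += name_pad + desc_pad;
        count++;
    }
    return count;
}
```

A caller would pass the bytes read from the PT_NOTE segment together with
its own callback and opaque pointer; a truncated blob simply ends the walk
early.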
 hw/elf_ops.h |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 hw/loader.c  |    7 ++++++
 hw/loader.h  |    3 ++
 3 files changed, 70 insertions(+), 1 deletions(-)

diff --git a/hw/elf_ops.h b/hw/elf_ops.h
index 8b63dfc..645d058 100644
--- a/hw/elf_ops.h
+++ b/hw/elf_ops.h
@@ -189,6 +189,44 @@ static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
     return -1;
 }
 
+static void glue(elf_read_notes, SZ)(uint8_t *data, int data_len,
+                                     ElfHandlers *handlers, int must_swab)
+{
+    uint8_t *p = data;
+
+    while ((ulong)&p[3 * sizeof(uint32_t)] <= (ulong)&data[data_len]) {
+        uint32_t *cur = (uint32_t *)p;
+        uint32_t namesz = cur[0];
+        uint32_t descsz = cur[1];
+        uint32_t type   = cur[2];
+        uint8_t *name;
+        uint8_t *desc;
+
+        p += 3 * sizeof(uint32_t);
+
+        if (must_swab) {
+            namesz = bswap32(namesz);
+            descsz = bswap32(descsz);
+            type   = bswap32(type);
+        }
+
+        namesz = (namesz + 3) & ~3;
+        descsz = (descsz + 3) & ~3;
+
+        name = p;
+        p += namesz;
+        desc = p;
+        p += descsz;
+
+        if ((ulong)p > (ulong)&data[data_len]) {
+            break;
+        }
+
+        handlers->note_fn(handlers->note_opaque, name, namesz, desc, descsz,
+                          type);
+    }
+}
+
 static int glue(load_elf, SZ)(const char *name, int fd,
                               ElfHandlers *handlers,
                               int must_swab, uint64_t *pentry,
@@ -252,7 +290,8 @@ static int glue(load_elf, SZ)(const char *name, int fd,
     total_size = 0;
     for(i = 0; i < ehdr.e_phnum; i++) {
         ph = &phdr[i];
-        if (ph->p_type == PT_LOAD) {
+        switch (ph->p_type) {
+        case PT_LOAD:
             mem_size = ph->p_memsz;
             /* XXX: avoid allocating */
             data = qemu_mallocz(mem_size);
@@ -278,6 +317,26 @@ static int glue(load_elf, SZ)(const char *name, int fd,
 
             qemu_free(data);
             data = NULL;
+            break;
+
+        case PT_NOTE:
+            mem_size = ph->p_memsz;
+            if (!mem_size) {
+                break;
+            }
+            data = qemu_mallocz(mem_size);
+            if (ph->p_filesz > 0) {
+                if (lseek(fd, ph->p_offset, SEEK_SET) < 0)
+                    goto fail;
+                if (read(fd, data, ph->p_filesz) != ph->p_filesz)
+                    goto fail;
+            }
+
+            glue(elf_read_notes, SZ)(data, ph->p_memsz, handlers, must_swab);
+
+            qemu_free(data);
+            data = NULL;
+            break;
         }
     }
     qemu_free(phdr);
diff --git a/hw/loader.c b/hw/loader.c
index 50b43a0..cb430e0 100644
--- a/hw/loader.c
+++ b/hw/loader.c
@@ -229,6 +229,11 @@ int load_aout(const char *filename, target_phys_addr_t addr, int max_sz,
 
 /* ELF loader */
 
+static void elf_default_note(void *opaque, uint8_t *name, uint32_t name_len,
+                             uint8_t *desc, uint32_t desc_len, uint32_t type)
+{
+}
+
 static uint64_t elf_default_translate(void *opaque, uint64_t addr)
 {
     return addr;
@@ -237,6 +242,8 @@ static uint64_t elf_default_translate(void *opaque, uint64_t addr)
 ElfHandlers elf_default_handlers = {
     .translate_fn = elf_default_translate,
     .translate_opaque = NULL,
+    .note_fn = elf_default_note,
+    .note_opaque = NULL,
 };
 
 
diff --git a/hw/loader.h b/hw/loader.h
index 27a2c36..29d5c71 100644
--- a/hw/loader.h
+++ b/hw/loader.h
@@ -9,6 +9,9 @@ int load_image_targphys(const char *filename, target_phys_addr_t, int max_sz);
 typedef struct ElfHandlers {
     uint64_t (*translate_fn)(void *opaque, uint64_t address);
     void *translate_opaque;
+    void (*note_fn)(void *opaque, uint8_t *name, uint32_t name_len,
+                    uint8_t *desc, uint32_t desc_len, uint32_t type);
+    void *note_opaque;
 } ElfHandlers;
 
 extern ElfHandlers elf_default_handlers;
-- 
1.6.0.2

* [Qemu-devel] [PATCH 03/40] elf: add header notification
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 01/40] elf: Move translate_fn to helper struct Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 02/40] elf: Add notes implementation Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 04/40] elf: add section analyzer Alexander Graf
                   ` (36 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

---
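Note (not part of the patch): a minimal sketch of how a caller might use
the handler table that this series builds up: copy the defaults, then
override only the callback it cares about, as prom_init does for
translate_fn in patch 01. Handlers, default_handlers, record_bits and
fake_load are hypothetical stand-ins; fake_load only imitates the order in
which a loader would invoke the callbacks, and the assumption that "bits"
is 32 or 64 comes from the SZ argument passed in the hunk below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down version of ElfHandlers. */
typedef struct Handlers {
    uint64_t (*translate_fn)(void *opaque, uint64_t addr);
    void *translate_opaque;
    void (*header_notify_fn)(void *opaque, void *ehdr, int bits);
    void *header_notify_opaque;
} Handlers;

/* Default: addresses are used unchanged. */
static uint64_t default_translate(void *opaque, uint64_t addr)
{
    (void)opaque;
    return addr;
}

/* Default: ignore the header. */
static void default_header_notify(void *opaque, void *ehdr, int bits)
{
    (void)opaque;
    (void)ehdr;
    (void)bits;
}

static const Handlers default_handlers = {
    .translate_fn = default_translate,
    .translate_opaque = NULL,
    .header_notify_fn = default_header_notify,
    .header_notify_opaque = NULL,
};

/* Example override: remember whether the image is 32 or 64 bit. */
static void record_bits(void *opaque, void *ehdr, int bits)
{
    (void)ehdr;
    *(int *)opaque = bits;
}

/* Stand-in for the loader: notify about the header, then translate. */
static uint64_t fake_load(const Handlers *h, int bits, uint64_t entry)
{
    h->header_notify_fn(h->header_notify_opaque, NULL, bits);
    return h->translate_fn(h->translate_opaque, entry);
}
```

Because every slot in the default table is populated, the loader in this
sketch can call each callback unconditionally, which matches how the hunk
below calls header_notify_fn without a NULL check.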
 hw/elf_ops.h |    2 ++
 hw/loader.c  |    7 +++++++
 hw/loader.h  |    2 ++
 3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/hw/elf_ops.h b/hw/elf_ops.h
index 645d058..5bcba7e 100644
--- a/hw/elf_ops.h
+++ b/hw/elf_ops.h
@@ -247,6 +247,8 @@ static int glue(load_elf, SZ)(const char *name, int fd,
         glue(bswap_ehdr, SZ)(&ehdr);
     }
 
+    handlers->header_notify_fn(handlers->header_notify_opaque, &ehdr, SZ);
+
     switch (elf_machine) {
         case EM_PPC64:
             if (EM_PPC64 != ehdr.e_machine)
diff --git a/hw/loader.c b/hw/loader.c
index cb430e0..6a43fda 100644
--- a/hw/loader.c
+++ b/hw/loader.c
@@ -239,11 +239,18 @@ static uint64_t elf_default_translate(void *opaque, uint64_t addr)
     return addr;
 }
 
+static void elf_default_header_notify(void *opaque, void *ehdr, int bits)
+{
+    return;
+}
+
 ElfHandlers elf_default_handlers = {
     .translate_fn = elf_default_translate,
     .translate_opaque = NULL,
     .note_fn = elf_default_note,
     .note_opaque = NULL,
+    .header_notify_fn = elf_default_header_notify,
+    .header_notify_opaque = NULL,
 };
 
 
diff --git a/hw/loader.h b/hw/loader.h
index 29d5c71..090b815 100644
--- a/hw/loader.h
+++ b/hw/loader.h
@@ -12,6 +12,8 @@ typedef struct ElfHandlers {
     void (*note_fn)(void *opaque, uint8_t *name, uint32_t name_len,
                     uint8_t *desc, uint32_t desc_len, uint32_t type);
     void *note_opaque;
+    void (*header_notify_fn)(void *opaque, void *ehdr, int bits);
+    void *header_notify_opaque;
 } ElfHandlers;
 
 extern ElfHandlers elf_default_handlers;
-- 
1.6.0.2

* [Qemu-devel] [PATCH 04/40] elf: add section analyzer
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (2 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 03/40] elf: add header notification Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 05/40] xen-disk: disable aio Alexander Graf
                   ` (35 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

---
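Note (not part of the patch): a minimal sketch of the name lookup that the
section analyzer depends on. Each section header stores sh_name as an
offset into the section string table (the section at index e_shstrndx), so
the name is simply a pointer into that table. The struct mini_shdr and the
functions section_name and find_section are hypothetical; the bounds check
on sh_name is an addition of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for an ELF section header; only the fields used here. */
struct mini_shdr {
    uint32_t sh_name;   /* offset of the name in the section string table */
    uint32_t sh_size;   /* size of the section contents */
};

/* Return the section name, or NULL if the offset is out of range or the
 * string is not terminated inside the table. */
static const char *section_name(const char *strtab, size_t strtab_len,
                                uint32_t sh_name)
{
    if (sh_name >= strtab_len) {
        return NULL;
    }
    if (memchr(strtab + sh_name, '\0', strtab_len - sh_name) == NULL) {
        return NULL;
    }
    return strtab + sh_name;
}

/* Return the index of the first non-empty section called "wanted", or -1. */
static int find_section(const struct mini_shdr *shdr, int shnum,
                        const char *strtab, size_t strtab_len,
                        const char *wanted)
{
    int i;

    for (i = 0; i < shnum; i++) {
        const char *name;

        if (shdr[i].sh_size == 0) {
            continue;   /* the patch skips empty sections as well */
        }
        name = section_name(strtab, strtab_len, shdr[i].sh_name);
        if (name != NULL && strcmp(name, wanted) == 0) {
            return i;
        }
    }
    return -1;
}
```

A section_fn implementation would typically do the strcmp itself; the
lookup is pulled into find_section here only to keep the example small.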
 hw/elf_ops.h |   32 ++++++++++++++++++++++++++++++--
 hw/loader.c  |    7 +++++++
 hw/loader.h  |    3 +++
 3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/hw/elf_ops.h b/hw/elf_ops.h
index 5bcba7e..6a042c5 100644
--- a/hw/elf_ops.h
+++ b/hw/elf_ops.h
@@ -100,13 +100,14 @@ static int glue(symcmp, SZ)(const void *s0, const void *s1)
 }
 
 static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
-                                  int clear_lsb)
+                                  int clear_lsb, ElfHandlers *handlers)
 {
     struct elf_shdr *symtab, *strtab, *shdr_table = NULL;
     struct elf_sym *syms = NULL;
     struct syminfo *s;
     int nsyms, i;
     char *str = NULL;
+    char *secstr = NULL;
 
     shdr_table = load_at(fd, ehdr->e_shoff,
                          sizeof(struct elf_shdr) * ehdr->e_shnum);
@@ -172,6 +173,32 @@ static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
     if (!str)
         goto fail;
 
+    /* Section string table */
+    if (ehdr->e_shstrndx >= ehdr->e_shnum)
+        goto fail;
+    strtab = &shdr_table[ehdr->e_shstrndx];
+
+    secstr = load_at(fd, strtab->sh_offset, strtab->sh_size);
+    if (!secstr)
+        goto fail;
+
+    /* External section analyzer */
+    for (i = 0; i < ehdr->e_shnum; i++) {
+        struct elf_shdr *cursec = &shdr_table[i];
+        uint8_t *section, *name;
+
+        if (!cursec->sh_size) {
+            continue;
+        }
+
+        name = (uint8_t*)&secstr[cursec->sh_name];
+        section = load_at(fd, cursec->sh_offset, cursec->sh_size);
+        handlers->section_fn(handlers->section_opaque, name, cursec->sh_size,
+                             section);
+        qemu_free(section);
+    }
+
+
     /* Commit */
     s = qemu_mallocz(sizeof(*s));
     s->lookup_symbol = glue(lookup_symbol, SZ);
@@ -185,6 +212,7 @@ static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
  fail:
     qemu_free(syms);
     qemu_free(str);
+    qemu_free(secstr);
     qemu_free(shdr_table);
     return -1;
 }
@@ -273,7 +301,7 @@ static int glue(load_elf, SZ)(const char *name, int fd,
     if (pentry)
    	*pentry = (uint64_t)(elf_sword)ehdr.e_entry;
 
-    glue(load_symbols, SZ)(&ehdr, fd, must_swab, clear_lsb);
+    glue(load_symbols, SZ)(&ehdr, fd, must_swab, clear_lsb, handlers);
 
     size = ehdr.e_phnum * sizeof(phdr[0]);
     lseek(fd, ehdr.e_phoff, SEEK_SET);
diff --git a/hw/loader.c b/hw/loader.c
index 6a43fda..cf3c1ad 100644
--- a/hw/loader.c
+++ b/hw/loader.c
@@ -234,6 +234,11 @@ static void elf_default_note(void *opaque, uint8_t *name, uint32_t name_len,
 {
 }
 
+static void elf_default_section(void *opaque, uint8_t *name, uint32_t len,
+                                uint8_t *data)
+{
+}
+
 static uint64_t elf_default_translate(void *opaque, uint64_t addr)
 {
     return addr;
@@ -251,6 +256,8 @@ ElfHandlers elf_default_handlers = {
     .note_opaque = NULL,
     .header_notify_fn = elf_default_header_notify,
     .header_notify_opaque = NULL,
+    .section_fn = elf_default_section,
+    .section_opaque = NULL,
 };
 
 
diff --git a/hw/loader.h b/hw/loader.h
index 090b815..6351644 100644
--- a/hw/loader.h
+++ b/hw/loader.h
@@ -14,6 +14,9 @@ typedef struct ElfHandlers {
     void *note_opaque;
     void (*header_notify_fn)(void *opaque, void *ehdr, int bits);
     void *header_notify_opaque;
+    void (*section_fn)(void *opaque, uint8_t *name, uint32_t len,
+                       uint8_t *data);
+    void *section_opaque;
 } ElfHandlers;
 
 extern ElfHandlers elf_default_handlers;
-- 
1.6.0.2

* [Qemu-devel] [PATCH 05/40] xen-disk: disable aio
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (3 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 04/40] elf: add section analyzer Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 06/40] qdev-ify: xen backends Alexander Graf
                   ` (34 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Using AIO with the xen backend breaks for me. Disabling it makes things work.
So until we figure out what exactly is going wrong, let's disable it.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xen_disk.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index 134ac33..06752de 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -49,7 +49,7 @@ static int syncwrite    = 0;
 static int batch_maps   = 0;
 
 static int max_requests = 32;
-static int use_aio      = 1;
+static int use_aio      = 0;
 
 /* ------------------------------------------------------------- */
 
-- 
1.6.0.2

* [Qemu-devel] [PATCH 06/40] qdev-ify: xen backends
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (4 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 05/40] xen-disk: disable aio Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-02 10:08   ` Markus Armbruster
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 07/40] xenner: kernel: 32 bit files Alexander Graf
                   ` (33 subsequent siblings)
  39 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

From: Gerd Hoffmann <kraxel@redhat.com>

This patch converts the xen backend code to qdev.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
---
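Note (not part of the patch): the backend watch token changes from
"be:<type ptr>:<dom>:<ops ptr>" to "be:<dom>:<ops ptr>" because the type
string now lives in XenDevOps. A minimal sketch of that round trip follows.
The helper names make_be_token and parse_be_token are hypothetical, and the
sketch formats the pointer with PRIxPTR and scans it with SCNxPTR, whereas
the patch formats with %p and scans with PRIxPTR; treat this as a
variation, not as the code in the tree.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Build a "be:<dom>:<ops pointer in hex>" token; returns snprintf's result. */
static int make_be_token(char *buf, size_t len, int dom, const void *ops)
{
    return snprintf(buf, len, "be:%d:%" PRIxPTR, dom, (uintptr_t)ops);
}

/* Parse a token produced by make_be_token; returns 0 on success, -1 if the
 * token does not have the expected shape. */
static int parse_be_token(const char *token, int *dom, void **ops)
{
    uintptr_t ptr;

    if (sscanf(token, "be:%d:%" SCNxPTR, dom, &ptr) != 2) {
        return -1;
    }
    *ops = (void *)ptr;
    return 0;
}
```

With the ops pointer recovered from the token, the watch handler can reach
ops->type directly, which is why xenstore_update_be no longer needs a
separate type argument.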
 hw/xen_backend.c    |  176 ++++++++++++++++++++++++++++++++-------------------
 hw/xen_backend.h    |    9 ++-
 hw/xen_console.c    |   10 +++-
 hw/xen_disk.c       |   10 +++-
 hw/xen_machine_pv.c |    6 +--
 hw/xen_nic.c        |   10 +++-
 hw/xenfb.c          |   14 ++++-
 7 files changed, 158 insertions(+), 77 deletions(-)

diff --git a/hw/xen_backend.c b/hw/xen_backend.c
index a2e408f..0d6a96b 100644
--- a/hw/xen_backend.c
+++ b/hw/xen_backend.c
@@ -42,13 +42,21 @@
 
 /* ------------------------------------------------------------- */
 
+typedef struct XenBus {
+    BusState qbus;
+} XenBus;
+
 /* public */
 int xen_xc;
 struct xs_handle *xenstore = NULL;
 const char *xen_protocol;
 
 /* private */
-static QTAILQ_HEAD(XenDeviceHead, XenDevice) xendevs = QTAILQ_HEAD_INITIALIZER(xendevs);
+static struct BusInfo xen_bus_info = {
+    .name       = "Xen",
+    .size       = sizeof(XenBus),
+};
+static XenBus *xenbus;
 static int debug = 0;
 
 /* ------------------------------------------------------------- */
@@ -163,14 +171,16 @@ int xen_be_set_state(struct XenDevice *xendev, enum xenbus_state state)
 
 struct XenDevice *xen_be_find_xendev(const char *type, int dom, int dev)
 {
+    struct DeviceState *qdev;
     struct XenDevice *xendev;
 
-    QTAILQ_FOREACH(xendev, &xendevs, next) {
+    QLIST_FOREACH(qdev, &xenbus->qbus.children, sibling) {
+        xendev = container_of(qdev, struct XenDevice, qdev);
 	if (xendev->dom != dom)
 	    continue;
 	if (xendev->dev != dev)
 	    continue;
-	if (strcmp(xendev->type, type) != 0)
+	if (strcmp(xendev->ops->type, type) != 0)
 	    continue;
 	return xendev;
     }
@@ -180,28 +190,34 @@ struct XenDevice *xen_be_find_xendev(const char *type, int dom, int dev)
 /*
  * get xen backend device, allocate a new one if it doesn't exist.
  */
-static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
+static struct XenDevice *xen_be_get_xendev(int dom, int dev,
                                            struct XenDevOps *ops)
 {
+    struct DeviceState *qdev;
     struct XenDevice *xendev;
+    char name[64];
     char *dom0;
 
-    xendev = xen_be_find_xendev(type, dom, dev);
+    xendev = xen_be_find_xendev(ops->type, dom, dev);
     if (xendev)
 	return xendev;
 
+    /* create new xendev */
+    snprintf(name, sizeof(name), "xen-%s", ops->type);
+    qdev = qdev_create(&xenbus->qbus, name);
+    qdev_init_nofail(qdev);
+    xendev = container_of(qdev, struct XenDevice, qdev);
+
     /* init new xendev */
-    xendev = qemu_mallocz(ops->size);
-    xendev->type  = type;
     xendev->dom   = dom;
     xendev->dev   = dev;
     xendev->ops   = ops;
 
     dom0 = xs_get_domain_path(xenstore, 0);
     snprintf(xendev->be, sizeof(xendev->be), "%s/backend/%s/%d/%d",
-	     dom0, xendev->type, xendev->dom, xendev->dev);
+	     dom0, xendev->ops->type, xendev->dom, xendev->dev);
     snprintf(xendev->name, sizeof(xendev->name), "%s-%d",
-	     xendev->type, xendev->dev);
+	     xendev->ops->type, xendev->dev);
     free(dom0);
 
     xendev->debug      = debug;
@@ -210,7 +226,7 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
     xendev->evtchndev = xc_evtchn_open();
     if (xendev->evtchndev < 0) {
 	xen_be_printf(NULL, 0, "can't open evtchn device\n");
-	qemu_free(xendev);
+	qdev_free(&xendev->qdev);
 	return NULL;
     }
     fcntl(xc_evtchn_fd(xendev->evtchndev), F_SETFD, FD_CLOEXEC);
@@ -220,15 +236,13 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
 	if (xendev->gnttabdev < 0) {
 	    xen_be_printf(NULL, 0, "can't open gnttab device\n");
 	    xc_evtchn_close(xendev->evtchndev);
-	    qemu_free(xendev);
+	    qdev_free(&xendev->qdev);
 	    return NULL;
 	}
     } else {
 	xendev->gnttabdev = -1;
     }
 
-    QTAILQ_INSERT_TAIL(&xendevs, xendev, next);
-
     if (xendev->ops->alloc)
 	xendev->ops->alloc(xendev);
 
@@ -238,43 +252,44 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
 /*
  * release xen backend device.
  */
-static struct XenDevice *xen_be_del_xendev(int dom, int dev)
+static void xen_be_del_xendev(int dom, int dev, struct XenDevOps *ops)
 {
-    struct XenDevice *xendev, *xnext;
-
-    /*
-     * This is pretty much like QTAILQ_FOREACH(xendev, &xendevs, next) but
-     * we save the next pointer in xnext because we might free xendev.
-     */
-    xnext = xendevs.tqh_first;
-    while (xnext) {
-        xendev = xnext;
-        xnext = xendev->next.tqe_next;
-
-	if (xendev->dom != dom)
-	    continue;
-	if (xendev->dev != dev && dev != -1)
-	    continue;
-
-	if (xendev->ops->free)
-	    xendev->ops->free(xendev);
-
-	if (xendev->fe) {
-	    char token[XEN_BUFSIZE];
-	    snprintf(token, sizeof(token), "fe:%p", xendev);
-	    xs_unwatch(xenstore, xendev->fe, token);
-	    qemu_free(xendev->fe);
-	}
-
-	if (xendev->evtchndev >= 0)
-	    xc_evtchn_close(xendev->evtchndev);
-	if (xendev->gnttabdev >= 0)
-	    xc_gnttab_close(xendev->gnttabdev);
-
-	QTAILQ_REMOVE(&xendevs, xendev, next);
-	qemu_free(xendev);
-    }
-    return NULL;
+    struct DeviceState *qdev;
+    struct XenDevice *xendev;
+    int done;
+
+    do {
+        done = 1;
+        QLIST_FOREACH(qdev, &xenbus->qbus.children, sibling) {
+            xendev = container_of(qdev, struct XenDevice, qdev);
+            if (xendev->dom != dom)
+                continue;
+            if (xendev->dev != dev && dev != -1)
+                continue;
+            if (xendev->ops != ops)
+                continue;
+
+            if (xendev->ops->free)
+                xendev->ops->free(xendev);
+
+            if (xendev->fe) {
+                char token[XEN_BUFSIZE];
+                snprintf(token, sizeof(token), "fe:%p", xendev);
+                xs_unwatch(xenstore, xendev->fe, token);
+                qemu_free(xendev->fe);
+            }
+
+            if (xendev->evtchndev >= 0)
+                xc_evtchn_close(xendev->evtchndev);
+            if (xendev->gnttabdev >= 0)
+                xc_gnttab_close(xendev->gnttabdev);
+
+            qdev_free(&xendev->qdev);
+
+            done = 0;
+            break;
+        }
+    } while (!done);
 }
 
 /*
@@ -498,7 +513,7 @@ void xen_be_check_state(struct XenDevice *xendev)
 
 /* ------------------------------------------------------------- */
 
-static int xenstore_scan(const char *type, int dom, struct XenDevOps *ops)
+static int xenstore_scan(int dom, struct XenDevOps *ops)
 {
     struct XenDevice *xendev;
     char path[XEN_BUFSIZE], token[XEN_BUFSIZE];
@@ -507,8 +522,8 @@ static int xenstore_scan(const char *type, int dom, struct XenDevOps *ops)
 
     /* setup watch */
     dom0 = xs_get_domain_path(xenstore, 0);
-    snprintf(token, sizeof(token), "be:%p:%d:%p", type, dom, ops);
-    snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom);
+    snprintf(token, sizeof(token), "be:%d:%p", dom, ops);
+    snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, ops->type, dom);
     free(dom0);
     if (!xs_watch(xenstore, path, token)) {
 	xen_be_printf(NULL, 0, "xen be: watching backend path (%s) failed\n", path);
@@ -520,7 +535,7 @@ static int xenstore_scan(const char *type, int dom, struct XenDevOps *ops)
     if (!dev)
 	return 0;
     for (j = 0; j < cdev; j++) {
-	xendev = xen_be_get_xendev(type, dom, atoi(dev[j]), ops);
+	xendev = xen_be_get_xendev(dom, atoi(dev[j]), ops);
 	if (xendev == NULL)
 	    continue;
 	xen_be_check_state(xendev);
@@ -529,15 +544,14 @@ static int xenstore_scan(const char *type, int dom, struct XenDevOps *ops)
     return 0;
 }
 
-static void xenstore_update_be(char *watch, char *type, int dom,
-			       struct XenDevOps *ops)
+static void xenstore_update_be(char *watch, int dom, struct XenDevOps *ops)
 {
     struct XenDevice *xendev;
     char path[XEN_BUFSIZE], *dom0;
     unsigned int len, dev;
 
     dom0 = xs_get_domain_path(xenstore, 0);
-    len = snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom);
+    len = snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, ops->type, dom);
     free(dom0);
     if (strncmp(path, watch, len) != 0)
 	return;
@@ -551,10 +565,10 @@ static void xenstore_update_be(char *watch, char *type, int dom,
 
     if (0) {
 	/* FIXME: detect devices being deleted from xenstore ... */
-	xen_be_del_xendev(dom, dev);
+	xen_be_del_xendev(dom, dev, ops);
     }
 
-    xendev = xen_be_get_xendev(type, dom, dev, ops);
+    xendev = xen_be_get_xendev(dom, dev, ops);
     if (xendev != NULL) {
 	xen_be_backend_changed(xendev, path);
 	xen_be_check_state(xendev);
@@ -580,16 +594,16 @@ static void xenstore_update_fe(char *watch, struct XenDevice *xendev)
 static void xenstore_update(void *unused)
 {
     char **vec = NULL;
-    intptr_t type, ops, ptr;
+    intptr_t ops, ptr;
     unsigned int dom, count;
 
     vec = xs_read_watch(xenstore, &count);
     if (vec == NULL)
 	goto cleanup;
 
-    if (sscanf(vec[XS_WATCH_TOKEN], "be:%" PRIxPTR ":%d:%" PRIxPTR,
-               &type, &dom, &ops) == 3)
-	xenstore_update_be(vec[XS_WATCH_PATH], (void*)type, dom, (void*)ops);
+    if (sscanf(vec[XS_WATCH_TOKEN], "be:%d:%" PRIxPTR,
+               &dom, &ops) == 2)
+	xenstore_update_be(vec[XS_WATCH_PATH], dom, (void*)ops);
     if (sscanf(vec[XS_WATCH_TOKEN], "fe:%" PRIxPTR, &ptr) == 1)
 	xenstore_update_fe(vec[XS_WATCH_PATH], (void*)ptr);
 
@@ -642,9 +656,43 @@ err:
     return -1;
 }
 
-int xen_be_register(const char *type, struct XenDevOps *ops)
+void xen_create_bus(DeviceState *parent)
 {
-    return xenstore_scan(type, xen_domid, ops);
+    DeviceInfo *info;
+    BusState *qbus;
+
+    qbus = qbus_create(&xen_bus_info, parent, NULL);
+    xenbus = DO_UPCAST(XenBus, qbus, qbus);
+    for (info = device_info_list; info != NULL; info = info->next) {
+        if (info->bus_info != &xen_bus_info)
+            continue;
+        xenstore_scan(xen_domid, DO_UPCAST(struct XenDevOps, qinfo, info));
+    }
+
+    qbus->allow_hotplug = 1;
+}
+
+static int xen_be_initfn(DeviceState *dev, DeviceInfo *info)
+{
+#if 0
+    struct XenDevOps *ops = container_of(info, struct XenDevOps, qinfo);
+    struct XenDevice *xendev = container_of(dev, struct XenDevice, qdev);
+
+    /* nothing to do as create + init isn't really split. */
+#endif
+    return 0;
+}
+
+void xen_qdev_register(struct XenDevOps *ops)
+{
+    char name[64];
+
+    snprintf(name, sizeof(name), "xen-%s", ops->type);
+    ops->qinfo.name = qemu_strdup(name);
+    ops->qinfo.init = xen_be_initfn;
+    ops->qinfo.bus_info = &xen_bus_info;
+    ops->qinfo.no_user = 1;
+    qdev_register(&ops->qinfo);
 }
 
 int xen_be_bind_evtchn(struct XenDevice *xendev)
diff --git a/hw/xen_backend.h b/hw/xen_backend.h
index 1b428e3..f53a742 100644
--- a/hw/xen_backend.h
+++ b/hw/xen_backend.h
@@ -4,6 +4,7 @@
 #include "xen_common.h"
 #include "sysemu.h"
 #include "net.h"
+#include "qdev.h"
 
 /* ------------------------------------------------------------- */
 
@@ -17,7 +18,8 @@ struct XenDevice;
 #define DEVOPS_FLAG_IGNORE_STATE  2
 
 struct XenDevOps {
-    size_t    size;
+    DeviceInfo qinfo;
+    char      type[64];
     uint32_t  flags;
     void      (*alloc)(struct XenDevice *xendev);
     int       (*init)(struct XenDevice *xendev);
@@ -30,7 +32,7 @@ struct XenDevOps {
 };
 
 struct XenDevice {
-    const char         *type;
+    DeviceState        qdev;
     int                dom;
     int                dev;
     char               name[64];
@@ -78,7 +80,8 @@ void xen_be_check_state(struct XenDevice *xendev);
 
 /* xen backend driver bits */
 int xen_be_init(void);
-int xen_be_register(const char *type, struct XenDevOps *ops);
+void xen_create_bus(DeviceState *parent);
+void xen_qdev_register(struct XenDevOps *ops);
 int xen_be_set_state(struct XenDevice *xendev, enum xenbus_state state);
 int xen_be_bind_evtchn(struct XenDevice *xendev);
 void xen_be_unbind_evtchn(struct XenDevice *xendev);
diff --git a/hw/xen_console.c b/hw/xen_console.c
index d2261f4..a980dc8 100644
--- a/hw/xen_console.c
+++ b/hw/xen_console.c
@@ -260,10 +260,18 @@ static void con_event(struct XenDevice *xendev)
 /* -------------------------------------------------------------------- */
 
 struct XenDevOps xen_console_ops = {
-    .size       = sizeof(struct XenConsole),
+    .qinfo.size = sizeof(struct XenConsole),
+    .type       = "console",
     .flags      = DEVOPS_FLAG_IGNORE_STATE,
     .init       = con_init,
     .connect    = con_connect,
     .event      = con_event,
     .disconnect = con_disconnect,
 };
+
+static void xen_console_register_devices(void)
+{
+    xen_qdev_register(&xen_console_ops);
+}
+
+device_init(xen_console_register_devices)
diff --git a/hw/xen_disk.c b/hw/xen_disk.c
index 06752de..5392f58 100644
--- a/hw/xen_disk.c
+++ b/hw/xen_disk.c
@@ -774,7 +774,8 @@ static void blk_event(struct XenDevice *xendev)
 }
 
 struct XenDevOps xen_blkdev_ops = {
-    .size       = sizeof(struct XenBlkDev),
+    .qinfo.size = sizeof(struct XenBlkDev),
+    .type       = "qdisk",
     .flags      = DEVOPS_FLAG_NEED_GNTDEV,
     .alloc      = blk_alloc,
     .init       = blk_init,
@@ -783,3 +784,10 @@ struct XenDevOps xen_blkdev_ops = {
     .event      = blk_event,
     .free       = blk_free,
 };
+
+static void xen_blkdev_register_devices(void)
+{
+    xen_qdev_register(&xen_blkdev_ops);
+}
+
+device_init(xen_blkdev_register_devices)
diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
index 77a34bf..b94d6e9 100644
--- a/hw/xen_machine_pv.c
+++ b/hw/xen_machine_pv.c
@@ -75,11 +75,7 @@ static void xen_init_pv(ram_addr_t ram_size,
         break;
     }
 
-    xen_be_register("console", &xen_console_ops);
-    xen_be_register("vkbd", &xen_kbdmouse_ops);
-    xen_be_register("vfb", &xen_framebuffer_ops);
-    xen_be_register("qdisk", &xen_blkdev_ops);
-    xen_be_register("qnic", &xen_netdev_ops);
+    xen_create_bus(NULL);
 
     /* configure framebuffer */
     if (xenfb_enabled) {
diff --git a/hw/xen_nic.c b/hw/xen_nic.c
index 08055b8..02d3c4e 100644
--- a/hw/xen_nic.c
+++ b/hw/xen_nic.c
@@ -411,7 +411,8 @@ static int net_free(struct XenDevice *xendev)
 /* ------------------------------------------------------------- */
 
 struct XenDevOps xen_netdev_ops = {
-    .size       = sizeof(struct XenNetDev),
+    .qinfo.size = sizeof(struct XenNetDev),
+    .type       = "qnic",
     .flags      = DEVOPS_FLAG_NEED_GNTDEV,
     .init       = net_init,
     .connect    = net_connect,
@@ -419,3 +420,10 @@ struct XenDevOps xen_netdev_ops = {
     .disconnect = net_disconnect,
     .free       = net_free,
 };
+
+static void xen_netdev_register_devices(void)
+{
+    xen_qdev_register(&xen_netdev_ops);
+}
+
+device_init(xen_netdev_register_devices)
diff --git a/hw/xenfb.c b/hw/xenfb.c
index da5297b..293210f 100644
--- a/hw/xenfb.c
+++ b/hw/xenfb.c
@@ -953,7 +953,8 @@ static void fb_event(struct XenDevice *xendev)
 /* -------------------------------------------------------------------- */
 
 struct XenDevOps xen_kbdmouse_ops = {
-    .size       = sizeof(struct XenInput),
+    .qinfo.size = sizeof(struct XenInput),
+    .type       = "vkbd",
     .init       = input_init,
     .connect    = input_connect,
     .disconnect = input_disconnect,
@@ -961,7 +962,8 @@ struct XenDevOps xen_kbdmouse_ops = {
 };
 
 struct XenDevOps xen_framebuffer_ops = {
-    .size       = sizeof(struct XenFB),
+    .qinfo.size = sizeof(struct XenFB),
+    .type       = "vfb",
     .init       = fb_init,
     .connect    = fb_connect,
     .disconnect = fb_disconnect,
@@ -1011,3 +1013,11 @@ wait_more:
     xen_be_check_state(xin);
     xen_be_check_state(xfb);
 }
+
+static void xenfb_register_devices(void)
+{
+    xen_qdev_register(&xen_kbdmouse_ops);
+    xen_qdev_register(&xen_framebuffer_ops);
+}
+
+device_init(xenfb_register_devices)
-- 
1.6.0.2

* [Qemu-devel] [PATCH 07/40] xenner: kernel: 32 bit files
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (5 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 06/40] qdev-ify: xen backends Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files Alexander Graf
                   ` (32 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds the files required for 32-bit support in the xenner kernel:
the linker scripts (PAE and non-PAE), the assembly entry code and its header.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
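Note (not part of the patch): the boot code in xenner32.S loads a
three-entry GDT given as raw bytes and then uses selectors 0x8 (entry 1,
code32) and 0x10 (entry 2, data32). A minimal sketch that decodes such
8-byte descriptors may help when reading those bytes. The struct seg_desc
and the functions decode_desc and effective_limit are hypothetical names;
the field layout is the standard x86 segment descriptor layout.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of one 8-byte x86 segment descriptor. */
struct seg_desc {
    uint32_t base;      /* 32-bit base address */
    uint32_t limit;     /* raw 20-bit limit */
    uint8_t  access;    /* byte 5: present, DPL, type bits */
    uint8_t  flags;     /* high nibble of byte 6: G, D/B, L, AVL */
};

/* Reassemble the scattered base and limit fields. */
static struct seg_desc decode_desc(const uint8_t d[8])
{
    struct seg_desc s;

    s.limit  = (uint32_t)d[0] | ((uint32_t)d[1] << 8) |
               ((uint32_t)(d[6] & 0x0f) << 16);
    s.base   = (uint32_t)d[2] | ((uint32_t)d[3] << 8) |
               ((uint32_t)d[4] << 16) | ((uint32_t)d[7] << 24);
    s.access = d[5];
    s.flags  = (uint8_t)(d[6] >> 4);
    return s;
}

/* Highest valid offset: with the G flag set the limit counts 4 KiB pages. */
static uint64_t effective_limit(const struct seg_desc *s)
{
    if (s->flags & 0x8) {
        return ((uint64_t)s->limit << 12) | 0xfff;
    }
    return s->limit;
}
```

Applied to the bytes in the patch, both descriptors decode to a flat
segment with base 0 covering the full 4 GiB; they differ only in the access
byte (0x9b marks an executable code segment, 0x93 a writable data segment).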
 pc-bios/xenner/xenner32-pae.lds |   37 ++++
 pc-bios/xenner/xenner32.S       |  441 +++++++++++++++++++++++++++++++++++++++
 pc-bios/xenner/xenner32.h       |  191 +++++++++++++++++
 pc-bios/xenner/xenner32.lds     |   37 ++++
 4 files changed, 706 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner32-pae.lds
 create mode 100644 pc-bios/xenner/xenner32.S
 create mode 100644 pc-bios/xenner/xenner32.h
 create mode 100644 pc-bios/xenner/xenner32.lds

diff --git a/pc-bios/xenner/xenner32-pae.lds b/pc-bios/xenner/xenner32-pae.lds
new file mode 100644
index 0000000..6d56ae4
--- /dev/null
+++ b/pc-bios/xenner/xenner32-pae.lds
@@ -0,0 +1,37 @@
+OUTPUT_FORMAT("elf32-i386")
+
+SECTIONS
+{
+    . = 0xff000000;
+    _vstart = .;
+
+    /* code */
+    .text : AT(ADDR(.text) - 0xff000000) { *(.text) }
+    . = ALIGN(4k);
+    .exfix  : { *(.exfix)  }
+
+    /* data, ro */
+    . = ALIGN(4k);
+    .note.gnu.build-id : { *(.note.gnu.build-id) }
+    . = ALIGN(4k);
+    _estart = .;
+    .extab  : { *(.extab)  }
+    _estop  = .;
+    . = ALIGN(4k);
+    .rodata : { *(.rodata) }
+
+    /* data, rw */
+    . = ALIGN(4k);
+    .pt  : { *(.pt) }
+    . = ALIGN(4k);
+    .pgdata : { *(.pgdata) }
+    . = ALIGN(4k);
+    .data   : { *(.data)   }
+
+    /* bss */
+    . = ALIGN(4k);
+    .bss    : { *(.bss)    }
+
+    . = ALIGN(4k);
+    _vstop  = .;
+}
diff --git a/pc-bios/xenner/xenner32.S b/pc-bios/xenner/xenner32.S
new file mode 100644
index 0000000..052a91b
--- /dev/null
+++ b/pc-bios/xenner/xenner32.S
@@ -0,0 +1,441 @@
+
+#define	ENTRY(name) \
+	.globl name; \
+	.align 16; \
+	name:
+
+	.macro PUSH_ERROR
+	sub $4, %esp		/* space for error code */
+	.endm
+
+	.macro PUSH_TRAP_EBP trapno
+	sub $4, %esp		/* space for trap number */
+	push %ebp
+	mov $\trapno, %ebp
+	mov %ebp, %ss:4(%esp)	/* save trap number on stack */
+	.endm
+
+	.macro PUSH_REGS
+	mov  %es,%ebp
+	push %ebp
+	mov  %ds,%ebp
+	push %ebp
+	mov $0xe010,%ebp	/* ring0 data flat */
+	mov %ebp,%ds
+	mov %ebp,%es
+
+	push %edi
+	push %esi
+	push %edx
+	push %ecx
+	push %ebx
+	push %eax
+	push %esp	       /* struct regs pointer */
+	.endm
+
+	.macro POP_REGS
+	pop %eax		/* dummy (struct regs pointer) */
+	pop %eax
+	pop %ebx
+	pop %ecx
+	pop %edx
+	pop %esi
+	pop %edi
+	pop %ebp
+	mov %ebp,%ds
+	pop %ebp
+	mov %ebp,%es
+	pop %ebp
+	.endm
+
+	.macro RETURN
+	add $8, %esp		/* remove error code & trap number */
+	iret			/* jump back */
+	.endm
+
+	.macro DO_TRAP trapno func
+	PUSH_TRAP_EBP \trapno
+	PUSH_REGS
+	call \func
+	POP_REGS
+	RETURN
+	.endm
+
+/* ------------------------------------------------------------------ */
+
+	.code32
+	.text
+
+/* --- 16-bit boot entry point --- */
+
+ENTRY(boot)
+	.code16
+
+	cli
+
+	/* load the GDT */
+	lgdt	(gdt_desc - boot)
+
+	/* turn on protected mode */
+	mov	$0x1, %eax
+	mov	%eax, %cr0
+
+	/* enable boot page table */
+	mov	$(emu_boot_pgd - boot), %eax
+	mov	%eax, %cr3
+
+	/* set PSE, maybe PAE */
+#ifdef CONFIG_PAE
+	mov	$0x30, %eax
+#else
+	mov	$0x10, %eax
+#endif
+	mov	%eax, %cr4
+
+	/* turn on paging */
+	mov	$0x80000001, %eax
+	mov	%eax, %cr0
+
+	ljmp	$0x8, $(boot32 - boot)
+
+
+.align 4, 0
+gdt:
+.byte   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* dummy */
+.byte   0xff, 0xff, 0x00, 0x00, 0x00, 0x9b, 0xcf, 0x00 /* code32 */
+.byte   0xff, 0xff, 0x00, 0x00, 0x00, 0x93, 0xcf, 0x00 /* data32 */
+
+gdt_desc:
+.short  (3 * 8) - 1
+.long   (gdt - boot)
+
+
+/* --- boot entry point --- */
+
+boot32:
+	.code32
+
+	mov $0x10, %ax
+	mov %ax, %ds
+	mov %ax, %es
+	mov %ax, %fs
+	mov %ax, %gs
+	mov %ax, %ss
+
+	jmp	*(boot32_ind - boot)
+
+boot32_ind:
+	.long boot32_real
+
+boot32_real:
+
+	/* switch to the real page table */
+
+	mov	$(emu_pgd - boot), %eax
+	mov	%eax, %cr3
+
+	lea boot_stack_high, %esp	/* setup stack */
+	sub $64, %esp			/* sizeof(struct regs_32) */
+	push %esp		       /* struct regs pointer */
+	cmp $0, %ebx
+	jne secondary
+	call do_boot
+	POP_REGS
+	RETURN
+
+secondary:
+	push %ebx
+	call do_boot_secondary
+	pop %ebx
+	POP_REGS
+	RETURN
+
+/* --- traps/faults handled by emu --- */
+
+ENTRY(debug_int1)
+	PUSH_ERROR
+	DO_TRAP 1 do_int1
+
+ENTRY(debug_int3)
+	PUSH_ERROR
+	DO_TRAP 3 do_int3
+
+ENTRY(illegal_instruction)
+	PUSH_ERROR
+	DO_TRAP 6 do_illegal_instruction
+
+ENTRY(no_device)
+	PUSH_ERROR
+	DO_TRAP 7 do_lazy_fpu
+
+ENTRY(double_fault)
+	DO_TRAP 8 do_double_fault
+
+ENTRY(general_protection)
+	DO_TRAP 13 do_general_protection
+
+ENTRY(page_fault)
+	DO_TRAP 14 do_page_fault
+
+/* --- traps/faults forwarded to guest --- */
+
+ENTRY(division_by_zero)
+	PUSH_ERROR
+	PUSH_TRAP_EBP 0
+	jmp guest_forward
+
+ENTRY(nmi)
+	PUSH_ERROR
+	PUSH_TRAP_EBP 2
+	jmp guest_forward
+
+ENTRY(overflow)
+	PUSH_ERROR
+	PUSH_TRAP_EBP 4
+	jmp guest_forward
+
+ENTRY(bound_check)
+	PUSH_ERROR
+	PUSH_TRAP_EBP 5
+	jmp guest_forward
+
+ENTRY(coprocessor)
+	PUSH_ERROR
+	PUSH_TRAP_EBP 9
+	jmp guest_forward
+
+ENTRY(invalid_tss)
+	PUSH_TRAP_EBP 10
+	jmp guest_forward
+
+ENTRY(segment_not_present)
+	PUSH_TRAP_EBP 11
+	jmp guest_forward
+
+ENTRY(stack_fault)
+	PUSH_TRAP_EBP 12
+	jmp guest_forward
+
+ENTRY(floating_point)
+	PUSH_ERROR
+	PUSH_TRAP_EBP 16
+	jmp guest_forward
+
+ENTRY(alignment)
+	PUSH_TRAP_EBP 17
+	jmp guest_forward
+
+ENTRY(machine_check)
+	PUSH_ERROR
+	PUSH_TRAP_EBP 18
+	jmp guest_forward
+
+ENTRY(simd_floating_point)
+	PUSH_ERROR
+	PUSH_TRAP_EBP 19
+	jmp guest_forward
+
+guest_forward:
+	PUSH_REGS
+	call do_guest_forward
+	POP_REGS
+	RETURN
+
+/* --- interrupts 32 ... 255 --- */
+
+ENTRY(smp_flush_tlb)
+	PUSH_ERROR
+	DO_TRAP -1 do_smp_flush_tlb
+
+ENTRY(xen_hypercall)
+	PUSH_ERROR
+	PUSH_TRAP_EBP -1
+	PUSH_REGS
+	call do_hypercall
+	POP_REGS
+	add $8, %esp		/* remove error code & trap number */
+	cmp $-1, -4(%esp)
+	jne 1f
+	out %al, $0xe0	  /* let userspace handle it */
+1:
+	iret
+
+ENTRY(irq_entries)
+vector=0
+.rept 256
+	.align 16
+	PUSH_ERROR
+	PUSH_TRAP_EBP vector
+	jmp irq_common
+vector=vector+1
+.endr
+
+ENTRY(irq_common)
+	PUSH_REGS
+	call do_irq
+	POP_REGS
+	RETURN
+
+/* --- helpers --- */
+
+ENTRY(broken_memcpy_pf)
+	mov 4(%esp),%edi
+	mov 8(%esp),%esi
+	mov 12(%esp),%ecx
+	cld
+1:	rep movsb
+	xor %eax,%eax
+8:	ret
+
+	.section .exfix, "ax"
+9:	mov $-1, %eax
+	jmp 8b
+	.previous
+
+	.section .extab, "a"
+	.align 4
+	.long 1b,9b
+	.previous
+
+ENTRY(broken_copy32_pf)
+	mov 4(%esp),%edi
+	mov 8(%esp),%esi
+1:	mov (%esi),%eax
+2:	mov %eax,(%edi)
+	xor %eax,%eax
+8:	ret
+
+	.section .exfix, "ax"
+9:	mov $-1, %eax
+	jmp 8b
+	.previous
+
+	.section .extab, "a"
+	.align 4
+	.long 1b,9b
+	.long 2b,9b
+	.previous
+
+ENTRY(broken_copy64_pf)
+	mov 4(%esp),%edi
+	mov 8(%esp),%esi
+1:	mov (%esi), %ebx
+2:	mov 4(%esi), %ecx
+3:	mov (%edi), %eax
+4:	mov 4(%edi), %edx
+5:	cmpxchg8b (%edi)
+	xor %eax,%eax
+8:	ret
+
+	.section .exfix, "ax"
+9:	mov $-1, %eax
+	jmp 8b
+	.previous
+
+	.section .extab, "a"
+	.align 4
+	.long 1b,9b
+	.long 2b,9b
+	.long 3b,9b
+	.long 4b,9b
+	.previous
+
+ENTRY(instructions)
+	.byte  0x0f, 0x11, 0x00, 0x0f,  0x11, 0x48, 0x10, 0x0f
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+	nop
+
+/* some 16 bit code for smp boot */
+
+	.code16
+	.align 4096
+ENTRY(sipi)
+	mov $0x00060000, %eax  /* EMUDEV_CMD_INIT_SECONDARY_VCPU */
+	outl %eax, $0xe8       /* EMUDEV_REG_COMMAND */
+	hlt
+	.code32
+
+/* data section */
+
+	.data
+	.globl boot_stack_low, boot_stack_high
+	.globl cpu_ptr
+	.align 4096
+boot_stack_low:
+cpu_ptr:
+	.long 0
+	.align 4096
+boot_stack_high:
+
+/* boot page tables */
+
+#define pageflags 0x063 /* present, rw, accessed, dirty */
+#define largepage 0x080 /* pse */
+
+#ifdef CONFIG_PAE
+
+	.section .pt, "aw"
+	.globl emu_pgd_pae,emu_pmd_pae,emu_boot_pmd,emu_pgd,emu_pmd
+
+	.align 4096
+emu_pgd_pae:
+emu_pgd:
+	.fill 3,8,0
+	.long emu_pmd_pae - 0xff000000 + 0x001
+	.long 0
+	.fill 508,8,0
+
+	.align 4096
+emu_pmd_pae:
+emu_pmd:
+	.fill 504,8,0
+	.quad pageflags + largepage
+	.fill 7,8,0
+
+	/* boot page tables */
+
+	.align 4096
+emu_boot_pgd:
+	.long emu_boot_pmd - 0xff000000 + 0x001
+	.long 0
+	.fill 2,8,0
+	.long emu_boot_pmd - 0xff000000 + 0x001
+	.long 0
+	.fill 508,8,0
+
+	.align 4096
+emu_boot_pmd:
+	.quad pageflags + largepage
+	.fill 503,8,0
+	.quad pageflags + largepage
+	.fill 7,8,0
+
+#else
+
+	.section .pt, "aw"
+	.globl emu_pgd_32,emu_boot_pgd,emu_pgd,emu_pmd
+
+	.align 4096
+emu_pgd_32:
+emu_pgd:
+emu_pmd:
+	.fill 1020,4,0
+	.long pageflags + largepage
+	.fill 3,4,0
+
+	/* boot page tables */
+
+	.align 4096
+emu_boot_pgd:
+	.long pageflags + largepage
+	.fill 1019,4,0
+	.long pageflags + largepage
+	.fill 3,4,0
+
+#endif
diff --git a/pc-bios/xenner/xenner32.h b/pc-bios/xenner/xenner32.h
new file mode 100644
index 0000000..5b4a6d4
--- /dev/null
+++ b/pc-bios/xenner/xenner32.h
@@ -0,0 +1,191 @@
+#include <xen/foreign/x86_32.h>
+
+struct regs_32 {
+    /* pushed onto stack before calling into C code */
+    uint32_t eax;
+    uint32_t ebx;
+    uint32_t ecx;
+    uint32_t edx;
+    uint32_t esi;
+    uint32_t edi;
+    uint32_t ds;
+    uint32_t es;
+    uint32_t ebp;
+    uint32_t trapno;
+    /* trap / fault / int created */
+    uint32_t error;
+    uint32_t eip;
+    uint32_t cs;
+    uint32_t eflags;
+    uint32_t esp;
+    uint32_t ss;
+};
+
+/* 32bit defines */
+#define EMUNAME   "xenner32"
+#define regs      regs_32
+#define fix_sel   fix_sel32
+#define fix_desc  fix_desc32
+#define ureg_t    uint32_t
+#define sreg_t    int32_t
+#define PRIxREG   PRIx32
+#define rax       eax
+#define rbx       ebx
+#define rcx       ecx
+#define rdx       edx
+#define rsi       esi
+#define rdi       edi
+#define rbp       ebp
+#define rsp       esp
+#define rip       eip
+#define rflags    eflags
+#define tss(_v)   ((0xe000 >> 3) + 8 + (((_v)->id) << 1))
+#define ldt(_v)   ((0xe000 >> 3) + 9 + (((_v)->id) << 1))
+
+/* xenner32.S */
+extern uint32_t emu_pgd_32[];
+extern uint64_t emu_pgd_pae[];
+extern uint64_t emu_pmd_pae[];
+extern pte_t emu_pmd[];
+
+/* xenner-data.c */
+#define MAPS_MAX 256
+extern page_aligned uint32_t maps_32[];
+extern page_aligned uint64_t maps_pae[];
+extern uint32_t maps_refcnt[MAPS_MAX];
+
+extern struct descriptor_32 page_aligned xen_idt[256];
+
+/* xenner-mm.c */
+void *map_page(unsigned long maddr);
+void *fixmap_page(struct xen_cpu *cpu, unsigned long maddr);
+void free_page(void *ptr);
+pte_t *find_pte_lpt(uint32_t va);
+pte_t *find_pte_map(struct xen_cpu *cpu, uint32_t va);
+void pgtable_walk(struct xen_cpu *cpu, uint32_t va);
+
+/* xenner-hcall.c */
+asmlinkage void do_hypercall(struct regs_32 *regs);
+
+/* macros */
+#define context_is_emu(_r)       (((_r)->cs & 0x03) == 0x00)
+#define context_is_kernel(_v,_r) (((_r)->cs & 0x03) == 0x01)
+#define context_is_user(_v,_r)   (((_r)->cs & 0x03) == 0x03)
+
+/* inline asm bits */
+static inline int wrmsr_safe(uint32_t msr, uint32_t ax, uint32_t dx)
+{
+    int ret;
+    asm volatile("91:  wrmsr               \n"
+                 "    xorl %0,%0           \n"
+                 "92:  nop                 \n"
+                 ".section .exfix, \"ax\"  \n"
+                 "93:  mov $-1,%0          \n"
+                 "    jmp 92b              \n"
+                 ".previous                \n"
+                 ".section .extab, \"a\"   \n"
+                 "    .align 4             \n"
+                 "    .long 91b,93b        \n"
+                 ".previous                \n"
+                 : "=r" (ret)
+                 : "c" (msr), "a" (ax), "d" (dx));
+    return ret;
+}
+
+static inline int memcpy_pf(void *dest, const void *src, size_t bytes)
+{
+    int ret;
+
+    asm volatile("    cld                  \n"
+                 "91: rep movsb            \n"
+                 "    xor %[ret],%[ret]    \n"
+                 "98:                      \n"
+
+                 ".section .exfix, \"ax\"  \n"
+                 "99: mov $-1, %[ret]      \n"
+                 "    jmp 98b              \n"
+                 ".previous                \n"
+
+                 ".section .extab, \"a\"   \n"
+                 "    .align 4             \n"
+                 "    .long 91b,99b        \n"
+                 ".previous                \n"
+                 : [ ret ] "=r" (ret),
+                   [ esi ] "+S" (src),
+                   [ edi ] "+D" (dest),
+                   [ ecx ] "+c" (bytes)
+                 :
+                 : "memory" );
+    return ret;
+}
+
+static inline int copy32_pf(uint32_t *dest, const uint32_t *src)
+{
+    int ret;
+
+    asm volatile("91: mov (%[src]),%%ecx   \n"
+                 "92: mov %%ecx,(%[dst])   \n"
+                 "    xor %[ret],%[ret]    \n"
+                 "98:                      \n"
+
+                 ".section .exfix, \"ax\"  \n"
+                 "99: mov $-1, %[ret]      \n"
+                 "    jmp 98b              \n"
+                 ".previous                \n"
+
+                 ".section .extab, \"a\"   \n"
+                 "    .align 4             \n"
+                 "    .long 91b,99b        \n"
+                 "    .long 92b,99b        \n"
+                 ".previous                \n"
+
+                 : [ ret ] "=r" (ret)
+                 : [ src ] "r" (src),
+                   [ dst ] "r" (dest)
+                 : "ecx", "memory" );
+    return ret;
+}
+
+static inline int copy64_pf(uint64_t *dest, const uint64_t *src)
+{
+    int ret;
+
+    asm volatile("91: mov (%[src]), %%ebx    \n"
+                 "92: mov 4(%[src]), %%ecx   \n"
+                 "93: mov (%[dst]), %%eax    \n"
+                 "94: mov 4(%[dst]), %%edx   \n"
+                 "95: cmpxchg8b (%[dst])     \n"
+                 "    xor %[ret],%[ret]      \n"
+                 "98:                        \n"
+
+                 ".section .exfix, \"ax\"    \n"
+                 "99: mov $-1, %[ret]        \n"
+                 "    jmp 98b                \n"
+                 ".previous                  \n"
+
+                 ".section .extab, \"a\"     \n"
+                 "    .align 4               \n"
+                 "    .long 91b,99b          \n"
+                 "    .long 92b,99b          \n"
+                 "    .long 93b,99b          \n"
+                 "    .long 94b,99b          \n"
+                 "    .long 95b,99b          \n"
+                 ".previous                  \n"
+                 : [ ret ] "=r" (ret)
+                 : [ src ] "r" (src),
+                   [ dst ] "r" (dest)
+                 : "eax", "ebx", "ecx", "edx", "memory");
+    return ret;
+}
+
+/* PAE agnosticity */
+
+static inline int copypt_pf(pte_t *dest, const pte_t *src)
+{
+#ifdef CONFIG_PAE
+    return copy64_pf(dest, src);
+#else
+    return copy32_pf(dest, src);
+#endif
+}
+
diff --git a/pc-bios/xenner/xenner32.lds b/pc-bios/xenner/xenner32.lds
new file mode 100644
index 0000000..6d56ae4
--- /dev/null
+++ b/pc-bios/xenner/xenner32.lds
@@ -0,0 +1,37 @@
+OUTPUT_FORMAT("elf32-i386")
+
+SECTIONS
+{
+    . = 0xff000000;
+    _vstart = .;
+
+    /* code */
+    .text : AT(ADDR(.text) - 0xff000000) { *(.text) }
+    . = ALIGN(4k);
+    .exfix  : { *(.exfix)  }
+
+    /* data, ro */
+    . = ALIGN(4k);
+    .note.gnu.build-id : { *(.note.gnu.build-id) }
+    . = ALIGN(4k);
+    _estart = .;
+    .extab  : { *(.extab)  }
+    _estop  = .;
+    . = ALIGN(4k);
+    .rodata : { *(.rodata) }
+
+    /* data, rw */
+    . = ALIGN(4k);
+    .pt  : { *(.pt) }
+    . = ALIGN(4k);
+    .pgdata : { *(.pgdata) }
+    . = ALIGN(4k);
+    .data   : { *(.data)   }
+
+    /* bss */
+    . = ALIGN(4k);
+    .bss    : { *(.bss)    }
+
+    . = ALIGN(4k);
+    _vstop  = .;
+}
-- 
1.6.0.2


* [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (6 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 07/40] xenner: kernel: 32 bit files Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:44   ` Anthony Liguori
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 09/40] xenner: kernel: Global data Alexander Graf
                   ` (31 subsequent siblings)
  39 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds the assembly entry code, header, and linker script required
for the xenner kernel on 64-bit systems.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
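Note for reviewers (not part of the commit): the tss() and ldt() macros in
xenner32.h and xenner64.h differ in their per-vcpu stride. The sketch below
restates them as functions, assuming they yield GDT indices (the 0xe000 >> 3
term suggests so). In long mode a TSS or LDT descriptor takes 16 bytes, that
is two 8-byte GDT slots, which would explain the stride of 4 and the LDT
offset of 10 on 64 bit. The function names and gdt_selector() are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define GDT_BASE_INDEX (0xe000 >> 3)   /* first index used by the macros */

/* 32 bit: one 8-byte slot each for TSS and LDT, two slots per vcpu. */
static inline unsigned tss_index_32(unsigned id)
{
    return GDT_BASE_INDEX + 8 + (id << 1);
}

static inline unsigned ldt_index_32(unsigned id)
{
    return GDT_BASE_INDEX + 9 + (id << 1);
}

/* 64 bit: two slots each for TSS and LDT, four slots per vcpu. */
static inline unsigned tss_index_64(unsigned id)
{
    return GDT_BASE_INDEX + 8 + (id << 2);
}

static inline unsigned ldt_index_64(unsigned id)
{
    return GDT_BASE_INDEX + 10 + (id << 2);
}

/* Selector for a GDT index with RPL 0: index * 8. */
static inline uint16_t gdt_selector(unsigned index)
{
    assert(index < 8192);               /* a GDT holds at most 8192 slots */
    return (uint16_t)(index << 3);
}
```
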
 pc-bios/xenner/xenner64.S   |  400 +++++++++++++++++++++++++++++++++++++++++++
 pc-bios/xenner/xenner64.h   |  117 +++++++++++++
 pc-bios/xenner/xenner64.lds |   38 ++++
 3 files changed, 555 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner64.S
 create mode 100644 pc-bios/xenner/xenner64.h
 create mode 100644 pc-bios/xenner/xenner64.lds
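
Note for reviewers (not part of the commit): the .fill counts in the boot
page tables follow directly from the link addresses. A small sketch of the
index arithmetic, using the standard x86 paging shifts. The helper names are
hypothetical; the addresses are the ones from the linker scripts.

```c
#include <assert.h>
#include <stdint.h>

/* Non-PAE 32 bit: 1024 directory entries, 4 MB each with PSE. */
static inline unsigned pgd_index_32(uint32_t va)
{
    return va >> 22;
}

/* PAE: 4 top-level entries of 1 GB, then 512 entries of 2 MB. */
static inline unsigned pgd_index_pae(uint32_t va)
{
    return va >> 30;
}

static inline unsigned pmd_index_pae(uint32_t va)
{
    return (va >> 21) & 0x1ff;
}

/* Long mode: 9 index bits per level, 2 MB large pages at the pmd. */
static inline unsigned pgd_index_64(uint64_t va)
{
    return (unsigned)((va >> 39) & 0x1ff);
}

static inline unsigned pud_index_64(uint64_t va)
{
    return (unsigned)((va >> 30) & 0x1ff);
}

static inline unsigned pmd_index_64(uint64_t va)
{
    return (unsigned)((va >> 21) & 0x1ff);
}
```

For 0xff000000 this gives slot 1020 without PAE and slots 3 and 504 with
PAE, matching ".fill 1020,4,0", ".fill 3,8,0" and ".fill 504,8,0" in
xenner32.S. For 0xffff830000000000 it gives pgd slot 262 and pud slot 0,
matching ".fill 262,8,0" in emu_pgd.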

diff --git a/pc-bios/xenner/xenner64.S b/pc-bios/xenner/xenner64.S
new file mode 100644
index 0000000..b140214
--- /dev/null
+++ b/pc-bios/xenner/xenner64.S
@@ -0,0 +1,400 @@
+#define	ENTRY(name) \
+	.globl name; \
+	.align 16; \
+	name:
+
+	.macro PUSH_ERROR
+	sub $8, %rsp		/* space for error code */
+	.endm
+
+	.macro PUSH_TRAP_RBP trapno
+	sub $8, %rsp		/* space for trap number */
+	push %rbp
+	mov $\trapno, %rbp
+	mov %rbp, 8(%rsp)	/* save trap number on stack */
+	.endm
+
+	.macro PUSH_REGS
+	push %rdi
+	push %rsi
+	push %r15
+	push %r14
+	push %r13
+	push %r12
+	push %r11
+	push %r10
+	push %r9
+	push %r8
+	push %rdx
+	push %rcx
+	push %rbx
+	push %rax
+	mov  %rsp,%rdi		/* struct regs pointer */
+	.endm
+
+	.macro POP_REGS
+	pop %rax
+	pop %rbx
+	pop %rcx
+	pop %rdx
+	pop %r8
+	pop %r9
+	pop %r10
+	pop %r11
+	pop %r12
+	pop %r13
+	pop %r14
+	pop %r15
+	pop %rsi
+	pop %rdi
+	pop %rbp
+	.endm
+
+	.macro RETURN
+	add $16, %rsp		/* remove error code & trap number */
+	iretq			/* jump back */
+	.endm
+
+	.macro DO_TRAP trapno func
+	PUSH_TRAP_RBP \trapno
+	PUSH_REGS
+	call \func
+	POP_REGS
+	RETURN
+	.endm
+
+/* ------------------------------------------------------------------ */
+
+	.code64
+	.text
+
+/* --- 16-bit boot entry point --- */
+
+ENTRY(boot)
+	.code16
+
+	cli
+
+	/* load the GDT */
+	lgdt	(gdt_desc - boot)
+
+	/* turn on protected mode */
+	mov	$0x1, %eax
+	mov	%eax, %cr0
+
+	/* enable pagetables */
+	mov	$(boot_pgd - boot), %eax
+	mov	%eax, %cr3
+
+	/* set PSE,  PAE */
+	mov	$0x30, %eax
+	mov	%eax, %cr4
+
+	/* long mode */
+	mov	$0xc0000080, %ecx
+	rdmsr
+	or	$0x101, %eax
+	wrmsr
+
+	/* turn on long mode and paging */
+	mov	$0x80010001, %eax
+	mov	%eax, %cr0
+
+	ljmp	$0x8, $(boot64 - boot)
+
+
+.align 4, 0
+gdt:
+.byte   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* dummy */
+.byte   0xff, 0xff, 0x00, 0x00, 0x00, 0x9b, 0xaf, 0x00 /* code64 */
+.byte   0xff, 0xff, 0x00, 0x00, 0x00, 0x93, 0xaf, 0x00 /* data64 */
+
+gdt_desc:
+.short  (3 * 8) - 1
+.long   (gdt - boot)
+
+
+/* --- 64-bit entry point --- */
+
+boot64:
+	.code64
+
+	mov $0x10, %ax
+	mov %ax, %ds
+	mov %ax, %es
+	mov %ax, %fs
+	mov %ax, %gs
+	mov %ax, %ss
+
+	jmpq	*(boot64_ind - boot)
+
+boot64_ind:
+	.quad boot64_real
+
+boot64_real:
+
+	/* switch to real pagetables */
+	mov	$(emu_pgd - boot), %rax
+	mov	%rax, %cr3
+
+	lea boot_stack_high(%rip), %rsp	/* setup stack */
+	sub $176, %rsp			/* sizeof(struct regs_64) */
+	mov  %rsp,%rdi			/* struct regs pointer */
+	cmp $0, %rbx
+	jne secondary
+	call do_boot
+	POP_REGS
+	RETURN
+
+secondary:
+	mov  %rdi,%rsi
+	mov  %rbx,%rdi
+	call do_boot_secondary
+	POP_REGS
+	RETURN
+
+/* --- traps/faults handled by emu --- */
+
+ENTRY(debug_int1)
+	PUSH_ERROR
+	DO_TRAP 1 do_int1
+
+ENTRY(debug_int3)
+	PUSH_ERROR
+	DO_TRAP 3 do_int3
+
+ENTRY(illegal_instruction)
+	PUSH_ERROR
+	DO_TRAP 6 do_illegal_instruction
+
+ENTRY(no_device)
+	PUSH_ERROR
+	DO_TRAP 7 do_lazy_fpu
+
+ENTRY(double_fault)
+	DO_TRAP 8 do_double_fault
+
+ENTRY(general_protection)
+	DO_TRAP 13 do_general_protection
+
+ENTRY(page_fault)
+	DO_TRAP 14 do_page_fault
+
+/* --- traps/faults forwarded to guest --- */
+
+ENTRY(division_by_zero)
+	PUSH_ERROR
+	PUSH_TRAP_RBP 0
+	jmp guest_forward
+
+ENTRY(nmi)
+	PUSH_ERROR
+	PUSH_TRAP_RBP 2
+	jmp guest_forward
+
+ENTRY(overflow)
+	PUSH_ERROR
+	PUSH_TRAP_RBP 4
+	jmp guest_forward
+
+ENTRY(bound_check)
+	PUSH_ERROR
+	PUSH_TRAP_RBP 5
+	jmp guest_forward
+
+ENTRY(coprocessor)
+	PUSH_ERROR
+	PUSH_TRAP_RBP 9
+	jmp guest_forward
+
+ENTRY(invalid_tss)
+	PUSH_TRAP_RBP 10
+	jmp guest_forward
+
+ENTRY(segment_not_present)
+	PUSH_TRAP_RBP 11
+	jmp guest_forward
+
+ENTRY(stack_fault)
+	PUSH_TRAP_RBP 12
+	jmp guest_forward
+
+ENTRY(floating_point)
+	PUSH_ERROR
+	PUSH_TRAP_RBP 16
+	jmp guest_forward
+
+ENTRY(alignment)
+	PUSH_TRAP_RBP 17
+	jmp guest_forward
+
+ENTRY(machine_check)
+	PUSH_ERROR
+	PUSH_TRAP_RBP 18
+	jmp guest_forward
+
+ENTRY(simd_floating_point)
+	PUSH_ERROR
+	PUSH_TRAP_RBP 19
+	jmp guest_forward
+
+guest_forward:
+	PUSH_REGS
+	call do_guest_forward
+	POP_REGS
+	RETURN
+
+/* --- interrupts 32 ... 255 --- */
+
+ENTRY(smp_flush_tlb)
+	PUSH_ERROR
+	DO_TRAP -1 do_smp_flush_tlb
+
+ENTRY(int_80)
+	PUSH_ERROR
+	DO_TRAP -1 do_int_80
+
+ENTRY(irq_entries)
+vector=0
+.rept 256
+	.align 16
+	PUSH_ERROR
+	PUSH_TRAP_RBP vector
+	jmp irq_common
+vector=vector+1
+.endr
+
+ENTRY(irq_common)
+	PUSH_REGS
+	call do_irq
+	POP_REGS
+	RETURN
+
+/* --- syscall --- */
+
+ENTRY(trampoline_syscall)
+	/* we arrive here via per-cpu trampoline
+	 * which sets up the stack for us */
+	PUSH_ERROR
+	PUSH_TRAP_RBP -1
+	PUSH_REGS
+	call do_syscall
+	POP_REGS
+
+	add $16, %rsp			/* remove error code + trapno */
+	cmp $-1, -8(%rsp)
+	je syscall_vmexit
+	cmp $-2, -8(%rsp)
+	je syscall_iretq
+
+syscall_default:
+	/* default sysret path */
+	popq %rcx			/* rip    */
+	popq %r11			/* cs     */
+	popq %r11			/* rflags */
+	popq %rsp			/* rsp    */
+	sysretq
+
+syscall_vmexit:
+	/* bounce hypercall to userspace */
+	popq %rcx			/* rip    */
+	popq %r11			/* cs     */
+	popq %r11			/* rflags */
+	popq %rsp			/* rsp    */
+	out %al, $0xe0			/* let userspace handle it */
+	sysretq
+
+syscall_iretq:
+	/* return from syscall via iretq */
+	iretq
+
+/* helpers */
+
+ENTRY(broken_memcpy_pf)
+	mov %rdx,%rcx
+	cld
+1:	rep movsb
+	xor %rax,%rax
+8:	ret
+
+	.section .exfix, "ax"
+9:	mov $-1, %rax
+	jmp 8b
+	.previous
+
+	.section .extab, "a"
+	.align 8
+	.quad 1b,9b
+	.previous
+
+/* some 16 bit code for smp boot */
+
+	.code16
+	.align 4096
+ENTRY(sipi)
+	mov $0x00060000, %eax  /* EMUDEV_CMD_INIT_SECONDARY_VCPU */
+	outl %eax, $0xe8       /* EMUDEV_REG_COMMAND */
+	hlt
+	.code64
+
+/* emu boot stack, including syscall trampoline template */
+
+	.data
+	.globl boot_stack_low, boot_stack_high
+	.globl cpu_ptr
+	.globl trampoline_start, trampoline_patch, trampoline_stop
+	.align 4096
+boot_stack_low:
+cpu_ptr:
+	.quad 0
+trampoline_start:
+	movq %rsp, boot_stack_high-16(%rip)
+	leaq boot_stack_high-16(%rip), %rsp
+	push %r11				/* rflags	 */
+	mov  $0xdeadbeef, %r11			/* C code must fix cs & ss */
+	movq %r11, boot_stack_high-8(%rip)	/* ss	     */
+	push %r11				/* cs	     */
+	push %rcx				/* rip	    */
+
+	.byte 0x49, 0xbb			/* mov data, %r11 ...	 */
+trampoline_patch:
+	.quad 0					/* ... data, for jump to ...  */
+	jmpq *%r11				/* ... trampoline_syscall     */
+trampoline_stop:
+	.align 4096
+boot_stack_high:
+
+/* boot page tables */
+
+#define pageflags 0x063 /* present, rw, accessed, dirty */
+#define largepage 0x080 /* pse */
+
+	.section .pt, "aw"
+	.globl emu_pgd
+
+	.align 4096
+boot_pgd:
+	.quad emu_pud - 0xffff830000000000 + pageflags
+	.fill 261,8,0
+	.quad emu_pud - 0xffff830000000000 + pageflags
+	.fill 249,8,0
+
+	.align 4096
+emu_pgd:
+	.fill 262,8,0
+	.quad emu_pud - 0xffff830000000000 + pageflags
+	.fill 249,8,0
+
+	.align 4096
+emu_pud:
+	.quad emu_pmd - 0xffff830000000000 + pageflags
+	.fill 511,8,0
+
+	.align 4096
+emu_pmd:
+i = 0
+	.rept 512
+	.quad pageflags + largepage | (i << 21)
+	i = i + 1
+	.endr
+	.align 4096
diff --git a/pc-bios/xenner/xenner64.h b/pc-bios/xenner/xenner64.h
new file mode 100644
index 0000000..92d956e
--- /dev/null
+++ b/pc-bios/xenner/xenner64.h
@@ -0,0 +1,117 @@
+#include <xen/foreign/x86_64.h>
+
+struct regs_64 {
+    /* pushed onto stack before calling into C code */
+    uint64_t rax;
+    uint64_t rbx;
+    uint64_t rcx;
+    uint64_t rdx;
+    uint64_t r8;
+    uint64_t r9;
+    uint64_t r10;
+    uint64_t r11;
+    uint64_t r12;
+    uint64_t r13;
+    uint64_t r14;
+    uint64_t r15;
+    uint64_t rsi;
+    uint64_t rdi;
+    uint64_t rbp;
+    uint64_t trapno;
+    /* trap / fault / int created */
+    uint64_t error;
+    uint64_t rip;
+    uint64_t cs;
+    uint64_t rflags;
+    uint64_t rsp;
+    uint64_t ss;
+};
+
+/* 64bit defines */
+#define EMUNAME   "xenner64"
+#define regs      regs_64
+#define fix_sel   fix_sel64
+#define fix_desc  fix_desc64
+#define ureg_t    uint64_t
+#define sreg_t    int64_t
+#define PRIxREG   PRIx64
+#define tss(_v)   ((0xe000 >> 3) +  8 + (((_v)->id) << 2))
+#define ldt(_v)   ((0xe000 >> 3) + 10 + (((_v)->id) << 2))
+
+/* xenner-data.c */
+extern struct idt_64 page_aligned xen_idt[256];
+
+/* xenner-main.c */
+asmlinkage void do_int_80(struct regs_64 *regs);
+
+/* xenner-hcall.c */
+void switch_mode(struct xen_cpu *cpu);
+int is_kernel(struct xen_cpu *cpu);
+asmlinkage void do_syscall(struct regs_64 *regs);
+
+/* xenner-mm.c */
+void pgtable_walk(int level, uint64_t va, uint64_t root_mfn);
+int pgtable_fixup_flag(struct xen_cpu *cpu, uint64_t va, uint32_t flag);
+int pgtable_is_present(uint64_t va, uint64_t root_mfn);
+void *map_page(uint64_t maddr);
+void *fixmap_page(struct xen_cpu *cpu, uint64_t maddr);
+static inline void free_page(void *ptr) {}
+uint64_t *find_pte_64(uint64_t va);
+
+/* macros */
+#define context_is_emu(_r)       (((_r)->cs & 0x03) == 0x00)
+#define context_is_kernel(_v,_r) (((_r)->cs & 0x03) == 0x03 && is_kernel(_v))
+#define context_is_user(_v,_r)   (((_r)->cs & 0x03) == 0x03 && !is_kernel(_v))
+
+#define addr_is_emu(va)     (((va) >= XEN_M2P_64) && ((va) < XEN_DOM_64))
+#define addr_is_kernel(va)  ((va) >= XEN_DOM_64)
+#define addr_is_user(va)    ((va) < XEN_M2P_64)
+
+/* inline asm bits */
+static inline int wrmsr_safe(uint32_t msr, uint32_t ax, uint32_t dx)
+{
+    int ret;
+    asm volatile("1:  wrmsr                \n"
+                 "    xorl %0,%0           \n"
+                 "2:  nop                  \n"
+
+                 ".section .exfix, \"ax\"  \n"
+                 "3:  mov $-1,%0           \n"
+                 "    jmp 2b               \n"
+                 ".previous                \n"
+
+                 ".section .extab, \"a\"   \n"
+                 "    .align 8             \n"
+                 "    .quad 1b,3b          \n"
+                 ".previous                \n"
+                 : "=r" (ret)
+                 : "c" (msr), "a" (ax), "d" (dx));
+    return ret;
+}
+
+static inline int memcpy_pf(void *dest, const void *src, size_t bytes)
+{
+    int ret;
+
+    asm volatile("    cld                  \n"
+                 "91: rep movsb            \n"
+                 "    xor %[ret],%[ret]    \n"
+                 "98:                      \n"
+
+                 ".section .exfix, \"ax\"  \n"
+                 "99: mov $-1, %[ret]      \n"
+                 "    jmp 98b              \n"
+                 ".previous                \n"
+
+                 ".section .extab, \"a\"   \n"
+                 "    .align 8             \n"
+                 "    .quad 91b,99b        \n"
+                 ".previous                \n"
+                 : [ ret ] "=a" (ret),
+                   [ rsi ] "+S" (src),
+                   [ rdi ] "+D" (dest),
+                   [ rcx ] "+c" (bytes)
+                 :
+                 : "memory" );
+    return ret;
+}
diff --git a/pc-bios/xenner/xenner64.lds b/pc-bios/xenner/xenner64.lds
new file mode 100644
index 0000000..0b580a9
--- /dev/null
+++ b/pc-bios/xenner/xenner64.lds
@@ -0,0 +1,38 @@
+OUTPUT_FORMAT("elf64-x86-64")
+
+SECTIONS
+{
+    . = 0xffff830000000000;
+    _vstart = .;
+    phys_startup_64 = 0x0;
+
+    /* code */
+    .text : AT(ADDR(.text) - 0xffff830000000000) { *(.text) }
+    . = ALIGN(4k);
+    .exfix  : { *(.exfix)  }
+
+    /* data, ro */
+    . = ALIGN(4k);
+    .note.gnu.build-id : { *(.note.gnu.build-id) }
+    . = ALIGN(4k);
+    _estart = .;
+    .extab  : { *(.extab)  }
+    _estop  = .;
+    . = ALIGN(4k);
+    .rodata : { *(.rodata) }
+
+    /* data, rw */
+    . = ALIGN(4k);
+    .pt     : { *(.pt) }
+    . = ALIGN(4k);
+    .pgdata : { *(.pgdata) }
+    . = ALIGN(4k);
+    .data   : { *(.data)   }
+
+    /* bss */
+    . = ALIGN(4k);
+    .bss    : { *(.bss)    }
+
+    . = ALIGN(4k);
+    _vstop  = .;
+}
-- 
1.6.0.2


* [Qemu-devel] [PATCH 09/40] xenner: kernel: Global data
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (7 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 10/40] xenner: kernel: Hypercall handler (i386) Alexander Graf
                   ` (30 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

We need to access global variables from various points in the code. Keep them
in a single file, so we know where they are.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
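Note for reviewers (not part of the commit): the *_bits tables are sparse
arrays of bit names, built with designated initializers. The code that
prints them is not in this patch. The sketch below shows one way such a
table can be used to decode a register value; it is a hosted example that
uses snprintf(), which the freestanding xenner kernel may not have, and
format_bits() is a hypothetical name. The table is a copy of cr0_bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Copy of cr0_bits from xenner-data.c; unnamed bits stay NULL. */
static const char *const cr0_bits[32] = {
    [  0 ] = "PE", [  1 ] = "MP", [  2 ] = "EM", [  3 ] = "TS",
    [  4 ] = "ET", [  5 ] = "NE", [ 16 ] = "WP", [ 18 ] = "AM",
    [ 29 ] = "NW", [ 30 ] = "CD", [ 31 ] = "PG",
};

/*
 * Write the names of all set, named bits into buf, separated by spaces.
 * Returns the number of characters written, not counting the terminator.
 * Stops early if buf is too small; buf is always NUL-terminated.
 */
static size_t format_bits(uint32_t value, const char *const names[32],
                          char *buf, size_t len)
{
    size_t used = 0;

    assert(len > 0);
    buf[0] = '\0';
    for (int bit = 0; bit < 32; bit++) {
        if (!(value & (UINT32_C(1) << bit)) || !names[bit]) {
            continue;
        }
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", names[bit]);
        if (n < 0 || (size_t)n >= len - used) {
            break;                      /* out of space, output truncated */
        }
        used += (size_t)n;
    }
    return used;
}
```

Decoding the CR0 values loaded by the boot code this way gives "PE PG" for
0x80000001 (xenner32.S) and "PE WP PG" for 0x80010001 (xenner64.S).
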
 pc-bios/xenner/xenner-data.c |  142 ++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 142 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-data.c

diff --git a/pc-bios/xenner/xenner-data.c b/pc-bios/xenner/xenner-data.c
new file mode 100644
index 0000000..cbe399c
--- /dev/null
+++ b/pc-bios/xenner/xenner-data.c
@@ -0,0 +1,142 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner random data structures
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "xenner.h"
+
+#ifdef CONFIG_64BIT
+
+struct idt_64 page_aligned xen_idt[256];
+
+#else
+
+struct descriptor_32 page_aligned xen_idt[256];
+
+page_aligned uint32_t maps_32[PTE_COUNT_32];
+page_aligned uint64_t maps_pae[PTE_COUNT_PAE];
+uint32_t maps_refcnt[MAPS_MAX];
+
+#endif
+
+unsigned long *m2p;
+
+page_aligned struct vcpu_guest_context boot_ctxt;
+LIST_HEAD(cpus);
+
+int grant_frames;
+struct grant_entry_v1 page_aligned grant_table[GRANT_ENTRIES];
+struct shared_info page_aligned shared_info;
+struct xenner_info page_aligned vminfo = {
+    .abi_version          = XENNER_ABI_VERSION,
+};
+
+struct vmconfig  vmconf;
+xen_callback_t   xencb[8];
+struct trap_info xentr[256];
+int wrpt;
+
+/* trap/fault info */
+const struct trapinfo trapinfo[32] = {
+    [  0 ] = { .name = "division by zero",              .ec = 0, .lvl = 1 },
+    [  1 ] = { .name = "debug interrupt",               .ec = 0, .lvl = 3 },
+    [  2 ] = { .name = "NMI",                           .ec = 0, .lvl = 1 },
+    [  3 ] = { .name = "breakpoint",                    .ec = 0, .lvl = 3 },
+    [  4 ] = { .name = "overflow",                      .ec = 0, .lvl = 1 },
+    [  5 ] = { .name = "bound check",                   .ec = 0, .lvl = 1 },
+    [  6 ] = { .name = "illegal instruction",           .ec = 0, .lvl = 1 },
+    [  7 ] = { .name = "device not present",            .ec = 0, .lvl = 1 },
+    [  8 ] = { .name = "double fault",                  .ec = 1, .lvl = 0 },
+    [  9 ] = { .name = "coprocessor",                   .ec = 0, .lvl = 1 },
+    [ 10 ] = { .name = "invalid TSS",                   .ec = 1, .lvl = 0 },
+    [ 11 ] = { .name = "segment not present",           .ec = 1, .lvl = 1 },
+    [ 12 ] = { .name = "stack fault",                   .ec = 1, .lvl = 1 },
+    [ 13 ] = { .name = "general protection fault",      .ec = 1, .lvl = 1 },
+    [ 14 ] = { .name = "page fault",                    .ec = 1, .lvl = 1 },
+    [ 16 ] = { .name = "floating point exception",      .ec = 0, .lvl = 1 },
+    [ 17 ] = { .name = "alignment",                     .ec = 1, .lvl = 1 },
+    [ 18 ] = { .name = "machine check",                 .ec = 0, .lvl = 1 },
+    [ 19 ] = { .name = "SIMD floating point exception", .ec = 0, .lvl = 1 },
+};
+
+const char *cr0_bits[32] = {
+    [  0 ] = "PE",
+    [  1 ] = "MP",
+    [  2 ] = "EM",
+    [  3 ] = "TS",
+    [  4 ] = "ET",
+    [  5 ] = "NE",
+    [ 16 ] = "WP",
+    [ 18 ] = "AM",
+    [ 29 ] = "NW",
+    [ 30 ] = "CD",
+    [ 31 ] = "PG",
+};
+
+const char *cr4_bits[32] = {
+    [  0 ] = "VME",
+    [  1 ] = "PVI",
+    [  2 ] = "TSD",
+    [  3 ] = "DE",
+    [  4 ] = "PSE",
+    [  5 ] = "PAE",
+    [  6 ] = "MCE",
+    [  7 ] = "PGE",
+    [  8 ] = "PCE",
+    [  9 ] = "OSFXSR",
+    [ 10 ] = "OSXMMEXCPT",
+};
+
+const char *pg_bits[32] = {
+    [ 0 ] = "present",
+    [ 1 ] = "write",
+    [ 2 ] = "user",
+    [ 3 ] = "pwt",
+    [ 4 ] = "pcd",
+    [ 5 ] = "access",
+    [ 6 ] = "dirty",
+    [ 7 ] = "pse",
+    [ 8 ] = "global",
+};
+
+const char *rflags_bits[32] = {
+    [  0 ] = "CF",
+    [  2 ] = "PF",
+
+    [  4 ] = "AF",
+    [  6 ] = "ZF",
+
+    [  8 ] = "TF",
+    [  9 ] = "IF",
+    [ 10 ] = "DF",
+    [ 11 ] = "OF",
+
+    [ 12 ] = "IOPL-1",
+    [ 13 ] = "IOPL-2",
+    [ 14 ] = "NT",
+
+    [ 16 ] = "RF",
+    [ 17 ] = "VM",
+    [ 18 ] = "AC",
+    [ 19 ] = "VIF",
+
+    [ 20 ] = "VIP",
+    [ 21 ] = "ID",
+};
-- 
1.6.0.2

* [Qemu-devel] [PATCH 10/40] xenner: kernel: Hypercall handler (i386)
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Xenner handles guest hypercalls itself. This patch adds all the handling
code that is i386 specific.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
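Reviewer note (not part of the commit): the dispatch scheme that
do_hypercall() relies on can be sketched in isolation. This is only a
minimal illustration under stated assumptions; demo_hcall, demo_dispatch,
DEMO_HCALL_MAX and both handlers are hypothetical names that do not exist
in xenner. It shows a sparse table of function pointers indexed by
hypercall number, with -ENOSYS for numbers that are out of range or have
no handler.

```c
#include <errno.h>
#include <stdint.h>

#define DEMO_HCALL_MAX 4            /* hypothetical table size */

typedef int32_t (*demo_hcall)(uint32_t *args);

/* Hypothetical handler: adds its first two arguments. */
static int32_t demo_add(uint32_t *args)
{
    return (int32_t)(args[0] + args[1]);
}

/* Hypothetical handler: accepts the call and does nothing. */
static int32_t demo_noop(uint32_t *args)
{
    (void)args;
    return 0;
}

/* Sparse table; slots without an initializer stay NULL. */
static demo_hcall demo_hcalls[DEMO_HCALL_MAX] = {
    [ 1 ] = demo_add,
    [ 3 ] = demo_noop,
};

static int32_t demo_dispatch(uint32_t nr, uint32_t *args)
{
    if (nr >= DEMO_HCALL_MAX) {
        return -ENOSYS;             /* invalid hypercall number */
    }
    if (!demo_hcalls[nr]) {
        return -ENOSYS;             /* no hypercall handler */
    }
    return demo_hcalls[nr](args);
}
```

With args = { 2, 3, ... }, demo_dispatch(1, args) yields 5, while slots 0
and 99 yield -ENOSYS. When the same index appears twice in a designated
initializer list, C keeps the last entry, which is worth keeping in mind
for the two __HYPERVISOR_set_debugreg lines in the hcalls[] table of this
patch.
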
 pc-bios/xenner/xenner-hcall32.c |  299 +++++++++++++++++++++++++++++++++++++++
 1 files changed, 299 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-hcall32.c

diff --git a/pc-bios/xenner/xenner-hcall32.c b/pc-bios/xenner/xenner-hcall32.c
new file mode 100644
index 0000000..45a3046
--- /dev/null
+++ b/pc-bios/xenner/xenner-hcall32.c
@@ -0,0 +1,299 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner 32 bit hypercall handlers
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <errno.h>
+
+#include "xenner.h"
+
+/* --------------------------------------------------------------------- */
+
+typedef int32_t (*xen_hcall)(struct xen_cpu *cpu, uint32_t *args);
+static int32_t multicall(struct xen_cpu *cpu, uint32_t *args);
+
+/* --------------------------------------------------------------------- */
+
+static int32_t update_va_mapping(struct xen_cpu *cpu, uint32_t *args)
+{
+    uint32_t va    = args[0];
+    uint64_t val   = args[1] | ((uint64_t)args[2] << 32);
+    uint32_t flags = args[3];
+    void *map = NULL;
+    pte_t *ptr = find_pte_lpt(va);
+    pte_t pte;
+
+    printk(5, "%s: va %" PRIx32 " val %" PRIx64 " pte %p\n",
+           __FUNCTION__, va, val, ptr);
+
+    pgtable_walk(cpu, (uint32_t)ptr);
+
+    if (va >= XEN_M2P) {
+        goto inval;
+    }
+
+    if (copypt_pf(&pte, ptr) < 0) {
+        printk(2, "%s: lpt rd fault: va %" PRIx32 " val %" PRIx64 "\n",
+               __FUNCTION__, va, val);
+        goto inval;
+    }
+
+    if ((pte & ~_PAGE_RW) == val && !(val & _PAGE_USER)) {
+        /* ignore make_readonly */
+        vminfo.faults[XEN_FAULT_UPDATE_VA_FIX_RO]++;
+        goto doflags;
+    }
+
+    pte = val;
+    if (!copypt_pf(ptr, &pte)) {
+        goto doflags;
+    }
+
+    printk(1, "%s: lpt wr fault: va %" PRIx32 " val %" PRIx64 "\n",
+           __FUNCTION__, va, val);
+
+    ptr = find_pte_map(cpu, va);
+    if (!ptr) {
+        goto inval;
+    }
+
+    map = ptr;
+    if (!copypt_pf(ptr, &pte)) {
+        goto doflags;
+    }
+
+    printk(1, "%s: map wr fault: va %" PRIx32 " val %" PRIx64 "\n",
+           __FUNCTION__, va, val);
+    goto inval;
+
+doflags:
+    if (map) {
+        free_page(map);
+    }
+    switch (flags & UVMF_FLUSHTYPE_MASK) {
+    case UVMF_NONE:
+        break;
+    case UVMF_TLB_FLUSH:
+        flush_tlb();
+        break;
+    case UVMF_INVLPG:
+        flush_tlb_addr(va);
+        break;
+    }
+    return 0;
+
+inval:
+    if (map) {
+        free_page(map);
+    }
+    return -EINVAL;
+}
+
+static int32_t mmu_update(struct xen_cpu *cpu, uint32_t *args)
+{
+    uint64_t *reqs = (void*)args[0];
+    uint32_t count = args[1];
+    uint32_t *done = (void*)args[2];
+    uint32_t dom   = args[3];
+    uint32_t i;
+
+    if (dom != DOMID_SELF) {
+        printk(1, "%s: foreigndom not supported (domid %d)\n",
+               __FUNCTION__, dom);
+        return -ENOSYS;
+    }
+
+    for (i = 0; i < count; i++) {
+        switch (reqs[0] & 3) {
+        case MMU_NORMAL_PT_UPDATE:
+        {
+            pte_t *pte = map_page(reqs[0]);
+            *pte = reqs[1];
+            free_page(pte);
+#ifdef CONFIG_PAE
+            if (read_cr3_mfn(cpu) == addr_to_frame(reqs[0])) {
+                update_emu_mappings(read_cr3_mfn(cpu));
+                flush_tlb();
+            }
+#endif
+            break;
+        }
+        case MMU_MACHPHYS_UPDATE:
+        {
+            xen_pfn_t gmfn = reqs[0] >> PAGE_SHIFT;
+            xen_pfn_t gpfn = reqs[1];
+            if (gmfn < vmconf.mfn_guest)
+                panic("suspicious m2p update", NULL);
+            m2p[gmfn] = gpfn;
+            break;
+        }
+        default:
+            return -ENOSYS;
+        }
+        reqs += 2;
+    }
+    if (done) {
+        *done = i;
+    }
+
+    return 0;
+}
+
+static int32_t iret(struct xen_cpu *cpu, uint32_t *args)
+{
+    struct regs_32 *regs = (void*)cpu->stack_high - sizeof(*regs);
+    uint32_t eflags;
+    uint32_t *stack;
+
+    stack = (void*)regs->esp;
+
+    regs->eax     = stack[0];
+    regs->eip     = stack[1];
+    regs->cs      = stack[2];
+    eflags        = stack[3];
+    regs->eflags  = (eflags & ~X86_EFLAGS_IOPL) | X86_EFLAGS_IF;
+
+    if (context_is_emu(regs)) {
+        /* not allowed */
+        panic("guest tried iret to ring0", regs);
+
+    } else if (context_is_kernel(cpu, regs)) {
+        /* kernel -> kernel  --  just move stack pointer */
+        regs->esp += 4*4;
+
+    } else {
+        /* kernel -> user  --  switch back stack */
+        regs->esp = stack[4];
+        regs->ss  = stack[5];
+    }
+
+    /* Restore upcall mask from supplied EFLAGS.IF. */
+    if (eflags & X86_EFLAGS_IF) {
+        guest_sti(cpu);
+    } else {
+        guest_cli(cpu);
+    }
+
+    return -EINTR;
+}
+
+/* --------------------------------------------------------------------- */
+
+static xen_hcall hcalls[XEN_HCALL_MAX] = {
+    [ __HYPERVISOR_update_va_mapping ]       = update_va_mapping,
+    [ __HYPERVISOR_mmu_update ]              = mmu_update,
+    [ __HYPERVISOR_mmuext_op ]               = mmuext_op,
+    [ __HYPERVISOR_update_descriptor ]       = update_descriptor,
+    [ __HYPERVISOR_stack_switch ]            = stack_switch,
+    [ __HYPERVISOR_multicall ]               = multicall,
+    [ __HYPERVISOR_iret ]                    = iret,
+    [ __HYPERVISOR_fpu_taskswitch ]          = fpu_taskswitch,
+    [ __HYPERVISOR_grant_table_op ]          = grant_table_op,
+    [ __HYPERVISOR_xen_version ]             = xen_version,
+    [ __HYPERVISOR_vm_assist ]               = vm_assist,
+    [ __HYPERVISOR_sched_op ]                = sched_op,
+    [ __HYPERVISOR_sched_op_compat ]         = sched_op_compat,
+    [ __HYPERVISOR_memory_op ]               = memory_op,
+    [ __HYPERVISOR_set_trap_table ]          = set_trap_table,
+    [ __HYPERVISOR_set_callbacks ]           = set_callbacks,
+    [ __HYPERVISOR_callback_op ]             = callback_op,
+    [ __HYPERVISOR_set_gdt ]                 = set_gdt,
+    [ __HYPERVISOR_vcpu_op ]                 = vcpu_op,
+    [ __HYPERVISOR_event_channel_op ]        = event_channel_op,
+    [ __HYPERVISOR_event_channel_op_compat ] = event_channel_op_compat,
+    [ __HYPERVISOR_set_timer_op ]            = set_timer_op,
+    [ __HYPERVISOR_physdev_op ]              = physdev_op,
+    [ __HYPERVISOR_get_debugreg ]            = get_debugreg,
+    [ __HYPERVISOR_set_debugreg ]            = set_debugreg,
+    [ __HYPERVISOR_console_io ]              = console_io,
+
+    [ __HYPERVISOR_physdev_op_compat ]       = error_noperm,
+    [ __HYPERVISOR_platform_op ]             = error_noperm,
+    [ __HYPERVISOR_set_debugreg ]            = error_noop,
+};
+
+static int32_t multicall(struct xen_cpu *cpu, uint32_t *args)
+{
+    struct multicall_entry *calls = (void*)args[0];
+    uint32_t i, count = args[1];
+    uint32_t margs[6];
+
+    for (i = 0; i < count; i++) {
+        if (!hcalls[calls[i].op])
+            panic("unknown hypercall in multicall list", NULL);
+        vminfo.hcalls[calls[i].op]++;
+        margs[0] = calls[i].args[0];
+        margs[1] = calls[i].args[1];
+        margs[2] = calls[i].args[2];
+        margs[3] = calls[i].args[3];
+        margs[4] = calls[i].args[4];
+        margs[5] = calls[i].args[5];
+        calls[i].result = hcalls[calls[i].op](cpu, margs);
+    }
+    return 0;
+}
+
+asmlinkage void do_hypercall(struct regs_32 *regs)
+{
+    uint32_t args[6] = { 0 };
+    uint32_t retval = -ENOSYS;
+    struct xen_cpu *cpu = get_cpu();
+
+    printk(3, "%s: %s #%d\n", __FUNCTION__,
+           __hypervisor_name(regs->eax), regs->eax);
+
+    if (regs->eax >= XEN_HCALL_MAX) {
+        /* invalid hypercall number */
+        goto handled;
+    }
+    if (!hcalls[regs->eax]) {
+        /* no hypercall handler */
+        goto handled;
+    }
+
+    /* do call */
+    vminfo.hcalls[regs->eax]++;
+    args[0] = regs->ebx;
+    args[1] = regs->ecx;
+    args[2] = regs->edx;
+    args[3] = regs->esi;
+    args[4] = regs->edi;
+    args[5] = regs->ebp;
+    retval = hcalls[regs->eax](cpu, args);
+
+    if (-EINTR == retval) {
+        goto iret;
+    }
+
+handled:
+    if (-ENOSYS == retval) {
+        printk(0, "hypercall %s (#%d)  |  arg0 0x%x  arg1 0x%x  -> -ENOSYS\n",
+               __hypervisor_name(regs->eax), regs->eax, args[0], args[1]);
+    }
+    regs->eax = retval;
+    regs->error = HCALL_HANDLED;
+    evtchn_try_forward(cpu, regs);
+    return;
+
+iret:
+    regs->error = HCALL_HANDLED;
+    evtchn_try_forward(cpu, regs);
+    return;
+}
-- 
1.6.0.2

* [Qemu-devel] [PATCH 11/40] xenner: kernel: Hypercall handler (x86_64)
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Xenner handles guest hypercalls itself. This patch adds all the handling
code that is x86_64 specific.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
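Reviewer note (not part of the commit): the flag handling in the 64 bit
iret() handler can be sketched on its own. This is a simplified
illustration, not the xenner code; demo_fix_rflags and
struct demo_iret_result are hypothetical names, and the virtual interrupt
flag is modelled as a plain int instead of the guest_sti()/guest_cli()
calls. The bit values are the architectural EFLAGS positions. The idea is
that IOPL and VM are stripped from the guest-supplied value, IF is always
set in the hardware flags, and the guest's own IF request is tracked
separately.

```c
#include <stdint.h>

/* Architectural EFLAGS/RFLAGS bits. */
#define DEMO_EFLAGS_IF   0x00000200u    /* bit 9 */
#define DEMO_EFLAGS_IOPL 0x00003000u    /* bits 12-13 */
#define DEMO_EFLAGS_VM   0x00020000u    /* bit 17 */

struct demo_iret_result {
    uint64_t hw_rflags;     /* value that ends up in the real RFLAGS */
    int      guest_if;      /* virtual interrupt flag seen by the guest */
};

static struct demo_iret_result demo_fix_rflags(uint64_t guest_rflags)
{
    struct demo_iret_result r;

    r.hw_rflags  = guest_rflags;
    r.hw_rflags &= ~(uint64_t)(DEMO_EFLAGS_IOPL | DEMO_EFLAGS_VM);
    r.hw_rflags |= DEMO_EFLAGS_IF;
    r.guest_if   = (guest_rflags & DEMO_EFLAGS_IF) != 0;
    return r;
}
```

A guest value with VM and IOPL set but IF clear comes back with only IF
added to the remaining bits in hw_rflags and guest_if == 0, so the guest
keeps its events masked while the emulation layer can still take hardware
interrupts.
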
 pc-bios/xenner/xenner-hcall64.c |  323 +++++++++++++++++++++++++++++++++++++++
 1 files changed, 323 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-hcall64.c

diff --git a/pc-bios/xenner/xenner-hcall64.c b/pc-bios/xenner/xenner-hcall64.c
new file mode 100644
index 0000000..93dfb99
--- /dev/null
+++ b/pc-bios/xenner/xenner-hcall64.c
@@ -0,0 +1,323 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner 64 bit hypercall handlers
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <inttypes.h>
+#include <errno.h>
+#include <xen/xen.h>
+
+#include "msr-index.h"
+
+#include "xenner.h"
+
+/* --------------------------------------------------------------------- */
+
+typedef int64_t (*xen_hcall)(struct xen_cpu *cpu, uint64_t *args);
+static int64_t multicall(struct xen_cpu *cpu, uint64_t *args);
+
+/* --------------------------------------------------------------------- */
+
+void switch_mode(struct xen_cpu *cpu)
+{
+    vminfo.faults[XEN_FAULT_OTHER_SWITCH_MODE]++;
+    cpu->user_mode = !cpu->user_mode;
+    if (cpu->user_mode) {
+        pv_write_cr3(cpu, cpu->user_cr3_mfn);
+    } else {
+        pv_write_cr3(cpu, cpu->kernel_cr3_mfn);
+    }
+    __asm__("swapgs" ::: "memory");
+}
+
+int is_kernel(struct xen_cpu *cpu)
+{
+    return !cpu->user_mode;
+}
+
+/* --------------------------------------------------------------------- */
+
+static int64_t update_va_mapping(struct xen_cpu *cpu, uint64_t *args)
+{
+    uint64_t va    = args[0];
+    uint64_t val   = args[1];
+    uint64_t flags = args[2];
+    uint64_t *pte;
+    uint64_t pte_val;
+
+    pte = find_pte_64(va);
+    if (addr_is_kernel(va)) {
+        if (test_pgflag_64(val, _PAGE_PRESENT) &&
+            !test_pgflag_64(val, _PAGE_USER)) {
+            vminfo.faults[XEN_FAULT_UPDATE_VA_FIX_USER]++;
+            val |= _PAGE_USER;
+        }
+    }
+
+    if (memcpy_pf(&pte_val, pte, sizeof(uint64_t)) < 0) {
+        /* pte is missing levels below - get out quick */
+        return -1;
+    }
+
+    if (pte_val != val) {
+        *pte = val;
+    }
+
+    switch (flags & UVMF_FLUSHTYPE_MASK) {
+    case UVMF_NONE:
+        break;
+    case UVMF_TLB_FLUSH:
+        flush_tlb();
+        break;
+    case UVMF_INVLPG:
+        flush_tlb_addr(va);
+        break;
+    }
+    return 0;
+}
+
+static int64_t mmu_update(struct xen_cpu *cpu, uint64_t *args)
+{
+    uint64_t *reqs = (void*)args[0];
+    uint64_t count = args[1];
+    uint64_t *done = (void*)args[2];
+    uint64_t dom   = args[3];
+    uint64_t *pte;
+    int i;
+
+    if (dom != DOMID_SELF) {
+        printk(1, "%s: foreigndom not supported\n", __FUNCTION__);
+        return -ENOSYS;
+    }
+
+    for (i = 0; i < count; i++) {
+        switch (reqs[0] & 3) {
+        case MMU_NORMAL_PT_UPDATE:
+            pte = map_page(reqs[0]);
+            *pte = reqs[1];
+            break;
+        case MMU_MACHPHYS_UPDATE:
+        {
+            xen_pfn_t gmfn = reqs[0] >> PAGE_SHIFT;
+            xen_pfn_t gpfn = reqs[1];
+            if (gmfn < vmconf.mfn_guest)
+                panic("suspicious m2p update", NULL);
+            m2p[gmfn] = gpfn;
+            break;
+        }
+        default:
+            return -ENOSYS;
+        }
+        reqs += 2;
+    }
+    if (done) {
+        *done = i;
+    }
+
+    return 0;
+}
+
+static int64_t iret(struct xen_cpu *cpu, uint64_t *args)
+{
+    struct regs_64 *regs = (void*)cpu->stack_high - sizeof(*regs);
+    struct iret_context stack;
+
+    stack = *((struct iret_context*)regs->rsp);
+
+    if ((stack.cs & 3) == 3) {
+        /* return to userspace */
+        switch_mode(cpu);
+    }
+
+    regs->rip     = stack.rip;
+    regs->cs      = fix_sel64(stack.cs);
+    regs->rsp     = stack.rsp;
+    regs->ss      = fix_sel64(stack.ss);
+    regs->rflags  = stack.rflags;
+    regs->rflags &= ~(X86_EFLAGS_IOPL|X86_EFLAGS_VM);
+    regs->rflags |= X86_EFLAGS_IF;
+
+    if (stack.rflags & X86_EFLAGS_IF) {
+        guest_sti(cpu);
+    } else {
+        guest_cli(cpu);
+    }
+
+    if (!(stack.flags & VGCF_in_syscall)) {
+        regs->r11 = stack.r11;
+        regs->rcx = stack.rcx;
+    }
+
+    regs->rax     = stack.rax;
+    return -EINTR;
+}
+
+static int64_t set_segment_base(struct xen_cpu *cpu, uint64_t *args)
+{
+    switch (args[0]) {
+    case SEGBASE_FS:
+        wrmsrl(MSR_FS_BASE, args[1]);
+        break;
+    case SEGBASE_GS_USER:
+        wrmsrl(MSR_KERNEL_GS_BASE, args[1]);
+        break;
+    case SEGBASE_GS_KERNEL:
+        wrmsrl(MSR_GS_BASE, args[1]);
+        break;
+    case SEGBASE_GS_USER_SEL:
+        __asm__("swapgs         \n"
+                "movl %k0, %%gs \n"
+                "mfence         \n"
+                "swapgs         \n"
+                :: "r" (args[1] & 0xffff));
+        return 0;
+    default:
+        printk(0, "%s: unknown %d\n", __FUNCTION__, (int)args[0]);
+        return -ENOSYS;
+    }
+    return 0;
+}
+
+/* --------------------------------------------------------------------- */
+
+static xen_hcall hcalls[XEN_HCALL_MAX] = {
+    [ __HYPERVISOR_update_va_mapping ]       = update_va_mapping,
+    [ __HYPERVISOR_mmu_update ]              = mmu_update,
+    [ __HYPERVISOR_mmuext_op ]               = mmuext_op,
+    [ __HYPERVISOR_stack_switch ]            = stack_switch,
+    [ __HYPERVISOR_multicall ]               = multicall,
+    [ __HYPERVISOR_iret ]                    = iret,
+    [ __HYPERVISOR_update_descriptor ]       = update_descriptor,
+    [ __HYPERVISOR_set_segment_base ]        = set_segment_base,
+    [ __HYPERVISOR_fpu_taskswitch ]          = fpu_taskswitch,
+    [ __HYPERVISOR_grant_table_op ]          = grant_table_op,
+    [ __HYPERVISOR_xen_version ]             = xen_version,
+    [ __HYPERVISOR_vm_assist ]               = vm_assist,
+    [ __HYPERVISOR_sched_op ]                = sched_op,
+    [ __HYPERVISOR_sched_op_compat ]         = sched_op_compat,
+    [ __HYPERVISOR_memory_op ]               = memory_op,
+    [ __HYPERVISOR_set_trap_table ]          = set_trap_table,
+    [ __HYPERVISOR_set_callbacks ]           = set_callbacks,
+    [ __HYPERVISOR_callback_op ]             = callback_op,
+    [ __HYPERVISOR_set_gdt ]                 = set_gdt,
+    [ __HYPERVISOR_vcpu_op ]                 = vcpu_op,
+    [ __HYPERVISOR_event_channel_op ]        = event_channel_op,
+    [ __HYPERVISOR_event_channel_op_compat ] = event_channel_op_compat,
+    [ __HYPERVISOR_set_timer_op ]            = set_timer_op,
+    [ __HYPERVISOR_physdev_op ]              = physdev_op,
+    [ __HYPERVISOR_get_debugreg ]            = get_debugreg,
+    [ __HYPERVISOR_set_debugreg ]            = set_debugreg,
+    [ __HYPERVISOR_console_io ]              = console_io,
+
+    [ __HYPERVISOR_platform_op ]             = error_noperm,
+    [ __HYPERVISOR_physdev_op_compat ]       = error_noperm,
+    [ __HYPERVISOR_set_debugreg ]            = error_noop,
+};
+
+static int64_t multicall(struct xen_cpu *cpu, uint64_t *args)
+{
+    struct multicall_entry *calls = (void*)args[0];
+    uint64_t i, count = args[1];
+    uint64_t margs[6];
+
+    for (i = 0; i < count; i++) {
+        if (!hcalls[calls[i].op]) {
+            printk(0, "%s: unknown hypercall #%ld\n", __FUNCTION__, calls[i].op);
+            panic("unknown hypercall in multicall list", NULL);
+        }
+        vminfo.hcalls[calls[i].op]++;
+        margs[0] = calls[i].args[0];
+        margs[1] = calls[i].args[1];
+        margs[2] = calls[i].args[2];
+        margs[3] = calls[i].args[3];
+        margs[4] = calls[i].args[4];
+        margs[5] = calls[i].args[5];
+        calls[i].result = hcalls[calls[i].op](cpu, margs);
+    }
+    return 0;
+}
+
+static void do_hypercall(struct xen_cpu *cpu, struct regs_64 *regs)
+{
+    uint64_t args[6] = { 0 };
+    uint64_t retval = -ENOSYS;
+
+    if (regs->rax >= XEN_HCALL_MAX) {
+        /* invalid hypercall number */
+        printk(5, "hcall %ld >= XEN_HCALL_MAX\n", regs->rax);
+        goto handled;
+    }
+    if (!hcalls[regs->rax]) {
+        /* no hypercall handler */
+        printk(5, "hcall %ld no handler (%p)\n", regs->rax, hcalls[regs->rax]);
+        goto handled;
+    }
+
+    /* do call */
+    vminfo.hcalls[regs->rax]++;
+    args[0] = regs->rdi;
+    args[1] = regs->rsi;
+    args[2] = regs->rdx;
+    args[3] = regs->r10;
+    args[4] = regs->r8;
+    args[5] = regs->r9;
+
+    retval = hcalls[regs->rax](cpu, args);
+
+    if (-EINTR == retval)
+        goto iret;
+
+handled:
+    if (-ENOSYS == retval) {
+        printk(0, "hypercall %s (#%ld)  |  arg0 0x%lx  arg1 0x%lx  -> -ENOSYS\n",
+               __hypervisor_name(regs->rax), regs->rax, args[0], args[1]);
+    }
+
+    regs->rax = retval;
+    regs->error = HCALL_HANDLED;
+    evtchn_try_forward(cpu, regs);
+    return;
+
+iret:
+    regs->error = HCALL_IRET;
+    evtchn_try_forward(cpu, regs);
+    return;
+}
+
+asmlinkage void do_syscall(struct regs_64 *regs)
+{
+    struct xen_cpu *cpu = get_cpu();
+
+    if (is_kernel(cpu)) {
+        /* init segments: not done in syscall path */
+        regs->cs = FLAT_KERNEL_CS;
+        regs->ss = FLAT_KERNEL_SS;
+        do_hypercall(cpu, regs);
+    } else {
+        vminfo.faults[XEN_FAULT_SYSCALL]++;
+        /* init segments: not done in syscall path */
+        regs->cs = FLAT_USER_CS;
+        regs->ss = FLAT_USER_SS;
+        bounce_trap(cpu, regs, -1, CALLBACKTYPE_syscall);
+        /* return via iretq please */
+        regs->error = HCALL_IRET;
+    }
+    return;
+}
-- 
1.6.0.2

* [Qemu-devel] [PATCH 12/40] xenner: kernel: Hypercall handler (generic)
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Xenner handles guest hypercalls itself. This patch adds all the handling
code that is shared between i386 and x86_64.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
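Reviewer note (not part of the commit): the CONSOLEIO_write path in
console_io() clamps the guest-supplied length to the local buffer,
terminates the string and strips trailing line endings before logging.
A minimal stand-alone sketch of that step follows. It makes two
assumptions: demo_console_line is a hypothetical helper, and plain
memcpy() stands in for memcpy_pf(), so the page fault handling of the
real code is not modelled.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dst_size - 1 bytes from src, NUL-terminate the result
 * and strip trailing CR/LF characters.  Returns the remaining length.
 */
static size_t demo_console_line(char *dst, size_t dst_size,
                                const char *src, size_t count)
{
    if (dst_size == 0) {
        return 0;
    }
    if (count > dst_size - 1) {
        count = dst_size - 1;       /* clamp to the local buffer */
    }
    memcpy(dst, src, count);
    dst[count] = 0;
    while (count > 0 && (dst[count - 1] == '\r' || dst[count - 1] == '\n')) {
        dst[--count] = 0;           /* drop trailing line endings */
    }
    return count;
}
```

With an 8 byte buffer, the input "hi\r\n" is reduced to "hi" (length 2),
and a 10 byte input is truncated to its first 7 bytes.
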
 pc-bios/xenner/xenner-hcall.c | 1031 +++++++++++++++++++++++++++++++++++++++++
 1 files changed, 1031 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-hcall.c

diff --git a/pc-bios/xenner/xenner-hcall.c b/pc-bios/xenner/xenner-hcall.c
new file mode 100644
index 0000000..30b574f
--- /dev/null
+++ b/pc-bios/xenner/xenner-hcall.c
@@ -0,0 +1,1031 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner hypercall handlers
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <errno.h>
+
+#include "config-host.h"
+#include "xenner.h"
+
+sreg_t error_noop(struct xen_cpu *cpu, ureg_t *args)
+{
+    /* ignore */
+    return 0;
+}
+
+sreg_t error_noperm(struct xen_cpu *cpu, ureg_t *args)
+{
+    /* we don't do dom0 hypercalls */
+    return -EPERM;
+}
+
+sreg_t console_io(struct xen_cpu *cpu, ureg_t *args)
+{
+    int count = args[1];
+    void *ptr = (void*)args[2];
+    uint8_t buf[128];
+
+    switch (args[0]) {
+    case CONSOLEIO_write:
+        if (count > sizeof(buf)-1) {
+            count = sizeof(buf)-1;
+        }
+        if (0 != memcpy_pf(&buf, ptr, count)) {
+            return -EFAULT;
+        }
+        buf[count] = 0;
+        while (count > 0 && (buf[count-1] == '\r' || buf[count-1] == '\n')) {
+            buf[--count] = 0;
+        }
+        printk(1, "guest: \"%s\"\n", buf);
+        return count;
+    case CONSOLEIO_read:
+        return 0;
+    default:
+        printk(1, "console: unknown: %s\n", consoleio_name(args[0]));
+        return -ENOSYS;
+    }
+}
+
+sreg_t stack_switch(struct xen_cpu *cpu, ureg_t *args)
+{
+    cpu->kernel_ss = fix_sel(args[0]);
+    cpu->kernel_sp = args[1];
+
+#ifdef CONFIG_32BIT
+    cpu->tss.ss1  = cpu->kernel_ss;
+    cpu->tss.esp1 = cpu->kernel_sp;
+#endif
+    return 0;
+}
+
+sreg_t update_descriptor(struct xen_cpu *cpu, ureg_t *args)
+{
+#ifdef CONFIG_64BIT
+    uint64_t pa = args[0];
+    struct descriptor_32 desc = {
+        .a = args[1] & 0xffffffff,
+        .b = args[1] >> 32,
+    };
+#else
+    uint64_t pa   = args[0] | (uint64_t)args[1] << 32;
+    struct descriptor_32 desc = {
+        .a = args[2],
+        .b = args[3],
+    };
+#endif
+    struct descriptor_32 *guest_gdt;
+    int p, index;
+    uint64_t mfn;
+
+    fix_desc(&desc);
+
+    mfn = addr_to_frame(pa);
+    for (p = 0; p < 16; p++) {
+        if (mfn == cpu->gdt_mfns[p]) {
+            break;
+        }
+    }
+    if (p == 16) {
+        printk(1, "%s: not found in gdt: pa %" PRIx64 " (ldt update?)\n",
+               __FUNCTION__, pa);
+    } else {
+        /* update emu gdt shadow */
+        index = addr_offset(pa) / sizeof(struct descriptor_32);
+        cpu->gdt[p * 512 + index] = desc;
+    }
+
+    /* update guest gdt/ldt */
+    guest_gdt = map_page(pa);
+    *guest_gdt = desc;
+    free_page(guest_gdt);
+    return 0;
+}
+
+sreg_t fpu_taskswitch(struct xen_cpu *cpu, ureg_t *args)
+{
+    if (args[0]) {
+        write_cr0(X86_CR0_TS|read_cr0());
+    } else {
+        clts();
+    }
+    return 0;
+}
+
+sreg_t grant_table_op(struct xen_cpu *cpu, ureg_t *args)
+{
+    struct gnttab_setup_table *st;
+    struct gnttab_query_size  *qs;
+    unsigned long *frames;
+    int i, rc = 0;
+
+    switch (args[0]) {
+    case GNTTABOP_setup_table:
+        st = (void*)args[1];
+        printk(1, "%s: setup_table %d\n", __FUNCTION__, st->nr_frames);
+        if (st->nr_frames > GRANT_FRAMES_MAX) {
+            st->status = GNTST_general_error;
+        } else {
+            grant_frames = st->nr_frames;
+            frames = (unsigned long *)st->frame_list.p;
+            for (i = 0; i < grant_frames; i++) {
+                frames[i] = EMU_MFN(grant_table) + i;
+            }
+            st->status = GNTST_okay;
+        }
+        break;
+    case GNTTABOP_query_size:
+        printk(1, "%s: query_size\n", __FUNCTION__);
+        qs = (void*)args[1];
+        qs->nr_frames = grant_frames;
+        qs->max_nr_frames = GRANT_FRAMES_MAX;
+        qs->status = GNTST_okay;
+        break;
+    default:
+        printk(0, "%s: FIXME: unknown %d\n", __FUNCTION__, (int)args[0]);
+        rc = -ENOSYS;
+    }
+    return rc;
+}
+
+sreg_t xen_version(struct xen_cpu *cpu, ureg_t *args)
+{
+    static const char extra[XEN_EXTRAVERSION_LEN] =
+        "-qemu-" QEMU_VERSION QEMU_PKGVERSION;
+
+    switch (args[0]) {
+    case XENVER_version:
+        return (3 << 16) | 1;
+    case XENVER_extraversion:
+        if (memcpy_pf((void*)args[1], extra, sizeof(extra))) {
+            return -EFAULT;
+        }
+        return 0;
+    case XENVER_capabilities:
+    {
+        char caps[] = CAP_VERSION_STRING;
+        if (memcpy_pf((void*)args[1], caps, sizeof(caps))) {
+            return -EFAULT;
+        }
+        break;
+    }
+
+    case XENVER_get_features:
+    {
+        xen_feature_info_t *fi = (void*)args[1];
+        fi->submap = 0;
+        if (!fi->submap_idx) {
+            fi->submap |= (1 << XENFEAT_pae_pgdir_above_4gb);
+        }
+        break;
+    }
+
+#ifdef CONFIG_32BIT
+    case XENVER_platform_parameters:
+    {
+        uint32_t *ptr32 = (void*)args[1];
+        *ptr32 = XEN_M2P;
+        break;
+    }
+#endif
+
+    default:
+        printk(0, "%s: FIXME: unknown %d\n", __FUNCTION__, (int)args[0]);
+        return -ENOSYS;
+    }
+    return 0;
+}
+
+sreg_t vm_assist(struct xen_cpu *cpu, ureg_t *args)
+{
+    int type = args[1];
+
+    switch (args[0]) {
+    case VMASST_CMD_enable:
+        printk(1, "%s: enable %d (%s)\n", __FUNCTION__,
+               type, vmasst_type_name(type));
+        if (type == VMASST_TYPE_writable_pagetables) {
+            wrpt = 1;
+        }
+        break;
+    case VMASST_CMD_disable:
+        printk(1, "%s: disable %d (%s)\n", __FUNCTION__,
+               type, vmasst_type_name(type));
+        if (type == VMASST_TYPE_writable_pagetables) {
+            wrpt = 0;
+        }
+        break;
+    default:
+        printk(0, "%s: FIXME: unknown %d\n", __FUNCTION__, (int)args[0]);
+        return -ENOSYS;
+    }
+    return 0;
+}
+
+sreg_t sched_op(struct xen_cpu *cpu, ureg_t *args)
+{
+    switch (args[0]) {
+    case SCHEDOP_yield:
+        /* Hmm, some day, on SMP, we probably want to do something else ... */
+        sti(); pause(); cli();
+        break;
+
+    case SCHEDOP_block:
+        guest_sti(cpu);
+        if (!evtchn_pending(cpu)) {
+            halt_i(cpu->id);
+            pv_clock_update(1);
+        }
+        break;
+
+    case SCHEDOP_shutdown:
+    {
+        struct sched_shutdown sh;
+
+        if (memcpy_pf(&sh, (void*)(args[1]), sizeof(sh))) {
+            return -EFAULT;
+        }
+        emudev_cmd(EMUDEV_CMD_GUEST_SHUTDOWN, sh.reason);
+        break;
+    }
+
+    default:
+        printk(0, "%s: FIXME: unknown %d\n", __FUNCTION__, (int)args[0]);
+        return -ENOSYS;
+    }
+    return 0;
+}
+
+sreg_t sched_op_compat(struct xen_cpu *cpu, ureg_t *args)
+{
+    switch (args[0]) {
+    case SCHEDOP_yield:
+    case SCHEDOP_block:
+        return sched_op(cpu, args);
+    case SCHEDOP_shutdown:
+        emudev_cmd(EMUDEV_CMD_GUEST_SHUTDOWN, args[1]);
+        return 0;
+    default:
+        return -ENOSYS;
+    }
+}
+
+sreg_t memory_op(struct xen_cpu *cpu, ureg_t *args)
+{
+    int cmd = args[0] & 0x0f /* MEMOP_CMD_MASK */;
+
+    switch (cmd) {
+    case XENMEM_increase_reservation:
+    {
+        struct xen_memory_reservation res;
+
+        if (memcpy_pf(&res, (void*)(args[1]), sizeof(res))) {
+            return -EFAULT;
+        }
+        if (res.domid != DOMID_SELF) {
+            return -EPERM;
+        }
+        printk(0, "%s: increase_reservation: nr %ld, order %d (not implemented)\n",
+               __FUNCTION__, res.nr_extents, res.extent_order);
+        /* FIXME: not implemented yet, thus say "no pages allocated" */
+        return 0;
+    }
+    case XENMEM_decrease_reservation:
+    {
+        struct xen_memory_reservation res;
+        xen_pfn_t *ptr, gmfn;
+        int i, p, count = 0;
+
+        if (memcpy_pf(&res, (void*)(args[1]), sizeof(res))) {
+            return -EFAULT;
+        }
+        if (res.domid != DOMID_SELF) {
+            return -EPERM;
+        }
+        ptr = (xen_pfn_t *)res.extent_start.p;
+        for (i = 0; i < res.nr_extents; i++) {
+            if (memcpy_pf(&gmfn, ptr + i, sizeof(gmfn))) {
+                break;
+            }
+            for (p = 0; p < (1 << res.extent_order); p++) {
+                m2p[gmfn+p] = INVALID_M2P_ENTRY;
+            }
+            /* FIXME: make host free pages */
+            count++;
+        }
+        printk(2, "%s: decrease_reservation: nr %ld, order %d"
+               " (max %" PRIx64 "/%" PRIx64 ") -> rc %d\n", __FUNCTION__,
+               res.nr_extents, res.extent_order,
+               vmconf.pg_guest, vmconf.pg_total,
+               count);
+        /* FIXME: signal to userspace */
+        return count;
+    }
+    case XENMEM_populate_physmap:
+    {
+        struct xen_memory_reservation res;
+        xen_pfn_t *ptr, gpfn, gmfn;
+        int i, p, count = 0;
+
+        if (memcpy_pf(&res, (void*)(args[1]), sizeof(res))) {
+            return -EFAULT;
+        }
+        if (res.domid != DOMID_SELF) {
+            return -EPERM;
+        }
+        ptr = (xen_pfn_t *)res.extent_start.p;
+        gmfn = vmconf.mfn_guest;
+        for (i = 0; i < res.nr_extents; i++) {
+            if (memcpy_pf(&gpfn, ptr + i, sizeof(gpfn))) {
+                break;
+            }
+            for (p = 0; p < (1 << res.extent_order); p++) {
+                while (gmfn < vmconf.pg_total &&
+                       m2p[gmfn] != INVALID_M2P_ENTRY) {
+                    gmfn++;
+                }
+                if (gmfn == vmconf.pg_total) {
+                    break;
+                }
+                m2p[gmfn] = gpfn+p;
+            }
+            if (p != (1 << res.extent_order)) {
+                break;
+            }
+            /* FIXME: make host reclaim pages */
+            if (memcpy_pf(ptr + i, &gmfn, sizeof(gmfn))) {
+                break;
+            }
+            count++;
+        }
+        printk(2, "%s: populate_physmap: nr %ld, order %d -> rc %d\n",
+               __FUNCTION__, res.nr_extents, res.extent_order, count);
+        /* FIXME: signal to userspace */
+        return count;
+    }
+    case XENMEM_machphys_mapping:
+    {
+        struct xen_machphys_mapping map;
+        uint32_t pg_m2p = 1024 /* pages (4 MB) */;
+        void *dest = (void*)(args[1]);
+#ifdef CONFIG_64BIT
+        map.v_start = XEN_M2P_64;
+#else
+        map.v_start = XEN_M2P;
+#endif
+        map.v_end   = map.v_start + frame_to_addr(pg_m2p);
+        map.max_mfn = pg_m2p << (PAGE_SHIFT-3);
+        if (memcpy_pf(dest, &map, sizeof(map))) {
+            return -EFAULT;
+        }
+        return 0;
+    }
+
+    case XENMEM_memory_map:
+        /* we have no e820 map */
+        return -ENOSYS;
+
+    default:
+        printk(0, "%s: FIXME: unknown %d\n", __FUNCTION__, cmd);
+        return -ENOSYS;
+    }
+    return 0;
+}
+
+sreg_t set_trap_table(struct xen_cpu *cpu, ureg_t *args)
+{
+    struct trap_info *traps;
+    struct trap_info trap;
+    int i;
+
+    if (!args[0]) {
+        memset(&xentr, 0, sizeof(xentr));
+        return 0;
+    }
+
+    traps = (void*)args[0];
+    for (i = 0;; i++) {
+        if (memcpy_pf(&trap, traps+i, sizeof(trap))) {
+            return -EFAULT;
+        }
+        if (!trap.address) {
+            break;
+        }
+        trap.cs = fix_sel32(trap.cs);
+        xentr[trap.vector] = trap;
+#ifdef CONFIG_32BIT
+        if (trap.vector >= 0x80) {
+            /* route directly */
+            uint32_t dpl = trap.flags & 0x03;
+            xen_idt[trap.vector] =
+                mkgate32(trap.cs, trap.address, 0x8f | (dpl << 5));
+        }
+#endif
+    }
+    return 0;
+}
+
+static int callback_setup(int type, xen_callback_t *cb)
+{
+    int ok = 0;
+
+    switch (type) {
+    case CALLBACKTYPE_event:
+    case CALLBACKTYPE_failsafe:
+#ifdef CONFIG_64BIT
+    case CALLBACKTYPE_syscall:
+#endif
+        ok = 1;
+        break;
+    }
+
+    printk(1, "%s: %s (#%d) -> %s\n", __FUNCTION__,
+           callbacktype_name(type), type,
+           ok ? "OK" : "unsupported");
+    if (!ok) {
+        return -1;
+    }
+
+#ifdef CONFIG_32BIT
+    cb->cs = fix_sel32(cb->cs);
+#endif
+    xencb[type] = *cb;
+    return 0;
+}
+
+static void callback_clear(int type)
+{
+#ifdef CONFIG_64BIT
+    xencb[type] = 0;
+#else
+    xencb[type].cs = 0;
+    xencb[type].eip = 0;
+#endif
+}
+
+sreg_t set_callbacks(struct xen_cpu *cpu, ureg_t *args)
+{
+#ifdef CONFIG_64BIT
+    callback_setup(CALLBACKTYPE_event,    &args[0]);
+    callback_setup(CALLBACKTYPE_failsafe, &args[1]);
+    callback_setup(CALLBACKTYPE_syscall,  &args[2]);
+#else
+    xen_callback_t cb;
+
+    cb.cs  = args[0];
+    cb.eip = args[1];
+    callback_setup(CALLBACKTYPE_event, &cb);
+    cb.cs  = args[2];
+    cb.eip = args[3];
+    callback_setup(CALLBACKTYPE_failsafe, &cb);
+#endif
+    return 0;
+}
+
+sreg_t callback_op(struct xen_cpu *cpu, ureg_t *args)
+{
+    struct callback_register cb;
+
+    if (memcpy_pf(&cb, (void*)(args[1]), sizeof(cb))) {
+        return -EFAULT;
+    }
+    if (cb.type >= 8) {
+        return -EINVAL;
+    }
+
+    switch (args[0]) {
+    case CALLBACKOP_register:
+        if (callback_setup(cb.type, &cb.address)) {
+            return -EINVAL;
+        }
+        break;
+    case CALLBACKOP_unregister:
+        callback_clear(cb.type);
+        break;
+    default:
+        printk(0, "%s: FIXME: unknown %d\n", __FUNCTION__, (int)args[0]);
+        return -ENOSYS;
+    }
+    return 0;
+}
+
+void guest_gdt_copy_page(struct descriptor_32 *src,
+                         struct descriptor_32 *dst)
+{
+    struct descriptor_32 tmp;
+    int e;
+
+    for (e = 0; e < 512; e++) {
+        tmp = src[e];
+        fix_desc(&tmp);
+        dst[e] = tmp;
+    }
+}
+
+int guest_gdt_init(struct xen_cpu *cpu, uint32_t entries, ureg_t *mfns)
+{
+    uint32_t pages = (entries + 511) / 512;
+    struct descriptor_32 *src, *dst;
+    uint32_t p;
+
+    for (p = 0; p < pages; p++) {
+        cpu->gdt_mfns[p] = mfns[p];
+        src = map_page(frame_to_addr(mfns[p]));
+        dst = cpu->gdt + p * 512;
+        guest_gdt_copy_page(src, dst);
+        free_page(src);
+    }
+    return 0;
+}
+
+sreg_t set_gdt(struct xen_cpu *cpu, ureg_t *args)
+{
+    ureg_t mfns[16];
+    uint32_t entries = args[1];
+
+    if (memcpy_pf(mfns, (void*)(args[0]), sizeof(mfns))) {
+        return -EFAULT;
+    }
+    if (entries > (0xe000 >> 3)) {
+        return -EINVAL;
+    }
+
+    return guest_gdt_init(cpu, entries, mfns);
+}
+
+sreg_t vcpu_op(struct xen_cpu *cpu, ureg_t *args)
+{
+    if (args[1] >= vmconf.nr_cpus) {
+        return -EINVAL;
+    }
+
+    switch (args[0]) {
+    case VCPUOP_register_runstate_memory_area:
+        /* FIXME */
+        return 0;
+
+    case VCPUOP_is_up:
+    {
+        struct xen_cpu *vcpu;
+
+        vcpu = cpu_find(args[1]);
+        return vcpu->online;
+    }
+
+    case VCPUOP_set_periodic_timer:
+    {
+        struct vcpu_set_periodic_timer ticks;
+
+        if (memcpy_pf(&ticks, (void*)args[2], sizeof(ticks))) {
+            return -EFAULT;
+        }
+
+        cpu->periodic = ticks.period_ns;
+        printk(1, "%s/%d: periodic %" PRId64 " (%d Hz)\n", __FUNCTION__,
+               cpu->id, cpu->periodic,
+               1000000000 / (unsigned int)cpu->periodic);
+        lapic_timer(cpu);
+        break;
+    }
+    case VCPUOP_stop_periodic_timer:
+        cpu->periodic = 0;
+        printk(1, "%s/%d: periodic off\n", __FUNCTION__, cpu->id);
+        lapic_timer(cpu);
+        break;
+    case VCPUOP_set_singleshot_timer:
+    {
+        struct vcpu_set_singleshot_timer single;
+
+        if (memcpy_pf(&single, (void*)args[2], sizeof(single))) {
+            return -EFAULT;
+        }
+        cpu->oneshot = single.timeout_abs_ns;
+        printk(3, "%s/%d: oneshot %" PRId64 "\n", __FUNCTION__, cpu->id,
+               cpu->oneshot);
+        lapic_timer(cpu);
+        break;
+    }
+    case VCPUOP_stop_singleshot_timer:
+        cpu->oneshot = 0;
+        printk(1, "%s/%d: oneshot off\n", __FUNCTION__, cpu->id);
+        lapic_timer(cpu);
+        break;
+    case VCPUOP_initialise:
+    {
+        struct xen_cpu *vcpu;
+
+        printk(0, "%s: initialise cpu %d\n", __FUNCTION__, (int)args[1]);
+        vcpu = cpu_find(args[1]);
+        if (!vcpu->init_ctxt) {
+            vcpu->init_ctxt = get_memory(sizeof(*(vcpu->init_ctxt)), "init_ctxt");
+        }
+        if (memcpy_pf(vcpu->init_ctxt, (void*)args[2],
+                           sizeof(*(vcpu->init_ctxt)))) {
+            return -EFAULT;
+        }
+        break;
+    }
+    case VCPUOP_up:
+    {
+        struct xen_cpu *vcpu;
+
+        printk(0, "%s: up cpu %d\n", __FUNCTION__, (int)args[1]);
+        vcpu = cpu_find(args[1]);
+        lapic_ipi_boot(cpu, vcpu);
+        break;
+    }
+    case VCPUOP_down:
+    {
+        return -ENOSYS;
+    }
+    case VCPUOP_register_vcpu_info:
+    {
+        struct vcpu_register_vcpu_info reg;
+        struct xen_cpu *vcpu;
+        struct vcpu_info *new_info;
+        uint64_t new_info_pa;
+
+        vcpu = cpu_find(args[1]);
+        if (memcpy_pf(&reg, (void*)args[2], sizeof(reg))) {
+            return -EFAULT;
+        }
+        if (reg.offset + sizeof(struct vcpu_info) > PAGE_SIZE) {
+            return -EINVAL;
+        }
+        if (vcpu->v.vcpu_page) {
+            return -EINVAL;
+        }
+
+        vcpu->v.vcpu_page = fixmap_page(cpu, frame_to_addr(reg.mfn));
+        new_info = vcpu->v.vcpu_page + reg.offset;
+        new_info_pa = frame_to_addr(reg.mfn) + reg.offset;
+        printk(1, "%s/%d: vcpu_info: mfn 0x%" PRIx64 ", offset 0x%x, pa %" PRIx64 " mapped to %p\n",
+               __FUNCTION__, vcpu->id, reg.mfn, reg.offset, new_info_pa, new_info);
+
+        memcpy(new_info, vcpu->v.vcpu_info, sizeof(struct vcpu_info));
+        vcpu->v.vcpu_info = new_info;
+        vcpu->v.vcpu_info_pa = new_info_pa;
+        pv_clock_sys(vcpu);
+        break;
+    }
+    default:
+        return -ENOSYS;
+    }
+    return 0;
+}
+
+sreg_t set_timer_op(struct xen_cpu *cpu, ureg_t *args)
+{
+#ifdef CONFIG_64BIT
+    uint64_t time = args[0];
+#else
+    uint64_t time = args[0] | (uint64_t)args[1] << 32;
+#endif
+
+    cpu->oneshot = time;
+    lapic_timer(cpu);
+    return 0;
+}
+
+sreg_t event_channel_op(struct xen_cpu *cpu, ureg_t *args)
+{
+    switch (args[0]) {
+    case EVTCHNOP_alloc_unbound:
+    {
+        struct evtchn_alloc_unbound alloc;
+        struct xen_cpu *vcpu;
+
+        if (memcpy_pf(&alloc, (void*)args[1], sizeof(alloc))) {
+            return -EFAULT;
+        }
+        if (alloc.dom != DOMID_SELF || alloc.remote_dom != 0) {
+            return -EINVAL;
+        }
+        alloc.port = evtchn_alloc(cpu->id);
+        vcpu = cpu_find(0);
+        evtchn_route_interdomain(vcpu, alloc.port, NULL);
+        if (memcpy_pf((void*)args[1], &alloc, sizeof(alloc))) {
+            return -EFAULT;
+        }
+        return 0;
+    }
+    case EVTCHNOP_bind_vcpu:
+    {
+        struct evtchn_bind_vcpu bind;
+        struct xen_cpu *vcpu;
+
+        if (memcpy_pf(&bind, (void*)args[1], sizeof(bind))) {
+            return -EFAULT;
+        }
+        vcpu = cpu_find(bind.vcpu);
+        if (evtchn_route_interdomain(vcpu, bind.port, NULL)) {
+            return -EINVAL;
+        }
+        return 0;
+    }
+    case EVTCHNOP_bind_virq:
+    {
+        struct evtchn_bind_virq bind;
+        struct xen_cpu *vcpu;
+
+        if (memcpy_pf(&bind, (void*)args[1], sizeof(bind))) {
+            return -EFAULT;
+        }
+        switch (bind.virq) {
+        case VIRQ_TIMER:
+            vcpu = cpu_find(bind.vcpu);
+            if (!vcpu->timerport) {
+                vcpu->timerport = evtchn_alloc(cpu->id);
+                evtchn_route_virq(vcpu, VIRQ_TIMER, vcpu->timerport, "timer");
+                if (cpu == vcpu) {
+                    lapic_timer(cpu);
+                }
+            }
+            bind.port = vcpu->timerport;
+            break;
+        default:
+            bind.port = evtchn_alloc(cpu->id);
+            break;
+        }
+        if (memcpy_pf((void*)args[1], &bind, sizeof(bind))) {
+            return -EFAULT;
+        }
+        return 0;
+    }
+    case EVTCHNOP_bind_ipi:
+    {
+        struct evtchn_bind_ipi bind;
+        struct xen_cpu *vcpu;
+
+        if (memcpy_pf(&bind, (void*)args[1], sizeof(bind))) {
+            return -EFAULT;
+        }
+        bind.port = evtchn_alloc(cpu->id);
+        vcpu = cpu_find(bind.vcpu);
+        evtchn_route_ipi(vcpu, bind.port);
+        if (memcpy_pf((void*)args[1], &bind, sizeof(bind))) {
+            return -EFAULT;
+        }
+        return 0;
+    }
+    case EVTCHNOP_bind_pirq:
+        return -EPERM;
+    case EVTCHNOP_send:
+    {
+        struct evtchn_send send;
+
+        if (memcpy_pf(&send, (void*)args[1], sizeof(send))) {
+            return -EFAULT;
+        }
+        if (evtchn_send(cpu, send.port)) {
+            /* handled internally */
+            return 0;
+        } else {
+            emudev_cmd(EMUDEV_CMD_EVTCHN_SEND, send.port);
+            return 0;
+        }
+    }
+    case EVTCHNOP_unmask:
+    {
+        struct evtchn_unmask unmask;
+
+        if (memcpy_pf(&unmask, (void*)args[1], sizeof(unmask))) {
+            return -EFAULT;
+        }
+        evtchn_unmask(cpu, unmask.port);
+        return 0;
+    }
+    case EVTCHNOP_close:
+    {
+        struct evtchn_close cl;
+
+        if (memcpy_pf(&cl, (void*)args[1], sizeof(cl))) {
+            return -EFAULT;
+        }
+        evtchn_close(cpu, cl.port);
+        return 0;
+    }
+    default:
+        return -ENOSYS;
+    }
+}
+
+sreg_t event_channel_op_compat(struct xen_cpu *cpu, ureg_t *args)
+{
+    struct evtchn_op op;
+    ureg_t nargs[2];
+
+    if (memcpy_pf(&op, (void*)args[0], sizeof(op))) {
+        return -EFAULT;
+    }
+    nargs[0] = op.cmd;
+    nargs[1] = args[0] + offsetof(struct evtchn_op, u);
+    return event_channel_op(cpu, nargs);
+}
+
+sreg_t mmuext_op(struct xen_cpu *cpu, ureg_t *args)
+{
+    struct mmuext_op *uops = (void*)args[0];
+    ureg_t count = args[1];
+#if 0
+    ureg_t *done = (void*)args[2];
+#endif
+    ureg_t dom   = args[3];
+    ureg_t cpumask;
+    int i;
+
+    if (dom != DOMID_SELF) {
+        printk(1, "%s: foreigndom not supported\n", __FUNCTION__);
+        return -ENOSYS;
+    }
+
+    for (i = 0; i < count; i++, uops++) {
+        switch (uops->cmd) {
+        case MMUEXT_PIN_L1_TABLE:
+        case MMUEXT_PIN_L2_TABLE:
+        case MMUEXT_PIN_L3_TABLE:
+        case MMUEXT_PIN_L4_TABLE:
+            /* ignore */
+            break;
+        case MMUEXT_UNPIN_TABLE:
+            /* KVM_MMU_OP_RELEASE_PT ??? */
+            break;
+        case MMUEXT_INVLPG_LOCAL:
+            flush_tlb_addr(uops->arg1.linear_addr);
+            break;
+        case MMUEXT_INVLPG_MULTI:
+            if (memcpy_pf(&cpumask, (void*)uops->arg2.vcpumask.p,
+                sizeof(cpumask))) {
+                return -EFAULT;
+            }
+            flush_tlb_addr(uops->arg1.linear_addr);
+            flush_tlb_remote(cpu, cpumask, uops->arg1.linear_addr);
+            break;
+        case MMUEXT_INVLPG_ALL:
+            flush_tlb_addr(uops->arg1.linear_addr);
+            flush_tlb_remote(cpu, vminfo.vcpus_online, uops->arg1.linear_addr);
+            break;
+        case MMUEXT_TLB_FLUSH_LOCAL:
+            flush_tlb();
+            break;
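+        case MMUEXT_TLB_FLUSH_MULTI:
+            if (memcpy_pf(&cpumask, (void*)uops->arg2.vcpumask.p,
+                sizeof(cpumask))) {
+                return -EFAULT;
+            }
+            flush_tlb();
+            flush_tlb_remote(cpu, cpumask, 0);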
+            break;
+        case MMUEXT_TLB_FLUSH_ALL:
+            flush_tlb();
+            flush_tlb_remote(cpu, vminfo.vcpus_online, 0);
+            break;
+
+        case MMUEXT_SET_LDT:
+            printk(2, "%s: SET_LDT (va %lx, nr %d)\n", __FUNCTION__,
+                   uops->arg1.linear_addr, uops->arg2.nr_ents);
+            if (uops->arg2.nr_ents) {
+                struct descriptor_32 *gdt = cpu->gdt;
+                int idx = ldt(cpu);
+                gdt[ idx +0 ] = mkdesc32(uops->arg1.linear_addr & 0xffffffff,
+                                         uops->arg2.nr_ents * 8 - 1,
+                                         0x82, 0);
+#ifdef CONFIG_64BIT
+                gdt[ idx+1 ].a = uops->arg1.linear_addr >> 32;
+                gdt[ idx+1 ].b = 0;
+#endif
+                lldt(idx << 3);
+            } else {
+                lldt(0);
+            }
+            break;
+
+        case MMUEXT_NEW_BASEPTR:
+            update_emu_mappings(uops->arg1.mfn);
+            pv_write_cr3(cpu, uops->arg1.mfn);
+            break;
+#ifdef CONFIG_64BIT
+        case MMUEXT_NEW_USER_BASEPTR:
+            update_emu_mappings(uops->arg1.mfn);
+            cpu->user_cr3_mfn = uops->arg1.mfn;
+            break;
+#endif
+        default:
+            printk(0, "%s: FIXME: unknown %d\n", __FUNCTION__, uops->cmd);
+            return -ENOSYS;
+        }
+    }
+    return 0;
+}
+
+sreg_t physdev_op(struct xen_cpu *cpu, ureg_t *args)
+{
+    switch (args[0]) {
+    case PHYSDEVOP_set_iopl:
+    {
+        struct physdev_set_iopl iopl;
+
+        if (memcpy_pf(&iopl, (void*)args[1], sizeof(iopl))) {
+            return -EFAULT;
+        }
+        printk(2, "%s: set iopl: %d\n", __FUNCTION__, iopl.iopl);
+        cpu->iopl = iopl.iopl;
+        return 0;
+    }
+    case PHYSDEVOP_set_iobitmap:
+    {
+        struct physdev_set_iobitmap iobitmap;
+        if (memcpy_pf(&iobitmap, (void*)args[1], sizeof(iobitmap))) {
+            return -EFAULT;
+        }
+        printk(2, "%s: set iobitmap: %d\n", __FUNCTION__, iobitmap.nr_ports);
+        cpu->nr_ports = iobitmap.nr_ports;
+        return 0;
+    }
+    default:
+        printk(1, "%s: not implemented (#%d)\n", __FUNCTION__, (int)args[0]);
+        return -EPERM;
+    }
+}
+
+static ureg_t read_debugreg(int nr)
+{
+    ureg_t val;
+    switch (nr) {
+    case 0: asm volatile("mov %%db0,%0" : "=r" (val)); break;
+    case 1: asm volatile("mov %%db1,%0" : "=r" (val)); break;
+    case 2: asm volatile("mov %%db2,%0" : "=r" (val)); break;
+    case 3: asm volatile("mov %%db3,%0" : "=r" (val)); break;
+    case 4: asm volatile("mov %%db4,%0" : "=r" (val)); break;
+    case 5: asm volatile("mov %%db5,%0" : "=r" (val)); break;
+    case 6: asm volatile("mov %%db6,%0" : "=r" (val)); break;
+    case 7: asm volatile("mov %%db7,%0" : "=r" (val)); break;
+    default: val = -EINVAL; break;
+    }
+    return val;
+}
+
+static void write_debugreg(int nr, ureg_t val)
+{
+    switch (nr) {
+    case 0: asm volatile("mov %0,%%db0" : : "r" (val) : "memory"); break;
+    case 1: asm volatile("mov %0,%%db1" : : "r" (val) : "memory"); break;
+    case 2: asm volatile("mov %0,%%db2" : : "r" (val) : "memory"); break;
+    case 3: asm volatile("mov %0,%%db3" : : "r" (val) : "memory"); break;
+    case 4: asm volatile("mov %0,%%db4" : : "r" (val) : "memory"); break;
+    case 5: asm volatile("mov %0,%%db5" : : "r" (val) : "memory"); break;
+    case 6: asm volatile("mov %0,%%db6" : : "r" (val) : "memory"); break;
+    case 7: asm volatile("mov %0,%%db7" : : "r" (val) : "memory"); break;
+    }
+}
+
+sreg_t get_debugreg(struct xen_cpu *cpu, ureg_t *args)
+{
+    ureg_t val = read_debugreg(args[0]);
+    printk(2, "%s: %" PRIxREG" = %" PRIxREG "\n", __FUNCTION__, args[0], val);
+    return val;
+}
+
+sreg_t set_debugreg(struct xen_cpu *cpu, ureg_t *args)
+{
+    int nr = args[0];
+    ureg_t val = args[1];
+
+    switch (nr) {
+    case 0:
+    case 1:
+    case 2:
+    case 3:
+        /* TODO: check address */
+        break;
+    case 6:
+        val &= 0xffffefff; /* reserved bits => 0 */
+        val |= 0xffff0ff0; /* reserved bits => 1 */
+        break;
+    case 7:
+        if (val) {
+            val &= 0xffff27ff; /* reserved bits => 0 */
+            val |= 0x00000400; /* reserved bits => 1 */
+        }
+        break;
+    default:
+        return -EINVAL;
+    }
+
+    printk(0, "%s: %d = %" PRIxREG "\n", __FUNCTION__, nr, val);
+    write_debugreg(nr, val);
+    return 0;
+}
+
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [Qemu-devel] [PATCH 13/40] xenner: kernel: Headers
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (11 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 12/40] xenner: kernel: Hypercall handler (generic) Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 14/40] xenner: kernel: Instruction emulator Alexander Graf
                   ` (26 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds various header files required to build the xenner kernel.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 pc-bios/xenner/apicdef.h       |  173 ++++++++++
 pc-bios/xenner/cpufeature.h    |  129 ++++++++
 pc-bios/xenner/list.h          |  169 ++++++++++
 pc-bios/xenner/msr-index.h     |  278 ++++++++++++++++
 pc-bios/xenner/processor.h     |  326 +++++++++++++++++++
 pc-bios/xenner/shared.h        |  188 +++++++++++
 pc-bios/xenner/xenner-emudev.h |   57 ++++
 pc-bios/xenner/xenner.h        |  684 ++++++++++++++++++++++++++++++++++++++++
 8 files changed, 2004 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/apicdef.h
 create mode 100644 pc-bios/xenner/cpufeature.h
 create mode 100644 pc-bios/xenner/list.h
 create mode 100644 pc-bios/xenner/msr-index.h
 create mode 100644 pc-bios/xenner/processor.h
 create mode 100644 pc-bios/xenner/shared.h
 create mode 100644 pc-bios/xenner/xenner-emudev.h
 create mode 100644 pc-bios/xenner/xenner.h

diff --git a/pc-bios/xenner/apicdef.h b/pc-bios/xenner/apicdef.h
new file mode 100644
index 0000000..0f079ed
--- /dev/null
+++ b/pc-bios/xenner/apicdef.h
@@ -0,0 +1,173 @@
+#ifndef _ASM_X86_APICDEF_H
+#define _ASM_X86_APICDEF_H
+
+/*
+ * Constants for various Intel APICs. (local APIC, IOAPIC, etc.)
+ *
+ * Alan Cox <Alan.Cox@linux.org>, 1995.
+ * Ingo Molnar <mingo@redhat.com>, 1999, 2000
+ */
+
+#define APIC_DEFAULT_PHYS_BASE        0xfee00000
+
+#define APIC_ID                       0x20
+#define APIC_ID_MASK                  (0xFFu<<24)
+#define GET_APIC_ID(x)                (((x)>>24)&0xFFu)
+#define SET_APIC_ID(x)                (((x)<<24))
+#define APIC_LVR                      0x30
+#define APIC_LVR_MASK                 0xFF00FF
+#define GET_APIC_VERSION(x)           ((x)&0xFFu)
+#define GET_APIC_MAXLVT(x)            (((x)>>16)&0xFFu)
+#define APIC_INTEGRATED(x)            ((x)&0xF0u)
+#define APIC_XAPIC(x)                 ((x) >= 0x14)
+#define APIC_TASKPRI                  0x80
+#define APIC_TPRI_MASK                0xFFu
+#define APIC_ARBPRI                   0x90
+#define APIC_ARBPRI_MASK              0xFFu
+#define APIC_PROCPRI                  0xA0
+#define APIC_EOI                      0xB0
+#define APIC_EIO_ACK                  0x0
+#define APIC_RRR                      0xC0
+#define APIC_LDR                      0xD0
+#define APIC_LDR_MASK                 (0xFFu<<24)
+#define GET_APIC_LOGICAL_ID(x)        (((x)>>24)&0xFFu)
+#define SET_APIC_LOGICAL_ID(x)        (((x)<<24))
+#define APIC_ALL_CPUS                 0xFFu
+#define APIC_DFR                      0xE0
+#define APIC_DFR_CLUSTER              0x0FFFFFFFul
+#define APIC_DFR_FLAT                 0xFFFFFFFFul
+#define APIC_SPIV                     0xF0
+#define APIC_SPIV_FOCUS_DISABLED      (1<<9)
+#define APIC_SPIV_APIC_ENABLED        (1<<8)
+#define APIC_ISR                      0x100
+#define APIC_ISR_NR                   0x8
+#define APIC_TMR                      0x180
+#define APIC_IRR                      0x200
+#define APIC_ESR                      0x280
+#define APIC_ESR_SEND_CS              0x00001
+#define APIC_ESR_RECV_CS              0x00002
+#define APIC_ESR_SEND_ACC             0x00004
+#define APIC_ESR_RECV_ACC             0x00008
+#define APIC_ESR_SENDILL              0x00020
+#define APIC_ESR_RECVILL              0x00040
+#define APIC_ESR_ILLREGA              0x00080
+#define APIC_ICR                      0x300
+#define APIC_DEST_SELF                0x40000
+#define APIC_DEST_ALLINC              0x80000
+#define APIC_DEST_ALLBUT              0xC0000
+#define APIC_ICR_RR_MASK              0x30000
+#define APIC_ICR_RR_INVALID           0x00000
+#define APIC_ICR_RR_INPROG            0x10000
+#define APIC_ICR_RR_VALID             0x20000
+#define APIC_INT_LEVELTRIG            0x08000
+#define APIC_INT_ASSERT               0x04000
+#define APIC_ICR_BUSY                 0x01000
+#define APIC_DEST_LOGICAL             0x00800
+#define APIC_DEST_PHYSICAL            0x00000
+#define APIC_DM_FIXED                 0x00000
+#define APIC_DM_LOWEST                0x00100
+#define APIC_DM_SMI                   0x00200
+#define APIC_DM_REMRD                 0x00300
+#define APIC_DM_NMI                   0x00400
+#define APIC_DM_INIT                  0x00500
+#define APIC_DM_STARTUP               0x00600
+#define APIC_DM_EXTINT                0x00700
+#define APIC_VECTOR_MASK              0x000FF
+#define APIC_ICR2                     0x310
+#define GET_APIC_DEST_FIELD(x)        (((x)>>24)&0xFF)
+#define SET_APIC_DEST_FIELD(x)        ((x)<<24)
+#define APIC_LVTT                     0x320
+#define APIC_LVTTHMR                  0x330
+#define APIC_LVTPC                    0x340
+#define APIC_LVT0                     0x350
+#define APIC_LVT_TIMER_BASE_MASK      (0x3<<18)
+#define GET_APIC_TIMER_BASE(x)        (((x)>>18)&0x3)
+#define SET_APIC_TIMER_BASE(x)        (((x)<<18))
+#define APIC_TIMER_BASE_CLKIN         0x0
+#define APIC_TIMER_BASE_TMBASE        0x1
+#define APIC_TIMER_BASE_DIV           0x2
+#define APIC_LVT_TIMER_PERIODIC       (1<<17)
+#define APIC_LVT_MASKED               (1<<16)
+#define APIC_LVT_LEVEL_TRIGGER        (1<<15)
+#define APIC_LVT_REMOTE_IRR           (1<<14)
+#define APIC_INPUT_POLARITY           (1<<13)
+#define APIC_SEND_PENDING             (1<<12)
+#define APIC_MODE_MASK                0x700
+#define GET_APIC_DELIVERY_MODE(x)     (((x)>>8)&0x7)
+#define SET_APIC_DELIVERY_MODE(x, y)  (((x)&~0x700)|((y)<<8))
+#define APIC_MODE_FIXED               0x0
+#define APIC_MODE_NMI                 0x4
+#define APIC_MODE_EXTINT              0x7
+#define APIC_LVT1                     0x360
+#define APIC_LVTERR                   0x370
+#define APIC_TMICT                    0x380
+#define APIC_TMCCT                    0x390
+#define APIC_TDCR                     0x3E0
+#define APIC_TDR_DIV_TMBASE           (1<<2)
+#define APIC_TDR_DIV_1                0xB
+#define APIC_TDR_DIV_2                0x0
+#define APIC_TDR_DIV_4                0x1
+#define APIC_TDR_DIV_8                0x2
+#define APIC_TDR_DIV_16               0x3
+#define APIC_TDR_DIV_32               0x8
+#define APIC_TDR_DIV_64               0x9
+#define APIC_TDR_DIV_128              0xA
+#define APIC_EILVT0                   0x500
+#define APIC_EILVT_NR_AMD_K8          1 /* Number of extended interrupts */
+#define APIC_EILVT_NR_AMD_10H         4
+#define APIC_EILVT_LVTOFF(x)          (((x)>>4)&0xF)
+#define APIC_EILVT_MSG_FIX            0x0
+#define APIC_EILVT_MSG_SMI            0x2
+#define APIC_EILVT_MSG_NMI            0x4
+#define APIC_EILVT_MSG_EXT            0x7
+#define APIC_EILVT_MASKED             (1<<16)
+#define APIC_EILVT1                   0x510
+#define APIC_EILVT2                   0x520
+#define APIC_EILVT3                   0x530
+
+/*************** IO-APIC *************/
+
+#define IOAPIC_NUM_PINS              KVM_IOAPIC_NUM_PINS
+#define IOAPIC_VERSION_ID            0x11        /* IOAPIC version */
+#define IOAPIC_EDGE_TRIG             0
+#define IOAPIC_LEVEL_TRIG            1
+
+#define IOAPIC_DEFAULT_BASE_ADDRESS  0xfec00000
+#define IOAPIC_MEM_LENGTH            0x100
+
+/* Direct registers. */
+#define IOAPIC_REG_SELECT            0x00
+#define IOAPIC_REG_WINDOW            0x10
+#define IOAPIC_REG_EOI               0x40        /* IA64 IOSAPIC only */
+
+/* Indirect registers. */
+#define IOAPIC_REG_APIC_ID           0x00        /* x86 IOAPIC only */
+#define IOAPIC_REG_VERSION           0x01
+#define IOAPIC_REG_ARB_ID            0x02        /* x86 IOAPIC only */
+
+/* IOAPIC delivery modes */
+#define IOAPIC_FIXED                 0x0
+#define IOAPIC_LOWEST_PRIORITY       0x1
+#define IOAPIC_PMI                   0x2
+#define IOAPIC_NMI                   0x4
+#define IOAPIC_INIT                  0x5
+#define IOAPIC_EXTINT                0x7
+
+struct IO_APIC_route_entry {
+    uint32_t vector          :  8,
+             delivery_mode   :  3,   /* 000: FIXED
+                                      * 001: lowest prio
+                                      * 111: ExtINT
+                                      */
+             dest_mode       :  1,   /* 0: physical, 1: logical */
+             delivery_status :  1,
+             polarity        :  1,
+             irr             :  1,
+             trigger         :  1,   /* 0: edge, 1: level */
+             mask            :  1,   /* 0: enabled, 1: disabled */
+             __reserved_2    : 15;
+    uint32_t __reserved_3    : 24,
+             dest            :  8;
+} __attribute__ ((packed));
+
+#endif
diff --git a/pc-bios/xenner/cpufeature.h b/pc-bios/xenner/cpufeature.h
new file mode 100644
index 0000000..d0e4ddf
--- /dev/null
+++ b/pc-bios/xenner/cpufeature.h
@@ -0,0 +1,129 @@
+/*
+ * cpufeature.h
+ *
+ * Defines x86 CPU feature bits
+ */
+
+#ifndef __ASM_I386_CPUFEATURE_H
+#define __ASM_I386_CPUFEATURE_H
+
+/* Intel-defined CPU features, CPUID level 0x00000001 (edx), word 0 */
+#define X86_FEATURE_FPU          (0*32+ 0) /* Onboard FPU */
+#define X86_FEATURE_VME          (0*32+ 1) /* Virtual Mode Extensions */
+#define X86_FEATURE_DE           (0*32+ 2) /* Debugging Extensions */
+#define X86_FEATURE_PSE          (0*32+ 3) /* Page Size Extensions */
+#define X86_FEATURE_TSC          (0*32+ 4) /* Time Stamp Counter */
+#define X86_FEATURE_MSR          (0*32+ 5) /* Model-Specific Registers, RDMSR, WRMSR */
+#define X86_FEATURE_PAE          (0*32+ 6) /* Physical Address Extensions */
+#define X86_FEATURE_MCE          (0*32+ 7) /* Machine Check Exception */
+#define X86_FEATURE_CX8          (0*32+ 8) /* CMPXCHG8B instruction */
+#define X86_FEATURE_APIC         (0*32+ 9) /* Onboard APIC */
+#define X86_FEATURE_SEP          (0*32+11) /* SYSENTER/SYSEXIT */
+#define X86_FEATURE_MTRR         (0*32+12) /* Memory Type Range Registers */
+#define X86_FEATURE_PGE          (0*32+13) /* Page Global Enable */
+#define X86_FEATURE_MCA          (0*32+14) /* Machine Check Architecture */
+#define X86_FEATURE_CMOV         (0*32+15) /* CMOV instruction (FCMOVCC and FCOMI too if FPU present) */
+#define X86_FEATURE_PAT          (0*32+16) /* Page Attribute Table */
+#define X86_FEATURE_PSE36        (0*32+17) /* 36-bit PSEs */
+#define X86_FEATURE_PN           (0*32+18) /* Processor serial number */
+#define X86_FEATURE_CLFLSH       (0*32+19) /* Supports the CLFLUSH instruction */
+#define X86_FEATURE_DS           (0*32+21) /* Debug Store */
+#define X86_FEATURE_ACPI         (0*32+22) /* ACPI via MSR */
+#define X86_FEATURE_MMX          (0*32+23) /* Multimedia Extensions */
+#define X86_FEATURE_FXSR         (0*32+24) /* FXSAVE and FXRSTOR instructions (fast save and restore */
+                                           /* of FPU context), and CR4.OSFXSR available */
+#define X86_FEATURE_XMM          (0*32+25) /* Streaming SIMD Extensions */
+#define X86_FEATURE_XMM2         (0*32+26) /* Streaming SIMD Extensions-2 */
+#define X86_FEATURE_SELFSNOOP    (0*32+27) /* CPU self snoop */
+#define X86_FEATURE_HT           (0*32+28) /* Hyper-Threading */
+#define X86_FEATURE_ACC          (0*32+29) /* Automatic clock control */
+#define X86_FEATURE_IA64         (0*32+30) /* IA-64 processor */
+#define X86_FEATURE_PBE          (0*32+31) /* Pending Break Enable */
+
+/* AMD-defined CPU features, CPUID level 0x80000001, word 1 */
+/* Don't duplicate feature flags which are redundant with Intel! */
+#define X86_FEATURE_SYSCALL      (1*32+11) /* SYSCALL/SYSRET */
+#define X86_FEATURE_MP           (1*32+19) /* MP Capable. */
+#define X86_FEATURE_NX           (1*32+20) /* Execute Disable */
+#define X86_FEATURE_MMXEXT       (1*32+22) /* AMD MMX extensions */
+#define X86_FEATURE_FFXSR        (1*32+25) /* FXSAVE/FXRSTOR optimizations */
+#define X86_FEATURE_PAGE1GB      (1*32+26) /* 1 GB large page support */
+#define X86_FEATURE_RDTSCP       (1*32+27) /* RDTSCP */
+#define X86_FEATURE_LM           (1*32+29) /* Long Mode (x86-64) */
+#define X86_FEATURE_3DNOWEXT     (1*32+30) /* AMD 3DNow! extensions */
+#define X86_FEATURE_3DNOW        (1*32+31) /* 3DNow! */
+
+/* Transmeta-defined CPU features, CPUID level 0x80860001, word 2 */
+#define X86_FEATURE_RECOVERY     (2*32+ 0) /* CPU in recovery mode */
+#define X86_FEATURE_LONGRUN      (2*32+ 1) /* Longrun power control */
+#define X86_FEATURE_LRTI         (2*32+ 3) /* LongRun table interface */
+
+/* Other features, Linux-defined mapping, word 3 */
+/* This range is used for feature bits which conflict or are synthesized */
+#define X86_FEATURE_CXMMX        (3*32+ 0) /* Cyrix MMX extensions */
+#define X86_FEATURE_K6_MTRR      (3*32+ 1) /* AMD K6 nonstandard MTRRs */
+#define X86_FEATURE_CYRIX_ARR    (3*32+ 2) /* Cyrix ARRs (= MTRRs) */
+#define X86_FEATURE_CENTAUR_MCR  (3*32+ 3) /* Centaur MCRs (= MTRRs) */
+/* cpu types for specific tunings: */
+#define X86_FEATURE_K8           (3*32+ 4) /* Opteron, Athlon64 */
+#define X86_FEATURE_K7           (3*32+ 5) /* Athlon */
+#define X86_FEATURE_P3           (3*32+ 6) /* P3 */
+#define X86_FEATURE_P4           (3*32+ 7) /* P4 */
+#define X86_FEATURE_CONSTANT_TSC (3*32+ 8) /* TSC ticks at a constant rate */
+#define X86_FEATURE_NONSTOP_TSC  (3*32+ 9) /* TSC does not stop in C states */
+#define X86_FEATURE_ARAT         (3*32+ 10) /* Always running APIC timer */
+#define X86_FEATURE_ARCH_PERFMON (3*32+11) /* Intel Architectural PerfMon */
+#define X86_FEATURE_TSC_RELIABLE (3*32+12) /* TSC is known to be reliable */
+
+/* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
+#define X86_FEATURE_XMM3         (4*32+ 0) /* Streaming SIMD Extensions-3 */
+#define X86_FEATURE_DTES64       (4*32+ 2) /* 64-bit Debug Store */
+#define X86_FEATURE_MWAIT        (4*32+ 3) /* Monitor/Mwait support */
+#define X86_FEATURE_DSCPL        (4*32+ 4) /* CPL Qualified Debug Store */
+#define X86_FEATURE_VMXE         (4*32+ 5) /* Virtual Machine Extensions */
+#define X86_FEATURE_SMXE         (4*32+ 6) /* Safer Mode Extensions */
+#define X86_FEATURE_EST          (4*32+ 7) /* Enhanced SpeedStep */
+#define X86_FEATURE_TM2          (4*32+ 8) /* Thermal Monitor 2 */
+#define X86_FEATURE_SSSE3        (4*32+ 9) /* Supplemental Streaming SIMD Extensions-3 */
+#define X86_FEATURE_CID          (4*32+10) /* Context ID */
+#define X86_FEATURE_CX16         (4*32+13) /* CMPXCHG16B */
+#define X86_FEATURE_XTPR         (4*32+14) /* Send Task Priority Messages */
+#define X86_FEATURE_PDCM         (4*32+15) /* Perf/Debug Capability MSR */
+#define X86_FEATURE_DCA          (4*32+18) /* Direct Cache Access */
+#define X86_FEATURE_SSE4_1       (4*32+19) /* Streaming SIMD Extensions 4.1 */
+#define X86_FEATURE_SSE4_2       (4*32+20) /* Streaming SIMD Extensions 4.2 */
+#define X86_FEATURE_X2APIC       (4*32+21) /* Extended xAPIC */
+#define X86_FEATURE_POPCNT       (4*32+23) /* POPCNT instruction */
+#define X86_FEATURE_XSAVE        (4*32+26) /* XSAVE/XRSTOR/XSETBV/XGETBV */
+#define X86_FEATURE_OSXSAVE      (4*32+27) /* OSXSAVE */
+#define X86_FEATURE_HYPERVISOR   (4*32+31) /* Running under some hypervisor */
+
+/* VIA/Cyrix/Centaur-defined CPU features, CPUID level 0xC0000001, word 5 */
+#define X86_FEATURE_XSTORE       (5*32+ 2) /* on-CPU RNG present (xstore insn) */
+#define X86_FEATURE_XSTORE_EN    (5*32+ 3) /* on-CPU RNG enabled */
+#define X86_FEATURE_XCRYPT       (5*32+ 6) /* on-CPU crypto (xcrypt insn) */
+#define X86_FEATURE_XCRYPT_EN    (5*32+ 7) /* on-CPU crypto enabled */
+#define X86_FEATURE_ACE2         (5*32+ 8) /* Advanced Cryptography Engine v2 */
+#define X86_FEATURE_ACE2_EN      (5*32+ 9) /* ACE v2 enabled */
+#define X86_FEATURE_PHE          (5*32+ 10) /* PadLock Hash Engine */
+#define X86_FEATURE_PHE_EN       (5*32+ 11) /* PHE enabled */
+#define X86_FEATURE_PMM          (5*32+ 12) /* PadLock Montgomery Multiplier */
+#define X86_FEATURE_PMM_EN       (5*32+ 13) /* PMM enabled */
+
+/* More extended AMD flags: CPUID level 0x80000001, ecx, word 6 */
+#define X86_FEATURE_LAHF_LM      (6*32+ 0) /* LAHF/SAHF in long mode */
+#define X86_FEATURE_CMP_LEGACY   (6*32+ 1) /* If yes HyperThreading not valid */
+#define X86_FEATURE_SVME         (6*32+ 2) /* Secure Virtual Machine */
+#define X86_FEATURE_EXTAPICSPACE (6*32+ 3) /* Extended APIC space */
+#define X86_FEATURE_ALTMOVCR     (6*32+ 4) /* LOCK MOV CR accesses CR+8 */
+#define X86_FEATURE_ABM          (6*32+ 5) /* Advanced Bit Manipulation */
+#define X86_FEATURE_SSE4A        (6*32+ 6) /* AMD Streaming SIMD Extensions-4a */
+#define X86_FEATURE_MISALIGNSSE  (6*32+ 7) /* Misaligned SSE Access */
+#define X86_FEATURE_3DNOWPF      (6*32+ 8) /* 3DNow! Prefetch */
+#define X86_FEATURE_OSVW         (6*32+ 9) /* OS Visible Workaround */
+#define X86_FEATURE_IBS          (6*32+ 10) /* Instruction Based Sampling */
+#define X86_FEATURE_SSE5         (6*32+ 11) /* AMD Streaming SIMD Extensions-5 */
+#define X86_FEATURE_SKINIT       (6*32+ 12) /* SKINIT, STGI/CLGI, DEV */
+#define X86_FEATURE_WDT          (6*32+ 13) /* Watchdog Timer */
+
+#endif
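
Note on the encoding used in cpufeature.h above: each X86_FEATURE_* value packs a capability word index and a bit position into one integer (word*32+bit). Below is a minimal sketch of how such a value could be decoded and tested against an array of CPUID words. NCAPINTS, feature_word(), feature_bit() and has_feature() are hypothetical names chosen for illustration; they are not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Two values copied from the header above. */
#define X86_FEATURE_NX      (1*32+20) /* Execute Disable */
#define X86_FEATURE_LAHF_LM (6*32+ 0) /* LAHF/SAHF in long mode */

/* Hypothetical: number of 32-bit capability words (words 0..6 above). */
#define NCAPINTS 7

/* Index of the 32-bit word that holds the feature. */
static inline unsigned feature_word(unsigned feature)
{
    return feature / 32;
}

/* Bit position of the feature inside its word. */
static inline unsigned feature_bit(unsigned feature)
{
    return feature % 32;
}

/* Nonzero if the feature bit is set in caps[]. */
static inline int has_feature(const uint32_t caps[NCAPINTS], unsigned feature)
{
    assert(feature_word(feature) < NCAPINTS);
    return (caps[feature_word(feature)] >> feature_bit(feature)) & 1;
}
```

With caps[1] filled from CPUID leaf 0x80000001 (edx), has_feature(caps, X86_FEATURE_NX) would report whether NX is available.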
diff --git a/pc-bios/xenner/list.h b/pc-bios/xenner/list.h
new file mode 100644
index 0000000..d621bda
--- /dev/null
+++ b/pc-bios/xenner/list.h
@@ -0,0 +1,169 @@
+#ifndef _LIST_H
+#define _LIST_H 1
+
+/*
+ * Simple doubly linked list implementation.
+ *        -- shamelessly stolen from the Linux kernel sources
+ *
+ * Some of the internal functions ("__xxx") are useful when
+ * manipulating whole lists rather than single entries, as
+ * sometimes we already know the next/prev entries and we can
+ * generate better code by using them directly rather than
+ * using the generic single-entry routines.
+ */
+
+struct list_head {
+    struct list_head *next, *prev;
+};
+
+#define LIST_HEAD_INIT(name) { &(name), &(name) }
+
+#define LIST_HEAD(name) \
+    struct list_head name = LIST_HEAD_INIT(name)
+
+#define INIT_LIST_HEAD(ptr) do { \
+    (ptr)->next = (ptr); (ptr)->prev = (ptr); \
+} while (0)
+
+/*
+ * Insert an item entry between two known consecutive entries.
+ *
+ * This is only for internal list manipulation where we know
+ * the prev/next entries already!
+ */
+static __inline__ void __list_add(struct list_head * item,
+        struct list_head * prev,
+        struct list_head * next)
+{
+    next->prev = item;
+    item->next = next;
+    item->prev = prev;
+    prev->next = item;
+}
+
+/**
+ * list_add - add an item entry
+ * @item: item entry to be added
+ * @head: list head to add it after
+ *
+ * Insert an item entry after the specified head.
+ * This is good for implementing stacks.
+ */
+static __inline__ void list_add(struct list_head *item, struct list_head *head)
+{
+    __list_add(item, head, head->next);
+}
+
+/**
+ * list_add_tail - add an item entry
+ * @item: item entry to be added
+ * @head: list head to add it before
+ *
+ * Insert an item entry before the specified head.
+ * This is useful for implementing queues.
+ */
+static __inline__ void list_add_tail(struct list_head *item, struct list_head *head)
+{
+    __list_add(item, head->prev, head);
+}
+
+/*
+ * Delete a list entry by making the prev/next entries
+ * point to each other.
+ *
+ * This is only for internal list manipulation where we know
+ * the prev/next entries already!
+ */
+static __inline__ void __list_del(struct list_head * prev,
+                                  struct list_head * next)
+{
+    next->prev = prev;
+    prev->next = next;
+}
+
+/**
+ * list_del - deletes entry from list.
+ * @entry: the element to delete from the list.
+ * Note: list_empty on entry does not return true after this; the entry is in an undefined state.
+ */
+static __inline__ void list_del(struct list_head *entry)
+{
+    __list_del(entry->prev, entry->next);
+}
+
+/**
+ * list_del_init - deletes entry from list and reinitializes it.
+ * @entry: the element to delete from the list.
+ */
+static __inline__ void list_del_init(struct list_head *entry)
+{
+    __list_del(entry->prev, entry->next);
+    INIT_LIST_HEAD(entry);
+}
+
+/**
+ * list_empty - tests whether a list is empty
+ * @head: the list to test.
+ */
+static __inline__ int list_empty(struct list_head *head)
+{
+    return head->next == head;
+}
+
+/**
+ * list_splice - join two lists
+ * @list: the list to add.
+ * @head: the place to add it in the first list.
+ */
+static __inline__ void list_splice(struct list_head *list, struct list_head *head)
+{
+    struct list_head *first = list->next;
+
+    if (first != list) {
+        struct list_head *last = list->prev;
+        struct list_head *at = head->next;
+
+        first->prev = head;
+        head->next = first;
+
+        last->next = at;
+        at->prev = last;
+    }
+}
+
+/**
+ * list_entry - get the struct for this entry
+ * @ptr:        the &struct list_head pointer.
+ * @type:        the type of the struct this is embedded in.
+ * @member:        the name of the list_struct within the struct.
+ */
+#define list_entry(ptr, type, member) \
+    ((type *)((char *)(ptr)-(unsigned long)(&((type *)0)->member)))
+
+/**
+ * list_for_each        -        iterate over a list
+ * @pos:        the &struct list_head to use as a loop counter.
+ * @head:        the head for your list.
+ */
+#define list_for_each(pos, head) \
+    for (pos = (head)->next; pos != (head); pos = pos->next)
+
+/**
+ * list_for_each_safe        -        iterate over a list safe against removal of list entry
+ * @pos:        the &struct list_head to use as a loop counter.
+ * @n:                another &struct list_head to use as temporary storage
+ * @head:        the head for your list.
+ */
+#define list_for_each_safe(pos, n, head) \
+    for (pos = (head)->next, n = pos->next; pos != (head); \
+        pos = n, n = pos->next)
+
+/**
+ * list_for_each_prev        -        iterate over a list in reverse order
+ * @pos:        the &struct list_head to use as a loop counter.
+ * @head:        the head for your list.
+ */
+#define list_for_each_prev(pos, head) \
+    for (pos = (head)->prev; pos != (head); pos = pos->prev)
+
+#endif /* _LIST_H */
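
A short usage sketch for list.h above, showing the intrusive pattern: the list_head is embedded in the payload struct, list_add_tail() gives FIFO order, and list_entry() recovers the containing struct during iteration. The definitions are minimal copies from the header so the snippet stands alone; struct job and queue_order() are hypothetical and only serve the illustration.

```c
/* Minimal copies of the definitions from list.h above. */
struct list_head {
    struct list_head *next, *prev;
};

#define INIT_LIST_HEAD(ptr) do { \
    (ptr)->next = (ptr); (ptr)->prev = (ptr); \
} while (0)

static inline void __list_add(struct list_head *item,
                              struct list_head *prev,
                              struct list_head *next)
{
    next->prev = item;
    item->next = next;
    item->prev = prev;
    prev->next = item;
}

static inline void list_add_tail(struct list_head *item,
                                 struct list_head *head)
{
    __list_add(item, head->prev, head);
}

#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr)-(unsigned long)(&((type *)0)->member)))

#define list_for_each(pos, head) \
    for (pos = (head)->next; pos != (head); pos = pos->next)

/* Hypothetical payload type: the list node lives inside the struct. */
struct job {
    int id;
    struct list_head node;
};

/* Queue three jobs, then encode the visit order as decimal digits. */
static int queue_order(void)
{
    struct list_head queue;
    struct job a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };
    struct list_head *pos;
    int order = 0;

    INIT_LIST_HEAD(&queue);
    list_add_tail(&a.node, &queue);
    list_add_tail(&b.node, &queue);
    list_add_tail(&c.node, &queue);

    list_for_each(pos, &queue) {
        struct job *j = list_entry(pos, struct job, node);
        order = order * 10 + j->id;
    }
    return order;
}
```

Using list_add() instead of list_add_tail() would push each entry right after the head and so visit the jobs in reverse (stack) order.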
diff --git a/pc-bios/xenner/msr-index.h b/pc-bios/xenner/msr-index.h
new file mode 100644
index 0000000..99e17f1
--- /dev/null
+++ b/pc-bios/xenner/msr-index.h
@@ -0,0 +1,278 @@
+#ifndef __ASM_MSR_INDEX_H
+#define __ASM_MSR_INDEX_H
+
+/* CPU model specific register (MSR) numbers */
+
+/* x86-64 specific MSRs */
+#define MSR_EFER                    0xc0000080 /* extended feature register */
+#define MSR_STAR                    0xc0000081 /* legacy mode SYSCALL target */
+#define MSR_LSTAR                   0xc0000082 /* long mode SYSCALL target */
+#define MSR_CSTAR                   0xc0000083 /* compat mode SYSCALL target */
+#define MSR_SYSCALL_MASK            0xc0000084 /* EFLAGS mask for syscall */
+#define MSR_FS_BASE                 0xc0000100 /* 64bit FS base */
+#define MSR_GS_BASE                 0xc0000101 /* 64bit GS base */
+#define MSR_KERNEL_GS_BASE          0xc0000102 /* SwapGS GS shadow */
+
+/* EFER bits: */
+#define _EFER_SCE                    0  /* SYSCALL/SYSRET */
+#define _EFER_LME                    8  /* Long mode enable */
+#define _EFER_LMA                   10 /* Long mode active (read-only) */
+#define _EFER_NX                    11 /* No execute enable */
+
+#define EFER_SCE                    (1<<_EFER_SCE)
+#define EFER_LME                    (1<<_EFER_LME)
+#define EFER_LMA                    (1<<_EFER_LMA)
+#define EFER_NX                     (1<<_EFER_NX)
+
+/* Intel MSRs. Some also available on other CPUs */
+#define MSR_IA32_PERFCTR0           0x000000c1
+#define MSR_IA32_PERFCTR1           0x000000c2
+#define MSR_FSB_FREQ                0x000000cd
+
+#define MSR_MTRRcap                 0x000000fe
+#define MSR_IA32_BBL_CR_CTL         0x00000119
+
+#define MSR_IA32_SYSENTER_CS        0x00000174
+#define MSR_IA32_SYSENTER_ESP       0x00000175
+#define MSR_IA32_SYSENTER_EIP       0x00000176
+
+#define MSR_IA32_MCG_CAP            0x00000179
+#define MSR_IA32_MCG_STATUS         0x0000017a
+#define MSR_IA32_MCG_CTL            0x0000017b
+
+#define MSR_IA32_PEBS_ENABLE        0x000003f1
+#define MSR_IA32_DS_AREA            0x00000600
+#define MSR_IA32_PERF_CAPABILITIES  0x00000345
+
+#define MSR_MTRRfix64K_00000        0x00000250
+#define MSR_MTRRfix16K_80000        0x00000258
+#define MSR_MTRRfix16K_A0000        0x00000259
+#define MSR_MTRRfix4K_C0000         0x00000268
+#define MSR_MTRRfix4K_C8000         0x00000269
+#define MSR_MTRRfix4K_D0000         0x0000026a
+#define MSR_MTRRfix4K_D8000         0x0000026b
+#define MSR_MTRRfix4K_E0000         0x0000026c
+#define MSR_MTRRfix4K_E8000         0x0000026d
+#define MSR_MTRRfix4K_F0000         0x0000026e
+#define MSR_MTRRfix4K_F8000         0x0000026f
+#define MSR_MTRRdefType             0x000002ff
+
+#define MSR_IA32_DEBUGCTLMSR        0x000001d9
+#define MSR_IA32_LASTBRANCHFROMIP   0x000001db
+#define MSR_IA32_LASTBRANCHTOIP     0x000001dc
+#define MSR_IA32_LASTINTFROMIP      0x000001dd
+#define MSR_IA32_LASTINTTOIP        0x000001de
+
+#define MSR_IA32_MC0_CTL            0x00000400
+#define MSR_IA32_MC0_STATUS         0x00000401
+#define MSR_IA32_MC0_ADDR           0x00000402
+#define MSR_IA32_MC0_MISC           0x00000403
+
+#define MSR_P6_PERFCTR0             0x000000c1
+#define MSR_P6_PERFCTR1             0x000000c2
+#define MSR_P6_EVNTSEL0             0x00000186
+#define MSR_P6_EVNTSEL1             0x00000187
+
+/* K7/K8 MSRs. Not complete. See the architecture manual for a more
+   complete list. */
+#define MSR_K7_EVNTSEL0             0xc0010000
+#define MSR_K7_PERFCTR0             0xc0010004
+#define MSR_K7_EVNTSEL1             0xc0010001
+#define MSR_K7_PERFCTR1             0xc0010005
+#define MSR_K7_EVNTSEL2             0xc0010002
+#define MSR_K7_PERFCTR2             0xc0010006
+#define MSR_K7_EVNTSEL3             0xc0010003
+#define MSR_K7_PERFCTR3             0xc0010007
+#define MSR_K8_TOP_MEM1             0xc001001a
+#define MSR_K7_CLK_CTL              0xc001001b
+#define MSR_K8_TOP_MEM2             0xc001001d
+#define MSR_K8_SYSCFG               0xc0010010
+
+#define K8_MTRRFIXRANGE_DRAM_ENABLE 0x00040000 /* MtrrFixDramEn bit    */
+#define K8_MTRRFIXRANGE_DRAM_MODIFY 0x00080000 /* MtrrFixDramModEn bit */
+#define K8_MTRR_RDMEM_WRMEM_MASK    0x18181818 /* Mask: RdMem|WrMem    */
+
+#define MSR_K7_HWCR                 0xc0010015
+#define MSR_K8_HWCR                 0xc0010015
+#define MSR_K7_FID_VID_CTL          0xc0010041
+#define MSR_K7_FID_VID_STATUS       0xc0010042
+#define MSR_K8_ENABLE_C1E           0xc0010055
+
+/* K6 MSRs */
+#define MSR_K6_EFER                 0xc0000080
+#define MSR_K6_STAR                 0xc0000081
+#define MSR_K6_WHCR                 0xc0000082
+#define MSR_K6_UWCCR                0xc0000085
+#define MSR_K6_EPMR                 0xc0000086
+#define MSR_K6_PSOR                 0xc0000087
+#define MSR_K6_PFIR                 0xc0000088
+
+/* Centaur-Hauls/IDT defined MSRs. */
+#define MSR_IDT_FCR1                0x00000107
+#define MSR_IDT_FCR2                0x00000108
+#define MSR_IDT_FCR3                0x00000109
+#define MSR_IDT_FCR4                0x0000010a
+
+#define MSR_IDT_MCR0                0x00000110
+#define MSR_IDT_MCR1                0x00000111
+#define MSR_IDT_MCR2                0x00000112
+#define MSR_IDT_MCR3                0x00000113
+#define MSR_IDT_MCR4                0x00000114
+#define MSR_IDT_MCR5                0x00000115
+#define MSR_IDT_MCR6                0x00000116
+#define MSR_IDT_MCR7                0x00000117
+#define MSR_IDT_MCR_CTRL            0x00000120
+
+/* VIA Cyrix defined MSRs */
+#define MSR_VIA_FCR                 0x00001107
+#define MSR_VIA_LONGHAUL            0x0000110a
+#define MSR_VIA_RNG                 0x0000110b
+#define MSR_VIA_BCR2                0x00001147
+
+/* Transmeta defined MSRs */
+#define MSR_TMTA_LONGRUN_CTRL       0x80868010
+#define MSR_TMTA_LONGRUN_FLAGS      0x80868011
+#define MSR_TMTA_LRTI_READOUT       0x80868018
+#define MSR_TMTA_LRTI_VOLT_MHZ      0x8086801a
+
+/* Intel defined MSRs. */
+#define MSR_IA32_P5_MC_ADDR         0x00000000
+#define MSR_IA32_P5_MC_TYPE         0x00000001
+#define MSR_IA32_TSC                0x00000010
+#define MSR_IA32_PLATFORM_ID        0x00000017
+#define MSR_IA32_EBL_CR_POWERON     0x0000002a
+
+#define MSR_IA32_APICBASE           0x0000001b
+#define MSR_IA32_APICBASE_BSP       (1<<8)
+#define MSR_IA32_APICBASE_ENABLE    (1<<11)
+#define MSR_IA32_APICBASE_BASE      (0xfffff<<12)
+
+#define MSR_IA32_UCODE_WRITE        0x00000079
+#define MSR_IA32_UCODE_REV          0x0000008b
+
+#define MSR_IA32_PERF_STATUS        0x00000198
+#define MSR_IA32_PERF_CTL           0x00000199
+
+#define MSR_IA32_MPERF              0x000000e7
+#define MSR_IA32_APERF              0x000000e8
+
+#define MSR_IA32_THERM_CONTROL      0x0000019a
+#define MSR_IA32_THERM_INTERRUPT    0x0000019b
+#define MSR_IA32_THERM_STATUS       0x0000019c
+#define MSR_IA32_MISC_ENABLE        0x000001a0
+
+/* Intel Model 6 */
+#define MSR_P6_EVNTSEL0             0x00000186
+#define MSR_P6_EVNTSEL1             0x00000187
+
+/* P4/Xeon+ specific */
+#define MSR_IA32_MCG_EAX            0x00000180
+#define MSR_IA32_MCG_EBX            0x00000181
+#define MSR_IA32_MCG_ECX            0x00000182
+#define MSR_IA32_MCG_EDX            0x00000183
+#define MSR_IA32_MCG_ESI            0x00000184
+#define MSR_IA32_MCG_EDI            0x00000185
+#define MSR_IA32_MCG_EBP            0x00000186
+#define MSR_IA32_MCG_ESP            0x00000187
+#define MSR_IA32_MCG_EFLAGS         0x00000188
+#define MSR_IA32_MCG_EIP            0x00000189
+#define MSR_IA32_MCG_RESERVED       0x0000018a
+
+/* Pentium IV performance counter MSRs */
+#define MSR_P4_BPU_PERFCTR0         0x00000300
+#define MSR_P4_BPU_PERFCTR1         0x00000301
+#define MSR_P4_BPU_PERFCTR2         0x00000302
+#define MSR_P4_BPU_PERFCTR3         0x00000303
+#define MSR_P4_MS_PERFCTR0          0x00000304
+#define MSR_P4_MS_PERFCTR1          0x00000305
+#define MSR_P4_MS_PERFCTR2          0x00000306
+#define MSR_P4_MS_PERFCTR3          0x00000307
+#define MSR_P4_FLAME_PERFCTR0       0x00000308
+#define MSR_P4_FLAME_PERFCTR1       0x00000309
+#define MSR_P4_FLAME_PERFCTR2       0x0000030a
+#define MSR_P4_FLAME_PERFCTR3       0x0000030b
+#define MSR_P4_IQ_PERFCTR0          0x0000030c
+#define MSR_P4_IQ_PERFCTR1          0x0000030d
+#define MSR_P4_IQ_PERFCTR2          0x0000030e
+#define MSR_P4_IQ_PERFCTR3          0x0000030f
+#define MSR_P4_IQ_PERFCTR4          0x00000310
+#define MSR_P4_IQ_PERFCTR5          0x00000311
+#define MSR_P4_BPU_CCCR0            0x00000360
+#define MSR_P4_BPU_CCCR1            0x00000361
+#define MSR_P4_BPU_CCCR2            0x00000362
+#define MSR_P4_BPU_CCCR3            0x00000363
+#define MSR_P4_MS_CCCR0             0x00000364
+#define MSR_P4_MS_CCCR1             0x00000365
+#define MSR_P4_MS_CCCR2             0x00000366
+#define MSR_P4_MS_CCCR3             0x00000367
+#define MSR_P4_FLAME_CCCR0          0x00000368
+#define MSR_P4_FLAME_CCCR1          0x00000369
+#define MSR_P4_FLAME_CCCR2          0x0000036a
+#define MSR_P4_FLAME_CCCR3          0x0000036b
+#define MSR_P4_IQ_CCCR0             0x0000036c
+#define MSR_P4_IQ_CCCR1             0x0000036d
+#define MSR_P4_IQ_CCCR2             0x0000036e
+#define MSR_P4_IQ_CCCR3             0x0000036f
+#define MSR_P4_IQ_CCCR4             0x00000370
+#define MSR_P4_IQ_CCCR5             0x00000371
+#define MSR_P4_ALF_ESCR0            0x000003ca
+#define MSR_P4_ALF_ESCR1            0x000003cb
+#define MSR_P4_BPU_ESCR0            0x000003b2
+#define MSR_P4_BPU_ESCR1            0x000003b3
+#define MSR_P4_BSU_ESCR0            0x000003a0
+#define MSR_P4_BSU_ESCR1            0x000003a1
+#define MSR_P4_CRU_ESCR0            0x000003b8
+#define MSR_P4_CRU_ESCR1            0x000003b9
+#define MSR_P4_CRU_ESCR2            0x000003cc
+#define MSR_P4_CRU_ESCR3            0x000003cd
+#define MSR_P4_CRU_ESCR4            0x000003e0
+#define MSR_P4_CRU_ESCR5            0x000003e1
+#define MSR_P4_DAC_ESCR0            0x000003a8
+#define MSR_P4_DAC_ESCR1            0x000003a9
+#define MSR_P4_FIRM_ESCR0           0x000003a4
+#define MSR_P4_FIRM_ESCR1           0x000003a5
+#define MSR_P4_FLAME_ESCR0          0x000003a6
+#define MSR_P4_FLAME_ESCR1          0x000003a7
+#define MSR_P4_FSB_ESCR0            0x000003a2
+#define MSR_P4_FSB_ESCR1            0x000003a3
+#define MSR_P4_IQ_ESCR0             0x000003ba
+#define MSR_P4_IQ_ESCR1             0x000003bb
+#define MSR_P4_IS_ESCR0             0x000003b4
+#define MSR_P4_IS_ESCR1             0x000003b5
+#define MSR_P4_ITLB_ESCR0           0x000003b6
+#define MSR_P4_ITLB_ESCR1           0x000003b7
+#define MSR_P4_IX_ESCR0             0x000003c8
+#define MSR_P4_IX_ESCR1             0x000003c9
+#define MSR_P4_MOB_ESCR0            0x000003aa
+#define MSR_P4_MOB_ESCR1            0x000003ab
+#define MSR_P4_MS_ESCR0             0x000003c0
+#define MSR_P4_MS_ESCR1             0x000003c1
+#define MSR_P4_PMH_ESCR0            0x000003ac
+#define MSR_P4_PMH_ESCR1            0x000003ad
+#define MSR_P4_RAT_ESCR0            0x000003bc
+#define MSR_P4_RAT_ESCR1            0x000003bd
+#define MSR_P4_SAAT_ESCR0           0x000003ae
+#define MSR_P4_SAAT_ESCR1           0x000003af
+#define MSR_P4_SSU_ESCR0            0x000003be
+#define MSR_P4_SSU_ESCR1            0x000003bf /* guess: not in manual */
+
+#define MSR_P4_TBPU_ESCR0           0x000003c2
+#define MSR_P4_TBPU_ESCR1           0x000003c3
+#define MSR_P4_TC_ESCR0             0x000003c4
+#define MSR_P4_TC_ESCR1             0x000003c5
+#define MSR_P4_U2L_ESCR0            0x000003b0
+#define MSR_P4_U2L_ESCR1            0x000003b1
+
+/* Intel Core-based CPU performance counters */
+#define MSR_CORE_PERF_FIXED_CTR0    0x00000309
+#define MSR_CORE_PERF_FIXED_CTR1    0x0000030a
+#define MSR_CORE_PERF_FIXED_CTR2    0x0000030b
+#define MSR_CORE_PERF_FIXED_CTR_CTRL 0x0000038d
+#define MSR_CORE_PERF_GLOBAL_STATUS 0x0000038e
+#define MSR_CORE_PERF_GLOBAL_CTRL   0x0000038f
+#define MSR_CORE_PERF_GLOBAL_OVF_CTRL 0x00000390
+
+/* Geode defined MSRs */
+#define MSR_GEODE_BUSCONT_CONF0     0x00001900
+
+#endif /* __ASM_MSR_INDEX_H */
diff --git a/pc-bios/xenner/processor.h b/pc-bios/xenner/processor.h
new file mode 100644
index 0000000..124d6f0
--- /dev/null
+++ b/pc-bios/xenner/processor.h
@@ -0,0 +1,326 @@
+#ifndef __PROCESSOR_H__
+#define __PROCESSOR_H__ 1
+
+/*
+ * x86 hardware specific structs and defines
+ */
+
+/* page size */
+#undef PAGE_SHIFT
+#undef PAGE_SIZE
+#undef PAGE_MASK
+#define PAGE_SHIFT             12
+#define PAGE_SIZE              (1 << PAGE_SHIFT)
+#define PAGE_MASK              (~(PAGE_SIZE-1))
+
+#define PAGE_SHIFT_2MB         21
+#define PAGE_SIZE_2MB          (1 << PAGE_SHIFT_2MB)
+#define PAGE_MASK_2MB          (~(PAGE_SIZE_2MB-1))
+
+#define addr_to_frame(addr)    ((addr) >> PAGE_SHIFT)
+#define frame_to_addr(frame)   ((frame) << PAGE_SHIFT)
+#define addr_offset(addr)      ((addr) & ~PAGE_MASK)
+
+/* page flags */
+#define _PAGE_PRESENT          0x001
+#define _PAGE_RW               0x002
+#define _PAGE_USER             0x004
+#define _PAGE_PWT              0x008
+#define _PAGE_PCD              0x010
+#define _PAGE_ACCESSED         0x020
+#define _PAGE_DIRTY            0x040
+#define _PAGE_PSE              0x080
+#define _PAGE_GLOBAL           0x100
+#define _PAGE_NX               ((uint64_t)1<<63)
+
+/* 32-bit paging */
+#define PGD_SHIFT_32           22
+#define PTE_SHIFT_32           12
+
+#define PGD_COUNT_32           1024
+#define PTE_COUNT_32           1024
+
+#define PGD_INDEX_32(va)       (((va) >> PGD_SHIFT_32) & (PGD_COUNT_32-1))
+#define PTE_INDEX_32(va)       (((va) >> PTE_SHIFT_32) & (PTE_COUNT_32-1))
+
+static inline uint32_t get_pgentry_32(uint32_t frame, uint32_t flags)
+{
+    return (frame << PAGE_SHIFT) | flags;
+}
+static inline uint32_t get_pgframe_32(uint32_t entry)
+{
+    return entry >> PAGE_SHIFT;
+}
+static inline uint32_t get_pgflags_32(uint32_t entry)
+{
+    return entry & ~PAGE_MASK;
+}
+static inline uint32_t test_pgflag_32(uint32_t entry, uint32_t flag)
+{
+    return entry & ~PAGE_MASK & flag;
+}
+
+/* 32-bit pae paging */
+#define PGD_SHIFT_PAE          30
+#define PMD_SHIFT_PAE          21
+#define PTE_SHIFT_PAE          12
+
+#define PGD_COUNT_PAE          4
+#define PMD_COUNT_PAE          512
+#define PTE_COUNT_PAE          512
+
+#define PGD_INDEX_PAE(va)      (((va) >> PGD_SHIFT_PAE) & (PGD_COUNT_PAE-1))
+#define PMD_INDEX_PAE(va)      (((va) >> PMD_SHIFT_PAE) & (PMD_COUNT_PAE-1))
+#define PTE_INDEX_PAE(va)      (((va) >> PTE_SHIFT_PAE) & (PTE_COUNT_PAE-1))
+
+static inline uint64_t get_pgentry_pae(uint32_t frame, uint32_t flags)
+{
+    return ((uint64_t)frame << PAGE_SHIFT) | flags;
+}
+static inline uint32_t get_pgframe_pae(uint64_t entry)
+{
+    return (entry & ~_PAGE_NX) >> PAGE_SHIFT;
+}
+static inline uint32_t get_pgflags_pae(uint64_t entry)
+{
+    return entry & ~PAGE_MASK;
+}
+static inline uint32_t test_pgflag_pae(uint64_t entry, uint32_t flag)
+{
+    return entry & ~PAGE_MASK & flag;
+}
+
+/* 64-bit paging */
+#define PGD_SHIFT_64           39
+#define PUD_SHIFT_64           30
+#define PMD_SHIFT_64           21
+#define PTE_SHIFT_64           12
+
+#define PGD_COUNT_64           512
+#define PUD_COUNT_64           512
+#define PMD_COUNT_64           512
+#define PTE_COUNT_64           512
+
+#define PGD_INDEX_64(va)       (((va) >> PGD_SHIFT_64) & (PGD_COUNT_64-1))
+#define PUD_INDEX_64(va)       (((va) >> PUD_SHIFT_64) & (PUD_COUNT_64-1))
+#define PMD_INDEX_64(va)       (((va) >> PMD_SHIFT_64) & (PMD_COUNT_64-1))
+#define PTE_INDEX_64(va)       (((va) >> PTE_SHIFT_64) & (PTE_COUNT_64-1))
+
+static inline uint64_t get_pgentry_64(uint64_t frame, uint32_t flags)
+{
+    if ((flags & _PAGE_PSE) && (frame & 0x1ff)) {
+        /* huge page frame must be 2MB-aligned (512 frames) */
+        while (1) { }
+    }
+    return (frame << PAGE_SHIFT) | flags;
+}
+static inline uint64_t get_pgframe_64(uint64_t entry)
+{
+    return (entry & ~_PAGE_NX) >> PAGE_SHIFT;
+}
+static inline uint32_t get_pgflags_64(uint64_t entry)
+{
+    return entry & ~PAGE_MASK;
+}
+static inline uint32_t test_pgflag_64(uint64_t entry, uint32_t flag)
+{
+    return entry & ~PAGE_MASK & flag;
+}
+
+/* Generic functions */
+
+#if defined(CONFIG_64BIT) || defined(CONFIG_PAE)
+typedef uint64_t pte_t;
+#else
+typedef uint32_t pte_t;
+#endif
+
+static inline pte_t get_pgentry(unsigned long frame, uint32_t flags)
+{
+#if defined(CONFIG_64BIT) || defined(CONFIG_PAE)
+    return get_pgentry_64(frame, flags);
+#else
+    return get_pgentry_32(frame, flags);
+#endif
+}
+
+static inline pte_t get_pgframe(pte_t entry)
+{
+#if defined(CONFIG_64BIT) || defined(CONFIG_PAE)
+    return get_pgframe_64(entry);
+#else
+    return get_pgframe_32(entry);
+#endif
+}
+
+static inline pte_t get_pgflags(pte_t entry)
+{
+#if defined(CONFIG_64BIT) || defined(CONFIG_PAE)
+    return get_pgflags_64(entry);
+#else
+    return get_pgflags_32(entry);
+#endif
+}
+
+static inline pte_t test_pgflag(pte_t entry, uint32_t flag)
+{
+    return get_pgflags(entry) & flag;
+}
+
+/* ------------------------------------------------------------------ */
+
+/*
+ * EFLAGS bits
+ */
+#define X86_EFLAGS_CF      0x00000001 /* Carry Flag */
+#define X86_EFLAGS_PF      0x00000004 /* Parity Flag */
+#define X86_EFLAGS_AF      0x00000010 /* Auxiliary carry Flag */
+#define X86_EFLAGS_ZF      0x00000040 /* Zero Flag */
+#define X86_EFLAGS_SF      0x00000080 /* Sign Flag */
+#define X86_EFLAGS_TF      0x00000100 /* Trap Flag */
+#define X86_EFLAGS_IF      0x00000200 /* Interrupt Flag */
+#define X86_EFLAGS_DF      0x00000400 /* Direction Flag */
+#define X86_EFLAGS_OF      0x00000800 /* Overflow Flag */
+#define X86_EFLAGS_IOPL    0x00003000 /* IOPL mask */
+#define X86_EFLAGS_NT      0x00004000 /* Nested Task */
+#define X86_EFLAGS_RF      0x00010000 /* Resume Flag */
+#define X86_EFLAGS_VM      0x00020000 /* Virtual Mode */
+#define X86_EFLAGS_AC      0x00040000 /* Alignment Check */
+#define X86_EFLAGS_VIF     0x00080000 /* Virtual Interrupt Flag */
+#define X86_EFLAGS_VIP     0x00100000 /* Virtual Interrupt Pending */
+#define X86_EFLAGS_ID      0x00200000 /* CPUID detection flag */
+
+/*
+ * Basic CPU control in CR0
+ */
+#define X86_CR0_PE         0x00000001 /* Protection Enable */
+#define X86_CR0_MP         0x00000002 /* Monitor Coprocessor */
+#define X86_CR0_EM         0x00000004 /* Emulation */
+#define X86_CR0_TS         0x00000008 /* Task Switched */
+#define X86_CR0_ET         0x00000010 /* Extension Type */
+#define X86_CR0_NE         0x00000020 /* Numeric Error */
+#define X86_CR0_WP         0x00010000 /* Write Protect */
+#define X86_CR0_AM         0x00040000 /* Alignment Mask */
+#define X86_CR0_NW         0x20000000 /* Not Write-through */
+#define X86_CR0_CD         0x40000000 /* Cache Disable */
+#define X86_CR0_PG         0x80000000 /* Paging */
+
+/*
+ * Paging options in CR3
+ */
+#define X86_CR3_PWT        0x00000008 /* Page Write Through */
+#define X86_CR3_PCD        0x00000010 /* Page Cache Disable */
+
+/*
+ * Intel CPU features in CR4
+ */
+#define X86_CR4_VME        0x00000001 /* enable vm86 extensions */
+#define X86_CR4_PVI        0x00000002 /* virtual interrupts flag enable */
+#define X86_CR4_TSD        0x00000004 /* disable time stamp at ipl 3 */
+#define X86_CR4_DE         0x00000008 /* enable debugging extensions */
+#define X86_CR4_PSE        0x00000010 /* enable page size extensions */
+#define X86_CR4_PAE        0x00000020 /* enable physical address extensions */
+#define X86_CR4_MCE        0x00000040 /* Machine check enable */
+#define X86_CR4_PGE        0x00000080 /* enable global pages */
+#define X86_CR4_PCE        0x00000100 /* enable performance counters at ipl 3 */
+#define X86_CR4_OSFXSR     0x00000200 /* enable fast FPU save and restore */
+#define X86_CR4_OSXMMEXCPT 0x00000400 /* enable unmasked SSE exceptions */
+#define X86_CR4_VMXE       0x00002000 /* enable VMX virtualization */
+
+/* ------------------------------------------------------------------ */
+
+struct tss_32 {
+    /* hardware */
+    uint16_t back_link,__blh;
+    uint32_t esp0;
+    uint16_t ss0,__ss0h;
+    uint32_t esp1;
+    uint16_t ss1,__ss1h;
+    uint32_t esp2;
+    uint16_t ss2,__ss2h;
+    uint32_t __cr3;
+    uint32_t eip;
+    uint32_t eflags;
+    uint32_t eax,ecx,edx,ebx;
+    uint32_t esp;
+    uint32_t ebp;
+    uint32_t esi;
+    uint32_t edi;
+    uint16_t es, __esh;
+    uint16_t cs, __csh;
+    uint16_t ss, __ssh;
+    uint16_t ds, __dsh;
+    uint16_t fs, __fsh;
+    uint16_t gs, __gsh;
+    uint16_t ldt, __ldth;
+    uint16_t trace, io_bitmap_base;
+} __attribute__((packed));
+
+struct tss_64 {
+    uint32_t reserved1;
+    uint64_t rsp0;
+    uint64_t rsp1;
+    uint64_t rsp2;
+    uint64_t reserved2;
+    uint64_t ist[7];
+    uint32_t reserved3;
+    uint32_t reserved4;
+    uint16_t reserved5;
+    uint16_t io_bitmap_base;
+} __attribute__((packed));
+
+/* ------------------------------------------------------------------ */
+
+#define EFLAGS_TRAPMASK (~(X86_EFLAGS_VM | X86_EFLAGS_RF | X86_EFLAGS_NT | \
+                           X86_EFLAGS_TF))
+
+/* ------------------------------------------------------------------ */
+
+struct descriptor_32 {
+    uint32_t a,b;
+};
+
+struct idt_64 {
+    uint32_t a,b,c,d;
+};
+
+#define DESC32(base,limit,type,flags) {                                      \
+    .a = ((base & 0xffff) << 16) | (limit & 0xffff),                         \
+    .b = (base & 0xff000000) | ((base & 0xff0000) >> 16) |                   \
+         (limit & 0x000f0000) | ((type & 0xff) << 8) | ((flags & 0xf) << 20) \
+}
+
+#define GATE32(seg,addr,type) {                                              \
+    .a = ((seg & 0xffff) << 16) | (addr & 0xffff),                           \
+    .b = (addr & 0xffff0000) | ((type & 0xff) << 8)                          \
+}
+
+#define GATE64(seg,addr,type,ist) {                                          \
+    .a = ((seg & 0xffff) << 16) | (addr & 0xffff),                           \
+    .b = (addr & 0xffff0000) | ((type & 0xff) << 8) | (ist & 0x07),          \
+    .c = ((addr >> 32) & 0xffffffff),                                        \
+    .d = 0                                                                   \
+}
+
+static inline struct descriptor_32 mkdesc32(uint32_t base, uint32_t limit,
+                                            uint32_t type, uint32_t flags)
+{
+    struct descriptor_32 desc = DESC32(base, limit, type, flags);
+    return desc;
+}
+
+static inline struct descriptor_32 mkgate32(uint32_t seg, uint32_t addr,
+                                            uint32_t type)
+{
+    struct descriptor_32 desc = GATE32(seg, addr, type);
+    return desc;
+}
+
+static inline struct idt_64 mkgate64(uint32_t seg, uint64_t addr,
+                                     uint32_t type, uint32_t ist)
+{
+    struct idt_64 desc = GATE64(seg, addr, type, ist);
+    return desc;
+}
+
+#endif /* __PROCESSOR_H__ */
diff --git a/pc-bios/xenner/shared.h b/pc-bios/xenner/shared.h
new file mode 100644
index 0000000..07e4de0
--- /dev/null
+++ b/pc-bios/xenner/shared.h
@@ -0,0 +1,188 @@
+#undef PAGE_SIZE
+#undef PAGE_MASK
+
+#include <linux/kvm.h>
+#include <linux/kvm_para.h>
+
+#include "processor.h"
+
+#define GRANT_FRAMES_MAX (16)
+#define GRANT_ENTRIES (GRANT_FRAMES_MAX * PAGE_SIZE / sizeof(struct grant_entry_v1))
+
+#define VCPUS_MAX        (4)
+
+/* useful helper macros */
+#define GETNAME(a, i) ( ((i) < sizeof(a)/sizeof(a[0]) && a[i]) ? a[i] : "UNKNOWN")
+#define SETBIT(a, n)  ( (a)[(n)/(sizeof(a[0])*8)] |= (1<< ((n)%(sizeof(a[0])*8)) ))
+#define TESTBIT(a, n) ( (a)[(n)/(sizeof(a[0])*8)]  & (1<< ((n)%(sizeof(a[0])*8)) ))
+
+/* common flags */
+#define ALL_PGFLAGS    (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_DIRTY)
+
+/* emulator code+data */
+#define EMU_PGFLAGS    (ALL_PGFLAGS | _PAGE_GLOBAL | _PAGE_RW)
+
+/* machphys table */
+#define M2P_PGFLAGS_32 (ALL_PGFLAGS | _PAGE_GLOBAL | _PAGE_RW)
+#define M2P_PGFLAGS_64 (ALL_PGFLAGS | _PAGE_GLOBAL | _PAGE_RW | _PAGE_USER)
+
+/* linear page tables */
+#define LPT_PGFLAGS    (ALL_PGFLAGS | _PAGE_RW)
+
+/* pmd/pgt: pte pointer (map cache) */
+#define PGT_PGFLAGS_32 (ALL_PGFLAGS | _PAGE_GLOBAL | _PAGE_RW)
+#define PGT_PGFLAGS_64 (ALL_PGFLAGS | _PAGE_GLOBAL | _PAGE_RW | _PAGE_USER)
+
+/* misc xen defines */
+#define XEN_HCALL_MAX      64
+#define XEN_DEFAULT_PERIOD (10000000ll) /* 10 ms aka 100 Hz */
+
+/* misc xen addresses */
+#define XEN_IPT_32         0xfb800000
+#define XEN_M2P_32         0xfc000000
+#define XEN_LPT_32         0xfe000000
+#define XEN_MAP_32         0xfe800000
+#define XEN_TXT_32         0xff000000
+
+#define XEN_IPT_PAE        0xf4800000
+#define XEN_M2P_PAE        0xf5800000
+#define XEN_LPT_PAE        0xfd800000
+#define XEN_MAP_PAE        0xfe800000
+#define XEN_TXT_PAE        0xff000000
+
+#define XEN_M2P_64         0xffff800000000000  //  256 GB, pgd:256
+#define XEN_LPT_64         0xffff808000000000  //  512 GB, pgd:257
+#define XEN_MAP_64         0xffff820000000000  //  512 GB, pgd:260
+#define XEN_RAM_64         0xffff830000000000  //    1 TB, pgd:262,263
+#define XEN_DOM_64         0xffff880000000000  //  120 TB, pgd:272+
+
+#if defined(CONFIG_PAE) && defined(CONFIG_32BIT)
+#define XEN_IPT            XEN_IPT_PAE
+#define XEN_M2P            XEN_M2P_PAE
+#define XEN_LPT            XEN_LPT_PAE
+#define XEN_MAP            XEN_MAP_PAE
+#define XEN_TXT            XEN_TXT_PAE
+#elif defined(CONFIG_32BIT)
+#define XEN_IPT            XEN_IPT_32
+#define XEN_M2P            XEN_M2P_32
+#define XEN_LPT            XEN_LPT_32
+#define XEN_MAP            XEN_MAP_32
+#define XEN_TXT            XEN_TXT_32
+#elif defined(CONFIG_64BIT)
+#define XEN_M2P            XEN_M2P_64
+#define XEN_LPT            XEN_LPT_64
+#define XEN_MAP            XEN_MAP_64
+#define XEN_TXT            XEN_RAM_64
+#endif
+
+#define INVALID_M2P_ENTRY  (~0UL)
+
+/* ------------------------------------------------------------------ */
+/* statistics                                                         */
+
+#define XEN_FAULT_ILLEGAL_INSTRUCTION          0
+#define XEN_FAULT_GENERAL_PROTECTION          10
+#define XEN_FAULT_GENERAL_PROTECTION_GUEST    11
+#define XEN_FAULT_GENERAL_PROTECTION_EMUINS   12
+#define XEN_FAULT_PAGE_FAULT                  20
+#define XEN_FAULT_PAGE_FAULT_GUEST            21
+#define XEN_FAULT_PAGE_FAULT_FIX_RO           22
+#define XEN_FAULT_PAGE_FAULT_FIX_USER         23
+#define XEN_FAULT_PAGE_FAULT_FIX_EXTAB        24
+#define XEN_FAULT_UPDATE_VA_FIX_RO            25
+#define XEN_FAULT_UPDATE_VA_FIX_USER          26
+#define XEN_FAULT_SYSCALL                     30
+#define XEN_FAULT_INT_80                      31
+#define XEN_FAULT_EVENT_CALLBACK              32
+#define XEN_FAULT_LAZY_FPU                    33
+#define XEN_FAULT_BOUNCE_TRAP                 34
+#define XEN_FAULT_MAPS_MAPIT                  40
+#define XEN_FAULT_MAPS_REUSE                  41
+#define XEN_FAULT_OTHER_CR3_LOAD              50
+#define XEN_FAULT_OTHER_SWITCH_MODE           51
+#define XEN_FAULT_OTHER_CR3_CACHE_HIT         52
+#define XEN_FAULT_OTHER_FLUSH_TLB_ALL         53
+#define XEN_FAULT_OTHER_FLUSH_TLB_PAGE        54
+#define XEN_FAULT_OTHER_FLUSH_TLB_NONE        55
+
+#define XEN_FAULT_TMP_1                      240
+#define XEN_FAULT_TMP_2                      241
+#define XEN_FAULT_TMP_3                      242
+#define XEN_FAULT_TMP_4                      243
+#define XEN_FAULT_TMP_5                      244
+#define XEN_FAULT_TMP_6                      245
+#define XEN_FAULT_TMP_7                      246
+#define XEN_FAULT_TMP_8                      247
+
+#define XEN_FAULT_MAX                        256
+
+#define XEN_EVENT_MAX                         64
+#define XEN_ENAME_LEN                         20
+
+/* ------------------------------------------------------------------ */
+/* state info                                                         */
+
+#define XENNER_ABI_VERSION 40
+
+struct xenner_info {
+    /* state bits info */
+    uint32_t      abi_version;
+    uint32_t      dying:1;
+
+    uint64_t      vcpus_online;
+    uint64_t      vcpus_running;
+    uint64_t      vcpus;
+
+    /* statistics */
+    uint64_t      hcalls[XEN_HCALL_MAX];
+    uint64_t      faults[XEN_FAULT_MAX];
+    uint64_t      events[XEN_EVENT_MAX];
+    char          enames[XEN_EVENT_MAX * XEN_ENAME_LEN];
+};
+
+/* ------------------------------------------------------------------ */
+
+static inline uint32_t fix_sel32(uint32_t sel)
+{
+    /* fixup DPL: 0 -> 1 */
+    if (sel && 0 == (sel & 0x03)) {
+        sel |= 0x01;
+    }
+    return sel;
+}
+
+static inline uint32_t unfix_sel32(uint32_t sel)
+{
+    /* reverse DPL fixup: 1 -> 0 */
+    if (0x01 == (sel & 0x03)) {
+        sel &= ~0x03;
+    }
+    return sel;
+}
+
+static inline void fix_desc32(struct descriptor_32 *desc)
+{
+    if (desc->b & (1<<15)) {              /* present ?        */
+        if (0 == (desc->b & (3 << 13))) { /*   dpl == 0 ?     */
+            desc->b |= 1 << 13;           /*     fix: dpl = 1 */
+        }
+    }
+}
+
+static inline uint32_t fix_sel64(uint32_t sel)
+{
+    /* fixup DPL: 0 -> 3 */
+    if (sel && 0 == (sel & 0x03)) {
+        sel |= 0x03;
+    }
+    return sel;
+}
+
+static inline void fix_desc64(struct descriptor_32 *desc)
+{
+    if (desc->b & (1<<15)) {              /* present ?        */
+        if (0 == (desc->b & (3 << 13))) { /*   dpl == 0 ?     */
+            desc->b |= 3 << 13;           /*     fix: dpl = 3 */
+        }
+    }
+}
diff --git a/pc-bios/xenner/xenner-emudev.h b/pc-bios/xenner/xenner-emudev.h
new file mode 100644
index 0000000..adeef58
--- /dev/null
+++ b/pc-bios/xenner/xenner-emudev.h
@@ -0,0 +1,57 @@
+#include "../../hw/xenner_emudev.h"
+
+#ifndef __XENNER_EMUDEV_GUEST_H__
+#define __XENNER_EMUDEV_GUEST_H__ 1
+
+/* --------- guest side bits --------- */
+
+static inline void emudev_set(uint16_t type, uint16_t index, uint32_t value)
+{
+    uint32_t entry = (uint32_t)type << 16 | index;
+
+    asm volatile("outl %[data],%w[port]\n"
+                 : /* no output */
+                 : [data] "a" (entry), [port] "Nd" (EMUDEV_REG_CONF_ENTRY)
+                 : "memory");
+    asm volatile("outl %[data],%w[port]\n"
+                 : /* no output */
+                 : [data] "a" (value), [port] "Nd" (EMUDEV_REG_CONF_VALUE)
+                 : "memory");
+}
+
+static inline uint32_t emudev_get32(uint16_t type, uint16_t index)
+{
+    uint32_t entry = (uint32_t)type << 16 | index;
+    uint32_t value;
+
+    asm volatile("outl %[data],%w[port]\n"
+                 : /* no output */
+                 : [data] "a" (entry), [port] "Nd" (EMUDEV_REG_CONF_ENTRY)
+                 : "memory");
+    asm volatile("inl %w[port],%[data]\n"
+                 : [data] "=a" (value)
+                 : [port] "Nd" (EMUDEV_REG_CONF_VALUE)
+                 : "memory");
+    return value;
+}
+
+static inline uint64_t emudev_get(uint16_t type, uint16_t index)
+{
+    uint64_t r;
+    r = emudev_get32(type, index);
+    r |= ((uint64_t)emudev_get32(type | EMUDEV_CONF_HIGH_32, index) << 32);
+
+    return r;
+}
+
+static inline void emudev_cmd(uint16_t cmd, uint16_t arg)
+{
+    uint32_t command = (uint32_t)cmd << 16 | arg;
+
+    asm volatile("outl %[data],%w[port]\n"
+                 : /* no output */
+                 : [data] "a" (command), [port] "Nd" (EMUDEV_REG_COMMAND)
+                 : "memory");
+}
+
+#endif /* __XENNER_EMUDEV_GUEST_H__ */
diff --git a/pc-bios/xenner/xenner.h b/pc-bios/xenner/xenner.h
new file mode 100644
index 0000000..66f2e77
--- /dev/null
+++ b/pc-bios/xenner/xenner.h
@@ -0,0 +1,684 @@
+#include <stdarg.h>
+#include <stddef.h>
+#include <inttypes.h>
+#include <xen/xen.h>
+#include <xen/callback.h>
+#include <xen/grant_table.h>
+#include <xen/version.h>
+#include <xen/sched.h>
+#include <xen/memory.h>
+#include <xen/vcpu.h>
+#include <xen/physdev.h>
+
+#include "list.h"
+
+#include "shared.h"
+#include "xenner-emudev.h"
+#include "xen-names.h"
+
+/* attributes */
+#define asmlinkage __attribute__((regparm(0)))
+#define page_aligned __attribute__((aligned(4096))) __attribute__((__section__ (".pgdata")))
+
+/* fwd decl */
+struct xen_cpu;
+
+/* arch specific bits */
+#ifdef CONFIG_64BIT
+#include "xenner64.h"
+#else
+#include "xenner32.h"
+#endif
+
+#if defined(CONFIG_64BIT)
+#define CAP_VERSION_STRING "xen-3.0-x86_64"
+#elif defined(CONFIG_PAE)
+#define CAP_VERSION_STRING "xen-3.0-x86_32p"
+#else
+#define CAP_VERSION_STRING "xen-3.0-x86_32"
+#endif
+
+/* idt entry points */
+extern void division_by_zero(void);
+extern void debug_int1(void);
+extern void nmi(void);
+extern void debug_int3(void);
+extern void overflow(void);
+extern void bound_check(void);
+extern void illegal_instruction(void);
+extern void no_device(void);
+extern void double_fault(void);
+extern void coprocessor(void);
+extern void invalid_tss(void);
+extern void segment_not_present(void);
+extern void stack_fault(void);
+extern void general_protection(void);
+extern void page_fault(void);
+extern void floating_point(void);
+extern void alignment(void);
+extern void machine_check(void);
+extern void simd_floating_point(void);
+extern void smp_flush_tlb(void);
+extern void int_unknown(void);
+
+#ifdef CONFIG_64BIT
+/* 64bit only */
+extern void int_80(void);
+#else
+/* 32bit only */
+extern void xen_hypercall(void);
+#endif
+
+/* functions */
+extern uintptr_t  emu_pa(uintptr_t va);
+#define EMU_PA(_vaddr)       emu_pa((uintptr_t)_vaddr)
+#define EMU_MFN(_vaddr)      (EMU_PA((uintptr_t)_vaddr) >> PAGE_SHIFT)
+extern uint8_t    _vstart[];
+extern uint8_t    _vstop[];
+extern uintptr_t  _estart[];
+extern uintptr_t  _estop[];
+extern uint8_t    trampoline_syscall[];
+
+/* translate a boot-stack symbol into the matching address on a cpu's stack */
+#define STACK_PTR(_cpu,_sym) ((void*)((_cpu)->stack_low + ((_sym) - boot_stack_low)))
+#define IRQSTACK_PTR(_cpu,_sym) ((void*)((_cpu)->irqstack_low + ((_sym) - boot_stack_low)))
+extern uint8_t    boot_stack_low[];
+extern uint8_t    cpu_ptr[];
+#ifdef CONFIG_64BIT
+extern uint8_t    trampoline_start[];
+extern uint8_t    trampoline_patch[];
+extern uint8_t    trampoline_stop[];
+#endif
+extern uint8_t    boot_stack_high[];
+
+extern uint8_t    irq_entries[];
+extern uint8_t    irq_common[];
+
+extern uint8_t    sipi[];
+
+/* xenner-data.c */
+extern int grant_frames;
+extern struct grant_entry_v1 page_aligned grant_table[];
+extern struct xenner_info vminfo;
+extern int wrpt;
+extern unsigned long *m2p;
+
+struct vmconfig {
+    uint64_t      mfn_emu;
+    uint64_t      pg_emu;
+    uint64_t      mfn_m2p;
+    uint64_t      pg_m2p;
+    uint64_t      mfn_guest;
+    uint64_t      pg_guest;
+    uint64_t      pg_total;
+    int           debug_level;
+    int           nr_cpus;
+};
+extern struct vmconfig vmconf;
+
+struct xen_vcpu {
+    void                  *vcpu_page;
+    struct vcpu_info      *vcpu_info;
+    uint64_t              vcpu_info_pa;
+};
+
+struct xen_cpu {
+    /* used by hardware */
+#ifdef CONFIG_64BIT
+    struct tss_64         tss;
+#else
+    struct tss_32         tss;
+#endif
+    void                  *lapic;
+
+    /* used by kvm */
+    struct kvm_cr3_cache  *cr3_cache;
+    uint8_t               mmu_queue[128];
+    int                   mmu_queue_len;
+
+    /* emu state */
+    struct xen_vcpu       v;
+    struct descriptor_32  *gdt;
+    uint64_t              gdt_mfns[16];
+    uint32_t              virq_to_vector[NR_VIRQS];
+    uint8_t               *stack_low;
+    uint8_t               *stack_high;
+#ifdef CONFIG_64BIT
+    uint8_t               *irqstack_low;
+    uint8_t               *irqstack_high;
+#endif
+    ureg_t                kernel_ss;
+    ureg_t                kernel_sp;
+
+    /* timer */
+    uint64_t              periodic;
+    uint64_t              oneshot;
+    int                   timerport;
+
+    /* I/O */
+    int                   iopl;
+    int                   nr_ports;
+
+    /* initial state */
+    struct vcpu_guest_context *init_ctxt;
+
+#ifdef CONFIG_64BIT
+    uint64_t              kernel_cr3_mfn;
+    uint64_t              user_cr3_mfn;
+    int                   user_mode;
+#else
+    ureg_t                cr3_mfn;
+#endif
+
+    int                   online;
+    int                   id;
+    struct list_head      next;
+};
+extern struct list_head cpus;
+extern ureg_t cpumask_all;
+
+extern struct vcpu_guest_context boot_ctxt;
+extern struct shared_info page_aligned shared_info;
+extern xen_callback_t   xencb[8];
+extern struct trap_info xentr[256];
+
+extern uint64_t emu_hcalls[XEN_HCALL_MAX];
+extern uint64_t emu_faults[XEN_FAULT_MAX];
+
+struct trapinfo {
+    char *name;
+    int  ec;      /* has error code  */
+    int  lvl;     /* debug log level */
+};
+extern const struct trapinfo trapinfo[32];
+extern const char *cr0_bits[32];
+extern const char *cr4_bits[32];
+extern const char *pg_bits[32];
+extern const char *rflags_bits[32];
+
+/* xenner-main.c */
+void gdt_init(struct xen_cpu *cpu);
+void gdt_load(struct xen_cpu *cpu);
+void tss_init(struct xen_cpu *cpu);
+void msrs_init(struct xen_cpu *cpu);
+void idt_init(void);
+void idt_load(void);
+void guest_cpu_init(struct xen_cpu *cpu);
+void guest_regs_init(struct xen_cpu *cpu, struct regs *regs);
+struct xen_cpu *cpu_find(int id);
+void print_registers(int level, struct regs *regs);
+void print_stack(int level, ureg_t rsp);
+void print_state(struct regs *regs);
+
+int panic(const char *message, struct regs *regs);
+int bounce_trap(struct xen_cpu *cpu, struct regs *regs, int trapno, int cbno);
+void flush_tlb_remote(struct xen_cpu *cpu, ureg_t mask, ureg_t addr);
+
+asmlinkage void do_boot(struct regs *regs);
+asmlinkage void do_boot_secondary(ureg_t id, struct regs *regs);
+asmlinkage void do_illegal_instruction(struct regs *regs);
+asmlinkage void do_general_protection(struct regs *regs);
+asmlinkage void do_page_fault(struct regs *regs);
+asmlinkage void do_double_fault(struct regs *regs);
+asmlinkage void do_event_callback(struct regs *regs);
+asmlinkage void do_guest_forward(struct regs *regs);
+asmlinkage void do_int1(struct regs *regs);
+asmlinkage void do_int3(struct regs *regs);
+asmlinkage void do_lazy_fpu(struct regs *regs);
+asmlinkage void do_smp_flush_tlb(struct regs *regs);
+
+/* xenner-mm.c */
+void paging_init(struct xen_cpu *cpu);
+void paging_start(struct xen_cpu *cpu);
+void update_emu_mappings(ureg_t cr3_mfn);
+void *get_pages(int pages, const char *purpose);
+void *get_memory(int bytes, const char *purpose);
+void switch_heap(int heap_type);
+
+#define HEAP_EMU       0
+#define HEAP_HIGH      1
+
+unsigned long heap_size(void);
+void map_region(struct xen_cpu *cpu, uint64_t va, uint32_t flags,
+                uint64_t maddr, uint64_t count);
+
+/* xenner-hcall.c */
+#define HCALL_HANDLED  0
+#define HCALL_FORWARD -1
+#define HCALL_IRET    -2
+void guest_gdt_copy_page(struct descriptor_32 *src, struct descriptor_32 *dst);
+int guest_gdt_init(struct xen_cpu *cpu, uint32_t entries, ureg_t *mfns);
+
+sreg_t error_noop(struct xen_cpu *cpu, ureg_t *args);
+sreg_t error_noperm(struct xen_cpu *cpu, ureg_t *args);
+sreg_t stack_switch(struct xen_cpu *cpu, ureg_t *args);
+sreg_t console_io(struct xen_cpu *cpu, ureg_t *args);
+sreg_t update_descriptor(struct xen_cpu *cpu, ureg_t *args);
+sreg_t fpu_taskswitch(struct xen_cpu *cpu, ureg_t *args);
+sreg_t grant_table_op(struct xen_cpu *cpu, ureg_t *args);
+sreg_t xen_version(struct xen_cpu *cpu, ureg_t *args);
+sreg_t vm_assist(struct xen_cpu *cpu, ureg_t *args);
+sreg_t sched_op(struct xen_cpu *cpu, ureg_t *args);
+sreg_t sched_op_compat(struct xen_cpu *cpu, ureg_t *args);
+sreg_t memory_op(struct xen_cpu *cpu, ureg_t *args);
+sreg_t set_trap_table(struct xen_cpu *cpu, ureg_t *args);
+sreg_t set_callbacks(struct xen_cpu *cpu, ureg_t *args);
+sreg_t callback_op(struct xen_cpu *cpu, ureg_t *args);
+sreg_t set_gdt(struct xen_cpu *cpu, ureg_t *args);
+sreg_t vcpu_op(struct xen_cpu *cpu, ureg_t *args);
+sreg_t set_timer_op(struct xen_cpu *cpu, ureg_t *args);
+sreg_t event_channel_op(struct xen_cpu *cpu, ureg_t *args);
+sreg_t event_channel_op_compat(struct xen_cpu *cpu, ureg_t *args);
+sreg_t mmuext_op(struct xen_cpu *cpu, ureg_t *args);
+sreg_t physdev_op(struct xen_cpu *cpu, ureg_t *args);
+sreg_t get_debugreg(struct xen_cpu *cpu, ureg_t *args);
+sreg_t set_debugreg(struct xen_cpu *cpu, ureg_t *args);
+
+/* xenner-pv.c */
+int pv_have_clock;
+
+void pv_clock_update(int wakeup);
+void pv_clock_sys(struct xen_cpu *cpu);
+void pv_write_cr3(struct xen_cpu *cpu, ureg_t cr3_mfn);
+void pv_init(struct xen_cpu *cpu);
+
+/* xenner-instr.c */
+void real_cpuid(struct kvm_cpuid_entry *entry);
+void print_bits(int level, const char *msg, uint32_t old, uint32_t new,
+                const char *names[]);
+void print_emu_instr(int level, const char *prefix, uint8_t *instr);
+int emulate(struct xen_cpu *cpu, struct regs *regs);
+
+/* xenner-lapic.c */
+#define VECTOR_FLUSH_TLB    0x20
+#define VECTOR_EVTCHN_START 0x21
+void lapic_eoi(struct xen_cpu *cpu);
+void lapic_timer(struct xen_cpu *cpu);
+void lapic_ipi_boot(struct xen_cpu *cpu, struct xen_cpu *ap);
+void lapic_ipi_flush_tlb(struct xen_cpu *cpu);
+int evtchn_route_interdomain(struct xen_cpu *cpu, int port, char *desc);
+int evtchn_route_virq(struct xen_cpu *cpu, int virq, int port, char *desc);
+int evtchn_route_ipi(struct xen_cpu *cpu, int port);
+int evtchn_send(struct xen_cpu *cpu, int port);
+void evtchn_unmask(struct xen_cpu *cpu, int port);
+void evtchn_close(struct xen_cpu *cpu, int port);
+int evtchn_alloc(int vcpu_id);
+int evtchn_pending(struct xen_cpu *cpu);
+void evtchn_try_forward(struct xen_cpu *cpu, struct regs *regs);
+int irq_init(struct xen_cpu *cpu);
+asmlinkage void do_irq(struct regs *regs);
+
+/* xenner*.S */
+extern pte_t emu_pgd[];
+
+/* printk.c */
+int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
+int snprintf(char * buf, size_t size, const char *fmt, ...);
+int printk(int level, const char *fmt, ...) __attribute__((format(printf, 2, 3)));
+void write_string(char *msg);
+
+/* inline asm bits */
+static inline ureg_t read_cr0(void)
+{
+    ureg_t val;
+    asm volatile("mov %%cr0,%0"
+                 : "=r" (val));
+    return val;
+}
+
+static inline ureg_t read_cr2(void)
+{
+    ureg_t val;
+    asm volatile("mov %%cr2,%0"
+                 : "=r" (val));
+    return val;
+}
+
+static inline ureg_t read_cr3_mfn(struct xen_cpu *cpu)
+{
+#ifdef CONFIG_64BIT
+    return cpu->user_mode ? cpu->user_cr3_mfn : cpu->kernel_cr3_mfn;
+#else
+    return cpu->cr3_mfn;
+#endif
+}
+
+static inline ureg_t read_cr4(void)
+{
+    ureg_t val;
+    asm volatile("mov %%cr4,%0"
+                 : "=r" (val));
+    return val;
+}
+
+static inline void write_cr0(ureg_t val)
+{
+    asm volatile("mov %0, %%cr0"
+                 : /* no output */
+                 : "r" (val)
+                 : "memory" );
+}
+
+static inline void write_cr3(ureg_t val)
+{
+    asm volatile("mov %0, %%cr3"
+                 : /* no output */
+                 : "r" (val)
+                 : "memory");
+}
+
+static inline void write_cr4(ureg_t val)
+{
+    asm volatile("mov %0, %%cr4"
+                 : /* no output */
+                 : "r" (val)
+                 : "memory");
+}
+
+static inline void flush_tlb(void)
+{
+    ureg_t tmpreg;
+
+    asm volatile("mov %%cr3, %0;              \n"
+                 "mov %0, %%cr3;  # flush TLB \n"
+                 : "=r" (tmpreg)
+                 : /* no input */
+                 : "memory");
+}
+
+static inline void flush_tlb_addr(ureg_t va)
+{
+    asm volatile("invlpg (%0)"
+                 : /* no output */
+                 : "r" (va)
+                 : "memory");
+}
+
+static inline void outb(uint8_t value, uint16_t port)
+{
+    asm volatile("outb %[value],%w[port]"
+                 : /* no output */
+                 : [value] "a" (value), [port] "Nd" (port)
+                 : "memory");
+}
+
+static inline void rdmsr(uint32_t msr, uint32_t *ax, uint32_t *dx)
+{
+    asm volatile("rdmsr"
+                 : "=a" (*ax), "=d" (*dx)
+                 : "c" (msr)
+                 : "memory");
+}
+
+static inline void wrmsr(uint32_t msr, uint32_t ax, uint32_t dx)
+{
+    asm volatile("wrmsr"
+                 : /* no outputs */
+                 : "c" (msr), "a" (ax), "d" (dx)
+                 : "memory");
+}
+
+static inline void wrmsrl(uint32_t msr, uint64_t val)
+{
+    uint32_t ax = (uint32_t)val;
+    uint32_t dx = (uint32_t)(val >> 32);
+    wrmsr(msr, ax, dx);
+}
+
+static inline int wrmsrl_safe(uint32_t msr, uint64_t val)
+{
+    uint32_t ax = (uint32_t)val;
+    uint32_t dx = (uint32_t)(val >> 32);
+    return wrmsr_safe(msr, ax, dx);
+}
+
+static inline void lldt(uint16_t sel)
+{
+    asm volatile("lldt %0"
+                 : /* no outputs */
+                 : "a" (sel)
+                 : "memory");
+}
+
+static inline void ltr(uint16_t sel)
+{
+    asm volatile("ltr %0"
+                 : /* no outputs */
+                 : "a" (sel)
+                 : "memory");
+}
+
+static inline void halt_i(int cpu_id)
+{
+    vminfo.vcpus_running &= ~(1 << cpu_id);
+    asm volatile("sti\n"
+                 "hlt\n"
+                 "cli\n"
+                 : : : "memory");
+    vminfo.vcpus_running |= (1 << cpu_id);
+}
+
+static inline void clts(void)
+{
+    asm volatile("clts" : : : "memory");
+}
+
+static inline void pause(void)
+{
+    asm volatile("pause" : : : "memory");
+}
+
+static inline void sti(void)
+{
+    asm volatile("sti" : : : "memory");
+}
+
+static inline void cli(void)
+{
+    asm volatile("cli" : : : "memory");
+}
+
+static inline void int3(void)
+{
+    asm volatile("int3" : : : "memory");
+}
+
+static inline void set_eflag(ureg_t flag)
+{
+    ureg_t reg = 0;
+
+    asm volatile("pushf\n"
+                 "pop    %[reg]\n"
+                 "or     %[tf], %[reg]\n"
+                 "push   %[reg]\n"
+                 "popf\n"
+                 "nop\n"
+                 : [reg] "+r" (reg)
+                 : [tf]  "r" (flag)
+                 : "memory");
+}
+
+static inline uint64_t rdtsc(void)
+{
+  unsigned long low, high;
+
+  asm volatile("rdtsc" : "=a" (low) , "=d" (high));
+
+  return ((uint64_t)high << 32) | low;
+}
+
+/*
+ * We have 4k stacks (one page).
+ *  - there is a pointer to the per-cpu data at the bottom.
+ *  - (64bit also has the sysenter trampoline there).
+ */
+static inline struct xen_cpu *get_cpu(void)
+{
+    uintptr_t rsp;
+
+#ifdef CONFIG_64BIT
+    asm volatile("mov %%rsp, %[rsp]" : [rsp] "=a" (rsp) : /* no input */);
+#else
+    asm volatile("mov %%esp, %[esp]" : [esp] "=a" (rsp) : /* no input */);
+#endif
+    rsp &= PAGE_MASK;
+    return *((void**)rsp);
+}
+
+/* gcc builtins */
+void *memset(void *s, int c, size_t n);
+void *memcpy(void *dest, const void *src, size_t n);
+int memcmp(const void *s1, const void *s2, size_t n);
+
+/* guest virtual irq flag */
+#define guest_cli(_cpu)                do {                                \
+                (_cpu)->v.vcpu_info->evtchn_upcall_mask = 1;        \
+        } while (0)
+#define guest_sti(_cpu)                do {                                \
+                (_cpu)->v.vcpu_info->evtchn_upcall_mask = 0;        \
+        } while (0)
+#define guest_irq_flag(_cpu)        (!((_cpu)->v.vcpu_info->evtchn_upcall_mask))
+
+
+/******************************************************************
+ *                      atomic operations                         *
+ ******************************************************************/
+
+typedef struct {
+        int counter;
+} atomic_t;
+
+#define atomic_read(v)                ((v)->counter)
+#define atomic_set(v, i)        (((v)->counter) = (i))
+
+static inline void atomic_add(int i, atomic_t *v)
+{
+    asm volatile("lock; addl %1,%0"
+                 : "+m" (v->counter)
+                 : "ir" (i));
+}
+
+static inline void atomic_sub(int i, atomic_t *v)
+{
+    asm volatile("lock; subl %1,%0"
+                 : "+m" (v->counter)
+                 : "ir" (i));
+}
+
+static inline void atomic_inc(atomic_t *v)
+{
+    asm volatile("lock; incl %0"
+                 : "+m" (v->counter));
+}
+
+static inline void atomic_dec(atomic_t *v)
+{
+    asm volatile("lock; decl %0"
+                 : "+m" (v->counter));
+}
+
+/******************************************************************
+ *                      bitops operations                         *
+ ******************************************************************/
+
+/* from linux/asm-x86/bitops.h */
+
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 1)
+/* Technically wrong, but this avoids compilation errors on some gcc
+   versions. */
+#define ADDR "=m" (*(volatile long *) addr)
+#else
+#define ADDR "+m" (*(volatile long *) addr)
+#endif
+
+static inline void set_bit(int nr, volatile void *addr)
+{
+    asm volatile("lock bts %1,%0"
+                 : ADDR
+                 : "Ir" (nr) : "memory");
+}
+
+static inline void clear_bit(int nr, volatile void *addr)
+{
+    asm volatile("lock btr %1,%0"
+                 : ADDR
+                 : "Ir" (nr));
+}
+
+static inline int test_and_set_bit(int nr, volatile void *addr)
+{
+    int oldbit;
+
+    asm volatile("lock bts %2,%1\n\t"
+                 "sbb %0,%0"
+                 : "=r" (oldbit), ADDR
+                 : "Ir" (nr) : "memory");
+
+    return oldbit;
+}
+
+static inline int test_and_clear_bit(int nr, volatile void *addr)
+{
+    int oldbit;
+
+    asm volatile("lock btr %2,%1\n\t"
+                 "sbb %0,%0"
+                 : "=r" (oldbit), ADDR
+                 : "Ir" (nr) : "memory");
+
+    return oldbit;
+}
+
+static inline int test_bit(int nr, volatile const void *addr)
+{
+    int oldbit;
+
+    asm volatile("bt %2,%1\n\t"
+                 "sbb %0,%0"
+                 : "=r" (oldbit)
+                 : "m" (*(unsigned long *)addr), "Ir" (nr));
+
+    return oldbit;
+}
+
+/******************************************************************
+ *                     spinlock operations                        *
+ ******************************************************************/
+
+#define barrier()               asm volatile("": : :"memory")
+
+typedef struct {
+    volatile unsigned int slock;
+} spinlock_t;
+
+#define SPIN_LOCK_UNLOCKED      (spinlock_t) { .slock = 1 }
+#define spin_lock_init(x)       do { (x)->slock = 1; } while(0)
+
+#define spin_is_locked(x)       (*(volatile signed char *)(&(x)->slock) <= 0)
+#define spin_unlock_wait(x)     do { barrier(); } while(spin_is_locked(x))
+
+static inline void spin_lock(spinlock_t *lock)
+{
+    asm volatile(
+        "\n"
+        "1:	lock ; decb %0\n"
+        "	jns 3f\n"
+        "2:	pause\n"
+        "	cmpb $0,%0\n"
+        "	jle 2b\n"
+        "	jmp 1b\n"
+        "3:\n"
+        :"=m" (lock->slock) : : "memory");
+}
+
+static inline void spin_unlock(spinlock_t *lock)
+{
+    char oldval = 1;
+
+    asm volatile(
+        "xchgb %b0, %1"
+        :"=q" (oldval), "=m" (lock->slock)
+        :"0" (oldval)
+        : "memory"
+    );
+}
-- 
1.6.0.2

* [Qemu-devel] [PATCH 14/40] xenner: kernel: Instruction emulator
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (12 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 13/40] xenner: kernel: Headers Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:41   ` malc
  2010-11-01 18:46   ` [Qemu-devel] " Paolo Bonzini
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 15/40] xenner: kernel: lapic code Alexander Graf
                   ` (25 subsequent siblings)
  39 siblings, 2 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Some guest instructions have to be emulated by the xenner kernel. This patch
adds the instruction emulator, including a CPUID filter that masks selected
feature bits.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 pc-bios/xenner/xenner-instr.c |  405 +++++++++++++++++++++++++++++++++++++++++
 1 files changed, 405 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-instr.c

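A minimal sketch, not part of the patch, of the feature-bit arithmetic that
filter_cpuid() depends on. It assumes cpufeature.h follows the Linux
convention of encoding a feature as word * 32 + bit, so the position inside a
32-bit CPUID output register is the feature number modulo 32. The EX_* names
and the ex_* helpers are hypothetical stand-ins for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; assumed encoding: word * 32 + bit. */
#define EX_FEATURE_SEP (0 * 32 + 11)  /* CPUID.1:EDX bit 11 */
#define EX_FEATURE_LM  (1 * 32 + 29)  /* CPUID.80000001h:EDX bit 29 */

/* Clear one feature bit in a 32-bit CPUID output register. */
static inline uint32_t ex_clear_feature(unsigned int feature, uint32_t reg)
{
    unsigned int bit = feature % 32;      /* position inside the register */
    return reg & ~(UINT32_C(1) << bit);   /* unsigned 32-bit shift, no overflow */
}

/* Start from all-ones and mask two features, as a filter would. */
static inline void ex_selfcheck(void)
{
    uint32_t edx = UINT32_C(0xffffffff);

    edx = ex_clear_feature(EX_FEATURE_SEP, edx);
    edx = ex_clear_feature(EX_FEATURE_LM, edx);
    assert(edx == UINT32_C(0xdffff7ff));
}
```

Reducing the feature number modulo the register width keeps the shift count
below 32 for every word, which matters for features that live in word 1 or
higher.
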
diff --git a/pc-bios/xenner/xenner-instr.c b/pc-bios/xenner/xenner-instr.c
new file mode 100644
index 0000000..11be2ce
--- /dev/null
+++ b/pc-bios/xenner/xenner-instr.c
@@ -0,0 +1,405 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner instruction emulator
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "xenner.h"
+#include "msr-index.h"
+#include "cpufeature.h"
+
+void real_cpuid(struct kvm_cpuid_entry *entry)
+{
+    asm volatile("cpuid"
+                 : "=a" (entry->eax),
+                   "=b" (entry->ebx),
+                   "=c" (entry->ecx),
+                   "=d" (entry->edx)
+                 : "a" (entry->function));
+}
+
+static unsigned long clear_cpuid_bit(unsigned long bit, unsigned long x)
+{
+    unsigned long r = x;
+
+    bit %= 32;
+    r = x & ~(1UL << bit);
+
+    return r;
+}
+
+static void filter_cpuid(struct kvm_cpuid_entry *entry)
+{
+    switch (entry->function) {
+    case 0x00000001:
+        entry->edx = clear_cpuid_bit(X86_FEATURE_SEP, entry->edx);
+        entry->edx = clear_cpuid_bit(X86_FEATURE_DS, entry->edx);
+        entry->edx = clear_cpuid_bit(X86_FEATURE_DS, entry->edx);
+        entry->edx = clear_cpuid_bit(X86_FEATURE_ACC, entry->edx);
+        entry->edx = clear_cpuid_bit(X86_FEATURE_PBE, entry->edx);
+
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_DTES64, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_MWAIT, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_DSCPL, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_VMXE, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_SMXE, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_EST, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_TM2, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_XTPR, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_PDCM, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_DCA, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_XSAVE, entry->ecx);
+        /* fall through */
+    case 0x80000001:
+        entry->edx = clear_cpuid_bit(X86_FEATURE_VME, entry->edx);
+        entry->edx = clear_cpuid_bit(X86_FEATURE_PSE, entry->edx);
+        entry->edx = clear_cpuid_bit(X86_FEATURE_PGE, entry->edx);
+        entry->edx = clear_cpuid_bit(X86_FEATURE_MCE, entry->edx);
+        entry->edx = clear_cpuid_bit(X86_FEATURE_MCA, entry->edx);
+        entry->edx = clear_cpuid_bit(X86_FEATURE_MTRR, entry->edx);
+        entry->edx = clear_cpuid_bit(X86_FEATURE_PSE36, entry->edx);
+
+#ifdef CONFIG_32BIT
+        entry->edx = clear_cpuid_bit(X86_FEATURE_LM, entry->edx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_LAHF_LM, entry->ecx);
+#endif
+        entry->edx = clear_cpuid_bit(X86_FEATURE_PAGE1GB, entry->edx);
+        entry->edx = clear_cpuid_bit(X86_FEATURE_RDTSCP, entry->edx);
+
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_SVME, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_OSVW, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_IBS, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_SKINIT, entry->ecx);
+        entry->ecx = clear_cpuid_bit(X86_FEATURE_WDT, entry->ecx);
+        break;
+
+    case 0x00000005: /* MONITOR/MWAIT */
+    case 0x0000000a: /* Architectural Performance Monitor Features */
+    case 0x8000000a: /* SVM revision and features */
+    case 0x8000001b: /* Instruction Based Sampling */
+        entry->eax = 0;
+        entry->ebx = 0;
+        entry->ecx = 0;
+        entry->edx = 0;
+        break;
+    }
+}
+
+static void emulate_cpuid(struct regs *regs)
+{
+    struct kvm_cpuid_entry entry;
+
+    entry.function = regs->rax;
+    real_cpuid(&entry);
+    filter_cpuid(&entry);
+    regs->rax = entry.eax;
+    regs->rbx = entry.ebx;
+    regs->rcx = entry.ecx;
+    regs->rdx = entry.edx;
+    printk(2, "cpuid 0x%08x: eax 0x%08x ebx 0x%08x ecx 0x%08x edx 0x%08x\n",
+           entry.function, entry.eax, entry.ebx, entry.ecx, entry.edx);
+}
+
+static void emulate_rdmsr(struct regs *regs)
+{
+    uint32_t ax,dx;
+    switch (regs->rcx) {
+    case MSR_EFER:
+    case MSR_FS_BASE:
+    case MSR_GS_BASE:
+    case MSR_KERNEL_GS_BASE:
+        /* whitelisted: pass through to the real MSR */
+        rdmsr(regs->rcx, &ax, &dx);
+        regs->rax = ax;
+        regs->rdx = dx;
+        break;
+    default:
+        printk(1, "%s: ignore: rcx 0x%" PRIxREG "\n", __FUNCTION__, regs->rcx);
+        regs->rax = 0;
+        regs->rdx = 0;
+        break;
+    }
+}
+
+static void emulate_wrmsr(struct regs *regs)
+{
+    static const uint64_t known = (EFER_NX|EFER_LMA|EFER_LME|EFER_SCE);
+    static const uint64_t fixed = (EFER_LMA|EFER_LME|EFER_SCE);
+    uint32_t ax,dx;
+
+    switch (regs->rcx) {
+    case MSR_EFER:
+        if (regs->rax & ~known) {
+            printk(1, "%s: efer: unknown bit set\n", __FUNCTION__);
+            goto out;
+        }
+
+        rdmsr(regs->rcx, &ax, &dx);
+        if ((regs->rax & fixed) != (ax & fixed)) {
+            printk(1, "%s: efer: modify fixed bit\n", __FUNCTION__);
+            goto out;
+        }
+
+        printk(1, "%s: efer:%s%s%s%s\n", __FUNCTION__,
+               regs->rax & EFER_SCE ? " sce" : "",
+               regs->rax & EFER_LME ? " lme" : "",
+               regs->rax & EFER_LMA ? " lma" : "",
+               regs->rax & EFER_NX  ? " nx"  : "");
+        /* fall through */
+    case MSR_FS_BASE:
+    case MSR_GS_BASE:
+    case MSR_KERNEL_GS_BASE:
+        wrmsr(regs->rcx, regs->rax, regs->rdx);
+        return;
+    }
+
+out:
+    printk(1, "%s: ignore: 0x%" PRIxREG " 0x%" PRIxREG ":0x%" PRIxREG "\n",
+           __FUNCTION__, regs->rcx, regs->rdx, regs->rax);
+}
+
+void print_emu_instr(int level, const char *prefix, uint8_t *instr)
+{
+    printk(level, "%s: rip %p bytes %02x %02x %02x %02x  %02x %02x %02x %02x\n",
+           prefix, instr,
+           instr[0], instr[1], instr[2], instr[3],
+           instr[4], instr[5], instr[6], instr[7]);
+}
+
+static ureg_t *decode_reg(struct regs *regs, uint8_t modrm, int rm)
+{
+    int shift = rm ? 0 : 3;
+    ureg_t *reg = NULL;
+
+    switch ((modrm >> shift) & 0x07) {
+    case 0: reg = (ureg_t*)&regs->rax; break;
+    case 1: reg = (ureg_t*)&regs->rcx; break;
+    case 2: reg = (ureg_t*)&regs->rdx; break;
+    case 3: reg = (ureg_t*)&regs->rbx; break;
+    case 4: reg = (ureg_t*)&regs->rsp; break;
+    case 5: reg = (ureg_t*)&regs->rbp; break;
+    case 6: reg = (ureg_t*)&regs->rsi; break;
+    case 7: reg = (ureg_t*)&regs->rdi; break;
+    }
+    return reg;
+}
+
+void print_bits(int level, const char *msg, uint32_t old, uint32_t new,
+                const char *names[])
+{
+    char buf[128];
+    int pos = 0;
+    uint32_t mask;
+    char *mod;
+    int i;
+
+    pos += snprintf(buf+pos, sizeof(buf)-pos, "%s:", msg);
+    for (i = 0; i < 32; i++) {
+        mask = 1U << i;
+        if (new&mask) {
+            if (old&mask) {
+                /* bit present */
+                mod = "";
+            } else {
+                /* bit added */
+                mod = "+";
+            }
+        } else {
+            if (old&mask) {
+                /* bit removed */
+                mod = "-";
+            } else {
+                /* bit not present */
+                continue;
+            }
+        }
+        pos += snprintf(buf+pos, sizeof(buf)-pos, " %s%s",
+                        mod, names[i] ? names[i] : "???");
+    }
+    pos += snprintf(buf+pos, sizeof(buf)-pos, "\n");
+    printk(level, "%s", buf);
+}
+
+int emulate(struct xen_cpu *cpu, struct regs *regs)
+{
+    static const uint8_t xen_emu_prefix[5] = {0x0f, 0x0b, 'x','e','n'};
+    uint8_t *instr;
+    int skip = 0;
+    int in = 0;
+    int shift = 0;
+    int port = 0;
+
+restart:
+    instr = (void*)regs->rip;
+
+    /* prefixes */
+    if (instr[skip] == 0x66) {
+        shift = 16;
+        skip++;
+    }
+
+    /* instructions */
+    switch (instr[skip]) {
+    case 0x0f:
+        switch (instr[skip+1]) {
+        case 0x06:
+            /* clts */
+            clts();
+            skip += 2;
+            break;
+        case 0x09:
+            /* wbinvd */
+            __asm__("wbinvd" ::: "memory");
+            skip += 2;
+            break;
+        case 0x0b:
+            /* ud2a */
+            if (xen_emu_prefix[2] == instr[skip+2] &&
+                xen_emu_prefix[3] == instr[skip+3] &&
+                xen_emu_prefix[4] == instr[skip+4]) {
+                printk(2, "%s: xen emu prefix\n", __FUNCTION__);
+                regs->rip += 5;
+                goto restart;
+            }
+            printk(1, "%s: ud2a -- linux kernel BUG()?\n", __FUNCTION__);
+            /* bounce to guest, hoping it prints more info */
+            return 0;
+        case 0x20:
+        {
+            /* read control registers */
+            ureg_t *reg = decode_reg(regs, instr[skip+2], 1);
+            switch (((instr[skip+2]) >> 3) & 0x07) {
+            case 0:
+                *reg = read_cr0();
+                skip += 3;
+                break;
+            case 3:
+                *reg = frame_to_addr(read_cr3_mfn(cpu));
+                skip += 3;
+                break;
+            case 4:
+                *reg = read_cr4();
+                skip += 3;
+                break;
+            }
+            break;
+        }
+        case 0x22:
+        {
+            /* write control registers */
+            static const ureg_t cr0_fixed = ~(X86_CR0_TS);
+            static const ureg_t cr4_fixed = X86_CR4_TSD;
+            ureg_t *reg = decode_reg(regs, instr[skip+2], 1);
+            ureg_t cr;
+            switch (((instr[skip+2]) >> 3) & 0x07) {
+            case 0:
+                cr = read_cr0();
+                if (cr != *reg) {
+                    if ((cr & cr0_fixed) == (*reg & cr0_fixed)) {
+                        print_bits(2, "apply cr0 update", cr, *reg, cr0_bits);
+                        write_cr0(*reg);
+                    } else {
+                        print_bits(1, "IGNORE cr0 update", cr, *reg, cr0_bits);
+                    }
+                }
+                skip += 3;
+                break;
+            case 4:
+                cr = read_cr4();
+                if (cr != *reg) {
+                    if ((cr & cr4_fixed) == (*reg & cr4_fixed)) {
+                        print_bits(1, "apply cr4 update", cr, *reg, cr4_bits);
+                        write_cr4(*reg);
+                    } else {
+                        print_bits(1, "IGNORE cr4 update", cr, *reg, cr4_bits);
+                    }
+                }
+                skip += 3;
+                break;
+            }
+            break;
+        }
+        case 0x30:
+            /* wrmsr */
+            emulate_wrmsr(regs);
+            skip += 2;
+            break;
+        case 0x32:
+            /* rdmsr */
+            emulate_rdmsr(regs);
+            skip += 2;
+            break;
+        case 0xa2:
+            /* cpuid */
+            emulate_cpuid(regs);
+            skip += 2;
+            break;
+        }
+        break;
+
+    case 0xe4: /* in     <next byte>,%al */
+    case 0xe5:
+        in = (instr[skip] & 1) ? 2 : 1;
+        port = instr[skip+1];
+        skip += 2;
+        break;
+    case 0xec: /* in     (%dx),%al */
+    case 0xed:
+        in = (instr[skip] & 1) ? 2 : 1;
+        port = regs->rdx & 0xffff;
+        skip += 1;
+        break;
+    case 0xe6: /* out    %al,<next byte> */
+    case 0xe7:
+        port = instr[skip+1];
+        skip += 2;
+        break;
+    case 0xee: /* out    %al,(%dx) */
+    case 0xef:
+        port = regs->rdx & 0xffff;
+        skip += 1;
+        break;
+
+    case 0xfa:
+        /* cli */
+        guest_cli(cpu);
+        skip += 1;
+        break;
+    case 0xfb:
+        /* sti */
+        guest_sti(cpu);
+        skip += 1;
+        break;
+    }
+
+    /* unknown instruction */
+    if (!skip) {
+        print_emu_instr(0, "instr emu failed", instr);
+        return -1;
+    }
+
+    /* I/O instruction */
+    if (in == 2) {
+        regs->rax |= 0xffffffff;
+    } else if (in == 1) {
+        regs->rax |= (0xffff << shift);
+    }
+
+    return skip;
+}
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [Qemu-devel] [PATCH 15/40] xenner: kernel: lapic code
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (13 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 14/40] xenner: kernel: Instruction emulator Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 16/40] xenner: kernel: Main (i386) Alexander Graf
                   ` (24 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Xenner uses the local APIC (lapic) for interrupt handling and timekeeping.
This patch adds the lapic, IO-APIC and event channel routing code.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
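
Note on the timer programming in lapic_timer() below: the interval in
nanoseconds is written to the initial count register, and a divider of
128 is used when the value does not fit in 32 bits. A minimal standalone
sketch of that selection follows. It assumes one timer tick per
nanosecond, as the comment in the patch states for the KVM virtual APIC.
The ex_ names are illustrative and not part of this patch.

```c
#include <assert.h>
#include <stdint.h>

struct ex_timer_prog {
    unsigned int divider;   /* 1 or 128 */
    uint32_t     count;     /* value for the initial count register */
};

/*
 * Pick a divider and a 32-bit count for an interval given in nanoseconds,
 * assuming 1 ns per undivided timer tick.
 * Returns 0 on success, -1 if the interval does not fit even with /128.
 */
static int ex_timer_program(int64_t nsecs, struct ex_timer_prog *out)
{
    assert(nsecs >= 0);

    if (nsecs <= (int64_t)UINT32_MAX) {
        out->divider = 1;
        out->count   = (uint32_t)nsecs;
        return 0;
    }
    if (nsecs / 128 <= (int64_t)UINT32_MAX) {
        out->divider = 128;
        out->count   = (uint32_t)(nsecs / 128);
        return 0;
    }
    return -1;
}
```

Returning an error instead of panicking is a choice made for the sketch
only; the patch caps the interval first, so the overflow path should not
normally be reached there.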
 pc-bios/xenner/xenner-lapic.c |  622 +++++++++++++++++++++++++++++++++++++++++
 1 files changed, 622 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-lapic.c
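
Note on event delivery in evtchn_raise_event() below: a port is marked
pending, and the upcall flag is only raised if the port was not already
pending, is not masked, and its selector word was not already set. A
minimal standalone sketch of that two-level bitmap logic follows. It
uses plain, non-atomic bit operations and a hypothetical ex_evtchn_state
struct in place of the shared_info and vcpu_info pages, so it only
illustrates the ordering of the checks.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define EX_NR_PORTS      256
#define EX_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define EX_NR_WORDS      (EX_NR_PORTS / EX_BITS_PER_WORD)

/* Hypothetical stand-in for the shared_info / vcpu_info fields. */
struct ex_evtchn_state {
    unsigned long pending[EX_NR_WORDS];  /* one bit per port */
    unsigned long mask[EX_NR_WORDS];     /* one bit per port */
    unsigned long pending_sel;           /* one bit per pending[] word */
    int           upcall_pending;
};

/* Non-atomic test-and-set; real code needs an atomic variant. */
static int ex_test_and_set(unsigned long *map, unsigned int bit)
{
    unsigned long m  = 1UL << (bit % EX_BITS_PER_WORD);
    unsigned long *w = &map[bit / EX_BITS_PER_WORD];
    int old = (*w & m) != 0;

    *w |= m;
    return old;
}

static int ex_test(const unsigned long *map, unsigned int bit)
{
    return (map[bit / EX_BITS_PER_WORD] >> (bit % EX_BITS_PER_WORD)) & 1UL;
}

/* Returns 1 if the upcall flag was newly raised, 0 otherwise. */
static int ex_raise_event(struct ex_evtchn_state *s, unsigned int port)
{
    unsigned int word = port / EX_BITS_PER_WORD;

    assert(port < EX_NR_PORTS);
    if (ex_test_and_set(s->pending, port)) {
        return 0;                        /* already pending */
    }
    if (ex_test(s->mask, port)) {
        return 0;                        /* masked: stays pending, no upcall */
    }
    if (ex_test_and_set(&s->pending_sel, word)) {
        return 0;                        /* selector word already flagged */
    }
    s->upcall_pending = 1;
    return 1;
}
```

A masked port keeps its pending bit, which is what lets evtchn_unmask()
resend the event later.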

diff --git a/pc-bios/xenner/xenner-lapic.c b/pc-bios/xenner/xenner-lapic.c
new file mode 100644
index 0000000..af08c20
--- /dev/null
+++ b/pc-bios/xenner/xenner-lapic.c
@@ -0,0 +1,622 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner lapic handling
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+/*
+ * interrupt handling is here:
+ *  - ioapic control
+ *  - lapic control
+ *  - event channel management
+ *  - irq handler
+ */
+
+#include "xenner.h"
+
+#include "cpufeature.h"
+#include "msr-index.h"
+#include "apicdef.h"
+
+/* --------------------------------------------------------------------- */
+
+static void      *ioapic_mmio;
+static uint32_t  ioapic_pins;
+
+static uint32_t ioapic_read(int reg)
+{
+    volatile uint32_t *sel = (ioapic_mmio + IOAPIC_REG_SELECT);
+    volatile uint32_t *win = (ioapic_mmio + IOAPIC_REG_WINDOW);
+    *sel = reg;
+    return *win;
+}
+
+static void ioapic_write(int reg, uint32_t val)
+{
+    volatile uint32_t *sel = (ioapic_mmio + IOAPIC_REG_SELECT);
+    volatile uint32_t *win = (ioapic_mmio + IOAPIC_REG_WINDOW);
+    *sel = reg;
+    *win = val;
+}
+
+static void ioapic_write_irq_entry(int pin, struct IO_APIC_route_entry e)
+{
+    union route_entry_union {
+        struct { uint32_t w1, w2; };
+        struct IO_APIC_route_entry entry;
+    } eu;
+
+    eu.entry = e;
+    ioapic_write(0x11 + 2*pin, eu.w2);
+    ioapic_write(0x10 + 2*pin, eu.w1);
+}
+
+static void ioapic_route_irq(int pin, int vector, int cpu_id)
+{
+    struct IO_APIC_route_entry entry;
+
+    memset(&entry,0,sizeof(entry));
+    entry.vector = vector;
+    entry.dest = cpu_id;
+
+    ioapic_write_irq_entry(pin, entry);
+}
+
+static void ioapic_unroute_irq(int pin)
+{
+    struct IO_APIC_route_entry entry;
+
+    memset(&entry,0,sizeof(entry));
+    ioapic_write_irq_entry(pin, entry);
+}
+
+static void ioapic_init(struct xen_cpu *cpu)
+{
+    ureg_t base = IOAPIC_DEFAULT_BASE_ADDRESS;
+    uint32_t ver, id;
+
+    ioapic_mmio = fixmap_page(cpu, base);
+    id = ioapic_read(IOAPIC_REG_APIC_ID);
+    ver = ioapic_read(IOAPIC_REG_VERSION);
+    ioapic_pins = ((ver >> 16) & 0xff) + 1;
+    printk(1, "%s: base %" PRIxREG ", mapped to %p, id %d, version %d, pins %d\n",
+           __FUNCTION__, base, ioapic_mmio,
+           (id >> 24) & 0x0f, ver & 0xff, ioapic_pins);
+    if (!ver) {
+        panic("oops: ioapic version register is zero", NULL);
+    }
+
+    /* PICs: mask all irqs */
+    outb(0x10, 0x20);
+}
+
+/* --------------------------------------------------------------------- */
+
+static uint32_t lapic_read(struct xen_cpu *cpu, int reg)
+{
+    volatile uint32_t *ptr = (cpu->lapic + reg);
+    return *ptr;
+}
+
+static void lapic_write(struct xen_cpu *cpu, int reg, uint32_t val)
+{
+    volatile uint32_t *ptr = (cpu->lapic + reg);
+    *ptr = val;
+}
+
+void lapic_eoi(struct xen_cpu *cpu)
+{
+    lapic_write(cpu, APIC_EOI, 0);
+}
+
+void lapic_timer(struct xen_cpu *cpu)
+{
+    uint64_t systime;
+    uint32_t lvt;
+    uint32_t count;
+    uint32_t div;
+    int64_t nsecs;
+
+    systime = cpu->v.vcpu_info->time.system_time;
+    if (cpu->oneshot) {
+        nsecs = cpu->oneshot - systime;
+        if (nsecs < 10000) {
+            nsecs = 10000;
+        }
+    } else {
+        nsecs = cpu->periodic;
+    }
+
+    /* Cap the timer interval. If the timer fires earlier than requested,
+       the guest simply re-arms it. */
+    if (nsecs > 0x80000000) {
+        nsecs = 0x80000000;
+    }
+
+    printk(3, "%s/%d: periodic %" PRId64 ", oneshot %" PRId64
+           ", systime %" PRId64 ", nsecs %" PRId64 "\n", __FUNCTION__,
+           cpu->id, cpu->periodic, cpu->oneshot, systime, nsecs);
+
+    printk(3, "%s/%d: periodic %" PRIx64 ", oneshot %" PRIx64
+           ", systime %" PRIx64 ", nsecs %" PRIx64 "\n", __FUNCTION__,
+           cpu->id, cpu->periodic, cpu->oneshot, systime, nsecs);
+
+    lvt = cpu->virq_to_vector[VIRQ_TIMER];
+    if (!cpu->oneshot) {
+        lvt |= (1 << 17);
+    }
+
+    div = APIC_TDR_DIV_1;
+    count = nsecs;  /* kvm virtual apic has 1 ns ticks */
+    if (count != nsecs) {
+        /* count overflow, get some more bits */
+        div = APIC_TDR_DIV_128;
+        count = nsecs / 128;
+        if (count != nsecs / 128) {
+            /* Hmm, still overflows ... */
+            printk(0, "%s: nsecs 0x%" PRIx64 ", nsecs/128 0x%" PRIx64 ", count 0x%x\n",
+                   __FUNCTION__, nsecs, nsecs / 128, count);
+            panic("lapic timer count overflow", NULL);
+        }
+    }
+
+    lapic_write(cpu, APIC_LVTT,  lvt);
+    lapic_write(cpu, APIC_TDCR,  div);
+    lapic_write(cpu, APIC_TMICT, count);
+}
+
+static void lapic_ipi_send(struct xen_cpu *cpu, int dest,
+                           int vector, uint32_t flags)
+{
+    uint32_t icr2 = SET_APIC_DEST_FIELD(dest);
+    uint32_t icr  = vector | flags;
+
+    if (lapic_read(cpu, APIC_ICR) & APIC_ICR_BUSY) {
+        printk(0, "%s: busy ...\n", __FUNCTION__);
+        while (lapic_read(cpu, APIC_ICR) & APIC_ICR_BUSY)
+            /* busy wait */;
+        printk(0, "%s: ... ok\n", __FUNCTION__);
+    }
+
+    lapic_write(cpu, APIC_ICR2, icr2);
+    lapic_write(cpu, APIC_ICR,  icr);
+}
+
+void lapic_ipi_boot(struct xen_cpu *cpu, struct xen_cpu *ap)
+{
+    int addr = EMU_PA(sipi);
+
+    emudev_set(EMUDEV_CONF_NEXT_SECONDARY_VCPU, 0, ap->id);
+    printk(0, "%s/%d: send init ...\n", __FUNCTION__, ap->id);
+    lapic_ipi_send(cpu, ap->id, 0, APIC_DM_INIT | APIC_INT_ASSERT);
+    printk(0, "%s/%d: send sipi @ %x ...\n", __FUNCTION__, ap->id, addr);
+    lapic_ipi_send(cpu, ap->id, addr >> PAGE_SHIFT, APIC_DM_STARTUP | APIC_INT_ASSERT);
+}
+
+void lapic_ipi_flush_tlb(struct xen_cpu *cpu)
+{
+    lapic_ipi_send(cpu, 0, VECTOR_FLUSH_TLB,
+                   APIC_DEST_ALLBUT | APIC_DM_FIXED | APIC_INT_ASSERT);
+}
+
+static int lapic_init(struct xen_cpu *cpu)
+{
+    struct kvm_cpuid_entry entry;
+    uint32_t ax, dx, ver, id, spiv;
+    ureg_t base;
+
+    entry.function = 0x00000001;
+    real_cpuid(&entry);
+    if (!(entry.edx & (1 << (X86_FEATURE_APIC % 32)))) {
+        printk(1, "%s: no lapic present\n", __FUNCTION__);
+        return 0;
+    }
+
+    rdmsr(MSR_IA32_APICBASE, &ax, &dx);
+    base = (uint64_t)dx << 32 | (ax & PAGE_MASK);
+    cpu->lapic = fixmap_page(cpu, base);
+    id   = lapic_read(cpu, APIC_ID);
+    ver  = lapic_read(cpu, APIC_LVR);
+    spiv = lapic_read(cpu, APIC_SPIV);
+    printk(1, "%s: base %" PRIxREG ", mapped to %p, id %d, version %d, maxlvt %d%s%s%s\n",
+           __FUNCTION__, base, cpu->lapic,
+           GET_APIC_ID(id),
+           GET_APIC_VERSION(ver),
+           GET_APIC_MAXLVT(ver),
+           ax & 0x00000100 ? ", bootcpu"    : "",
+           ax & 0x00000800 ? ", hw-enabled" : "",
+           spiv & APIC_SPIV_APIC_ENABLED ? ", sw-enabled" : "");
+    if (!ver) {
+        panic("oops: lapic version register is zero", NULL);
+    }
+
+    lapic_write(cpu, APIC_SPIV, spiv | APIC_SPIV_APIC_ENABLED);
+
+    if (ax & 0x00000100) {
+        /* boot cpu */
+        return 1;
+    }
+
+    return 2;
+}
+
+/* --------------------------------------------------------------------- */
+
+#define NUM_VEC 256
+
+static struct vector {
+    enum {
+        VECTYPE_UNDEFINED = 0,
+        VECTYPE_INTERDOMAIN,
+        VECTYPE_VIRQ,
+        VECTYPE_IPI,
+    }               type;
+    int             vec;
+    int             pin;
+    int             evtchn;
+    int             virq;
+    struct xen_cpu  *cpu;
+    char            *desc;
+} vectors[NUM_VEC];
+
+static struct vector *evtchn_to_vec[NUM_VEC];
+static struct vector *pin_to_vec[NUM_VEC];
+
+static void evtchn_route_print(int level, struct vector *vector)
+{
+    static const char *tname[] = {
+        [ VECTYPE_UNDEFINED ]   = "???",
+        [ VECTYPE_INTERDOMAIN ] = "ext",
+        [ VECTYPE_VIRQ ]        = "virq",
+        [ VECTYPE_IPI ]         = "ipi",
+    };
+    char *name = vminfo.enames + vector->evtchn * XEN_ENAME_LEN;
+    char linfo[64], sinfo[20];
+
+    switch (vector->type) {
+    case VECTYPE_INTERDOMAIN:
+        snprintf(linfo, sizeof(linfo), "vcpu %d, io-apic pin %d, %s",
+                 vector->cpu->id, vector->pin, vector->desc);
+        snprintf(sinfo, sizeof(sinfo), "%s", vector->desc);
+        break;
+    case VECTYPE_VIRQ:
+        snprintf(linfo, sizeof(linfo), "vcpu %d, virq %d, %s",
+                 vector->cpu->id, vector->virq, vector->desc);
+        snprintf(sinfo, sizeof(sinfo), "virq%d (%s)",
+                 vector->virq, vector->desc);
+        break;
+    case VECTYPE_IPI:
+        snprintf(linfo, sizeof(linfo), "vcpu %d", vector->cpu->id);
+        snprintf(sinfo, sizeof(sinfo), "ipi");
+        break;
+    default:
+        snprintf(linfo, sizeof(linfo), "FIXME");
+        snprintf(sinfo, sizeof(sinfo), "FIXME");
+        break;
+    }
+    printk(1, "irq route: vec %d = evtchn %d, type %s, %s\n",
+           vector->vec, vector->evtchn, tname[vector->type], linfo);
+    snprintf(name, XEN_ENAME_LEN, "#%d/%d %s",
+             vector->evtchn, vector->cpu->id, sinfo);
+}
+
+static struct vector *evtchn_route_add(int type, int port)
+{
+    struct vector *vector;
+    int vec = VECTOR_EVTCHN_START;
+
+    while (vectors[vec].type != VECTYPE_UNDEFINED) {
+        vec++;
+    }
+    vector = vectors + vec;
+
+    evtchn_to_vec[port] = vector;
+    vector->type   = type;
+    vector->vec    = vec;
+    vector->evtchn = port;
+    return vector;
+}
+
+int evtchn_route_interdomain(struct xen_cpu *cpu, int port, char *desc)
+{
+    struct vector *vector;
+    int pin;
+
+    vector = evtchn_to_vec[port];
+    if (vector) {
+        /* re-route to other vcpu */
+        if (vector->type != VECTYPE_INTERDOMAIN) {
+            return -1;
+        }
+    } else {
+        /* new evtchn */
+        vector = evtchn_route_add(VECTYPE_INTERDOMAIN, port);
+        vector->desc = desc ? desc : "other";
+        for (pin = 1; pin_to_vec[pin]; pin++) {
+        }
+        vector->pin = pin;
+        pin_to_vec[pin] = vector;
+        emudev_set(EMUDEV_CONF_EVTCHN_TO_PIN, vector->evtchn, vector->pin);
+    }
+    vector->cpu = cpu;
+    ioapic_route_irq(vector->pin, vector->vec, vector->cpu->id);
+    evtchn_route_print(1, vector);
+    return 0;
+}
+
+int evtchn_route_virq(struct xen_cpu *cpu, int virq, int port, char *desc)
+{
+    struct vector *vector;
+
+    vector = evtchn_route_add(VECTYPE_VIRQ, port);
+    vector->virq = virq;
+    vector->cpu  = cpu;
+    vector->desc = desc ? desc : "other";
+    cpu->virq_to_vector[virq] = vector->vec;
+    evtchn_route_print(1, vector);
+    return 0;
+}
+
+int evtchn_route_ipi(struct xen_cpu *cpu, int port)
+{
+    struct vector *vector;
+
+    vector = evtchn_route_add(VECTYPE_IPI, port);
+    vector->cpu = cpu;
+    evtchn_route_print(1, vector);
+    return 0;
+}
+
+int evtchn_send(struct xen_cpu *cpu, int port)
+{
+    struct vector *vector;
+
+    if (port < 0 || port >= NUM_VEC) {
+        printk(0, "%s: oops: port %d is out of range\n", __FUNCTION__, port);
+        return 0;
+    }
+
+    vector = evtchn_to_vec[port];
+    if (!vector) {
+        printk(0, "%s: oops: vector for port %d is NULL\n", __FUNCTION__, port);
+        return 0;
+    }
+    switch (vector->type) {
+    case VECTYPE_VIRQ:
+        /* should not happen */
+        printk(0, "%s: port %d, virq (Huh? -- FIXME)\n", __FUNCTION__, port);
+        return 1;
+    case VECTYPE_IPI:
+        /* handled internally */
+        printk(2, "%s: port %d, ipi\n", __FUNCTION__, port);
+        lapic_ipi_send(cpu, vector->cpu->id, vector->vec,
+                       APIC_DM_FIXED | APIC_INT_ASSERT);
+        return 1;
+    default:
+        /* handled by xenner */
+        printk(3, "%s: port %d, external\n", __FUNCTION__, port);
+        return 0;
+    }
+}
+
+void evtchn_unmask(struct xen_cpu *cpu, int port)
+{
+    struct vector *vector = evtchn_to_vec[port];
+    int resent = 0;
+
+    if (!vector) {
+        printk(0, "%s: oops: vector for port %d is NULL\n", __FUNCTION__, port);
+        return;
+    }
+
+    clear_bit(port, shared_info.evtchn_mask);
+    if (test_and_clear_bit(port, shared_info.evtchn_pending)) {
+        lapic_ipi_send(cpu, vector->cpu->id, vector->vec,
+                       APIC_DM_FIXED | APIC_INT_ASSERT);
+        resent = 1;
+    }
+    printk(2, "%s: port %d%s\n", __FUNCTION__, port,
+           resent ? ", resent" : "");
+}
+
+void evtchn_close(struct xen_cpu *cpu, int port)
+{
+    struct vector *vector = evtchn_to_vec[port];
+    char *name;
+
+    if (!vector) {
+        printk(0, "%s: oops: vector for port %d is NULL\n", __FUNCTION__, port);
+        return;
+    }
+
+    switch (vector->type) {
+    case VECTYPE_INTERDOMAIN:
+        ioapic_unroute_irq(vector->pin);
+        pin_to_vec[vector->pin] = NULL;
+        break;
+    case VECTYPE_VIRQ:
+        if (vector->virq == VIRQ_TIMER) {
+            cpu->oneshot  = 0;
+            cpu->periodic = XEN_DEFAULT_PERIOD;
+            lapic_timer(cpu);
+            return;
+        }
+        break;
+    default:
+        /* nothing -- make gcc happy */
+        break;
+    }
+    printk(1, "irq route: vec %d = evtchn %d, closing\n",
+           vector->vec, vector->evtchn);
+    name = vminfo.enames + vector->evtchn * XEN_ENAME_LEN;
+    snprintf(name, XEN_ENAME_LEN, "#%d (closed)", vector->evtchn);
+
+    memset(vector, 0, sizeof(*vector));
+    evtchn_to_vec[port] = NULL;
+    emudev_cmd(EMUDEV_CMD_EVTCHN_CLOSE, port);
+}
+
+int evtchn_alloc(int vcpu_id)
+{
+    ureg_t port;
+
+    emudev_cmd(EMUDEV_CMD_EVTCHN_ALLOC, vcpu_id);
+    port = emudev_get(EMUDEV_CONF_COMMAND_RESULT, vcpu_id);
+    return port;
+}
+
+static int evtchn_route_init(struct xen_cpu *cpu)
+{
+    uint64_t evtchn_store;
+    uint64_t evtchn_console;
+
+    evtchn_store = emudev_get(EMUDEV_CONF_EVTCH_XENSTORE, 0);
+    evtchn_console = emudev_get(EMUDEV_CONF_EVTCH_CONSOLE, 0);
+
+    evtchn_route_interdomain(cpu, evtchn_store, "xenstore");
+    evtchn_route_interdomain(cpu, evtchn_console, "console");
+
+    cpu->timerport = evtchn_alloc(cpu->id);
+    evtchn_route_virq(cpu, VIRQ_TIMER, cpu->timerport, "timer");
+    lapic_timer(cpu);
+
+    return 0;
+}
+
+/* --------------------------------------------------------------------- */
+
+static void evtchn_raise_event(struct xen_cpu *cpu, int port)
+{
+    int word = port / (sizeof(intptr_t)*8);
+
+    if (test_and_set_bit(port, shared_info.evtchn_pending) ||
+        test_bit(port, shared_info.evtchn_mask) ||
+        test_and_set_bit(word, &cpu->v.vcpu_info->evtchn_pending_sel)) {
+        return;
+    }
+    cpu->v.vcpu_info->evtchn_upcall_pending = 1;
+}
+
+int evtchn_pending(struct xen_cpu *cpu)
+{
+    if (!cpu->v.vcpu_info->evtchn_upcall_pending ||
+        !guest_irq_flag(cpu)) {
+        return 0;
+    }
+    return 1;
+}
+
+static void evtchn_forward(struct xen_cpu *cpu, struct regs *regs)
+{
+    vminfo.faults[XEN_FAULT_EVENT_CALLBACK]++;
+    bounce_trap(cpu, regs, -1, CALLBACKTYPE_event);
+#ifdef CONFIG_64BIT
+    /* return via iretq please */
+    regs->error = HCALL_IRET;
+#endif
+}
+
+void evtchn_try_forward(struct xen_cpu *cpu, struct regs *regs)
+{
+    if (context_is_emu(regs)) {
+        return;
+    }
+
+    if (!evtchn_pending(cpu)) {
+
+#if 0 /* deadlock detector */
+        static int masked;
+        uint8_t *instr = (void*)regs->rip;
+
+        if (cpu->v.vcpu_info->evtchn_upcall_pending &&
+            !guest_irq_flag(cpu)) {
+            masked++;
+            if (masked > 10000) {
+                printk(0, "%s: deadlocked? injecting BUG() for trace\n", __FUNCTION__);
+                instr[0] = 0x0f; /* ud2a -- BUG() */
+                instr[1] = 0x0b;
+                instr[2] = 0xcd; /* int 255 */
+                instr[3] = 0xff;
+                masked = 0;
+            }
+        } else {
+            masked = 0;
+        }
+#endif
+
+        return;
+    }
+
+    evtchn_forward(cpu, regs);
+}
+
+/* --------------------------------------------------------------------- */
+
+asmlinkage void do_irq(struct regs *regs)
+{
+    struct xen_cpu *cpu = get_cpu();
+    struct vector *vector = vectors + regs->trapno;
+
+    printk(3, "%s: irq vector %d\n", __FUNCTION__, vector->vec);
+
+    lapic_eoi(cpu);
+    switch (vector->type) {
+    case VECTYPE_UNDEFINED:
+        printk(0, "%s: unhandled irq (vector %d)\n", __FUNCTION__, (int)regs->trapno);
+        panic("unknown irq", regs);
+        break;
+    case VECTYPE_VIRQ:
+        if (vector->virq == VIRQ_TIMER) {
+            if (cpu->oneshot) {
+                cpu->oneshot = 0;
+                lapic_timer(cpu);
+            }
+            pv_clock_update(0);
+        }
+        /* fall through */
+    default:
+        vminfo.events[vector->evtchn]++;
+        evtchn_raise_event(cpu, vector->evtchn);
+        evtchn_try_forward(cpu, regs);
+        break;
+    }
+
+    if (context_is_emu(regs)) {
+        uint8_t *ins = (void*)regs->rip;
+        if (ins[0] == 0xf4) {
+            printk(0, "%s: WARN: rip %" PRIxREG " points to hlt\n",
+                   __FUNCTION__, regs->rip);
+        }
+    }
+}
+
+int irq_init(struct xen_cpu *cpu)
+{
+    int rc;
+
+    rc = lapic_init(cpu);
+    if (rc == 0) {
+        return 0;
+    } else if (rc == 1) {
+        /* boot cpu */
+        ioapic_init(cpu);
+        evtchn_route_init(cpu);
+    }
+    return rc;
+}
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [Qemu-devel] [PATCH 16/40] xenner: kernel: Main (i386)
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (14 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 15/40] xenner: kernel: lapic code Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 17/40] xenner: kernel: Main (x86_64) Alexander Graf
                   ` (23 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds the i386-specific piece of xenner's main loop.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 pc-bios/xenner/xenner-main32.c |  390 ++++++++++++++++++++++++++++++++++++++++
 1 files changed, 390 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-main32.c
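
Note on the bounce frame built in bounce_trap() below: on 32 bit the old
evtchn_upcall_mask is stored in the upper half of the cs slot on the
guest stack, next to the 16-bit selector. A minimal standalone sketch of
that packing follows. The ex_ helpers are illustrative and not part of
this patch, and the unpack side is an assumption about how a consumer
would read the slot back.

```c
#include <assert.h>
#include <stdint.h>

/* Combine a 16-bit code segment selector with the saved upcall mask. */
static uint32_t ex_pack_cs(uint16_t cs, uint8_t upcall_mask)
{
    return (uint32_t)cs | ((uint32_t)upcall_mask << 16);
}

/* Recover the selector from the packed slot. */
static uint16_t ex_cs_selector(uint32_t slot)
{
    return (uint16_t)(slot & 0xffff);
}

/* Recover the saved upcall mask from the packed slot. */
static uint8_t ex_cs_upcall_mask(uint32_t slot)
{
    return (uint8_t)((slot >> 16) & 0xff);
}
```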

diff --git a/pc-bios/xenner/xenner-main32.c b/pc-bios/xenner/xenner-main32.c
new file mode 100644
index 0000000..0c049dd
--- /dev/null
+++ b/pc-bios/xenner/xenner-main32.c
@@ -0,0 +1,390 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner main functions for 32 bit
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "xenner.h"
+#include "xenner-main.c"
+
+/* --------------------------------------------------------------------- */
+/* helper functions                                                      */
+
+int bounce_trap(struct xen_cpu *cpu, struct regs_32 *regs, int trapno, int cbno)
+{
+    uint32_t *kesp, eip = 0, cs = 0;
+    uint32_t stack_cs, stack_eflags;
+    int stack_switch = 0;
+    int error_code = 0;
+    int interrupt = 0;
+    int k = 0;
+
+    vminfo.faults[XEN_FAULT_BOUNCE_TRAP]++;
+
+    if (trapno >= 0) {
+        /* trap bounce */
+        eip  = xentr[trapno].address;
+        cs   = xentr[trapno].cs;
+        if (TI_GET_IF(&xentr[trapno])) {
+            interrupt = 1;
+        }
+        if (trapno < (sizeof(trapinfo) / sizeof(trapinfo[0]))) {
+            error_code = trapinfo[trapno].ec;
+        }
+        if (trapno == 14) {
+            /* page fault */
+            cpu->v.vcpu_info->arch.cr2 = read_cr2();
+        }
+    }
+    if (cbno >= 0) {
+        /* callback */
+        eip  = xencb[cbno].eip;
+        cs   = xencb[cbno].cs;
+        switch (cbno) {
+        case CALLBACKTYPE_event:
+            interrupt = 1;
+            break;
+        }
+    }
+
+    if (!cs) {
+        printk(0, "%s: trapno %d, cbno %d\n", __FUNCTION__, trapno, cbno);
+        panic("no guest trap handler", regs);
+    }
+
+    /* set interrupt flag depending on event channel mask */
+    stack_eflags = regs->eflags & ~X86_EFLAGS_IF;
+    if (guest_irq_flag(cpu)) {
+        stack_eflags |= X86_EFLAGS_IF;
+    }
+
+    /* old evtchn_upcall_mask is saved in cs slot on the stack */
+    stack_cs = regs->cs | ((uint32_t)cpu->v.vcpu_info->evtchn_upcall_mask << 16);
+    if (interrupt) {
+        guest_cli(cpu);
+    }
+
+    if ((regs->cs & 0x03) < (cs & 0x03)) {
+        panic("bounce trap: illegal ring switch", regs);
+    }
+    if ((regs->cs & 0x03) > (cs & 0x03)) {
+        stack_switch = 1;
+    }
+
+    /* prepare guest stack: copy from emu, so the handler
+     * jumps straight back without round-trip via emu */
+    if (stack_switch) {
+        kesp = (void*)cpu->tss.esp1;
+        kesp[-(++k)] = regs->ss;            // push ss
+        kesp[-(++k)] = regs->esp;           // push esp
+    } else {
+        kesp = (void*)regs->esp;
+    }
+
+    kesp[-(++k)] = stack_eflags;            // push eflags
+    kesp[-(++k)] = stack_cs;                // push cs
+    kesp[-(++k)] = regs->eip;               // push eip
+    if (error_code) {
+        kesp[-(++k)] = regs->error;         // push error code
+    }
+
+    /* prepare emu stack, so iret jumps to the kernel's handler. */
+    regs->eip     = eip;
+    regs->cs      = cs;
+    regs->eflags &= EFLAGS_TRAPMASK;
+    if (stack_switch) {
+        regs->ss  = cpu->tss.ss1;
+        regs->esp = cpu->tss.esp1;
+    }
+    regs->esp -= 4*k;
+
+    return 0;
+}
+
+/* --------------------------------------------------------------------- */
+
+static const struct kvm_segment xen32_cs0 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe008,
+    .dpl      = 0,
+    .type     = 0xb,
+    .present  = 1,  .db = 1,  .s = 1,  .g = 1,
+};
+static const struct kvm_segment xen32_ds0 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe010,
+    .dpl      = 0,
+    .type     = 0x3,
+    .present  = 1,  .db = 1,  .s = 1,  .g = 1,
+};
+static const struct kvm_segment xen32_cs1 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe019,
+    .dpl      = 1,
+    .type     = 0xb,
+    .present  = 1,  .db = 1,  .s = 1,  .g = 1,
+};
+static const struct kvm_segment xen32_ds1 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe021,
+    .dpl      = 1,
+    .type     = 0x3,
+    .present  = 1,  .db = 1,  .s = 1,  .g = 1,
+};
+static const struct kvm_segment xen32_cs3 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe02b,
+    .dpl      = 3,
+    .type     = 0xb,
+    .present  = 1,  .db = 1,  .s = 1,  .g = 1,
+};
+static const struct kvm_segment xen32_ds3 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe033,
+    .dpl      = 3,
+    .type     = 0x3,
+    .present  = 1,  .db = 1,  .s = 1,  .g = 1,
+};
+
+void gdt_init(struct xen_cpu *cpu)
+{
+    printk(2, "%s: cpu %d\n", __FUNCTION__, cpu->id);
+
+    if (!cpu->gdt) {
+        cpu->gdt = get_pages(16, "gdt");
+    }
+
+    gdt_set(cpu->gdt, &xen32_cs0);
+    gdt_set(cpu->gdt, &xen32_ds0);
+    gdt_set(cpu->gdt, &xen32_cs1);
+    gdt_set(cpu->gdt, &xen32_ds1);
+    gdt_set(cpu->gdt, &xen32_cs3);
+    gdt_set(cpu->gdt, &xen32_ds3);
+}
+
+void tss_init(struct xen_cpu *cpu)
+{
+    struct descriptor_32 *gdt = cpu->gdt;
+    int idx = tss(cpu);
+
+    printk(2, "%s: cpu %d\n", __FUNCTION__, cpu->id);
+
+    cpu->tss.esp0 = (uintptr_t)cpu->stack_high;
+    cpu->tss.ss0  = 0xe010;
+
+    gdt[ idx ] = mkdesc32((uintptr_t)(&cpu->tss), sizeof(cpu->tss)-1, 0x89, 0);
+}
+
+void msrs_init(struct xen_cpu *cpu)
+{
+    printk(2, "%s: cpu %d\n", __FUNCTION__, cpu->id);
+}
+
+void idt_init(void)
+{
+    intptr_t entry;
+    int i, len;
+
+    printk(2, "%s\n", __FUNCTION__);
+
+    len = (irq_common - irq_entries) / 256;
+    for (i = 0; i < 256; i++) {
+        entry = (intptr_t)(irq_entries + i*len);
+        xen_idt[i] = mkgate32(0xe008, (uintptr_t)irq_entries + i*len, 0x8e);
+    }
+
+    xen_idt[    0 ] = mkgate32(0xe008, (uintptr_t)division_by_zero,    0x8e);
+    xen_idt[    1 ] = mkgate32(0xe008, (uintptr_t)debug_int1,          0x8e);
+    xen_idt[    2 ] = mkgate32(0xe008, (uintptr_t)nmi,                 0x8e);
+    xen_idt[    3 ] = mkgate32(0xe008, (uintptr_t)debug_int3,          0xee);
+    xen_idt[    4 ] = mkgate32(0xe008, (uintptr_t)overflow,            0x8e);
+    xen_idt[    5 ] = mkgate32(0xe008, (uintptr_t)bound_check,         0x8e);
+    xen_idt[    6 ] = mkgate32(0xe008, (uintptr_t)illegal_instruction, 0x8e);
+    xen_idt[    7 ] = mkgate32(0xe008, (uintptr_t)no_device,           0x8e);
+    xen_idt[    8 ] = mkgate32(0xe008, (uintptr_t)double_fault,        0x8e);
+    xen_idt[    9 ] = mkgate32(0xe008, (uintptr_t)coprocessor,         0x8e);
+    xen_idt[   10 ] = mkgate32(0xe008, (uintptr_t)invalid_tss,         0x8e);
+    xen_idt[   11 ] = mkgate32(0xe008, (uintptr_t)segment_not_present, 0x8e);
+    xen_idt[   12 ] = mkgate32(0xe008, (uintptr_t)stack_fault,         0x8e);
+    xen_idt[   13 ] = mkgate32(0xe008, (uintptr_t)general_protection,  0x8e);
+    xen_idt[   14 ] = mkgate32(0xe008, (uintptr_t)page_fault,          0x8e);
+    xen_idt[   16 ] = mkgate32(0xe008, (uintptr_t)floating_point,      0x8e);
+    xen_idt[   17 ] = mkgate32(0xe008, (uintptr_t)alignment,           0x8e);
+    xen_idt[   18 ] = mkgate32(0xe008, (uintptr_t)machine_check,       0x8e);
+    xen_idt[   19 ] = mkgate32(0xe008, (uintptr_t)simd_floating_point, 0x8e);
+
+    xen_idt[ VECTOR_FLUSH_TLB  ] =
+        mkgate32(0xe008, (uintptr_t)smp_flush_tlb, 0x8e);
+
+    xen_idt[ 0x82 ] = mkgate32(0xe008, (uintptr_t)xen_hypercall,       0xae);
+}
+
+/* --------------------------------------------------------------------- */
+
+static int pf_fixup_readonly(struct regs_32 *regs, uint32_t cr2)
+{
+    pte_t *pte = find_pte_lpt(cr2);
+
+    if (cr2 >= XEN_IPT) {
+        return 0;
+    }
+
+    if (*pte & _PAGE_USER) {
+        return 0;
+    }
+
+    /* is kernel page */
+    *pte |= _PAGE_RW;
+
+    vminfo.faults[XEN_FAULT_PAGE_FAULT_FIX_RO]++;
+    flush_tlb_addr(cr2);
+
+    return 1;
+}
+
+void guest_regs_init(struct xen_cpu *cpu, struct regs_32 *regs)
+{
+    struct vcpu_guest_context *ctxt = cpu->init_ctxt;
+
+    regs->eax    = ctxt->user_regs.eax;
+    regs->ebx    = ctxt->user_regs.ebx;
+    regs->ecx    = ctxt->user_regs.ecx;
+    regs->edx    = ctxt->user_regs.edx;
+    regs->esi    = ctxt->user_regs.esi;
+    regs->edi    = ctxt->user_regs.edi;
+    regs->ebp    = ctxt->user_regs.ebp;
+    regs->eip    = ctxt->user_regs.eip;
+    regs->cs     = ctxt->user_regs.cs;
+    regs->eflags = ctxt->user_regs.eflags;
+    regs->esp    = ctxt->user_regs.esp;
+    regs->ss     = ctxt->user_regs.ss;
+
+    regs->ds     = ctxt->user_regs.ds;
+    regs->es     = ctxt->user_regs.es;
+    asm volatile("mov %0, %%fs;\n" :: "r" (ctxt->user_regs.fs) : "memory");
+    asm volatile("mov %0, %%gs;\n" :: "r" (ctxt->user_regs.gs) : "memory");
+}
+
+static void set_up_shared_info(void)
+{
+    int i;
+
+    memset(&shared_info, 0, sizeof(shared_info));
+    for ( i = 0; i < XEN_LEGACY_MAX_VCPUS; i++ ) {
+        shared_info.vcpu_info[i].evtchn_upcall_mask = 1;
+    }
+}
+
+static void set_up_context(void *_ctxt, unsigned long boot_cr3,
+                           unsigned long init_pt_len)
+{
+    vcpu_guest_context_t *ctxt = _ctxt;
+    uint64_t virt_base = emudev_get(EMUDEV_CONF_PV_VIRT_BASE, 0);
+    uint64_t virt_entry = emudev_get(EMUDEV_CONF_PV_VIRT_ENTRY, 0);
+    uint64_t boot_stack_pfn = emudev_get(EMUDEV_CONF_PFN_INIT_PT, 0) +
+                              addr_to_frame(init_pt_len + PAGE_SIZE - 1);
+    uint64_t start_info_pfn = emudev_get(EMUDEV_CONF_PFN_START_INFO, 0);
+
+    set_up_shared_info();
+
+    /* clear everything */
+    memset(ctxt, 0, sizeof(*ctxt));
+
+    ctxt->user_regs.ds = FLAT_KERNEL_DS_X86_32;
+    ctxt->user_regs.es = FLAT_KERNEL_DS_X86_32;
+    ctxt->user_regs.fs = FLAT_KERNEL_DS_X86_32;
+    ctxt->user_regs.gs = FLAT_KERNEL_DS_X86_32;
+    ctxt->user_regs.ss = FLAT_KERNEL_SS_X86_32;
+    ctxt->user_regs.cs = FLAT_KERNEL_CS_X86_32;
+    ctxt->user_regs.eip = virt_entry;
+    ctxt->user_regs.esp = virt_base | ((boot_stack_pfn + 1) << PAGE_SHIFT);
+    ctxt->user_regs.esi = virt_base | (start_info_pfn << PAGE_SHIFT);
+    ctxt->user_regs.eflags = 1 << 9; /* Interrupt Enable */
+
+    ctxt->kernel_ss = ctxt->user_regs.ss;
+    ctxt->kernel_sp = ctxt->user_regs.esp;
+
+    ctxt->flags = VGCF_in_kernel_X86_32 | VGCF_online_X86_32;
+    ctxt->ctrlreg[3] = boot_cr3;
+}
+
+static void guest_hypercall_page(struct xen_cpu *cpu)
+{
+    unsigned long _hypercall_page = emudev_get(EMUDEV_CONF_HYPERCALL_PAGE, 0);
+    char *hypercall_page = (char*)_hypercall_page;
+
+    char *p;
+    int i;
+
+    /* Fill in all the transfer points with template machine code. */
+    for ( i = 0; i < (PAGE_SIZE / 32); i++ ) {
+        p = (char *)(hypercall_page + (i * 32));
+        *(uint8_t  *)(p+ 0) = 0xb8;    /* mov  $<i>,%eax */
+        *(uint32_t *)(p+ 1) = i;
+        *(uint16_t *)(p+ 5) = 0x82cd;  /* int  $0x82 */
+        *(uint8_t  *)(p+ 7) = 0xc3;    /* ret */
+    }
+
+    /*
+     * HYPERVISOR_iret is special because it doesn't return and expects a
+     * special stack frame. Guests jump at this transfer point instead of
+     * calling it.
+     */
+    p = (char *)(hypercall_page + (__HYPERVISOR_iret * 32));
+    *(uint8_t  *)(p+ 0) = 0x50;    /* push %eax */
+    *(uint8_t  *)(p+ 1) = 0xb8;    /* mov  $__HYPERVISOR_iret,%eax */
+    *(uint32_t *)(p+ 2) = __HYPERVISOR_iret;
+    *(uint16_t *)(p+ 6) = 0x82cd;  /* int  $0x82 */
+}
+
+
+/* --------------------------------------------------------------------- */
+/* called from assembler                                                 */
+
+asmlinkage void do_page_fault(struct regs_32 *regs)
+{
+    struct xen_cpu *cpu = get_cpu();
+    uint32_t cr2 = read_cr2();
+
+    vminfo.faults[XEN_FAULT_PAGE_FAULT]++;
+
+    if (context_is_emu(regs)) {
+        if (fixup_extable(regs)) {
+            return;
+        }
+        print_page_fault_info(0, cpu, regs, cr2);
+        panic("ring0 (emu) page fault", regs);
+    }
+
+    if (wrpt && regs->error == 3) {
+        /* kernel write to r/o page */
+        if (pf_fixup_readonly(regs, cr2)) {
+            return;
+        }
+    }
+
+    vminfo.faults[XEN_FAULT_PAGE_FAULT_GUEST]++;
+    bounce_trap(cpu, regs, 14, -1);
+}
-- 
1.6.0.2
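
Note on guest_hypercall_page() in the patch above: every 32-byte slot of the
hypercall page receives the same small stub, and the iret slot is then
overwritten with a variant that does not return. The sketch below shows the
same byte layout in a host-independent form. It is an illustration under
assumptions: the function names and HC_* constants are hypothetical, the
hypercall number of iret is passed in by the caller, bytes are stored one at
a time instead of through uint16_t/uint32_t casts, and the page is zeroed
first, which the patch itself does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HC_PAGE_SIZE 4096u   /* assumed page size */
#define HC_STUB_SIZE 32u     /* one transfer point per 32 bytes */

/* Store a 32-bit value in little-endian order, whatever the host is. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

static void fill_hypercall_page32(uint8_t *page, uint32_t iret_nr)
{
    uint32_t i;
    uint8_t *p;

    assert(page != NULL);
    assert(iret_nr < HC_PAGE_SIZE / HC_STUB_SIZE);

    memset(page, 0, HC_PAGE_SIZE);
    for (i = 0; i < HC_PAGE_SIZE / HC_STUB_SIZE; i++) {
        p = page + i * HC_STUB_SIZE;
        p[0] = 0xb8;             /* mov  $i,%eax */
        put_le32(p + 1, i);
        p[5] = 0xcd;             /* int  $0x82 */
        p[6] = 0x82;
        p[7] = 0xc3;             /* ret */
    }

    /* iret slot: guests jump here, so there is no trailing ret */
    p = page + iret_nr * HC_STUB_SIZE;
    p[0] = 0x50;                 /* push %eax */
    p[1] = 0xb8;                 /* mov  $iret_nr,%eax */
    put_le32(p + 2, iret_nr);
    p[6] = 0xcd;                 /* int  $0x82 */
    p[7] = 0x82;
}
```

A caller would pass a 4096-byte buffer and __HYPERVISOR_iret as iret_nr.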

* [Qemu-devel] [PATCH 17/40] xenner: kernel: Main (x86_64)
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (15 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 16/40] xenner: kernel: Main (i386) Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 18/40] xenner: kernel: Main Alexander Graf
                   ` (22 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds the x86_64-specific piece of xenner's main loop.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
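Note on msrs_init(): the MSR_STAR value written there packs two selector
bases, and the CPU derives the SYSCALL and SYSRET selectors from them by
fixed offsets. The sketch below spells out that derivation as I understand
the architectural definition and checks it against the selectors used in
the GDT of this patch. The helper names are hypothetical and not part of
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Compose MSR_STAR from the SYSRET base (63:48) and SYSCALL base (47:32). */
static uint64_t make_star(uint16_t sysret_base, uint16_t syscall_base)
{
    return ((uint64_t)sysret_base << 48) | ((uint64_t)syscall_base << 32);
}

/* SYSCALL: CS is the base with RPL forced to 0, SS is the base + 8. */
static uint16_t syscall_cs(uint64_t star)
{
    return (uint16_t)((star >> 32) & 0xfffc);
}

static uint16_t syscall_ss(uint64_t star)
{
    return (uint16_t)(((star >> 32) & 0xffff) + 8);
}

/* SYSRET: RPL is forced to 3; a 64-bit return uses base + 16 for CS. */
static uint16_t sysret_cs32(uint64_t star)
{
    return (uint16_t)((star >> 48) | 3);
}

static uint16_t sysret_ss(uint64_t star)
{
    return (uint16_t)(((star >> 48) + 8) | 3);
}

static uint16_t sysret_cs64(uint64_t star)
{
    return (uint16_t)(((star >> 48) + 16) | 3);
}

/* Check the values from msrs_init() against the GDT selectors. */
static void check_xenner_star(void)
{
    uint64_t star = make_star(0xe023, 0xe008);

    assert(syscall_cs(star)  == 0xe008);   /* xen64_cs0_64 */
    assert(syscall_ss(star)  == 0xe010);   /* xen64_ds0_32 */
    assert(sysret_cs32(star) == 0xe023);   /* xen64_cs3_32 */
    assert(sysret_ss(star)   == 0xe02b);   /* xen64_ds3_32 */
    assert(sysret_cs64(star) == 0xe033);   /* xen64_cs3_64 */
}
```

This is why the GDT entries have to sit in exactly this order: the data
segment must follow the SYSCALL code segment, and the 32-bit code, data and
64-bit code segments for ring 3 must be consecutive.
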
 pc-bios/xenner/xenner-main64.c |  412 ++++++++++++++++++++++++++++++++++++++++
 1 files changed, 412 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-main64.c

diff --git a/pc-bios/xenner/xenner-main64.c b/pc-bios/xenner/xenner-main64.c
new file mode 100644
index 0000000..52f1dd3
--- /dev/null
+++ b/pc-bios/xenner/xenner-main64.c
@@ -0,0 +1,412 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner main functions for 64 bit
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "msr-index.h"
+#include "xenner.h"
+#include "xenner-main.c"
+
+/* --------------------------------------------------------------------- */
+/* helper functions                                                      */
+
+int bounce_trap(struct xen_cpu *cpu, struct regs_64 *regs, int trapno, int cbno)
+{
+    uint64_t *stack, rip = 0, rsp, stack_cs, stack_rflags;
+    int error_code = 0;
+    int interrupt = 0;
+    int failsafe = 0;
+    int k = 0;
+
+    vminfo.faults[XEN_FAULT_BOUNCE_TRAP]++;
+
+    if (trapno >= 0) {
+        /* trap bounce */
+        rip  = xentr[trapno].address;
+        if (TI_GET_IF(&xentr[trapno])) {
+            interrupt = 1;
+        }
+        if (trapno < sizeof(trapinfo)/sizeof(trapinfo[0])) {
+            error_code = trapinfo[trapno].ec;
+        }
+        if (trapno == 14) {
+            /* page fault */
+            cpu->v.vcpu_info->arch.cr2 = read_cr2();
+        }
+    }
+    if (cbno >= 0) {
+        /* callback */
+        rip  = xencb[cbno];
+        switch (cbno) {
+        case CALLBACKTYPE_event:
+            interrupt = 1;
+            break;
+        }
+    }
+
+    if (!rip) {
+        printk(0, "%s: cbno %d, trapno %d\n", __FUNCTION__, cbno, trapno);
+        panic("no guest trap handler", regs);
+    }
+
+    /* set interrupt flag depending on event channel mask */
+    stack_rflags = regs->rflags & ~X86_EFLAGS_IF;
+    if (guest_irq_flag(cpu)) {
+        stack_rflags |= X86_EFLAGS_IF;
+    }
+
+    /* old evtchn_upcall_mask is saved in cs slot on the stack */
+    stack_cs = regs->cs | ((uint64_t)cpu->v.vcpu_info->evtchn_upcall_mask << 32);
+    if (interrupt) {
+        guest_cli(cpu);
+    }
+
+    if (!is_kernel(cpu)) {
+        /* user mode */
+        switch_mode(cpu);
+        rsp = cpu->kernel_sp;
+    } else {
+        /* kernel mode */
+        stack_cs &= ~3;         /* signal kernel mode */
+        rsp = regs->rsp & ~0xf; /* align stack */
+    }
+    stack = (void*)(rsp);
+
+    stack[-(++k)] = regs->ss;        // push ss
+    stack[-(++k)] = regs->rsp;       // push rsp
+    stack[-(++k)] = stack_rflags;    // push rflags
+    stack[-(++k)] = stack_cs;        // push cs
+    stack[-(++k)] = regs->rip;       // push rip
+    if (error_code) {
+        stack[-(++k)] = regs->error; // push error code
+    }
+
+    if (failsafe) {
+        /* push segment registers */;
+    }
+
+    stack[-(++k)] = regs->r11;       // push r11
+    stack[-(++k)] = regs->rcx;       // push rcx
+
+    /* prepare emu stack, so iret jumps to the kernel's handler. */
+    regs->rip     = rip;
+    regs->cs      = FLAT_KERNEL_CS;
+    regs->rflags &= EFLAGS_TRAPMASK;
+    regs->rsp     = rsp - 8*k;
+    regs->ss      = FLAT_KERNEL_SS;
+
+    return 0;
+}
+
+/* --------------------------------------------------------------------- */
+
+static const struct kvm_segment xen64_cs0_64 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe008,
+    .dpl      = 0,
+    .type     = 0xb,
+    .present  = 1,  .l = 1,  .s = 1,  .g = 1,
+};
+static const struct kvm_segment xen64_ds0_32 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe010,
+    .dpl      = 0,
+    .type     = 0x3,
+    .present  = 1,  .db = 1,  .s = 1,  .g = 1,
+};
+static const struct kvm_segment xen64_cs3_32 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe023,
+    .dpl      = 3,
+    .type     = 0xb,
+    .present  = 1,  .db = 1,  .s = 1,  .g = 1,
+};
+static const struct kvm_segment xen64_ds3_32 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe02b,
+    .dpl      = 3,
+    .type     = 0x3,
+    .present  = 1,  .db = 1,  .s = 1,  .g = 1,
+};
+static const struct kvm_segment xen64_cs3_64 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe033,
+    .dpl      = 3,
+    .type     = 0xb,
+    .present  = 1,  .l = 1,  .s = 1,  .g = 1,
+};
+static const struct kvm_segment xen64_cs0_32 = {
+    .base     = 0,
+    .limit    = 0xffffffff,
+    .selector = 0xe038,
+    .dpl      = 0,
+    .type     = 0xb,
+    .present  = 1,  .db = 1,  .s = 1,  .g = 1,
+};
+
+void gdt_init(struct xen_cpu *cpu)
+{
+    printk(2, "%s: cpu %d\n", __FUNCTION__, cpu->id);
+
+    if (!cpu->gdt) {
+        cpu->gdt = get_pages(16, "gdt");
+    }
+
+    gdt_set(cpu->gdt, &xen64_cs0_64);
+    gdt_set(cpu->gdt, &xen64_ds0_32);
+    gdt_set(cpu->gdt, &xen64_cs3_32);
+    gdt_set(cpu->gdt, &xen64_ds3_32);
+    gdt_set(cpu->gdt, &xen64_cs3_64);
+    gdt_set(cpu->gdt, &xen64_cs0_32);
+}
+
+void tss_init(struct xen_cpu *cpu)
+{
+    struct descriptor_32 *gdt = cpu->gdt;
+    int size, idx = tss(cpu);
+    uint64_t base;
+
+    printk(2, "%s: cpu %d\n", __FUNCTION__, cpu->id);
+
+    cpu->tss.rsp0   = (uintptr_t)cpu->stack_high;
+    cpu->tss.ist[0] = (uintptr_t)cpu->irqstack_high;
+
+    base = (uintptr_t)(&cpu->tss);
+    size = sizeof(cpu->tss)-1;
+    gdt[ idx +0 ] = mkdesc32(base & 0xffffffff, size, 0x89, 0);
+    gdt[ idx +1 ].a = base >> 32;
+    gdt[ idx +1 ].b = 0;
+}
+
+void msrs_init(struct xen_cpu *cpu)
+{
+    printk(2, "%s: cpu %d\n", __FUNCTION__, cpu->id);
+
+    /* syscall setup */
+    wrmsrl(MSR_STAR, (((uint64_t)0xe023 << 48) |
+                      ((uint64_t)0xe008 << 32)));
+    wrmsrl(MSR_LSTAR, (uintptr_t)STACK_PTR(cpu, trampoline_start));
+    wrmsrl(MSR_SYSCALL_MASK, X86_EFLAGS_VM | X86_EFLAGS_RF |
+           X86_EFLAGS_NT | X86_EFLAGS_DF | X86_EFLAGS_IF | X86_EFLAGS_TF);
+}
+
+void idt_init(void)
+{
+    intptr_t entry;
+    int i, len;
+
+    printk(2, "%s\n", __FUNCTION__);
+
+    len = (irq_common - irq_entries) / 256;
+    for (i = 0; i < 256; i++) {
+        entry = (intptr_t)(irq_entries + i*len);
+        xen_idt[i]  = mkgate64(0xe008, (uintptr_t)irq_entries + i*len, 0x8e, 1);
+    }
+
+    xen_idt[    0 ] = mkgate64(0xe008, (uintptr_t)division_by_zero,    0x8e, 0);
+    xen_idt[    1 ] = mkgate64(0xe008, (uintptr_t)debug_int1,          0x8e, 0);
+    xen_idt[    2 ] = mkgate64(0xe008, (uintptr_t)nmi,                 0x8e, 0);
+    xen_idt[    3 ] = mkgate64(0xe008, (uintptr_t)debug_int3,          0xee, 0);
+    xen_idt[    4 ] = mkgate64(0xe008, (uintptr_t)overflow,            0xee, 0);
+    xen_idt[    5 ] = mkgate64(0xe008, (uintptr_t)bound_check,         0x8e, 0);
+    xen_idt[    6 ] = mkgate64(0xe008, (uintptr_t)illegal_instruction, 0x8e, 0);
+    xen_idt[    7 ] = mkgate64(0xe008, (uintptr_t)no_device,           0x8e, 0);
+    xen_idt[    8 ] = mkgate64(0xe008, (uintptr_t)double_fault,        0x8e, 0);
+    xen_idt[    9 ] = mkgate64(0xe008, (uintptr_t)coprocessor,         0x8e, 0);
+    xen_idt[   10 ] = mkgate64(0xe008, (uintptr_t)invalid_tss,         0x8e, 0);
+    xen_idt[   11 ] = mkgate64(0xe008, (uintptr_t)segment_not_present, 0x8e, 0);
+    xen_idt[   12 ] = mkgate64(0xe008, (uintptr_t)stack_fault,         0x8e, 0);
+    xen_idt[   13 ] = mkgate64(0xe008, (uintptr_t)general_protection,  0x8e, 0);
+    xen_idt[   14 ] = mkgate64(0xe008, (uintptr_t)page_fault,          0x8e, 0);
+    xen_idt[   16 ] = mkgate64(0xe008, (uintptr_t)floating_point,      0x8e, 0);
+    xen_idt[   17 ] = mkgate64(0xe008, (uintptr_t)alignment,           0x8e, 0);
+    xen_idt[   18 ] = mkgate64(0xe008, (uintptr_t)machine_check,       0x8e, 0);
+    xen_idt[   19 ] = mkgate64(0xe008, (uintptr_t)simd_floating_point, 0x8e, 0);
+
+    xen_idt[ VECTOR_FLUSH_TLB  ] =
+        mkgate64(0xe008, (uintptr_t)smp_flush_tlb, 0x8e, 1);
+
+    xen_idt[ 0x80 ] = mkgate64(0xe008, (uintptr_t)int_80,              0xee, 0);
+}
+
+void guest_regs_init(struct xen_cpu *cpu, struct regs_64 *regs)
+{
+    struct vcpu_guest_context *ctxt = cpu->init_ctxt;
+
+    cpu->kernel_ss = ctxt->kernel_ss;
+    cpu->kernel_sp = ctxt->kernel_sp;
+
+    regs->rax    = ctxt->user_regs.rax;
+    regs->rbx    = ctxt->user_regs.rbx;
+    regs->rcx    = ctxt->user_regs.rcx;
+    regs->rdx    = ctxt->user_regs.rdx;
+    regs->rsi    = ctxt->user_regs.rsi;
+    regs->rdi    = ctxt->user_regs.rdi;
+    regs->rbp    = ctxt->user_regs.rbp;
+    regs->rip    = ctxt->user_regs.rip;
+    regs->cs     = ctxt->user_regs.cs;
+    regs->rflags = ctxt->user_regs.rflags;
+    regs->rsp    = ctxt->user_regs.rsp;
+    regs->ss     = ctxt->user_regs.ss;
+
+    asm volatile("mov %0, %%ds;\n" :: "r" (ctxt->user_regs.ds) : "memory");
+    asm volatile("mov %0, %%es;\n" :: "r" (ctxt->user_regs.es) : "memory");
+    asm volatile("mov %0, %%fs;\n" :: "r" (ctxt->user_regs.fs) : "memory");
+    asm volatile("mov %0, %%gs;\n" :: "r" (ctxt->user_regs.gs) : "memory");
+}
+
+static void set_up_shared_info(void)
+{
+    int i;
+
+    memset(&shared_info, 0, sizeof(shared_info));
+    for ( i = 0; i < XEN_LEGACY_MAX_VCPUS; i++ ) {
+        shared_info.vcpu_info[i].evtchn_upcall_mask = 1;
+    }
+}
+
+static void set_up_context(void *_ctxt, unsigned long boot_cr3,
+                           unsigned long init_pt_len)
+{
+    vcpu_guest_context_t *ctxt = _ctxt;
+    uint64_t virt_base = emudev_get(EMUDEV_CONF_PV_VIRT_BASE, 0);
+    uint64_t virt_entry = emudev_get(EMUDEV_CONF_PV_VIRT_ENTRY, 0);
+    uint64_t boot_stack_pfn = emudev_get(EMUDEV_CONF_PFN_INIT_PT, 0) +
+                              addr_to_frame(init_pt_len + PAGE_SIZE - 1);
+    uint64_t start_info_pfn = emudev_get(EMUDEV_CONF_PFN_START_INFO, 0);
+
+    set_up_shared_info();
+
+    /* clear everything */
+    memset(ctxt, 0, sizeof(*ctxt));
+
+    ctxt->user_regs.ds = FLAT_KERNEL_DS_X86_64;
+    ctxt->user_regs.es = FLAT_KERNEL_DS_X86_64;
+    ctxt->user_regs.fs = FLAT_KERNEL_DS_X86_64;
+    ctxt->user_regs.gs = FLAT_KERNEL_DS_X86_64;
+    ctxt->user_regs.ss = FLAT_KERNEL_SS_X86_64;
+    ctxt->user_regs.cs = FLAT_KERNEL_CS_X86_64;
+    ctxt->user_regs.rip = virt_entry;
+    ctxt->user_regs.rsp = virt_base | ((boot_stack_pfn + 1) << PAGE_SHIFT);
+    ctxt->user_regs.rsi = virt_base | (start_info_pfn << PAGE_SHIFT);
+    ctxt->user_regs.rflags = 1 << 9; /* Interrupt Enable */
+
+    ctxt->kernel_ss = ctxt->user_regs.ss;
+    ctxt->kernel_sp = ctxt->user_regs.rsp;
+
+    ctxt->flags = VGCF_in_kernel_X86_64 | VGCF_online_X86_64;
+    ctxt->ctrlreg[3] = boot_cr3;
+}
+
+static void guest_hypercall_page(struct xen_cpu *cpu)
+{
+    uint64_t _hypercall_page = emudev_get(EMUDEV_CONF_HYPERCALL_PAGE, 0);
+    char *hypercall_page = (char*)_hypercall_page;
+
+    char *p;
+    int i;
+
+    /* Fill in all the transfer points with template machine code. */
+    for ( i = 0; i < (PAGE_SIZE / 32); i++ ) {
+        p = (char *)(hypercall_page + (i * 32));
+        *(uint8_t  *)(p+ 0) = 0x51;    /* push %rcx */
+        *(uint16_t *)(p+ 1) = 0x5341;  /* push %r11 */
+        *(uint8_t  *)(p+ 3) = 0xb8;    /* mov  $<i>,%eax */
+        *(uint32_t *)(p+ 4) = i;
+        *(uint16_t *)(p+ 8) = 0x050f;  /* syscall */
+        *(uint16_t *)(p+10) = 0x5b41;  /* pop  %r11 */
+        *(uint8_t  *)(p+12) = 0x59;    /* pop  %rcx */
+        *(uint8_t  *)(p+13) = 0xc3;    /* ret */
+    }
+
+    /*
+     * HYPERVISOR_iret is special because it doesn't return and expects a
+     * special stack frame. Guests jump at this transfer point instead of
+     * calling it.
+     */
+    p = (char *)(hypercall_page + (__HYPERVISOR_iret * 32));
+    *(uint8_t  *)(p+ 0) = 0x51;    /* push %rcx */
+    *(uint16_t *)(p+ 1) = 0x5341;  /* push %r11 */
+    *(uint8_t  *)(p+ 3) = 0x50;    /* push %rax */
+    *(uint8_t  *)(p+ 4) = 0xb8;    /* mov  $__HYPERVISOR_iret,%eax */
+    *(uint32_t *)(p+ 5) = __HYPERVISOR_iret;
+    *(uint16_t *)(p+ 9) = 0x050f;  /* syscall */
+
+}
+
+/* --------------------------------------------------------------------- */
+/* called from assembler                                                 */
+
+asmlinkage void do_page_fault(struct regs_64 *regs)
+{
+    struct xen_cpu *cpu = get_cpu();
+    uint64_t cr2 = read_cr2();
+
+    vminfo.faults[XEN_FAULT_PAGE_FAULT]++;
+
+    if (context_is_emu(regs)) {
+        if (fixup_extable(regs)) {
+            return;
+        }
+        print_page_fault_info(0, cpu, regs, cr2);
+        pgtable_walk(0, cr2, read_cr3_mfn(cpu));
+        panic("ring0 (emu) page fault", regs);
+    }
+
+    /* fixup error code for kernel faults */
+    if (context_is_kernel(cpu, regs)) {
+        regs->error &= ~0x04;
+    }
+
+    if (wrpt && regs->error == 3) {
+        /* kernel write to r/o page */
+        if (!cpu->user_cr3_mfn || !pgtable_is_present(cr2, cpu->user_cr3_mfn)) {
+            /* is kernel page -> rw fixup for page tables */
+            if (pgtable_fixup_flag(cpu, cr2, _PAGE_RW) > 0) {
+                vminfo.faults[XEN_FAULT_PAGE_FAULT_FIX_RO]++;
+                return;
+            }
+        }
+    }
+
+    if (regs->error & 0x01) {
+        /* present */
+        if (pgtable_fixup_flag(cpu, cr2, _PAGE_USER) > 0) {
+            vminfo.faults[XEN_FAULT_PAGE_FAULT_FIX_USER]++;
+            return;
+        }
+    }
+
+    vminfo.faults[XEN_FAULT_PAGE_FAULT_GUEST]++;
+    bounce_trap(cpu, regs, 14, -1);
+}
+
+asmlinkage void do_int_80(struct regs_64 *regs)
+{
+    struct xen_cpu *cpu = get_cpu();
+
+    vminfo.faults[XEN_FAULT_INT_80]++;
+    bounce_trap(cpu, regs, 0x80, -1);
+}
-- 
1.6.0.2
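
Note on do_page_fault() in the patch above: the handler first clears the
user bit of the error code for faults taken from the guest kernel and only
then tests for "error == 3", that is, a write to a present page. The sketch
below models just that decision. The PF_* names and helper functions are
hypothetical, and the reading that the guest kernel's faults arrive with the
user bit set is my interpretation of the "fixup error code for kernel
faults" comment, not something the patch states explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* x86 page-fault error code bits */
#define PF_PRESENT 0x01u   /* fault on a present page (protection) */
#define PF_WRITE   0x02u   /* access was a write */
#define PF_USER    0x04u   /* access came from user privilege */
#define PF_RSVD    0x08u   /* reserved bit set in a page table entry */
#define PF_INSTR   0x10u   /* instruction fetch */

/* Drop the user bit for faults that came from the guest kernel. */
static uint32_t pf_normalize(uint32_t error, int from_guest_kernel)
{
    return from_guest_kernel ? (error & ~PF_USER) : error;
}

/* The "kernel write to r/o page" test: present + write and nothing else. */
static int pf_is_kernel_write_to_ro(uint32_t error)
{
    return error == (PF_PRESENT | PF_WRITE);
}

static void check_pf_examples(void)
{
    /* write to a present page, reported with the user bit set */
    uint32_t raw = PF_PRESENT | PF_WRITE | PF_USER;

    assert(!pf_is_kernel_write_to_ro(raw));
    assert(pf_is_kernel_write_to_ro(pf_normalize(raw, 1)));
    /* the same fault from guest user space stays a user fault */
    assert(!pf_is_kernel_write_to_ro(pf_normalize(raw, 0)));
}
```

Without the normalization step the read-only fixup path would never trigger
for such a fault, because the raw error code would be 7 rather than 3.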

* [Qemu-devel] [PATCH 18/40] xenner: kernel: Main
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (16 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 17/40] xenner: kernel: Main (x86_64) Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 19/40] xenner: kernel: Makefile Alexander Graf
                   ` (21 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds the platform-agnostic piece of xenner's main loop.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
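Note on fixup_extable(): the exception table is walked as a flat array of
address pairs, where the first word is an instruction address that may fault
and the second is where execution should continue. The sketch below shows
the same lookup on an ordinary array. The function name and signature are
hypothetical; the patch works on the linker-provided _estart/_estop symbols
and on regs->rip directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Search [start, stop) for a pair whose first word equals *ip.
 * On a hit, rewrite *ip to the fixup address and return 1, else return 0.
 */
static int fixup_lookup(const uintptr_t *start, const uintptr_t *stop,
                        uintptr_t *ip)
{
    const uintptr_t *ptr;

    assert(start != NULL && stop != NULL && ip != NULL);

    for (ptr = start; ptr < stop; ptr += 2) {
        if (ptr[0] == *ip) {
            *ip = ptr[1];
            return 1;
        }
    }
    return 0;
}
```

A fault address that is not in the table leaves *ip untouched, which is the
case where the caller goes on to panic.
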
 pc-bios/xenner/xenner-main.c |  875 ++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 875 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-main.c

diff --git a/pc-bios/xenner/xenner-main.c b/pc-bios/xenner/xenner-main.c
new file mode 100644
index 0000000..c63f447
--- /dev/null
+++ b/pc-bios/xenner/xenner-main.c
@@ -0,0 +1,875 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner generic main functions
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "config-host.h"
+
+static void set_up_context(void *ctxt, unsigned long boot_cr3,
+                           unsigned long init_pt_len);
+static void guest_hypercall_page(struct xen_cpu *cpu);
+
+void *memset(void *s, int c, size_t n)
+{
+    uint8_t *p = s;
+    size_t i;
+
+    for (i = 0; i < n; i++) {
+        p[i] = c;
+    }
+    return s;
+}
+
+void *memcpy(void *dest, const void *src, size_t n)
+{
+    const uint8_t *s = src;
+    uint8_t *d = dest;
+    size_t i;
+
+    for (i = 0; i < n; i++) {
+        d[i] = s[i];
+    }
+    return dest;
+}
+
+int memcmp(const void *s1, const void *s2, size_t n)
+{
+    const uint8_t *a = s1;
+    const uint8_t *b = s2;
+    size_t i;
+
+    for (i = 0; i < n; i++) {
+        if (a[i] == b[i]) {
+            continue;
+        }
+        if (a[i] < b[i]) {
+            return -1;
+        }
+        return 1;
+    }
+    return 0;
+}
+
+/* --------------------------------------------------------------------- */
+
+static void print_gpf_info(int level, struct xen_cpu *cpu, struct regs *regs)
+{
+    uint8_t *code = (void*)regs->rip;
+
+    printk(level, "%s: vcpu %d, index 0x%x%s%s%s, "
+           "rflags %" PRIxREG ", cs:rip %" PRIxREG ":%" PRIxREG " "
+           "-> 0x%02x, 0x%02x, 0x%02x, 0x%02x,  0x%02x, 0x%02x, 0x%02x, 0x%02x\n",
+           __FUNCTION__, cpu->id, (int)(regs->error >> 3),
+           (regs->error & 0x04) ? ", TI"  : "",
+           (regs->error & 0x02) ? ", IDT" : "",
+           (regs->error & 0x01) ? ", EXT" : "",
+           regs->rflags, regs->cs, regs->rip,
+           code[0], code[1], code[2], code[3],
+           code[4], code[5], code[6], code[7]);
+}
+
+static void print_page_fault_info(int level, struct xen_cpu *cpu, struct regs *regs, ureg_t cr2)
+{
+    printk(level, "%s:%s%s%s%s%s%s, rip %" PRIxREG ", cr2 %" PRIxREG ", vcpu %d\n",
+           __FUNCTION__,
+#ifdef CONFIG_64BIT
+           is_kernel(cpu) ? " [kernel-mode]" : " [user-mode]",
+#else
+           "",
+#endif
+           regs->error & 0x01 ? " present" : " nopage",
+           regs->error & 0x02 ? " write"   : " read",
+           regs->error & 0x04 ? " user"    : " kernel",
+           regs->error & 0x08 ? " reserved-bit"  : "",
+           regs->error & 0x10 ? " instr-fetch"   : "",
+           regs->rip, cr2, cpu->id);
+}
+
+static int fixup_extable(struct regs *regs)
+{
+    uintptr_t *ptr;
+
+    for (ptr = _estart; ptr < _estop; ptr += 2) {
+        if (ptr[0] != regs->rip) {
+            continue;
+        }
+        printk(2, "fixup: %" PRIxPTR " -> %" PRIxPTR "\n", ptr[0], ptr[1]);
+        regs->rip = ptr[1];
+        vminfo.faults[XEN_FAULT_PAGE_FAULT_FIX_EXTAB]++;
+        return 1;
+    }
+    return 0;
+}
+
+int panic(const char *message, struct regs *regs)
+{
+    printk(0, "panic: %s\n", message);
+    if (regs) {
+        print_state(regs);
+    }
+    emudev_cmd(EMUDEV_CMD_GUEST_SHUTDOWN, -1);
+    return 0;
+}
+
+/* --------------------------------------------------------------------- */
+
+#ifdef CONFIG_64BIT
+# define DR "%016" PRIxREG
+# define DC "%08"  PRIxREG
+# define DS "%04"  PRIxREG
+#else
+# define DR "%08"  PRIxREG
+# define DC "%08"  PRIxREG
+# define DS "%04"  PRIxREG
+#endif
+
+void print_registers(int level, struct regs *regs)
+{
+    ureg_t ds,es,fs,gs,cr0,cr2,cr3,cr4;
+
+    asm volatile("mov %%ds, %[ds]  \n"
+                 "mov %%es, %[es]  \n"
+                 "mov %%fs, %[fs]  \n"
+                 "mov %%gs, %[gs]  \n"
+                 : [ds] "=r" (ds),
+                   [es] "=r" (es),
+                   [fs] "=r" (fs),
+                   [gs] "=r" (gs)
+                 : /* no inputs */);
+    asm volatile("mov %%cr0, %[cr0]  \n"
+                 "mov %%cr2, %[cr2]  \n"
+                 "mov %%cr3, %[cr3]  \n"
+                 "mov %%cr4, %[cr4]  \n"
+                 : [cr0] "=r" (cr0),
+                   [cr2] "=r" (cr2),
+                   [cr3] "=r" (cr3),
+                   [cr4] "=r" (cr4)
+                 : /* no inputs */);
+
+    printk(level, "printing registers\n");
+    printk(level, "  code   cs:rip " DS ":" DR "\n", regs->cs, regs->rip);
+    printk(level, "  stack  ss:rsp " DS ":" DR "\n", regs->ss, regs->rsp);
+    printk(level, "  rax " DR " rbx " DR " rcx " DR " rdx " DR "\n",
+           regs->rax, regs->rbx, regs->rcx, regs->rdx);
+    printk(level, "  rsi " DR " rdi " DR " rsp " DR " rbp " DR "\n",
+           regs->rsi, regs->rdi, regs->rsp, regs->rbp);
+#ifdef CONFIG_64BIT
+    printk(level, "  r8  " DR " r9  " DR " r10 " DR " r11 " DR "\n",
+           regs->r8, regs->r9, regs->r10, regs->r11);
+    printk(level, "  r12 " DR " r13 " DR " r14 " DR " r15 " DR "\n",
+           regs->r12, regs->r13, regs->r14, regs->r15);
+#endif
+    printk(level, "  cs " DS " ds " DS " es " DS " fs " DS " gs " DS " ss " DS "\n",
+           regs->cs, ds, es, fs, gs, regs->ss);
+    printk(level, "  cr0 " DC " cr2 " DC " cr3 " DC " cr4 " DC " rflags " DC "\n",
+           cr0, cr2, cr3, cr4, regs->rflags);
+    print_bits(level, "  cr0", cr0, cr0, cr0_bits);
+    print_bits(level, "  cr4", cr4, cr4, cr4_bits);
+    print_bits(level, "  rflags", regs->rflags, regs->rflags, rflags_bits);
+
+}
+
+void print_stack(int level, ureg_t rsp)
+{
+    ureg_t max;
+
+    max = ((rsp + PAGE_SIZE) & PAGE_MASK) - sizeof(ureg_t);
+    printk(level, "printing stack " DR " - " DR "\n", rsp, max);
+    while (rsp <= max) {
+        printk(level, "  " DR ": " DR "\n", rsp, *((ureg_t*)rsp));
+        rsp += sizeof(ureg_t);
+    }
+}
+
+void print_state(struct regs *regs)
+{
+    print_registers(0, regs);
+    print_stack(0, regs->rsp);
+}
+
+#undef DR
+
+/* --------------------------------------------------------------------- */
+
+static struct descriptor_32 mkdesc(const struct kvm_segment *seg)
+{
+    struct descriptor_32 desc;
+    int shift = 0;
+
+    shift  = seg->g ? 12 : 0;
+    desc.a = (seg->limit >> shift) & 0xffff;
+    desc.b = (seg->limit >> shift) & 0x000f0000;
+
+    desc.a |= (seg->base & 0xffff) << 16;
+    desc.b |= seg->base & 0xff000000;
+    desc.b |= (seg->base & 0xff0000) >> 16;
+    desc.b |= (seg->type & 0x0f) << 8;
+    desc.b |= (seg->dpl & 0x03) << 13;
+
+    if (seg->s)       desc.b |= (1 << 12);
+    if (seg->present) desc.b |= (1 << 15);
+    if (seg->avl)     desc.b |= (1 << 20);
+    if (seg->l)       desc.b |= (1 << 21);
+    if (seg->db)      desc.b |= (1 << 22);
+    if (seg->g)       desc.b |= (1 << 23);
+
+    return desc;
+}
+
+static inline void gdt_set(struct descriptor_32 *gdt, const struct kvm_segment *seg)
+{
+    gdt[ seg->selector >> 3 ] = mkdesc(seg);
+}
+
+static void cr_init(struct xen_cpu *cpu)
+{
+    ureg_t cr0, cr4;
+
+    printk(2, "%s: cpu %d\n", __FUNCTION__, cpu->id);
+
+    cr0  = read_cr0();
+    cr0 |= X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | X86_CR0_NE | \
+        X86_CR0_WP | X86_CR0_AM | X86_CR0_PG;
+    cr0 &= ~(X86_CR0_TS|X86_CR0_CD|X86_CR0_NW);
+    print_bits(2, "cr0", read_cr0(), cr0, cr0_bits);
+    write_cr0(cr0);
+
+    cr4  = read_cr4();
+    cr4 |= X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT;
+    print_bits(2, "cr4", read_cr4(), cr4, cr4_bits);
+    write_cr4(cr4);
+}
+
+static void stack_init(struct xen_cpu *cpu)
+{
+    uintptr_t *ptr;
+    int pages;
+
+    if (cpu->stack_low) {
+        return;
+    }
+
+    /* allocate stack */
+    pages = (boot_stack_high - boot_stack_low + PAGE_SIZE -1) / PAGE_SIZE;
+    cpu->stack_low  = get_pages(pages, "stack");
+    cpu->stack_high = cpu->stack_low + pages * PAGE_SIZE;
+
+    /* set per-cpu data pointer */
+    ptr = STACK_PTR(cpu, cpu_ptr);
+    *ptr = (uintptr_t)cpu;
+
+    /* set per-cpu data pointer for boot stack */
+    if (!cpu->id) {
+        ptr = (void*)(&cpu_ptr);
+        *ptr = (uintptr_t)cpu;
+    }
+
+#ifdef CONFIG_64BIT
+    /* copy and setup syscall trampoline from boot stack */
+    memcpy(STACK_PTR(cpu, trampoline_start),
+           trampoline_start, trampoline_stop - trampoline_start);
+    ptr = STACK_PTR(cpu, trampoline_patch);
+    *ptr = (uintptr_t)trampoline_syscall;
+
+    /* allocate irq stack */
+    cpu->irqstack_low  = get_pages(pages, "irqstack");
+    cpu->irqstack_high = cpu->irqstack_low + PAGE_SIZE;
+
+    /* set per-cpu data pointer */
+    ptr = IRQSTACK_PTR(cpu, cpu_ptr);
+    *ptr = (uintptr_t)cpu;
+#endif
+}
+
+void gdt_load(struct xen_cpu *cpu)
+{
+    struct {
+        uint16_t  len;
+        uintptr_t ptr;
+    } __attribute__((packed)) gdtp = {
+        .len = (16 * PAGE_SIZE)-1,
+        .ptr = (uintptr_t)cpu->gdt,
+    };
+
+    asm volatile("lgdt %0" : : "m" (gdtp) : "memory");
+}
+
+void idt_load(void)
+{
+    struct {
+        uint16_t  len;
+        uintptr_t ptr;
+    } __attribute__((packed)) idtp = {
+        .len = sizeof(xen_idt)-1,
+        .ptr = (uintptr_t)xen_idt,
+    };
+
+    asm volatile("lidt %0" : : "m" (idtp) : "memory");
+}
+
+void guest_cpu_init(struct xen_cpu *cpu)
+{
+    struct vcpu_guest_context *ctxt = cpu->init_ctxt;
+    ureg_t mfns[16];
+    int i;
+
+    if (ctxt->gdt_ents) {
+        for (i = 0; i < 16; i++) {
+            mfns[i] = ctxt->gdt_frames[i];
+        }
+        guest_gdt_init(cpu, ctxt->gdt_ents, mfns);
+    }
+
+    ctxt->kernel_ss    = fix_sel(ctxt->kernel_ss);
+    ctxt->user_regs.cs = fix_sel(ctxt->user_regs.cs);
+    ctxt->user_regs.ds = fix_sel(ctxt->user_regs.ds);
+    ctxt->user_regs.es = fix_sel(ctxt->user_regs.es);
+    ctxt->user_regs.fs = fix_sel(ctxt->user_regs.fs);
+    ctxt->user_regs.gs = fix_sel(ctxt->user_regs.gs);
+    ctxt->user_regs.ss = fix_sel(ctxt->user_regs.ss);
+
+    cpu->kernel_ss = ctxt->kernel_ss;
+    cpu->kernel_sp = ctxt->kernel_sp;
+}
+
+static uint64_t maddr_to_paddr(uint64_t _maddr)
+{
+    unsigned long virt_base = emudev_get(EMUDEV_CONF_PV_VIRT_BASE, 0);
+    uint64_t maddr = _maddr;
+    uint64_t mfn = addr_to_frame(maddr);
+
+    /* M2P */
+    if ((mfn >= vmconf.mfn_m2p) && (mfn < (vmconf.mfn_m2p + vmconf.pg_m2p))) {
+        return XEN_M2P + maddr - frame_to_addr(vmconf.mfn_m2p);
+    }
+
+    /* xenner */
+    if (maddr < frame_to_addr(vmconf.mfn_guest)) {
+        return (uintptr_t)_vstart + maddr;
+    }
+
+    /* guest */
+    maddr -= frame_to_addr(vmconf.mfn_guest);
+    maddr += virt_base;
+
+    return maddr;
+}
+
+static void *pfn_to_ptr(xen_pfn_t pfn)
+{
+    unsigned long addr = frame_to_addr(pfn);
+
+    addr += frame_to_addr(vmconf.mfn_guest);
+    return map_page(addr);
+}
+
+static void guest_start_info(struct xen_cpu *cpu, struct regs *regs,
+                             unsigned long init_pt_len, unsigned long boot_cr3)
+{
+    struct start_info *start_info;
+    uint64_t i;
+    uint64_t virt_base = emudev_get(EMUDEV_CONF_PV_VIRT_BASE, 0);
+    uint64_t initrd_len;
+    uint64_t cmdline_pfn = emudev_get(EMUDEV_CONF_PFN_CMDLINE, 0);
+    unsigned long *mfn_list;
+    uint64_t mfn_list_pfn = emudev_get(EMUDEV_CONF_PFN_MFN_LIST, 0);
+    char cap_ver[] = CAP_VERSION_STRING;
+    char *cmdline = NULL;
+
+    start_info = pfn_to_ptr(emudev_get(EMUDEV_CONF_PFN_START_INFO, 0));
+
+    printk(1, "%s: called\n", __FUNCTION__);
+
+    memset(start_info, 0, sizeof(*start_info));
+    memcpy(start_info->magic, cap_ver, sizeof(cap_ver));
+    start_info->magic[sizeof(start_info->magic) - 1] = '\0';
+
+    start_info->shared_info = EMU_PA(&shared_info);
+    start_info->pt_base = maddr_to_paddr(boot_cr3);
+    start_info->nr_pt_frames = addr_to_frame(init_pt_len + (PAGE_SIZE - 1));
+    start_info->shared_info = (unsigned long)EMU_PA(&shared_info);
+    start_info->nr_pages = emudev_get(EMUDEV_CONF_GUEST_PAGE_COUNT, 0);
+    start_info->store_mfn = emudev_get(EMUDEV_CONF_MFN_XENSTORE, 0);
+    start_info->store_evtchn = emudev_get(EMUDEV_CONF_EVTCH_XENSTORE, 0);
+    start_info->console.domU.mfn = emudev_get(EMUDEV_CONF_MFN_CONSOLE, 0);
+    start_info->console.domU.evtchn = emudev_get(EMUDEV_CONF_EVTCH_CONSOLE, 0);
+
+    initrd_len = emudev_get(EMUDEV_CONF_INITRD_LEN, 0);
+    if (initrd_len) {
+        start_info->mod_start = virt_base +
+            frame_to_addr(emudev_get(EMUDEV_CONF_PFN_INITRD, 0));
+        start_info->mod_len = initrd_len;
+    }
+
+    if (cmdline_pfn) {
+        cmdline = pfn_to_ptr(cmdline_pfn);
+
+        memcpy(start_info->cmd_line, cmdline,
+               MAX_GUEST_CMDLINE);
+        printk(1, "guest cmdline: %s\n", start_info->cmd_line);
+    }
+
+    /* set up m2p page table */
+    for (i = 0; i < vmconf.pg_total; i++) {
+        m2p[i + vmconf.mfn_guest] = i;
+    }
+
+    /* fill mfn list */
+    start_info->mfn_list = virt_base + frame_to_addr(mfn_list_pfn);
+    mfn_list = (void*)start_info->mfn_list;
+
+    for (i = 0; i < start_info->nr_pages; i++) {
+        mfn_list[i] = i + vmconf.mfn_guest;
+    }
+
+    regs->rsi = (unsigned long)start_info;
+
+    free_page(start_info);
+    if (cmdline) {
+        free_page(cmdline);
+    }
+}
+
+static void cpu_set_cr3(struct xen_cpu *cpu, unsigned long boot_cr3)
+{
+#ifdef CONFIG_64BIT
+    cpu->user_mode = 0;
+    cpu->kernel_cr3_mfn = addr_to_frame(boot_cr3);
+#else
+    cpu->cr3_mfn = addr_to_frame(boot_cr3);
+#endif
+}
+
+static uint64_t count_pgtables(uint64_t max_pfn)
+{
+    uint64_t r = max_pfn;
+    uint64_t fourmb = addr_to_frame(4 * 1024 * 1024);
+
+    /* XXX this should become a real calculation, for now assume we need max
+     *     200 page table pages */
+    r += 200;
+
+    /* pad to 4mb */
+    r = (r + fourmb - 1) & ~(fourmb - 1);
+
+    return r;
+}
+
+/*
+ * Maps the guest into its own virtual address space in its own page table and
+ * returns the length and maddr of that new page table
+ */
+static unsigned long map_guest(unsigned long *boot_cr3)
+{
+    uint64_t virt_base = emudev_get(EMUDEV_CONF_PV_VIRT_BASE, 0);
+    struct xen_cpu tmp_cpu;
+    uint64_t max_pfn = emudev_get(EMUDEV_CONF_PFN_INIT_PT, 0);
+    unsigned long init_pt_len;
+
+    max_pfn += count_pgtables(max_pfn);
+
+    /* create initial page table that maps the guest virt_base linearly
+       to host physical memory. This has to happen in guest visible mem */
+    switch_heap(HEAP_HIGH);
+
+    *boot_cr3 = (unsigned long)EMU_PA(get_pages(1, "pt root"));
+    cpu_set_cr3(&tmp_cpu, *boot_cr3);
+    printk(3, "init guest pt map mfn %lx len %lx\n", (unsigned long)vmconf.mfn_guest,
+           (unsigned long)max_pfn);
+
+    map_region(&tmp_cpu, virt_base, EMU_PGFLAGS, vmconf.mfn_guest, max_pfn);
+
+    /* save the pt len for start_info */
+    init_pt_len = heap_size();
+
+    switch_heap(HEAP_EMU);
+
+    return init_pt_len;
+}
+
+
+/* --------------------------------------------------------------------- */
+
+static struct xen_cpu *cpu_alloc(int id)
+{
+    struct xen_cpu *cpu;
+    ureg_t cr3;
+
+    printk(1, "%s: cpu %d\n", __FUNCTION__, id);
+
+    cpu = get_memory(sizeof(*cpu), "per-cpu data");
+    cpu->id = id;
+    cpu->periodic = XEN_DEFAULT_PERIOD;
+    cpu->v.vcpu_info = (void*)&shared_info.vcpu_info[id];
+    cpu->v.vcpu_info_pa = EMU_PA(cpu->v.vcpu_info);
+    guest_cli(cpu);
+    list_add_tail(&cpu->next, &cpus);
+
+    asm volatile("mov %%cr3,%0" : "=r" (cr3));
+    pv_write_cr3(cpu, addr_to_frame(cr3));
+
+    gdt_init(cpu);
+    stack_init(cpu);
+    tss_init(cpu);
+    return cpu;
+}
+
+struct xen_cpu *cpu_find(int id)
+{
+    struct list_head *item;
+    struct xen_cpu *cpu;
+
+    list_for_each(item, &cpus) {
+        cpu = list_entry(item, struct xen_cpu, next);
+        if (cpu->id == id) {
+            return cpu;
+        }
+    }
+    return cpu_alloc(id);
+}
+
+static void cpu_init(struct xen_cpu *cpu)
+{
+    printk(1, "%s: cpu %d\n", __FUNCTION__, cpu->id);
+
+    gdt_load(cpu);
+    ltr(tss(cpu) << 3);
+    idt_load();
+    cr_init(cpu);
+    msrs_init(cpu);
+    pv_init(cpu);
+
+    vminfo.vcpus_online  |= (1 << cpu->id);
+    vminfo.vcpus_running |= (1 << cpu->id);
+    vminfo.vcpus++;
+    cpu->online = 1;
+}
+
+static void userspace_config(void)
+{
+    uint32_t pfn;
+    int i;
+
+    /* read config */
+    vmconf.debug_level = emudev_get(EMUDEV_CONF_DEBUG_LEVEL, 0);
+    vmconf.mfn_emu     = emudev_get(EMUDEV_CONF_EMU_START_PFN, 0);
+    vmconf.pg_emu      = emudev_get(EMUDEV_CONF_EMU_PAGE_COUNT, 0);
+    vmconf.mfn_m2p     = emudev_get(EMUDEV_CONF_M2P_START_PFN, 0);
+    vmconf.pg_m2p      = emudev_get(EMUDEV_CONF_M2P_PAGE_COUNT, 0);
+    vmconf.mfn_guest   = emudev_get(EMUDEV_CONF_GUEST_START_PFN, 0);
+    vmconf.pg_guest    = emudev_get(EMUDEV_CONF_GUEST_PAGE_COUNT, 0);
+    vmconf.pg_total    = emudev_get(EMUDEV_CONF_TOTAL_PAGE_COUNT, 0);
+    vmconf.nr_cpus     = emudev_get(EMUDEV_CONF_NR_VCPUS, 0);
+
+    /* write config */
+    pfn = addr_to_frame(EMU_PA(&boot_ctxt));
+    emudev_set(EMUDEV_CONF_BOOT_CTXT_PFN, 0, pfn);
+    pfn = addr_to_frame(EMU_PA(&vminfo));
+    emudev_set(EMUDEV_CONF_VMINFO_PFN, 0, pfn);
+    pfn = addr_to_frame(EMU_PA(&grant_table));
+    for (i = 0; i < GRANT_FRAMES_MAX; i++)
+        emudev_set(EMUDEV_CONF_GRANT_TABLE_PFNS, i, pfn+i);
+
+    /* commands */
+    emudev_cmd(EMUDEV_CMD_CONFIGURATION_DONE, 0);
+}
+
+/* --------------------------------------------------------------------- */
+/* called from assembler                                                 */
+
+asmlinkage void do_boot(struct regs *regs)
+{
+    struct xen_cpu *cpu;
+    struct xen_cpu boot_cpu;
+    unsigned long init_pt_len, boot_cr3;
+
+    printk(0, "this is %s (qemu-xenner %s), boot cpu #0\n", EMUNAME,
+                QEMU_VERSION QEMU_PKGVERSION);
+
+    userspace_config();
+    printk(1, "%s: configuration done\n", EMUNAME);
+
+    cpu_set_cr3(&boot_cpu, EMU_PA(emu_pgd));
+    paging_init(&boot_cpu);
+    init_pt_len = map_guest(&boot_cr3);
+
+    set_up_context(&boot_ctxt, boot_cr3, init_pt_len);
+
+    cpu = cpu_alloc(0);
+    cpu->init_ctxt = &boot_ctxt;
+    idt_init();
+    cpu_init(cpu);
+    printk(1, "%s: boot cpu setup done\n", EMUNAME);
+
+#ifdef CONFIG_64BIT
+    paging_init(cpu);
+#endif
+    paging_start(cpu);
+    printk(1, "%s: paging setup done\n", EMUNAME);
+
+    irq_init(cpu);
+    printk(1, "%s: irq setup done\n", EMUNAME);
+
+    guest_cpu_init(cpu);
+    guest_regs_init(cpu, regs);
+    guest_start_info(cpu, regs, init_pt_len, boot_cr3);
+    guest_hypercall_page(cpu);
+    printk(1, "%s: booting guest kernel (entry %" PRIxREG ":%" PRIxREG ") ...\n",
+           EMUNAME, regs->cs, regs->rip);
+}
+
+asmlinkage void do_boot_secondary(ureg_t id, struct regs *regs)
+{
+    struct xen_cpu *cpu;
+
+    printk(0, "this is cpu #%d\n", (int)id);
+    cpu = cpu_find(id);
+    cpu_init(cpu);
+    paging_start(cpu);
+    irq_init(cpu);
+#if 0
+    if (cpu->virq_to_vector[VIRQ_TIMER])
+        lapic_timer(cpu);
+#endif
+
+    guest_cpu_init(cpu);
+    guest_regs_init(cpu, regs);
+
+    print_registers(2, regs);
+    printk(1, "%s: secondary entry: %" PRIxREG ":%" PRIxREG ", jumping ...\n",
+           EMUNAME, regs->cs, regs->rip);
+}
+
+asmlinkage void do_illegal_instruction(struct regs *regs)
+{
+    struct xen_cpu *cpu = get_cpu();
+    int skip;
+
+    vminfo.faults[XEN_FAULT_ILLEGAL_INSTRUCTION]++;
+    if (context_is_emu(regs)) {
+        panic("ring0 (emu) illegal instruction", regs);
+    }
+    if (context_is_user(cpu, regs)) {
+        uint8_t *i = (void*)regs->rip;
+        printk(1, "user ill: at %p"
+               "  0x%02x, 0x%02x, 0x%02x, 0x%02x,"
+               "  0x%02x, 0x%02x, 0x%02x, 0x%02x\n",
+               i, i[0], i[1], i[2], i[3], i[4], i[5], i[6], i[7]);
+        bounce_trap(cpu, regs, 6, -1);
+        return;
+    }
+
+    skip = emulate(cpu, regs);
+    switch (skip) {
+    case -1: /* error */
+        panic("instruction emulation failed (ill)", regs);
+        break;
+    case 0:  /* bounce to guest */
+        bounce_trap(cpu, regs, 6, -1);
+        break;
+    default: /* handled */
+        regs->rip += skip;
+        break;
+    }
+}
+
+static int is_allowed_io(struct xen_cpu *cpu, struct regs *regs)
+{
+    uint8_t *code = (void*)regs->rip;
+    int pl;
+
+#ifdef CONFIG_64BIT
+    pl = context_is_user(cpu, regs) ? 3 : 1;
+#else
+    pl = regs->cs & 0x03;
+#endif
+
+    switch (*code) {
+    case 0xe4 ... 0xe7:
+    case 0xec ... 0xef:
+        /* I/O instructions */
+        if (pl <= cpu->iopl)
+            return 1; /* yes: by iopl */
+        if (cpu->nr_ports)
+            return 1; /* yes: by bitmap (FIXME: check port) */
+        break;
+    case 0xfa:
+    case 0xfb:
+        /* cli, sti */
+        if (pl <= cpu->iopl)
+            return 1; /* yes: by iopl */
+    }
+    return 0; /* no */
+}
+
+asmlinkage void do_general_protection(struct regs *regs)
+{
+    struct xen_cpu *cpu = get_cpu();
+    int skip;
+
+    vminfo.faults[XEN_FAULT_GENERAL_PROTECTION]++;
+    if (context_is_emu(regs)) {
+        if (fixup_extable(regs)) {
+            return;
+        }
+        print_gpf_info(0, cpu, regs);
+        panic("ring0 (emu) general protection fault", regs);
+    }
+    if (is_allowed_io(cpu, regs)) {
+        goto emulate;
+    }
+    if (context_is_user(cpu, regs)) {
+        vminfo.faults[XEN_FAULT_GENERAL_PROTECTION_GUEST]++;
+        print_gpf_info(1, cpu, regs);
+        bounce_trap(cpu, regs, 13, -1);
+        return;
+    }
+
+    if (regs->error) {
+        print_gpf_info(0, cpu, regs);
+        panic("unhandled kernel gpf", regs);
+    }
+
+emulate:
+    skip = emulate(cpu, regs);
+    switch (skip) {
+    case -1: /* error */
+        print_gpf_info(0, cpu, regs);
+        panic("instruction emulation failed (gpf)", regs);
+        break;
+    case 0:  /* bounce to guest */
+        vminfo.faults[XEN_FAULT_GENERAL_PROTECTION_GUEST]++;
+        bounce_trap(cpu, regs, 13, -1);
+        break;
+    default: /* handled */
+        vminfo.faults[XEN_FAULT_GENERAL_PROTECTION_EMUINS]++;
+        regs->rip += skip;
+        evtchn_try_forward(cpu, regs); /* sti */
+        break;
+    }
+}
+
+asmlinkage void do_double_fault(struct regs *regs)
+{
+    panic("double fault", regs);
+}
+
+asmlinkage void do_guest_forward(struct regs *regs)
+{
+    struct xen_cpu *cpu = get_cpu();
+    const struct trapinfo *trap = NULL;
+
+    if (regs->trapno < sizeof(trapinfo)/sizeof(trapinfo[0])) {
+        trap = trapinfo + regs->trapno;
+    }
+    printk(trap ? trap->lvl : 0,
+           "%s: trap %d [%s], error 0x%" PRIxREG ","
+           " cs:rip %" PRIxREG ":%" PRIxREG ","
+           " forwarding to guest\n",
+           __FUNCTION__, (int)regs->trapno,
+           trap && trap->name ? trap->name : "-",
+           trap && trap->ec   ? regs->error : 0,
+           regs->cs, regs->rip);
+    bounce_trap(cpu, regs, regs->trapno, -1);
+}
+
+asmlinkage void do_lazy_fpu(struct regs *regs)
+{
+    struct xen_cpu *cpu = get_cpu();
+
+    vminfo.faults[XEN_FAULT_LAZY_FPU]++;
+    clts();
+    bounce_trap(cpu, regs, regs->trapno, -1);
+}
+
+asmlinkage void do_int1(struct regs *regs)
+{
+    if (context_is_emu(regs)) {
+        printk(0, "%s: emu context\n", __FUNCTION__);
+        print_registers(0, regs);
+        return;
+    }
+    do_guest_forward(regs);
+}
+
+asmlinkage void do_int3(struct regs *regs)
+{
+    if (context_is_emu(regs)) {
+        printk(0, "%s: emu context\n", __FUNCTION__);
+        print_registers(0, regs);
+        return;
+    }
+    do_guest_forward(regs);
+}
+
+/* --------------------------------------------------------------------- */
+
+static spinlock_t flush_lock = SPIN_LOCK_UNLOCKED;
+static atomic_t   flush_cnt;
+static ureg_t     flush_addr;
+
+asmlinkage void do_smp_flush_tlb(struct regs *regs)
+{
+    struct xen_cpu *cpu = get_cpu();
+
+    lapic_eoi(cpu);
+    if (flush_addr) {
+        flush_tlb_addr(flush_addr);
+    } else {
+        flush_tlb();
+    }
+    atomic_dec(&flush_cnt);
+}
+
+void flush_tlb_remote(struct xen_cpu *cpu, ureg_t mask, ureg_t addr)
+{
+    int cpus;
+
+    mask &= ~(1 << cpu->id);
+    if (!mask) {
+        vminfo.faults[XEN_FAULT_OTHER_FLUSH_TLB_NONE]++;
+        return;
+    }
+
+    /*
+     * we must be able to process ipi while waiting for the lock,
+     * otherwise we deadlock in case another cpu busy-waits for us
+     * doing the tlb flush.
+     */
+    sti();
+    spin_lock(&flush_lock);
+
+    cpus = vminfo.vcpus-1; /* FIXME: not using mask, sending to all */
+    flush_addr = addr;
+    if (flush_addr) {
+        vminfo.faults[XEN_FAULT_OTHER_FLUSH_TLB_PAGE]++;
+    } else {
+        vminfo.faults[XEN_FAULT_OTHER_FLUSH_TLB_ALL]++;
+    }
+
+    atomic_add(cpus, &flush_cnt);
+    lapic_ipi_flush_tlb(cpu);
+    while (atomic_read(&flush_cnt)) {
+        pause();
+    }
+
+    spin_unlock(&flush_lock);
+    cli();
+}
-- 
1.6.0.2
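
Note on mkdesc() in the patch above: it packs a segment description into
the two 32-bit words of a legacy x86 descriptor. The following is a minimal
standalone sketch of that packing, not the code from the patch. The names
seg_sketch, desc_sketch and mkdesc_sketch are hypothetical stand-ins for
struct kvm_segment, struct descriptor_32 and mkdesc(), reduced to the fields
the function reads.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct kvm_segment (only the fields used). */
struct seg_sketch {
    uint32_t base, limit;
    uint8_t  type, s, dpl, present, avl, l, db, g;
};

/* Hypothetical stand-in for struct descriptor_32. */
struct desc_sketch {
    uint32_t a;   /* limit 15:0 and base 15:0 */
    uint32_t b;   /* base 31:16, attribute bits, limit 19:16 */
};

struct desc_sketch mkdesc_sketch(const struct seg_sketch *seg)
{
    struct desc_sketch desc;
    /* with the granularity bit set, the limit is stored in 4k units */
    uint32_t limit = seg->limit >> (seg->g ? 12 : 0);

    desc.a  = limit & 0xffff;
    desc.a |= (seg->base & 0xffff) << 16;

    desc.b  = limit & 0x000f0000;
    desc.b |= seg->base & 0xff000000;
    desc.b |= (seg->base & 0x00ff0000) >> 16;
    desc.b |= (uint32_t)(seg->type & 0x0f) << 8;
    desc.b |= (uint32_t)(seg->dpl  & 0x03) << 13;

    if (seg->s)       desc.b |= 1u << 12;
    if (seg->present) desc.b |= 1u << 15;
    if (seg->avl)     desc.b |= 1u << 20;
    if (seg->l)       desc.b |= 1u << 21;
    if (seg->db)      desc.b |= 1u << 22;
    if (seg->g)       desc.b |= 1u << 23;

    return desc;
}
```

With base 0, limit 0xffffffff, type 0xb, s, present, db and g set, and dpl 0
(a flat ring-0 code segment), this yields a = 0x0000ffff and b = 0x00cf9b00.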

* [Qemu-devel] [PATCH 19/40] xenner: kernel: Makefile
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (17 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 18/40] xenner: kernel: Main Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 20/40] xenner: kernel: mmu support for 32-bit PAE Alexander Graf
                   ` (20 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds the Makefile to build the xenner kernel.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 pc-bios/xenner/Makefile |   72 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 72 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/Makefile
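
Note: the Makefile builds the same sources three times, selecting the
variant through -DCONFIG_32BIT, -DCONFIG_PAE and -DCONFIG_64BIT. The PAE
objects get both CONFIG_PAE and CONFIG_32BIT, so a source file that needs to
tell the three builds apart has to test CONFIG_PAE before CONFIG_32BIT. A
rough sketch of such a check follows; xenner_variant() and
xenner_pte_bytes() are hypothetical helpers for illustration and do not
exist in this series.

```c
/*
 * Sketch only: the XENNER_* macros and both helpers below are
 * hypothetical, they are not part of the patch.
 */
#if defined(CONFIG_64BIT)
# define XENNER_VARIANT   "xenner64"
# define XENNER_PTE_BYTES 8
#elif defined(CONFIG_PAE)      /* must come before the CONFIG_32BIT test */
# define XENNER_VARIANT   "xenner32-pae"
# define XENNER_PTE_BYTES 8
#elif defined(CONFIG_32BIT)
# define XENNER_VARIANT   "xenner32"
# define XENNER_PTE_BYTES 4
#else                          /* built without any of the -D flags */
# define XENNER_VARIANT   "unconfigured"
# define XENNER_PTE_BYTES 0
#endif

/* Name of the image this translation unit was compiled for. */
const char *xenner_variant(void)
{
    return XENNER_VARIANT;
}

/* Size of one page table entry in that paging mode, in bytes. */
int xenner_pte_bytes(void)
{
    return XENNER_PTE_BYTES;
}
```

Compiled with "-DCONFIG_PAE -DCONFIG_32BIT", as the %.opae rules do, the
CONFIG_PAE branch is taken; with the order of the two tests swapped, the PAE
build would silently be treated as plain 32-bit.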

diff --git a/pc-bios/xenner/Makefile b/pc-bios/xenner/Makefile
new file mode 100644
index 0000000..f768e0b
--- /dev/null
+++ b/pc-bios/xenner/Makefile
@@ -0,0 +1,72 @@
+all: build-all
+# Dummy command so that make thinks it has done something
+	@true
+
+include ../../config-host.mak
+include $(SRC_PATH)/rules.mak
+
+$(call set-vpath, $(SRC_PATH)/pc-bios/xenner)
+
+.PHONY : all clean build-all
+
+CFLAGS := -Wall -Wstrict-prototypes -Werror -fomit-frame-pointer -fno-builtin -g
+CFLAGS += -I$(SRC_PATH) -D__XEN_TOOLS__
+CFLAGS += $(call cc-option, $(CFLAGS), -fno-stack-protector)
+QXENNER_CFLAGS = $(CFLAGS)
+
+build-all: xenner32.elf xenner32-pae.elf xenner64.elf
+
+
+XENNERXX_OBJS := xenner-hcall.o xenner-data.o xenner-instr.o xenner-pv.o xenner-lapic.o \
+		printk.o xen-names.o
+XENNER32_OBJS := xenner32.o xenner-main32.o xenner-hcall32.o $(XENNERXX_OBJS)
+XENNER32_NOPAE_OBJS := $(patsubst %,%32,$(XENNER32_OBJS)) xenner-mm32.o
+XENNER32_PAE_OBJS := $(patsubst %,%pae,$(XENNER32_OBJS)) xenner-mmpae.o
+XENNER64_OBJS := xenner64.o xenner-main64.o xenner-hcall64.o xenner-mm64.o \
+		$(patsubst %,%64,$(XENNERXX_OBJS))
+
+xenner32.elf : CFLAGS  += -m32 -ffreestanding -DCONFIG_32BIT
+xenner32.elf : ASFLAGS += -m32 -DCONFIG_32BIT
+
+xenner32-pae.elf : CFLAGS  += -m32 -ffreestanding -DCONFIG_PAE -DCONFIG_32BIT
+xenner32-pae.elf : ASFLAGS += -m32 -DCONFIG_PAE -DCONFIG_32BIT
+
+xenner64.elf : CFLAGS  += -m64 -ffreestanding -fpic -mno-red-zone -DCONFIG_64BIT
+xenner64.elf : ASFLAGS += -m64 -DCONFIG_64BIT
+
+xenner32.elf: $(XENNER32_NOPAE_OBJS)
+xenner32-pae.elf: $(XENNER32_PAE_OBJS)
+xenner64.elf: $(XENNER64_OBJS)
+
+clean:
+	rm -f *.o32 *.o64 *.opae *.o *.d *.raw *.img *.bin *.elf *~
+
+
+##############################################################################
+
+%.o32: %.c
+	$(call quiet-command,$(CC) -m32 $(QXENNER_CFLAGS) $(QXENNER_DGFLAGS) \
+            $(CFLAGS) -c -o $@ $<,"  CC    $(TARGET_DIR)$@") -DCONFIG_32BIT
+
+%.o32: %.S
+	$(call quiet-command,$(CC) -m32 $(QXENNER_CFLAGS) $(QXENNER_DGFLAGS) \
+            $(CFLAGS) -c -o $@ $<,"  AS    $(TARGET_DIR)$@") -DCONFIG_32BIT
+
+%.opae: %.c
+	$(call quiet-command,$(CC) -m32 $(QXENNER_CFLAGS) $(QXENNER_DGFLAGS) \
+            $(CFLAGS) -c -o $@ $<,"  CC    $(TARGET_DIR)$@") -DCONFIG_PAE    \
+            -DCONFIG_32BIT
+
+%.opae: %.S
+	$(call quiet-command,$(CC) -m32 $(QXENNER_CFLAGS) $(QXENNER_DGFLAGS) \
+            $(CFLAGS) -c -o $@ $<,"  AS    $(TARGET_DIR)$@") -DCONFIG_PAE    \
+            -DCONFIG_32BIT
+
+%.o64: %.c
+	$(call quiet-command,$(CC) -m64 $(QXENNER_CFLAGS) $(QXENNER_DGFLAGS) \
+            $(CFLAGS) -c -o $@ $<,"  CC    $(TARGET_DIR)$@") -DCONFIG_64BIT
+
+%.elf:
+	$(CC) $(CFLAGS) -nostdlib -o $@ -Wl,-N -Wl,-T,$*.lds $^
+
+##############################################################################
-- 
1.6.0.2

* [Qemu-devel] [PATCH 20/40] xenner: kernel: mmu support for 32-bit PAE
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (18 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 19/40] xenner: kernel: Makefile Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 21/40] xenner: kernel: mmu support for 32-bit normal Alexander Graf
                   ` (19 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds memory management support for 32-bit systems with PAE.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 pc-bios/xenner/xenner-mmpae.c |  444 +++++++++++++++++++++++++++++++++++++++++
 1 files changed, 444 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-mmpae.c
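
Note: map_page() in this patch splits the temporary mapping slots into
MAPS_R_COUNT ranges, picks the range from the low bits of the mfn, and
hands out free slots within that range round-robin. The sketch below shows
that scheme in isolation. It is an illustration under assumptions: MAPS_MAX
is defined outside this patch, so 256 is a made-up value, and sketch_refcnt,
sketch_cursor, range_of() and find_slot_sketch() are hypothetical names.
Unlike find_slot(), the cursor here is kept as an offset inside the range,
so that zero-initialised state is a valid starting point.

```c
#include <stdint.h>

#define MAPS_MAX        256   /* assumption: the real value is not in this patch */
#define MAPS_R_BITS     4
#define MAPS_R_COUNT    (1 << MAPS_R_BITS)
#define MAPS_R_MASK     (MAPS_R_COUNT - 1)
#define MAPS_R_SIZE     (MAPS_MAX / MAPS_R_COUNT)
#define MAPS_R_LOW(r)   (MAPS_R_SIZE * (r))
#define MAPS_R_HIGH(r)  (MAPS_R_SIZE * (r) + MAPS_R_SIZE)

static int sketch_refcnt[MAPS_MAX];      /* users per slot, 0 means free */
static int sketch_cursor[MAPS_R_COUNT];  /* next offset to try, per range */

/* The low bits of the frame number select the range. */
int range_of(uint32_t mfn)
{
    return mfn & MAPS_R_MASK;
}

/* Return a free slot in [MAPS_R_LOW, MAPS_R_HIGH), or -1 if none is left. */
int find_slot_sketch(int range)
{
    int low = MAPS_R_LOW(range);
    int i;

    for (i = 0; i < MAPS_R_SIZE; i++) {
        int off = (sketch_cursor[range] + i) % MAPS_R_SIZE;

        if (sketch_refcnt[low + off] == 0) {
            /* continue after this slot on the next call */
            sketch_cursor[range] = (off + 1) % MAPS_R_SIZE;
            return low + off;
        }
    }
    return -1;
}
```

Since a given mfn always lands in the same range, a lookup for an already
mapped frame only has to scan MAPS_R_SIZE entries instead of all MAPS_MAX,
which is what mfn_to_slot_pae() relies on.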

diff --git a/pc-bios/xenner/xenner-mmpae.c b/pc-bios/xenner/xenner-mmpae.c
new file mode 100644
index 0000000..7c11732
--- /dev/null
+++ b/pc-bios/xenner/xenner-mmpae.c
@@ -0,0 +1,444 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner memory management for 32 bit pae mode
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <inttypes.h>
+#include <xen/xen.h>
+
+#include "xenner.h"
+#include "xenner-mm.c"
+
+/* --------------------------------------------------------------------- */
+
+#define MAPS_R_BITS        4
+#define MAPS_R_COUNT       (1 << MAPS_R_BITS)
+#define MAPS_R_MASK        (MAPS_R_COUNT - 1)
+#define MAPS_R_SIZE        (MAPS_MAX / MAPS_R_COUNT)
+#define MAPS_R_LOW(r)      (MAPS_R_SIZE * (r))
+#define MAPS_R_HIGH(r)     (MAPS_R_SIZE * (r) + MAPS_R_SIZE)
+static int maps_next[MAPS_R_COUNT];
+
+static spinlock_t maplock = SPIN_LOCK_UNLOCKED;
+
+/* --------------------------------------------------------------------- */
+
+uintptr_t emu_pa(uintptr_t va)
+{
+    switch(va & 0xff800000) {
+    case XEN_TXT:
+        return va - (uintptr_t)_vstart;
+    case XEN_IPT:
+    {
+        uintptr_t mfn_guest = emudev_get(EMUDEV_CONF_GUEST_START_PFN, 0);
+        uintptr_t init_pt_pfn = emudev_get(EMUDEV_CONF_PFN_INIT_PT, 0);
+
+        return frame_to_addr(mfn_guest + init_pt_pfn) | (va - XEN_IPT);
+    }
+    case XEN_M2P:
+        return va - XEN_M2P + frame_to_addr(vmconf.mfn_m2p);
+    }
+
+    panic("unknown address", NULL);
+    return 0;
+}
+
+/* --------------------------------------------------------------------- */
+
+static int find_slot(int range)
+{
+    int low   = MAPS_R_LOW(range);
+    int high  = MAPS_R_HIGH(range);
+    int *next = maps_next + range;
+    int start = *next;
+    int slot;
+
+    while (maps_refcnt[*next]) {
+        (*next)++;
+        if (*next == high) {
+            *next = low;
+        }
+        if (*next == start) {
+            return -1;
+        }
+    }
+    slot = *next;
+    (*next)++;
+    if (*next == high) {
+        *next = low;
+    }
+    return slot;
+}
+
+static int mfn_to_slot_pae(uint32_t mfn, int range)
+{
+    int low   = MAPS_R_LOW(range);
+    int high  = MAPS_R_HIGH(range);
+    int slot;
+
+    for (slot = low; slot < high; slot++) {
+        if (!test_pgflag_pae(maps_pae[slot], _PAGE_PRESENT)) {
+            continue;
+        }
+        if (get_pgframe_pae(maps_pae[slot]) == mfn) {
+            /* cache hit */
+            return slot;
+        }
+    }
+    return -1;
+}
+
+void *map_page(unsigned long maddr)
+{
+    uint32_t mfn = addr_to_frame(maddr);
+    uint32_t off = addr_offset(maddr);
+    uint32_t va;
+    int range, slot;
+
+    spin_lock(&maplock);
+    range = mfn & MAPS_R_MASK;
+    slot = mfn_to_slot_pae(mfn, range);
+    if (slot == -1) {
+        slot = find_slot(range);
+        if (slot == -1) {
+            panic("out of map slots", NULL);
+        }
+        printk(3, "%s: mfn %5x range %d [%3d - %3d], slot %3d\n", __FUNCTION__,
+               mfn, range, MAPS_R_LOW(range), MAPS_R_HIGH(range), slot);
+        maps_pae[slot] = get_pgentry_pae(mfn, EMU_PGFLAGS);
+        vminfo.faults[XEN_FAULT_MAPS_MAPIT]++;
+        va = XEN_MAP_PAE + slot*PAGE_SIZE;
+        flush_tlb_addr(va);
+    } else {
+        vminfo.faults[XEN_FAULT_MAPS_REUSE]++;
+        va = XEN_MAP_PAE + slot*PAGE_SIZE;
+    }
+    maps_refcnt[slot]++;
+    spin_unlock(&maplock);
+
+    return (void*)va + off;
+}
+
+void free_page(void *ptr)
+{
+    uintptr_t va   = (uintptr_t)ptr;
+    uintptr_t base = XEN_MAP_PAE;
+    int slot       = (va - base) >> PAGE_SHIFT;
+
+    spin_lock(&maplock);
+    maps_refcnt[slot]--;
+    spin_unlock(&maplock);
+}
+
+void *fixmap_page(struct xen_cpu *cpu, unsigned long maddr)
+{
+    static int fixmap_slot = MAPS_MAX;
+    uint32_t mfn = addr_to_frame(maddr);
+    uint32_t off = addr_offset(maddr);
+    uint32_t va;
+    int slot;
+
+    slot = fixmap_slot++;
+    printk(2, "%s: mfn %5x slot %3d\n", __FUNCTION__, mfn, slot);
+    maps_pae[slot] = get_pgentry_pae(mfn, EMU_PGFLAGS);
+    va = XEN_MAP_PAE + slot*PAGE_SIZE;
+    return (void*)va + off;
+}
+
+/* --------------------------------------------------------------------- */
+
+pte_t *find_pte_lpt(uint32_t va)
+{
+    uint64_t *lpt_base = (void*)XEN_LPT_PAE;
+    uint32_t offset = va >> PAGE_SHIFT;
+
+    return lpt_base + offset;
+}
+
+pte_t *find_pte_map(struct xen_cpu *cpu, uint32_t va)
+{
+    uint64_t *pgd;
+    uint64_t *pmd;
+    uint64_t *pte;
+    int g,m,t;
+
+    g = PGD_INDEX_PAE(va);
+    m = PMD_INDEX_PAE(va);
+    t = PTE_INDEX_PAE(va);
+
+    pgd  = map_page(frame_to_addr(read_cr3_mfn(cpu)));
+    if (!test_pgflag_pae(pgd[g], _PAGE_PRESENT)) {
+        goto err1;
+    }
+
+    pmd  = map_page(frame_to_addr(get_pgframe_pae(pgd[g])));
+    if (!test_pgflag_pae(pmd[m], _PAGE_PRESENT)) {
+        goto err2;
+    }
+
+    pte  = map_page(frame_to_addr(get_pgframe_pae(pmd[m])));
+    free_page(pgd);
+    free_page(pmd);
+    return pte+t;
+
+err2:
+    free_page(pmd);
+err1:
+    free_page(pgd);
+    return NULL;
+}
+
+static char *print_pgflags(uint32_t flags)
+{
+    static char buf[80];
+
+    snprintf(buf, sizeof(buf), "%s%s%s%s%s%s%s%s%s\n",
+             flags & _PAGE_GLOBAL   ? " global"   : "",
+             flags & _PAGE_PSE      ? " pse"      : "",
+             flags & _PAGE_DIRTY    ? " dirty"    : "",
+             flags & _PAGE_ACCESSED ? " accessed" : "",
+             flags & _PAGE_PCD      ? " pcd"      : "",
+             flags & _PAGE_PWT      ? " pwt"      : "",
+             flags & _PAGE_USER     ? " user"     : "",
+             flags & _PAGE_RW       ? " write"    : "",
+             flags & _PAGE_PRESENT  ? " present"  : "");
+    return buf;
+}
+
+void pgtable_walk(struct xen_cpu *cpu, uint32_t va)
+{
+    uint64_t *pgd = NULL;
+    uint64_t *pmd = NULL;
+    uint64_t *pte = NULL;
+    uint64_t mfn;
+    uint32_t g,m,t, flags;
+
+    g = PGD_INDEX_PAE(va);
+    m = PMD_INDEX_PAE(va);
+    t = PTE_INDEX_PAE(va);
+    printk(5, "va %" PRIx32 " | pae %d -> %d -> %d\n", va, g, m, t);
+
+    pgd  = map_page(frame_to_addr(read_cr3_mfn(cpu)));
+    mfn   = get_pgframe_64(pgd[g]);
+    flags = get_pgflags_64(pgd[g]);
+    printk(5, "  pgd     +%3d : %08" PRIx64 "  |  mfn %4" PRIx64 " | %s",
+           g, pgd[g], mfn, print_pgflags(flags));
+    if (!test_pgflag_pae(pgd[g], _PAGE_PRESENT)) {
+        goto cleanup;
+    }
+
+    pmd  = map_page(frame_to_addr(get_pgframe_pae(pgd[g])));
+    mfn   = get_pgframe_64(pmd[m]);
+    flags = get_pgflags_64(pmd[m]);
+    printk(5, "    pmd   +%3d : %08" PRIx64 "  |  mfn %4" PRIx64 " | %s",
+           m, pmd[m], mfn, print_pgflags(flags));
+    if (!test_pgflag_pae(pmd[m], _PAGE_PRESENT)) {
+        goto cleanup;
+    }
+    if (test_pgflag_pae(pmd[m], _PAGE_PSE)) {
+        goto cleanup;
+    }
+
+    pte   = map_page(frame_to_addr(get_pgframe_pae(pmd[m])));
+    mfn   = get_pgframe_64(pte[t]);
+    flags = get_pgflags_64(pte[t]);
+    printk(5, "      pte +%3d : %08" PRIx64 "  |  mfn %4" PRIx64 " | %s",
+           t, pte[t], mfn, print_pgflags(flags));
+
+cleanup:
+    if (pgd) {
+        free_page(pgd);
+    }
+    if (pmd) {
+        free_page(pmd);
+    }
+    if (pte) {
+        free_page(pte);
+    }
+}
+
+/* --------------------------------------------------------------------- */
+
+static inline pte_t *find_pgd(unsigned long va, uint64_t mfn, int alloc)
+{
+    pte_t *pgd = map_page(frame_to_addr(mfn));
+    pte_t *pmd;
+
+    pgd += PGD_INDEX_PAE(va);
+
+    if (!test_pgflag(*pgd, _PAGE_PRESENT)) {
+        pmd = get_pages(1, "pmd");
+        *pgd = get_pgentry(EMU_MFN(pmd), _PAGE_PRESENT);
+    }
+
+    return pgd;
+}
+
+static inline pte_t *find_pmd(unsigned long va, uint64_t mfn, int alloc)
+{
+    pte_t *pmd = map_page(frame_to_addr(mfn));
+    pte_t *pte;
+
+    pmd += PMD_INDEX_PAE(va);
+    if (!test_pgflag(*pmd, _PAGE_PRESENT)) {
+        pte = get_pages(1, "pte");
+        *pmd = get_pgentry(EMU_MFN(pte), ALL_PGFLAGS | _PAGE_RW);
+    }
+
+    return pmd;
+}
+
+static inline pte_t *find_pte(unsigned long va, uint64_t mfn)
+{
+    pte_t *pte = map_page(frame_to_addr(mfn));
+    return pte + PTE_INDEX_PAE(va);
+}
+
+static void map_one_page(struct xen_cpu *cpu, unsigned long va, uint64_t maddr,
+                         int flags)
+{
+    uint64_t mfn = addr_to_frame(maddr);
+    pte_t *pgd;
+    pte_t *pmd;
+    pte_t *pte;
+
+    pgd = find_pgd(va, read_cr3_mfn(cpu), 1);
+    pmd = find_pmd(va, get_pgframe(*pgd), 1);
+    if (test_pgflag(*pmd, _PAGE_PSE)) {
+        *pmd = 0;
+        pmd = find_pmd(va, get_pgframe(*pgd), 1);
+    }
+    pte = find_pte(va, get_pgframe(*pmd));
+    *pte = get_pgentry(mfn, flags);
+
+    free_page(pte);
+    free_page(pmd);
+    free_page(pgd);
+}
+
+void map_region(struct xen_cpu *cpu, uint64_t va, uint32_t flags,
+                uint64_t start, uint64_t count)
+{
+    uint64_t maddr = frame_to_addr(start);
+    uint64_t maddr_end = maddr + frame_to_addr(count);
+
+    for (; maddr < maddr_end; maddr += PAGE_SIZE, va += PAGE_SIZE) {
+        map_one_page(cpu, va, maddr, flags);
+    }
+}
+
+/* --------------------------------------------------------------------- */
+
+void update_emu_mappings(uint32_t cr3_mfn)
+{
+    uint64_t *new_pgd, *new_pmd3;
+    uint64_t entry;
+    uint32_t mfn;
+    int idx, i;
+
+    new_pgd  = map_page(frame_to_addr(cr3_mfn));
+
+    /* maybe alloc a pmd page */
+    switch_heap(HEAP_HIGH);
+    new_pmd3 = find_pgd(0xffffffff, cr3_mfn, 1);
+    free_page(new_pmd3);
+    switch_heap(HEAP_EMU);
+    /* map the pmd page */
+    new_pmd3 = map_page(frame_to_addr(get_pgframe_pae(new_pgd[3])));
+
+    /* xenner mapping */
+    idx = PMD_INDEX_PAE(XEN_TXT_PAE);
+    for (mfn = vmconf.mfn_emu;
+         mfn < vmconf.mfn_emu + vmconf.pg_emu;
+         mfn += PMD_COUNT_PAE, idx++) {
+        new_pmd3[idx] = emu_pmd_pae[idx];
+    }
+
+    idx = PMD_INDEX_PAE(XEN_M2P_PAE);
+    if (!test_pgflag_pae(new_pmd3[idx], _PAGE_PRESENT)) {
+        /* new one, must init static mappings */
+        for (; idx < PMD_COUNT_PAE; idx++) {
+            if (!test_pgflag_pae(emu_pmd_pae[idx], _PAGE_PRESENT)) {
+                continue;
+            }
+
+            if ((idx >= PMD_INDEX_PAE(XEN_LPT_PAE)) &&
+                (idx < (PMD_INDEX_PAE(XEN_LPT_PAE) + 4))) {
+                continue;
+            }
+            new_pmd3[idx] = emu_pmd_pae[idx];
+        }
+    }
+
+    /* linear pgtable mappings */
+    idx = PMD_INDEX_PAE(XEN_LPT_PAE);
+    for (i = 0; i < 4; i++) {
+        if (test_pgflag_pae(new_pgd[i], _PAGE_PRESENT)) {
+            mfn = get_pgframe_pae(new_pgd[i]);
+            entry = get_pgentry_pae(mfn, LPT_PGFLAGS);
+        } else {
+            entry = 0;
+        }
+        if (new_pmd3[idx+i] != entry) {
+            new_pmd3[idx+i] = entry;
+        }
+    }
+
+    /* mapping slots */
+    idx = PMD_INDEX_PAE(XEN_MAP_PAE);
+    new_pmd3[idx] = emu_pmd_pae[idx];
+
+    free_page(new_pgd);
+    free_page(new_pmd3);
+}
+
+/* --------------------------------------------------------------------- */
+
+void paging_init(struct xen_cpu *cpu)
+{
+    uintptr_t mfn_guest = emudev_get(EMUDEV_CONF_GUEST_START_PFN, 0);
+    uintptr_t init_pt_pfn = emudev_get(EMUDEV_CONF_PFN_INIT_PT, 0);
+    uint32_t mfn;
+    int idx;
+
+    idx = PMD_INDEX_PAE(XEN_TXT_PAE);
+    for (mfn = vmconf.mfn_emu;
+         mfn < vmconf.mfn_emu + vmconf.pg_emu;
+         mfn += PMD_COUNT_PAE, idx++) {
+        emu_pmd_pae[idx] = get_pgentry_pae(mfn, EMU_PGFLAGS | _PAGE_PSE);
+    }
+
+    idx = PMD_INDEX_PAE(XEN_M2P_PAE);
+    for (mfn = vmconf.mfn_m2p;
+         mfn < vmconf.mfn_m2p + vmconf.pg_m2p;
+         mfn += PMD_COUNT_PAE, idx++) {
+        emu_pmd_pae[idx] = get_pgentry_pae(mfn, EMU_PGFLAGS | _PAGE_PSE);
+    }
+
+    idx = PMD_INDEX_PAE(XEN_MAP_PAE);
+    emu_pmd_pae[idx] = get_pgentry_pae(EMU_MFN(maps_pae), PGT_PGFLAGS_32);
+
+    idx = PMD_INDEX_PAE(XEN_IPT);
+    emu_pmd_pae[idx] = get_pgentry(mfn_guest + init_pt_pfn,
+                                   EMU_PGFLAGS | _PAGE_PSE);
+
+    m2p = (void*)XEN_M2P_PAE;
+}
+
-- 
1.6.0.2


* [Qemu-devel] [PATCH 21/40] xenner: kernel: mmu support for 32-bit normal
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (19 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 20/40] xenner: kernel: mmu support for 32-bit PAE Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 22/40] xenner: kernel: mmu support for 64-bit Alexander Graf
                   ` (18 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds memory management support for 32-bit mode without PAE,
where page tables have two levels (pgd and pte).

Signed-off-by: Alexander Graf <agraf@suse.de>
---
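Note (not part of the commit): the map_page()/free_page() pair in this file
keeps a small cache of mapping slots, split into ranges selected by the low
bits of the mfn, with a reference count per slot. The sketch below is a
minimal standalone model of that idea, not the code in the patch. All demo_*
names and the table size are hypothetical, and the rotating cursor is stored
relative to the start of its range, which is a simplification of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes for illustration; the real MAPS_MAX is defined elsewhere. */
#define DEMO_MAPS_MAX   64
#define DEMO_R_BITS     4
#define DEMO_R_COUNT    (1 << DEMO_R_BITS)
#define DEMO_R_MASK     (DEMO_R_COUNT - 1)
#define DEMO_R_SIZE     (DEMO_MAPS_MAX / DEMO_R_COUNT)
#define DEMO_R_LOW(r)   (DEMO_R_SIZE * (r))
#define DEMO_R_HIGH(r)  (DEMO_R_SIZE * (r) + DEMO_R_SIZE)

static uint32_t demo_mfn[DEMO_MAPS_MAX];    /* frame cached in each slot */
static int      demo_valid[DEMO_MAPS_MAX];  /* stands in for _PAGE_PRESENT */
static int      demo_refcnt[DEMO_MAPS_MAX]; /* current users of each slot */
static int      demo_next[DEMO_R_COUNT];    /* per-range cursor, relative to low */

/* Return a slot for mfn and take a reference; -1 if the range has no free slot. */
int demo_map(uint32_t mfn)
{
    int range = mfn & DEMO_R_MASK;
    int low   = DEMO_R_LOW(range);
    int high  = DEMO_R_HIGH(range);
    int slot, i;

    /* Cache hit: the frame is already mapped somewhere in its range. */
    for (slot = low; slot < high; slot++) {
        if (demo_valid[slot] && demo_mfn[slot] == mfn) {
            demo_refcnt[slot]++;
            return slot;
        }
    }

    /* Miss: scan the range once, starting at the cursor, for an unused slot. */
    for (i = 0; i < DEMO_R_SIZE; i++) {
        slot = low + (demo_next[range] + i) % DEMO_R_SIZE;
        if (demo_refcnt[slot] == 0) {
            demo_next[range] = (slot - low + 1) % DEMO_R_SIZE;
            demo_mfn[slot]    = mfn;
            demo_valid[slot]  = 1;
            demo_refcnt[slot] = 1;
            return slot;
        }
    }
    return -1;
}

/* Drop one reference; the cached mapping stays until the slot is reused. */
void demo_unmap(int slot)
{
    assert(slot >= 0 && slot < DEMO_MAPS_MAX && demo_refcnt[slot] > 0);
    demo_refcnt[slot]--;
}
```

Mapping the same frame twice returns the same slot with a reference count of
two. A slot only becomes a candidate for reuse once its count is back to zero,
which is why every map_page() call needs a matching free_page().
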
 pc-bios/xenner/xenner-mm32.c |  314 ++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 314 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-mm32.c

diff --git a/pc-bios/xenner/xenner-mm32.c b/pc-bios/xenner/xenner-mm32.c
new file mode 100644
index 0000000..7622ae5
--- /dev/null
+++ b/pc-bios/xenner/xenner-mm32.c
@@ -0,0 +1,314 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner memory management for 32 bit normal mode
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <inttypes.h>
+#include <xen/xen.h>
+
+#include "xenner.h"
+#include "xenner-mm.c"
+
+/* --------------------------------------------------------------------- */
+
+#define MAPS_R_BITS        4
+#define MAPS_R_COUNT       (1 << MAPS_R_BITS)
+#define MAPS_R_MASK        (MAPS_R_COUNT - 1)
+#define MAPS_R_SIZE        (MAPS_MAX / MAPS_R_COUNT)
+#define MAPS_R_LOW(r)      (MAPS_R_SIZE * (r))
+#define MAPS_R_HIGH(r)     (MAPS_R_SIZE * (r) + MAPS_R_SIZE)
+static int maps_next[MAPS_R_COUNT];
+
+static spinlock_t maplock = SPIN_LOCK_UNLOCKED;
+
+/* --------------------------------------------------------------------- */
+
+uintptr_t emu_pa(uintptr_t va)
+{
+    switch(va & 0xfff00000) {
+    case XEN_TXT:
+        return va - (uintptr_t)_vstart;
+    case XEN_IPT:
+    {
+        uintptr_t mfn_guest = emudev_get(EMUDEV_CONF_GUEST_START_PFN, 0);
+        uintptr_t init_pt_pfn = emudev_get(EMUDEV_CONF_PFN_INIT_PT, 0);
+
+        return frame_to_addr(mfn_guest + init_pt_pfn) | (va - XEN_IPT);
+    }
+    case XEN_M2P:
+        return va - XEN_M2P + frame_to_addr(vmconf.mfn_m2p);
+    }
+
+    panic("unknown address", NULL);
+    return 0;
+}
+
+/* --------------------------------------------------------------------- */
+
+static int find_slot(int range)
+{
+    int low   = MAPS_R_LOW(range);
+    int high  = MAPS_R_HIGH(range);
+    int *next = maps_next + range;
+    int start = *next;
+    int slot;
+
+    while (maps_refcnt[*next]) {
+        (*next)++;
+        if (*next == high) {
+            *next = low;
+        }
+        if (*next == start) {
+            return -1;
+        }
+    }
+    slot = *next;
+    (*next)++;
+    if (*next == high) {
+        *next = low;
+    }
+    return slot;
+}
+
+static int mfn_to_slot_32(uint32_t mfn, int range)
+{
+    int low   = MAPS_R_LOW(range);
+    int high  = MAPS_R_HIGH(range);
+    int slot;
+
+    for (slot = low; slot < high; slot++) {
+        if (!test_pgflag_32(maps_32[slot], _PAGE_PRESENT)) {
+            continue;
+        }
+        if (get_pgframe_32(maps_32[slot]) == mfn) {
+            /* cache hit */
+            return slot;
+        }
+    }
+    return -1;
+}
+
+void *map_page(unsigned long maddr)
+{
+    uint32_t mfn = addr_to_frame(maddr);
+    uint32_t off = addr_offset(maddr);
+    uint32_t va;
+    int range, slot;
+
+    spin_lock(&maplock);
+    range = mfn & MAPS_R_MASK;
+    slot = mfn_to_slot_32(mfn, range);
+    if (slot == -1) {
+        slot = find_slot(range);
+        if (slot == -1) {
+            panic("out of map slots", NULL);
+        }
+        printk(3, "%s: mfn %5x range %d [%3d - %3d], slot %3d\n", __FUNCTION__,
+               mfn, range, MAPS_R_LOW(range), MAPS_R_HIGH(range), slot);
+        maps_32[slot] = get_pgentry_32(mfn, EMU_PGFLAGS);
+        vminfo.faults[XEN_FAULT_MAPS_MAPIT]++;
+        va = XEN_MAP_32 + slot*PAGE_SIZE;
+        flush_tlb_addr(va);
+    } else {
+        printk(3, "%s: mfn %5x range %d [%3d - %3d], slot %3d (cached)\n",
+               __FUNCTION__, mfn, range, MAPS_R_LOW(range), MAPS_R_HIGH(range),
+               slot);
+        vminfo.faults[XEN_FAULT_MAPS_REUSE]++;
+        va = XEN_MAP_32 + slot*PAGE_SIZE;
+    }
+    maps_refcnt[slot]++;
+    spin_unlock(&maplock);
+
+    return (void*)va + off;
+}
+
+void free_page(void *ptr)
+{
+    uintptr_t va   = ((uintptr_t)ptr) & PAGE_MASK;
+    uintptr_t base = XEN_MAP_32;
+    int slot       = (va - base) >> PAGE_SHIFT;
+
+    spin_lock(&maplock);
+    maps_refcnt[slot]--;
+    spin_unlock(&maplock);
+}
+
+void *fixmap_page(struct xen_cpu *cpu, unsigned long maddr)
+{
+    static int fixmap_slot = MAPS_MAX;
+    uint32_t mfn = addr_to_frame(maddr);
+    uint32_t off = addr_offset(maddr);
+    uint32_t va;
+    int slot;
+
+    slot = fixmap_slot++;
+    printk(2, "%s: mfn %5x slot %3d\n", __FUNCTION__, mfn, slot);
+    maps_32[slot] = get_pgentry_32(mfn, EMU_PGFLAGS);
+    va = XEN_MAP_32 + slot*PAGE_SIZE;
+    return (void*)va + off;
+}
+
+/* --------------------------------------------------------------------- */
+
+pte_t *find_pte_lpt(uint32_t va)
+{
+    pte_t *lpt_base = (void*)XEN_LPT_32;
+    pte_t offset = va >> PAGE_SHIFT;
+
+    return lpt_base + offset;
+}
+
+pte_t *find_pte_map(struct xen_cpu *cpu, uint32_t va)
+{
+    pte_t *pgd;
+    pte_t *pte;
+    int g,t;
+
+    g = PGD_INDEX_32(va);
+    t = PTE_INDEX_32(va);
+    printk(5, "va %" PRIx32 " | 32 %d -> %d\n", va, g, t);
+
+    pgd  = map_page(frame_to_addr(read_cr3_mfn(cpu)));
+    printk(5, "  pgd   %3d = %08" PRIx32 "\n", g, pgd[g]);
+    if (!test_pgflag_32(pgd[g], _PAGE_PRESENT)) {
+        return NULL;
+    }
+
+    pte  = map_page(frame_to_addr(get_pgframe_32(pgd[g])));
+    printk(5, "    pte %3d = %08" PRIx32 "\n", t, pte[t]);
+    free_page(pgd);
+
+    return pte+t;
+}
+
+void pgtable_walk(struct xen_cpu *cpu, uint32_t va)
+{
+    pte_t *p = find_pte_map(cpu, va);
+    if (p)
+        free_page(p);
+}
+
+/* --------------------------------------------------------------------- */
+
+static inline pte_t *find_pgd(unsigned long va, uint64_t mfn, int alloc)
+{
+    pte_t *pgd = map_page(frame_to_addr(mfn));
+    pte_t *pte;
+
+    pgd += PGD_INDEX_32(va);
+    if (!test_pgflag(*pgd, _PAGE_PRESENT)) {
+        pte = get_pages(1, "pte");
+        *pgd = get_pgentry(EMU_MFN(pte), _PAGE_PRESENT);
+    }
+
+    return pgd;
+}
+
+static inline pte_t *find_pte(unsigned long va, uint64_t mfn)
+{
+    pte_t *pte = map_page(frame_to_addr(mfn));
+    return pte + PTE_INDEX_32(va);
+}
+
+static void map_one_page(struct xen_cpu *cpu, unsigned long va, uint64_t maddr,
+                         int flags)
+{
+    uint64_t mfn = addr_to_frame(maddr);
+    pte_t *pgd;
+    pte_t *pte;
+
+    pgd = find_pgd(va, read_cr3_mfn(cpu), 1);
+    pte = find_pte(va, get_pgframe(*pgd));
+    *pte = get_pgentry(mfn, flags);
+
+    free_page(pte);
+    free_page(pgd);
+}
+
+void map_region(struct xen_cpu *cpu, uint64_t va, uint32_t flags,
+                uint64_t start, uint64_t count)
+{
+    uint64_t maddr = frame_to_addr(start);
+    uint64_t maddr_end = maddr + frame_to_addr(count);
+
+    for (; maddr < maddr_end; maddr += PAGE_SIZE, va += PAGE_SIZE) {
+        map_one_page(cpu, va, maddr, flags);
+    }
+}
+
+/* --------------------------------------------------------------------- */
+
+void update_emu_mappings(uint32_t cr3_mfn)
+{
+    uint32_t *new_pgd;
+    uint32_t entry;
+    int idx;
+
+    new_pgd  = map_page(frame_to_addr(cr3_mfn));
+
+    idx = PGD_INDEX_32(XEN_M2P_32);
+    if (!test_pgflag_32(new_pgd[idx], _PAGE_PRESENT)) {
+        /* new one, must init static mappings */
+        for (; idx < PGD_COUNT_32; idx++) {
+            if (!test_pgflag_32(emu_pgd_32[idx], _PAGE_PRESENT)) {
+                continue;
+            }
+            if (idx == PGD_INDEX_32(XEN_LPT_32)) {
+                continue;
+            }
+
+            /* copy the static emu mapping for this slot */
+            new_pgd[idx] = emu_pgd_32[idx];
+        }
+    }
+
+    /* linear pgtable mapping */
+    idx = PGD_INDEX_32(XEN_LPT_32);
+    entry = get_pgentry_32(cr3_mfn, LPT_PGFLAGS);
+    if (new_pgd[idx] != entry) {
+        new_pgd[idx] = entry;
+    }
+
+    free_page(new_pgd);
+}
+
+/* --------------------------------------------------------------------- */
+
+void paging_init(struct xen_cpu *cpu)
+{
+    uintptr_t mfn_guest = emudev_get(EMUDEV_CONF_GUEST_START_PFN, 0);
+    uintptr_t init_pt_pfn = emudev_get(EMUDEV_CONF_PFN_INIT_PT, 0);
+    int idx;
+
+    idx = PGD_INDEX_32(XEN_TXT_32);
+    emu_pgd_32[idx] = get_pgentry_32(vmconf.mfn_emu, EMU_PGFLAGS | _PAGE_PSE);
+
+    idx = PGD_INDEX_32(XEN_M2P_32);
+    emu_pgd_32[idx] = get_pgentry_32(vmconf.mfn_m2p, M2P_PGFLAGS_32 | _PAGE_PSE);
+
+    idx = PGD_INDEX_32(XEN_MAP_32);
+    emu_pgd_32[idx] = get_pgentry_32(EMU_MFN(maps_32), PGT_PGFLAGS_32);
+
+    idx = PGD_INDEX_32(XEN_IPT);
+    emu_pgd_32[idx] = get_pgentry(mfn_guest + init_pt_pfn,
+                                  EMU_PGFLAGS | _PAGE_PSE);
+
+    m2p = (void*)XEN_M2P_32;
+}
-- 
1.6.0.2


* [Qemu-devel] [PATCH 22/40] xenner: kernel: mmu support for 64-bit
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (20 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 21/40] xenner: kernel: mmu support for 32-bit normal Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 23/40] xenner: kernel: generic MM functionality Alexander Graf
                   ` (17 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds memory management support for 64-bit mode, which uses
four-level page tables (pgd, pud, pmd, pte) and a direct mapping of
guest RAM at XEN_RAM_64.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
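Note (not part of the commit): the walkers in this file split a virtual
address into four table indices, and find_pte_64() turns an address into an
entry number in the linear page table. The sketch below shows that arithmetic
in isolation. It assumes the standard x86-64 4 KiB paging layout (9 index bits
per level, 12 offset bits); the demo_* helpers are hypothetical stand-ins, not
the PGD_INDEX_64 and related macros used by the patch.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_IDX_BITS   9
#define DEMO_IDX_MASK   ((1u << DEMO_IDX_BITS) - 1)

/* Index into the table at a given level: 0 = pte, 1 = pmd, 2 = pud, 3 = pgd. */
unsigned demo_index(uint64_t va, int level)
{
    int shift = DEMO_PAGE_SHIFT + DEMO_IDX_BITS * level;

    assert(level >= 0 && level <= 3);
    return (unsigned)(va >> shift) & DEMO_IDX_MASK;
}

/*
 * Entry number of va's pte inside a linear page table: keep the low 48
 * address bits and drop the page offset, as find_pte_64() does.
 */
uint64_t demo_lpt_offset(uint64_t va)
{
    return (va & 0xffffffffffffULL) >> DEMO_PAGE_SHIFT;
}
```

The linear page table offset is simply the four indices concatenated, which is
why a single pointer addition is enough to reach any pte once the page table
root is mapped into itself at the LPT slot.
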
 pc-bios/xenner/xenner-mm64.c |  369 ++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 369 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-mm64.c

diff --git a/pc-bios/xenner/xenner-mm64.c b/pc-bios/xenner/xenner-mm64.c
new file mode 100644
index 0000000..89cb076
--- /dev/null
+++ b/pc-bios/xenner/xenner-mm64.c
@@ -0,0 +1,369 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner memory management for 64 bit mode
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <inttypes.h>
+#include <xen/xen.h>
+
+#include "xenner.h"
+#include "xenner-mm.c"
+
+/* --------------------------------------------------------------------- */
+
+uintptr_t emu_pa(uintptr_t va)
+{
+    switch(va & 0xfffffff000000000) {
+    case XEN_RAM_64:
+        return va - (uintptr_t)_vstart;
+    case XEN_M2P_64:
+        return va - XEN_M2P_64 + frame_to_addr(vmconf.mfn_m2p);
+    }
+
+    panic("unknown address", NULL);
+    return 0;
+}
+
+/* --------------------------------------------------------------------- */
+
+static char *print_pgflags(uint32_t flags)
+{
+    static char buf[80];
+
+    snprintf(buf, sizeof(buf), "%s%s%s%s%s%s%s%s%s\n",
+             flags & _PAGE_GLOBAL   ? " global"   : "",
+             flags & _PAGE_PSE      ? " pse"      : "",
+             flags & _PAGE_DIRTY    ? " dirty"    : "",
+             flags & _PAGE_ACCESSED ? " accessed" : "",
+             flags & _PAGE_PCD      ? " pcd"      : "",
+             flags & _PAGE_PWT      ? " pwt"      : "",
+             flags & _PAGE_USER     ? " user"     : "",
+             flags & _PAGE_RW       ? " write"    : "",
+             flags & _PAGE_PRESENT  ? " present"  : "");
+    return buf;
+}
+
+void pgtable_walk(int level, uint64_t va, uint64_t root_mfn)
+{
+    void *physmem = (void*)XEN_RAM_64;
+    uint64_t *pgd, *pud, *pmd, *pte;
+    uint64_t mfn;
+    uint32_t slot, flags;
+
+    if (vmconf.debug_level < level)
+        return;
+
+    printk(level, "page table walk for va %" PRIx64 ", root_mfn %" PRIx64 "\n",
+           va, root_mfn);
+
+    pgd   = physmem + frame_to_addr(root_mfn);
+    slot  = PGD_INDEX_64(va);
+    mfn   = get_pgframe_64(pgd[slot]);
+    flags = get_pgflags_64(pgd[slot]);
+    printk(level, "pgd   : %p +%3d  |  mfn %4" PRIx64 "  |  %s",
+           pgd, slot, mfn, print_pgflags(flags));
+    if (!(flags & _PAGE_PRESENT))
+        return;
+
+    pud   = physmem + frame_to_addr(mfn);
+    slot  = PUD_INDEX_64(va);
+    mfn   = get_pgframe_64(pud[slot]);
+    flags = get_pgflags_64(pud[slot]);
+    printk(level, " pud  : %p +%3d  |  mfn %4" PRIx64 "  |  %s",
+           pud, slot, mfn, print_pgflags(flags));
+    if (!(flags & _PAGE_PRESENT))
+        return;
+
+    pmd   = physmem + frame_to_addr(mfn);
+    slot  = PMD_INDEX_64(va);
+    mfn   = get_pgframe_64(pmd[slot]);
+    flags = get_pgflags_64(pmd[slot]);
+    printk(level, "  pmd : %p +%3d  |  mfn %4" PRIx64 "  |  %s",
+           pmd, slot, mfn, print_pgflags(flags));
+    if (!(flags & _PAGE_PRESENT))
+        return;
+    if (flags & _PAGE_PSE)
+        return;
+
+    pte   = physmem + frame_to_addr(mfn);
+    slot  = PTE_INDEX_64(va);
+    mfn   = get_pgframe_64(pte[slot]);
+    flags = get_pgflags_64(pte[slot]);
+    printk(level, "   pte: %p +%3d  |  mfn %4" PRIx64 "  |  %s",
+           pte, slot, mfn, print_pgflags(flags));
+}
+
+static int is_pse(uint64_t va)
+{
+    switch (va & 0xffffff8000000000ULL) {
+    case XEN_RAM_64:
+    case XEN_M2P_64:
+        return 1;
+    default:
+        return 0;
+    }
+}
+
+int pgtable_fixup_flag(struct xen_cpu *cpu, uint64_t va, uint32_t flag)
+{
+    void *physmem = (void*)XEN_RAM_64;
+    uint64_t *pgd, *pud, *pmd, *pte;
+    uint32_t slot;
+    int fixes = 0;
+
+    /* quick test on the leaf page via linear page table when we're sure
+       we're not touching a 2mb page which doesn't have a pte */
+    pte = find_pte_64(va);
+    if (!is_pse(va) && !test_pgflag_64(*pte, flag)) {
+        *pte |= flag;
+        fixes++;
+        goto done;
+    }
+
+    /* do full page table walk */
+    pgd   = physmem + frame_to_addr(read_cr3_mfn(cpu));
+    slot  = PGD_INDEX_64(va);
+    if (!test_pgflag_64(pgd[slot], flag)) {
+        pgd[slot] |= flag;
+        fixes++;
+    }
+
+    pud   = physmem + frame_to_addr(get_pgframe_64(pgd[slot]));
+    slot  = PUD_INDEX_64(va);
+    if (!test_pgflag_64(pud[slot], flag)) {
+        pud[slot] |= flag;
+        fixes++;
+    }
+
+    pmd   = physmem + frame_to_addr(get_pgframe_64(pud[slot]));
+    slot  = PMD_INDEX_64(va);
+    if (!test_pgflag_64(pmd[slot], flag)) {
+        pmd[slot] |= flag;
+        fixes++;
+    }
+
+done:
+    if (fixes)
+        flush_tlb_addr(va);
+    return fixes;
+}
+
+int pgtable_is_present(uint64_t va, uint64_t root_mfn)
+{
+    void *physmem = (void*)XEN_RAM_64;
+    uint64_t *pgd, *pud, *pmd, *pte;
+    uint32_t slot;
+
+    pgd  = physmem + frame_to_addr(root_mfn);
+    slot = PGD_INDEX_64(va);
+    if (!test_pgflag_64(pgd[slot], _PAGE_PRESENT)) {
+        return 0;
+    }
+
+    pud  = physmem + frame_to_addr(get_pgframe_64(pgd[slot]));
+    slot = PUD_INDEX_64(va);
+    if (!test_pgflag_64(pud[slot], _PAGE_PRESENT)) {
+        return 0;
+    }
+
+    pmd  = physmem + frame_to_addr(get_pgframe_64(pud[slot]));
+    slot = PMD_INDEX_64(va);
+    if (!test_pgflag_64(pmd[slot], _PAGE_PRESENT)) {
+        return 0;
+    }
+    if (test_pgflag_64(pmd[slot], _PAGE_PSE)) {
+        return 1;
+    }
+
+    pte   = physmem + frame_to_addr(get_pgframe_64(pmd[slot]));
+    slot  = PTE_INDEX_64(va);
+    if (!test_pgflag_64(pte[slot], _PAGE_PRESENT)) {
+        return 0;
+    }
+
+    return 1;
+}
+
+/* --------------------------------------------------------------------- */
+
+void *map_page(uint64_t maddr)
+{
+    void *ram = (void*)XEN_RAM_64;
+    return ram + maddr;
+}
+
+uint64_t *find_pte_64(uint64_t va)
+{
+    uint64_t *lpt_base = (void*)XEN_LPT_64;
+    uint64_t offset = (va & 0xffffffffffff) >> PAGE_SHIFT;
+
+    return lpt_base + offset;
+}
+
+void update_emu_mappings(uint64_t cr3_mfn)
+{
+    uint64_t *new_pgd;
+    int idx;
+
+    new_pgd  = map_page(frame_to_addr(cr3_mfn));
+
+    idx = PGD_INDEX_64(XEN_M2P_64);
+    for (; idx < PGD_INDEX_64(XEN_DOM_64); idx++) {
+        if ((test_pgflag_64(new_pgd[idx], _PAGE_PRESENT)) ||
+            (!test_pgflag_64(emu_pgd[idx], _PAGE_PRESENT)) ||
+            (idx == PGD_INDEX_64(XEN_LPT_64))) {
+            continue;
+        }
+
+        new_pgd[idx] = emu_pgd[idx];
+    }
+
+    /* linear pgtable mapping */
+    idx = PGD_INDEX_64(XEN_LPT_64);
+    new_pgd[idx] = get_pgentry_64(cr3_mfn, LPT_PGFLAGS);
+}
+
+static inline uint64_t *find_pgd(uint64_t va, uint64_t mfn, int alloc, int sync)
+{
+    void *physmem = (void*)XEN_RAM_64;
+    uint64_t *pgd, *pud, idx;
+
+    pgd  = physmem + frame_to_addr(mfn);
+    idx = PGD_INDEX_64(va);
+    pgd += idx;
+    if (!test_pgflag_64(*pgd, _PAGE_PRESENT) && alloc) {
+        pud = get_pages(1, "pud");
+        *pgd = get_pgentry_64(EMU_MFN(pud), PGT_PGFLAGS_64) & ~_PAGE_GLOBAL;
+        if (sync && !test_pgflag_64(emu_pgd[idx], _PAGE_PRESENT)) {
+            /* sync emu boot pgd */
+            emu_pgd[idx] = *pgd;
+        }
+    }
+    return pgd;
+}
+
+static inline uint64_t *find_pud(uint64_t va, uint64_t mfn, int alloc)
+{
+    void *physmem = (void*)XEN_RAM_64;
+    uint64_t *pud, *pmd;
+
+    pud  = physmem + frame_to_addr(mfn);
+    pud += PUD_INDEX_64(va);
+    if (!test_pgflag_64(*pud, _PAGE_PRESENT) && alloc) {
+        pmd = get_pages(1, "pmd");
+        *pud = get_pgentry_64(EMU_MFN(pmd), PGT_PGFLAGS_64) & ~_PAGE_GLOBAL;
+    }
+    return pud;
+}
+
+static inline uint64_t *find_pmd(uint64_t va, uint64_t mfn, int alloc)
+{
+    void *physmem = (void*)XEN_RAM_64;
+    uint64_t *pmd, *pte;
+
+    pmd  = physmem + frame_to_addr(mfn);
+    pmd += PMD_INDEX_64(va);
+    if (!test_pgflag_64(*pmd, _PAGE_PRESENT) && alloc) {
+        pte = get_pages(1, "pte");
+        *pmd = get_pgentry_64(EMU_MFN(pte), PGT_PGFLAGS_64);
+    }
+    return pmd;
+}
+
+static inline uint64_t *find_pte(uint64_t va, uint64_t mfn)
+{
+    void *physmem = (void*)XEN_RAM_64;
+    uint64_t *pte;
+
+    pte  = physmem + frame_to_addr(mfn);
+    pte += PTE_INDEX_64(va);
+    return pte;
+}
+
+static int map_region_pse(struct xen_cpu *cpu, uint64_t va_start,
+                          uint32_t flags, uint64_t start, uint64_t count)
+{
+    uint64_t *pgd;
+    uint64_t *pud;
+    uint64_t *pmd;
+    uint64_t va;
+    uint64_t mfn;
+
+    flags |= _PAGE_PSE;
+    for (mfn = start; mfn < (start + count); mfn += PMD_COUNT_64) {
+        va = va_start + frame_to_addr(mfn-start);
+
+        pgd = find_pgd(va, read_cr3_mfn(cpu), 1, 1);
+        pud = find_pud(va, get_pgframe_64(*pgd), 1);
+        pmd = find_pmd(va, get_pgframe_64(*pud), 0);
+        *pmd = get_pgentry_64(mfn, flags);
+    }
+    return 0;
+}
+
+static void map_one_page(struct xen_cpu *cpu, uint64_t va, uint64_t maddr,
+                         int flags, int sync)
+{
+    uint64_t mfn = addr_to_frame(maddr);
+    uint64_t *pgd;
+    uint64_t *pud;
+    uint64_t *pmd;
+    uint64_t *pte;
+
+    pgd = find_pgd(va, read_cr3_mfn(cpu), 1, sync);
+    pud = find_pud(va, get_pgframe_64(*pgd), 1);
+    pmd = find_pmd(va, get_pgframe_64(*pud), 1);
+    if (*pmd & _PAGE_PSE) {
+        *pmd = 0;
+        pmd = find_pmd(va, get_pgframe_64(*pud), 1);
+    }
+    pte = find_pte(va, get_pgframe_64(*pmd));
+    *pte = get_pgentry_64(mfn, flags);
+}
+
+void map_region(struct xen_cpu *cpu, uint64_t va, uint32_t flags,
+                uint64_t start, uint64_t count)
+{
+    uint64_t maddr = frame_to_addr(start);
+    uint64_t maddr_end = maddr + frame_to_addr(count);
+
+    for (; maddr < maddr_end; maddr += PAGE_SIZE, va += PAGE_SIZE) {
+        map_one_page(cpu, va, maddr, flags, 0);
+    }
+}
+
+void *fixmap_page(struct xen_cpu *cpu, uint64_t maddr)
+{
+    static int fixmap_slot = 0;
+    uint32_t off = addr_offset(maddr);
+    uint64_t va;
+
+    va = XEN_MAP_64 + PAGE_SIZE * fixmap_slot++;
+    map_one_page(cpu, va, maddr, EMU_PGFLAGS, 1);
+
+    return (void*)va + off;
+}
+
+void paging_init(struct xen_cpu *cpu)
+{
+    map_region_pse(cpu, XEN_RAM_64, EMU_PGFLAGS,    0,              vmconf.pg_total);
+    map_region_pse(cpu, XEN_M2P_64, M2P_PGFLAGS_64, vmconf.mfn_m2p, vmconf.pg_m2p);
+    m2p = (void*)XEN_M2P_64;
+}
-- 
1.6.0.2


* [Qemu-devel] [PATCH 23/40] xenner: kernel: generic MM functionality
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (21 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 22/40] xenner: kernel: mmu support for 64-bit Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 24/40] xenner: kernel: printk Alexander Graf
                   ` (16 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

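Xenner does its own memory management bookkeeping, which can be kept
platform agnostic. This patch adds the shared pieces: a simple page
allocator (get_pages, get_memory) with two heaps selected by
switch_heap, and paging_start.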

Signed-off-by: Alexander Graf <agraf@suse.de>
---
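Note (not part of the commit): get_pages() is a bump allocator. It hands out
zeroed pages from the current heap pointer and never frees them, and
get_memory() rounds a byte count up to whole pages first. The sketch below
models that behaviour over a static buffer. The demo_* names and the heap size
are hypothetical, and the bounds check is an addition of the sketch; the code
in the patch does not check for exhaustion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE  4096
#define DEMO_HEAP_PAGES 8

static unsigned char demo_heap[DEMO_HEAP_PAGES * DEMO_PAGE_SIZE];
static size_t demo_used;    /* bytes handed out so far */

/* Hand out zeroed pages from the front of the heap; NULL when it is full. */
void *demo_get_pages(int pages)
{
    size_t bytes;
    void *ptr;

    if (pages <= 0) {
        return NULL;
    }
    bytes = (size_t)pages * DEMO_PAGE_SIZE;
    if (bytes > sizeof(demo_heap) - demo_used) {
        return NULL;
    }

    ptr = demo_heap + demo_used;
    demo_used += bytes;
    memset(ptr, 0, bytes);
    return ptr;
}

/* Round a byte count up to whole pages, as get_memory() does. */
void *demo_get_memory(int bytes)
{
    int pages = (bytes + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;

    return demo_get_pages(pages);
}

/* Bytes consumed so far, comparable to heap_size() for one heap. */
size_t demo_heap_size(void)
{
    return demo_used;
}
```

A request for 4097 bytes consumes two pages here. Because nothing is ever
returned to the heap, this scheme only suits allocations that live as long as
the emulator itself, such as page tables.
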
 pc-bios/xenner/xenner-mm.c |  105 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 105 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-mm.c

diff --git a/pc-bios/xenner/xenner-mm.c b/pc-bios/xenner/xenner-mm.c
new file mode 100644
index 0000000..ccbd48a
--- /dev/null
+++ b/pc-bios/xenner/xenner-mm.c
@@ -0,0 +1,105 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner generic memory management
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+static int heap_type = HEAP_EMU;
+static void *heap_emu = NULL;
+static void *heap_high_start = NULL;
+static void *heap_high = NULL;
+
+unsigned long heap_size(void)
+{
+    unsigned long r = 0;
+
+    switch (heap_type) {
+        case HEAP_EMU:
+            r = (unsigned long)heap_emu - (unsigned long)_vstop;
+            break;
+        case HEAP_HIGH:
+            r = (unsigned long)heap_high - (unsigned long)heap_high_start;
+            break;
+    }
+
+    return r;
+}
+
+void switch_heap(int _heap_type)
+{
+    if ((_heap_type == HEAP_HIGH) && !heap_high) {
+#if defined(CONFIG_32BIT)
+        heap_high = (void*)(uintptr_t)XEN_IPT;
+#elif defined(CONFIG_64BIT)
+        uintptr_t mfn_guest = emudev_get(EMUDEV_CONF_GUEST_START_PFN, 0);
+        uintptr_t init_pt_pfn = emudev_get(EMUDEV_CONF_PFN_INIT_PT, 0);
+
+        heap_high = map_page(frame_to_addr(mfn_guest + init_pt_pfn));
+#endif
+        heap_high_start = heap_high;
+    }
+
+    switch (_heap_type) {
+    case HEAP_EMU:
+    case HEAP_HIGH:
+        heap_type = _heap_type;
+        break;
+    }
+}
+
+void *get_pages(int pages, const char *purpose)
+{
+    void **heap_cur = &heap_emu;
+    void *ptr;
+
+    if (!heap_emu) {
+        heap_emu = _vstop;
+    }
+
+    switch (heap_type) {
+        case HEAP_EMU:
+            heap_cur = &heap_emu;
+            break;
+        case HEAP_HIGH:
+            heap_cur = &heap_high;
+            break;
+    }
+
+    ptr = *heap_cur;
+    *heap_cur += pages * PAGE_SIZE;
+    printk(2, "%s: %d page(s) at %p (for %s)\n",
+           __FUNCTION__, pages, ptr, purpose);
+    memset(ptr, 0, pages * PAGE_SIZE);
+    return ptr;
+}
+
+void *get_memory(int bytes, const char *purpose)
+{
+    int pages = (bytes + PAGE_SIZE -1) / PAGE_SIZE;
+    return get_pages(pages, purpose);
+}
+
+void paging_start(struct xen_cpu *cpu)
+{
+    ureg_t cr3_mfn;
+
+    cr3_mfn = cpu->init_ctxt->ctrlreg[3] >> PAGE_SHIFT;
+    update_emu_mappings(cr3_mfn);
+    pv_write_cr3(cpu, cr3_mfn);
+}
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [Qemu-devel] [PATCH 24/40] xenner: kernel: printk
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (22 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 23/40] xenner: kernel: generic MM functionality Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 25/40] xenner: kernel: KVM PV code Alexander Graf
                   ` (15 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds a printk implementation for xenner.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
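Note for reviewers: printk() builds its output in two steps, a "<level>"
prefix followed by the formatted message, both in one fixed buffer. A
minimal sketch of that pattern using the hosted libc snprintf/vsnprintf
follows. sketch_format is a hypothetical helper, not part of this patch,
and it assumes each append may use only the space left after the previous
one.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format "<level>" followed by the message into
 * buf without writing more than size bytes. Returns the number of
 * characters stored, not counting the terminating NUL.
 */
int sketch_format(char *buf, size_t size, int level, const char *fmt, ...)
{
    va_list args;
    size_t used;
    int n;

    if (size == 0) {
        return 0;
    }

    n = snprintf(buf, size, "<%d>", level);
    if (n < 0) {
        buf[0] = '\0';
        return 0;
    }
    /* snprintf reports the untruncated length; clamp to what fit. */
    used = ((size_t)n >= size) ? size - 1 : (size_t)n;

    va_start(args, fmt);
    /* Only the remaining space is offered to the second call. */
    n = vsnprintf(buf + used, size - used, fmt, args);
    va_end(args);
    if (n > 0) {
        size_t room = size - used;

        used += ((size_t)n >= room) ? room - 1 : (size_t)n;
    }
    return (int)used;
}
```

With a 16-byte buffer, an overlong message is cut so that prefix plus text
plus NUL still fit in the buffer.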
 pc-bios/xenner/printk.c |  682 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 682 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/printk.c

diff --git a/pc-bios/xenner/printk.c b/pc-bios/xenner/printk.c
new file mode 100644
index 0000000..a3ce4c9
--- /dev/null
+++ b/pc-bios/xenner/printk.c
@@ -0,0 +1,682 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  printk implementation (mostly from linux)
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdarg.h>
+#include <stddef.h>
+#include <inttypes.h>
+
+#include "xenner.h"
+
+/* --- do_div borrowed from linux --- */
+
+#ifdef CONFIG_64BIT
+
+#define do_div(n,base) ({                                   \
+    uint32_t __base = (base);                               \
+    uint32_t __rem;                                         \
+    __rem = ((uint64_t)(n)) % __base;                       \
+    (n) = ((uint64_t)(n)) / __base;                         \
+    __rem;                                                  \
+})
+
+#else
+
+#define do_div(n,base) ({                                   \
+    unsigned long __upper, __low, __high, __mod, __base;    \
+    __base = (base);                                        \
+    asm(""                                                  \
+        : "=a" (__low), "=d" (__high)                       \
+        : "A" (n));                                         \
+    __upper = __high;                                       \
+    if (__high) {                                           \
+            __upper = __high % (__base);                    \
+            __high = __high / (__base);                     \
+    }                                                       \
+    asm("divl %2"                                           \
+        : "=a" (__low), "=d" (__mod)                        \
+        : "rm" (__base), "0" (__low), "1" (__upper));       \
+    asm(""                                                  \
+        : "=A" (n)                                          \
+        : "a" (__low), "d" (__high));                       \
+    __mod;                                                  \
+})
+
+#endif
+
+#define likely(x)   (x)
+#define unlikely(x) (x)
+#define WARN_ON(x)  do {} while(0)
+
+/* --- ctype borrowed from linux --- */
+
+#define _U        0x01        /* upper */
+#define _L        0x02        /* lower */
+#define _D        0x04        /* digit */
+#define _C        0x08        /* cntrl */
+#define _P        0x10        /* punct */
+#define _S        0x20        /* white space (space/lf/tab) */
+#define _X        0x40        /* hex digit */
+#define _SP       0x80        /* hard space (0x20) */
+
+unsigned char _ctype[] = {
+    _C,_C,_C,_C,_C,_C,_C,_C,                               /* 0-7 */
+    _C,_C|_S,_C|_S,_C|_S,_C|_S,_C|_S,_C,_C,                /* 8-15 */
+    _C,_C,_C,_C,_C,_C,_C,_C,                               /* 16-23 */
+    _C,_C,_C,_C,_C,_C,_C,_C,                               /* 24-31 */
+    _S|_SP,_P,_P,_P,_P,_P,_P,_P,                           /* 32-39 */
+    _P,_P,_P,_P,_P,_P,_P,_P,                               /* 40-47 */
+    _D,_D,_D,_D,_D,_D,_D,_D,                               /* 48-55 */
+    _D,_D,_P,_P,_P,_P,_P,_P,                               /* 56-63 */
+    _P,_U|_X,_U|_X,_U|_X,_U|_X,_U|_X,_U|_X,_U,             /* 64-71 */
+    _U,_U,_U,_U,_U,_U,_U,_U,                               /* 72-79 */
+    _U,_U,_U,_U,_U,_U,_U,_U,                               /* 80-87 */
+    _U,_U,_U,_P,_P,_P,_P,_P,                               /* 88-95 */
+    _P,_L|_X,_L|_X,_L|_X,_L|_X,_L|_X,_L|_X,_L,             /* 96-103 */
+    _L,_L,_L,_L,_L,_L,_L,_L,                               /* 104-111 */
+    _L,_L,_L,_L,_L,_L,_L,_L,                               /* 112-119 */
+    _L,_L,_L,_P,_P,_P,_P,_C,                               /* 120-127 */
+    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,                       /* 128-143 */
+    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,                       /* 144-159 */
+    _S|_SP,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,   /* 160-175 */
+    _P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,_P,       /* 176-191 */
+    _U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,_U,       /* 192-207 */
+    _U,_U,_U,_U,_U,_U,_U,_P,_U,_U,_U,_U,_U,_U,_U,_L,       /* 208-223 */
+    _L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,_L,       /* 224-239 */
+    _L,_L,_L,_L,_L,_L,_L,_P,_L,_L,_L,_L,_L,_L,_L,_L        /* 240-255 */
+};
+
+#define __ismask(x) (_ctype[(int)(unsigned char)(x)])
+
+#define isalnum(c)        ((__ismask(c)&(_U|_L|_D)) != 0)
+#define isalpha(c)        ((__ismask(c)&(_U|_L)) != 0)
+#define iscntrl(c)        ((__ismask(c)&(_C)) != 0)
+#define isdigit(c)        ((__ismask(c)&(_D)) != 0)
+#define isgraph(c)        ((__ismask(c)&(_P|_U|_L|_D)) != 0)
+#define islower(c)        ((__ismask(c)&(_L)) != 0)
+#define isprint(c)        ((__ismask(c)&(_P|_U|_L|_D|_SP)) != 0)
+#define ispunct(c)        ((__ismask(c)&(_P)) != 0)
+#define isspace(c)        ((__ismask(c)&(_S)) != 0)
+#define isupper(c)        ((__ismask(c)&(_U)) != 0)
+#define isxdigit(c)       ((__ismask(c)&(_D|_X)) != 0)
+
+#define isascii(c)        (((unsigned char)(c))<=0x7f)
+#define toascii(c)        (((unsigned char)(c))&0x7f)
+
+static inline unsigned char __tolower(unsigned char c)
+{
+    if (isupper(c)) {
+        c -= 'A'-'a';
+    }
+    return c;
+}
+
+static inline unsigned char __toupper(unsigned char c)
+{
+    if (islower(c)) {
+        c -= 'a'-'A';
+    }
+    return c;
+}
+
+#define tolower(c) __tolower(c)
+#define toupper(c) __toupper(c)
+
+/* --- *printf borrowed from linux --- */
+
+static int skip_atoi(const char **s)
+{
+    int i = 0;
+
+    while (isdigit(**s)) {
+            i = i * 10 + *((*s)++) - '0';
+    }
+
+    return i;
+}
+
+static char* put_dec_trunc(char *buf, unsigned q)
+{
+    unsigned d3, d2, d1, d0;
+    d1 = (q >> 4) & 0xf;
+    d2 = (q >> 8) & 0xf;
+    d3 = (q >> 12);
+
+    d0 = 6 * (d3 + d2 + d1) + (q & 0xf);
+    q = (d0 * 0xcd) >> 11;
+    d0 = d0 - 10 * q;
+    *buf++ = d0 + '0'; /* least significant digit */
+    d1 = q + 9 * d3 + 5 * d2 + d1;
+    if (d1 != 0) {
+        q = (d1 * 0xcd) >> 11;
+        d1 = d1 - 10 * q;
+        *buf++ = d1 + '0'; /* next digit */
+
+        d2 = q + 2 * d2;
+        if ((d2 != 0) || (d3 != 0)) {
+            q = (d2 * 0xd) >> 7;
+            d2 = d2 - 10 * q;
+            *buf++ = d2 + '0'; /* next digit */
+
+            d3 = q + 4 * d3;
+            if (d3 != 0) {
+                q = (d3 * 0xcd) >> 11;
+                d3 = d3 - 10 * q;
+                *buf++ = d3 + '0';  /* next digit */
+                if (q != 0) {
+                    *buf++ = q + '0';  /* most sign. digit */
+                }
+            }
+        }
+    }
+    return buf;
+}
+
+static char* put_dec_full(char *buf, unsigned q)
+{
+    /* BTW, if q is in [0,9999], 8-bit ints will be enough, */
+    /* but anyway, gcc produces better code with full-sized ints */
+    unsigned d3, d2, d1, d0;
+
+    d1 = (q >> 4) & 0xf;
+    d2 = (q >> 8) & 0xf;
+    d3 = (q >> 12);
+
+    d0 = 6 * (d3 + d2 + d1) + (q & 0xf);
+    q = (d0 * 0xcd) >> 11;
+    d0 = d0 - 10 * q;
+    *buf++ = d0 + '0';
+    d1 = q + 9 * d3 + 5 * d2 + d1;
+    q = (d1 * 0xcd) >> 11;
+    d1 = d1 - 10 * q;
+    *buf++ = d1 + '0';
+
+    d2 = q + 2 * d2;
+    q = (d2 * 0xd) >> 7;
+    d2 = d2 - 10 * q;
+    *buf++ = d2 + '0';
+
+    d3 = q + 4 * d3;
+    q = (d3 * 0xcd) >> 11;
+    d3 = d3 - 10 * q;
+    *buf++ = d3 + '0';
+    *buf++ = q + '0';
+
+    return buf;
+}
+
+static char* put_dec(char *buf, unsigned long long num)
+{
+    while (1) {
+            unsigned rem;
+            if (num < 100000)
+                    return put_dec_trunc(buf, num);
+            rem = do_div(num, 100000);
+            buf = put_dec_full(buf, rem);
+    }
+}
+
+#define ZEROPAD     1                /* pad with zero */
+#define SIGN        2                /* unsigned/signed long */
+#define PLUS        4                /* show plus */
+#define SPACE       8                /* space if plus */
+#define LEFT        16               /* left justified */
+#define SPECIAL     32               /* 0x */
+#define LARGE       64               /* use 'ABCDEF' instead of 'abcdef' */
+
+static char *number(char *buf, char *end, unsigned long long num, int base,
+                    int size, int precision, int type)
+{
+    char sign,tmp[66];
+    const char *digits;
+    /* we are called with base 8, 10 or 16, only, thus don't need "g..."  */
+    static const char small_digits[] = "0123456789abcdefx";
+    static const char large_digits[] = "0123456789ABCDEFX";
+    int need_pfx = ((type & SPECIAL) && base != 10);
+    int i;
+
+    digits = (type & LARGE) ? large_digits : small_digits;
+    if (type & LEFT) {
+        type &= ~ZEROPAD;
+    }
+    if (base < 2 || base > 36) {
+        return NULL;
+    }
+    sign = 0;
+    if (type & SIGN) {
+        if ((signed long long) num < 0) {
+            sign = '-';
+            num = - (signed long long) num;
+            size--;
+        } else if (type & PLUS) {
+            sign = '+';
+            size--;
+        } else if (type & SPACE) {
+            sign = ' ';
+            size--;
+        }
+    }
+    if (need_pfx) {
+        size--;
+        if (base == 16) {
+            size--;
+        }
+    }
+
+    /* generate full string in tmp[], in reverse order */
+    i = 0;
+    if (num == 0) {
+        tmp[i++] = '0';
+    } else if (base != 10) { /* 8 or 16 */
+        int mask = base - 1;
+        int shift = 3;
+        if (base == 16) {
+            shift = 4;
+        }
+        do {
+            tmp[i++] = digits[((unsigned char)num) & mask];
+            num >>= shift;
+        } while (num);
+    } else { /* base 10 */
+        i = put_dec(tmp, num) - tmp;
+    }
+
+    /* printing 100 using %2d gives "100", not "00" */
+    if (i > precision) {
+            precision = i;
+    }
+    /* leading space padding */
+    size -= precision;
+    if (!(type & (ZEROPAD+LEFT))) {
+        while(--size >= 0) {
+            if (buf < end) {
+                *buf = ' ';
+            }
+            ++buf;
+        }
+    }
+    /* sign */
+    if (sign) {
+        if (buf < end) {
+            *buf = sign;
+        }
+        ++buf;
+    }
+    /* "0x" / "0" prefix */
+    if (need_pfx) {
+        if (buf < end) {
+            *buf = '0';
+        }
+        ++buf;
+        if (base == 16) {
+            if (buf < end) {
+                *buf = digits[16]; /* for arbitrary base: digits[33]; */
+            }
+            ++buf;
+        }
+    }
+    /* zero or space padding */
+    if (!(type & LEFT)) {
+        char c = (type & ZEROPAD) ? '0' : ' ';
+        while (--size >= 0) {
+            if (buf < end) {
+                *buf = c;
+            }
+            ++buf;
+        }
+    }
+    /* hmm even more zero padding? */
+    while (i <= --precision) {
+        if (buf < end) {
+            *buf = '0';
+        }
+        ++buf;
+    }
+    /* actual digits of result */
+    while (--i >= 0) {
+        if (buf < end) {
+            *buf = tmp[i];
+        }
+        ++buf;
+    }
+    /* trailing space padding */
+    while (--size >= 0) {
+        if (buf < end) {
+            *buf = ' ';
+        }
+        ++buf;
+    }
+    return buf;
+}
+
+static size_t strnlen(const char *s, size_t count)
+{
+    const char *sc;
+
+    for (sc = s; count-- && *sc != '\0'; ++sc) {
+            /* nothing */
+    }
+    return sc - s;
+}
+
+static int vsnprintf(char *buf, size_t size, const char *fmt, va_list args)
+{
+    int len;
+    unsigned long long num;
+    int i, base;
+    char *str, *end, c;
+    const char *s;
+
+    int flags;              /* flags to number() */
+
+    int field_width;        /* width of output field */
+    int precision;          /* min. # of digits for integers; max
+                               number of chars from string */
+    int qualifier;          /* 'h', 'l', or 'L' for integer fields */
+                            /* 'z' support added 23/7/1999 S.H.    */
+                            /* 'z' changed to 'Z' --davidm 1/25/99 */
+                            /* 't' added for ptrdiff_t */
+
+    /* Reject out-of-range values early.  Large positive sizes are
+       used for unknown buffer sizes. */
+    if (unlikely((int) size < 0)) {
+        /* There can be only one.. */
+        static char warn = 1;
+        WARN_ON(warn);
+        warn = 0;
+        return 0;
+    }
+
+    str = buf;
+    end = buf + size;
+
+    /* Make sure end is always >= buf */
+    if (end < buf) {
+        end = ((void *)-1);
+        size = end - buf;
+    }
+
+    for (; *fmt ; ++fmt) {
+        if (*fmt != '%') {
+            if (str < end) {
+                *str = *fmt;
+            }
+            ++str;
+            continue;
+        }
+
+        /* process flags */
+        flags = 0;
+        repeat:
+            ++fmt;                /* this also skips first '%' */
+            switch (*fmt) {
+                case '-': flags |= LEFT; goto repeat;
+                case '+': flags |= PLUS; goto repeat;
+                case ' ': flags |= SPACE; goto repeat;
+                case '#': flags |= SPECIAL; goto repeat;
+                case '0': flags |= ZEROPAD; goto repeat;
+            }
+
+        /* get field width */
+        field_width = -1;
+        if (isdigit(*fmt)) {
+            field_width = skip_atoi(&fmt);
+        } else if (*fmt == '*') {
+            ++fmt;
+            /* it's the next argument */
+            field_width = va_arg(args, int);
+            if (field_width < 0) {
+                field_width = -field_width;
+                flags |= LEFT;
+            }
+        }
+
+        /* get the precision */
+        precision = -1;
+        if (*fmt == '.') {
+            ++fmt;
+            if (isdigit(*fmt)) {
+                precision = skip_atoi(&fmt);
+            } else if (*fmt == '*') {
+                ++fmt;
+                /* it's the next argument */
+                precision = va_arg(args, int);
+            }
+            if (precision < 0) {
+                precision = 0;
+            }
+        }
+
+        /* get the conversion qualifier */
+        qualifier = -1;
+        if (*fmt == 'h' || *fmt == 'l' || *fmt == 'L' ||
+            *fmt =='Z' || *fmt == 'z' || *fmt == 't') {
+            qualifier = *fmt;
+            ++fmt;
+            if (qualifier == 'l' && *fmt == 'l') {
+                qualifier = 'L';
+                ++fmt;
+            }
+        }
+
+        /* default base */
+        base = 10;
+
+        switch (*fmt) {
+            case 'c':
+                if (!(flags & LEFT)) {
+                    while (--field_width > 0) {
+                        if (str < end) {
+                            *str = ' ';
+                        }
+                        ++str;
+                    }
+                }
+                c = (unsigned char) va_arg(args, int);
+                if (str < end) {
+                    *str = c;
+                }
+                ++str;
+                while (--field_width > 0) {
+                    if (str < end) {
+                        *str = ' ';
+                    }
+                    ++str;
+                }
+                continue;
+
+            case 's':
+                s = va_arg(args, char *);
+                if ((unsigned long)s < PAGE_SIZE) {
+                    s = "<NULL>";
+                }
+
+                len = strnlen(s, precision);
+
+                if (!(flags & LEFT)) {
+                    while (len < field_width--) {
+                        if (str < end) {
+                            *str = ' ';
+                        }
+                        ++str;
+                    }
+                }
+                for (i = 0; i < len; ++i) {
+                    if (str < end) {
+                        *str = *s;
+                    }
+                    ++str; ++s;
+                }
+                while (len < field_width--) {
+                    if (str < end) {
+                        *str = ' ';
+                    }
+                    ++str;
+                }
+                continue;
+
+            case 'p':
+                if (field_width == -1) {
+                    field_width = 2*sizeof(void *);
+                    flags |= ZEROPAD;
+                }
+                str = number(str, end,
+                             (unsigned long) va_arg(args, void *),
+                             16, field_width, precision, flags);
+                continue;
+
+
+            case 'n':
+                /* FIXME:
+                * What does C99 say about the overflow case here? */
+                if (qualifier == 'l') {
+                    long * ip = va_arg(args, long *);
+                    *ip = (str - buf);
+                } else if (qualifier == 'Z' || qualifier == 'z') {
+                    size_t * ip = va_arg(args, size_t *);
+                    *ip = (str - buf);
+                } else {
+                    int * ip = va_arg(args, int *);
+                    *ip = (str - buf);
+                }
+                continue;
+
+            case '%':
+                if (str < end) {
+                    *str = '%';
+                }
+                ++str;
+                continue;
+
+            /* integer number formats - set up the flags and "break" */
+            case 'o':
+                base = 8;
+                break;
+
+            case 'X':
+                flags |= LARGE;
+            case 'x':
+                base = 16;
+                break;
+
+            case 'd':
+            case 'i':
+                flags |= SIGN;
+            case 'u':
+                break;
+
+            default:
+                if (str < end) {
+                    *str = '%';
+                }
+                ++str;
+                if (*fmt) {
+                    if (str < end) {
+                        *str = *fmt;
+                    }
+                    ++str;
+                } else {
+                    --fmt;
+                }
+                continue;
+        }
+        if (qualifier == 'L') {
+            num = va_arg(args, long long);
+        } else if (qualifier == 'l') {
+            num = va_arg(args, unsigned long);
+            if (flags & SIGN) {
+                num = (signed long) num;
+            }
+        } else if (qualifier == 'Z' || qualifier == 'z') {
+            num = va_arg(args, size_t);
+        } else if (qualifier == 't') {
+            num = va_arg(args, ptrdiff_t);
+        } else if (qualifier == 'h') {
+            num = (unsigned short) va_arg(args, int);
+            if (flags & SIGN) {
+                num = (signed short) num;
+            }
+        } else {
+            num = va_arg(args, unsigned int);
+            if (flags & SIGN) {
+                num = (signed int) num;
+            }
+        }
+        str = number(str, end, num, base,
+                     field_width, precision, flags);
+    }
+    if (size > 0) {
+        if (str < end) {
+            *str = '\0';
+        } else {
+            end[-1] = '\0';
+        }
+    }
+    /* the trailing null byte doesn't count towards the total */
+    return str-buf;
+}
+
+int vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
+{
+    int i;
+
+    i = vsnprintf(buf,size,fmt,args);
+    return (i >= size) ? (size - 1) : i;
+}
+
+int snprintf(char * buf, size_t size, const char *fmt, ...)
+{
+    va_list args;
+    int i;
+
+    va_start(args, fmt);
+    i = vsnprintf(buf,size,fmt,args);
+    va_end(args);
+    return i;
+}
+
+void write_string(char *msg)
+{
+    int i;
+
+    for (i = 0; msg[i]; i++) {
+        emudev_cmd(EMUDEV_CMD_WRITE_CHAR, msg[i]);
+    }
+}
+
+int printk(int level, const char *fmt, ...)
+{
+    char buf[256];
+    va_list args;
+    int i = 0;
+
+    if (level > vmconf.debug_level) {
+        return 0;
+    }
+
+    i += snprintf(buf+i, sizeof(buf), "<%d>", level);
+    va_start(args, fmt);
+    i += vscnprintf(buf+i, sizeof(buf)-i, fmt, args);
+    va_end(args);
+    write_string(buf);
+    return i;
+}
+
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [Qemu-devel] [PATCH 25/40] xenner: kernel: KVM PV code
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (23 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 24/40] xenner: kernel: printk Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 26/40] xenner: kernel: xen-names Alexander Graf
                   ` (14 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Xenner uses KVM's PV functionality for timekeeping. If we don't find
KVM clocksource support, we try to emulate it as well as we can.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
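Note for reviewers: a guest consumes the per-vcpu time record roughly as
sketched below. It retries while the version field is odd or changes during
the read, then adds the scaled TSC delta to system_time. This is a sketch
under assumptions, not code from this patch: the sketch_* names and the
simplified struct are hypothetical (the real record has padding fields),
and the memory barriers a real reader needs are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the per-vcpu time record (no padding fields). */
struct sketch_time_info {
    uint32_t version;            /* odd while an update is in progress */
    uint64_t tsc_timestamp;      /* TSC value at the last update */
    uint64_t system_time;        /* nanoseconds at the last update */
    uint32_t tsc_to_system_mul;  /* 32.32 fixed-point multiplier */
    int8_t   tsc_shift;          /* pre-shift applied to the TSC delta */
};

/* (delta shifted by `shift`) * mul >> 32, without 128-bit arithmetic. */
uint64_t sketch_scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    if (shift < 0) {
        delta >>= -shift;
    } else {
        delta <<= shift;
    }
    return (delta >> 32) * mul + (((delta & 0xffffffffu) * mul) >> 32);
}

/* Read a consistent timestamp; tsc_now stands in for rdtsc(). */
uint64_t sketch_read_time(const volatile struct sketch_time_info *t,
                          uint64_t tsc_now)
{
    uint32_t v;
    uint64_t ns;

    do {
        v = t->version;
        /* real code needs a read barrier here and before the re-check */
        ns = t->system_time +
             sketch_scale_delta(tsc_now - t->tsc_timestamp,
                                t->tsc_to_system_mul, t->tsc_shift);
    } while ((v & 1) || v != t->version);

    return ns;
}
```

If the guest uses this formula, the fallback values set in pv_clock_sys()
(mul 1, shift 0) scale any realistic delta to zero, so guest time would
only move when pv_clock_update() rewrites system_time.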
 pc-bios/xenner/xenner-pv.c |  186 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 186 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xenner-pv.c

diff --git a/pc-bios/xenner/xenner-pv.c b/pc-bios/xenner/xenner-pv.c
new file mode 100644
index 0000000..98218a9
--- /dev/null
+++ b/pc-bios/xenner/xenner-pv.c
@@ -0,0 +1,186 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner KVM PV integration
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "xenner.h"
+
+/* --------------------------------------------------------------------- */
+
+const char *feature_bits[32] = {
+    [ KVM_FEATURE_CLOCKSOURCE  ] = "clocksource",
+    [ KVM_FEATURE_NOP_IO_DELAY ] = "nop-iodelay",
+    [ KVM_FEATURE_MMU_OP       ] = "mmu-op",
+};
+
+int pv_have_clock;
+
+/* --------------------------------------------------------------------- */
+
+void pv_clock_update(int wakeup)
+{
+    static int wakeups;
+    static int update;
+
+    if (wakeup) {
+        /* after halt() -- update clock unconditionally */
+        update = 1;
+        wakeups++;
+    } else {
+        /* timer irq -- update only if needed */
+        update = (0 == wakeups);
+        wakeups = 0;
+    }
+
+    /* vmexit to userspace so xenner has a chance to update systime */
+    if (update) {
+        if (pv_have_clock) {
+            emudev_cmd(EMUDEV_CMD_NOP, 0);
+        } else {
+            struct xen_cpu *cpu = get_cpu();
+
+            cpu->v.vcpu_info->time.tsc_timestamp = rdtsc();
+            cpu->v.vcpu_info->time.system_time =
+                cpu->v.vcpu_info->time.tsc_timestamp;
+
+            cpu->v.vcpu_info->time.version+=2;
+            cpu->v.vcpu_info->time.tsc_timestamp = rdtsc();
+            cpu->v.vcpu_info->time.system_time =
+                cpu->v.vcpu_info->time.tsc_timestamp;
+        }
+    }
+}
+
+static void pv_clock_wall(void)
+{
+    uint64_t wall = EMU_PA(&shared_info.wc_version);
+
+    printk(1, "%s: register wall clock at 0x%" PRIx64 "\n",
+           __FUNCTION__, wall);
+
+    if (pv_have_clock) {
+        if (wrmsrl_safe(MSR_KVM_WALL_CLOCK, wall)) {
+            panic("MSR_KVM_WALL_CLOCK wrmsr failed", NULL);
+        }
+    } else {
+        shared_info.wc_version = 4;
+        shared_info.wc_sec = 0;
+        shared_info.wc_nsec = 0;
+    }
+
+    printk(1, "%s: v%d %d.%09d\n", __FUNCTION__,
+           shared_info.wc_version,
+           shared_info.wc_sec,
+           shared_info.wc_nsec);
+}
+
+void pv_clock_sys(struct xen_cpu *cpu)
+{
+    uint64_t sys = cpu->v.vcpu_info_pa + offsetof(struct vcpu_info, time);
+
+    printk(1, "%s: register vcpu %d clock at 0x%" PRIx64 "\n",
+           __FUNCTION__, cpu->id, sys);
+
+    if (pv_have_clock) {
+        if (wrmsrl_safe(MSR_KVM_SYSTEM_TIME, sys | 1)) {
+            panic("MSR_KVM_SYSTEM_TIME wrmsr failed", NULL);
+        }
+    } else {
+        /* fake data */
+        cpu->v.vcpu_info->time.tsc_to_system_mul = 1;
+        cpu->v.vcpu_info->time.version = 2;
+        cpu->v.vcpu_info->time.tsc_timestamp = rdtsc();
+        cpu->v.vcpu_info->time.system_time =
+            cpu->v.vcpu_info->time.tsc_timestamp;
+        cpu->v.vcpu_info->time.tsc_shift = 0;
+    }
+
+    printk(1, "%s: v%d sys %" PRIu64 " tsc %" PRIu64 " mul %u shift %d\n",
+           __FUNCTION__,
+           cpu->v.vcpu_info->time.version,
+           cpu->v.vcpu_info->time.system_time,
+           cpu->v.vcpu_info->time.tsc_timestamp,
+           cpu->v.vcpu_info->time.tsc_to_system_mul,
+           cpu->v.vcpu_info->time.tsc_shift);
+}
+
+/* --------------------------------------------------------------------- */
+
+void pv_write_cr3(struct xen_cpu *cpu, ureg_t cr3_mfn)
+{
+    ureg_t cr3 = frame_to_addr(cr3_mfn);
+
+#ifdef CONFIG_64BIT
+    if (cpu->user_mode) {
+        cpu->user_cr3_mfn = cr3_mfn;
+    } else {
+        cpu->kernel_cr3_mfn = cr3_mfn;
+    }
+#else
+    cpu->cr3_mfn = cr3_mfn;
+#endif
+
+    vminfo.faults[XEN_FAULT_OTHER_CR3_LOAD]++;
+    write_cr3(cr3);
+    return;
+}
+
+/* --------------------------------------------------------------------- */
+
+void pv_init(struct xen_cpu *cpu)
+{
+    char buf[128];
+    struct kvm_cpuid_entry entry;
+    uint32_t sig[3];
+    uint32_t features;
+
+    entry.function = KVM_CPUID_SIGNATURE;
+    real_cpuid(&entry);
+    sig[0] = entry.ebx;
+    sig[1] = entry.ecx;
+    sig[2] = entry.edx;
+    if (0 != memcmp((char*)sig, "KVMKVMKVM", 10)) {
+        printk(1, "%s: no kvm signature: \"%.12s\"\n",
+               __FUNCTION__, (char*)sig);
+        goto no_kvm;
+    }
+
+    entry.function = KVM_CPUID_FEATURES;
+    real_cpuid(&entry);
+    features = entry.eax;
+
+    snprintf(buf, sizeof(buf), "%s: cpu %d, signature \"%.12s\", features 0x%08x",
+             __FUNCTION__, cpu->id, (char*)sig, features);
+    print_bits(1, buf, features, features, feature_bits);
+
+    /* pv clocksource */
+    if (features & (1 << KVM_FEATURE_CLOCKSOURCE)) {
+        pv_have_clock = 1;
+    }
+
+no_kvm:
+    pv_clock_sys(cpu);
+    if (cpu->id == 0) {
+        pv_clock_wall();
+    }
+}
+
+/* --------------------------------------------------------------------- */
+
-- 
1.6.0.2

^ permalink raw reply related	[flat|nested] 96+ messages in thread

* [Qemu-devel] [PATCH 26/40] xenner: kernel: xen-names
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (24 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 25/40] xenner: kernel: KVM PV code Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 27/40] xenner: add xc_dom.h Alexander Graf
                   ` (13 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

To resolve various names, we keep a generated version of xen-names around.
This helps with debug output.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
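Note for reviewers: the generated tables use designated initializers, so
they can have holes (NULL entries) for numbers without a name. A lookup
therefore has to check both the bounds and the entry itself. The sketch
below shows one way to do that. The sketch_* names, the example enum values
and the "UNKNOWN" fallback string are hypothetical, not taken from this
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical operation numbers with a gap at 2. */
enum { SKETCH_OP_YIELD = 0, SKETCH_OP_BLOCK = 1, SKETCH_OP_POLL = 3 };

/* Same layout as the generated tables: index is the operation number. */
const char *sketch_names[] = {
    [SKETCH_OP_YIELD] = "yield",
    [SKETCH_OP_BLOCK] = "block",
    [SKETCH_OP_POLL]  = "poll",
};
const int sketch_count = sizeof(sketch_names) / sizeof(sketch_names[0]);

/* Return the name for nr, or "UNKNOWN" for gaps and out-of-range values. */
const char *sketch_lookup(const char *const *names, int count, int nr)
{
    if (nr < 0 || nr >= count || names[nr] == NULL) {
        return "UNKNOWN";
    }
    return names[nr];
}
```

The count is the highest initialized index plus one (4 here), so slot 2
exists but holds NULL.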
 pc-bios/xenner/xen-names.c |  141 ++++++++++++++++++++++++++++++++++++++++++++
 pc-bios/xenner/xen-names.h |   68 +++++++++++++++++++++
 2 files changed, 209 insertions(+), 0 deletions(-)
 create mode 100644 pc-bios/xenner/xen-names.c
 create mode 100644 pc-bios/xenner/xen-names.h

diff --git a/pc-bios/xenner/xen-names.c b/pc-bios/xenner/xen-names.c
new file mode 100644
index 0000000..3448539
--- /dev/null
+++ b/pc-bios/xenner/xen-names.c
@@ -0,0 +1,141 @@
+/*
+ * don't edit, generated by xen/nametable.sh
+ */
+
+#include <inttypes.h>
+
+#include <xen/xen.h>
+#include <xen/version.h>
+#include <xen/sched.h>
+#include <xen/memory.h>
+#include <xen/nmi.h>
+#include <xen/callback.h>
+#include <xen/physdev.h>
+#include <xen/vcpu.h>
+#include <xen/grant_table.h>
+
+/* --- __HYPERVISOR --- */
+const char *__hypervisor_names[] = {
+    [ __HYPERVISOR_multicall           ] = "multicall",
+    [ __HYPERVISOR_iret                ] = "iret",
+    [ __HYPERVISOR_sysctl              ] = "sysctl",
+    [ __HYPERVISOR_domctl              ] = "domctl",
+    [ __HYPERVISOR_mmuext_op           ] = "mmuext_op",
+};
+const int __hypervisor_count = sizeof(__hypervisor_names)/sizeof(__hypervisor_names[0]);
+
+/* --- XENVER --- */
+const char *xenver_names[] = {
+    [ XENVER_version                   ] = "version",
+    [ XENVER_extraversion              ] = "extraversion",
+    [ XENVER_capabilities              ] = "capabilities",
+    [ XENVER_changeset                 ] = "changeset",
+    [ XENVER_pagesize                  ] = "pagesize",
+    [ XENVER_commandline               ] = "commandline",
+};
+const int xenver_count = sizeof(xenver_names)/sizeof(xenver_names[0]);
+
+/* --- VMASST_TYPE --- */
+const char *vmasst_type_names[] = {
+};
+const int vmasst_type_count = sizeof(vmasst_type_names)/sizeof(vmasst_type_names[0]);
+
+/* --- VMASST_CMD --- */
+const char *vmasst_cmd_names[] = {
+    [ VMASST_CMD_enable                ] = "enable",
+    [ VMASST_CMD_disable               ] = "disable",
+};
+const int vmasst_cmd_count = sizeof(vmasst_cmd_names)/sizeof(vmasst_cmd_names[0]);
+
+/* --- SCHEDOP --- */
+const char *schedop_names[] = {
+    [ SCHEDOP_yield                    ] = "yield",
+    [ SCHEDOP_block                    ] = "block",
+    [ SCHEDOP_shutdown                 ] = "shutdown",
+    [ SCHEDOP_poll                     ] = "poll",
+};
+const int schedop_count = sizeof(schedop_names)/sizeof(schedop_names[0]);
+
+/* --- CONSOLEIO --- */
+const char *consoleio_names[] = {
+    [ CONSOLEIO_write                  ] = "write",
+    [ CONSOLEIO_read                   ] = "read",
+};
+const int consoleio_count = sizeof(consoleio_names)/sizeof(consoleio_names[0]);
+
+/* --- XENMEM --- */
+const char *xenmem_names[] = {
+    [ XENMEM_exchange                  ] = "exchange",
+};
+const int xenmem_count = sizeof(xenmem_names)/sizeof(xenmem_names[0]);
+
+/* --- XENNMI --- */
+const char *xennmi_names[] = {
+};
+const int xennmi_count = sizeof(xennmi_names)/sizeof(xennmi_names[0]);
+
+/* --- CALLBACKOP --- */
+const char *callbackop_names[] = {
+    [ CALLBACKOP_register              ] = "register",
+    [ CALLBACKOP_unregister            ] = "unregister",
+};
+const int callbackop_count = sizeof(callbackop_names)/sizeof(callbackop_names[0]);
+
+/* --- CALLBACKTYPE --- */
+const char *callbacktype_names[] = {
+    [ CALLBACKTYPE_event               ] = "event",
+    [ CALLBACKTYPE_failsafe            ] = "failsafe",
+    [ CALLBACKTYPE_syscall             ] = "syscall",
+    [ CALLBACKTYPE_nmi                 ] = "nmi",
+    [ CALLBACKTYPE_sysenter            ] = "sysenter",
+    [ CALLBACKTYPE_syscall32           ] = "syscall32",
+};
+const int callbacktype_count = sizeof(callbacktype_names)/sizeof(callbacktype_names[0]);
+
+/* --- MMUEXT --- */
+const char *mmuext_names[] = {
+};
+const int mmuext_count = sizeof(mmuext_names)/sizeof(mmuext_names[0]);
+
+/* --- PHYSDEVOP --- */
+const char *physdevop_names[] = {
+    [ PHYSDEVOP_eoi                    ] = "eoi",
+};
+const int physdevop_count = sizeof(physdevop_names)/sizeof(physdevop_names[0]);
+
+/* --- VCPUOP --- */
+const char *vcpuop_names[] = {
+    [ VCPUOP_initialise                ] = "initialise",
+    [ VCPUOP_up                        ] = "up",
+    [ VCPUOP_down                      ] = "down",
+};
+const int vcpuop_count = sizeof(vcpuop_names)/sizeof(vcpuop_names[0]);
+
+/* --- EVTCHNOP --- */
+const char *evtchnop_names[] = {
+    [ EVTCHNOP_close                   ] = "close",
+    [ EVTCHNOP_send                    ] = "send",
+    [ EVTCHNOP_status                  ] = "status",
+    [ EVTCHNOP_unmask                  ] = "unmask",
+    [ EVTCHNOP_reset                   ] = "reset",
+};
+const int evtchnop_count = sizeof(evtchnop_names)/sizeof(evtchnop_names[0]);
+
+/* --- VIRQ --- */
+const char *virq_names[] = {
+    [ VIRQ_TIMER                       ] = "TIMER",
+    [ VIRQ_DEBUG                       ] = "DEBUG",
+    [ VIRQ_CONSOLE                     ] = "CONSOLE",
+    [ VIRQ_TBUF                        ] = "TBUF",
+    [ VIRQ_DEBUGGER                    ] = "DEBUGGER",
+    [ VIRQ_XENOPROF                    ] = "XENOPROF",
+};
+const int virq_count = sizeof(virq_names)/sizeof(virq_names[0]);
+
+/* --- GNTTABOP --- */
+const char *gnttabop_names[] = {
+    [ GNTTABOP_transfer                ] = "transfer",
+    [ GNTTABOP_copy                    ] = "copy",
+};
+const int gnttabop_count = sizeof(gnttabop_names)/sizeof(gnttabop_names[0]);
+
diff --git a/pc-bios/xenner/xen-names.h b/pc-bios/xenner/xen-names.h
new file mode 100644
index 0000000..cf41fc5
--- /dev/null
+++ b/pc-bios/xenner/xen-names.h
@@ -0,0 +1,68 @@
+/*
+ * don't edit, generated by xen/nametable.sh
+ */
+
+extern const char *__hypervisor_names[];
+extern const int __hypervisor_count;
+#define __hypervisor_name(i) (((i) < __hypervisor_count && __hypervisor_names[i]) ? __hypervisor_names[i] : "UNKNOWN")
+
+extern const char *xenver_names[];
+extern const int xenver_count;
+#define xenver_name(i) (((i) < xenver_count && xenver_names[i]) ? xenver_names[i] : "UNKNOWN")
+
+extern const char *vmasst_type_names[];
+extern const int vmasst_type_count;
+#define vmasst_type_name(i) (((i) < vmasst_type_count && vmasst_type_names[i]) ? vmasst_type_names[i] : "UNKNOWN")
+
+extern const char *vmasst_cmd_names[];
+extern const int vmasst_cmd_count;
+#define vmasst_cmd_name(i) (((i) < vmasst_cmd_count && vmasst_cmd_names[i]) ? vmasst_cmd_names[i] : "UNKNOWN")
+
+extern const char *schedop_names[];
+extern const int schedop_count;
+#define schedop_name(i) (((i) < schedop_count && schedop_names[i]) ? schedop_names[i] : "UNKNOWN")
+
+extern const char *consoleio_names[];
+extern const int consoleio_count;
+#define consoleio_name(i) (((i) < consoleio_count && consoleio_names[i]) ? consoleio_names[i] : "UNKNOWN")
+
+extern const char *xenmem_names[];
+extern const int xenmem_count;
+#define xenmem_name(i) (((i) < xenmem_count && xenmem_names[i]) ? xenmem_names[i] : "UNKNOWN")
+
+extern const char *xennmi_names[];
+extern const int xennmi_count;
+#define xennmi_name(i) (((i) < xennmi_count && xennmi_names[i]) ? xennmi_names[i] : "UNKNOWN")
+
+extern const char *callbackop_names[];
+extern const int callbackop_count;
+#define callbackop_name(i) (((i) < callbackop_count && callbackop_names[i]) ? callbackop_names[i] : "UNKNOWN")
+
+extern const char *callbacktype_names[];
+extern const int callbacktype_count;
+#define callbacktype_name(i) (((i) < callbacktype_count && callbacktype_names[i]) ? callbacktype_names[i] : "UNKNOWN")
+
+extern const char *mmuext_names[];
+extern const int mmuext_count;
+#define mmuext_name(i) (((i) < mmuext_count && mmuext_names[i]) ? mmuext_names[i] : "UNKNOWN")
+
+extern const char *physdevop_names[];
+extern const int physdevop_count;
+#define physdevop_name(i) (((i) < physdevop_count && physdevop_names[i]) ? physdevop_names[i] : "UNKNOWN")
+
+extern const char *vcpuop_names[];
+extern const int vcpuop_count;
+#define vcpuop_name(i) (((i) < vcpuop_count && vcpuop_names[i]) ? vcpuop_names[i] : "UNKNOWN")
+
+extern const char *evtchnop_names[];
+extern const int evtchnop_count;
+#define evtchnop_name(i) (((i) < evtchnop_count && evtchnop_names[i]) ? evtchnop_names[i] : "UNKNOWN")
+
+extern const char *virq_names[];
+extern const int virq_count;
+#define virq_name(i) (((i) < virq_count && virq_names[i]) ? virq_names[i] : "UNKNOWN")
+
+extern const char *gnttabop_names[];
+extern const int gnttabop_count;
+#define gnttabop_name(i) (((i) < gnttabop_count && gnttabop_names[i]) ? gnttabop_names[i] : "UNKNOWN")
+
-- 
1.6.0.2


* [Qemu-devel] [PATCH 27/40] xenner: add xc_dom.h
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (25 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 26/40] xenner: kernel: xen-names Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 28/40] xenner: libxc emu: evtchn Alexander Graf
                   ` (12 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds a generic layer for xc calls, allowing us to choose between the
xenner and xen implementations at runtime.
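
The runtime selection can be pictured with the following minimal sketch. It is
not the code from this patch: all demo_* names, the two backends and their
return values are made up for illustration. It only shows the general pattern
of an ops struct that is filled in once at init time and then reached through
redirect macros.

```c
#include <stddef.h>

/* Hypothetical mode switch, standing in for the xen_mode variable. */
enum demo_mode { DEMO_MODE_NATIVE, DEMO_MODE_EMULATE };

/* Table of function pointers describing one interface. */
struct demo_evt_ops {
    int (*open)(void);
    int (*notify)(int handle, unsigned int port);
};

/* Two made-up backends that return distinguishable values. */
static int native_open(void) { return 3; }
static int native_notify(int handle, unsigned int port)
{
    (void)handle; (void)port;
    return 0;
}
static int emu_open(void) { return 42; }
static int emu_notify(int handle, unsigned int port)
{
    (void)handle; (void)port;
    return 1;
}

static const struct demo_evt_ops demo_native   = { native_open, native_notify };
static const struct demo_evt_ops demo_emulated = { emu_open, emu_notify };

/* The active table; callers only ever go through this one. */
static struct demo_evt_ops demo_evt;

/* Copy the chosen backend into the active table; -1 for unknown modes. */
static int demo_interfaces_init(enum demo_mode mode)
{
    switch (mode) {
    case DEMO_MODE_NATIVE:
        demo_evt = demo_native;
        return 0;
    case DEMO_MODE_EMULATE:
        demo_evt = demo_emulated;
        return 0;
    }
    return -1;
}

/* Redirect macros keep existing call sites textually unchanged. */
#define demo_evt_open   demo_evt.open
#define demo_evt_notify demo_evt.notify
```

After demo_interfaces_init(DEMO_MODE_EMULATE), a call to demo_evt_open() lands
in emu_open() without the caller knowing which backend is active. Copying the
struct by value keeps each call to a single indirect jump.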

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xc_dom.h         |  273 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/xen_interfaces.c |  108 ++++++++++++++++++++
 hw/xen_interfaces.h |  111 +++++++++++++++++++++
 hw/xen_redirect.h   |   56 +++++++++++
 4 files changed, 548 insertions(+), 0 deletions(-)
 create mode 100644 hw/xc_dom.h
 create mode 100644 hw/xen_interfaces.c
 create mode 100644 hw/xen_interfaces.h
 create mode 100644 hw/xen_redirect.h

diff --git a/hw/xc_dom.h b/hw/xc_dom.h
new file mode 100644
index 0000000..c835a37
--- /dev/null
+++ b/hw/xc_dom.h
@@ -0,0 +1,273 @@
+#include <libelf.h>
+
+#define INVALID_P2M_ENTRY   ((xen_pfn_t)-1)
+
+/* --- typedefs and structs ---------------------------------------- */
+
+typedef uint64_t xen_vaddr_t;
+typedef uint64_t xen_paddr_t;
+
+#define PRIpfn PRI_xen_pfn
+
+struct xc_dom_seg {
+    xen_vaddr_t vstart;
+    xen_vaddr_t vend;
+    xen_pfn_t pfn;
+};
+
+struct xc_dom_mem {
+    struct xc_dom_mem *next;
+    void *mmap_ptr;
+    size_t mmap_len;
+    unsigned char memory[0];
+};
+
+struct xc_dom_phys {
+    struct xc_dom_phys *next;
+    void *ptr;
+    xen_pfn_t first;
+    xen_pfn_t count;
+};
+
+struct xc_dom_image {
+    /* files */
+    void *kernel_blob;
+    size_t kernel_size;
+    void *ramdisk_blob;
+    size_t ramdisk_size;
+
+    /* arguments and parameters */
+    char *cmdline;
+    uint32_t f_requested[XENFEAT_NR_SUBMAPS];
+
+    /* info from (elf) kernel image */
+    struct {
+        uint64_t virt_base;
+    } parms;
+
+    char *guest_type;
+
+    /* memory layout */
+    struct xc_dom_seg kernel_seg;
+    struct xc_dom_seg ramdisk_seg;
+    struct xc_dom_seg p2m_seg;
+    struct xc_dom_seg pgtables_seg;
+    struct xc_dom_seg devicetree_seg;
+    xen_pfn_t start_info_pfn;
+    xen_pfn_t console_pfn;
+    xen_pfn_t xenstore_pfn;
+    xen_pfn_t shared_info_pfn;
+    xen_pfn_t bootstack_pfn;
+    xen_vaddr_t virt_alloc_end;
+    xen_vaddr_t bsd_symtab_start;
+
+    /* initial page tables */
+    unsigned int pgtables;
+    unsigned int pg_l4;
+    unsigned int pg_l3;
+    unsigned int pg_l2;
+    unsigned int pg_l1;
+    unsigned int alloc_bootstack;
+    unsigned int extra_pages;
+    xen_vaddr_t virt_pgtab_end;
+
+    /* other state info */
+    uint32_t f_active[XENFEAT_NR_SUBMAPS];
+    xen_pfn_t *p2m_host;
+    void *p2m_guest;
+
+    /* physical memory */
+    xen_pfn_t total_pages;
+    struct xc_dom_phys *phys_pages;
+    int realmodearea_log;
+
+    /* malloc memory pool */
+    struct xc_dom_mem *memblocks;
+
+    /* memory footprint stats */
+    size_t alloc_malloc;
+    size_t alloc_mem_map;
+    size_t alloc_file_map;
+    size_t alloc_domU_map;
+
+    /* misc xen domain config stuff */
+    unsigned long flags;
+    unsigned int console_evtchn;
+    unsigned int xenstore_evtchn;
+    xen_pfn_t shared_info_mfn;
+
+    int guest_xc;
+    domid_t guest_domid;
+    int shadow_enabled;
+
+    int xen_version;
+    xen_capabilities_info_t xen_caps;
+
+    /* kernel loader, arch hooks */
+    struct xc_dom_loader *kernel_loader;
+    void *private_loader;
+
+    /* arch hooks */
+    struct xc_dom_arch *arch_hooks;
+};
+
+/* --- pluggable kernel loader ------------------------------------- */
+
+struct xc_dom_loader {
+    char *name;
+    int (*probe) (struct xc_dom_image * dom);
+    int (*parser) (struct xc_dom_image * dom);
+    int (*loader) (struct xc_dom_image * dom);
+
+    struct xc_dom_loader *next;
+};
+
+#define __init __attribute__ ((constructor))
+void xc_dom_register_loader(struct xc_dom_loader *loader);
+
+/* --- arch specific hooks ----------------------------------------- */
+
+struct xc_dom_arch {
+    /* pagetable setup */
+    int (*alloc_magic_pages) (struct xc_dom_image * dom);
+    int (*count_pgtables) (struct xc_dom_image * dom);
+    int (*setup_pgtables) (struct xc_dom_image * dom);
+
+    /* arch-specific data structs setup */
+    int (*start_info) (struct xc_dom_image * dom);
+    int (*shared_info) (struct xc_dom_image * dom, void *shared_info);
+    int (*vcpu) (struct xc_dom_image * dom, void *vcpu_ctxt);
+
+    char *guest_type;
+    int page_shift;
+    int sizeof_pfn;
+
+    struct xc_dom_arch *next;
+};
+void xc_dom_register_arch_hooks(struct xc_dom_arch *hooks);
+
+#define XC_DOM_PAGE_SHIFT(dom)  ((dom)->arch_hooks->page_shift)
+#define XC_DOM_PAGE_SIZE(dom)   (1 << (dom)->arch_hooks->page_shift)
+
+/* --- main functions ---------------------------------------------- */
+
+struct xc_dom_image *xc_dom_allocate(const char *cmdline, const char *features);
+void xc_dom_release_phys(struct xc_dom_image *dom);
+void xc_dom_release(struct xc_dom_image *dom);
+int xc_dom_mem_init(struct xc_dom_image *dom, unsigned int mem_mb);
+
+size_t xc_dom_check_gzip(void *blob, size_t ziplen);
+int xc_dom_do_gunzip(void *src, size_t srclen, void *dst, size_t dstlen);
+int xc_dom_try_gunzip(struct xc_dom_image *dom, void **blob, size_t * size);
+
+int xc_dom_kernel_file(struct xc_dom_image *dom, const char *filename);
+int xc_dom_ramdisk_file(struct xc_dom_image *dom, const char *filename);
+int xc_dom_kernel_mem(struct xc_dom_image *dom, const void *mem,
+                      size_t memsize);
+int xc_dom_ramdisk_mem(struct xc_dom_image *dom, const void *mem,
+                       size_t memsize);
+
+int xc_dom_parse_image(struct xc_dom_image *dom);
+int xc_dom_build_image(struct xc_dom_image *dom);
+int xc_dom_update_guest_p2m(struct xc_dom_image *dom);
+
+int xc_dom_boot_xen_init(struct xc_dom_image *dom, int xc, domid_t domid);
+int xc_dom_boot_mem_init(struct xc_dom_image *dom);
+void *xc_dom_boot_domU_map(struct xc_dom_image *dom, xen_pfn_t pfn,
+                           xen_pfn_t count);
+int xc_dom_boot_image(struct xc_dom_image *dom);
+int xc_dom_compat_check(struct xc_dom_image *dom);
+
+/* --- debugging bits ---------------------------------------------- */
+
+extern FILE *xc_dom_logfile;
+
+void xc_dom_loginit(void);
+int xc_dom_printf(const char *fmt, ...) __attribute__ ((format(printf, 1, 2)));
+int xc_dom_panic_func(const char *file, int line, xc_error_code err,
+                      const char *fmt, ...)
+    __attribute__ ((format(printf, 4, 5)));
+#define xc_dom_panic(err, fmt, args...) \
+    xc_dom_panic_func(__FILE__, __LINE__, err, fmt, ## args)
+#define xc_dom_trace(mark) \
+    xc_dom_printf("%s:%d: trace %s\n", __FILE__, __LINE__, mark)
+
+void xc_dom_log_memory_footprint(struct xc_dom_image *dom);
+
+/* --- simple memory pool ------------------------------------------ */
+
+void *xc_dom_malloc(struct xc_dom_image *dom, size_t size);
+void *xc_dom_malloc_page_aligned(struct xc_dom_image *dom, size_t size);
+void *xc_dom_malloc_filemap(struct xc_dom_image *dom,
+                            const char *filename, size_t * size);
+char *xc_dom_strdup(struct xc_dom_image *dom, const char *str);
+
+/* --- alloc memory pool ------------------------------------------- */
+
+int xc_dom_alloc_page(struct xc_dom_image *dom, char *name);
+int xc_dom_alloc_segment(struct xc_dom_image *dom,
+                         struct xc_dom_seg *seg, char *name,
+                         xen_vaddr_t start, xen_vaddr_t size);
+
+/* --- misc bits --------------------------------------------------- */
+
+void *xc_dom_pfn_to_ptr(struct xc_dom_image *dom, xen_pfn_t first,
+                        xen_pfn_t count);
+void xc_dom_unmap_one(struct xc_dom_image *dom, xen_pfn_t pfn);
+void xc_dom_unmap_all(struct xc_dom_image *dom);
+
+static inline void *xc_dom_seg_to_ptr(struct xc_dom_image *dom,
+                                      struct xc_dom_seg *seg)
+{
+    xen_vaddr_t segsize = seg->vend - seg->vstart;
+    unsigned int page_size = XC_DOM_PAGE_SIZE(dom);
+    xen_pfn_t pages = (segsize + page_size - 1) / page_size;
+
+    return xc_dom_pfn_to_ptr(dom, seg->pfn, pages);
+}
+
+static inline void *xc_dom_vaddr_to_ptr(struct xc_dom_image *dom,
+                                        xen_vaddr_t vaddr)
+{
+    unsigned int page_size = XC_DOM_PAGE_SIZE(dom);
+    xen_pfn_t page = (vaddr - dom->parms.virt_base) / page_size;
+    unsigned int offset = (vaddr - dom->parms.virt_base) % page_size;
+    void *ptr = xc_dom_pfn_to_ptr(dom, page, 0);
+    return (ptr ? (ptr + offset) : NULL);
+}
+
+static inline int xc_dom_feature_translated(struct xc_dom_image *dom)
+{
+    return 0;
+}
+
+static inline xen_pfn_t xc_dom_p2m_host(struct xc_dom_image *dom, xen_pfn_t pfn)
+{
+    if (dom->shadow_enabled)
+        return pfn;
+    return dom->p2m_host[pfn];
+}
+
+static inline xen_pfn_t xc_dom_p2m_guest(struct xc_dom_image *dom,
+                                         xen_pfn_t pfn)
+{
+    if (xc_dom_feature_translated(dom))
+        return pfn;
+    return dom->p2m_host[pfn];
+}
+
+/* --- arch bits --------------------------------------------------- */
+
+int arch_setup_meminit(struct xc_dom_image *dom);
+int arch_setup_bootearly(struct xc_dom_image *dom);
+int arch_setup_bootlate(struct xc_dom_image *dom);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-set-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/hw/xen_interfaces.c b/hw/xen_interfaces.c
new file mode 100644
index 0000000..6dec9f9
--- /dev/null
+++ b/hw/xen_interfaces.c
@@ -0,0 +1,108 @@
+#include <xenctrl.h>
+#include <xs.h>
+
+#include "hw.h"
+#include "xen.h"
+#include "xen_interfaces.h"
+
+#ifdef CONFIG_XEN
+
+static int xc_evtchn_domid(int handle, int domid)
+{
+    return -1;
+}
+
+static struct XenEvtOps xc_evtchn_xen = {
+    .open               = xc_evtchn_open,
+    .domid              = xc_evtchn_domid,
+    .close              = xc_evtchn_close,
+    .fd                 = xc_evtchn_fd,
+    .notify             = xc_evtchn_notify,
+    .bind_unbound_port  = xc_evtchn_bind_unbound_port,
+    .bind_interdomain   = xc_evtchn_bind_interdomain,
+    .bind_virq          = xc_evtchn_bind_virq,
+    .unbind             = xc_evtchn_unbind,
+    .pending            = xc_evtchn_pending,
+    .unmask             = xc_evtchn_unmask,
+};
+
+static int xs_domid(struct xs_handle *h, int domid)
+{
+    return -1;
+}
+
+static struct XenStoreOps xs_xen = {
+    .daemon_open           = xs_daemon_open,
+    .domain_open           = xs_domain_open,
+    .daemon_open_readonly  = xs_daemon_open_readonly,
+    .domid                 = xs_domid,
+    .daemon_close          = xs_daemon_close,
+    .directory             = xs_directory,
+    .read                  = xs_read,
+    .write                 = xs_write,
+    .mkdir                 = xs_mkdir,
+    .rm                    = xs_rm,
+    .get_permissions       = xs_get_permissions,
+    .set_permissions       = xs_set_permissions,
+    .watch                 = xs_watch,
+    .fileno                = xs_fileno,
+    .read_watch            = xs_read_watch,
+    .unwatch               = xs_unwatch,
+    .transaction_start     = xs_transaction_start,
+    .transaction_end       = xs_transaction_end,
+    .introduce_domain      = xs_introduce_domain,
+    .resume_domain         = xs_resume_domain,
+    .release_domain        = xs_release_domain,
+    .get_domain_path       = xs_get_domain_path,
+    .is_domain_introduced  = xs_is_domain_introduced,
+};
+
+static struct XenGnttabOps xc_gnttab_xen = {
+    .open            = xc_gnttab_open,
+    .close           = xc_gnttab_close,
+    .map_grant_ref   = xc_gnttab_map_grant_ref,
+    .map_grant_refs  = xc_gnttab_map_grant_refs,
+    .munmap          = xc_gnttab_munmap,
+};
+
+struct XenIfOps xc_xen = {
+    .interface_open    = xc_interface_open,
+    .interface_close   = xc_interface_close,
+    .map_foreign_range = xc_map_foreign_range,
+    .map_foreign_batch = xc_map_foreign_batch,
+    .map_foreign_pages = xc_map_foreign_pages,
+};
+
+#endif
+
+struct XenEvtOps xc_evtchn;
+struct XenGnttabOps xc_gnttab;
+struct XenIfOps xc;
+struct XenStoreOps xs;
+
+void xen_interfaces_init(void)
+{
+    switch (xen_mode) {
+#ifdef CONFIG_XEN
+    case XEN_ATTACH:
+    case XEN_CREATE:
+        xc_evtchn = xc_evtchn_xen;
+        xc_gnttab = xc_gnttab_xen;
+        xc        = xc_xen;
+        xs        = xs_xen;
+        break;
+#endif
+#ifdef CONFIG_XENNER
+    case XEN_EMULATE:
+        xc_evtchn = xc_evtchn_xenner;
+        xc_gnttab = xc_gnttab_xenner;
+        xc        = xc_xenner;
+        xs        = xs_xenner;
+        break;
+#endif
+    default:
+        fprintf(stderr, "ERROR: Compiled without %s support, sorry.\n",
+                xen_mode == XEN_EMULATE ? "xenner" : "Xen");
+        exit(1);
+    }
+}
diff --git a/hw/xen_interfaces.h b/hw/xen_interfaces.h
new file mode 100644
index 0000000..95592e6
--- /dev/null
+++ b/hw/xen_interfaces.h
@@ -0,0 +1,111 @@
+#ifndef QEMU_HW_XEN_INTERFACES_H
+#define QEMU_HW_XEN_INTERFACES_H 1
+
+#include <xenctrl.h>
+#include <xs.h>
+
+/* ------------------------------------------------------------- */
+/* xen event channel interface                                   */
+
+struct XenEvtOps {
+    int (*open)(void);
+    int (*domid)(int xce_handle, int domid);
+    int (*close)(int xce_handle);
+    int (*fd)(int xce_handle);
+    int (*notify)(int xce_handle, evtchn_port_t port);
+    evtchn_port_or_error_t (*bind_unbound_port)(int xce_handle, int domid);
+    evtchn_port_or_error_t (*bind_interdomain)(int xce_handle, int domid,
+					       evtchn_port_t remote_port);
+    evtchn_port_or_error_t (*bind_virq)(int xce_handle, unsigned int virq);
+    int (*unbind)(int xce_handle, evtchn_port_t port);
+    evtchn_port_or_error_t (*pending)(int xce_handle);
+    int (*unmask)(int xce_handle, evtchn_port_t port);
+};
+extern struct XenEvtOps xc_evtchn;
+
+/* ------------------------------------------------------------- */
+/* xenstore interface                                            */
+
+struct xs_handle;
+struct XenStoreOps {
+    struct xs_handle *(*daemon_open)(void);
+    struct xs_handle *(*domain_open)(void);
+    struct xs_handle *(*daemon_open_readonly)(void);
+    int (*domid)(struct xs_handle *h, int domid);
+    void (*daemon_close)(struct xs_handle *);
+    char **(*directory)(struct xs_handle *h, xs_transaction_t t,
+			const char *path, unsigned int *num);
+    void *(*read)(struct xs_handle *h, xs_transaction_t t,
+		  const char *path, unsigned int *len);
+    bool (*write)(struct xs_handle *h, xs_transaction_t t,
+		  const char *path, const void *data, unsigned int len);
+    bool (*mkdir)(struct xs_handle *h, xs_transaction_t t,
+		  const char *path);
+    bool (*rm)(struct xs_handle *h, xs_transaction_t t,
+	       const char *path);
+    struct xs_permissions *(*get_permissions)(struct xs_handle *h,
+					      xs_transaction_t t,
+					      const char *path, unsigned int *num);
+    bool (*set_permissions)(struct xs_handle *h, xs_transaction_t t,
+			    const char *path, struct xs_permissions *perms,
+			    unsigned int num_perms);
+    bool (*watch)(struct xs_handle *h, const char *path, const char *token);
+    int (*fileno)(struct xs_handle *h);
+    char **(*read_watch)(struct xs_handle *h, unsigned int *num);
+    bool (*unwatch)(struct xs_handle *h, const char *path, const char *token);
+    xs_transaction_t (*transaction_start)(struct xs_handle *h);
+    bool (*transaction_end)(struct xs_handle *h, xs_transaction_t t,
+			    bool abort);
+    bool (*introduce_domain)(struct xs_handle *h,
+			     unsigned int domid,
+			     unsigned long mfn,
+			     unsigned int eventchn);
+    bool (*resume_domain)(struct xs_handle *h, unsigned int domid);
+    bool (*release_domain)(struct xs_handle *h, unsigned int domid);
+    char *(*get_domain_path)(struct xs_handle *h, unsigned int domid);
+    bool (*is_domain_introduced)(struct xs_handle *h, unsigned int domid);
+};
+extern struct XenStoreOps xs;
+
+/* ------------------------------------------------------------- */
+/* xen grant table interface                                     */
+
+struct XenGnttabOps {
+    int (*open)(void);
+    int (*close)(int xcg_handle);
+    void *(*map_grant_ref)(int xcg_handle, uint32_t domid,
+			  uint32_t ref, int prot);
+    void *(*map_grant_refs)(int xcg_handle, uint32_t count,
+			    uint32_t *domids, uint32_t *refs, int prot);
+    int (*munmap)(int xcg_handle, void *start_address, uint32_t count);
+};
+extern struct XenGnttabOps xc_gnttab;
+
+/* ------------------------------------------------------------- */
+/* xen hypercall interface                                       */
+
+struct XenIfOps {
+    int (*interface_open)(void);
+    int (*interface_close)(int xc_handle);
+    void *(*map_foreign_range)(int xc_handle, uint32_t dom,
+			       int size, int prot,
+			       unsigned long mfn);
+    void *(*map_foreign_batch)(int xc_handle, uint32_t dom, int prot,
+			       xen_pfn_t *arr, int num);
+    void *(*map_foreign_pages)(int xc_handle, uint32_t dom, int prot,
+			       const xen_pfn_t *arr, int num);
+};
+extern struct XenIfOps xc;
+
+/* ------------------------------------------------------------- */
+
+#ifdef CONFIG_XENNER
+extern struct XenEvtOps xc_evtchn_xenner;
+extern struct XenGnttabOps xc_gnttab_xenner;
+extern struct XenIfOps xc_xenner;
+extern struct XenStoreOps xs_xenner;
+#endif
+
+void xen_interfaces_init(void);
+
+#endif /* QEMU_HW_XEN_INTERFACES_H */
diff --git a/hw/xen_redirect.h b/hw/xen_redirect.h
new file mode 100644
index 0000000..491eaf8
--- /dev/null
+++ b/hw/xen_redirect.h
@@ -0,0 +1,56 @@
+#ifndef QEMU_HW_XEN_REDIRECT_H
+#define QEMU_HW_XEN_REDIRECT_H 1
+
+#include "xen_interfaces.h"
+
+/* xen event channel interface */
+#define xc_evtchn_open              xc_evtchn.open
+#define xc_evtchn_close             xc_evtchn.close
+#define xc_evtchn_fd                xc_evtchn.fd
+#define xc_evtchn_notify            xc_evtchn.notify
+#define xc_evtchn_bind_unbound_port xc_evtchn.bind_unbound_port
+#define xc_evtchn_bind_interdomain  xc_evtchn.bind_interdomain
+#define xc_evtchn_bind_virq         xc_evtchn.bind_virq
+#define xc_evtchn_unbind            xc_evtchn.unbind
+#define xc_evtchn_pending           xc_evtchn.pending
+#define xc_evtchn_unmask            xc_evtchn.unmask
+
+/* grant table interface */
+#define xc_gnttab_open              xc_gnttab.open
+#define xc_gnttab_close             xc_gnttab.close
+#define xc_gnttab_map_grant_ref     xc_gnttab.map_grant_ref
+#define xc_gnttab_map_grant_refs    xc_gnttab.map_grant_refs
+#define xc_gnttab_munmap            xc_gnttab.munmap
+
+/* xen hypercall interface */
+#define xc_interface_open           xc.interface_open
+#define xc_interface_close          xc.interface_close
+#define xc_map_foreign_range        xc.map_foreign_range
+#define xc_map_foreign_batch        xc.map_foreign_batch
+#define xc_map_foreign_pages        xc.map_foreign_pages
+
+/* xenstore interface */
+#define xs_daemon_open              xs.daemon_open
+#define xs_domain_open              xs.domain_open
+#define xs_daemon_open_readonly     xs.daemon_open_readonly
+#define xs_daemon_close             xs.daemon_close
+#define xs_directory                xs.directory
+#define xs_read                     xs.read
+#define xs_write                    xs.write
+#define xs_mkdir                    xs.mkdir
+#define xs_rm                       xs.rm
+#define xs_get_permissions          xs.get_permissions
+#define xs_set_permissions          xs.set_permissions
+#define xs_watch                    xs.watch
+#define xs_fileno                   xs.fileno
+#define xs_read_watch               xs.read_watch
+#define xs_unwatch                  xs.unwatch
+#define xs_transaction_start        xs.transaction_start
+#define xs_transaction_end          xs.transaction_end
+#define xs_introduce_domain         xs.introduce_domain
+#define xs_resume_domain            xs.resume_domain
+#define xs_release_domain           xs.release_domain
+#define xs_get_domain_path          xs.get_domain_path
+#define xs_is_domain_introduced     xs.is_domain_introduced
+
+#endif /* QEMU_HW_XEN_REDIRECT_H */
-- 
1.6.0.2


* [Qemu-devel] [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (26 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 27/40] xenner: add xc_dom.h Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:45   ` Anthony Liguori
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 29/40] xenner: libxc emu: grant tables Alexander Graf
                   ` (11 subsequent siblings)
  39 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Xenner emulates parts of libxc, so that we do not need the real Xen
infrastructure when running Xen PV guests without Xen.

This patch adds support for event channel communication.
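
The emulation keeps a read and a write file descriptor per handle and signals
an event by writing the 32-bit port number to the write side. A minimal sketch
of that idea, assuming a plain POSIX pipe as transport, looks like this. The
demo_chan names and the reading side are assumptions made for illustration and
are not taken from the patch.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical handle: one pipe, read side polled by the consumer. */
struct demo_chan {
    int fd_read;
    int fd_write;
};

/* Create the pipe backing the channel; returns 0 on success, -1 on error. */
static int demo_chan_open(struct demo_chan *c)
{
    int fds[2];

    if (pipe(fds) != 0) {
        return -1;
    }
    c->fd_read  = fds[0];
    c->fd_write = fds[1];
    return 0;
}

/* Signal an event by writing the port number into the pipe. */
static int demo_chan_notify(const struct demo_chan *c, uint32_t port)
{
    ssize_t r = write(c->fd_write, &port, sizeof(port));

    return r == (ssize_t)sizeof(port) ? 0 : -1;
}

/* Fetch the next pending port number; blocks until one is available. */
static int64_t demo_chan_pending(const struct demo_chan *c)
{
    uint32_t port;
    ssize_t r = read(c->fd_read, &port, sizeof(port));

    return r == (ssize_t)sizeof(port) ? (int64_t)port : -1;
}

/* Release both ends of the pipe. */
static void demo_chan_close(struct demo_chan *c)
{
    close(c->fd_read);
    close(c->fd_write);
}
```

Because the read side is an ordinary file descriptor, it can be handed to a
select() or poll() based main loop, which is the same way the fd returned by
the real xc_evtchn_fd() is consumed.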

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xenner_libxc_evtchn.c |  467 ++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 467 insertions(+), 0 deletions(-)
 create mode 100644 hw/xenner_libxc_evtchn.c

diff --git a/hw/xenner_libxc_evtchn.c b/hw/xenner_libxc_evtchn.c
new file mode 100644
index 0000000..bb1984c
--- /dev/null
+++ b/hw/xenner_libxc_evtchn.c
@@ -0,0 +1,467 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner emulation -- event channels
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <assert.h>
+#include <xenctrl.h>
+
+#include "hw.h"
+#include "qemu-log.h"
+#include "console.h"
+#include "monitor.h"
+#include "xen.h"
+#include "xen_interfaces.h"
+
+/* ------------------------------------------------------------- */
+
+struct evtpriv;
+
+struct port {
+    struct evtpriv   *priv;
+    struct port      *peer;
+    int              port;
+    int              pending;
+    int              count_snd;
+    int              count_fwd;
+    int              count_msg;
+};
+
+struct domain {
+    int              domid;
+    struct port      p[NR_EVENT_CHANNELS];
+};
+static struct domain dom0;  /* host  */
+static struct domain domU;  /* guest */
+
+struct evtpriv {
+    int                      fd_read, fd_write;
+    struct domain            *domain;
+    int                      ports;
+    int                      pending;
+    QTAILQ_ENTRY(evtpriv)    list;
+};
+static QTAILQ_HEAD(evtpriv_head, evtpriv) privs = QTAILQ_HEAD_INITIALIZER(privs);
+
+static int debug = 0;
+
+/* ------------------------------------------------------------- */
+
+static struct evtpriv *getpriv(int handle)
+{
+    struct evtpriv *priv;
+
+    QTAILQ_FOREACH(priv, &privs, list) {
+        if (priv->fd_read == handle) {
+            return priv;
+        }
+    }
+    return NULL;
+}
+
+static struct domain *get_domain(int domid)
+{
+    if (domid == 0) {
+        return &dom0;
+    }
+    if (!domU.domid) {
+        domU.domid = domid;
+    }
+    assert(domU.domid == domid);
+    return &domU;
+}
+
+static struct port *alloc_port(struct evtpriv *priv, const char *reason)
+{
+    struct port *p = NULL;
+    int i;
+
+    for (i = 1; i < NR_EVENT_CHANNELS; i++) {
+#ifdef DEBUG
+        /* debug hack */
+#define EA_START 20
+        if (priv->domain->domid && i < EA_START)
+            i = EA_START;
+#undef EA_START
+#endif
+        if (priv->domain->p[i].priv != NULL) {
+            continue;
+        }
+        p = priv->domain->p+i;
+        p->port = i;
+        p->priv = priv;
+        p->count_snd = 0;
+        p->count_fwd = 0;
+        p->count_msg = 1;
+        priv->ports++;
+        if (debug) {
+            qemu_log("xen ev:%3d: alloc port %d, domain %d (%s)\n",
+                     priv->fd_read, p->port, priv->domain->domid, reason);
+        }
+        return p;
+    }
+    return NULL;
+}
+
+static void bind_port_peer(struct port *p, int domid, int port)
+{
+    struct domain *domain;
+    struct port *o;
+    const char *msg = "ok";
+
+    domain = get_domain(domid);
+    o = domain->p+port;
+    if (!o->priv) {
+        msg = "peer not allocated";
+    } else if (o->peer) {
+        msg = "peer already bound";
+    } else if (p->peer) {
+        msg = "port already bound";
+    } else {
+        o->peer = p;
+        p->peer = o;
+    }
+    if (debug) {
+        qemu_log("xen ev:%3d: bind port %d domain %d  <->  port %d domain %d : %s\n",
+                 p->priv->fd_read,
+                 p->port, p->priv->domain->domid,
+                 port, domid, msg);
+    }
+}
+
+static void unbind_port(struct port *p)
+{
+    struct port *o;
+
+    o = p->peer;
+    if (o) {
+        if (debug) {
+            fprintf(stderr,"xen ev:%3d: unbind port %d domain %d  <->  port %d domain %d\n",
+                    p->priv->fd_read,
+                    p->port, p->priv->domain->domid,
+                    o->port, o->priv->domain->domid);
+        }
+        o->peer = NULL;
+        p->peer = NULL;
+    }
+}
+
+static void notify_send_peer(struct port *peer)
+{
+    uint32_t evtchn = peer->port;
+    int r;
+
+    peer->count_snd++;
+    if (peer->pending) {
+        return;
+    }
+
+    r = write(peer->priv->fd_write, &evtchn, sizeof(evtchn));
+    if (r != sizeof(evtchn)) {
+        // XXX break
+    }
+    peer->count_fwd++;
+    peer->pending++;
+    peer->priv->pending++;
+}
+
+static void notify_port(struct port *p)
+{
+    if (p->peer) {
+        notify_send_peer(p->peer);
+        if (debug && p->peer->count_snd >= p->peer->count_msg) {
+            fprintf(stderr, "xen ev:%3d: notify port %d domain %d  ->  port %d "
+                            "domain %d  |  counts %d/%d\n",
+                     p->priv->fd_read, p->port, p->priv->domain->domid,
+                     p->peer->port, p->peer->priv->domain->domid,
+                     p->peer->count_fwd, p->peer->count_snd);
+            p->peer->count_msg *= 10;
+        }
+    } else {
+        if (debug) {
+            fprintf(stderr, "xen ev:%3d: notify port %d domain %d  ->  unconnected\n",
+                    p->priv->fd_read, p->port, p->priv->domain->domid);
+        }
+    }
+}
+
+static void unmask_port(struct port *p)
+{
+    /* nothing to do */
+}
+
+static void release_port(struct port *p)
+{
+    if (debug) {
+        fprintf(stderr,"xen ev:%3d: release port %d, domain %d\n",
+                p->priv->fd_read, p->port, p->priv->domain->domid);
+    }
+    unbind_port(p);
+    p->priv->ports--;
+    p->port = 0;
+    p->priv = 0;
+}
+
+/* ------------------------------------------------------------- */
+
+static int qemu_xopen(void)
+{
+    struct evtpriv *priv;
+    int fd[2];
+
+    priv = qemu_mallocz(sizeof(*priv));
+    QTAILQ_INSERT_TAIL(&privs, priv, list);
+
+    if (pipe(fd) < 0) {
+        goto err;
+    }
+    priv->fd_read  = fd[0];
+    priv->fd_write = fd[1];
+    fcntl(priv->fd_read,F_SETFL,O_NONBLOCK);
+
+    priv->domain = get_domain(0);
+    return priv->fd_read;
+
+err:
+    qemu_free(priv);
+    return -1;
+}
+
+static int qemu_close(int handle)
+{
+    struct evtpriv *priv = getpriv(handle);
+    struct port *p;
+    int i;
+
+    if (!priv) {
+        return -1;
+    }
+
+    for (i = 1; i < NR_EVENT_CHANNELS; i++) {
+        p = priv->domain->p+i;
+        if (priv != p->priv) {
+            continue;
+        }
+        release_port(p);
+    }
+
+    close(priv->fd_read);
+    close(priv->fd_write);
+    QTAILQ_REMOVE(&privs, priv, list);
+    qemu_free(priv);
+    return 0;
+}
+
+static int qemu_fd(int handle)
+{
+    struct evtpriv *priv = getpriv(handle);
+
+    if (!priv) {
+        return -1;
+    }
+    return priv->fd_read;
+}
+
+static int qemu_notify(int handle, evtchn_port_t port)
+{
+    struct evtpriv *priv = getpriv(handle);
+    struct port *p;
+
+    if (!priv) {
+        return -1;
+    }
+    if (port >= NR_EVENT_CHANNELS) {
+        return -1;
+    }
+    p = priv->domain->p + port;
+    notify_port(p);
+    return 0;
+}
+
+static evtchn_port_or_error_t qemu_bind_unbound_port(int handle, int domid)
+{
+    struct evtpriv *priv = getpriv(handle);
+    struct port *p;
+
+    if (!priv) {
+        return -1;
+    }
+    p = alloc_port(priv, "unbound");
+    if (!p) {
+        return -1;
+    }
+    return p->port;
+}
+
+static evtchn_port_or_error_t qemu_bind_interdomain(int handle, int domid,
+                                                    evtchn_port_t remote_port)
+{
+    struct evtpriv *priv = getpriv(handle);
+    struct port *p;
+
+    if (!priv) {
+        return -1;
+    }
+    if (remote_port >= NR_EVENT_CHANNELS) {
+        return -1;
+    }
+    p = alloc_port(priv, "interdomain");
+    if (!p) {
+        return -1;
+    }
+    bind_port_peer(p, domid, remote_port);
+    return p->port;
+}
+
+static evtchn_port_or_error_t qemu_bind_virq(int handle, unsigned int virq)
+{
+    struct evtpriv *priv = getpriv(handle);
+    struct port *p;
+
+    if (!priv) {
+        return -1;
+    }
+    p = alloc_port(priv, "virq");
+    if (!p) {
+        return -1;
+    }
+    /*
+     * Note: port not linked here, we only allocate some port.
+     */
+    return p->port;
+}
+
+static int qemu_unbind(int handle, evtchn_port_t port)
+{
+    struct evtpriv *priv = getpriv(handle);
+    struct port *p;
+
+    if (!priv) {
+        return -1;
+    }
+    if (port >= NR_EVENT_CHANNELS) {
+        return -1;
+    }
+    p = priv->domain->p + port;
+    unbind_port(p);
+    release_port(p);
+    return 0;
+}
+
+static evtchn_port_or_error_t qemu_pending(int handle)
+{
+    struct evtpriv *priv = getpriv(handle);
+    uint32_t evtchn;
+    int rc;
+
+    if (!priv) {
+        return -1;
+    }
+    rc = read(priv->fd_read, &evtchn, sizeof(evtchn));
+    if (rc != sizeof(evtchn)) {
+        return -1;
+    }
+    priv->pending--;
+    priv->domain->p[evtchn].pending--;
+    return evtchn;
+}
+
+static int qemu_unmask(int handle, evtchn_port_t port)
+{
+    struct evtpriv *priv = getpriv(handle);
+    struct port *p;
+
+    if (!priv) {
+        return -1;
+    }
+    if (port >= NR_EVENT_CHANNELS) {
+        return -1;
+    }
+    p = priv->domain->p + port;
+    unmask_port(p);
+    return 0;
+}
+
+static int qemu_domid(int handle, int domid)
+{
+    struct evtpriv *priv = getpriv(handle);
+
+    if (!priv) {
+        return -1;
+    }
+    if (priv->ports) {
+        return -1;
+    }
+    priv->domain = get_domain(domid);
+    return 0;
+}
+
+struct XenEvtOps xc_evtchn_xenner = {
+    .open               = qemu_xopen,
+    .domid              = qemu_domid,
+    .close              = qemu_close,
+    .fd                 = qemu_fd,
+    .notify             = qemu_notify,
+    .bind_unbound_port  = qemu_bind_unbound_port,
+    .bind_interdomain   = qemu_bind_interdomain,
+    .bind_virq          = qemu_bind_virq,
+    .unbind             = qemu_unbind,
+    .pending            = qemu_pending,
+    .unmask             = qemu_unmask,
+};
+
+/* ------------------------------------------------------------- */
+
+#if 0
+
+void do_info_evtchn(Monitor *mon)
+{
+    struct evtpriv *priv;
+    struct port *port;
+    int i;
+
+    if (xen_mode != XEN_EMULATE) {
+        monitor_printf(mon, "Not emulating xen event channels.\n");
+        return;
+    }
+
+    QTAILQ_FOREACH(priv, &privs, list) {
+        monitor_printf(mon, "%p: domid %d, fds %d,%d\n", priv,
+                       priv->domain->domid,
+                       priv->fd_read, priv->fd_write);
+        for (i = 1; i < NR_EVENT_CHANNELS; i++) {
+            port = priv->domain->p + i;
+            if (port->priv != priv) {
+                continue;
+            }
+            monitor_printf(mon, "  port #%d: ", port->port);
+            if (port->peer) {
+                monitor_printf(mon, "peer #%d (%p, domid %d)\n",
+                               port->peer->port, port->peer->priv,
+                               port->peer->priv->domain->domid);
+            } else {
+                monitor_printf(mon, "no peer\n");
+            }
+        }
+    }
+}
+
+#endif
+
-- 
1.6.0.2

* [Qemu-devel] [PATCH 29/40] xenner: libxc emu: grant tables
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (27 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 28/40] xenner: libxc emu: evtchn Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 30/40] xenner: libxc emu: memory mapping Alexander Graf
                   ` (10 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

We cannot use the real Xen infrastructure when running Xen PV guests
without Xen, so Xenner emulates parts of libxc.

This patch adds support for grant tables.
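
For readers who want to check the lookup arithmetic in get_grant()
without the rest of the tree, here is a minimal standalone sketch. It is
an illustration only: the sketch_* names, the 4 KiB page size and the
caller-supplied page array are assumptions made for this example, and the
array merely stands in for xenner_get_grant_table(). The entry layout
mirrors the 8-byte grant_entry_v1_t from the Xen public headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u          /* assumption: 4 KiB pages */

/* Same field layout as Xen's grant_entry_v1_t (8 bytes). */
struct sketch_grant_entry {
    uint16_t flags;
    uint16_t domid;
    uint32_t frame;
};

#define SKETCH_ENTRIES_PER_PAGE \
    (SKETCH_PAGE_SIZE / sizeof(struct sketch_grant_entry))

/* Which grant table page holds this reference. */
static size_t sketch_grant_page(uint32_t ref)
{
    return ref / SKETCH_ENTRIES_PER_PAGE;
}

/* Slot of this reference inside its page. */
static size_t sketch_grant_index(uint32_t ref)
{
    return ref % SKETCH_ENTRIES_PER_PAGE;
}

/*
 * Hypothetical stand-in for get_grant(): pages[] plays the role of
 * xenner_get_grant_table(); a NULL slot means "page not set up".
 */
static struct sketch_grant_entry *
sketch_get_grant(struct sketch_grant_entry **pages, size_t npages,
                 uint32_t ref)
{
    size_t page = sketch_grant_page(ref);

    assert(pages != NULL);
    if (page >= npages || pages[page] == NULL) {
        return NULL;
    }
    return pages[page] + sketch_grant_index(ref);
}
```

With 512 entries per page, reference 513 lands in page 1, slot 1. A
caller would read the frame field of the returned entry and hand it to
something like xenner_mfn_to_ptr(), as qemu_map_grant_ref() does.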

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xenner_libxc_gnttab.c |   91 ++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 91 insertions(+), 0 deletions(-)
 create mode 100644 hw/xenner_libxc_gnttab.c

diff --git a/hw/xenner_libxc_gnttab.c b/hw/xenner_libxc_gnttab.c
new file mode 100644
index 0000000..c9f4ce2
--- /dev/null
+++ b/hw/xenner_libxc_gnttab.c
@@ -0,0 +1,91 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner Emulation -- grant tables
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <xenctrl.h>
+#include <xen/grant_table.h>
+
+#include "hw.h"
+#include "xen_interfaces.h"
+#include "xenner.h"
+
+/* ------------------------------------------------------------- */
+
+static grant_entry_v1_t *get_grant(uint32_t ref)
+{
+    grant_entry_v1_t *e;
+    int page, index;
+
+    page  = ref / (PAGE_SIZE / sizeof(grant_entry_v1_t));
+    index = ref % (PAGE_SIZE / sizeof(grant_entry_v1_t));
+
+    e = xenner_get_grant_table(page);
+    if (!e) {
+        return NULL;
+    }
+    return e + index;
+}
+
+/* ------------------------------------------------------------- */
+
+static int _qemu_open(void)
+{
+    return 42;
+}
+
+static int qemu_close(int xcg_handle)
+{
+    return 0;
+}
+
+static void *qemu_map_grant_ref(int xcg_handle, uint32_t domid,
+                                uint32_t ref, int prot)
+{
+    grant_entry_v1_t *e;
+
+    e = get_grant(ref);
+    if (!e) {
+        return NULL;
+    }
+
+    return xenner_mfn_to_ptr(e->frame);
+}
+
+static void *qemu_map_grant_refs(int xcg_handle, uint32_t count,
+                                 uint32_t *domids, uint32_t *refs, int prot)
+{
+    /* Hmm, not so easy ... */
+    return NULL;
+}
+
+static int qemu_munmap(int xcg_handle, void *start_address, uint32_t count)
+{
+    /* nothing as we didn't actually map anything ... */
+    return 0;
+}
+
+struct XenGnttabOps xc_gnttab_xenner = {
+    .open            = _qemu_open,
+    .close           = qemu_close,
+    .map_grant_ref   = qemu_map_grant_ref,
+    .map_grant_refs  = qemu_map_grant_refs,
+    .munmap          = qemu_munmap,
+};
-- 
1.6.0.2

* [Qemu-devel] [PATCH 30/40] xenner: libxc emu: memory mapping
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (28 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 29/40] xenner: libxc emu: grant tables Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:12   ` malc
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 31/40] xenner: libxc emu: xenstore Alexander Graf
                   ` (9 subsequent siblings)
  39 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

We cannot use the real Xen infrastructure when running Xen PV guests
without Xen, so Xenner emulates parts of libxc.

This patch adds support for guest memory mapping.
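
The address arithmetic behind the mapping helpers can be sketched on its
own. This is a hedged illustration, not code from the patch: the sketch_*
names and the fixed page shift of 12 are assumptions for the example. The
point it shows is that a frame number should be widened before shifting,
so that a 32-bit unsigned long does not lose the upper bits of the guest
physical address.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12            /* assumption: 4 KiB pages */

/* Frame number -> guest physical address, widened before the shift. */
static uint64_t sketch_mfn_to_paddr(unsigned long mfn)
{
    assert(mfn <= (UINT64_MAX >> SKETCH_PAGE_SHIFT));
    return (uint64_t)mfn << SKETCH_PAGE_SHIFT;
}

/* Byte offset of page i inside a contiguous batch mapping. */
static uint64_t sketch_batch_offset(unsigned int i)
{
    return (uint64_t)i << SKETCH_PAGE_SHIFT;
}

/* Number of pages needed to cover size bytes (rounded up). */
static uint64_t sketch_pages_for(uint64_t size)
{
    return (size + (1u << SKETCH_PAGE_SHIFT) - 1) >> SKETCH_PAGE_SHIFT;
}
```

For example, frame 0x100000 maps to physical address 0x100000000, which
no longer fits in 32 bits; page 3 of a batch sits at offset 0x3000.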

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xenner_libxc_if.c |  124 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 124 insertions(+), 0 deletions(-)
 create mode 100644 hw/xenner_libxc_if.c

diff --git a/hw/xenner_libxc_if.c b/hw/xenner_libxc_if.c
new file mode 100644
index 0000000..7ccd3c0
--- /dev/null
+++ b/hw/xenner_libxc_if.c
@@ -0,0 +1,124 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner Emulation -- memory management
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <sys/mman.h>
+#include <xenctrl.h>
+
+#include "hw.h"
+#include "xen_interfaces.h"
+#include "xenner.h"
+
+/* ------------------------------------------------------------- */
+
+static int _qemu_open(void)
+{
+    return 42;
+}
+
+static int qemu_close(int xc_handle)
+{
+    return 0;
+}
+
+static void *qemu_map_foreign_range(int xc_handle, uint32_t dom,
+                                    int size, int prot, unsigned long mfn)
+{
+    target_phys_addr_t addr, len;
+    void *ptr;
+
+    addr = (target_phys_addr_t)mfn << PAGE_SHIFT;
+    len = size;
+    ptr = cpu_physical_memory_map(addr, &len, 1);
+
+    if (len != size) {
+        fprintf(stderr, "%s: couldn't allocate %d bytes\n", __FUNCTION__, size);
+        return NULL;
+    }
+
+    return ptr;
+}
+
+static void *qemu_map_foreign_batch(int xc_handle, uint32_t dom, int prot,
+                                    xen_pfn_t *arr, int num)
+{
+    ram_addr_t offset;
+    void *ptr;
+    int i;
+    target_phys_addr_t len = num * TARGET_PAGE_SIZE;
+
+    char filename[] = "/dev/shm/qemu-vmcore.XXXXXX";
+    int rc, fd;
+
+    fd = mkstemp(filename);
+    if (fd == -1) {
+        fprintf(stderr, "mkstemp(%s): %s\n", filename, strerror(errno));
+        return NULL;
+    }
+    unlink(filename);
+
+    rc = ftruncate(fd, len);
+    if (rc != 0) {
+        fprintf(stderr, "ftruncate(0x%" PRIx64 "): %s\n",
+                (uint64_t)len, strerror(errno));
+        return NULL;
+    }
+
+    ptr = mmap(NULL, len, PROT_WRITE | PROT_READ, MAP_SHARED | MAP_POPULATE,
+               fd, 0);
+
+    for (i = 0; i < num; i++) {
+        void *map;
+        target_phys_addr_t pagelen = TARGET_PAGE_SIZE;
+
+        printf("arr[%d] = %#lx\n", i, cpu_get_physical_page_desc(arr[i] << PAGE_SHIFT));
+
+        /* fetch the pointer in qemu's own virtual memory */
+        offset = cpu_get_physical_page_desc(arr[i] << PAGE_SHIFT);
+        map = cpu_physical_memory_map(offset, &pagelen, 1);
+
+        /* copy current mem to new map */
+        memcpy(ptr + (i * TARGET_PAGE_SIZE), map, TARGET_PAGE_SIZE);
+
+        if (mmap(map, TARGET_PAGE_SIZE, prot, MAP_SHARED | MAP_FIXED,
+                 fd, i * TARGET_PAGE_SIZE) == (void*)-1) {
+            fprintf(stderr, "%s: mmap(#%d, mfn 0x%lx): %s\n",
+                    __FUNCTION__, i, arr[i], strerror(errno));
+            return NULL;
+        }
+    }
+
+    return ptr;
+}
+
+static void *qemu_map_foreign_pages(int xc_handle, uint32_t dom, int prot,
+                                    const xen_pfn_t *arr, int num)
+{
+    return qemu_map_foreign_batch(xc_handle, dom, prot, (void*)arr, num);
+}
+
+struct XenIfOps xc_xenner = {
+    .interface_open    = _qemu_open,
+    .interface_close   = qemu_close,
+    .map_foreign_range = qemu_map_foreign_range,
+    .map_foreign_batch = qemu_map_foreign_batch,
+    .map_foreign_pages = qemu_map_foreign_pages,
+};
-- 
1.6.0.2

* [Qemu-devel] [PATCH 31/40] xenner: libxc emu: xenstore
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (29 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 30/40] xenner: libxc emu: memory mapping Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 18:36   ` Blue Swirl
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 32/40] xenner: emudev Alexander Graf
                   ` (8 subsequent siblings)
  39 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

We cannot use the real Xen infrastructure when running Xen PV guests
without Xen, so Xenner emulates parts of libxc.

This patch adds support for emulation of xenstored.
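
The guest talks to the store over a shared ring with free-running
producer and consumer indexes. As a rough standalone sketch of how the
contiguous chunk sizes are derived and how a write can wrap around the
end of the ring: the sketch_* names are made up for this example, the
ring size of 1024 matches XENSTORE_RING_SIZE in the Xen public headers,
and the memory barriers and event channel notification that the real
code needs are left out on purpose.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_RING_SIZE 1024u          /* power of two */
#define SKETCH_MASK(i)   ((i) & (SKETCH_RING_SIZE - 1))

/* Contiguous bytes readable at cons without wrapping. */
static uint32_t sketch_input_chunk_len(uint32_t cons, uint32_t prod)
{
    uint32_t len = SKETCH_RING_SIZE - SKETCH_MASK(cons);

    if (prod - cons < len) {
        len = prod - cons;
    }
    return len;
}

/* Contiguous free bytes writable at prod without wrapping. */
static uint32_t sketch_output_chunk_len(uint32_t cons, uint32_t prod)
{
    uint32_t len = SKETCH_RING_SIZE - SKETCH_MASK(prod);
    uint32_t space = SKETCH_RING_SIZE - (prod - cons);

    if (space < len) {
        len = space;
    }
    return len;
}

/*
 * Copy up to len bytes into the ring, wrapping if needed.
 * Returns the number of bytes actually written (may be short
 * when the ring is full) and advances *prod accordingly.
 */
static uint32_t sketch_ring_write(char *ring, uint32_t cons, uint32_t *prod,
                                  const void *data, uint32_t len)
{
    const char *src = data;
    uint32_t done = 0;

    assert(*prod - cons <= SKETCH_RING_SIZE);
    while (done < len) {
        uint32_t chunk = sketch_output_chunk_len(cons, *prod);

        if (chunk == 0) {
            break;                      /* ring full */
        }
        if (chunk > len - done) {
            chunk = len - done;
        }
        memcpy(ring + SKETCH_MASK(*prod), src + done, chunk);
        *prod += chunk;
        done += chunk;
    }
    return done;
}
```

Writing 8 bytes with cons = 1000 and prod = 1020 puts the first 4 bytes
at offsets 1020..1023 and the remaining 4 at offsets 0..3. A short return
value is what the backlog handling in this patch is meant to absorb.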

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xenner_guest_store.c |  494 +++++++++++++++++++++++++++++++++
 hw/xenner_libxenstore.c |  709 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 1203 insertions(+), 0 deletions(-)
 create mode 100644 hw/xenner_guest_store.c
 create mode 100644 hw/xenner_libxenstore.c

diff --git a/hw/xenner_guest_store.c b/hw/xenner_guest_store.c
new file mode 100644
index 0000000..c067275
--- /dev/null
+++ b/hw/xenner_guest_store.c
@@ -0,0 +1,494 @@
+/*
+ *  Copyright (C) 2005 Rusty Russell IBM Corporation
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner emulation -- guest interface to xenstore
+ *
+ *  tools/xenstore/xenstored_domain.c equivalent, some code is from there.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "xen.h"
+#include "xen_interfaces.h"
+#include "xenner.h"
+#include "qemu-char.h"
+
+/* ------------------------------------------------------------- */
+
+static target_phys_addr_t xen_store_mfn;
+
+static struct xs_handle *xs_guest;
+static char xs_buf[1024];
+static int xs_len;
+static int debug = 0;
+
+static int evtchndev;
+static evtchn_port_t evtchnport;
+
+/* ------------------------------------------------------------- */
+
+static const char *msgname[] = {
+    [ XS_DEBUG                 ] = "XS_DEBUG",
+    [ XS_DIRECTORY             ] = "XS_DIRECTORY",
+    [ XS_READ                  ] = "XS_READ",
+    [ XS_GET_PERMS             ] = "XS_GET_PERMS",
+    [ XS_WATCH                 ] = "XS_WATCH",
+    [ XS_UNWATCH               ] = "XS_UNWATCH",
+    [ XS_TRANSACTION_START     ] = "XS_TRANSACTION_START",
+    [ XS_TRANSACTION_END       ] = "XS_TRANSACTION_END",
+    [ XS_INTRODUCE             ] = "XS_INTRODUCE",
+    [ XS_RELEASE               ] = "XS_RELEASE",
+    [ XS_GET_DOMAIN_PATH       ] = "XS_GET_DOMAIN_PATH",
+    [ XS_WRITE                 ] = "XS_WRITE",
+    [ XS_MKDIR                 ] = "XS_MKDIR",
+    [ XS_RM                    ] = "XS_RM",
+    [ XS_SET_PERMS             ] = "XS_SET_PERMS",
+    [ XS_WATCH_EVENT           ] = "XS_WATCH_EVENT",
+    [ XS_ERROR                 ] = "XS_ERROR",
+    [ XS_IS_DOMAIN_INTRODUCED  ] = "XS_IS_DOMAIN_INTRODUCED",
+    [ XS_RESUME                ] = "XS_RESUME",
+};
+
+/* ------------------------------------------------------------- */
+
+static bool check_indexes(XENSTORE_RING_IDX cons, XENSTORE_RING_IDX prod)
+{
+    return ((prod - cons) <= XENSTORE_RING_SIZE);
+}
+
+static void *get_output_chunk(XENSTORE_RING_IDX cons,
+                              XENSTORE_RING_IDX prod,
+                              char *buf, uint32_t *len)
+{
+    *len = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(prod);
+    if ((XENSTORE_RING_SIZE - (prod - cons)) < *len) {
+        *len = XENSTORE_RING_SIZE - (prod - cons);
+    }
+    return buf + MASK_XENSTORE_IDX(prod);
+}
+
+static const void *get_input_chunk(XENSTORE_RING_IDX cons,
+                                   XENSTORE_RING_IDX prod,
+                                   const char *buf, uint32_t *len)
+{
+    *len = XENSTORE_RING_SIZE - MASK_XENSTORE_IDX(cons);
+    if ((prod - cons) < *len) {
+        *len = prod - cons;
+    }
+    return buf + MASK_XENSTORE_IDX(cons);
+}
+
+static int domain_write(struct xenstore_domain_interface *intf,
+                        const void *data, unsigned int len)
+{
+    uint32_t avail;
+    void *dest;
+    XENSTORE_RING_IDX cons, prod;
+
+    /* Must read indexes once, and before anything else, and verified. */
+    cons = intf->rsp_cons;
+    prod = intf->rsp_prod;
+    xen_mb();
+
+    if (!check_indexes(cons, prod)) {
+        errno = EIO;
+        return -1;
+    }
+
+    dest = get_output_chunk(cons, prod, intf->rsp, &avail);
+    if (avail < len) {
+        /* write hangover at the beginning */
+        memcpy(intf->rsp, data + avail, len - avail);
+    }
+
+    memcpy(dest, data, avail < len ? avail : len);
+    xen_mb();
+    intf->rsp_prod += len;
+
+    xc_evtchn.notify(evtchndev, evtchnport);
+    return len;
+}
+
+static int domain_read(struct xenstore_domain_interface *intf,
+                       void *data, unsigned int len)
+{
+    uint32_t avail;
+    const void *src;
+    XENSTORE_RING_IDX cons, prod;
+
+    /* Must read indexes once, and before anything else, and verified. */
+    cons = intf->req_cons;
+    prod = intf->req_prod;
+    xen_mb();
+
+    if (!check_indexes(cons, prod)) {
+        errno = EIO;
+        return -1;
+    }
+
+    src = get_input_chunk(cons, prod, intf->req, &avail);
+    if (avail < len) {
+        len = avail;
+    }
+
+    memcpy(data, src, len);
+    xen_mb();
+    intf->req_cons += len;
+
+    xc_evtchn.notify(evtchndev, evtchnport);
+    return len;
+}
+
+/* ------------------------------------------------------------- */
+
+static int blen;
+static char *backlog = NULL;
+
+static void backlog_create(void *reply, int mlen, int sent)
+{
+    blen = mlen-sent;
+    backlog = qemu_malloc(blen);
+    memcpy(backlog, ((char*)reply) + sent, blen);
+    if (debug) {
+        fprintf(stderr, "%s: backlog created: %d bytes\n",
+                __FUNCTION__, blen);
+    }
+}
+
+static int backlog_shift(struct xenstore_domain_interface *di,
+                         void *reply, int mlen)
+{
+    int rc;
+
+    rc = domain_write(di, backlog, blen);
+    if (rc == blen) {
+        if (debug) {
+            fprintf(stderr, "%s: backlog cleared\n",
+                    __FUNCTION__);
+        }
+        qemu_free(backlog);
+        backlog = NULL;
+        blen = 0;
+    } else {
+        memmove(backlog, backlog+rc, blen-rc);
+        blen -= rc;
+        backlog = qemu_realloc(backlog, blen + mlen);
+        if (reply) {
+            memcpy(backlog + blen, reply, mlen);
+        }
+        blen += mlen;
+        if (debug) {
+            fprintf(stderr, "%s: backlog resized: %d bytes (%d sent, %d added)\n",
+                    __FUNCTION__, blen, rc, mlen);
+        }
+    }
+    return blen;
+}
+
+/* ------------------------------------------------------------- */
+
+static struct xenstore_domain_interface *get_xdf(void)
+{
+    struct xenstore_domain_interface *xdf;
+    target_phys_addr_t len = sizeof(*xdf);
+
+    xdf = cpu_physical_memory_map(xen_store_mfn << PAGE_SHIFT, &len, 1);
+
+    if (len < sizeof(*xdf)) {
+        return NULL;
+    }
+
+    return xdf;
+}
+
+static int xen_reply(struct xsd_sockmsg *msg, int type, const void *data, int len)
+{
+    struct xenstore_domain_interface *di = get_xdf();
+    struct xsd_sockmsg *reply;
+    int mlen, rc;
+
+    reply = qemu_mallocz(sizeof(*reply) + len);
+    if (!reply) {
+        return -1;
+    }
+    if (msg) {
+        *reply = *msg;
+    }
+    reply->type = type;
+    reply->len = len;
+    if (len) {
+        memcpy(reply+1, data, len);
+    }
+    mlen = sizeof(*reply) + len;
+
+    if (debug) {
+        fprintf(stderr, "%s: %s (#%d) %d:[%.*s]\n", __FUNCTION__,
+                msgname[reply->type], reply->type, len, len, (char*)(reply+1));
+    }
+
+    if (backlog) {
+        if (backlog_shift(di, reply, mlen)) {
+            goto out;
+        }
+    }
+
+    rc = domain_write(di, reply, mlen);
+    if (rc == -1) {
+        fprintf(stderr, "%s: domain_write error\n", __FUNCTION__);
+    } else if (rc != mlen) {
+        backlog_create(reply, mlen, rc);
+    }
+
+out:
+    qemu_free(reply);
+    return 0;
+}
+
+static int xen_reply_str(struct xsd_sockmsg *msg, int type, const char *str)
+{
+    if (str) {
+        return xen_reply(msg, type, str, strlen(str)+1);
+    } else {
+        return xen_reply(msg, type, NULL, 0);
+    }
+}
+
+static int xen_reply_vec(struct xsd_sockmsg *msg, int type, char **vec, int vlen)
+{
+    char payload[1024];
+    int i,len,pos;
+
+    if (!vec) {
+        vlen = 0;
+    }
+    for (pos = 0, i = 0; i < vlen; i++) {
+        len = strlen(vec[i])+1;
+        if (pos+len > sizeof(payload)) {
+            fprintf(stderr, "%s: oops: payload too small\n", __FUNCTION__);
+            break;
+        }
+        memcpy(payload+pos, vec[i], len);
+        pos += len;
+    }
+    return xen_reply(msg, type, payload, pos);
+}
+
+static int xen_handle_data(void *data, int len)
+{
+    struct xsd_sockmsg *msg;
+    char *payload, *arg2, *val, **vec, id[16];
+    unsigned int slen,vlen,alen;
+    bool rc;
+
+    if (len < sizeof(*msg)) {
+        if (debug) {
+            fprintf(stderr, "%s: header incomplete (%d/%zd)\n",
+                    __FUNCTION__, len, sizeof(*msg));
+        }
+        return 0;
+    }
+    msg = data;
+    if (len < sizeof(*msg) + msg->len) {
+        if (debug) {
+            fprintf(stderr, "%s: msg incomplete (%d/%zd)\n",
+                    __FUNCTION__, len, sizeof(*msg) + msg->len);
+        }
+        return 0;
+    }
+    payload = data + sizeof(*msg);
+    payload[msg->len] = 0;
+
+    if (debug) {
+        fprintf(stderr, "%s: %s (#%d) %d:[%.*s]\n", __FUNCTION__,
+                msgname[msg->type], msg->type, msg->len, msg->len, payload);
+    }
+
+    switch (msg->type) {
+    case XS_DEBUG:
+        xen_reply_str(msg, XS_DEBUG, "OK");
+        break;
+    case XS_DIRECTORY:
+        vec = xs.directory(xs_guest, msg->tx_id, payload, &vlen);
+        xen_reply_vec(msg, msg->type, vec, vlen);
+        qemu_free(vec);
+        break;
+    case XS_READ:
+        val = xs.read(xs_guest, msg->tx_id, payload, &slen);
+        if (!val) {
+            xen_reply_str(msg, XS_ERROR, "ENOENT");
+        } else {
+            xen_reply_str(msg, msg->type, val);
+            qemu_free(val);
+        }
+        break;
+    case XS_WRITE:
+        arg2 = payload + strlen(payload) + 1;
+        alen = msg->len - (arg2 - payload);
+        if (xs.write(xs_guest, msg->tx_id, payload, arg2, alen)) {
+            xen_reply(msg, msg->type, NULL, 0);
+        } else {
+            xen_reply_str(msg, XS_ERROR, "EINVAL");
+        }
+        break;
+    case XS_WATCH:
+        arg2 = payload + strlen(payload) + 1;
+        if (xs.watch(xs_guest, payload, arg2)) {
+            xen_reply(msg, msg->type, NULL, 0);
+        } else {
+            xen_reply_str(msg, XS_ERROR, "EINVAL");
+        }
+        break;
+    case XS_UNWATCH:
+        arg2 = payload + strlen(payload) + 1;
+        if (xs.unwatch(xs_guest, payload, arg2)) {
+            xen_reply(msg, msg->type, NULL, 0);
+        } else {
+            xen_reply_str(msg, XS_ERROR, "EINVAL");
+        }
+        break;
+    case XS_TRANSACTION_START:
+        snprintf(id, sizeof(id), "%u", xs.transaction_start(xs_guest));
+        xen_reply_str(msg, msg->type, id);
+        break;
+    case XS_TRANSACTION_END:
+        if (payload[0] == 'T') {
+            /* commit */
+            rc = xs.transaction_end(xs_guest, msg->tx_id, 0);
+        } else if (payload[0] == 'F') {
+            /* abort */
+            rc = xs.transaction_end(xs_guest, msg->tx_id, 1);
+        } else {
+            /* Huh? */
+            xen_reply_str(msg, XS_ERROR, "EINVAL");
+            break;
+        }
+        if (rc) {
+            xen_reply(msg, msg->type, NULL, 0);
+        } else {
+            xen_reply_str(msg, XS_ERROR, "EINVAL");
+        }
+        break;
+    case XS_RM:
+        if (xs.rm(xs_guest, msg->tx_id, payload)) {
+            xen_reply(msg, msg->type, NULL, 0);
+        } else {
+            xen_reply_str(msg, XS_ERROR, "EINVAL");
+        }
+        break;
+    default:
+        fprintf(stderr, "xs guest: unknown msg type %d, payload %d\n",
+                msg->type, msg->len);
+        xen_reply_str(msg, XS_ERROR, "EIO");
+        break;
+    }
+    return sizeof(*msg) + msg->len;
+}
+
+static void xen_store_evtchn_event(void *opaque)
+{
+    struct xenstore_domain_interface *di = get_xdf();
+    evtchn_port_t port;
+    int rc;
+
+    port = xc_evtchn.pending(evtchndev);
+    if (port != evtchnport) {
+        fprintf(stderr,"%s: xc_evtchn.pending returned %d (expected %d)\n",
+                __FUNCTION__, port, evtchnport );
+        return;
+    }
+    xc_evtchn.unmask(evtchndev, port);
+
+    rc = domain_read(di, xs_buf + xs_len, sizeof(xs_buf) - xs_len);
+    if (rc <= 0) {
+        if (backlog) {
+            backlog_shift(di, NULL, 0);
+        }
+        return;
+    }
+    xs_len += rc;
+    if (debug) {
+        fprintf(stderr, "%s: got %d bytes\n", __FUNCTION__, rc);
+    }
+
+    rc = domain_read(di, xs_buf + xs_len, sizeof(xs_buf) - xs_len);
+    if (rc > 0) {
+        xs_len += rc;
+        if (debug) {
+            fprintf(stderr, "%s: got %d bytes (ring wrap, part #2)\n", __FUNCTION__, rc);
+        }
+    }
+
+    for (;;) {
+        rc = xen_handle_data(xs_buf, xs_len);
+        if (!rc) {
+            break;
+        }
+        if (rc == xs_len) {
+            xs_len = 0;
+            break;
+        }
+        memmove(xs_buf, xs_buf + rc, xs_len - rc);
+        xs_len -= rc;
+    }
+}
+
+static void xen_store_watch_event(void *opaque)
+{
+    char **vec;
+    unsigned int len = 1;
+
+    vec = xs.read_watch(xs_guest, &len);
+    if (!vec) {
+        return;
+    }
+    xen_reply_vec(NULL, XS_WATCH_EVENT, vec, 2);
+}
+
+/* ------------------------------------------------------------- */
+
+void xenner_guest_store_setup(uint64_t guest_mfn, evtchn_port_t guest_evtchn)
+{
+    xen_store_mfn = guest_mfn;
+
+    /* xenstore event channel */
+    evtchndev = xc_evtchn.open();
+    evtchnport = xc_evtchn.bind_interdomain(evtchndev, xen_domid,
+                                            guest_evtchn);
+    qemu_set_fd_handler(xc_evtchn.fd(evtchndev),
+                        xen_store_evtchn_event, NULL, NULL);
+
+    /* guest connection to xenstore  */
+    xs_guest = xs.daemon_open();
+    xs.domid(xs_guest, xen_domid);
+    qemu_set_fd_handler(xs.fileno(xs_guest),
+                        xen_store_watch_event, NULL, NULL);
+}
+
+/* this clears guest watches */
+void xenner_guest_store_reset(void)
+{
+    /* close */
+    qemu_set_fd_handler(xs.fileno(xs_guest), NULL, NULL, NULL);
+    xs.daemon_close(xs_guest);
+
+    /* reopen */
+    xs_guest = xs.daemon_open();
+    xs.domid(xs_guest, xen_domid);
+    qemu_set_fd_handler(xs.fileno(xs_guest),
+                        xen_store_watch_event, NULL, NULL);
+}
diff --git a/hw/xenner_libxenstore.c b/hw/xenner_libxenstore.c
new file mode 100644
index 0000000..4110a13
--- /dev/null
+++ b/hw/xenner_libxenstore.c
@@ -0,0 +1,709 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner Core -- xenstored
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "hw.h"
+#include "xen.h"
+#include "xen_interfaces.h"
+#include "qemu-char.h"
+#include "console.h"
+#include "monitor.h"
+
+static int debug = 0;
+
+/* ------------------------------------------------------------- */
+
+#define XS_PATH_MAX 256
+
+struct node {
+    char                *path;
+    struct node         *parent;
+    char                *value;
+    int                 len;
+    int                 is_dir;
+    QTAILQ_ENTRY(node)  list;
+};
+static QTAILQ_HEAD(node_head, node) nodes = QTAILQ_HEAD_INITIALIZER(nodes);
+
+struct watch {
+    struct xs_handle    *who;
+    char                *path;
+    char                *token;
+    int                 offset;
+    QTAILQ_ENTRY(watch) list;
+};
+static QTAILQ_HEAD(watch_head, watch) watches = QTAILQ_HEAD_INITIALIZER(watches);
+
+struct event {
+    char                **vec;
+    QTAILQ_ENTRY(event)  list;
+};
+
+struct xs_handle {
+    int                 fd_read;
+    int                 fd_write;
+    int                 domid;
+    QTAILQ_HEAD(event_head, event) events;
+};
+
+/* ------------------------------------------------------------- */
+
+static struct node *node_find(const char *path)
+{
+    struct node *node;
+
+    QTAILQ_FOREACH(node, &nodes, list) {
+        if (!strcmp(path, node->path)) {
+            /* move to head of list */
+            QTAILQ_REMOVE(&nodes, node, list);
+            QTAILQ_INSERT_HEAD(&nodes, node, list);
+            return node;
+        }
+    }
+    return NULL;
+}
+
+static struct node *node_add(struct node *parent, const char *path)
+{
+    struct node *node;
+
+    node = qemu_mallocz(sizeof(*node));
+    if (!node) {
+        goto err;
+    }
+    node->path = qemu_strdup(path);
+    if (!node->path) {
+        goto err;
+    }
+    node->parent = parent;
+    QTAILQ_INSERT_HEAD(&nodes, node, list);
+    return node;
+
+err:
+    qemu_free(node);
+    return NULL;
+}
+
+static void node_del(struct node *node)
+{
+    struct node *child;
+    int found_child;
+
+    do {
+        found_child = 0;
+        QTAILQ_FOREACH(child, &nodes, list) {
+            if (child->parent != node) {
+                continue;
+            }
+            found_child = 1;
+            node_del(child);
+            break;
+        }
+    } while (found_child);
+
+    if (debug) {
+        fprintf(stderr, "%s: %s\n", __FUNCTION__, node->path);
+    }
+    QTAILQ_REMOVE(&nodes, node, list);
+    qemu_free(node->path);
+    qemu_free(node->value);
+    qemu_free(node);
+}
+
+static void node_path(struct xs_handle *h, const char *path, char *dest, int len)
+{
+    if (path[0] == '/') {
+        snprintf(dest, len, "%s", path);
+    } else {
+        snprintf(dest, len, "/local/domain/%d/%s",
+                 h->domid, path);
+    }
+}
+
+static void parent_path(struct xs_handle *h, const char *path, char *dest, int len)
+{
+    char *c;
+
+    node_path(h, path, dest, len);
+    c = strrchr(dest, '/');
+    if (c) {
+        if (c == dest && c[1]) {
+            c++;
+        }
+        *c = 0;
+    }
+}
+
+static void fire_watch(struct node *node, struct watch *watch)
+{
+    struct event *event;
+    char *path, *token, *dst, byte = 0;
+    int r;
+
+    path  = node->path + watch->offset;
+    token = watch->token;
+
+    event = qemu_mallocz(sizeof(*event));
+    if (!event) {
+        return;
+    }
+    event->vec = qemu_malloc(sizeof(char*)*2 +
+                             strlen(path)    +
+                             strlen(token)   +
+                             2);
+    if (!event->vec) {
+        qemu_free(event);
+        return;
+    }
+    dst = (void*)(event->vec+2);
+    event->vec[0] = dst;
+    strcpy(dst, path);
+    dst += strlen(path)+1;
+    event->vec[1] = dst;
+    strcpy(dst, token);
+
+    QTAILQ_INSERT_TAIL(&watch->who->events, event, list);
+    r = write(watch->who->fd_write, &byte, 1);
+}
+
+static void fire_watches(struct node *node)
+{
+    struct watch *watch;
+    int nlen,wlen;
+
+    nlen = strlen(node->path);
+    QTAILQ_FOREACH(watch, &watches, list) {
+        wlen = strlen(watch->path);
+        if (wlen > nlen) {
+            continue;
+        }
+        if (strncmp(watch->path, node->path, wlen)) {
+            continue;
+        }
+        fire_watch(node, watch);
+    }
+}
+
+/* ------------------------------------------------------------- */
+
+static struct xs_handle *_qemu_open(void)
+{
+    struct xs_handle *h;
+    int fd[2];
+
+    h = qemu_mallocz(sizeof(*h));
+    if (!h) {
+        goto err;
+    }
+
+    if (pipe(fd)) {
+        goto err;
+    }
+    h->fd_read  = fd[0];
+    h->fd_write = fd[1];
+    QTAILQ_INIT(&h->events);
+    return h;
+
+err:
+    qemu_free(h);
+    return NULL;
+}
+
+static int qemu_domid(struct xs_handle *h, int domid)
+{
+    h->domid = domid;
+    return 0;
+}
+
+static void qemu_close(struct xs_handle *h)
+{
+    struct watch *watch, *check;
+    struct event *event;
+
+    for (watch = QTAILQ_FIRST(&watches); watch; watch = check) {
+        check = QTAILQ_NEXT(watch, list);
+        if (h != watch->who) {
+            continue;
+        }
+        QTAILQ_REMOVE(&watches, watch, list);
+        qemu_free(watch->path);
+        qemu_free(watch->token);
+        qemu_free(watch);
+    }
+
+    while ((event = QTAILQ_FIRST(&h->events))) {
+        QTAILQ_REMOVE(&h->events, event, list);
+        qemu_free(event->vec);
+        qemu_free(event);
+    }
+
+    close(h->fd_read);
+    close(h->fd_write);
+    qemu_free(h);
+}
+
+static char **qemu_directory(struct xs_handle *h, xs_transaction_t t,
+                             const char *path, unsigned int *num)
+{
+    char npath[XS_PATH_MAX];
+    struct node *parent, *node;
+    int i,pos,size,plen,nlen;
+    char **vec, *dst, *name;
+
+    if (debug > 1) {
+        fprintf(stderr, "xs: %s: %s\n", __FUNCTION__, path);
+    }
+    node_path(h, path, npath, sizeof(npath));
+    plen = strlen(npath);
+    parent = node_find(npath);
+    if (!parent) {
+        return NULL;
+    }
+    if (!parent->is_dir) {
+        return NULL;
+    }
+
+    /* count */
+    *num = 0;
+    size = 0;
+    QTAILQ_FOREACH(node, &nodes, list) {
+        if (node->parent != parent) {
+            continue;
+        }
+        name = node->path + plen + 1;
+        nlen = strlen(name)+1;
+        (*num)++;
+        size += nlen;
+    }
+    if (!*num) {
+        return NULL;
+    }
+
+    /* alloc memory */
+    vec = qemu_malloc(*num * sizeof(char*) + size);
+    dst = (void*)(vec + (*num));
+
+    /* fill data */
+    i = 0;
+    pos = 0;
+    QTAILQ_FOREACH(node, &nodes, list) {
+        if (node->parent != parent) {
+            continue;
+        }
+        name = node->path + plen + 1;
+        nlen = strlen(name)+1;
+        vec[i] = dst + pos;
+        memcpy(vec[i], name, nlen);
+        i++;
+        pos += nlen;
+    }
+    return vec;
+}
+
+static void *qemu_read(struct xs_handle *h, xs_transaction_t t,
+                       const char *path, unsigned int *len)
+{
+    char npath[XS_PATH_MAX];
+    struct node *node;
+    char *ret;
+
+    if (debug > 1) {
+        fprintf(stderr, "xs: %s: %s\n", __FUNCTION__, path);
+    }
+    node_path(h, path, npath, sizeof(npath));
+    node = node_find(npath);
+    if (!node) {
+        *len = 0;
+        return NULL;
+    }
+    ret = qemu_malloc(node->len+1);
+    memcpy(ret, node->value, node->len);
+    ret[node->len] = 0;
+    *len = node->len;
+    return ret;
+}
+
+static bool qemu_mkdir(struct xs_handle *h, xs_transaction_t t,
+                       const char *path)
+{
+    char npath[XS_PATH_MAX], ppath[XS_PATH_MAX];
+    struct node *node;
+
+    if (debug > 1) {
+        fprintf(stderr, "xs: %s: %s\n", __FUNCTION__, path);
+    }
+    node_path(h, path, npath, sizeof(npath));
+    node = node_find(npath);
+    if (node) {
+        return node->is_dir ? true : false;
+    }
+    parent_path(h, path, ppath, sizeof(ppath));
+    if (strlen(ppath)) {
+        node = node_find(ppath);
+        if (!node) {
+            if (!qemu_mkdir(h, t, ppath)) {
+                return false;
+            }
+            node = node_find(ppath);
+        }
+    } else {
+        node = NULL;
+    }
+    node = node_add(node, npath);
+    if (!node) {
+        return false;
+    }
+    node->is_dir = 1;
+    fire_watches(node);
+    return true;
+}
+
+static bool qemu_write(struct xs_handle *h, xs_transaction_t t,
+                       const char *path, const void *data, unsigned int len)
+{
+    char npath[XS_PATH_MAX], ppath[XS_PATH_MAX];
+    struct node *node;
+
+    if (debug > 1) {
+        fprintf(stderr, "xs: %s: %s = %.*s\n", __FUNCTION__, path, len, (char*)data);
+    }
+    node_path(h, path, npath, sizeof(npath));
+    if (h->domid != 0) {
+        /* simple access control: guest can write to its own tree only */
+        int domid;
+        if (sscanf(npath, "/local/domain/%d", &domid) != 1) {
+            fprintf(stderr, "deny guest access: %s\n", npath);
+            return false;
+        }
+        if (domid != h->domid) {
+            fprintf(stderr, "deny guest access (domid %d): %s\n", h->domid, npath);
+            return false;
+        }
+    }
+    node = node_find(npath);
+    if (!node) {
+        parent_path(h, path, ppath, sizeof(ppath));
+        node = node_find(ppath);
+        if (!node) {
+            if (!qemu_mkdir(h, t, ppath)) {
+                return false;
+            }
+            node = node_find(ppath);
+        }
+        if (!node->is_dir) {
+            return false;
+        }
+        node = node_add(node, npath);
+    }
+    node->len = 0;
+    qemu_free(node->value);
+    node->value = len ? qemu_malloc(len) : NULL;
+    if (len) {
+        if (!node->value) {
+            return false;
+        }
+        memcpy(node->value, data, len);
+    }
+    node->len = len;
+    if (debug) {
+        fprintf(stderr, "xs: new value: %s = %.*s (%d)\n",
+                npath, len, (char*)data, len);
+    }
+    fire_watches(node);
+    return true;
+}
+
+static bool qemu_rm(struct xs_handle *h, xs_transaction_t t,
+                    const char *path)
+{
+    char npath[XS_PATH_MAX];
+    struct node *node;
+
+    if (debug) {
+        fprintf(stderr, "xs: %s: %s\n", __FUNCTION__, path);
+    }
+    node_path(h, path, npath, sizeof(npath));
+    node = node_find(npath);
+    if (!node) {
+        return false;
+    }
+    fire_watches(node);
+    node_del(node);
+    return true;
+}
+
+static struct xs_permissions *qemu_get_permissions(struct xs_handle *h,
+                                                   xs_transaction_t t,
+                                                   const char *path, unsigned int *num)
+{
+    /* we don't implement permissions */
+    if (debug > 1) {
+        fprintf(stderr, "xs: %s: %s\n", __FUNCTION__, path);
+    }
+    return NULL;
+}
+
+static bool qemu_set_permissions(struct xs_handle *h, xs_transaction_t t,
+                                 const char *path, struct xs_permissions *perms,
+                                 unsigned int num_perms)
+{
+    /* we don't implement permissions */
+    if (debug > 1) {
+        fprintf(stderr, "xs: %s: %s\n", __FUNCTION__, path);
+    }
+    return true;
+}
+
+static bool qemu_watch(struct xs_handle *h, const char *path, const char *token)
+{
+    char npath[XS_PATH_MAX];
+    struct node *node;
+    struct watch *w;
+
+    if (debug > 1) {
+        fprintf(stderr, "xs: %s: %s token %s\n", __FUNCTION__, path, token);
+    }
+    node_path(h, path, npath, sizeof(npath));
+    w = qemu_mallocz(sizeof(*w));
+    if (!w) {
+        goto err;
+    }
+    w->path = qemu_strdup(npath);
+    if (!w->path) {
+        goto err;
+    }
+    w->token = qemu_strdup(token);
+    if (!w->token) {
+        goto err;
+    }
+    w->who = h;
+    if (path[0] != '/') {
+        /* relative path offset */
+        w->offset = strlen(npath) - strlen(path);
+    }
+    QTAILQ_INSERT_TAIL(&watches, w, list);
+    if (debug) {
+        fprintf(stderr, "xs: new watch: %s (rel %s, token %s)\n",
+                w->path, w->offset ? w->path + w->offset : "-", w->token);
+    }
+    node = node_find(npath);
+    if (node) {
+        fire_watch(node, w);
+    }
+    return true;
+
+err:
+    if (w) {
+        qemu_free(w->path);
+        qemu_free(w->token);
+        qemu_free(w);
+    }
+    return false;
+}
+
+static int qemu_fileno(struct xs_handle *h)
+{
+    return h->fd_read;
+}
+
+static char **qemu_read_watch(struct xs_handle *h, unsigned int *num)
+{
+    struct event *event;
+    char **vec;
+    char byte;
+    int r;
+
+    if (debug > 1) {
+        fprintf(stderr, "xs: %s\n", __FUNCTION__);
+    }
+    r = read(h->fd_read, &byte, 1);
+    if (QTAILQ_EMPTY(&h->events)) {
+        fprintf(stderr, "%s: Huh? fd readable but no event in list?\n",
+                __FUNCTION__);
+        return NULL;
+    }
+    event = QTAILQ_FIRST(&h->events);
+    if (debug) {
+        fprintf(stderr, "xs: get event: %s %s\n",
+                event->vec[0], event->vec[1]);
+    }
+    vec = event->vec;
+    QTAILQ_REMOVE(&h->events, event, list);
+    qemu_free(event);
+    *num = 1;
+    return vec;
+}
+
+static bool qemu_unwatch(struct xs_handle *h, const char *path, const char *token)
+{
+    struct watch *watch;
+
+    QTAILQ_FOREACH(watch, &watches, list) {
+        if (strcmp(watch->path + watch->offset, path)) {
+            continue;
+        }
+        if (strcmp(watch->token, token)) {
+            continue;
+        }
+        QTAILQ_REMOVE(&watches, watch, list);
+        qemu_free(watch->path);
+        qemu_free(watch->token);
+        qemu_free(watch);
+        return true;
+    }
+    return false;
+}
+
+static xs_transaction_t qemu_transaction_start(struct xs_handle *h)
+{
+    /* Note: transactions are not implemented */
+    if (debug > 1) {
+        fprintf(stderr, "xs: %s\n", __FUNCTION__);
+    }
+    return 42;
+}
+
+static bool qemu_transaction_end(struct xs_handle *h, xs_transaction_t t,
+                                 bool abort)
+{
+    /* Note: transactions are not implemented */
+    if (debug > 1) {
+        fprintf(stderr, "xs: %s\n", __FUNCTION__);
+    }
+    return true;
+}
+
+static bool qemu_introduce_domain(struct xs_handle *h,
+                                  unsigned int domid,
+                                  unsigned long mfn,
+                                  unsigned int eventchn)
+{
+    /* not needed for us */
+    fprintf(stderr, "xs: %s: not implemented\n", __FUNCTION__);
+    return false;
+}
+
+static bool qemu_resume_domain(struct xs_handle *h, unsigned int domid)
+{
+    /* not needed for us */
+    fprintf(stderr, "xs: %s: not implemented\n", __FUNCTION__);
+    return false;
+}
+
+static bool qemu_release_domain(struct xs_handle *h, unsigned int domid)
+{
+    /* not needed for us */
+    fprintf(stderr, "xs: %s: not implemented\n", __FUNCTION__);
+    return false;
+}
+
+static char *qemu_get_domain_path(struct xs_handle *h, unsigned int domid)
+{
+    char *path;
+
+    path = malloc(32);
+    if (!path) {
+        return NULL;
+    }
+    snprintf(path, 32, "/local/domain/%d", domid);
+    return path;
+}
+
+static bool qemu_is_domain_introduced(struct xs_handle *h, unsigned int domid)
+{
+    /* not needed for us */
+    fprintf(stderr, "xs: %s: not implemented\n", __FUNCTION__);
+    return false;
+}
+
+struct XenStoreOps xs_xenner = {
+    .daemon_open           = _qemu_open,
+    .domain_open           = _qemu_open,
+    .daemon_open_readonly  = _qemu_open,
+    .domid                 = qemu_domid,
+    .daemon_close          = qemu_close,
+    .directory             = qemu_directory,
+    .read                  = qemu_read,
+    .write                 = qemu_write,
+    .mkdir                 = qemu_mkdir,
+    .rm                    = qemu_rm,
+    .get_permissions       = qemu_get_permissions,
+    .set_permissions       = qemu_set_permissions,
+    .watch                 = qemu_watch,
+    .fileno                = qemu_fileno,
+    .read_watch            = qemu_read_watch,
+    .unwatch               = qemu_unwatch,
+    .transaction_start     = qemu_transaction_start,
+    .transaction_end       = qemu_transaction_end,
+    .introduce_domain      = qemu_introduce_domain,
+    .resume_domain         = qemu_resume_domain,
+    .release_domain        = qemu_release_domain,
+    .get_domain_path       = qemu_get_domain_path,
+    .is_domain_introduced  = qemu_is_domain_introduced,
+};
+
+/* ------------------------------------------------------------- */
+
+#if 0
+
+static void print_node(Monitor *mon, struct node *node, int indent)
+{
+    struct node *child;
+    int width;
+    char *name;
+
+    width = 40 - indent;
+    name = strrchr(node->path,'/');
+    if (strcmp(name, "/")) {
+        name++;
+    }
+    monitor_printf(mon, "%*s%-*.*s = ", indent, "", width, width, name);
+    if (node->is_dir) {
+        monitor_printf(mon,"<DIR>\n");
+        QTAILQ_FOREACH(child, &nodes, list) {
+            if (child->parent != node) {
+                continue;
+            }
+            print_node(mon, child, indent+2);
+        }
+    } else {
+        monitor_printf(mon, "\"%.*s\"\n", node->len, node->value);
+    }
+}
+
+void do_info_xenstore(Monitor *mon)
+{
+    struct node *root;
+
+    if (xen_mode != XEN_EMULATE) {
+        monitor_printf(mon, "Not emulating xenstore (use /usr/bin/xenstore-ls).\n");
+        return;
+    }
+    root = node_find("/");
+    if (!root) {
+        monitor_printf(mon, "Xenstore is empty.\n");
+        return;
+    }
+    print_node(mon, root, 0);
+}
+
+#endif
+
-- 
1.6.0.2

* [Qemu-devel] [PATCH 32/40] xenner: emudev
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (30 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 31/40] xenner: libxc emu: xenstore Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 33/40] xenner: core Alexander Graf
                   ` (7 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Xenner uses its own special PV device to communicate between qemu and the
guest xenner kernel. This patch implements that device.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xenner_emudev.c |  107 +++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/xenner_emudev.h |  108 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 215 insertions(+), 0 deletions(-)
 create mode 100644 hw/xenner_emudev.c
 create mode 100644 hw/xenner_emudev.h
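
The CONF_ENTRY/CONF_VALUE pair can be illustrated with a small standalone
sketch. It assumes only what the header in this patch states: the config type
sits in bits 31..16 of the entry, the index in bits 15..0, and the
EMUDEV_CONF_HIGH_32 flag in the type selects the upper half of a 64-bit simple
entry. The two constants are copied from hw/xenner_emudev.h; every sketch_*
name and emudev_entry() are hypothetical helpers for illustration, not part of
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from hw/xenner_emudev.h in this patch. */
#define EMUDEV_CONF_SIMPLE_COUNT   0x80
#define EMUDEV_CONF_HIGH_32        0x8000

/* Hypothetical helper: build the value a guest would write to CONF_ENTRY. */
static uint32_t emudev_entry(uint16_t type, uint16_t index, int high_32)
{
    if (high_32) {
        type |= EMUDEV_CONF_HIGH_32;
    }
    return ((uint32_t)type << 16) | index;
}

/* Hypothetical stand-in for the simple-entry part of struct emudev_state. */
struct sketch_state {
    uint32_t entry;
    uint64_t config[EMUDEV_CONF_SIMPLE_COUNT];
};

/* Mirrors the default: branch of emudev_write_value(). */
static void sketch_write_value(struct sketch_state *s, uint32_t value)
{
    uint16_t type = s->entry >> 16;
    int high_32 = type & EMUDEV_CONF_HIGH_32;

    type &= ~EMUDEV_CONF_HIGH_32;
    if (type >= EMUDEV_CONF_SIMPLE_COUNT) {
        return;
    }
    if (high_32) {
        s->config[type] = (s->config[type] & 0x00000000ffffffffULL)
                        | ((uint64_t)value << 32);
    } else {
        s->config[type] = (s->config[type] & 0xffffffff00000000ULL)
                        | (uint64_t)value;
    }
}

/* Store a full 64-bit value using two 32-bit register writes. */
static void sketch_set_config64(struct sketch_state *s, uint16_t type,
                                uint64_t v)
{
    assert(type < EMUDEV_CONF_SIMPLE_COUNT);
    s->entry = emudev_entry(type, 0, 0);        /* select low dword  */
    sketch_write_value(s, (uint32_t)v);
    s->entry = emudev_entry(type, 0, 1);        /* select high dword */
    sketch_write_value(s, (uint32_t)(v >> 32));
}
```

Splitting a 64-bit entry into two selections keeps the value register 32 bits
wide, so it can be accessed with ordinary 32-bit port I/O.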

diff --git a/hw/xenner_emudev.c b/hw/xenner_emudev.c
new file mode 100644
index 0000000..dea617e
--- /dev/null
+++ b/hw/xenner_emudev.c
@@ -0,0 +1,107 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner communication device
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <inttypes.h>
+
+#include "xenner_emudev.h"
+
+void emudev_write_entry(struct emudev_state *e, uint32_t value)
+{
+    e->entry = value;
+}
+
+void emudev_write_value(struct emudev_state *e, uint32_t value)
+{
+    uint16_t type, index;
+    int high_32;
+
+    type = e->entry >> 16;
+    high_32 = type & EMUDEV_CONF_HIGH_32;
+    type &= ~EMUDEV_CONF_HIGH_32;
+    index = e->entry & 0xffff;
+    switch (type) {
+    case EMUDEV_CONF_GRANT_TABLE_PFNS:
+        if (index < EMUDEV_CONF_GRANT_TABLE_COUNT)
+            e->gnttab[index] = value;
+        break;
+    case EMUDEV_CONF_EVTCHN_TO_PIN:
+        if (index < EMUDEV_CONF_EVTCHN_TO_PIN_COUNT)
+            e->evtchn[index] = value;
+        break;
+    case EMUDEV_CONF_COMMAND_RESULT:
+        if (index < EMUDEV_CONF_COMMAND_RESULT_COUNT)
+            e->result[index] = value;
+        break;
+    default:
+        if (type >= EMUDEV_CONF_SIMPLE_COUNT) {
+            break;
+        }
+        if (high_32) {
+            e->config[type] = (e->config[type] & 0x00000000ffffffffULL)
+                            | ((uint64_t)value << 32);
+        } else {
+            e->config[type] = (e->config[type] & 0xffffffff00000000ULL)
+                            | ((uint64_t)value);
+        }
+        break;
+    }
+}
+
+uint32_t emudev_read_value(struct emudev_state *e)
+{
+    uint16_t type, index;
+    int high_32;
+    uint64_t r = -1;
+
+    type = e->entry >> 16;
+    high_32 = type & EMUDEV_CONF_HIGH_32;
+    type &= ~EMUDEV_CONF_HIGH_32;
+    index = e->entry & 0xffff;
+    switch (type) {
+    case EMUDEV_CONF_GRANT_TABLE_PFNS:
+        if (index < EMUDEV_CONF_GRANT_TABLE_COUNT) {
+            r = e->gnttab[index];
+        }
+        break;
+    case EMUDEV_CONF_EVTCHN_TO_PIN:
+        if (index < EMUDEV_CONF_EVTCHN_TO_PIN_COUNT) {
+            r = e->evtchn[index];
+        }
+        break;
+    case EMUDEV_CONF_COMMAND_RESULT:
+        if (index < EMUDEV_CONF_COMMAND_RESULT_COUNT) {
+            r = e->result[index];
+        }
+        break;
+    default:
+        if (type < EMUDEV_CONF_SIMPLE_COUNT) {
+            r = e->config[type];
+        }
+        break;
+    }
+
+    if (high_32) {
+        return (uint32_t)(r >> 32);
+    }
+
+    return (uint32_t)r;
+}
diff --git a/hw/xenner_emudev.h b/hw/xenner_emudev.h
new file mode 100644
index 0000000..e6b8676
--- /dev/null
+++ b/hw/xenner_emudev.h
@@ -0,0 +1,108 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __XENNER_EMUDEV_H__
+#define __XENNER_EMUDEV_H__ 1
+
+/*
+ * I/O ports for the virtual emu device
+ *     - CONF_ENTRY (write)
+ *       bits 31 .. 16  --  config type
+ *       bits 15 ..  0  --  config index
+ *     - CONF_VALUE (read/write)
+ *       32bit config value
+ *     - COMMAND (write)
+ *       bits 31 .. 16  --  command
+ *       bits 15 ..  0  --  argument
+ */
+#define EMUDEV_REG_BASE                      (0xe0)
+#define EMUDEV_REG_CONF_ENTRY                (EMUDEV_REG_BASE)
+#define EMUDEV_REG_CONF_VALUE                (EMUDEV_REG_BASE+4)
+#define EMUDEV_REG_COMMAND                   (EMUDEV_REG_BASE+8)
+#define EMUDEV_REG_RESERVED                  (EMUDEV_REG_BASE+12)
+
+/* simple entries, read */
+#define EMUDEV_CONF_DEBUG_LEVEL              0x00
+#define EMUDEV_CONF_EMU_START_PFN            0x01
+#define EMUDEV_CONF_EMU_PAGE_COUNT           0x02
+#define EMUDEV_CONF_M2P_START_PFN            0x03
+#define EMUDEV_CONF_M2P_PAGE_COUNT           0x04
+#define EMUDEV_CONF_GUEST_START_PFN          0x05
+#define EMUDEV_CONF_GUEST_PAGE_COUNT         0x06
+#define EMUDEV_CONF_TOTAL_PAGE_COUNT         0x07
+#define EMUDEV_CONF_NR_VCPUS                 0x08
+#define EMUDEV_CONF_HVM_XENSTORE_PFN         0x09
+#define EMUDEV_CONF_HVM_XENSTORE_EVTCHN      0x0a
+#define EMUDEV_CONF_MFN_CONSOLE              0x0b
+#define EMUDEV_CONF_MFN_XENSTORE             0x0c
+#define EMUDEV_CONF_PFN_START_INFO           0x0d
+#define EMUDEV_CONF_PV_VIRT_ENTRY            0x10
+#define EMUDEV_CONF_PV_VIRT_BASE             0x11
+#define EMUDEV_CONF_HYPERCALL_PAGE           0x12
+#define EMUDEV_CONF_EVTCH_CONSOLE            0x13
+#define EMUDEV_CONF_EVTCH_XENSTORE           0x14
+#define EMUDEV_CONF_PFN_INIT_PT              0x15
+#define EMUDEV_CONF_PFN_CMDLINE              0x16
+#define EMUDEV_CONF_PFN_INITRD               0x17
+#define EMUDEV_CONF_INITRD_LEN               0x18
+#define EMUDEV_CONF_PFN_MFN_LIST             0x19
+/* simple entries, write */
+#define EMUDEV_CONF_BOOT_CTXT_PFN            0x40
+#define EMUDEV_CONF_NEXT_SECONDARY_VCPU      0x42
+#define EMUDEV_CONF_HVM_CALLBACK_IRQ         0x43
+#define EMUDEV_CONF_VMINFO_PFN               0x70 /* temporary */
+/* simple count */
+#define EMUDEV_CONF_SIMPLE_COUNT             0x80
+/* high dword marker */
+#define EMUDEV_CONF_HIGH_32                  0x8000
+
+/* indexed config entries */
+#define EMUDEV_CONF_GRANT_TABLE_PFNS         0x80
+#define EMUDEV_CONF_GRANT_TABLE_COUNT        16
+#define EMUDEV_CONF_EVTCHN_TO_PIN            0x81
+#define EMUDEV_CONF_EVTCHN_TO_PIN_COUNT      64
+#define EMUDEV_CONF_COMMAND_RESULT           0x82
+#define EMUDEV_CONF_COMMAND_RESULT_COUNT     64
+
+/* commands */
+#define EMUDEV_CMD_NOP                       1
+#define EMUDEV_CMD_WRITE_CHAR                2
+#define EMUDEV_CMD_CONFIGURATION_DONE        3
+#define EMUDEV_CMD_EVTCHN_ALLOC              4
+#define EMUDEV_CMD_EVTCHN_SEND               5
+#define EMUDEV_CMD_INIT_SECONDARY_VCPU       6
+#define EMUDEV_CMD_GUEST_SHUTDOWN            7
+#define EMUDEV_CMD_EVTCHN_CLOSE              8
+
+/* --------- host side bits --------- */
+
+struct emudev_state {
+    uint32_t   entry;
+    uint64_t   config[EMUDEV_CONF_SIMPLE_COUNT];
+    uint32_t   gnttab[EMUDEV_CONF_GRANT_TABLE_COUNT];
+    uint32_t   evtchn[EMUDEV_CONF_EVTCHN_TO_PIN_COUNT];
+    uint32_t   result[EMUDEV_CONF_COMMAND_RESULT_COUNT];
+};
+
+void emudev_write_entry(struct emudev_state *e, uint32_t value);
+void emudev_write_value(struct emudev_state *e, uint32_t value);
+uint32_t emudev_read_value(struct emudev_state *e);
+
+#endif /* __XENNER_EMUDEV_H__ */
-- 
1.6.0.2

* [Qemu-devel] [PATCH 33/40] xenner: core
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (31 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 32/40] xenner: emudev Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:13   ` malc
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 34/40] xenner: PV machine Alexander Graf
                   ` (6 subsequent siblings)
  39 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

This patch adds generic xenner functionality to qemu.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xenner.h      |   52 +++++++++++++
 hw/xenner_core.c |  224 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 276 insertions(+), 0 deletions(-)
 create mode 100644 hw/xenner.h
 create mode 100644 hw/xenner_core.c
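
do_emudev_command() below takes a 16-bit command and a 16-bit argument. Going
by the COMMAND register layout documented in hw/xenner_emudev.h (command in
the upper 16 bits, argument in the lower 16), a 32-bit register write would be
split roughly as follows. This is a minimal sketch under that assumption: the
sketch_* names are hypothetical, and the actual port handler in xenner_core.c
may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Command number copied from hw/xenner_emudev.h (patch 32/40). */
#define EMUDEV_CMD_EVTCHN_SEND  5

/* Hypothetical container for a decoded COMMAND write. */
struct sketch_cmd {
    uint16_t cmd;
    uint16_t arg;
};

/* Guest side: pack command (bits 31..16) and argument (bits 15..0). */
static uint32_t sketch_pack_command(uint16_t cmd, uint16_t arg)
{
    return ((uint32_t)cmd << 16) | arg;
}

/* Host side: split the 32-bit value before dispatching on cmd. */
static struct sketch_cmd sketch_split_command(uint32_t val)
{
    struct sketch_cmd c;

    c.cmd = val >> 16;
    c.arg = val & 0xffff;
    return c;
}

/* Usage: a round trip must preserve both fields. */
static void sketch_command_selfcheck(void)
{
    struct sketch_cmd c;

    c = sketch_split_command(sketch_pack_command(EMUDEV_CMD_EVTCHN_SEND, 3));
    assert(c.cmd == EMUDEV_CMD_EVTCHN_SEND);
    assert(c.arg == 3);
}
```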

diff --git a/hw/xenner.h b/hw/xenner.h
new file mode 100644
index 0000000..241dc4c
--- /dev/null
+++ b/hw/xenner.h
@@ -0,0 +1,52 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu-common.h"
+
+/* xenner_core.c */
+struct xenner_emu_file {
+    const char *filename;
+    int        pages;
+    void       *blob;
+};
+
+void *xenner_mfn_to_ptr(xen_pfn_t pfn);
+void *xenner_get_grant_table(int index);
+void *xenner_get_vminfo(void);
+uint64_t xenner_get_config(uint32_t reg);
+uint32_t xenner_get_evtchn_pin(uint32_t reg);
+void xenner_set_config(uint32_t reg, uint64_t value);
+int xenner_load_emu_file(struct xenner_emu_file *f);
+int xenner_guest_evtchn_alloc(void);
+int xenner_guest_evtchn_release(int port);
+int xenner_core_init(int evt);
+
+/* xenner_guest_store.c */
+void xenner_guest_store_setup(uint64_t guest_mfn, evtchn_port_t guest_evtchn);
+void xenner_guest_store_reset(void);
+
+/* xenner_pv.c */
+int xenner_init_pv(const char *kernel_filename,
+                   const char *kernel_cmdline,
+                   const char *initrd_filename);
+
+int xenner_domain_build_pv(const char *kernel,
+                           const char *initrd,
+                           const char *cmdline);
diff --git a/hw/xenner_core.c b/hw/xenner_core.c
new file mode 100644
index 0000000..53a9a75
--- /dev/null
+++ b/hw/xenner_core.c
@@ -0,0 +1,224 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner Core
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "hw.h"
+#include "isa.h"
+#include "sysemu.h"
+#include "xen.h"
+#include "xen_interfaces.h"
+#include "xenner.h"
+#include "xenner_emudev.h"
+#include "loader.h"
+
+/* #define DEBUG */
+
+static int guest_evtchndev;
+static struct emudev_state xen_emudev;
+
+/* ------------------------------------------------------------- */
+
+static void do_emudev_command(uint16_t cmd, uint16_t arg)
+{
+    switch (cmd) {
+    case EMUDEV_CMD_NOP:
+        /* nop vmexit */
+        break;
+    case EMUDEV_CMD_WRITE_CHAR:
+#ifdef DEBUG
+        fprintf(stderr, "%c", arg);
+#endif
+        break;
+    case EMUDEV_CMD_CONFIGURATION_DONE:
+        /* emu finished initial EMUDEV_CONF_* setup */
+        break;
+    case EMUDEV_CMD_EVTCHN_ALLOC:
+        if (arg < EMUDEV_CONF_COMMAND_RESULT_COUNT) {
+            xen_emudev.result[arg] = xenner_guest_evtchn_alloc();
+        }
+        break;
+    case EMUDEV_CMD_EVTCHN_SEND:
+        xc_evtchn.notify(guest_evtchndev, arg);
+        break;
+    case EMUDEV_CMD_GUEST_SHUTDOWN:
+        switch (arg) {
+        case SHUTDOWN_poweroff:
+            fprintf(stderr, "xenner emudev: guest poweroff -> powerdown\n");
+            qemu_system_powerdown_request();
+            break;
+        case SHUTDOWN_reboot:
+            fprintf(stderr, "xenner emudev: guest reboot -> reset\n");
+            qemu_system_reset_request();
+            vm_stop(0);
+            break;
+        case SHUTDOWN_suspend:
+            fprintf(stderr, "xenner emudev: guest suspend -> stop vm\n");
+            vm_stop(0);
+            break;
+        case SHUTDOWN_crash:
+            fprintf(stderr, "xenner emudev: guest crash -> reset\n");
+            vm_stop(0);
+            qemu_system_reset_request();
+            break;
+        }
+        break;
+    default:
+        fprintf(stderr, "xenner emudev: cmd 0x%04x arg 0x%04x\n", cmd, arg);
+    }
+}
+
+static void xenner_port_write(void *opaque, uint32_t addr, uint32_t val)
+{
+    uint16_t cmd, arg;
+
+    switch (addr) {
+    case EMUDEV_REG_CONF_ENTRY:
+        emudev_write_entry(&xen_emudev, val);
+        break;
+    case EMUDEV_REG_CONF_VALUE:
+        emudev_write_value(&xen_emudev, val);
+        break;
+    case EMUDEV_REG_COMMAND:
+        cmd = val >> 16;
+        arg = val & 0xffff;
+        do_emudev_command(cmd, arg);
+        break;
+    default:
+        fprintf(stderr, "io: %s, addr 0x%" PRIx32 " value 0x%" PRIx32 "\n",
+                __FUNCTION__, addr, val);
+        break;
+    }
+}
+
+static uint32_t xenner_port_read(void *opaque, uint32_t addr)
+{
+    uint32_t val;
+
+    switch (addr) {
+    case EMUDEV_REG_CONF_VALUE:
+        val = emudev_read_value(&xen_emudev);
+        break;
+    default:
+        fprintf(stderr, "io: %s, addr 0x%" PRIx32 "\n", __FUNCTION__, addr);
+        val = 0xffffffff;
+        break;
+    }
+    return val;
+}
+
+/* ------------------------------------------------------------- */
+
+void *xenner_mfn_to_ptr(xen_pfn_t pfn)
+{
+    ram_addr_t offset;
+
+    offset = cpu_get_physical_page_desc(pfn << PAGE_SHIFT);
+    return qemu_get_ram_ptr(offset);
+}
+
+void *xenner_get_grant_table(int index)
+{
+    uint32_t pfn;
+
+    if (index < 0 || index >= EMUDEV_CONF_GRANT_TABLE_COUNT) {
+        return NULL;
+    }
+    pfn = xen_emudev.gnttab[index];
+    if (!pfn) {
+        return NULL;
+    }
+    return xenner_mfn_to_ptr(pfn);
+}
+
+void *xenner_get_vminfo(void)
+{
+    uint32_t pfn;
+
+    pfn = xen_emudev.config[EMUDEV_CONF_VMINFO_PFN];
+    if (!pfn) {
+        return NULL;
+    }
+    return xenner_mfn_to_ptr(pfn);
+}
+
+uint64_t xenner_get_config(uint32_t reg)
+{
+    return xen_emudev.config[reg];
+}
+
+uint32_t xenner_get_evtchn_pin(uint32_t reg)
+{
+    return xen_emudev.evtchn[reg];
+}
+
+void xenner_set_config(uint32_t reg, uint64_t value)
+{
+    xen_emudev.config[reg] = value;
+}
+
+/* ------------------------------------------------------------- */
+
+int xenner_load_emu_file(struct xenner_emu_file *f)
+{
+    int r;
+
+    /* Fetch size */
+    r = get_image_size(f->filename);
+    if (r <= 0) {
+        fprintf(stderr, "can't find %s\n", f->filename);
+        return -1;
+    }
+
+    /* Allocate memory */
+    f->pages = (r + PAGE_SIZE - 1) / PAGE_SIZE;
+    f->blob = qemu_mallocz(f->pages * PAGE_SIZE);
+    if (!f->blob) {
+        return -1;
+    }
+
+    /* Load image */
+    r = load_image(f->filename, f->blob);
+
+    return r;
+}
+
+/* ------------------------------------------------------------- */
+
+int xenner_guest_evtchn_alloc(void)
+{
+    return xc_evtchn.bind_unbound_port(guest_evtchndev, 0);
+}
+
+int xenner_guest_evtchn_release(int port)
+{
+    return xc_evtchn.unbind(guest_evtchndev, port);
+}
+
+int xenner_core_init(int evt)
+{
+    guest_evtchndev = evt;
+    xc_evtchn.domid(guest_evtchndev, xen_domid);
+
+    register_ioport_write(EMUDEV_REG_BASE, 16, 4, xenner_port_write, NULL);
+    register_ioport_read(EMUDEV_REG_BASE, 16, 4, xenner_port_read, NULL);
+
+    return 0;
+}
-- 
1.6.0.2


* [Qemu-devel] [PATCH 34/40] xenner: PV machine
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (32 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 33/40] xenner: core Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 35/40] xenner: Domain Builder Alexander Graf
                   ` (5 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Just as Xen has a PV machine that handles all device initialization, xenner
needs one too. This patch implements that machine description.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xenner_pv.c |  135 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 135 insertions(+), 0 deletions(-)
 create mode 100644 hw/xenner_pv.c

diff --git a/hw/xenner_pv.c b/hw/xenner_pv.c
new file mode 100644
index 0000000..6350075
--- /dev/null
+++ b/hw/xenner_pv.c
@@ -0,0 +1,135 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner PV machine
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <sys/mman.h>
+
+#include <xen/io/protocols.h>
+
+#include "hw.h"
+#include "pc.h"
+#include "kvm.h"
+#include "sysemu.h"
+#include "qemu-char.h"
+#include "xen_backend.h"
+#include "xen_interfaces.h"
+#include "xen_domainbuild.h"
+#include "xenner.h"
+#include "xenner_emudev.h"
+#include "loader.h"
+#include "elf.h"
+#include "apic.h"
+#include "sysbus.h"
+
+/* ------------------------------------------------------------- */
+
+/* convert megabytes to pages and back */
+#define MB_TO_PG(x) ((x) << 8)
+#define PG_TO_MB(x) ((x) >> 8)
+
+#define addr_to_frame(addr)    ((addr) >> PAGE_SHIFT)
+#define frame_to_addr(frame)   ((frame) << PAGE_SHIFT)
+#define addr_offset(addr)      ((addr) & ~PAGE_MASK)
+
+static int       guest_evtchndev;
+
+/* ------------------------------------------------------------- */
+
+static IsaIrqState *isa_irq_state;
+
+static void ioapic_init(IsaIrqState *isa_irq_state)
+{
+    DeviceState *dev;
+    SysBusDevice *d;
+    unsigned int i;
+
+    dev = qdev_create(NULL, "ioapic");
+    qdev_init_nofail(dev);
+    d = sysbus_from_qdev(dev);
+    sysbus_mmio_map(d, 0, 0xfec00000);
+
+    for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+        isa_irq_state->ioapic[i] = qdev_get_gpio_in(dev, i);
+    }
+}
+
+static void xenner_request_irq(void *opaque, int irq, int level)
+{
+}
+
+static int xenner_setup_irqs(void)
+{
+    qemu_irq *isa_irq;
+    qemu_irq *cpu_irq;
+
+    cpu_irq = qemu_allocate_irqs(xenner_request_irq, NULL, 1);
+    isa_irq_state = qemu_mallocz(sizeof(*isa_irq_state));
+    isa_irq_state->i8259 = i8259_init(cpu_irq[0]);
+    ioapic_init(isa_irq_state);
+    isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
+
+    isa_bus_new(NULL);
+    isa_bus_irqs(isa_irq);
+
+    return 0;
+}
+
+static void guest_evtchn_event(void *opaque)
+{
+    evtchn_port_t port;
+    int pin;
+
+    port = xc_evtchn.pending(guest_evtchndev);
+    if ((int)port < 0 || port >= EMUDEV_CONF_EVTCHN_TO_PIN_COUNT) {
+        return;
+    }
+    xc_evtchn.unmask(guest_evtchndev, port);
+    pin = xenner_get_evtchn_pin(port);
+    qemu_set_irq(isa_irq_state->ioapic[pin], 1);
+}
+
+int xenner_init_pv(const char *kernel_filename,
+                   const char *kernel_cmdline,
+                   const char *initrd_filename)
+{
+    ram_addr_t ram_addr;
+    ram_addr_t boot_addr;
+
+    pc_cpus_init(NULL);
+
+    ram_addr = qemu_ram_alloc(NULL, "pc.ram", ram_size);
+    cpu_register_physical_memory(0, ram_size, ram_addr);
+
+    boot_addr = qemu_ram_alloc(NULL, "xenner.trampoline", 0x1000);
+    cpu_register_physical_memory(0xfffff000, 0x1000, boot_addr);
+
+    guest_evtchndev = xc_evtchn.open();
+    xenner_core_init(guest_evtchndev);
+    qemu_set_fd_handler(xc_evtchn.fd(guest_evtchndev),
+                        guest_evtchn_event, NULL, NULL);
+
+    xenner_setup_irqs();
+
+    xenner_domain_build_pv(kernel_filename, initrd_filename,
+                           kernel_cmdline);
+
+    return 0;
+}
-- 
1.6.0.2


* [Qemu-devel] [PATCH 35/40] xenner: Domain Builder
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (33 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 34/40] xenner: PV machine Alexander Graf
@ 2010-11-01 15:01 ` Alexander Graf
  2010-11-02 10:09   ` [Qemu-devel] " Paolo Bonzini
  2010-11-01 15:21 ` [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (4 subsequent siblings)
  39 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:01 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

The traditional Xen way of loading a kernel and initial data structures into the
guest's memory is by calling libxc functions. We want to be able to run without
libxc dependencies though, so we need an alternative.

This patch implements a full domain builder for xenner. It loads the guest
kernel, sets up the basic memory layout, and stores the resulting parameters in
the configuration of the PV communication device between xenner and qemu.
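
As a rough illustration of the memory layout mentioned above, the sketch below
mirrors the page accounting done in xenner_domain_build_pv(): a fixed 4 MB
region for the emulator at frame 0, the machine-to-physical table behind it,
and the guest behind both. The sketch_ names, the 4 KiB page size and the
8-byte address width are assumptions made for the example; the patch itself
uses PAGE_SIZE and target_phys_addr_t and additionally clamps the total for
32-bit guests.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for the example; the patch uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_MB_TO_PG(mb) ((uint64_t)(mb) * 0x100000ULL / SKETCH_PAGE_SIZE)

/* Hypothetical container for the values the builder hands to the emudev. */
struct sketch_layout {
    uint64_t mfn_emu, pg_emu;     /* emulator image, starts at frame 0 */
    uint64_t mfn_m2p, pg_m2p;     /* machine-to-physical table */
    uint64_t mfn_guest, pg_guest; /* everything left over for the guest */
};

static struct sketch_layout sketch_layout_for(uint64_t ram_bytes)
{
    struct sketch_layout l;
    uint64_t pg_total = ram_bytes / SKETCH_PAGE_SIZE;

    l.pg_emu = SKETCH_MB_TO_PG(4);
    l.pg_m2p = SKETCH_MB_TO_PG(4);
    /* grow the m2p region in 2 MB steps, same condition as the patch */
    while (l.pg_m2p < pg_total / (sizeof(uint64_t) * 8)) {
        l.pg_m2p += SKETCH_MB_TO_PG(2);
    }

    l.mfn_emu   = 0;
    l.mfn_m2p   = l.pg_emu;
    l.mfn_guest = l.pg_emu + l.pg_m2p;

    /* the guest needs at least one page behind the reserved regions */
    assert(pg_total > l.mfn_guest);
    l.pg_guest  = pg_total - l.mfn_guest;
    return l;
}
```

With 128 MB of RAM this yields a guest that starts at frame 2048 and owns
30720 pages; the precondition check is an addition of the sketch and is not
present in the patch.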

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xenner_dom_builder.c |  406 +++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 406 insertions(+), 0 deletions(-)
 create mode 100644 hw/xenner_dom_builder.c

diff --git a/hw/xenner_dom_builder.c b/hw/xenner_dom_builder.c
new file mode 100644
index 0000000..b4d55dd
--- /dev/null
+++ b/hw/xenner_dom_builder.c
@@ -0,0 +1,406 @@
+/*
+ *  Copyright (C) Red Hat 2007
+ *  Copyright (C) Novell Inc. 2010
+ *
+ *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
+ *             Alexander Graf <agraf@suse.de>
+ *
+ *  Xenner Domain Builder
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; under version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License along
+ *  with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <sys/mman.h>
+
+#include <xen/io/protocols.h>
+
+#include "hw.h"
+#include "pc.h"
+#include "kvm.h"
+#include "sysemu.h"
+#include "qemu-char.h"
+#include "xen_backend.h"
+#include "xen_interfaces.h"
+#include "xen_domainbuild.h"
+#include "xenner.h"
+#include "xenner_emudev.h"
+#include "loader.h"
+#include "elf.h"
+#include "apic.h"
+
+#define XEN_ELFNOTE_INFO            0
+#define XEN_ELFNOTE_ENTRY           1
+#define XEN_ELFNOTE_HYPERCALL_PAGE  2
+#define XEN_ELFNOTE_VIRT_BASE       3
+#define XEN_ELFNOTE_PADDR_OFFSET    4
+#define XEN_ELFNOTE_XEN_VERSION     5
+#define XEN_ELFNOTE_GUEST_OS        6
+#define XEN_ELFNOTE_GUEST_VERSION   7
+#define XEN_ELFNOTE_LOADER          8
+#define XEN_ELFNOTE_PAE_MODE        9
+#define XEN_ELFNOTE_FEATURES       10
+#define XEN_ELFNOTE_BSD_SYMTAB     11
+#define XEN_ELFNOTE_HV_START_LOW   12
+#define XEN_ELFNOTE_L1_MFN_VALID   13
+#define XEN_ELFNOTE_SUSPEND_CANCEL 14
+#define XEN_ELFNOTE_INIT_P2M       15
+
+
+typedef struct DomParams {
+    uint64_t virt_hypercall;
+    uint64_t virt_base;
+    uint64_t virt_entry;
+    int pae;
+    int bits;
+} DomParams;
+
+#define MEGABYTE        (uint64_t)0x100000ULL
+#define MB_TO_PG(x)     (MEGABYTE * (x) / TARGET_PAGE_SIZE)
+#define PG_TO_MB(x)     ((x) * TARGET_PAGE_SIZE / MEGABYTE)
+
+static uint64_t note_numeric(uint8_t *desc, uint32_t desc_len)
+{
+    uint64_t r = 0;
+
+    switch (desc_len) {
+        case sizeof(uint8_t):
+            r = *desc;
+            break;
+        case sizeof(uint16_t):
+            r = lduw_p(desc);
+            break;
+        case sizeof(uint32_t):
+            r = ldl_p(desc);
+            break;
+        case sizeof(uint64_t):
+            r = ldq_p(desc);
+            break;
+    }
+
+    return r;
+}
+
+static void dombuild_note(void *opaque, uint8_t *name, uint32_t name_len,
+                          uint8_t *desc, uint32_t desc_len, uint32_t type)
+{
+    DomParams *params = (DomParams *)opaque;
+
+    if (name_len != 4) {
+        return;
+    }
+
+    if (!((name[0] == 'X') && (name[1] == 'e') && (name[2] == 'n'))) {
+        return;
+    }
+
+    switch (type) {
+        case XEN_ELFNOTE_HYPERCALL_PAGE:
+            params->virt_hypercall = note_numeric(desc, desc_len);
+            break;
+        case XEN_ELFNOTE_VIRT_BASE:
+            params->virt_base = note_numeric(desc, desc_len);
+            break;
+        case XEN_ELFNOTE_ENTRY:
+            params->virt_entry = note_numeric(desc, desc_len);
+            break;
+        case XEN_ELFNOTE_PAE_MODE:
+            params->pae = !strncmp((char*)desc, "yes", desc_len);
+            break;
+    }
+}
+
+static void dombuild_xen_guest_key(DomParams *params, char *key, char *value)
+{
+    if (!strncmp(key, "HYPERCALL_PAGE", strlen("HYPERCALL_PAGE"))) {
+        params->virt_hypercall = (strtoull(value, NULL, 0) << PAGE_SHIFT) +
+                                 params->virt_base;
+    }
+
+    if (!strncmp(key, "VIRT_BASE", strlen("VIRT_BASE"))) {
+        params->virt_base = strtoull(value, NULL, 0);
+    }
+
+    if (!strncmp(key, "VIRT_ENTRY", strlen("VIRT_ENTRY"))) {
+        params->virt_entry = strtoull(value, NULL, 0);
+    }
+
+    if (!strncmp(key, "PAE", strlen("PAE"))) {
+        params->pae = !strncmp(value, "yes", 3);
+    }
+}
+
+static void dombuild_section(void *opaque, uint8_t *_name, uint32_t len,
+                             uint8_t *_content)
+{
+    DomParams *params = (DomParams *)opaque;
+    char *entry, *next_entry;
+    char *name = (void*)_name;
+    char *content = (void*)_content;
+
+    if (strlen(name) != strlen("__xen_guest")) {
+        return;
+    }
+
+    if (strncmp(name, "__xen_guest", strlen("__xen_guest"))) {
+        return;
+    }
+
+    /* Example contents of section __xen_guest (without line break):
+     *
+     * GUEST_OS=Mini-OS,XEN_VER=xen-3.0,VIRT_BASE=0x0,ELF_PADDR_OFFSET=0x0,
+     * HYPERCALL_PAGE=0x2,LOADER=generic
+     *
+     */
+
+    entry = (char *)content;
+    while (entry) {
+        char *comma, *equalsign, *key, *value;
+
+        comma = strchr(entry, ',');
+        if (comma) {
+            *comma = '\0';
+            next_entry = comma + 1;
+        } else {
+            next_entry = NULL;
+        }
+
+        /* entry now contains KEY=VALUE */
+        equalsign = strchr(entry, '=');
+        if (equalsign) {
+            /* well-formed entry */
+
+            *equalsign = '\0';
+            key = entry;
+            value = equalsign + 1;
+
+            dombuild_xen_guest_key(params, key, value);
+        }
+
+        /* process next entry */
+        entry = next_entry;
+    }
+}
+
+static uint64_t mfn_to_pfn(uint64_t mfn, uint64_t mfn_guest)
+{
+    return mfn - mfn_guest;
+}
+
+static uint64_t dombuild_translate(void *opaque, uint64_t addr)
+{
+    xen_pfn_t *offset = opaque;
+
+    /* offset the kernel into physical memory */
+    addr += *offset * PAGE_SIZE;
+
+    return addr;
+}
+
+static void dombuild_header_notify(void *opaque, void *header, int bits)
+{
+    DomParams *params = opaque;
+
+    params->bits = bits;
+}
+
+
+static void xenner_cpu_reset(void *opaque)
+{
+    uint8_t boot_buf[] = { 0xea, 0x00, 0x00, 0x00, 0x00 };
+
+    /* install the stage0 boot loader */
+    cpu_physical_memory_rw(0xfffffff0, boot_buf, sizeof(boot_buf), 1);
+}
+
+int xenner_domain_build_pv(const char *kernel,
+                           const char *initrd,
+                           const char *cmdline)
+{
+    const char *emu_file;
+    xen_pfn_t pg_total, pg_emu, pg_m2p, pg_guest;
+    xen_pfn_t mfn_emu, mfn_m2p, mfn_guest;
+    uint64_t emu_entry, emu_low, emu_high;
+    int emu_size;
+    DomParams dom_params;
+    uint64_t elf_entry;
+    uint64_t elf_low, elf_high;
+    int kernel_size;
+    ElfHandlers handlers = elf_default_handlers;
+    target_phys_addr_t cur_addr;
+    int long_mode;
+    int console_evtchn, xenstore_evtchn;
+    uint64_t console_mfn, xenstore_mfn, start_info_pfn;
+    uint64_t init_pt_pfn, cmdline_pfn = 0, initrd_len = 0;
+    uint64_t mfn_list_pfn;
+
+    pg_total = ram_size / PAGE_SIZE;
+    pg_emu   = MB_TO_PG(4);
+    pg_m2p   = MB_TO_PG(4);
+    /* every page entry occupies one long */
+    while (pg_m2p < pg_total / (sizeof(target_phys_addr_t) * 8)) {
+        pg_m2p += MB_TO_PG(2);
+    }
+    mfn_guest  = pg_emu + pg_m2p;
+
+    memset(&dom_params, 0, sizeof(DomParams));
+    handlers.note_fn = dombuild_note;
+    handlers.note_opaque = &dom_params;
+    handlers.section_fn = dombuild_section;
+    handlers.section_opaque = &dom_params;
+    handlers.translate_fn = dombuild_translate;
+    handlers.translate_opaque = &mfn_guest;
+    handlers.header_notify_fn = dombuild_header_notify;
+    handlers.header_notify_opaque = &dom_params;
+    kernel_size = load_elf(kernel, &handlers, &elf_entry,
+                           &elf_low, &elf_high, 0, ELF_MACHINE, 0);
+    if (kernel_size < 0) {
+        fprintf(stderr, "Error while loading elf kernel\n");
+        exit(1);
+    }
+
+#if 0
+    printf("virt base: %#lx\n", dom_params.virt_base);
+    printf("virt hypercall: %#lx\n", dom_params.virt_hypercall);
+    printf("virt entry: %#lx\n", dom_params.virt_entry);
+    printf("guest bitness: %d%s\n", dom_params.bits,
+           dom_params.pae ? " pae" : "");
+#endif
+
+    cur_addr = (elf_high + 0x100000) & ~0xfffffULL;
+
+    if (initrd) {
+        int r;
+
+        xenner_set_config(EMUDEV_CONF_PFN_INITRD,
+                          mfn_to_pfn(cur_addr >> PAGE_SHIFT, mfn_guest));
+
+        r = load_image_targphys(initrd, cur_addr, 0);
+        if (r < 0) {
+            fprintf(stderr, "failed to load initrd\n");
+            exit(1);
+        }
+
+        cur_addr += r;
+        initrd_len = r;
+    }
+
+    /* memory setup */
+    if (dom_params.bits == 64) {
+        /* 64bit */
+        emu_file = "xenner64.elf";
+        long_mode = 1;
+    } else if (dom_params.pae) {
+        /* 32bit, PAE */
+        if (pg_total > MB_TO_PG(16384)) {
+            pg_total = MB_TO_PG(16384);
+        }
+        emu_file = "xenner32-pae.elf";
+        long_mode = 0;
+    } else {
+        /* 32bit, non-PAE */
+        if (pg_total > MB_TO_PG(4096)) {
+            pg_total = MB_TO_PG(4096);
+        }
+        emu_file = "xenner32.elf";
+        long_mode = 0;
+    }
+
+    mfn_emu    = 0;
+    mfn_m2p    = pg_emu;
+    pg_guest   = pg_total - mfn_guest;
+
+    emu_file = qemu_find_file(QEMU_FILE_TYPE_BIOS, emu_file);
+    emu_size = load_elf(emu_file, NULL, &emu_entry, &emu_low, &emu_high,
+                        0, ELF_MACHINE, 0);
+
+    if (emu_size < 0) {
+        fprintf(stderr, "Couldn't load xenner image\n");
+        exit(1);
+    }
+
+    cur_addr = (cur_addr + (PAGE_SIZE - 1)) & PAGE_MASK;
+
+    mfn_list_pfn = mfn_to_pfn(cur_addr >> PAGE_SHIFT, mfn_guest);
+    cur_addr += (pg_guest * (long_mode ? 8 : 4) + ~PAGE_MASK) & PAGE_MASK;
+
+    start_info_pfn = mfn_to_pfn(cur_addr >> PAGE_SHIFT, mfn_guest);
+    cur_addr += PAGE_SIZE;
+
+    console_mfn = (cur_addr >> PAGE_SHIFT);
+    cur_addr += PAGE_SIZE;
+
+    xenstore_mfn = (cur_addr >> PAGE_SHIFT);
+    cur_addr += PAGE_SIZE;
+
+    if (cmdline) {
+        cpu_physical_memory_rw(cur_addr, (void*)cmdline, strlen(cmdline) + 1, 1);
+        cmdline_pfn = mfn_to_pfn(cur_addr >> PAGE_SHIFT, mfn_guest);
+        cur_addr += PAGE_SIZE;
+    }
+
+    cur_addr = (cur_addr + ((4 * 1024 * 1024) - 1)) & ~((4 * 1024 * 1024) - 1);
+
+    init_pt_pfn = mfn_to_pfn(cur_addr >> PAGE_SHIFT, mfn_guest);
+    cur_addr += PAGE_SIZE * 100;
+
+    xenner_set_config(EMUDEV_CONF_MFN_CONSOLE,     console_mfn);
+    xenner_set_config(EMUDEV_CONF_MFN_XENSTORE,    xenstore_mfn);
+    xenner_set_config(EMUDEV_CONF_PFN_START_INFO,  start_info_pfn);
+    xenner_set_config(EMUDEV_CONF_PFN_INIT_PT,     init_pt_pfn);
+    xenner_set_config(EMUDEV_CONF_PFN_MFN_LIST,    mfn_list_pfn);
+    xenner_set_config(EMUDEV_CONF_PV_VIRT_ENTRY,   dom_params.virt_entry);
+    xenner_set_config(EMUDEV_CONF_PV_VIRT_BASE,    dom_params.virt_base);
+    xenner_set_config(EMUDEV_CONF_HYPERCALL_PAGE,  dom_params.virt_hypercall);
+
+    xenner_set_config(EMUDEV_CONF_DEBUG_LEVEL,      0);
+    xenner_set_config(EMUDEV_CONF_EMU_START_PFN,    mfn_emu);
+    xenner_set_config(EMUDEV_CONF_EMU_PAGE_COUNT,   pg_emu);
+    xenner_set_config(EMUDEV_CONF_M2P_START_PFN,    mfn_m2p);
+    xenner_set_config(EMUDEV_CONF_M2P_PAGE_COUNT,   pg_m2p);
+    xenner_set_config(EMUDEV_CONF_GUEST_START_PFN,  mfn_guest);
+    xenner_set_config(EMUDEV_CONF_TOTAL_PAGE_COUNT, pg_total);
+    xenner_set_config(EMUDEV_CONF_GUEST_PAGE_COUNT, pg_guest);
+    xenner_set_config(EMUDEV_CONF_NR_VCPUS,         smp_cpus);
+
+    xenner_set_config(EMUDEV_CONF_INITRD_LEN,       initrd_len);
+    xenner_set_config(EMUDEV_CONF_PFN_CMDLINE,      cmdline_pfn);
+
+    /* guest setup */
+
+    console_evtchn  = xenner_guest_evtchn_alloc();
+    xenstore_evtchn = xenner_guest_evtchn_alloc();
+
+    xenner_set_config(EMUDEV_CONF_EVTCH_CONSOLE,    console_evtchn);
+    xenner_set_config(EMUDEV_CONF_EVTCH_XENSTORE,   xenstore_evtchn);
+
+    /* default frontend protocol */
+    if (long_mode) {
+        xen_protocol = XEN_IO_PROTO_ABI_X86_64;
+    } else {
+        xen_protocol = XEN_IO_PROTO_ABI_X86_32;
+    }
+
+    xenstore_domain_init1(kernel, initrd, cmdline);
+    xenstore_domain_init2(xenstore_evtchn,
+                          xenstore_mfn,
+                          console_evtchn,
+                          console_mfn);
+
+    xenner_guest_store_setup(xenstore_mfn,
+                             xenstore_evtchn);
+
+    qemu_register_reset(xenner_cpu_reset, NULL);
+    xenner_cpu_reset(NULL);
+
+    return 0;
+}
-- 
1.6.0.2


* Re: [Qemu-devel] [PATCH 30/40] xenner: libxc emu: memory mapping
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 30/40] xenner: libxc emu: memory mapping Alexander Graf
@ 2010-11-01 15:12   ` malc
  2010-11-01 15:15     ` Alexander Graf
  0 siblings, 1 reply; 96+ messages in thread
From: malc @ 2010-11-01 15:12 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On Mon, 1 Nov 2010, Alexander Graf wrote:

> Xenner emulates parts of libxc, so we can not use the real xen infrastructure
> when running xen pv guests without xen.
> 
> This patch adds support for guest memory mapping.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  hw/xenner_libxc_if.c |  124 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 124 insertions(+), 0 deletions(-)
>  create mode 100644 hw/xenner_libxc_if.c
> 
> diff --git a/hw/xenner_libxc_if.c b/hw/xenner_libxc_if.c
> new file mode 100644
> index 0000000..7ccd3c0
> --- /dev/null
> +++ b/hw/xenner_libxc_if.c
> @@ -0,0 +1,124 @@
> +/*
> + *  Copyright (C) Red Hat 2007
> + *  Copyright (C) Novell Inc. 2010
> + *
> + *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
> + *             Alexander Graf <agraf@suse.de>
> + *
> + *  Xenner Emulation -- memory management
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; under version 2 of the License.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License along
> + *  with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <sys/mman.h>
> +#include <xenctrl.h>
> +
> +#include "hw.h"
> +#include "xen_interfaces.h"
> +#include "xenner.h"
> +
> +/* ------------------------------------------------------------- */
> +
> +static int _qemu_open(void)
> +{
> +    return 42;
> +}

Identifiers with leading underscore are reserved in this context.
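
A minimal sketch of one possible fix, assuming a prefix such as xenner_xc_ is
acceptable (the name is hypothetical and not taken from the patch). The
underscore was presumably added to avoid a clash with QEMU's existing
qemu_open(); a distinct prefix avoids both the clash and the reserved
file-scope name.

```c
/* Hypothetical rename of _qemu_open(); still returns the dummy handle. */
static int xenner_xc_open(void)
{
    return 42;
}
```

The .interface_open initializer in xc_xenner would then have to point at the
renamed function.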

> +
> +static int qemu_close(int xc_handle)
> +{
> +    return 0;
> +}
> +
> +static void *qemu_map_foreign_range(int xc_handle, uint32_t dom,
> +                                    int size, int prot, unsigned long mfn)
> +{
> +    target_phys_addr_t addr, len;
> +    void *ptr;
> +
> +    addr = (target_phys_addr_t)mfn << PAGE_SHIFT;
> +    len = size;
> +    ptr = cpu_physical_memory_map(addr, &len, 1);
> +
> +    if (len != size) {
> +        fprintf(stderr, "%s: couldn't allocate %d bytes\n", __FUNCTION__, size);
> +        return NULL;
> +    }
> +
> +    return ptr;
> +}
> +
> +static void *qemu_map_foreign_batch(int xc_handle, uint32_t dom, int prot,
> +                                    xen_pfn_t *arr, int num)
> +{
> +    ram_addr_t offset;
> +    void *ptr;
> +    int i;
> +    target_phys_addr_t len = num * TARGET_PAGE_SIZE;
> +
> +    char filename[] = "/dev/shm/qemu-vmcore.XXXXXX";
> +    int rc, fd;
> +
> +    fd = mkstemp(filename);
> +    if (fd == -1) {
> +        fprintf(stderr, "mkstemp(%s): %s\n", filename, strerror(errno));
> +        return NULL;
> +    };
> +    unlink(filename);
> +
> +    rc = ftruncate(fd, len);
> +    if (rc != 0) {
> +        fprintf(stderr, "ftruncate(0x%" PRIx64 "): %s\n",
> +                (uint64_t)len, strerror(errno));
> +        return NULL;
> +    }
> +
> +    ptr = mmap(NULL, len, PROT_WRITE | PROT_READ, MAP_SHARED | MAP_POPULATE,
> +               fd, 0);

mmap can fail.
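
For illustration, a hedged sketch of the missing check. The helper name
sketch_map_shared is made up for this example and MAP_POPULATE is left out for
brevity. Note that mmap reports failure as MAP_FAILED, not NULL.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: map len bytes of fd read/write, NULL on failure. */
static void *sketch_map_shared(int fd, size_t len)
{
    void *ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    if (ptr == MAP_FAILED) {
        fprintf(stderr, "mmap(%zu bytes): %s\n", len, strerror(errno));
        return NULL;
    }
    return ptr;
}
```

Passing an invalid descriptor, for example sketch_map_shared(-1, 4096), takes
the error path and returns NULL instead of handing (void *)-1 to the memcpy
that follows in the patch.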

> +
> +    for (i = 0; i < num; i++) {
> +        void *map;
> +        target_phys_addr_t pagelen = TARGET_PAGE_SIZE;
> +
> +        printf("arr[%d] = %#lx\n", i, cpu_get_physical_page_desc(arr[i] << PAGE_SHIFT));
> +
> +        /* fetch the pointer in qemu's own virtual memory */
> +        offset = cpu_get_physical_page_desc(arr[i] << PAGE_SHIFT);
> +        map = cpu_physical_memory_map(offset, &pagelen, 1);
> +
> +        /* copy current mem to new map */
> +        memcpy(ptr + (i * TARGET_PAGE_SIZE), map, TARGET_PAGE_SIZE);
> +
> +        if (mmap(map, TARGET_PAGE_SIZE, prot, MAP_SHARED | MAP_FIXED,
> +                 fd, i * TARGET_PAGE_SIZE) == (void*)-1) {

And what happens to ptr if it didn't fail and this mmap did?
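
As written, the staging mapping and the descriptor would both leak on that
path. A sketch of one way to unwind, under the assumption that the function
owns fd and that the pages to alias are passed in as host pointers (the
sketch_map_batch name and its parameters are hypothetical, not the patch's
interface):

```c
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch only: alias num page-aligned pages onto the file behind fd.
 * Takes ownership of fd. Returns the staging mapping or NULL on failure.
 */
static void *sketch_map_batch(int fd, size_t page_size, void *const *pages,
                              int num, int prot)
{
    size_t len = (size_t)num * page_size;
    char *staging;
    int i;

    staging = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (staging == MAP_FAILED) {
        goto err_close;
    }

    for (i = 0; i < num; i++) {
        /* preserve the current page contents in the shared file */
        memcpy(staging + i * page_size, pages[i], page_size);

        /* replace the page with a view of the same file offset */
        if (mmap(pages[i], page_size, prot, MAP_SHARED | MAP_FIXED,
                 fd, (off_t)(i * page_size)) == MAP_FAILED) {
            goto err_unmap;
        }
    }

    close(fd);  /* established mappings outlive the descriptor */
    return staging;

err_unmap:
    munmap(staging, len);
err_close:
    close(fd);
    return NULL;
}
```

Pages that were already remapped before the failure stay aliased to the file
in this sketch; undoing those would need extra bookkeeping. The early return
after a failed ftruncate in the patch would need the same close(fd).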

> +            fprintf(stderr, "%s: mmap(#%d, mfn 0x%lx): %s\n",
> +                    __FUNCTION__, i, arr[i], strerror(errno));
> +            return NULL;
> +        }
> +    }
> +
> +    return ptr;
> +}
> +
> +static void *qemu_map_foreign_pages(int xc_handle, uint32_t dom, int prot,
> +                                    const xen_pfn_t *arr, int num)
> +{
> +    return qemu_map_foreign_batch(xc_handle, dom, prot, (void*)arr, num);
> +}
> +
> +struct XenIfOps xc_xenner = {
> +    .interface_open    = _qemu_open,
> +    .interface_close   = qemu_close,
> +    .map_foreign_range = qemu_map_foreign_range,
> +    .map_foreign_batch = qemu_map_foreign_batch,
> +    .map_foreign_pages = qemu_map_foreign_pages,
> +};
> 

-- 
mailto:av1474@comtv.ru


* Re: [Qemu-devel] [PATCH 33/40] xenner: core
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 33/40] xenner: core Alexander Graf
@ 2010-11-01 15:13   ` malc
  0 siblings, 0 replies; 96+ messages in thread
From: malc @ 2010-11-01 15:13 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On Mon, 1 Nov 2010, Alexander Graf wrote:

> This patch adds generic xenner functionality to qemu.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  hw/xenner.h      |   52 +++++++++++++
>  hw/xenner_core.c |  224 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 276 insertions(+), 0 deletions(-)
>  create mode 100644 hw/xenner.h
>  create mode 100644 hw/xenner_core.c
> 
> diff --git a/hw/xenner.h b/hw/xenner.h
> new file mode 100644
> index 0000000..241dc4c
> --- /dev/null
> +++ b/hw/xenner.h
> @@ -0,0 +1,52 @@
> +/*
> + *  Copyright (C) Red Hat 2007
> + *  Copyright (C) Novell Inc. 2010
> + *
> + *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
> + *             Alexander Graf <agraf@suse.de>
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; under version 2 of the License.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License along
> + *  with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu-common.h"
> +
> +/* xenner_core.c */
> +struct xenner_emu_file {
> +    const char *filename;
> +    int        pages;
> +    void       *blob;
> +};
> +
> +void *xenner_mfn_to_ptr(xen_pfn_t pfn);
> +void *xenner_get_grant_table(int index);
> +void *xenner_get_vminfo(void);
> +uint64_t xenner_get_config(uint32_t reg);
> +uint32_t xenner_get_evtchn_pin(uint32_t reg);
> +void xenner_set_config(uint32_t reg, uint64_t value);
> +int xenner_load_emu_file(struct xenner_emu_file *f);
> +int xenner_guest_evtchn_alloc(void);
> +int xenner_guest_evtchn_release(int port);
> +int xenner_core_init(int evt);
> +
> +/* xenner_guest_store.c */
> +void xenner_guest_store_setup(uint64_t guest_mfn, evtchn_port_t guest_evtchn);
> +void xenner_guest_store_reset(void);
> +
> +/* xenner_pv.c */
> +int xenner_init_pv(const char *kernel_filename,
> +                   const char *kernel_cmdline,
> +                   const char *initrd_filename);
> +
> +int xenner_domain_build_pv(const char *kernel,
> +                           const char *initrd,
> +                           const char *cmdline);
> diff --git a/hw/xenner_core.c b/hw/xenner_core.c
> new file mode 100644
> index 0000000..53a9a75
> --- /dev/null
> +++ b/hw/xenner_core.c
> @@ -0,0 +1,224 @@
> +/*
> + *  Copyright (C) Red Hat 2007
> + *  Copyright (C) Novell Inc. 2010
> + *
> + *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
> + *             Alexander Graf <agraf@suse.de>
> + *
> + *  Xenner Core
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; under version 2 of the License.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License along
> + *  with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "hw.h"
> +#include "isa.h"
> +#include "sysemu.h"
> +#include "xen.h"
> +#include "xen_interfaces.h"
> +#include "xenner.h"
> +#include "xenner_emudev.h"
> +#include "loader.h"
> +
> +/* #define DEBUG */
> +
> +static int guest_evtchndev;
> +static struct emudev_state xen_emudev;
> +
> +/* ------------------------------------------------------------- */
> +
> +static void do_emudev_command(uint16_t cmd, uint16_t arg)
> +{
> +    switch (cmd) {
> +    case EMUDEV_CMD_NOP:
> +        /* nop vmexit */
> +        break;
> +    case EMUDEV_CMD_WRITE_CHAR:
> +#ifdef DEBUG
> +        fprintf(stderr, "%c", arg);
> +#endif
> +        break;
> +    case EMUDEV_CMD_CONFIGURATION_DONE:
> +        /* emu finished initial EMUDEV_CONF_* setup */
> +        break;
> +    case EMUDEV_CMD_EVTCHN_ALLOC:
> +        if (arg < EMUDEV_CONF_COMMAND_RESULT_COUNT) {
> +            xen_emudev.result[arg] = xenner_guest_evtchn_alloc();
> +        }
> +        break;
> +    case EMUDEV_CMD_EVTCHN_SEND:
> +        xc_evtchn.notify(guest_evtchndev, arg);
> +        break;
> +    case EMUDEV_CMD_GUEST_SHUTDOWN:
> +        switch (arg) {
> +        case SHUTDOWN_poweroff:
> +            fprintf(stderr, "xenner emudev: guest poweroff -> powerdown\n");
> +            qemu_system_powerdown_request();
> +            break;
> +        case SHUTDOWN_reboot:
> +            fprintf(stderr, "xenner emudev: guest reboot -> reset\n");
> +            qemu_system_reset_request();
> +            vm_stop(0);
> +            break;
> +        case SHUTDOWN_suspend:
> +            fprintf(stderr, "xenner emudev: guest suspend -> stop vm\n");
> +            vm_stop(0);
> +            break;
> +        case SHUTDOWN_crash:
> +            fprintf(stderr, "xenner emudev: guest crash -> reset\n");
> +            vm_stop(0);
> +            qemu_system_reset_request();
> +            break;
> +        }
> +        break;
> +    default:
> +        fprintf(stderr, "xenner emudev: cmd 0x%04x arg 0x%04x\n", cmd, arg);
> +    }
> +}
> +
> +static void xenner_port_write(void *opaque, uint32_t addr, uint32_t val)
> +{
> +    uint16_t cmd, arg;
> +
> +    switch (addr) {
> +    case EMUDEV_REG_CONF_ENTRY:
> +        emudev_write_entry(&xen_emudev, val);
> +        break;
> +    case EMUDEV_REG_CONF_VALUE:
> +        emudev_write_value(&xen_emudev, val);
> +        break;
> +    case EMUDEV_REG_COMMAND:
> +        cmd = val >> 16;
> +        arg = val & 0xffff;
> +        do_emudev_command(cmd, arg);
> +        break;
> +    default:
> +        fprintf(stderr, "io: %s, addr 0x%" PRIx16 " value 0x%" PRIx32 "\n",
> +                __FUNCTION__, addr, val);
> +        break;
> +    }
> +}
> +
> +static uint32_t xenner_port_read(void *opaque, uint32_t addr)
> +{
> +    uint32_t val;
> +
> +    switch (addr) {
> +    case EMUDEV_REG_CONF_VALUE:
> +        val = emudev_read_value(&xen_emudev);
> +        break;
> +    default:
> +        fprintf(stderr, "io: %s, addr 0x%" PRIx16 "\n", __FUNCTION__, addr);
> +        val = 0xffffffff;
> +        break;
> +    }
> +    return val;
> +}
> +
> +/* ------------------------------------------------------------- */
> +
> +void *xenner_mfn_to_ptr(xen_pfn_t pfn)
> +{
> +    ram_addr_t offset;
> +
> +    offset = cpu_get_physical_page_desc(pfn << PAGE_SHIFT);
> +    return qemu_get_ram_ptr(offset);
> +}
> +
> +void *xenner_get_grant_table(int index)
> +{
> +    uint32_t pfn;
> +
> +    if (index > EMUDEV_CONF_GRANT_TABLE_COUNT) {
> +        return NULL;
> +    }
> +    pfn = xen_emudev.gnttab[index];
> +    if (!pfn) {
> +        return NULL;
> +    }
> +    return xenner_mfn_to_ptr(pfn);
> +}
> +
> +void *xenner_get_vminfo(void)
> +{
> +    uint32_t pfn;
> +
> +    pfn = xen_emudev.config[EMUDEV_CONF_VMINFO_PFN];
> +    if (!pfn) {
> +        return NULL;
> +    }
> +    return xenner_mfn_to_ptr(pfn);
> +}
> +
> +uint64_t xenner_get_config(uint32_t reg)
> +{
> +    return xen_emudev.config[reg];
> +}
> +
> +uint32_t xenner_get_evtchn_pin(uint32_t reg)
> +{
> +    return xen_emudev.evtchn[reg];
> +}
> +
> +void xenner_set_config(uint32_t reg, uint64_t value)
> +{
> +    xen_emudev.config[reg] = value;
> +}
> +
> +/* ------------------------------------------------------------- */
> +
> +int xenner_load_emu_file(struct xenner_emu_file *f)
> +{
> +    int r;
> +
> +    /* Fetch size */
> +    r = get_image_size(f->filename);
> +    if (r <= 0) {
> +        fprintf(stderr, "can't find %s\n", f->filename);
> +        return -1;
> +    }
> +
> +    /* Allocate memory */
> +    f->pages = (r + PAGE_SIZE - 1) / PAGE_SIZE;
> +    f->blob = qemu_mallocz(f->pages * PAGE_SIZE);
> +    if (!f->blob) {
> +        return -1;
> +    }

qemu_mallocz doesn't return NULL; it aborts on allocation failure, so this
check is dead code.
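
A minimal sketch of the abort-on-failure allocation pattern, to show why the
caller needs no NULL check. xmallocz(), bytes_to_pages(), alloc_blob() and
SKETCH_PAGE_SIZE are hypothetical names used only for illustration; this is
not QEMU's actual implementation, and the 4 KiB page size is an assumption:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Hypothetical stand-in for an abort-on-failure allocator such as
 * qemu_mallocz(): returns zeroed memory or does not return at all. */
static void *xmallocz(size_t size)
{
    void *p = calloc(1, size ? size : 1);

    if (!p) {
        fprintf(stderr, "out of memory allocating %zu bytes\n", size);
        abort();
    }
    return p;
}

/* Round a byte count up to whole pages, as the patch does. */
static size_t bytes_to_pages(size_t bytes)
{
    return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Allocate a zeroed, page-rounded buffer; no NULL check is needed
 * because xmallocz() never returns NULL. */
static void *alloc_blob(size_t image_size, size_t *pages)
{
    *pages = bytes_to_pages(image_size);
    return xmallocz(*pages * SKETCH_PAGE_SIZE);
}
```

With such an allocator, the load path shrinks to "get size, allocate, load",
and the only errors left to report are the file ones.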

> +
> +    /* Load image */
> +    r = load_image(f->filename, f->blob);
> +
> +    return r;
> +}
> +
> +/* ------------------------------------------------------------- */
> +
> +int xenner_guest_evtchn_alloc(void)
> +{
> +    return xc_evtchn.bind_unbound_port(guest_evtchndev, 0);
> +}
> +
> +int xenner_guest_evtchn_release(int port)
> +{
> +    return xc_evtchn.unbind(guest_evtchndev, port);
> +}
> +
> +int xenner_core_init(int evt)
> +{
> +    guest_evtchndev = evt;
> +    xc_evtchn.domid(guest_evtchndev, xen_domid);
> +
> +    register_ioport_write(EMUDEV_REG_BASE, 16, 4, xenner_port_write, NULL);
> +    register_ioport_read(EMUDEV_REG_BASE, 16, 4, xenner_port_read, NULL);
> +
> +    return 0;
> +}
> 

-- 
mailto:av1474@comtv.ru

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [Qemu-devel] [PATCH 30/40] xenner: libxc emu: memory mapping
  2010-11-01 15:12   ` malc
@ 2010-11-01 15:15     ` Alexander Graf
  0 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:15 UTC (permalink / raw)
  To: malc; +Cc: qemu-devel Developers, Gerd Hoffmann


On 01.11.2010, at 11:12, malc wrote:

> On Mon, 1 Nov 2010, Alexander Graf wrote:
> 
>> Xenner emulates parts of libxc, so we can not use the real xen infrastructure
>> when running xen pv guests without xen.
>> 
>> This patch adds support for guest memory mapping.
>> 
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>> hw/xenner_libxc_if.c |  124 ++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 files changed, 124 insertions(+), 0 deletions(-)
>> create mode 100644 hw/xenner_libxc_if.c
>> 
>> diff --git a/hw/xenner_libxc_if.c b/hw/xenner_libxc_if.c
>> new file mode 100644
>> index 0000000..7ccd3c0
>> --- /dev/null
>> +++ b/hw/xenner_libxc_if.c
>> @@ -0,0 +1,124 @@
>> +/*
>> + *  Copyright (C) Red Hat 2007
>> + *  Copyright (C) Novell Inc. 2010
>> + *
>> + *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
>> + *             Alexander Graf <agraf@suse.de>
>> + *
>> + *  Xenner Emulation -- memory management
>> + *
>> + *  This program is free software; you can redistribute it and/or modify
>> + *  it under the terms of the GNU General Public License as published by
>> + *  the Free Software Foundation; under version 2 of the License.
>> + *
>> + *  This program is distributed in the hope that it will be useful,
>> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + *  GNU General Public License for more details.
>> + *
>> + *  You should have received a copy of the GNU General Public License along
>> + *  with this program; if not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include <sys/mman.h>
>> +#include <xenctrl.h>
>> +
>> +#include "hw.h"
>> +#include "xen_interfaces.h"
>> +#include "xenner.h"
>> +
>> +/* ------------------------------------------------------------- */
>> +
>> +static int _qemu_open(void)
>> +{
>> +    return 42;
>> +}
> 
> Identifiers with leading underscore are reserved in this context.

Yeah. Will change to xc_qemu_open.
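
For reference, a sketch of what the rename could look like. The C standard
reserves every identifier that begins with an underscore for use at file
scope, which is what a static function name is. The comment about the return
value is an assumption; the patch does not say why 42 was chosen:

```c
#include <assert.h>

/* Reserved at file scope: a name beginning with an underscore, e.g.
 *     static int _qemu_open(void);
 * An xc_-prefixed name stays out of the reserved namespace. */
static int xc_qemu_open(void)
{
    /* Presumably a dummy handle, since there is no real device to open. */
    return 42;
}
```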

> 
>> +
>> +static int qemu_close(int xc_handle)
>> +{
>> +    return 0;
>> +}
>> +
>> +static void *qemu_map_foreign_range(int xc_handle, uint32_t dom,
>> +                                    int size, int prot, unsigned long mfn)
>> +{
>> +    target_phys_addr_t addr, len;
>> +    void *ptr;
>> +
>> +    addr = (target_phys_addr_t)mfn << PAGE_SHIFT;
>> +    len = size;
>> +    ptr = cpu_physical_memory_map(addr, &len, 1);
>> +
>> +    if (len != size) {
>> +        fprintf(stderr, "%s: couldn't allocate %d bytes\n", __FUNCTION__, size);
>> +        return NULL;
>> +    }
>> +
>> +    return ptr;
>> +}
>> +
>> +static void *qemu_map_foreign_batch(int xc_handle, uint32_t dom, int prot,
>> +                                    xen_pfn_t *arr, int num)
>> +{
>> +    ram_addr_t offset;
>> +    void *ptr;
>> +    int i;
>> +    target_phys_addr_t len = num * TARGET_PAGE_SIZE;
>> +
>> +    char filename[] = "/dev/shm/qemu-vmcore.XXXXXX";
>> +    int rc, fd;
>> +
>> +    fd = mkstemp(filename);
>> +    if (fd == -1) {
>> +        fprintf(stderr, "mkstemp(%s): %s\n", filename, strerror(errno));
>> +        return NULL;
>> +    };
>> +    unlink(filename);
>> +
>> +    rc = ftruncate(fd, len);
>> +    if (rc != 0) {
>> +        fprintf(stderr, "ftruncate(0x%" PRIx64 "): %s\n",
>> +                (uint64_t)len, strerror(errno));
>> +        return NULL;
>> +    }
>> +
>> +    ptr = mmap(NULL, len, PROT_WRITE | PROT_READ, MAP_SHARED | MAP_POPULATE,
>> +               fd, 0);
> 
> mmap can fail.

Everything can fail, but what should we do if it does? Doesn't a failing mmap indicate something went really wrong?

Either way, this whole function is just plain wrong. It breaks once two callers try to access the same memory for example. I'm very open to suggestions on how to replace it with something that works.
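
For the error handling alone, a minimal sketch of the same mkstemp /
ftruncate / mmap sequence with every failure path checked could look like
this. map_tmpfile_shared() is a hypothetical name, the template uses /tmp
instead of /dev/shm, and MAP_POPULATE is left out to keep the sketch
portable; all of these are assumptions, not what the patch does:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create an unlinked temporary file of 'len' bytes and map it shared.
 * Returns NULL on failure, with the file descriptor released. */
static void *map_tmpfile_shared(size_t len)
{
    char filename[] = "/tmp/qemu-vmcore.XXXXXX";   /* assumption: /tmp */
    void *ptr;
    int fd;

    fd = mkstemp(filename);
    if (fd == -1) {
        fprintf(stderr, "mkstemp(%s): %s\n", filename, strerror(errno));
        return NULL;
    }
    unlink(filename);

    if (ftruncate(fd, (off_t)len) != 0) {
        fprintf(stderr, "ftruncate(%zu): %s\n", len, strerror(errno));
        close(fd);              /* the quoted code returns without closing fd */
        return NULL;
    }

    ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (ptr == MAP_FAILED) {    /* mmap signals failure with MAP_FAILED, not NULL */
        fprintf(stderr, "mmap(%zu): %s\n", len, strerror(errno));
        close(fd);
        return NULL;
    }

    close(fd);                  /* the mapping stays valid after close */
    return ptr;
}
```

This only covers the failure paths. It does nothing about the aliasing
problem: two callers mapping the same frames would still get separate copies.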


Alex

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [Qemu-devel] [PATCH 00/40] RFC: Xenner
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (34 preceding siblings ...)
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 35/40] xenner: Domain Builder Alexander Graf
@ 2010-11-01 15:21 ` Alexander Graf
  2010-11-02 16:26 ` [Qemu-devel] [PATCH 36/40] xen: only create dummy env when necessary Alexander Graf
                   ` (3 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:21 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

[-- Attachment #1: Type: text/plain, Size: 1491 bytes --]


On 01.11.2010, at 11:01, Alexander Graf wrote:

> Some of you might remember Gerd's xenner project. The basic motivation is to
> run Xen PV guests in KVM with the normal KVM architecture.
> 
> In order to achieve this, Xenner consists of two pieces:
> 
>  1) Xenner Qemu pieces
>  2) Xenner guest kernel
> 
> Part 1 is partially in qemu already. The xen support framework that Gerd pushed
> a while back can be used just as well for xenner. Some parts, like a special PV
> device to communicate with xenner, mechanisms to instantiate a VM, and
> replacements for xen infrastructure, are provided in patches here.
> 
> Part 2 is a completely self-contained piece of code. The xenner guest kernel
> runs in the VM's CPL0 context. It translates guest hypercalls to hardware calls
> that KVM implements, like CR3 modifications or LAPIC accesses.
> 
> This patch set tries to revive Gerd's code by integrating as much as possible
> into the qemu code base. My ultimate goal is to isolate the qemu xenner code
> well enough to be able to run an i386 xen pv guest with tcg on powerpc.
> 
> I'm sending this set out in the hope to receive feedback. Do you think this is
> a good idea? Can you spot some glitches in the code that I overlooked? See
> the list below for things I'm aware of to be broken.

I forgot to mention that this is also available from a git repo:

  git://repo.or.cz/qemu/agraf.git xenner-v0

That makes it a lot easier to try out :).


Alex


^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [Qemu-devel] [PATCH 14/40] xenner: kernel: Instruction emulator
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 14/40] xenner: kernel: Instruction emulator Alexander Graf
@ 2010-11-01 15:41   ` malc
  2010-11-01 18:46   ` [Qemu-devel] " Paolo Bonzini
  1 sibling, 0 replies; 96+ messages in thread
From: malc @ 2010-11-01 15:41 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On Mon, 1 Nov 2010, Alexander Graf wrote:

> In some cases we need to emulate guest instructions. This patch adds
> code to take care of this.
> 
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  pc-bios/xenner/xenner-instr.c |  405 +++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 405 insertions(+), 0 deletions(-)
>  create mode 100644 pc-bios/xenner/xenner-instr.c
> 
> diff --git a/pc-bios/xenner/xenner-instr.c b/pc-bios/xenner/xenner-instr.c
> new file mode 100644
> index 0000000..11be2ce
> --- /dev/null
> +++ b/pc-bios/xenner/xenner-instr.c
> @@ -0,0 +1,405 @@
> +/*
> + *  Copyright (C) Red Hat 2007
> + *  Copyright (C) Novell Inc. 2010
> + *
> + *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
> + *             Alexander Graf <agraf@suse.de>
> + *
> + *  Xenner instruction emulator
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; under version 2 of the License.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License along
> + *  with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "xenner.h"
> +#include "msr-index.h"
> +#include "cpufeature.h"
> +
> +void real_cpuid(struct kvm_cpuid_entry *entry)
> +{
> +    asm volatile("cpuid"
> +                 : "=a" (entry->eax),
> +                   "=b" (entry->ebx),
> +                   "=c" (entry->ecx),
> +                   "=d" (entry->edx)
> +                 : "a" (entry->function));
> +}
> +
> +static unsigned long clear_cpuid_bit(unsigned long bit, unsigned long x)
> +{
> +    unsigned long r = x;

This initialization serves no purpose; r is overwritten below before it is
ever read.
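
A sketch of how the helper could be reduced to a single expression. It goes
beyond the comment above in one respect: it assumes the X86_FEATURE_*
constants use the Linux-style word * 32 + bit encoding (the xenner
cpufeature.h is not part of this patch, so that is an assumption), and
therefore reduces modulo 32 and shifts an unsigned 32-bit one instead of a
plain int:

```c
#include <assert.h>
#include <stdint.h>

/* Clear one feature bit in a 32-bit CPUID register value.
 * 'bit' is assumed to be encoded as word * 32 + bit_in_word. */
static uint32_t clear_cpuid_bit(unsigned int bit, uint32_t x)
{
    return x & ~(UINT32_C(1) << (bit % 32));
}
```

Shifting an unsigned constant keeps the operation well defined for every bit
position from 0 to 31, including bit 31.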

> +
> +    bit %= 64;
> +    r = x & ~(1 << bit);
> +
> +    return r;
> +}
> +
> +static void filter_cpuid(struct kvm_cpuid_entry *entry)
> +{
> +    switch (entry->function) {
> +    case 0x00000001:
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_SEP, entry->edx);
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_DS, entry->edx);
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_DS, entry->edx);
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_ACC, entry->edx);
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_PBE, entry->edx);
> +
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_DTES64, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_MWAIT, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_DSCPL, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_VMXE, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_SMXE, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_EST, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_TM2, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_XTPR, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_PDCM, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_DCA, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_XSAVE, entry->ecx);
> +        /* fall through */
> +    case 0x80000001:
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_VME, entry->edx);
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_PSE, entry->edx);
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_PGE, entry->edx);
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_MCE, entry->edx);
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_MCA, entry->edx);
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_MTRR, entry->edx);
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_PSE36, entry->edx);
> +
> +#ifdef CONFIG_32BIT
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_LM, entry->edx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_LAHF_LM, entry->ecx);
> +#endif
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_PAGE1GB, entry->edx);
> +        entry->edx = clear_cpuid_bit(X86_FEATURE_RDTSCP, entry->edx);
> +
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_SVME, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_OSVW, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_IBS, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_SKINIT, entry->ecx);
> +        entry->ecx = clear_cpuid_bit(X86_FEATURE_WDT, entry->ecx);
> +        break;
> +
> +    case 0x00000005: /* MONITOR/MWAIT */
> +    case 0x0000000a: /* Architectural Performance Monitor Features */
> +    case 0x8000000a: /* SVM revision and features */
> +    case 0x8000001b: /* Instruction Based Sampling */
> +        entry->eax = 0;
> +        entry->ebx = 0;
> +        entry->ecx = 0;
> +        entry->edx = 0;
> +        break;
> +    }
> +}
> +
> +static void emulate_cpuid(struct regs *regs)
> +{
> +    struct kvm_cpuid_entry entry;
> +
> +    entry.function = regs->rax;
> +    real_cpuid(&entry);
> +    filter_cpuid(&entry);
> +    regs->rax = entry.eax;
> +    regs->rbx = entry.ebx;
> +    regs->rcx = entry.ecx;
> +    regs->rdx = entry.edx;
> +    printk(2, "cpuid 0x%08x: eax 0x%08x ebx 0x%08x ecx 0x%08x edx 0x%08x\n",
> +           entry.function, entry.eax, entry.ebx, entry.ecx, entry.edx);
> +}
> +
> +static void emulate_rdmsr(struct regs *regs)
> +{
> +    uint32_t ax,dx;
> +    switch (regs->rcx) {
> +    case MSR_EFER:
> +    case MSR_FS_BASE:
> +    case MSR_GS_BASE:
> +    case MSR_KERNEL_GS_BASE:
> +        /* white listed */
> +        rdmsr(regs->rcx, &ax, &dx);
> +        regs->rax = ax;
> +        regs->rdx = dx;
> +        break;
> +    default:
> +        printk(1, "%s: ignore: rcx 0x%" PRIxREG "\n", __FUNCTION__, regs->rcx);
> +        regs->rax = 0;
> +        regs->rdx = 0;
> +        break;
> +    }
> +}
> +
> +static void emulate_wrmsr(struct regs *regs)
> +{
> +    static const uint64_t known = (EFER_NX|EFER_LMA|EFER_LME|EFER_SCE);
> +    static const uint64_t fixed = (EFER_LMA|EFER_LME|EFER_SCE);
> +    uint32_t ax,dx;
> +
> +    switch (regs->rcx) {
> +    case MSR_EFER:
> +        if (regs->rax & ~known) {
> +            printk(1, "%s: efer: unknown bit set\n", __FUNCTION__);
> +            goto out;
> +        }
> +
> +        rdmsr(regs->rcx, &ax, &dx);
> +        if ((regs->rax & fixed) != (ax & fixed)) {
> +            printk(1, "%s: efer: modify fixed bit\n", __FUNCTION__);
> +            goto out;
> +        }
> +
> +        printk(1, "%s: efer:%s%s%s%s\n", __FUNCTION__,
> +               regs->rax & EFER_SCE ? " sce" : "",
> +               regs->rax & EFER_LME ? " lme" : "",
> +               regs->rax & EFER_LMA ? " lma" : "",
> +               regs->rax & EFER_NX  ? " nx"  : "");
> +        /* fall through */
> +    case MSR_FS_BASE:
> +    case MSR_GS_BASE:
> +    case MSR_KERNEL_GS_BASE:
> +        wrmsr(regs->rcx, regs->rax, regs->rdx);
> +        return;
> +    }
> +
> +out:
> +    printk(1, "%s: ignore: 0x%" PRIxREG " 0x%" PRIxREG ":0x%" PRIxREG "\n",
> +           __FUNCTION__, regs->rcx, regs->rdx, regs->rax);
> +}
> +
> +void print_emu_instr(int level, const char *prefix, uint8_t *instr)
> +{
> +    printk(level, "%s: rip %p bytes %02x %02x %02x %02x  %02x %02x %02x %02x\n",
> +           prefix, instr,
> +           instr[0], instr[1], instr[2], instr[3],
> +           instr[4], instr[5], instr[6], instr[7]);
> +}
> +
> +static ureg_t *decode_reg(struct regs *regs, uint8_t modrm, int rm)
> +{
> +    int shift = rm ? 0 : 3;
> +    ureg_t *reg = NULL;
> +
> +    switch ((modrm >> shift) & 0x07) {
> +    case 0: reg = (ureg_t*)&regs->rax; break;
> +    case 1: reg = (ureg_t*)&regs->rcx; break;
> +    case 2: reg = (ureg_t*)&regs->rdx; break;
> +    case 3: reg = (ureg_t*)&regs->rbx; break;
> +    case 4: reg = (ureg_t*)&regs->rsp; break;
> +    case 5: reg = (ureg_t*)&regs->rbp; break;
> +    case 6: reg = (ureg_t*)&regs->rsi; break;
> +    case 7: reg = (ureg_t*)&regs->rdi; break;
> +    }
> +    return reg;
> +}
> +
> +void print_bits(int level, const char *msg, uint32_t old, uint32_t new,
> +                const char *names[])
> +{
> +    char buf[128];
> +    int pos = 0;
> +    uint32_t mask;
> +    char *mod;
> +    int i;
> +
> +    pos += snprintf(buf+pos, sizeof(buf)-pos, "%s:", msg);
> +    for (i = 0; i < 32; i++) {
> +        mask = 1 << i;
> +        if (new&mask) {
> +            if (old&mask) {
> +                /* bit present */
> +                mod = "";
> +            } else {
> +                /* bit added */
> +                mod = "+";
> +            }
> +        } else {
> +            if (old&mask) {
> +                /* bit removed */
> +                mod = "-";
> +            } else {
> +                /* bit not present */
> +                continue;
> +            }
> +        }
> +        pos += snprintf(buf+pos, sizeof(buf)-pos, " %s%s",
> +                        mod, names[i] ? names[i] : "???");
> +    }
> +    pos += snprintf(buf+pos, sizeof(buf)-pos, "\n");
> +    printk(level, "%s", buf);
> +}
> +
> +int emulate(struct xen_cpu *cpu, struct regs *regs)
> +{
> +    static const uint8_t xen_emu_prefix[5] = {0x0f, 0x0b, 'x','e','n'};
> +    uint8_t *instr;
> +    int skip = 0;
> +    int in = 0;
> +    int shift = 0;
> +    int port = 0;
> +
> +restart:
> +    instr = (void*)regs->rip;
> +
> +    /* prefixes */
> +    if (instr[skip] == 0x66) {
> +        shift = 16;
> +        skip++;
> +    }
> +
> +    /* instructions */
> +    switch (instr[skip]) {
> +    case 0x0f:
> +        switch (instr[skip+1]) {
> +        case 0x06:
> +            /* clts */
> +            clts();
> +            skip += 2;
> +            break;
> +        case 0x09:
> +            /* wbinvd */
> +            __asm__("wbinvd" ::: "memory");
> +            skip += 2;
> +            break;
> +        case 0x0b:
> +            /* ud2a */
> +            if (xen_emu_prefix[2] == instr[skip+2] &&
> +                xen_emu_prefix[3] == instr[skip+3] &&
> +                xen_emu_prefix[4] == instr[skip+4]) {
> +                printk(2, "%s: xen emu prefix\n", __FUNCTION__);
> +                regs->rip += 5;
> +                goto restart;
> +            }
> +            printk(1, "%s: ud2a -- linux kernel BUG()?\n", __FUNCTION__);
> +            /* bounce to guest, hoping it prints more info */
> +            return 0;
> +        case 0x20:
> +        {
> +            /* read control registers */
> +            ureg_t *reg = decode_reg(regs, instr[skip+2], 1);
> +            switch (((instr[skip+2]) >> 3) & 0x07) {
> +            case 0:
> +                *reg = read_cr0();
> +                skip = 3;
> +                break;
> +            case 3:
> +                *reg = frame_to_addr(read_cr3_mfn(cpu));
> +                skip = 3;
> +                break;
> +            case 4:
> +                *reg = read_cr4();
> +                skip = 3;
> +                break;
> +            }
> +            break;
> +        }
> +        case 0x22:
> +        {
> +            /* write control registers */
> +            static const ureg_t cr0_fixed = ~(X86_CR0_TS);
> +            static const ureg_t cr4_fixed = X86_CR4_TSD;
> +            ureg_t *reg = decode_reg(regs, instr[skip+2], 1);
> +            ureg_t cr;
> +            switch (((instr[skip+2]) >> 3) & 0x07) {
> +            case 0:
> +                cr = read_cr0();
> +                if (cr != *reg) {
> +                    if ((cr & cr0_fixed) == (*reg & cr0_fixed)) {
> +                        print_bits(2, "apply cr0 update", cr, *reg, cr0_bits);
> +                        write_cr0(*reg);
> +                    } else {
> +                        print_bits(1, "IGNORE cr0 update", cr, *reg, cr0_bits);
> +                    }
> +                }
> +                skip = 3;
> +                break;
> +            case 4:
> +                cr = read_cr4();
> +                if (cr != *reg) {
> +                    if ((cr & cr4_fixed) == (*reg & cr4_fixed)) {
> +                        print_bits(1, "apply cr4 update", cr, *reg, cr4_bits);
> +                        write_cr4(*reg);
> +                    } else {
> +                        print_bits(1, "IGNORE cr4 update", cr, *reg, cr4_bits);
> +                    }
> +                }
> +                skip = 3;
> +                break;
> +            }
> +            break;
> +        }
> +        case 0x30:
> +            /* wrmsr */
> +            emulate_wrmsr(regs);
> +            skip += 2;
> +            break;
> +        case 0x32:
> +            /* rdmsr */
> +            emulate_rdmsr(regs);
> +            skip += 2;
> +            break;
> +        case 0xa2:
> +            /* cpuid */
> +            emulate_cpuid(regs);
> +            skip += 2;
> +            break;
> +        }
> +        break;
> +
> +    case 0xe4: /* in     <next byte>,%al */
> +    case 0xe5:
> +        in = (instr[skip] & 1) ? 2 : 1;
> +        port = instr[skip+1];
> +        skip += 2;
> +        break;
> +    case 0xec: /* in     (%dx),%al */
> +    case 0xed:
> +        in = (instr[skip] & 1) ? 2 : 1;
> +        port = regs->rdx & 0xffff;
> +        skip += 1;
> +        break;
> +    case 0xe6: /* out    %al,<next byte> */
> +    case 0xe7:
> +        port = instr[skip+1];
> +        skip += 2;
> +        break;
> +    case 0xee: /* out    %al,(%dx) */
> +    case 0xef:
> +        port = regs->rdx & 0xffff;
> +        skip += 1;
> +        break;
> +
> +    case 0xfa:
> +        /* cli */
> +        guest_cli(cpu);
> +        skip += 1;
> +        break;
> +    case 0xfb:
> +        /* sti */
> +        guest_sti(cpu);
> +        skip += 1;
> +        break;
> +    }
> +
> +    /* unknown instruction */
> +    if (!skip) {
> +        print_emu_instr(0, "instr emu failed", instr);
> +        return -1;
> +    }
> +
> +    /* I/O instruction */
> +    if (in == 2) {
> +        regs->rax |= 0xffffffff;
> +    } else if (in == 1) {
> +        regs->rax |= (0xffff << shift);
> +    }
> +
> +    return skip;
> +}
> 

-- 
mailto:av1474@comtv.ru

^ permalink raw reply	[flat|nested] 96+ messages in thread

* Re: [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files Alexander Graf
@ 2010-11-01 15:44   ` Anthony Liguori
  2010-11-01 15:47     ` Alexander Graf
  0 siblings, 1 reply; 96+ messages in thread
From: Anthony Liguori @ 2010-11-01 15:44 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 10:01 AM, Alexander Graf wrote:
> This patch adds various header files required for the xenner kernel on 64 bit
> systems.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
>    

I think it might make more sense to put this on a separate git.qemu.org 
repository and then use a submodule in roms/.

I'm happy to set up access for you/Gerd and it can use qemu-devel as a
mailing list.

Regards,

Anthony Liguori

> ---
>   pc-bios/xenner/xenner64.S   |  400 +++++++++++++++++++++++++++++++++++++++++++
>   pc-bios/xenner/xenner64.h   |  117 +++++++++++++
>   pc-bios/xenner/xenner64.lds |   38 ++++
>   3 files changed, 555 insertions(+), 0 deletions(-)
>   create mode 100644 pc-bios/xenner/xenner64.S
>   create mode 100644 pc-bios/xenner/xenner64.h
>   create mode 100644 pc-bios/xenner/xenner64.lds
>
> diff --git a/pc-bios/xenner/xenner64.S b/pc-bios/xenner/xenner64.S
> new file mode 100644
> index 0000000..b140214
> --- /dev/null
> +++ b/pc-bios/xenner/xenner64.S
> @@ -0,0 +1,400 @@
> +#define	ENTRY(name) \
> +	.globl name; \
> +	.align 16; \
> +	name:
> +
> +	.macro PUSH_ERROR
> +	sub $8, %rsp		/* space for error code */
> +	.endm
> +
> +	.macro PUSH_TRAP_RBP trapno
> +	sub $8, %rsp		/* space for trap number */
> +	push %rbp
> +	mov $\trapno, %rbp
> +	mov %rbp, 8(%rsp)	/* save trap number on stack */
> +	.endm
> +
> +	.macro PUSH_REGS
> +	push %rdi
> +	push %rsi
> +	push %r15
> +	push %r14
> +	push %r13
> +	push %r12
> +	push %r11
> +	push %r10
> +	push %r9
> +	push %r8
> +	push %rdx
> +	push %rcx
> +	push %rbx
> +	push %rax
> +	mov  %rsp,%rdi		/* struct regs pointer */
> +	.endm
> +
> +	.macro POP_REGS
> +	pop %rax
> +	pop %rbx
> +	pop %rcx
> +	pop %rdx
> +	pop %r8
> +	pop %r9
> +	pop %r10
> +	pop %r11
> +	pop %r12
> +	pop %r13
> +	pop %r14
> +	pop %r15
> +	pop %rsi
> +	pop %rdi
> +	pop %rbp
> +	.endm
> +
> +	.macro RETURN
> +	add $16, %rsp		/* remove error code & trap number */
> +	iretq			/* jump back */
> +	.endm
> +
> +	.macro DO_TRAP trapno func
> +	PUSH_TRAP_RBP \trapno
> +	PUSH_REGS
> +	call \func
> +	POP_REGS
> +	RETURN
> +	.endm
> +
> +/* ------------------------------------------------------------------ */
> +
> +	.code64
> +	.text
> +
> +/* --- 16-bit boot entry point --- */
> +
> +ENTRY(boot)
> +	.code16
> +
> +	cli
> +
> +	/* load the GDT */
> +	lgdt	(gdt_desc - boot)
> +
> +	/* enable protected mode (CR0.PE) */
> +	mov	$0x1, %eax
> +	mov	%eax, %cr0
> +
> +	/* enable pagetables */
> +	mov	$(boot_pgd - boot), %eax
> +	mov	%eax, %cr3
> +
> +	/* set PSE,  PAE */
> +	mov	$0x30, %eax
> +	mov	%eax, %cr4
> +
> +	/* long mode */
> +	mov	$0xc0000080, %ecx
> +	rdmsr
> +	or	$0x101, %eax
> +	wrmsr
> +
> +	/* turn on long mode and paging */
> +	mov	$0x80010001, %eax
> +	mov	%eax, %cr0
> +
> +	ljmp	$0x8, $(boot64 - boot)
> +
> +
> +.align 4, 0
> +gdt:
> +.byte   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 /* dummy */
> +.byte   0xff, 0xff, 0x00, 0x00, 0x00, 0x9b, 0xaf, 0x00 /* code64 */
> +.byte   0xff, 0xff, 0x00, 0x00, 0x00, 0x93, 0xaf, 0x00 /* data64 */
> +
> +gdt_desc:
> +.short  (3 * 8) - 1
> +.long   (gdt - boot)
> +
> +
> +/* --- 64-bit entry point --- */
> +
> +boot64:
> +	.code64
> +
> +	mov $0x10, %ax
> +	mov %ax, %ds
> +	mov %ax, %es
> +	mov %ax, %fs
> +	mov %ax, %gs
> +	mov %ax, %ss
> +
> +	jmpq	*(boot64_ind - boot)
> +
> +boot64_ind:
> +	.quad boot64_real
> +
> +boot64_real:
> +
> +	/* switch to real pagetables */
> +	mov	$(emu_pgd - boot), %rax
> +	mov	%rax, %cr3
> +
> +	lea boot_stack_high(%rip), %rsp	/* setup stack */
> +	sub $176, %rsp			/* sizeof(struct regs_64) */
> +	mov  %rsp,%rdi			/* struct regs pointer */
> +	cmp $0, %rbx
> +	jne secondary
> +	call do_boot
> +	POP_REGS
> +	RETURN
> +
> +secondary:
> +	mov  %rdi,%rsi
> +	mov  %rbx,%rdi
> +	call do_boot_secondary
> +	POP_REGS
> +	RETURN
> +
> +/* --- traps/faults handled by emu --- */
> +
> +ENTRY(debug_int1)
> +	PUSH_ERROR
> +	DO_TRAP 1 do_int1
> +
> +ENTRY(debug_int3)
> +	PUSH_ERROR
> +	DO_TRAP 3 do_int3
> +
> +ENTRY(illegal_instruction)
> +	PUSH_ERROR
> +	DO_TRAP 6 do_illegal_instruction
> +
> +ENTRY(no_device)
> +	PUSH_ERROR
> +	DO_TRAP 7 do_lazy_fpu
> +
> +ENTRY(double_fault)
> +	DO_TRAP 8 do_double_fault
> +
> +ENTRY(general_protection)
> +	DO_TRAP 13 do_general_protection
> +
> +ENTRY(page_fault)
> +	DO_TRAP 14 do_page_fault
> +
> +/* --- traps/faults forwarded to guest --- */
> +
> +ENTRY(division_by_zero)
> +	PUSH_ERROR
> +	PUSH_TRAP_RBP 0
> +	jmp guest_forward
> +
> +ENTRY(nmi)
> +	PUSH_ERROR
> +	PUSH_TRAP_RBP 2
> +	jmp guest_forward
> +
> +ENTRY(overflow)
> +	PUSH_ERROR
> +	PUSH_TRAP_RBP 4
> +	jmp guest_forward
> +
> +ENTRY(bound_check)
> +	PUSH_ERROR
> +	PUSH_TRAP_RBP 5
> +	jmp guest_forward
> +
> +ENTRY(coprocessor)
> +	PUSH_ERROR
> +	PUSH_TRAP_RBP 9
> +	jmp guest_forward
> +
> +ENTRY(invalid_tss)
> +	PUSH_TRAP_RBP 10
> +	jmp guest_forward
> +
> +ENTRY(segment_not_present)
> +	PUSH_TRAP_RBP 11
> +	jmp guest_forward
> +
> +ENTRY(stack_fault)
> +	PUSH_TRAP_RBP 12
> +	jmp guest_forward
> +
> +ENTRY(floating_point)
> +	PUSH_ERROR
> +	PUSH_TRAP_RBP 16
> +	jmp guest_forward
> +
> +ENTRY(alignment)
> +	PUSH_TRAP_RBP 17
> +	jmp guest_forward
> +
> +ENTRY(machine_check)
> +	PUSH_ERROR
> +	PUSH_TRAP_RBP 18
> +	jmp guest_forward
> +
> +ENTRY(simd_floating_point)
> +	PUSH_ERROR
> +	PUSH_TRAP_RBP 19
> +	jmp guest_forward
> +
> +guest_forward:
> +	PUSH_REGS
> +	call do_guest_forward
> +	POP_REGS
> +	RETURN
> +
> +/* --- interrupts 32 ... 255 --- */
> +
> +ENTRY(smp_flush_tlb)
> +	PUSH_ERROR
> +	DO_TRAP -1 do_smp_flush_tlb
> +
> +ENTRY(int_80)
> +	PUSH_ERROR
> +	DO_TRAP -1 do_int_80
> +
> +ENTRY(irq_entries)
> +vector=0
> +.rept 256
> +	.align 16
> +	PUSH_ERROR
> +	PUSH_TRAP_RBP vector
> +	jmp irq_common
> +vector=vector+1
> +.endr
> +
> +ENTRY(irq_common)
> +	PUSH_REGS
> +	call do_irq
> +	POP_REGS
> +	RETURN
> +
> +/* --- syscall --- */
> +
> +ENTRY(trampoline_syscall)
> +	/* we arrive here via per-cpu trampoline
> +	 * which sets up the stack for us */
> +	PUSH_ERROR
> +	PUSH_TRAP_RBP -1
> +	PUSH_REGS
> +	call do_syscall
> +	POP_REGS
> +
> +	add $16, %rsp			/* remove error code + trapno */
> +	cmp $-1, -8(%rsp)
> +	je syscall_vmexit
> +	cmp $-2, -8(%rsp)
> +	je syscall_iretq
> +
> +syscall_default:
> +	/* default sysret path */
> +	popq %rcx			/* rip    */
> +	popq %r11			/* cs     */
> +	popq %r11			/* rflags */
> +	popq %rsp			/* rsp    */
> +	sysretq
> +
> +syscall_vmexit:
> +	/* bounce hypercall to userspace */
> +	popq %rcx			/* rip    */
> +	popq %r11			/* cs     */
> +	popq %r11			/* rflags */
> +	popq %rsp			/* rsp    */
> +	out %al, $0xe0			/* let userspace handle it */
> +	sysretq
> +
> +syscall_iretq:
> +	/* return from syscall via iretq */
> +	iretq
> +
> +/* helpers */
> +
> +ENTRY(broken_memcpy_pf)
> +	mov %rdx,%rcx
> +	cld
> +1:	rep movsb
> +	xor %rax,%rax
> +8:	ret
> +
> +	.section .exfix, "ax"
> +9:	mov $-1, %rax
> +	jmp 8b
> +	.previous
> +
> +	.section .extab, "a"
> +	.align 8
> +	.quad 1b,9b
> +	.previous
> +
> +/* some 16 bit code for smp boot */
> +
> +	.code16
> +	.align 4096
> +ENTRY(sipi)
> +	mov $0x00060000, %eax  /* EMUDEV_CMD_INIT_SECONDARY_VCPU */
> +	outl %eax, $0xe8       /* EMUDEV_REG_COMMAND */
> +	hlt
> +	.code64
> +
> +/* emu boot stack, including syscall trampoline template */
> +
> +	.data
> +	.globl boot_stack_low, boot_stack_high
> +	.globl cpu_ptr
> +	.globl trampoline_start, trampoline_patch, trampoline_stop
> +	.align 4096
> +boot_stack_low:
> +cpu_ptr:
> +	.quad 0
> +trampoline_start:
> +	movq %rsp, boot_stack_high-16(%rip)
> +	leaq boot_stack_high-16(%rip), %rsp
> +	push %r11				/* rflags	 */
> +	mov  $0xdeadbeef, %r11			/* C code must fix cs & ss */
> +	movq %r11, boot_stack_high-8(%rip)	/* ss	     */
> +	push %r11				/* cs	     */
> +	push %rcx				/* rip	    */
> +
> +	.byte 0x49, 0xbb			/* mov data, %r11 ...	 */
> +trampoline_patch:
> +	.quad 0					/* ... data, for jump to ...  */
> +	jmpq *%r11				/* ... trampoline_syscall     */
> +trampoline_stop:
> +	.align 4096
> +boot_stack_high:
> +
> +/* boot page tables */
> +
> +#define pageflags 0x063 /* present, rw, accessed, dirty */
> +#define largepage 0x080 /* pse */
> +
> +	.section .pt, "aw"
> +	.globl emu_pgd
> +
> +	.align 4096
> +boot_pgd:
> +	.quad emu_pud - 0xffff830000000000 + pageflags
> +	.fill 261,8,0
> +	.quad emu_pud - 0xffff830000000000 + pageflags
> +	.fill 249,8,0
> +
> +	.align 4096
> +emu_pgd:
> +	.fill 262,8,0
> +	.quad emu_pud - 0xffff830000000000 + pageflags
> +	.fill 249,8,0
> +
> +	.align 4096
> +emu_pud:
> +	.quad emu_pmd - 0xffff830000000000 + pageflags
> +	.fill 511,8,0
> +
> +	.align 4096
> +emu_pmd:
> +i = 0
> +	.rept 512
> +	.quad pageflags + largepage | (i << 21)
> +	i = i + 1
> +	.endr
> +	.align 4096
> diff --git a/pc-bios/xenner/xenner64.h b/pc-bios/xenner/xenner64.h
> new file mode 100644
> index 0000000..92d956e
> --- /dev/null
> +++ b/pc-bios/xenner/xenner64.h
> @@ -0,0 +1,117 @@
> +#include <xen/foreign/x86_64.h>
> +
> +struct regs_64 {
> +    /* pushed onto stack before calling into C code */
> +    uint64_t rax;
> +    uint64_t rbx;
> +    uint64_t rcx;
> +    uint64_t rdx;
> +    uint64_t r8;
> +    uint64_t r9;
> +    uint64_t r10;
> +    uint64_t r11;
> +    uint64_t r12;
> +    uint64_t r13;
> +    uint64_t r14;
> +    uint64_t r15;
> +    uint64_t rsi;
> +    uint64_t rdi;
> +    uint64_t rbp;
> +    uint64_t trapno;
> +    /* trap / fault / int created */
> +    uint64_t error;
> +    uint64_t rip;
> +    uint64_t cs;
> +    uint64_t rflags;
> +    uint64_t rsp;
> +    uint64_t ss;
> +};
> +
> +/* 64bit defines */
> +#define EMUNAME   "xenner64"
> +#define regs      regs_64
> +#define fix_sel   fix_sel64
> +#define fix_desc  fix_desc64
> +#define ureg_t    uint64_t
> +#define sreg_t    int64_t
> +#define PRIxREG   PRIx64
> +#define tss(_v)   ((0xe000 >> 3) +  8 + (((_v)->id) << 2))
> +#define ldt(_v)   ((0xe000 >> 3) + 10 + (((_v)->id) << 2))
> +
> +/* xenner-data.c */
> +extern struct idt_64 page_aligned xen_idt[256];
> +
> +/* xenner-main.c */
> +asmlinkage void do_int_80(struct regs_64 *regs);
> +
> +/* xenner-hcall.c */
> +void switch_mode(struct xen_cpu *cpu);
> +int is_kernel(struct xen_cpu *cpu);
> +asmlinkage void do_syscall(struct regs_64 *regs);
> +
> +/* xenner-mm.c */
> +void pgtable_walk(int level, uint64_t va, uint64_t root_mfn);
> +int pgtable_fixup_flag(struct xen_cpu *cpu, uint64_t va, uint32_t flag);
> +int pgtable_is_present(uint64_t va, uint64_t root_mfn);
> +void *map_page(uint64_t maddr);
> +void *fixmap_page(struct xen_cpu *cpu, uint64_t maddr);
> +static inline void free_page(void *ptr) {}
> +uint64_t *find_pte_64(uint64_t va);
> +
> +/* macros */
> +#define context_is_emu(_r)       (((_r)->cs & 0x03) == 0x00)
> +#define context_is_kernel(_v,_r) (((_r)->cs & 0x03) == 0x03 && is_kernel(_v))
> +#define context_is_user(_v,_r)   (((_r)->cs & 0x03) == 0x03 && !is_kernel(_v))
> +
> +#define addr_is_emu(va)     (((va) >= XEN_M2P_64) && ((va) < XEN_DOM_64))
> +#define addr_is_kernel(va)  ((va) >= XEN_DOM_64)
> +#define addr_is_user(va)    ((va) < XEN_M2P_64)
> +
> +/* inline asm bits */
> +static inline int wrmsr_safe(uint32_t msr, uint32_t ax, uint32_t dx)
> +{
> +    int ret;
> +    asm volatile("1:  wrmsr                \n"
> +                 "    xorl %0,%0           \n"
> +                 "2:  nop                  \n"
> +
> +                 ".section .exfix, \"ax\"  \n"
> +                 "3:  mov $-1,%0           \n"
> +                 "    jmp 2b               \n"
> +                 ".previous                \n"
> +
> +                 ".section .extab, \"a\"   \n"
> +                 "    .align 8             \n"
> +                 "    .quad 1b,3b          \n"
> +                 ".previous                \n"
> +                 : "=r" (ret)
> +                 : "c" (msr), "a" (ax), "d" (dx));
> +    return ret;
> +}
> +
> +static inline int memcpy_pf(void *dest, const void *src, size_t bytes)
> +{
> +    int ret;
> +
> +    asm volatile("    cld                  \n"
> +                 "91: rep movsb            \n"
> +                 "    xor %[ret],%[ret]    \n"
> +                 "98:                      \n"
> +
> +                 ".section .exfix, \"ax\"  \n"
> +                 "99: mov $-1, %[ret]      \n"
> +                 "    jmp 98b              \n"
> +                 ".previous                \n"
> +
> +                 ".section .extab, \"a\"   \n"
> +                 "    .align 8             \n"
> +                 "    .quad 91b,99b        \n"
> +                 ".previous                \n"
> +                 : [ ret ] "=a" (ret),
> +                   [ rsi ] "+S" (src),
> +                   [ rdi ] "+D" (dest),
> +                   [ rcx ] "+c" (bytes)
> +                 :
> +                 : "memory" );
> +    return ret;
> +}
> diff --git a/pc-bios/xenner/xenner64.lds b/pc-bios/xenner/xenner64.lds
> new file mode 100644
> index 0000000..0b580a9
> --- /dev/null
> +++ b/pc-bios/xenner/xenner64.lds
> @@ -0,0 +1,38 @@
> +OUTPUT_FORMAT("elf64-x86-64")
> +
> +SECTIONS
> +{
> +    . = 0xffff830000000000;
> +    _vstart = .;
> +    phys_startup_64 = 0x0;
> +
> +    /* code */
> +    .text : AT(ADDR(.text) - 0xffff830000000000) { *(.text) }
> +    . = ALIGN(4k);
> +    .exfix  : { *(.exfix)  }
> +
> +    /* data, ro */
> +    . = ALIGN(4k);
> +    .note.gnu.build-id : { *(.note.gnu.build-id) }
> +    . = ALIGN(4k);
> +    _estart = .;
> +    .extab  : { *(.extab)  }
> +    _estop  = .;
> +    . = ALIGN(4k);
> +    .rodata : { *(.rodata) }
> +
> +    /* data, rw */
> +    . = ALIGN(4k);
> +    .pt     : { *(.pt) }
> +    . = ALIGN(4k);
> +    .pgdata : { *(.pgdata) }
> +    . = ALIGN(4k);
> +    .data   : { *(.data)   }
> +
> +    /* bss */
> +    . = ALIGN(4k);
> +    .bss    : { *(.bss)    }
> +
> +    . = ALIGN(4k);
> +    _vstop  = .;
> +}
>    
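
Two constants hard-coded in the quoted xenner64.S have to agree with other files in the patch: the "sub $176, %rsp" in boot64_real must match sizeof(struct regs_64) from xenner64.h, and the page-directory slot that boot_pgd and emu_pgd fill must be the one selected by the link address 0xffff830000000000 from xenner64.lds. A minimal sketch of how both could be checked, assuming a C11 compiler; the struct copy regs_64_sketch and the helper pml4_index() are illustrative names and not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy of struct regs_64 from xenner64.h, renamed to mark it as a sketch. */
struct regs_64_sketch {
    /* pushed by PUSH_TRAP_RBP / PUSH_REGS before calling into C code */
    uint64_t rax, rbx, rcx, rdx;
    uint64_t r8, r9, r10, r11, r12, r13, r14, r15;
    uint64_t rsi, rdi, rbp;
    uint64_t trapno;
    /* error code slot followed by the hardware interrupt frame */
    uint64_t error;
    uint64_t rip, cs, rflags, rsp, ss;
};

/* boot64_real reserves the frame with "sub $176, %rsp". */
static_assert(sizeof(struct regs_64_sketch) == 176,
              "assembly stack reservation must match the C struct");

/* RETURN drops 16 bytes (trapno + error) so that iretq finds rip first. */
static_assert(offsetof(struct regs_64_sketch, rip) -
              offsetof(struct regs_64_sketch, trapno) == 16,
              "RETURN must skip exactly trapno and error");

/* Hypothetical helper: top-level page table slot (bits 47..39) of a
 * virtual address under 4-level x86-64 paging. */
static inline unsigned pml4_index(uint64_t va)
{
    return (unsigned)((va >> 39) & 0x1ff);
}
```

With these definitions, pml4_index(0xffff830000000000ULL) evaluates to 262, which is the slot that follows ".fill 262,8,0" in emu_pgd and ".quad ...; .fill 261,8,0" in boot_pgd.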

* Re: [Qemu-devel] [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 28/40] xenner: libxc emu: evtchn Alexander Graf
@ 2010-11-01 15:45   ` Anthony Liguori
  2010-11-01 15:49     ` Alexander Graf
  0 siblings, 1 reply; 96+ messages in thread
From: Anthony Liguori @ 2010-11-01 15:45 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 10:01 AM, Alexander Graf wrote:
> Xenner emulates parts of libxc, so we can not use the real xen infrastructure
> when running xen pv guests without xen.
>
> This patch adds support for event channel communication.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
>    

Has anyone checked with the Xen folks about supporting this type of 
functionality in libxc directly?

Regards,

Anthony Liguori

> ---
>   hw/xenner_libxc_evtchn.c |  467 ++++++++++++++++++++++++++++++++++++++++++++++
>   1 files changed, 467 insertions(+), 0 deletions(-)
>   create mode 100644 hw/xenner_libxc_evtchn.c
>
> diff --git a/hw/xenner_libxc_evtchn.c b/hw/xenner_libxc_evtchn.c
> new file mode 100644
> index 0000000..bb1984c
> --- /dev/null
> +++ b/hw/xenner_libxc_evtchn.c
> @@ -0,0 +1,467 @@
> +/*
> + *  Copyright (C) Red Hat 2007
> + *  Copyright (C) Novell Inc. 2010
> + *
> + *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
> + *             Alexander Graf <agraf@suse.de>
> + *
> + *  Xenner emulation -- event channels
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; under version 2 of the License.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License along
> + *  with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <assert.h>
> +#include <xenctrl.h>
> +
> +#include "hw.h"
> +#include "qemu-log.h"
> +#include "console.h"
> +#include "monitor.h"
> +#include "xen.h"
> +#include "xen_interfaces.h"
> +
> +/* ------------------------------------------------------------- */
> +
> +struct evtpriv;
> +
> +struct port {
> +    struct evtpriv   *priv;
> +    struct port      *peer;
> +    int              port;
> +    int              pending;
> +    int              count_snd;
> +    int              count_fwd;
> +    int              count_msg;
> +};
> +
> +struct domain {
> +    int              domid;
> +    struct port      p[NR_EVENT_CHANNELS];
> +};
> +static struct domain dom0;  /* host  */
> +static struct domain domU;  /* guest */
> +
> +struct evtpriv {
> +    int                      fd_read, fd_write;
> +    struct domain            *domain;
> +    int                      ports;
> +    int                      pending;
> +    QTAILQ_ENTRY(evtpriv)    list;
> +};
> +static QTAILQ_HEAD(evtpriv_head, evtpriv) privs = QTAILQ_HEAD_INITIALIZER(privs);
> +
> +static int debug = 0;
> +
> +/* ------------------------------------------------------------- */
> +
> +static struct evtpriv *getpriv(int handle)
> +{
> +    struct evtpriv *priv;
> +
> +    QTAILQ_FOREACH(priv, &privs, list) {
> +        if (priv->fd_read == handle) {
> +            return priv;
> +        }
> +    }
> +    return NULL;
> +}
> +
> +static struct domain *get_domain(int domid)
> +{
> +    if (domid == 0) {
> +        return &dom0;
> +    }
> +    if (!domU.domid) {
> +        domU.domid = domid;
> +    }
> +    assert(domU.domid == domid);
> +    return &domU;
> +}
> +
> +static struct port *alloc_port(struct evtpriv *priv, const char *reason)
> +{
> +    struct port *p = NULL;
> +    int i;
> +
> +    for (i = 1; i < NR_EVENT_CHANNELS; i++) {
> +#ifdef DEBUG
> +        /* debug hack */
> +#define EA_START 20
> +        if (priv->domain->domid && i < EA_START)
> +            i = EA_START;
> +#undef EA_START
> +#endif
> +        if (priv->domain->p[i].priv != NULL) {
> +            continue;
> +        }
> +        p = priv->domain->p+i;
> +        p->port = i;
> +        p->priv = priv;
> +        p->count_snd = 0;
> +        p->count_fwd = 0;
> +        p->count_msg = 1;
> +        priv->ports++;
> +        if (debug) {
> +            qemu_log("xen ev:%3d: alloc port %d, domain %d (%s)\n",
> +                     priv->fd_read, p->port, priv->domain->domid, reason);
> +        }
> +        return p;
> +    }
> +    return NULL;
> +}
> +
> +static void bind_port_peer(struct port *p, int domid, int port)
> +{
> +    struct domain *domain;
> +    struct port *o;
> +    const char *msg = "ok";
> +
> +    domain = get_domain(domid);
> +    o = domain->p+port;
> +    if (!o->priv) {
> +        msg = "peer not allocated";
> +    } else if (o->peer) {
> +        msg = "peer already bound";
> +    } else if (p->peer) {
> +        msg = "port already bound";
> +    } else {
> +        o->peer = p;
> +        p->peer = o;
> +    }
> +    if (debug) {
> +        qemu_log("xen ev:%3d: bind port %d domain %d  <->  port %d domain %d : %s\n",
> +                 p->priv->fd_read,
> +                 p->port, p->priv->domain->domid,
> +                 port, domid, msg);
> +    }
> +}
> +
> +static void unbind_port(struct port *p)
> +{
> +    struct port *o;
> +
> +    o = p->peer;
> +    if (o) {
> +        if (debug) {
> +            fprintf(stderr,"xen ev:%3d: unbind port %d domain %d  <->  port %d domain %d\n",
> +                    p->priv->fd_read,
> +                    p->port, p->priv->domain->domid,
> +                    o->port, o->priv->domain->domid);
> +        }
> +        o->peer = NULL;
> +        p->peer = NULL;
> +    }
> +}
> +
> +static void notify_send_peer(struct port *peer)
> +{
> +    uint32_t evtchn = peer->port;
> +    int r;
> +
> +    peer->count_snd++;
> +    if (peer->pending) {
> +        return;
> +    }
> +
> +    r = write(peer->priv->fd_write, &evtchn, sizeof(evtchn));
> +    if (r != sizeof(evtchn)) {
> +        /* XXX: write failed; leave the port non-pending so a later notify retries */
> +        return;
> +    }
> +    peer->count_fwd++;
> +    peer->pending++;
> +    peer->priv->pending++;
> +}
> +
> +static void notify_port(struct port *p)
> +{
> +    if (p->peer) {
> +        notify_send_peer(p->peer);
> +        if (debug && p->peer->count_snd >= p->peer->count_msg) {
> +            fprintf(stderr, "xen ev:%3d: notify port %d domain %d  ->  port %d "
> +                            "domain %d  |  counts %d/%d\n",
> +                     p->priv->fd_read, p->port, p->priv->domain->domid,
> +                     p->peer->port, p->peer->priv->domain->domid,
> +                     p->peer->count_fwd, p->peer->count_snd);
> +            p->peer->count_msg *= 10;
> +        }
> +    } else {
> +        if (debug) {
> +            fprintf(stderr, "xen ev:%3d: notify port %d domain %d  ->  unconnected\n",
> +                    p->priv->fd_read, p->port, p->priv->domain->domid);
> +        }
> +    }
> +}
> +
> +static void unmask_port(struct port *p)
> +{
> +    /* nothing to do */
> +}
> +
> +static void release_port(struct port *p)
> +{
> +    if (debug) {
> +        fprintf(stderr,"xen ev:%3d: release port %d, domain %d\n",
> +                p->priv->fd_read, p->port, p->priv->domain->domid);
> +    }
> +    unbind_port(p);
> +    p->priv->ports--;
> +    p->port = 0;
> +    p->priv = NULL;
> +}
> +
> +/* ------------------------------------------------------------- */
> +
> +static int qemu_xopen(void)
> +{
> +    struct evtpriv *priv;
> +    int fd[2];
> +
> +    priv = qemu_mallocz(sizeof(*priv));
> +
> +    if (pipe(fd) < 0) {
> +        goto err;
> +    }
> +    priv->fd_read  = fd[0];
> +    priv->fd_write = fd[1];
> +    fcntl(priv->fd_read, F_SETFL, O_NONBLOCK);
> +
> +    priv->domain = get_domain(0);
> +    /* link into the list only after setup succeeded, so the error path
> +     * below never frees an entry that is still on the list */
> +    QTAILQ_INSERT_TAIL(&privs, priv, list);
> +    return priv->fd_read;
> +
> +err:
> +    qemu_free(priv);
> +    return -1;
> +}
> +
> +static int qemu_close(int handle)
> +{
> +    struct evtpriv *priv = getpriv(handle);
> +    struct port *p;
> +    int i;
> +
> +    if (!priv) {
> +        return -1;
> +    }
> +
> +    for (i = 1; i < NR_EVENT_CHANNELS; i++) {
> +        p = priv->domain->p+i;
> +        if (priv != p->priv) {
> +            continue;
> +        }
> +        release_port(p);
> +    }
> +
> +    close(priv->fd_read);
> +    close(priv->fd_write);
> +    QTAILQ_REMOVE(&privs, priv, list);
> +    qemu_free(priv);
> +    return 0;
> +}
> +
> +static int qemu_fd(int handle)
> +{
> +    struct evtpriv *priv = getpriv(handle);
> +
> +    if (!priv) {
> +        return -1;
> +    }
> +    return priv->fd_read;
> +}
> +
> +static int qemu_notify(int handle, evtchn_port_t port)
> +{
> +    struct evtpriv *priv = getpriv(handle);
> +    struct port *p;
> +
> +    if (!priv) {
> +        return -1;
> +    }
> +    if (port >= NR_EVENT_CHANNELS) {
> +        return -1;
> +    }
> +    p = priv->domain->p + port;
> +    notify_port(p);
> +    return 0;
> +}
> +
> +static evtchn_port_or_error_t qemu_bind_unbound_port(int handle, int domid)
> +{
> +    struct evtpriv *priv = getpriv(handle);
> +    struct port *p;
> +
> +    if (!priv) {
> +        return -1;
> +    }
> +    p = alloc_port(priv, "unbound");
> +    if (!p) {
> +        return -1;
> +    }
> +    return p->port;
> +}
> +
> +static evtchn_port_or_error_t qemu_bind_interdomain(int handle, int domid,
> +                                                    evtchn_port_t remote_port)
> +{
> +    struct evtpriv *priv = getpriv(handle);
> +    struct port *p;
> +
> +    if (!priv) {
> +        return -1;
> +    }
> +    if (remote_port >= NR_EVENT_CHANNELS) {
> +        return -1;
> +    }
> +    p = alloc_port(priv, "interdomain");
> +    if (!p) {
> +        return -1;
> +    }
> +    bind_port_peer(p, domid, remote_port);
> +    return p->port;
> +}
> +
> +static evtchn_port_or_error_t qemu_bind_virq(int handle, unsigned int virq)
> +{
> +    struct evtpriv *priv = getpriv(handle);
> +    struct port *p;
> +
> +    if (!priv) {
> +        return -1;
> +    }
> +    p = alloc_port(priv, "virq");
> +    if (!p) {
> +        return -1;
> +    }
> +    /*
> +     * Note: port not linked here, we only allocate some port.
> +     */
> +    return p->port;
> +}
> +
> +static int qemu_unbind(int handle, evtchn_port_t port)
> +{
> +    struct evtpriv *priv = getpriv(handle);
> +    struct port *p;
> +
> +    if (!priv) {
> +        return -1;
> +    }
> +    if (port >= NR_EVENT_CHANNELS) {
> +        return -1;
> +    }
> +    p = priv->domain->p + port;
> +    unbind_port(p);
> +    release_port(p);
> +    return 0;
> +}
> +
> +static evtchn_port_or_error_t qemu_pending(int handle)
> +{
> +    struct evtpriv *priv = getpriv(handle);
> +    uint32_t evtchn;
> +    int rc;
> +
> +    if (!priv) {
> +        return -1;
> +    }
> +    rc = read(priv->fd_read, &evtchn, sizeof(evtchn));
> +    if (rc != sizeof(evtchn)) {
> +        return -1;
> +    }
> +    priv->pending--;
> +    priv->domain->p[evtchn].pending--;
> +    return evtchn;
> +}
> +
> +static int qemu_unmask(int handle, evtchn_port_t port)
> +{
> +    struct evtpriv *priv = getpriv(handle);
> +    struct port *p;
> +
> +    if (!priv) {
> +        return -1;
> +    }
> +    if (port >= NR_EVENT_CHANNELS) {
> +        return -1;
> +    }
> +    p = priv->domain->p + port;
> +    unmask_port(p);
> +    return 0;
> +}
> +
> +static int qemu_domid(int handle, int domid)
> +{
> +    struct evtpriv *priv = getpriv(handle);
> +
> +    if (!priv) {
> +        return -1;
> +    }
> +    if (priv->ports) {
> +        return -1;
> +    }
> +    priv->domain = get_domain(domid);
> +    return 0;
> +}
> +
> +struct XenEvtOps xc_evtchn_xenner = {
> +    .open               = qemu_xopen,
> +    .domid              = qemu_domid,
> +    .close              = qemu_close,
> +    .fd                 = qemu_fd,
> +    .notify             = qemu_notify,
> +    .bind_unbound_port  = qemu_bind_unbound_port,
> +    .bind_interdomain   = qemu_bind_interdomain,
> +    .bind_virq          = qemu_bind_virq,
> +    .unbind             = qemu_unbind,
> +    .pending            = qemu_pending,
> +    .unmask             = qemu_unmask,
> +};
> +
> +/* ------------------------------------------------------------- */
> +
> +#if 0
> +
> +void do_info_evtchn(Monitor *mon)
> +{
> +    struct evtpriv *priv;
> +    struct port *port;
> +    int i;
> +
> +    if (xen_mode != XEN_EMULATE) {
> +        monitor_printf(mon, "Not emulating xen event channels.\n");
> +        return;
> +    }
> +
> +    QTAILQ_FOREACH(priv, &privs, list) {
> +        monitor_printf(mon, "%p: domid %d, fds %d,%d\n", priv,
> +                       priv->domain->domid,
> +                       priv->fd_read, priv->fd_write);
> +        for (i = 1; i < NR_EVENT_CHANNELS; i++) {
> +            port = priv->domain->p + i;
> +            if (port->priv != priv) {
> +                continue;
> +            }
> +            monitor_printf(mon, "  port #%d: ", port->port);
> +            if (port->peer) {
> +                monitor_printf(mon, "peer #%d (%p, domid %d)\n",
> +                               port->peer->port, port->peer->priv,
> +                               port->peer->priv->domain->domid);
> +            } else {
> +                monitor_printf(mon, "no peer\n");
> +            }
> +        }
> +    }
> +}
> +
> +#endif
> +
>    
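
The notification path in the quoted patch (notify_send_peer() writing the port number into a pipe, qemu_pending() reading it back from the non-blocking read end, and a per-port pending flag coalescing repeated notifications) can be sketched standalone as below. This is a reduced illustration under assumptions, not the patch's code: the evt_pipe_sketch type, the sketch_* functions and SKETCH_NR_PORTS are hypothetical names, and peers, domains and the counters are left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define SKETCH_NR_PORTS 16  /* hypothetical; the patch uses NR_EVENT_CHANNELS */

struct evt_pipe_sketch {
    int fd_read, fd_write;
    int pending[SKETCH_NR_PORTS];   /* 1 while a port number sits in the pipe */
};

/* Create the pipe; the read end is non-blocking so polling never stalls. */
static int sketch_open(struct evt_pipe_sketch *e)
{
    int fd[2], i;

    if (pipe(fd) < 0) {
        return -1;
    }
    if (fcntl(fd[0], F_SETFL, O_NONBLOCK) < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    e->fd_read = fd[0];
    e->fd_write = fd[1];
    for (i = 0; i < SKETCH_NR_PORTS; i++) {
        e->pending[i] = 0;
    }
    return 0;
}

/* Queue a notification; a port that is already pending is not queued twice. */
static int sketch_notify(struct evt_pipe_sketch *e, uint32_t port)
{
    if (port >= SKETCH_NR_PORTS) {
        return -1;
    }
    if (e->pending[port]) {
        return 0;
    }
    if (write(e->fd_write, &port, sizeof(port)) != (ssize_t)sizeof(port)) {
        return -1;  /* not marked pending, so a later notify can retry */
    }
    e->pending[port] = 1;
    return 0;
}

/* Fetch the next pending port, or -1 if nothing is queued. */
static int sketch_pending(struct evt_pipe_sketch *e)
{
    uint32_t port;

    if (read(e->fd_read, &port, sizeof(port)) != (ssize_t)sizeof(port)) {
        return -1;
    }
    assert(port < SKETCH_NR_PORTS);
    e->pending[port] = 0;
    return (int)port;
}

static void sketch_close(struct evt_pipe_sketch *e)
{
    close(e->fd_read);
    close(e->fd_write);
}
```

Because the read end is an ordinary file descriptor, a consumer can hand it to select() or poll() just as it would the descriptor returned by the real event channel device, which appears to be the point of exposing fd_read as the handle.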

* Re: [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files
  2010-11-01 15:44   ` Anthony Liguori
@ 2010-11-01 15:47     ` Alexander Graf
  2010-11-01 15:59       ` Anthony Liguori
  2010-11-01 19:00       ` Blue Swirl
  0 siblings, 2 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:47 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel Developers, Gerd Hoffmann


On 01.11.2010, at 11:44, Anthony Liguori wrote:

> On 11/01/2010 10:01 AM, Alexander Graf wrote:
>> This patch adds various header files required for the xenner kernel on 64 bit
>> systems.
>> 
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>   
> 
> I think it might make more sense to put this on a separate git.qemu.org repository and then use a submodule in roms/.

The main reason I wanted to have it fully in the qemu tree is so that we can guarantee the qemu<->xenner interface is 100% private. I'm not sure how much it'll be in flux, but I don't want to go through the hassle of defining stable ABIs where we don't have to :)


Alex

* Re: [Qemu-devel] [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 15:45   ` Anthony Liguori
@ 2010-11-01 15:49     ` Alexander Graf
  2010-11-01 16:01       ` Anthony Liguori
  0 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 15:49 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel Developers, Gerd Hoffmann


On 01.11.2010, at 11:45, Anthony Liguori wrote:

> On 11/01/2010 10:01 AM, Alexander Graf wrote:
>> Xenner emulates parts of libxc, so we can not use the real xen infrastructure
>> when running xen pv guests without xen.
>> 
>> This patch adds support for event channel communication.
>> 
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>   
> 
> Has anyone checked with the Xen folks about supporting this type of functionality in libxc directly?


The issue I have with libxc is that it is orthogonal to the qemu infrastructure's way of doing things. If we build on libxc, we will never be able to do cross-architecture execution of xen pv guests. Do we really want to go that way?


Alex

* Re: [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files
  2010-11-01 15:47     ` Alexander Graf
@ 2010-11-01 15:59       ` Anthony Liguori
  2010-11-01 19:00       ` Blue Swirl
  1 sibling, 0 replies; 96+ messages in thread
From: Anthony Liguori @ 2010-11-01 15:59 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 10:47 AM, Alexander Graf wrote:
> On 01.11.2010, at 11:44, Anthony Liguori wrote:
>
>    
>> On 11/01/2010 10:01 AM, Alexander Graf wrote:
>>      
>>> This patch adds various header files required for the xenner kernel on 64 bit
>>> systems.
>>>
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>
>>>        
>> I think it might make more sense to put this on a separate git.qemu.org repository and then use a submodule in roms/.
>>      
> The main reason why I wanted to have it fully in the qemu tree is so that we can guarantee the qemu<->xenner interface is 100% private. I'm not sure how much it'll be in a flux, but I don't want to go through the hassle of defining stable ABIs where we don't have to :)
>    

Yeah, I'm not suggesting it be a separate project.  But we've been 
keeping firmware out of tree for the most part and using submodules.  I 
like this model (even though git submodules don't always work the way 
you expect them to..).

I'm not passionate either way.  If you feel strongly about it being in 
the main source tree, I don't have a problem with it.  But stick it in 
roms/ instead of pc-bios.

Regards,

Anthony Liguori

> Alex
>
>    

* Re: [Qemu-devel] [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 15:49     ` Alexander Graf
@ 2010-11-01 16:01       ` Anthony Liguori
  2010-11-01 16:07         ` Alexander Graf
  2010-11-01 19:39         ` [Qemu-devel] " Paolo Bonzini
  0 siblings, 2 replies; 96+ messages in thread
From: Anthony Liguori @ 2010-11-01 16:01 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 10:49 AM, Alexander Graf wrote:
> On 01.11.2010, at 11:45, Anthony Liguori wrote:
>
>    
>> On 11/01/2010 10:01 AM, Alexander Graf wrote:
>>      
>>> Xenner emulates parts of libxc, so we can not use the real xen infrastructure
>>> when running xen pv guests without xen.
>>>
>>> This patch adds support for event channel communication.
>>>
>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>
>>>        
>> Has anyone checked with the Xen folks about supporting this type of functionality in libxc directly?
>>      
>
> The issue I have with libxc is that it goes orthogonal to the qemu infrastructure way of doing things. If we base on libxc, we will never be able to do cross-architecture execution of xen pv guests. Do we really want to go that way?
>    

IIUC, this is a mini-libxc that you enable by mucking with 
LD_LIBRARY_PATH such that you can run things like xenstored unmodified.  
What I'm really asking is whether there has been a discussion about a 
more pleasant way to do this that the Xen guys would feel comfortable with.

I'd feel a little weird if someone was replacing a part of QEMU via 
LD_LIBRARY_PATH trickery.  It's better to try to work out a proper 
solution with the upstream community than to do trickery.

I'm not entirely opposed to this if the Xen guys say they don't want 
anything to do with Xenner, but we should have the discussion at least.

Regards,

Anthony Liguori

>
> Alex
>
>    

* Re: [Qemu-devel] [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 16:01       ` Anthony Liguori
@ 2010-11-01 16:07         ` Alexander Graf
  2010-11-01 16:14           ` Anthony Liguori
  2010-11-01 19:39         ` [Qemu-devel] " Paolo Bonzini
  1 sibling, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 16:07 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel Developers, Gerd Hoffmann


On 01.11.2010, at 12:01, Anthony Liguori wrote:

> On 11/01/2010 10:49 AM, Alexander Graf wrote:
>> On 01.11.2010, at 11:45, Anthony Liguori wrote:
>> 
>>   
>>> On 11/01/2010 10:01 AM, Alexander Graf wrote:
>>>     
>>>> Xenner emulates parts of libxc, so we can not use the real xen infrastructure
>>>> when running xen pv guests without xen.
>>>> 
>>>> This patch adds support for event channel communication.
>>>> 
>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>> 
>>>>       
>>> Has anyone checked with the Xen folks about supporting this type of functionality in libxc directly?
>>>     
>> 
>> The issue I have with libxc is that it goes orthogonal to the qemu infrastructure way of doing things. If we base on libxc, we will never be able to do cross-architecture execution of xen pv guests. Do we really want to go that way?
>>   
> 
> IIUC, this is a mini-libxc that you enable by mucking with LD_LIBRARY_PATH such that you can run things like xenstored unmodified.  What I'm really asking is whether there has been a discussion about a more pleasant way to do this that the Xen guys would feel comfortable with.
> 
> I'd feel a little weird if someone was replacing a part of QEMU via LD_LIBRARY_PATH trickery.  It's better to try to work out a proper solution with the upstream community than to do trickery.
> 
> I'm not entirely opposed to this if the Xen guys say they don't want anything to do with Xenner, but we should have the discussion at least.

I agree about the discussion part, that's why we're all gathering in Boston this week, right?

But technically, this code really just bumps all libxc calls to indirect function calls that go through a struct. If we're using xenner, we use our own implementation, if we're using xen, we use xen's. The thing is that with xenner we usually don't have xen infrastructure available and most likely don't want to start any either.
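
To make the indirection concrete, here is a minimal sketch of routing calls through a struct of function pointers. It is not the code from the patch: EvtchnOps, evtchn_select_backend() and the emu_*/real_* functions are hypothetical names, and the backends are trivial stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table; names are illustrative, not taken from the patch. */
typedef struct EvtchnOps {
    int (*open)(void);
    int (*notify)(int handle, unsigned int port);
} EvtchnOps;

/* Stand-in for the emulated (xenner) backend. */
static int emu_open(void) { return 100; }
static int emu_notify(int handle, unsigned int port)
{
    (void)handle;
    return (int)port;            /* pretend success: echo the port */
}

/* Stand-in for a backend that would forward to the real libxc. */
static int real_open(void) { return 200; }
static int real_notify(int handle, unsigned int port)
{
    (void)handle;
    (void)port;
    return 0;
}

static const EvtchnOps evtchn_emu_ops  = { emu_open, emu_notify };
static const EvtchnOps evtchn_real_ops = { real_open, real_notify };

/* Selected once at machine init; callers never name a backend directly. */
static const EvtchnOps *evtchn_ops;

static void evtchn_select_backend(int use_xenner)
{
    evtchn_ops = use_xenner ? &evtchn_emu_ops : &evtchn_real_ops;
}

/* Thin wrapper: every caller goes through the selected table. */
static int evtchn_open(void)
{
    assert(evtchn_ops != NULL);
    return evtchn_ops->open();
}
```

With something like this, the backend drivers only ever call evtchn_open() and friends, so no LD_LIBRARY_PATH games are needed to switch implementations.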


Alex


* Re: [Qemu-devel] [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 16:07         ` Alexander Graf
@ 2010-11-01 16:14           ` Anthony Liguori
  2010-11-01 16:15             ` Alexander Graf
  0 siblings, 1 reply; 96+ messages in thread
From: Anthony Liguori @ 2010-11-01 16:14 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 11:07 AM, Alexander Graf wrote:
> On 01.11.2010, at 12:01, Anthony Liguori wrote:
>
>    
>> On 11/01/2010 10:49 AM, Alexander Graf wrote:
>>      
>>> On 01.11.2010, at 11:45, Anthony Liguori wrote:
>>>
>>>
>>>        
>>>> On 11/01/2010 10:01 AM, Alexander Graf wrote:
>>>>
>>>>          
>>>>> Xenner emulates parts of libxc, so we can not use the real xen infrastructure
>>>>> when running xen pv guests without xen.
>>>>>
>>>>> This patch adds support for event channel communication.
>>>>>
>>>>> Signed-off-by: Alexander Graf<agraf@suse.de>
>>>>>
>>>>>
>>>>>            
>>>> Has anyone checked with the Xen folks about supporting this type of functionality in libxc directly?
>>>>
>>>>          
>>> The issue I have with libxc is that it goes orthogonal to the qemu infrastructure way of doing things. If we base on libxc, we will never be able to do cross-architecture execution of xen pv guests. Do we really want to go that way?
>>>
>>>        
>> IIUC, this is a mini-libxc that you enable by mucking with LD_LIBRARY_PATH such that you can run things like xenstored unmodified.  What I'm really asking is whether there has been a discussion about a more pleasant way to do this that the Xen guys would feel comfortable with.
>>
>> I'd feel a little weird if someone was replacing a part of QEMU via LD_LIBRARY_PATH trickery.  It's better to try to work out a proper solution with the upstream community than to do trickery.
>>
>> I'm not entirely opposed to this if the Xen guys say they don't want anything to do with Xenner, but we should have the discussion at least.
>>      
> I agree about the discussion part, that's why we're all gathering in Boston this week, right?
>    

Fair enough :-)

> But technically, this code really just bumps all libxc calls to indirect function calls that go through a struct. If we're using xenner, we use our own implementation, if we're using xen, we use xen's. The thing is that with xenner we usually don't have xen infrastructure available and most likely don't want to start any either.
>    

Yeah, I guess I'd just like to see a more "polite" solution.

Regards,

Anthony Liguori

> Alex
>
>    


* Re: [Qemu-devel] [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 16:14           ` Anthony Liguori
@ 2010-11-01 16:15             ` Alexander Graf
  0 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 16:15 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel Developers, Gerd Hoffmann


On 01.11.2010, at 12:14, Anthony Liguori wrote:

> On 11/01/2010 11:07 AM, Alexander Graf wrote:
>> On 01.11.2010, at 12:01, Anthony Liguori wrote:
>> 
>>   
>>> On 11/01/2010 10:49 AM, Alexander Graf wrote:
>>>     
>>>> On 01.11.2010, at 11:45, Anthony Liguori wrote:
>>>> 
>>>> 
>>>>       
>>>>> On 11/01/2010 10:01 AM, Alexander Graf wrote:
>>>>> 
>>>>>         
>>>>>> Xenner emulates parts of libxc, so we can not use the real xen infrastructure
>>>>>> when running xen pv guests without xen.
>>>>>> 
>>>>>> This patch adds support for event channel communication.
>>>>>> 
>>>>>> Signed-off-by: Alexander Graf<agraf@suse.de>
>>>>>> 
>>>>>> 
>>>>>>           
>>>>> Has anyone checked with the Xen folks about supporting this type of functionality in libxc directly?
>>>>> 
>>>>>         
>>>> The issue I have with libxc is that it goes orthogonal to the qemu infrastructure way of doing things. If we base on libxc, we will never be able to do cross-architecture execution of xen pv guests. Do we really want to go that way?
>>>> 
>>>>       
>>> IIUC, this is a mini-libxc that you enable by mucking with LD_LIBRARY_PATH such that you can run things like xenstored unmodified.  What I'm really asking is whether there has been a discussion about a more pleasant way to do this that the Xen guys would feel comfortable with.
>>> 
>>> I'd feel a little weird if someone was replacing a part of QEMU via LD_LIBRARY_PATH trickery.  It's better to try to work out a proper solution with the upstream community than to do trickery.
>>> 
>>> I'm not entirely opposed to this if the Xen guys say they don't want anything to do with Xenner, but we should have the discussion at least.
>>>     
>> I agree about the discussion part, that's why we're all gathering in Boston this week, right?
>>   
> 
> Fair enough :-)
> 
>> But technically, this code really just bumps all libxc calls to indirect function calls that go through a struct. If we're using xenner, we use our own implementation, if we're using xen, we use xen's. The thing is that with xenner we usually don't have xen infrastructure available and most likely don't want to start any either.
>>   
> 
> Yeah, I guess I'd just like to see a more "polite" solution.

We can try and see if we can maybe reuse parts of the event channel and xenstored stuff, but when it comes to memory mappings or grant tables, we have to have our own code since we're the ones owning the ram.

But yeah, let's move that discussion to LPC :). That way the xen folks can participate!


Alex


* Re: [Qemu-devel] [PATCH 02/40] elf: Add notes implementation
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 02/40] elf: Add notes implementation Alexander Graf
@ 2010-11-01 18:29   ` Blue Swirl
  2010-11-01 18:42     ` Stefan Weil
  2010-11-01 18:41   ` [Qemu-devel] " Paolo Bonzini
  1 sibling, 1 reply; 96+ messages in thread
From: Blue Swirl @ 2010-11-01 18:29 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On Mon, Nov 1, 2010 at 3:01 PM, Alexander Graf <agraf@suse.de> wrote:
> ---
>  hw/elf_ops.h |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  hw/loader.c  |    7 ++++++
>  hw/loader.h  |    3 ++
>  3 files changed, 70 insertions(+), 1 deletions(-)
>
> diff --git a/hw/elf_ops.h b/hw/elf_ops.h
> index 8b63dfc..645d058 100644
> --- a/hw/elf_ops.h
> +++ b/hw/elf_ops.h
> @@ -189,6 +189,44 @@ static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
>     return -1;
>  }
>
> +static void glue(elf_read_notes, SZ)(uint8_t *data, int data_len,
> +                                     ElfHandlers *handlers, int must_swab)
> +{
> +    uint8_t *p = data;
> +
> +    while ((ulong)&p[3] < (ulong)&data[data_len]) {

Please use 'unsigned long'.


* Re: [Qemu-devel] [PATCH 31/40] xenner: libxc emu: xenstore
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 31/40] xenner: libxc emu: xenstore Alexander Graf
@ 2010-11-01 18:36   ` Blue Swirl
  0 siblings, 0 replies; 96+ messages in thread
From: Blue Swirl @ 2010-11-01 18:36 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On Mon, Nov 1, 2010 at 3:01 PM, Alexander Graf <agraf@suse.de> wrote:
> Xenner emulates parts of libxc, so we can not use the real xen infrastructure
> when running xen pv guests without xen.
>
> This patch adds support for emulation of xenstored.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  hw/xenner_guest_store.c |  494 +++++++++++++++++++++++++++++++++
>  hw/xenner_libxenstore.c |  709 +++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 1203 insertions(+), 0 deletions(-)
>  create mode 100644 hw/xenner_guest_store.c
>  create mode 100644 hw/xenner_libxenstore.c
>
> diff --git a/hw/xenner_guest_store.c b/hw/xenner_guest_store.c
> new file mode 100644
> index 0000000..c067275
> --- /dev/null
> +++ b/hw/xenner_guest_store.c
> @@ -0,0 +1,494 @@
> +/*
> + *  Copyright (C) 2005 Rusty Russell IBM Corporation
> + *  Copyright (C) Red Hat 2007
> + *  Copyright (C) Novell Inc. 2010
> + *
> + *  Author(s): Gerd Hoffmann <kraxel@redhat.com>
> + *             Alexander Graf <agraf@suse.de>
> + *
> + *  Xenner emulation -- guest interface to xenstore
> + *
> + *  tools/xenstore/xenstored_domain.c equivalent, some code is from there.
> + *
> + *  This program is free software; you can redistribute it and/or modify
> + *  it under the terms of the GNU General Public License as published by
> + *  the Free Software Foundation; under version 2 of the License.
> + *
> + *  This program is distributed in the hope that it will be useful,
> + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *  GNU General Public License for more details.
> + *
> + *  You should have received a copy of the GNU General Public License along
> + *  with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "xen.h"
> +#include "xen_interfaces.h"
> +#include "xenner.h"
> +#include "qemu-char.h"
> +
> +/* ------------------------------------------------------------- */
> +
> +static target_phys_addr_t xen_store_mfn;
> +
> +static struct xs_handle *xs_guest;
> +static char xs_buf[1024];
> +static char xs_len;
> +static int debug = 0;
> +
> +static int evtchndev;
> +static evtchn_port_t evtchnport;

A lot of static state. Couldn't this be wrapped inside a structure,
which is then passed around?
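
Something along these lines, as a rough sketch only. XennerGuestStore, guest_store_init() and guest_store_append() are made-up names, the typedefs are stand-ins so the fragment compiles on its own, and xs_len is widened to size_t here because a char cannot index a 1024-byte buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in typedefs so the sketch compiles on its own; in the patch
 * these come from QEMU and Xen headers. */
typedef uint64_t target_phys_addr_t;
typedef uint32_t evtchn_port_t;
struct xs_handle;                    /* opaque here */

/* Hypothetical container for what are file-scope statics in the patch. */
typedef struct XennerGuestStore {
    target_phys_addr_t xen_store_mfn;
    struct xs_handle *xs_guest;
    char xs_buf[1024];
    size_t xs_len;
    int debug;
    int evtchndev;
    evtchn_port_t evtchnport;
} XennerGuestStore;

static void guest_store_init(XennerGuestStore *s, target_phys_addr_t mfn)
{
    memset(s, 0, sizeof(*s));
    s->xen_store_mfn = mfn;
    s->evtchndev = -1;               /* not opened yet */
}

/* Appends to the per-instance buffer.
 * Returns 0 on success, -1 if the data would not fit. */
static int guest_store_append(XennerGuestStore *s, const void *data,
                              size_t len)
{
    if (len > sizeof(s->xs_buf) - s->xs_len) {
        return -1;
    }
    memcpy(s->xs_buf + s->xs_len, data, len);
    s->xs_len += len;
    return 0;
}
```

Each function takes the state explicitly, so two instances do not share a buffer and the struct is a natural place to hang VMState later.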


* [Qemu-devel] Re: [PATCH 02/40] elf: Add notes implementation
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 02/40] elf: Add notes implementation Alexander Graf
  2010-11-01 18:29   ` Blue Swirl
@ 2010-11-01 18:41   ` Paolo Bonzini
  2010-11-01 18:52     ` Alexander Graf
  1 sibling, 1 reply; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-01 18:41 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 04:01 PM, Alexander Graf wrote:
> diff --git a/hw/loader.c b/hw/loader.c
> index 50b43a0..cb430e0 100644
> --- a/hw/loader.c
> +++ b/hw/loader.c
> @@ -229,6 +229,11 @@ int load_aout(const char *filename, target_phys_addr_t addr, int max_sz,
>
>   /* ELF loader */
>
> +static void elf_default_note(void *opaque, uint8_t *name, uint32_t name_len,
> +                             uint8_t *desc, uint32_t desc_len, uint32_t type)
> +{
> +}
> +
>   static uint64_t elf_default_translate(void *opaque, uint64_t addr)
>   {
>       return addr;
> @@ -237,6 +242,8 @@ static uint64_t elf_default_translate(void *opaque, uint64_t addr)
>   ElfHandlers elf_default_handlers = {
>       .translate_fn = elf_default_translate,
>       .translate_opaque = NULL,
> +    .note_fn = elf_default_note,
> +    .note_opaque = NULL,

Don't you have to add the definition to every user of translate_fn?

Maybe it's better to guard calls through the pointers with an if.
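
A sketch of what I mean, assuming a reduced copy of the ElfHandlers struct; elf_translate(), elf_note() and add_offset() are hypothetical helpers, not functions in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for the ElfHandlers struct from the patch. */
typedef struct ElfHandlers {
    uint64_t (*translate_fn)(void *opaque, uint64_t addr);
    void *translate_opaque;
    void (*note_fn)(void *opaque, uint8_t *name, uint32_t name_len,
                    uint8_t *desc, uint32_t desc_len, uint32_t type);
    void *note_opaque;
} ElfHandlers;

/* Hypothetical helper: a NULL pointer means "use the default behaviour". */
static uint64_t elf_translate(const ElfHandlers *h, uint64_t addr)
{
    if (h && h->translate_fn) {
        return h->translate_fn(h->translate_opaque, addr);
    }
    return addr;                     /* identity mapping */
}

/* Hypothetical helper: notes are dropped when no callback is set. */
static void elf_note(const ElfHandlers *h, uint8_t *name, uint32_t name_len,
                     uint8_t *desc, uint32_t desc_len, uint32_t type)
{
    if (h && h->note_fn) {
        h->note_fn(h->note_opaque, name, name_len, desc, desc_len, type);
    }
}

/* Example translate callback: adds a fixed offset passed via opaque. */
static uint64_t add_offset(void *opaque, uint64_t addr)
{
    return addr + *(uint64_t *)opaque;
}
```

That way a caller that only fills in translate_fn keeps working when new callbacks are added to the struct, and the empty default functions are not needed.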

Paolo


* Re: [Qemu-devel] [PATCH 02/40] elf: Add notes implementation
  2010-11-01 18:29   ` Blue Swirl
@ 2010-11-01 18:42     ` Stefan Weil
  2010-11-01 19:51       ` Alexander Graf
  0 siblings, 1 reply; 96+ messages in thread
From: Stefan Weil @ 2010-11-01 18:42 UTC (permalink / raw)
  To: Blue Swirl; +Cc: Gerd Hoffmann, Alexander Graf, qemu-devel Developers

Am 01.11.2010 19:29, schrieb Blue Swirl:
> On Mon, Nov 1, 2010 at 3:01 PM, Alexander Graf<agraf@suse.de>  wrote:
>    
>> ---
>>   hw/elf_ops.h |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>   hw/loader.c  |    7 ++++++
>>   hw/loader.h  |    3 ++
>>   3 files changed, 70 insertions(+), 1 deletions(-)
>>
>> diff --git a/hw/elf_ops.h b/hw/elf_ops.h
>> index 8b63dfc..645d058 100644
>> --- a/hw/elf_ops.h
>> +++ b/hw/elf_ops.h
>> @@ -189,6 +189,44 @@ static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
>>      return -1;
>>   }
>>
>> +static void glue(elf_read_notes, SZ)(uint8_t *data, int data_len,
>> +                                     ElfHandlers *handlers, int must_swab)
>> +{
>> +    uint8_t *p = data;
>> +
>> +    while ((ulong)&p[3] < (ulong)&data[data_len]) {
>>      
> Please use 'unsigned long'.
>    

Why is a type cast used here? I see no reason for it.


* [Qemu-devel] Re: [PATCH 14/40] xenner: kernel: Instruction emulator
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 14/40] xenner: kernel: Instruction emulator Alexander Graf
  2010-11-01 15:41   ` malc
@ 2010-11-01 18:46   ` Paolo Bonzini
  1 sibling, 0 replies; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-01 18:46 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 04:01 PM, Alexander Graf wrote:
> +    /* I/O instruction */
> +    if (in == 2) {
> +        regs->rax |= 0xffffffff;
> +    } else if (in == 1) {
> +        regs->rax |= (0xffff << shift);
> +    }

I don't understand this, and also why it's here rather than near case 
0xe4/0xe5/0xec/0xed.

Paolo


* [Qemu-devel] Re: [PATCH 02/40] elf: Add notes implementation
  2010-11-01 18:41   ` [Qemu-devel] " Paolo Bonzini
@ 2010-11-01 18:52     ` Alexander Graf
  2010-11-01 19:43       ` Paolo Bonzini
  0 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 18:52 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel Developers, Gerd Hoffmann


On 01.11.2010, at 14:41, Paolo Bonzini wrote:

> On 11/01/2010 04:01 PM, Alexander Graf wrote:
>> diff --git a/hw/loader.c b/hw/loader.c
>> index 50b43a0..cb430e0 100644
>> --- a/hw/loader.c
>> +++ b/hw/loader.c
>> @@ -229,6 +229,11 @@ int load_aout(const char *filename, target_phys_addr_t addr, int max_sz,
>> 
>>  /* ELF loader */
>> 
>> +static void elf_default_note(void *opaque, uint8_t *name, uint32_t name_len,
>> +                             uint8_t *desc, uint32_t desc_len, uint32_t type)
>> +{
>> +}
>> +
>>  static uint64_t elf_default_translate(void *opaque, uint64_t addr)
>>  {
>>      return addr;
>> @@ -237,6 +242,8 @@ static uint64_t elf_default_translate(void *opaque, uint64_t addr)
>>  ElfHandlers elf_default_handlers = {
>>      .translate_fn = elf_default_translate,
>>      .translate_opaque = NULL,
>> +    .note_fn = elf_default_note,
>> +    .note_opaque = NULL,
> 
> Don't you have to add the definition to every user of translate_fn?
> 
> Maybe it's better to guard calls through the pointers with an if.

All users either pass NULL as translate (which means they default to elf_default_translate) or initialize their structure with the values in elf_default_handlers :)


Alex


* Re: [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files
  2010-11-01 15:47     ` Alexander Graf
  2010-11-01 15:59       ` Anthony Liguori
@ 2010-11-01 19:00       ` Blue Swirl
  2010-11-01 19:02         ` Anthony Liguori
  1 sibling, 1 reply; 96+ messages in thread
From: Blue Swirl @ 2010-11-01 19:00 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On Mon, Nov 1, 2010 at 3:47 PM, Alexander Graf <agraf@suse.de> wrote:
>
> On 01.11.2010, at 11:44, Anthony Liguori wrote:
>
>> On 11/01/2010 10:01 AM, Alexander Graf wrote:
>>> This patch adds various header files required for the xenner kernel on 64 bit
>>> systems.
>>>
>>> Signed-off-by: Alexander Graf<agraf@suse.de>
>>>
>>
>> I think it might make more sense to put this on a separate git.qemu.org repository and then use a submodule in roms/.
>
> The main reason why I wanted to have it fully in the qemu tree is so that we can guarantee the qemu<->xenner interface is 100% private. I'm not sure how much it'll be in a flux, but I don't want to go through the hassle of defining stable ABIs where we don't have to :)

How is this different from SeaBIOS or OpenBIOS, which both use the fw_cfg
interface? Also, after the ABI has stabilized (in the future), will
you detach xenner from the QEMU tree, since this argument will no longer
be valid?


* Re: [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files
  2010-11-01 19:00       ` Blue Swirl
@ 2010-11-01 19:02         ` Anthony Liguori
  2010-11-01 19:05           ` Alexander Graf
  0 siblings, 1 reply; 96+ messages in thread
From: Anthony Liguori @ 2010-11-01 19:02 UTC (permalink / raw)
  To: Blue Swirl; +Cc: Gerd Hoffmann, Alexander Graf, qemu-devel Developers

On 11/01/2010 02:00 PM, Blue Swirl wrote:
> On Mon, Nov 1, 2010 at 3:47 PM, Alexander Graf<agraf@suse.de>  wrote:
>    
>> On 01.11.2010, at 11:44, Anthony Liguori wrote:
>>
>>      
>>> On 11/01/2010 10:01 AM, Alexander Graf wrote:
>>>        
>>>> This patch adds various header files required for the xenner kernel on 64 bit
>>>> systems.
>>>>
>>>> Signed-off-by: Alexander Graf<agraf@suse.de>
>>>>
>>>>          
>>> I think it might make more sense to put this on a separate git.qemu.org repository and then use a submodule in roms/.
>>>        
>> The main reason why I wanted to have it fully in the qemu tree is so that we can guarantee the qemu<->xenner interface is 100% private. I'm not sure how much it'll be in a flux, but I don't want to go through the hassle of defining stable ABIs where we don't have to :)
>>      
> How is this different from SeaBIOS or OpenBIOS, both use fw_cfg
> interface?

At least with SeaBIOS, we don't provide a stable ABI.  That's why we 
ship a specific version in roms/.

Regards,

Anthony Liguori

>   Also, after the ABI has stabilized (in the future), will
> you detach xenner from QEMU tree since this argument is no longer
> valid?
>    


* Re: [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files
  2010-11-01 19:02         ` Anthony Liguori
@ 2010-11-01 19:05           ` Alexander Graf
  2010-11-01 19:23             ` Blue Swirl
  2010-11-01 19:37             ` Anthony Liguori
  0 siblings, 2 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 19:05 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Blue Swirl, qemu-devel Developers, Gerd Hoffmann


On 01.11.2010, at 15:02, Anthony Liguori wrote:

> On 11/01/2010 02:00 PM, Blue Swirl wrote:
>> On Mon, Nov 1, 2010 at 3:47 PM, Alexander Graf<agraf@suse.de>  wrote:
>>   
>>> On 01.11.2010, at 11:44, Anthony Liguori wrote:
>>> 
>>>     
>>>> On 11/01/2010 10:01 AM, Alexander Graf wrote:
>>>>       
>>>>> This patch adds various header files required for the xenner kernel on 64 bit
>>>>> systems.
>>>>> 
>>>>> Signed-off-by: Alexander Graf<agraf@suse.de>
>>>>> 
>>>>>         
>>>> I think it might make more sense to put this on a separate git.qemu.org repository and then use a submodule in roms/.
>>>>       
>>> The main reason why I wanted to have it fully in the qemu tree is so that we can guarantee the qemu<->xenner interface is 100% private. I'm not sure how much it'll be in a flux, but I don't want to go through the hassle of defining stable ABIs where we don't have to :)
>>>     
>> How is this different from SeaBIOS or OpenBIOS, both use fw_cfg
>> interface?
> 
> At least with SeaBIOS, we don't provide a stable ABI.  That's why we ship a specific version in roms/.

So if we can assume that the interface is 100% internal and can break at any time, and there's no guarantee that a newer version of qemu works with an older version of xenner or vice versa, I'm perfectly fine with making it its own project :)


Alex


* Re: [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files
  2010-11-01 19:05           ` Alexander Graf
@ 2010-11-01 19:23             ` Blue Swirl
  2010-11-01 19:37             ` Anthony Liguori
  1 sibling, 0 replies; 96+ messages in thread
From: Blue Swirl @ 2010-11-01 19:23 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On Mon, Nov 1, 2010 at 7:05 PM, Alexander Graf <agraf@suse.de> wrote:
>
> On 01.11.2010, at 15:02, Anthony Liguori wrote:
>
>> On 11/01/2010 02:00 PM, Blue Swirl wrote:
>>> On Mon, Nov 1, 2010 at 3:47 PM, Alexander Graf<agraf@suse.de>  wrote:
>>>
>>>> On 01.11.2010, at 11:44, Anthony Liguori wrote:
>>>>
>>>>
>>>>> On 11/01/2010 10:01 AM, Alexander Graf wrote:
>>>>>
>>>>>> This patch adds various header files required for the xenner kernel on 64 bit
>>>>>> systems.
>>>>>>
>>>>>> Signed-off-by: Alexander Graf<agraf@suse.de>
>>>>>>
>>>>>>
>>>>> I think it might make more sense to put this on a separate git.qemu.org repository and then use a submodule in roms/.
>>>>>
>>>> The main reason why I wanted to have it fully in the qemu tree is so that we can guarantee the qemu<->xenner interface is 100% private. I'm not sure how much it'll be in a flux, but I don't want to go through the hassle of defining stable ABIs where we don't have to :)
>>>>
>>> How is this different from SeaBIOS or OpenBIOS, both use fw_cfg
>>> interface?
>>
>> At least with SeaBIOS, we don't provide a stable ABI.  That's why we ship a specific version in roms/.
>
> So if we can assume that the interface is 100% internal and can break at any time and there's no guarantee that a newer version of qemu works with an older version of xenner or vice versa, I'm perfectly fine in making it its own project :)

This is the same situation as with OpenBIOS. We also ship known good
pc-bios/openbios-* files to help bisection etc.


* Re: [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files
  2010-11-01 19:05           ` Alexander Graf
  2010-11-01 19:23             ` Blue Swirl
@ 2010-11-01 19:37             ` Anthony Liguori
  1 sibling, 0 replies; 96+ messages in thread
From: Anthony Liguori @ 2010-11-01 19:37 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Blue Swirl, qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 02:05 PM, Alexander Graf wrote:
> So if we can assume that the interface is 100% internal and can break at any time and there's no guarantee that a newer version of qemu works with an older version of xenner or vice versa, I'm perfectly fine in making it its own project :)
>    

Not necessarily a totally independent project, just another repository 
or module.  It helps scale the project overall I think.

Regards,

Anthony Liguori

>
> Alex
>
>    


* [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 16:01       ` Anthony Liguori
  2010-11-01 16:07         ` Alexander Graf
@ 2010-11-01 19:39         ` Paolo Bonzini
  2010-11-01 19:41           ` Anthony Liguori
  1 sibling, 1 reply; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-01 19:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: Gerd Hoffmann, Alexander Graf

On 11/01/2010 05:01 PM, Anthony Liguori wrote:
>
> IIUC, this is a mini-libxc that you enable by mucking with
> LD_LIBRARY_PATH such that you can run things like xenstored unmodified.
> What I'm really asking is whether there has been a discussion about a
> more pleasant way to do this that the Xen guys would feel comfortable with.

I don't know if it's Alex or Gerd who did the switch, but this version 
of the code doesn't have the separate mini-libxc.  The code of the 
mini-libxc is embedded in QEMU, just like xenstored, blkback and 
netback.  See patch 31/40, which includes both the "mini xenstored" and 
the "mini libxenstore".

It's not clear where xenconsoled is. Is the PV console functionality 
missing in this version of xenner?

Paolo


* [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 19:39         ` [Qemu-devel] " Paolo Bonzini
@ 2010-11-01 19:41           ` Anthony Liguori
  2010-11-01 19:47             ` Alexander Graf
  0 siblings, 1 reply; 96+ messages in thread
From: Anthony Liguori @ 2010-11-01 19:41 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Gerd Hoffmann, Alexander Graf, qemu-devel Developers

On 11/01/2010 02:39 PM, Paolo Bonzini wrote:
> On 11/01/2010 05:01 PM, Anthony Liguori wrote:
>>
>> IIUC, this is a mini-libxc that you enable by mucking with
>> LD_LIBRARY_PATH such that you can run things like xenstored unmodified.
>> What I'm really asking is whether there has been a discussion about a
>> more pleasant way to do this that the Xen guys would feel comfortable 
>> with.
>
> I don't know if it's Alex or Gerd who did the switch, but this version 
> of the code doesn't have the separate mini-libxc.  The code of the 
> mini-libxc is embedded in QEMU, just like xenstored, blkback and 
> netback.  See patch 31/40, which includes both the "mini xenstored" 
> and the "mini libxenstore".

Oh, I'm still missing some of it.  That's a curious choice.

What's the logic for duplicating xenstored/xenconsoled?  I understand 
blkback/netback.

Regards,

Anthony Liguori

> It's not clear where is xenconsoled, is the PV console functionality 
> missing in this version of xenner?
>
> Paolo


* [Qemu-devel] Re: [PATCH 02/40] elf: Add notes implementation
  2010-11-01 18:52     ` Alexander Graf
@ 2010-11-01 19:43       ` Paolo Bonzini
  2010-11-01 19:48         ` Alexander Graf
  0 siblings, 1 reply; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-01 19:43 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 07:52 PM, Alexander Graf wrote:
>>> @@ -237,6 +242,8 @@ static uint64_t elf_default_translate(void *opaque, uint64_t addr)
>>>   ElfHandlers elf_default_handlers = {
>>>       .translate_fn = elf_default_translate,
>>>       .translate_opaque = NULL,
>>> +    .note_fn = elf_default_note,
>>> +    .note_opaque = NULL,
>>
>> Don't you have to add the definition to every user of translate_fn?
>>
>> Maybe it's better to guard calls through the pointers with an if.
>
> All users either pass NULL as translate (which means they default to
> elf_default_translate) or initialize their structure with the values in
> elf_default_translate :)

But do the MIPS users initialize note_fn?

Paolo


* [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 19:41           ` Anthony Liguori
@ 2010-11-01 19:47             ` Alexander Graf
  2010-11-01 20:32               ` Anthony Liguori
  0 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 19:47 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Paolo Bonzini, qemu-devel Developers, Gerd Hoffmann


On 01.11.2010, at 15:41, Anthony Liguori wrote:

> On 11/01/2010 02:39 PM, Paolo Bonzini wrote:
>> On 11/01/2010 05:01 PM, Anthony Liguori wrote:
>>> 
>>> IIUC, this is a mini-libxc that you enable by mucking with
>>> LD_LIBRARY_PATH such that you can run things like xenstored unmodified.
>>> What I'm really asking is whether there has been a discussion about a
>>> more pleasant way to do this that the Xen guys would feel comfortable with.
>> 
>> I don't know if it's Alex or Gerd who did the switch, but this version of the code doesn't have the separate mini-libxc.  The code of the mini-libxc is embedded in QEMU, just like xenstored, blkback and netback.  See patch 31/40, which includes both the "mini xenstored" and the "mini libxenstore".
> 
> Oh, I'm still missing some of it.  That's a curious choice.
> 
> What's the logic for duplicating xenstored/xenconsoled?  I understand blkback/netback.

Where else would it belong? Qemu is an emulator. Device emulation belongs to qemu code. The xen PV machine is nothing but a special case of the pc machine with custom firmware and odd devices :).

As I stated in my cover letter, the goal of all this should be to have the qemu pieces be 100% independent of any xen headers or libraries, so we can eventually isolate it well enough that it even works on non-x86. Then we're at the point qemu code usually is.

I'm sure there are also practical implications btw. But I don't really care about those too much, because the architectural ones outweigh that to me.


Alex


* [Qemu-devel] Re: [PATCH 02/40] elf: Add notes implementation
  2010-11-01 19:43       ` Paolo Bonzini
@ 2010-11-01 19:48         ` Alexander Graf
  2010-11-01 21:23           ` Paolo Bonzini
  0 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 19:48 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel Developers, Gerd Hoffmann


On 01.11.2010, at 15:43, Paolo Bonzini wrote:

> On 11/01/2010 07:52 PM, Alexander Graf wrote:
>>>> @@ -237,6 +242,8 @@ static uint64_t elf_default_translate(void *opaque, uint64_t addr)
>>>>  ElfHandlers elf_default_handlers = {
>>>>      .translate_fn = elf_default_translate,
>>>>      .translate_opaque = NULL,
>>>> +    .note_fn = elf_default_note,
>>>> +    .note_opaque = NULL,
>>> 
>>> Don't you have to add the definition to every user of translate_fn?
>>> 
>>> Maybe it's better to guard calls through the pointers with an if.
>> 
>> All users either pass NULL as translate (which means they default to
>> elf_default_translate) or initialize their structure with the values in
>> elf_default_translate :)
> 
> But do the MIPS users initialize note_fn?

They should:


@@ -106,8 +106,10 @@ static int64_t load_kernel (CPUState *env)
    ram_addr_t initrd_offset;
    uint32_t *prom_buf;
    long prom_size;
+    ElfHandlers handlers = elf_default_handlers;

-    if (load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys, NULL,
+    handlers.translate_fn = cpu_mips_kseg0_to_phys;
+    if (load_elf(loaderparams.kernel_filename, &handlers,
                 (uint64_t *)&kernel_entry, (uint64_t *)&kernel_low,
                 (uint64_t *)&kernel_high, 0, ELF_MACHINE, 1) < 0) {
        fprintf(stderr, "qemu: could not load kernel '%s'\n",


Unless my C foo is really bad, this means that handlers is initialized with the contents of elf_default_handlers :). And that's how every caller works.
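
Boiled down to a standalone sketch of the same copy-then-override pattern: the struct is a reduced stand-in, kseg0_to_phys() only approximates what cpu_mips_kseg0_to_phys does, and make_mips_handlers() is a hypothetical wrapper added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for the ElfHandlers struct from the patch. */
typedef struct ElfHandlers {
    uint64_t (*translate_fn)(void *opaque, uint64_t addr);
    void *translate_opaque;
    void (*note_fn)(void *opaque, uint8_t *name, uint32_t name_len,
                    uint8_t *desc, uint32_t desc_len, uint32_t type);
    void *note_opaque;
} ElfHandlers;

static void elf_default_note(void *opaque, uint8_t *name, uint32_t name_len,
                             uint8_t *desc, uint32_t desc_len, uint32_t type)
{
    /* Default: ignore all notes. */
    (void)opaque; (void)name; (void)name_len;
    (void)desc; (void)desc_len; (void)type;
}

static uint64_t elf_default_translate(void *opaque, uint64_t addr)
{
    (void)opaque;
    return addr;
}

static ElfHandlers elf_default_handlers = {
    .translate_fn = elf_default_translate,
    .translate_opaque = NULL,
    .note_fn = elf_default_note,
    .note_opaque = NULL,
};

/* Stand-in for cpu_mips_kseg0_to_phys: strip the segment bits. */
static uint64_t kseg0_to_phys(void *opaque, uint64_t addr)
{
    (void)opaque;
    return addr & 0x7fffffffULL;
}

/* Mirrors the caller pattern: copy the defaults, override one field. */
static ElfHandlers make_mips_handlers(void)
{
    ElfHandlers handlers = elf_default_handlers;   /* member-wise copy */

    handlers.translate_fn = kseg0_to_phys;
    return handlers;
}
```

Struct assignment copies every member, so note_fn in the local copy still points at elf_default_note and the global defaults stay untouched.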


Alex


* Re: [Qemu-devel] [PATCH 02/40] elf: Add notes implementation
  2010-11-01 18:42     ` Stefan Weil
@ 2010-11-01 19:51       ` Alexander Graf
  2010-11-01 20:19         ` Stefan Weil
  0 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 19:51 UTC (permalink / raw)
  To: Stefan Weil; +Cc: Blue Swirl, qemu-devel Developers, Gerd Hoffmann


On 01.11.2010, at 14:42, Stefan Weil wrote:

> Am 01.11.2010 19:29, schrieb Blue Swirl:
>> On Mon, Nov 1, 2010 at 3:01 PM, Alexander Graf<agraf@suse.de>  wrote:
>>   
>>> ---
>>>  hw/elf_ops.h |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>  hw/loader.c  |    7 ++++++
>>>  hw/loader.h  |    3 ++
>>>  3 files changed, 70 insertions(+), 1 deletions(-)
>>> 
>>> diff --git a/hw/elf_ops.h b/hw/elf_ops.h
>>> index 8b63dfc..645d058 100644
>>> --- a/hw/elf_ops.h
>>> +++ b/hw/elf_ops.h
>>> @@ -189,6 +189,44 @@ static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
>>>     return -1;
>>>  }
>>> 
>>> +static void glue(elf_read_notes, SZ)(uint8_t *data, int data_len,
>>> +                                     ElfHandlers *handlers, int must_swab)
>>> +{
>>> +    uint8_t *p = data;
>>> +
>>> +    while ((ulong)&p[3] < (ulong)&data[data_len]) {
>>>     
>> Please use 'unsigned long'.
>>   
> 
> Why is a type cast used here? I see no reason for it.

Pointers can't be compared, you have to cast them to values first.


Alex


* Re: [Qemu-devel] [PATCH 02/40] elf: Add notes implementation
  2010-11-01 19:51       ` Alexander Graf
@ 2010-11-01 20:19         ` Stefan Weil
  2010-11-01 21:17           ` Alexander Graf
  0 siblings, 1 reply; 96+ messages in thread
From: Stefan Weil @ 2010-11-01 20:19 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Blue Swirl, qemu-devel Developers, Gerd Hoffmann

On 01.11.2010 20:51, Alexander Graf wrote:
> On 01.11.2010, at 14:42, Stefan Weil wrote:
>
>    
>> Am 01.11.2010 19:29, schrieb Blue Swirl:
>>      
>>> On Mon, Nov 1, 2010 at 3:01 PM, Alexander Graf<agraf@suse.de>   wrote:
>>>
>>>        
>>>> ---
>>>>   hw/elf_ops.h |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>   hw/loader.c  |    7 ++++++
>>>>   hw/loader.h  |    3 ++
>>>>   3 files changed, 70 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/hw/elf_ops.h b/hw/elf_ops.h
>>>> index 8b63dfc..645d058 100644
>>>> --- a/hw/elf_ops.h
>>>> +++ b/hw/elf_ops.h
>>>> @@ -189,6 +189,44 @@ static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
>>>>      return -1;
>>>>   }
>>>>
>>>> +static void glue(elf_read_notes, SZ)(uint8_t *data, int data_len,
>>>> +                                     ElfHandlers *handlers, int must_swab)
>>>> +{
>>>> +    uint8_t *p = data;
>>>> +
>>>> +    while ((ulong)&p[3]<   (ulong)&data[data_len]) {
>>>>
>>>>          
>>> Please use 'unsigned long'.
>>>
>>>        
>> Why is a type cast used here? I see no reason for it.
>>      
> Pointers can't be compared, you have to cast them to values first.
>
>
> Alex
>    

No. Pointers of the same type which are not void pointers can be compared.

There is even a data type ptrdiff_t, so you can also compare their
difference with zero.

Regards,
Stefan


* [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 19:47             ` Alexander Graf
@ 2010-11-01 20:32               ` Anthony Liguori
  2010-11-01 21:47                 ` Paolo Bonzini
  2010-11-02  4:33                 ` Stefano Stabellini
  0 siblings, 2 replies; 96+ messages in thread
From: Anthony Liguori @ 2010-11-01 20:32 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Paolo Bonzini, qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 02:47 PM, Alexander Graf wrote:
> Where else would it belong? Qemu is an emulator. Device emulation belongs to qemu code. The xen PV machine is nothing but a special case of the pc machine with custom firmware and odd devices :).
>
> As I stated in my cover letter, the goal of all this should be to have the qemu pieces be 100% independent of any xen headers or libraries,

I'm not sure I agree with the goal.  I think wherever possible we 
should reuse code with the Xen project when it makes sense.  Reusing 
blkback/netback is impossible because we want userspace implementations 
and the current implementations are in the kernel.  blktap also doesn't 
tie into the QEMU block layer and making it tie into the QEMU block 
layer would probably result in more code than it saved.

OTOH, xenstored and xenconsoled have very little direct dependence on 
Xen.  I'm not saying that we shouldn't make things Just Work in QEMU, so 
if that means spawning xenconsoled/xenstored automagically from QEMU 
with special options, that's perfectly fine.

But to replicate the functionality of this code solely because of NIH 
seems like a waste of effort.

Regards,

Anthony Liguori

>   so we can eventually isolate it well enough that it even works on non-x86. Then we're at the point qemu code usually is.
>
> I'm sure there are also practical implications btw. But I don't really care about those too much, because the architectural ones outweigh that to me.
>
>
> Alex
>
>    


* Re: [Qemu-devel] [PATCH 02/40] elf: Add notes implementation
  2010-11-01 20:19         ` Stefan Weil
@ 2010-11-01 21:17           ` Alexander Graf
  2010-11-01 21:28             ` [Qemu-devel] " Paolo Bonzini
                               ` (2 more replies)
  0 siblings, 3 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-01 21:17 UTC (permalink / raw)
  To: Stefan Weil
  Cc: Blue Swirl, Michael Matz, qemu-devel Developers, Gerd Hoffmann


On 01.11.2010, at 16:19, Stefan Weil wrote:

> Am 01.11.2010 20:51, schrieb Alexander Graf:
>> On 01.11.2010, at 14:42, Stefan Weil wrote:
>> 
>>   
>>> Am 01.11.2010 19:29, schrieb Blue Swirl:
>>>     
>>>> On Mon, Nov 1, 2010 at 3:01 PM, Alexander Graf<agraf@suse.de>   wrote:
>>>> 
>>>>       
>>>>> ---
>>>>>  hw/elf_ops.h |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>>  hw/loader.c  |    7 ++++++
>>>>>  hw/loader.h  |    3 ++
>>>>>  3 files changed, 70 insertions(+), 1 deletions(-)
>>>>> 
>>>>> diff --git a/hw/elf_ops.h b/hw/elf_ops.h
>>>>> index 8b63dfc..645d058 100644
>>>>> --- a/hw/elf_ops.h
>>>>> +++ b/hw/elf_ops.h
>>>>> @@ -189,6 +189,44 @@ static int glue(load_symbols, SZ)(struct elfhdr *ehdr, int fd, int must_swab,
>>>>>     return -1;
>>>>>  }
>>>>> 
>>>>> +static void glue(elf_read_notes, SZ)(uint8_t *data, int data_len,
>>>>> +                                     ElfHandlers *handlers, int must_swab)
>>>>> +{
>>>>> +    uint8_t *p = data;
>>>>> +
>>>>> +    while ((ulong)&p[3]<   (ulong)&data[data_len]) {
>>>>> 
>>>>>         
>>>> Please use 'unsigned long'.
>>>> 
>>>>       
>>> Why is a type cast used here? I see no reason for it.
>>>     
>> Pointers can't be compared, you have to cast them to values first.
>> 
>> 
>> Alex
>>   
> 
> No. Pointers of same type which are not void pointers can be compared.
> 
> There is even a data type ptrdiff_t, so you can also compare their
> difference with zero.

Let's ask someone who definitely knows :).

Michael, is code like

char *x = a, *y = b;
if (x < y) {
  ...
}

valid? Or do I first have to cast x and y to unsigned longs or uintptr_t?


Alex


* [Qemu-devel] Re: [PATCH 02/40] elf: Add notes implementation
  2010-11-01 19:48         ` Alexander Graf
@ 2010-11-01 21:23           ` Paolo Bonzini
  0 siblings, 0 replies; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-01 21:23 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 08:48 PM, Alexander Graf wrote:
> @@ -106,8 +106,10 @@ static int64_t load_kernel (CPUState *env)
>      ram_addr_t initrd_offset;
>      uint32_t *prom_buf;
>      long prom_size;
> +    ElfHandlers handlers = elf_default_handlers;
>
> -    if (load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys, NULL,
> +    handlers.translate_fn = cpu_mips_kseg0_to_phys;
> +    if (load_elf(loaderparams.kernel_filename,&handlers,
>                   (uint64_t *)&kernel_entry, (uint64_t *)&kernel_low,
>                   (uint64_t *)&kernel_high, 0, ELF_MACHINE, 1)<  0) {
>          fprintf(stderr, "qemu: could not load kernel '%s'\n",
>
>
> Unless my C foo is really bad, this means that handlers is
> initialized  with the contents of elf_default_handlers :). And
> that's how every caller works.

Sorry, my mistake.

Paolo


* [Qemu-devel] Re: [PATCH 02/40] elf: Add notes implementation
  2010-11-01 21:17           ` Alexander Graf
@ 2010-11-01 21:28             ` Paolo Bonzini
  2010-11-01 21:31             ` [Qemu-devel] " Stefan Weil
  2010-11-02 10:17             ` Michael Matz
  2 siblings, 0 replies; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-01 21:28 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Blue Swirl, Michael Matz, qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 10:17 PM, Alexander Graf wrote:
> Let's ask someone who definitely knows:).

LOL, hi Michael! :)

> Michael, is code like
>
> char *x = a, *y = b;
> if (x < y) {
>    ...
> }
>
> valid? Or do I first have to cast x and y to unsigned longs or uintptr_t?

It is, as long as x and y point into the same object (in your original 
code, data[0]...data[data_len] is the object).  This instead

   char *x = a;
   long *y = b;
   if (x < y)
     {
     }

should give a warning

   g2.c:1: warning: comparison of distinct pointer types lacks a cast

but is also valid as long as x and y point into the same object.  To 
quiet the warning you should _not_ cast x to long* however (unless you 
know it's properly aligned); casting y to char* instead is fine.
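
A minimal sketch of the cast-free form, assuming a byte buffer walked in 4-byte steps much like the note loop in the patch. demo_count_words is a hypothetical helper, not code from the series. It compares p against a one-past-the-end pointer and checks the remaining length by subtraction, which also avoids forming an address more than one past the end of the buffer (something an expression like &p[3] can do near the tail).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the complete 4-byte words in data[0..data_len). */
static size_t demo_count_words(const uint8_t *data, size_t data_len)
{
    const uint8_t *p = data;
    const uint8_t *end = data + data_len; /* one past the end is allowed */
    size_t words = 0;

    assert(data != NULL);

    /* Both pointers point into (or just past) the same object and have
     * the same type, so the relational comparison needs no cast. */
    while (p < end) {
        if ((size_t)(end - p) < 4) {
            break; /* trailing partial word */
        }
        words++;
        p += 4;
    }
    return words;
}
```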

Paolo


* Re: [Qemu-devel] [PATCH 02/40] elf: Add notes implementation
  2010-11-01 21:17           ` Alexander Graf
  2010-11-01 21:28             ` [Qemu-devel] " Paolo Bonzini
@ 2010-11-01 21:31             ` Stefan Weil
  2010-11-02 10:17             ` Michael Matz
  2 siblings, 0 replies; 96+ messages in thread
From: Stefan Weil @ 2010-11-01 21:31 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Blue Swirl, Michael Matz, qemu-devel Developers, Gerd Hoffmann

On 01.11.2010 22:17, Alexander Graf wrote:
>
> On 01.11.2010, at 16:19, Stefan Weil wrote:
>
>> Am 01.11.2010 20:51, schrieb Alexander Graf:
>>> On 01.11.2010, at 14:42, Stefan Weil wrote:
>>>
>>>
>>>> Am 01.11.2010 19:29, schrieb Blue Swirl:
>>>>
>>>>> On Mon, Nov 1, 2010 at 3:01 PM, Alexander Graf<agraf@suse.de> wrote:
>>>>>
>>>>>
>>>>>> ---
>>>>>> hw/elf_ops.h | 61 
>>>>>> +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>>>>>> hw/loader.c | 7 ++++++
>>>>>> hw/loader.h | 3 ++
>>>>>> 3 files changed, 70 insertions(+), 1 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/elf_ops.h b/hw/elf_ops.h
>>>>>> index 8b63dfc..645d058 100644
>>>>>> --- a/hw/elf_ops.h
>>>>>> +++ b/hw/elf_ops.h
>>>>>> @@ -189,6 +189,44 @@ static int glue(load_symbols, SZ)(struct 
>>>>>> elfhdr *ehdr, int fd, int must_swab,
>>>>>> return -1;
>>>>>> }
>>>>>>
>>>>>> +static void glue(elf_read_notes, SZ)(uint8_t *data, int data_len,
>>>>>> + ElfHandlers *handlers, int must_swab)
>>>>>> +{
>>>>>> + uint8_t *p = data;
>>>>>> +
>>>>>> + while ((ulong)&p[3]< (ulong)&data[data_len]) {
>>>>>>
>>>>>>
>>>>> Please use 'unsigned long'.
>>>>>
>>>>>
>>>> Why is a type cast used here? I see no reason for it.
>>>>
>>> Pointers can't be compared, you have to cast them to values first.
>>>
>>>
>>> Alex
>>>
>>
>> No. Pointers of same type which are not void pointers can be compared.
>>
>> There is even a data type ptrdiff_t, so you can also compare their
>> difference with zero.
>
> Let's ask someone who definitely knows :).
>
> Michael, is code like
>
> char *x = a, *y = b;
> if (x < y) {
> ...
> }
>
> valid? Or do I first have to cast x and y to unsigned longs or uintptr_t?
>
>
> Alex
>


Hopefully C did not change for code like this during the last
20 years.

Then your code is always valid, but will only return useful results
if both a and b are derived from the same base pointer
(plus an individual offset):

char *base = any_value;
a = base + offset_a;
b = base + offset_b;

Then
         x - y = a - b = (ptrdiff_t)(offset_a - offset_b) / sizeof(*x);
and
         (x < y) == (a < b) == (offset_a < offset_b);
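
The two relations above can be checked with a small sketch. It uses long instead of char so the scaling by the element size is visible; the demo_* helpers are hypothetical and only meaningful when both pointers come from the same array.

```c
#include <assert.h>
#include <stddef.h>

/* Distance in elements: the compiler divides by sizeof(long) for us. */
static ptrdiff_t demo_elem_distance(const long *x, const long *y)
{
    return x - y;
}

/* Distance in bytes: subtract after converting to character pointers. */
static ptrdiff_t demo_byte_distance(const long *x, const long *y)
{
    ptrdiff_t bytes = (const char *)x - (const char *)y;

    /* Elements of one array are always a whole number of longs apart. */
    assert(bytes % (ptrdiff_t)sizeof(long) == 0);
    return bytes;
}
```

With a = base + 5 and b = base + 2, the element distance is 3 and the byte distance is 3 * sizeof(long), matching the formula when the offsets are counted in bytes.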

Regards,
Stefan


* [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 20:32               ` Anthony Liguori
@ 2010-11-01 21:47                 ` Paolo Bonzini
  2010-11-01 22:00                   ` Anthony Liguori
  2010-11-02  4:33                 ` Stefano Stabellini
  1 sibling, 1 reply; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-01 21:47 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: Gerd Hoffmann, Alexander Graf, qemu-devel Developers

On 11/01/2010 09:32 PM, Anthony Liguori wrote:
>
> I'm not sure I agree with the goal.  I think where ever possible we
> should reuse code with the Xen project when it makes sense.  Reusing
> blkback/netback is impossible because we want userspace implementations
> and the current implementations are in the kernel.  blktap also doesn't
> tie into the QEMU block layer and making it tie into the QEMU block
> layer would probably result in more code than it saved.
>
> OTOH, xenstored and xenconsoled have very little direct dependence on
> Xen.  I'm not saying that we shouldn't make things Just Work in QEMU, so
> if that means spawning xenconsoled/xenstored automagically from QEMU
> with special options, that's perfectly fine.

xenstored is 3 times bigger than what Alex submitted, however.  The code 
is much simpler because _this_ xenstore only serves one domain.  So it 
doesn't have to implement permissions, it doesn't have complicated 
threading to handle multiple instances of libxs accessing the daemon, 
and so on.  Besides the data structures implementing the tree, there's 
really very little in common, and the xenner code is almost trivial.

The situation is similar for the console.  There is only one console to 
track here.  In fact, maybe it's simplest to implement it as a small 
8250A driver in the xenner kernel, reading from the serial console at 
0x3f8 and writing to the ring buffer and vice versa.

Paolo


* [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 21:47                 ` Paolo Bonzini
@ 2010-11-01 22:00                   ` Anthony Liguori
  2010-11-01 22:08                     ` Paolo Bonzini
  0 siblings, 1 reply; 96+ messages in thread
From: Anthony Liguori @ 2010-11-01 22:00 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Gerd Hoffmann, Alexander Graf, qemu-devel Developers

On 11/01/2010 04:47 PM, Paolo Bonzini wrote:
> On 11/01/2010 09:32 PM, Anthony Liguori wrote:
>>
>> I'm not sure I agree with the goal.  I think where ever possible we
>> should reuse code with the Xen project when it makes sense.  Reusing
>> blkback/netback is impossible because we want userspace implementations
>> and the current implementations are in the kernel.  blktap also doesn't
>> tie into the QEMU block layer and making it tie into the QEMU block
>> layer would probably result in more code than it saved.
>>
>> OTOH, xenstored and xenconsoled have very little direct dependence on
>> Xen.  I'm not saying that we shouldn't make things Just Work in QEMU, so
>> if that means spawning xenconsoled/xenstored automagically from QEMU
>> with special options, that's perfectly fine.
>
> xenstored is 3 times bigger than what Alex submitted, however.  The 
> code is much simpler because _this_ xenstore only serves one domain.  
> So it doesn't have to implement permissions, it doesn't have 
> complicated threading to handle multiple instances of libxs accessing 
> the daemon, and so on.  Besides the data structures implementing the 
> tree, there's really very little in common, and the xenner code is 
> almost trivial.
>
> The situation is similar for the console.  There is only one console 
> to track here.  In fact, maybe it's simplest to implement it as a 
> small 8250A driver in the xenner kernel, reading from the serial 
> console at 0x3f8 and writing to the ring buffer and vice versa.

Okay, so does the same apply for xenstored?  Does it make more sense to 
move that into the xenner kernel?

The big advantage of the xenner kernel is that it runs in guest mode so 
it's no concern from a security PoV.  While xenstored is 3x bigger than 
Alex's version, it also has had an awful lot more validation from a 
security point of view.  Since this is guest facing code, that's important.

Regards,

Anthony Liguori

>
> Paolo


* [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 22:00                   ` Anthony Liguori
@ 2010-11-01 22:08                     ` Paolo Bonzini
  2010-11-01 22:29                       ` Anthony Liguori
  0 siblings, 1 reply; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-01 22:08 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel Developers, Gerd Hoffmann, Alexander Graf

On 11/01/2010 11:00 PM, Anthony Liguori wrote:
>
> Okay, so does the same apply for xenstored?  Does it make more sense to
> move that into the xenner kernel?

I think no, because the backend devices do use xenstore, so they would 
need a way to talk to the guest.  It's the same conceptually for the 
console, but in that case the "way to talk to the guest" is the 8250A 
device model that already exists.  In the case of xenstore it would be 
yet another protocol to devise and scrutinize.

Paolo


* [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 22:08                     ` Paolo Bonzini
@ 2010-11-01 22:29                       ` Anthony Liguori
  0 siblings, 0 replies; 96+ messages in thread
From: Anthony Liguori @ 2010-11-01 22:29 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel Developers, Gerd Hoffmann, Alexander Graf

On 11/01/2010 05:08 PM, Paolo Bonzini wrote:
> On 11/01/2010 11:00 PM, Anthony Liguori wrote:
>>
>> Okay, so does the same apply for xenstored?  Does it make more sense to
>> move that into the xenner kernel?
>
> I think no, because the backend devices do use xenstore, so they would 
> need a way to talk to the guest.

Yeah, I was thinking fw_cfg but that's only after not thinking too much 
about it so that may be naive.

Regards,

Anthony Liguori

>   It's the same conceptually for the console, but in that case the 
> "way to talk to the guest" is the 8250A device model that already 
> exists.  In the case of xenstore it would be yet another protocol to 
> devise and scrutinize.
>
> Paolo


* Re: [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-01 20:32               ` Anthony Liguori
  2010-11-01 21:47                 ` Paolo Bonzini
@ 2010-11-02  4:33                 ` Stefano Stabellini
  2010-11-02 10:06                   ` Paolo Bonzini
  1 sibling, 1 reply; 96+ messages in thread
From: Stefano Stabellini @ 2010-11-02  4:33 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Paolo Bonzini, Gerd Hoffmann, Alexander Graf, qemu-devel Developers

On Mon, 1 Nov 2010, Anthony Liguori wrote:
> On 11/01/2010 02:47 PM, Alexander Graf wrote:
> > Where else would it belong? Qemu is an emulator. Device emulation belongs to qemu code. The xen PV machine is nothing but a special case of the pc machine with custom firmware and odd devices :).
> >
> > As I stated in my cover letter, the goal of all this should be to have the qemu pieces be 100% independent of any xen headers or libraries,
> 
> I'm not sure I agree with the goal.  I think where ever possible we 
> should reuse code with the Xen project when it makes sense.  Reusing 
> blkback/netback is impossible because we want userspace implementations 
> and the current implementations are in the kernel.  blktap also doesn't 
> tie into the QEMU block layer and making it tie into the QEMU block 
> layer would probably result in more code than it saved.
> 
> OTOH, xenstored and xenconsoled have very little direct dependence on 
> Xen.  I'm not saying that we shouldn't make things Just Work in QEMU, so 
> if that means spawning xenconsoled/xenstored automagically from QEMU 
> with special options, that's perfectly fine.
> 
> But to replicate the functionality of this code solely because of NIH 
> seems like a waste of effort.
> 

I have been traveling so I haven't had a chance to carefully read the
series yet, however these are my early observations:

I don't mind xenner. Of course I think the best way to run a PV guest is
to use Xen, but Xenner can be useful in many ways. I would love to see
an x86_32 PV guest run on PowerPC, or even in a Xen HVM domain!
It would be very useful for testing too: it would shorten my dev & test
cycle by quite a bit.

I am a strong proponent of code sharing and reuse so I agree with
Anthony on this: we should reuse Xen libraries and daemons as much as
possible. If you need some patches to port xenstored and/or xenconsoled
to PowerPC we would gladly accept them.
That said, many Xen components are obviously tied to the Xen
architecture, so it might not be easy to reuse them outside a Xen
environment. For example: making xenstored work without Xen shouldn't be
too difficult but porting libxc to KVM/QEMU I think would be harder.

I am looking forward to talking with you in Boston,

Stefano


* Re: [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-02  4:33                 ` Stefano Stabellini
@ 2010-11-02 10:06                   ` Paolo Bonzini
  2010-11-02 10:31                     ` Gerd Hoffmann
  2010-11-02 13:55                     ` Stefano Stabellini
  0 siblings, 2 replies; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-02 10:06 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Gerd Hoffmann, Alexander Graf, qemu-devel Developers

On 11/02/2010 05:33 AM, Stefano Stabellini wrote:
> On Mon, 1 Nov 2010, Anthony Liguori wrote:
>> On 11/01/2010 02:47 PM, Alexander Graf wrote:
>>> Where else would it belong? Qemu is an emulator. Device emulation belongs to qemu code. The xen PV machine is nothing but a special case of the pc machine with custom firmware and odd devices :).
>>>
>>> As I stated in my cover letter, the goal of all this should be to have the qemu pieces be 100% independent of any xen headers or libraries,
>>
>> I'm not sure I agree with the goal.  I think where ever possible we
>> should reuse code with the Xen project when it makes sense.  Reusing
>> blkback/netback is impossible because we want userspace implementations
>> and the current implementations are in the kernel.  blktap also doesn't
>> tie into the QEMU block layer and making it tie into the QEMU block
>> layer would probably result in more code than it saved.
>>
>> OTOH, xenstored and xenconsoled have very little direct dependence on
>> Xen.  I'm not saying that we shouldn't make things Just Work in QEMU, so
>> if that means spawning xenconsoled/xenstored automagically from QEMU
>> with special options, that's perfectly fine.
>>
>> But to replicate the functionality of this code solely because of NIH
>> seems like a waste of effort.
>
> I am a strong proponent of code sharing and reuse so I agree with
> Anthony on this: we should reuse Xen libraries and daemons as much as
> possible. If you need some patches to port xenstored and/or xenconsoled
> to PowerPC we would gladly accept them.

The question is, how much do the Xen userspace and Xenner have in common?

If you remove code that Xen runs in the hypervisor or in the dom0 
kernel, or code that (like xenconsoled) is IMHO best moved to the Xenner 
kernel, what remains is the domain builder and of course xenstore 
handling.  The domain builder is in libxc, which makes it hard to share, 
and this leaves xenstore.

Now, half of it (the ring buffer protocol) already has a million 
duplicate implementations in userspace, in the kernel, in Windows PV 
drivers (at least three independent versions), and is pretty much set in 
stone.

So, what remains is actually parsing the xenstore messages and handling 
the tree data structure.  Which is actually a _very_ small part of 
xenstored: xenstored has to work across multiple domains and clients, be 
careful about inter-domain security, and so on.  Xenner has the _big_ 
advantage of having total independence between domUs (it's as if each 
domU had its own little dom0, its own little xenstore and so on).  While 
it doesn't mean there are no security concerns with guest-facing code, 
it simplifies the code to the point where effectively it makes no sense 
to share anything but the APIs.

I took a look at recent changes to libxs and xenstored in 
xen-unstable.hg. Here are some subjects going back to c/s 17400 (about 
30 months):

- xenstore: libxenstore: fix threading bug which cause xend startup hang
- xenstore: correctly handle errors from read_message
- xenstore: Make sure that libxs reports an error if xenstored drops
- xenstore: Fix cleanup_pop() definition for some (buggy) pthread.h headers.
- xs: avoid pthread_join deadlock in xs_daemon_close
- xs: make sure mutexes are cleaned up and memory freed if the read 
thread is cancelled
- xenstore,libxl: cleanup of xenstore connections across fork()
- xenstored: fix use-after free bug
- xenstore: Fix a memory leak in 'xs_is_domain_introduced'.
- xenstored: Fix xenstored abort when connection dropped.
- xenstore: fix canonicalize for metanodes

Almost all of them are about threading or error conditions, and even 
those that aren't wouldn't apply to a simple implementation like 
Xenner's.  This shows that the risk of missing bugfixes in guest-facing 
code is much smaller than one would think (including what I thought).

(BTW, I noticed that Xenner does not limit guest segments like Xen does.
Does it mean the guest can overwrite the Xenner kernel and effectively
run in ring0?)

Paolo


* Re: [Qemu-devel] [PATCH 06/40] qdev-ify: xen backends
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 06/40] qdev-ify: xen backends Alexander Graf
@ 2010-11-02 10:08   ` Markus Armbruster
  2010-11-02 10:43     ` Gerd Hoffmann
  0 siblings, 1 reply; 96+ messages in thread
From: Markus Armbruster @ 2010-11-02 10:08 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

Alexander Graf <agraf@suse.de> writes:

> From: Gerd Hoffmann <kraxel@redhat.com>
>
> This patch converts the xen backend code to qdev.

qdev conversions are always welcome.  This one's not complete (search
for #if 0).  The commit message should state that.

Two questions inline.

> Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  hw/xen_backend.c    |  176 ++++++++++++++++++++++++++++++++-------------------
>  hw/xen_backend.h    |    9 ++-
>  hw/xen_console.c    |   10 +++-
>  hw/xen_disk.c       |   10 +++-
>  hw/xen_machine_pv.c |    6 +--
>  hw/xen_nic.c        |   10 +++-
>  hw/xenfb.c          |   14 ++++-
>  7 files changed, 158 insertions(+), 77 deletions(-)
>
> diff --git a/hw/xen_backend.c b/hw/xen_backend.c
> index a2e408f..0d6a96b 100644
> --- a/hw/xen_backend.c
> +++ b/hw/xen_backend.c
> @@ -42,13 +42,21 @@
>  
>  /* ------------------------------------------------------------- */
>  
> +typedef struct XenBus {
> +    BusState qbus;
> +} XenBus;
> +
>  /* public */
>  int xen_xc;
>  struct xs_handle *xenstore = NULL;
>  const char *xen_protocol;
>  
>  /* private */
> -static QTAILQ_HEAD(XenDeviceHead, XenDevice) xendevs = QTAILQ_HEAD_INITIALIZER(xendevs);
> +static struct BusInfo xen_bus_info = {
> +    .name       = "Xen",
> +    .size       = sizeof(XenBus),
> +};
> +static XenBus *xenbus;
>  static int debug = 0;
>  
>  /* ------------------------------------------------------------- */
> @@ -163,14 +171,16 @@ int xen_be_set_state(struct XenDevice *xendev, enum xenbus_state state)
>  
>  struct XenDevice *xen_be_find_xendev(const char *type, int dom, int dev)
>  {
> +    struct DeviceState *qdev;
>      struct XenDevice *xendev;
>  
> -    QTAILQ_FOREACH(xendev, &xendevs, next) {
> +    QLIST_FOREACH(qdev, &xenbus->qbus.children, sibling) {
> +        xendev = container_of(qdev, struct XenDevice, qdev);
>  	if (xendev->dom != dom)
>  	    continue;
>  	if (xendev->dev != dev)
>  	    continue;
> -	if (strcmp(xendev->type, type) != 0)
> +	if (strcmp(xendev->ops->type, type) != 0)
>  	    continue;
>  	return xendev;
>      }
> @@ -180,28 +190,34 @@ struct XenDevice *xen_be_find_xendev(const char *type, int dom, int dev)
>  /*
>   * get xen backend device, allocate a new one if it doesn't exist.
>   */
> -static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
> +static struct XenDevice *xen_be_get_xendev(int dom, int dev,
>                                             struct XenDevOps *ops)
>  {
> +    struct DeviceState *qdev;
>      struct XenDevice *xendev;
> +    char name[64];
>      char *dom0;
>  
> -    xendev = xen_be_find_xendev(type, dom, dev);
> +    xendev = xen_be_find_xendev(ops->type, dom, dev);
>      if (xendev)
>  	return xendev;
>  
> +    /* create new xendev */
> +    snprintf(name, sizeof(name), "xen-%s", ops->type);
> +    qdev = qdev_create(&xenbus->qbus, name);
> +    qdev_init_nofail(qdev);
> +    xendev = container_of(qdev, struct XenDevice, qdev);
> +
>      /* init new xendev */
> -    xendev = qemu_mallocz(ops->size);
> -    xendev->type  = type;
>      xendev->dom   = dom;
>      xendev->dev   = dev;
>      xendev->ops   = ops;
>  
>      dom0 = xs_get_domain_path(xenstore, 0);
>      snprintf(xendev->be, sizeof(xendev->be), "%s/backend/%s/%d/%d",
> -	     dom0, xendev->type, xendev->dom, xendev->dev);
> +	     dom0, xendev->ops->type, xendev->dom, xendev->dev);
>      snprintf(xendev->name, sizeof(xendev->name), "%s-%d",
> -	     xendev->type, xendev->dev);
> +	     xendev->ops->type, xendev->dev);
>      free(dom0);
>  
>      xendev->debug      = debug;
> @@ -210,7 +226,7 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
>      xendev->evtchndev = xc_evtchn_open();
>      if (xendev->evtchndev < 0) {
>  	xen_be_printf(NULL, 0, "can't open evtchn device\n");
> -	qemu_free(xendev);
> +	qdev_free(&xendev->qdev);
>  	return NULL;
>      }
>      fcntl(xc_evtchn_fd(xendev->evtchndev), F_SETFD, FD_CLOEXEC);
> @@ -220,15 +236,13 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
>  	if (xendev->gnttabdev < 0) {
>  	    xen_be_printf(NULL, 0, "can't open gnttab device\n");
>  	    xc_evtchn_close(xendev->evtchndev);
> -	    qemu_free(xendev);
> +	    qdev_free(&xendev->qdev);
>  	    return NULL;
>  	}
>      } else {
>  	xendev->gnttabdev = -1;
>      }
>  
> -    QTAILQ_INSERT_TAIL(&xendevs, xendev, next);
> -
>      if (xendev->ops->alloc)
>  	xendev->ops->alloc(xendev);
>  
> @@ -238,43 +252,44 @@ static struct XenDevice *xen_be_get_xendev(const char *type, int dom, int dev,
>  /*
>   * release xen backend device.
>   */
> -static struct XenDevice *xen_be_del_xendev(int dom, int dev)
> +static void xen_be_del_xendev(int dom, int dev, struct XenDevOps *ops)
>  {
> -    struct XenDevice *xendev, *xnext;
> -
> -    /*
> -     * This is pretty much like QTAILQ_FOREACH(xendev, &xendevs, next) but
> -     * we save the next pointer in xnext because we might free xendev.
> -     */
> -    xnext = xendevs.tqh_first;
> -    while (xnext) {
> -        xendev = xnext;
> -        xnext = xendev->next.tqe_next;
> -
> -	if (xendev->dom != dom)
> -	    continue;
> -	if (xendev->dev != dev && dev != -1)
> -	    continue;
> -
> -	if (xendev->ops->free)
> -	    xendev->ops->free(xendev);
> -
> -	if (xendev->fe) {
> -	    char token[XEN_BUFSIZE];
> -	    snprintf(token, sizeof(token), "fe:%p", xendev);
> -	    xs_unwatch(xenstore, xendev->fe, token);
> -	    qemu_free(xendev->fe);
> -	}
> -
> -	if (xendev->evtchndev >= 0)
> -	    xc_evtchn_close(xendev->evtchndev);
> -	if (xendev->gnttabdev >= 0)
> -	    xc_gnttab_close(xendev->gnttabdev);
> -
> -	QTAILQ_REMOVE(&xendevs, xendev, next);
> -	qemu_free(xendev);
> -    }
> -    return NULL;
> +    struct DeviceState *qdev;
> +    struct XenDevice *xendev;
> +    int done;
> +
> +    do {
> +        done = 1;
> +        QLIST_FOREACH(qdev, &xenbus->qbus.children, sibling) {
> +            xendev = container_of(qdev, struct XenDevice, qdev);
> +            if (xendev->dom != dom)
> +                continue;
> +            if (xendev->dev != dev && dev != -1)
> +                continue;
> +            if (xendev->ops != ops)
> +                continue;
> +
> +            if (xendev->ops->free)
> +                xendev->ops->free(xendev);
> +
> +            if (xendev->fe) {
> +                char token[XEN_BUFSIZE];
> +                snprintf(token, sizeof(token), "fe:%p", xendev);
> +                xs_unwatch(xenstore, xendev->fe, token);
> +                qemu_free(xendev->fe);
> +            }
> +
> +            if (xendev->evtchndev >= 0)
> +                xc_evtchn_close(xendev->evtchndev);
> +            if (xendev->gnttabdev >= 0)
> +                xc_gnttab_close(xendev->gnttabdev);
> +
> +            qdev_free(&xendev->qdev);
> +
> +            done = 0;
> +            break;
> +        }
> +    } while (!done);

This loop nest confuses me.  Why can't we just QLIST_FOREACH_SAFE()?

>  }
>  
>  /*
> @@ -498,7 +513,7 @@ void xen_be_check_state(struct XenDevice *xendev)
>  
>  /* ------------------------------------------------------------- */
>  
> -static int xenstore_scan(const char *type, int dom, struct XenDevOps *ops)
> +static int xenstore_scan(int dom, struct XenDevOps *ops)
>  {
>      struct XenDevice *xendev;
>      char path[XEN_BUFSIZE], token[XEN_BUFSIZE];
> @@ -507,8 +522,8 @@ static int xenstore_scan(const char *type, int dom, struct XenDevOps *ops)
>  
>      /* setup watch */
>      dom0 = xs_get_domain_path(xenstore, 0);
> -    snprintf(token, sizeof(token), "be:%p:%d:%p", type, dom, ops);
> -    snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom);
> +    snprintf(token, sizeof(token), "be:%d:%p", dom, ops);

Why drop %p, type from token?
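
For reference, the token is only a string that carries the domain id and
the ops pointer through xenstore and back.  A minimal sketch of that round
trip follows.  make_token(), parse_token() and struct demo_ops are
hypothetical names, not part of the patch, and the sketch prints the
pointer as a uintptr_t via PRIxPTR/SCNxPTR instead of %p, whose output
format is implementation-defined.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct XenDevOps; only the type field is kept. */
struct demo_ops {
    char type[64];
};

/* Encode "be:<dom>:<ops pointer in hex>" into buf; returns snprintf()'s result. */
static int make_token(char *buf, size_t len, int dom, struct demo_ops *ops)
{
    return snprintf(buf, len, "be:%d:%" PRIxPTR, dom, (uintptr_t)(void *)ops);
}

/* Decode a token written by make_token(); returns 0 on success, -1 otherwise. */
static int parse_token(const char *token, int *dom, struct demo_ops **ops)
{
    uintptr_t ptr;

    if (sscanf(token, "be:%d:%" SCNxPTR, dom, &ptr) != 2) {
        return -1;
    }
    *ops = (struct demo_ops *)(void *)ptr;
    return 0;
}
```

With the type string stored in the ops structure itself, the single ops
pointer in the token is enough to get back to both.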

> +    snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, ops->type, dom);
>      free(dom0);
>      if (!xs_watch(xenstore, path, token)) {
>  	xen_be_printf(NULL, 0, "xen be: watching backend path (%s) failed\n", path);
> @@ -520,7 +535,7 @@ static int xenstore_scan(const char *type, int dom, struct XenDevOps *ops)
>      if (!dev)
>  	return 0;
>      for (j = 0; j < cdev; j++) {
> -	xendev = xen_be_get_xendev(type, dom, atoi(dev[j]), ops);
> +	xendev = xen_be_get_xendev(dom, atoi(dev[j]), ops);
>  	if (xendev == NULL)
>  	    continue;
>  	xen_be_check_state(xendev);
> @@ -529,15 +544,14 @@ static int xenstore_scan(const char *type, int dom, struct XenDevOps *ops)
>      return 0;
>  }
>  
> -static void xenstore_update_be(char *watch, char *type, int dom,
> -			       struct XenDevOps *ops)
> +static void xenstore_update_be(char *watch, int dom, struct XenDevOps *ops)
>  {
>      struct XenDevice *xendev;
>      char path[XEN_BUFSIZE], *dom0;
>      unsigned int len, dev;
>  
>      dom0 = xs_get_domain_path(xenstore, 0);
> -    len = snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, type, dom);
> +    len = snprintf(path, sizeof(path), "%s/backend/%s/%d", dom0, ops->type, dom);
>      free(dom0);
>      if (strncmp(path, watch, len) != 0)
>  	return;
> @@ -551,10 +565,10 @@ static void xenstore_update_be(char *watch, char *type, int dom,
>  
>      if (0) {
>  	/* FIXME: detect devices being deleted from xenstore ... */
> -	xen_be_del_xendev(dom, dev);
> +	xen_be_del_xendev(dom, dev, ops);
>      }
>  
> -    xendev = xen_be_get_xendev(type, dom, dev, ops);
> +    xendev = xen_be_get_xendev(dom, dev, ops);
>      if (xendev != NULL) {
>  	xen_be_backend_changed(xendev, path);
>  	xen_be_check_state(xendev);
> @@ -580,16 +594,16 @@ static void xenstore_update_fe(char *watch, struct XenDevice *xendev)
>  static void xenstore_update(void *unused)
>  {
>      char **vec = NULL;
> -    intptr_t type, ops, ptr;
> +    intptr_t ops, ptr;
>      unsigned int dom, count;
>  
>      vec = xs_read_watch(xenstore, &count);
>      if (vec == NULL)
>  	goto cleanup;
>  
> -    if (sscanf(vec[XS_WATCH_TOKEN], "be:%" PRIxPTR ":%d:%" PRIxPTR,
> -               &type, &dom, &ops) == 3)
> -	xenstore_update_be(vec[XS_WATCH_PATH], (void*)type, dom, (void*)ops);
> +    if (sscanf(vec[XS_WATCH_TOKEN], "be:%d:%" PRIxPTR,
> +               &dom, &ops) == 2)
> +	xenstore_update_be(vec[XS_WATCH_PATH], dom, (void*)ops);
>      if (sscanf(vec[XS_WATCH_TOKEN], "fe:%" PRIxPTR, &ptr) == 1)
>  	xenstore_update_fe(vec[XS_WATCH_PATH], (void*)ptr);
>  
> @@ -642,9 +656,43 @@ err:
>      return -1;
>  }
>  
> -int xen_be_register(const char *type, struct XenDevOps *ops)
> +void xen_create_bus(DeviceState *parent)
>  {
> -    return xenstore_scan(type, xen_domid, ops);
> +    DeviceInfo *info;
> +    BusState *qbus;
> +
> +    qbus = qbus_create(&xen_bus_info, parent, NULL);
> +    xenbus = DO_UPCAST(XenBus, qbus, qbus);
> +    for (info = device_info_list; info != NULL; info = info->next) {
> +        if (info->bus_info != &xen_bus_info)
> +            continue;
> +        xenstore_scan(xen_domid, DO_UPCAST(struct XenDevOps, qinfo, info));
> +    }
> +
> +    qbus->allow_hotplug = 1;
> +}
> +
> +static int xen_be_initfn(DeviceState *dev, DeviceInfo *info)
> +{
> +#if 0
> +    struct XenDevOps *ops = container_of(info, struct XenDevOps, qinfo);
> +    struct XenDevice *xendev = container_of(dev, struct XenDevice, qdev);
> +
> +    /* nothing to do as create + init isn't really split. */
> +#endif
> +    return 0;
> +}
> +
> +void xen_qdev_register(struct XenDevOps *ops)
> +{
> +    char name[64];
> +
> +    snprintf(name, sizeof(name), "xen-%s", ops->type);
> +    ops->qinfo.name = qemu_strdup(name);
> +    ops->qinfo.init = xen_be_initfn;
> +    ops->qinfo.bus_info = &xen_bus_info;
> +    ops->qinfo.no_user = 1;
> +    qdev_register(&ops->qinfo);
>  }
>  
>  int xen_be_bind_evtchn(struct XenDevice *xendev)
> diff --git a/hw/xen_backend.h b/hw/xen_backend.h
> index 1b428e3..f53a742 100644
> --- a/hw/xen_backend.h
> +++ b/hw/xen_backend.h
> @@ -4,6 +4,7 @@
>  #include "xen_common.h"
>  #include "sysemu.h"
>  #include "net.h"
> +#include "qdev.h"
>  
>  /* ------------------------------------------------------------- */
>  
> @@ -17,7 +18,8 @@ struct XenDevice;
>  #define DEVOPS_FLAG_IGNORE_STATE  2
>  
>  struct XenDevOps {
> -    size_t    size;
> +    DeviceInfo qinfo;
> +    char      type[64];
>      uint32_t  flags;
>      void      (*alloc)(struct XenDevice *xendev);
>      int       (*init)(struct XenDevice *xendev);
> @@ -30,7 +32,7 @@ struct XenDevOps {
>  };
>  
>  struct XenDevice {
> -    const char         *type;
> +    DeviceState        qdev;
>      int                dom;
>      int                dev;
>      char               name[64];
> @@ -78,7 +80,8 @@ void xen_be_check_state(struct XenDevice *xendev);
>  
>  /* xen backend driver bits */
>  int xen_be_init(void);
> -int xen_be_register(const char *type, struct XenDevOps *ops);
> +void xen_create_bus(DeviceState *parent);
> +void xen_qdev_register(struct XenDevOps *ops);
>  int xen_be_set_state(struct XenDevice *xendev, enum xenbus_state state);
>  int xen_be_bind_evtchn(struct XenDevice *xendev);
>  void xen_be_unbind_evtchn(struct XenDevice *xendev);
> diff --git a/hw/xen_console.c b/hw/xen_console.c
> index d2261f4..a980dc8 100644
> --- a/hw/xen_console.c
> +++ b/hw/xen_console.c
> @@ -260,10 +260,18 @@ static void con_event(struct XenDevice *xendev)
>  /* -------------------------------------------------------------------- */
>  
>  struct XenDevOps xen_console_ops = {
> -    .size       = sizeof(struct XenConsole),
> +    .qinfo.size = sizeof(struct XenConsole),
> +    .type       = "console",
>      .flags      = DEVOPS_FLAG_IGNORE_STATE,
>      .init       = con_init,
>      .connect    = con_connect,
>      .event      = con_event,
>      .disconnect = con_disconnect,
>  };
> +
> +static void xen_console_register_devices(void)
> +{
> +    xen_qdev_register(&xen_console_ops);
> +}
> +
> +device_init(xen_console_register_devices)
> diff --git a/hw/xen_disk.c b/hw/xen_disk.c
> index 06752de..5392f58 100644
> --- a/hw/xen_disk.c
> +++ b/hw/xen_disk.c
> @@ -774,7 +774,8 @@ static void blk_event(struct XenDevice *xendev)
>  }
>  
>  struct XenDevOps xen_blkdev_ops = {
> -    .size       = sizeof(struct XenBlkDev),
> +    .qinfo.size = sizeof(struct XenBlkDev),
> +    .type       = "qdisk",
>      .flags      = DEVOPS_FLAG_NEED_GNTDEV,
>      .alloc      = blk_alloc,
>      .init       = blk_init,
> @@ -783,3 +784,10 @@ struct XenDevOps xen_blkdev_ops = {
>      .event      = blk_event,
>      .free       = blk_free,
>  };
> +
> +static void xen_blkdev_register_devices(void)
> +{
> +    xen_qdev_register(&xen_blkdev_ops);
> +}
> +
> +device_init(xen_blkdev_register_devices)
> diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
> index 77a34bf..b94d6e9 100644
> --- a/hw/xen_machine_pv.c
> +++ b/hw/xen_machine_pv.c
> @@ -75,11 +75,7 @@ static void xen_init_pv(ram_addr_t ram_size,
>          break;
>      }
>  
> -    xen_be_register("console", &xen_console_ops);
> -    xen_be_register("vkbd", &xen_kbdmouse_ops);
> -    xen_be_register("vfb", &xen_framebuffer_ops);
> -    xen_be_register("qdisk", &xen_blkdev_ops);
> -    xen_be_register("qnic", &xen_netdev_ops);
> +    xen_create_bus(NULL);
>  
>      /* configure framebuffer */
>      if (xenfb_enabled) {
> diff --git a/hw/xen_nic.c b/hw/xen_nic.c
> index 08055b8..02d3c4e 100644
> --- a/hw/xen_nic.c
> +++ b/hw/xen_nic.c
> @@ -411,7 +411,8 @@ static int net_free(struct XenDevice *xendev)
>  /* ------------------------------------------------------------- */
>  
>  struct XenDevOps xen_netdev_ops = {
> -    .size       = sizeof(struct XenNetDev),
> +    .qinfo.size = sizeof(struct XenNetDev),
> +    .type       = "qnic",
>      .flags      = DEVOPS_FLAG_NEED_GNTDEV,
>      .init       = net_init,
>      .connect    = net_connect,
> @@ -419,3 +420,10 @@ struct XenDevOps xen_netdev_ops = {
>      .disconnect = net_disconnect,
>      .free       = net_free,
>  };
> +
> +static void xen_netdev_register_devices(void)
> +{
> +    xen_qdev_register(&xen_netdev_ops);
> +}
> +
> +device_init(xen_netdev_register_devices)
> diff --git a/hw/xenfb.c b/hw/xenfb.c
> index da5297b..293210f 100644
> --- a/hw/xenfb.c
> +++ b/hw/xenfb.c
> @@ -953,7 +953,8 @@ static void fb_event(struct XenDevice *xendev)
>  /* -------------------------------------------------------------------- */
>  
>  struct XenDevOps xen_kbdmouse_ops = {
> -    .size       = sizeof(struct XenInput),
> +    .qinfo.size = sizeof(struct XenInput),
> +    .type       = "vkbd",
>      .init       = input_init,
>      .connect    = input_connect,
>      .disconnect = input_disconnect,
> @@ -961,7 +962,8 @@ struct XenDevOps xen_kbdmouse_ops = {
>  };
>  
>  struct XenDevOps xen_framebuffer_ops = {
> -    .size       = sizeof(struct XenFB),
> +    .qinfo.size = sizeof(struct XenFB),
> +    .type       = "vfb",
>      .init       = fb_init,
>      .connect    = fb_connect,
>      .disconnect = fb_disconnect,
> @@ -1011,3 +1013,11 @@ wait_more:
>      xen_be_check_state(xin);
>      xen_be_check_state(xfb);
>  }
> +
> +static void xenfb_register_devices(void)
> +{
> +    xen_qdev_register(&xen_kbdmouse_ops);
> +    xen_qdev_register(&xen_framebuffer_ops);
> +}
> +
> +device_init(xenfb_register_devices)


* [Qemu-devel] Re: [PATCH 35/40] xenner: Domain Builder
  2010-11-01 15:01 ` [Qemu-devel] [PATCH 35/40] xenner: Domain Builder Alexander Graf
@ 2010-11-02 10:09   ` Paolo Bonzini
  2010-11-02 15:36     ` Alexander Graf
  0 siblings, 1 reply; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-02 10:09 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On 11/01/2010 04:01 PM, Alexander Graf wrote:
> The traditional Xen way of loading a kernel and initial data structures into the
> guest's memory is by calling libxc functions. We want to be able to run without
> libxc dependencies though, so we need an alternative.
>
> This patch implements a full domain builder for xenner. It loads the guest
> kernel, sets up basic memory layouts and saves everything off for the pv
> communication device between xenner and qemu.
>
> Signed-off-by: Alexander Graf<agraf@suse.de>

What about 36 to 40? :)

Paolo


* Re: [Qemu-devel] [PATCH 02/40] elf: Add notes implementation
  2010-11-01 21:17           ` Alexander Graf
  2010-11-01 21:28             ` [Qemu-devel] " Paolo Bonzini
  2010-11-01 21:31             ` [Qemu-devel] " Stefan Weil
@ 2010-11-02 10:17             ` Michael Matz
  2 siblings, 0 replies; 96+ messages in thread
From: Michael Matz @ 2010-11-02 10:17 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Blue Swirl, qemu-devel Developers, Gerd Hoffmann

Hi,

On Mon, 1 Nov 2010, Alexander Graf wrote:

> > No. Pointers of same type which are not void pointers can be compared.
> > 
> > There is even a data type ptrdiff_t, so you can also compare their
> > difference with zero.
> 
> Let's ask someone who definitely knows :).
> 
> Michael, is code like
> 
> char *x = a, *y = b;
> if (x < y) {
>   ...
> }

Pointers can be compared iff they point into the same object 
(including right after the object), so it depends on what a and b were 
above.  This would be invalid for instance:

  int o1, o2;
  int *p1 = &o1, *p2 = &o2;
  if (p1 < p2) ...

> valid? Or do I first have to cast x and y to unsigned longs or 
> uintptr_t?

For doing a valid pointer comparison you don't have to cast anything.  
Casting doesn't make an invalid comparison valid.
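
As a small illustration of the valid case, here is a sketch of a helper
that relies on both pointers pointing into the same array.  The name
count_between() is made up for this example.

```c
#include <stddef.h>

/*
 * Count the elements in [first, last).  Both pointers must point into the
 * same array (last may be one past its final element); that is what makes
 * the relational comparison below well-defined.
 */
static size_t count_between(const int *first, const int *last)
{
    size_t n = 0;

    while (first < last) {
        first++;
        n++;
    }
    return n;
}
```

Calling it with arr and arr + 8 for an int arr[8] is fine; calling it with
the addresses of two unrelated objects is not.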


Ciao,
Michael.


* Re: [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-02 10:06                   ` Paolo Bonzini
@ 2010-11-02 10:31                     ` Gerd Hoffmann
  2010-11-02 10:38                       ` Paolo Bonzini
  2010-11-02 13:55                     ` Stefano Stabellini
  1 sibling, 1 reply; 96+ messages in thread
From: Gerd Hoffmann @ 2010-11-02 10:31 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel Developers, Alexander Graf, Stefano Stabellini

   Hi,

> (BTW, I noticed that Xenner does not limit guest segments like Xen does.
> Does it mean the guest can overwrite the Xenner kernel and effectively
> run ring0?)

Yes.  The guest also can modify page tables as it pleases.  It is the 
vmx/svm container which protects the host, not the xenner kernel.

cheers,
   Gerd


* Re: [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-02 10:31                     ` Gerd Hoffmann
@ 2010-11-02 10:38                       ` Paolo Bonzini
  0 siblings, 0 replies; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-02 10:38 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: qemu-devel Developers, Alexander Graf, Stefano Stabellini

On 11/02/2010 11:31 AM, Gerd Hoffmann wrote:
>   Hi,
>
>> (BTW, I noticed that Xenner does not limit guest segments like Xen does.
>> Does it mean the guest can overwrite the Xenner kernel and effectively
>> run ring0?)
>
> Yes. The guest also can modify page tables as it pleases. It is the
> vmx/svm container which protects the host, not the xenner kernel.

Yes, got it.  I was trying to understand exactly which parts are 
guest-facing (the answer is "everything") and which are only 
xenner-facing (and here the answer is "none" :)).

Paolo


* Re: [Qemu-devel] [PATCH 06/40] qdev-ify: xen backends
  2010-11-02 10:08   ` Markus Armbruster
@ 2010-11-02 10:43     ` Gerd Hoffmann
  2010-11-02 13:26       ` Markus Armbruster
  0 siblings, 1 reply; 96+ messages in thread
From: Gerd Hoffmann @ 2010-11-02 10:43 UTC (permalink / raw)
  To: Markus Armbruster; +Cc: Alexander Graf, qemu-devel Developers

On 11/02/10 11:08, Markus Armbruster wrote:
> Alexander Graf<agraf@suse.de>  writes:
>
>> From: Gerd Hoffmann<kraxel@redhat.com>
>>
>> This patch converts the xen backend code to qdev.
>
> qdev conversions are always welcome.  This one's not complete (search
> for #if 0).

It is a tricky one too.

Creation of the xen backend device instances is controlled via xenstore
(either emulated in the case of xenner, or xenstored when running on Xen).
When block/net backends are created via qemu command line switches, all
qemu does is create the xenstore entries.  Having an external entity
(e.g. xend) create the xenstore entries works too.

This workflow is a bit hard to fit into the qdev model ...
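
To make the xenstore-driven flow a bit more concrete, here is a rough
sketch of the two string operations involved: building the backend
directory that gets watched, and pulling the device number out of a watch
event below it.  backend_path() and backend_dev_from_watch() are
hypothetical helpers, and the "/local/domain/0" prefix used in the usage
note is an assumption about what the dom0 path looks like.

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "<dom0>/backend/<type>/<dom>", the directory watched for one
 * backend type.  Returns the path length as reported by snprintf().
 */
static int backend_path(char *buf, size_t len, const char *dom0,
                        const char *type, int dom)
{
    return snprintf(buf, len, "%s/backend/%s/%d", dom0, type, dom);
}

/*
 * Given a watch event path below base, extract the device number that
 * follows it.  Returns 0 on success, -1 if the event belongs to another
 * directory or carries no device number.
 */
static int backend_dev_from_watch(const char *base, const char *watch,
                                  unsigned int *dev)
{
    size_t len = strlen(base);

    if (strncmp(base, watch, len) != 0) {
        return -1;
    }
    if (sscanf(watch + len, "/%u", dev) != 1) {
        return -1;
    }
    return 0;
}
```

For example, with base "/local/domain/0/backend/qdisk/3", an event on
"/local/domain/0/backend/qdisk/3/768/state" yields device 768, and only at
that point would a device instance be looked up or created.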

>> +    do {
>> +        done = 1;
>> +        QLIST_FOREACH(qdev,&xenbus->qbus.children, sibling) {
>> +            xendev = container_of(qdev, struct XenDevice, qdev);

[ ... ]

>> +            done = 0;
>> +            break;
>> +        }
>> +    } while (!done);
>
> This loop nest confuses me.  Why can't we just QLIST_FOREACH_SAFE()?

Just historical reasons I guess.  QLIST_FOREACH_SAFE() wasn't there from 
the start but got added later.

cheers,
   Gerd


* Re: [Qemu-devel] [PATCH 06/40] qdev-ify: xen backends
  2010-11-02 10:43     ` Gerd Hoffmann
@ 2010-11-02 13:26       ` Markus Armbruster
  0 siblings, 0 replies; 96+ messages in thread
From: Markus Armbruster @ 2010-11-02 13:26 UTC (permalink / raw)
  To: Gerd Hoffmann; +Cc: Alexander Graf, qemu-devel Developers

Gerd Hoffmann <kraxel@redhat.com> writes:

> On 11/02/10 11:08, Markus Armbruster wrote:
>> Alexander Graf<agraf@suse.de>  writes:
>>
>>> From: Gerd Hoffmann<kraxel@redhat.com>
>>>
>>> This patch converts the xen backend code to qdev.
>>
>> qdev conversions are always welcome.  This one's not complete (search
>> for #if 0).
>
> It is a tricky one too.
>
> Creation of the xen backend device instances is controlled via xenstore
> (either emulated in the case of xenner, or xenstored when running on
> Xen). When block/net backends are created via qemu command line switches,
> all qemu does is create the xenstore entries.  Having an external
> entity (e.g. xend) create the xenstore entries works too.
>
> This workflow is a bit hard to fit into the qdev model ...

I'm fine with imperfect qdev conversions, as long as the issues make
things no worse than they were before (check, I think), they're clearly
documented in the source (check, although the comment could be
improved), and in the commit message (fail, but easy enough to fix).

>>> +    do {
>>> +        done = 1;
>>> +        QLIST_FOREACH(qdev,&xenbus->qbus.children, sibling) {
>>> +            xendev = container_of(qdev, struct XenDevice, qdev);
>
> [ ... ]
>
>>> +            done = 0;
>>> +            break;
>>> +        }
>>> +    } while (!done);
>>
>> This loop nest confuses me.  Why can't we just QLIST_FOREACH_SAFE()?
>
> Just historical reasons I guess.  QLIST_FOREACH_SAFE() wasn't there
> from the start but got added later.

I'd like that to be cleaned up then.
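
Roughly what the cleaned-up version could look like, shown on a
self-contained toy list rather than on qdev.  Everything here (struct
demo_dev, DEMO_FOREACH_SAFE, demo_add, demo_del) is made up for
illustration; the point is only that caching the successor before the body
runs allows a single pass even when the current element is freed, which is
the same idea QLIST_FOREACH_SAFE() is built on.

```c
#include <stdlib.h>

/* Hypothetical, pared-down device node; not the real struct XenDevice. */
struct demo_dev {
    int dom;
    int dev;
    struct demo_dev *next;
};

/* Iterate while caching the successor in tmp, so the body may free var. */
#define DEMO_FOREACH_SAFE(var, head, tmp) \
    for ((var) = (head); (var) && ((tmp) = (var)->next, 1); (var) = (tmp))

/*
 * Push a new node on the front of the list.  Returns the new head, or the
 * old head unchanged if the allocation fails.
 */
static struct demo_dev *demo_add(struct demo_dev *head, int dom, int dev)
{
    struct demo_dev *d = malloc(sizeof(*d));

    if (!d) {
        return head;
    }
    d->dom = dom;
    d->dev = dev;
    d->next = head;
    return d;
}

/*
 * Free every node matching dom and dev (dev == -1 matches any device).
 * Returns the number of nodes freed.
 */
static int demo_del(struct demo_dev **head, int dom, int dev)
{
    struct demo_dev *d, *tmp, **link = head;
    int freed = 0;

    DEMO_FOREACH_SAFE(d, *head, tmp) {
        if (d->dom != dom || (d->dev != dev && dev != -1)) {
            link = &d->next;    /* keep this node, advance the link */
            continue;
        }
        *link = tmp;            /* unlink before freeing */
        free(d);
        freed++;
    }
    return freed;
}
```

No outer do/while and no restart from the list head are needed.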


* Re: [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-02 10:06                   ` Paolo Bonzini
  2010-11-02 10:31                     ` Gerd Hoffmann
@ 2010-11-02 13:55                     ` Stefano Stabellini
  2010-11-02 15:48                       ` Alexander Graf
  1 sibling, 1 reply; 96+ messages in thread
From: Stefano Stabellini @ 2010-11-02 13:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel Developers, Stefano Stabellini, Alexander Graf,
	Gerd Hoffmann

On Tue, 2 Nov 2010, Paolo Bonzini wrote:
> The question is, how much do the Xen userspace and Xenner have in common?
> 
> If you remove code that Xen runs in the hypervisor or in the dom0 
> kernel, or code that (like xenconsoled) is IMHO best moved to the Xenner 
> kernel, what remains is the domain builder and of course xenstore 
> handling.  The domain builder is in libxc, which makes it hard to share, 
> and this leaves xenstore.
> 

There is a xen console backend in qemu already (xen_console.c).


> Now, half of it (the ring buffer protocol) already has a million 
> duplicate implementation in userspace, in the kernel, in Windows PV 
> drivers (at least three independent versions), and is pretty much set in 
> stone.
> 
> So, what remains is actually parsing the xenstore messages and handling 
> the tree data structure.  Which is actually a _very_ small part of 
> xenstored: xenstored has to work across multiple domains and clients, be 
> careful about inter-domain security, and so on.  Xenner has the _big_ 
> advantage of having total independence between domUs (it's like if each 
> domU had its own little dom0, its own little xenstore and so on).  While 
> it doesn't mean there are no security concerns with guest-facing code, 
> it simplifies the code to the point where effectively it makes no sense 
> to share anything but the APIs.
> 

All right, if you feel that it would be easier for you to use your own
simplified version, I am OK with that.
However, it is important that the mini-libxc, the mini-xenstored and the
qemu domain builder are disabled when using xen as the accelerator.
As I said before, running pure PV guests in a xen HVM domain should be one of
the targets of the series, and in that case we do want to use the
full-featured xenstored and libxc and the libxenlight domain builder.


* [Qemu-devel] Re: [PATCH 35/40] xenner: Domain Builder
  2010-11-02 10:09   ` [Qemu-devel] " Paolo Bonzini
@ 2010-11-02 15:36     ` Alexander Graf
  2010-11-02 15:51       ` Paolo Bonzini
  0 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-02 15:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel Developers, Gerd Hoffmann


On 02.11.2010, at 06:09, Paolo Bonzini wrote:

> On 11/01/2010 04:01 PM, Alexander Graf wrote:
>> The traditional Xen way of loading a kernel and initial data structures into the
>> guest's memory is by calling libxc functions. We want to be able to run without
>> libxc dependencies though, so we need an alternative.
>> 
>> This patch implements a full domain builder for xenner. It loads the guest
>> kernel, sets up basic memory layouts and saves everything off for the pv
>> communication device between xenner and qemu.
>> 
>> Signed-off-by: Alexander Graf<agraf@suse.de>
> 
> What about 36 to 40? :)

What about them? :) Didn't they make it to the ML?


Alex


* Re: [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-02 13:55                     ` Stefano Stabellini
@ 2010-11-02 15:48                       ` Alexander Graf
  2010-11-02 19:20                         ` Stefano Stabellini
  0 siblings, 1 reply; 96+ messages in thread
From: Alexander Graf @ 2010-11-02 15:48 UTC (permalink / raw)
  To: Stefano Stabellini; +Cc: Paolo Bonzini, qemu-devel Developers, Gerd Hoffmann


On 02.11.2010, at 09:55, Stefano Stabellini wrote:

> On Tue, 2 Nov 2010, Paolo Bonzini wrote:
>> The question is, how much do the Xen userspace and Xenner have in common?
>> 
>> If you remove code that Xen runs in the hypervisor or in the dom0 
>> kernel, or code that (like xenconsoled) is IMHO best moved to the Xenner 
>> kernel, what remains is the domain builder and of course xenstore 
>> handling.  The domain builder is in libxc, which makes it hard to share, 
>> and this leaves xenstore.
>> 
> 
> There is a xen console backend in qemu already (xen_console.c).
> 
> 
>> Now, half of it (the ring buffer protocol) already has a million 
>> duplicate implementation in userspace, in the kernel, in Windows PV 
>> drivers (at least three independent versions), and is pretty much set in 
>> stone.
>> 
>> So, what remains is actually parsing the xenstore messages and handling 
>> the tree data structure.  Which is actually a _very_ small part of 
>> xenstored: xenstored has to work across multiple domains and clients, be 
>> careful about inter-domain security, and so on.  Xenner has the _big_ 
>> advantage of having total independence between domUs (it's like if each 
>> domU had its own little dom0, its own little xenstore and so on).  While 
>> it doesn't mean there are no security concerns with guest-facing code, 
>> it simplifies the code to the point where effectively it makes no sense 
>> to share anything but the APIs.
>> 
> 
> All right, if you feel that it would be easier for you to use your own
> simplified version, I am OK with that.
> However, it is important that the mini-libxc, the mini-xenstored and the
> qemu domain builder are disabled when using xen as the accelerator.
> As I said before, running pure PV guests in a xen HVM domain should be one of
> the targets of the series, and in that case we do want to use the
> full-featured xenstored and libxc and the libxenlight domain builder.

This is getting confusing :). I'm aware of multiple ways of spawning a Xen PV instance:

1) Xen PV context
2) Xen PV context in SVM/VMX container, maintained by Xen
3) Xenner on TCG/KVM
4) Xenner on Xen HVM

For 1 and 2, the way to go is definitely to reuse the xen infrastructure. For 3, I'm very reluctant to require dependencies. One of qemu's strong points is that it does not have too many dependencies on other code. If there are strong arguments for it, however, I'll gladly change my position :).

For 4, however, I haven't fully made up my mind on whether it's useful to people (if you say it is, I'm more than glad to get this rolling!) or what the best way to implement it would be.

So I suppose your suggestion is to use the xen infrastructure for case 4? That might work out. Fortunately, all the detection of which backend we use happens at runtime. Since Xen does own the guest's memory in that case, we might even be safe using its memory mapping functionality. Maybe.
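
Written down as code, the mapping discussed so far would look roughly like the sketch below. All names (enum pv_setup, enum pv_infra, pick_infra) are hypothetical and exist only for this illustration, and the choice for case 4 is an assumption that simply follows the suggestion above; nothing is decided there.

```c
/* Hypothetical names for the four setups listed above. */
enum pv_setup {
    PV_ON_XEN,            /* 1) Xen PV context */
    PV_IN_XEN_CONTAINER,  /* 2) Xen PV context in SVM/VMX container */
    XENNER_ON_TCG_KVM,    /* 3) Xenner on TCG/KVM */
    XENNER_ON_XEN_HVM     /* 4) Xenner on Xen HVM */
};

enum pv_infra {
    INFRA_XEN_LIBS,       /* real libxc / xenstored */
    INFRA_XENNER_EMU      /* built-in emulation, no external dependencies */
};

/* Pick the support infrastructure at runtime for a given setup. */
static enum pv_infra pick_infra(enum pv_setup setup)
{
    switch (setup) {
    case XENNER_ON_TCG_KVM:
        return INFRA_XENNER_EMU;
    case PV_ON_XEN:
    case PV_IN_XEN_CONTAINER:
    case XENNER_ON_XEN_HVM:   /* assumption: follows the suggestion above */
    default:
        return INFRA_XEN_LIBS;
    }
}
```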

I'm very much looking forward to talking to you about this in Boston. Are you around already?


Alex


* [Qemu-devel] Re: [PATCH 35/40] xenner: Domain Builder
  2010-11-02 15:36     ` Alexander Graf
@ 2010-11-02 15:51       ` Paolo Bonzini
  2010-11-02 16:28         ` Alexander Graf
  0 siblings, 1 reply; 96+ messages in thread
From: Paolo Bonzini @ 2010-11-02 15:51 UTC (permalink / raw)
  To: Alexander Graf; +Cc: qemu-devel Developers, Gerd Hoffmann

On 11/02/2010 04:36 PM, Alexander Graf wrote:
>
> On 02.11.2010, at 06:09, Paolo Bonzini wrote:
>
>> On 11/01/2010 04:01 PM, Alexander Graf wrote:
>>> The traditional Xen way of loading a kernel and initial data structures into the
>>> guest's memory is by calling libxc functions. We want to be able to run without
>>> libxc dependencies though, so we need an alternative.
>>>
>>> This patch implements a full domain builder for xenner. It loads the guest
>>> kernel, sets up basic memory layouts and saves everything off for the pv
>>> communication device between xenner and qemu.
>>>
>>> Signed-off-by: Alexander Graf<agraf@suse.de>
>>
>> What about 36 to 40? :)
>
> What about them? :) Didn't they make it to the ML?

http://patchwork.ozlabs.org/project/qemu-devel/list/ seems to agree with 
me they didn't. :)

Paolo


* [Qemu-devel] [PATCH 36/40] xen: only create dummy env when necessary
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (35 preceding siblings ...)
  2010-11-01 15:21 ` [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
@ 2010-11-02 16:26 ` Alexander Graf
  2010-11-02 16:26 ` [Qemu-devel] [PATCH 38/40] xenner: integrate into build system Alexander Graf
                   ` (2 subsequent siblings)
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-02 16:26 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

For native Xen machines, we need to create a dummy env so qemu always has
something to work on.

When using xenner however, we don't want that dummy env but create real envs
instead.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xen_machine_pv.c |   22 ++++++++++++++--------
 1 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
index b94d6e9..6f2666d 100644
--- a/hw/xen_machine_pv.c
+++ b/hw/xen_machine_pv.c
@@ -30,16 +30,9 @@
 #include "xen_domainbuild.h"
 #include "blockdev.h"
 
-static void xen_init_pv(ram_addr_t ram_size,
-			const char *boot_device,
-			const char *kernel_filename,
-			const char *kernel_cmdline,
-			const char *initrd_filename,
-			const char *cpu_model)
+static void create_dummy_env(const char *cpu_model)
 {
     CPUState *env;
-    DriveInfo *dinfo;
-    int i;
 
     /* Initialize a dummy CPU */
     if (cpu_model == NULL) {
@@ -51,6 +44,17 @@ static void xen_init_pv(ram_addr_t ram_size,
     }
     env = cpu_init(cpu_model);
     env->halted = 1;
+}
+
+static void xen_init_pv(ram_addr_t ram_size,
+			const char *boot_device,
+			const char *kernel_filename,
+			const char *kernel_cmdline,
+			const char *initrd_filename,
+			const char *cpu_model)
+{
+    DriveInfo *dinfo;
+    int i;
 
     /* Initialize backend core & drivers */
     if (xen_be_init() != 0) {
@@ -60,9 +64,11 @@ static void xen_init_pv(ram_addr_t ram_size,
 
     switch (xen_mode) {
     case XEN_ATTACH:
+        create_dummy_env(cpu_model);
         /* nothing to do, xend handles everything */
         break;
     case XEN_CREATE:
+        create_dummy_env(cpu_model);
         if (xen_domain_build_pv(kernel_filename, initrd_filename,
                                 kernel_cmdline) < 0) {
             fprintf(stderr, "xen pv domain creation failed\n");
-- 
1.6.0.2


* [Qemu-devel] [PATCH 38/40] xenner: integrate into build system
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (36 preceding siblings ...)
  2010-11-02 16:26 ` [Qemu-devel] [PATCH 36/40] xen: only create dummy env when necessary Alexander Graf
@ 2010-11-02 16:26 ` Alexander Graf
  2010-11-02 16:26 ` [Qemu-devel] [PATCH 39/40] xenner: integrate into xen pv machine Alexander Graf
  2010-11-02 16:26 ` [Qemu-devel] [PATCH 40/40] xen: add sysrq support Alexander Graf
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-02 16:26 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Now that we have all the pieces in place, let's integrate into the build system.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 Makefile        |    6 ++++++
 Makefile.target |   16 ++++++++++++++--
 configure       |   21 +++++++++++++++++++--
 3 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index 3df202c..17b10bf 100644
--- a/Makefile
+++ b/Makefile
@@ -96,6 +96,12 @@ recurse-all: $(SUBDIR_RULES) $(ROMSUBDIR_RULES)
 
 audio/audio.o audio/fmodaudio.o: QEMU_CFLAGS += $(FMOD_CFLAGS)
 
+# xen backend driver support (shared by xen+xenner)
+ifneq ($(CONFIG_XEN)$(CONFIG_XENNER),)
+  obj-y += xen_backend.o xen_devconfig.o
+  obj-y += xen_console.o xenfb.o xen_disk.o xen_nic.o
+endif
+
 QEMU_CFLAGS+=$(CURL_CFLAGS)
 
 ui/cocoa.o: ui/cocoa.m
diff --git a/Makefile.target b/Makefile.target
index c48cbcc..9dc97b6 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -182,8 +182,20 @@ QEMU_CFLAGS += $(VNC_SASL_CFLAGS)
 QEMU_CFLAGS += $(VNC_JPEG_CFLAGS)
 QEMU_CFLAGS += $(VNC_PNG_CFLAGS)
 
-# xen backend driver support
-obj-$(CONFIG_XEN) += xen_machine_pv.o xen_domainbuild.o
+# xen bits shared by xenner+xen
+ifneq ($(CONFIG_XEN)$(CONFIG_XENNER),)
+  obj-y += xen_machine_pv.o xen_interfaces.o
+endif
+
+# xen support
+obj-$(CONFIG_XEN) += xen_domainbuild.o
+
+# xenner support
+obj-$(CONFIG_XENNER) += xenner_core.o xenner_emudev.o xenner_guest_store.o
+obj-$(CONFIG_XENNER) += xenner_pv.o
+obj-$(CONFIG_XENNER) += xenner_libxc_evtchn.o xenner_libxc_gnttab.o xenner_libxc_if.o
+obj-$(CONFIG_XENNER) += xenner_libxenstore.o
+obj-$(CONFIG_XENNER) += xenner_dom_builder.o
 
 # USB layer
 obj-$(CONFIG_USB_OHCI) += usb-ohci.o
diff --git a/configure b/configure
index f62c1fe..4d08220 100755
--- a/configure
+++ b/configure
@@ -285,6 +285,7 @@ vnc_jpeg=""
 vnc_png=""
 vnc_thread="no"
 xen=""
+xenner=""
 linux_aio=""
 attr=""
 vhost_net=""
@@ -632,6 +633,10 @@ for opt do
   ;;
   --enable-xen) xen="yes"
   ;;
+  --disable-xenner) xenner="no"
+  ;;
+  --enable-xenner) xenner="yes"
+  ;;
   --disable-brlapi) brlapi="no"
   ;;
   --enable-brlapi) brlapi="yes"
@@ -868,6 +873,8 @@ echo "                           (affects only QEMU, not qemu-img)"
 echo "  --enable-mixemu          enable mixer emulation"
 echo "  --disable-xen            disable xen backend driver support"
 echo "  --enable-xen             enable xen backend driver support"
+echo "  --disable-xenner         disable xenner (xen emulation) support"
+echo "  --enable-xenner          enable xenner (xen emulation) support"
 echo "  --disable-brlapi         disable BrlAPI"
 echo "  --enable-brlapi          enable BrlAPI"
 echo "  --disable-vnc-tls        disable TLS encryption for VNC server"
@@ -1134,9 +1141,10 @@ else
 fi
 
 ##########################################
-# xen probe
+# probe for xen libs and includes
+# note: xenner included here for now as it needs the headers too.
 
-if test "$xen" != "no" ; then
+if test "$xen" != "no" -o "$xenner" != "no"; then
   xen_libs="-lxenstore -lxenctrl -lxenguest"
   cat > $TMPC <<EOF
 #include <xenctrl.h>
@@ -1145,12 +1153,17 @@ int main(void) { xs_daemon_open(); xc_interface_open(); return 0; }
 EOF
   if compile_prog "" "$xen_libs" ; then
     xen=yes
+    xenner=yes
     libs_softmmu="$xen_libs $libs_softmmu"
   else
     if test "$xen" = "yes" ; then
       feature_not_found "xen"
     fi
+    if test "$xenner" = "yes" ; then
+      feature_not_found "xenner"
+    fi
     xen=no
+    xenner=no
   fi
 fi
 
@@ -2314,6 +2327,7 @@ if test -n "$sparc_cpu"; then
     echo "Target Sparc Arch $sparc_cpu"
 fi
 echo "xen support       $xen"
+echo "xenner support    $xenner"
 echo "brlapi support    $brlapi"
 echo "bluez  support    $bluez"
 echo "Documentation     $docs"
@@ -2886,6 +2900,9 @@ case "$target_arch2" in
     if test "$xen" = "yes" -a "$target_softmmu" = "yes" ; then
       echo "CONFIG_XEN=y" >> $config_target_mak
     fi
+    if test "$xenner" = "yes" -a "$target_softmmu" = "yes" ; then
+      echo "CONFIG_XENNER=y" >> $config_target_mak
+    fi
 esac
 case "$target_arch2" in
   i386|x86_64|ppcemb|ppc|ppc64|s390x)
-- 
1.6.0.2


* [Qemu-devel] [PATCH 39/40] xenner: integrate into xen pv machine
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (37 preceding siblings ...)
  2010-11-02 16:26 ` [Qemu-devel] [PATCH 38/40] xenner: integrate into build system Alexander Graf
@ 2010-11-02 16:26 ` Alexander Graf
  2010-11-02 16:26 ` [Qemu-devel] [PATCH 40/40] xen: add sysrq support Alexander Graf
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-02 16:26 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

For native Xen support, we already have a Xen PV machine description. It
doesn't know anything about xenner yet, though, so we need to teach it.

This patch adds support for xenner in the xen pv machine description.
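
As a rough illustration only (not code from this patch), the mode dispatch
can be sketched as a switch whose cases are compiled in or out. The
SKETCH_CONFIG_* macros, enum pv_path and pv_init_path() are hypothetical
stand-ins, and the enum values are an assumption; only the mode names come
from the patch.

```c
#include <assert.h>

/* Mode names follow the patch; the numeric values are an assumption. */
enum xen_mode { XEN_EMULATE, XEN_CREATE, XEN_ATTACH };

/* Hypothetical result codes naming the init path that was taken. */
enum pv_path { PV_NONE, PV_ATTACH, PV_CREATE, PV_EMULATE };

/* Stand-ins for the CONFIG_XEN / CONFIG_XENNER switches set by configure. */
#define SKETCH_CONFIG_XEN    1
#define SKETCH_CONFIG_XENNER 1

/* Report which init path a mode selects; a mode whose support was
 * compiled out falls through to the default case and yields PV_NONE. */
static enum pv_path pv_init_path(enum xen_mode mode)
{
    assert((int)mode >= 0 && (int)mode <= XEN_ATTACH);

    switch (mode) {
#if SKETCH_CONFIG_XEN
    case XEN_ATTACH:
        return PV_ATTACH;   /* xend already built the domain */
    case XEN_CREATE:
        return PV_CREATE;   /* build the domain through libxc */
#endif
#if SKETCH_CONFIG_XENNER
    case XEN_EMULATE:
        return PV_EMULATE;  /* hand over to the xenner domain builder */
#endif
    default:
        return PV_NONE;
    }
}
```

Setting either macro to 0 removes the matching cases at compile time, which
is why the real switch needs a default branch.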

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hw/xen_backend.h    |    1 +
 hw/xen_machine_pv.c |   16 ++++++++++++++--
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/hw/xen_backend.h b/hw/xen_backend.h
index f53a742..f6fc5f3 100644
--- a/hw/xen_backend.h
+++ b/hw/xen_backend.h
@@ -2,6 +2,7 @@
 #define QEMU_HW_XEN_BACKEND_H 1
 
 #include "xen_common.h"
+#include "xen_redirect.h"
 #include "sysemu.h"
 #include "net.h"
 #include "qdev.h"
diff --git a/hw/xen_machine_pv.c b/hw/xen_machine_pv.c
index 6f2666d..6611011 100644
--- a/hw/xen_machine_pv.c
+++ b/hw/xen_machine_pv.c
@@ -29,6 +29,7 @@
 #include "xen_backend.h"
 #include "xen_domainbuild.h"
 #include "blockdev.h"
+#include "xenner.h"
 
 static void create_dummy_env(const char *cpu_model)
 {
@@ -57,12 +58,14 @@ static void xen_init_pv(ram_addr_t ram_size,
     int i;
 
     /* Initialize backend core & drivers */
+    xen_interfaces_init();
     if (xen_be_init() != 0) {
         fprintf(stderr, "%s: xen backend core setup failed\n", __FUNCTION__);
         exit(1);
     }
 
     switch (xen_mode) {
+#ifdef CONFIG_XEN
     case XEN_ATTACH:
         create_dummy_env(cpu_model);
         /* nothing to do, xend handles everything */
@@ -75,9 +78,18 @@ static void xen_init_pv(ram_addr_t ram_size,
             exit(1);
         }
         break;
+#endif
+#ifdef CONFIG_XENNER
     case XEN_EMULATE:
-        fprintf(stderr, "xen emulation not implemented (yet)\n");
-        exit(1);
+        if (xenner_init_pv(kernel_filename, kernel_cmdline,
+                           initrd_filename) < 0) {
+            fprintf(stderr, "xenner pv domain creation failed\n");
+            exit(1);
+        }
+        break;
+#endif
+    default:
+        /* not reached, xen_interfaces_init() catches this already */
         break;
     }
 
-- 
1.6.0.2


* [Qemu-devel] [PATCH 40/40] xen: add sysrq support
  2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
                   ` (38 preceding siblings ...)
  2010-11-02 16:26 ` [Qemu-devel] [PATCH 39/40] xenner: integrate into xen pv machine Alexander Graf
@ 2010-11-02 16:26 ` Alexander Graf
  39 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-02 16:26 UTC (permalink / raw)
  To: qemu-devel Developers; +Cc: Gerd Hoffmann

Sending sys-requests to a Xen guest works differently from the usual
keyboard-based approach. On Xen, we need to write a xenstore node from which
the guest picks up the sysrq information.

This patch implements said interface by introducing a new human monitor
command to use it. It's purely optional.
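
For illustration only, here is a minimal sketch of how the node name is
composed. It assumes the conventional /local/domain/<domid> domain path
and that a request is a single key character. Both helper functions are
hypothetical and are not part of this patch or of libxenstore.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: compose the xenstore key the guest reads sysrq
 * requests from. Assumes domain paths look like /local/domain/<domid>.
 * Returns 0 on success, -1 if the buffer is too small. */
static int sysrq_node_path(char *buf, size_t len, unsigned int domid)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "/local/domain/%u/control/sysrq", domid);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical check: treat a request as valid only if it is exactly
 * one key character, as in "xen_sysrq s". */
static int sysrq_key_valid(const char *key)
{
    return key != NULL && strlen(key) == 1;
}
```

The patch itself gets the domain path from xs_get_domain_path() instead of
formatting it by hand; the sketch only shows which node ends up being
written.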

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 hmp-commands.hx      |   24 ++++++++++++++++++++++++
 hw/xen.h             |    2 ++
 hw/xen_domainbuild.c |    8 ++++++++
 monitor.c            |    8 ++++++++
 4 files changed, 42 insertions(+), 0 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 81999aa..40fdd00 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -436,6 +436,30 @@ This command is useful to send keys that your graphical user interface
 intercepts at low level, such as @code{ctrl-alt-f1} in X Window.
 ETEXI
 
+#if defined(CONFIG_XEN) || defined(CONFIG_XENNER)
+    {
+        .name       = "xen_sysrq",
+        .args_type  = "string:s",
+        .params     = "key",
+        .help       = "send sysrq to the Xen VM",
+        .mhandler.cmd = do_xen_sysrq,
+    },
+#endif
+
+STEXI
+@item xen_sysrq @var{key}
+@findex xen_sysrq
+
+Send @var{key} as sys-request to the Xen VM. This is the equivalent of
+alt-print-@var{key} in normal Linux guests. Example:
+@example
+xen_sysrq s
+@end example
+
+Please keep in mind that this is the only way of sending sys-requests to
+Xen PV machines. The usual alt-print-@var{key} way does not work.
+ETEXI
+
     {
         .name       = "system_reset",
         .args_type  = "",
diff --git a/hw/xen.h b/hw/xen.h
index 780dcf7..f2ac576 100644
--- a/hw/xen.h
+++ b/hw/xen.h
@@ -18,4 +18,6 @@ enum xen_mode {
 extern uint32_t xen_domid;
 extern enum xen_mode xen_mode;
 
+void xen_sysrq(const char *sysrq);
+
 #endif /* QEMU_HW_XEN_H */
diff --git a/hw/xen_domainbuild.c b/hw/xen_domainbuild.c
index 7f1fd66..49962db 100644
--- a/hw/xen_domainbuild.c
+++ b/hw/xen_domainbuild.c
@@ -211,6 +211,14 @@ static int xen_domain_watcher(void)
     _exit(0);
 }
 
+void xen_sysrq(const char *sysrq)
+{
+    char *dom = xs_get_domain_path(xenstore, xen_domid);
+
+    xenstore_write_str(dom, "control/sysrq", sysrq);
+    free(dom); /* xs_get_domain_path() returns malloc'ed memory */
+}
+
 /* normal cleanup */
 static void xen_domain_cleanup(void)
 {
diff --git a/monitor.c b/monitor.c
index 61607c5..99aae01 100644
--- a/monitor.c
+++ b/monitor.c
@@ -30,6 +30,7 @@
 #include "hw/pci.h"
 #include "hw/watchdog.h"
 #include "hw/loader.h"
+#include "hw/xen.h"
 #include "gdbstub.h"
 #include "net.h"
 #include "net/slirp.h"
@@ -1639,6 +1640,13 @@ static void release_keys(void *opaque)
     }
 }
 
+#if defined(CONFIG_XEN) || defined(CONFIG_XENNER)
+static void do_xen_sysrq(Monitor *mon, const QDict *qdict)
+{
+    xen_sysrq(qdict_get_str(qdict, "string"));
+}
+#endif
+
 static void do_sendkey(Monitor *mon, const QDict *qdict)
 {
     char keyname_buf[16];
-- 
1.6.0.2


* [Qemu-devel] Re: [PATCH 35/40] xenner: Domain Builder
  2010-11-02 15:51       ` Paolo Bonzini
@ 2010-11-02 16:28         ` Alexander Graf
  0 siblings, 0 replies; 96+ messages in thread
From: Alexander Graf @ 2010-11-02 16:28 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel Developers, Gerd Hoffmann


On 02.11.2010, at 11:51, Paolo Bonzini wrote:

> On 11/02/2010 04:36 PM, Alexander Graf wrote:
>> 
>> On 02.11.2010, at 06:09, Paolo Bonzini wrote:
>> 
>>> On 11/01/2010 04:01 PM, Alexander Graf wrote:
>>>> The traditional Xen way of loading a kernel and initial data structures into the
>>>> guest's memory is by calling libxc functions. We want to be able to run without
>>>> libxc dependencies though, so we need an alternative.
>>>> 
>>>> This patch implements a full domain builder for xenner. It loads the guest
>>>> kernel, sets up basic memory layouts and saves everything off for the pv
>>>> communication device between xenner and qemu.
>>>> 
>>>> Signed-off-by: Alexander Graf<agraf@suse.de>
>>> 
>>> What about 36 to 40? :)
>> 
>> What about them? :) Didn't they make it to the ML?
> 
> http://patchwork.ozlabs.org/project/qemu-devel/list/ seems to agree with me they didn't. :)

Alright, I re-sent 36-40 with the (hopefully) correct reply-to header :). The binary blob mail just bounced though; apparently it's too big. So for that, please grab it from my git tree :).


Alex


* Re: [Qemu-devel] Re: [PATCH 28/40] xenner: libxc emu: evtchn
  2010-11-02 15:48                       ` Alexander Graf
@ 2010-11-02 19:20                         ` Stefano Stabellini
  0 siblings, 0 replies; 96+ messages in thread
From: Stefano Stabellini @ 2010-11-02 19:20 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Paolo Bonzini, Gerd Hoffmann, qemu-devel Developers, Stefano Stabellini

On Tue, 2 Nov 2010, Alexander Graf wrote:
> This is getting confusing :). There are multiple ways of spawning a Xen PV instance I'm aware of:
> 
> 1) Xen PV context
> 2) Xen PV context in SVM/VMX container, maintained by Xen
> 3) Xenner on TCG/KVM
> 4) Xenner on Xen HVM
> 
> For 1 and 2 the way to go is definitely to reuse the xen infrastructure. For 3 I'm very reluctant to require dependencies. One of qemu's strong points is that it does not have too many dependencies on other code. If there are strong arguments for it, however, I will gladly change my position :).
> 
> For 4, however, I haven't fully made up my mind on whether it's useful to people (if you say it is, I'm more than glad to get this rolling!) or on what the best way to implement it would be.
> 

I am guessing that with 2) you are referring to Linux PV on HVM guests.
If so, 2) and 4) are very different: a Linux PV on HVM guest is a normal
Linux kernel that would boot just fine on native, but is also able to
enable some Xen PV interfaces when running in a Xen HVM domain.
Linux PV on HVM guests are new, and support has been in the kernel for
less than a year.
However, Linux PV guests have been around for a long time and
traditionally are unable to boot on native or in a Xen HVM container.
So 4) would allow these kernels to boot unmodified in a Xen HVM
container, which is why it would be useful.


> So I suppose your suggestion is to use the xen infrastructure for case 4? That might work out. Fortunately, all the detection of which backend we use happens at runtime. Since in that case Xen does own the guest's memory, we might even be safe using its memory mapping functionality. Maybe.
> 

Yes. Case 4) is just a normal Xen HVM domain from the Xen point of view,
so it needs all the rest of the Xen infrastructure. There is no need to
replace xenstored or libxc when the real xenstored and libxc are
available.
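
To make the runtime selection concrete, here is a minimal sketch that
assumes a table of function pointers chosen once at startup. All names in
it (struct xen_ops, select_xen_ops() and so on) are hypothetical and are
not taken from the patch series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interface table: one slot per libxc-style entry point. */
struct xen_ops {
    const char *name;
    int (*evtchn_notify)(int port);
};

/* Stand-ins: a real table would call into libxc or into the emulation. */
static int native_notify(int port)   { return port; }
static int emulated_notify(int port) { return port + 1000; }

static const struct xen_ops native_ops   = { "xen",    native_notify };
static const struct xen_ops emulated_ops = { "xenner", emulated_notify };

/* Pick the table once at startup, depending on whether real Xen exists. */
static const struct xen_ops *select_xen_ops(int have_real_xen)
{
    return have_real_xen ? &native_ops : &emulated_ops;
}

/* Callers never name a backend; they only go through the chosen table. */
static int xen_notify(const struct xen_ops *ops, int port)
{
    assert(ops != NULL && ops->evtchn_notify != NULL);
    return ops->evtchn_notify(port);
}
```

With a layout like this, case 4) would simply select the native table, and
the emulated xenstored and libxc would never be touched.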


> I'm very much looking forward to talking to you about this in Boston. Are you around already?
> 

Yep!


Thread overview: 96+ messages
2010-11-01 15:01 [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 01/40] elf: Move translate_fn to helper struct Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 02/40] elf: Add notes implementation Alexander Graf
2010-11-01 18:29   ` Blue Swirl
2010-11-01 18:42     ` Stefan Weil
2010-11-01 19:51       ` Alexander Graf
2010-11-01 20:19         ` Stefan Weil
2010-11-01 21:17           ` Alexander Graf
2010-11-01 21:28             ` [Qemu-devel] " Paolo Bonzini
2010-11-01 21:31             ` [Qemu-devel] " Stefan Weil
2010-11-02 10:17             ` Michael Matz
2010-11-01 18:41   ` [Qemu-devel] " Paolo Bonzini
2010-11-01 18:52     ` Alexander Graf
2010-11-01 19:43       ` Paolo Bonzini
2010-11-01 19:48         ` Alexander Graf
2010-11-01 21:23           ` Paolo Bonzini
2010-11-01 15:01 ` [Qemu-devel] [PATCH 03/40] elf: add header notification Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 04/40] elf: add section analyzer Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 05/40] xen-disk: disable aio Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 06/40] qdev-ify: xen backends Alexander Graf
2010-11-02 10:08   ` Markus Armbruster
2010-11-02 10:43     ` Gerd Hoffmann
2010-11-02 13:26       ` Markus Armbruster
2010-11-01 15:01 ` [Qemu-devel] [PATCH 07/40] xenner: kernel: 32 bit files Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 08/40] xenner: kernel: 64-bit files Alexander Graf
2010-11-01 15:44   ` Anthony Liguori
2010-11-01 15:47     ` Alexander Graf
2010-11-01 15:59       ` Anthony Liguori
2010-11-01 19:00       ` Blue Swirl
2010-11-01 19:02         ` Anthony Liguori
2010-11-01 19:05           ` Alexander Graf
2010-11-01 19:23             ` Blue Swirl
2010-11-01 19:37             ` Anthony Liguori
2010-11-01 15:01 ` [Qemu-devel] [PATCH 09/40] xenner: kernel: Global data Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 10/40] xenner: kernel: Hypercall handler (i386) Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 11/40] xenner: kernel: Hypercall handler (x86_64) Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 12/40] xenner: kernel: Hypercall handler (generic) Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 13/40] xenner: kernel: Headers Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 14/40] xenner: kernel: Instruction emulator Alexander Graf
2010-11-01 15:41   ` malc
2010-11-01 18:46   ` [Qemu-devel] " Paolo Bonzini
2010-11-01 15:01 ` [Qemu-devel] [PATCH 15/40] xenner: kernel: lapic code Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 16/40] xenner: kernel: Main (i386) Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 17/40] xenner: kernel: Main (x86_64) Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 18/40] xenner: kernel: Main Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 19/40] xenner: kernel: Makefile Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 20/40] xenner: kernel: mmu support for 32-bit PAE Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 21/40] xenner: kernel: mmu support for 32-bit normal Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 22/40] xenner: kernel: mmu support for 64-bit Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 23/40] xenner: kernel: generic MM functionality Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 24/40] xenner: kernel: printk Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 25/40] xenner: kernel: KVM PV code Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 26/40] xenner: kernel: xen-names Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 27/40] xenner: add xc_dom.h Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 28/40] xenner: libxc emu: evtchn Alexander Graf
2010-11-01 15:45   ` Anthony Liguori
2010-11-01 15:49     ` Alexander Graf
2010-11-01 16:01       ` Anthony Liguori
2010-11-01 16:07         ` Alexander Graf
2010-11-01 16:14           ` Anthony Liguori
2010-11-01 16:15             ` Alexander Graf
2010-11-01 19:39         ` [Qemu-devel] " Paolo Bonzini
2010-11-01 19:41           ` Anthony Liguori
2010-11-01 19:47             ` Alexander Graf
2010-11-01 20:32               ` Anthony Liguori
2010-11-01 21:47                 ` Paolo Bonzini
2010-11-01 22:00                   ` Anthony Liguori
2010-11-01 22:08                     ` Paolo Bonzini
2010-11-01 22:29                       ` Anthony Liguori
2010-11-02  4:33                 ` Stefano Stabellini
2010-11-02 10:06                   ` Paolo Bonzini
2010-11-02 10:31                     ` Gerd Hoffmann
2010-11-02 10:38                       ` Paolo Bonzini
2010-11-02 13:55                     ` Stefano Stabellini
2010-11-02 15:48                       ` Alexander Graf
2010-11-02 19:20                         ` Stefano Stabellini
2010-11-01 15:01 ` [Qemu-devel] [PATCH 29/40] xenner: libxc emu: grant tables Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 30/40] xenner: libxc emu: memory mapping Alexander Graf
2010-11-01 15:12   ` malc
2010-11-01 15:15     ` Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 31/40] xenner: libxc emu: xenstore Alexander Graf
2010-11-01 18:36   ` Blue Swirl
2010-11-01 15:01 ` [Qemu-devel] [PATCH 32/40] xenner: emudev Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 33/40] xenner: core Alexander Graf
2010-11-01 15:13   ` malc
2010-11-01 15:01 ` [Qemu-devel] [PATCH 34/40] xenner: PV machine Alexander Graf
2010-11-01 15:01 ` [Qemu-devel] [PATCH 35/40] xenner: Domain Builder Alexander Graf
2010-11-02 10:09   ` [Qemu-devel] " Paolo Bonzini
2010-11-02 15:36     ` Alexander Graf
2010-11-02 15:51       ` Paolo Bonzini
2010-11-02 16:28         ` Alexander Graf
2010-11-01 15:21 ` [Qemu-devel] [PATCH 00/40] RFC: Xenner Alexander Graf
2010-11-02 16:26 ` [Qemu-devel] [PATCH 36/40] xen: only create dummy env when necessary Alexander Graf
2010-11-02 16:26 ` [Qemu-devel] [PATCH 38/40] xenner: integrate into build system Alexander Graf
2010-11-02 16:26 ` [Qemu-devel] [PATCH 39/40] xenner: integrate into xen pv machine Alexander Graf
2010-11-02 16:26 ` [Qemu-devel] [PATCH 40/40] xen: add sysrq support Alexander Graf
