* KVM ivshmem enquiry
@ 2010-02-28 20:20 Khaled Ibrahim
  2010-03-01 18:17 ` Cam Macdonell
  0 siblings, 1 reply; 10+ messages in thread
From: Khaled Ibrahim @ 2010-02-28 20:20 UTC (permalink / raw)
  To: kvm; +Cc: cam


Cam, I am interested in the shared memory support you developed for
 KVM, but the whole process is not very clear to me. I patched the kernel in the
 guest OSes and used the sample code found at http://www.mail-archive.com/kvm@vger.kernel.org/msg13328.html, but the applications fail in mmap.

 I have not done anything special when booting the guest OS, and I am using vanilla qemu-kvm. I guess I am missing some initialization step.

 Do I need to patch qemu/kvm as well? Where can I find the most recent ivshmem patches for the guest kernel, and for kvm if applicable? What is the proper way to start the guest OS to enable shared memory?

 If you have written documentation or a complete working example, please point me to it.


 I appreciate your help.


 Thank you,

 -Khaled

 		 	   		  

* Re: KVM ivshmem enquiry
  2010-02-28 20:20 KVM ivshmem enquiry Khaled Ibrahim
@ 2010-03-01 18:17 ` Cam Macdonell
  2010-03-01 18:19   ` [PATCH] Support an inter-vm shared memory device that maps a shared-memory object Cam Macdonell
  2010-03-01 18:20   ` [PATCH] Driver to support shared memory device with interrupts Cam Macdonell
  0 siblings, 2 replies; 10+ messages in thread
From: Cam Macdonell @ 2010-03-01 18:17 UTC (permalink / raw)
  To: Khaled Ibrahim; +Cc: kvm

Hi Khaled,

On Sun, Feb 28, 2010 at 1:20 PM, Khaled Ibrahim <kzm98@hotmail.com> wrote:
>
> Cam, I am interested in the shared memory support you developed for
>  KVM, but the whole process is not very clear to me. I patched the kernel in the
>  guest OSes and used the sample code found at http://www.mail-archive.com/kvm@vger.kernel.org/msg13328.html, but the applications fail in mmap.
>
>  I have not done anything special when booting the guest OS, and I am using vanilla qemu-kvm. I guess I am missing some initialization step.
>
>  Do I need to patch qemu/kvm as well? Where can I find the most recent ivshmem patches for the guest kernel, and for kvm if applicable? What is the proper way to start the guest OS to enable shared memory?

You need two patches, one for qemu and one for the Linux kernel.  The
Linux kernel patch is just a driver, so you can compile it
separately as a module (you don't need to recompile the whole kernel).  I will
send both patches shortly: one for qemu-kvm and one for the
driver.

Once you have patched and compiled qemu-kvm, you need to add a
command-line argument

-ivshmem <size>,<name>

where <size> is the size in MB of the POSIX shared memory object you want to
create and <name> is the name to use for it (not a full path).  After
you start kvm with this command-line argument, a file of the size you
specify will be created at /dev/shm/<name>.
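
To check from the host that the object exists and is usable, a small program
can open it by name and map it.  What follows is only a minimal sketch and is
not part of the patches: map_shm_object is a hypothetical helper, and the name
passed to it is whatever was given as <name>.  It assumes a Linux host where
shm_open() resolves names under /dev/shm.  POSIX recommends a leading '/' in
the name, and older glibc versions need -lrt at link time.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map an existing POSIX shared memory object read/write.
 * name is the object name without the /dev/shm prefix.  On success, returns
 * the mapping and stores its size in *len; on failure, returns NULL. */
void *map_shm_object(const char *name, size_t *len)
{
    struct stat st;
    void *p;
    int fd;

    assert(name != NULL && len != NULL);

    fd = shm_open(name, O_RDWR, 0);   /* no O_CREAT: the object must exist */
    if (fd < 0)
        return NULL;

    /* The object's size tells us how much to map. */
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
             MAP_SHARED, fd, 0);
    close(fd);   /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return p;
}
```

For instance, if the guest was started with "-ivshmem 16,myshm" (an
illustrative name), calling map_shm_object("/myshm", &len) on the host should
map the same 16 MB region that the guest sees through the device.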


>  If you have developed a documentation or a complete working example, please refer me to it.
>

Documentation is scarce at this point as the implementation is still
under development, but feel free to ask any questions you have via
email.

I do have a set of sample code and OS init scripts (to create the
device file in the guest) here:

http://www.gitorious.org/nahanni/

Cheers,
Cam

>
>  I appreciate your help.
>
>
>  Thank you,
>
>  -Khaled

* [PATCH] Support an inter-vm shared memory device that maps a shared-memory object
  2010-03-01 18:17 ` Cam Macdonell
@ 2010-03-01 18:19   ` Cam Macdonell
  2010-03-03  7:06     ` IVSHMEM and limits on shared memory Khaled Ibrahim
  2010-03-01 18:20   ` [PATCH] Driver to support shared memory device with interrupts Cam Macdonell
  1 sibling, 1 reply; 10+ messages in thread
From: Cam Macdonell @ 2010-03-01 18:19 UTC (permalink / raw)
  To: kzm98; +Cc: kvm, Cam Macdonell

This device now creates a qemu character device and sends 1-byte messages to
trigger interrupts.  Interrupts are triggered by writing to the "Doorbell" register
on the shared memory PCI device.  The lower 8 bits of the value written to this
register are sent as the 1-byte message, so interrupts with different meanings can
be supported.
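
As an illustration of the Doorbell value layout, here is a small sketch of
how a value could be packed and unpacked.  The helper names are hypothetical.
The layout (destination VM in the bits above the low byte, message in the low
8 bits) follows what ivshmem_io_writew() in this patch does with "val >> 8"
and "val & 0xff", and 255 is the broadcast destination (BROADCAST_VAL).

```c
#include <assert.h>
#include <stdint.h>

/* Same value as BROADCAST_VAL in hw/ivshmem.c: (1 << 8) - 1 */
#define DOORBELL_BROADCAST 0xffu

/* Pack a destination VM ID and a 1-byte message into a 16-bit Doorbell
 * value: destination in bits 15..8, message in bits 7..0. */
static inline uint16_t doorbell_encode(unsigned dest, unsigned msg)
{
    assert(dest <= 0xffu && msg <= 0xffu);
    return (uint16_t)((dest << 8) | msg);
}

/* Destination VM ID, as the device computes it (val >> 8). */
static inline unsigned doorbell_dest(uint16_t val)
{
    return (unsigned)(val >> 8);
}

/* Message byte that the destination receives (val & 0xff). */
static inline unsigned doorbell_msg(uint16_t val)
{
    return val & 0xffu;
}
```

With these helpers, doorbell_encode(3, 0x12) gives 0x0312, and
doorbell_encode(DOORBELL_BROADCAST, 1) addresses message 1 to every live VM.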

Interrupts are supported between multiple VMs by using a shared memory server:

-ivshmem <size in MB>,[unix:<path>][<shm object>]

Interrupts can also be used between host and guest by implementing a
listener on the host that talks to the shared memory server.  The shared memory
server passes file descriptors for the shared memory object and eventfds (our
interrupt mechanism) to the respective qemu instances.
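
For readers unfamiliar with eventfds, the sketch below shows the basic
signalling pattern on Linux.  It is illustrative only: notify_peer and
wait_for_notification are hypothetical names, not functions from this patch.
As in the patch, the sender writes an 8-byte value to the descriptor.  Note
that an eventfd is a counter, so values written before the receiver reads are
summed rather than queued.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Add code to the eventfd counter; a nonzero code wakes a blocked reader.
 * Returns 0 on success, -1 on error. */
static inline int notify_peer(int efd, uint64_t code)
{
    return write(efd, &code, sizeof(code)) == (ssize_t)sizeof(code) ? 0 : -1;
}

/* Block until the counter is nonzero, then fetch it and reset it to zero.
 * If several notifications arrived in the meantime, *code holds their sum.
 * Returns 0 on success, -1 on error. */
static inline int wait_for_notification(int efd, uint64_t *code)
{
    assert(code != NULL);
    return read(efd, code, sizeof(*code)) == (ssize_t)sizeof(*code) ? 0 : -1;
}
```

A descriptor for local experiments can be created with eventfd(0, 0).  In the
setup described above the descriptors would instead come from the shared
memory server.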

Sample programs and init scripts are available in a git repo here:

www.gitorious.org/nahanni
---
 Makefile.target |    3 +
 hw/ivshmem.c    |  678 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/pc.c         |    6 +
 hw/pc.h         |    3 +
 qemu-char.c     |    7 +
 qemu-char.h     |    3 +
 qemu-options.hx |   12 +
 sysemu.h        |    8 +
 vl.c            |   13 +
 9 files changed, 733 insertions(+), 0 deletions(-)
 create mode 100644 hw/ivshmem.c

diff --git a/Makefile.target b/Makefile.target
index 40ff6d5..3df9006 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -216,6 +216,9 @@ obj-y += pcnet.o
 obj-y += rtl8139.o
 obj-y += e1000.o
 
+# Inter-VM PCI shared memory
+obj-y += ivshmem.o
+
 # Hardware support
 obj-i386-y = ide/core.o ide/qdev.o ide/isa.o ide/pci.o ide/piix.o
 obj-i386-y += pckbd.o $(sound-obj-y) dma.o
diff --git a/hw/ivshmem.c b/hw/ivshmem.c
new file mode 100644
index 0000000..6bf6e0a
--- /dev/null
+++ b/hw/ivshmem.c
@@ -0,0 +1,678 @@
+/*
+ * Inter-VM Shared Memory PCI device.
+ *
+ * Author:
+ *      Cam Macdonell <cam@cs.ualberta.ca>
+ *
+ * Based On: cirrus_vga.c and rtl8139.c
+ *
+ * This code is licensed under the GNU GPL v2.
+ */
+
+#include "hw.h"
+#include "console.h"
+#include "pc.h"
+#include "pci.h"
+#include "sysemu.h"
+
+#include "qemu-common.h"
+#include <sys/mman.h>
+#include <sys/socket.h>
+
+#define PCI_COMMAND_IOACCESS                0x0001
+#define PCI_COMMAND_MEMACCESS               0x0002
+#define PCI_COMMAND_BUSMASTER               0x0004
+
+#define DEBUG_IVSHMEM
+#define MAX_EVENT_FDS 16
+
+#ifdef DEBUG_IVSHMEM
+#define IVSHMEM_DPRINTF(fmt, args...)        \
+    do {printf("IVSHMEM: " fmt, ##args); } while (0)
+#else
+#define IVSHMEM_DPRINTF(fmt, args...)
+#endif
+
+#define BROADCAST_VAL ((1 << 8) - 1)
+
+typedef struct IVShmemState {
+    uint16_t intrmask;
+    uint16_t intrstatus;
+    uint16_t doorbell;
+    uint8_t *ivshmem_ptr;
+    unsigned long ivshmem_offset;
+    unsigned int ivshmem_size;
+    unsigned long bios_offset;
+    unsigned int bios_size;
+    target_phys_addr_t base_ctrl;
+    int it_shift;
+    PCIDevice *pci_dev;
+    CharDriverState * chr;
+    CharDriverState * eventfd_chr;
+    unsigned long map_addr;
+    unsigned long map_end;
+    int ivshmem_mmio_io_addr;
+    int eventfds[16]; /* for now we have a limit of 16 inter-connected guests */
+    int eventfd_posn;
+    int shm_fd; /* shared memory file descriptor */
+    uint16_t eventfd_bitvec;
+    int num_eventfds;
+} IVShmemState;
+
+typedef struct PCI_IVShmemState {
+    PCIDevice dev;
+    IVShmemState ivshmem_state;
+} PCI_IVShmemState;
+
+typedef struct IVShmemDesc {
+    char name[1024];
+    char * chrdev;
+    int size;
+} IVShmemDesc;
+
+/* registers for the Inter-VM shared memory device */
+enum ivshmem_registers {
+    IntrMask = 0,
+    IntrStatus = 16,
+    Doorbell = 32,
+    IVPosition = 48,
+    IVLiveList = 64,
+    MemSize = 80
+};
+
+static int num_ivshmem_devices = 0;
+static IVShmemDesc ivshmem_desc;
+
+static void ivshmem_map(PCIDevice *pci_dev, int region_num,
+                    pcibus_t addr, pcibus_t size, int type)
+{
+    PCI_IVShmemState *d = (PCI_IVShmemState *)pci_dev;
+    IVShmemState *s = &d->ivshmem_state;
+
+    IVSHMEM_DPRINTF("addr = %u size = %u\n", (uint32_t)addr, (uint32_t)size);
+    cpu_register_physical_memory(addr, s->ivshmem_size, s->ivshmem_offset);
+
+}
+
+void ivshmem_init(const char * optarg) {
+
+    char * temp;
+    char * ivshmem_sz;
+    int size;
+
+    num_ivshmem_devices++;
+
+    /* currently we only support 1 device */
+    if (num_ivshmem_devices > MAX_IVSHMEM_DEVICES) {
+        return;
+    }
+
+    temp = strdup(optarg);
+/*
+    snprintf(ivshmem_desc.name, 1024, "/%s", strsep(&temp,","));
+*/
+    ivshmem_sz=strsep(&temp,",");
+
+    if (ivshmem_sz != NULL) {
+        size = atol(ivshmem_sz);
+    } else {
+        size = -1;
+    }
+
+    ivshmem_desc.chrdev = strsep(&temp,"\0");
+
+    if ( size == -1) {
+        ivshmem_desc.size = TARGET_PAGE_SIZE;
+    } else {
+        ivshmem_desc.size = size*1024*1024;
+    }
+    IVSHMEM_DPRINTF("optarg is %s, name is %s, size is %d, chrdev is %s\n",
+                                        optarg, ivshmem_desc.name,
+                                        ivshmem_desc.size, ivshmem_desc.chrdev);
+}
+
+int ivshmem_get_size(void) {
+    return ivshmem_desc.size;
+}
+
+static void broadcast_eventfds(int val, IVShmemState *s)
+{
+
+    int dest = val >> 4;
+    u_int64_t writelong = val & 0xff;
+
+    for (dest = 1; dest < s->num_eventfds; dest++) {
+
+        if (s->eventfds[dest] != -1) {
+            IVSHMEM_DPRINTF("Writing %ld to VM %d\n", writelong, dest);
+            if (write(s->eventfds[dest], &(writelong), 8) != 8)
+                IVSHMEM_DPRINTF("error writing to eventfd\n");
+        }
+
+    }
+
+}
+
+/* accessing registers - based on rtl8139 */
+static void ivshmem_update_irq(IVShmemState *s)
+{
+    int isr;
+    isr = (s->intrstatus & s->intrmask) & 0xffff;
+
+    /* don't print ISR resets */
+    if (isr) {
+        IVSHMEM_DPRINTF("Set IRQ to %d (%04x %04x)\n",
+           isr ? 1 : 0, s->intrstatus, s->intrmask);
+    }
+
+    qemu_set_irq(s->pci_dev->irq[0], (isr != 0));
+}
+
+static void ivshmem_mmio_map(PCIDevice *pci_dev, int region_num,
+                       pcibus_t addr, pcibus_t size, int type)
+{
+    PCI_IVShmemState *d = (PCI_IVShmemState *)pci_dev;
+    IVShmemState *s = &d->ivshmem_state;
+
+    cpu_register_physical_memory(addr + 0, 0x100, s->ivshmem_mmio_io_addr);
+}
+
+static void ivshmem_IntrMask_write(IVShmemState *s, uint32_t val)
+{
+    IVSHMEM_DPRINTF("IntrMask write(w) val = 0x%04x\n", val);
+
+    s->intrmask = val;
+
+    ivshmem_update_irq(s);
+}
+
+static uint32_t ivshmem_IntrMask_read(IVShmemState *s)
+{
+    uint32_t ret = s->intrmask;
+
+    IVSHMEM_DPRINTF("intrmask read(w) val = 0x%04x\n", ret);
+
+    return ret;
+}
+
+static void ivshmem_IntrStatus_write(IVShmemState *s, uint32_t val)
+{
+    IVSHMEM_DPRINTF("IntrStatus write(w) val = 0x%04x\n", val);
+
+    s->intrstatus = val;
+
+    ivshmem_update_irq(s);
+    return;
+}
+
+static uint32_t ivshmem_IntrStatus_read(IVShmemState *s)
+{
+    uint32_t ret = s->intrstatus;
+
+    /* reading ISR clears all interrupts */
+    s->intrstatus = 0;
+
+    ivshmem_update_irq(s);
+
+    return ret;
+}
+
+static void ivshmem_io_writew(void *opaque, uint8_t addr, uint32_t val)
+{
+    IVShmemState *s = opaque;
+
+    // 32-bits are written to the address
+    int dest = val >> 8;
+    u_int64_t writelong = val & 0xff;
+
+    IVSHMEM_DPRINTF("writing 0x%x to 0x%lx\n", addr, (unsigned long) opaque);
+
+    addr &= 0xfe;
+
+    switch (addr)
+    {
+        case IntrMask:
+            ivshmem_IntrMask_write(s, val);
+            break;
+
+        case IntrStatus:
+            ivshmem_IntrStatus_write(s, val);
+            break;
+
+        case Doorbell:
+            IVSHMEM_DPRINTF("val is %d\n", val);
+
+            if (dest == BROADCAST_VAL) {
+                broadcast_eventfds(val, s);
+            } else if (dest <= s->num_eventfds) {
+                IVSHMEM_DPRINTF("Writing %ld to VM %d\n", writelong, dest);
+                if (write(s->eventfds[dest], &(writelong), 8) != 8)
+                    IVSHMEM_DPRINTF("error writing to eventfd\n");
+            } else {
+                IVSHMEM_DPRINTF("Invalid %ld to VM %d\n", writelong, dest);
+            }
+
+            break;
+       default:
+            IVSHMEM_DPRINTF("why are we writing 0x%x\n", addr);
+    }
+}
+
+static void ivshmem_io_writel(void *opaque, uint8_t addr, uint32_t val)
+{
+    IVSHMEM_DPRINTF("We shouldn't be writing longs\n");
+}
+
+static void ivshmem_io_writeb(void *opaque, uint8_t addr, uint32_t val)
+{
+    IVSHMEM_DPRINTF("We shouldn't be writing bytes\n");
+}
+
+static uint32_t ivshmem_io_readw(void *opaque, uint8_t addr)
+{
+
+    IVShmemState *s = opaque;
+    uint32_t ret;
+
+    switch (addr)
+    {
+        case IntrMask:
+            ret = ivshmem_IntrMask_read(s);
+            break;
+        case IntrStatus:
+            ret = ivshmem_IntrStatus_read(s);
+            break;
+
+        case IVPosition:
+            /* return my id in the ivshmem list */
+            ret = s->eventfd_posn;
+            break;
+        case IVLiveList:
+            /* return the list of live VMs id for ivshmem */
+            ret = s->eventfd_bitvec;
+            break;
+
+        default:
+            IVSHMEM_DPRINTF("why are we reading 0x%x\n", addr);
+            ret = 0;
+    }
+
+    return ret;
+}
+
+static uint32_t ivshmem_io_readl(void *opaque, uint8_t addr)
+{
+    IVSHMEM_DPRINTF("We shouldn't be reading longs\n");
+    return 0;
+}
+
+static uint32_t ivshmem_io_readb(void *opaque, uint8_t addr)
+{
+    IVSHMEM_DPRINTF("We shouldn't be reading bytes\n");
+
+    return 0;
+}
+
+static void ivshmem_mmio_writeb(void *opaque,
+                                target_phys_addr_t addr, uint32_t val)
+{
+    ivshmem_io_writeb(opaque, addr & 0xFF, val);
+}
+
+static void ivshmem_mmio_writew(void *opaque,
+                                target_phys_addr_t addr, uint32_t val)
+{
+    ivshmem_io_writew(opaque, addr & 0xFF, val);
+}
+
+static void ivshmem_mmio_writel(void *opaque,
+                                target_phys_addr_t addr, uint32_t val)
+{
+    ivshmem_io_writel(opaque, addr & 0xFF, val);
+}
+
+static uint32_t ivshmem_mmio_readb(void *opaque, target_phys_addr_t addr)
+{
+    return ivshmem_io_readb(opaque, addr & 0xFF);
+}
+
+static uint32_t ivshmem_mmio_readw(void *opaque, target_phys_addr_t addr)
+{
+    uint32_t val = ivshmem_io_readw(opaque, addr & 0xFF);
+    return val;
+}
+
+static uint32_t ivshmem_mmio_readl(void *opaque, target_phys_addr_t addr)
+{
+    uint32_t val = ivshmem_io_readl(opaque, addr & 0xFF);
+    return val;
+}
+
+static CPUReadMemoryFunc *ivshmem_mmio_read[3] = {
+    ivshmem_mmio_readb,
+    ivshmem_mmio_readw,
+    ivshmem_mmio_readl,
+};
+
+static CPUWriteMemoryFunc *ivshmem_mmio_write[3] = {
+    ivshmem_mmio_writeb,
+    ivshmem_mmio_writew,
+    ivshmem_mmio_writel,
+};
+
+
+static void ivshmem_receive(void *opaque, const uint8_t *buf, int size)
+{
+    IVShmemState *s = opaque;
+
+    ivshmem_IntrStatus_write(s, *buf);
+
+    IVSHMEM_DPRINTF("ivshmem_receive 0x%02x\n", *buf);
+}
+
+static void ivshmem_event(void *opaque, int event)
+{
+//    IVShmemState *s = opaque;
+    IVSHMEM_DPRINTF("ivshmem_event %d\n", event);
+}
+
+static int ivshmem_can_receive(void * opaque)
+{
+    return 8;
+}
+
+static CharDriverState* create_eventfd_chr_device(void * opaque, int eventfd)
+{
+    // create a event character device based on the passed eventfd
+    IVShmemState *s = opaque;
+    CharDriverState * chr;
+
+    chr = qemu_chr_open_eventfd(eventfd);
+
+    if (chr == NULL) {
+        IVSHMEM_DPRINTF("creating eventfd for eventfd %d failed\n", eventfd);
+        exit(-1);
+    }
+
+    qemu_chr_add_handlers(chr, ivshmem_can_receive, ivshmem_receive,
+                      ivshmem_event, s);
+
+    return chr;
+
+}
+
+static int check_shm_size(IVShmemState *s, int shmemfd) {
+    /* check that the guest isn't going to try and map more memory than the
+     * card server allocated; return -1 to indicate error */
+
+    struct stat buf;
+
+    fstat(shmemfd, &buf);
+
+    if (s->ivshmem_size > buf.st_size) {
+        fprintf(stderr, "IVSHMEM ERROR: Requested memory size greater");
+        fprintf(stderr, " than shared object size (%d > %ld)\n",
+                                          s->ivshmem_size, buf.st_size);
+        return -1;
+    } else {
+        return 0;
+    }
+}
+
+static void ivshmem_read(void *opaque, const uint8_t * buf, int flags)
+{
+    IVShmemState *s = opaque;
+    int incoming_fd, tmp_fd;
+    long incoming_posn;
+
+    memcpy(&incoming_posn, buf, sizeof(long));
+    /* pick off s->chr->msgfd and store it, posn should accompany msg */
+    tmp_fd = qemu_chr_get_msgfd(s->chr);
+    IVSHMEM_DPRINTF("posn is %ld, fd is %d\n", incoming_posn, tmp_fd);
+
+    if (tmp_fd == -1) {
+        s->eventfd_posn = incoming_posn;
+        return;
+    }
+
+    /* because of the implementation of get_msgfd, we need a dup */
+    incoming_fd = dup(tmp_fd);
+
+    /* if the position is -1, then it's shared memory fd */
+    if (incoming_posn == -1) {
+        s->shm_fd = incoming_fd;
+
+        s->eventfd_bitvec = 0;
+        s->num_eventfds = 0;
+
+        if (check_shm_size(s, s->shm_fd) == -1) {
+            exit(-1);
+        }
+
+        if (mmap(s->ivshmem_ptr, ivshmem_desc.size, PROT_READ|PROT_WRITE,
+                    MAP_SHARED|MAP_FIXED, s->shm_fd, 0) == MAP_FAILED)
+        {
+            fprintf(stderr, "kvm_ivshmem: could not mmap shared file\n");
+            exit(-1);
+        }
+
+        return;
+    }
+
+    /* this is an eventfd for a particular guest VM */
+    IVSHMEM_DPRINTF("eventfds[%ld] = %d\n", incoming_posn, incoming_fd);
+    s->eventfds[incoming_posn] = incoming_fd;
+
+    /* bitmap to keep track of live VMs */
+    s->eventfd_bitvec |= 1 << incoming_posn;
+
+    /* keep track of the maximum VM ID */
+    if (incoming_posn > s->num_eventfds) {
+        s->num_eventfds = incoming_posn;
+    }
+
+    /* initialize char device for callback on my eventfd */
+    if (incoming_posn == s->eventfd_posn) {
+        s->eventfd_chr = create_eventfd_chr_device(s, s->eventfds[s->eventfd_posn]);
+    }
+
+    return;
+}
+
+#if 0
+static void ivshmem_recvmsg(void *opaque, struct msghdr * msg, int flags)
+{
+
+    IVShmemState *s = opaque;
+    struct cmsghdr *cmptr;
+    struct iovec *iov;
+    long param;
+    long msg_size;
+
+    iov = msg->msg_iov;
+
+    memcpy(&param, iov->iov_base, sizeof param);
+
+    IVSHMEM_DPRINTF("Inside Recvmsg (%ld)\n", param);
+    for (cmptr = CMSG_FIRSTHDR(msg); cmptr != NULL;
+        cmptr = CMSG_NXTHDR(msg, cmptr)) {
+        if (cmptr->cmsg_level != SOL_SOCKET ||
+            cmptr->cmsg_type != SCM_RIGHTS) {
+                printf("read msg_size = %ld\n", msg_size);
+                continue;
+        }
+
+        // is eventfd_posn uninitialized?  Then this is the initial list of eventfds
+        if (s->eventfd_posn == -1) {
+
+            s->eventfd_bitvec = 0;
+            msg_size = sizeof(int) * (param + 1);
+            s->eventfds = (int *)malloc(msg_size);
+            s->num_eventfds = param + 1;
+            s->eventfd_posn = param;
+
+            memcpy(s->eventfds, CMSG_DATA(cmptr), msg_size);
+            IVSHMEM_DPRINTF("I am %d, receiving %d fds\n", s->eventfd_posn, s->num_eventfds);
+            IVSHMEM_DPRINTF("broadcast enabled - %d\n", BROADCAST_VAL);
+            IVSHMEM_DPRINTF("shmemfd is %d\n", s->eventfds[0]);
+            IVSHMEM_DPRINTF("My fd is %d\n", s->eventfds[s->eventfd_posn]);
+
+            /* presume all eventfds are live, we will be notified of dead ones */
+            s->eventfd_bitvec = (1 << (s->num_eventfds)) - 1;
+            s->eventfd_bitvec &= ~1; /* posn 0 is not an eventfd, it's the shared memory fd */
+
+            if (check_shm_size(s, s->eventfds[0]) == -1) {
+                exit(-1);
+            }
+
+            if (mmap(s->ivshmem_ptr, ivshmem_desc.size, PROT_READ|PROT_WRITE,
+                        MAP_SHARED|MAP_FIXED, s->eventfds[0], 0) == MAP_FAILED)
+            {
+                fprintf(stderr, "kvm_ivshmem: could not mmap shared file\n");
+                exit(-1);
+            }
+
+
+            // initialize char device for callback on my eventfd
+            s->eventfd_chr = create_eventfd_chr_device(s, s->eventfds[s->eventfd_posn]);
+
+            return;
+        }
+
+        // at this point we know this is an update
+        // the param is then the position of a new fd
+        if (param >= s->num_eventfds) {
+            s->eventfds = realloc(s->eventfds, sizeof(int) * (param + 1));
+        }
+
+        msg_size = sizeof(int);
+
+        // copy the new fd into its posn
+        memcpy(&(s->eventfds[param]), CMSG_DATA(cmptr), msg_size);
+
+        s->num_eventfds = param + 1;
+        s->eventfd_bitvec |= 1 << param;
+
+        IVSHMEM_DPRINTF("[update] new bitvec is %d\n", s->eventfd_bitvec);
+        IVSHMEM_DPRINTF("[update] s->eventfds[%ld] is %d\n", param,s->eventfds[param]);
+
+    }
+
+    // notification of a dead guest
+    // a negative number indicates a kill
+
+    if (param < 0) {
+        int index = -param;
+
+        IVSHMEM_DPRINTF("%d < %d || %d > 0\n", index, s->eventfds[index], s->num_eventfds);
+        if ((index < s->num_eventfds) || (s->eventfds[index] > 0)) {
+            s->eventfds[index] = -1;
+            // turn off the bit in the bit vector
+            s->eventfd_bitvec &= ~(1 << index);
+            IVSHMEM_DPRINTF("[kill] s->eventfds[%d] is %d\n", index,s->eventfds[index]);
+        }
+
+        return;
+
+    }
+
+}
+#endif
+
+int pci_ivshmem_init(PCIBus *bus)
+{
+    PCI_IVShmemState *d;
+    IVShmemState *s;
+    uint8_t *pci_conf;
+
+    IVSHMEM_DPRINTF("shared file is %s\n", ivshmem_desc.name);
+    d = (PCI_IVShmemState *)pci_register_device(bus, "kvm_ivshmem",
+                                           sizeof(PCI_IVShmemState),
+                                           -1, NULL, NULL);
+    if (!d) {
+        return -1;
+    }
+
+    s = &d->ivshmem_state;
+
+    /* allocate shared memory RAM */
+    s->ivshmem_offset = qemu_ram_alloc(ivshmem_desc.size);
+    IVSHMEM_DPRINTF("size is = %d\n", ivshmem_desc.size);
+    IVSHMEM_DPRINTF("ivshmem ram offset = %ld\n", s->ivshmem_offset);
+
+    s->ivshmem_ptr = qemu_get_ram_ptr(s->ivshmem_offset);
+
+    s->pci_dev = &d->dev;
+    s->ivshmem_size = ivshmem_desc.size;
+
+    pci_conf = d->dev.config;
+    pci_conf[0x00] = 0xf4; // Qumranet vendor ID 0x1af4 (low byte first)
+    pci_conf[0x01] = 0x1a;
+    pci_conf[0x02] = 0x10;
+    pci_conf[0x03] = 0x11;
+    pci_conf[0x04] = PCI_COMMAND_IOACCESS | PCI_COMMAND_MEMACCESS;
+    pci_conf[0x0a] = 0x00; // RAM controller
+    pci_conf[0x0b] = 0x05;
+    pci_conf[0x0e] = 0x00; // header_type
+
+    pci_conf[PCI_INTERRUPT_PIN] = 1; // we are going to support interrupts
+
+    /* XXX: ivshmem_desc.size must be a power of two */
+
+    s->ivshmem_mmio_io_addr = cpu_register_io_memory(ivshmem_mmio_read,
+                                    ivshmem_mmio_write, s);
+
+    /* region for registers*/
+    pci_register_bar(&d->dev, 0, 0x100,
+                           PCI_BASE_ADDRESS_SPACE_MEMORY, ivshmem_mmio_map);
+
+    /* region for shared memory */
+    pci_register_bar(&d->dev, 1, ivshmem_desc.size,
+                           PCI_BASE_ADDRESS_SPACE_MEMORY, ivshmem_map);
+
+    if (ivshmem_desc.chrdev != NULL) {
+        if (strncmp(ivshmem_desc.chrdev, "unix:", 5) == 0) {
+
+            /* if we get a UNIX socket as the parameter we will talk
+             * to the ivshmem server*/
+            char label[32];
+
+            s->eventfd_posn = -1;
+
+            snprintf(label, 32, "ivshmem_chardev");
+            s->chr = qemu_chr_open(label, ivshmem_desc.chrdev, NULL);
+            if (s->chr == NULL) {
+                fprintf(stderr, "No server listening on %s\n",
+                                                        ivshmem_desc.chrdev);
+                exit(-1);
+            }
+
+//            tcp_switch_to_recvmsg_handlers(s->chr);
+
+            qemu_chr_add_handlers(s->chr, ivshmem_can_receive, ivshmem_read,
+                              ivshmem_event, s);
+        } else {
+            /* just map the file, we're not using a server */
+            int shmem_fd;
+
+            if ((shmem_fd=shm_open(ivshmem_desc.chrdev, O_CREAT|O_RDWR,
+                            S_IRWXU|S_IRWXG|S_IRWXO)) < 0) {
+                fprintf(stderr, "kvm_shmem: could not open shared file\n");
+                exit(-1);
+            }
+
+            /* mmap onto PCI device's memory */
+            if (ftruncate(shmem_fd, ivshmem_desc.size) != 0) {
+                fprintf(stderr, "kvm_ivshmem: could not truncate shared file\n");
+            }
+            if (mmap(s->ivshmem_ptr, ivshmem_desc.size, PROT_READ|PROT_WRITE,
+                        MAP_SHARED|MAP_FIXED, shmem_fd, 0)  == MAP_FAILED) {
+                fprintf(stderr, "kvm_ivshmem: could not mmap shared file\n");
+                exit(-1);
+            }
+        }
+    }
+
+    return 0;
+}
+
diff --git a/hw/pc.c b/hw/pc.c
index dc935a8..5dacc77 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -74,6 +74,8 @@ static PCII440FXState *i440fx_state;
 
 qemu_irq *ioapic_irq_hack;
 
+extern int ivshmem_enabled;
+
 typedef struct isa_irq_state {
     qemu_irq *i8259;
     qemu_irq *ioapic;
@@ -938,6 +940,10 @@ static void pc_init1(ram_addr_t ram_size,
         }
     }
 
+    if (pci_enabled && ivshmem_enabled) {
+        pci_ivshmem_init(pci_bus);
+    }
+
     rtc_state = rtc_init(2000);
 
     qemu_register_boot_set(pc_boot_set, rtc_state);
diff --git a/hw/pc.h b/hw/pc.h
index e9da683..4a75c96 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -167,6 +167,9 @@ void isa_ne2000_init(int base, int irq, NICInfo *nd);
 
 void extboot_init(BlockDriverState *bs);
 
+/* ivshmem.c */
+int pci_ivshmem_init(PCIBus *bus);
+
 int cpu_is_bsp(CPUState *env);
 
 #endif
diff --git a/qemu-char.c b/qemu-char.c
index 75dbf66..6e957b7 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -1706,6 +1706,7 @@ static CharDriverState *qemu_chr_open_win_pipe(QemuOpts *opts)
     WinCharState *s;
 
     chr = qemu_mallocz(sizeof(CharDriverState));
+
     s = qemu_mallocz(sizeof(WinCharState));
     chr->opaque = s;
     chr->chr_write = win_chr_write;
@@ -2049,6 +2050,12 @@ static void tcp_chr_read(void *opaque)
     }
 }
 
+CharDriverState *qemu_chr_open_eventfd(int eventfd){
+
+    return qemu_chr_open_fd(eventfd, eventfd);
+
+}
+
 static void tcp_chr_connect(void *opaque)
 {
     CharDriverState *chr = opaque;
diff --git a/qemu-char.h b/qemu-char.h
index bcc0766..9a0d2c0 100644
--- a/qemu-char.h
+++ b/qemu-char.h
@@ -93,6 +93,9 @@ void qemu_chr_info_print(Monitor *mon, const QObject *ret_data);
 void qemu_chr_info(Monitor *mon, QObject **ret_data);
 CharDriverState *qemu_chr_find(const char *name);
 
+/* add an eventfd to the qemu devices that are polled */
+CharDriverState *qemu_chr_open_eventfd(int eventfd);
+
 extern int term_escape_char;
 
 /* async I/O support */
diff --git a/qemu-options.hx b/qemu-options.hx
index 9683d09..6a37083 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1679,6 +1679,18 @@ STEXI
 Setup monitor on chardev @var{name}.
 ETEXI
 
+DEF("ivshmem", HAS_ARG, QEMU_OPTION_ivshmem, \
+    "-ivshmem size,unix:file connects to shared memory PCI card server \
+    listening on unix domain socket 'file' of at least 'size' (in MB) and \
+    exposes it as a PCI device in the guest\n")
+STEXI
+@item -ivshmem @var{size},unix:@var{file}
+Connects to a shared memory PCI server at UDS @var{file} that shares a POSIX
+shared object of size @var{size} and creates a PCI device of the same size that
+maps the shared object (received from the server) into the device for guests to
+access.  The server also sends eventfds to the guest to support interrupts.
+ETEXI
+
 DEF("debugcon", HAS_ARG, QEMU_OPTION_debugcon, \
     "-debugcon dev   redirect the debug console to char device 'dev'\n")
 STEXI
diff --git a/sysemu.h b/sysemu.h
index ff97786..9bf15b1 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -237,6 +237,14 @@ extern CharDriverState *serial_hds[MAX_SERIAL_PORTS];
 
 extern CharDriverState *parallel_hds[MAX_PARALLEL_PORTS];
 
+/* inter-VM shared memory devices */
+
+#define MAX_IVSHMEM_DEVICES 1
+
+extern CharDriverState * ivshmem_chardev;
+void ivshmem_init(const char * optarg);
+int ivshmem_get_size(void);
+
 #define TFR(expr) do { if ((expr) != -1) break; } while (errno == EINTR)
 
 #ifdef HAS_AUDIO
diff --git a/vl.c b/vl.c
index f7f388f..c7f6cf7 100644
--- a/vl.c
+++ b/vl.c
@@ -195,6 +195,7 @@ int autostart;
 static int rtc_utc = 1;
 static int rtc_date_offset = -1; /* -1 means no change */
 QEMUClock *rtc_clock;
+int ivshmem_enabled = 0;
 int vga_interface_type = VGA_NONE;
 #ifdef TARGET_SPARC
 int graphic_width = 1024;
@@ -213,6 +214,8 @@ int no_quit = 0;
 CharDriverState *serial_hds[MAX_SERIAL_PORTS];
 CharDriverState *parallel_hds[MAX_PARALLEL_PORTS];
 CharDriverState *virtcon_hds[MAX_VIRTIO_CONSOLES];
+CharDriverState *ivshmem_chardev;
+const char * ivshmem_device;
 #ifdef TARGET_I386
 int win2k_install_hack = 0;
 int rtc_td_hack = 0;
@@ -4885,6 +4888,8 @@ int main(int argc, char **argv, char **envp)
     kernel_cmdline = "";
     cyls = heads = secs = 0;
     translation = BIOS_ATA_TRANSLATION_AUTO;
+    ivshmem_device = NULL;
+    ivshmem_chardev = NULL;
 
     for (i = 0; i < MAX_NODES; i++) {
         node_mem[i] = 0;
@@ -5366,6 +5371,10 @@ int main(int argc, char **argv, char **envp)
                 add_device_config(DEV_PARALLEL, optarg);
                 default_parallel = 0;
                 break;
+            case QEMU_OPTION_ivshmem:
+                ivshmem_device = optarg;
+                ivshmem_enabled = 1;
+                break;
             case QEMU_OPTION_debugcon:
                 add_device_config(DEV_DEBUGCON, optarg);
                 break;
@@ -5883,6 +5892,10 @@ int main(int argc, char **argv, char **envp)
     if (ram_size == 0)
         ram_size = DEFAULT_RAM_SIZE * 1024 * 1024;
 
+    if (ivshmem_enabled) {
+        ivshmem_init(ivshmem_device);
+    }
+
     /* init the dynamic translator */
     cpu_exec_init_all(tb_size * 1024 * 1024);
 
-- 
1.6.0.6



* [PATCH] Driver to support shared memory device with interrupts
  2010-03-01 18:17 ` Cam Macdonell
  2010-03-01 18:19   ` [PATCH] Support an inter-vm shared memory device that maps a shared-memory object Cam Macdonell
@ 2010-03-01 18:20   ` Cam Macdonell
  1 sibling, 0 replies; 10+ messages in thread
From: Cam Macdonell @ 2010-03-01 18:20 UTC (permalink / raw)
  To: kzm98; +Cc: kvm, Cam Macdonell

This driver allows a guest VM to access memory shared with other guests,
backed by a POSIX shared memory object on the host.  The driver can also send
interrupts by writing to the Doorbell register.

With interrupts, the ioctl must specify the ID of the VM to receive the
interrupt or '255' for broadcast to all active VMs.  The 'arg' parameter is the
destination VM and 'cmd' is the interrupt code.  The value written to the
register is a bit messy.  32 bits are written to the register: the upper 16 are
the destination VM, and the lower 16 are the interrupt 'code' that the
destination guest will receive.  Implemented codes (see the interrupt handler)
either call up() on the device's semaphore or wake up waiters on the wait_event
queue.  These codes' uses are at the discretion of the driver, so they can be
customized.

For ioctls that read values from the device (such as for getting the global ID
of the guest), the arg parameter is unused.

Cam
---
 drivers/char/Kconfig       |    8 +
 drivers/char/Makefile      |    2 +
 drivers/char/kvm_ivshmem.c |  455 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 465 insertions(+), 0 deletions(-)
 create mode 100644 drivers/char/kvm_ivshmem.c

diff --git a/drivers/char/Kconfig b/drivers/char/Kconfig
index e023682..e8d82a9 100644
--- a/drivers/char/Kconfig
+++ b/drivers/char/Kconfig
@@ -1103,6 +1103,14 @@ config DEVPORT
 	depends on ISA || PCI
 	default y
 
+config KVM_IVSHMEM
+    tristate "Inter-VM Shared Memory Device"
+    depends on PCI
+    default m
+    help
+      This device maps a region of shared memory between the host OS and any
+      number of virtual machines.
+
 source "drivers/s390/char/Kconfig"
 
 endmenu
diff --git a/drivers/char/Makefile b/drivers/char/Makefile
index f957edf..b31f871 100644
--- a/drivers/char/Makefile
+++ b/drivers/char/Makefile
@@ -111,6 +111,8 @@ obj-$(CONFIG_PS3_FLASH)		+= ps3flash.o
 obj-$(CONFIG_JS_RTC)		+= js-rtc.o
 js-rtc-y = rtc.o
 
+obj-$(CONFIG_KVM_IVSHMEM)		+= kvm_ivshmem.o
+
 # Files generated that shall be removed upon make clean
 clean-files := consolemap_deftbl.c defkeymap.c
 
diff --git a/drivers/char/kvm_ivshmem.c b/drivers/char/kvm_ivshmem.c
new file mode 100644
index 0000000..f7057dc
--- /dev/null
+++ b/drivers/char/kvm_ivshmem.c
@@ -0,0 +1,455 @@
+/*
+ * drivers/char/kvm_ivshmem.c - driver for KVM Inter-VM shared memory PCI device
+ *
+ * Copyright 2009 Cam Macdonell <cam@cs.ualberta.ca>
+ *
+ * Based on cirrusfb.c and 8139cp.c:
+ *         Copyright 1999-2001 Jeff Garzik
+ *         Copyright 2001-2004 Jeff Garzik
+ *
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/proc_fs.h>
+#include <linux/smp_lock.h>
+#include <asm/uaccess.h>
+#include <linux/interrupt.h>
+#include <linux/mutex.h>
+
+#define TRUE 1
+#define FALSE 0
+#define KVM_IVSHMEM_DEVICE_MINOR_NUM 0
+
+enum {
+    /* KVM Inter-VM shared memory device register offsets */
+    IntrMask        = 0x00,    /* Interrupt Mask */
+    IntrStatus      = 0x10,    /* Interrupt Status */
+    Doorbell        = 0x20,    /* Doorbell */
+    IVPosition      = 0x30,
+    IVLiveList      = 0x40,
+    ShmOK = 1                /* Everything is OK */
+};
+
+typedef struct kvm_ivshmem_device {
+    void __iomem * regs;
+
+    void * base_addr;
+
+    unsigned int regaddr;
+    unsigned int reg_size;
+
+    unsigned int ioaddr;
+    unsigned int ioaddr_size;
+    unsigned int irq;
+
+    bool         enabled;
+
+} kvm_ivshmem_device;
+
+static int event_num;
+static struct semaphore sema;
+static wait_queue_head_t wait_queue;
+
+static kvm_ivshmem_device kvm_ivshmem_dev;
+
+static int device_major_nr;
+
+static int kvm_ivshmem_ioctl(struct inode *, struct file *, unsigned int, unsigned long);
+static int kvm_ivshmem_mmap(struct file *, struct vm_area_struct *);
+static int kvm_ivshmem_open(struct inode *, struct file *);
+static int kvm_ivshmem_release(struct inode *, struct file *);
+static ssize_t kvm_ivshmem_read(struct file *, char *, size_t, loff_t *);
+static ssize_t kvm_ivshmem_write(struct file *, const char *, size_t, loff_t *);
+static loff_t kvm_ivshmem_lseek(struct file * filp, loff_t offset, int origin);
+
+enum ivshmem_ioctl { set_sema, down_sema, empty, wait_event, wait_event_irq, read_ivposn, read_livelist, sema_irq };
+
+static const struct file_operations kvm_ivshmem_ops = {
+    .owner   = THIS_MODULE,
+    .open    = kvm_ivshmem_open,
+    .mmap    = kvm_ivshmem_mmap,
+    .read    = kvm_ivshmem_read,
+    .ioctl   = kvm_ivshmem_ioctl,
+    .write   = kvm_ivshmem_write,
+    .llseek  = kvm_ivshmem_lseek,
+    .release = kvm_ivshmem_release,
+};
+
+static struct pci_device_id kvm_ivshmem_id_table[] = {
+    { 0x1af4, 0x1110, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0 },
+    { 0 },
+};
+MODULE_DEVICE_TABLE (pci, kvm_ivshmem_id_table);
+
+static void kvm_ivshmem_remove_device(struct pci_dev* pdev);
+static int kvm_ivshmem_probe_device (struct pci_dev *pdev,
+                        const struct pci_device_id * ent);
+
+static struct pci_driver kvm_ivshmem_pci_driver = {
+    .name        = "kvm-shmem",
+    .id_table    = kvm_ivshmem_id_table,
+    .probe       = kvm_ivshmem_probe_device,
+    .remove      = kvm_ivshmem_remove_device,
+};
+
+static int kvm_ivshmem_ioctl(struct inode * ino, struct file * filp,
+            unsigned int cmd, unsigned long arg)
+{
+
+    int rv;
+    uint32_t msg;
+
+    printk("KVM_IVSHMEM: args is %ld\n", arg);
+#if 1
+    switch (cmd) {
+        case set_sema:
+            printk("KVM_IVSHMEM: initialize semaphore\n");
+            printk("KVM_IVSHMEM: args is %ld\n", arg);
+            sema_init(&sema, arg);
+            break;
+        case down_sema:
+            printk("KVM_IVSHMEM: sleeping on semaphore (cmd = %d)\n", cmd);
+            rv = down_interruptible(&sema);
+            printk("KVM_IVSHMEM: waking\n");
+            break;
+        case empty:
+            msg = ((arg & 0xff) << 8) + (cmd & 0xff);
+            printk("KVM_IVSHMEM: args is %ld\n", arg);
+            printk("KVM_IVSHMEM: ringing sema doorbell\n");
+            writew(msg, kvm_ivshmem_dev.regs + Doorbell);
+            break;
+        case wait_event:
+            printk("KVM_IVSHMEM: sleeping on event (cmd = %d)\n", cmd);
+            wait_event_interruptible(wait_queue, (event_num == 1));
+            printk("KVM_IVSHMEM: waking\n");
+            event_num = 0;
+            break;
+        case wait_event_irq:
+            msg = ((arg & 0xff) << 8) + (cmd & 0xff);
+            printk("KVM_IVSHMEM: ringing wait_event doorbell on %lu (msg = %u)\n", arg, msg);
+            writew(msg, kvm_ivshmem_dev.regs + Doorbell);
+            break;
+        case read_ivposn:
+            msg = readw( kvm_ivshmem_dev.regs + IVPosition);
+            printk("KVM_IVSHMEM: my posn is %d\n", msg);
+            rv = copy_to_user((void __user *)arg, &msg, sizeof(msg));
+            break;
+        case read_livelist:
+            msg = readw( kvm_ivshmem_dev.regs + IVLiveList);
+            printk("KVM_IVSHMEM: live list bit vector is %d\n", msg);
+            rv = copy_to_user((void __user *)arg, &msg, sizeof(msg));
+            break;
+        case sema_irq:
+            // 2 is the actual code, but we use 7 from the user
+            msg = ((arg & 0xff) << 8) + (cmd & 0xff);
+            printk("KVM_IVSHMEM: args is %ld\n", arg);
+            printk("KVM_IVSHMEM: ringing sema doorbell\n");
+            writew(msg, kvm_ivshmem_dev.regs + Doorbell);
+            break;
+        default:
+            printk("KVM_IVSHMEM: bad ioctl (cmd = %u)\n", cmd);
+    }
+#endif
+
+    return 0;
+}
+
+static ssize_t kvm_ivshmem_read(struct file * filp, char * buffer, size_t len,
+                        loff_t * poffset)
+{
+
+    int bytes_read = 0;
+    unsigned long offset;
+
+    offset = *poffset;
+
+    if (!kvm_ivshmem_dev.base_addr) {
+        printk(KERN_ERR "KVM_IVSHMEM: cannot read from ioaddr (NULL)\n");
+        return 0;
+    }
+
+    if (len > kvm_ivshmem_dev.ioaddr_size - offset) {
+        len = kvm_ivshmem_dev.ioaddr_size - offset;
+    }
+
+    if (len == 0) return 0;
+
+    bytes_read = copy_to_user(buffer, kvm_ivshmem_dev.base_addr+offset, len);
+    if (bytes_read > 0) {
+        return -EFAULT;
+    }
+
+    *poffset += len;
+    return len;
+}
+
+static loff_t kvm_ivshmem_lseek(struct file * filp, loff_t offset, int origin)
+{
+
+    loff_t retval = -1;
+
+    switch (origin) {
+        case 1:
+            offset += filp->f_pos;
+        case 0:
+            retval = offset;
+            if (offset > kvm_ivshmem_dev.ioaddr_size) {
+                offset = kvm_ivshmem_dev.ioaddr_size;
+            }
+            filp->f_pos = offset;
+    }
+
+    return retval;
+}
+
+static ssize_t kvm_ivshmem_write(struct file * filp, const char * buffer,
+                    size_t len, loff_t * poffset)
+{
+
+    int bytes_written = 0;
+    unsigned long offset;
+
+    offset = *poffset;
+
+//    printk(KERN_INFO "KVM_IVSHMEM: trying to write\n");
+    if (!kvm_ivshmem_dev.base_addr) {
+        printk(KERN_ERR "KVM_IVSHMEM: cannot write to ioaddr (NULL)\n");
+        return 0;
+    }
+
+    if (len > kvm_ivshmem_dev.ioaddr_size - offset) {
+        len = kvm_ivshmem_dev.ioaddr_size - offset;
+    }
+
+//    printk(KERN_INFO "KVM_IVSHMEM: len is %u\n", (unsigned) len);
+    if (len == 0) return 0;
+
+    bytes_written = copy_from_user(kvm_ivshmem_dev.base_addr+offset,
+                    buffer, len);
+    if (bytes_written > 0) {
+        return -EFAULT;
+    }
+
+//    printk(KERN_INFO "KVM_IVSHMEM: wrote %u bytes at offset %lu\n", (unsigned) len, offset);
+    *poffset += len;
+    return len;
+}
+
+static irqreturn_t kvm_ivshmem_interrupt (int irq, void *dev_instance)
+{
+    struct kvm_ivshmem_device * dev = dev_instance;
+    u16 status;
+
+    if (unlikely(dev == NULL))
+        return IRQ_NONE;
+
+    status = readw(dev->regs + IntrStatus);
+    if (!status || (status == 0xFFFF))
+        return IRQ_NONE;
+
+    /* depending on the message we wake different structures */
+    if (status == sema_irq) {
+        up(&sema);
+    } else if (status == wait_event_irq) {
+        event_num = 1;
+        wake_up_interruptible(&wait_queue);
+    }
+
+    printk(KERN_INFO "KVM_IVSHMEM: interrupt (status = 0x%04x)\n",
+           status);
+
+    return IRQ_HANDLED;
+}
+
+static int kvm_ivshmem_probe_device (struct pci_dev *pdev,
+                    const struct pci_device_id * ent) {
+
+    int result;
+
+    printk("KVM_IVSHMEM: Probing for KVM_IVSHMEM Device\n");
+
+    result = pci_enable_device(pdev);
+    if (result) {
+        printk(KERN_ERR "Cannot probe KVM_IVSHMEM device %s: error %d\n",
+        pci_name(pdev), result);
+        return result;
+    }
+
+    result = pci_request_regions(pdev, "kvm_ivshmem");
+    if (result < 0) {
+        printk(KERN_ERR "KVM_IVSHMEM: cannot request regions\n");
+        goto pci_disable;
+    } else printk(KERN_INFO "KVM_IVSHMEM: result is %d\n", result);
+
+    kvm_ivshmem_dev.ioaddr = pci_resource_start(pdev, 1);
+    kvm_ivshmem_dev.ioaddr_size = pci_resource_len(pdev, 1);
+
+    kvm_ivshmem_dev.base_addr = pci_iomap(pdev, 1, 0);
+    printk(KERN_INFO "KVM_IVSHMEM: iomap base = 0x%lx \n",
+                            (unsigned long) kvm_ivshmem_dev.base_addr);
+
+    if (!kvm_ivshmem_dev.base_addr) {
+        printk(KERN_ERR "KVM_IVSHMEM: cannot iomap region of size %d\n",
+                            kvm_ivshmem_dev.ioaddr_size);
+        goto pci_release;
+    }
+
+    printk(KERN_INFO "KVM_IVSHMEM: ioaddr = %x ioaddr_size = %d\n",
+                        kvm_ivshmem_dev.ioaddr, kvm_ivshmem_dev.ioaddr_size);
+
+    kvm_ivshmem_dev.regaddr =  pci_resource_start(pdev, 0);
+    kvm_ivshmem_dev.reg_size = pci_resource_len(pdev, 0);
+    kvm_ivshmem_dev.regs = pci_iomap(pdev, 0, 0x100);
+
+    if (!kvm_ivshmem_dev.regs) {
+        printk(KERN_ERR "KVM_IVSHMEM: cannot ioremap registers of size %d\n",
+                            kvm_ivshmem_dev.reg_size);
+        goto reg_release;
+    }
+
+    /* set all masks to on */
+    writew(0xffff, kvm_ivshmem_dev.regs + IntrMask);
+
+    /* by default initialize semaphore to 0 */
+    sema_init(&sema, 0);
+
+    init_waitqueue_head(&wait_queue);
+    event_num = 0;
+
+    printk(KERN_INFO "KVM_IVSHMEM: irq = %u regaddr = %x reg_size = %d\n",
+                 pdev->irq, kvm_ivshmem_dev.regaddr, kvm_ivshmem_dev.reg_size);
+
+    if (request_irq(pdev->irq, kvm_ivshmem_interrupt, IRQF_SHARED,
+                        "kvm_ivshmem", &kvm_ivshmem_dev)) {
+        printk(KERN_ERR "KVM_IVSHMEM: cannot get interrupt %d\n", pdev->irq);
+    }
+
+    return 0;
+
+
+reg_release:
+    pci_iounmap(pdev, kvm_ivshmem_dev.base_addr);
+pci_release:
+    pci_release_regions(pdev);
+pci_disable:
+    pci_disable_device(pdev);
+    return -EBUSY;
+
+}
+
+static void kvm_ivshmem_remove_device(struct pci_dev* pdev)
+{
+
+    printk(KERN_INFO "Unregister kvm_ivshmem device.\n");
+    free_irq(pdev->irq,&kvm_ivshmem_dev);
+    pci_iounmap(pdev, kvm_ivshmem_dev.regs);
+    pci_iounmap(pdev, kvm_ivshmem_dev.base_addr);
+    pci_release_regions(pdev);
+    pci_disable_device(pdev);
+
+}
+
+static void __exit kvm_ivshmem_cleanup_module (void)
+{
+    pci_unregister_driver (&kvm_ivshmem_pci_driver);
+    unregister_chrdev(device_major_nr, "kvm_ivshmem");
+}
+
+static int __init kvm_ivshmem_init_module (void)
+{
+
+    int err = -ENOMEM;
+
+    /* Register device node ops. */
+    err = register_chrdev(0, "kvm_ivshmem", &kvm_ivshmem_ops);
+    if (err < 0) {
+        printk(KERN_ERR "Unable to register kvm_ivshmem device\n");
+        return err;
+    }
+    device_major_nr = err;
+    printk("KVM_IVSHMEM: Major device number is: %d\n", device_major_nr);
+    kvm_ivshmem_dev.enabled=FALSE;
+
+    err = pci_register_driver(&kvm_ivshmem_pci_driver);
+    if (err < 0) {
+        goto error;
+    }
+
+    return 0;
+
+error:
+    unregister_chrdev(device_major_nr, "kvm_ivshmem");
+    return err;
+}
+
+
+static int kvm_ivshmem_open(struct inode * inode, struct file * filp)
+{
+
+   printk(KERN_INFO "Opening kvm_ivshmem device\n");
+
+   if (MINOR(inode->i_rdev) != KVM_IVSHMEM_DEVICE_MINOR_NUM) {
+      printk(KERN_INFO "minor number is %d\n", KVM_IVSHMEM_DEVICE_MINOR_NUM);
+      return -ENODEV;
+   }
+
+   return 0;
+}
+
+static int kvm_ivshmem_release(struct inode * inode, struct file * filp)
+{
+
+   return 0;
+}
+
+static int kvm_ivshmem_mmap(struct file *filp, struct vm_area_struct * vma)
+{
+
+    unsigned long len;
+    unsigned long off;
+    unsigned long start;
+
+    lock_kernel();
+
+    off = vma->vm_pgoff << PAGE_SHIFT;
+    start = kvm_ivshmem_dev.ioaddr;
+
+    len=PAGE_ALIGN((start & ~PAGE_MASK) + kvm_ivshmem_dev.ioaddr_size);
+    start &= PAGE_MASK;
+
+    printk(KERN_INFO "%lu - %lu + %lu\n",vma->vm_end ,vma->vm_start, off);
+    printk(KERN_INFO "%lu > %lu\n",(vma->vm_end - vma->vm_start + off), len);
+
+    if ((vma->vm_end - vma->vm_start + off) > len) {
+        unlock_kernel();
+        return -EINVAL;
+    }
+
+    off += start;
+    vma->vm_pgoff = off >> PAGE_SHIFT;
+
+    vma->vm_flags |= VM_SHARED|VM_RESERVED;
+
+    if(io_remap_pfn_range(vma, vma->vm_start,
+        off >> PAGE_SHIFT, vma->vm_end - vma->vm_start,
+        vma->vm_page_prot))
+    {
+        printk("mmap failed\n");
+        unlock_kernel();
+        return -ENXIO;
+    }
+    unlock_kernel();
+
+    return 0;
+}
+
+module_init(kvm_ivshmem_init_module);
+module_exit(kvm_ivshmem_cleanup_module);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Cam Macdonell <cam@cs.ualberta.ca>");
+MODULE_DESCRIPTION("KVM inter-VM shared memory module");
+MODULE_VERSION("1.0");
-- 
1.6.0.6


* IVSHMEM and limits on shared memory
  2010-03-01 18:19   ` [PATCH] Support an inter-vm shared memory device that maps a shared-memory object Cam Macdonell
@ 2010-03-03  7:06     ` Khaled Ibrahim
  2010-03-03 22:09       ` Cam Macdonell
  0 siblings, 1 reply; 10+ messages in thread
From: Khaled Ibrahim @ 2010-03-03  7:06 UTC (permalink / raw)
  To: kvm; +Cc: cam




Hi Cam,


I used your patches successfully to support shared memory on KVM, and the
test cases ran successfully, but qemu-kvm crashes when I increase the size of
the shared memory.  I applied the ivshmem patch to qemu-kvm-0.12.3 (some
manual patching was needed).  It worked flawlessly for up to 128MB of shared
memory on my system.  I am running on a machine with 64GB of memory running
openSUSE (kernel 2.6.27) on an AMD Opteron.

Qemu crashes with smp=4 when the shared memory requested is 256MB (512MB with
smp=1), even though the shared memory file is created.  I debugged the
problem and it seems that some memory corruption happens.

It crashes in subpage_register for the rtl8139 PCI driver, tracked back to
rtl8139_mmio_map.  The problem starts with a corrupted value in the config
field of the struct for the rtl8139 driver.  At offset 20 of this field the
value should indicate that the address is uninitialized at the time of the
crash, but surprisingly the value changes over the course of execution and
becomes the SIZE of the shared memory allocated (related to ivshmem).  I
failed to identify what changes/corrupts that field.  I tried some padding
for the allocation, but the field always gets updated with the size of the
shared memory in a very consistent way.

Do you have any clue what might be causing this problem?

Thanks,

-Khaled

* Re: IVSHMEM and limits on shared memory
  2010-03-03  7:06     ` IVSHMEM and limits on shared memory Khaled Ibrahim
@ 2010-03-03 22:09       ` Cam Macdonell
  2010-03-03 22:38         ` Khaled Ibrahim
  0 siblings, 1 reply; 10+ messages in thread
From: Cam Macdonell @ 2010-03-03 22:09 UTC (permalink / raw)
  To: Khaled Ibrahim; +Cc: kvm

On Wed, Mar 3, 2010 at 12:06 AM, Khaled Ibrahim <kzm98@hotmail.com> wrote:
>
> Hi Cam,
>
> I used your patches successfully to support shared memory on KVM and
> used the test cases successfully, but qemu-kvm crashes when I increased the
> size of the shared memory.  I
> applied the ivshmem patch to qemu-kvm-0.12.3 (some manual patching was
> needed).  It worked flawlessly for
> up to 128MB of shared memory on my system. I am running on a machine with 64GB
> memory running opensuse (kernel 2.6.27) on AMD opteron.
>
> Qemu crashes  with
> smp=4 and the shared memory requested in 256MB, (512MB with smp=1), even though
> the shared memory file is created. I debugged the problem and it seems that
> some memory corruptions happens.

Can you please provide the full command-line for the smp=1 instance?

>
> It crashes in the subpage_register for rtl8139 pci driver!,
> tracked back to rtl8139_mmio_map. The problem starts with corrupted value in
> the config field in the struct for the rtl8139 driver. At offset 20 of this
> field the address should indicate that the address is uninitialized at that
> time of crash, but surprisingly the value changes over the course of execution
> and gets the SIZE of the shared memory allocated (related to ivshmem). I failed
> to identify what changes/corrupts that field. I tried some padding for
> allocation but the field always gets updated with the size of the shared memory
> in a very consistent way.
>

As far as you know does anything in the guest trigger the corruption?
Does the corruption happen immediately or after running some of the
test programs?

Thanks,
Cam

* RE: IVSHMEM and limits on shared memory
  2010-03-03 22:09       ` Cam Macdonell
@ 2010-03-03 22:38         ` Khaled Ibrahim
  2010-03-04  4:52           ` Cam Macdonell
  0 siblings, 1 reply; 10+ messages in thread
From: Khaled Ibrahim @ 2010-03-03 22:38 UTC (permalink / raw)
  To: cam; +Cc: kvm




----------------------------------------
> Date: Wed, 3 Mar 2010 15:09:17 -0700
> Subject: Re: IVSHMEM and limits on shared memory
> From: cam@cs.ualberta.ca
> To: kzm98@hotmail.com
> CC: kvm@vger.kernel.org
>
> On Wed, Mar 3, 2010 at 12:06 AM, Khaled Ibrahim  wrote:
>>
>> Hi Cam,
>>
>> I used your patches successfully to support shared memory on KVM and
>> used the test cases successfully, but qemu-kvm crashes when I increased the
>> size of the shared memory.  I
>> applied the ivshmem patch to qemu-kvm-0.12.3 (some manual patching was
>> needed).  It worked flawlessly for
>> up to 128MB of shared memory on my system. I am running on a machine with 64GB
>> memory running opensuse (kernel 2.6.27) on AMD opteron.
>>
>> Qemu crashes  with
>> smp=4 and the shared memory requested in 256MB, (512MB with smp=1), even though
>> the shared memory file is created. I debugged the problem and it seems that
>> some memory corruptions happens.
>
> Can you please provide the full command-line for the smp=1 instance?

qemu-system-x86_64 ./qemudisk0.raw \
         -net nic,model=rtl8139,macaddr=52:54:00:12:34:50 \
         -net tap,ifname=tap0,script=no,downscript=no \
         -m 4096 \
         -ivshmem 512,kvmshmem \
         -smp 1 \
         -usb \
         -usbdevice tablet \
         -localtime


>
>>
>> It crashes in the subpage_register for rtl8139 pci driver!,
>> tracked back to rtl8139_mmio_map. The problem starts with corrupted value in
>> the config field in the struct for the rtl8139 driver. At offset 20 of this
>> field the address should indicate that the address is uninitialized at that
>> time of crash, but surprisingly the value changes over the course of execution
>> and gets the SIZE of the shared memory allocated (related to ivshmem). I failed
>> to identify what changes/corrupts that field. I tried some padding for
>> allocation but the field always gets updated with the size of the shared memory
>> in a very consistent way.
>>
>
> As far as you know does anything in the guest trigger the corruption?
> Does the corruption happen immediately or after running some of the
> test programs?

The boot process does not complete; it fails before it reaches grub.

* Re: IVSHMEM and limits on shared memory
  2010-03-03 22:38         ` Khaled Ibrahim
@ 2010-03-04  4:52           ` Cam Macdonell
  2010-03-04 20:12             ` Khaled Ibrahim
  0 siblings, 1 reply; 10+ messages in thread
From: Cam Macdonell @ 2010-03-04  4:52 UTC (permalink / raw)
  To: Khaled Ibrahim; +Cc: kvm

On Wed, Mar 3, 2010 at 3:38 PM, Khaled Ibrahim <kzm98@hotmail.com> wrote:
>
>
>
> ----------------------------------------
>> Date: Wed, 3 Mar 2010 15:09:17 -0700
>> Subject: Re: IVSHMEM and limits on shared memory
>> From: cam@cs.ualberta.ca
>> To: kzm98@hotmail.com
>> CC: kvm@vger.kernel.org
>>
>> On Wed, Mar 3, 2010 at 12:06 AM, Khaled Ibrahim  wrote:
>>>
>>> Hi Cam,
>>>
>>> I used your patches successfully to support shared memory on KVM and
>>> used the test cases successfully, but qemu-kvm crashes when I increased the
>>> size of the shared memory.  I
>>> applied the ivshmem patch to qemu-kvm-0.12.3 (some manual patching was
>>> needed).  It worked flawlessly for
>>> up to 128MB of shared memory on my system. I am running on a machine with 64GB
>>> memory running opensuse (kernel 2.6.27) on AMD opteron.
>>>
>>> Qemu crashes  with
>>> smp=4 and the shared memory requested in 256MB, (512MB with smp=1), even though
>>> the shared memory file is created. I debugged the problem and it seems that
>>> some memory corruptions happens.
>>
>> Can you please provide the full command-line for the smp=1 instance?
>
> qemu-system-x86_64 ./qemudisk0.raw \
>          -net nic,model=rtl8139,macaddr=52:54:00:12:34:50 \
>          -net tap,ifname=tap0,script=no,downscript=no \
>          -m 4096 \
>          -ivshmem 512,kvmshmem \
>          -smp 1 \
>          -usb \
>          -usbdevice tablet \
>          -localtime
>
>
>>
>>>
>>> It crashes in the subpage_register for rtl8139 pci driver!,
>>> tracked back to rtl8139_mmio_map. The problem starts with corrupted value in
>>> the config field in the struct for the rtl8139 driver. At offset 20 of this
>>> field the address should indicate that the address is uninitialized at that
>>> time of crash, but surprisingly the value changes over the course of execution
>>> and gets the SIZE of the shared memory allocated (related to ivshmem). I failed
>>> to identify what changes/corrupts that field. I tried some padding for
>>> allocation but the field always gets updated with the size of the shared memory
>>> in a very consistent way.

Good debugging.  I've been able to reproduce your error when applying
my patch to qemu-kvm-0.12.3 and can trace the error to the
subpage_register.  Curiously, this bug does not occur with the latest
version from the git repo.  I've tested up to 1 GB without problem.
So I'm not sure if it's an error in my patch or elsewhere in the
memory management that has since been fixed.

As a test, I removed anywhere my patch stored the size of the shared
memory region and hard coded the size of 512 MB into qemu_ram_alloc
and pci_register_bar, so that my patch never writes the size of the
memory region anywhere.  And I discovered that the value of 512MB
still shows up at the offset you mention, so it seems something else
is storing that value in the wrong location and corrupting memory.

Can you try using the version from the git repo and see if the error recurs?

Cam

>>>
>>
>> As far as you know does anything in the guest trigger the corruption?
>> Does the corruption happen immediately or after running some of the
>> test programs?
>
> The boot process does not complete, and it fails before it reach grub.

* RE: IVSHMEM and limits on shared memory
  2010-03-04  4:52           ` Cam Macdonell
@ 2010-03-04 20:12             ` Khaled Ibrahim
  2010-03-04 20:53               ` Cam Macdonell
  0 siblings, 1 reply; 10+ messages in thread
From: Khaled Ibrahim @ 2010-03-04 20:12 UTC (permalink / raw)
  To: cam; +Cc: kvm


>
> As a test, I removed anywhere my patch stored the size of the shared
> memory region and hard coded the size of 512 MB into qemu_ram_alloc
> and pci_register_bar, so that my patch never writes the size of the
> memory region anywhere. And I discovered that the value of 512MB
> still shows up at the offset you mention, so it seems something else
> is storing that value in the wrong location and corrupting memory.
>
> Can you try using the version from the git repo and see if the error recurs?

Thank you, Cam. I built qemu-kvm from the git repo, but the resulting build
crashes while booting on my machine, even without the shared memory patch. I
used git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git. Which git repo are
you using? Can you send me an ivshmem-patched qemu-kvm, or tell me which
stable qemu-kvm repo I should use?

Thanks,
-Khaled
 		 	   		  

* Re: IVSHMEM and limits on shared memory
  2010-03-04 20:12             ` Khaled Ibrahim
@ 2010-03-04 20:53               ` Cam Macdonell
  0 siblings, 0 replies; 10+ messages in thread
From: Cam Macdonell @ 2010-03-04 20:53 UTC (permalink / raw)
  To: Khaled Ibrahim; +Cc: kvm

On Thu, Mar 4, 2010 at 1:12 PM, Khaled Ibrahim <kzm98@hotmail.com> wrote:
>
>>
>> As a test, I removed anywhere my patch stored the size of the shared
>> memory region and hard coded the size of 512 MB into qemu_ram_alloc
>> and pci_register_bar, so that my patch never writes the size of the
>> memory region anywhere. And I discovered that the value of 512MB
>> still shows up at the offset you mention, so it seems something else
>> is storing that value in the wrong location and corrupting memory.
>>
>> Can you try using the version from the git repo and see if the error recurs?
>
> Thank you Cam. I tried to build using git repo, but the build crashes while booting on my machine without the shared memory patch. I used git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git. Which git repo are you using? Can you send me a ivshmem patched qemu-kvm, or tell me which stable qemu-kvm repo should I use?

That's the correct repo.

Your VM crashes using the latest git repo?  That is unusual.  I'll
send you a tar ball off-list of a patched version of KVM.

>
> Thanks,
> -Khaled
>

Thread overview: 10+ messages
2010-02-28 20:20 KVM ivshmem enquiry Khaled Ibrahim
2010-03-01 18:17 ` Cam Macdonell
2010-03-01 18:19   ` [PATCH] Support an inter-vm shared memory device that maps a shared-memory object Cam Macdonell
2010-03-03  7:06     ` IVSHMEM and limits on shared memory Khaled Ibrahim
2010-03-03 22:09       ` Cam Macdonell
2010-03-03 22:38         ` Khaled Ibrahim
2010-03-04  4:52           ` Cam Macdonell
2010-03-04 20:12             ` Khaled Ibrahim
2010-03-04 20:53               ` Cam Macdonell
2010-03-01 18:20   ` [PATCH] Driver to support shared memory device with inerrupts Cam Macdonell
