xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 0/2] qemu-qdisk: Implementation of grant copy operation.
@ 2016-06-22  8:38 Paulina Szubarczyk
  2016-06-22  8:38 ` [PATCH v3 1/2] Interface for grant copy operation in libs Paulina Szubarczyk
  2016-06-22  8:38 ` [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation Paulina Szubarczyk
  0 siblings, 2 replies; 27+ messages in thread
From: Paulina Szubarczyk @ 2016-06-22  8:38 UTC (permalink / raw)
  To: xen-devel, roger.pau
  Cc: sstabellini, wei.liu2, Paulina Szubarczyk, ian.jackson,
	P.Gawkowski, dvrabel, david.vrabel, anthony.perard

Hi,

It is a proposition for implementation of grant copy operation in qemu-qdisk and 
interface in libxc/libs. 

Changes since v2:
Interface:
- dropped the changes in libxc/include/xenctrl_compat
- changed the MINOR version in Makefile
- replaced 'return -1' -> 'abort()'in libs/gnttab/gnttab_unimp.c
- moved the struct 'xengnttab_copy_grant_segment' to 
  libs/gnttab/include/xengnttab.h
- added explicit assingment to ioctl_gntdev_grant_copy_segment 
  to the linux part

qemu-qdisk:
- to use the xengnttab_* function directly added -lxengnttab to configure
  and include <xengnttab.h> in include/hw/xen/xen_common.h
- in ioreq_copy removed an out path, changed a log level, made explicit 
  assignement to 'xengnttab_copy_grant_segment'
* I did not change the way of testing if grant_copy operation is implemented.
  As far as I understand if the code from gnttab_unimp.c is used then the gnttab 
  device is unavailable and the handler to gntdev would be invalid. But 
  if the handler is valid then the ioctl should return operation unimplemented 
  if the gntdev does not implement the operation.


Changes since v1:
Interface:
- changed the interface to call grant copy operation to match ioctl
	int xengnttab_grant_copy(xengnttab_handle *xgt,
                         	 uint32_t count,
                         	 xengnttab_grant_copy_segment_t* segs)

- added a struct 'xengnttab_copy_grant_segment' definition to tools/libs	
  /gnttab/private.h, tools/libxc/include/xenctrl_compat.h

- changed the function 'osdep_gnttab_grant_copy' which right now just
  call the ioctl

- added a new VER1.1 to tools/libs/gnttab/libxengnttab.map 

qemu-qdisk:
- removed the 'ioreq_write','ioreq_read_init','ioreq_read' functions 
- implemented 'ioreq_init_copy_buffers', 'ioreq_copy' 
- reverted the removal of grant map and introduced conditional invoking
  grant copy or grant map
- resigned from caching the local buffers on behalf of allocating the 
  required amount of pages at once. The cached structure would require 
  to have an lock guard and I suppose that the performance improvement 
  would degraded. 
 

For the functional test I attached the device with a qdisk backend to the guest, 
mounted, performed some reads and writes.

I run fio tests[1] with different iodepth and size of the block. The test can be 
accessed on my github[2] but mainly after the warm up I run for 60 seconds:
    fio --time_based \
		--clocksource=clock_gettime \
		--rw=randread \
		--random_distribution=pareto:0.9 \
		--size=10g \
	    --direct='1' \
	    --ioengine=libaio \
		--filename=$DEV \
		--iodepth=$IODEPTH \
		--bs=$BS \
		--name=$NAME \
		--runtime=$RUNTIME >> $FILENAME
The test were repeated at least three times. 

[1] https://docs.google.com/spreadsheets/d/1E6AMiB8ceJpExL6jWpH9u2yy6DZxzhmDUyFf-eUuJ0c/edit?usp=sharing

[2] https://github.com/paulina-szubarczyk/xen-benchmark
    - multitest_with_iodepth.sh


Thanks and regards, 
Paulina

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22  8:38 [PATCH v3 0/2] qemu-qdisk: Implementation of grant copy operation Paulina Szubarczyk
@ 2016-06-22  8:38 ` Paulina Szubarczyk
  2016-06-22  9:37   ` David Vrabel
  2016-07-08 13:18   ` Wei Liu
  2016-06-22  8:38 ` [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation Paulina Szubarczyk
  1 sibling, 2 replies; 27+ messages in thread
From: Paulina Szubarczyk @ 2016-06-22  8:38 UTC (permalink / raw)
  To: xen-devel, roger.pau
  Cc: sstabellini, wei.liu2, Paulina Szubarczyk, ian.jackson,
	david.vrabel, anthony.perard

In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
system call is invoked. In mini-os the operation is yet not
implemented. For other OSs there is a dummy implementation.

Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com>
---
Changes since v2:
- dropped the changes in libxc/include/xenctrl_compat
- changed the MINOR version in Makefile
- replaced 'return -1' -> 'abort()'in libs/gnttab/gnttab_unimp.c
- moved the struct 'xengnttab_copy_grant_segment' to 
  libs/gnttab/include/xengnttab.h 
- added explicit assingment to ioctl_gntdev_grant_copy_segment 
  to the linux part

 tools/include/xen-sys/Linux/gntdev.h  | 21 ++++++++++++++++
 tools/libs/gnttab/Makefile            |  2 +-
 tools/libs/gnttab/gnttab_core.c       |  6 +++++
 tools/libs/gnttab/gnttab_unimp.c      |  6 +++++
 tools/libs/gnttab/include/xengnttab.h | 28 ++++++++++++++++++++++
 tools/libs/gnttab/libxengnttab.map    |  5 ++++
 tools/libs/gnttab/linux.c             | 45 +++++++++++++++++++++++++++++++++++
 tools/libs/gnttab/minios.c            |  6 +++++
 tools/libs/gnttab/private.h           |  4 ++++
 9 files changed, 122 insertions(+), 1 deletion(-)

diff --git a/tools/include/xen-sys/Linux/gntdev.h b/tools/include/xen-sys/Linux/gntdev.h
index caf6fb4..0ca07c9 100644
--- a/tools/include/xen-sys/Linux/gntdev.h
+++ b/tools/include/xen-sys/Linux/gntdev.h
@@ -147,4 +147,25 @@ struct ioctl_gntdev_unmap_notify {
 /* Send an interrupt on the indicated event channel */
 #define UNMAP_NOTIFY_SEND_EVENT 0x2
 
+struct ioctl_gntdev_grant_copy_segment {
+    union {
+        void *virt;
+        struct {
+            uint32_t ref;
+            uint16_t offset;
+            uint16_t domid;
+        } foreign;
+    } source, dest;
+    uint16_t len;
+    uint16_t flags;
+    int16_t status;
+};
+
+#define IOCTL_GNTDEV_GRANT_COPY \
+_IOC(_IOC_NONE, 'G', 8, sizeof(struct ioctl_gntdev_grant_copy))
+struct ioctl_gntdev_grant_copy {
+    unsigned int count;
+    struct ioctl_gntdev_grant_copy_segment *segments;
+};
+
 #endif /* __LINUX_PUBLIC_GNTDEV_H__ */
diff --git a/tools/libs/gnttab/Makefile b/tools/libs/gnttab/Makefile
index af64542..95c2cd8 100644
--- a/tools/libs/gnttab/Makefile
+++ b/tools/libs/gnttab/Makefile
@@ -2,7 +2,7 @@ XEN_ROOT = $(CURDIR)/../../..
 include $(XEN_ROOT)/tools/Rules.mk
 
 MAJOR    = 1
-MINOR    = 0
+MINOR    = 1
 SHLIB_LDFLAGS += -Wl,--version-script=libxengnttab.map
 
 CFLAGS   += -Werror -Wmissing-prototypes
diff --git a/tools/libs/gnttab/gnttab_core.c b/tools/libs/gnttab/gnttab_core.c
index 5d0474d..968c833 100644
--- a/tools/libs/gnttab/gnttab_core.c
+++ b/tools/libs/gnttab/gnttab_core.c
@@ -113,6 +113,12 @@ int xengnttab_unmap(xengnttab_handle *xgt, void *start_address, uint32_t count)
     return osdep_gnttab_unmap(xgt, start_address, count);
 }
 
+int xengnttab_grant_copy(xengnttab_handle *xgt,
+                         uint32_t count,
+                         xengnttab_grant_copy_segment_t *segs)
+{
+    return osdep_gnttab_grant_copy(xgt, count, segs);
+}
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/gnttab/gnttab_unimp.c b/tools/libs/gnttab/gnttab_unimp.c
index b3a4a20..829eced 100644
--- a/tools/libs/gnttab/gnttab_unimp.c
+++ b/tools/libs/gnttab/gnttab_unimp.c
@@ -78,6 +78,12 @@ int xengnttab_unmap(xengnttab_handle *xgt, void *start_address, uint32_t count)
     abort();
 }
 
+int xengnttab_copy_grant(xengnttab_handle *xgt,
+                         uint32_t count,
+                         xengnttab_copy_grant_segment_t *segs)
+{
+    abort();
+}
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/gnttab/include/xengnttab.h b/tools/libs/gnttab/include/xengnttab.h
index 0431dcf..949fd9e 100644
--- a/tools/libs/gnttab/include/xengnttab.h
+++ b/tools/libs/gnttab/include/xengnttab.h
@@ -258,6 +258,34 @@ int xengnttab_unmap(xengnttab_handle *xgt, void *start_address, uint32_t count);
 int xengnttab_set_max_grants(xengnttab_handle *xgt,
                              uint32_t nr_grants);
 
+struct xengnttab_grant_copy_segment {
+    union xengnttab_copy_ptr {
+        void *virt;
+        struct {
+            uint32_t ref;
+            uint16_t offset;
+            uint16_t domid;
+        } foreign;
+    } source, dest;
+    uint16_t len;
+    uint16_t flags;
+    int16_t status;
+};
+
+typedef struct xengnttab_grant_copy_segment xengnttab_grant_copy_segment_t;
+
+/**
+ * Copy memory from or to grant references. The information of each operations
+ * are contained in 'xengnttab_grant_copy_segment_t'. The @flag value indicate
+ * the direction of an operation (GNTCOPY_source_gref\GNTCOPY_dest_gref).
+ *
+ * The sum of fields @offset[i] and @len[i] of 'xengnttab_grant_copy_segment_t'
+ * should not exceed XEN_PAGE_SIZE
+ */
+int xengnttab_grant_copy(xengnttab_handle *xgt,
+                         uint32_t count,
+                         xengnttab_grant_copy_segment_t *segs);
+
 /*
  * Grant Sharing Interface (allocating and granting pages to others)
  */
diff --git a/tools/libs/gnttab/libxengnttab.map b/tools/libs/gnttab/libxengnttab.map
index dc737ac..f78da22 100644
--- a/tools/libs/gnttab/libxengnttab.map
+++ b/tools/libs/gnttab/libxengnttab.map
@@ -21,3 +21,8 @@ VERS_1.0 {
 		xengntshr_unshare;
 	local: *; /* Do not expose anything by default */
 };
+
+VERS_1.1 {
+    global:
+        xengnttab_grant_copy;
+} VERS_1.0;
diff --git a/tools/libs/gnttab/linux.c b/tools/libs/gnttab/linux.c
index 7b0fba4..62ad7bd 100644
--- a/tools/libs/gnttab/linux.c
+++ b/tools/libs/gnttab/linux.c
@@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
     return 0;
 }
 
+int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
+                            uint32_t count,
+                            xengnttab_grant_copy_segment_t *segs)
+{
+    int i, rc;
+    int fd = xgt->fd;
+    struct ioctl_gntdev_grant_copy copy;
+
+    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
+    copy.count = count;
+    for (i = 0; i < count; i++)
+    {
+        copy.segments[i].flags = segs[i].flags;
+        copy.segments[i].len = segs[i].len;
+        if (segs[i].flags == GNTCOPY_dest_gref) 
+        {
+            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
+            copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
+            copy.segments[i].dest.foreign.offset = segs[i].dest.foreign.offset;
+            copy.segments[i].source.virt = segs[i].source.virt;
+        } 
+        else 
+        {
+            copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
+            copy.segments[i].source.foreign.domid = segs[i].source.foreign.domid;
+            copy.segments[i].source.foreign.offset = segs[i].source.foreign.offset;
+            copy.segments[i].dest.virt = segs[i].dest.virt;
+        }
+    }
+
+    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
+    if (rc) 
+    {
+        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
+    }
+    else 
+    {
+        for (i = 0; i < count; i++)
+            segs[i].status = copy.segments[i].status;
+    }
+
+    free(copy.segments);
+    return rc;
+}
+
 int osdep_gntshr_open(xengntshr_handle *xgs)
 {
     int fd = open(DEVXEN "gntalloc", O_RDWR);
diff --git a/tools/libs/gnttab/minios.c b/tools/libs/gnttab/minios.c
index 7e04174..0951bc9 100644
--- a/tools/libs/gnttab/minios.c
+++ b/tools/libs/gnttab/minios.c
@@ -106,6 +106,12 @@ int osdep_gnttab_set_max_grants(xengnttab_handle *xgt, uint32_t count)
     return ret;
 }
 
+int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
+                            uint32_t count,
+                            xengnttab_grant_copy_segment_t *segs)
+{
+    return -1;
+}
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/gnttab/private.h b/tools/libs/gnttab/private.h
index d286c86..d6c5594 100644
--- a/tools/libs/gnttab/private.h
+++ b/tools/libs/gnttab/private.h
@@ -23,6 +23,10 @@ void *osdep_gnttab_grant_map(xengnttab_handle *xgt,
 int osdep_gnttab_unmap(xengnttab_handle *xgt,
                        void *start_address,
                        uint32_t count);
+int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
+                            uint32_t count,
+                            xengnttab_grant_copy_segment_t *segs);
+
 int osdep_gntshr_open(xengntshr_handle *xgs);
 int osdep_gntshr_close(xengntshr_handle *xgs);
 
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation
  2016-06-22  8:38 [PATCH v3 0/2] qemu-qdisk: Implementation of grant copy operation Paulina Szubarczyk
  2016-06-22  8:38 ` [PATCH v3 1/2] Interface for grant copy operation in libs Paulina Szubarczyk
@ 2016-06-22  8:38 ` Paulina Szubarczyk
  2016-07-13 12:34   ` Paulina Szubarczyk
                     ` (3 more replies)
  1 sibling, 4 replies; 27+ messages in thread
From: Paulina Szubarczyk @ 2016-06-22  8:38 UTC (permalink / raw)
  To: xen-devel, roger.pau
  Cc: sstabellini, wei.liu2, Paulina Szubarczyk, ian.jackson,
	david.vrabel, anthony.perard

Copy data operated on during request from/to local buffers to/from
the grant references.

Before grant copy operation local buffers must be allocated what is
done by calling ioreq_init_copy_buffers. For the 'read' operation,
first, the qemu device invokes the read operation on local buffers
and on the completion grant copy is called and buffers are freed.
For the 'write' operation grant copy is performed before invoking
write by qemu device.

A new value 'feature_grant_copy' is added to recognize when the
grant copy operation is supported by a guest.
The body of the function 'ioreq_runio_qemu_aio' is moved to
'ioreq_runio_qemu_aio_blk' and in the 'ioreq_runio_qemu_aio' depending
on the support for grant copy according checks, initialization, grant
operation are made, then the 'ioreq_runio_qemu_aio_blk' function is
called.

Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com>
---
Changes since v2:
- to use the xengnttab_* function directly added -lxengnttab to configure
  and include <xengnttab.h> in include/hw/xen/xen_common.h
- in ioreq_copy removed an out path, changed a log level, made explicit 
  assignement to 'xengnttab_copy_grant_segment'
* I did not change the way of testing if grant_copy operation is implemented.
  As far as I understand if the code from gnttab_unimp.c is used then the gnttab 
  device is unavailable and the handler to gntdev would be invalid. But 
  if the handler is valid then the ioctl should return operation unimplemented 
  if the gntdev does not implement the operation.

 configure                   |   2 +-
 hw/block/xen_disk.c         | 171 ++++++++++++++++++++++++++++++++++++++++----
 include/hw/xen/xen_common.h |   2 +
 3 files changed, 162 insertions(+), 13 deletions(-)

diff --git a/configure b/configure
index e41876a..355d3fa 100755
--- a/configure
+++ b/configure
@@ -1843,7 +1843,7 @@ fi
 # xen probe
 
 if test "$xen" != "no" ; then
-  xen_libs="-lxenstore -lxenctrl -lxenguest"
+  xen_libs="-lxenstore -lxenctrl -lxenguest -lxengnttab"
 
   # First we test whether Xen headers and libraries are available.
   # If no, we are done and there is no Xen support.
diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 37e14d1..4eca06a 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -131,6 +131,9 @@ struct XenBlkDev {
     unsigned int        persistent_gnt_count;
     unsigned int        max_grants;
 
+    /* Grant copy */
+    gboolean            feature_grant_copy;
+
     /* qemu block driver */
     DriveInfo           *dinfo;
     BlockBackend        *blk;
@@ -500,6 +503,99 @@ static int ioreq_map(struct ioreq *ioreq)
     return 0;
 }
 
+static void* get_buffer(int count)
+{
+    return xc_memalign(xen_xc, XC_PAGE_SIZE, count*XC_PAGE_SIZE);
+}
+
+static void free_buffers(struct ioreq *ioreq)
+{
+    int i;
+
+    for (i = 0; i < ioreq->v.niov; i++) {
+        ioreq->page[i] = NULL;
+    }
+
+    free(ioreq->pages);
+}
+
+static int ioreq_init_copy_buffers(struct ioreq *ioreq) {
+    int i;
+
+    if (ioreq->v.niov == 0) {
+        return 0;
+    }
+
+    ioreq->pages = get_buffer(ioreq->v.niov);
+    if (!ioreq->pages) {
+        return -1;
+    }
+
+    for (i = 0; i < ioreq->v.niov; i++) {
+        ioreq->page[i] = ioreq->pages + i*XC_PAGE_SIZE;
+        ioreq->v.iov[i].iov_base += (uintptr_t)ioreq->page[i];
+    }
+
+    return 0;
+}
+
+static int ioreq_copy(struct ioreq *ioreq)
+{
+    XenGnttab gnt = ioreq->blkdev->xendev.gnttabdev;
+    xengnttab_grant_copy_segment_t segs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    int i, count = 0, r, rc;
+    int64_t file_blk = ioreq->blkdev->file_blk;
+
+    if (ioreq->v.niov == 0) {
+        return 0;
+    }
+
+    count = ioreq->v.niov;
+
+    for (i = 0; i < count; i++) {
+
+        if (ioreq->req.operation == BLKIF_OP_READ) {
+            segs[i].flags = GNTCOPY_dest_gref;
+            segs[i].dest.foreign.ref = ioreq->refs[i];
+            segs[i].dest.foreign.domid = ioreq->domids[i];
+            segs[i].dest.foreign.offset = ioreq->req.seg[i].first_sect * file_blk;
+            segs[i].source.virt = ioreq->v.iov[i].iov_base;
+        } else {
+            segs[i].flags = GNTCOPY_source_gref;
+            segs[i].source.foreign.ref = ioreq->refs[i];
+            segs[i].source.foreign.domid = ioreq->domids[i];
+            segs[i].source.foreign.offset = ioreq->req.seg[i].first_sect * file_blk;
+            segs[i].dest.virt = ioreq->v.iov[i].iov_base;
+        }
+        segs[i].len = (ioreq->req.seg[i].last_sect
+                       - ioreq->req.seg[i].first_sect + 1) * file_blk;
+
+    }
+
+    rc = xengnttab_grant_copy(gnt, count, segs);
+
+    if (rc) {
+        xen_be_printf(&ioreq->blkdev->xendev, 0,
+                      "failed to copy data %d \n", rc);
+        ioreq->aio_errors++;
+        return -1;
+    } else {
+        r = 0;
+    }
+
+    for (i = 0; i < count; i++) {
+        if (segs[i].status != GNTST_okay) {
+            xen_be_printf(&ioreq->blkdev->xendev, 3,
+                          "failed to copy data %d for gref %d, domid %d\n", rc,
+                          ioreq->refs[i], ioreq->domids[i]);
+            ioreq->aio_errors++;
+            r = -1;
+        }
+    }
+
+    return r;
+}
+
 static int ioreq_runio_qemu_aio(struct ioreq *ioreq);
 
 static void qemu_aio_complete(void *opaque, int ret)
@@ -528,8 +624,31 @@ static void qemu_aio_complete(void *opaque, int ret)
         return;
     }
 
+    if (ioreq->blkdev->feature_grant_copy) {
+        switch (ioreq->req.operation) {
+        case BLKIF_OP_READ:
+            /* in case of failure ioreq->aio_errors is increased */
+            ioreq_copy(ioreq);
+            free_buffers(ioreq);
+            break;
+        case BLKIF_OP_WRITE:
+        case BLKIF_OP_FLUSH_DISKCACHE:
+            if (!ioreq->req.nr_segments) {
+                break;
+            }
+            free_buffers(ioreq);
+            break;
+        default:
+            break;
+        }
+    }
+
     ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY;
-    ioreq_unmap(ioreq);
+
+    if (!ioreq->blkdev->feature_grant_copy) {
+        ioreq_unmap(ioreq);
+    }
+
     ioreq_finish(ioreq);
     switch (ioreq->req.operation) {
     case BLKIF_OP_WRITE:
@@ -547,14 +666,42 @@ static void qemu_aio_complete(void *opaque, int ret)
     qemu_bh_schedule(ioreq->blkdev->bh);
 }
 
+static int ioreq_runio_qemu_aio_blk(struct ioreq *ioreq);
+
 static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
 {
-    struct XenBlkDev *blkdev = ioreq->blkdev;
+    if (ioreq->blkdev->feature_grant_copy) {
+
+        ioreq_init_copy_buffers(ioreq);
+        if (ioreq->req.nr_segments && (ioreq->req.operation == BLKIF_OP_WRITE ||
+            ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE)) {
+            if (ioreq_copy(ioreq)) {
+                free_buffers(ioreq);
+                goto err;
+            }
+        }
+        if (ioreq_runio_qemu_aio_blk(ioreq)) goto err;
+
+    } else {
 
-    if (ioreq->req.nr_segments && ioreq_map(ioreq) == -1) {
-        goto err_no_map;
+        if (ioreq->req.nr_segments && ioreq_map(ioreq)) goto err;
+        if (ioreq_runio_qemu_aio_blk(ioreq)) {
+            ioreq_unmap(ioreq);
+            goto err;
+        }
     }
 
+    return 0;
+err:
+    ioreq_finish(ioreq);
+    ioreq->status = BLKIF_RSP_ERROR;
+    return -1;
+}
+
+static int ioreq_runio_qemu_aio_blk(struct ioreq *ioreq)
+{
+    struct XenBlkDev *blkdev = ioreq->blkdev;
+
     ioreq->aio_inflight++;
     if (ioreq->presync) {
         blk_aio_flush(ioreq->blkdev->blk, qemu_aio_complete, ioreq);
@@ -594,19 +741,12 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
     }
     default:
         /* unknown operation (shouldn't happen -- parse catches this) */
-        goto err;
+        return -1;
     }
 
     qemu_aio_complete(ioreq, 0);
 
     return 0;
-
-err:
-    ioreq_unmap(ioreq);
-err_no_map:
-    ioreq_finish(ioreq);
-    ioreq->status = BLKIF_RSP_ERROR;
-    return -1;
 }
 
 static int blk_send_response_one(struct ioreq *ioreq)
@@ -1020,10 +1160,17 @@ static int blk_connect(struct XenDevice *xendev)
 
     xen_be_bind_evtchn(&blkdev->xendev);
 
+    blkdev->feature_grant_copy =
+                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
+
+    xen_be_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
+                  blkdev->feature_grant_copy ? "enabled" : "disabled");
+
     xen_be_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
                   "remote port %d, local port %d\n",
                   blkdev->xendev.protocol, blkdev->ring_ref,
                   blkdev->xendev.remote_port, blkdev->xendev.local_port);
+
     return 0;
 }
 
diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
index 4ac0c6f..bed5307 100644
--- a/include/hw/xen/xen_common.h
+++ b/include/hw/xen/xen_common.h
@@ -14,6 +14,8 @@
 #endif
 #include <xen/io/xenbus.h>
 
+#include <xengnttab.h>
+
 #include "hw/hw.h"
 #include "hw/xen/xen.h"
 #include "hw/pci/pci.h"
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22  8:38 ` [PATCH v3 1/2] Interface for grant copy operation in libs Paulina Szubarczyk
@ 2016-06-22  9:37   ` David Vrabel
  2016-06-22  9:53     ` Paulina Szubarczyk
  2016-06-22 11:21     ` Wei Liu
  2016-07-08 13:18   ` Wei Liu
  1 sibling, 2 replies; 27+ messages in thread
From: David Vrabel @ 2016-06-22  9:37 UTC (permalink / raw)
  To: Paulina Szubarczyk, xen-devel, roger.pau
  Cc: anthony.perard, ian.jackson, sstabellini, wei.liu2

On 22/06/16 09:38, Paulina Szubarczyk wrote:
> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> system call is invoked. In mini-os the operation is yet not
> implemented. For other OSs there is a dummy implementation.
[...]
> --- a/tools/libs/gnttab/linux.c
> +++ b/tools/libs/gnttab/linux.c
> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
>      return 0;
>  }
>  
> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> +                            uint32_t count,
> +                            xengnttab_grant_copy_segment_t *segs)
> +{
> +    int i, rc;
> +    int fd = xgt->fd;
> +    struct ioctl_gntdev_grant_copy copy;
> +
> +    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
> +    copy.count = count;
> +    for (i = 0; i < count; i++)
> +    {
> +        copy.segments[i].flags = segs[i].flags;
> +        copy.segments[i].len = segs[i].len;
> +        if (segs[i].flags == GNTCOPY_dest_gref) 
> +        {
> +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> +            copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
> +            copy.segments[i].dest.foreign.offset = segs[i].dest.foreign.offset;
> +            copy.segments[i].source.virt = segs[i].source.virt;
> +        } 
> +        else 
> +        {
> +            copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
> +            copy.segments[i].source.foreign.domid = segs[i].source.foreign.domid;
> +            copy.segments[i].source.foreign.offset = segs[i].source.foreign.offset;
> +            copy.segments[i].dest.virt = segs[i].dest.virt;
> +        }
> +    }
> +
> +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> +    if (rc) 
> +    {
> +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> +    }
> +    else 
> +    {
> +        for (i = 0; i < count; i++)
> +            segs[i].status = copy.segments[i].status;
> +    }
> +
> +    free(copy.segments);
> +    return rc;
> +}

I know Wei asked for this but you've replaced what should be a single
pointer assignment with a memory allocation and two loops over all the
segments.

This is a hot path and the two structures (the libxengnttab one and the
Linux kernel one) are both part of their respective ABIs and won't
change so Wei's concern that they might change in the future is unfounded.

This change makes xengnttab_grant_copy() useless for our (XenServer's)
use case.

David

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22  9:37   ` David Vrabel
@ 2016-06-22  9:53     ` Paulina Szubarczyk
  2016-06-22 11:24       ` Wei Liu
  2016-06-22 11:21     ` Wei Liu
  1 sibling, 1 reply; 27+ messages in thread
From: Paulina Szubarczyk @ 2016-06-22  9:53 UTC (permalink / raw)
  To: David Vrabel
  Cc: sstabellini, wei.liu2, ian.jackson, anthony.perard, xen-devel, roger.pau

On Wed, 22 Jun 2016 10:37:24 +0100
David Vrabel <david.vrabel@citrix.com> wrote:

> On 22/06/16 09:38, Paulina Szubarczyk wrote:
> > In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> > system call is invoked. In mini-os the operation is yet not
> > implemented. For other OSs there is a dummy implementation.
> [...]
> > --- a/tools/libs/gnttab/linux.c
> > +++ b/tools/libs/gnttab/linux.c
> > @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> >      return 0;
> >  }
> >  
> > +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> > +                            uint32_t count,
> > +                            xengnttab_grant_copy_segment_t *segs)
> > +{
> > +    int i, rc;
> > +    int fd = xgt->fd;
> > +    struct ioctl_gntdev_grant_copy copy;
> > +
> > +    copy.segments = calloc(count, sizeof(struct
> > ioctl_gntdev_grant_copy_segment));
> > +    copy.count = count;
> > +    for (i = 0; i < count; i++)
> > +    {
> > +        copy.segments[i].flags = segs[i].flags;
> > +        copy.segments[i].len = segs[i].len;
> > +        if (segs[i].flags == GNTCOPY_dest_gref) 
> > +        {
> > +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> > +            copy.segments[i].dest.foreign.domid =
> > segs[i].dest.foreign.domid;
> > +            copy.segments[i].dest.foreign.offset =
> > segs[i].dest.foreign.offset;
> > +            copy.segments[i].source.virt = segs[i].source.virt;
> > +        } 
> > +        else 
> > +        {
> > +            copy.segments[i].source.foreign.ref =
> > segs[i].source.foreign.ref;
> > +            copy.segments[i].source.foreign.domid =
> > segs[i].source.foreign.domid;
> > +            copy.segments[i].source.foreign.offset =
> > segs[i].source.foreign.offset;
> > +            copy.segments[i].dest.virt = segs[i].dest.virt;
> > +        }
> > +    }
> > +
> > +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> > +    if (rc) 
> > +    {
> > +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> > +    }
> > +    else 
> > +    {
> > +        for (i = 0; i < count; i++)
> > +            segs[i].status = copy.segments[i].status;
> > +    }
> > +
> > +    free(copy.segments);
> > +    return rc;
> > +}
> 
> I know Wei asked for this but you've replaced what should be a single
> pointer assignment with a memory allocation and two loops over all the
> segments.
> 
> This is a hot path and the two structures (the libxengnttab one and the
> Linux kernel one) are both part of their respective ABIs and won't
> change so Wei's concern that they might change in the future is unfounded.
> 
> This change makes xengnttab_grant_copy() useless for our (XenServer's)
> use case.
> 
> David

As Wei and Ian are maintainers of toolstack if they agree on the previous cast
that was here I will revert the changes.

Paulina

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22  9:37   ` David Vrabel
  2016-06-22  9:53     ` Paulina Szubarczyk
@ 2016-06-22 11:21     ` Wei Liu
  2016-06-22 12:37       ` David Vrabel
  1 sibling, 1 reply; 27+ messages in thread
From: Wei Liu @ 2016-06-22 11:21 UTC (permalink / raw)
  To: David Vrabel
  Cc: sstabellini, wei.liu2, Paulina Szubarczyk, ian.jackson,
	anthony.perard, xen-devel, roger.pau

On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
> On 22/06/16 09:38, Paulina Szubarczyk wrote:
> > In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> > system call is invoked. In mini-os the operation is yet not
> > implemented. For other OSs there is a dummy implementation.
> [...]
> > --- a/tools/libs/gnttab/linux.c
> > +++ b/tools/libs/gnttab/linux.c
> > @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> >      return 0;
> >  }
> >  
> > +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> > +                            uint32_t count,
> > +                            xengnttab_grant_copy_segment_t *segs)
> > +{
> > +    int i, rc;
> > +    int fd = xgt->fd;
> > +    struct ioctl_gntdev_grant_copy copy;
> > +
> > +    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
> > +    copy.count = count;
> > +    for (i = 0; i < count; i++)
> > +    {
> > +        copy.segments[i].flags = segs[i].flags;
> > +        copy.segments[i].len = segs[i].len;
> > +        if (segs[i].flags == GNTCOPY_dest_gref) 
> > +        {
> > +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> > +            copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
> > +            copy.segments[i].dest.foreign.offset = segs[i].dest.foreign.offset;
> > +            copy.segments[i].source.virt = segs[i].source.virt;
> > +        } 
> > +        else 
> > +        {
> > +            copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
> > +            copy.segments[i].source.foreign.domid = segs[i].source.foreign.domid;
> > +            copy.segments[i].source.foreign.offset = segs[i].source.foreign.offset;
> > +            copy.segments[i].dest.virt = segs[i].dest.virt;
> > +        }
> > +    }
> > +
> > +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> > +    if (rc) 
> > +    {
> > +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> > +    }
> > +    else 
> > +    {
> > +        for (i = 0; i < count; i++)
> > +            segs[i].status = copy.segments[i].status;
> > +    }
> > +
> > +    free(copy.segments);
> > +    return rc;
> > +}
> 
> I know Wei asked for this but you've replaced what should be a single
> pointer assignment with a memory allocation and two loops over all the
> segments.
> 
> This is a hot path and the two structures (the libxengnttab one and the
> Linux kernel one) are both part of their respective ABIs and won't
> change so Wei's concern that they might change in the future is unfounded.
> 

The fundamental question is: will the ABI between the library and the
kernel ever go mismatch?

My answer is "maybe".  My rationale is that everything goes across
boundary of components need to be considered with caution. And I tend to
assume the worst things will happen.

To guarantee that they will never go mismatch is to have

   typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;

But that's not how the code is written.

I would like to hear a third opinion. Is my concern unfounded? Am I too
cautious? Is there any compelling argument that I missed?

Somewhat related, can we have some numbers please? It could well be the
cost of the two loops is much cheaper than whatever is going on inside
the kernel / hypervisor. And it could turn out that the numbers render
this issue moot.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22  9:53     ` Paulina Szubarczyk
@ 2016-06-22 11:24       ` Wei Liu
  2016-06-22 14:19         ` Paulina Szubarczyk
  0 siblings, 1 reply; 27+ messages in thread
From: Wei Liu @ 2016-06-22 11:24 UTC (permalink / raw)
  To: Paulina Szubarczyk
  Cc: sstabellini, wei.liu2, ian.jackson, David Vrabel, anthony.perard,
	xen-devel, roger.pau

On Wed, Jun 22, 2016 at 11:53:00AM +0200, Paulina Szubarczyk wrote:
[...]
> > I know Wei asked for this but you've replaced what should be a single
> > pointer assignment with a memory allocation and two loops over all the
> > segments.
> > 
> > This is a hot path and the two structures (the libxengnttab one and the
> > Linux kernel one) are both part of their respective ABIs and won't
> > change so Wei's concern that they might change in the future is unfounded.
> > 
> > This change makes xengnttab_grant_copy() useless for our (XenServer's)
> > use case.
> > 
> > David
> 
> As Wei and Ian are maintainers of toolstack if they agree on the previous cast
> that was here I will revert the changes.
> 

Do you have the most up to date numbers? How do they compare to the
numbers in previous version? If there is degradation, how big is that in
terms of percentage?

Wei.

> Paulina

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22 11:21     ` Wei Liu
@ 2016-06-22 12:37       ` David Vrabel
  2016-06-22 13:29         ` Wei Liu
  0 siblings, 1 reply; 27+ messages in thread
From: David Vrabel @ 2016-06-22 12:37 UTC (permalink / raw)
  To: Wei Liu, David Vrabel
  Cc: sstabellini, Paulina Szubarczyk, ian.jackson, anthony.perard,
	xen-devel, roger.pau

On 22/06/16 12:21, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
>> On 22/06/16 09:38, Paulina Szubarczyk wrote:
>>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
>>> system call is invoked. In mini-os the operation is yet not
>>> implemented. For other OSs there is a dummy implementation.
>> [...]
>>> --- a/tools/libs/gnttab/linux.c
>>> +++ b/tools/libs/gnttab/linux.c
>>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
>>>      return 0;
>>>  }
>>>  
>>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
>>> +                            uint32_t count,
>>> +                            xengnttab_grant_copy_segment_t *segs)
>>> +{
>>> +    int i, rc;
>>> +    int fd = xgt->fd;
>>> +    struct ioctl_gntdev_grant_copy copy;
>>> +
>>> +    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
>>> +    copy.count = count;
>>> +    for (i = 0; i < count; i++)
>>> +    {
>>> +        copy.segments[i].flags = segs[i].flags;
>>> +        copy.segments[i].len = segs[i].len;
>>> +        if (segs[i].flags == GNTCOPY_dest_gref) 
>>> +        {
>>> +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
>>> +            copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
>>> +            copy.segments[i].dest.foreign.offset = segs[i].dest.foreign.offset;
>>> +            copy.segments[i].source.virt = segs[i].source.virt;
>>> +        } 
>>> +        else 
>>> +        {
>>> +            copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
>>> +            copy.segments[i].source.foreign.domid = segs[i].source.foreign.domid;
>>> +            copy.segments[i].source.foreign.offset = segs[i].source.foreign.offset;
>>> +            copy.segments[i].dest.virt = segs[i].dest.virt;
>>> +        }
>>> +    }
>>> +
>>> +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
>>> +    if (rc) 
>>> +    {
>>> +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
>>> +    }
>>> +    else 
>>> +    {
>>> +        for (i = 0; i < count; i++)
>>> +            segs[i].status = copy.segments[i].status;
>>> +    }
>>> +
>>> +    free(copy.segments);
>>> +    return rc;
>>> +}
>>
>> I know Wei asked for this but you've replaced what should be a single
>> pointer assignment with a memory allocation and two loops over all the
>> segments.
>>
>> This is a hot path and the two structures (the libxengnttab one and the
>> Linux kernel one) are both part of their respective ABIs and won't
>> change so Wei's concern that they might change in the future is unfounded.
>>
> 
> The fundamental question is: will the ABI between the library and the
> kernel ever go mismatch?
> 
> My answer is "maybe".  My rationale is that everything goes across
> boundary of components need to be considered with caution. And I tend to
> assume the worst things will happen.
> 
> To guarantee that they will never go mismatch is to have
> 
>    typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
> 
> But that's not how the code is written.
> 
> I would like to hear a third opinion. Is my concern unfounded? Am I too
> cautious? Is there any compelling argument that I missed?
> 
> Somewhat related, can we have some numbers please? It could well be the
> cost of the two loops is much cheaper than whatever is going on inside
> the kernel / hypervisor. And it could turn out that the numbers render
> this issue moot.

I did some (very) adhoc measurements and with the worst case of single
short segments for each ioctl, the optimized version of
osdep_gnttab_grant_copy() looks to be ~5% faster.

This is enough of a difference that we should use the optimized version.

The unoptimized version also adds an additional failure path (the
calloc) which would be best avoided.

David

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22 12:37       ` David Vrabel
@ 2016-06-22 13:29         ` Wei Liu
  2016-06-22 13:52           ` David Vrabel
  0 siblings, 1 reply; 27+ messages in thread
From: Wei Liu @ 2016-06-22 13:29 UTC (permalink / raw)
  To: David Vrabel
  Cc: sstabellini, Wei Liu, ian.jackson, Paulina Szubarczyk,
	anthony.perard, xen-devel, roger.pau

On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
> On 22/06/16 12:21, Wei Liu wrote:
> > On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
> >> On 22/06/16 09:38, Paulina Szubarczyk wrote:
> >>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> >>> system call is invoked. In mini-os the operation is yet not
> >>> implemented. For other OSs there is a dummy implementation.
> >> [...]
> >>> --- a/tools/libs/gnttab/linux.c
> >>> +++ b/tools/libs/gnttab/linux.c
> >>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> >>>      return 0;
> >>>  }
> >>>  
> >>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> >>> +                            uint32_t count,
> >>> +                            xengnttab_grant_copy_segment_t *segs)
> >>> +{
> >>> +    int i, rc;
> >>> +    int fd = xgt->fd;
> >>> +    struct ioctl_gntdev_grant_copy copy;
> >>> +
> >>> +    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
> >>> +    copy.count = count;
> >>> +    for (i = 0; i < count; i++)
> >>> +    {
> >>> +        copy.segments[i].flags = segs[i].flags;
> >>> +        copy.segments[i].len = segs[i].len;
> >>> +        if (segs[i].flags == GNTCOPY_dest_gref) 
> >>> +        {
> >>> +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> >>> +            copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
> >>> +            copy.segments[i].dest.foreign.offset = segs[i].dest.foreign.offset;
> >>> +            copy.segments[i].source.virt = segs[i].source.virt;
> >>> +        } 
> >>> +        else 
> >>> +        {
> >>> +            copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
> >>> +            copy.segments[i].source.foreign.domid = segs[i].source.foreign.domid;
> >>> +            copy.segments[i].source.foreign.offset = segs[i].source.foreign.offset;
> >>> +            copy.segments[i].dest.virt = segs[i].dest.virt;
> >>> +        }
> >>> +    }
> >>> +
> >>> +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> >>> +    if (rc) 
> >>> +    {
> >>> +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> >>> +    }
> >>> +    else 
> >>> +    {
> >>> +        for (i = 0; i < count; i++)
> >>> +            segs[i].status = copy.segments[i].status;
> >>> +    }
> >>> +
> >>> +    free(copy.segments);
> >>> +    return rc;
> >>> +}
> >>
> >> I know Wei asked for this but you've replaced what should be a single
> >> pointer assignment with a memory allocation and two loops over all the
> >> segments.
> >>
> >> This is a hot path and the two structures (the libxengnttab one and the
> >> Linux kernel one) are both part of their respective ABIs and won't
> >> change so Wei's concern that they might change in the future is unfounded.
> >>
> > 
> > The fundamental question is: will the ABI between the library and the
> > kernel ever go mismatch?
> > 
> > My answer is "maybe".  My rationale is that everything goes across
> > boundary of components need to be considered with caution. And I tend to
> > assume the worst things will happen.
> > 
> > To guarantee that they will never go mismatch is to have
> > 
> >    typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
> > 
> > But that's not how the code is written.
> > 
> > I would like to hear a third opinion. Is my concern unfounded? Am I too
> > cautious? Is there any compelling argument that I missed?
> > 
> > Somewhat related, can we have some numbers please? It could well be the
> > cost of the two loops is much cheaper than whatever is going on inside
> > the kernel / hypervisor. And it could turn out that the numbers render
> > this issue moot.
> 
> I did some (very) adhoc measurements and with the worst case of single
> short segments for each ioctl, the optimized version of
> osdep_gnttab_grant_copy() looks to be ~5% faster.
> 
> This is enough of a difference that we should use the optimized version.
> 
> The unoptimized version also adds an additional failure path (the
> calloc) which would be best avoided.
> 

Your test case includes a lot of  noise in libc allocator, so...

Can you give try the following patch (apply on top of Paulina's patch)?
The basic idea is to provide scratch space for the structures. Note, the
patch is compile test only.

---8<---
From e72c1abb9852f40db5eeee48ef208492c3283884 Mon Sep 17 00:00:00 2001
From: Wei Liu <wei.liu2@citrix.com>
Date: Wed, 22 Jun 2016 14:22:48 +0100
Subject: [PATCH] xengnttab: provide osdep cache and use it in Linux grant copy

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
 tools/libs/gnttab/linux.c   | 35 +++++++++++++++++++++++++++++------
 tools/libs/gnttab/private.h |  2 ++
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/tools/libs/gnttab/linux.c b/tools/libs/gnttab/linux.c
index 62ad7bd..17d4d29 100644
--- a/tools/libs/gnttab/linux.c
+++ b/tools/libs/gnttab/linux.c
@@ -47,13 +47,28 @@
 #define O_CLOEXEC 0
 #endif
 
+#define COPY_SEGMENT_CACHE_SIZE 1024
+
 int osdep_gnttab_open(xengnttab_handle *xgt)
 {
-    int fd = open(DEVXEN "gntdev", O_RDWR|O_CLOEXEC);
-    if ( fd == -1 )
-        return -1;
-    xgt->fd = fd;
+    size_t s = COPY_SEGMENT_CACHE_SIZE *
+        sizeof(struct ioctl_gntdev_grant_copy_segment);
+
+    xgt->fd = open(DEVXEN "gntdev", O_RDWR|O_CLOEXEC);
+    if (xgt->fd == -1) goto err;
+
+    xgt->osdep_data = malloc(s);
+    if (!xgt->osdep_data) goto err;
+    xgt->osdep_data_size = s;
+
     return 0;
+err:
+    if (xgt->fd != -1) {
+        close(xgt->fd);
+        xgt->fd = -1;
+    }
+
+    return -1;
 }
 
 int osdep_gnttab_close(xengnttab_handle *xgt)
@@ -61,6 +76,10 @@ int osdep_gnttab_close(xengnttab_handle *xgt)
     if ( xgt->fd == -1 )
         return 0;
 
+    free(xgt->osdep_data);
+    xgt->osdep_data = NULL;
+    xgt->osdep_data_size = 0;
+
     return close(xgt->fd);
 }
 
@@ -243,7 +262,12 @@ int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
     int fd = xgt->fd;
     struct ioctl_gntdev_grant_copy copy;
 
-    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
+    if (count > COPY_SEGMENT_CACHE_SIZE) {
+        errno = E2BIG;
+        return -1;
+    }
+
+    copy.segments = xgt->osdep_data;
     copy.count = count;
     for (i = 0; i < count; i++)
     {
@@ -276,7 +300,6 @@ int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
             segs[i].status = copy.segments[i].status;
     }
 
-    free(copy.segments);
     return rc;
 }
 
diff --git a/tools/libs/gnttab/private.h b/tools/libs/gnttab/private.h
index d6c5594..e99a80d 100644
--- a/tools/libs/gnttab/private.h
+++ b/tools/libs/gnttab/private.h
@@ -7,6 +7,8 @@
 struct xengntdev_handle {
     xentoollog_logger *logger, *logger_tofree;
     int fd;
+    void *osdep_data;              /* osdep private data */
+    size_t osdep_data_size;        /* osdep private data size */
 };
 
 int osdep_gnttab_open(xengnttab_handle *xgt);
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22 13:29         ` Wei Liu
@ 2016-06-22 13:52           ` David Vrabel
  2016-06-22 14:52             ` Wei Liu
  0 siblings, 1 reply; 27+ messages in thread
From: David Vrabel @ 2016-06-22 13:52 UTC (permalink / raw)
  To: Wei Liu
  Cc: sstabellini, Paulina Szubarczyk, ian.jackson, anthony.perard,
	xen-devel, roger.pau

On 22/06/16 14:29, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
>> On 22/06/16 12:21, Wei Liu wrote:
>>> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
>>>> On 22/06/16 09:38, Paulina Szubarczyk wrote:
>>>>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
>>>>> system call is invoked. In mini-os the operation is yet not
>>>>> implemented. For other OSs there is a dummy implementation.
>>>> [...]
>>>>> --- a/tools/libs/gnttab/linux.c
>>>>> +++ b/tools/libs/gnttab/linux.c
>>>>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
>>>>>      return 0;
>>>>>  }
>>>>>  
>>>>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
>>>>> +                            uint32_t count,
>>>>> +                            xengnttab_grant_copy_segment_t *segs)
>>>>> +{
>>>>> +    int i, rc;
>>>>> +    int fd = xgt->fd;
>>>>> +    struct ioctl_gntdev_grant_copy copy;
>>>>> +
>>>>> +    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
>>>>> +    copy.count = count;
>>>>> +    for (i = 0; i < count; i++)
>>>>> +    {
>>>>> +        copy.segments[i].flags = segs[i].flags;
>>>>> +        copy.segments[i].len = segs[i].len;
>>>>> +        if (segs[i].flags == GNTCOPY_dest_gref) 
>>>>> +        {
>>>>> +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
>>>>> +            copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
>>>>> +            copy.segments[i].dest.foreign.offset = segs[i].dest.foreign.offset;
>>>>> +            copy.segments[i].source.virt = segs[i].source.virt;
>>>>> +        } 
>>>>> +        else 
>>>>> +        {
>>>>> +            copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
>>>>> +            copy.segments[i].source.foreign.domid = segs[i].source.foreign.domid;
>>>>> +            copy.segments[i].source.foreign.offset = segs[i].source.foreign.offset;
>>>>> +            copy.segments[i].dest.virt = segs[i].dest.virt;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
>>>>> +    if (rc) 
>>>>> +    {
>>>>> +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
>>>>> +    }
>>>>> +    else 
>>>>> +    {
>>>>> +        for (i = 0; i < count; i++)
>>>>> +            segs[i].status = copy.segments[i].status;
>>>>> +    }
>>>>> +
>>>>> +    free(copy.segments);
>>>>> +    return rc;
>>>>> +}
>>>>
>>>> I know Wei asked for this but you've replaced what should be a single
>>>> pointer assignment with a memory allocation and two loops over all the
>>>> segments.
>>>>
>>>> This is a hot path and the two structures (the libxengnttab one and the
>>>> Linux kernel one) are both part of their respective ABIs and won't
>>>> change so Wei's concern that they might change in the future is unfounded.
>>>>
>>>
>>> The fundamental question is: will the ABI between the library and the
>>> kernel ever go mismatch?
>>>
>>> My answer is "maybe".  My rationale is that everything goes across
>>> boundary of components need to be considered with caution. And I tend to
>>> assume the worst things will happen.
>>>
>>> To guarantee that they will never go mismatch is to have
>>>
>>>    typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
>>>
>>> But that's not how the code is written.
>>>
>>> I would like to hear a third opinion. Is my concern unfounded? Am I too
>>> cautious? Is there any compelling argument that I missed?
>>>
>>> Somewhat related, can we have some numbers please? It could well be the
>>> cost of the two loops is much cheaper than whatever is going on inside
>>> the kernel / hypervisor. And it could turn out that the numbers render
>>> this issue moot.
>>
>> I did some (very) adhoc measurements and with the worst case of single
>> short segments for each ioctl, the optimized version of
>> osdep_gnttab_grant_copy() looks to be ~5% faster.
>>
>> This is enough of a difference that we should use the optimized version.
>>
>> The unoptimized version also adds an additional failure path (the
>> calloc) which would be best avoided.
>>
> 
> Your test case includes a lot of  noise in libc allocator, so...
> 
> Can you give try the following patch (apply on top of Paulina's patch)?
> The basic idea is to provide scratch space for the structures. Note, the
> patch is compile test only.
[...]
> +#define COPY_SEGMENT_CACHE_SIZE 1024

Arbitrary limit on number of segments.

> +    copy.segments = xgt->osdep_data;

Not thread safe.

I tried using alloca() which has <1% performance penalty but the failure
mode for alloca() is really bad so I would not recommend it.

I think the best solution is to allow the osdep code to provide the
implementation of xengnttab_grant_copy_segment_t, allowing the Linux
code to do:

typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t

You should still provide the generic structure as well, for those
platforms that don't provide their own optimized version.

David

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22 11:24       ` Wei Liu
@ 2016-06-22 14:19         ` Paulina Szubarczyk
  0 siblings, 0 replies; 27+ messages in thread
From: Paulina Szubarczyk @ 2016-06-22 14:19 UTC (permalink / raw)
  To: Wei Liu
  Cc: sstabellini, ian.jackson, David Vrabel, anthony.perard,
	xen-devel, roger.pau

On Wed, 22 Jun 2016 12:24:16 +0100
Wei Liu <wei.liu2@citrix.com> wrote:

> On Wed, Jun 22, 2016 at 11:53:00AM +0200, Paulina Szubarczyk wrote:
> [...]
> > > I know Wei asked for this but you've replaced what should be a single
> > > pointer assignment with a memory allocation and two loops over all the
> > > segments.
> > > 
> > > This is a hot path and the two structures (the libxengnttab one and the
> > > Linux kernel one) are both part of their respective ABIs and won't
> > > change so Wei's concern that they might change in the future is unfounded.
> > > 
> > > This change makes xengnttab_grant_copy() useless for our (XenServer's)
> > > use case.
> > > 
> > > David
> > 
> > As Wei and Ian are maintainers of toolstack if they agree on the previous
> > cast that was here I will revert the changes.
> > 
> 
> Do you have the most up to date numbers? How do they compare to the
> numbers in previous version? If there is degradation, how big is that in
> terms of percentage?
> 
> Wei.
> 
In the file [1] in the sheets with *-domU there are the results for the new
implementation with comparisons to grant map implementation which I run
yesterday/today. I also rebase to the newest staging version before the tests
and there is some improvement for both grant map and grant copy implementation
comparing to previous results, that is way I would not compare the results from
the previous test and I am going to run the test for the implementation with
casting the structures again today. 

But single test that I made take around 90 minutes, since there is 5 min warm
up and 1 min x 14 size of blocks for iodepth in range [1, 4, 8, 64, 256]. And I
usually did them at least three times..

[1]https://docs.google.com/spreadsheets/d/1E6AMiB8ceJpExL6jWpH9u2yy6DZxzhmDUyFf-eUuJ0c/edit?usp=sharing

Paulina


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22 13:52           ` David Vrabel
@ 2016-06-22 14:52             ` Wei Liu
  2016-06-22 16:49               ` Wei Liu
  2016-07-05 16:27               ` George Dunlap
  0 siblings, 2 replies; 27+ messages in thread
From: Wei Liu @ 2016-06-22 14:52 UTC (permalink / raw)
  To: David Vrabel
  Cc: sstabellini, Wei Liu, ian.jackson, Paulina Szubarczyk,
	anthony.perard, xen-devel, roger.pau

On Wed, Jun 22, 2016 at 02:52:47PM +0100, David Vrabel wrote:
> On 22/06/16 14:29, Wei Liu wrote:
> > On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
> >> On 22/06/16 12:21, Wei Liu wrote:
> >>> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
> >>>> On 22/06/16 09:38, Paulina Szubarczyk wrote:
> >>>>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> >>>>> system call is invoked. In mini-os the operation is yet not
> >>>>> implemented. For other OSs there is a dummy implementation.
> >>>> [...]
> >>>>> --- a/tools/libs/gnttab/linux.c
> >>>>> +++ b/tools/libs/gnttab/linux.c
> >>>>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> >>>>>      return 0;
> >>>>>  }
> >>>>>  
> >>>>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> >>>>> +                            uint32_t count,
> >>>>> +                            xengnttab_grant_copy_segment_t *segs)
> >>>>> +{
> >>>>> +    int i, rc;
> >>>>> +    int fd = xgt->fd;
> >>>>> +    struct ioctl_gntdev_grant_copy copy;
> >>>>> +
> >>>>> +    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
> >>>>> +    copy.count = count;
> >>>>> +    for (i = 0; i < count; i++)
> >>>>> +    {
> >>>>> +        copy.segments[i].flags = segs[i].flags;
> >>>>> +        copy.segments[i].len = segs[i].len;
> >>>>> +        if (segs[i].flags == GNTCOPY_dest_gref) 
> >>>>> +        {
> >>>>> +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> >>>>> +            copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
> >>>>> +            copy.segments[i].dest.foreign.offset = segs[i].dest.foreign.offset;
> >>>>> +            copy.segments[i].source.virt = segs[i].source.virt;
> >>>>> +        } 
> >>>>> +        else 
> >>>>> +        {
> >>>>> +            copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
> >>>>> +            copy.segments[i].source.foreign.domid = segs[i].source.foreign.domid;
> >>>>> +            copy.segments[i].source.foreign.offset = segs[i].source.foreign.offset;
> >>>>> +            copy.segments[i].dest.virt = segs[i].dest.virt;
> >>>>> +        }
> >>>>> +    }
> >>>>> +
> >>>>> +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> >>>>> +    if (rc) 
> >>>>> +    {
> >>>>> +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> >>>>> +    }
> >>>>> +    else 
> >>>>> +    {
> >>>>> +        for (i = 0; i < count; i++)
> >>>>> +            segs[i].status = copy.segments[i].status;
> >>>>> +    }
> >>>>> +
> >>>>> +    free(copy.segments);
> >>>>> +    return rc;
> >>>>> +}
> >>>>
> >>>> I know Wei asked for this but you've replaced what should be a single
> >>>> pointer assignment with a memory allocation and two loops over all the
> >>>> segments.
> >>>>
> >>>> This is a hot path and the two structures (the libxengnttab one and the
> >>>> Linux kernel one) are both part of their respective ABIs and won't
> >>>> change so Wei's concern that they might change in the future is unfounded.
> >>>>
> >>>
> >>> The fundamental question is: will the ABI between the library and the
> >>> kernel ever go mismatch?
> >>>
> >>> My answer is "maybe".  My rationale is that everything goes across
> >>> boundary of components need to be considered with caution. And I tend to
> >>> assume the worst things will happen.
> >>>
> >>> To guarantee that they will never go mismatch is to have
> >>>
> >>>    typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
> >>>
> >>> But that's not how the code is written.
> >>>
> >>> I would like to hear a third opinion. Is my concern unfounded? Am I too
> >>> cautious? Is there any compelling argument that I missed?
> >>>
> >>> Somewhat related, can we have some numbers please? It could well be the
> >>> cost of the two loops is much cheaper than whatever is going on inside
> >>> the kernel / hypervisor. And it could turn out that the numbers render
> >>> this issue moot.
> >>
> >> I did some (very) adhoc measurements and with the worst case of single
> >> short segments for each ioctl, the optimized version of
> >> osdep_gnttab_grant_copy() looks to be ~5% faster.
> >>
> >> This is enough of a difference that we should use the optimized version.
> >>
> >> The unoptimized version also adds an additional failure path (the
> >> calloc) which would be best avoided.
> >>
> > 
> > Your test case includes a lot of  noise in libc allocator, so...
> > 
> > Can you give try the following patch (apply on top of Paulina's patch)?
> > The basic idea is to provide scratch space for the structures. Note, the
> > patch is compile test only.
> [...]
> > +#define COPY_SEGMENT_CACHE_SIZE 1024
> 
> Arbitrary limit on number of segments.
> 
> > +    copy.segments = xgt->osdep_data;
> 
> Not thread safe.
> 

Both issues are real, but this is just a gross hack to try to get some
numbers.

> I tried using alloca() which has <1% performance penalty but the failure
> mode for alloca() is really bad so I would not recommend it.
> 

Agreed.

But if you want to use the stack, maybe C99 variable length array would
do?

> I think the best solution is to allow the osdep code to provide the
> implementation of xengnttab_grant_copy_segment_t, allowing the Linux
> code to do:
> 
> typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t
> 
> You should still provide the generic structure as well, for those
> platforms that don't provide their own optimized version.
> 

We can't do that (yet). This means we open the door for divergence on
different platforms.

Basically this approach requires each platform to do the same thing
(typedef) This implies any application that uses libxengnttab will need
to test what platform it runs on. It is just pushing the issue somewhere
else.

Still, I think I would wait a bit for other people to weight in because
I'm not sure if my concern is wrong headed.

Wei.

> David

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22 14:52             ` Wei Liu
@ 2016-06-22 16:49               ` Wei Liu
  2016-07-06 15:49                 ` Roger Pau Monné
  2016-07-05 16:27               ` George Dunlap
  1 sibling, 1 reply; 27+ messages in thread
From: Wei Liu @ 2016-06-22 16:49 UTC (permalink / raw)
  To: David Vrabel
  Cc: sstabellini, Wei Liu, ian.jackson, Paulina Szubarczyk,
	anthony.perard, xen-devel, roger.pau

On Wed, Jun 22, 2016 at 03:52:43PM +0100, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 02:52:47PM +0100, David Vrabel wrote:
> > On 22/06/16 14:29, Wei Liu wrote:
> > > On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
> > >> On 22/06/16 12:21, Wei Liu wrote:
> > >>> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
> > >>>> On 22/06/16 09:38, Paulina Szubarczyk wrote:
> > >>>>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> > >>>>> system call is invoked. In mini-os the operation is yet not
> > >>>>> implemented. For other OSs there is a dummy implementation.
> > >>>> [...]
> > >>>>> --- a/tools/libs/gnttab/linux.c
> > >>>>> +++ b/tools/libs/gnttab/linux.c
> > >>>>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> > >>>>>      return 0;
> > >>>>>  }
> > >>>>>  
> > >>>>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> > >>>>> +                            uint32_t count,
> > >>>>> +                            xengnttab_grant_copy_segment_t *segs)
> > >>>>> +{
> > >>>>> +    int i, rc;
> > >>>>> +    int fd = xgt->fd;
> > >>>>> +    struct ioctl_gntdev_grant_copy copy;
> > >>>>> +
> > >>>>> +    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
> > >>>>> +    copy.count = count;
> > >>>>> +    for (i = 0; i < count; i++)
> > >>>>> +    {
> > >>>>> +        copy.segments[i].flags = segs[i].flags;
> > >>>>> +        copy.segments[i].len = segs[i].len;
> > >>>>> +        if (segs[i].flags == GNTCOPY_dest_gref) 
> > >>>>> +        {
> > >>>>> +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> > >>>>> +            copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
> > >>>>> +            copy.segments[i].dest.foreign.offset = segs[i].dest.foreign.offset;
> > >>>>> +            copy.segments[i].source.virt = segs[i].source.virt;
> > >>>>> +        } 
> > >>>>> +        else 
> > >>>>> +        {
> > >>>>> +            copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
> > >>>>> +            copy.segments[i].source.foreign.domid = segs[i].source.foreign.domid;
> > >>>>> +            copy.segments[i].source.foreign.offset = segs[i].source.foreign.offset;
> > >>>>> +            copy.segments[i].dest.virt = segs[i].dest.virt;
> > >>>>> +        }
> > >>>>> +    }
> > >>>>> +
> > >>>>> +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> > >>>>> +    if (rc) 
> > >>>>> +    {
> > >>>>> +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> > >>>>> +    }
> > >>>>> +    else 
> > >>>>> +    {
> > >>>>> +        for (i = 0; i < count; i++)
> > >>>>> +            segs[i].status = copy.segments[i].status;
> > >>>>> +    }
> > >>>>> +
> > >>>>> +    free(copy.segments);
> > >>>>> +    return rc;
> > >>>>> +}
> > >>>>
> > >>>> I know Wei asked for this but you've replaced what should be a single
> > >>>> pointer assignment with a memory allocation and two loops over all the
> > >>>> segments.
> > >>>>
> > >>>> This is a hot path and the two structures (the libxengnttab one and the
> > >>>> Linux kernel one) are both part of their respective ABIs and won't
> > >>>> change so Wei's concern that they might change in the future is unfounded.
> > >>>>
> > >>>
> > >>> The fundamental question is: will the ABI between the library and the
> > >>> kernel ever go mismatch?
> > >>>
> > >>> My answer is "maybe".  My rationale is that everything goes across
> > >>> boundary of components need to be considered with caution. And I tend to
> > >>> assume the worst things will happen.
> > >>>
> > >>> To guarantee that they will never go mismatch is to have
> > >>>
> > >>>    typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
> > >>>
> > >>> But that's not how the code is written.
> > >>>
> > >>> I would like to hear a third opinion. Is my concern unfounded? Am I too
> > >>> cautious? Is there any compelling argument that I missed?
> > >>>
> > >>> Somewhat related, can we have some numbers please? It could well be the
> > >>> cost of the two loops is much cheaper than whatever is going on inside
> > >>> the kernel / hypervisor. And it could turn out that the numbers render
> > >>> this issue moot.
> > >>
> > >> I did some (very) adhoc measurements and with the worst case of single
> > >> short segments for each ioctl, the optimized version of
> > >> osdep_gnttab_grant_copy() looks to be ~5% faster.
> > >>
> > >> This is enough of a difference that we should use the optimized version.
> > >>
> > >> The unoptimized version also adds an additional failure path (the
> > >> calloc) which would be best avoided.
> > >>
> > > 
> > > Your test case includes a lot of  noise in libc allocator, so...
> > > 
> > > Can you give try the following patch (apply on top of Paulina's patch)?
> > > The basic idea is to provide scratch space for the structures. Note, the
> > > patch is compile test only.
> > [...]
> > > +#define COPY_SEGMENT_CACHE_SIZE 1024
> > 
> > Arbitrary limit on number of segments.
> > 
> > > +    copy.segments = xgt->osdep_data;
> > 
> > Not thread safe.
> > 
> 
> Both issues are real, but this is just a gross hack to try to get some
> numbers.
> 
> > I tried using alloca() which has <1% performance penalty but the failure
> > mode for alloca() is really bad so I would not recommend it.
> > 
> 
> Agreed.
> 
> But if you want to use the stack, maybe C99 variable length array would
> do?
> 

The numbers (stack based < 1% overhead, heap based ~5% overhead) suggest
that all the assignments are fast. It is the malloc / free pair that is
slow.

And actually we can just use a combination of statically allocated stack
based array and heap based array. Say, let's have a X element array
(pick the number used in hypervisor preemption check), if count > X, use
heap based array (with the hope that the libc allocation / free overhead
should be masked by the copying overhead in hypervisor).

That would achieve both safety and performance, and render a lot of the
other discussions (the expectation of application, the interface in
other platform etc) moot. Looks like the good solution for me.

David, what do you think?

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22 14:52             ` Wei Liu
  2016-06-22 16:49               ` Wei Liu
@ 2016-07-05 16:27               ` George Dunlap
  1 sibling, 0 replies; 27+ messages in thread
From: George Dunlap @ 2016-07-05 16:27 UTC (permalink / raw)
  To: Wei Liu
  Cc: Stefano Stabellini, Ian Jackson, Paulina Szubarczyk,
	David Vrabel, Anthony Perard, xen-devel, Roger Pau Monné

On Wed, Jun 22, 2016 at 3:52 PM, Wei Liu <wei.liu2@citrix.com> wrote:
>> I think the best solution is to allow the osdep code to provide the
>> implementation of xengnttab_grant_copy_segment_t, allowing the Linux
>> code to do:
>>
>> typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t
>>
>> You should still provide the generic structure as well, for those
>> platforms that don't provide their own optimized version.
>>
>
> We can't do that (yet). This means we open the door for divergence on
> different platforms.
>
> Basically this approach requires each platform to do the same thing
> (typedef) This implies any application that uses libxengnttab will need
> to test what platform it runs on. It is just pushing the issue somewhere
> else.
>
> Still, I think I would wait a bit for other people to weight in because
> I'm not sure if my concern is wrong headed.

I tend to be sympathetic to David's argument here.  The library has to
provide some ABI to callers; and it has to know the appropriate Linux
ABI in order to translate from the library ABI to the Linux ABI.  If
it happens to know these are the same, I don't see a reason not to
"translate" it by just by casting the pointer.

If we want to declare the library ABI in a stand-alone fashion (i.e.,
instead of just doing a typedef, so that the library definition is the
same on all platforms), then having some compile-time checking to make
sure that the layouts of the two structures are identical makes sense.
Beyond that, I'm not sure what the extra copying really buys us.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22 16:49               ` Wei Liu
@ 2016-07-06 15:49                 ` Roger Pau Monné
  0 siblings, 0 replies; 27+ messages in thread
From: Roger Pau Monné @ 2016-07-06 15:49 UTC (permalink / raw)
  To: Wei Liu
  Cc: sstabellini, Paulina Szubarczyk, ian.jackson, David Vrabel,
	anthony.perard, xen-devel

On Wed, Jun 22, 2016 at 05:49:59PM +0100, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 03:52:43PM +0100, Wei Liu wrote:
> > On Wed, Jun 22, 2016 at 02:52:47PM +0100, David Vrabel wrote:
> > > On 22/06/16 14:29, Wei Liu wrote:
> > > > On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
> > > >> On 22/06/16 12:21, Wei Liu wrote:
> > > >>> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
> > > >>>> On 22/06/16 09:38, Paulina Szubarczyk wrote:
> > > >>>>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
> > > >>>>> system call is invoked. In mini-os the operation is yet not
> > > >>>>> implemented. For other OSs there is a dummy implementation.
> > > >>>> [...]
> > > >>>>> --- a/tools/libs/gnttab/linux.c
> > > >>>>> +++ b/tools/libs/gnttab/linux.c
> > > >>>>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
> > > >>>>>      return 0;
> > > >>>>>  }
> > > >>>>>  
> > > >>>>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
> > > >>>>> +                            uint32_t count,
> > > >>>>> +                            xengnttab_grant_copy_segment_t *segs)
> > > >>>>> +{
> > > >>>>> +    int i, rc;
> > > >>>>> +    int fd = xgt->fd;
> > > >>>>> +    struct ioctl_gntdev_grant_copy copy;
> > > >>>>> +
> > > >>>>> +    copy.segments = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
> > > >>>>> +    copy.count = count;
> > > >>>>> +    for (i = 0; i < count; i++)
> > > >>>>> +    {
> > > >>>>> +        copy.segments[i].flags = segs[i].flags;
> > > >>>>> +        copy.segments[i].len = segs[i].len;
> > > >>>>> +        if (segs[i].flags == GNTCOPY_dest_gref) 
> > > >>>>> +        {
> > > >>>>> +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
> > > >>>>> +            copy.segments[i].dest.foreign.domid = segs[i].dest.foreign.domid;
> > > >>>>> +            copy.segments[i].dest.foreign.offset = segs[i].dest.foreign.offset;
> > > >>>>> +            copy.segments[i].source.virt = segs[i].source.virt;
> > > >>>>> +        } 
> > > >>>>> +        else 
> > > >>>>> +        {
> > > >>>>> +            copy.segments[i].source.foreign.ref = segs[i].source.foreign.ref;
> > > >>>>> +            copy.segments[i].source.foreign.domid = segs[i].source.foreign.domid;
> > > >>>>> +            copy.segments[i].source.foreign.offset = segs[i].source.foreign.offset;
> > > >>>>> +            copy.segments[i].dest.virt = segs[i].dest.virt;
> > > >>>>> +        }
> > > >>>>> +    }
> > > >>>>> +
> > > >>>>> +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
> > > >>>>> +    if (rc) 
> > > >>>>> +    {
> > > >>>>> +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
> > > >>>>> +    }
> > > >>>>> +    else 
> > > >>>>> +    {
> > > >>>>> +        for (i = 0; i < count; i++)
> > > >>>>> +            segs[i].status = copy.segments[i].status;
> > > >>>>> +    }
> > > >>>>> +
> > > >>>>> +    free(copy.segments);
> > > >>>>> +    return rc;
> > > >>>>> +}
> > > >>>>
> > > >>>> I know Wei asked for this but you've replaced what should be a single
> > > >>>> pointer assignment with a memory allocation and two loops over all the
> > > >>>> segments.
> > > >>>>
> > > >>>> This is a hot path and the two structures (the libxengnttab one and the
> > > >>>> Linux kernel one) are both part of their respective ABIs and won't
> > > >>>> change so Wei's concern that they might change in the future is unfounded.
> > > >>>>
> > > >>>
> > > >>> The fundamental question is: will the ABI between the library and the
> > > >>> kernel ever go mismatch?
> > > >>>
> > > >>> My answer is "maybe".  My rationale is that everything goes across
> > > >>> boundary of components need to be considered with caution. And I tend to
> > > >>> assume the worst things will happen.
> > > >>>
> > > >>> To guarantee that they will never go mismatch is to have
> > > >>>
> > > >>>    typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
> > > >>>
> > > >>> But that's not how the code is written.
> > > >>>
> > > >>> I would like to hear a third opinion. Is my concern unfounded? Am I too
> > > >>> cautious? Is there any compelling argument that I missed?
> > > >>>
> > > >>> Somewhat related, can we have some numbers please? It could well be the
> > > >>> cost of the two loops is much cheaper than whatever is going on inside
> > > >>> the kernel / hypervisor. And it could turn out that the numbers render
> > > >>> this issue moot.
> > > >>
> > > >> I did some (very) adhoc measurements and with the worst case of single
> > > >> short segments for each ioctl, the optimized version of
> > > >> osdep_gnttab_grant_copy() looks to be ~5% faster.
> > > >>
> > > >> This is enough of a difference that we should use the optimized version.
> > > >>
> > > >> The unoptimized version also adds an additional failure path (the
> > > >> calloc) which would be best avoided.
> > > >>
> > > > 
> > > > Your test case includes a lot of  noise in libc allocator, so...
> > > > 
> > > > Can you give try the following patch (apply on top of Paulina's patch)?
> > > > The basic idea is to provide scratch space for the structures. Note, the
> > > > patch is compile test only.
> > > [...]
> > > > +#define COPY_SEGMENT_CACHE_SIZE 1024
> > > 
> > > Arbitrary limit on number of segments.
> > > 
> > > > +    copy.segments = xgt->osdep_data;
> > > 
> > > Not thread safe.
> > > 
> > 
> > Both issues are real, but this is just a gross hack to try to get some
> > numbers.
> > 
> > > I tried using alloca() which has <1% performance penalty but the failure
> > > mode for alloca() is really bad so I would not recommend it.
> > > 
> > 
> > Agreed.
> > 
> > But if you want to use the stack, maybe C99 variable length array would
> > do?
> > 
> 
> The numbers (stack based < 1% overhead, heap based ~5% overhead) suggest
> that all the assignments are fast. It is the malloc / free pair that is
> slow.
> 
> And actually we can just use a combination of statically allocated stack
> based array and heap based array. Say, let's have a X element array
> (pick the number used in hypervisor preemption check), if count > X, use
> heap based array (with the hope that the libc allocation / free overhead
> should be masked by the copying overhead in hypervisor).
> 
> That would achieve both safety and performance, and render a lot of the
> other discussions (the expectation of application, the interface in
> other platform etc) moot. Looks like the good solution for me.

IMHO, I don't think the cast is specially bad, and even if we do the copy 
there's no guarantee that the ioctl structure we are copying to is the one 
that the kernel expects iff someone changed it (because the gntdev header 
lives in the Xen tree, it's not picked from the host).

In any case, changing the ioctl in any way is not an option AFAICT, and it 
would be a bug if someone tried to do it. If we ever need to change the 
grant copy ioctl we would have to introduce a new one, like we did with the 
privcmd foreign memory mapping ioctl.

What I think should be avoided is the typedef from the Linux ioctl structure 
to the public library headers. As Wei says, this is OS agnostic, and 
although other OSes try to follow suit there's no guarantee that the exact 
same structure can be reused. I think the full structure definition has to 
live in the library itself (which will be just a copy of 
ioctl_gntdev_grant_copy), so other OSes can reuse it even if their ioctl 
structure is slightly different. The Linux specific implementation can add 
some static asserts in order to make sure it doesn't change, and then just 
do a straight cast.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-06-22  8:38 ` [PATCH v3 1/2] Interface for grant copy operation in libs Paulina Szubarczyk
  2016-06-22  9:37   ` David Vrabel
@ 2016-07-08 13:18   ` Wei Liu
  2016-07-13  9:12     ` Wei Liu
  1 sibling, 1 reply; 27+ messages in thread
From: Wei Liu @ 2016-07-08 13:18 UTC (permalink / raw)
  To: Paulina Szubarczyk
  Cc: sstabellini, wei.liu2, ian.jackson, david.vrabel, anthony.perard,
	xen-devel, roger.pau

To unblock Paulina on her series, I would be ok with the cast provided
there is compile-time check to ensure the user-space structure is
identical to the ioctl structure.

That would involve:
1. Introducing BUILD_BUG_ON, offsetof, alignof to libs/ if they are not
   already available.
2. BUILD_BUG_ON(sizeof(A) != sizeof(B))
3. BUILD_BUG_ON(offsetof(A, f1) != offsetof(B, f1)) (enumerate all
   fields)
4. BUILD_BUG_ON(alignof(A) != alignof(B))

Paulina, let me know if you would be interested in doing #1. Normally
this requires reading compiler manuals and some coding. I can give you
more details if you're up for the task, otherwise I will try to find
some time to do it myself.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 1/2] Interface for grant copy operation in libs.
  2016-07-08 13:18   ` Wei Liu
@ 2016-07-13  9:12     ` Wei Liu
  0 siblings, 0 replies; 27+ messages in thread
From: Wei Liu @ 2016-07-13  9:12 UTC (permalink / raw)
  To: Paulina Szubarczyk
  Cc: sstabellini, wei.liu2, ian.jackson, david.vrabel, anthony.perard,
	xen-devel, roger.pau

On Fri, Jul 08, 2016 at 02:18:46PM +0100, Wei Liu wrote:
> To unblock Paulina on her series, I would be ok with the cast provided
> there is compile-time check to ensure the user-space structure is
> identical to the ioctl structure.
> 
> That would involve:
> 1. Introducing BUILD_BUG_ON, offsetof, alignof to libs/ if they are not
>    already available.

I just checked all these.

BUILD_BUG_ON is not there. A patch for it is trivial. I will do that
soon.

offsetof can be found in stddef.h.

alignof is C11 (we require C99), but there is __alignof__ gcc extension,
which clang also supports. We've been using that for quite a long time,
so I don't think we need to do anything about it.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation
  2016-06-22  8:38 ` [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation Paulina Szubarczyk
@ 2016-07-13 12:34   ` Paulina Szubarczyk
  2016-07-14 10:37   ` Wei Liu
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 27+ messages in thread
From: Paulina Szubarczyk @ 2016-07-13 12:34 UTC (permalink / raw)
  To: xen-devel, roger.pau
  Cc: anthony.perard, ian.jackson, sstabellini, wei.liu2, david.vrabel


On 06/22/2016 10:38 AM, Paulina Szubarczyk wrote:
> Copy data operated on during request from/to local buffers to/from
> the grant references.
>
> Before grant copy operation local buffers must be allocated what is
> done by calling ioreq_init_copy_buffers. For the 'read' operation,
> first, the qemu device invokes the read operation on local buffers
> and on the completion grant copy is called and buffers are freed.
> For the 'write' operation grant copy is performed before invoking
> write by qemu device.
>
> A new value 'feature_grant_copy' is added to recognize when the
> grant copy operation is supported by a guest.
> The body of the function 'ioreq_runio_qemu_aio' is moved to
> 'ioreq_runio_qemu_aio_blk' and in the 'ioreq_runio_qemu_aio' depending
> on the support for grant copy according checks, initialization, grant
> operation are made, then the 'ioreq_runio_qemu_aio_blk' function is
> called.
>
> Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com>
> ---
> Changes since v2:
> - to use the xengnttab_* function directly added -lxengnttab to configure
>    and include <xengnttab.h> in include/hw/xen/xen_common.h
> - in ioreq_copy removed an out path, changed a log level, made explicit
>    assignement to 'xengnttab_copy_grant_segment'
> * I did not change the way of testing if grant_copy operation is implemented.
>    As far as I understand if the code from gnttab_unimp.c is used then the gnttab
>    device is unavailable and the handler to gntdev would be invalid. But
>    if the handler is valid then the ioctl should return operation unimplemented
>    if the gntdev does not implement the operation.
>
>   configure                   |   2 +-
>   hw/block/xen_disk.c         | 171 ++++++++++++++++++++++++++++++++++++++++----
>   include/hw/xen/xen_common.h |   2 +
>   3 files changed, 162 insertions(+), 13 deletions(-)
>
> diff --git a/configure b/configure
> index e41876a..355d3fa 100755
> --- a/configure
> +++ b/configure
> @@ -1843,7 +1843,7 @@ fi
>   # xen probe
>
>   if test "$xen" != "no" ; then
> -  xen_libs="-lxenstore -lxenctrl -lxenguest"
> +  xen_libs="-lxenstore -lxenctrl -lxenguest -lxengnttab"
>
>     # First we test whether Xen headers and libraries are available.
>     # If no, we are done and there is no Xen support.
> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 37e14d1..4eca06a 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -131,6 +131,9 @@ struct XenBlkDev {
>       unsigned int        persistent_gnt_count;
>       unsigned int        max_grants;
>
> +    /* Grant copy */
> +    gboolean            feature_grant_copy;
> +
>       /* qemu block driver */
>       DriveInfo           *dinfo;
>       BlockBackend        *blk;
> @@ -500,6 +503,99 @@ static int ioreq_map(struct ioreq *ioreq)
>       return 0;
>   }
>
> +static void* get_buffer(int count)
> +{
> +    return xc_memalign(xen_xc, XC_PAGE_SIZE, count*XC_PAGE_SIZE);
> +}
> +
> +static void free_buffers(struct ioreq *ioreq)
> +{
> +    int i;
> +
> +    for (i = 0; i < ioreq->v.niov; i++) {
> +        ioreq->page[i] = NULL;
> +    }
> +
> +    free(ioreq->pages);
> +}
> +
> +static int ioreq_init_copy_buffers(struct ioreq *ioreq) {
> +    int i;
> +
> +    if (ioreq->v.niov == 0) {
> +        return 0;
> +    }
> +
> +    ioreq->pages = get_buffer(ioreq->v.niov);
> +    if (!ioreq->pages) {
> +        return -1;
> +    }
> +
> +    for (i = 0; i < ioreq->v.niov; i++) {
> +        ioreq->page[i] = ioreq->pages + i*XC_PAGE_SIZE;
> +        ioreq->v.iov[i].iov_base += (uintptr_t)ioreq->page[i];
> +    }
> +
> +    return 0;
> +}
> +
> +static int ioreq_copy(struct ioreq *ioreq)
> +{
> +    XenGnttab gnt = ioreq->blkdev->xendev.gnttabdev;
> +    xengnttab_grant_copy_segment_t segs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +    int i, count = 0, r, rc;
> +    int64_t file_blk = ioreq->blkdev->file_blk;
> +
> +    if (ioreq->v.niov == 0) {
> +        return 0;
> +    }
> +
> +    count = ioreq->v.niov;
> +
> +    for (i = 0; i < count; i++) {
> +
> +        if (ioreq->req.operation == BLKIF_OP_READ) {
> +            segs[i].flags = GNTCOPY_dest_gref;
> +            segs[i].dest.foreign.ref = ioreq->refs[i];
> +            segs[i].dest.foreign.domid = ioreq->domids[i];
> +            segs[i].dest.foreign.offset = ioreq->req.seg[i].first_sect * file_blk;
> +            segs[i].source.virt = ioreq->v.iov[i].iov_base;
> +        } else {
> +            segs[i].flags = GNTCOPY_source_gref;
> +            segs[i].source.foreign.ref = ioreq->refs[i];
> +            segs[i].source.foreign.domid = ioreq->domids[i];
> +            segs[i].source.foreign.offset = ioreq->req.seg[i].first_sect * file_blk;
> +            segs[i].dest.virt = ioreq->v.iov[i].iov_base;
> +        }
> +        segs[i].len = (ioreq->req.seg[i].last_sect
> +                       - ioreq->req.seg[i].first_sect + 1) * file_blk;
> +
> +    }
> +
> +    rc = xengnttab_grant_copy(gnt, count, segs);
> +
> +    if (rc) {
> +        xen_be_printf(&ioreq->blkdev->xendev, 0,
> +                      "failed to copy data %d \n", rc);
> +        ioreq->aio_errors++;
> +        return -1;
> +    } else {
> +        r = 0;
> +    }
> +
> +    for (i = 0; i < count; i++) {
> +        if (segs[i].status != GNTST_okay) {
> +            xen_be_printf(&ioreq->blkdev->xendev, 3,
> +                          "failed to copy data %d for gref %d, domid %d\n", rc,
> +                          ioreq->refs[i], ioreq->domids[i]);
> +            ioreq->aio_errors++;
> +            r = -1;
> +        }
> +    }
> +
> +    return r;
> +}
> +
>   static int ioreq_runio_qemu_aio(struct ioreq *ioreq);
>
>   static void qemu_aio_complete(void *opaque, int ret)
> @@ -528,8 +624,31 @@ static void qemu_aio_complete(void *opaque, int ret)
>           return;
>       }
>
> +    if (ioreq->blkdev->feature_grant_copy) {
> +        switch (ioreq->req.operation) {
> +        case BLKIF_OP_READ:
> +            /* in case of failure ioreq->aio_errors is increased */
> +            ioreq_copy(ioreq);
> +            free_buffers(ioreq);
> +            break;
> +        case BLKIF_OP_WRITE:
> +        case BLKIF_OP_FLUSH_DISKCACHE:
> +            if (!ioreq->req.nr_segments) {
> +                break;
> +            }
> +            free_buffers(ioreq);
> +            break;
> +        default:
> +            break;
> +        }
> +    }
> +
>       ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY;
> -    ioreq_unmap(ioreq);
> +
> +    if (!ioreq->blkdev->feature_grant_copy) {
> +        ioreq_unmap(ioreq);
> +    }
> +
>       ioreq_finish(ioreq);
>       switch (ioreq->req.operation) {
>       case BLKIF_OP_WRITE:
> @@ -547,14 +666,42 @@ static void qemu_aio_complete(void *opaque, int ret)
>       qemu_bh_schedule(ioreq->blkdev->bh);
>   }
>
> +static int ioreq_runio_qemu_aio_blk(struct ioreq *ioreq);
> +
>   static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>   {
> -    struct XenBlkDev *blkdev = ioreq->blkdev;
> +    if (ioreq->blkdev->feature_grant_copy) {
> +
> +        ioreq_init_copy_buffers(ioreq);
> +        if (ioreq->req.nr_segments && (ioreq->req.operation == BLKIF_OP_WRITE ||
> +            ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE)) {
> +            if (ioreq_copy(ioreq)) {
> +                free_buffers(ioreq);
> +                goto err;
> +            }
> +        }
> +        if (ioreq_runio_qemu_aio_blk(ioreq)) goto err;
> +
> +    } else {
>
> -    if (ioreq->req.nr_segments && ioreq_map(ioreq) == -1) {
> -        goto err_no_map;
> +        if (ioreq->req.nr_segments && ioreq_map(ioreq)) goto err;
> +        if (ioreq_runio_qemu_aio_blk(ioreq)) {
> +            ioreq_unmap(ioreq);
> +            goto err;
> +        }
>       }
>
> +    return 0;
> +err:
> +    ioreq_finish(ioreq);
> +    ioreq->status = BLKIF_RSP_ERROR;
> +    return -1;
> +}
> +
> +static int ioreq_runio_qemu_aio_blk(struct ioreq *ioreq)
> +{
> +    struct XenBlkDev *blkdev = ioreq->blkdev;
> +
>       ioreq->aio_inflight++;
>       if (ioreq->presync) {
>           blk_aio_flush(ioreq->blkdev->blk, qemu_aio_complete, ioreq);
> @@ -594,19 +741,12 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>       }
>       default:
>           /* unknown operation (shouldn't happen -- parse catches this) */
> -        goto err;
> +        return -1;
>       }
>
>       qemu_aio_complete(ioreq, 0);
>
>       return 0;
> -
> -err:
> -    ioreq_unmap(ioreq);
> -err_no_map:
> -    ioreq_finish(ioreq);
> -    ioreq->status = BLKIF_RSP_ERROR;
> -    return -1;
>   }
>
>   static int blk_send_response_one(struct ioreq *ioreq)
> @@ -1020,10 +1160,17 @@ static int blk_connect(struct XenDevice *xendev)
>
>       xen_be_bind_evtchn(&blkdev->xendev);
>
> +    blkdev->feature_grant_copy =
> +                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
> +
> +    xen_be_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
> +                  blkdev->feature_grant_copy ? "enabled" : "disabled");
> +
>       xen_be_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
>                     "remote port %d, local port %d\n",
>                     blkdev->xendev.protocol, blkdev->ring_ref,
>                     blkdev->xendev.remote_port, blkdev->xendev.local_port);
> +
>       return 0;
>   }
>
> diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
> index 4ac0c6f..bed5307 100644
> --- a/include/hw/xen/xen_common.h
> +++ b/include/hw/xen/xen_common.h
> @@ -14,6 +14,8 @@
>   #endif
>   #include <xen/io/xenbus.h>
>
> +#include <xengnttab.h>
> +
>   #include "hw/hw.h"
>   #include "hw/xen/xen.h"
>   #include "hw/pci/pci.h"
>

When it comes to that second part of the patch, should I do some changes 
here?

Paulina

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation
  2016-06-22  8:38 ` [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation Paulina Szubarczyk
  2016-07-13 12:34   ` Paulina Szubarczyk
@ 2016-07-14 10:37   ` Wei Liu
  2016-07-15 10:28     ` Paulina Szubarczyk
  2016-07-15 16:55   ` Anthony PERARD
  2016-07-19  9:12   ` Roger Pau Monné
  3 siblings, 1 reply; 27+ messages in thread
From: Wei Liu @ 2016-07-14 10:37 UTC (permalink / raw)
  To: Paulina Szubarczyk
  Cc: sstabellini, wei.liu2, ian.jackson, david.vrabel, anthony.perard,
	xen-devel, roger.pau

On Wed, Jun 22, 2016 at 10:38:53AM +0200, Paulina Szubarczyk wrote:
> Copy data operated on during request from/to local buffers to/from
> the grant references.
> 
> Before grant copy operation local buffers must be allocated what is
> done by calling ioreq_init_copy_buffers. For the 'read' operation,
> first, the qemu device invokes the read operation on local buffers
> and on the completion grant copy is called and buffers are freed.
> For the 'write' operation grant copy is performed before invoking
> write by qemu device.
> 
> A new value 'feature_grant_copy' is added to recognize when the
> grant copy operation is supported by a guest.
> The body of the function 'ioreq_runio_qemu_aio' is moved to
> 'ioreq_runio_qemu_aio_blk' and in the 'ioreq_runio_qemu_aio' depending
> on the support for grant copy according checks, initialization, grant
> operation are made, then the 'ioreq_runio_qemu_aio_blk' function is
> called.
> 
> Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com>
> ---
> Changes since v2:
> - to use the xengnttab_* function directly added -lxengnttab to configure
>   and include <xengnttab.h> in include/hw/xen/xen_common.h
> - in ioreq_copy removed an out path, changed a log level, made explicit 
>   assignement to 'xengnttab_copy_grant_segment'
> * I did not change the way of testing if grant_copy operation is implemented.
>   As far as I understand if the code from gnttab_unimp.c is used then the gnttab 
>   device is unavailable and the handler to gntdev would be invalid. But 
>   if the handler is valid then the ioctl should return operation unimplemented 
>   if the gntdev does not implement the operation.
> 
>  configure                   |   2 +-
>  hw/block/xen_disk.c         | 171 ++++++++++++++++++++++++++++++++++++++++----
>  include/hw/xen/xen_common.h |   2 +
>  3 files changed, 162 insertions(+), 13 deletions(-)
> 
> diff --git a/configure b/configure
> index e41876a..355d3fa 100755
> --- a/configure
> +++ b/configure
> @@ -1843,7 +1843,7 @@ fi
>  # xen probe
>  
>  if test "$xen" != "no" ; then
> -  xen_libs="-lxenstore -lxenctrl -lxenguest"
> +  xen_libs="-lxenstore -lxenctrl -lxenguest -lxengnttab"
>  

First thing, -lxengnttab should be in xen_stable_libs.

The probing needs to be more sophisticated.

You need to probe the new function your added as well. Just a few lines
below xen_stable_libs there is a section for hand-coded probing source
code, which you would need to modify.

Assuming your gnttab change will be merged into 4.8 (the release under
development at the moment), you need to have a separate program for it.

After you've done proper probing, you will know which version of Xen
this qemu is compiling against.  And then, there should be some fallback
mechanism to compile and run this qemu with older version of xen. This
is not too hard because you can guard your code with feature flag or
ifdef (please consult Stefan and Anthony which method to use).

Feel free to ask questions. I will try my best to explain.

>  
> +    blkdev->feature_grant_copy =
> +                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);

This is a bit problematic. As this patch stands, it won't compile on
older version of Xen because there is no such function there.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation
  2016-07-14 10:37   ` Wei Liu
@ 2016-07-15 10:28     ` Paulina Szubarczyk
  2016-07-15 11:15       ` Wei Liu
  0 siblings, 1 reply; 27+ messages in thread
From: Paulina Szubarczyk @ 2016-07-15 10:28 UTC (permalink / raw)
  To: Wei Liu
  Cc: sstabellini, ian.jackson, david.vrabel, anthony.perard,
	xen-devel, roger.pau



On 07/14/2016 12:37 PM, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 10:38:53AM +0200, Paulina Szubarczyk wrote:
>> diff --git a/configure b/configure
>> index e41876a..355d3fa 100755
>> --- a/configure
>> +++ b/configure
>> @@ -1843,7 +1843,7 @@ fi
>>   # xen probe
>>
>>   if test "$xen" != "no" ; then
>> -  xen_libs="-lxenstore -lxenctrl -lxenguest"
>> +  xen_libs="-lxenstore -lxenctrl -lxenguest -lxengnttab"
>>
>
> First thing, -lxengnttab should be in xen_stable_libs.
>
Do I understand correctly that I should add a new variable 
"xen_stable_libs"? I could not find it in the qemu tree used anywhere else.

> The probing needs to be more sophisticated.
>
> You need to probe the new function your added as well. Just a few lines
> below xen_stable_libs there is a section for hand-coded probing source
> code, which you would need to modify.
>
> Assuming your gnttab change will be merged into 4.8 (the release under
> development at the moment), you need to have a separate program for it.
>
I will add that.

> After you've done proper probing, you will know which version of Xen
> this qemu is compiling against.  And then, there should be some fallback
> mechanism to compile and run this qemu with older version of xen. This
> is not too hard because you can guard your code with feature flag or
> ifdef (please consult Stefan and Anthony which method to use).
>
> Feel free to ask questions. I will try my best to explain.
>
>>
>> +    blkdev->feature_grant_copy =
>> +                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
>
> This is a bit problematic. As this patch stands, it won't compile on
> older version of Xen because there is no such function there.

There is a variable CONFIG_XEN_CTRL_INTERFACE_VERSION holding current 
version of the Xen control library this qemu is configured with. It is 
set from the configure file. The feature could be guarded with ifdef by 
a new variable CONFIG_XEN_LIBS_INTERFACE_VERSION or they could be 
unified to CONFIG_XEN_TOOLS_INTERFACE_VERSION to not fill the same value 
twice.

Paulina

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation
  2016-07-15 10:28     ` Paulina Szubarczyk
@ 2016-07-15 11:15       ` Wei Liu
  2016-07-15 17:11         ` Anthony PERARD
  0 siblings, 1 reply; 27+ messages in thread
From: Wei Liu @ 2016-07-15 11:15 UTC (permalink / raw)
  To: Paulina Szubarczyk
  Cc: sstabellini, Wei Liu, ian.jackson, david.vrabel, anthony.perard,
	xen-devel, roger.pau

On Fri, Jul 15, 2016 at 12:28:48PM +0200, Paulina Szubarczyk wrote:
> 
> 
> On 07/14/2016 12:37 PM, Wei Liu wrote:
> >On Wed, Jun 22, 2016 at 10:38:53AM +0200, Paulina Szubarczyk wrote:
> >>diff --git a/configure b/configure
> >>index e41876a..355d3fa 100755
> >>--- a/configure
> >>+++ b/configure
> >>@@ -1843,7 +1843,7 @@ fi
> >>  # xen probe
> >>
> >>  if test "$xen" != "no" ; then
> >>-  xen_libs="-lxenstore -lxenctrl -lxenguest"
> >>+  xen_libs="-lxenstore -lxenctrl -lxenguest -lxengnttab"
> >>
> >
> >First thing, -lxengnttab should be in xen_stable_libs.
> >
> Do I understand correctly that I should add a new variable
> "xen_stable_libs"? I could not find it in the qemu tree used anywhere else.
> 

Hmm... there is already one in upstream QEMU -- which means you're
perhaps using qemu-xen tree.

I think all new development should happen on upstream qemu, not in our
qemu-xen tree.

> >The probing needs to be more sophisticated.
> >
> >You need to probe the new function your added as well. Just a few lines
> >below xen_stable_libs there is a section for hand-coded probing source
> >code, which you would need to modify.
> >
> >Assuming your gnttab change will be merged into 4.8 (the release under
> >development at the moment), you need to have a separate program for it.
> >
> I will add that.
> 
> >After you've done proper probing, you will know which version of Xen
> >this qemu is compiling against.  And then, there should be some fallback
> >mechanism to compile and run this qemu with older version of xen. This
> >is not too hard because you can guard your code with feature flag or
> >ifdef (please consult Stefan and Anthony which method to use).
> >
> >Feel free to ask questions. I will try my best to explain.
> >
> >>
> >>+    blkdev->feature_grant_copy =
> >>+                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
> >
> >This is a bit problematic. As this patch stands, it won't compile on
> >older version of Xen because there is no such function there.
> 
> There is a variable CONFIG_XEN_CTRL_INTERFACE_VERSION holding current
> version of the Xen control library this qemu is configured with. It is set
> from the configure file. The feature could be guarded with ifdef by a new
> variable CONFIG_XEN_LIBS_INTERFACE_VERSION or they could be unified to
> CONFIG_XEN_TOOLS_INTERFACE_VERSION to not fill the same value twice.
> 

Another way is to provide a stub for this function to always return 0.

Please wait for Stefano and Anthony to see which method they prefer.

Wei.

> Paulina

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation
  2016-06-22  8:38 ` [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation Paulina Szubarczyk
  2016-07-13 12:34   ` Paulina Szubarczyk
  2016-07-14 10:37   ` Wei Liu
@ 2016-07-15 16:55   ` Anthony PERARD
  2016-07-19 10:51     ` Paulina Szubarczyk
  2016-07-19  9:12   ` Roger Pau Monné
  3 siblings, 1 reply; 27+ messages in thread
From: Anthony PERARD @ 2016-07-15 16:55 UTC (permalink / raw)
  To: Paulina Szubarczyk
  Cc: sstabellini, wei.liu2, ian.jackson, david.vrabel, xen-devel, roger.pau

On Wed, Jun 22, 2016 at 10:38:53AM +0200, Paulina Szubarczyk wrote:
> Copy data operated on during request from/to local buffers to/from
> the grant references.
> 
> Before grant copy operation local buffers must be allocated what is
> done by calling ioreq_init_copy_buffers. For the 'read' operation,
> first, the qemu device invokes the read operation on local buffers
> and on the completion grant copy is called and buffers are freed.
> For the 'write' operation grant copy is performed before invoking
> write by qemu device.
> 
> A new value 'feature_grant_copy' is added to recognize when the
> grant copy operation is supported by a guest.
> The body of the function 'ioreq_runio_qemu_aio' is moved to
> 'ioreq_runio_qemu_aio_blk' and in the 'ioreq_runio_qemu_aio' depending
> on the support for grant copy according checks, initialization, grant
> operation are made, then the 'ioreq_runio_qemu_aio_blk' function is
> called.
> 
> Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com>

> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
> index 37e14d1..4eca06a 100644
> --- a/hw/block/xen_disk.c
> +++ b/hw/block/xen_disk.c
> @@ -500,6 +503,99 @@ static int ioreq_map(struct ioreq *ioreq)
>      return 0;
>  }
>  
> +static void* get_buffer(int count)
> +{
> +    return xc_memalign(xen_xc, XC_PAGE_SIZE, count*XC_PAGE_SIZE);

Instead of xc_memalign, I think you need to call qemu_memalign() here.
Have a look at the file HACKING, the section '3. Low level memory
management'. Also, you probably do not need an the extra function
get_buffer() and can call qemu_memalign() directly in
ioreq_init_copy_buffers().

> +}
> +
> +static void free_buffers(struct ioreq *ioreq)
> +{
> +    int i;
> +
> +    for (i = 0; i < ioreq->v.niov; i++) {
> +        ioreq->page[i] = NULL;
> +    }
> +
> +    free(ioreq->pages);

With the use of qemu_memalign, this would need to be qemu_vfree().

> +}
> +
> +static int ioreq_init_copy_buffers(struct ioreq *ioreq) {
> +    int i;
> +
> +    if (ioreq->v.niov == 0) {
> +        return 0;
> +    }
> +
> +    ioreq->pages = get_buffer(ioreq->v.niov);
> +    if (!ioreq->pages) {
> +        return -1;
> +    }
> +
> +    for (i = 0; i < ioreq->v.niov; i++) {
> +        ioreq->page[i] = ioreq->pages + i*XC_PAGE_SIZE;
> +        ioreq->v.iov[i].iov_base += (uintptr_t)ioreq->page[i];

Is the += intended here?

> +    }
> +
> +    return 0;
> +}
> +
> +static int ioreq_copy(struct ioreq *ioreq)
> +{
> +    XenGnttab gnt = ioreq->blkdev->xendev.gnttabdev;
> +    xengnttab_grant_copy_segment_t segs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +    int i, count = 0, r, rc;
> +    int64_t file_blk = ioreq->blkdev->file_blk;
> +
> +    if (ioreq->v.niov == 0) {
> +        return 0;
> +    }
> +
> +    count = ioreq->v.niov;
> +
> +    for (i = 0; i < count; i++) {
> +
> +        if (ioreq->req.operation == BLKIF_OP_READ) {
> +            segs[i].flags = GNTCOPY_dest_gref;
> +            segs[i].dest.foreign.ref = ioreq->refs[i];
> +            segs[i].dest.foreign.domid = ioreq->domids[i];
> +            segs[i].dest.foreign.offset = ioreq->req.seg[i].first_sect * file_blk;
> +            segs[i].source.virt = ioreq->v.iov[i].iov_base;
> +        } else {
> +            segs[i].flags = GNTCOPY_source_gref;
> +            segs[i].source.foreign.ref = ioreq->refs[i];
> +            segs[i].source.foreign.domid = ioreq->domids[i];
> +            segs[i].source.foreign.offset = ioreq->req.seg[i].first_sect * file_blk;
> +            segs[i].dest.virt = ioreq->v.iov[i].iov_base;
> +        }
> +        segs[i].len = (ioreq->req.seg[i].last_sect
> +                       - ioreq->req.seg[i].first_sect + 1) * file_blk;
> +
> +    }
> +
> +    rc = xengnttab_grant_copy(gnt, count, segs);
> +
> +    if (rc) {
> +        xen_be_printf(&ioreq->blkdev->xendev, 0,
> +                      "failed to copy data %d \n", rc);
> +        ioreq->aio_errors++;
> +        return -1;
> +    } else {
> +        r = 0;
> +    }
> +
> +    for (i = 0; i < count; i++) {
> +        if (segs[i].status != GNTST_okay) {
> +            xen_be_printf(&ioreq->blkdev->xendev, 3,
> +                          "failed to copy data %d for gref %d, domid %d\n", rc,
> +                          ioreq->refs[i], ioreq->domids[i]);
> +            ioreq->aio_errors++;
> +            r = -1;
> +        }
> +    }
> +
> +    return r;
> +}
> +
>  static int ioreq_runio_qemu_aio(struct ioreq *ioreq);
>  
>  static void qemu_aio_complete(void *opaque, int ret)
> @@ -528,8 +624,31 @@ static void qemu_aio_complete(void *opaque, int ret)
>          return;
>      }
>  
> +    if (ioreq->blkdev->feature_grant_copy) {
> +        switch (ioreq->req.operation) {
> +        case BLKIF_OP_READ:
> +            /* in case of failure ioreq->aio_errors is increased */
> +            ioreq_copy(ioreq);
> +            free_buffers(ioreq);
> +            break;
> +        case BLKIF_OP_WRITE:
> +        case BLKIF_OP_FLUSH_DISKCACHE:
> +            if (!ioreq->req.nr_segments) {
> +                break;
> +            }
> +            free_buffers(ioreq);
> +            break;
> +        default:
> +            break;
> +        }
> +    }
> +
>      ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY;
> -    ioreq_unmap(ioreq);
> +
> +    if (!ioreq->blkdev->feature_grant_copy) {
> +        ioreq_unmap(ioreq);
> +    }
> +
>      ioreq_finish(ioreq);
>      switch (ioreq->req.operation) {
>      case BLKIF_OP_WRITE:
> @@ -547,14 +666,42 @@ static void qemu_aio_complete(void *opaque, int ret)
>      qemu_bh_schedule(ioreq->blkdev->bh);
>  }
>  
> +static int ioreq_runio_qemu_aio_blk(struct ioreq *ioreq);
> +
>  static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>  {
> -    struct XenBlkDev *blkdev = ioreq->blkdev;
> +    if (ioreq->blkdev->feature_grant_copy) {
> +
> +        ioreq_init_copy_buffers(ioreq);
> +        if (ioreq->req.nr_segments && (ioreq->req.operation == BLKIF_OP_WRITE ||
> +            ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE)) {
> +            if (ioreq_copy(ioreq)) {

Would it make sens to do this ioreq_copy() directly in
ioreq_runio_qemu_aio_blk ?

> +                free_buffers(ioreq);
> +                goto err;
> +            }
> +        }
> +        if (ioreq_runio_qemu_aio_blk(ioreq)) goto err;
> +
> +    } else {
>  
> -    if (ioreq->req.nr_segments && ioreq_map(ioreq) == -1) {
> -        goto err_no_map;
> +        if (ioreq->req.nr_segments && ioreq_map(ioreq)) goto err;
> +        if (ioreq_runio_qemu_aio_blk(ioreq)) {
> +            ioreq_unmap(ioreq);
> +            goto err;
> +        }
>      }
>  
> +    return 0;
> +err:
> +    ioreq_finish(ioreq);
> +    ioreq->status = BLKIF_RSP_ERROR;
> +    return -1;
> +}
> +
> +static int ioreq_runio_qemu_aio_blk(struct ioreq *ioreq)
> +{
> +    struct XenBlkDev *blkdev = ioreq->blkdev;
> +
>      ioreq->aio_inflight++;
>      if (ioreq->presync) {
>          blk_aio_flush(ioreq->blkdev->blk, qemu_aio_complete, ioreq);
> @@ -594,19 +741,12 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>      }
>      default:
>          /* unknown operation (shouldn't happen -- parse catches this) */
> -        goto err;
> +        return -1;
>      }
>  
>      qemu_aio_complete(ioreq, 0);
>  
>      return 0;
> -
> -err:
> -    ioreq_unmap(ioreq);
> -err_no_map:
> -    ioreq_finish(ioreq);
> -    ioreq->status = BLKIF_RSP_ERROR;
> -    return -1;
>  }
>  
>  static int blk_send_response_one(struct ioreq *ioreq)
> @@ -1020,10 +1160,17 @@ static int blk_connect(struct XenDevice *xendev)
>  
>      xen_be_bind_evtchn(&blkdev->xendev);
>  
> +    blkdev->feature_grant_copy =
> +                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
> +
> +    xen_be_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
> +                  blkdev->feature_grant_copy ? "enabled" : "disabled");
> +
>      xen_be_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
>                    "remote port %d, local port %d\n",
>                    blkdev->xendev.protocol, blkdev->ring_ref,
>                    blkdev->xendev.remote_port, blkdev->xendev.local_port);
> +
>      return 0;
>  }

Other things, could you rebase your patch on QEMU upstream and CC
qemu-devel@nongnu.org? You can also check the coding style of the patch
with ./scripts/checkpatch.pl.

Thank you,

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation
  2016-07-15 11:15       ` Wei Liu
@ 2016-07-15 17:11         ` Anthony PERARD
  2016-07-19 10:16           ` Paulina Szubarczyk
  0 siblings, 1 reply; 27+ messages in thread
From: Anthony PERARD @ 2016-07-15 17:11 UTC (permalink / raw)
  To: Wei Liu
  Cc: sstabellini, ian.jackson, Paulina Szubarczyk, david.vrabel,
	xen-devel, roger.pau

On Fri, Jul 15, 2016 at 12:15:45PM +0100, Wei Liu wrote:
> On Fri, Jul 15, 2016 at 12:28:48PM +0200, Paulina Szubarczyk wrote:
> > 
> > 
> > On 07/14/2016 12:37 PM, Wei Liu wrote:
> > >On Wed, Jun 22, 2016 at 10:38:53AM +0200, Paulina Szubarczyk wrote:
> > >>diff --git a/configure b/configure
> > >>index e41876a..355d3fa 100755
> > >>--- a/configure
> > >>+++ b/configure
> > >>@@ -1843,7 +1843,7 @@ fi
> > >>  # xen probe
> > >>
> > >>  if test "$xen" != "no" ; then
> > >>-  xen_libs="-lxenstore -lxenctrl -lxenguest"
> > >>+  xen_libs="-lxenstore -lxenctrl -lxenguest -lxengnttab"
> > >>
> > >
> > >First thing, -lxengnttab should be in xen_stable_libs.
> > >
> > Do I understand correctly that I should add a new variable
> > "xen_stable_libs"? I could not find it in the qemu tree used anywhere else.
> > 
> 
> Hmm... there is already one in upstream QEMU -- which means you're
> perhaps using qemu-xen tree.
> 
> I think all new development should happen on upstream qemu, not in our
> qemu-xen tree.
> 
> > >The probing needs to be more sophisticated.
> > >
> > >You need to probe the new function your added as well. Just a few lines
> > >below xen_stable_libs there is a section for hand-coded probing source
> > >code, which you would need to modify.
> > >
> > >Assuming your gnttab change will be merged into 4.8 (the release under
> > >development at the moment), you need to have a separate program for it.
> > >
> > I will add that.
> > 
> > >After you've done proper probing, you will know which version of Xen
> > >this qemu is compiling against.  And then, there should be some fallback
> > >mechanism to compile and run this qemu with older version of xen. This
> > >is not too hard because you can guard your code with feature flag or
> > >ifdef (please consult Stefan and Anthony which method to use).
> > >
> > >Feel free to ask questions. I will try my best to explain.
> > >
> > >>
> > >>+    blkdev->feature_grant_copy =
> > >>+                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
> > >
> > >This is a bit problematic. As this patch stands, it won't compile on
> > >older version of Xen because there is no such function there.
> > 
> > There is a variable CONFIG_XEN_CTRL_INTERFACE_VERSION holding current
> > version of the Xen control library this qemu is configured with. It is set
> > from the configure file. The feature could be guarded with ifdef by a new
> > variable CONFIG_XEN_LIBS_INTERFACE_VERSION or they could be unified to
> > CONFIG_XEN_TOOLS_INTERFACE_VERSION to not fill the same value twice.
> > 
> 
> Another way is to provide a stub for this function to always return 0.
> 
> Please wait for Stefano and Anthony to see which method they prefer.

I think using CONFIG_XEN_CTRL_INTERFACE_VERSION is fine. With maybe a
stub of xengnttab_grant_copy() in xen_common.h.

-- 
Anthony PERARD

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation
  2016-06-22  8:38 ` [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation Paulina Szubarczyk
                     ` (2 preceding siblings ...)
  2016-07-15 16:55   ` Anthony PERARD
@ 2016-07-19  9:12   ` Roger Pau Monné
  2016-07-19 10:12     ` Paulina Szubarczyk
  3 siblings, 1 reply; 27+ messages in thread
From: Roger Pau Monné @ 2016-07-19  9:12 UTC (permalink / raw)
  To: Paulina Szubarczyk
  Cc: sstabellini, wei.liu2, ian.jackson, david.vrabel, anthony.perard,
	xen-devel

On Wed, Jun 22, 2016 at 10:38:53AM +0200, Paulina Szubarczyk wrote:
> Copy data operated on during request from/to local buffers to/from
> the grant references.
> 
> Before grant copy operation local buffers must be allocated what is
> done by calling ioreq_init_copy_buffers. For the 'read' operation,
> first, the qemu device invokes the read operation on local buffers
> and on the completion grant copy is called and buffers are freed.
> For the 'write' operation grant copy is performed before invoking
> write by qemu device.
> 
> A new value 'feature_grant_copy' is added to recognize when the
> grant copy operation is supported by a guest.
> The body of the function 'ioreq_runio_qemu_aio' is moved to
> 'ioreq_runio_qemu_aio_blk' and in the 'ioreq_runio_qemu_aio' depending
> on the support for grant copy according checks, initialization, grant
> operation are made, then the 'ioreq_runio_qemu_aio_blk' function is
> called.
> 
> Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com>
> ---
> Changes since v2:
> - to use the xengnttab_* function directly added -lxengnttab to configure
>   and include <xengnttab.h> in include/hw/xen/xen_common.h
> - in ioreq_copy removed an out path, changed a log level, made explicit 
>   assignement to 'xengnttab_copy_grant_segment'
> * I did not change the way of testing if grant_copy operation is implemented.
>   As far as I understand if the code from gnttab_unimp.c is used then the gnttab 
>   device is unavailable and the handler to gntdev would be invalid. But 
>   if the handler is valid then the ioctl should return operation unimplemented 
>   if the gntdev does not implement the operation.
> 
>  configure                   |   2 +-
>  hw/block/xen_disk.c         | 171 ++++++++++++++++++++++++++++++++++++++++----
>  include/hw/xen/xen_common.h |   2 +
>  3 files changed, 162 insertions(+), 13 deletions(-)

[...]
 
> @@ -1020,10 +1160,17 @@ static int blk_connect(struct XenDevice *xendev)
>  
>      xen_be_bind_evtchn(&blkdev->xendev);
>  
> +    blkdev->feature_grant_copy =
> +                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);

Isn't this going to trigger an abort on OSes that don't implement 
xengnttab_grant_copy? AFAICT the 'unimplemented' handler in libgnttab for 
this is just an abort.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation
  2016-07-19  9:12   ` Roger Pau Monné
@ 2016-07-19 10:12     ` Paulina Szubarczyk
  0 siblings, 0 replies; 27+ messages in thread
From: Paulina Szubarczyk @ 2016-07-19 10:12 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: sstabellini, wei.liu2, ian.jackson, david.vrabel, anthony.perard,
	xen-devel



On 07/19/2016 11:12 AM, Roger Pau Monné wrote:
> On Wed, Jun 22, 2016 at 10:38:53AM +0200, Paulina Szubarczyk wrote:
>> Copy data operated on during request from/to local buffers to/from
>> the grant references.
>>
>> Before grant copy operation local buffers must be allocated what is
>> done by calling ioreq_init_copy_buffers. For the 'read' operation,
>> first, the qemu device invokes the read operation on local buffers
>> and on the completion grant copy is called and buffers are freed.
>> For the 'write' operation grant copy is performed before invoking
>> write by qemu device.
>>
>> A new value 'feature_grant_copy' is added to recognize when the
>> grant copy operation is supported by a guest.
>> The body of the function 'ioreq_runio_qemu_aio' is moved to
>> 'ioreq_runio_qemu_aio_blk' and in the 'ioreq_runio_qemu_aio' depending
>> on the support for grant copy according checks, initialization, grant
>> operation are made, then the 'ioreq_runio_qemu_aio_blk' function is
>> called.
>>
>> Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com>
>> ---
>> Changes since v2:
>> - to use the xengnttab_* function directly added -lxengnttab to configure
>>    and include <xengnttab.h> in include/hw/xen/xen_common.h
>> - in ioreq_copy removed an out path, changed a log level, made explicit
>>    assignement to 'xengnttab_copy_grant_segment'
>> * I did not change the way of testing if grant_copy operation is implemented.
>>    As far as I understand if the code from gnttab_unimp.c is used then the gnttab
>>    device is unavailable and the handler to gntdev would be invalid. But
>>    if the handler is valid then the ioctl should return operation unimplemented
>>    if the gntdev does not implement the operation.
>>
>>   configure                   |   2 +-
>>   hw/block/xen_disk.c         | 171 ++++++++++++++++++++++++++++++++++++++++----
>>   include/hw/xen/xen_common.h |   2 +
>>   3 files changed, 162 insertions(+), 13 deletions(-)
>
> [...]
>
>> @@ -1020,10 +1160,17 @@ static int blk_connect(struct XenDevice *xendev)
>>
>>       xen_be_bind_evtchn(&blkdev->xendev);
>>
>> +    blkdev->feature_grant_copy =
>> +                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
>
> Isn't this going to trigger an abort on OSes that don't implement
> xengnttab_grant_copy? AFAICT the 'unimplemented' handler in libgnttab for
> this is just an abort.

So is the xengnttab_map_grant_refs and the pointer to 
blkdev->xendev.gnttabdev would be invalid so the sring would not be 
initialized a few lines earlier in that function leading to the fail of 
the initialization. In case the gntdev does not implement the ioctl then 
only an error code will be returned.

Paulina
>
> Roger.
>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation
  2016-07-15 17:11         ` Anthony PERARD
@ 2016-07-19 10:16           ` Paulina Szubarczyk
  0 siblings, 0 replies; 27+ messages in thread
From: Paulina Szubarczyk @ 2016-07-19 10:16 UTC (permalink / raw)
  To: Anthony PERARD, Wei Liu
  Cc: xen-devel, sstabellini, ian.jackson, david.vrabel, roger.pau



On 07/15/2016 07:11 PM, Anthony PERARD wrote:
> On Fri, Jul 15, 2016 at 12:15:45PM +0100, Wei Liu wrote:
>> On Fri, Jul 15, 2016 at 12:28:48PM +0200, Paulina Szubarczyk wrote:
>>>
>>>
>>> On 07/14/2016 12:37 PM, Wei Liu wrote:
>>>> On Wed, Jun 22, 2016 at 10:38:53AM +0200, Paulina Szubarczyk wrote:
>>>>> diff --git a/configure b/configure
>>>>> index e41876a..355d3fa 100755
>>>>> --- a/configure
>>>>> +++ b/configure
>>>>> @@ -1843,7 +1843,7 @@ fi
>>>>>   # xen probe
>>>>>
>>>>>   if test "$xen" != "no" ; then
>>>>> -  xen_libs="-lxenstore -lxenctrl -lxenguest"
>>>>> +  xen_libs="-lxenstore -lxenctrl -lxenguest -lxengnttab"
>>>>>
>>>>
>>>> First thing, -lxengnttab should be in xen_stable_libs.
>>>>
>>> Do I understand correctly that I should add a new variable
>>> "xen_stable_libs"? I could not find it in the qemu tree used anywhere else.
>>>
>>
>> Hmm... there is already one in upstream QEMU -- which means you're
>> perhaps using qemu-xen tree.
>>
>> I think all new development should happen on upstream qemu, not in our
>> qemu-xen tree.
>>
>>>> The probing needs to be more sophisticated.
>>>>
>>>> You need to probe the new function your added as well. Just a few lines
>>>> below xen_stable_libs there is a section for hand-coded probing source
>>>> code, which you would need to modify.
>>>>
>>>> Assuming your gnttab change will be merged into 4.8 (the release under
>>>> development at the moment), you need to have a separate program for it.
>>>>
>>> I will add that.
>>>
>>>> After you've done proper probing, you will know which version of Xen
>>>> this qemu is compiling against.  And then, there should be some fallback
>>>> mechanism to compile and run this qemu with older version of xen. This
>>>> is not too hard because you can guard your code with feature flag or
>>>> ifdef (please consult Stefan and Anthony which method to use).
>>>>
>>>> Feel free to ask questions. I will try my best to explain.
>>>>
>>>>>
>>>>> +    blkdev->feature_grant_copy =
>>>>> +                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
>>>>
>>>> This is a bit problematic. As this patch stands, it won't compile on
>>>> older version of Xen because there is no such function there.
>>>
>>> There is a variable CONFIG_XEN_CTRL_INTERFACE_VERSION holding current
>>> version of the Xen control library this qemu is configured with. It is set
>>> from the configure file. The feature could be guarded with ifdef by a new
>>> variable CONFIG_XEN_LIBS_INTERFACE_VERSION or they could be unified to
>>> CONFIG_XEN_TOOLS_INTERFACE_VERSION to not fill the same value twice.
>>>
>>
>> Another way is to provide a stub for this function to always return 0.
>>
>> Please wait for Stefano and Anthony to see which method they prefer.
>
> I think using CONFIG_XEN_CTRL_INTERFACE_VERSION is fine. With maybe a
> stub of xengnttab_grant_copy() in xen_common.h.
>
I will add the stub but the structure xengnttab_grant_copy_segment need 
to be repeated in the xen_common.h again, it is also not defined in the 
earlier versions.

Paulina

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation
  2016-07-15 16:55   ` Anthony PERARD
@ 2016-07-19 10:51     ` Paulina Szubarczyk
  0 siblings, 0 replies; 27+ messages in thread
From: Paulina Szubarczyk @ 2016-07-19 10:51 UTC (permalink / raw)
  To: Anthony PERARD
  Cc: sstabellini, wei.liu2, ian.jackson, david.vrabel, xen-devel, roger.pau



On 07/15/2016 06:55 PM, Anthony PERARD wrote:
> On Wed, Jun 22, 2016 at 10:38:53AM +0200, Paulina Szubarczyk wrote:
>> Copy data operated on during request from/to local buffers to/from
>> the grant references.
>>
>> Before grant copy operation local buffers must be allocated what is
>> done by calling ioreq_init_copy_buffers. For the 'read' operation,
>> first, the qemu device invokes the read operation on local buffers
>> and on the completion grant copy is called and buffers are freed.
>> For the 'write' operation grant copy is performed before invoking
>> write by qemu device.
>>
>> A new value 'feature_grant_copy' is added to recognize when the
>> grant copy operation is supported by a guest.
>> The body of the function 'ioreq_runio_qemu_aio' is moved to
>> 'ioreq_runio_qemu_aio_blk' and in the 'ioreq_runio_qemu_aio' depending
>> on the support for grant copy according checks, initialization, grant
>> operation are made, then the 'ioreq_runio_qemu_aio_blk' function is
>> called.
>>
>> Signed-off-by: Paulina Szubarczyk <paulinaszubarczyk@gmail.com>
>
>> diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
>> index 37e14d1..4eca06a 100644
>> --- a/hw/block/xen_disk.c
>> +++ b/hw/block/xen_disk.c
>> @@ -500,6 +503,99 @@ static int ioreq_map(struct ioreq *ioreq)
>>       return 0;
>>   }
>>
>> +static void* get_buffer(int count)
>> +{
>> +    return xc_memalign(xen_xc, XC_PAGE_SIZE, count*XC_PAGE_SIZE);
>
> Instead of xc_memalign, I think you need to call qemu_memalign() here.
> Have a look at the file HACKING, the section '3. Low level memory
> management'. Also, you probably do not need an the extra function
> get_buffer() and can call qemu_memalign() directly in
> ioreq_init_copy_buffers().
>

Ok, I will changed that.

>> +}
>> +
>> +static void free_buffers(struct ioreq *ioreq)
>> +{
>> +    int i;
>> +
>> +    for (i = 0; i < ioreq->v.niov; i++) {
>> +        ioreq->page[i] = NULL;
>> +    }
>> +
>> +    free(ioreq->pages);
>
> With the use of qemu_memalign, this would need to be qemu_vfree().
>
>> +}
>> +
>> +static int ioreq_init_copy_buffers(struct ioreq *ioreq) {
>> +    int i;
>> +
>> +    if (ioreq->v.niov == 0) {
>> +        return 0;
>> +    }
>> +
>> +    ioreq->pages = get_buffer(ioreq->v.niov);
>> +    if (!ioreq->pages) {
>> +        return -1;
>> +    }
>> +
>> +    for (i = 0; i < ioreq->v.niov; i++) {
>> +        ioreq->page[i] = ioreq->pages + i*XC_PAGE_SIZE;
>> +        ioreq->v.iov[i].iov_base += (uintptr_t)ioreq->page[i];
>
> Is the += intended here?
>

I was suggested by ioreq_map assignment to the ioreq->v.iov[i].iov_base 
which is made that way. But I do not think that makes sense to sum up 
the pointers. I will change it to =.

>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int ioreq_copy(struct ioreq *ioreq)
>> +{
>> +    XenGnttab gnt = ioreq->blkdev->xendev.gnttabdev;
>> +    xengnttab_grant_copy_segment_t segs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
>> +    int i, count = 0, r, rc;
>> +    int64_t file_blk = ioreq->blkdev->file_blk;
>> +
>> +    if (ioreq->v.niov == 0) {
>> +        return 0;
>> +    }
>> +
>> +    count = ioreq->v.niov;
>> +
>> +    for (i = 0; i < count; i++) {
>> +
>> +        if (ioreq->req.operation == BLKIF_OP_READ) {
>> +            segs[i].flags = GNTCOPY_dest_gref;
>> +            segs[i].dest.foreign.ref = ioreq->refs[i];
>> +            segs[i].dest.foreign.domid = ioreq->domids[i];
>> +            segs[i].dest.foreign.offset = ioreq->req.seg[i].first_sect * file_blk;
>> +            segs[i].source.virt = ioreq->v.iov[i].iov_base;
>> +        } else {
>> +            segs[i].flags = GNTCOPY_source_gref;
>> +            segs[i].source.foreign.ref = ioreq->refs[i];
>> +            segs[i].source.foreign.domid = ioreq->domids[i];
>> +            segs[i].source.foreign.offset = ioreq->req.seg[i].first_sect * file_blk;
>> +            segs[i].dest.virt = ioreq->v.iov[i].iov_base;
>> +        }
>> +        segs[i].len = (ioreq->req.seg[i].last_sect
>> +                       - ioreq->req.seg[i].first_sect + 1) * file_blk;
>> +
>> +    }
>> +
>> +    rc = xengnttab_grant_copy(gnt, count, segs);
>> +
>> +    if (rc) {
>> +        xen_be_printf(&ioreq->blkdev->xendev, 0,
>> +                      "failed to copy data %d \n", rc);
>> +        ioreq->aio_errors++;
>> +        return -1;
>> +    } else {
>> +        r = 0;
>> +    }
>> +
>> +    for (i = 0; i < count; i++) {
>> +        if (segs[i].status != GNTST_okay) {
>> +            xen_be_printf(&ioreq->blkdev->xendev, 3,
>> +                          "failed to copy data %d for gref %d, domid %d\n", rc,
>> +                          ioreq->refs[i], ioreq->domids[i]);
>> +            ioreq->aio_errors++;
>> +            r = -1;
>> +        }
>> +    }
>> +
>> +    return r;
>> +}
>> +
>>   static int ioreq_runio_qemu_aio(struct ioreq *ioreq);
>>
>>   static void qemu_aio_complete(void *opaque, int ret)
>> @@ -528,8 +624,31 @@ static void qemu_aio_complete(void *opaque, int ret)
>>           return;
>>       }
>>
>> +    if (ioreq->blkdev->feature_grant_copy) {
>> +        switch (ioreq->req.operation) {
>> +        case BLKIF_OP_READ:
>> +            /* in case of failure ioreq->aio_errors is increased */
>> +            ioreq_copy(ioreq);
>> +            free_buffers(ioreq);
>> +            break;
>> +        case BLKIF_OP_WRITE:
>> +        case BLKIF_OP_FLUSH_DISKCACHE:
>> +            if (!ioreq->req.nr_segments) {
>> +                break;
>> +            }
>> +            free_buffers(ioreq);
>> +            break;
>> +        default:
>> +            break;
>> +        }
>> +    }
>> +
>>       ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY;
>> -    ioreq_unmap(ioreq);
>> +
>> +    if (!ioreq->blkdev->feature_grant_copy) {
>> +        ioreq_unmap(ioreq);
>> +    }
>> +
>>       ioreq_finish(ioreq);
>>       switch (ioreq->req.operation) {
>>       case BLKIF_OP_WRITE:
>> @@ -547,14 +666,42 @@ static void qemu_aio_complete(void *opaque, int ret)
>>       qemu_bh_schedule(ioreq->blkdev->bh);
>>   }
>>
>> +static int ioreq_runio_qemu_aio_blk(struct ioreq *ioreq);
>> +
>>   static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>>   {
>> -    struct XenBlkDev *blkdev = ioreq->blkdev;
>> +    if (ioreq->blkdev->feature_grant_copy) {
>> +
>> +        ioreq_init_copy_buffers(ioreq);
>> +        if (ioreq->req.nr_segments && (ioreq->req.operation == BLKIF_OP_WRITE ||
>> +            ioreq->req.operation == BLKIF_OP_FLUSH_DISKCACHE)) {
>> +            if (ioreq_copy(ioreq)) {
>
> Would it make sens to do this ioreq_copy() directly in
> ioreq_runio_qemu_aio_blk ?

Do you mean moving only the ioreq_copy() or remove that new function?

I divided the old ioreq_runio_qemu_aio function on two parts. It 
clarifies which feature is chosen. The grant copy and grant map 
initialization and error handling is different and that way the Xen 
memory management and an actual processing the request by qemu is separated.

>
>> +                free_buffers(ioreq);
>> +                goto err;
>> +            }
>> +        }
>> +        if (ioreq_runio_qemu_aio_blk(ioreq)) goto err;
>> +
>> +    } else {
>>
>> -    if (ioreq->req.nr_segments && ioreq_map(ioreq) == -1) {
>> -        goto err_no_map;
>> +        if (ioreq->req.nr_segments && ioreq_map(ioreq)) goto err;
>> +        if (ioreq_runio_qemu_aio_blk(ioreq)) {
>> +            ioreq_unmap(ioreq);
>> +            goto err;
>> +        }
>>       }
>>
>> +    return 0;
>> +err:
>> +    ioreq_finish(ioreq);
>> +    ioreq->status = BLKIF_RSP_ERROR;
>> +    return -1;
>> +}
>> +
>> +static int ioreq_runio_qemu_aio_blk(struct ioreq *ioreq)
>> +{
>> +    struct XenBlkDev *blkdev = ioreq->blkdev;
>> +
>>       ioreq->aio_inflight++;
>>       if (ioreq->presync) {
>>           blk_aio_flush(ioreq->blkdev->blk, qemu_aio_complete, ioreq);
>> @@ -594,19 +741,12 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
>>       }
>>       default:
>>           /* unknown operation (shouldn't happen -- parse catches this) */
>> -        goto err;
>> +        return -1;
>>       }
>>
>>       qemu_aio_complete(ioreq, 0);
>>
>>       return 0;
>> -
>> -err:
>> -    ioreq_unmap(ioreq);
>> -err_no_map:
>> -    ioreq_finish(ioreq);
>> -    ioreq->status = BLKIF_RSP_ERROR;
>> -    return -1;
>>   }
>>
>>   static int blk_send_response_one(struct ioreq *ioreq)
>> @@ -1020,10 +1160,17 @@ static int blk_connect(struct XenDevice *xendev)
>>
>>       xen_be_bind_evtchn(&blkdev->xendev);
>>
>> +    blkdev->feature_grant_copy =
>> +                (xengnttab_grant_copy(blkdev->xendev.gnttabdev, 0, NULL) == 0);
>> +
>> +    xen_be_printf(&blkdev->xendev, 3, "grant copy operation %s\n",
>> +                  blkdev->feature_grant_copy ? "enabled" : "disabled");
>> +
>>       xen_be_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
>>                     "remote port %d, local port %d\n",
>>                     blkdev->xendev.protocol, blkdev->ring_ref,
>>                     blkdev->xendev.remote_port, blkdev->xendev.local_port);
>> +
>>       return 0;
>>   }
>
> Other things, could you rebase your patch on QEMU upstream and CC
> qemu-devel@nongnu.org? You can also check the coding style of the patch
> with ./scripts/checkpatch.pl.
>
> Thank you,
>

Yes, thank you.

Paulina

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2016-07-19 10:51 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-22  8:38 [PATCH v3 0/2] qemu-qdisk: Implementation of grant copy operation Paulina Szubarczyk
2016-06-22  8:38 ` [PATCH v3 1/2] Interface for grant copy operation in libs Paulina Szubarczyk
2016-06-22  9:37   ` David Vrabel
2016-06-22  9:53     ` Paulina Szubarczyk
2016-06-22 11:24       ` Wei Liu
2016-06-22 14:19         ` Paulina Szubarczyk
2016-06-22 11:21     ` Wei Liu
2016-06-22 12:37       ` David Vrabel
2016-06-22 13:29         ` Wei Liu
2016-06-22 13:52           ` David Vrabel
2016-06-22 14:52             ` Wei Liu
2016-06-22 16:49               ` Wei Liu
2016-07-06 15:49                 ` Roger Pau Monné
2016-07-05 16:27               ` George Dunlap
2016-07-08 13:18   ` Wei Liu
2016-07-13  9:12     ` Wei Liu
2016-06-22  8:38 ` [PATCH v3 2/2] qdisk - hw/block/xen_disk: grant copy implementation Paulina Szubarczyk
2016-07-13 12:34   ` Paulina Szubarczyk
2016-07-14 10:37   ` Wei Liu
2016-07-15 10:28     ` Paulina Szubarczyk
2016-07-15 11:15       ` Wei Liu
2016-07-15 17:11         ` Anthony PERARD
2016-07-19 10:16           ` Paulina Szubarczyk
2016-07-15 16:55   ` Anthony PERARD
2016-07-19 10:51     ` Paulina Szubarczyk
2016-07-19  9:12   ` Roger Pau Monné
2016-07-19 10:12     ` Paulina Szubarczyk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).