* [Qemu-devel] [PATCH v2 0/2] backup: Use copy offloading
@ 2018-06-05  8:07 Fam Zheng
  2018-06-05  8:07 ` [Qemu-devel] [PATCH v2 1/2] block: Honour BDRV_REQ_NO_SERIALISING in copy range Fam Zheng
  2018-06-05  8:07 ` [Qemu-devel] [PATCH v2 2/2] backup: Use copy offloading Fam Zheng
  0 siblings, 2 replies; 5+ messages in thread
From: Fam Zheng @ 2018-06-05  8:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Fam Zheng, Stefan Hajnoczi, Kevin Wolf, Max Reitz, Jeff Cody

Based-on: <20180601092648.24614-1-famz@redhat.com>
([PATCH v9 00/10] qemu-img convert with copy offloading)

This enhances the backup job to make use of the copy offloading API. It
eliminates the need for a bounce buffer and speeds up the copy operation
when the backend supports it.

v2: Use helper functions. [Stefan]

Fam Zheng (2):
  block: Honour BDRV_REQ_NO_SERIALISING in copy range
  backup: Use copy offloading

 block/backup.c        | 151 ++++++++++++++++++++++++++++++------------
 block/io.c            |   6 +-
 block/trace-events    |   1 +
 include/block/block.h |   5 +-
 4 files changed, 118 insertions(+), 45 deletions(-)

-- 
2.17.0


* [Qemu-devel] [PATCH v2 1/2] block: Honour BDRV_REQ_NO_SERIALISING in copy range
  2018-06-05  8:07 [Qemu-devel] [PATCH v2 0/2] backup: Use copy offloading Fam Zheng
@ 2018-06-05  8:07 ` Fam Zheng
  2018-06-05  8:07 ` [Qemu-devel] [PATCH v2 2/2] backup: Use copy offloading Fam Zheng
  1 sibling, 0 replies; 5+ messages in thread
From: Fam Zheng @ 2018-06-05  8:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Fam Zheng, Stefan Hajnoczi, Kevin Wolf, Max Reitz, Jeff Cody

These semantics are needed by drive-backup, so implement them before
using this API there.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/io.c            | 6 ++++--
 include/block/block.h | 5 +++--
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/block/io.c b/block/io.c
index b7beaeeb9f..f55e0c39fe 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2920,8 +2920,10 @@ int coroutine_fn bdrv_co_copy_range(BdrvChild *src, uint64_t src_offset,
     tracked_request_begin(&dst_req, dst_bs, dst_offset,
                           bytes, BDRV_TRACKED_WRITE);
 
-    wait_serialising_requests(&src_req);
-    wait_serialising_requests(&dst_req);
+    if (!(flags & BDRV_REQ_NO_SERIALISING)) {
+        wait_serialising_requests(&src_req);
+        wait_serialising_requests(&dst_req);
+    }
     ret = bdrv_co_copy_range_from(src, src_offset,
                                   dst, dst_offset,
                                   bytes, flags);
diff --git a/include/block/block.h b/include/block/block.h
index 6cc6c7e699..6d3d156927 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -630,13 +630,14 @@ void bdrv_unregister_buf(BlockDriverState *bs, void *host);
  * @dst: Destination child to copy data to
  * @dst_offset: offset in @dst image to write data
  * @bytes: number of bytes to copy
- * @flags: request flags. Must be one of:
- *         0 - actually read data from src;
+ * @flags: request flags. Supported flags:
  *         BDRV_REQ_ZERO_WRITE - treat the @src range as zero data and do zero
  *                               write on @dst as if bdrv_co_pwrite_zeroes is
  *                               called. Used to simplify caller code, or
  *                               during BlockDriver.bdrv_co_copy_range_from()
  *                               recursion.
+ *         BDRV_REQ_NO_SERIALISING - do not serialize with other overlapping
+ *                                   requests currently in flight.
  *
  * Returns: 0 if succeeded; negative error code if failed.
  **/
-- 
2.17.0


* [Qemu-devel] [PATCH v2 2/2] backup: Use copy offloading
  2018-06-05  8:07 [Qemu-devel] [PATCH v2 0/2] backup: Use copy offloading Fam Zheng
  2018-06-05  8:07 ` [Qemu-devel] [PATCH v2 1/2] block: Honour BDRV_REQ_NO_SERIALISING in copy range Fam Zheng
@ 2018-06-05  8:07 ` Fam Zheng
  2018-06-05 12:22   ` Stefan Hajnoczi
  1 sibling, 1 reply; 5+ messages in thread
From: Fam Zheng @ 2018-06-05  8:07 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, Fam Zheng, Stefan Hajnoczi, Kevin Wolf, Max Reitz, Jeff Cody

The implementation is similar to that of 'qemu-img convert': at the
beginning of the job, an offloaded copy is attempted. If it fails,
further I/O goes through the existing bounce buffer code path.

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/backup.c     | 151 +++++++++++++++++++++++++++++++++------------
 block/trace-events |   1 +
 2 files changed, 111 insertions(+), 41 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 4e228e959b..8a7658db12 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -45,6 +45,8 @@ typedef struct BackupBlockJob {
     QLIST_HEAD(, CowRequest) inflight_reqs;
 
     HBitmap *copy_bitmap;
+    bool use_copy_range;
+    int64_t copy_range_size;
 } BackupBlockJob;
 
 static const BlockJobDriver backup_job_driver;
@@ -86,19 +88,102 @@ static void cow_request_end(CowRequest *req)
     qemu_co_queue_restart_all(&req->wait_queue);
 }
 
+/* Copy range to target with a bounce buffer and return the bytes copied.
+ * If an error occurred, return a negative error number. */
+static int coroutine_fn backup_cow_with_bounce_buffer(BackupBlockJob *job,
+                                                      int64_t start,
+                                                      int64_t end,
+                                                      bool is_write_notifier,
+                                                      bool *error_is_read,
+                                                      void **bounce_buffer)
+{
+    int ret;
+    struct iovec iov;
+    QEMUIOVector qiov;
+    BlockBackend *blk = job->common.blk;
+    int nbytes;
+
+    hbitmap_reset(job->copy_bitmap, start / job->cluster_size, 1);
+    nbytes = MIN(job->cluster_size, job->len - start);
+    if (!*bounce_buffer) {
+        *bounce_buffer = blk_blockalign(blk, job->cluster_size);
+    }
+    iov.iov_base = *bounce_buffer;
+    iov.iov_len = nbytes;
+    qemu_iovec_init_external(&qiov, &iov, 1);
+
+    ret = blk_co_preadv(blk, start, qiov.size, &qiov,
+                        is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
+    if (ret < 0) {
+        trace_backup_do_cow_read_fail(job, start, ret);
+        if (error_is_read) {
+            *error_is_read = true;
+        }
+        goto fail;
+    }
+
+    if (qemu_iovec_is_zero(&qiov)) {
+        ret = blk_co_pwrite_zeroes(job->target, start,
+                                   qiov.size, BDRV_REQ_MAY_UNMAP);
+    } else {
+        ret = blk_co_pwritev(job->target, start,
+                             qiov.size, &qiov,
+                             job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
+    }
+    if (ret < 0) {
+        trace_backup_do_cow_write_fail(job, start, ret);
+        if (error_is_read) {
+            *error_is_read = false;
+        }
+        goto fail;
+    }
+
+    return nbytes;
+fail:
+    hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
+    return ret;
+
+}
+
+/* Copy range to target and return the bytes copied. If an error occurred,
+ * return a negative error number. */
+static int backup_cow_with_offload(BackupBlockJob *job,
+                                   int64_t start,
+                                   int64_t end,
+                                   bool is_write_notifier)
+{
+    int ret;
+    int nr_clusters;
+    BlockBackend *blk = job->common.blk;
+    int nbytes;
+
+    assert(QEMU_IS_ALIGNED(job->copy_range_size, job->cluster_size));
+    nbytes = MIN(job->copy_range_size, end - start);
+    nr_clusters = DIV_ROUND_UP(nbytes, job->cluster_size);
+    hbitmap_reset(job->copy_bitmap, start / job->cluster_size,
+                  nr_clusters);
+    ret = blk_co_copy_range(blk, start, job->target, start, nbytes,
+                            is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
+    if (ret < 0) {
+        job->use_copy_range = false;
+        trace_backup_do_cow_copy_range_fail(job, start, ret);
+        hbitmap_set(job->copy_bitmap, start / job->cluster_size,
+                    nr_clusters);
+        return ret;
+    }
+
+    return nbytes;
+}
+
 static int coroutine_fn backup_do_cow(BackupBlockJob *job,
                                       int64_t offset, uint64_t bytes,
                                       bool *error_is_read,
                                       bool is_write_notifier)
 {
-    BlockBackend *blk = job->common.blk;
     CowRequest cow_request;
-    struct iovec iov;
-    QEMUIOVector bounce_qiov;
-    void *bounce_buffer = NULL;
     int ret = 0;
     int64_t start, end; /* bytes */
-    int n; /* bytes */
+    void *bounce_buffer = NULL;
 
     qemu_co_rwlock_rdlock(&job->flush_rwlock);
 
@@ -110,60 +195,38 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
     wait_for_overlapping_requests(job, start, end);
     cow_request_begin(&cow_request, job, start, end);
 
-    for (; start < end; start += job->cluster_size) {
+    while (start < end) {
         if (!hbitmap_get(job->copy_bitmap, start / job->cluster_size)) {
             trace_backup_do_cow_skip(job, start);
+            start += job->cluster_size;
             continue; /* already copied */
         }
-        hbitmap_reset(job->copy_bitmap, start / job->cluster_size, 1);
 
         trace_backup_do_cow_process(job, start);
 
-        n = MIN(job->cluster_size, job->len - start);
-
-        if (!bounce_buffer) {
-            bounce_buffer = blk_blockalign(blk, job->cluster_size);
-        }
-        iov.iov_base = bounce_buffer;
-        iov.iov_len = n;
-        qemu_iovec_init_external(&bounce_qiov, &iov, 1);
-
-        ret = blk_co_preadv(blk, start, bounce_qiov.size, &bounce_qiov,
-                            is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
-        if (ret < 0) {
-            trace_backup_do_cow_read_fail(job, start, ret);
-            if (error_is_read) {
-                *error_is_read = true;
+        if (job->use_copy_range) {
+            ret = backup_cow_with_offload(job, start, end, is_write_notifier);
+            if (ret < 0) {
+                job->use_copy_range = false;
             }
-            hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
-            goto out;
         }
-
-        if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
-            ret = blk_co_pwrite_zeroes(job->target, start,
-                                       bounce_qiov.size, BDRV_REQ_MAY_UNMAP);
-        } else {
-            ret = blk_co_pwritev(job->target, start,
-                                 bounce_qiov.size, &bounce_qiov,
-                                 job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
+        if (!job->use_copy_range) {
+            ret = backup_cow_with_bounce_buffer(job, start, end, is_write_notifier,
+                                                error_is_read, &bounce_buffer);
         }
         if (ret < 0) {
-            trace_backup_do_cow_write_fail(job, start, ret);
-            if (error_is_read) {
-                *error_is_read = false;
-            }
-            hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
-            goto out;
+            break;
         }
 
         /* Publish progress, guest I/O counts as progress too.  Note that the
          * offset field is an opaque progress value, it is not a disk offset.
          */
-        job->bytes_read += n;
-        job_progress_update(&job->common.job, n);
+        start += ret;
+        job->bytes_read += ret;
+        job_progress_update(&job->common.job, ret);
+        ret = 0;
     }
 
-out:
     if (bounce_buffer) {
         qemu_vfree(bounce_buffer);
     }
@@ -665,6 +728,12 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
     } else {
         job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
     }
+    job->use_copy_range = true;
+    job->copy_range_size = MIN_NON_ZERO(blk_get_max_transfer(job->common.blk),
+                                        blk_get_max_transfer(job->target));
+    job->copy_range_size = MAX(job->cluster_size,
+                               QEMU_ALIGN_UP(job->copy_range_size,
+                                             job->cluster_size));
 
     /* Required permissions are already taken with target's blk_new() */
     block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
diff --git a/block/trace-events b/block/trace-events
index 2d59b53fd3..c35287b48a 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -42,6 +42,7 @@ backup_do_cow_skip(void *job, int64_t start) "job %p start %"PRId64
 backup_do_cow_process(void *job, int64_t start) "job %p start %"PRId64
 backup_do_cow_read_fail(void *job, int64_t start, int ret) "job %p start %"PRId64" ret %d"
 backup_do_cow_write_fail(void *job, int64_t start, int ret) "job %p start %"PRId64" ret %d"
+backup_do_cow_copy_range_fail(void *job, int64_t start, int ret) "job %p start %"PRId64" ret %d"
 
 # blockdev.c
 qmp_block_job_cancel(void *job) "job %p"
-- 
2.17.0


* Re: [Qemu-devel] [PATCH v2 2/2] backup: Use copy offloading
  2018-06-05  8:07 ` [Qemu-devel] [PATCH v2 2/2] backup: Use copy offloading Fam Zheng
@ 2018-06-05 12:22   ` Stefan Hajnoczi
  2018-06-05 14:05     ` Fam Zheng
  0 siblings, 1 reply; 5+ messages in thread
From: Stefan Hajnoczi @ 2018-06-05 12:22 UTC (permalink / raw)
  To: Fam Zheng; +Cc: qemu-devel, qemu-block, Kevin Wolf, Max Reitz, Jeff Cody


On Tue, Jun 05, 2018 at 04:07:10PM +0800, Fam Zheng wrote:
> The implementation is similar to that of 'qemu-img convert': at the
> beginning of the job, an offloaded copy is attempted. If it fails,
> further I/O goes through the existing bounce buffer code path.
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>

Nice, this has become much more readable.

> +/* Copy range to target and return the bytes copied. If an error occurred,
> + * return a negative error number. */
> +static int backup_cow_with_offload(BackupBlockJob *job,

coroutine_fn

> +                                   int64_t start,
> +                                   int64_t end,
> +                                   bool is_write_notifier)
> +{
> +    int ret;
> +    int nr_clusters;
> +    BlockBackend *blk = job->common.blk;
> +    int nbytes;
> +
> +    assert(QEMU_IS_ALIGNED(job->copy_range_size, job->cluster_size));
> +    nbytes = MIN(job->copy_range_size, end - start);
> +    nr_clusters = DIV_ROUND_UP(nbytes, job->cluster_size);
> +    hbitmap_reset(job->copy_bitmap, start / job->cluster_size,
> +                  nr_clusters);
> +    ret = blk_co_copy_range(blk, start, job->target, start, nbytes,
> +                            is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
> +    if (ret < 0) {
> +        job->use_copy_range = false;

This is not necessary since backup_do_cow() also sets it to false.  This
function doesn't need to know about job->use_copy_range.



* Re: [Qemu-devel] [PATCH v2 2/2] backup: Use copy offloading
  2018-06-05 12:22   ` Stefan Hajnoczi
@ 2018-06-05 14:05     ` Fam Zheng
  0 siblings, 0 replies; 5+ messages in thread
From: Fam Zheng @ 2018-06-05 14:05 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: qemu-devel, qemu-block, Kevin Wolf, Max Reitz, Jeff Cody

On Tue, 06/05 13:22, Stefan Hajnoczi wrote:
> On Tue, Jun 05, 2018 at 04:07:10PM +0800, Fam Zheng wrote:
> > The implementation is similar to that of 'qemu-img convert': at the
> > beginning of the job, an offloaded copy is attempted. If it fails,
> > further I/O goes through the existing bounce buffer code path.
> > 
> > Signed-off-by: Fam Zheng <famz@redhat.com>
> 
> Nice, this has become much more readable.
> 
> > +/* Copy range to target and return the bytes copied. If an error occurred,
> > + * return a negative error number. */
> > +static int backup_cow_with_offload(BackupBlockJob *job,
> 
> coroutine_fn

Will try to remember this better. :)

> 
> > +                                   int64_t start,
> > +                                   int64_t end,
> > +                                   bool is_write_notifier)
> > +{
> > +    int ret;
> > +    int nr_clusters;
> > +    BlockBackend *blk = job->common.blk;
> > +    int nbytes;
> > +
> > +    assert(QEMU_IS_ALIGNED(job->copy_range_size, job->cluster_size));
> > +    nbytes = MIN(job->copy_range_size, end - start);
> > +    nr_clusters = DIV_ROUND_UP(nbytes, job->cluster_size);
> > +    hbitmap_reset(job->copy_bitmap, start / job->cluster_size,
> > +                  nr_clusters);
> > +    ret = blk_co_copy_range(blk, start, job->target, start, nbytes,
> > +                            is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
> > +    if (ret < 0) {
> > +        job->use_copy_range = false;
> 
> This is not necessary since backup_do_cow() also sets it to false.  This
> function doesn't need to know about job->use_copy_range.

Yes, will fix.

Fam

