* [Qemu-devel] [PATCH 0/5] backup improvements part 1
@ 2017-10-02 14:39 Vladimir Sementsov-Ogievskiy
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 1/5] hbitmap: add next_zero function Vladimir Sementsov-Ogievskiy
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-02 14:39 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: jsnow, famz, mreitz, kwolf, jcody, stefanha, den, vsementsov

Hi all. Here is a continuation of my "new backup architecture" series:
the very first part, but it is meaningful on its own.

01: was previously sent in another of my series, but this is a more
    correct version; one mistake is fixed.
02: only rebased onto the new byte-based interfaces (by Eric); R-b's
    preserved
03: changed to use hbitmap_next_zero; also a comment added about
    none-mode (Fam)
04: commit message rewritten. The patch was controversial, but I need
    it for further backup enhancements, so let's continue the
    discussion.
05: rebased; the logic is unchanged, R-b's preserved


Vladimir Sementsov-Ogievskiy (5):
  hbitmap: add next_zero function
  backup: move from done_bitmap to copy_bitmap
  backup: init copy_bitmap from sync_bitmap for incremental
  backup: simplify non-dirty bits progress processing
  backup: use copy_bitmap in incremental backup

 include/block/dirty-bitmap.h |   1 +
 include/qemu/hbitmap.h       |   8 +++
 block/backup.c               | 117 ++++++++++++++++++++++++-------------------
 block/dirty-bitmap.c         |   5 ++
 util/hbitmap.c               |  29 +++++++++++
 5 files changed, 108 insertions(+), 52 deletions(-)

-- 
2.11.1

* [Qemu-devel] [PATCH 1/5] hbitmap: add next_zero function
  2017-10-02 14:39 [Qemu-devel] [PATCH 0/5] backup improvements part 1 Vladimir Sementsov-Ogievskiy
@ 2017-10-02 14:39 ` Vladimir Sementsov-Ogievskiy
  2017-10-02 15:43   ` Eric Blake
  2017-10-09 21:51   ` John Snow
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 2/5] backup: move from done_bitmap to copy_bitmap Vladimir Sementsov-Ogievskiy
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-02 14:39 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: jsnow, famz, mreitz, kwolf, jcody, stefanha, den, vsementsov

The function searches for the next zero bit.
Also add an interface for BdrvDirtyBitmap.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 include/block/dirty-bitmap.h |  1 +
 include/qemu/hbitmap.h       |  8 ++++++++
 block/dirty-bitmap.c         |  5 +++++
 util/hbitmap.c               | 29 +++++++++++++++++++++++++++++
 4 files changed, 43 insertions(+)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 3579a7597c..a591c27213 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -91,5 +91,6 @@ bool bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs);
 BdrvDirtyBitmap *bdrv_dirty_bitmap_next(BlockDriverState *bs,
                                         BdrvDirtyBitmap *bitmap);
 char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp);
+int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, uint64_t start);
 
 #endif
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 81e78043d1..6b6490ecad 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -292,6 +292,14 @@ void hbitmap_iter_init(HBitmapIter *hbi, const HBitmap *hb, uint64_t first);
  */
 unsigned long hbitmap_iter_skip_words(HBitmapIter *hbi);
 
+/* hbitmap_next_zero:
+ * @hb: The HBitmap to operate on
+ * @start: The bit to start from.
+ *
+ * Find the next zero (not dirty) bit.
+ */
+int64_t hbitmap_next_zero(const HBitmap *hb, uint64_t start);
+
 /* hbitmap_create_meta:
  * Create a "meta" hbitmap to track dirtiness of the bits in this HBitmap.
  * The caller owns the created bitmap and must call hbitmap_free_meta(hb) to
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index bd04e991b1..7879d13ddb 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -715,3 +715,8 @@ char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
 {
     return hbitmap_sha256(bitmap->bitmap, errp);
 }
+
+int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, uint64_t offset)
+{
+    return hbitmap_next_zero(bitmap->bitmap, offset);
+}
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 2f9d0fdbd0..ffcdbc5587 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -188,6 +188,35 @@ void hbitmap_iter_init(HBitmapIter *hbi, const HBitmap *hb, uint64_t first)
     }
 }
 
+int64_t hbitmap_next_zero(const HBitmap *hb, uint64_t start)
+{
+    size_t pos = (start >> hb->granularity) >> BITS_PER_LEVEL;
+    unsigned long *last_lev = hb->levels[HBITMAP_LEVELS - 1];
+    uint64_t sz = hb->sizes[HBITMAP_LEVELS - 1];
+    unsigned long cur = last_lev[pos];
+    unsigned start_bit_offset =
+            (start >> hb->granularity) & (BITS_PER_LONG - 1);
+    int64_t res;
+    cur |= (1UL << start_bit_offset) - 1;
+    assert((start >> hb->granularity) < hb->size);
+
+    if (cur == (unsigned long)-1) {
+        do {
+            pos++;
+        } while (pos < sz && last_lev[pos] == (unsigned long)-1);
+
+        if (pos >= sz) {
+            return -1;
+        }
+
+        cur = last_lev[pos];
+    }
+
+    res = (pos << BITS_PER_LEVEL) + ctol(cur);
+
+    return res < hb->size ? (res << hb->granularity) : -1;
+}
+
 bool hbitmap_empty(const HBitmap *hb)
 {
     return hb->count == 0;
-- 
2.11.1

* [Qemu-devel] [PATCH 2/5] backup: move from done_bitmap to copy_bitmap
  2017-10-02 14:39 [Qemu-devel] [PATCH 0/5] backup improvements part 1 Vladimir Sementsov-Ogievskiy
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 1/5] hbitmap: add next_zero function Vladimir Sementsov-Ogievskiy
@ 2017-10-02 14:39 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 22:16   ` John Snow
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 3/5] backup: init copy_bitmap from sync_bitmap for incremental Vladimir Sementsov-Ogievskiy
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-02 14:39 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: jsnow, famz, mreitz, kwolf, jcody, stefanha, den, vsementsov

Use an HBitmap copy_bitmap instead of done_bitmap. This is needed to
improve incremental backup in the following patches and to unify the
backup loop for full/incremental modes in future patches.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/backup.c | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 06ddbfd03d..08efec639f 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -40,11 +40,12 @@ typedef struct BackupBlockJob {
     BlockdevOnError on_target_error;
     CoRwlock flush_rwlock;
     uint64_t bytes_read;
-    unsigned long *done_bitmap;
     int64_t cluster_size;
     bool compress;
     NotifierWithReturn before_write;
     QLIST_HEAD(, CowRequest) inflight_reqs;
+
+    HBitmap *copy_bitmap;
 } BackupBlockJob;
 
 /* See if in-flight requests overlap and wait for them to complete */
@@ -109,10 +110,11 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
     cow_request_begin(&cow_request, job, start, end);
 
     for (; start < end; start += job->cluster_size) {
-        if (test_bit(start / job->cluster_size, job->done_bitmap)) {
+        if (!hbitmap_get(job->copy_bitmap, start / job->cluster_size)) {
             trace_backup_do_cow_skip(job, start);
             continue; /* already copied */
         }
+        hbitmap_reset(job->copy_bitmap, start / job->cluster_size, 1);
 
         trace_backup_do_cow_process(job, start);
 
@@ -132,6 +134,7 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
             if (error_is_read) {
                 *error_is_read = true;
             }
+            hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
             goto out;
         }
 
@@ -148,11 +151,10 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
             if (error_is_read) {
                 *error_is_read = false;
             }
+            hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
             goto out;
         }
 
-        set_bit(start / job->cluster_size, job->done_bitmap);
-
         /* Publish progress, guest I/O counts as progress too.  Note that the
          * offset field is an opaque progress value, it is not a disk offset.
          */
@@ -260,7 +262,7 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
     }
 
     len = DIV_ROUND_UP(backup_job->common.len, backup_job->cluster_size);
-    bitmap_zero(backup_job->done_bitmap, len);
+    hbitmap_set(backup_job->copy_bitmap, 0, len);
 }
 
 void backup_wait_for_overlapping_requests(BlockJob *job, int64_t offset,
@@ -425,14 +427,15 @@ static void coroutine_fn backup_run(void *opaque)
     BackupBlockJob *job = opaque;
     BackupCompleteData *data;
     BlockDriverState *bs = blk_bs(job->common.blk);
-    int64_t offset;
+    int64_t offset, nb_clusters;
     int ret = 0;
 
     QLIST_INIT(&job->inflight_reqs);
     qemu_co_rwlock_init(&job->flush_rwlock);
 
-    job->done_bitmap = bitmap_new(DIV_ROUND_UP(job->common.len,
-                                               job->cluster_size));
+    nb_clusters = DIV_ROUND_UP(job->common.len, job->cluster_size);
+    job->copy_bitmap = hbitmap_alloc(nb_clusters, 0);
+    hbitmap_set(job->copy_bitmap, 0, nb_clusters);
 
     job->before_write.notify = backup_before_write_notify;
     bdrv_add_before_write_notifier(bs, &job->before_write);
@@ -512,7 +515,7 @@ static void coroutine_fn backup_run(void *opaque)
     /* wait until pending backup_do_cow() calls have completed */
     qemu_co_rwlock_wrlock(&job->flush_rwlock);
     qemu_co_rwlock_unlock(&job->flush_rwlock);
-    g_free(job->done_bitmap);
+    hbitmap_free(job->copy_bitmap);
 
     data = g_malloc(sizeof(*data));
     data->ret = ret;
-- 
2.11.1

* [Qemu-devel] [PATCH 3/5] backup: init copy_bitmap from sync_bitmap for incremental
  2017-10-02 14:39 [Qemu-devel] [PATCH 0/5] backup improvements part 1 Vladimir Sementsov-Ogievskiy
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 1/5] hbitmap: add next_zero function Vladimir Sementsov-Ogievskiy
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 2/5] backup: move from done_bitmap to copy_bitmap Vladimir Sementsov-Ogievskiy
@ 2017-10-02 14:39 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 22:56   ` John Snow
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 4/5] backup: simplify non-dirty bits progress processing Vladimir Sementsov-Ogievskiy
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-02 14:39 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: jsnow, famz, mreitz, kwolf, jcody, stefanha, den, vsementsov

We should not copy non-dirty clusters in write notifiers. So,
initialize copy_bitmap from sync_bitmap.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 block/backup.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/block/backup.c b/block/backup.c
index 08efec639f..07f4ae846b 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -422,6 +422,43 @@ out:
     return ret;
 }
 
+/* init copy_bitmap from sync_bitmap */
+static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
+{
+    BdrvDirtyBitmapIter *dbi;
+    int64_t offset;
+    int64_t end = DIV_ROUND_UP(bdrv_dirty_bitmap_size(job->sync_bitmap),
+                               job->cluster_size);
+
+    dbi = bdrv_dirty_iter_new(job->sync_bitmap);
+    while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {
+        int64_t cluster = offset / job->cluster_size;
+        int64_t next_cluster;
+
+        offset += bdrv_dirty_bitmap_granularity(job->sync_bitmap);
+        if (offset >= bdrv_dirty_bitmap_size(job->sync_bitmap)) {
+            hbitmap_set(job->copy_bitmap, cluster, end - cluster);
+            break;
+        }
+
+        offset = bdrv_dirty_bitmap_next_zero(job->sync_bitmap, offset);
+        if (offset == -1) {
+            hbitmap_set(job->copy_bitmap, cluster, end - cluster);
+            break;
+        }
+
+        next_cluster = DIV_ROUND_UP(offset, job->cluster_size);
+        hbitmap_set(job->copy_bitmap, cluster, next_cluster - cluster);
+        if (next_cluster >= end) {
+            break;
+        }
+
+        bdrv_set_dirty_iter(dbi, next_cluster * job->cluster_size);
+    }
+
+    bdrv_dirty_iter_free(dbi);
+}
+
 static void coroutine_fn backup_run(void *opaque)
 {
     BackupBlockJob *job = opaque;
@@ -435,20 +472,26 @@ static void coroutine_fn backup_run(void *opaque)
 
     nb_clusters = DIV_ROUND_UP(job->common.len, job->cluster_size);
     job->copy_bitmap = hbitmap_alloc(nb_clusters, 0);
-    hbitmap_set(job->copy_bitmap, 0, nb_clusters);
 
     job->before_write.notify = backup_before_write_notify;
     bdrv_add_before_write_notifier(bs, &job->before_write);
 
     if (job->sync_mode == MIRROR_SYNC_MODE_NONE) {
+        /* Here we set all bits in job->copy_bitmap to allow copying clusters
+         * which are going to be changed (none-mode semantics). It doesn't mean
+         * that we want to copy _all_ clusters.
+         */
+        hbitmap_set(job->copy_bitmap, 0, nb_clusters);
         while (!block_job_is_cancelled(&job->common)) {
             /* Yield until the job is cancelled.  We just let our before_write
              * notify callback service CoW requests. */
             block_job_yield(&job->common);
         }
     } else if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+        backup_incremental_init_copy_bitmap(job);
         ret = backup_run_incremental(job);
     } else {
+        hbitmap_set(job->copy_bitmap, 0, nb_clusters);
         /* Both FULL and TOP SYNC_MODE's require copying.. */
         for (offset = 0; offset < job->common.len;
              offset += job->cluster_size) {
-- 
2.11.1

* [Qemu-devel] [PATCH 4/5] backup: simplify non-dirty bits progress processing
  2017-10-02 14:39 [Qemu-devel] [PATCH 0/5] backup improvements part 1 Vladimir Sementsov-Ogievskiy
                   ` (2 preceding siblings ...)
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 3/5] backup: init copy_bitmap from sync_bitmap for incremental Vladimir Sementsov-Ogievskiy
@ 2017-10-02 14:39 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 23:44   ` John Snow
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 5/5] backup: use copy_bitmap in incremental backup Vladimir Sementsov-Ogievskiy
  2017-10-02 15:38 ` [Qemu-devel] [PATCH 0/5] backup improvements part 1 Eric Blake
  5 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-02 14:39 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: jsnow, famz, mreitz, kwolf, jcody, stefanha, den, vsementsov

Set fake progress for non-dirty clusters in copy_bitmap initialization,
too. This simplifies the code and allows further refactoring.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---

Motivation (some of these paragraphs may be needed in commit message...)

1. The current behavior is not good: setting fake progress leads to
progress hopping, so for a sparse disk the progress actually says
nothing. We just move all these hops to the beginning.

2. A big hop at the beginning is possible now too: if there is a hole
at the beginning of the virtual disk.

3. Now, there are some excuses for the old behavior in comparison to
the new one: backup progress is linear, cluster by cluster. But in the
future, cluster copying will be done by several coroutine workers, with
several requests in parallel, so it will make no sense to publish fake
progress piecewise in parallel with non-sequential requests.

4. Actually, it's just a point of view: when do we actually skip
clusters? If we go through the disk sequentially, it's logical to say
that we skip the clusters between the copied portions to their left and
right. But even now the copying progress is not sequential, because of
write notifiers. If copying is async and non-sequential, we can say at
the very beginning that we skip these clusters, and not think about
them later.

So, I don't want to maintain portions of fake progress in the new
backup architecture. I understand that this patch will change the
user's view of backup progress, but formally nothing changes: the
progress hops are just moved to the beginning. Progress was meaningless
for incremental backup and it remains meaningless.

But what should we do with progress, really? It's obvious that we need
at least one more field in the block job status, for example "skipped":
the number of bytes that are not really processed but skipped by the
block job. Then a more realistic progress value may be calculated (by a
management tool) like

(job.offset - job.skipped) / (job.len - job.skipped)
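
As a rough illustration ("skipped" here is only a proposed field; it
does not exist in the block job status today), the management-tool side
of that calculation could be a small helper:

static double backup_progress(int64_t offset, int64_t len, int64_t skipped)
{
    if (len <= skipped) {
        return 1.0; /* everything was skipped: report the job complete */
    }
    return (double)(offset - skipped) / (double)(len - skipped);
}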

This progress will be more precise if the information about skipped
bytes is provided earlier, and the current patch is close to this idea.

So, if you agree, I can make a patch for 'skipped' too, but it is not
actually related to this series, so I hope that the current patch will
be merged anyway. (I need it for the backup refactoring, not for
accurate progress.)

 block/backup.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 07f4ae846b..52df7bb19e 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -369,7 +369,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
     int64_t offset;
     int64_t cluster;
     int64_t end;
-    int64_t last_cluster = -1;
     BdrvDirtyBitmapIter *dbi;
 
     granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
@@ -380,12 +379,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
     while ((offset = bdrv_dirty_iter_next(dbi)) >= 0) {
         cluster = offset / job->cluster_size;
 
-        /* Fake progress updates for any clusters we skipped */
-        if (cluster != last_cluster + 1) {
-            job->common.offset += ((cluster - last_cluster - 1) *
-                                   job->cluster_size);
-        }
-
         for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
             do {
                 if (yield_and_check(job)) {
@@ -407,14 +400,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
         if (granularity < job->cluster_size) {
             bdrv_set_dirty_iter(dbi, cluster * job->cluster_size);
         }
-
-        last_cluster = cluster - 1;
-    }
-
-    /* Play some final catchup with the progress meter */
-    end = DIV_ROUND_UP(job->common.len, job->cluster_size);
-    if (last_cluster + 1 < end) {
-        job->common.offset += ((end - last_cluster - 1) * job->cluster_size);
     }
 
 out:
@@ -456,6 +441,9 @@ static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
         bdrv_set_dirty_iter(dbi, next_cluster * job->cluster_size);
     }
 
+    job->common.offset = job->common.len -
+                         hbitmap_count(job->copy_bitmap) * job->cluster_size;
+
     bdrv_dirty_iter_free(dbi);
 }
 
-- 
2.11.1

* [Qemu-devel] [PATCH 5/5] backup: use copy_bitmap in incremental backup
  2017-10-02 14:39 [Qemu-devel] [PATCH 0/5] backup improvements part 1 Vladimir Sementsov-Ogievskiy
                   ` (3 preceding siblings ...)
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 4/5] backup: simplify non-dirty bits progress processing Vladimir Sementsov-Ogievskiy
@ 2017-10-02 14:39 ` Vladimir Sementsov-Ogievskiy
  2017-10-09 23:51   ` John Snow
  2017-10-02 15:38 ` [Qemu-devel] [PATCH 0/5] backup improvements part 1 Eric Blake
  5 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-02 14:39 UTC (permalink / raw)
  To: qemu-block, qemu-devel
  Cc: jsnow, famz, mreitz, kwolf, jcody, stefanha, den, vsementsov

We can use copy_bitmap instead of sync_bitmap. copy_bitmap is
initialized from sync_bitmap and it is more informative: we will not
try to process data that is already being handled (by a write
notifier).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 block/backup.c | 55 +++++++++++++++++--------------------------------------
 1 file changed, 17 insertions(+), 38 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 52df7bb19e..c933cfafc3 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -362,49 +362,28 @@ static bool coroutine_fn yield_and_check(BackupBlockJob *job)
 
 static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
 {
+    int ret;
     bool error_is_read;
-    int ret = 0;
-    int clusters_per_iter;
-    uint32_t granularity;
-    int64_t offset;
     int64_t cluster;
-    int64_t end;
-    BdrvDirtyBitmapIter *dbi;
-
-    granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
-    clusters_per_iter = MAX((granularity / job->cluster_size), 1);
-    dbi = bdrv_dirty_iter_new(job->sync_bitmap);
-
-    /* Find the next dirty sector(s) */
-    while ((offset = bdrv_dirty_iter_next(dbi)) >= 0) {
-        cluster = offset / job->cluster_size;
-
-        for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
-            do {
-                if (yield_and_check(job)) {
-                    goto out;
-                }
-                ret = backup_do_cow(job, cluster * job->cluster_size,
-                                    job->cluster_size, &error_is_read,
-                                    false);
-                if ((ret < 0) &&
-                    backup_error_action(job, error_is_read, -ret) ==
-                    BLOCK_ERROR_ACTION_REPORT) {
-                    goto out;
-                }
-            } while (ret < 0);
-        }
+    HBitmapIter hbi;
 
-        /* If the bitmap granularity is smaller than the backup granularity,
-         * we need to advance the iterator pointer to the next cluster. */
-        if (granularity < job->cluster_size) {
-            bdrv_set_dirty_iter(dbi, cluster * job->cluster_size);
-        }
+    hbitmap_iter_init(&hbi, job->copy_bitmap, 0);
+    while ((cluster = hbitmap_iter_next(&hbi)) != -1) {
+        do {
+            if (yield_and_check(job)) {
+                return 0;
+            }
+            ret = backup_do_cow(job, cluster * job->cluster_size,
+                                job->cluster_size, &error_is_read, false);
+            if (ret < 0 && backup_error_action(job, error_is_read, -ret) ==
+                           BLOCK_ERROR_ACTION_REPORT)
+            {
+                return ret;
+            }
+        } while (ret < 0);
     }
 
-out:
-    bdrv_dirty_iter_free(dbi);
-    return ret;
+    return 0;
 }
 
 /* init copy_bitmap from sync_bitmap */
-- 
2.11.1

* Re: [Qemu-devel] [PATCH 0/5] backup improvements part 1
  2017-10-02 14:39 [Qemu-devel] [PATCH 0/5] backup improvements part 1 Vladimir Sementsov-Ogievskiy
                   ` (4 preceding siblings ...)
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 5/5] backup: use copy_bitmap in incremental backup Vladimir Sementsov-Ogievskiy
@ 2017-10-02 15:38 ` Eric Blake
  2017-10-02 16:17   ` Vladimir Sementsov-Ogievskiy
  5 siblings, 1 reply; 21+ messages in thread
From: Eric Blake @ 2017-10-02 15:38 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den, jsnow

On 10/02/2017 09:39 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all. Here is a continuation of my "new backup architecture" series:
> the very first part, but it is meaningful on its own.
> 
> 01: was previously sent in another of my series, but this is a more
>     correct version; one mistake is fixed.
> 02: only rebased onto the new byte-based interfaces (by Eric); R-b's
>     preserved

Presumably, that means we need to tell patchew:

Based-on:
https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg06848.html
[PATCH v10 00/20] make dirty-bitmap byte-based

> 03: changed to use hbitmap_next_zero; also a comment added about
>     none-mode (Fam)
> 04: commit message rewritten. The patch was controversial, but I need
>     it for further backup enhancements, so let's continue the
>     discussion.
> 05: rebased; the logic is unchanged, R-b's preserved
> 
> 
> Vladimir Sementsov-Ogievskiy (5):
>   hbitmap: add next_zero function
>   backup: move from done_bitmap to copy_bitmap
>   backup: init copy_bitmap from sync_bitmap for incremental
>   backup: simplify non-dirty bits progress processing
>   backup: use copy_bitmap in incremental backup
> 
>  include/block/dirty-bitmap.h |   1 +
>  include/qemu/hbitmap.h       |   8 +++
>  block/backup.c               | 117 ++++++++++++++++++++++++-------------------
>  block/dirty-bitmap.c         |   5 ++
>  util/hbitmap.c               |  29 +++++++++++
>  5 files changed, 108 insertions(+), 52 deletions(-)
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

* Re: [Qemu-devel] [PATCH 1/5] hbitmap: add next_zero function
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 1/5] hbitmap: add next_zero function Vladimir Sementsov-Ogievskiy
@ 2017-10-02 15:43   ` Eric Blake
  2017-10-02 16:16     ` Vladimir Sementsov-Ogievskiy
  2017-10-09 21:51   ` John Snow
  1 sibling, 1 reply; 21+ messages in thread
From: Eric Blake @ 2017-10-02 15:43 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den, jsnow

On 10/02/2017 09:39 AM, Vladimir Sementsov-Ogievskiy wrote:
> The function searches for the next zero bit.
> Also add an interface for BdrvDirtyBitmap.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/dirty-bitmap.h |  1 +
>  include/qemu/hbitmap.h       |  8 ++++++++
>  block/dirty-bitmap.c         |  5 +++++
>  util/hbitmap.c               | 29 +++++++++++++++++++++++++++++
>  4 files changed, 43 insertions(+)
> 

> +++ b/block/dirty-bitmap.c
> @@ -715,3 +715,8 @@ char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
>  {
>      return hbitmap_sha256(bitmap->bitmap, errp);
>  }
> +
> +int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, uint64_t offset)
> +{
> +    return hbitmap_next_zero(bitmap->bitmap, offset);
> +}

Returns an answer in the same scale as the underlying hbitmap; if this
is applied before my byte-based dirty bitmap series, that means offset
is a sector count and the result is likewise a sector number (awkward);
if this is applied after my series, you pass in a byte offset start and
get a byte result (nice).

I don't see any obvious errors in the implementation, but DO think that
you should include a testsuite enhancement in tests/test-hbitmap.c to
cover the new functionality before we accept this.
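
Something like the sketch below, following the helpers that already
exist in tests/test-hbitmap.c (TestHBitmapData and hbitmap_test_init);
the test name and exact cases are just an illustration from memory:

static void test_hbitmap_next_zero_basic(TestHBitmapData *data,
                                         const void *unused)
{
    hbitmap_test_init(data, 256, 0);  /* 256 bits, granularity 0 */

    /* empty bitmap: the first zero bit is the start position itself */
    g_assert_cmpint(hbitmap_next_zero(data->hb, 0), ==, 0);

    hbitmap_set(data->hb, 0, 4);      /* dirty bits [0, 4) */
    g_assert_cmpint(hbitmap_next_zero(data->hb, 0), ==, 4);

    hbitmap_set(data->hb, 0, 256);    /* all dirty: no zero bit left */
    g_assert_cmpint(hbitmap_next_zero(data->hb, 0), ==, -1);
}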

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

* Re: [Qemu-devel] [PATCH 1/5] hbitmap: add next_zero function
  2017-10-02 15:43   ` Eric Blake
@ 2017-10-02 16:16     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-02 16:16 UTC (permalink / raw)
  To: Eric Blake, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den, jsnow

02.10.2017 18:43, Eric Blake wrote:
> On 10/02/2017 09:39 AM, Vladimir Sementsov-Ogievskiy wrote:
>> The function searches for the next zero bit.
>> Also add an interface for BdrvDirtyBitmap.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/dirty-bitmap.h |  1 +
>>   include/qemu/hbitmap.h       |  8 ++++++++
>>   block/dirty-bitmap.c         |  5 +++++
>>   util/hbitmap.c               | 29 +++++++++++++++++++++++++++++
>>   4 files changed, 43 insertions(+)
>>
>> +++ b/block/dirty-bitmap.c
>> @@ -715,3 +715,8 @@ char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
>>   {
>>       return hbitmap_sha256(bitmap->bitmap, errp);
>>   }
>> +
>> +int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, uint64_t offset)
>> +{
>> +    return hbitmap_next_zero(bitmap->bitmap, offset);
>> +}
> Returns an answer in the same scale as the underlying hbitmap; if this
> is applied before my byte-based dirty bitmap series, that means offset
> is a sector count and the result is likewise a sector number (awkward);
> if this is applied after my series, you pass in a byte offset start and
> get a byte result (nice).

Oh, I forgot to say it: it's based on your series. :)

>
> I don't see any obvious errors in the implementation, but DO think that
> you should include a testsuite enhancement in tests/test-hbitmap.c to
> cover the new functionality before we accept this.
>


-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH 0/5] backup improvements part 1
  2017-10-02 15:38 ` [Qemu-devel] [PATCH 0/5] backup improvements part 1 Eric Blake
@ 2017-10-02 16:17   ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-02 16:17 UTC (permalink / raw)
  To: Eric Blake, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den, jsnow

02.10.2017 18:38, Eric Blake wrote:
> On 10/02/2017 09:39 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Hi all. Here is a continuation of my "new backup architecture" series,
>> The very first part, but it is meaningful itself.
>>
>> 01: was previously sent in another of my series, but this is a more
>>      correct version; one mistake is fixed.
>> 02: only rebased onto the new byte-based interfaces (by Eric); R-b's
>>      preserved
> Presumably, that means we need to tell patchew:
>
> Based-on:
> https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg06848.html
> [PATCH v10 00/20] make dirty-bitmap byte-based

Yes, of course; I forgot to note it.

>
>> 03: changed to use hbitmap_next_zero; also a comment added about
>>      none-mode (Fam)
>> 04: commit message rewritten. The patch was controversial, but I need
>>      it for further backup enhancements, so let's continue the
>>      discussion.
>> 05: rebased; the logic is unchanged, R-b's preserved
>>
>>
>> Vladimir Sementsov-Ogievskiy (5):
>>    hbitmap: add next_zero function
>>    backup: move from done_bitmap to copy_bitmap
>>    backup: init copy_bitmap from sync_bitmap for incremental
>>    backup: simplify non-dirty bits progress processing
>>    backup: use copy_bitmap in incremental backup
>>
>>   include/block/dirty-bitmap.h |   1 +
>>   include/qemu/hbitmap.h       |   8 +++
>>   block/backup.c               | 117 ++++++++++++++++++++++++-------------------
>>   block/dirty-bitmap.c         |   5 ++
>>   util/hbitmap.c               |  29 +++++++++++
>>   5 files changed, 108 insertions(+), 52 deletions(-)
>>


-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH 1/5] hbitmap: add next_zero function
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 1/5] hbitmap: add next_zero function Vladimir Sementsov-Ogievskiy
  2017-10-02 15:43   ` Eric Blake
@ 2017-10-09 21:51   ` John Snow
  2017-10-12 10:05     ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 21+ messages in thread
From: John Snow @ 2017-10-09 21:51 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den



On 10/02/2017 10:39 AM, Vladimir Sementsov-Ogievskiy wrote:
> The function searches for the next zero bit.
> Also add an interface for BdrvDirtyBitmap.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  include/block/dirty-bitmap.h |  1 +
>  include/qemu/hbitmap.h       |  8 ++++++++
>  block/dirty-bitmap.c         |  5 +++++
>  util/hbitmap.c               | 29 +++++++++++++++++++++++++++++
>  4 files changed, 43 insertions(+)
> 
> diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
> index 3579a7597c..a591c27213 100644
> --- a/include/block/dirty-bitmap.h
> +++ b/include/block/dirty-bitmap.h
> @@ -91,5 +91,6 @@ bool bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs);
>  BdrvDirtyBitmap *bdrv_dirty_bitmap_next(BlockDriverState *bs,
>                                          BdrvDirtyBitmap *bitmap);
>  char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp);
> +int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, uint64_t start);
>  
>  #endif
> diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
> index 81e78043d1..6b6490ecad 100644
> --- a/include/qemu/hbitmap.h
> +++ b/include/qemu/hbitmap.h
> @@ -292,6 +292,14 @@ void hbitmap_iter_init(HBitmapIter *hbi, const HBitmap *hb, uint64_t first);
>   */
>  unsigned long hbitmap_iter_skip_words(HBitmapIter *hbi);
>  
> +/* hbitmap_next_zero:
> + * @hb: The HBitmap to operate on
> + * @start: The bit to start from.
> + *
> + * Find the next zero (not dirty) bit.
> + */
> +int64_t hbitmap_next_zero(const HBitmap *hb, uint64_t start);
> +
>  /* hbitmap_create_meta:
>   * Create a "meta" hbitmap to track dirtiness of the bits in this HBitmap.
>   * The caller owns the created bitmap and must call hbitmap_free_meta(hb) to
> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> index bd04e991b1..7879d13ddb 100644
> --- a/block/dirty-bitmap.c
> +++ b/block/dirty-bitmap.c
> @@ -715,3 +715,8 @@ char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
>  {
>      return hbitmap_sha256(bitmap->bitmap, errp);
>  }
> +
> +int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, uint64_t offset)
> +{
> +    return hbitmap_next_zero(bitmap->bitmap, offset);
> +}
> diff --git a/util/hbitmap.c b/util/hbitmap.c
> index 2f9d0fdbd0..ffcdbc5587 100644
> --- a/util/hbitmap.c
> +++ b/util/hbitmap.c
> @@ -188,6 +188,35 @@ void hbitmap_iter_init(HBitmapIter *hbi, const HBitmap *hb, uint64_t first)
>      }
>  }
>  
> +int64_t hbitmap_next_zero(const HBitmap *hb, uint64_t start)
> +{
> +    size_t pos = (start >> hb->granularity) >> BITS_PER_LEVEL;
> +    unsigned long *last_lev = hb->levels[HBITMAP_LEVELS - 1];
> +    uint64_t sz = hb->sizes[HBITMAP_LEVELS - 1];
> +    unsigned long cur = last_lev[pos];
> +    unsigned start_bit_offset =
> +            (start >> hb->granularity) & (BITS_PER_LONG - 1);
> +    int64_t res;

Prefer a line separating declarations from statements.

> +    cur |= (1UL << start_bit_offset) - 1;
> +    assert((start >> hb->granularity) < hb->size);
> +
> +    if (cur == (unsigned long)-1) {
> +        do {
> +            pos++;
> +        } while (pos < sz && last_lev[pos] == (unsigned long)-1);
> +
> +        if (pos >= sz) {
> +            return -1;
> +        }
> +
> +        cur = last_lev[pos];
> +    }
> +
> +    res = (pos << BITS_PER_LEVEL) + ctol(cur);
> +
> +    return res < hb->size ? (res << hb->granularity) : -1;
> +}
> +
>  bool hbitmap_empty(const HBitmap *hb)
>  {
>      return hb->count == 0;
> 

And a test would be nice.

Reviewed-by: John Snow <jsnow@redhat.com>

* Re: [Qemu-devel] [PATCH 2/5] backup: move from done_bitmap to copy_bitmap
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 2/5] backup: move from done_bitmap to copy_bitmap Vladimir Sementsov-Ogievskiy
@ 2017-10-09 22:16   ` John Snow
  0 siblings, 0 replies; 21+ messages in thread
From: John Snow @ 2017-10-09 22:16 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den



On 10/02/2017 10:39 AM, Vladimir Sementsov-Ogievskiy wrote:
> Use an HBitmap copy_bitmap instead of done_bitmap. This is needed to
> improve incremental backup in the following patches and to unify the
> backup loop for full/incremental modes in future patches.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Jeff Cody <jcody@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: John Snow <jsnow@redhat.com>

* Re: [Qemu-devel] [PATCH 3/5] backup: init copy_bitmap from sync_bitmap for incremental
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 3/5] backup: init copy_bitmap from sync_bitmap for incremental Vladimir Sementsov-Ogievskiy
@ 2017-10-09 22:56   ` John Snow
  2017-10-12 11:23     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 21+ messages in thread
From: John Snow @ 2017-10-09 22:56 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den



On 10/02/2017 10:39 AM, Vladimir Sementsov-Ogievskiy wrote:
> We should not copy non-dirty clusters in write notifiers. So,
> initialize copy_bitmap from sync_bitmap.
> 

...! Duh, good find!!

So, previously we'd copy extra sectors if they just so happened to be
written to during the backup process. These would be totally redundant
compared to the previous backups in the chain, by definition.

Now, we allow the write to go through to the source image and let it
dirty the bdrv dirty bitmap we have on standby.

Good fix.

> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block/backup.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 44 insertions(+), 1 deletion(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 08efec639f..07f4ae846b 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -422,6 +422,43 @@ out:
>      return ret;
>  }
>  
> +/* init copy_bitmap from sync_bitmap */
> +static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
> +{
> +    BdrvDirtyBitmapIter *dbi;
> +    int64_t offset;
> +    int64_t end = DIV_ROUND_UP(bdrv_dirty_bitmap_size(job->sync_bitmap),
> +                               job->cluster_size);
> +
> +    dbi = bdrv_dirty_iter_new(job->sync_bitmap);
> +    while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {
> +        int64_t cluster = offset / job->cluster_size;
> +        int64_t next_cluster;
> +
> +        offset += bdrv_dirty_bitmap_granularity(job->sync_bitmap);
> +        if (offset >= bdrv_dirty_bitmap_size(job->sync_bitmap)) {
> +            hbitmap_set(job->copy_bitmap, cluster, end - cluster);
> +            break;
> +        }
> +
> +        offset = bdrv_dirty_bitmap_next_zero(job->sync_bitmap, offset);
> +        if (offset == -1) {
> +            hbitmap_set(job->copy_bitmap, cluster, end - cluster);
> +            break;
> +        }
> +
> +        next_cluster = DIV_ROUND_UP(offset, job->cluster_size);
> +        hbitmap_set(job->copy_bitmap, cluster, next_cluster - cluster);
> +        if (next_cluster >= end) {
> +            break;
> +        }
> +

Did you look into initializing this from a lower layer, e.g. a call to
initialize-an-hbitmap-from-another-hbitmap?

The loop might look a little cleaner that way and we might possibly get
more utility out of it than a custom coded loop here in the backup code.
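
Just to sketch what I mean (a hypothetical helper; the signature is
made up, but it reuses only calls that already appear in this patch):

static void hbitmap_init_from_dirty_bitmap(HBitmap *dst,
                                           BdrvDirtyBitmap *src,
                                           int64_t cluster_size)
{
    BdrvDirtyBitmapIter *dbi;
    int64_t offset;
    int64_t end = DIV_ROUND_UP(bdrv_dirty_bitmap_size(src), cluster_size);

    dbi = bdrv_dirty_iter_new(src);
    while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {
        int64_t cluster = offset / cluster_size;
        int64_t next_cluster;

        /* find where the current dirty extent ends */
        offset = bdrv_dirty_bitmap_next_zero(src, offset);
        if (offset == -1) {
            hbitmap_set(dst, cluster, end - cluster);
            break;
        }

        next_cluster = DIV_ROUND_UP(offset, cluster_size);
        hbitmap_set(dst, cluster, next_cluster - cluster);
        if (next_cluster >= end) {
            break;
        }

        bdrv_set_dirty_iter(dbi, next_cluster * cluster_size);
    }

    bdrv_dirty_iter_free(dbi);
}

Then backup_incremental_init_copy_bitmap() would reduce to one call.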

> +        bdrv_set_dirty_iter(dbi, next_cluster * job->cluster_size);
> +    }
> +
> +    bdrv_dirty_iter_free(dbi);
> +}
> +
>  static void coroutine_fn backup_run(void *opaque)
>  {
>      BackupBlockJob *job = opaque;
> @@ -435,20 +472,26 @@ static void coroutine_fn backup_run(void *opaque)
>  
>      nb_clusters = DIV_ROUND_UP(job->common.len, job->cluster_size);
>      job->copy_bitmap = hbitmap_alloc(nb_clusters, 0);
> -    hbitmap_set(job->copy_bitmap, 0, nb_clusters);

Alright, so the last patch introduced a blunt initialization, and you've
replaced it by more granular initializations.

>  
>      job->before_write.notify = backup_before_write_notify;
>      bdrv_add_before_write_notifier(bs, &job->before_write);
>  
>      if (job->sync_mode == MIRROR_SYNC_MODE_NONE) {
> +        /* Here we set all bits in job->copy_bitmap to allow copying clusters
> +         * which are going to be changed (none-mode semantics). It doesn't mean
> +         * that we want to copy _all_ clusters.
> +         */

Right, this is more like ALLOWING all sectors to be copied.

Perhaps:

"All bits are set to allow any cluster to be copied. This does not
actually require them to be copied."

> +        hbitmap_set(job->copy_bitmap, 0, nb_clusters);
>          while (!block_job_is_cancelled(&job->common)) {
>              /* Yield until the job is cancelled.  We just let our before_write
>               * notify callback service CoW requests. */
>              block_job_yield(&job->common);
>          }
>      } else if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
> +        backup_incremental_init_copy_bitmap(job);
>          ret = backup_run_incremental(job);
>      } else {
> +        hbitmap_set(job->copy_bitmap, 0, nb_clusters);

Maybe you could just factor these things out entirely:

if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
  backup_incremental_init_copy_bitmap(job);
} else {
  hbitmap_set(job->copy_bitmap, 0, nb_clusters);
}

prior to the big switch statement.

>          /* Both FULL and TOP SYNC_MODE's require copying.. */
>          for (offset = 0; offset < job->common.len;
>               offset += job->cluster_size) {
> 

* Re: [Qemu-devel] [PATCH 4/5] backup: simplify non-dirty bits progress processing
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 4/5] backup: simplify non-dirty bits progress processing Vladimir Sementsov-Ogievskiy
@ 2017-10-09 23:44   ` John Snow
  2017-10-12 11:42     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 21+ messages in thread
From: John Snow @ 2017-10-09 23:44 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den



On 10/02/2017 10:39 AM, Vladimir Sementsov-Ogievskiy wrote:
> Set fake progress for non-dirty clusters in copy_bitmap initialization,
> too. This simplifies the code and allows further refactoring.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
> 
> Motivation (some of these paragraphs may be needed in commit message...)
> 

lol! "pick and choose which paragraphs look good, stick some up there,
maybe?"

> 1. The current behavior is not good: setting fake progress leads to
> progress hopping, so for a sparse disk the progress actually says
> nothing. We just move all these hops to the beginning.
> 

To be fair, if it's sparse, the first update will be very soon.

> 2. A big hop at the beginning is possible now too: if there is a hole
> at the beginning of the virtual disk.
> 
> 3. Now, there are some excuses for the old behavior in comparison to
> the new one: backup progress is linear, cluster by cluster. But in the
> future, cluster copying will be done by several coroutine workers, with
> several requests in parallel, so it will make no sense to publish fake
> progress piecewise in parallel with non-sequential requests.
> 

I agree.

> 4. Actually, it's just a point of view: when do we actually skip
> clusters? If we go through the disk sequentially, it's logical to say
> that we skip the clusters between the copied portions to their left and
> right. But even now the copying progress is not sequential, because of
> write notifiers. If copying is async and non-sequential, we can say at
> the very beginning that we skip these clusters, and not think about
> them later.
> 

Good observation.

> So, I don't want to maintain portions of fake progress in the new
> backup architecture. I understand that this patch will change the
> user's view of backup progress, but formally nothing changes: the
> progress hops are just moved to the beginning. Progress was meaningless
> for incremental backup and it remains meaningless.
> 

Haha. Not meaningless, but it does carry less information than other
progress meters. It is effectively a % complete meter, with a really bad
estimate.

> But what should we do with progress, really? It's obvious that we need
> at least one more field in the block job status, for example "skipped":
> the number of bytes that are not really processed but skipped by the
> block job. Then a more realistic progress value may be calculated (by a
> management tool) like
> 
> (job.offset - job.skipped) / (job.len - job.skipped)
> 

I'm not sure we actually need a new field... let's just say that the job
length is the number of bytes described by the incremental backup. Every
time we copy some, move offset forward. This would give a more
appropriately linear/accurate sense of the progress.

I think we are allowed to do this as I think we promise that these
progress numbers mean essentially nothing...

New fields that actually *do* report meaningful information, however,
could be added. bytesCopied would be a prime candidate.
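
Roughly, the bookkeeping I have in mind (a sketch with simplified,
hypothetical types; the real BlockJob fields would differ):

typedef struct Progress {
    int64_t len;     /* total bytes the job will really copy */
    int64_t offset;  /* bytes actually copied so far */
} Progress;

static void progress_init(Progress *p, uint64_t dirty_clusters,
                          int64_t cluster_size)
{
    /* dirty_clusters could come from hbitmap_count(copy_bitmap) */
    p->len = dirty_clusters * cluster_size;
    p->offset = 0;
}

static void progress_cluster_copied(Progress *p, int64_t cluster_size)
{
    p->offset += cluster_size;  /* advance only on real copies */
}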

> This progress will be more precise if the information about skipped
> bytes is provided earlier, and the current patch is close to this idea.
> 
> So, if you agree, I can make a patch for 'skipped' too, but it is not
> actually related to this series, so I hope that the current patch will
> be merged anyway. (I need it for the backup refactoring, not for
> accurate progress.)
> 

>  block/backup.c | 18 +++---------------
>  1 file changed, 3 insertions(+), 15 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 07f4ae846b..52df7bb19e 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -369,7 +369,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
>      int64_t offset;
>      int64_t cluster;
>      int64_t end;
> -    int64_t last_cluster = -1;
>      BdrvDirtyBitmapIter *dbi;
>  
>      granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
> @@ -380,12 +379,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
>      while ((offset = bdrv_dirty_iter_next(dbi)) >= 0) {
>          cluster = offset / job->cluster_size;
>  
> -        /* Fake progress updates for any clusters we skipped */
> -        if (cluster != last_cluster + 1) {
> -            job->common.offset += ((cluster - last_cluster - 1) *
> -                                   job->cluster_size);
> -        }
> -
>          for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
>              do {
>                  if (yield_and_check(job)) {
> @@ -407,14 +400,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
>          if (granularity < job->cluster_size) {
>              bdrv_set_dirty_iter(dbi, cluster * job->cluster_size);
>          }
> -
> -        last_cluster = cluster - 1;
> -    }
> -
> -    /* Play some final catchup with the progress meter */
> -    end = DIV_ROUND_UP(job->common.len, job->cluster_size);
> -    if (last_cluster + 1 < end) {
> -        job->common.offset += ((end - last_cluster - 1) * job->cluster_size);
>      }
>  
>  out:
> @@ -456,6 +441,9 @@ static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
>          bdrv_set_dirty_iter(dbi, next_cluster * job->cluster_size);
>      }
>  
> +    job->common.offset = job->common.len -
> +                         hbitmap_count(job->copy_bitmap) * job->cluster_size;
> +
>      bdrv_dirty_iter_free(dbi);
>  }
>  
> 

Anything that makes for fewer lines of code is nice, though, so if we
cannot reduce the length of the job:

Reviewed-by: John Snow <jsnow@redhat.com>

* Re: [Qemu-devel] [PATCH 5/5] backup: use copy_bitmap in incremental backup
  2017-10-02 14:39 ` [Qemu-devel] [PATCH 5/5] backup: use copy_bitmap in incremental backup Vladimir Sementsov-Ogievskiy
@ 2017-10-09 23:51   ` John Snow
  0 siblings, 0 replies; 21+ messages in thread
From: John Snow @ 2017-10-09 23:51 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den



On 10/02/2017 10:39 AM, Vladimir Sementsov-Ogievskiy wrote:
> We can use copy_bitmap instead of sync_bitmap. copy_bitmap is
> initialized from sync_bitmap and it is more informative: we will not
> try to process data that is already being handled (by a write
> notifier).
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block/backup.c | 55 +++++++++++++++++--------------------------------------

Lookin' good.

>  1 file changed, 17 insertions(+), 38 deletions(-)
> 
> diff --git a/block/backup.c b/block/backup.c
> index 52df7bb19e..c933cfafc3 100644
> --- a/block/backup.c
> +++ b/block/backup.c
> @@ -362,49 +362,28 @@ static bool coroutine_fn yield_and_check(BackupBlockJob *job)
>  
>  static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
>  {
> +    int ret;
>      bool error_is_read;
> -    int ret = 0;
> -    int clusters_per_iter;
> -    uint32_t granularity;
> -    int64_t offset;
>      int64_t cluster;
> -    int64_t end;
> -    BdrvDirtyBitmapIter *dbi;
> -
> -    granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
> -    clusters_per_iter = MAX((granularity / job->cluster_size), 1);
> -    dbi = bdrv_dirty_iter_new(job->sync_bitmap);
> -
> -    /* Find the next dirty sector(s) */
> -    while ((offset = bdrv_dirty_iter_next(dbi)) >= 0) {
> -        cluster = offset / job->cluster_size;
> -
> -        for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
> -            do {
> -                if (yield_and_check(job)) {
> -                    goto out;
> -                }
> -                ret = backup_do_cow(job, cluster * job->cluster_size,
> -                                    job->cluster_size, &error_is_read,
> -                                    false);
> -                if ((ret < 0) &&
> -                    backup_error_action(job, error_is_read, -ret) ==
> -                    BLOCK_ERROR_ACTION_REPORT) {
> -                    goto out;
> -                }
> -            } while (ret < 0);
> -        }
> +    HBitmapIter hbi;
>  
> -        /* If the bitmap granularity is smaller than the backup granularity,
> -         * we need to advance the iterator pointer to the next cluster. */
> -        if (granularity < job->cluster_size) {
> -            bdrv_set_dirty_iter(dbi, cluster * job->cluster_size);
> -        }
> +    hbitmap_iter_init(&hbi, job->copy_bitmap, 0);
> +    while ((cluster = hbitmap_iter_next(&hbi)) != -1) {
> +        do {
> +            if (yield_and_check(job)) {
> +                return 0;
> +            }
> +            ret = backup_do_cow(job, cluster * job->cluster_size,
> +                                job->cluster_size, &error_is_read, false);
> +            if (ret < 0 && backup_error_action(job, error_is_read, -ret) ==
> +                           BLOCK_ERROR_ACTION_REPORT)
> +            {
> +                return ret;
> +            }
> +        } while (ret < 0);
>      }
>  
> -out:
> -    bdrv_dirty_iter_free(dbi);
> -    return ret;
> +    return 0;
>  }
>  
>  /* init copy_bitmap from sync_bitmap */
> 

Impressive, you really cleaned this function up considerably.

Reviewed-by: John Snow <jsnow@redhat.com>

* Re: [Qemu-devel] [PATCH 1/5] hbitmap: add next_zero function
  2017-10-09 21:51   ` John Snow
@ 2017-10-12 10:05     ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-12 10:05 UTC (permalink / raw)
  To: John Snow, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den

10.10.2017 00:51, John Snow wrote:
>
> On 10/02/2017 10:39 AM, Vladimir Sementsov-Ogievskiy wrote:
>> The function searches for the next zero bit.
>> Also add an interface for BdrvDirtyBitmap.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   include/block/dirty-bitmap.h |  1 +
>>   include/qemu/hbitmap.h       |  8 ++++++++
>>   block/dirty-bitmap.c         |  5 +++++
>>   util/hbitmap.c               | 29 +++++++++++++++++++++++++++++
>>   4 files changed, 43 insertions(+)
>>
>> diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
>> index 3579a7597c..a591c27213 100644
>> --- a/include/block/dirty-bitmap.h
>> +++ b/include/block/dirty-bitmap.h
>> @@ -91,5 +91,6 @@ bool bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs);
>>   BdrvDirtyBitmap *bdrv_dirty_bitmap_next(BlockDriverState *bs,
>>                                           BdrvDirtyBitmap *bitmap);
>>   char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp);
>> +int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, uint64_t start);
>>   
>>   #endif
>> diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
>> index 81e78043d1..6b6490ecad 100644
>> --- a/include/qemu/hbitmap.h
>> +++ b/include/qemu/hbitmap.h
>> @@ -292,6 +292,14 @@ void hbitmap_iter_init(HBitmapIter *hbi, const HBitmap *hb, uint64_t first);
>>    */
>>   unsigned long hbitmap_iter_skip_words(HBitmapIter *hbi);
>>   
>> +/* hbitmap_next_zero:
>> + * @hb: The HBitmap to operate on
>> + * @start: The bit to start from.
>> + *
>> + * Find the next zero (not dirty) bit.
>> + */
>> +int64_t hbitmap_next_zero(const HBitmap *hb, uint64_t start);
>> +
>>   /* hbitmap_create_meta:
>>    * Create a "meta" hbitmap to track dirtiness of the bits in this HBitmap.
>>    * The caller owns the created bitmap and must call hbitmap_free_meta(hb) to
>> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
>> index bd04e991b1..7879d13ddb 100644
>> --- a/block/dirty-bitmap.c
>> +++ b/block/dirty-bitmap.c
>> @@ -715,3 +715,8 @@ char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
>>   {
>>       return hbitmap_sha256(bitmap->bitmap, errp);
>>   }
>> +
>> +int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, uint64_t offset)
>> +{
>> +    return hbitmap_next_zero(bitmap->bitmap, offset);
>> +}
>> diff --git a/util/hbitmap.c b/util/hbitmap.c
>> index 2f9d0fdbd0..ffcdbc5587 100644
>> --- a/util/hbitmap.c
>> +++ b/util/hbitmap.c
>> @@ -188,6 +188,35 @@ void hbitmap_iter_init(HBitmapIter *hbi, const HBitmap *hb, uint64_t first)
>>       }
>>   }
>>   
>> +int64_t hbitmap_next_zero(const HBitmap *hb, uint64_t start)
>> +{
>> +    size_t pos = (start >> hb->granularity) >> BITS_PER_LEVEL;
>> +    unsigned long *last_lev = hb->levels[HBITMAP_LEVELS - 1];
>> +    uint64_t sz = hb->sizes[HBITMAP_LEVELS - 1];
>> +    unsigned long cur = last_lev[pos];
>> +    unsigned start_bit_offset =
>> +            (start >> hb->granularity) & (BITS_PER_LONG - 1);
>> +    int64_t res;
> Prefer a line separating declarations from statements.
>
>> +    cur |= (1UL << start_bit_offset) - 1;
>> +    assert((start >> hb->granularity) < hb->size);
>> +
>> +    if (cur == (unsigned long)-1) {
>> +        do {
>> +            pos++;
>> +        } while (pos < sz && last_lev[pos] == (unsigned long)-1);
>> +
>> +        if (pos >= sz) {
>> +            return -1;
>> +        }
>> +
>> +        cur = last_lev[pos];
>> +    }
>> +
>> +    res = (pos << BITS_PER_LEVEL) + ctol(cur);
>> +
>> +    return res < hb->size ? (res << hb->granularity) : -1;

Here is a hidden mistake: we must consider the case when start is not
aligned to the granularity; if the start bit is considered zero, we
should return start itself, not ((start >> gran) << gran).
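
Something like this at the end of the function could handle it (just a
sketch of the idea, untested; it restructures the final return):

    if (res >= hb->size) {
        return -1;
    }

    res = res << hb->granularity;
    if (res < start) {
        /* the granularity chunk containing start is all zero, so
         * return start itself rather than the chunk-aligned offset */
        assert(start - res < (1UL << hb->granularity));
        res = start;
    }

    return res;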

>> +}
>> +
>>   bool hbitmap_empty(const HBitmap *hb)
>>   {
>>       return hb->count == 0;
>>
> And a test would be nice.
>
> Reviewed-by: John Snow <jsnow@redhat.com>


-- 
Best regards,
Vladimir

* Re: [Qemu-devel] [PATCH 3/5] backup: init copy_bitmap from sync_bitmap for incremental
  2017-10-09 22:56   ` John Snow
@ 2017-10-12 11:23     ` Vladimir Sementsov-Ogievskiy
  2017-11-07  0:25       ` John Snow
  0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-12 11:23 UTC (permalink / raw)
  To: John Snow, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den

10.10.2017 01:56, John Snow wrote:
>
> On 10/02/2017 10:39 AM, Vladimir Sementsov-Ogievskiy wrote:
>> We should not copy non-dirty clusters in write notifiers. So,
>> initialize copy_bitmap from sync_bitmap.
>>
> ...! Duh, good find!!
>
> So, previously we'd copy extra sectors if they just so happened to be
> written to during the backup process. These would be totally redundant
> compared to the previous backups in the chain, by definition.
>
> Now, we allow the write to go through to the source image and let it
> dirty the bdrv dirty bitmap we have on standby.
>
> Good fix.
>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>   block/backup.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
>>   1 file changed, 44 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/backup.c b/block/backup.c
>> index 08efec639f..07f4ae846b 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -422,6 +422,43 @@ out:
>>       return ret;
>>   }
>>   
>> +/* init copy_bitmap from sync_bitmap */
>> +static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
>> +{
>> +    BdrvDirtyBitmapIter *dbi;
>> +    int64_t offset;
>> +    int64_t end = DIV_ROUND_UP(bdrv_dirty_bitmap_size(job->sync_bitmap),
>> +                               job->cluster_size);
>> +
>> +    dbi = bdrv_dirty_iter_new(job->sync_bitmap);
>> +    while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {
>> +        int64_t cluster = offset / job->cluster_size;
>> +        int64_t next_cluster;
>> +
>> +        offset += bdrv_dirty_bitmap_granularity(job->sync_bitmap);
>> +        if (offset >= bdrv_dirty_bitmap_size(job->sync_bitmap)) {
>> +            hbitmap_set(job->copy_bitmap, cluster, end - cluster);
>> +            break;
>> +        }
>> +
>> +        offset = bdrv_dirty_bitmap_next_zero(job->sync_bitmap, offset);
>> +        if (offset == -1) {
>> +            hbitmap_set(job->copy_bitmap, cluster, end - cluster);
>> +            break;
>> +        }
>> +
>> +        next_cluster = DIV_ROUND_UP(offset, job->cluster_size);
>> +        hbitmap_set(job->copy_bitmap, cluster, next_cluster - cluster);
>> +        if (next_cluster >= end) {
>> +            break;
>> +        }
>> +
> Did you look into initializing this from a lower layer, e.g. a call to
> initialize-an-hbitmap-from-another-hbitmap?
>
> The loop might look a little cleaner that way and we might possibly get
> more utility out of it than a custom coded loop here in the backup code.

No, to be honest I don't want to do it now; there are some difficulties:

1. We init from a bdrv bitmap, so we'd have to add a somewhat strange
common interface, init-hbitmap-from-bdrv-bitmap.
2 (harder). Here copy_bitmap has a different size: its granularity is
zero and its bits correspond to clusters, not to bytes, and I don't
want to change this.

So it's not a common case.
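
For illustration only, a generic helper would have to carry that scale
explicitly, something like (hypothetical signature, not proposed code):

    /* set bit i of dst if any byte in
     * [i * bytes_per_dst_bit, (i + 1) * bytes_per_dst_bit)
     * is dirty in src */
    void hbitmap_init_from_bdrv_dirty_bitmap(HBitmap *dst,
                                             BdrvDirtyBitmap *src,
                                             int64_t bytes_per_dst_bit);

and it would pull block-layer types into util/, which is the layering
problem from point 1.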


>
>> +        bdrv_set_dirty_iter(dbi, next_cluster * job->cluster_size);
>> +    }
>> +
>> +    bdrv_dirty_iter_free(dbi);
>> +}
>> +
>>   static void coroutine_fn backup_run(void *opaque)
>>   {
>>       BackupBlockJob *job = opaque;
>> @@ -435,20 +472,26 @@ static void coroutine_fn backup_run(void *opaque)
>>   
>>       nb_clusters = DIV_ROUND_UP(job->common.len, job->cluster_size);
>>       job->copy_bitmap = hbitmap_alloc(nb_clusters, 0);
>> -    hbitmap_set(job->copy_bitmap, 0, nb_clusters);
> Alright, so the last patch introduced a blunt initialization, and you've
> replaced it by more granular initializations.
>
>>   
>>       job->before_write.notify = backup_before_write_notify;
>>       bdrv_add_before_write_notifier(bs, &job->before_write);
>>   
>>       if (job->sync_mode == MIRROR_SYNC_MODE_NONE) {
>> +        /* Here we set all bits in job->copy_bitmap to allow copying clusters
>> +         * which are going to be changed (none-mode semantics). It doesn't mean
>> +         * that we want to copy _all_ clusters.
>> +         */
> Right, this is more like ALLOWING all sectors to be copied.
>
> Perhaps:
>
> "All bits are set to allow any cluster to be copied. This does not
> actually require them to be copied."
>
>> +        hbitmap_set(job->copy_bitmap, 0, nb_clusters);
>>           while (!block_job_is_cancelled(&job->common)) {
>>               /* Yield until the job is cancelled.  We just let our before_write
>>                * notify callback service CoW requests. */
>>               block_job_yield(&job->common);
>>           }
>>       } else if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
>> +        backup_incremental_init_copy_bitmap(job);
>>           ret = backup_run_incremental(job);
>>       } else {
>> +        hbitmap_set(job->copy_bitmap, 0, nb_clusters);
> Maybe you could just factor these things out entirely:
>
> if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
>    backup_incremental_init_copy_bitmap(job);
> } else {
>    hbitmap_set(job->copy_bitmap, 0, nb_clusters);
> }
>
> prior to the big switch statement.
>
>>           /* Both FULL and TOP SYNC_MODE's require copying.. */
>>           for (offset = 0; offset < job->common.len;
>>                offset += job->cluster_size) {
>>


-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] backup: simplify non-dirty bits progress processing
  2017-10-09 23:44   ` John Snow
@ 2017-10-12 11:42     ` Vladimir Sementsov-Ogievskiy
  2017-10-12 13:56       ` Eric Blake
  0 siblings, 1 reply; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-12 11:42 UTC (permalink / raw)
  To: John Snow, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den

10.10.2017 02:44, John Snow wrote:
>
> On 10/02/2017 10:39 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Set fake progress for non-dirty clusters in copy_bitmap initialization.
>> This simplifies the code and allows further refactoring.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>> ---
>>
>> Motivation (some of these paragraphs may be needed in commit message...)
>>
> lol! "pick and choose which paragraphs look good, stick some up there,
> maybe?"
>
>> 1. Current behavior is not good: setting fake progress leads to progress
>> hopping, so for a sparse disk the progress actually says nothing. We just
>> move all these hops to the beginning.
>>
> To be fair, if it's sparse, the first update will be very soon.
>
>> 2. A big hop at the beginning is possible now too: if there is a hole at
>> the beginning of the virtual disk.
>>
>> 3. There are some excuses for the old behavior in comparison to the new
>> one: backup progress is linear, cluster by cluster. But in the future,
>> cluster copying will be done by several coroutine workers, with several
>> requests in parallel, so it will make no sense to publish fake progress
>> piecemeal alongside non-sequential requests.
>>
> I agree.
>
>> 4. Actually, it's just a point of view: when do we actually skip clusters?
>> If we go through the disk sequentially, it's logical to say that we skip
>> the clusters between copied portions to their left and right. But even now
>> copying progress is not sequential because of write notifiers.
>> If copying is async and non-sequential, we can say at the very beginning
>> that we skip these clusters and not think about them later.
>>
> Good observation.
>
>> So, I don't want to maintain portions of fake progress in the new backup
>> architecture. I understand that this patch will change the user's view
>> of backup progress, but formally nothing changes: the progress hops are
>> just moved to the beginning. Progress was meaningless for incremental
>> backup and it remains meaningless.
>>
> Haha. Not meaningless, but it does carry less information than other
> progress meters. It is effectively a % complete meter, with a really bad
> estimate.
>
>> But what should we do with progress, really? It's obvious that we need at
>> least one more field in the block job status, for example "skipped", which
>> would be the number of bytes that are not really processed but skipped by
>> the block job. Then a more realistic progress can be calculated (by the
>> management tool) as
>>
>> (job.offset - job.skipped) / (job.len - job.skipped)
>>
> I'm not sure we actually need a new field... let's just say that the job
> length is the number of bytes described by the incremental backup. Every
> time we copy some, move offset forward. This would give a more
> appropriately linear/accurate sense of the progress.
>
> I think we are allowed to do this as I think we promise that these
> progress numbers mean essentially nothing...

I'm not sure; length is a published field, available through libvirt.
IMHO, if we change its semantics it will first break our iotests, but
what it will break for users is unpredictable.

>
> New fields that actually *do* report meaningful information, however,
> could be added. bytesCopied would be a prime candidate.

So you suggest changing the semantics of one field and adding another;
I suggest just adding one field, which I think is safer.
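
For illustration, with hypothetical numbers: on a 10 GiB disk with
2 GiB dirty, 'skipped' would be 8 GiB and offset would start at 8 GiB;
after copying 1 GiB of dirty data, offset = 9 GiB, so the corrected
progress is (9 - 8) / (10 - 8) = 50%, while the raw offset/len would
show 90%.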

>
>> And this progress will be more precise if the information about skipped
>> bytes is provided early, and the current patch is a step toward this idea.
>>
>> So, if you agree, I can make a patch for 'skipped' too, but it's actually
>> not related to this series, so I hope the current patch will be merged
>> anyway. (I need it for the backup refactoring, not for accurate progress.)
>>
>>   block/backup.c | 18 +++---------------
>>   1 file changed, 3 insertions(+), 15 deletions(-)
>>
>> diff --git a/block/backup.c b/block/backup.c
>> index 07f4ae846b..52df7bb19e 100644
>> --- a/block/backup.c
>> +++ b/block/backup.c
>> @@ -369,7 +369,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
>>       int64_t offset;
>>       int64_t cluster;
>>       int64_t end;
>> -    int64_t last_cluster = -1;
>>       BdrvDirtyBitmapIter *dbi;
>>   
>>       granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
>> @@ -380,12 +379,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
>>       while ((offset = bdrv_dirty_iter_next(dbi)) >= 0) {
>>           cluster = offset / job->cluster_size;
>>   
>> -        /* Fake progress updates for any clusters we skipped */
>> -        if (cluster != last_cluster + 1) {
>> -            job->common.offset += ((cluster - last_cluster - 1) *
>> -                                   job->cluster_size);
>> -        }
>> -
>>           for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
>>               do {
>>                   if (yield_and_check(job)) {
>> @@ -407,14 +400,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
>>           if (granularity < job->cluster_size) {
>>               bdrv_set_dirty_iter(dbi, cluster * job->cluster_size);
>>           }
>> -
>> -        last_cluster = cluster - 1;
>> -    }
>> -
>> -    /* Play some final catchup with the progress meter */
>> -    end = DIV_ROUND_UP(job->common.len, job->cluster_size);
>> -    if (last_cluster + 1 < end) {
>> -        job->common.offset += ((end - last_cluster - 1) * job->cluster_size);
>>       }
>>   
>>   out:
>> @@ -456,6 +441,9 @@ static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
>>           bdrv_set_dirty_iter(dbi, next_cluster * job->cluster_size);
>>       }
>>   
>> +    job->common.offset = job->common.len -
>> +                         hbitmap_count(job->copy_bitmap) * job->cluster_size;
>> +
>>       bdrv_dirty_iter_free(dbi);
>>   }
>>   
>>
> Anything that makes for fewer lines of code is nice, though, so if we
> cannot reduce the length of the job:
>
> Reviewed-by: John Snow <jsnow@redhat.com>

Thank you!

-- 
Best regards,
Vladimir

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] backup: simplify non-dirty bits progress processing
  2017-10-12 11:42     ` Vladimir Sementsov-Ogievskiy
@ 2017-10-12 13:56       ` Eric Blake
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Blake @ 2017-10-12 13:56 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, John Snow, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den

On 10/12/2017 06:42 AM, Vladimir Sementsov-Ogievskiy wrote:

>> I'm not sure we actually need a new field... let's just say that the job
>> length is the number of bytes described by the incremental backup. Every
>> time we copy some, move offset forward. This would give a more
>> appropriately linear/accurate sense of the progress.
>>
>> I think we are allowed to do this as I think we promise that these
>> progress numbers mean essentially nothing...
> 
> I'm not sure, length is published field, it is available in libvirt.
> IMHO, If we change it semantics it should
> firstly break our iotests but what it will break for users is
> unpredictable..

Libvirt already documents that progress numbers do NOT have any specific
meaning other than a rough estimate of percentage complete, and that it
IS acceptable for the numbers to jump around or move non-linearly during
a job.  I see no problem with returning different numbers than
previously if it makes our estimation of progress look smoother; it is
not a semantics change by the definition we've given to the numbers, but
merely a quality-of-implementation improvement.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH 3/5] backup: init copy_bitmap from sync_bitmap for incremental
  2017-10-12 11:23     ` Vladimir Sementsov-Ogievskiy
@ 2017-11-07  0:25       ` John Snow
  0 siblings, 0 replies; 21+ messages in thread
From: John Snow @ 2017-11-07  0:25 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, qemu-block, qemu-devel
  Cc: kwolf, famz, jcody, mreitz, stefanha, den



On 10/12/2017 07:23 AM, Vladimir Sementsov-Ogievskiy wrote:
> 10.10.2017 01:56, John Snow wrote:
>>
>> On 10/02/2017 10:39 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> We should not copy non-dirty clusters in write notifiers. So,
>>> initialize copy_bitmap from sync_bitmap.
>>>
>> ...! Duh, good find!!
>>
>> So, previously we'd copy extra sectors if they just so happened to be
>> written to during the backup process. These would be totally redundant
>> compared to the previous backups in the chain, by definition.
>>
>> Now, we allow the write to go through to the source image and let it
>> dirty the bdrv dirty bitmap we have on standby.
>>
>> Good fix.
>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>>> ---
>>>   block/backup.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
>>>   1 file changed, 44 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/backup.c b/block/backup.c
>>> index 08efec639f..07f4ae846b 100644
>>> --- a/block/backup.c
>>> +++ b/block/backup.c
>>> @@ -422,6 +422,43 @@ out:
>>>       return ret;
>>>   }
>>>
>>> +/* init copy_bitmap from sync_bitmap */
>>> +static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
>>> +{
>>> +    BdrvDirtyBitmapIter *dbi;
>>> +    int64_t offset;
>>> +    int64_t end = DIV_ROUND_UP(bdrv_dirty_bitmap_size(job->sync_bitmap),
>>> +                               job->cluster_size);
>>> +
>>> +    dbi = bdrv_dirty_iter_new(job->sync_bitmap);
>>> +    while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {
>>> +        int64_t cluster = offset / job->cluster_size;
>>> +        int64_t next_cluster;
>>> +
>>> +        offset += bdrv_dirty_bitmap_granularity(job->sync_bitmap);
>>> +        if (offset >= bdrv_dirty_bitmap_size(job->sync_bitmap)) {
>>> +            hbitmap_set(job->copy_bitmap, cluster, end - cluster);
>>> +            break;
>>> +        }
>>> +
>>> +        offset = bdrv_dirty_bitmap_next_zero(job->sync_bitmap, offset);
>>> +        if (offset == -1) {
>>> +            hbitmap_set(job->copy_bitmap, cluster, end - cluster);
>>> +            break;
>>> +        }
>>> +
>>> +        next_cluster = DIV_ROUND_UP(offset, job->cluster_size);
>>> +        hbitmap_set(job->copy_bitmap, cluster, next_cluster - cluster);
>>> +        if (next_cluster >= end) {
>>> +            break;
>>> +        }
>>> +
>> Did you look into initializing this from a lower layer, e.g. a call to
>> initialize-an-hbitmap-from-another-hbitmap?
>>
>> The loop might look a little cleaner that way and we might possibly get
>> more utility out of it than a custom coded loop here in the backup code.
> 
> No, to be honest I don't want to do it now; there are some difficulties:
> 
> 1. We init from a bdrv bitmap, so we'd have to add a somewhat strange
> common interface, init-hbitmap-from-bdrv-bitmap.
> 2 (harder). Here copy_bitmap has a different size: its granularity is
> zero and its bits correspond to clusters, not to bytes, and I don't
> want to change this.
> 
> So it's not a common case.
> 

Fair enough, it just struck me as a slightly odd thing to code here in
the backup file, but it's true that nobody else needs this yet (or
perhaps ever.)

No problem. We'll figure it out when we get there. Thank you!

--js

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH 4/5] backup: simplify non-dirty bits progress processing
  2017-10-12 13:53 [Qemu-devel] [PATCH v2 " Vladimir Sementsov-Ogievskiy
@ 2017-10-12 13:53 ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2017-10-12 13:53 UTC (permalink / raw)
  To: qemu-devel, qemu-block
  Cc: jsnow, famz, mreitz, kwolf, jcody, stefanha, den, vsementsov

Set fake progress for non-dirty clusters in copy_bitmap initialization.
This simplifies the code and allows further refactoring.

This patch changes the user's view of backup progress, but formally
nothing changes: the progress hops are just moved to the beginning.

Actually, it's just a point of view: when do we actually skip clusters?
We can say at the very beginning that we skip these clusters and not
think about them later.

Of course, if we go through the disk sequentially, it's logical to say
that we skip the clusters between copied portions to their left and
right. But even now copying progress is not sequential because of write
notifiers. Future patches will introduce a new backup architecture
which will do the copying in several coroutines in parallel, so it will
make no sense to publish fake progress piecemeal alongside other
copying requests.
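
(Concretely, given the hunk below: after backup_incremental_init_copy_bitmap(),
offset starts at len - dirty_clusters * cluster_size, i.e. every clean
byte is reported as done up front.)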

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 block/backup.c | 18 +++---------------
 1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 71ad59b417..04608836d7 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -369,7 +369,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
     int64_t offset;
     int64_t cluster;
     int64_t end;
-    int64_t last_cluster = -1;
     BdrvDirtyBitmapIter *dbi;
 
     granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
@@ -380,12 +379,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
     while ((offset = bdrv_dirty_iter_next(dbi)) >= 0) {
         cluster = offset / job->cluster_size;
 
-        /* Fake progress updates for any clusters we skipped */
-        if (cluster != last_cluster + 1) {
-            job->common.offset += ((cluster - last_cluster - 1) *
-                                   job->cluster_size);
-        }
-
         for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
             do {
                 if (yield_and_check(job)) {
@@ -407,14 +400,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
         if (granularity < job->cluster_size) {
             bdrv_set_dirty_iter(dbi, cluster * job->cluster_size);
         }
-
-        last_cluster = cluster - 1;
-    }
-
-    /* Play some final catchup with the progress meter */
-    end = DIV_ROUND_UP(job->common.len, job->cluster_size);
-    if (last_cluster + 1 < end) {
-        job->common.offset += ((end - last_cluster - 1) * job->cluster_size);
     }
 
 out:
@@ -456,6 +441,9 @@ static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
         bdrv_set_dirty_iter(dbi, next_cluster * job->cluster_size);
     }
 
+    job->common.offset = job->common.len -
+                         hbitmap_count(job->copy_bitmap) * job->cluster_size;
+
     bdrv_dirty_iter_free(dbi);
 }
 
-- 
2.11.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

end of thread

Thread overview: 21+ messages
2017-10-02 14:39 [Qemu-devel] [PATCH 0/5] backup improvements part 1 Vladimir Sementsov-Ogievskiy
2017-10-02 14:39 ` [Qemu-devel] [PATCH 1/5] hbitmap: add next_zero function Vladimir Sementsov-Ogievskiy
2017-10-02 15:43   ` Eric Blake
2017-10-02 16:16     ` Vladimir Sementsov-Ogievskiy
2017-10-09 21:51   ` John Snow
2017-10-12 10:05     ` Vladimir Sementsov-Ogievskiy
2017-10-02 14:39 ` [Qemu-devel] [PATCH 2/5] backup: move from done_bitmap to copy_bitmap Vladimir Sementsov-Ogievskiy
2017-10-09 22:16   ` John Snow
2017-10-02 14:39 ` [Qemu-devel] [PATCH 3/5] backup: init copy_bitmap from sync_bitmap for incremental Vladimir Sementsov-Ogievskiy
2017-10-09 22:56   ` John Snow
2017-10-12 11:23     ` Vladimir Sementsov-Ogievskiy
2017-11-07  0:25       ` John Snow
2017-10-02 14:39 ` [Qemu-devel] [PATCH 4/5] backup: simplify non-dirty bits progress processing Vladimir Sementsov-Ogievskiy
2017-10-09 23:44   ` John Snow
2017-10-12 11:42     ` Vladimir Sementsov-Ogievskiy
2017-10-12 13:56       ` Eric Blake
2017-10-02 14:39 ` [Qemu-devel] [PATCH 5/5] backup: use copy_bitmap in incremental backup Vladimir Sementsov-Ogievskiy
2017-10-09 23:51   ` John Snow
2017-10-02 15:38 ` [Qemu-devel] [PATCH 0/5] backup improvements part 1 Eric Blake
2017-10-02 16:17   ` Vladimir Sementsov-Ogievskiy
2017-10-12 13:53 [Qemu-devel] [PATCH v2 " Vladimir Sementsov-Ogievskiy
2017-10-12 13:53 ` [Qemu-devel] [PATCH 4/5] backup: simplify non-dirty bits progress processing Vladimir Sementsov-Ogievskiy
