* [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based
From: Eric Blake @ 2017-05-11  2:20 UTC
  To: qemu-devel; +Cc: qemu-block, jsnow

There are patches floating around to add NBD_CMD_BLOCK_STATUS,
but NBD wants to report status at byte granularity (even if the
reporting will probably be naturally aligned to sectors or even
much higher levels).  I've therefore started the task of
converting our block status code to report at byte granularity
rather than in sectors.
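
For orientation, here is a rough sketch of where the series is
headed; the sector-based prototype is today's, and the byte-based
one is a sketch of what patch 18 aims for (exact types may differ
in the final patch):

    /* today: sector-based */
    int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num,
                          int nb_sectors, int *pnum);

    /* after patch 18: byte-based (callers may still pass
     * sector-aligned values, but the contract is in bytes) */
    int bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
                          int64_t bytes, int64_t *pnum);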

This is part one of that conversion: bdrv_is_allocated().
Other parts still need a v2, but here are links to their v1 postings:
tracking dirty bitmaps by bytes:
https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg02163.html
replacing bdrv_get_block_status() with a byte-based callback
in all the drivers:
https://lists.gnu.org/archive/html/qemu-devel/2017-04/msg02642.html

Available as a tag at:
git fetch git://repo.or.cz/qemu/ericb.git nbd-byte-allocated-v2

Since v1:
- Add R-b from John as appropriate
- A couple of new patches for cleanups he noticed
- Rebase to Max's block branch

001/20:[----] [--] 'blockjob: Track job ratelimits via bytes, not sectors'
002/20:[----] [--] 'trace: Show blockjob actions via bytes, not sectors'
003/20:[----] [--] 'stream: Switch stream_populate() to byte-based'
004/20:[----] [--] 'stream: Switch stream_run() to byte-based'
005/20:[----] [--] 'commit: Switch commit_populate() to byte-based'
006/20:[----] [--] 'commit: Switch commit_run() to byte-based'
007/20:[----] [--] 'mirror: Switch MirrorBlockJob to byte-based'
008/20:[----] [--] 'mirror: Switch mirror_do_zero_or_discard() to byte-based'
009/20:[down] 'mirror: Update signature of mirror_clip_sectors()'
010/20:[0013] [FC] 'mirror: Switch mirror_cow_align() to byte-based'
011/20:[----] [-C] 'mirror: Switch mirror_do_read() to byte-based'
012/20:[0014] [FC] 'mirror: Switch mirror_iteration() to byte-based'
013/20:[----] [--] 'block: Drop unused bdrv_round_sectors_to_clusters()'
014/20:[----] [--] 'backup: Switch BackupBlockJob to byte-based'
015/20:[0008] [FC] 'backup: Switch block_backup.h to byte-based'
016/20:[----] [--] 'backup: Switch backup_do_cow() to byte-based'
017/20:[----] [--] 'backup: Switch backup_run() to byte-based'
018/20:[0039] [FC] 'block: Make bdrv_is_allocated() byte-based'
019/20:[down] 'block: Minimize raw use of bds->total_sectors'
020/20:[0026] [FC] 'block: Make bdrv_is_allocated_above() byte-based'

Eric Blake (20):
  blockjob: Track job ratelimits via bytes, not sectors
  trace: Show blockjob actions via bytes, not sectors
  stream: Switch stream_populate() to byte-based
  stream: Switch stream_run() to byte-based
  commit: Switch commit_populate() to byte-based
  commit: Switch commit_run() to byte-based
  mirror: Switch MirrorBlockJob to byte-based
  mirror: Switch mirror_do_zero_or_discard() to byte-based
  mirror: Update signature of mirror_clip_sectors()
  mirror: Switch mirror_cow_align() to byte-based
  mirror: Switch mirror_do_read() to byte-based
  mirror: Switch mirror_iteration() to byte-based
  block: Drop unused bdrv_round_sectors_to_clusters()
  backup: Switch BackupBlockJob to byte-based
  backup: Switch block_backup.h to byte-based
  backup: Switch backup_do_cow() to byte-based
  backup: Switch backup_run() to byte-based
  block: Make bdrv_is_allocated() byte-based
  block: Minimize raw use of bds->total_sectors
  block: Make bdrv_is_allocated_above() byte-based

 include/block/block.h        |  10 +-
 include/block/block_backup.h |  11 +-
 include/qemu/ratelimit.h     |   3 +-
 block/backup.c               | 130 ++++++++-----------
 block/commit.c               |  54 ++++----
 block/io.c                   |  92 +++++++------
 block/mirror.c               | 300 ++++++++++++++++++++++---------------------
 block/replication.c          |  29 +++--
 block/stream.c               |  35 +++--
 block/vvfat.c                |  34 +++--
 migration/block.c            |   9 +-
 qemu-img.c                   |  15 ++-
 qemu-io-cmds.c               |  70 +++++-----
 block/trace-events           |  14 +-
 14 files changed, 396 insertions(+), 410 deletions(-)

-- 
2.9.3

* [Qemu-devel] [PATCH v2 01/20] blockjob: Track job ratelimits via bytes, not sectors
From: Eric Blake @ 2017-05-11  2:20 UTC
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

The user interface specifies job rate limits in bytes/second.
It's pointless to have our internal representation track things
in sectors/second, particularly since we want to move away from
sector-based interfaces.

Fix up a doc typo found while verifying that the ratelimit
code handles the scaling difference.

Repetition of expressions like 'n * BDRV_SECTOR_SIZE' will be
cleaned up later when functions are converted to iterate over
images by bytes rather than by sectors.
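
Concretely, every job's speed setter and delay computation flip
from scaled to unscaled speed (a sketch distilled from the hunks
below):

    /* before: the limit is programmed in sectors/second */
    ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
    delay_ns = ratelimit_calculate_delay(&s->limit, n);

    /* after: the limit is programmed in bytes/second */
    ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
    delay_ns = ratelimit_calculate_delay(&s->limit, n * BDRV_SECTOR_SIZE);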

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: adjust commit message based on review; no code change
---
 include/qemu/ratelimit.h |  3 ++-
 block/backup.c           |  5 +++--
 block/commit.c           |  5 +++--
 block/mirror.c           | 13 +++++++------
 block/stream.c           |  5 +++--
 5 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/include/qemu/ratelimit.h b/include/qemu/ratelimit.h
index 8da1232..8dece48 100644
--- a/include/qemu/ratelimit.h
+++ b/include/qemu/ratelimit.h
@@ -24,7 +24,8 @@ typedef struct {

 /** Calculate and return delay for next request in ns
  *
- * Record that we sent @p n data units. If we may send more data units
+ * Record that we sent @n data units (where @n matches the scale chosen
+ * during ratelimit_set_speed). If we may send more data units
  * in the current time slice, return 0 (i.e. no delay). Otherwise
  * return the amount of time (in ns) until the start of the next time
  * slice that will permit sending the next chunk of data.
diff --git a/block/backup.c b/block/backup.c
index a4fb288..bcfa321 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -208,7 +208,7 @@ static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
         error_setg(errp, QERR_INVALID_PARAMETER, "speed");
         return;
     }
-    ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
+    ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
 }

 static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
@@ -359,7 +359,8 @@ static bool coroutine_fn yield_and_check(BackupBlockJob *job)
      */
     if (job->common.speed) {
         uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
-                                                      job->sectors_read);
+                                                      job->sectors_read *
+                                                      BDRV_SECTOR_SIZE);
         job->sectors_read = 0;
         block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
     } else {
diff --git a/block/commit.c b/block/commit.c
index 76a0d98..8b684e0 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -195,7 +195,8 @@ static void coroutine_fn commit_run(void *opaque)
         s->common.offset += n * BDRV_SECTOR_SIZE;

         if (copy && s->common.speed) {
-            delay_ns = ratelimit_calculate_delay(&s->limit, n);
+            delay_ns = ratelimit_calculate_delay(&s->limit,
+                                                 n * BDRV_SECTOR_SIZE);
         }
     }

@@ -217,7 +218,7 @@ static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
         error_setg(errp, QERR_INVALID_PARAMETER, "speed");
         return;
     }
-    ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
+    ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
 }

 static const BlockJobDriver commit_job_driver = {
diff --git a/block/mirror.c b/block/mirror.c
index e86f8f8..015a62c 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -391,7 +391,8 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
     bitmap_set(s->in_flight_bitmap, sector_num / sectors_per_chunk, nb_chunks);
     while (nb_chunks > 0 && sector_num < end) {
         int64_t ret;
-        int io_sectors, io_sectors_acct;
+        int io_sectors;
+        int64_t io_bytes_acct;
         BlockDriverState *file;
         enum MirrorMethod {
             MIRROR_METHOD_COPY,
@@ -439,16 +440,16 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
         switch (mirror_method) {
         case MIRROR_METHOD_COPY:
             io_sectors = mirror_do_read(s, sector_num, io_sectors);
-            io_sectors_acct = io_sectors;
+            io_bytes_acct = io_sectors * BDRV_SECTOR_SIZE;
             break;
         case MIRROR_METHOD_ZERO:
         case MIRROR_METHOD_DISCARD:
             mirror_do_zero_or_discard(s, sector_num, io_sectors,
                                       mirror_method == MIRROR_METHOD_DISCARD);
             if (write_zeroes_ok) {
-                io_sectors_acct = 0;
+                io_bytes_acct = 0;
             } else {
-                io_sectors_acct = io_sectors;
+                io_bytes_acct = io_sectors * BDRV_SECTOR_SIZE;
             }
             break;
         default:
@@ -458,7 +459,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
         sector_num += io_sectors;
         nb_chunks -= DIV_ROUND_UP(io_sectors, sectors_per_chunk);
         if (s->common.speed) {
-            delay_ns = ratelimit_calculate_delay(&s->limit, io_sectors_acct);
+            delay_ns = ratelimit_calculate_delay(&s->limit, io_bytes_acct);
         }
     }
     return delay_ns;
@@ -918,7 +919,7 @@ static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
         error_setg(errp, QERR_INVALID_PARAMETER, "speed");
         return;
     }
-    ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
+    ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
 }

 static void mirror_complete(BlockJob *job, Error **errp)
diff --git a/block/stream.c b/block/stream.c
index 0113710..11c5cf1 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -191,7 +191,8 @@ static void coroutine_fn stream_run(void *opaque)
         /* Publish progress */
         s->common.offset += n * BDRV_SECTOR_SIZE;
         if (copy && s->common.speed) {
-            delay_ns = ratelimit_calculate_delay(&s->limit, n);
+            delay_ns = ratelimit_calculate_delay(&s->limit,
+                                                 n * BDRV_SECTOR_SIZE);
         }
     }

@@ -220,7 +221,7 @@ static void stream_set_speed(BlockJob *job, int64_t speed, Error **errp)
         error_setg(errp, QERR_INVALID_PARAMETER, "speed");
         return;
     }
-    ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
+    ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
 }

 static const BlockJobDriver stream_job_driver = {
-- 
2.9.3

* [Qemu-devel] [PATCH v2 02/20] trace: Show blockjob actions via bytes, not sectors
From: Eric Blake @ 2017-05-11  2:20 UTC
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

Upcoming patches are going to switch to byte-based interfaces
instead of sector-based.  Even worse, trace_backup_do_cow_enter()
had a weird mix of cluster and sector indices.

The trace interface is low-level enough that there are no stability
guarantees, so there is nothing wrong with changing our units, even
in cases like trace_backup_do_cow_skip() where the format string is
unchanged (the logged value merely switches from a cluster index to
a byte offset).  So make the tracing uniformly use bytes.
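
A worked example of why the old mix was confusing, assuming
512-byte sectors and a 64 KiB backup cluster (numbers are
illustrative):

    cluster index 2   ->  byte offset 2 * 65536 = 131072
    sector_num 256    ->  byte offset 256 * 512 = 131072

trace_backup_do_cow_enter() logged both units side by side;
scaling everything to bytes makes the values directly comparable.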

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: improve commit message, no code change
---
 block/backup.c     | 16 ++++++++++------
 block/commit.c     |  3 ++-
 block/mirror.c     | 26 +++++++++++++++++---------
 block/stream.c     |  3 ++-
 block/trace-events | 14 +++++++-------
 5 files changed, 38 insertions(+), 24 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index bcfa321..b28df8c 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -102,6 +102,7 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
     void *bounce_buffer = NULL;
     int ret = 0;
     int64_t sectors_per_cluster = cluster_size_sectors(job);
+    int64_t bytes_per_cluster = sectors_per_cluster * BDRV_SECTOR_SIZE;
     int64_t start, end;
     int n;

@@ -110,18 +111,20 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
     start = sector_num / sectors_per_cluster;
     end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);

-    trace_backup_do_cow_enter(job, start, sector_num, nb_sectors);
+    trace_backup_do_cow_enter(job, start * bytes_per_cluster,
+                              sector_num * BDRV_SECTOR_SIZE,
+                              nb_sectors * BDRV_SECTOR_SIZE);

     wait_for_overlapping_requests(job, start, end);
     cow_request_begin(&cow_request, job, start, end);

     for (; start < end; start++) {
         if (test_bit(start, job->done_bitmap)) {
-            trace_backup_do_cow_skip(job, start);
+            trace_backup_do_cow_skip(job, start * bytes_per_cluster);
             continue; /* already copied */
         }

-        trace_backup_do_cow_process(job, start);
+        trace_backup_do_cow_process(job, start * bytes_per_cluster);

         n = MIN(sectors_per_cluster,
                 job->common.len / BDRV_SECTOR_SIZE -
@@ -138,7 +141,7 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
                             bounce_qiov.size, &bounce_qiov,
                             is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
         if (ret < 0) {
-            trace_backup_do_cow_read_fail(job, start, ret);
+            trace_backup_do_cow_read_fail(job, start * bytes_per_cluster, ret);
             if (error_is_read) {
                 *error_is_read = true;
             }
@@ -154,7 +157,7 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
                                  job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
         }
         if (ret < 0) {
-            trace_backup_do_cow_write_fail(job, start, ret);
+            trace_backup_do_cow_write_fail(job, start * bytes_per_cluster, ret);
             if (error_is_read) {
                 *error_is_read = false;
             }
@@ -177,7 +180,8 @@ out:

     cow_request_end(&cow_request);

-    trace_backup_do_cow_return(job, sector_num, nb_sectors, ret);
+    trace_backup_do_cow_return(job, sector_num * BDRV_SECTOR_SIZE,
+                               nb_sectors * BDRV_SECTOR_SIZE, ret);

     qemu_co_rwlock_unlock(&job->flush_rwlock);

diff --git a/block/commit.c b/block/commit.c
index 8b684e0..11dab98 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -176,7 +176,8 @@ static void coroutine_fn commit_run(void *opaque)
                                       COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
                                       &n);
         copy = (ret == 1);
-        trace_commit_one_iteration(s, sector_num, n, ret);
+        trace_commit_one_iteration(s, sector_num * BDRV_SECTOR_SIZE,
+                                   n * BDRV_SECTOR_SIZE, ret);
         if (copy) {
             ret = commit_populate(s->top, s->base, sector_num, n, buf);
             bytes_written += n * BDRV_SECTOR_SIZE;
diff --git a/block/mirror.c b/block/mirror.c
index 015a62c..58a6f89 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -103,7 +103,8 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
     int64_t chunk_num;
     int i, nb_chunks, sectors_per_chunk;

-    trace_mirror_iteration_done(s, op->sector_num, op->nb_sectors, ret);
+    trace_mirror_iteration_done(s, op->sector_num * BDRV_SECTOR_SIZE,
+                                op->nb_sectors * BDRV_SECTOR_SIZE, ret);

     s->in_flight--;
     s->sectors_in_flight -= op->nb_sectors;
@@ -268,7 +269,8 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
     nb_chunks = DIV_ROUND_UP(nb_sectors, sectors_per_chunk);

     while (s->buf_free_count < nb_chunks) {
-        trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
+        trace_mirror_yield_in_flight(s, sector_num * BDRV_SECTOR_SIZE,
+                                     s->in_flight);
         mirror_wait_for_io(s);
     }

@@ -294,7 +296,8 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
     /* Copy the dirty cluster.  */
     s->in_flight++;
     s->sectors_in_flight += nb_sectors;
-    trace_mirror_one_iteration(s, sector_num, nb_sectors);
+    trace_mirror_one_iteration(s, sector_num * BDRV_SECTOR_SIZE,
+                               nb_sectors * BDRV_SECTOR_SIZE);

     blk_aio_preadv(source, sector_num * BDRV_SECTOR_SIZE, &op->qiov, 0,
                    mirror_read_complete, op);
@@ -346,13 +349,15 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
     if (sector_num < 0) {
         bdrv_set_dirty_iter(s->dbi, 0);
         sector_num = bdrv_dirty_iter_next(s->dbi);
-        trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
+        trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap) *
+                                  BDRV_SECTOR_SIZE);
         assert(sector_num >= 0);
     }

     first_chunk = sector_num / sectors_per_chunk;
     while (test_bit(first_chunk, s->in_flight_bitmap)) {
-        trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
+        trace_mirror_yield_in_flight(s, sector_num * BDRV_SECTOR_SIZE,
+                                     s->in_flight);
         mirror_wait_for_io(s);
     }

@@ -428,7 +433,8 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
         }

         while (s->in_flight >= MAX_IN_FLIGHT) {
-            trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
+            trace_mirror_yield_in_flight(s, sector_num * BDRV_SECTOR_SIZE,
+                                         s->in_flight);
             mirror_wait_for_io(s);
         }

@@ -811,7 +817,8 @@ static void coroutine_fn mirror_run(void *opaque)
             s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
             if (s->in_flight >= MAX_IN_FLIGHT || s->buf_free_count == 0 ||
                 (cnt == 0 && s->in_flight > 0)) {
-                trace_mirror_yield(s, cnt, s->buf_free_count, s->in_flight);
+                trace_mirror_yield(s, cnt * BDRV_SECTOR_SIZE,
+                                   s->buf_free_count, s->in_flight);
                 mirror_wait_for_io(s);
                 continue;
             } else if (cnt != 0) {
@@ -852,7 +859,7 @@ static void coroutine_fn mirror_run(void *opaque)
              * whether to switch to target check one last time if I/O has
              * come in the meanwhile, and if not flush the data to disk.
              */
-            trace_mirror_before_drain(s, cnt);
+            trace_mirror_before_drain(s, cnt * BDRV_SECTOR_SIZE);

             bdrv_drained_begin(bs);
             cnt = bdrv_get_dirty_count(s->dirty_bitmap);
@@ -871,7 +878,8 @@ static void coroutine_fn mirror_run(void *opaque)
         }

         ret = 0;
-        trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
+        trace_mirror_before_sleep(s, cnt * BDRV_SECTOR_SIZE,
+                                  s->synced, delay_ns);
         if (!s->synced) {
             block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
             if (block_job_is_cancelled(&s->common)) {
diff --git a/block/stream.c b/block/stream.c
index 11c5cf1..3134ec5 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -168,7 +168,8 @@ static void coroutine_fn stream_run(void *opaque)

             copy = (ret == 1);
         }
-        trace_stream_one_iteration(s, sector_num, n, ret);
+        trace_stream_one_iteration(s, sector_num * BDRV_SECTOR_SIZE,
+                                   n * BDRV_SECTOR_SIZE, ret);
         if (copy) {
             ret = stream_populate(blk, sector_num, n, buf);
         }
diff --git a/block/trace-events b/block/trace-events
index 9a71c7f..ca1b132 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -18,11 +18,11 @@ bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int count, int flags) "bs %p off
 bdrv_co_do_copy_on_readv(void *bs, int64_t offset, unsigned int bytes, int64_t cluster_offset, unsigned int cluster_bytes) "bs %p offset %"PRId64" bytes %u cluster_offset %"PRId64" cluster_bytes %u"

 # block/stream.c
-stream_one_iteration(void *s, int64_t sector_num, int nb_sectors, int is_allocated) "s %p sector_num %"PRId64" nb_sectors %d is_allocated %d"
+stream_one_iteration(void *s, int64_t offset, uint64_t bytes, int is_allocated) "s %p offset %" PRId64 " bytes %" PRIu64 " is_allocated %d"
 stream_start(void *bs, void *base, void *s) "bs %p base %p s %p"

 # block/commit.c
-commit_one_iteration(void *s, int64_t sector_num, int nb_sectors, int is_allocated) "s %p sector_num %"PRId64" nb_sectors %d is_allocated %d"
+commit_one_iteration(void *s, int64_t offset, uint64_t bytes, int is_allocated) "s %p offset %" PRId64 " bytes %" PRIu64 " is_allocated %d"
 commit_start(void *bs, void *base, void *top, void *s) "bs %p base %p top %p s %p"

 # block/mirror.c
@@ -31,14 +31,14 @@ mirror_restart_iter(void *s, int64_t cnt) "s %p dirty count %"PRId64
 mirror_before_flush(void *s) "s %p"
 mirror_before_drain(void *s, int64_t cnt) "s %p dirty count %"PRId64
 mirror_before_sleep(void *s, int64_t cnt, int synced, uint64_t delay_ns) "s %p dirty count %"PRId64" synced %d delay %"PRIu64"ns"
-mirror_one_iteration(void *s, int64_t sector_num, int nb_sectors) "s %p sector_num %"PRId64" nb_sectors %d"
-mirror_iteration_done(void *s, int64_t sector_num, int nb_sectors, int ret) "s %p sector_num %"PRId64" nb_sectors %d ret %d"
+mirror_one_iteration(void *s, int64_t offset, uint64_t bytes) "s %p offset %" PRId64 " bytes %" PRIu64
+mirror_iteration_done(void *s, int64_t offset, uint64_t bytes, int ret) "s %p offset %" PRId64 " bytes %" PRIu64 " ret %d"
 mirror_yield(void *s, int64_t cnt, int buf_free_count, int in_flight) "s %p dirty count %"PRId64" free buffers %d in_flight %d"
-mirror_yield_in_flight(void *s, int64_t sector_num, int in_flight) "s %p sector_num %"PRId64" in_flight %d"
+mirror_yield_in_flight(void *s, int64_t offset, int in_flight) "s %p offset %" PRId64 " in_flight %d"

 # block/backup.c
-backup_do_cow_enter(void *job, int64_t start, int64_t sector_num, int nb_sectors) "job %p start %"PRId64" sector_num %"PRId64" nb_sectors %d"
-backup_do_cow_return(void *job, int64_t sector_num, int nb_sectors, int ret) "job %p sector_num %"PRId64" nb_sectors %d ret %d"
+backup_do_cow_enter(void *job, int64_t start, int64_t offset, uint64_t bytes) "job %p start %" PRId64 " offset %" PRId64 " bytes %" PRIu64
+backup_do_cow_return(void *job, int64_t offset, uint64_t bytes, int ret) "job %p offset %" PRId64 " bytes %" PRIu64 " ret %d"
 backup_do_cow_skip(void *job, int64_t start) "job %p start %"PRId64
 backup_do_cow_process(void *job, int64_t start) "job %p start %"PRId64
 backup_do_cow_read_fail(void *job, int64_t start, int ret) "job %p start %"PRId64" ret %d"
-- 
2.9.3

* [Qemu-devel] [PATCH v2 03/20] stream: Switch stream_populate() to byte-based
From: Eric Blake @ 2017-05-11  2:20 UTC
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Start by converting an
internal function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: no change
---
 block/stream.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index 3134ec5..5acef6d 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -41,20 +41,20 @@ typedef struct StreamBlockJob {
 } StreamBlockJob;

 static int coroutine_fn stream_populate(BlockBackend *blk,
-                                        int64_t sector_num, int nb_sectors,
+                                        int64_t offset, uint64_t bytes,
                                         void *buf)
 {
     struct iovec iov = {
         .iov_base = buf,
-        .iov_len  = nb_sectors * BDRV_SECTOR_SIZE,
+        .iov_len  = bytes,
     };
     QEMUIOVector qiov;

+    assert(bytes < SIZE_MAX);
     qemu_iovec_init_external(&qiov, &iov, 1);

     /* Copy-on-read the unallocated clusters */
-    return blk_co_preadv(blk, sector_num * BDRV_SECTOR_SIZE, qiov.size, &qiov,
-                         BDRV_REQ_COPY_ON_READ);
+    return blk_co_preadv(blk, offset, qiov.size, &qiov, BDRV_REQ_COPY_ON_READ);
 }

 typedef struct {
@@ -171,7 +171,8 @@ static void coroutine_fn stream_run(void *opaque)
         trace_stream_one_iteration(s, sector_num * BDRV_SECTOR_SIZE,
                                    n * BDRV_SECTOR_SIZE, ret);
         if (copy) {
-            ret = stream_populate(blk, sector_num, n, buf);
+            ret = stream_populate(blk, sector_num * BDRV_SECTOR_SIZE,
+                                  n * BDRV_SECTOR_SIZE, buf);
         }
         if (ret < 0) {
             BlockErrorAction action =
-- 
2.9.3

* [Qemu-devel] [PATCH v2 04/20] stream: Switch stream_run() to byte-based
From: Eric Blake @ 2017-05-11  2:20 UTC
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of streaming to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are sector-aligned).
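
The resulting loop shape (taken from the hunk below) keeps a byte
offset but still advances by whole sectors, so offset remains
sector-aligned throughout:

    /* n is a sector count reported by bdrv_is_allocated() */
    for ( ; offset < s->common.len; offset += n * BDRV_SECTOR_SIZE) {
        ...
    }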

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: no change
---
 block/stream.c | 24 ++++++++++--------------
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index 5acef6d..3ed4fef 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -108,12 +108,11 @@ static void coroutine_fn stream_run(void *opaque)
     BlockBackend *blk = s->common.blk;
     BlockDriverState *bs = blk_bs(blk);
     BlockDriverState *base = s->base;
-    int64_t sector_num = 0;
-    int64_t end = -1;
+    int64_t offset = 0;
     uint64_t delay_ns = 0;
     int error = 0;
     int ret = 0;
-    int n = 0;
+    int n = 0; /* sectors */
     void *buf;

     if (!bs->backing) {
@@ -126,7 +125,6 @@ static void coroutine_fn stream_run(void *opaque)
         goto out;
     }

-    end = s->common.len >> BDRV_SECTOR_BITS;
     buf = qemu_blockalign(bs, STREAM_BUFFER_SIZE);

     /* Turn on copy-on-read for the whole block device so that guest read
@@ -138,7 +136,7 @@ static void coroutine_fn stream_run(void *opaque)
         bdrv_enable_copy_on_read(bs);
     }

-    for (sector_num = 0; sector_num < end; sector_num += n) {
+    for ( ; offset < s->common.len; offset += n * BDRV_SECTOR_SIZE) {
         bool copy;

         /* Note that even when no rate limit is applied we need to yield
@@ -151,28 +149,26 @@ static void coroutine_fn stream_run(void *opaque)

         copy = false;

-        ret = bdrv_is_allocated(bs, sector_num,
+        ret = bdrv_is_allocated(bs, offset / BDRV_SECTOR_SIZE,
                                 STREAM_BUFFER_SIZE / BDRV_SECTOR_SIZE, &n);
         if (ret == 1) {
             /* Allocated in the top, no need to copy.  */
         } else if (ret >= 0) {
             /* Copy if allocated in the intermediate images.  Limit to the
-             * known-unallocated area [sector_num, sector_num+n).  */
+             * known-unallocated area [offset, offset+n*BDRV_SECTOR_SIZE).  */
             ret = bdrv_is_allocated_above(backing_bs(bs), base,
-                                          sector_num, n, &n);
+                                          offset / BDRV_SECTOR_SIZE, n, &n);

             /* Finish early if end of backing file has been reached */
             if (ret == 0 && n == 0) {
-                n = end - sector_num;
+                n = (s->common.len - offset) / BDRV_SECTOR_SIZE;
             }

             copy = (ret == 1);
         }
-        trace_stream_one_iteration(s, sector_num * BDRV_SECTOR_SIZE,
-                                   n * BDRV_SECTOR_SIZE, ret);
+        trace_stream_one_iteration(s, offset, n * BDRV_SECTOR_SIZE, ret);
         if (copy) {
-            ret = stream_populate(blk, sector_num * BDRV_SECTOR_SIZE,
-                                  n * BDRV_SECTOR_SIZE, buf);
+            ret = stream_populate(blk, offset, n * BDRV_SECTOR_SIZE, buf);
         }
         if (ret < 0) {
             BlockErrorAction action =
@@ -211,7 +207,7 @@ out:
     /* Modify backing chain and close BDSes in main loop */
     data = g_malloc(sizeof(*data));
     data->ret = ret;
-    data->reached_end = sector_num == end;
+    data->reached_end = offset == s->common.len;
     block_job_defer_to_main_loop(&s->common, stream_complete, data);
 }

-- 
2.9.3

* [Qemu-devel] [PATCH v2 05/20] commit: Switch commit_populate() to byte-based
From: Eric Blake @ 2017-05-11  2:20 UTC
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Start by converting an
internal function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: no change
---
 block/commit.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/block/commit.c b/block/commit.c
index 11dab98..d244c62 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -47,26 +47,25 @@ typedef struct CommitBlockJob {
 } CommitBlockJob;

 static int coroutine_fn commit_populate(BlockBackend *bs, BlockBackend *base,
-                                        int64_t sector_num, int nb_sectors,
+                                        int64_t offset, uint64_t bytes,
                                         void *buf)
 {
     int ret = 0;
     QEMUIOVector qiov;
     struct iovec iov = {
         .iov_base = buf,
-        .iov_len = nb_sectors * BDRV_SECTOR_SIZE,
+        .iov_len = bytes,
     };

+    assert(bytes < SIZE_MAX);
     qemu_iovec_init_external(&qiov, &iov, 1);

-    ret = blk_co_preadv(bs, sector_num * BDRV_SECTOR_SIZE,
-                        qiov.size, &qiov, 0);
+    ret = blk_co_preadv(bs, offset, qiov.size, &qiov, 0);
     if (ret < 0) {
         return ret;
     }

-    ret = blk_co_pwritev(base, sector_num * BDRV_SECTOR_SIZE,
-                         qiov.size, &qiov, 0);
+    ret = blk_co_pwritev(base, offset, qiov.size, &qiov, 0);
     if (ret < 0) {
         return ret;
     }
@@ -179,7 +178,9 @@ static void coroutine_fn commit_run(void *opaque)
         trace_commit_one_iteration(s, sector_num * BDRV_SECTOR_SIZE,
                                    n * BDRV_SECTOR_SIZE, ret);
         if (copy) {
-            ret = commit_populate(s->top, s->base, sector_num, n, buf);
+            ret = commit_populate(s->top, s->base,
+                                  sector_num * BDRV_SECTOR_SIZE,
+                                  n * BDRV_SECTOR_SIZE, buf);
             bytes_written += n * BDRV_SECTOR_SIZE;
         }
         if (ret < 0) {
-- 
2.9.3

* [Qemu-devel] [PATCH v2 06/20] commit: Switch commit_run() to byte-based
From: Eric Blake @ 2017-05-11  2:20 UTC
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of committing to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are sector-aligned).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: no change
---
 block/commit.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/block/commit.c b/block/commit.c
index d244c62..b7d3e60 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -129,17 +129,16 @@ static void coroutine_fn commit_run(void *opaque)
 {
     CommitBlockJob *s = opaque;
     CommitCompleteData *data;
-    int64_t sector_num, end;
+    int64_t offset;
     uint64_t delay_ns = 0;
     int ret = 0;
-    int n = 0;
+    int n = 0; /* sectors */
     void *buf = NULL;
     int bytes_written = 0;
     int64_t base_len;

     ret = s->common.len = blk_getlength(s->top);

-
     if (s->common.len < 0) {
         goto out;
     }
@@ -156,10 +155,9 @@ static void coroutine_fn commit_run(void *opaque)
         }
     }

-    end = s->common.len >> BDRV_SECTOR_BITS;
     buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);

-    for (sector_num = 0; sector_num < end; sector_num += n) {
+    for (offset = 0; offset < s->common.len; offset += n * BDRV_SECTOR_SIZE) {
         bool copy;

         /* Note that even when no rate limit is applied we need to yield
@@ -171,15 +169,13 @@ static void coroutine_fn commit_run(void *opaque)
         }
         /* Copy if allocated above the base */
         ret = bdrv_is_allocated_above(blk_bs(s->top), blk_bs(s->base),
-                                      sector_num,
+                                      offset / BDRV_SECTOR_SIZE,
                                       COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
                                       &n);
         copy = (ret == 1);
-        trace_commit_one_iteration(s, sector_num * BDRV_SECTOR_SIZE,
-                                   n * BDRV_SECTOR_SIZE, ret);
+        trace_commit_one_iteration(s, offset, n * BDRV_SECTOR_SIZE, ret);
         if (copy) {
-            ret = commit_populate(s->top, s->base,
-                                  sector_num * BDRV_SECTOR_SIZE,
+            ret = commit_populate(s->top, s->base, offset,
                                   n * BDRV_SECTOR_SIZE, buf);
             bytes_written += n * BDRV_SECTOR_SIZE;
         }
-- 
2.9.3

* [Qemu-devel] [PATCH v2 07/20] mirror: Switch MirrorBlockJob to byte-based
From: Eric Blake @ 2017-05-11  2:20 UTC
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Continue by converting an
internal structure (no semantic change), and all references to the
buffer size.

[checkpatch has a false positive on use of MIN() in this patch]
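
One subtlety worth calling out (visible in the last hunk below):
the job-length estimate now mixes units deliberately, because the
dirty-bitmap count is still in sectors while in-flight accounting
is already in bytes:

    /* cnt is a dirty-sector count; bytes_in_flight is in bytes */
    s->common.len = s->common.offset + s->bytes_in_flight +
        cnt * BDRV_SECTOR_SIZE;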

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: no change
---
 block/mirror.c | 79 ++++++++++++++++++++++++++++------------------------------
 1 file changed, 38 insertions(+), 41 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 58a6f89..d0544e3 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -24,9 +24,8 @@

 #define SLICE_TIME    100000000ULL /* ns */
 #define MAX_IN_FLIGHT 16
-#define MAX_IO_SECTORS ((1 << 20) >> BDRV_SECTOR_BITS) /* 1 Mb */
-#define DEFAULT_MIRROR_BUF_SIZE \
-    (MAX_IN_FLIGHT * MAX_IO_SECTORS * BDRV_SECTOR_SIZE)
+#define MAX_IO_BYTES (1 << 20) /* 1 Mb */
+#define DEFAULT_MIRROR_BUF_SIZE (MAX_IN_FLIGHT * MAX_IO_BYTES)

 /* The mirroring buffer is a list of granularity-sized chunks.
  * Free chunks are organized in a list.
@@ -67,11 +66,11 @@ typedef struct MirrorBlockJob {
     uint64_t last_pause_ns;
     unsigned long *in_flight_bitmap;
     int in_flight;
-    int64_t sectors_in_flight;
+    int64_t bytes_in_flight;
     int ret;
     bool unmap;
     bool waiting_for_io;
-    int target_cluster_sectors;
+    int target_cluster_size;
     int max_iov;
     bool initial_zeroing_ongoing;
 } MirrorBlockJob;
@@ -79,8 +78,8 @@ typedef struct MirrorBlockJob {
 typedef struct MirrorOp {
     MirrorBlockJob *s;
     QEMUIOVector qiov;
-    int64_t sector_num;
-    int nb_sectors;
+    int64_t offset;
+    uint64_t bytes;
 } MirrorOp;

 static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
@@ -101,13 +100,12 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
     MirrorBlockJob *s = op->s;
     struct iovec *iov;
     int64_t chunk_num;
-    int i, nb_chunks, sectors_per_chunk;
+    int i, nb_chunks;

-    trace_mirror_iteration_done(s, op->sector_num * BDRV_SECTOR_SIZE,
-                                op->nb_sectors * BDRV_SECTOR_SIZE, ret);
+    trace_mirror_iteration_done(s, op->offset, op->bytes, ret);

     s->in_flight--;
-    s->sectors_in_flight -= op->nb_sectors;
+    s->bytes_in_flight -= op->bytes;
     iov = op->qiov.iov;
     for (i = 0; i < op->qiov.niov; i++) {
         MirrorBuffer *buf = (MirrorBuffer *) iov[i].iov_base;
@@ -115,16 +113,15 @@ static void mirror_iteration_done(MirrorOp *op, int ret)
         s->buf_free_count++;
     }

-    sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
-    chunk_num = op->sector_num / sectors_per_chunk;
-    nb_chunks = DIV_ROUND_UP(op->nb_sectors, sectors_per_chunk);
+    chunk_num = op->offset / s->granularity;
+    nb_chunks = DIV_ROUND_UP(op->bytes, s->granularity);
     bitmap_clear(s->in_flight_bitmap, chunk_num, nb_chunks);
     if (ret >= 0) {
         if (s->cow_bitmap) {
             bitmap_set(s->cow_bitmap, chunk_num, nb_chunks);
         }
         if (!s->initial_zeroing_ongoing) {
-            s->common.offset += (uint64_t)op->nb_sectors * BDRV_SECTOR_SIZE;
+            s->common.offset += op->bytes;
         }
     }
     qemu_iovec_destroy(&op->qiov);
@@ -144,7 +141,8 @@ static void mirror_write_complete(void *opaque, int ret)
     if (ret < 0) {
         BlockErrorAction action;

-        bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
+        bdrv_set_dirty_bitmap(s->dirty_bitmap, op->offset >> BDRV_SECTOR_BITS,
+                              op->bytes >> BDRV_SECTOR_BITS);
         action = mirror_error_action(s, false, -ret);
         if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
             s->ret = ret;
@@ -163,7 +161,8 @@ static void mirror_read_complete(void *opaque, int ret)
     if (ret < 0) {
         BlockErrorAction action;

-        bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
+        bdrv_set_dirty_bitmap(s->dirty_bitmap, op->offset >> BDRV_SECTOR_BITS,
+                              op->bytes >> BDRV_SECTOR_BITS);
         action = mirror_error_action(s, true, -ret);
         if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
             s->ret = ret;
@@ -171,7 +170,7 @@ static void mirror_read_complete(void *opaque, int ret)

         mirror_iteration_done(op, ret);
     } else {
-        blk_aio_pwritev(s->target, op->sector_num * BDRV_SECTOR_SIZE, &op->qiov,
+        blk_aio_pwritev(s->target, op->offset, &op->qiov,
                         0, mirror_write_complete, op);
     }
     aio_context_release(blk_get_aio_context(s->common.blk));
@@ -211,7 +210,8 @@ static int mirror_cow_align(MirrorBlockJob *s,
         align_nb_sectors = max_sectors;
         if (need_cow) {
             align_nb_sectors = QEMU_ALIGN_DOWN(align_nb_sectors,
-                                               s->target_cluster_sectors);
+                                               s->target_cluster_size >>
+                                               BDRV_SECTOR_BITS);
         }
     }
     /* Clipping may result in align_nb_sectors unaligned to chunk boundary, but
@@ -277,8 +277,8 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
     /* Allocate a MirrorOp that is used as an AIO callback.  */
     op = g_new(MirrorOp, 1);
     op->s = s;
-    op->sector_num = sector_num;
-    op->nb_sectors = nb_sectors;
+    op->offset = sector_num * BDRV_SECTOR_SIZE;
+    op->bytes = nb_sectors * BDRV_SECTOR_SIZE;

     /* Now make a QEMUIOVector taking enough granularity-sized chunks
      * from s->buf_free.
@@ -295,7 +295,7 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,

     /* Copy the dirty cluster.  */
     s->in_flight++;
-    s->sectors_in_flight += nb_sectors;
+    s->bytes_in_flight += nb_sectors * BDRV_SECTOR_SIZE;
     trace_mirror_one_iteration(s, sector_num * BDRV_SECTOR_SIZE,
                                nb_sectors * BDRV_SECTOR_SIZE);

@@ -315,19 +315,17 @@ static void mirror_do_zero_or_discard(MirrorBlockJob *s,
      * so the freeing in mirror_iteration_done is nop. */
     op = g_new0(MirrorOp, 1);
     op->s = s;
-    op->sector_num = sector_num;
-    op->nb_sectors = nb_sectors;
+    op->offset = sector_num * BDRV_SECTOR_SIZE;
+    op->bytes = nb_sectors * BDRV_SECTOR_SIZE;

     s->in_flight++;
-    s->sectors_in_flight += nb_sectors;
+    s->bytes_in_flight += nb_sectors * BDRV_SECTOR_SIZE;
     if (is_discard) {
         blk_aio_pdiscard(s->target, sector_num << BDRV_SECTOR_BITS,
-                         op->nb_sectors << BDRV_SECTOR_BITS,
-                         mirror_write_complete, op);
+                         op->bytes, mirror_write_complete, op);
     } else {
         blk_aio_pwrite_zeroes(s->target, sector_num * BDRV_SECTOR_SIZE,
-                              op->nb_sectors * BDRV_SECTOR_SIZE,
-                              s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
+                              op->bytes, s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
                               mirror_write_complete, op);
     }
 }
@@ -342,8 +340,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
     int64_t end = s->bdev_length / BDRV_SECTOR_SIZE;
     int sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
     bool write_zeroes_ok = bdrv_can_write_zeroes_with_unmap(blk_bs(s->target));
-    int max_io_sectors = MAX((s->buf_size >> BDRV_SECTOR_BITS) / MAX_IN_FLIGHT,
-                             MAX_IO_SECTORS);
+    int max_io_bytes = MAX(s->buf_size / MAX_IN_FLIGHT, MAX_IO_BYTES);

     sector_num = bdrv_dirty_iter_next(s->dbi);
     if (sector_num < 0) {
@@ -410,9 +407,10 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
                                           nb_chunks * sectors_per_chunk,
                                           &io_sectors, &file);
         if (ret < 0) {
-            io_sectors = MIN(nb_chunks * sectors_per_chunk, max_io_sectors);
+            io_sectors = MIN(nb_chunks * sectors_per_chunk,
+                             max_io_bytes >> BDRV_SECTOR_BITS);
         } else if (ret & BDRV_BLOCK_DATA) {
-            io_sectors = MIN(io_sectors, max_io_sectors);
+            io_sectors = MIN(io_sectors, max_io_bytes >> BDRV_SECTOR_BITS);
         }

         io_sectors -= io_sectors % sectors_per_chunk;
@@ -707,7 +705,6 @@ static void coroutine_fn mirror_run(void *opaque)
     char backing_filename[2]; /* we only need 2 characters because we are only
                                  checking for a NULL string */
     int ret = 0;
-    int target_cluster_size = BDRV_SECTOR_SIZE;

     if (block_job_is_cancelled(&s->common)) {
         goto immediate_exit;
@@ -756,17 +753,17 @@ static void coroutine_fn mirror_run(void *opaque)
      * the destination do COW.  Instead, we copy sectors around the
      * dirty data if needed.  We need a bitmap to do that.
      */
+    s->target_cluster_size = BDRV_SECTOR_SIZE;
     bdrv_get_backing_filename(target_bs, backing_filename,
                               sizeof(backing_filename));
     if (!bdrv_get_info(target_bs, &bdi) && bdi.cluster_size) {
-        target_cluster_size = bdi.cluster_size;
+        s->target_cluster_size = bdi.cluster_size;
     }
     if (backing_filename[0] && !target_bs->backing
-        && s->granularity < target_cluster_size) {
-        s->buf_size = MAX(s->buf_size, target_cluster_size);
+        && s->granularity < s->target_cluster_size) {
+        s->buf_size = MAX(s->buf_size, s->target_cluster_size);
         s->cow_bitmap = bitmap_new(length);
     }
-    s->target_cluster_sectors = target_cluster_size >> BDRV_SECTOR_BITS;
     s->max_iov = MIN(bs->bl.max_iov, target_bs->bl.max_iov);

     s->buf = qemu_try_blockalign(bs, s->buf_size);
@@ -802,10 +799,10 @@ static void coroutine_fn mirror_run(void *opaque)
         cnt = bdrv_get_dirty_count(s->dirty_bitmap);
         /* s->common.offset contains the number of bytes already processed so
          * far, cnt is the number of dirty sectors remaining and
-         * s->sectors_in_flight is the number of sectors currently being
+         * s->bytes_in_flight is the number of bytes currently being
          * processed; together those are the current total operation length */
-        s->common.len = s->common.offset +
-                        (cnt + s->sectors_in_flight) * BDRV_SECTOR_SIZE;
+        s->common.len = s->common.offset + s->bytes_in_flight +
+            cnt * BDRV_SECTOR_SIZE;

         /* Note that even when no rate limit is applied we need to yield
          * periodically with no pending I/O so that bdrv_drain_all() returns.
-- 
2.9.3

* [Qemu-devel] [PATCH v2 08/20] mirror: Switch mirror_do_zero_or_discard() to byte-based
From: Eric Blake @ 2017-05-11  2:20 UTC
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: no change
---
 block/mirror.c | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index d0544e3..dc99435 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -305,8 +305,8 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
 }

 static void mirror_do_zero_or_discard(MirrorBlockJob *s,
-                                      int64_t sector_num,
-                                      int nb_sectors,
+                                      int64_t offset,
+                                      uint64_t bytes,
                                       bool is_discard)
 {
     MirrorOp *op;
@@ -315,16 +315,16 @@ static void mirror_do_zero_or_discard(MirrorBlockJob *s,
      * so the freeing in mirror_iteration_done is nop. */
     op = g_new0(MirrorOp, 1);
     op->s = s;
-    op->offset = sector_num * BDRV_SECTOR_SIZE;
-    op->bytes = nb_sectors * BDRV_SECTOR_SIZE;
+    op->offset = offset;
+    op->bytes = bytes;

     s->in_flight++;
-    s->bytes_in_flight += nb_sectors * BDRV_SECTOR_SIZE;
+    s->bytes_in_flight += bytes;
     if (is_discard) {
-        blk_aio_pdiscard(s->target, sector_num << BDRV_SECTOR_BITS,
+        blk_aio_pdiscard(s->target, offset,
                          op->bytes, mirror_write_complete, op);
     } else {
-        blk_aio_pwrite_zeroes(s->target, sector_num * BDRV_SECTOR_SIZE,
+        blk_aio_pwrite_zeroes(s->target, offset,
                               op->bytes, s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
                               mirror_write_complete, op);
     }
@@ -448,7 +448,8 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
             break;
         case MIRROR_METHOD_ZERO:
         case MIRROR_METHOD_DISCARD:
-            mirror_do_zero_or_discard(s, sector_num, io_sectors,
+            mirror_do_zero_or_discard(s, sector_num * BDRV_SECTOR_SIZE,
+                                      io_sectors * BDRV_SECTOR_SIZE,
                                       mirror_method == MIRROR_METHOD_DISCARD);
             if (write_zeroes_ok) {
                 io_bytes_acct = 0;
@@ -645,7 +646,8 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
                 continue;
             }

-            mirror_do_zero_or_discard(s, sector_num, nb_sectors, false);
+            mirror_do_zero_or_discard(s, sector_num * BDRV_SECTOR_SIZE,
+                                      nb_sectors * BDRV_SECTOR_SIZE, false);
             sector_num += nb_sectors;
         }

-- 
2.9.3

* [Qemu-devel] [PATCH v2 09/20] mirror: Update signature of mirror_clip_sectors()
From: Eric Blake @ 2017-05-11  2:20 UTC
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

Rather than having a void function that modifies its argument
in place to produce its output, change the signature to drop a
layer of indirection and return the result.
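
The shape of the change, distilled from the hunk below: the
clipped count becomes the return value instead of an
out-parameter:

    /* before */
    mirror_clip_sectors(s, sector_num, &io_sectors);

    /* after */
    io_sectors = mirror_clip_sectors(s, sector_num, io_sectors);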

Suggested-by: John Snow <jsnow@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: new patch
---
 block/mirror.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index dc99435..f023a0d 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -176,12 +176,12 @@ static void mirror_read_complete(void *opaque, int ret)
     aio_context_release(blk_get_aio_context(s->common.blk));
 }

-static inline void mirror_clip_sectors(MirrorBlockJob *s,
-                                       int64_t sector_num,
-                                       int *nb_sectors)
+static inline int mirror_clip_sectors(MirrorBlockJob *s,
+                                      int64_t sector_num,
+                                      int nb_sectors)
 {
-    *nb_sectors = MIN(*nb_sectors,
-                      s->bdev_length / BDRV_SECTOR_SIZE - sector_num);
+    return MIN(nb_sectors,
+               s->bdev_length / BDRV_SECTOR_SIZE - sector_num);
 }

 /* Round sector_num and/or nb_sectors to target cluster if COW is needed, and
@@ -216,7 +216,8 @@ static int mirror_cow_align(MirrorBlockJob *s,
     }
     /* Clipping may result in align_nb_sectors unaligned to chunk boundary, but
      * that doesn't matter because it's already the end of source image. */
-    mirror_clip_sectors(s, align_sector_num, &align_nb_sectors);
+    align_nb_sectors = mirror_clip_sectors(s, align_sector_num,
+                                           align_nb_sectors);

     ret = align_sector_num + align_nb_sectors - (*sector_num + *nb_sectors);
     *sector_num = align_sector_num;
@@ -440,7 +441,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
             return 0;
         }

-        mirror_clip_sectors(s, sector_num, &io_sectors);
+        io_sectors = mirror_clip_sectors(s, sector_num, io_sectors);
         switch (mirror_method) {
         case MIRROR_METHOD_COPY:
             io_sectors = mirror_do_read(s, sector_num, io_sectors);
-- 
2.9.3

* [Qemu-devel] [PATCH v2 10/20] mirror: Switch mirror_cow_align() to byte-based
From: Eric Blake @ 2017-05-11  2:20 UTC
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change), and add mirror_clip_bytes() as a
counterpart to mirror_clip_sectors().  Some of the conversion is
a bit tricky, requiring temporaries to convert between units; it
will be cleaned up in a following patch.
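
As a hedged illustration of the interim pattern (variable names
here are made up; the real hunks differ in detail): a still
sector-based caller wraps the now byte-based helper with
temporaries and converts back afterwards:

    int64_t offset = sector_num * BDRV_SECTOR_SIZE;
    unsigned int bytes = nb_sectors * BDRV_SECTOR_SIZE;
    ret = mirror_cow_align(s, &offset, &bytes);
    sector_num = offset / BDRV_SECTOR_SIZE;   /* convert back */
    nb_sectors = bytes / BDRV_SECTOR_SIZE;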

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: tweak mirror_clip_bytes() signature to match previous patch
---
 block/mirror.c | 64 ++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 38 insertions(+), 26 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index f023a0d..8b9f0cf 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -176,6 +176,15 @@ static void mirror_read_complete(void *opaque, int ret)
     aio_context_release(blk_get_aio_context(s->common.blk));
 }

+/* Clip bytes relative to offset to not exceed end-of-file */
+static inline int64_t mirror_clip_bytes(MirrorBlockJob *s,
+                                        int64_t offset,
+                                        int64_t bytes)
+{
+    return MIN(bytes, s->bdev_length - offset);
+}
+
+/* Clip nb_sectors relative to sector_num to not exceed end-of-file */
 static inline int mirror_clip_sectors(MirrorBlockJob *s,
                                       int64_t sector_num,
                                       int nb_sectors)
@@ -184,44 +193,39 @@ static inline int mirror_clip_sectors(MirrorBlockJob *s,
                s->bdev_length / BDRV_SECTOR_SIZE - sector_num);
 }

-/* Round sector_num and/or nb_sectors to target cluster if COW is needed, and
- * return the offset of the adjusted tail sector against original. */
-static int mirror_cow_align(MirrorBlockJob *s,
-                            int64_t *sector_num,
-                            int *nb_sectors)
+/* Round offset and/or bytes to target cluster if COW is needed, and
+ * return the offset of the adjusted tail against original. */
+static int mirror_cow_align(MirrorBlockJob *s, int64_t *offset,
+                            unsigned int *bytes)
 {
     bool need_cow;
     int ret = 0;
-    int chunk_sectors = s->granularity >> BDRV_SECTOR_BITS;
-    int64_t align_sector_num = *sector_num;
-    int align_nb_sectors = *nb_sectors;
-    int max_sectors = chunk_sectors * s->max_iov;
+    int64_t align_offset = *offset;
+    unsigned int align_bytes = *bytes;
+    int max_bytes = s->granularity * s->max_iov;

-    need_cow = !test_bit(*sector_num / chunk_sectors, s->cow_bitmap);
-    need_cow |= !test_bit((*sector_num + *nb_sectors - 1) / chunk_sectors,
+    need_cow = !test_bit(*offset / s->granularity, s->cow_bitmap);
+    need_cow |= !test_bit((*offset + *bytes - 1) / s->granularity,
                           s->cow_bitmap);
     if (need_cow) {
-        bdrv_round_sectors_to_clusters(blk_bs(s->target), *sector_num,
-                                       *nb_sectors, &align_sector_num,
-                                       &align_nb_sectors);
+        bdrv_round_to_clusters(blk_bs(s->target), *offset, *bytes,
+                               &align_offset, &align_bytes);
     }

-    if (align_nb_sectors > max_sectors) {
-        align_nb_sectors = max_sectors;
+    if (align_bytes > max_bytes) {
+        align_bytes = max_bytes;
         if (need_cow) {
-            align_nb_sectors = QEMU_ALIGN_DOWN(align_nb_sectors,
-                                               s->target_cluster_size >>
-                                               BDRV_SECTOR_BITS);
+            align_bytes = QEMU_ALIGN_DOWN(align_bytes,
+                                          s->target_cluster_size);
         }
     }
-    /* Clipping may result in align_nb_sectors unaligned to chunk boundary, but
+    /* Clipping may result in align_bytes unaligned to chunk boundary, but
      * that doesn't matter because it's already the end of source image. */
-    align_nb_sectors = mirror_clip_sectors(s, align_sector_num,
-                                           align_nb_sectors);
+    align_bytes = mirror_clip_bytes(s, align_offset, align_bytes);

-    ret = align_sector_num + align_nb_sectors - (*sector_num + *nb_sectors);
-    *sector_num = align_sector_num;
-    *nb_sectors = align_nb_sectors;
+    ret = align_offset + align_bytes - (*offset + *bytes);
+    *offset = align_offset;
+    *bytes = align_bytes;
     assert(ret >= 0);
     return ret;
 }
@@ -257,10 +261,18 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
     nb_sectors = MIN(s->buf_size >> BDRV_SECTOR_BITS, nb_sectors);
     nb_sectors = MIN(max_sectors, nb_sectors);
     assert(nb_sectors);
+    assert(nb_sectors < BDRV_REQUEST_MAX_SECTORS);
     ret = nb_sectors;

     if (s->cow_bitmap) {
-        ret += mirror_cow_align(s, &sector_num, &nb_sectors);
+        int64_t offset = sector_num * BDRV_SECTOR_SIZE;
+        unsigned int bytes = nb_sectors * BDRV_SECTOR_SIZE;
+        int gap;
+
+        gap = mirror_cow_align(s, &offset, &bytes);
+        sector_num = offset / BDRV_SECTOR_SIZE;
+        nb_sectors = bytes / BDRV_SECTOR_SIZE;
+        ret += gap / BDRV_SECTOR_SIZE;
     }
     assert(nb_sectors << BDRV_SECTOR_BITS <= s->buf_size);
     /* The sector range must meet granularity because:
-- 
2.9.3


* [Qemu-devel] [PATCH v2 11/20] mirror: Switch mirror_do_read() to byte-based
  2017-05-11  2:20 [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
                   ` (9 preceding siblings ...)
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 10/20] mirror: Switch mirror_cow_align() to byte-based Eric Blake
@ 2017-05-11  2:20 ` Eric Blake
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 12/20] mirror: Switch mirror_iteration() " Eric Blake
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2017-05-11  2:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change).
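
The updated doc comment below pins down the return contract; a worked
example with assumed numbers (4 KiB target clusters, so COW rounds to
4 KiB boundaries):

    request:            offset = 6144, bytes = 1024   i.e. [6144, 7168)
    cow-aligned region: [4096, 8192)
    mirror_cow_align()  returns new_end - old_end = 8192 - 7168 = 1024
    mirror_do_read()    returns new_end - offset  = 8192 - 6144 = 2048

The 2048 bytes copied before offset ([4096, 6144)) are deliberately
excluded from the return value.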

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: rebase to earlier changes
---
 block/mirror.c | 75 ++++++++++++++++++++++++++--------------------------------
 1 file changed, 33 insertions(+), 42 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 8b9f0cf..0dbb52f 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -196,7 +196,7 @@ static inline int mirror_clip_sectors(MirrorBlockJob *s,
 /* Round offset and/or bytes to target cluster if COW is needed, and
  * return the offset of the adjusted tail against original. */
 static int mirror_cow_align(MirrorBlockJob *s, int64_t *offset,
-                            unsigned int *bytes)
+                            uint64_t *bytes)
 {
     bool need_cow;
     int ret = 0;
@@ -204,6 +204,7 @@ static int mirror_cow_align(MirrorBlockJob *s, int64_t *offset,
     unsigned int align_bytes = *bytes;
     int max_bytes = s->granularity * s->max_iov;

+    assert(*bytes < INT_MAX);
     need_cow = !test_bit(*offset / s->granularity, s->cow_bitmap);
     need_cow |= !test_bit((*offset + *bytes - 1) / s->granularity,
                           s->cow_bitmap);
@@ -239,59 +240,50 @@ static inline void mirror_wait_for_io(MirrorBlockJob *s)
 }

 /* Submit async read while handling COW.
- * Returns: The number of sectors copied after and including sector_num,
- *          excluding any sectors copied prior to sector_num due to alignment.
- *          This will be nb_sectors if no alignment is necessary, or
- *          (new_end - sector_num) if tail is rounded up or down due to
+ * Returns: The number of bytes copied after and including offset,
+ *          excluding any bytes copied prior to offset due to alignment.
+ *          This will be @bytes if no alignment is necessary, or
+ *          (new_end - offset) if tail is rounded up or down due to
  *          alignment or buffer limit.
  */
-static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
-                          int nb_sectors)
+static uint64_t mirror_do_read(MirrorBlockJob *s, int64_t offset,
+                               uint64_t bytes)
 {
     BlockBackend *source = s->common.blk;
-    int sectors_per_chunk, nb_chunks;
-    int ret;
+    int nb_chunks;
+    uint64_t ret;
     MirrorOp *op;
-    int max_sectors;
+    uint64_t max_bytes;

-    sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
-    max_sectors = sectors_per_chunk * s->max_iov;
+    max_bytes = s->granularity * s->max_iov;

     /* We can only handle as much as buf_size at a time. */
-    nb_sectors = MIN(s->buf_size >> BDRV_SECTOR_BITS, nb_sectors);
-    nb_sectors = MIN(max_sectors, nb_sectors);
-    assert(nb_sectors);
-    assert(nb_sectors < BDRV_REQUEST_MAX_SECTORS);
-    ret = nb_sectors;
+    bytes = MIN(s->buf_size, MIN(max_bytes, bytes));
+    assert(bytes);
+    assert(bytes < BDRV_REQUEST_MAX_BYTES);
+    ret = bytes;

     if (s->cow_bitmap) {
-        int64_t offset = sector_num * BDRV_SECTOR_SIZE;
-        unsigned int bytes = nb_sectors * BDRV_SECTOR_SIZE;
-        int gap;
-
-        gap = mirror_cow_align(s, &offset, &bytes);
-        sector_num = offset / BDRV_SECTOR_SIZE;
-        nb_sectors = bytes / BDRV_SECTOR_SIZE;
-        ret += gap / BDRV_SECTOR_SIZE;
+        ret += mirror_cow_align(s, &offset, &bytes);
     }
-    assert(nb_sectors << BDRV_SECTOR_BITS <= s->buf_size);
-    /* The sector range must meet granularity because:
+    assert(bytes <= s->buf_size);
+    /* The range will be sector-aligned because:
      * 1) Caller passes in aligned values;
-     * 2) mirror_cow_align is used only when target cluster is larger. */
-    assert(!(sector_num % sectors_per_chunk));
-    nb_chunks = DIV_ROUND_UP(nb_sectors, sectors_per_chunk);
+     * 2) mirror_cow_align is used only when target cluster is larger.
+     * But it might not be cluster-aligned at end-of-file. */
+    assert(QEMU_IS_ALIGNED(bytes, BDRV_SECTOR_SIZE));
+    nb_chunks = DIV_ROUND_UP(bytes, s->granularity);

     while (s->buf_free_count < nb_chunks) {
-        trace_mirror_yield_in_flight(s, sector_num * BDRV_SECTOR_SIZE,
-                                     s->in_flight);
+        trace_mirror_yield_in_flight(s, offset, s->in_flight);
         mirror_wait_for_io(s);
     }

     /* Allocate a MirrorOp that is used as an AIO callback.  */
     op = g_new(MirrorOp, 1);
     op->s = s;
-    op->offset = sector_num * BDRV_SECTOR_SIZE;
-    op->bytes = nb_sectors * BDRV_SECTOR_SIZE;
+    op->offset = offset;
+    op->bytes = bytes;

     /* Now make a QEMUIOVector taking enough granularity-sized chunks
      * from s->buf_free.
@@ -299,7 +291,7 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
     qemu_iovec_init(&op->qiov, nb_chunks);
     while (nb_chunks-- > 0) {
         MirrorBuffer *buf = QSIMPLEQ_FIRST(&s->buf_free);
-        size_t remaining = nb_sectors * BDRV_SECTOR_SIZE - op->qiov.size;
+        size_t remaining = bytes - op->qiov.size;

         QSIMPLEQ_REMOVE_HEAD(&s->buf_free, next);
         s->buf_free_count--;
@@ -308,12 +300,10 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,

     /* Copy the dirty cluster.  */
     s->in_flight++;
-    s->bytes_in_flight += nb_sectors * BDRV_SECTOR_SIZE;
-    trace_mirror_one_iteration(s, sector_num * BDRV_SECTOR_SIZE,
-                               nb_sectors * BDRV_SECTOR_SIZE);
+    s->bytes_in_flight += bytes;
+    trace_mirror_one_iteration(s, offset, bytes);

-    blk_aio_preadv(source, sector_num * BDRV_SECTOR_SIZE, &op->qiov, 0,
-                   mirror_read_complete, op);
+    blk_aio_preadv(source, offset, &op->qiov, 0, mirror_read_complete, op);
     return ret;
 }

@@ -456,8 +446,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
         io_sectors = mirror_clip_sectors(s, sector_num, io_sectors);
         switch (mirror_method) {
         case MIRROR_METHOD_COPY:
-            io_sectors = mirror_do_read(s, sector_num, io_sectors);
-            io_bytes_acct = io_sectors * BDRV_SECTOR_SIZE;
+            io_bytes_acct = mirror_do_read(s, sector_num * BDRV_SECTOR_SIZE,
+                                           io_sectors * BDRV_SECTOR_SIZE);
+            io_sectors = io_bytes_acct / BDRV_SECTOR_SIZE;
             break;
         case MIRROR_METHOD_ZERO:
         case MIRROR_METHOD_DISCARD:
-- 
2.9.3


* [Qemu-devel] [PATCH v2 12/20] mirror: Switch mirror_iteration() to byte-based
  2017-05-11  2:20 [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
                   ` (10 preceding siblings ...)
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 11/20] mirror: Switch mirror_do_read() " Eric Blake
@ 2017-05-11  2:20 ` Eric Blake
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 13/20] block: Drop unused bdrv_round_sectors_to_clusters() Eric Blake
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2017-05-11  2:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of mirroring to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are both sector-aligned and multiples of the granularity).  Drop
the now-unused mirror_clip_sectors().
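
A compilable sketch of the new loop shape (granularity and length
assumed; the real loop additionally consults the dirty bitmap and
block status before choosing io_bytes):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        const int64_t granularity = 65536;     /* s->granularity */
        const int64_t bdev_length = 1 << 20;   /* s->bdev_length */
        int64_t offset = 0;

        while (offset < bdev_length) {
            /* each step is a multiple of the granularity, clipped
             * only at end-of-file (mirror_clip_bytes) */
            int64_t io_bytes = granularity;
            if (io_bytes > bdev_length - offset) {
                io_bytes = bdev_length - offset;
            }
            printf("copy [%lld, %lld)\n", (long long)offset,
                   (long long)(offset + io_bytes));
            offset += io_bytes;
        }
        return 0;
    }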

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: straightforward rebase to earlier mirror_clip_bytes() change, R-b kept
---
 block/mirror.c | 103 +++++++++++++++++++++++++--------------------------------
 1 file changed, 45 insertions(+), 58 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 0dbb52f..2a95ba5 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -184,15 +184,6 @@ static inline int64_t mirror_clip_bytes(MirrorBlockJob *s,
     return MIN(bytes, s->bdev_length - offset);
 }

-/* Clip nb_sectors relative to sector_num to not exceed end-of-file */
-static inline int mirror_clip_sectors(MirrorBlockJob *s,
-                                      int64_t sector_num,
-                                      int nb_sectors)
-{
-    return MIN(nb_sectors,
-               s->bdev_length / BDRV_SECTOR_SIZE - sector_num);
-}
-
 /* Round offset and/or bytes to target cluster if COW is needed, and
  * return the offset of the adjusted tail against original. */
 static int mirror_cow_align(MirrorBlockJob *s, int64_t *offset,
@@ -336,28 +327,26 @@ static void mirror_do_zero_or_discard(MirrorBlockJob *s,
 static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 {
     BlockDriverState *source = s->source;
-    int64_t sector_num, first_chunk;
+    int64_t offset, first_chunk;
     uint64_t delay_ns = 0;
     /* At least the first dirty chunk is mirrored in one iteration. */
     int nb_chunks = 1;
-    int64_t end = s->bdev_length / BDRV_SECTOR_SIZE;
     int sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
     bool write_zeroes_ok = bdrv_can_write_zeroes_with_unmap(blk_bs(s->target));
     int max_io_bytes = MAX(s->buf_size / MAX_IN_FLIGHT, MAX_IO_BYTES);

-    sector_num = bdrv_dirty_iter_next(s->dbi);
-    if (sector_num < 0) {
+    offset = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
+    if (offset < 0) {
         bdrv_set_dirty_iter(s->dbi, 0);
-        sector_num = bdrv_dirty_iter_next(s->dbi);
+        offset = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
         trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap) *
                                   BDRV_SECTOR_SIZE);
-        assert(sector_num >= 0);
+        assert(offset >= 0);
     }

-    first_chunk = sector_num / sectors_per_chunk;
+    first_chunk = offset / s->granularity;
     while (test_bit(first_chunk, s->in_flight_bitmap)) {
-        trace_mirror_yield_in_flight(s, sector_num * BDRV_SECTOR_SIZE,
-                                     s->in_flight);
+        trace_mirror_yield_in_flight(s, offset, s->in_flight);
         mirror_wait_for_io(s);
     }

@@ -365,25 +354,26 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)

     /* Find the number of consecutive dirty chunks following the first dirty
      * one, and wait for in flight requests in them. */
-    while (nb_chunks * sectors_per_chunk < (s->buf_size >> BDRV_SECTOR_BITS)) {
+    while (nb_chunks * s->granularity < s->buf_size) {
         int64_t next_dirty;
-        int64_t next_sector = sector_num + nb_chunks * sectors_per_chunk;
-        int64_t next_chunk = next_sector / sectors_per_chunk;
-        if (next_sector >= end ||
-            !bdrv_get_dirty(source, s->dirty_bitmap, next_sector)) {
+        int64_t next_offset = offset + nb_chunks * s->granularity;
+        int64_t next_chunk = next_offset / s->granularity;
+        if (next_offset >= s->bdev_length ||
+            !bdrv_get_dirty(source, s->dirty_bitmap,
+                            next_offset >> BDRV_SECTOR_BITS)) {
             break;
         }
         if (test_bit(next_chunk, s->in_flight_bitmap)) {
             break;
         }

-        next_dirty = bdrv_dirty_iter_next(s->dbi);
-        if (next_dirty > next_sector || next_dirty < 0) {
+        next_dirty = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
+        if (next_dirty > next_offset || next_dirty < 0) {
             /* The bitmap iterator's cache is stale, refresh it */
-            bdrv_set_dirty_iter(s->dbi, next_sector);
-            next_dirty = bdrv_dirty_iter_next(s->dbi);
+            bdrv_set_dirty_iter(s->dbi, next_offset >> BDRV_SECTOR_BITS);
+            next_dirty = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
         }
-        assert(next_dirty == next_sector);
+        assert(next_dirty == next_offset);
         nb_chunks++;
     }

@@ -391,12 +381,13 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
      * calling bdrv_get_block_status_above could yield - if some blocks are
      * marked dirty in this window, we need to know.
      */
-    bdrv_reset_dirty_bitmap(s->dirty_bitmap, sector_num,
+    bdrv_reset_dirty_bitmap(s->dirty_bitmap, offset >> BDRV_SECTOR_BITS,
                             nb_chunks * sectors_per_chunk);
-    bitmap_set(s->in_flight_bitmap, sector_num / sectors_per_chunk, nb_chunks);
-    while (nb_chunks > 0 && sector_num < end) {
+    bitmap_set(s->in_flight_bitmap, offset / s->granularity, nb_chunks);
+    while (nb_chunks > 0 && offset < s->bdev_length) {
         int64_t ret;
         int io_sectors;
+        unsigned int io_bytes;
         int64_t io_bytes_acct;
         BlockDriverState *file;
         enum MirrorMethod {
@@ -405,28 +396,28 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
             MIRROR_METHOD_DISCARD
         } mirror_method = MIRROR_METHOD_COPY;

-        assert(!(sector_num % sectors_per_chunk));
-        ret = bdrv_get_block_status_above(source, NULL, sector_num,
+        assert(!(offset % s->granularity));
+        ret = bdrv_get_block_status_above(source, NULL,
+                                          offset >> BDRV_SECTOR_BITS,
                                           nb_chunks * sectors_per_chunk,
                                           &io_sectors, &file);
+        io_bytes = io_sectors * BDRV_SECTOR_SIZE;
         if (ret < 0) {
-            io_sectors = MIN(nb_chunks * sectors_per_chunk,
-                             max_io_bytes >> BDRV_SECTOR_BITS);
+            io_bytes = MIN(nb_chunks * s->granularity, max_io_bytes);
         } else if (ret & BDRV_BLOCK_DATA) {
-            io_sectors = MIN(io_sectors, max_io_bytes >> BDRV_SECTOR_BITS);
+            io_bytes = MIN(io_bytes, max_io_bytes);
         }

-        io_sectors -= io_sectors % sectors_per_chunk;
-        if (io_sectors < sectors_per_chunk) {
-            io_sectors = sectors_per_chunk;
+        io_bytes -= io_bytes % s->granularity;
+        if (io_bytes < s->granularity) {
+            io_bytes = s->granularity;
         } else if (ret >= 0 && !(ret & BDRV_BLOCK_DATA)) {
-            int64_t target_sector_num;
-            int target_nb_sectors;
-            bdrv_round_sectors_to_clusters(blk_bs(s->target), sector_num,
-                                           io_sectors,  &target_sector_num,
-                                           &target_nb_sectors);
-            if (target_sector_num == sector_num &&
-                target_nb_sectors == io_sectors) {
+            int64_t target_offset;
+            unsigned int target_bytes;
+            bdrv_round_to_clusters(blk_bs(s->target), offset, io_bytes,
+                                   &target_offset, &target_bytes);
+            if (target_offset == offset &&
+                target_bytes == io_bytes) {
                 mirror_method = ret & BDRV_BLOCK_ZERO ?
                                     MIRROR_METHOD_ZERO :
                                     MIRROR_METHOD_DISCARD;
@@ -434,8 +425,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
         }

         while (s->in_flight >= MAX_IN_FLIGHT) {
-            trace_mirror_yield_in_flight(s, sector_num * BDRV_SECTOR_SIZE,
-                                         s->in_flight);
+            trace_mirror_yield_in_flight(s, offset, s->in_flight);
             mirror_wait_for_io(s);
         }

@@ -443,30 +433,27 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
             return 0;
         }

-        io_sectors = mirror_clip_sectors(s, sector_num, io_sectors);
+        io_bytes = mirror_clip_bytes(s, offset, io_bytes);
         switch (mirror_method) {
         case MIRROR_METHOD_COPY:
-            io_bytes_acct = mirror_do_read(s, sector_num * BDRV_SECTOR_SIZE,
-                                           io_sectors * BDRV_SECTOR_SIZE);
-            io_sectors = io_bytes_acct / BDRV_SECTOR_SIZE;
+            io_bytes = io_bytes_acct = mirror_do_read(s, offset, io_bytes);
             break;
         case MIRROR_METHOD_ZERO:
         case MIRROR_METHOD_DISCARD:
-            mirror_do_zero_or_discard(s, sector_num * BDRV_SECTOR_SIZE,
-                                      io_sectors * BDRV_SECTOR_SIZE,
+            mirror_do_zero_or_discard(s, offset, io_bytes,
                                       mirror_method == MIRROR_METHOD_DISCARD);
             if (write_zeroes_ok) {
                 io_bytes_acct = 0;
             } else {
-                io_bytes_acct = io_sectors * BDRV_SECTOR_SIZE;
+                io_bytes_acct = io_bytes;
             }
             break;
         default:
             abort();
         }
-        assert(io_sectors);
-        sector_num += io_sectors;
-        nb_chunks -= DIV_ROUND_UP(io_sectors, sectors_per_chunk);
+        assert(io_bytes);
+        offset += io_bytes;
+        nb_chunks -= DIV_ROUND_UP(io_bytes, s->granularity);
         if (s->common.speed) {
             delay_ns = ratelimit_calculate_delay(&s->limit, io_bytes_acct);
         }
-- 
2.9.3


* [Qemu-devel] [PATCH v2 13/20] block: Drop unused bdrv_round_sectors_to_clusters()
  2017-05-11  2:20 [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
                   ` (11 preceding siblings ...)
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 12/20] mirror: Switch mirror_iteration() " Eric Blake
@ 2017-05-11  2:20 ` Eric Blake
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 14/20] backup: Switch BackupBlockJob to byte-based Eric Blake
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2017-05-11  2:20 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, jsnow, Stefan Hajnoczi, Fam Zheng, Kevin Wolf, Max Reitz

Now that the last user [mirror_iteration()] has been converted to using
bytes, we no longer need a function to round sectors to clusters.
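
Its byte-based replacement, bdrv_round_to_clusters(), keeps the same
rounding semantics; a compilable sketch of the math (cluster size
assumed, the QEMU_ALIGN_* helpers expanded locally):

    #include <assert.h>
    #include <stdint.h>

    #define ALIGN_DOWN(n, m) ((n) / (m) * (m))
    #define ALIGN_UP(n, m)   ALIGN_DOWN((n) + (m) - 1, (m))

    int main(void)
    {
        const int64_t cluster_size = 65536;   /* from BlockDriverInfo */
        int64_t offset = 70000, bytes = 1000;

        /* the same result the removed sector-based helper computed,
         * minus the BDRV_SECTOR_SIZE scaling on the way in and out */
        int64_t cluster_offset = ALIGN_DOWN(offset, cluster_size);
        int64_t cluster_bytes =
            ALIGN_UP(offset - cluster_offset + bytes, cluster_size);

        assert(cluster_offset == 65536);
        assert(cluster_bytes == 65536);
        return 0;
    }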

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: hoist to earlier series, no change
---
 include/block/block.h |  4 ----
 block/io.c            | 21 ---------------------
 2 files changed, 25 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 14fba5e..96ece29 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -481,10 +481,6 @@ const char *bdrv_get_device_or_node_name(const BlockDriverState *bs);
 int bdrv_get_flags(BlockDriverState *bs);
 int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi);
 ImageInfoSpecific *bdrv_get_specific_info(BlockDriverState *bs);
-void bdrv_round_sectors_to_clusters(BlockDriverState *bs,
-                                    int64_t sector_num, int nb_sectors,
-                                    int64_t *cluster_sector_num,
-                                    int *cluster_nb_sectors);
 void bdrv_round_to_clusters(BlockDriverState *bs,
                             int64_t offset, unsigned int bytes,
                             int64_t *cluster_offset,
diff --git a/block/io.c b/block/io.c
index fdd7485..88ae8b0 100644
--- a/block/io.c
+++ b/block/io.c
@@ -436,27 +436,6 @@ static void mark_request_serialising(BdrvTrackedRequest *req, uint64_t align)
 }

 /**
- * Round a region to cluster boundaries (sector-based)
- */
-void bdrv_round_sectors_to_clusters(BlockDriverState *bs,
-                                    int64_t sector_num, int nb_sectors,
-                                    int64_t *cluster_sector_num,
-                                    int *cluster_nb_sectors)
-{
-    BlockDriverInfo bdi;
-
-    if (bdrv_get_info(bs, &bdi) < 0 || bdi.cluster_size == 0) {
-        *cluster_sector_num = sector_num;
-        *cluster_nb_sectors = nb_sectors;
-    } else {
-        int64_t c = bdi.cluster_size / BDRV_SECTOR_SIZE;
-        *cluster_sector_num = QEMU_ALIGN_DOWN(sector_num, c);
-        *cluster_nb_sectors = QEMU_ALIGN_UP(sector_num - *cluster_sector_num +
-                                            nb_sectors, c);
-    }
-}
-
-/**
  * Round a region to cluster boundaries
  */
 void bdrv_round_to_clusters(BlockDriverState *bs,
-- 
2.9.3


* [Qemu-devel] [PATCH v2 14/20] backup: Switch BackupBlockJob to byte-based
  2017-05-11  2:20 [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
                   ` (12 preceding siblings ...)
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 13/20] block: Drop unused bdrv_round_sectors_to_clusters() Eric Blake
@ 2017-05-11  2:20 ` Eric Blake
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 15/20] backup: Switch block_backup.h " Eric Blake
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2017-05-11  2:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Continue by converting an
internal structure (no semantic change), and all references to
tracking progress.  Drop a redundant local variable bytes_per_cluster.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: no change
---
 block/backup.c | 33 +++++++++++++++------------------
 1 file changed, 15 insertions(+), 18 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index b28df8c..a64a162 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -39,7 +39,7 @@ typedef struct BackupBlockJob {
     BlockdevOnError on_source_error;
     BlockdevOnError on_target_error;
     CoRwlock flush_rwlock;
-    uint64_t sectors_read;
+    uint64_t bytes_read;
     unsigned long *done_bitmap;
     int64_t cluster_size;
     bool compress;
@@ -102,16 +102,15 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
     void *bounce_buffer = NULL;
     int ret = 0;
     int64_t sectors_per_cluster = cluster_size_sectors(job);
-    int64_t bytes_per_cluster = sectors_per_cluster * BDRV_SECTOR_SIZE;
-    int64_t start, end;
-    int n;
+    int64_t start, end; /* clusters */
+    int n; /* bytes */

     qemu_co_rwlock_rdlock(&job->flush_rwlock);

     start = sector_num / sectors_per_cluster;
     end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);

-    trace_backup_do_cow_enter(job, start * bytes_per_cluster,
+    trace_backup_do_cow_enter(job, start * job->cluster_size,
                               sector_num * BDRV_SECTOR_SIZE,
                               nb_sectors * BDRV_SECTOR_SIZE);

@@ -120,28 +119,27 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,

     for (; start < end; start++) {
         if (test_bit(start, job->done_bitmap)) {
-            trace_backup_do_cow_skip(job, start * bytes_per_cluster);
+            trace_backup_do_cow_skip(job, start * job->cluster_size);
             continue; /* already copied */
         }

-        trace_backup_do_cow_process(job, start * bytes_per_cluster);
+        trace_backup_do_cow_process(job, start * job->cluster_size);

-        n = MIN(sectors_per_cluster,
-                job->common.len / BDRV_SECTOR_SIZE -
-                start * sectors_per_cluster);
+        n = MIN(job->cluster_size,
+                job->common.len - start * job->cluster_size);

         if (!bounce_buffer) {
             bounce_buffer = blk_blockalign(blk, job->cluster_size);
         }
         iov.iov_base = bounce_buffer;
-        iov.iov_len = n * BDRV_SECTOR_SIZE;
+        iov.iov_len = n;
         qemu_iovec_init_external(&bounce_qiov, &iov, 1);

         ret = blk_co_preadv(blk, start * job->cluster_size,
                             bounce_qiov.size, &bounce_qiov,
                             is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
         if (ret < 0) {
-            trace_backup_do_cow_read_fail(job, start * bytes_per_cluster, ret);
+            trace_backup_do_cow_read_fail(job, start * job->cluster_size, ret);
             if (error_is_read) {
                 *error_is_read = true;
             }
@@ -157,7 +155,7 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
                                  job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
         }
         if (ret < 0) {
-            trace_backup_do_cow_write_fail(job, start * bytes_per_cluster, ret);
+            trace_backup_do_cow_write_fail(job, start * job->cluster_size, ret);
             if (error_is_read) {
                 *error_is_read = false;
             }
@@ -169,8 +167,8 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
         /* Publish progress, guest I/O counts as progress too.  Note that the
          * offset field is an opaque progress value, it is not a disk offset.
          */
-        job->sectors_read += n;
-        job->common.offset += n * BDRV_SECTOR_SIZE;
+        job->bytes_read += n;
+        job->common.offset += n;
     }

 out:
@@ -363,9 +361,8 @@ static bool coroutine_fn yield_and_check(BackupBlockJob *job)
      */
     if (job->common.speed) {
         uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
-                                                      job->sectors_read *
-                                                      BDRV_SECTOR_SIZE);
-        job->sectors_read = 0;
+                                                      job->bytes_read);
+        job->bytes_read = 0;
         block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
     } else {
         block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [Qemu-devel] [PATCH v2 15/20] backup: Switch block_backup.h to byte-based
  2017-05-11  2:20 [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
                   ` (13 preceding siblings ...)
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 14/20] backup: Switch BackupBlockJob to byte-based Eric Blake
@ 2017-05-11  2:20 ` Eric Blake
  2017-06-05 22:42   ` John Snow
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 16/20] backup: Switch backup_do_cow() " Eric Blake
                   ` (5 subsequent siblings)
  20 siblings, 1 reply; 29+ messages in thread
From: Eric Blake @ 2017-05-11  2:20 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz,
	Wen Congyang, Xie Changlong

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Continue by converting
the public interface to backup jobs (no semantic change), including
a change to CowRequest to track by bytes instead of cluster indices.

Note that this does not change the difference between the public
interface (a starting point and the size of the subsequent range)
and the internal interface (starting and end points).
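
Concretely, a public (offset, bytes) pair is widened to cluster-aligned
internal endpoints, as the diff below does; a compilable sketch with
assumed values:

    #include <assert.h>
    #include <stdint.h>

    #define ALIGN_DOWN(n, m) ((n) / (m) * (m))
    #define ALIGN_UP(n, m)   ALIGN_DOWN((n) + (m) - 1, (m))

    int main(void)
    {
        const int64_t cluster_size = 65536; /* backup_job->cluster_size */
        int64_t offset = 100000;            /* public: start of range */
        uint64_t bytes = 50000;             /* public: size of range  */

        /* internal: half-open [start, end), cluster-aligned */
        int64_t start = ALIGN_DOWN(offset, cluster_size);
        int64_t end = ALIGN_UP(offset + (int64_t)bytes, cluster_size);

        assert(start == 65536);
        assert(end == 196608);
        return 0;
    }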

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: change a couple more parameter names
---
 include/block/block_backup.h | 11 +++++------
 block/backup.c               | 33 ++++++++++++++++-----------------
 block/replication.c          | 12 ++++++++----
 3 files changed, 29 insertions(+), 27 deletions(-)

diff --git a/include/block/block_backup.h b/include/block/block_backup.h
index 8a75947..994a3bd 100644
--- a/include/block/block_backup.h
+++ b/include/block/block_backup.h
@@ -21,17 +21,16 @@
 #include "block/block_int.h"

 typedef struct CowRequest {
-    int64_t start;
-    int64_t end;
+    int64_t start_byte;
+    int64_t end_byte;
     QLIST_ENTRY(CowRequest) list;
     CoQueue wait_queue; /* coroutines blocked on this request */
 } CowRequest;

-void backup_wait_for_overlapping_requests(BlockJob *job, int64_t sector_num,
-                                          int nb_sectors);
+void backup_wait_for_overlapping_requests(BlockJob *job, int64_t offset,
+                                          uint64_t bytes);
 void backup_cow_request_begin(CowRequest *req, BlockJob *job,
-                              int64_t sector_num,
-                              int nb_sectors);
+                              int64_t offset, uint64_t bytes);
 void backup_cow_request_end(CowRequest *req);

 void backup_do_checkpoint(BlockJob *job, Error **errp);
diff --git a/block/backup.c b/block/backup.c
index a64a162..568d454 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -55,7 +55,7 @@ static inline int64_t cluster_size_sectors(BackupBlockJob *job)

 /* See if in-flight requests overlap and wait for them to complete */
 static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
-                                                       int64_t start,
+                                                       int64_t offset,
                                                        int64_t end)
 {
     CowRequest *req;
@@ -64,7 +64,7 @@ static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
     do {
         retry = false;
         QLIST_FOREACH(req, &job->inflight_reqs, list) {
-            if (end > req->start && start < req->end) {
+            if (end > req->start_byte && offset < req->end_byte) {
                 qemu_co_queue_wait(&req->wait_queue, NULL);
                 retry = true;
                 break;
@@ -75,10 +75,10 @@ static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,

 /* Keep track of an in-flight request */
 static void cow_request_begin(CowRequest *req, BackupBlockJob *job,
-                                     int64_t start, int64_t end)
+                              int64_t offset, int64_t end)
 {
-    req->start = start;
-    req->end = end;
+    req->start_byte = offset;
+    req->end_byte = end;
     qemu_co_queue_init(&req->wait_queue);
     QLIST_INSERT_HEAD(&job->inflight_reqs, req, list);
 }
@@ -114,8 +114,10 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
                               sector_num * BDRV_SECTOR_SIZE,
                               nb_sectors * BDRV_SECTOR_SIZE);

-    wait_for_overlapping_requests(job, start, end);
-    cow_request_begin(&cow_request, job, start, end);
+    wait_for_overlapping_requests(job, start * job->cluster_size,
+                                  end * job->cluster_size);
+    cow_request_begin(&cow_request, job, start * job->cluster_size,
+                      end * job->cluster_size);

     for (; start < end; start++) {
         if (test_bit(start, job->done_bitmap)) {
@@ -277,32 +279,29 @@ void backup_do_checkpoint(BlockJob *job, Error **errp)
     bitmap_zero(backup_job->done_bitmap, len);
 }

-void backup_wait_for_overlapping_requests(BlockJob *job, int64_t sector_num,
-                                          int nb_sectors)
+void backup_wait_for_overlapping_requests(BlockJob *job, int64_t offset,
+                                          uint64_t bytes)
 {
     BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
-    int64_t sectors_per_cluster = cluster_size_sectors(backup_job);
     int64_t start, end;

     assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);

-    start = sector_num / sectors_per_cluster;
-    end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
+    start = QEMU_ALIGN_DOWN(offset, backup_job->cluster_size);
+    end = QEMU_ALIGN_UP(offset + bytes, backup_job->cluster_size);
     wait_for_overlapping_requests(backup_job, start, end);
 }

 void backup_cow_request_begin(CowRequest *req, BlockJob *job,
-                              int64_t sector_num,
-                              int nb_sectors)
+                              int64_t offset, uint64_t bytes)
 {
     BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
-    int64_t sectors_per_cluster = cluster_size_sectors(backup_job);
     int64_t start, end;

     assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);

-    start = sector_num / sectors_per_cluster;
-    end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
+    start = QEMU_ALIGN_DOWN(offset, backup_job->cluster_size);
+    end = QEMU_ALIGN_UP(offset + bytes, backup_job->cluster_size);
     cow_request_begin(req, backup_job, start, end);
 }

diff --git a/block/replication.c b/block/replication.c
index d300c15..137a7fc 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -234,10 +234,14 @@ static coroutine_fn int replication_co_readv(BlockDriverState *bs,
     }

     if (job) {
-        backup_wait_for_overlapping_requests(child->bs->job, sector_num,
-                                             remaining_sectors);
-        backup_cow_request_begin(&req, child->bs->job, sector_num,
-                                 remaining_sectors);
+        uint64_t remaining_bytes = remaining_sectors * BDRV_SECTOR_SIZE;
+
+        backup_wait_for_overlapping_requests(child->bs->job,
+                                             sector_num * BDRV_SECTOR_SIZE,
+                                             remaining_bytes);
+        backup_cow_request_begin(&req, child->bs->job,
+                                 sector_num * BDRV_SECTOR_SIZE,
+                                 remaining_bytes);
         ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors,
                             qiov);
         backup_cow_request_end(&req);
-- 
2.9.3


* [Qemu-devel] [PATCH v2 16/20] backup: Switch backup_do_cow() to byte-based
  2017-05-11  2:20 [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
                   ` (14 preceding siblings ...)
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 15/20] backup: Switch block_backup.h " Eric Blake
@ 2017-05-11  2:20 ` Eric Blake
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 17/20] backup: Switch backup_run() " Eric Blake
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2017-05-11  2:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: no change
---
 block/backup.c | 62 ++++++++++++++++++++++++----------------------------------
 1 file changed, 26 insertions(+), 36 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 568d454..6aa7d1a 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -91,7 +91,7 @@ static void cow_request_end(CowRequest *req)
 }

 static int coroutine_fn backup_do_cow(BackupBlockJob *job,
-                                      int64_t sector_num, int nb_sectors,
+                                      int64_t offset, uint64_t bytes,
                                       bool *error_is_read,
                                       bool is_write_notifier)
 {
@@ -101,34 +101,28 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
     QEMUIOVector bounce_qiov;
     void *bounce_buffer = NULL;
     int ret = 0;
-    int64_t sectors_per_cluster = cluster_size_sectors(job);
-    int64_t start, end; /* clusters */
+    int64_t start, end; /* bytes */
     int n; /* bytes */

     qemu_co_rwlock_rdlock(&job->flush_rwlock);

-    start = sector_num / sectors_per_cluster;
-    end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
+    start = QEMU_ALIGN_DOWN(offset, job->cluster_size);
+    end = QEMU_ALIGN_UP(bytes + offset, job->cluster_size);

-    trace_backup_do_cow_enter(job, start * job->cluster_size,
-                              sector_num * BDRV_SECTOR_SIZE,
-                              nb_sectors * BDRV_SECTOR_SIZE);
+    trace_backup_do_cow_enter(job, start, offset, bytes);

-    wait_for_overlapping_requests(job, start * job->cluster_size,
-                                  end * job->cluster_size);
-    cow_request_begin(&cow_request, job, start * job->cluster_size,
-                      end * job->cluster_size);
+    wait_for_overlapping_requests(job, start, end);
+    cow_request_begin(&cow_request, job, start, end);

-    for (; start < end; start++) {
-        if (test_bit(start, job->done_bitmap)) {
-            trace_backup_do_cow_skip(job, start * job->cluster_size);
+    for (; start < end; start += job->cluster_size) {
+        if (test_bit(start / job->cluster_size, job->done_bitmap)) {
+            trace_backup_do_cow_skip(job, start);
             continue; /* already copied */
         }

-        trace_backup_do_cow_process(job, start * job->cluster_size);
+        trace_backup_do_cow_process(job, start);

-        n = MIN(job->cluster_size,
-                job->common.len - start * job->cluster_size);
+        n = MIN(job->cluster_size, job->common.len - start);

         if (!bounce_buffer) {
             bounce_buffer = blk_blockalign(blk, job->cluster_size);
@@ -137,11 +131,10 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
         iov.iov_len = n;
         qemu_iovec_init_external(&bounce_qiov, &iov, 1);

-        ret = blk_co_preadv(blk, start * job->cluster_size,
-                            bounce_qiov.size, &bounce_qiov,
+        ret = blk_co_preadv(blk, start, bounce_qiov.size, &bounce_qiov,
                             is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
         if (ret < 0) {
-            trace_backup_do_cow_read_fail(job, start * job->cluster_size, ret);
+            trace_backup_do_cow_read_fail(job, start, ret);
             if (error_is_read) {
                 *error_is_read = true;
             }
@@ -149,22 +142,22 @@ static int coroutine_fn backup_do_cow(BackupBlockJob *job,
         }

         if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
-            ret = blk_co_pwrite_zeroes(job->target, start * job->cluster_size,
+            ret = blk_co_pwrite_zeroes(job->target, start,
                                        bounce_qiov.size, BDRV_REQ_MAY_UNMAP);
         } else {
-            ret = blk_co_pwritev(job->target, start * job->cluster_size,
+            ret = blk_co_pwritev(job->target, start,
                                  bounce_qiov.size, &bounce_qiov,
                                  job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
         }
         if (ret < 0) {
-            trace_backup_do_cow_write_fail(job, start * job->cluster_size, ret);
+            trace_backup_do_cow_write_fail(job, start, ret);
             if (error_is_read) {
                 *error_is_read = false;
             }
             goto out;
         }

-        set_bit(start, job->done_bitmap);
+        set_bit(start / job->cluster_size, job->done_bitmap);

         /* Publish progress, guest I/O counts as progress too.  Note that the
          * offset field is an opaque progress value, it is not a disk offset.
@@ -180,8 +173,7 @@ out:

     cow_request_end(&cow_request);

-    trace_backup_do_cow_return(job, sector_num * BDRV_SECTOR_SIZE,
-                               nb_sectors * BDRV_SECTOR_SIZE, ret);
+    trace_backup_do_cow_return(job, offset, bytes, ret);

     qemu_co_rwlock_unlock(&job->flush_rwlock);

@@ -194,14 +186,12 @@ static int coroutine_fn backup_before_write_notify(
 {
     BackupBlockJob *job = container_of(notifier, BackupBlockJob, before_write);
     BdrvTrackedRequest *req = opaque;
-    int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
-    int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;

     assert(req->bs == blk_bs(job->common.blk));
-    assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
-    assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
+    assert(QEMU_IS_ALIGNED(req->offset, BDRV_SECTOR_SIZE));
+    assert(QEMU_IS_ALIGNED(req->bytes, BDRV_SECTOR_SIZE));

-    return backup_do_cow(job, sector_num, nb_sectors, NULL, true);
+    return backup_do_cow(job, req->offset, req->bytes, NULL, true);
 }

 static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -406,8 +396,8 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
                 if (yield_and_check(job)) {
                     goto out;
                 }
-                ret = backup_do_cow(job, cluster * sectors_per_cluster,
-                                    sectors_per_cluster, &error_is_read,
+                ret = backup_do_cow(job, cluster * job->cluster_size,
+                                    job->cluster_size, &error_is_read,
                                     false);
                 if ((ret < 0) &&
                     backup_error_action(job, error_is_read, -ret) ==
@@ -509,8 +499,8 @@ static void coroutine_fn backup_run(void *opaque)
             if (alloced < 0) {
                 ret = alloced;
             } else {
-                ret = backup_do_cow(job, start * sectors_per_cluster,
-                                    sectors_per_cluster, &error_is_read,
+                ret = backup_do_cow(job, start * job->cluster_size,
+                                    job->cluster_size, &error_is_read,
                                     false);
             }
             if (ret < 0) {
-- 
2.9.3


* [Qemu-devel] [PATCH v2 17/20] backup: Switch backup_run() to byte-based
  2017-05-11  2:20 [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
                   ` (15 preceding siblings ...)
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 16/20] backup: Switch backup_do_cow() " Eric Blake
@ 2017-05-11  2:20 ` Eric Blake
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 18/20] block: Make bdrv_is_allocated() byte-based Eric Blake
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2017-05-11  2:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of backups to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are cluster-aligned).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>

---
v2: no change
---
 block/backup.c | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 6aa7d1a..1d3a3ce 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -370,11 +370,10 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
     int ret = 0;
     int clusters_per_iter;
     uint32_t granularity;
-    int64_t sector;
+    int64_t offset;
     int64_t cluster;
     int64_t end;
     int64_t last_cluster = -1;
-    int64_t sectors_per_cluster = cluster_size_sectors(job);
     BdrvDirtyBitmapIter *dbi;

     granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
@@ -382,8 +381,8 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
     dbi = bdrv_dirty_iter_new(job->sync_bitmap, 0);

     /* Find the next dirty sector(s) */
-    while ((sector = bdrv_dirty_iter_next(dbi)) != -1) {
-        cluster = sector / sectors_per_cluster;
+    while ((offset = bdrv_dirty_iter_next(dbi) * BDRV_SECTOR_SIZE) >= 0) {
+        cluster = offset / job->cluster_size;

         /* Fake progress updates for any clusters we skipped */
         if (cluster != last_cluster + 1) {
@@ -410,7 +409,8 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
         /* If the bitmap granularity is smaller than the backup granularity,
          * we need to advance the iterator pointer to the next cluster. */
         if (granularity < job->cluster_size) {
-            bdrv_set_dirty_iter(dbi, cluster * sectors_per_cluster);
+            bdrv_set_dirty_iter(dbi,
+                                cluster * job->cluster_size / BDRV_SECTOR_SIZE);
         }

         last_cluster = cluster - 1;
@@ -432,17 +432,15 @@ static void coroutine_fn backup_run(void *opaque)
     BackupBlockJob *job = opaque;
     BackupCompleteData *data;
     BlockDriverState *bs = blk_bs(job->common.blk);
-    int64_t start, end;
+    int64_t offset;
     int64_t sectors_per_cluster = cluster_size_sectors(job);
     int ret = 0;

     QLIST_INIT(&job->inflight_reqs);
     qemu_co_rwlock_init(&job->flush_rwlock);

-    start = 0;
-    end = DIV_ROUND_UP(job->common.len, job->cluster_size);
-
-    job->done_bitmap = bitmap_new(end);
+    job->done_bitmap = bitmap_new(DIV_ROUND_UP(job->common.len,
+                                               job->cluster_size));

     job->before_write.notify = backup_before_write_notify;
     bdrv_add_before_write_notifier(bs, &job->before_write);
@@ -457,7 +455,8 @@ static void coroutine_fn backup_run(void *opaque)
         ret = backup_run_incremental(job);
     } else {
         /* Both FULL and TOP SYNC_MODE's require copying.. */
-        for (; start < end; start++) {
+        for (offset = 0; offset < job->common.len;
+             offset += job->cluster_size) {
             bool error_is_read;
             int alloced = 0;

@@ -480,8 +479,8 @@ static void coroutine_fn backup_run(void *opaque)
                      * needed but at some point that is always the case. */
                     alloced =
                         bdrv_is_allocated(bs,
-                                start * sectors_per_cluster + i,
-                                sectors_per_cluster - i, &n);
+                                          (offset >> BDRV_SECTOR_BITS) + i,
+                                          sectors_per_cluster - i, &n);
                     i += n;

                     if (alloced || n == 0) {
@@ -499,9 +498,8 @@ static void coroutine_fn backup_run(void *opaque)
             if (alloced < 0) {
                 ret = alloced;
             } else {
-                ret = backup_do_cow(job, start * job->cluster_size,
-                                    job->cluster_size, &error_is_read,
-                                    false);
+                ret = backup_do_cow(job, offset, job->cluster_size,
+                                    &error_is_read, false);
             }
             if (ret < 0) {
                 /* Depending on error action, fail now or retry cluster */
@@ -510,7 +508,7 @@ static void coroutine_fn backup_run(void *opaque)
                 if (action == BLOCK_ERROR_ACTION_REPORT) {
                     break;
                 } else {
-                    start--;
+                    offset -= job->cluster_size;
                     continue;
                 }
             }
-- 
2.9.3


* [Qemu-devel] [PATCH v2 18/20] block: Make bdrv_is_allocated() byte-based
  2017-05-11  2:20 [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
                   ` (16 preceding siblings ...)
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 17/20] backup: Switch backup_run() " Eric Blake
@ 2017-05-11  2:20 ` Eric Blake
  2017-06-05 22:57   ` John Snow
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 19/20] block: Minimize raw use of bds->total_sectors Eric Blake
                   ` (2 subsequent siblings)
  20 siblings, 1 reply; 29+ messages in thread
From: Eric Blake @ 2017-05-11  2:20 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz,
	Stefan Hajnoczi, Fam Zheng, Juan Quintela,
	Dr. David Alan Gilbert

We are gradually moving away from sector-based interfaces, towards
byte-based.  In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.

Changing the signature of the function to use int64_t *pnum ensures
that the compiler enforces that all callers are updated.  For now,
the io.c layer still assert()s that all callers are sector-aligned
on input and that *pnum is sector-aligned on return to the caller,
but that can be relaxed when a later patch implements byte-based
block status.  Therefore, this code adds usages like
DIV_ROUND_UP(..., BDRV_SECTOR_SIZE) to callers that still want aligned
values, where the call might reasonably give non-aligned results
in the future; on the other hand, no rounding is needed for callers
that should just continue to work with byte alignment.

For the most part this patch is just the addition of scaling at the
callers followed by inverse scaling at bdrv_is_allocated().  But
some code, particularly bdrv_commit(), gets a lot simpler because it
no longer has to mess with sectors; also, it is now possible to pass
NULL if the caller does not care how much of the image is allocated
beyond the initial offset.

For ease of review, bdrv_is_allocated_above() will be tackled
separately.
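
A compilable sketch of the caller-side pattern (the stub stands in for
the converted bdrv_is_allocated(); only the scaling is the point):

    #include <stdint.h>

    #define BDRV_SECTOR_SIZE 512
    #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

    /* stand-in: reports the first run of allocated bytes at @offset
     * through *pnum, here pretending 4 KiB are allocated */
    static int is_allocated_stub(int64_t offset, int64_t bytes,
                                 int64_t *pnum)
    {
        *pnum = bytes < 4096 ? bytes : 4096;
        return 1;
    }

    int main(void)
    {
        int64_t count;
        int ret = is_allocated_stub(0, 65536, &count);

        /* a caller that still thinks in sectors rounds the byte count
         * up, staying safe if a future byte-based implementation
         * returns an unaligned count at end-of-file */
        int nb_sectors = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
        return ret == 1 && nb_sectors == 8 ? 0 : 1;
    }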

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: rebase to earlier changes, tweak commit message
---
 include/block/block.h |  4 +--
 block/backup.c        | 17 ++++---------
 block/commit.c        | 21 +++++++---------
 block/io.c            | 49 +++++++++++++++++++++++++-----------
 block/stream.c        |  5 ++--
 block/vvfat.c         | 34 ++++++++++++++-----------
 migration/block.c     |  9 ++++---
 qemu-img.c            |  5 +++-
 qemu-io-cmds.c        | 70 +++++++++++++++++++++++----------------------------
 9 files changed, 114 insertions(+), 100 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 96ece29..47719ac 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -434,8 +434,8 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
                                     int64_t sector_num,
                                     int nb_sectors, int *pnum,
                                     BlockDriverState **file);
-int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
-                      int *pnum);
+int bdrv_is_allocated(BlockDriverState *bs, int64_t offset, int64_t bytes,
+                      int64_t *pnum);
 int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
                             int64_t sector_num, int nb_sectors, int *pnum);

diff --git a/block/backup.c b/block/backup.c
index 1d3a3ce..b8b76e5 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -47,12 +47,6 @@ typedef struct BackupBlockJob {
     QLIST_HEAD(, CowRequest) inflight_reqs;
 } BackupBlockJob;

-/* Size of a cluster in sectors, instead of bytes. */
-static inline int64_t cluster_size_sectors(BackupBlockJob *job)
-{
-  return job->cluster_size / BDRV_SECTOR_SIZE;
-}
-
 /* See if in-flight requests overlap and wait for them to complete */
 static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
                                                        int64_t offset,
@@ -433,7 +427,6 @@ static void coroutine_fn backup_run(void *opaque)
     BackupCompleteData *data;
     BlockDriverState *bs = blk_bs(job->common.blk);
     int64_t offset;
-    int64_t sectors_per_cluster = cluster_size_sectors(job);
     int ret = 0;

     QLIST_INIT(&job->inflight_reqs);
@@ -465,12 +458,13 @@ static void coroutine_fn backup_run(void *opaque)
             }

             if (job->sync_mode == MIRROR_SYNC_MODE_TOP) {
-                int i, n;
+                int i;
+                int64_t n;

                 /* Check to see if these blocks are already in the
                  * backing file. */

-                for (i = 0; i < sectors_per_cluster;) {
+                for (i = 0; i < job->cluster_size;) {
                     /* bdrv_is_allocated() only returns true/false based
                      * on the first set of sectors it comes across that
                      * are all in the same state.
@@ -478,9 +472,8 @@ static void coroutine_fn backup_run(void *opaque)
                      * backup cluster length.  We end up copying more than
                      * needed but at some point that is always the case. */
                     alloced =
-                        bdrv_is_allocated(bs,
-                                          (offset >> BDRV_SECTOR_BITS) + i,
-                                          sectors_per_cluster - i, &n);
+                        bdrv_is_allocated(bs, offset + i,
+                                          job->cluster_size - i, &n);
                     i += n;

                     if (alloced || n == 0) {
diff --git a/block/commit.c b/block/commit.c
index b7d3e60..4d6bb2a 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -429,7 +429,7 @@ fail:
 }


-#define COMMIT_BUF_SECTORS 2048
+#define COMMIT_BUF_SIZE (2048 * BDRV_SECTOR_SIZE)

 /* commit COW file into the raw image */
 int bdrv_commit(BlockDriverState *bs)
@@ -438,8 +438,9 @@ int bdrv_commit(BlockDriverState *bs)
     BlockDriverState *backing_file_bs = NULL;
     BlockDriverState *commit_top_bs = NULL;
     BlockDriver *drv = bs->drv;
-    int64_t sector, total_sectors, length, backing_length;
-    int n, ro, open_flags;
+    int64_t offset, length, backing_length;
+    int ro, open_flags;
+    int64_t n;
     int ret = 0;
     uint8_t *buf = NULL;
     Error *local_err = NULL;
@@ -517,30 +518,26 @@ int bdrv_commit(BlockDriverState *bs)
         }
     }

-    total_sectors = length >> BDRV_SECTOR_BITS;
-
     /* blk_try_blockalign() for src will choose an alignment that works for
      * backing as well, so no need to compare the alignment manually. */
-    buf = blk_try_blockalign(src, COMMIT_BUF_SECTORS * BDRV_SECTOR_SIZE);
+    buf = blk_try_blockalign(src, COMMIT_BUF_SIZE);
     if (buf == NULL) {
         ret = -ENOMEM;
         goto ro_cleanup;
     }

-    for (sector = 0; sector < total_sectors; sector += n) {
-        ret = bdrv_is_allocated(bs, sector, COMMIT_BUF_SECTORS, &n);
+    for (offset = 0; offset < length; offset += n) {
+        ret = bdrv_is_allocated(bs, offset, COMMIT_BUF_SIZE, &n);
         if (ret < 0) {
             goto ro_cleanup;
         }
         if (ret) {
-            ret = blk_pread(src, sector * BDRV_SECTOR_SIZE, buf,
-                            n * BDRV_SECTOR_SIZE);
+            ret = blk_pread(src, offset, buf, n);
             if (ret < 0) {
                 goto ro_cleanup;
             }

-            ret = blk_pwrite(backing, sector * BDRV_SECTOR_SIZE, buf,
-                             n * BDRV_SECTOR_SIZE, 0);
+            ret = blk_pwrite(backing, offset, buf, n, 0);
             if (ret < 0) {
                 goto ro_cleanup;
             }
diff --git a/block/io.c b/block/io.c
index 88ae8b0..9ab89cb 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1050,14 +1050,15 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
         int64_t start_sector = offset >> BDRV_SECTOR_BITS;
         int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
         unsigned int nb_sectors = end_sector - start_sector;
-        int pnum;
+        int64_t pnum;

-        ret = bdrv_is_allocated(bs, start_sector, nb_sectors, &pnum);
+        ret = bdrv_is_allocated(bs, start_sector << BDRV_SECTOR_BITS,
+                                nb_sectors << BDRV_SECTOR_BITS, &pnum);
         if (ret < 0) {
             goto out;
         }

-        if (!ret || pnum != nb_sectors) {
+        if (!ret || pnum != nb_sectors << BDRV_SECTOR_BITS) {
             ret = bdrv_co_do_copy_on_readv(child, offset, bytes, qiov);
             goto out;
         }
@@ -1891,15 +1892,26 @@ int64_t bdrv_get_block_status(BlockDriverState *bs,
                                        sector_num, nb_sectors, pnum, file);
 }

-int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num,
-                                   int nb_sectors, int *pnum)
+int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
+                                   int64_t bytes, int64_t *pnum)
 {
     BlockDriverState *file;
-    int64_t ret = bdrv_get_block_status(bs, sector_num, nb_sectors, pnum,
-                                        &file);
+    int64_t sector_num = offset >> BDRV_SECTOR_BITS;
+    int nb_sectors = bytes >> BDRV_SECTOR_BITS;
+    int64_t ret;
+    int psectors;
+
+    assert(QEMU_IS_ALIGNED(offset, BDRV_SECTOR_SIZE));
+    assert(QEMU_IS_ALIGNED(bytes, BDRV_SECTOR_SIZE) &&
+           bytes < INT_MAX * BDRV_SECTOR_SIZE);
+    ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &psectors,
+                                &file);
     if (ret < 0) {
         return ret;
     }
+    if (pnum) {
+        *pnum = psectors * BDRV_SECTOR_SIZE;
+    }
     return !!(ret & BDRV_BLOCK_ALLOCATED);
 }

@@ -1908,7 +1920,8 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num,
  *
  * Return true if the given sector is allocated in any image between
  * BASE and TOP (inclusive).  BASE can be NULL to check if the given
- * sector is allocated in any image of the chain.  Return false otherwise.
+ * sector is allocated in any image of the chain.  Return false otherwise,
+ * or negative errno on failure.
  *
  * 'pnum' is set to the number of sectors (including and immediately following
  *  the specified sector) that are known to be in the same
@@ -1925,13 +1938,19 @@ int bdrv_is_allocated_above(BlockDriverState *top,

     intermediate = top;
     while (intermediate && intermediate != base) {
-        int pnum_inter;
-        ret = bdrv_is_allocated(intermediate, sector_num, nb_sectors,
+        int64_t pnum_inter;
+        int psectors_inter;
+
+        ret = bdrv_is_allocated(intermediate, sector_num * BDRV_SECTOR_SIZE,
+                                nb_sectors * BDRV_SECTOR_SIZE,
                                 &pnum_inter);
         if (ret < 0) {
             return ret;
-        } else if (ret) {
-            *pnum = pnum_inter;
+        }
+        assert(pnum_inter < INT_MAX * BDRV_SECTOR_SIZE);
+        psectors_inter = pnum_inter >> BDRV_SECTOR_BITS;
+        if (ret) {
+            *pnum = psectors_inter;
             return 1;
         }

@@ -1941,10 +1960,10 @@ int bdrv_is_allocated_above(BlockDriverState *top,
          *
          * [sector_num+x, nr_sectors] allocated.
          */
-        if (n > pnum_inter &&
+        if (n > psectors_inter &&
             (intermediate == top ||
-             sector_num + pnum_inter < intermediate->total_sectors)) {
-            n = pnum_inter;
+             sector_num + psectors_inter < intermediate->total_sectors)) {
+            n = psectors_inter;
         }

         intermediate = backing_bs(intermediate);
diff --git a/block/stream.c b/block/stream.c
index 3ed4fef..85502eb 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -138,6 +138,7 @@ static void coroutine_fn stream_run(void *opaque)

     for ( ; offset < s->common.len; offset += n * BDRV_SECTOR_SIZE) {
         bool copy;
+        int64_t count = 0;

         /* Note that even when no rate limit is applied we need to yield
          * with no pending I/O here so that bdrv_drain_all() returns.
@@ -149,8 +150,8 @@ static void coroutine_fn stream_run(void *opaque)

         copy = false;

-        ret = bdrv_is_allocated(bs, offset / BDRV_SECTOR_SIZE,
-                                STREAM_BUFFER_SIZE / BDRV_SECTOR_SIZE, &n);
+        ret = bdrv_is_allocated(bs, offset, STREAM_BUFFER_SIZE, &count);
+        n = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
         if (ret == 1) {
             /* Allocated in the top, no need to copy.  */
         } else if (ret >= 0) {
diff --git a/block/vvfat.c b/block/vvfat.c
index b509d55..6b3135b 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1404,24 +1404,27 @@ static int vvfat_read(BlockDriverState *bs, int64_t sector_num,
 	if (sector_num >= bs->total_sectors)
 	   return -1;
 	if (s->qcow) {
-	    int n;
+            int64_t n;
             int ret;
-            ret = bdrv_is_allocated(s->qcow->bs, sector_num,
-                                    nb_sectors - i, &n);
+            ret = bdrv_is_allocated(s->qcow->bs, sector_num * BDRV_SECTOR_SIZE,
+                                    (nb_sectors - i) * BDRV_SECTOR_SIZE, &n);
             if (ret < 0) {
                 return ret;
             }
             if (ret) {
-                DLOG(fprintf(stderr, "sectors %d+%d allocated\n",
-                             (int)sector_num, n));
-                if (bdrv_read(s->qcow, sector_num, buf + i * 0x200, n)) {
+                DLOG(fprintf(stderr, "sectors %" PRId64 "+%" PRId64
+                             " allocated\n", sector_num,
+                             n >> BDRV_SECTOR_BITS));
+                if (bdrv_read(s->qcow, sector_num, buf + i * 0x200,
+                              n >> BDRV_SECTOR_BITS)) {
                     return -1;
                 }
-                i += n - 1;
-                sector_num += n - 1;
+                i += (n >> BDRV_SECTOR_BITS) - 1;
+                sector_num += (n >> BDRV_SECTOR_BITS) - 1;
                 continue;
             }
-DLOG(fprintf(stderr, "sector %d not allocated\n", (int)sector_num));
+            DLOG(fprintf(stderr, "sector %" PRId64 " not allocated\n",
+                         sector_num));
 	}
 	if(sector_num<s->faked_sectors) {
 	    if(sector_num<s->first_sectors_number)
@@ -1689,7 +1692,7 @@ static inline bool cluster_was_modified(BDRVVVFATState *s,
                                         uint32_t cluster_num)
 {
     int was_modified = 0;
-    int i, dummy;
+    int i;

     if (s->qcow == NULL) {
         return 0;
@@ -1697,8 +1700,9 @@ static inline bool cluster_was_modified(BDRVVVFATState *s,

     for (i = 0; !was_modified && i < s->sectors_per_cluster; i++) {
         was_modified = bdrv_is_allocated(s->qcow->bs,
-                                         cluster2sector(s, cluster_num) + i,
-                                         1, &dummy);
+                                         (cluster2sector(s, cluster_num) +
+                                          i) * BDRV_SECTOR_SIZE,
+                                         BDRV_SECTOR_SIZE, NULL);
     }

     /*
@@ -1845,7 +1849,7 @@ static uint32_t get_cluster_count_for_direntry(BDRVVVFATState* s,
 	    }

 	    if (copy_it) {
-		int i, dummy;
+                int i;
 		/*
 		 * This is horribly inefficient, but that is okay, since
 		 * it is rarely executed, if at all.
@@ -1856,7 +1860,9 @@ static uint32_t get_cluster_count_for_direntry(BDRVVVFATState* s,
                 for (i = 0; i < s->sectors_per_cluster; i++) {
                     int res;

-                    res = bdrv_is_allocated(s->qcow->bs, offset + i, 1, &dummy);
+                    res = bdrv_is_allocated(s->qcow->bs,
+                                            (offset + i) * BDRV_SECTOR_SIZE,
+                                            BDRV_SECTOR_SIZE, NULL);
                     if (res < 0) {
                         return -1;
                     }
diff --git a/migration/block.c b/migration/block.c
index 060087f..8d79d84 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -36,7 +36,7 @@
 #define BLK_MIG_FLAG_PROGRESS           0x04
 #define BLK_MIG_FLAG_ZERO_BLOCK         0x08

-#define MAX_IS_ALLOCATED_SEARCH 65536
+#define MAX_IS_ALLOCATED_SEARCH (65536 * BDRV_SECTOR_SIZE)

 #define MAX_INFLIGHT_IO 512

@@ -272,6 +272,7 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
     BlockBackend *bb = bmds->blk;
     BlkMigBlock *blk;
     int nr_sectors;
+    int64_t count;

     if (bmds->shared_base) {
         qemu_mutex_lock_iothread();
@@ -279,9 +280,9 @@ static int mig_save_device_bulk(QEMUFile *f, BlkMigDevState *bmds)
         /* Skip unallocated sectors; intentionally treats failure as
          * an allocated sector */
         while (cur_sector < total_sectors &&
-               !bdrv_is_allocated(blk_bs(bb), cur_sector,
-                                  MAX_IS_ALLOCATED_SEARCH, &nr_sectors)) {
-            cur_sector += nr_sectors;
+               !bdrv_is_allocated(blk_bs(bb), cur_sector * BDRV_SECTOR_SIZE,
+                                  MAX_IS_ALLOCATED_SEARCH, &count)) {
+            cur_sector += DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
         }
         aio_context_release(blk_get_aio_context(bb));
         qemu_mutex_unlock_iothread();
diff --git a/qemu-img.c b/qemu-img.c
index f981b57..3b17709 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -3211,6 +3211,7 @@ static int img_rebase(int argc, char **argv)
         int64_t new_backing_num_sectors = 0;
         uint64_t sector;
         int n;
+        int64_t count;
         float local_progress = 0;

         buf_old = blk_blockalign(blk, IO_BUF_SIZE);
@@ -3258,12 +3259,14 @@ static int img_rebase(int argc, char **argv)
             }

             /* If the cluster is allocated, we don't need to take action */
-            ret = bdrv_is_allocated(bs, sector, n, &n);
+            ret = bdrv_is_allocated(bs, sector << BDRV_SECTOR_BITS,
+                                    n << BDRV_SECTOR_BITS, &count);
             if (ret < 0) {
                 error_report("error while reading image metadata: %s",
                              strerror(-ret));
                 goto out;
             }
+            n = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
             if (ret) {
                 continue;
             }
diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index 4b2278f..e5c209a 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -1760,12 +1760,12 @@ out:
 static int alloc_f(BlockBackend *blk, int argc, char **argv)
 {
     BlockDriverState *bs = blk_bs(blk);
-    int64_t offset, sector_num, nb_sectors, remaining, count;
+    int64_t offset, start, remaining, count;
     char s1[64];
-    int num, ret;
-    int64_t sum_alloc;
+    int ret;
+    int64_t num, sum_alloc;

-    offset = cvtnum(argv[1]);
+    start = offset = cvtnum(argv[1]);
     if (offset < 0) {
         print_cvtnum_err(offset, argv[1]);
         return 0;
@@ -1793,32 +1793,30 @@ static int alloc_f(BlockBackend *blk, int argc, char **argv)
                count);
         return 0;
     }
-    nb_sectors = count >> BDRV_SECTOR_BITS;

-    remaining = nb_sectors;
+    remaining = count;
     sum_alloc = 0;
-    sector_num = offset >> 9;
     while (remaining) {
-        ret = bdrv_is_allocated(bs, sector_num, remaining, &num);
+        ret = bdrv_is_allocated(bs, offset, remaining, &num);
         if (ret < 0) {
             printf("is_allocated failed: %s\n", strerror(-ret));
             return 0;
         }
-        sector_num += num;
+        offset += num;
         remaining -= num;
         if (ret) {
             sum_alloc += num;
         }
         if (num == 0) {
-            nb_sectors -= remaining;
+            count -= remaining;
             remaining = 0;
         }
     }

-    cvtstr(offset, s1, sizeof(s1));
+    cvtstr(start, s1, sizeof(s1));

     printf("%"PRId64"/%"PRId64" bytes allocated at offset %s\n",
-           sum_alloc << BDRV_SECTOR_BITS, nb_sectors << BDRV_SECTOR_BITS, s1);
+           sum_alloc, count, s1);
     return 0;
 }

@@ -1833,14 +1831,15 @@ static const cmdinfo_t alloc_cmd = {
 };


-static int map_is_allocated(BlockDriverState *bs, int64_t sector_num,
-                            int64_t nb_sectors, int64_t *pnum)
+static int map_is_allocated(BlockDriverState *bs, int64_t offset,
+                            int64_t bytes, int64_t *pnum)
 {
-    int num, num_checked;
+    int64_t num;
+    int num_checked;
     int ret, firstret;

-    num_checked = MIN(nb_sectors, INT_MAX);
-    ret = bdrv_is_allocated(bs, sector_num, num_checked, &num);
+    num_checked = MIN(bytes, BDRV_REQUEST_MAX_BYTES);
+    ret = bdrv_is_allocated(bs, offset, num_checked, &num);
     if (ret < 0) {
         return ret;
     }
@@ -1848,12 +1847,12 @@ static int map_is_allocated(BlockDriverState *bs, int64_t sector_num,
     firstret = ret;
     *pnum = num;

-    while (nb_sectors > 0 && ret == firstret) {
-        sector_num += num;
-        nb_sectors -= num;
+    while (bytes > 0 && ret == firstret) {
+        offset += num;
+        bytes -= num;

-        num_checked = MIN(nb_sectors, INT_MAX);
-        ret = bdrv_is_allocated(bs, sector_num, num_checked, &num);
+        num_checked = MIN(bytes, BDRV_REQUEST_MAX_BYTES);
+        ret = bdrv_is_allocated(bs, offset, num_checked, &num);
         if (ret == firstret && num) {
             *pnum += num;
         } else {
@@ -1866,25 +1865,21 @@ static int map_is_allocated(BlockDriverState *bs, int64_t sector_num,

 static int map_f(BlockBackend *blk, int argc, char **argv)
 {
-    int64_t offset;
-    int64_t nb_sectors, total_sectors;
+    int64_t offset, bytes;
     char s1[64], s2[64];
     int64_t num;
     int ret;
     const char *retstr;

     offset = 0;
-    total_sectors = blk_nb_sectors(blk);
-    if (total_sectors < 0) {
-        error_report("Failed to query image length: %s",
-                     strerror(-total_sectors));
+    bytes = blk_getlength(blk);
+    if (bytes < 0) {
+        error_report("Failed to query image length: %s", strerror(-bytes));
         return 0;
     }

-    nb_sectors = total_sectors;
-
-    do {
-        ret = map_is_allocated(blk_bs(blk), offset, nb_sectors, &num);
+    while (bytes) {
+        ret = map_is_allocated(blk_bs(blk), offset, bytes, &num);
         if (ret < 0) {
             error_report("Failed to get allocation status: %s", strerror(-ret));
             return 0;
@@ -1894,15 +1889,14 @@ static int map_f(BlockBackend *blk, int argc, char **argv)
         }

         retstr = ret ? "    allocated" : "not allocated";
-        cvtstr(num << BDRV_SECTOR_BITS, s1, sizeof(s1));
-        cvtstr(offset << BDRV_SECTOR_BITS, s2, sizeof(s2));
+        cvtstr(num, s1, sizeof(s1));
+        cvtstr(offset, s2, sizeof(s2));
         printf("%s (0x%" PRIx64 ") bytes %s at offset %s (0x%" PRIx64 ")\n",
-               s1, num << BDRV_SECTOR_BITS, retstr,
-               s2, offset << BDRV_SECTOR_BITS);
+               s1, num, retstr, s2, offset);

         offset += num;
-        nb_sectors -= num;
-    } while (offset < total_sectors);
+        bytes -= num;
+    };

     return 0;
 }
-- 
2.9.3

* [Qemu-devel] [PATCH v2 19/20] block: Minimize raw use of bds->total_sectors
  2017-05-11  2:20 [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
                   ` (17 preceding siblings ...)
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 18/20] block: Make bdrv_is_allocated() byte-based Eric Blake
@ 2017-05-11  2:20 ` Eric Blake
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 20/20] block: Make bdrv_is_allocated_above() byte-based Eric Blake
  2017-06-05 19:39 ` [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
  20 siblings, 0 replies; 29+ messages in thread
From: Eric Blake @ 2017-05-11  2:20 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, jsnow, Stefan Hajnoczi, Fam Zheng, Kevin Wolf, Max Reitz

bdrv_is_allocated_above() was relying on intermediate->total_sectors,
which is a field that can have stale contents depending on the value
of intermediate->has_variable_length.  An audit shows that we are safe
(we were first calling through bdrv_co_get_block_status() which in
turn calls bdrv_nb_sectors() and therefore just refreshed the current
length), but it's nicer to favor our accessor functions to avoid having
to repeat such an audit, even if it means refresh_total_sectors() is
called more frequently.
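
For illustration, the resulting pattern (shown in context in the diff
below) is simply:

    int64_t size_inter = bdrv_nb_sectors(intermediate);
    if (size_inter < 0) {
        return size_inter; /* propagate -errno rather than trust a
                            * possibly-stale cached length */
    }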

Suggested-by: John Snow <jsnow@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: new patch
---
 block/io.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index 9ab89cb..20d3ebd 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1939,6 +1939,7 @@ int bdrv_is_allocated_above(BlockDriverState *top,
     intermediate = top;
     while (intermediate && intermediate != base) {
         int64_t pnum_inter;
+        int64_t size_inter;
         int psectors_inter;

         ret = bdrv_is_allocated(intermediate, sector_num * BDRV_SECTOR_SIZE,
@@ -1956,13 +1957,14 @@ int bdrv_is_allocated_above(BlockDriverState *top,

         /*
          * [sector_num, nb_sectors] is unallocated on top but intermediate
-         * might have
-         *
-         * [sector_num+x, nr_sectors] allocated.
+         * might have [sector_num+x, nb_sectors-x] allocated.
          */
+        size_inter = bdrv_nb_sectors(intermediate);
+        if (size_inter < 0) {
+            return size_inter;
+        }
         if (n > psectors_inter &&
-            (intermediate == top ||
-             sector_num + psectors_inter < intermediate->total_sectors)) {
+            (intermediate == top || sector_num + psectors_inter < size_inter)) {
             n = psectors_inter;
         }

-- 
2.9.3

* [Qemu-devel] [PATCH v2 20/20] block: Make bdrv_is_allocated_above() byte-based
  2017-05-11  2:20 [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
                   ` (18 preceding siblings ...)
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 19/20] block: Minimize raw use of bds->total_sectors Eric Blake
@ 2017-05-11  2:20 ` Eric Blake
  2017-06-05 23:22   ` John Snow
  2017-06-05 23:30   ` John Snow
  2017-06-05 19:39 ` [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
  20 siblings, 2 replies; 29+ messages in thread
From: Eric Blake @ 2017-05-11  2:20 UTC (permalink / raw)
  To: qemu-devel
  Cc: qemu-block, jsnow, Jeff Cody, Kevin Wolf, Max Reitz,
	Stefan Hajnoczi, Fam Zheng, Wen Congyang, Xie Changlong

We are gradually moving away from sector-based interfaces, towards
byte-based.  In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file, when byte-based access is possible.

Changing the signature of the function to use int64_t *pnum ensures
that the compiler enforces that all callers are updated.  For now,
the io.c layer still assert()s that all callers are sector-aligned,
but that can be relaxed when a later patch implements byte-based
block status.  Therefore, for the most part this patch is just the
addition of scaling to bytes at the callers, followed by scaling
back to sectors at those callers that still need it.  But some
code, particularly stream_run(),
gets a lot simpler because it no longer has to mess with sectors.

For ease of review, bdrv_is_allocated() was tackled separately.
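
As a sketch of the caller-side conversion (modeled on the
replication.c hunk below, with the surrounding variables assumed from
that context), a still sector-based caller scales on the way in and
back out around the byte-based call:

    int64_t count;
    ret = bdrv_is_allocated_above(top, base,
                                  sector_num * BDRV_SECTOR_SIZE,
                                  nb_sectors * BDRV_SECTOR_SIZE, &count);
    if (ret < 0) {
        return ret;
    }
    /* io.c still asserts that the result is sector-aligned */
    n = count >> BDRV_SECTOR_BITS;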

Signed-off-by: Eric Blake <eblake@redhat.com>

---
v2: tweak function comments, favor bdrv_getlength() over ->total_sectors
---
 include/block/block.h |  2 +-
 block/commit.c        | 20 ++++++++------------
 block/io.c            | 42 ++++++++++++++++++++----------------------
 block/mirror.c        |  5 ++++-
 block/replication.c   | 17 ++++++++++++-----
 block/stream.c        | 21 +++++++++------------
 qemu-img.c            | 10 +++++++---
 7 files changed, 61 insertions(+), 56 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 47719ac..81e3325 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -437,7 +437,7 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
 int bdrv_is_allocated(BlockDriverState *bs, int64_t offset, int64_t bytes,
                       int64_t *pnum);
 int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
-                            int64_t sector_num, int nb_sectors, int *pnum);
+                            int64_t offset, int64_t bytes, int64_t *pnum);

 bool bdrv_is_read_only(BlockDriverState *bs);
 int bdrv_can_set_read_only(BlockDriverState *bs, bool read_only, Error **errp);
diff --git a/block/commit.c b/block/commit.c
index 4d6bb2a..989de7d 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -132,7 +132,7 @@ static void coroutine_fn commit_run(void *opaque)
     int64_t offset;
     uint64_t delay_ns = 0;
     int ret = 0;
-    int n = 0; /* sectors */
+    int64_t n = 0; /* bytes */
     void *buf = NULL;
     int bytes_written = 0;
     int64_t base_len;
@@ -157,7 +157,7 @@ static void coroutine_fn commit_run(void *opaque)

     buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);

-    for (offset = 0; offset < s->common.len; offset += n * BDRV_SECTOR_SIZE) {
+    for (offset = 0; offset < s->common.len; offset += n) {
         bool copy;

         /* Note that even when no rate limit is applied we need to yield
@@ -169,15 +169,12 @@ static void coroutine_fn commit_run(void *opaque)
         }
         /* Copy if allocated above the base */
         ret = bdrv_is_allocated_above(blk_bs(s->top), blk_bs(s->base),
-                                      offset / BDRV_SECTOR_SIZE,
-                                      COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
-                                      &n);
+                                      offset, COMMIT_BUFFER_SIZE, &n);
         copy = (ret == 1);
-        trace_commit_one_iteration(s, offset, n * BDRV_SECTOR_SIZE, ret);
+        trace_commit_one_iteration(s, offset, n, ret);
         if (copy) {
-            ret = commit_populate(s->top, s->base, offset,
-                                  n * BDRV_SECTOR_SIZE, buf);
-            bytes_written += n * BDRV_SECTOR_SIZE;
+            ret = commit_populate(s->top, s->base, offset, n, buf);
+            bytes_written += n;
         }
         if (ret < 0) {
             BlockErrorAction action =
@@ -190,11 +187,10 @@ static void coroutine_fn commit_run(void *opaque)
             }
         }
         /* Publish progress */
-        s->common.offset += n * BDRV_SECTOR_SIZE;
+        s->common.offset += n;

         if (copy && s->common.speed) {
-            delay_ns = ratelimit_calculate_delay(&s->limit,
-                                                 n * BDRV_SECTOR_SIZE);
+            delay_ns = ratelimit_calculate_delay(&s->limit, n);
         }
     }

diff --git a/block/io.c b/block/io.c
index 20d3ebd..d4f1925 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1918,54 +1918,52 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t offset,
 /*
  * Given an image chain: ... -> [BASE] -> [INTER1] -> [INTER2] -> [TOP]
  *
- * Return true if the given sector is allocated in any image between
- * BASE and TOP (inclusive).  BASE can be NULL to check if the given
- * sector is allocated in any image of the chain.  Return false otherwise,
+ * Return true if the (prefix of the) given range is allocated in any image
+ * between BASE and TOP (inclusive).  BASE can be NULL to check if the given
+ * offset is allocated in any image of the chain.  Return false otherwise,
  * or negative errno on failure.
  *
- * 'pnum' is set to the number of sectors (including and immediately following
- *  the specified sector) that are known to be in the same
- *  allocated/unallocated state.
+ * 'pnum' is set to the number of bytes (including and immediately
+ * following the specified offset) that are known to be in the same
+ * allocated/unallocated state.  Note that a subsequent call starting
+ * at 'offset + *pnum' may return the same allocation status (in other
+ * words, the result is not necessarily the maximum possible range);
+ * but 'pnum' will only be 0 when end of file is reached.
  *
  */
 int bdrv_is_allocated_above(BlockDriverState *top,
                             BlockDriverState *base,
-                            int64_t sector_num,
-                            int nb_sectors, int *pnum)
+                            int64_t offset, int64_t bytes, int64_t *pnum)
 {
     BlockDriverState *intermediate;
-    int ret, n = nb_sectors;
+    int ret;
+    int64_t n = bytes;

     intermediate = top;
     while (intermediate && intermediate != base) {
         int64_t pnum_inter;
         int64_t size_inter;
-        int psectors_inter;

-        ret = bdrv_is_allocated(intermediate, sector_num * BDRV_SECTOR_SIZE,
-                                nb_sectors * BDRV_SECTOR_SIZE,
-                                &pnum_inter);
+        ret = bdrv_is_allocated(intermediate, offset, bytes, &pnum_inter);
         if (ret < 0) {
             return ret;
         }
-        assert(pnum_inter < INT_MAX * BDRV_SECTOR_SIZE);
-        psectors_inter = pnum_inter >> BDRV_SECTOR_BITS;
         if (ret) {
-            *pnum = psectors_inter;
+            *pnum = pnum_inter;
             return 1;
         }

         /*
-         * [sector_num, nb_sectors] is unallocated on top but intermediate
-         * might have [sector_num+x, nb_sectors-x] allocated.
+         * [offset, bytes] is unallocated on top but intermediate
+         * might have [offset+x, bytes-x] allocated.
          */
-        size_inter = bdrv_nb_sectors(intermediate);
+        size_inter = bdrv_getlength(intermediate);
         if (size_inter < 0) {
             return size_inter;
         }
-        if (n > psectors_inter &&
-            (intermediate == top || sector_num + psectors_inter < size_inter)) {
-            n = psectors_inter;
+        if (n > pnum_inter &&
+            (intermediate == top || offset + pnum_inter < size_inter)) {
+            n = pnum_inter;
         }

         intermediate = backing_bs(intermediate);
diff --git a/block/mirror.c b/block/mirror.c
index 2a95ba5..452d546 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -609,6 +609,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
     BlockDriverState *bs = s->source;
     BlockDriverState *target_bs = blk_bs(s->target);
     int ret, n;
+    int64_t count;

     end = s->bdev_length / BDRV_SECTOR_SIZE;

@@ -658,11 +659,13 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
             return 0;
         }

-        ret = bdrv_is_allocated_above(bs, base, sector_num, nb_sectors, &n);
+        ret = bdrv_is_allocated_above(bs, base, sector_num * BDRV_SECTOR_SIZE,
+                                      nb_sectors * BDRV_SECTOR_SIZE, &count);
         if (ret < 0) {
             return ret;
         }

+        n = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
         assert(n > 0);
         if (ret == 1) {
             bdrv_set_dirty_bitmap(s->dirty_bitmap, sector_num, n);
diff --git a/block/replication.c b/block/replication.c
index 137a7fc..52f4a8e 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -264,7 +264,8 @@ static coroutine_fn int replication_co_writev(BlockDriverState *bs,
     BdrvChild *top = bs->file;
     BdrvChild *base = s->secondary_disk;
     BdrvChild *target;
-    int ret, n;
+    int ret;
+    int64_t n;

     ret = replication_get_io_status(s);
     if (ret < 0) {
@@ -283,14 +284,20 @@ static coroutine_fn int replication_co_writev(BlockDriverState *bs,
      */
     qemu_iovec_init(&hd_qiov, qiov->niov);
     while (remaining_sectors > 0) {
-        ret = bdrv_is_allocated_above(top->bs, base->bs, sector_num,
-                                      remaining_sectors, &n);
+        int64_t count;
+
+        ret = bdrv_is_allocated_above(top->bs, base->bs,
+                                      sector_num * BDRV_SECTOR_SIZE,
+                                      remaining_sectors * BDRV_SECTOR_SIZE,
+                                      &count);
         if (ret < 0) {
             goto out1;
         }

+        assert(QEMU_IS_ALIGNED(count, BDRV_SECTOR_SIZE));
+        n = count >> BDRV_SECTOR_BITS;
         qemu_iovec_reset(&hd_qiov);
-        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, n * BDRV_SECTOR_SIZE);
+        qemu_iovec_concat(&hd_qiov, qiov, bytes_done, count);

         target = ret ? top : base;
         ret = bdrv_co_writev(target, sector_num, n, &hd_qiov);
@@ -300,7 +307,7 @@ static coroutine_fn int replication_co_writev(BlockDriverState *bs,

         remaining_sectors -= n;
         sector_num += n;
-        bytes_done += n * BDRV_SECTOR_SIZE;
+        bytes_done += count;
     }

 out1:
diff --git a/block/stream.c b/block/stream.c
index 85502eb..9033655 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -112,7 +112,7 @@ static void coroutine_fn stream_run(void *opaque)
     uint64_t delay_ns = 0;
     int error = 0;
     int ret = 0;
-    int n = 0; /* sectors */
+    int64_t n = 0; /* bytes */
     void *buf;

     if (!bs->backing) {
@@ -136,9 +136,8 @@ static void coroutine_fn stream_run(void *opaque)
         bdrv_enable_copy_on_read(bs);
     }

-    for ( ; offset < s->common.len; offset += n * BDRV_SECTOR_SIZE) {
+    for ( ; offset < s->common.len; offset += n) {
         bool copy;
-        int64_t count = 0;

         /* Note that even when no rate limit is applied we need to yield
          * with no pending I/O here so that bdrv_drain_all() returns.
@@ -150,26 +149,25 @@ static void coroutine_fn stream_run(void *opaque)

         copy = false;

-        ret = bdrv_is_allocated(bs, offset, STREAM_BUFFER_SIZE, &count);
-        n = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
+        ret = bdrv_is_allocated(bs, offset, STREAM_BUFFER_SIZE, &n);
         if (ret == 1) {
             /* Allocated in the top, no need to copy.  */
         } else if (ret >= 0) {
             /* Copy if allocated in the intermediate images.  Limit to the
              * known-unallocated area [offset, offset+n*BDRV_SECTOR_SIZE).  */
             ret = bdrv_is_allocated_above(backing_bs(bs), base,
-                                          offset / BDRV_SECTOR_SIZE, n, &n);
+                                          offset, n, &n);

             /* Finish early if end of backing file has been reached */
             if (ret == 0 && n == 0) {
-                n = (s->common.len - offset) / BDRV_SECTOR_SIZE;
+                n = s->common.len - offset;
             }

             copy = (ret == 1);
         }
-        trace_stream_one_iteration(s, offset, n * BDRV_SECTOR_SIZE, ret);
+        trace_stream_one_iteration(s, offset, n, ret);
         if (copy) {
-            ret = stream_populate(blk, offset, n * BDRV_SECTOR_SIZE, buf);
+            ret = stream_populate(blk, offset, n, buf);
         }
         if (ret < 0) {
             BlockErrorAction action =
@@ -188,10 +186,9 @@ static void coroutine_fn stream_run(void *opaque)
         ret = 0;

         /* Publish progress */
-        s->common.offset += n * BDRV_SECTOR_SIZE;
+        s->common.offset += n;
         if (copy && s->common.speed) {
-            delay_ns = ratelimit_calculate_delay(&s->limit,
-                                                 n * BDRV_SECTOR_SIZE);
+            delay_ns = ratelimit_calculate_delay(&s->limit, n);
         }
     }

diff --git a/qemu-img.c b/qemu-img.c
index 3b17709..08864f3 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1476,12 +1476,16 @@ static int img_compare(int argc, char **argv)
         }

         for (;;) {
+            int64_t count;
+
             nb_sectors = sectors_to_process(total_sectors_over, sector_num);
             if (nb_sectors <= 0) {
                 break;
             }
-            ret = bdrv_is_allocated_above(blk_bs(blk_over), NULL, sector_num,
-                                          nb_sectors, &pnum);
+            ret = bdrv_is_allocated_above(blk_bs(blk_over), NULL,
+                                          sector_num * BDRV_SECTOR_SIZE,
+                                          nb_sectors * BDRV_SECTOR_SIZE,
+                                          &count);
             if (ret < 0) {
                 ret = 3;
                 error_report("Sector allocation test failed for %s",
@@ -1489,7 +1493,7 @@ static int img_compare(int argc, char **argv)
                 goto out;

             }
-            nb_sectors = pnum;
+            nb_sectors = DIV_ROUND_UP(count, BDRV_SECTOR_SIZE);
             if (ret) {
                 ret = check_empty_sectors(blk_over, sector_num, nb_sectors,
                                           filename_over, buf1, quiet);
-- 
2.9.3

* Re: [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based
  2017-05-11  2:20 [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
                   ` (19 preceding siblings ...)
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 20/20] block: Make bdrv_is_allocated_above() byte-based Eric Blake
@ 2017-06-05 19:39 ` Eric Blake
  2017-06-05 19:41   ` John Snow
  20 siblings, 1 reply; 29+ messages in thread
From: Eric Blake @ 2017-06-05 19:39 UTC (permalink / raw)
  To: qemu-devel; +Cc: jsnow, qemu-block

ping

On 05/10/2017 09:20 PM, Eric Blake wrote:
> [...]

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


* Re: [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based
  2017-06-05 19:39 ` [Qemu-devel] [PATCH v2 00/20] make bdrv_is_allocated[_above] byte-based Eric Blake
@ 2017-06-05 19:41   ` John Snow
  0 siblings, 0 replies; 29+ messages in thread
From: John Snow @ 2017-06-05 19:41 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: qemu-block



On 06/05/2017 03:39 PM, Eric Blake wrote:
> ping
> 

ACK.

> On 05/10/2017 09:20 PM, Eric Blake wrote:
>> [...]

* Re: [Qemu-devel] [PATCH v2 09/20] mirror: Update signature of mirror_clip_sectors()
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 09/20] mirror: Update signature of mirror_clip_sectors() Eric Blake
@ 2017-06-05 22:23   ` John Snow
  0 siblings, 0 replies; 29+ messages in thread
From: John Snow @ 2017-06-05 22:23 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: Kevin Wolf, Jeff Cody, qemu-block, Max Reitz



On 05/10/2017 10:20 PM, Eric Blake wrote:
> Rather than having a void function that modifies its input
> in-place as the output, change the signature to reduce a layer
> of indirection and return the result.
> 
> Suggested-by: John Snow <jsnow@redhat.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 

Reviewed-by: John Snow <jsnow@redhat.com>

> ---
> v2: new patch
> ---
>  block/mirror.c | 15 ++++++++-------
>  1 file changed, 8 insertions(+), 7 deletions(-)
> 

* Re: [Qemu-devel] [PATCH v2 10/20] mirror: Switch mirror_cow_align() to byte-based
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 10/20] mirror: Switch mirror_cow_align() to byte-based Eric Blake
@ 2017-06-05 22:24   ` John Snow
  0 siblings, 0 replies; 29+ messages in thread
From: John Snow @ 2017-06-05 22:24 UTC (permalink / raw)
  To: Eric Blake, qemu-devel; +Cc: Kevin Wolf, Jeff Cody, qemu-block, Max Reitz



On 05/10/2017 10:20 PM, Eric Blake wrote:
> We are gradually converting to byte-based interfaces, as they are
> easier to reason about than sector-based.  Convert another internal
> function (no semantic change), and add mirror_clip_bytes() as a
> counterpart to mirror_clip_sectors().  Some of the conversion is
> a bit tricky, requiring temporaries to convert between units; it
> will be cleared up in a following patch.
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>

Reviewed-by: John Snow <jsnow@redhat.com>

> 
> ---
> v2: tweak mirror_clip_bytes() signature to match previous patch
> ---

* Re: [Qemu-devel] [PATCH v2 15/20] backup: Switch block_backup.h to byte-based
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 15/20] backup: Switch block_backup.h " Eric Blake
@ 2017-06-05 22:42   ` John Snow
  0 siblings, 0 replies; 29+ messages in thread
From: John Snow @ 2017-06-05 22:42 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: Kevin Wolf, qemu-block, Wen Congyang, Jeff Cody, Max Reitz,
	Xie Changlong



On 05/10/2017 10:20 PM, Eric Blake wrote:
> We are gradually converting to byte-based interfaces, as they are
> easier to reason about than sector-based.  Continue by converting
> the public interface to backup jobs (no semantic change), including
> a change to CowRequest to track by bytes instead of cluster indices.
> 
> Note that this does not change the difference between the public
> interface (starting point, and size of the subsequent range) and
> the internal interface (starting and end points).
> 
> Signed-off-by: Eric Blake <eblake@redhat.com>
> 

Reviewed-by: John Snow <jsnow@redhat.com>

> ---
> v2: change a couple more parameter names
> ---
>  include/block/block_backup.h | 11 +++++------
>  block/backup.c               | 33 ++++++++++++++++-----------------
>  block/replication.c          | 12 ++++++++----
>  3 files changed, 29 insertions(+), 27 deletions(-)
> 

* Re: [Qemu-devel] [PATCH v2 18/20] block: Make bdrv_is_allocated() byte-based
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 18/20] block: Make bdrv_is_allocated() byte-based Eric Blake
@ 2017-06-05 22:57   ` John Snow
  0 siblings, 0 replies; 29+ messages in thread
From: John Snow @ 2017-06-05 22:57 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Juan Quintela, Jeff Cody,
	Dr. David Alan Gilbert, Max Reitz, Stefan Hajnoczi



On 05/10/2017 10:20 PM, Eric Blake wrote:
> [...]

Reviewed-by: John Snow <jsnow@redhat.com>

* Re: [Qemu-devel] [PATCH v2 20/20] block: Make bdrv_is_allocated_above() byte-based
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 20/20] block: Make bdrv_is_allocated_above() byte-based Eric Blake
@ 2017-06-05 23:22   ` John Snow
  2017-06-05 23:30   ` John Snow
  1 sibling, 0 replies; 29+ messages in thread
From: John Snow @ 2017-06-05 23:22 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Wen Congyang, Jeff Cody,
	Max Reitz, Stefan Hajnoczi, Xie Changlong



On 05/10/2017 10:20 PM, Eric Blake wrote:
> -                            int64_t sector_num, int nb_sectors, int *pnum);
> +                            int64_t offset, int64_t bytes, int64_t *pnum);

Minor context conflict after this that, for whatever reason, git
could not resolve on its own for me.

* Re: [Qemu-devel] [PATCH v2 20/20] block: Make bdrv_is_allocated_above() byte-based
  2017-05-11  2:20 ` [Qemu-devel] [PATCH v2 20/20] block: Make bdrv_is_allocated_above() byte-based Eric Blake
  2017-06-05 23:22   ` John Snow
@ 2017-06-05 23:30   ` John Snow
  1 sibling, 0 replies; 29+ messages in thread
From: John Snow @ 2017-06-05 23:30 UTC (permalink / raw)
  To: Eric Blake, qemu-devel
  Cc: Kevin Wolf, Fam Zheng, qemu-block, Wen Congyang, Jeff Cody,
	Max Reitz, Stefan Hajnoczi, Xie Changlong



On 05/10/2017 10:20 PM, Eric Blake wrote:
> [...]

Reviewed-by: John Snow <jsnow@redhat.com>
