All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
	qemu-block@nongnu.org, qemu-devel@nongnu.org
Cc: armbru@redhat.com, pbonzini@redhat.com, mreitz@redhat.com,
	kwolf@redhat.com, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH v5 4/6] nbd/server: implement dirty bitmap export
Date: Wed, 20 Jun 2018 13:09:25 -0500	[thread overview]
Message-ID: <d297899a-5019-4d14-37fd-6d35b932d0d8@redhat.com> (raw)
In-Reply-To: <37e9fc1c-9c65-2121-12c9-4b648fea3947@virtuozzo.com>

On 06/20/2018 12:04 PM, Vladimir Sementsov-Ogievskiy wrote:
> 20.06.2018 19:27, Eric Blake wrote:
>> On 06/09/2018 10:17 AM, Vladimir Sementsov-Ogievskiy wrote:
>>> Handle new NBD meta namespace: "qemu", and corresponding queries:
>>> "qemu:dirty-bitmap:<export bitmap name>".
>>>
>>> With new metadata context negotiated, BLOCK_STATUS query will reply
>>> with dirty-bitmap data, converted to extents. New public function
>>> nbd_export_bitmap selects bitmap to export. For now, only one bitmap
>>> may be exported.
>>>

>>  static int nbd_co_send_extents(NBDClient *client, uint64_t handle,
>> -                               NBDExtent *extents, unsigned nb_extents,
>> +                               NBDExtent *extents, unsigned int 
>> nb_extents,
>> +                               uint64_t length, bool last,
>>                                 uint32_t context_id, Error **errp)
>>  {
>>      NBDStructuredMeta chunk;
>> @@ -1890,7 +1897,9 @@ static int nbd_co_send_extents(NBDClient 
>> *client, uint64_t handle,
>>          {.iov_base = extents, .iov_len = nb_extents * 
>> sizeof(extents[0])}
>>      };
>>
>> -    set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_BLOCK_STATUS,
>> +    trace_nbd_co_send_extents(handle, nb_extents, context_id, length, 
>> last);
> 
> hm, length variable added only to be traced.. Ok, but a bit strange.

Yes. It doesn't affect what goes over the wire, but when it comes to 
tracing, knowing the sum of the extents can be quite helpful (especially 
knowing if the server's reply is shorter or longer than the client's 
request, and the fact that when two or more contexts are negotiated by 
the client, the different contexts can have different length replies)


>> +static unsigned int bitmap_to_extents(BdrvDirtyBitmap *bitmap, 
>> uint64_t offset,
>> +                                      uint64_t *length, NBDExtent 
>> *extents,
> 
> length - in-out? worth comment?

Sure.

> 
>> + unsigned int nb_extents,
>> +                                      bool dont_fragment)
>>  {
>>      uint64_t begin = offset, end;
>> -    uint64_t overall_end = offset + length;
>> -    unsigned i = 0;
>> +    uint64_t overall_end = offset + *length;
>> +    unsigned int i = 0;
>>      BdrvDirtyBitmapIter *it;
>>      bool dirty;
>>
>> @@ -1983,23 +1994,25 @@ static unsigned 
>> bitmap_to_extents(BdrvDirtyBitmap *bitmap, uint64_t offset,
>>
>>      bdrv_dirty_bitmap_unlock(bitmap);
>>
>> +    *length = end - begin;
> 
> hm, it will always be zero, as at the end of the previous loop we have 
> "begin = end;"

Ah, then it should be end - offset. Thanks for the careful review.

In fact, since ONLY the final extent can be larger than 2G (all earlier 
extents were short of the overall_end, and the incoming length is 
32-bit), but the NBD protocol says that at most one extent can go beyond 
the original request, we do NOT want to split a >2G extent into multiple 
extents, but rather cap it to just shy of 4G at the granularity offered 
by the bitmap itself.  At which point add_extents() never returns more 
than 1, and can just be inlined.

> 
>> return i;
>>  }
>>
>>  static int nbd_co_send_bitmap(NBDClient *client, uint64_t handle,
>>                                BdrvDirtyBitmap *bitmap, uint64_t offset,
>> -                              uint64_t length, bool dont_fragment,
>> +                              uint32_t length, bool dont_fragment,
>>                                uint32_t context_id, Error **errp)
>>  {
>>      int ret;
>> -    unsigned nb_extents = dont_fragment ? 1 : NBD_MAX_BITMAP_EXTENTS;
>> +    unsigned int nb_extents = dont_fragment ? 1 : 
>> NBD_MAX_BITMAP_EXTENTS;
>>      NBDExtent *extents = g_new(NBDExtent, nb_extents);
>> +    uint64_t final_length = length;
>>
>> -    nb_extents = bitmap_to_extents(bitmap, offset, length, extents, 
>> nb_extents,
>> -                                   dont_fragment);
>> +    nb_extents = bitmap_to_extents(bitmap, offset, &final_length, 
>> extents,
>> +                                   nb_extents, dont_fragment);
>>
>> -    ret = nbd_co_send_extents(client, handle, extents, nb_extents, 
>> context_id,
>> -                              errp);
>> +    ret = nbd_co_send_extents(client, handle, extents, nb_extents,
>> +                              final_length, true, context_id, errp);
> 
> hmm in-out pointer field only to trace final_length? may be just trace 
> it in bitmap_to_extents?

No, because base:allocation also benefits from tracing final_length.

> 
> also, it's not trivial, that the function now sends FLAG_DONE. I think, 
> worth add a comment, or
> a parameter like for nbd_co_send_block_status. It will simplify 
> introducing new contexts in future.

Do we anticipate adding any in the near future? But adding a parameter 
that is always true so that the callsite becomes more obvious on when to 
pass the done flag may indeed make future additions easier to spot in 
one place.

So here's what else I'll squash in:

diff --git i/nbd/server.c w/nbd/server.c
index e84aea6cf03..7a96b296839 100644
--- i/nbd/server.c
+++ w/nbd/server.c
@@ -1881,9 +1881,10 @@ static int 
blockstatus_to_extent_be(BlockDriverState *bs, uint64_t offset,

  /* nbd_co_send_extents
   *
- * @last controls whether NBD_REPLY_FLAG_DONE is sent.
- *
- * @extents should already be in big-endian format.
+ * @length is only for tracing purposes (and may be smaller or larger
+ * than the client's original request). @last controls whether
+ * NBD_REPLY_FLAG_DONE is sent. @extents should already be in
+ * big-endian format.
   */
  static int nbd_co_send_extents(NBDClient *client, uint64_t handle,
                                 NBDExtent *extents, unsigned int 
nb_extents,
@@ -1925,33 +1926,12 @@ static int nbd_co_send_block_status(NBDClient 
*client, uint64_t handle,
                                 context_id, errp);
  }

-/* Set several extents, describing region of given @length with given 
@flags.
- * Do not set more than @nb_extents, return number of set extents.
+/*
+ * Populate @extents from a dirty bitmap. Unless @dont_fragment, the
+ * final extent may exceed the original @length. Store in @length the
+ * byte length encoded (which may be smaller or larger than the
+ * original), and return the number of extents used.
   */
-static unsigned int add_extents(NBDExtent *extents, unsigned nb_extents,
-                                uint64_t length, uint32_t flags)
-{
-    unsigned int i = 0;
-    uint32_t max_extent = 0x80000000U;
-    uint32_t max_extent_be = cpu_to_be32(max_extent);
-    uint32_t flags_be = cpu_to_be32(flags);
-
-    for (i = 0; i < nb_extents && length > max_extent;
-         i++, length -= max_extent)
-    {
-        extents[i].length = max_extent_be;
-        extents[i].flags = flags_be;
-    }
-
-    if (length > 0 && i < nb_extents) {
-        extents[i].length = cpu_to_be32(length);
-        extents[i].flags = flags_be;
-        i++;
-    }
-
-    return i;
-}
-
  static unsigned int bitmap_to_extents(BdrvDirtyBitmap *bitmap, 
uint64_t offset,
                                        uint64_t *length, NBDExtent 
*extents,
                                        unsigned int nb_extents,
@@ -1976,16 +1956,18 @@ static unsigned int 
bitmap_to_extents(BdrvDirtyBitmap *bitmap, uint64_t offset,
              end = bdrv_dirty_iter_next(it);
          }
          if (end == -1) {
-            end = bdrv_dirty_bitmap_size(bitmap);
+            /* Cap to an aligned value < 4G beyond begin. */
+            end = MIN(bdrv_dirty_bitmap_size(bitmap),
+                      begin + 0x100000000ULL -
+                      bdrv_dirty_bitmap_granularity(bitmap));
          }
          if (dont_fragment && end > overall_end) {
-            /* We MUST not exceed requested region if 
NBD_CMD_FLAG_REQ_ONE flag
-             * is set */
              end = overall_end;
          }

-        i += add_extents(extents + i, nb_extents - i, end - begin,
-                         dirty ? NBD_STATE_DIRTY : 0);
+        extents[i].length = cpu_to_be32(end - begin);
+        extents[i].flags = cpu_to_be32(dirty ? NBD_STATE_DIRTY : 0);
+        i++;
          begin = end;
          dirty = !dirty;
      }
@@ -1994,13 +1976,13 @@ static unsigned int 
bitmap_to_extents(BdrvDirtyBitmap *bitmap, uint64_t offset,

      bdrv_dirty_bitmap_unlock(bitmap);

-    *length = end - begin;
+    *length = end - offset;
      return i;
  }

  static int nbd_co_send_bitmap(NBDClient *client, uint64_t handle,
                                BdrvDirtyBitmap *bitmap, uint64_t offset,
-                              uint32_t length, bool dont_fragment,
+                              uint32_t length, bool dont_fragment, bool 
last,
                                uint32_t context_id, Error **errp)
  {
      int ret;
@@ -2012,7 +1994,7 @@ static int nbd_co_send_bitmap(NBDClient *client, 
uint64_t handle,
                                     nb_extents, dont_fragment);

      ret = nbd_co_send_extents(client, handle, extents, nb_extents,
-                              final_length, true, context_id, errp);
+                              final_length, last, context_id, errp);

      g_free(extents);

@@ -2253,7 +2235,7 @@ static coroutine_fn int 
nbd_handle_request(NBDClient *client,
                                           client->exp->export_bitmap,
                                           request->from, request->len,
                                           request->flags & 
NBD_CMD_FLAG_REQ_ONE,
-                                         NBD_META_ID_DIRTY_BITMAP, errp);
+                                         true, 
NBD_META_ID_DIRTY_BITMAP, errp);
                  if (ret < 0) {
                      return ret;
                  }


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

  reply	other threads:[~2018-06-20 18:09 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-09 15:17 [Qemu-devel] [PATCH v5 0/6] NBD export bitmaps Vladimir Sementsov-Ogievskiy
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 1/6] nbd/server: fix trace Vladimir Sementsov-Ogievskiy
2018-06-19 18:39   ` Eric Blake
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 2/6] nbd/server: refactor NBDExportMetaContexts Vladimir Sementsov-Ogievskiy
2018-06-19 19:03   ` Eric Blake
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 3/6] nbd/server: add nbd_meta_empty_or_pattern helper Vladimir Sementsov-Ogievskiy
2018-06-19 20:24   ` Eric Blake
2018-06-20  9:43     ` Vladimir Sementsov-Ogievskiy
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 4/6] nbd/server: implement dirty bitmap export Vladimir Sementsov-Ogievskiy
2018-06-20 11:24   ` Eric Blake
2018-06-20 14:04     ` Vladimir Sementsov-Ogievskiy
2018-06-20 15:43     ` Eric Blake
2018-06-20 15:58       ` Eric Blake
2018-06-20 16:27   ` Eric Blake
2018-06-20 17:04     ` Vladimir Sementsov-Ogievskiy
2018-06-20 18:09       ` Eric Blake [this message]
2018-06-21 10:09         ` Vladimir Sementsov-Ogievskiy
2018-09-14 16:22         ` Vladimir Sementsov-Ogievskiy
2018-11-29  4:34   ` Eric Blake
2019-01-09 19:21   ` Eric Blake
2019-01-10  7:15     ` Eric Blake
2019-01-17 21:09     ` John Snow
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 5/6] qapi: new qmp command nbd-server-add-bitmap Vladimir Sementsov-Ogievskiy
2018-06-20 11:26   ` Eric Blake
2018-06-20 14:13     ` Vladimir Sementsov-Ogievskiy
2018-06-20 18:14       ` Eric Blake
2018-06-21 10:10         ` Vladimir Sementsov-Ogievskiy
2018-06-21 10:23       ` Nikolay Shirokovskiy
2018-06-09 15:17 ` [Qemu-devel] [PATCH v5 6/6] docs/interop: add nbd.txt Vladimir Sementsov-Ogievskiy
2018-06-20 11:33   ` Eric Blake
2018-06-20 14:16     ` Vladimir Sementsov-Ogievskiy
2018-06-20 20:58       ` [Qemu-devel] [Qemu-block] " John Snow
2018-06-21 15:59         ` Vladimir Sementsov-Ogievskiy
2018-06-21 22:10           ` [Qemu-devel] Incremental Backup Status (Was: Re: [Qemu-block] [PATCH v5 6/6] docs/interop: add nbd.txt) John Snow

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d297899a-5019-4d14-37fd-6d35b932d0d8@redhat.com \
    --to=eblake@redhat.com \
    --cc=armbru@redhat.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.