From: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
To: John Snow <jsnow@redhat.com>, qemu-devel@nongnu.org
Cc: kwolf@redhat.com, peter.maydell@linaro.org, quintela@redhat.com,
	dgilbert@redhat.com, stefanha@redhat.com, pbonzini@redhat.com,
	amit.shah@redhat.com, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH RFC v3 08/14] migration: add migration/block-dirty-bitmap.c
Date: Thu, 19 Feb 2015 16:48:50 +0300
Message-ID: <54E5E9C2.80002@parallels.com>
In-Reply-To: <54E524A5.2090407@redhat.com>

On 19.02.2015 02:47, John Snow wrote:
>
>
> On 02/18/2015 09:00 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Live migration of dirty bitmaps. Only named dirty bitmaps, associated with
>> root nodes and non-root named nodes are migrated.
>>
>> If destination qemu is already containing a dirty bitmap with the same name
>> as a migrated bitmap (for the same node), then, if their granularities are
>> the same the migration will be done, otherwise the error will be generated.
>>
>> If destination qemu doesn't contain such bitmap it will be created.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
>> ---
>>   include/migration/block.h      |   1 +
>>   migration/Makefile.objs        |   2 +-
>>   migration/block-dirty-bitmap.c | 708 +++++++++++++++++++++++++++++++++++++++++
>>   vl.c                           |   1 +
>>   4 files changed, 711 insertions(+), 1 deletion(-)
>>   create mode 100644 migration/block-dirty-bitmap.c
>>
>> diff --git a/include/migration/block.h b/include/migration/block.h
>> index ffa8ac0..566bb9f 100644
>> --- a/include/migration/block.h
>> +++ b/include/migration/block.h
>> @@ -14,6 +14,7 @@
>>   #ifndef BLOCK_MIGRATION_H
>>   #define BLOCK_MIGRATION_H
>>
>> +void dirty_bitmap_mig_init(void);
>>   void blk_mig_init(void);
>>   int blk_mig_active(void);
>>   uint64_t blk_mig_bytes_transferred(void);
>> diff --git a/migration/Makefile.objs b/migration/Makefile.objs
>> index d929e96..128612d 100644
>> --- a/migration/Makefile.objs
>> +++ b/migration/Makefile.objs
>> @@ -6,5 +6,5 @@ common-obj-y += xbzrle.o
>>   common-obj-$(CONFIG_RDMA) += rdma.o
>>   common-obj-$(CONFIG_POSIX) += exec.o unix.o fd.o
>>
>> -common-obj-y += block.o
>> +common-obj-y += block.o block-dirty-bitmap.o
>>
>> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
>> new file mode 100644
>> index 0000000..084ba22
>> --- /dev/null
>> +++ b/migration/block-dirty-bitmap.c
>> @@ -0,0 +1,708 @@
>> +/*
>> + * QEMU dirty bitmap migration
>> + *
>> + * Live migration of dirty bitmaps. Only named dirty bitmaps, associated with
>> + * root nodes and non-root named nodes are migrated.
>> + *
>> + * If destination qemu is already containing a dirty bitmap with the same name
>> + * as a migrated bitmap (for the same node), then, if their granularities are
>> + * the same the migration will be done, otherwise the error will be generated.
>> + *
>> + * If destination qemu doesn't contain such bitmap it will be created.
>> + *
>> + * format of migration:
>> + *
>> + * # Header (shared for different chunk types)
>> + * 1 byte: flags
>
> 1, 2, or 4 bytes.
>
>> + * [ 1 byte: node name size ] \  flags & DEVICE_NAME
>> + * [ n bytes: node name     ] /
>> + * [ 1 byte: bitmap name size ] \  flags & BITMAP_NAME
>> + * [ n bytes: bitmap name     ] /
>> + *
>> + * # Start of bitmap migration (flags & START)
>> + * header
>> + * be64: granularity
>> + *
>> + * # Complete of bitmap migration (flags & COMPLETE)
>> + * header
>> + * 1 byte: bitmap enabled flag
>> + *
>> + * # Data chunk of bitmap migration
>> + * header
>> + * be64: start sector
>> + * be32: number of sectors
>> + * [ be64: buffer size  ] \ ! (flags & ZEROES)
>> + * [ n bytes: buffer    ] /
>> + *
>> + * The last chunk in stream should contain flags & EOS. The chunk may skip
>> + * device and/or bitmap names, assuming them to be the same with the previous
>> + * chunk.
>> + *
>> + *
>> + * This file is derived from migration/block.c
>> + *
>> + * Author:
>> + * Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
>> + *
>> + * original copyright message:
>> + * =====================================================================
>> + * Copyright IBM, Corp. 2009
>> + *
>> + * Authors:
>> + *  Liran Schour   <lirans@il.ibm.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2.  See
>> + * the COPYING file in the top-level directory.
>> + *
>> + * Contributions after 2012-01-13 are licensed under the terms of the
>> + * GNU GPL, version 2 or (at your option) any later version.
>> + * =====================================================================
>> + */
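To make the data-chunk layout described in the comment above concrete, here is a minimal sketch that serializes one chunk into a plain byte buffer. It is only an illustration under stated assumptions: encode_data_chunk() and put_be() are hypothetical names (the patch writes through QEMUFile instead), the flags fit in one byte, and the node and bitmap names are omitted, which the format allows when they match the previous chunk. A NULL buffer stands for an all-zero chunk.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Flag values as defined in the patch. */
#define FLAG_ZEROES 0x02
#define FLAG_BITS   0x40

/* Hypothetical helper: store the low nbytes of v in big-endian order. */
static size_t put_be(uint8_t *out, uint64_t v, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++) {
        out[i] = (uint8_t)(v >> (8 * (nbytes - 1 - i)));
    }
    return nbytes;
}

/*
 * Hypothetical encoder for one data chunk without names:
 *   1 byte flags, be64 start sector, be32 number of sectors,
 *   then be64 buffer size + buffer unless ZEROES is set.
 * The caller must provide an output buffer that is large enough.
 * Returns the number of bytes written.
 */
static size_t encode_data_chunk(uint8_t *out, uint64_t start_sector,
                                uint32_t nr_sectors,
                                const uint8_t *buf, uint64_t buf_size)
{
    size_t n = 0;
    uint8_t flags = FLAG_BITS;

    if (buf == NULL) {
        flags |= FLAG_ZEROES;
    }

    out[n++] = flags;
    n += put_be(out + n, start_sector, 8);
    n += put_be(out + n, nr_sectors, 4);

    if (!(flags & FLAG_ZEROES)) {
        n += put_be(out + n, buf_size, 8);
        memcpy(out + n, buf, buf_size);
        n += buf_size;
    }
    return n;
}
```

With this layout a zero chunk costs 13 bytes on the wire, and a non-zero chunk costs 21 bytes plus the serialized bitmap data.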
>> +
>> +#include "block/block.h"
>> +#include "block/block_int.h"
>> +#include "sysemu/block-backend.h"
>> +#include "qemu/main-loop.h"
>> +#include "qemu/error-report.h"
>> +#include "migration/block.h"
>> +#include "migration/migration.h"
>> +#include "qemu/hbitmap.h"
>> +#include <assert.h>
>> +
>> +#define CHUNK_SIZE                       (1 << 20)
>> +
>> +/* Flags occupy from one to four bytes. In all but one the 7-th (EXTRA_FLAGS)
>> + * bit should be set. */
>> +#define DIRTY_BITMAP_MIG_FLAG_EOS           0x01
>> +#define DIRTY_BITMAP_MIG_FLAG_ZEROES        0x02
>> +#define DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME   0x04
>> +#define DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME   0x08
>> +#define DIRTY_BITMAP_MIG_FLAG_START         0x10
>> +#define DIRTY_BITMAP_MIG_FLAG_COMPLETE      0x20
>> +#define DIRTY_BITMAP_MIG_FLAG_BITS          0x40
>> +
>> +#define DIRTY_BITMAP_MIG_EXTRA_FLAGS        0x80
>> +#define DIRTY_BITMAP_MIG_FLAGS_SIZE_16      0x8000
>> +#define DIRTY_BITMAP_MIG_FLAGS_SIZE_32      0x8080
>> +
>> +#define DEBUG_DIRTY_BITMAP_MIGRATION
>> +
>> +#ifdef DEBUG_DIRTY_BITMAP_MIGRATION
>> +#define DPRINTF(fmt, ...) \
>> +    do { printf("dirty_migration: " fmt, ## __VA_ARGS__); } while (0)
>> +#else
>> +#define DPRINTF(fmt, ...) \
>> +    do { } while (0)
>> +#endif
>> +
>
> Take a look at hw/ide/ahci.c, which has a DPRINTF macro defined so 
> that the debugging can be turned off, but the print statements etc. 
> will still be compiled and typechecked.
>
> The blurb looks like this:
>
> #define DEBUG_AHCI 0
>
> #define DPRINTF(port, fmt, ...) \
> do { \
>     if (DEBUG_AHCI) { \
>         fprintf(stderr, "ahci: %s: [%d] ", __func__, port); \
>         fprintf(stderr, fmt, ## __VA_ARGS__); \
>     } \
> } while (0)
>
> When it comes time to submit this as non-RFC, leave 
> DEBUG_DIRTY_BITMAP_MIGRATION defined to 0.
>
Ok.
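For reference, a minimal sketch of how that could look in this file, following the ahci.c pattern quoted above. The counted() and dprintf_demo() helpers are hypothetical and exist only to show that the DPRINTF arguments are still compiled and type-checked, but never evaluated, while the switch is 0.

```c
#include <stdio.h>

/* Debug switch: 0 disables the output but keeps the calls compiled. */
#define DEBUG_DIRTY_BITMAP_MIGRATION 0

#define DPRINTF(fmt, ...) \
    do { \
        if (DEBUG_DIRTY_BITMAP_MIGRATION) { \
            fprintf(stderr, "dirty_migration: " fmt, ## __VA_ARGS__); \
        } \
    } while (0)

/* Hypothetical helper: counts how often a DPRINTF argument is evaluated. */
static int eval_count;

static int counted(int v)
{
    eval_count++;
    return v;
}

/* Returns the number of argument evaluations after one DPRINTF call. */
static int dprintf_demo(void)
{
    DPRINTF("value: %d\n", counted(3));
    return eval_count;
}
```

A format/argument mismatch in such a call is still diagnosed by the compiler even though nothing is printed.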
>> +typedef struct DirtyBitmapMigBitmapState {
>> +    /* Written during setup phase. */
>> +    BlockDriverState *bs;
>> +    const char *node_name;
>> +    BdrvDirtyBitmap *bitmap;
>> +    HBitmap *meta_bitmap;
>> +    uint64_t total_sectors;
>> +    uint64_t sectors_per_chunk;
>> +    QSIMPLEQ_ENTRY(DirtyBitmapMigBitmapState) entry;
>> +
>> +    /* For bulk phase. */
>> +    bool bulk_completed;
>> +    uint64_t cur_sector;
>> +
>> +    /* For dirty phase. */
>> +    HBitmapIter iter_dirty;
>> +} DirtyBitmapMigBitmapState;
>> +
>> +typedef struct DirtyBitmapMigState {
>> +    QSIMPLEQ_HEAD(dbms_list, DirtyBitmapMigBitmapState) dbms_list;
>> +
>> +    bool bulk_completed;
>> +
>> +    /* for send_bitmap() */
>
> send_bitmap_bits, now.
>
>> +    BlockDriverState *prev_bs;
>> +    BdrvDirtyBitmap *prev_bitmap;
>> +} DirtyBitmapMigState;
>> +
>> +typedef struct DirtyBitmapLoadState {
>> +    uint32_t flags;
>> +    char node_name[256];
>> +    char bitmap_name[256];
>> +    BlockDriverState *bs;
>> +    BdrvDirtyBitmap *bitmap;
>> +} DirtyBitmapLoadState;
>> +
>> +static DirtyBitmapMigState dirty_bitmap_mig_state;
>> +
>> +static uint32_t qemu_get_flags(QEMUFile *f)
>> +{
>> +    uint8_t flags = qemu_get_byte(f);
>> +    if (flags & DIRTY_BITMAP_MIG_EXTRA_FLAGS) {
>> +        flags = flags << 8 | qemu_get_byte(f);
>> +        if (flags & DIRTY_BITMAP_MIG_EXTRA_FLAGS) {
>> +            flags = flags << 16 | qemu_get_be16(f);
>> +        }
>> +    }
>> +
>> +    return flags;
>> +}
>> +
>> +static void qemu_put_flags(QEMUFile *f, uint32_t flags)
>> +{
>> +    if (!(flags & 0xffffff00)) {
>> +        qemu_put_byte(f, flags);
>> +        return;
>> +    }
>> +
>> +    if (!(flags & 0xffff0000)) {
>> +        qemu_put_be16(f, flags | DIRTY_BITMAP_MIG_FLAGS_SIZE_16);
>> +        return;
>> +    }
>> +
>> +    qemu_put_be32(f, flags | DIRTY_BITMAP_MIG_FLAGS_SIZE_32);
>> +}
>> +
>
> This will give us breathing room for sure :)
>
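To make the size discussion concrete, here is a self-contained sketch of a 1-, 2- or 4-byte flags encoding over a plain byte buffer. It is not the patch's code: put_flags() and get_flags() are hypothetical names, the continuation bit (0x80) is kept out of the flag value itself, and the decoder accumulates into a uint32_t so that no bits are lost when the later bytes are shifted in. The continuation bit always sits in the byte that is read first, so the reader can decide how much more to read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MORE 0x80  /* continuation bit in the first and second byte */

/* Hypothetical encoder: returns the number of bytes written (1, 2 or 4). */
static size_t put_flags(uint8_t *buf, uint32_t flags)
{
    assert(flags < (UINT32_C(1) << 30));   /* 7 + 7 + 16 payload bits */

    if (flags < (UINT32_C(1) << 7)) {
        buf[0] = (uint8_t)flags;
        return 1;
    }
    if (flags < (UINT32_C(1) << 14)) {
        buf[0] = (uint8_t)(MORE | (flags >> 7));
        buf[1] = (uint8_t)(flags & 0x7f);
        return 2;
    }
    buf[0] = (uint8_t)(MORE | (flags >> 23));
    buf[1] = (uint8_t)(MORE | ((flags >> 16) & 0x7f));
    buf[2] = (uint8_t)(flags >> 8);
    buf[3] = (uint8_t)flags;
    return 4;
}

/* Hypothetical decoder: returns the number of bytes consumed. */
static size_t get_flags(const uint8_t *buf, uint32_t *flags)
{
    uint32_t v = buf[0] & 0x7f;

    if (!(buf[0] & MORE)) {
        *flags = v;
        return 1;
    }
    v = (v << 7) | (buf[1] & 0x7f);
    if (!(buf[1] & MORE)) {
        *flags = v;
        return 2;
    }
    *flags = (v << 16) | ((uint32_t)buf[2] << 8) | buf[3];
    return 4;
}
```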
>> +/* read name from qemu file:
>> + * format:
>> + * 1 byte : len = name length (<256)
>> + * len bytes : name without last zero byte
>> + *
>> + * name should point to the buffer >= 256 bytes length
>> + */
>> +static char *qemu_get_string(QEMUFile *f, char *name)
>> +{
>> +    int len = qemu_get_byte(f);
>> +    qemu_get_buffer(f, (uint8_t *)name, len);
>> +    name[len] = '\0';
>> +
>> +    DPRINTF("get name: %d %s\n", len, name);
>> +
>> +    return name;
>> +}
>> +
>> +/* write name to qemu file:
>> + * format:
>> + * same as for qemu_get_string
>> + *
>> + * maximum name length is 255
>> + */
>> +static void qemu_put_string(QEMUFile *f, const char *name)
>> +{
>> +    int len = strlen(name);
>> +
>> +    DPRINTF("put name: %d %s\n", len, name);
>> +
>> +    assert(len < 256);
>> +    qemu_put_byte(f, len);
>> +    qemu_put_buffer(f, (const uint8_t *)name, len);
>> +}
>> +
>
> Thanks, sorry for the nitpicking.
No problem, I like good naming too, but true idea doesn't come every time.
>
>> +static void send_bitmap_header(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
>> +                               uint32_t additional_flags)
>> +{
>> +    BlockDriverState *bs = dbms->bs;
>> +    BdrvDirtyBitmap *bitmap = dbms->bitmap;
>> +    uint32_t flags = additional_flags;
>> +
>> +    if (bs != dirty_bitmap_mig_state.prev_bs) {
>> +        dirty_bitmap_mig_state.prev_bs = bs;
>> +        flags |= DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME;
>> +    }
>> +
>> +    if (bitmap != dirty_bitmap_mig_state.prev_bitmap) {
>> +        dirty_bitmap_mig_state.prev_bitmap = bitmap;
>> +        flags |= DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME;
>> +    }
>> +
>> +    qemu_put_flags(f, flags);
>> +
>> +    if (flags & DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME) {
>> +        qemu_put_string(f, dbms->node_name);
>> +    }
>> +
>> +    if (flags & DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME) {
>> +        qemu_put_string(f, bdrv_dirty_bitmap_name(bitmap));
>> +    }
>> +}
>> +
>> +static void send_bitmap_start(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
>> +{
>> +    send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
>> +    qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
>> +}
>> +
>> +static void send_bitmap_complete(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
>> +{
>> +    send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
>> +    qemu_put_byte(f, bdrv_dirty_bitmap_enabled(dbms->bitmap));
>> +}
>> +
>> +static void send_bitmap_bits(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
>> +                             uint64_t start_sector, uint32_t nr_sectors)
>> +{
>> +    /* align for buffer_is_zero() */
>> +    uint64_t align = 4 * sizeof(long);
>> +    uint64_t buf_size =
>> +        (bdrv_dirty_bitmap_data_size(dbms->bitmap, nr_sectors) + align - 1) &
>> +        ~(align - 1);
>> +    uint8_t *buf = g_malloc0(buf_size);
>> +    uint32_t flags = DIRTY_BITMAP_MIG_FLAG_BITS;
>> +
>> +    bdrv_dirty_bitmap_serialize_part(dbms->bitmap, buf,
>> +                                     start_sector, nr_sectors);
>> +
>> +    if (buffer_is_zero(buf, buf_size)) {
>> +        g_free(buf);
>> +        buf = NULL;
>> +        flags |= DIRTY_BITMAP_MIG_FLAG_ZEROES;
>> +    }
>> +
>> +    DPRINTF("Enter send_bitmap"
>> +            "\n   flags:        %x"
>> +            "\n   start_sector: %" PRIu64
>> +            "\n   nr_sectors:   %" PRIu32
>> +            "\n   data_size:    %" PRIu64 "\n",
>> +            flags, start_sector, nr_sectors, buf_size);
>> +
>> +    send_bitmap_header(f, dbms, flags);
>> +
>> +    qemu_put_be64(f, start_sector);
>> +    qemu_put_be32(f, nr_sectors);
>> +
>> +    /* if a block is zero we need to flush here since the network
>> +     * bandwidth is now a lot higher than the storage device bandwidth.
>> +     * thus if we queue zero blocks we slow down the migration.
>> +     * also, skip writing block when migrate only dirty bitmaps. */
>> +    if (flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
>> +        qemu_fflush(f);
>> +        return;
>> +    }
>> +
>> +    qemu_put_be64(f, buf_size);
>> +    qemu_put_buffer(f, buf, buf_size);
>> +    g_free(buf);
>> +}
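The buffer size computation at the top of send_bitmap_bits() rounds the serialized size up to a multiple of align, which only works when align is a power of two (4 * sizeof(long) is 32 on an LP64 host). A small standalone sketch of the same rounding, with round_up_pow2() as a hypothetical name:

```c
#include <assert.h>
#include <stdint.h>

/* Round size up to the next multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t size, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}
```

A size that is already aligned is returned unchanged, so the padding is at most align - 1 bytes per chunk.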
>> +
>> +
>> +/* Called with iothread lock taken.  */
>> +
>> +static void set_dirty_tracking(void)
>> +{
>> +    DirtyBitmapMigBitmapState *dbms;
>> +
>> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
>> +        dbms->meta_bitmap =
>> +            bdrv_create_meta_bitmap(dbms->bitmap, CHUNK_SIZE);
>> +    }
>> +}
>> +
>> +static void unset_dirty_tracking(void)
>> +{
>> +    DirtyBitmapMigBitmapState *dbms;
>> +
>> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
>> +        bdrv_release_meta_bitmap(dbms->bitmap);
>> +    }
>> +}
>> +
>> +static void init_dirty_bitmap_migration(QEMUFile *f)
>> +{
>> +    BlockDriverState *bs;
>> +    BdrvDirtyBitmap *bitmap;
>> +    DirtyBitmapMigBitmapState *dbms;
>> +
>> +    dirty_bitmap_mig_state.bulk_completed = false;
>> +    dirty_bitmap_mig_state.prev_bs = NULL;
>> +    dirty_bitmap_mig_state.prev_bitmap = NULL;
>> +
>> +    for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
>> +        for (bitmap = bdrv_next_dirty_bitmap(bs, NULL); bitmap;
>> +             bitmap = bdrv_next_dirty_bitmap(bs, bitmap)) {
>> +            if (!bdrv_dirty_bitmap_name(bitmap)) {
>> +                continue;
>> +            }
>> +
>> +            if (!bdrv_get_node_name(bs) && blk_bs(bs->blk) != bs) {
>> +                /* not named non-root node */
>> +                continue;
>> +            }
>> +
>> +            dbms = g_new0(DirtyBitmapMigBitmapState, 1);
>> +            dbms->bs = bs;
>> +            dbms->node_name = bdrv_get_node_name(bs);
>> +            if (!dbms->node_name || dbms->node_name[0] == '\0') {
>> +                dbms->node_name = bdrv_get_device_name(bs);
>> +            }
>> +            dbms->bitmap = bitmap;
>> +            dbms->total_sectors = bdrv_nb_sectors(bs);
>> +            dbms->sectors_per_chunk = (uint64_t)CHUNK_SIZE * 8 *
>> +                bdrv_dirty_bitmap_granularity(dbms->bitmap) >> BDRV_SECTOR_BITS;
>
> This calculation is re-used from the meta_bitmap granularity 
> calculation. Might be worth factoring out so that if we tweak the 
> values for performance reasons, we don't have to remember all the 
> places we need to change it.
>
> If your answer to my earlier question about CHUNK_SIZE is "We don't 
> need to change it," then disregard.
OK. I'll move the 'dbms->sectors_per_chunk =' assignment into
set_dirty_tracking() and use hbitmap_granularity() there.
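As a rough sketch of what a factored-out helper for this calculation might look like (sectors_per_chunk() is a hypothetical name, and the granularity is assumed to be in bytes, as the shift by BDRV_SECTOR_BITS suggests): one chunk carries CHUNK_SIZE bytes of bitmap data, i.e. CHUNK_SIZE * 8 bits, and each bit covers one granularity unit.

```c
#include <stdint.h>

#define CHUNK_SIZE       (1 << 20)  /* bytes of bitmap data per chunk */
#define BDRV_SECTOR_BITS 9          /* 512-byte sectors */

/* Hypothetical helper: number of disk sectors described by one chunk. */
static uint64_t sectors_per_chunk(uint64_t granularity_bytes)
{
    return ((uint64_t)CHUNK_SIZE * 8 * granularity_bytes) >> BDRV_SECTOR_BITS;
}
```

For a 64 KiB granularity this gives 2^30 sectors per chunk. With these constants the result reaches 2^32 at a 256 KiB granularity, so a 32-bit sector count would need some care there.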
>
>> +
>> +            QSIMPLEQ_INSERT_TAIL(&dirty_bitmap_mig_state.dbms_list,
>> +                                 dbms, entry);
>> +        }
>> +    }
>> +}
>> +
>> +/* Called with no lock taken.  */
>> +static void bulk_phase_send_chunk(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
>> +{
>> +    uint32_t nr_sectors = MIN(dbms->total_sectors - dbms->cur_sector,
>> +                             dbms->sectors_per_chunk);
>> +
>> +    send_bitmap_bits(f, dbms, dbms->cur_sector, nr_sectors);
>> +
>> +    dbms->cur_sector += nr_sectors;
>> +    if (dbms->cur_sector >= dbms->total_sectors) {
>> +        dbms->bulk_completed = true;
>> +    }
>> +}
>> +
>> +/* Called with no lock taken.  */
>> +static void bulk_phase(QEMUFile *f, bool limit)
>> +{
>> +    DirtyBitmapMigBitmapState *dbms;
>> +
>> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
>> +        while (!dbms->bulk_completed) {
>> +            bulk_phase_send_chunk(f, dbms);
>> +            if (limit && qemu_file_rate_limit(f)) {
>> +                return;
>> +            }
>> +        }
>> +    }
>> +
>> +    dirty_bitmap_mig_state.bulk_completed = true;
>> +}
>> +
>> +static void blk_mig_reset_dirty_cursor(void)
>> +{
>> +    DirtyBitmapMigBitmapState *dbms;
>> +
>> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
>> +        hbitmap_iter_init(&dbms->iter_dirty, dbms->meta_bitmap, 0);
>> +    }
>> +}
>> +
>> +/* Called with iothread lock taken. */
>> +static bool dirty_phase_send_chunk(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
>> +{
>> +    uint32_t nr_sectors;
>> +    size_t old_pos = dbms->iter_dirty.pos;
>> +    int64_t cur = hbitmap_iter_next(&dbms->iter_dirty);
>> +
>> +    /* restart search from the beginning */
>> +    if (old_pos && cur == -1) {
>> +        hbitmap_iter_init(&dbms->iter_dirty, dbms->meta_bitmap, 0);
>> +        cur = hbitmap_iter_next(&dbms->iter_dirty);
>> +    }
>> +
>> +    if (cur == -1) {
>> +        hbitmap_iter_init(&dbms->iter_dirty, dbms->meta_bitmap, 0);
>> +        return false;
>> +    }
>> +
>> +    nr_sectors = MIN(dbms->total_sectors - cur, dbms->sectors_per_chunk);
>> +    send_bitmap_bits(f, dbms, cur, nr_sectors);
>> +    hbitmap_reset(dbms->meta_bitmap, cur, dbms->sectors_per_chunk);
>> +    cur += nr_sectors;
>
> Dead assignment to cur, function is fine otherwise.
>
>> +
>> +    return true;
>> +}
>> +
>> +/* Called with iothread lock taken. */
>> +static void dirty_phase(QEMUFile *f, bool limit)
>> +{
>> +    DirtyBitmapMigBitmapState *dbms;
>> +
>> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
>> +        while (dirty_phase_send_chunk(f, dbms)) {
>> +            if (limit && qemu_file_rate_limit(f)) {
>> +                return;
>> +            }
>> +        }
>> +    }
>> +}
>> +
>> +
>> +/* Called with iothread lock taken.  */
>> +static void dirty_bitmap_mig_cleanup(void)
>> +{
>> +    DirtyBitmapMigBitmapState *dbms;
>> +
>> +    unset_dirty_tracking();
>> +
>> +    while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
>> +        QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
>> +        g_free(dbms);
>> +    }
>> +}
>> +
>> +static void dirty_bitmap_migration_cancel(void *opaque)
>> +{
>> +    dirty_bitmap_mig_cleanup();
>> +}
>> +
>> +static int dirty_bitmap_save_iterate(QEMUFile *f, void *opaque)
>> +{
>> +    DPRINTF("Enter save live iterate\n");
>> +
>> +    if (dirty_bitmap_mig_state.bulk_completed) {
>> +        qemu_mutex_lock_iothread();
>> +        dirty_phase(f, true);
>> +        qemu_mutex_unlock_iothread();
>> +    } else {
>> +        bulk_phase(f, true);
>> +    }
>> +
>> +    qemu_put_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
>> +
>> +    return dirty_bitmap_mig_state.bulk_completed;
>> +}
>> +
>> +/* Called with iothread lock taken.  */
>> +
>> +static int dirty_bitmap_save_complete(QEMUFile *f, void *opaque)
>> +{
>> +    DirtyBitmapMigBitmapState *dbms;
>> +    DPRINTF("Enter save live complete\n");
>> +
>> +    if (!dirty_bitmap_mig_state.bulk_completed) {
>> +        bulk_phase(f, false);
>> +    }
>> +
>> +    blk_mig_reset_dirty_cursor();
>> +    dirty_phase(f, false);
>> +
>> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
>> +        send_bitmap_complete(f, dbms);
>> +    }
>> +
>> +    qemu_put_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
>> +
>> +    DPRINTF("Dirty bitmaps migration completed\n");
>> +
>> +    dirty_bitmap_mig_cleanup();
>> +    return 0;
>> +}
>> +
>> +static uint64_t dirty_bitmap_save_pending(QEMUFile *f, void *opaque,
>> +                                          uint64_t max_size)
>> +{
>> +    DirtyBitmapMigBitmapState *dbms;
>> +    uint64_t pending = 0;
>> +
>> +    qemu_mutex_lock_iothread();
>> +
>> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
>> +        uint64_t sectors = hbitmap_count(dbms->meta_bitmap);
>> +        if (!dbms->bulk_completed) {
>> +            sectors += dbms->total_sectors - dbms->cur_sector;
>> +        }
>> +        pending += bdrv_dirty_bitmap_data_size(dbms->bitmap, sectors);
>> +    }
>> +
>> +    qemu_mutex_unlock_iothread();
>> +
>> +    DPRINTF("Enter save live pending %" PRIu64 ", max: %" PRIu64 "\n",
>> +            pending, max_size);
>> +    return pending;
>> +}
>> +
>> +/* First occurrence of this bitmap. It should be created if doesn't exist */
>> +static int dirty_bitmap_load_start(QEMUFile *f, DirtyBitmapLoadState *s)
>> +{
>> +    uint32_t granularity = qemu_get_be32(f);
>> +    if (!s->bitmap) {
>> +        Error *local_err = NULL;
>> +        s->bitmap = bdrv_create_dirty_bitmap(s->bs, granularity,
>> +                                             s->bitmap_name, &local_err);
>> +        if (!s->bitmap) {
>> +            error_report("%s", error_get_pretty(local_err));
>> +            error_free(local_err);
>> +            return -EINVAL;
>> +        }
>> +    } else {
>> +        uint32_t dest_granularity =
>> +            bdrv_dirty_bitmap_granularity(s->bitmap);
>> +        if (dest_granularity != granularity) {
>> +            fprintf(stderr,
>> +                    "Error: "
>> +                    "Migrated bitmap granularity (%" PRIu32 ") "
>> +                    "doesn't match the destination bitmap '%s' "
>> +                    "granularity (%" PRIu32 ")\n",
>> +                    granularity,
>> +                    bdrv_dirty_bitmap_name(s->bitmap),
>> +                    dest_granularity);
>> +            return -EINVAL;
>> +        }
>> +    }
>> +
>> +    bdrv_disable_dirty_bitmap(s->bitmap);
>
> This will definitely keep people from using it while it's in an 
> inconsistent state.
Oh, it's more interesting than that. Not people: QEMU itself! If we are
migrating a disk in parallel with its dirty bitmap, every write of a
migrated disk block will spoil our dirty bitmap (unless it is disabled).
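A toy model of the effect, purely for illustration: ToyBitmap and the toy_* functions are hypothetical and not part of the block layer. Writes mark a chunk dirty only while tracking is enabled, while incoming bitmap data is applied regardless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct ToyBitmap {
    bool enabled;
    uint64_t bits;   /* one bit per chunk, 64 chunks */
} ToyBitmap;

/* A write on the destination: marks the chunk only if tracking is enabled. */
static void toy_write(ToyBitmap *b, unsigned chunk)
{
    assert(chunk < 64);
    if (b->enabled) {
        b->bits |= UINT64_C(1) << chunk;
    }
}

/* Incoming bitmap data: applied regardless of the enabled state. */
static void toy_deserialize(ToyBitmap *b, uint64_t bits)
{
    b->bits |= bits;
}

/* Simulate loading a bitmap while disk migration writes chunk 7. */
static uint64_t toy_load(bool disable_first)
{
    ToyBitmap b = { .enabled = true, .bits = 0 };

    if (disable_first) {
        b.enabled = false;       /* what load_start does in the patch */
    }
    toy_deserialize(&b, 0x5);    /* source reports chunks 0 and 2 dirty */
    toy_write(&b, 7);            /* block migration writes chunk 7 */
    b.enabled = true;            /* re-enabled on COMPLETE */
    return b.bits;
}
```

With the bitmap disabled during the load, the destination ends up with exactly the source's bits (0x5); without it, the migration write leaks in as an extra dirty chunk (0x85).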
>
>> +
>> +    return 0;
>> +}
>> +
>> +static void dirty_bitmap_load_complete(QEMUFile *f, DirtyBitmapLoadState *s)
>> +{
>> +    bool enabled;
>> +
>> +    bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
>> +    DPRINTF("enab\n");
>> +
>> +    enabled = qemu_get_byte(f);
>> +    if (enabled) {
>> +        bdrv_enable_dirty_bitmap(s->bitmap);
>> +    }
>> +}
>> +
>> +static int dirty_bitmap_load_bits(QEMUFile *f, DirtyBitmapLoadState *s)
>> +{
>> +    uint64_t first_sector = qemu_get_be64(f);
>> +    uint32_t nr_sectors = qemu_get_be32(f);
>> +    DPRINTF("chunk: %lu %u\n", first_sector, nr_sectors);
>> +
>> +
>> +    if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
>> +        DPRINTF("   - zeroes\n");
>> +        bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_sector,
>> +                                             nr_sectors);
>> +    } else {
>> +        uint8_t *buf;
>> +        uint64_t buf_size = qemu_get_be64(f);
>> +        uint64_t needed_size =
>> +            bdrv_dirty_bitmap_data_size(s->bitmap, nr_sectors);
>> +
>> +        if (needed_size > buf_size) {
>> +            fprintf(stderr,
>> +                    "Error: Migrated bitmap granularity doesn't "
>> +                    "match the destination bitmap '%s' granularity\n",
>> +                    bdrv_dirty_bitmap_name(s->bitmap));
>> +            return -EINVAL;
>> +        }
>> +
>> +        buf = g_malloc(buf_size);
>> +        qemu_get_buffer(f, buf, buf_size);
>> +        bdrv_dirty_bitmap_deserialize_part(s->bitmap, buf,
>> +                                           first_sector,
>> +                                           nr_sectors);
>> +        g_free(buf);
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int dirty_bitmap_load_header(QEMUFile *f, DirtyBitmapLoadState *s)
>> +{
>> +    Error *local_err = NULL;
>> +    s->flags = qemu_get_flags(f);
>> +    DPRINTF("flags: %x\n", s->flags);
>> +
>> +    if (s->flags & DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME) {
>> +        qemu_get_string(f, s->node_name);
>> +        s->bs = bdrv_lookup_bs(s->node_name, s->node_name, &local_err);
>> +        if (!s->bs) {
>> +            error_report("%s", error_get_pretty(local_err));
>> +            error_free(local_err);
>> +            return -EINVAL;
>> +        }
>> +    } else if (!s->bs) {
>> +        fprintf(stderr, "Error: block device name is not set\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    if (s->flags & DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME) {
>> +        qemu_get_string(f, s->bitmap_name);
>> +        s->bitmap = bdrv_find_dirty_bitmap(s->bs, s->bitmap_name);
>> +
>> +        /* bitmap may be NULL here, it wouldn't be an error if it is the
>> +         * first occurrence of the bitmap */
>> +        if (!s->bitmap && !(s->flags & DIRTY_BITMAP_MIG_FLAG_START)) {
>> +            fprintf(stderr, "Error: unknown dirty bitmap "
>> +                    "'%s' for block device '%s'\n",
>> +                    s->bitmap_name, s->node_name);
>> +            return -EINVAL;
>> +        }
>> +    } else if (!s->bitmap) {
>> +        fprintf(stderr, "Error: block device name is not set\n");
>> +        return -EINVAL;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
>> +{
>> +    static DirtyBitmapLoadState s;
>> +
>> +    int ret = 0;
>> +
>> +    DPRINTF("load start\n");
>> +
>> +    do {
>> +        dirty_bitmap_load_header(f, &s);
>> +
>> +        if (s.flags & DIRTY_BITMAP_MIG_FLAG_START) {
>> +            ret = dirty_bitmap_load_start(f, &s);
>> +        } else if (s.flags & DIRTY_BITMAP_MIG_FLAG_COMPLETE) {
>> +            dirty_bitmap_load_complete(f, &s);
>> +        } else if (s.flags & DIRTY_BITMAP_MIG_FLAG_BITS) {
>> +            ret = dirty_bitmap_load_bits(f, &s);
>> +        }
>> +
>> +        DPRINTF("ret: %d\n", ret);
>> +        if (!ret) {
>> +            ret = qemu_file_get_error(f);
>> +        }
>> +
>> +        DPRINTF("ret: %d\n", ret);
>> +        if (ret) {
>> +            return ret;
>> +        }
>> +    } while (!(s.flags & DIRTY_BITMAP_MIG_FLAG_EOS));
>> +
>> +    DPRINTF("load finish\n");
>> +    return 0;
>> +}
>> +
>
> Great, thanks. These functions read way nicer now.
>
>> +static bool dirty_bitmap_is_active(void *opaque)
>> +{
>> +    return migrate_dirty_bitmaps();
>> +}
>> +
>> +static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
>> +{
>> +    DirtyBitmapMigBitmapState *dbms = NULL;
>> +    init_dirty_bitmap_migration(f);
>> +
>> +    qemu_mutex_lock_iothread();
>> +    /* start track dirtyness of dirty bitmaps */
>
> 'dirtiness'
codespell missed it =(
>
>> +    set_dirty_tracking();
>> +    qemu_mutex_unlock_iothread();
>> +
>> +    blk_mig_reset_dirty_cursor();
>> +
>> +    QSIMPLEQ_FOREACH(dbms, &dirty_bitmap_mig_state.dbms_list, entry) {
>> +        send_bitmap_start(f, dbms);
>> +    }
>> +    qemu_put_flags(f, DIRTY_BITMAP_MIG_FLAG_EOS);
>> +
>> +    return 0;
>> +}
>> +
>> +static SaveVMHandlers savevm_block_handlers = {
>> +    .save_live_setup = dirty_bitmap_save_setup,
>> +    .save_live_iterate = dirty_bitmap_save_iterate,
>> +    .save_live_complete = dirty_bitmap_save_complete,
>> +    .save_live_pending = dirty_bitmap_save_pending,
>> +    .load_state = dirty_bitmap_load,
>> +    .cancel = dirty_bitmap_migration_cancel,
>> +    .is_active = dirty_bitmap_is_active,
>> +};
>> +
>> +void dirty_bitmap_mig_init(void)
>> +{
>> +    QSIMPLEQ_INIT(&dirty_bitmap_mig_state.dbms_list);
>> +
>> +    register_savevm_live(NULL, "dirty-bitmap", 0, 1, &savevm_block_handlers,
>> +                         &dirty_bitmap_mig_state);
>> +}
>> diff --git a/vl.c b/vl.c
>> index 8c8f142..723a362 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -4147,6 +4147,7 @@ int main(int argc, char **argv, char **envp)
>>
>>       blk_mig_init();
>>       ram_mig_init();
>> +    dirty_bitmap_mig_init();
>>
>>       /* If the currently selected machine wishes to override the units-per-bus
>>        * property of its default HBA interface type, do so now. */
>>
>
> Looks good. Holding the R-B pending discussion of the implications of 
> chunk sizes.
>
> --js

Thanks,

-- 
Best regards,
Vladimir
