From: "Denis V. Lunev" <den@openvz.org>
To: Fam Zheng <famz@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
	Evgeny Yakovlev <eyakovlev@virtuozzo.com>,
	Kevin Wolf <kwolf@redhat.com>, Max Reitz <mreitz@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>,
	John Snow <jsnow@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v4 1/3] block: ignore flush requests when storage is clean
Date: Tue, 28 Jun 2016 12:10:50 +0300
Message-ID: <57723F1A.8020105@openvz.org>
In-Reply-To: <20160628012716.GB22237@ad.usersys.redhat.com>

On 06/28/2016 04:27 AM, Fam Zheng wrote:
> On Mon, 06/27 17:47, Denis V. Lunev wrote:
>> From: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
>>
>> Some guests (win2008 server for example) do a lot of unnecessary
>> flushing when underlying media has not changed. This adds additional
>> overhead on host when calling fsync/fdatasync.
>>
>> This change introduces a dirty flag in BlockDriverState which is set
>> in bdrv_set_dirty and is checked in bdrv_co_flush. This allows us to
>> avoid unnecessary flushing when storage is clean.
>>
>> The problem with excessive flushing was found by a performance test
>> which does parallel directory tree creation (from 2 processes).
>> Results improved from 0.424 loops/sec to 0.432 loops/sec.
>> Each loop creates 10^3 directories with 10 files in each.
>>
>> Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Max Reitz <mreitz@redhat.com>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Fam Zheng <famz@redhat.com>
>> CC: John Snow <jsnow@redhat.com>
>> ---
>>   block.c                   |  1 +
>>   block/dirty-bitmap.c      |  3 +++
>>   block/io.c                | 19 +++++++++++++++++++
>>   include/block/block_int.h |  1 +
>>   4 files changed, 24 insertions(+)
>>
>> diff --git a/block.c b/block.c
>> index 947df29..68ae3a0 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -2581,6 +2581,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
>>           ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
>>           bdrv_dirty_bitmap_truncate(bs);
>>           bdrv_parent_cb_resize(bs);
>> +        bs->dirty = true; /* file node sync is needed after truncate */
>>       }
>>       return ret;
>>   }
>> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
>> index 4902ca5..54e0413 100644
>> --- a/block/dirty-bitmap.c
>> +++ b/block/dirty-bitmap.c
>> @@ -370,6 +370,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
>>           }
>>           hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
>>       }
>> +
>> +    /* Set global block driver dirty flag even if bitmap is disabled */
>> +    bs->dirty = true;
>>   }
>>   
>>   /**
>> diff --git a/block/io.c b/block/io.c
>> index b9e53e3..152f5a9 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -2247,6 +2247,25 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
>>           goto flush_parent;
>>       }
>>   
>> +    /* Check if storage is actually dirty before flushing to disk */
>> +    if (!bs->dirty) {
>> +        /* Flush requests are appended to tracked request list in order so that
>> +         * most recent request is at the head of the list. Following code uses
>> +         * this ordering to wait for the most recent flush request to complete
>> +         * to ensure that requests return in order */
>> +        BdrvTrackedRequest *prev_req;
>> +        QLIST_FOREACH(prev_req, &bs->tracked_requests, list) {
>> +            if (prev_req == &req || prev_req->type != BDRV_TRACKED_FLUSH) {
>> +                continue;
>> +            }
>> +
>> +            qemu_co_queue_wait(&prev_req->wait_queue);
>> +            break;
>> +        }
>> +        goto flush_parent;
> Should we check bs->dirty again after qemu_co_queue_wait()? I think another
> write request could sneak in while this coroutine yields.
No, we do not care. A write issued after the FLUSH is not guaranteed to
be flushed. The only guarantee is that all write requests completed
prior to this flush are really flushed.
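
To illustrate the semantics, here is a minimal standalone sketch. It is a
toy model and not the QEMU code: ToyDisk, toy_write, toy_flush and toy_demo
are made-up names, and the real sync is reduced to a counter. It assumes, as
in the patch, that the flag is cleared before the sync is issued, so a write
that arrives later marks the disk again and is covered by the next flush.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model, not QEMU code: all names are illustrative. */
typedef struct ToyDisk {
    bool need_flush;    /* set by any write, cleared when a flush starts */
    int  syncs_issued;  /* number of real (expensive) syncs performed */
} ToyDisk;

static void toy_write(ToyDisk *d)
{
    d->need_flush = true;
}

/* Returns true if a real sync was issued, false if the flush was skipped. */
static bool toy_flush(ToyDisk *d)
{
    if (!d->need_flush) {
        return false;       /* nothing written since the last flush */
    }
    d->need_flush = false;  /* cleared before the sync, as in the patch */
    d->syncs_issued++;      /* stands in for fsync/fdatasync */
    return true;
}

/* Walks through the cases discussed above; returns the sync count. */
static int toy_demo(void)
{
    ToyDisk d = { false, 0 };

    toy_write(&d);
    assert(toy_flush(&d));   /* dirty: a real sync is issued */
    assert(!toy_flush(&d));  /* clean: the flush is skipped */
    toy_write(&d);           /* a later write marks the disk again */
    assert(toy_flush(&d));   /* and is covered by the next flush */
    return d.syncs_issued;
}
```

The model is single-threaded on purpose. It only shows which writes a given
flush is responsible for, not the coroutine ordering handled by the patch.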



>> +    }
>> +    bs->dirty = false;
>> +
>>       BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_DISK);
>>       if (bs->drv->bdrv_co_flush_to_disk) {
>>           ret = bs->drv->bdrv_co_flush_to_disk(bs);
>> diff --git a/include/block/block_int.h b/include/block/block_int.h
>> index 0432ba5..59a7def 100644
>> --- a/include/block/block_int.h
>> +++ b/include/block/block_int.h
>> @@ -435,6 +435,7 @@ struct BlockDriverState {
>>       bool valid_key; /* if true, a valid encryption key has been set */
>>       bool sg;        /* if true, the device is a /dev/sg* */
>>       bool probed;    /* if true, format was probed rather than specified */
>> +    bool dirty;     /* if true, media is dirty and should be flushed */
> How about renaming this to "need_flush"? The one "dirty" we had is set by
> bdrv_set_dirty, and cleared by bdrv_reset_dirty_bitmap. I'd avoid the
> confusion between the two concepts.
>
> Fam

Sure, it can be renamed to "need_flush".
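
A rough sketch of why the two concepts are worth keeping apart. This is again
a toy model with made-up names (SketchState, sketch_set_dirty,
sketch_reset_dirty_bitmap, sketch_flush) and a single 64-sector bitmap, not
the real BlockDriverState. The point is only that the two pieces of state
have different lifetimes: resetting the bitmap says nothing about whether a
flush is still needed, and a flush does not clear the bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of a 64-sector device; names are illustrative. */
typedef struct SketchState {
    uint64_t dirty_bitmap;  /* one bit per sector written */
    bool     need_flush;    /* data written since the last flush */
} SketchState;

/* A write marks the sector in the bitmap and requests a future flush. */
static void sketch_set_dirty(SketchState *s, unsigned sector)
{
    assert(sector < 64);
    s->dirty_bitmap |= UINT64_C(1) << sector;
    s->need_flush = true;
}

/* Clearing the bitmap leaves need_flush untouched. */
static void sketch_reset_dirty_bitmap(SketchState *s)
{
    s->dirty_bitmap = 0;
}

/* Flushing leaves the bitmap untouched. */
static void sketch_flush(SketchState *s)
{
    s->need_flush = false;
}
```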

>>   
>>       int copy_on_read; /* if nonzero, copy read backing sectors into image.
>>                            note this is a reference count */
>> -- 
>> 2.1.4
>>

