[PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters

qemu-devel.nongnu.org archive mirror

* [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters
@ 2020-08-05  2:38 Ying Fang
  2020-08-05  2:43 ` no-reply
  2020-08-06  9:13 ` Kevin Wolf
  0 siblings, 2 replies; 8+ messages in thread
From: Ying Fang @ 2020-08-05  2:38 UTC (permalink / raw)
  To: qemu-devel, kwolf, mreitz
  Cc: alex.chen, fangying, zhang.zhanghailiang, qemu-block

From: fangying <fangying1@huawei.com>

When a qemu or qemu-nbd process uses a qcow2 image configured with
'cache=none', it keeps L2 tables in an in-memory cache. The cached L2
tables are not written to the disk unless a flush command is issued
explicitly or the image is closed. As a result, the on-disk metadata may
stay inconsistent with the written data for a long time. If the process
exits abnormally in that window, the data that was already written is
lost.

Therefore, to keep the data consistent, we need to write the updated L2
entries for newly allocated clusters to the disk promptly.

Signed-off-by: Ying Fang <fangying1@huawei.com>

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 7444b9c..ab6e812 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -266,6 +266,22 @@ int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
     return result;
 }
 
+#define L2_ENTRIES_PER_SECTOR 64
+int qcow2_cache_l2_write_entry(BlockDriverState *bs, Qcow2Cache *c,
+                               void *table, int index, int num)
+{
+    int ret;
+    int i = qcow2_cache_get_table_idx(c, table);
+    int start_sector = index / L2_ENTRIES_PER_SECTOR;
+    int end_sector = (index + num - 1) / L2_ENTRIES_PER_SECTOR;
+    int nr_sectors = end_sector - start_sector + 1;
+    ret = bdrv_pwrite(bs->file,
+                      c->entries[i].offset + start_sector * BDRV_SECTOR_SIZE,
+                      table + start_sector * BDRV_SECTOR_SIZE,
+                      nr_sectors * BDRV_SECTOR_SIZE);
+    return ret;
+}
+
 int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
     Qcow2Cache *dependency)
 {
diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index a677ba9..ae49a83 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -998,6 +998,9 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
      }
 
 
+    ret = qcow2_cache_l2_write_entry(bs, s->l2_table_cache, l2_slice,
+                                     l2_index, m->nb_clusters);
+
     qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
 
     /*
diff --git a/block/qcow2.h b/block/qcow2.h
index 7ce2c23..168ab59 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -748,6 +748,8 @@ int qcow2_cache_destroy(Qcow2Cache *c);
 void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table);
 int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
 int qcow2_cache_write(BlockDriverState *bs, Qcow2Cache *c);
+int qcow2_cache_l2_write_entry(BlockDriverState *bs, Qcow2Cache *c,
+                               void *table, int index, int num);
 int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
     Qcow2Cache *dependency);
 void qcow2_cache_depends_on_flush(Qcow2Cache *c);
-- 
1.8.3.1
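
For readers following the index arithmetic in qcow2_cache_l2_write_entry()
above, here is a minimal standalone sketch of the same calculation. It
assumes 512-byte sectors and 8-byte L2 entries, which is where the
hard-coded 64 would come from. The names SECTOR_SIZE, struct byte_range and
l2_entries_to_range() are made up for this illustration and are not part of
QEMU.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes: 512-byte sectors and 8-byte L2 entries, which gives the
 * 64 entries per sector that the patch hard-codes. */
#define SECTOR_SIZE            512
#define L2_ENTRY_SIZE          sizeof(uint64_t)
#define L2_ENTRIES_PER_SECTOR  (SECTOR_SIZE / L2_ENTRY_SIZE)

/* Hypothetical helper type: a byte range inside one L2 table. */
struct byte_range {
    size_t offset;  /* byte offset of the first sector to write */
    size_t length;  /* bytes to write, always a multiple of SECTOR_SIZE */
};

/* Return the sector-aligned byte range that covers the entries
 * [index, index + num) of an L2 table. Requires num >= 1. */
static struct byte_range l2_entries_to_range(size_t index, size_t num)
{
    size_t start_sector = index / L2_ENTRIES_PER_SECTOR;
    size_t end_sector = (index + num - 1) / L2_ENTRIES_PER_SECTOR;
    struct byte_range r = {
        .offset = start_sector * SECTOR_SIZE,
        .length = (end_sector - start_sector + 1) * SECTOR_SIZE,
    };
    return r;
}
```

For example, entries 63 and 64 straddle a sector boundary, so
l2_entries_to_range(63, 2) covers two sectors (offset 0, length 1024).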




* Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters
  2020-08-05  2:38 [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters Ying Fang
@ 2020-08-05  2:43 ` no-reply
  2020-08-06  9:01   ` Ying Fang
  2020-08-06  9:13 ` Kevin Wolf
  1 sibling, 1 reply; 8+ messages in thread
From: no-reply @ 2020-08-05  2:43 UTC (permalink / raw)
  To: fangying1
  Cc: kwolf, zhang.zhanghailiang, qemu-block, qemu-devel, mreitz,
	alex.chen, fangying1

Patchew URL: https://patchew.org/QEMU/20200805023826.184-1-fangying1@huawei.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.






The full log is available at
http://patchew.org/logs/20200805023826.184-1-fangying1@huawei.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters
  2020-08-05  2:43 ` no-reply
@ 2020-08-06  9:01   ` Ying Fang
  2020-08-06  9:04     ` Daniel P. Berrangé
  0 siblings, 1 reply; 8+ messages in thread
From: Ying Fang @ 2020-08-06  9:01 UTC (permalink / raw)
  To: qemu-devel, kwolf, mreitz; +Cc: alex.chen, zhang.zhanghailiang, qemu-block



On 8/5/2020 10:43 AM, no-reply@patchew.org wrote:
> Patchew URL: https://patchew.org/QEMU/20200805023826.184-1-fangying1@huawei.com/
> 
> 
> 
> Hi,
> 
> This series failed the docker-quick@centos7 build test. Please find the testing commands and
> their output below. If you have Docker installed, you can probably reproduce it
> locally.

I see an error message that says ** No space left on device **.
However, I do not know what is wrong with this build test.
Could you give me some help here?

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
error: copy-fd: write returned No space left on device
fatal: failed to copy file to '/var/tmp/patchew-tester-tmp-wtnwtuq5/src/.git/objects/pack/pack-518a8ad92e3ce11d2627a7221e2d360b337cb27d.pack': No space left on device
fatal: The remote end hung up unexpectedly
Traceback (most recent call last):
   File "patchew-tester/src/patchew-cli", line 521, in test_one
     git_clone_repo(clone, r["repo"], r["head"], logf, True)
   File "patchew-tester/src/patchew-cli", line 53, in git_clone_repo
     subprocess.check_call(clone_cmd, stderr=logf, stdout=logf)
   File "/opt/rh/rh-python36/root/usr/lib64/python3.6/subprocess.py", line 291, in check_call
     raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['git', 'clone', '-q', '/home/patchew/.cache/patchew-git-cache/httpsgithubcompatchewprojectqemu-3c8cf5a9c21ff8782164d1def7f44bd888713384', '/var/tmp/patchew-tester-tmp-wtnwtuq5/src']' returned non-zero exit status 128.

> 
> 
> 
> 
> 
> The full log is available at
> http://patchew.org/logs/20200805023826.184-1-fangying1@huawei.com/testing.docker-quick@centos7/?type=message.
> ---
> Email generated automatically by Patchew [https://patchew.org/].
> Please send your feedback to patchew-devel@redhat.com
> 



* Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters
  2020-08-06  9:01   ` Ying Fang
@ 2020-08-06  9:04     ` Daniel P. Berrangé
  0 siblings, 0 replies; 8+ messages in thread
From: Daniel P. Berrangé @ 2020-08-06  9:04 UTC (permalink / raw)
  To: Ying Fang
  Cc: kwolf, zhang.zhanghailiang, qemu-block, qemu-devel, mreitz, alex.chen

On Thu, Aug 06, 2020 at 05:01:51PM +0800, Ying Fang wrote:
> 
> 
> On 8/5/2020 10:43 AM, no-reply@patchew.org wrote:
> > Patchew URL: https://patchew.org/QEMU/20200805023826.184-1-fangying1@huawei.com/
> > 
> > 
> > 
> > Hi,
> > 
> > This series failed the docker-quick@centos7 build test. Please find the testing commands and
> > their output below. If you have Docker installed, you can probably reproduce it
> > locally.
> > I see some error message which says ** No space left on device **
> However I do not know what is wrong with this build test.
> Could you give me some help here?

It isn't your fault - this is just QEMU's patchew CI, which is broken yet
again due to lack of disk space. Just ignore the error report here.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




* Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters
  2020-08-05  2:38 [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters Ying Fang
  2020-08-05  2:43 ` no-reply
@ 2020-08-06  9:13 ` Kevin Wolf
  2020-08-07  7:42   ` Ying Fang
  1 sibling, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2020-08-06  9:13 UTC (permalink / raw)
  To: Ying Fang; +Cc: alex.chen, zhang.zhanghailiang, qemu-devel, qemu-block, mreitz

On 05.08.2020 at 04:38, Ying Fang wrote:
> From: fangying <fangying1@huawei.com>
> 
> When qemu or qemu-nbd process uses a qcow2 image and configured with
> 'cache = none', it will write to the qcow2 image with a cache to cache
> L2 tables, however the process will not use L2 tables without explicitly
> calling the flush command or closing the mirror flash into the disk.
> Which may cause the disk data inconsistent with the written data for
> a long time. If an abnormal process exit occurs here, the issued written
> data will be lost.
> 
> Therefore, in order to keep data consistency we need to flush the changes
> to the L2 entry to the disk in time for the newly allocated cluster.
> 
> Signed-off-by: Ying Fang <fangying1@huawei.com>

If you want to have data safely written to the disk after each write
request, you need to use cache=writethrough/directsync (in other words,
aliases that are equivalent to setting -device ...,write-cache=off).
Note that this will have a major impact on write performance.

cache=none means bypassing the kernel page cache (O_DIRECT), but not
flushing after each write request.

Kevin
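
As a rough summary of the cache modes mentioned above, the following
standalone sketch tabulates how each cache= alias maps onto the three
underlying settings, as I understand the QEMU documentation. struct
cache_mode and find_cache_mode() are hypothetical names for this
illustration only and are not QEMU's own code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration type; the fields mirror the cache.* options. */
struct cache_mode {
    const char *name;
    bool writeback;  /* cache.writeback: writes complete before being flushed */
    bool direct;     /* cache.direct: bypass the host page cache (O_DIRECT) */
    bool no_flush;   /* cache.no-flush: flush requests are ignored */
};

static const struct cache_mode cache_modes[] = {
    /* name            writeback direct no_flush */
    { "writeback",     true,     false, false },
    { "none",          true,     true,  false },
    { "writethrough",  false,    false, false },
    { "directsync",    false,    true,  false },
    { "unsafe",        true,     false, true  },
};

/* Look up a cache mode by name; returns NULL for an unknown name. */
static const struct cache_mode *find_cache_mode(const char *name)
{
    size_t n = sizeof(cache_modes) / sizeof(cache_modes[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(cache_modes[i].name, name) == 0) {
            return &cache_modes[i];
        }
    }
    return NULL;
}
```

Seen this way, cache=none differs from cache=directsync only in the
writeback column, which is exactly the flush-after-each-write behaviour
discussed in this message.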




* Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters
  2020-08-06  9:13 ` Kevin Wolf
@ 2020-08-07  7:42   ` Ying Fang
  2020-08-07  8:13     ` Kevin Wolf
  0 siblings, 1 reply; 8+ messages in thread
From: Ying Fang @ 2020-08-07  7:42 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: alex.chen, zhang.zhanghailiang, qemu-devel, qemu-block, mreitz



On 8/6/2020 5:13 PM, Kevin Wolf wrote:
> On 05.08.2020 at 04:38, Ying Fang wrote:
>> From: fangying <fangying1@huawei.com>
>>
>> When qemu or qemu-nbd process uses a qcow2 image and configured with
>> 'cache = none', it will write to the qcow2 image with a cache to cache
>> L2 tables, however the process will not use L2 tables without explicitly
>> calling the flush command or closing the mirror flash into the disk.
>> Which may cause the disk data inconsistent with the written data for
>> a long time. If an abnormal process exit occurs here, the issued written
>> data will be lost.
>>
>> Therefore, in order to keep data consistency we need to flush the changes
>> to the L2 entry to the disk in time for the newly allocated cluster.
>>
>> Signed-off-by: Ying Fang <fangying1@huawei.com>
> 
> If you want to have data safely written to the disk after each write
> request, you need to use cache=writethrough/directsync (in other words,
> aliases that are equivalent to setting -device ...,write-cache=off).
> Note that this will have a major impact on write performance.
> 
> cache=none means bypassing the kernel page cache (O_DIRECT), but not
> flushing after each write request.

Well, IIUC, cache=none does not guarantee data safety and we should not
expect that. Then this patch can be ignored.

Thanks.



* Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters
  2020-08-07  7:42   ` Ying Fang
@ 2020-08-07  8:13     ` Kevin Wolf
  2020-08-14  2:26       ` Ying Fang
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin Wolf @ 2020-08-07  8:13 UTC (permalink / raw)
  To: Ying Fang; +Cc: alex.chen, zhang.zhanghailiang, qemu-devel, qemu-block, mreitz

On 07.08.2020 at 09:42, Ying Fang wrote:
> 
> 
> On 8/6/2020 5:13 PM, Kevin Wolf wrote:
> > On 05.08.2020 at 04:38, Ying Fang wrote:
> > > From: fangying <fangying1@huawei.com>
> > > 
> > > When qemu or qemu-nbd process uses a qcow2 image and configured with
> > > 'cache = none', it will write to the qcow2 image with a cache to cache
> > > L2 tables, however the process will not use L2 tables without explicitly
> > > calling the flush command or closing the mirror flash into the disk.
> > > Which may cause the disk data inconsistent with the written data for
> > > a long time. If an abnormal process exit occurs here, the issued written
> > > data will be lost.
> > > 
> > > Therefore, in order to keep data consistency we need to flush the changes
> > > to the L2 entry to the disk in time for the newly allocated cluster.
> > > 
> > > Signed-off-by: Ying Fang <fangying1@huawei.com>
> > 
> > If you want to have data safely written to the disk after each write
> > request, you need to use cache=writethrough/directsync (in other words,
> > aliases that are equivalent to setting -device ...,write-cache=off).
> > Note that this will have a major impact on write performance.
> > 
> > cache=none means bypassing the kernel page cache (O_DIRECT), but not
> > flushing after each write request.
> 
> Well, IIUC, cache=none does not guarantee data safety and we should not
> expect that. Then this patch can be ignored.

Indeed, cache=none is a writeback cache mode with all of the
consequences. In practice, this is normally good enough because the
guest OS will send flush requests when needed (e.g. because a guest
application called fsync()), but if the guest doesn't do this, it may
suffer data loss. This behaviour is comparable to a volatile disk cache
on real hard disks and is a good default, but sometimes you need a
writethrough cache mode at the cost of a performance penalty.

Kevin
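
To make the fsync() point concrete, here is a minimal POSIX sketch of what
an application that cares about durability might do under a writeback
cache: write the data, then explicitly flush it. write_durably() is a
hypothetical helper name chosen for this illustration; it is not QEMU code.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Write len bytes from buf at the given file offset, then ask the OS to
 * push them to stable storage. Returns 0 on success and -1 on error, with
 * errno set by the failing call. */
static int write_durably(int fd, const void *buf, size_t len, off_t offset)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, offset);
        if (n < 0) {
            if (errno == EINTR) {
                continue;  /* interrupted before writing anything: retry */
            }
            return -1;
        }
        p += n;
        len -= (size_t)n;
        offset += n;
    }

    /* Without this flush the data may still sit in a volatile cache. */
    return fsync(fd);
}
```

A caller would open() the file, call write_durably() for each update that
must survive a crash, and treat a non-zero return as "not safely stored".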




* Re: [PATCH] qcow2: flush qcow2 l2 meta for new allocated clusters
  2020-08-07  8:13     ` Kevin Wolf
@ 2020-08-14  2:26       ` Ying Fang
  0 siblings, 0 replies; 8+ messages in thread
From: Ying Fang @ 2020-08-14  2:26 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: alex.chen, zhang.zhanghailiang, qemu-devel, qemu-block, mreitz



On 8/7/2020 4:13 PM, Kevin Wolf wrote:
> On 07.08.2020 at 09:42, Ying Fang wrote:
>>
>>
>> On 8/6/2020 5:13 PM, Kevin Wolf wrote:
>>> On 05.08.2020 at 04:38, Ying Fang wrote:
>>>> From: fangying <fangying1@huawei.com>
>>>>
>>>> When qemu or qemu-nbd process uses a qcow2 image and configured with
>>>> 'cache = none', it will write to the qcow2 image with a cache to cache
>>>> L2 tables, however the process will not use L2 tables without explicitly
>>>> calling the flush command or closing the mirror flash into the disk.
>>>> Which may cause the disk data inconsistent with the written data for
>>>> a long time. If an abnormal process exit occurs here, the issued written
>>>> data will be lost.
>>>>
>>>> Therefore, in order to keep data consistency we need to flush the changes
>>>> to the L2 entry to the disk in time for the newly allocated cluster.
>>>>
>>>> Signed-off-by: Ying Fang <fangying1@huawei.com>
>>>
>>> If you want to have data safely written to the disk after each write
>>> request, you need to use cache=writethrough/directsync (in other words,
>>> aliases that are equivalent to setting -device ...,write-cache=off).
>>> Note that this will have a major impact on write performance.
>>>
>>> cache=none means bypassing the kernel page cache (O_DIRECT), but not
>>> flushing after each write request.
>>
>> Well, IIUC, cache=none does not guarantee data safety and we should not
>> expect that. Then this patch can be ignored.
> 
> Indeed, cache=none is a writeback cache mode with all of the
> consequences. In practice, this is normally good enough because the
> guest OS will send flush requests when needed (e.g. because a guest
> application called fsync()), but if the guest doesn't do this, it may
> suffer data loss. This behaviour is comparable to a volatile disk cache
> on real hard disks and is a good default, but sometimes you need a
> writethrough cache mode at the cost of a performance penalty.

Sorry for the late reply. Thanks for your detailed explanation of the
'cache' option; I understand it better now.


