* [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
@ 2011-01-19 14:59 Pierre Riteau
  2011-01-20  2:06 ` Yoshiaki Tamura
  2011-01-21  9:16 ` Kevin Wolf
  0 siblings, 2 replies; 22+ messages in thread
From: Pierre Riteau @ 2011-01-19 14:59 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, tamura.yoshiaki, Pierre Riteau

b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
value of bdrv_write and aborts migration when it fails. However, if the
size of the block device to migrate is not a multiple of BLOCK_SIZE
(currently 1 MB), the last bdrv_write will fail with -EIO.

Fixed by calling bdrv_write with the correct size of the last block.
---
 block-migration.c |   16 +++++++++++++++-
 1 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 1475325..eeb9c62 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
     int64_t addr;
     BlockDriverState *bs;
     uint8_t *buf;
+    int64_t total_sectors;
+    int nr_sectors;
 
     do {
         addr = qemu_get_be64(f);
@@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
                 return -EINVAL;
             }
 
+            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
+            if (total_sectors <= 0) {
+                fprintf(stderr, "Error getting length of block device %s\n", device_name);
+                return -EINVAL;
+            }
+
+            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
+                nr_sectors = total_sectors - addr;
+            } else {
+                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
+            }
+
             buf = qemu_malloc(BLOCK_SIZE);
 
             qemu_get_buffer(f, buf, BLOCK_SIZE);
-            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
+            ret = bdrv_write(bs, addr, buf, nr_sectors);
 
             qemu_free(buf);
             if (ret < 0) {
-- 
1.7.3.5

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-19 14:59 [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB Pierre Riteau
@ 2011-01-20  2:06 ` Yoshiaki Tamura
  2011-01-20  6:49   ` Pierre Riteau
  2011-01-21  9:16 ` Kevin Wolf
  1 sibling, 1 reply; 22+ messages in thread
From: Yoshiaki Tamura @ 2011-01-20  2:06 UTC (permalink / raw)
  To: Pierre Riteau; +Cc: kwolf, qemu-devel

2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
> value of bdrv_write and aborts migration when it fails. However, if the
> size of the block device to migrate is not a multiple of BLOCK_SIZE
> (currently 1 MB), the last bdrv_write will fail with -EIO.
>
> Fixed by calling bdrv_write with the correct size of the last block.
> ---
>  block-migration.c |   16 +++++++++++++++-
>  1 files changed, 15 insertions(+), 1 deletions(-)
>
> diff --git a/block-migration.c b/block-migration.c
> index 1475325..eeb9c62 100644
> --- a/block-migration.c
> +++ b/block-migration.c
> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>     int64_t addr;
>     BlockDriverState *bs;
>     uint8_t *buf;
> +    int64_t total_sectors;
> +    int nr_sectors;
>
>     do {
>         addr = qemu_get_be64(f);
> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>                 return -EINVAL;
>             }
>
> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
> +            if (total_sectors <= 0) {
> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
> +                return -EINVAL;
> +            }
> +
> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
> +                nr_sectors = total_sectors - addr;
> +            } else {
> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
> +            }
> +
>             buf = qemu_malloc(BLOCK_SIZE);
>
>             qemu_get_buffer(f, buf, BLOCK_SIZE);
> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>
>             qemu_free(buf);
>             if (ret < 0) {
> --
> 1.7.3.5
>
>
>

Hi Pierre,

I don't think the fix above is correct.  If you have a file which
isn't aligned to BLOCK_SIZE, you won't get an error with the
patch.  However, the receiver doesn't know how many sectors the
sender wants written, so the guest may fail after migration
because some data may not have been written.  IIUC, although
changing the bytestream should be avoided as much as possible, we
should save/load total_sectors to check that an appropriately
sized file is allocated on the receiver side.
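
As a rough illustration of that check (a sketch only: check_device_size()
is a hypothetical helper, not an existing QEMU function, and it assumes
the sender's total_sectors has somehow been transmitted, which the
current stream does not do), the receiver could compare the two sizes
before accepting any block:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of QEMU: compare the device size announced
 * by the sender with the size of the image opened on the receiver.
 * Both sizes are in sectors. Returns 0 if the destination can hold every
 * sector of the source, -EINVAL otherwise.
 */
static int check_device_size(int64_t sender_sectors, int64_t receiver_sectors)
{
    if (sender_sectors <= 0 || receiver_sectors <= 0) {
        return -EINVAL;     /* a size could not be determined */
    }
    if (receiver_sectors < sender_sectors) {
        return -EINVAL;     /* destination too small: the tail would be lost */
    }
    return 0;
}
```

With a 10.5 MB source (21504 sectors of 512 bytes) and a 10 MB
destination (20480 sectors), this would fail up front instead of the
last 512 KB being dropped silently.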

BTW, you should use error_report instead of fprintf(stderr, ...).

Thanks,

Yoshi

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-20  2:06 ` Yoshiaki Tamura
@ 2011-01-20  6:49   ` Pierre Riteau
  2011-01-20 16:18     ` Yoshiaki Tamura
  0 siblings, 1 reply; 22+ messages in thread
From: Pierre Riteau @ 2011-01-20  6:49 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: kwolf, qemu-devel

On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:

> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>> value of bdrv_write and aborts migration when it fails. However, if the
>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>> 
>> Fixed by calling bdrv_write with the correct size of the last block.
>> ---
>>  block-migration.c |   16 +++++++++++++++-
>>  1 files changed, 15 insertions(+), 1 deletions(-)
>> 
>> diff --git a/block-migration.c b/block-migration.c
>> index 1475325..eeb9c62 100644
>> --- a/block-migration.c
>> +++ b/block-migration.c
>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>     int64_t addr;
>>     BlockDriverState *bs;
>>     uint8_t *buf;
>> +    int64_t total_sectors;
>> +    int nr_sectors;
>> 
>>     do {
>>         addr = qemu_get_be64(f);
>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>                 return -EINVAL;
>>             }
>> 
>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>> +            if (total_sectors <= 0) {
>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>> +                return -EINVAL;
>> +            }
>> +
>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>> +                nr_sectors = total_sectors - addr;
>> +            } else {
>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>> +            }
>> +
>>             buf = qemu_malloc(BLOCK_SIZE);
>> 
>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>> 
>>             qemu_free(buf);
>>             if (ret < 0) {
>> --
>> 1.7.3.5
>> 
>> 
>> 
> 
> Hi Pierre,
> 
> I don't think the fix above is correct.  If you have a file which
> isn't aligned to BLOCK_SIZE, you won't get an error with the
> patch.  However, the receiver doesn't know how many sectors the
> sender wants written, so the guest may fail after migration
> because some data may not have been written.  IIUC, although
> changing the bytestream should be avoided as much as possible, we
> should save/load total_sectors to check that an appropriately
> sized file is allocated on the receiver side.

Isn't the guest supposed to be started using a file with the correct size?
But I guess changing the protocol would be best, as it would spare headaches for people who mistakenly created a file that is too small.

> BTW, you should use error_report instead of fprintf(stderr, ...).

I didn't know that, I followed what was used in this file. Thank you.

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-20  6:49   ` Pierre Riteau
@ 2011-01-20 16:18     ` Yoshiaki Tamura
  2011-01-21  8:08       ` Pierre Riteau
  0 siblings, 1 reply; 22+ messages in thread
From: Yoshiaki Tamura @ 2011-01-20 16:18 UTC (permalink / raw)
  To: Pierre Riteau; +Cc: kwolf, qemu-devel

2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>
>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>> value of bdrv_write and aborts migration when it fails. However, if the
>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>
>>> Fixed by calling bdrv_write with the correct size of the last block.
>>> ---
>>>  block-migration.c |   16 +++++++++++++++-
>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/block-migration.c b/block-migration.c
>>> index 1475325..eeb9c62 100644
>>> --- a/block-migration.c
>>> +++ b/block-migration.c
>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>     int64_t addr;
>>>     BlockDriverState *bs;
>>>     uint8_t *buf;
>>> +    int64_t total_sectors;
>>> +    int nr_sectors;
>>>
>>>     do {
>>>         addr = qemu_get_be64(f);
>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>                 return -EINVAL;
>>>             }
>>>
>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>> +            if (total_sectors <= 0) {
>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>> +                return -EINVAL;
>>> +            }
>>> +
>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>> +                nr_sectors = total_sectors - addr;
>>> +            } else {
>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>> +            }
>>> +
>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>
>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>
>>>             qemu_free(buf);
>>>             if (ret < 0) {
>>> --
>>> 1.7.3.5
>>>
>>>
>>>
>>
>> Hi Pierre,
>>
>> I don't think the fix above is correct.  If you have a file which
>> isn't aligned to BLOCK_SIZE, you won't get an error with the
>> patch.  However, the receiver doesn't know how many sectors the
>> sender wants written, so the guest may fail after migration
>> because some data may not have been written.  IIUC, although
>> changing the bytestream should be avoided as much as possible, we
>> should save/load total_sectors to check that an appropriately
>> sized file is allocated on the receiver side.
>
> Isn't the guest supposed to be started using a file with the correct size?

I personally don't like that; it demands too much of the user.
Can't we expand the image on the fly?  We can just abort if expanding
fails anyway.

> But I guess changing the protocol would be best as it would avoid headaches to people who mistakenly created a file that is too small.

We should think carefully before changing the protocol.

Kevin?

>
>> BTW, you should use error_report instead of fprintf(stderr, ...).
>
> I didn't know that, I followed what was used in this file. Thank you.
>
> --
> Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
> http://perso.univ-rennes1.fr/pierre.riteau/
>
>
>

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-20 16:18     ` Yoshiaki Tamura
@ 2011-01-21  8:08       ` Pierre Riteau
  2011-01-21  9:11         ` Kevin Wolf
  2011-01-21 12:15         ` Yoshiaki Tamura
  0 siblings, 2 replies; 22+ messages in thread
From: Pierre Riteau @ 2011-01-21  8:08 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: kwolf, qemu-devel

On 20 Jan 2011, at 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> wrote:

> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>> 
>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>> 
>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>> ---
>>>>  block-migration.c |   16 +++++++++++++++-
>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>> 
>>>> diff --git a/block-migration.c b/block-migration.c
>>>> index 1475325..eeb9c62 100644
>>>> --- a/block-migration.c
>>>> +++ b/block-migration.c
>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>     int64_t addr;
>>>>     BlockDriverState *bs;
>>>>     uint8_t *buf;
>>>> +    int64_t total_sectors;
>>>> +    int nr_sectors;
>>>> 
>>>>     do {
>>>>         addr = qemu_get_be64(f);
>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>                 return -EINVAL;
>>>>             }
>>>> 
>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>> +            if (total_sectors <= 0) {
>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>> +                return -EINVAL;
>>>> +            }
>>>> +
>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>> +                nr_sectors = total_sectors - addr;
>>>> +            } else {
>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>> +            }
>>>> +
>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>> 
>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>> 
>>>>             qemu_free(buf);
>>>>             if (ret < 0) {
>>>> --
>>>> 1.7.3.5
>>>> 
>>>> 
>>>> 
>>> 
>>> Hi Pierre,
>>> 
>>> I don't think the fix above is correct.  If you have a file which
>>> isn't aligned to BLOCK_SIZE, you won't get an error with the
>>> patch.  However, the receiver doesn't know how many sectors the
>>> sender wants written, so the guest may fail after migration
>>> because some data may not have been written.  IIUC, although
>>> changing the bytestream should be avoided as much as possible, we
>>> should save/load total_sectors to check that an appropriately
>>> sized file is allocated on the receiver side.
>> 
>> Isn't the guest supposed to be started using a file with the correct size?
> 
> I personally don't like that; It's insisting too much to the user.
> Can't we expand the image on the fly?  We can just abort if expanding
> failed anyway.

At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails. 

Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
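
To make the arithmetic concrete, here is a minimal standalone sketch of the clamping that the patch does. The constants are assumptions for illustration (512-byte sectors and 1 MB chunks, so 2048 sectors per chunk), and sectors_to_write() is a made-up name rather than anything in block-migration.c:

```c
#include <assert.h>
#include <stdint.h>

/* Values assumed for illustration: 512-byte sectors and 1 MB chunks. */
#define SECTOR_BITS        9
#define CHUNK_SIZE         (1 << 20)
#define SECTORS_PER_CHUNK  (CHUNK_SIZE >> SECTOR_BITS)   /* 2048 */

/*
 * Number of sectors to write for the chunk starting at sector 'addr' on a
 * device of 'total_sectors' sectors: a full chunk, except for a short tail.
 */
static int sectors_to_write(int64_t total_sectors, int64_t addr)
{
    assert(addr >= 0 && addr < total_sectors);
    if (total_sectors - addr < SECTORS_PER_CHUNK) {
        return (int)(total_sectors - addr);   /* last, partial chunk */
    }
    return SECTORS_PER_CHUNK;
}
```

For a partition of 10.5 MB (21504 sectors), the last chunk starts at sector 20480 and only 21504 - 20480 = 1024 sectors remain, so the final write covers 1024 sectors instead of 2048.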

>> But I guess changing the protocol would be best as it would avoid headaches to people who mistakenly created a file that is too small.
> 
> We should think carefully before changing the protocol.
> 
> Kevin?
> 
>> 
>>> BTW, you should use error_report instead of fprintf(stderr, ...).
>> 
>> I didn't know that, I followed what was used in this file. Thank you.
>> 
>> --
>> Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
>> http://perso.univ-rennes1.fr/pierre.riteau/
>> 
>> 
>> 

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21  8:08       ` Pierre Riteau
@ 2011-01-21  9:11         ` Kevin Wolf
  2011-01-21 12:26           ` Yoshiaki Tamura
  2011-01-21 12:15         ` Yoshiaki Tamura
  1 sibling, 1 reply; 22+ messages in thread
From: Kevin Wolf @ 2011-01-21  9:11 UTC (permalink / raw)
  To: Pierre Riteau; +Cc: Yoshiaki Tamura, qemu-devel

On 21 Jan 2011, at 09:08, Pierre Riteau wrote:
> On 20 Jan 2011, at 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> wrote:
> 
>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>
>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>
>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>> ---
>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>
>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>> index 1475325..eeb9c62 100644
>>>>> --- a/block-migration.c
>>>>> +++ b/block-migration.c
>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>     int64_t addr;
>>>>>     BlockDriverState *bs;
>>>>>     uint8_t *buf;
>>>>> +    int64_t total_sectors;
>>>>> +    int nr_sectors;
>>>>>
>>>>>     do {
>>>>>         addr = qemu_get_be64(f);
>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>                 return -EINVAL;
>>>>>             }
>>>>>
>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>> +            if (total_sectors <= 0) {
>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>> +                return -EINVAL;
>>>>> +            }
>>>>> +
>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>> +                nr_sectors = total_sectors - addr;
>>>>> +            } else {
>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>> +            }
>>>>> +
>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>
>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>
>>>>>             qemu_free(buf);
>>>>>             if (ret < 0) {
>>>>> --
>>>>> 1.7.3.5
>>>>>
>>>>>
>>>>>
>>>>
>>>> Hi Pierre,
>>>>
>>>> I don't think the fix above is correct.  If you have a file which
>>>> isn't aligned to BLOCK_SIZE, you won't get an error with the
>>>> patch.  However, the receiver doesn't know how many sectors the
>>>> sender wants written, so the guest may fail after migration
>>>> because some data may not have been written.  IIUC, although
>>>> changing the bytestream should be avoided as much as possible, we
>>>> should save/load total_sectors to check that an appropriately
>>>> sized file is allocated on the receiver side.
>>>
>>> Isn't the guest supposed to be started using a file with the correct size?
>>
>> I personally don't like that; It's insisting too much to the user.
>> Can't we expand the image on the fly?  We can just abort if expanding
>> failed anyway.
> 
> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails. 
> 
> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.

Actually, being able to change the image size is a special case. It only
works on raw with file and sheepdog, and on qcow2 and qed. All other
block drivers can't do it.

>>> But I guess changing the protocol would be best as it would avoid headaches to people who mistakenly created a file that is too small.
>>
>> We should think carefully before changing the protocol.
>>
>> Kevin?

Can we do it in a compatible way? I agree that it would be nice to catch
this error, but changing the protocol in an incompatible way for it
seems to be too much.

Anyway, it's independent of this patch and can be done on top.

Kevin

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-19 14:59 [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB Pierre Riteau
  2011-01-20  2:06 ` Yoshiaki Tamura
@ 2011-01-21  9:16 ` Kevin Wolf
  2011-01-21 11:38   ` Pierre Riteau
  1 sibling, 1 reply; 22+ messages in thread
From: Kevin Wolf @ 2011-01-21  9:16 UTC (permalink / raw)
  To: Pierre Riteau; +Cc: qemu-devel, tamura.yoshiaki

On 19 Jan 2011, at 15:59, Pierre Riteau wrote:
> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
> value of bdrv_write and aborts migration when it fails. However, if the
> size of the block device to migrate is not a multiple of BLOCK_SIZE
> (currently 1 MB), the last bdrv_write will fail with -EIO.
> 
> Fixed by calling bdrv_write with the correct size of the last block.
> ---
>  block-migration.c |   16 +++++++++++++++-
>  1 files changed, 15 insertions(+), 1 deletions(-)
> 
> diff --git a/block-migration.c b/block-migration.c
> index 1475325..eeb9c62 100644
> --- a/block-migration.c
> +++ b/block-migration.c
> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>      int64_t addr;
>      BlockDriverState *bs;
>      uint8_t *buf;
> +    int64_t total_sectors;
> +    int nr_sectors;
>  
>      do {
>          addr = qemu_get_be64(f);
> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>                  return -EINVAL;
>              }
>  
> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
> +            if (total_sectors <= 0) {
> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
> +                return -EINVAL;
> +            }

Can you resend the patch with error_report(), as Yoshi mentioned?

Also, I would move the total_sectors calculation outside the loop -
though I have no idea how many iterations it typically has, so it might
not improve things a lot.

Kevin

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21  9:16 ` Kevin Wolf
@ 2011-01-21 11:38   ` Pierre Riteau
  2011-01-21 11:45     ` Kevin Wolf
  0 siblings, 1 reply; 22+ messages in thread
From: Pierre Riteau @ 2011-01-21 11:38 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, tamura.yoshiaki

On 21 Jan 2011, at 10:16, Kevin Wolf wrote:

> On 19 Jan 2011, at 15:59, Pierre Riteau wrote:
>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>> value of bdrv_write and aborts migration when it fails. However, if the
>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>> 
>> Fixed by calling bdrv_write with the correct size of the last block.
>> ---
>> block-migration.c |   16 +++++++++++++++-
>> 1 files changed, 15 insertions(+), 1 deletions(-)
>> 
>> diff --git a/block-migration.c b/block-migration.c
>> index 1475325..eeb9c62 100644
>> --- a/block-migration.c
>> +++ b/block-migration.c
>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>     int64_t addr;
>>     BlockDriverState *bs;
>>     uint8_t *buf;
>> +    int64_t total_sectors;
>> +    int nr_sectors;
>> 
>>     do {
>>         addr = qemu_get_be64(f);
>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>                 return -EINVAL;
>>             }
>> 
>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>> +            if (total_sectors <= 0) {
>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>> +                return -EINVAL;
>> +            }
> 
> Can you resend the patch with error_report(), as Yoshi mentioned?
> 
> Also, I would move the total_sectors calculation outside the loop -
> though I have no idea how many iterations it typically has, so it might
> not improve things a lot.

Actually, it is not possible to move the total_sectors calculation outside the loop, since the loop can receive blocks from any device (this is why each block is prefixed by the device name).
I'm sending a new patch with a small optimization to avoid recalculating total_sectors when the device doesn't change in the next iteration.
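
One way such a cache could look is sketched below (not necessarily what the new patch does). It is only an illustration under assumptions: struct size_cache, cached_total_sectors() and fake_query() are hypothetical names, and fake_query() stands in for the real bdrv_getlength(bs) >> BDRV_SECTOR_BITS lookup:

```c
#include <stdint.h>
#include <string.h>

/* Remembers the size of the device seen in the previous iteration. */
struct size_cache {
    char    name[256];
    int64_t total_sectors;
    int     lookups;        /* how many times the size was really queried */
};

/* Stand-in for the real size lookup; returns a size in sectors. */
typedef int64_t (*size_fn)(const char *device_name);

/* Hypothetical lookup used only by this example. */
static int64_t fake_query(const char *device_name)
{
    return strcmp(device_name, "ide0-hd0") == 0 ? 21504 : 4096;
}

/* Query the size only when the device differs from the previous block's. */
static int64_t cached_total_sectors(struct size_cache *c,
                                    const char *device_name, size_fn query)
{
    if (c->total_sectors <= 0 || strcmp(c->name, device_name) != 0) {
        strncpy(c->name, device_name, sizeof(c->name) - 1);
        c->name[sizeof(c->name) - 1] = '\0';
        c->total_sectors = query(device_name);
        c->lookups++;
    }
    return c->total_sectors;
}
```

Consecutive blocks for the same device would then cost a single string comparison, and the size is looked up again only when the device name in the stream changes.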

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 11:38   ` Pierre Riteau
@ 2011-01-21 11:45     ` Kevin Wolf
  0 siblings, 0 replies; 22+ messages in thread
From: Kevin Wolf @ 2011-01-21 11:45 UTC (permalink / raw)
  To: Pierre Riteau; +Cc: qemu-devel, tamura.yoshiaki

On 21 Jan 2011, at 12:38, Pierre Riteau wrote:
> On 21 Jan 2011, at 10:16, Kevin Wolf wrote:
> 
>> On 19 Jan 2011, at 15:59, Pierre Riteau wrote:
>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>> value of bdrv_write and aborts migration when it fails. However, if the
>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>
>>> Fixed by calling bdrv_write with the correct size of the last block.
>>> ---
>>> block-migration.c |   16 +++++++++++++++-
>>> 1 files changed, 15 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/block-migration.c b/block-migration.c
>>> index 1475325..eeb9c62 100644
>>> --- a/block-migration.c
>>> +++ b/block-migration.c
>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>     int64_t addr;
>>>     BlockDriverState *bs;
>>>     uint8_t *buf;
>>> +    int64_t total_sectors;
>>> +    int nr_sectors;
>>>
>>>     do {
>>>         addr = qemu_get_be64(f);
>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>                 return -EINVAL;
>>>             }
>>>
>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>> +            if (total_sectors <= 0) {
>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>> +                return -EINVAL;
>>> +            }
>>
>> Can you resend the patch with error_report(), as Yoshi mentioned?
>>
>> Also, I would move the total_sectors calculation outside the loop -
>> though I have no idea how many iterations it typically has, so it might
>> not improve things a lot.
> 
> Actually, it is not possible to move the total_sectors calculation outside the loop, since the loop can receive blocks from any device (this is why each block is prefixed by the device name).
> I'm sending a new patch with a small optimization to avoid recalculating total_sectors when the device doesn't change in the next iteration.

Right, I should have read a bit more context... I won't insist on an
optimization in this case, but if you would like to do it, go ahead.

Kevin

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21  8:08       ` Pierre Riteau
  2011-01-21  9:11         ` Kevin Wolf
@ 2011-01-21 12:15         ` Yoshiaki Tamura
  2011-01-21 12:31           ` Kevin Wolf
  1 sibling, 1 reply; 22+ messages in thread
From: Yoshiaki Tamura @ 2011-01-21 12:15 UTC (permalink / raw)
  To: Pierre Riteau; +Cc: kwolf, qemu-devel

2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
> On 20 Jan 2011, at 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> wrote:
>
>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>
>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>
>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>> ---
>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>
>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>> index 1475325..eeb9c62 100644
>>>>> --- a/block-migration.c
>>>>> +++ b/block-migration.c
>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>     int64_t addr;
>>>>>     BlockDriverState *bs;
>>>>>     uint8_t *buf;
>>>>> +    int64_t total_sectors;
>>>>> +    int nr_sectors;
>>>>>
>>>>>     do {
>>>>>         addr = qemu_get_be64(f);
>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>                 return -EINVAL;
>>>>>             }
>>>>>
>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>> +            if (total_sectors <= 0) {
>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>> +                return -EINVAL;
>>>>> +            }
>>>>> +
>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>> +                nr_sectors = total_sectors - addr;
>>>>> +            } else {
>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>> +            }
>>>>> +
>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>
>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>
>>>>>             qemu_free(buf);
>>>>>             if (ret < 0) {
>>>>> --
>>>>> 1.7.3.5
>>>>>
>>>>>
>>>>>
>>>>
>>>> Hi Pierre,
>>>>
>>>> I don't think the fix above is correct.  If you have a file which
>>>> isn't aligned to BLOCK_SIZE, you won't get an error with the
>>>> patch.  However, the receiver doesn't know how many sectors the
>>>> sender wants written, so the guest may fail after migration
>>>> because some data may not have been written.  IIUC, although
>>>> changing the bytestream should be avoided as much as possible, we
>>>> should save/load total_sectors to check that an appropriately
>>>> sized file is allocated on the receiver side.
>>>
>>> Isn't the guest supposed to be started using a file with the correct size?
>>
>> I personally don't like that; It's insisting too much to the user.
>> Can't we expand the image on the fly?  We can just abort if expanding
>> failed anyway.
>
> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>
> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>

Right.  But in the case of a partition, doesn't the check in the patch
below return an error?  Does bdrv_getlength return the size correctly?

total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
if (total_sectors <= 0) {
    fprintf(stderr, "Error getting length of block device %s\n", device_name);
    return -EINVAL;
}

Yoshi
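
For reference, the length check above and the clamping done by the patch
can be combined into one small function.  This is only a sketch, not QEMU
code: chunk_nr_sectors() is a hypothetical helper, the constant values
(512-byte sectors, 1 MB chunks) are assumptions, and the extra guard on
addr is not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed values for illustration: 512-byte sectors, 1 MB dirty chunks. */
#define BDRV_SECTOR_BITS 9
#define BLOCK_SIZE (1 << 20)
#define BDRV_SECTORS_PER_DIRTY_CHUNK (BLOCK_SIZE >> BDRV_SECTOR_BITS)

/*
 * Hypothetical helper: given the device length in bytes (what
 * bdrv_getlength() would report) and the first sector of a chunk,
 * return how many sectors to write, or -EINVAL on a bad length or offset.
 */
static int chunk_nr_sectors(int64_t length_bytes, int64_t addr)
{
    int64_t total_sectors;
    int64_t remaining;

    if (length_bytes < 0) {
        return -EINVAL;              /* length query failed */
    }
    total_sectors = length_bytes >> BDRV_SECTOR_BITS;
    if (total_sectors <= 0) {
        return -EINVAL;              /* same condition as in the patch */
    }
    if (addr < 0 || addr >= total_sectors) {
        return -EINVAL;              /* extra guard, not in the patch */
    }

    remaining = total_sectors - addr;
    assert(remaining > 0);
    if (remaining < BDRV_SECTORS_PER_DIRTY_CHUNK) {
        return (int)remaining;       /* short final chunk */
    }
    return BDRV_SECTORS_PER_DIRTY_CHUNK;
}
```

With a device of 1 MB plus 100 sectors, the chunk starting at sector 2048
gets 100 sectors instead of 2048, which is the case the patch is about.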

>>> But I guess changing the protocol would be best as it would avoid headaches to people who mistakenly created a file that is too small.
>>
>> We should think carefully before changing the protocol.
>>
>> Kevin?
>>
>>>
>>>> BTW, you should use error_report instead of fprintf(stderr, ...).
>>>
>>> I didn't know that, I followed what was used in this file. Thank you.
>>>
>>> --
>>> Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
>>> http://perso.univ-rennes1.fr/pierre.riteau/
>>>
>>>
>>>
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21  9:11         ` Kevin Wolf
@ 2011-01-21 12:26           ` Yoshiaki Tamura
  0 siblings, 0 replies; 22+ messages in thread
From: Yoshiaki Tamura @ 2011-01-21 12:26 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Pierre Riteau

2011/1/21 Kevin Wolf <kwolf@redhat.com>:
> Am 21.01.2011 09:08, schrieb Pierre Riteau:
>> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> a écrit :
>>
>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>
>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>
>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>> ---
>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>
>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>> index 1475325..eeb9c62 100644
>>>>>> --- a/block-migration.c
>>>>>> +++ b/block-migration.c
>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>     int64_t addr;
>>>>>>     BlockDriverState *bs;
>>>>>>     uint8_t *buf;
>>>>>> +    int64_t total_sectors;
>>>>>> +    int nr_sectors;
>>>>>>
>>>>>>     do {
>>>>>>         addr = qemu_get_be64(f);
>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>                 return -EINVAL;
>>>>>>             }
>>>>>>
>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>> +            if (total_sectors <= 0) {
>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>> +                return -EINVAL;
>>>>>> +            }
>>>>>> +
>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>> +            } else {
>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>> +            }
>>>>>> +
>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>
>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>
>>>>>>             qemu_free(buf);
>>>>>>             if (ret < 0) {
>>>>>> --
>>>>>> 1.7.3.5
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> Hi Pierre,
>>>>>
>>>>> I don't think the fix above is correct.  If you have a file which
>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>> patch.  However, the receiver doesn't know how many sectors
>>>>> the sender wants written, so the guest may fail after
>>>>> migration because some data may not be written.  IIUC, although
>>>>> changing the bytestream should be avoided as much as possible, we
>>>>> should save/load total_sectors to check that an appropriate file is
>>>>> allocated on the receiver side.
>>>>
>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>
>>> I personally don't like that; it asks too much of the user.
>>> Can't we expand the image on the fly?  We can just abort if expanding
>>> failed anyway.
>>
>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>
>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>
> Actually, that you can change the image size is a special case. It only
> works on raw with file and sheepdog, and on qcow2 and qed. All other
> block drivers can't do it.
>
>>>> But I guess changing the protocol would be best as it would avoid headaches to people who mistakenly created a file that is too small.
>>>
>>> We should think carefully before changing the protocol.
>>>
>>> Kevin?
>
> Can we do it in a compatible way? I agree that it would be nice to catch
> this error, but changing the protocol in an incompatible way for it
> seems to be too much.

No.  However, it's not only about catching this error, but also about
improving the usability of block migration.  I don't expect to change
everything at once, but I think it would be worthwhile to discuss
whether we want to improve block migration.

Yoshi
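
To make the idea of saving and loading total_sectors concrete, here is a
rough sketch of what the receiver-side check could look like.  It is
purely illustrative: the helper names and the wire layout (one big-endian
64-bit field) are assumptions, nothing like this exists in the current
stream format, and as said above it would not be a compatible change.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Encode a 64-bit value in big-endian byte order. */
static void put_be64(uint8_t out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}

/* Decode a big-endian 64-bit value. */
static uint64_t get_be64(const uint8_t in[8])
{
    uint64_t v = 0;

    for (int i = 0; i < 8; i++) {
        v = (v << 8) | in[i];
    }
    return v;
}

/*
 * Hypothetical receiver-side check: compare the sector count announced
 * by the sender with the size of the local device.  Returns 0 when they
 * match and -EINVAL otherwise.
 */
static int check_announced_size(const uint8_t wire[8],
                                int64_t local_total_sectors)
{
    uint64_t announced = get_be64(wire);

    assert(local_total_sectors >= 0);
    return announced == (uint64_t)local_total_sectors ? 0 : -EINVAL;
}
```

The sender would call put_be64() once per device before the first chunk,
and the receiver would abort the migration early on a mismatch instead of
failing on the last write.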

> Anyway, it's independent of this patch and can be done on top.
>
> Kevin
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 12:15         ` Yoshiaki Tamura
@ 2011-01-21 12:31           ` Kevin Wolf
  2011-01-21 12:36             ` Yoshiaki Tamura
  0 siblings, 1 reply; 22+ messages in thread
From: Kevin Wolf @ 2011-01-21 12:31 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: qemu-devel, Pierre Riteau

Am 21.01.2011 13:15, schrieb Yoshiaki Tamura:
> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> a écrit :
>>
>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>
>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>
>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>> ---
>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>
>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>> index 1475325..eeb9c62 100644
>>>>>> --- a/block-migration.c
>>>>>> +++ b/block-migration.c
>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>     int64_t addr;
>>>>>>     BlockDriverState *bs;
>>>>>>     uint8_t *buf;
>>>>>> +    int64_t total_sectors;
>>>>>> +    int nr_sectors;
>>>>>>
>>>>>>     do {
>>>>>>         addr = qemu_get_be64(f);
>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>                 return -EINVAL;
>>>>>>             }
>>>>>>
>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>> +            if (total_sectors <= 0) {
>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>> +                return -EINVAL;
>>>>>> +            }
>>>>>> +
>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>> +            } else {
>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>> +            }
>>>>>> +
>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>
>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>
>>>>>>             qemu_free(buf);
>>>>>>             if (ret < 0) {
>>>>>> --
>>>>>> 1.7.3.5
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> Hi Pierre,
>>>>>
>>>>> I don't think the fix above is correct.  If you have a file which
>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>> patch.  However, the receiver doesn't know how many sectors
>>>>> the sender wants written, so the guest may fail after
>>>>> migration because some data may not be written.  IIUC, although
>>>>> changing the bytestream should be avoided as much as possible, we
>>>>> should save/load total_sectors to check that an appropriate file is
>>>>> allocated on the receiver side.
>>>>
>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>
>>> I personally don't like that; it asks too much of the user.
>>> Can't we expand the image on the fly?  We can just abort if expanding
>>> failed anyway.
>>
>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>
>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>
> 
> Right.  But in the case of a partition, doesn't the check in the patch
> below return an error?  Does bdrv_getlength return the size correctly?

I'm pretty sure that it does. We would have problems in other places if
it didn't (e.g. we're checking if I/O requests are within the disk size).

Kevin
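
The kind of bounds check meant here can be sketched as follows.  This is
not the actual block-layer code, just an illustration under assumed
names: request_in_range() is hypothetical, and all quantities are in
sectors.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return 1 if the request
 * [sector_num, sector_num + nb_sectors) lies entirely within a device
 * of total_sectors sectors, and 0 otherwise.  The comparison is written
 * as a subtraction so that it cannot overflow.
 */
static int request_in_range(int64_t total_sectors, int64_t sector_num,
                            int nb_sectors)
{
    assert(total_sectors >= 0);

    if (sector_num < 0 || nb_sectors < 0) {
        return 0;
    }
    if (sector_num > total_sectors) {
        return 0;
    }
    return nb_sectors <= total_sectors - sector_num;
}
```

On a device of 2148 sectors, a full 2048-sector write starting at sector
2048 is rejected by such a check, while a 100-sector write at the same
offset passes.  That matches the failing last bdrv_write described in the
patch.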

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 12:31           ` Kevin Wolf
@ 2011-01-21 12:36             ` Yoshiaki Tamura
  2011-01-21 12:40               ` Pierre Riteau
  0 siblings, 1 reply; 22+ messages in thread
From: Yoshiaki Tamura @ 2011-01-21 12:36 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Pierre Riteau

2011/1/21 Kevin Wolf <kwolf@redhat.com>:
> Am 21.01.2011 13:15, schrieb Yoshiaki Tamura:
>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> a écrit :
>>>
>>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>
>>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>
>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>> ---
>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>
>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>> index 1475325..eeb9c62 100644
>>>>>>> --- a/block-migration.c
>>>>>>> +++ b/block-migration.c
>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>     int64_t addr;
>>>>>>>     BlockDriverState *bs;
>>>>>>>     uint8_t *buf;
>>>>>>> +    int64_t total_sectors;
>>>>>>> +    int nr_sectors;
>>>>>>>
>>>>>>>     do {
>>>>>>>         addr = qemu_get_be64(f);
>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>                 return -EINVAL;
>>>>>>>             }
>>>>>>>
>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>> +            if (total_sectors <= 0) {
>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>> +                return -EINVAL;
>>>>>>> +            }
>>>>>>> +
>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>> +            } else {
>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>> +            }
>>>>>>> +
>>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>
>>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>
>>>>>>>             qemu_free(buf);
>>>>>>>             if (ret < 0) {
>>>>>>> --
>>>>>>> 1.7.3.5
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Hi Pierre,
>>>>>>
>>>>>> I don't think the fix above is correct.  If you have a file which
>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>> patch.  However, the receiver doesn't know how many sectors
>>>>>> the sender wants written, so the guest may fail after
>>>>>> migration because some data may not be written.  IIUC, although
>>>>>> changing the bytestream should be avoided as much as possible, we
>>>>>> should save/load total_sectors to check that an appropriate file is
>>>>>> allocated on the receiver side.
>>>>>
>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>
>>>> I personally don't like that; it asks too much of the user.
>>>> Can't we expand the image on the fly?  We can just abort if expanding
>>>> failed anyway.
>>>
>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>
>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>
>>
>> Right.  But in the case of a partition, doesn't the check in the patch
>> below return an error?  Does bdrv_getlength return the size correctly?
>
> I'm pretty sure that it does. We would have problems in other places if
> it didn't (e.g. we're checking if I/O requests are within the disk size).

Sorry for the noise.  I just learned it's returning the value of lseek
in case of raw-posix.

Yoshi
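
For readers unfamiliar with the technique, getting a length from lseek
looks roughly like this.  It is a minimal sketch using plain POSIX calls,
not the raw-posix driver itself; fd_length() and path_length() are
hypothetical helper names.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: report the size in bytes of an open descriptor
 * by seeking to its end.  Returns -1 on failure.  Note that this moves
 * the file offset to the end of the file.
 */
static int64_t fd_length(int fd)
{
    off_t end = lseek(fd, 0, SEEK_END);

    if (end == (off_t)-1) {
        return -1;
    }
    return (int64_t)end;
}

/* Convenience wrapper: open a path read-only and report its length. */
static int64_t path_length(const char *path)
{
    int fd;
    int64_t len;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    len = fd_length(fd);
    close(fd);
    return len;
}
```

Shifting the result right by BDRV_SECTOR_BITS then gives the sector count
used in the patch.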

>
> Kevin
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 12:36             ` Yoshiaki Tamura
@ 2011-01-21 12:40               ` Pierre Riteau
  2011-01-21 13:59                 ` Yoshiaki Tamura
  0 siblings, 1 reply; 22+ messages in thread
From: Pierre Riteau @ 2011-01-21 12:40 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: Kevin Wolf, qemu-devel

On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote:

> 2011/1/21 Kevin Wolf <kwolf@redhat.com>:
>> Am 21.01.2011 13:15, schrieb Yoshiaki Tamura:
>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> a écrit :
>>>> 
>>>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>> 
>>>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>> 
>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>> ---
>>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>> 
>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>> --- a/block-migration.c
>>>>>>>> +++ b/block-migration.c
>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>     int64_t addr;
>>>>>>>>     BlockDriverState *bs;
>>>>>>>>     uint8_t *buf;
>>>>>>>> +    int64_t total_sectors;
>>>>>>>> +    int nr_sectors;
>>>>>>>> 
>>>>>>>>     do {
>>>>>>>>         addr = qemu_get_be64(f);
>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>                 return -EINVAL;
>>>>>>>>             }
>>>>>>>> 
>>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>> +            if (total_sectors <= 0) {
>>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>> +                return -EINVAL;
>>>>>>>> +            }
>>>>>>>> +
>>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>>> +            } else {
>>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>> +            }
>>>>>>>> +
>>>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>> 
>>>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>> 
>>>>>>>>             qemu_free(buf);
>>>>>>>>             if (ret < 0) {
>>>>>>>> --
>>>>>>>> 1.7.3.5
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> Hi Pierre,
>>>>>>> 
>>>>>>> I don't think the fix above is correct.  If you have a file which
>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>> patch.  However, the receiver doesn't know how many sectors
>>>>>>> the sender wants written, so the guest may fail after
>>>>>>> migration because some data may not be written.  IIUC, although
>>>>>>> changing the bytestream should be avoided as much as possible, we
>>>>>>> should save/load total_sectors to check that an appropriate file is
>>>>>>> allocated on the receiver side.
>>>>>> 
>>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>> 
>>>>> I personally don't like that; it asks too much of the user.
>>>>> Can't we expand the image on the fly?  We can just abort if expanding
>>>>> failed anyway.
>>>> 
>>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>> 
>>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>> 
>>> 
>>> Right.  But in the case of a partition, doesn't the check in the patch
>>> below return an error?  Does bdrv_getlength return the size correctly?
>> 
>> I'm pretty sure that it does. We would have problems in other places if
>> it didn't (e.g. we're checking if I/O requests are within the disk size).
> 
> Sorry for the noise.  I just learned it's returning the value of lseek
> in case of raw-posix.


And it does an ioctl call on platforms other than Linux.

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 12:40               ` Pierre Riteau
@ 2011-01-21 13:59                 ` Yoshiaki Tamura
  2011-01-21 14:09                   ` Kevin Wolf
  2011-01-21 14:14                   ` Pierre Riteau
  0 siblings, 2 replies; 22+ messages in thread
From: Yoshiaki Tamura @ 2011-01-21 13:59 UTC (permalink / raw)
  To: Pierre Riteau; +Cc: Kevin Wolf, qemu-devel

2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
> On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote:
>
>> 2011/1/21 Kevin Wolf <kwolf@redhat.com>:
>>> Am 21.01.2011 13:15, schrieb Yoshiaki Tamura:
>>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> a écrit :
>>>>>
>>>>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>>>
>>>>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>>>
>>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>>> ---
>>>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>>> --- a/block-migration.c
>>>>>>>>> +++ b/block-migration.c
>>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>     int64_t addr;
>>>>>>>>>     BlockDriverState *bs;
>>>>>>>>>     uint8_t *buf;
>>>>>>>>> +    int64_t total_sectors;
>>>>>>>>> +    int nr_sectors;
>>>>>>>>>
>>>>>>>>>     do {
>>>>>>>>>         addr = qemu_get_be64(f);
>>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>                 return -EINVAL;
>>>>>>>>>             }
>>>>>>>>>
>>>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>>> +            if (total_sectors <= 0) {
>>>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>>> +                return -EINVAL;
>>>>>>>>> +            }
>>>>>>>>> +
>>>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>>>> +            } else {
>>>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>>> +            }
>>>>>>>>> +
>>>>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>>>
>>>>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>>>
>>>>>>>>>             qemu_free(buf);
>>>>>>>>>             if (ret < 0) {
>>>>>>>>> --
>>>>>>>>> 1.7.3.5
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Pierre,
>>>>>>>>
>>>>>>>> I don't think the fix above is correct.  If you have a file which
>>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>>> patch.  However, the receiver doesn't know how many sectors
>>>>>>>> the sender wants written, so the guest may fail after
>>>>>>>> migration because some data may not be written.  IIUC, although
>>>>>>>> changing the bytestream should be avoided as much as possible, we
>>>>>>>> should save/load total_sectors to check that an appropriate file is
>>>>>>>> allocated on the receiver side.
>>>>>>>
>>>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>>>
>>>>>> I personally don't like that; it asks too much of the user.
>>>>>> Can't we expand the image on the fly?  We can just abort if expanding
>>>>>> failed anyway.
>>>>>
>>>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>>>
>>>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>>>
>>>>
>>>> Right.  But in the case of a partition, doesn't the check in the patch
>>>> below return an error?  Does bdrv_getlength return the size correctly?
>>>
>>> I'm pretty sure that it does. We would have problems in other places if
>>> it didn't (e.g. we're checking if I/O requests are within the disk size).
>>
>> Sorry for the noise.  I just learned it's returning the value of lseek
>> in case of raw-posix.
>
>
> And it does an ioctl call on platforms other than Linux.

Thanks.  Just a quick question regarding total_sectors.
BlockDriverState seems to contain total_sectors.  Can we avoid
calling bdrv_getlength() if bs->total_sectors is already there?

Yoshi

>
> --
> Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
> http://perso.univ-rennes1.fr/pierre.riteau/
>
>
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 13:59                 ` Yoshiaki Tamura
@ 2011-01-21 14:09                   ` Kevin Wolf
  2011-01-21 14:18                     ` Yoshiaki Tamura
  2011-01-21 14:14                   ` Pierre Riteau
  1 sibling, 1 reply; 22+ messages in thread
From: Kevin Wolf @ 2011-01-21 14:09 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: qemu-devel, Pierre Riteau

Am 21.01.2011 14:59, schrieb Yoshiaki Tamura:
> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>> On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote:
>>
>>> 2011/1/21 Kevin Wolf <kwolf@redhat.com>:
>>>> Am 21.01.2011 13:15, schrieb Yoshiaki Tamura:
>>>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> a écrit :
>>>>>>
>>>>>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>>>>
>>>>>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>>>>
>>>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>>>> ---
>>>>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>>>> --- a/block-migration.c
>>>>>>>>>> +++ b/block-migration.c
>>>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>     int64_t addr;
>>>>>>>>>>     BlockDriverState *bs;
>>>>>>>>>>     uint8_t *buf;
>>>>>>>>>> +    int64_t total_sectors;
>>>>>>>>>> +    int nr_sectors;
>>>>>>>>>>
>>>>>>>>>>     do {
>>>>>>>>>>         addr = qemu_get_be64(f);
>>>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>                 return -EINVAL;
>>>>>>>>>>             }
>>>>>>>>>>
>>>>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>>>> +            if (total_sectors <= 0) {
>>>>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>>>> +                return -EINVAL;
>>>>>>>>>> +            }
>>>>>>>>>> +
>>>>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>>>>> +            } else {
>>>>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>>>> +            }
>>>>>>>>>> +
>>>>>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>>>>
>>>>>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>>>>
>>>>>>>>>>             qemu_free(buf);
>>>>>>>>>>             if (ret < 0) {
>>>>>>>>>> --
>>>>>>>>>> 1.7.3.5
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi Pierre,
>>>>>>>>>
>>>>>>>>> I don't think the fix above is correct.  If you have a file which
>>>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>>>> patch.  However, the receiver doesn't know how many sectors
>>>>>>>>> the sender wants written, so the guest may fail after
>>>>>>>>> migration because some data may not be written.  IIUC, although
>>>>>>>>> changing the bytestream should be avoided as much as possible, we
>>>>>>>>> should save/load total_sectors to check that an appropriate file is
>>>>>>>>> allocated on the receiver side.
>>>>>>>>
>>>>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>>>>
>>>>>>> I personally don't like that; it asks too much of the user.
>>>>>>> Can't we expand the image on the fly?  We can just abort if expanding
>>>>>>> failed anyway.
>>>>>>
>>>>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>>>>
>>>>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>>>>
>>>>>
>>>>> Right.  But in the case of a partition, doesn't the check in the patch
>>>>> below return an error?  Does bdrv_getlength return the size correctly?
>>>>
>>>> I'm pretty sure that it does. We would have problems in other places if
>>>> it didn't (e.g. we're checking if I/O requests are within the disk size).
>>>
>>> Sorry for the noise.  I just learned it's returning the value of lseek
>>> in case of raw-posix.
>>
>>
>> And it does an ioctl call on platforms other than Linux.
> 
> Thanks.  Just a quick question regarding total_sectors.
> BlockDriverState seems to contain total_sectors.  Can we avoid
> calling bdrv_getlength() if bs->total_sectors is already there?

I'd need to check the details, but I think it may not be correct with
growable files.

Kevin
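
The concern with a cached value can be illustrated outside of QEMU.  The
sketch below is only an analogy under assumed names: struct cached_dev
and its helpers are hypothetical and say nothing about how
BlockDriverState actually maintains total_sectors.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define SECTOR_BITS 9   /* assumed 512-byte sectors */

/* Hypothetical device handle with a sector count cached at open time. */
struct cached_dev {
    int fd;
    int64_t total_sectors;
};

/* Query the current size in sectors; moves the offset to end of file. */
static int64_t current_sectors(int fd)
{
    off_t end = lseek(fd, 0, SEEK_END);

    return end == (off_t)-1 ? -1 : (int64_t)(end >> SECTOR_BITS);
}

/* Open (or create) the file and take a snapshot of its sector count. */
static int cached_dev_open(struct cached_dev *d, const char *path)
{
    assert(d != NULL && path != NULL);

    d->fd = open(path, O_RDWR | O_CREAT, 0600);
    if (d->fd < 0) {
        return -1;
    }
    d->total_sectors = current_sectors(d->fd);
    if (d->total_sectors < 0) {
        close(d->fd);
        return -1;
    }
    return 0;
}
```

If the file grows after cached_dev_open(), d->total_sectors keeps the old
value while current_sectors(d->fd) reports the new one, so a clamp based
on the cached field would cut the last chunk too short.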

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 13:59                 ` Yoshiaki Tamura
  2011-01-21 14:09                   ` Kevin Wolf
@ 2011-01-21 14:14                   ` Pierre Riteau
  2011-01-21 14:21                     ` Yoshiaki Tamura
  1 sibling, 1 reply; 22+ messages in thread
From: Pierre Riteau @ 2011-01-21 14:14 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: Kevin Wolf, qemu-devel

On 21 janv. 2011, at 14:59, Yoshiaki Tamura wrote:

> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>> On 21 janv. 2011, at 13:36, Yoshiaki Tamura wrote:
>> 
>>> 2011/1/21 Kevin Wolf <kwolf@redhat.com>:
>>>> Am 21.01.2011 13:15, schrieb Yoshiaki Tamura:
>>>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>> Le 20 janv. 2011 à 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> a écrit :
>>>>>> 
>>>>>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>> On 20 janv. 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>>>> 
>>>>>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>>>> 
>>>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>>>> ---
>>>>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>>>> 
>>>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>>>> --- a/block-migration.c
>>>>>>>>>> +++ b/block-migration.c
>>>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>     int64_t addr;
>>>>>>>>>>     BlockDriverState *bs;
>>>>>>>>>>     uint8_t *buf;
>>>>>>>>>> +    int64_t total_sectors;
>>>>>>>>>> +    int nr_sectors;
>>>>>>>>>> 
>>>>>>>>>>     do {
>>>>>>>>>>         addr = qemu_get_be64(f);
>>>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>                 return -EINVAL;
>>>>>>>>>>             }
>>>>>>>>>> 
>>>>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>>>> +            if (total_sectors <= 0) {
>>>>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>>>> +                return -EINVAL;
>>>>>>>>>> +            }
>>>>>>>>>> +
>>>>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>>>>> +            } else {
>>>>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>>>> +            }
>>>>>>>>>> +
>>>>>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>>>> 
>>>>>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>>>> 
>>>>>>>>>>             qemu_free(buf);
>>>>>>>>>>             if (ret < 0) {
>>>>>>>>>> --
>>>>>>>>>> 1.7.3.5
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi Pierre,
>>>>>>>>> 
>>>>>>>>> I don't think the fix above is correct.  If you have a file which
>>>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>>>> patch.  However, the receiver doesn't know how many sectors the
>>>>>>>>> sender wants to be written, so the guest may fail after
>>>>>>>>> migration because some data may not be written.  IIUC, although
>>>>>>>>> changing the bytestream should be avoided as much as possible, we
>>>>>>>>> should save/load total_sectors to check that an appropriate file is
>>>>>>>>> allocated on the receiver side.
>>>>>>>> 
>>>>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>>>> 
>>>>>>> I personally don't like that; It's insisting too much to the user.
>>>>>>> Can't we expand the image on the fly?  We can just abort if expanding
>>>>>>> failed anyway.
>>>>>> 
>>>>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>>>> 
>>>>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>>>> 
>>>>> 
>>>>> Right.  But in case of partition doesn't the check in the patch below
>>>>> return error?  Does bdrv_getlength return the size correctly?
>>>> 
>>>> I'm pretty sure that it does. We would have problems in other places if
>>>> it didn't (e.g. we're checking if I/O requests are within the disk size).
>>> 
>>> Sorry for the noise.  I just learned it's returning the value of lseek
>>> in case of raw-posix.
>> 
>> 
>> And it does a ioctl call on other platforms than Linux.
> 
> Thanks.  Just a quick question regarding total_sectors.
> BlockDriverState seems to contain total_sectors.  Can we avoid
> calling bdrv_getlength() if bs->total_sectors were already there?

From a comment in bdrv_getlength():

Fixed size devices use the total_sectors value for speed instead of
issuing a length query (like lseek) on each call.  Also, legacy block
drivers don't provide a bdrv_getlength function and must use
total_sectors.

So using bdrv_getlength will protect against devices being resized during migration, but as far as I can see, the sender side doesn't support it: the value of total_sectors is cached for the whole block migration.
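
To make the arithmetic in the patch concrete, here is a minimal standalone sketch of the last-chunk clamp. The constants are assumptions chosen to mirror 512-byte sectors and a 1 MB BLOCK_SIZE, and clamp_nr_sectors() is a hypothetical helper for illustration, not a function in QEMU:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration: 512-byte sectors, 1 MB chunks. */
#define SECTOR_BITS        9
#define CHUNK_BYTES        (1 << 20)
#define SECTORS_PER_CHUNK  (CHUNK_BYTES >> SECTOR_BITS)   /* 2048 */

/*
 * Hypothetical helper: number of sectors to write for the chunk that
 * starts at sector `addr` on a device of `total_sectors` sectors.
 * Returns -1 if the chunk starts at or beyond the end of the device.
 */
static int clamp_nr_sectors(int64_t total_sectors, int64_t addr)
{
    assert(total_sectors > 0 && addr >= 0);

    if (addr >= total_sectors) {
        return -1;                            /* nothing left to write */
    }
    if (total_sectors - addr < SECTORS_PER_CHUNK) {
        return (int)(total_sectors - addr);   /* short last chunk */
    }
    return SECTORS_PER_CHUNK;                 /* full chunk */
}
```

For a device of 3 * 2048 + 100 = 6244 sectors, the chunk starting at sector 6144 yields 100 sectors instead of 2048, which is the unaligned-tail case the patch is meant to handle.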

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/


* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 14:09                   ` Kevin Wolf
@ 2011-01-21 14:18                     ` Yoshiaki Tamura
  0 siblings, 0 replies; 22+ messages in thread
From: Yoshiaki Tamura @ 2011-01-21 14:18 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: qemu-devel, Pierre Riteau

2011/1/21 Kevin Wolf <kwolf@redhat.com>:
> On 21.01.2011 14:59, Yoshiaki Tamura wrote:
>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>> On 21 Jan 2011, at 13:36, Yoshiaki Tamura wrote:
>>>
>>>> 2011/1/21 Kevin Wolf <kwolf@redhat.com>:
>>>>> On 21.01.2011 13:15, Yoshiaki Tamura wrote:
>>>>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>> On 20 Jan 2011, at 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> wrote:
>>>>>>>
>>>>>>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>>>>>
>>>>>>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>>>>>
>>>>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>>>>> ---
>>>>>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>>>>> --- a/block-migration.c
>>>>>>>>>>> +++ b/block-migration.c
>>>>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>>     int64_t addr;
>>>>>>>>>>>     BlockDriverState *bs;
>>>>>>>>>>>     uint8_t *buf;
>>>>>>>>>>> +    int64_t total_sectors;
>>>>>>>>>>> +    int nr_sectors;
>>>>>>>>>>>
>>>>>>>>>>>     do {
>>>>>>>>>>>         addr = qemu_get_be64(f);
>>>>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>>                 return -EINVAL;
>>>>>>>>>>>             }
>>>>>>>>>>>
>>>>>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>>>>> +            if (total_sectors <= 0) {
>>>>>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>>>>> +                return -EINVAL;
>>>>>>>>>>> +            }
>>>>>>>>>>> +
>>>>>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>>>>>> +            } else {
>>>>>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>>>>> +            }
>>>>>>>>>>> +
>>>>>>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>>>>>
>>>>>>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>>>>>
>>>>>>>>>>>             qemu_free(buf);
>>>>>>>>>>>             if (ret < 0) {
>>>>>>>>>>> --
>>>>>>>>>>> 1.7.3.5
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Pierre,
>>>>>>>>>>
>>>>>>>>>> I don't think the fix above is correct.  If you have a file which
>>>>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>>>>> patch.  However, the receiver doesn't know how many sectors the
>>>>>>>>>> sender wants to be written, so the guest may fail after
>>>>>>>>>> migration because some data may not be written.  IIUC, although
>>>>>>>>>> changing the bytestream should be avoided as much as possible, we
>>>>>>>>>> should save/load total_sectors to check that an appropriate file is
>>>>>>>>>> allocated on the receiver side.
>>>>>>>>>
>>>>>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>>>>>
>>>>>>>> I personally don't like that; It's insisting too much to the user.
>>>>>>>> Can't we expand the image on the fly?  We can just abort if expanding
>>>>>>>> failed anyway.
>>>>>>>
>>>>>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>>>>>
>>>>>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>>>>>
>>>>>>
>>>>>> Right.  But in case of partition doesn't the check in the patch below
>>>>>> return error?  Does bdrv_getlength return the size correctly?
>>>>>
>>>>> I'm pretty sure that it does. We would have problems in other places if
>>>>> it didn't (e.g. we're checking if I/O requests are within the disk size).
>>>>
>>>> Sorry for the noise.  I just learned it's returning the value of lseek
>>>> in case of raw-posix.
>>>
>>>
>>> And it does a ioctl call on other platforms than Linux.
>>
>> Thanks.  Just a quick question regarding total_sectors.
>> BlockDriverState seems to contain total_sectors.  Can we avoid
>> calling bdrv_getlength() if bs->total_sectors were already there?
>
> I'd need to check the details, but I think it may not be correct with
> growable files.

Does the growable flag mean that total_sectors can grow?  Because
block migration requires users to preallocate a file of sufficient
size, it doesn't seem to be a problem, IIUC.

Yoshi

>
> Kevin
>
>


* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 14:14                   ` Pierre Riteau
@ 2011-01-21 14:21                     ` Yoshiaki Tamura
  2011-01-21 14:23                       ` Pierre Riteau
  0 siblings, 1 reply; 22+ messages in thread
From: Yoshiaki Tamura @ 2011-01-21 14:21 UTC (permalink / raw)
  To: Pierre Riteau; +Cc: Kevin Wolf, qemu-devel

2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
> On 21 Jan 2011, at 14:59, Yoshiaki Tamura wrote:
>
>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>> On 21 Jan 2011, at 13:36, Yoshiaki Tamura wrote:
>>>
>>>> 2011/1/21 Kevin Wolf <kwolf@redhat.com>:
>>>>> On 21.01.2011 13:15, Yoshiaki Tamura wrote:
>>>>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>> On 20 Jan 2011, at 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> wrote:
>>>>>>>
>>>>>>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>>>>>
>>>>>>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>>>>>
>>>>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>>>>> ---
>>>>>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>>>>> --- a/block-migration.c
>>>>>>>>>>> +++ b/block-migration.c
>>>>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>>     int64_t addr;
>>>>>>>>>>>     BlockDriverState *bs;
>>>>>>>>>>>     uint8_t *buf;
>>>>>>>>>>> +    int64_t total_sectors;
>>>>>>>>>>> +    int nr_sectors;
>>>>>>>>>>>
>>>>>>>>>>>     do {
>>>>>>>>>>>         addr = qemu_get_be64(f);
>>>>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>>                 return -EINVAL;
>>>>>>>>>>>             }
>>>>>>>>>>>
>>>>>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>>>>> +            if (total_sectors <= 0) {
>>>>>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>>>>> +                return -EINVAL;
>>>>>>>>>>> +            }
>>>>>>>>>>> +
>>>>>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>>>>>> +            } else {
>>>>>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>>>>> +            }
>>>>>>>>>>> +
>>>>>>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>>>>>
>>>>>>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>>>>>
>>>>>>>>>>>             qemu_free(buf);
>>>>>>>>>>>             if (ret < 0) {
>>>>>>>>>>> --
>>>>>>>>>>> 1.7.3.5
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Pierre,
>>>>>>>>>>
>>>>>>>>>> I don't think the fix above is correct.  If you have a file which
>>>>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>>>>> patch.  However, the receiver doesn't know how many sectors the
>>>>>>>>>> sender wants to be written, so the guest may fail after
>>>>>>>>>> migration because some data may not be written.  IIUC, although
>>>>>>>>>> changing the bytestream should be avoided as much as possible, we
>>>>>>>>>> should save/load total_sectors to check that an appropriate file is
>>>>>>>>>> allocated on the receiver side.
>>>>>>>>>
>>>>>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>>>>>
>>>>>>>> I personally don't like that; It's insisting too much to the user.
>>>>>>>> Can't we expand the image on the fly?  We can just abort if expanding
>>>>>>>> failed anyway.
>>>>>>>
>>>>>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>>>>>
>>>>>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>>>>>
>>>>>>
>>>>>> Right.  But in case of partition doesn't the check in the patch below
>>>>>> return error?  Does bdrv_getlength return the size correctly?
>>>>>
>>>>> I'm pretty sure that it does. We would have problems in other places if
>>>>> it didn't (e.g. we're checking if I/O requests are within the disk size).
>>>>
>>>> Sorry for the noise.  I just learned it's returning the value of lseek
>>>> in case of raw-posix.
>>>
>>>
>>> And it does a ioctl call on other platforms than Linux.
>>
>> Thanks.  Just a quick question regarding total_sectors.
>> BlockDriverState seems to contain total_sectors.  Can we avoid
>> calling bdrv_getlength() if bs->total_sectors were already there?
>
> From a comment in bdrv_getlength():
>
> Fixed size devices use the total_sectors value for speed instead of
> issuing a length query (like lseek) on each call.  Also, legacy block
> drivers don't provide a bdrv_getlength function and must use
> total_sectors.
>
> So using bdrv_getlength will protect against devices being resized during migration, but as far as I can see, the sender side doesn't support it: the value of total_sectors is cached for the whole block migration.

Even if the sender supported it, as long as total_sectors isn't
sent to the receiver, can we follow the resize on the receiver?
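
One way to picture the save/load idea raised earlier in the thread: if the stream carried the sender's total_sectors, the receiver could compare it with its own device length before accepting any data. This is only a minimal sketch under that assumption (no such field is part of the stream discussed here), and check_device_size() is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical receiver-side check.  `sender_sectors` is assumed to have
 * been read from the migration stream; `local_sectors` is the length of
 * the destination device in sectors.  Returns 0 if the destination can
 * hold the whole source device, -1 otherwise.
 */
static int check_device_size(int64_t sender_sectors, int64_t local_sectors)
{
    if (sender_sectors <= 0 || local_sectors <= 0) {
        fprintf(stderr, "invalid device length\n");
        return -1;
    }
    if (local_sectors < sender_sectors) {
        fprintf(stderr, "destination too small: %lld < %lld sectors\n",
                (long long)local_sectors, (long long)sender_sectors);
        return -1;
    }
    return 0;
}
```

With a check like this, a destination that is too small would be rejected up front instead of silently losing the tail of the device. Following a resize in the middle of migration would still need the sender to resend the length, which this sketch does not cover.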

Yoshi

>
> --
> Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
> http://perso.univ-rennes1.fr/pierre.riteau/
>
>
>


* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 14:21                     ` Yoshiaki Tamura
@ 2011-01-21 14:23                       ` Pierre Riteau
  2011-01-21 14:30                         ` Yoshiaki Tamura
  0 siblings, 1 reply; 22+ messages in thread
From: Pierre Riteau @ 2011-01-21 14:23 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: Kevin Wolf, qemu-devel

On 21 Jan 2011, at 15:21, Yoshiaki Tamura wrote:

> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>> On 21 Jan 2011, at 14:59, Yoshiaki Tamura wrote:
>> 
>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>> On 21 Jan 2011, at 13:36, Yoshiaki Tamura wrote:
>>>> 
>>>>> 2011/1/21 Kevin Wolf <kwolf@redhat.com>:
>>>>>> On 21.01.2011 13:15, Yoshiaki Tamura wrote:
>>>>>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>> On 20 Jan 2011, at 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> wrote:
>>>>>>>> 
>>>>>>>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>>>>>> 
>>>>>>>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>>>>>> 
>>>>>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>>>>>> ---
>>>>>>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>>>>>> 
>>>>>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>>>>>> --- a/block-migration.c
>>>>>>>>>>>> +++ b/block-migration.c
>>>>>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>>>     int64_t addr;
>>>>>>>>>>>>     BlockDriverState *bs;
>>>>>>>>>>>>     uint8_t *buf;
>>>>>>>>>>>> +    int64_t total_sectors;
>>>>>>>>>>>> +    int nr_sectors;
>>>>>>>>>>>> 
>>>>>>>>>>>>     do {
>>>>>>>>>>>>         addr = qemu_get_be64(f);
>>>>>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>>>                 return -EINVAL;
>>>>>>>>>>>>             }
>>>>>>>>>>>> 
>>>>>>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>>>>>> +            if (total_sectors <= 0) {
>>>>>>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>>>>>> +                return -EINVAL;
>>>>>>>>>>>> +            }
>>>>>>>>>>>> +
>>>>>>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>>>>>>> +            } else {
>>>>>>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>>>>>> +            }
>>>>>>>>>>>> +
>>>>>>>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>>>>>> 
>>>>>>>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>>>>>> 
>>>>>>>>>>>>             qemu_free(buf);
>>>>>>>>>>>>             if (ret < 0) {
>>>>>>>>>>>> --
>>>>>>>>>>>> 1.7.3.5
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Hi Pierre,
>>>>>>>>>>> 
>>>>>>>>>>> I don't think the fix above is correct.  If you have a file which
>>>>>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>>>>>> patch.  However, the receiver doesn't know how many sectors the
>>>>>>>>>>> sender wants to be written, so the guest may fail after
>>>>>>>>>>> migration because some data may not be written.  IIUC, although
>>>>>>>>>>> changing the bytestream should be avoided as much as possible, we
>>>>>>>>>>> should save/load total_sectors to check that an appropriate file is
>>>>>>>>>>> allocated on the receiver side.
>>>>>>>>>> 
>>>>>>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>>>>>> 
>>>>>>>>> I personally don't like that; It's insisting too much to the user.
>>>>>>>>> Can't we expand the image on the fly?  We can just abort if expanding
>>>>>>>>> failed anyway.
>>>>>>>> 
>>>>>>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>>>>>> 
>>>>>>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>>>>>> 
>>>>>>> 
>>>>>>> Right.  But in case of partition doesn't the check in the patch below
>>>>>>> return error?  Does bdrv_getlength return the size correctly?
>>>>>> 
>>>>>> I'm pretty sure that it does. We would have problems in other places if
>>>>>> it didn't (e.g. we're checking if I/O requests are within the disk size).
>>>>> 
>>>>> Sorry for the noise.  I just learned it's returning the value of lseek
>>>>> in case of raw-posix.
>>>> 
>>>> 
>>>> And it does a ioctl call on other platforms than Linux.
>>> 
>>> Thanks.  Just a quick question regarding total_sectors.
>>> BlockDriverState seems to contain total_sectors.  Can we avoid
>>> calling bdrv_getlength() if bs->total_sectors were already there?
>> 
>> From a comment in bdrv_getlength():
>> 
>> Fixed size devices use the total_sectors value for speed instead of
>> issuing a length query (like lseek) on each call.  Also, legacy block
>> drivers don't provide a bdrv_getlength function and must use
>> total_sectors.
>> 
>> So using bdrv_getlength will protect against devices being resized during migration, but as far as I can see, the sender side doesn't support it: the value of total_sectors is cached for the whole block migration.
> 
> Even if the sender supports it, as far as total_sectors isn't
> sent to the receiver, can we follow the resize on the receiver?


I was referring to the complex and probably unrealistic scenario where a user allocates a file of the correct size on the receiving side, starts block migration, and then grows the disk on both the sender and receiver sides while migration is in progress.

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/


* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 14:23                       ` Pierre Riteau
@ 2011-01-21 14:30                         ` Yoshiaki Tamura
  2011-01-21 14:48                           ` Pierre Riteau
  0 siblings, 1 reply; 22+ messages in thread
From: Yoshiaki Tamura @ 2011-01-21 14:30 UTC (permalink / raw)
  To: Pierre Riteau; +Cc: Kevin Wolf, qemu-devel

2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
> On 21 Jan 2011, at 15:21, Yoshiaki Tamura wrote:
>
>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>> On 21 Jan 2011, at 14:59, Yoshiaki Tamura wrote:
>>>
>>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>> On 21 Jan 2011, at 13:36, Yoshiaki Tamura wrote:
>>>>>
>>>>>> 2011/1/21 Kevin Wolf <kwolf@redhat.com>:
>>>>>>> On 21.01.2011 13:15, Yoshiaki Tamura wrote:
>>>>>>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>> On 20 Jan 2011, at 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> wrote:
>>>>>>>>>
>>>>>>>>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>>>>>>>
>>>>>>>>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>>>>>>> ---
>>>>>>>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>>>>>>>
>>>>>>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>>>>>>> --- a/block-migration.c
>>>>>>>>>>>>> +++ b/block-migration.c
>>>>>>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>>>>     int64_t addr;
>>>>>>>>>>>>>     BlockDriverState *bs;
>>>>>>>>>>>>>     uint8_t *buf;
>>>>>>>>>>>>> +    int64_t total_sectors;
>>>>>>>>>>>>> +    int nr_sectors;
>>>>>>>>>>>>>
>>>>>>>>>>>>>     do {
>>>>>>>>>>>>>         addr = qemu_get_be64(f);
>>>>>>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>>>>                 return -EINVAL;
>>>>>>>>>>>>>             }
>>>>>>>>>>>>>
>>>>>>>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>>>>>>> +            if (total_sectors <= 0) {
>>>>>>>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>>>>>>> +                return -EINVAL;
>>>>>>>>>>>>> +            }
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>>>>>>>> +            } else {
>>>>>>>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>>>>>>> +            }
>>>>>>>>>>>>> +
>>>>>>>>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>>>>>>>
>>>>>>>>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>>>>>>>
>>>>>>>>>>>>>             qemu_free(buf);
>>>>>>>>>>>>>             if (ret < 0) {
>>>>>>>>>>>>> --
>>>>>>>>>>>>> 1.7.3.5
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Pierre,
>>>>>>>>>>>>
>>>>>>>>>>>> I don't think the fix above is correct.  If you have a file which
>>>>>>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>>>>>>> patch.  However, the receiver doesn't know how many sectors the
>>>>>>>>>>>> sender wants to be written, so the guest may fail after
>>>>>>>>>>>> migration because some data may not be written.  IIUC, although
>>>>>>>>>>>> changing the bytestream should be avoided as much as possible, we
>>>>>>>>>>>> should save/load total_sectors to check that an appropriate file is
>>>>>>>>>>>> allocated on the receiver side.
>>>>>>>>>>>
>>>>>>>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>>>>>>>
>>>>>>>>>> I personally don't like that; It's insisting too much to the user.
>>>>>>>>>> Can't we expand the image on the fly?  We can just abort if expanding
>>>>>>>>>> failed anyway.
>>>>>>>>>
>>>>>>>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>>>>>>>
>>>>>>>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Right.  But in case of partition doesn't the check in the patch below
>>>>>>>> return error?  Does bdrv_getlength return the size correctly?
>>>>>>>
>>>>>>> I'm pretty sure that it does. We would have problems in other places if
>>>>>>> it didn't (e.g. we're checking if I/O requests are within the disk size).
>>>>>>
>>>>>> Sorry for the noise.  I just learned it's returning the value of lseek
>>>>>> in case of raw-posix.
>>>>>
>>>>>
>>>>> And it does a ioctl call on other platforms than Linux.
>>>>
>>>> Thanks.  Just a quick question regarding total_sectors.
>>>> BlockDriverState seems to contain total_sectors.  Can we avoid
>>>> calling bdrv_getlength() if bs->total_sectors were already there?
>>>
>>> From a comment in bdrv_getlength():
>>>
>>> Fixed size devices use the total_sectors value for speed instead of
>>> issuing a length query (like lseek) on each call.  Also, legacy block
>>> drivers don't provide a bdrv_getlength function and must use
>>> total_sectors.
>>>
>>> So using bdrv_getlength will protect against devices being resized during migration, but as far as I can see, the sender side doesn't support it: the value of total_sectors is cached for the whole block migration.
>>
>> Even if the sender supports it, as far as total_sectors isn't
>> sent to the receiver, can we follow the resize on the receiver?
>
>
> I was referring to the complex, and probably unrealistic scenario, where a user allocates a file of the correct size on the receiving side, starts block migration, and during migration grows the size of the disk on both the sender and receiver side.

I thought supporting resize during block migration would be a good
feature, because Kemari live-migrates again and again :)

Yoshi

>
> --
> Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
> http://perso.univ-rennes1.fr/pierre.riteau/
>
>
>


* Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
  2011-01-21 14:30                         ` Yoshiaki Tamura
@ 2011-01-21 14:48                           ` Pierre Riteau
  0 siblings, 0 replies; 22+ messages in thread
From: Pierre Riteau @ 2011-01-21 14:48 UTC (permalink / raw)
  To: Yoshiaki Tamura; +Cc: Kevin Wolf, qemu-devel

On 21 Jan 2011, at 15:30, Yoshiaki Tamura wrote:

> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>> On 21 Jan 2011, at 15:21, Yoshiaki Tamura wrote:
>> 
>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>> On 21 Jan 2011, at 14:59, Yoshiaki Tamura wrote:
>>>> 
>>>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>> On 21 Jan 2011, at 13:36, Yoshiaki Tamura wrote:
>>>>>> 
>>>>>>> 2011/1/21 Kevin Wolf <kwolf@redhat.com>:
>>>>>>>> On 21.01.2011 13:15, Yoshiaki Tamura wrote:
>>>>>>>>> 2011/1/21 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>> On 20 Jan 2011, at 17:18, Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp> wrote:
>>>>>>>>>> 
>>>>>>>>>>> 2011/1/20 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> 2011/1/19 Pierre Riteau <Pierre.Riteau@irisa.fr>:
>>>>>>>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>>>>>>>> --- a/block-migration.c
>>>>>>>>>>>>>> +++ b/block-migration.c
>>>>>>>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>>>>>     int64_t addr;
>>>>>>>>>>>>>>     BlockDriverState *bs;
>>>>>>>>>>>>>>     uint8_t *buf;
>>>>>>>>>>>>>> +    int64_t total_sectors;
>>>>>>>>>>>>>> +    int nr_sectors;
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>     do {
>>>>>>>>>>>>>>         addr = qemu_get_be64(f);
>>>>>>>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>>>>>>>                 return -EINVAL;
>>>>>>>>>>>>>>             }
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>>>>>>>> +            if (total_sectors <= 0) {
>>>>>>>>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>>>>>>>> +                return -EINVAL;
>>>>>>>>>>>>>> +            }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>>>>>>>>> +            } else {
>>>>>>>>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>>>>>>>> +            }
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>             buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>             qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>             qemu_free(buf);
>>>>>>>>>>>>>>             if (ret < 0) {
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> 1.7.3.5
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Pierre,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I don't think the fix above is correct.  If you have a file which
>>>>>>>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>>>>>>>> patch.  However, the receiver doesn't know how many sectors the
>>>>>>>>>>>>> sender wants written, so the guest may fail after
>>>>>>>>>>>>> migration because some data may not have been written.  IIUC, although
>>>>>>>>>>>>> changing the bytestream should be avoided as much as possible, we
>>>>>>>>>>>>> should save/load total_sectors to check that an appropriately sized
>>>>>>>>>>>>> file is allocated on the receiver side.
>>>>>>>>>>>> 
>>>>>>>>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>>>>>>>> 
>>>>>>>>>>> I personally don't like that; it demands too much of the user.
>>>>>>>>>>> Can't we expand the image on the fly?  We can just abort if expanding
>>>>>>>>>>> fails anyway.
>>>>>>>>>> 
>>>>>>>>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>>>>>>>> 
>>>>>>>>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Right.  But in the case of a partition, doesn't the check in the patch
>>>>>>>>> below return an error?  Does bdrv_getlength return the size correctly?
>>>>>>>> 
>>>>>>>> I'm pretty sure that it does. We would have problems in other places if
>>>>>>>> it didn't (e.g. we're checking if I/O requests are within the disk size).
>>>>>>> 
>>>>>>> Sorry for the noise.  I just learned it's returning the value of lseek
>>>>>>> in case of raw-posix.
>>>>>> 
>>>>>> 
>>>>>> And it does an ioctl call on platforms other than Linux.
>>>>> 
>>>>> Thanks.  Just a quick question regarding total_sectors.
>>>>> BlockDriverState seems to contain total_sectors.  Can we avoid
>>>>> calling bdrv_getlength() if bs->total_sectors were already there?
>>>> 
>>>> From a comment in bdrv_getlength():
>>>> 
>>>> Fixed size devices use the total_sectors value for speed instead of
>>>> issuing a length query (like lseek) on each call.  Also, legacy block
>>>> drivers don't provide a bdrv_getlength function and must use
>>>> total_sectors.
>>>> 
>>>> So using bdrv_getlength will protect against devices being resized during migration, but as far as I can see, the sender side doesn't support it: the value of total_sectors is cached for the whole block migration.
>>> 
>>> Even if the sender supports it, as long as total_sectors isn't
>>> sent to the receiver, can we follow the resize on the receiver?
>> 
>> 
>> I was referring to the complex, and probably unrealistic, scenario where a user allocates a file of the correct size on the receiving side, starts block migration, and during migration grows the size of the disk on both the sender and receiver side.
> 
> I thought supporting resize during block migration would be a good
> feature because Kemari is live migrating again and again :)


Then bdrv_getlength would need to be called in the sender loop as well.

But there's one thing I don't know: how does the guest cope with online disk size changes? AFAIK Linux detects the size of the disk at boot.
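
To make the idea of querying the length on every iteration concrete, here is a minimal standalone sketch of the chunk-size computation from the patch, rewritten so the device length is fetched on each call rather than cached. It is not QEMU code: get_length_fn, chunk_sectors and fake_length are hypothetical names, and the constants (512-byte sectors, 2048 sectors per 1 MB chunk) are assumed values chosen to match the BLOCK_SIZE discussed in this thread.

```c
#include <stdint.h>

/* Assumed values for illustration: 512-byte sectors, 1 MB chunks. */
#define SECTOR_BITS        9
#define SECTORS_PER_CHUNK  2048

/* Hypothetical stand-in for a length query such as bdrv_getlength():
 * returns the device size in bytes, or a negative value on error. */
typedef int64_t (*get_length_fn)(void *dev);

/* Return how many sectors the chunk starting at sector `addr` covers.
 * The length is queried on every call, so a resize between two calls
 * is picked up.  Returns -1 if the query fails or `addr` is out of range. */
int chunk_sectors(get_length_fn get_length, void *dev, int64_t addr)
{
    int64_t len = get_length(dev);
    int64_t total_sectors;

    if (len <= 0) {
        return -1;
    }
    total_sectors = len >> SECTOR_BITS;
    if (addr < 0 || addr >= total_sectors) {
        return -1;
    }
    if (total_sectors - addr < SECTORS_PER_CHUNK) {
        /* Last, partial chunk: only the remaining sectors exist. */
        return (int)(total_sectors - addr);
    }
    return SECTORS_PER_CHUNK;
}

/* Hypothetical fake device for experiments: `dev` points to its size in bytes. */
int64_t fake_length(void *dev)
{
    return *(int64_t *)dev;
}
```

With a fake device of 3 MB plus 100 sectors, chunk_sectors() yields a full chunk at sector 0 and 100 sectors at sector 6144, which is the size the last write has to use. The same helper could serve both the sender loop and block_load, but whether that fits the real code is untested.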

-- 
Pierre Riteau -- PhD student, Myriads team, IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/


end of thread, other threads:[~2011-01-21 14:48 UTC | newest]

Thread overview: 22+ messages
2011-01-19 14:59 [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB Pierre Riteau
2011-01-20  2:06 ` Yoshiaki Tamura
2011-01-20  6:49   ` Pierre Riteau
2011-01-20 16:18     ` Yoshiaki Tamura
2011-01-21  8:08       ` Pierre Riteau
2011-01-21  9:11         ` Kevin Wolf
2011-01-21 12:26           ` Yoshiaki Tamura
2011-01-21 12:15         ` Yoshiaki Tamura
2011-01-21 12:31           ` Kevin Wolf
2011-01-21 12:36             ` Yoshiaki Tamura
2011-01-21 12:40               ` Pierre Riteau
2011-01-21 13:59                 ` Yoshiaki Tamura
2011-01-21 14:09                   ` Kevin Wolf
2011-01-21 14:18                     ` Yoshiaki Tamura
2011-01-21 14:14                   ` Pierre Riteau
2011-01-21 14:21                     ` Yoshiaki Tamura
2011-01-21 14:23                       ` Pierre Riteau
2011-01-21 14:30                         ` Yoshiaki Tamura
2011-01-21 14:48                           ` Pierre Riteau
2011-01-21  9:16 ` Kevin Wolf
2011-01-21 11:38   ` Pierre Riteau
2011-01-21 11:45     ` Kevin Wolf
