From mboxrd@z Thu Jan  1 00:00:00 1970
Subject: Re: [Qemu-devel] [PATCH] Fix block migration when the device size is not a multiple of 1 MB
From: Pierre Riteau
Date: Fri, 21 Jan 2011 13:40:35 +0100
Message-Id: <292A277F-FDB6-4842-9133-8CAC22F08453@irisa.fr>
References: <1295449188-17877-1-git-send-email-Pierre.Riteau@irisa.fr> <04350B7C-9933-4A70-8FA9-B5B409D1E10A@irisa.fr> <43211019-BF0D-405A-99B7-54C9B3BBE58E@irisa.fr> <4D397C8E.7080703@redhat.com>
To: Yoshiaki Tamura
Cc: Kevin Wolf, "qemu-devel@nongnu.org"

On 21 Jan 2011, at 13:36, Yoshiaki Tamura wrote:

> 2011/1/21 Kevin Wolf:
>> On 21.01.2011 13:15, Yoshiaki Tamura wrote:
>>> 2011/1/21 Pierre Riteau:
>>>> On 20 Jan 2011, at 17:18, Yoshiaki Tamura wrote:
>>>>
>>>>> 2011/1/20 Pierre Riteau:
>>>>>> On 20 Jan 2011, at 03:06, Yoshiaki Tamura wrote:
>>>>>>
>>>>>>> 2011/1/19 Pierre Riteau:
>>>>>>>> b02bea3a85cc939f09aa674a3f1e4f36d418c007 added a check on the return
>>>>>>>> value of bdrv_write and aborts migration when it fails. However, if the
>>>>>>>> size of the block device to migrate is not a multiple of BLOCK_SIZE
>>>>>>>> (currently 1 MB), the last bdrv_write will fail with -EIO.
>>>>>>>>
>>>>>>>> Fixed by calling bdrv_write with the correct size of the last block.
>>>>>>>> ---
>>>>>>>>  block-migration.c |   16 +++++++++++++++-
>>>>>>>>  1 files changed, 15 insertions(+), 1 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/block-migration.c b/block-migration.c
>>>>>>>> index 1475325..eeb9c62 100644
>>>>>>>> --- a/block-migration.c
>>>>>>>> +++ b/block-migration.c
>>>>>>>> @@ -635,6 +635,8 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>      int64_t addr;
>>>>>>>>      BlockDriverState *bs;
>>>>>>>>      uint8_t *buf;
>>>>>>>> +    int64_t total_sectors;
>>>>>>>> +    int nr_sectors;
>>>>>>>>
>>>>>>>>      do {
>>>>>>>>          addr = qemu_get_be64(f);
>>>>>>>> @@ -656,10 +658,22 @@ static int block_load(QEMUFile *f, void *opaque, int version_id)
>>>>>>>>                  return -EINVAL;
>>>>>>>>              }
>>>>>>>>
>>>>>>>> +            total_sectors = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
>>>>>>>> +            if (total_sectors <= 0) {
>>>>>>>> +                fprintf(stderr, "Error getting length of block device %s\n", device_name);
>>>>>>>> +                return -EINVAL;
>>>>>>>> +            }
>>>>>>>> +
>>>>>>>> +            if (total_sectors - addr < BDRV_SECTORS_PER_DIRTY_CHUNK) {
>>>>>>>> +                nr_sectors = total_sectors - addr;
>>>>>>>> +            } else {
>>>>>>>> +                nr_sectors = BDRV_SECTORS_PER_DIRTY_CHUNK;
>>>>>>>> +            }
>>>>>>>> +
>>>>>>>>              buf = qemu_malloc(BLOCK_SIZE);
>>>>>>>>
>>>>>>>>              qemu_get_buffer(f, buf, BLOCK_SIZE);
>>>>>>>> -            ret = bdrv_write(bs, addr, buf, BDRV_SECTORS_PER_DIRTY_CHUNK);
>>>>>>>> +            ret = bdrv_write(bs, addr, buf, nr_sectors);
>>>>>>>>
>>>>>>>>              qemu_free(buf);
>>>>>>>>              if (ret < 0) {
>>>>>>>> --
>>>>>>>> 1.7.3.5
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> Hi Pierre,
>>>>>>>
>>>>>>> I don't think the fix above is correct. If you have a file which
>>>>>>> isn't aligned with BLOCK_SIZE, you won't get an error with the
>>>>>>> patch. However, the receiver doesn't know how many sectors
>>>>>>> the sender wants to be written, so the guest may fail after
>>>>>>> migration because some data may not be written. IIUC, although
>>>>>>> changing the bytestream should be prevented as much as possible, we
>>>>>>> should save/load total_sectors to check that an appropriate file is
>>>>>>> allocated on the receiver side.
>>>>>>
>>>>>> Isn't the guest supposed to be started using a file with the correct size?
>>>>>
>>>>> I personally don't like that; it demands too much of the user.
>>>>> Can't we expand the image on the fly? We can just abort if expanding
>>>>> failed anyway.
>>>>
>>>> At first I thought your expansion idea was best, but now I think there are valid scenarios where it fails.
>>>>
>>>> Imagine both sides are not using a file but a disk partition as storage. If the partition size is not rounded to 1 MB, the last write will fail with the current code, and there is no way we can expand the partition.
>>>>
>>>
>>> Right. But in the case of a partition, doesn't the check in the patch below
>>> return an error? Does bdrv_getlength return the size correctly?
>>
>> I'm pretty sure that it does. We would have problems in other places if
>> it didn't (e.g. we're checking if I/O requests are within the disk size).
>
> Sorry for the noise. I just learned it's returning the value of lseek
> in the case of raw-posix.

And it does an ioctl call on platforms other than Linux.

-- 
Pierre Riteau -- PhD student, Myriads team,
IRISA, Rennes, France
http://perso.univ-rennes1.fr/pierre.riteau/