From mboxrd@z Thu Jan 1 00:00:00 1970
From: Alex Williamson
Subject: Re: [Qemu-devel] [RFC PATCH 1/5] qdev: Create qdev_get_dev_path()
Date: Tue, 15 Jun 2010 20:55:32 -0600
Message-ID: <1276656932.12015.960.camel@x201>
References: <20100614054923.879.33717.stgit@localhost.localdomain>
 <201006160130.58001.paul@codesourcery.com>
 <20100616003555.GP24131@x200.localdomain>
 <201006160230.04514.paul@codesourcery.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: Chris Wright , Jan Kiszka , Markus Armbruster , kvm@vger.kernel.org,
 qemu-devel@nongnu.org, avi@redhat.com, kraxel@redhat.com
To: Paul Brook
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:38320 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753656Ab0FPCzv
 (ORCPT ); Tue, 15 Jun 2010 22:55:51 -0400
In-Reply-To: <201006160230.04514.paul@codesourcery.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Wed, 2010-06-16 at 02:30 +0100, Paul Brook wrote:
> Transferring the machine description on migration is a separate problem.
>
> Let's say we associate each RAM block with a device. Each ram block also has a
> name. These names distinguish between blocks attached to a given device, but
> need not be globally unique. i.e. devices A and B can both have block named
> "foo". RAM block migration happens before device state migration (including
> device properties).
>
> There are three relevant migration failure modes:
>
> (1) The same device is present, but has a different size property.
> If the incoming block is larger than the allocated block then you lose.
> (2) A different device is present, but does not have a ram block of the same
> name.
> This safely fails migration because of the block name mismatch.
> (3) A different device is present, that happens to have a ram block of the
> same name.
> If the blocks are the same size then transferring the contents is harmless.
> If they are different sizes then this will be caught by (1). Either way, the
> migration will fail once we get to the vmstate check.
>
> Note how adding the device type to the canonical address does not affect the
> outcome.
>
> Going back to the original problem, (1) is the most interesting.
>
> I suggest that the initial migration phase transfers a list of ram blocks.
> Each entry in that list should be {canonical device path, name, size}. You
> should look up all these ram blocks, and fail migration if they are not present
> with the correct size[1]. This list also gives you a convenient numeric index
> to identify the block during RAM migration.
>
> [1] In the future we may be able to resize blocks. However this is not safe
> with the current API.

I think for the most part, you've just described the RAMBlock series of
patches I sent out last week. I'll note that that series creates ram
blocks on the target if they aren't present, because of the technicality
that we currently do not have a qemu_ram_free() to clean up the list when
things go away. Once we have that and clean up drivers to use it, I agree
we should fail the migration when a block is missing on the target, or at
least print out a big warning so we can go fix the driver. If I'm missing
where else it's significantly different, please let me know.

Yes, case (3) would fail in the vmstate code without the driver name in
the canonical path... or at least we hope it would. But with the driver
name in the canonical path, we can avoid doing a useless operation, fail
earlier, and provide the vmstate with a key piece of information it can
use to help ensure that the incoming state information belongs to the
driver it thinks it does. Seems like a win to me.
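
For illustration only, the check discussed above could be sketched roughly
as follows. This is Python rather than QEMU's C, it is not taken from the
RAMBlock series, and every name in it (RamBlock, MigrationError,
match_ram_blocks, the example path strings) is hypothetical. It assumes the
source sends a list of {canonical device path, name, size} entries and that
the canonical path happens to include the driver name.

```python
from typing import NamedTuple


class RamBlock(NamedTuple):
    dev_path: str  # canonical device path (format assumed for illustration)
    name: str      # block name, unique only within its device
    size: int      # block size in bytes


class MigrationError(Exception):
    """Raised when the incoming block list does not match the target."""


def match_ram_blocks(incoming, local):
    """Map each incoming list index to the matching block on the target.

    Fails early if a block is missing or has a different size, instead of
    waiting for the later vmstate check.
    """
    by_key = {(b.dev_path, b.name): b for b in local}
    index = {}
    for i, blk in enumerate(incoming):
        found = by_key.get((blk.dev_path, blk.name))
        if found is None:
            raise MigrationError(
                f"RAM block {blk.dev_path}/{blk.name} not present on target")
        if found.size != blk.size:
            raise MigrationError(
                f"RAM block {blk.dev_path}/{blk.name}: size {blk.size} "
                f"does not match target size {found.size}")
        index[i] = found  # numeric index used during RAM migration
    return index


# Hypothetical target machine: two devices, each owning one RAM block.
target = [
    RamBlock("pci.0/cirrus-vga/00:02.0", "vram", 8 * 1024 * 1024),
    RamBlock("pci.0/e1000/00:03.0", "rom", 64 * 1024),
]
table = match_ram_blocks(list(target), target)
```

With the driver name as part of the path, case (3) (a different device in
the same slot with a same-named block) already fails the lookup here,
before any RAM contents are transferred.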

Thanks,

Alex