From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:34244)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1Xrjno-00072R-IL for qemu-devel@nongnu.org;
	Fri, 21 Nov 2014 03:42:57 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1Xrjni-0004Td-7d for qemu-devel@nongnu.org;
	Fri, 21 Nov 2014 03:42:52 -0500
From: Markus Armbruster
References: <1416259239-13281-1-git-send-email-dslutz@verizon.com>
	<87k32r8jc8.fsf@blackfin.pond.sub.org>
	<546E339E.4030202@terremark.com>
Date: Fri, 21 Nov 2014 09:42:38 +0100
In-Reply-To: <546E339E.4030202@terremark.com> (Don Slutz's message of
	"Thu, 20 Nov 2014 13:31:58 -0500")
Message-ID: <87fvdd6j29.fsf@blackfin.pond.sub.org>
MIME-Version: 1.0
Content-Type: text/plain
Subject: Re: [Qemu-devel] [BUGFIX][PATCH for 2.2 1/1] hw/ide/core.c: Prevent SIGSEGV during migration
To: Don Slutz
Cc: Kevin Wolf, qemu-stable@nongnu.org, qemu-devel@nongnu.org,
	Stefan Hajnoczi, Stefano Stabellini

Don Slutz writes:

> On 11/19/14 07:29, Markus Armbruster wrote:
>> Don Slutz writes:
>>
>>> The other callers to blk_set_enable_write_cache() in this file
>>> already check for s->blk == NULL.
>>>
>>> Signed-off-by: Don Slutz
>>> ---
>>>
>>> I think this is a bugfix that should be back ported to stable
>>> releases.
>>>
>>> I also think this should be done in xen's copy of QEMU for 4.5 with
>>> back port(s) to active stable releases.
>>>
>>> Note: In 2.1 and earlier the routine is
>>> bdrv_set_enable_write_cache(); variable is s->bs.
>> Got a reproducer?
>
> yes. Migrating a guest from xen 4.2 or 4.3 to xen 4.4 (or 4.5-unstable) on
> CentOS 6.3 with xen_emul_unplug=unnecessary and no cdrom defined.
>
>
>>
>> I'm asking because I believe s->identify_set implies s->blk.
>> s->identify_set is initialized to zero, and gets set to non-zero exactly
>> on the first successful IDENTIFY DEVICE or IDENTIFY PACKET DEVICE, in
>> ide_identify(), ide_atapi_identify() or ide_cfata_identify(),
>> respectively. Only called via cmd_identify() / cmd_identify_packet()
>> via ide_exec_cmd(). The latter immediately fails when !s->blk:
>>
>>     s = idebus_active_if(bus);
>>     /* ignore commands to non existent slave */
>>     if (s != bus->ifs && !s->blk) {
>>         return;
>>     }
>
> I do think that you are right. I have now spent more time on why I am
> seeing this.
>
>
>> Even if I'm right, your patch is fine, because it makes this spot more
>> obviously correct, and consistent with the other uses of
>> blk_set_enable_write_cache(). The case for stable is weak, though.
>>
>
> I had not fully tracked down what is happening before sending the bugfix.
> I have now done more debugging, and have tracked it down to xen 4.4
> now using "-nodefaults" with QEMU.
>
> I needed to add output to QEMU to track this down because I have long
> command lines...
>
> (all I get for ps -ef):
[...]
>
>
> Which is missing that option.
>
> The ide that was aborting in this case is the cdrom at hdc that is added
> if you do not specify "-nodefaults".
>
> Since this is a "changed" machine config, I am no longer as sure as what
> versions this needs to be in.
>
> If I put my QEMU hat on, it does not look like a back port is needed.
> However for xen it would be nice.
>
> I do not know how the QEMU community feels about migration from a config
> without "-nodefaults" to one with "-nodefaults" as the only difference.

So you have a CD-ROM on the source, but not on the destination? That
can't work.

I guess it broke for you in an unusual way (target crashes) rather than
the usual way (target rejects migration data for a device it doesn't
have) due to our convoluted IDE data structures. With your patch
applied it should break the usual way. Does it?

Management tools should use -nodefaults.
But if a tool mixes default and -nodefaults configurations in a
migration, recreating the devices it got by default but does not get
with -nodefaults is the tool's own responsibility.
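
For illustration, the guard under discussion can be sketched in
standalone C. This is a minimal model and not QEMU code: ModelBlk,
ModelIDEState, the wce field, model_identify() and model_post_load()
are hypothetical stand-ins. Only the names blk and identify_set and
the idea of checking s->blk before touching the write cache come from
this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the block backend behind an IDE drive. */
typedef struct ModelBlk {
    bool write_cache_enabled;
} ModelBlk;

/* Hypothetical stand-in for the per-drive IDE state. */
typedef struct ModelIDEState {
    ModelBlk *blk;     /* NULL when no drive is attached */
    int identify_set;  /* non-zero after a successful IDENTIFY */
    bool wce;          /* assumed: write-cache bit carried in migration data */
} ModelIDEState;

/*
 * Simplified model of the invariant argued above: on a running machine,
 * identify_set only becomes non-zero when a backend exists.
 */
static void model_identify(ModelIDEState *s)
{
    assert(s != NULL);
    if (!s->blk) {
        return;  /* command to a non-existent drive is ignored */
    }
    s->identify_set = 1;
}

/*
 * Model of the patched load path. Migration data may set identify_set
 * on a destination that has no backend (source had a default CD-ROM,
 * destination ran with -nodefaults), so the invariant cannot be relied
 * on here and blk is checked explicitly.
 * Returns true if the write-cache setting was applied.
 */
static bool model_post_load(ModelIDEState *s)
{
    assert(s != NULL);
    if (s->identify_set && s->blk) {
        s->blk->write_cache_enabled = s->wce;
        return true;
    }
    return false;  /* nothing to configure, and no NULL dereference */
}
```

In this model, model_post_load() on a state with identify_set != 0 and
blk == NULL returns false instead of dereferencing a null pointer,
which is the situation the mixed default / -nodefaults migration
produces on the destination.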