From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:33485)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <afaerber@suse.de>) id 1UgeQZ-0001GC-AS
	for qemu-devel@nongnu.org; Sun, 26 May 2013 13:08:20 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <afaerber@suse.de>) id 1UgeQT-0001K7-Bz
	for qemu-devel@nongnu.org; Sun, 26 May 2013 13:08:15 -0400
Received: from cantor2.suse.de ([195.135.220.15]:56593 helo=mx2.suse.de)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <afaerber@suse.de>) id 1UgeQT-0001Jz-2U
	for qemu-devel@nongnu.org; Sun, 26 May 2013 13:08:09 -0400
Message-ID: <51A24172.4020208@suse.de>
Date: Sun, 26 May 2013 19:08:02 +0200
From: =?UTF-8?B?QW5kcmVhcyBGw6RyYmVy?= <afaerber@suse.de>
MIME-Version: 1.0
References: <33183CC9F5247A488A2544077AF19020697A3B72@szxeml538-mbx.china.huawei.com>
In-Reply-To: <33183CC9F5247A488A2544077AF19020697A3B72@szxeml538-mbx.china.huawei.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] IDE disk FLUSH take more than 30 secs,
 the SUSE guest reports "lost interrupt and the file system becomes
 read-only"
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: "Gonglei (Arei)" <arei.gonglei@huawei.com>
Cc: "kwolf@redhat.com" <kwolf@redhat.com>, Stefano Stabellini <stefano.stabellini@eu.citrix.com>, Stefan Hajnoczi <stefanha@gmail.com>, Luonengjun <luonengjun@huawei.com>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>, Wangzhenguo <wangzhenguo@huawei.com>, "Huangweidong (Hardware)" <huangweidong@huawei.com>

Am 21.05.2013 09:12, schrieb Gonglei (Arei):
> Through analysis, I found that because the system call the fdatasync command in the Qemu over 30s,
> after the Guest's kernel thread detects the io transferation is timeout, went to check IDE disk state.
> But the IDE disk status is 0x50, rather than the BSY status, and then departed error process...
>
> the path of kernel's action is :
> scsi_softirq_done
>  scsi_eh_scmd_add
>    scsi_error_handler
>      shost->transportt->eh_strategy_handler
> 		ata_scsi_error
> 			ap->ops->lost_interrupt
> 				ata_sff_lost_interrupt
> Finally, the file system becomes read-only.
>
> Why not set the IDE disk for the BSY status When 0xe7 command is executed in the Qemu?

Have you actually tried that out with a patch such as the following?

diff --git a/hw/ide/core.c b/hw/ide/core.c
index c7a8041..bf1ff18 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -795,6 +795,8 @@ static void ide_flush_cb(void *opaque, int ret)
 {
     IDEState *s = opaque;

+    s->status &= ~BUSY_STAT;
+
     if (ret < 0) {
         /* XXX: What sector number to set here? */
         if (ide_handle_rw_error(s, -ret, BM_STATUS_RETRY_FLUSH)) {
@@ -814,6 +816,7 @@ void ide_flush_cache(IDEState *s)
         return;
     }

+    s->status |= BUSY_STAT;
     bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
     bdrv_aio_flush(s->bs, ide_flush_cb, s);
 }

No clue if this is spec-compliant. ;)
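
To show the intent of the patch outside of QEMU, here is a minimal
standalone sketch. It assumes the usual ATA status bit values (BSY 0x80,
DRDY 0x40, DSC 0x10, so the 0x50 you observed is DRDY|DSC). MockIDEState
and the mock_* functions are made up for illustration and are not QEMU
code; the pending_cb pointer merely stands in for bdrv_aio_flush().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ATA status register bits; the names mirror the ones used in the patch. */
#define BUSY_STAT  0x80
#define READY_STAT 0x40
#define SEEK_STAT  0x10

/* Hypothetical stand-in for QEMU's IDEState, reduced to what matters here. */
typedef struct MockIDEState {
    uint8_t status;
    void (*pending_cb)(void *opaque, int ret); /* queued flush completion */
} MockIDEState;

/* Completion callback: the flush is done, so the drive is no longer busy. */
static void mock_flush_cb(void *opaque, int ret)
{
    MockIDEState *s = opaque;

    (void)ret; /* error handling is omitted in this sketch */
    s->status &= ~BUSY_STAT;
}

/* Start of FLUSH CACHE (0xe7): report BSY until the callback has run. */
static void mock_flush_cache(MockIDEState *s)
{
    assert(s != NULL);
    s->status |= BUSY_STAT;
    s->pending_cb = mock_flush_cb; /* stands in for bdrv_aio_flush() */
}

/* Simulates the block layer finishing the flush some time later. */
static void mock_complete_flush(MockIDEState *s, int ret)
{
    if (s->pending_cb != NULL) {
        void (*cb)(void *opaque, int ret) = s->pending_cb;

        s->pending_cb = NULL;
        cb(s, ret);
    }
}
```

Starting from status 0x50, mock_flush_cache() yields 0xd0, so a guest
polling the status register during a slow flush would see BSY, and
mock_complete_flush() brings it back to 0x50.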

Note however that qemu_fdatasync() is done in the flush callback of
block/raw-posix.c, so IIUC everything calling bdrv_aio_flush() or
bdrv_flush_all() may potentially run into issues beyond just ATA:

hw/block/virtio-blk.c
hw/block/xen_disk.c
hw/ide/core.c
hw/scsi/scsi-disk.c

cpus.c:do_vm_stop()
hw/xen/xen_platform.c:platform_fixed_ioport_writew()

qemu_fdatasync() further occurs in:

hw/block/dataplane/virtio-blk.c:process_request()
hw/9pfs/virtio-9p-*.c

Quite possibly not all of them are problematic, but flush times >30 sec
are very likely not well tested by developers...
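
To get a feeling for how long a flush takes on a given host, something
like the following could be used. It is only a sketch: it times a plain
POSIX fdatasync() on a scratch file under /tmp (an assumed location),
not QEMU's qemu_fdatasync() wrapper, and the function names are made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Returns the seconds spent in one fdatasync() on fd, or -1.0 on failure. */
static double timed_fdatasync(int fd)
{
    struct timespec t0, t1;

    assert(fd >= 0);
    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0) {
        return -1.0;
    }
    if (fdatasync(fd) != 0) {
        return -1.0;
    }
    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0) {
        return -1.0;
    }
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Writes len zero bytes to a scratch file and times the following flush. */
static double measure_flush_seconds(size_t len)
{
    char path[] = "/tmp/flush-sketch-XXXXXX"; /* assumed scratch location */
    char buf[4096];
    double elapsed;
    int fd = mkstemp(path);

    if (fd < 0) {
        return -1.0;
    }
    unlink(path); /* the file disappears once fd is closed */
    memset(buf, 0, sizeof(buf));
    while (len > 0) {
        size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
        ssize_t n = write(fd, buf, chunk);

        if (n <= 0) {
            close(fd);
            return -1.0;
        }
        len -= (size_t)n;
    }
    elapsed = timed_fdatasync(fd);
    close(fd);
    return elapsed;
}
```

Calling measure_flush_seconds() with a large length on the filesystem
backing the disk image, while the host is under I/O load, should show
whether flushes get anywhere near the guest's 30 second timeout.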

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg