All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O
@ 2012-03-28 15:43 Stefan Hajnoczi
  2012-03-28 15:43 ` [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() " Stefan Hajnoczi
                   ` (5 more replies)
  0 siblings, 6 replies; 14+ messages in thread
From: Stefan Hajnoczi @ 2012-03-28 15:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Richard Davies, Chris Webb, Stefan Hajnoczi,
	Zhi Yong Wu, Paolo Bonzini

IDE PIO mode is currently implemented using synchronous I/O functions.  There's
no need to do this because the IDE interface is actually designed with polling
and interrupts in mind - we can do asynchronous I/O and let the guest know when
the operation has completed.  The benefit of asynchronous I/O is that the guest
can continue executing code and is more responsive.

The second aim of this conversion is to avoid calling bdrv_read()/bdrv_write()
since they do not work with I/O throttling.  This means guests should now boot
IDE drives successfully when I/O throttling is enabled.

Note that ATAPI is not converted yet and still uses bdrv_read() in two
locations.  A future patch will have to convert ATAPI so CD-ROMs also do
asynchronous I/O.

I have tested both Windows 7 Home Premium and Red Hat Enterprise Linux 6.0
guests with these patches.  In Windows, use the device manager to disable DMA
on the IDE channels.  Under recent Linux kernels, use the libata.dma=0 kernel
parameter.

Chris and Richard: Please test this to confirm that it fixes the hang you
reported.

Stefan Hajnoczi (2):
  ide: convert ide_sector_read() to asynchronous I/O
  ide: convert ide_sector_write() to asynchronous I/O

 hw/ide/core.c     |  129 ++++++++++++++++++++++++++++++++++++----------------
 hw/ide/internal.h |    2 +
 2 files changed, 91 insertions(+), 40 deletions(-)

-- 
1.7.9.1

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() to asynchronous I/O
  2012-03-28 15:43 [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O Stefan Hajnoczi
@ 2012-03-28 15:43 ` Stefan Hajnoczi
  2012-03-29  6:42   ` Michael Tokarev
  2012-03-29  6:46   ` Michael Tokarev
  2012-03-28 15:43 ` [Qemu-devel] [PATCH 2/2] ide: convert ide_sector_write() " Stefan Hajnoczi
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 14+ messages in thread
From: Stefan Hajnoczi @ 2012-03-28 15:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Richard Davies, Chris Webb, Stefan Hajnoczi,
	Zhi Yong Wu, Paolo Bonzini

The IDE PIO interface currently uses bdrv_read() to perform reads
synchronously.  Synchronous I/O in the vcpu thread is bad because it
prevents the guest from executing code - it makes the guest
unresponsive.

This patch converts IDE PIO to use bdrv_aio_readv().  We simply need to
use the BUSY_STAT status so the guest knows to wait while we are busy.

The only external user of ide_sector_read() is restart behavior on I/O
errors and it is not affected by this change.  We still need to restart
I/O in the same way.

Migration is also unaffected if I understand the code correctly.  We
continue to use the same transfer function and the BUSY_STAT status
should never be migrated since we flush I/O before migrating device
state.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/ide/core.c     |   69 ++++++++++++++++++++++++++++++++++++----------------
 hw/ide/internal.h |    2 +
 2 files changed, 50 insertions(+), 21 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 4d568ac..e188157 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -457,40 +457,67 @@ static void ide_rw_error(IDEState *s) {
     ide_set_irq(s->bus);
 }
 
+static void ide_sector_read_cb(void *opaque, int ret)
+{
+    IDEState *s = opaque;
+    int n;
+
+    s->status &= ~BUSY_STAT;
+
+    bdrv_acct_done(s->bs, &s->acct);
+    if (ret != 0) {
+        if (ide_handle_rw_error(s, -ret, BM_STATUS_PIO_RETRY |
+                                BM_STATUS_RETRY_READ)) {
+            return;
+        }
+    }
+
+    n = s->nsector;
+    if (n > s->req_nb_sectors) {
+        n = s->req_nb_sectors;
+    }
+
+    /* Allow the guest to read the io_buffer */
+    ide_transfer_start(s, s->io_buffer, n * BDRV_SECTOR_SIZE, ide_sector_read);
+
+    ide_set_irq(s->bus);
+
+    ide_set_sector(s, ide_get_sector(s) + n);
+    s->nsector -= n;
+}
+
 void ide_sector_read(IDEState *s)
 {
     int64_t sector_num;
-    int ret, n;
+    int n;
 
     s->status = READY_STAT | SEEK_STAT;
     s->error = 0; /* not needed by IDE spec, but needed by Windows */
     sector_num = ide_get_sector(s);
     n = s->nsector;
+
     if (n == 0) {
-        /* no more sector to read from disk */
         ide_transfer_stop(s);
-    } else {
+        return;
+    }
+
+    s->status |= BUSY_STAT;
+
+    if (n > s->req_nb_sectors) {
+        n = s->req_nb_sectors;
+    }
+
 #if defined(DEBUG_IDE)
-        printf("read sector=%" PRId64 "\n", sector_num);
+    printf("sector=%" PRId64 "\n", sector_num);
 #endif
-        if (n > s->req_nb_sectors)
-            n = s->req_nb_sectors;
 
-        bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
-        ret = bdrv_read(s->bs, sector_num, s->io_buffer, n);
-        bdrv_acct_done(s->bs, &s->acct);
-        if (ret != 0) {
-            if (ide_handle_rw_error(s, -ret,
-                BM_STATUS_PIO_RETRY | BM_STATUS_RETRY_READ))
-            {
-                return;
-            }
-        }
-        ide_transfer_start(s, s->io_buffer, 512 * n, ide_sector_read);
-        ide_set_irq(s->bus);
-        ide_set_sector(s, sector_num + n);
-        s->nsector -= n;
-    }
+    s->iov.iov_base = s->io_buffer;
+    s->iov.iov_len  = n * BDRV_SECTOR_SIZE;
+    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
+
+    bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
+    bdrv_aio_readv(s->bs, sector_num, &s->qiov, n,
+                   ide_sector_read_cb, s);
 }
 
 static void dma_buf_commit(IDEState *s)
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index c808a0d..37d7307 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -383,6 +383,8 @@ struct IDEState {
     int cd_sector_size;
     int atapi_dma; /* true if dma is requested for the packet cmd */
     BlockAcctCookie acct;
+    struct iovec iov;
+    QEMUIOVector qiov;
     /* ATA DMA state */
     int io_buffer_size;
     QEMUSGList sg;
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH 2/2] ide: convert ide_sector_write() to asynchronous I/O
  2012-03-28 15:43 [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O Stefan Hajnoczi
  2012-03-28 15:43 ` [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() " Stefan Hajnoczi
@ 2012-03-28 15:43 ` Stefan Hajnoczi
  2012-03-28 15:51 ` [Qemu-devel] [PATCH 0/2] ide: convert pio code path " Chris Webb
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Stefan Hajnoczi @ 2012-03-28 15:43 UTC (permalink / raw)
  To: qemu-devel
  Cc: Kevin Wolf, Richard Davies, Chris Webb, Stefan Hajnoczi,
	Zhi Yong Wu, Paolo Bonzini

The IDE PIO write sector code path uses bdrv_write() and hence can make
the guest unresponsive while the I/O request is in progress.  This patch
converts ide_sector_write() to use bdrv_aio_writev() by using the
BUSY_STAT bit to tell the guest that the request is in progress.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
---
 hw/ide/core.c |   60 ++++++++++++++++++++++++++++++++++++++------------------
 1 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index e188157..9333f13 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -672,40 +672,38 @@ static void ide_sector_write_timer_cb(void *opaque)
     ide_set_irq(s->bus);
 }
 
-void ide_sector_write(IDEState *s)
+static void ide_sector_write_cb(void *opaque, int ret)
 {
-    int64_t sector_num;
-    int ret, n, n1;
-
-    s->status = READY_STAT | SEEK_STAT;
-    sector_num = ide_get_sector(s);
-#if defined(DEBUG_IDE)
-    printf("write sector=%" PRId64 "\n", sector_num);
-#endif
-    n = s->nsector;
-    if (n > s->req_nb_sectors)
-        n = s->req_nb_sectors;
+    IDEState *s = opaque;
+    int n;
 
-    bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
-    ret = bdrv_write(s->bs, sector_num, s->io_buffer, n);
     bdrv_acct_done(s->bs, &s->acct);
 
+    s->status &= ~BUSY_STAT;
+
     if (ret != 0) {
-        if (ide_handle_rw_error(s, -ret, BM_STATUS_PIO_RETRY))
+        if (ide_handle_rw_error(s, -ret, BM_STATUS_PIO_RETRY)) {
             return;
+        }
     }
 
+    n = s->nsector;
+    if (n > s->req_nb_sectors) {
+        n = s->req_nb_sectors;
+    }
     s->nsector -= n;
     if (s->nsector == 0) {
         /* no more sectors to write */
         ide_transfer_stop(s);
     } else {
-        n1 = s->nsector;
-        if (n1 > s->req_nb_sectors)
+        int n1 = s->nsector;
+        if (n1 > s->req_nb_sectors) {
             n1 = s->req_nb_sectors;
-        ide_transfer_start(s, s->io_buffer, 512 * n1, ide_sector_write);
+        }
+        ide_transfer_start(s, s->io_buffer, n1 * BDRV_SECTOR_SIZE,
+                           ide_sector_write);
     }
-    ide_set_sector(s, sector_num + n);
+    ide_set_sector(s, ide_get_sector(s) + n);
 
     if (win2k_install_hack && ((++s->irq_count % 16) == 0)) {
         /* It seems there is a bug in the Windows 2000 installer HDD
@@ -721,6 +719,30 @@ void ide_sector_write(IDEState *s)
     }
 }
 
+void ide_sector_write(IDEState *s)
+{
+    int64_t sector_num;
+    int n;
+
+    s->status = READY_STAT | SEEK_STAT | BUSY_STAT;
+    sector_num = ide_get_sector(s);
+#if defined(DEBUG_IDE)
+    printf("sector=%" PRId64 "\n", sector_num);
+#endif
+    n = s->nsector;
+    if (n > s->req_nb_sectors) {
+        n = s->req_nb_sectors;
+    }
+
+    s->iov.iov_base = s->io_buffer;
+    s->iov.iov_len  = n * BDRV_SECTOR_SIZE;
+    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
+
+    bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
+    bdrv_aio_writev(s->bs, sector_num, &s->qiov, n,
+                    ide_sector_write_cb, s);
+}
+
 static void ide_flush_cb(void *opaque, int ret)
 {
     IDEState *s = opaque;
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O
  2012-03-28 15:43 [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O Stefan Hajnoczi
  2012-03-28 15:43 ` [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() " Stefan Hajnoczi
  2012-03-28 15:43 ` [Qemu-devel] [PATCH 2/2] ide: convert ide_sector_write() " Stefan Hajnoczi
@ 2012-03-28 15:51 ` Chris Webb
  2012-03-28 16:14 ` Paolo Bonzini
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 14+ messages in thread
From: Chris Webb @ 2012-03-28 15:51 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Richard Davies, Zhi Yong Wu, qemu-devel, Paolo Bonzini

Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> writes:

> The second aim of this conversion is to avoid calling bdrv_read()/bdrv_write()
> since they do not work with I/O throttling.  This means guests should now boot
> IDE drives successfully when I/O throttling is enabled.
[...]
> Chris and Richard: Please test this to confirm that it fixes the hang you
> reported.

Thanks! I'll take a look at this and report back.

Best wishes,

Chris.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O
  2012-03-28 15:43 [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O Stefan Hajnoczi
                   ` (2 preceding siblings ...)
  2012-03-28 15:51 ` [Qemu-devel] [PATCH 0/2] ide: convert pio code path " Chris Webb
@ 2012-03-28 16:14 ` Paolo Bonzini
  2012-03-29  2:15 ` Zhi Yong Wu
  2012-03-29 11:30 ` Chris Webb
  5 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2012-03-28 16:14 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Richard Davies, Chris Webb, qemu-devel, Zhi Yong Wu

Il 28/03/2012 17:43, Stefan Hajnoczi ha scritto:
> IDE PIO mode is currently implemented using synchronous I/O functions.  There's
> no need to do this because the IDE interface is actually designed with polling
> and interrupts in mind - we can do asynchronous I/O and let the guest know when
> the operation has completed.  The benefit of asynchronous I/O is that the guest
> can continue executing code and is more responsive.
> 
> The second aim of this conversion is to avoid calling bdrv_read()/bdrv_write()
> since they do not work with I/O throttling.  This means guests should now boot
> IDE drives successfully when I/O throttling is enabled.
> 
> Note that ATAPI is not converted yet and still uses bdrv_read() in two
> locations.  A future patch will have to convert ATAPI so CD-ROMs also do
> asynchronous I/O.
> 
> I have tested both Windows 7 Home Premium and Red Hat Enterprise Linux 6.0
> guests with these patches.  In Windows, use the device manager to disable DMA
> on the IDE channels.  Under recent Linux kernels, use the libata.dma=0 kernel
> parameter.
> 
> Chris and Richard: Please test this to confirm that it fixes the hang you
> reported.
> 
> Stefan Hajnoczi (2):
>   ide: convert ide_sector_read() to asynchronous I/O
>   ide: convert ide_sector_write() to asynchronous I/O
> 
>  hw/ide/core.c     |  129 ++++++++++++++++++++++++++++++++++++----------------
>  hw/ide/internal.h |    2 +
>  2 files changed, 91 insertions(+), 40 deletions(-)
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O
  2012-03-28 15:43 [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O Stefan Hajnoczi
                   ` (3 preceding siblings ...)
  2012-03-28 16:14 ` Paolo Bonzini
@ 2012-03-29  2:15 ` Zhi Yong Wu
  2012-03-29 11:30 ` Chris Webb
  5 siblings, 0 replies; 14+ messages in thread
From: Zhi Yong Wu @ 2012-03-29  2:15 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Richard Davies, Chris Webb, qemu-devel, Zhi Yong Wu,
	Paolo Bonzini

On Wed, Mar 28, 2012 at 11:43 PM, Stefan Hajnoczi
<stefanha@linux.vnet.ibm.com> wrote:
> IDE PIO mode is currently implemented using synchronous I/O functions.  There's
> no need to do this because the IDE interface is actually designed with polling
> and interrupts in mind - we can do asynchronous I/O and let the guest know when
> the operation has completed.  The benefit of asynchronous I/O is that the guest
> can continue executing code and is more responsive.
>
> The second aim of this conversion is to avoid calling bdrv_read()/bdrv_write()
> since they do not work with I/O throttling.  This means guests should now boot
> IDE drives successfully when I/O throttling is enabled.
>
> Note that ATAPI is not converted yet and still uses bdrv_read() in two
> locations.  A future patch will have to convert ATAPI so CD-ROMs also do
> asynchronous I/O.
>
> I have tested both Windows 7 Home Premium and Red Hat Enterprise Linux 6.0
> guests with these patches.  In Windows, use the device manager to disable DMA
> on the IDE channels.  Under recent Linux kernels, use the libata.dma=0 kernel
> parameter.
>
> Chris and Richard: Please test this to confirm that it fixes the hang you
> reported.
>
> Stefan Hajnoczi (2):
>  ide: convert ide_sector_read() to asynchronous I/O
>  ide: convert ide_sector_write() to asynchronous I/O
>
>  hw/ide/core.c     |  129 ++++++++++++++++++++++++++++++++++++----------------
>  hw/ide/internal.h |    2 +
>  2 files changed, 91 insertions(+), 40 deletions(-)
>
> --
> 1.7.9.1
>
>

Reviewed-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>

-- 
Regards,

Zhi Yong Wu

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() to asynchronous I/O
  2012-03-28 15:43 ` [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() " Stefan Hajnoczi
@ 2012-03-29  6:42   ` Michael Tokarev
  2012-03-29  8:29     ` Stefan Hajnoczi
  2012-03-29  6:46   ` Michael Tokarev
  1 sibling, 1 reply; 14+ messages in thread
From: Michael Tokarev @ 2012-03-29  6:42 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Richard Davies, Chris Webb, qemu-devel, Zhi Yong Wu,
	Paolo Bonzini

On 28.03.2012 19:43, Stefan Hajnoczi wrote:
>  void ide_sector_read(IDEState *s)
>  {
[]
> +    s->iov.iov_base = s->io_buffer;
> +    s->iov.iov_len  = n * BDRV_SECTOR_SIZE;
> +    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
> +
> +    bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
> +    bdrv_aio_readv(s->bs, sector_num, &s->qiov, n,
> +                   ide_sector_read_cb, s);
>  }

Shouldn't this function be returning something and
check the return value of bdrv_aio_readv() ?

I'm not sure if bdrv_aio_readv() is _supposed_ to never
return any errors.  In practice it is definitely a bit
more complex.  If bdrv_aio_readv() didn't do anything
useful (ie, didn't queue the callback), we'll be busy
forever.

Again, I've no idea if it supposed to never return error.
A block device without proper "media" present may - at
least in theory - do so.  If it must not, I think this
should be documented somewhere that bdrv_aio_readv()
does never return error, by design.

That was one of the reasons I digged into the block
layer in the first place - in order to understand,
simplify and _document_ what each interface does.

Thanks,

/mjt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() to asynchronous I/O
  2012-03-28 15:43 ` [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() " Stefan Hajnoczi
  2012-03-29  6:42   ` Michael Tokarev
@ 2012-03-29  6:46   ` Michael Tokarev
  2012-03-29  7:00     ` Michael Tokarev
  1 sibling, 1 reply; 14+ messages in thread
From: Michael Tokarev @ 2012-03-29  6:46 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Richard Davies, Chris Webb, qemu-devel, Zhi Yong Wu,
	Paolo Bonzini

On 28.03.2012 19:43, Stefan Hajnoczi wrote:
...

> void ide_sector_read(IDEState *s)
> {
...
> +    s->iov.iov_base = s->io_buffer;
> +    s->iov.iov_len  = n * BDRV_SECTOR_SIZE;
> +    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
> +    bdrv_aio_readv(s->bs, sector_num, &s->qiov, n,
> +                   ide_sector_read_cb, s);
>  }
>  
> @@ -383,6 +383,8 @@ struct IDEState {
>      int cd_sector_size;
>      int atapi_dma; /* true if dma is requested for the packet cmd */
>      BlockAcctCookie acct;
> +    struct iovec iov;
> +    QEMUIOVector qiov;
>      /* ATA DMA state */
>      int io_buffer_size;
>      QEMUSGList sg;

You don't actually need iov here, it can be a local variable
just as well.  The same applies to ide_sector_write() in the
next patch.  The state structure usually holds stuff which
is actually needed to be keept across several calls, iov is
not one of them.

Thanks,

/mjt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() to asynchronous I/O
  2012-03-29  6:46   ` Michael Tokarev
@ 2012-03-29  7:00     ` Michael Tokarev
  0 siblings, 0 replies; 14+ messages in thread
From: Michael Tokarev @ 2012-03-29  7:00 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Richard Davies, Chris Webb, qemu-devel, Zhi Yong Wu,
	Paolo Bonzini

On 29.03.2012 10:46, Michael Tokarev wrote:
> On 28.03.2012 19:43, Stefan Hajnoczi wrote:
> ...
> 
>> void ide_sector_read(IDEState *s)
>> {
> ...
>> +    s->iov.iov_base = s->io_buffer;
>> +    s->iov.iov_len  = n * BDRV_SECTOR_SIZE;
>> +    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
>> +    bdrv_aio_readv(s->bs, sector_num, &s->qiov, n,
>> +                   ide_sector_read_cb, s);
>>  }
>>  
>> @@ -383,6 +383,8 @@ struct IDEState {
>>      int cd_sector_size;
>>      int atapi_dma; /* true if dma is requested for the packet cmd */
>>      BlockAcctCookie acct;
>> +    struct iovec iov;
>> +    QEMUIOVector qiov;
>>      /* ATA DMA state */
>>      int io_buffer_size;
>>      QEMUSGList sg;
> 
> You don't actually need iov here, it can be a local variable
> just as well.  The same applies to ide_sector_write() in the
> next patch.  The state structure usually holds stuff which
> is actually needed to be keept across several calls, iov is
> not one of them.

Please ignore this one ;)  For qemu_iovec_init_external(), the
real iov is ofcourse needed.

Thanks,

/mjt

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() to asynchronous I/O
  2012-03-29  6:42   ` Michael Tokarev
@ 2012-03-29  8:29     ` Stefan Hajnoczi
  2012-03-29  9:10       ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Hajnoczi @ 2012-03-29  8:29 UTC (permalink / raw)
  To: Michael Tokarev
  Cc: Kevin Wolf, Richard Davies, Chris Webb, Stefan Hajnoczi,
	qemu-devel, Zhi Yong Wu, Paolo Bonzini

On Thu, Mar 29, 2012 at 7:42 AM, Michael Tokarev <mjt@tls.msk.ru> wrote:
> On 28.03.2012 19:43, Stefan Hajnoczi wrote:
>>  void ide_sector_read(IDEState *s)
>>  {
> []
>> +    s->iov.iov_base = s->io_buffer;
>> +    s->iov.iov_len  = n * BDRV_SECTOR_SIZE;
>> +    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
>> +
>> +    bdrv_acct_start(s->bs, &s->acct, n * BDRV_SECTOR_SIZE, BDRV_ACCT_READ);
>> +    bdrv_aio_readv(s->bs, sector_num, &s->qiov, n,
>> +                   ide_sector_read_cb, s);
>>  }
>
> Shouldn't this function be returning something and
> check the return value of bdrv_aio_readv() ?

I think you are right that something is missing here.  If the IDE
controller is reset then we need to cancel the pending aio request.
Otherwise we could end up with a dangling request across reset that
overwrites io_buffer.

About checking the return value: it used to be the case that the
return value needed to be checked.

This was kind of a wart in the interface because it created 2 error
handling paths: one immediate return and one callback with error.  I
think Paolo and Kevin were the ones to address this.  You can check
out this patch for starters but there are a few more related ones:

ad54ae80c7 ("block: bdrv_aio_* do not return NULL")

Today the return value does not need to be checked for NULL.

Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() to asynchronous I/O
  2012-03-29  8:29     ` Stefan Hajnoczi
@ 2012-03-29  9:10       ` Paolo Bonzini
  0 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2012-03-29  9:10 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Richard Davies, Chris Webb, Stefan Hajnoczi,
	Michael Tokarev, qemu-devel, Zhi Yong Wu

Il 29/03/2012 10:29, Stefan Hajnoczi ha scritto:
> This was kind of a wart in the interface because it created 2 error
> handling paths: one immediate return and one callback with error.  I
> think Paolo and Kevin were the ones to address this.

I think you did that when adding the coroutines.  I just cleaned up the
result.

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O
  2012-03-28 15:43 [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O Stefan Hajnoczi
                   ` (4 preceding siblings ...)
  2012-03-29  2:15 ` Zhi Yong Wu
@ 2012-03-29 11:30 ` Chris Webb
  2012-03-29 13:36   ` Stefan Hajnoczi
  5 siblings, 1 reply; 14+ messages in thread
From: Chris Webb @ 2012-03-29 11:30 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Richard Davies, Zhi Yong Wu, qemu-devel, Paolo Bonzini

Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> writes:

> Chris and Richard: Please test this to confirm that it fixes the hang you
> reported.

We've been testing this (v1 against qemu-kvm 1.0) today, and it's looking
very good. Thanks!

The lock-ups during boot no longer happen, and if you severely throttle
(1MB/s, 100 req/s) a guest in the middle of a big dd, there are lots of
nasty kernel messages as it falls back to pio mode, but the guest doesn't
lock up and qemu remains responsive.

Best wishes,

Chris.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O
  2012-03-29 11:30 ` Chris Webb
@ 2012-03-29 13:36   ` Stefan Hajnoczi
  2012-03-29 16:10     ` Chris Webb
  0 siblings, 1 reply; 14+ messages in thread
From: Stefan Hajnoczi @ 2012-03-29 13:36 UTC (permalink / raw)
  To: Chris Webb
  Cc: Kevin Wolf, Richard Davies, Stefan Hajnoczi, qemu-devel,
	Zhi Yong Wu, Paolo Bonzini

On Thu, Mar 29, 2012 at 12:30 PM, Chris Webb <chris@arachsys.com> wrote:
> Stefan Hajnoczi <stefanha@linux.vnet.ibm.com> writes:
>
>> Chris and Richard: Please test this to confirm that it fixes the hang you
>> reported.
>
> We've been testing this (v1 against qemu-kvm 1.0) today, and it's looking
> very good. Thanks!
>
> The lock-ups during boot no longer happen, and if you severely throttle
> (1MB/s, 100 req/s) a guest in the middle of a big dd, there are lots of
> nasty kernel messages as it falls back to pio mode, but the guest doesn't
> lock up and qemu remains responsive.

Thanks for trying it out.

Regarding the guest kernel errors, they may be timeouts.  The guest
may consider the IDE controller buggy/dead due to how long requests
take to complete under severe throttling.  There's not much that can
be done about that, I think.  If you think the guest errors are due to
something else though, please post the error messages.

Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O
  2012-03-29 13:36   ` Stefan Hajnoczi
@ 2012-03-29 16:10     ` Chris Webb
  0 siblings, 0 replies; 14+ messages in thread
From: Chris Webb @ 2012-03-29 16:10 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Kevin Wolf, Richard Davies, Stefan Hajnoczi, qemu-devel,
	Zhi Yong Wu, Paolo Bonzini

Stefan Hajnoczi <stefanha@gmail.com> writes:

> Thanks for trying it out.
> 
> Regarding the guest kernel errors, they may be timeouts.  The guest
> may consider the IDE controller buggy/dead due to how long requests
> take to complete under severe throttling.  There's not much that can
> be done about that, I think.  If you think the guest errors are due to
> something else though, please post the error messages.

I'm sure you're right that they're standard timeouts. We see exactly the
same behaviour when a guest is on a very heavily contended disk with slow
access because of that.

Best wishes,

Chris.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-03-29 16:10 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-28 15:43 [Qemu-devel] [PATCH 0/2] ide: convert pio code path to asynchronous I/O Stefan Hajnoczi
2012-03-28 15:43 ` [Qemu-devel] [PATCH 1/2] ide: convert ide_sector_read() " Stefan Hajnoczi
2012-03-29  6:42   ` Michael Tokarev
2012-03-29  8:29     ` Stefan Hajnoczi
2012-03-29  9:10       ` Paolo Bonzini
2012-03-29  6:46   ` Michael Tokarev
2012-03-29  7:00     ` Michael Tokarev
2012-03-28 15:43 ` [Qemu-devel] [PATCH 2/2] ide: convert ide_sector_write() " Stefan Hajnoczi
2012-03-28 15:51 ` [Qemu-devel] [PATCH 0/2] ide: convert pio code path " Chris Webb
2012-03-28 16:14 ` Paolo Bonzini
2012-03-29  2:15 ` Zhi Yong Wu
2012-03-29 11:30 ` Chris Webb
2012-03-29 13:36   ` Stefan Hajnoczi
2012-03-29 16:10     ` Chris Webb

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.