All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure
@ 2015-11-12 16:30 Peter Lieven
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 1/6] ide/atapi: make PIO read requests async Peter Lieven
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Peter Lieven @ 2015-11-12 16:30 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, famz, stefanha, jcody, Peter Lieven, jsnow

This series aims at avoiding a hanging main-loop if a vserver has a
CDROM image mounted from a NFS share and that NFS share goes down.
Typical situation is that users mount an CDROM ISO to install something
and then forget to eject that CDROM afterwards.
As a consequence this mounted CD is able to bring down the
whole vserver if the backend NFS share is unreachable. This is bad
especially if the CDROM itself is not needed anymore at this point.

v3->v4: - Patch 1: remove buf argument for cd_read_sector{_sync}
        - Patch 1: fix iov_base offset for 2352 sector size
        - Patch 2: fix indent [Fam]
        - Patch 3: fix leaking of req->iov.iov_base [Fam]

v2->v3: - adressed Stefans comments on Patch 1
        - added patches 2,4,5,6
        - avoided the term cancel in Patch 3. Added an iovec,
          added a BH [Stefan]
v1->v2: - fix offset for 2352 byte sector size [Kevin]
        - use a sync request if we continue an elementary transfer.
          As John pointed out we enter a race condition between next
          IDE command and async transfer otherwise. This is sill not
          optimal, but it fixes the NFS down problems for all cases where
          the NFS server goes down while there is no PIO CD activity.
          Of course, it could still happen during a PIO transfer, but I
          expect this to be the unlikelier case.
          I spent some effort trying to read more sectors at once and
          avoiding continuation of elementary transfers, but with
          whatever I came up it was destroying migration between different
          Qemu versions. I have a quite hackish patch that works and
          should survive migration, but I am not happy with it. So I
          would like to start with this version as it is a big improvement
          already.
        - Dropped Patch 5 because it is upstream meanwhile.

Peter Lieven (6):
  ide/atapi: make PIO read requests async
  block: add blk_abort_aio_request
  ide: add support for IDEBufferedRequest
  ide: orphan all buffered requests on DMA cancel
  ide: enable buffered requests for ATAPI devices
  ide: enable buffered requests for PIO read requests

 block/block-backend.c          |  17 +++----
 hw/ide/atapi.c                 | 103 +++++++++++++++++++++++++++++++++++------
 hw/ide/core.c                  |  51 +++++++++++++++++++-
 hw/ide/internal.h              |  14 ++++++
 hw/ide/pci.c                   |  19 ++++++++
 include/sysemu/block-backend.h |   3 ++
 6 files changed, 182 insertions(+), 25 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH V4 1/6] ide/atapi: make PIO read requests async
  2015-11-12 16:30 [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Peter Lieven
@ 2015-11-12 16:30 ` Peter Lieven
  2015-11-13 22:42   ` John Snow
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 2/6] block: add blk_abort_aio_request Peter Lieven
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Peter Lieven @ 2015-11-12 16:30 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, famz, stefanha, jcody, Peter Lieven, jsnow

PIO read requests on the ATAPI interface used to be sync blk requests.
This has two significant drawbacks. First the main loop hangs util an
I/O request is completed and secondly if the I/O request does not
complete (e.g. due to an unresponsive storage) Qemu hangs completely.

Note: Due to possible race conditions requests during an ongoing
elementary transfer are still sync.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 hw/ide/atapi.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 85 insertions(+), 12 deletions(-)

diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
index 747f466..cfd2d63 100644
--- a/hw/ide/atapi.c
+++ b/hw/ide/atapi.c
@@ -105,33 +105,98 @@ static void cd_data_to_raw(uint8_t *buf, int lba)
     memset(buf, 0, 288);
 }
 
-static int cd_read_sector(IDEState *s, int lba, uint8_t *buf, int sector_size)
+static int
+cd_read_sector_sync(IDEState *s)
 {
     int ret;
 
-    switch(sector_size) {
+#ifdef DEBUG_IDE_ATAPI
+    printf("cd_read_sector_sync: lba=%d\n", s->lba);
+#endif
+
+    switch (s->cd_sector_size) {
     case 2048:
         block_acct_start(blk_get_stats(s->blk), &s->acct,
                          4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
-        ret = blk_read(s->blk, (int64_t)lba << 2, buf, 4);
+        ret = blk_read(s->blk, (int64_t)s->lba << 2,
+                       s->io_buffer, 4);
         block_acct_done(blk_get_stats(s->blk), &s->acct);
         break;
     case 2352:
         block_acct_start(blk_get_stats(s->blk), &s->acct,
                          4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
-        ret = blk_read(s->blk, (int64_t)lba << 2, buf + 16, 4);
+        ret = blk_read(s->blk, (int64_t)s->lba << 2,
+                       s->io_buffer + 16, 4);
         block_acct_done(blk_get_stats(s->blk), &s->acct);
         if (ret < 0)
             return ret;
-        cd_data_to_raw(buf, lba);
+        cd_data_to_raw(s->io_buffer, s->lba);
         break;
     default:
         ret = -EIO;
         break;
     }
+
+    if (!ret) {
+        s->lba++;
+        s->io_buffer_index = 0;
+    }
+
     return ret;
 }
 
+static void cd_read_sector_cb(void *opaque, int ret)
+{
+    IDEState *s = opaque;
+
+    block_acct_done(blk_get_stats(s->blk), &s->acct);
+
+#ifdef DEBUG_IDE_ATAPI
+    printf("cd_read_sector_cb: lba=%d ret=%d\n", s->lba, ret);
+#endif
+
+    if (ret < 0) {
+        ide_atapi_io_error(s, ret);
+        return;
+    }
+
+    if (s->cd_sector_size == 2352) {
+        cd_data_to_raw(s->io_buffer, s->lba);
+    }
+
+    s->lba++;
+    s->io_buffer_index = 0;
+    s->status &= ~BUSY_STAT;
+
+    ide_atapi_cmd_reply_end(s);
+}
+
+static int cd_read_sector(IDEState *s)
+{
+    if (s->cd_sector_size != 2048 && s->cd_sector_size != 2352) {
+        return -EINVAL;
+    }
+
+    s->iov.iov_base = (s->cd_sector_size == 2352) ?
+                      s->io_buffer + 16 : s->io_buffer;
+
+    s->iov.iov_len = 4 * BDRV_SECTOR_SIZE;
+    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
+
+#ifdef DEBUG_IDE_ATAPI
+    printf("cd_read_sector: lba=%d\n", s->lba);
+#endif
+
+    block_acct_start(blk_get_stats(s->blk), &s->acct,
+                     4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
+
+    blk_aio_readv(s->blk, (int64_t)s->lba << 2, &s->qiov, 4,
+                  cd_read_sector_cb, s);
+
+    s->status |= BUSY_STAT;
+    return 0;
+}
+
 void ide_atapi_cmd_ok(IDEState *s)
 {
     s->error = 0;
@@ -182,18 +247,27 @@ void ide_atapi_cmd_reply_end(IDEState *s)
         ide_atapi_cmd_ok(s);
         ide_set_irq(s->bus);
 #ifdef DEBUG_IDE_ATAPI
-        printf("status=0x%x\n", s->status);
+        printf("end of transfer, status=0x%x\n", s->status);
 #endif
     } else {
         /* see if a new sector must be read */
         if (s->lba != -1 && s->io_buffer_index >= s->cd_sector_size) {
-            ret = cd_read_sector(s, s->lba, s->io_buffer, s->cd_sector_size);
-            if (ret < 0) {
-                ide_atapi_io_error(s, ret);
+            if (!s->elementary_transfer_size) {
+                ret = cd_read_sector(s);
+                if (ret < 0) {
+                    ide_atapi_io_error(s, ret);
+                }
                 return;
+            } else {
+                /* rebuffering within an elementary transfer is
+                 * only possible with a sync request because we
+                 * end up with a race condition otherwise */
+                ret = cd_read_sector_sync(s);
+                if (ret < 0) {
+                    ide_atapi_io_error(s, ret);
+                    return;
+                }
             }
-            s->lba++;
-            s->io_buffer_index = 0;
         }
         if (s->elementary_transfer_size > 0) {
             /* there are some data left to transmit in this elementary
@@ -275,7 +349,6 @@ static void ide_atapi_cmd_read_pio(IDEState *s, int lba, int nb_sectors,
     s->io_buffer_index = sector_size;
     s->cd_sector_size = sector_size;
 
-    s->status = READY_STAT | SEEK_STAT;
     ide_atapi_cmd_reply_end(s);
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH V4 2/6] block: add blk_abort_aio_request
  2015-11-12 16:30 [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Peter Lieven
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 1/6] ide/atapi: make PIO read requests async Peter Lieven
@ 2015-11-12 16:30 ` Peter Lieven
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 3/6] ide: add support for IDEBufferedRequest Peter Lieven
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Peter Lieven @ 2015-11-12 16:30 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, famz, stefanha, jcody, Peter Lieven, jsnow

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 block/block-backend.c          | 17 +++++++++--------
 include/sysemu/block-backend.h |  3 +++
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 6f9309f..701234e 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -641,8 +641,9 @@ static void error_callback_bh(void *opaque)
     qemu_aio_unref(acb);
 }
 
-static BlockAIOCB *abort_aio_request(BlockBackend *blk, BlockCompletionFunc *cb,
-                                     void *opaque, int ret)
+BlockAIOCB *blk_abort_aio_request(BlockBackend *blk,
+                                  BlockCompletionFunc *cb,
+                                  void *opaque, int ret)
 {
     struct BlockBackendAIOCB *acb;
     QEMUBH *bh;
@@ -664,7 +665,7 @@ BlockAIOCB *blk_aio_write_zeroes(BlockBackend *blk, int64_t sector_num,
 {
     int ret = blk_check_request(blk, sector_num, nb_sectors);
     if (ret < 0) {
-        return abort_aio_request(blk, cb, opaque, ret);
+        return blk_abort_aio_request(blk, cb, opaque, ret);
     }
 
     return bdrv_aio_write_zeroes(blk->bs, sector_num, nb_sectors, flags,
@@ -724,7 +725,7 @@ BlockAIOCB *blk_aio_readv(BlockBackend *blk, int64_t sector_num,
 {
     int ret = blk_check_request(blk, sector_num, nb_sectors);
     if (ret < 0) {
-        return abort_aio_request(blk, cb, opaque, ret);
+        return blk_abort_aio_request(blk, cb, opaque, ret);
     }
 
     return bdrv_aio_readv(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
@@ -736,7 +737,7 @@ BlockAIOCB *blk_aio_writev(BlockBackend *blk, int64_t sector_num,
 {
     int ret = blk_check_request(blk, sector_num, nb_sectors);
     if (ret < 0) {
-        return abort_aio_request(blk, cb, opaque, ret);
+        return blk_abort_aio_request(blk, cb, opaque, ret);
     }
 
     return bdrv_aio_writev(blk->bs, sector_num, iov, nb_sectors, cb, opaque);
@@ -746,7 +747,7 @@ BlockAIOCB *blk_aio_flush(BlockBackend *blk,
                           BlockCompletionFunc *cb, void *opaque)
 {
     if (!blk_is_available(blk)) {
-        return abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
+        return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
     }
 
     return bdrv_aio_flush(blk->bs, cb, opaque);
@@ -758,7 +759,7 @@ BlockAIOCB *blk_aio_discard(BlockBackend *blk,
 {
     int ret = blk_check_request(blk, sector_num, nb_sectors);
     if (ret < 0) {
-        return abort_aio_request(blk, cb, opaque, ret);
+        return blk_abort_aio_request(blk, cb, opaque, ret);
     }
 
     return bdrv_aio_discard(blk->bs, sector_num, nb_sectors, cb, opaque);
@@ -801,7 +802,7 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
                           BlockCompletionFunc *cb, void *opaque)
 {
     if (!blk_is_available(blk)) {
-        return abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
+        return blk_abort_aio_request(blk, cb, opaque, -ENOMEDIUM);
     }
 
     return bdrv_aio_ioctl(blk->bs, req, buf, cb, opaque);
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index f4a68e2..fb068ea4 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -184,5 +184,8 @@ int blk_save_vmstate(BlockBackend *blk, const uint8_t *buf,
 int blk_load_vmstate(BlockBackend *blk, uint8_t *buf, int64_t pos, int size);
 int blk_probe_blocksizes(BlockBackend *blk, BlockSizes *bsz);
 int blk_probe_geometry(BlockBackend *blk, HDGeometry *geo);
+BlockAIOCB *blk_abort_aio_request(BlockBackend *blk,
+                                  BlockCompletionFunc *cb,
+                                  void *opaque, int ret);
 
 #endif
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH V4 3/6] ide: add support for IDEBufferedRequest
  2015-11-12 16:30 [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Peter Lieven
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 1/6] ide/atapi: make PIO read requests async Peter Lieven
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 2/6] block: add blk_abort_aio_request Peter Lieven
@ 2015-11-12 16:30 ` Peter Lieven
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 4/6] ide: orphan all buffered requests on DMA cancel Peter Lieven
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Peter Lieven @ 2015-11-12 16:30 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, famz, stefanha, jcody, Peter Lieven, jsnow

this patch adds a new aio readv compatible function which copies
all data through a bounce buffer. These buffered requests can be
flagged as orphaned which means that their original callback has
already been invoked and the request has just not been completed
by the backend storage. The bounce buffer guarantees that guest
memory corruption is avoided when such a orphaned request is
completed by the backend at a later stage.

This trick only works for read requests as a write request completed
at a later stage might corrupt data as there is no way to control
if and what data has already been written to the storage.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 hw/ide/core.c     | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 hw/ide/internal.h | 14 ++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 364ba21..7ca67bc 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -561,6 +561,53 @@ static bool ide_sect_range_ok(IDEState *s,
     return true;
 }
 
+static void ide_buffered_readv_cb(void *opaque, int ret)
+{
+    IDEBufferedRequest *req = opaque;
+    if (!req->orphaned) {
+        if (!ret) {
+            qemu_iovec_from_buf(req->original_qiov, 0, req->iov.iov_base,
+                                req->original_qiov->size);
+        }
+        req->original_cb(req->original_opaque, ret);
+    }
+    QLIST_REMOVE(req, list);
+    qemu_vfree(req->iov.iov_base);
+    g_free(req);
+}
+
+#define MAX_BUFFERED_REQS 16
+
+BlockAIOCB *ide_buffered_readv(IDEState *s, int64_t sector_num,
+                               QEMUIOVector *iov, int nb_sectors,
+                               BlockCompletionFunc *cb, void *opaque)
+{
+    BlockAIOCB *aioreq;
+    IDEBufferedRequest *req;
+    int c = 0;
+
+    QLIST_FOREACH(req, &s->buffered_requests, list) {
+        c++;
+    }
+    if (c > MAX_BUFFERED_REQS) {
+        return blk_abort_aio_request(s->blk, cb, opaque, -EIO);
+    }
+
+    req = g_new0(IDEBufferedRequest, 1);
+    req->original_qiov = iov;
+    req->original_cb = cb;
+    req->original_opaque = opaque;
+    req->iov.iov_base = qemu_blockalign(blk_bs(s->blk), iov->size);
+    req->iov.iov_len = iov->size;
+    qemu_iovec_init_external(&req->qiov, &req->iov, 1);
+
+    aioreq = blk_aio_readv(s->blk, sector_num, &req->qiov, nb_sectors,
+                           ide_buffered_readv_cb, req);
+
+    QLIST_INSERT_HEAD(&s->buffered_requests, req, list);
+    return aioreq;
+}
+
 static void ide_sector_read(IDEState *s);
 
 static void ide_sector_read_cb(void *opaque, int ret)
diff --git a/hw/ide/internal.h b/hw/ide/internal.h
index e4629b0..2d1e2d2 100644
--- a/hw/ide/internal.h
+++ b/hw/ide/internal.h
@@ -343,6 +343,16 @@ enum ide_dma_cmd {
 #define ide_cmd_is_read(s) \
 	((s)->dma_cmd == IDE_DMA_READ)
 
+typedef struct IDEBufferedRequest {
+    QLIST_ENTRY(IDEBufferedRequest) list;
+    struct iovec iov;
+    QEMUIOVector qiov;
+    QEMUIOVector *original_qiov;
+    BlockCompletionFunc *original_cb;
+    void *original_opaque;
+    bool orphaned;
+} IDEBufferedRequest;
+
 /* NOTE: IDEState represents in fact one drive */
 struct IDEState {
     IDEBus *bus;
@@ -396,6 +406,7 @@ struct IDEState {
     BlockAIOCB *pio_aiocb;
     struct iovec iov;
     QEMUIOVector qiov;
+    QLIST_HEAD(, IDEBufferedRequest) buffered_requests;
     /* ATA DMA state */
     uint64_t io_buffer_offset;
     int32_t io_buffer_size;
@@ -572,6 +583,9 @@ void ide_set_inactive(IDEState *s, bool more);
 BlockAIOCB *ide_issue_trim(BlockBackend *blk,
         int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
         BlockCompletionFunc *cb, void *opaque);
+BlockAIOCB *ide_buffered_readv(IDEState *s, int64_t sector_num,
+                               QEMUIOVector *iov, int nb_sectors,
+                               BlockCompletionFunc *cb, void *opaque);
 
 /* hw/ide/atapi.c */
 void ide_atapi_cmd(IDEState *s);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH V4 4/6] ide: orphan all buffered requests on DMA cancel
  2015-11-12 16:30 [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Peter Lieven
                   ` (2 preceding siblings ...)
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 3/6] ide: add support for IDEBufferedRequest Peter Lieven
@ 2015-11-12 16:30 ` Peter Lieven
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 5/6] ide: enable buffered requests for ATAPI devices Peter Lieven
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Peter Lieven @ 2015-11-12 16:30 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, famz, stefanha, jcody, Peter Lieven, jsnow

If the guests canceles a DMA request we can prematurely
invoke all callbacks of buffered requests and flag all them
as orphaned. Ideally this avoids the need for draining all
requests. For CDROM devices this works in 100% of all cases.

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 hw/ide/pci.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/hw/ide/pci.c b/hw/ide/pci.c
index 9c54b37..37dbc29 100644
--- a/hw/ide/pci.c
+++ b/hw/ide/pci.c
@@ -233,6 +233,22 @@ void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val)
     /* Ignore writes to SSBM if it keeps the old value */
     if ((val & BM_CMD_START) != (bm->cmd & BM_CMD_START)) {
         if (!(val & BM_CMD_START)) {
+            /* First invoke the callbacks of all buffered requests
+             * and flag those requests as orphaned. Ideally there
+             * are no unbuffered (Scatter Gather DMA Requests or
+             * write requests) pending and we can avoid to drain. */
+            IDEBufferedRequest *req;
+            IDEState *s = idebus_active_if(bm->bus);
+            QLIST_FOREACH(req, &s->buffered_requests, list) {
+                if (!req->orphaned) {
+#ifdef DEBUG_IDE
+                    printf("%s: invoking cb %p of buffered request %p with"
+                           " -ECANCELED\n", __func__, req->original_cb, req);
+#endif
+                    req->original_cb(req->original_opaque, -ECANCELED);
+                }
+                req->orphaned = true;
+            }
             /*
              * We can't cancel Scatter Gather DMA in the middle of the
              * operation or a partial (not full) DMA transfer would reach
@@ -246,6 +262,9 @@ void bmdma_cmd_writeb(BMDMAState *bm, uint32_t val)
              * aio operation with preadv/pwritev.
              */
             if (bm->bus->dma->aiocb) {
+#ifdef DEBUG_IDE
+                printf("%s: draining all remaining requests", __func__);
+#endif
                 blk_drain_all();
                 assert(bm->bus->dma->aiocb == NULL);
             }
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH V4 5/6] ide: enable buffered requests for ATAPI devices
  2015-11-12 16:30 [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Peter Lieven
                   ` (3 preceding siblings ...)
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 4/6] ide: orphan all buffered requests on DMA cancel Peter Lieven
@ 2015-11-12 16:30 ` Peter Lieven
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 6/6] ide: enable buffered requests for PIO read requests Peter Lieven
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Peter Lieven @ 2015-11-12 16:30 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, famz, stefanha, jcody, Peter Lieven, jsnow

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 hw/ide/atapi.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
index cfd2d63..d1eaa29 100644
--- a/hw/ide/atapi.c
+++ b/hw/ide/atapi.c
@@ -190,8 +190,8 @@ static int cd_read_sector(IDEState *s)
     block_acct_start(blk_get_stats(s->blk), &s->acct,
                      4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
 
-    blk_aio_readv(s->blk, (int64_t)s->lba << 2, &s->qiov, 4,
-                  cd_read_sector_cb, s);
+    ide_buffered_readv(s, (int64_t)s->lba << 2, &s->qiov, 4,
+                       cd_read_sector_cb, s);
 
     s->status |= BUSY_STAT;
     return 0;
@@ -424,9 +424,9 @@ static void ide_atapi_cmd_read_dma_cb(void *opaque, int ret)
     s->bus->dma->iov.iov_len = n * 4 * 512;
     qemu_iovec_init_external(&s->bus->dma->qiov, &s->bus->dma->iov, 1);
 
-    s->bus->dma->aiocb = blk_aio_readv(s->blk, (int64_t)s->lba << 2,
-                                       &s->bus->dma->qiov, n * 4,
-                                       ide_atapi_cmd_read_dma_cb, s);
+    s->bus->dma->aiocb = ide_buffered_readv(s, (int64_t)s->lba << 2,
+                                            &s->bus->dma->qiov, n * 4,
+                                            ide_atapi_cmd_read_dma_cb, s);
     return;
 
 eot:
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [Qemu-devel] [PATCH V4 6/6] ide: enable buffered requests for PIO read requests
  2015-11-12 16:30 [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Peter Lieven
                   ` (4 preceding siblings ...)
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 5/6] ide: enable buffered requests for ATAPI devices Peter Lieven
@ 2015-11-12 16:30 ` Peter Lieven
  2015-11-13 11:13 ` [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Mark Cave-Ayland
  2015-11-13 22:44 ` John Snow
  7 siblings, 0 replies; 16+ messages in thread
From: Peter Lieven @ 2015-11-12 16:30 UTC (permalink / raw)
  To: qemu-devel, qemu-block; +Cc: kwolf, famz, stefanha, jcody, Peter Lieven, jsnow

Signed-off-by: Peter Lieven <pl@kamp.de>
---
 hw/ide/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 7ca67bc..b9b531c 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -677,8 +677,8 @@ static void ide_sector_read(IDEState *s)
 
     block_acct_start(blk_get_stats(s->blk), &s->acct,
                      n * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
-    s->pio_aiocb = blk_aio_readv(s->blk, sector_num, &s->qiov, n,
-                                 ide_sector_read_cb, s);
+    s->pio_aiocb = ide_buffered_readv(s, sector_num, &s->qiov, n,
+                                      ide_sector_read_cb, s);
 }
 
 void dma_buf_commit(IDEState *s, uint32_t tx_bytes)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure
  2015-11-12 16:30 [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Peter Lieven
                   ` (5 preceding siblings ...)
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 6/6] ide: enable buffered requests for PIO read requests Peter Lieven
@ 2015-11-13 11:13 ` Mark Cave-Ayland
  2015-11-13 17:18   ` John Snow
  2015-11-13 22:44 ` John Snow
  7 siblings, 1 reply; 16+ messages in thread
From: Mark Cave-Ayland @ 2015-11-13 11:13 UTC (permalink / raw)
  To: Peter Lieven, qemu-devel, qemu-block; +Cc: kwolf, stefanha, jcody, jsnow, famz

On 12/11/15 16:30, Peter Lieven wrote:

> This series aims at avoiding a hanging main-loop if a vserver has a
> CDROM image mounted from a NFS share and that NFS share goes down.
> Typical situation is that users mount an CDROM ISO to install something
> and then forget to eject that CDROM afterwards.
> As a consequence this mounted CD is able to bring down the
> whole vserver if the backend NFS share is unreachable. This is bad
> especially if the CDROM itself is not needed anymore at this point.
> 
> v3->v4: - Patch 1: remove buf argument for cd_read_sector{_sync}
>         - Patch 1: fix iov_base offset for 2352 sector size
>         - Patch 2: fix indent [Fam]
>         - Patch 3: fix leaking of req->iov.iov_base [Fam]
> 
> v2->v3: - adressed Stefans comments on Patch 1
>         - added patches 2,4,5,6
>         - avoided the term cancel in Patch 3. Added an iovec,
>           added a BH [Stefan]
> v1->v2: - fix offset for 2352 byte sector size [Kevin]
>         - use a sync request if we continue an elementary transfer.
>           As John pointed out we enter a race condition between next
>           IDE command and async transfer otherwise. This is sill not
>           optimal, but it fixes the NFS down problems for all cases where
>           the NFS server goes down while there is no PIO CD activity.
>           Of course, it could still happen during a PIO transfer, but I
>           expect this to be the unlikelier case.
>           I spent some effort trying to read more sectors at once and
>           avoiding continuation of elementary transfers, but with
>           whatever I came up it was destroying migration between different
>           Qemu versions. I have a quite hackish patch that works and
>           should survive migration, but I am not happy with it. So I
>           would like to start with this version as it is a big improvement
>           already.
>         - Dropped Patch 5 because it is upstream meanwhile.
> 
> Peter Lieven (6):
>   ide/atapi: make PIO read requests async
>   block: add blk_abort_aio_request
>   ide: add support for IDEBufferedRequest
>   ide: orphan all buffered requests on DMA cancel
>   ide: enable buffered requests for ATAPI devices
>   ide: enable buffered requests for PIO read requests
> 
>  block/block-backend.c          |  17 +++----
>  hw/ide/atapi.c                 | 103 +++++++++++++++++++++++++++++++++++------
>  hw/ide/core.c                  |  51 +++++++++++++++++++-
>  hw/ide/internal.h              |  14 ++++++
>  hw/ide/pci.c                   |  19 ++++++++
>  include/sysemu/block-backend.h |   3 ++
>  6 files changed, 182 insertions(+), 25 deletions(-)

Patches 4-6 look like they may handle issues I've found with migration
in DMA requests on PPC g3beige and mac99 machines. Unfortunately due to
alignment issues, the macio controller has implement its own versions of
some of these routines, so should parts of these also be applied
hw/ide/macio.c aswell?


ATB,

Mark.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure
  2015-11-13 11:13 ` [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Mark Cave-Ayland
@ 2015-11-13 17:18   ` John Snow
  0 siblings, 0 replies; 16+ messages in thread
From: John Snow @ 2015-11-13 17:18 UTC (permalink / raw)
  To: Mark Cave-Ayland, Peter Lieven, qemu-devel, qemu-block
  Cc: kwolf, stefanha, jcody, famz



On 11/13/2015 06:13 AM, Mark Cave-Ayland wrote:
> On 12/11/15 16:30, Peter Lieven wrote:
> 
>> This series aims at avoiding a hanging main-loop if a vserver has a
>> CDROM image mounted from a NFS share and that NFS share goes down.
>> Typical situation is that users mount an CDROM ISO to install something
>> and then forget to eject that CDROM afterwards.
>> As a consequence this mounted CD is able to bring down the
>> whole vserver if the backend NFS share is unreachable. This is bad
>> especially if the CDROM itself is not needed anymore at this point.
>>
>> v3->v4: - Patch 1: remove buf argument for cd_read_sector{_sync}
>>         - Patch 1: fix iov_base offset for 2352 sector size
>>         - Patch 2: fix indent [Fam]
>>         - Patch 3: fix leaking of req->iov.iov_base [Fam]
>>
>> v2->v3: - adressed Stefans comments on Patch 1
>>         - added patches 2,4,5,6
>>         - avoided the term cancel in Patch 3. Added an iovec,
>>           added a BH [Stefan]
>> v1->v2: - fix offset for 2352 byte sector size [Kevin]
>>         - use a sync request if we continue an elementary transfer.
>>           As John pointed out we enter a race condition between next
>>           IDE command and async transfer otherwise. This is sill not
>>           optimal, but it fixes the NFS down problems for all cases where
>>           the NFS server goes down while there is no PIO CD activity.
>>           Of course, it could still happen during a PIO transfer, but I
>>           expect this to be the unlikelier case.
>>           I spent some effort trying to read more sectors at once and
>>           avoiding continuation of elementary transfers, but with
>>           whatever I came up it was destroying migration between different
>>           Qemu versions. I have a quite hackish patch that works and
>>           should survive migration, but I am not happy with it. So I
>>           would like to start with this version as it is a big improvement
>>           already.
>>         - Dropped Patch 5 because it is upstream meanwhile.
>>
>> Peter Lieven (6):
>>   ide/atapi: make PIO read requests async
>>   block: add blk_abort_aio_request
>>   ide: add support for IDEBufferedRequest
>>   ide: orphan all buffered requests on DMA cancel
>>   ide: enable buffered requests for ATAPI devices
>>   ide: enable buffered requests for PIO read requests
>>
>>  block/block-backend.c          |  17 +++----
>>  hw/ide/atapi.c                 | 103 +++++++++++++++++++++++++++++++++++------
>>  hw/ide/core.c                  |  51 +++++++++++++++++++-
>>  hw/ide/internal.h              |  14 ++++++
>>  hw/ide/pci.c                   |  19 ++++++++
>>  include/sysemu/block-backend.h |   3 ++
>>  6 files changed, 182 insertions(+), 25 deletions(-)
> 
> Patches 4-6 look like they may handle issues I've found with migration
> in DMA requests on PPC g3beige and mac99 machines. Unfortunately due to
> alignment issues, the macio controller has implement its own versions of
> some of these routines, so should parts of these also be applied
> hw/ide/macio.c aswell?
> 
> 
> ATB,
> 
> Mark.
> 

I still might be suffering delusions that I can sneak this into 2.5 -- I
wouldn't mind reworking this to share with macio later if that's the case.

I don't think I'll make acceptance of this patchset conditional on macio
inclusion either way, though.

--js

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH V4 1/6] ide/atapi: make PIO read requests async
  2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 1/6] ide/atapi: make PIO read requests async Peter Lieven
@ 2015-11-13 22:42   ` John Snow
  2015-11-13 23:00     ` Peter Lieven
  0 siblings, 1 reply; 16+ messages in thread
From: John Snow @ 2015-11-13 22:42 UTC (permalink / raw)
  To: Peter Lieven, qemu-devel, qemu-block; +Cc: kwolf, stefanha, jcody, famz



On 11/12/2015 11:30 AM, Peter Lieven wrote:
> PIO read requests on the ATAPI interface used to be sync blk requests.
> This has two significant drawbacks. First the main loop hangs util an
> I/O request is completed and secondly if the I/O request does not
> complete (e.g. due to an unresponsive storage) Qemu hangs completely.
> 
> Note: Due to possible race conditions requests during an ongoing
> elementary transfer are still sync.
> 
> Signed-off-by: Peter Lieven <pl@kamp.de>
> ---
>  hw/ide/atapi.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 85 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
> index 747f466..cfd2d63 100644
> --- a/hw/ide/atapi.c
> +++ b/hw/ide/atapi.c
> @@ -105,33 +105,98 @@ static void cd_data_to_raw(uint8_t *buf, int lba)
>      memset(buf, 0, 288);
>  }
>  
> -static int cd_read_sector(IDEState *s, int lba, uint8_t *buf, int sector_size)
> +static int
> +cd_read_sector_sync(IDEState *s)
>  {
>      int ret;
>  
> -    switch(sector_size) {
> +#ifdef DEBUG_IDE_ATAPI
> +    printf("cd_read_sector_sync: lba=%d\n", s->lba);
> +#endif
> +
> +    switch (s->cd_sector_size) {
>      case 2048:
>          block_acct_start(blk_get_stats(s->blk), &s->acct,
>                           4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> -        ret = blk_read(s->blk, (int64_t)lba << 2, buf, 4);
> +        ret = blk_read(s->blk, (int64_t)s->lba << 2,
> +                       s->io_buffer, 4);
>          block_acct_done(blk_get_stats(s->blk), &s->acct);
>          break;
>      case 2352:
>          block_acct_start(blk_get_stats(s->blk), &s->acct,
>                           4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> -        ret = blk_read(s->blk, (int64_t)lba << 2, buf + 16, 4);
> +        ret = blk_read(s->blk, (int64_t)s->lba << 2,
> +                       s->io_buffer + 16, 4);
>          block_acct_done(blk_get_stats(s->blk), &s->acct);
>          if (ret < 0)
>              return ret;
> -        cd_data_to_raw(buf, lba);
> +        cd_data_to_raw(s->io_buffer, s->lba);
>          break;
>      default:
>          ret = -EIO;
>          break;
>      }
> +
> +    if (!ret) {
> +        s->lba++;
> +        s->io_buffer_index = 0;
> +    }
> +
>      return ret;
>  }
>  
> +static void cd_read_sector_cb(void *opaque, int ret)
> +{
> +    IDEState *s = opaque;
> +
> +    block_acct_done(blk_get_stats(s->blk), &s->acct);
> +
> +#ifdef DEBUG_IDE_ATAPI
> +    printf("cd_read_sector_cb: lba=%d ret=%d\n", s->lba, ret);
> +#endif
> +
> +    if (ret < 0) {
> +        ide_atapi_io_error(s, ret);
> +        return;
> +    }
> +
> +    if (s->cd_sector_size == 2352) {
> +        cd_data_to_raw(s->io_buffer, s->lba);
> +    }
> +
> +    s->lba++;
> +    s->io_buffer_index = 0;
> +    s->status &= ~BUSY_STAT;
> +
> +    ide_atapi_cmd_reply_end(s);
> +}
> +
> +static int cd_read_sector(IDEState *s)
> +{
> +    if (s->cd_sector_size != 2048 && s->cd_sector_size != 2352) {
> +        return -EINVAL;
> +    }
> +
> +    s->iov.iov_base = (s->cd_sector_size == 2352) ?
> +                      s->io_buffer + 16 : s->io_buffer;
> +
> +    s->iov.iov_len = 4 * BDRV_SECTOR_SIZE;
> +    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
> +
> +#ifdef DEBUG_IDE_ATAPI
> +    printf("cd_read_sector: lba=%d\n", s->lba);
> +#endif
> +
> +    block_acct_start(blk_get_stats(s->blk), &s->acct,
> +                     4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> +
> +    blk_aio_readv(s->blk, (int64_t)s->lba << 2, &s->qiov, 4,
> +                  cd_read_sector_cb, s);
> +
> +    s->status |= BUSY_STAT;
> +    return 0;
> +}
> +
>  void ide_atapi_cmd_ok(IDEState *s)
>  {
>      s->error = 0;
> @@ -182,18 +247,27 @@ void ide_atapi_cmd_reply_end(IDEState *s)
>          ide_atapi_cmd_ok(s);
>          ide_set_irq(s->bus);
>  #ifdef DEBUG_IDE_ATAPI
> -        printf("status=0x%x\n", s->status);
> +        printf("end of transfer, status=0x%x\n", s->status);
>  #endif
>      } else {
>          /* see if a new sector must be read */
>          if (s->lba != -1 && s->io_buffer_index >= s->cd_sector_size) {
> -            ret = cd_read_sector(s, s->lba, s->io_buffer, s->cd_sector_size);
> -            if (ret < 0) {
> -                ide_atapi_io_error(s, ret);
> +            if (!s->elementary_transfer_size) {
> +                ret = cd_read_sector(s);
> +                if (ret < 0) {
> +                    ide_atapi_io_error(s, ret);
> +                }
>                  return;
> +            } else {
> +                /* rebuffering within an elementary transfer is
> +                 * only possible with a sync request because we
> +                 * end up with a race condition otherwise */
> +                ret = cd_read_sector_sync(s);
> +                if (ret < 0) {
> +                    ide_atapi_io_error(s, ret);
> +                    return;
> +                }
>              }
> -            s->lba++;
> -            s->io_buffer_index = 0;
>          }
>          if (s->elementary_transfer_size > 0) {
>              /* there are some data left to transmit in this elementary
> @@ -275,7 +349,6 @@ static void ide_atapi_cmd_read_pio(IDEState *s, int lba, int nb_sectors,
>      s->io_buffer_index = sector_size;
>      s->cd_sector_size = sector_size;
>  
> -    s->status = READY_STAT | SEEK_STAT;
>      ide_atapi_cmd_reply_end(s);
>  }
>  
> 

Had to rebase against Berto's patches, but nothing major. You can see my
rebase at:

https://github.com/jnsnow/qemu/tree/review-plieven

Reviewed-by: John Snow <jsnow@redhat.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure
  2015-11-12 16:30 [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Peter Lieven
                   ` (6 preceding siblings ...)
  2015-11-13 11:13 ` [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Mark Cave-Ayland
@ 2015-11-13 22:44 ` John Snow
  2015-11-16  6:17   ` Fam Zheng
  7 siblings, 1 reply; 16+ messages in thread
From: John Snow @ 2015-11-13 22:44 UTC (permalink / raw)
  To: Peter Lieven; +Cc: kwolf, famz, qemu-block, jcody, qemu-devel, Stefan Hajnoczi



On 11/12/2015 11:30 AM, Peter Lieven wrote:
> This series aims at avoiding a hanging main-loop if a vserver has a
> CDROM image mounted from a NFS share and that NFS share goes down.
> Typical situation is that users mount an CDROM ISO to install something
> and then forget to eject that CDROM afterwards.
> As a consequence this mounted CD is able to bring down the
> whole vserver if the backend NFS share is unreachable. This is bad
> especially if the CDROM itself is not needed anymore at this point.
> 
> v3->v4: - Patch 1: remove buf argument for cd_read_sector{_sync}
>         - Patch 1: fix iov_base offset for 2352 sector size
>         - Patch 2: fix indent [Fam]
>         - Patch 3: fix leaking of req->iov.iov_base [Fam]
> 
> v2->v3: - adressed Stefans comments on Patch 1
>         - added patches 2,4,5,6
>         - avoided the term cancel in Patch 3. Added an iovec,
>           added a BH [Stefan]
> v1->v2: - fix offset for 2352 byte sector size [Kevin]
>         - use a sync request if we continue an elementary transfer.
>           As John pointed out we enter a race condition between next
>           IDE command and async transfer otherwise. This is sill not
>           optimal, but it fixes the NFS down problems for all cases where
>           the NFS server goes down while there is no PIO CD activity.
>           Of course, it could still happen during a PIO transfer, but I
>           expect this to be the unlikelier case.
>           I spent some effort trying to read more sectors at once and
>           avoiding continuation of elementary transfers, but with
>           whatever I came up it was destroying migration between different
>           Qemu versions. I have a quite hackish patch that works and
>           should survive migration, but I am not happy with it. So I
>           would like to start with this version as it is a big improvement
>           already.
>         - Dropped Patch 5 because it is upstream meanwhile.
> 
> Peter Lieven (6):
>   ide/atapi: make PIO read requests async
>   block: add blk_abort_aio_request
>   ide: add support for IDEBufferedRequest
>   ide: orphan all buffered requests on DMA cancel
>   ide: enable buffered requests for ATAPI devices
>   ide: enable buffered requests for PIO read requests
> 
>  block/block-backend.c          |  17 +++----
>  hw/ide/atapi.c                 | 103 +++++++++++++++++++++++++++++++++++------
>  hw/ide/core.c                  |  51 +++++++++++++++++++-
>  hw/ide/internal.h              |  14 ++++++
>  hw/ide/pci.c                   |  19 ++++++++
>  include/sysemu/block-backend.h |   3 ++
>  6 files changed, 182 insertions(+), 25 deletions(-)
> 

It looks sane to me:

Reviewed-by: John Snow <jsnow@redhat.com>

Fam, Stefan: Do you think this is still sane for 2.5? I am inclined to
get it in as a fix, especially since we've been bouncing it around for
so long.

--js

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH V4 1/6] ide/atapi: make PIO read requests async
  2015-11-13 22:42   ` John Snow
@ 2015-11-13 23:00     ` Peter Lieven
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Lieven @ 2015-11-13 23:00 UTC (permalink / raw)
  To: John Snow; +Cc: kwolf, famz, qemu-block, stefanha, jcody, qemu-devel



> Am 13.11.2015 um 23:42 schrieb John Snow <jsnow@redhat.com>:
> 
> 
> 
>> On 11/12/2015 11:30 AM, Peter Lieven wrote:
>> PIO read requests on the ATAPI interface used to be sync blk requests.
>> This has two significant drawbacks. First the main loop hangs util an
>> I/O request is completed and secondly if the I/O request does not
>> complete (e.g. due to an unresponsive storage) Qemu hangs completely.
>> 
>> Note: Due to possible race conditions requests during an ongoing
>> elementary transfer are still sync.
>> 
>> Signed-off-by: Peter Lieven <pl@kamp.de>
>> ---
>> hw/ide/atapi.c | 97 ++++++++++++++++++++++++++++++++++++++++++++++++++--------
>> 1 file changed, 85 insertions(+), 12 deletions(-)
>> 
>> diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
>> index 747f466..cfd2d63 100644
>> --- a/hw/ide/atapi.c
>> +++ b/hw/ide/atapi.c
>> @@ -105,33 +105,98 @@ static void cd_data_to_raw(uint8_t *buf, int lba)
>>     memset(buf, 0, 288);
>> }
>> 
>> -static int cd_read_sector(IDEState *s, int lba, uint8_t *buf, int sector_size)
>> +static int
>> +cd_read_sector_sync(IDEState *s)
>> {
>>     int ret;
>> 
>> -    switch(sector_size) {
>> +#ifdef DEBUG_IDE_ATAPI
>> +    printf("cd_read_sector_sync: lba=%d\n", s->lba);
>> +#endif
>> +
>> +    switch (s->cd_sector_size) {
>>     case 2048:
>>         block_acct_start(blk_get_stats(s->blk), &s->acct,
>>                          4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>> -        ret = blk_read(s->blk, (int64_t)lba << 2, buf, 4);
>> +        ret = blk_read(s->blk, (int64_t)s->lba << 2,
>> +                       s->io_buffer, 4);
>>         block_acct_done(blk_get_stats(s->blk), &s->acct);
>>         break;
>>     case 2352:
>>         block_acct_start(blk_get_stats(s->blk), &s->acct,
>>                          4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>> -        ret = blk_read(s->blk, (int64_t)lba << 2, buf + 16, 4);
>> +        ret = blk_read(s->blk, (int64_t)s->lba << 2,
>> +                       s->io_buffer + 16, 4);
>>         block_acct_done(blk_get_stats(s->blk), &s->acct);
>>         if (ret < 0)
>>             return ret;
>> -        cd_data_to_raw(buf, lba);
>> +        cd_data_to_raw(s->io_buffer, s->lba);
>>         break;
>>     default:
>>         ret = -EIO;
>>         break;
>>     }
>> +
>> +    if (!ret) {
>> +        s->lba++;
>> +        s->io_buffer_index = 0;
>> +    }
>> +
>>     return ret;
>> }
>> 
>> +static void cd_read_sector_cb(void *opaque, int ret)
>> +{
>> +    IDEState *s = opaque;
>> +
>> +    block_acct_done(blk_get_stats(s->blk), &s->acct);
>> +
>> +#ifdef DEBUG_IDE_ATAPI
>> +    printf("cd_read_sector_cb: lba=%d ret=%d\n", s->lba, ret);
>> +#endif
>> +
>> +    if (ret < 0) {
>> +        ide_atapi_io_error(s, ret);
>> +        return;
>> +    }
>> +
>> +    if (s->cd_sector_size == 2352) {
>> +        cd_data_to_raw(s->io_buffer, s->lba);
>> +    }
>> +
>> +    s->lba++;
>> +    s->io_buffer_index = 0;
>> +    s->status &= ~BUSY_STAT;
>> +
>> +    ide_atapi_cmd_reply_end(s);
>> +}
>> +
>> +static int cd_read_sector(IDEState *s)
>> +{
>> +    if (s->cd_sector_size != 2048 && s->cd_sector_size != 2352) {
>> +        return -EINVAL;
>> +    }
>> +
>> +    s->iov.iov_base = (s->cd_sector_size == 2352) ?
>> +                      s->io_buffer + 16 : s->io_buffer;
>> +
>> +    s->iov.iov_len = 4 * BDRV_SECTOR_SIZE;
>> +    qemu_iovec_init_external(&s->qiov, &s->iov, 1);
>> +
>> +#ifdef DEBUG_IDE_ATAPI
>> +    printf("cd_read_sector: lba=%d\n", s->lba);
>> +#endif
>> +
>> +    block_acct_start(blk_get_stats(s->blk), &s->acct,
>> +                     4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
>> +
>> +    blk_aio_readv(s->blk, (int64_t)s->lba << 2, &s->qiov, 4,
>> +                  cd_read_sector_cb, s);
>> +
>> +    s->status |= BUSY_STAT;
>> +    return 0;
>> +}
>> +
>> void ide_atapi_cmd_ok(IDEState *s)
>> {
>>     s->error = 0;
>> @@ -182,18 +247,27 @@ void ide_atapi_cmd_reply_end(IDEState *s)
>>         ide_atapi_cmd_ok(s);
>>         ide_set_irq(s->bus);
>> #ifdef DEBUG_IDE_ATAPI
>> -        printf("status=0x%x\n", s->status);
>> +        printf("end of transfer, status=0x%x\n", s->status);
>> #endif
>>     } else {
>>         /* see if a new sector must be read */
>>         if (s->lba != -1 && s->io_buffer_index >= s->cd_sector_size) {
>> -            ret = cd_read_sector(s, s->lba, s->io_buffer, s->cd_sector_size);
>> -            if (ret < 0) {
>> -                ide_atapi_io_error(s, ret);
>> +            if (!s->elementary_transfer_size) {
>> +                ret = cd_read_sector(s);
>> +                if (ret < 0) {
>> +                    ide_atapi_io_error(s, ret);
>> +                }
>>                 return;
>> +            } else {
>> +                /* rebuffering within an elementary transfer is
>> +                 * only possible with a sync request because we
>> +                 * end up with a race condition otherwise */
>> +                ret = cd_read_sector_sync(s);
>> +                if (ret < 0) {
>> +                    ide_atapi_io_error(s, ret);
>> +                    return;
>> +                }
>>             }
>> -            s->lba++;
>> -            s->io_buffer_index = 0;
>>         }
>>         if (s->elementary_transfer_size > 0) {
>>             /* there are some data left to transmit in this elementary
>> @@ -275,7 +349,6 @@ static void ide_atapi_cmd_read_pio(IDEState *s, int lba, int nb_sectors,
>>     s->io_buffer_index = sector_size;
>>     s->cd_sector_size = sector_size;
>> 
>> -    s->status = READY_STAT | SEEK_STAT;
>>     ide_atapi_cmd_reply_end(s);
>> }
> 
> Had to rebase against Berto's patches, but nothing major. You can see my
> rebase at:
> 
> https://github.com/jnsnow/qemu/tree/review-plieven
> 
> Reviewed-by: John Snow <jsnow@redhat.com>

in Patch 1 the indent is off at the end. Otherwise looks good to me.

Peter 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure
  2015-11-13 22:44 ` John Snow
@ 2015-11-16  6:17   ` Fam Zheng
  2015-11-16  6:48     ` Peter Lieven
  2015-11-16 19:24     ` John Snow
  0 siblings, 2 replies; 16+ messages in thread
From: Fam Zheng @ 2015-11-16  6:17 UTC (permalink / raw)
  To: John Snow
  Cc: kwolf, qemu-block, jcody, Peter Lieven, qemu-devel, Stefan Hajnoczi

On Fri, 11/13 17:44, John Snow wrote:
> 
> 
> On 11/12/2015 11:30 AM, Peter Lieven wrote:
> > This series aims at avoiding a hanging main-loop if a vserver has a
> > CDROM image mounted from a NFS share and that NFS share goes down.
> > Typical situation is that users mount an CDROM ISO to install something
> > and then forget to eject that CDROM afterwards.
> > As a consequence this mounted CD is able to bring down the
> > whole vserver if the backend NFS share is unreachable. This is bad
> > especially if the CDROM itself is not needed anymore at this point.
> > 
> > v3->v4: - Patch 1: remove buf argument for cd_read_sector{_sync}
> >         - Patch 1: fix iov_base offset for 2352 sector size
> >         - Patch 2: fix indent [Fam]
> >         - Patch 3: fix leaking of req->iov.iov_base [Fam]
> > 
> > v2->v3: - adressed Stefans comments on Patch 1
> >         - added patches 2,4,5,6
> >         - avoided the term cancel in Patch 3. Added an iovec,
> >           added a BH [Stefan]
> > v1->v2: - fix offset for 2352 byte sector size [Kevin]
> >         - use a sync request if we continue an elementary transfer.
> >           As John pointed out we enter a race condition between next
> >           IDE command and async transfer otherwise. This is sill not
> >           optimal, but it fixes the NFS down problems for all cases where
> >           the NFS server goes down while there is no PIO CD activity.
> >           Of course, it could still happen during a PIO transfer, but I
> >           expect this to be the unlikelier case.
> >           I spent some effort trying to read more sectors at once and
> >           avoiding continuation of elementary transfers, but with
> >           whatever I came up it was destroying migration between different
> >           Qemu versions. I have a quite hackish patch that works and
> >           should survive migration, but I am not happy with it. So I
> >           would like to start with this version as it is a big improvement
> >           already.
> >         - Dropped Patch 5 because it is upstream meanwhile.
> > 
> > Peter Lieven (6):
> >   ide/atapi: make PIO read requests async
> >   block: add blk_abort_aio_request
> >   ide: add support for IDEBufferedRequest
> >   ide: orphan all buffered requests on DMA cancel
> >   ide: enable buffered requests for ATAPI devices
> >   ide: enable buffered requests for PIO read requests
> > 
> >  block/block-backend.c          |  17 +++----
> >  hw/ide/atapi.c                 | 103 +++++++++++++++++++++++++++++++++++------
> >  hw/ide/core.c                  |  51 +++++++++++++++++++-
> >  hw/ide/internal.h              |  14 ++++++
> >  hw/ide/pci.c                   |  19 ++++++++
> >  include/sysemu/block-backend.h |   3 ++
> >  6 files changed, 182 insertions(+), 25 deletions(-)
> > 
> 
> It looks sane to me:
> 
> Reviewed-by: John Snow <jsnow@redhat.com>
> 
> Fam, Stefan: Do you think this is still sane for 2.5? I am inclined to
> get it in as a fix, especially since we've been bouncing it around for
> so long.

I have no objection here, there is no new feature in the series, and guest
hanging right after installing something is admittedly annoying to users.

Thanks,
Fam

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure
  2015-11-16  6:17   ` Fam Zheng
@ 2015-11-16  6:48     ` Peter Lieven
  2015-11-16 19:24     ` John Snow
  1 sibling, 0 replies; 16+ messages in thread
From: Peter Lieven @ 2015-11-16  6:48 UTC (permalink / raw)
  To: Fam Zheng, John Snow
  Cc: kwolf, jcody, Stefan Hajnoczi, qemu-devel, qemu-block

Am 16.11.2015 um 07:17 schrieb Fam Zheng:
> On Fri, 11/13 17:44, John Snow wrote:
>>
>> On 11/12/2015 11:30 AM, Peter Lieven wrote:
>>> This series aims at avoiding a hanging main-loop if a vserver has a
>>> CDROM image mounted from a NFS share and that NFS share goes down.
>>> Typical situation is that users mount an CDROM ISO to install something
>>> and then forget to eject that CDROM afterwards.
>>> As a consequence this mounted CD is able to bring down the
>>> whole vserver if the backend NFS share is unreachable. This is bad
>>> especially if the CDROM itself is not needed anymore at this point.
>>>
>>> v3->v4: - Patch 1: remove buf argument for cd_read_sector{_sync}
>>>          - Patch 1: fix iov_base offset for 2352 sector size
>>>          - Patch 2: fix indent [Fam]
>>>          - Patch 3: fix leaking of req->iov.iov_base [Fam]
>>>
>>> v2->v3: - adressed Stefans comments on Patch 1
>>>          - added patches 2,4,5,6
>>>          - avoided the term cancel in Patch 3. Added an iovec,
>>>            added a BH [Stefan]
>>> v1->v2: - fix offset for 2352 byte sector size [Kevin]
>>>          - use a sync request if we continue an elementary transfer.
>>>            As John pointed out we enter a race condition between next
>>>            IDE command and async transfer otherwise. This is sill not
>>>            optimal, but it fixes the NFS down problems for all cases where
>>>            the NFS server goes down while there is no PIO CD activity.
>>>            Of course, it could still happen during a PIO transfer, but I
>>>            expect this to be the unlikelier case.
>>>            I spent some effort trying to read more sectors at once and
>>>            avoiding continuation of elementary transfers, but with
>>>            whatever I came up it was destroying migration between different
>>>            Qemu versions. I have a quite hackish patch that works and
>>>            should survive migration, but I am not happy with it. So I
>>>            would like to start with this version as it is a big improvement
>>>            already.
>>>          - Dropped Patch 5 because it is upstream meanwhile.
>>>
>>> Peter Lieven (6):
>>>    ide/atapi: make PIO read requests async
>>>    block: add blk_abort_aio_request
>>>    ide: add support for IDEBufferedRequest
>>>    ide: orphan all buffered requests on DMA cancel
>>>    ide: enable buffered requests for ATAPI devices
>>>    ide: enable buffered requests for PIO read requests
>>>
>>>   block/block-backend.c          |  17 +++----
>>>   hw/ide/atapi.c                 | 103 +++++++++++++++++++++++++++++++++++------
>>>   hw/ide/core.c                  |  51 +++++++++++++++++++-
>>>   hw/ide/internal.h              |  14 ++++++
>>>   hw/ide/pci.c                   |  19 ++++++++
>>>   include/sysemu/block-backend.h |   3 ++
>>>   6 files changed, 182 insertions(+), 25 deletions(-)
>>>
>> It looks sane to me:
>>
>> Reviewed-by: John Snow <jsnow@redhat.com>
>>
>> Fam, Stefan: Do you think this is still sane for 2.5? I am inclined to
>> get it in as a fix, especially since we've been bouncing it around for
>> so long.
> I have no objection here, there is no new feature in the series, and guest
> hanging right after installing something is admittedly annoying to users.

Thats for speaking for merging it in 2.5. However, I have to clarify that
the fixed behaviour is the same since day 1 of Qemu.

Peter

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure
  2015-11-16  6:17   ` Fam Zheng
  2015-11-16  6:48     ` Peter Lieven
@ 2015-11-16 19:24     ` John Snow
  2015-11-17  4:17       ` Fam Zheng
  1 sibling, 1 reply; 16+ messages in thread
From: John Snow @ 2015-11-16 19:24 UTC (permalink / raw)
  To: Fam Zheng
  Cc: kwolf, qemu-block, jcody, Peter Lieven, qemu-devel, Stefan Hajnoczi



On 11/16/2015 01:17 AM, Fam Zheng wrote:
> On Fri, 11/13 17:44, John Snow wrote:
>>
>>
>> On 11/12/2015 11:30 AM, Peter Lieven wrote:
>>> This series aims at avoiding a hanging main-loop if a vserver has a
>>> CDROM image mounted from a NFS share and that NFS share goes down.
>>> Typical situation is that users mount an CDROM ISO to install something
>>> and then forget to eject that CDROM afterwards.
>>> As a consequence this mounted CD is able to bring down the
>>> whole vserver if the backend NFS share is unreachable. This is bad
>>> especially if the CDROM itself is not needed anymore at this point.
>>>
>>> v3->v4: - Patch 1: remove buf argument for cd_read_sector{_sync}
>>>         - Patch 1: fix iov_base offset for 2352 sector size
>>>         - Patch 2: fix indent [Fam]
>>>         - Patch 3: fix leaking of req->iov.iov_base [Fam]
>>>
>>> v2->v3: - adressed Stefans comments on Patch 1
>>>         - added patches 2,4,5,6
>>>         - avoided the term cancel in Patch 3. Added an iovec,
>>>           added a BH [Stefan]
>>> v1->v2: - fix offset for 2352 byte sector size [Kevin]
>>>         - use a sync request if we continue an elementary transfer.
>>>           As John pointed out we enter a race condition between next
>>>           IDE command and async transfer otherwise. This is sill not
>>>           optimal, but it fixes the NFS down problems for all cases where
>>>           the NFS server goes down while there is no PIO CD activity.
>>>           Of course, it could still happen during a PIO transfer, but I
>>>           expect this to be the unlikelier case.
>>>           I spent some effort trying to read more sectors at once and
>>>           avoiding continuation of elementary transfers, but with
>>>           whatever I came up it was destroying migration between different
>>>           Qemu versions. I have a quite hackish patch that works and
>>>           should survive migration, but I am not happy with it. So I
>>>           would like to start with this version as it is a big improvement
>>>           already.
>>>         - Dropped Patch 5 because it is upstream meanwhile.
>>>
>>> Peter Lieven (6):
>>>   ide/atapi: make PIO read requests async
>>>   block: add blk_abort_aio_request
>>>   ide: add support for IDEBufferedRequest
>>>   ide: orphan all buffered requests on DMA cancel
>>>   ide: enable buffered requests for ATAPI devices
>>>   ide: enable buffered requests for PIO read requests
>>>
>>>  block/block-backend.c          |  17 +++----
>>>  hw/ide/atapi.c                 | 103 +++++++++++++++++++++++++++++++++++------
>>>  hw/ide/core.c                  |  51 +++++++++++++++++++-
>>>  hw/ide/internal.h              |  14 ++++++
>>>  hw/ide/pci.c                   |  19 ++++++++
>>>  include/sysemu/block-backend.h |   3 ++
>>>  6 files changed, 182 insertions(+), 25 deletions(-)
>>>
>>
>> It looks sane to me:
>>
>> Reviewed-by: John Snow <jsnow@redhat.com>
>>
>> Fam, Stefan: Do you think this is still sane for 2.5? I am inclined to
>> get it in as a fix, especially since we've been bouncing it around for
>> so long.
> 
> I have no objection here, there is no new feature in the series, and guest
> hanging right after installing something is admittedly annoying to users.
> 
> Thanks,
> Fam
> 

I will send a PR. Fam, should I add your R-B?

--js

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure
  2015-11-16 19:24     ` John Snow
@ 2015-11-17  4:17       ` Fam Zheng
  0 siblings, 0 replies; 16+ messages in thread
From: Fam Zheng @ 2015-11-17  4:17 UTC (permalink / raw)
  To: John Snow
  Cc: kwolf, qemu-block, jcody, Peter Lieven, qemu-devel, Stefan Hajnoczi

On Mon, 11/16 14:24, John Snow wrote:
> >> It looks sane to me:
> >>
> >> Reviewed-by: John Snow <jsnow@redhat.com>
> >>
> >> Fam, Stefan: Do you think this is still sane for 2.5? I am inclined to
> >> get it in as a fix, especially since we've been bouncing it around for
> >> so long.
> > 
> > I have no objection here, there is no new feature in the series, and guest
> > hanging right after installing something is admittedly annoying to users.
> > 
> > Thanks,
> > Fam
> > 
> 
> I will send a PR. Fam, should I add your R-B?
> 
> --js

For patches 2-6:

Reviewed-by: Fam Zheng <famz@redhat.com>

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-11-17  4:17 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-12 16:30 [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Peter Lieven
2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 1/6] ide/atapi: make PIO read requests async Peter Lieven
2015-11-13 22:42   ` John Snow
2015-11-13 23:00     ` Peter Lieven
2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 2/6] block: add blk_abort_aio_request Peter Lieven
2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 3/6] ide: add support for IDEBufferedRequest Peter Lieven
2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 4/6] ide: orphan all buffered requests on DMA cancel Peter Lieven
2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 5/6] ide: enable buffered requests for ATAPI devices Peter Lieven
2015-11-12 16:30 ` [Qemu-devel] [PATCH V4 6/6] ide: enable buffered requests for PIO read requests Peter Lieven
2015-11-13 11:13 ` [Qemu-devel] [PATCH V4 0/6] ide: avoid main-loop hang on CDROM/NFS failure Mark Cave-Ayland
2015-11-13 17:18   ` John Snow
2015-11-13 22:44 ` John Snow
2015-11-16  6:17   ` Fam Zheng
2015-11-16  6:48     ` Peter Lieven
2015-11-16 19:24     ` John Snow
2015-11-17  4:17       ` Fam Zheng

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.