* [PULL 0/5] Block patches for 4.2-rc0
@ 2019-11-04  9:03 Max Reitz
  2019-11-04  9:03 ` [PULL 1/5] nvme: fix NSSRS offset in CAP register Max Reitz
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Max Reitz @ 2019-11-04  9:03 UTC
  To: qemu-block; +Cc: Kevin Wolf, Peter Maydell, qemu-devel, Max Reitz

The following changes since commit 36609b4fa36f0ac934874371874416f7533a5408:

  Merge remote-tracking branch 'remotes/palmer/tags/palmer-for-master-4.2-sf1' into staging (2019-11-02 17:59:03 +0000)

are available in the Git repository at:

  https://github.com/XanClic/qemu.git tags/pull-block-2019-11-04

for you to fetch changes up to 292d06b925b2787ee6f2430996b95651cae42fce:

  block/file-posix: Let post-EOF fallocate serialize (2019-11-04 09:33:51 +0100)

----------------------------------------------------------------
Block patches for 4.2-rc0:
- Work around XFS write-zeroes bug in file-posix block driver
- Fix backup job with compression
- Fix to the NVMe block driver header

----------------------------------------------------------------
Klaus Jensen (1):
  nvme: fix NSSRS offset in CAP register

Max Reitz (3):
  block: Make wait/mark serialising requests public
  block: Add bdrv_co_get_self_request()
  block/file-posix: Let post-EOF fallocate serialize

Vladimir Sementsov-Ogievskiy (1):
  block/block-copy: fix s->copy_size for compressed cluster

 include/block/block_int.h |  4 ++++
 include/block/nvme.h      |  2 +-
 block/block-copy.c        |  4 ++--
 block/file-posix.c        | 36 +++++++++++++++++++++++++++++++++
 block/io.c                | 42 ++++++++++++++++++++++++++++-----------
 5 files changed, 73 insertions(+), 15 deletions(-)

-- 
2.21.0




* [PULL 1/5] nvme: fix NSSRS offset in CAP register
  2019-11-04  9:03 [PULL 0/5] Block patches for 4.2-rc0 Max Reitz
@ 2019-11-04  9:03 ` Max Reitz
  2019-11-04  9:03 ` [PULL 2/5] block/block-copy: fix s->copy_size for compressed cluster Max Reitz
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Max Reitz @ 2019-11-04  9:03 UTC
  To: qemu-block; +Cc: Kevin Wolf, Peter Maydell, qemu-devel, Max Reitz

From: Klaus Jensen <its@irrelevant.dk>

Fix the offset of the NSSRS field in the CAP register.

According to NVMe 1.4, section 3 ("Controller Registers"), subsection 3.1.1
("Offset 0h: CAP – Controller Capabilities"), NSSRS is bit 36 of CAP, so
CAP_NSSRS_SHIFT must be 36, not 33.

Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Reported-by: Javier Gonzalez <javier.gonz@samsung.com>
Message-id: 20191023073315.446534-1-its@irrelevant.dk
Reviewed-by: John Snow <jsnow@redhat.com>
[mreitz: Added John's note on the location in the specification where
         this information can be found]
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/block/nvme.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index ab5943b90a..8fb941c653 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -23,7 +23,7 @@ enum NvmeCapShift {
     CAP_AMS_SHIFT      = 17,
     CAP_TO_SHIFT       = 24,
     CAP_DSTRD_SHIFT    = 32,
-    CAP_NSSRS_SHIFT    = 33,
+    CAP_NSSRS_SHIFT    = 36,
     CAP_CSS_SHIFT      = 37,
     CAP_MPSMIN_SHIFT   = 48,
     CAP_MPSMAX_SHIFT   = 52,
-- 
2.21.0
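
For illustration, a minimal sketch of how these shift values are used to
decode a raw 64-bit CAP value. The helper names cap_nssrs() and cap_dstrd()
are hypothetical and not part of QEMU, and the 4-bit width of DSTRD
(bits 35:32) is taken from the NVMe specification rather than from this
patch. With the old shift of 33, the NSSRS test would have read a bit that
belongs to DSTRD.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Shift values as they appear in enum NvmeCapShift after this patch. */
enum {
    CAP_DSTRD_SHIFT = 32,
    CAP_NSSRS_SHIFT = 36,
    CAP_CSS_SHIFT   = 37,
};

/* Hypothetical helper: true if the NSSRS bit is set in a raw CAP value. */
static inline bool cap_nssrs(uint64_t cap)
{
    return (cap >> CAP_NSSRS_SHIFT) & 0x1;
}

/*
 * Hypothetical helper: extract DSTRD, assumed here to be the 4-bit
 * field in bits 35:32 (per the NVMe specification).
 */
static inline unsigned cap_dstrd(uint64_t cap)
{
    return (unsigned)((cap >> CAP_DSTRD_SHIFT) & 0xf);
}
```

A CAP value with only bit 33 set decodes as DSTRD = 2 with NSSRS clear,
which is the case the old shift value would have misreported.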




* [PULL 2/5] block/block-copy: fix s->copy_size for compressed cluster
  2019-11-04  9:03 [PULL 0/5] Block patches for 4.2-rc0 Max Reitz
  2019-11-04  9:03 ` [PULL 1/5] nvme: fix NSSRS offset in CAP register Max Reitz
@ 2019-11-04  9:03 ` Max Reitz
  2019-11-04  9:03 ` [PULL 3/5] block: Make wait/mark serialising requests public Max Reitz
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Max Reitz @ 2019-11-04  9:03 UTC
  To: qemu-block; +Cc: Kevin Wolf, Peter Maydell, qemu-devel, Max Reitz

From: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Commit 0e2402452f1f20429 allowed writes larger than a cluster, but
compressed writes do not support that. Limit s->copy_size to the cluster
size when compression is requested.

Fixes: 0e2402452f1f20429
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20191029150934.26416-1-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/block-copy.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index c39cc9cffe..79798a1567 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -109,9 +109,9 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
         s->use_copy_range = false;
         s->copy_size = cluster_size;
     } else if (write_flags & BDRV_REQ_WRITE_COMPRESSED) {
-        /* Compression is not supported for copy_range */
+        /* Compression supports only cluster-size writes and no copy-range. */
         s->use_copy_range = false;
-        s->copy_size = MAX(cluster_size, BLOCK_COPY_MAX_BUFFER);
+        s->copy_size = cluster_size;
     } else {
         /*
          * copy_range does not respect max_transfer (it's a TODO), so we factor
-- 
2.21.0
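
A small sketch of the effect of this change on the compressed branch,
comparing the copy size before and after the patch. The function names and
SKETCH_MAX_BUFFER are hypothetical; in particular the 1 MiB value is an
assumption made for the example, so check BLOCK_COPY_MAX_BUFFER in
block/block-copy.c for the real figure.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed buffer limit, for illustration only. */
#define SKETCH_MAX_BUFFER (INT64_C(1024) * 1024)

/* Before the patch: MAX(cluster_size, BLOCK_COPY_MAX_BUFFER). */
static inline int64_t sketch_copy_size_before(int64_t cluster_size)
{
    assert(cluster_size > 0);
    return cluster_size > SKETCH_MAX_BUFFER ? cluster_size : SKETCH_MAX_BUFFER;
}

/* After the patch: compressed writes are limited to one cluster. */
static inline int64_t sketch_copy_size_after(int64_t cluster_size)
{
    assert(cluster_size > 0);
    return cluster_size;
}

/* How many clusters a single write of @copy_size bytes would span. */
static inline int64_t sketch_clusters_per_write(int64_t copy_size,
                                                int64_t cluster_size)
{
    return (copy_size + cluster_size - 1) / cluster_size;
}
```

Under these assumptions, a 64 KiB cluster size would have produced writes
spanning 16 clusters before the patch and exactly one cluster afterwards.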




* [PULL 3/5] block: Make wait/mark serialising requests public
  2019-11-04  9:03 [PULL 0/5] Block patches for 4.2-rc0 Max Reitz
  2019-11-04  9:03 ` [PULL 1/5] nvme: fix NSSRS offset in CAP register Max Reitz
  2019-11-04  9:03 ` [PULL 2/5] block/block-copy: fix s->copy_size for compressed cluster Max Reitz
@ 2019-11-04  9:03 ` Max Reitz
  2019-11-04  9:03 ` [PULL 4/5] block: Add bdrv_co_get_self_request() Max Reitz
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Max Reitz @ 2019-11-04  9:03 UTC
  To: qemu-block; +Cc: Kevin Wolf, Peter Maydell, qemu-devel, Max Reitz

Make both bdrv_mark_request_serialising() and
bdrv_wait_serialising_requests() public so they can be used from block
drivers.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191101152510.11719-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/block/block_int.h |  3 +++
 block/io.c                | 24 ++++++++++++------------
 2 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 02dc0034a2..32fa323b63 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -999,6 +999,9 @@ extern unsigned int bdrv_drain_all_count;
 void bdrv_apply_subtree_drain(BdrvChild *child, BlockDriverState *new_parent);
 void bdrv_unapply_subtree_drain(BdrvChild *child, BlockDriverState *old_parent);
 
+bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self);
+void bdrv_mark_request_serialising(BdrvTrackedRequest *req, uint64_t align);
+
 int get_tmp_filename(char *filename, int size);
 BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
                             const char *filename);
diff --git a/block/io.c b/block/io.c
index 02659f994d..039c0d49c9 100644
--- a/block/io.c
+++ b/block/io.c
@@ -715,7 +715,7 @@ static void tracked_request_begin(BdrvTrackedRequest *req,
     qemu_co_mutex_unlock(&bs->reqs_lock);
 }
 
-static void mark_request_serialising(BdrvTrackedRequest *req, uint64_t align)
+void bdrv_mark_request_serialising(BdrvTrackedRequest *req, uint64_t align)
 {
     int64_t overlap_offset = req->offset & ~(align - 1);
     uint64_t overlap_bytes = ROUND_UP(req->offset + req->bytes, align)
@@ -805,7 +805,7 @@ void bdrv_dec_in_flight(BlockDriverState *bs)
     bdrv_wakeup(bs);
 }
 
-static bool coroutine_fn wait_serialising_requests(BdrvTrackedRequest *self)
+bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self)
 {
     BlockDriverState *bs = self->bs;
     BdrvTrackedRequest *req;
@@ -1437,14 +1437,14 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
          * with each other for the same cluster.  For example, in copy-on-read
          * it ensures that the CoR read and write operations are atomic and
          * guest writes cannot interleave between them. */
-        mark_request_serialising(req, bdrv_get_cluster_size(bs));
+        bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs));
     }
 
     /* BDRV_REQ_SERIALISING is only for write operation */
     assert(!(flags & BDRV_REQ_SERIALISING));
 
     if (!(flags & BDRV_REQ_NO_SERIALISING)) {
-        wait_serialising_requests(req);
+        bdrv_wait_serialising_requests(req);
     }
 
     if (flags & BDRV_REQ_COPY_ON_READ) {
@@ -1841,10 +1841,10 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
     assert(!(flags & ~BDRV_REQ_MASK));
 
     if (flags & BDRV_REQ_SERIALISING) {
-        mark_request_serialising(req, bdrv_get_cluster_size(bs));
+        bdrv_mark_request_serialising(req, bdrv_get_cluster_size(bs));
     }
 
-    waited = wait_serialising_requests(req);
+    waited = bdrv_wait_serialising_requests(req);
 
     assert(!waited || !req->serialising ||
            is_request_serialising_and_aligned(req));
@@ -2008,8 +2008,8 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BdrvChild *child,
 
     padding = bdrv_init_padding(bs, offset, bytes, &pad);
     if (padding) {
-        mark_request_serialising(req, align);
-        wait_serialising_requests(req);
+        bdrv_mark_request_serialising(req, align);
+        bdrv_wait_serialising_requests(req);
 
         bdrv_padding_rmw_read(child, req, &pad, true);
 
@@ -2111,8 +2111,8 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
     }
 
     if (bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad)) {
-        mark_request_serialising(&req, align);
-        wait_serialising_requests(&req);
+        bdrv_mark_request_serialising(&req, align);
+        bdrv_wait_serialising_requests(&req);
         bdrv_padding_rmw_read(child, &req, &pad, false);
     }
 
@@ -3205,7 +3205,7 @@ static int coroutine_fn bdrv_co_copy_range_internal(
         /* BDRV_REQ_SERIALISING is only for write operation */
         assert(!(read_flags & BDRV_REQ_SERIALISING));
         if (!(read_flags & BDRV_REQ_NO_SERIALISING)) {
-            wait_serialising_requests(&req);
+            bdrv_wait_serialising_requests(&req);
         }
 
         ret = src->bs->drv->bdrv_co_copy_range_from(src->bs,
@@ -3336,7 +3336,7 @@ int coroutine_fn bdrv_co_truncate(BdrvChild *child, int64_t offset, bool exact,
      * new area, we need to make sure that no write requests are made to it
      * concurrently or they might be overwritten by preallocation. */
     if (new_bytes) {
-        mark_request_serialising(&req, 1);
+        bdrv_mark_request_serialising(&req, 1);
     }
     if (bs->read_only) {
         error_setg(errp, "Image is read-only");
-- 
2.21.0
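
For readers unfamiliar with the alignment arithmetic at the top of
bdrv_mark_request_serialising(), here is a standalone sketch of how a
request's overlap range is widened to alignment boundaries. The Sketch*
and sketch_* names are hypothetical. The hunk above cuts off the second
expression, so the sketch assumes it subtracts overlap_offset, and it
assumes align is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct SketchRange {
    int64_t offset;   /* start of the widened range */
    uint64_t bytes;   /* length of the widened range */
} SketchRange;

/* Round @n up to a multiple of @align (power of two). */
static inline uint64_t sketch_round_up(uint64_t n, uint64_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Widen [offset, offset + bytes) to @align boundaries on both sides. */
static inline SketchRange sketch_overlap(int64_t offset, uint64_t bytes,
                                         uint64_t align)
{
    SketchRange r;

    assert(align && (align & (align - 1)) == 0);
    r.offset = offset & ~(int64_t)(align - 1);
    r.bytes = sketch_round_up((uint64_t)offset + bytes, align)
              - (uint64_t)r.offset;
    return r;
}

/* True if the two widened ranges share at least one byte. */
static inline bool sketch_overlaps(SketchRange a, SketchRange b)
{
    return (uint64_t)a.offset < (uint64_t)b.offset + b.bytes &&
           (uint64_t)b.offset < (uint64_t)a.offset + a.bytes;
}
```

With align = 512, requests at [1000, 1100) and [1500, 1510) do not touch
each other byte-wise, but their widened ranges do, so a serialising request
would wait for the other one.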




* [PULL 4/5] block: Add bdrv_co_get_self_request()
  2019-11-04  9:03 [PULL 0/5] Block patches for 4.2-rc0 Max Reitz
                   ` (2 preceding siblings ...)
  2019-11-04  9:03 ` [PULL 3/5] block: Make wait/mark serialising requests public Max Reitz
@ 2019-11-04  9:03 ` Max Reitz
  2019-11-04  9:03 ` [PULL 5/5] block/file-posix: Let post-EOF fallocate serialize Max Reitz
  2019-11-06 11:56 ` [PULL 0/5] Block patches for 4.2-rc0 Peter Maydell
  5 siblings, 0 replies; 7+ messages in thread
From: Max Reitz @ 2019-11-04  9:03 UTC
  To: qemu-block; +Cc: Kevin Wolf, Peter Maydell, qemu-devel, Max Reitz

Add bdrv_co_get_self_request(), which returns the tracked request on a
BlockDriverState for the current coroutine, or NULL if there is none.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191101152510.11719-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 include/block/block_int.h |  1 +
 block/io.c                | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 32fa323b63..dd033d0b37 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1001,6 +1001,7 @@ void bdrv_unapply_subtree_drain(BdrvChild *child, BlockDriverState *old_parent);
 
 bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self);
 void bdrv_mark_request_serialising(BdrvTrackedRequest *req, uint64_t align);
+BdrvTrackedRequest *coroutine_fn bdrv_co_get_self_request(BlockDriverState *bs);
 
 int get_tmp_filename(char *filename, int size);
 BlockDriver *bdrv_probe_all(const uint8_t *buf, int buf_size,
diff --git a/block/io.c b/block/io.c
index 039c0d49c9..f75777f5ea 100644
--- a/block/io.c
+++ b/block/io.c
@@ -742,6 +742,24 @@ static bool is_request_serialising_and_aligned(BdrvTrackedRequest *req)
            (req->bytes == req->overlap_bytes);
 }
 
+/**
+ * Return the tracked request on @bs for the current coroutine, or
+ * NULL if there is none.
+ */
+BdrvTrackedRequest *coroutine_fn bdrv_co_get_self_request(BlockDriverState *bs)
+{
+    BdrvTrackedRequest *req;
+    Coroutine *self = qemu_coroutine_self();
+
+    QLIST_FOREACH(req, &bs->tracked_requests, list) {
+        if (req->co == self) {
+            return req;
+        }
+    }
+
+    return NULL;
+}
+
 /**
  * Round a region to cluster boundaries
  */
-- 
2.21.0
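
The lookup added here is a linear search keyed on the owning coroutine. A
self-contained sketch of the same pattern, using a plain singly linked list
and an opaque owner pointer in place of QLIST and Coroutine (SketchRequest
and sketch_get_self_request() are hypothetical names, not QEMU API):

```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchRequest {
    const void *owner;            /* stands in for req->co */
    struct SketchRequest *next;   /* stands in for the QLIST link */
} SketchRequest;

/*
 * Return the first request in the list owned by @self, or NULL if
 * there is none.
 */
static inline SketchRequest *sketch_get_self_request(SketchRequest *head,
                                                     const void *self)
{
    assert(self != NULL);
    for (SketchRequest *req = head; req; req = req->next) {
        if (req->owner == self) {
            return req;
        }
    }
    return NULL;
}
```

As in the patch, the caller has to handle the NULL case; patch 5/5 does so
with an assertion because the driver callback is expected to run inside a
tracked request.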




* [PULL 5/5] block/file-posix: Let post-EOF fallocate serialize
  2019-11-04  9:03 [PULL 0/5] Block patches for 4.2-rc0 Max Reitz
                   ` (3 preceding siblings ...)
  2019-11-04  9:03 ` [PULL 4/5] block: Add bdrv_co_get_self_request() Max Reitz
@ 2019-11-04  9:03 ` Max Reitz
  2019-11-06 11:56 ` [PULL 0/5] Block patches for 4.2-rc0 Peter Maydell
  5 siblings, 0 replies; 7+ messages in thread
From: Max Reitz @ 2019-11-04  9:03 UTC
  To: qemu-block; +Cc: Kevin Wolf, Peter Maydell, qemu-devel, Max Reitz

The XFS kernel driver has a bug that may cause data corruption for qcow2
images as of qemu commit c8bb23cbdbe32f.  We can work around it by
treating post-EOF fallocates as serializing up until infinity (INT64_MAX
in practice).

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191101152510.11719-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 block/file-posix.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index 0b7e904d48..1f0f61a02b 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2721,6 +2721,42 @@ raw_do_pwrite_zeroes(BlockDriverState *bs, int64_t offset, int bytes,
     RawPosixAIOData acb;
     ThreadPoolFunc *handler;
 
+#ifdef CONFIG_FALLOCATE
+    if (offset + bytes > bs->total_sectors * BDRV_SECTOR_SIZE) {
+        BdrvTrackedRequest *req;
+        uint64_t end;
+
+        /*
+         * This is a workaround for a bug in the Linux XFS driver,
+         * where writes submitted through the AIO interface will be
+         * discarded if they happen beyond a concurrently running
+         * fallocate() that increases the file length (i.e., both the
+         * write and the fallocate() happen beyond the EOF).
+         *
+         * To work around it, we extend the tracked request for this
+         * zero write until INT64_MAX (effectively infinity), and mark
+         * it as serializing.
+         *
+         * We have to enable this workaround for all filesystems and
+         * AIO modes (not just XFS with aio=native), because for
+         * remote filesystems we do not know the host configuration.
+         */
+
+        req = bdrv_co_get_self_request(bs);
+        assert(req);
+        assert(req->type == BDRV_TRACKED_WRITE);
+        assert(req->offset <= offset);
+        assert(req->offset + req->bytes >= offset + bytes);
+
+        end = INT64_MAX & -(uint64_t)bs->bl.request_alignment;
+        req->bytes = end - req->offset;
+        req->overlap_bytes = req->bytes;
+
+        bdrv_mark_request_serialising(req, bs->bl.request_alignment);
+        bdrv_wait_serialising_requests(req);
+    }
+#endif
+
     acb = (RawPosixAIOData) {
         .bs             = bs,
         .aio_fildes     = s->fd,
-- 
2.21.0
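
The expression INT64_MAX & -(uint64_t)request_alignment rounds INT64_MAX
down to a multiple of the request alignment. A standalone sketch of that
arithmetic follows; the sketch_* names are hypothetical and the alignment
is assumed to be a power of two. Presumably the rounding down is needed
because bdrv_mark_request_serialising() rounds the request end up to the
alignment, which would overflow for an unaligned INT64_MAX.

```c
#include <assert.h>
#include <stdint.h>

/* Largest offset <= INT64_MAX that is a multiple of @request_alignment. */
static inline uint64_t sketch_serialising_end(uint64_t request_alignment)
{
    assert(request_alignment &&
           (request_alignment & (request_alignment - 1)) == 0);
    /* Unsigned negation of a power of two yields a mask of the high bits. */
    return (uint64_t)INT64_MAX & -request_alignment;
}

/* New length of a request starting at @req_offset after the extension. */
static inline uint64_t sketch_extended_bytes(uint64_t req_offset,
                                             uint64_t request_alignment)
{
    return sketch_serialising_end(request_alignment) - req_offset;
}
```

For a 512-byte alignment the end is 0x7FFFFFFFFFFFFE00, so the extended
request covers everything from its original offset up to that point.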




* Re: [PULL 0/5] Block patches for 4.2-rc0
  2019-11-04  9:03 [PULL 0/5] Block patches for 4.2-rc0 Max Reitz
                   ` (4 preceding siblings ...)
  2019-11-04  9:03 ` [PULL 5/5] block/file-posix: Let post-EOF fallocate serialize Max Reitz
@ 2019-11-06 11:56 ` Peter Maydell
  5 siblings, 0 replies; 7+ messages in thread
From: Peter Maydell @ 2019-11-06 11:56 UTC
  To: Max Reitz; +Cc: Kevin Wolf, QEMU Developers, Qemu-block

On Mon, 4 Nov 2019 at 09:03, Max Reitz <mreitz@redhat.com> wrote:
>
> The following changes since commit 36609b4fa36f0ac934874371874416f7533a5408:
>
>   Merge remote-tracking branch 'remotes/palmer/tags/palmer-for-master-4.2-sf1' into staging (2019-11-02 17:59:03 +0000)
>
> are available in the Git repository at:
>
>   https://github.com/XanClic/qemu.git tags/pull-block-2019-11-04
>
> for you to fetch changes up to 292d06b925b2787ee6f2430996b95651cae42fce:
>
>   block/file-posix: Let post-EOF fallocate serialize (2019-11-04 09:33:51 +0100)
>
> ----------------------------------------------------------------
> Block patches for 4.2-rc0:
> - Work around XFS write-zeroes bug in file-posix block driver
> - Fix backup job with compression
> - Fix to the NVMe block driver header
>

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.2
for any user-visible changes.

-- PMM


