[Qemu-devel] [PATCH] nbd: fix trim/discard commands with a length bigger than NBD_MAX_BUFFER_SIZE

From: Quentin Casasnovas @ 2016-05-06  8:45 UTC
  To: qemu-devel; +Cc: Quentin Casasnovas, Paolo Bonzini, qemu-stable, qemu-trivial

When running fstrim on a filesystem mounted through qemu-nbd with
--discard=on, fstrim would fail with I/O errors:

  $ fstrim /k/spl/ice/
  fstrim: /k/spl/ice/: FITRIM ioctl failed: Input/output error

and qemu-nbd logged the following:

  nbd.c:nbd_co_receive_request():L1232: len (94621696) is larger than max len (33554432)

Enabling debug output on the NBD driver in the Linux kernel showed that the
request length sent by the kernel matched the one qemu-nbd received, and
that qemu-nbd returned error code 22 (EINVAL):

  EXT4-fs (nbd0p1): mounted filesystem with ordered data mode. Opts: discard
  block nbd0: request ffff880094c0cc18: dequeued (flags=1)
  block nbd0: request ffff880094c0cc18: sending control (read@5255168,4096B)
  block nbd0: request ffff880094c0cc18: got reply
  block nbd0: request ffff880094c0cc18: got 4096 bytes data
  block nbd0: request ffff880094c0cc18: done
  block nbd0: request ffff8801728796d8: dequeued (flags=1)
  block nbd0: request ffff8801728796d8: sending control (trim/discard@39464960,45056B)
  block nbd0: request ffff8801728796d8: got reply
  block nbd0: request ffff8801728796d8: done
  block nbd0: request ffff880172879ae0: dequeued (flags=1)
  block nbd0: request ffff880172879ae0: sending control (trim/discard@39653376,16384B)
  block nbd0: request ffff880172879ae0: got reply
  block nbd0: request ffff880172879ae0: done
  block nbd0: request ffff880172879d90: dequeued (flags=1)
  block nbd0: request ffff880172879d90: sending control (trim/discard@40644608,94621696B)
                                                                               ^^^^^^^^
  block nbd0: Other side returned error (22)
                                         ^^

The request length looks huge, but it is only the filesystem telling the
block device driver which range should be trimmed.  Unlike an NBD_CMD_READ
or NBD_CMD_WRITE, an NBD_CMD_TRIM carries no payload, so that amount of
data is never read from or written to the NBD socket.  It is thus safe to
skip the length check for an NBD_CMD_TRIM.
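
To illustrate the reasoning, here is a minimal sketch (not the patch itself)
of how the payload that has to be buffered depends on the command type.  The
sketch_* helpers are hypothetical, and the constant values are assumptions
taken from the NBD protocol and from the 33554432 limit shown in the log:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; the sketch_* names are hypothetical, not part of nbd.c. */
#define NBD_CMD_READ            0
#define NBD_CMD_WRITE           1
#define NBD_CMD_TRIM            4
#define NBD_CMD_MASK_COMMAND    0x0000ffff
#define NBD_MAX_BUFFER_SIZE     (32 * 1024 * 1024)  /* 33554432, as logged */

/* Number of payload bytes that must be staged in a buffer for a request. */
static uint32_t sketch_payload_bytes(uint32_t type, uint32_t len)
{
    switch (type & NBD_CMD_MASK_COMMAND) {
    case NBD_CMD_READ:   /* data is sent back to the client */
    case NBD_CMD_WRITE:  /* data follows the request header */
        return len;
    default:             /* e.g. NBD_CMD_TRIM: len only describes a range */
        return 0;
    }
}

/* A request is acceptable when its payload fits in the buffer. */
static bool sketch_request_fits(uint32_t type, uint32_t len)
{
    return sketch_payload_bytes(type, len) <= NBD_MAX_BUFFER_SIZE;
}
```

With these assumptions, the 94621696-byte trim from the log passes the
check, while a write of the same length would still be rejected.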

I've confirmed this both against the protocol documentation at:

 https://github.com/yoe/nbd/blob/master/doc/proto.md

and by looking at the kernel-side implementation of the nbd device
(drivers/block/nbd.c), which sends only the request header, with no data,
for an NBD_CMD_TRIM.
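
As a rough illustration of "header only", the following sketch encodes a
trim request the way the classic NBD wire format lays it out (magic, type,
handle, offset, length, all big-endian, 28 bytes in total).  This is neither
the kernel's nor QEMU's code; the sketch_* and SKETCH_* names are
hypothetical, and the layout is my reading of the protocol document:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NBD_REQUEST_MAGIC 0x25609513u
#define SKETCH_NBD_CMD_TRIM      4u
#define SKETCH_NBD_REQUEST_SIZE  28u    /* 4 + 4 + 8 + 8 + 4 bytes */

/* Store a 32-bit value in network (big-endian) byte order. */
static void sketch_put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Store a 64-bit value in network (big-endian) byte order. */
static void sketch_put_be64(uint8_t *p, uint64_t v)
{
    sketch_put_be32(p, (uint32_t)(v >> 32));
    sketch_put_be32(p + 4, (uint32_t)v);
}

/* Encode a trim request; returns the number of bytes to send. */
static size_t sketch_encode_trim(uint8_t *buf, uint64_t handle,
                                 uint64_t offset, uint32_t len)
{
    sketch_put_be32(buf, SKETCH_NBD_REQUEST_MAGIC);
    sketch_put_be32(buf + 4, SKETCH_NBD_CMD_TRIM);
    sketch_put_be64(buf + 8, handle);
    sketch_put_be64(buf + 16, offset);
    sketch_put_be32(buf + 24, len);     /* range length, not payload size */
    return SKETCH_NBD_REQUEST_SIZE;
}
```

Whatever value len holds, the encoded request stays 28 bytes long, which is
why a large trim length says nothing about how much data crosses the socket.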

With this fix in, I am now able to run fstrim on my qcow2 images and keep
them small (or at least keep their size proportional to the amount of data
present on them).

Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: <qemu-devel@nongnu.org>
CC: <qemu-stable@nongnu.org>
CC: <qemu-trivial@nongnu.org>
---
 nbd.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/nbd.c b/nbd.c
index b3d9654..e733669 100644
--- a/nbd.c
+++ b/nbd.c
@@ -1209,6 +1209,11 @@ static ssize_t nbd_co_send_reply(NBDRequest *req, struct nbd_reply *reply,
     return rc;
 }
 
+static bool nbd_should_check_request_size(const struct nbd_request *request)
+{
+    return (request->type & NBD_CMD_MASK_COMMAND) != NBD_CMD_TRIM;
+}
+
 static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request *request)
 {
     NBDClient *client = req->client;
@@ -1227,7 +1232,8 @@ static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request *reque
         goto out;
     }
 
-    if (request->len > NBD_MAX_BUFFER_SIZE) {
+    if (nbd_should_check_request_size(request) &&
+        request->len > NBD_MAX_BUFFER_SIZE) {
         LOG("len (%u) is larger than max len (%u)",
             request->len, NBD_MAX_BUFFER_SIZE);
         rc = -EINVAL;
-- 
2.8.1