All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/7] block: file-posix queue
@ 2021-06-08 13:16 Paolo Bonzini
  2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, qemu-block

Hi Kevin,

this is a combination of two series that both affect host block device
support in block/file-posix.c.  Joelle's series is unchanged, while
mine was adjusted according to your review of v2.

v1->v2: add missing patch

v2->v3: add max_hw_transfer to BlockLimits

v3->v4: fix compilation after patch 1, tweak commit messages according
        to Vladimir's review

Joelle van Dyne (3):
  block: feature detection for host block support
  block: check for sys/disk.h
  block: detect DKIOCGETBLOCKCOUNT/SIZE before use

Paolo Bonzini (4):
  file-posix: fix max_iov for /dev/sg devices
  scsi-generic: pass max_segments via max_iov field in BlockLimits
  block: add max_hw_transfer to BlockLimits
  file-posix: try BLKSECTGET on block devices too, do not round to power
    of 2

 block.c                        |   2 +-
 block/block-backend.c          |  12 ++++
 block/file-posix.c             | 104 ++++++++++++++++++++-------------
 block/io.c                     |   1 +
 hw/scsi/scsi-generic.c         |   6 +-
 include/block/block_int.h      |   7 +++
 include/sysemu/block-backend.h |   1 +
 meson.build                    |   7 ++-
 qapi/block-core.json           |  10 +++-
 9 files changed, 102 insertions(+), 48 deletions(-)

-- 
2.31.1



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
  2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
  2021-06-08 17:34   ` Eric Blake
  2021-06-08 19:14   ` Vladimir Sementsov-Ogievskiy
  2021-06-08 13:16 ` [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits Paolo Bonzini
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, qemu-block

Even though it was only called for devices that have bs->sg set (which
must be character devices), sg_get_max_segments looked at /sys/dev/block
which only works for block devices.

On Linux the sg driver has its own way to provide the maximum number of
iovecs in a scatter/gather list, so add support for it.  The block device
path is kept because it will be reinstated in the next patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/file-posix.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index f37dfc10b3..536998a1d6 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
         goto out;
     }
 
+    if (S_ISCHR(st.st_mode)) {
+        if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
+            return ret;
+        }
+        return -ENOTSUP;
+    }
+
+    if (!S_ISBLK(st.st_mode)) {
+        return -ENOTSUP;
+    }
+
     sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
                                 major(st.st_rdev), minor(st.st_rdev));
     sysfd = open(sysfspath, O_RDONLY);
-- 
2.31.1




^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits
  2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
  2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
  2021-06-08 17:42   ` Eric Blake
  2021-06-09 16:10   ` Maxim Levitsky
  2021-06-08 13:16 ` [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits Paolo Bonzini
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, qemu-block

I/O to a disk via read/write is not limited by the number of segments allowed
by the host adapter; the kernel can split requests if needed, and the limit
imposed by the host adapter can be very low (256k or so) to avoid that SG_IO
returns EINVAL if memory is heavily fragmented.

Since this value is only interesting for SG_IO-based I/O, do not include
it in the max_transfer and only take it into account when patching the
block limits VPD page in the scsi-generic device.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/file-posix.c     | 3 +--
 hw/scsi/scsi-generic.c | 6 ++++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 536998a1d6..670c577bfe 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1239,8 +1239,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
 
         ret = sg_get_max_segments(s->fd);
         if (ret > 0) {
-            bs->bl.max_transfer = MIN(bs->bl.max_transfer,
-                                      ret * qemu_real_host_page_size);
+            bs->bl.max_iov = ret;
         }
     }
 
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 98c30c5d5c..82e1e2ee79 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -179,10 +179,12 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, SCSIDevice *s)
         (r->req.cmd.buf[1] & 0x01)) {
         page = r->req.cmd.buf[2];
         if (page == 0xb0) {
-            uint32_t max_transfer =
-                blk_get_max_transfer(s->conf.blk) / s->blocksize;
+            uint32_t max_transfer = blk_get_max_transfer(s->conf.blk);
+            uint32_t max_iov = blk_get_max_iov(s->conf.blk);
 
             assert(max_transfer);
+            max_transfer = MIN_NON_ZERO(max_transfer, max_iov * qemu_real_host_page_size)
+                / s->blocksize;
             stl_be_p(&r->buf[8], max_transfer);
             /* Also take care of the opt xfer len. */
             stl_be_p(&r->buf[12],
-- 
2.31.1




^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits
  2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
  2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
  2021-06-08 13:16 ` [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
  2021-06-08 17:48   ` Eric Blake
  2021-06-09 16:12   ` Maxim Levitsky
  2021-06-08 13:16 ` [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2 Paolo Bonzini
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, qemu-block

For block host devices, I/O can happen through either the kernel file
descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
or the SCSI passthrough ioctl SG_IO.

In the latter case, the size of each transfer can be limited by the
HBA, while for file descriptor I/O the kernel is able to split and
merge I/O in smaller pieces as needed.  Applying the HBA limits to
file descriptor I/O results in more system calls and suboptimal
performance, so this patch splits the max_transfer limit in two:
max_transfer remains valid and is used in general, while max_hw_transfer
is limited to the maximum hardware size.  max_hw_transfer can then be
included by the scsi-generic driver in the block limits page, to ensure
that the stricter hardware limit is used.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/block-backend.c          | 12 ++++++++++++
 block/file-posix.c             |  2 +-
 block/io.c                     |  1 +
 hw/scsi/scsi-generic.c         |  2 +-
 include/block/block_int.h      |  7 +++++++
 include/sysemu/block-backend.h |  1 +
 6 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 15f1ea4288..2ea1412a54 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1953,6 +1953,18 @@ uint32_t blk_get_request_alignment(BlockBackend *blk)
     return bs ? bs->bl.request_alignment : BDRV_SECTOR_SIZE;
 }
 
+/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
+uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    uint64_t max = INT_MAX;
+
+    if (bs) {
+        max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer);
+    }
+    return max;
+}
+
 /* Returns the maximum transfer length, in bytes; guaranteed nonzero */
 uint32_t blk_get_max_transfer(BlockBackend *blk)
 {
diff --git a/block/file-posix.c b/block/file-posix.c
index 670c577bfe..c9746d3eb6 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1234,7 +1234,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
         int ret = sg_get_max_transfer_length(s->fd);
 
         if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
-            bs->bl.max_transfer = pow2floor(ret);
+            bs->bl.max_hw_transfer = pow2floor(ret);
         }
 
         ret = sg_get_max_segments(s->fd);
diff --git a/block/io.c b/block/io.c
index 323854d063..089b99bb0c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -127,6 +127,7 @@ static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
 {
     dst->opt_transfer = MAX(dst->opt_transfer, src->opt_transfer);
     dst->max_transfer = MIN_NON_ZERO(dst->max_transfer, src->max_transfer);
+    dst->max_hw_transfer = MIN_NON_ZERO(dst->max_hw_transfer, src->max_hw_transfer);
     dst->opt_mem_alignment = MAX(dst->opt_mem_alignment,
                                  src->opt_mem_alignment);
     dst->min_mem_alignment = MAX(dst->min_mem_alignment,
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 82e1e2ee79..3762dce749 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -179,7 +179,7 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, SCSIDevice *s)
         (r->req.cmd.buf[1] & 0x01)) {
         page = r->req.cmd.buf[2];
         if (page == 0xb0) {
-            uint32_t max_transfer = blk_get_max_transfer(s->conf.blk);
+            uint64_t max_transfer = blk_get_max_hw_transfer(s->conf.blk);
             uint32_t max_iov = blk_get_max_iov(s->conf.blk);
 
             assert(max_transfer);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 057d88b1fc..f1a54db0f8 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -695,6 +695,13 @@ typedef struct BlockLimits {
      * clamped down. */
     uint32_t max_transfer;
 
+    /* Maximal hardware transfer length in bytes.  Applies whenever
+     * transfers to the device bypass the kernel I/O scheduler, for
+     * example with SG_IO.  If larger than max_transfer or if zero,
+     * blk_get_max_hw_transfer will fall back to max_transfer.
+     */
+    uint64_t max_hw_transfer;
+
     /* memory alignment, in bytes so that no bounce buffer is needed */
     size_t min_mem_alignment;
 
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 5423e3d9c6..9ac5f7bbd3 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -208,6 +208,7 @@ void blk_eject(BlockBackend *blk, bool eject_flag);
 int blk_get_flags(BlockBackend *blk);
 uint32_t blk_get_request_alignment(BlockBackend *blk);
 uint32_t blk_get_max_transfer(BlockBackend *blk);
+uint64_t blk_get_max_hw_transfer(BlockBackend *blk);
 int blk_get_max_iov(BlockBackend *blk);
 void blk_set_guest_block_size(BlockBackend *blk, int align);
 void *blk_try_blockalign(BlockBackend *blk, size_t size);
-- 
2.31.1




^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2
  2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
                   ` (2 preceding siblings ...)
  2021-06-08 13:16 ` [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
  2021-06-08 17:53   ` Eric Blake
  2021-06-09 16:15   ` Maxim Levitsky
  2021-06-08 13:16 ` [PATCH v4 5/7] block: feature detection for host block support Paolo Bonzini
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, qemu-block

bs->sg is only true for character devices, but block devices can also
be used with scsi-block and scsi-generic.  Unfortunately BLKSECTGET
returns bytes in an int for /dev/sgN devices, and sectors in a short
for block devices, so account for that in the code.

The maximum transfer also need not be a power of 2 (for example I have
seen disks with 1280 KiB maximum transfer) so there's no need to pass
the result through pow2floor.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/file-posix.c | 44 ++++++++++++++++++++++++--------------------
 1 file changed, 24 insertions(+), 20 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index c9746d3eb6..1439293f63 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1149,22 +1149,27 @@ static void raw_reopen_abort(BDRVReopenState *state)
     s->reopen_state = NULL;
 }
 
-static int sg_get_max_transfer_length(int fd)
+static int hdev_get_max_hw_transfer(int fd, struct stat *st)
 {
 #ifdef BLKSECTGET
-    int max_bytes = 0;
-
-    if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
-        return max_bytes;
+    if (S_ISBLK(st->st_mode)) {
+        unsigned short max_sectors = 0;
+        if (ioctl(fd, BLKSECTGET, &max_sectors) == 0) {
+            return max_sectors * 512;
+        }
     } else {
-        return -errno;
+        int max_bytes = 0;
+        if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
+            return max_bytes;
+        }
     }
+    return -errno;
 #else
     return -ENOSYS;
 #endif
 }
 
-static int sg_get_max_segments(int fd)
+static int hdev_get_max_segments(int fd, struct stat *st)
 {
 #ifdef CONFIG_LINUX
     char buf[32];
@@ -1173,26 +1178,20 @@ static int sg_get_max_segments(int fd)
     int ret;
     int sysfd = -1;
     long max_segments;
-    struct stat st;
 
-    if (fstat(fd, &st)) {
-        ret = -errno;
-        goto out;
-    }
-
-    if (S_ISCHR(st.st_mode)) {
+    if (S_ISCHR(st->st_mode)) {
         if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
             return ret;
         }
         return -ENOTSUP;
     }
 
-    if (!S_ISBLK(st.st_mode)) {
+    if (!S_ISBLK(st->st_mode)) {
         return -ENOTSUP;
     }
 
     sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
-                                major(st.st_rdev), minor(st.st_rdev));
+                                major(st->st_rdev), minor(st->st_rdev));
     sysfd = open(sysfspath, O_RDONLY);
     if (sysfd == -1) {
         ret = -errno;
@@ -1229,15 +1228,20 @@ out:
 static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
 {
     BDRVRawState *s = bs->opaque;
+    struct stat st;
+
+    if (fstat(s->fd, &st)) {
+        return;
+    }
 
-    if (bs->sg) {
-        int ret = sg_get_max_transfer_length(s->fd);
+    if (bs->sg || S_ISBLK(st.st_mode)) {
+        int ret = hdev_get_max_hw_transfer(s->fd, &st);
 
         if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
-            bs->bl.max_hw_transfer = pow2floor(ret);
+            bs->bl.max_hw_transfer = ret;
         }
 
-        ret = sg_get_max_segments(s->fd);
+        ret = hdev_get_max_segments(s->fd, &st);
         if (ret > 0) {
             bs->bl.max_iov = ret;
         }
-- 
2.31.1




^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 5/7] block: feature detection for host block support
  2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
                   ` (3 preceding siblings ...)
  2021-06-08 13:16 ` [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2 Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
  2021-06-23 15:44   ` Max Reitz
  2021-06-08 13:16 ` [PATCH v4 6/7] block: check for sys/disk.h Paolo Bonzini
  2021-06-08 13:16 ` [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use Paolo Bonzini
  6 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: kwolf, vsementsov, Joelle van Dyne, qemu-block

From: Joelle van Dyne <j@getutm.app>

On Darwin (iOS), there are no system level APIs for directly accessing
host block devices. We detect this at configure time.

Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-Id: <20210315180341.31638-2-j@getutm.app>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/file-posix.c   | 33 ++++++++++++++++++++++-----------
 meson.build          |  6 +++++-
 qapi/block-core.json | 10 +++++++---
 3 files changed, 34 insertions(+), 15 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 1439293f63..5821e1afed 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -42,6 +42,8 @@
 #include "scsi/constants.h"
 
 #if defined(__APPLE__) && (__MACH__)
+#include <sys/ioctl.h>
+#if defined(HAVE_HOST_BLOCK_DEVICE)
 #include <paths.h>
 #include <sys/param.h>
 #include <IOKit/IOKitLib.h>
@@ -52,6 +54,7 @@
 //#include <IOKit/storage/IOCDTypes.h>
 #include <IOKit/storage/IODVDMedia.h>
 #include <CoreFoundation/CoreFoundation.h>
+#endif /* defined(HAVE_HOST_BLOCK_DEVICE) */
 #endif
 
 #ifdef __sun__
@@ -180,7 +183,17 @@ typedef struct BDRVRawReopenState {
     bool check_cache_dropped;
 } BDRVRawReopenState;
 
-static int fd_open(BlockDriverState *bs);
+static int fd_open(BlockDriverState *bs)
+{
+    BDRVRawState *s = bs->opaque;
+
+    /* this is just to ensure s->fd is sane (its called by io ops) */
+    if (s->fd >= 0) {
+        return 0;
+    }
+    return -EIO;
+}
+
 static int64_t raw_getlength(BlockDriverState *bs);
 
 typedef struct RawPosixAIOData {
@@ -3028,6 +3041,7 @@ static BlockStatsSpecific *raw_get_specific_stats(BlockDriverState *bs)
     return stats;
 }
 
+#if defined(HAVE_HOST_BLOCK_DEVICE)
 static BlockStatsSpecific *hdev_get_specific_stats(BlockDriverState *bs)
 {
     BlockStatsSpecific *stats = g_new(BlockStatsSpecific, 1);
@@ -3037,6 +3051,7 @@ static BlockStatsSpecific *hdev_get_specific_stats(BlockDriverState *bs)
 
     return stats;
 }
+#endif /* HAVE_HOST_BLOCK_DEVICE */
 
 static QemuOptsList raw_create_opts = {
     .name = "raw-create-opts",
@@ -3252,6 +3267,8 @@ BlockDriver bdrv_file = {
 /***********************************************/
 /* host device */
 
+#if defined(HAVE_HOST_BLOCK_DEVICE)
+
 #if defined(__APPLE__) && defined(__MACH__)
 static kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
                                 CFIndex maxPathSize, int flags);
@@ -3544,16 +3561,6 @@ hdev_co_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
 }
 #endif /* linux */
 
-static int fd_open(BlockDriverState *bs)
-{
-    BDRVRawState *s = bs->opaque;
-
-    /* this is just to ensure s->fd is sane (its called by io ops) */
-    if (s->fd >= 0)
-        return 0;
-    return -EIO;
-}
-
 static coroutine_fn int
 hdev_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
 {
@@ -3877,6 +3884,8 @@ static BlockDriver bdrv_host_cdrom = {
 };
 #endif /* __FreeBSD__ */
 
+#endif /* HAVE_HOST_BLOCK_DEVICE */
+
 static void bdrv_file_init(void)
 {
     /*
@@ -3884,6 +3893,7 @@ static void bdrv_file_init(void)
      * registered last will get probed first.
      */
     bdrv_register(&bdrv_file);
+#if defined(HAVE_HOST_BLOCK_DEVICE)
     bdrv_register(&bdrv_host_device);
 #ifdef __linux__
     bdrv_register(&bdrv_host_cdrom);
@@ -3891,6 +3901,7 @@ static void bdrv_file_init(void)
 #if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
     bdrv_register(&bdrv_host_cdrom);
 #endif
+#endif /* HAVE_HOST_BLOCK_DEVICE */
 }
 
 block_init(bdrv_file_init);
diff --git a/meson.build b/meson.build
index 626cf932c1..53150866ac 100644
--- a/meson.build
+++ b/meson.build
@@ -183,7 +183,7 @@ if targetos == 'windows'
                                       include_directories: include_directories('.'))
 elif targetos == 'darwin'
   coref = dependency('appleframeworks', modules: 'CoreFoundation')
-  iokit = dependency('appleframeworks', modules: 'IOKit')
+  iokit = dependency('appleframeworks', modules: 'IOKit', required: false)
 elif targetos == 'sunos'
   socket = [cc.find_library('socket'),
             cc.find_library('nsl'),
@@ -1089,6 +1089,9 @@ if get_option('cfi')
   add_global_link_arguments(cfi_flags, native: false, language: ['c', 'cpp', 'objc'])
 endif
 
+have_host_block_device = (targetos != 'darwin' or
+    cc.has_header('IOKit/storage/IOMedia.h'))
+
 #################
 # config-host.h #
 #################
@@ -1183,6 +1186,7 @@ config_host_data.set('HAVE_PTY_H', cc.has_header('pty.h'))
 config_host_data.set('HAVE_SYS_IOCCOM_H', cc.has_header('sys/ioccom.h'))
 config_host_data.set('HAVE_SYS_KCOV_H', cc.has_header('sys/kcov.h'))
 config_host_data.set('HAVE_SYSTEM_FUNCTION', cc.has_function('system', prefix: '#include <stdlib.h>'))
+config_host_data.set('HAVE_HOST_BLOCK_DEVICE', have_host_block_device)
 
 config_host_data.set('CONFIG_PREADV', cc.has_function('preadv', prefix: '#include <sys/uio.h>'))
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 2ea294129e..2dd41be156 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -897,7 +897,8 @@
   'discriminator': 'driver',
   'data': {
       'file': 'BlockStatsSpecificFile',
-      'host_device': 'BlockStatsSpecificFile',
+      'host_device': { 'type': 'BlockStatsSpecificFile',
+                       'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
       'nvme': 'BlockStatsSpecificNvme' } }
 
 ##
@@ -2814,7 +2815,9 @@
 { 'enum': 'BlockdevDriver',
   'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
             'cloop', 'compress', 'copy-on-read', 'dmg', 'file', 'ftp', 'ftps',
-            'gluster', 'host_cdrom', 'host_device', 'http', 'https', 'iscsi',
+            'gluster', 'host_cdrom',
+            {'name': 'host_device', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
+            'http', 'https', 'iscsi',
             'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels',
             'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd',
             { 'name': 'replication', 'if': 'defined(CONFIG_REPLICATION)' },
@@ -3996,7 +3999,8 @@
       'ftps':       'BlockdevOptionsCurlFtps',
       'gluster':    'BlockdevOptionsGluster',
       'host_cdrom': 'BlockdevOptionsFile',
-      'host_device':'BlockdevOptionsFile',
+      'host_device': { 'type': 'BlockdevOptionsFile',
+                       'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
       'http':       'BlockdevOptionsCurlHttp',
       'https':      'BlockdevOptionsCurlHttps',
       'iscsi':      'BlockdevOptionsIscsi',
-- 
2.31.1




^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 6/7] block: check for sys/disk.h
  2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
                   ` (4 preceding siblings ...)
  2021-06-08 13:16 ` [PATCH v4 5/7] block: feature detection for host block support Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
  2021-06-08 13:16 ` [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use Paolo Bonzini
  6 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, Peter Maydell, vsementsov, qemu-block, Joelle van Dyne,
	Philippe Mathieu-Daudé

From: Joelle van Dyne <j@getutm.app>

Some BSD platforms do not have this header.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-Id: <20210315180341.31638-3-j@getutm.app>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block.c     | 2 +-
 meson.build | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 3f456892d0..1d37f133a8 100644
--- a/block.c
+++ b/block.c
@@ -54,7 +54,7 @@
 #ifdef CONFIG_BSD
 #include <sys/ioctl.h>
 #include <sys/queue.h>
-#ifndef __DragonFly__
+#if defined(HAVE_SYS_DISK_H)
 #include <sys/disk.h>
 #endif
 #endif
diff --git a/meson.build b/meson.build
index 53150866ac..7e3902b5c8 100644
--- a/meson.build
+++ b/meson.build
@@ -1187,6 +1187,7 @@ config_host_data.set('HAVE_SYS_IOCCOM_H', cc.has_header('sys/ioccom.h'))
 config_host_data.set('HAVE_SYS_KCOV_H', cc.has_header('sys/kcov.h'))
 config_host_data.set('HAVE_SYSTEM_FUNCTION', cc.has_function('system', prefix: '#include <stdlib.h>'))
 config_host_data.set('HAVE_HOST_BLOCK_DEVICE', have_host_block_device)
+config_host_data.set('HAVE_SYS_DISK_H', cc.has_header('sys/disk.h'))
 
 config_host_data.set('CONFIG_PREADV', cc.has_function('preadv', prefix: '#include <sys/uio.h>'))
 
-- 
2.31.1




^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use
  2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
                   ` (5 preceding siblings ...)
  2021-06-08 13:16 ` [PATCH v4 6/7] block: check for sys/disk.h Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
  2021-06-23 15:47   ` Max Reitz
  6 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
  To: qemu-devel
  Cc: kwolf, Peter Maydell, vsementsov, qemu-block, Joelle van Dyne,
	Warner Losh

From: Joelle van Dyne <j@getutm.app>

iOS hosts do not have these defined so we fallback to the
default behaviour.

Co-authored-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-Id: <20210315180341.31638-4-j@getutm.app>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 block/file-posix.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 5821e1afed..4e2f7cf508 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2322,8 +2322,11 @@ static int64_t raw_getlength(BlockDriverState *bs)
 again:
 #endif
     if (!fstat(fd, &sb) && (S_IFCHR & sb.st_mode)) {
+        size = 0;
 #ifdef DIOCGMEDIASIZE
-        if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size))
+        if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size)) {
+            size = 0;
+        }
 #elif defined(DIOCGPART)
         {
                 struct partinfo pi;
@@ -2332,9 +2335,7 @@ again:
                 else
                         size = 0;
         }
-        if (size == 0)
-#endif
-#if defined(__APPLE__) && defined(__MACH__)
+#elif defined(DKIOCGETBLOCKCOUNT) && defined(DKIOCGETBLOCKSIZE)
         {
             uint64_t sectors = 0;
             uint32_t sector_size = 0;
@@ -2342,19 +2343,15 @@ again:
             if (ioctl(fd, DKIOCGETBLOCKCOUNT, &sectors) == 0
                && ioctl(fd, DKIOCGETBLOCKSIZE, &sector_size) == 0) {
                 size = sectors * sector_size;
-            } else {
-                size = lseek(fd, 0LL, SEEK_END);
-                if (size < 0) {
-                    return -errno;
-                }
             }
         }
-#else
-        size = lseek(fd, 0LL, SEEK_END);
+#endif
+        if (size == 0) {
+            size = lseek(fd, 0LL, SEEK_END);
+        }
         if (size < 0) {
             return -errno;
         }
-#endif
 #if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
         switch(s->type) {
         case FTYPE_CD:
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
  2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
@ 2021-06-08 17:34   ` Eric Blake
  2021-06-08 19:14   ` Vladimir Sementsov-Ogievskiy
  1 sibling, 0 replies; 21+ messages in thread
From: Eric Blake @ 2021-06-08 17:34 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, vsementsov, qemu-devel, qemu-block

On Tue, Jun 08, 2021 at 03:16:28PM +0200, Paolo Bonzini wrote:
> Even though it was only called for devices that have bs->sg set (which
> must be character devices), sg_get_max_segments looked at /sys/dev/block
> which only works for block devices.
> 
> On Linux the sg driver has its own way to provide the maximum number of
> iovecs in a scatter/gather list, so add support for it.  The block device
> path is kept because it will be reinstated in the next patches.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/file-posix.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index f37dfc10b3..536998a1d6 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
>          goto out;
>      }
>  
> +    if (S_ISCHR(st.st_mode)) {
> +        if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {

Do we need to do any conditional compilation based on whether
SG_GET_SG_TABLESIZE is a known ioctl, or is it old enough to be
assumed present on all platforms we care about?

> +            return ret;
> +        }
> +        return -ENOTSUP;
> +    }
> +
> +    if (!S_ISBLK(st.st_mode)) {
> +        return -ENOTSUP;
> +    }
> +
>      sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
>                                  major(st.st_rdev), minor(st.st_rdev));
>      sysfd = open(sysfspath, O_RDONLY);

Otherwise looks good to me.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits
  2021-06-08 13:16 ` [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits Paolo Bonzini
@ 2021-06-08 17:42   ` Eric Blake
  2021-06-09 16:10   ` Maxim Levitsky
  1 sibling, 0 replies; 21+ messages in thread
From: Eric Blake @ 2021-06-08 17:42 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, vsementsov, qemu-devel, qemu-block

On Tue, Jun 08, 2021 at 03:16:29PM +0200, Paolo Bonzini wrote:
> I/O to a disk via read/write is not limited by the number of segments allowed
> by the host adapter; the kernel can split requests if needed, and the limit
> imposed by the host adapter can be very low (256k or so) to avoid that SG_IO
> returns EINVAL if memory is heavily fragmented.

to avoid SG_IO returning EINVAL

> 
> Since this value is only interesting for SG_IO-based I/O, do not include
> it in the max_transfer and only take it into account when patching the
> block limits VPD page in the scsi-generic device.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/file-posix.c     | 3 +--
>  hw/scsi/scsi-generic.c | 6 ++++--
>  2 files changed, 5 insertions(+), 4 deletions(-)
>

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits
  2021-06-08 13:16 ` [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits Paolo Bonzini
@ 2021-06-08 17:48   ` Eric Blake
  2021-06-09 16:12   ` Maxim Levitsky
  1 sibling, 0 replies; 21+ messages in thread
From: Eric Blake @ 2021-06-08 17:48 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, vsementsov, qemu-devel, qemu-block

On Tue, Jun 08, 2021 at 03:16:30PM +0200, Paolo Bonzini wrote:
> For block host devices, I/O can happen through either the kernel file
> descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
> or the SCSI passthrough ioctl SG_IO.
> 
> In the latter case, the size of each transfer can be limited by the
> HBA, while for file descriptor I/O the kernel is able to split and
> merge I/O in smaller pieces as needed.  Applying the HBA limits to
> file descriptor I/O results in more system calls and suboptimal
> performance, so this patch splits the max_transfer limit in two:
> max_transfer remains valid and is used in general, while max_hw_transfer
> is limited to the maximum hardware size.  max_hw_transfer can then be
> included by the scsi-generic driver in the block limits page, to ensure
> that the stricter hardware limit is used.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/block-backend.c          | 12 ++++++++++++
>  block/file-posix.c             |  2 +-
>  block/io.c                     |  1 +
>  hw/scsi/scsi-generic.c         |  2 +-
>  include/block/block_int.h      |  7 +++++++
>  include/sysemu/block-backend.h |  1 +

[you can use git's orderfile option to put .h changes first]

>  6 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 15f1ea4288..2ea1412a54 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1953,6 +1953,18 @@ uint32_t blk_get_request_alignment(BlockBackend *blk)
>      return bs ? bs->bl.request_alignment : BDRV_SECTOR_SIZE;
>  }
>  
> +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk)

Since we have declared (elsewhere) that the maximum block device is
signed, would this be better as int64_t?  (Our reasoning is that off_t
is also signed, and we are unlikely to need to handle anything bigger
than what off_t can handle; plus it leaves room for returning errors,
although this particular function is not giving errors; see also
Vladimir's work on making the block layer 64-bit clean).  I'm not
opposed to unsigned here to represent lack of errors, but consistency
with the rest of the block layer may argue for signed.

> +++ b/include/block/block_int.h
> @@ -695,6 +695,13 @@ typedef struct BlockLimits {
>       * clamped down. */
>      uint32_t max_transfer;
>  
> +    /* Maximal hardware transfer length in bytes.  Applies whenever
> +     * transfers to the device bypass the kernel I/O scheduler, for
> +     * example with SG_IO.  If larger than max_transfer or if zero,
> +     * blk_get_max_hw_transfer will fall back to max_transfer.
> +     */
> +    uint64_t max_hw_transfer;
> +

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2
  2021-06-08 13:16 ` [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2 Paolo Bonzini
@ 2021-06-08 17:53   ` Eric Blake
  2021-06-09 16:15   ` Maxim Levitsky
  1 sibling, 0 replies; 21+ messages in thread
From: Eric Blake @ 2021-06-08 17:53 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kwolf, vsementsov, qemu-devel, qemu-block

On Tue, Jun 08, 2021 at 03:16:31PM +0200, Paolo Bonzini wrote:
> bs->sg is only true for character devices, but block devices can also
> be used with scsi-block and scsi-generic.  Unfortunately BLKSECTGET
> returns bytes in an int for /dev/sgN devices, and sectors in a short
> for block devices, so account for that in the code.

Gotta love inconsistent and poorly-documented kernel interfaces! (on my
system, 'man -k BLKSECTGET' had no hits)

> 
> The maximum transfer also need not be a power of 2 (for example I have
> seen disks with 1280 KiB maximum transfer) so there's no need to pass
> the result through pow2floor.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/file-posix.c | 44 ++++++++++++++++++++++++--------------------
>  1 file changed, 24 insertions(+), 20 deletions(-)
>

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
  2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
  2021-06-08 17:34   ` Eric Blake
@ 2021-06-08 19:14   ` Vladimir Sementsov-Ogievskiy
  2021-06-09 16:08     ` Maxim Levitsky
  2021-06-23 15:42     ` Max Reitz
  1 sibling, 2 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-08 19:14 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: qemu-block, kwolf

08.06.2021 16:16, Paolo Bonzini wrote:
> Even though it was only called for devices that have bs->sg set (which
> must be character devices), sg_get_max_segments looked at /sys/dev/block
> which only works for block devices.
> 
> On Linux the sg driver has its own way to provide the maximum number of
> iovecs in a scatter/gather list, so add support for it.  The block device
> path is kept because it will be reinstated in the next patches.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   block/file-posix.c | 11 +++++++++++
>   1 file changed, 11 insertions(+)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index f37dfc10b3..536998a1d6 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
>           goto out;
>       }
>   
> +    if (S_ISCHR(st.st_mode)) {

Why not check "if (bs->sg) {" instead? It seems to be more consistent with issuing SG_ ioctl. Or what I miss?

> +        if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
> +            return ret;
> +        }
> +        return -ENOTSUP;
> +    }
> +
> +    if (!S_ISBLK(st.st_mode)) {
> +        return -ENOTSUP;
> +    }
> +
>       sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
>                                   major(st.st_rdev), minor(st.st_rdev));
>       sysfd = open(sysfspath, O_RDONLY);
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
  2021-06-08 19:14   ` Vladimir Sementsov-Ogievskiy
@ 2021-06-09 16:08     ` Maxim Levitsky
  2021-06-23 15:42     ` Max Reitz
  1 sibling, 0 replies; 21+ messages in thread
From: Maxim Levitsky @ 2021-06-09 16:08 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Paolo Bonzini, qemu-devel
  Cc: kwolf, Max Reitz, qemu-block, Tom Yan

On Tue, 2021-06-08 at 22:14 +0300, Vladimir Sementsov-Ogievskiy wrote:
> 08.06.2021 16:16, Paolo Bonzini wrote:
> > Even though it was only called for devices that have bs->sg set (which
> > must be character devices), sg_get_max_segments looked at /sys/dev/block
> > which only works for block devices.
> > 
> > On Linux the sg driver has its own way to provide the maximum number of
> > iovecs in a scatter/gather list, so add support for it.  The block device
> > path is kept because it will be reinstated in the next patches.
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >   block/file-posix.c | 11 +++++++++++
> >   1 file changed, 11 insertions(+)
> > 
> > diff --git a/block/file-posix.c b/block/file-posix.c
> > index f37dfc10b3..536998a1d6 100644
> > --- a/block/file-posix.c
> > +++ b/block/file-posix.c
> > @@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
> >           goto out;
> >       }
> >   
> > +    if (S_ISCHR(st.st_mode)) {
> 
> Why not check "if (bs->sg) {" instead? It seems to be more consistent with issuing SG_ ioctl. Or what I miss?

I also think so. Actually the 'hdev_is_sg' has a check for character device as well, 
in addition to a few more checks that make sure that we are really 
dealing with the quirky /dev/sg character device.

> 
> > +        if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
> > +            return ret;
> > +        }
> > +        return -ENOTSUP;
> > +    }
> > +
> > +    if (!S_ISBLK(st.st_mode)) {
> > +        return -ENOTSUP;
> > +    }
> > +
> >       sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
> >                                   major(st.st_rdev), minor(st.st_rdev));
> >       sysfd = open(sysfspath, O_RDONLY);
> > 
> 
> 

Other than that, this is the same as the patch from Tom Yan:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg768262.html

In this version he does check if the SG_GET_SG_TABLESIZE is defined, so
you might want to do this as well.


Best regards,
	Maxim Levitsky





^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits
  2021-06-08 13:16 ` [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits Paolo Bonzini
  2021-06-08 17:42   ` Eric Blake
@ 2021-06-09 16:10   ` Maxim Levitsky
  1 sibling, 0 replies; 21+ messages in thread
From: Maxim Levitsky @ 2021-06-09 16:10 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel
  Cc: kwolf, vsementsov, Max Reitz, qemu-block, Tom Yan

On Tue, 2021-06-08 at 15:16 +0200, Paolo Bonzini wrote:
> I/O to a disk via read/write is not limited by the number of segments allowed
> by the host adapter; the kernel can split requests if needed, and the limit
> imposed by the host adapter can be very low (256k or so) to avoid that SG_IO
> returns EINVAL if memory is heavily fragmented.
> 
> Since this value is only interesting for SG_IO-based I/O, do not include
> it in the max_transfer and only take it into account when patching the
> block limits VPD page in the scsi-generic device.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/file-posix.c     | 3 +--
>  hw/scsi/scsi-generic.c | 6 ++++--
>  2 files changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 536998a1d6..670c577bfe 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1239,8 +1239,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>  
>          ret = sg_get_max_segments(s->fd);
>          if (ret > 0) {
> -            bs->bl.max_transfer = MIN(bs->bl.max_transfer,
> -                                      ret * qemu_real_host_page_size);
> +            bs->bl.max_iov = ret;

Actually I think that both max transfer size and max segement count,
are only relevant for SCSI passthrough since kernel I think emualates
both for regular I/O, so I think that we shoudn't expose them to qemu at all.

In my version of the patches I removed both bl.max_transfer and bl.max_iov
setup from the file-posix driver and replaced it with bs->bl.max_ioctl_transfer
(you call it max_hw_transfer)

In my version the bl.max_ioctl_transfer is a merged limit of the max transfer size
and the max iovec number.

https://www.mail-archive.com/qemu-devel@nongnu.org/msg768264.html


Best regards,
	Maxim Levitsky


>          }
>      }
>  
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 98c30c5d5c..82e1e2ee79 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -179,10 +179,12 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, SCSIDevice *s)
>          (r->req.cmd.buf[1] & 0x01)) {
>          page = r->req.cmd.buf[2];
>          if (page == 0xb0) {
> -            uint32_t max_transfer =
> -                blk_get_max_transfer(s->conf.blk) / s->blocksize;
> +            uint32_t max_transfer = blk_get_max_transfer(s->conf.blk);
> +            uint32_t max_iov = blk_get_max_iov(s->conf.blk);
>  
>              assert(max_transfer);
> +            max_transfer = MIN_NON_ZERO(max_transfer, max_iov * qemu_real_host_page_size)
> +                / s->blocksize;
>              stl_be_p(&r->buf[8], max_transfer);
>              /* Also take care of the opt xfer len. */
>              stl_be_p(&r->buf[12],






^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits
  2021-06-08 13:16 ` [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits Paolo Bonzini
  2021-06-08 17:48   ` Eric Blake
@ 2021-06-09 16:12   ` Maxim Levitsky
  1 sibling, 0 replies; 21+ messages in thread
From: Maxim Levitsky @ 2021-06-09 16:12 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel
  Cc: kwolf, vsementsov, Max Reitz, qemu-block, Tom Yan

On Tue, 2021-06-08 at 15:16 +0200, Paolo Bonzini wrote:
> For block host devices, I/O can happen through either the kernel file
> descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
> or the SCSI passthrough ioctl SG_IO.
> 
> In the latter case, the size of each transfer can be limited by the
> HBA, while for file descriptor I/O the kernel is able to split and
> merge I/O in smaller pieces as needed.  Applying the HBA limits to
> file descriptor I/O results in more system calls and suboptimal
> performance, so this patch splits the max_transfer limit in two:
> max_transfer remains valid and is used in general, while max_hw_transfer
> is limited to the maximum hardware size.  max_hw_transfer can then be
> included by the scsi-generic driver in the block limits page, to ensure
> that the stricter hardware limit is used.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

This is mostly the same as my patch 

https://www.mail-archive.com/qemu-devel@nongnu.org/msg768264.html

I called this max_ioctl_transfer, since this limit is only relevant
to the .ioctl, but max_hw_transfer is fine as well.

So this patch looks OK, but I might have missed something
as I haven't touched this area for a long time.

Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>

Best regards,
	Maxim Levitsky


> ---
>  block/block-backend.c          | 12 ++++++++++++
>  block/file-posix.c             |  2 +-
>  block/io.c                     |  1 +
>  hw/scsi/scsi-generic.c         |  2 +-
>  include/block/block_int.h      |  7 +++++++
>  include/sysemu/block-backend.h |  1 +
>  6 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 15f1ea4288..2ea1412a54 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1953,6 +1953,18 @@ uint32_t blk_get_request_alignment(BlockBackend *blk)
>      return bs ? bs->bl.request_alignment : BDRV_SECTOR_SIZE;
>  }
>  
> +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
> +{
> +    BlockDriverState *bs = blk_bs(blk);
> +    uint64_t max = INT_MAX;
> +
> +    if (bs) {
> +        max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer);
> +    }
> +    return max;
> +}
> +
>  /* Returns the maximum transfer length, in bytes; guaranteed nonzero */
>  uint32_t blk_get_max_transfer(BlockBackend *blk)
>  {
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 670c577bfe..c9746d3eb6 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1234,7 +1234,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>          int ret = sg_get_max_transfer_length(s->fd);
>  
>          if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
> -            bs->bl.max_transfer = pow2floor(ret);
> +            bs->bl.max_hw_transfer = pow2floor(ret);
>          }
>  
>          ret = sg_get_max_segments(s->fd);
> diff --git a/block/io.c b/block/io.c
> index 323854d063..089b99bb0c 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -127,6 +127,7 @@ static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
>  {
>      dst->opt_transfer = MAX(dst->opt_transfer, src->opt_transfer);
>      dst->max_transfer = MIN_NON_ZERO(dst->max_transfer, src->max_transfer);
> +    dst->max_hw_transfer = MIN_NON_ZERO(dst->max_hw_transfer, src->max_hw_transfer);
>      dst->opt_mem_alignment = MAX(dst->opt_mem_alignment,
>                                   src->opt_mem_alignment);
>      dst->min_mem_alignment = MAX(dst->min_mem_alignment,
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 82e1e2ee79..3762dce749 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -179,7 +179,7 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, SCSIDevice *s)
>          (r->req.cmd.buf[1] & 0x01)) {
>          page = r->req.cmd.buf[2];
>          if (page == 0xb0) {
> -            uint32_t max_transfer = blk_get_max_transfer(s->conf.blk);
> +            uint64_t max_transfer = blk_get_max_hw_transfer(s->conf.blk);
>              uint32_t max_iov = blk_get_max_iov(s->conf.blk);
>  
>              assert(max_transfer);
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 057d88b1fc..f1a54db0f8 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -695,6 +695,13 @@ typedef struct BlockLimits {
>       * clamped down. */
>      uint32_t max_transfer;
>  
> +    /* Maximal hardware transfer length in bytes.  Applies whenever
> +     * transfers to the device bypass the kernel I/O scheduler, for
> +     * example with SG_IO.  If larger than max_transfer or if zero,
> +     * blk_get_max_hw_transfer will fall back to max_transfer.
> +     */
> +    uint64_t max_hw_transfer;
> +
>      /* memory alignment, in bytes so that no bounce buffer is needed */
>      size_t min_mem_alignment;
>  
> diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
> index 5423e3d9c6..9ac5f7bbd3 100644
> --- a/include/sysemu/block-backend.h
> +++ b/include/sysemu/block-backend.h
> @@ -208,6 +208,7 @@ void blk_eject(BlockBackend *blk, bool eject_flag);
>  int blk_get_flags(BlockBackend *blk);
>  uint32_t blk_get_request_alignment(BlockBackend *blk);
>  uint32_t blk_get_max_transfer(BlockBackend *blk);
> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk);
>  int blk_get_max_iov(BlockBackend *blk);
>  void blk_set_guest_block_size(BlockBackend *blk, int align);
>  void *blk_try_blockalign(BlockBackend *blk, size_t size);






^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2
  2021-06-08 13:16 ` [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2 Paolo Bonzini
  2021-06-08 17:53   ` Eric Blake
@ 2021-06-09 16:15   ` Maxim Levitsky
  1 sibling, 0 replies; 21+ messages in thread
From: Maxim Levitsky @ 2021-06-09 16:15 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel
  Cc: kwolf, vsementsov, Max Reitz, qemu-block, Tom Yan

On Tue, 2021-06-08 at 15:16 +0200, Paolo Bonzini wrote:
> bs->sg is only true for character devices, but block devices can also
> be used with scsi-block and scsi-generic.  Unfortunately BLKSECTGET
> returns bytes in an int for /dev/sgN devices, and sectors in a short
> for block devices, so account for that in the code.
> 
> The maximum transfer also need not be a power of 2 (for example I have
> seen disks with 1280 KiB maximum transfer) so there's no need to pass
> the result through pow2floor.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  block/file-posix.c | 44 ++++++++++++++++++++++++--------------------
>  1 file changed, 24 insertions(+), 20 deletions(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index c9746d3eb6..1439293f63 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1149,22 +1149,27 @@ static void raw_reopen_abort(BDRVReopenState *state)
>      s->reopen_state = NULL;
>  }
>  
> -static int sg_get_max_transfer_length(int fd)
> +static int hdev_get_max_hw_transfer(int fd, struct stat *st)
>  {
>  #ifdef BLKSECTGET
> -    int max_bytes = 0;
> -
> -    if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
> -        return max_bytes;
> +    if (S_ISBLK(st->st_mode)) {
> +        unsigned short max_sectors = 0;
> +        if (ioctl(fd, BLKSECTGET, &max_sectors) == 0) {
> +            return max_sectors * 512;
> +        }
>      } else {
> -        return -errno;
> +        int max_bytes = 0;
> +        if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {

Again I would use the bs->sg for that.

> +            return max_bytes;
> +        }
>      }
> +    return -errno;
>  #else
>      return -ENOSYS;
>  #endif
>  }
>  
> -static int sg_get_max_segments(int fd)
> +static int hdev_get_max_segments(int fd, struct stat *st)
>  {
>  #ifdef CONFIG_LINUX
>      char buf[32];
> @@ -1173,26 +1178,20 @@ static int sg_get_max_segments(int fd)
>      int ret;
>      int sysfd = -1;
>      long max_segments;
> -    struct stat st;
>  
> -    if (fstat(fd, &st)) {
> -        ret = -errno;
> -        goto out;
> -    }
> -
> -    if (S_ISCHR(st.st_mode)) {
> +    if (S_ISCHR(st->st_mode)) {
>          if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
>              return ret;
>          }
>          return -ENOTSUP;
>      }
>  
> -    if (!S_ISBLK(st.st_mode)) {
> +    if (!S_ISBLK(st->st_mode)) {
>          return -ENOTSUP;
>      }
>  
>      sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
> -                                major(st.st_rdev), minor(st.st_rdev));
> +                                major(st->st_rdev), minor(st->st_rdev));
>      sysfd = open(sysfspath, O_RDONLY);
>      if (sysfd == -1) {
>          ret = -errno;
> @@ -1229,15 +1228,20 @@ out:
>  static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>  {
>      BDRVRawState *s = bs->opaque;
> +    struct stat st;
> +
> +    if (fstat(s->fd, &st)) {
> +        return;
> +    }
>  
> -    if (bs->sg) {
> -        int ret = sg_get_max_transfer_length(s->fd);
> +    if (bs->sg || S_ISBLK(st.st_mode)) {
> +        int ret = hdev_get_max_hw_transfer(s->fd, &st);
>  
>          if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
> -            bs->bl.max_hw_transfer = pow2floor(ret);
> +            bs->bl.max_hw_transfer = ret;
>          }
>  
> -        ret = sg_get_max_segments(s->fd);
> +        ret = hdev_get_max_segments(s->fd, &st);
>          if (ret > 0) {
>              bs->bl.max_iov = ret;
>          }


Roughly speaking this looks correct, but I might have missed something as well.

This is roughly the same as patches from Tom Yan which I carried in my series

https://www.mail-archive.com/qemu-devel@nongnu.org/msg768258.html
https://www.mail-archive.com/qemu-devel@nongnu.org/msg768262.html


I like a bit more how he created separate functions for /dev/sg and for all other block devices.
Please take a look.

Also not related to this patch, you are missing my fix I did to the VPD limit emulation, please consider taking
it into the series:

https://www.mail-archive.com/qemu-devel@nongnu.org/msg768260.html


Best regards,
	Maxim Levitsky





^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
  2021-06-08 19:14   ` Vladimir Sementsov-Ogievskiy
  2021-06-09 16:08     ` Maxim Levitsky
@ 2021-06-23 15:42     ` Max Reitz
  2021-06-24  7:33       ` Vladimir Sementsov-Ogievskiy
  1 sibling, 1 reply; 21+ messages in thread
From: Max Reitz @ 2021-06-23 15:42 UTC (permalink / raw)
  To: Vladimir Sementsov-Ogievskiy, Paolo Bonzini, qemu-devel; +Cc: kwolf, qemu-block

On 08.06.21 21:14, Vladimir Sementsov-Ogievskiy wrote:
> 08.06.2021 16:16, Paolo Bonzini wrote:
>> Even though it was only called for devices that have bs->sg set (which
>> must be character devices), sg_get_max_segments looked at /sys/dev/block
>> which only works for block devices.
>>
>> On Linux the sg driver has its own way to provide the maximum number of
>> iovecs in a scatter/gather list, so add support for it.  The block 
>> device
>> path is kept because it will be reinstated in the next patches.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   block/file-posix.c | 11 +++++++++++
>>   1 file changed, 11 insertions(+)
>>
>> diff --git a/block/file-posix.c b/block/file-posix.c
>> index f37dfc10b3..536998a1d6 100644
>> --- a/block/file-posix.c
>> +++ b/block/file-posix.c
>> @@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
>>           goto out;
>>       }
>>   +    if (S_ISCHR(st.st_mode)) {
>
> Why not check "if (bs->sg) {" instead? It seems to be more consistent 
> with issuing SG_ ioctl. Or what I miss?

I dismissed this in v3, because I didn’t understand why you’d raise this 
point.  The function is called sg_*(), and it’s only called if bs->sg is 
true anyway.  So clearly we can use SG_ ioctls, because the whole 
function is intended only for SG devices anyway.

This time, I looked forward, and perhaps starting at patch 4 I can 
understand where you’re coming from, because then the function is used 
for host devices in general.

So now I don’t particularly mind.  I think it’s still clear that if 
there’s a host device here that’s a character device, then that’s going 
to be an SG device, so I don’t really have a preference between 
S_ISCHR() and bs->sg.

Max

>> +        if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
>> +            return ret;
>> +        }
>> +        return -ENOTSUP;
>> +    }
>> +
>> +    if (!S_ISBLK(st.st_mode)) {
>> +        return -ENOTSUP;
>> +    }
>> +
>>       sysfspath = 
>> g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
>>                                   major(st.st_rdev), minor(st.st_rdev));
>>       sysfd = open(sysfspath, O_RDONLY);
>>
>
>



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 5/7] block: feature detection for host block support
  2021-06-08 13:16 ` [PATCH v4 5/7] block: feature detection for host block support Paolo Bonzini
@ 2021-06-23 15:44   ` Max Reitz
  0 siblings, 0 replies; 21+ messages in thread
From: Max Reitz @ 2021-06-23 15:44 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: kwolf, vsementsov, Joelle van Dyne, qemu-block

On 08.06.21 15:16, Paolo Bonzini wrote:
> From: Joelle van Dyne <j@getutm.app>
>
> On Darwin (iOS), there are no system level APIs for directly accessing
> host block devices. We detect this at configure time.
>
> Signed-off-by: Joelle van Dyne <j@getutm.app>
> Message-Id: <20210315180341.31638-2-j@getutm.app>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   block/file-posix.c   | 33 ++++++++++++++++++++++-----------
>   meson.build          |  6 +++++-
>   qapi/block-core.json | 10 +++++++---
>   3 files changed, 34 insertions(+), 15 deletions(-)

[...]

> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 2ea294129e..2dd41be156 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -897,7 +897,8 @@
>     'discriminator': 'driver',
>     'data': {
>         'file': 'BlockStatsSpecificFile',
> -      'host_device': 'BlockStatsSpecificFile',
> +      'host_device': { 'type': 'BlockStatsSpecificFile',
> +                       'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
>         'nvme': 'BlockStatsSpecificNvme' } }
>   
>   ##
> @@ -2814,7 +2815,9 @@
>   { 'enum': 'BlockdevDriver',
>     'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
>               'cloop', 'compress', 'copy-on-read', 'dmg', 'file', 'ftp', 'ftps',
> -            'gluster', 'host_cdrom', 'host_device', 'http', 'https', 'iscsi',
> +            'gluster', 'host_cdrom',
> +            {'name': 'host_device', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
> +            'http', 'https', 'iscsi',
>               'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels',
>               'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd',
>               { 'name': 'replication', 'if': 'defined(CONFIG_REPLICATION)' },
> @@ -3996,7 +3999,8 @@
>         'ftps':       'BlockdevOptionsCurlFtps',
>         'gluster':    'BlockdevOptionsGluster',
>         'host_cdrom': 'BlockdevOptionsFile',
> -      'host_device':'BlockdevOptionsFile',
> +      'host_device': { 'type': 'BlockdevOptionsFile',
> +                       'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
>         'http':       'BlockdevOptionsCurlHttp',
>         'https':      'BlockdevOptionsCurlHttps',
>         'iscsi':      'BlockdevOptionsIscsi',

As I asked in v3: What about host_cdrom?

Max



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use
  2021-06-08 13:16 ` [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use Paolo Bonzini
@ 2021-06-23 15:47   ` Max Reitz
  0 siblings, 0 replies; 21+ messages in thread
From: Max Reitz @ 2021-06-23 15:47 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel
  Cc: kwolf, Peter Maydell, vsementsov, qemu-block, Joelle van Dyne,
	Warner Losh

On 08.06.21 15:16, Paolo Bonzini wrote:
> From: Joelle van Dyne <j@getutm.app>
>
> iOS hosts do not have these defined so we fallback to the
> default behaviour.
>
> Co-authored-by: Warner Losh <imp@bsdimp.com>
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Joelle van Dyne <j@getutm.app>
> Message-Id: <20210315180341.31638-4-j@getutm.app>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   block/file-posix.c | 21 +++++++++------------
>   1 file changed, 9 insertions(+), 12 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 5821e1afed..4e2f7cf508 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -2322,8 +2322,11 @@ static int64_t raw_getlength(BlockDriverState *bs)
>   again:
>   #endif
>       if (!fstat(fd, &sb) && (S_IFCHR & sb.st_mode)) {
> +        size = 0;
>   #ifdef DIOCGMEDIASIZE
> -        if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size))
> +        if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size)) {
> +            size = 0;
> +        }
>   #elif defined(DIOCGPART)
>           {
>                   struct partinfo pi;
> @@ -2332,9 +2335,7 @@ again:
>                   else
>                           size = 0;
>           }
> -        if (size == 0)
> -#endif
> -#if defined(__APPLE__) && defined(__MACH__)
> +#elif defined(DKIOCGETBLOCKCOUNT) && defined(DKIOCGETBLOCKSIZE)

In v3, I was wondering whether it’s intentional that the following 
DKIOCGETBLOCKCOUNT/SIZE block would no longer be used as a fallback if 
the DIOCGMEDIASIZE ioctl failed (which it was before this patch).  I’m 
still wondering.

Max

>           {
>               uint64_t sectors = 0;
>               uint32_t sector_size = 0;
> @@ -2342,19 +2343,15 @@ again:
>               if (ioctl(fd, DKIOCGETBLOCKCOUNT, &sectors) == 0
>                  && ioctl(fd, DKIOCGETBLOCKSIZE, &sector_size) == 0) {
>                   size = sectors * sector_size;
> -            } else {
> -                size = lseek(fd, 0LL, SEEK_END);
> -                if (size < 0) {
> -                    return -errno;
> -                }
>               }
>           }
> -#else
> -        size = lseek(fd, 0LL, SEEK_END);
> +#endif
> +        if (size == 0) {
> +            size = lseek(fd, 0LL, SEEK_END);
> +        }
>           if (size < 0) {
>               return -errno;
>           }
> -#endif
>   #if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
>           switch(s->type) {
>           case FTYPE_CD:



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
  2021-06-23 15:42     ` Max Reitz
@ 2021-06-24  7:33       ` Vladimir Sementsov-Ogievskiy
  0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-24  7:33 UTC (permalink / raw)
  To: Max Reitz, Paolo Bonzini, qemu-devel; +Cc: qemu-block, kwolf

23.06.2021 18:42, Max Reitz wrote:
> On 08.06.21 21:14, Vladimir Sementsov-Ogievskiy wrote:
>> 08.06.2021 16:16, Paolo Bonzini wrote:
>>> Even though it was only called for devices that have bs->sg set (which
>>> must be character devices), sg_get_max_segments looked at /sys/dev/block
>>> which only works for block devices.
>>>
>>> On Linux the sg driver has its own way to provide the maximum number of
>>> iovecs in a scatter/gather list, so add support for it.  The block device
>>> path is kept because it will be reinstated in the next patches.
>>>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>>   block/file-posix.c | 11 +++++++++++
>>>   1 file changed, 11 insertions(+)
>>>
>>> diff --git a/block/file-posix.c b/block/file-posix.c
>>> index f37dfc10b3..536998a1d6 100644
>>> --- a/block/file-posix.c
>>> +++ b/block/file-posix.c
>>> @@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
>>>           goto out;
>>>       }
>>>   +    if (S_ISCHR(st.st_mode)) {
>>
>> Why not check "if (bs->sg) {" instead? It seems to be more consistent with issuing SG_ ioctl. Or what I miss?
> 
> I dismissed this in v3, because I didn’t understand why you’d raise this point.  The function is called sg_*(), and it’s only called if bs->sg is true anyway.  So clearly we can use SG_ ioctls, because the whole function is intended only for SG devices anyway.
> 
> This time, I looked forward, and perhaps starting at patch 4 I can understand where you’re coming from, because then the function is used for host devices in general.
> 
> So now I don’t particularly mind.  I think it’s still clear that if there’s a host device here that’s a character device, then that’s going to be an SG device, so I don’t really have a preference between S_ISCHR() and bs->sg.
> 

If I understand all correctly:

In this patch we don't need neither S_ISCHR nor bs->sg check: they both must pass for sg devices. Starting from patch 4 we'll need here if (bs->sg) check.

> 
>>> +        if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
>>> +            return ret;
>>> +        }
>>> +        return -ENOTSUP;
>>> +    }
>>> +
>>> +    if (!S_ISBLK(st.st_mode)) {
>>> +        return -ENOTSUP;
>>> +    }
>>> +
>>>       sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
>>>                                   major(st.st_rdev), minor(st.st_rdev));
>>>       sysfd = open(sysfspath, O_RDONLY);
>>>
>>
>>
> 


-- 
Best regards,
Vladimir


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-06-24  7:34 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
2021-06-08 17:34   ` Eric Blake
2021-06-08 19:14   ` Vladimir Sementsov-Ogievskiy
2021-06-09 16:08     ` Maxim Levitsky
2021-06-23 15:42     ` Max Reitz
2021-06-24  7:33       ` Vladimir Sementsov-Ogievskiy
2021-06-08 13:16 ` [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits Paolo Bonzini
2021-06-08 17:42   ` Eric Blake
2021-06-09 16:10   ` Maxim Levitsky
2021-06-08 13:16 ` [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits Paolo Bonzini
2021-06-08 17:48   ` Eric Blake
2021-06-09 16:12   ` Maxim Levitsky
2021-06-08 13:16 ` [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2 Paolo Bonzini
2021-06-08 17:53   ` Eric Blake
2021-06-09 16:15   ` Maxim Levitsky
2021-06-08 13:16 ` [PATCH v4 5/7] block: feature detection for host block support Paolo Bonzini
2021-06-23 15:44   ` Max Reitz
2021-06-08 13:16 ` [PATCH v4 6/7] block: check for sys/disk.h Paolo Bonzini
2021-06-08 13:16 ` [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use Paolo Bonzini
2021-06-23 15:47   ` Max Reitz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.