* [PATCH v4 0/7] block: file-posix queue
@ 2021-06-08 13:16 Paolo Bonzini
2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
` (6 more replies)
0 siblings, 7 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, vsementsov, qemu-block
Hi Kevin,
this is a combination of two series that both affect host block device
support in block/file-posix.c. Joelle's series is unchanged, while
mine was adjusted according to your review of v2.
v1->v2: add missing patch
v2->v3: add max_hw_transfer to BlockLimits
v3->v4: fix compilation after patch 1, tweak commit messages according
to Vladimir's review
Joelle van Dyne (3):
block: feature detection for host block support
block: check for sys/disk.h
block: detect DKIOCGETBLOCKCOUNT/SIZE before use
Paolo Bonzini (4):
file-posix: fix max_iov for /dev/sg devices
scsi-generic: pass max_segments via max_iov field in BlockLimits
block: add max_hw_transfer to BlockLimits
file-posix: try BLKSECTGET on block devices too, do not round to power
of 2
block.c | 2 +-
block/block-backend.c | 12 ++++
block/file-posix.c | 104 ++++++++++++++++++++-------------
block/io.c | 1 +
hw/scsi/scsi-generic.c | 6 +-
include/block/block_int.h | 7 +++
include/sysemu/block-backend.h | 1 +
meson.build | 7 ++-
qapi/block-core.json | 10 +++-
9 files changed, 102 insertions(+), 48 deletions(-)
--
2.31.1
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
2021-06-08 17:34 ` Eric Blake
2021-06-08 19:14 ` Vladimir Sementsov-Ogievskiy
2021-06-08 13:16 ` [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits Paolo Bonzini
` (5 subsequent siblings)
6 siblings, 2 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, vsementsov, qemu-block
Even though it was only called for devices that have bs->sg set (which
must be character devices), sg_get_max_segments looked at /sys/dev/block
which only works for block devices.
On Linux the sg driver has its own way to provide the maximum number of
iovecs in a scatter/gather list, so add support for it. The block device
path is kept because it will be reinstated in the next patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block/file-posix.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/block/file-posix.c b/block/file-posix.c
index f37dfc10b3..536998a1d6 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
goto out;
}
+ if (S_ISCHR(st.st_mode)) {
+ if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
+ return ret;
+ }
+ return -ENOTSUP;
+ }
+
+ if (!S_ISBLK(st.st_mode)) {
+ return -ENOTSUP;
+ }
+
sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
major(st.st_rdev), minor(st.st_rdev));
sysfd = open(sysfspath, O_RDONLY);
--
2.31.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits
2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
2021-06-08 17:42 ` Eric Blake
2021-06-09 16:10 ` Maxim Levitsky
2021-06-08 13:16 ` [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits Paolo Bonzini
` (4 subsequent siblings)
6 siblings, 2 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, vsementsov, qemu-block
I/O to a disk via read/write is not limited by the number of segments allowed
by the host adapter; the kernel can split requests if needed, and the limit
imposed by the host adapter can be very low (256k or so) to avoid that SG_IO
returns EINVAL if memory is heavily fragmented.
Since this value is only interesting for SG_IO-based I/O, do not include
it in the max_transfer and only take it into account when patching the
block limits VPD page in the scsi-generic device.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block/file-posix.c | 3 +--
hw/scsi/scsi-generic.c | 6 ++++--
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index 536998a1d6..670c577bfe 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1239,8 +1239,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
ret = sg_get_max_segments(s->fd);
if (ret > 0) {
- bs->bl.max_transfer = MIN(bs->bl.max_transfer,
- ret * qemu_real_host_page_size);
+ bs->bl.max_iov = ret;
}
}
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 98c30c5d5c..82e1e2ee79 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -179,10 +179,12 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, SCSIDevice *s)
(r->req.cmd.buf[1] & 0x01)) {
page = r->req.cmd.buf[2];
if (page == 0xb0) {
- uint32_t max_transfer =
- blk_get_max_transfer(s->conf.blk) / s->blocksize;
+ uint32_t max_transfer = blk_get_max_transfer(s->conf.blk);
+ uint32_t max_iov = blk_get_max_iov(s->conf.blk);
assert(max_transfer);
+ max_transfer = MIN_NON_ZERO(max_transfer, max_iov * qemu_real_host_page_size)
+ / s->blocksize;
stl_be_p(&r->buf[8], max_transfer);
/* Also take care of the opt xfer len. */
stl_be_p(&r->buf[12],
--
2.31.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits
2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
2021-06-08 13:16 ` [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
2021-06-08 17:48 ` Eric Blake
2021-06-09 16:12 ` Maxim Levitsky
2021-06-08 13:16 ` [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2 Paolo Bonzini
` (3 subsequent siblings)
6 siblings, 2 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, vsementsov, qemu-block
For block host devices, I/O can happen through either the kernel file
descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
or the SCSI passthrough ioctl SG_IO.
In the latter case, the size of each transfer can be limited by the
HBA, while for file descriptor I/O the kernel is able to split and
merge I/O in smaller pieces as needed. Applying the HBA limits to
file descriptor I/O results in more system calls and suboptimal
performance, so this patch splits the max_transfer limit in two:
max_transfer remains valid and is used in general, while max_hw_transfer
is limited to the maximum hardware size. max_hw_transfer can then be
included by the scsi-generic driver in the block limits page, to ensure
that the stricter hardware limit is used.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block/block-backend.c | 12 ++++++++++++
block/file-posix.c | 2 +-
block/io.c | 1 +
hw/scsi/scsi-generic.c | 2 +-
include/block/block_int.h | 7 +++++++
include/sysemu/block-backend.h | 1 +
6 files changed, 23 insertions(+), 2 deletions(-)
diff --git a/block/block-backend.c b/block/block-backend.c
index 15f1ea4288..2ea1412a54 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1953,6 +1953,18 @@ uint32_t blk_get_request_alignment(BlockBackend *blk)
return bs ? bs->bl.request_alignment : BDRV_SECTOR_SIZE;
}
+/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
+uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
+{
+ BlockDriverState *bs = blk_bs(blk);
+ uint64_t max = INT_MAX;
+
+ if (bs) {
+ max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer);
+ }
+ return max;
+}
+
/* Returns the maximum transfer length, in bytes; guaranteed nonzero */
uint32_t blk_get_max_transfer(BlockBackend *blk)
{
diff --git a/block/file-posix.c b/block/file-posix.c
index 670c577bfe..c9746d3eb6 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1234,7 +1234,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
int ret = sg_get_max_transfer_length(s->fd);
if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
- bs->bl.max_transfer = pow2floor(ret);
+ bs->bl.max_hw_transfer = pow2floor(ret);
}
ret = sg_get_max_segments(s->fd);
diff --git a/block/io.c b/block/io.c
index 323854d063..089b99bb0c 100644
--- a/block/io.c
+++ b/block/io.c
@@ -127,6 +127,7 @@ static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
{
dst->opt_transfer = MAX(dst->opt_transfer, src->opt_transfer);
dst->max_transfer = MIN_NON_ZERO(dst->max_transfer, src->max_transfer);
+ dst->max_hw_transfer = MIN_NON_ZERO(dst->max_hw_transfer, src->max_hw_transfer);
dst->opt_mem_alignment = MAX(dst->opt_mem_alignment,
src->opt_mem_alignment);
dst->min_mem_alignment = MAX(dst->min_mem_alignment,
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 82e1e2ee79..3762dce749 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -179,7 +179,7 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, SCSIDevice *s)
(r->req.cmd.buf[1] & 0x01)) {
page = r->req.cmd.buf[2];
if (page == 0xb0) {
- uint32_t max_transfer = blk_get_max_transfer(s->conf.blk);
+ uint64_t max_transfer = blk_get_max_hw_transfer(s->conf.blk);
uint32_t max_iov = blk_get_max_iov(s->conf.blk);
assert(max_transfer);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 057d88b1fc..f1a54db0f8 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -695,6 +695,13 @@ typedef struct BlockLimits {
* clamped down. */
uint32_t max_transfer;
+ /* Maximal hardware transfer length in bytes. Applies whenever
+ * transfers to the device bypass the kernel I/O scheduler, for
+ * example with SG_IO. If larger than max_transfer or if zero,
+ * blk_get_max_hw_transfer will fall back to max_transfer.
+ */
+ uint64_t max_hw_transfer;
+
/* memory alignment, in bytes so that no bounce buffer is needed */
size_t min_mem_alignment;
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 5423e3d9c6..9ac5f7bbd3 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -208,6 +208,7 @@ void blk_eject(BlockBackend *blk, bool eject_flag);
int blk_get_flags(BlockBackend *blk);
uint32_t blk_get_request_alignment(BlockBackend *blk);
uint32_t blk_get_max_transfer(BlockBackend *blk);
+uint64_t blk_get_max_hw_transfer(BlockBackend *blk);
int blk_get_max_iov(BlockBackend *blk);
void blk_set_guest_block_size(BlockBackend *blk, int align);
void *blk_try_blockalign(BlockBackend *blk, size_t size);
--
2.31.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2
2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
` (2 preceding siblings ...)
2021-06-08 13:16 ` [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
2021-06-08 17:53 ` Eric Blake
2021-06-09 16:15 ` Maxim Levitsky
2021-06-08 13:16 ` [PATCH v4 5/7] block: feature detection for host block support Paolo Bonzini
` (2 subsequent siblings)
6 siblings, 2 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, vsementsov, qemu-block
bs->sg is only true for character devices, but block devices can also
be used with scsi-block and scsi-generic. Unfortunately BLKSECTGET
returns bytes in an int for /dev/sgN devices, and sectors in a short
for block devices, so account for that in the code.
The maximum transfer also need not be a power of 2 (for example I have
seen disks with 1280 KiB maximum transfer) so there's no need to pass
the result through pow2floor.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block/file-posix.c | 44 ++++++++++++++++++++++++--------------------
1 file changed, 24 insertions(+), 20 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index c9746d3eb6..1439293f63 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1149,22 +1149,27 @@ static void raw_reopen_abort(BDRVReopenState *state)
s->reopen_state = NULL;
}
-static int sg_get_max_transfer_length(int fd)
+static int hdev_get_max_hw_transfer(int fd, struct stat *st)
{
#ifdef BLKSECTGET
- int max_bytes = 0;
-
- if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
- return max_bytes;
+ if (S_ISBLK(st->st_mode)) {
+ unsigned short max_sectors = 0;
+ if (ioctl(fd, BLKSECTGET, &max_sectors) == 0) {
+ return max_sectors * 512;
+ }
} else {
- return -errno;
+ int max_bytes = 0;
+ if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
+ return max_bytes;
+ }
}
+ return -errno;
#else
return -ENOSYS;
#endif
}
-static int sg_get_max_segments(int fd)
+static int hdev_get_max_segments(int fd, struct stat *st)
{
#ifdef CONFIG_LINUX
char buf[32];
@@ -1173,26 +1178,20 @@ static int sg_get_max_segments(int fd)
int ret;
int sysfd = -1;
long max_segments;
- struct stat st;
- if (fstat(fd, &st)) {
- ret = -errno;
- goto out;
- }
-
- if (S_ISCHR(st.st_mode)) {
+ if (S_ISCHR(st->st_mode)) {
if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
return ret;
}
return -ENOTSUP;
}
- if (!S_ISBLK(st.st_mode)) {
+ if (!S_ISBLK(st->st_mode)) {
return -ENOTSUP;
}
sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
- major(st.st_rdev), minor(st.st_rdev));
+ major(st->st_rdev), minor(st->st_rdev));
sysfd = open(sysfspath, O_RDONLY);
if (sysfd == -1) {
ret = -errno;
@@ -1229,15 +1228,20 @@ out:
static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVRawState *s = bs->opaque;
+ struct stat st;
+
+ if (fstat(s->fd, &st)) {
+ return;
+ }
- if (bs->sg) {
- int ret = sg_get_max_transfer_length(s->fd);
+ if (bs->sg || S_ISBLK(st.st_mode)) {
+ int ret = hdev_get_max_hw_transfer(s->fd, &st);
if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
- bs->bl.max_hw_transfer = pow2floor(ret);
+ bs->bl.max_hw_transfer = ret;
}
- ret = sg_get_max_segments(s->fd);
+ ret = hdev_get_max_segments(s->fd, &st);
if (ret > 0) {
bs->bl.max_iov = ret;
}
--
2.31.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v4 5/7] block: feature detection for host block support
2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
` (3 preceding siblings ...)
2021-06-08 13:16 ` [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2 Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
2021-06-23 15:44 ` Max Reitz
2021-06-08 13:16 ` [PATCH v4 6/7] block: check for sys/disk.h Paolo Bonzini
2021-06-08 13:16 ` [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use Paolo Bonzini
6 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
To: qemu-devel; +Cc: kwolf, vsementsov, Joelle van Dyne, qemu-block
From: Joelle van Dyne <j@getutm.app>
On Darwin (iOS), there are no system level APIs for directly accessing
host block devices. We detect this at configure time.
Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-Id: <20210315180341.31638-2-j@getutm.app>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block/file-posix.c | 33 ++++++++++++++++++++++-----------
meson.build | 6 +++++-
qapi/block-core.json | 10 +++++++---
3 files changed, 34 insertions(+), 15 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index 1439293f63..5821e1afed 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -42,6 +42,8 @@
#include "scsi/constants.h"
#if defined(__APPLE__) && (__MACH__)
+#include <sys/ioctl.h>
+#if defined(HAVE_HOST_BLOCK_DEVICE)
#include <paths.h>
#include <sys/param.h>
#include <IOKit/IOKitLib.h>
@@ -52,6 +54,7 @@
//#include <IOKit/storage/IOCDTypes.h>
#include <IOKit/storage/IODVDMedia.h>
#include <CoreFoundation/CoreFoundation.h>
+#endif /* defined(HAVE_HOST_BLOCK_DEVICE) */
#endif
#ifdef __sun__
@@ -180,7 +183,17 @@ typedef struct BDRVRawReopenState {
bool check_cache_dropped;
} BDRVRawReopenState;
-static int fd_open(BlockDriverState *bs);
+static int fd_open(BlockDriverState *bs)
+{
+ BDRVRawState *s = bs->opaque;
+
+ /* this is just to ensure s->fd is sane (its called by io ops) */
+ if (s->fd >= 0) {
+ return 0;
+ }
+ return -EIO;
+}
+
static int64_t raw_getlength(BlockDriverState *bs);
typedef struct RawPosixAIOData {
@@ -3028,6 +3041,7 @@ static BlockStatsSpecific *raw_get_specific_stats(BlockDriverState *bs)
return stats;
}
+#if defined(HAVE_HOST_BLOCK_DEVICE)
static BlockStatsSpecific *hdev_get_specific_stats(BlockDriverState *bs)
{
BlockStatsSpecific *stats = g_new(BlockStatsSpecific, 1);
@@ -3037,6 +3051,7 @@ static BlockStatsSpecific *hdev_get_specific_stats(BlockDriverState *bs)
return stats;
}
+#endif /* HAVE_HOST_BLOCK_DEVICE */
static QemuOptsList raw_create_opts = {
.name = "raw-create-opts",
@@ -3252,6 +3267,8 @@ BlockDriver bdrv_file = {
/***********************************************/
/* host device */
+#if defined(HAVE_HOST_BLOCK_DEVICE)
+
#if defined(__APPLE__) && defined(__MACH__)
static kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
CFIndex maxPathSize, int flags);
@@ -3544,16 +3561,6 @@ hdev_co_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
}
#endif /* linux */
-static int fd_open(BlockDriverState *bs)
-{
- BDRVRawState *s = bs->opaque;
-
- /* this is just to ensure s->fd is sane (its called by io ops) */
- if (s->fd >= 0)
- return 0;
- return -EIO;
-}
-
static coroutine_fn int
hdev_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
{
@@ -3877,6 +3884,8 @@ static BlockDriver bdrv_host_cdrom = {
};
#endif /* __FreeBSD__ */
+#endif /* HAVE_HOST_BLOCK_DEVICE */
+
static void bdrv_file_init(void)
{
/*
@@ -3884,6 +3893,7 @@ static void bdrv_file_init(void)
* registered last will get probed first.
*/
bdrv_register(&bdrv_file);
+#if defined(HAVE_HOST_BLOCK_DEVICE)
bdrv_register(&bdrv_host_device);
#ifdef __linux__
bdrv_register(&bdrv_host_cdrom);
@@ -3891,6 +3901,7 @@ static void bdrv_file_init(void)
#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
bdrv_register(&bdrv_host_cdrom);
#endif
+#endif /* HAVE_HOST_BLOCK_DEVICE */
}
block_init(bdrv_file_init);
diff --git a/meson.build b/meson.build
index 626cf932c1..53150866ac 100644
--- a/meson.build
+++ b/meson.build
@@ -183,7 +183,7 @@ if targetos == 'windows'
include_directories: include_directories('.'))
elif targetos == 'darwin'
coref = dependency('appleframeworks', modules: 'CoreFoundation')
- iokit = dependency('appleframeworks', modules: 'IOKit')
+ iokit = dependency('appleframeworks', modules: 'IOKit', required: false)
elif targetos == 'sunos'
socket = [cc.find_library('socket'),
cc.find_library('nsl'),
@@ -1089,6 +1089,9 @@ if get_option('cfi')
add_global_link_arguments(cfi_flags, native: false, language: ['c', 'cpp', 'objc'])
endif
+have_host_block_device = (targetos != 'darwin' or
+ cc.has_header('IOKit/storage/IOMedia.h'))
+
#################
# config-host.h #
#################
@@ -1183,6 +1186,7 @@ config_host_data.set('HAVE_PTY_H', cc.has_header('pty.h'))
config_host_data.set('HAVE_SYS_IOCCOM_H', cc.has_header('sys/ioccom.h'))
config_host_data.set('HAVE_SYS_KCOV_H', cc.has_header('sys/kcov.h'))
config_host_data.set('HAVE_SYSTEM_FUNCTION', cc.has_function('system', prefix: '#include <stdlib.h>'))
+config_host_data.set('HAVE_HOST_BLOCK_DEVICE', have_host_block_device)
config_host_data.set('CONFIG_PREADV', cc.has_function('preadv', prefix: '#include <sys/uio.h>'))
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 2ea294129e..2dd41be156 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -897,7 +897,8 @@
'discriminator': 'driver',
'data': {
'file': 'BlockStatsSpecificFile',
- 'host_device': 'BlockStatsSpecificFile',
+ 'host_device': { 'type': 'BlockStatsSpecificFile',
+ 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
'nvme': 'BlockStatsSpecificNvme' } }
##
@@ -2814,7 +2815,9 @@
{ 'enum': 'BlockdevDriver',
'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
'cloop', 'compress', 'copy-on-read', 'dmg', 'file', 'ftp', 'ftps',
- 'gluster', 'host_cdrom', 'host_device', 'http', 'https', 'iscsi',
+ 'gluster', 'host_cdrom',
+ {'name': 'host_device', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
+ 'http', 'https', 'iscsi',
'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels',
'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd',
{ 'name': 'replication', 'if': 'defined(CONFIG_REPLICATION)' },
@@ -3996,7 +3999,8 @@
'ftps': 'BlockdevOptionsCurlFtps',
'gluster': 'BlockdevOptionsGluster',
'host_cdrom': 'BlockdevOptionsFile',
- 'host_device':'BlockdevOptionsFile',
+ 'host_device': { 'type': 'BlockdevOptionsFile',
+ 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
'http': 'BlockdevOptionsCurlHttp',
'https': 'BlockdevOptionsCurlHttps',
'iscsi': 'BlockdevOptionsIscsi',
--
2.31.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v4 6/7] block: check for sys/disk.h
2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
` (4 preceding siblings ...)
2021-06-08 13:16 ` [PATCH v4 5/7] block: feature detection for host block support Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
2021-06-08 13:16 ` [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use Paolo Bonzini
6 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
To: qemu-devel
Cc: kwolf, Peter Maydell, vsementsov, qemu-block, Joelle van Dyne,
Philippe Mathieu-Daudé
From: Joelle van Dyne <j@getutm.app>
Some BSD platforms do not have this header.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-Id: <20210315180341.31638-3-j@getutm.app>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block.c | 2 +-
meson.build | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/block.c b/block.c
index 3f456892d0..1d37f133a8 100644
--- a/block.c
+++ b/block.c
@@ -54,7 +54,7 @@
#ifdef CONFIG_BSD
#include <sys/ioctl.h>
#include <sys/queue.h>
-#ifndef __DragonFly__
+#if defined(HAVE_SYS_DISK_H)
#include <sys/disk.h>
#endif
#endif
diff --git a/meson.build b/meson.build
index 53150866ac..7e3902b5c8 100644
--- a/meson.build
+++ b/meson.build
@@ -1187,6 +1187,7 @@ config_host_data.set('HAVE_SYS_IOCCOM_H', cc.has_header('sys/ioccom.h'))
config_host_data.set('HAVE_SYS_KCOV_H', cc.has_header('sys/kcov.h'))
config_host_data.set('HAVE_SYSTEM_FUNCTION', cc.has_function('system', prefix: '#include <stdlib.h>'))
config_host_data.set('HAVE_HOST_BLOCK_DEVICE', have_host_block_device)
+config_host_data.set('HAVE_SYS_DISK_H', cc.has_header('sys/disk.h'))
config_host_data.set('CONFIG_PREADV', cc.has_function('preadv', prefix: '#include <sys/uio.h>'))
--
2.31.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use
2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
` (5 preceding siblings ...)
2021-06-08 13:16 ` [PATCH v4 6/7] block: check for sys/disk.h Paolo Bonzini
@ 2021-06-08 13:16 ` Paolo Bonzini
2021-06-23 15:47 ` Max Reitz
6 siblings, 1 reply; 21+ messages in thread
From: Paolo Bonzini @ 2021-06-08 13:16 UTC (permalink / raw)
To: qemu-devel
Cc: kwolf, Peter Maydell, vsementsov, qemu-block, Joelle van Dyne,
Warner Losh
From: Joelle van Dyne <j@getutm.app>
iOS hosts do not have these defined so we fallback to the
default behaviour.
Co-authored-by: Warner Losh <imp@bsdimp.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-Id: <20210315180341.31638-4-j@getutm.app>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
block/file-posix.c | 21 +++++++++------------
1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/block/file-posix.c b/block/file-posix.c
index 5821e1afed..4e2f7cf508 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2322,8 +2322,11 @@ static int64_t raw_getlength(BlockDriverState *bs)
again:
#endif
if (!fstat(fd, &sb) && (S_IFCHR & sb.st_mode)) {
+ size = 0;
#ifdef DIOCGMEDIASIZE
- if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size))
+ if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size)) {
+ size = 0;
+ }
#elif defined(DIOCGPART)
{
struct partinfo pi;
@@ -2332,9 +2335,7 @@ again:
else
size = 0;
}
- if (size == 0)
-#endif
-#if defined(__APPLE__) && defined(__MACH__)
+#elif defined(DKIOCGETBLOCKCOUNT) && defined(DKIOCGETBLOCKSIZE)
{
uint64_t sectors = 0;
uint32_t sector_size = 0;
@@ -2342,19 +2343,15 @@ again:
if (ioctl(fd, DKIOCGETBLOCKCOUNT, §ors) == 0
&& ioctl(fd, DKIOCGETBLOCKSIZE, §or_size) == 0) {
size = sectors * sector_size;
- } else {
- size = lseek(fd, 0LL, SEEK_END);
- if (size < 0) {
- return -errno;
- }
}
}
-#else
- size = lseek(fd, 0LL, SEEK_END);
+#endif
+ if (size == 0) {
+ size = lseek(fd, 0LL, SEEK_END);
+ }
if (size < 0) {
return -errno;
}
-#endif
#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
switch(s->type) {
case FTYPE_CD:
--
2.31.1
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
@ 2021-06-08 17:34 ` Eric Blake
2021-06-08 19:14 ` Vladimir Sementsov-Ogievskiy
1 sibling, 0 replies; 21+ messages in thread
From: Eric Blake @ 2021-06-08 17:34 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kwolf, vsementsov, qemu-devel, qemu-block
On Tue, Jun 08, 2021 at 03:16:28PM +0200, Paolo Bonzini wrote:
> Even though it was only called for devices that have bs->sg set (which
> must be character devices), sg_get_max_segments looked at /sys/dev/block
> which only works for block devices.
>
> On Linux the sg driver has its own way to provide the maximum number of
> iovecs in a scatter/gather list, so add support for it. The block device
> path is kept because it will be reinstated in the next patches.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/file-posix.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index f37dfc10b3..536998a1d6 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
> goto out;
> }
>
> + if (S_ISCHR(st.st_mode)) {
> + if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
Do we need to do any conditional compilation based on whether
SG_GET_SG_TABLESIZE is a known ioctl, or is it old enough to be
assumed present on all platforms we care about?
> + return ret;
> + }
> + return -ENOTSUP;
> + }
> +
> + if (!S_ISBLK(st.st_mode)) {
> + return -ENOTSUP;
> + }
> +
> sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
> major(st.st_rdev), minor(st.st_rdev));
> sysfd = open(sysfspath, O_RDONLY);
Otherwise looks good to me.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits
2021-06-08 13:16 ` [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits Paolo Bonzini
@ 2021-06-08 17:42 ` Eric Blake
2021-06-09 16:10 ` Maxim Levitsky
1 sibling, 0 replies; 21+ messages in thread
From: Eric Blake @ 2021-06-08 17:42 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kwolf, vsementsov, qemu-devel, qemu-block
On Tue, Jun 08, 2021 at 03:16:29PM +0200, Paolo Bonzini wrote:
> I/O to a disk via read/write is not limited by the number of segments allowed
> by the host adapter; the kernel can split requests if needed, and the limit
> imposed by the host adapter can be very low (256k or so) to avoid that SG_IO
> returns EINVAL if memory is heavily fragmented.
to avoid SG_IO returning EINVAL
>
> Since this value is only interesting for SG_IO-based I/O, do not include
> it in the max_transfer and only take it into account when patching the
> block limits VPD page in the scsi-generic device.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/file-posix.c | 3 +--
> hw/scsi/scsi-generic.c | 6 ++++--
> 2 files changed, 5 insertions(+), 4 deletions(-)
>
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits
2021-06-08 13:16 ` [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits Paolo Bonzini
@ 2021-06-08 17:48 ` Eric Blake
2021-06-09 16:12 ` Maxim Levitsky
1 sibling, 0 replies; 21+ messages in thread
From: Eric Blake @ 2021-06-08 17:48 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kwolf, vsementsov, qemu-devel, qemu-block
On Tue, Jun 08, 2021 at 03:16:30PM +0200, Paolo Bonzini wrote:
> For block host devices, I/O can happen through either the kernel file
> descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
> or the SCSI passthrough ioctl SG_IO.
>
> In the latter case, the size of each transfer can be limited by the
> HBA, while for file descriptor I/O the kernel is able to split and
> merge I/O in smaller pieces as needed. Applying the HBA limits to
> file descriptor I/O results in more system calls and suboptimal
> performance, so this patch splits the max_transfer limit in two:
> max_transfer remains valid and is used in general, while max_hw_transfer
> is limited to the maximum hardware size. max_hw_transfer can then be
> included by the scsi-generic driver in the block limits page, to ensure
> that the stricter hardware limit is used.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/block-backend.c | 12 ++++++++++++
> block/file-posix.c | 2 +-
> block/io.c | 1 +
> hw/scsi/scsi-generic.c | 2 +-
> include/block/block_int.h | 7 +++++++
> include/sysemu/block-backend.h | 1 +
[you can use git's orderfile option to put .h changes first]
> 6 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 15f1ea4288..2ea1412a54 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1953,6 +1953,18 @@ uint32_t blk_get_request_alignment(BlockBackend *blk)
> return bs ? bs->bl.request_alignment : BDRV_SECTOR_SIZE;
> }
>
> +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
Since we have declared (elsewhere) that the maximum block device is
signed, would this be better as int64_t? (Our reasoning is that off_t
is also signed, and we are unlikely to need to handle anything bigger
than what off_t can handle; plus it leaves room for returning errors,
although this particular function is not giving errors; see also
Vladimir's work on making the block layer 64-bit clean). I'm not
opposed to unsigned here to represent lack of errors, but consistency
with the rest of the block layer may argue for signed.
> +++ b/include/block/block_int.h
> @@ -695,6 +695,13 @@ typedef struct BlockLimits {
> * clamped down. */
> uint32_t max_transfer;
>
> + /* Maximal hardware transfer length in bytes. Applies whenever
> + * transfers to the device bypass the kernel I/O scheduler, for
> + * example with SG_IO. If larger than max_transfer or if zero,
> + * blk_get_max_hw_transfer will fall back to max_transfer.
> + */
> + uint64_t max_hw_transfer;
> +
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2
2021-06-08 13:16 ` [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2 Paolo Bonzini
@ 2021-06-08 17:53 ` Eric Blake
2021-06-09 16:15 ` Maxim Levitsky
1 sibling, 0 replies; 21+ messages in thread
From: Eric Blake @ 2021-06-08 17:53 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: kwolf, vsementsov, qemu-devel, qemu-block
On Tue, Jun 08, 2021 at 03:16:31PM +0200, Paolo Bonzini wrote:
> bs->sg is only true for character devices, but block devices can also
> be used with scsi-block and scsi-generic. Unfortunately BLKSECTGET
> returns bytes in an int for /dev/sgN devices, and sectors in a short
> for block devices, so account for that in the code.
Gotta love inconsistent and poorly-documented kernel interfaces! (on my
system, 'man -k BLKSECTGET' had no hits)
>
> The maximum transfer also need not be a power of 2 (for example I have
> seen disks with 1280 KiB maximum transfer) so there's no need to pass
> the result through pow2floor.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/file-posix.c | 44 ++++++++++++++++++++++++--------------------
> 1 file changed, 24 insertions(+), 20 deletions(-)
>
Reviewed-by: Eric Blake <eblake@redhat.com>
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
2021-06-08 17:34 ` Eric Blake
@ 2021-06-08 19:14 ` Vladimir Sementsov-Ogievskiy
2021-06-09 16:08 ` Maxim Levitsky
2021-06-23 15:42 ` Max Reitz
1 sibling, 2 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-08 19:14 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: qemu-block, kwolf
08.06.2021 16:16, Paolo Bonzini wrote:
> Even though it was only called for devices that have bs->sg set (which
> must be character devices), sg_get_max_segments looked at /sys/dev/block
> which only works for block devices.
>
> On Linux the sg driver has its own way to provide the maximum number of
> iovecs in a scatter/gather list, so add support for it. The block device
> path is kept because it will be reinstated in the next patches.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/file-posix.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index f37dfc10b3..536998a1d6 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
> goto out;
> }
>
> + if (S_ISCHR(st.st_mode)) {
Why not check "if (bs->sg) {" instead? It seems to be more consistent with issuing SG_ ioctl. Or what I miss?
> + if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
> + return ret;
> + }
> + return -ENOTSUP;
> + }
> +
> + if (!S_ISBLK(st.st_mode)) {
> + return -ENOTSUP;
> + }
> +
> sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
> major(st.st_rdev), minor(st.st_rdev));
> sysfd = open(sysfspath, O_RDONLY);
>
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
2021-06-08 19:14 ` Vladimir Sementsov-Ogievskiy
@ 2021-06-09 16:08 ` Maxim Levitsky
2021-06-23 15:42 ` Max Reitz
1 sibling, 0 replies; 21+ messages in thread
From: Maxim Levitsky @ 2021-06-09 16:08 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, Paolo Bonzini, qemu-devel
Cc: kwolf, Max Reitz, qemu-block, Tom Yan
On Tue, 2021-06-08 at 22:14 +0300, Vladimir Sementsov-Ogievskiy wrote:
> 08.06.2021 16:16, Paolo Bonzini wrote:
> > Even though it was only called for devices that have bs->sg set (which
> > must be character devices), sg_get_max_segments looked at /sys/dev/block
> > which only works for block devices.
> >
> > On Linux the sg driver has its own way to provide the maximum number of
> > iovecs in a scatter/gather list, so add support for it. The block device
> > path is kept because it will be reinstated in the next patches.
> >
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> > block/file-posix.c | 11 +++++++++++
> > 1 file changed, 11 insertions(+)
> >
> > diff --git a/block/file-posix.c b/block/file-posix.c
> > index f37dfc10b3..536998a1d6 100644
> > --- a/block/file-posix.c
> > +++ b/block/file-posix.c
> > @@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
> > goto out;
> > }
> >
> > + if (S_ISCHR(st.st_mode)) {
>
> Why not check "if (bs->sg) {" instead? It seems to be more consistent with issuing SG_ ioctl. Or what I miss?
I also think so. Actually the 'hdev_is_sg' has a check for character device as well,
in addition to a few more checks that make sure that we are really
dealing with the quirky /dev/sg character device.
>
> > + if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
> > + return ret;
> > + }
> > + return -ENOTSUP;
> > + }
> > +
> > + if (!S_ISBLK(st.st_mode)) {
> > + return -ENOTSUP;
> > + }
> > +
> > sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
> > major(st.st_rdev), minor(st.st_rdev));
> > sysfd = open(sysfspath, O_RDONLY);
> >
>
>
Other than that, this is the same as the patch from Tom Yan:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg768262.html
In this version he does check if the SG_GET_SG_TABLESIZE is defined, so
you might want to do this as well.
Best regards,
Maxim Levitsky
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits
2021-06-08 13:16 ` [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits Paolo Bonzini
2021-06-08 17:42 ` Eric Blake
@ 2021-06-09 16:10 ` Maxim Levitsky
1 sibling, 0 replies; 21+ messages in thread
From: Maxim Levitsky @ 2021-06-09 16:10 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel
Cc: kwolf, vsementsov, Max Reitz, qemu-block, Tom Yan
On Tue, 2021-06-08 at 15:16 +0200, Paolo Bonzini wrote:
> I/O to a disk via read/write is not limited by the number of segments allowed
> by the host adapter; the kernel can split requests if needed, and the limit
> imposed by the host adapter can be very low (256k or so) to avoid that SG_IO
> returns EINVAL if memory is heavily fragmented.
>
> Since this value is only interesting for SG_IO-based I/O, do not include
> it in the max_transfer and only take it into account when patching the
> block limits VPD page in the scsi-generic device.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/file-posix.c | 3 +--
> hw/scsi/scsi-generic.c | 6 ++++--
> 2 files changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 536998a1d6..670c577bfe 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1239,8 +1239,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>
> ret = sg_get_max_segments(s->fd);
> if (ret > 0) {
> - bs->bl.max_transfer = MIN(bs->bl.max_transfer,
> - ret * qemu_real_host_page_size);
> + bs->bl.max_iov = ret;
Actually I think that both max transfer size and max segement count,
are only relevant for SCSI passthrough since kernel I think emualates
both for regular I/O, so I think that we shoudn't expose them to qemu at all.
In my version of the patches I removed both bl.max_transfer and bl.max_iov
setup from the file-posix driver and replaced it with bs->bl.max_ioctl_transfer
(you call it max_hw_transfer)
In my version the bl.max_ioctl_transfer is a merged limit of the max transfer size
and the max iovec number.
https://www.mail-archive.com/qemu-devel@nongnu.org/msg768264.html
Best regards,
Maxim Levitsky
> }
> }
>
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 98c30c5d5c..82e1e2ee79 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -179,10 +179,12 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, SCSIDevice *s)
> (r->req.cmd.buf[1] & 0x01)) {
> page = r->req.cmd.buf[2];
> if (page == 0xb0) {
> - uint32_t max_transfer =
> - blk_get_max_transfer(s->conf.blk) / s->blocksize;
> + uint32_t max_transfer = blk_get_max_transfer(s->conf.blk);
> + uint32_t max_iov = blk_get_max_iov(s->conf.blk);
>
> assert(max_transfer);
> + max_transfer = MIN_NON_ZERO(max_transfer, max_iov * qemu_real_host_page_size)
> + / s->blocksize;
> stl_be_p(&r->buf[8], max_transfer);
> /* Also take care of the opt xfer len. */
> stl_be_p(&r->buf[12],
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits
2021-06-08 13:16 ` [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits Paolo Bonzini
2021-06-08 17:48 ` Eric Blake
@ 2021-06-09 16:12 ` Maxim Levitsky
1 sibling, 0 replies; 21+ messages in thread
From: Maxim Levitsky @ 2021-06-09 16:12 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel
Cc: kwolf, vsementsov, Max Reitz, qemu-block, Tom Yan
On Tue, 2021-06-08 at 15:16 +0200, Paolo Bonzini wrote:
> For block host devices, I/O can happen through either the kernel file
> descriptor I/O system calls (preadv/pwritev, io_submit, io_uring)
> or the SCSI passthrough ioctl SG_IO.
>
> In the latter case, the size of each transfer can be limited by the
> HBA, while for file descriptor I/O the kernel is able to split and
> merge I/O in smaller pieces as needed. Applying the HBA limits to
> file descriptor I/O results in more system calls and suboptimal
> performance, so this patch splits the max_transfer limit in two:
> max_transfer remains valid and is used in general, while max_hw_transfer
> is limited to the maximum hardware size. max_hw_transfer can then be
> included by the scsi-generic driver in the block limits page, to ensure
> that the stricter hardware limit is used.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is mostly the same as my patch
https://www.mail-archive.com/qemu-devel@nongnu.org/msg768264.html
I called this max_ioctl_transfer, since this limit is only relevant
to the .ioctl, but max_hw_transfer is fine as well.
So this patch looks OK, but I might have missed something
as I haven't touched this area for a long time.
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Best regards,
Maxim Levitsky
> ---
> block/block-backend.c | 12 ++++++++++++
> block/file-posix.c | 2 +-
> block/io.c | 1 +
> hw/scsi/scsi-generic.c | 2 +-
> include/block/block_int.h | 7 +++++++
> include/sysemu/block-backend.h | 1 +
> 6 files changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 15f1ea4288..2ea1412a54 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1953,6 +1953,18 @@ uint32_t blk_get_request_alignment(BlockBackend *blk)
> return bs ? bs->bl.request_alignment : BDRV_SECTOR_SIZE;
> }
>
> +/* Returns the maximum hardware transfer length, in bytes; guaranteed nonzero */
> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk)
> +{
> + BlockDriverState *bs = blk_bs(blk);
> + uint64_t max = INT_MAX;
> +
> + if (bs) {
> + max = MIN_NON_ZERO(bs->bl.max_hw_transfer, bs->bl.max_transfer);
> + }
> + return max;
> +}
> +
> /* Returns the maximum transfer length, in bytes; guaranteed nonzero */
> uint32_t blk_get_max_transfer(BlockBackend *blk)
> {
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 670c577bfe..c9746d3eb6 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1234,7 +1234,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
> int ret = sg_get_max_transfer_length(s->fd);
>
> if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
> - bs->bl.max_transfer = pow2floor(ret);
> + bs->bl.max_hw_transfer = pow2floor(ret);
> }
>
> ret = sg_get_max_segments(s->fd);
> diff --git a/block/io.c b/block/io.c
> index 323854d063..089b99bb0c 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -127,6 +127,7 @@ static void bdrv_merge_limits(BlockLimits *dst, const BlockLimits *src)
> {
> dst->opt_transfer = MAX(dst->opt_transfer, src->opt_transfer);
> dst->max_transfer = MIN_NON_ZERO(dst->max_transfer, src->max_transfer);
> + dst->max_hw_transfer = MIN_NON_ZERO(dst->max_hw_transfer, src->max_hw_transfer);
> dst->opt_mem_alignment = MAX(dst->opt_mem_alignment,
> src->opt_mem_alignment);
> dst->min_mem_alignment = MAX(dst->min_mem_alignment,
> diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
> index 82e1e2ee79..3762dce749 100644
> --- a/hw/scsi/scsi-generic.c
> +++ b/hw/scsi/scsi-generic.c
> @@ -179,7 +179,7 @@ static void scsi_handle_inquiry_reply(SCSIGenericReq *r, SCSIDevice *s)
> (r->req.cmd.buf[1] & 0x01)) {
> page = r->req.cmd.buf[2];
> if (page == 0xb0) {
> - uint32_t max_transfer = blk_get_max_transfer(s->conf.blk);
> + uint64_t max_transfer = blk_get_max_hw_transfer(s->conf.blk);
> uint32_t max_iov = blk_get_max_iov(s->conf.blk);
>
> assert(max_transfer);
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 057d88b1fc..f1a54db0f8 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -695,6 +695,13 @@ typedef struct BlockLimits {
> * clamped down. */
> uint32_t max_transfer;
>
> + /* Maximal hardware transfer length in bytes. Applies whenever
> + * transfers to the device bypass the kernel I/O scheduler, for
> + * example with SG_IO. If larger than max_transfer or if zero,
> + * blk_get_max_hw_transfer will fall back to max_transfer.
> + */
> + uint64_t max_hw_transfer;
> +
> /* memory alignment, in bytes so that no bounce buffer is needed */
> size_t min_mem_alignment;
>
> diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
> index 5423e3d9c6..9ac5f7bbd3 100644
> --- a/include/sysemu/block-backend.h
> +++ b/include/sysemu/block-backend.h
> @@ -208,6 +208,7 @@ void blk_eject(BlockBackend *blk, bool eject_flag);
> int blk_get_flags(BlockBackend *blk);
> uint32_t blk_get_request_alignment(BlockBackend *blk);
> uint32_t blk_get_max_transfer(BlockBackend *blk);
> +uint64_t blk_get_max_hw_transfer(BlockBackend *blk);
> int blk_get_max_iov(BlockBackend *blk);
> void blk_set_guest_block_size(BlockBackend *blk, int align);
> void *blk_try_blockalign(BlockBackend *blk, size_t size);
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2
2021-06-08 13:16 ` [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2 Paolo Bonzini
2021-06-08 17:53 ` Eric Blake
@ 2021-06-09 16:15 ` Maxim Levitsky
1 sibling, 0 replies; 21+ messages in thread
From: Maxim Levitsky @ 2021-06-09 16:15 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel
Cc: kwolf, vsementsov, Max Reitz, qemu-block, Tom Yan
On Tue, 2021-06-08 at 15:16 +0200, Paolo Bonzini wrote:
> bs->sg is only true for character devices, but block devices can also
> be used with scsi-block and scsi-generic. Unfortunately BLKSECTGET
> returns bytes in an int for /dev/sgN devices, and sectors in a short
> for block devices, so account for that in the code.
>
> The maximum transfer also need not be a power of 2 (for example I have
> seen disks with 1280 KiB maximum transfer) so there's no need to pass
> the result through pow2floor.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/file-posix.c | 44 ++++++++++++++++++++++++--------------------
> 1 file changed, 24 insertions(+), 20 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index c9746d3eb6..1439293f63 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1149,22 +1149,27 @@ static void raw_reopen_abort(BDRVReopenState *state)
> s->reopen_state = NULL;
> }
>
> -static int sg_get_max_transfer_length(int fd)
> +static int hdev_get_max_hw_transfer(int fd, struct stat *st)
> {
> #ifdef BLKSECTGET
> - int max_bytes = 0;
> -
> - if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
> - return max_bytes;
> + if (S_ISBLK(st->st_mode)) {
> + unsigned short max_sectors = 0;
> + if (ioctl(fd, BLKSECTGET, &max_sectors) == 0) {
> + return max_sectors * 512;
> + }
> } else {
> - return -errno;
> + int max_bytes = 0;
> + if (ioctl(fd, BLKSECTGET, &max_bytes) == 0) {
Again I would use the bs->sg for that.
> + return max_bytes;
> + }
> }
> + return -errno;
> #else
> return -ENOSYS;
> #endif
> }
>
> -static int sg_get_max_segments(int fd)
> +static int hdev_get_max_segments(int fd, struct stat *st)
> {
> #ifdef CONFIG_LINUX
> char buf[32];
> @@ -1173,26 +1178,20 @@ static int sg_get_max_segments(int fd)
> int ret;
> int sysfd = -1;
> long max_segments;
> - struct stat st;
>
> - if (fstat(fd, &st)) {
> - ret = -errno;
> - goto out;
> - }
> -
> - if (S_ISCHR(st.st_mode)) {
> + if (S_ISCHR(st->st_mode)) {
> if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
> return ret;
> }
> return -ENOTSUP;
> }
>
> - if (!S_ISBLK(st.st_mode)) {
> + if (!S_ISBLK(st->st_mode)) {
> return -ENOTSUP;
> }
>
> sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
> - major(st.st_rdev), minor(st.st_rdev));
> + major(st->st_rdev), minor(st->st_rdev));
> sysfd = open(sysfspath, O_RDONLY);
> if (sysfd == -1) {
> ret = -errno;
> @@ -1229,15 +1228,20 @@ out:
> static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
> {
> BDRVRawState *s = bs->opaque;
> + struct stat st;
> +
> + if (fstat(s->fd, &st)) {
> + return;
> + }
>
> - if (bs->sg) {
> - int ret = sg_get_max_transfer_length(s->fd);
> + if (bs->sg || S_ISBLK(st.st_mode)) {
> + int ret = hdev_get_max_hw_transfer(s->fd, &st);
>
> if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
> - bs->bl.max_hw_transfer = pow2floor(ret);
> + bs->bl.max_hw_transfer = ret;
> }
>
> - ret = sg_get_max_segments(s->fd);
> + ret = hdev_get_max_segments(s->fd, &st);
> if (ret > 0) {
> bs->bl.max_iov = ret;
> }
Roughly speaking this looks correct, but I might have missed something as well.
This is roughly the same as patches from Tom Yan which I carried in my series
https://www.mail-archive.com/qemu-devel@nongnu.org/msg768258.html
https://www.mail-archive.com/qemu-devel@nongnu.org/msg768262.html
I like a bit more how he created separate functions for /dev/sg and for all other block devices.
Please take a look.
Also not related to this patch, you are missing my fix I did to the VPD limit emulation, please consider taking
it into the series:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg768260.html
Best regards,
Maxim Levitsky
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
2021-06-08 19:14 ` Vladimir Sementsov-Ogievskiy
2021-06-09 16:08 ` Maxim Levitsky
@ 2021-06-23 15:42 ` Max Reitz
2021-06-24 7:33 ` Vladimir Sementsov-Ogievskiy
1 sibling, 1 reply; 21+ messages in thread
From: Max Reitz @ 2021-06-23 15:42 UTC (permalink / raw)
To: Vladimir Sementsov-Ogievskiy, Paolo Bonzini, qemu-devel; +Cc: kwolf, qemu-block
On 08.06.21 21:14, Vladimir Sementsov-Ogievskiy wrote:
> 08.06.2021 16:16, Paolo Bonzini wrote:
>> Even though it was only called for devices that have bs->sg set (which
>> must be character devices), sg_get_max_segments looked at /sys/dev/block
>> which only works for block devices.
>>
>> On Linux the sg driver has its own way to provide the maximum number of
>> iovecs in a scatter/gather list, so add support for it. The block
>> device
>> path is kept because it will be reinstated in the next patches.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>> block/file-posix.c | 11 +++++++++++
>> 1 file changed, 11 insertions(+)
>>
>> diff --git a/block/file-posix.c b/block/file-posix.c
>> index f37dfc10b3..536998a1d6 100644
>> --- a/block/file-posix.c
>> +++ b/block/file-posix.c
>> @@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
>> goto out;
>> }
>> + if (S_ISCHR(st.st_mode)) {
>
> Why not check "if (bs->sg) {" instead? It seems to be more consistent
> with issuing SG_ ioctl. Or what I miss?
I dismissed this in v3, because I didn’t understand why you’d raise this
point. The function is called sg_*(), and it’s only called if bs->sg is
true anyway. So clearly we can use SG_ ioctls, because the whole
function is intended only for SG devices anyway.
This time, I looked forward, and perhaps starting at patch 4 I can
understand where you’re coming from, because then the function is used
for host devices in general.
So now I don’t particularly mind. I think it’s still clear that if
there’s a host device here that’s a character device, then that’s going
to be an SG device, so I don’t really have a preference between
S_ISCHR() and bs->sg.
Max
>> + if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
>> + return ret;
>> + }
>> + return -ENOTSUP;
>> + }
>> +
>> + if (!S_ISBLK(st.st_mode)) {
>> + return -ENOTSUP;
>> + }
>> +
>> sysfspath =
>> g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
>> major(st.st_rdev), minor(st.st_rdev));
>> sysfd = open(sysfspath, O_RDONLY);
>>
>
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 5/7] block: feature detection for host block support
2021-06-08 13:16 ` [PATCH v4 5/7] block: feature detection for host block support Paolo Bonzini
@ 2021-06-23 15:44 ` Max Reitz
0 siblings, 0 replies; 21+ messages in thread
From: Max Reitz @ 2021-06-23 15:44 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel; +Cc: kwolf, vsementsov, Joelle van Dyne, qemu-block
On 08.06.21 15:16, Paolo Bonzini wrote:
> From: Joelle van Dyne <j@getutm.app>
>
> On Darwin (iOS), there are no system level APIs for directly accessing
> host block devices. We detect this at configure time.
>
> Signed-off-by: Joelle van Dyne <j@getutm.app>
> Message-Id: <20210315180341.31638-2-j@getutm.app>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/file-posix.c | 33 ++++++++++++++++++++++-----------
> meson.build | 6 +++++-
> qapi/block-core.json | 10 +++++++---
> 3 files changed, 34 insertions(+), 15 deletions(-)
[...]
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 2ea294129e..2dd41be156 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -897,7 +897,8 @@
> 'discriminator': 'driver',
> 'data': {
> 'file': 'BlockStatsSpecificFile',
> - 'host_device': 'BlockStatsSpecificFile',
> + 'host_device': { 'type': 'BlockStatsSpecificFile',
> + 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
> 'nvme': 'BlockStatsSpecificNvme' } }
>
> ##
> @@ -2814,7 +2815,9 @@
> { 'enum': 'BlockdevDriver',
> 'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
> 'cloop', 'compress', 'copy-on-read', 'dmg', 'file', 'ftp', 'ftps',
> - 'gluster', 'host_cdrom', 'host_device', 'http', 'https', 'iscsi',
> + 'gluster', 'host_cdrom',
> + {'name': 'host_device', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
> + 'http', 'https', 'iscsi',
> 'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', 'parallels',
> 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd',
> { 'name': 'replication', 'if': 'defined(CONFIG_REPLICATION)' },
> @@ -3996,7 +3999,8 @@
> 'ftps': 'BlockdevOptionsCurlFtps',
> 'gluster': 'BlockdevOptionsGluster',
> 'host_cdrom': 'BlockdevOptionsFile',
> - 'host_device':'BlockdevOptionsFile',
> + 'host_device': { 'type': 'BlockdevOptionsFile',
> + 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
> 'http': 'BlockdevOptionsCurlHttp',
> 'https': 'BlockdevOptionsCurlHttps',
> 'iscsi': 'BlockdevOptionsIscsi',
As I asked in v3: What about host_cdrom?
Max
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use
2021-06-08 13:16 ` [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use Paolo Bonzini
@ 2021-06-23 15:47 ` Max Reitz
0 siblings, 0 replies; 21+ messages in thread
From: Max Reitz @ 2021-06-23 15:47 UTC (permalink / raw)
To: Paolo Bonzini, qemu-devel
Cc: kwolf, Peter Maydell, vsementsov, qemu-block, Joelle van Dyne,
Warner Losh
On 08.06.21 15:16, Paolo Bonzini wrote:
> From: Joelle van Dyne <j@getutm.app>
>
> iOS hosts do not have these defined so we fallback to the
> default behaviour.
>
> Co-authored-by: Warner Losh <imp@bsdimp.com>
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Joelle van Dyne <j@getutm.app>
> Message-Id: <20210315180341.31638-4-j@getutm.app>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> block/file-posix.c | 21 +++++++++------------
> 1 file changed, 9 insertions(+), 12 deletions(-)
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 5821e1afed..4e2f7cf508 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -2322,8 +2322,11 @@ static int64_t raw_getlength(BlockDriverState *bs)
> again:
> #endif
> if (!fstat(fd, &sb) && (S_IFCHR & sb.st_mode)) {
> + size = 0;
> #ifdef DIOCGMEDIASIZE
> - if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size))
> + if (ioctl(fd, DIOCGMEDIASIZE, (off_t *)&size)) {
> + size = 0;
> + }
> #elif defined(DIOCGPART)
> {
> struct partinfo pi;
> @@ -2332,9 +2335,7 @@ again:
> else
> size = 0;
> }
> - if (size == 0)
> -#endif
> -#if defined(__APPLE__) && defined(__MACH__)
> +#elif defined(DKIOCGETBLOCKCOUNT) && defined(DKIOCGETBLOCKSIZE)
In v3, I was wondering whether it’s intentional that the following
DKIOCGETBLOCKCOUNT/SIZE block would no longer be used as a fallback if
the DIOCGMEDIASIZE ioctl failed (which it was before this patch). I’m
still wondering.
Max
> {
> uint64_t sectors = 0;
> uint32_t sector_size = 0;
> @@ -2342,19 +2343,15 @@ again:
> if (ioctl(fd, DKIOCGETBLOCKCOUNT, §ors) == 0
> && ioctl(fd, DKIOCGETBLOCKSIZE, §or_size) == 0) {
> size = sectors * sector_size;
> - } else {
> - size = lseek(fd, 0LL, SEEK_END);
> - if (size < 0) {
> - return -errno;
> - }
> }
> }
> -#else
> - size = lseek(fd, 0LL, SEEK_END);
> +#endif
> + if (size == 0) {
> + size = lseek(fd, 0LL, SEEK_END);
> + }
> if (size < 0) {
> return -errno;
> }
> -#endif
> #if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
> switch(s->type) {
> case FTYPE_CD:
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices
2021-06-23 15:42 ` Max Reitz
@ 2021-06-24 7:33 ` Vladimir Sementsov-Ogievskiy
0 siblings, 0 replies; 21+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2021-06-24 7:33 UTC (permalink / raw)
To: Max Reitz, Paolo Bonzini, qemu-devel; +Cc: qemu-block, kwolf
23.06.2021 18:42, Max Reitz wrote:
> On 08.06.21 21:14, Vladimir Sementsov-Ogievskiy wrote:
>> 08.06.2021 16:16, Paolo Bonzini wrote:
>>> Even though it was only called for devices that have bs->sg set (which
>>> must be character devices), sg_get_max_segments looked at /sys/dev/block
>>> which only works for block devices.
>>>
>>> On Linux the sg driver has its own way to provide the maximum number of
>>> iovecs in a scatter/gather list, so add support for it. The block device
>>> path is kept because it will be reinstated in the next patches.
>>>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>> block/file-posix.c | 11 +++++++++++
>>> 1 file changed, 11 insertions(+)
>>>
>>> diff --git a/block/file-posix.c b/block/file-posix.c
>>> index f37dfc10b3..536998a1d6 100644
>>> --- a/block/file-posix.c
>>> +++ b/block/file-posix.c
>>> @@ -1180,6 +1180,17 @@ static int sg_get_max_segments(int fd)
>>> goto out;
>>> }
>>> + if (S_ISCHR(st.st_mode)) {
>>
>> Why not check "if (bs->sg) {" instead? It seems to be more consistent with issuing SG_ ioctl. Or what I miss?
>
> I dismissed this in v3, because I didn’t understand why you’d raise this point. The function is called sg_*(), and it’s only called if bs->sg is true anyway. So clearly we can use SG_ ioctls, because the whole function is intended only for SG devices anyway.
>
> This time, I looked forward, and perhaps starting at patch 4 I can understand where you’re coming from, because then the function is used for host devices in general.
>
> So now I don’t particularly mind. I think it’s still clear that if there’s a host device here that’s a character device, then that’s going to be an SG device, so I don’t really have a preference between S_ISCHR() and bs->sg.
>
If I understand all correctly:
In this patch we don't need neither S_ISCHR nor bs->sg check: they both must pass for sg devices. Starting from patch 4 we'll need here if (bs->sg) check.
>
>>> + if (ioctl(fd, SG_GET_SG_TABLESIZE, &ret) == 0) {
>>> + return ret;
>>> + }
>>> + return -ENOTSUP;
>>> + }
>>> +
>>> + if (!S_ISBLK(st.st_mode)) {
>>> + return -ENOTSUP;
>>> + }
>>> +
>>> sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
>>> major(st.st_rdev), minor(st.st_rdev));
>>> sysfd = open(sysfspath, O_RDONLY);
>>>
>>
>>
>
--
Best regards,
Vladimir
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2021-06-24 7:34 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-08 13:16 [PATCH v4 0/7] block: file-posix queue Paolo Bonzini
2021-06-08 13:16 ` [PATCH v4 1/7] file-posix: fix max_iov for /dev/sg devices Paolo Bonzini
2021-06-08 17:34 ` Eric Blake
2021-06-08 19:14 ` Vladimir Sementsov-Ogievskiy
2021-06-09 16:08 ` Maxim Levitsky
2021-06-23 15:42 ` Max Reitz
2021-06-24 7:33 ` Vladimir Sementsov-Ogievskiy
2021-06-08 13:16 ` [PATCH v4 2/7] scsi-generic: pass max_segments via max_iov field in BlockLimits Paolo Bonzini
2021-06-08 17:42 ` Eric Blake
2021-06-09 16:10 ` Maxim Levitsky
2021-06-08 13:16 ` [PATCH v4 3/7] block: add max_hw_transfer to BlockLimits Paolo Bonzini
2021-06-08 17:48 ` Eric Blake
2021-06-09 16:12 ` Maxim Levitsky
2021-06-08 13:16 ` [PATCH v4 4/7] file-posix: try BLKSECTGET on block devices too, do not round to power of 2 Paolo Bonzini
2021-06-08 17:53 ` Eric Blake
2021-06-09 16:15 ` Maxim Levitsky
2021-06-08 13:16 ` [PATCH v4 5/7] block: feature detection for host block support Paolo Bonzini
2021-06-23 15:44 ` Max Reitz
2021-06-08 13:16 ` [PATCH v4 6/7] block: check for sys/disk.h Paolo Bonzini
2021-06-08 13:16 ` [PATCH v4 7/7] block: detect DKIOCGETBLOCKCOUNT/SIZE before use Paolo Bonzini
2021-06-23 15:47 ` Max Reitz
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).