* [PATCH V2 1/5] ublk_drv: avoid to leak ublk device in case that add_disk fails
2022-07-27 14:16 [PATCH V2 0/5] ublk_drv: add generic mechanism to get/set parameters Ming Lei
@ 2022-07-27 14:16 ` Ming Lei
2022-07-27 16:21 ` Christoph Hellwig
2022-07-27 14:16 ` [PATCH V2 2/5] ublk_drv: cancel device even though disk isn't up Ming Lei
` (3 subsequent siblings)
4 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2022-07-27 14:16 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, ZiyangZhang, Christoph Hellwig, Ming Lei
->free_disk is only called after disk is added successfully, so
not hold ublk device reference count until add_disk is done.
Fixes: 6d9e6dfdf3b2 ("ublk: defer disk allocation")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/ublk_drv.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 255b2de46a24..b30d6c3355e8 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -215,8 +215,11 @@ static void ublk_free_disk(struct gendisk *disk)
{
struct ublk_device *ub = disk->private_data;
- clear_bit(UB_STATE_USED, &ub->state);
- put_device(&ub->cdev_dev);
+ /* only called for added/used disk */
+ if (test_bit(UB_STATE_USED, &ub->state)) {
+ clear_bit(UB_STATE_USED, &ub->state);
+ put_device(&ub->cdev_dev);
+ }
}
static const struct block_device_operations ub_fops = {
@@ -1181,12 +1184,12 @@ static int ublk_ctrl_start_dev(struct io_uring_cmd *cmd)
ub->dev_info.ublksrv_pid = ublksrv_pid;
ub->ub_disk = disk;
- get_device(&ub->cdev_dev);
ret = add_disk(disk);
if (ret) {
put_disk(disk);
goto out_unlock;
}
+ get_device(&ub->cdev_dev);
set_bit(UB_STATE_USED, &ub->state);
ub->dev_info.state = UBLK_S_DEV_LIVE;
out_unlock:
--
2.31.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH V2 1/5] ublk_drv: avoid to leak ublk device in case that add_disk fails
2022-07-27 14:16 ` [PATCH V2 1/5] ublk_drv: avoid to leak ublk device in case that add_disk fails Ming Lei
@ 2022-07-27 16:21 ` Christoph Hellwig
2022-07-28 0:18 ` Ming Lei
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2022-07-27 16:21 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, linux-block, ZiyangZhang, Christoph Hellwig
maybe s/avoid/don't/ in the subject?
> - get_device(&ub->cdev_dev);
> ret = add_disk(disk);
> if (ret) {
> put_disk(disk);
> goto out_unlock;
Maybe just add a put_device here in the error branch to keep
things simple?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2 1/5] ublk_drv: avoid to leak ublk device in case that add_disk fails
2022-07-27 16:21 ` Christoph Hellwig
@ 2022-07-28 0:18 ` Ming Lei
2022-07-28 2:57 ` Ziyang Zhang
0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2022-07-28 0:18 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, ZiyangZhang
On Wed, Jul 27, 2022 at 06:21:32PM +0200, Christoph Hellwig wrote:
> maybe s/avoid/don't/ in the subject?
OK, will change in V3.
>
> > - get_device(&ub->cdev_dev);
> > ret = add_disk(disk);
> > if (ret) {
> > put_disk(disk);
> > goto out_unlock;
>
> Maybe just add a put_device here in the error branch to keep
> things simple?
That is fine.
Another way is to add 'out_put_disk' error label which can be
reused with previous error handling.
Thanks,
Ming
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2 1/5] ublk_drv: avoid to leak ublk device in case that add_disk fails
2022-07-28 0:18 ` Ming Lei
@ 2022-07-28 2:57 ` Ziyang Zhang
0 siblings, 0 replies; 12+ messages in thread
From: Ziyang Zhang @ 2022-07-28 2:57 UTC (permalink / raw)
To: Ming Lei, Christoph Hellwig; +Cc: Jens Axboe, linux-block
On 2022/7/28 08:18, Ming Lei wrote:
> On Wed, Jul 27, 2022 at 06:21:32PM +0200, Christoph Hellwig wrote:
>> maybe s/avoid/don't/ in the subject?
>
> OK, will change in V3.
>
>>
>>> - get_device(&ub->cdev_dev);
>>> ret = add_disk(disk);
>>> if (ret) {
>>> put_disk(disk);
>>> goto out_unlock;
>>
>> Maybe just add a put_device here in the error branch to keep
>> things simple?
>
> That is fine.
>
> Another way is to add 'out_put_disk' error label which can be
> reused with previous error handling.
+1, adding another error label looks clear and simple.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH V2 2/5] ublk_drv: cancel device even though disk isn't up
2022-07-27 14:16 [PATCH V2 0/5] ublk_drv: add generic mechanism to get/set parameters Ming Lei
2022-07-27 14:16 ` [PATCH V2 1/5] ublk_drv: avoid to leak ublk device in case that add_disk fails Ming Lei
@ 2022-07-27 14:16 ` Ming Lei
2022-07-27 14:16 ` [PATCH V2 3/5] ublk_drv: add SET_PARAM/GET_PARAM control command Ming Lei
` (2 subsequent siblings)
4 siblings, 0 replies; 12+ messages in thread
From: Ming Lei @ 2022-07-27 14:16 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, ZiyangZhang, Christoph Hellwig, Ming Lei
Each ublk queue is started before adding disk, we have to
cancel queues in ublk_stop_dev() so that ubq daemon can be exited,
otherwise DEL_DEV command may hang forever.
Fixes: 71f28f3136af ("ublk_drv: add io_uring based userspace block driver")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/ublk_drv.c | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index b30d6c3355e8..cb880308a378 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -791,16 +791,27 @@ static void ublk_daemon_monitor_work(struct work_struct *work)
UBLK_DAEMON_MONITOR_PERIOD);
}
+static inline bool ublk_queue_ready(struct ublk_queue *ubq)
+{
+ return ubq->nr_io_ready == ubq->q_depth;
+}
+
static void ublk_cancel_queue(struct ublk_queue *ubq)
{
int i;
+ if (!ublk_queue_ready(ubq))
+ return;
+
for (i = 0; i < ubq->q_depth; i++) {
struct ublk_io *io = &ubq->ios[i];
if (io->flags & UBLK_IO_FLAG_ACTIVE)
io_uring_cmd_done(io->cmd, UBLK_IO_RES_ABORT, 0);
}
+
+ /* all io commands are canceled */
+ ubq->nr_io_ready = 0;
}
/* Cancel all pending commands, must be called after del_gendisk() returns */
@@ -821,19 +832,14 @@ static void ublk_stop_dev(struct ublk_device *ub)
del_gendisk(ub->ub_disk);
ub->dev_info.state = UBLK_S_DEV_DEAD;
ub->dev_info.ublksrv_pid = -1;
- ublk_cancel_dev(ub);
put_disk(ub->ub_disk);
ub->ub_disk = NULL;
unlock:
+ ublk_cancel_dev(ub);
mutex_unlock(&ub->mutex);
cancel_delayed_work_sync(&ub->monitor_work);
}
-static inline bool ublk_queue_ready(struct ublk_queue *ubq)
-{
- return ubq->nr_io_ready == ubq->q_depth;
-}
-
/* device can only be started after all IOs are ready */
static void ublk_mark_io_ready(struct ublk_device *ub, struct ublk_queue *ubq)
{
--
2.31.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH V2 3/5] ublk_drv: add SET_PARAM/GET_PARAM control command
2022-07-27 14:16 [PATCH V2 0/5] ublk_drv: add generic mechanism to get/set parameters Ming Lei
2022-07-27 14:16 ` [PATCH V2 1/5] ublk_drv: avoid to leak ublk device in case that add_disk fails Ming Lei
2022-07-27 14:16 ` [PATCH V2 2/5] ublk_drv: cancel device even though disk isn't up Ming Lei
@ 2022-07-27 14:16 ` Ming Lei
2022-07-27 16:22 ` Christoph Hellwig
2022-07-27 14:16 ` [PATCH V2 4/5] ublk_drv: add two parameter types Ming Lei
2022-07-27 14:16 ` [PATCH V2 5/5] ublk_drv: cleanup ublksrv_ctrl_dev_info Ming Lei
4 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2022-07-27 14:16 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, ZiyangZhang, Christoph Hellwig, Ming Lei
Add two commands to set/get parameters generically.
One important goal of ublk is to provide generic framework for making
block device by userspace flexibly.
As one generic block device, there are still lots of block parameters,
such as max_sectors, write_cache/fua, discard related limits,
zoned parameters, ...., so this patch starts to add mechanism for set/get
device parameters.
Both generic block parameters(all kinds of queue settings) and ublk feature
related parameters can be covered with this way, then it becomes quite easy
to extend in future.
The parameter passed from userspace is added to one xarray because most
of parameters are optional, and their type is used as key of xarray,
meantime in future there may be lots of parameters.
This way provides mechanism to simulate any kind of generic block device from
userspace easily, from both block queue setting viewpoint or ublk
feature viewpoint.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/ublk_drv.c | 103 ++++++++++++++++++++++++++++++++++
include/uapi/linux/ublk_cmd.h | 10 ++++
2 files changed, 113 insertions(+)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index cb880308a378..39bb2d943dc2 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -137,6 +137,8 @@ struct ublk_device {
spinlock_t mm_lock;
struct mm_struct *mm;
+ struct xarray params;
+
struct completion completion;
unsigned int nr_queues_ready;
atomic_t nr_aborted_queues;
@@ -160,6 +162,18 @@ static DEFINE_MUTEX(ublk_ctl_mutex);
static struct miscdevice ublk_misc;
+static int ublk_validate_param_header(const struct ublk_device *ub,
+ const struct ublk_param_header *h)
+{
+ return -EINVAL;
+}
+
+static int ublk_install_param(struct ublk_device *ub,
+ struct ublk_param_header *h)
+{
+ return -EINVAL;
+}
+
static inline bool ublk_can_use_task_work(const struct ublk_queue *ubq)
{
if (IS_BUILTIN(CONFIG_BLK_DEV_UBLK) &&
@@ -1055,6 +1069,7 @@ static void ublk_cdev_rel(struct device *dev)
ublk_deinit_queues(ub);
ublk_free_dev_number(ub);
mutex_destroy(&ub->mutex);
+ xa_destroy(&ub->params);
kfree(ub);
}
@@ -1300,6 +1315,7 @@ static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd)
spin_lock_init(&ub->mm_lock);
INIT_WORK(&ub->stop_work, ublk_stop_work_fn);
INIT_DELAYED_WORK(&ub->monitor_work, ublk_daemon_monitor_work);
+ xa_init(&ub->params);
ret = ublk_alloc_dev_number(ub, header->dev_id);
if (ret < 0)
@@ -1445,6 +1461,87 @@ static int ublk_ctrl_get_dev_info(struct io_uring_cmd *cmd)
return ret;
}
+static int ublk_ctrl_get_param(struct io_uring_cmd *cmd)
+{
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ void __user *argp = (void __user *)(unsigned long)header->addr;
+ struct ublk_device *ub;
+ struct ublk_param_header ph;
+ struct ublk_param_header *param = NULL;
+ int ret = 0;
+
+ if (header->len <= sizeof(ph) || !header->addr)
+ return -EINVAL;
+
+ ub = ublk_get_device_from_id(header->dev_id);
+ if (!ub)
+ return -EINVAL;
+
+ ret = -EFAULT;
+ if (copy_from_user(&ph, argp, sizeof(ph)))
+ goto out_put;
+
+ ret = ublk_validate_param_header(ub, &ph);
+ if (ret)
+ goto out_put;
+
+ mutex_lock(&ub->mutex);
+ param = xa_load(&ub->params, ph.type);
+ mutex_unlock(&ub->mutex);
+ if (!param)
+ ret = -EINVAL;
+ else if (copy_to_user(argp, param, ph.len))
+ ret = -EFAULT;
+out_put:
+ ublk_put_device(ub);
+ return ret;
+}
+
+static int ublk_ctrl_set_param(struct io_uring_cmd *cmd)
+{
+ struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+ void __user *argp = (void __user *)(unsigned long)header->addr;
+ struct ublk_device *ub;
+ struct ublk_param_header ph;
+ struct ublk_param_header *param = NULL;
+ int ret = -EFAULT;
+
+ if (header->len <= sizeof(ph) || !header->addr)
+ return -EINVAL;
+
+ ub = ublk_get_device_from_id(header->dev_id);
+ if (!ub)
+ return -EINVAL;
+
+ if (copy_from_user(&ph, argp, sizeof(ph)))
+ goto out_put;
+
+ ret = ublk_validate_param_header(ub, &ph);
+ if (ret)
+ goto out_put;
+
+ param = kmalloc(ph.len, GFP_KERNEL);
+ if (!param) {
+ ret = -ENOMEM;
+ } else if (copy_from_user(param, argp, ph.len)) {
+ ret = -EFAULT;
+ } else {
+ /* parameters can only be changed when device isn't live */
+ mutex_lock(&ub->mutex);
+ if (ub->dev_info.state != UBLK_S_DEV_LIVE)
+ ret = ublk_install_param(ub, param);
+ else
+ ret = -EACCES;
+ mutex_unlock(&ub->mutex);
+ }
+out_put:
+ ublk_put_device(ub);
+ if (ret)
+ kfree(param);
+
+ return ret;
+}
+
static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
unsigned int issue_flags)
{
@@ -1480,6 +1577,12 @@ static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
case UBLK_CMD_GET_QUEUE_AFFINITY:
ret = ublk_ctrl_get_queue_affinity(cmd);
break;
+ case UBLK_CMD_GET_PARAM:
+ ret = ublk_ctrl_get_param(cmd);
+ break;
+ case UBLK_CMD_SET_PARAM:
+ ret = ublk_ctrl_set_param(cmd);
+ break;
default:
break;
}
diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
index ca33092354ab..1fb56d35ba8a 100644
--- a/include/uapi/linux/ublk_cmd.h
+++ b/include/uapi/linux/ublk_cmd.h
@@ -15,6 +15,8 @@
#define UBLK_CMD_DEL_DEV 0x05
#define UBLK_CMD_START_DEV 0x06
#define UBLK_CMD_STOP_DEV 0x07
+#define UBLK_CMD_SET_PARAM 0x08
+#define UBLK_CMD_GET_PARAM 0x09
/*
* IO commands, issued by ublk server, and handled by ublk driver.
@@ -158,4 +160,12 @@ struct ublksrv_io_cmd {
__u64 addr;
};
+/* 512 is big enough for single parameter */
+#define UBLK_MAX_PARAM_LEN 512
+
+struct ublk_param_header {
+ __u16 type;
+ __u16 len;
+};
+
#endif
--
2.31.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH V2 3/5] ublk_drv: add SET_PARAM/GET_PARAM control command
2022-07-27 14:16 ` [PATCH V2 3/5] ublk_drv: add SET_PARAM/GET_PARAM control command Ming Lei
@ 2022-07-27 16:22 ` Christoph Hellwig
2022-07-28 1:56 ` Ming Lei
0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2022-07-27 16:22 UTC (permalink / raw)
To: Ming Lei; +Cc: Jens Axboe, linux-block, ZiyangZhang, Christoph Hellwig
As stated in the previous discussion I think this is a very bad idea
that leads to a lot of boiler plate code. If you don't want to rely
on zeroed fields we can add a mask of valid fields, similar to how
e.g. the statx API works.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2 3/5] ublk_drv: add SET_PARAM/GET_PARAM control command
2022-07-27 16:22 ` Christoph Hellwig
@ 2022-07-28 1:56 ` Ming Lei
2022-07-28 4:38 ` Ming Lei
0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2022-07-28 1:56 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, ZiyangZhang, ming.lei
Hi Christoph,
On Wed, Jul 27, 2022 at 06:22:34PM +0200, Christoph Hellwig wrote:
> As stated in the previous discussion I think this is a very bad idea
> that leads to a lot of boiler plate code. If you don't want to rely
But you never point out where the boiler plate code is, care to share
what/where it is?
If you don't like xarray, we can replace it with one plain array
from the beginning, this change is just in implementation level.
Given it is ABI interface, I'd suggest to consolidate it from the
beginning.
> on zeroed fields we can add a mask of valid fields, similar to how
> e.g. the statx API works.
With one mask for marking valid fields, we still need to group
fields, and one bit in mask has to represent one single group, and
it can't represent each single field.
Also one length field has to be added in the header for keeping
compatibility with old app.
With the above two change, it becomes very similar with the approach
in this patch.
IMO there are more advantages in grouping parameter explicitly:
1) avoid big chunk of memory for storing lots of unnecessary parameter
if the big params data structure is extended, and we just store whatever
the userspace cares.
2) easy to add new type by just implementing two callbacks(both optionally),
and code is actually well organized, and each callback just focuses on the
interested parameter type
3) parameter is easier to verify, since we know each parameter's length and type.
With mask, we can only know if one parameter type exists in the big
parameters data structure or not.
Thanks,
Ming
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH V2 3/5] ublk_drv: add SET_PARAM/GET_PARAM control command
2022-07-28 1:56 ` Ming Lei
@ 2022-07-28 4:38 ` Ming Lei
0 siblings, 0 replies; 12+ messages in thread
From: Ming Lei @ 2022-07-28 4:38 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: Jens Axboe, linux-block, ZiyangZhang
On Thu, Jul 28, 2022 at 09:56:26AM +0800, Ming Lei wrote:
> Hi Christoph,
>
> On Wed, Jul 27, 2022 at 06:22:34PM +0200, Christoph Hellwig wrote:
> > As stated in the previous discussion I think this is a very bad idea
> > that leads to a lot of boiler plate code. If you don't want to rely
>
> But you never point out where the boiler plate code is, care to share
> what/where it is?
>
> If you don't like xarray, we can replace it with one plain array
> from the beginning, this change is just in implementation level.
>
> Given it is ABI interface, I'd suggest to consolidate it from the
> beginning.
>
> > on zeroed fields we can add a mask of valid fields, similar to how
> > e.g. the statx API works.
>
> With one mask for marking valid fields, we still need to group
> fields, and one bit in mask has to represent one single group, and
> it can't represent each single field.
>
> Also one length field has to be added in the header for keeping
> compatibility with old app.
>
> With the above two change, it becomes very similar with the approach
> in this patch.
>
> IMO there are more advantages in grouping parameter explicitly:
>
> 1) avoid big chunk of memory for storing lots of unnecessary parameter
> if the big params data structure is extended, and we just store whatever
> the userspace cares.
>
> 2) easy to add new type by just implementing two callbacks(both optionally),
> and code is actually well organized, and each callback just focuses on the
> interested parameter type
>
> 3) parameter is easier to verify, since we know each parameter's length and type.
> With mask, we can only know if one parameter type exists in the big
> parameters data structure or not.
Another thing with putting everything into single structure is that the
params structure will become a mess sooner or later, since all kinds of
parameters will be added at the end, none of them are related.
For example, Ziyang is working on userspace recovery feature which may
need some data which can't be hold in single bit flag, meantime someone
may work on zoned target implementation[1] which needs to send zoned
parameter, and I am planning to add backing pages for handling running
out of pages, which needs userspace to pre-allocate/send buffer to driver
as backing page[s], ... Then all these unrelated parameters are added at
the end, and the single structure becomes fatter and fatter, then become
unreadable and hard to maintain.
[1] https://lore.kernel.org/qemu-devel/fa7de750-8def-c532-8c86-64c7505608e0@opensource.wdc.com/T/#u
Thanks,
Ming
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH V2 4/5] ublk_drv: add two parameter types
2022-07-27 14:16 [PATCH V2 0/5] ublk_drv: add generic mechanism to get/set parameters Ming Lei
` (2 preceding siblings ...)
2022-07-27 14:16 ` [PATCH V2 3/5] ublk_drv: add SET_PARAM/GET_PARAM control command Ming Lei
@ 2022-07-27 14:16 ` Ming Lei
2022-07-27 14:16 ` [PATCH V2 5/5] ublk_drv: cleanup ublksrv_ctrl_dev_info Ming Lei
4 siblings, 0 replies; 12+ messages in thread
From: Ming Lei @ 2022-07-27 14:16 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, ZiyangZhang, Christoph Hellwig, Ming Lei
Each block devices have lots of queue setting, most of them needs
device's knowledge to configure correctly. Device parameter can be
thought as device's side knowledge since ublk does implement block
device from userspace.
Add ublk_basic_param & ublk_discard_param first, both two types are
used now.
Another change is that discard support becomes not enabled at default,
and it needs userspace to send ublk_discard_param for enabling it.
Also ublk_basic_param has to be set before sending START_DEV, otherwise
ublk block device won't be setup. Meantime not set dev_blocks from
handling START_DEV any more.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/ublk_drv.c | 197 ++++++++++++++++++++++++++++++----
include/uapi/linux/ublk_cmd.h | 37 +++++++
2 files changed, 216 insertions(+), 18 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 39bb2d943dc2..8188079ea185 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -129,6 +129,7 @@ struct ublk_device {
#define UB_STATE_OPEN (1 << 0)
#define UB_STATE_USED (1 << 1)
+#define UB_STATE_CONFIGURED (1 << 2)
unsigned long state;
int ub_number;
@@ -151,6 +152,16 @@ struct ublk_device {
struct work_struct stop_work;
};
+typedef int (ublk_param_validate)(const struct ublk_device *,
+ const struct ublk_param_header *);
+typedef void (ublk_param_apply)(struct ublk_device *ub,
+ const struct ublk_param_header *);
+
+struct ublk_param_ops {
+ ublk_param_validate *validate_fn;
+ ublk_param_apply *apply_fn;
+};
+
static dev_t ublk_chr_devt;
static struct class *ublk_chr_class;
@@ -162,16 +173,166 @@ static DEFINE_MUTEX(ublk_ctl_mutex);
static struct miscdevice ublk_misc;
+static int ublk_dev_param_basic_validate(const struct ublk_device *ub,
+ const struct ublk_param_header *header)
+{
+ const struct ublk_basic_param *p = (struct ublk_basic_param *)header;
+
+ if (p->logical_bs_shift > PAGE_SHIFT)
+ return -EINVAL;
+
+ if (p->logical_bs_shift > p->physical_bs_shift)
+ return -EINVAL;
+
+ if (p->max_sectors > (ub->dev_info.rq_max_blocks << (ub->bs_shift - 9)))
+ return -EINVAL;
+
+ return 0;
+}
+
+/* basic param is the only one parameter which has to be set */
+static void ublk_dev_param_basic_apply(struct ublk_device *ub,
+ const struct ublk_param_header *header)
+{
+ struct request_queue *q = ub->ub_disk->queue;
+ const struct ublk_basic_param *p = (struct ublk_basic_param *)header;
+
+ blk_queue_logical_block_size(q, 1 << p->logical_bs_shift);
+ blk_queue_physical_block_size(q, 1 << p->physical_bs_shift);
+ blk_queue_io_min(q, 1 << p->io_min_shift);
+ blk_queue_io_opt(q, 1 << p->io_opt_shift);
+
+ blk_queue_write_cache(q, p->attrs & UBLK_ATTR_VOLATILE_CACHE,
+ p->attrs & UBLK_ATTR_FUA);
+ if (p->attrs & UBLK_ATTR_ROTATIONAL)
+ blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
+ else
+ blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
+
+ blk_queue_max_hw_sectors(q, p->max_sectors);
+ blk_queue_chunk_sectors(q, p->chunk_sectors);
+ blk_queue_virt_boundary(q, p->virt_boundary_mask);
+
+ if (p->attrs & UBLK_ATTR_READ_ONLY)
+ set_disk_ro(ub->ub_disk, true);
+
+ set_capacity(ub->ub_disk, p->dev_sectors);
+
+ set_bit(UB_STATE_CONFIGURED, &ub->state);
+}
+
+static int ublk_dev_param_discard_validate(const struct ublk_device *ub,
+ const struct ublk_param_header *header)
+{
+ const struct ublk_discard_param *p = (struct ublk_discard_param *)header;
+
+ /* So far, only support single segment discard */
+ if (p->max_discard_sectors && p->max_discard_segments != 1)
+ return -EINVAL;
+ return 0;
+}
+
+static void ublk_dev_param_discard_apply(struct ublk_device *ub,
+ const struct ublk_param_header *header)
+{
+ struct request_queue *q = ub->ub_disk->queue;
+ const struct ublk_discard_param *p = (struct ublk_discard_param *)
+ header;
+
+ q->limits.discard_alignment = p->discard_alignment;
+ q->limits.discard_granularity = p->discard_granularity;
+ blk_queue_max_discard_sectors(q, p->max_discard_sectors);
+ blk_queue_max_write_zeroes_sectors(q,
+ p->max_write_zeroes_sectors);
+ blk_queue_max_discard_segments(q, p->max_discard_segments);
+}
+
+static const unsigned int param_len[] = {
+ [UBLK_PARAM_TYPE_BASIC] = sizeof(struct ublk_basic_param),
+ [UBLK_PARAM_TYPE_DISCARD] = sizeof(struct ublk_discard_param),
+};
+
+static const struct ublk_param_ops param_ops[] = {
+ [UBLK_PARAM_TYPE_BASIC] = {
+ .validate_fn = ublk_dev_param_basic_validate,
+ .apply_fn = ublk_dev_param_basic_apply,
+ },
+ [UBLK_PARAM_TYPE_DISCARD] = {
+ .validate_fn = ublk_dev_param_discard_validate,
+ .apply_fn = ublk_dev_param_discard_apply,
+ },
+};
+
static int ublk_validate_param_header(const struct ublk_device *ub,
const struct ublk_param_header *h)
{
- return -EINVAL;
+ if (h->type >= UBLK_PARAM_TYPE_LAST)
+ return -EINVAL;
+
+ if (h->len > UBLK_MAX_PARAM_LEN)
+ return -EINVAL;
+
+ if (h->len != param_len[h->type])
+ return -EINVAL;
+
+ return 0;
+}
+
+static int ublk_validate_param(const struct ublk_device *ub,
+ const struct ublk_param_header *h)
+{
+ int ret = ublk_validate_param_header(ub, h);
+
+ if (ret)
+ return ret;
+
+ if (param_ops[h->type].validate_fn)
+ return param_ops[h->type].validate_fn(ub, h);
+
+ return 0;
}
+/* Old parameter with same type will be overridden */
static int ublk_install_param(struct ublk_device *ub,
struct ublk_param_header *h)
{
- return -EINVAL;
+ void *old;
+ int ret;
+
+ ret = ublk_validate_param(ub, h);
+ if (ret)
+ return ret;
+
+ old = xa_store(&ub->params, h->type, h, GFP_KERNEL);
+ if (xa_is_err(old))
+ return xa_err(old);
+ kfree(old);
+ return 0;
+}
+
+static void ublk_apply_param(struct ublk_device *ub,
+ const struct ublk_param_header *h)
+{
+ if (param_ops[h->type].apply_fn)
+ param_ops[h->type].apply_fn(ub, h);
+}
+
+static void ublk_apply_params(struct ublk_device *ub)
+{
+ struct ublk_param_header *h;
+ unsigned long type;
+
+ xa_for_each(&ub->params, type, h)
+ ublk_apply_param(ub, h);
+}
+
+static void ublk_uninstall_params(struct ublk_device *ub)
+{
+ unsigned long type;
+ void *p;
+
+ xa_for_each(&ub->params, type, p)
+ kfree(p);
}
static inline bool ublk_can_use_task_work(const struct ublk_queue *ubq)
@@ -234,6 +395,7 @@ static void ublk_free_disk(struct gendisk *disk)
clear_bit(UB_STATE_USED, &ub->state);
put_device(&ub->cdev_dev);
}
+ clear_bit(UB_STATE_CONFIGURED, &ub->state);
}
static const struct block_device_operations ub_fops = {
@@ -1067,6 +1229,7 @@ static void ublk_cdev_rel(struct device *dev)
blk_mq_free_tag_set(&ub->tag_set);
ublk_deinit_queues(ub);
+ ublk_uninstall_params(ub);
ublk_free_dev_number(ub);
mutex_destroy(&ub->mutex);
xa_destroy(&ub->params);
@@ -1156,7 +1319,6 @@ static int ublk_ctrl_start_dev(struct io_uring_cmd *cmd)
{
struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
int ublksrv_pid = (int)header->data[0];
- unsigned long dev_blocks = header->data[1];
struct ublk_device *ub;
struct gendisk *disk;
int ret = -EINVAL;
@@ -1179,10 +1341,6 @@ static int ublk_ctrl_start_dev(struct io_uring_cmd *cmd)
goto out_unlock;
}
- /* We may get disk size updated */
- if (dev_blocks)
- ub->dev_info.dev_blocks = dev_blocks;
-
disk = blk_mq_alloc_disk(&ub->tag_set, ub);
if (IS_ERR(disk)) {
ret = PTR_ERR(disk);
@@ -1192,21 +1350,21 @@ static int ublk_ctrl_start_dev(struct io_uring_cmd *cmd)
disk->fops = &ub_fops;
disk->private_data = ub;
- blk_queue_logical_block_size(disk->queue, ub->dev_info.block_size);
- blk_queue_physical_block_size(disk->queue, ub->dev_info.block_size);
- blk_queue_io_min(disk->queue, ub->dev_info.block_size);
- blk_queue_max_hw_sectors(disk->queue,
- ub->dev_info.rq_max_blocks << (ub->bs_shift - 9));
- disk->queue->limits.discard_granularity = PAGE_SIZE;
- blk_queue_max_discard_sectors(disk->queue, UINT_MAX >> 9);
- blk_queue_max_write_zeroes_sectors(disk->queue, UINT_MAX >> 9);
-
- set_capacity(disk, ub->dev_info.dev_blocks << (ub->bs_shift - 9));
-
ub->dev_info.ublksrv_pid = ublksrv_pid;
ub->ub_disk = disk;
+
+ ublk_apply_params(ub);
+
+ /* ublk_basic_param has to be set from userspace */
+ if (!test_bit(UB_STATE_CONFIGURED, &ub->state)) {
+ ret = -EINVAL;
+ put_disk(disk);
+ goto out_unlock;
+ }
+
ret = add_disk(disk);
if (ret) {
+ clear_bit(UB_STATE_CONFIGURED, &ub->state);
put_disk(disk);
goto out_unlock;
}
@@ -1610,6 +1768,9 @@ static int __init ublk_init(void)
{
int ret;
+ BUILD_BUG_ON(sizeof(struct ublk_basic_param) > UBLK_MAX_PARAM_LEN);
+ BUILD_BUG_ON(sizeof(struct ublk_discard_param) > UBLK_MAX_PARAM_LEN);
+
init_waitqueue_head(&ublk_idr_wq);
ret = misc_register(&ublk_misc);
diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
index 1fb56d35ba8a..85b61c1f7e3d 100644
--- a/include/uapi/linux/ublk_cmd.h
+++ b/include/uapi/linux/ublk_cmd.h
@@ -168,4 +168,41 @@ struct ublk_param_header {
__u16 len;
};
+enum {
+ UBLK_PARAM_TYPE_BASIC,
+ UBLK_PARAM_TYPE_DISCARD,
+ UBLK_PARAM_TYPE_LAST,
+};
+
+struct ublk_basic_param {
+ struct ublk_param_header header;
+#define UBLK_ATTR_READ_ONLY (1 << 0)
+#define UBLK_ATTR_ROTATIONAL (1 << 1)
+#define UBLK_ATTR_VOLATILE_CACHE (1 << 2)
+#define UBLK_ATTR_FUA (1 << 3)
+ __u32 attrs;
+ __u8 logical_bs_shift;
+ __u8 physical_bs_shift;
+ __u8 io_opt_shift;
+ __u8 io_min_shift;
+
+ __u32 max_sectors;
+ __u32 chunk_sectors;
+
+ __u64 dev_sectors;
+ __u64 virt_boundary_mask;
+};
+
+struct ublk_discard_param {
+ struct ublk_param_header header;
+ __u32 discard_alignment;
+
+ __u32 discard_granularity;
+ __u32 max_discard_sectors;
+
+ __u32 max_write_zeroes_sectors;
+ __u16 max_discard_segments;
+ __u16 reserved0;
+};
+
#endif
--
2.31.1
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH V2 5/5] ublk_drv: cleanup ublksrv_ctrl_dev_info
2022-07-27 14:16 [PATCH V2 0/5] ublk_drv: add generic mechanism to get/set parameters Ming Lei
` (3 preceding siblings ...)
2022-07-27 14:16 ` [PATCH V2 4/5] ublk_drv: add two parameter types Ming Lei
@ 2022-07-27 14:16 ` Ming Lei
4 siblings, 0 replies; 12+ messages in thread
From: Ming Lei @ 2022-07-27 14:16 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-block, ZiyangZhang, Christoph Hellwig, Ming Lei
Remove all block device related info from ublksrv_ctrl_dev_info,
meantime reduce its size into 64 bytes because:
1) ublksrv_ctrl_dev_info becomes cleaner without including any
block related info
2) generic set/get parameter command can be used to set block
related setting easily and cleanly
3) generic set/get parameter command can be used for extending
ublk without needing more info in ublksrv_ctrl_dev_info
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/block/ublk_drv.c | 17 +++++++----------
include/uapi/linux/ublk_cmd.h | 10 +++++-----
2 files changed, 12 insertions(+), 15 deletions(-)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 8188079ea185..b3c981a3c248 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -119,7 +119,6 @@ struct ublk_device {
char *__queues;
unsigned short queue_size;
- unsigned short bs_shift;
struct ublksrv_ctrl_dev_info dev_info;
struct blk_mq_tag_set tag_set;
@@ -184,7 +183,7 @@ static int ublk_dev_param_basic_validate(const struct ublk_device *ub,
if (p->logical_bs_shift > p->physical_bs_shift)
return -EINVAL;
- if (p->max_sectors > (ub->dev_info.rq_max_blocks << (ub->bs_shift - 9)))
+ if (p->max_sectors > (ub->dev_info.max_io_buf_bytes >> 9))
return -EINVAL;
return 0;
@@ -1270,13 +1269,13 @@ static void ublk_stop_work_fn(struct work_struct *work)
ublk_stop_dev(ub);
}
-/* align maximum I/O size to PAGE_SIZE */
+/* align max io buffer size with PAGE_SIZE */
static void ublk_align_max_io_size(struct ublk_device *ub)
{
- unsigned int max_rq_bytes = ub->dev_info.rq_max_blocks << ub->bs_shift;
+ unsigned int max_io_bytes = ub->dev_info.max_io_buf_bytes;
- ub->dev_info.rq_max_blocks =
- round_down(max_rq_bytes, PAGE_SIZE) >> ub->bs_shift;
+ ub->dev_info.max_io_buf_bytes =
+ round_down(max_io_bytes, PAGE_SIZE);
}
static int ublk_add_tag_set(struct ublk_device *ub)
@@ -1432,9 +1431,8 @@ static inline void ublk_dump_dev_info(struct ublksrv_ctrl_dev_info *info)
{
pr_devel("%s: dev id %d flags %llx\n", __func__,
info->dev_id, info->flags);
- pr_devel("\t nr_hw_queues %d queue_depth %d block size %d dev_capacity %lld\n",
- info->nr_hw_queues, info->queue_depth,
- info->block_size, info->dev_blocks);
+ pr_devel("\t nr_hw_queues %d queue_depth %d\n",
+ info->nr_hw_queues, info->queue_depth);
}
static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd)
@@ -1495,7 +1493,6 @@ static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd)
/* We are not ready to support zero copy */
ub->dev_info.flags &= ~UBLK_F_SUPPORT_ZERO_COPY;
- ub->bs_shift = ilog2(ub->dev_info.block_size);
ub->dev_info.nr_hw_queues = min_t(unsigned int,
ub->dev_info.nr_hw_queues, nr_cpu_ids);
ublk_align_max_io_size(ub);
diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
index 85b61c1f7e3d..c81607a4e987 100644
--- a/include/uapi/linux/ublk_cmd.h
+++ b/include/uapi/linux/ublk_cmd.h
@@ -80,22 +80,22 @@ struct ublksrv_ctrl_cmd {
struct ublksrv_ctrl_dev_info {
__u16 nr_hw_queues;
__u16 queue_depth;
- __u16 block_size;
+ __u16 reserved0;
__u16 state;
- __u32 rq_max_blocks;
+ __u32 max_io_buf_bytes;
__u32 dev_id;
- __u64 dev_blocks;
+ __u64 reserved1;
__s32 ublksrv_pid;
- __s32 reserved0;
+ __s32 reserved2;
__u64 flags;
__u64 flags_reserved;
/* For ublksrv internal use, invisible to ublk driver */
__u64 ublksrv_flags;
- __u64 reserved1[9];
+ __u64 reserved3;
};
#define UBLK_IO_OP_READ 0
--
2.31.1
^ permalink raw reply related [flat|nested] 12+ messages in thread