* [PATCH V4 0/8] ublk_drv: add USER_RECOVERY support
@ 2022-09-21  9:58 ZiyangZhang
  2022-09-21  9:58 ` [PATCH V4 1/8] ublk_drv: check 'current' instead of 'ubq_daemon' ZiyangZhang
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: ZiyangZhang @ 2022-09-21  9:58 UTC (permalink / raw)
  To: ming.lei
  Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi, ZiyangZhang

ublk_drv is a driver that simply passes all blk-mq rqs to a userspace
target (such as ublksrv[1]). For each ublk queue, there is one
ubq_daemon (pthread). All ubq_daemons share the same process, which
opens /dev/ublkcX. Each ubq_daemon loops on io_uring_enter() to
send/receive io_uring cmds that carry information about blk-mq rqs.

Since the real IO handler (the process/thread opening /dev/ublkcX) is
in userspace, it could crash if:
(1) the user kills it with -9 because of an IO hang on the backend,
    a system reboot, etc.
(2) the process/thread hits an exception (segfault, division error,
    OOM, ...).
Therefore, the kernel driver has to deal with a dying ubq_daemon or
process.

Currently, if one ubq_daemon (pthread) or the process crashes,
ublk_drv must abort the dying ubq, stop the device and release
everything. This is not a good choice in practice because users do
not expect aborted requests, I/O errors and a released device. They
may want a recovery mechanism so that no requests are aborted and no
I/O errors occur. In short, users want everything to keep working as
usual.

This patchset implements USER_RECOVERY support. If the process
or any ubq_daemon (pthread) crashes (exits accidentally), we allow
the user to provide a new process and new ubq_daemons.

Note: The responsibility for recovery belongs to the user who opens
/dev/ublkcX. After a crash, the kernel driver only switches the
device's state so it is ready for recovery (RESTART_DEV) or
termination (STOP_DEV). The state is defined as UBLK_S_DEV_QUIESCED.
This patchset does not prescribe how to detect such a crash in
userspace. The user has many ways to do so. For example, the user may:
(1) send GET_DEV_INFO on a specific dev_id and check if its state is
    UBLK_S_DEV_QUIESCED.
(2) run 'ps' on ublksrv_pid.

The recovery feature is quite useful for real products. In detail,
we support this scenario:
(1) /dev/ublkc0 is opened by process 0;
(2) fio is running on /dev/ublkb0 exposed by ublk_drv, and all
    rqs are handled by process 0;
(3) process 0 suddenly crashes (e.g. segfault);
(4) fio keeps running and submitting IOs (though these IOs cannot
    be dispatched now);
(5) the user starts process 1 and attaches it to /dev/ublkc0;
(6) all rqs are handled by process 1 now, and IOs can be
    completed.

Note: The backend must tolerate double-writes because we may re-issue
a rq that was already sent to the old process 0.

We provide a sample script here to simulate the above steps:

***************************script***************************
LOOPS=10

__ublk_get_pid() {
	pid=`./ublk list -n 0 | grep "pid" | awk '{print $7}'`
	echo $pid
}

ublk_recover_kill() {
	for CNT in `seq $LOOPS`; do
		dmesg -C
		pid=`__ublk_get_pid`
		echo "*** kill $pid now ***"
		kill -9 $pid
		sleep 6
		echo "*** recover now ***"
		./ublk recover -n 0
		sleep 6
	done
}

ublk_test() {
	echo "*** add ublk device ***"
	./ublk add -t null -d 4 -i 1
	sleep 2
	echo "*** start fio ***"
	fio --name=recovery_test \
	    --bs=4k \
	    --filename=/dev/ublkb0 \
	    --runtime=140s \
	    --rw=read &
	sleep 4
	ublk_recover_kill
	wait
	echo "*** delete ublk device ***"
	./ublk del -n 0
}

for CNT in `seq 4`; do
	modprobe -rv ublk_drv
	modprobe ublk_drv
	echo "************ round $CNT ************"
	ublk_test
	sleep 5
done
***************************script***************************

You may run it with our modified ublksrv[2], which supports the
recovery feature. No I/O error occurs, and you can verify this
by running
    $ perf-tools/bin/tpoint block:block_rq_error

The basic idea of USER_RECOVERY is quite straightforward:
(1) quiesce ublk queues and requeue/abort rqs.
(2) release/free everything belonging to the dying process.
    Note: Since ublk_drv does save information about the user process,
    this work is important because we don't want any resource
    leakage. In particular, ioucmds from the dying ubq_daemons
    need to be completed (freed).
(3) allow new ubq_daemons to issue FETCH_REQ.
    Note: ublk_ch_uring_cmd() checks some states and flags. We
    have to set them to correct values.

Here are the steps to recover:
(0) requests dispatched after the corresponding ubq_daemon is dying
    are requeued.
(1) monitor_work finds one dying ubq_daemon; it schedules
    quiesce_work and requeues/aborts requests issued to userspace
    before the ubq_daemon started dying.
(2) quiesce_work must (a) quiesce the request queue to ban any
    incoming ublk_queue_rq(), (b) wait until all rqs are IDLE, and
    (c) complete old ioucmds. Then the ublk device is ready for
    recovery or stop.
(3) Since io_uring resources are released, ublk_ch_release() is
    called and all ublk_ios are reset to be ready for a new process.
(4) Then, the user should start a new process and ubq_daemons
    (pthreads) and send FETCH_REQ by io_uring_enter() to make all
    ubqs ready. The user must correctly set up queues, flags and so
    on (how to persist the user's information is not covered by this
    patchset).
(5) The user sends the RESTART_DEV ctrl-cmd to /dev/ublk-control
    with a dev_id X.
(6) After receiving RESTART_DEV, ublk_drv waits for all ubq_daemons
    to get ready. Then it unquiesces the request queue and new rqs
    are allowed.

You should use the ublksrv[2] and tests[3] provided by us. We add 3
additional tests to verify that the recovery feature works. Our code
will be PR-ed to Ming's repo soon.

[1] https://github.com/ming1/ubdsrv
[2] https://github.com/old-memories/ubdsrv/tree/recovery-v1
[3] https://github.com/old-memories/ubdsrv/tree/recovery-v1/tests/generic

Since V3:
(1) do not kick the requeue list in ublk_queue_rq() or the io_uring
    fallback wq with a dying ubq_daemon; instead kick the list once
    while unquiescing the dev
(2) add a comment on requeueing rqs in ublk_queue_rq() or the
    io_uring fallback wq with a dying ubq_daemon
(3) split support for UBLK_F_USER_RECOVERY_REISSUE into a single patch
(4) let monitor_work abort/requeue rqs issued to userspace instead of
    quiesce_work with recovery enabled
(5) always wait until no INFLIGHT rq exists in ublk_quiesce_dev()
(6) move ublk re-init stuff into ublk_ch_release()
(7) let ublk_quiesce_dev() go on as long as one ubq_daemon is dying
(8) add only one ctrl-cmd and rename it RESTART_DEV
(9) check ub.dev_info->flags instead of iterating over all ubqs
(10) do not disable the recovery feature, but always quiesce the dev
     in ublk_stop_dev() and then unquiesce it
(11) add a doc on the USER_RECOVERY feature
 
Since V2:
(1) run ublk_quiesce_dev() in a standalone work.
(2) do not run monitor_work after START_USER_RECOVERY is handled.
(3) refactor recovery feature code so that it does not affect current code.

Since V1:
(1) refactor the cover letter. Add an introduction on "how to detect
    a crash" and "why we need the recovery feature".
(2) do not refactor task_work and ublk_queue_rq().
(3) allow users to freely stop/recover the device.
(4) add a comment on ublk_cancel_queue().
(5) refactor monitor_work and the aborting mechanism since we add the
    recovery mechanism in monitor_work.

ZiyangZhang (8):
  ublk_drv: check 'current' instead of 'ubq_daemon'
  ublk_drv: refactor ublk_cancel_queue()
  ublk_drv: define macros for recovery feature and check them
  ublk_drv: requeue rqs with recovery feature enabled
  ublk_drv: consider recovery feature in aborting mechanism
  ublk_drv: support UBLK_F_USER_RECOVERY_REISSUE
  ublk_drv: allow new process to open ublk chardev with recovery feature
    enabled
  Documentation: document ublk user recovery feature

 Documentation/block/ublk.rst  |  25 +++
 drivers/block/ublk_drv.c      | 345 +++++++++++++++++++++++++++++++---
 include/uapi/linux/ublk_cmd.h |   6 +
 3 files changed, 354 insertions(+), 22 deletions(-)

-- 
2.27.0



* [PATCH V4 1/8] ublk_drv: check 'current' instead of 'ubq_daemon'
  2022-09-21  9:58 [PATCH V4 0/8] ublk_drv: add USER_RECOVERY support ZiyangZhang
@ 2022-09-21  9:58 ` ZiyangZhang
  2022-09-21  9:58 ` [PATCH V4 2/8] ublk_drv: refactor ublk_cancel_queue() ZiyangZhang
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: ZiyangZhang @ 2022-09-21  9:58 UTC (permalink / raw)
  To: ming.lei
  Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi, ZiyangZhang

This check is not atomic. With the recovery feature, ubq_daemon may
be modified simultaneously by the recovery task. Checking 'current'
instead is safe here because 'current' never changes.

Also add a comment explaining this check, which is really important
for understanding the recovery feature.

Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 6a4a94b4cdf4..c39b67d7133d 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -645,14 +645,22 @@ static inline void __ublk_rq_task_work(struct request *req)
 	struct ublk_device *ub = ubq->dev;
 	int tag = req->tag;
 	struct ublk_io *io = &ubq->ios[tag];
-	bool task_exiting = current != ubq->ubq_daemon || ubq_daemon_is_dying(ubq);
 	unsigned int mapped_bytes;
 
 	pr_devel("%s: complete: op %d, qid %d tag %d io_flags %x addr %llx\n",
 			__func__, io->cmd->cmd_op, ubq->q_id, req->tag, io->flags,
 			ublk_get_iod(ubq, req->tag)->addr);
 
-	if (unlikely(task_exiting)) {
+	/*
+	 * Task is exiting if either:
+	 *
+	 * (1) current != ubq_daemon.
+	 * io_uring_cmd_complete_in_task() tries to run task_work
+	 * in a workqueue if ubq_daemon(cmd's task) is PF_EXITING.
+	 *
+	 * (2) current->flags & PF_EXITING.
+	 */
+	if (unlikely(current != ubq->ubq_daemon || current->flags & PF_EXITING)) {
 		blk_mq_end_request(req, BLK_STS_IOERR);
 		mod_delayed_work(system_wq, &ub->monitor_work, 0);
 		return;
-- 
2.27.0



* [PATCH V4 2/8] ublk_drv: refactor ublk_cancel_queue()
  2022-09-21  9:58 [PATCH V4 0/8] ublk_drv: add USER_RECOVERY support ZiyangZhang
  2022-09-21  9:58 ` [PATCH V4 1/8] ublk_drv: check 'current' instead of 'ubq_daemon' ZiyangZhang
@ 2022-09-21  9:58 ` ZiyangZhang
  2022-09-21  9:58 ` [PATCH V4 3/8] ublk_drv: define macros for recovery feature and check them ZiyangZhang
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: ZiyangZhang @ 2022-09-21  9:58 UTC (permalink / raw)
  To: ming.lei
  Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi, ZiyangZhang

Assume only a few FETCH_REQ ioucmds are sent to ublk_drv, and then
the ubq_daemon exits. We have to call io_uring_cmd_done() for all
ioucmds received so that the io_uring ctx does not leak.

ublk_cancel_queue() may be called before START_DEV or after STOP_DEV.
We decrease ubq->nr_io_ready and clear UBLK_IO_FLAG_ACTIVE so that we
won't call io_uring_cmd_done() twice for one ioucmd, avoiding a UAF.
Clearing UBLK_IO_FLAG_ACTIVE also makes the code more reasonable.

Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index c39b67d7133d..0c6db0978ed0 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -963,22 +963,39 @@ static inline bool ublk_queue_ready(struct ublk_queue *ubq)
 	return ubq->nr_io_ready == ubq->q_depth;
 }
 
+/* If ublk_cancel_queue() is called before sending START_DEV(), ->mutex
+ * provides protection on above update.
+ *
+ * If ublk_cancel_queue() is called after sending START_DEV(), disk is
+ * deleted first, UBLK_IO_RES_ABORT is returned so that any new io
+ * command can't be issued to driver, so updating on io flags and
+ * nr_io_ready is safe here.
+ *
+ * Also ->nr_io_ready is guaranteed to become zero after ublk_cancel_queue()
+ * returns since request queue is either frozen or not present in both two
+ * cases.
+ */
 static void ublk_cancel_queue(struct ublk_queue *ubq)
 {
 	int i;
 
-	if (!ublk_queue_ready(ubq))
+	if (!ubq->nr_io_ready)
 		return;
 
 	for (i = 0; i < ubq->q_depth; i++) {
 		struct ublk_io *io = &ubq->ios[i];
 
-		if (io->flags & UBLK_IO_FLAG_ACTIVE)
+		if (io->flags & UBLK_IO_FLAG_ACTIVE) {
+			pr_devel("%s: done old cmd: qid %d tag %d\n",
+					__func__, ubq->q_id, i);
 			io_uring_cmd_done(io->cmd, UBLK_IO_RES_ABORT, 0);
+			io->flags &= ~UBLK_IO_FLAG_ACTIVE;
+			ubq->nr_io_ready--;
+		}
 	}
 
 	/* all io commands are canceled */
-	ubq->nr_io_ready = 0;
+	WARN_ON_ONCE(ubq->nr_io_ready);
 }
 
 /* Cancel all pending commands, must be called after del_gendisk() returns */
-- 
2.27.0



* [PATCH V4 3/8] ublk_drv: define macros for recovery feature and check them
  2022-09-21  9:58 [PATCH V4 0/8] ublk_drv: add USER_RECOVERY support ZiyangZhang
  2022-09-21  9:58 ` [PATCH V4 1/8] ublk_drv: check 'current' instead of 'ubq_daemon' ZiyangZhang
  2022-09-21  9:58 ` [PATCH V4 2/8] ublk_drv: refactor ublk_cancel_queue() ZiyangZhang
@ 2022-09-21  9:58 ` ZiyangZhang
  2022-09-21  9:58 ` [PATCH V4 4/8] ublk_drv: requeue rqs with recovery feature enabled ZiyangZhang
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: ZiyangZhang @ 2022-09-21  9:58 UTC (permalink / raw)
  To: ming.lei
  Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi, ZiyangZhang

Define some macros for the recovery feature.

UBLK_S_DEV_QUIESCED implies that the ublk_device is quiesced and is
ready for recovery. This state can be observed by userspace.

UBLK_F_USER_RECOVERY implies that:
(1) ublk_drv enables the recovery feature. It won't let monitor_work
    automatically abort rqs and release the device.
(2) With a dying ubq_daemon, ublk_drv ends (aborts) rqs issued to
    userspace (ublksrv) before the crash.
(3) With a dying ubq_daemon, in task work and ublk_queue_rq(),
    ublk_drv requeues rqs.

Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
---
 drivers/block/ublk_drv.c      | 18 +++++++++++++++++-
 include/uapi/linux/ublk_cmd.h |  3 +++
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 0c6db0978ed0..3bdac4bdf46f 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -49,7 +49,8 @@
 /* All UBLK_F_* have to be included into UBLK_F_ALL */
 #define UBLK_F_ALL (UBLK_F_SUPPORT_ZERO_COPY \
 		| UBLK_F_URING_CMD_COMP_IN_TASK \
-		| UBLK_F_NEED_GET_DATA)
+		| UBLK_F_NEED_GET_DATA \
+		| UBLK_F_USER_RECOVERY)
 
 /* All UBLK_PARAM_TYPE_* should be included here */
 #define UBLK_PARAM_TYPE_ALL (UBLK_PARAM_TYPE_BASIC | UBLK_PARAM_TYPE_DISCARD)
@@ -323,6 +324,21 @@ static inline int ublk_queue_cmd_buf_size(struct ublk_device *ub, int q_id)
 			PAGE_SIZE);
 }
 
+static inline bool ublk_queue_can_use_recovery(
+		struct ublk_queue *ubq)
+{
+	if (ubq->flags & UBLK_F_USER_RECOVERY)
+		return true;
+	return false;
+}
+
+static inline bool ublk_can_use_recovery(struct ublk_device *ub)
+{
+	if (ub->dev_info.flags & UBLK_F_USER_RECOVERY)
+		return true;
+	return false;
+}
+
 static void ublk_free_disk(struct gendisk *disk)
 {
 	struct ublk_device *ub = disk->private_data;
diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
index 677edaab2b66..340ff14bde49 100644
--- a/include/uapi/linux/ublk_cmd.h
+++ b/include/uapi/linux/ublk_cmd.h
@@ -74,9 +74,12 @@
  */
 #define UBLK_F_NEED_GET_DATA (1UL << 2)
 
+#define UBLK_F_USER_RECOVERY	(1UL << 3)
+
 /* device state */
 #define UBLK_S_DEV_DEAD	0
 #define UBLK_S_DEV_LIVE	1
+#define UBLK_S_DEV_QUIESCED	2
 
 /* shipped via sqe->cmd of io_uring command */
 struct ublksrv_ctrl_cmd {
-- 
2.27.0



* [PATCH V4 4/8] ublk_drv: requeue rqs with recovery feature enabled
  2022-09-21  9:58 [PATCH V4 0/8] ublk_drv: add USER_RECOVERY support ZiyangZhang
                   ` (2 preceding siblings ...)
  2022-09-21  9:58 ` [PATCH V4 3/8] ublk_drv: define macros for recovery feature and check them ZiyangZhang
@ 2022-09-21  9:58 ` ZiyangZhang
  2022-09-22  0:28   ` Ming Lei
  2022-09-21  9:58 ` [PATCH V4 5/8] ublk_drv: consider recovery feature in aborting mechanism ZiyangZhang
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: ZiyangZhang @ 2022-09-21  9:58 UTC (permalink / raw)
  To: ming.lei
  Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi, ZiyangZhang

With the recovery feature enabled, in ublk_queue_rq() or task work
(in exit_task_work or the fallback wq), we requeue rqs instead of
ending (aborting) them. Besides, no matter whether the recovery
feature is enabled or disabled, we schedule monitor_work immediately.

Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
---
 drivers/block/ublk_drv.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 3bdac4bdf46f..b940e490ebab 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -655,6 +655,19 @@ static void ubq_complete_io_cmd(struct ublk_io *io, int res)
 
 #define UBLK_REQUEUE_DELAY_MS	3
 
+static inline void __ublk_abort_rq_in_task_work(struct ublk_queue *ubq,
+		struct request *rq)
+{
+	pr_devel("%s: %s q_id %d tag %d io_flags %x.\n", __func__,
+			(ublk_queue_can_use_recovery(ubq)) ? "requeue" : "abort",
+			ubq->q_id, rq->tag, ubq->ios[rq->tag].flags);
+	/* We cannot process this rq so just requeue it. */
+	if (ublk_queue_can_use_recovery(ubq))
+		blk_mq_requeue_request(rq, false);
+	else
+		blk_mq_end_request(rq, BLK_STS_IOERR);
+}
+
 static inline void __ublk_rq_task_work(struct request *req)
 {
 	struct ublk_queue *ubq = req->mq_hctx->driver_data;
@@ -677,7 +690,7 @@ static inline void __ublk_rq_task_work(struct request *req)
 	 * (2) current->flags & PF_EXITING.
 	 */
 	if (unlikely(current != ubq->ubq_daemon || current->flags & PF_EXITING)) {
-		blk_mq_end_request(req, BLK_STS_IOERR);
+		__ublk_abort_rq_in_task_work(ubq, req);
 		mod_delayed_work(system_wq, &ub->monitor_work, 0);
 		return;
 	}
@@ -752,6 +765,20 @@ static void ublk_rq_task_work_fn(struct callback_head *work)
 	__ublk_rq_task_work(req);
 }
 
+static inline blk_status_t __ublk_abort_rq(struct ublk_queue *ubq,
+		struct request *rq)
+{
+	pr_devel("%s: %s q_id %d tag %d io_flags %x.\n", __func__,
+			(ublk_queue_can_use_recovery(ubq)) ? "requeue" : "abort",
+			ubq->q_id, rq->tag, ubq->ios[rq->tag].flags);
+	/* We cannot process this rq so just requeue it. */
+	if (ublk_queue_can_use_recovery(ubq)) {
+		blk_mq_requeue_request(rq, false);
+		return BLK_STS_OK;
+	}
+	return BLK_STS_IOERR;
+}
+
 static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
 		const struct blk_mq_queue_data *bd)
 {
@@ -769,7 +796,7 @@ static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (unlikely(ubq_daemon_is_dying(ubq))) {
  fail:
 		mod_delayed_work(system_wq, &ubq->dev->monitor_work, 0);
-		return BLK_STS_IOERR;
+		return __ublk_abort_rq(ubq, rq);
 	}
 
 	if (ublk_can_use_task_work(ubq)) {
-- 
2.27.0



* [PATCH V4 5/8] ublk_drv: consider recovery feature in aborting mechanism
  2022-09-21  9:58 [PATCH V4 0/8] ublk_drv: add USER_RECOVERY support ZiyangZhang
                   ` (3 preceding siblings ...)
  2022-09-21  9:58 ` [PATCH V4 4/8] ublk_drv: requeue rqs with recovery feature enabled ZiyangZhang
@ 2022-09-21  9:58 ` ZiyangZhang
  2022-09-22  0:18   ` Ming Lei
  2022-09-21  9:58 ` [PATCH V4 6/8] ublk_drv: support UBLK_F_USER_RECOVERY_REISSUE ZiyangZhang
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: ZiyangZhang @ 2022-09-21  9:58 UTC (permalink / raw)
  To: ming.lei
  Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi, ZiyangZhang

With the USER_RECOVERY feature enabled, monitor_work schedules
quiesce_work after finding a dying ubq_daemon. monitor_work also
aborts all rqs issued to userspace before the ubq_daemon started
dying. quiesce_work's job is to:
(1) quiesce the request queue.
(2) check if there is any INFLIGHT rq. If so, retry until all these
    rqs are requeued and become IDLE. These rqs should be requeued by
    ublk_queue_rq(), task work, the io_uring fallback wq or
    monitor_work.
(3) complete all ioucmds by calling io_uring_cmd_done(). This is safe
    because no ioucmd can be referenced now.
(4) set ub's state to UBLK_S_DEV_QUIESCED, which means we are ready
    for recovery. This state is exposed to userspace by GET_DEV_INFO.

The driver can always handle STOP_DEV and clean up everything no
matter whether ub's state is LIVE or QUIESCED. After ub's state is
UBLK_S_DEV_QUIESCED, the user can recover with a new process.

Note: we do not change the default behavior with the recovery feature
disabled. monitor_work still schedules stop_work and aborts inflight
rqs. And finally the ublk_device is released.

Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
---
 drivers/block/ublk_drv.c | 137 +++++++++++++++++++++++++++++++++++----
 1 file changed, 125 insertions(+), 12 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index b940e490ebab..9610afe11463 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -120,7 +120,7 @@ struct ublk_queue {
 
 	unsigned long io_addr;	/* mapped vm address */
 	unsigned int max_io_sz;
-	bool abort_work_pending;
+	bool force_abort;
 	unsigned short nr_io_ready;	/* how many ios setup */
 	struct ublk_device *dev;
 	struct ublk_io ios[0];
@@ -162,6 +162,7 @@ struct ublk_device {
 	 * monitor each queue's daemon periodically
 	 */
 	struct delayed_work	monitor_work;
+	struct work_struct	quiesce_work;
 	struct work_struct	stop_work;
 };
 
@@ -628,11 +629,17 @@ static void ublk_complete_rq(struct request *req)
  * Also aborting may not be started yet, keep in mind that one failed
  * request may be issued by block layer again.
  */
-static void __ublk_fail_req(struct ublk_io *io, struct request *req)
+static void __ublk_fail_req(struct ublk_queue *ubq, struct ublk_io *io,
+		struct request *req)
 {
 	WARN_ON_ONCE(io->flags & UBLK_IO_FLAG_ACTIVE);
 
 	if (!(io->flags & UBLK_IO_FLAG_ABORTED)) {
+		pr_devel("%s: abort rq: qid %d tag %d io_flags %x\n",
+				__func__,
+				req->mq_hctx->queue_num,
+				req->tag,
+				io->flags);
 		io->flags |= UBLK_IO_FLAG_ABORTED;
 		blk_mq_end_request(req, BLK_STS_IOERR);
 	}
@@ -676,10 +683,6 @@ static inline void __ublk_rq_task_work(struct request *req)
 	struct ublk_io *io = &ubq->ios[tag];
 	unsigned int mapped_bytes;
 
-	pr_devel("%s: complete: op %d, qid %d tag %d io_flags %x addr %llx\n",
-			__func__, io->cmd->cmd_op, ubq->q_id, req->tag, io->flags,
-			ublk_get_iod(ubq, req->tag)->addr);
-
 	/*
 	 * Task is exiting if either:
 	 *
@@ -746,6 +749,9 @@ static inline void __ublk_rq_task_work(struct request *req)
 			mapped_bytes >> 9;
 	}
 
+	pr_devel("%s: complete: op %d, qid %d tag %d io_flags %x addr %llx\n",
+			__func__, io->cmd->cmd_op, ubq->q_id, req->tag, io->flags,
+			ublk_get_iod(ubq, req->tag)->addr);
 	ubq_complete_io_cmd(io, UBLK_IO_RES_OK);
 }
 
@@ -790,6 +796,21 @@ static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
 	res = ublk_setup_iod(ubq, rq);
 	if (unlikely(res != BLK_STS_OK))
 		return BLK_STS_IOERR;
+	/* With recovery feature enabled, force_abort is set in
+	 * ublk_stop_dev() before calling del_gendisk(). We have to
+	 * abort all requeued and new rqs here to let del_gendisk()
+	 * move on. Besides, we must not call io_uring_cmd_complete_in_task()
+	 * to avoid UAF on io_uring ctx.
+	 *
+	 * Note: force_abort is guaranteed to be seen because it is set
+	 * before the request queue is unquiesced.
+	 */
+	if (unlikely(ubq->force_abort)) {
+		pr_devel("%s: abort rq: qid %d tag %d io_flags %x\n",
+				__func__, ubq->q_id, rq->tag,
+				ubq->ios[rq->tag].flags);
+		return BLK_STS_IOERR;
+	}
 
 	blk_mq_start_request(bd->rq);
 
@@ -967,7 +988,7 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
 			 */
 			rq = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], i);
 			if (rq)
-				__ublk_fail_req(io, rq);
+				__ublk_fail_req(ubq, io, rq);
 		}
 	}
 	ublk_put_device(ub);
@@ -983,7 +1004,10 @@ static void ublk_daemon_monitor_work(struct work_struct *work)
 		struct ublk_queue *ubq = ublk_get_queue(ub, i);
 
 		if (ubq_daemon_is_dying(ubq)) {
-			schedule_work(&ub->stop_work);
+			if (ublk_queue_can_use_recovery(ubq))
+				schedule_work(&ub->quiesce_work);
+			else
+				schedule_work(&ub->stop_work);
 
 			/* abort queue is for making forward progress */
 			ublk_abort_queue(ub, ubq);
@@ -991,12 +1015,13 @@ static void ublk_daemon_monitor_work(struct work_struct *work)
 	}
 
 	/*
-	 * We can't schedule monitor work after ublk_remove() is started.
+	 * We can't schedule monitor work once ub's state is not UBLK_S_DEV_LIVE,
+	 * i.e. after ublk_remove() or __ublk_quiesce_dev() is started.
 	 *
 	 * No need ub->mutex, monitor work are canceled after state is marked
-	 * as DEAD, so DEAD state is observed reliably.
+	 * as not LIVE, so new state is observed reliably.
 	 */
-	if (ub->dev_info.state != UBLK_S_DEV_DEAD)
+	if (ub->dev_info.state == UBLK_S_DEV_LIVE)
 		schedule_delayed_work(&ub->monitor_work,
 				UBLK_DAEMON_MONITOR_PERIOD);
 }
@@ -1050,12 +1075,97 @@ static void ublk_cancel_dev(struct ublk_device *ub)
 		ublk_cancel_queue(ublk_get_queue(ub, i));
 }
 
-static void ublk_stop_dev(struct ublk_device *ub)
+static bool ublk_check_inflight_rq(struct request *rq, void *data)
+{
+	bool *idle = data;
+
+	if (blk_mq_request_started(rq)) {
+		pr_devel("%s: rq qid %d tag %d is not IDLE.\n",
+				__func__, rq->mq_hctx->queue_num,
+				rq->tag);
+		*idle = false;
+		return false;
+	}
+	return true;
+}
+
+static void ublk_wait_tagset_rqs_idle(struct ublk_device *ub)
+{
+	bool idle;
+
+	WARN_ON_ONCE(!blk_queue_quiesced(ub->ub_disk->queue));
+	while (true) {
+		idle = true;
+		blk_mq_tagset_busy_iter(&ub->tag_set,
+				ublk_check_inflight_rq, &idle);
+		if (idle)
+			break;
+		pr_devel("%s: not all tags are idle, ub: dev_id %d\n",
+				__func__, ub->dev_info.dev_id);
+		msleep(UBLK_REQUEUE_DELAY_MS);
+	}
+}
+
+static void __ublk_quiesce_dev(struct ublk_device *ub)
 {
+	pr_devel("%s: quiesce ub: dev_id %d state %s\n",
+			__func__, ub->dev_info.dev_id,
+			ub->dev_info.state == UBLK_S_DEV_LIVE ?
+			"LIVE" : "QUIESCED");
+	blk_mq_quiesce_queue(ub->ub_disk->queue);
+	ublk_wait_tagset_rqs_idle(ub);
+	pr_devel("%s: all tags are idle, ub: dev_id %d\n",
+			__func__, ub->dev_info.dev_id);
+	ublk_cancel_dev(ub);
+	ub->dev_info.state = UBLK_S_DEV_QUIESCED;
+}
+
+static void ublk_quiesce_work_fn(struct work_struct *work)
+{
+	struct ublk_device *ub =
+		container_of(work, struct ublk_device, quiesce_work);
+
 	mutex_lock(&ub->mutex);
 	if (ub->dev_info.state != UBLK_S_DEV_LIVE)
 		goto unlock;
+	pr_devel("%s: start __ublk_quiesce_dev: dev_id %d\n",
+			__func__, ub->dev_info.dev_id);
+	__ublk_quiesce_dev(ub);
+ unlock:
+	mutex_unlock(&ub->mutex);
+}
+
+static void ublk_unquiesce_dev(struct ublk_device *ub)
+{
+	int i;
+
+	pr_devel("%s: unquiesce ub: dev_id %d state %s\n",
+			__func__, ub->dev_info.dev_id,
+			ub->dev_info.state == UBLK_S_DEV_LIVE ?
+			"LIVE" : "QUIESCED");
+	/* quiesce_work has run. We let requeued rqs be aborted
+	 * before running fallback_wq. "force_abort" must be seen
+	 * after the request queue is unquiesced. Then del_gendisk()
+	 * can move on.
+	 */
+	for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
+		ublk_get_queue(ub, i)->force_abort = true;
+
+	blk_mq_unquiesce_queue(ub->ub_disk->queue);
+	/* We may have requeued some rqs in ublk_quiesce_queue() */
+	blk_mq_kick_requeue_list(ub->ub_disk->queue);
+}
 
+static void ublk_stop_dev(struct ublk_device *ub)
+{
+	mutex_lock(&ub->mutex);
+	if (ub->dev_info.state == UBLK_S_DEV_DEAD)
+		goto unlock;
+	if (ublk_can_use_recovery(ub)) {
+		if (ub->dev_info.state == UBLK_S_DEV_LIVE)
+			__ublk_quiesce_dev(ub);
+		ublk_unquiesce_dev(ub);
+	}
 	del_gendisk(ub->ub_disk);
 	ub->dev_info.state = UBLK_S_DEV_DEAD;
 	ub->dev_info.ublksrv_pid = -1;
@@ -1379,6 +1489,7 @@ static void ublk_remove(struct ublk_device *ub)
 {
 	ublk_stop_dev(ub);
 	cancel_work_sync(&ub->stop_work);
+	cancel_work_sync(&ub->quiesce_work);
 	cdev_device_del(&ub->cdev, &ub->cdev_dev);
 	put_device(&ub->cdev_dev);
 }
@@ -1555,6 +1666,7 @@ static int ublk_ctrl_add_dev(struct io_uring_cmd *cmd)
 		goto out_unlock;
 	mutex_init(&ub->mutex);
 	spin_lock_init(&ub->mm_lock);
+	INIT_WORK(&ub->quiesce_work, ublk_quiesce_work_fn);
 	INIT_WORK(&ub->stop_work, ublk_stop_work_fn);
 	INIT_DELAYED_WORK(&ub->monitor_work, ublk_daemon_monitor_work);
 
@@ -1675,6 +1787,7 @@ static int ublk_ctrl_stop_dev(struct io_uring_cmd *cmd)
 
 	ublk_stop_dev(ub);
 	cancel_work_sync(&ub->stop_work);
+	cancel_work_sync(&ub->quiesce_work);
 
 	ublk_put_device(ub);
 	return 0;
-- 
2.27.0



* [PATCH V4 6/8] ublk_drv: support UBLK_F_USER_RECOVERY_REISSUE
  2022-09-21  9:58 [PATCH V4 0/8] ublk_drv: add USER_RECOVERY support ZiyangZhang
                   ` (4 preceding siblings ...)
  2022-09-21  9:58 ` [PATCH V4 5/8] ublk_drv: consider recovery feature in aborting mechanism ZiyangZhang
@ 2022-09-21  9:58 ` ZiyangZhang
  2022-09-22  2:13   ` Ming Lei
  2022-09-21  9:58 ` [PATCH V4 7/8] ublk_drv: allow new process to open ublk chardev with recovery feature enabled ZiyangZhang
  2022-09-21  9:58 ` [PATCH V4 8/8] Documentation: document ublk user recovery feature ZiyangZhang
  7 siblings, 1 reply; 16+ messages in thread
From: ZiyangZhang @ 2022-09-21  9:58 UTC (permalink / raw)
  To: ming.lei
  Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi, ZiyangZhang

UBLK_F_USER_RECOVERY_REISSUE implies that:
with a dying ubq_daemon, ublk_drv lets monitor_work requeue rqs
issued to userspace (ublksrv) before the ubq_daemon started dying.

UBLK_F_USER_RECOVERY_REISSUE is designed for backends that:
(1) tolerate double-writes, since ublk_drv may issue the same rq
    twice;
(2) do not let frontend users see I/O errors, such as a read-only FS
    or a VM backend.

Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
---
 drivers/block/ublk_drv.c      | 17 +++++++++++++++--
 include/uapi/linux/ublk_cmd.h |  2 ++
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 9610afe11463..dc33ebc20c01 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -50,7 +50,8 @@
 #define UBLK_F_ALL (UBLK_F_SUPPORT_ZERO_COPY \
 		| UBLK_F_URING_CMD_COMP_IN_TASK \
 		| UBLK_F_NEED_GET_DATA \
-		| UBLK_F_USER_RECOVERY)
+		| UBLK_F_USER_RECOVERY \
+		| UBLK_F_USER_RECOVERY_REISSUE)
 
 /* All UBLK_PARAM_TYPE_* should be included here */
 #define UBLK_PARAM_TYPE_ALL (UBLK_PARAM_TYPE_BASIC | UBLK_PARAM_TYPE_DISCARD)
@@ -325,6 +326,15 @@ static inline int ublk_queue_cmd_buf_size(struct ublk_device *ub, int q_id)
 			PAGE_SIZE);
 }
 
+static inline bool ublk_queue_can_use_recovery_reissue(
+		struct ublk_queue *ubq)
+{
+	if ((ubq->flags & UBLK_F_USER_RECOVERY) &&
+			(ubq->flags & UBLK_F_USER_RECOVERY_REISSUE))
+		return true;
+	return false;
+}
+
 static inline bool ublk_queue_can_use_recovery(
 		struct ublk_queue *ubq)
 {
@@ -641,7 +651,10 @@ static void __ublk_fail_req(struct ublk_queue *ubq, struct ublk_io *io,
 				req->tag,
 				io->flags);
 		io->flags |= UBLK_IO_FLAG_ABORTED;
-		blk_mq_end_request(req, BLK_STS_IOERR);
+		if (ublk_queue_can_use_recovery_reissue(ubq))
+			blk_mq_requeue_request(req, false);
+		else
+			blk_mq_end_request(req, BLK_STS_IOERR);
 	}
 }
 
diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
index 340ff14bde49..332370628757 100644
--- a/include/uapi/linux/ublk_cmd.h
+++ b/include/uapi/linux/ublk_cmd.h
@@ -76,6 +76,8 @@
 
 #define UBLK_F_USER_RECOVERY	(1UL << 3)
 
+#define UBLK_F_USER_RECOVERY_REISSUE	(1UL << 4)
+
 /* device state */
 #define UBLK_S_DEV_DEAD	0
 #define UBLK_S_DEV_LIVE	1
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH V4 7/8] ublk_drv: allow new process to open ublk chardev with recovery feature enabled
  2022-09-21  9:58 [PATCH V4 0/8] ublk_drv: add USER_RECOVERY support ZiyangZhang
                   ` (5 preceding siblings ...)
  2022-09-21  9:58 ` [PATCH V4 6/8] ublk_drv: support UBLK_F_USER_RECOVERY_REISSUE ZiyangZhang
@ 2022-09-21  9:58 ` ZiyangZhang
  2022-09-22  2:38   ` Ming Lei
  2022-09-21  9:58 ` [PATCH V4 8/8] Documentation: document ublk user recovery feature ZiyangZhang
  7 siblings, 1 reply; 16+ messages in thread
From: ZiyangZhang @ 2022-09-21  9:58 UTC (permalink / raw)
  To: ming.lei
  Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi, ZiyangZhang

With the recovery feature enabled, if the ublk chardev is ready to be
released and quiesce_work has been scheduled, we:
(1) cancel monitor_work to avoid UAF on ubq && ublk_io.
(2) reinit all ubqs, including:
    (a) put the task_struct and reset ->ubq_daemon to NULL.
    (b) reset all ublk_io.
(3) reset ub->mm to NULL.
Then the ublk chardev is released and a new process can open it.

RESTART_DEV is introduced as a new ctrl-cmd for the recovery feature.
After the chardev is opened and all ubqs are ready, the user should send
RESTART_DEV to:
(1) wait until all new ubq_daemons are ready.
(2) update ublksrv_pid.
(3) unquiesce the request queue and expect incoming ublk_queue_rq().
(4) convert ub's state to UBLK_S_DEV_LIVE.
(5) reschedule monitor_work.

Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
---
 drivers/block/ublk_drv.c      | 109 +++++++++++++++++++++++++++++++++-
 include/uapi/linux/ublk_cmd.h |   1 +
 2 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index dc33ebc20c01..871cd48503a2 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -912,10 +912,67 @@ static int ublk_ch_open(struct inode *inode, struct file *filp)
 	return 0;
 }
 
+static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
+{
+	int i;
+
+	WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
+	pr_devel("%s: prepare for recovering qid %d\n", __func__, ubq->q_id);
+	/* old daemon is PF_EXITING, put it now */
+	put_task_struct(ubq->ubq_daemon);
+	/* We have to reset it to NULL, otherwise ub won't accept new FETCH_REQ */
+	ubq->ubq_daemon = NULL;
+
+	for (i = 0; i < ubq->q_depth; i++) {
+		struct ublk_io *io = &ubq->ios[i];
+
+		/* forget everything now and be ready for new FETCH_REQ */
+		io->flags = 0;
+		io->cmd = NULL;
+		io->addr = 0;
+	}
+	ubq->nr_io_ready = 0;
+}
+
 static int ublk_ch_release(struct inode *inode, struct file *filp)
 {
 	struct ublk_device *ub = filp->private_data;
+	int i;
+
+	/* lockless fast path */
+	if (!unlikely(ublk_can_use_recovery(ub) && ub->dev_info.state == UBLK_S_DEV_QUIESCED))
+		goto out_clear;
+
+	mutex_lock(&ub->mutex);
+	/*
+	 * USER_RECOVERY is only allowed after UBLK_S_DEV_QUIESCED is set,
+	 * which means that:
+	 *     (a) request queue has been quiesced
+	 *     (b) no inflight rq exists
+	 *     (c) all ioucmds owned by the dying process are completed
+	 */
+	if (!(ublk_can_use_recovery(ub) && ub->dev_info.state == UBLK_S_DEV_QUIESCED))
+		goto out_unlock;
+	pr_devel("%s: reinit queues for dev id %d.\n", __func__, ub->dev_info.dev_id);
+	/* We are going to release the task_struct of ubq_daemon and reset
+	 * ->ubq_daemon to NULL, so any check on ubq_daemon in monitor_work
+	 * would cause a UAF. Besides, monitor_work is unnecessary in QUIESCED
+	 * state since quiesce_work has already quiesced all ubqs.
+	 *
+	 * Do not let monitor_work schedule itself if the state is QUIESCED;
+	 * cancel it here and re-schedule it in RESTART_DEV to avoid the UAF.
+	 */
+	cancel_delayed_work_sync(&ub->monitor_work);
 
+	for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
+		ublk_queue_reinit(ub, ublk_get_queue(ub, i));
+	/* set to NULL, otherwise new ubq_daemon cannot mmap the io_cmd_buf */
+	ub->mm = NULL;
+	ub->nr_queues_ready = 0;
+	init_completion(&ub->completion);
+ out_unlock:
+	mutex_unlock(&ub->mutex);
+ out_clear:
 	clear_bit(UB_STATE_OPEN, &ub->state);
 	return 0;
 }
@@ -1199,9 +1256,14 @@ static void ublk_mark_io_ready(struct ublk_device *ub, struct ublk_queue *ubq)
 		ubq->ubq_daemon = current;
 		get_task_struct(ubq->ubq_daemon);
 		ub->nr_queues_ready++;
+		pr_devel("%s: ub %d qid %d is ready.\n",
+				__func__, ub->dev_info.dev_id, ubq->q_id);
 	}
-	if (ub->nr_queues_ready == ub->dev_info.nr_hw_queues)
+	if (ub->nr_queues_ready == ub->dev_info.nr_hw_queues) {
+		pr_devel("%s: ub %d all ubqs are ready.\n",
+				__func__, ub->dev_info.dev_id);
 		complete_all(&ub->completion);
+	}
 	mutex_unlock(&ub->mutex);
 }
 
@@ -1903,6 +1965,48 @@ static int ublk_ctrl_set_params(struct io_uring_cmd *cmd)
 	return ret;
 }
 
+static int ublk_ctrl_restart_dev(struct io_uring_cmd *cmd)
+{
+	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
+	int ublksrv_pid = (int)header->data[0];
+	struct ublk_device *ub;
+	int ret = -EINVAL;
+
+	ub = ublk_get_device_from_id(header->dev_id);
+	if (!ub)
+		return ret;
+
+	pr_devel("%s: Waiting for new ubq_daemons(nr: %d) to be ready, dev id %d...\n",
+			__func__, ub->dev_info.nr_hw_queues, header->dev_id);
+	/* wait until new ubq_daemon sending all FETCH_REQ */
+	wait_for_completion_interruptible(&ub->completion);
+	pr_devel("%s: All new ubq_daemons(nr: %d) are ready, dev id %d\n",
+			__func__, ub->dev_info.nr_hw_queues, header->dev_id);
+
+	mutex_lock(&ub->mutex);
+	if (!ublk_can_use_recovery(ub))
+		goto out_unlock;
+
+	if (ub->dev_info.state != UBLK_S_DEV_QUIESCED) {
+		ret = -EBUSY;
+		goto out_unlock;
+	}
+	ub->dev_info.ublksrv_pid = ublksrv_pid;
+	pr_devel("%s: new ublksrv_pid %d, dev id %d\n",
+			__func__, ublksrv_pid, header->dev_id);
+	blk_mq_unquiesce_queue(ub->ub_disk->queue);
+	pr_devel("%s: queue unquiesced, dev id %d.\n",
+			__func__, header->dev_id);
+	blk_mq_kick_requeue_list(ub->ub_disk->queue);
+	ub->dev_info.state = UBLK_S_DEV_LIVE;
+	schedule_delayed_work(&ub->monitor_work, UBLK_DAEMON_MONITOR_PERIOD);
+	ret = 0;
+ out_unlock:
+	mutex_unlock(&ub->mutex);
+	ublk_put_device(ub);
+	return ret;
+}
+
 static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
 		unsigned int issue_flags)
 {
@@ -1944,6 +2048,9 @@ static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
 	case UBLK_CMD_SET_PARAMS:
 		ret = ublk_ctrl_set_params(cmd);
 		break;
+	case UBLK_CMD_RESTART_DEV:
+		ret = ublk_ctrl_restart_dev(cmd);
+		break;
 	default:
 		break;
 	}
diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
index 332370628757..a088f374c0f6 100644
--- a/include/uapi/linux/ublk_cmd.h
+++ b/include/uapi/linux/ublk_cmd.h
@@ -17,6 +17,7 @@
 #define	UBLK_CMD_STOP_DEV	0x07
 #define	UBLK_CMD_SET_PARAMS	0x08
 #define	UBLK_CMD_GET_PARAMS	0x09
+#define UBLK_CMD_RESTART_DEV	0x10
 
 /*
  * IO commands, issued by ublk server, and handled by ublk driver.
-- 
2.27.0



* [PATCH V4 8/8] Documentation: document ublk user recovery feature
  2022-09-21  9:58 [PATCH V4 0/8] ublk_drv: add USER_RECOVERY support ZiyangZhang
                   ` (6 preceding siblings ...)
  2022-09-21  9:58 ` [PATCH V4 7/8] ublk_drv: allow new process to open ublk chardev with recovery feature enabled ZiyangZhang
@ 2022-09-21  9:58 ` ZiyangZhang
  7 siblings, 0 replies; 16+ messages in thread
From: ZiyangZhang @ 2022-09-21  9:58 UTC (permalink / raw)
  To: ming.lei
  Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi, ZiyangZhang

Add documentation for user recovery feature of ublk subsystem.

Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
---
 Documentation/block/ublk.rst | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/Documentation/block/ublk.rst b/Documentation/block/ublk.rst
index 2122d1a4a541..3f1bfd51898b 100644
--- a/Documentation/block/ublk.rst
+++ b/Documentation/block/ublk.rst
@@ -144,6 +144,31 @@ managing and controlling ublk devices with help of several control commands:
   For retrieving device info via ``ublksrv_ctrl_dev_info``. It is the server's
   responsibility to save IO target specific info in userspace.
 
+- ``UBLK_CMD_RESTART_DEV``
+
+  This command is valid if the ``UBLK_F_USER_RECOVERY`` feature is enabled. After
+  the old process has exited and the ublk device is quiesced, the user should
+  start a new process which opens ``/dev/ublkc*`` and makes all ublk queues
+  ready. Finally, the user should send this command. When it returns, the ublk
+  device is unquiesced and new I/O requests are passed to the new process.
+
+- user recovery feature description
+
+  Two new features are added for user recovery: ``UBLK_F_USER_RECOVERY`` and
+  ``UBLK_F_USER_RECOVERY_REISSUE``.
+
+  With ``UBLK_F_USER_RECOVERY`` set, after one ubq_daemon(ublksrv io handler)
+  dies, ublk does not release ``/dev/ublkc*`` or ``/dev/ublkb*`` but requeues all
+  inflight requests which have not been issued to userspace. Requests which have
+  been issued to userspace are aborted.
+
+  With ``UBLK_F_USER_RECOVERY_REISSUE`` set, after one ubq_daemon(ublksrv io
+  handler) dies, contrary to ``UBLK_F_USER_RECOVERY``, requests which have been
+  issued to userspace are requeued and will be re-issued to the new process after
+  handling ``UBLK_CMD_RESTART_DEV``. ``UBLK_F_USER_RECOVERY_REISSUE`` is designed
+  for backends that tolerate a double write, since the driver may issue the same
+  I/O request twice. It might be useful for a read-only FS or a VM backend.
+
 Data plane
 ----------
 
-- 
2.27.0



* Re: [PATCH V4 5/8] ublk_drv: consider recovery feature in aborting mechanism
  2022-09-21  9:58 ` [PATCH V4 5/8] ublk_drv: consider recovery feature in aborting mechanism ZiyangZhang
@ 2022-09-22  0:18   ` Ming Lei
  2022-09-22  2:06     ` Ziyang Zhang
  0 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2022-09-22  0:18 UTC (permalink / raw)
  To: ZiyangZhang; +Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi

On Wed, Sep 21, 2022 at 05:58:46PM +0800, ZiyangZhang wrote:
> With USER_RECOVERY feature enabled, the monitor_work schedules
> quiesce_work after finding a dying ubq_daemon. The monitor_work
> should also abort all rqs issued to userspace before the ubq_daemon is
> dying. The quiesce_work's job is to:
> (1) quiesce request queue.
> (2) check if there is any INFLIGHT rq. If so, we retry until all these
>     rqs are requeued and become IDLE. These rqs should be requeued by
> 	ublk_queue_rq(), task work, io_uring fallback wq or monitor_work.
> (3) complete all ioucmds by calling io_uring_cmd_done(). We are safe to
>     do so because no ioucmd can be referenced now.
> (4) set ub's state to UBLK_S_DEV_QUIESCED, which means we are ready for
>     recovery. This state is exposed to userspace by GET_DEV_INFO.
> 
> The driver can always handle STOP_DEV and cleanup everything no matter
> ub's state is LIVE or QUIESCED. After ub's state is UBLK_S_DEV_QUIESCED,
> user can recover with new process.
> 
> Note: we do not change the default behavior with the recovery feature
> disabled. monitor_work still schedules stop_work and abort inflight
> rqs. And finally ublk_device is released.
> 
> Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>

This version is close to being ready; just some debug logging needs to
be removed, see the inline comments. Also I'd suggest you learn to use
bpftrace a bit; then you basically needn't rely on kernel logging.

If this logging is removed, you will see how simple the patch becomes
compared with the previous version.

> ---
>  drivers/block/ublk_drv.c | 137 +++++++++++++++++++++++++++++++++++----
>  1 file changed, 125 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index b940e490ebab..9610afe11463 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -120,7 +120,7 @@ struct ublk_queue {
>  
>  	unsigned long io_addr;	/* mapped vm address */
>  	unsigned int max_io_sz;
> -	bool abort_work_pending;
> +	bool force_abort;
>  	unsigned short nr_io_ready;	/* how many ios setup */
>  	struct ublk_device *dev;
>  	struct ublk_io ios[0];
> @@ -162,6 +162,7 @@ struct ublk_device {
>  	 * monitor each queue's daemon periodically
>  	 */
>  	struct delayed_work	monitor_work;
> +	struct work_struct	quiesce_work;
>  	struct work_struct	stop_work;
>  };
>  
> @@ -628,11 +629,17 @@ static void ublk_complete_rq(struct request *req)
>   * Also aborting may not be started yet, keep in mind that one failed
>   * request may be issued by block layer again.
>   */
> -static void __ublk_fail_req(struct ublk_io *io, struct request *req)
> +static void __ublk_fail_req(struct ublk_queue *ubq, struct ublk_io *io,
> +		struct request *req)
>  {
>  	WARN_ON_ONCE(io->flags & UBLK_IO_FLAG_ACTIVE);
>  
>  	if (!(io->flags & UBLK_IO_FLAG_ABORTED)) {
> +		pr_devel("%s: abort rq: qid %d tag %d io_flags %x\n",
> +				__func__,
> +				req->mq_hctx->queue_num,
> +				req->tag,
> +				io->flags);

No need to add the above log.

>  		io->flags |= UBLK_IO_FLAG_ABORTED;
>  		blk_mq_end_request(req, BLK_STS_IOERR);
>  	}
> @@ -676,10 +683,6 @@ static inline void __ublk_rq_task_work(struct request *req)
>  	struct ublk_io *io = &ubq->ios[tag];
>  	unsigned int mapped_bytes;
>  
> -	pr_devel("%s: complete: op %d, qid %d tag %d io_flags %x addr %llx\n",
> -			__func__, io->cmd->cmd_op, ubq->q_id, req->tag, io->flags,
> -			ublk_get_iod(ubq, req->tag)->addr);
> -
>  	/*
>  	 * Task is exiting if either:
>  	 *
> @@ -746,6 +749,9 @@ static inline void __ublk_rq_task_work(struct request *req)
>  			mapped_bytes >> 9;
>  	}
>  
> +	pr_devel("%s: complete: op %d, qid %d tag %d io_flags %x addr %llx\n",
> +			__func__, io->cmd->cmd_op, ubq->q_id, req->tag, io->flags,
> +			ublk_get_iod(ubq, req->tag)->addr);
>  	ubq_complete_io_cmd(io, UBLK_IO_RES_OK);
>  }
>  
> @@ -790,6 +796,21 @@ static blk_status_t ublk_queue_rq(struct blk_mq_hw_ctx *hctx,
>  	res = ublk_setup_iod(ubq, rq);
>  	if (unlikely(res != BLK_STS_OK))
>  		return BLK_STS_IOERR;
> +	/* With recovery feature enabled, force_abort is set in
> +	 * ublk_stop_dev() before calling del_gendisk(). We have to
> +	 * abort all requeued and new rqs here to let del_gendisk()
> +	 * move on. Besides, we must not call io_uring_cmd_complete_in_task()
> +	 * to avoid UAF on io_uring ctx.
> +	 *
> +	 * Note: force_abort is guaranteed to be seen because it is set
> +	 * before request queue is unquiesced.
> +	 */
> +	if (unlikely(ubq->force_abort)) {
> +		pr_devel("%s: abort rq: qid %d tag %d io_flags %x\n",
> +				__func__, ubq->q_id, rq->tag,
> +				ubq->ios[rq->tag].flags);

same with above.

> +		return BLK_STS_IOERR;
> +	}
>  
>  	blk_mq_start_request(bd->rq);
>  
> @@ -967,7 +988,7 @@ static void ublk_abort_queue(struct ublk_device *ub, struct ublk_queue *ubq)
>  			 */
>  			rq = blk_mq_tag_to_rq(ub->tag_set.tags[ubq->q_id], i);
>  			if (rq)
> -				__ublk_fail_req(io, rq);
> +				__ublk_fail_req(ubq, io, rq);
>  		}
>  	}
>  	ublk_put_device(ub);
> @@ -983,7 +1004,10 @@ static void ublk_daemon_monitor_work(struct work_struct *work)
>  		struct ublk_queue *ubq = ublk_get_queue(ub, i);
>  
>  		if (ubq_daemon_is_dying(ubq)) {
> -			schedule_work(&ub->stop_work);
> +			if (ublk_queue_can_use_recovery(ubq))
> +				schedule_work(&ub->quiesce_work);
> +			else
> +				schedule_work(&ub->stop_work);
>  
>  			/* abort queue is for making forward progress */
>  			ublk_abort_queue(ub, ubq);
> @@ -991,12 +1015,13 @@ static void ublk_daemon_monitor_work(struct work_struct *work)
>  	}
>  
>  	/*
> -	 * We can't schedule monitor work after ublk_remove() is started.
> +	 * We can't schedule monitor work once ub's state is no longer
> +	 * UBLK_S_DEV_LIVE, i.e. after ublk_remove() or __ublk_quiesce_dev() is started.
>  	 *
>  	 * No need ub->mutex, monitor work are canceled after state is marked
> -	 * as DEAD, so DEAD state is observed reliably.
> +	 * as not LIVE, so new state is observed reliably.
>  	 */
> -	if (ub->dev_info.state != UBLK_S_DEV_DEAD)
> +	if (ub->dev_info.state == UBLK_S_DEV_LIVE)
>  		schedule_delayed_work(&ub->monitor_work,
>  				UBLK_DAEMON_MONITOR_PERIOD);
>  }
> @@ -1050,12 +1075,97 @@ static void ublk_cancel_dev(struct ublk_device *ub)
>  		ublk_cancel_queue(ublk_get_queue(ub, i));
>  }
>  
> -static void ublk_stop_dev(struct ublk_device *ub)
> +static bool ublk_check_inflight_rq(struct request *rq, void *data)
> +{
> +	bool *idle = data;
> +
> +	if (blk_mq_request_started(rq)) {
> +		pr_devel("%s: rq qid %d tag %d is not IDLE.\n",
> +				__func__, rq->mq_hctx->queue_num,
> +				rq->tag);

Please remove the above log, otherwise it may overflow the printk buffer.
Also you can observe pending request info from blk-mq debugfs.

> +		*idle = false;
> +		return false;
> +	}
> +	return true;
> +}
> +
> +static void ublk_wait_tagset_rqs_idle(struct ublk_device *ub)
> +{
> +	bool idle;
> +
> +	WARN_ON_ONCE(!blk_queue_quiesced(ub->ub_disk->queue));
> +	while (true) {
> +		idle = true;
> +		blk_mq_tagset_busy_iter(&ub->tag_set,
> +				ublk_check_inflight_rq, &idle);
> +		if (idle)
> +			break;
> +		pr_devel("%s: not all tags are idle, ub: dev_id %d\n",
> +				__func__, ub->dev_info.dev_id);

The above logging isn't useful; we can easily conclude that the wait
isn't done by checking the stack trace or debugfs log.

> +		msleep(UBLK_REQUEUE_DELAY_MS);
> +	}
> +}
> +
> +static void __ublk_quiesce_dev(struct ublk_device *ub)
>  {
> +	pr_devel("%s: quiesce ub: dev_id %d state %s\n",
> +			__func__, ub->dev_info.dev_id,
> +			ub->dev_info.state == UBLK_S_DEV_LIVE ?
> +			"LIVE" : "QUIESCED");
> +	blk_mq_quiesce_queue(ub->ub_disk->queue);
> +	ublk_wait_tagset_rqs_idle(ub);
> +	pr_devel("%s: all tags are idle, ub: dev_id %d\n",
> +			__func__, ub->dev_info.dev_id);

The above logging can be removed too.

> +	ublk_cancel_dev(ub);
> +	ub->dev_info.state = UBLK_S_DEV_QUIESCED;
> +}
> +
> +static void ublk_quiesce_work_fn(struct work_struct *work)
> +{
> +	struct ublk_device *ub =
> +		container_of(work, struct ublk_device, quiesce_work);
> +
>  	mutex_lock(&ub->mutex);
>  	if (ub->dev_info.state != UBLK_S_DEV_LIVE)
>  		goto unlock;
> +	pr_devel("%s: start __ublk_quiesce_dev: dev_id %d\n",
> +			__func__, ub->dev_info.dev_id);

The above logging isn't needed, since you do add one
at the beginning of __ublk_quiesce_dev().


Thanks,
Ming



* Re: [PATCH V4 4/8] ublk_drv: requeue rqs with recovery feature enabled
  2022-09-21  9:58 ` [PATCH V4 4/8] ublk_drv: requeue rqs with recovery feature enabled ZiyangZhang
@ 2022-09-22  0:28   ` Ming Lei
  2022-09-22  2:04     ` Ziyang Zhang
  0 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2022-09-22  0:28 UTC (permalink / raw)
  To: ZiyangZhang; +Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi

On Wed, Sep 21, 2022 at 05:58:45PM +0800, ZiyangZhang wrote:
> With recovery feature enabled, in ublk_queue_rq or task work
> (in exit_task_work or fallback wq), we requeue rqs instead of
> ending(aborting) them. Besides, no matter whether the recovery feature
> is enabled or disabled, we schedule monitor_work immediately.
> 
> Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
> ---
>  drivers/block/ublk_drv.c | 31 +++++++++++++++++++++++++++++--
>  1 file changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index 3bdac4bdf46f..b940e490ebab 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -655,6 +655,19 @@ static void ubq_complete_io_cmd(struct ublk_io *io, int res)
>  
>  #define UBLK_REQUEUE_DELAY_MS	3
>  
> +static inline void __ublk_abort_rq_in_task_work(struct ublk_queue *ubq,
> +		struct request *rq)
> +{
> +	pr_devel("%s: %s q_id %d tag %d io_flags %x.\n", __func__,
> +			(ublk_queue_can_use_recovery(ubq)) ? "requeue" : "abort",
> +			ubq->q_id, rq->tag, ubq->ios[rq->tag].flags);
> +	/* We cannot process this rq so just requeue it. */
> +	if (ublk_queue_can_use_recovery(ubq))
> +		blk_mq_requeue_request(rq, false);
> +	else
> +		blk_mq_end_request(rq, BLK_STS_IOERR);
> +}
> +
>  static inline void __ublk_rq_task_work(struct request *req)
>  {
>  	struct ublk_queue *ubq = req->mq_hctx->driver_data;
> @@ -677,7 +690,7 @@ static inline void __ublk_rq_task_work(struct request *req)
>  	 * (2) current->flags & PF_EXITING.
>  	 */
>  	if (unlikely(current != ubq->ubq_daemon || current->flags & PF_EXITING)) {
> -		blk_mq_end_request(req, BLK_STS_IOERR);
> +		__ublk_abort_rq_in_task_work(ubq, req);
>  		mod_delayed_work(system_wq, &ub->monitor_work, 0);
>  		return;
>  	}
> @@ -752,6 +765,20 @@ static void ublk_rq_task_work_fn(struct callback_head *work)
>  	__ublk_rq_task_work(req);
>  }
>  
> +static inline blk_status_t __ublk_abort_rq(struct ublk_queue *ubq,
> +		struct request *rq)
> +{
> +	pr_devel("%s: %s q_id %d tag %d io_flags %x.\n", __func__,
> +			(ublk_queue_can_use_recovery(ubq)) ? "requeue" : "abort",
> +			ubq->q_id, rq->tag, ubq->ios[rq->tag].flags);
> +	/* We cannot process this rq so just requeue it. */
> +	if (ublk_queue_can_use_recovery(ubq)) {
> +		blk_mq_requeue_request(rq, false);
> +		return BLK_STS_OK;
> +	}
> +	return BLK_STS_IOERR;
> +}
> +

Please remove the two added logging, otherwise this patch looks fine.

Thanks,
Ming



* Re: [PATCH V4 4/8] ublk_drv: requeue rqs with recovery feature enabled
  2022-09-22  0:28   ` Ming Lei
@ 2022-09-22  2:04     ` Ziyang Zhang
  0 siblings, 0 replies; 16+ messages in thread
From: Ziyang Zhang @ 2022-09-22  2:04 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi

On 2022/9/22 08:28, Ming Lei wrote:
> On Wed, Sep 21, 2022 at 05:58:45PM +0800, ZiyangZhang wrote:
>> With recovery feature enabled, in ublk_queue_rq or task work
>> (in exit_task_work or fallback wq), we requeue rqs instead of
>> ending(aborting) them. Besides, no matter whether the recovery feature
>> is enabled or disabled, we schedule monitor_work immediately.
>>
>> Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
>> ---
>>  drivers/block/ublk_drv.c | 31 +++++++++++++++++++++++++++++--
>>  1 file changed, 29 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
>> index 3bdac4bdf46f..b940e490ebab 100644
>> --- a/drivers/block/ublk_drv.c
>> +++ b/drivers/block/ublk_drv.c
>> @@ -655,6 +655,19 @@ static void ubq_complete_io_cmd(struct ublk_io *io, int res)
>>  
>>  #define UBLK_REQUEUE_DELAY_MS	3
>>  
>> +static inline void __ublk_abort_rq_in_task_work(struct ublk_queue *ubq,
>> +		struct request *rq)
>> +{
>> +	pr_devel("%s: %s q_id %d tag %d io_flags %x.\n", __func__,
>> +			(ublk_queue_can_use_recovery(ubq)) ? "requeue" : "abort",
>> +			ubq->q_id, rq->tag, ubq->ios[rq->tag].flags);
>> +	/* We cannot process this rq so just requeue it. */
>> +	if (ublk_queue_can_use_recovery(ubq))
>> +		blk_mq_requeue_request(rq, false);
>> +	else
>> +		blk_mq_end_request(rq, BLK_STS_IOERR);
>> +}
>> +
>>  static inline void __ublk_rq_task_work(struct request *req)
>>  {
>>  	struct ublk_queue *ubq = req->mq_hctx->driver_data;
>> @@ -677,7 +690,7 @@ static inline void __ublk_rq_task_work(struct request *req)
>>  	 * (2) current->flags & PF_EXITING.
>>  	 */
>>  	if (unlikely(current != ubq->ubq_daemon || current->flags & PF_EXITING)) {
>> -		blk_mq_end_request(req, BLK_STS_IOERR);
>> +		__ublk_abort_rq_in_task_work(ubq, req);
>>  		mod_delayed_work(system_wq, &ub->monitor_work, 0);
>>  		return;
>>  	}
>> @@ -752,6 +765,20 @@ static void ublk_rq_task_work_fn(struct callback_head *work)
>>  	__ublk_rq_task_work(req);
>>  }
>>  
>> +static inline blk_status_t __ublk_abort_rq(struct ublk_queue *ubq,
>> +		struct request *rq)
>> +{
>> +	pr_devel("%s: %s q_id %d tag %d io_flags %x.\n", __func__,
>> +			(ublk_queue_can_use_recovery(ubq)) ? "requeue" : "abort",
>> +			ubq->q_id, rq->tag, ubq->ios[rq->tag].flags);
>> +	/* We cannot process this rq so just requeue it. */
>> +	if (ublk_queue_can_use_recovery(ubq)) {
>> +		blk_mq_requeue_request(rq, false);
>> +		return BLK_STS_OK;
>> +	}
>> +	return BLK_STS_IOERR;
>> +}
>> +
> 
> Please remove the two added logging, otherwise this patch looks fine.

OK, will do so in V5.

Regards,
Zhang



* Re: [PATCH V4 5/8] ublk_drv: consider recovery feature in aborting mechanism
  2022-09-22  0:18   ` Ming Lei
@ 2022-09-22  2:06     ` Ziyang Zhang
  0 siblings, 0 replies; 16+ messages in thread
From: Ziyang Zhang @ 2022-09-22  2:06 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi

On 2022/9/22 08:18, Ming Lei wrote:
> On Wed, Sep 21, 2022 at 05:58:46PM +0800, ZiyangZhang wrote:
>> With USER_RECOVERY feature enabled, the monitor_work schedules
>> quiesce_work after finding a dying ubq_daemon. The monitor_work
>> should also abort all rqs issued to userspace before the ubq_daemon is
>> dying. The quiesce_work's job is to:
>> (1) quiesce request queue.
>> (2) check if there is any INFLIGHT rq. If so, we retry until all these
>>     rqs are requeued and become IDLE. These rqs should be requeued by
>> 	ublk_queue_rq(), task work, io_uring fallback wq or monitor_work.
>> (3) complete all ioucmds by calling io_uring_cmd_done(). We are safe to
>>     do so because no ioucmd can be referenced now.
>> (4) set ub's state to UBLK_S_DEV_QUIESCED, which means we are ready for
>>     recovery. This state is exposed to userspace by GET_DEV_INFO.
>>
>> The driver can always handle STOP_DEV and cleanup everything no matter
>> ub's state is LIVE or QUIESCED. After ub's state is UBLK_S_DEV_QUIESCED,
>> user can recover with new process.
>>
>> Note: we do not change the default behavior with the recovery feature
>> disabled. monitor_work still schedules stop_work and abort inflight
>> rqs. And finally ublk_device is released.
>>
>> Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
> 
> This version is close to being ready; just some debug logging needs to
> be removed, see the inline comments. Also I'd suggest you learn to use
> bpftrace a bit; then you basically needn't rely on kernel logging.
> 
> If this logging is removed, you will see how simple the patch becomes
> compared with the previous version.

Current version is simpler, thanks for reviewing, Ming.
Debug logging will be removed in V5 and I will send it out soon.

Regards,
Zhang


* Re: [PATCH V4 6/8] ublk_drv: support UBLK_F_USER_RECOVERY_REISSUE
  2022-09-21  9:58 ` [PATCH V4 6/8] ublk_drv: support UBLK_F_USER_RECOVERY_REISSUE ZiyangZhang
@ 2022-09-22  2:13   ` Ming Lei
  0 siblings, 0 replies; 16+ messages in thread
From: Ming Lei @ 2022-09-22  2:13 UTC (permalink / raw)
  To: ZiyangZhang; +Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi

On Wed, Sep 21, 2022 at 05:58:47PM +0800, ZiyangZhang wrote:
> UBLK_F_USER_RECOVERY_REISSUE implies that:
> with a dying ubq_daemon, ublk_drv lets monitor_work requeue rqs issued to
> userspace(ublksrv) before the ubq_daemon dies.
> 
> UBLK_F_USER_RECOVERY_REISSUE is designed for backends which:
> (1) tolerate a double write, since ublk_drv may issue the same rq
>     twice.
> (2) must not expose I/O errors to frontend users, such as a read-only
>     FS or a VM backend.
> 
> Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
> ---

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming



* Re: [PATCH V4 7/8] ublk_drv: allow new process to open ublk chardev with recovery feature enabled
  2022-09-21  9:58 ` [PATCH V4 7/8] ublk_drv: allow new process to open ublk chardev with recovery feature enabled ZiyangZhang
@ 2022-09-22  2:38   ` Ming Lei
  2022-09-22  3:00     ` Ziyang Zhang
  0 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2022-09-22  2:38 UTC (permalink / raw)
  To: ZiyangZhang; +Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi

On Wed, Sep 21, 2022 at 05:58:48PM +0800, ZiyangZhang wrote:
> With recovery feature enabled, if ublk chardev is ready to be released
> and quiesce_work has been scheduled, we:
> (1) cancel monitor_work to avoid UAF on ubq && ublk_io.
> (2) reinit all ubqs, including:
>     (a) put the task_struct and reset ->ubq_daemon to NULL.
>     (b) reset all ublk_io.
> (3) reset ub->mm to NULL.
> Then ublk chardev is released and new process can open it.
> 
> RESTART_DEV is introduced as a new ctrl-cmd for recovery feature.
> After the chardev is opened and all ubqs are ready, user should send
> RESTART_DEV to:
> (1) wait until all new ubq_daemons are ready.
> (2) update ublksrv_pid
> (3) unquiesce the request queue and expect incoming ublk_queue_rq()
> (4) convert ub's state to UBLK_S_DEV_LIVE
> (5) reschedule monitor_work
> 
> Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
> ---
>  drivers/block/ublk_drv.c      | 109 +++++++++++++++++++++++++++++++++-
>  include/uapi/linux/ublk_cmd.h |   1 +
>  2 files changed, 109 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index dc33ebc20c01..871cd48503a2 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -912,10 +912,67 @@ static int ublk_ch_open(struct inode *inode, struct file *filp)
>  	return 0;
>  }
>  
> +static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
> +{
> +	int i;
> +
> +	WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
> +	pr_devel("%s: prepare for recovering qid %d\n", __func__, ubq->q_id);
> +	/* old daemon is PF_EXITING, put it now */
> +	put_task_struct(ubq->ubq_daemon);
> +	/* We have to reset it to NULL, otherwise ub won't accept new FETCH_REQ */
> +	ubq->ubq_daemon = NULL;

Then we can kill the task put & reset in ublk_deinit_queue(), and call
ublk_queue_reinit() unconditionally in ublk_ch_release(). 

> +
> +	for (i = 0; i < ubq->q_depth; i++) {
> +		struct ublk_io *io = &ubq->ios[i];
> +
> +		/* forget everything now and be ready for new FETCH_REQ */
> +		io->flags = 0;
> +		io->cmd = NULL;
> +		io->addr = 0;
> +	}
> +	ubq->nr_io_ready = 0;

I guess the above line should have been WARN_ON_ONCE(!ubq->nr_io_ready)?

> +}
> +
>  static int ublk_ch_release(struct inode *inode, struct file *filp)
>  {
>  	struct ublk_device *ub = filp->private_data;
> +	int i;
> +
> +	/* lockless fast path */
> +	if (!unlikely(ublk_can_use_recovery(ub) && ub->dev_info.state == UBLK_S_DEV_QUIESCED))
> +		goto out_clear;
> +
> +	mutex_lock(&ub->mutex);
> +	/*
> +	 * USER_RECOVERY is only allowed after UBLK_S_DEV_QUIESCED is set,
> +	 * which means that:
> +	 *     (a) request queue has been quiesced
> +	 *     (b) no inflight rq exists
> +	 *     (c) all ioucmds owned by the dying process are completed
> +	 */
> +	if (!(ublk_can_use_recovery(ub) && ub->dev_info.state == UBLK_S_DEV_QUIESCED))
> +		goto out_unlock;
> +	pr_devel("%s: reinit queues for dev id %d.\n", __func__, ub->dev_info.dev_id);
> +	/* we are going to release task_struct of ubq_daemon and reset
> +	 * ->ubq_daemon to NULL. So in monitor_work, check on ubq_daemon causes UAF.
> +	 * Besides, monitor_work is not necessary in QUIESCED state since we have
> +	 * already scheduled quiesce_work and quiesced all ubqs.
> +	 *
> +	 * Do not let monitor_work schedule itself if state is QUIESCED. And we cancel
> +	 * it here and re-schedule it in RESTART_DEV to avoid UAF.
> +	 */
> +	cancel_delayed_work_sync(&ub->monitor_work);

Canceling monitor_work isn't supposed to be done here; it should be done
after ublk_wait_tagset_rqs_idle(ub) returns.

>  
> +	for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
> +		ublk_queue_reinit(ub, ublk_get_queue(ub, i));
> +	/* set to NULL, otherwise new ubq_daemon cannot mmap the io_cmd_buf */
> +	ub->mm = NULL;
> +	ub->nr_queues_ready = 0;
> +	init_completion(&ub->completion);

The above can be done as generic code for both non-recovery and recovery
code.

> + out_unlock:
> +	mutex_unlock(&ub->mutex);
> + out_clear:
>  	clear_bit(UB_STATE_OPEN, &ub->state);
>  	return 0;
>  }
> @@ -1199,9 +1256,14 @@ static void ublk_mark_io_ready(struct ublk_device *ub, struct ublk_queue *ubq)
>  		ubq->ubq_daemon = current;
>  		get_task_struct(ubq->ubq_daemon);
>  		ub->nr_queues_ready++;
> +		pr_devel("%s: ub %d qid %d is ready.\n",
> +				__func__, ub->dev_info.dev_id, ubq->q_id);
>  	}
> -	if (ub->nr_queues_ready == ub->dev_info.nr_hw_queues)
> +	if (ub->nr_queues_ready == ub->dev_info.nr_hw_queues) {
> +		pr_devel("%s: ub %d all ubqs are ready.\n",
> +				__func__, ub->dev_info.dev_id);
>  		complete_all(&ub->completion);
> +	}

Too much logging.

>  	mutex_unlock(&ub->mutex);
>  }
>  
> @@ -1903,6 +1965,48 @@ static int ublk_ctrl_set_params(struct io_uring_cmd *cmd)
>  	return ret;
>  }
>  
> +static int ublk_ctrl_restart_dev(struct io_uring_cmd *cmd)
> +{
> +	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
> +	int ublksrv_pid = (int)header->data[0];
> +	struct ublk_device *ub;
> +	int ret = -EINVAL;
> +
> +	ub = ublk_get_device_from_id(header->dev_id);
> +	if (!ub)
> +		return ret;
> +
> +	pr_devel("%s: Waiting for new ubq_daemons(nr: %d) are ready, dev id %d...\n",
> +			__func__, ub->dev_info.nr_hw_queues, header->dev_id);
> +	/* wait until new ubq_daemon sending all FETCH_REQ */
> +	wait_for_completion_interruptible(&ub->completion);
> +	pr_devel("%s: All new ubq_daemons(nr: %d) are ready, dev id %d\n",
> +			__func__, ub->dev_info.nr_hw_queues, header->dev_id);
> +
> +	mutex_lock(&ub->mutex);
> +	if (!ublk_can_use_recovery(ub))
> +		goto out_unlock;
> +
> +	if (ub->dev_info.state != UBLK_S_DEV_QUIESCED) {
> +		ret = -EBUSY;
> +		goto out_unlock;
> +	}
> +	ub->dev_info.ublksrv_pid = ublksrv_pid;
> +	pr_devel("%s: new ublksrv_pid %d, dev id %d\n",
> +			__func__, ublksrv_pid, header->dev_id);
> +	blk_mq_unquiesce_queue(ub->ub_disk->queue);
> +	pr_devel("%s: queue unquiesced, dev id %d.\n",
> +			__func__, header->dev_id);
> +	blk_mq_kick_requeue_list(ub->ub_disk->queue);
> +	ub->dev_info.state = UBLK_S_DEV_LIVE;
> +	schedule_delayed_work(&ub->monitor_work, UBLK_DAEMON_MONITOR_PERIOD);
> +	ret = 0;
> + out_unlock:
> +	mutex_unlock(&ub->mutex);
> +	ublk_put_device(ub);
> +	return ret;
> +}
> +
>  static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
>  		unsigned int issue_flags)
>  {
> @@ -1944,6 +2048,9 @@ static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
>  	case UBLK_CMD_SET_PARAMS:
>  		ret = ublk_ctrl_set_params(cmd);
>  		break;
> +	case UBLK_CMD_RESTART_DEV:
> +		ret = ublk_ctrl_restart_dev(cmd);
> +		break;
>  	default:
>  		break;
>  	}
> diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
> index 332370628757..a088f374c0f6 100644
> --- a/include/uapi/linux/ublk_cmd.h
> +++ b/include/uapi/linux/ublk_cmd.h
> @@ -17,6 +17,7 @@
>  #define	UBLK_CMD_STOP_DEV	0x07
>  #define	UBLK_CMD_SET_PARAMS	0x08
>  #define	UBLK_CMD_GET_PARAMS	0x09
> +#define UBLK_CMD_RESTART_DEV	0x10

Maybe RESET_DEV or RECOVERY_DEV is better, given userspace does not
send a STOP_DEV command.

Thanks,
Ming



* Re: [PATCH V4 7/8] ublk_drv: allow new process to open ublk chardev with recovery feature enabled
  2022-09-22  2:38   ` Ming Lei
@ 2022-09-22  3:00     ` Ziyang Zhang
  0 siblings, 0 replies; 16+ messages in thread
From: Ziyang Zhang @ 2022-09-22  3:00 UTC (permalink / raw)
  To: Ming Lei; +Cc: axboe, xiaoguang.wang, linux-block, linux-kernel, joseph.qi

On 2022/9/22 10:38, Ming Lei wrote:
> On Wed, Sep 21, 2022 at 05:58:48PM +0800, ZiyangZhang wrote:
>> With recovery feature enabled, if ublk chardev is ready to be released
>> and quiesce_work has been scheduled, we:
>> (1) cancel monitor_work to avoid UAF on ubq && ublk_io.
>> (2) reinit all ubqs, including:
>>     (a) put the task_struct and reset ->ubq_daemon to NULL.
>>     (b) reset all ublk_io.
>> (3) reset ub->mm to NULL.
>> Then ublk chardev is released and new process can open it.
>>
>> RESTART_DEV is introduced as a new ctrl-cmd for recovery feature.
>> After the chardev is opened and all ubqs are ready, user should send
>> RESTART_DEV to:
>> (1) wait until all new ubq_daemons getting ready.
>> (2) update ublksrv_pid
>> (3) unquiesce the request queue and expect incoming ublk_queue_rq()
>> (4) convert ub's state to UBLK_S_DEV_LIVE
>> (5) reschedule monitor_work
>>
>> Signed-off-by: ZiyangZhang <ZiyangZhang@linux.alibaba.com>
>> ---
>>  drivers/block/ublk_drv.c      | 109 +++++++++++++++++++++++++++++++++-
>>  include/uapi/linux/ublk_cmd.h |   1 +
>>  2 files changed, 109 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
>> index dc33ebc20c01..871cd48503a2 100644
>> --- a/drivers/block/ublk_drv.c
>> +++ b/drivers/block/ublk_drv.c
>> @@ -912,10 +912,67 @@ static int ublk_ch_open(struct inode *inode, struct file *filp)
>>  	return 0;
>>  }
>>  
>> +static void ublk_queue_reinit(struct ublk_device *ub, struct ublk_queue *ubq)
>> +{
>> +	int i;
>> +
>> +	WARN_ON_ONCE(!(ubq->ubq_daemon && ubq_daemon_is_dying(ubq)));
>> +	pr_devel("%s: prepare for recovering qid %d\n", __func__, ubq->q_id);
>> +	/* old daemon is PF_EXITING, put it now */
>> +	put_task_struct(ubq->ubq_daemon);
>> +	/* We have to reset it to NULL, otherwise ub won't accept new FETCH_REQ */
>> +	ubq->ubq_daemon = NULL;
> 
> Then we can kill the task put & reset in ublk_deinit_queue(), and call
> ublk_queue_reinit() unconditionally in ublk_ch_release(). 

ublk_queue_reinit() can only be called if ub->dev_info.state == UBLK_S_DEV_QUIESCED.
If we have done ublk_deinit_queue(), we cannot be in the UBLK_S_DEV_QUIESCED state.

> 
>> +
>> +	for (i = 0; i < ubq->q_depth; i++) {
>> +		struct ublk_io *io = &ubq->ios[i];
>> +
>> +		/* forget everything now and be ready for new FETCH_REQ */
>> +		io->flags = 0;
>> +		io->cmd = NULL;
>> +		io->addr = 0;
>> +	}
>> +	ubq->nr_io_ready = 0;
> 
> I guess the above line should have been WARN_ON_ONCE(!ubq->nr_io_ready)?


Before V4, it was WARN_ON_ONCE(!ubq->nr_io_ready). But ublk_cancel_queue() is now
called in __ublk_quiesce_dev(). In ublk_cancel_queue(), ->nr_io_ready is only decreased
if io->flags is ACTIVE. If some ioucmds have been sent to userspace and then a crash
happens, some ublk_ios' flags are ACTIVE while others are not, so ->nr_io_ready here
might not be zero.

ublk_cancel_queue() used to be called only before/after the ublk device was
started/stopped, so in the past ubq->nr_io_ready was always zero. But now I think
WARN_ON_ONCE(ubq->nr_io_ready) in ublk_cancel_queue() should also be removed
for the same reason.

> 
>> +}
>> +
>>  static int ublk_ch_release(struct inode *inode, struct file *filp)
>>  {
>>  	struct ublk_device *ub = filp->private_data;
>> +	int i;
>> +
>> +	/* lockless fast path */
>> +	if (!unlikely(ublk_can_use_recovery(ub) && ub->dev_info.state == UBLK_S_DEV_QUIESCED))
>> +		goto out_clear;
>> +
>> +	mutex_lock(&ub->mutex);
>> +	/*
>> +	 * USER_RECOVERY is only allowed after UBLK_S_DEV_QUIESCED is set,
>> +	 * which means that:
>> +	 *     (a) request queue has been quiesced
>> +	 *     (b) no inflight rq exists
>> +	 *     (c) all ioucmds owned by the dying process are completed
>> +	 */
>> +	if (!(ublk_can_use_recovery(ub) && ub->dev_info.state == UBLK_S_DEV_QUIESCED))
>> +		goto out_unlock;
>> +	pr_devel("%s: reinit queues for dev id %d.\n", __func__, ub->dev_info.dev_id);
>> +	/* we are going to release task_struct of ubq_daemon and reset
>> +	 * ->ubq_daemon to NULL. So in monitor_work, check on ubq_daemon causes UAF.
>> +	 * Besides, monitor_work is not necessary in QUIESCED state since we have
>> +	 * already scheduled quiesce_work and quiesced all ubqs.
>> +	 *
>> +	 * Do not let monitor_work schedule itself if state is QUIESCED. And we cancel
>> +	 * it here and re-schedule it in RESTART_DEV to avoid UAF.
>> +	 */
>> +	cancel_delayed_work_sync(&ub->monitor_work);
> 
> Canceling monitor_work isn't supposed to be done here; it should be done
> after ublk_wait_tagset_rqs_idle(ub) returns.

So we do not need to cancel monitor_work here and re-schedule it in RESTART_DEV?
I am worried about a UAF in monitor_work's check on ubq_daemon, since ubq_daemon is
invalid during user recovery.

> 
>>  
>> +	for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
>> +		ublk_queue_reinit(ub, ublk_get_queue(ub, i));
>> +	/* set to NULL, otherwise new ubq_daemon cannot mmap the io_cmd_buf */
>> +	ub->mm = NULL;
>> +	ub->nr_queues_ready = 0;
>> +	init_completion(&ub->completion);
> 
> The above can be done as generic code for both non-recovery and recovery
> code.

OK, it should be moved.

> 
>> + out_unlock:
>> +	mutex_unlock(&ub->mutex);
>> + out_clear:
>>  	clear_bit(UB_STATE_OPEN, &ub->state);
>>  	return 0;
>>  }
>> @@ -1199,9 +1256,14 @@ static void ublk_mark_io_ready(struct ublk_device *ub, struct ublk_queue *ubq)
>>  		ubq->ubq_daemon = current;
>>  		get_task_struct(ubq->ubq_daemon);
>>  		ub->nr_queues_ready++;
>> +		pr_devel("%s: ub %d qid %d is ready.\n",
>> +				__func__, ub->dev_info.dev_id, ubq->q_id);
>>  	}
>> -	if (ub->nr_queues_ready == ub->dev_info.nr_hw_queues)
>> +	if (ub->nr_queues_ready == ub->dev_info.nr_hw_queues) {
>> +		pr_devel("%s: ub %d all ubqs are ready.\n",
>> +				__func__, ub->dev_info.dev_id);
>>  		complete_all(&ub->completion);
>> +	}
> 
> Too much logging.

They will be removed.

> 
>>  	mutex_unlock(&ub->mutex);
>>  }
>>  
>> @@ -1903,6 +1965,48 @@ static int ublk_ctrl_set_params(struct io_uring_cmd *cmd)
>>  	return ret;
>>  }
>>  
>> +static int ublk_ctrl_restart_dev(struct io_uring_cmd *cmd)
>> +{
>> +	struct ublksrv_ctrl_cmd *header = (struct ublksrv_ctrl_cmd *)cmd->cmd;
>> +	int ublksrv_pid = (int)header->data[0];
>> +	struct ublk_device *ub;
>> +	int ret = -EINVAL;
>> +
>> +	ub = ublk_get_device_from_id(header->dev_id);
>> +	if (!ub)
>> +		return ret;
>> +
>> +	pr_devel("%s: Waiting for new ubq_daemons(nr: %d) are ready, dev id %d...\n",
>> +			__func__, ub->dev_info.nr_hw_queues, header->dev_id);
>> +	/* wait until new ubq_daemon sending all FETCH_REQ */
>> +	wait_for_completion_interruptible(&ub->completion);
>> +	pr_devel("%s: All new ubq_daemons(nr: %d) are ready, dev id %d\n",
>> +			__func__, ub->dev_info.nr_hw_queues, header->dev_id);
>> +
>> +	mutex_lock(&ub->mutex);
>> +	if (!ublk_can_use_recovery(ub))
>> +		goto out_unlock;
>> +
>> +	if (ub->dev_info.state != UBLK_S_DEV_QUIESCED) {
>> +		ret = -EBUSY;
>> +		goto out_unlock;
>> +	}
>> +	ub->dev_info.ublksrv_pid = ublksrv_pid;
>> +	pr_devel("%s: new ublksrv_pid %d, dev id %d\n",
>> +			__func__, ublksrv_pid, header->dev_id);
>> +	blk_mq_unquiesce_queue(ub->ub_disk->queue);
>> +	pr_devel("%s: queue unquiesced, dev id %d.\n",
>> +			__func__, header->dev_id);
>> +	blk_mq_kick_requeue_list(ub->ub_disk->queue);
>> +	ub->dev_info.state = UBLK_S_DEV_LIVE;
>> +	schedule_delayed_work(&ub->monitor_work, UBLK_DAEMON_MONITOR_PERIOD);
>> +	ret = 0;
>> + out_unlock:
>> +	mutex_unlock(&ub->mutex);
>> +	ublk_put_device(ub);
>> +	return ret;
>> +}
>> +
>>  static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
>>  		unsigned int issue_flags)
>>  {
>> @@ -1944,6 +2048,9 @@ static int ublk_ctrl_uring_cmd(struct io_uring_cmd *cmd,
>>  	case UBLK_CMD_SET_PARAMS:
>>  		ret = ublk_ctrl_set_params(cmd);
>>  		break;
>> +	case UBLK_CMD_RESTART_DEV:
>> +		ret = ublk_ctrl_restart_dev(cmd);
>> +		break;
>>  	default:
>>  		break;
>>  	}
>> diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
>> index 332370628757..a088f374c0f6 100644
>> --- a/include/uapi/linux/ublk_cmd.h
>> +++ b/include/uapi/linux/ublk_cmd.h
>> @@ -17,6 +17,7 @@
>>  #define	UBLK_CMD_STOP_DEV	0x07
>>  #define	UBLK_CMD_SET_PARAMS	0x08
>>  #define	UBLK_CMD_GET_PARAMS	0x09
>> +#define UBLK_CMD_RESTART_DEV	0x10
> 
> Maybe RESET_DEV or RECOVERY_DEV is better, given userspace does not
> send a STOP_DEV command.
OK.

Regards,
Zhang


end of thread, other threads:[~2022-09-22  3:01 UTC | newest]

Thread overview: 16+ messages
-- links below jump to the message on this page --
2022-09-21  9:58 [PATCH V4 0/8] ublk_drv: add USER_RECOVERY support ZiyangZhang
2022-09-21  9:58 ` [PATCH V4 1/8] ublk_drv: check 'current' instead of 'ubq_daemon' ZiyangZhang
2022-09-21  9:58 ` [PATCH V4 2/8] ublk_drv: refactor ublk_cancel_queue() ZiyangZhang
2022-09-21  9:58 ` [PATCH V4 3/8] ublk_drv: define macros for recovery feature and check them ZiyangZhang
2022-09-21  9:58 ` [PATCH V4 4/8] ublk_drv: requeue rqs with recovery feature enabled ZiyangZhang
2022-09-22  0:28   ` Ming Lei
2022-09-22  2:04     ` Ziyang Zhang
2022-09-21  9:58 ` [PATCH V4 5/8] ublk_drv: consider recovery feature in aborting mechanism ZiyangZhang
2022-09-22  0:18   ` Ming Lei
2022-09-22  2:06     ` Ziyang Zhang
2022-09-21  9:58 ` [PATCH V4 6/8] ublk_drv: support UBLK_F_USER_RECOVERY_REISSUE ZiyangZhang
2022-09-22  2:13   ` Ming Lei
2022-09-21  9:58 ` [PATCH V4 7/8] ublk_drv: allow new process to open ublk chardev with recovery feature enabled ZiyangZhang
2022-09-22  2:38   ` Ming Lei
2022-09-22  3:00     ` Ziyang Zhang
2022-09-21  9:58 ` [PATCH V4 8/8] Documentation: document ublk user recovery feature ZiyangZhang
