* [PATCH 5.12 0/4] 5.12 fixes
@ 2021-04-08  0:54 Pavel Begunkov
  2021-04-08  0:54 ` [PATCH 1/4] io_uring: clear F_REISSUE right after getting it Pavel Begunkov
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-04-08  0:54 UTC
  To: Jens Axboe, io-uring

1-2 fix REQ_F_REISSUE handling,
3/4 is one of the poll fixes, more will be sent separately.

The long-discussed 4/4 actually fixes something. I'm not sure what the
exact reason for the hangs is, but maybe we'll find out later.
It is easily reproducible with while(1) ./lfs-openat; and was also
reported by Joakim Hassila.

Pavel Begunkov (4):
  io_uring: clear F_REISSUE right after getting it
  io_uring: fix rw req completion
  io_uring: fix poll_rewait racing for ->canceled
  io-wq: cancel unbounded works on io-wq destroy

 fs/io-wq.c    |  4 ++++
 fs/io_uring.c | 17 +++++++++++++----
 2 files changed, 17 insertions(+), 4 deletions(-)

-- 
2.24.0



* [PATCH 1/4] io_uring: clear F_REISSUE right after getting it
  2021-04-08  0:54 [PATCH 5.12 0/4] 5.12 fixes Pavel Begunkov
@ 2021-04-08  0:54 ` Pavel Begunkov
  2021-04-08  0:54 ` [PATCH 2/4] io_uring: fix rw req completion Pavel Begunkov
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-04-08  0:54 UTC
  To: Jens Axboe, io-uring

There are lots of ways an r/w request may continue its path after getting
REQ_F_REISSUE. It's not necessarily io-wq; it can be, e.g., apoll, with
the request submitted via io_async_task_func() -> __io_req_task_submit().

Clear the flag right after getting it, so the next attempt is well
prepared regardless of how the request will be executed.

Fixes: 230d50d448ac ("io_uring: move reissue into regular IO path")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 65a17d560a73..f1881ac0744b 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3294,6 +3294,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 	ret = io_iter_do_read(req, iter);
 
 	if (ret == -EAGAIN || (req->flags & REQ_F_REISSUE)) {
+		req->flags &= ~REQ_F_REISSUE;
 		/* IOPOLL retry should happen for io-wq threads */
 		if (!force_nonblock && !(req->ctx->flags & IORING_SETUP_IOPOLL))
 			goto done;
@@ -3417,8 +3418,10 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	else
 		ret2 = -EINVAL;
 
-	if (req->flags & REQ_F_REISSUE)
+	if (req->flags & REQ_F_REISSUE) {
+		req->flags &= ~REQ_F_REISSUE;
 		ret2 = -EAGAIN;
+	}
 
 	/*
 	 * Raw bdev writes will return -EOPNOTSUPP for IOCB_NOWAIT. Just
@@ -6173,7 +6176,6 @@ static void io_wq_submit_work(struct io_wq_work *work)
 		ret = -ECANCELED;
 
 	if (!ret) {
-		req->flags &= ~REQ_F_REISSUE;
 		do {
 			ret = io_issue_sqe(req, 0);
 			/*
-- 
2.24.0
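
For illustration only, a minimal userspace sketch of the pattern this
patch describes: the code that observes the reissue flag also clears it,
so whichever path retries the request starts from a clean state. All
sketch_* names and SKETCH_F_REISSUE are hypothetical stand-ins, not the
kernel definitions, and the "lower layer" here is a toy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for REQ_F_REISSUE; not the kernel value. */
#define SKETCH_F_REISSUE (1u << 0)

struct sketch_req {
	unsigned int flags;
	int attempts;
};

/* Toy lower layer: asks for a reissue on the first attempt only. */
static int sketch_do_read(struct sketch_req *req)
{
	if (req->attempts++ == 0) {
		req->flags |= SKETCH_F_REISSUE;
		return 0;
	}
	return 42; /* pretend 42 bytes were read on the retry */
}

/* Returns -EAGAIN when the caller should retry, else the result. */
static int sketch_read(struct sketch_req *req)
{
	int ret;

	assert(req != NULL);
	ret = sketch_do_read(req);

	if (ret == -EAGAIN || (req->flags & SKETCH_F_REISSUE)) {
		/* Consume the flag here, not in one particular retry path. */
		req->flags &= ~SKETCH_F_REISSUE;
		return -EAGAIN;
	}
	return ret;
}
```

Because the flag is cleared at the point where it is noticed, the retry
does not depend on any specific caller remembering to reset it.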



* [PATCH 2/4] io_uring: fix rw req completion
  2021-04-08  0:54 [PATCH 5.12 0/4] 5.12 fixes Pavel Begunkov
  2021-04-08  0:54 ` [PATCH 1/4] io_uring: clear F_REISSUE right after getting it Pavel Begunkov
@ 2021-04-08  0:54 ` Pavel Begunkov
  2021-04-08  0:54 ` [PATCH 3/4] io_uring: fix poll_rewait racing for ->canceled Pavel Begunkov
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-04-08  0:54 UTC
  To: Jens Axboe, io-uring

WARNING: at fs/io_uring.c:8578 io_ring_exit_work.cold+0x0/0x18

As reissuing is now passed back via REQ_F_REISSUE, kiocb_done() may just
set the flag and do nothing, leaving dangling requests. The handling is a
bit fragile, e.g. we can't just complete them, because reading beyond the
file boundary needs a blocking context to return 0; otherwise the result
may be -EAGAIN.

Go the easy way for now and just emulate the old behaviour by calling
io_rw_reissue() in kiocb_done().

Fixes: 230d50d448ac ("io_uring: move reissue into regular IO path")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index f1881ac0744b..de5822350345 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2762,6 +2762,7 @@ static void kiocb_done(struct kiocb *kiocb, ssize_t ret,
 {
 	struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
 	struct io_async_rw *io = req->async_data;
+	bool check_reissue = (kiocb->ki_complete == io_complete_rw);
 
 	/* add previously done IO, if any */
 	if (io && io->bytes_done > 0) {
@@ -2777,6 +2778,11 @@ static void kiocb_done(struct kiocb *kiocb, ssize_t ret,
 		__io_complete_rw(req, ret, 0, issue_flags);
 	else
 		io_rw_done(kiocb, ret);
+
+	if (check_reissue && req->flags & REQ_F_REISSUE) {
+		req->flags &= ~REQ_F_REISSUE;
+		io_rw_reissue(req);
+	}
 }
 
 static int io_import_fixed(struct io_kiocb *req, int rw, struct iov_iter *iter)
-- 
2.24.0
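
As a rough userspace sketch of the shape of this fix, assuming a
completion helper that may only set a reissue flag instead of finishing
the request: the "done" wrapper checks the flag afterwards and reissues.
Every name below is hypothetical, and the check_reissue parameter merely
stands in for the patch's ki_complete comparison.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for REQ_F_REISSUE; not the kernel value. */
#define SKETCH_F_REISSUE (1u << 0)

struct sketch_req {
	unsigned int flags;
	int completed;  /* times the request was really completed */
	int reissued;   /* times the request was queued for a retry */
};

/* Toy completion: on -EAGAIN it only requests a reissue via the flag. */
static void sketch_complete(struct sketch_req *req, int ret)
{
	if (ret == -EAGAIN) {
		req->flags |= SKETCH_F_REISSUE;
		return;
	}
	req->completed++;
}

static void sketch_reissue(struct sketch_req *req)
{
	req->reissued++;
}

static void sketch_done(struct sketch_req *req, int ret, bool check_reissue)
{
	assert(req != NULL);
	sketch_complete(req, ret);

	/* Without this check the request would be left dangling. */
	if (check_reissue && (req->flags & SKETCH_F_REISSUE)) {
		req->flags &= ~SKETCH_F_REISSUE;
		sketch_reissue(req);
	}
}
```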



* [PATCH 3/4] io_uring: fix poll_rewait racing for ->canceled
  2021-04-08  0:54 [PATCH 5.12 0/4] 5.12 fixes Pavel Begunkov
  2021-04-08  0:54 ` [PATCH 1/4] io_uring: clear F_REISSUE right after getting it Pavel Begunkov
  2021-04-08  0:54 ` [PATCH 2/4] io_uring: fix rw req completion Pavel Begunkov
@ 2021-04-08  0:54 ` Pavel Begunkov
  2021-04-08  2:28   ` Pavel Begunkov
  2021-04-08  0:54 ` [PATCH 4/4] io-wq: cancel unbounded works on io-wq destroy Pavel Begunkov
  2021-04-08  4:11 ` [PATCH 5.12 0/4] 5.12 fixes Jens Axboe
  4 siblings, 1 reply; 7+ messages in thread
From: Pavel Begunkov @ 2021-04-08  0:54 UTC
  To: Jens Axboe, io-uring

poll->canceled may be set from different contexts, even async ones, so
io_poll_rewait() should be prepared for it to change and should not read
it twice.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io_uring.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index de5822350345..376d9c875dc2 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4897,15 +4897,16 @@ static bool io_poll_rewait(struct io_kiocb *req, struct io_poll_iocb *poll)
 	__acquires(&req->ctx->completion_lock)
 {
 	struct io_ring_ctx *ctx = req->ctx;
+	bool canceled = READ_ONCE(poll->canceled);
 
-	if (!req->result && !READ_ONCE(poll->canceled)) {
+	if (!req->result && !canceled) {
 		struct poll_table_struct pt = { ._key = poll->events };
 
 		req->result = vfs_poll(req->file, &pt) & poll->events;
 	}
 
 	spin_lock_irq(&ctx->completion_lock);
-	if (!req->result && !READ_ONCE(poll->canceled)) {
+	if (!req->result && !canceled) {
 		add_wait_queue(poll->head, &poll->wait);
 		return true;
 	}
-- 
2.24.0
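
A simplified userspace sketch of the "read once, decide twice" idea in
this diff, assuming a C11 relaxed atomic load as a rough stand-in for
READ_ONCE(). The sketch_* names are hypothetical, and unlike the real
function the result value here does not change between the two checks.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_poll {
	atomic_bool canceled; /* may be set concurrently by another context */
};

/*
 * Returns how many of the two guarded steps would run. With a single
 * snapshot the answer is 0 or 2, never 1.
 */
static int sketch_rewait_steps(struct sketch_poll *poll, int result)
{
	bool canceled;
	int steps = 0;

	assert(poll != NULL);
	/* Single load: both checks below see the same value. */
	canceled = atomic_load_explicit(&poll->canceled, memory_order_relaxed);

	if (!result && !canceled)
		steps++; /* would re-poll the file */

	/* A concurrent canceller could run here without splitting the checks. */

	if (!result && !canceled)
		steps++; /* would re-arm the wait queue */

	return steps;
}
```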



* [PATCH 4/4] io-wq: cancel unbounded works on io-wq destroy
  2021-04-08  0:54 [PATCH 5.12 0/4] 5.12 fixes Pavel Begunkov
                   ` (2 preceding siblings ...)
  2021-04-08  0:54 ` [PATCH 3/4] io_uring: fix poll_rewait racing for ->canceled Pavel Begunkov
@ 2021-04-08  0:54 ` Pavel Begunkov
  2021-04-08  4:11 ` [PATCH 5.12 0/4] 5.12 fixes Jens Axboe
  4 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-04-08  0:54 UTC
  To: Jens Axboe, io-uring

WARNING: CPU: 5 PID: 227 at fs/io_uring.c:8578 io_ring_exit_work+0xe6/0x470
RIP: 0010:io_ring_exit_work+0xe6/0x470
Call Trace:
 process_one_work+0x206/0x400
 worker_thread+0x4a/0x3d0
 kthread+0x129/0x170
 ret_from_fork+0x22/0x30

INFO: task lfs-openat:2359 blocked for more than 245 seconds.
task:lfs-openat      state:D stack:    0 pid: 2359 ppid:     1 flags:0x00000004
Call Trace:
 ...
 wait_for_completion+0x8b/0xf0
 io_wq_destroy_manager+0x24/0x60
 io_wq_put_and_exit+0x18/0x30
 io_uring_clean_tctx+0x76/0xa0
 __io_uring_files_cancel+0x1b9/0x2e0
 do_exit+0xc0/0xb40
 ...

Even after io-wq destroy has been issued, io-wq worker threads will
continue executing all remaining work items as usual, and may hang
waiting for I/O that won't ever complete (aka unbounded).

[<0>] pipe_read+0x306/0x450
[<0>] io_iter_do_read+0x1e/0x40
[<0>] io_read+0xd5/0x330
[<0>] io_issue_sqe+0xd21/0x18a0
[<0>] io_wq_submit_work+0x6c/0x140
[<0>] io_worker_handle_work+0x17d/0x400
[<0>] io_wqe_worker+0x2c0/0x330
[<0>] ret_from_fork+0x22/0x30

Cancel all unbounded I/O instead of executing it. This changes the
user-visible behaviour, but that's inevitable as io-wq is not per task.

Suggested-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 fs/io-wq.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 433c4d3c3c1c..4eba531bea5a 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -415,6 +415,7 @@ static void io_worker_handle_work(struct io_worker *worker)
 {
 	struct io_wqe *wqe = worker->wqe;
 	struct io_wq *wq = wqe->wq;
+	bool do_kill = test_bit(IO_WQ_BIT_EXIT, &wq->state);
 
 	do {
 		struct io_wq_work *work;
@@ -444,6 +445,9 @@ static void io_worker_handle_work(struct io_worker *worker)
 			unsigned int hash = io_get_work_hash(work);
 
 			next_hashed = wq_next_work(work);
+
+			if (unlikely(do_kill) && (work->flags & IO_WQ_WORK_UNBOUND))
+				work->flags |= IO_WQ_WORK_CANCEL;
 			wq->do_work(work);
 			io_assign_current_work(worker, NULL);
 
-- 
2.24.0
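
A small userspace sketch of the approach in this diff, assuming a work
handler that honours a cancel flag: once the queue is exiting, unbounded
items are marked cancelled before being handed to the handler, so they
are completed as cancelled rather than started. The sketch_* names and
SKETCH_WORK_* flags are hypothetical, not the io-wq definitions; the
exiting parameter plays the role of the do_kill value sampled in the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for IO_WQ_WORK_CANCEL / IO_WQ_WORK_UNBOUND. */
#define SKETCH_WORK_CANCEL  (1u << 0)
#define SKETCH_WORK_UNBOUND (1u << 1)

struct sketch_work {
	unsigned int flags;
	bool executed;
	bool cancelled;
};

/* Toy handler: a cancelled item is finished without starting any I/O. */
static void sketch_do_work(struct sketch_work *work)
{
	if (work->flags & SKETCH_WORK_CANCEL)
		work->cancelled = true;
	else
		work->executed = true;
}

static void sketch_handle_work(struct sketch_work *works, size_t n,
			       bool exiting)
{
	assert(works != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct sketch_work *work = &works[i];

		/* Unbounded work might block forever, so cancel it on exit. */
		if (exiting && (work->flags & SKETCH_WORK_UNBOUND))
			work->flags |= SKETCH_WORK_CANCEL;
		sketch_do_work(work);
	}
}
```

Bounded items still run to completion in this sketch; only items marked
unbounded are short-circuited once the exit condition is seen.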



* Re: [PATCH 3/4] io_uring: fix poll_rewait racing for ->canceled
  2021-04-08  0:54 ` [PATCH 3/4] io_uring: fix poll_rewait racing for ->canceled Pavel Begunkov
@ 2021-04-08  2:28   ` Pavel Begunkov
  0 siblings, 0 replies; 7+ messages in thread
From: Pavel Begunkov @ 2021-04-08  2:28 UTC
  To: Jens Axboe, io-uring

On 08/04/2021 01:54, Pavel Begunkov wrote:
> poll->canceled may be set from different contexts, even async, so
> io_poll_rewait() should be prepared that it can change and not read it
> twice.

Please disregard this one; apparently it's not a bug, and it will get in
my way later. The other 3 are fine.

> 
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  fs/io_uring.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index de5822350345..376d9c875dc2 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -4897,15 +4897,16 @@ static bool io_poll_rewait(struct io_kiocb *req, struct io_poll_iocb *poll)
>  	__acquires(&req->ctx->completion_lock)
>  {
>  	struct io_ring_ctx *ctx = req->ctx;
> +	bool canceled = READ_ONCE(poll->canceled);
>  
> -	if (!req->result && !READ_ONCE(poll->canceled)) {
> +	if (!req->result && !canceled) {
>  		struct poll_table_struct pt = { ._key = poll->events };
>  
>  		req->result = vfs_poll(req->file, &pt) & poll->events;
>  	}
>  
>  	spin_lock_irq(&ctx->completion_lock);
> -	if (!req->result && !READ_ONCE(poll->canceled)) {
> +	if (!req->result && !canceled) {
>  		add_wait_queue(poll->head, &poll->wait);
>  		return true;
>  	}
> 

-- 
Pavel Begunkov


* Re: [PATCH 5.12 0/4] 5.12 fixes
  2021-04-08  0:54 [PATCH 5.12 0/4] 5.12 fixes Pavel Begunkov
                   ` (3 preceding siblings ...)
  2021-04-08  0:54 ` [PATCH 4/4] io-wq: cancel unbounded works on io-wq destroy Pavel Begunkov
@ 2021-04-08  4:11 ` Jens Axboe
  4 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2021-04-08  4:11 UTC
  To: Pavel Begunkov, io-uring

On 4/7/21 6:54 PM, Pavel Begunkov wrote:
> 1-2 fix REQ_F_REISSUE handling,
> 3/4 is one of the poll fixes, more will be sent separately.
> 
> The long-discussed 4/4 actually fixes something. I'm not sure what the
> exact reason for the hangs is, but maybe we'll find out later.
> It is easily reproducible with while(1) ./lfs-openat; and was also
> reported by Joakim Hassila.

Applied 1-2, 4/4, thanks.

-- 
Jens Axboe


