IO-Uring Archive on lore.kernel.org
* [PATCH] io_uring: fix req cannot arm poll after polled
@ 2020-06-30 12:41 Xuan Zhuo
  2020-06-30 12:59 ` Pavel Begunkov
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Xuan Zhuo @ 2020-06-30 12:41 UTC (permalink / raw)
  To: io-uring, axboe; +Cc: Dust.li

For example, suppose multiple recv sqes are submitted on the same
connection. When there is no data on the connection, the reqs of these
sqes all arm poll. If only a little data then arrives, only one req
receives it, and the other reqs get EAGAIN again. However, because the
REQ_F_POLLED flag is now set, these reqs cannot enter
io_arm_poll_handler again. They are instead put into the wq by
io_queue_async_work, and io_wqe_worker then calls recv with blocking
flags, which may make io_wqe_worker enter schedule in the network
protocol stack. When the main process using io_uring exits, these
io_wqe_workers still cannot exit. The connection is not released
until it is closed by the peer.

So we should allow req to arm poll again.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 fs/io_uring.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index e507737..a309832 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4406,7 +4406,7 @@ static bool io_arm_poll_handler(struct io_kiocb *req)
 
 	if (!req->file || !file_can_poll(req->file))
 		return false;
-	if (req->flags & (REQ_F_MUST_PUNT | REQ_F_POLLED))
+	if (req->flags & REQ_F_MUST_PUNT)
 		return false;
 	if (!def->pollin && !def->pollout)
 		return false;
-- 
1.8.3.1
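
To make the effect of the one-line change concrete, here is a minimal
userspace sketch of the early checks in io_arm_poll_handler, before and
after the patch. It is only a model: the struct, the MODEL_* flag
values, and the function names are hypothetical stand-ins, not the
kernel's definitions, and the real function does more than these three
checks.

```c
#include <stdbool.h>

/* Hypothetical flag values for this model; the real ones are in fs/io_uring.c. */
#define MODEL_REQ_F_MUST_PUNT	(1u << 0)
#define MODEL_REQ_F_POLLED	(1u << 1)

/* Hypothetical stand-in for the fields of struct io_kiocb used by the checks. */
struct model_req {
	unsigned int flags;
	bool has_file;		/* models req->file != NULL */
	bool file_can_poll;	/* models file_can_poll(req->file) */
	bool pollin;		/* models def->pollin */
	bool pollout;		/* models def->pollout */
};

/* Shared body: refuse to arm poll if any flag in 'blocking' is set. */
bool model_can_arm_poll(const struct model_req *req, unsigned int blocking)
{
	if (!req->has_file || !req->file_can_poll)
		return false;
	if (req->flags & blocking)
		return false;
	if (!req->pollin && !req->pollout)
		return false;
	return true;
}

/* Check as it stood before the patch: a req that already polled is refused. */
bool model_can_arm_poll_before(const struct model_req *req)
{
	return model_can_arm_poll(req, MODEL_REQ_F_MUST_PUNT | MODEL_REQ_F_POLLED);
}

/* Check as proposed by the patch: only REQ_F_MUST_PUNT refuses. */
bool model_can_arm_poll_after(const struct model_req *req)
{
	return model_can_arm_poll(req, MODEL_REQ_F_MUST_PUNT);
}
```

In this model, a recv req that has already polled once (flags contains
MODEL_REQ_F_POLLED) is refused by the "before" check, which corresponds
to the case where the req is handed to io-wq, and is accepted by the
"after" check.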



* Re: [PATCH] io_uring: fix req cannot arm poll after polled
  2020-06-30 12:41 [PATCH] io_uring: fix req cannot arm poll after polled Xuan Zhuo
@ 2020-06-30 12:59 ` Pavel Begunkov
  2020-06-30 14:02 ` Jens Axboe
  2020-07-01 12:47 ` Xuan Zhuo
  2 siblings, 0 replies; 4+ messages in thread
From: Pavel Begunkov @ 2020-06-30 12:59 UTC (permalink / raw)
  To: Xuan Zhuo, io-uring, axboe; +Cc: Dust.li

On 30/06/2020 15:41, Xuan Zhuo wrote:
> For example, there are multiple sqes recv with the same connection.
> When there is no data in the connection, the reqs of these sqes will
> be armed poll. Then if only a little data is received, only one req
> receives the data, and the other reqs get EAGAIN again. However,
> due to this flags REQ_F_POLLED, these reqs cannot enter the
> io_arm_poll_handler function. These reqs will be put into wq by
> io_queue_async_work, and the flags passed by io_wqe_worker when recv
> is called are BLOCK, which may make io_wqe_worker enter schedule in the
> network protocol stack. When the main process of io_uring exits,
> these io_wqe_workers still cannot exit. The connection will not be
> actively released until the connection is closed by the peer.

It's a problem unrelated to polling, though it may be a nice optimisation.
E.g. requests submitted with IOSQE_ASYNC will always get into io-wq.

Have you seen it yourself? When io_uring is going away, it calls
io_wq_cancel_all(), which does send_sig(SIGINT) for all its workers. The
question is why this doesn't halt inflight send/recv down in the network
stack.

> 
> So we should allow req to arm poll again.
> 
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
>  fs/io_uring.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index e507737..a309832 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -4406,7 +4406,7 @@ static bool io_arm_poll_handler(struct io_kiocb *req)
>  
>  	if (!req->file || !file_can_poll(req->file))
>  		return false;
> -	if (req->flags & (REQ_F_MUST_PUNT | REQ_F_POLLED))
> +	if (req->flags & REQ_F_MUST_PUNT)

Your sources are a bit outdated.

>  		return false;
>  	if (!def->pollin && !def->pollout)
>  		return false;
> 

-- 
Pavel Begunkov


* Re: [PATCH] io_uring: fix req cannot arm poll after polled
  2020-06-30 12:41 [PATCH] io_uring: fix req cannot arm poll after polled Xuan Zhuo
  2020-06-30 12:59 ` Pavel Begunkov
@ 2020-06-30 14:02 ` Jens Axboe
  2020-07-01 12:47 ` Xuan Zhuo
  2 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2020-06-30 14:02 UTC (permalink / raw)
  To: Xuan Zhuo, io-uring; +Cc: Dust.li

On 6/30/20 6:41 AM, Xuan Zhuo wrote:
> For example, there are multiple sqes recv with the same connection.
> When there is no data in the connection, the reqs of these sqes will
> be armed poll. Then if only a little data is received, only one req
> receives the data, and the other reqs get EAGAIN again. However,
> due to this flags REQ_F_POLLED, these reqs cannot enter the
> io_arm_poll_handler function. These reqs will be put into wq by
> io_queue_async_work, and the flags passed by io_wqe_worker when recv
> is called are BLOCK, which may make io_wqe_worker enter schedule in the
> network protocol stack. When the main process of io_uring exits,
> these io_wqe_workers still cannot exit. The connection will not be
> actively released until the connection is closed by the peer.
> 
> So we should allow req to arm poll again.

I was actually pondering this when I wrote it, and was worried about
potential performance implications from only allowing a single trigger
of the poll side. But that concern was only about performance; I'm
puzzled as to why this would cause a cancelation issue.

Why can't the workers exit? It's expected to be waiting in there,
and it should be interruptible sleep. Do you have more details on
the test case, maybe even a reproducer for this?

-- 
Jens Axboe



* Re: [PATCH] io_uring: fix req cannot arm poll after polled
  2020-06-30 12:41 [PATCH] io_uring: fix req cannot arm poll after polled Xuan Zhuo
  2020-06-30 12:59 ` Pavel Begunkov
  2020-06-30 14:02 ` Jens Axboe
@ 2020-07-01 12:47 ` Xuan Zhuo
  2 siblings, 0 replies; 4+ messages in thread
From: Xuan Zhuo @ 2020-07-01 12:47 UTC (permalink / raw)
  To: axboe, Pavel Begunkov; +Cc: io-uring


It is true that this path is not perfect for poll; I mainly want to
solve this bug first.

I have considered preventing network fds from entering io-wq. It is
more reasonable to use poll for a network fd. And since there is no
relationship between the sqes of the same network fd, each will receive
an EAGAIN and then arm poll. It is unreasonable for all of them to be
woken up at the same time, although links can solve some of these
problems.

Back to this question: I was able to reproduce this bug yesterday, but
strangely, I tried various versions today and can no longer reproduce
it.

My analysis at the time was that io_uring_release was not triggered. My
guess is that the mm holds a reference to the io_uring fd, and the
worker holds a reference to the mm and enters schedule, which prevents
io_uring from being completely closed.

But in today's tests it cannot be reproduced. When the process exits,
the network connection always closes automatically, and then the worker
returns from schedule. I don't know why it was not closed yesterday.

Sorry, I will test further, and if I reach a conclusion I will report
this problem again.

Thanks, Jens and Pavel, for your time.


