IO-Uring Archive on lore.kernel.org
* [PATCH v2 0/2] Fix hang in io_uring_get_cqe() with iopoll
@ 2020-06-21 16:14 Pavel Begunkov
  2020-06-21 16:14 ` [PATCH v2 1/2] barriers: add load relaxed Pavel Begunkov
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Pavel Begunkov @ 2020-06-21 16:14 UTC
  To: Jens Axboe, io-uring

v2: use relaxed load
    fix errata

Pavel Begunkov (2):
  barriers: add load relaxed
  Fix hang in io_uring_get_cqe() with iopoll

 src/include/liburing/barrier.h |  4 ++++
 src/queue.c                    | 16 +++++++++++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)

-- 
2.24.0


* [PATCH v2 1/2] barriers: add load relaxed
  2020-06-21 16:14 [PATCH v2 0/2] Fix hang in io_uring_get_cqe() with iopoll Pavel Begunkov
@ 2020-06-21 16:14 ` Pavel Begunkov
  2020-06-21 16:14 ` [PATCH v2 2/2] Fix hang in io_uring_get_cqe() with iopoll Pavel Begunkov
  2020-06-21 18:48 ` [PATCH v2 0/2] " Jens Axboe
  2 siblings, 0 replies; 5+ messages in thread
From: Pavel Begunkov @ 2020-06-21 16:14 UTC
  To: Jens Axboe, io-uring

Add io_uring_smp_load_relaxed() for internal use.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 src/include/liburing/barrier.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/include/liburing/barrier.h b/src/include/liburing/barrier.h
index ad69506..6a1aa52 100644
--- a/src/include/liburing/barrier.h
+++ b/src/include/liburing/barrier.h
@@ -47,6 +47,8 @@ do {						\
 	___p1;						\
 })
 
+#define io_uring_smp_load_relaxed(p) IO_URING_READ_ONCE(*(p))
+
 #else /* defined(__x86_64__) || defined(__i386__) */
 /*
  * Add arch appropriate definitions. Use built-in atomic operations for
@@ -55,6 +57,8 @@ do {						\
 #define io_uring_smp_store_release(p, v) \
 	__atomic_store_n(p, v, __ATOMIC_RELEASE)
 #define io_uring_smp_load_acquire(p) __atomic_load_n(p, __ATOMIC_ACQUIRE)
+#define io_uring_smp_load_relaxed(p) __atomic_load_n(p, __ATOMIC_RELAXED)
+
 #endif /* defined(__x86_64__) || defined(__i386__) */
 
 #endif /* defined(LIBURING_BARRIER_H) */
-- 
2.24.0
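
For readers following the patch above, here is a minimal sketch of why a relaxed load can be enough for counting ring entries while an acquire load is still needed before reading an entry. It assumes a GCC or Clang toolchain with the __atomic builtins; the demo_* names are hypothetical stand-ins and are not part of liburing.

```c
#include <assert.h>

/* Hypothetical stand-ins modelled on the non-x86 branch of the patch. */
#define demo_load_relaxed(p)     __atomic_load_n((p), __ATOMIC_RELAXED)
#define demo_load_acquire(p)     __atomic_load_n((p), __ATOMIC_ACQUIRE)
#define demo_store_release(p, v) __atomic_store_n((p), (v), __ATOMIC_RELEASE)

/*
 * Count published entries. Only the index values are used, no slot is
 * dereferenced, so no ordering against the slot contents is required.
 */
static unsigned demo_count_ready(unsigned *tail, unsigned *head)
{
	return demo_load_relaxed(tail) - *head;
}

/*
 * Read the oldest entry. The acquire load pairs with the producer's
 * release store, so the slot written before the tail update is visible.
 * Returns 1 and fills *out if an entry was available, 0 otherwise.
 */
static int demo_peek(unsigned *tail, unsigned *head, int *slots,
		     unsigned mask, int *out)
{
	unsigned t = demo_load_acquire(tail);

	if (t == *head)
		return 0;
	*out = slots[*head & mask];
	return 1;
}

/* Small self-check: publish one entry, then count it and read it back. */
static void demo_relaxed_vs_acquire(void)
{
	unsigned head = 0, tail = 0;
	int slots[4] = { 0 };
	int value = -1;

	slots[0] = 42;                  /* producer fills the slot ...     */
	demo_store_release(&tail, 1);   /* ... then publishes the new tail */

	assert(demo_count_ready(&tail, &head) == 1);
	assert(demo_peek(&tail, &head, slots, 3, &value) == 1);
	assert(value == 42);
}
```

The sketch is single-threaded, so it only illustrates which load is used where; it does not demonstrate a reordering.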


* [PATCH v2 2/2] Fix hang in io_uring_get_cqe() with iopoll
  2020-06-21 16:14 [PATCH v2 0/2] Fix hang in io_uring_get_cqe() with iopoll Pavel Begunkov
  2020-06-21 16:14 ` [PATCH v2 1/2] barriers: add load relaxed Pavel Begunkov
@ 2020-06-21 16:14 ` Pavel Begunkov
  2020-06-21 18:48 ` [PATCH v2 0/2] " Jens Axboe
  2 siblings, 0 replies; 5+ messages in thread
From: Pavel Begunkov @ 2020-06-21 16:14 UTC
  To: Jens Axboe, io-uring

Because of the need_resched() check, io_uring_enter() -> io_iopoll_check()
can return 0 even if @min_complete wasn't satisfied. In that case,
__io_uring_get_cqe() sets submit=0 and wait_nr=0, which also stops
IORING_ENTER_GETEVENTS from being set. It then spins, calling
io_uring_enter() in a loop without actually submitting or polling.

Set @wait_nr based on the actual number of CQEs ready. Only the count is
needed here, not the CQEs themselves, so __io_uring_cq_ready() is
implemented with relaxed semantics.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 src/queue.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/queue.c b/src/queue.c
index 14a0777..d824cfd 100644
--- a/src/queue.c
+++ b/src/queue.c
@@ -32,6 +32,19 @@ static inline bool sq_ring_needs_enter(struct io_uring *ring,
 	return false;
 }
 
+static inline unsigned int __io_uring_cq_ready(struct io_uring *ring)
+{
+	return io_uring_smp_load_relaxed(ring->cq.ktail) - *ring->cq.khead;
+}
+
+static inline unsigned int io_adjust_wait_nr(struct io_uring *ring,
+					     unsigned int to_wait)
+{
+	unsigned int ready = __io_uring_cq_ready(ring);
+
+	return (to_wait <= ready) ? 0 : (to_wait - ready);
+}
+
 int __io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr,
 		       unsigned submit, unsigned wait_nr, sigset_t *sigmask)
 {
@@ -60,7 +73,8 @@ int __io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr,
 			err = -errno;
 		} else if (ret == (int)submit) {
 			submit = 0;
-			wait_nr = 0;
+			if (to_wait)
+				wait_nr = io_adjust_wait_nr(ring, to_wait);
 		} else {
 			submit -= ret;
 		}
-- 
2.24.0
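
The effect of the change above can be sketched outside liburing with a mock completion queue. This is an illustration under stated assumptions, not the library's code: struct mock_cq, mock_enter_returns_early() and mock_next_iter_waits() are hypothetical names, and the mock "enter" call simply returns 0 without reaping anything, standing in for the early return described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CQ ring indices; not a liburing type. */
struct mock_cq {
	unsigned head;
	unsigned tail;
};

/* Entries ready to be reaped. Unsigned subtraction handles index wrap. */
static unsigned mock_cq_ready(struct mock_cq *cq)
{
	return __atomic_load_n(&cq->tail, __ATOMIC_RELAXED) - cq->head;
}

/* How many more completions are still needed to satisfy to_wait. */
static unsigned mock_adjust_wait_nr(struct mock_cq *cq, unsigned to_wait)
{
	unsigned ready = mock_cq_ready(cq);

	return (to_wait <= ready) ? 0 : (to_wait - ready);
}

/* Mock of an enter call that returns 0 without completing anything. */
static int mock_enter_returns_early(struct mock_cq *cq)
{
	(void)cq;
	return 0;
}

/*
 * One loop iteration with nothing left to submit. Returns true if the
 * next iteration would still ask for events (wait_nr != 0).
 */
static bool mock_next_iter_waits(struct mock_cq *cq, unsigned to_wait,
				 bool patched)
{
	unsigned submit = 0, wait_nr = to_wait;
	int ret = mock_enter_returns_early(cq);

	if (ret == (int)submit) {
		if (patched)
			wait_nr = mock_adjust_wait_nr(cq, to_wait);
		else
			wait_nr = 0;	/* old behaviour: stops waiting */
	}
	return wait_nr != 0;
}
```

With an empty mock queue and to_wait of 1, the unpatched branch leaves wait_nr at 0, so later iterations would no longer request events, while the patched branch keeps wait_nr at 1.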


* Re: [PATCH v2 0/2] Fix hang in io_uring_get_cqe() with iopoll
  2020-06-21 16:14 [PATCH v2 0/2] Fix hang in io_uring_get_cqe() with iopoll Pavel Begunkov
  2020-06-21 16:14 ` [PATCH v2 1/2] barriers: add load relaxed Pavel Begunkov
  2020-06-21 16:14 ` [PATCH v2 2/2] Fix hang in io_uring_get_cqe() with iopoll Pavel Begunkov
@ 2020-06-21 18:48 ` Jens Axboe
  2020-06-21 19:54   ` Pavel Begunkov
  2 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2020-06-21 18:48 UTC
  To: Pavel Begunkov, io-uring

On 6/21/20 10:14 AM, Pavel Begunkov wrote:
> v2: use relaxed load
>     fix errata
> 
> Pavel Begunkov (2):
>   barriers: add load relaxed
>   Fix hang in io_uring_get_cqe() with iopoll
> 
>  src/include/liburing/barrier.h |  4 ++++
>  src/queue.c                    | 16 +++++++++++++++-
>  2 files changed, 19 insertions(+), 1 deletion(-)

After checking again, I think your liburing is quite a bit out-of-date.
Can you check if the issue still exists in the current git tree?

-- 
Jens Axboe


* Re: [PATCH v2 0/2] Fix hang in io_uring_get_cqe() with iopoll
  2020-06-21 18:48 ` [PATCH v2 0/2] " Jens Axboe
@ 2020-06-21 19:54   ` Pavel Begunkov
  0 siblings, 0 replies; 5+ messages in thread
From: Pavel Begunkov @ 2020-06-21 19:54 UTC
  To: Jens Axboe, io-uring

On 21/06/2020 21:48, Jens Axboe wrote:
> On 6/21/20 10:14 AM, Pavel Begunkov wrote:
>> v2: use relaxed load
>>     fix errata
>>
>> Pavel Begunkov (2):
>>   barriers: add load relaxed
>>   Fix hang in io_uring_get_cqe() with iopoll
>>
>>  src/include/liburing/barrier.h |  4 ++++
>>  src/queue.c                    | 16 +++++++++++++++-
>>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> After checking again, I think your liburing is quite a bit out-of-date.

It is. Apparently I checked out something old, my apologies.

> Can you check if the issue still exists in the current git tree?

This one is already fixed there, though at a glance there is a similar
problem in non-iopoll mode with an early wake by a timeout, which makes it
busy-loop instead of sleeping. I'll leave it for a bit later.

-- 
Pavel Begunkov
