All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Pavel Begunkov <asml.silence@gmail.com>,
	Victor Stewart <v@nametag.social>
Cc: io-uring <io-uring@vger.kernel.org>
Subject: Re: bug with fastpoll accept and sqpoll + IOSQE_FIXED_FILE
Date: Fri, 5 Feb 2021 07:42:20 -0700	[thread overview]
Message-ID: <8349a07b-6975-dc55-dc0a-a4228f913af3@kernel.dk> (raw)
In-Reply-To: <c9550dcf-ce53-c214-8c4b-6165ad6605a9@gmail.com>

On 2/5/21 5:46 AM, Pavel Begunkov wrote:
> On 04/02/2021 16:50, Jens Axboe wrote:
>> On 2/3/21 4:49 AM, Pavel Begunkov wrote:
>>> On 02/02/2021 20:56, Pavel Begunkov wrote:
>>>> On 02/02/2021 20:48, Jens Axboe wrote:
>>>>> On 2/2/21 1:34 PM, Pavel Begunkov wrote:
>>>>>> On 02/02/2021 17:41, Pavel Begunkov wrote:
>>>>>>> On 02/02/2021 17:24, Jens Axboe wrote:
>>>>>>>> On 2/2/21 10:10 AM, Victor Stewart wrote:
>>>>>>>>>> Can you send the updated test app?
>>>>>>>>>
>>>>>>>>> https://gist.github.com/victorstewart/98814b65ed702c33480487c05b40eb56
>>>>>>>>>
>>>>>>>>> same link i just updated the same gist
>>>>>>>>
>>>>>>>> And how are you running it?
>>>>>>>
>>>>>>> with SQPOLL    with    FIXED FLAG -> FAILURE: failed with error = ???
>>>>>>> 	-> io_uring_wait_cqe_timeout() strangely returns -1, (-EPERM??)
>>>>>>
>>>>>> Ok, _io_uring_get_cqe() is just screwed twice
>>>>>>
>>>>>> TL;DR
>>>>>> we enter into it with submit=0, do an iteration, which decrements it,
>>>>>> then a second iteration passes submit=-1, which is returned back by
>>>>>> the kernel as a result and propagated back from liburing...
>>>>>
>>>>> Yep, that's what I came up with too. We really just need a clear way
>>>>> of knowing when to break out, and when to keep going. Eg if we've
>>>>> done a loop and don't end up calling the system call, then there's
>>>>> no point in continuing.
>>>>
>>>> We can bodge something up (and forget about it), and do much cleaner
>>>> for IORING_FEAT_EXT_ARG, because we don't have LIBURING_UDATA_TIMEOUT
>>>> reqs for it and so can remove peek and so on.
>>>
>>> This version looks reasonably simple, and even passes tests and all
>>> issues found by Victor's test. Didn't test it yet, but should behave
>>> similarly in regard of internal timeouts (pre IORING_FEAT_EXT_ARG).
>>>
>>> static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr,
>>> 			     struct get_data *data)
>>> {
>>> 	struct io_uring_cqe *cqe = NULL;
>>> 	int ret = 0, err;
>>>
>>> 	do {
>>> 		unsigned flags = 0;
>>> 		unsigned nr_available;
>>> 		bool enter = false;
>>>
>>> 		err = __io_uring_peek_cqe(ring, &cqe, &nr_available);
>>> 		if (err)
>>> 			break;
>>>
>>> 		/* IOPOLL won't proceed when there're not reaped CQEs */
>>> 		if (cqe && (ring->flags & IORING_SETUP_IOPOLL))
>>> 			data->wait_nr = 0;
>>>
>>> 		if (data->wait_nr > nr_available || cq_ring_needs_flush(ring)) {
>>> 			flags = IORING_ENTER_GETEVENTS | data->get_flags;
>>> 			enter = true;
>>> 		}
>>> 		if (data->submit) {
>>> 			sq_ring_needs_enter(ring, &flags);
>>> 			enter = true;
>>> 		}
>>> 		if (!enter)
>>> 			break;
>>>
>>> 		ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
>>> 					    data->wait_nr, flags, data->arg,
>>> 					    data->sz);
>>> 		if (ret < 0) {
>>> 			err = -errno;
>>> 			break;
>>> 		}
>>> 		data->submit -= ret;
>>> 	} while (1);
>>>
>>> 	*cqe_ptr = cqe;
>>> 	return err;
>>> }
>>
>> So here's my take on this - any rewrite of _io_uring_get_cqe() is going
>> to end up adding special cases, that's unfortunately just the nature of
>> the game. And since we're going to be doing a new liburing release very
>> shortly, this isn't a great time to add a rewrite of it. It'll certainly
>> introduce more bugs than it solves, and hence regressions, no matter how
>> careful we are.
>>
>> Hence my suggestion is to just patch this in a trivial kind of fashion,
>> even if it doesn't really make the function any prettier. But it'll be
>> safer for a release, and then we can rework the function after.
>>
>> With that in mind, here's my suggestion. The premise is if we go through
>> the loop and don't do io_uring_enter(), then there's no point in
>> continuing. That's the trivial fix.
> 
> Your idea but imho cleaner below.
> +1 comment inline

Shouldn't be hard, it was just a quick hack :-)

>> diff --git a/src/queue.c b/src/queue.c
>> index 94f791e..4161aa7 100644
>> --- a/src/queue.c
>> +++ b/src/queue.c
>> @@ -89,12 +89,13 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
>>  {
>>  	struct io_uring_cqe *cqe = NULL;
>>  	const int to_wait = data->wait_nr;
>> -	int ret = 0, err;
>> +	int err;
>>  
>>  	do {
>>  		bool cq_overflow_flush = false;
>>  		unsigned flags = 0;
>>  		unsigned nr_available;
>> +		int ret = -2;
>>  
>>  		err = __io_uring_peek_cqe(ring, &cqe, &nr_available);
>>  		if (err)
>> @@ -117,7 +118,9 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
>>  			ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
>>  					data->wait_nr, flags, data->arg,
>>  					data->sz);
>> -		if (ret < 0) {
>> +		if (ret == -2) {
>> +			break;
> 
> peek/wait_cqe expect that cqe_ptr is filled on return=0. Looks we need
> to return an error or hack up those functions.

Right good point, we'd need -EAGAIN.

>> +		} else if (ret < 0) {
>>  			err = -errno;
>>  		} else if (ret == (int)data->submit) {
>>  			data->submit = 0;
>>
> 
> 
> diff --git a/src/queue.c b/src/queue.c
> index 94f791e..7d6f31d 100644
> --- a/src/queue.c
> +++ b/src/queue.c
> @@ -112,11 +112,15 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
>  			flags = IORING_ENTER_GETEVENTS | data->get_flags;
>  		if (data->submit)
>  			sq_ring_needs_enter(ring, &flags);
> -		if (data->wait_nr > nr_available || data->submit ||
> -		    cq_overflow_flush)
> -			ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
> -					data->wait_nr, flags, data->arg,
> -					data->sz);
> +
> +		if (data->wait_nr <= nr_available && !data->submit &&
> +		    !cq_overflow_flush) {
> +			err = ?;

which I guess is the actual error missing from here?

> +			break;
> +		}
> +		ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
> +				data->wait_nr, flags, data->arg,
> +				data->sz);
>  		if (ret < 0) {
>  			err = -errno;
>  		} else if (ret == (int)data->submit) {
> 


-- 
Jens Axboe


  reply	other threads:[~2021-02-05 22:02 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-02  5:36 bug with fastpoll accept and sqpoll + IOSQE_FIXED_FILE Victor Stewart
2021-02-02 11:05 ` Pavel Begunkov
2021-02-02 11:23   ` Pavel Begunkov
2021-02-02 16:18     ` Victor Stewart
2021-02-02 16:30       ` Victor Stewart
2021-02-02 16:44         ` Jens Axboe
2021-02-02 17:10           ` Victor Stewart
2021-02-02 17:24             ` Jens Axboe
2021-02-02 17:41               ` Pavel Begunkov
2021-02-02 20:34                 ` Pavel Begunkov
2021-02-02 20:48                   ` Jens Axboe
2021-02-02 20:56                     ` Pavel Begunkov
2021-02-03 11:49                       ` Pavel Begunkov
2021-02-04 16:50                         ` Jens Axboe
2021-02-05 12:46                           ` Pavel Begunkov
2021-02-05 14:42                             ` Jens Axboe [this message]
2021-02-05 14:49                               ` Pavel Begunkov
2021-02-02 17:46               ` Jens Axboe
2021-02-02 17:50                 ` Victor Stewart
2021-02-02 17:57                   ` Jens Axboe
2021-02-02 17:46               ` Victor Stewart
2021-02-02 17:21       ` Pavel Begunkov
2021-02-02 17:30         ` Victor Stewart
2021-02-02 17:45           ` Victor Stewart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8349a07b-6975-dc55-dc0a-a4228f913af3@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=asml.silence@gmail.com \
    --cc=io-uring@vger.kernel.org \
    --cc=v@nametag.social \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.