io-uring.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Pavel Begunkov <asml.silence@gmail.com>,
	Victor Stewart <v@nametag.social>
Cc: io-uring <io-uring@vger.kernel.org>
Subject: Re: bug with fastpoll accept and sqpoll + IOSQE_FIXED_FILE
Date: Fri, 5 Feb 2021 07:42:20 -0700	[thread overview]
Message-ID: <8349a07b-6975-dc55-dc0a-a4228f913af3@kernel.dk> (raw)
In-Reply-To: <c9550dcf-ce53-c214-8c4b-6165ad6605a9@gmail.com>

On 2/5/21 5:46 AM, Pavel Begunkov wrote:
> On 04/02/2021 16:50, Jens Axboe wrote:
>> On 2/3/21 4:49 AM, Pavel Begunkov wrote:
>>> On 02/02/2021 20:56, Pavel Begunkov wrote:
>>>> On 02/02/2021 20:48, Jens Axboe wrote:
>>>>> On 2/2/21 1:34 PM, Pavel Begunkov wrote:
>>>>>> On 02/02/2021 17:41, Pavel Begunkov wrote:
>>>>>>> On 02/02/2021 17:24, Jens Axboe wrote:
>>>>>>>> On 2/2/21 10:10 AM, Victor Stewart wrote:
>>>>>>>>>> Can you send the updated test app?
>>>>>>>>>
>>>>>>>>> https://gist.github.com/victorstewart/98814b65ed702c33480487c05b40eb56
>>>>>>>>>
>>>>>>>>> same link i just updated the same gist
>>>>>>>>
>>>>>>>> And how are you running it?
>>>>>>>
>>>>>>> with SQPOLL    with    FIXED FLAG -> FAILURE: failed with error = ???
>>>>>>> 	-> io_uring_wait_cqe_timeout() strangely returns -1, (-EPERM??)
>>>>>>
>>>>>> Ok, _io_uring_get_cqe() is just screwed twice
>>>>>>
>>>>>> TL;DR
>>>>>> we enter into it with submit=0, do an iteration, which decrements it,
>>>>>> then a second iteration passes submit=-1, which is returned back by
>>>>>> the kernel as a result and propagated back from liburing...
>>>>>
>>>>> Yep, that's what I came up with too. We really just need a clear way
>>>>> of knowing when to break out, and when to keep going. Eg if we've
>>>>> done a loop and don't end up calling the system call, then there's
>>>>> no point in continuing.
>>>>
>>>> We can bodge something up (and forget about it), and do much cleaner
>>>> for IORING_FEAT_EXT_ARG, because we don't have LIBURING_UDATA_TIMEOUT
>>>> reqs for it and so can remove peek and so on.
>>>
>>> This version looks reasonably simple, and even passes tests and all
>>> issues found by Victor's test. Didn't test it yet, but should behave
>>> similarly in regard of internal timeouts (pre IORING_FEAT_EXT_ARG).
>>>
>>> static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr,
>>> 			     struct get_data *data)
>>> {
>>> 	struct io_uring_cqe *cqe = NULL;
>>> 	int ret = 0, err;
>>>
>>> 	do {
>>> 		unsigned flags = 0;
>>> 		unsigned nr_available;
>>> 		bool enter = false;
>>>
>>> 		err = __io_uring_peek_cqe(ring, &cqe, &nr_available);
>>> 		if (err)
>>> 			break;
>>>
>>> 		/* IOPOLL won't proceed when there're not reaped CQEs */
>>> 		if (cqe && (ring->flags & IORING_SETUP_IOPOLL))
>>> 			data->wait_nr = 0;
>>>
>>> 		if (data->wait_nr > nr_available || cq_ring_needs_flush(ring)) {
>>> 			flags = IORING_ENTER_GETEVENTS | data->get_flags;
>>> 			enter = true;
>>> 		}
>>> 		if (data->submit) {
>>> 			sq_ring_needs_enter(ring, &flags);
>>> 			enter = true;
>>> 		}
>>> 		if (!enter)
>>> 			break;
>>>
>>> 		ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
>>> 					    data->wait_nr, flags, data->arg,
>>> 					    data->sz);
>>> 		if (ret < 0) {
>>> 			err = -errno;
>>> 			break;
>>> 		}
>>> 		data->submit -= ret;
>>> 	} while (1);
>>>
>>> 	*cqe_ptr = cqe;
>>> 	return err;
>>> }
>>
>> So here's my take on this - any rewrite of _io_uring_get_cqe() is going
>> to end up adding special cases, that's unfortunately just the nature of
>> the game. And since we're going to be doing a new liburing release very
>> shortly, this isn't a great time to add a rewrite of it. It'll certainly
>> introduce more bugs than it solves, and hence regressions, no matter how
>> careful we are.
>>
>> Hence my suggestion is to just patch this in a trivial kind of fashion,
>> even if it doesn't really make the function any prettier. But it'll be
>> safer for a release, and then we can rework the function after.
>>
>> With that in mind, here's my suggestion. The premise is if we go through
>> the loop and don't do io_uring_enter(), then there's no point in
>> continuing. That's the trivial fix.
> 
> Your idea but imho cleaner below.
> +1 comment inline

Shouldn't be hard, it was just a quick hack :-)

>> diff --git a/src/queue.c b/src/queue.c
>> index 94f791e..4161aa7 100644
>> --- a/src/queue.c
>> +++ b/src/queue.c
>> @@ -89,12 +89,13 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
>>  {
>>  	struct io_uring_cqe *cqe = NULL;
>>  	const int to_wait = data->wait_nr;
>> -	int ret = 0, err;
>> +	int err;
>>  
>>  	do {
>>  		bool cq_overflow_flush = false;
>>  		unsigned flags = 0;
>>  		unsigned nr_available;
>> +		int ret = -2;
>>  
>>  		err = __io_uring_peek_cqe(ring, &cqe, &nr_available);
>>  		if (err)
>> @@ -117,7 +118,9 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
>>  			ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
>>  					data->wait_nr, flags, data->arg,
>>  					data->sz);
>> -		if (ret < 0) {
>> +		if (ret == -2) {
>> +			break;
> 
> peek/wait_cqe expect that cqe_ptr is filled on return=0. Looks we need
> to return an error or hack up those functions.

Right good point, we'd need -EAGAIN.

>> +		} else if (ret < 0) {
>>  			err = -errno;
>>  		} else if (ret == (int)data->submit) {
>>  			data->submit = 0;
>>
> 
> 
> diff --git a/src/queue.c b/src/queue.c
> index 94f791e..7d6f31d 100644
> --- a/src/queue.c
> +++ b/src/queue.c
> @@ -112,11 +112,15 @@ static int _io_uring_get_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_pt
>  			flags = IORING_ENTER_GETEVENTS | data->get_flags;
>  		if (data->submit)
>  			sq_ring_needs_enter(ring, &flags);
> -		if (data->wait_nr > nr_available || data->submit ||
> -		    cq_overflow_flush)
> -			ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
> -					data->wait_nr, flags, data->arg,
> -					data->sz);
> +
> +		if (data->wait_nr <= nr_available && !data->submit &&
> +		    !cq_overflow_flush) {
> +			err = ?;

which I guess is the actual error missing from here?

> +			break;
> +		}
> +		ret = __sys_io_uring_enter2(ring->ring_fd, data->submit,
> +				data->wait_nr, flags, data->arg,
> +				data->sz);
>  		if (ret < 0) {
>  			err = -errno;
>  		} else if (ret == (int)data->submit) {
> 


-- 
Jens Axboe


  reply	other threads:[~2021-02-05 22:02 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-02  5:36 bug with fastpoll accept and sqpoll + IOSQE_FIXED_FILE Victor Stewart
2021-02-02 11:05 ` Pavel Begunkov
2021-02-02 11:23   ` Pavel Begunkov
2021-02-02 16:18     ` Victor Stewart
2021-02-02 16:30       ` Victor Stewart
2021-02-02 16:44         ` Jens Axboe
2021-02-02 17:10           ` Victor Stewart
2021-02-02 17:24             ` Jens Axboe
2021-02-02 17:41               ` Pavel Begunkov
2021-02-02 20:34                 ` Pavel Begunkov
2021-02-02 20:48                   ` Jens Axboe
2021-02-02 20:56                     ` Pavel Begunkov
2021-02-03 11:49                       ` Pavel Begunkov
2021-02-04 16:50                         ` Jens Axboe
2021-02-05 12:46                           ` Pavel Begunkov
2021-02-05 14:42                             ` Jens Axboe [this message]
2021-02-05 14:49                               ` Pavel Begunkov
2021-02-02 17:46               ` Jens Axboe
2021-02-02 17:50                 ` Victor Stewart
2021-02-02 17:57                   ` Jens Axboe
2021-02-02 17:46               ` Victor Stewart
2021-02-02 17:21       ` Pavel Begunkov
2021-02-02 17:30         ` Victor Stewart
2021-02-02 17:45           ` Victor Stewart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8349a07b-6975-dc55-dc0a-a4228f913af3@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=asml.silence@gmail.com \
    --cc=io-uring@vger.kernel.org \
    --cc=v@nametag.social \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).