All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavel Begunkov <asml.silence@gmail.com>
To: Olivier Langlois <olivier@trillion01.com>,
	io-uring <io-uring@vger.kernel.org>
Subject: Re: IORING_OP_POLL_ADD/IORING_OP_POLL_REMOVE questions
Date: Thu, 6 May 2021 18:09:33 +0100	[thread overview]
Message-ID: <44491721-56a0-3188-a1ba-5a0920881ac2@gmail.com> (raw)
In-Reply-To: <0a12170604cfcbce61259661f579fe5640cc7ffb.camel@trillion01.com>

On 5/6/21 4:17 AM, Olivier Langlois wrote:
> Hi Pavel,
> 
> On Wed, 2021-05-05 at 18:56 +0100, Pavel Begunkov wrote:
>> On 5/4/21 7:06 PM, Olivier Langlois wrote:
>>>
>>>
>>> 2. I don't understand what I am looking at. Why am I receiving a
>>> completion notification for a POLL request that has just been
>>> cancelled? What is the logic behind silently discarding a
>>> IORING_OP_POLL_ADD sqe meant to replace an existing one?
>>
>> I'm lost in your message, so let's start with simple reasons. All
>> requests post one CQE (almost true), including poll_remove requests.
>>
>> io_uring_prep_poll_remove(sqe, iouring_build_user_data(IOURING_POLL,
>> fd, anfds [fd].egen));
>> // io_uring_sqe_set_data(sqe, iouring_build_user_data(IOURING_POLL, fd,
>> anfds [fd].egen));
>>
>> If poll remove and poll requests have identical user_data, as in
>> the second (commented?) line you'll get two CQEs with that user_data.
>>
>> Did you check return value (in CQE) of poll remove? I'd recommend
>> set its user_data to something unique. Did you consider that it
>> may fail?
> 
> Your comments does help me to see clearer!
> 
> You are correct that setting the poll remove user_data is not done.
> (hence the commented out statement for that purpose but I must have
> entertain the idea to set it at some point to see what good it would
> do)
> 
> The reason being that I do not care about whether or not it succeeds
> because the very next thing that I do is to rearm the poll for the same
> fd with a different event mask.
> 
> Beside, the removed poll original sqe is reported back as ECANCELED (-
> 125):
> 85 gen 1 res -125

That's why I mentioned setting user_data, so can distinguish cqes.

> This appear to be the code returned in io_poll_remove_one()
> 
> Does the poll remove operation generates 2 cqes?
> 1 for the canceled sqe and and 1 for the poll remove sqe itself?

No, only one.

> 
> I am not too sure what good setting a distinct user_data to the poll
> remove sqe could do but I will definitely give it a shot to perhaps see
> clearer.

again to be able to distinguish cqes, at least for debugging,
but I don't see how it can be not racy without it.

> Note that the poll remove sqe and the following poll add sqe don't have
> exactly the same user_data.

Right, I noticed. Was concerned about gen1 poll and its poll
remove.

 
> I have this statement in between:
> /* increment generation counter to avoid handling old events */
>           ++anfds [fd].egen;
> 
> poll remove cancel the previous poll add having gen 1 in its user data.
> the next poll add has it user_data storing gen var set to 2:
> 
> 1 3 remove 85 1
> 1 3 add 85 2
> 
> 85 gen 1 res -125
> 85 gen 1 res 4
> 
> I'll try to be more succinct this time.
> 
> If the poll add sqe having gen 1 in its user_data is cancelled, how can
> its completion can be reported in the very next cqe?
> 
> and I never hear back about the poll add gen 2 sqe...

This one sounds like that "85 gen 1 res 4"
is actually gen2 but with screwed user_data. I'd rather
double check that you set it right, and don't race
with multiple threads.

FWIW, submission queue filling is not synchronised by
liburing, users should do that.

> 
> I'll try to get more familiar with the fs/io_uring.c code but it feels
> like it could be some optimization done where because the cancelled
> poll result is available while inside io_uring_enter(), instead of
> discarding it to immediately rearm it for the new poll add request,
> io_uring_enter() instead decide to return it back to reply to the gen 2
> request but it forgets to update the user_data field before doing so...

There definitely may be a bug, but it's much more likely
lurking in your code.

> Maybe what is confusing is that the heading "1 3" in my traces is the
> EV_READ EV_WRITE bitmask which values are:
> 
> EV_READ  = 1
> EV_WRITE = 2
> 
> while
> 
> POLLIN  = 1
> POLLOUT = 4
> 
> So my poll add gen 1 request was for be notified for POLLIN. This is
> what I got with the question #1 "195" result.
> 
> Therefore the:
> 85 gen 1 res 4
> 
> can only be for my poll add gen 2 requesting for POLLIN|POLLOUT. Yet,
> it is reported with the previous request user_data...
> 
> I feel the mystery is almost solved with your help... I'll continue my
> investigation and I'll report back if I finally solve the mystery.
>>  
>>> 3. As I am writing this email, I have just noticed that it was
>>> possible
>>> to update an existing POLL entry with IORING_OP_POLL_REMOVE through
>>> io_uring_prep_poll_update(). Is this what I should do to eliminate my
>>> problems? What are the possible race conditions scenarios that I
>>> should
>>> be prepared to handle by using io_uring_prep_poll_update() (ie:
>>> completion of the poll entry to update while my process is inside
>>> io_uring_enter() to update it...)?
>>
>> Update is preferable, but it's Linux kernel 5.13.
>> Both remove and update may fail. e.g. with -EALREADY
>>
> I am just about to install 5.12 on my system and this and the new
> multishot poll feature makes me already crave 5.13!

-- 
Pavel Begunkov

      parent reply	other threads:[~2021-05-06 17:09 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-04 18:06 IORING_OP_POLL_ADD/IORING_OP_POLL_REMOVE questions Olivier Langlois
2021-05-05 17:20 ` Olivier Langlois
2021-05-05 17:56 ` Pavel Begunkov
2021-05-06  3:17   ` Olivier Langlois
2021-05-06  8:42     ` Olivier Langlois
2021-05-06 15:46       ` Olivier Langlois
2021-05-06 16:59         ` Pavel Begunkov
2021-05-06 19:32         ` Olivier Langlois
2021-05-06 17:09     ` Pavel Begunkov [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=44491721-56a0-3188-a1ba-5a0920881ac2@gmail.com \
    --to=asml.silence@gmail.com \
    --cc=io-uring@vger.kernel.org \
    --cc=olivier@trillion01.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.