All of lore.kernel.org
 help / color / mirror / Atom feed
From: Olivier Langlois <olivier@trillion01.com>
To: Pavel Begunkov <asml.silence@gmail.com>,
	io-uring <io-uring@vger.kernel.org>
Subject: Re: IORING_OP_POLL_ADD/IORING_OP_POLL_REMOVE questions
Date: Thu, 06 May 2021 11:46:44 -0400	[thread overview]
Message-ID: <2a4437a06ad910142d27d22d9e17a843dbd6d6fc.camel@trillion01.com> (raw)
In-Reply-To: <7fa90154d1fbe1969383261539b7fbee20caad43.camel@trillion01.com>

On Thu, 2021-05-06 at 04:42 -0400, Olivier Langlois wrote:
> On Wed, 2021-05-05 at 23:17 -0400, Olivier Langlois wrote:
> > Note that the poll remove sqe and the following poll add sqe don't
> > have
> > exactly the same user_data.
> > 
> > I have this statement in between:
> > /* increment generation counter to avoid handling old events */
> >           ++anfds [fd].egen;
> > 
> > poll remove cancel the previous poll add having gen 1 in its user
> > data.
> > the next poll add has it user_data storing gen var set to 2:
> > 
> > 1 3 remove 85 1
> > 1 3 add 85 2
> > 
> > 85 gen 1 res -125
> > 85 gen 1 res 4
> > 
> Good news!
> 
> I have used the io_uring tracepoints and they confirm that io_uring
> works as expected:
> 
> For the above printfs, I get the following perf traces:
> 
>  11940.259 Execution SVC/134675 io_uring:io_uring_submit_sqe(ctx:
> 0xffff9d3c4368c000, opcode: 7, force_nonblock: 1)
>  11940.270 Execution SVC/134675 io_uring:io_uring_complete(ctx:
> 0xffff9d3c4368c000, user_data: 4294967382, res: -125)
>  11940.272 Execution SVC/134675 io_uring:io_uring_complete(ctx:
> 0xffff9d3c4368c000)
>  11940.275 Execution SVC/134675 io_uring:io_uring_file_get(ctx:
> 0xffff9d3c4368c000, fd: 86)
>  11940.277 Execution SVC/134675 io_uring:io_uring_submit_sqe(ctx:
> 0xffff9d3c4368c000, opcode: 6, user_data: 4294967382, force_nonblock:
> 1)
>  11940.279 Execution SVC/134675 io_uring:io_uring_complete(ctx:
> 0xffff9d3c4368c000, user_data: 4294967382, res: 4)
> 
> So, it seems the compiler is playing games on me. It prints the correct
> gen 2 value but is passing 1 to io_uring_sqe_set_data()...
> 
> I'll try to turn optimization off to see if it helps.
> 
> thx a lot again for your exceptional work!
> 
> 
The more I fool around trying to find the problem, the more I think
that my problem is somewhere in the liburing (v2.0) code because of a
possibly missing memory barrier.

The function that I do use to submit the sqes is
io_uring_wait_cqe_timeout().

My problem did appear when I did replace libev original boilerplate
code for liburing (v2.0) used for filling and submitting the sqes.

Do you remember when you pointed out that I wasn't setting the
user_data field for my poll remove request in:

io_uring_prep_poll_remove(sqe,
iouring_build_user_data(IOURING_POLL, fd, anfds [fd].egen));
//          io_uring_sqe_set_data(sqe,
iouring_build_user_data(IOURING_POLL, fd, anfds [fd].egen));

?

The last call to remove the non-existant 'add 85 2' is replied by
ENOENT by io_uring and this was caught by an assert in my case
IOURING_POLL cqe handler.

  iouring_decode_user_data(cqe->user_data, &type, &fd, &gen);

  switch (type) {

This is impossible to end up there because if you do not explicitly set
user_data, io_uring_prep_rw() is setting it to 0.

In order for my assert to be hit, user_data would have to be set with
the commented out io_uring_sqe_set_data(), and it happens to also be
the value of the previously sent sqe user_data...

I did replace the code above with:

io_uring_prep_poll_remove(sqe,
iouring_build_user_data(IOURING_POLL_ADD, fd, anfds [fd].egen));
io_uring_sqe_set_data(sqe, iouring_build_user_data(IOURING_POLL_REMOVE,
fd, anfds [fd].egen));

and my assert for cqe->res != -ENOENT has stopped being hit.

There is clearly something messing with the sqe user_data communication
between user and kernel and I start to suspect that this might be
somewhere inside io_uring_wait_cqe_timeout()...



  reply	other threads:[~2021-05-06 15:46 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-04 18:06 IORING_OP_POLL_ADD/IORING_OP_POLL_REMOVE questions Olivier Langlois
2021-05-05 17:20 ` Olivier Langlois
2021-05-05 17:56 ` Pavel Begunkov
2021-05-06  3:17   ` Olivier Langlois
2021-05-06  8:42     ` Olivier Langlois
2021-05-06 15:46       ` Olivier Langlois [this message]
2021-05-06 16:59         ` Pavel Begunkov
2021-05-06 19:32         ` Olivier Langlois
2021-05-06 17:09     ` Pavel Begunkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2a4437a06ad910142d27d22d9e17a843dbd6d6fc.camel@trillion01.com \
    --to=olivier@trillion01.com \
    --cc=asml.silence@gmail.com \
    --cc=io-uring@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.