IO-Uring Archive on lore.kernel.org
From: Glauber Costa <glauber@scylladb.com>
To: Avi Kivity <avi@scylladb.com>
Cc: Pavel Begunkov <asml.silence@gmail.com>,
	io-uring@vger.kernel.org, Jens Axboe <axboe@kernel.dk>
Subject: Re: shutdown not affecting connection?
Date: Sat, 8 Feb 2020 13:57:38 -0500
Message-ID: <CAD-J=zZm2B8-EXiX8j2AT5Q0zTCi5rB1gQzzOaYi3JoO1jcqOw@mail.gmail.com>
In-Reply-To: <9ec6cbf7-0f0b-f777-8507-199e8837df94@scylladb.com>

On Sat, Feb 8, 2020 at 1:48 PM Avi Kivity <avi@scylladb.com> wrote:
>
> On 2/8/20 8:42 PM, Glauber Costa wrote:
> > Hi
> >
> > BTW, my apologies but I should have specified the kernel I am running:
> > 90206ac99c1f25b7f7a4c2c40a0b9d4561ffa9bf
> >
> > On Sat, Feb 8, 2020 at 9:26 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
> >> Hi
> >>
> >> On 2/8/2020 4:55 PM, Glauber Costa wrote:
> >>> Hi
> >>>
> >>> I've been trying to make sense of some weird behavior with the seastar
> >>> implementation of io_uring, and started to suspect a bug in io_uring's
> >>> connect.
> >>>
> >>> The situation is as follows:
> >>>
> >>> - A connect() call is issued (and in the backend I can choose if I use
> >>> uring or not)
> >>> - The connection is supposed to take a while to establish.
> >>> - I call shutdown on the file descriptor
> >>>
> >>> If io_uring is not used:
> >>> - connect() starts by  returning EINPROGRESS as expected, and after
> >>> the shutdown the file descriptor is finally made ready for epoll. I
> >>> call getsockopt(SOL_SOCKET, SO_ERROR), and see the error (104)
> >>>
> >>> if io_uring is used:
> >>> - if the SQE has the IOSQE_ASYNC flag on, connect() never returns.
> >>> - if the SQE *does not* have the IOSQE_ASYNC flag on, then most of the
> >>> time the test works as intended and connect() returns 104, but
> >>> occasionally it hangs too. Note that, seastar may choose not to call
> >>> io_uring_enter immediately and batch sqes.
> >>>
> >>> Sounds like some kind of race?
> >>>
> >>> I know C++ probably stinks like the devil for you guys, but if you are
> >>> curious to see the code, this fails one of our unit tests:
> >>>
> >>> https://github.com/scylladb/seastar/blob/master/tests/unit/connect_test.cc
> >>> See test_connection_attempt_is_shutdown
> >>> (above is the master seastar tree, not including the io_uring implementation)
> >>>
> >> Is this chaining with connect().then_wrapped() asynchronous? Like kind
> >> of future/promise stuff?
> > Correct.
> > then_wrapped executes eventually when connect returns either success or failure
> >
> >> I wonder, if connect() and shutdown() there may
> >> be executed in the reverse order.
> > The methods connect and shutdown will execute in this order.
> > But connect will just queue something that will later be sent down to
> > the kernel.
> >
> > I initially suspected an ordering issue on my side. What made me start
> > suspecting a bug
> > are two reasons:
> > - I can force the code to grab an sqe and call io_uring_enter at the
> > moment the connect()
> > call happens : I see no change.
> > - that IOSQE_ASYNC changes this behavior, as you acknowledged yourself.
> >
> > It seems to me that if shutdown happens when the sqe is sitting on a
> > kernel queue somewhere
> > the connection will hang forever instead of failing right away as I would expect
> > - if shutdown happens after the call to io_uring_enter
>
>
>
> You can try to cancel the sqe before you shutdown the socket. This will
> flush the queue (even if the cancellation fails).
>
>
> However, if you io_uring_enter before calling shutdown and connect does
> not return, I'd consider that a kernel bug.

That is very definitely what happens: I've changed the code to submit
synchronously (with our flush() implementation, which loops until an
sqe can be acquired and io_uring_enter returns success), so at this
point I am sure this is in the kernel.




> Perhaps you can reduce the
> problem to a small C reproducer?
>
That was my intended next step, yes
>
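
For reference, here is a minimal sketch of the non-io_uring baseline described earlier in the thread: a nonblocking connect() that returns EINPROGRESS, a shutdown() on the same file descriptor, an epoll wait, and a getsockopt(SOL_SOCKET, SO_ERROR) read. It is not the reproducer discussed above and does not touch io_uring. The function names pending_socket_error and connect_then_shutdown, the timeout parameter, and the choice to return negative errno values are assumptions made for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: read (and clear) the pending error on a socket.
 * Returns the SO_ERROR value (0 means none), or -errno if getsockopt fails. */
static int pending_socket_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -errno;
    return err;
}

/* Hypothetical baseline: nonblocking connect(), then shutdown(), then wait
 * in epoll and report SO_ERROR. Returns the pending socket error (>= 0),
 * or a negative errno if a step failed or the connect did not stay in
 * progress (-EISCONN if it completed immediately, -ETIMEDOUT if epoll
 * reported nothing within timeout_ms). */
static int connect_then_shutdown(const char *ip, uint16_t port, int timeout_ms)
{
    struct sockaddr_in addr;
    struct epoll_event ev;
    int fd, epfd = -1, ret;

    assert(ip != NULL);
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -EINVAL;

    fd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    if (fd < 0)
        return -errno;

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
        ret = -EISCONN;           /* connected at once: nothing to observe */
        goto out;
    }
    if (errno != EINPROGRESS) {
        ret = -errno;             /* failed at once, e.g. no route */
        goto out;
    }

    /* Register with epoll first, as an event loop would have done. */
    epfd = epoll_create1(0);
    if (epfd < 0) {
        ret = -errno;
        goto out;
    }
    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLOUT;
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
        ret = -errno;
        goto out;
    }

    /* Shut the socket down while the connection attempt is pending. */
    if (shutdown(fd, SHUT_RDWR) < 0) {
        ret = -errno;
        goto out;
    }

    ret = epoll_wait(epfd, &ev, 1, timeout_ms);
    if (ret < 0)
        ret = -errno;
    else if (ret == 0)
        ret = -ETIMEDOUT;
    else
        ret = pending_socket_error(fd);

out:
    if (epfd >= 0)
        close(epfd);
    close(fd);
    return ret;
}
```

Called against an address where the handshake does not complete quickly (which address that is depends on the local network), this is the path on which the thread reports error 104, which is ECONNRESET on Linux. An io_uring variant would submit the connect as an SQE instead of calling connect() directly; that part is left out here because it is the behavior under investigation.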


Thread overview: 10+ messages
2020-02-08 13:55 Glauber Costa
2020-02-08 14:26 ` Pavel Begunkov
2020-02-08 18:42   ` Glauber Costa
2020-02-08 18:48     ` Avi Kivity
2020-02-08 18:57       ` Glauber Costa [this message]
2020-02-08 20:20         ` Glauber Costa
2020-02-08 20:28           ` Avi Kivity
2020-02-08 20:43             ` Glauber Costa
2020-02-08 18:48 ` Andres Freund
2020-02-08 18:54   ` Glauber Costa
