linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Stefan Bühler" <source@stbuehler.de>
To: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: io_uring memory ordering (and broken queue-is-full checks)
Date: Fri, 19 Apr 2019 11:29:32 +0200	[thread overview]
Message-ID: <54496e17-97de-9f9a-9972-c448226bb768@stbuehler.de> (raw)
In-Reply-To: <03cdf146-12a7-95de-eb78-31cd7df4bb44@kernel.dk>

Hi,

On 17.04.19 17:02, Jens Axboe wrote:
> [...]
>
> Care to turn the pseudo code into a real patch? Would be easier to
> digest. But thanks for taking a look at this!

Fair enough; I'm working on it.

I'll start with the actual bugs, and will try to cleanup unneeded
barriers/docs later.

>> The "userspace" liburing looks even worse.  For starters,
>> `io_uring_get_completion` cannot safely return a pointer to the CQ entry
>> *AND* increment head: you need to actually read the entry before
>> incrementing head.
> 
> That's a good point, we should actually have this split in two to avoid
> the case where the kernel will then fill in a new entry for us.

I see you already did so in the liburing repo.

When going through liburing code again, I noticed io_uring_submit
assumes there might be pending submission entries in the SQ queue.

But those entries are not "reserved" in the SQE queue; so
io_uring_get_sqe might overwrite SQE data that isn't read by the kernel
through SQ yet.

Is there any motivation behind the indirection?  I see two possible ideas:
- allocate fixed SQE entries for operations, and just add them to SQ
  when needed
- have multiple threads fill SQE in parallel, and only add them to SQ
  when done

Are those actually worth the cost?

liburing doesn't look like it is going to take advantage of it, and the
library I'm writing in Rust right now isn't going to either (actually
even using a fixed sq_ndx == sqe_ndx mapping, i.e. it'll only write
sq->array on startup).

At least consider documenting the motivation :)

>> Last but not least the kernel side has two lines it checks whether a
>> queue is full or not and uses `tail + 1 == head`: this is only true if
>> there were U32_MAX entries in the queue.  You should check `(tail -
>> head) == SIZE` instead.
> 
> Good catch, this is a leftover from when the rings were indeed capped by
> the mask of the entries. I've fixed this one and pushed it out, and
> added a liburing test case for it as well.

I think you missed the second one in io_uring_poll (looking at your
for-linus branch), so I will send a patch for that too.

cheers,
Stefan

  reply	other threads:[~2019-04-19 18:47 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-14 16:44 io_uring memory ordering (and broken queue-is-full checks) Stefan Bühler
2019-04-17 15:02 ` Jens Axboe
2019-04-19  9:29   ` Stefan Bühler [this message]
2019-04-19  9:57     ` [PATCH v1 1/3] io_uring: fix race condition reading SQ entries Stefan Bühler
2019-04-19  9:57       ` [PATCH v1 2/3] io_uring: fix race condition when sq threads goes sleeping Stefan Bühler
2019-04-19  9:57       ` [PATCH v1 3/3] io_uring: fix poll full SQ detection Stefan Bühler
2019-04-22 17:04     ` io_uring memory ordering (and broken queue-is-full checks) Jens Axboe
2019-04-24 21:54     ` [PATCH barrier cleanup v1 1/7] io_uring: fix notes on barriers Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 2/7] io_uring: remove unnecessary barrier before wq_has_sleeper Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 3/7] io_uring: remove unnecessary barrier before reading cq head Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 4/7] io_uring: remove unnecessary barrier after updating SQ head Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 5/7] io_uring: remove unnecessary barrier before reading SQ tail Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 6/7] io_uring: remove unnecessary barrier after incrementing dropped counter Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 7/7] io_uring: remove unnecessary barrier after unsetting IORING_SQ_NEED_WAKEUP Stefan Bühler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54496e17-97de-9f9a-9972-c448226bb768@stbuehler.de \
    --to=source@stbuehler.de \
    --cc=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).