linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: "Stefan Bühler" <source@stbuehler.de>,
	linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: io_uring memory ordering (and broken queue-is-full checks)
Date: Mon, 22 Apr 2019 11:04:49 -0600	[thread overview]
Message-ID: <91366210-d56c-acfe-589e-8765cb8475f7@kernel.dk> (raw)
In-Reply-To: <54496e17-97de-9f9a-9972-c448226bb768@stbuehler.de>

On 4/19/19 3:29 AM, Stefan Bühler wrote:
> Hi,
> 
> On 17.04.19 17:02, Jens Axboe wrote:
>> [...]
>>
>> Care to turn the pseudo code into a real patch? Would be easier to
>> digest. But thanks for taking a look at this!
> 
> Fair enough; I'm working on it.
> 
> I'll start with the actual bugs, and will try to cleanup unneeded
> barriers/docs later.

That sounds great, thanks. The three patches look good to me, I've
queued them up.

>>> The "userspace" liburing looks even worse.  For starters,
>>> `io_uring_get_completion` cannot safely return a pointer to the CQ entry
>>> *AND* increment head: you need to actually read the entry before
>>> incrementing head.
>>
>> That's a good point, we should actually have this split in two to avoid
>> the case where the kernel will then fill in a new entry for us.
> 
> I see you already did so in the liburing repo.

Yep

> When going through liburing code again, I noticed io_uring_submit
> assumes there might be pending submission entries in the SQ queue.
> 
> But those entries are not "reserved" in the SQE queue; so
> io_uring_get_sqe might overwrite SQE data that isn't read by the kernel
> through SQ yet.
> 
> Is there any motivation behind the indirection?  I see two possible ideas:
> - allocate fixed SQE entries for operations, and just add them to SQ
>   when needed
> - have multiple threads fill SQE in parallel, and only add them to SQ
>   when done
> 
> Are those actually worth the cost?
> 
> liburing doesn't look like it is going to take advantage of it, and the
> library I'm writing in Rust right now isn't going to either (actually
> even using a fixed sq_ndx == sqe_ndx mapping, i.e. it'll only write
> sq->array on startup).
> 
> At least consider documenting the motivation :)

I'll take a look at this case and see if the complexity is warranted,
or at least make sure that it's exposed in a safe fashion.

>>> Last but not least the kernel side has two lines it checks whether a
>>> queue is full or not and uses `tail + 1 == head`: this is only true if
>>> there were U32_MAX entries in the queue.  You should check `(tail -
>>> head) == SIZE` instead.
>>
>> Good catch, this is a leftover from when the rings were indeed capped by
>> the mask of the entries. I've fixed this one and pushed it out, and
>> added a liburing test case for it as well.
> 
> I think you missed the second one in io_uring_poll (looking at your
> for-linus branch), so I will send a patch for that too.

I did indeed, my grep failed me. Thanks for catching that!

-- 
Jens Axboe


  parent reply	other threads:[~2019-04-22 17:04 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-14 16:44 io_uring memory ordering (and broken queue-is-full checks) Stefan Bühler
2019-04-17 15:02 ` Jens Axboe
2019-04-19  9:29   ` Stefan Bühler
2019-04-19  9:57     ` [PATCH v1 1/3] io_uring: fix race condition reading SQ entries Stefan Bühler
2019-04-19  9:57       ` [PATCH v1 2/3] io_uring: fix race condition when sq threads goes sleeping Stefan Bühler
2019-04-19  9:57       ` [PATCH v1 3/3] io_uring: fix poll full SQ detection Stefan Bühler
2019-04-22 17:04     ` Jens Axboe [this message]
2019-04-24 21:54     ` [PATCH barrier cleanup v1 1/7] io_uring: fix notes on barriers Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 2/7] io_uring: remove unnecessary barrier before wq_has_sleeper Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 3/7] io_uring: remove unnecessary barrier before reading cq head Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 4/7] io_uring: remove unnecessary barrier after updating SQ head Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 5/7] io_uring: remove unnecessary barrier before reading SQ tail Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 6/7] io_uring: remove unnecessary barrier after incrementing dropped counter Stefan Bühler
2019-04-24 21:54       ` [PATCH barrier cleanup v1 7/7] io_uring: remove unnecessary barrier after unsetting IORING_SQ_NEED_WAKEUP Stefan Bühler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=91366210-d56c-acfe-589e-8765cb8475f7@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=source@stbuehler.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).