All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: yangerkun <yangerkun@huawei.com>, Bob Liu <bob.liu@oracle.com>,
	linux-block@vger.kernel.org
Cc: houtao1@huawei.com, yi.zhang@huawei.com
Subject: Re: [RFC] io_uring: stop only support read/write for ctx with IORING_SETUP_IOPOLL
Date: Mon, 4 Nov 2019 07:01:37 -0700	[thread overview]
Message-ID: <50de6a43-fc11-aada-40e6-f3fee6523d49@kernel.dk> (raw)
In-Reply-To: <3fd0dee1-52d6-4ea8-53d8-2c88b7fedce6@huawei.com>

On 11/4/19 4:46 AM, yangerkun wrote:
> 
> 
> On 2019/11/4 18:09, Bob Liu wrote:
>> On 11/4/19 4:56 PM, yangerkun wrote:
>>> There is no problem to support other type request for the ctx with
>>> IORING_SETUP_IOPOLL.
>>
>> Could you describe the benefit of doing this?
> 
> Hi,
> 
> I am trying to replace libaio with io_uring in InnoDB/MariaDB(which
> build on xfs/nvme). And in order to simulate the timeout mechanism
> like io_getevents, firstly, to use the poll function of io_uring's fd
> has been selected and it really did work. But while trying to enable
> IORING_SETUP_IOPOLL since xfs has iopoll function interface, the
> mechanism will fail since io_uring_poll does check the cq.head between
> cached_cq_tail, which will not update until we call io_uring_enter and
> do the poll. So, instead, I decide to use timeout requests in
> io_uring but will return -EINVAL since we enable IORING_SETUP_IOPOLL
> at the same time. I think this combination is a normal scene so as
> the other combination descibed in this patch. I am not sure does it a
> good solution for this problem, and maybe there exists some better way.

I think we can support timeouts pretty easily with SETUP_IOPOLL, but we
can't mix and match everything. Pretty sure I've written at length about
that before, but the tldr is that for purely polled commands, we won't
have an IRQ event. That's the case for nvme if it's misconfigured, but
for an optimal setup where nvme is loaded with poll queues, there will
never be an interrupt for the command. This means that we can never wait
in io_cqring_wait(), we must always call the iopoll poller, because if
we wait we might very well be waiting for events that will never happen
unless we actively poll for them.

This could be supported if we accounted requests, but I don't want to
add that kind of overhead. Same with the lock+irqdisable you had to add
for this, it's not acceptable overhead.

Sounds like you just need timeout support for polling? If so, then that
is supportable as we know that these events will trigger an async event
when they happen. Either that, or it triggers when we poll for
completions. So it's safe to support, and we can definitely do that.

But don't mix polled IO with "normal" IO, it just won't work without
extra additions that I'm not willing to pay the cost of.


-- 
Jens Axboe


  reply	other threads:[~2019-11-04 14:01 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-04  8:56 [RFC] io_uring: stop only support read/write for ctx with IORING_SETUP_IOPOLL yangerkun
2019-11-04 10:09 ` Bob Liu
2019-11-04 11:46   ` yangerkun
2019-11-04 14:01     ` Jens Axboe [this message]
2019-11-06 12:00       ` yangerkun
2019-11-06 14:14         ` Jens Axboe
2019-11-07  1:27           ` yangerkun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50de6a43-fc11-aada-40e6-f3fee6523d49@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=bob.liu@oracle.com \
    --cc=houtao1@huawei.com \
    --cc=linux-block@vger.kernel.org \
    --cc=yangerkun@huawei.com \
    --cc=yi.zhang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.