From: Avi Kivity <avi@scylladb.com>
To: Jens Axboe <axboe@kernel.dk>,
	Linux-RAID <linux-raid@vger.kernel.org>,
	linux-block@vger.kernel.org
Subject: Re: raid0 vs io_uring
Date: Mon, 15 Nov 2021 10:05:38 +0200
Message-ID: <78ccd535-29fa-9d03-0adc-746d1ed62373@scylladb.com>
In-Reply-To: <ee22cbab-950f-cdb0-7ef0-5ea0fe67c628@kernel.dk>

On 11/14/21 20:23, Jens Axboe wrote:
> On 11/14/21 10:07 AM, Avi Kivity wrote:
>> Running a trivial randread, direct=1 fio workload against a RAID-0
>> composed of some nvme devices, I see this pattern:
>>
>>
>> fio-7066  [009]  1800.209865: function: io_submit_sqes
>> fio-7066  [009]  1800.209866: function:                rcu_read_unlock_strict
>> fio-7066  [009]  1800.209866: function:                io_submit_sqe
>> fio-7066  [009]  1800.209866: function:                   io_init_req
>> fio-7066  [009]  1800.209866: function:                      io_file_get
>> fio-7066  [009]  1800.209866: function:                         fget_many
>> fio-7066  [009]  1800.209866: function:                            __fget_files
>> fio-7066  [009]  1800.209867: function:                               rcu_read_unlock_strict
>> fio-7066  [009]  1800.209867: function:                   io_req_prep
>> fio-7066  [009]  1800.209867: function:                      io_prep_rw
>> fio-7066  [009]  1800.209867: function:                   io_queue_sqe
>> fio-7066  [009]  1800.209867: function:                      io_req_defer
>> fio-7066  [009]  1800.209867: function:                      __io_queue_sqe
>> fio-7066  [009]  1800.209868: function:                         io_issue_sqe
>> fio-7066  [009]  1800.209868: function:                            io_read
>> fio-7066  [009]  1800.209868: function:                               io_import_iovec
>> fio-7066  [009]  1800.209868: function:                               __io_file_supports_async
>> fio-7066  [009]  1800.209868: function:                                  I_BDEV
>> fio-7066  [009]  1800.209868: function:                               __kmalloc
>> fio-7066  [009]  1800.209868: function:                                  kmalloc_slab
>> fio-7066  [009]  1800.209868: function: __cond_resched
>> fio-7066  [009]  1800.209868: function:    rcu_all_qs
>> fio-7066  [009]  1800.209869: function: should_failslab
>> fio-7066  [009]  1800.209869: function:                               io_req_map_rw
>> fio-7066  [009]  1800.209869: function:                         io_arm_poll_handler
>> fio-7066  [009]  1800.209869: function:                         io_queue_async_work
>> fio-7066  [009]  1800.209869: function:                            io_prep_async_link
>> fio-7066  [009]  1800.209869: function:                               io_prep_async_work
>> fio-7066  [009]  1800.209870: function:                            io_wq_enqueue
>> fio-7066  [009]  1800.209870: function:                               io_wqe_enqueue
>> fio-7066  [009]  1800.209870: function:                                  _raw_spin_lock_irqsave
>> fio-7066  [009]  1800.209870: function:                                  _raw_spin_unlock_irqrestore
>>
>>
>>
>> From this I deduce that __io_file_supports_async() (today named
>> __io_file_supports_nowait()) returns false, and therefore every io_uring
>> operation is bounced to a workqueue, with a severe loss in
>> performance.
>>
>>
>> However, I also see NOWAIT is part of the default set of flags:
>>
>>
>> #define QUEUE_FLAG_MQ_DEFAULT   ((1 << QUEUE_FLAG_IO_STAT) |   \
>>                                  (1 << QUEUE_FLAG_SAME_COMP) | \
>>                                  (1 << QUEUE_FLAG_NOWAIT))
>>
>> and I don't see that md touches it (I do see that dm plays with it).
>>
>>
>> So, what's the story? Does md not support NOWAIT? If so, that's a huge
>> blow to io_uring with md. If it does, are there any clues about why I
>> see requests bouncing to a workqueue?
> That is indeed the story: dm supports it, but md doesn't just yet.


Ah, so I missed md clearing the default flags somewhere.


This is a false negative from io_uring's point of view, yes? An md array on 
nvme would be essentially nowait in normal operation; it just doesn't 
know it. aio on the same device would not block on the same workload.
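For anyone following along, my mental model of the gating (a rough sketch of 
the logic as I read it, not the actual io_uring source) is that the 
inline-issue path checks whether the underlying request_queue advertises 
QUEUE_FLAG_NOWAIT, and punts the request to the io-wq workqueue when it 
doesn't:

    /* Sketch only -- paraphrasing the check, not quoting the kernel. */
    #include <linux/blkdev.h>

    static bool bdev_supports_nowait_sketch(struct block_device *bdev)
    {
            struct request_queue *q = bdev_get_queue(bdev);

            /*
             * blk_queue_nowait() tests QUEUE_FLAG_NOWAIT on the queue.
             * md never sets that bit today, so for an md device this
             * returns false and io_uring falls back to
             * io_queue_async_work()/io-wq -- the workqueue bounce
             * visible in the trace above.
             */
            return blk_queue_nowait(q);
    }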


> It's
> being worked on right now, though:
>
> https://lore.kernel.org/linux-raid/20211101215143.1580-1-vverma@digitalocean.com/
>
> Should be pretty simple, and then we can push to -stable as well.
>

That's good to know.
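If the eventual fix looks like what I'd expect, md would simply advertise 
the flag on its own queue once every member device supports nowait. Something 
along these lines (a sketch of the idea only; the helper name is made up 
here, see the linked series for the real patch):

    /* Sketch only: illustrates the idea, not the patch from the link above. */
    static void md_advertise_nowait_sketch(struct mddev *mddev)
    {
            struct md_rdev *rdev;
            bool nowait = true;

            /* nowait is only safe if every member device supports it */
            rdev_for_each(rdev, mddev)
                    nowait = nowait &&
                             blk_queue_nowait(bdev_get_queue(rdev->bdev));

            if (nowait)
                    blk_queue_flag_set(QUEUE_FLAG_NOWAIT, mddev->queue);
    }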


