* raid0 vs io_uring
From: Avi Kivity @ 2021-11-14 17:07 UTC
To: Linux-RAID, linux-block
Running a trivial randread, direct=1 fio workload against a RAID-0
composed of some nvme devices, I see this pattern:
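[Editor's note: for readers who want to reproduce this, the workload described could look like the fio job below. The device path and tunables are assumptions for illustration, not taken from the report.]

```ini
; Hypothetical fio job matching the described workload.
; /dev/md0, block size and queue depth are assumptions.
[randread-raid0]
filename=/dev/md0
rw=randread
direct=1
ioengine=io_uring
bs=4k
iodepth=32
```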
fio-7066 [009] 1800.209865: function: io_submit_sqes
fio-7066 [009] 1800.209866: function: rcu_read_unlock_strict
fio-7066 [009] 1800.209866: function: io_submit_sqe
fio-7066 [009] 1800.209866: function: io_init_req
fio-7066 [009] 1800.209866: function: io_file_get
fio-7066 [009] 1800.209866: function: fget_many
fio-7066 [009] 1800.209866: function: __fget_files
fio-7066 [009] 1800.209867: function: rcu_read_unlock_strict
fio-7066 [009] 1800.209867: function: io_req_prep
fio-7066 [009] 1800.209867: function: io_prep_rw
fio-7066 [009] 1800.209867: function: io_queue_sqe
fio-7066 [009] 1800.209867: function: io_req_defer
fio-7066 [009] 1800.209867: function: __io_queue_sqe
fio-7066 [009] 1800.209868: function: io_issue_sqe
fio-7066 [009] 1800.209868: function: io_read
fio-7066 [009] 1800.209868: function: io_import_iovec
fio-7066 [009] 1800.209868: function: __io_file_supports_async
fio-7066 [009] 1800.209868: function: I_BDEV
fio-7066 [009] 1800.209868: function: __kmalloc
fio-7066 [009] 1800.209868: function: kmalloc_slab
fio-7066 [009] 1800.209868: function: __cond_resched
fio-7066 [009] 1800.209868: function: rcu_all_qs
fio-7066 [009] 1800.209869: function: should_failslab
fio-7066 [009] 1800.209869: function: io_req_map_rw
fio-7066 [009] 1800.209869: function: io_arm_poll_handler
fio-7066 [009] 1800.209869: function: io_queue_async_work
fio-7066 [009] 1800.209869: function: io_prep_async_link
fio-7066 [009] 1800.209869: function: io_prep_async_work
fio-7066 [009] 1800.209870: function: io_wq_enqueue
fio-7066 [009] 1800.209870: function: io_wqe_enqueue
fio-7066 [009] 1800.209870: function: _raw_spin_lock_irqsave
fio-7066 [009] 1800.209870: function: _raw_spin_unlock_irqrestore
From which I deduce that __io_file_supports_async() (today named
__io_file_supports_nowait) returns false, and therefore every io_uring
operation is bounced to a workqueue with the resulting great loss in
performance.
However, I also see NOWAIT is part of the default set of flags:
#define QUEUE_FLAG_MQ_DEFAULT ((1 << QUEUE_FLAG_IO_STAT) | \
(1 << QUEUE_FLAG_SAME_COMP) | \
(1 << QUEUE_FLAG_NOWAIT))
and I don't see that md touches it (I do see that dm plays with it).
So, what's the story? Does md not support NOWAIT? If so, that's a huge
blow to io_uring with md. If it does, are there any clues about why I
see requests bouncing to a workqueue?
* Re: raid0 vs io_uring
From: Jens Axboe @ 2021-11-14 18:23 UTC
To: Avi Kivity, Linux-RAID, linux-block
On 11/14/21 10:07 AM, Avi Kivity wrote:
> Running a trivial randread, direct=1 fio workload against a RAID-0
> composed of some nvme devices, I see this pattern:
>
>
> [ftrace output snipped; see the original message above]
>
>
>
> From which I deduce that __io_file_supports_async() (today named
> __io_file_supports_nowait) returns false, and therefore every io_uring
> operation is bounced to a workqueue with the resulting great loss in
> performance.
>
>
> However, I also see NOWAIT is part of the default set of flags:
>
>
> #define QUEUE_FLAG_MQ_DEFAULT ((1 << QUEUE_FLAG_IO_STAT) | \
> (1 << QUEUE_FLAG_SAME_COMP) | \
> (1 << QUEUE_FLAG_NOWAIT))
>
> and I don't see that md touches it (I do see that dm plays with it).
>
>
> So, what's the story? does md not support NOWAIT? If so, that's a huge
> blow to io_uring with md. If it does, are there any clues about why I
> see requests bouncing to a workqueue?
That is indeed the story, dm supports it but md doesn't just yet. It's
being worked on right now, though:
https://lore.kernel.org/linux-raid/20211101215143.1580-1-vverma@digitalocean.com/
Should be pretty simple, and then we can push to -stable as well.
--
Jens Axboe
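[Editor's note: the approach in that series amounts to roughly the sketch below. This is a paraphrase in pseudocode, not the actual patch -- see the linked posting for the real change. Two pieces are needed: md must advertise QUEUE_FLAG_NOWAIT on its queue, and it must honor REQ_NOWAIT anywhere the submission path might sleep.]

```c
/* Pseudocode sketch, not the actual patch. */

/* 1) Advertise nowait support on the md queue: */
blk_queue_flag_set(QUEUE_FLAG_NOWAIT, mddev->queue);

/* 2) On the submission path, fail fast instead of sleeping: */
if (is_suspended(mddev, bio)) {
	if (bio->bi_opf & REQ_NOWAIT) {
		bio_wouldblock_error(bio);  /* -EAGAIN back to the caller */
		return;
	}
	/* ...otherwise wait for the array to resume, as before... */
}
```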
* Re: raid0 vs io_uring
From: Avi Kivity @ 2021-11-15 8:05 UTC
To: Jens Axboe, Linux-RAID, linux-block
On 11/14/21 20:23, Jens Axboe wrote:
> On 11/14/21 10:07 AM, Avi Kivity wrote:
>> Running a trivial randread, direct=1 fio workload against a RAID-0
>> composed of some nvme devices, I see this pattern:
>>
>>
>> [ftrace output snipped; see the first message in the thread]
>>
>>
>>
>> From which I deduce that __io_file_supports_async() (today named
>> __io_file_supports_nowait) returns false, and therefore every io_uring
>> operation is bounced to a workqueue with the resulting great loss in
>> performance.
>>
>>
>> However, I also see NOWAIT is part of the default set of flags:
>>
>>
>> #define QUEUE_FLAG_MQ_DEFAULT ((1 << QUEUE_FLAG_IO_STAT) | \
>> (1 << QUEUE_FLAG_SAME_COMP) | \
>> (1 << QUEUE_FLAG_NOWAIT))
>>
>> and I don't see that md touches it (I do see that dm plays with it).
>>
>>
>> So, what's the story? does md not support NOWAIT? If so, that's a huge
>> blow to io_uring with md. If it does, are there any clues about why I
>> see requests bouncing to a workqueue?
> That is indeed the story, dm supports it but md doesn't just yet.
Ah, so I missed md clearing the default flags somewhere.
This is a false negative from io_uring's point of view, yes? An md on
nvme would be essentially nowait in normal operation, it just doesn't
know it. aio on the same device would not block on the same workload.
> It's
> being worked on right now, though:
>
> https://lore.kernel.org/linux-raid/20211101215143.1580-1-vverma@digitalocean.com/
>
> Should be pretty simple, and then we can push to -stable as well.
>
That's good to know.
* Re: raid0 vs io_uring
From: Jens Axboe @ 2021-11-15 13:16 UTC
To: Avi Kivity, Linux-RAID, linux-block
On 11/15/21 1:05 AM, Avi Kivity wrote:
> On 11/14/21 20:23, Jens Axboe wrote:
>> On 11/14/21 10:07 AM, Avi Kivity wrote:
>>> Running a trivial randread, direct=1 fio workload against a RAID-0
>>> composed of some nvme devices, I see this pattern:
>>>
>>>
>>> [ftrace output snipped; see the first message in the thread]
>>>
>>>
>>>
>>> From which I deduce that __io_file_supports_async() (today named
>>> __io_file_supports_nowait) returns false, and therefore every io_uring
>>> operation is bounced to a workqueue with the resulting great loss in
>>> performance.
>>>
>>>
>>> However, I also see NOWAIT is part of the default set of flags:
>>>
>>>
>>> #define QUEUE_FLAG_MQ_DEFAULT ((1 << QUEUE_FLAG_IO_STAT) | \
>>> (1 << QUEUE_FLAG_SAME_COMP) | \
>>> (1 << QUEUE_FLAG_NOWAIT))
>>>
>>> and I don't see that md touches it (I do see that dm plays with it).
>>>
>>>
>>> So, what's the story? does md not support NOWAIT? If so, that's a huge
>>> blow to io_uring with md. If it does, are there any clues about why I
>>> see requests bouncing to a workqueue?
>> That is indeed the story, dm supports it but md doesn't just yet.
>
>
> Ah, so I missed md clearing the default flags somewhere.
>
>
> This is a false negative from io_uring's point of view, yes? An md on
> nvme would be essentially nowait in normal operation, it just doesn't
> know it. aio on the same device would not block on the same workload.
There are still conditions where it can block, it just didn't in your
test case.
--
Jens Axboe