linux-block.vger.kernel.org archive mirror
* Sighting:  io_uring requests on unexpected core
@ 2020-04-03 19:39 Wunderlich, Mark
  2020-04-03 21:44 ` Sagi Grimberg
  0 siblings, 1 reply; 4+ messages in thread
From: Wunderlich, Mark @ 2020-04-03 19:39 UTC (permalink / raw)
  To: linux-block; +Cc: Sagi Grimberg, Jens Axboe

Hey all, Mark here again with another sighting.  If you're all WFH like myself during this virus period, maybe this will provide you with a new puzzle to solve and pass the time, while helping to educate.  Our family is big into puzzles during this period.

Here is the issue:
While performing an FIO test, for a single job thread pinned to a specific CPU, I can trap requests to the nvmf layer from a core and queue not aligned to the FIO specified CPU.
I can replicate this on the baseline of nvme-5.5-rc or nvme-5.7-rc1 branches of the infradead repository, with no other patches applied.
For a typical 30 second 4k 100% read test there will be over 2 million packets processed, with under 100 sent by this other CPU to a different queue.  When this occurs it causes a drop in performance of 1-3%.
My nvmf queue configuration is 1 nr_io_queue and 104 nr_poll_queues that equal the number of active cores in the system.  As indicated, this is while running an FIO test using io_uring for 100% random read.  I have seen this with a queue depth of 1, batch 1, as well as queue depth 32, batch 8.

 The basic command line being:
/fio --filename=/dev/nvme0n1 --time_based --runtime=30 --ramp_time=10 --thread --rw=randrw --rwmixread=100 --refill_buffers --direct=1 --ioengine=io_uring --hipri --fixedbufs --bs=4k --iodepth=32 --iodepth_batch_complete_min=1 --iodepth_batch_complete_max=32 --iodepth_batch=8 --numjobs=1 --group_reporting --gtod_reduce=0 --disable_lat=0 --name=cpu3 --cpus_allowed=3

Adding monitoring within the functions nvme_tcp_queue_request() and nvme_tcp_poll(), I see the following: polling from the expected CPU for different queues with different assigned CPUs [queue->io_cpu], and new queue requests coming in on an unexpected CPU [not the one directed on the FIO invocation] with a queue context assigned that same CPU value.  Note: even when requests come in on different CPU cores, all polling is from the same expected CPU core.  (A rough sketch of the instrumentation follows the trace excerpt below.)
[  524.867622] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
[  524.867686] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]
[  524.867693] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
[  524.867755] nvme_tcp: nvme_tcp_queue_request: IO-Q [Queue CPU 75], [CPU 75]
[  524.867758] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]
[  524.867777] nvme_tcp: nvme_tcp_queue_request: IO-Q [Queue CPU 3], [CPU 3]
[  524.867781] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
[  524.867853] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]
[  524.867864] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
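
For reference, here is a rough sketch of the instrumentation behind the traces above.  Only the pr_debug() lines are additions; the function and field names follow the 5.5/5.7-era nvme-tcp driver (drivers/nvme/host/tcp.c), so exact signatures may differ in your tree.

static void nvme_tcp_queue_request(struct nvme_tcp_request *req)
{
	struct nvme_tcp_queue *queue = req->queue;

	/* report which CPU the queue is mapped to vs. which CPU is submitting */
	pr_debug("nvme_tcp_queue_request: IO-Q [Queue CPU %d], [CPU %d]\n",
		 queue->io_cpu, raw_smp_processor_id());

	/* ... existing submission path unchanged ... */
}

static int nvme_tcp_poll(struct blk_mq_hw_ctx *hctx)
{
	struct nvme_tcp_queue *queue = hctx->driver_data;

	/* report which CPU the polled queue is mapped to vs. the polling CPU */
	pr_debug("nvme_tcp_poll: [Queue CPU %d], [CPU %d]\n",
		 queue->io_cpu, raw_smp_processor_id());

	/* ... existing polling logic unchanged ... */
}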

So, if someone can help solve this puzzle and help me understand what is causing this behavior, that would be great.  It is hard for me to see it as expected, or beneficial, behavior to need some other core/queue for fewer than 100 requests out of over 2 million.

Good hunting --- Cheers , Mark


* Re: Sighting: io_uring requests on unexpected core
  2020-04-03 19:39 Sighting: io_uring requests on unexpected core Wunderlich, Mark
@ 2020-04-03 21:44 ` Sagi Grimberg
  2020-04-04  0:03   ` Wunderlich, Mark
  2020-04-06 18:11   ` Sagi Grimberg
  0 siblings, 2 replies; 4+ messages in thread
From: Sagi Grimberg @ 2020-04-03 21:44 UTC (permalink / raw)
  To: Wunderlich, Mark, linux-block; +Cc: Jens Axboe

> Hey all, Mark here again with another sighting.

Hey Mark,

> If you're all WFH like myself during this virus period, maybe this will provide you with a new puzzle to solve and pass the time, while helping to educate.  Our family is big into puzzles during this period.

:-)

> Here is the issue:
> While performing an FIO test, for a single job thread pinned to a specific CPU, I can trap requests to the nvmf layer from a core and queue not aligned to the FIO specified CPU.
> I can replicate this on the baseline of nvme-5.5-rc or nvme-5.7-rc1 branches of the infradead repository, with no other patches applied.
> For a typical 30 second 4k 100% read test there will be over 2 million packets processed, with under 100 sent by this other CPU to a different queue.  When this occurs it causes a drop in performance of 1-3%.
> My nvmf queue configuration is 1 nr_io_queue and 104 nr_poll_queues that equal the number of active cores in the system.

Given that you pin your fio thread the high poll queue count shouldn't
really matter I assume.

> As indicated, this is while running an FIO test using io_uring for 100% random read.  I have seen this with a queue depth of 1, batch 1, as well as queue depth 32, batch 8.
> 
>   The basic command line being:
> /fio --filename=/dev/nvme0n1 --time_based --runtime=30 --ramp_time=10 --thread --rw=randrw --rwmixread=100 --refill_buffers --direct=1 --ioengine=io_uring --hipri --fixedbufs --bs=4k --iodepth=32 --iodepth_batch_complete_min=1 --iodepth_batch_complete_max=32 --iodepth_batch=8 --numjobs=1 --group_reporting --gtod_reduce=0 --disable_lat=0 --name=cpu3 --cpus_allowed=3
> 
> Adding monitoring within the functions nvme_tcp_queue_request() and nvme_tcp_poll(), I see the following: polling from the expected CPU for different queues with different assigned CPUs [queue->io_cpu], and new queue requests coming in on an unexpected CPU [not the one directed on the FIO invocation] with a queue context assigned that same CPU value.  Note: even when requests come in on different CPU cores, all polling is from the same expected CPU core.

nvme_tcp_poll: [Queue CPU 3], [CPU 3] means that the poll is called
on cpu core [3] on a queue that is mapped to cpu core [3] correct?

nvme_tcp_poll: [Queue CPU 75], [CPU 3] means that the poll is called
on cpu core [3] on a queue that is mapped to cpu core [75] correct?

> [  524.867622] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
> [  524.867686] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]

I'm assuming that this is a poll invocation of a prior submission to
a queue that is mapped to CPU 75?

> [  524.867693] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
> [  524.867755] nvme_tcp: nvme_tcp_queue_request: IO-Q [Queue CPU 75], [CPU 75]

This log print means that on cpu core [3] we see a request submitted on
a queue that is mapped to cpu core [75] correct?

> [  524.867758] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]
> [  524.867777] nvme_tcp: nvme_tcp_queue_request: IO-Q [Queue CPU 3], [CPU 3]
> [  524.867781] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
> [  524.867853] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]
> [  524.867864] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
> 
> So, if someone can help solve this puzzle and help me understand what is causing this behavior, that would be great.  It is hard for me to see it as expected, or beneficial, behavior to need some other core/queue for fewer than 100 requests out of over 2 million.

I'm assuming that this phenomenon happens also without polling?

Anyways, it is unexpected to me. Given that you have a queue that is
mapped to the cpu you are pinning on, I'd expect that all requests
generated on this cpu would be submitted on that same
queue.
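
My expectation comes from how blk-mq picks the hardware context: the
per-CPU software context of the submitting core selects the hctx.  A
simplified sketch (names approximate the 5.5-era block layer, so treat
this as illustration rather than the exact code):

/*
 * Simplified: blk-mq derives the hctx from the per-CPU software context
 * of the submitting core (looked up via raw_smp_processor_id()), so a
 * request generated on core 3 should land on the queue mapped to core 3.
 */
static inline struct blk_mq_hw_ctx *blk_mq_map_queue(struct request_queue *q,
						     unsigned int flags,
						     struct blk_mq_ctx *ctx)
{
	enum hctx_type type = HCTX_TYPE_DEFAULT;

	if (flags & REQ_HIPRI)				/* polled I/O -> poll queue set */
		type = HCTX_TYPE_POLL;
	else if ((flags & REQ_OP_MASK) == REQ_OP_READ)
		type = HCTX_TYPE_READ;

	return ctx->hctxs[type];
}

So whichever core actually generates the request decides the queue, which
is why the submissions from core 75 are surprising.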

Does anyone have any insights on this?


* RE: Sighting: io_uring requests on unexpected core
  2020-04-03 21:44 ` Sagi Grimberg
@ 2020-04-04  0:03   ` Wunderlich, Mark
  2020-04-06 18:11   ` Sagi Grimberg
  1 sibling, 0 replies; 4+ messages in thread
From: Wunderlich, Mark @ 2020-04-04  0:03 UTC (permalink / raw)
  To: Sagi Grimberg, linux-block; +Cc: Jens Axboe

>> Here is the issue:
>> While performing an FIO test, for a single job thread pinned to a specific CPU, I can trap requests to the nvmf layer from a core and queue not aligned to the FIO specified CPU.
>> I can replicate this on the baseline of nvme-5.5-rc or nvme-5.7-rc1 branches of the infradead repository, with no other patches applied.
>> For a typical 30 second 4k 100% read test there will be over 2 million packets processed, with under 100 sent by this other CPU to a different queue.  When this occurs it causes a drop in performance of 1-3%.
>> My nvmf queue configuration is 1 nr_io_queue and 104 nr_poll_queues that equal the number of active cores in the system.

>Given that you pin your fio thread the high poll queue count shouldn't really matter I assume.

>> As indicated, this is while running an FIO test using io_uring for 100% random read.  I have seen this with a queue depth of 1, batch 1, as well as queue depth 32, batch 8.
>> 
>>   The basic command line being:
>> /fio --filename=/dev/nvme0n1 --time_based --runtime=30 --ramp_time=10 
>> --thread --rw=randrw --rwmixread=100 --refill_buffers --direct=1 
>> --ioengine=io_uring --hipri --fixedbufs --bs=4k --iodepth=32 
>> --iodepth_batch_complete_min=1 --iodepth_batch_complete_max=32 
>> --iodepth_batch=8 --numjobs=1 --group_reporting --gtod_reduce=0 
>> --disable_lat=0 --name=cpu3 --cpus_allowed=3
>> 
>> Adding monitoring within the functions nvme_tcp_queue_request() and nvme_tcp_poll(), I see the following: polling from the expected CPU for different queues with different assigned CPUs [queue->io_cpu], and new queue requests coming in on an unexpected CPU [not the one directed on the FIO invocation] with a queue context assigned that same CPU value.  Note: even when requests come in on different CPU cores, all polling is from the same expected CPU core.

> nvme_tcp_poll: [Queue CPU 3], [CPU 3] means that the poll is called on cpu core [3] on a queue that is mapped to cpu core [3] correct?
Correct.

> nvme_tcp_poll: [Queue CPU 75], [CPU 3] means that the poll is called on cpu core [3] on a queue that is mapped to cpu core [75] correct?
Correct.

>> [  524.867622] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
>> [  524.867686] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]

> I'm assuming that this is a poll invocation of a prior submission to a queue that is mapped to CPU 75?
Correct.  I just grabbed a section of debug output that showed all details.

>> [  524.867693] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
>> [  524.867755] nvme_tcp: nvme_tcp_queue_request: IO-Q [Queue CPU 75], 
>> [CPU 75]

> This log print means that on cpu core [3] we see a request submitted on a queue that is mapped to cpu core [75] correct?
Actually, this print indicates that CPU 75 has called queue_request for a queue that is mapped to CPU 75.  So this highlights the issue: another CPU, other than the one selected on the FIO command line, has called into nvmf to issue an I/O request.

>> [  524.867758] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]
>> [  524.867777] nvme_tcp: nvme_tcp_queue_request: IO-Q [Queue CPU 3], [CPU 3]
>> [  524.867781] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
>> [  524.867853] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]
>> [  524.867864] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
>> 
>> So, if someone can help solve this puzzle and help me understand what is causing this behavior, that would be great.  It is hard for me to see it as expected, or beneficial, behavior to need some other core/queue for fewer than 100 requests out of over 2 million.

> I'm assuming that this phenomenon happens also without polling?
Yes, if I turn off polling by not selecting 'hipri' on the command line, I will still see the queue_request behavior, but obviously not the calls to nvme_tcp_poll.

> Anyways, it is unexpected to me. Given that you have a queue that is mapped to the cpu you are pinning on, I'd expect that all requests generated on this cpu would be submitted on that same queue.

> Does anyone have any insights on this?


* Re: Sighting: io_uring requests on unexpected core
  2020-04-03 21:44 ` Sagi Grimberg
  2020-04-04  0:03   ` Wunderlich, Mark
@ 2020-04-06 18:11   ` Sagi Grimberg
  1 sibling, 0 replies; 4+ messages in thread
From: Sagi Grimberg @ 2020-04-06 18:11 UTC (permalink / raw)
  To: Wunderlich, Mark, linux-block; +Cc: Jens Axboe, linux-nvme

CC'ing linux-nvme

On 4/3/20 2:44 PM, Sagi Grimberg wrote:
>> Hey all, Mark here again with another sighting.
> 
> Hey Mark,
> 
>> If you're all WFH like myself during this virus period, maybe this will
>> provide you with a new puzzle to solve and pass the time, while
>> helping to educate.  Our family is big into puzzles during this period.
> 
> :-)
> 
>> Here is the issue:
>> While performing an FIO test, for a single job thread pinned to a 
>> specific CPU, I can trap requests to the nvmf layer from a core and 
>> queue not aligned to the FIO specified CPU.
>> I can replicate this on the baseline of nvme-5.5-rc or nvme-5.7-rc1 
>> branches of the infradead repository, with no other patches applied.
>> For a typical 30 second 4k 100% read test there will be over 2 million 
>> packets processed, with under 100 sent by this other CPU to a 
>> different queue.  When this occurs it causes a drop in performance of 
>> 1-3%.
>> My nvmf queue configuration is 1 nr_io_queue and 104 nr_poll_queues 
>> that equal the number of active cores in the system.
> 
> Given that you pin your fio thread the high poll queue count shouldn't
> really matter I assume.
> 
>> As indicated, this is while running an FIO test using io_uring for 100%
>> random read.  I have seen this with a queue depth of 1, batch 1, as
>> well as queue depth 32, batch 8.
>>
>>   The basic command line being:
>> /fio --filename=/dev/nvme0n1 --time_based --runtime=30 --ramp_time=10 
>> --thread --rw=randrw --rwmixread=100 --refill_buffers --direct=1 
>> --ioengine=io_uring --hipri --fixedbufs --bs=4k --iodepth=32 
>> --iodepth_batch_complete_min=1 --iodepth_batch_complete_max=32 
>> --iodepth_batch=8 --numjobs=1 --group_reporting --gtod_reduce=0 
>> --disable_lat=0 --name=cpu3 --cpus_allowed=3
>>
>> Adding monitoring within the functions nvme_tcp_queue_request()
>> and nvme_tcp_poll(), I see the following: polling from the
>> expected CPU for different queues with different assigned CPUs
>> [queue->io_cpu], and new queue requests coming in on an unexpected CPU
>> [not the one directed on the FIO invocation] with a queue context
>> assigned that same CPU value.  Note: even when requests come in on
>> different CPU cores, all polling is from the same expected CPU core.
> 
> nvme_tcp_poll: [Queue CPU 3], [CPU 3] means that the poll is called
> on cpu core [3] on a queue that is mapped to cpu core [3] correct?
> 
> nvme_tcp_poll: [Queue CPU 75], [CPU 3] means that the poll is called
> on cpu core [3] on a queue that is mapped to cpu core [75] correct?
> 
>> [  524.867622] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
>> [  524.867686] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]
> 
> I'm assuming that this is a poll invocation of a prior submission to
> a queue that is mapped to CPU 75?
> 
>> [  524.867693] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
>> [  524.867755] nvme_tcp: nvme_tcp_queue_request: IO-Q [Queue CPU 75], 
>> [CPU 75]
> 
> This log print means that on cpu core [3] we see a request submitted on
> a queue that is mapped to cpu core [75] correct?
> 
>> [  524.867758] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]
>> [  524.867777] nvme_tcp: nvme_tcp_queue_request: IO-Q [Queue CPU 3], 
>> [CPU 3]
>> [  524.867781] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
>> [  524.867853] nvme_tcp:        nvme_tcp_poll: [Queue CPU 75], [CPU 3]
>> [  524.867864] nvme_tcp:        nvme_tcp_poll: [Queue CPU 3], [CPU 3]
>>
>> So, if someone can help solve this puzzle and help me understand what
>> is causing this behavior, that would be great.  It is hard for me to see
>> it as expected, or beneficial, behavior to need some other core/queue
>> for fewer than 100 requests out of over 2 million.
> 
> I'm assuming that this phenomenon happens also without polling?
> 
> Anyways, it is unexpected to me. Given that you have a queue that is
> mapped to the cpu you are pinning on, I'd expect that all requests
> generated on this cpu would be submitted on that same
> queue.
> 
> Does anyone have any insights on this?

