* fio with polling mode
@ 2016-03-15 11:20 Ley Foon Tan
2016-03-15 18:58 ` Jens Axboe
0 siblings, 1 reply; 7+ messages in thread
From: Ley Foon Tan @ 2016-03-15 11:20 UTC (permalink / raw)
To: fio
Hi
In kernel v4.4 and above, we can use polling mode for NVMe data
transfers with the command below:
echo 1 > /sys/block/nvme0n1/queue/io_poll
We see an NVMe throughput increase in polling mode with the dd command.
Can we run fio in polling mode as well? If so, which fio
parameters/arguments should we use?
Thanks.
* Re: fio with polling mode
2016-03-15 11:20 fio with polling mode Ley Foon Tan
@ 2016-03-15 18:58 ` Jens Axboe
2016-03-16 11:18 ` Ley Foon Tan
0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2016-03-15 18:58 UTC (permalink / raw)
To: Ley Foon Tan, fio
On 03/15/2016 04:20 AM, Ley Foon Tan wrote:
> Hi
>
> In kernel v4.4 and above, we can use polling mode for NVMe data
> transfers with the command below:
>
> echo 1 > /sys/block/nvme0n1/queue/io_poll
>
> We see an NVMe throughput increase in polling mode with the dd command.
> Can we run fio in polling mode as well? If so, which fio
> parameters/arguments should we use?
direct=1, and use one of the sync IO engines (psync would be a good
one). And enable io_poll like you did above, then fio should be in
polled mode.
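[Editor's note: that advice can be captured in a small fio job file. The sketch below is illustrative, not from the thread; the 4k block size and randread workload are assumptions chosen to make the latency effect visible.]

```ini
; poll.fio -- sync engine + O_DIRECT, so completions can go through the
; block layer's polled path once io_poll is enabled for the device
[global]
filename=/dev/nvme0n1
direct=1
ioengine=psync
bs=4k
rw=randread
runtime=30
time_based

[polled-read]
```

Run with `fio poll.fio` after echoing 1 into the device's io_poll.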
--
Jens Axboe
* Re: fio with polling mode
2016-03-15 18:58 ` Jens Axboe
@ 2016-03-16 11:18 ` Ley Foon Tan
2016-03-16 16:25 ` Jens Axboe
0 siblings, 1 reply; 7+ messages in thread
From: Ley Foon Tan @ 2016-03-16 11:18 UTC (permalink / raw)
To: Jens Axboe; +Cc: fio
On Wed, Mar 16, 2016 at 2:58 AM, Jens Axboe <axboe@kernel.dk> wrote:
>
> On 03/15/2016 04:20 AM, Ley Foon Tan wrote:
>>
>> Hi
>>
>> In kernel v4.4 and above, we can use polling mode for NVMe data
>> transfers with the command below:
>>
>> echo 1 > /sys/block/nvme0n1/queue/io_poll
>>
>> We see an NVMe throughput increase in polling mode with the dd command.
>> Can we run fio in polling mode as well? If so, which fio
>> parameters/arguments should we use?
>
>
> direct=1, and use one of the sync IO engines (psync would be a good one). And enable io_poll like you did above, then fio should be in polled mode.
>
> --
> Jens Axboe
Hi Jens
I have tried fio with direct=1 and ioengine=psync, but the results are
almost the same (low throughput). Below is an example command for a
sequential write.
In your experience, any clue where the bottleneck for the low
throughput is (based on the fio output)? Note: kernel v4.4, ARM platform.
# fio --filename=/dev/nvme0n1 --rw=write --direct=1 --blocksize=128k \
      --size=500M --iodepth=64 --group_reporting --name=myjob \
      --ioengine=psync
myjob: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=psync, iodepth=64
fio 2.0.5
Starting 1 process
Jobs: 1 (f=1)
myjob: (groupid=0, jobs=1): err= 0: pid=1139
write: io=512000KB, bw=314110KB/s, iops=2453 , runt= 1630msec
clat (usec): min=379 , max=694 , avg=395.20, stdev=11.90
lat (usec): min=384 , max=702 , avg=402.65, stdev=12.35
clat percentiles (usec):
| 1.00th=[ 382], 5.00th=[ 394], 10.00th=[ 394], 20.00th=[ 394],
| 30.00th=[ 394], 40.00th=[ 394], 50.00th=[ 394], 60.00th=[ 394],
| 70.00th=[ 394], 80.00th=[ 394], 90.00th=[ 398], 95.00th=[ 402],
| 99.00th=[ 410], 99.50th=[ 422], 99.90th=[ 644]
bw (KB/s) : min=314112, max=314624, per=100.00%, avg=314368.00, stdev=256.00
lat (usec) : 500=99.78%, 750=0.22%
cpu : usr=3.07%, sys=33.76%, ctx=4001, majf=0, minf=0
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=512080/w=0/d=0, short=r=4000/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=512000KB, aggrb=314110KB/s, minb=321649KB/s, maxb=321649KB/s, mint=1630msec, maxt=1630msec
Disk stats (read/write):
nvme0n1: ios=62/7962, merge=0/0, ticks=0/2020, in_queue=2020, util=58.76%
# echo 1 > /sys/block/nvme0n1/queue/io_poll
# fio --filename=/dev/nvme0n1 --rw=write --direct=1 --blocksize=128k \
      --size=500M --iodepth=64 --group_reporting --name=myjob \
      --ioengine=psync
myjob: (g=0): rw=write, bs=128K-128K/128K-128K, ioengine=psync, iodepth=64
fio 2.0.5
Starting 1 process
Jobs: 1 (f=1)
myjob: (groupid=0, jobs=1): err= 0: pid=1152
write: io=512000KB, bw=319600KB/s, iops=2496 , runt= 1602msec
clat (usec): min=368 , max=6292 , avg=389.23, stdev=140.91
lat (usec): min=373 , max=6299 , avg=396.41, stdev=140.94
clat percentiles (usec):
| 1.00th=[ 370], 5.00th=[ 374], 10.00th=[ 382], 20.00th=[ 382],
| 30.00th=[ 386], 40.00th=[ 386], 50.00th=[ 386], 60.00th=[ 386],
| 70.00th=[ 386], 80.00th=[ 386], 90.00th=[ 386], 95.00th=[ 394],
| 99.00th=[ 406], 99.50th=[ 414], 99.90th=[ 1880]
bw (KB/s) : min=317696, max=323328, per=99.99%, avg=319573.33, stdev=3251.64
lat (usec) : 500=99.65%, 750=0.20%, 1000=0.03%
lat (msec) : 2=0.05%, 4=0.03%, 10=0.05%
cpu : usr=1.87%, sys=98.06%, ctx=18, majf=0, minf=0
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=512080/w=0/d=0, short=r=4000/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=512000KB, aggrb=319600KB/s, minb=327270KB/s, maxb=327270KB/s, mint=1602msec, maxt=1602msec
Disk stats (read/write):
nvme0n1: ios=62/6878, merge=0/0, ticks=10/1740, in_queue=1750, util=58.94%
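[Editor's note: as a sanity check on the two runs above, fio's reported bandwidth is simply the transferred size divided by the runtime, which puts the polled run only about 1.7% ahead at this block size.]

```python
# Recompute fio's bw= figure from io= and runt= for both runs above.
io_kb = 512000  # io=512000KB in both runs

runs = {
    "io_poll=0": 1.630,  # runt=1630msec
    "io_poll=1": 1.602,  # runt=1602msec
}

bw = {name: io_kb / secs for name, secs in runs.items()}  # KB/s
for name, kbs in bw.items():
    print(f"{name}: {kbs:.0f} KB/s")  # matches bw=314110 and bw=319600

gain = bw["io_poll=1"] / bw["io_poll=0"] - 1
print(f"gain from polling at 128K: {gain:.1%}")  # ~1.7%
```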
Thanks.
* Re: fio with polling mode
2016-03-16 11:18 ` Ley Foon Tan
@ 2016-03-16 16:25 ` Jens Axboe
2016-03-17 9:39 ` Ley Foon Tan
0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2016-03-16 16:25 UTC (permalink / raw)
To: Ley Foon Tan; +Cc: fio
On 03/16/2016 04:18 AM, Ley Foon Tan wrote:
> On Wed, Mar 16, 2016 at 2:58 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 03/15/2016 04:20 AM, Ley Foon Tan wrote:
>>>
>>> Hi
>>>
>>> In kernel v4.4 and above, we can use polling mode for NVMe data
>>> transfers with the command below:
>>>
>>> echo 1 > /sys/block/nvme0n1/queue/io_poll
>>>
>>> We see an NVMe throughput increase in polling mode with the dd command.
>>> Can we run fio in polling mode as well? If so, which fio
>>> parameters/arguments should we use?
>>
>>
>> direct=1, and use one of the sync IO engines (psync would be a good one). And enable io_poll like you did above, then fio should be in polled mode.
>>
>> --
>> Jens Axboe
> Hi Jens
>
> I have tried fio with direct=1 and ioengine=psync, but the results are
> almost the same (low throughput). Below is an example command for a
> sequential write.
> In your experience, any clue where the bottleneck for the low
> throughput is (based on the fio output)? Note: kernel v4.4, ARM platform.
Polled IO helps with latencies, which means that the effects are most
pronounced on smaller block size IO. You are using 128K, which is pretty
far outside the realm of "smaller block size".
That said, you do seem to have a reduction in average latency with
polling. But given the transfer size and time, percentage wise, it's not
that huge.
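[Editor's note: a toy model makes this point concrete. Every constant below is an illustrative assumption, not a measurement from this thread: per-IO latency is modeled as a fixed per-command cost plus transfer time, and polling removes a roughly fixed slice of the fixed cost.]

```python
# Toy model of per-IO latency: fixed per-command cost + transfer time.
# All three constants are illustrative assumptions, not measurements.
FIXED_US = 20.0        # per-command latency, independent of size (assumed)
POLL_SAVING_US = 6.0   # interrupt/wakeup cost that polling avoids (assumed)
BYTES_PER_US = 1500.0  # ~1.5 GB/s device transfer rate (assumed)

for bs in (4 * 1024, 128 * 1024):
    total_us = FIXED_US + bs / BYTES_PER_US
    frac = POLL_SAVING_US / total_us
    print(f"bs={bs // 1024:>3}K: polling saves ~{frac:.0%} of per-IO latency")
```

With these made-up numbers the saving is roughly a quarter of each IO at 4K but only a few percent at 128K, mirroring the small-block effect described above.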
--
Jens Axboe
* Re: fio with polling mode
2016-03-16 16:25 ` Jens Axboe
@ 2016-03-17 9:39 ` Ley Foon Tan
2016-03-17 16:18 ` Jens Axboe
0 siblings, 1 reply; 7+ messages in thread
From: Ley Foon Tan @ 2016-03-17 9:39 UTC (permalink / raw)
To: Jens Axboe; +Cc: fio
On Thu, Mar 17, 2016 at 12:25 AM, Jens Axboe <axboe@kernel.dk> wrote:
> On 03/16/2016 04:18 AM, Ley Foon Tan wrote:
>>
>> On Wed, Mar 16, 2016 at 2:58 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>>
>>> On 03/15/2016 04:20 AM, Ley Foon Tan wrote:
>>>>
>>>>
>>>> Hi
>>>>
>>>> In kernel v4.4 and above, we can use polling mode for NVMe data
>>>> transfers with the command below:
>>>>
>>>> echo 1 > /sys/block/nvme0n1/queue/io_poll
>>>>
>>>> We see an NVMe throughput increase in polling mode with the dd command.
>>>> Can we run fio in polling mode as well? If so, which fio
>>>> parameters/arguments should we use?
>>>
>>>
>>>
>>> direct=1, and use one of the sync IO engines (psync would be a good one).
>>> And enable io_poll like you did above, then fio should be in polled mode.
>>>
>>> --
>>> Jens Axboe
>>
>> Hi Jens
>>
>> I have tried fio with direct=1 and ioengine=psync, but the results are
>> almost the same (low throughput). Below is an example command for a
>> sequential write.
>> In your experience, any clue where the bottleneck for the low
>> throughput is (based on the fio output)? Note: kernel v4.4, ARM platform.
>
>
> Polled IO helps with latencies, which means that the effects are most
> pronounced on smaller block size IO. You are using 128K, which is pretty far
> outside the realm of "smaller block size".
>
> That said, you do seem to have a reduction in average latency with polling.
> But given the transfer size and time, percentage wise, it's not that huge.
>
Yes, you are right. We see about a 20% throughput gain with a 4KB
block size, but block sizes above 4KB don't show much improvement.
Do you know of any other fio or OS settings that can help NVMe
throughput? Setting the scheduler to noop didn't help either.
Thanks.
Regards
Ley Foon
* Re: fio with polling mode
2016-03-17 9:39 ` Ley Foon Tan
@ 2016-03-17 16:18 ` Jens Axboe
2016-03-18 7:09 ` Ley Foon Tan
0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2016-03-17 16:18 UTC (permalink / raw)
To: Ley Foon Tan; +Cc: fio
On 03/17/2016 02:39 AM, Ley Foon Tan wrote:
> On Thu, Mar 17, 2016 at 12:25 AM, Jens Axboe <axboe@kernel.dk> wrote:
>> On 03/16/2016 04:18 AM, Ley Foon Tan wrote:
>>>
>>> On Wed, Mar 16, 2016 at 2:58 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>>
>>>> On 03/15/2016 04:20 AM, Ley Foon Tan wrote:
>>>>>
>>>>>
>>>>> Hi
>>>>>
>>>>> In kernel v4.4 and above, we can use polling mode for NVMe data
>>>>> transfers with the command below:
>>>>>
>>>>> echo 1 > /sys/block/nvme0n1/queue/io_poll
>>>>>
>>>>> We see an NVMe throughput increase in polling mode with the dd command.
>>>>> Can we run fio in polling mode as well? If so, which fio
>>>>> parameters/arguments should we use?
>>>>
>>>>
>>>>
>>>> direct=1, and use one of the sync IO engines (psync would be a good one).
>>>> And enable io_poll like you did above, then fio should be in polled mode.
>>>>
>>>> --
>>>> Jens Axboe
>>>
>>> Hi Jens
>>>
>>> I have tried fio with direct=1 and ioengine=psync, but the results are
>>> almost the same (low throughput). Below is an example command for a
>>> sequential write.
>>> In your experience, any clue where the bottleneck for the low
>>> throughput is (based on the fio output)? Note: kernel v4.4, ARM platform.
>>
>>
>> Polled IO helps with latencies, which means that the effects are most
>> pronounced on smaller block size IO. You are using 128K, which is pretty far
>> outside the realm of "smaller block size".
>>
>> That said, you do seem to have a reduction in average latency with polling.
>> But given the transfer size and time, percentage wise, it's not that huge.
>>
>> Yes, you are right. We see about a 20% throughput gain with a 4KB
>> block size, but block sizes above 4KB don't show much improvement.
>> Do you know of any other fio or OS settings that can help NVMe
>> throughput? Setting the scheduler to noop didn't help either.
Polling isn't really going to help your throughput; it only helps when
you are IOPS bound. If you are missing bandwidth, you probably need
to look more closely at why that is.
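[Editor's note: the thread's own numbers illustrate the distinction. Bandwidth is IOPS times block size, so the 128K run reaches its bandwidth at a modest IOPS rate, while moving the same bytes at 4K would need about 32x more completions per second, which is where per-IO overhead, and hence polling, dominates.]

```python
# bandwidth = IOPS * block size; figures from the non-polled 128K run above
bs_kb = 128
iops = 2453                 # reported iops=2453
bw_kb_s = iops * bs_kb      # close to the reported bw=314110 KB/s
print(f"{iops} IOPS x {bs_kb}K = {bw_kb_s} KB/s")

# The same bandwidth at a 4K block size needs 32x the completion rate:
iops_4k = bw_kb_s // 4
print(f"equivalent 4K workload: {iops_4k} IOPS")
```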
--
Jens Axboe
* Re: fio with polling mode
2016-03-17 16:18 ` Jens Axboe
@ 2016-03-18 7:09 ` Ley Foon Tan
0 siblings, 0 replies; 7+ messages in thread
From: Ley Foon Tan @ 2016-03-18 7:09 UTC (permalink / raw)
To: Jens Axboe; +Cc: fio
On Fri, Mar 18, 2016 at 12:18 AM, Jens Axboe <axboe@kernel.dk> wrote:
> On 03/17/2016 02:39 AM, Ley Foon Tan wrote:
>>
>> On Thu, Mar 17, 2016 at 12:25 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> On 03/16/2016 04:18 AM, Ley Foon Tan wrote:
>>>>
>>>>
>>>> On Wed, Mar 16, 2016 at 2:58 AM, Jens Axboe <axboe@kernel.dk> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 03/15/2016 04:20 AM, Ley Foon Tan wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi
>>>>>>
>>>>>> In kernel v4.4 and above, we can use polling mode for NVMe data
>>>>>> transfers with the command below:
>>>>>>
>>>>>> echo 1 > /sys/block/nvme0n1/queue/io_poll
>>>>>>
>>>>>> We see an NVMe throughput increase in polling mode with the dd command.
>>>>>> Can we run fio in polling mode as well? If so, which fio
>>>>>> parameters/arguments should we use?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> direct=1, and use one of the sync IO engines (psync would be a good
>>>>> one).
>>>>> And enable io_poll like you did above, then fio should be in polled
>>>>> mode.
>>>>>
>>>>> --
>>>>> Jens Axboe
>>>>
>>>>
>>>> Hi Jens
>>>>
>>>> I have tried fio with direct=1 and ioengine=psync, but the results are
>>>> almost the same (low throughput). Below is an example command for a
>>>> sequential write.
>>>> In your experience, any clue where the bottleneck for the low
>>>> throughput is (based on the fio output)? Note: kernel v4.4, ARM platform.
>>>
>>>
>>>
>>> Polled IO helps with latencies, which means that the effects are most
>>> pronounced on smaller block size IO. You are using 128K, which is pretty
>>> far
>>> outside the realm of "smaller block size".
>>>
>>> That said, you do seem to have a reduction in average latency with
>>> polling.
>>> But given the transfer size and time, percentage wise, it's not that
>>> huge.
>>>
>> Yes, you are right. We see about a 20% throughput gain with a 4KB
>> block size, but block sizes above 4KB don't show much improvement.
>> Do you know of any other fio or OS settings that can help NVMe
>> throughput? Setting the scheduler to noop didn't help either.
>
>
> Polling isn't really going to help your throughput; it only helps when
> you are IOPS bound. If you are missing bandwidth, you probably need to
> look more closely at why that is.
Yes, we are looking into this now.
Let me know if you know of anything that could impact it.
BTW, thanks for your help!
Regards
Ley Foon
end of thread, other threads:[~2016-03-18 7:09 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-15 11:20 fio with polling mode Ley Foon Tan
2016-03-15 18:58 ` Jens Axboe
2016-03-16 11:18 ` Ley Foon Tan
2016-03-16 16:25 ` Jens Axboe
2016-03-17 9:39 ` Ley Foon Tan
2016-03-17 16:18 ` Jens Axboe
2016-03-18 7:09 ` Ley Foon Tan