netdev.vger.kernel.org archive mirror
* Re: packet stuck in qdisc
       [not found] ` <698739062.1462023.1643115337201.JavaMail.zimbra@kalray.eu>
@ 2022-01-28  2:36   ` Yunsheng Lin
  2022-01-28  8:58     ` Vincent Ray
  2022-03-25  6:16     ` Yunsheng Lin
  0 siblings, 2 replies; 28+ messages in thread
From: Yunsheng Lin @ 2022-01-28  2:36 UTC (permalink / raw)
  To: Vincent Ray, vladimir.oltean, kuba, davem; +Cc: Samuel Jones, netdev

On 2022/1/25 20:55, Vincent Ray wrote:
> Dear kernel maintainers / developers,
> 
> I work at Kalray, where we are developing an NVMe-over-TCP target controller board.
> My setup is as follows:
> - a development workstation running Linux 5.x.y (the host)
> - sending NVMe-TCP traffic to our board, to which it is connected through a Mellanox ConnectX-5 NIC and a 100G Ethernet cable
> 
> While doing performance tests, using simple fio scenarios running over the regular kernel nvme-tcp driver on the host, we noticed significant performance variations.
> After some digging (using tcpdump on the host), we found that there were big "holes" in the TCP traffic sent by the host.
> The scenario we observed is the following:
> 1) a TCP segment gets lost (not sent by the host) on a particular TCP connection, leading to a gap in the seq numbers received by the board
> 2) the board sends dup-acks and/or SACKs (if configured) to signal this loss
> 3) then, sometimes, the host stops emitting on that TCP connection for several seconds (as much as 14 s observed)
> 4) finally, the host resumes emission, sending the missing packet
> 5) then the TCP connection continues correctly with the appropriate throughput
> 
> Such a scenario can be observed in the attached tcpdump (+ comments).
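The kind of transmission hole described above can be spotted programmatically once the capture is exported as per-packet send timestamps (e.g. with tshark). A rough sketch, not Kalray's actual tooling:

```python
def find_tx_pauses(timestamps, min_gap=1.0):
    """Return (pause_seconds, start_time) for every inter-packet gap
    longer than `min_gap`, given the send timestamps (in seconds) of
    one direction of one TCP connection."""
    pauses = []
    for t0, t1 in zip(timestamps, timestamps[1:]):
        if t1 - t0 >= min_gap:
            pauses.append((round(t1 - t0, 3), t0))
    return pauses

# A 14 s hole starting 0.1 s into the trace:
print(find_tx_pauses([0.0, 0.1, 14.1]))
```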

Hi,
    Thanks for reporting the problem.

> 
> We noticed that this was happening only in recent versions of the kernel, so we bisected until we found the culprit commits:
> we believe the bug was introduced in the qdisc updates for 5.14-rc1 by this set of commits, more precisely the middle one:
> 
> [2021-06-22] d3e0f57 Yunsheng Lin net: sched: remove qdisc->empty for lockless qdisc
> [2021-06-22] c4fef01 Yunsheng Lin net: sched: implement TCQ_F_CAN_BYPASS for lockless qdisc    *=> KO*
> [2021-06-22] dd25296 Yunsheng Lin net: sched: avoid unnecessary seqcount operation for lockless qdisc   *=> still OK*
> 
> As far as I can tell, the bug is still present in mainline (at least it was in 5.16-rc4).
> From what I understand / guess, some optimizations in lockless qdiscs have led to a race making the qdisc unaware that a packet has been enqueued and is waiting for emission.
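A lost-wakeup race of that shape can be illustrated with a hand-interleaved toy model (this is not the kernel code; the RUNNING flag and the check ordering are only loosely modeled on the lockless-qdisc state handling):

```python
def lost_wakeup(buggy_order: bool) -> bool:
    """Return True if an enqueued packet ends up stuck.

    Models two CPUs racing on a lockless qdisc: CPU0 runs the dequeue
    loop, CPU1 enqueues a new skb. With buggy_order=True, CPU1 reads
    the RUNNING flag in the window between CPU0's last emptiness check
    and CPU0 clearing the flag, so nobody reschedules the qdisc.
    """
    queue: list = []
    running = True           # CPU0's dequeue loop is active

    # CPU0: the queue looks empty, so it decides to stop running...
    assert len(queue) == 0

    if buggy_order:
        queue.append("skb")          # CPU1: enqueue a new packet
        cpu1_sees_running = running  # CPU1: True -> assumes CPU0 will send it
        running = False              # CPU0: ...and only now stops
    else:
        running = False              # CPU0: stops first (flag is visible)
        queue.append("skb")          # CPU1: enqueue a new packet
        cpu1_sees_running = running  # CPU1: False -> must reschedule

    if not cpu1_sees_running:
        running = True               # qdisc is rescheduled and runs again
        queue.pop()                  # the skb is transmitted

    return bool(queue) and not running   # stuck: queued but nobody running
```

With the buggy ordering the skb stays queued forever; with the safe ordering the enqueuing CPU notices the qdisc is idle and reschedules it.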
> I suspect that, when this happens with TCP packets "to be retransmitted", the TCP stack will not time out and retransmit again, because it uses skb_still_in_host_queue() to avoid useless re-retransmissions.
> From net/ipv4/tcp_output.c:
> /* Thanks to skb fast clones, we can detect if a prior transmit of
>  * a packet is still in a qdisc or driver queue.
>  * In this case, there is very little point doing a retransmit !
>  */
> I guess this plays a role in making these holes grow up to 14 s, and a layer other than TCP might not experience it (?).
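Back-of-the-envelope: if each retransmission attempt is skipped while the retransmission timer keeps doubling (classic exponential backoff), the cumulative wait crosses 14 s between the 6th and 7th backed-off timeout. This assumes a 200 ms initial RTO; the real value depends on the measured RTT, so this is purely illustrative:

```python
def rto_backoff_timeline(initial_rto=0.2, retries=8, max_rto=120.0):
    """Cumulative seconds elapsed after each skipped retransmission
    timeout, with the RTO doubling each time (capped at max_rto)."""
    rto, elapsed, timeline = initial_rto, 0.0, []
    for _ in range(retries):
        elapsed += rto
        timeline.append(round(elapsed, 1))
        rto = min(rto * 2, max_rto)
    return timeline

# Cumulative waits: 0.2, 0.6, 1.4, 3.0, 6.2, 12.6, 25.4, 51.0 seconds
print(rto_backoff_timeline())
```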
> 
> The interface through which my traffic is going is :
> eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
>     link/ether b8:ce:f6:60:c9:97 brd ff:ff:ff:ff:ff:ff
>     inet 10.20.0.254/24 scope global eth3
> 
> As you can see, it uses an mq qdisc. I did not try other qdiscs yet.
> 
> Finally, if/when troubleshooting this problem in kernels older than 5.14.7, it's a good idea to first cherry-pick this patch:
> [2021-09-09] ae66447 Keith Busch nvme-tcp: fix io_work priority inversion
> because it fixes an nvme-tcp bug whose performance impact is itself so big that it "hides" the one we've discovered (itself causing lots of holes at the nvme-tcp layer ...)
> 
> On impacted kernels, the "pkt_stuck_in_qdisc" bug shows up on the order of zero to a few occurrences per minute per TCP connection.

It seems the problem can be reproduced easily. Is there a testcase without
a hw dependency, so that I can reproduce the problem myself?

> 
> I did not really have time to understand it thoroughly, nor am I a network stack expert, so I don't propose any particular patch for it, but I'll be happy to help test fix attempts if that can help.
> Please feel free to ask any additional information.

Which CPU is used in the testing? It seems the CPU architecture's
memory semantics are important when handling this rare case.


> Best regards,
> 
> 
> *Vincent Ray*
> *Senior Architect • Kalray*
> Phone: +33 6 43 94 87 65
> _vray@kalrayinc.com_ • _www.kalrayinc.com_ <https://www.kalrayinc.com>
> 
> Intelligent Data Processing
> From Cloud to Edge
> 
> 
> *Please consider the environment before printing this e-mail.*
> This message contains information that may be privileged or confidential and is the property of Kalray S.A. It is intended only for the person to whom it is addressed. If you are not the intended recipient, you are not authorized to print, retain, copy, disseminate, distribute, or use this message or any part thereof. If you receive this message in error, please notify the sender immediately and delete all copies of this message.
> 


* Re: packet stuck in qdisc
  2022-01-28  2:36   ` packet stuck in qdisc Yunsheng Lin
@ 2022-01-28  8:58     ` Vincent Ray
  2022-01-28  9:59       ` Vincent Ray
  2022-03-25  6:16     ` Yunsheng Lin
  1 sibling, 1 reply; 28+ messages in thread
From: Vincent Ray @ 2022-01-28  8:58 UTC (permalink / raw)
  To: linyunsheng; +Cc: vladimir oltean, kuba, Samuel Jones, netdev, davem

[-- Attachment #1: Type: text/plain, Size: 7082 bytes --]

Hello,

> It seems the problem can be reproduced easily. Is there some testcase without
> hw dependency, so that I can reproduce the problem myself?

[VR]
Well, in my case, I reproduced it with the setup described below, which does include the Kalray board.
However, I am quite confident this has nothing to do with the board, nor with any particular host NIC / cable, so you can surely reproduce it in a different environment.
I think you just need to send some NVMe-over-TCP traffic to any target able to receive it. In fact, I suspect that any TCP traffic will do.
Attached is the fio test case we are running after doing the following nvme connect:
nvme connect -n nqn.2014-06.com.kalrayinc:nvme:ecca9057-4b59-5332-8d75-5acdcdd8a88e -t tcp -i 4 -a 10.20.0.1
And then simply:
fio ./fio_jobs_randwr4k_60s.spdk


> Which cpu is used in the testing? It seems the cpu arch's
> memory semantic is importance when handling the rare case.

[VR] 
It's x86_64.
FWIW, I've seen the problem happen both on a native host and within a VM (whether vCPUs are pinned or not, and whether fio threads are pinned or not).



[-- Attachment #2: fio_jobs_randwr4k_60s.spdk --]
[-- Type: application/octet-stream, Size: 2821 bytes --]

[job000]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=0

[job001]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=1

[job002]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=2

[job003]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=3

[job004]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=4

[job005]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=5

[job006]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=6

[job007]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=7

[job008]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=8

[job009]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=9

[job010]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=10

[job011]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=11

[job012]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=12

[job013]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=13

[job014]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=14

[job015]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=15


* Re: packet stuck in qdisc
  2022-01-28  8:58     ` Vincent Ray
@ 2022-01-28  9:59       ` Vincent Ray
  2022-01-29  6:53         ` Yunsheng Lin
  0 siblings, 1 reply; 28+ messages in thread
From: Vincent Ray @ 2022-01-28  9:59 UTC (permalink / raw)
  To: linyunsheng; +Cc: vladimir oltean, kuba, Samuel Jones, netdev, davem

Hmm ... sorry, actually I don't think I've seen this problem on a native host (because mine has an older kernel version and I did not change it), though I suppose it could happen there too.



* Re: packet stuck in qdisc
  2022-01-28  9:59       ` Vincent Ray
@ 2022-01-29  6:53         ` Yunsheng Lin
  2022-01-31 18:39           ` Vincent Ray
  0 siblings, 1 reply; 28+ messages in thread
From: Yunsheng Lin @ 2022-01-29  6:53 UTC (permalink / raw)
  To: Vincent Ray; +Cc: vladimir oltean, kuba, Samuel Jones, netdev, davem

[-- Attachment #1: Type: text/plain, Size: 8103 bytes --]

On 2022/1/28 17:59, Vincent Ray wrote:
> Hmm ... sorry, actually I don't think I've seen this pb on a native host (because mine has an older kernel version and I did not change it), though I suppose it could happen too.
> 

Hi,
  I have tested NVMe-over-TCP on a native arm64 host, which does not seem
to have the problem; the test log is in the attachment.

A debug patch in the attachment can be used to catch the
packet_stuck_in_qdisc problem; maybe give it a try.
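Independent of a kernel-side debug patch, a crude userspace check is to watch for a qdisc backlog that never drains on an otherwise idle connection, by periodically parsing the output of `tc -s qdisc show dev <iface>`. A sketch (the "backlog <bytes> <N>p" line format is assumed from iproute2):

```python
import re

def qdisc_backlog_pkts(tc_output: str) -> int:
    """Sum the queued-packet counts from `tc -s qdisc show` output.

    A backlog that stays non-zero while nothing is being sent is the
    symptom of a packet stuck in the qdisc.
    """
    return sum(int(m.group(1))
               for m in re.finditer(r"backlog\s+\S+\s+(\d+)p", tc_output))

sample = (
    "qdisc mq 0: root\n"
    " Sent 123456 bytes 800 pkt (dropped 0, overlimits 0 requeues 1)\n"
    " backlog 14Kb 9p requeues 1\n"
    "qdisc fq_codel 0: parent :1\n"
    " backlog 0b 0p requeues 0\n"
)
print(qdisc_backlog_pkts(sample))  # 9 packets still queued
```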


> 
> ----- Original Message -----
> From: "Vincent Ray" <vray@kalrayinc.com>
> To: "linyunsheng" <linyunsheng@huawei.com>
> Cc: "vladimir oltean" <vladimir.oltean@nxp.com>, "kuba" <kuba@kernel.org>, "Samuel Jones" <sjones@kalrayinc.com>, "netdev" <netdev@vger.kernel.org>, "davem" <davem@davemloft.net>
> Sent: Friday, January 28, 2022 9:58:05 AM
> Subject: Re: packet stuck in qdisc
> 
> Hello,
> 
> It seems the problem can be reproduced easily. Is there some testcase without
> hw dependency, so that I can reproduce the problem myself?
> 
> [VR]
> Well, in my case, I produced it with the setup described below, which does include the Kalray board.
> However, I am quite confident this has nothing to do with the board, not even with any particular Host NIC / cable, so that you can surely reproduce it in a different environment.
> I think you just need to send some NVME-over-TCP traffic to any target able to receive it. In fact I suspect that any TCP traffic will do.
> Attached is the fio test case we are running after doing the following nvme connect :
> nvme connect -n nqn.2014-06.com.kalrayinc:nvme:ecca9057-4b59-5332-8d75-5acdcdd8a88e -t tcp -i 4 -a 10.20.0.1'      
> And then simply
> fio ./fio_jobs_randwr4k_60s.spdk
> 
> 
> Which cpu is used in the testing? It seems the cpu arch's
> memory semantic is importance when handling the rare case.
> 
> [VR] 
> It's x86_64
> FWIW, I've seen the problem happen both on native host and whithin a VM (be vcpu pinned or not, be fio threads pinned or not).
> 
> 
> ----- Original Message -----
> From: "linyunsheng" <linyunsheng@huawei.com>
> To: "Vincent Ray" <vray@kalrayinc.com>, "vladimir oltean" <vladimir.oltean@nxp.com>, "kuba" <kuba@kernel.org>, "davem" <davem@davemloft.ne>
> Cc: "Samuel Jones" <sjones@kalrayinc.com>, "netdev" <netdev@vger.kernel.org>
> Sent: Friday, January 28, 2022 3:36:27 AM
> Subject: Re: packet stuck in qdisc
> 
> On 2022/1/25 20:55, Vincent Ray wrote:
>> Dear kernel maintainers / developers,
>>
>> I work at Kalray where we are developping an NVME-over-TCP target controller board.
>> My setup is as such :
>> - a development workstation running Linux 5.x.y (the host)
>> - sending NVME-TCP traffic to our board, to which it is connected through a Mellanox NIC (Connect-X-5) and a 100G ETH cable
>>
>> While doing performance tests, using simple fio scenarios running over the regular kernel nvme-tcp driver on the host, we noticed important performance variations.
>> After some digging (using tcpdump on the host), we found that there were big "holes" in the tcp traffic sent by the host.
>> The scenario we observed is the following :
>> 1) a TCP segment gets lost (not sent by the host) on a particular TCP connection, leading to a gap in the seq numbers received by the board
>> 2) the board sends dup-acks and/or sacks (if configured) to signal this loss
>> 3) then, sometimes, the host stops emitting on that TCP connection for several seconds (as much as 14s observed)
>> 4) finally the host resumes emission, sending the missing packet
>> 5) then the TCP connection continues correctly with the appropriate throughput
>>
>> Such a scenario can be observed in the attached tcpdump (+ comments).
> 
> Hi,
>     Thanks for reporting the problem.
> 
>>
>> We noticed that this was happening only in recent versions of the kernel, so we dichotomized until we found the culprit commits :
>> we believe that the bug was introduced in qdisc updates for 5.14.rc1 by this set of commits, more precisely the middle one :
>>
>> [2021-06-22] d3e0f57 Yunsheng Lin net: sched: remove qdisc->empty for lockless qdisc
>> [2021-06-22] c4fef01 Yunsheng Lin net: sched: implement TCQ_F_CAN_BYPASS for lockless qdisc    *=> KO*
>> [2021-06-22] dd25296 Yunsheng Lin net: sched: avoid unnecessary seqcount operation for lockless qdisc   *=> still OK*
>>
>> As far as I can tell, the bug is still present in the mainline (at least it was in 5.16-rc4).
>> From what I understand / guess, some optimizations in lockless qdiscs have lead to a race making the qdisc unaware that a packet has been enqueued and is waiting for emission.
>> I suspect that, when this happens with TCP packets "to be retransmitted", the TCP stack will not timeout and try to retransmitt again because it uses skb_still_in_host_queue() to avoid useless re-retransmissions
>> From net/ipv4/ tcp_output.c :
>> //* Thanks to skb fast clones, we can detect if a prior transmit of                                                                                                                                                   /
>> / * a packet is still in a qdisc or driver queue.                                                                                                                                                                     /
>> / * In this case, there is very little point doing a retransmit !                                                                                                                                                     /
>> / */  /
>> I guess this plays a role in making these holes grow up to 14s, and an other layer than TCP might not experience it (?).
>>
>> The interface through which my traffic is going is :
>> eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
>>     link/ether b8:ce:f6:60:c9:97 brd ff:ff:ff:ff:ff:ff
>>     inet 10.20.0.254/24 scope global eth3
>>
>> As you can see, it uses a mq qdisc. I did not try with other qdiscs yet.
>>
>> Finally, if/when troubleshooting this problem in kernels older than 5.14.7, it's a good idea to first cherry-pick this patch :
>> [2021-09-09] ae66447 Keith Busch nvme-tcp: fix io_work priority inversion
>> because it fixes a nvme-tcp bug whose performance impact is itself so big that it "hides" the one we've discovered (bringing itself lots of holes at the nvme-tcp layer ...)
>>
>> On impacted kernels, the "pkt_stuck_in_qdisc" bug shows up in the order of zero to a few occurences per minute per TCP connection.
> 
> It seems the problem can be reproduced easily. Is there some testcase without
> hw dependency, so that I can reproduce the problem myself?
> 
>>
>> I did not really have time to understand it thoroughly, nor am I a network stack expert, so I don't propose any particular patch for it but I'll be happy to help testing fix attempts if it can help.
>> Please feel free to ask any additional information.
> 
> Which cpu is used in the testing? It seems the cpu arch's
> memory semantic is importance when handling the rare case.
> 
> 
>> Best regards,
>>
>>
>> *Vincent Ray*
>> *Senior Architect • Kalray*
>> Phone: +33 6 43 94 87 65
>> _vray@kalrayinc.com_ • _www.kalrayinc.com_ <https://www.kalrayinc.com>
>>
>>
> 
> 
> 

[-- Attachment #2: board server_2022-01-29_11-20-29-fio_test.log --]
[-- Type: text/plain, Size: 210450 bytes --]

[BEGIN] 2022/1/29 11:20:41
root@(none):~# cat fio.spdk 
[job000]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=0

[job001]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=1

[job002]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=2

[job003]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=3

[job004]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=4

[job005]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=5

[job006]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=6

[job007]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=7

[job008]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=8

[job009]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=9

[job010]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=10

[job011]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=11

[job012]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=12

[job013]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=13

[job014]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=14

[job015]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=15
root@(none):~# fio fio.spdk 
job000: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job001: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job002: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job003: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job004: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job005: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job006: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job007: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job008: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job009: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job010: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job011: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job012: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job013: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job014: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job015: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
fio-3.1
Starting 16 threads
Jobs: 16 (f=16): [w(16)][100.0%][r=0KiB/s,w=2012KiB/s][r=0,w=503 IOPS][eta 00m:00s]
job000: (groupid=0, jobs=1): err= 0: pid=1374: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=143KiB/s (146kB/s)(8616KiB/60381msec)
    slat (nsec): min=2620, max=64560, avg=6973.12, stdev=1568.22
    clat (usec): min=202, max=3232.7k, avg=448337.42, stdev=267922.27
     lat (usec): min=267, max=3232.7k, avg=448344.60, stdev=267922.39
    clat percentiles (msec):
     |  1.00th=[   21],  5.00th=[   44], 10.00th=[   73], 20.00th=[  194],
     | 30.00th=[  279], 40.00th=[  368], 50.00th=[  451], 60.00th=[  527],
     | 70.00th=[  617], 80.00th=[  693], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  936], 99.50th=[ 1099], 99.90th=[ 2005], 99.95th=[ 2601],
     | 99.99th=[ 3239]
   bw (  KiB/s): min=   64, max= 1144, per=6.22%, avg=142.52, stdev=122.71, samples=120
   iops        : min=   16, max=  286, avg=35.62, stdev=30.68, samples=120
  lat (usec)   : 250=0.05%
  lat (msec)   : 20=0.84%, 50=4.87%, 100=6.22%, 250=14.07%, 500=29.62%
  lat (msec)   : 750=31.48%, 1000=12.12%, 2000=0.60%, >=2000=0.14%
  cpu          : usr=0.00%, sys=0.03%, ctx=2166, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2154,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job001: (groupid=0, jobs=1): err= 0: pid=1375: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=142KiB/s (146kB/s)(8608KiB/60413msec)
    slat (nsec): min=2910, max=46710, avg=7082.46, stdev=1181.65
    clat (msec): min=9, max=3182, avg=448.97, stdev=284.65
     lat (msec): min=9, max=3182, avg=448.98, stdev=284.65
    clat percentiles (msec):
     |  1.00th=[   20],  5.00th=[   41], 10.00th=[   74], 20.00th=[  190],
     | 30.00th=[  275], 40.00th=[  359], 50.00th=[  439], 60.00th=[  523],
     | 70.00th=[  609], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[  995], 99.50th=[ 1586], 99.90th=[ 2534], 99.95th=[ 2601],
     | 99.99th=[ 3171]
   bw (  KiB/s): min=   56, max= 1152, per=6.22%, avg=142.40, stdev=119.55, samples=120
   iops        : min=   14, max=  288, avg=35.60, stdev=29.89, samples=120
  lat (msec)   : 10=0.05%, 20=1.07%, 50=5.58%, 100=5.02%, 250=15.89%
  lat (msec)   : 500=30.20%, 750=28.07%, 1000=13.15%, 2000=0.70%, >=2000=0.28%
  cpu          : usr=0.01%, sys=0.03%, ctx=2164, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2152,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job002: (groupid=0, jobs=1): err= 0: pid=1376: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=143KiB/s (147kB/s)(8656KiB/60466msec)
    slat (nsec): min=2580, max=30880, avg=6461.90, stdev=727.08
    clat (msec): min=10, max=3021, avg=446.86, stdev=275.34
     lat (msec): min=10, max=3021, avg=446.86, stdev=275.34
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   46], 10.00th=[   75], 20.00th=[  190],
     | 30.00th=[  275], 40.00th=[  355], 50.00th=[  439], 60.00th=[  514],
     | 70.00th=[  609], 80.00th=[  684], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  969], 99.50th=[ 1519], 99.90th=[ 2333], 99.95th=[ 2735],
     | 99.99th=[ 3037]
   bw (  KiB/s): min=   56, max= 1168, per=6.25%, avg=143.20, stdev=118.89, samples=120
   iops        : min=   14, max=  292, avg=35.80, stdev=29.72, samples=120
  lat (msec)   : 20=1.20%, 50=4.53%, 100=5.82%, 250=14.74%, 500=31.24%
  lat (msec)   : 750=29.57%, 1000=12.06%, 2000=0.69%, >=2000=0.14%
  cpu          : usr=0.01%, sys=0.02%, ctx=2168, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2164,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job003: (groupid=0, jobs=1): err= 0: pid=1377: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=144KiB/s (148kB/s)(8716KiB/60465msec)
    slat (nsec): min=2620, max=56210, avg=7801.05, stdev=1380.43
    clat (usec): min=235, max=4327.5k, avg=443765.83, stdev=291780.83
     lat (usec): min=292, max=4327.5k, avg=443773.86, stdev=291780.98
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   44], 10.00th=[   73], 20.00th=[  182],
     | 30.00th=[  271], 40.00th=[  342], 50.00th=[  430], 60.00th=[  527],
     | 70.00th=[  600], 80.00th=[  693], 90.00th=[  785], 95.00th=[  827],
     | 99.00th=[  944], 99.50th=[ 1301], 99.90th=[ 2635], 99.95th=[ 3876],
     | 99.99th=[ 4329]
   bw (  KiB/s): min=   64, max= 1256, per=6.30%, avg=144.18, stdev=127.61, samples=120
   iops        : min=   16, max=  314, avg=36.03, stdev=31.90, samples=120
  lat (usec)   : 250=0.05%, 1000=0.05%
  lat (msec)   : 4=0.05%, 10=0.09%, 20=1.51%, 50=4.73%, 100=5.92%
  lat (msec)   : 250=15.01%, 500=29.46%, 750=29.97%, 1000=12.30%, 2000=0.60%
  lat (msec)   : >=2000=0.28%
  cpu          : usr=0.01%, sys=0.03%, ctx=2188, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2179,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job004: (groupid=0, jobs=1): err= 0: pid=1378: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=144KiB/s (148kB/s)(8716KiB/60407msec)
    slat (nsec): min=2440, max=79420, avg=6217.45, stdev=1652.58
    clat (usec): min=386, max=2823.0k, avg=443380.65, stdev=272277.40
     lat (usec): min=466, max=2823.0k, avg=443387.05, stdev=272277.42
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   40], 10.00th=[   70], 20.00th=[  199],
     | 30.00th=[  275], 40.00th=[  351], 50.00th=[  435], 60.00th=[  523],
     | 70.00th=[  600], 80.00th=[  684], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  927], 99.50th=[ 1469], 99.90th=[ 2106], 99.95th=[ 2567],
     | 99.99th=[ 2836]
   bw (  KiB/s): min=   56, max= 1216, per=6.30%, avg=144.17, stdev=126.51, samples=120
   iops        : min=   14, max=  304, avg=36.02, stdev=31.63, samples=120
  lat (usec)   : 500=0.05%
  lat (msec)   : 4=0.05%, 10=0.14%, 20=1.65%, 50=4.86%, 100=5.42%
  lat (msec)   : 250=14.69%, 500=30.79%, 750=29.60%, 1000=11.98%, 2000=0.55%
  lat (msec)   : >=2000=0.23%
  cpu          : usr=0.01%, sys=0.02%, ctx=2184, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2179,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job005: (groupid=0, jobs=1): err= 0: pid=1379: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=144KiB/s (148kB/s)(8720KiB/60442msec)
    slat (nsec): min=2520, max=32050, avg=7001.39, stdev=1241.59
    clat (msec): min=4, max=2307, avg=443.41, stdev=268.39
     lat (msec): min=4, max=2307, avg=443.42, stdev=268.39
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   44], 10.00th=[   73], 20.00th=[  188],
     | 30.00th=[  271], 40.00th=[  355], 50.00th=[  430], 60.00th=[  518],
     | 70.00th=[  600], 80.00th=[  693], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  953], 99.50th=[ 1284], 99.90th=[ 2165], 99.95th=[ 2232],
     | 99.99th=[ 2299]
   bw (  KiB/s): min=   64, max= 1200, per=6.30%, avg=144.23, stdev=121.99, samples=120
   iops        : min=   16, max=  300, avg=36.03, stdev=30.50, samples=120
  lat (msec)   : 10=0.05%, 20=1.42%, 50=4.77%, 100=5.78%, 250=15.96%
  lat (msec)   : 500=29.95%, 750=29.72%, 1000=11.61%, 2000=0.55%, >=2000=0.18%
  cpu          : usr=0.01%, sys=0.02%, ctx=2195, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2180,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job006: (groupid=0, jobs=1): err= 0: pid=1380: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=143KiB/s (146kB/s)(8628KiB/60435msec)
    slat (nsec): min=4010, max=38940, avg=6295.87, stdev=866.71
    clat (msec): min=2, max=2437, avg=448.09, stdev=267.61
     lat (msec): min=2, max=2437, avg=448.09, stdev=267.61
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   40], 10.00th=[   72], 20.00th=[  190],
     | 30.00th=[  279], 40.00th=[  372], 50.00th=[  451], 60.00th=[  527],
     | 70.00th=[  609], 80.00th=[  693], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[  961], 99.50th=[ 1150], 99.90th=[ 1972], 99.95th=[ 2433],
     | 99.99th=[ 2433]
   bw (  KiB/s): min=   48, max= 1232, per=6.23%, avg=142.72, stdev=125.77, samples=120
   iops        : min=   12, max=  308, avg=35.67, stdev=31.45, samples=120
  lat (msec)   : 4=0.05%, 10=0.09%, 20=1.11%, 50=5.42%, 100=5.56%
  lat (msec)   : 250=14.74%, 500=29.86%, 750=29.11%, 1000=13.26%, 2000=0.70%
  lat (msec)   : >=2000=0.09%
  cpu          : usr=0.01%, sys=0.02%, ctx=2166, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2157,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job007: (groupid=0, jobs=1): err= 0: pid=1381: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=144KiB/s (148kB/s)(8728KiB/60469msec)
    slat (nsec): min=2800, max=78630, avg=6825.65, stdev=1746.73
    clat (usec): min=1538, max=2636.5k, avg=443178.69, stdev=270888.61
     lat (usec): min=1617, max=2636.5k, avg=443185.73, stdev=270888.68
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   39], 10.00th=[   67], 20.00th=[  186],
     | 30.00th=[  271], 40.00th=[  347], 50.00th=[  435], 60.00th=[  527],
     | 70.00th=[  617], 80.00th=[  693], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[  953], 99.50th=[ 1250], 99.90th=[ 2232], 99.95th=[ 2467],
     | 99.99th=[ 2635]
   bw (  KiB/s): min=   48, max= 1328, per=6.30%, avg=144.38, stdev=132.70, samples=120
   iops        : min=   12, max=  332, avg=36.08, stdev=33.18, samples=120
  lat (msec)   : 2=0.05%, 10=0.09%, 20=1.28%, 50=5.50%, 100=5.55%
  lat (msec)   : 250=14.67%, 500=30.34%, 750=29.10%, 1000=12.60%, 2000=0.69%
  lat (msec)   : >=2000=0.14%
  cpu          : usr=0.01%, sys=0.02%, ctx=2202, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2182,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job008: (groupid=0, jobs=1): err= 0: pid=1382: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=141KiB/s (145kB/s)(8548KiB/60425msec)
    slat (nsec): min=2350, max=25210, avg=6648.75, stdev=677.19
    clat (msec): min=3, max=2657, avg=452.20, stdev=274.45
     lat (msec): min=3, max=2658, avg=452.20, stdev=274.45
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   45], 10.00th=[   78], 20.00th=[  192],
     | 30.00th=[  275], 40.00th=[  363], 50.00th=[  447], 60.00th=[  531],
     | 70.00th=[  617], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[ 1011], 99.50th=[ 1234], 99.90th=[ 2500], 99.95th=[ 2534],
     | 99.99th=[ 2668]
   bw (  KiB/s): min=   40, max= 1176, per=6.18%, avg=141.47, stdev=119.64, samples=120
   iops        : min=   10, max=  294, avg=35.37, stdev=29.91, samples=120
  lat (msec)   : 4=0.05%, 10=0.05%, 20=1.17%, 50=4.54%, 100=5.85%
  lat (msec)   : 250=15.40%, 500=29.53%, 750=29.67%, 1000=12.63%, 2000=0.89%
  lat (msec)   : >=2000=0.23%
  cpu          : usr=0.00%, sys=0.03%, ctx=2145, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2137,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job009: (groupid=0, jobs=1): err= 0: pid=1383: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=146KiB/s (149kB/s)(8816KiB/60418msec)
    slat (nsec): min=2700, max=47330, avg=6945.72, stdev=1046.36
    clat (msec): min=4, max=2775, avg=438.41, stdev=275.59
     lat (msec): min=4, max=2775, avg=438.42, stdev=275.59
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   47], 10.00th=[   77], 20.00th=[  188],
     | 30.00th=[  262], 40.00th=[  351], 50.00th=[  439], 60.00th=[  506],
     | 70.00th=[  584], 80.00th=[  667], 90.00th=[  768], 95.00th=[  827],
     | 99.00th=[  969], 99.50th=[ 1569], 99.90th=[ 2433], 99.95th=[ 2702],
     | 99.99th=[ 2769]
   bw (  KiB/s): min=   80, max= 1256, per=6.37%, avg=145.87, stdev=124.92, samples=120
   iops        : min=   20, max=  314, avg=36.47, stdev=31.23, samples=120
  lat (msec)   : 10=0.09%, 20=1.32%, 50=4.36%, 100=6.08%, 250=16.97%
  lat (msec)   : 500=29.95%, 750=30.13%, 1000=10.12%, 2000=0.68%, >=2000=0.32%
  cpu          : usr=0.00%, sys=0.04%, ctx=2218, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2204,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job010: (groupid=0, jobs=1): err= 0: pid=1384: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=143KiB/s (147kB/s)(8668KiB/60443msec)
    slat (nsec): min=2400, max=47130, avg=6453.77, stdev=1001.36
    clat (msec): min=4, max=2477, avg=446.08, stdev=266.52
     lat (msec): min=4, max=2477, avg=446.08, stdev=266.52
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   40], 10.00th=[   69], 20.00th=[  188],
     | 30.00th=[  279], 40.00th=[  359], 50.00th=[  451], 60.00th=[  531],
     | 70.00th=[  609], 80.00th=[  693], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[  961], 99.50th=[ 1267], 99.90th=[ 2072], 99.95th=[ 2299],
     | 99.99th=[ 2467]
   bw (  KiB/s): min=   80, max= 1296, per=6.26%, avg=143.40, stdev=132.23, samples=120
   iops        : min=   20, max=  324, avg=35.85, stdev=33.06, samples=120
  lat (msec)   : 10=0.09%, 20=1.34%, 50=5.63%, 100=5.68%, 250=13.80%
  lat (msec)   : 500=30.00%, 750=30.69%, 1000=11.91%, 2000=0.74%, >=2000=0.14%
  cpu          : usr=0.01%, sys=0.02%, ctx=2175, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2167,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job011: (groupid=0, jobs=1): err= 0: pid=1385: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=142KiB/s (146kB/s)(8600KiB/60451msec)
    slat (nsec): min=2820, max=46210, avg=7015.60, stdev=1116.46
    clat (msec): min=2, max=3147, avg=449.67, stdev=284.51
     lat (msec): min=2, max=3147, avg=449.68, stdev=284.51
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   44], 10.00th=[   74], 20.00th=[  182],
     | 30.00th=[  279], 40.00th=[  359], 50.00th=[  443], 60.00th=[  523],
     | 70.00th=[  609], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[  978], 99.50th=[ 1469], 99.90th=[ 2903], 99.95th=[ 2937],
     | 99.99th=[ 3138]
   bw (  KiB/s): min=   56, max= 1200, per=6.21%, avg=142.27, stdev=122.46, samples=120
   iops        : min=   14, max=  300, avg=35.57, stdev=30.61, samples=120
  lat (msec)   : 4=0.05%, 20=1.16%, 50=4.60%, 100=6.19%, 250=14.93%
  lat (msec)   : 500=30.28%, 750=28.93%, 1000=12.88%, 2000=0.74%, >=2000=0.23%
  cpu          : usr=0.02%, sys=0.01%, ctx=2173, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2150,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job012: (groupid=0, jobs=1): err= 0: pid=1386: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=142KiB/s (146kB/s)(8596KiB/60463msec)
    slat (nsec): min=2530, max=39440, avg=6624.45, stdev=863.91
    clat (usec): min=433, max=4594.3k, avg=449973.00, stdev=287748.67
     lat (usec): min=480, max=4594.3k, avg=449979.80, stdev=287748.72
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   41], 10.00th=[   74], 20.00th=[  199],
     | 30.00th=[  279], 40.00th=[  355], 50.00th=[  439], 60.00th=[  523],
     | 70.00th=[  609], 80.00th=[  693], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[ 1116], 99.50th=[ 1368], 99.90th=[ 2467], 99.95th=[ 2903],
     | 99.99th=[ 4597]
   bw (  KiB/s): min=   32, max= 1120, per=6.21%, avg=142.20, stdev=118.91, samples=120
   iops        : min=    8, max=  280, avg=35.55, stdev=29.73, samples=120
  lat (usec)   : 500=0.05%
  lat (msec)   : 10=0.09%, 20=1.12%, 50=5.21%, 100=5.21%, 250=15.08%
  lat (msec)   : 500=30.43%, 750=29.69%, 1000=11.96%, 2000=0.88%, >=2000=0.28%
  cpu          : usr=0.00%, sys=0.03%, ctx=2155, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2149,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job013: (groupid=0, jobs=1): err= 0: pid=1387: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=143KiB/s (146kB/s)(8632KiB/60445msec)
    slat (nsec): min=2380, max=37540, avg=7318.07, stdev=1044.01
    clat (usec): min=1726, max=2985.8k, avg=447946.88, stdev=266479.06
     lat (usec): min=1771, max=2985.8k, avg=447954.52, stdev=266479.15
    clat percentiles (msec):
     |  1.00th=[   20],  5.00th=[   43], 10.00th=[   74], 20.00th=[  197],
     | 30.00th=[  279], 40.00th=[  363], 50.00th=[  451], 60.00th=[  527],
     | 70.00th=[  617], 80.00th=[  693], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  919], 99.50th=[ 1150], 99.90th=[ 2039], 99.95th=[ 2735],
     | 99.99th=[ 2970]
   bw (  KiB/s): min=   72, max= 1232, per=6.24%, avg=142.80, stdev=123.80, samples=120
   iops        : min=   18, max=  308, avg=35.70, stdev=30.95, samples=120
  lat (msec)   : 2=0.05%, 10=0.05%, 20=1.02%, 50=5.51%, 100=5.42%
  lat (msec)   : 250=14.37%, 500=29.70%, 750=30.72%, 1000=12.47%, 2000=0.56%
  lat (msec)   : >=2000=0.14%
  cpu          : usr=0.01%, sys=0.02%, ctx=2178, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2158,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job014: (groupid=0, jobs=1): err= 0: pid=1388: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=140KiB/s (144kB/s)(8492KiB/60468msec)
    slat (nsec): min=2350, max=40180, avg=6659.86, stdev=864.58
    clat (usec): min=522, max=4966.6k, avg=455499.61, stdev=305720.10
     lat (usec): min=570, max=4966.6k, avg=455506.44, stdev=305720.17
    clat percentiles (msec):
     |  1.00th=[   20],  5.00th=[   44], 10.00th=[   70], 20.00th=[  188],
     | 30.00th=[  271], 40.00th=[  359], 50.00th=[  447], 60.00th=[  542],
     | 70.00th=[  617], 80.00th=[  701], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[ 1099], 99.50th=[ 1653], 99.90th=[ 2970], 99.95th=[ 4010],
     | 99.99th=[ 5000]
   bw (  KiB/s): min=   72, max= 1272, per=6.13%, avg=140.47, stdev=128.42, samples=120
   iops        : min=   18, max=  318, avg=35.12, stdev=32.10, samples=120
  lat (usec)   : 750=0.05%
  lat (msec)   : 20=1.22%, 50=5.09%, 100=6.26%, 250=15.03%, 500=28.40%
  lat (msec)   : 750=29.35%, 1000=13.38%, 2000=0.99%, >=2000=0.24%
  cpu          : usr=0.01%, sys=0.02%, ctx=2126, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2123,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job015: (groupid=0, jobs=1): err= 0: pid=1389: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=145KiB/s (149kB/s)(8776KiB/60439msec)
    slat (nsec): min=2610, max=40250, avg=7126.12, stdev=1121.55
    clat (usec): min=630, max=3019.9k, avg=440554.76, stdev=265226.37
     lat (usec): min=677, max=3019.9k, avg=440562.13, stdev=265226.43
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   41], 10.00th=[   70], 20.00th=[  194],
     | 30.00th=[  279], 40.00th=[  355], 50.00th=[  435], 60.00th=[  514],
     | 70.00th=[  592], 80.00th=[  676], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  944], 99.50th=[ 1401], 99.90th=[ 1670], 99.95th=[ 2534],
     | 99.99th=[ 3004]
   bw (  KiB/s): min=   32, max= 1208, per=6.34%, avg=145.20, stdev=125.41, samples=120
   iops        : min=    8, max=  302, avg=36.30, stdev=31.35, samples=120
  lat (usec)   : 750=0.05%
  lat (msec)   : 4=0.05%, 10=0.05%, 20=1.19%, 50=5.29%, 100=5.52%
  lat (msec)   : 250=14.77%, 500=31.49%, 750=29.08%, 1000=11.90%, 2000=0.55%
  lat (msec)   : >=2000=0.09%
  cpu          : usr=0.02%, sys=0.02%, ctx=2214, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2194,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=2291KiB/s (2346kB/s), 140KiB/s-146KiB/s (144kB/s-149kB/s), io=135MiB (142MB), run=60381-60469msec

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
root@(none):~# fio fio.spdk 
job000: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job001: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job002: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job003: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job004: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job005: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job006: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job007: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job008: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job009: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job010: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job011: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job012: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job013: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job014: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job015: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
fio-3.1
Starting 16 threads
Jobs: 16 (f=16): [w(16)][100.0%][r=0KiB/s,w=2092KiB/s][r=0,w=523 IOPS][eta 00m:00s]
job000: (groupid=0, jobs=1): err= 0: pid=1392: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=131KiB/s (134kB/s)(7920KiB/60458msec)
    slat (nsec): min=2510, max=52450, avg=6720.86, stdev=1426.08
    clat (usec): min=267, max=3352.4k, avg=488306.04, stdev=256647.92
     lat (usec): min=321, max=3352.4k, avg=488312.97, stdev=256647.90
    clat percentiles (msec):
     |  1.00th=[   20],  5.00th=[  142], 10.00th=[  178], 20.00th=[  257],
     | 30.00th=[  334], 40.00th=[  405], 50.00th=[  481], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  701], 90.00th=[  785], 95.00th=[  844],
     | 99.00th=[ 1070], 99.50th=[ 1401], 99.90th=[ 2668], 99.95th=[ 3339],
     | 99.99th=[ 3339]
   bw (  KiB/s): min=   64, max=  504, per=6.29%, avg=130.93, stdev=41.08, samples=120
   iops        : min=   16, max=  126, avg=32.73, stdev=10.27, samples=120
  lat (usec)   : 500=0.10%, 1000=0.05%
  lat (msec)   : 2=0.05%, 4=0.25%, 10=0.15%, 20=0.45%, 50=0.66%
  lat (msec)   : 100=1.06%, 250=16.46%, 500=34.14%, 750=32.47%, 1000=12.88%
  lat (msec)   : 2000=1.01%, >=2000=0.25%
  cpu          : usr=0.01%, sys=0.02%, ctx=1993, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1980,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job001: (groupid=0, jobs=1): err= 0: pid=1393: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=128KiB/s (131kB/s)(7752KiB/60471msec)
    slat (nsec): min=4620, max=48940, avg=6718.31, stdev=1197.97
    clat (usec): min=354, max=3270.3k, avg=498998.78, stdev=272368.89
     lat (usec): min=404, max=3270.3k, avg=499005.72, stdev=272368.85
    clat percentiles (msec):
     |  1.00th=[   23],  5.00th=[  133], 10.00th=[  178], 20.00th=[  257],
     | 30.00th=[  334], 40.00th=[  414], 50.00th=[  493], 60.00th=[  567],
     | 70.00th=[  651], 80.00th=[  718], 90.00th=[  802], 95.00th=[  844],
     | 99.00th=[ 1099], 99.50th=[ 1636], 99.90th=[ 2735], 99.95th=[ 3272],
     | 99.99th=[ 3272]
   bw (  KiB/s): min=   72, max=  624, per=6.16%, avg=128.13, stdev=49.69, samples=120
   iops        : min=   18, max=  156, avg=32.02, stdev=12.42, samples=120
  lat (usec)   : 500=0.05%, 750=0.05%
  lat (msec)   : 2=0.21%, 4=0.21%, 10=0.10%, 20=0.36%, 50=0.88%
  lat (msec)   : 100=1.75%, 250=15.02%, 500=32.30%, 750=33.23%, 1000=14.45%
  lat (msec)   : 2000=0.93%, >=2000=0.46%
  cpu          : usr=0.00%, sys=0.03%, ctx=1954, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1938,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job002: (groupid=0, jobs=1): err= 0: pid=1394: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=130KiB/s (133kB/s)(7864KiB/60419msec)
    slat (nsec): min=2690, max=26850, avg=6769.13, stdev=982.83
    clat (usec): min=1372, max=3288.0k, avg=491496.20, stdev=260042.95
     lat (usec): min=1404, max=3288.0k, avg=491503.21, stdev=260042.96
    clat percentiles (msec):
     |  1.00th=[   39],  5.00th=[  142], 10.00th=[  180], 20.00th=[  251],
     | 30.00th=[  330], 40.00th=[  405], 50.00th=[  481], 60.00th=[  567],
     | 70.00th=[  634], 80.00th=[  718], 90.00th=[  793], 95.00th=[  852],
     | 99.00th=[ 1053], 99.50th=[ 1401], 99.90th=[ 3004], 99.95th=[ 3306],
     | 99.99th=[ 3306]
   bw (  KiB/s): min=   88, max=  488, per=6.25%, avg=130.00, stdev=37.20, samples=120
   iops        : min=   22, max=  122, avg=32.50, stdev= 9.30, samples=120
  lat (msec)   : 2=0.05%, 20=0.20%, 50=1.32%, 100=1.22%, 250=17.14%
  lat (msec)   : 500=32.96%, 750=31.43%, 1000=14.39%, 2000=1.07%, >=2000=0.20%
  cpu          : usr=0.01%, sys=0.02%, ctx=1973, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1966,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job003: (groupid=0, jobs=1): err= 0: pid=1395: Sat Jan 29 03:19:28 2022
  write: IOPS=31, BW=127KiB/s (130kB/s)(7656KiB/60449msec)
    slat (nsec): min=2510, max=51640, avg=7222.55, stdev=1436.91
    clat (msec): min=3, max=3658, avg=505.09, stdev=281.43
     lat (msec): min=3, max=3658, avg=505.10, stdev=281.43
    clat percentiles (msec):
     |  1.00th=[   35],  5.00th=[  148], 10.00th=[  186], 20.00th=[  262],
     | 30.00th=[  342], 40.00th=[  426], 50.00th=[  493], 60.00th=[  575],
     | 70.00th=[  651], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1083], 99.50th=[ 1720], 99.90th=[ 3373], 99.95th=[ 3675],
     | 99.99th=[ 3675]
   bw (  KiB/s): min=   72, max=  528, per=6.08%, avg=126.53, stdev=42.53, samples=120
   iops        : min=   18, max=  132, avg=31.63, stdev=10.63, samples=120
  lat (msec)   : 4=0.05%, 10=0.10%, 20=0.37%, 50=0.99%, 100=1.62%
  lat (msec)   : 250=15.52%, 500=32.39%, 750=33.59%, 1000=13.85%, 2000=1.10%
  lat (msec)   : >=2000=0.42%
  cpu          : usr=0.01%, sys=0.02%, ctx=1928, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1914,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job004: (groupid=0, jobs=1): err= 0: pid=1396: Sat Jan 29 03:19:28 2022
  write: IOPS=31, BW=127KiB/s (130kB/s)(7692KiB/60398msec)
    slat (nsec): min=2430, max=41440, avg=6762.38, stdev=983.13
    clat (msec): min=5, max=3863, avg=502.31, stdev=284.93
     lat (msec): min=5, max=3863, avg=502.32, stdev=284.93
    clat percentiles (msec):
     |  1.00th=[   32],  5.00th=[  138], 10.00th=[  182], 20.00th=[  266],
     | 30.00th=[  347], 40.00th=[  422], 50.00th=[  493], 60.00th=[  567],
     | 70.00th=[  642], 80.00th=[  718], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[ 1083], 99.50th=[ 1737], 99.90th=[ 3809], 99.95th=[ 3876],
     | 99.99th=[ 3876]
   bw (  KiB/s): min=   48, max=  536, per=6.11%, avg=127.13, stdev=42.93, samples=120
   iops        : min=   12, max=  134, avg=31.78, stdev=10.73, samples=120
  lat (msec)   : 10=0.10%, 20=0.26%, 50=1.30%, 100=1.46%, 250=15.50%
  lat (msec)   : 500=32.61%, 750=33.75%, 1000=13.62%, 2000=1.14%, >=2000=0.26%
  cpu          : usr=0.00%, sys=0.02%, ctx=1929, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1923,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job005: (groupid=0, jobs=1): err= 0: pid=1397: Sat Jan 29 03:19:28 2022
  write: IOPS=33, BW=132KiB/s (135kB/s)(7996KiB/60440msec)
    slat (nsec): min=2380, max=42580, avg=7240.90, stdev=1366.94
    clat (msec): min=4, max=2363, avg=483.54, stdev=253.74
     lat (msec): min=4, max=2363, avg=483.55, stdev=253.74
    clat percentiles (msec):
     |  1.00th=[   27],  5.00th=[  131], 10.00th=[  174], 20.00th=[  247],
     | 30.00th=[  317], 40.00th=[  401], 50.00th=[  477], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  701], 90.00th=[  785], 95.00th=[  844],
     | 99.00th=[ 1045], 99.50th=[ 1603], 99.90th=[ 2123], 99.95th=[ 2366],
     | 99.99th=[ 2366]
   bw (  KiB/s): min=   64, max=  648, per=6.36%, avg=132.20, stdev=52.29, samples=120
   iops        : min=   16, max=  162, avg=33.05, stdev=13.07, samples=120
  lat (msec)   : 10=0.10%, 20=0.45%, 50=1.35%, 100=1.80%, 250=17.01%
  lat (msec)   : 500=31.97%, 750=33.12%, 1000=12.96%, 2000=1.10%, >=2000=0.15%
  cpu          : usr=0.00%, sys=0.03%, ctx=2004, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1999,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job006: (groupid=0, jobs=1): err= 0: pid=1398: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=130KiB/s (133kB/s)(7840KiB/60473msec)
    slat (nsec): min=2430, max=42960, avg=6686.17, stdev=1014.28
    clat (msec): min=4, max=2771, avg=493.43, stdev=264.04
     lat (msec): min=4, max=2771, avg=493.44, stdev=264.04
    clat percentiles (msec):
     |  1.00th=[   41],  5.00th=[  142], 10.00th=[  174], 20.00th=[  251],
     | 30.00th=[  330], 40.00th=[  409], 50.00th=[  485], 60.00th=[  558],
     | 70.00th=[  642], 80.00th=[  709], 90.00th=[  793], 95.00th=[  852],
     | 99.00th=[ 1070], 99.50th=[ 1854], 99.90th=[ 2232], 99.95th=[ 2769],
     | 99.99th=[ 2769]
   bw (  KiB/s): min=   64, max=  512, per=6.23%, avg=129.60, stdev=40.94, samples=120
   iops        : min=   16, max=  128, avg=32.40, stdev=10.24, samples=120
  lat (msec)   : 10=0.15%, 20=0.15%, 50=0.92%, 100=1.58%, 250=17.09%
  lat (msec)   : 500=31.63%, 750=33.57%, 1000=13.57%, 2000=1.02%, >=2000=0.31%
  cpu          : usr=0.01%, sys=0.02%, ctx=1967, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1960,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job007: (groupid=0, jobs=1): err= 0: pid=1399: Sat Jan 29 03:19:28 2022
  write: IOPS=33, BW=135KiB/s (138kB/s)(8164KiB/60452msec)
    slat (nsec): min=2620, max=29500, avg=7177.42, stdev=852.94
    clat (msec): min=5, max=3289, avg=473.68, stdev=247.63
     lat (msec): min=6, max=3289, avg=473.68, stdev=247.63
    clat percentiles (msec):
     |  1.00th=[   29],  5.00th=[  138], 10.00th=[  171], 20.00th=[  241],
     | 30.00th=[  313], 40.00th=[  384], 50.00th=[  464], 60.00th=[  542],
     | 70.00th=[  617], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[  995], 99.50th=[ 1183], 99.90th=[ 1854], 99.95th=[ 2299],
     | 99.99th=[ 3306]
   bw (  KiB/s): min=   72, max=  600, per=6.49%, avg=135.00, stdev=48.36, samples=120
   iops        : min=   18, max=  150, avg=33.75, stdev=12.09, samples=120
  lat (msec)   : 10=0.05%, 20=0.54%, 50=1.32%, 100=1.32%, 250=17.54%
  lat (msec)   : 500=34.44%, 750=31.06%, 1000=12.84%, 2000=0.78%, >=2000=0.10%
  cpu          : usr=0.00%, sys=0.03%, ctx=2056, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2041,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job008: (groupid=0, jobs=1): err= 0: pid=1400: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=130KiB/s (133kB/s)(7872KiB/60465msec)
    slat (nsec): min=2410, max=65250, avg=6716.78, stdev=1466.53
    clat (msec): min=4, max=3512, avg=491.38, stdev=256.63
     lat (msec): min=4, max=3512, avg=491.39, stdev=256.63
    clat percentiles (msec):
     |  1.00th=[   27],  5.00th=[  140], 10.00th=[  182], 20.00th=[  253],
     | 30.00th=[  330], 40.00th=[  409], 50.00th=[  485], 60.00th=[  567],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  802], 95.00th=[  844],
     | 99.00th=[ 1011], 99.50th=[ 1552], 99.90th=[ 2333], 99.95th=[ 3507],
     | 99.99th=[ 3507]
   bw (  KiB/s): min=   64, max=  544, per=6.26%, avg=130.13, stdev=43.84, samples=120
   iops        : min=   16, max=  136, avg=32.53, stdev=10.96, samples=120
  lat (msec)   : 10=0.15%, 20=0.41%, 50=1.17%, 100=1.47%, 250=16.36%
  lat (msec)   : 500=32.98%, 750=32.37%, 1000=13.97%, 2000=0.97%, >=2000=0.15%
  cpu          : usr=0.01%, sys=0.02%, ctx=1974, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1968,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job009: (groupid=0, jobs=1): err= 0: pid=1401: Sat Jan 29 03:19:28 2022
  write: IOPS=33, BW=133KiB/s (136kB/s)(8028KiB/60463msec)
    slat (nsec): min=2440, max=54960, avg=7115.32, stdev=1273.86
    clat (msec): min=4, max=3361, avg=481.81, stdev=261.19
     lat (msec): min=4, max=3361, avg=481.82, stdev=261.19
    clat percentiles (msec):
     |  1.00th=[   29],  5.00th=[  138], 10.00th=[  176], 20.00th=[  251],
     | 30.00th=[  326], 40.00th=[  397], 50.00th=[  468], 60.00th=[  542],
     | 70.00th=[  617], 80.00th=[  693], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  995], 99.50th=[ 1368], 99.90th=[ 2433], 99.95th=[ 2836],
     | 99.99th=[ 3373]
   bw (  KiB/s): min=   64, max=  632, per=6.38%, avg=132.73, stdev=51.12, samples=120
   iops        : min=   16, max=  158, avg=33.18, stdev=12.78, samples=120
  lat (msec)   : 10=0.15%, 20=0.50%, 50=1.59%, 100=1.20%, 250=16.14%
  lat (msec)   : 500=34.48%, 750=31.64%, 1000=13.40%, 2000=0.60%, >=2000=0.30%
  cpu          : usr=0.02%, sys=0.02%, ctx=2017, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2007,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job010: (groupid=0, jobs=1): err= 0: pid=1402: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=132KiB/s (135kB/s)(7976KiB/60475msec)
    slat (nsec): min=2340, max=42070, avg=6507.54, stdev=910.55
    clat (msec): min=6, max=2411, avg=485.07, stdev=242.56
     lat (msec): min=6, max=2411, avg=485.07, stdev=242.56
    clat percentiles (msec):
     |  1.00th=[   32],  5.00th=[  136], 10.00th=[  169], 20.00th=[  243],
     | 30.00th=[  326], 40.00th=[  409], 50.00th=[  489], 60.00th=[  567],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  953], 99.50th=[ 1150], 99.90th=[ 1703], 99.95th=[ 2400],
     | 99.99th=[ 2400]
   bw (  KiB/s): min=   64, max=  632, per=6.34%, avg=131.87, stdev=50.99, samples=120
   iops        : min=   16, max=  158, avg=32.97, stdev=12.75, samples=120
  lat (msec)   : 10=0.10%, 20=0.55%, 50=1.30%, 100=1.50%, 250=18.05%
  lat (msec)   : 500=30.09%, 750=33.05%, 1000=14.69%, 2000=0.60%, >=2000=0.05%
  cpu          : usr=0.01%, sys=0.01%, ctx=1998, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1994,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job011: (groupid=0, jobs=1): err= 0: pid=1403: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=130KiB/s (133kB/s)(7856KiB/60469msec)
    slat (nsec): min=2610, max=40810, avg=7173.53, stdev=1167.33
    clat (msec): min=6, max=4137, avg=492.39, stdev=288.26
     lat (msec): min=6, max=4137, avg=492.40, stdev=288.26
    clat percentiles (msec):
     |  1.00th=[   26],  5.00th=[  132], 10.00th=[  167], 20.00th=[  249],
     | 30.00th=[  326], 40.00th=[  405], 50.00th=[  485], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1150], 99.50th=[ 1821], 99.90th=[ 3138], 99.95th=[ 4144],
     | 99.99th=[ 4144]
   bw (  KiB/s): min=   64, max=  680, per=6.24%, avg=129.87, stdev=55.93, samples=120
   iops        : min=   16, max=  170, avg=32.47, stdev=13.98, samples=120
  lat (msec)   : 10=0.10%, 20=0.46%, 50=2.04%, 100=1.43%, 250=16.04%
  lat (msec)   : 500=31.92%, 750=33.40%, 1000=13.09%, 2000=1.12%, >=2000=0.41%
  cpu          : usr=0.00%, sys=0.03%, ctx=1982, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1964,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job012: (groupid=0, jobs=1): err= 0: pid=1404: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=129KiB/s (132kB/s)(7784KiB/60383msec)
    slat (nsec): min=2350, max=40970, avg=6621.70, stdev=923.57
    clat (msec): min=3, max=3713, avg=496.27, stdev=267.31
     lat (msec): min=3, max=3713, avg=496.27, stdev=267.31
    clat percentiles (msec):
     |  1.00th=[   31],  5.00th=[  140], 10.00th=[  178], 20.00th=[  253],
     | 30.00th=[  326], 40.00th=[  409], 50.00th=[  489], 60.00th=[  567],
     | 70.00th=[  651], 80.00th=[  726], 90.00th=[  810], 95.00th=[  852],
     | 99.00th=[ 1099], 99.50th=[ 1536], 99.90th=[ 2299], 99.95th=[ 3708],
     | 99.99th=[ 3708]
   bw (  KiB/s): min=   56, max=  512, per=6.19%, avg=128.67, stdev=42.20, samples=120
   iops        : min=   14, max=  128, avg=32.17, stdev=10.55, samples=120
  lat (msec)   : 4=0.05%, 10=0.05%, 20=0.36%, 50=1.03%, 100=1.34%
  lat (msec)   : 250=16.60%, 500=32.17%, 750=31.55%, 1000=15.67%, 2000=0.98%
  lat (msec)   : >=2000=0.21%
  cpu          : usr=0.01%, sys=0.02%, ctx=1950, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1946,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job013: (groupid=0, jobs=1): err= 0: pid=1405: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=129KiB/s (132kB/s)(7808KiB/60386msec)
    slat (nsec): min=2720, max=55220, avg=7185.44, stdev=1388.30
    clat (msec): min=5, max=4092, avg=494.79, stdev=279.70
     lat (msec): min=5, max=4092, avg=494.80, stdev=279.70
    clat percentiles (msec):
     |  1.00th=[   33],  5.00th=[  142], 10.00th=[  180], 20.00th=[  262],
     | 30.00th=[  334], 40.00th=[  409], 50.00th=[  485], 60.00th=[  550],
     | 70.00th=[  634], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[ 1053], 99.50th=[ 1921], 99.90th=[ 3574], 99.95th=[ 4077],
     | 99.99th=[ 4077]
   bw (  KiB/s): min=   72, max=  512, per=6.21%, avg=129.07, stdev=40.96, samples=120
   iops        : min=   18, max=  128, avg=32.27, stdev=10.24, samples=120
  lat (msec)   : 10=0.15%, 20=0.36%, 50=1.08%, 100=1.23%, 250=15.68%
  lat (msec)   : 500=34.07%, 750=32.38%, 1000=13.93%, 2000=0.72%, >=2000=0.41%
  cpu          : usr=0.00%, sys=0.03%, ctx=1980, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1952,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job014: (groupid=0, jobs=1): err= 0: pid=1406: Sat Jan 29 03:19:28 2022
  write: IOPS=31, BW=128KiB/s (131kB/s)(7712KiB/60450msec)
    slat (nsec): min=2370, max=35410, avg=6624.54, stdev=854.26
    clat (msec): min=5, max=2850, avg=501.42, stdev=270.00
     lat (msec): min=5, max=2850, avg=501.42, stdev=270.00
    clat percentiles (msec):
     |  1.00th=[   33],  5.00th=[  142], 10.00th=[  180], 20.00th=[  264],
     | 30.00th=[  338], 40.00th=[  414], 50.00th=[  498], 60.00th=[  567],
     | 70.00th=[  651], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1099], 99.50th=[ 1603], 99.90th=[ 2735], 99.95th=[ 2836],
     | 99.99th=[ 2836]
   bw (  KiB/s): min=   64, max=  600, per=6.13%, avg=127.47, stdev=49.41, samples=120
   iops        : min=   16, max=  150, avg=31.87, stdev=12.35, samples=120
  lat (msec)   : 10=0.05%, 20=0.31%, 50=1.30%, 100=1.71%, 250=14.73%
  lat (msec)   : 500=32.21%, 750=33.30%, 1000=14.99%, 2000=1.04%, >=2000=0.36%
  cpu          : usr=0.00%, sys=0.03%, ctx=1931, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1928,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job015: (groupid=0, jobs=1): err= 0: pid=1407: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=131KiB/s (134kB/s)(7892KiB/60467msec)
    slat (nsec): min=2400, max=50090, avg=7259.26, stdev=1254.04
    clat (msec): min=13, max=3597, avg=490.12, stdev=250.25
     lat (msec): min=13, max=3597, avg=490.13, stdev=250.25
    clat percentiles (msec):
     |  1.00th=[   29],  5.00th=[  140], 10.00th=[  178], 20.00th=[  264],
     | 30.00th=[  347], 40.00th=[  414], 50.00th=[  485], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  701], 90.00th=[  785], 95.00th=[  844],
     | 99.00th=[  953], 99.50th=[ 1267], 99.90th=[ 2500], 99.95th=[ 3608],
     | 99.99th=[ 3608]
   bw (  KiB/s): min=   72, max=  528, per=6.27%, avg=130.47, stdev=41.53, samples=120
   iops        : min=   18, max=  132, avg=32.62, stdev=10.38, samples=120
  lat (msec)   : 20=0.35%, 50=1.32%, 100=1.32%, 250=15.61%, 500=33.81%
  lat (msec)   : 750=33.00%, 1000=13.84%, 2000=0.61%, >=2000=0.15%
  cpu          : usr=0.02%, sys=0.02%, ctx=1982, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1973,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=2080KiB/s (2130kB/s), 127KiB/s-135KiB/s (130kB/s-138kB/s), io=123MiB (129MB), run=60383-60475msec

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
root@(none):~# fio fio.spdk 
job000: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job001: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job002: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job003: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job004: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job005: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job006: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job007: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job008: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job009: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job010: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job011: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job012: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job013: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job014: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job015: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
fio-3.1
Starting 16 threads
Jobs: 16 (f=16): [w(16)][100.0%][r=0KiB/s,w=2008KiB/s][r=0,w=502 IOPS][eta 00m:00s]
job000: (groupid=0, jobs=1): err= 0: pid=1411: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=145KiB/s (148kB/s)(8764KiB/60470msec)
    slat (nsec): min=2620, max=41160, avg=6864.50, stdev=1230.81
    clat (usec): min=903, max=2646.2k, avg=441374.15, stdev=268853.40
     lat (usec): min=952, max=2646.2k, avg=441381.23, stdev=268853.51
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   37], 10.00th=[   67], 20.00th=[  190],
     | 30.00th=[  275], 40.00th=[  355], 50.00th=[  439], 60.00th=[  518],
     | 70.00th=[  609], 80.00th=[  684], 90.00th=[  776], 95.00th=[  810],
     | 99.00th=[  919], 99.50th=[ 1334], 99.90th=[ 2366], 99.95th=[ 2601],
     | 99.99th=[ 2635]
   bw (  KiB/s): min=   88, max= 1336, per=6.28%, avg=145.00, stdev=126.02, samples=120
   iops        : min=   22, max=  334, avg=36.25, stdev=31.51, samples=120
  lat (usec)   : 1000=0.05%
  lat (msec)   : 4=0.05%, 10=0.09%, 20=1.78%, 50=5.25%, 100=4.43%
  lat (msec)   : 250=15.15%, 500=31.04%, 750=29.94%, 1000=11.46%, 2000=0.59%
  lat (msec)   : >=2000=0.18%
  cpu          : usr=0.00%, sys=0.03%, ctx=2202, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2191,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job001: (groupid=0, jobs=1): err= 0: pid=1412: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=142KiB/s (146kB/s)(8596KiB/60483msec)
    slat (nsec): min=2410, max=52740, avg=7069.09, stdev=1340.03
    clat (usec): min=697, max=3214.9k, avg=450089.77, stdev=268424.63
     lat (usec): min=751, max=3214.9k, avg=450097.08, stdev=268424.68
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   40], 10.00th=[   68], 20.00th=[  192],
     | 30.00th=[  279], 40.00th=[  363], 50.00th=[  456], 60.00th=[  535],
     | 70.00th=[  625], 80.00th=[  701], 90.00th=[  776], 95.00th=[  818],
     | 99.00th=[  911], 99.50th=[ 1036], 99.90th=[ 2232], 99.95th=[ 2702],
     | 99.99th=[ 3205]
   bw (  KiB/s): min=   80, max= 1312, per=6.16%, avg=142.20, stdev=121.21, samples=120
   iops        : min=   20, max=  328, avg=35.55, stdev=30.30, samples=120
  lat (usec)   : 750=0.05%
  lat (msec)   : 4=0.05%, 10=0.09%, 20=1.44%, 50=5.35%, 100=4.14%
  lat (msec)   : 250=15.31%, 500=28.66%, 750=31.41%, 1000=12.98%, 2000=0.37%
  lat (msec)   : >=2000=0.14%
  cpu          : usr=0.00%, sys=0.03%, ctx=2167, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2149,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job002: (groupid=0, jobs=1): err= 0: pid=1413: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=144KiB/s (147kB/s)(8700KiB/60453msec)
    slat (nsec): min=2480, max=29050, avg=6471.43, stdev=727.51
    clat (usec): min=1841, max=3617.2k, avg=444496.80, stdev=280625.58
     lat (usec): min=1870, max=3617.2k, avg=444503.46, stdev=280625.69
    clat percentiles (msec):
     |  1.00th=[   15],  5.00th=[   31], 10.00th=[   57], 20.00th=[  192],
     | 30.00th=[  279], 40.00th=[  359], 50.00th=[  439], 60.00th=[  523],
     | 70.00th=[  600], 80.00th=[  684], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  936], 99.50th=[ 1250], 99.90th=[ 2836], 99.95th=[ 2869],
     | 99.99th=[ 3608]
   bw (  KiB/s): min=   88, max= 1528, per=6.23%, avg=143.91, stdev=140.79, samples=120
   iops        : min=   22, max=  382, avg=35.96, stdev=35.20, samples=120
  lat (msec)   : 2=0.05%, 4=0.05%, 10=0.09%, 20=1.98%, 50=6.57%
  lat (msec)   : 100=4.00%, 250=13.38%, 500=31.26%, 750=29.47%, 1000=12.37%
  lat (msec)   : 2000=0.60%, >=2000=0.18%
  cpu          : usr=0.02%, sys=0.01%, ctx=2181, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2175,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job003: (groupid=0, jobs=1): err= 0: pid=1414: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=141KiB/s (145kB/s)(8544KiB/60456msec)
    slat (nsec): min=2440, max=32220, avg=7299.84, stdev=1076.30
    clat (usec): min=1502, max=4003.6k, avg=452661.73, stdev=327143.54
     lat (usec): min=1543, max=4003.6k, avg=452669.21, stdev=327143.73
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   37], 10.00th=[   62], 20.00th=[  188],
     | 30.00th=[  268], 40.00th=[  351], 50.00th=[  435], 60.00th=[  523],
     | 70.00th=[  617], 80.00th=[  701], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[ 1217], 99.50th=[ 2165], 99.90th=[ 3742], 99.95th=[ 3775],
     | 99.99th=[ 4010]
   bw (  KiB/s): min=   64, max= 1464, per=6.12%, avg=141.32, stdev=134.81, samples=120
   iops        : min=   16, max=  366, avg=35.32, stdev=33.70, samples=120
  lat (msec)   : 2=0.05%, 4=0.05%, 10=0.14%, 20=1.73%, 50=5.57%
  lat (msec)   : 100=4.92%, 250=15.36%, 500=30.06%, 750=28.89%, 1000=12.03%
  lat (msec)   : 2000=0.61%, >=2000=0.61%
  cpu          : usr=0.01%, sys=0.03%, ctx=2147, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2136,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job004: (groupid=0, jobs=1): err= 0: pid=1415: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=146KiB/s (149kB/s)(8816KiB/60433msec)
    slat (nsec): min=2540, max=77140, avg=6598.94, stdev=1671.32
    clat (usec): min=1344, max=2455.8k, avg=438533.43, stdev=266929.99
     lat (usec): min=1421, max=2455.9k, avg=438540.25, stdev=266930.03
    clat percentiles (msec):
     |  1.00th=[   15],  5.00th=[   33], 10.00th=[   58], 20.00th=[  184],
     | 30.00th=[  275], 40.00th=[  351], 50.00th=[  435], 60.00th=[  518],
     | 70.00th=[  600], 80.00th=[  684], 90.00th=[  776], 95.00th=[  818],
     | 99.00th=[  902], 99.50th=[ 1070], 99.90th=[ 2265], 99.95th=[ 2366],
     | 99.99th=[ 2467]
   bw (  KiB/s): min=   88, max= 1432, per=6.32%, avg=145.85, stdev=135.67, samples=120
   iops        : min=   22, max=  358, avg=36.45, stdev=33.92, samples=120
  lat (msec)   : 2=0.09%, 4=0.05%, 10=0.05%, 20=2.00%, 50=6.44%
  lat (msec)   : 100=3.54%, 250=14.56%, 500=31.08%, 750=29.67%, 1000=11.98%
  lat (msec)   : 2000=0.36%, >=2000=0.18%
  cpu          : usr=0.01%, sys=0.02%, ctx=2213, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2204,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job005: (groupid=0, jobs=1): err= 0: pid=1416: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=145KiB/s (149kB/s)(8780KiB/60389msec)
    slat (nsec): min=2450, max=30660, avg=6882.44, stdev=1177.36
    clat (usec): min=1011, max=3630.4k, avg=440004.56, stdev=265460.04
     lat (usec): min=1043, max=3630.4k, avg=440011.67, stdev=265460.21
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   36], 10.00th=[   63], 20.00th=[  192],
     | 30.00th=[  275], 40.00th=[  355], 50.00th=[  439], 60.00th=[  518],
     | 70.00th=[  592], 80.00th=[  676], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[  927], 99.50th=[ 1083], 99.90th=[ 2039], 99.95th=[ 2123],
     | 99.99th=[ 3641]
   bw (  KiB/s): min=   80, max= 1416, per=6.29%, avg=145.27, stdev=132.09, samples=120
   iops        : min=   20, max=  354, avg=36.32, stdev=33.02, samples=120
  lat (msec)   : 2=0.05%, 10=0.14%, 20=1.78%, 50=5.74%, 100=4.24%
  lat (msec)   : 250=14.81%, 500=30.71%, 750=29.75%, 1000=12.12%, 2000=0.55%
  lat (msec)   : >=2000=0.14%
  cpu          : usr=0.01%, sys=0.03%, ctx=2210, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2195,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job006: (groupid=0, jobs=1): err= 0: pid=1417: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=145KiB/s (148kB/s)(8744KiB/60444msec)
    slat (nsec): min=2330, max=77170, avg=6362.27, stdev=1620.98
    clat (usec): min=296, max=3845.9k, avg=442210.11, stdev=269713.89
     lat (usec): min=374, max=3845.0k, avg=442216.66, stdev=269713.94
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   38], 10.00th=[   60], 20.00th=[  194],
     | 30.00th=[  284], 40.00th=[  363], 50.00th=[  439], 60.00th=[  518],
     | 70.00th=[  609], 80.00th=[  684], 90.00th=[  768], 95.00th=[  818],
     | 99.00th=[  919], 99.50th=[ 1284], 99.90th=[ 1938], 99.95th=[ 1989],
     | 99.99th=[ 3842]
   bw (  KiB/s): min=   80, max= 1448, per=6.27%, avg=144.67, stdev=134.01, samples=120
   iops        : min=   20, max=  362, avg=36.17, stdev=33.50, samples=120
  lat (usec)   : 500=0.05%
  lat (msec)   : 4=0.05%, 10=0.14%, 20=1.56%, 50=6.27%, 100=4.12%
  lat (msec)   : 250=13.95%, 500=31.75%, 750=29.19%, 1000=12.08%, 2000=0.82%
  lat (msec)   : >=2000=0.05%
  cpu          : usr=0.00%, sys=0.03%, ctx=2188, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2186,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job007: (groupid=0, jobs=1): err= 0: pid=1418: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=144KiB/s (147kB/s)(8700KiB/60459msec)
    slat (nsec): min=2520, max=46920, avg=6828.86, stdev=1145.05
    clat (usec): min=574, max=3809.8k, avg=444564.22, stdev=271679.72
     lat (usec): min=621, max=3809.8k, avg=444571.26, stdev=271679.76
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   35], 10.00th=[   63], 20.00th=[  194],
     | 30.00th=[  284], 40.00th=[  359], 50.00th=[  439], 60.00th=[  527],
     | 70.00th=[  617], 80.00th=[  693], 90.00th=[  776], 95.00th=[  818],
     | 99.00th=[  936], 99.50th=[ 1234], 99.90th=[ 1955], 99.95th=[ 2467],
     | 99.99th=[ 3809]
   bw (  KiB/s): min=   72, max= 1504, per=6.23%, avg=143.90, stdev=136.48, samples=120
   iops        : min=   18, max=  376, avg=35.95, stdev=34.12, samples=120
  lat (usec)   : 750=0.05%
  lat (msec)   : 10=0.09%, 20=1.38%, 50=6.57%, 100=4.14%, 250=14.07%
  lat (msec)   : 500=30.85%, 750=29.79%, 1000=12.41%, 2000=0.55%, >=2000=0.09%
  cpu          : usr=0.01%, sys=0.02%, ctx=2192, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2175,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job008: (groupid=0, jobs=1): err= 0: pid=1419: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=144KiB/s (148kB/s)(8728KiB/60476msec)
    slat (nsec): min=2530, max=39560, avg=6603.35, stdev=918.64
    clat (usec): min=478, max=3133.5k, avg=443221.62, stdev=262571.86
     lat (usec): min=518, max=3133.5k, avg=443228.44, stdev=262571.94
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   38], 10.00th=[   65], 20.00th=[  194],
     | 30.00th=[  271], 40.00th=[  368], 50.00th=[  443], 60.00th=[  527],
     | 70.00th=[  609], 80.00th=[  684], 90.00th=[  768], 95.00th=[  818],
     | 99.00th=[  911], 99.50th=[ 1083], 99.90th=[ 1854], 99.95th=[ 2140],
     | 99.99th=[ 3138]
   bw (  KiB/s): min=   80, max= 1360, per=6.25%, avg=144.40, stdev=126.91, samples=120
   iops        : min=   20, max=  340, avg=36.10, stdev=31.73, samples=120
  lat (usec)   : 500=0.05%
  lat (msec)   : 4=0.05%, 10=0.14%, 20=1.28%, 50=5.82%, 100=4.26%
  lat (msec)   : 250=15.22%, 500=30.11%, 750=31.48%, 1000=11.00%, 2000=0.50%
  lat (msec)   : >=2000=0.09%
  cpu          : usr=0.00%, sys=0.03%, ctx=2186, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2182,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job009: (groupid=0, jobs=1): err= 0: pid=1420: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=146KiB/s (149kB/s)(8796KiB/60436msec)
    slat (nsec): min=2520, max=30370, avg=6854.95, stdev=792.92
    clat (usec): min=1668, max=1876.6k, avg=439539.00, stdev=257683.55
     lat (usec): min=1703, max=1876.6k, avg=439546.09, stdev=257683.63
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   37], 10.00th=[   62], 20.00th=[  194],
     | 30.00th=[  275], 40.00th=[  355], 50.00th=[  435], 60.00th=[  514],
     | 70.00th=[  592], 80.00th=[  693], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  944], 99.50th=[ 1133], 99.90th=[ 1653], 99.95th=[ 1854],
     | 99.99th=[ 1871]
   bw (  KiB/s): min=   80, max= 1464, per=6.30%, avg=145.51, stdev=135.21, samples=120
   iops        : min=   20, max=  366, avg=36.36, stdev=33.81, samples=120
  lat (msec)   : 2=0.05%, 10=0.09%, 20=1.77%, 50=5.64%, 100=4.55%
  lat (msec)   : 250=15.28%, 500=30.38%, 750=28.51%, 1000=13.01%, 2000=0.73%
  cpu          : usr=0.01%, sys=0.02%, ctx=2215, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2199,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job010: (groupid=0, jobs=1): err= 0: pid=1421: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=147KiB/s (151kB/s)(8892KiB/60474msec)
    slat (nsec): min=2450, max=77940, avg=6428.46, stdev=1589.72
    clat (usec): min=590, max=3242.0k, avg=435045.81, stdev=265597.59
     lat (usec): min=668, max=3242.0k, avg=435052.43, stdev=265597.66
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   34], 10.00th=[   59], 20.00th=[  178],
     | 30.00th=[  266], 40.00th=[  347], 50.00th=[  439], 60.00th=[  523],
     | 70.00th=[  600], 80.00th=[  676], 90.00th=[  768], 95.00th=[  818],
     | 99.00th=[  911], 99.50th=[  961], 99.90th=[ 2123], 99.95th=[ 2232],
     | 99.99th=[ 3239]
   bw (  KiB/s): min=   72, max= 1480, per=6.37%, avg=147.13, stdev=140.21, samples=120
   iops        : min=   18, max=  370, avg=36.78, stdev=35.05, samples=120
  lat (usec)   : 750=0.04%
  lat (msec)   : 4=0.04%, 10=0.09%, 20=1.93%, 50=6.07%, 100=4.63%
  lat (msec)   : 250=15.43%, 500=29.19%, 750=30.90%, 1000=11.25%, 2000=0.27%
  lat (msec)   : >=2000=0.13%
  cpu          : usr=0.01%, sys=0.03%, ctx=2228, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2223,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job011: (groupid=0, jobs=1): err= 0: pid=1422: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=147KiB/s (151kB/s)(8916KiB/60449msec)
    slat (nsec): min=2460, max=28380, avg=7032.15, stdev=932.20
    clat (usec): min=304, max=2722.1k, avg=433708.44, stdev=259882.81
     lat (usec): min=333, max=2722.1k, avg=433715.74, stdev=259882.94
    clat percentiles (msec):
     |  1.00th=[   15],  5.00th=[   37], 10.00th=[   66], 20.00th=[  188],
     | 30.00th=[  271], 40.00th=[  347], 50.00th=[  435], 60.00th=[  502],
     | 70.00th=[  600], 80.00th=[  676], 90.00th=[  768], 95.00th=[  818],
     | 99.00th=[  911], 99.50th=[ 1083], 99.90th=[ 1854], 99.95th=[ 1955],
     | 99.99th=[ 2735]
   bw (  KiB/s): min=   80, max= 1464, per=6.39%, avg=147.53, stdev=134.88, samples=120
   iops        : min=   20, max=  366, avg=36.88, stdev=33.72, samples=120
  lat (usec)   : 500=0.04%
  lat (msec)   : 10=0.13%, 20=1.70%, 50=5.20%, 100=4.89%, 250=15.97%
  lat (msec)   : 500=31.49%, 750=29.39%, 1000=10.45%, 2000=0.67%, >=2000=0.04%
  cpu          : usr=0.01%, sys=0.03%, ctx=2245, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2229,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job012: (groupid=0, jobs=1): err= 0: pid=1423: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=143KiB/s (146kB/s)(8636KiB/60488msec)
    slat (nsec): min=2360, max=62320, avg=6618.27, stdev=1287.56
    clat (usec): min=1184, max=3716.5k, avg=448041.61, stdev=273441.30
     lat (usec): min=1247, max=3716.6k, avg=448048.40, stdev=273441.33
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   37], 10.00th=[   69], 20.00th=[  203],
     | 30.00th=[  279], 40.00th=[  363], 50.00th=[  443], 60.00th=[  527],
     | 70.00th=[  609], 80.00th=[  693], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[  995], 99.50th=[ 1200], 99.90th=[ 2366], 99.95th=[ 2433],
     | 99.99th=[ 3708]
   bw (  KiB/s): min=   80, max= 1280, per=6.19%, avg=142.87, stdev=121.27, samples=120
   iops        : min=   20, max=  320, avg=35.72, stdev=30.32, samples=120
  lat (msec)   : 2=0.05%, 4=0.05%, 10=0.14%, 20=1.44%, 50=5.60%
  lat (msec)   : 100=3.94%, 250=14.64%, 500=31.13%, 750=29.97%, 1000=12.14%
  lat (msec)   : 2000=0.74%, >=2000=0.19%
  cpu          : usr=0.00%, sys=0.03%, ctx=2164, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2159,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job013: (groupid=0, jobs=1): err= 0: pid=1424: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=144KiB/s (147kB/s)(8676KiB/60427msec)
    slat (nsec): min=2430, max=29820, avg=7256.25, stdev=943.13
    clat (msec): min=2, max=3406, avg=445.55, stdev=274.02
     lat (msec): min=2, max=3406, avg=445.56, stdev=274.02
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   37], 10.00th=[   64], 20.00th=[  199],
     | 30.00th=[  284], 40.00th=[  359], 50.00th=[  439], 60.00th=[  523],
     | 70.00th=[  600], 80.00th=[  684], 90.00th=[  768], 95.00th=[  827],
     | 99.00th=[  969], 99.50th=[ 1452], 99.90th=[ 2534], 99.95th=[ 2668],
     | 99.99th=[ 3406]
   bw (  KiB/s): min=   80, max= 1416, per=6.22%, avg=143.53, stdev=132.36, samples=120
   iops        : min=   20, max=  354, avg=35.88, stdev=33.09, samples=120
  lat (msec)   : 4=0.09%, 10=0.09%, 20=1.75%, 50=6.09%, 100=3.96%
  lat (msec)   : 250=13.83%, 500=31.17%, 750=30.61%, 1000=11.53%, 2000=0.69%
  lat (msec)   : >=2000=0.18%
  cpu          : usr=0.00%, sys=0.04%, ctx=2194, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2169,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job014: (groupid=0, jobs=1): err= 0: pid=1425: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=142KiB/s (145kB/s)(8572KiB/60485msec)
    slat (nsec): min=2390, max=26580, avg=6598.72, stdev=639.79
    clat (msec): min=2, max=3306, avg=451.38, stdev=274.20
     lat (msec): min=2, max=3306, avg=451.38, stdev=274.20
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   40], 10.00th=[   63], 20.00th=[  201],
     | 30.00th=[  284], 40.00th=[  368], 50.00th=[  460], 60.00th=[  535],
     | 70.00th=[  617], 80.00th=[  701], 90.00th=[  785], 95.00th=[  827],
     | 99.00th=[  944], 99.50th=[ 1284], 99.90th=[ 2534], 99.95th=[ 2735],
     | 99.99th=[ 3306]
   bw (  KiB/s): min=   72, max= 1424, per=6.14%, avg=141.80, stdev=131.06, samples=120
   iops        : min=   18, max=  356, avg=35.45, stdev=32.77, samples=120
  lat (msec)   : 4=0.05%, 10=0.09%, 20=1.63%, 50=5.69%, 100=4.85%
  lat (msec)   : 250=13.39%, 500=29.96%, 750=31.17%, 1000=12.41%, 2000=0.56%
  lat (msec)   : >=2000=0.19%
  cpu          : usr=0.01%, sys=0.02%, ctx=2149, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2143,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job015: (groupid=0, jobs=1): err= 0: pid=1426: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=146KiB/s (150kB/s)(8840KiB/60466msec)
    slat (nsec): min=2470, max=69370, avg=7317.33, stdev=1636.94
    clat (msec): min=11, max=2098, avg=437.54, stdev=255.26
     lat (msec): min=11, max=2098, avg=437.55, stdev=255.26
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   37], 10.00th=[   65], 20.00th=[  199],
     | 30.00th=[  284], 40.00th=[  359], 50.00th=[  430], 60.00th=[  510],
     | 70.00th=[  600], 80.00th=[  676], 90.00th=[  760], 95.00th=[  810],
     | 99.00th=[  911], 99.50th=[ 1011], 99.90th=[ 1938], 99.95th=[ 2089],
     | 99.99th=[ 2106]
   bw (  KiB/s): min=   88, max= 1432, per=6.33%, avg=146.27, stdev=129.89, samples=120
   iops        : min=   22, max=  358, avg=36.57, stdev=32.47, samples=120
  lat (msec)   : 20=1.99%, 50=5.75%, 100=3.89%, 250=13.53%, 500=33.39%
  lat (msec)   : 750=30.27%, 1000=10.63%, 2000=0.45%, >=2000=0.09%
  cpu          : usr=0.01%, sys=0.03%, ctx=2233, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2210,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=2310KiB/s (2365kB/s), 141KiB/s-147KiB/s (145kB/s-151kB/s), io=136MiB (143MB), run=60389-60488msec

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
root@(none):~# fio fio.spdk 
job000: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job001: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job002: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job003: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job004: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job005: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job006: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job007: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job008: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job009: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job010: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job011: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job012: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job013: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job014: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job015: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
fio-3.1
Starting 16 threads
Jobs: 16 (f=16): [w(16)][100.0%][r=0KiB/s,w=2104KiB/s][r=0,w=526 IOPS][eta 00m:00s]
job000: (groupid=0, jobs=1): err= 0: pid=1429: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (138kB/s)(8144KiB/60551msec)
    slat (nsec): min=2590, max=22580, avg=6989.15, stdev=1029.24
    clat (msec): min=3, max=2678, avg=475.56, stdev=265.61
     lat (msec): min=3, max=2678, avg=475.57, stdev=265.61
    clat percentiles (msec):
     |  1.00th=[   23],  5.00th=[   73], 10.00th=[  159], 20.00th=[  232],
     | 30.00th=[  313], 40.00th=[  388], 50.00th=[  464], 60.00th=[  542],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[ 1062], 99.50th=[ 1586], 99.90th=[ 2140], 99.95th=[ 2400],
     | 99.99th=[ 2668]
   bw (  KiB/s): min=    8, max=  992, per=6.27%, avg=133.60, stdev=82.10, samples=121
   iops        : min=    2, max=  248, avg=33.39, stdev=20.52, samples=121
  lat (msec)   : 4=0.05%, 10=0.10%, 20=0.79%, 50=2.50%, 100=2.26%
  lat (msec)   : 250=16.55%, 500=32.37%, 750=30.75%, 1000=13.51%, 2000=0.83%
  lat (msec)   : >=2000=0.29%
  cpu          : usr=0.01%, sys=0.02%, ctx=2051, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2036,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job001: (groupid=0, jobs=1): err= 0: pid=1430: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8104KiB/60527msec)
    slat (nsec): min=2500, max=23950, avg=7010.82, stdev=810.62
    clat (usec): min=1967, max=2864.3k, avg=477733.04, stdev=268372.18
     lat (usec): min=1996, max=2864.3k, avg=477740.33, stdev=268372.21
    clat percentiles (msec):
     |  1.00th=[   24],  5.00th=[   72], 10.00th=[  163], 20.00th=[  234],
     | 30.00th=[  309], 40.00th=[  388], 50.00th=[  472], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  986], 99.50th=[ 1217], 99.90th=[ 2366], 99.95th=[ 2802],
     | 99.99th=[ 2869]
   bw (  KiB/s): min=    8, max=  992, per=6.24%, avg=132.94, stdev=81.90, samples=121
   iops        : min=    2, max=  248, avg=33.22, stdev=20.48, samples=121
  lat (msec)   : 2=0.05%, 10=0.10%, 20=0.54%, 50=2.57%, 100=2.52%
  lat (msec)   : 250=16.44%, 500=31.34%, 750=31.98%, 1000=13.52%, 2000=0.59%
  lat (msec)   : >=2000=0.35%
  cpu          : usr=0.01%, sys=0.02%, ctx=2041, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2026,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job002: (groupid=0, jobs=1): err= 0: pid=1431: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=133KiB/s (136kB/s)(8060KiB/60538msec)
    slat (nsec): min=2310, max=37630, avg=6584.38, stdev=871.89
    clat (usec): min=889, max=2200.5k, avg=480424.76, stdev=267411.65
     lat (usec): min=927, max=2200.6k, avg=480431.52, stdev=267411.68
    clat percentiles (msec):
     |  1.00th=[   20],  5.00th=[   75], 10.00th=[  155], 20.00th=[  236],
     | 30.00th=[  313], 40.00th=[  388], 50.00th=[  468], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1099], 99.50th=[ 1502], 99.90th=[ 2123], 99.95th=[ 2165],
     | 99.99th=[ 2198]
   bw (  KiB/s): min=    8, max= 1000, per=6.20%, avg=132.22, stdev=83.33, samples=121
   iops        : min=    2, max=  250, avg=33.05, stdev=20.83, samples=121
  lat (usec)   : 1000=0.05%
  lat (msec)   : 4=0.10%, 10=0.15%, 20=0.79%, 50=2.08%, 100=2.78%
  lat (msec)   : 250=15.63%, 500=31.76%, 750=30.92%, 1000=14.19%, 2000=1.19%
  lat (msec)   : >=2000=0.35%
  cpu          : usr=0.00%, sys=0.03%, ctx=2023, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2015,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job003: (groupid=0, jobs=1): err= 0: pid=1432: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=133KiB/s (136kB/s)(8052KiB/60520msec)
    slat (nsec): min=2620, max=37450, avg=7665.85, stdev=1100.23
    clat (usec): min=714, max=3576.9k, avg=480774.94, stdev=284170.73
     lat (usec): min=752, max=3576.9k, avg=480782.80, stdev=284170.83
    clat percentiles (msec):
     |  1.00th=[   21],  5.00th=[   69], 10.00th=[  157], 20.00th=[  239],
     | 30.00th=[  309], 40.00th=[  388], 50.00th=[  472], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[ 1036], 99.50th=[ 1469], 99.90th=[ 3205], 99.95th=[ 3339],
     | 99.99th=[ 3574]
   bw (  KiB/s): min=    8, max= 1032, per=6.20%, avg=132.09, stdev=86.03, samples=121
   iops        : min=    2, max=  258, avg=33.02, stdev=21.51, samples=121
  lat (usec)   : 750=0.05%
  lat (msec)   : 4=0.10%, 10=0.10%, 20=0.65%, 50=2.19%, 100=3.03%
  lat (msec)   : 250=15.55%, 500=31.35%, 750=32.44%, 1000=13.26%, 2000=0.94%
  lat (msec)   : >=2000=0.35%
  cpu          : usr=0.01%, sys=0.03%, ctx=2029, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2013,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job004: (groupid=0, jobs=1): err= 0: pid=1433: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8120KiB/60609msec)
    slat (nsec): min=2540, max=28140, avg=6474.20, stdev=803.99
    clat (usec): min=1252, max=3554.1k, avg=477444.44, stdev=257780.62
     lat (usec): min=1286, max=3554.1k, avg=477451.11, stdev=257780.64
    clat percentiles (msec):
     |  1.00th=[   21],  5.00th=[   63], 10.00th=[  157], 20.00th=[  243],
     | 30.00th=[  330], 40.00th=[  393], 50.00th=[  468], 60.00th=[  550],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  986], 99.50th=[ 1133], 99.90th=[ 2072], 99.95th=[ 2433],
     | 99.99th=[ 3540]
   bw (  KiB/s): min=    8, max= 1040, per=6.25%, avg=133.21, stdev=86.18, samples=121
   iops        : min=    2, max=  260, avg=33.30, stdev=21.54, samples=121
  lat (msec)   : 2=0.05%, 10=0.15%, 20=0.79%, 50=3.00%, 100=2.17%
  lat (msec)   : 250=14.58%, 500=33.10%, 750=31.28%, 1000=13.99%, 2000=0.74%
  lat (msec)   : >=2000=0.15%
  cpu          : usr=0.00%, sys=0.03%, ctx=2045, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2030,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job005: (groupid=0, jobs=1): err= 0: pid=1434: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=133KiB/s (136kB/s)(8044KiB/60536msec)
    slat (nsec): min=2440, max=27470, avg=6892.76, stdev=1185.46
    clat (msec): min=2, max=3454, avg=481.36, stdev=278.03
     lat (msec): min=2, max=3454, avg=481.37, stdev=278.03
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   63], 10.00th=[  157], 20.00th=[  239],
     | 30.00th=[  317], 40.00th=[  401], 50.00th=[  477], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[  978], 99.50th=[ 1368], 99.90th=[ 2635], 99.95th=[ 3205],
     | 99.99th=[ 3440]
   bw (  KiB/s): min=    8, max= 1056, per=6.19%, avg=131.94, stdev=87.78, samples=121
   iops        : min=    2, max=  264, avg=32.97, stdev=21.95, samples=121
  lat (msec)   : 4=0.10%, 10=0.10%, 20=1.19%, 50=2.54%, 100=2.34%
  lat (msec)   : 250=15.32%, 500=31.53%, 750=32.27%, 1000=13.67%, 2000=0.60%
  lat (msec)   : >=2000=0.35%
  cpu          : usr=0.01%, sys=0.02%, ctx=2029, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2011,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job006: (groupid=0, jobs=1): err= 0: pid=1435: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8088KiB/60518msec)
    slat (nsec): min=2550, max=30760, avg=6371.72, stdev=750.89
    clat (usec): min=1815, max=2713.0k, avg=478613.31, stdev=266856.27
     lat (usec): min=1847, max=2713.0k, avg=478619.89, stdev=266856.30
    clat percentiles (msec):
     |  1.00th=[   21],  5.00th=[   63], 10.00th=[  157], 20.00th=[  241],
     | 30.00th=[  317], 40.00th=[  393], 50.00th=[  464], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[  995], 99.50th=[ 1469], 99.90th=[ 2232], 99.95th=[ 2232],
     | 99.99th=[ 2702]
   bw (  KiB/s): min=    8, max= 1064, per=6.22%, avg=132.68, stdev=88.73, samples=121
   iops        : min=    2, max=  266, avg=33.16, stdev=22.19, samples=121
  lat (msec)   : 2=0.05%, 10=0.10%, 20=0.74%, 50=3.17%, 100=2.37%
  lat (msec)   : 250=15.13%, 500=32.54%, 750=29.87%, 1000=15.03%, 2000=0.79%
  lat (msec)   : >=2000=0.20%
  cpu          : usr=0.00%, sys=0.03%, ctx=2036, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2022,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job007: (groupid=0, jobs=1): err= 0: pid=1436: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8116KiB/60543msec)
    slat (nsec): min=2540, max=55830, avg=6721.67, stdev=1245.48
    clat (usec): min=331, max=3128.6k, avg=477164.93, stdev=266832.70
     lat (usec): min=388, max=3128.6k, avg=477171.87, stdev=266832.72
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   55], 10.00th=[  148], 20.00th=[  239],
     | 30.00th=[  313], 40.00th=[  393], 50.00th=[  472], 60.00th=[  550],
     | 70.00th=[  634], 80.00th=[  718], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  978], 99.50th=[ 1368], 99.90th=[ 2333], 99.95th=[ 2567],
     | 99.99th=[ 3138]
   bw (  KiB/s): min=    8, max= 1144, per=6.25%, avg=133.16, stdev=95.75, samples=121
   iops        : min=    2, max=  286, avg=33.29, stdev=23.94, samples=121
  lat (usec)   : 500=0.05%
  lat (msec)   : 4=0.05%, 10=0.15%, 20=1.08%, 50=3.15%, 100=2.27%
  lat (msec)   : 250=14.69%, 500=31.69%, 750=31.44%, 1000=14.59%, 2000=0.59%
  lat (msec)   : >=2000=0.25%
  cpu          : usr=0.01%, sys=0.02%, ctx=2042, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2029,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job008: (groupid=0, jobs=1): err= 0: pid=1437: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8092KiB/60545msec)
    slat (nsec): min=2600, max=44290, avg=6544.16, stdev=1003.95
    clat (msec): min=2, max=2867, avg=478.57, stdev=270.92
     lat (msec): min=2, max=2867, avg=478.58, stdev=270.92
    clat percentiles (msec):
     |  1.00th=[   22],  5.00th=[   67], 10.00th=[  159], 20.00th=[  234],
     | 30.00th=[  309], 40.00th=[  393], 50.00th=[  477], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  709], 90.00th=[  802], 95.00th=[  844],
     | 99.00th=[ 1070], 99.50th=[ 1401], 99.90th=[ 2635], 99.95th=[ 2769],
     | 99.99th=[ 2869]
   bw (  KiB/s): min=    8, max= 1024, per=6.23%, avg=132.76, stdev=85.15, samples=121
   iops        : min=    2, max=  256, avg=33.19, stdev=21.29, samples=121
  lat (msec)   : 4=0.05%, 10=0.10%, 20=0.54%, 50=3.26%, 100=1.98%
  lat (msec)   : 250=16.96%, 500=30.94%, 750=30.55%, 1000=14.48%, 2000=0.89%
  lat (msec)   : >=2000=0.25%
  cpu          : usr=0.00%, sys=0.03%, ctx=2033, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2023,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job009: (groupid=0, jobs=1): err= 0: pid=1438: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=135KiB/s (138kB/s)(8144KiB/60542msec)
    slat (nsec): min=2440, max=36840, avg=6897.02, stdev=962.98
    clat (msec): min=2, max=3695, avg=475.49, stdev=262.05
     lat (msec): min=2, max=3695, avg=475.50, stdev=262.05
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   63], 10.00th=[  150], 20.00th=[  234],
     | 30.00th=[  313], 40.00th=[  397], 50.00th=[  477], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  709], 90.00th=[  793], 95.00th=[  835],
     | 99.00th=[ 1003], 99.50th=[ 1200], 99.90th=[ 1905], 99.95th=[ 2198],
     | 99.99th=[ 3708]
   bw (  KiB/s): min=    8, max= 1128, per=6.27%, avg=133.62, stdev=94.14, samples=121
   iops        : min=    2, max=  282, avg=33.40, stdev=23.53, samples=121
  lat (msec)   : 4=0.05%, 10=0.10%, 20=1.33%, 50=2.65%, 100=2.31%
  lat (msec)   : 250=15.72%, 500=31.09%, 750=31.88%, 1000=13.85%, 2000=0.93%
  lat (msec)   : >=2000=0.10%
  cpu          : usr=0.00%, sys=0.03%, ctx=2054, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2036,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job010: (groupid=0, jobs=1): err= 0: pid=1439: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8088KiB/60540msec)
    slat (nsec): min=2460, max=42050, avg=6505.00, stdev=938.78
    clat (msec): min=2, max=2549, avg=478.78, stdev=259.86
     lat (msec): min=2, max=2549, avg=478.79, stdev=259.86
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   59], 10.00th=[  150], 20.00th=[  239],
     | 30.00th=[  309], 40.00th=[  393], 50.00th=[  485], 60.00th=[  558],
     | 70.00th=[  642], 80.00th=[  709], 90.00th=[  793], 95.00th=[  835],
     | 99.00th=[ 1011], 99.50th=[ 1200], 99.90th=[ 2106], 99.95th=[ 2106],
     | 99.99th=[ 2534]
   bw (  KiB/s): min=    8, max= 1104, per=6.22%, avg=132.69, stdev=91.73, samples=121
   iops        : min=    2, max=  276, avg=33.17, stdev=22.93, samples=121
  lat (msec)   : 4=0.10%, 10=0.10%, 20=1.34%, 50=2.62%, 100=2.27%
  lat (msec)   : 250=15.33%, 500=30.27%, 750=33.04%, 1000=13.90%, 2000=0.84%
  lat (msec)   : >=2000=0.20%
  cpu          : usr=0.00%, sys=0.03%, ctx=2031, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2022,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job011: (groupid=0, jobs=1): err= 0: pid=1440: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=132KiB/s (135kB/s)(7992KiB/60406msec)
    slat (nsec): min=2430, max=42670, avg=7037.07, stdev=1109.36
    clat (usec): min=1632, max=3762.3k, avg=483526.39, stdev=291668.80
     lat (usec): min=1675, max=3762.3k, avg=483533.69, stdev=291668.83
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   66], 10.00th=[  153], 20.00th=[  230],
     | 30.00th=[  321], 40.00th=[  397], 50.00th=[  477], 60.00th=[  558],
     | 70.00th=[  642], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1053], 99.50th=[ 1552], 99.90th=[ 3742], 99.95th=[ 3775],
     | 99.99th=[ 3775]
   bw (  KiB/s): min=   72, max= 1048, per=6.20%, avg=132.13, stdev=87.51, samples=120
   iops        : min=   18, max=  262, avg=33.03, stdev=21.88, samples=120
  lat (msec)   : 2=0.05%, 10=0.10%, 20=0.95%, 50=2.30%, 100=2.85%
  lat (msec)   : 250=16.12%, 500=30.53%, 750=31.83%, 1000=14.21%, 2000=0.80%
  lat (msec)   : >=2000=0.25%
  cpu          : usr=0.01%, sys=0.02%, ctx=2021, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1998,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job012: (groupid=0, jobs=1): err= 0: pid=1441: Sat Jan 29 03:21:45 2022
  write: IOPS=32, BW=131KiB/s (134kB/s)(7928KiB/60499msec)
    slat (nsec): min=2550, max=41780, avg=6529.97, stdev=921.96
    clat (usec): min=526, max=2381.0k, avg=488141.63, stdev=263751.28
     lat (usec): min=569, max=2381.0k, avg=488148.31, stdev=263751.30
    clat percentiles (msec):
     |  1.00th=[   21],  5.00th=[   81], 10.00th=[  163], 20.00th=[  243],
     | 30.00th=[  317], 40.00th=[  405], 50.00th=[  489], 60.00th=[  567],
     | 70.00th=[  651], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1083], 99.50th=[ 1385], 99.90th=[ 2232], 99.95th=[ 2366],
     | 99.99th=[ 2366]
   bw (  KiB/s): min=   24, max=  896, per=6.15%, avg=131.07, stdev=74.54, samples=120
   iops        : min=    6, max=  224, avg=32.77, stdev=18.64, samples=120
  lat (usec)   : 750=0.05%
  lat (msec)   : 10=0.15%, 20=0.71%, 50=1.87%, 100=2.52%, 250=15.54%
  lat (msec)   : 500=30.73%, 750=32.29%, 1000=14.93%, 2000=0.96%, >=2000=0.25%
  cpu          : usr=0.00%, sys=0.03%, ctx=1989, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1982,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job013: (groupid=0, jobs=1): err= 0: pid=1442: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=135KiB/s (138kB/s)(8164KiB/60501msec)
    slat (nsec): min=2520, max=49150, avg=7161.63, stdev=1200.38
    clat (usec): min=1072, max=2513.7k, avg=474057.99, stdev=255114.85
     lat (usec): min=1122, max=2513.7k, avg=474065.44, stdev=255114.90
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   72], 10.00th=[  157], 20.00th=[  234],
     | 30.00th=[  313], 40.00th=[  393], 50.00th=[  481], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  701], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[ 1020], 99.50th=[ 1318], 99.90th=[ 2232], 99.95th=[ 2265],
     | 99.99th=[ 2500]
   bw (  KiB/s): min=   40, max= 1032, per=6.33%, avg=135.00, stdev=86.07, samples=120
   iops        : min=   10, max=  258, avg=33.75, stdev=21.52, samples=120
  lat (msec)   : 2=0.10%, 10=0.24%, 20=0.78%, 50=2.25%, 100=2.69%
  lat (msec)   : 250=15.73%, 500=31.41%, 750=33.41%, 1000=12.30%, 2000=0.88%
  lat (msec)   : >=2000=0.20%
  cpu          : usr=0.00%, sys=0.03%, ctx=2066, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2041,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job014: (groupid=0, jobs=1): err= 0: pid=1443: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=133KiB/s (136kB/s)(8020KiB/60528msec)
    slat (nsec): min=2460, max=54580, avg=6537.48, stdev=1181.96
    clat (usec): min=327, max=3314.6k, avg=482757.51, stdev=288007.51
     lat (usec): min=383, max=3314.6k, avg=482764.20, stdev=288007.54
    clat percentiles (msec):
     |  1.00th=[   22],  5.00th=[   63], 10.00th=[  150], 20.00th=[  234],
     | 30.00th=[  317], 40.00th=[  401], 50.00th=[  472], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  718], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  969], 99.50th=[ 1620], 99.90th=[ 3138], 99.95th=[ 3272],
     | 99.99th=[ 3306]
   bw (  KiB/s): min=    8, max= 1096, per=6.17%, avg=131.57, stdev=92.47, samples=121
   iops        : min=    2, max=  274, avg=32.89, stdev=23.12, samples=121
  lat (usec)   : 500=0.05%
  lat (msec)   : 4=0.05%, 10=0.15%, 20=0.70%, 50=2.74%, 100=3.04%
  lat (msec)   : 250=15.16%, 500=31.12%, 750=31.42%, 1000=14.66%, 2000=0.40%
  lat (msec)   : >=2000=0.50%
  cpu          : usr=0.00%, sys=0.02%, ctx=2017, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2005,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job015: (groupid=0, jobs=1): err= 0: pid=1444: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8096KiB/60507msec)
    slat (nsec): min=2500, max=46920, avg=7413.75, stdev=1287.51
    clat (usec): min=507, max=2640.1k, avg=478065.98, stdev=249696.12
     lat (usec): min=554, max=2640.1k, avg=478073.64, stdev=249696.21
    clat percentiles (msec):
     |  1.00th=[   22],  5.00th=[   67], 10.00th=[  161], 20.00th=[  239],
     | 30.00th=[  326], 40.00th=[  405], 50.00th=[  481], 60.00th=[  550],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  835],
     | 99.00th=[  944], 99.50th=[ 1062], 99.90th=[ 1720], 99.95th=[ 2039],
     | 99.99th=[ 2635]
   bw (  KiB/s): min=    8, max= 1016, per=6.23%, avg=132.83, stdev=84.45, samples=121
   iops        : min=    2, max=  254, avg=33.21, stdev=21.11, samples=121
  lat (usec)   : 750=0.05%
  lat (msec)   : 20=0.94%, 50=2.72%, 100=2.47%, 250=15.42%, 500=30.93%
  lat (msec)   : 750=32.51%, 1000=14.23%, 2000=0.64%, >=2000=0.10%
  cpu          : usr=0.00%, sys=0.03%, ctx=2039, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2024,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=2133KiB/s (2184kB/s), 131KiB/s-135KiB/s (134kB/s-138kB/s), io=126MiB (132MB), run=60406-60609msec

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

[END] 2022/1/29 11:26:38

[-- Attachment #3: packet_stuck_in_qdisc_debug.patch --]
[-- Type: text/plain, Size: 1915 bytes --]

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a27bcc4f7e9a..dc18bafd13d0 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -951,6 +951,8 @@ struct sk_buff {
 	unsigned int		truesize;
 	refcount_t		users;
 
+	unsigned long           enqueue_jiffies;
+
 #ifdef CONFIG_SKB_EXTENSIONS
 	/* only useable after checking ->active_extensions != 0 */
 	struct skb_ext		*extensions;
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index f893d9a81b01..cb09b0f165ce 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -109,8 +109,10 @@ static inline struct sk_buff *qdisc_dequeue_skb_bad_txq(struct Qdisc *q)
 {
 	struct sk_buff *skb = skb_peek(&q->skb_bad_txq);
 
-	if (unlikely(skb))
+	if (unlikely(skb)) {
 		skb = __skb_dequeue_bad_txq(q);
+		net_err_ratelimited("dequeuing bad txq skb: %px\n", skb);
+	}
 
 	return skb;
 }
@@ -251,6 +253,8 @@ static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate,
 			goto validate;
 		}
 
+		net_err_ratelimited("dequeuing gso skb: %px\n", skb);
+
 		/* skb in gso_skb were already validated */
 		*validate = false;
 		if (xfrm_offload(skb))
@@ -737,6 +741,8 @@ static int pfifo_fast_enqueue(struct sk_buff *skb, struct Qdisc *qdisc,
 			return qdisc_drop_cpu(skb, qdisc, to_free);
 		else
 			return qdisc_drop(skb, qdisc, to_free);
+	} else {
+		skb->enqueue_jiffies = jiffies;
 	}
 
 	qdisc_update_stats_at_enqueue(qdisc, pkt_len);
@@ -760,6 +766,12 @@ static struct sk_buff *pfifo_fast_dequeue(struct Qdisc *qdisc)
 		skb = __skb_array_consume(q);
 	}
 	if (likely(skb)) {
+		unsigned int delay_ms;
+
+		delay_ms = jiffies_to_msecs(jiffies - skb->enqueue_jiffies);
+		if (delay_ms > 500)
+			netdev_err(qdisc_dev(qdisc), "delay: %u ms, skb: %px\n", delay_ms, skb);
+
 		qdisc_update_stats_at_dequeue(qdisc, skb);
 	} else if (need_retry &&
 		   READ_ONCE(qdisc->state) & QDISC_STATE_NON_EMPTY) {

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc
  2022-01-29  6:53         ` Yunsheng Lin
@ 2022-01-31 18:39           ` Vincent Ray
  2022-02-07  3:17             ` Yunsheng Lin
  0 siblings, 1 reply; 28+ messages in thread
From: Vincent Ray @ 2022-01-31 18:39 UTC (permalink / raw)
  To: linyunsheng; +Cc: vladimir oltean, kuba, Samuel Jones, netdev, davem

Hi,

Thanks for giving it a try.
Your attached log contains a lot of Chinese characters, which makes it quite difficult to read; could you strip them off?
I guess it is a fio report though.
Can I ask which kernel you tried? And how did you determine whether the bug was present or not?
(As I'm sure you're aware, merely looking at fio-reported bandwidths is not enough, as the holes can sometimes be quite small or not appear at all
=> it takes several trials to catch them; in my case I scripted over tcpdump, computing timestamp differences between packets of the same connection.)

I'll try your patch and let you know.
I suppose it's meant to give me a print in dmesg if a pkt gets stuck ?

Thanks,

V

----- Original Message -----
From: "linyunsheng" <linyunsheng@huawei.com>
To: "Vincent Ray" <vray@kalrayinc.com>
Cc: "vladimir oltean" <vladimir.oltean@nxp.com>, "kuba" <kuba@kernel.org>, "Samuel Jones" <sjones@kalrayinc.com>, "netdev" <netdev@vger.kernel.org>, "davem" <davem@davemloft.net>
Sent: Saturday, January 29, 2022 7:53:59 AM
Subject: Re: packet stuck in qdisc

On 2022/1/28 17:59, Vincent Ray wrote:
> Hmm ... sorry, actually I don't think I've seen this problem on a native host (because mine has an older kernel version and I did not change it), though I suppose it could happen too.
> 

Hi,
  I have tested NVME-over-TCP on a native arm64 host, which does not seem
to have the problem; the test log is in the attachment.

A debug patch in the attachment can be used to catch the packet_stuck_in_qdisc
problem; maybe give it a try.


> 
> ----- Original Message -----
> From: "Vincent Ray" <vray@kalrayinc.com>
> To: "linyunsheng" <linyunsheng@huawei.com>
> Cc: "vladimir oltean" <vladimir.oltean@nxp.com>, "kuba" <kuba@kernel.org>, "Samuel Jones" <sjones@kalrayinc.com>, "netdev" <netdev@vger.kernel.org>, "davem" <davem@davemloft.net>
> Sent: Friday, January 28, 2022 9:58:05 AM
> Subject: Re: packet stuck in qdisc
> 
> Hello,
> 
> It seems the problem can be reproduced easily. Is there some testcase without
> hw dependency, so that I can reproduce the problem myself?
> 
> [VR]
> Well, in my case, I produced it with the setup described below, which does include the Kalray board.
> However, I am quite confident this has nothing to do with the board, not even with any particular Host NIC / cable, so that you can surely reproduce it in a different environment.
> I think you just need to send some NVME-over-TCP traffic to any target able to receive it. In fact I suspect that any TCP traffic will do.
> Attached is the fio test case we are running after doing the following nvme connect :
>> nvme connect -n nqn.2014-06.com.kalrayinc:nvme:ecca9057-4b59-5332-8d75-5acdcdd8a88e -t tcp -i 4 -a 10.20.0.1
> And then simply
> fio ./fio_jobs_randwr4k_60s.spdk
> 
> 
> Which CPU is used in the testing? It seems the CPU arch's
> memory semantics are important when handling the rare case.
> 
> [VR] 
> It's x86_64
> FWIW, I've seen the problem happen both on a native host and within a VM (whether or not vCPUs are pinned, and whether or not fio threads are pinned).
> 
> 
> ----- Original Message -----
> From: "linyunsheng" <linyunsheng@huawei.com>
> To: "Vincent Ray" <vray@kalrayinc.com>, "vladimir oltean" <vladimir.oltean@nxp.com>, "kuba" <kuba@kernel.org>, "davem" <davem@davemloft.ne>
> Cc: "Samuel Jones" <sjones@kalrayinc.com>, "netdev" <netdev@vger.kernel.org>
> Sent: Friday, January 28, 2022 3:36:27 AM
> Subject: Re: packet stuck in qdisc
> 
> On 2022/1/25 20:55, Vincent Ray wrote:
>> Dear kernel maintainers / developers,
>>
>> I work at Kalray, where we are developing an NVME-over-TCP target controller board.
>> My setup is as such :
>> - a development workstation running Linux 5.x.y (the host)
>> - sending NVME-TCP traffic to our board, to which it is connected through a Mellanox NIC (Connect-X-5) and a 100G ETH cable
>>
>> While doing performance tests, using simple fio scenarios running over the regular kernel nvme-tcp driver on the host, we noticed important performance variations.
>> After some digging (using tcpdump on the host), we found that there were big "holes" in the tcp traffic sent by the host.
>> The scenario we observed is the following :
>> 1) a TCP segment gets lost (not sent by the host) on a particular TCP connection, leading to a gap in the seq numbers received by the board
>> 2) the board sends dup-acks and/or sacks (if configured) to signal this loss
>> 3) then, sometimes, the host stops emitting on that TCP connection for several seconds (as much as 14s observed)
>> 4) finally the host resumes emission, sending the missing packet
>> 5) then the TCP connection continues correctly with the appropriate throughput
>>
>> Such a scenario can be observed in the attached tcpdump (+ comments).
> 
> Hi,
>     Thanks for reporting the problem.
> 
>>
>> We noticed that this was happening only in recent versions of the kernel, so we bisected until we found the culprit commits:
>> we believe that the bug was introduced in the qdisc updates for 5.14-rc1 by this set of commits, more precisely the middle one:
>>
>> [2021-06-22] d3e0f57 Yunsheng Lin net: sched: remove qdisc->empty for lockless qdisc
>> [2021-06-22] c4fef01 Yunsheng Lin net: sched: implement TCQ_F_CAN_BYPASS for lockless qdisc    *=> KO*
>> [2021-06-22] dd25296 Yunsheng Lin net: sched: avoid unnecessary seqcount operation for lockless qdisc   *=> still OK*
>>
>> As far as I can tell, the bug is still present in the mainline (at least it was in 5.16-rc4).
>> From what I understand / guess, some optimizations in lockless qdiscs have led to a race making the qdisc unaware that a packet has been enqueued and is waiting for emission.
>> I suspect that, when this happens with TCP packets "to be retransmitted", the TCP stack will not time out and retransmit again, because it uses skb_still_in_host_queue() to avoid useless re-retransmissions.
>> From net/ipv4/tcp_output.c:
>> /* Thanks to skb fast clones, we can detect if a prior transmit of
>>  * a packet is still in a qdisc or driver queue.
>>  * In this case, there is very little point doing a retransmit !
>>  */
>> I guess this plays a role in making these holes grow up to 14s, and another layer than TCP might not experience it (?).
>>
>> The interface through which my traffic is going is :
>> eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
>>     link/ether b8:ce:f6:60:c9:97 brd ff:ff:ff:ff:ff:ff
>>     inet 10.20.0.254/24 scope global eth3
>>
>> As you can see, it uses a mq qdisc. I did not try with other qdiscs yet.
>>
>> Finally, if/when troubleshooting this problem in kernels older than 5.14.7, it's a good idea to first cherry-pick this patch :
>> [2021-09-09] ae66447 Keith Busch nvme-tcp: fix io_work priority inversion
>> because it fixes a nvme-tcp bug whose performance impact is itself so big that it "hides" the one we've discovered (bringing itself lots of holes at the nvme-tcp layer ...)
>>
>> On impacted kernels, the "pkt_stuck_in_qdisc" bug shows up in the order of zero to a few occurences per minute per TCP connection.
> 
> It seems the problem can be reproduced easily. Is there some testcase without
> hw dependency, so that I can reproduce the problem myself?
> 
>>
>> I did not really have time to understand it thoroughly, nor am I a network stack expert, so I don't propose any particular patch for it but I'll be happy to help testing fix attempts if it can help.
>> Please feel free to ask any additional information.
> 
> Which cpu is used in the testing? It seems the cpu arch's
> memory semantic is importance when handling the rare case.
> 
> 
>> Best regards,
>>
>>
>> *Vincent Ray*
>> *Senior Architect • Kalray*
>> Phone: +33 6 43 94 87 65
>> _vray@kalrayinc.com_ • _www.kalrayinc.com_ <https://www.kalrayinc.com>
>>
>> Kalray logo <https://www.kalrayinc.com>
>> 	
>> Intelligent Data Processing
>> From Cloud to Edge
>>
>>
>> *Please consider the environment before printing this e-mail.*
>> This message contains information that may be privileged or confidential and is the property of Kalray S.A. It is intended only for the person to whom it is addressed. If you are not the intended recipient, you are not authorized to print, retain, copy, disseminate, distribute, or use this message or any part thereof. If you receive this message in error, please notify the sender immediately and delete all copies of this message.
>>
> 
> 
> 
> 
> 
> 
> .
> 







^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc
  2022-01-31 18:39           ` Vincent Ray
@ 2022-02-07  3:17             ` Yunsheng Lin
  0 siblings, 0 replies; 28+ messages in thread
From: Yunsheng Lin @ 2022-02-07  3:17 UTC (permalink / raw)
  To: Vincent Ray; +Cc: vladimir oltean, kuba, Samuel Jones, netdev, davem

[-- Attachment #1: Type: text/plain, Size: 10118 bytes --]

On 2022/2/1 2:39, Vincent Ray wrote:
> Hi,
> 
> Thanks for giving it a try.
> Your attached log contains a lot of Chinese characters, which makes it quite difficult to read; could you strip them off?

The log is attached again after running dos2unix over it; I hope it does not contain any Chinese characters now :)

> I guess it is a fio report though.
> Can I ask on which kernel you tried ? And how did you proceed to know if the bug was present or not ?

On the latest net-next branch, which is 5.17-rc1.
The kernel is patched with packet_stuck_in_qdisc_debug.patch; there is no debug print, and the
IOPS number in the fio output changes constantly while the test runs.

Jobs: 16 (f=16): [w(16)][35.0%][r=0KiB/s,w=1492KiB/s][r=0,w=373 IOPS][eta 00m:39s]

> (as I'm sure you're aware, merely looking at fio-reported bandwidths is not enough, as the holes can sometimes be quite small or not appear at all
> => it takes several trials to catch them; in my case I scripted over tcpdump, computing timestamp differences between packets of the same connection)
> 
> I'll try your patch and let you know.
> I suppose it's meant to give me a print in dmesg if a pkt gets stuck ?

Yes.

> 
> Thanks,
> 
> V
> 
> ----- Original Message -----
> From: "linyunsheng" <linyunsheng@huawei.com>
> To: "Vincent Ray" <vray@kalrayinc.com>
> Cc: "vladimir oltean" <vladimir.oltean@nxp.com>, "kuba" <kuba@kernel.org>, "Samuel Jones" <sjones@kalrayinc.com>, "netdev" <netdev@vger.kernel.org>, "davem" <davem@davemloft.net>
> Sent: Saturday, January 29, 2022 7:53:59 AM
> Subject: Re: packet stuck in qdisc
> 
> On 2022/1/28 17:59, Vincent Ray wrote:
>> Hmm ... sorry, actually I don't think I've seen this problem on a native host (because mine has an older kernel version and I did not change it), though I suppose it could happen too.
>>
> 
> Hi,
>   I have tested NVME-over-TCP on a native arm64 host, which does not seem
> to have the problem; the test log is in the attachment.
> 
> A debug patch in the attachment can be used to catch the packet_stuck_in_qdisc
> problem; maybe give it a try.
> 
> 
>>
>> ----- Original Message -----
>> From: "Vincent Ray" <vray@kalrayinc.com>
>> To: "linyunsheng" <linyunsheng@huawei.com>
>> Cc: "vladimir oltean" <vladimir.oltean@nxp.com>, "kuba" <kuba@kernel.org>, "Samuel Jones" <sjones@kalrayinc.com>, "netdev" <netdev@vger.kernel.org>, "davem" <davem@davemloft.net>
>> Sent: Friday, January 28, 2022 9:58:05 AM
>> Subject: Re: packet stuck in qdisc
>>
>> Hello,
>>
>> It seems the problem can be reproduced easily. Is there some testcase without
>> hw dependency, so that I can reproduce the problem myself?
>>
>> [VR]
>> Well, in my case, I produced it with the setup described below, which does include the Kalray board.
>> However, I am quite confident this has nothing to do with the board, not even with any particular Host NIC / cable, so that you can surely reproduce it in a different environment.
>> I think you just need to send some NVME-over-TCP traffic to any target able to receive it. In fact I suspect that any TCP traffic will do.
>> Attached is the fio test case we are running after doing the following nvme connect :
>> nvme connect -n nqn.2014-06.com.kalrayinc:nvme:ecca9057-4b59-5332-8d75-5acdcdd8a88e -t tcp -i 4 -a 10.20.0.1
>> And then simply
>> fio ./fio_jobs_randwr4k_60s.spdk
>>
>>
>> Which CPU is used in the testing? It seems the CPU arch's
>> memory semantics are important when handling the rare case.
>>
>> [VR] 
>> It's x86_64
>> FWIW, I've seen the problem happen both on a native host and within a VM (whether or not vCPUs are pinned, and whether or not fio threads are pinned).
>>
>>
>> ----- Original Message -----
>> From: "linyunsheng" <linyunsheng@huawei.com>
>> To: "Vincent Ray" <vray@kalrayinc.com>, "vladimir oltean" <vladimir.oltean@nxp.com>, "kuba" <kuba@kernel.org>, "davem" <davem@davemloft.ne>
>> Cc: "Samuel Jones" <sjones@kalrayinc.com>, "netdev" <netdev@vger.kernel.org>
>> Sent: Friday, January 28, 2022 3:36:27 AM
>> Subject: Re: packet stuck in qdisc
>>
>> On 2022/1/25 20:55, Vincent Ray wrote:
>>> Dear kernel maintainers / developers,
>>>
>>> I work at Kalray, where we are developing an NVME-over-TCP target controller board.
>>> My setup is as such :
>>> - a development workstation running Linux 5.x.y (the host)
>>> - sending NVME-TCP traffic to our board, to which it is connected through a Mellanox NIC (Connect-X-5) and a 100G ETH cable
>>>
>>> While doing performance tests, using simple fio scenarios running over the regular kernel nvme-tcp driver on the host, we noticed important performance variations.
>>> After some digging (using tcpdump on the host), we found that there were big "holes" in the tcp traffic sent by the host.
>>> The scenario we observed is the following :
>>> 1) a TCP segment gets lost (not sent by the host) on a particular TCP connection, leading to a gap in the seq numbers received by the board
>>> 2) the board sends dup-acks and/or sacks (if configured) to signal this loss
>>> 3) then, sometimes, the host stops emitting on that TCP connection for several seconds (as much as 14s observed)
>>> 4) finally the host resumes emission, sending the missing packet
>>> 5) then the TCP connection continues correctly with the appropriate throughput
>>>
>>> Such a scenario can be observed in the attached tcpdump (+ comments).
>>
>> Hi,
>>     Thanks for reporting the problem.
>>
>>>
>>> We noticed that this was happening only in recent versions of the kernel, so we bisected until we found the culprit commits:
>>> we believe that the bug was introduced in the qdisc updates for 5.14-rc1 by this set of commits, more precisely the middle one:
>>>
>>> [2021-06-22] d3e0f57 Yunsheng Lin net: sched: remove qdisc->empty for lockless qdisc
>>> [2021-06-22] c4fef01 Yunsheng Lin net: sched: implement TCQ_F_CAN_BYPASS for lockless qdisc    *=> KO*
>>> [2021-06-22] dd25296 Yunsheng Lin net: sched: avoid unnecessary seqcount operation for lockless qdisc   *=> still OK*
>>>
>>> As far as I can tell, the bug is still present in the mainline (at least it was in 5.16-rc4).
>>> From what I understand / guess, some optimizations in lockless qdiscs have led to a race making the qdisc unaware that a packet has been enqueued and is waiting for emission.
>>> I suspect that, when this happens with TCP packets "to be retransmitted", the TCP stack will not time out and retransmit again, because it uses skb_still_in_host_queue() to avoid useless re-retransmissions.
>>> From net/ipv4/tcp_output.c:
>>> /* Thanks to skb fast clones, we can detect if a prior transmit of
>>>  * a packet is still in a qdisc or driver queue.
>>>  * In this case, there is very little point doing a retransmit !
>>>  */
>>> I guess this plays a role in making these holes grow up to 14s, and another layer than TCP might not experience it (?).
>>>
>>> The interface through which my traffic is going is :
>>> eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
>>>     link/ether b8:ce:f6:60:c9:97 brd ff:ff:ff:ff:ff:ff
>>>     inet 10.20.0.254/24 scope global eth3
>>>
>>> As you can see, it uses a mq qdisc. I did not try with other qdiscs yet.
>>>
>>> Finally, if/when troubleshooting this problem in kernels older than 5.14.7, it's a good idea to first cherry-pick this patch :
>>> [2021-09-09] ae66447 Keith Busch nvme-tcp: fix io_work priority inversion
>>> because it fixes a nvme-tcp bug whose performance impact is itself so big that it "hides" the one we've discovered (bringing itself lots of holes at the nvme-tcp layer ...)
>>>
>>> On impacted kernels, the "pkt_stuck_in_qdisc" bug shows up in the order of zero to a few occurences per minute per TCP connection.
>>
>> It seems the problem can be reproduced easily. Is there some testcase without
>> hw dependency, so that I can reproduce the problem myself?
>>
>>>
>>> I did not really have time to understand it thoroughly, nor am I a network stack expert, so I don't propose any particular patch for it but I'll be happy to help testing fix attempts if it can help.
>>> Please feel free to ask any additional information.
>>
>> Which CPU is used in the testing? It seems the CPU arch's
>> memory semantics are important when handling the rare case.
>>
>>
>>> Best regards,
>>>
>>>
>>> *Vincent Ray*
>>> *Senior Architect • Kalray*
>>> Phone: +33 6 43 94 87 65
>>> _vray@kalrayinc.com_ • _www.kalrayinc.com_ <https://www.kalrayinc.com>
>>>
>>> Kalray logo <https://www.kalrayinc.com>
>>> 	
>>> Intelligent Data Processing
>>> From Cloud to Edge
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>> .
>>
> 
> 
> 
> 
> 
> 
> .
> 

[-- Attachment #2: board server_2022-01-29_11-20-29-fio_test.log --]
[-- Type: text/plain, Size: 210450 bytes --]

[BEGIN] 2022/1/29 11:20:41
root@(none):~# cat fio.spdk 
[job000]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=0

[job001]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=1

[job002]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=2

[job003]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=3

[job004]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=4

[job005]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=5

[job006]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=6

[job007]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=7

[job008]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=8

[job009]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=9

[job010]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=10

[job011]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=11

[job012]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=12

[job013]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=13

[job014]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=14

[job015]
filename=/dev/nvme0n1
ioengine=libaio
thread=1
group_reporting=0
direct=1
norandommap=1
rw=randrw
rwmixread=0
bs=4k
iodepth=16
time_based=1
runtime=60
cpus_allowed=15
root@(none):~# fio fio.spdk 
job000: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job001: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job002: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job003: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job004: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job005: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job006: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job007: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job008: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job009: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job010: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job011: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job012: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job013: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job014: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job015: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
fio-3.1
Starting 16 threads
Jobs: 16 (f=16): [w(16)][100.0%][r=0KiB/s,w=2012KiB/s][r=0,w=503 IOPS][eta 00m:00s]
job000: (groupid=0, jobs=1): err= 0: pid=1374: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=143KiB/s (146kB/s)(8616KiB/60381msec)
    slat (nsec): min=2620, max=64560, avg=6973.12, stdev=1568.22
    clat (usec): min=202, max=3232.7k, avg=448337.42, stdev=267922.27
     lat (usec): min=267, max=3232.7k, avg=448344.60, stdev=267922.39
    clat percentiles (msec):
     |  1.00th=[   21],  5.00th=[   44], 10.00th=[   73], 20.00th=[  194],
     | 30.00th=[  279], 40.00th=[  368], 50.00th=[  451], 60.00th=[  527],
     | 70.00th=[  617], 80.00th=[  693], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  936], 99.50th=[ 1099], 99.90th=[ 2005], 99.95th=[ 2601],
     | 99.99th=[ 3239]
   bw (  KiB/s): min=   64, max= 1144, per=6.22%, avg=142.52, stdev=122.71, samples=120
   iops        : min=   16, max=  286, avg=35.62, stdev=30.68, samples=120
  lat (usec)   : 250=0.05%
  lat (msec)   : 20=0.84%, 50=4.87%, 100=6.22%, 250=14.07%, 500=29.62%
  lat (msec)   : 750=31.48%, 1000=12.12%, 2000=0.60%, >=2000=0.14%
  cpu          : usr=0.00%, sys=0.03%, ctx=2166, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2154,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job001: (groupid=0, jobs=1): err= 0: pid=1375: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=142KiB/s (146kB/s)(8608KiB/60413msec)
    slat (nsec): min=2910, max=46710, avg=7082.46, stdev=1181.65
    clat (msec): min=9, max=3182, avg=448.97, stdev=284.65
     lat (msec): min=9, max=3182, avg=448.98, stdev=284.65
    clat percentiles (msec):
     |  1.00th=[   20],  5.00th=[   41], 10.00th=[   74], 20.00th=[  190],
     | 30.00th=[  275], 40.00th=[  359], 50.00th=[  439], 60.00th=[  523],
     | 70.00th=[  609], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[  995], 99.50th=[ 1586], 99.90th=[ 2534], 99.95th=[ 2601],
     | 99.99th=[ 3171]
   bw (  KiB/s): min=   56, max= 1152, per=6.22%, avg=142.40, stdev=119.55, samples=120
   iops        : min=   14, max=  288, avg=35.60, stdev=29.89, samples=120
  lat (msec)   : 10=0.05%, 20=1.07%, 50=5.58%, 100=5.02%, 250=15.89%
  lat (msec)   : 500=30.20%, 750=28.07%, 1000=13.15%, 2000=0.70%, >=2000=0.28%
  cpu          : usr=0.01%, sys=0.03%, ctx=2164, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2152,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job002: (groupid=0, jobs=1): err= 0: pid=1376: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=143KiB/s (147kB/s)(8656KiB/60466msec)
    slat (nsec): min=2580, max=30880, avg=6461.90, stdev=727.08
    clat (msec): min=10, max=3021, avg=446.86, stdev=275.34
     lat (msec): min=10, max=3021, avg=446.86, stdev=275.34
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   46], 10.00th=[   75], 20.00th=[  190],
     | 30.00th=[  275], 40.00th=[  355], 50.00th=[  439], 60.00th=[  514],
     | 70.00th=[  609], 80.00th=[  684], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  969], 99.50th=[ 1519], 99.90th=[ 2333], 99.95th=[ 2735],
     | 99.99th=[ 3037]
   bw (  KiB/s): min=   56, max= 1168, per=6.25%, avg=143.20, stdev=118.89, samples=120
   iops        : min=   14, max=  292, avg=35.80, stdev=29.72, samples=120
  lat (msec)   : 20=1.20%, 50=4.53%, 100=5.82%, 250=14.74%, 500=31.24%
  lat (msec)   : 750=29.57%, 1000=12.06%, 2000=0.69%, >=2000=0.14%
  cpu          : usr=0.01%, sys=0.02%, ctx=2168, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2164,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job003: (groupid=0, jobs=1): err= 0: pid=1377: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=144KiB/s (148kB/s)(8716KiB/60465msec)
    slat (nsec): min=2620, max=56210, avg=7801.05, stdev=1380.43
    clat (usec): min=235, max=4327.5k, avg=443765.83, stdev=291780.83
     lat (usec): min=292, max=4327.5k, avg=443773.86, stdev=291780.98
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   44], 10.00th=[   73], 20.00th=[  182],
     | 30.00th=[  271], 40.00th=[  342], 50.00th=[  430], 60.00th=[  527],
     | 70.00th=[  600], 80.00th=[  693], 90.00th=[  785], 95.00th=[  827],
     | 99.00th=[  944], 99.50th=[ 1301], 99.90th=[ 2635], 99.95th=[ 3876],
     | 99.99th=[ 4329]
   bw (  KiB/s): min=   64, max= 1256, per=6.30%, avg=144.18, stdev=127.61, samples=120
   iops        : min=   16, max=  314, avg=36.03, stdev=31.90, samples=120
  lat (usec)   : 250=0.05%, 1000=0.05%
  lat (msec)   : 4=0.05%, 10=0.09%, 20=1.51%, 50=4.73%, 100=5.92%
  lat (msec)   : 250=15.01%, 500=29.46%, 750=29.97%, 1000=12.30%, 2000=0.60%
  lat (msec)   : >=2000=0.28%
  cpu          : usr=0.01%, sys=0.03%, ctx=2188, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2179,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job004: (groupid=0, jobs=1): err= 0: pid=1378: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=144KiB/s (148kB/s)(8716KiB/60407msec)
    slat (nsec): min=2440, max=79420, avg=6217.45, stdev=1652.58
    clat (usec): min=386, max=2823.0k, avg=443380.65, stdev=272277.40
     lat (usec): min=466, max=2823.0k, avg=443387.05, stdev=272277.42
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   40], 10.00th=[   70], 20.00th=[  199],
     | 30.00th=[  275], 40.00th=[  351], 50.00th=[  435], 60.00th=[  523],
     | 70.00th=[  600], 80.00th=[  684], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  927], 99.50th=[ 1469], 99.90th=[ 2106], 99.95th=[ 2567],
     | 99.99th=[ 2836]
   bw (  KiB/s): min=   56, max= 1216, per=6.30%, avg=144.17, stdev=126.51, samples=120
   iops        : min=   14, max=  304, avg=36.02, stdev=31.63, samples=120
  lat (usec)   : 500=0.05%
  lat (msec)   : 4=0.05%, 10=0.14%, 20=1.65%, 50=4.86%, 100=5.42%
  lat (msec)   : 250=14.69%, 500=30.79%, 750=29.60%, 1000=11.98%, 2000=0.55%
  lat (msec)   : >=2000=0.23%
  cpu          : usr=0.01%, sys=0.02%, ctx=2184, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2179,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job005: (groupid=0, jobs=1): err= 0: pid=1379: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=144KiB/s (148kB/s)(8720KiB/60442msec)
    slat (nsec): min=2520, max=32050, avg=7001.39, stdev=1241.59
    clat (msec): min=4, max=2307, avg=443.41, stdev=268.39
     lat (msec): min=4, max=2307, avg=443.42, stdev=268.39
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   44], 10.00th=[   73], 20.00th=[  188],
     | 30.00th=[  271], 40.00th=[  355], 50.00th=[  430], 60.00th=[  518],
     | 70.00th=[  600], 80.00th=[  693], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  953], 99.50th=[ 1284], 99.90th=[ 2165], 99.95th=[ 2232],
     | 99.99th=[ 2299]
   bw (  KiB/s): min=   64, max= 1200, per=6.30%, avg=144.23, stdev=121.99, samples=120
   iops        : min=   16, max=  300, avg=36.03, stdev=30.50, samples=120
  lat (msec)   : 10=0.05%, 20=1.42%, 50=4.77%, 100=5.78%, 250=15.96%
  lat (msec)   : 500=29.95%, 750=29.72%, 1000=11.61%, 2000=0.55%, >=2000=0.18%
  cpu          : usr=0.01%, sys=0.02%, ctx=2195, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2180,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job006: (groupid=0, jobs=1): err= 0: pid=1380: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=143KiB/s (146kB/s)(8628KiB/60435msec)
    slat (nsec): min=4010, max=38940, avg=6295.87, stdev=866.71
    clat (msec): min=2, max=2437, avg=448.09, stdev=267.61
     lat (msec): min=2, max=2437, avg=448.09, stdev=267.61
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   40], 10.00th=[   72], 20.00th=[  190],
     | 30.00th=[  279], 40.00th=[  372], 50.00th=[  451], 60.00th=[  527],
     | 70.00th=[  609], 80.00th=[  693], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[  961], 99.50th=[ 1150], 99.90th=[ 1972], 99.95th=[ 2433],
     | 99.99th=[ 2433]
   bw (  KiB/s): min=   48, max= 1232, per=6.23%, avg=142.72, stdev=125.77, samples=120
   iops        : min=   12, max=  308, avg=35.67, stdev=31.45, samples=120
  lat (msec)   : 4=0.05%, 10=0.09%, 20=1.11%, 50=5.42%, 100=5.56%
  lat (msec)   : 250=14.74%, 500=29.86%, 750=29.11%, 1000=13.26%, 2000=0.70%
  lat (msec)   : >=2000=0.09%
  cpu          : usr=0.01%, sys=0.02%, ctx=2166, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2157,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job007: (groupid=0, jobs=1): err= 0: pid=1381: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=144KiB/s (148kB/s)(8728KiB/60469msec)
    slat (nsec): min=2800, max=78630, avg=6825.65, stdev=1746.73
    clat (usec): min=1538, max=2636.5k, avg=443178.69, stdev=270888.61
     lat (usec): min=1617, max=2636.5k, avg=443185.73, stdev=270888.68
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   39], 10.00th=[   67], 20.00th=[  186],
     | 30.00th=[  271], 40.00th=[  347], 50.00th=[  435], 60.00th=[  527],
     | 70.00th=[  617], 80.00th=[  693], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[  953], 99.50th=[ 1250], 99.90th=[ 2232], 99.95th=[ 2467],
     | 99.99th=[ 2635]
   bw (  KiB/s): min=   48, max= 1328, per=6.30%, avg=144.38, stdev=132.70, samples=120
   iops        : min=   12, max=  332, avg=36.08, stdev=33.18, samples=120
  lat (msec)   : 2=0.05%, 10=0.09%, 20=1.28%, 50=5.50%, 100=5.55%
  lat (msec)   : 250=14.67%, 500=30.34%, 750=29.10%, 1000=12.60%, 2000=0.69%
  lat (msec)   : >=2000=0.14%
  cpu          : usr=0.01%, sys=0.02%, ctx=2202, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2182,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job008: (groupid=0, jobs=1): err= 0: pid=1382: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=141KiB/s (145kB/s)(8548KiB/60425msec)
    slat (nsec): min=2350, max=25210, avg=6648.75, stdev=677.19
    clat (msec): min=3, max=2657, avg=452.20, stdev=274.45
     lat (msec): min=3, max=2658, avg=452.20, stdev=274.45
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   45], 10.00th=[   78], 20.00th=[  192],
     | 30.00th=[  275], 40.00th=[  363], 50.00th=[  447], 60.00th=[  531],
     | 70.00th=[  617], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[ 1011], 99.50th=[ 1234], 99.90th=[ 2500], 99.95th=[ 2534],
     | 99.99th=[ 2668]
   bw (  KiB/s): min=   40, max= 1176, per=6.18%, avg=141.47, stdev=119.64, samples=120
   iops        : min=   10, max=  294, avg=35.37, stdev=29.91, samples=120
  lat (msec)   : 4=0.05%, 10=0.05%, 20=1.17%, 50=4.54%, 100=5.85%
  lat (msec)   : 250=15.40%, 500=29.53%, 750=29.67%, 1000=12.63%, 2000=0.89%
  lat (msec)   : >=2000=0.23%
  cpu          : usr=0.00%, sys=0.03%, ctx=2145, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2137,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job009: (groupid=0, jobs=1): err= 0: pid=1383: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=146KiB/s (149kB/s)(8816KiB/60418msec)
    slat (nsec): min=2700, max=47330, avg=6945.72, stdev=1046.36
    clat (msec): min=4, max=2775, avg=438.41, stdev=275.59
     lat (msec): min=4, max=2775, avg=438.42, stdev=275.59
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   47], 10.00th=[   77], 20.00th=[  188],
     | 30.00th=[  262], 40.00th=[  351], 50.00th=[  439], 60.00th=[  506],
     | 70.00th=[  584], 80.00th=[  667], 90.00th=[  768], 95.00th=[  827],
     | 99.00th=[  969], 99.50th=[ 1569], 99.90th=[ 2433], 99.95th=[ 2702],
     | 99.99th=[ 2769]
   bw (  KiB/s): min=   80, max= 1256, per=6.37%, avg=145.87, stdev=124.92, samples=120
   iops        : min=   20, max=  314, avg=36.47, stdev=31.23, samples=120
  lat (msec)   : 10=0.09%, 20=1.32%, 50=4.36%, 100=6.08%, 250=16.97%
  lat (msec)   : 500=29.95%, 750=30.13%, 1000=10.12%, 2000=0.68%, >=2000=0.32%
  cpu          : usr=0.00%, sys=0.04%, ctx=2218, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2204,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job010: (groupid=0, jobs=1): err= 0: pid=1384: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=143KiB/s (147kB/s)(8668KiB/60443msec)
    slat (nsec): min=2400, max=47130, avg=6453.77, stdev=1001.36
    clat (msec): min=4, max=2477, avg=446.08, stdev=266.52
     lat (msec): min=4, max=2477, avg=446.08, stdev=266.52
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   40], 10.00th=[   69], 20.00th=[  188],
     | 30.00th=[  279], 40.00th=[  359], 50.00th=[  451], 60.00th=[  531],
     | 70.00th=[  609], 80.00th=[  693], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[  961], 99.50th=[ 1267], 99.90th=[ 2072], 99.95th=[ 2299],
     | 99.99th=[ 2467]
   bw (  KiB/s): min=   80, max= 1296, per=6.26%, avg=143.40, stdev=132.23, samples=120
   iops        : min=   20, max=  324, avg=35.85, stdev=33.06, samples=120
  lat (msec)   : 10=0.09%, 20=1.34%, 50=5.63%, 100=5.68%, 250=13.80%
  lat (msec)   : 500=30.00%, 750=30.69%, 1000=11.91%, 2000=0.74%, >=2000=0.14%
  cpu          : usr=0.01%, sys=0.02%, ctx=2175, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2167,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job011: (groupid=0, jobs=1): err= 0: pid=1385: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=142KiB/s (146kB/s)(8600KiB/60451msec)
    slat (nsec): min=2820, max=46210, avg=7015.60, stdev=1116.46
    clat (msec): min=2, max=3147, avg=449.67, stdev=284.51
     lat (msec): min=2, max=3147, avg=449.68, stdev=284.51
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   44], 10.00th=[   74], 20.00th=[  182],
     | 30.00th=[  279], 40.00th=[  359], 50.00th=[  443], 60.00th=[  523],
     | 70.00th=[  609], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[  978], 99.50th=[ 1469], 99.90th=[ 2903], 99.95th=[ 2937],
     | 99.99th=[ 3138]
   bw (  KiB/s): min=   56, max= 1200, per=6.21%, avg=142.27, stdev=122.46, samples=120
   iops        : min=   14, max=  300, avg=35.57, stdev=30.61, samples=120
  lat (msec)   : 4=0.05%, 20=1.16%, 50=4.60%, 100=6.19%, 250=14.93%
  lat (msec)   : 500=30.28%, 750=28.93%, 1000=12.88%, 2000=0.74%, >=2000=0.23%
  cpu          : usr=0.02%, sys=0.01%, ctx=2173, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2150,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job012: (groupid=0, jobs=1): err= 0: pid=1386: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=142KiB/s (146kB/s)(8596KiB/60463msec)
    slat (nsec): min=2530, max=39440, avg=6624.45, stdev=863.91
    clat (usec): min=433, max=4594.3k, avg=449973.00, stdev=287748.67
     lat (usec): min=480, max=4594.3k, avg=449979.80, stdev=287748.72
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   41], 10.00th=[   74], 20.00th=[  199],
     | 30.00th=[  279], 40.00th=[  355], 50.00th=[  439], 60.00th=[  523],
     | 70.00th=[  609], 80.00th=[  693], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[ 1116], 99.50th=[ 1368], 99.90th=[ 2467], 99.95th=[ 2903],
     | 99.99th=[ 4597]
   bw (  KiB/s): min=   32, max= 1120, per=6.21%, avg=142.20, stdev=118.91, samples=120
   iops        : min=    8, max=  280, avg=35.55, stdev=29.73, samples=120
  lat (usec)   : 500=0.05%
  lat (msec)   : 10=0.09%, 20=1.12%, 50=5.21%, 100=5.21%, 250=15.08%
  lat (msec)   : 500=30.43%, 750=29.69%, 1000=11.96%, 2000=0.88%, >=2000=0.28%
  cpu          : usr=0.00%, sys=0.03%, ctx=2155, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2149,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job013: (groupid=0, jobs=1): err= 0: pid=1387: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=143KiB/s (146kB/s)(8632KiB/60445msec)
    slat (nsec): min=2380, max=37540, avg=7318.07, stdev=1044.01
    clat (usec): min=1726, max=2985.8k, avg=447946.88, stdev=266479.06
     lat (usec): min=1771, max=2985.8k, avg=447954.52, stdev=266479.15
    clat percentiles (msec):
     |  1.00th=[   20],  5.00th=[   43], 10.00th=[   74], 20.00th=[  197],
     | 30.00th=[  279], 40.00th=[  363], 50.00th=[  451], 60.00th=[  527],
     | 70.00th=[  617], 80.00th=[  693], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  919], 99.50th=[ 1150], 99.90th=[ 2039], 99.95th=[ 2735],
     | 99.99th=[ 2970]
   bw (  KiB/s): min=   72, max= 1232, per=6.24%, avg=142.80, stdev=123.80, samples=120
   iops        : min=   18, max=  308, avg=35.70, stdev=30.95, samples=120
  lat (msec)   : 2=0.05%, 10=0.05%, 20=1.02%, 50=5.51%, 100=5.42%
  lat (msec)   : 250=14.37%, 500=29.70%, 750=30.72%, 1000=12.47%, 2000=0.56%
  lat (msec)   : >=2000=0.14%
  cpu          : usr=0.01%, sys=0.02%, ctx=2178, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2158,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job014: (groupid=0, jobs=1): err= 0: pid=1388: Sat Jan 29 03:18:26 2022
  write: IOPS=35, BW=140KiB/s (144kB/s)(8492KiB/60468msec)
    slat (nsec): min=2350, max=40180, avg=6659.86, stdev=864.58
    clat (usec): min=522, max=4966.6k, avg=455499.61, stdev=305720.10
     lat (usec): min=570, max=4966.6k, avg=455506.44, stdev=305720.17
    clat percentiles (msec):
     |  1.00th=[   20],  5.00th=[   44], 10.00th=[   70], 20.00th=[  188],
     | 30.00th=[  271], 40.00th=[  359], 50.00th=[  447], 60.00th=[  542],
     | 70.00th=[  617], 80.00th=[  701], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[ 1099], 99.50th=[ 1653], 99.90th=[ 2970], 99.95th=[ 4010],
     | 99.99th=[ 5000]
   bw (  KiB/s): min=   72, max= 1272, per=6.13%, avg=140.47, stdev=128.42, samples=120
   iops        : min=   18, max=  318, avg=35.12, stdev=32.10, samples=120
  lat (usec)   : 750=0.05%
  lat (msec)   : 20=1.22%, 50=5.09%, 100=6.26%, 250=15.03%, 500=28.40%
  lat (msec)   : 750=29.35%, 1000=13.38%, 2000=0.99%, >=2000=0.24%
  cpu          : usr=0.01%, sys=0.02%, ctx=2126, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2123,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job015: (groupid=0, jobs=1): err= 0: pid=1389: Sat Jan 29 03:18:26 2022
  write: IOPS=36, BW=145KiB/s (149kB/s)(8776KiB/60439msec)
    slat (nsec): min=2610, max=40250, avg=7126.12, stdev=1121.55
    clat (usec): min=630, max=3019.9k, avg=440554.76, stdev=265226.37
     lat (usec): min=677, max=3019.9k, avg=440562.13, stdev=265226.43
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   41], 10.00th=[   70], 20.00th=[  194],
     | 30.00th=[  279], 40.00th=[  355], 50.00th=[  435], 60.00th=[  514],
     | 70.00th=[  592], 80.00th=[  676], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  944], 99.50th=[ 1401], 99.90th=[ 1670], 99.95th=[ 2534],
     | 99.99th=[ 3004]
   bw (  KiB/s): min=   32, max= 1208, per=6.34%, avg=145.20, stdev=125.41, samples=120
   iops        : min=    8, max=  302, avg=36.30, stdev=31.35, samples=120
  lat (usec)   : 750=0.05%
  lat (msec)   : 4=0.05%, 10=0.05%, 20=1.19%, 50=5.29%, 100=5.52%
  lat (msec)   : 250=14.77%, 500=31.49%, 750=29.08%, 1000=11.90%, 2000=0.55%
  lat (msec)   : >=2000=0.09%
  cpu          : usr=0.02%, sys=0.02%, ctx=2214, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2194,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=2291KiB/s (2346kB/s), 140KiB/s-146KiB/s (144kB/s-149kB/s), io=135MiB (142MB), run=60381-60469msec

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
root@(none):~# fio fio.spdk 
job000: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job001: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job002: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job003: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job004: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job005: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job006: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job007: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job008: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job009: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job010: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job011: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job012: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job013: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job014: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job015: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
fio-3.1
Starting 16 threads
Jobs: 16 (f=16): [w(16)][100.0%][r=0KiB/s,w=2092KiB/s][r=0,w=523 IOPS][eta 00m:00s]
job000: (groupid=0, jobs=1): err= 0: pid=1392: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=131KiB/s (134kB/s)(7920KiB/60458msec)
    slat (nsec): min=2510, max=52450, avg=6720.86, stdev=1426.08
    clat (usec): min=267, max=3352.4k, avg=488306.04, stdev=256647.92
     lat (usec): min=321, max=3352.4k, avg=488312.97, stdev=256647.90
    clat percentiles (msec):
     |  1.00th=[   20],  5.00th=[  142], 10.00th=[  178], 20.00th=[  257],
     | 30.00th=[  334], 40.00th=[  405], 50.00th=[  481], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  701], 90.00th=[  785], 95.00th=[  844],
     | 99.00th=[ 1070], 99.50th=[ 1401], 99.90th=[ 2668], 99.95th=[ 3339],
     | 99.99th=[ 3339]
   bw (  KiB/s): min=   64, max=  504, per=6.29%, avg=130.93, stdev=41.08, samples=120
   iops        : min=   16, max=  126, avg=32.73, stdev=10.27, samples=120
  lat (usec)   : 500=0.10%, 1000=0.05%
  lat (msec)   : 2=0.05%, 4=0.25%, 10=0.15%, 20=0.45%, 50=0.66%
  lat (msec)   : 100=1.06%, 250=16.46%, 500=34.14%, 750=32.47%, 1000=12.88%
  lat (msec)   : 2000=1.01%, >=2000=0.25%
  cpu          : usr=0.01%, sys=0.02%, ctx=1993, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1980,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job001: (groupid=0, jobs=1): err= 0: pid=1393: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=128KiB/s (131kB/s)(7752KiB/60471msec)
    slat (nsec): min=4620, max=48940, avg=6718.31, stdev=1197.97
    clat (usec): min=354, max=3270.3k, avg=498998.78, stdev=272368.89
     lat (usec): min=404, max=3270.3k, avg=499005.72, stdev=272368.85
    clat percentiles (msec):
     |  1.00th=[   23],  5.00th=[  133], 10.00th=[  178], 20.00th=[  257],
     | 30.00th=[  334], 40.00th=[  414], 50.00th=[  493], 60.00th=[  567],
     | 70.00th=[  651], 80.00th=[  718], 90.00th=[  802], 95.00th=[  844],
     | 99.00th=[ 1099], 99.50th=[ 1636], 99.90th=[ 2735], 99.95th=[ 3272],
     | 99.99th=[ 3272]
   bw (  KiB/s): min=   72, max=  624, per=6.16%, avg=128.13, stdev=49.69, samples=120
   iops        : min=   18, max=  156, avg=32.02, stdev=12.42, samples=120
  lat (usec)   : 500=0.05%, 750=0.05%
  lat (msec)   : 2=0.21%, 4=0.21%, 10=0.10%, 20=0.36%, 50=0.88%
  lat (msec)   : 100=1.75%, 250=15.02%, 500=32.30%, 750=33.23%, 1000=14.45%
  lat (msec)   : 2000=0.93%, >=2000=0.46%
  cpu          : usr=0.00%, sys=0.03%, ctx=1954, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1938,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job002: (groupid=0, jobs=1): err= 0: pid=1394: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=130KiB/s (133kB/s)(7864KiB/60419msec)
    slat (nsec): min=2690, max=26850, avg=6769.13, stdev=982.83
    clat (usec): min=1372, max=3288.0k, avg=491496.20, stdev=260042.95
     lat (usec): min=1404, max=3288.0k, avg=491503.21, stdev=260042.96
    clat percentiles (msec):
     |  1.00th=[   39],  5.00th=[  142], 10.00th=[  180], 20.00th=[  251],
     | 30.00th=[  330], 40.00th=[  405], 50.00th=[  481], 60.00th=[  567],
     | 70.00th=[  634], 80.00th=[  718], 90.00th=[  793], 95.00th=[  852],
     | 99.00th=[ 1053], 99.50th=[ 1401], 99.90th=[ 3004], 99.95th=[ 3306],
     | 99.99th=[ 3306]
   bw (  KiB/s): min=   88, max=  488, per=6.25%, avg=130.00, stdev=37.20, samples=120
   iops        : min=   22, max=  122, avg=32.50, stdev= 9.30, samples=120
  lat (msec)   : 2=0.05%, 20=0.20%, 50=1.32%, 100=1.22%, 250=17.14%
  lat (msec)   : 500=32.96%, 750=31.43%, 1000=14.39%, 2000=1.07%, >=2000=0.20%
  cpu          : usr=0.01%, sys=0.02%, ctx=1973, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1966,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job003: (groupid=0, jobs=1): err= 0: pid=1395: Sat Jan 29 03:19:28 2022
  write: IOPS=31, BW=127KiB/s (130kB/s)(7656KiB/60449msec)
    slat (nsec): min=2510, max=51640, avg=7222.55, stdev=1436.91
    clat (msec): min=3, max=3658, avg=505.09, stdev=281.43
     lat (msec): min=3, max=3658, avg=505.10, stdev=281.43
    clat percentiles (msec):
     |  1.00th=[   35],  5.00th=[  148], 10.00th=[  186], 20.00th=[  262],
     | 30.00th=[  342], 40.00th=[  426], 50.00th=[  493], 60.00th=[  575],
     | 70.00th=[  651], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1083], 99.50th=[ 1720], 99.90th=[ 3373], 99.95th=[ 3675],
     | 99.99th=[ 3675]
   bw (  KiB/s): min=   72, max=  528, per=6.08%, avg=126.53, stdev=42.53, samples=120
   iops        : min=   18, max=  132, avg=31.63, stdev=10.63, samples=120
  lat (msec)   : 4=0.05%, 10=0.10%, 20=0.37%, 50=0.99%, 100=1.62%
  lat (msec)   : 250=15.52%, 500=32.39%, 750=33.59%, 1000=13.85%, 2000=1.10%
  lat (msec)   : >=2000=0.42%
  cpu          : usr=0.01%, sys=0.02%, ctx=1928, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1914,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job004: (groupid=0, jobs=1): err= 0: pid=1396: Sat Jan 29 03:19:28 2022
  write: IOPS=31, BW=127KiB/s (130kB/s)(7692KiB/60398msec)
    slat (nsec): min=2430, max=41440, avg=6762.38, stdev=983.13
    clat (msec): min=5, max=3863, avg=502.31, stdev=284.93
     lat (msec): min=5, max=3863, avg=502.32, stdev=284.93
    clat percentiles (msec):
     |  1.00th=[   32],  5.00th=[  138], 10.00th=[  182], 20.00th=[  266],
     | 30.00th=[  347], 40.00th=[  422], 50.00th=[  493], 60.00th=[  567],
     | 70.00th=[  642], 80.00th=[  718], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[ 1083], 99.50th=[ 1737], 99.90th=[ 3809], 99.95th=[ 3876],
     | 99.99th=[ 3876]
   bw (  KiB/s): min=   48, max=  536, per=6.11%, avg=127.13, stdev=42.93, samples=120
   iops        : min=   12, max=  134, avg=31.78, stdev=10.73, samples=120
  lat (msec)   : 10=0.10%, 20=0.26%, 50=1.30%, 100=1.46%, 250=15.50%
  lat (msec)   : 500=32.61%, 750=33.75%, 1000=13.62%, 2000=1.14%, >=2000=0.26%
  cpu          : usr=0.00%, sys=0.02%, ctx=1929, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1923,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job005: (groupid=0, jobs=1): err= 0: pid=1397: Sat Jan 29 03:19:28 2022
  write: IOPS=33, BW=132KiB/s (135kB/s)(7996KiB/60440msec)
    slat (nsec): min=2380, max=42580, avg=7240.90, stdev=1366.94
    clat (msec): min=4, max=2363, avg=483.54, stdev=253.74
     lat (msec): min=4, max=2363, avg=483.55, stdev=253.74
    clat percentiles (msec):
     |  1.00th=[   27],  5.00th=[  131], 10.00th=[  174], 20.00th=[  247],
     | 30.00th=[  317], 40.00th=[  401], 50.00th=[  477], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  701], 90.00th=[  785], 95.00th=[  844],
     | 99.00th=[ 1045], 99.50th=[ 1603], 99.90th=[ 2123], 99.95th=[ 2366],
     | 99.99th=[ 2366]
   bw (  KiB/s): min=   64, max=  648, per=6.36%, avg=132.20, stdev=52.29, samples=120
   iops        : min=   16, max=  162, avg=33.05, stdev=13.07, samples=120
  lat (msec)   : 10=0.10%, 20=0.45%, 50=1.35%, 100=1.80%, 250=17.01%
  lat (msec)   : 500=31.97%, 750=33.12%, 1000=12.96%, 2000=1.10%, >=2000=0.15%
  cpu          : usr=0.00%, sys=0.03%, ctx=2004, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1999,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job006: (groupid=0, jobs=1): err= 0: pid=1398: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=130KiB/s (133kB/s)(7840KiB/60473msec)
    slat (nsec): min=2430, max=42960, avg=6686.17, stdev=1014.28
    clat (msec): min=4, max=2771, avg=493.43, stdev=264.04
     lat (msec): min=4, max=2771, avg=493.44, stdev=264.04
    clat percentiles (msec):
     |  1.00th=[   41],  5.00th=[  142], 10.00th=[  174], 20.00th=[  251],
     | 30.00th=[  330], 40.00th=[  409], 50.00th=[  485], 60.00th=[  558],
     | 70.00th=[  642], 80.00th=[  709], 90.00th=[  793], 95.00th=[  852],
     | 99.00th=[ 1070], 99.50th=[ 1854], 99.90th=[ 2232], 99.95th=[ 2769],
     | 99.99th=[ 2769]
   bw (  KiB/s): min=   64, max=  512, per=6.23%, avg=129.60, stdev=40.94, samples=120
   iops        : min=   16, max=  128, avg=32.40, stdev=10.24, samples=120
  lat (msec)   : 10=0.15%, 20=0.15%, 50=0.92%, 100=1.58%, 250=17.09%
  lat (msec)   : 500=31.63%, 750=33.57%, 1000=13.57%, 2000=1.02%, >=2000=0.31%
  cpu          : usr=0.01%, sys=0.02%, ctx=1967, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1960,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job007: (groupid=0, jobs=1): err= 0: pid=1399: Sat Jan 29 03:19:28 2022
  write: IOPS=33, BW=135KiB/s (138kB/s)(8164KiB/60452msec)
    slat (nsec): min=2620, max=29500, avg=7177.42, stdev=852.94
    clat (msec): min=5, max=3289, avg=473.68, stdev=247.63
     lat (msec): min=6, max=3289, avg=473.68, stdev=247.63
    clat percentiles (msec):
     |  1.00th=[   29],  5.00th=[  138], 10.00th=[  171], 20.00th=[  241],
     | 30.00th=[  313], 40.00th=[  384], 50.00th=[  464], 60.00th=[  542],
     | 70.00th=[  617], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[  995], 99.50th=[ 1183], 99.90th=[ 1854], 99.95th=[ 2299],
     | 99.99th=[ 3306]
   bw (  KiB/s): min=   72, max=  600, per=6.49%, avg=135.00, stdev=48.36, samples=120
   iops        : min=   18, max=  150, avg=33.75, stdev=12.09, samples=120
  lat (msec)   : 10=0.05%, 20=0.54%, 50=1.32%, 100=1.32%, 250=17.54%
  lat (msec)   : 500=34.44%, 750=31.06%, 1000=12.84%, 2000=0.78%, >=2000=0.10%
  cpu          : usr=0.00%, sys=0.03%, ctx=2056, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2041,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job008: (groupid=0, jobs=1): err= 0: pid=1400: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=130KiB/s (133kB/s)(7872KiB/60465msec)
    slat (nsec): min=2410, max=65250, avg=6716.78, stdev=1466.53
    clat (msec): min=4, max=3512, avg=491.38, stdev=256.63
     lat (msec): min=4, max=3512, avg=491.39, stdev=256.63
    clat percentiles (msec):
     |  1.00th=[   27],  5.00th=[  140], 10.00th=[  182], 20.00th=[  253],
     | 30.00th=[  330], 40.00th=[  409], 50.00th=[  485], 60.00th=[  567],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  802], 95.00th=[  844],
     | 99.00th=[ 1011], 99.50th=[ 1552], 99.90th=[ 2333], 99.95th=[ 3507],
     | 99.99th=[ 3507]
   bw (  KiB/s): min=   64, max=  544, per=6.26%, avg=130.13, stdev=43.84, samples=120
   iops        : min=   16, max=  136, avg=32.53, stdev=10.96, samples=120
  lat (msec)   : 10=0.15%, 20=0.41%, 50=1.17%, 100=1.47%, 250=16.36%
  lat (msec)   : 500=32.98%, 750=32.37%, 1000=13.97%, 2000=0.97%, >=2000=0.15%
  cpu          : usr=0.01%, sys=0.02%, ctx=1974, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1968,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job009: (groupid=0, jobs=1): err= 0: pid=1401: Sat Jan 29 03:19:28 2022
  write: IOPS=33, BW=133KiB/s (136kB/s)(8028KiB/60463msec)
    slat (nsec): min=2440, max=54960, avg=7115.32, stdev=1273.86
    clat (msec): min=4, max=3361, avg=481.81, stdev=261.19
     lat (msec): min=4, max=3361, avg=481.82, stdev=261.19
    clat percentiles (msec):
     |  1.00th=[   29],  5.00th=[  138], 10.00th=[  176], 20.00th=[  251],
     | 30.00th=[  326], 40.00th=[  397], 50.00th=[  468], 60.00th=[  542],
     | 70.00th=[  617], 80.00th=[  693], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  995], 99.50th=[ 1368], 99.90th=[ 2433], 99.95th=[ 2836],
     | 99.99th=[ 3373]
   bw (  KiB/s): min=   64, max=  632, per=6.38%, avg=132.73, stdev=51.12, samples=120
   iops        : min=   16, max=  158, avg=33.18, stdev=12.78, samples=120
  lat (msec)   : 10=0.15%, 20=0.50%, 50=1.59%, 100=1.20%, 250=16.14%
  lat (msec)   : 500=34.48%, 750=31.64%, 1000=13.40%, 2000=0.60%, >=2000=0.30%
  cpu          : usr=0.02%, sys=0.02%, ctx=2017, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2007,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job010: (groupid=0, jobs=1): err= 0: pid=1402: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=132KiB/s (135kB/s)(7976KiB/60475msec)
    slat (nsec): min=2340, max=42070, avg=6507.54, stdev=910.55
    clat (msec): min=6, max=2411, avg=485.07, stdev=242.56
     lat (msec): min=6, max=2411, avg=485.07, stdev=242.56
    clat percentiles (msec):
     |  1.00th=[   32],  5.00th=[  136], 10.00th=[  169], 20.00th=[  243],
     | 30.00th=[  326], 40.00th=[  409], 50.00th=[  489], 60.00th=[  567],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  953], 99.50th=[ 1150], 99.90th=[ 1703], 99.95th=[ 2400],
     | 99.99th=[ 2400]
   bw (  KiB/s): min=   64, max=  632, per=6.34%, avg=131.87, stdev=50.99, samples=120
   iops        : min=   16, max=  158, avg=32.97, stdev=12.75, samples=120
  lat (msec)   : 10=0.10%, 20=0.55%, 50=1.30%, 100=1.50%, 250=18.05%
  lat (msec)   : 500=30.09%, 750=33.05%, 1000=14.69%, 2000=0.60%, >=2000=0.05%
  cpu          : usr=0.01%, sys=0.01%, ctx=1998, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1994,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job011: (groupid=0, jobs=1): err= 0: pid=1403: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=130KiB/s (133kB/s)(7856KiB/60469msec)
    slat (nsec): min=2610, max=40810, avg=7173.53, stdev=1167.33
    clat (msec): min=6, max=4137, avg=492.39, stdev=288.26
     lat (msec): min=6, max=4137, avg=492.40, stdev=288.26
    clat percentiles (msec):
     |  1.00th=[   26],  5.00th=[  132], 10.00th=[  167], 20.00th=[  249],
     | 30.00th=[  326], 40.00th=[  405], 50.00th=[  485], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1150], 99.50th=[ 1821], 99.90th=[ 3138], 99.95th=[ 4144],
     | 99.99th=[ 4144]
   bw (  KiB/s): min=   64, max=  680, per=6.24%, avg=129.87, stdev=55.93, samples=120
   iops        : min=   16, max=  170, avg=32.47, stdev=13.98, samples=120
  lat (msec)   : 10=0.10%, 20=0.46%, 50=2.04%, 100=1.43%, 250=16.04%
  lat (msec)   : 500=31.92%, 750=33.40%, 1000=13.09%, 2000=1.12%, >=2000=0.41%
  cpu          : usr=0.00%, sys=0.03%, ctx=1982, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1964,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job012: (groupid=0, jobs=1): err= 0: pid=1404: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=129KiB/s (132kB/s)(7784KiB/60383msec)
    slat (nsec): min=2350, max=40970, avg=6621.70, stdev=923.57
    clat (msec): min=3, max=3713, avg=496.27, stdev=267.31
     lat (msec): min=3, max=3713, avg=496.27, stdev=267.31
    clat percentiles (msec):
     |  1.00th=[   31],  5.00th=[  140], 10.00th=[  178], 20.00th=[  253],
     | 30.00th=[  326], 40.00th=[  409], 50.00th=[  489], 60.00th=[  567],
     | 70.00th=[  651], 80.00th=[  726], 90.00th=[  810], 95.00th=[  852],
     | 99.00th=[ 1099], 99.50th=[ 1536], 99.90th=[ 2299], 99.95th=[ 3708],
     | 99.99th=[ 3708]
   bw (  KiB/s): min=   56, max=  512, per=6.19%, avg=128.67, stdev=42.20, samples=120
   iops        : min=   14, max=  128, avg=32.17, stdev=10.55, samples=120
  lat (msec)   : 4=0.05%, 10=0.05%, 20=0.36%, 50=1.03%, 100=1.34%
  lat (msec)   : 250=16.60%, 500=32.17%, 750=31.55%, 1000=15.67%, 2000=0.98%
  lat (msec)   : >=2000=0.21%
  cpu          : usr=0.01%, sys=0.02%, ctx=1950, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1946,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job013: (groupid=0, jobs=1): err= 0: pid=1405: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=129KiB/s (132kB/s)(7808KiB/60386msec)
    slat (nsec): min=2720, max=55220, avg=7185.44, stdev=1388.30
    clat (msec): min=5, max=4092, avg=494.79, stdev=279.70
     lat (msec): min=5, max=4092, avg=494.80, stdev=279.70
    clat percentiles (msec):
     |  1.00th=[   33],  5.00th=[  142], 10.00th=[  180], 20.00th=[  262],
     | 30.00th=[  334], 40.00th=[  409], 50.00th=[  485], 60.00th=[  550],
     | 70.00th=[  634], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[ 1053], 99.50th=[ 1921], 99.90th=[ 3574], 99.95th=[ 4077],
     | 99.99th=[ 4077]
   bw (  KiB/s): min=   72, max=  512, per=6.21%, avg=129.07, stdev=40.96, samples=120
   iops        : min=   18, max=  128, avg=32.27, stdev=10.24, samples=120
  lat (msec)   : 10=0.15%, 20=0.36%, 50=1.08%, 100=1.23%, 250=15.68%
  lat (msec)   : 500=34.07%, 750=32.38%, 1000=13.93%, 2000=0.72%, >=2000=0.41%
  cpu          : usr=0.00%, sys=0.03%, ctx=1980, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1952,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job014: (groupid=0, jobs=1): err= 0: pid=1406: Sat Jan 29 03:19:28 2022
  write: IOPS=31, BW=128KiB/s (131kB/s)(7712KiB/60450msec)
    slat (nsec): min=2370, max=35410, avg=6624.54, stdev=854.26
    clat (msec): min=5, max=2850, avg=501.42, stdev=270.00
     lat (msec): min=5, max=2850, avg=501.42, stdev=270.00
    clat percentiles (msec):
     |  1.00th=[   33],  5.00th=[  142], 10.00th=[  180], 20.00th=[  264],
     | 30.00th=[  338], 40.00th=[  414], 50.00th=[  498], 60.00th=[  567],
     | 70.00th=[  651], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1099], 99.50th=[ 1603], 99.90th=[ 2735], 99.95th=[ 2836],
     | 99.99th=[ 2836]
   bw (  KiB/s): min=   64, max=  600, per=6.13%, avg=127.47, stdev=49.41, samples=120
   iops        : min=   16, max=  150, avg=31.87, stdev=12.35, samples=120
  lat (msec)   : 10=0.05%, 20=0.31%, 50=1.30%, 100=1.71%, 250=14.73%
  lat (msec)   : 500=32.21%, 750=33.30%, 1000=14.99%, 2000=1.04%, >=2000=0.36%
  cpu          : usr=0.00%, sys=0.03%, ctx=1931, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1928,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job015: (groupid=0, jobs=1): err= 0: pid=1407: Sat Jan 29 03:19:28 2022
  write: IOPS=32, BW=131KiB/s (134kB/s)(7892KiB/60467msec)
    slat (nsec): min=2400, max=50090, avg=7259.26, stdev=1254.04
    clat (msec): min=13, max=3597, avg=490.12, stdev=250.25
     lat (msec): min=13, max=3597, avg=490.13, stdev=250.25
    clat percentiles (msec):
     |  1.00th=[   29],  5.00th=[  140], 10.00th=[  178], 20.00th=[  264],
     | 30.00th=[  347], 40.00th=[  414], 50.00th=[  485], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  701], 90.00th=[  785], 95.00th=[  844],
     | 99.00th=[  953], 99.50th=[ 1267], 99.90th=[ 2500], 99.95th=[ 3608],
     | 99.99th=[ 3608]
   bw (  KiB/s): min=   72, max=  528, per=6.27%, avg=130.47, stdev=41.53, samples=120
   iops        : min=   18, max=  132, avg=32.62, stdev=10.38, samples=120
  lat (msec)   : 20=0.35%, 50=1.32%, 100=1.32%, 250=15.61%, 500=33.81%
  lat (msec)   : 750=33.00%, 1000=13.84%, 2000=0.61%, >=2000=0.15%
  cpu          : usr=0.02%, sys=0.02%, ctx=1982, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1973,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=2080KiB/s (2130kB/s), 127KiB/s-135KiB/s (130kB/s-138kB/s), io=123MiB (129MB), run=60383-60475msec

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
root@(none):~# fio fio.spdk 
job000: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job001: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job002: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job003: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job004: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job005: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job006: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job007: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job008: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job009: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job010: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job011: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job012: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job013: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job014: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job015: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
fio-3.1
Starting 16 threads
Jobs: 16 (f=16): [w(16)][100.0%][r=0KiB/s,w=2008KiB/s][r=0,w=502 IOPS][eta 00m:00s]
job000: (groupid=0, jobs=1): err= 0: pid=1411: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=145KiB/s (148kB/s)(8764KiB/60470msec)
    slat (nsec): min=2620, max=41160, avg=6864.50, stdev=1230.81
    clat (usec): min=903, max=2646.2k, avg=441374.15, stdev=268853.40
     lat (usec): min=952, max=2646.2k, avg=441381.23, stdev=268853.51
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   37], 10.00th=[   67], 20.00th=[  190],
     | 30.00th=[  275], 40.00th=[  355], 50.00th=[  439], 60.00th=[  518],
     | 70.00th=[  609], 80.00th=[  684], 90.00th=[  776], 95.00th=[  810],
     | 99.00th=[  919], 99.50th=[ 1334], 99.90th=[ 2366], 99.95th=[ 2601],
     | 99.99th=[ 2635]
   bw (  KiB/s): min=   88, max= 1336, per=6.28%, avg=145.00, stdev=126.02, samples=120
   iops        : min=   22, max=  334, avg=36.25, stdev=31.51, samples=120
  lat (usec)   : 1000=0.05%
  lat (msec)   : 4=0.05%, 10=0.09%, 20=1.78%, 50=5.25%, 100=4.43%
  lat (msec)   : 250=15.15%, 500=31.04%, 750=29.94%, 1000=11.46%, 2000=0.59%
  lat (msec)   : >=2000=0.18%
  cpu          : usr=0.00%, sys=0.03%, ctx=2202, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2191,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job001: (groupid=0, jobs=1): err= 0: pid=1412: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=142KiB/s (146kB/s)(8596KiB/60483msec)
    slat (nsec): min=2410, max=52740, avg=7069.09, stdev=1340.03
    clat (usec): min=697, max=3214.9k, avg=450089.77, stdev=268424.63
     lat (usec): min=751, max=3214.9k, avg=450097.08, stdev=268424.68
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   40], 10.00th=[   68], 20.00th=[  192],
     | 30.00th=[  279], 40.00th=[  363], 50.00th=[  456], 60.00th=[  535],
     | 70.00th=[  625], 80.00th=[  701], 90.00th=[  776], 95.00th=[  818],
     | 99.00th=[  911], 99.50th=[ 1036], 99.90th=[ 2232], 99.95th=[ 2702],
     | 99.99th=[ 3205]
   bw (  KiB/s): min=   80, max= 1312, per=6.16%, avg=142.20, stdev=121.21, samples=120
   iops        : min=   20, max=  328, avg=35.55, stdev=30.30, samples=120
  lat (usec)   : 750=0.05%
  lat (msec)   : 4=0.05%, 10=0.09%, 20=1.44%, 50=5.35%, 100=4.14%
  lat (msec)   : 250=15.31%, 500=28.66%, 750=31.41%, 1000=12.98%, 2000=0.37%
  lat (msec)   : >=2000=0.14%
  cpu          : usr=0.00%, sys=0.03%, ctx=2167, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2149,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job002: (groupid=0, jobs=1): err= 0: pid=1413: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=144KiB/s (147kB/s)(8700KiB/60453msec)
    slat (nsec): min=2480, max=29050, avg=6471.43, stdev=727.51
    clat (usec): min=1841, max=3617.2k, avg=444496.80, stdev=280625.58
     lat (usec): min=1870, max=3617.2k, avg=444503.46, stdev=280625.69
    clat percentiles (msec):
     |  1.00th=[   15],  5.00th=[   31], 10.00th=[   57], 20.00th=[  192],
     | 30.00th=[  279], 40.00th=[  359], 50.00th=[  439], 60.00th=[  523],
     | 70.00th=[  600], 80.00th=[  684], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  936], 99.50th=[ 1250], 99.90th=[ 2836], 99.95th=[ 2869],
     | 99.99th=[ 3608]
   bw (  KiB/s): min=   88, max= 1528, per=6.23%, avg=143.91, stdev=140.79, samples=120
   iops        : min=   22, max=  382, avg=35.96, stdev=35.20, samples=120
  lat (msec)   : 2=0.05%, 4=0.05%, 10=0.09%, 20=1.98%, 50=6.57%
  lat (msec)   : 100=4.00%, 250=13.38%, 500=31.26%, 750=29.47%, 1000=12.37%
  lat (msec)   : 2000=0.60%, >=2000=0.18%
  cpu          : usr=0.02%, sys=0.01%, ctx=2181, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2175,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job003: (groupid=0, jobs=1): err= 0: pid=1414: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=141KiB/s (145kB/s)(8544KiB/60456msec)
    slat (nsec): min=2440, max=32220, avg=7299.84, stdev=1076.30
    clat (usec): min=1502, max=4003.6k, avg=452661.73, stdev=327143.54
     lat (usec): min=1543, max=4003.6k, avg=452669.21, stdev=327143.73
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   37], 10.00th=[   62], 20.00th=[  188],
     | 30.00th=[  268], 40.00th=[  351], 50.00th=[  435], 60.00th=[  523],
     | 70.00th=[  617], 80.00th=[  701], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[ 1217], 99.50th=[ 2165], 99.90th=[ 3742], 99.95th=[ 3775],
     | 99.99th=[ 4010]
   bw (  KiB/s): min=   64, max= 1464, per=6.12%, avg=141.32, stdev=134.81, samples=120
   iops        : min=   16, max=  366, avg=35.32, stdev=33.70, samples=120
  lat (msec)   : 2=0.05%, 4=0.05%, 10=0.14%, 20=1.73%, 50=5.57%
  lat (msec)   : 100=4.92%, 250=15.36%, 500=30.06%, 750=28.89%, 1000=12.03%
  lat (msec)   : 2000=0.61%, >=2000=0.61%
  cpu          : usr=0.01%, sys=0.03%, ctx=2147, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2136,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job004: (groupid=0, jobs=1): err= 0: pid=1415: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=146KiB/s (149kB/s)(8816KiB/60433msec)
    slat (nsec): min=2540, max=77140, avg=6598.94, stdev=1671.32
    clat (usec): min=1344, max=2455.8k, avg=438533.43, stdev=266929.99
     lat (usec): min=1421, max=2455.9k, avg=438540.25, stdev=266930.03
    clat percentiles (msec):
     |  1.00th=[   15],  5.00th=[   33], 10.00th=[   58], 20.00th=[  184],
     | 30.00th=[  275], 40.00th=[  351], 50.00th=[  435], 60.00th=[  518],
     | 70.00th=[  600], 80.00th=[  684], 90.00th=[  776], 95.00th=[  818],
     | 99.00th=[  902], 99.50th=[ 1070], 99.90th=[ 2265], 99.95th=[ 2366],
     | 99.99th=[ 2467]
   bw (  KiB/s): min=   88, max= 1432, per=6.32%, avg=145.85, stdev=135.67, samples=120
   iops        : min=   22, max=  358, avg=36.45, stdev=33.92, samples=120
  lat (msec)   : 2=0.09%, 4=0.05%, 10=0.05%, 20=2.00%, 50=6.44%
  lat (msec)   : 100=3.54%, 250=14.56%, 500=31.08%, 750=29.67%, 1000=11.98%
  lat (msec)   : 2000=0.36%, >=2000=0.18%
  cpu          : usr=0.01%, sys=0.02%, ctx=2213, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2204,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job005: (groupid=0, jobs=1): err= 0: pid=1416: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=145KiB/s (149kB/s)(8780KiB/60389msec)
    slat (nsec): min=2450, max=30660, avg=6882.44, stdev=1177.36
    clat (usec): min=1011, max=3630.4k, avg=440004.56, stdev=265460.04
     lat (usec): min=1043, max=3630.4k, avg=440011.67, stdev=265460.21
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   36], 10.00th=[   63], 20.00th=[  192],
     | 30.00th=[  275], 40.00th=[  355], 50.00th=[  439], 60.00th=[  518],
     | 70.00th=[  592], 80.00th=[  676], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[  927], 99.50th=[ 1083], 99.90th=[ 2039], 99.95th=[ 2123],
     | 99.99th=[ 3641]
   bw (  KiB/s): min=   80, max= 1416, per=6.29%, avg=145.27, stdev=132.09, samples=120
   iops        : min=   20, max=  354, avg=36.32, stdev=33.02, samples=120
  lat (msec)   : 2=0.05%, 10=0.14%, 20=1.78%, 50=5.74%, 100=4.24%
  lat (msec)   : 250=14.81%, 500=30.71%, 750=29.75%, 1000=12.12%, 2000=0.55%
  lat (msec)   : >=2000=0.14%
  cpu          : usr=0.01%, sys=0.03%, ctx=2210, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2195,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job006: (groupid=0, jobs=1): err= 0: pid=1417: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=145KiB/s (148kB/s)(8744KiB/60444msec)
    slat (nsec): min=2330, max=77170, avg=6362.27, stdev=1620.98
    clat (usec): min=296, max=3845.9k, avg=442210.11, stdev=269713.89
     lat (usec): min=374, max=3845.0k, avg=442216.66, stdev=269713.94
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   38], 10.00th=[   60], 20.00th=[  194],
     | 30.00th=[  284], 40.00th=[  363], 50.00th=[  439], 60.00th=[  518],
     | 70.00th=[  609], 80.00th=[  684], 90.00th=[  768], 95.00th=[  818],
     | 99.00th=[  919], 99.50th=[ 1284], 99.90th=[ 1938], 99.95th=[ 1989],
     | 99.99th=[ 3842]
   bw (  KiB/s): min=   80, max= 1448, per=6.27%, avg=144.67, stdev=134.01, samples=120
   iops        : min=   20, max=  362, avg=36.17, stdev=33.50, samples=120
  lat (usec)   : 500=0.05%
  lat (msec)   : 4=0.05%, 10=0.14%, 20=1.56%, 50=6.27%, 100=4.12%
  lat (msec)   : 250=13.95%, 500=31.75%, 750=29.19%, 1000=12.08%, 2000=0.82%
  lat (msec)   : >=2000=0.05%
  cpu          : usr=0.00%, sys=0.03%, ctx=2188, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2186,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job007: (groupid=0, jobs=1): err= 0: pid=1418: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=144KiB/s (147kB/s)(8700KiB/60459msec)
    slat (nsec): min=2520, max=46920, avg=6828.86, stdev=1145.05
    clat (usec): min=574, max=3809.8k, avg=444564.22, stdev=271679.72
     lat (usec): min=621, max=3809.8k, avg=444571.26, stdev=271679.76
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   35], 10.00th=[   63], 20.00th=[  194],
     | 30.00th=[  284], 40.00th=[  359], 50.00th=[  439], 60.00th=[  527],
     | 70.00th=[  617], 80.00th=[  693], 90.00th=[  776], 95.00th=[  818],
     | 99.00th=[  936], 99.50th=[ 1234], 99.90th=[ 1955], 99.95th=[ 2467],
     | 99.99th=[ 3809]
   bw (  KiB/s): min=   72, max= 1504, per=6.23%, avg=143.90, stdev=136.48, samples=120
   iops        : min=   18, max=  376, avg=35.95, stdev=34.12, samples=120
  lat (usec)   : 750=0.05%
  lat (msec)   : 10=0.09%, 20=1.38%, 50=6.57%, 100=4.14%, 250=14.07%
  lat (msec)   : 500=30.85%, 750=29.79%, 1000=12.41%, 2000=0.55%, >=2000=0.09%
  cpu          : usr=0.01%, sys=0.02%, ctx=2192, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2175,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job008: (groupid=0, jobs=1): err= 0: pid=1419: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=144KiB/s (148kB/s)(8728KiB/60476msec)
    slat (nsec): min=2530, max=39560, avg=6603.35, stdev=918.64
    clat (usec): min=478, max=3133.5k, avg=443221.62, stdev=262571.86
     lat (usec): min=518, max=3133.5k, avg=443228.44, stdev=262571.94
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   38], 10.00th=[   65], 20.00th=[  194],
     | 30.00th=[  271], 40.00th=[  368], 50.00th=[  443], 60.00th=[  527],
     | 70.00th=[  609], 80.00th=[  684], 90.00th=[  768], 95.00th=[  818],
     | 99.00th=[  911], 99.50th=[ 1083], 99.90th=[ 1854], 99.95th=[ 2140],
     | 99.99th=[ 3138]
   bw (  KiB/s): min=   80, max= 1360, per=6.25%, avg=144.40, stdev=126.91, samples=120
   iops        : min=   20, max=  340, avg=36.10, stdev=31.73, samples=120
  lat (usec)   : 500=0.05%
  lat (msec)   : 4=0.05%, 10=0.14%, 20=1.28%, 50=5.82%, 100=4.26%
  lat (msec)   : 250=15.22%, 500=30.11%, 750=31.48%, 1000=11.00%, 2000=0.50%
  lat (msec)   : >=2000=0.09%
  cpu          : usr=0.00%, sys=0.03%, ctx=2186, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2182,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job009: (groupid=0, jobs=1): err= 0: pid=1420: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=146KiB/s (149kB/s)(8796KiB/60436msec)
    slat (nsec): min=2520, max=30370, avg=6854.95, stdev=792.92
    clat (usec): min=1668, max=1876.6k, avg=439539.00, stdev=257683.55
     lat (usec): min=1703, max=1876.6k, avg=439546.09, stdev=257683.63
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   37], 10.00th=[   62], 20.00th=[  194],
     | 30.00th=[  275], 40.00th=[  355], 50.00th=[  435], 60.00th=[  514],
     | 70.00th=[  592], 80.00th=[  693], 90.00th=[  776], 95.00th=[  827],
     | 99.00th=[  944], 99.50th=[ 1133], 99.90th=[ 1653], 99.95th=[ 1854],
     | 99.99th=[ 1871]
   bw (  KiB/s): min=   80, max= 1464, per=6.30%, avg=145.51, stdev=135.21, samples=120
   iops        : min=   20, max=  366, avg=36.36, stdev=33.81, samples=120
  lat (msec)   : 2=0.05%, 10=0.09%, 20=1.77%, 50=5.64%, 100=4.55%
  lat (msec)   : 250=15.28%, 500=30.38%, 750=28.51%, 1000=13.01%, 2000=0.73%
  cpu          : usr=0.01%, sys=0.02%, ctx=2215, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2199,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job010: (groupid=0, jobs=1): err= 0: pid=1421: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=147KiB/s (151kB/s)(8892KiB/60474msec)
    slat (nsec): min=2450, max=77940, avg=6428.46, stdev=1589.72
    clat (usec): min=590, max=3242.0k, avg=435045.81, stdev=265597.59
     lat (usec): min=668, max=3242.0k, avg=435052.43, stdev=265597.66
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   34], 10.00th=[   59], 20.00th=[  178],
     | 30.00th=[  266], 40.00th=[  347], 50.00th=[  439], 60.00th=[  523],
     | 70.00th=[  600], 80.00th=[  676], 90.00th=[  768], 95.00th=[  818],
     | 99.00th=[  911], 99.50th=[  961], 99.90th=[ 2123], 99.95th=[ 2232],
     | 99.99th=[ 3239]
   bw (  KiB/s): min=   72, max= 1480, per=6.37%, avg=147.13, stdev=140.21, samples=120
   iops        : min=   18, max=  370, avg=36.78, stdev=35.05, samples=120
  lat (usec)   : 750=0.04%
  lat (msec)   : 4=0.04%, 10=0.09%, 20=1.93%, 50=6.07%, 100=4.63%
  lat (msec)   : 250=15.43%, 500=29.19%, 750=30.90%, 1000=11.25%, 2000=0.27%
  lat (msec)   : >=2000=0.13%
  cpu          : usr=0.01%, sys=0.03%, ctx=2228, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2223,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job011: (groupid=0, jobs=1): err= 0: pid=1422: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=147KiB/s (151kB/s)(8916KiB/60449msec)
    slat (nsec): min=2460, max=28380, avg=7032.15, stdev=932.20
    clat (usec): min=304, max=2722.1k, avg=433708.44, stdev=259882.81
     lat (usec): min=333, max=2722.1k, avg=433715.74, stdev=259882.94
    clat percentiles (msec):
     |  1.00th=[   15],  5.00th=[   37], 10.00th=[   66], 20.00th=[  188],
     | 30.00th=[  271], 40.00th=[  347], 50.00th=[  435], 60.00th=[  502],
     | 70.00th=[  600], 80.00th=[  676], 90.00th=[  768], 95.00th=[  818],
     | 99.00th=[  911], 99.50th=[ 1083], 99.90th=[ 1854], 99.95th=[ 1955],
     | 99.99th=[ 2735]
   bw (  KiB/s): min=   80, max= 1464, per=6.39%, avg=147.53, stdev=134.88, samples=120
   iops        : min=   20, max=  366, avg=36.88, stdev=33.72, samples=120
  lat (usec)   : 500=0.04%
  lat (msec)   : 10=0.13%, 20=1.70%, 50=5.20%, 100=4.89%, 250=15.97%
  lat (msec)   : 500=31.49%, 750=29.39%, 1000=10.45%, 2000=0.67%, >=2000=0.04%
  cpu          : usr=0.01%, sys=0.03%, ctx=2245, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2229,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job012: (groupid=0, jobs=1): err= 0: pid=1423: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=143KiB/s (146kB/s)(8636KiB/60488msec)
    slat (nsec): min=2360, max=62320, avg=6618.27, stdev=1287.56
    clat (usec): min=1184, max=3716.5k, avg=448041.61, stdev=273441.30
     lat (usec): min=1247, max=3716.6k, avg=448048.40, stdev=273441.33
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   37], 10.00th=[   69], 20.00th=[  203],
     | 30.00th=[  279], 40.00th=[  363], 50.00th=[  443], 60.00th=[  527],
     | 70.00th=[  609], 80.00th=[  693], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[  995], 99.50th=[ 1200], 99.90th=[ 2366], 99.95th=[ 2433],
     | 99.99th=[ 3708]
   bw (  KiB/s): min=   80, max= 1280, per=6.19%, avg=142.87, stdev=121.27, samples=120
   iops        : min=   20, max=  320, avg=35.72, stdev=30.32, samples=120
  lat (msec)   : 2=0.05%, 4=0.05%, 10=0.14%, 20=1.44%, 50=5.60%
  lat (msec)   : 100=3.94%, 250=14.64%, 500=31.13%, 750=29.97%, 1000=12.14%
  lat (msec)   : 2000=0.74%, >=2000=0.19%
  cpu          : usr=0.00%, sys=0.03%, ctx=2164, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2159,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job013: (groupid=0, jobs=1): err= 0: pid=1424: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=144KiB/s (147kB/s)(8676KiB/60427msec)
    slat (nsec): min=2430, max=29820, avg=7256.25, stdev=943.13
    clat (msec): min=2, max=3406, avg=445.55, stdev=274.02
     lat (msec): min=2, max=3406, avg=445.56, stdev=274.02
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   37], 10.00th=[   64], 20.00th=[  199],
     | 30.00th=[  284], 40.00th=[  359], 50.00th=[  439], 60.00th=[  523],
     | 70.00th=[  600], 80.00th=[  684], 90.00th=[  768], 95.00th=[  827],
     | 99.00th=[  969], 99.50th=[ 1452], 99.90th=[ 2534], 99.95th=[ 2668],
     | 99.99th=[ 3406]
   bw (  KiB/s): min=   80, max= 1416, per=6.22%, avg=143.53, stdev=132.36, samples=120
   iops        : min=   20, max=  354, avg=35.88, stdev=33.09, samples=120
  lat (msec)   : 4=0.09%, 10=0.09%, 20=1.75%, 50=6.09%, 100=3.96%
  lat (msec)   : 250=13.83%, 500=31.17%, 750=30.61%, 1000=11.53%, 2000=0.69%
  lat (msec)   : >=2000=0.18%
  cpu          : usr=0.00%, sys=0.04%, ctx=2194, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2169,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job014: (groupid=0, jobs=1): err= 0: pid=1425: Sat Jan 29 03:20:40 2022
  write: IOPS=35, BW=142KiB/s (145kB/s)(8572KiB/60485msec)
    slat (nsec): min=2390, max=26580, avg=6598.72, stdev=639.79
    clat (msec): min=2, max=3306, avg=451.38, stdev=274.20
     lat (msec): min=2, max=3306, avg=451.38, stdev=274.20
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   40], 10.00th=[   63], 20.00th=[  201],
     | 30.00th=[  284], 40.00th=[  368], 50.00th=[  460], 60.00th=[  535],
     | 70.00th=[  617], 80.00th=[  701], 90.00th=[  785], 95.00th=[  827],
     | 99.00th=[  944], 99.50th=[ 1284], 99.90th=[ 2534], 99.95th=[ 2735],
     | 99.99th=[ 3306]
   bw (  KiB/s): min=   72, max= 1424, per=6.14%, avg=141.80, stdev=131.06, samples=120
   iops        : min=   18, max=  356, avg=35.45, stdev=32.77, samples=120
  lat (msec)   : 4=0.05%, 10=0.09%, 20=1.63%, 50=5.69%, 100=4.85%
  lat (msec)   : 250=13.39%, 500=29.96%, 750=31.17%, 1000=12.41%, 2000=0.56%
  lat (msec)   : >=2000=0.19%
  cpu          : usr=0.01%, sys=0.02%, ctx=2149, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2143,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job015: (groupid=0, jobs=1): err= 0: pid=1426: Sat Jan 29 03:20:40 2022
  write: IOPS=36, BW=146KiB/s (150kB/s)(8840KiB/60466msec)
    slat (nsec): min=2470, max=69370, avg=7317.33, stdev=1636.94
    clat (msec): min=11, max=2098, avg=437.54, stdev=255.26
     lat (msec): min=11, max=2098, avg=437.55, stdev=255.26
    clat percentiles (msec):
     |  1.00th=[   16],  5.00th=[   37], 10.00th=[   65], 20.00th=[  199],
     | 30.00th=[  284], 40.00th=[  359], 50.00th=[  430], 60.00th=[  510],
     | 70.00th=[  600], 80.00th=[  676], 90.00th=[  760], 95.00th=[  810],
     | 99.00th=[  911], 99.50th=[ 1011], 99.90th=[ 1938], 99.95th=[ 2089],
     | 99.99th=[ 2106]
   bw (  KiB/s): min=   88, max= 1432, per=6.33%, avg=146.27, stdev=129.89, samples=120
   iops        : min=   22, max=  358, avg=36.57, stdev=32.47, samples=120
  lat (msec)   : 20=1.99%, 50=5.75%, 100=3.89%, 250=13.53%, 500=33.39%
  lat (msec)   : 750=30.27%, 1000=10.63%, 2000=0.45%, >=2000=0.09%
  cpu          : usr=0.01%, sys=0.03%, ctx=2233, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2210,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=2310KiB/s (2365kB/s), 141KiB/s-147KiB/s (145kB/s-151kB/s), io=136MiB (143MB), run=60389-60488msec

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
root@(none):~# fio fio.spdk 
job000: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job001: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job002: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job003: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job004: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job005: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job006: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job007: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job008: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job009: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job010: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job011: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job012: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job013: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job014: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
job015: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=16
fio-3.1
Starting 16 threads
Jobs: 16 (f=16): [w(16)][100.0%][r=0KiB/s,w=2104KiB/s][r=0,w=526 IOPS][eta 00m:00s]
job000: (groupid=0, jobs=1): err= 0: pid=1429: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (138kB/s)(8144KiB/60551msec)
    slat (nsec): min=2590, max=22580, avg=6989.15, stdev=1029.24
    clat (msec): min=3, max=2678, avg=475.56, stdev=265.61
     lat (msec): min=3, max=2678, avg=475.57, stdev=265.61
    clat percentiles (msec):
     |  1.00th=[   23],  5.00th=[   73], 10.00th=[  159], 20.00th=[  232],
     | 30.00th=[  313], 40.00th=[  388], 50.00th=[  464], 60.00th=[  542],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[ 1062], 99.50th=[ 1586], 99.90th=[ 2140], 99.95th=[ 2400],
     | 99.99th=[ 2668]
   bw (  KiB/s): min=    8, max=  992, per=6.27%, avg=133.60, stdev=82.10, samples=121
   iops        : min=    2, max=  248, avg=33.39, stdev=20.52, samples=121
  lat (msec)   : 4=0.05%, 10=0.10%, 20=0.79%, 50=2.50%, 100=2.26%
  lat (msec)   : 250=16.55%, 500=32.37%, 750=30.75%, 1000=13.51%, 2000=0.83%
  lat (msec)   : >=2000=0.29%
  cpu          : usr=0.01%, sys=0.02%, ctx=2051, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2036,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job001: (groupid=0, jobs=1): err= 0: pid=1430: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8104KiB/60527msec)
    slat (nsec): min=2500, max=23950, avg=7010.82, stdev=810.62
    clat (usec): min=1967, max=2864.3k, avg=477733.04, stdev=268372.18
     lat (usec): min=1996, max=2864.3k, avg=477740.33, stdev=268372.21
    clat percentiles (msec):
     |  1.00th=[   24],  5.00th=[   72], 10.00th=[  163], 20.00th=[  234],
     | 30.00th=[  309], 40.00th=[  388], 50.00th=[  472], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  986], 99.50th=[ 1217], 99.90th=[ 2366], 99.95th=[ 2802],
     | 99.99th=[ 2869]
   bw (  KiB/s): min=    8, max=  992, per=6.24%, avg=132.94, stdev=81.90, samples=121
   iops        : min=    2, max=  248, avg=33.22, stdev=20.48, samples=121
  lat (msec)   : 2=0.05%, 10=0.10%, 20=0.54%, 50=2.57%, 100=2.52%
  lat (msec)   : 250=16.44%, 500=31.34%, 750=31.98%, 1000=13.52%, 2000=0.59%
  lat (msec)   : >=2000=0.35%
  cpu          : usr=0.01%, sys=0.02%, ctx=2041, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2026,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job002: (groupid=0, jobs=1): err= 0: pid=1431: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=133KiB/s (136kB/s)(8060KiB/60538msec)
    slat (nsec): min=2310, max=37630, avg=6584.38, stdev=871.89
    clat (usec): min=889, max=2200.5k, avg=480424.76, stdev=267411.65
     lat (usec): min=927, max=2200.6k, avg=480431.52, stdev=267411.68
    clat percentiles (msec):
     |  1.00th=[   20],  5.00th=[   75], 10.00th=[  155], 20.00th=[  236],
     | 30.00th=[  313], 40.00th=[  388], 50.00th=[  468], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1099], 99.50th=[ 1502], 99.90th=[ 2123], 99.95th=[ 2165],
     | 99.99th=[ 2198]
   bw (  KiB/s): min=    8, max= 1000, per=6.20%, avg=132.22, stdev=83.33, samples=121
   iops        : min=    2, max=  250, avg=33.05, stdev=20.83, samples=121
  lat (usec)   : 1000=0.05%
  lat (msec)   : 4=0.10%, 10=0.15%, 20=0.79%, 50=2.08%, 100=2.78%
  lat (msec)   : 250=15.63%, 500=31.76%, 750=30.92%, 1000=14.19%, 2000=1.19%
  lat (msec)   : >=2000=0.35%
  cpu          : usr=0.00%, sys=0.03%, ctx=2023, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2015,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job003: (groupid=0, jobs=1): err= 0: pid=1432: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=133KiB/s (136kB/s)(8052KiB/60520msec)
    slat (nsec): min=2620, max=37450, avg=7665.85, stdev=1100.23
    clat (usec): min=714, max=3576.9k, avg=480774.94, stdev=284170.73
     lat (usec): min=752, max=3576.9k, avg=480782.80, stdev=284170.83
    clat percentiles (msec):
     |  1.00th=[   21],  5.00th=[   69], 10.00th=[  157], 20.00th=[  239],
     | 30.00th=[  309], 40.00th=[  388], 50.00th=[  472], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[ 1036], 99.50th=[ 1469], 99.90th=[ 3205], 99.95th=[ 3339],
     | 99.99th=[ 3574]
   bw (  KiB/s): min=    8, max= 1032, per=6.20%, avg=132.09, stdev=86.03, samples=121
   iops        : min=    2, max=  258, avg=33.02, stdev=21.51, samples=121
  lat (usec)   : 750=0.05%
  lat (msec)   : 4=0.10%, 10=0.10%, 20=0.65%, 50=2.19%, 100=3.03%
  lat (msec)   : 250=15.55%, 500=31.35%, 750=32.44%, 1000=13.26%, 2000=0.94%
  lat (msec)   : >=2000=0.35%
  cpu          : usr=0.01%, sys=0.03%, ctx=2029, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2013,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job004: (groupid=0, jobs=1): err= 0: pid=1433: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8120KiB/60609msec)
    slat (nsec): min=2540, max=28140, avg=6474.20, stdev=803.99
    clat (usec): min=1252, max=3554.1k, avg=477444.44, stdev=257780.62
     lat (usec): min=1286, max=3554.1k, avg=477451.11, stdev=257780.64
    clat percentiles (msec):
     |  1.00th=[   21],  5.00th=[   63], 10.00th=[  157], 20.00th=[  243],
     | 30.00th=[  330], 40.00th=[  393], 50.00th=[  468], 60.00th=[  550],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  986], 99.50th=[ 1133], 99.90th=[ 2072], 99.95th=[ 2433],
     | 99.99th=[ 3540]
   bw (  KiB/s): min=    8, max= 1040, per=6.25%, avg=133.21, stdev=86.18, samples=121
   iops        : min=    2, max=  260, avg=33.30, stdev=21.54, samples=121
  lat (msec)   : 2=0.05%, 10=0.15%, 20=0.79%, 50=3.00%, 100=2.17%
  lat (msec)   : 250=14.58%, 500=33.10%, 750=31.28%, 1000=13.99%, 2000=0.74%
  lat (msec)   : >=2000=0.15%
  cpu          : usr=0.00%, sys=0.03%, ctx=2045, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2030,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job005: (groupid=0, jobs=1): err= 0: pid=1434: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=133KiB/s (136kB/s)(8044KiB/60536msec)
    slat (nsec): min=2440, max=27470, avg=6892.76, stdev=1185.46
    clat (msec): min=2, max=3454, avg=481.36, stdev=278.03
     lat (msec): min=2, max=3454, avg=481.37, stdev=278.03
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   63], 10.00th=[  157], 20.00th=[  239],
     | 30.00th=[  317], 40.00th=[  401], 50.00th=[  477], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  701], 90.00th=[  785], 95.00th=[  835],
     | 99.00th=[  978], 99.50th=[ 1368], 99.90th=[ 2635], 99.95th=[ 3205],
     | 99.99th=[ 3440]
   bw (  KiB/s): min=    8, max= 1056, per=6.19%, avg=131.94, stdev=87.78, samples=121
   iops        : min=    2, max=  264, avg=32.97, stdev=21.95, samples=121
  lat (msec)   : 4=0.10%, 10=0.10%, 20=1.19%, 50=2.54%, 100=2.34%
  lat (msec)   : 250=15.32%, 500=31.53%, 750=32.27%, 1000=13.67%, 2000=0.60%
  lat (msec)   : >=2000=0.35%
  cpu          : usr=0.01%, sys=0.02%, ctx=2029, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2011,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job006: (groupid=0, jobs=1): err= 0: pid=1435: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8088KiB/60518msec)
    slat (nsec): min=2550, max=30760, avg=6371.72, stdev=750.89
    clat (usec): min=1815, max=2713.0k, avg=478613.31, stdev=266856.27
     lat (usec): min=1847, max=2713.0k, avg=478619.89, stdev=266856.30
    clat percentiles (msec):
     |  1.00th=[   21],  5.00th=[   63], 10.00th=[  157], 20.00th=[  241],
     | 30.00th=[  317], 40.00th=[  393], 50.00th=[  464], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[  995], 99.50th=[ 1469], 99.90th=[ 2232], 99.95th=[ 2232],
     | 99.99th=[ 2702]
   bw (  KiB/s): min=    8, max= 1064, per=6.22%, avg=132.68, stdev=88.73, samples=121
   iops        : min=    2, max=  266, avg=33.16, stdev=22.19, samples=121
  lat (msec)   : 2=0.05%, 10=0.10%, 20=0.74%, 50=3.17%, 100=2.37%
  lat (msec)   : 250=15.13%, 500=32.54%, 750=29.87%, 1000=15.03%, 2000=0.79%
  lat (msec)   : >=2000=0.20%
  cpu          : usr=0.00%, sys=0.03%, ctx=2036, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2022,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job007: (groupid=0, jobs=1): err= 0: pid=1436: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8116KiB/60543msec)
    slat (nsec): min=2540, max=55830, avg=6721.67, stdev=1245.48
    clat (usec): min=331, max=3128.6k, avg=477164.93, stdev=266832.70
     lat (usec): min=388, max=3128.6k, avg=477171.87, stdev=266832.72
    clat percentiles (msec):
     |  1.00th=[   18],  5.00th=[   55], 10.00th=[  148], 20.00th=[  239],
     | 30.00th=[  313], 40.00th=[  393], 50.00th=[  472], 60.00th=[  550],
     | 70.00th=[  634], 80.00th=[  718], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  978], 99.50th=[ 1368], 99.90th=[ 2333], 99.95th=[ 2567],
     | 99.99th=[ 3138]
   bw (  KiB/s): min=    8, max= 1144, per=6.25%, avg=133.16, stdev=95.75, samples=121
   iops        : min=    2, max=  286, avg=33.29, stdev=23.94, samples=121
  lat (usec)   : 500=0.05%
  lat (msec)   : 4=0.05%, 10=0.15%, 20=1.08%, 50=3.15%, 100=2.27%
  lat (msec)   : 250=14.69%, 500=31.69%, 750=31.44%, 1000=14.59%, 2000=0.59%
  lat (msec)   : >=2000=0.25%
  cpu          : usr=0.01%, sys=0.02%, ctx=2042, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2029,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job008: (groupid=0, jobs=1): err= 0: pid=1437: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8092KiB/60545msec)
    slat (nsec): min=2600, max=44290, avg=6544.16, stdev=1003.95
    clat (msec): min=2, max=2867, avg=478.57, stdev=270.92
     lat (msec): min=2, max=2867, avg=478.58, stdev=270.92
    clat percentiles (msec):
     |  1.00th=[   22],  5.00th=[   67], 10.00th=[  159], 20.00th=[  234],
     | 30.00th=[  309], 40.00th=[  393], 50.00th=[  477], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  709], 90.00th=[  802], 95.00th=[  844],
     | 99.00th=[ 1070], 99.50th=[ 1401], 99.90th=[ 2635], 99.95th=[ 2769],
     | 99.99th=[ 2869]
   bw (  KiB/s): min=    8, max= 1024, per=6.23%, avg=132.76, stdev=85.15, samples=121
   iops        : min=    2, max=  256, avg=33.19, stdev=21.29, samples=121
  lat (msec)   : 4=0.05%, 10=0.10%, 20=0.54%, 50=3.26%, 100=1.98%
  lat (msec)   : 250=16.96%, 500=30.94%, 750=30.55%, 1000=14.48%, 2000=0.89%
  lat (msec)   : >=2000=0.25%
  cpu          : usr=0.00%, sys=0.03%, ctx=2033, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2023,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job009: (groupid=0, jobs=1): err= 0: pid=1438: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=135KiB/s (138kB/s)(8144KiB/60542msec)
    slat (nsec): min=2440, max=36840, avg=6897.02, stdev=962.98
    clat (msec): min=2, max=3695, avg=475.49, stdev=262.05
     lat (msec): min=2, max=3695, avg=475.50, stdev=262.05
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   63], 10.00th=[  150], 20.00th=[  234],
     | 30.00th=[  313], 40.00th=[  397], 50.00th=[  477], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  709], 90.00th=[  793], 95.00th=[  835],
     | 99.00th=[ 1003], 99.50th=[ 1200], 99.90th=[ 1905], 99.95th=[ 2198],
     | 99.99th=[ 3708]
   bw (  KiB/s): min=    8, max= 1128, per=6.27%, avg=133.62, stdev=94.14, samples=121
   iops        : min=    2, max=  282, avg=33.40, stdev=23.53, samples=121
  lat (msec)   : 4=0.05%, 10=0.10%, 20=1.33%, 50=2.65%, 100=2.31%
  lat (msec)   : 250=15.72%, 500=31.09%, 750=31.88%, 1000=13.85%, 2000=0.93%
  lat (msec)   : >=2000=0.10%
  cpu          : usr=0.00%, sys=0.03%, ctx=2054, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2036,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job010: (groupid=0, jobs=1): err= 0: pid=1439: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8088KiB/60540msec)
    slat (nsec): min=2460, max=42050, avg=6505.00, stdev=938.78
    clat (msec): min=2, max=2549, avg=478.78, stdev=259.86
     lat (msec): min=2, max=2549, avg=478.79, stdev=259.86
    clat percentiles (msec):
     |  1.00th=[   17],  5.00th=[   59], 10.00th=[  150], 20.00th=[  239],
     | 30.00th=[  309], 40.00th=[  393], 50.00th=[  485], 60.00th=[  558],
     | 70.00th=[  642], 80.00th=[  709], 90.00th=[  793], 95.00th=[  835],
     | 99.00th=[ 1011], 99.50th=[ 1200], 99.90th=[ 2106], 99.95th=[ 2106],
     | 99.99th=[ 2534]
   bw (  KiB/s): min=    8, max= 1104, per=6.22%, avg=132.69, stdev=91.73, samples=121
   iops        : min=    2, max=  276, avg=33.17, stdev=22.93, samples=121
  lat (msec)   : 4=0.10%, 10=0.10%, 20=1.34%, 50=2.62%, 100=2.27%
  lat (msec)   : 250=15.33%, 500=30.27%, 750=33.04%, 1000=13.90%, 2000=0.84%
  lat (msec)   : >=2000=0.20%
  cpu          : usr=0.00%, sys=0.03%, ctx=2031, majf=0, minf=1
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2022,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job011: (groupid=0, jobs=1): err= 0: pid=1440: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=132KiB/s (135kB/s)(7992KiB/60406msec)
    slat (nsec): min=2430, max=42670, avg=7037.07, stdev=1109.36
    clat (usec): min=1632, max=3762.3k, avg=483526.39, stdev=291668.80
     lat (usec): min=1675, max=3762.3k, avg=483533.69, stdev=291668.83
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   66], 10.00th=[  153], 20.00th=[  230],
     | 30.00th=[  321], 40.00th=[  397], 50.00th=[  477], 60.00th=[  558],
     | 70.00th=[  642], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1053], 99.50th=[ 1552], 99.90th=[ 3742], 99.95th=[ 3775],
     | 99.99th=[ 3775]
   bw (  KiB/s): min=   72, max= 1048, per=6.20%, avg=132.13, stdev=87.51, samples=120
   iops        : min=   18, max=  262, avg=33.03, stdev=21.88, samples=120
  lat (msec)   : 2=0.05%, 10=0.10%, 20=0.95%, 50=2.30%, 100=2.85%
  lat (msec)   : 250=16.12%, 500=30.53%, 750=31.83%, 1000=14.21%, 2000=0.80%
  lat (msec)   : >=2000=0.25%
  cpu          : usr=0.01%, sys=0.02%, ctx=2021, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1998,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job012: (groupid=0, jobs=1): err= 0: pid=1441: Sat Jan 29 03:21:45 2022
  write: IOPS=32, BW=131KiB/s (134kB/s)(7928KiB/60499msec)
    slat (nsec): min=2550, max=41780, avg=6529.97, stdev=921.96
    clat (usec): min=526, max=2381.0k, avg=488141.63, stdev=263751.28
     lat (usec): min=569, max=2381.0k, avg=488148.31, stdev=263751.30
    clat percentiles (msec):
     |  1.00th=[   21],  5.00th=[   81], 10.00th=[  163], 20.00th=[  243],
     | 30.00th=[  317], 40.00th=[  405], 50.00th=[  489], 60.00th=[  567],
     | 70.00th=[  651], 80.00th=[  718], 90.00th=[  802], 95.00th=[  852],
     | 99.00th=[ 1083], 99.50th=[ 1385], 99.90th=[ 2232], 99.95th=[ 2366],
     | 99.99th=[ 2366]
   bw (  KiB/s): min=   24, max=  896, per=6.15%, avg=131.07, stdev=74.54, samples=120
   iops        : min=    6, max=  224, avg=32.77, stdev=18.64, samples=120
  lat (usec)   : 750=0.05%
  lat (msec)   : 10=0.15%, 20=0.71%, 50=1.87%, 100=2.52%, 250=15.54%
  lat (msec)   : 500=30.73%, 750=32.29%, 1000=14.93%, 2000=0.96%, >=2000=0.25%
  cpu          : usr=0.00%, sys=0.03%, ctx=1989, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.2%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,1982,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job013: (groupid=0, jobs=1): err= 0: pid=1442: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=135KiB/s (138kB/s)(8164KiB/60501msec)
    slat (nsec): min=2520, max=49150, avg=7161.63, stdev=1200.38
    clat (usec): min=1072, max=2513.7k, avg=474057.99, stdev=255114.85
     lat (usec): min=1122, max=2513.7k, avg=474065.44, stdev=255114.90
    clat percentiles (msec):
     |  1.00th=[   19],  5.00th=[   72], 10.00th=[  157], 20.00th=[  234],
     | 30.00th=[  313], 40.00th=[  393], 50.00th=[  481], 60.00th=[  550],
     | 70.00th=[  625], 80.00th=[  701], 90.00th=[  776], 95.00th=[  835],
     | 99.00th=[ 1020], 99.50th=[ 1318], 99.90th=[ 2232], 99.95th=[ 2265],
     | 99.99th=[ 2500]
   bw (  KiB/s): min=   40, max= 1032, per=6.33%, avg=135.00, stdev=86.07, samples=120
   iops        : min=   10, max=  258, avg=33.75, stdev=21.52, samples=120
  lat (msec)   : 2=0.10%, 10=0.24%, 20=0.78%, 50=2.25%, 100=2.69%
  lat (msec)   : 250=15.73%, 500=31.41%, 750=33.41%, 1000=12.30%, 2000=0.88%
  lat (msec)   : >=2000=0.20%
  cpu          : usr=0.00%, sys=0.03%, ctx=2066, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2041,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job014: (groupid=0, jobs=1): err= 0: pid=1443: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=133KiB/s (136kB/s)(8020KiB/60528msec)
    slat (nsec): min=2460, max=54580, avg=6537.48, stdev=1181.96
    clat (usec): min=327, max=3314.6k, avg=482757.51, stdev=288007.51
     lat (usec): min=383, max=3314.6k, avg=482764.20, stdev=288007.54
    clat percentiles (msec):
     |  1.00th=[   22],  5.00th=[   63], 10.00th=[  150], 20.00th=[  234],
     | 30.00th=[  317], 40.00th=[  401], 50.00th=[  472], 60.00th=[  558],
     | 70.00th=[  634], 80.00th=[  718], 90.00th=[  793], 95.00th=[  844],
     | 99.00th=[  969], 99.50th=[ 1620], 99.90th=[ 3138], 99.95th=[ 3272],
     | 99.99th=[ 3306]
   bw (  KiB/s): min=    8, max= 1096, per=6.17%, avg=131.57, stdev=92.47, samples=121
   iops        : min=    2, max=  274, avg=32.89, stdev=23.12, samples=121
  lat (usec)   : 500=0.05%
  lat (msec)   : 4=0.05%, 10=0.15%, 20=0.70%, 50=2.74%, 100=3.04%
  lat (msec)   : 250=15.16%, 500=31.12%, 750=31.42%, 1000=14.66%, 2000=0.40%
  lat (msec)   : >=2000=0.50%
  cpu          : usr=0.00%, sys=0.02%, ctx=2017, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2005,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16
job015: (groupid=0, jobs=1): err= 0: pid=1444: Sat Jan 29 03:21:45 2022
  write: IOPS=33, BW=134KiB/s (137kB/s)(8096KiB/60507msec)
    slat (nsec): min=2500, max=46920, avg=7413.75, stdev=1287.51
    clat (usec): min=507, max=2640.1k, avg=478065.98, stdev=249696.12
     lat (usec): min=554, max=2640.1k, avg=478073.64, stdev=249696.21
    clat percentiles (msec):
     |  1.00th=[   22],  5.00th=[   67], 10.00th=[  161], 20.00th=[  239],
     | 30.00th=[  326], 40.00th=[  405], 50.00th=[  481], 60.00th=[  550],
     | 70.00th=[  634], 80.00th=[  709], 90.00th=[  793], 95.00th=[  835],
     | 99.00th=[  944], 99.50th=[ 1062], 99.90th=[ 1720], 99.95th=[ 2039],
     | 99.99th=[ 2635]
   bw (  KiB/s): min=    8, max= 1016, per=6.23%, avg=132.83, stdev=84.45, samples=121
   iops        : min=    2, max=  254, avg=33.21, stdev=21.11, samples=121
  lat (usec)   : 750=0.05%
  lat (msec)   : 20=0.94%, 50=2.72%, 100=2.47%, 250=15.42%, 500=30.93%
  lat (msec)   : 750=32.51%, 1000=14.23%, 2000=0.64%, >=2000=0.10%
  cpu          : usr=0.00%, sys=0.03%, ctx=2039, majf=0, minf=2
  IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=99.3%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwt: total=0,2024,0, short=0,0,0, dropped=0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=16

Run status group 0 (all jobs):
  WRITE: bw=2133KiB/s (2184kB/s), 131KiB/s-135KiB/s (134kB/s-138kB/s), io=126MiB (132MB), run=60406-60609msec

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

[END] 2022/1/29 11:26:38

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc
  2022-01-28  2:36   ` packet stuck in qdisc Yunsheng Lin
  2022-01-28  8:58     ` Vincent Ray
@ 2022-03-25  6:16     ` Yunsheng Lin
  2022-03-25  8:45       ` Vincent Ray
  1 sibling, 1 reply; 28+ messages in thread
From: Yunsheng Lin @ 2022-03-25  6:16 UTC (permalink / raw)
  To: Vincent Ray, vladimir.oltean, kuba, davem
  Cc: Samuel Jones, netdev, 方国炬

[-- Attachment #1: Type: text/plain, Size: 1602 bytes --]

On 2022/1/28 10:36, Yunsheng Lin wrote:
> On 2022/1/25 20:55, Vincent Ray wrote:
>> Dear kernel maintainers / developers,
>>
>> I work at Kalray, where we are developing an NVMe-over-TCP target controller board.
>> My setup is as follows:
>> - a development workstation running Linux 5.x.y (the host)
>> - sending NVMe-TCP traffic to our board, to which it is connected through a Mellanox NIC (ConnectX-5) and a 100G Ethernet cable
>>
>> While doing performance tests, using simple fio scenarios running over the regular kernel nvme-tcp driver on the host, we noticed significant performance variations.
>> After some digging (using tcpdump on the host), we found big "holes" in the TCP traffic sent by the host.
>> The scenario we observed is the following:
>> 1) a TCP segment gets lost (not sent by the host) on a particular TCP connection, leading to a gap in the sequence numbers received by the board
>> 2) the board sends dup-acks and/or sacks (if configured) to signal this loss
>> 3) then, sometimes, the host stops emitting on that TCP connection for several seconds (as much as 14s observed)
>> 4) finally the host resumes emission, sending the missing packet
>> 5) then the TCP connection continues correctly with the appropriate throughput
>>
>> Such a scenario can be observed in the attached tcpdump (+ comments).
> 
> Hi,
>     Thanks for reporting the problem.

Hi,
   It seems Guoju from Alibaba has hit a similar problem to the above.
   They fixed it by adding an smp_mb() barrier between spin_unlock()
and test_bit() in qdisc_run_end(); please see if it fixes your problem.

> 
>>


[-- Attachment #2: 0001-net-sched-add-barrier-to-fix-packet-stuck-problem-fo_ali.patch --]
[-- Type: text/plain, Size: 1905 bytes --]

From 27eacb9d72679d590271b2d3e9da2a77e4f60317 Mon Sep 17 00:00:00 2001
From: Guoju Fang <guoju.fgj@alibaba-inc.com>
Date: Fri, 18 Feb 2022 11:18:45 +0800
Subject: [PATCH] net: sched: add barrier to fix packet stuck problem for
 lockless qdisc

The commit a90c57f2cedd ("net: sched: fix packet stuck problem for
lockless qdisc") and commit 89837eb4b246 ("net: sched: add barrier to
ensure correct ordering for lockless qdisc") fix a packet stuck problem
for lockless qdisc and describe the race condition. But after the two
commits we still can reproduce a similar problem, and one more barrier
is needed to fix it.

Now the concurrent operations can be described as below,
       CPU0            |         CPU1
   qdisc_run_end()     |   qdisc_run_begin()
        .              |          .
       spin_unlock()   |       set_bit()
        .              |       smp_mb__after_atomic()
       test_bit()      |       spin_trylock()
        .              |          .

In qdisc_run_end(), the spin_unlock() only has store-release semantics,
which guarantees that all earlier memory accesses are visible before
the spin_unlock(), but the subsequent test_bit() may be reordered to
precede it. So an explicit barrier is needed between spin_unlock() and
test_bit() to ensure the correct order.

Signed-off-by: Guoju Fang <guoju.fgj@alibaba-inc.com>
---
 include/net/sch_generic.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 472843eedbae..ff843e2a4544 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -229,6 +229,9 @@ static inline void qdisc_run_end(struct Qdisc *qdisc)
 	if (qdisc->flags & TCQ_F_NOLOCK) {
 		spin_unlock(&qdisc->seqlock);
 
+		/* store-load barrier */
+		smp_mb();
+
 		if (unlikely(test_bit(__QDISC_STATE_MISSED,
 				      &qdisc->state)))
 			__netif_schedule(qdisc);
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc
  2022-03-25  6:16     ` Yunsheng Lin
@ 2022-03-25  8:45       ` Vincent Ray
  2022-04-13 13:01         ` Vincent Ray
  0 siblings, 1 reply; 28+ messages in thread
From: Vincent Ray @ 2022-03-25  8:45 UTC (permalink / raw)
  To: linyunsheng
  Cc: vladimir oltean, kuba, davem, Samuel Jones, netdev,
	方国炬

OK I'll try that, thank you LinYun.

(I'm sorry for the delay in my answers; I haven't been able to try your debug patch yet because I've had other problems with my setup, preventing me from reproducing the issue in the first place, but it should be OK soon.)

----- Original Message -----
From: "linyunsheng" <linyunsheng@huawei.com>
To: "Vincent Ray" <vray@kalrayinc.com>, "vladimir oltean" <vladimir.oltean@nxp.com>, "kuba" <kuba@kernel.org>, "davem" <davem@davemloft.net>
Cc: "Samuel Jones" <sjones@kalrayinc.com>, "netdev" <netdev@vger.kernel.org>, "方国炬" <guoju.fgj@alibaba-inc.com>
Sent: Friday, March 25, 2022 7:16:02 AM
Subject: Re: packet stuck in qdisc

On 2022/1/28 10:36, Yunsheng Lin wrote:
> On 2022/1/25 20:55, Vincent Ray wrote:
>> Dear kernel maintainers / developers,
>>
>> I work at Kalray, where we are developing an NVMe-over-TCP target controller board.
>> My setup is as follows:
>> - a development workstation running Linux 5.x.y (the host)
>> - sending NVMe-TCP traffic to our board, to which it is connected through a Mellanox NIC (ConnectX-5) and a 100G Ethernet cable
>>
>> While doing performance tests, using simple fio scenarios running over the regular kernel nvme-tcp driver on the host, we noticed significant performance variations.
>> After some digging (using tcpdump on the host), we found big "holes" in the TCP traffic sent by the host.
>> The scenario we observed is the following:
>> 1) a TCP segment gets lost (not sent by the host) on a particular TCP connection, leading to a gap in the sequence numbers received by the board
>> 2) the board sends dup-acks and/or sacks (if configured) to signal this loss
>> 3) then, sometimes, the host stops emitting on that TCP connection for several seconds (as much as 14s observed)
>> 4) finally the host resumes emission, sending the missing packet
>> 5) then the TCP connection continues correctly with the appropriate throughput
>>
>> Such a scenario can be observed in the attached tcpdump (+ comments).
> 
> Hi,
>     Thanks for reporting the problem.

Hi,
   It seems Guoju from Alibaba has hit a similar problem to the above.
   They fixed it by adding an smp_mb() barrier between spin_unlock()
and test_bit() in qdisc_run_end(); please see if it fixes your problem.

> 
>>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc
  2022-03-25  8:45       ` Vincent Ray
@ 2022-04-13 13:01         ` Vincent Ray
  2022-04-14  3:05           ` Guoju Fang
  0 siblings, 1 reply; 28+ messages in thread
From: Vincent Ray @ 2022-04-13 13:01 UTC (permalink / raw)
  To: linyunsheng
  Cc: vladimir oltean, kuba, davem, Samuel Jones, netdev,
	方国炬

Hi Linyun,

I tried Guoju's patch (on 5.15.0) but it did not make any difference; the problem is still there.

I tried applying your debug patch but can't seem to find the right sha1 on which to apply it smoothly.
I guess this is because you made it based on a netdev branch, while I've been trying to apply it on various versions of the stable repo.

=> What do you suggest?
- should I try to apply it "manually" to whatever stable version suits me (e.g. 5.15 or 5.17, why not)?
- or should I clone a netdev repo and apply it there? In this case, can you please give me the exact sha1 to use?

Thanks,

V
 

----- Original Message -----
From: "Vincent Ray" <vray@kalrayinc.com>
To: "linyunsheng" <linyunsheng@huawei.com>
Cc: "vladimir oltean" <vladimir.oltean@nxp.com>, "kuba" <kuba@kernel.org>, "davem" <davem@davemloft.net>, "Samuel Jones" <sjones@kalrayinc.com>, "netdev" <netdev@vger.kernel.org>, "方国炬" <guoju.fgj@alibaba-inc.com>
Sent: Friday, March 25, 2022 9:45:14 AM
Subject: Re: packet stuck in qdisc

OK I'll try that, thank you LinYun.

(I'm sorry for the delay in my answers; I haven't been able to try your debug patch yet because I've had other problems with my setup, preventing me from reproducing the issue in the first place, but it should be OK soon.)

----- Original Message -----
From: "linyunsheng" <linyunsheng@huawei.com>
To: "Vincent Ray" <vray@kalrayinc.com>, "vladimir oltean" <vladimir.oltean@nxp.com>, "kuba" <kuba@kernel.org>, "davem" <davem@davemloft.net>
Cc: "Samuel Jones" <sjones@kalrayinc.com>, "netdev" <netdev@vger.kernel.org>, "方国炬" <guoju.fgj@alibaba-inc.com>
Sent: Friday, March 25, 2022 7:16:02 AM
Subject: Re: packet stuck in qdisc

On 2022/1/28 10:36, Yunsheng Lin wrote:
> On 2022/1/25 20:55, Vincent Ray wrote:
>> Dear kernel maintainers / developers,
>>
>> I work at Kalray, where we are developing an NVMe-over-TCP target controller board.
>> My setup is as follows:
>> - a development workstation running Linux 5.x.y (the host)
>> - sending NVMe-TCP traffic to our board, to which it is connected through a Mellanox NIC (ConnectX-5) and a 100G Ethernet cable
>>
>> While doing performance tests, using simple fio scenarios running over the regular kernel nvme-tcp driver on the host, we noticed significant performance variations.
>> After some digging (using tcpdump on the host), we found big "holes" in the TCP traffic sent by the host.
>> The scenario we observed is the following:
>> 1) a TCP segment gets lost (not sent by the host) on a particular TCP connection, leading to a gap in the sequence numbers received by the board
>> 2) the board sends dup-acks and/or sacks (if configured) to signal this loss
>> 3) then, sometimes, the host stops emitting on that TCP connection for several seconds (as much as 14s observed)
>> 4) finally the host resumes emission, sending the missing packet
>> 5) then the TCP connection continues correctly with the appropriate throughput
>>
>> Such a scenario can be observed in the attached tcpdump (+ comments).
> 
> Hi,
>     Thanks for reporting the problem.

Hi,
   It seems Guoju from Alibaba has hit a similar problem to the above.
   They fixed it by adding an smp_mb() barrier between spin_unlock()
and test_bit() in qdisc_run_end(); please see if it fixes your problem.

> 
>>



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc
  2022-04-13 13:01         ` Vincent Ray
@ 2022-04-14  3:05           ` Guoju Fang
  2022-05-23 13:54             ` packet stuck in qdisc : patch proposal Vincent Ray
  0 siblings, 1 reply; 28+ messages in thread
From: Guoju Fang @ 2022-04-14  3:05 UTC (permalink / raw)
  To: vray; +Cc: davem, guoju.fgj, kuba, linyunsheng, netdev, sjones, vladimir.oltean

Hi Vincent,

I use the following bpftrace code to print those skbs that exceed a
given latency (10 ms). Please see if it can help you.


#!/usr/bin/bpftrace

tracepoint:net:net_dev_queue
{
	@start[args->skbaddr] = nsecs;
	@core[args->skbaddr] = cpu;
}

tracepoint:net:net_dev_start_xmit
{
	$lat = (nsecs - @start[args->skbaddr]) / 1000;
	if ($lat > 10000) {
		time("%H:%M:%S  ");
		printf("queue core %d, xmit core %d, len %d, lat %d us\n",
			@core[args->skbaddr], cpu, args->len, $lat);
	}
	delete(@start[args->skbaddr]);
	delete(@core[args->skbaddr]);
}


Best Regards,
Guoju.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* packet stuck in qdisc : patch proposal
  2022-04-14  3:05           ` Guoju Fang
@ 2022-05-23 13:54             ` Vincent Ray
  2022-05-24  2:55               ` Eric Dumazet
  2022-05-24  6:43               ` Yunsheng Lin
  0 siblings, 2 replies; 28+ messages in thread
From: Vincent Ray @ 2022-05-23 13:54 UTC (permalink / raw)
  To: linyunsheng
  Cc: davem, 方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Guoju Fang,
	Remy Gauguey

[-- Attachment #1: Type: text/plain, Size: 3956 bytes --]

Hi Yunsheng, all,

I finally spotted the bug that caused (nvme-)tcp packets to remain stuck in the qdisc once in a while.
It's in qdisc_run_begin() within sch_generic.h:

smp_mb__before_atomic();
 
// [comments]

if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
	return false;

should be 

smp_mb();

// [comments]

if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
	return false;

I have written a more detailed explanation in the attached patch, including a race example, but in short that's because test_bit() is not an atomic operation.
Therefore it does not give you any ordering guarantee on any architecture.
And neither does spin_trylock(), called at the beginning of qdisc_run_begin(), when it does not grab the lock...
So test_bit() may be reordered with a preceding enqueue(), leading to a possible race in the dialog with pfifo_fast_dequeue().
We may then end up with a skbuff pushed "silently" to the qdisc (MISSED cleared, nobody aware that there is something in the backlog).
Then the cores pushing new skbuffs to the qdisc may all bypass it for an arbitrary amount of time, leaving the enqueued skbuff stuck in the backlog.

I believe the reason for which you could not reproduce the issue on ARM64 is that, on that architecture, smp_mb__before_atomic() will translate to a memory barrier.
It does not on x86 (turned into a NOP) because you're supposed to use this function just before an atomic operation, and atomic operations themselves provide full ordering effects on x86.

I think the code has been flawed for some time but the introduction of a (true) bypass policy in 5.14 made it more visible, because without this, the "victim" skbuff does not stay very long in the backlog: it is bound to be popped by the next core executing __qdisc_run().

In my setup, with our use case (16 (virtual) cpus in a VM shooting 4KB buffers with fio through a -i4 nvme-tcp connection to a target), I did not notice any performance degradation using smp_mb() in place of smp_mb__before_atomic(), but of course that does not mean it cannot happen in other configs.

I think Guoju's patch is also correct and necessary, so both patches, his and mine, should be applied "asap" to the kernel.
A difference between Guoju's race and "mine" is that, in his case, the MISSED bit will be set: though no one will take care of the skbuff immediately, the next CPU pushing to the qdisc (if ever...) will notice and dequeue it (so Guoju's race probably happens in my use case too but is not noticeable).

Finally, given the necessity of these two new full barriers in the code, I wonder if the whole lockless (+ bypass) thing should be reconsidered.
At least, I think general performance tests should be run to check that lockless qdiscs still outperform locked qdiscs, in both bypassable and not-bypassable modes.
    
More generally, I found this piece of code quite tricky and error-prone, as evidenced by the numerous fixes it went through in the recent history. 
I believe most of this complexity comes from the lockless qdisc handling in itself, but of course the addition of the bypass support does not really help ;-)
I'm a linux kernel beginner however, so I'll let more experienced programmers decide about that :-)

I've made sure that, with this patch, no stuck packets happened any more on both v5.15 and v5.18-rc2 (whereas without the patch, numerous occurrences of stuck packets are visible).
I'm quite confident it will apply to any concerned version, that is from 5.14 (or before) to mainline.

Can you please tell me:

1) if you agree with this?

2) how to proceed to push this patch (and Guoju's) for quick integration into the mainline?

NB: an alternative fix (which I've tested OK too) would be to simply remove the

if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
	return false;

code path, but I have no clue if this would be better or worse than the present patch in terms of performance.
    
Thank you, best regards,

V



[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: net_sched_fixed_barrier_to_prevent_skbuff_sticking_in_qdisc_backlog.patch --]
[-- Type: text/x-patch; name*0=net_sched_fixed_barrier_to_prevent_skbuff_sticking_in_qdisc_backlog.; name*1=patch, Size: 3897 bytes --]

commit 917f7ff2b0f59d721d11f983af1f46c1cd74130a
Author: Vincent Ray <vray@kalray.eu>
Date:   Mon May 23 15:24:12 2022 +0200

    net: sched: fixed barrier to prevent skbuff sticking in qdisc backlog
    
    In qdisc_run_begin(), smp_mb__before_atomic() used before test_bit()
    does not provide any ordering guarantee as test_bit() is not an atomic
    operation. This, added to the fact that the spin_trylock() call at
    the beginning of qdisc_run_begin() does not guarantee acquire
    semantics if it does not grab the lock, makes it possible for the
    following statement :
    
    if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
    
    to be executed before an enqueue operation called before
    qdisc_run_begin().
    
    As a result the following race can happen :
    
               CPU 1                             CPU 2
    
          qdisc_run_begin()               qdisc_run_begin() /* true */
            set(MISSED)                            .
          /* returns false */                      .
              .                            /* sees MISSED = 1 */
              .                            /* so qdisc not empty */
              .                            __qdisc_run()
              .                                    .
              .                              pfifo_fast_dequeue()
     ----> /* may be done here */                  .
    |         .                                clear(MISSED)
    |         .                                    .
|         .                                smp_mb__after_atomic();
    |         .                                    .
    |         .                                /* recheck the queue */
    |         .                                /* nothing => exit   */
    |   enqueue(skb1)
    |         .
    |   qdisc_run_begin()
    |         .
    |     spin_trylock() /* fail */
    |         .
    |     smp_mb__before_atomic() /* not enough */
    |         .
     ---- if (test_bit(MISSED))
            return false;   /* exit */
    
    In the above scenario, CPU 1 and CPU 2 both try to grab the
    qdisc->seqlock at the same time. Only CPU 2 succeeds and enters the
    bypass code path, where it emits its skb then calls __qdisc_run().
    
    CPU1 fails, sets MISSED and goes down the traditional enqueue() +
    dequeue() code path. But when executing qdisc_run_begin() for the
    second time, after enqueuing its skbuff, it sees the MISSED bit still
    set (by itself) and consequently chooses to exit early without setting
    it again nor trying to grab the spinlock again.
    
    Meanwhile CPU2 has seen MISSED = 1, cleared it, checked the queue
    and found it empty, so it returned.
    
    At the end of the sequence, we end up with skb1 enqueued in the
    backlog, both CPUs out of __dev_xmit_skb(), the MISSED bit not set,
    and no __netif_schedule() call made. skb1 will now linger in the
    qdisc until somebody later performs a full __qdisc_run(). Combined
    with the bypass capability of the qdisc, and the ability of the TCP
    layer to avoid resending packets which it knows are still in the
    qdisc, this can lead to serious traffic "holes" in a TCP connection.
    
    We fix this by turning smp_mb__before_atomic() into smp_mb() which
    guarantees the correct ordering of enqueue() vs test_bit() and
    consequently prevents the race.

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 9bab396c1f3b..0c6016e10a6f 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -191,7 +191,7 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
 		 * STATE_MISSED checking is synchronized with clearing
 		 * in pfifo_fast_dequeue().
 		 */
-		smp_mb__before_atomic();
+		smp_mb();
 
 		/* If the MISSED flag is set, it means other thread has
 		 * set the MISSED flag before second spin_trylock(), so

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-23 13:54             ` packet stuck in qdisc : patch proposal Vincent Ray
@ 2022-05-24  2:55               ` Eric Dumazet
  2022-05-24  6:43               ` Yunsheng Lin
  1 sibling, 0 replies; 28+ messages in thread
From: Eric Dumazet @ 2022-05-24  2:55 UTC (permalink / raw)
  To: Vincent Ray, linyunsheng
  Cc: davem, 方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Guoju Fang,
	Remy Gauguey, Eric Dumazet


On 5/23/22 06:54, Vincent Ray wrote:
> Hi Yunsheng, all,
>
> I finally spotted the bug that caused (nvme-)tcp packets to remain stuck in the qdisc once in a while.
> It's in qdisc_run_begin within sch_generic.h :
>
> smp_mb__before_atomic();
>   
> // [comments]
>
> if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
> 	return false;
>
> should be
>
> smp_mb();
>
> // [comments]
>
> if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
> 	return false;
>
> I have written a more detailed explanation in the attached patch, including a race example, but in short that's because test_bit() is not an atomic operation.
> Therefore it does not give you any ordering guarantee on any architecture.
> And neither does spin_trylock() called at the beginning of qdisc_run_begin() when it does not grab the lock...
> So test_bit() may be reordered with a preceding enqueue(), leading to a possible race in the dialog with pfifo_fast_dequeue().
> We may then end up with a skbuff pushed "silently" to the qdisc (MISSED cleared, nobody aware that there is something in the backlog).
> Then the cores pushing new skbuffs to the qdisc may all bypass it for an arbitrary amount of time, leaving the enqueued skbuff stuck in the backlog.
>
> I believe the reason for which you could not reproduce the issue on ARM64 is that, on that architecture, smp_mb__before_atomic() will translate to a memory barrier.
> It does not on x86 (turned into a NOP) because you're supposed to use this function just before an atomic operation, and atomic operations themselves provide full ordering effects on x86.
>
> I think the code has been flawed for some time but the introduction of a (true) bypass policy in 5.14 made it more visible, because without this, the "victim" skbuff does not stay very long in the backlog: it is bound to be popped by the next core executing __qdisc_run().
>
> In my setup, with our use case (16 (virtual) cpus in a VM shooting 4KB buffers with fio through a -i4 nvme-tcp connection to a target), I did not notice any performance degradation using smp_mb() in place of smp_mb__before_atomic(), but of course that does not mean it cannot happen in other configs.
>
> I think Guoju's patch is also correct and necessary so that both patches, his and mine, should be applied "asap" to the kernel.
> A difference between Guoju's race and "mine" is that, in his case, the MISSED bit will be set : though no one will take care of the skbuff immediately, the next cpu pushing to the qdisc (if ever ...) will notice and dequeue it (so Guoju's race probably happens in my use case too but is not noticeable).
>
> Finally, given the necessity of these two new full barriers in the code, I wonder if the whole lockless (+ bypass) thing should be reconsidered.
> At least, I think general performance tests should be run to check that lockless qdiscs still outperform locked qdiscs, in both bypassable and not-bypassable modes.
>      
> More generally, I found this piece of code quite tricky and error-prone, as evidenced by the numerous fixes it went through in the recent history.
> I believe most of this complexity comes from the lockless qdisc handling in itself, but of course the addition of the bypass support does not really help ;-)
> I'm a linux kernel beginner however, so I'll let more experienced programmers decide about that :-)
>
> I've made sure that, with this patch, no stuck packets happened any more on both v5.15 and v5.18-rc2 (whereas without the patch, numerous occurrences of stuck packets are visible).
> I'm quite confident it will apply to any concerned version, that is from 5.14 (or before) to mainline.
>
> Can you please tell me :
>
> 1) if you agree with this ?
>
> 2) how to proceed to push this patch (and Guoju's) for quick integration into the mainline ?
>
> NB : an alternative fix (which I've tested OK too) would be to simply remove the
>
> if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
> 	return false;
>
> code path, but I have no clue if this would be better or worse than the present patch in terms of performance.
>      
> Thank you, best regards,
>
> V


We keep adding code and comments, this is quite silly.

test_and_set_bit() is exactly what we need.


diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 9bab396c1f3ba3d143de4d63f4142cff3c9b9f3e..9d1b448c0dfc3925967635f3390b884a4ef7c55a 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -187,35 +187,9 @@ static inline bool qdisc_run_begin(struct Qdisc *qdisc)
                 if (spin_trylock(&qdisc->seqlock))
                         return true;

-               /* Paired with smp_mb__after_atomic() to make sure
-                * STATE_MISSED checking is synchronized with clearing
-                * in pfifo_fast_dequeue().
-                */
-               smp_mb__before_atomic();
-
-               /* If the MISSED flag is set, it means other thread has
-                * set the MISSED flag before second spin_trylock(), so
-                * we can return false here to avoid multi cpus doing
-                * the set_bit() and second spin_trylock() concurrently.
-                */
-               if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
+               if (test_and_set_bit(__QDISC_STATE_MISSED, &qdisc->state))
                         return false;

-               /* Set the MISSED flag before the second spin_trylock(),
-                * if the second spin_trylock() return false, it means
-                * other cpu holding the lock will do dequeuing for us
-                * or it will see the MISSED flag set after releasing
-                * lock and reschedule the net_tx_action() to do the
-                * dequeuing.
-                */
-               set_bit(__QDISC_STATE_MISSED, &qdisc->state);
-
-               /* spin_trylock() only has load-acquire semantic, so use
-                * smp_mb__after_atomic() to ensure STATE_MISSED is set
-                * before doing the second spin_trylock().
-                */
-               smp_mb__after_atomic();
-
                 /* Retry again in case other CPU may not see the new flag
                  * after it releases the lock at the end of qdisc_run_end().
                  */


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-23 13:54             ` packet stuck in qdisc : patch proposal Vincent Ray
  2022-05-24  2:55               ` Eric Dumazet
@ 2022-05-24  6:43               ` Yunsheng Lin
  2022-05-24  8:13                 ` Vincent Ray
  1 sibling, 1 reply; 28+ messages in thread
From: Yunsheng Lin @ 2022-05-24  6:43 UTC (permalink / raw)
  To: Vincent Ray
  Cc: davem, 方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Guoju Fang,
	Remy Gauguey, Eric Dumazet, will

On 2022/5/23 21:54, Vincent Ray wrote:
> Hi Yunsheng, all,

+cc Will & Eric

> 
> I finally spotted the bug that caused (nvme-)tcp packets to remain stuck in the qdisc once in a while.
> It's in qdisc_run_begin within sch_generic.h : 
> 
> smp_mb__before_atomic();
>  
> // [comments]
> 
> if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
> 	return false;
> 
> should be 
> 
> smp_mb();
> 
> // [comments]
> 
> if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
> 	return false;
> 
> I have written a more detailed explanation in the attached patch, including a race example, but in short that's because test_bit() is not an atomic operation.
> Therefore it does not give you any ordering guarantee on any architecture.
> And neither does spin_trylock() called at the beginning of qdisc_run_begin() when it does not grab the lock...
> So test_bit() may be reordered with a preceding enqueue(), leading to a possible race in the dialog with pfifo_fast_dequeue().
> We may then end up with a skbuff pushed "silently" to the qdisc (MISSED cleared, nobody aware that there is something in the backlog).
> Then the cores pushing new skbuffs to the qdisc may all bypass it for an arbitrary amount of time, leaving the enqueued skbuff stuck in the backlog.
> 
> I believe the reason for which you could not reproduce the issue on ARM64 is that, on that architecture, smp_mb__before_atomic() will translate to a memory barrier.
> It does not on x86 (turned into a NOP) because you're supposed to use this function just before an atomic operation, and atomic operations themselves provide full ordering effects on x86.
> 
> I think the code has been flawed for some time but the introduction of a (true) bypass policy in 5.14 made it more visible, because without this, the "victim" skbuff does not stay very long in the backlog : it is bound to be popped by the next core executing __qdisc_run().
> 
> In my setup, with our use case (16 (virtual) cpus in a VM shooting 4KB buffers with fio through a -i4 nvme-tcp connection to a target), I did not notice any performance degradation using smp_mb() in place of smp_mb__before_atomic(), but of course that does not mean it cannot happen in other configs.
> 
> I think Guoju's patch is also correct and necessary so that both patches, his and mine, should be applied "asap" to the kernel.
> A difference between Guoju's race and "mine" is that, in his case, the MISSED bit will be set : though no one will take care of the skbuff immediately, the next cpu pushing to the qdisc (if ever ...) will notice and dequeue it (so Guoju's race probably happens in my use case too but is not noticeable).
> 
> Finally, given the necessity of these two new full barriers in the code, I wonder if the whole lockless (+ bypass) thing should be reconsidered.
> At least, I think general performance tests should be run to check that lockless qdiscs still outperform locked qdiscs, in both bypassable and not-bypassable modes.
>     
> More generally, I found this piece of code quite tricky and error-prone, as evidenced by the numerous fixes it went through in the recent history. 
> I believe most of this complexity comes from the lockless qdisc handling in itself, but of course the addition of the bypass support does not really help ;-)
> I'm a linux kernel beginner however, so I'll let more experienced programmers decide about that :-)
> 
> I've made sure that, with this patch, no stuck packets happened any more on both v5.15 and v5.18-rc2 (whereas without the patch, numerous occurrences of stuck packets are visible).
> I'm quite confident it will apply to any concerned version, that is from 5.14 (or before) to mainline.
> 
> Can you please tell me :
> 
> 1) if you agree with this ?
> 
> 2) how to proceed to push this patch (and Guoju's) for quick integration into the mainline ?
> 
> NB : an alternative fix (which I've tested OK too) would be to simply remove the
> 
> if (test_bit(__QDISC_STATE_MISSED, &qdisc->state))
> 	return false;
> 
> code path, but I have no clue if this would be better or worse than the present patch in terms of performance.
>     
> Thank you, best regards

Hi, thanks for the testing and debugging. So the main problem is that
test_bit() is not an atomic operation, so using smp_mb__*_atomic() is
not really helping, right?

In that case we might only need to change smp_mb__before_atomic() to
smp_rmb() in qdisc_run_begin(), as we only need to order test_bit()
after set_bit() and clear_bit(), which is a read/write ordering?

By the way, Guoju needs to ensure ordering between spin_unlock() and
test_bit() in qdisc_run_end(), which is a write/read ordering, so
smp_mb() is used.


> 
> V
> 
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-24  6:43               ` Yunsheng Lin
@ 2022-05-24  8:13                 ` Vincent Ray
  2022-05-24 17:00                   ` Vincent Ray
  0 siblings, 1 reply; 28+ messages in thread
From: Vincent Ray @ 2022-05-24  8:13 UTC (permalink / raw)
  To: linyunsheng
  Cc: davem, 方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Guoju Fang,
	Remy Gauguey, Eric Dumazet, will

 
> Hi, thanks for the testing and debugging. So the main problem is that
> test_bit() is not an atomic operation, so using smp_mb__*_atomic() is
> not really helping, right?

Yes

> In that case we might only need to change smp_mb__before_atomic() to
> smp_rmb() in qdisc_run_begin(), as we only need to order test_bit()
> after set_bit() and clear_bit(), which is a read/write ordering?

I don't think so : we need to order test_bit() with respect to an earlier enqueue().
So we need a store/load barrier => smp_mb()

> By the way,  Guoju need to ensure ordering between spin_unlock() and
> test_bit() in qdisc_run_end(), which is a write/read ordering, so
> smp_mb() is used.

Agreed

But, for qdisc_run_begin(), I think Eric is right anyway, his fix, using test_and_set_bit() seems much better.
I'm going to try it and will tell you what it gives.

Thanks,

V





^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-24  8:13                 ` Vincent Ray
@ 2022-05-24 17:00                   ` Vincent Ray
  2022-05-24 20:17                     ` Eric Dumazet
  0 siblings, 1 reply; 28+ messages in thread
From: Vincent Ray @ 2022-05-24 17:00 UTC (permalink / raw)
  To: linyunsheng, Eric Dumazet
  Cc: davem, 方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Guoju Fang,
	Remy Gauguey, will

All,

I confirm Eric's patch works well too, and it's better and clearer than mine.
So I think we should go for it, and the one from Guoju in addition.

@Eric : I see you are one of the networking maintainers, so I have a few questions for you :

a) are you going to take care of these patches directly yourself, or is there something Guoju or I should do to promote them ?

b) Can we expect to see them land in the mainline soon ?

c) Will they be backported to previous versions of the kernel ? Which ones ?

Thanks a lot, best,

V





^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-24 17:00                   ` Vincent Ray
@ 2022-05-24 20:17                     ` Eric Dumazet
  2022-05-25  9:44                       ` Vincent Ray
  0 siblings, 1 reply; 28+ messages in thread
From: Eric Dumazet @ 2022-05-24 20:17 UTC (permalink / raw)
  To: Vincent Ray, linyunsheng
  Cc: davem, 方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Guoju Fang,
	Remy Gauguey, will, Eric Dumazet


On 5/24/22 10:00, Vincent Ray wrote:
> All,
>
> I confirm Eric's patch works well too, and it's better and clearer than mine.
> So I think we should go for it, and the one from Guoju in addition.
>
> @Eric : I see you are one of the networking maintainers, so I have a few questions for you :
>
> a) are you going to take care of these patches directly yourself, or is there something Guoju or I should do to promote them ?

I think it is totally fine for you to take ownership of the patch, please 
send a formal V2.

Please double check what patchwork had to say about your V1 :


https://patchwork.kernel.org/project/netdevbpf/patch/1684598287.15044793.1653314052575.JavaMail.zimbra@kalray.eu/


And make sure to address the relevant points

The most important one is the lack of 'Signed-off-by:' tag, of course.


> b) Can we expect to see them land in the mainline soon ?

If your v2 submission is correct, it can be merged this week ;)


>
> c) Will they be backported to previous versions of the kernel ? Which ones ?

You can simply include a proper Fixes: tag, so that stable teams can 
backport the patch to all affected kernel versions.



>
> Thanks a lot, best,

Thanks a lot for working on this long standing issue.



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-24 20:17                     ` Eric Dumazet
@ 2022-05-25  9:44                       ` Vincent Ray
  2022-05-25 10:45                         ` Yunsheng Lin
  0 siblings, 1 reply; 28+ messages in thread
From: Vincent Ray @ 2022-05-25  9:44 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: linyunsheng, davem, 方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Guoju Fang,
	Remy Gauguey, will, Eric Dumazet

----- On May 24, 2022, at 10:17 PM, Eric Dumazet eric.dumazet@gmail.com wrote:

> On 5/24/22 10:00, Vincent Ray wrote:
>> All,
>>
>> I confirm Eric's patch works well too, and it's better and clearer than mine.
>> So I think we should go for it, and the one from Guoju in addition.
>>
>> @Eric : I see you are one of the networking maintainers, so I have a few
>> questions for you :
>>
>> a) are you going to take care of these patches directly yourself, or is there
>> something Guoju or I should do to promote them ?
> 
> I think this is totally fine you take ownership of the patch, please
> send a formal V2.
> 
> Please double check what patchwork had to say about your V1 :
> 
> 
> https://patchwork.kernel.org/project/netdevbpf/patch/1684598287.15044793.1653314052575.JavaMail.zimbra@kalray.eu/
> 
> 
> And make sure to address the relevant points

OK I will.
If you agree, I will take your version of the fix (test_and_set_bit()), keeping the commit message
similar to my original one.

What about Guoju's patch ? 
(adding a smp_mb() between the spin_unlock() and test_bit() in qdisc_run_end()). 
I think it is also necessary though potentially less critical.
Do we embed it in the same patch ? or patch series ?

@Guoju : have you submitted it for integration ?


> The most important one is the lack of 'Signed-off-by:' tag, of course.
> 
> 
>> b) Can we expect to see them land in the mainline soon ?
> 
> If your v2 submission is correct, it can be merged this week ;)
> 
> 
>>
>> c) Will they be backported to previous versions of the kernel ? Which ones ?
> 
> You simply can include a proper Fixes: tag, so that stable teams can
> backport
> 
> the patch to all affected kernel versions.
> 

Here things get a little complicated in my head ;-)
As explained, I think this mechanism has been buggy, in one way or another, for some time, perhaps since the introduction
of lockless qdiscs (4.16) or somewhere between 4.16 and 5.14.
It's hard to tell at a glance since the code looks quite different back then.
Because of these changes, a unique patch will also only apply up to a certain point in the past.

However, I think the bug became really critical only with the introduction of "true bypass" behavior 
in lockless qdiscs by YunSheng in 5.14, though there may be scenarios where it is a big deal 
even in no-bypass mode.

=> I suggest we only tag it for backward fix up to the 5.14, where it should apply smoothly,
 and we live with the bug for versions before that.
This would mean that 5.15 LTS can be patched but no earlier LTS
 
What do you think ?

BTW : forgive my ignorance, but are there any kind of "Errata Sheet" or similar for known bugs that 
won't be fixed in a given kernel ?

> 
> 
>>
>> Thanks a lot, best,
> 
> Thanks a lot for working on this long standing issue.
> 
> 
> 
> 
> To declare a filtering error, please use the following link :
> https://www.security-mail.net/reporter.php?mid=7009.628d3d4c.37c04.0&r=vray%40kalrayinc.com&s=eric.dumazet%40gmail.com&o=Re%3A+packet+stuck+in+qdisc+%3A+patch+proposal&verdict=C&c=0ca08e7b7e420d1ab014cda67db48db71df41f5f





^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-25  9:44                       ` Vincent Ray
@ 2022-05-25 10:45                         ` Yunsheng Lin
  2022-05-25 12:40                           ` Guoju Fang
  0 siblings, 1 reply; 28+ messages in thread
From: Yunsheng Lin @ 2022-05-25 10:45 UTC (permalink / raw)
  To: Vincent Ray, Eric Dumazet
  Cc: davem, 方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Guoju Fang,
	Remy Gauguey, will, Eric Dumazet

On 2022/5/25 17:44, Vincent Ray wrote:
> ----- On May 24, 2022, at 10:17 PM, Eric Dumazet eric.dumazet@gmail.com wrote:
> 
>> On 5/24/22 10:00, Vincent Ray wrote:
>>> All,
>>>
>>> I confirm Eric's patch works well too, and it's better and clearer than mine.
>>> So I think we should go for it, and the one from Guoju in addition.
>>>
>>> @Eric : I see you are one of the networking maintainers, so I have a few
>>> questions for you :
>>>
>>> a) are you going to take care of these patches directly yourself, or is there
>>> something Guoju or I should do to promote them ?
>>
>> I think this is totally fine you take ownership of the patch, please
>> send a formal V2.
>>
>> Please double check what patchwork had to say about your V1 :
>>
>>
>> https://patchwork.kernel.org/project/netdevbpf/patch/1684598287.15044793.1653314052575.JavaMail.zimbra@kalray.eu/
>>
>>
>> And make sure to address the relevant points
> 
> OK I will.
> If you agree, I will take your version of the fix (test_and_set_bit()), keeping the commit message
> similar to my original one.
> 
> What about Guoju's patch ? 

@Guoju, please speak up if you want to handle the patch yourself.

> (adding a smp_mb() between the spin_unlock() and test_bit() in qdisc_run_end()). 
> I think it is also necessary though potentially less critical.
> Do we embed it in the same patch ? or patch series ?

Guoju's patch fixes the commit a90c57f2cedd, so "patch series"
seems better if Guoju is not speaking up to handle the patch himself.


> 
> @Guoju : have you submitted it for integration ?
> 
> 
>> The most important one is the lack of 'Signed-off-by:' tag, of course.
>>
>>
>>> b) Can we expect to see them land in the mainline soon ?
>>
>> If your v2 submission is correct, it can be merged this week ;)
>>
>>
>>>
>>> c) Will they be backported to previous versions of the kernel ? Which ones ?
>>
>> You simply can include a proper Fixes: tag, so that stable teams can
>> backport
>>
>> the patch to all affected kernel versions.
>>
> 
> Here things get a little complicated in my head ;-)
> As explained, I think this mechanism has been bugged, in a way or an other, for some time, perhaps since the introduction
> of lockless qdiscs (4.16) or somewhere between 4.16 and 5.14.
> It's hard to tell at a glance since the code looks quite different back then.
> Because of these changes, a unique patch will also only apply up to a certain point in the past.
> 
> However, I think the bug became really critical only with the introduction of "true bypass" behavior 
> in lockless qdiscs by YunSheng in 5.14, though there may be scenarios where it is a big deal 
> even in no-bypass mode.


commit 89837eb4b246 tried to fix that, but it did not fix it completely, and that commit should have
been back-ported to the affected kernel versions as much as possible, so I think the Fixes tag
should be:

Fixes: 89837eb4b246 ("net: sched: add barrier to ensure correct ordering for lockless qdisc")

> 
> => I suggest we only tag it for backward fix up to the 5.14, where it should apply smoothly,
>  and we live with the bug for versions before that.
> This would mean that 5.15 LT can be patched but no earlier LT
>  
> What do you think ?
> 
> BTW : forgive my ignorance, but are there any kind of "Errata Sheet" or similar for known bugs that 
> won't be fixed in a given kernel ?
> 
>>
>>
>>>
>>> Thanks a lot, best,
>>
>> Thanks a lot for working on this long standing issue.
>>
>>
>>
>>
> 
> 
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-25 10:45                         ` Yunsheng Lin
@ 2022-05-25 12:40                           ` Guoju Fang
  2022-05-25 17:43                             ` Vincent Ray
  0 siblings, 1 reply; 28+ messages in thread
From: Guoju Fang @ 2022-05-25 12:40 UTC (permalink / raw)
  To: Yunsheng Lin, Vincent Ray, Eric Dumazet
  Cc: davem, 方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Remy Gauguey, will,
	Eric Dumazet

[-- Attachment #1: Type: text/plain, Size: 4027 bytes --]



On 2022/5/25 18:45, Yunsheng Lin wrote:
> On 2022/5/25 17:44, Vincent Ray wrote:
>> ----- On May 24, 2022, at 10:17 PM, Eric Dumazet eric.dumazet@gmail.com wrote:
>>
>>> On 5/24/22 10:00, Vincent Ray wrote:
>>>> All,
>>>>
>>>> I confirm Eric's patch works well too, and it's better and clearer than mine.
>>>> So I think we should go for it, and the one from Guoju in addition.
>>>>
>>>> @Eric : I see you are one of the networking maintainers, so I have a few
>>>> questions for you :
>>>>
>>>> a) are you going to take care of these patches directly yourself, or is there
>>>> something Guoju or I should do to promote them ?
>>>
>>> I think this is totally fine you take ownership of the patch, please
>>> send a formal V2.
>>>
>>> Please double check what patchwork had to say about your V1 :
>>>
>>>
>>> https://patchwork.kernel.org/project/netdevbpf/patch/1684598287.15044793.1653314052575.JavaMail.zimbra@kalray.eu/
>>>
>>>
>>> And make sure to address the relevant points
>>
>> OK I will.
>> If you agree, I will take your version of the fix (test_and_set_bit()), keeping the commit message
>> similar to my original one.
>>
>> What about Guoju's patch ?
> 
> @Guoju, please speak up if you want to handle the patch yourself.

Hi Yunsheng, all,

I rewrote the comments of my patch and it looks a little clearer. :)

Thank you.

Best regards,

> 
>> (adding a smp_mb() between the spin_unlock() and test_bit() in qdisc_run_end()).
>> I think it is also necessary though potentially less critical.
>> Do we embed it in the same patch ? or patch series ?
> 
> Guoju's patch fixes the commit a90c57f2cedd, so "patch series"
> seems better if Guoju is not speaking up to handle the patch himself.
> 
> 
>>
>> @Guoju : have you submitted it for integration ?
>>
>>
>>> The most important one is the lack of 'Signed-off-by:' tag, of course.
>>>
>>>
>>>> b) Can we expect to see them land in the mainline soon ?
>>>
>>> If your v2 submission is correct, it can be merged this week ;)
>>>
>>>
>>>>
>>>> c) Will they be backported to previous versions of the kernel ? Which ones ?
>>>
>>> You simply can include a proper Fixes: tag, so that stable teams can
>>> backport
>>>
>>> the patch to all affected kernel versions.
>>>
>>
>> Here things get a little complicated in my head ;-)
>> As explained, I think this mechanism has been bugged, in a way or an other, for some time, perhaps since the introduction
>> of lockless qdiscs (4.16) or somewhere between 4.16 and 5.14.
>> It's hard to tell at a glance since the code looks quite different back then.
>> Because of these changes, a unique patch will also only apply up to a certain point in the past.
>>
>> However, I think the bug became really critical only with the introduction of "true bypass" behavior
>> in lockless qdiscs by YunSheng in 5.14, though there may be scenarios where it is a big deal
>> even in no-bypass mode.
> 
> 
> commit 89837eb4b246 tried to fix that, but it did not fix it completely, and that commit should has
> been back-ported to the affected kernel versions as much as possible, so I think the Fixes tag
> should be:
> 
> Fixes: 89837eb4b246 ("net: sched: add barrier to ensure correct ordering for lockless qdisc")
> 
>>
>> => I suggest we only tag it for backward fix up to the 5.14, where it should apply smoothly,
>>   and we live with the bug for versions before that.
>> This would mean that 5.15 LT can be patched but no earlier LT
>>   
>> What do you think ?
>>
>> BTW : forgive my ignorance, but are there any kind of "Errata Sheet" or similar for known bugs that
>> won't be fixed in a given kernel ?
>>
>>>
>>>
>>>>
>>>> Thanks a lot, best,
>>>
>>> Thanks a lot for working on this long standing issue.
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>> .
>>

[-- Attachment #2: 0001-net-sched-add-barrier-to-fix-packet-stuck-problem-fo.patch --]
[-- Type: text/plain, Size: 2236 bytes --]

From 28cc81b33972cb1bcdfd529ceae024c9800a54a1 Mon Sep 17 00:00:00 2001
From: Guoju Fang <gjfang@linux.alibaba.com>
Date: Wed, 25 May 2022 19:12:00 +0800
Subject: [PATCH] net: sched: add barrier to fix packet stuck problem for
 lockless qdisc

In qdisc_run_end(), the spin_unlock() only has store-release semantics,
which guarantees all earlier memory access are visible before it. But
the subsequent test_bit() may be reordered ahead of the spin_unlock(),
and may cause a packet stuck problem.

The concurrent operations can be described as below,
         CPU 0                      |          CPU 1
   qdisc_run_end()                  |     qdisc_run_begin()
          .                         |           .
 ----> /* may be reordered here */  |           .
|         .                         |           .
|     spin_unlock()                 |         set_bit()
|         .                         |         smp_mb__after_atomic()
 ---- test_bit()                    |         spin_trylock()
          .                         |          .

Consider the following sequence of events:
    CPU 0 reorder test_bit() ahead and see MISSED = 0
    CPU 1 calls set_bit()
    CPU 1 calls spin_trylock() and return fail
    CPU 0 executes spin_unlock()

At the end of the sequence, CPU 0 does nothing after spin_unlock()
because its reordered test_bit() saw MISSED = 0. The skb enqueued on
CPU 1 has been queued but no one takes it, until the next cpu pushing
to the qdisc (if ever ...) notices and dequeues it.

So one explicit barrier is needed between spin_unlock() and
test_bit() to ensure the correct order.

Signed-off-by: Guoju Fang <gjfang@linux.alibaba.com>
---
 include/net/sch_generic.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 9bab396c1f3b..8a8738642ca0 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -229,6 +229,9 @@ static inline void qdisc_run_end(struct Qdisc *qdisc)
 	if (qdisc->flags & TCQ_F_NOLOCK) {
 		spin_unlock(&qdisc->seqlock);
 
+		/* ensure ordering between spin_unlock() and test_bit() */
+		smp_mb();
+
 		if (unlikely(test_bit(__QDISC_STATE_MISSED,
 				      &qdisc->state)))
 			__netif_schedule(qdisc);
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-25 12:40                           ` Guoju Fang
@ 2022-05-25 17:43                             ` Vincent Ray
  2022-05-25 17:48                               ` Vincent Ray
  0 siblings, 1 reply; 28+ messages in thread
From: Vincent Ray @ 2022-05-25 17:43 UTC (permalink / raw)
  To: Guoju Fang, Eric Dumazet, linyunsheng
  Cc: davem, 方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Remy Gauguey, will,
	Eric Dumazet, pabeni

[-- Attachment #1: Type: text/plain, Size: 4841 bytes --]



----- On May 25, 2022, at 2:40 PM, Guoju Fang gjfang@linux.alibaba.com wrote:

> On 2022/5/25 18:45, Yunsheng Lin wrote:
>> On 2022/5/25 17:44, Vincent Ray wrote:
>>> ----- On May 24, 2022, at 10:17 PM, Eric Dumazet eric.dumazet@gmail.com wrote:
>>>
>>>> On 5/24/22 10:00, Vincent Ray wrote:
>>>>> All,
>>>>>
>>>>> I confirm Eric's patch works well too, and it's better and clearer than mine.
>>>>> So I think we should go for it, and the one from Guoju in addition.
>>>>>
>>>>> @Eric : I see you are one of the networking maintainers, so I have a few
>>>>> questions for you :
>>>>>
>>>>> a) are you going to take care of these patches directly yourself, or is there
>>>>> something Guoju or I should do to promote them ?
>>>>
>>>> I think this is totally fine you take ownership of the patch, please
>>>> send a formal V2.
>>>>
>>>> Please double check what patchwork had to say about your V1 :
>>>>
>>>>
>>>> https://patchwork.kernel.org/project/netdevbpf/patch/1684598287.15044793.1653314052575.JavaMail.zimbra@kalray.eu/
>>>>
>>>>
>>>> And make sure to address the relevant points
>>>
>>> OK I will.
>>> If you agree, I will take your version of the fix (test_and_set_bit()), keeping
>>> the commit message
>>> similar to my original one.
>>>
>>> What about Guoju's patch ?
>> 
>> @Guoju, please speak up if you want to handle the patch yourself.
> 
> Hi Yunsheng, all,
> 
> I rewrite the comments of my patch and it looks a little clearer. :)
> 
> Thank you.
> 
> Best regards,

Guoju : shouldn't you also include the same Fixes tag suggested by YunSheng ?

Here's mine, attached. Hope it's well formatted this time. Tell me.
I don't feel quite confident with the submission process to produce the series myself, so I'll let Eric handle it if it's ok.

> 
>> 
>>> (adding a smp_mb() between the spin_unlock() and test_bit() in qdisc_run_end()).
>>> I think it is also necessary though potentially less critical.
>>> Do we embed it in the same patch ? or patch series ?
>> 
>> Guoju's patch fixes the commit a90c57f2cedd, so "patch series"
>> seems better if Guoju is not speaking up to handle the patch himself.
>> 
>> 
>>>
>>> @Guoju : have you submitted it for integration ?
>>>
>>>
>>>> The most important one is the lack of 'Signed-off-by:' tag, of course.
>>>>
>>>>
>>>>> b) Can we expect to see them land in the mainline soon ?
>>>>
>>>> If your v2 submission is correct, it can be merged this week ;)
>>>>
>>>>
>>>>>
>>>>> c) Will they be backported to previous versions of the kernel ? Which ones ?
>>>>
>>>> You simply can include a proper Fixes: tag, so that stable teams can
>>>> backport
>>>>
>>>> the patch to all affected kernel versions.
>>>>
>>>
>>> Here things get a little complicated in my head ;-)
>>> As explained, I think this mechanism has been bugged, in a way or an other, for
>>> some time, perhaps since the introduction
>>> of lockless qdiscs (4.16) or somewhere between 4.16 and 5.14.
>>> It's hard to tell at a glance since the code looks quite different back then.
>>> Because of these changes, a unique patch will also only apply up to a certain
>>> point in the past.
>>>
>>> However, I think the bug became really critical only with the introduction of
>>> "true bypass" behavior
>>> in lockless qdiscs by YunSheng in 5.14, though there may be scenarios where it
>>> is a big deal
>>> even in no-bypass mode.
>> 
>> 
>> commit 89837eb4b246 tried to fix that, but it did not fix it completely, and
>> that commit should has
>> been back-ported to the affected kernel versions as much as possible, so I think
>> the Fixes tag
>> should be:
>> 
>> Fixes: 89837eb4b246 ("net: sched: add barrier to ensure correct ordering for
>> lockless qdisc")
>> 
>>>
>>> => I suggest we only tag it for backward fix up to the 5.14, where it should
>>> apply smoothly,
>>>   and we live with the bug for versions before that.
>>> This would mean that 5.15 LT can be patched but no earlier LT
>>>   
>>> What do you think ?
>>>
>>> BTW : forgive my ignorance, but are there any kind of "Errata Sheet" or similar
>>> for known bugs that
>>> won't be fixed in a given kernel ?
>>>
>>>>
>>>>
>>>>>
>>>>> Thanks a lot, best,
>>>>
>>>> Thanks a lot for working on this long standing issue.
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>> .
>>>
> 



[-- Attachment #2: 0001-net-sched-fixed-barrier-to-prevent-skbuff-sticking-i.patch --]
[-- Type: application/mbox, Size: 5550 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-25 17:43                             ` Vincent Ray
@ 2022-05-25 17:48                               ` Vincent Ray
  2022-05-26  0:17                                 ` Eric Dumazet
  0 siblings, 1 reply; 28+ messages in thread
From: Vincent Ray @ 2022-05-25 17:48 UTC (permalink / raw)
  To: Guoju Fang, Eric Dumazet, linyunsheng
  Cc: davem, 方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Remy Gauguey, will,
	Eric Dumazet, pabeni



----- On May 25, 2022, at 7:43 PM, Vincent Ray vray@kalrayinc.com wrote:

> ----- On May 25, 2022, at 2:40 PM, Guoju Fang gjfang@linux.alibaba.com wrote:
> 
>> On 2022/5/25 18:45, Yunsheng Lin wrote:
>>> On 2022/5/25 17:44, Vincent Ray wrote:
>>>> ----- On May 24, 2022, at 10:17 PM, Eric Dumazet eric.dumazet@gmail.com wrote:
>>>>
>>>>> On 5/24/22 10:00, Vincent Ray wrote:
>>>>>> All,
>>>>>>
>>>>>> I confirm Eric's patch works well too, and it's better and clearer than mine.
>>>>>> So I think we should go for it, and the one from Guoju in addition.
>>>>>>
>>>>>> @Eric : I see you are one of the networking maintainers, so I have a few
>>>>>> questions for you :
>>>>>>
>>>>>> a) are you going to take care of these patches directly yourself, or is there
>>>>>> something Guoju or I should do to promote them ?
>>>>>
>>>>> I think this is totally fine you take ownership of the patch, please
>>>>> send a formal V2.
>>>>>
>>>>> Please double check what patchwork had to say about your V1 :
>>>>>
>>>>>
>>>>> https://patchwork.kernel.org/project/netdevbpf/patch/1684598287.15044793.1653314052575.JavaMail.zimbra@kalray.eu/
>>>>>
>>>>>
>>>>> And make sure to address the relevant points
>>>>
>>>> OK I will.
>>>> If you agree, I will take your version of the fix (test_and_set_bit()),
>>>> keeping the commit message similar to my original one.
>>>>
>>>> What about Guoju's patch ?
>>> 
>>> @Guoju, please speak up if you want to handle the patch yourself.
>> 
>> Hi Yunsheng, all,
>> 
>> I rewrite the comments of my patch and it looks a little clearer. :)
>> 
>> Thank you.
>> 
>> Best regards,
> 
> Guoju : shouldn't you also include the same Fixes tag suggested by YunSheng ?
> 
> Here's mine, attached. Hope it's well formatted this time. Tell me.
> I don't feel quite confident with the submission process to produce the series
> myself, so I'll let Eric handle it if it's ok.

NB : from what I've understood reading some doc, as this is a fix, it's supposed to go 
in the "net" tree, so I tagged it accordingly in the Subject. Hope it's Ok

> 
>> 
>>> 
>>>> (adding a smp_mb() between the spin_unlock() and test_bit() in qdisc_run_end()).
>>>> I think it is also necessary though potentially less critical.
>>>> Do we embed it in the same patch ? or patch series ?
>>> 
>>> Guoju's patch fixes the commit a90c57f2cedd, so "patch series"
>>> seems better if Guoju is not speaking up to handle the patch himself.
>>> 
>>> 
>>>>
>>>> @Guoju : have you submitted it for integration ?
>>>>
>>>>
>>>>> The most important one is the lack of 'Signed-off-by:' tag, of course.
>>>>>
>>>>>
>>>>>> b) Can we expect to see them land in the mainline soon ?
>>>>>
>>>>> If your v2 submission is correct, it can be merged this week ;)
>>>>>
>>>>>
>>>>>>
>>>>>> c) Will they be backported to previous versions of the kernel ? Which ones ?
>>>>>
>>>>> You simply can include a proper Fixes: tag, so that stable teams can
>>>>> backport
>>>>>
>>>>> the patch to all affected kernel versions.
>>>>>
>>>>
>>>> Here things get a little complicated in my head ;-)
>>>> As explained, I think this mechanism has been buggy, in one way or another,
>>>> for some time, perhaps since the introduction of lockless qdiscs (4.16) or
>>>> somewhere between 4.16 and 5.14.
>>>> It's hard to tell at a glance since the code looked quite different back then.
>>>> Because of these changes, a single patch will also only apply up to a certain
>>>> point in the past.
>>>>
>>>> However, I think the bug became really critical only with the introduction
>>>> of "true bypass" behavior in lockless qdiscs by YunSheng in 5.14, though
>>>> there may be scenarios where it is a big deal even in no-bypass mode.
>>> 
>>> 
>>> commit 89837eb4b246 tried to fix that, but it did not fix it completely,
>>> and that commit should have been back-ported to the affected kernel
>>> versions as much as possible, so I think the Fixes tag should be:
>>> 
>>> Fixes: 89837eb4b246 ("net: sched: add barrier to ensure correct ordering for
>>> lockless qdisc")
>>> 
>>>>
>>>> => I suggest we only tag it for backporting down to 5.14, where it should
>>>>   apply smoothly, and we live with the bug for versions before that.
>>>> This would mean that 5.15 LTS can be patched but no earlier LTS.
>>>>   
>>>> What do you think ?
>>>>
>>>> BTW : forgive my ignorance, but is there any kind of "Errata Sheet" or
>>>> similar for known bugs that won't be fixed in a given kernel ?
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks a lot, best,
>>>>>
>>>>> Thanks a lot for working on this long standing issue.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>> .
>>>>
>> 





^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-25 17:48                               ` Vincent Ray
@ 2022-05-26  0:17                                 ` Eric Dumazet
  2022-05-26  7:01                                   ` [PATCH v2] net: sched: add barrier to fix packet stuck problem for lockless qdisc Guoju Fang
  2022-05-30  9:36                                   ` packet stuck in qdisc : patch proposal Vincent Ray
  0 siblings, 2 replies; 28+ messages in thread
From: Eric Dumazet @ 2022-05-26  0:17 UTC (permalink / raw)
  To: Vincent Ray
  Cc: Guoju Fang, Eric Dumazet, linyunsheng, davem,
	方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Remy Gauguey, will,
	pabeni

On Wed, May 25, 2022 at 10:48 AM Vincent Ray <vray@kalrayinc.com> wrote:
>

> >
> > Guoju : shouldn't you also include the same Fixes tag suggested by YunSheng ?
> >
> > Here's mine, attached. Hope it's well formatted this time. Tell me.
> > I don't feel quite confident with the submission process to produce the series
> > myself, so I'll let Eric handle it if it's ok.
>
> NB : from what I've understood reading some doc, as this is a fix, it's supposed to go
> in the "net" tree, so I tagged it accordingly in the Subject. Hope it's Ok

Patch was not delivered to patchwork this time.

I will resend it.

Next time, please take the time to read
Documentation/process/submitting-patches.rst, thanks.

(line 276: No MIME, no links, no compression, no attachments.  Just plain text)

Thanks.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2] net: sched: add barrier to fix packet stuck problem for lockless qdisc
  2022-05-26  0:17                                 ` Eric Dumazet
@ 2022-05-26  7:01                                   ` Guoju Fang
  2022-05-27  9:11                                     ` [PATCH v3 net] " Guoju Fang
  2022-05-30  9:36                                   ` packet stuck in qdisc : patch proposal Vincent Ray
  1 sibling, 1 reply; 28+ messages in thread
From: Guoju Fang @ 2022-05-26  7:01 UTC (permalink / raw)
  To: edumazet
  Cc: davem, eric.dumazet, gjfang, guoju.fgj, kuba, linyunsheng,
	netdev, pabeni, rgauguey, sjones, vladimir.oltean, vray, will

In qdisc_run_end(), the spin_unlock() only has store-release semantics,
which guarantee that all earlier memory accesses are visible before it.
But the subsequent test_bit() may be reordered ahead of the
spin_unlock(), and may cause a packet stuck problem.

The concurrent operations can be described as below,
         CPU 0                      |          CPU 1
   qdisc_run_end()                  |     qdisc_run_begin()
          .                         |           .
 ----> /* may be reordered here */  |           .
|         .                         |           .
|     spin_unlock()                 |         set_bit()
|         .                         |         smp_mb__after_atomic()
 ---- test_bit()                    |         spin_trylock()
          .                         |          .

Consider the following sequence of events:
    CPU 0 reorders test_bit() ahead and sees MISSED = 0
    CPU 1 calls set_bit()
    CPU 1 calls spin_trylock(), which fails
    CPU 0 executes spin_unlock()

At the end of the sequence, CPU 0 does nothing after the spin_unlock()
because it saw MISSED = 0. The skb on CPU 1 has been enqueued but no one
takes it, until the next CPU pushing to the qdisc (if ever ...) notices
and dequeues it.

So one explicit barrier is needed between spin_unlock() and
test_bit() to ensure the correct order.

Fixes: 89837eb4b246 ("net: sched: add barrier to ensure correct ordering
for lockless qdisc")
Signed-off-by: Guoju Fang <gjfang@linux.alibaba.com>
---
 include/net/sch_generic.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 9bab396c1f3b..8a8738642ca0 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -229,6 +229,9 @@ static inline void qdisc_run_end(struct Qdisc *qdisc)
 	if (qdisc->flags & TCQ_F_NOLOCK) {
 		spin_unlock(&qdisc->seqlock);
 
+		/* ensure ordering between spin_unlock() and test_bit() */
+		smp_mb();
+
 		if (unlikely(test_bit(__QDISC_STATE_MISSED,
 				      &qdisc->state)))
 			__netif_schedule(qdisc);
-- 
2.32.0.3.g01195cf9f


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 net] net: sched: add barrier to fix packet stuck problem for lockless qdisc
  2022-05-26  7:01                                   ` [PATCH v2] net: sched: add barrier to fix packet stuck problem for lockless qdisc Guoju Fang
@ 2022-05-27  9:11                                     ` Guoju Fang
  2022-05-28  0:51                                       ` Yunsheng Lin
  0 siblings, 1 reply; 28+ messages in thread
From: Guoju Fang @ 2022-05-27  9:11 UTC (permalink / raw)
  To: gjfang
  Cc: davem, edumazet, eric.dumazet, guoju.fgj, kuba, linyunsheng,
	netdev, pabeni, rgauguey, sjones, vladimir.oltean, vray, will

In qdisc_run_end(), the spin_unlock() only has store-release semantics,
which guarantee that all earlier memory accesses are visible before it.
But the subsequent test_bit() may be reordered ahead of the
spin_unlock(), and may cause a packet stuck problem.

The concurrent operations can be described as below,
         CPU 0                      |          CPU 1
   qdisc_run_end()                  |     qdisc_run_begin()
          .                         |           .
 ----> /* may be reordered here */  |           .
|         .                         |           .
|     spin_unlock()                 |         set_bit()
|         .                         |         smp_mb__after_atomic()
 ---- test_bit()                    |         spin_trylock()
          .                         |          .

Consider the following sequence of events:
    CPU 0 reorders test_bit() ahead and sees MISSED = 0
    CPU 1 calls set_bit()
    CPU 1 calls spin_trylock(), which fails
    CPU 0 executes spin_unlock()

At the end of the sequence, CPU 0 does nothing after the spin_unlock()
because it saw MISSED = 0. The skb on CPU 1 has been enqueued but no one
takes it, until the next CPU pushing to the qdisc (if ever ...) notices
and dequeues it.

So one explicit barrier is needed between spin_unlock() and test_bit()
to ensure the correct order.

Fixes: 89837eb4b246 ("net: sched: add barrier to ensure correct ordering for lockless qdisc")
Signed-off-by: Guoju Fang <gjfang@linux.alibaba.com>
---
V2 -> V3: Not split the Fixes tag across multiple lines
V1 -> V2: Rewrite comments
---
 include/net/sch_generic.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 9bab396c1f3b..8a8738642ca0 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -229,6 +229,9 @@ static inline void qdisc_run_end(struct Qdisc *qdisc)
 	if (qdisc->flags & TCQ_F_NOLOCK) {
 		spin_unlock(&qdisc->seqlock);
 
+		/* ensure ordering between spin_unlock() and test_bit() */
+		smp_mb();
+
 		if (unlikely(test_bit(__QDISC_STATE_MISSED,
 				      &qdisc->state)))
 			__netif_schedule(qdisc);
-- 
2.34.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 net] net: sched: add barrier to fix packet stuck problem for lockless qdisc
  2022-05-27  9:11                                     ` [PATCH v3 net] " Guoju Fang
@ 2022-05-28  0:51                                       ` Yunsheng Lin
  2022-05-28 10:16                                         ` [PATCH v4 " Guoju Fang
  0 siblings, 1 reply; 28+ messages in thread
From: Yunsheng Lin @ 2022-05-28  0:51 UTC (permalink / raw)
  To: Guoju Fang
  Cc: davem, edumazet, eric.dumazet, guoju.fgj, kuba, netdev, pabeni,
	rgauguey, sjones, vladimir.oltean, vray, will

On 2022/5/27 17:11, Guoju Fang wrote:
> In qdisc_run_end(), the spin_unlock() only has store-release semantics,
> which guarantee that all earlier memory accesses are visible before it.
> But the subsequent test_bit() may be reordered ahead of the
> spin_unlock(), and may cause a packet stuck problem.
> 
> The concurrent operations can be described as below,
>          CPU 0                      |          CPU 1
>    qdisc_run_end()                  |     qdisc_run_begin()
>           .                         |           .
>  ----> /* may be reordered here */  |           .
> |         .                         |           .
> |     spin_unlock()                 |         set_bit()
> |         .                         |         smp_mb__after_atomic()
>  ---- test_bit()                    |         spin_trylock()
>           .                         |          .
> 
> Consider the following sequence of events:
>     CPU 0 reorders test_bit() ahead and sees MISSED = 0
>     CPU 1 calls set_bit()
>     CPU 1 calls spin_trylock(), which fails
>     CPU 0 executes spin_unlock()
> 
> At the end of the sequence, CPU 0 does nothing after the spin_unlock()
> because it saw MISSED = 0. The skb on CPU 1 has been enqueued but no one
> takes it, until the next CPU pushing to the qdisc (if ever ...) notices
> and dequeues it.
> 
> So one explicit barrier is needed between spin_unlock() and test_bit()
> to ensure the correct order.

It might be better to mention why smp_mb() is used instead of smp_rmb()
or smp_wmb():

The ordering between spin_unlock() and test_bit() is a store-load
ordering, which requires a full memory barrier such as smp_mb().

> 
> Fixes: 89837eb4b246 ("net: sched: add barrier to ensure correct ordering for lockless qdisc")

The Fixes tag should be:

Fixes: a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc")

Other than that, it looks good to me:
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>

> Signed-off-by: Guoju Fang <gjfang@linux.alibaba.com>
> ---
> V2 -> V3: Not split the Fixes tag across multiple lines
> V1 -> V2: Rewrite comments
> ---
>  include/net/sch_generic.h | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index 9bab396c1f3b..8a8738642ca0 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -229,6 +229,9 @@ static inline void qdisc_run_end(struct Qdisc *qdisc)
>  	if (qdisc->flags & TCQ_F_NOLOCK) {
>  		spin_unlock(&qdisc->seqlock);
>  
> +		/* ensure ordering between spin_unlock() and test_bit() */
> +		smp_mb();
> +
>  		if (unlikely(test_bit(__QDISC_STATE_MISSED,
>  				      &qdisc->state)))
>  			__netif_schedule(qdisc);
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v4 net] net: sched: add barrier to fix packet stuck problem for lockless qdisc
  2022-05-28  0:51                                       ` Yunsheng Lin
@ 2022-05-28 10:16                                         ` Guoju Fang
  2022-06-01  4:00                                           ` patchwork-bot+netdevbpf
  0 siblings, 1 reply; 28+ messages in thread
From: Guoju Fang @ 2022-05-28 10:16 UTC (permalink / raw)
  To: linyunsheng
  Cc: davem, edumazet, eric.dumazet, gjfang, guoju.fgj, kuba, netdev,
	pabeni, rgauguey, sjones, vladimir.oltean, vray, will

In qdisc_run_end(), the spin_unlock() only has store-release semantics,
which guarantee that all earlier memory accesses are visible before it.
But the subsequent test_bit() has no barrier semantics, so it may be
reordered ahead of the spin_unlock(). The store-load reordering may
cause a packet stuck problem.

The concurrent operations can be described as below,
         CPU 0                      |          CPU 1
   qdisc_run_end()                  |     qdisc_run_begin()
          .                         |           .
 ----> /* may be reordered here */  |           .
|         .                         |           .
|     spin_unlock()                 |         set_bit()
|         .                         |         smp_mb__after_atomic()
 ---- test_bit()                    |         spin_trylock()
          .                         |          .

Consider the following sequence of events:
    CPU 0 reorders test_bit() ahead and sees MISSED = 0
    CPU 1 calls set_bit()
    CPU 1 calls spin_trylock(), which fails
    CPU 0 executes spin_unlock()

At the end of the sequence, CPU 0 does nothing after the spin_unlock()
because it saw MISSED = 0. The skb on CPU 1 has been enqueued but no one
takes it, until the next CPU pushing to the qdisc (if ever ...) notices
and dequeues it.

This patch fixes this by adding one explicit barrier. As the ordering
between spin_unlock() and test_bit() is a store-load ordering, a full
memory barrier smp_mb() is needed here.

Fixes: a90c57f2cedd ("net: sched: fix packet stuck problem for lockless qdisc")
Signed-off-by: Guoju Fang <gjfang@linux.alibaba.com>
---
V3 -> V4: Clarified why a full memory barrier is needed
V2 -> V3: Not split the Fixes tag across multiple lines
V1 -> V2: Rewrite comments
---
 include/net/sch_generic.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 9bab396c1f3b..93c808bd39aa 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -229,6 +229,12 @@ static inline void qdisc_run_end(struct Qdisc *qdisc)
 	if (qdisc->flags & TCQ_F_NOLOCK) {
 		spin_unlock(&qdisc->seqlock);
 
+		/* spin_unlock() only has store-release semantic. The unlock
+		 * and test_bit() ordering is a store-load ordering, so a full
+		 * memory barrier is needed here.
+		 */
+		smp_mb();
+
 		if (unlikely(test_bit(__QDISC_STATE_MISSED,
 				      &qdisc->state)))
 			__netif_schedule(qdisc);
-- 
2.34.0


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: packet stuck in qdisc : patch proposal
  2022-05-26  0:17                                 ` Eric Dumazet
  2022-05-26  7:01                                   ` [PATCH v2] net: sched: add barrier to fix packet stuck problem for lockless qdisc Guoju Fang
@ 2022-05-30  9:36                                   ` Vincent Ray
  1 sibling, 0 replies; 28+ messages in thread
From: Vincent Ray @ 2022-05-30  9:36 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Guoju Fang, Eric Dumazet, linyunsheng, davem,
	方国炬,
	kuba, netdev, Samuel Jones, vladimir oltean, Remy Gauguey, will,
	pabeni



----- On May 26, 2022, at 2:17 AM, Eric Dumazet edumazet@google.com wrote:

> On Wed, May 25, 2022 at 10:48 AM Vincent Ray <vray@kalrayinc.com> wrote:
>>
> 
>> >
>> > Guoju : shouldn't you also include the same Fixes tag suggested by YunSheng ?
>> >
>> > Here's mine, attached. Hope it's well formatted this time. Tell me.
>> > I don't feel quite confident with the submission process to produce the series
>> > myself, so I'll let Eric handle it if it's ok.
>>
>> NB : from what I've understood reading some doc, as this is a fix, it's supposed
>> to go
>> in the "net" tree, so I tagged it accordingly in the Subject. Hope it's Ok
> 
> Patch was not delivered to patchwork this time.
> 
> I will resend it.
> 
> Next time, please take the time to read
> Documentation/process/submitting-patches.rst, thanks.
> 
> (line 276: No MIME, no links, no compression, no attachments.  Just plain text)
> 
> Thanks.

Yes, sorry about that Eric.
I did read the howto but somehow convinced myself that a plain-text attachment was OK,
and was confused about how / whether the patch would be correctly merged into the mail
thread if inlined.
Will do better next time !
Thanks,

V 

> 
> 





^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 net] net: sched: add barrier to fix packet stuck problem for lockless qdisc
  2022-05-28 10:16                                         ` [PATCH v4 " Guoju Fang
@ 2022-06-01  4:00                                           ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 28+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-06-01  4:00 UTC (permalink / raw)
  To: Guoju Fang
  Cc: linyunsheng, davem, edumazet, eric.dumazet, guoju.fgj, kuba,
	netdev, pabeni, rgauguey, sjones, vladimir.oltean, vray, will

Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski <kuba@kernel.org>:

On Sat, 28 May 2022 18:16:28 +0800 you wrote:
> In qdisc_run_end(), the spin_unlock() only has store-release semantics,
> which guarantee that all earlier memory accesses are visible before it.
> But the subsequent test_bit() has no barrier semantics, so it may be
> reordered ahead of the spin_unlock(). The store-load reordering may
> cause a packet stuck problem.
> 
> The concurrent operations can be described as below,
>          CPU 0                      |          CPU 1
>    qdisc_run_end()                  |     qdisc_run_begin()
>           .                         |           .
> 
> [...]

Here is the summary with links:
  - [v4,net] net: sched: add barrier to fix packet stuck problem for lockless qdisc
    https://git.kernel.org/netdev/net/c/2e8728c955ce

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2022-06-01  4:00 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1862202329.1457162.1643113633513.JavaMail.zimbra@kalray.eu>
     [not found] ` <698739062.1462023.1643115337201.JavaMail.zimbra@kalray.eu>
2022-01-28  2:36   ` packet stuck in qdisc Yunsheng Lin
2022-01-28  8:58     ` Vincent Ray
2022-01-28  9:59       ` Vincent Ray
2022-01-29  6:53         ` Yunsheng Lin
2022-01-31 18:39           ` Vincent Ray
2022-02-07  3:17             ` Yunsheng Lin
2022-03-25  6:16     ` Yunsheng Lin
2022-03-25  8:45       ` Vincent Ray
2022-04-13 13:01         ` Vincent Ray
2022-04-14  3:05           ` Guoju Fang
2022-05-23 13:54             ` packet stuck in qdisc : patch proposal Vincent Ray
2022-05-24  2:55               ` Eric Dumazet
2022-05-24  6:43               ` Yunsheng Lin
2022-05-24  8:13                 ` Vincent Ray
2022-05-24 17:00                   ` Vincent Ray
2022-05-24 20:17                     ` Eric Dumazet
2022-05-25  9:44                       ` Vincent Ray
2022-05-25 10:45                         ` Yunsheng Lin
2022-05-25 12:40                           ` Guoju Fang
2022-05-25 17:43                             ` Vincent Ray
2022-05-25 17:48                               ` Vincent Ray
2022-05-26  0:17                                 ` Eric Dumazet
2022-05-26  7:01                                   ` [PATCH v2] net: sched: add barrier to fix packet stuck problem for lockless qdisc Guoju Fang
2022-05-27  9:11                                     ` [PATCH v3 net] " Guoju Fang
2022-05-28  0:51                                       ` Yunsheng Lin
2022-05-28 10:16                                         ` [PATCH v4 " Guoju Fang
2022-06-01  4:00                                           ` patchwork-bot+netdevbpf
2022-05-30  9:36                                   ` packet stuck in qdisc : patch proposal Vincent Ray

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).