* nvme-tcp crashes the system when overloading the backend device.
@ 2021-08-31 13:30 Mark Ruijter
  2021-09-01 12:49 ` Sagi Grimberg
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Ruijter @ 2021-08-31 13:30 UTC (permalink / raw)
  To: linux-nvme

Hi all,

I can consistently crash a system when I sufficiently overload the nvme-tcp target.
The easiest way to reproduce the problem is by creating a raid5.

While this RAID5 array is resyncing, export it with the nvmet-tcp target driver and start a high-queue-depth 4K random fio workload from the initiator.
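
For reference, a minimal sketch of the steps. All names below are placeholders, not my actual configuration: the raid members /dev/sd[b-e], the array /dev/md0, the NQN "testnqn", the address 192.168.0.10:4420, the initiator-side device /dev/nvme1n1 and the exact fio options.

# On the target: create the RAID5 array; the initial resync starts immediately.
mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[b-e]

# Export /dev/md0 through nvmet-tcp via configfs.
modprobe nvmet-tcp
mkdir /sys/kernel/config/nvmet/subsystems/testnqn
echo 1 > /sys/kernel/config/nvmet/subsystems/testnqn/attr_allow_any_host
mkdir /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1
echo /dev/md0 > /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/device_path
echo 1 > /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/enable
mkdir /sys/kernel/config/nvmet/ports/1
echo tcp  > /sys/kernel/config/nvmet/ports/1/addr_trtype
echo ipv4 > /sys/kernel/config/nvmet/ports/1/addr_adrfam
echo 192.168.0.10 > /sys/kernel/config/nvmet/ports/1/addr_traddr
echo 4420 > /sys/kernel/config/nvmet/ports/1/addr_trsvcid
ln -s /sys/kernel/config/nvmet/subsystems/testnqn \
      /sys/kernel/config/nvmet/ports/1/subsystems/testnqn

# On the initiator: connect and run a high queue-depth 4K random workload
# while the resync on the target is still in progress.
nvme connect -t tcp -n testnqn -a 192.168.0.10 -s 4420
fio --name=overload --filename=/dev/nvme1n1 --rw=randrw --bs=4k --direct=1 \
    --ioengine=libaio --iodepth=128 --numjobs=8 --time_based --runtime=600
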
At some point the target system will start logging these messages:
[ 2865.725069] nvmet: ctrl 238 keep-alive timer (15 seconds) expired!
[ 2865.725072] nvmet: ctrl 236 keep-alive timer (15 seconds) expired!
[ 2865.725075] nvmet: ctrl 238 fatal error occurred!
[ 2865.725076] nvmet: ctrl 236 fatal error occurred!
[ 2865.725080] nvmet: ctrl 237 keep-alive timer (15 seconds) expired!
[ 2865.725083] nvmet: ctrl 237 fatal error occurred!
[ 2865.725087] nvmet: ctrl 235 keep-alive timer (15 seconds) expired!
[ 2865.725094] nvmet: ctrl 235 fatal error occurred!

Even when you stop all IO from the initiator, some of the nvmet_tcp_wq workers keep running forever.
The load shown by "top" never returns to the normal idle level.

root      5669  1.1  0.0      0     0 ?        D<   03:39   0:09 [kworker/22:2H+nvmet_tcp_wq]
root      5670  0.8  0.0      0     0 ?        D<   03:39   0:06 [kworker/55:2H+nvmet_tcp_wq]
root      5676  0.2  0.0      0     0 ?        D<   03:39   0:01 [kworker/29:2H+nvmet_tcp_wq]
root      5677 12.2  0.0      0     0 ?        D<   03:39   1:35 [kworker/59:2H+nvmet_tcp_wq]
root      5679  5.7  0.0      0     0 ?        D<   03:39   0:44 [kworker/27:2H+nvmet_tcp_wq]
root      5680  2.9  0.0      0     0 ?        I<   03:39   0:23 [kworker/57:2H-nvmet_tcp_wq]
root      5681  1.0  0.0      0     0 ?        D<   03:39   0:08 [kworker/60:2H+nvmet_tcp_wq]
root      5682  0.5  0.0      0     0 ?        D<   03:39   0:04 [kworker/18:2H+nvmet_tcp_wq]
root      5683  5.8  0.0      0     0 ?        D<   03:39   0:45 [kworker/54:2H+nvmet_tcp_wq]

The number of running nvmet_tcp_wq workers keeps increasing once you hit the problem:

gold:/var/crash/2021-08-26-08:38 # ps ax | grep nvmet_tcp_wq | tail -3
41114 ?        D<     0:00 [kworker/25:21H+nvmet_tcp_wq]
41152 ?        D<     0:00 [kworker/54:25H+nvmet_tcp_wq]

gold:/var/crash/2021-08-26-08:38 # ps ax | grep nvme | grep wq | wc -l
500
gold:/var/crash/2021-08-26-08:38 # ps ax | grep nvme | grep wq | wc -l
502
gold:/var/crash/2021-08-26-08:38 # ps ax | grep nvmet_tcp_wq | wc -l
503
gold:/var/crash/2021-08-26-08:38 # ps ax | grep nvmet_tcp_wq | wc -l
505
gold:/var/crash/2021-08-26-08:38 # ps ax | grep nvmet_tcp_wq | wc -l
506
gold:/var/crash/2021-08-26-08:38 # ps ax | grep nvmet_tcp_wq | wc -l
511
gold:/var/crash/2021-08-26-08:38 # ps ax | grep nvmet_tcp_wq | wc -l
661
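
A compact way to watch the count grow (just a sketch; the bracket in the pattern only keeps grep from matching itself):

watch -n 5 'ps ax | grep "[n]vmet_tcp_wq" | wc -l'
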

Eventually the system runs out of resources.
At some point the system reaches a load of 2000+ and crashes.

So far, I have been unable to determine why the number of nvmet_tcp_wq workers keeps increasing.
It must be that each failed worker gets replaced by a new one without the old worker ever being terminated.
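
To narrow down where the stuck workers are blocked, something like the following could help (a sketch; it assumes the D-state kworkers shown above and needs root). Alternatively, sysrq-w (echo w > /proc/sysrq-trigger) dumps all blocked tasks to the kernel log.

# Dump the kernel stack of every D-state nvmet_tcp_wq worker.
for pid in $(ps axo pid,stat,comm | awk '$2 ~ /^D/ && $3 ~ /nvmet_tcp_wq/ {print $1}'); do
    echo "== pid $pid =="
    cat /proc/$pid/stack
done
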

Thanks,

Mark Ruijter




_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme



Thread overview: 7+ messages
2021-08-31 13:30 nvme-tcp crashes the system when overloading the backend device Mark Ruijter
2021-09-01 12:49 ` Sagi Grimberg
2021-09-01 14:36   ` Mark Ruijter
2021-09-01 14:47     ` Sagi Grimberg
2021-09-02 11:31       ` Mark Ruijter
     [not found]       ` <27377057-5001-4D53-B8D7-889972376F29@primelogic.nl>
2021-09-06 11:12         ` Sagi Grimberg
2021-09-06 12:25           ` Mark Ruijter
