* nvme-rdma corrupts memory upon timeout
@ 2018-02-25 15:10 Alon Horev
  2018-02-25 16:18 ` Bart Van Assche
  2018-02-25 17:45 ` Sagi Grimberg
  0 siblings, 2 replies; 10+ messages in thread
From: Alon Horev @ 2018-02-25 15:10 UTC (permalink / raw)


Hey,

We're running nvmf over a large cluster using RDMA. Sometimes there is
congestion that causes the nvme host driver to time out (we use a
4-second timeout).
Even though the host (initiator) times out and returns an error to
userspace, we can see the buffer being written after the I/O has
returned. This can obviously cause serious crashes and corruptions.
We suspect the same happens with writes but have yet to prove it.

We think we can spot the root cause: nvme_rdma_error_recovery()
handles the timeout in an asynchronous manner. It queues a task that
reconnects the nvme device. Until that task is executed by the worker
thread, the QP is open and an RDMA write can get through. Does this
make sense?
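
To illustrate, here is the timeout path as we read it, condensed from
drivers/nvme/host/rdma.c (the comments are ours and mark the window we
are describing):
--
static enum blk_eh_timer_return
nvme_rdma_timeout(struct request *rq, bool reserved)
{
        struct nvme_rdma_request *req = blk_mq_rq_to_pdu(rq);

        /* only *queues* err_work; returns without touching the qp */
        nvme_rdma_error_recovery(req->queue->ctrl);

        /* fail with DNR on cmd timeout */
        nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;

        /*
         * the request is completed here and its buffer may be freed
         * or reused, but until the worker runs the HCA can still
         * place a remote RDMA write into it
         */
        return BLK_EH_HANDLED;
}
--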

Some additional information: we use a keepalive and reconnect timeout
of 1 second, on ConnectX-4 with OFED 4.1. I verified the code hasn't
changed in the latest Linux sources.

Thanks, Alon Horev
VastData


* nvme-rdma corrupts memory upon timeout
  2018-02-25 15:10 nvme-rdma corrupts memory upon timeout Alon Horev
@ 2018-02-25 16:18 ` Bart Van Assche
  2018-02-25 17:45 ` Sagi Grimberg
  1 sibling, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2018-02-25 16:18 UTC (permalink / raw)


On Sun, 2018-02-25 at 17:10 +0200, Alon Horev wrote:
> We think we can spot the root cause: nvme_rdma_error_recovery()
> handles the timeout in an asynchronous manner. It queues a task that
> reconnects the nvme device. Until that task is executed by the worker
> thread, the QP is open and an RDMA write can get through. Does this
> make sense?

I think it's fine that error recovery happens asynchronously. Other drivers
(e.g. ib_srp) also use this approach. However, it seems to me that
nvme_rdma_error_recovery_work() does not wait until ongoing RDMA transfers
have finished. I think that's something that needs to be addressed. Has it
been considered to insert an ib_drain_qp() call in that function?
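
A minimal sketch of what I mean, assuming the queue/ctrl field names
from drivers/nvme/host/rdma.c (the helper itself is invented for
illustration, not existing code):
--
/* drain every I/O queue pair before the requests are failed, so that
 * no flushed completion or remote write can arrive afterwards */
static void nvme_rdma_drain_io_queues(struct nvme_rdma_ctrl *ctrl)
{
        int i;

        for (i = 1; i < ctrl->ctrl.queue_count; i++)
                ib_drain_qp(ctrl->queues[i].qp);
}
--
ib_drain_qp() moves the QP into the error state and blocks until all
outstanding work requests have been flushed.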

Another concern is that I think that there is a TOCTOU race in
nvme_cancel_request(). Has it been considered to make that function call
blk_abort_request() instead of blk_mq_request_started() +
blk_mq_complete_request()?
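
For reference, the check-then-complete pattern I am referring to,
abridged from nvme_cancel_request() in drivers/nvme/host/core.c:
--
void nvme_cancel_request(struct request *req, void *data, bool reserved)
{
        if (!blk_mq_request_started(req))       /* check ... */
                return;

        nvme_req(req)->status = NVME_SC_ABORT_REQ;
        blk_mq_complete_request(req);           /* ... then act */
}
--
A normal completion can slip in between the check and the complete
call, while blk_abort_request() routes the abort through the block
layer timeout machinery, which serializes against regular completion.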

Thanks,

Bart.


* nvme-rdma corrupts memory upon timeout
  2018-02-25 15:10 nvme-rdma corrupts memory upon timeout Alon Horev
  2018-02-25 16:18 ` Bart Van Assche
@ 2018-02-25 17:45 ` Sagi Grimberg
  2018-02-25 18:14   ` Sagi Grimberg
  1 sibling, 1 reply; 10+ messages in thread
From: Sagi Grimberg @ 2018-02-25 17:45 UTC (permalink / raw)



> Hey,

Hi Alon, thanks for reporting!

> Some additional information: we use a keepalive and reconnect timeout
> of 1 second, on ConnectX-4 with OFED 4.1. I verified the code hasn't
> changed in the latest Linux sources.

So we obviously cannot help you with OFED (or anything else that is
not upstream, for that matter). This mailing list exists to develop
upstream Linux and has no control over any other code distribution.

For OFED issues the correct address for filing bug reports would be:
http://bugs.openfabrics.org/

For Mellanox OFED I believe you probably already have a support
channel...

Now as to your issue,

 > We're running nvmf over a large cluster using RDMA. Sometimes there is
 > congestion that causes the nvme host driver to time out (we use a
 > 4-second timeout).
 > Even though the host (initiator) times out and returns an error to
 > userspace, we can see the buffer being written after the I/O has
 > returned. This can obviously cause serious crashes and corruptions.
 > We suspect the same happens with writes but have yet to prove it.
 >
 > We think we can spot the root cause: nvme_rdma_error_recovery()
 > handles the timeout in an asynchronous manner. It queues a task that
 > reconnects the nvme device. Until that task is executed by the worker
 > thread, the QP is open and an RDMA write can get through. Does this
 > make sense?

Yes, it does. The problem is that when an I/O timeout kicks error
recovery, we don't make sure to either invalidate the rkey or drain
the RDMA queue pair (either would do).
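
For context, nvme_rdma_stop_queue() is what provides the drain; it
looks roughly like this (paraphrased from the driver, not verbatim):
--
static void nvme_rdma_stop_queue(struct nvme_rdma_queue *queue)
{
        if (!test_and_clear_bit(NVME_RDMA_Q_LIVE, &queue->flags))
                return;
        rdma_disconnect(queue->cm_id);  /* stop the target side */
        ib_drain_qp(queue->qp);         /* flush all posted work requests */
}
--
nvme_rdma_destroy_io_queues() ends up calling it, so the patch below
simply reorders things such that the disconnect+drain happens before
nvme_cancel_request() completes the requests.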

Does this patch help?
--
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 2ef761b5a26e..856ae9a7615a 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -956,15 +956,15 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)

         if (ctrl->ctrl.queue_count > 1) {
                 nvme_stop_queues(&ctrl->ctrl);
+               nvme_rdma_destroy_io_queues(ctrl, false);
                 blk_mq_tagset_busy_iter(&ctrl->tag_set,
                                         nvme_cancel_request, &ctrl->ctrl);
-               nvme_rdma_destroy_io_queues(ctrl, false);
         }

         blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
+       nvme_rdma_destroy_admin_queue(ctrl, false);
         blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
                                 nvme_cancel_request, &ctrl->ctrl);
-       nvme_rdma_destroy_admin_queue(ctrl, false);

         /*
          * queues are not a live anymore, so restart the queues to fail fast
@@ -1724,9 +1724,9 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)

         if (ctrl->ctrl.queue_count > 1) {
                 nvme_stop_queues(&ctrl->ctrl);
+               nvme_rdma_destroy_io_queues(ctrl, shutdown);
                 blk_mq_tagset_busy_iter(&ctrl->tag_set,
                                         nvme_cancel_request, &ctrl->ctrl);
-               nvme_rdma_destroy_io_queues(ctrl, shutdown);
         }

         if (shutdown)
@@ -1735,10 +1735,10 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
                 nvme_disable_ctrl(&ctrl->ctrl, ctrl->ctrl.cap);

         blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
+       nvme_rdma_destroy_admin_queue(ctrl, shutdown);
         blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
                                 nvme_cancel_request, &ctrl->ctrl);
         blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
-       nvme_rdma_destroy_admin_queue(ctrl, shutdown);
  }

  static void nvme_rdma_delete_ctrl(struct nvme_ctrl *ctrl)
--


* nvme-rdma corrupts memory upon timeout
  2018-02-25 17:45 ` Sagi Grimberg
@ 2018-02-25 18:14   ` Sagi Grimberg
  2018-02-26  7:53     ` Alon Horev
                       ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Sagi Grimberg @ 2018-02-25 18:14 UTC (permalink / raw)



> Does this patch help?
> -- 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 2ef761b5a26e..856ae9a7615a 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -956,15 +956,15 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
> 
>          if (ctrl->ctrl.queue_count > 1) {
>                  nvme_stop_queues(&ctrl->ctrl);
> +               nvme_rdma_destroy_io_queues(ctrl, false);
>                  blk_mq_tagset_busy_iter(&ctrl->tag_set,
>                                          nvme_cancel_request, &ctrl->ctrl);
> -               nvme_rdma_destroy_io_queues(ctrl, false);
>          }
> 
>          blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
> +       nvme_rdma_destroy_admin_queue(ctrl, false);
>          blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
>                                  nvme_cancel_request, &ctrl->ctrl);
> -       nvme_rdma_destroy_admin_queue(ctrl, false);
> 
>          /*
>           * queues are not a live anymore, so restart the queues to fail fast
> @@ -1724,9 +1724,9 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
> 
>          if (ctrl->ctrl.queue_count > 1) {
>                  nvme_stop_queues(&ctrl->ctrl);
> +               nvme_rdma_destroy_io_queues(ctrl, shutdown);
>                  blk_mq_tagset_busy_iter(&ctrl->tag_set,
>                                          nvme_cancel_request, &ctrl->ctrl);
> -               nvme_rdma_destroy_io_queues(ctrl, shutdown);
>          }
> 
>          if (shutdown)
> @@ -1735,10 +1735,10 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
>                  nvme_disable_ctrl(&ctrl->ctrl, ctrl->ctrl.cap);
> 
>          blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
> +       nvme_rdma_destroy_admin_queue(ctrl, shutdown);
>          blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
>                                  nvme_cancel_request, &ctrl->ctrl);
>          blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
> -       nvme_rdma_destroy_admin_queue(ctrl, shutdown);
>  }
> 
>  static void nvme_rdma_delete_ctrl(struct nvme_ctrl *ctrl)
> --

Or maybe this should do a better job:
--
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 4c32518a6c81..e45801fe78c1 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -956,12 +956,14 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)

         if (ctrl->ctrl.queue_count > 1) {
                 nvme_stop_queues(&ctrl->ctrl);
+               nvme_rdma_stop_io_queues(ctrl);
                 blk_mq_tagset_busy_iter(&ctrl->tag_set,
                                         nvme_cancel_request, &ctrl->ctrl);
                 nvme_rdma_destroy_io_queues(ctrl, false);
         }

         blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
+       nvme_rdma_stop_queue(&ctrl->queues[0]);
         blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
                                 nvme_cancel_request, &ctrl->ctrl);
         nvme_rdma_destroy_admin_queue(ctrl, false);
@@ -1729,9 +1731,12 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)

         if (ctrl->ctrl.queue_count > 1) {
                 nvme_stop_queues(&ctrl->ctrl);
+               nvme_rdma_stop_io_queues(ctrl);
                 blk_mq_tagset_busy_iter(&ctrl->tag_set,
                                         nvme_cancel_request, &ctrl->ctrl);
                 nvme_rdma_destroy_io_queues(ctrl, shutdown);
+               if (shutdown)
+                       nvme_start_queues(&ctrl->ctrl);
         }

         if (shutdown)
@@ -1740,10 +1745,11 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
                 nvme_disable_ctrl(&ctrl->ctrl, ctrl->ctrl.cap);

         blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
+       nvme_rdma_stop_queue(&ctrl->queues[0]);
         blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
                                 nvme_cancel_request, &ctrl->ctrl);
-       blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
         nvme_rdma_destroy_admin_queue(ctrl, shutdown);
+       blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
  }

  static void nvme_rdma_delete_ctrl(struct nvme_ctrl *ctrl)
--


* nvme-rdma corrupts memory upon timeout
  2018-02-25 18:14   ` Sagi Grimberg
@ 2018-02-26  7:53     ` Alon Horev
  2018-02-26 12:37       ` Sagi Grimberg
  2018-02-26 18:50     ` Bart Van Assche
  2018-02-26 22:20     ` Bart Van Assche
  2 siblings, 1 reply; 10+ messages in thread
From: Alon Horev @ 2018-02-26  7:53 UTC (permalink / raw)


This patch still returns to userspace after queuing work and may
result in corruption. Maybe we can flush the work queue after a
timeout?
Just to put things in perspective, we have around 100 subsystems
connected on a single host. This means the time frame between queuing
and execution may be larger than usual.
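
To make the flush idea concrete, something along these lines (a sketch
only; blocking in the timeout handler may have problems of its own):
--
static enum blk_eh_timer_return
nvme_rdma_timeout(struct request *rq, bool reserved)
{
        struct nvme_rdma_request *req = blk_mq_rq_to_pdu(rq);
        struct nvme_rdma_ctrl *ctrl = req->queue->ctrl;

        nvme_rdma_error_recovery(ctrl);
        /* don't complete rq before the teardown has actually run */
        flush_work(&ctrl->err_work);

        nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
        return BLK_EH_HANDLED;
}
--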

Thanks, Alon

On Sun, Feb 25, 2018 at 8:14 PM, Sagi Grimberg <sagi@grimberg.me> wrote:
>
>> Does this patch help?
>> --
>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>> index 2ef761b5a26e..856ae9a7615a 100644
>> --- a/drivers/nvme/host/rdma.c
>> +++ b/drivers/nvme/host/rdma.c
>> @@ -956,15 +956,15 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
>>
>>          if (ctrl->ctrl.queue_count > 1) {
>>                  nvme_stop_queues(&ctrl->ctrl);
>> +               nvme_rdma_destroy_io_queues(ctrl, false);
>>                  blk_mq_tagset_busy_iter(&ctrl->tag_set,
>>                                          nvme_cancel_request, &ctrl->ctrl);
>> -               nvme_rdma_destroy_io_queues(ctrl, false);
>>          }
>>
>>          blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
>> +       nvme_rdma_destroy_admin_queue(ctrl, false);
>>          blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
>>                                  nvme_cancel_request, &ctrl->ctrl);
>> -       nvme_rdma_destroy_admin_queue(ctrl, false);
>>
>>          /*
>>           * queues are not a live anymore, so restart the queues to fail fast
>> @@ -1724,9 +1724,9 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
>>
>>          if (ctrl->ctrl.queue_count > 1) {
>>                  nvme_stop_queues(&ctrl->ctrl);
>> +               nvme_rdma_destroy_io_queues(ctrl, shutdown);
>>                  blk_mq_tagset_busy_iter(&ctrl->tag_set,
>>                                          nvme_cancel_request, &ctrl->ctrl);
>> -               nvme_rdma_destroy_io_queues(ctrl, shutdown);
>>          }
>>
>>          if (shutdown)
>> @@ -1735,10 +1735,10 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
>>                  nvme_disable_ctrl(&ctrl->ctrl, ctrl->ctrl.cap);
>>
>>          blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
>> +       nvme_rdma_destroy_admin_queue(ctrl, shutdown);
>>          blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
>>                                  nvme_cancel_request, &ctrl->ctrl);
>>          blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
>> -       nvme_rdma_destroy_admin_queue(ctrl, shutdown);
>>   }
>>
>>   static void nvme_rdma_delete_ctrl(struct nvme_ctrl *ctrl)
>> --
>
>
> Or maybe this should do a better job:
> --
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 4c32518a6c81..e45801fe78c1 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -956,12 +956,14 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
>
>         if (ctrl->ctrl.queue_count > 1) {
>                 nvme_stop_queues(&ctrl->ctrl);
> +               nvme_rdma_stop_io_queues(ctrl);
>                 blk_mq_tagset_busy_iter(&ctrl->tag_set,
>                                         nvme_cancel_request, &ctrl->ctrl);
>                 nvme_rdma_destroy_io_queues(ctrl, false);
>         }
>
>         blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
> +       nvme_rdma_stop_queue(&ctrl->queues[0]);
>         blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
>                                 nvme_cancel_request, &ctrl->ctrl);
>         nvme_rdma_destroy_admin_queue(ctrl, false);
> @@ -1729,9 +1731,12 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
>
>         if (ctrl->ctrl.queue_count > 1) {
>                 nvme_stop_queues(&ctrl->ctrl);
> +               nvme_rdma_stop_io_queues(ctrl);
>                 blk_mq_tagset_busy_iter(&ctrl->tag_set,
>                                         nvme_cancel_request, &ctrl->ctrl);
>                 nvme_rdma_destroy_io_queues(ctrl, shutdown);
> +               if (shutdown)
> +                       nvme_start_queues(&ctrl->ctrl);
>         }
>
>         if (shutdown)
> @@ -1740,10 +1745,11 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
>                 nvme_disable_ctrl(&ctrl->ctrl, ctrl->ctrl.cap);
>
>         blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
> +       nvme_rdma_stop_queue(&ctrl->queues[0]);
>         blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
>                                 nvme_cancel_request, &ctrl->ctrl);
> -       blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
>         nvme_rdma_destroy_admin_queue(ctrl, shutdown);
> +       blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
>  }
>
>  static void nvme_rdma_delete_ctrl(struct nvme_ctrl *ctrl)
> --



-- 
Alon Horev
+972-524-517-627


* nvme-rdma corrupts memory upon timeout
  2018-02-26  7:53     ` Alon Horev
@ 2018-02-26 12:37       ` Sagi Grimberg
  2018-03-01  9:12         ` Sagi Grimberg
  0 siblings, 1 reply; 10+ messages in thread
From: Sagi Grimberg @ 2018-02-26 12:37 UTC (permalink / raw)



> This patch still returns to userspace after queuing work and may
> result in corruption.

That's probably the one request that timed out, as we complete it
early (before the teardown runs).

Does this patch on top help?
--
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index e793b0899d4e..50bbf88b82f6 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1590,10 +1590,7 @@ nvme_rdma_timeout(struct request *rq, bool reserved)
         /* queue error recovery */
         nvme_rdma_error_recovery(req->queue->ctrl);

-       /* fail with DNR on cmd timeout */
-       nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
-
-       return BLK_EH_HANDLED;
+       return BLK_EH_RESET_TIMER;
  }

  /*
--
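
For reference, the timeout verdicts of this era (from
include/linux/blkdev.h):
--
enum blk_eh_timer_return {
        BLK_EH_NOT_HANDLED,     /* block layer completes the request */
        BLK_EH_HANDLED,         /* driver already completed the request */
        BLK_EH_RESET_TIMER,     /* re-arm the timer, rq stays in flight */
};
--
Returning BLK_EH_RESET_TIMER keeps the request in flight until error
recovery cancels it after the queues are torn down, instead of
completing it while the QP may still be live.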


* nvme-rdma corrupts memory upon timeout
  2018-02-25 18:14   ` Sagi Grimberg
  2018-02-26  7:53     ` Alon Horev
@ 2018-02-26 18:50     ` Bart Van Assche
  2018-02-26 22:20     ` Bart Van Assche
  2 siblings, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2018-02-26 18:50 UTC (permalink / raw)


On 02/25/18 10:14, Sagi Grimberg wrote:
> Or maybe this should do a better job:
> -- 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 4c32518a6c81..e45801fe78c1 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -956,12 +956,14 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
> 
>          if (ctrl->ctrl.queue_count > 1) {
>                  nvme_stop_queues(&ctrl->ctrl);
> +               nvme_rdma_stop_io_queues(ctrl);
>                  blk_mq_tagset_busy_iter(&ctrl->tag_set,
>                                          nvme_cancel_request, &ctrl->ctrl);
>                  nvme_rdma_destroy_io_queues(ctrl, false);
>          }
> 
>          blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
> +       nvme_rdma_stop_queue(&ctrl->queues[0]);
>          blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
>                                  nvme_cancel_request, &ctrl->ctrl);
>          nvme_rdma_destroy_admin_queue(ctrl, false);
> @@ -1729,9 +1731,12 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
> 
>          if (ctrl->ctrl.queue_count > 1) {
>                  nvme_stop_queues(&ctrl->ctrl);
> +               nvme_rdma_stop_io_queues(ctrl);
>                  blk_mq_tagset_busy_iter(&ctrl->tag_set,
>                                          nvme_cancel_request, &ctrl->ctrl);
>                  nvme_rdma_destroy_io_queues(ctrl, shutdown);
> +               if (shutdown)
> +                       nvme_start_queues(&ctrl->ctrl);
>          }
> 
>          if (shutdown)
> @@ -1740,10 +1745,11 @@ static void nvme_rdma_shutdown_ctrl(struct nvme_rdma_ctrl *ctrl, bool shutdown)
>                  nvme_disable_ctrl(&ctrl->ctrl, ctrl->ctrl.cap);
> 
>          blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
> +       nvme_rdma_stop_queue(&ctrl->queues[0]);
>          blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
>                                  nvme_cancel_request, &ctrl->ctrl);
> -       blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
>          nvme_rdma_destroy_admin_queue(ctrl, shutdown);
> +       blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
>  }
> 
>  static void nvme_rdma_delete_ctrl(struct nvme_ctrl *ctrl)

Hello Sagi,

Can you resend this patch using e-mail software that does not corrupt a 
patch? All tabs in your patch have been changed into 8 high-ASCII spaces 
(ASCII code 160).

Thanks,

Bart.


* nvme-rdma corrupts memory upon timeout
  2018-02-25 18:14   ` Sagi Grimberg
  2018-02-26  7:53     ` Alon Horev
  2018-02-26 18:50     ` Bart Van Assche
@ 2018-02-26 22:20     ` Bart Van Assche
  2 siblings, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2018-02-26 22:20 UTC (permalink / raw)


On 02/25/18 10:14, Sagi Grimberg wrote:
> Or maybe this should do a better job:
> -- 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 4c32518a6c81..e45801fe78c1 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -956,12 +956,14 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
> 
>          if (ctrl->ctrl.queue_count > 1) {
>                  nvme_stop_queues(&ctrl->ctrl);
> +               nvme_rdma_stop_io_queues(ctrl);
>                  blk_mq_tagset_busy_iter(&ctrl->tag_set,
>                                          nvme_cancel_request, &ctrl->ctrl);
>                  nvme_rdma_destroy_io_queues(ctrl, false);
>          }
> 
>          blk_mq_quiesce_queue(ctrl->ctrl.admin_q);
> +       nvme_rdma_stop_queue(&ctrl->queues[0]);
>          blk_mq_tagset_busy_iter(&ctrl->admin_tag_set,
>                                  nvme_cancel_request, &ctrl->ctrl);

Hello Sagi,

With this change applied, I think what nvme_rdma_error_recovery_work() 
does for I/O and admin queues is as follows:
- Call blk_mq_quiesce_queue() to wait until concurrent .queue_rq() calls
   have finished.
- Call nvme_rdma_stop_io_queues() to change the QP state into "error"
   and to wait until all RDMA completions have been processed.
- Call blk_mq_tagset_busy_iter() to cancel any pending block layer
   requests.
- Call blk_mq_unfreeze_queue() to resume the request queues.
- Call nvme_rdma_reconnect_or_remove().

The above patch seems like an improvement to me but I don't think that 
it fixes the race between nvme_cancel_request() and 
nvme_rdma_queue_rq(). Has it been considered to modify 
nvme_rdma_error_recovery_work() as follows:
* Clear NVME_RDMA_Q_LIVE.
* Change the RDMA QP state into "error".
* Freeze and unfreeze the block layer request queue. The freeze will
   wait until all pending requests have finished.
* Call nvme_rdma_reconnect_or_remove().

That last sequence, sketched below, has the following advantages:
* Draining the RDMA QP explicitly is no longer necessary.
* It fixes the race with nvme_cancel_request() by not calling
   nvme_cancel_request().
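
A rough sketch of that sequence (helper and field names follow
drivers/nvme/host/rdma.c, but treat this as illustration, not a tested
patch):
--
static void nvme_rdma_error_recovery_work(struct work_struct *work)
{
        struct nvme_rdma_ctrl *ctrl = container_of(work,
                        struct nvme_rdma_ctrl, err_work);
        int i;

        for (i = 0; i < ctrl->ctrl.queue_count; i++) {
                struct nvme_rdma_queue *queue = &ctrl->queues[i];

                clear_bit(NVME_RDMA_Q_LIVE, &queue->flags);
                ib_drain_qp(queue->qp); /* QP -> error, WRs flushed */
        }

        /* the freeze waits until all pending requests have finished */
        nvme_start_freeze(&ctrl->ctrl);
        nvme_wait_freeze(&ctrl->ctrl);
        nvme_unfreeze(&ctrl->ctrl);

        nvme_rdma_reconnect_or_remove(ctrl);
}
--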

Thanks,

Bart.


* nvme-rdma corrupts memory upon timeout
  2018-02-26 12:37       ` Sagi Grimberg
@ 2018-03-01  9:12         ` Sagi Grimberg
  2018-03-29 13:07           ` Max Gurtovoy
  0 siblings, 1 reply; 10+ messages in thread
From: Sagi Grimberg @ 2018-03-01  9:12 UTC (permalink / raw)



>> This patch still returns to userspace after queuing work and may
>> result in corruption.
> 
> That's probably that one request being timed out as we complete
> it earlier.
> 
> Does this patch on top help?
> -- 
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index e793b0899d4e..50bbf88b82f6 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -1590,10 +1590,7 @@ nvme_rdma_timeout(struct request *rq, bool reserved)
>          /* queue error recovery */
>          nvme_rdma_error_recovery(req->queue->ctrl);
> 
> -       /* fail with DNR on cmd timeout */
> -       nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
> -
> -       return BLK_EH_HANDLED;
> +       return BLK_EH_RESET_TIMER;
>  }
> 
>  /*
> -- 

Did this help?


* nvme-rdma corrupts memory upon timeout
  2018-03-01  9:12         ` Sagi Grimberg
@ 2018-03-29 13:07           ` Max Gurtovoy
  0 siblings, 0 replies; 10+ messages in thread
From: Max Gurtovoy @ 2018-03-29 13:07 UTC (permalink / raw)




On 3/1/2018 11:12 AM, Sagi Grimberg wrote:
> 
>>> This patch still returns to userspace after queuing work and may
>>> result in corruption.
>>
>> That's probably that one request being timed out as we complete
>> it earlier.
>>
>> Does this patch on top help?
>> -- 
>> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
>> index e793b0899d4e..50bbf88b82f6 100644
>> --- a/drivers/nvme/host/rdma.c
>> +++ b/drivers/nvme/host/rdma.c
>> @@ -1590,10 +1590,7 @@ nvme_rdma_timeout(struct request *rq, bool reserved)
>>          /* queue error recovery */
>>          nvme_rdma_error_recovery(req->queue->ctrl);
>>
>> -       /* fail with DNR on cmd timeout */
>> -       nvme_req(rq)->status = NVME_SC_ABORT_REQ | NVME_SC_DNR;
>> -
>> -       return BLK_EH_HANDLED;
>> +       return BLK_EH_RESET_TIMER;
>>  }
>>
>>  /*
>> -- 
> 
> Did this help?

It helped in our setup. Can we push this?
Regarding stopping/draining the QP before calling
blk_mq_tagset_busy_iter(): it seems right to me too, since we want to
stop getting completions from the HCA before we cancel all the
inflight requests.

We are completing the request from various contexts and can get a NULL
dereference in nvme_rdma_process_nvme_rsp() (in req->mr). We added
debug prints to check whether the rq was in flight during
nvme_rdma_process_nvme_rsp(), and it wasn't (another context had
already completed the request).
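
In other words, two contexts race on the same request (a sketch of the
pattern, not exact code):
--
/*
 * context 1: HCA completion processing
 *   nvme_rdma_recv_done()
 *     -> nvme_rdma_process_nvme_rsp()
 *          dereferences req->mr          <-- NULL/freed by now ...
 *
 * context 2: error recovery
 *   blk_mq_tagset_busy_iter(.., nvme_cancel_request, ..)
 *     -> blk_mq_complete_request(rq)     <-- ... because it completed here
 */
--
Stopping/draining the QP first guarantees context 1 can no longer run
by the time context 2 iterates over the tag set.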

-Max.


