linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] IB/rxe: Don't clamp residual length to mtu
@ 2017-04-06 12:49 Johannes Thumshirn
  2017-04-13 12:00 ` Johannes Thumshirn
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Johannes Thumshirn @ 2017-04-06 12:49 UTC (permalink / raw)
  To: Moni Shoua, Doug Ledford, Sean Hefty, Hal Rosenstock
  Cc: Linux Kernel Mailinglist, linux-rdma, Johannes Thumshirn,
	Hannes Reinecke, Sagi Grimberg, Max Gurtovoy

When reading a RDMA WRITE FIRST packet we copy the DMA length from the RDMA
header into the qp->resp.resid variable for later use. Later in check_rkey()
we clamp it to the MTU if the packet is an  RDMA WRITE packet and has a
residual length bigger than the MTU. Later in write_data_in() we subtract the
payload of the packet from the residual length. If the packet happens to have a
payload of exactly the MTU size we end up with a residual length of 0 despite
the packet not being the last in the conversation. When the next packet in the
conversation arrives, we don't have any residual length left and thus set the QP
into an error state.

This broke NVMe over Fabrics functionality over rdma_rxe.ko

The patch was verified using the following test.

 # echo eth0 > /sys/module/rdma_rxe/parameters/add
 # nvme connect -t rdma -a 192.168.155.101 -s 1023 -n nvmf-test
 # mkfs.xfs -fK /dev/nvme0n1
 meta-data=/dev/nvme0n1           isize=256    agcount=4, agsize=65536 blks
          =                       sectsz=4096  attr=2, projid32bit=1
          =                       crc=0        finobt=0, sparse=0
 data     =                       bsize=4096   blocks=262144, imaxpct=25
          =                       sunit=0      swidth=0 blks
 naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
 log      =internal log           bsize=4096   blocks=2560, version=2
          =                       sectsz=4096  sunit=1 blks, lazy-count=1
 realtime =none                   extsz=4096   blocks=0, rtextents=0
 # mount /dev/nvme0n1 /tmp/
 [  148.923263] XFS (nvme0n1): Mounting V4 Filesystem
 [  148.961196] XFS (nvme0n1): Ending clean mount
 # dd if=/dev/urandom of=test.bin bs=1M count=128
 128+0 records in
 128+0 records out
 134217728 bytes (134 MB, 128 MiB) copied, 0.437991 s, 306 MB/s
 # sha256sum test.bin
 cde42941f045efa8c4f0f157ab6f29741753cdd8d1cff93a6b03649d83c4129a  test.bin
 # cp test.bin /tmp/
 sha256sum /tmp/test.bin
 cde42941f045efa8c4f0f157ab6f29741753cdd8d1cff93a6b03649d83c4129a  /tmp/test.bin

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/infiniband/sw/rxe/rxe_resp.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index c9dd385..58764df 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -478,8 +478,6 @@ static enum resp_states check_rkey(struct rxe_qp *qp,
 				state = RESPST_ERR_LENGTH;
 				goto err;
 			}
-
-			qp->resp.resid = mtu;
 		} else {
 			if (pktlen != resid) {
 				state = RESPST_ERR_LENGTH;
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-05-01 18:44 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-06 12:49 [PATCH] IB/rxe: Don't clamp residual length to mtu Johannes Thumshirn
2017-04-13 12:00 ` Johannes Thumshirn
2017-04-13 12:22   ` Leon Romanovsky
2017-04-13 12:29     ` Johannes Thumshirn
2017-04-13 14:12 ` Moni Shoua
2017-04-25  7:29 ` Johannes Thumshirn
2017-05-01 18:44   ` Doug Ledford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).