* RQ overflow seen running isert traffic @ 2016-09-27 7:01 Potnuri Bharat Teja [not found] ` <20160927070157.GA13140-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org> 2016-10-05 6:14 ` Sagi Grimberg 0 siblings, 2 replies; 24+ messages in thread From: Potnuri Bharat Teja @ 2016-09-27 7:01 UTC (permalink / raw) To: sagi; +Cc: swise, target-devel, nab, linux-rdma Hi Sagi, I've been trying to understand the isert functionality with respect to RDMA Receive Queue sizing and Queue full handling. Here is the problem is see with iw_cxgb4: After running few minutes of iSER traffic with iw_cxgb4, I am seeing post receive failures due to receive queue full returning -ENOMEM. In case of iw_cxgb4 the RQ size is 130 with qp attribute max_recv_wr = 129, passed down by isert to iw_cxgb4.isert decides on max_recv_wr as 129 based on (ISERT_QP_MAX_RECV_DTOS = ISCSI_DEF_XMIT_CMDS_MAX = 128) + 1. My debug suggests that at some point isert tries to post more than 129 receive WRs into the RQ and fails as the queue is full already. From the code most of the recv wr are posted only after a recieve completion, but few datain operations(isert_put_datain()) are done independent of receive completions. In fact the last WR failed to post in to RQ is from isert_put_datain() through target_complete_ok_work(). CQ stats at the time of failure shows the cq polled to empty. How could I check for the possible conditions leading to this issue? I am happy to provide you any details further. Thanks for your time. Bharat. ^ permalink raw reply [flat|nested] 24+ messages in thread
[parent not found: <20160927070157.GA13140-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>]
* RE: RQ overflow seen running isert traffic [not found] ` <20160927070157.GA13140-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org> @ 2016-09-29 14:12 ` Steve Wise 0 siblings, 0 replies; 24+ messages in thread From: Steve Wise @ 2016-09-29 14:12 UTC (permalink / raw) To: 'Potnuri Bharat Teja', sagi-NQWnxTmZq1alnMjI0IkVqw Cc: target-devel-u79uwXL29TY76Z2rM5mHXA, nab-IzHhD5pYlfBP7FQvKIMDCQ, linux-rdma-u79uwXL29TY76Z2rM5mHXA > > Hi Sagi, > I've been trying to understand the isert functionality with respect to > RDMA Receive Queue sizing and Queue full handling. Here is the problem > is see with iw_cxgb4: > > After running few minutes of iSER traffic with iw_cxgb4, I am seeing > post receive failures due to receive queue full returning -ENOMEM. > In case of iw_cxgb4 the RQ size is 130 with qp attribute max_recv_wr = 129, > passed down by isert to iw_cxgb4.isert decides on max_recv_wr as 129 based > on (ISERT_QP_MAX_RECV_DTOS = ISCSI_DEF_XMIT_CMDS_MAX = 128) + 1. > Just to clarify: The RQ size of 130 in iw_cxgb4 has to do with how the the SW and HW share the queue state. It will always be at least 1 more than the application's size. So the RQ depth is 129, and a post_recv() is done when the RQ is full, which should never happen... > My debug suggests that at some point isert tries to post more than > 129 receive WRs into the RQ and fails as the queue is full already. From > the code most of the recv wr are posted only after a recieve completion, > but few datain operations(isert_put_datain()) are done independent of > receive completions. In fact the last WR failed to post in to RQ is from > isert_put_datain() through target_complete_ok_work(). CQ stats at the > time of failure shows the cq polled to empty. > > How could I check for the possible conditions leading to this issue? > I am happy to provide you any details further. > Thanks for your time. > > Bharat. > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: RQ overflow seen running isert traffic 2016-09-27 7:01 RQ overflow seen running isert traffic Potnuri Bharat Teja [not found] ` <20160927070157.GA13140-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org> @ 2016-10-05 6:14 ` Sagi Grimberg 2016-10-17 11:16 ` Potnuri Bharat Teja 1 sibling, 1 reply; 24+ messages in thread From: Sagi Grimberg @ 2016-10-05 6:14 UTC (permalink / raw) To: Potnuri Bharat Teja; +Cc: swise, target-devel, nab, linux-rdma > Hi Sagi, Hey Baharat, Sorry for the late response, its the holiday season in Israel... > I've been trying to understand the isert functionality with respect to > RDMA Receive Queue sizing and Queue full handling. Here is the problem > is see with iw_cxgb4: > > After running few minutes of iSER traffic with iw_cxgb4, I am seeing > post receive failures due to receive queue full returning -ENOMEM. > In case of iw_cxgb4 the RQ size is 130 with qp attribute max_recv_wr = 129, > passed down by isert to iw_cxgb4.isert decides on max_recv_wr as 129 based > on (ISERT_QP_MAX_RECV_DTOS = ISCSI_DEF_XMIT_CMDS_MAX = 128) + 1. That's correct. > > My debug suggests that at some point isert tries to post more than > 129 receive WRs into the RQ and fails as the queue is full already. From > the code most of the recv wr are posted only after a recieve completion, > but few datain operations(isert_put_datain()) are done independent of > receive completions. Interesting. I suspect that this issue haven't come up is that the devices I used to test with allocate the send/recv queues in the next power of 2 (which would be 256) which was enough to hide this I guess... We repost the recv buffer under the following conditions: 1. We are queueing data + response (datain) or just response (dataout) and we are done with the recv buffer. 2. We got a unsolicited dataout. Can you please turn off unsolicited dataouts and see if this still happen? (InitialR2T=Yes) > In fact the last WR failed to post in to RQ is from > isert_put_datain() through target_complete_ok_work(). CQ stats at the > time of failure shows the cq polled to empty. That is strange, each scsi command should trigger iscsit_queue_data_in just once. Can you provide evidence of a command that triggers it more than once? Another possible reason is that we somehow get to put_data_in and put_response for the same command (which we should never do because we handle the response in put_data_in). Thanks for reporting. Sagi. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: RQ overflow seen running isert traffic 2016-10-05 6:14 ` Sagi Grimberg @ 2016-10-17 11:16 ` Potnuri Bharat Teja 2016-10-17 18:29 ` Steve Wise 0 siblings, 1 reply; 24+ messages in thread From: Potnuri Bharat Teja @ 2016-10-17 11:16 UTC (permalink / raw) To: Sagi Grimberg; +Cc: SWise OGC, target-devel, nab, linux-rdma On Wednesday, October 10/05/16, 2016 at 11:44:12 +0530, Sagi Grimberg wrote: > > > Hi Sagi, > > Hey Baharat, > > Sorry for the late response, its the holiday > season in Israel... > > > I've been trying to understand the isert functionality with respect to > > RDMA Receive Queue sizing and Queue full handling. Here is the problem > > is see with iw_cxgb4: > > > > After running few minutes of iSER traffic with iw_cxgb4, I am seeing > > post receive failures due to receive queue full returning -ENOMEM. > > In case of iw_cxgb4 the RQ size is 130 with qp attribute max_recv_wr = 129, > > passed down by isert to iw_cxgb4.isert decides on max_recv_wr as 129 based > > on (ISERT_QP_MAX_RECV_DTOS = ISCSI_DEF_XMIT_CMDS_MAX = 128) + 1. > > That's correct. Hi Sagi, Sorry for the late reply, I had to recheck my findings before I reply you back. My interpretation of the queue full issue was not complete, got carried away by the recieve queue and missed the SQ failure due to full among the debug logs. Here is what was happening: It was the SQ full first and fails to get posted with ENOMEM, due to this the command is queued to queue full list this will schedule it to post at later instance and this repeated try for posting will cause the rq to get full. For everyfurther try sq post will any way fail and an extra rq wr is posted as a part of datain() leading it to be full too. This happened a bit earlier in my case, since I corrected the the iser_put_datain() to return error to LIO. Here is the failure log on a fresh 4.8 kernel: isert: isert_rdma_rw_ctx_post: Cmd: ffff882ec8c96e60 failed to post RDMA res <===here is the post send failure due to ENOMEM. ABORT_TASK: Found referenced iSCSI task_tag: 33 ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 33 isert: isert_post_recv: ib_post_recv() failed with ret: -22 isert: isert_post_response: ib_post_recv failed with -22 isert: isert_post_recv: ib_post_recv() failed with ret: -22 isert: isert_post_response: ib_post_recv failed with -22 isert: isert_post_recv: ib_post_recv() failed with ret: -22 isert: isert_post_response: ib_post_recv failed with -22 isert: isert_post_recv: ib_post_recv() failed with ret: -22 isert: isert_post_response: ib_post_recv failed with -22 isert: isert_post_recv: ib_post_recv() failed with ret: -22 isert: isert_post_response: ib_post_recv failed with -22 cxgb4 0000:84:00.4: AE qpid 1026 opcode 10 status 0x1 type 1 len 0x0 wrid.hi 0x0 wrid.lo 0x4c01 isert: isert_qp_event_callback: QP access error (3): conn ffff8817ddc46000 iSCSI Login timeout on Network Portal 10.40.40.198:3260 INFO: task iscsi_np:14744 blocked for more than 120 seconds. Here are the QP and CQ stats before my assert for queue full: qhp->attr.sq_num_entries 523 qhp->attr.rq_num_entries 129 qhp->wq.sq.qid 1026 qhp->wq.rq.qid 1027 qhp->wq.sq.in_use 523 <=== SQ to the brim qhp->wq.sq.size 524 <=== SQ size qhp->wq.sq.cidx 391 qhp->wq.sq.pidx 390 qhp->wq.sq.wq_pidx 202 qhp->wq.sq.wq_pidx_inc 0 qhp->wq.sq.flush_cidx 391 qhp->wq.rq.in_use 128 qhp->wq.rq.size 130 qhp->wq.rq.cidx 112 qhp->wq.rq.pidx 110 qhp->wq.rq.wq_pidx 240 qhp->wq.rq.wq_pidx_inc 0 qhp->wq.flushed 0 chp->cq.cqid 1024 chp->cq.size 6335 chp->cq.cidx 4978 chp->cq.sw_cidx 4126 chp->cq.sw_pidx 4126 chp->cq.sw_in_use 0 chp->cq.cidx_inc 0 As an experiment I tried increasing the ISCSI_DEF_XMIT_CMDS_MAX to 256 instead of 128, which incase of iwarp creates SQ with size 1548, the issue is not seen. I doubt if the SQ is not sized properly incase of iWARP Or factored incorrectly in the rdma api for iser IOP. I am digging for root cause. Thanks for your time, Bharat. > > > > > My debug suggests that at some point isert tries to post more than > > 129 receive WRs into the RQ and fails as the queue is full already. From > > the code most of the recv wr are posted only after a recieve completion, > > but few datain operations(isert_put_datain()) are done independent of > > receive completions. > > Interesting. I suspect that this issue haven't come up is that > the devices I used to test with allocate the send/recv queues in > the next power of 2 (which would be 256) which was enough to hide > this I guess... > > We repost the recv buffer under the following conditions: > 1. We are queueing data + response (datain) or just response (dataout) > and we are done with the recv buffer. > 2. We got a unsolicited dataout. > > Can you please turn off unsolicited dataouts and see if this > still happen? (InitialR2T=Yes) > > > In fact the last WR failed to post in to RQ is from > > isert_put_datain() through target_complete_ok_work(). CQ stats at the > > time of failure shows the cq polled to empty. > > That is strange, each scsi command should trigger iscsit_queue_data_in > just once. Can you provide evidence of a command that triggers it more > than once? > > Another possible reason is that we somehow get to put_data_in and > put_response for the same command (which we should never do because > we handle the response in put_data_in). > > Thanks for reporting. > Sagi. ^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: RQ overflow seen running isert traffic 2016-10-17 11:16 ` Potnuri Bharat Teja @ 2016-10-17 18:29 ` Steve Wise 2016-10-18 8:04 ` Sagi Grimberg 0 siblings, 1 reply; 24+ messages in thread From: Steve Wise @ 2016-10-17 18:29 UTC (permalink / raw) To: 'Potnuri Bharat Teja', 'Sagi Grimberg' Cc: target-devel, nab, linux-rdma > On Wednesday, October 10/05/16, 2016 at 11:44:12 +0530, Sagi Grimberg wrote: > > > > > Hi Sagi, > > > > Hey Baharat, > > > > Sorry for the late response, its the holiday > > season in Israel... > > > > > I've been trying to understand the isert functionality with respect to > > > RDMA Receive Queue sizing and Queue full handling. Here is the problem > > > is see with iw_cxgb4: > > > > > > After running few minutes of iSER traffic with iw_cxgb4, I am seeing > > > post receive failures due to receive queue full returning -ENOMEM. > > > In case of iw_cxgb4 the RQ size is 130 with qp attribute max_recv_wr = 129, > > > passed down by isert to iw_cxgb4.isert decides on max_recv_wr as 129 based > > > on (ISERT_QP_MAX_RECV_DTOS = ISCSI_DEF_XMIT_CMDS_MAX = 128) + 1. > > > > That's correct. > > Hi Sagi, > Sorry for the late reply, I had to recheck my findings before I reply > you back. > > My interpretation of the queue full issue was not complete, got carried > away by the recieve queue and missed the SQ failure due to full among > the debug logs. > > Here is what was happening: > It was the SQ full first and fails to get posted with ENOMEM, due to this > the command is queued to queue full list this will schedule it to post at > later instance and this repeated try for posting will cause the rq to get > full. For everyfurther try sq post will any way fail and an extra rq wr is > posted as a part of datain() leading it to be full too. > > This happened a bit earlier in my case, since I corrected the the > iser_put_datain() to return error to LIO. > > Here is the failure log on a fresh 4.8 kernel: > isert: isert_rdma_rw_ctx_post: Cmd: ffff882ec8c96e60 failed to post RDMA > res <===here is the post send failure due to ENOMEM. > ABORT_TASK: Found referenced iSCSI task_tag: 33 > ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 33 > isert: isert_post_recv: ib_post_recv() failed with ret: -22 > isert: isert_post_response: ib_post_recv failed with -22 > isert: isert_post_recv: ib_post_recv() failed with ret: -22 > isert: isert_post_response: ib_post_recv failed with -22 > isert: isert_post_recv: ib_post_recv() failed with ret: -22 > isert: isert_post_response: ib_post_recv failed with -22 > isert: isert_post_recv: ib_post_recv() failed with ret: -22 > isert: isert_post_response: ib_post_recv failed with -22 > isert: isert_post_recv: ib_post_recv() failed with ret: -22 > isert: isert_post_response: ib_post_recv failed with -22 > cxgb4 0000:84:00.4: AE qpid 1026 opcode 10 status 0x1 type 1 len 0x0 > wrid.hi 0x0 wrid.lo 0x4c01 > isert: isert_qp_event_callback: QP access error (3): conn > ffff8817ddc46000 > iSCSI Login timeout on Network Portal 10.40.40.198:3260 > INFO: task iscsi_np:14744 blocked for more than 120 seconds. > > Here are the QP and CQ stats before my assert for queue full: > qhp->attr.sq_num_entries 523 > qhp->attr.rq_num_entries 129 > qhp->wq.sq.qid 1026 > qhp->wq.rq.qid 1027 > qhp->wq.sq.in_use 523 <=== SQ to the brim > qhp->wq.sq.size 524 <=== SQ size > qhp->wq.sq.cidx 391 > qhp->wq.sq.pidx 390 > qhp->wq.sq.wq_pidx 202 > qhp->wq.sq.wq_pidx_inc 0 > qhp->wq.sq.flush_cidx 391 > qhp->wq.rq.in_use 128 > qhp->wq.rq.size 130 > qhp->wq.rq.cidx 112 > qhp->wq.rq.pidx 110 > qhp->wq.rq.wq_pidx 240 > qhp->wq.rq.wq_pidx_inc 0 > qhp->wq.flushed 0 > chp->cq.cqid 1024 > chp->cq.size 6335 > chp->cq.cidx 4978 > chp->cq.sw_cidx 4126 > chp->cq.sw_pidx 4126 > chp->cq.sw_in_use 0 > chp->cq.cidx_inc 0 > > As an experiment I tried increasing the ISCSI_DEF_XMIT_CMDS_MAX to 256 > instead of 128, which incase of iwarp creates SQ with size 1548, the > issue is not seen. > I doubt if the SQ is not sized properly incase of iWARP Or factored > incorrectly in the rdma api for iser IOP. > > I am digging for root cause. > > Thanks for your time, > Bharat. > Hey Sagi, I'm looking at isert_create_qp() and it appears to not be correctly sizing the SQ: ... #define ISERT_QP_MAX_REQ_DTOS (ISCSI_DEF_XMIT_CMDS_MAX + \ ISERT_MAX_TX_MISC_PDUS + \ ISERT_MAX_RX_MISC_PDUS) ... attr.cap.max_send_wr = ISERT_QP_MAX_REQ_DTOS + 1; attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS + 1; ... I think above snipit assumes a DTO consumes exactly one WR/WQE in the SQ. But the DTO can be broken into multiple WRs to handle REG_MRs, multiple WRITE or READ WRs due to limits on local sge depths target sge depths, etc. Yes? Or am I all wet? Or perhaps isert doesn't require the SQ to be the max possible because it flow controls the DTO submissions? Stevo ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: RQ overflow seen running isert traffic 2016-10-17 18:29 ` Steve Wise @ 2016-10-18 8:04 ` Sagi Grimberg 2016-10-18 11:28 ` SQ " Potnuri Bharat Teja 0 siblings, 1 reply; 24+ messages in thread From: Sagi Grimberg @ 2016-10-18 8:04 UTC (permalink / raw) To: Steve Wise, 'Potnuri Bharat Teja' Cc: target-devel-u79uwXL29TY76Z2rM5mHXA, nab-IzHhD5pYlfBP7FQvKIMDCQ, linux-rdma-u79uwXL29TY76Z2rM5mHXA Hey Steve and Baharat, > Hey Sagi, I'm looking at isert_create_qp() and it appears to not be correctly > sizing the SQ: > ... > #define ISERT_QP_MAX_REQ_DTOS (ISCSI_DEF_XMIT_CMDS_MAX + \ > ISERT_MAX_TX_MISC_PDUS + \ > ISERT_MAX_RX_MISC_PDUS) > ... > attr.cap.max_send_wr = ISERT_QP_MAX_REQ_DTOS + 1; > attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS + 1; > ... > > I think above snipit assumes a DTO consumes exactly one WR/WQE in the SQ. But > the DTO can be broken into multiple WRs to handle REG_MRs, multiple WRITE or > READ WRs due to limits on local sge depths target sge depths, etc. Yes? Or am > I all wet? Or perhaps isert doesn't require the SQ to be the max possible > because it flow controls the DTO submissions? I think you are correct. On my test devices, I didn't see that multiple WRs has had any effect becuase: 1. My test devices usually give next power of 2 (256) 2. workloads that involved multiple rdma operations never stressed the system enough to get the queues full. Now, in iWARP for non-immediate writes we'll need more than a single wr per IO (I think the SQ size is expanded with the new rdma RW API which implicitly increases with attr.cap.max_rdma_ctxs). But I do agree that we need to take into account that each IO needs at least 2 WRs (one for rdma and one for send). So a temp bandage would be: -- diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h index fc791efe3a10..81afb95aeea9 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.h +++ b/drivers/infiniband/ulp/isert/ib_isert.h @@ -54,8 +54,14 @@ #define ISERT_MIN_POSTED_RX (ISCSI_DEF_XMIT_CMDS_MAX >> 2) -#define ISERT_QP_MAX_REQ_DTOS (ISCSI_DEF_XMIT_CMDS_MAX + \ - ISERT_MAX_TX_MISC_PDUS + \ +/* + * Max QP send work requests consist of: + * - RDMA + SEND for each iscsi IO + * - iscsi misc TX pdus + * - iscsi misc RX response pdus + */ +#define ISERT_QP_MAX_REQ_DTOS ((ISCSI_DEF_XMIT_CMDS_MAX * 2 ) + \ + ISERT_MAX_TX_MISC_PDUS + \ ISERT_MAX_RX_MISC_PDUS) #define ISER_RX_PAD_SIZE (ISCSI_DEF_MAX_RECV_SEG_LEN + 4096 - \ -- But we do need to track the SQ overflow and queue a retransmit work when we don't have enough available SQ slots.. Thoughts? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: SQ overflow seen running isert traffic 2016-10-18 8:04 ` Sagi Grimberg @ 2016-10-18 11:28 ` Potnuri Bharat Teja 2016-10-18 13:17 ` Sagi Grimberg 0 siblings, 1 reply; 24+ messages in thread From: Potnuri Bharat Teja @ 2016-10-18 11:28 UTC (permalink / raw) To: Sagi Grimberg; +Cc: Steve Wise, target-devel, nab, linux-rdma On Tuesday, October 10/18/16, 2016 at 11:04:00 +0300, Sagi Grimberg wrote: > Hey Steve and Baharat, > > >Hey Sagi, I'm looking at isert_create_qp() and it appears to not be correctly > >sizing the SQ: > >... > >#define ISERT_QP_MAX_REQ_DTOS (ISCSI_DEF_XMIT_CMDS_MAX + \ > > ISERT_MAX_TX_MISC_PDUS + \ > > ISERT_MAX_RX_MISC_PDUS) > >... > > attr.cap.max_send_wr = ISERT_QP_MAX_REQ_DTOS + 1; > > attr.cap.max_recv_wr = ISERT_QP_MAX_RECV_DTOS + 1; > >... > > > >I think above snipit assumes a DTO consumes exactly one WR/WQE in the SQ. But > >the DTO can be broken into multiple WRs to handle REG_MRs, multiple WRITE or > >READ WRs due to limits on local sge depths target sge depths, etc. Yes? Or am > >I all wet? Or perhaps isert doesn't require the SQ to be the max possible > >because it flow controls the DTO submissions? > > I think you are correct. > > On my test devices, I didn't see that multiple WRs has had any effect > becuase: > 1. My test devices usually give next power of 2 (256) > 2. workloads that involved multiple rdma operations never stressed the > system enough to get the queues full. > > Now, in iWARP for non-immediate writes we'll need more than a single > wr per IO (I think the SQ size is expanded with the new rdma RW API > which implicitly increases with attr.cap.max_rdma_ctxs). Yes, rdma RW api factors the attr.cap.max_rdma_ctxs based on attr Flags. > > But I do agree that we need to take into account that each IO needs > at least 2 WRs (one for rdma and one for send). > > So a temp bandage would be: > -- > diff --git a/drivers/infiniband/ulp/isert/ib_isert.h > b/drivers/infiniband/ulp/isert/ib_isert.h > index fc791efe3a10..81afb95aeea9 100644 > --- a/drivers/infiniband/ulp/isert/ib_isert.h > +++ b/drivers/infiniband/ulp/isert/ib_isert.h > @@ -54,8 +54,14 @@ > > #define ISERT_MIN_POSTED_RX (ISCSI_DEF_XMIT_CMDS_MAX >> 2) > > -#define ISERT_QP_MAX_REQ_DTOS (ISCSI_DEF_XMIT_CMDS_MAX + \ > - ISERT_MAX_TX_MISC_PDUS + \ > +/* > + * Max QP send work requests consist of: > + * - RDMA + SEND for each iscsi IO > + * - iscsi misc TX pdus > + * - iscsi misc RX response pdus > + */ > +#define ISERT_QP_MAX_REQ_DTOS ((ISCSI_DEF_XMIT_CMDS_MAX * 2 ) + \ > + ISERT_MAX_TX_MISC_PDUS + \ > ISERT_MAX_RX_MISC_PDUS) > > #define ISER_RX_PAD_SIZE (ISCSI_DEF_MAX_RECV_SEG_LEN + 4096 - \ > -- > I tried out this change and it works fine with iwarp. I dont see SQ overflow. Apparently we have increased the sq too big to overflow. I am going to let it run with higher workloads for longer time, to see if it holds good. > But we do need to track the SQ overflow and queue a retransmit work when > we don't have enough available SQ slots.. Agreed, iscsi-target (LIO in our case)expects failure to be returned by overlying modules Or it could be for tcp as it handling is different to that of iser. It queues the WR and schedules it for repost incase of post failures with ENOMEM/EAGAIN. Shall send the necessary change if it is needed. Thanks, Bharat. > > Thoughts? ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SQ overflow seen running isert traffic 2016-10-18 11:28 ` SQ " Potnuri Bharat Teja @ 2016-10-18 13:17 ` Sagi Grimberg [not found] ` <ed7ebb39-be81-00b3-ef23-3f4c0e3afbb1-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> 0 siblings, 1 reply; 24+ messages in thread From: Sagi Grimberg @ 2016-10-18 13:17 UTC (permalink / raw) To: Potnuri Bharat Teja; +Cc: Steve Wise, target-devel, nab, linux-rdma > I tried out this change and it works fine with iwarp. I dont see SQ > overflow. Apparently we have increased the sq too big to overflow. I am going > to let it run with higher workloads for longer time, to see if it holds good. Actually on second thought, this patch is an overkill. Effectively we now set: MAX_CMD=266 and max_rdma_ctx=128 so together we take 394 which seems to too much. If we go by the scheme of 1 rdma + 1 send for each IO we need: - 128 sends - 128 rdmas - 10 miscs so this gives 266. Perhaps this is due to the fact that iWARP needs to register memory for rdma reads as well? (and also rdma writes > 128k for chelsio HW right?) What is the workload you are running? with immediatedata enabled you should issue reg+rdma_read+send only for writes > 8k. Does this happen when you run only reads for example? I guess its time to get the sq accounting into shape... ^ permalink raw reply [flat|nested] 24+ messages in thread
[parent not found: <ed7ebb39-be81-00b3-ef23-3f4c0e3afbb1-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>]
* RE: SQ overflow seen running isert traffic [not found] ` <ed7ebb39-be81-00b3-ef23-3f4c0e3afbb1-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> @ 2016-10-18 14:34 ` Steve Wise 2016-10-18 16:13 ` Jason Gunthorpe 2016-10-31 3:40 ` Nicholas A. Bellinger 0 siblings, 2 replies; 24+ messages in thread From: Steve Wise @ 2016-10-18 14:34 UTC (permalink / raw) To: 'Sagi Grimberg', 'Potnuri Bharat Teja' Cc: target-devel-u79uwXL29TY76Z2rM5mHXA, nab-IzHhD5pYlfBP7FQvKIMDCQ, linux-rdma-u79uwXL29TY76Z2rM5mHXA > > > I tried out this change and it works fine with iwarp. I dont see SQ > > overflow. Apparently we have increased the sq too big to overflow. I am going > > to let it run with higher workloads for longer time, to see if it holds good. > > Actually on second thought, this patch is an overkill. Effectively we > now set: > > MAX_CMD=266 > and max_rdma_ctx=128 so together we take 394 which seems to too much. > > If we go by the scheme of 1 rdma + 1 send for each IO we need: > - 128 sends > - 128 rdmas > - 10 miscs > > so this gives 266. > > Perhaps this is due to the fact that iWARP needs to register memory for > rdma reads as well? (and also rdma writes > 128k for chelsio HW right?) > iWARP definitely needs to register memory for the target of reads, due to REMOTE_WRITE requirement for the protocol. The source of a write doesn't need to register memory, but the SGE depth can cause multiple WRITE WRs to be required to service the IO. And in theory there should be some threshold where it might be better performance-wise to do a memory register + 1 WRITE vs X WRITEs. As you mentioned, the RW API should account for this, but perhaps it is still off some. Bharat, have a look into the RDMA-RW API and let us see if we can figure out if the additional SQ depth it adds is sufficient. > What is the workload you are running? with immediatedata enabled you > should issue reg+rdma_read+send only for writes > 8k. > > Does this happen when you run only reads for example? > > I guess its time to get the sq accounting into shape... So to sum up - 2 issues: 1) we believe the iSER + RW API correctly sizes the SQ, yet we're seeing SQ overflows. So the SQ sizing needs more investigation. 2) if the SQ is full, then the iSER/target code is supposed to resubmit. And apparently that isn't working. Steve. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SQ overflow seen running isert traffic 2016-10-18 14:34 ` Steve Wise @ 2016-10-18 16:13 ` Jason Gunthorpe 2016-10-18 19:03 ` Steve Wise 2016-10-20 8:34 ` Sagi Grimberg 2016-10-31 3:40 ` Nicholas A. Bellinger 1 sibling, 2 replies; 24+ messages in thread From: Jason Gunthorpe @ 2016-10-18 16:13 UTC (permalink / raw) To: Steve Wise Cc: 'Sagi Grimberg', 'Potnuri Bharat Teja', target-devel, nab, linux-rdma On Tue, Oct 18, 2016 at 09:34:33AM -0500, Steve Wise wrote: > 1) we believe the iSER + RW API correctly sizes the SQ, yet we're seeing SQ > overflows. So the SQ sizing needs more investigation. NFS had this sort of problem - in that case it was because the code was assuming that a RQ completion implied SQ space - that is not legal, only direct completions from SQ WCs can guide available space in the SQ.. Jason ^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: SQ overflow seen running isert traffic 2016-10-18 16:13 ` Jason Gunthorpe @ 2016-10-18 19:03 ` Steve Wise 2016-10-20 8:34 ` Sagi Grimberg 1 sibling, 0 replies; 24+ messages in thread From: Steve Wise @ 2016-10-18 19:03 UTC (permalink / raw) To: 'Jason Gunthorpe' Cc: 'Sagi Grimberg', 'Potnuri Bharat Teja', target-devel, nab, linux-rdma > On Tue, Oct 18, 2016 at 09:34:33AM -0500, Steve Wise wrote: > > > 1) we believe the iSER + RW API correctly sizes the SQ, yet we're seeing SQ > > overflows. So the SQ sizing needs more investigation. > > NFS had this sort of problem - in that case it was because the code > was assuming that a RQ completion implied SQ space - that is not > legal, only direct completions from SQ WCs can guide available space > in the SQ.. > Agreed. And this could be the case Bharat is hitting. (I wonder if there is a clever way we can tell?) I don't see I the code that isert flow controls SQ WR submission based on SQ completions. It would be nice to have some core API, maybe in rdma-rw, that could handle this by queuing and processing when the SQ has resources... Steve. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SQ overflow seen running isert traffic 2016-10-18 16:13 ` Jason Gunthorpe 2016-10-18 19:03 ` Steve Wise @ 2016-10-20 8:34 ` Sagi Grimberg [not found] ` <f7a4b395-1786-3c7a-7639-195e830db5ad-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> 1 sibling, 1 reply; 24+ messages in thread From: Sagi Grimberg @ 2016-10-20 8:34 UTC (permalink / raw) To: Jason Gunthorpe, Steve Wise Cc: 'Potnuri Bharat Teja', target-devel, nab, linux-rdma Hey Jason, >> 1) we believe the iSER + RW API correctly sizes the SQ, yet we're seeing SQ >> overflows. So the SQ sizing needs more investigation. > > NFS had this sort of problem - in that case it was because the code > was assuming that a RQ completion implied SQ space - that is not > legal, only direct completions from SQ WCs can guide available space > in the SQ.. Its not the same problem. iser-target does not tie SQ and RQ spaces. The origin here is the difference between IB/RoCE and iWARP and the chelsio HW that makes it hard to predict the SQ correct size. iWARP needs extra registration for rdma reads and the chelsio device seems to be limited in the number of pages per registration so this configuration will need a larger send queue. Another problem is that we don't have a correct retry flow. Hopefully we can address that in the RW API which is designed to hide these details from the ULP... ^ permalink raw reply [flat|nested] 24+ messages in thread
[parent not found: <f7a4b395-1786-3c7a-7639-195e830db5ad-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>]
* Re: SQ overflow seen running isert traffic [not found] ` <f7a4b395-1786-3c7a-7639-195e830db5ad-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> @ 2017-03-20 13:05 ` Potnuri Bharat Teja 2017-03-20 15:04 ` Steve Wise 0 siblings, 1 reply; 24+ messages in thread From: Potnuri Bharat Teja @ 2017-03-20 13:05 UTC (permalink / raw) To: Sagi Grimberg Cc: Jason Gunthorpe, SWise OGC, target-devel-u79uwXL29TY76Z2rM5mHXA, nab-IzHhD5pYlfBP7FQvKIMDCQ, linux-rdma-u79uwXL29TY76Z2rM5mHXA On Thursday, October 10/20/16, 2016 at 14:04:34 +0530, Sagi Grimberg wrote: > Hey Jason, > > >> 1) we believe the iSER + RW API correctly sizes the SQ, yet we're > seeing SQ > >> overflows. So the SQ sizing needs more investigation. > > > > NFS had this sort of problem - in that case it was because the code > > was assuming that a RQ completion implied SQ space - that is not > > legal, only direct completions from SQ WCs can guide available space > > in the SQ.. > > Its not the same problem. iser-target does not tie SQ and RQ spaces. > The origin here is the difference between IB/RoCE and iWARP and the > chelsio HW that makes it hard to predict the SQ correct size. > > iWARP needs extra registration for rdma reads and the chelsio device > seems to be limited in the number of pages per registration so this > configuration will need a larger send queue. > > Another problem is that we don't have a correct retry flow. > > Hopefully we can address that in the RW API which is designed to hide > these details from the ULP... Hi Sagi, Here is what our further analysis of SQ dump at the time of overflow says: RDMA read/write API is creating long chains (32 WRs) to handle large ISCSI READs. For Writing iscsi default block size of 512KB data, iw_cxgb4's max number of sge advertised is 4 page ~ 16KB for write, needs WR chain of 32 WRs (another possible factor is they all are unsignalled WRs and are completed only after next signalled WR) But apparantly rdma_rw_init_qp() assumes that any given IO will take only 1 WRITE WR to convey the data. This evidently is incorrect and rdma_rw_init_qp() needs to factor and size the queue based on max_sge of device for write and read and the sg_tablesize for which rdma read/write is used for, like ISCSI_ISER_MAX_SG_TABLESIZE of initiator. If above analysis is correct, please suggest how could this be fixed? Further, using MRs for rdma WRITE by using rdma_wr_force_mr = 1 module parameter of ib_core avoids SQ overflow by registering a single REG_MR and using that MR for a single WRITE WR. So a rdma-rw IO chain of say 32 WRITE WRs, becomes just 3 WRS: REG_MR + WRITE + INV_MR as max_fast_reg_page_list_len of iw_cxgb4 is 128 page. (By default force_mr is not set and iw_cxgb4 could only use MR for rdma READs only as per rdma_rw_io_needs_mr() if force_mr isnt set) >From this is there any possibility that we could use MR if the write WR chain exceeds a certain number? Thanks for your time! -Bharat. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: SQ overflow seen running isert traffic 2017-03-20 13:05 ` Potnuri Bharat Teja @ 2017-03-20 15:04 ` Steve Wise 0 siblings, 0 replies; 24+ messages in thread From: Steve Wise @ 2017-03-20 15:04 UTC (permalink / raw) To: 'Potnuri Bharat Teja', 'Sagi Grimberg' Cc: 'Jason Gunthorpe', target-devel, nab, linux-rdma, Christoph Hellwig +Christoph > > On Thursday, October 10/20/16, 2016 at 14:04:34 +0530, Sagi Grimberg wrote: > > Hey Jason, > > > > >> 1) we believe the iSER + RW API correctly sizes the SQ, yet we're > > seeing SQ > > >> overflows. So the SQ sizing needs more investigation. > > > > > > NFS had this sort of problem - in that case it was because the code > > > was assuming that a RQ completion implied SQ space - that is not > > > legal, only direct completions from SQ WCs can guide available space > > > in the SQ.. > > > > Its not the same problem. iser-target does not tie SQ and RQ spaces. > > The origin here is the difference between IB/RoCE and iWARP and the > > chelsio HW that makes it hard to predict the SQ correct size. > > > > iWARP needs extra registration for rdma reads and the chelsio device > > seems to be limited in the number of pages per registration so this > > configuration will need a larger send queue. > > > > Another problem is that we don't have a correct retry flow. > > > > Hopefully we can address that in the RW API which is designed to hide > > these details from the ULP... > Hi Sagi, > Here is what our further analysis of SQ dump at the time of overflow says: > > RDMA read/write API is creating long chains (32 WRs) to handle large ISCSI > READs. For Writing iscsi default block size of 512KB data, iw_cxgb4's max > number of sge advertised is 4 page ~ 16KB for write, needs WR chain of 32 WRs > (another possible factor is they all are unsignalled WRs and are completed > only after next signalled WR) But apparantly rdma_rw_init_qp() assumes that > any given IO will take only 1 WRITE WR to convey the data. > > This evidently is incorrect and rdma_rw_init_qp() needs to factor and size > the queue based on max_sge of device for write and read and the sg_tablesize > for which rdma read/write is used for, like ISCSI_ISER_MAX_SG_TABLESIZE of > initiator. If above analysis is correct, please suggest how could this be fixed? > > Further, using MRs for rdma WRITE by using rdma_wr_force_mr = 1 module > parameter of ib_core avoids SQ overflow by registering a single REG_MR and > using that MR for a single WRITE WR. So a rdma-rw IO chain of say 32 WRITE > WRs, becomes just 3 WRS: REG_MR + WRITE + INV_MR as > max_fast_reg_page_list_len of iw_cxgb4 is 128 page. > > (By default force_mr is not set and iw_cxgb4 could only use MR for rdma > READs only as per rdma_rw_io_needs_mr() if force_mr isnt set) > >From this is there any possibility that we could use MR if the write WR > chain exceeds a certain number? > > Thanks for your time! > I think it is time to resolve this XXX comment in rw.c for rdma_rw_io_needs_mr(): /* * Check if the device will use memory registration for this RW operation. * We currently always use memory registrations for iWarp RDMA READs, and * have a debug option to force usage of MRs. * XXX: In the future we can hopefully fine tune this based on HCA driver * input. */ Regardless of whether the HCA driver provides input, I think 30+ RDMA WRITE WR chains isn't as efficient as 1 REG_MR + 1 WRITE + 1 INV_MR. Is it unreasonable to just add some threshold in rw.c? Also, I think rdma_rw_init_qp() does need some tweaks: It needs to take into account the max sge depth, the max REG_MR depth, and the max SQ depth device attributes/capabilities when sizing the SQ. However, if that computed depth exceeds the device max, then the SQ will not be big enough to avoid potential overflowing, and I believe ULPs should _always_ flow control their outgoing WRs based on the SQ depth regardless. And perhaps rdma-rw should even avoid overly deep SQs just because that tends to inhibit scalability. EG: allowing lots of shallow QPs vs consuming all the device resources with very deep QPs... Steve. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SQ overflow seen running isert traffic 2016-10-18 14:34 ` Steve Wise 2016-10-18 16:13 ` Jason Gunthorpe @ 2016-10-31 3:40 ` Nicholas A. Bellinger 2016-11-02 17:03 ` Steve Wise [not found] ` <1477885208.27946.8.camel-XoQW25Eq2zviZyQQd+hFbcojREIfoBdhmpATvIKMPHk@public.gmane.org> 1 sibling, 2 replies; 24+ messages in thread From: Nicholas A. Bellinger @ 2016-10-31 3:40 UTC (permalink / raw) To: Steve Wise Cc: 'Sagi Grimberg', 'Potnuri Bharat Teja', target-devel-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA Hi Steve, Potnuri, & Co, On Tue, 2016-10-18 at 09:34 -0500, Steve Wise wrote: > > > > > I tried out this change and it works fine with iwarp. I dont see SQ > > > overflow. Apparently we have increased the sq too big to overflow. I am > going > > > to let it run with higher workloads for longer time, to see if it holds > good. > > > > Actually on second thought, this patch is an overkill. Effectively we > > now set: > > > > MAX_CMD=266 > > and max_rdma_ctx=128 so together we take 394 which seems to too much. > > > > If we go by the scheme of 1 rdma + 1 send for each IO we need: > > - 128 sends > > - 128 rdmas > > - 10 miscs > > > > so this gives 266. > > > > Perhaps this is due to the fact that iWARP needs to register memory for > > rdma reads as well? (and also rdma writes > 128k for chelsio HW right?) > > > > iWARP definitely needs to register memory for the target of reads, due to > REMOTE_WRITE requirement for the protocol. The source of a write doesn't need > to register memory, but the SGE depth can cause multiple WRITE WRs to be > required to service the IO. And in theory there should be some threshold where > it might be better performance-wise to do a memory register + 1 WRITE vs X > WRITEs. > > As you mentioned, the RW API should account for this, but perhaps it is still > off some. Bharat, have a look into the RDMA-RW API and let us see if we can > figure out if the additional SQ depth it adds is sufficient. > > > What is the workload you are running? with immediatedata enabled you > > should issue reg+rdma_read+send only for writes > 8k. > > > > Does this happen when you run only reads for example? > > > > I guess its time to get the sq accounting into shape... > > So to sum up - 2 issues: > > 1) we believe the iSER + RW API correctly sizes the SQ, yet we're seeing SQ > overflows. So the SQ sizing needs more investigation. > > 2) if the SQ is full, then the iSER/target code is supposed to resubmit. And > apparently that isn't working. > For #2, target-core expects -ENOMEM or -EAGAIN return from fabric driver callbacks to signal internal queue-full retry logic. Otherwise, the extra se_cmd->cmd_kref response SCF_ACK_KREF is leaked until session shutdown and/or reinstatement occurs. AFAICT, Potunri's earlier hung task with v4.8.y + ABORT_TASK is likely the earlier v4.1+ regression: https://github.com/torvalds/linux/commit/527268df31e57cf2b6d417198717c6d6afdb1e3e That said, there is room for improvement in target-core queue-full error signaling, and iscsi-target/iser-target callback error propagation. Sending out a series shortly to address these particular items. Please have a look. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 24+ messages in thread
* RE: SQ overflow seen running isert traffic 2016-10-31 3:40 ` Nicholas A. Bellinger @ 2016-11-02 17:03 ` Steve Wise [not found] ` <1477885208.27946.8.camel-XoQW25Eq2zviZyQQd+hFbcojREIfoBdhmpATvIKMPHk@public.gmane.org> 1 sibling, 0 replies; 24+ messages in thread From: Steve Wise @ 2016-11-02 17:03 UTC (permalink / raw) To: 'Nicholas A. Bellinger' Cc: 'Sagi Grimberg', 'Potnuri Bharat Teja', target-devel, linux-rdma > > So to sum up - 2 issues: > > > > 1) we believe the iSER + RW API correctly sizes the SQ, yet we're seeing SQ > > overflows. So the SQ sizing needs more investigation. > > > > 2) if the SQ is full, then the iSER/target code is supposed to resubmit. And > > apparently that isn't working. > > > > For #2, target-core expects -ENOMEM or -EAGAIN return from fabric driver > callbacks to signal internal queue-full retry logic. Otherwise, the > extra se_cmd->cmd_kref response SCF_ACK_KREF is leaked until session > shutdown and/or reinstatement occurs. > > AFAICT, Potunri's earlier hung task with v4.8.y + ABORT_TASK is likely > the earlier v4.1+ regression: > > https://github.com/torvalds/linux/commit/527268df31e57cf2b6d417198717c6d6afd > b1e3e > > That said, there is room for improvement in target-core queue-full error > signaling, and iscsi-target/iser-target callback error propagation. > > Sending out a series shortly to address these particular items. > Please have a look. > Hey Nicholas, Bharat is out until next week. He'll try this all out next week and report back. Thanks! Steve. ^ permalink raw reply [flat|nested] 24+ messages in thread
[parent not found: <1477885208.27946.8.camel-XoQW25Eq2zviZyQQd+hFbcojREIfoBdhmpATvIKMPHk@public.gmane.org>]
* Re: SQ overflow seen running isert traffic [not found] ` <1477885208.27946.8.camel-XoQW25Eq2zviZyQQd+hFbcojREIfoBdhmpATvIKMPHk@public.gmane.org> @ 2016-11-08 10:06 ` Potnuri Bharat Teja 2017-03-20 10:15 ` Potnuri Bharat Teja 0 siblings, 1 reply; 24+ messages in thread From: Potnuri Bharat Teja @ 2016-11-08 10:06 UTC (permalink / raw) To: Nicholas A. Bellinger Cc: SWise OGC, 'Sagi Grimberg', target-devel-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA On Monday, October 10/31/16, 2016 at 09:10:08 +0530, Nicholas A. Bellinger wrote: > Hi Steve, Potnuri, & Co, > > On Tue, 2016-10-18 at 09:34 -0500, Steve Wise wrote: > > > > > > > I tried out this change and it works fine with iwarp. I dont see SQ > > > > overflow. Apparently we have increased the sq too big to overflow. I am > > going > > > > to let it run with higher workloads for longer time, to see if it holds > > good. > > > > > > Actually on second thought, this patch is an overkill. Effectively we > > > now set: > > > > > > MAX_CMD=266 > > > and max_rdma_ctx=128 so together we take 394 which seems to too much. > > > > > > If we go by the scheme of 1 rdma + 1 send for each IO we need: > > > - 128 sends > > > - 128 rdmas > > > - 10 miscs > > > > > > so this gives 266. > > > > > > Perhaps this is due to the fact that iWARP needs to register memory for > > > rdma reads as well? (and also rdma writes > 128k for chelsio HW right?) > > > > > > > iWARP definitely needs to register memory for the target of reads, due to > > REMOTE_WRITE requirement for the protocol. The source of a write doesn't need > > to register memory, but the SGE depth can cause multiple WRITE WRs to be > > required to service the IO. And in theory there should be some threshold where > > it might be better performance-wise to do a memory register + 1 WRITE vs X > > WRITEs. > > > > As you mentioned, the RW API should account for this, but perhaps it is still > > off some. Bharat, have a look into the RDMA-RW API and let us see if we can > > figure out if the additional SQ depth it adds is sufficient. > > > > > What is the workload you are running? with immediatedata enabled you > > > should issue reg+rdma_read+send only for writes > 8k. > > > > > > Does this happen when you run only reads for example? > > > > > > I guess its time to get the sq accounting into shape... > > > > So to sum up - 2 issues: > > > > 1) we believe the iSER + RW API correctly sizes the SQ, yet we're seeing SQ > > overflows. So the SQ sizing needs more investigation. > > > > 2) if the SQ is full, then the iSER/target code is supposed to resubmit. And > > apparently that isn't working. > > > > For #2, target-core expects -ENOMEM or -EAGAIN return from fabric driver > callbacks to signal internal queue-full retry logic. Otherwise, the > extra se_cmd->cmd_kref response SCF_ACK_KREF is leaked until session > shutdown and/or reinstatement occurs. > > AFAICT, Potunri's earlier hung task with v4.8.y + ABORT_TASK is likely > the earlier v4.1+ regression: > > https://github.com/torvalds/linux/commit/527268df31e57cf2b6d417198717c6d6afdb1e3e > > That said, there is room for improvement in target-core queue-full error > signaling, and iscsi-target/iser-target callback error propagation. > > Sending out a series shortly to address these particular items. > Please have a look. > Thanks for the changes Nicholas. Testing them now. Thanks, Bharat. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SQ overflow seen running isert traffic 2016-11-08 10:06 ` Potnuri Bharat Teja @ 2017-03-20 10:15 ` Potnuri Bharat Teja 2017-03-21 6:32 ` Nicholas A. Bellinger 0 siblings, 1 reply; 24+ messages in thread From: Potnuri Bharat Teja @ 2017-03-20 10:15 UTC (permalink / raw) To: Nicholas A. Bellinger Cc: SWise OGC, 'Sagi Grimberg', target-devel, linux-rdma On Tuesday, November 11/08/16, 2016 at 15:36:18 +0530, Potnuri Bharat Teja wrote: > On Monday, October 10/31/16, 2016 at 09:10:08 +0530, Nicholas A. Bellinger > wrote: Hi Nicholas, Tested the following 3 patches for SQ overflow. https://www.spinics.net/lists/target-devel/msg13571.html They along with an additional change added on top of those seems to hold good.( Aborts seen intermittently as shown at the end of mail) I have been testing the changes using fio with IO sizes ranging from 1KB to 32MB and test is running fine (despite few Intermittent Aborts). Here is what I see with the 3 patches alone applied: -> In isert_put_datain() and isert_post_response() a corresponding recv WR is posted before posting a send and hence for every post failure a recv is already posted into a tightly packed RQ, causing it to overflow. Below are the logs suggesting it: [ 189.944714] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res [ 189.944718] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res [ 189.944721] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res [ 189.944724] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res [ 189.944729] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res [ 189.944731] isert: isert_post_recv: ib_post_recv() failed with ret: -12 [ 189.944731] isert: isert_put_datain: ib_post_recv failed with -12 [ 189.944732] isert: isert_post_recv: ib_post_recv() failed with ret: -12 [ 189.944733] isert: isert_put_datain: ib_post_recv failed with -12 [ 189.944734] isert: isert_post_recv: ib_post_recv() failed with ret: -12 [ 189.944735] isert: isert_put_datain: ib_post_recv failed with -12 [ 189.944736] isert: isert_post_recv: ib_post_recv() failed with ret: -12 [ 189.944736] isert: isert_put_datain: ib_post_recv failed with -12 [ 189.944737] isert: isert_post_recv: ib_post_recv() failed with ret: -12 [ 189.944738] isert: isert_put_datain: ib_post_recv failed with -12 [ 189.944739] isert: isert_post_recv: ib_post_recv() failed with ret: -12 [ 189.944740] isert: isert_put_datain: ib_post_recv failed with -12 [ 189.944740] isert: isert_post_recv: ib_post_recv() failed with ret: -12 [ 189.944741] isert: isert_put_datain: ib_post_recv failed with -12 [ 189.944746] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res [ 189.944747] isert: isert_post_recv: ib_post_recv() failed with ret: -12 [ 189.944748] isert: isert_put_datain: ib_post_recv failed with -12 [ 189.944749] isert: isert_post_recv: ib_post_recv() failed with ret: -12 [ 251.543847] ABORT_TASK: Found referenced iSCSI task_tag: 68 [ 251.543858] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 68 [ 251.544071] ABORT_TASK: Found referenced iSCSI task_tag: 46 [ 251.544076] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 46 [ 251.544300] ABORT_TASK: Found referenced iSCSI task_tag: 51 [ 251.544307] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 51 [ 251.544528] ABORT_TASK: Found referenced iSCSI task_tag: 117 [ 251.544533] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 117 [ 282.778402] Unable to locate ITT: 0x00000062 on CID: 0 [ 282.778404] Unable to locate RefTaskTag: 0x00000062 on CID: 0. [ 282.778407] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 0 [ 282.778436] ABORT_TASK: Found referenced iSCSI task_tag: 37 [ 282.778439] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 37 [ 282.778519] ABORT_TASK: Found referenced iSCSI task_tag: 65 [ 282.778522] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 65 [ 282.778559] Unable to locate ITT: 0x00000019 on CID: 0 [ 282.778560] Unable to locate RefTaskTag: 0x00000019 on CID: 0. [ 282.778563] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 0 [ 293.004632] ABORT_TASK: Found referenced iSCSI task_tag: 22 [ 293.004643] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 22 [ 293.004808] TMR Opcode TARGET_WARM_RESET authorization failed for Initiator Node: iqn.1994-05.com.redhat:37fc2b6a5331 [ 310.265993] iSCSI Login timeout on Network Portal 10.0.2.2:3260 [ 493.567143] INFO: task iscsi_np:3998 blocked for more than 120 seconds. [ 493.567149] Tainted: G OE 4.9.0-For-DSGL-Debug+ #1 [ 493.567151] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 493.567153] iscsi_np D 0 3998 2 0x00000084 [ 493.567159] 0000000000000000 ffff883028a88b40 ffff88302693f7c0 ffff88302de3e300 [ 493.567163] ffff88303e057a40 ffffc900008abc00 ffffffff815e415b ffffc900008abca0 [ 493.567166] ffffffff81af4180 ffff883028a88b40 7fffffffffffffff ffff882f7426db68 [ 493.567170] Call Trace: [ 493.567180] [<ffffffff815e415b>] ? __schedule+0x23b/0x6c0 [ 493.567184] [<ffffffff815e4616>] schedule+0x36/0x80 [ 493.567187] [<ffffffff815e71c3>] schedule_timeout+0x1f3/0x390 [ 493.567194] [<ffffffff812c1f2e>] ? number+0x2ce/0x300 [ 493.567197] [<ffffffff812c42be>] ? vsnprintf+0x32e/0x4b0 [ 493.567199] [<ffffffff815e56a4>] wait_for_completion+0xb4/0xf0 [ 493.567201] [<ffffffff8108a6b0>] ? wake_up_q+0x80/0x80 [ 493.567216] [<ffffffffa05284d1>] iscsit_stop_session+0x1b1/0x1c0 [iscsi_target_mod] [ 493.567220] [<ffffffffa0519db2>] iscsi_check_for_session_reinstatement+0x1e2/0x270 [iscsi_target_mod] [ 493.567225] [<ffffffffa051c8c0>] iscsi_target_check_for_existing_instances+0x30/0x40 [iscsi_target_mod] [ 493.567230] [<ffffffffa051c9ee>] iscsi_target_do_login+0x11e/0x610 [iscsi_target_mod] [ 493.567235] [<ffffffffa051dbbe>] iscsi_target_start_negotiation+0x4e/0xa0 [iscsi_target_mod] [ 493.567239] [<ffffffffa051b933>] iscsi_target_login_thread+0xa73/0xfc0 [iscsi_target_mod] [ 493.567244] [<ffffffffa051aec0>] ? iscsi_target_login_sess_out+0x250/0x250 [iscsi_target_mod] [ 493.567246] [<ffffffff810660c4>] ? SyS_exit_group+0x14/0x20 [ 493.567249] [<ffffffff8107ee5a>] kthread+0xca/0xe0 [ 493.567250] [<ffffffff8107ed90>] ? kthread_park+0x60/0x60 [ 493.567252] [<ffffffff815e8752>] ret_from_fork+0x22/0x30 [ 493.567254] INFO: task iscsi_trx:4149 blocked for more than 120 seconds. [ 493.567255] Tainted: G OE 4.9.0-For-DSGL-Debug+ #1 [ 493.567256] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 493.567256] iscsi_trx D 0 4149 2 0x00000084 [ 493.567258] 0000000000000000 ffff882f5ef70340 ffff882f76253040 ffff88302de40340 [ 493.567259] ffff88303e097a40 ffffc90000f93c90 ffffffff815e415b ffffffff815e56c2 [ 493.567261] 0000000000000001 ffff882f5ef70340 7fffffffffffffff ffff882d233cc9b0 [ 493.567262] Call Trace: [ 493.567264] [<ffffffff815e415b>] ? __schedule+0x23b/0x6c0 [ 493.567265] [<ffffffff815e56c2>] ? wait_for_completion+0xd2/0xf0 [ 493.567267] [<ffffffff815e4616>] schedule+0x36/0x80 [ 493.567268] [<ffffffff815e71c3>] schedule_timeout+0x1f3/0x390 [ 493.567281] [<ffffffffa118e910>] ? ib_sg_to_pages+0x1a0/0x1a0 [ib_core] [ 493.567285] [<ffffffffa118e910>] ? ib_sg_to_pages+0x1a0/0x1a0 [ib_core] [ 493.567286] [<ffffffff815e56a4>] wait_for_completion+0xb4/0xf0 [ 493.567287] [<ffffffff8108a6b0>] ? wake_up_q+0x80/0x80 [ 493.567302] [<ffffffffa04d5399>] target_wait_for_sess_cmds+0x49/0x1a0 [target_core_mod] [ 493.567306] [<ffffffffa0471644>] isert_wait_conn+0x1a4/0x2d0 [ib_isert] [ 493.567311] [<ffffffffa05289ab>] iscsit_close_connection+0x14b/0x820 [iscsi_target_mod] [ 493.567315] [<ffffffffa05172ef>] iscsit_take_action_for_connection_exit+0x7f/0x100 [iscsi_target_mod] [ 493.567320] [<ffffffffa0528265>] iscsi_target_rx_thread+0x95/0xa0 [iscsi_target_mod] [ 493.567324] [<ffffffffa05281d0>] ? iscsi_target_tx_thread+0x1b0/0x1b0 [iscsi_target_mod] [ 493.567325] [<ffffffff8107ee5a>] kthread+0xca/0xe0 [ 493.567327] [<ffffffff8107ed90>] ? kthread_park+0x60/0x60 [ 493.567328] [<ffffffff815e8752>] ret_from_fork+0x22/0x30 -> In order to address this I tried the following (experimental) change on top of your 3 patches: diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 501a0ad124a7..0aead5faa2a2 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -1817,16 +1817,16 @@ isert_post_response(struct isert_conn *isert_conn, struct isert_cmd *isert_cmd) struct ib_send_wr *wr_failed; int ret; - ret = isert_post_recv(isert_conn, isert_cmd->rx_desc); + ret = ib_post_send(isert_conn->qp, &isert_cmd->tx_desc.send_wr, + &wr_failed); if (ret) { - isert_err("ib_post_recv failed with %d\n", ret); + isert_err("ib_post_send failed with %d\n", ret); return ret; } - ret = ib_post_send(isert_conn->qp, &isert_cmd->tx_desc.send_wr, - &wr_failed); + ret = isert_post_recv(isert_conn, isert_cmd->rx_desc); if (ret) { - isert_err("ib_post_send failed with %d\n", ret); + isert_err("ib_post_recv failed with %d\n", ret); return ret; } return ret; @@ -2166,6 +2166,13 @@ isert_put_datain(struct iscsi_conn *conn, struct iscsi_cmd *cmd) if (isert_prot_cmd(isert_conn, se_cmd)) { isert_cmd->tx_desc.tx_cqe.done = isert_rdma_write_done; cqe = &isert_cmd->tx_desc.tx_cqe; + + rc = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr); + if (rc) { + isert_dbg("Cmd: %p posted RDMA_WRITE for iSER Data READ rc: %d\n", + isert_cmd, rc); + return rc; + } } else { /* * Build isert_conn->tx_desc for iSCSI response PDU and attach @@ -2178,18 +2185,21 @@ isert_put_datain(struct iscsi_conn *conn, struct iscsi_cmd *cmd) isert_init_send_wr(isert_conn, isert_cmd, &isert_cmd->tx_desc.send_wr); + chain_wr = &isert_cmd->tx_desc.send_wr; + + rc = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr); + if (rc) { + isert_err("Cmd: %p posted RDMA_WRITE for iSER Data READ rc: %d\n", + isert_cmd, rc); + return rc; + } + rc = isert_post_recv(isert_conn, isert_cmd->rx_desc); if (rc) { isert_err("ib_post_recv failed with %d\n", rc); return rc; } - - chain_wr = &isert_cmd->tx_desc.send_wr; } - rc = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr); - isert_dbg("Cmd: %p posted RDMA_WRITE for iSER Data READ rc: %d\n", - isert_cmd, rc); - return rc; } -> With these changes it works fine, But I still see few Aborts intermittently as follows: [10674.434643] isert: isert_rdma_rw_ctx_post: Cmd: ffff882cbb961138 failed to post RDMA res [10674.434690] isert: isert_rdma_rw_ctx_post: Cmd: ffff882cbb9434c0 failed to post RDMA res [10715.186372] ABORT_TASK: Found referenced iSCSI task_tag: 58 [10715.186417] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 58 -> Apart from the Aborts and post failure logs filling dmesg, luns are in state "running" and fio countinues to run. [root@bharat ~]# iscsiadm -m session -P3 | grep "Attached scsi disk" Attached scsi disk sdj State: running Attached scsi disk sdc State: running Attached scsi disk sdd State: running Attached scsi disk sde State: running Attached scsi disk sdf State: running Attached scsi disk sdg State: running Attached scsi disk sdh State: running Attached scsi disk sdi State: running Thanks, Bharat. > > Hi Steve, Potnuri, & Co, > > > > On Tue, 2016-10-18 at 09:34 -0500, Steve Wise wrote: > > > > > > > > > I tried out this change and it works fine with iwarp. I dont see > SQ > > > > > overflow. Apparently we have increased the sq too big to overflow. > I am > > > going > > > > > to let it run with higher workloads for longer time, to see if it > holds > > > good. > > > > > > > > Actually on second thought, this patch is an overkill. Effectively > we > > > > now set: > > > > > > > > MAX_CMD=266 > > > > and max_rdma_ctx=128 so together we take 394 which seems to too > much. > > > > > > > > If we go by the scheme of 1 rdma + 1 send for each IO we need: > > > > - 128 sends > > > > - 128 rdmas > > > > - 10 miscs > > > > > > > > so this gives 266. > > > > > > > > Perhaps this is due to the fact that iWARP needs to register memory > for > > > > rdma reads as well? (and also rdma writes > 128k for chelsio HW > right?) > > > > > > > > > > iWARP definitely needs to register memory for the target of reads, due > to > > > REMOTE_WRITE requirement for the protocol. The source of a write > doesn't need > > > to register memory, but the SGE depth can cause multiple WRITE WRs to > be > > > required to service the IO. And in theory there should be some > threshold where > > > it might be better performance-wise to do a memory register + 1 WRITE > vs X > > > WRITEs. > > > > > > As you mentioned, the RW API should account for this, but perhaps it > is still > > > off some. Bharat, have a look into the RDMA-RW API and let us see if > we can > > > figure out if the additional SQ depth it adds is sufficient. > > > > > > > What is the workload you are running? with immediatedata enabled you > > > > should issue reg+rdma_read+send only for writes > 8k. > > > > > > > > Does this happen when you run only reads for example? > > > > > > > > I guess its time to get the sq accounting into shape... > > > > > > So to sum up - 2 issues: > > > > > > 1) we believe the iSER + RW API correctly sizes the SQ, yet we're > seeing SQ > > > overflows. So the SQ sizing needs more investigation. > > > > > > 2) if the SQ is full, then the iSER/target code is supposed to > resubmit. And > > > apparently that isn't working. > > > > > > > For #2, target-core expects -ENOMEM or -EAGAIN return from fabric driver > > callbacks to signal internal queue-full retry logic. Otherwise, the > > extra se_cmd->cmd_kref response SCF_ACK_KREF is leaked until session > > shutdown and/or reinstatement occurs. > > > > AFAICT, Potunri's earlier hung task with v4.8.y + ABORT_TASK is likely > > the earlier v4.1+ regression: > > > > > [1]https://github.com/torvalds/linux/commit/527268df31e57cf2b6d417198717c6d6afdb1e3e > > > > That said, there is room for improvement in target-core queue-full error > > signaling, and iscsi-target/iser-target callback error propagation. > > > > Sending out a series shortly to address these particular items. > > Please have a look. > > > Thanks for the changes Nicholas. > Testing them now. > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at [2]http://vger.kernel.org/majordomo-info.html > > References > > Visible links > 1. https://github.com/torvalds/linux/commit/527268df31e57cf2b6d417198717c6d6afdb1e3e > 2. http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: SQ overflow seen running isert traffic 2017-03-20 10:15 ` Potnuri Bharat Teja @ 2017-03-21 6:32 ` Nicholas A. Bellinger 2017-03-21 7:51 ` Potnuri Bharat Teja 0 siblings, 1 reply; 24+ messages in thread From: Nicholas A. Bellinger @ 2017-03-21 6:32 UTC (permalink / raw) To: Potnuri Bharat Teja Cc: SWise OGC, 'Sagi Grimberg', target-devel, linux-rdma Hi Bharat & Co, On Mon, 2017-03-20 at 15:45 +0530, Potnuri Bharat Teja wrote: > On Tuesday, November 11/08/16, 2016 at 15:36:18 +0530, Potnuri Bharat Teja wrote: > > On Monday, October 10/31/16, 2016 at 09:10:08 +0530, Nicholas A. Bellinger > > wrote: > Hi Nicholas, > Tested the following 3 patches for SQ overflow. > https://www.spinics.net/lists/target-devel/msg13571.html Thanks for following up on this. I was just thinking about sending these patches out again, as separate from iser-target bits in patch #3 (and your incremental below) they do address a couple of long-standing target-core issues wrt queue-full handling. > They along with an additional change added on top of those seems to hold good.( Aborts seen > intermittently as shown at the end of mail) > I have been testing the changes using fio with IO sizes ranging from 1KB to 32MB and test is > running fine (despite few Intermittent Aborts). > At what I/O size do the post RDMA_WRITE failures for iw_cxgb4 and associated TMR ABORT_TASK begin..? > Here is what I see with the 3 patches alone applied: > > -> In isert_put_datain() and isert_post_response() a corresponding recv WR is posted before > posting a send and hence for every post failure a recv is already posted into a tightly packed > RQ, causing it to overflow. Just for me to understand, the intermittent TMR ABORT_TASKs are caused by the repeated failure to post RDMA_WRITE WRs for a large ISER Data-In payload, due to mis-sizing of needed WRs from RDMA R/W API vs. underlying hardware capabilities. Moving the recv posts after the send post for RDMA_WRITEs helps to reduce the severity of the issue with iw_cxgb4, but doesn't completely eliminate the issue under load. > > Below are the logs suggesting it: > > [ 189.944714] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res > [ 189.944718] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res > [ 189.944721] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res > [ 189.944724] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res > [ 189.944729] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res > [ 189.944731] isert: isert_post_recv: ib_post_recv() failed with ret: -12 > [ 189.944731] isert: isert_put_datain: ib_post_recv failed with -12 > [ 189.944732] isert: isert_post_recv: ib_post_recv() failed with ret: -12 > [ 189.944733] isert: isert_put_datain: ib_post_recv failed with -12 > [ 189.944734] isert: isert_post_recv: ib_post_recv() failed with ret: -12 > [ 189.944735] isert: isert_put_datain: ib_post_recv failed with -12 > [ 189.944736] isert: isert_post_recv: ib_post_recv() failed with ret: -12 > [ 189.944736] isert: isert_put_datain: ib_post_recv failed with -12 > [ 189.944737] isert: isert_post_recv: ib_post_recv() failed with ret: -12 > [ 189.944738] isert: isert_put_datain: ib_post_recv failed with -12 > [ 189.944739] isert: isert_post_recv: ib_post_recv() failed with ret: -12 > [ 189.944740] isert: isert_put_datain: ib_post_recv failed with -12 > [ 189.944740] isert: isert_post_recv: ib_post_recv() failed with ret: -12 > [ 189.944741] isert: isert_put_datain: ib_post_recv failed with -12 > [ 189.944746] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 failed to post RDMA res > [ 189.944747] isert: isert_post_recv: ib_post_recv() failed with ret: -12 > [ 189.944748] isert: isert_put_datain: ib_post_recv failed with -12 > [ 189.944749] isert: isert_post_recv: ib_post_recv() failed with ret: -12 Mmmm, why does the retry of isert_put_datain() stop here..? Does this delay between the last isert_put_datain() retry and subsequent TMR ABORT_TASKs happen with your incremental patch applied too..? > [ 251.543847] ABORT_TASK: Found referenced iSCSI task_tag: 68 > [ 251.543858] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 68 > [ 251.544071] ABORT_TASK: Found referenced iSCSI task_tag: 46 > [ 251.544076] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 46 > [ 251.544300] ABORT_TASK: Found referenced iSCSI task_tag: 51 > [ 251.544307] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 51 > [ 251.544528] ABORT_TASK: Found referenced iSCSI task_tag: 117 > [ 251.544533] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 117 > [ 282.778402] Unable to locate ITT: 0x00000062 on CID: 0 > [ 282.778404] Unable to locate RefTaskTag: 0x00000062 on CID: 0. > [ 282.778407] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 0 > [ 282.778436] ABORT_TASK: Found referenced iSCSI task_tag: 37 > [ 282.778439] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 37 > [ 282.778519] ABORT_TASK: Found referenced iSCSI task_tag: 65 > [ 282.778522] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 65 > [ 282.778559] Unable to locate ITT: 0x00000019 on CID: 0 > [ 282.778560] Unable to locate RefTaskTag: 0x00000019 on CID: 0. > [ 282.778563] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 0 > [ 293.004632] ABORT_TASK: Found referenced iSCSI task_tag: 22 > [ 293.004643] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 22 > [ 293.004808] TMR Opcode TARGET_WARM_RESET authorization failed for Initiator Node: iqn.1994-05.com.redhat:37fc2b6a5331 > [ 310.265993] iSCSI Login timeout on Network Portal 10.0.2.2:3260 > [ 493.567143] INFO: task iscsi_np:3998 blocked for more than 120 seconds. > [ 493.567149] Tainted: G OE 4.9.0-For-DSGL-Debug+ #1 > [ 493.567151] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 493.567153] iscsi_np D 0 3998 2 0x00000084 > [ 493.567159] 0000000000000000 ffff883028a88b40 ffff88302693f7c0 ffff88302de3e300 > [ 493.567163] ffff88303e057a40 ffffc900008abc00 ffffffff815e415b ffffc900008abca0 > [ 493.567166] ffffffff81af4180 ffff883028a88b40 7fffffffffffffff ffff882f7426db68 > [ 493.567170] Call Trace: > [ 493.567180] [<ffffffff815e415b>] ? __schedule+0x23b/0x6c0 > [ 493.567184] [<ffffffff815e4616>] schedule+0x36/0x80 > [ 493.567187] [<ffffffff815e71c3>] schedule_timeout+0x1f3/0x390 > [ 493.567194] [<ffffffff812c1f2e>] ? number+0x2ce/0x300 > [ 493.567197] [<ffffffff812c42be>] ? vsnprintf+0x32e/0x4b0 > [ 493.567199] [<ffffffff815e56a4>] wait_for_completion+0xb4/0xf0 > [ 493.567201] [<ffffffff8108a6b0>] ? wake_up_q+0x80/0x80 > [ 493.567216] [<ffffffffa05284d1>] iscsit_stop_session+0x1b1/0x1c0 [iscsi_target_mod] > [ 493.567220] [<ffffffffa0519db2>] iscsi_check_for_session_reinstatement+0x1e2/0x270 [iscsi_target_mod] > [ 493.567225] [<ffffffffa051c8c0>] iscsi_target_check_for_existing_instances+0x30/0x40 [iscsi_target_mod] > [ 493.567230] [<ffffffffa051c9ee>] iscsi_target_do_login+0x11e/0x610 [iscsi_target_mod] > [ 493.567235] [<ffffffffa051dbbe>] iscsi_target_start_negotiation+0x4e/0xa0 [iscsi_target_mod] > [ 493.567239] [<ffffffffa051b933>] iscsi_target_login_thread+0xa73/0xfc0 [iscsi_target_mod] > [ 493.567244] [<ffffffffa051aec0>] ? iscsi_target_login_sess_out+0x250/0x250 [iscsi_target_mod] > [ 493.567246] [<ffffffff810660c4>] ? SyS_exit_group+0x14/0x20 > [ 493.567249] [<ffffffff8107ee5a>] kthread+0xca/0xe0 > [ 493.567250] [<ffffffff8107ed90>] ? kthread_park+0x60/0x60 > [ 493.567252] [<ffffffff815e8752>] ret_from_fork+0x22/0x30 > [ 493.567254] INFO: task iscsi_trx:4149 blocked for more than 120 seconds. > [ 493.567255] Tainted: G OE 4.9.0-For-DSGL-Debug+ #1 > [ 493.567256] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 493.567256] iscsi_trx D 0 4149 2 0x00000084 > [ 493.567258] 0000000000000000 ffff882f5ef70340 ffff882f76253040 ffff88302de40340 > [ 493.567259] ffff88303e097a40 ffffc90000f93c90 ffffffff815e415b ffffffff815e56c2 > [ 493.567261] 0000000000000001 ffff882f5ef70340 7fffffffffffffff ffff882d233cc9b0 > [ 493.567262] Call Trace: > [ 493.567264] [<ffffffff815e415b>] ? __schedule+0x23b/0x6c0 > [ 493.567265] [<ffffffff815e56c2>] ? wait_for_completion+0xd2/0xf0 > [ 493.567267] [<ffffffff815e4616>] schedule+0x36/0x80 > [ 493.567268] [<ffffffff815e71c3>] schedule_timeout+0x1f3/0x390 > [ 493.567281] [<ffffffffa118e910>] ? ib_sg_to_pages+0x1a0/0x1a0 [ib_core] > [ 493.567285] [<ffffffffa118e910>] ? ib_sg_to_pages+0x1a0/0x1a0 [ib_core] > [ 493.567286] [<ffffffff815e56a4>] wait_for_completion+0xb4/0xf0 > [ 493.567287] [<ffffffff8108a6b0>] ? wake_up_q+0x80/0x80 > [ 493.567302] [<ffffffffa04d5399>] target_wait_for_sess_cmds+0x49/0x1a0 [target_core_mod] > [ 493.567306] [<ffffffffa0471644>] isert_wait_conn+0x1a4/0x2d0 [ib_isert] > [ 493.567311] [<ffffffffa05289ab>] iscsit_close_connection+0x14b/0x820 [iscsi_target_mod] > [ 493.567315] [<ffffffffa05172ef>] iscsit_take_action_for_connection_exit+0x7f/0x100 [iscsi_target_mod] > [ 493.567320] [<ffffffffa0528265>] iscsi_target_rx_thread+0x95/0xa0 [iscsi_target_mod] > [ 493.567324] [<ffffffffa05281d0>] ? iscsi_target_tx_thread+0x1b0/0x1b0 [iscsi_target_mod] > [ 493.567325] [<ffffffff8107ee5a>] kthread+0xca/0xe0 > [ 493.567327] [<ffffffff8107ed90>] ? kthread_park+0x60/0x60 > [ 493.567328] [<ffffffff815e8752>] ret_from_fork+0x22/0x30 > OK, AFAICT these hung tasks basically mean iser-target never gave target-core back the outstanding se_cmd it was attempting to retry, even up to the point in which session reinstatement was eventually triggered by iser-initiator. So the reason why your patch to swap post_recv -> post_send to post_send -> post_recv makes a difference is because it allows enough trickle of RDMA_WRITEs to make it through, where iser-initiator doesn't attempt to escalate recovery and doesn't attempt session reinstatement. Have you been able to confirm with your incremental patch below that once TMR ABORT_TASKs start to occur due to isert_rdma_rw_ctx_post failures, that any subsequent session reinstatement events for unrelated reasons (or explicit logout..?) doesn't run into the same issue..? > -> In order to address this I tried the following (experimental) change on top of your 3 patches: > > diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c > index 501a0ad124a7..0aead5faa2a2 100644 > --- a/drivers/infiniband/ulp/isert/ib_isert.c > +++ b/drivers/infiniband/ulp/isert/ib_isert.c > @@ -1817,16 +1817,16 @@ isert_post_response(struct isert_conn *isert_conn, struct isert_cmd *isert_cmd) > struct ib_send_wr *wr_failed; > int ret; > > - ret = isert_post_recv(isert_conn, isert_cmd->rx_desc); > + ret = ib_post_send(isert_conn->qp, &isert_cmd->tx_desc.send_wr, > + &wr_failed); > if (ret) { > - isert_err("ib_post_recv failed with %d\n", ret); > + isert_err("ib_post_send failed with %d\n", ret); > return ret; > } > > - ret = ib_post_send(isert_conn->qp, &isert_cmd->tx_desc.send_wr, > - &wr_failed); > + ret = isert_post_recv(isert_conn, isert_cmd->rx_desc); > if (ret) { > - isert_err("ib_post_send failed with %d\n", ret); > + isert_err("ib_post_recv failed with %d\n", ret); > return ret; > } > return ret; > @@ -2166,6 +2166,13 @@ isert_put_datain(struct iscsi_conn *conn, struct iscsi_cmd *cmd) > if (isert_prot_cmd(isert_conn, se_cmd)) { > isert_cmd->tx_desc.tx_cqe.done = isert_rdma_write_done; > cqe = &isert_cmd->tx_desc.tx_cqe; > + > + rc = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr); > + if (rc) { > + isert_dbg("Cmd: %p posted RDMA_WRITE for iSER Data READ rc: %d\n", > + isert_cmd, rc); > + return rc; > + } > } else { > /* > * Build isert_conn->tx_desc for iSCSI response PDU and attach > @@ -2178,18 +2185,21 @@ isert_put_datain(struct iscsi_conn *conn, struct iscsi_cmd *cmd) > isert_init_send_wr(isert_conn, isert_cmd, > &isert_cmd->tx_desc.send_wr); > > + chain_wr = &isert_cmd->tx_desc.send_wr; > + > + rc = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr); > + if (rc) { > + isert_err("Cmd: %p posted RDMA_WRITE for iSER Data READ rc: %d\n", > + isert_cmd, rc); > + return rc; > + } > + > rc = isert_post_recv(isert_conn, isert_cmd->rx_desc); > if (rc) { > isert_err("ib_post_recv failed with %d\n", rc); > return rc; > } > - > - chain_wr = &isert_cmd->tx_desc.send_wr; > } > - rc = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, chain_wr); > - isert_dbg("Cmd: %p posted RDMA_WRITE for iSER Data READ rc: %d\n", > - isert_cmd, rc); > - > return rc; > } > Unsure of the implications for other RNICs + iser-target here.. I'll have to defer to Sagi judgment on this one. ;) > -> With these changes it works fine, But I still see few Aborts intermittently as follows: > > [10674.434643] isert: isert_rdma_rw_ctx_post: Cmd: ffff882cbb961138 failed to post RDMA res > [10674.434690] isert: isert_rdma_rw_ctx_post: Cmd: ffff882cbb9434c0 failed to post RDMA res > [10715.186372] ABORT_TASK: Found referenced iSCSI task_tag: 58 > [10715.186417] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 58 > OK, so there is still a delay between the isert_rdma_rw_ctx_post failures, and receiving TMR ABORT_TASKs. Is target-core -> iser-target aggressively retrying the posting of RDMA_WRITEs via isert_put_datain() during this entire time..? > -> Apart from the Aborts and post failure logs filling dmesg, luns are in state "running" and > fio countinues to run. > [root@bharat ~]# iscsiadm -m session -P3 | grep "Attached scsi disk" > Attached scsi disk sdj State: running > Attached scsi disk sdc State: running > Attached scsi disk sdd State: running > Attached scsi disk sde State: running > Attached scsi disk sdf State: running > Attached scsi disk sdg State: running > Attached scsi disk sdh State: running > Attached scsi disk sdi State: running > Ok, so it looks like you've got the second order issues under control for target-core + iser-target, once the first order issue of RDMA R/W API vs. iw_cxgb4 WR resource mis-sizing continuously triggers failed calls to isert_put_datain() -> isert_rdma_rw_ctx_post() rdma_rw_ctx_post(). That said, I'm going to go ahead and queue patch #1 and #2 from the original series to target-pending/for-next as they do address issues separate from iser-target scenario here, and await on Sagi's feedback for patch #3 and your incremental above. Thanks again. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: SQ overflow seen running isert traffic 2017-03-21 6:32 ` Nicholas A. Bellinger @ 2017-03-21 7:51 ` Potnuri Bharat Teja [not found] ` <20170321075131.GA11565-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org> 0 siblings, 1 reply; 24+ messages in thread From: Potnuri Bharat Teja @ 2017-03-21 7:51 UTC (permalink / raw) To: Nicholas A. Bellinger Cc: SWise OGC, 'Sagi Grimberg', target-devel, linux-rdma On Tuesday, March 03/21/17, 2017 at 12:02:20 +0530, Nicholas A. Bellinger wrote: > Hi Bharat & Co, > > On Mon, 2017-03-20 at 15:45 +0530, Potnuri Bharat Teja wrote: > > On Tuesday, November 11/08/16, 2016 at 15:36:18 +0530, Potnuri Bharat > Teja wrote: > > > On Monday, October 10/31/16, 2016 at 09:10:08 +0530, Nicholas A. > Bellinger > > > wrote: > > Hi Nicholas, > > Tested the following 3 patches for SQ overflow. > > [1]https://www.spinics.net/lists/target-devel/msg13571.html > > Thanks for following up on this. > > I was just thinking about sending these patches out again, as separate > from iser-target bits in patch #3 (and your incremental below) they do > address a couple of long-standing target-core issues wrt queue-full > handling. > > > They along with an additional change added on top of those seems to hold > good.( Aborts seen > > intermittently as shown at the end of mail) > > I have been testing the changes using fio with IO sizes ranging from 1KB > to 32MB and test is > > running fine (despite few Intermittent Aborts). > > > > At what I/O size do the post RDMA_WRITE failures for iw_cxgb4 and > associated TMR ABORT_TASK begin..? Hi Nicholas, I see them from 2MB onwards. > > > Here is what I see with the 3 patches alone applied: > > > > -> In isert_put_datain() and isert_post_response() a corresponding recv > WR is posted before > > posting a send and hence for every post failure a recv is already posted > into a tightly packed > > RQ, causing it to overflow. > > Just for me to understand, the intermittent TMR ABORT_TASKs are caused > by the repeated failure to post RDMA_WRITE WRs for a large ISER Data-In > payload, due to mis-sizing of needed WRs from RDMA R/W API vs. > underlying hardware capabilities. Yes. > > Moving the recv posts after the send post for RDMA_WRITEs helps to > reduce the severity of the issue with iw_cxgb4, but doesn't completely > eliminate the issue under load. Moving recv posts only comes in to effect along with your changes. > > > > > Below are the logs suggesting it: > > > > [ 189.944714] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 > failed to post RDMA res > > [ 189.944718] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 > failed to post RDMA res > > [ 189.944721] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 > failed to post RDMA res > > [ 189.944724] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 > failed to post RDMA res > > [ 189.944729] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 > failed to post RDMA res > > [ 189.944731] isert: isert_post_recv: ib_post_recv() failed with ret: > -12 > > [ 189.944731] isert: isert_put_datain: ib_post_recv failed with -12 > > [ 189.944732] isert: isert_post_recv: ib_post_recv() failed with ret: > -12 > > [ 189.944733] isert: isert_put_datain: ib_post_recv failed with -12 > > [ 189.944734] isert: isert_post_recv: ib_post_recv() failed with ret: > -12 > > [ 189.944735] isert: isert_put_datain: ib_post_recv failed with -12 > > [ 189.944736] isert: isert_post_recv: ib_post_recv() failed with ret: > -12 > > [ 189.944736] isert: isert_put_datain: ib_post_recv failed with -12 > > [ 189.944737] isert: isert_post_recv: ib_post_recv() failed with ret: > -12 > > [ 189.944738] isert: isert_put_datain: ib_post_recv failed with -12 > > [ 189.944739] isert: isert_post_recv: ib_post_recv() failed with ret: > -12 > > [ 189.944740] isert: isert_put_datain: ib_post_recv failed with -12 > > [ 189.944740] isert: isert_post_recv: ib_post_recv() failed with ret: > -12 > > [ 189.944741] isert: isert_put_datain: ib_post_recv failed with -12 > > [ 189.944746] isert: isert_rdma_rw_ctx_post: Cmd: ffff882d233e83d0 > failed to post RDMA res > > [ 189.944747] isert: isert_post_recv: ib_post_recv() failed with ret: > -12 > > [ 189.944748] isert: isert_put_datain: ib_post_recv failed with -12 > > [ 189.944749] isert: isert_post_recv: ib_post_recv() failed with ret: > -12 > > Mmmm, why does the retry of isert_put_datain() stop here..? It is because the iser command continues to loop failing to post either with SQ or RQ failure and reties reposting. Subsequently iser command times out on the Initiator and aborts the command. > > Does this delay between the last isert_put_datain() retry and subsequent > TMR ABORT_TASKs happen with your incremental patch applied too..? With my changes applied, I see a few aborts with high IO sizes like 32MB, 16MB etc. I dont see login time out failure with incremental patch. > > > [ 251.543847] ABORT_TASK: Found referenced iSCSI task_tag: 68 > > [ 251.543858] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 68 > > [ 251.544071] ABORT_TASK: Found referenced iSCSI task_tag: 46 > > [ 251.544076] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 46 > > [ 251.544300] ABORT_TASK: Found referenced iSCSI task_tag: 51 > > [ 251.544307] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 51 > > [ 251.544528] ABORT_TASK: Found referenced iSCSI task_tag: 117 > > [ 251.544533] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: > 117 > > [ 282.778402] Unable to locate ITT: 0x00000062 on CID: 0 > > [ 282.778404] Unable to locate RefTaskTag: 0x00000062 on CID: 0. > > [ 282.778407] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: > 0 > > [ 282.778436] ABORT_TASK: Found referenced iSCSI task_tag: 37 > > [ 282.778439] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 37 > > [ 282.778519] ABORT_TASK: Found referenced iSCSI task_tag: 65 > > [ 282.778522] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 65 > > [ 282.778559] Unable to locate ITT: 0x00000019 on CID: 0 > > [ 282.778560] Unable to locate RefTaskTag: 0x00000019 on CID: 0. > > [ 282.778563] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: > 0 > > [ 293.004632] ABORT_TASK: Found referenced iSCSI task_tag: 22 > > [ 293.004643] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 22 > > [ 293.004808] TMR Opcode TARGET_WARM_RESET authorization failed for > Initiator Node: iqn.1994-05.com.redhat:37fc2b6a5331 > > [ 310.265993] iSCSI Login timeout on Network Portal 10.0.2.2:3260 > > [ 493.567143] INFO: task iscsi_np:3998 blocked for more than 120 > seconds. > > [ 493.567149] Tainted: G OE 4.9.0-For-DSGL-Debug+ #1 > > [ 493.567151] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > > [ 493.567153] iscsi_np D 0 3998 2 0x00000084 > > [ 493.567159] 0000000000000000 ffff883028a88b40 ffff88302693f7c0 > ffff88302de3e300 > > [ 493.567163] ffff88303e057a40 ffffc900008abc00 ffffffff815e415b > ffffc900008abca0 > > [ 493.567166] ffffffff81af4180 ffff883028a88b40 7fffffffffffffff > ffff882f7426db68 > > [ 493.567170] Call Trace: > > [ 493.567180] [<ffffffff815e415b>] ? __schedule+0x23b/0x6c0 > > [ 493.567184] [<ffffffff815e4616>] schedule+0x36/0x80 > > [ 493.567187] [<ffffffff815e71c3>] schedule_timeout+0x1f3/0x390 > > [ 493.567194] [<ffffffff812c1f2e>] ? number+0x2ce/0x300 > > [ 493.567197] [<ffffffff812c42be>] ? vsnprintf+0x32e/0x4b0 > > [ 493.567199] [<ffffffff815e56a4>] wait_for_completion+0xb4/0xf0 > > [ 493.567201] [<ffffffff8108a6b0>] ? wake_up_q+0x80/0x80 > > [ 493.567216] [<ffffffffa05284d1>] iscsit_stop_session+0x1b1/0x1c0 > [iscsi_target_mod] > > [ 493.567220] [<ffffffffa0519db2>] > iscsi_check_for_session_reinstatement+0x1e2/0x270 [iscsi_target_mod] > > [ 493.567225] [<ffffffffa051c8c0>] > iscsi_target_check_for_existing_instances+0x30/0x40 [iscsi_target_mod] > > [ 493.567230] [<ffffffffa051c9ee>] iscsi_target_do_login+0x11e/0x610 > [iscsi_target_mod] > > [ 493.567235] [<ffffffffa051dbbe>] > iscsi_target_start_negotiation+0x4e/0xa0 [iscsi_target_mod] > > [ 493.567239] [<ffffffffa051b933>] > iscsi_target_login_thread+0xa73/0xfc0 [iscsi_target_mod] > > [ 493.567244] [<ffffffffa051aec0>] ? > iscsi_target_login_sess_out+0x250/0x250 [iscsi_target_mod] > > [ 493.567246] [<ffffffff810660c4>] ? SyS_exit_group+0x14/0x20 > > [ 493.567249] [<ffffffff8107ee5a>] kthread+0xca/0xe0 > > [ 493.567250] [<ffffffff8107ed90>] ? kthread_park+0x60/0x60 > > [ 493.567252] [<ffffffff815e8752>] ret_from_fork+0x22/0x30 > > [ 493.567254] INFO: task iscsi_trx:4149 blocked for more than 120 > seconds. > > [ 493.567255] Tainted: G OE 4.9.0-For-DSGL-Debug+ #1 > > [ 493.567256] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > > [ 493.567256] iscsi_trx D 0 4149 2 0x00000084 > > [ 493.567258] 0000000000000000 ffff882f5ef70340 ffff882f76253040 > ffff88302de40340 > > [ 493.567259] ffff88303e097a40 ffffc90000f93c90 ffffffff815e415b > ffffffff815e56c2 > > [ 493.567261] 0000000000000001 ffff882f5ef70340 7fffffffffffffff > ffff882d233cc9b0 > > [ 493.567262] Call Trace: > > [ 493.567264] [<ffffffff815e415b>] ? __schedule+0x23b/0x6c0 > > [ 493.567265] [<ffffffff815e56c2>] ? wait_for_completion+0xd2/0xf0 > > [ 493.567267] [<ffffffff815e4616>] schedule+0x36/0x80 > > [ 493.567268] [<ffffffff815e71c3>] schedule_timeout+0x1f3/0x390 > > [ 493.567281] [<ffffffffa118e910>] ? ib_sg_to_pages+0x1a0/0x1a0 > [ib_core] > > [ 493.567285] [<ffffffffa118e910>] ? ib_sg_to_pages+0x1a0/0x1a0 > [ib_core] > > [ 493.567286] [<ffffffff815e56a4>] wait_for_completion+0xb4/0xf0 > > [ 493.567287] [<ffffffff8108a6b0>] ? wake_up_q+0x80/0x80 > > [ 493.567302] [<ffffffffa04d5399>] > target_wait_for_sess_cmds+0x49/0x1a0 [target_core_mod] > > [ 493.567306] [<ffffffffa0471644>] isert_wait_conn+0x1a4/0x2d0 > [ib_isert] > > [ 493.567311] [<ffffffffa05289ab>] iscsit_close_connection+0x14b/0x820 > [iscsi_target_mod] > > [ 493.567315] [<ffffffffa05172ef>] > iscsit_take_action_for_connection_exit+0x7f/0x100 [iscsi_target_mod] > > [ 493.567320] [<ffffffffa0528265>] iscsi_target_rx_thread+0x95/0xa0 > [iscsi_target_mod] > > [ 493.567324] [<ffffffffa05281d0>] ? > iscsi_target_tx_thread+0x1b0/0x1b0 [iscsi_target_mod] > > [ 493.567325] [<ffffffff8107ee5a>] kthread+0xca/0xe0 > > [ 493.567327] [<ffffffff8107ed90>] ? kthread_park+0x60/0x60 > > [ 493.567328] [<ffffffff815e8752>] ret_from_fork+0x22/0x30 > > > > OK, AFAICT these hung tasks basically mean iser-target never gave > target-core back the outstanding se_cmd it was attempting to retry, even > up to the point in which session reinstatement was eventually triggered > by iser-initiator. Yes. > > So the reason why your patch to swap post_recv -> post_send to post_send > -> post_recv makes a difference is because it allows enough trickle of > RDMA_WRITEs to make it through, where iser-initiator doesn't attempt to > escalate recovery and doesn't attempt session reinstatement. I dont exactly know if above thing comes into play but the actual reason I did swap posting RQ and SQ is, unlike SQ, RQ is posted with WRs to the brim during the intialisation itself. From thereon we post a RQ WR for every RQ completion That makes it almost full at any point of time. Now in our scenario, SQ is miscalulated and too small for few adapters and so filled gradually as the IO starts. Once SQ is full, according to your patches isert queues it and tries to repost the command again. Here in iser functions like isert_post_response(), isert_put_datain() post send is done after post recv. For the first post send failure in say isert_put_datain(), the corresponding post recv is already posted, then on queuing the command and trying reposting an extra recv is again posted which fills up the RQ also. By swapping post recv and send as in my incermental patch, we dont post that extra recv, and post recv only on successful post send. Therfore I think this incremental patch is necessary. > > Have you been able to confirm with your incremental patch below that > once TMR ABORT_TASKs start to occur due to isert_rdma_rw_ctx_post > failures, that any subsequent session reinstatement events for unrelated > reasons (or explicit logout..?) doesn't run into the same issue..? Yes, It is tested good! I've left it to run from yesterday. IO is still running with out any login timeouts. [root@bharat ~]# iscsiadm -m session -P3 | grep "Attached scsi disk" Attached scsi disk sdj State: running Attached scsi disk sdc State: running Attached scsi disk sdd State: running Attached scsi disk sde State: running Attached scsi disk sdf State: running Attached scsi disk sdg State: running Attached scsi disk sdh State: running Attached scsi disk sdi State: running [root@bharat ~]# uptime 13:12:04 up 1 day, 1:22, 3 users, load average: 8.11, 8.30, 7.26 > > > -> In order to address this I tried the following (experimental) change > on top of your 3 patches: > > > > diff --git a/drivers/infiniband/ulp/isert/ib_isert.c > b/drivers/infiniband/ulp/isert/ib_isert.c > > index 501a0ad124a7..0aead5faa2a2 100644 > > --- a/drivers/infiniband/ulp/isert/ib_isert.c > > +++ b/drivers/infiniband/ulp/isert/ib_isert.c > > @@ -1817,16 +1817,16 @@ isert_post_response(struct isert_conn > *isert_conn, struct isert_cmd *isert_cmd) > > struct ib_send_wr *wr_failed; > > int ret; > > > > - ret = isert_post_recv(isert_conn, isert_cmd->rx_desc); > > + ret = ib_post_send(isert_conn->qp, &isert_cmd->tx_desc.send_wr, > > + &wr_failed); > > if (ret) { > > - isert_err("ib_post_recv failed with %d\n", ret); > > + isert_err("ib_post_send failed with %d\n", ret); > > return ret; > > } > > > > - ret = ib_post_send(isert_conn->qp, &isert_cmd->tx_desc.send_wr, > > - &wr_failed); > > + ret = isert_post_recv(isert_conn, isert_cmd->rx_desc); > > if (ret) { > > - isert_err("ib_post_send failed with %d\n", ret); > > + isert_err("ib_post_recv failed with %d\n", ret); > > return ret; > > } > > return ret; > > @@ -2166,6 +2166,13 @@ isert_put_datain(struct iscsi_conn *conn, struct > iscsi_cmd *cmd) > > if (isert_prot_cmd(isert_conn, se_cmd)) { > > isert_cmd->tx_desc.tx_cqe.done = isert_rdma_write_done; > > cqe = &isert_cmd->tx_desc.tx_cqe; > > + > > + rc = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, > chain_wr); > > + if (rc) { > > + isert_dbg("Cmd: %p posted RDMA_WRITE for iSER > Data READ rc: %d\n", > > + isert_cmd, rc); > > + return rc; > > + } > > } else { > > /* > > * Build isert_conn->tx_desc for iSCSI response PDU and > attach > > @@ -2178,18 +2185,21 @@ isert_put_datain(struct iscsi_conn *conn, struct > iscsi_cmd *cmd) > > isert_init_send_wr(isert_conn, isert_cmd, > > &isert_cmd->tx_desc.send_wr); > > > > + chain_wr = &isert_cmd->tx_desc.send_wr; > > + > > + rc = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, > chain_wr); > > + if (rc) { > > + isert_err("Cmd: %p posted RDMA_WRITE for iSER > Data READ rc: %d\n", > > + isert_cmd, rc); > > + return rc; > > + } > > + > > rc = isert_post_recv(isert_conn, isert_cmd->rx_desc); > > if (rc) { > > isert_err("ib_post_recv failed with %d\n", rc); > > return rc; > > } > > - > > - chain_wr = &isert_cmd->tx_desc.send_wr; > > } > > - rc = isert_rdma_rw_ctx_post(isert_cmd, isert_conn, cqe, > chain_wr); > > - isert_dbg("Cmd: %p posted RDMA_WRITE for iSER Data READ rc: > %d\n", > > - isert_cmd, rc); > > - > > return rc; > > } > > > > Unsure of the implications for other RNICs + iser-target here.. I believe it is needed. Lets wait for Sagi's opinion. > > I'll have to defer to Sagi judgment on this one. ;) > > > -> With these changes it works fine, But I still see few Aborts > intermittently as follows: > > > > [10674.434643] isert: isert_rdma_rw_ctx_post: Cmd: ffff882cbb961138 > failed to post RDMA res > > [10674.434690] isert: isert_rdma_rw_ctx_post: Cmd: ffff882cbb9434c0 > failed to post RDMA res > > [10715.186372] ABORT_TASK: Found referenced iSCSI task_tag: 58 > > [10715.186417] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 58 > > > > OK, so there is still a delay between the isert_rdma_rw_ctx_post > failures, and receiving TMR ABORT_TASKs. > > Is target-core -> iser-target aggressively retrying the posting of > RDMA_WRITEs via isert_put_datain() during this entire time..? > Yes, at higher IO sizes it happens. I belive that in addition to this Queue failure handling. There should be flow control mechanism to slow down the posts on ENOMEM/ post failures. I see something like that happening in Initiator: iscsi_iser.h: /* the max TX (send) WR supported by the iSER QP is defined by * * max_send_wr = T * (1 + D) + C ; D is how many inflight dataouts we expect * * to have at max for SCSI command. The tx posting & completion handling code * * supports -EAGAIN scheme where tx is suspended till the QP has room for more * * send WR. D=8 comes from 64K/8K */ "where tx is suspended til the QP has room for more send WR" I am not sure if that is done on target side. > > -> Apart from the Aborts and post failure logs filling dmesg, luns are > in state "running" and > > fio countinues to run. > > [root@bharat ~]# iscsiadm -m session -P3 | grep "Attached scsi disk" > > Attached scsi disk sdj State: running > > Attached scsi disk sdc State: running > > Attached scsi disk sdd State: running > > Attached scsi disk sde State: running > > Attached scsi disk sdf State: running > > Attached scsi disk sdg State: running > > Attached scsi disk sdh State: running > > Attached scsi disk sdi State: running > > > > Ok, so it looks like you've got the second order issues under control > for target-core + iser-target, once the first order issue of RDMA R/W > API vs. iw_cxgb4 WR resource mis-sizing continuously triggers failed > calls to isert_put_datain() -> isert_rdma_rw_ctx_post() > rdma_rw_ctx_post(). Yes, Thank you for the patches :). > > That said, I'm going to go ahead and queue patch #1 and #2 from the > original series to target-pending/for-next as they do address issues > separate from iser-target scenario here, and await on Sagi's feedback > for patch #3 and your incremental above. > > Thanks again. > > References > > Visible links > 1. https://www.spinics.net/lists/target-devel/msg13571.html ^ permalink raw reply [flat|nested] 24+ messages in thread
[parent not found: <20170321075131.GA11565-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>]
* Re: SQ overflow seen running isert traffic [not found] ` <20170321075131.GA11565-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org> @ 2017-03-21 13:52 ` Sagi Grimberg [not found] ` <945e2947-f67a-4202-cd27-d4631fe10f68-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> 0 siblings, 1 reply; 24+ messages in thread From: Sagi Grimberg @ 2017-03-21 13:52 UTC (permalink / raw) To: Potnuri Bharat Teja, Nicholas A. Bellinger Cc: SWise OGC, target-devel-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA Hi Baharat and Nic, Apologies for the late reply, > Hi Nicholas, > I see them from 2MB onwards. >> >> > Here is what I see with the 3 patches alone applied: >> > >> > -> In isert_put_datain() and isert_post_response() a corresponding recv >> WR is posted before >> > posting a send and hence for every post failure a recv is already posted >> into a tightly packed >> > RQ, causing it to overflow. >> >> Just for me to understand, the intermittent TMR ABORT_TASKs are caused >> by the repeated failure to post RDMA_WRITE WRs for a large ISER Data-In >> payload, due to mis-sizing of needed WRs from RDMA R/W API vs. >> underlying hardware capabilities. > Yes. >> >> Moving the recv posts after the send post for RDMA_WRITEs helps to >> reduce the severity of the issue with iw_cxgb4, but doesn't completely >> eliminate the issue under load. > Moving recv posts only comes in to effect along with your changes. ... >> So the reason why your patch to swap post_recv -> post_send to post_send >> -> post_recv makes a difference is because it allows enough trickle of >> RDMA_WRITEs to make it through, where iser-initiator doesn't attempt to >> escalate recovery and doesn't attempt session reinstatement. > I dont exactly know if above thing comes into play but the actual reason I did > swap posting RQ and SQ is, unlike SQ, RQ is posted with WRs to the brim during > the intialisation itself. From thereon we post a RQ WR for every RQ completion > That makes it almost full at any point of time. > > Now in our scenario, SQ is miscalulated and too small for few adapters and so > filled gradually as the IO starts. Once SQ is full, according to your patches > isert queues it and tries to repost the command again. Here in iser functions > like isert_post_response(), isert_put_datain() post send is done after post recv. > For the first post send failure in say isert_put_datain(), the corresponding > post recv is already posted, then on queuing the command and trying reposting > an extra recv is again posted which fills up the RQ also. > > By swapping post recv and send as in my incermental patch, we dont post that > extra recv, and post recv only on successful post send. > Therfore I think this incremental patch is necessary. Reversing the order to recv and send posting will cause problems in stress IO workloads (especially for iWARP). The problem of sending a reply before reposting the recv buffer is that the initiator can send immediately a new request and we don't have a recv buffer waiting for it, which will cause RNR-NAK. This *will* cause performance drops and jitters for sure. How about we just track the rx_desc to know if we already posted it as a start (untested as I don't have access to RDMA HW this week): -- diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 9b33c0c97468..fcbed35e95a8 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -817,6 +817,7 @@ isert_post_recvm(struct isert_conn *isert_conn, u32 count) rx_wr->sg_list = &rx_desc->rx_sg; rx_wr->num_sge = 1; rx_wr->next = rx_wr + 1; + rx_desc->in_use = false; } rx_wr--; rx_wr->next = NULL; /* mark end of work requests list */ @@ -835,6 +836,15 @@ isert_post_recv(struct isert_conn *isert_conn, struct iser_rx_desc *rx_desc) struct ib_recv_wr *rx_wr_failed, rx_wr; int ret; + if (!rx_desc->in_use) { + /* + * if the descriptor is not in-use we already reposted it + * for recv, so just silently return + */ + return 0; + } + + rx_desc->in_use = false; rx_wr.wr_cqe = &rx_desc->rx_cqe; rx_wr.sg_list = &rx_desc->rx_sg; rx_wr.num_sge = 1; @@ -1397,6 +1407,8 @@ isert_recv_done(struct ib_cq *cq, struct ib_wc *wc) return; } + rx_desc->in_use = true; + ib_dma_sync_single_for_cpu(ib_dev, rx_desc->dma_addr, ISER_RX_PAYLOAD_SIZE, DMA_FROM_DEVICE); diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h index c02ada57d7f5..87d994de8c91 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.h +++ b/drivers/infiniband/ulp/isert/ib_isert.h @@ -60,7 +60,7 @@ #define ISER_RX_PAD_SIZE (ISCSI_DEF_MAX_RECV_SEG_LEN + 4096 - \ (ISER_RX_PAYLOAD_SIZE + sizeof(u64) + sizeof(struct ib_sge) + \ - sizeof(struct ib_cqe))) + sizeof(struct ib_cqe) + sizeof(bool))) #define ISCSI_ISER_SG_TABLESIZE 256 @@ -85,6 +85,7 @@ struct iser_rx_desc { u64 dma_addr; struct ib_sge rx_sg; struct ib_cqe rx_cqe; + bool in_use; char pad[ISER_RX_PAD_SIZE]; } __packed; -- We have a lot of room for cleanups in isert... I'll need to make some time to get it going... I'll be waiting to hear from you if it makes your issue go away. Cheers, Sagi. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply related [flat|nested] 24+ messages in thread
[parent not found: <945e2947-f67a-4202-cd27-d4631fe10f68-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>]
* Re: [SPAMMY (7.002)]Re: SQ overflow seen running isert traffic [not found] ` <945e2947-f67a-4202-cd27-d4631fe10f68-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> @ 2017-03-21 15:25 ` Potnuri Bharat Teja [not found] ` <20170321152506.GA32655-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org> 0 siblings, 1 reply; 24+ messages in thread From: Potnuri Bharat Teja @ 2017-03-21 15:25 UTC (permalink / raw) To: Sagi Grimberg Cc: Nicholas A. Bellinger, SWise OGC, target-devel-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA On Tuesday, March 03/21/17, 2017 at 19:22:30 +0530, Sagi Grimberg wrote: Hi Sagi, The patch works fine! > Hi Baharat and Nic, > > Apologies for the late reply, > > > Hi Nicholas, > > I see them from 2MB onwards. > >> > >> > Here is what I see with the 3 patches alone applied: > >> > > >> > -> In isert_put_datain() and isert_post_response() a corresponding > recv > >> WR is posted before > >> > posting a send and hence for every post failure a recv is already > posted > >> into a tightly packed > >> > RQ, causing it to overflow. > >> > >> Just for me to understand, the intermittent TMR ABORT_TASKs are > caused > >> by the repeated failure to post RDMA_WRITE WRs for a large ISER > Data-In > >> payload, due to mis-sizing of needed WRs from RDMA R/W API vs. > >> underlying hardware capabilities. > > Yes. > >> > >> Moving the recv posts after the send post for RDMA_WRITEs helps to > >> reduce the severity of the issue with iw_cxgb4, but doesn't > completely > >> eliminate the issue under load. > > Moving recv posts only comes in to effect along with your changes. > > ... > > >> So the reason why your patch to swap post_recv -> post_send to > post_send > >> -> post_recv makes a difference is because it allows enough trickle > of > >> RDMA_WRITEs to make it through, where iser-initiator doesn't attempt > to > >> escalate recovery and doesn't attempt session reinstatement. > > I dont exactly know if above thing comes into play but the actual reason > I did > > swap posting RQ and SQ is, unlike SQ, RQ is posted with WRs to the brim > during > > the intialisation itself. From thereon we post a RQ WR for every RQ > completion > > That makes it almost full at any point of time. > > > > Now in our scenario, SQ is miscalulated and too small for few adapters > and so > > filled gradually as the IO starts. Once SQ is full, according to your > patches > > isert queues it and tries to repost the command again. Here in iser > functions > > like isert_post_response(), isert_put_datain() post send is done after > post recv. > > For the first post send failure in say isert_put_datain(), the > corresponding > > post recv is already posted, then on queuing the command and trying > reposting > > an extra recv is again posted which fills up the RQ also. > > > > By swapping post recv and send as in my incermental patch, we dont post > that > > extra recv, and post recv only on successful post send. > > Therfore I think this incremental patch is necessary. > > Reversing the order to recv and send posting will cause problems > in stress IO workloads (especially for iWARP). The problem of sending > a reply before reposting the recv buffer is that the initiator can send > immediately a new request and we don't have a recv buffer waiting for > it, which will cause RNR-NAK. This *will* cause performance drops and > jitters for sure. Totally agree with you. > > How about we just track the rx_desc to know if we already posted it as > a start (untested as I don't have access to RDMA HW this week): > -- > diff --git a/drivers/infiniband/ulp/isert/ib_isert.c > b/drivers/infiniband/ulp/isert/ib_isert.c > index 9b33c0c97468..fcbed35e95a8 100644 > --- a/drivers/infiniband/ulp/isert/ib_isert.c > +++ b/drivers/infiniband/ulp/isert/ib_isert.c > @@ -817,6 +817,7 @@ isert_post_recvm(struct isert_conn *isert_conn, u32 > count) > rx_wr->sg_list = &rx_desc->rx_sg; > rx_wr->num_sge = 1; > rx_wr->next = rx_wr + 1; > + rx_desc->in_use = false; > } > rx_wr--; > rx_wr->next = NULL; /* mark end of work requests list */ > @@ -835,6 +836,15 @@ isert_post_recv(struct isert_conn *isert_conn, > struct iser_rx_desc *rx_desc) > struct ib_recv_wr *rx_wr_failed, rx_wr; > int ret; > > + if (!rx_desc->in_use) { > + /* > + * if the descriptor is not in-use we already reposted it > + * for recv, so just silently return > + */ > + return 0; > + } > + > + rx_desc->in_use = false; > rx_wr.wr_cqe = &rx_desc->rx_cqe; > rx_wr.sg_list = &rx_desc->rx_sg; > rx_wr.num_sge = 1; > @@ -1397,6 +1407,8 @@ isert_recv_done(struct ib_cq *cq, struct ib_wc *wc) > return; > } > > + rx_desc->in_use = true; > + > ib_dma_sync_single_for_cpu(ib_dev, rx_desc->dma_addr, > ISER_RX_PAYLOAD_SIZE, DMA_FROM_DEVICE); > > diff --git a/drivers/infiniband/ulp/isert/ib_isert.h > b/drivers/infiniband/ulp/isert/ib_isert.h > index c02ada57d7f5..87d994de8c91 100644 > --- a/drivers/infiniband/ulp/isert/ib_isert.h > +++ b/drivers/infiniband/ulp/isert/ib_isert.h > @@ -60,7 +60,7 @@ > > #define ISER_RX_PAD_SIZE (ISCSI_DEF_MAX_RECV_SEG_LEN + 4096 - \ > (ISER_RX_PAYLOAD_SIZE + sizeof(u64) + sizeof(struct > ib_sge) + \ > - sizeof(struct ib_cqe))) > + sizeof(struct ib_cqe) + sizeof(bool))) > > #define ISCSI_ISER_SG_TABLESIZE 256 > > @@ -85,6 +85,7 @@ struct iser_rx_desc { > u64 dma_addr; > struct ib_sge rx_sg; > struct ib_cqe rx_cqe; > + bool in_use; > char pad[ISER_RX_PAD_SIZE]; > } __packed; > -- > > We have a lot of room for cleanups in isert... I'll need to > make some time to get it going... > > I'll be waiting to hear from you if it makes your issue go away. Test runs fine and it solved the issue. Thanks for the patch! > > Cheers, > Sagi. Thanks, Bharat. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 24+ messages in thread
[parent not found: <20170321152506.GA32655-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org>]
* Re: [SPAMMY (7.002)]Re: SQ overflow seen running isert traffic [not found] ` <20170321152506.GA32655-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org> @ 2017-03-21 16:38 ` Sagi Grimberg [not found] ` <4dab6b43-20d3-86f0-765a-be0851e9f4a0-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> 0 siblings, 1 reply; 24+ messages in thread From: Sagi Grimberg @ 2017-03-21 16:38 UTC (permalink / raw) To: Potnuri Bharat Teja Cc: Nicholas A. Bellinger, SWise OGC, target-devel-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA > Test runs fine and it solved the issue. > Thanks for the patch! Cool, can I add your "Tested-by" tag on a formal patch then? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 24+ messages in thread
[parent not found: <4dab6b43-20d3-86f0-765a-be0851e9f4a0-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org>]
* Re: [SPAMMY (7.002)]Re: SQ overflow seen running isert traffic [not found] ` <4dab6b43-20d3-86f0-765a-be0851e9f4a0-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> @ 2017-03-21 17:50 ` Potnuri Bharat Teja 0 siblings, 0 replies; 24+ messages in thread From: Potnuri Bharat Teja @ 2017-03-21 17:50 UTC (permalink / raw) To: Sagi Grimberg Cc: Nicholas A. Bellinger, SWise OGC, target-devel-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA On Tuesday, March 03/21/17, 2017 at 22:08:04 +0530, Sagi Grimberg wrote: > > Test runs fine and it solved the issue. > > Thanks for the patch! > > Cool, can I add your "Tested-by" tag on a formal patch then? Sure, Please add. Tested-by: Potnuri Bharat Teja <bharat-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org> Thanks. > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > More majordomo info at [1]http://vger.kernel.org/majordomo-info.html > > References > > Visible links > 1. http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2017-03-21 17:50 UTC | newest] Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-09-27 7:01 RQ overflow seen running isert traffic Potnuri Bharat Teja [not found] ` <20160927070157.GA13140-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org> 2016-09-29 14:12 ` Steve Wise 2016-10-05 6:14 ` Sagi Grimberg 2016-10-17 11:16 ` Potnuri Bharat Teja 2016-10-17 18:29 ` Steve Wise 2016-10-18 8:04 ` Sagi Grimberg 2016-10-18 11:28 ` SQ " Potnuri Bharat Teja 2016-10-18 13:17 ` Sagi Grimberg [not found] ` <ed7ebb39-be81-00b3-ef23-3f4c0e3afbb1-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> 2016-10-18 14:34 ` Steve Wise 2016-10-18 16:13 ` Jason Gunthorpe 2016-10-18 19:03 ` Steve Wise 2016-10-20 8:34 ` Sagi Grimberg [not found] ` <f7a4b395-1786-3c7a-7639-195e830db5ad-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> 2017-03-20 13:05 ` Potnuri Bharat Teja 2017-03-20 15:04 ` Steve Wise 2016-10-31 3:40 ` Nicholas A. Bellinger 2016-11-02 17:03 ` Steve Wise [not found] ` <1477885208.27946.8.camel-XoQW25Eq2zviZyQQd+hFbcojREIfoBdhmpATvIKMPHk@public.gmane.org> 2016-11-08 10:06 ` Potnuri Bharat Teja 2017-03-20 10:15 ` Potnuri Bharat Teja 2017-03-21 6:32 ` Nicholas A. Bellinger 2017-03-21 7:51 ` Potnuri Bharat Teja [not found] ` <20170321075131.GA11565-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org> 2017-03-21 13:52 ` Sagi Grimberg [not found] ` <945e2947-f67a-4202-cd27-d4631fe10f68-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> 2017-03-21 15:25 ` [SPAMMY (7.002)]Re: " Potnuri Bharat Teja [not found] ` <20170321152506.GA32655-ut6Up61K2wZBDgjK7y7TUQ@public.gmane.org> 2017-03-21 16:38 ` Sagi Grimberg [not found] ` <4dab6b43-20d3-86f0-765a-be0851e9f4a0-NQWnxTmZq1alnMjI0IkVqw@public.gmane.org> 2017-03-21 17:50 ` Potnuri Bharat Teja
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.