From: "Steve Wise" <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: 'Chuck Lever' <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: 'Devesh Sharma' <Devesh.Sharma-iH1Dq9VlAzfQT0dZR+AlfA@public.gmane.org>,
    'Linux NFS Mailing List' <linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
    linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
    'Trond Myklebust' <trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
Subject: RE: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
Date: Thu, 10 Apr 2014 13:34:56 -0500
Message-ID: <006601cf54eb$92488e30$b6d9aa90$@opengridcomputing.com>
In-Reply-To: <D7836AB3-FCB6-40EF-9954-B58A05A87791-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>

> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org]
> Sent: Thursday, April 10, 2014 12:44 PM
> To: Steve Wise
> Cc: Devesh Sharma; Linux NFS Mailing List; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Trond Myklebust
> Subject: Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
>
>
> On Apr 10, 2014, at 11:01 AM, Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org> wrote:
>
> > On 4/9/2014 7:26 PM, Chuck Lever wrote:
> >> On Apr 9, 2014, at 7:56 PM, Devesh Sharma <Devesh.Sharma-iH1Dq9VlAzfQT0dZR+AlfA@public.gmane.org> wrote:
> >>
> >>> Hi Chuck and Trond
> >>>
> >>> I will resend a v2 for this.
> >>> What if ib_post_send() fails with an immediate error? In that case
> >>> DECR_CQCOUNT() will also be called, but no completion will be
> >>> reported. Will that not cause any problems?
> >> We should investigate whether an error return from
> >> ib_post_{send,recv} means there will be no completion. But I've never
> >> seen these verbs fail in practice, so I'm not in a hurry to make work
> >> for anyone! ;-)
> >
> > A synchronous failure from ib_post_* means the WR (or at least one of
> > them if there were more than one) failed and did not get submitted to
> > HW. So there will be no completion for those that failed.
>
> OK.
>
> Our post operations are largely single WRs. Before we address CQCOUNT in
> error cases, we'd have to deal with chained WRs.
>
> Chained WRs are used only when rpcrdma_register_frmr_external() finds an
> MR that hasn't been invalidated. That's actually working around a FRMR
> re-use bug (commit 5c635e09). If the underlying re-use problem was fixed,
> we could get rid of the chained WR in register_frmr_external() (and we
> wouldn't need completions at all for FAST_REG_MR).
>
> But at 100,000 feet, if a post operation fails, that seems like a very
> serious issue. I wonder whether we would be better off disconnecting and
> starting over in those cases.
>

I agree. The application is responsible for flow-controlling its posting of
WRs to the SQs/RQs, so we should never see sync failures from ib_post_* due
to over-running the queues. However, if the QP moves out of RTS for whatever
reason, then a multi-threaded application could encounter sync failures
because the QP exited RTS. Anyway, I agree: if there are any failures with
ib_post_*, the application should kill the connection (LOG SOMETHING!) and
set up a new connection.

My 2 centimes. :)

Steve
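For illustration only, here is a rough user-space sketch of the policy described in the reply above: the poster flow-controls itself against the send queue depth, and any synchronous post failure is logged and treated as fatal to the connection. Everything in it is an assumption made for the sketch: `struct conn`, `conn_post()`, `conn_complete()`, and the `post_ok`/`post_fail` stand-ins are hypothetical names, not xprtrdma or verbs API. A real caller would pass something that wraps ib_post_send() and would schedule the actual disconnect and reconnect elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical return codes for this sketch. */
enum { CONN_OK = 0, CONN_BUSY = 1, CONN_DEAD = 2 };

/* Hypothetical per-connection state; not taken from xprtrdma. */
struct conn {
    int  sq_depth;     /* send queue capacity chosen when the QP was created */
    int  sq_inflight;  /* WRs posted whose completions have not arrived yet */
    bool connected;    /* cleared as soon as a post fails synchronously */
};

/* Stand-in for a post verb: 0 on success, nonzero on synchronous failure. */
typedef int (*post_fn)(void *wr);

/* Mock posters used only to exercise the sketch. */
static int post_ok(void *wr)   { (void)wr; return 0; }
static int post_fail(void *wr) { (void)wr; return -1; }

static void conn_init(struct conn *c, int sq_depth)
{
    assert(sq_depth > 0);
    c->sq_depth = sq_depth;
    c->sq_inflight = 0;
    c->connected = true;
}

/*
 * Post one WR. The caller never over-runs the queue (CONN_BUSY means
 * "wait for a completion first"). A synchronous failure produces no
 * completion, so no slot is consumed; the failure is logged and the
 * connection is marked dead so the caller can tear it down and reconnect.
 */
static int conn_post(struct conn *c, post_fn post, void *wr)
{
    int rc;

    if (!c->connected)
        return CONN_DEAD;
    if (c->sq_inflight >= c->sq_depth)
        return CONN_BUSY;

    rc = post(wr);
    if (rc != 0) {
        fprintf(stderr, "post failed (rc=%d), dropping connection\n", rc);
        c->connected = false;
        return CONN_DEAD;
    }
    c->sq_inflight++;
    return CONN_OK;
}

/* Call once per reaped send completion to release a queue slot. */
static void conn_complete(struct conn *c)
{
    if (c->sq_inflight > 0)
        c->sq_inflight--;
}
```

The point of the sketch is the accounting: a slot is taken only after a successful post, so a failed post never leaves the poster waiting for a completion that will not come.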