All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Steve Wise" <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
To: 'Chuck Lever' <chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
Cc: 'Devesh Sharma'
	<Devesh.Sharma-iH1Dq9VlAzfQT0dZR+AlfA@public.gmane.org>,
	'Linux NFS Mailing List'
	<linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	'Trond Myklebust'
	<trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
Subject: RE: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
Date: Thu, 10 Apr 2014 13:34:56 -0500	[thread overview]
Message-ID: <006601cf54eb$92488e30$b6d9aa90$@opengridcomputing.com> (raw)
In-Reply-To: <D7836AB3-FCB6-40EF-9954-B58A05A87791-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>



> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org]
> Sent: Thursday, April 10, 2014 12:44 PM
> To: Steve Wise
> Cc: Devesh Sharma; Linux NFS Mailing List; linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Trond Myklebust
> Subject: Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
> 
> 
> On Apr 10, 2014, at 11:01 AM, Steve Wise <swise-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org> wrote:
> 
> > On 4/9/2014 7:26 PM, Chuck Lever wrote:
> >> On Apr 9, 2014, at 7:56 PM, Devesh Sharma <Devesh.Sharma-iH1Dq9VlAzfQT0dZR+AlfA@public.gmane.org> wrote:
> >>
> >>> Hi Chuk and Trond
> >>>
> >>> I will resend a v2 for this.
> >>> What if ib_post_send() fails with immidate error, I that case also DECR_CQCOUNT()
will
> be called but no completion will be reported. Will that not cause any problems?
> >> We should investigate whether an error return from ib_post_{send,recv} means there
will
> be no completion. But I've never seen these verbs fail in practice, so I'm not in a
hurry to make
> work for anyone! ;-)
> >
> > A synchronous failure from ib_post_* means the WR (or at least one of them if there
were >
> 1) failed and did not get submitted to HW.  So there will be no completion for those
that failed.
> 
> OK.
> 
> Our post operations are largely single WRs. Before we address CQCOUNT in error cases,
we'd
> have to deal with chained WRs.
> 
> Chained WRs are used only when rpcrdma_register_frmr_external() finds an MR that hasn't
> been invalidated. That's actually working around a FRMR re-use bug (commit 5c635e09). If
the
> underlying re-use problem was fixed, we could get rid of the chained WR in
> register_frmr_external() (and we wouldn't need completions at all for FAST_REG_MR).
> 
> But at 100,000 feet, if a post operation fails, that seems like a very serious issue. I
wonder
> whether we would be better off disconnecting and starting over in those cases.
> 

I agree. The application is responsible to flow-control its posting of WRs to the SQs/RQs.
So we should never see sync failures with ib_post_* due to over-running the queues.
However, if the QP moves out of RTS for whatever reason, then a multi-threaded application
could encounter sync failures because the QP exited RTS.  Anyway, I agree: if there are
any failures with ib_post_*, the application should kill the connection, (LOG SOMETHING!),
and setup a new connection.

My 2 centimes.  :)


Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

WARNING: multiple messages have this Message-ID (diff)
From: "Steve Wise" <swise@opengridcomputing.com>
To: "'Chuck Lever'" <chuck.lever@oracle.com>
Cc: "'Devesh Sharma'" <Devesh.Sharma@Emulex.Com>,
	"'Linux NFS Mailing List'" <linux-nfs@vger.kernel.org>,
	<linux-rdma@vger.kernel.org>,
	"'Trond Myklebust'" <trond.myklebust@primarydata.com>
Subject: RE: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
Date: Thu, 10 Apr 2014 13:34:56 -0500	[thread overview]
Message-ID: <006601cf54eb$92488e30$b6d9aa90$@opengridcomputing.com> (raw)
In-Reply-To: <D7836AB3-FCB6-40EF-9954-B58A05A87791@oracle.com>



> -----Original Message-----
> From: Chuck Lever [mailto:chuck.lever@oracle.com]
> Sent: Thursday, April 10, 2014 12:44 PM
> To: Steve Wise
> Cc: Devesh Sharma; Linux NFS Mailing List; linux-rdma@vger.kernel.org; Trond Myklebust
> Subject: Re: [PATCH V1] NFS-RDMA: fix qp pointer validation checks
> 
> 
> On Apr 10, 2014, at 11:01 AM, Steve Wise <swise@opengridcomputing.com> wrote:
> 
> > On 4/9/2014 7:26 PM, Chuck Lever wrote:
> >> On Apr 9, 2014, at 7:56 PM, Devesh Sharma <Devesh.Sharma@Emulex.Com> wrote:
> >>
> >>> Hi Chuk and Trond
> >>>
> >>> I will resend a v2 for this.
> >>> What if ib_post_send() fails with immidate error, I that case also DECR_CQCOUNT()
will
> be called but no completion will be reported. Will that not cause any problems?
> >> We should investigate whether an error return from ib_post_{send,recv} means there
will
> be no completion. But I've never seen these verbs fail in practice, so I'm not in a
hurry to make
> work for anyone! ;-)
> >
> > A synchronous failure from ib_post_* means the WR (or at least one of them if there
were >
> 1) failed and did not get submitted to HW.  So there will be no completion for those
that failed.
> 
> OK.
> 
> Our post operations are largely single WRs. Before we address CQCOUNT in error cases,
we'd
> have to deal with chained WRs.
> 
> Chained WRs are used only when rpcrdma_register_frmr_external() finds an MR that hasn't
> been invalidated. That's actually working around a FRMR re-use bug (commit 5c635e09). If
the
> underlying re-use problem was fixed, we could get rid of the chained WR in
> register_frmr_external() (and we wouldn't need completions at all for FAST_REG_MR).
> 
> But at 100,000 feet, if a post operation fails, that seems like a very serious issue. I
wonder
> whether we would be better off disconnecting and starting over in those cases.
> 

I agree. The application is responsible to flow-control its posting of WRs to the SQs/RQs.
So we should never see sync failures with ib_post_* due to over-running the queues.
However, if the QP moves out of RTS for whatever reason, then a multi-threaded application
could encounter sync failures because the QP exited RTS.  Anyway, I agree: if there are
any failures with ib_post_*, the application should kill the connection, (LOG SOMETHING!),
and setup a new connection.

My 2 centimes.  :)


Steve



  parent reply	other threads:[~2014-04-10 18:34 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-04-09 18:40 [PATCH V1] NFS-RDMA: fix qp pointer validation checks Devesh Sharma
2014-04-09 18:40 ` Devesh Sharma
     [not found] ` <014738b6-698e-4ea1-82f9-287378bfec19-3RiH6ntJJkOPfaB/Gd0HpljyZtpTMMwT@public.gmane.org>
2014-04-09 20:22   ` Trond Myklebust
2014-04-09 20:22     ` Trond Myklebust
     [not found]     ` <D7AB2150-5F25-4BA2-80D9-94890AD11F8F-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>
2014-04-09 20:26       ` Chuck Lever
2014-04-09 20:26         ` Chuck Lever
     [not found]         ` <F1C70AD6-BDD4-4534-8DC4-61D2767581D9-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-09 23:56           ` Devesh Sharma
2014-04-09 23:56             ` Devesh Sharma
     [not found]             ` <EE7902D3F51F404C82415C4803930ACD3FDEAA43-DWYeeINJQrxExQ8dmkPuX0M9+F4ksjoh@public.gmane.org>
2014-04-10  0:26               ` Chuck Lever
2014-04-10  0:26                 ` Chuck Lever
     [not found]                 ` <E66D006A-0D04-4602-8BF5-6834CACD2E24-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-10 15:01                   ` Steve Wise
2014-04-10 15:01                     ` Steve Wise
     [not found]                     ` <5346B22D.3060706-7bPotxP6k4+P2YhJcF5u+vpXobYPEAuW@public.gmane.org>
2014-04-10 17:43                       ` Chuck Lever
2014-04-10 17:43                         ` Chuck Lever
     [not found]                         ` <D7836AB3-FCB6-40EF-9954-B58A05A87791-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-10 18:34                           ` Steve Wise [this message]
2014-04-10 18:34                             ` Steve Wise
2014-04-10 17:42                   ` Devesh Sharma
2014-04-10 17:42                     ` Devesh Sharma
     [not found]                     ` <EE7902D3F51F404C82415C4803930ACD3FDEB3B4-DWYeeINJQrxExQ8dmkPuX0M9+F4ksjoh@public.gmane.org>
2014-04-10 17:51                       ` Chuck Lever
2014-04-10 17:51                         ` Chuck Lever
     [not found]                         ` <BD7B05C0-4733-4DD1-83F3-B30B6B0EE48C-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-10 17:54                           ` Devesh Sharma
2014-04-10 17:54                             ` Devesh Sharma
     [not found]                             ` <EE7902D3F51F404C82415C4803930ACD3FDEB3DF-DWYeeINJQrxExQ8dmkPuX0M9+F4ksjoh@public.gmane.org>
2014-04-10 19:53                               ` Chuck Lever
2014-04-10 19:53                                 ` Chuck Lever
     [not found]                                 ` <56C87770-7940-4006-948C-FEF3C0EC4ACC-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-11 23:51                                   ` Devesh Sharma
2014-04-11 23:51                                     ` Devesh Sharma
     [not found]                                     ` <EE7902D3F51F404C82415C4803930ACD3FDEBD66-DWYeeINJQrxExQ8dmkPuX0M9+F4ksjoh@public.gmane.org>
2014-04-13  4:01                                       ` Chuck Lever
2014-04-13  4:01                                         ` Chuck Lever
     [not found]                                         ` <5710A71F-C4D5-408B-9B41-07F21B5853F0-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-14 20:53                                           ` Chuck Lever
2014-04-14 20:53                                             ` Chuck Lever
     [not found]                                             ` <6837A427-B677-4CC7-A022-4FB9E52A3FC6-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-14 22:46                                               ` Devesh Sharma
2014-04-14 22:46                                                 ` Devesh Sharma
     [not found]                                                 ` <EE7902D3F51F404C82415C4803930ACD3FDED915-DWYeeINJQrxExQ8dmkPuX0M9+F4ksjoh@public.gmane.org>
2014-04-15  0:39                                                   ` Chuck Lever
2014-04-15  0:39                                                     ` Chuck Lever
     [not found]                                                     ` <C689AB91-46F6-4E96-A673-0DE76FE54CC4-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-15 18:25                                                       ` Devesh Sharma
2014-04-15 18:25                                                         ` Devesh Sharma
     [not found]                                                         ` <EE7902D3F51F404C82415C4803930ACD3FDEE11F-DWYeeINJQrxExQ8dmkPuX0M9+F4ksjoh@public.gmane.org>
2014-04-23 23:30                                                           ` Devesh Sharma
2014-04-23 23:30                                                             ` Devesh Sharma
     [not found]                                                             ` <1bab6615-60c4-4865-a6a0-c53bb1c32341-3RiH6ntJJkP8BX6JNMqfyFjyZtpTMMwT@public.gmane.org>
2014-04-24  7:12                                                               ` Sagi Grimberg
2014-04-24  7:12                                                                 ` Sagi Grimberg
     [not found]                                                                 ` <5358B975.4020207-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-04-24 15:01                                                                   ` Chuck Lever
2014-04-24 15:01                                                                     ` Chuck Lever
     [not found]                                                                     ` <B39C0B38-357F-4BDA-BDA7-048BD38853F7-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-24 15:48                                                                       ` Devesh Sharma
2014-04-24 15:48                                                                         ` Devesh Sharma
     [not found]                                                                         ` <EE7902D3F51F404C82415C4803930ACD3FDF4F83-DWYeeINJQrxExQ8dmkPuX0M9+F4ksjoh@public.gmane.org>
2014-04-24 17:44                                                                           ` Chuck Lever
2014-04-24 17:44                                                                             ` Chuck Lever
2014-04-27 10:12                                                                       ` Sagi Grimberg
2014-04-27 10:12                                                                         ` Sagi Grimberg
     [not found]                                                                     ` <535CD819.3050508@dev! .mellanox.co.il>
     [not found]                                                                       ` <535CD819.3050508-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
2014-04-27 12:37                                                                         ` Chuck Lever
2014-04-27 12:37                                                                           ` Chuck Lever
     [not found]                                                                           ` <4ACED3B0-CC8B-4F1F-8DB6-6C272AB17C99-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
2014-04-28  8:58                                                                             ` Sagi Grimberg
2014-04-28  8:58                                                                               ` Sagi Grimberg
2014-04-14 23:55                                           ` Devesh Sharma
2014-04-14 23:55                                             ` Devesh Sharma

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='006601cf54eb$92488e30$b6d9aa90$@opengridcomputing.com' \
    --to=swise-7bpotxp6k4+p2yhjcf5u+vpxobypeauw@public.gmane.org \
    --cc=Devesh.Sharma-iH1Dq9VlAzfQT0dZR+AlfA@public.gmane.org \
    --cc=chuck.lever-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org \
    --cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.