From: Haakon Bugge <haakon.bugge@oracle.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Doug Ledford <dledford@redhat.com>,
	Leon Romanovsky <leon@kernel.org>,
	OFED mailing list <linux-rdma@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH for-next v2] RDMA/core/sa_query: Retry SA queries
Date: Thu, 26 Aug 2021 15:59:46 +0000
Message-ID: <09CD5ACD-0147-45D5-8620-39FBF94CC02C@oracle.com>
In-Reply-To: <20210825174956.GA1200145@nvidia.com>



> On 25 Aug 2021, at 19:49, Jason Gunthorpe <jgg@nvidia.com> wrote:
> 
> On Thu, Aug 12, 2021 at 06:12:35PM +0200, Håkon Bugge wrote:
>> A MAD packet is sent as an unreliable datagram (UD). SA requests are
>> sent as MAD packets. As such, SA requests or responses may be silently
>> dropped.
>> 
>> IB Core's MAD layer has a timeout and retry mechanism which, among
>> others, is used by RDMA CM. But it is not used by SA queries. Without
>> retries, an SA query waits out the full specified timeout and returns
>> an error in case of packet loss. The ULP or user-land process then
>> has to perform the retry.
>> 
>> Fix this by taking advantage of the MAD layer's retry mechanism.
>> 
>> First, a check against a zero timeout is added in
>> rdma_resolve_route(). In send_mad(), we set the MAD layer timeout to
>> one tenth of the specified timeout and the number of retries to
>> 10. The special case when timeout is less than 10 is handled.
>> 
>> With this fix:
>> 
>> # ucmatose -c 1000 -S 1024 -C 1
>> 
>> runs stably on an InfiniBand fabric. Without this fix, the behavior
>> is intermittent and the run errors out with:
>> 
>> cmatose: event: RDMA_CM_EVENT_ROUTE_ERROR, error: -110
>> 
>> (110 is ETIMEDOUT)
>> 
>> Fixes: f75b7a529494 ("[PATCH] IB: Add automatic retries to MAD layer")
>> Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
>> ---
>> drivers/infiniband/core/cma.c      | 3 +++
>> drivers/infiniband/core/sa_query.c | 9 ++++++++-
>> 2 files changed, 11 insertions(+), 1 deletion(-)
> 
> I'm nervous about this, mostly because the mad layer is very
> complicated, but it does seem aligned with the spec.
> 
> However, it seems quite wrong that the timeout comes in from outside,
> the SA timeout should be integral to the SA layer..

They are quite different (timeout in ms):

	iser:      1000
	rtrs:     30000
	srp:       1000
	nvme:      3000
	samba:     5000
	p9:       30000
	rds:       5000
	xprtrdma:  5000

Dividing 30 seconds by ten to get 3 seconds seems OK. But for iser/srp we get 100 ms, which I would expect to be on the low end for some systems.
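
To make the arithmetic concrete, here is a minimal user-space sketch of the split described in the commit message: one tenth of the caller's timeout per attempt, with 10 retries. The names `sa_retry_plan`, `sa_split_timeout` and `SA_QUERY_RETRIES` are made up for illustration and are not the kernel's identifiers. The handling of timeouts below 10 ms (1 ms per attempt, with the retry count set to the timeout value) is an assumption about how that special case could be handled, not a quote from the patch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names for illustration only; not the kernel's API. */
#define SA_QUERY_RETRIES 10u

struct sa_retry_plan {
	unsigned int per_try_ms; /* timeout handed to the MAD layer per attempt */
	unsigned int retries;    /* retry count handed to the MAD layer */
};

/* Split a caller-supplied SA query timeout into per-attempt timeout and retries. */
static struct sa_retry_plan sa_split_timeout(unsigned int timeout_ms)
{
	struct sa_retry_plan plan;

	/* A zero timeout is expected to be rejected by the caller beforehand. */
	assert(timeout_ms != 0);

	plan.per_try_ms = timeout_ms / SA_QUERY_RETRIES;
	plan.retries = SA_QUERY_RETRIES;

	if (plan.per_try_ms == 0) {
		/*
		 * ASSUMPTION: for timeout_ms < 10, fall back to 1 ms per
		 * attempt and use the timeout value as the retry count.
		 */
		plan.per_try_ms = 1;
		plan.retries = timeout_ms;
	}
	return plan;
}

/* Print the resulting plan for one ULP timeout value. */
static void sa_print_plan(const char *ulp, unsigned int timeout_ms)
{
	struct sa_retry_plan plan = sa_split_timeout(timeout_ms);

	printf("%-9s %6u ms -> %5u ms per attempt, %u retries\n",
	       ulp, timeout_ms, plan.per_try_ms, plan.retries);
}
```

Calling `sa_print_plan("iser", 1000)` and `sa_print_plan("rtrs", 30000)` shows the two ends of the range above: 100 ms per attempt versus 3000 ms per attempt, both with 10 retries.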

> Anyhow, applied to for-next


Thanks!


Håkon

