From: Jason Gunthorpe <jgg@nvidia.com>
To: Wenpeng Liang <liangwenpeng@huawei.com>
Cc: leon@kernel.org, linux-rdma@vger.kernel.org, linuxarm@huawei.com
Subject: Re: [PATCH for-rc] RDMA/hns: Fix RNR retransmission issue for HIP08
Date: Tue, 14 Dec 2021 19:47:03 -0400
Message-ID: <20211214234703.GA968962@nvidia.com>
In-Reply-To: <20211209140655.49493-1-liangwenpeng@huawei.com>

On Thu, Dec 09, 2021 at 10:06:55PM +0800, Wenpeng Liang wrote:
> From: Yangyang Li <liyangyang20@huawei.com>
> 
> Due to the discrete nature of the HIP08 timer unit, a requester might
> finish the timeout period sooner, in elapsed real time, than its responder
> does, even when both sides share the identical RNR timeout length included
> in the RNR Nak packet and the responder indeed starts the timing prior to
> the requester. Furthermore, if a 'providential' resend packet arrives
> before the responder's timeout period expires, the responder is entitled
> to drop the packet silently under the IB protocol.
> 
> To address this problem, our team made good use of certain hardware facts:
> 1) The timing resolution of the transmission arrangements is 1
> microsecond, e.g. if the cq_period field is set to 3, the hardware
> interprets it as 3 microseconds;
> 2) A QPC field tells the hardware how many timing units (ticks)
> constitute a full microsecond, which, by default, is 1000;
> 3) It takes 14ns for the processor to handle a packet in the buffer, so an
> RNR timeout length of 10ns would ensure our processing mechanism is
> disabled during the entire timeout period and the packet won't be dropped
> silently.
> 
> To achieve (3), we permanently set the QPC field mentioned in (2) to zero,
> which nominally indicates that every time tick is equivalent to a
> microsecond in wall-clock time; now, an RNR timeout period with a face
> value of 10 would only last 10 ticks, which is 10ns in wall-clock time.
> 
> It's worth noting that we adapt the driver by multiplying certain
> configuration parameters (cq_period, eq_period and ack_timeout) by 1000,
> given that the user assumes the configured timing unit to be microseconds.
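
For readers following the arithmetic, the scaling described above could be
sketched roughly as below. This is only an illustration under stated
assumptions, not the code from the patch: CLOCK_ADJUST, scale_period_us()
and the is_hip08 flag are hypothetical names, and hardware field widths are
ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: the factor of 1000 named in the commit message. */
#define CLOCK_ADJUST 1000u

/*
 * Hypothetical helper: convert a user-supplied period in microseconds into
 * the value written to hardware. On HIP08 the value is multiplied by 1000
 * so the user still sees microsecond semantics; other hardware is assumed
 * to take the value unchanged.
 */
static uint64_t scale_period_us(uint32_t period_us, bool is_hip08)
{
	/* Widen first so the multiplication cannot overflow 32 bits. */
	return is_hip08 ? (uint64_t)period_us * CLOCK_ADJUST : period_us;
}
```

With this sketch, a cq_period of 3 would be written as 3000 on HIP08 and as
3 elsewhere.
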
> 
> Also, this particular improvisation is only deployed on HIP08 since other
> hardware has already solved this issue.
> 
> Fixes: cfc85f3e4b7f ("RDMA/hns: Add profile support for hip08 driver")
> Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
> ---
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 64 +++++++++++++++++++---
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.h |  8 +++
>  2 files changed, 65 insertions(+), 7 deletions(-)

Applied to for-rc, thanks

Jason
