From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 22 Aug 2019 23:01:54 +0800
From: "Liu, Changcheng"
To: Doug Ledford, tom@talpey.com
Cc: linux-rdma@vger.kernel.org
Subject: Re: CX314A WCE error: WR_FLUSH_ERR
Message-ID: <20190822150154.GA27163@jerryopenix>
References: <20190821120912.GA1672@jerryopenix>
 <6aed3f75-2445-eb6f-0bd8-7c79ea4a0967@talpey.com>
 <20190821153844.GA4545@jerryopenix>
 <0ce34055454c68cf9089e9e742b04397419a6309.camel@redhat.com>
In-Reply-To: <0ce34055454c68cf9089e9e742b04397419a6309.camel@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
User-Agent: Mutt/1.9.4 (2018-02-28)
X-Mailing-List: linux-rdma@vger.kernel.org

Thanks Doug Ledford & Tom.
I've found that the QP is forcibly switched into the Error state so that
outstanding WQEs are flushed to the CQ with WR_FLUSH_ERR status.
A short repro sketch is appended below the quoted thread.

On 14:47 Wed 21 Aug, Doug Ledford wrote:
> On Wed, 2019-08-21 at 23:38 +0800, Liu, Changcheng wrote:
> > On 09:36 Wed 21 Aug, Tom Talpey wrote:
> > > On 8/21/2019 8:09 AM, Liu, Changcheng wrote:
> > > > Hi all,
> > > > In one system, we frequently hit "IBV_WC_WR_FLUSH_ERR"
> > > > in the WCE (work completion element) polled from the completion queue
> > > > bound to the RQ (Receive Queue).
> > > > Does anyone have ideas on how to debug an "IBV_WC_WR_FLUSH_ERR"
> > > > problem?
> > > >
> > > > With a CX314A/40Gb NIC, I hit this error when using the RC transport
> > > > type with only Send operations (IBV_WR_SEND) as WRs (work requests) on
> > > > the SQ (Send Queue).
> > > > Every WR has only one SGE (scatter/gather element), and all the
> > > > SGEs on the RQ have the same size. The SGE size in an SQ WR is never
> > > > greater than the SGE size in an RQ WR.
> > > >
> > > > There is one explanation of IBV_WC_WR_FLUSH_ERR on page 114
> > > > of the "RDMA Aware Networks Programming User Manual":
> > > > http://www.mellanox.com/related-docs/prod_software/RDMA_Aware_Programming_user_manual.pdf
> > > > But I still didn't understand it well.
> > > > How can I trigger this error with a short demo program?
> > > > "
> > > > IBV_WC_WR_FLUSH_ERR
> > > > This event is generated when an invalid remote error is
> > > > thrown when the responder detects an
> > > > invalid request. It may be that the operation is not
> > > > supported by the request queue or there is
> > > > insufficient buffer space to receive the request.
> > > > "
> > > The most common reason for a flushed work request is loss of
> > > the connection to the remote peer. This can be caused by any
> > > number of conditions.
> > Good direction. I'll debug it this way first.
> > > The second-most common is a programming error in the upper
> > > layer protocol. A shortage of posted receives on either peer,
> > > a protection error on some buffer, etc.
> > Do you mean a protection key such as l_key/r_key isn't set up correctly?
> > What kind of protection error could trigger IBV_WC_WR_FLUSH_ERR?
> 
> FLUSH_ERR is the error used whenever a queue pair goes into an error
> state and there are still WQEs posted to the queue pair. All
> outstanding WQEs are returned with the status IBV_WC_WR_FLUSH_ERR. This
> is how you make sure you don't lose WQEs when the QP hits an error
> state. So, literally *anything* that can cause a QP to go into an ERROR
> state will result in all WQEs currently posted to the QP being sent back
> with this FLUSH_ERR. FLUSH_ERR literally just means that the card is
> flushing out the QP's work queue because now that the QP is in an error
> state it can't process the WQEs and, presumably, the application needs
> to know which ones completed and which ones didn't so it knows what to
> requeue once the QP is no longer in an error state.
> 
> As Tom has already pointed out, all of these things will throw the queue
> pair into an error state and cause all posted WQEs to be flushed with
> the FLUSH_ERR condition:
> 
> 1) Loss of the queue pair connection
> 2) Any memory permission violation (attempt to write to read-only
> memory, attempt to RDMA read/write to an invalid rkey, etc.)
> 3) Receipt of any post_send message without a waiting post_recv buffer
> to accept the message
> 4) Receipt of a post_send message that is too large to fit in the first
> available post_recv buffer
> 
> A common cause of this sort of thing is when you don't do proper flow
> control on the queue pair and the sending side floods the receiving side
> and runs it out of posted recv WQEs. Although, in your case, you did
> say this was happening on the receive queue, so that implies this is
> happening on the receiving side, so if that is what's happening here,
> the process would have to be something like:
> 
> sender starts sending data (maybe without any flow control)
> receiver starts receiving data and refilling buffers
> ...
> receiver runs totally dry of buffers and gets an incoming recv,
> causing the QP to go into the error state
> 
> receiver then posts refill buffers to the RQ after the QP
> went into the error state but before acknowledging the error state
> and shutting down the recv processing thread
> 
> all recv buffers posted as WQEs are flushed back to the process
> with FLUSH_ERR because they were posted to a QP in the ERROR state
> 
> > > If you're looking to actually trigger this error for testing,
> > > well, try one of the above. If you're trying to figure out
> > > why it's happening, that can take some digging, but not in
> > > the RDMA stack, typically.
> > Many thanks.
> > 
> > --Changcheng
> > > Tom.
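
For what it's worth, this is roughly how I now drain the recv CQ (just a
sketch, not tested code from our system): only the first non-success
completion carries the real cause; every WQE after it on the broken QP only
reports IBV_WC_WR_FLUSH_ERR, so I log the first error and merely count the
flushes.

#include <stdio.h>
#include <infiniband/verbs.h>

void drain_recv_cq(struct ibv_cq *cq)
{
	struct ibv_wc wc[32];
	int n;
	int flushed = 0;

	/* Empty the CQ; ibv_poll_cq() returns how many WCEs it copied out. */
	while ((n = ibv_poll_cq(cq, 32, wc)) > 0) {
		for (int i = 0; i < n; i++) {
			if (wc[i].status == IBV_WC_SUCCESS) {
				/* normal receive: hand the buffer to the app */
				continue;
			}
			if (wc[i].status == IBV_WC_WR_FLUSH_ERR) {
				/* flushed only because the QP is already in ERROR */
				flushed++;
				continue;
			}
			/* the completion that actually broke the QP */
			fprintf(stderr, "wr_id=%llu failed: %s (vendor_err=0x%x)\n",
				(unsigned long long)wc[i].wr_id,
				ibv_wc_status_str(wc[i].status),
				wc[i].vendor_err);
		}
	}
	if (flushed)
		fprintf(stderr, "%d WQEs flushed with IBV_WC_WR_FLUSH_ERR\n",
			flushed);
}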
> 
> -- 
> Doug Ledford
> GPG KeyID: B826A3330E572FDD
> Fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
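
Coming back to the question above about a short demo program: below is a
minimal sketch (untested; the device index 0, port number 1 and the 4KB
buffer size are my assumptions) that reproduces IBV_WC_WR_FLUSH_ERR locally,
without any remote peer, by posting one recv WQE and then forcing the QP
into the Error state with ibv_modify_qp() -- essentially the mechanism the
kernel's ib_drain_rq() relies on.

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <infiniband/verbs.h>

int main(void)
{
	struct ibv_device **dev_list = ibv_get_device_list(NULL);
	if (!dev_list || !dev_list[0]) {
		fprintf(stderr, "no RDMA device found\n");
		return 1;
	}

	/* Error checks are omitted below to keep the sketch short. */
	struct ibv_context *ctx = ibv_open_device(dev_list[0]);
	struct ibv_pd *pd = ibv_alloc_pd(ctx);
	struct ibv_cq *cq = ibv_create_cq(ctx, 16, NULL, NULL, 0);

	/* One RC QP; send and recv completions share the same CQ here. */
	struct ibv_qp_init_attr qpia = {
		.send_cq = cq,
		.recv_cq = cq,
		.cap = { .max_send_wr = 16, .max_recv_wr = 16,
			 .max_send_sge = 1, .max_recv_sge = 1 },
		.qp_type = IBV_QPT_RC,
	};
	struct ibv_qp *qp = ibv_create_qp(pd, &qpia);

	/* RESET -> INIT so that recv WQEs may be posted (port 1 assumed). */
	struct ibv_qp_attr attr = {
		.qp_state = IBV_QPS_INIT,
		.pkey_index = 0,
		.port_num = 1,
		.qp_access_flags = 0,
	};
	ibv_modify_qp(qp, &attr, IBV_QP_STATE | IBV_QP_PKEY_INDEX |
				 IBV_QP_PORT | IBV_QP_ACCESS_FLAGS);

	/* Post a single recv buffer that will never be matched by a send. */
	static char buf[4096];
	struct ibv_mr *mr = ibv_reg_mr(pd, buf, sizeof(buf),
				       IBV_ACCESS_LOCAL_WRITE);
	struct ibv_sge sge = {
		.addr = (uintptr_t)buf,
		.length = sizeof(buf),
		.lkey = mr->lkey,
	};
	struct ibv_recv_wr rwr = {
		.wr_id = 0x1234,
		.sg_list = &sge,
		.num_sge = 1,
	};
	struct ibv_recv_wr *bad_wr;
	ibv_post_recv(qp, &rwr, &bad_wr);

	/* Force the QP into the Error state: every outstanding WQE is now
	 * completed with IBV_WC_WR_FLUSH_ERR. */
	memset(&attr, 0, sizeof(attr));
	attr.qp_state = IBV_QPS_ERR;
	ibv_modify_qp(qp, &attr, IBV_QP_STATE);

	struct ibv_wc wc;
	while (ibv_poll_cq(cq, 1, &wc) == 0)
		;	/* busy-wait; a real program would use CQ events */
	printf("wr_id=0x%llx status=%s\n",
	       (unsigned long long)wc.wr_id, ibv_wc_status_str(wc.status));

	ibv_destroy_qp(qp);
	ibv_dereg_mr(mr);
	ibv_destroy_cq(cq);
	ibv_dealloc_pd(pd);
	ibv_close_device(ctx);
	ibv_free_device_list(dev_list);
	return 0;
}

If my understanding is right, building this with -libverbs and running it on
any node with an RDMA device should print one completion with wr_id 0x1234
and status IBV_WC_WR_FLUSH_ERR as soon as the QP enters the Error state.

--Changcheng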