linux-rdma.vger.kernel.org archive mirror
* CX314A WCE error: WR_FLUSH_ERR
@ 2019-08-21 12:09 Liu, Changcheng
  2019-08-21 13:36 ` Tom Talpey
  0 siblings, 1 reply; 5+ messages in thread
From: Liu, Changcheng @ 2019-08-21 12:09 UTC (permalink / raw)
  To: linux-rdma

Hi all,
   On one system, I frequently hit "IBV_WC_WR_FLUSH_ERR" in the WCE (work completion element) polled from the completion queue bound to the RQ (Receive Queue).
   Does anyone have an idea how to debug this "IBV_WC_WR_FLUSH_ERR" problem?

   With a CX314A 40Gb NIC, I hit this error when using the RC transport type with only Send operations (IBV_WR_SEND) as the WRs (work requests) on the SQ (Send Queue).
   Every WR has only one SGE (scatter/gather element), and all the SGEs on the RQ have the same size. The SGE size in an SQ WR is never greater than the SGE size in an RQ WR.
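
   For reference, the posting pattern is roughly the following (a simplified sketch of the relevant calls only; QP setup and memory registration are omitted, and RECV_SIZE is just an illustrative name for the fixed RQ buffer size):

#include <stdint.h>
#include <infiniband/verbs.h>

#define RECV_SIZE 4096   /* illustrative fixed RQ buffer size */

/* Post one receive WR with a single SGE of the fixed size RECV_SIZE. */
static int post_one_recv(struct ibv_qp *qp, struct ibv_mr *mr,
                         void *buf, uint64_t id)
{
        struct ibv_sge sge = {
                .addr   = (uintptr_t)buf,
                .length = RECV_SIZE,
                .lkey   = mr->lkey,
        };
        struct ibv_recv_wr wr = {
                .wr_id = id, .sg_list = &sge, .num_sge = 1,
        };
        struct ibv_recv_wr *bad;

        return ibv_post_recv(qp, &wr, &bad);
}

/* Post one IBV_WR_SEND WR with a single SGE no larger than RECV_SIZE. */
static int post_one_send(struct ibv_qp *qp, struct ibv_mr *mr,
                         void *buf, uint32_t len, uint64_t id)
{
        struct ibv_sge sge = {
                .addr   = (uintptr_t)buf,
                .length = len,               /* len <= RECV_SIZE */
                .lkey   = mr->lkey,
        };
        struct ibv_send_wr wr = {
                .wr_id      = id,
                .sg_list    = &sge,
                .num_sge    = 1,
                .opcode     = IBV_WR_SEND,
                .send_flags = IBV_SEND_SIGNALED,
        };
        struct ibv_send_wr *bad;

        return ibv_post_send(qp, &wr, &bad);
}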

  There's one explanation of IBV_WC_WR_FLUSH_ERR on page 114 of the "RDMA Aware Networks Programming User Manual": http://www.mellanox.com/related-docs/prod_software/RDMA_Aware_Programming_user_manual.pdf
  But I still don't understand it well. How can I trigger this error with a short demo program?
  "
    IBV_WC_WR_FLUSH_ERR
    This event is generated when an invalid remote error is thrown when the responder detects an
    invalid request. It may be that the operation is not supported by the request queue or there is
    insufficient buffer space to receive the request.
  "

B.R.
Changcheng

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: CX314A WCE error: WR_FLUSH_ERR
  2019-08-21 12:09 CX314A WCE error: WR_FLUSH_ERR Liu, Changcheng
@ 2019-08-21 13:36 ` Tom Talpey
  2019-08-21 15:38   ` Liu, Changcheng
  0 siblings, 1 reply; 5+ messages in thread
From: Tom Talpey @ 2019-08-21 13:36 UTC (permalink / raw)
  To: Liu, Changcheng, linux-rdma

On 8/21/2019 8:09 AM, Liu, Changcheng wrote:
> Hi all,
>     On one system, I frequently hit "IBV_WC_WR_FLUSH_ERR" in the WCE (work completion element) polled from the completion queue bound to the RQ (Receive Queue).
>     Does anyone have an idea how to debug this "IBV_WC_WR_FLUSH_ERR" problem?
> 
>     With a CX314A 40Gb NIC, I hit this error when using the RC transport type with only Send operations (IBV_WR_SEND) as the WRs (work requests) on the SQ (Send Queue).
>     Every WR has only one SGE (scatter/gather element), and all the SGEs on the RQ have the same size. The SGE size in an SQ WR is never greater than the SGE size in an RQ WR.
> 
>    There's one explanation of IBV_WC_WR_FLUSH_ERR on page 114 of the "RDMA Aware Networks Programming User Manual": http://www.mellanox.com/related-docs/prod_software/RDMA_Aware_Programming_user_manual.pdf
>    But I still don't understand it well. How can I trigger this error with a short demo program?
>    "
>      IBV_WC_WR_FLUSH_ERR
>      This event is generated when an invalid remote error is thrown when the responder detects an
>      invalid request. It may be that the operation is not supported by the request queue or there is
>      insufficient buffer space to receive the request.
>    "

The most common reason for a flushed work request is loss of
the connection to the remote peer. This can be caused by any
number of conditions.

The second-most common is a programming error in the upper
layer protocol. A shortage of posted receives on either peer,
a protection error on some buffer, etc.

If you're looking to actually trigger this error for testing,
well, try one of the above. If you're trying to figure out
why it's happening, that can take some digging, but not in
the RDMA stack, typically.

Tom.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: CX314A WCE error: WR_FLUSH_ERR
  2019-08-21 13:36 ` Tom Talpey
@ 2019-08-21 15:38   ` Liu, Changcheng
  2019-08-21 18:47     ` Doug Ledford
  0 siblings, 1 reply; 5+ messages in thread
From: Liu, Changcheng @ 2019-08-21 15:38 UTC (permalink / raw)
  To: Tom Talpey; +Cc: linux-rdma

On 09:36 Wed 21 Aug, Tom Talpey wrote:
> On 8/21/2019 8:09 AM, Liu, Changcheng wrote:
> > Hi all,
> >     On one system, I frequently hit "IBV_WC_WR_FLUSH_ERR" in the WCE (work completion element) polled from the completion queue bound to the RQ (Receive Queue).
> >     Does anyone have an idea how to debug this "IBV_WC_WR_FLUSH_ERR" problem?
> > 
> >     With a CX314A 40Gb NIC, I hit this error when using the RC transport type with only Send operations (IBV_WR_SEND) as the WRs (work requests) on the SQ (Send Queue).
> >     Every WR has only one SGE (scatter/gather element), and all the SGEs on the RQ have the same size. The SGE size in an SQ WR is never greater than the SGE size in an RQ WR.
> > 
> >    There's one explanation of IBV_WC_WR_FLUSH_ERR on page 114 of the "RDMA Aware Networks Programming User Manual": http://www.mellanox.com/related-docs/prod_software/RDMA_Aware_Programming_user_manual.pdf
> >    But I still don't understand it well. How can I trigger this error with a short demo program?
> >    "
> >      IBV_WC_WR_FLUSH_ERR
> >      This event is generated when an invalid remote error is thrown when the responder detects an
> >      invalid request. It may be that the operation is not supported by the request queue or there is
> >      insufficient buffer space to receive the request.
> >    "
> 
> The most common reason for a flushed work request is loss of
> the connection to the remote peer. This can be caused by any
> number of conditions.
Good direction. I'll debug it this way first.
> 
> The second-most common is a programming error in the upper
> layer protocol. A shortage of posted receives on either peer,
> a protection error on some buffer, etc.
Do you mean a protection key such as l_key/r_key isn't set up correctly?
What kind of protection error could trigger IBV_WC_WR_FLUSH_ERR?
> 
> If you're looking to actually trigger this error for testing,
> well, try one of the above. If you're trying to figure out
> why it's happening, that can take some digging, but not in
> the RDMA stack, typically.
Many thanks.

--Changcheng
> 
> Tom.
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: CX314A WCE error: WR_FLUSH_ERR
  2019-08-21 15:38   ` Liu, Changcheng
@ 2019-08-21 18:47     ` Doug Ledford
  2019-08-22 15:01       ` Liu, Changcheng
  0 siblings, 1 reply; 5+ messages in thread
From: Doug Ledford @ 2019-08-21 18:47 UTC (permalink / raw)
  To: Liu, Changcheng, Tom Talpey; +Cc: linux-rdma

On Wed, 2019-08-21 at 23:38 +0800, Liu, Changcheng wrote:
> On 09:36 Wed 21 Aug, Tom Talpey wrote:
> > On 8/21/2019 8:09 AM, Liu, Changcheng wrote:
> > > Hi all,
> > >     On one system, I frequently hit "IBV_WC_WR_FLUSH_ERR" in the
> > > WCE (work completion element) polled from the completion queue
> > > bound to the RQ (Receive Queue).
> > >     Does anyone have an idea how to debug this
> > > "IBV_WC_WR_FLUSH_ERR" problem?
> > > 
> > >     With a CX314A 40Gb NIC, I hit this error when using the RC
> > > transport type with only Send operations (IBV_WR_SEND) as the WRs
> > > (work requests) on the SQ (Send Queue).
> > >     Every WR has only one SGE (scatter/gather element), and all
> > > the SGEs on the RQ have the same size. The SGE size in an SQ WR
> > > is never greater than the SGE size in an RQ WR.
> > > 
> > >    There's one explanation of IBV_WC_WR_FLUSH_ERR on page 114 of
> > > the "RDMA Aware Networks Programming User Manual":
> > > http://www.mellanox.com/related-docs/prod_software/RDMA_Aware_Programming_user_manual.pdf
> > >    But I still don't understand it well. How can I trigger this
> > > error with a short demo program?
> > >    "
> > >      IBV_WC_WR_FLUSH_ERR
> > >      This event is generated when an invalid remote error is
> > > thrown when the responder detects an
> > >      invalid request. It may be that the operation is not
> > > supported by the request queue or there is
> > >      insufficient buffer space to receive the request.
> > >    "
> > 
> > The most common reason for a flushed work request is loss of
> > the connection to the remote peer. This can be caused by any
> > number of conditions.
> Good direction. I'll debug it this way first.
> > The second-most common is a programming error in the upper
> > layer protocol. A shortage of posted receives on either peer,
> > a protection error on some buffer, etc.
> Do you mean a protection key such as l_key/r_key isn't set up correctly?
> What kind of protection error could trigger IBV_WC_WR_FLUSH_ERR?

FLUSH_ERR is the error used whenever a queue pair goes into an error
state and there are still WQEs posted to the queue pair.  All
outstanding WQEs are returned with the state IBV_WC_WR_FLUSH_ERR.  This
is how you make sure you don't lose WQEs when the QP hits an error
state.  So, literally *anything* that can cause a QP to go into an ERROR
state will result in all WQEs currently posted to the QP being sent back
with this FLUSH_ERR.  FLUSH_ERR literally just means that the card is
flushing out the QP's work queue because now that the QP is in an error
state it can't process the WQEs and, presumably, the application needs
to know which ones completed and which ones didn't so it knows what to
requeue once the QP is no longer in an error state.
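
To see this from the application side, a minimal CQ draining loop along
these lines is enough (just a sketch; requeue_buffer() is a placeholder
for whatever your application does with the flushed wr_id):

#include <stdio.h>
#include <stdint.h>
#include <infiniband/verbs.h>

extern void requeue_buffer(uint64_t wr_id);   /* application placeholder */

void drain_cq(struct ibv_cq *cq)
{
        struct ibv_wc wc[16];
        int n, i;

        while ((n = ibv_poll_cq(cq, 16, wc)) > 0) {
                for (i = 0; i < n; i++) {
                        if (wc[i].status == IBV_WC_SUCCESS)
                                continue;
                        if (wc[i].status == IBV_WC_WR_FLUSH_ERR) {
                                /* never executed: the QP was already in the
                                 * error state when this WQE was reached */
                                requeue_buffer(wc[i].wr_id);
                        } else {
                                /* the completion that actually broke the QP,
                                 * e.g. IBV_WC_LOC_PROT_ERR, IBV_WC_RETRY_EXC_ERR */
                                fprintf(stderr, "QP error: %s (vendor 0x%x)\n",
                                        ibv_wc_status_str(wc[i].status),
                                        wc[i].vendor_err);
                        }
                }
        }
}

When debugging, the interesting completion is usually the first one that
is *not* FLUSH_ERR (or the async/CM event, if the failure originated on
the remote side) -- that is what tells you why the QP left the working
state; the flushes that follow are just the cleanup.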

As Tom has already pointed out, all of these things will throw the queue
pair into an error state and cause all posted WQEs to be flushed with
the FLUSH_ERR condition:

1) Loss of queue pair connection
2) Any memory permission violation (attempt to write to read only
memory, attempt to RDMA read/write to an invalid rkey, etc)
3) Receipt of any post_send message without a waiting post_recv buffer
to accept the message
4) Receipt of a post_send message that is too large to fit in the first
available post_recv buffer
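
If all you want is to reproduce the error in a short demo program (as
asked at the top of the thread), case 1 is the easiest to simulate: post
a few receives, then force the QP into the error state yourself and poll
the CQ -- every WQE still posted comes back with IBV_WC_WR_FLUSH_ERR.
A sketch:

#include <infiniband/verbs.h>

/* Force the QP into the ERROR state.  Every WQE still outstanding on
 * its SQ and RQ will then complete with IBV_WC_WR_FLUSH_ERR. */
static int force_flush(struct ibv_qp *qp)
{
        struct ibv_qp_attr attr = { .qp_state = IBV_QPS_ERR };

        return ibv_modify_qp(qp, &attr, IBV_QP_STATE);
}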

A common cause of this sort of thing is when you don't do proper flow
control on the queue pair and the sending side floods the receiving side
and runs it out of posted recv WQEs.  Although, in your case, you did
say this was happening on the receive queue, so that implies this is
happening on the receiving side, so if that is what's happening here,
the process would have to be something like:

sender starts sending data (maybe without any flow control)
	receiver starts receiving data and refilling buffers
	...
	receiver runs totally dry of buffers and gets an incoming recv
	causing qp to go into error state

	receiver then posts refill buffers to the RQ after the QP
	went into error state but before acknowledging the error state
	and shutting down the recv processing thread

	all recv buffers posted as WQEs are flushed back to the process
	with FLUSH_ERR because they were posted to a QP in ERROR state
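
One way to keep that from happening is simple credit-based flow control
on top of the SENDs.  A rough sketch of the sender side (the credit
accounting is purely an application-level convention, not part of verbs;
post_one_send() and enqueue_pending() are placeholders for the
application's own helpers):

#include <stdint.h>
#include <infiniband/verbs.h>

/* Application placeholders: post_one_send() posts a single-SGE
 * IBV_WR_SEND, enqueue_pending() parks the message until credits
 * are returned (e.g. piggybacked on reply messages). */
extern int post_one_send(struct ibv_qp *qp, struct ibv_mr *mr,
                         void *buf, uint32_t len, uint64_t id);
extern void enqueue_pending(void *buf, uint32_t len, uint64_t id);

/* Only post a SEND when the receiver is known to have a free recv buffer. */
static int send_with_credit(struct ibv_qp *qp, struct ibv_mr *mr,
                            void *buf, uint32_t len, uint64_t id,
                            int *credits)
{
        if (*credits == 0) {
                enqueue_pending(buf, len, id);  /* retry when credits return */
                return 0;
        }
        if (post_one_send(qp, mr, buf, len, id))
                return -1;
        (*credits)--;
        return 0;
}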

> > If you're looking to actually trigger this error for testing,
> > well, try one of the above. If you're trying to figure out
> > why it's happening, that can take some digging, but not in
> > the RDMA stack, typically.
> Many thanks.
> 
> --Changcheng
> > Tom.
> > 

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: CX314A WCE error: WR_FLUSH_ERR
  2019-08-21 18:47     ` Doug Ledford
@ 2019-08-22 15:01       ` Liu, Changcheng
  0 siblings, 0 replies; 5+ messages in thread
From: Liu, Changcheng @ 2019-08-22 15:01 UTC (permalink / raw)
  To: Doug Ledford, tom; +Cc: linux-rdma

Thanks, Doug Ledford & Tom. I've found that the QP is being forced into
the error state, which flushes the outstanding WQEs to the CQ with
WR_FLUSH_ERR status.
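
For the record, after draining the flushed completions the QP can be
recovered by resetting it and walking it back up -- a rough sketch
(assuming the same QP is reused; with librdmacm it is more common to
simply reconnect):

#include <infiniband/verbs.h>

static int reset_qp(struct ibv_qp *qp)
{
        struct ibv_qp_attr attr = { .qp_state = IBV_QPS_RESET };

        /* Any state -> RESET is allowed.  After this, re-run the same
         * INIT -> RTR -> RTS transitions used at connection setup and
         * repost the receive buffers that came back with WR_FLUSH_ERR. */
        return ibv_modify_qp(qp, &attr, IBV_QP_STATE);
}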

On 14:47 Wed 21 Aug, Doug Ledford wrote:
> On Wed, 2019-08-21 at 23:38 +0800, Liu, Changcheng wrote:
> > On 09:36 Wed 21 Aug, Tom Talpey wrote:
> > > On 8/21/2019 8:09 AM, Liu, Changcheng wrote:
> > > > Hi all,
> > > >     On one system, I frequently hit "IBV_WC_WR_FLUSH_ERR" in the
> > > > WCE (work completion element) polled from the completion queue
> > > > bound to the RQ (Receive Queue).
> > > >     Does anyone have an idea how to debug this
> > > > "IBV_WC_WR_FLUSH_ERR" problem?
> > > > 
> > > >     With a CX314A 40Gb NIC, I hit this error when using the RC
> > > > transport type with only Send operations (IBV_WR_SEND) as the WRs
> > > > (work requests) on the SQ (Send Queue).
> > > >     Every WR has only one SGE (scatter/gather element), and all
> > > > the SGEs on the RQ have the same size. The SGE size in an SQ WR
> > > > is never greater than the SGE size in an RQ WR.
> > > > 
> > > >    There's one explanation of IBV_WC_WR_FLUSH_ERR on page 114 of
> > > > the "RDMA Aware Networks Programming User Manual":
> > > > http://www.mellanox.com/related-docs/prod_software/RDMA_Aware_Programming_user_manual.pdf
> > > >    But I still don't understand it well. How can I trigger this
> > > > error with a short demo program?
> > > >    "
> > > >      IBV_WC_WR_FLUSH_ERR
> > > >      This event is generated when an invalid remote error is
> > > > thrown when the responder detects an
> > > >      invalid request. It may be that the operation is not
> > > > supported by the request queue or there is
> > > >      insufficient buffer space to receive the request.
> > > >    "
> > > 
> > > The most common reason for a flushed work request is loss of
> > > the connection to the remote peer. This can be caused by any
> > > number of conditions.
> > Good direction. I'll debug it this way first.
> > > The second-most common is a programming error in the upper
> > > layer protocol. A shortage of posted receives on either peer,
> > > a protection error on some buffer, etc.
> > Do you mean a protection key such as l_key/r_key isn't set up correctly?
> > What kind of protection error could trigger IBV_WC_WR_FLUSH_ERR?
> 
> FLUSH_ERR is the error used whenever a queue pair goes into an error
> state and there are still WQEs posted to the queue pair.  All
> outstanding WQEs are returned with the state IBV_WC_WR_FLUSH_ERR.  This
> is how you make sure you don't lose WQEs when the QP hits an error
> state.  So, literally *anything* that can cause a QP to go into an ERROR
> state will result in all WQEs currently posted to the QP being sent back
> with this FLUSH_ERR.  FLUSH_ERR literally just means that the card is
> flushing out the QP's work queue because now that the QP is in an error
> state it can't process the WQEs and, presumably, the application needs
> to know which ones completed and which ones didn't so it knows what to
> requeue once the QP is no longer in an error state.
> 
> As Tom has already pointed out, all of these things will throw the queue
> pair into an error state and cause all posted WQEs to be flushed with
> the FLUSH_ERR condition:
> 
> 1) Loss of queue pair connection
> 2) Any memory permission violation (attempt to write to read only
> memory, attempt to RDMA read/write to an invalid rkey, etc)
> 3) Receipt of any post_send message without a waiting post_recv buffer
> to accept the message
> 4) Receipt of a post_send message that is too large to fit in the first
> available post_recv buffer
> 
> A common cause of this sort of thing is when you don't do proper flow
> control on the queue pair and the sending side floods the receiving side
> and runs it out of posted recv WQEs.  Although, in your case, you did
> say this was happening on the receive queue, so that implies this is
> happening on the receiving side, so if that is what's happening here,
> the process would have to be something like:
> 
> sender starts sending data (maybe without any flow control)
> 	receiver starts receiving data and refilling buffers
> 	...
> 	receiver runs totally dry of buffers and gets an incoming recv
> 	causing qp to go into error state
> 
> 	receiver then posts refill buffers to the RQ after the QP
> 	went into error state but before acknowledging the error state
> 	and shutting down the recv processing thread
> 
> 	all recv buffers posted as WQEs are flushed back to the process
> 	with FLUSH_ERR because they were posted to a QP in ERROR state
> 
> > > If you're looking to actually trigger this error for testing,
> > > well, try one of the above. If you're trying to figure out
> > > why it's happening, that can take some digging, but not in
> > > the RDMA stack, typically.
> > Many thanks.
> > 
> > --Changcheng
> > > Tom.
> > > 
> 
> -- 
> Doug Ledford <dledford@redhat.com>
>     GPG KeyID: B826A3330E572FDD
>     Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-08-22 15:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-21 12:09 CX314A WCE error: WR_FLUSH_ERR Liu, Changcheng
2019-08-21 13:36 ` Tom Talpey
2019-08-21 15:38   ` Liu, Changcheng
2019-08-21 18:47     ` Doug Ledford
2019-08-22 15:01       ` Liu, Changcheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).