* FastLinQ: possible duplicate flush of FastReg and LocalInv
@ 2021-03-16 19:58 Chuck Lever III
  2021-03-17  8:54 ` Bernard Metzler
  2021-03-17 15:14 ` Chuck Lever III
  0 siblings, 2 replies; 6+ messages in thread
From: Chuck Lever III @ 2021-03-16 19:58 UTC
  To: linux-rdma

Hi-

I've been trying to track down some crashes when running NFS/RDMA
tests over FastLinQ devices in iWARP mode. To make it stressful,
I've enabled disconnect injection, where rpcrdma injects a
connection disconnect every so often.

As part of a disconnect event, the Receive and Send queues are
drained. Sometimes I see a duplicate flush for one or more of the
memory registration ops. This is not a big deal for FastReg
because its completion handler is basically a no-op.

But for LocalInv this is a problem. On a flushed completion, the
MR is destroyed. If the completion occurs again, of course, all
kinds of badness happen because we're DMA-unmapping twice,
touching memory that has already been freed, and deleting from a
list_head that is poisonous.

The last straw is that wc_localinv_done calls the generic RPC layer
to indicate that an RPC Reply is ready. The duplicate flush
dereferences one or more NULL pointers.
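
To illustrate the hazard described above, here is a minimal userspace
sketch of a completion handler that tolerates a second completion for
the same MR. It is a model only, not xprtrdma code: struct demo_mr,
enum demo_mr_state and demo_localinv_done() are hypothetical names, and
it assumes the tracking structure outlives the first completion, which
is not true of an MR that has actually been freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lifecycle states for a tracked memory region. */
enum demo_mr_state { DEMO_MR_INVALIDATING, DEMO_MR_RELEASED };

struct demo_mr {
	enum demo_mr_state state;
	int release_count;	/* times teardown actually ran */
	int dup_count;		/* duplicate completions observed */
};

/*
 * Model of a LocalInv completion handler. The first completion
 * releases the MR; any later completion for the same MR is only
 * counted, so teardown never runs twice.
 * Returns true if this was the first completion.
 */
static bool demo_localinv_done(struct demo_mr *mr)
{
	assert(mr != NULL);

	if (mr->state == DEMO_MR_RELEASED) {
		mr->dup_count++;
		return false;
	}

	mr->state = DEMO_MR_RELEASED;
	mr->release_count++;	/* stands in for DMA unmap and list removal */
	return true;
}
```

A guard like this is mainly useful as a diagnostic for counting
duplicates; it does not make a double completion legal.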

Doesn't the verbs API contract stipulate that every posted WR gets
exactly one completion? I don't see this behavior with other
providers.

Thanks for any advice.


--
Chuck Lever





* Re: FastLinQ: possible duplicate flush of FastReg and LocalInv
  2021-03-16 19:58 FastLinQ: possible duplicate flush of FastReg and LocalInv Chuck Lever III
@ 2021-03-17  8:54 ` Bernard Metzler
  2021-03-17 15:07   ` Tom Talpey
  2021-03-17 15:14 ` Chuck Lever III
  1 sibling, 1 reply; 6+ messages in thread
From: Bernard Metzler @ 2021-03-17  8:54 UTC
  To: Chuck Lever III; +Cc: linux-rdma

-----"Chuck Lever III" <chuck.lever@oracle.com> wrote: -----

>To: "linux-rdma" <linux-rdma@vger.kernel.org>
>From: "Chuck Lever III" <chuck.lever@oracle.com>
>Date: 03/16/2021 08:59PM
>Subject: [EXTERNAL] FastLinQ: possible duplicate flush of FastReg and
>LocalInv
>
>Hi-
>
>I've been trying to track down some crashes when running NFS/RDMA
>tests over FastLinQ devices in iWARP mode. To make it stressful,
>I've enabled disconnect injection, where rpcrdma injects a
>connection disconnect every so often.
>
>As part of a disconnect event, the Receive and Send queues are
>drained. Sometimes I see a duplicate flush for one or more of
>memory registration ops. This is not a big deal for FastReg
>because its completion handler is basically a no-op.
>
>But for LocalInv this is a problem. On a flushed completion, the
>MR is destroyed. If the completion occurs again, of course, all
>kinds of badness happens because we're DMA-unmapping twice,
>touching memory that has already been freed, and deleting from a
>list_head that is poisonous.
>
>The last straw is that wc_localinv_done calls the generic RPC layer
>to indicate that an RPC Reply is ready. The duplicate flush
>dereferences one or more NULL pointers.
>
>Doesn't the verbs API contract stipulate that every posted WR gets
>exactly one completion? I don't see this behavior with other
>providers.
>
Indeed. Nothing else is defined and applications obviously
rely on correctness in that respect.



* Re: FastLinQ: possible duplicate flush of FastReg and LocalInv
  2021-03-17  8:54 ` Bernard Metzler
@ 2021-03-17 15:07   ` Tom Talpey
  0 siblings, 0 replies; 6+ messages in thread
From: Tom Talpey @ 2021-03-17 15:07 UTC
  To: Bernard Metzler, Chuck Lever III; +Cc: linux-rdma

On 3/17/2021 4:54 AM, Bernard Metzler wrote:
> -----"Chuck Lever III" <chuck.lever@oracle.com> wrote: -----
> 
>> To: "linux-rdma" <linux-rdma@vger.kernel.org>
>> From: "Chuck Lever III" <chuck.lever@oracle.com>
>> Date: 03/16/2021 08:59PM
>> Subject: [EXTERNAL] FastLinQ: possible duplicate flush of FastReg and
>> LocalInv
>>
>> Hi-
>>
>> I've been trying to track down some crashes when running NFS/RDMA
>> tests over FastLinQ devices in iWARP mode. To make it stressful,
>> I've enabled disconnect injection, where rpcrdma injects a
>> connection disconnect every so often.
>>
>> As part of a disconnect event, the Receive and Send queues are
>> drained. Sometimes I see a duplicate flush for one or more of
>> memory registration ops. This is not a big deal for FastReg
>> because its completion handler is basically a no-op.
>>
>> But for LocalInv this is a problem. On a flushed completion, the
>> MR is destroyed. If the completion occurs again, of course, all
>> kinds of badness happens because we're DMA-unmapping twice,
>> touching memory that has already been freed, and deleting from a
>> list_head that is poisonous.
>>
>> The last straw is that wc_localinv_done calls the generic RPC layer
>> to indicate that an RPC Reply is ready. The duplicate flush
>> dereferences one or more NULL pointers.
>>
>> Doesn't the verbs API contract stipulate that every posted WR gets
>> exactly one completion? I don't see this behavior with other
>> providers.
>>
> Indeed. Nothing else is defined and applications obviously
> rely on correctness in that respect.

Totally agree - any WR successfully posted must be completed, exactly
once. A missing or multiple completion is a provider bug.

Chuck, you might verify that every ib_post_send() call return code
is being checked. If you missed an error, that would allow for a
missed completion. But never a double completion, that's on the
provider.
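
As a rough illustration of the return code check suggested above, the
following userspace sketch counts a WR as outstanding only when the
post succeeds. demo_mock_post() is a stand-in, not the real
ib_post_send(), and struct demo_qp and demo_post_checked() are
hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_qp {
	int fail_posts;		/* nonzero makes the mock post fail */
	int outstanding;	/* completions the caller still expects */
};

/* Mock post: stands in for a verbs post call in this sketch. */
static int demo_mock_post(const struct demo_qp *qp)
{
	return qp->fail_posts ? -ENOTCONN : 0;
}

/*
 * Post one WR and account for it only on success. A failed post
 * was never queued, so no completion will arrive for it and the
 * caller must clean up immediately.
 */
static int demo_post_checked(struct demo_qp *qp)
{
	int rc;

	assert(qp != NULL);

	rc = demo_mock_post(qp);
	if (rc)
		return rc;

	qp->outstanding++;
	return 0;
}
```

Skipping the check would leave the caller waiting for a completion
that never comes, which is the missed completion case, not the double
completion case.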

Tom.


* Re: FastLinQ: possible duplicate flush of FastReg and LocalInv
  2021-03-16 19:58 FastLinQ: possible duplicate flush of FastReg and LocalInv Chuck Lever III
  2021-03-17  8:54 ` Bernard Metzler
@ 2021-03-17 15:14 ` Chuck Lever III
  2021-03-17 18:39   ` Tom Talpey
  1 sibling, 1 reply; 6+ messages in thread
From: Chuck Lever III @ 2021-03-17 15:14 UTC
  To: linux-rdma


> On Mar 16, 2021, at 3:58 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
> Hi-
> 
> I've been trying to track down some crashes when running NFS/RDMA
> tests over FastLinQ devices in iWARP mode. To make it stressful,
> I've enabled disconnect injection, where rpcrdma injects a
> connection disconnect every so often.
> 
> As part of a disconnect event, the Receive and Send queues are
> drained. Sometimes I see a duplicate flush for one or more of
> memory registration ops. This is not a big deal for FastReg
> because its completion handler is basically a no-op.
> 
> But for LocalInv this is a problem. On a flushed completion, the
> MR is destroyed. If the completion occurs again, of course, all
> kinds of badness happens because we're DMA-unmapping twice,
> touching memory that has already been freed, and deleting from a
> list_head that is poisonous.
> 
> The last straw is that wc_localinv_done calls the generic RPC layer
> to indicate that an RPC Reply is ready. The duplicate flush
> dereferences one or more NULL pointers.

So this looked to me like a Queue wrap. After sleeping on it, I
decided to try disabling xprtrdma's Send signal batching. Setting
ep_send_batch to zero causes every Send WR to be signaled, and
that makes the problem go away.

This is a little surprising. Every LocalInv chain is signaled. The
only possible accounting error might be that ep_send_count does
not count FastReg WRs, which are always unsignaled.
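
For readers unfamiliar with signal batching, here is a simplified
userspace model of the kind of countdown described above. It is an
assumption about the general shape, not a copy of the xprtrdma code:
struct demo_ep, its fields and demo_send_is_signaled() are hypothetical
names loosely following ep_send_batch and ep_send_count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_ep {
	unsigned int send_batch;	/* unsignaled Sends allowed between signaled ones */
	unsigned int send_count;	/* unsignaled Sends left before the next signaled one */
};

/*
 * Decide whether the next Send WR should be signaled. When the
 * countdown reaches zero the Send is signaled and the countdown
 * is reloaded from send_batch. With send_batch == 0 the countdown
 * is always zero, so every Send is signaled.
 */
static bool demo_send_is_signaled(struct demo_ep *ep)
{
	assert(ep != NULL);

	if (ep->send_count == 0) {
		ep->send_count = ep->send_batch;
		return true;
	}

	ep->send_count--;
	return false;
}
```

Note that only Send WRs pass through this countdown in the model, so
unsignaled WRs of other types consume Send Queue entries without being
counted, which is the accounting gap speculated about above.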

More investigation needed.


> Doesn't the verbs API contract stipulate that every posted WR gets
> exactly one completion? I don't see this behavior with other
> providers.
> 
> Thanks for any advice.


--
Chuck Lever





* Re: FastLinQ: possible duplicate flush of FastReg and LocalInv
  2021-03-17 15:14 ` Chuck Lever III
@ 2021-03-17 18:39   ` Tom Talpey
  2021-03-25 17:26     ` Chuck Lever III
  0 siblings, 1 reply; 6+ messages in thread
From: Tom Talpey @ 2021-03-17 18:39 UTC
  To: Chuck Lever III, linux-rdma

On 3/17/2021 11:14 AM, Chuck Lever III wrote:
> 
>> On Mar 16, 2021, at 3:58 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
>>
>> Hi-
>>
>> I've been trying to track down some crashes when running NFS/RDMA
>> tests over FastLinQ devices in iWARP mode. To make it stressful,
>> I've enabled disconnect injection, where rpcrdma injects a
>> connection disconnect every so often.
>>
>> As part of a disconnect event, the Receive and Send queues are
>> drained. Sometimes I see a duplicate flush for one or more of
>> memory registration ops. This is not a big deal for FastReg
>> because its completion handler is basically a no-op.
>>
>> But for LocalInv this is a problem. On a flushed completion, the
>> MR is destroyed. If the completion occurs again, of course, all
>> kinds of badness happens because we're DMA-unmapping twice,
>> touching memory that has already been freed, and deleting from a
>> list_head that is poisonous.
>>
>> The last straw is that wc_localinv_done calls the generic RPC layer
>> to indicate that an RPC Reply is ready. The duplicate flush
>> dereferences one or more NULL pointers.
> 
> So this looked to me like a Queue wrap. After sleeping on it, I
> decided to try disabling xprtrdma's Send signal batching. Setting
> ep_send_batch to zero causes every Send WR to be signaled, and
> that makes the problem go away.
> 
> This is a little surprising. Every LocalInv chain is signaled. The
> only possible accounting error might be that ep_send_count does
> not count FastReg WRs, which are always unsignaled.

Well, perhaps you're posting several WRs, and the connection is being
dropped before you post them all. Therefore, you bail out with the
last one you did post being unsignaled. You had better hope that last
one is flushed, because if it completed successfully, you may have a
missing interrupt.

It's really tricky to get unsignaled right, when errors occur. It
might still be the provider, but there are possibilities on both
sides of the API.

> More investigation needed.

Indeed, and good hunting!

Tom.

>> Doesn't the verbs API contract stipulate that every posted WR gets
>> exactly one completion? I don't see this behavior with other
>> providers.
>>
>> Thanks for any advice.
> 
> 
> --
> Chuck Lever
> 
> 
> 
> 


* Re: FastLinQ: possible duplicate flush of FastReg and LocalInv
  2021-03-17 18:39   ` Tom Talpey
@ 2021-03-25 17:26     ` Chuck Lever III
  0 siblings, 0 replies; 6+ messages in thread
From: Chuck Lever III @ 2021-03-25 17:26 UTC
  To: linux-rdma; +Cc: Michal Kalderon, Ariel Elior


> On Mar 17, 2021, at 2:39 PM, Tom Talpey <tom@talpey.com> wrote:
> 
> On 3/17/2021 11:14 AM, Chuck Lever III wrote:
>>> On Mar 16, 2021, at 3:58 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
>>> 
>>> Hi-
>>> 
>>> I've been trying to track down some crashes when running NFS/RDMA
>>> tests over FastLinQ devices in iWARP mode. To make it stressful,
>>> I've enabled disconnect injection, where rpcrdma injects a
>>> connection disconnect every so often.
>>> 
>>> As part of a disconnect event, the Receive and Send queues are
>>> drained. Sometimes I see a duplicate flush for one or more of
>>> memory registration ops. This is not a big deal for FastReg
>>> because its completion handler is basically a no-op.
>>> 
>>> But for LocalInv this is a problem. On a flushed completion, the
>>> MR is destroyed. If the completion occurs again, of course, all
>>> kinds of badness happens because we're DMA-unmapping twice,
>>> touching memory that has already been freed, and deleting from a
>>> list_head that is poisonous.
>>> 
>>> The last straw is that wc_localinv_done calls the generic RPC layer
>>> to indicate that an RPC Reply is ready. The duplicate flush
>>> dereferences one or more NULL pointers.
>> So this looked to me like a Queue wrap. After sleeping on it, I
>> decided to try disabling xprtrdma's Send signal batching. Setting
>> ep_send_batch to zero causes every Send WR to be signaled, and
>> that makes the problem go away.
>> This is a little surprising. Every LocalInv chain is signaled. The
>> only possible accounting error might be that ep_send_count does
>> not count FastReg WRs, which are always unsignaled.
> 
> Well, perhaps you're posting several WRs, and the connection is being
> dropped before you post them all. Therefore, you bail out with the
> last one you did post being unsignaled. You had better hope that last
> one is flushed, because if it completed successfully, you may have a
> missing interrupt.
> 
> It's really tricky to get unsignaled right, when errors occur. It
> might still be the provider, but there are possibilities on both
> sides of the API.

My current theory is that the only duplicate completions occur when
WRs have been posted after a disconnect. This happens in the window
where the workload is still active and the connection has been lost,
but before the DISCONNECTED CM event.

My expectation was that such a WR would flush through and complete
once. What I'm seeing is that on occasion one or more WRs that
were posted in this window complete twice.

If I add some logic to block posting in that window, the duplicate
completion problem seems to go away. The test runs long enough
without a duplicate completion that I hit other bugs.
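
The blocking logic is not shown in this thread. As a hedged userspace
sketch of the general idea, the model below refuses new posts once the
connection has been marked lost. How the loss is detected before the
DISCONNECTED CM event is left open here; struct demo_conn,
demo_mark_lost() and demo_post() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_conn {
	bool accepting_posts;	/* cleared as soon as loss is suspected */
	int posted;		/* WRs handed to the (imaginary) provider */
};

/* Close the window: called when the connection is believed lost. */
static void demo_mark_lost(struct demo_conn *conn)
{
	assert(conn != NULL);
	conn->accepting_posts = false;
}

/*
 * Post one WR unless the connection has been marked lost. A refused
 * WR never reaches the provider, so it cannot be flushed, once or
 * twice.
 */
static int demo_post(struct demo_conn *conn)
{
	assert(conn != NULL);

	if (!conn->accepting_posts)
		return -ENOTCONN;

	conn->posted++;
	return 0;
}
```

In a real transport the flag and the post path would need to be
serialized against each other; this single-threaded model ignores that.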

I never see duplicate Receive or Send completions.

When a duplicate completion occurs with LocalInv, I typically see
duplicate completions for all WRs on the same chained post. That
might be the case for FastReg also, I haven't looked closely, but
the Send WR these are chained to never sees a duplicate completion
(could be my duplicate checking logic for Sends doesn't work?).

This is with a QLogic Corp. FastLinQ QL41212HLCU 25GbE Adapter and
Storm FW 8.42.2.0, Management FW 8.30.18.0 [MBI 8.30.29].


--
Chuck Lever



