* Potential lost receive WCs (was "[PATCH WIP 38/43]")
@ 2015-07-24 20:26 Chuck Lever
       [not found] ` <7824831C-3CC5-49C4-9E0B-58129D0E7FFF-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Chuck Lever @ 2015-07-24 20:26 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma


>>>> During some other testing I found that when a completion upcall
>>>> returns to the provider leaving CQEs still on the completion queue,
>>>> there is a non-zero probability that a completion will be lost.
>>> 
>>> What does lost mean?
>> 
>> Lost means a WC in the CQ is skipped by ib_poll_cq().
>> 
>> In other words, I expected that during the next upcall,
>> ib_poll_cq() would return WCs that were not processed, starting
>> with the last one on the CQ when my upcall handler returned.
> 
> Yes, this is what it should do. I wouldn't expect a timely upcall, but
> none should be lost.
> 
>> I found this by intentionally having the completion handler
>> process only one or two WCs and then return.
>> 
>>> The CQ is edge triggered, so if you don't drain it you might not get
>>> another timely CQ callback (which is bad), but CQEs themselves should
>>> not be lost.
>> 
>> I’m not sure I fully understand this problem; it might
>> even be my misunderstanding of ib_poll_cq(). But forcing
>> the completion upcall handler to completely drain the CQ
>> during each upcall prevents the issue.
> 
> CQEs should never be lost.
> 
> The idea that you can completely drain the CQ during the upcall is
> inherently racey, so this cannot be the answer to whatever the problem
> is..

Hrm, ok. Completely draining the CQ is how the upcall handler
worked before commit 8301a2c047cc.


> Is there any chance this is still an artifact of the lazy SQE flow
> control? The RDMA buffer SQE recycling is solved by the sync
> invalidate, but workloads that don't use RDMA buffers (ie SEND only)
> will still run without proper flow control…

I can’t see how it can be related to the send queue these days.
The CQ is split. The behavior I observed was in the receive
completion path. All RECVs are signaled, and there are a fixed
and limited number of reply buffers that match the number of
RPC/RDMA credits.

Basically the RPC workflow stopped because an RPC reply never
arrived.

The send queue accounting problem would cause the client to
stop sending RPC requests before we hit our credit limit.


> If you are totally certain a CQ was dropped from ib_poll_cq, and that
> the SQ is not overflowing by strict accounting, then I'd say driver
> problem, but the odds of having an undetected driver problem like that
> at this point seem somehow small…

Under normal circumstances, most ULPs are prepared to deal with a
large number of WCs per upcall. IMO the issue would be difficult to
hit unless you have rigged the upcall handler to force the problem
to occur (poll once with a small ib_wc array size and then return
from the upcall handler).
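
For illustration only, here is a minimal sketch of that kind of rigged
receive upcall; RIGGED_BUDGET and handle_recv_wc() are made-up names
for this example, not the actual xprtrdma code:

#include <rdma/ib_verbs.h>

/* Sketch: poll a small, fixed budget once and return.  Any CQEs left
 * behind stay on the CQ, and because the CQ upcall is edge-triggered,
 * no further upcall arrives until a new CQE is added.
 */
#define RIGGED_BUDGET 3

static void rigged_recv_upcall(struct ib_cq *cq, void *cq_context)
{
	struct ib_wc wcs[RIGGED_BUDGET];
	int count, i;

	count = ib_poll_cq(cq, RIGGED_BUDGET, wcs);
	for (i = 0; i < count; i++)
		handle_recv_wc(&wcs[i]);	/* hypothetical per-WC handler */

	/* Returns without re-polling or re-arming the CQ. */
}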

I will have some time to experiment next week. Thanks for confirming
my understanding of ib_poll_cq().


--
Chuck Lever




* Re: Potential lost receive WCs (was "[PATCH WIP 38/43]")
       [not found] ` <7824831C-3CC5-49C4-9E0B-58129D0E7FFF-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2015-07-24 20:46   ` Jason Gunthorpe
       [not found]     ` <20150724204604.GA28244-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2015-07-24 20:46 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-rdma

On Fri, Jul 24, 2015 at 04:26:00PM -0400, Chuck Lever wrote:
> Basically the RPC workflow stopped because an RPC reply never
> arrived.

Oh, that is what I expect to see. Remember the cq upcall is edge
triggered, so if you leave stuff in the cq then you don't get another
upcall until another CQE is added. If adding another CQE is somehow
contingent on the CQE left behind then the scheme deadlocks.

The CQE is not lost because calling ib_poll_cq from outside the upcall
will return it.

To confirm a loss you need to see ib_poll_cq return no results and
confirm that an expected CQE is missing.

The driver is expected to avoid racing with the upcall and guarantee
new CQEs will trigger no matter how many CQEs are consumed by the ULP.

So, as Steve said, if the ULP leaves CQEs behind then it must do
something to guarantee that ib_poll_cq is eventually called to collect
them, or not care about forward progress on the CQ.

Does that make sense and explain what you saw?

If yes, I recommend revising the commit and comment language. CQEs are
not lost; only the upcall isn't happening.

Jason

* Re: Potential lost receive WCs (was "[PATCH WIP 38/43]")
       [not found]     ` <20150724204604.GA28244-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-07-29 20:47       ` Chuck Lever
       [not found]         ` <E855E210-F640-4104-9B35-2A75DF1BF2E3-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Chuck Lever @ 2015-07-29 20:47 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma

Hi Jason-


On Jul 24, 2015, at 4:46 PM, Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:

> On Fri, Jul 24, 2015 at 04:26:00PM -0400, Chuck Lever wrote:
>> Basically the RPC workflow stopped because an RPC reply never
>> arrived.
> 
> Oh, that is what I expect to see. Remember the cq upcall is edge
> triggered, so if you leave stuff in the cq then you don't get another
> upcall until another CQE is added. If adding another CQE is somehow
> contingent on the CQE left behind then the scheme deadlocks.
> 
> The CQE is not lost because calling ib_poll_cq from outside the upcall
> will return it.
> 
> To confirm a loss you need to see ib_poll_cq return no results and
> confirm that an expected CQE is missing.

I tested this again, now with the patches that ensure invalidate
WRs are complete before allowing more RPCs to be dispatched. I
set the poll budget to three ib_wc's per receive upcall.

During a write-intensive workload, the RPC workflow pauses. After
a minute the RPC upper layer emits a retransmit for the missing
work, which generates a fresh server reply and RECV completion.

At that point I see a duplicate XID, which is a sign that the
original CQE was still on the CQ but no upcall was done.

The RPC workflow then resumes.


> The driver is expected to avoid racing with the upcall and guarantee
> new CQEs will trigger no matter how many CQEs are consumed by the ULP.
> 
> So, as Steve said, if the ULP leaves CQEs behind then it must do
> something to guarantee that ib_poll_cq is eventually called to collect
> them, or not care about forward progress on the CQ.
> 
> Does that make sense and explain what you saw?

It seems to, yes.

The design of the current upcall handler is based on the assumption
that the provider will call again immediately if there are still
CQEs to consume. Apparently this is true for some providers, and not
for others, and I misunderstood that when I put this together last
year.

The budgeting mechanism that I copied from another kernel ULP seems
inappropriate for xprtrdma. Perhaps it's unnecessary since sending
RPCs is flow controlled based on the reply traffic.


> If yes, I recommend revising the commit and comment language. CQEs are
> not lost; only the upcall isn't happening.

I would like to change the upcall handler to poll until ib_poll_cq
says the CQ is empty, but I don't understand this remark:

> The idea that you can completely drain the CQ during the upcall is
> inherently racey, so this cannot be the answer to whatever the problem
> is..

I thought IB_CQ_REPORT_MISSED_EVENTS was supposed to close the race
windows here. And Section 8.2.5 of draft-hilland-rddp-verbs
recommends dequeuing all existing CQEs.


--
Chuck Lever




* Re: Potential lost receive WCs (was "[PATCH WIP 38/43]")
       [not found]         ` <E855E210-F640-4104-9B35-2A75DF1BF2E3-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2015-07-29 21:15           ` Jason Gunthorpe
       [not found]             ` <20150729211557.GA16284-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2015-07-29 21:15 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-rdma

On Wed, Jul 29, 2015 at 04:47:59PM -0400, Chuck Lever wrote:

> Apparently this is true for some providers, and not for others, and
> I misunderstood that when I put this together last year.

Really? In kernel providers? Interesting, those are probably wrong...

> > The idea that you can completely drain the CQ during the upcall is
> > inherently racey, so this cannot be the answer to whatever the problem
> > is..

This comment was directed toward using a complete drain to cover up a
driver bug.

A full drain to guarantee ULP progress is OK and the driver must make
sure that case isn't racey.

Which is done via:

> I thought IB_CQ_REPORT_MISSED_EVENTS was supposed to close the race
> windows here.

Basically:
 * Don't call ib_req_notify_cq unless you think the CQ is empty
 * Don't expect an upcall until you call ib_req_notify_cq
 * Call ib_req_notify_cq last
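
Put together, those rules look roughly like the sketch below;
process_wc() is a placeholder for whatever the ULP does per
completion, not a verbs API:

#include <rdma/ib_verbs.h>

/* Sketch: drain completely, then re-arm last.  A positive return from
 * ib_req_notify_cq() with IB_CQ_REPORT_MISSED_EVENTS means completions
 * arrived between the final poll and the re-arm, so poll again instead
 * of waiting for an upcall that may never come.
 */
static void drain_and_rearm(struct ib_cq *cq)
{
	struct ib_wc wc;

	do {
		while (ib_poll_cq(cq, 1, &wc) > 0)
			process_wc(&wc);	/* hypothetical per-WC handler */
	} while (ib_req_notify_cq(cq, IB_CQ_NEXT_COMP |
				      IB_CQ_REPORT_MISSED_EVENTS) > 0);
}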

> And Section 8.2.5 of draft-hilland-rddp-verbs recommends dequeuing
> all existing CQEs.

The drivers we have that don't dequeue all the CQEs are doing
something like NAPI polling and have other mechanisms to guarantee
progress. Don't copy something like budget without copying the other
mechanisms :)

Jason

* Re: Potential lost receive WCs (was "[PATCH WIP 38/43]")
       [not found]             ` <20150729211557.GA16284-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
@ 2015-07-29 21:19               ` Chuck Lever
       [not found]                 ` <DC5354A4-3EB4-46FF-AA34-9AE26DD25031-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Chuck Lever @ 2015-07-29 21:19 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: linux-rdma


On Jul 29, 2015, at 5:15 PM, Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org> wrote:

> On Wed, Jul 29, 2015 at 04:47:59PM -0400, Chuck Lever wrote:
> 
>> Apparently this is true for some providers, and not for others, and
>> I misunderstood that when I put this together last year.
> 
> Really? In kernel providers? Interesting, those are probably wrong...
> 
>>> The idea that you can completely drain the CQ during the upcall is
>>> inherently racey, so this cannot be the answer to whatever the problem
>>> is..
> 
> This comment was directed toward using a complete drain to cover up a
> driver bug.
> 
> A full drain to guarantee ULP progress is OK and the driver must make
> sure that case isn't racey.
> 
> Which is done via:
> 
>> I thought IB_CQ_REPORT_MISSED_EVENTS was supposed to close the race
>> windows here.
> 
> Basically:
> * Don't call ib_req_notify_cq unless you think the CQ is empty
> * Don't expect an upcall until you call ib_req_notify_cq
> * Call ib_req_notify_cq last
> 
>> And Section 8.2.5 of draft-hilland-rddp-verbs recommends dequeuing
>> all existing CQEs.
> 
> The drivers we have that don't dequeue all the CQEs are doing
> something like NAPI polling and have other mechanisms to guarantee
> progress. Don't copy something like budget without copying the other
> mechanisms :)

OK, that makes total sense. Thanks for clarifying.


--
Chuck Lever




* Re: Potential lost receive WCs (was "[PATCH WIP 38/43]")
       [not found]                 ` <DC5354A4-3EB4-46FF-AA34-9AE26DD25031-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
@ 2015-07-30  7:00                   ` Sagi Grimberg
       [not found]                     ` <55B9CB78.9040501-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2015-07-30  7:00 UTC (permalink / raw)
  To: Chuck Lever, Jason Gunthorpe; +Cc: linux-rdma


>> The drivers we have that don't dequeue all the CQEs are doing
>> something like NAPI polling and have other mechanisms to guarantee
>> progress. Don't copy something like budget without copying the other
>> mechanisms :)
>
> OK, that makes total sense. Thanks for clarifying.

IIRC NAPI runs in soft-IRQ context, which Chuck is trying to avoid.

Chuck, I think I was the one that commented on this. I observed a
situation in iser where the polling loop kept going continuously
without ever leaving the soft-IRQ context (high workload obviously).
In addition to the polling loop hogging the CPU, other CQs with the
same IRQ assignment were starved. So I suggested you should take care
of it in xprtrdma as well.

The correct approach is NAPI. There is an equivalent for storage
called blk_iopoll (block/blk-iopoll.c) which, despite its name, has
nothing specific to block devices (it also runs in soft-IRQ context).
I attempted to convert iser to use it, but I saw some unpredictable
latency jitter, so I stopped and haven't had a chance to pick it up
since.

I still think that draining the CQ without respecting a quota is
wrong, even if driverX has a glitch there.

Sagi.

* Re: Potential lost receive WCs (was "[PATCH WIP 38/43]")
       [not found]                     ` <55B9CB78.9040501-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
@ 2015-07-30 14:51                       ` Chuck Lever
  2015-07-30 16:00                       ` Jason Gunthorpe
  1 sibling, 0 replies; 8+ messages in thread
From: Chuck Lever @ 2015-07-30 14:51 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: Jason Gunthorpe, linux-rdma


On Jul 30, 2015, at 3:00 AM, Sagi Grimberg <sagig-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org> wrote:

> 
>>> The drivers we have that don't dequeue all the CQEs are doing
>>> something like NAPI polling and have other mechanisms to guarantee
>>> progress. Don't copy something like budget without copying the other
>>> mechanisms :)
>> 
>> OK, that makes total sense. Thanks for clarifying.
> 
> IIRC NAPI runs in soft-IRQ context, which Chuck is trying to avoid.
> 
> Chuck, I think I was the one that commented on this. I observed a
> situation in iser where the polling loop kept going continuously
> without ever leaving the soft-IRQ context (high workload obviously).
> In addition to the polling loop hogging the CPU, other CQs with the
> same IRQ assignment were starved. So I suggested you should take care
> of it in xprtrdma as well.
> 
> The correct approach is NAPI. There is an equivalent for storage
> called blk_iopoll (block/blk-iopoll.c) which, despite its name, has
> nothing specific to block devices (it also runs in soft-IRQ context).
> I attempted to convert iser to use it, but I saw some unpredictable
> latency jitter, so I stopped and haven't had a chance to pick it up
> since.
> 
> I still think that draining the CQ without respecting a quota is
> wrong, even if driverX has a glitch there.

The iWARP and IBTA specs disagree: they both recommend clearing
existing CQEs when handling a completion upcall. Thus the API is
designed with the expectation that consumers do not impose a poll
budget.

Any solution to the starvation problem, including quota + NAPI,
involves deferring receive work. xprtrdma already defers work.

Our completion handlers are lightweight. The bulk of receive
handling is done in softIRQ in a tasklet that handles each RPC
reply in a loop. It's more likely the tasklet loop, rather than
completion handling, is going to result in starvation.

The only issue we've seen so far is that the reply tasklet can hog
one CPU because it is single-threaded across all transport
connections. Thus it is more effective for us to replace the
tasklet with a work queue where each RPC reply can be globally
scheduled and does not interfere with other work being done
by softIRQ.

In other words, the starvation issue seen in xprtrdma is not
in the receive handler, so fixing it there is likely to be
ineffective.


--
Chuck Lever




* Re: Potential lost receive WCs (was "[PATCH WIP 38/43]")
       [not found]                     ` <55B9CB78.9040501-LDSdmyG8hGV8YrgS2mwiifqBs+8SCbDb@public.gmane.org>
  2015-07-30 14:51                       ` Chuck Lever
@ 2015-07-30 16:00                       ` Jason Gunthorpe
  1 sibling, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2015-07-30 16:00 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: Chuck Lever, linux-rdma

On Thu, Jul 30, 2015 at 10:00:08AM +0300, Sagi Grimberg wrote:

> I still think that draining the CQ without respecting a quota is
> wrong, even if driverX has a glitch there.

Sure, but you can't just return from the CQ upcall after polling only
a budget's worth of completions and expect to be called again in the
future. That is absolutely wrong.

It is very difficult to mix and match processing in the CQ upcall and
in another context.

So, either you drain the whole thing in the CQ upcall, or delegate to
another context and process it there, possibly sleeping during
processing when the budget is hit.

Overall, the process is the same: drain entirely before calling
ib_req_notify_cq, and don't expect any CQ upcalls until
ib_req_notify_cq is called.
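
For illustration, a sketch of the delegation variant under those same
rules; struct example_cq, POLL_BUDGET, and process_wc() are assumptions
made up for the example, not any existing driver's code:

#include <linux/sched.h>
#include <linux/workqueue.h>
#include <rdma/ib_verbs.h>

#define POLL_BUDGET 16		/* illustrative value */

struct example_cq {
	struct ib_cq		*cq;
	struct work_struct	work;	/* INIT_WORK(&work, example_cq_work) at setup */
};

/* The upcall itself only schedules process-context work. */
static void example_cq_upcall(struct ib_cq *cq, void *cq_context)
{
	struct example_cq *ecq = cq_context;

	queue_work(system_wq, &ecq->work);
}

static void example_cq_work(struct work_struct *work)
{
	struct example_cq *ecq = container_of(work, struct example_cq, work);
	struct ib_wc wc;
	int polled = 0;

	do {
		while (ib_poll_cq(ecq->cq, 1, &wc) > 0) {
			process_wc(&wc);	/* hypothetical per-WC handler */
			if (++polled >= POLL_BUDGET) {
				cond_resched();	/* process context: OK to yield */
				polled = 0;
			}
		}
		/* The CQ looks empty: re-arm last, and loop again if
		 * completions arrived while re-arming. */
	} while (ib_req_notify_cq(ecq->cq, IB_CQ_NEXT_COMP |
					   IB_CQ_REPORT_MISSED_EVENTS) > 0);
}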

Jason
