From: Bob Pearson <rpearsonhpe@gmail.com>
To: Zhu Yanjun <zyjzyj2000@gmail.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>,
	RDMA mailing list <linux-rdma@vger.kernel.org>,
	Bob Pearson <rpearson@hpe.com>
Subject: Re: [PATCH for-next v2] RDMA/rxe: Fix ib_device reference counting (again)
Date: Thu, 4 Mar 2021 12:36:04 -0600
Message-ID: <02bd870d-3f3e-1735-3cd9-686ef744c042@gmail.com>
In-Reply-To: <CAD=hENcj91eT0VBQVExPBbg9K8+NNPr6BT_B47Q9cWqUP2KEcw@mail.gmail.com>

On 3/4/21 2:58 AM, Zhu Yanjun wrote:
> On Thu, Mar 4, 2021 at 7:02 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>>
>> Three errors occurred in the fix referenced below.
>>
>> 1) rxe_rcv_mcast_pkt() dropped a reference to ib_device when
>> no error occurred, causing an underflow on the reference counter.
>> This code is cleaned up to be clearer and easier to read.
>>
>> 2) Extending the reference taken by rxe_get_dev_from_net() in
>> rxe_udp_encap_recv() until each skb is freed was not matched by
>> a reference in the loopback path resulting in underflows.
>>
>> 3) In rxe_comp.c the function free_pkt() did not clear skb which
>> triggered a warning at done: and could possibly at exit: in
>> rxe_completer(). The WARN_ONCE() calls are not actually needed.
>>
>> This patch fixes these errors.
>>
>> Fixes: 899aba891cab ("RDMA/rxe: Fix FIXME in rxe_udp_encap_recv()")
>> Signed-off-by: Bob Pearson <rpearson@hpe.com>
>> ---
>> Version 2:
>> v1 of this patch incorrectly added a WARN_ON_ONCE in rxe_completer
>> where it could be triggered for normal traffic. This version
>> replaced that with a pr_warn located correctly.
>>
>> v1 of this patch placed a call to kfree_skb in an if statement
>> that could trigger style warnings. This version cleans that up.
>>
>>  drivers/infiniband/sw/rxe/rxe_comp.c |  6 +--
>>  drivers/infiniband/sw/rxe/rxe_net.c  | 10 ++++-
>>  drivers/infiniband/sw/rxe/rxe_recv.c | 60 +++++++++++++++++-----------
>>  3 files changed, 48 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
>> index a8ac791a1bb9..96e5a73579f8 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_comp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_comp.c
>> @@ -672,8 +672,10 @@ int rxe_completer(void *arg)
>>                          */
>>
>>                         /* there is nothing to retry in this case */
>> -                       if (!wqe || (wqe->state == wqe_state_posted))
>> +                       if (!wqe || (wqe->state == wqe_state_posted)) {
>> +                               pr_warn("Retry attempted without a valid wqe\n");
>>                                 goto exit;
>> +                       }
>>
>>                         /* if we've started a retry, don't start another
>>                          * retry sequence, unless this is a timeout.
>> @@ -750,7 +752,6 @@ int rxe_completer(void *arg)
>>         /* we come here if we are done with processing and want the task to
>>          * exit from the loop calling us
>>          */
>> -       WARN_ON_ONCE(skb);
>>         rxe_drop_ref(qp);
>>         return -EAGAIN;
>>
>> @@ -758,7 +759,6 @@ int rxe_completer(void *arg)
>>         /* we come here if we have processed a packet we want the task to call
>>          * us again to see if there is anything else to do
>>          */
>> -       WARN_ON_ONCE(skb);
> 
> With the above line kept, I tested this commit:
> 1. git clone  https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
> 2. cd rdma && git pull
> 3. apply this commit, keeping "WARN_ON_ONCE(skb);"
> 4. run tests with "rping ...."
> A similar problem still occurs.
> 
> Zhu Yanjun

The WARNs occur because skb is not getting cleared, not because packets are not being
freed.

The issue is whether the skbs are freed, not whether a local variable still holds the
(old) address of an skb. The following would trigger a warning, but the warning would
not mean anything:

	skb = alloc_skb(...);
	kfree_skb(skb);
	WARN_ON_ONCE(skb);
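
The same point can be shown in a rough user-space sketch. This is not rxe code:
struct pkt, pkt_alloc(), pkt_free() and pkt_free_and_clear() are hypothetical
stand-ins, and the "allocator" is a single static slot so that the pointer can be
inspected safely after the free.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb; not the kernel's struct sk_buff. */
struct pkt {
	int in_use;
};

/* One-slot "allocator" so the stale pointer stays safe to look at. */
static struct pkt pool_slot;

struct pkt *pkt_alloc(void)
{
	pool_slot.in_use = 1;
	return &pool_slot;
}

/* Releases the packet but cannot touch the caller's local variable. */
void pkt_free(struct pkt *p)
{
	if (p)
		p->in_use = 0;
}

/* Releases the packet and clears the caller's pointer as well. */
void pkt_free_and_clear(struct pkt **pp)
{
	pkt_free(*pp);
	*pp = NULL;
}

/* Returns 1 when the packet has been freed yet the local is still non-NULL. */
int stale_pointer_demo(void)
{
	struct pkt *p = pkt_alloc();

	pkt_free(p);
	/* A check like WARN_ON_ONCE(p) would fire here although nothing leaked. */
	return p != NULL && !pool_slot.in_use;
}
```

A non-NULL local after the free says nothing about whether the packet was released;
only a helper that takes the address of the pointer can clear it.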

Every path out of the subroutine calls free_pkt() except one, which occurs in the
ERROR_RETRY state when no wqe is available. I left that path alone because I was
trying not to change the behavior of the original code. All other paths call
free_pkt(), which in turn calls kfree_skb(). On that one path we leak an skb, which
is not good, so we should probably go ahead and free it too, just dropping the packet.
In that case we can move the free_pkt() call to the end and make it explicit that all
packets are actually freed. I will modify the code to do that. The WARN is still not
required.
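
A minimal user-space sketch of the "free at the end" shape, under stated assumptions:
this is not the actual rxe_completer() change, and struct pkt, pkt_alloc(),
pkt_free(), process(), live_pkts and the return codes are all made up for
illustration. Every path jumps to one label that frees the packet, so the
no-wqe path can no longer leak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical packet type and leak counter; not kernel structures. */
struct pkt {
	int opcode;
};

int live_pkts; /* packets allocated and not yet freed */

struct pkt *pkt_alloc(int opcode)
{
	struct pkt *p = malloc(sizeof(*p));

	if (p) {
		p->opcode = opcode;
		live_pkts++;
	}
	return p;
}

void pkt_free(struct pkt *p)
{
	if (p) {
		live_pkts--;
		free(p);
	}
}

/*
 * Hypothetical handler with three ways out. Each one reaches the single
 * "out" label, so the packet is freed exactly once on every path.
 */
int process(struct pkt *p, int have_wqe)
{
	int ret;

	if (p && p->opcode < 0) {
		ret = -1;	/* error path */
		goto out;
	}
	if (!have_wqe) {
		ret = -2;	/* nothing to retry: just drop the packet */
		goto out;
	}
	ret = 0;		/* normal completion */
out:
	pkt_free(p);
	return ret;
}
```

With the free in one place, checking for leaks reduces to checking that the counter
returns to zero after exercising each path.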

Bob

Thread overview: 5+ messages
2021-03-03 22:56 [PATCH for-next v2] RDMA/rxe: Fix ib_device reference counting (again) Bob Pearson
2021-03-04  8:25 ` Leon Romanovsky
2021-03-04  8:58 ` Zhu Yanjun
2021-03-04 18:36   ` Bob Pearson [this message]
2021-03-04  9:14 ` Zhu Yanjun
