linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Yishai Hadas <yishaih@dev.mellanox.co.il>
To: Jason Gunthorpe <jgg@mellanox.com>
Cc: Leon Romanovsky <leon@kernel.org>,
	Doug Ledford <dledford@redhat.com>,
	Leon Romanovsky <leonro@mellanox.com>,
	RDMA mailing list <linux-rdma@vger.kernel.org>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	Matan Barak <matanb@mellanox.com>,
	Yishai Hadas <yishaih@mellanox.com>,
	Saeed Mahameed <saeedm@mellanox.com>,
	linux-netdev <netdev@vger.kernel.org>,
	Majd Dibbiny <majd@mellanox.com>
Subject: Re: [PATCH rdma-next v2 09/20] IB/core: Improve uverbs_cleanup_ucontext algorithm
Date: Mon, 18 Jun 2018 14:27:22 +0300	[thread overview]
Message-ID: <d5f535d4-ad77-1ebd-25d5-29cbf00ea983@dev.mellanox.co.il> (raw)
In-Reply-To: <20180617195106.eai7iajx4y5w3ii6@mellanox.com>

On 6/17/2018 10:51 PM, Jason Gunthorpe wrote:
> On Sun, Jun 17, 2018 at 12:59:55PM +0300, Leon Romanovsky wrote:
> 
>> +void uverbs_cleanup_ucontext(struct ib_ucontext *ucontext, bool device_removed)
>> +{
>>   	/*
>>   	 * Waits for all remove_commit and alloc_commit to finish. Logically, We
>>   	 * want to hold this forever as the context is going to be destroyed,
>>   	 * but we'll release it since it causes a "held lock freed" BUG message.
>>   	 */
>>   	down_write(&ucontext->cleanup_rwsem);
>> +	while (!list_empty(&ucontext->uobjects))
>> +		if (__uverbs_cleanup_ucontext(ucontext, RDMA_REMOVE_DESTROY))
>> +			/* No entry was cleaned-up successfully during this iteration */
>> +			break;
> 
> No, this isn't right, it must remain REMOVE or CLOSE here. The enum is
> a signal to the driver what is going on. DESTROY is only for user
> triggered destroy called in a user context.
> 

The algorithm must enable the drivers an option to fully cleanup their 
resources as was done before this change.

Using REMOVE or CLOSE without some following change won't do the work as 
the infrastructure (e.g. remove_commit_idr_uobject) and other IB cleanup 
functions during the road (e.g. uverbs_free_qp, uverbs_free_cq) will 
force cleanup of some memory/idr/ref resources and prevent a second 
successful iteration in case of a failure.

For that reason the initial iteration should be with some relaxed mode 
and just later in case were left uncleaned-up resources the code should 
use the REMOVE/CLOSE option.

However, I do agree that we need to preserve the original signal to let 
downstream layers to know what happened, at the moment it looks like 
there is one place that it's even a must as part of 
uverbs_hot_unplug_completion_event_file() where 
'RDMA_REMOVE_DRIVER_REMOVE' is used explicitly as part of the cleanup flow.

For that I suggest below [1] patch which replaces RDMA_REMOVE_DESTROY in 
the first iteration with RDMA_REMOVE_DRIVER_REMOVE_RELAX/ 
RDMA_REMOVE_CLOSE_RELAX new enum values to do the job.

> There needs to be some kind of guarenteed return from the driver that
> destroy is failing due to elevated refcounts, and not some other
> reason.. This is just checking for any ret?

We can't rely on all drivers' return codes from legacy/current firmware 
code for all objects to return a specific ret code for that case.
Even if we had such, the second iteration where we should force cleanup 
might still fail with that ret code and a kernel memory leak would occur.

> 
>> -	while (!list_empty(&ucontext->uobjects)) {
>> -		struct ib_uobject *obj, *next_obj;
>> -		unsigned int next_order = UINT_MAX;
>> +	if (!list_empty(&ucontext->uobjects))
>> +		__uverbs_cleanup_ucontext(ucontext, device_removed ?
>> +			RDMA_REMOVE_DRIVER_REMOVE : RDMA_REMOVE_CLOSE);
> 
> Failure to cleanup is a driver bug, and should be reported with
> WARN_ON. This is also mis using the remove enum, CLOSE is not a
> 'bigger hammer'
> 

The patch saves the previous behavior that set a warn message [2] and 
not a WARN_ON, if you think that WARN_ON is better we can change to.

[2]
pr_warn("ib_uverbs: unable to remove uobject id %d err %d\n",
				obj->id, err);


The suggested patch on top of current can look like below.


diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 4685ef5..6b03ca7
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1468,8 +1468,21 @@ enum rdma_remove_reason {
         RDMA_REMOVE_DRIVER_REMOVE,
         /* Context is being cleaned-up, but commit was just completed */
         RDMA_REMOVE_DURING_CLEANUP,
+       /* Driver is being hot-unplugged, retry may be called upon an 
error */
+       RDMA_REMOVE_DRIVER_REMOVE_RELAX,
+       /* Context deletion, retry may be called upon an error */
+       RDMA_REMOVE_CLOSE_RELAX,
  };

+static inline bool ib_is_remove_retry(enum rdma_remove_reason why)
+{
+       if (why == RDMA_REMOVE_DESTROY || why == 
RDMA_REMOVE_DRIVER_REMOVE_RELAX ||
+               why == RDMA_REMOVE_CLOSE_RELAX)
+               return true;
+
+       return false;
+}
+


// In the new algorithm below enums will be used instead of 
//RDMA_REMOVE_DESTROY

@@ -700,7 +701,9 @@ void uverbs_cleanup_ucontext(struct ib_ucontext 
*ucontext, bool device_removed)
          */
         down_write(&ucontext->cleanup_rwsem);
         while (!list_empty(&ucontext->uobjects))
-               if (__uverbs_cleanup_ucontext(ucontext, 
RDMA_REMOVE_DESTROY))
+               if (__uverbs_cleanup_ucontext(ucontext, device_removed ?
+                                       RDMA_REMOVE_DRIVER_REMOVE_RELAX :
+                                       RDMA_REMOVE_CLOSE_RELAX))
                         /* No entry was cleaned-up successfully during 
this iteration */
                         break;


// Below logic will be replaced in all applicable places, I just put few 
// of to demonstrate the solution.

--- a/drivers/infiniband/core/uverbs_std_types_cq.c
+++ b/drivers/infiniband/core/uverbs_std_types_cq.c
@@ -44,7 +44,7 @@ static int uverbs_free_cq(struct ib_uobject *uobject,
         int ret;

         ret = ib_destroy_cq(cq);
-       if (!ret || why != RDMA_REMOVE_DESTROY)
+       if (!ret || !ib_is_remove_retry(why))
                 ib_uverbs_release_ucq(uobject->context->ufile, ev_queue 
? container_of(ev_queue, struct ib_uverbs_completion_event_file,


++ b/drivers/infiniband/core/rdma_core.c
@@ -360,9 +360,10 @@ static int __must_check 
remove_commit_idr_uobject(struct ib_uobject *uobj,

         /*
          * We can only fail gracefully if the user requested to destroy the
-        * object. In the rest of the cases, just remove whatever you can.
+        * object or when a retry may be called upon an error.
+        * In the rest of the cases, just remove whatever you can.
          */
-       if (why == RDMA_REMOVE_DESTROY && ret)
+       if (ret && ib_is_remove_retry(why))
                 return ret;

         ib_rdmacg_uncharge(&uobj->cg_obj, uobj->context->device,
@@ -393,7 +394,7 @@ static int __must_check 
remove_commit_fd_uobject(struct ib_uobject *uobj,
                 container_of(uobj, struct ib_uobject_file, uobj);
         int ret = fd_type->context_closed(uobj_file, why);

-       if (why == RDMA_REMOVE_DESTROY && ret)
+       if (ret && ib_is_remove_retry(why))
                 return ret;


//Here is the specific place that checks for RDMA_REMOVE_DRIVER_REMOVE. 
// it should do the work also when RDMA_REMOVE_DRIVER_REMOVE was called.

@@ -187,7 +187,7 @@ static int 
uverbs_hot_unplug_completion_event_file(struct ib_uobject_file *uobj_
         event_queue->is_closed = 1;
         spin_unlock_irq(&event_queue->lock);

-       if (why == RDMA_REMOVE_DRIVER_REMOVE) {
+       if (why == RDMA_REMOVE_DRIVER_REMOVE || why == 
RDMA_REMOVE_DRIVER_REMOVE_RELAX) {
                 wake_up_interruptible(&event_queue->poll_wait);
                 kill_fasync(&event_queue->async_queue, SIGIO, POLL_IN);
         }


What do you think ?

Yishai

  reply	other threads:[~2018-06-18 11:27 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-17  9:59 [PATCH rdma-next v2 00/20] Introduce mlx5 DEVX interface Leon Romanovsky
2018-06-17  9:59 ` [PATCH mlx5-next v2 01/20] net/mlx5_core: Prevent warns in dmesg upon firmware commands Leon Romanovsky
2018-06-17  9:59 ` [PATCH rdma-next v2 02/20] drm/i915: Move u64-to-ptr helpers to general header Leon Romanovsky
2018-06-17  9:59 ` [PATCH rdma-next v2 03/20] kernel.h: Reuse u64_to_ptr macro to cast __user pointers Leon Romanovsky
2018-06-17  9:59 ` [PATCH rdma-next v2 04/20] IB/uverbs: Export uverbs idr and fd types Leon Romanovsky
2018-06-17  9:59 ` [PATCH rdma-next v2 05/20] IB/uverbs: Refactor uverbs_finalize_objects Leon Romanovsky
2018-06-17  9:59 ` [PATCH rdma-next v2 06/20] IB/uverbs: Add PTR_IN attributes that are allocated/copied automatically Leon Romanovsky
2018-06-18 20:48   ` Jason Gunthorpe
2018-06-17  9:59 ` [PATCH rdma-next v2 07/20] IB/uverbs: Add a macro to define a type with no kernel known size Leon Romanovsky
2018-06-17  9:59 ` [PATCH rdma-next v2 08/20] IB/uverbs: Allow an empty namespace in ioctl() framework Leon Romanovsky
2018-06-17  9:59 ` [PATCH rdma-next v2 09/20] IB/core: Improve uverbs_cleanup_ucontext algorithm Leon Romanovsky
2018-06-17 19:51   ` Jason Gunthorpe
2018-06-18 11:27     ` Yishai Hadas [this message]
2018-06-17  9:59 ` [PATCH mlx5-next v2 10/20] net/mlx5: Expose DEVX ifc structures Leon Romanovsky
2018-06-17  9:59 ` [PATCH mlx5-next v2 11/20] IB/mlx5: Introduce DEVX Leon Romanovsky
2018-06-17  9:59 ` [PATCH rdma-next v2 12/20] IB/core: Introduce DECLARE_UVERBS_GLOBAL_METHODS Leon Romanovsky
2018-06-17  9:59 ` [PATCH rdma-next v2 13/20] IB: Expose ib_ucontext from a given ib_uverbs_file Leon Romanovsky
2018-06-17 10:00 ` [PATCH rdma-next v2 14/20] IB/mlx5: Add support for DEVX general command Leon Romanovsky
2018-06-17 10:00 ` [PATCH rdma-next v2 15/20] IB/mlx5: Add obj create and destroy functionality Leon Romanovsky
2018-06-17 10:00 ` [PATCH mlx5-next v2 16/20] IB/mlx5: Add DEVX support for modify and query commands Leon Romanovsky
2018-06-17 10:00 ` [PATCH rdma-next v2 17/20] IB/mlx5: Add support for DEVX query UAR Leon Romanovsky
2018-06-17 10:00 ` [PATCH mlx5-next v2 18/20] IB/mlx5: Add DEVX support for memory registration Leon Romanovsky
2018-06-17 10:00 ` [PATCH rdma-next v2 19/20] IB/mlx5: Add DEVX query EQN support Leon Romanovsky
2018-06-17 10:00 ` [PATCH rdma-next v2 20/20] IB/mlx5: Expose DEVX tree Leon Romanovsky
2018-06-18 22:05 ` [PATCH rdma-next v2 00/20] Introduce mlx5 DEVX interface Jason Gunthorpe
2018-06-19  4:59   ` Leon Romanovsky
2018-06-19 16:46     ` Leon Romanovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d5f535d4-ad77-1ebd-25d5-29cbf00ea983@dev.mellanox.co.il \
    --to=yishaih@dev.mellanox.co.il \
    --cc=dledford@redhat.com \
    --cc=jgg@mellanox.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=leon@kernel.org \
    --cc=leonro@mellanox.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=majd@mellanox.com \
    --cc=matanb@mellanox.com \
    --cc=netdev@vger.kernel.org \
    --cc=saeedm@mellanox.com \
    --cc=yishaih@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).