From: Jason Gunthorpe <jgunthorpe-ePGOBjL8dl3ta4EC/59zMFaTQe2KTcn/@public.gmane.org>
To: Yishai Hadas <yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>
Cc: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	raindel-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org,
	jackm-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org
Subject: Re: [PATCH for-next V6 3/5] IB/uverbs: Enable device removal when there are active user space applications
Date: Tue, 30 Jun 2015 12:40:35 -0600
Message-ID: <20150630184035.GC2819@obsidianresearch.com>
In-Reply-To: <1435659967-27173-4-git-send-email-yishaih-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org>

On Tue, Jun 30, 2015 at 01:26:05PM +0300, Yishai Hadas wrote:
>  struct ib_uverbs_device {
> -	struct kref				ref;
> +	struct kref				comp_ref;
> +	struct kref				free_ref;

So.. I was looking at this, and there is something wrong with the
existing code. 

This old code:

	cdev_del(&uverbs_dev->cdev);
	[..]
 	wait_for_completion(&uverbs_dev->comp);
-	kfree(uverbs_dev);

has a built-in assumption that once cdev_del returns, no open() can
still be running. That doesn't appear to be true: cdev calls open
unlocked and relies on refcounting to make everything work out.

Even other places in the rdma core work this way, e.g. user_mad.

Which means open can be running concurrently with the rest of that
stuff, which creates several obvious problems.

I *think* (and I am not totally sure) that when you use cdev with a
dynamic structure, it *must* be chained off of a kobject for the
containing structure. Certainly, other examples in the kernel I've
looked at recently do this. (Typically the cdev will be part of the

I.e., it should look like this:

  struct ib_uverbs_device {
	struct kobject			        kobj;
	struct cdev			        cdev;
	[..]
  };

	cdev_init(&uverbs_dev->cdev, NULL);
	/* chain the cdev off the kobject of the containing structure */
	uverbs_dev->cdev.kobj.parent = &uverbs_dev->kobj;
	cdev_add(..)

The cdev will hold a kref on the parent (the containing structure) and
only when that kref is released is it guaranteed that open will never
be called again.

So, kobj becomes your free_ref, and cdev properly chains off it to
close that little hole with kref.
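
To make the lifetime rule concrete, here is a small userspace model of
the chaining. It is a sketch only, not kernel code: the model_* names
are made up for illustration, plain int counters stand in for the
atomic refcount a real kobject has, and the cdev is reduced to its
embedded kobject.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical single-threaded model; real kobjects use atomic refcounts. */
struct model_kobj {
	int refs;
	struct model_kobj *parent;
	void (*release)(struct model_kobj *kobj);
};

static void model_get(struct model_kobj *k)
{
	k->refs++;
}

static void model_put(struct model_kobj *k)
{
	assert(k->refs > 0);
	if (--k->refs == 0) {
		/* Save the parent first: release may invalidate k. */
		struct model_kobj *parent = k->parent;

		if (k->release)
			k->release(k);
		if (parent)
			model_put(parent);	/* child drops its ref on the parent */
	}
}

struct model_dev {
	struct model_kobj kobj;	/* plays the role of free_ref */
	struct model_kobj cdev;	/* stands in for cdev.kobj */
	bool freed;		/* set where the kernel would kfree() */
};

static void model_dev_release(struct model_kobj *kobj)
{
	/* kobj is the first member, so the cast recovers the container. */
	((struct model_dev *)kobj)->freed = true;
}

static void model_dev_init(struct model_dev *d)
{
	d->freed = false;
	d->kobj = (struct model_kobj){ .refs = 1, .release = model_dev_release };
	d->cdev = (struct model_kobj){ .refs = 1, .parent = &d->kobj };
	model_get(&d->kobj);	/* the cdev pins its parent */
}
```

In this model an open() that still holds a reference on the cdev keeps
the containing structure alive even after the remove path has dropped
both its cdev reference and its own reference on kobj; the "free" only
happens on the last put.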

---

The next problem is that open can run concurrently with
wait_for_completion, so the waiting scheme is wrong too.

This is a great example of why you should never use a kref for an
active count. It seems like the right thing, but it is subtly wrong.

krefs have this special property:

kref_get()
        WARN_ON_ONCE(atomic_inc_return(&kref->refcount) < 2);

So when the code did this:

-	kref_put(&uverbs_dev->ref, ib_uverbs_release_dev);
-	wait_for_completion(&uverbs_dev->comp);
-	kfree(uverbs_dev);

there is a race where another CPU may be in ib_uverbs_open, about to
do kref_get. That will either trigger the above WARN_ON or become a
use-after-free race with the kfree.
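
A small userspace model of that sequence, as a sketch only: model_kref
and its helpers are made-up names, C11 atomics stand in for the
kernel's atomic_t, and a false return from model_kref_get marks the
point where the kernel would hit the WARN_ON_ONCE.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the kref behaviour described here. */
struct model_kref {
	atomic_int refcount;
};

static void model_kref_init(struct model_kref *k)
{
	atomic_init(&k->refcount, 1);
}

/* Returns true for a legitimate get, false where the kernel would WARN. */
static bool model_kref_get(struct model_kref *k)
{
	/* atomic_fetch_add returns the old value; +1 gives the new one. */
	return atomic_fetch_add(&k->refcount, 1) + 1 >= 2;
}

/* Returns true if this put dropped the last reference. */
static bool model_kref_put(struct model_kref *k)
{
	return atomic_fetch_sub(&k->refcount, 1) - 1 == 0;
}
```

If the remove path's put is the last one (so release runs and the
waiter is completed) and a late open then calls get, the count goes
from 0 to 1. That is exactly the "resurrection" the WARN is there to
catch, and by then nothing stops the kfree.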

A good way to implement this pattern is to use an atomic with a
bias. See how kernfs_get_active/kernfs_put_active/kernfs_drain work
for a very good example of this scheme.
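
As a rough illustration of the biased-atomic idea (a userspace sketch,
not the kernfs code): the active_* names and ACTIVE_BIAS are
assumptions made for this example, and the sleep/wakeup half of a real
drain is left out. The counter is >= 0 while the device is live and is
pushed negative by a large bias once removal starts, so a late open
fails cleanly instead of resurrecting the count.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names; a large negative bias marks "removal has begun". */
#define ACTIVE_BIAS (INT_MIN + 1)

struct active_count {
	atomic_int active;	/* >= 0: live, value = users; < 0: being removed */
};

static void active_init(struct active_count *ac)
{
	atomic_init(&ac->active, 0);
}

/* Called from open(): succeeds only while removal has not started. */
static bool active_get(struct active_count *ac)
{
	int v = atomic_load(&ac->active);

	while (v >= 0) {
		/* On failure v is reloaded and the sign is re-checked. */
		if (atomic_compare_exchange_weak(&ac->active, &v, v + 1))
			return true;
	}
	return false;
}

/* Returns true if this was the last user and removal is waiting. */
static bool active_put(struct active_count *ac)
{
	int v = atomic_fetch_sub(&ac->active, 1) - 1;

	/* Either value would mean a put without a matching get. */
	assert(v != -1 && v != ACTIVE_BIAS - 1);
	return v == ACTIVE_BIAS;
}

/* Called from remove(): blocks new users; true if none are left. */
static bool active_deactivate(struct active_count *ac)
{
	int v = atomic_fetch_add(&ac->active, ACTIVE_BIAS) + ACTIVE_BIAS;

	return v == ACTIVE_BIAS;
}
```

In a real driver the remove path would sleep until the count reaches
the bias, and the put that returns true would do the wakeup. The free
itself would still hang off the kobject release, not off this counter.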

This is an existing bug. I think a dedicated patch which
 - adds the kobj and moves the kfree(uverbs_dev) into its release
 - fixes the active count scheme to use an atomic, not a kref

would be appropriate. Once that is done, the disassociate patch doesn't
really have to do anything with this stuff.

I would also recommend looking at other uses of cdev_add in the rdma
core, they may be similarly off..

Jason
