All of lore.kernel.org
 help / color / mirror / Atom feed
From: Carol Soto <clsoto-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Cc: brking-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org,
	ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org
Subject: Re: [PATCH] IB/ucm: ib_ucm_release_dev clears wrong bit if devnum is greater than IB_UCM_MAX_DEVICES
Date: Fri, 12 Jun 2015 15:24:39 -0500	[thread overview]
Message-ID: <557B4007.7020301@linux.vnet.ibm.com> (raw)
In-Reply-To: <557B3C23.9090403-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>



On 6/12/2015 3:08 PM, Doug Ledford wrote:
> On 06/11/2015 12:06 PM, clsoto-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org wrote:
>> From: Carol L Soto <clsoto-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
>>
>> ib_ucm_release_dev clears wrong bit if devnum is greater than IB_UCM_MAX_DEVICES.
>>
>> Signed-off-by: Carol L Soto <clsoto-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
>> ---
>>   drivers/infiniband/core/ucm.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c
>> index f2f6393..e2fd085 100644
>> --- a/drivers/infiniband/core/ucm.c
>> +++ b/drivers/infiniband/core/ucm.c
>> @@ -1193,6 +1193,7 @@ static int ib_ucm_close(struct inode *inode, struct file *filp)
>>   	return 0;
>>   }
>>   
>> +static DECLARE_BITMAP(overflow_map, IB_UCM_MAX_DEVICES);
>>   static void ib_ucm_release_dev(struct device *dev)
>>   {
>>   	struct ib_ucm_device *ucm_dev;
>> @@ -1202,7 +1203,7 @@ static void ib_ucm_release_dev(struct device *dev)
>>   	if (ucm_dev->devnum < IB_UCM_MAX_DEVICES)
>>   		clear_bit(ucm_dev->devnum, dev_map);
>>   	else
>> -		clear_bit(ucm_dev->devnum - IB_UCM_MAX_DEVICES, dev_map);
>> +		clear_bit(ucm_dev->devnum - IB_UCM_MAX_DEVICES, overflow_map);
>>   	kfree(ucm_dev);
>>   }
>>   
>> @@ -1226,7 +1227,6 @@ static ssize_t show_ibdev(struct device *dev, struct device_attribute *attr,
>>   static DEVICE_ATTR(ibdev, S_IRUGO, show_ibdev, NULL);
>>   
>>   static dev_t overflow_maj;
>> -static DECLARE_BITMAP(overflow_map, IB_UCM_MAX_DEVICES);
>>   static int find_overflow_devnum(void)
>>   {
>>   	int ret;
>>
> This doesn't look right to me.  In particular, you are creating a bitmap
> and clearing bits, but I never see that bitmap get any bits set.  So,
> are you overflowing on the set routine and you just didn't bother to fix
> that?
I just moved the declaration of the bitmap before ib_ucm_release_dev to 
be able to compile. This bitmap is used in ib_ucm_add_one
in the case of devnum > IB_UCM_MAX_DEVICES. The error path in 
ib_ucm_add_one did the clear bit correctly but not the
ib_ucm_release_dev function. We saw an issue when using SRIOV. The case 
was that user has 2 Mellanox SRIOV cards and just did an
unbind/bind a VF that was using a ib ucm device with devnum greater than 
32. We saw this warning
WARNING: at fs/sysfs/dir.c:31
May 31 14:04:11 powerkvm1-lp1 kernel: NIP [c000000000368958] 
sysfs_warn_dup+0x88/0xc0
May 31 14:04:11 powerkvm1-lp1 kernel: LR [c000000000368954] 
sysfs_warn_dup+0x84/0xc0
May 31 14:04:11 powerkvm1-lp1 kernel: Call Trace:
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf510] 
[c000000000368954] sysfs_warn_dup+0x84/0xc0 (unreliable)
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf590] 
[c000000000368fcc] sysfs_do_create_link_sd.isra.2+0x15c/0x170
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf5e0] 
[c0000000005aebb8] device_add+0x2f8/0x6f0
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf690] 
[d00000001f5c2a48] ib_ucm_add_one+0x288/0x378 [ib_ucm]
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf730] 
[d00000001cef7468] ib_register_device+0x508/0x5b0 [ib_core]
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf830] 
[d000000019baba88] mlx4_ib_add+0x918/0x1320 [mlx4_ib]
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf950] 
[d000000019b32d10] mlx4_add_device+0x70/0x120 [mlx4_core]
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf990] 
[d000000019b3304c] mlx4_register_device+0x9c/0xf0 [mlx4_core]
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbf9d0] 
[d000000019b38844] mlx4_load_one+0x1254/0x1600 [mlx4_core]
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfad0] 
[d000000019b39210] mlx4_init_one+0x620/0x770 [mlx4_core]
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfba0] 
[c0000000004fdf6c] local_pci_probe+0x6c/0x130
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfc30] 
[c0000000000e5548] work_for_cpu_fn+0x38/0x60
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfc60] 
[c0000000000ea518] process_one_work+0x198/0x470
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfcf0] 
[c0000000000eaf0c] worker_thread+0x37c/0x5a0
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfd80] 
[c0000000000f1f90] kthread+0x110/0x130
May 31 14:04:11 powerkvm1-lp1 kernel: [c000002e6bdbfe30] 
[c000000000009660] ret_from_kernel_thread+0x5c/0x7c

the warning was due to it was trying to create ib ucm device with devnum 
0 but that was in use already by another device. With the patch
I created, then I do not see the warning.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2015-06-12 20:24 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-11 16:06 [PATCH] IB/ucm: ib_ucm_release_dev clears wrong bit if devnum is greater than IB_UCM_MAX_DEVICES clsoto-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8
     [not found] ` <1434038801-6513-1-git-send-email-clsoto-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2015-06-12 20:08   ` Doug Ledford
     [not found]     ` <557B3C23.9090403-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2015-06-12 20:24       ` Carol Soto [this message]
     [not found]         ` <557B4007.7020301-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2015-06-12 20:31           ` Doug Ledford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=557B4007.7020301@linux.vnet.ibm.com \
    --to=clsoto-23vcf4htsmix0ybbhkvfkdbpr1lh4cv8@public.gmane.org \
    --cc=brking-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org \
    --cc=dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=ogerlitz-VPRAkNaXOzVWk0Htik3J/w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.