All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/3] uio: fix potential crash bug
@ 2018-07-06  2:57 xiubli
  2018-07-06  2:57 ` [PATCH v3 1/3] uio: use request_threaded_irq instead xiubli
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: xiubli @ 2018-07-06  2:57 UTC (permalink / raw)
  To: gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose, mchristi

From: Xiubo Li <xiubli@redhat.com>

The V2 patch set maybe not post successfully, so post it again
with one new updation.

V2:
- resend it with some small fix

V3:
- switch to use request_threaded_irq


Xiubo Li (3):
  uio: use request_threaded_irq instead
  uio: change to use the mutex lock instead of the spin lock
  uio: fix crash after the device is unregistered

 drivers/uio/uio.c          | 151 ++++++++++++++++++++++++++++++++++-----------
 include/linux/uio_driver.h |   2 +-
 2 files changed, 115 insertions(+), 38 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3 1/3] uio: use request_threaded_irq instead
  2018-07-06  2:57 [PATCH v3 0/3] uio: fix potential crash bug xiubli
@ 2018-07-06  2:57 ` xiubli
  2018-07-06  2:57 ` [PATCH v3 2/3] uio: change to use the mutex lock instead of the spin lock xiubli
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: xiubli @ 2018-07-06  2:57 UTC (permalink / raw)
  To: gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose, mchristi

From: Xiubo Li <xiubli@redhat.com>

Prepraing for changing to use mutex lock.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 drivers/uio/uio.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index e8f4ac9..b4b2ae1 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -902,8 +902,9 @@ int __uio_register_device(struct module *owner,
 		 * FDs at the time of unregister and therefore may not be
 		 * freed until they are released.
 		 */
-		ret = request_irq(info->irq, uio_interrupt,
-				  info->irq_flags, info->name, idev);
+		ret = request_threaded_irq(info->irq, NULL, uio_interrupt,
+					   info->irq_flags, info->name, idev);
+
 		if (ret)
 			goto err_request_irq;
 	}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 2/3] uio: change to use the mutex lock instead of the spin lock
  2018-07-06  2:57 [PATCH v3 0/3] uio: fix potential crash bug xiubli
  2018-07-06  2:57 ` [PATCH v3 1/3] uio: use request_threaded_irq instead xiubli
@ 2018-07-06  2:57 ` xiubli
  2018-07-06  3:39   ` Hamish Martin
  2018-07-06  2:57 ` [PATCH v3 3/3] uio: fix crash after the device is unregistered xiubli
  2018-07-06  3:31 ` [PATCH v3 0/3] uio: fix potential crash bug Hamish Martin
  3 siblings, 1 reply; 16+ messages in thread
From: xiubli @ 2018-07-06  2:57 UTC (permalink / raw)
  To: gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose, mchristi

From: Xiubo Li <xiubli@redhat.com>

We are hitting a regression with the following commit:

commit a93e7b331568227500186a465fee3c2cb5dffd1f
Author: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
Date:   Mon May 14 13:32:23 2018 +1200

    uio: Prevent device destruction while fds are open

The problem is the addition of spin_lock_irqsave in uio_write. This
leads to hitting  uio_write -> copy_from_user -> _copy_from_user ->
might_fault and the logs filling up with sleeping warnings.

I also noticed some uio drivers allocate memory, sleep, grab mutexes
from callouts like open() and release and uio is now doing
spin_lock_irqsave while calling them.

Reported-by: Mike Christie <mchristi@redhat.com>
CC: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 drivers/uio/uio.c          | 32 +++++++++++++-------------------
 include/linux/uio_driver.h |  2 +-
 2 files changed, 14 insertions(+), 20 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index b4b2ae1..655ade4 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -433,7 +433,6 @@ static int uio_open(struct inode *inode, struct file *filep)
 	struct uio_device *idev;
 	struct uio_listener *listener;
 	int ret = 0;
-	unsigned long flags;
 
 	mutex_lock(&minor_lock);
 	idev = idr_find(&uio_idr, iminor(inode));
@@ -460,10 +459,10 @@ static int uio_open(struct inode *inode, struct file *filep)
 	listener->event_count = atomic_read(&idev->event);
 	filep->private_data = listener;
 
-	spin_lock_irqsave(&idev->info_lock, flags);
+	mutex_lock(&idev->info_lock);
 	if (idev->info && idev->info->open)
 		ret = idev->info->open(idev->info, inode);
-	spin_unlock_irqrestore(&idev->info_lock, flags);
+	mutex_unlock(&idev->info_lock);
 	if (ret)
 		goto err_infoopen;
 
@@ -495,12 +494,11 @@ static int uio_release(struct inode *inode, struct file *filep)
 	int ret = 0;
 	struct uio_listener *listener = filep->private_data;
 	struct uio_device *idev = listener->dev;
-	unsigned long flags;
 
-	spin_lock_irqsave(&idev->info_lock, flags);
+	mutex_lock(&idev->info_lock);
 	if (idev->info && idev->info->release)
 		ret = idev->info->release(idev->info, inode);
-	spin_unlock_irqrestore(&idev->info_lock, flags);
+	mutex_unlock(&idev->info_lock);
 
 	module_put(idev->owner);
 	kfree(listener);
@@ -513,12 +511,11 @@ static __poll_t uio_poll(struct file *filep, poll_table *wait)
 	struct uio_listener *listener = filep->private_data;
 	struct uio_device *idev = listener->dev;
 	__poll_t ret = 0;
-	unsigned long flags;
 
-	spin_lock_irqsave(&idev->info_lock, flags);
+	mutex_lock(&idev->info_lock);
 	if (!idev->info || !idev->info->irq)
 		ret = -EIO;
-	spin_unlock_irqrestore(&idev->info_lock, flags);
+	mutex_unlock(&idev->info_lock);
 
 	if (ret)
 		return ret;
@@ -537,12 +534,11 @@ static ssize_t uio_read(struct file *filep, char __user *buf,
 	DECLARE_WAITQUEUE(wait, current);
 	ssize_t retval = 0;
 	s32 event_count;
-	unsigned long flags;
 
-	spin_lock_irqsave(&idev->info_lock, flags);
+	mutex_lock(&idev->info_lock);
 	if (!idev->info || !idev->info->irq)
 		retval = -EIO;
-	spin_unlock_irqrestore(&idev->info_lock, flags);
+	mutex_unlock(&idev->info_lock);
 
 	if (retval)
 		return retval;
@@ -592,9 +588,8 @@ static ssize_t uio_write(struct file *filep, const char __user *buf,
 	struct uio_device *idev = listener->dev;
 	ssize_t retval;
 	s32 irq_on;
-	unsigned long flags;
 
-	spin_lock_irqsave(&idev->info_lock, flags);
+	mutex_lock(&idev->info_lock);
 	if (!idev->info || !idev->info->irq) {
 		retval = -EIO;
 		goto out;
@@ -618,7 +613,7 @@ static ssize_t uio_write(struct file *filep, const char __user *buf,
 	retval = idev->info->irqcontrol(idev->info, irq_on);
 
 out:
-	spin_unlock_irqrestore(&idev->info_lock, flags);
+	mutex_unlock(&idev->info_lock);
 	return retval ? retval : sizeof(s32);
 }
 
@@ -865,7 +860,7 @@ int __uio_register_device(struct module *owner,
 
 	idev->owner = owner;
 	idev->info = info;
-	spin_lock_init(&idev->info_lock);
+	mutex_init(&idev->info_lock);
 	init_waitqueue_head(&idev->wait);
 	atomic_set(&idev->event, 0);
 
@@ -929,7 +924,6 @@ int __uio_register_device(struct module *owner,
 void uio_unregister_device(struct uio_info *info)
 {
 	struct uio_device *idev;
-	unsigned long flags;
 
 	if (!info || !info->uio_dev)
 		return;
@@ -943,9 +937,9 @@ void uio_unregister_device(struct uio_info *info)
 	if (info->irq && info->irq != UIO_IRQ_CUSTOM)
 		free_irq(info->irq, idev);
 
-	spin_lock_irqsave(&idev->info_lock, flags);
+	mutex_lock(&idev->info_lock);
 	idev->info = NULL;
-	spin_unlock_irqrestore(&idev->info_lock, flags);
+	mutex_unlock(&idev->info_lock);
 
 	device_unregister(&idev->dev);
 
diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h
index 6c5f207..6f8b68c 100644
--- a/include/linux/uio_driver.h
+++ b/include/linux/uio_driver.h
@@ -75,7 +75,7 @@ struct uio_device {
         struct fasync_struct    *async_queue;
         wait_queue_head_t       wait;
         struct uio_info         *info;
-	spinlock_t		info_lock;
+	struct mutex		info_lock;
         struct kobject          *map_dir;
         struct kobject          *portio_dir;
 };
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v3 3/3] uio: fix crash after the device is unregistered
  2018-07-06  2:57 [PATCH v3 0/3] uio: fix potential crash bug xiubli
  2018-07-06  2:57 ` [PATCH v3 1/3] uio: use request_threaded_irq instead xiubli
  2018-07-06  2:57 ` [PATCH v3 2/3] uio: change to use the mutex lock instead of the spin lock xiubli
@ 2018-07-06  2:57 ` xiubli
  2018-07-06  3:40   ` Hamish Martin
                     ` (2 more replies)
  2018-07-06  3:31 ` [PATCH v3 0/3] uio: fix potential crash bug Hamish Martin
  3 siblings, 3 replies; 16+ messages in thread
From: xiubli @ 2018-07-06  2:57 UTC (permalink / raw)
  To: gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose, mchristi

From: Xiubo Li <xiubli@redhat.com>

For the target_core_user use case, after the device is unregistered
it maybe still opened in user space, then the kernel will crash, like:

[  251.163692] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[  251.163820] IP: [<ffffffffc0736213>] show_name+0x23/0x40 [uio]
[  251.163965] PGD 8000000062694067 PUD 62696067 PMD 0
[  251.164097] Oops: 0000 [#1] SMP
...
[  251.165605]  e1000 mptscsih mptbase drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
[  251.166014] CPU: 0 PID: 13380 Comm: tcmu-runner Kdump: loaded Not tainted 3.10.0-916.el7.test.x86_64 #1
[  251.166381] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[  251.166747] task: ffff971eb91db0c0 ti: ffff971e9e384000 task.ti: ffff971e9e384000
[  251.167137] RIP: 0010:[<ffffffffc0736213>]  [<ffffffffc0736213>] show_name+0x23/0x40 [uio]
[  251.167563] RSP: 0018:ffff971e9e387dc8  EFLAGS: 00010282
[  251.167978] RAX: 0000000000000000 RBX: ffff971e9e3f8000 RCX: ffff971eb8368d98
[  251.168408] RDX: ffff971e9e3f8000 RSI: ffffffffc0738084 RDI: ffff971e9e3f8000
[  251.168856] RBP: ffff971e9e387dd0 R08: ffff971eb8bc0018 R09: 0000000000000000
[  251.169296] R10: 0000000000001000 R11: ffffffffa09d444d R12: ffffffffa1076e80
[  251.169750] R13: ffff971e9e387f18 R14: 0000000000000001 R15: ffff971e9cfb1c80
[  251.170213] FS:  00007ff37d175880(0000) GS:ffff971ebb600000(0000) knlGS:0000000000000000
[  251.170693] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  251.171248] CR2: 0000000000000008 CR3: 00000000001f6000 CR4: 00000000003607f0
[  251.172071] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  251.172640] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  251.173236] Call Trace:
[  251.173789]  [<ffffffffa0c9b2d3>] dev_attr_show+0x23/0x60
[  251.174356]  [<ffffffffa0f561b2>] ? mutex_lock+0x12/0x2f
[  251.174892]  [<ffffffffa0ac6d9f>] sysfs_kf_seq_show+0xcf/0x1f0
[  251.175433]  [<ffffffffa0ac54e6>] kernfs_seq_show+0x26/0x30
[  251.175981]  [<ffffffffa0a63be0>] seq_read+0x110/0x3f0
[  251.176609]  [<ffffffffa0ac5d45>] kernfs_fop_read+0xf5/0x160
[  251.177158]  [<ffffffffa0a3d3af>] vfs_read+0x9f/0x170
[  251.177707]  [<ffffffffa0a3e27f>] SyS_read+0x7f/0xf0
[  251.178268]  [<ffffffffa0f648af>] system_call_fastpath+0x1c/0x21
[  251.178823] Code: 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 d3 e8 7e 96 56 e0 48 8b 80 d8 02 00 00 48 89 df 48 c7 c6 84 80 73 c0 <48> 8b 50 08 31 c0 e8 e2 67 44 e0 5b 48 98 5d c3 0f 1f 00 66 2e
[  251.180115] RIP  [<ffffffffc0736213>] show_name+0x23/0x40 [uio]
[  251.180820]  RSP <ffff971e9e387dc8>
[  251.181473] CR2: 0000000000000008

CC: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
CC: Mike Christie <mchristi@redhat.com>
Signed-off-by: Xiubo Li <xiubli@redhat.com>
---
 drivers/uio/uio.c | 116 ++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 99 insertions(+), 17 deletions(-)

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 655ade4..a0d926e 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -215,7 +215,20 @@ static ssize_t name_show(struct device *dev,
 			 struct device_attribute *attr, char *buf)
 {
 	struct uio_device *idev = dev_get_drvdata(dev);
-	return sprintf(buf, "%s\n", idev->info->name);
+	int ret;
+
+	mutex_lock(&idev->info_lock);
+	if (!idev->info) {
+		ret = -EINVAL;
+		dev_err(dev, "the device has been unregistered\n");
+		goto out;
+	}
+
+	ret = sprintf(buf, "%s\n", idev->info->name);
+
+out:
+	mutex_unlock(&idev->info_lock);
+	return ret;
 }
 static DEVICE_ATTR_RO(name);
 
@@ -223,7 +236,20 @@ static ssize_t version_show(struct device *dev,
 			    struct device_attribute *attr, char *buf)
 {
 	struct uio_device *idev = dev_get_drvdata(dev);
-	return sprintf(buf, "%s\n", idev->info->version);
+	int ret;
+
+	mutex_lock(&idev->info_lock);
+	if (!idev->info) {
+		ret = -EINVAL;
+		dev_err(dev, "the device has been unregistered\n");
+		goto out;
+	}
+
+	ret = sprintf(buf, "%s\n", idev->info->version);
+
+out:
+	mutex_unlock(&idev->info_lock);
+	return ret;
 }
 static DEVICE_ATTR_RO(version);
 
@@ -399,7 +425,12 @@ static void uio_free_minor(struct uio_device *idev)
  */
 void uio_event_notify(struct uio_info *info)
 {
-	struct uio_device *idev = info->uio_dev;
+	struct uio_device *idev;
+
+	if (!info)
+		return;
+
+	idev = info->uio_dev;
 
 	atomic_inc(&idev->event);
 	wake_up_interruptible(&idev->wait);
@@ -415,11 +446,20 @@ void uio_event_notify(struct uio_info *info)
 static irqreturn_t uio_interrupt(int irq, void *dev_id)
 {
 	struct uio_device *idev = (struct uio_device *)dev_id;
-	irqreturn_t ret = idev->info->handler(irq, idev->info);
+	irqreturn_t ret;
+
+	mutex_lock(&idev->info_lock);
+	if (!idev->info) {
+		ret = IRQ_NONE;
+		goto out;
+	}
 
+	ret = idev->info->handler(irq, idev->info);
 	if (ret == IRQ_HANDLED)
 		uio_event_notify(idev->info);
 
+out:
+	mutex_unlock(&idev->info_lock);
 	return ret;
 }
 
@@ -460,6 +500,12 @@ static int uio_open(struct inode *inode, struct file *filep)
 	filep->private_data = listener;
 
 	mutex_lock(&idev->info_lock);
+	if (!idev->info) {
+		mutex_unlock(&idev->info_lock);
+		ret = -EINVAL;
+		goto err_alloc_listener;
+	}
+
 	if (idev->info && idev->info->open)
 		ret = idev->info->open(idev->info, inode);
 	mutex_unlock(&idev->info_lock);
@@ -590,6 +636,11 @@ static ssize_t uio_write(struct file *filep, const char __user *buf,
 	s32 irq_on;
 
 	mutex_lock(&idev->info_lock);
+	if (!idev->info) {
+		retval = -EINVAL;
+		goto out;
+	}
+
 	if (!idev->info || !idev->info->irq) {
 		retval = -EIO;
 		goto out;
@@ -635,10 +686,20 @@ static vm_fault_t uio_vma_fault(struct vm_fault *vmf)
 	struct page *page;
 	unsigned long offset;
 	void *addr;
+	int ret = 0;
+	int mi;
 
-	int mi = uio_find_mem_index(vmf->vma);
-	if (mi < 0)
-		return VM_FAULT_SIGBUS;
+	mutex_lock(&idev->info_lock);
+	if (!idev->info) {
+		ret = VM_FAULT_SIGBUS;
+		goto out;
+	}
+
+	mi = uio_find_mem_index(vmf->vma);
+	if (mi < 0) {
+		ret = VM_FAULT_SIGBUS;
+		goto out;
+	}
 
 	/*
 	 * We need to subtract mi because userspace uses offset = N*PAGE_SIZE
@@ -653,7 +714,11 @@ static vm_fault_t uio_vma_fault(struct vm_fault *vmf)
 		page = vmalloc_to_page(addr);
 	get_page(page);
 	vmf->page = page;
-	return 0;
+
+out:
+	mutex_unlock(&idev->info_lock);
+
+	return ret;
 }
 
 static const struct vm_operations_struct uio_logical_vm_ops = {
@@ -678,6 +743,7 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
 	struct uio_device *idev = vma->vm_private_data;
 	int mi = uio_find_mem_index(vma);
 	struct uio_mem *mem;
+
 	if (mi < 0)
 		return -EINVAL;
 	mem = idev->info->mem + mi;
@@ -719,30 +785,46 @@ static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
 
 	vma->vm_private_data = idev;
 
+	mutex_lock(&idev->info_lock);
+	if (!idev->info) {
+		ret = -EINVAL;
+		goto out;
+	}
+
 	mi = uio_find_mem_index(vma);
-	if (mi < 0)
-		return -EINVAL;
+	if (mi < 0) {
+		ret = -EINVAL;
+		goto out;
+	}
 
 	requested_pages = vma_pages(vma);
 	actual_pages = ((idev->info->mem[mi].addr & ~PAGE_MASK)
 			+ idev->info->mem[mi].size + PAGE_SIZE -1) >> PAGE_SHIFT;
-	if (requested_pages > actual_pages)
-		return -EINVAL;
+	if (requested_pages > actual_pages) {
+		ret = -EINVAL;
+		goto out;
+	}
 
 	if (idev->info->mmap) {
 		ret = idev->info->mmap(idev->info, vma);
-		return ret;
+		goto out;
 	}
 
 	switch (idev->info->mem[mi].memtype) {
 		case UIO_MEM_PHYS:
-			return uio_mmap_physical(vma);
+			ret = uio_mmap_physical(vma);
+			break;
 		case UIO_MEM_LOGICAL:
 		case UIO_MEM_VIRTUAL:
-			return uio_mmap_logical(vma);
+			ret = uio_mmap_logical(vma);
+			break;
 		default:
-			return -EINVAL;
+			ret = -EINVAL;
 	}
+
+out:
+	mutex_unlock(&idev->info_lock);
+	return 0;
 }
 
 static const struct file_operations uio_fops = {
@@ -932,12 +1014,12 @@ void uio_unregister_device(struct uio_info *info)
 
 	uio_free_minor(idev);
 
+	mutex_lock(&idev->info_lock);
 	uio_dev_del_attributes(idev);
 
 	if (info->irq && info->irq != UIO_IRQ_CUSTOM)
 		free_irq(info->irq, idev);
 
-	mutex_lock(&idev->info_lock);
 	idev->info = NULL;
 	mutex_unlock(&idev->info_lock);
 
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 0/3] uio: fix potential crash bug
  2018-07-06  2:57 [PATCH v3 0/3] uio: fix potential crash bug xiubli
                   ` (2 preceding siblings ...)
  2018-07-06  2:57 ` [PATCH v3 3/3] uio: fix crash after the device is unregistered xiubli
@ 2018-07-06  3:31 ` Hamish Martin
  2018-07-06 12:04   ` Xiubo Li
  3 siblings, 1 reply; 16+ messages in thread
From: Hamish Martin @ 2018-07-06  3:31 UTC (permalink / raw)
  To: xiubli, gregkh, linux-kernel
  Cc: jannh, pkalever, pkarampu, atumball, sabose, mchristi

Hi Xiubo,

Thanks for your patch that addresses the regression found with my 
earlier commit.

I will take your code and run it in our scenario that showed the bug 
that led to my commit. Unfortunately I won't be able to get that done 
until mid-next week. I intend to report back to you by July 13th.

Thanks,
Hamish M

On 07/06/2018 02:57 PM, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
>
> The V2 patch set maybe not post successfully, so post it again
> with one new updation.
>
> V2:
> - resend it with some small fix
>
> V3:
> - switch to use request_threaded_irq
>
>
> Xiubo Li (3):
>    uio: use request_threaded_irq instead
>    uio: change to use the mutex lock instead of the spin lock
>    uio: fix crash after the device is unregistered
>
>   drivers/uio/uio.c          | 151 ++++++++++++++++++++++++++++++++++-----------
>   include/linux/uio_driver.h |   2 +-
>   2 files changed, 115 insertions(+), 38 deletions(-)
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 2/3] uio: change to use the mutex lock instead of the spin lock
  2018-07-06  2:57 ` [PATCH v3 2/3] uio: change to use the mutex lock instead of the spin lock xiubli
@ 2018-07-06  3:39   ` Hamish Martin
  0 siblings, 0 replies; 16+ messages in thread
From: Hamish Martin @ 2018-07-06  3:39 UTC (permalink / raw)
  To: xiubli, gregkh, linux-kernel
  Cc: jannh, pkalever, pkarampu, atumball, sabose, mchristi

Looks ok to me.

Reviewed-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz>

On 07/06/2018 02:57 PM, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
>
> We are hitting a regression with the following commit:
>
> commit a93e7b331568227500186a465fee3c2cb5dffd1f
> Author: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
> Date:   Mon May 14 13:32:23 2018 +1200
>
>      uio: Prevent device destruction while fds are open
>
> The problem is the addition of spin_lock_irqsave in uio_write. This
> leads to hitting  uio_write -> copy_from_user -> _copy_from_user ->
> might_fault and the logs filling up with sleeping warnings.
>
> I also noticed some uio drivers allocate memory, sleep, grab mutexes
> from callouts like open() and release and uio is now doing
> spin_lock_irqsave while calling them.
>
> Reported-by: Mike Christie <mchristi@redhat.com>
> CC: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>   drivers/uio/uio.c          | 32 +++++++++++++-------------------
>   include/linux/uio_driver.h |  2 +-
>   2 files changed, 14 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
> index b4b2ae1..655ade4 100644
> --- a/drivers/uio/uio.c
> +++ b/drivers/uio/uio.c
> @@ -433,7 +433,6 @@ static int uio_open(struct inode *inode, struct file *filep)
>   	struct uio_device *idev;
>   	struct uio_listener *listener;
>   	int ret = 0;
> -	unsigned long flags;
>   
>   	mutex_lock(&minor_lock);
>   	idev = idr_find(&uio_idr, iminor(inode));
> @@ -460,10 +459,10 @@ static int uio_open(struct inode *inode, struct file *filep)
>   	listener->event_count = atomic_read(&idev->event);
>   	filep->private_data = listener;
>   
> -	spin_lock_irqsave(&idev->info_lock, flags);
> +	mutex_lock(&idev->info_lock);
>   	if (idev->info && idev->info->open)
>   		ret = idev->info->open(idev->info, inode);
> -	spin_unlock_irqrestore(&idev->info_lock, flags);
> +	mutex_unlock(&idev->info_lock);
>   	if (ret)
>   		goto err_infoopen;
>   
> @@ -495,12 +494,11 @@ static int uio_release(struct inode *inode, struct file *filep)
>   	int ret = 0;
>   	struct uio_listener *listener = filep->private_data;
>   	struct uio_device *idev = listener->dev;
> -	unsigned long flags;
>   
> -	spin_lock_irqsave(&idev->info_lock, flags);
> +	mutex_lock(&idev->info_lock);
>   	if (idev->info && idev->info->release)
>   		ret = idev->info->release(idev->info, inode);
> -	spin_unlock_irqrestore(&idev->info_lock, flags);
> +	mutex_unlock(&idev->info_lock);
>   
>   	module_put(idev->owner);
>   	kfree(listener);
> @@ -513,12 +511,11 @@ static __poll_t uio_poll(struct file *filep, poll_table *wait)
>   	struct uio_listener *listener = filep->private_data;
>   	struct uio_device *idev = listener->dev;
>   	__poll_t ret = 0;
> -	unsigned long flags;
>   
> -	spin_lock_irqsave(&idev->info_lock, flags);
> +	mutex_lock(&idev->info_lock);
>   	if (!idev->info || !idev->info->irq)
>   		ret = -EIO;
> -	spin_unlock_irqrestore(&idev->info_lock, flags);
> +	mutex_unlock(&idev->info_lock);
>   
>   	if (ret)
>   		return ret;
> @@ -537,12 +534,11 @@ static ssize_t uio_read(struct file *filep, char __user *buf,
>   	DECLARE_WAITQUEUE(wait, current);
>   	ssize_t retval = 0;
>   	s32 event_count;
> -	unsigned long flags;
>   
> -	spin_lock_irqsave(&idev->info_lock, flags);
> +	mutex_lock(&idev->info_lock);
>   	if (!idev->info || !idev->info->irq)
>   		retval = -EIO;
> -	spin_unlock_irqrestore(&idev->info_lock, flags);
> +	mutex_unlock(&idev->info_lock);
>   
>   	if (retval)
>   		return retval;
> @@ -592,9 +588,8 @@ static ssize_t uio_write(struct file *filep, const char __user *buf,
>   	struct uio_device *idev = listener->dev;
>   	ssize_t retval;
>   	s32 irq_on;
> -	unsigned long flags;
>   
> -	spin_lock_irqsave(&idev->info_lock, flags);
> +	mutex_lock(&idev->info_lock);
>   	if (!idev->info || !idev->info->irq) {
>   		retval = -EIO;
>   		goto out;
> @@ -618,7 +613,7 @@ static ssize_t uio_write(struct file *filep, const char __user *buf,
>   	retval = idev->info->irqcontrol(idev->info, irq_on);
>   
>   out:
> -	spin_unlock_irqrestore(&idev->info_lock, flags);
> +	mutex_unlock(&idev->info_lock);
>   	return retval ? retval : sizeof(s32);
>   }
>   
> @@ -865,7 +860,7 @@ int __uio_register_device(struct module *owner,
>   
>   	idev->owner = owner;
>   	idev->info = info;
> -	spin_lock_init(&idev->info_lock);
> +	mutex_init(&idev->info_lock);
>   	init_waitqueue_head(&idev->wait);
>   	atomic_set(&idev->event, 0);
>   
> @@ -929,7 +924,6 @@ int __uio_register_device(struct module *owner,
>   void uio_unregister_device(struct uio_info *info)
>   {
>   	struct uio_device *idev;
> -	unsigned long flags;
>   
>   	if (!info || !info->uio_dev)
>   		return;
> @@ -943,9 +937,9 @@ void uio_unregister_device(struct uio_info *info)
>   	if (info->irq && info->irq != UIO_IRQ_CUSTOM)
>   		free_irq(info->irq, idev);
>   
> -	spin_lock_irqsave(&idev->info_lock, flags);
> +	mutex_lock(&idev->info_lock);
>   	idev->info = NULL;
> -	spin_unlock_irqrestore(&idev->info_lock, flags);
> +	mutex_unlock(&idev->info_lock);
>   
>   	device_unregister(&idev->dev);
>   
> diff --git a/include/linux/uio_driver.h b/include/linux/uio_driver.h
> index 6c5f207..6f8b68c 100644
> --- a/include/linux/uio_driver.h
> +++ b/include/linux/uio_driver.h
> @@ -75,7 +75,7 @@ struct uio_device {
>           struct fasync_struct    *async_queue;
>           wait_queue_head_t       wait;
>           struct uio_info         *info;
> -	spinlock_t		info_lock;
> +	struct mutex		info_lock;
>           struct kobject          *map_dir;
>           struct kobject          *portio_dir;
>   };

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/3] uio: fix crash after the device is unregistered
  2018-07-06  2:57 ` [PATCH v3 3/3] uio: fix crash after the device is unregistered xiubli
@ 2018-07-06  3:40   ` Hamish Martin
  2018-07-06 18:23   ` Mike Christie
  2018-07-06 18:58   ` Mike Christie
  2 siblings, 0 replies; 16+ messages in thread
From: Hamish Martin @ 2018-07-06  3:40 UTC (permalink / raw)
  To: xiubli, gregkh, linux-kernel
  Cc: jannh, pkalever, pkarampu, atumball, sabose, mchristi

Looks ok to me.

Reviewed-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz>

On 07/06/2018 02:57 PM, xiubli@redhat.com wrote:
> From: Xiubo Li <xiubli@redhat.com>
>
> For the target_core_user use case, after the device is unregistered
> it maybe still opened in user space, then the kernel will crash, like:
>
> [  251.163692] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> [  251.163820] IP: [<ffffffffc0736213>] show_name+0x23/0x40 [uio]
> [  251.163965] PGD 8000000062694067 PUD 62696067 PMD 0
> [  251.164097] Oops: 0000 [#1] SMP
> ...
> [  251.165605]  e1000 mptscsih mptbase drm_panel_orientation_quirks dm_mirror dm_region_hash dm_log dm_mod
> [  251.166014] CPU: 0 PID: 13380 Comm: tcmu-runner Kdump: loaded Not tainted 3.10.0-916.el7.test.x86_64 #1
> [  251.166381] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
> [  251.166747] task: ffff971eb91db0c0 ti: ffff971e9e384000 task.ti: ffff971e9e384000
> [  251.167137] RIP: 0010:[<ffffffffc0736213>]  [<ffffffffc0736213>] show_name+0x23/0x40 [uio]
> [  251.167563] RSP: 0018:ffff971e9e387dc8  EFLAGS: 00010282
> [  251.167978] RAX: 0000000000000000 RBX: ffff971e9e3f8000 RCX: ffff971eb8368d98
> [  251.168408] RDX: ffff971e9e3f8000 RSI: ffffffffc0738084 RDI: ffff971e9e3f8000
> [  251.168856] RBP: ffff971e9e387dd0 R08: ffff971eb8bc0018 R09: 0000000000000000
> [  251.169296] R10: 0000000000001000 R11: ffffffffa09d444d R12: ffffffffa1076e80
> [  251.169750] R13: ffff971e9e387f18 R14: 0000000000000001 R15: ffff971e9cfb1c80
> [  251.170213] FS:  00007ff37d175880(0000) GS:ffff971ebb600000(0000) knlGS:0000000000000000
> [  251.170693] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  251.171248] CR2: 0000000000000008 CR3: 00000000001f6000 CR4: 00000000003607f0
> [  251.172071] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  251.172640] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [  251.173236] Call Trace:
> [  251.173789]  [<ffffffffa0c9b2d3>] dev_attr_show+0x23/0x60
> [  251.174356]  [<ffffffffa0f561b2>] ? mutex_lock+0x12/0x2f
> [  251.174892]  [<ffffffffa0ac6d9f>] sysfs_kf_seq_show+0xcf/0x1f0
> [  251.175433]  [<ffffffffa0ac54e6>] kernfs_seq_show+0x26/0x30
> [  251.175981]  [<ffffffffa0a63be0>] seq_read+0x110/0x3f0
> [  251.176609]  [<ffffffffa0ac5d45>] kernfs_fop_read+0xf5/0x160
> [  251.177158]  [<ffffffffa0a3d3af>] vfs_read+0x9f/0x170
> [  251.177707]  [<ffffffffa0a3e27f>] SyS_read+0x7f/0xf0
> [  251.178268]  [<ffffffffa0f648af>] system_call_fastpath+0x1c/0x21
> [  251.178823] Code: 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 d3 e8 7e 96 56 e0 48 8b 80 d8 02 00 00 48 89 df 48 c7 c6 84 80 73 c0 <48> 8b 50 08 31 c0 e8 e2 67 44 e0 5b 48 98 5d c3 0f 1f 00 66 2e
> [  251.180115] RIP  [<ffffffffc0736213>] show_name+0x23/0x40 [uio]
> [  251.180820]  RSP <ffff971e9e387dc8>
> [  251.181473] CR2: 0000000000000008
>
> CC: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
> CC: Mike Christie <mchristi@redhat.com>
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>   drivers/uio/uio.c | 116 ++++++++++++++++++++++++++++++++++++++++++++++--------
>   1 file changed, 99 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
> index 655ade4..a0d926e 100644
> --- a/drivers/uio/uio.c
> +++ b/drivers/uio/uio.c
> @@ -215,7 +215,20 @@ static ssize_t name_show(struct device *dev,
>   			 struct device_attribute *attr, char *buf)
>   {
>   	struct uio_device *idev = dev_get_drvdata(dev);
> -	return sprintf(buf, "%s\n", idev->info->name);
> +	int ret;
> +
> +	mutex_lock(&idev->info_lock);
> +	if (!idev->info) {
> +		ret = -EINVAL;
> +		dev_err(dev, "the device has been unregistered\n");
> +		goto out;
> +	}
> +
> +	ret = sprintf(buf, "%s\n", idev->info->name);
> +
> +out:
> +	mutex_unlock(&idev->info_lock);
> +	return ret;
>   }
>   static DEVICE_ATTR_RO(name);
>   
> @@ -223,7 +236,20 @@ static ssize_t version_show(struct device *dev,
>   			    struct device_attribute *attr, char *buf)
>   {
>   	struct uio_device *idev = dev_get_drvdata(dev);
> -	return sprintf(buf, "%s\n", idev->info->version);
> +	int ret;
> +
> +	mutex_lock(&idev->info_lock);
> +	if (!idev->info) {
> +		ret = -EINVAL;
> +		dev_err(dev, "the device has been unregistered\n");
> +		goto out;
> +	}
> +
> +	ret = sprintf(buf, "%s\n", idev->info->version);
> +
> +out:
> +	mutex_unlock(&idev->info_lock);
> +	return ret;
>   }
>   static DEVICE_ATTR_RO(version);
>   
> @@ -399,7 +425,12 @@ static void uio_free_minor(struct uio_device *idev)
>    */
>   void uio_event_notify(struct uio_info *info)
>   {
> -	struct uio_device *idev = info->uio_dev;
> +	struct uio_device *idev;
> +
> +	if (!info)
> +		return;
> +
> +	idev = info->uio_dev;
>   
>   	atomic_inc(&idev->event);
>   	wake_up_interruptible(&idev->wait);
> @@ -415,11 +446,20 @@ void uio_event_notify(struct uio_info *info)
>   static irqreturn_t uio_interrupt(int irq, void *dev_id)
>   {
>   	struct uio_device *idev = (struct uio_device *)dev_id;
> -	irqreturn_t ret = idev->info->handler(irq, idev->info);
> +	irqreturn_t ret;
> +
> +	mutex_lock(&idev->info_lock);
> +	if (!idev->info) {
> +		ret = IRQ_NONE;
> +		goto out;
> +	}
>   
> +	ret = idev->info->handler(irq, idev->info);
>   	if (ret == IRQ_HANDLED)
>   		uio_event_notify(idev->info);
>   
> +out:
> +	mutex_unlock(&idev->info_lock);
>   	return ret;
>   }
>   
> @@ -460,6 +500,12 @@ static int uio_open(struct inode *inode, struct file *filep)
>   	filep->private_data = listener;
>   
>   	mutex_lock(&idev->info_lock);
> +	if (!idev->info) {
> +		mutex_unlock(&idev->info_lock);
> +		ret = -EINVAL;
> +		goto err_alloc_listener;
> +	}
> +
>   	if (idev->info && idev->info->open)
>   		ret = idev->info->open(idev->info, inode);
>   	mutex_unlock(&idev->info_lock);
> @@ -590,6 +636,11 @@ static ssize_t uio_write(struct file *filep, const char __user *buf,
>   	s32 irq_on;
>   
>   	mutex_lock(&idev->info_lock);
> +	if (!idev->info) {
> +		retval = -EINVAL;
> +		goto out;
> +	}
> +
>   	if (!idev->info || !idev->info->irq) {
>   		retval = -EIO;
>   		goto out;
> @@ -635,10 +686,20 @@ static vm_fault_t uio_vma_fault(struct vm_fault *vmf)
>   	struct page *page;
>   	unsigned long offset;
>   	void *addr;
> +	int ret = 0;
> +	int mi;
>   
> -	int mi = uio_find_mem_index(vmf->vma);
> -	if (mi < 0)
> -		return VM_FAULT_SIGBUS;
> +	mutex_lock(&idev->info_lock);
> +	if (!idev->info) {
> +		ret = VM_FAULT_SIGBUS;
> +		goto out;
> +	}
> +
> +	mi = uio_find_mem_index(vmf->vma);
> +	if (mi < 0) {
> +		ret = VM_FAULT_SIGBUS;
> +		goto out;
> +	}
>   
>   	/*
>   	 * We need to subtract mi because userspace uses offset = N*PAGE_SIZE
> @@ -653,7 +714,11 @@ static vm_fault_t uio_vma_fault(struct vm_fault *vmf)
>   		page = vmalloc_to_page(addr);
>   	get_page(page);
>   	vmf->page = page;
> -	return 0;
> +
> +out:
> +	mutex_unlock(&idev->info_lock);
> +
> +	return ret;
>   }
>   
>   static const struct vm_operations_struct uio_logical_vm_ops = {
> @@ -678,6 +743,7 @@ static int uio_mmap_physical(struct vm_area_struct *vma)
>   	struct uio_device *idev = vma->vm_private_data;
>   	int mi = uio_find_mem_index(vma);
>   	struct uio_mem *mem;
> +
>   	if (mi < 0)
>   		return -EINVAL;
>   	mem = idev->info->mem + mi;
> @@ -719,30 +785,46 @@ static int uio_mmap(struct file *filep, struct vm_area_struct *vma)
>   
>   	vma->vm_private_data = idev;
>   
> +	mutex_lock(&idev->info_lock);
> +	if (!idev->info) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
>   	mi = uio_find_mem_index(vma);
> -	if (mi < 0)
> -		return -EINVAL;
> +	if (mi < 0) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
>   
>   	requested_pages = vma_pages(vma);
>   	actual_pages = ((idev->info->mem[mi].addr & ~PAGE_MASK)
>   			+ idev->info->mem[mi].size + PAGE_SIZE -1) >> PAGE_SHIFT;
> -	if (requested_pages > actual_pages)
> -		return -EINVAL;
> +	if (requested_pages > actual_pages) {
> +		ret = -EINVAL;
> +		goto out;
> +	}
>   
>   	if (idev->info->mmap) {
>   		ret = idev->info->mmap(idev->info, vma);
> -		return ret;
> +		goto out;
>   	}
>   
>   	switch (idev->info->mem[mi].memtype) {
>   		case UIO_MEM_PHYS:
> -			return uio_mmap_physical(vma);
> +			ret = uio_mmap_physical(vma);
> +			break;
>   		case UIO_MEM_LOGICAL:
>   		case UIO_MEM_VIRTUAL:
> -			return uio_mmap_logical(vma);
> +			ret = uio_mmap_logical(vma);
> +			break;
>   		default:
> -			return -EINVAL;
> +			ret = -EINVAL;
>   	}
> +
> +out:
> +	mutex_unlock(&idev->info_lock);
> +	return 0;
>   }
>   
>   static const struct file_operations uio_fops = {
> @@ -932,12 +1014,12 @@ void uio_unregister_device(struct uio_info *info)
>   
>   	uio_free_minor(idev);
>   
> +	mutex_lock(&idev->info_lock);
>   	uio_dev_del_attributes(idev);
>   
>   	if (info->irq && info->irq != UIO_IRQ_CUSTOM)
>   		free_irq(info->irq, idev);
>   
> -	mutex_lock(&idev->info_lock);
>   	idev->info = NULL;
>   	mutex_unlock(&idev->info_lock);
>   

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 0/3] uio: fix potential crash bug
  2018-07-06  3:31 ` [PATCH v3 0/3] uio: fix potential crash bug Hamish Martin
@ 2018-07-06 12:04   ` Xiubo Li
  0 siblings, 0 replies; 16+ messages in thread
From: Xiubo Li @ 2018-07-06 12:04 UTC (permalink / raw)
  To: Hamish Martin, gregkh, linux-kernel
  Cc: jannh, pkalever, pkarampu, atumball, sabose, mchristi

On 2018/7/6 11:31, Hamish Martin wrote:
> Hi Xiubo,
>
> Thanks for your patch that addresses the regression found with my
> earlier commit.
>
> I will take your code and run it in our scenario that showed the bug
> that led to my commit. Unfortunately I won't be able to get that done
> until mid-next week. I intend to report back to you by July 13th.
Hi Hamish,

Thanks verrrry much for your quickly review and reply about this. Take 
your time :-)

BRs
Xiubo


> Thanks,
> Hamish M
>
> On 07/06/2018 02:57 PM, xiubli@redhat.com wrote:
>> From: Xiubo Li <xiubli@redhat.com>
>>
>> The V2 patch set maybe not post successfully, so post it again
>> with one new updation.
>>
>> V2:
>> - resend it with some small fix
>>
>> V3:
>> - switch to use request_threaded_irq
>>
>>
>> Xiubo Li (3):
>>     uio: use request_threaded_irq instead
>>     uio: change to use the mutex lock instead of the spin lock
>>     uio: fix crash after the device is unregistered
>>
>>    drivers/uio/uio.c          | 151 ++++++++++++++++++++++++++++++++++-----------
>>    include/linux/uio_driver.h |   2 +-
>>    2 files changed, 115 insertions(+), 38 deletions(-)
>>


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/3] uio: fix crash after the device is unregistered
  2018-07-06  2:57 ` [PATCH v3 3/3] uio: fix crash after the device is unregistered xiubli
  2018-07-06  3:40   ` Hamish Martin
@ 2018-07-06 18:23   ` Mike Christie
  2018-07-07  1:28     ` Xiubo Li
  2018-07-06 18:58   ` Mike Christie
  2 siblings, 1 reply; 16+ messages in thread
From: Mike Christie @ 2018-07-06 18:23 UTC (permalink / raw)
  To: xiubli, gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose

On 07/05/2018 09:57 PM, xiubli@redhat.com wrote:
>  static irqreturn_t uio_interrupt(int irq, void *dev_id)
>  {
>  	struct uio_device *idev = (struct uio_device *)dev_id;
> -	irqreturn_t ret = idev->info->handler(irq, idev->info);
> +	irqreturn_t ret;
> +
> +	mutex_lock(&idev->info_lock);
> +	if (!idev->info) {
> +		ret = IRQ_NONE;
> +		goto out;
> +	}
>  
> +	ret = idev->info->handler(irq, idev->info);
>  	if (ret == IRQ_HANDLED)
>  		uio_event_notify(idev->info);
>  
> +out:
> +	mutex_unlock(&idev->info_lock);
>  	return ret;
>  }


Do you need the interrupt related changes in this patch and the first
one? When we do uio_unregister_device -> free_irq does free_irq return
when there are no longer running interrupt handlers that we requested?

If that is not the case then I think we can hit a similar bug. We do:

__uio_register_device -> device_register -> device's refcount goes to
zero so we do -> uio_device_release -> kfree(idev)

and if it is possible the interrupt handler could still run after
free_irq then we would end up doing:

uio_interrupt -> mutex_lock(&idev->info_lock) -> idev access freed memory.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/3] uio: fix crash after the device is unregistered
  2018-07-06  2:57 ` [PATCH v3 3/3] uio: fix crash after the device is unregistered xiubli
  2018-07-06  3:40   ` Hamish Martin
  2018-07-06 18:23   ` Mike Christie
@ 2018-07-06 18:58   ` Mike Christie
  2018-07-07  1:47     ` Xiubo Li
  2 siblings, 1 reply; 16+ messages in thread
From: Mike Christie @ 2018-07-06 18:58 UTC (permalink / raw)
  To: xiubli, gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose

On 07/05/2018 09:57 PM, xiubli@redhat.com wrote:
>  void uio_event_notify(struct uio_info *info)
>  {
> -	struct uio_device *idev = info->uio_dev;
> +	struct uio_device *idev;
> +
> +	if (!info)
> +		return;
> +
> +	idev = info->uio_dev;
>  

For this one too, I am not sure if it is needed.

uio_interrupt -> uio_event_notify. See other mail.

driver XYZ -> uio_event_notify. I think drivers need to handle this and
set some bits and/or perform some cleanup to make sure they are not
calling uio_event_notify after it has called uio_unregister_device. The
problem with the above test is if they do not they could have called
uio_unregister_device right after the info test so you could still hit
the problem.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/3] uio: fix crash after the device is unregistered
  2018-07-06 18:23   ` Mike Christie
@ 2018-07-07  1:28     ` Xiubo Li
  2018-07-09 17:06       ` Mike Christie
  0 siblings, 1 reply; 16+ messages in thread
From: Xiubo Li @ 2018-07-07  1:28 UTC (permalink / raw)
  To: Mike Christie, gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose

On 2018/7/7 2:23, Mike Christie wrote:
> On 07/05/2018 09:57 PM, xiubli@redhat.com wrote:
>>   static irqreturn_t uio_interrupt(int irq, void *dev_id)
>>   {
>>   	struct uio_device *idev = (struct uio_device *)dev_id;
>> -	irqreturn_t ret = idev->info->handler(irq, idev->info);
>> +	irqreturn_t ret;
>> +
>> +	mutex_lock(&idev->info_lock);
>> +	if (!idev->info) {
>> +		ret = IRQ_NONE;
>> +		goto out;
>> +	}
>>   
>> +	ret = idev->info->handler(irq, idev->info);
>>   	if (ret == IRQ_HANDLED)
>>   		uio_event_notify(idev->info);
>>   
>> +out:
>> +	mutex_unlock(&idev->info_lock);
>>   	return ret;
>>   }
>
> Do you need the interrupt related changes in this patch and the first
> one?
Actually, the NULL checking is not a must, we can remove this. But the 
lock/unlock is needed.
>   When we do uio_unregister_device -> free_irq does free_irq return
> when there are no longer running interrupt handlers that we requested?
>
> If that is not the case then I think we can hit a similar bug. We do:
>
> __uio_register_device -> device_register -> device's refcount goes to
> zero so we do -> uio_device_release -> kfree(idev)
>
> and if it is possible the interrupt handler could still run after
> free_irq then we would end up doing:
>
> uio_interrupt -> mutex_lock(&idev->info_lock) -> idev access freed memory.

I think this shouldn't happen. Because the free_irq function does not 
return until any executing interrupts for this IRQ have completed.

Thanks,
BRs


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/3] uio: fix crash after the device is unregistered
  2018-07-06 18:58   ` Mike Christie
@ 2018-07-07  1:47     ` Xiubo Li
  2018-07-09 16:40       ` Mike Christie
  0 siblings, 1 reply; 16+ messages in thread
From: Xiubo Li @ 2018-07-07  1:47 UTC (permalink / raw)
  To: Mike Christie, gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose

On 2018/7/7 2:58, Mike Christie wrote:
> On 07/05/2018 09:57 PM, xiubli@redhat.com wrote:
>>   void uio_event_notify(struct uio_info *info)
>>   {
>> -	struct uio_device *idev = info->uio_dev;
>> +	struct uio_device *idev;
>> +
>> +	if (!info)
>> +		return;
>> +
>> +	idev = info->uio_dev;
>>   
> For this one too, I am not sure if it is needed.
>
> uio_interrupt -> uio_event_notify. See other mail.
>
> driver XYZ -> uio_event_notify. I think drivers need to handle this and
> set some bits and/or perform some cleanup to make sure they are not
> calling uio_event_notify after it has called uio_unregister_device. The
> problem with the above test is if they do not they could have called
> uio_unregister_device right after the info test so you could still hit
> the problem.

When we are tcmu_destroy_device(), if the netlink notify event to 
userspace is not successful then the TCMU will call the uio unregister, 
which will set the idev->info = NULL, without close and deleting the 
device in userspace. But the TCMU could still queue cmds to the ring 
buffer, then the uio_event_notify will be called.

For this case only when using idev->info it would happen. And currently 
there is no need to check this, I will remove it for now.

Thanks,

BRs


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/3] uio: fix crash after the device is unregistered
  2018-07-07  1:47     ` Xiubo Li
@ 2018-07-09 16:40       ` Mike Christie
  2018-07-10  2:36         ` Xiubo Li
  0 siblings, 1 reply; 16+ messages in thread
From: Mike Christie @ 2018-07-09 16:40 UTC (permalink / raw)
  To: Xiubo Li, gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose

On 07/06/2018 08:47 PM, Xiubo Li wrote:
> On 2018/7/7 2:58, Mike Christie wrote:
>> On 07/05/2018 09:57 PM, xiubli@redhat.com wrote:
>>>   void uio_event_notify(struct uio_info *info)
>>>   {
>>> -    struct uio_device *idev = info->uio_dev;
>>> +    struct uio_device *idev;
>>> +
>>> +    if (!info)
>>> +        return;
>>> +
>>> +    idev = info->uio_dev;
>>>   
>> For this one too, I am not sure if it is needed.
>>
>> uio_interrupt -> uio_event_notify. See other mail.
>>
>> driver XYZ -> uio_event_notify. I think drivers need to handle this and
>> set some bits and/or perform some cleanup to make sure they are not
>> calling uio_event_notify after it has called uio_unregister_device. The
>> problem with the above test is if they do not they could have called
>> uio_unregister_device right after the info test so you could still hit
>> the problem.
> 
> When we are tcmu_destroy_device(), if the netlink notify event to
> userspace is not successful then the TCMU will call the uio unregister,
> which will set the idev->info = NULL, without close and deleting the
> device in userspace. But the TCMU could still queue cmds to the ring
> buffer, then the uio_event_notify will be called.

Before tcmu_destroy_device is called LIO will have stopped new IO and
waited for outstanding IO to be completed, so it would never be called
after uio_unregister_device. If it does, it's a bug in LIO.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/3] uio: fix crash after the device is unregistered
  2018-07-07  1:28     ` Xiubo Li
@ 2018-07-09 17:06       ` Mike Christie
  2018-07-10  2:40         ` Xiubo Li
  0 siblings, 1 reply; 16+ messages in thread
From: Mike Christie @ 2018-07-09 17:06 UTC (permalink / raw)
  To: Xiubo Li, gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose

On 07/06/2018 08:28 PM, Xiubo Li wrote:
> On 2018/7/7 2:23, Mike Christie wrote:
>> On 07/05/2018 09:57 PM, xiubli@redhat.com wrote:
>>>   static irqreturn_t uio_interrupt(int irq, void *dev_id)
>>>   {
>>>       struct uio_device *idev = (struct uio_device *)dev_id;
>>> -    irqreturn_t ret = idev->info->handler(irq, idev->info);
>>> +    irqreturn_t ret;
>>> +
>>> +    mutex_lock(&idev->info_lock);
>>> +    if (!idev->info) {
>>> +        ret = IRQ_NONE;
>>> +        goto out;
>>> +    }
>>>   +    ret = idev->info->handler(irq, idev->info);
>>>       if (ret == IRQ_HANDLED)
>>>           uio_event_notify(idev->info);
>>>   +out:
>>> +    mutex_unlock(&idev->info_lock);
>>>       return ret;
>>>   }
>>
>> Do you need the interrupt related changes in this patch and the first
>> one?
> Actually, the NULL checking is not a must, we can remove this. But the
> lock/unlock is needed.
>>   When we do uio_unregister_device -> free_irq does free_irq return
>> when there are no longer running interrupt handlers that we requested?
>>
>> If that is not the case then I think we can hit a similar bug. We do:
>>
>> __uio_register_device -> device_register -> device's refcount goes to
>> zero so we do -> uio_device_release -> kfree(idev)
>>
>> and if it is possible the interrupt handler could still run after
>> free_irq then we would end up doing:
>>
>> uio_interrupt -> mutex_lock(&idev->info_lock) -> idev access freed
>> memory.
> 
> I think this shouldn't happen. Because the free_irq function does not
> return until any executing interrupts for this IRQ have completed.
> 

If free_irq returns after executing interrupts and does not allow new
executions what is the lock protecting in uio_interrupt?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/3] uio: fix crash after the device is unregistered
  2018-07-09 16:40       ` Mike Christie
@ 2018-07-10  2:36         ` Xiubo Li
  0 siblings, 0 replies; 16+ messages in thread
From: Xiubo Li @ 2018-07-10  2:36 UTC (permalink / raw)
  To: Mike Christie, gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose

On 2018/7/10 0:40, Mike Christie wrote:
> On 07/06/2018 08:47 PM, Xiubo Li wrote:
>> On 2018/7/7 2:58, Mike Christie wrote:
>>> On 07/05/2018 09:57 PM, xiubli@redhat.com wrote:
>>>>    void uio_event_notify(struct uio_info *info)
>>>>    {
>>>> -    struct uio_device *idev = info->uio_dev;
>>>> +    struct uio_device *idev;
>>>> +
>>>> +    if (!info)
>>>> +        return;
>>>> +
>>>> +    idev = info->uio_dev;
>>>>    
>>> For this one too, I am not sure if it is needed.
>>>
>>> uio_interrupt -> uio_event_notify. See other mail.
>>>
>>> driver XYZ -> uio_event_notify. I think drivers need to handle this and
>>> set some bits and/or perform some cleanup to make sure they are not
>>> calling uio_event_notify after it has called uio_unregister_device. The
>>> problem with the above test is if they do not they could have called
>>> uio_unregister_device right after the info test so you could still hit
>>> the problem.
>> When we are tcmu_destroy_device(), if the netlink notify event to
>> userspace is not successful then the TCMU will call the uio unregister,
>> which will set the idev->info = NULL, without close and deleting the
>> device in userspace. But the TCMU could still queue cmds to the ring
>> buffer, then the uio_event_notify will be called.
> Before tcmu_destroy_device is called LIO will have stopped new IO and
> waited for outstanding IO to be completed, so it would never be called
> after uio_unregister_device. If it does, it's a bug in LIO.
Yeah, So this should be fixed now. Only hit one crash seem related to 
uio of this code with lower kernel before.

Thanks,
BRs




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3 3/3] uio: fix crash after the device is unregistered
  2018-07-09 17:06       ` Mike Christie
@ 2018-07-10  2:40         ` Xiubo Li
  0 siblings, 0 replies; 16+ messages in thread
From: Xiubo Li @ 2018-07-10  2:40 UTC (permalink / raw)
  To: Mike Christie, gregkh, linux-kernel
  Cc: hamish.martin, jannh, pkalever, pkarampu, atumball, sabose

On 2018/7/10 1:06, Mike Christie wrote:
> On 07/06/2018 08:28 PM, Xiubo Li wrote:
>> On 2018/7/7 2:23, Mike Christie wrote:
>>> On 07/05/2018 09:57 PM, xiubli@redhat.com wrote:
>>>>    static irqreturn_t uio_interrupt(int irq, void *dev_id)
>>>>    {
>>>>        struct uio_device *idev = (struct uio_device *)dev_id;
>>>> -    irqreturn_t ret = idev->info->handler(irq, idev->info);
>>>> +    irqreturn_t ret;
>>>> +
>>>> +    mutex_lock(&idev->info_lock);
>>>> +    if (!idev->info) {
>>>> +        ret = IRQ_NONE;
>>>> +        goto out;
>>>> +    }
>>>>    +    ret = idev->info->handler(irq, idev->info);
>>>>        if (ret == IRQ_HANDLED)
>>>>            uio_event_notify(idev->info);
>>>>    +out:
>>>> +    mutex_unlock(&idev->info_lock);
>>>>        return ret;
>>>>    }
>>> Do you need the interrupt related changes in this patch and the first
>>> one?
>> Actually, the NULL checking is not a must, we can remove this. But the
>> lock/unlock is needed.
>>>    When we do uio_unregister_device -> free_irq does free_irq return
>>> when there are no longer running interrupt handlers that we requested?
>>>
>>> If that is not the case then I think we can hit a similar bug. We do:
>>>
>>> __uio_register_device -> device_register -> device's refcount goes to
>>> zero so we do -> uio_device_release -> kfree(idev)
>>>
>>> and if it is possible the interrupt handler could still run after
>>> free_irq then we would end up doing:
>>>
>>> uio_interrupt -> mutex_lock(&idev->info_lock) -> idev access freed
>>> memory.
>> I think this shouldn't happen. Because the free_irq function does not
>> return until any executing interrupts for this IRQ have completed.
>>
> If free_irq returns after executing interrupts and does not allow new
> executions what is the lock protecting in uio_interrupt?
>
I meant idev->info->handler(irq, idev->info), it may should be protected 
by the related lock in each driver.

Thanks,


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2018-07-10  2:40 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-06  2:57 [PATCH v3 0/3] uio: fix potential crash bug xiubli
2018-07-06  2:57 ` [PATCH v3 1/3] uio: use request_threaded_irq instead xiubli
2018-07-06  2:57 ` [PATCH v3 2/3] uio: change to use the mutex lock instead of the spin lock xiubli
2018-07-06  3:39   ` Hamish Martin
2018-07-06  2:57 ` [PATCH v3 3/3] uio: fix crash after the device is unregistered xiubli
2018-07-06  3:40   ` Hamish Martin
2018-07-06 18:23   ` Mike Christie
2018-07-07  1:28     ` Xiubo Li
2018-07-09 17:06       ` Mike Christie
2018-07-10  2:40         ` Xiubo Li
2018-07-06 18:58   ` Mike Christie
2018-07-07  1:47     ` Xiubo Li
2018-07-09 16:40       ` Mike Christie
2018-07-10  2:36         ` Xiubo Li
2018-07-06  3:31 ` [PATCH v3 0/3] uio: fix potential crash bug Hamish Martin
2018-07-06 12:04   ` Xiubo Li

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.