linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] media: uvcvideo: Fix race conditions
@ 2020-08-30 15:04 Guenter Roeck
  2020-08-30 15:04 ` [PATCH 1/5] media: uvcvideo: Cancel async worker earlier Guenter Roeck
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Guenter Roeck @ 2020-08-30 15:04 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck

The uvcvideo code has no lock protection against USB disconnects
while video operations are ongoing. This has resulted in random
error reports, typically pointing to a crash in usb_ifnum_to_if(),
called from usb_hcd_alloc_bandwidth(). A typical traceback is as
follows.

usb 1-4: USB disconnect, device number 3
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
RIP: 0010:usb_ifnum_to_if+0x29/0x40
Code: <...>
RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
Call Trace:
 usb_hcd_alloc_bandwidth+0x1ee/0x30f
 usb_set_interface+0x1a3/0x2b7
 uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
 uvc_video_start_streaming+0x91/0xdd [uvcvideo]
 uvc_start_streaming+0x28/0x5d [uvcvideo]
 vb2_start_streaming+0x61/0x143 [videobuf2_common]
 vb2_core_streamon+0xf7/0x10f [videobuf2_common]
 uvc_queue_streamon+0x2e/0x41 [uvcvideo]
 uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
 __video_do_ioctl+0x33d/0x42a
 video_usercopy+0x34e/0x5ff
 ? video_ioctl2+0x16/0x16
 v4l2_ioctl+0x46/0x53
 do_vfs_ioctl+0x50a/0x76f
 ksys_ioctl+0x58/0x83
 __x64_sys_ioctl+0x1a/0x1e
 do_syscall_64+0x54/0xde

While this is problem rarely observed in the field, it is relatively easy
to reproduce by adding msleep() calls into the code.

I don't presume to claim that I found every issue, but this patch series
should fix at least the major problems.

The patch series was tested exensively on a Chromebook running chromeos-4.19
and on a Linux system running a v5.8.y based kernel.

----------------------------------------------------------------
Guenter Roeck (5):
      media: uvcvideo: Cancel async worker earlier
      media: uvcvideo: Lock video streams and queues while unregistering
      media: uvcvideo: Release stream queue when unregistering video device
      media: uvcvideo: Protect uvc queue file operations against disconnect
      media: uvcvideo: In uvc_v4l2_open, check if video device is registered

 drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
 drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
 drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
 drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
 drivers/media/usb/uvc/uvcvideo.h   |  1 +
 5 files changed, 93 insertions(+), 8 deletions(-)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/5] media: uvcvideo: Cancel async worker earlier
  2020-08-30 15:04 [PATCH 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
@ 2020-08-30 15:04 ` Guenter Roeck
  2020-08-30 15:04 ` [PATCH 2/5] media: uvcvideo: Lock video streams and queues while unregistering Guenter Roeck
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Guenter Roeck @ 2020-08-30 15:04 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck, Alan Stern,
	Hans Verkuil

So far the asynchronous control worker was canceled only in
uvc_ctrl_cleanup_device. This is much later than the call to
uvc_disconnect. However, after the call to uvc_disconnect returns,
there must be no more USB activity. This can result in all kinds
of problems in the USB code. One observed example:

URB ffff993e83d0bc00 submitted while active
WARNING: CPU: 0 PID: 4046 at drivers/usb/core/urb.c:364 usb_submit_urb+0x4ba/0x55e
Modules linked in: <...>
CPU: 0 PID: 4046 Comm: kworker/0:35 Not tainted 4.19.139 #18
Hardware name: Google Phaser/Phaser, BIOS Google_Phaser.10952.0.0 08/09/2018
Workqueue: events uvc_ctrl_status_event_work [uvcvideo]
RIP: 0010:usb_submit_urb+0x4ba/0x55e
Code: <...>
RSP: 0018:ffffb08d471ebde8 EFLAGS: 00010246
RAX: a6da85d923ea5d00 RBX: ffff993e71985928 RCX: 0000000000000000
RDX: ffff993f37a1de90 RSI: ffff993f37a153d0 RDI: ffff993f37a153d0
RBP: ffffb08d471ebe28 R08: 000000000000003b R09: 001424bf85822e96
R10: 0000001000000000 R11: ffffffff975a4398 R12: ffff993e83d0b000
R13: ffff993e83d0bc00 R14: 0000000000000000 R15: 00000000fffffff0
FS:  0000000000000000(0000) GS:ffff993f37a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000ec9c0000 CR3: 000000025b160000 CR4: 0000000000340ef0
Call Trace:
 uvc_ctrl_status_event_work+0xd6/0x107 [uvcvideo]
 process_one_work+0x19b/0x4c5
 worker_thread+0x10d/0x286
 kthread+0x138/0x140
 ? process_one_work+0x4c5/0x4c5
 ? kthread_associate_blkcg+0xc1/0xc1
 ret_from_fork+0x1f/0x40

Introduce new function uvc_ctrl_stop_device() to cancel the worker
and call it from uvc_unregister_video() to solve the problem.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/media/usb/uvc/uvc_ctrl.c   | 11 +++++++----
 drivers/media/usb/uvc/uvc_driver.c |  1 +
 drivers/media/usb/uvc/uvcvideo.h   |  1 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
index e399b9fad757..130c56e0063d 100644
--- a/drivers/media/usb/uvc/uvc_ctrl.c
+++ b/drivers/media/usb/uvc/uvc_ctrl.c
@@ -2340,14 +2340,17 @@ static void uvc_ctrl_cleanup_mappings(struct uvc_device *dev,
 	}
 }
 
-void uvc_ctrl_cleanup_device(struct uvc_device *dev)
+void uvc_ctrl_stop_device(struct uvc_device *dev)
 {
-	struct uvc_entity *entity;
-	unsigned int i;
-
 	/* Can be uninitialized if we are aborting on probe error. */
 	if (dev->async_ctrl.work.func)
 		cancel_work_sync(&dev->async_ctrl.work);
+}
+
+void uvc_ctrl_cleanup_device(struct uvc_device *dev)
+{
+	struct uvc_entity *entity;
+	unsigned int i;
 
 	/* Free controls and control mappings for all entities. */
 	list_for_each_entry(entity, &dev->entities, list) {
diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
index 431d86e1c94b..bfba67a69185 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1952,6 +1952,7 @@ static void uvc_unregister_video(struct uvc_device *dev)
 	}
 
 	uvc_status_unregister(dev);
+	uvc_ctrl_stop_device(dev);
 
 	if (dev->vdev.dev)
 		v4l2_device_unregister(&dev->vdev);
diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
index 6ab972c643e3..543afcd9fd26 100644
--- a/drivers/media/usb/uvc/uvcvideo.h
+++ b/drivers/media/usb/uvc/uvcvideo.h
@@ -830,6 +830,7 @@ int uvc_query_v4l2_menu(struct uvc_video_chain *chain,
 int uvc_ctrl_add_mapping(struct uvc_video_chain *chain,
 			 const struct uvc_control_mapping *mapping);
 int uvc_ctrl_init_device(struct uvc_device *dev);
+void uvc_ctrl_stop_device(struct uvc_device *dev);
 void uvc_ctrl_cleanup_device(struct uvc_device *dev);
 int uvc_ctrl_restore_values(struct uvc_device *dev);
 bool uvc_ctrl_status_event(struct urb *urb, struct uvc_video_chain *chain,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/5] media: uvcvideo: Lock video streams and queues while unregistering
  2020-08-30 15:04 [PATCH 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
  2020-08-30 15:04 ` [PATCH 1/5] media: uvcvideo: Cancel async worker earlier Guenter Roeck
@ 2020-08-30 15:04 ` Guenter Roeck
  2020-08-30 15:04 ` [PATCH 3/5] media: uvcvideo: Release stream queue when unregistering video device Guenter Roeck
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Guenter Roeck @ 2020-08-30 15:04 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck, Alan Stern,
	Hans Verkuil

The call to uvc_disconnect() is not protected by any mutex.
This means it can and will be called while other accesses to the video
device are in progress. This can cause all kinds of race conditions,
including crashes such as the following.

usb 1-4: USB disconnect, device number 3
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
RIP: 0010:usb_ifnum_to_if+0x29/0x40
Code: <...>
RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
Call Trace:
 usb_hcd_alloc_bandwidth+0x1ee/0x30f
 usb_set_interface+0x1a3/0x2b7
 uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
 uvc_video_start_streaming+0x91/0xdd [uvcvideo]
 uvc_start_streaming+0x28/0x5d [uvcvideo]
 vb2_start_streaming+0x61/0x143 [videobuf2_common]
 vb2_core_streamon+0xf7/0x10f [videobuf2_common]
 uvc_queue_streamon+0x2e/0x41 [uvcvideo]
 uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
 __video_do_ioctl+0x33d/0x42a
 video_usercopy+0x34e/0x5ff
 ? video_ioctl2+0x16/0x16
 v4l2_ioctl+0x46/0x53
 do_vfs_ioctl+0x50a/0x76f
 ksys_ioctl+0x58/0x83
 __x64_sys_ioctl+0x1a/0x1e
 do_syscall_64+0x54/0xde

usb_set_interface() should not be called after the USB device has been
unregistered. However, in the above case the disconnect happened after
v4l2_ioctl() was called, but before the call to usb_ifnum_to_if().

Acquire various mutexes in uvc_unregister_video() to fix the majority
(maybe all) of the observed race conditions.

The uvc_device lock prevents races against suspend and resume calls
and the poll function.

The uvc_streaming lock prevents races against stream related functions;
for the most part, those are ioctls. This lock also requires other
functions using this lock to check if a video device is still registered
after acquiring it. For example, it was observed that the video device
was already unregistered by the time the stream lock was acquired in
uvc_ioctl_streamon().

The uvc_queue lock prevents races against queue functions, Most of
those are already protected by the uvc_streaming lock, but some
are called directly. This is done as added protection; an actual race
was not (yet) observed.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/media/usb/uvc/uvc_driver.c |  9 +++++++
 drivers/media/usb/uvc/uvc_v4l2.c   | 39 +++++++++++++++++++++++++++++-
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
index bfba67a69185..a5808305f1e3 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1941,14 +1941,22 @@ static void uvc_unregister_video(struct uvc_device *dev)
 {
 	struct uvc_streaming *stream;
 
+	mutex_lock(&dev->lock);
+
 	list_for_each_entry(stream, &dev->streams, list) {
 		if (!video_is_registered(&stream->vdev))
 			continue;
 
+		mutex_lock(&stream->mutex);
+		mutex_lock(&stream->queue.mutex);
+
 		video_unregister_device(&stream->vdev);
 		video_unregister_device(&stream->meta.vdev);
 
 		uvc_debugfs_cleanup_stream(stream);
+
+		mutex_unlock(&stream->queue.mutex);
+		mutex_unlock(&stream->mutex);
 	}
 
 	uvc_status_unregister(dev);
@@ -1960,6 +1968,7 @@ static void uvc_unregister_video(struct uvc_device *dev)
 	if (media_devnode_is_registered(dev->mdev.devnode))
 		media_device_unregister(&dev->mdev);
 #endif
+	mutex_unlock(&dev->lock);
 }
 
 int uvc_register_video_device(struct uvc_device *dev,
diff --git a/drivers/media/usb/uvc/uvc_v4l2.c b/drivers/media/usb/uvc/uvc_v4l2.c
index 0335e69b70ab..7e5e583dae5e 100644
--- a/drivers/media/usb/uvc/uvc_v4l2.c
+++ b/drivers/media/usb/uvc/uvc_v4l2.c
@@ -237,6 +237,12 @@ static int uvc_v4l2_try_format(struct uvc_streaming *stream,
 	 * the Windows driver).
 	 */
 	mutex_lock(&stream->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		mutex_unlock(&stream->mutex);
+		ret = -ENODEV;
+		goto done;
+	}
+
 	if (stream->dev->quirks & UVC_QUIRK_PROBE_EXTRAFIELDS)
 		probe->dwMaxVideoFrameSize =
 			stream->ctrl.dwMaxVideoFrameSize;
@@ -274,6 +280,12 @@ static int uvc_v4l2_get_format(struct uvc_streaming *stream,
 		return -EINVAL;
 
 	mutex_lock(&stream->mutex);
+
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto done;
+	}
+
 	format = stream->cur_format;
 	frame = stream->cur_frame;
 
@@ -312,6 +324,11 @@ static int uvc_v4l2_set_format(struct uvc_streaming *stream,
 
 	mutex_lock(&stream->mutex);
 
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto done;
+	}
+
 	if (uvc_queue_allocated(&stream->queue)) {
 		ret = -EBUSY;
 		goto done;
@@ -387,6 +404,11 @@ static int uvc_v4l2_set_streamparm(struct uvc_streaming *stream,
 
 	mutex_lock(&stream->mutex);
 
+	if (!video_is_registered(&stream->vdev)) {
+		mutex_unlock(&stream->mutex);
+		return -ENODEV;
+	}
+
 	if (uvc_queue_streaming(&stream->queue)) {
 		mutex_unlock(&stream->mutex);
 		return -EBUSY;
@@ -713,6 +735,10 @@ static int uvc_ioctl_reqbufs(struct file *file, void *fh,
 		return ret;
 
 	mutex_lock(&stream->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		mutex_unlock(&stream->mutex);
+		return -ENODEV;
+	}
 	ret = uvc_request_buffers(&stream->queue, rb);
 	mutex_unlock(&stream->mutex);
 	if (ret < 0)
@@ -797,7 +823,12 @@ static int uvc_ioctl_streamon(struct file *file, void *fh,
 		return -EBUSY;
 
 	mutex_lock(&stream->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
 	ret = uvc_queue_streamon(&stream->queue, type);
+unlock:
 	mutex_unlock(&stream->mutex);
 
 	return ret;
@@ -808,15 +839,21 @@ static int uvc_ioctl_streamoff(struct file *file, void *fh,
 {
 	struct uvc_fh *handle = fh;
 	struct uvc_streaming *stream = handle->stream;
+	int ret = 0;
 
 	if (!uvc_has_privileges(handle))
 		return -EBUSY;
 
 	mutex_lock(&stream->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
 	uvc_queue_streamoff(&stream->queue, type);
+unlock:
 	mutex_unlock(&stream->mutex);
 
-	return 0;
+	return ret;
 }
 
 static int uvc_ioctl_enum_input(struct file *file, void *fh,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/5] media: uvcvideo: Release stream queue when unregistering video device
  2020-08-30 15:04 [PATCH 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
  2020-08-30 15:04 ` [PATCH 1/5] media: uvcvideo: Cancel async worker earlier Guenter Roeck
  2020-08-30 15:04 ` [PATCH 2/5] media: uvcvideo: Lock video streams and queues while unregistering Guenter Roeck
@ 2020-08-30 15:04 ` Guenter Roeck
  2020-08-30 15:04 ` [PATCH 4/5] media: uvcvideo: Protect uvc queue file operations against disconnect Guenter Roeck
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 13+ messages in thread
From: Guenter Roeck @ 2020-08-30 15:04 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck, Alan Stern,
	Hans Verkuil

The following traceback was observed when stress testing the uvcvideo
driver.

config->interface[0] is NULL
WARNING: CPU: 0 PID: 2757 at drivers/usb/core/usb.c:285 usb_ifnum_to_if+0x81/0x85
^^^ added log message and WARN() to prevent crash
Modules linked in: <...>
CPU: 0 PID: 2757 Comm: inotify_reader Not tainted 4.19.139 #20
Hardware name: Google Phaser/Phaser, BIOS Google_Phaser.10952.0.0 08/09/2018
RIP: 0010:usb_ifnum_to_if+0x81/0x85
Code: <...>
RSP: 0018:ffff9ee20141fa58 EFLAGS: 00010246
RAX: 438a0e5a525f1800 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff975477a1de90 RSI: ffff975477a153d0 RDI: ffff975477a153d0
RBP: ffff9ee20141fa70 R08: 000000000000002c R09: 002daec189138d78
R10: 0000001000000000 R11: ffffffffa7da42e6 R12: ffff975459594800
R13: 0000000000000001 R14: 0000000000000001 R15: ffff975465376400
FS:  00007ddebffd6700(0000) GS:ffff975477a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000012c83000 CR3: 00000001bbaf8000 CR4: 0000000000340ef0
Call Trace:
 usb_set_interface+0x3b/0x2f3
 uvc_video_stop_streaming+0x38/0x81 [uvcvideo]
 uvc_stop_streaming+0x21/0x49 [uvcvideo]
 __vb2_queue_cancel+0x33/0x249 [videobuf2_common]
 vb2_core_queue_release+0x1c/0x45 [videobuf2_common]
 uvc_queue_release+0x26/0x32 [uvcvideo]
 uvc_v4l2_release+0x41/0xdd [uvcvideo]
 v4l2_release+0x99/0xed
 __fput+0xf0/0x1ea
 task_work_run+0x7f/0xa7
 do_exit+0x1d1/0x6eb
 do_group_exit+0x9d/0xac
 get_signal+0x135/0x482
 do_signal+0x4a/0x63c
 exit_to_usermode_loop+0x86/0xa5
 do_syscall_64+0x171/0x269
 ? __do_page_fault+0x272/0x474
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ddec156dc26
Code: Bad RIP value.
RSP: 002b:00007ddebffd5c10 EFLAGS: 00000293 ORIG_RAX: 0000000000000017
RAX: fffffffffffffdfe RBX: 000000000000000a RCX: 00007ddec156dc26
RDX: 0000000000000000 RSI: 00007ddebffd5d28 RDI: 000000000000000a
RBP: 00007ddebffd5c50 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 00007ddebffd5d28
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

When this was observed, a USB disconnect was in progress, uvc_disconnect()
had returned, but usb_disable_device() was still running.
Drivers are not supposed to call any USB functions after the driver
disconnect function has been called. The uvcvideo driver does not follow
that convention and tries to write to the disconnected USB interface
anyway. This results in a variety of race conditions.

To solve this specific problem, release the uvc queue from
uvc_unregister_video() after the associated video devices have been
unregistered. Since the function already holds the uvc queue mutex,
bypass uvc_queue_release() and call vb2_queue_release() directly.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/media/usb/uvc/uvc_driver.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
index a5808305f1e3..27b72806b9b9 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1955,6 +1955,8 @@ static void uvc_unregister_video(struct uvc_device *dev)
 
 		uvc_debugfs_cleanup_stream(stream);
 
+		vb2_queue_release(&stream->queue.queue);
+
 		mutex_unlock(&stream->queue.mutex);
 		mutex_unlock(&stream->mutex);
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/5] media: uvcvideo: Protect uvc queue file operations against disconnect
  2020-08-30 15:04 [PATCH 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
                   ` (2 preceding siblings ...)
  2020-08-30 15:04 ` [PATCH 3/5] media: uvcvideo: Release stream queue when unregistering video device Guenter Roeck
@ 2020-08-30 15:04 ` Guenter Roeck
  2020-09-01 16:51   ` kernel test robot
  2020-08-30 15:04 ` [PATCH 5/5] media: uvcvideo: In uvc_v4l2_open, check if video device is registered Guenter Roeck
  2020-08-30 15:58 ` [PATCH 0/5] media: uvcvideo: Fix race conditions Laurent Pinchart
  5 siblings, 1 reply; 13+ messages in thread
From: Guenter Roeck @ 2020-08-30 15:04 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck, Alan Stern,
	Hans Verkuil

uvc queue file operations have no mutex protection against USB
disconnect. This is questionable at least for the poll operation,
which has to wait for the uvc queue mutex. By the time that mutex
has been acquired, is it possible that the video device has been
unregistered.

Protect all file operations by using the queue mutex to avoid
possible race conditions. After acquiring the mutex, check if
the video device is still registered, and bail out if not.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/media/usb/uvc/uvc_queue.c | 32 +++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_queue.c b/drivers/media/usb/uvc/uvc_queue.c
index cd60c6c1749e..c7964177a99d 100644
--- a/drivers/media/usb/uvc/uvc_queue.c
+++ b/drivers/media/usb/uvc/uvc_queue.c
@@ -358,24 +358,52 @@ int uvc_queue_streamoff(struct uvc_video_queue *queue, enum v4l2_buf_type type)
 
 int uvc_queue_mmap(struct uvc_video_queue *queue, struct vm_area_struct *vma)
 {
-	return vb2_mmap(&queue->queue, vma);
+	struct uvc_streaming *stream = uvc_queue_to_stream(queue);
+	int ret;
+
+	mutex_lock(&queue->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+	ret = vb2_mmap(&queue->queue, vma);
+unlock:
+	mutex_unlock(&queue->mutex);
+	return ret;
 }
 
 #ifndef CONFIG_MMU
 unsigned long uvc_queue_get_unmapped_area(struct uvc_video_queue *queue,
 		unsigned long pgoff)
 {
-	return vb2_get_unmapped_area(&queue->queue, 0, 0, pgoff, 0);
+	struct uvc_streaming *stream = uvc_queue_to_stream(queue);
+	unsigned long ret;
+
+	mutex_lock(&queue->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+	ret = vb2_get_unmapped_area(&queue->queue, 0, 0, pgoff, 0);
+unlock:
+	mutex_unlock(&queue->mutex);
+	return ret;
 }
 #endif
 
 __poll_t uvc_queue_poll(struct uvc_video_queue *queue, struct file *file,
 			    poll_table *wait)
 {
+	struct uvc_streaming *stream = uvc_queue_to_stream(queue);
 	__poll_t ret;
 
 	mutex_lock(&queue->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
 	ret = vb2_poll(&queue->queue, file, wait);
+unlock:
 	mutex_unlock(&queue->mutex);
 
 	return ret;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/5] media: uvcvideo: In uvc_v4l2_open, check if video device is registered
  2020-08-30 15:04 [PATCH 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
                   ` (3 preceding siblings ...)
  2020-08-30 15:04 ` [PATCH 4/5] media: uvcvideo: Protect uvc queue file operations against disconnect Guenter Roeck
@ 2020-08-30 15:04 ` Guenter Roeck
  2020-08-30 15:58 ` [PATCH 0/5] media: uvcvideo: Fix race conditions Laurent Pinchart
  5 siblings, 0 replies; 13+ messages in thread
From: Guenter Roeck @ 2020-08-30 15:04 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck, Alan Stern,
	Hans Verkuil

uvc_v4l2_open() acquires the uvc device mutex. After doing so,
it does not check if the video device is still registered. This may
result in race conditions and can result in an attempt to submit an urb
to a disconnected USB interface (from uvc_v4l2_open).

The problem was observed after adding a call to msleep() just before
acquiring the mutex and disconnecting the camera during the sleep.

Check if the video device is still registered after acquiring the mutex
to avoid the problem. In the release function, only call uvc_status_stop()
if the video device is still registered. If the video device has been
unregistered, the urb associated with uvc status has already been killed
in uvc_status_unregister(). Trying to kill it again won't do any good
and might have unexpected side effects.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/media/usb/uvc/uvc_v4l2.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/media/usb/uvc/uvc_v4l2.c b/drivers/media/usb/uvc/uvc_v4l2.c
index 7e5e583dae5e..1f7d454557b9 100644
--- a/drivers/media/usb/uvc/uvc_v4l2.c
+++ b/drivers/media/usb/uvc/uvc_v4l2.c
@@ -548,6 +548,10 @@ static int uvc_v4l2_open(struct file *file)
 	}
 
 	mutex_lock(&stream->dev->lock);
+	if (!video_is_registered(&stream->vdev)) {
+		mutex_unlock(&stream->dev->lock);
+		return -ENODEV;
+	}
 	if (stream->dev->users == 0) {
 		ret = uvc_status_start(stream->dev, GFP_KERNEL);
 		if (ret < 0) {
@@ -590,7 +594,7 @@ static int uvc_v4l2_release(struct file *file)
 	file->private_data = NULL;
 
 	mutex_lock(&stream->dev->lock);
-	if (--stream->dev->users == 0)
+	if (--stream->dev->users == 0 && video_is_registered(&stream->vdev))
 		uvc_status_stop(stream->dev);
 	mutex_unlock(&stream->dev->lock);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] media: uvcvideo: Fix race conditions
  2020-08-30 15:04 [PATCH 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
                   ` (4 preceding siblings ...)
  2020-08-30 15:04 ` [PATCH 5/5] media: uvcvideo: In uvc_v4l2_open, check if video device is registered Guenter Roeck
@ 2020-08-30 15:58 ` Laurent Pinchart
  2020-08-30 20:48   ` Guenter Roeck
  5 siblings, 1 reply; 13+ messages in thread
From: Laurent Pinchart @ 2020-08-30 15:58 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel

Hi Guenter,

Thank you for the patches.

On Sun, Aug 30, 2020 at 08:04:38AM -0700, Guenter Roeck wrote:
> The uvcvideo code has no lock protection against USB disconnects
> while video operations are ongoing. This has resulted in random
> error reports, typically pointing to a crash in usb_ifnum_to_if(),
> called from usb_hcd_alloc_bandwidth(). A typical traceback is as
> follows.
> 
> usb 1-4: USB disconnect, device number 3
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
> Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
> RIP: 0010:usb_ifnum_to_if+0x29/0x40
> Code: <...>
> RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
> RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
> RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
> R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
> R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
> FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
> Call Trace:
>  usb_hcd_alloc_bandwidth+0x1ee/0x30f
>  usb_set_interface+0x1a3/0x2b7
>  uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
>  uvc_video_start_streaming+0x91/0xdd [uvcvideo]
>  uvc_start_streaming+0x28/0x5d [uvcvideo]
>  vb2_start_streaming+0x61/0x143 [videobuf2_common]
>  vb2_core_streamon+0xf7/0x10f [videobuf2_common]
>  uvc_queue_streamon+0x2e/0x41 [uvcvideo]
>  uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
>  __video_do_ioctl+0x33d/0x42a
>  video_usercopy+0x34e/0x5ff
>  ? video_ioctl2+0x16/0x16
>  v4l2_ioctl+0x46/0x53
>  do_vfs_ioctl+0x50a/0x76f
>  ksys_ioctl+0x58/0x83
>  __x64_sys_ioctl+0x1a/0x1e
>  do_syscall_64+0x54/0xde
> 
> While this is problem rarely observed in the field, it is relatively easy
> to reproduce by adding msleep() calls into the code.
> 
> I don't presume to claim that I found every issue, but this patch series
> should fix at least the major problems.
> 
> The patch series was tested exensively on a Chromebook running chromeos-4.19
> and on a Linux system running a v5.8.y based kernel.

I'll review each patch individually, but I think 2/5, 4/5 and 5/5 should
be handled in the V4L2 core, not the uvcvideo driver. Otherwise we would
have to replicate that logic in all drivers, while I think it can easily
be implemented in a generic fashion as previously discussed.

> ----------------------------------------------------------------
> Guenter Roeck (5):
>       media: uvcvideo: Cancel async worker earlier
>       media: uvcvideo: Lock video streams and queues while unregistering
>       media: uvcvideo: Release stream queue when unregistering video device
>       media: uvcvideo: Protect uvc queue file operations against disconnect
>       media: uvcvideo: In uvc_v4l2_open, check if video device is registered
> 
>  drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
>  drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
>  drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
>  drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
>  drivers/media/usb/uvc/uvcvideo.h   |  1 +
>  5 files changed, 93 insertions(+), 8 deletions(-)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] media: uvcvideo: Fix race conditions
  2020-08-30 15:58 ` [PATCH 0/5] media: uvcvideo: Fix race conditions Laurent Pinchart
@ 2020-08-30 20:48   ` Guenter Roeck
  2020-08-30 21:36     ` Laurent Pinchart
  0 siblings, 1 reply; 13+ messages in thread
From: Guenter Roeck @ 2020-08-30 20:48 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel

Hi Laurent,

On 8/30/20 8:58 AM, Laurent Pinchart wrote:
> Hi Guenter,
> 
> Thank you for the patches.
> 
> On Sun, Aug 30, 2020 at 08:04:38AM -0700, Guenter Roeck wrote:
>> The uvcvideo code has no lock protection against USB disconnects
>> while video operations are ongoing. This has resulted in random
>> error reports, typically pointing to a crash in usb_ifnum_to_if(),
>> called from usb_hcd_alloc_bandwidth(). A typical traceback is as
>> follows.
>>
>> usb 1-4: USB disconnect, device number 3
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
>> PGD 0 P4D 0
>> Oops: 0000 [#1] PREEMPT SMP PTI
>> CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
>> Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
>> RIP: 0010:usb_ifnum_to_if+0x29/0x40
>> Code: <...>
>> RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
>> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
>> RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
>> RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
>> R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
>> R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
>> FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
>> Call Trace:
>>  usb_hcd_alloc_bandwidth+0x1ee/0x30f
>>  usb_set_interface+0x1a3/0x2b7
>>  uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
>>  uvc_video_start_streaming+0x91/0xdd [uvcvideo]
>>  uvc_start_streaming+0x28/0x5d [uvcvideo]
>>  vb2_start_streaming+0x61/0x143 [videobuf2_common]
>>  vb2_core_streamon+0xf7/0x10f [videobuf2_common]
>>  uvc_queue_streamon+0x2e/0x41 [uvcvideo]
>>  uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
>>  __video_do_ioctl+0x33d/0x42a
>>  video_usercopy+0x34e/0x5ff
>>  ? video_ioctl2+0x16/0x16
>>  v4l2_ioctl+0x46/0x53
>>  do_vfs_ioctl+0x50a/0x76f
>>  ksys_ioctl+0x58/0x83
>>  __x64_sys_ioctl+0x1a/0x1e
>>  do_syscall_64+0x54/0xde
>>
>> While this is problem rarely observed in the field, it is relatively easy
>> to reproduce by adding msleep() calls into the code.
>>
>> I don't presume to claim that I found every issue, but this patch series
>> should fix at least the major problems.
>>
>> The patch series was tested exensively on a Chromebook running chromeos-4.19
>> and on a Linux system running a v5.8.y based kernel.
> 
> I'll review each patch individually, but I think 2/5, 4/5 and 5/5 should
> be handled in the V4L2 core, not the uvcvideo driver. Otherwise we would
> have to replicate that logic in all drivers, while I think it can easily
> be implemented in a generic fashion as previously discussed.
> 
The problem is that the v4l2 core already does support locking. There is
a global lock, in struct video_device, a queue lock in struct v4l2_m2m_ctx,
and another queue lock in struct vb2_queue. However, all of those have
to be initialized from the driver. The uvcvideo driver uses its own locks and
does not set the lock pointers in the various generic structures. I was able
to figure out how to use the uvcvideo specific locks in the uvcvideo
driver, but all my attempts to initialize and use the generic locks failed.

It may well be that the generic code isn't entirely clean - for example
I am not sure if the lock protection in v4l2_open() is complete since
it doesn't handle disconnects after checking if the video device is still
registered (and I don't really see the point of the second video_is_registered()
call in v4l2_open). However, that may just be a lack of understanding on my
side on how the code is supposed to work. Maybe the actual device open function
is expected to have its own protection against underlying hardware removal
and video device unregistration while opening the device.

[ Regarding the second call to video_is_registered() in v4l2_open():
  Add msleep(5000) between it and the call to the driver open function,
  disconnect the device during the sleep, and it will happily call the device
  open function on a non-registered video device. That is what patch 5/5 tries
  to fix or the uvcvideo driver.
  The same problem applies to other file operations in v4l2-dev.c: They all
  check if the video device is registered before calling the device
  specific code, but I don't really see the point of doing that because
  there is no protection against unregistration after the check was made
  and before/while the device specific code is running.
  Patch 4/5 tries to fix this for the uvcvideo driver.
  If that is a bug in the v4l2 code, I'll be happy to work on a fix,
  but the only generic fix I could think of would be to utilize the lock in
  struct video_device ... but that lock isn't initialized by the uvcvideo
  driver.
]

Either case, I don't think my understanding of the interaction between
v4l2 and uvcvideo is good enough to make more invasive changes. I _think_
any generic improvement should start with refactoring the uvcvideo code to
use the v4l2 locking mechanism. However, from the exchange here, my
understanding is that this locking mechanism is not used on purpose. That
means we'll have a uvcvideo specific locking mechanism, period, and I don't
think it is even possible to solve the problem without utilizing this locking
mechanism.

Of course, it may as well be that I am completely off track and clueless.
After all, the first time I looked into this code was about two weeks ago.
So please bear with me if I talk nonsense.

Thanks,
Guenter

>> ----------------------------------------------------------------
>> Guenter Roeck (5):
>>       media: uvcvideo: Cancel async worker earlier
>>       media: uvcvideo: Lock video streams and queues while unregistering
>>       media: uvcvideo: Release stream queue when unregistering video device
>>       media: uvcvideo: Protect uvc queue file operations against disconnect
>>       media: uvcvideo: In uvc_v4l2_open, check if video device is registered
>>
>>  drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
>>  drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
>>  drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
>>  drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
>>  drivers/media/usb/uvc/uvcvideo.h   |  1 +
>>  5 files changed, 93 insertions(+), 8 deletions(-)
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] media: uvcvideo: Fix race conditions
  2020-08-30 20:48   ` Guenter Roeck
@ 2020-08-30 21:36     ` Laurent Pinchart
  2020-08-31  0:10       ` Guenter Roeck
  0 siblings, 1 reply; 13+ messages in thread
From: Laurent Pinchart @ 2020-08-30 21:36 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel

Hi Guenter,

On Sun, Aug 30, 2020 at 01:48:24PM -0700, Guenter Roeck wrote:
> On 8/30/20 8:58 AM, Laurent Pinchart wrote:
> > On Sun, Aug 30, 2020 at 08:04:38AM -0700, Guenter Roeck wrote:
> >> The uvcvideo code has no lock protection against USB disconnects
> >> while video operations are ongoing. This has resulted in random
> >> error reports, typically pointing to a crash in usb_ifnum_to_if(),
> >> called from usb_hcd_alloc_bandwidth(). A typical traceback is as
> >> follows.
> >>
> >> usb 1-4: USB disconnect, device number 3
> >> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> >> PGD 0 P4D 0
> >> Oops: 0000 [#1] PREEMPT SMP PTI
> >> CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
> >> Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
> >> RIP: 0010:usb_ifnum_to_if+0x29/0x40
> >> Code: <...>
> >> RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
> >> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
> >> RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
> >> RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
> >> R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
> >> R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
> >> FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
> >> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >> CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
> >> Call Trace:
> >>  usb_hcd_alloc_bandwidth+0x1ee/0x30f
> >>  usb_set_interface+0x1a3/0x2b7
> >>  uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
> >>  uvc_video_start_streaming+0x91/0xdd [uvcvideo]
> >>  uvc_start_streaming+0x28/0x5d [uvcvideo]
> >>  vb2_start_streaming+0x61/0x143 [videobuf2_common]
> >>  vb2_core_streamon+0xf7/0x10f [videobuf2_common]
> >>  uvc_queue_streamon+0x2e/0x41 [uvcvideo]
> >>  uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
> >>  __video_do_ioctl+0x33d/0x42a
> >>  video_usercopy+0x34e/0x5ff
> >>  ? video_ioctl2+0x16/0x16
> >>  v4l2_ioctl+0x46/0x53
> >>  do_vfs_ioctl+0x50a/0x76f
> >>  ksys_ioctl+0x58/0x83
> >>  __x64_sys_ioctl+0x1a/0x1e
> >>  do_syscall_64+0x54/0xde
> >>
> >> While this is problem rarely observed in the field, it is relatively easy
> >> to reproduce by adding msleep() calls into the code.
> >>
> >> I don't presume to claim that I found every issue, but this patch series
> >> should fix at least the major problems.
> >>
> >> The patch series was tested exensively on a Chromebook running chromeos-4.19
> >> and on a Linux system running a v5.8.y based kernel.
> > 
> > I'll review each patch individually, but I think 2/5, 4/5 and 5/5 should
> > be handled in the V4L2 core, not the uvcvideo driver. Otherwise we would
> > have to replicate that logic in all drivers, while I think it can easily
> > be implemented in a generic fashion as previously discussed.
> > 
> The problem is that the v4l2 core already does support locking. There is
> a global lock, in struct video_device, a queue lock in struct v4l2_m2m_ctx,
> and another queue lock in struct vb2_queue. However, all of those have
> to be initialized from the driver. The uvcvideo driver uses its own locks and
> does not set the lock pointers in the various generic structures. I was able
> to figure out how to use the uvcvideo specific locks in the uvcvideo
> driver, but all my attempts to initialize and use the generic locks failed.
> 
> It may well be that the generic code isn't entirely clean - for example
> I am not sure if the lock protection in v4l2_open() is complete since
> it doesn't handle disconnects after checking if the video device is still
> registered (and I don't really see the point of the second video_is_registered()
> call in v4l2_open). However, that may just be a lack of understanding on my
> side on how the code is supposed to work. Maybe the actual device open function
> is expected to have its own protection against underlying hardware removal
> and video device unregistration while opening the device.
> 
> [ Regarding the second call to video_is_registered() in v4l2_open():
>   Add msleep(5000) between it and the call to the driver open function,
>   disconnect the device during the sleep, and it will happily call the device
>   open function on a non-registered video device. That is what patch 5/5 tries
>   to fix or the uvcvideo driver.
>   The same problem applies to other file operations in v4l2-dev.c: They all
>   check if the video device is registered before calling the device
>   specific code, but I don't really see the point of doing that because
>   there is no protection against unregistration after the check was made
>   and before/while the device specific code is running.
>   Patch 4/5 tries to fix this for the uvcvideo driver.
>   If that is a bug in the v4l2 code, I'll be happy to work on a fix,
>   but the only generic fix I could think of would be to utilize the lock in
>   struct video_device ... but that lock isn't initialized by the uvcvideo
>   driver.
> ]
> 
> Either case, I don't think my understanding of the interaction between
> v4l2 and uvcvideo is good enough to make more invasive changes. I _think_
> any generic improvement should start with refactoring the uvcvideo code to
> use the v4l2 locking mechanism. However, from the exchange here, my
> understanding is that this locking mechanism is not used on purpose. That
> means we'll have a uvcvideo specific locking mechanism, period, and I don't
> think it is even possible to solve the problem without utilizing this locking
> mechanism.
> 
> Of course, it may as well be that I am completely off track and clueless.
> After all, the first time I looked into this code was about two weeks ago.
> So please bear with me if I talk nonsense.

It would be rather impolite to claim you're clueless, given that you
managed to write this patch series only two weeks after first looking
into the problem :-)

I'll try to prototype what I envision would be a good solution in the
V4L2 core. If stars align, I may even try to push it one level up, to
the chardev layer. Would you then be able to test it ?

> >> ----------------------------------------------------------------
> >> Guenter Roeck (5):
> >>       media: uvcvideo: Cancel async worker earlier
> >>       media: uvcvideo: Lock video streams and queues while unregistering
> >>       media: uvcvideo: Release stream queue when unregistering video device
> >>       media: uvcvideo: Protect uvc queue file operations against disconnect
> >>       media: uvcvideo: In uvc_v4l2_open, check if video device is registered
> >>
> >>  drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
> >>  drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
> >>  drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
> >>  drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
> >>  drivers/media/usb/uvc/uvcvideo.h   |  1 +
> >>  5 files changed, 93 insertions(+), 8 deletions(-)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] media: uvcvideo: Fix race conditions
  2020-08-30 21:36     ` Laurent Pinchart
@ 2020-08-31  0:10       ` Guenter Roeck
  2020-09-03  3:19         ` Guenter Roeck
  0 siblings, 1 reply; 13+ messages in thread
From: Guenter Roeck @ 2020-08-31  0:10 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel

On Mon, Aug 31, 2020 at 12:36:21AM +0300, Laurent Pinchart wrote:
> Hi Guenter,
> 
[ ... ]

> I'll try to prototype what I envision would be a good solution in the
> V4L2 core. If stars align, I may even try to push it one level up, to
> the chardev layer. Would you then be able to test it ?
> 

Sure, I'll be happy to do that.

I ordered a couple of non-UVC webcams (pwc and gspca) from eBay for
comparison. Both of those use the v4l2 locking mechanism, so we should
be able to see the difference.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/5] media: uvcvideo: Protect uvc queue file operations against disconnect
  2020-08-30 15:04 ` [PATCH 4/5] media: uvcvideo: Protect uvc queue file operations against disconnect Guenter Roeck
@ 2020-09-01 16:51   ` kernel test robot
  2020-09-01 16:58     ` Guenter Roeck
  0 siblings, 1 reply; 13+ messages in thread
From: kernel test robot @ 2020-09-01 16:51 UTC (permalink / raw)
  To: Guenter Roeck, Laurent Pinchart
  Cc: kbuild-all, Mauro Carvalho Chehab, linux-media, Sakari Ailus,
	linux-uvc-devel, linux-usb, linux-kernel, Guenter Roeck,
	Alan Stern, Hans Verkuil

[-- Attachment #1: Type: text/plain, Size: 2323 bytes --]

Hi Guenter,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linuxtv-media/master]
[also build test WARNING on v5.9-rc3 next-20200828]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Guenter-Roeck/media-uvcvideo-Fix-race-conditions/20200830-230715
base:   git://linuxtv.org/media_tree.git master
config: x86_64-randconfig-s031-20200901 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.2-191-g10164920-dirty
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)

>> drivers/media/usb/uvc/uvc_queue.c:402:21: sparse: sparse: incorrect type in assignment (different base types) @@     expected restricted __poll_t [usertype] ret @@     got int @@
>> drivers/media/usb/uvc/uvc_queue.c:402:21: sparse:     expected restricted __poll_t [usertype] ret
>> drivers/media/usb/uvc/uvc_queue.c:402:21: sparse:     got int

# https://github.com/0day-ci/linux/commit/9128827e77bc077a6fefd3ca886a8466e84f7154
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Guenter-Roeck/media-uvcvideo-Fix-race-conditions/20200830-230715
git checkout 9128827e77bc077a6fefd3ca886a8466e84f7154
vim +402 drivers/media/usb/uvc/uvc_queue.c

   393	
   394	__poll_t uvc_queue_poll(struct uvc_video_queue *queue, struct file *file,
   395				    poll_table *wait)
   396	{
   397		struct uvc_streaming *stream = uvc_queue_to_stream(queue);
   398		__poll_t ret;
   399	
   400		mutex_lock(&queue->mutex);
   401		if (!video_is_registered(&stream->vdev)) {
 > 402			ret = -ENODEV;
   403			goto unlock;
   404		}
   405		ret = vb2_poll(&queue->queue, file, wait);
   406	unlock:
   407		mutex_unlock(&queue->mutex);
   408	
   409		return ret;
   410	}
   411	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 42430 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/5] media: uvcvideo: Protect uvc queue file operations against disconnect
  2020-09-01 16:51   ` kernel test robot
@ 2020-09-01 16:58     ` Guenter Roeck
  0 siblings, 0 replies; 13+ messages in thread
From: Guenter Roeck @ 2020-09-01 16:58 UTC (permalink / raw)
  To: kernel test robot, Laurent Pinchart
  Cc: kbuild-all, Mauro Carvalho Chehab, linux-media, Sakari Ailus,
	linux-uvc-devel, linux-usb, linux-kernel, Alan Stern,
	Hans Verkuil

On 9/1/20 9:51 AM, kernel test robot wrote:
> Hi Guenter,
> 
> I love your patch! Perhaps something to improve:
> 
> [auto build test WARNING on linuxtv-media/master]
> [also build test WARNING on v5.9-rc3 next-20200828]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
> 
> url:    https://github.com/0day-ci/linux/commits/Guenter-Roeck/media-uvcvideo-Fix-race-conditions/20200830-230715
> base:   git://linuxtv.org/media_tree.git master
> config: x86_64-randconfig-s031-20200901 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
> reproduce:
>         # apt-get install sparse
>         # sparse version: v0.6.2-191-g10164920-dirty
>         # save the attached .config to linux build tree
>         make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> 
> sparse warnings: (new ones prefixed by >>)
> 
>>> drivers/media/usb/uvc/uvc_queue.c:402:21: sparse: sparse: incorrect type in assignment (different base types) @@     expected restricted __poll_t [usertype] ret @@     got int @@
>>> drivers/media/usb/uvc/uvc_queue.c:402:21: sparse:     expected restricted __poll_t [usertype] ret
>>> drivers/media/usb/uvc/uvc_queue.c:402:21: sparse:     got int
> 
> # https://github.com/0day-ci/linux/commit/9128827e77bc077a6fefd3ca886a8466e84f7154
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review Guenter-Roeck/media-uvcvideo-Fix-race-conditions/20200830-230715
> git checkout 9128827e77bc077a6fefd3ca886a8466e84f7154
> vim +402 drivers/media/usb/uvc/uvc_queue.c
> 
>    393	
>    394	__poll_t uvc_queue_poll(struct uvc_video_queue *queue, struct file *file,
>    395				    poll_table *wait)
>    396	{
>    397		struct uvc_streaming *stream = uvc_queue_to_stream(queue);
>    398		__poll_t ret;
>    399	
>    400		mutex_lock(&queue->mutex);
>    401		if (!video_is_registered(&stream->vdev)) {
>  > 402			ret = -ENODEV;

Looks like this has to return EPOLLERR. I'll make that change in v2.

Guenter

>    403			goto unlock;
>    404		}
>    405		ret = vb2_poll(&queue->queue, file, wait);
>    406	unlock:
>    407		mutex_unlock(&queue->mutex);
>    408	
>    409		return ret;
>    410	}
>    411	
> 
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 0/5] media: uvcvideo: Fix race conditions
  2020-08-31  0:10       ` Guenter Roeck
@ 2020-09-03  3:19         ` Guenter Roeck
  0 siblings, 0 replies; 13+ messages in thread
From: Guenter Roeck @ 2020-09-03  3:19 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel

On Sun, Aug 30, 2020 at 05:10:10PM -0700, Guenter Roeck wrote:
> On Mon, Aug 31, 2020 at 12:36:21AM +0300, Laurent Pinchart wrote:
> > Hi Guenter,
> > 
> [ ... ]
> 
> > I'll try to prototype what I envision would be a good solution in the
> > V4L2 core. If stars align, I may even try to push it one level up, to
> > the chardev layer. Would you then be able to test it ?
> > 
> 
> Sure, I'll be happy to do that.
> 
> I ordered a couple of non-UVC webcams (pwc and gspca) from eBay for
> comparison. Both of those use the v4l2 locking mechanism, so we should
> be able to see the difference.
> 

Turns out gspca webcams (or at least the one I got - Logitech QuickCam for
Notebooks Deluxe) don't have a problem. As mentioned before, the gspca
driver uses the locking mechanism provided by the v4l2/vb2 code. Unlike
uvcvideo, its open function doesn't trigger sending send usb packets.
Its usb disconnect function acquires the lock, and the rest of the code
holds that lock where needed. I''ll keep trying, but so far I can't
get it to do anything wrong.

I am still waiting for the pwc webcam, but that also uses the v4l2/vb2
locking mechanism, and it seems unlikely that I'll get it to fail.
I'll try anyway.

Guenter

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-09-03  3:19 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-30 15:04 [PATCH 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
2020-08-30 15:04 ` [PATCH 1/5] media: uvcvideo: Cancel async worker earlier Guenter Roeck
2020-08-30 15:04 ` [PATCH 2/5] media: uvcvideo: Lock video streams and queues while unregistering Guenter Roeck
2020-08-30 15:04 ` [PATCH 3/5] media: uvcvideo: Release stream queue when unregistering video device Guenter Roeck
2020-08-30 15:04 ` [PATCH 4/5] media: uvcvideo: Protect uvc queue file operations against disconnect Guenter Roeck
2020-09-01 16:51   ` kernel test robot
2020-09-01 16:58     ` Guenter Roeck
2020-08-30 15:04 ` [PATCH 5/5] media: uvcvideo: In uvc_v4l2_open, check if video device is registered Guenter Roeck
2020-08-30 15:58 ` [PATCH 0/5] media: uvcvideo: Fix race conditions Laurent Pinchart
2020-08-30 20:48   ` Guenter Roeck
2020-08-30 21:36     ` Laurent Pinchart
2020-08-31  0:10       ` Guenter Roeck
2020-09-03  3:19         ` Guenter Roeck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).