linux-usb.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions
@ 2020-09-17  2:25 Guenter Roeck
  2020-09-17  2:25 ` [PATCH RESEND v3 1/5] media: uvcvideo: Cancel async worker earlier Guenter Roeck
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Guenter Roeck @ 2020-09-17  2:25 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck

Something seems to have gone wrong with v3 of this patch series.
I am sure I sent it out, but I don't find it anywhere.
Resending. Sorry for any duplicates.

The uvcvideo code has no lock protection against USB disconnects
while video operations are ongoing. This has resulted in random
error reports, typically pointing to a crash in usb_ifnum_to_if(),
called from usb_hcd_alloc_bandwidth(). A typical traceback is as
follows.

usb 1-4: USB disconnect, device number 3
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
RIP: 0010:usb_ifnum_to_if+0x29/0x40
Code: <...>
RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
Call Trace:
 usb_hcd_alloc_bandwidth+0x1ee/0x30f
 usb_set_interface+0x1a3/0x2b7
 uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
 uvc_video_start_streaming+0x91/0xdd [uvcvideo]
 uvc_start_streaming+0x28/0x5d [uvcvideo]
 vb2_start_streaming+0x61/0x143 [videobuf2_common]
 vb2_core_streamon+0xf7/0x10f [videobuf2_common]
 uvc_queue_streamon+0x2e/0x41 [uvcvideo]
 uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
 __video_do_ioctl+0x33d/0x42a
 video_usercopy+0x34e/0x5ff
 ? video_ioctl2+0x16/0x16
 v4l2_ioctl+0x46/0x53
 do_vfs_ioctl+0x50a/0x76f
 ksys_ioctl+0x58/0x83
 __x64_sys_ioctl+0x1a/0x1e
 do_syscall_64+0x54/0xde

While there are not many references to this problem on mailing lists, it is
reported on a regular basis on various Chromebooks (roughly 300 reports
per month). The problem is relatively easy to reproduce by adding msleep()
calls into the code.

I tried to reproduce the problem with non-uvcvideo webcams, but was
unsuccessful. I was unable to get Philips (pwc) webcams to work. gspca
based webcams don't experience the problem, or at least I was unable to
reproduce it (The gspa driver does not trigger sending USB messages in the
open function, and otherwise uses the locking mechanism provided by the
v4l2/vb2 core).

I don't presume to claim that I found every issue, but this patch series
should fix at least the major problems.

The patch series was tested exensively on a Chromebook running chromeos-4.19
and on a Linux system running a v5.8.y based kernel.

v3:
- In patch 5/5, add missing calls to usb_autopm_put_interface() and kfree()
  to failure code path

v2:
- Added details about problem frequency and testing with non-uvc webcams
  to summary
- In patch 4/5, return EPOLLERR instead of -ENODEV on poll errors
- Fix description in patch 5/5

----------------------------------------------------------------
Guenter Roeck (5):
      media: uvcvideo: Cancel async worker earlier
      media: uvcvideo: Lock video streams and queues while unregistering
      media: uvcvideo: Release stream queue when unregistering video device
      media: uvcvideo: Protect uvc queue file operations against disconnect
      media: uvcvideo: Abort uvc_v4l2_open if video device is unregistered

 drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
 drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
 drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
 drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
 drivers/media/usb/uvc/uvcvideo.h   |  1 +
 5 files changed, 93 insertions(+), 8 deletions(-)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH RESEND v3 1/5] media: uvcvideo: Cancel async worker earlier
  2020-09-17  2:25 [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
@ 2020-09-17  2:25 ` Guenter Roeck
  2021-06-17 17:06   ` Ezequiel Garcia
  2020-09-17  2:25 ` [PATCH RESEND v3 2/5] media: uvcvideo: Lock video streams and queues while unregistering Guenter Roeck
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 15+ messages in thread
From: Guenter Roeck @ 2020-09-17  2:25 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck, Alan Stern,
	Hans Verkuil

So far the asynchronous control worker was canceled only in
uvc_ctrl_cleanup_device. This is much later than the call to
uvc_disconnect. However, after the call to uvc_disconnect returns,
there must be no more USB activity. This can result in all kinds
of problems in the USB code. One observed example:

URB ffff993e83d0bc00 submitted while active
WARNING: CPU: 0 PID: 4046 at drivers/usb/core/urb.c:364 usb_submit_urb+0x4ba/0x55e
Modules linked in: <...>
CPU: 0 PID: 4046 Comm: kworker/0:35 Not tainted 4.19.139 #18
Hardware name: Google Phaser/Phaser, BIOS Google_Phaser.10952.0.0 08/09/2018
Workqueue: events uvc_ctrl_status_event_work [uvcvideo]
RIP: 0010:usb_submit_urb+0x4ba/0x55e
Code: <...>
RSP: 0018:ffffb08d471ebde8 EFLAGS: 00010246
RAX: a6da85d923ea5d00 RBX: ffff993e71985928 RCX: 0000000000000000
RDX: ffff993f37a1de90 RSI: ffff993f37a153d0 RDI: ffff993f37a153d0
RBP: ffffb08d471ebe28 R08: 000000000000003b R09: 001424bf85822e96
R10: 0000001000000000 R11: ffffffff975a4398 R12: ffff993e83d0b000
R13: ffff993e83d0bc00 R14: 0000000000000000 R15: 00000000fffffff0
FS:  0000000000000000(0000) GS:ffff993f37a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000ec9c0000 CR3: 000000025b160000 CR4: 0000000000340ef0
Call Trace:
 uvc_ctrl_status_event_work+0xd6/0x107 [uvcvideo]
 process_one_work+0x19b/0x4c5
 worker_thread+0x10d/0x286
 kthread+0x138/0x140
 ? process_one_work+0x4c5/0x4c5
 ? kthread_associate_blkcg+0xc1/0xc1
 ret_from_fork+0x1f/0x40

Introduce new function uvc_ctrl_stop_device() to cancel the worker
and call it from uvc_unregister_video() to solve the problem.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/media/usb/uvc/uvc_ctrl.c   | 11 +++++++----
 drivers/media/usb/uvc/uvc_driver.c |  1 +
 drivers/media/usb/uvc/uvcvideo.h   |  1 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
index e399b9fad757..130c56e0063d 100644
--- a/drivers/media/usb/uvc/uvc_ctrl.c
+++ b/drivers/media/usb/uvc/uvc_ctrl.c
@@ -2340,14 +2340,17 @@ static void uvc_ctrl_cleanup_mappings(struct uvc_device *dev,
 	}
 }
 
-void uvc_ctrl_cleanup_device(struct uvc_device *dev)
+void uvc_ctrl_stop_device(struct uvc_device *dev)
 {
-	struct uvc_entity *entity;
-	unsigned int i;
-
 	/* Can be uninitialized if we are aborting on probe error. */
 	if (dev->async_ctrl.work.func)
 		cancel_work_sync(&dev->async_ctrl.work);
+}
+
+void uvc_ctrl_cleanup_device(struct uvc_device *dev)
+{
+	struct uvc_entity *entity;
+	unsigned int i;
 
 	/* Free controls and control mappings for all entities. */
 	list_for_each_entry(entity, &dev->entities, list) {
diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
index 431d86e1c94b..bfba67a69185 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1952,6 +1952,7 @@ static void uvc_unregister_video(struct uvc_device *dev)
 	}
 
 	uvc_status_unregister(dev);
+	uvc_ctrl_stop_device(dev);
 
 	if (dev->vdev.dev)
 		v4l2_device_unregister(&dev->vdev);
diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
index 6ab972c643e3..543afcd9fd26 100644
--- a/drivers/media/usb/uvc/uvcvideo.h
+++ b/drivers/media/usb/uvc/uvcvideo.h
@@ -830,6 +830,7 @@ int uvc_query_v4l2_menu(struct uvc_video_chain *chain,
 int uvc_ctrl_add_mapping(struct uvc_video_chain *chain,
 			 const struct uvc_control_mapping *mapping);
 int uvc_ctrl_init_device(struct uvc_device *dev);
+void uvc_ctrl_stop_device(struct uvc_device *dev);
 void uvc_ctrl_cleanup_device(struct uvc_device *dev);
 int uvc_ctrl_restore_values(struct uvc_device *dev);
 bool uvc_ctrl_status_event(struct urb *urb, struct uvc_video_chain *chain,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RESEND v3 2/5] media: uvcvideo: Lock video streams and queues while unregistering
  2020-09-17  2:25 [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
  2020-09-17  2:25 ` [PATCH RESEND v3 1/5] media: uvcvideo: Cancel async worker earlier Guenter Roeck
@ 2020-09-17  2:25 ` Guenter Roeck
  2020-09-17  2:25 ` [PATCH RESEND v3 3/5] media: uvcvideo: Release stream queue when unregistering video device Guenter Roeck
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Guenter Roeck @ 2020-09-17  2:25 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck, Alan Stern,
	Hans Verkuil

The call to uvc_disconnect() is not protected by any mutex.
This means it can and will be called while other accesses to the video
device are in progress. This can cause all kinds of race conditions,
including crashes such as the following.

usb 1-4: USB disconnect, device number 3
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
RIP: 0010:usb_ifnum_to_if+0x29/0x40
Code: <...>
RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
Call Trace:
 usb_hcd_alloc_bandwidth+0x1ee/0x30f
 usb_set_interface+0x1a3/0x2b7
 uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
 uvc_video_start_streaming+0x91/0xdd [uvcvideo]
 uvc_start_streaming+0x28/0x5d [uvcvideo]
 vb2_start_streaming+0x61/0x143 [videobuf2_common]
 vb2_core_streamon+0xf7/0x10f [videobuf2_common]
 uvc_queue_streamon+0x2e/0x41 [uvcvideo]
 uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
 __video_do_ioctl+0x33d/0x42a
 video_usercopy+0x34e/0x5ff
 ? video_ioctl2+0x16/0x16
 v4l2_ioctl+0x46/0x53
 do_vfs_ioctl+0x50a/0x76f
 ksys_ioctl+0x58/0x83
 __x64_sys_ioctl+0x1a/0x1e
 do_syscall_64+0x54/0xde

usb_set_interface() should not be called after the USB device has been
unregistered. However, in the above case the disconnect happened after
v4l2_ioctl() was called, but before the call to usb_ifnum_to_if().

Acquire various mutexes in uvc_unregister_video() to fix the majority
(maybe all) of the observed race conditions.

The uvc_device lock prevents races against suspend and resume calls
and the poll function.

The uvc_streaming lock prevents races against stream related functions;
for the most part, those are ioctls. This lock also requires other
functions using this lock to check if a video device is still registered
after acquiring it. For example, it was observed that the video device
was already unregistered by the time the stream lock was acquired in
uvc_ioctl_streamon().

The uvc_queue lock prevents races against queue functions, Most of
those are already protected by the uvc_streaming lock, but some
are called directly. This is done as added protection; an actual race
was not (yet) observed.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/media/usb/uvc/uvc_driver.c |  9 +++++++
 drivers/media/usb/uvc/uvc_v4l2.c   | 39 +++++++++++++++++++++++++++++-
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
index bfba67a69185..a5808305f1e3 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1941,14 +1941,22 @@ static void uvc_unregister_video(struct uvc_device *dev)
 {
 	struct uvc_streaming *stream;
 
+	mutex_lock(&dev->lock);
+
 	list_for_each_entry(stream, &dev->streams, list) {
 		if (!video_is_registered(&stream->vdev))
 			continue;
 
+		mutex_lock(&stream->mutex);
+		mutex_lock(&stream->queue.mutex);
+
 		video_unregister_device(&stream->vdev);
 		video_unregister_device(&stream->meta.vdev);
 
 		uvc_debugfs_cleanup_stream(stream);
+
+		mutex_unlock(&stream->queue.mutex);
+		mutex_unlock(&stream->mutex);
 	}
 
 	uvc_status_unregister(dev);
@@ -1960,6 +1968,7 @@ static void uvc_unregister_video(struct uvc_device *dev)
 	if (media_devnode_is_registered(dev->mdev.devnode))
 		media_device_unregister(&dev->mdev);
 #endif
+	mutex_unlock(&dev->lock);
 }
 
 int uvc_register_video_device(struct uvc_device *dev,
diff --git a/drivers/media/usb/uvc/uvc_v4l2.c b/drivers/media/usb/uvc/uvc_v4l2.c
index 0335e69b70ab..7e5e583dae5e 100644
--- a/drivers/media/usb/uvc/uvc_v4l2.c
+++ b/drivers/media/usb/uvc/uvc_v4l2.c
@@ -237,6 +237,12 @@ static int uvc_v4l2_try_format(struct uvc_streaming *stream,
 	 * the Windows driver).
 	 */
 	mutex_lock(&stream->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		mutex_unlock(&stream->mutex);
+		ret = -ENODEV;
+		goto done;
+	}
+
 	if (stream->dev->quirks & UVC_QUIRK_PROBE_EXTRAFIELDS)
 		probe->dwMaxVideoFrameSize =
 			stream->ctrl.dwMaxVideoFrameSize;
@@ -274,6 +280,12 @@ static int uvc_v4l2_get_format(struct uvc_streaming *stream,
 		return -EINVAL;
 
 	mutex_lock(&stream->mutex);
+
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto done;
+	}
+
 	format = stream->cur_format;
 	frame = stream->cur_frame;
 
@@ -312,6 +324,11 @@ static int uvc_v4l2_set_format(struct uvc_streaming *stream,
 
 	mutex_lock(&stream->mutex);
 
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto done;
+	}
+
 	if (uvc_queue_allocated(&stream->queue)) {
 		ret = -EBUSY;
 		goto done;
@@ -387,6 +404,11 @@ static int uvc_v4l2_set_streamparm(struct uvc_streaming *stream,
 
 	mutex_lock(&stream->mutex);
 
+	if (!video_is_registered(&stream->vdev)) {
+		mutex_unlock(&stream->mutex);
+		return -ENODEV;
+	}
+
 	if (uvc_queue_streaming(&stream->queue)) {
 		mutex_unlock(&stream->mutex);
 		return -EBUSY;
@@ -713,6 +735,10 @@ static int uvc_ioctl_reqbufs(struct file *file, void *fh,
 		return ret;
 
 	mutex_lock(&stream->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		mutex_unlock(&stream->mutex);
+		return -ENODEV;
+	}
 	ret = uvc_request_buffers(&stream->queue, rb);
 	mutex_unlock(&stream->mutex);
 	if (ret < 0)
@@ -797,7 +823,12 @@ static int uvc_ioctl_streamon(struct file *file, void *fh,
 		return -EBUSY;
 
 	mutex_lock(&stream->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
 	ret = uvc_queue_streamon(&stream->queue, type);
+unlock:
 	mutex_unlock(&stream->mutex);
 
 	return ret;
@@ -808,15 +839,21 @@ static int uvc_ioctl_streamoff(struct file *file, void *fh,
 {
 	struct uvc_fh *handle = fh;
 	struct uvc_streaming *stream = handle->stream;
+	int ret = 0;
 
 	if (!uvc_has_privileges(handle))
 		return -EBUSY;
 
 	mutex_lock(&stream->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
 	uvc_queue_streamoff(&stream->queue, type);
+unlock:
 	mutex_unlock(&stream->mutex);
 
-	return 0;
+	return ret;
 }
 
 static int uvc_ioctl_enum_input(struct file *file, void *fh,
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RESEND v3 3/5] media: uvcvideo: Release stream queue when unregistering video device
  2020-09-17  2:25 [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
  2020-09-17  2:25 ` [PATCH RESEND v3 1/5] media: uvcvideo: Cancel async worker earlier Guenter Roeck
  2020-09-17  2:25 ` [PATCH RESEND v3 2/5] media: uvcvideo: Lock video streams and queues while unregistering Guenter Roeck
@ 2020-09-17  2:25 ` Guenter Roeck
  2020-09-17  2:25 ` [PATCH RESEND v3 4/5] media: uvcvideo: Protect uvc queue file operations against disconnect Guenter Roeck
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Guenter Roeck @ 2020-09-17  2:25 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck, Alan Stern,
	Hans Verkuil

The following traceback was observed when stress testing the uvcvideo
driver.

config->interface[0] is NULL
WARNING: CPU: 0 PID: 2757 at drivers/usb/core/usb.c:285 usb_ifnum_to_if+0x81/0x85
^^^ added log message and WARN() to prevent crash
Modules linked in: <...>
CPU: 0 PID: 2757 Comm: inotify_reader Not tainted 4.19.139 #20
Hardware name: Google Phaser/Phaser, BIOS Google_Phaser.10952.0.0 08/09/2018
RIP: 0010:usb_ifnum_to_if+0x81/0x85
Code: <...>
RSP: 0018:ffff9ee20141fa58 EFLAGS: 00010246
RAX: 438a0e5a525f1800 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff975477a1de90 RSI: ffff975477a153d0 RDI: ffff975477a153d0
RBP: ffff9ee20141fa70 R08: 000000000000002c R09: 002daec189138d78
R10: 0000001000000000 R11: ffffffffa7da42e6 R12: ffff975459594800
R13: 0000000000000001 R14: 0000000000000001 R15: ffff975465376400
FS:  00007ddebffd6700(0000) GS:ffff975477a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000012c83000 CR3: 00000001bbaf8000 CR4: 0000000000340ef0
Call Trace:
 usb_set_interface+0x3b/0x2f3
 uvc_video_stop_streaming+0x38/0x81 [uvcvideo]
 uvc_stop_streaming+0x21/0x49 [uvcvideo]
 __vb2_queue_cancel+0x33/0x249 [videobuf2_common]
 vb2_core_queue_release+0x1c/0x45 [videobuf2_common]
 uvc_queue_release+0x26/0x32 [uvcvideo]
 uvc_v4l2_release+0x41/0xdd [uvcvideo]
 v4l2_release+0x99/0xed
 __fput+0xf0/0x1ea
 task_work_run+0x7f/0xa7
 do_exit+0x1d1/0x6eb
 do_group_exit+0x9d/0xac
 get_signal+0x135/0x482
 do_signal+0x4a/0x63c
 exit_to_usermode_loop+0x86/0xa5
 do_syscall_64+0x171/0x269
 ? __do_page_fault+0x272/0x474
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ddec156dc26
Code: Bad RIP value.
RSP: 002b:00007ddebffd5c10 EFLAGS: 00000293 ORIG_RAX: 0000000000000017
RAX: fffffffffffffdfe RBX: 000000000000000a RCX: 00007ddec156dc26
RDX: 0000000000000000 RSI: 00007ddebffd5d28 RDI: 000000000000000a
RBP: 00007ddebffd5c50 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 00007ddebffd5d28
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

When this was observed, a USB disconnect was in progress, uvc_disconnect()
had returned, but usb_disable_device() was still running.
Drivers are not supposed to call any USB functions after the driver
disconnect function has been called. The uvcvideo driver does not follow
that convention and tries to write to the disconnected USB interface
anyway. This results in a variety of race conditions.

To solve this specific problem, release the uvc queue from
uvc_unregister_video() after the associated video devices have been
unregistered. Since the function already holds the uvc queue mutex,
bypass uvc_queue_release() and call vb2_queue_release() directly.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/media/usb/uvc/uvc_driver.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
index a5808305f1e3..27b72806b9b9 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1955,6 +1955,8 @@ static void uvc_unregister_video(struct uvc_device *dev)
 
 		uvc_debugfs_cleanup_stream(stream);
 
+		vb2_queue_release(&stream->queue.queue);
+
 		mutex_unlock(&stream->queue.mutex);
 		mutex_unlock(&stream->mutex);
 	}
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RESEND v3 4/5] media: uvcvideo: Protect uvc queue file operations against disconnect
  2020-09-17  2:25 [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
                   ` (2 preceding siblings ...)
  2020-09-17  2:25 ` [PATCH RESEND v3 3/5] media: uvcvideo: Release stream queue when unregistering video device Guenter Roeck
@ 2020-09-17  2:25 ` Guenter Roeck
  2020-09-17  2:25 ` [PATCH RESEND v3 5/5] media: uvcvideo: Abort uvc_v4l2_open if video device is unregistered Guenter Roeck
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Guenter Roeck @ 2020-09-17  2:25 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck, Alan Stern,
	Hans Verkuil

uvc queue file operations have no mutex protection against USB
disconnect. This is questionable at least for the poll operation,
which has to wait for the uvc queue mutex. By the time that mutex
has been acquired, is it possible that the video device has been
unregistered.

Protect all file operations by using the queue mutex to avoid
possible race conditions. After acquiring the mutex, check if
the video device is still registered, and bail out if not.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/media/usb/uvc/uvc_queue.c | 32 +++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_queue.c b/drivers/media/usb/uvc/uvc_queue.c
index cd60c6c1749e..b2c59ce38008 100644
--- a/drivers/media/usb/uvc/uvc_queue.c
+++ b/drivers/media/usb/uvc/uvc_queue.c
@@ -358,24 +358,52 @@ int uvc_queue_streamoff(struct uvc_video_queue *queue, enum v4l2_buf_type type)
 
 int uvc_queue_mmap(struct uvc_video_queue *queue, struct vm_area_struct *vma)
 {
-	return vb2_mmap(&queue->queue, vma);
+	struct uvc_streaming *stream = uvc_queue_to_stream(queue);
+	int ret;
+
+	mutex_lock(&queue->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+	ret = vb2_mmap(&queue->queue, vma);
+unlock:
+	mutex_unlock(&queue->mutex);
+	return ret;
 }
 
 #ifndef CONFIG_MMU
 unsigned long uvc_queue_get_unmapped_area(struct uvc_video_queue *queue,
 		unsigned long pgoff)
 {
-	return vb2_get_unmapped_area(&queue->queue, 0, 0, pgoff, 0);
+	struct uvc_streaming *stream = uvc_queue_to_stream(queue);
+	unsigned long ret;
+
+	mutex_lock(&queue->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		ret = -ENODEV;
+		goto unlock;
+	}
+	ret = vb2_get_unmapped_area(&queue->queue, 0, 0, pgoff, 0);
+unlock:
+	mutex_unlock(&queue->mutex);
+	return ret;
 }
 #endif
 
 __poll_t uvc_queue_poll(struct uvc_video_queue *queue, struct file *file,
 			    poll_table *wait)
 {
+	struct uvc_streaming *stream = uvc_queue_to_stream(queue);
 	__poll_t ret;
 
 	mutex_lock(&queue->mutex);
+	if (!video_is_registered(&stream->vdev)) {
+		ret = EPOLLERR;
+		goto unlock;
+	}
 	ret = vb2_poll(&queue->queue, file, wait);
+unlock:
 	mutex_unlock(&queue->mutex);
 
 	return ret;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH RESEND v3 5/5] media: uvcvideo: Abort uvc_v4l2_open if video device is unregistered
  2020-09-17  2:25 [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
                   ` (3 preceding siblings ...)
  2020-09-17  2:25 ` [PATCH RESEND v3 4/5] media: uvcvideo: Protect uvc queue file operations against disconnect Guenter Roeck
@ 2020-09-17  2:25 ` Guenter Roeck
  2020-09-17 12:47 ` [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Laurent Pinchart
  2022-03-11 20:24 ` Michael Grzeschik
  6 siblings, 0 replies; 15+ messages in thread
From: Guenter Roeck @ 2020-09-17  2:25 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel, Guenter Roeck, Alan Stern,
	Hans Verkuil

uvc_v4l2_open() acquires the uvc device mutex. After doing so,
it does not check if the video device is still registered. This may
result in race conditions and can result in an attempt to submit an urb
to a disconnected USB interface (from uvc_status_start).

The problem was observed after adding a call to msleep() just before
acquiring the mutex and disconnecting the camera during the sleep.

Check if the video device is still registered after acquiring the mutex
to avoid the problem. In the release function, only call uvc_status_stop()
if the video device is still registered. If the video device has been
unregistered, the urb associated with uvc status has already been killed
in uvc_status_unregister(). Trying to kill it again won't do any good
and might have unexpected side effects.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/media/usb/uvc/uvc_v4l2.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/media/usb/uvc/uvc_v4l2.c b/drivers/media/usb/uvc/uvc_v4l2.c
index 7e5e583dae5e..8073eae5d879 100644
--- a/drivers/media/usb/uvc/uvc_v4l2.c
+++ b/drivers/media/usb/uvc/uvc_v4l2.c
@@ -548,6 +548,12 @@ static int uvc_v4l2_open(struct file *file)
 	}
 
 	mutex_lock(&stream->dev->lock);
+	if (!video_is_registered(&stream->vdev)) {
+		mutex_unlock(&stream->dev->lock);
+		usb_autopm_put_interface(stream->dev->intf);
+		kfree(handle);
+		return -ENODEV;
+	}
 	if (stream->dev->users == 0) {
 		ret = uvc_status_start(stream->dev, GFP_KERNEL);
 		if (ret < 0) {
@@ -590,7 +596,7 @@ static int uvc_v4l2_release(struct file *file)
 	file->private_data = NULL;
 
 	mutex_lock(&stream->dev->lock);
-	if (--stream->dev->users == 0)
+	if (--stream->dev->users == 0 && video_is_registered(&stream->vdev))
 		uvc_status_stop(stream->dev);
 	mutex_unlock(&stream->dev->lock);
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions
  2020-09-17  2:25 [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
                   ` (4 preceding siblings ...)
  2020-09-17  2:25 ` [PATCH RESEND v3 5/5] media: uvcvideo: Abort uvc_v4l2_open if video device is unregistered Guenter Roeck
@ 2020-09-17 12:47 ` Laurent Pinchart
  2020-09-18  2:16   ` Guenter Roeck
  2022-03-11 20:24 ` Michael Grzeschik
  6 siblings, 1 reply; 15+ messages in thread
From: Laurent Pinchart @ 2020-09-17 12:47 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel

Hi Guenter,

On Wed, Sep 16, 2020 at 07:25:42PM -0700, Guenter Roeck wrote:
> Something seems to have gone wrong with v3 of this patch series.
> I am sure I sent it out, but I don't find it anywhere.
> Resending. Sorry for any duplicates.

I haven't checked the mailing list, but I've found it in my inbox :-)
I'm not forgetting about you, just been fairly busy recently. I still
plan to try and provide an alternative implementation in the V4L2 core
(in a form that I think should even be moved to the cdev core) that
would fix this for all drivers.

By the way, as you managed to get hold of non-UVC webcams, one thing you
could try in your tests to make the drivers misbehave is to block on a
DQBUF call, and unplug the device at that time. When blocking, DQBUF
releases the driver lock (through the vb2ops .wait_prepare() and
.wait_finis() operations for drivers based on vb2), so this may allow
unregistration to proceed without waiting for userspace calls to
complete.

> The uvcvideo code has no lock protection against USB disconnects
> while video operations are ongoing. This has resulted in random
> error reports, typically pointing to a crash in usb_ifnum_to_if(),
> called from usb_hcd_alloc_bandwidth(). A typical traceback is as
> follows.
> 
> usb 1-4: USB disconnect, device number 3
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> PGD 0 P4D 0
> Oops: 0000 [#1] PREEMPT SMP PTI
> CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
> Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
> RIP: 0010:usb_ifnum_to_if+0x29/0x40
> Code: <...>
> RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
> RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
> RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
> R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
> R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
> FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
> Call Trace:
>  usb_hcd_alloc_bandwidth+0x1ee/0x30f
>  usb_set_interface+0x1a3/0x2b7
>  uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
>  uvc_video_start_streaming+0x91/0xdd [uvcvideo]
>  uvc_start_streaming+0x28/0x5d [uvcvideo]
>  vb2_start_streaming+0x61/0x143 [videobuf2_common]
>  vb2_core_streamon+0xf7/0x10f [videobuf2_common]
>  uvc_queue_streamon+0x2e/0x41 [uvcvideo]
>  uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
>  __video_do_ioctl+0x33d/0x42a
>  video_usercopy+0x34e/0x5ff
>  ? video_ioctl2+0x16/0x16
>  v4l2_ioctl+0x46/0x53
>  do_vfs_ioctl+0x50a/0x76f
>  ksys_ioctl+0x58/0x83
>  __x64_sys_ioctl+0x1a/0x1e
>  do_syscall_64+0x54/0xde
> 
> While there are not many references to this problem on mailing lists, it is
> reported on a regular basis on various Chromebooks (roughly 300 reports
> per month). The problem is relatively easy to reproduce by adding msleep()
> calls into the code.
> 
> I tried to reproduce the problem with non-uvcvideo webcams, but was
> unsuccessful. I was unable to get Philips (pwc) webcams to work. gspca
> based webcams don't experience the problem, or at least I was unable to
> reproduce it (The gspa driver does not trigger sending USB messages in the
> open function, and otherwise uses the locking mechanism provided by the
> v4l2/vb2 core).
> 
> I don't presume to claim that I found every issue, but this patch series
> should fix at least the major problems.
> 
> The patch series was tested exensively on a Chromebook running chromeos-4.19
> and on a Linux system running a v5.8.y based kernel.
> 
> v3:
> - In patch 5/5, add missing calls to usb_autopm_put_interface() and kfree()
>   to failure code path
> 
> v2:
> - Added details about problem frequency and testing with non-uvc webcams
>   to summary
> - In patch 4/5, return EPOLLERR instead of -ENODEV on poll errors
> - Fix description in patch 5/5
> 
> ----------------------------------------------------------------
> Guenter Roeck (5):
>       media: uvcvideo: Cancel async worker earlier
>       media: uvcvideo: Lock video streams and queues while unregistering
>       media: uvcvideo: Release stream queue when unregistering video device
>       media: uvcvideo: Protect uvc queue file operations against disconnect
>       media: uvcvideo: Abort uvc_v4l2_open if video device is unregistered
> 
>  drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
>  drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
>  drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
>  drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
>  drivers/media/usb/uvc/uvcvideo.h   |  1 +
>  5 files changed, 93 insertions(+), 8 deletions(-)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions
  2020-09-17 12:47 ` [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Laurent Pinchart
@ 2020-09-18  2:16   ` Guenter Roeck
  2021-03-12  7:36     ` Dominique MARTINET
  0 siblings, 1 reply; 15+ messages in thread
From: Guenter Roeck @ 2020-09-18  2:16 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, linux-kernel

Hi Laurent,

On 9/17/20 5:47 AM, Laurent Pinchart wrote:
> Hi Guenter,
> 
> On Wed, Sep 16, 2020 at 07:25:42PM -0700, Guenter Roeck wrote:
>> Something seems to have gone wrong with v3 of this patch series.
>> I am sure I sent it out, but I don't find it anywhere.
>> Resending. Sorry for any duplicates.
> 
> I haven't checked the mailing list, but I've found it in my inbox :-)
> I'm not forgetting about you, just been fairly busy recently. I still
> plan to try and provide an alternative implementation in the V4L2 core
> (in a form that I think should even be moved to the cdev core) that
> would fix this for all drivers.
> 
Thanks for letting me know. As it turns out, this problem is responsible
for about 2% of all Chromebook crashes, so I'll probably not wait for
the series to be accepted upstream but apply it as-is to the various
ChromeOS kernel branches.

> By the way, as you managed to get hold of non-UVC webcams, one thing you
> could try in your tests to make the drivers misbehave is to block on a
> DQBUF call, and unplug the device at that time. When blocking, DQBUF
> releases the driver lock (through the vb2ops .wait_prepare() and
> .wait_finis() operations for drivers based on vb2), so this may allow
> unregistration to proceed without waiting for userspace calls to
> complete.
> 

Good idea. I'll give it a try.

Thanks,
Guenter

>> The uvcvideo code has no lock protection against USB disconnects
>> while video operations are ongoing. This has resulted in random
>> error reports, typically pointing to a crash in usb_ifnum_to_if(),
>> called from usb_hcd_alloc_bandwidth(). A typical traceback is as
>> follows.
>>
>> usb 1-4: USB disconnect, device number 3
>> BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
>> PGD 0 P4D 0
>> Oops: 0000 [#1] PREEMPT SMP PTI
>> CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
>> Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
>> RIP: 0010:usb_ifnum_to_if+0x29/0x40
>> Code: <...>
>> RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
>> RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
>> RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
>> RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
>> R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
>> R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
>> FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
>> Call Trace:
>>  usb_hcd_alloc_bandwidth+0x1ee/0x30f
>>  usb_set_interface+0x1a3/0x2b7
>>  uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
>>  uvc_video_start_streaming+0x91/0xdd [uvcvideo]
>>  uvc_start_streaming+0x28/0x5d [uvcvideo]
>>  vb2_start_streaming+0x61/0x143 [videobuf2_common]
>>  vb2_core_streamon+0xf7/0x10f [videobuf2_common]
>>  uvc_queue_streamon+0x2e/0x41 [uvcvideo]
>>  uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
>>  __video_do_ioctl+0x33d/0x42a
>>  video_usercopy+0x34e/0x5ff
>>  ? video_ioctl2+0x16/0x16
>>  v4l2_ioctl+0x46/0x53
>>  do_vfs_ioctl+0x50a/0x76f
>>  ksys_ioctl+0x58/0x83
>>  __x64_sys_ioctl+0x1a/0x1e
>>  do_syscall_64+0x54/0xde
>>
>> While there are not many references to this problem on mailing lists, it is
>> reported on a regular basis on various Chromebooks (roughly 300 reports
>> per month). The problem is relatively easy to reproduce by adding msleep()
>> calls into the code.
>>
>> I tried to reproduce the problem with non-uvcvideo webcams, but was
>> unsuccessful. I was unable to get Philips (pwc) webcams to work. gspca
>> based webcams don't experience the problem, or at least I was unable to
>> reproduce it (The gspa driver does not trigger sending USB messages in the
>> open function, and otherwise uses the locking mechanism provided by the
>> v4l2/vb2 core).
>>
>> I don't presume to claim that I found every issue, but this patch series
>> should fix at least the major problems.
>>
>> The patch series was tested exensively on a Chromebook running chromeos-4.19
>> and on a Linux system running a v5.8.y based kernel.
>>
>> v3:
>> - In patch 5/5, add missing calls to usb_autopm_put_interface() and kfree()
>>   to failure code path
>>
>> v2:
>> - Added details about problem frequency and testing with non-uvc webcams
>>   to summary
>> - In patch 4/5, return EPOLLERR instead of -ENODEV on poll errors
>> - Fix description in patch 5/5
>>
>> ----------------------------------------------------------------
>> Guenter Roeck (5):
>>       media: uvcvideo: Cancel async worker earlier
>>       media: uvcvideo: Lock video streams and queues while unregistering
>>       media: uvcvideo: Release stream queue when unregistering video device
>>       media: uvcvideo: Protect uvc queue file operations against disconnect
>>       media: uvcvideo: Abort uvc_v4l2_open if video device is unregistered
>>
>>  drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
>>  drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
>>  drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
>>  drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
>>  drivers/media/usb/uvc/uvcvideo.h   |  1 +
>>  5 files changed, 93 insertions(+), 8 deletions(-)
> 


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions
  2020-09-18  2:16   ` Guenter Roeck
@ 2021-03-12  7:36     ` Dominique MARTINET
  2021-03-12 14:54       ` Guenter Roeck
  0 siblings, 1 reply; 15+ messages in thread
From: Dominique MARTINET @ 2021-03-12  7:36 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Laurent Pinchart, Mauro Carvalho Chehab, Sakari Ailus,
	linux-uvc-devel, linux-usb, linux-media, linux-kernel

Hi,

Guenter Roeck wrote on Thu, Sep 17, 2020 at 07:16:17PM -0700:
> On 9/17/20 5:47 AM, Laurent Pinchart wrote:
> > On Wed, Sep 16, 2020 at 07:25:42PM -0700, Guenter Roeck wrote:
> >> Something seems to have gone wrong with v3 of this patch series.
> >> I am sure I sent it out, but I don't find it anywhere.
> >> Resending. Sorry for any duplicates.
> > 
> > I haven't checked the mailing list, but I've found it in my inbox :-)
> > I'm not forgetting about you, just been fairly busy recently. I still
> > plan to try and provide an alternative implementation in the V4L2 core
> > (in a form that I think should even be moved to the cdev core) that
> > would fix this for all drivers.
> > 
> Thanks for letting me know. As it turns out, this problem is responsible
> for about 2% of all Chromebook crashes, so I'll probably not wait for
> the series to be accepted upstream but apply it as-is to the various
> ChromeOS kernel branches.

We have a customer who reported the same issue recently, has there been
any development?

I don't see anything in either uvc nor v4l2 that would address the race
since this mail half a year ago (well, I could have missed it ;))


If nothing happened I'll probably backport this series as well, at which
point it might make more sense to take it in until a better fix gets
here then revert it...


Thanks!
-- 
Dominique

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions
  2021-03-12  7:36     ` Dominique MARTINET
@ 2021-03-12 14:54       ` Guenter Roeck
  2021-06-09  4:12         ` Dominique MARTINET
  0 siblings, 1 reply; 15+ messages in thread
From: Guenter Roeck @ 2021-03-12 14:54 UTC (permalink / raw)
  To: Dominique MARTINET
  Cc: Laurent Pinchart, Mauro Carvalho Chehab, Sakari Ailus,
	linux-uvc-devel, linux-usb, linux-media, linux-kernel

On 3/11/21 11:36 PM, Dominique MARTINET wrote:
> Hi,
> 
> Guenter Roeck wrote on Thu, Sep 17, 2020 at 07:16:17PM -0700:
>> On 9/17/20 5:47 AM, Laurent Pinchart wrote:
>>> On Wed, Sep 16, 2020 at 07:25:42PM -0700, Guenter Roeck wrote:
>>>> Something seems to have gone wrong with v3 of this patch series.
>>>> I am sure I sent it out, but I don't find it anywhere.
>>>> Resending. Sorry for any duplicates.
>>>
>>> I haven't checked the mailing list, but I've found it in my inbox :-)
>>> I'm not forgetting about you, just been fairly busy recently. I still
>>> plan to try and provide an alternative implementation in the V4L2 core
>>> (in a form that I think should even be moved to the cdev core) that
>>> would fix this for all drivers.
>>>
>> Thanks for letting me know. As it turns out, this problem is responsible
>> for about 2% of all Chromebook crashes, so I'll probably not wait for
>> the series to be accepted upstream but apply it as-is to the various
>> ChromeOS kernel branches.
> 
> We have a customer who reported the same issue recently, has there been
> any development?
> 

Not that I know of. We applied the series to all Chrome OS kernel branches,
and it reliably fixes the problem for us. We'd like to have the problem
fixed upstream; until that happens we'll have to carry the series forward.

> I don't see anything in either uvc nor v4l2 that would address the race
> since this mail half a year ago (well, I could have missed it ;))
> 

The problem still exists in the upstream kernel.

Guenter

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions
  2021-03-12 14:54       ` Guenter Roeck
@ 2021-06-09  4:12         ` Dominique MARTINET
  0 siblings, 0 replies; 15+ messages in thread
From: Dominique MARTINET @ 2021-06-09  4:12 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Laurent Pinchart, Mauro Carvalho Chehab, Sakari Ailus,
	linux-uvc-devel, linux-usb, linux-media, linux-kernel

Guenter Roeck wrote on Fri, Mar 12, 2021 at 06:54:52AM -0800:
> On 3/11/21 11:36 PM, Dominique MARTINET wrote:
> > Guenter Roeck wrote on Thu, Sep 17, 2020 at 07:16:17PM -0700:
> >> On 9/17/20 5:47 AM, Laurent Pinchart wrote:
> >>> I haven't checked the mailing list, but I've found it in my inbox :-)
> >>> I'm not forgetting about you, just been fairly busy recently. I still
> >>> plan to try and provide an alternative implementation in the V4L2 core
> >>> (in a form that I think should even be moved to the cdev core) that
> >>> would fix this for all drivers.
> >>>
> >> Thanks for letting me know. As it turns out, this problem is responsible
> >> for about 2% of all Chromebook crashes, so I'll probably not wait for
> >> the series to be accepted upstream but apply it as-is to the various
> >> ChromeOS kernel branches.
> > 
> > We have a customer who reported the same issue recently, has there been
> > any development?
> > 
> 
> Not that I know of. We applied the series to all Chrome OS kernel branches,
> and it reliably fixes the problem for us. We'd like to have the problem
> fixed upstream; until that happens we'll have to carry the series forward.

Thanks for confirming.
Laurent, would it make sense to take the patches as they are until a
better fix is ready?

-- 
Dominique

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v3 1/5] media: uvcvideo: Cancel async worker earlier
  2020-09-17  2:25 ` [PATCH RESEND v3 1/5] media: uvcvideo: Cancel async worker earlier Guenter Roeck
@ 2021-06-17 17:06   ` Ezequiel Garcia
  0 siblings, 0 replies; 15+ messages in thread
From: Ezequiel Garcia @ 2021-06-17 17:06 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Sakari Ailus, linux-uvc-devel, linux-usb,
	linux-media, Linux Kernel Mailing List, Guenter Roeck,
	Alan Stern, Hans Verkuil

Hi Laurent,

On Wed, 16 Sept 2020 at 23:33, Guenter Roeck <linux@roeck-us.net> wrote:
>
> So far the asynchronous control worker was canceled only in
> uvc_ctrl_cleanup_device. This is much later than the call to
> uvc_disconnect. However, after the call to uvc_disconnect returns,
> there must be no more USB activity. This can result in all kinds
> of problems in the USB code. One observed example:
>
> URB ffff993e83d0bc00 submitted while active
> WARNING: CPU: 0 PID: 4046 at drivers/usb/core/urb.c:364 usb_submit_urb+0x4ba/0x55e
> Modules linked in: <...>
> CPU: 0 PID: 4046 Comm: kworker/0:35 Not tainted 4.19.139 #18
> Hardware name: Google Phaser/Phaser, BIOS Google_Phaser.10952.0.0 08/09/2018
> Workqueue: events uvc_ctrl_status_event_work [uvcvideo]
> RIP: 0010:usb_submit_urb+0x4ba/0x55e
> Code: <...>
> RSP: 0018:ffffb08d471ebde8 EFLAGS: 00010246
> RAX: a6da85d923ea5d00 RBX: ffff993e71985928 RCX: 0000000000000000
> RDX: ffff993f37a1de90 RSI: ffff993f37a153d0 RDI: ffff993f37a153d0
> RBP: ffffb08d471ebe28 R08: 000000000000003b R09: 001424bf85822e96
> R10: 0000001000000000 R11: ffffffff975a4398 R12: ffff993e83d0b000
> R13: ffff993e83d0bc00 R14: 0000000000000000 R15: 00000000fffffff0
> FS:  0000000000000000(0000) GS:ffff993f37a00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000ec9c0000 CR3: 000000025b160000 CR4: 0000000000340ef0
> Call Trace:
>  uvc_ctrl_status_event_work+0xd6/0x107 [uvcvideo]
>  process_one_work+0x19b/0x4c5
>  worker_thread+0x10d/0x286
>  kthread+0x138/0x140
>  ? process_one_work+0x4c5/0x4c5
>  ? kthread_associate_blkcg+0xc1/0xc1
>  ret_from_fork+0x1f/0x40
>
> Introduce new function uvc_ctrl_stop_device() to cancel the worker
> and call it from uvc_unregister_video() to solve the problem.
>
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> Cc: Alan Stern <stern@rowland.harvard.edu>
> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>

If I understand correctly, this patch is correct and can be merged
independently of the rest in this series.

I think it could also be fixed with a check in uvc_ctrl_status_event_work,
to prevent pending workers from accessing the device, which is now disconnected.
Something like this (untested):

@@ -1325,6 +1325,10 @@ static void uvc_ctrl_status_event_work(struct
work_struct *work)

        uvc_ctrl_status_event(w->chain, w->ctrl, w->data);

+       /* Don't submit URBs if the device was disconnected */
+       if (!usb_get_intfdata(dev->intf))
+               return;
+

In any case, it'd be nice to fix this case now, and pospone taking care
of the other race conditions in the core, as you suggested in the cover letter.

Thanks!
Ezequiel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions
  2020-09-17  2:25 [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
                   ` (5 preceding siblings ...)
  2020-09-17 12:47 ` [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Laurent Pinchart
@ 2022-03-11 20:24 ` Michael Grzeschik
  2022-03-12 18:43   ` Laurent Pinchart
  6 siblings, 1 reply; 15+ messages in thread
From: Michael Grzeschik @ 2022-03-11 20:24 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Laurent Pinchart, Mauro Carvalho Chehab, Sakari Ailus,
	linux-uvc-devel, linux-usb, linux-media, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4776 bytes --]

Ping!

This series seems to be hanging around. It would be nice to get these
patches upstream, as they help my uvc-gadget workflow. Without them it
is likely that in the development cases my gadget won't start and then
leave the whole xhci controller broken.

@Laurent, what do you think?

Regards,
Michael


On Wed, Sep 16, 2020 at 07:25:42PM -0700, Guenter Roeck wrote:
>Something seems to have gone wrong with v3 of this patch series.
>I am sure I sent it out, but I don't find it anywhere.
>Resending. Sorry for any duplicates.
>
>The uvcvideo code has no lock protection against USB disconnects
>while video operations are ongoing. This has resulted in random
>error reports, typically pointing to a crash in usb_ifnum_to_if(),
>called from usb_hcd_alloc_bandwidth(). A typical traceback is as
>follows.
>
>usb 1-4: USB disconnect, device number 3
>BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
>PGD 0 P4D 0
>Oops: 0000 [#1] PREEMPT SMP PTI
>CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
>Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
>RIP: 0010:usb_ifnum_to_if+0x29/0x40
>Code: <...>
>RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
>RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
>RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
>RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
>R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
>R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
>FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
>CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
>Call Trace:
> usb_hcd_alloc_bandwidth+0x1ee/0x30f
> usb_set_interface+0x1a3/0x2b7
> uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
> uvc_video_start_streaming+0x91/0xdd [uvcvideo]
> uvc_start_streaming+0x28/0x5d [uvcvideo]
> vb2_start_streaming+0x61/0x143 [videobuf2_common]
> vb2_core_streamon+0xf7/0x10f [videobuf2_common]
> uvc_queue_streamon+0x2e/0x41 [uvcvideo]
> uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
> __video_do_ioctl+0x33d/0x42a
> video_usercopy+0x34e/0x5ff
> ? video_ioctl2+0x16/0x16
> v4l2_ioctl+0x46/0x53
> do_vfs_ioctl+0x50a/0x76f
> ksys_ioctl+0x58/0x83
> __x64_sys_ioctl+0x1a/0x1e
> do_syscall_64+0x54/0xde
>
>While there are not many references to this problem on mailing lists, it is
>reported on a regular basis on various Chromebooks (roughly 300 reports
>per month). The problem is relatively easy to reproduce by adding msleep()
>calls into the code.
>
>I tried to reproduce the problem with non-uvcvideo webcams, but was
>unsuccessful. I was unable to get Philips (pwc) webcams to work. gspca
>based webcams don't experience the problem, or at least I was unable to
>reproduce it (The gspa driver does not trigger sending USB messages in the
>open function, and otherwise uses the locking mechanism provided by the
>v4l2/vb2 core).
>
>I don't presume to claim that I found every issue, but this patch series
>should fix at least the major problems.
>
>The patch series was tested exensively on a Chromebook running chromeos-4.19
>and on a Linux system running a v5.8.y based kernel.
>
>v3:
>- In patch 5/5, add missing calls to usb_autopm_put_interface() and kfree()
>  to failure code path
>
>v2:
>- Added details about problem frequency and testing with non-uvc webcams
>  to summary
>- In patch 4/5, return EPOLLERR instead of -ENODEV on poll errors
>- Fix description in patch 5/5
>
>----------------------------------------------------------------
>Guenter Roeck (5):
>      media: uvcvideo: Cancel async worker earlier
>      media: uvcvideo: Lock video streams and queues while unregistering
>      media: uvcvideo: Release stream queue when unregistering video device
>      media: uvcvideo: Protect uvc queue file operations against disconnect
>      media: uvcvideo: Abort uvc_v4l2_open if video device is unregistered
>
> drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
> drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
> drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
> drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
> drivers/media/usb/uvc/uvcvideo.h   |  1 +
> 5 files changed, 93 insertions(+), 8 deletions(-)

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions
  2022-03-11 20:24 ` Michael Grzeschik
@ 2022-03-12 18:43   ` Laurent Pinchart
  2022-08-23  8:37     ` Lukasz Majczak
  0 siblings, 1 reply; 15+ messages in thread
From: Laurent Pinchart @ 2022-03-12 18:43 UTC (permalink / raw)
  To: Michael Grzeschik
  Cc: Guenter Roeck, Mauro Carvalho Chehab, Sakari Ailus,
	linux-uvc-devel, linux-usb, linux-media, linux-kernel

Hi Michael,

On Fri, Mar 11, 2022 at 09:24:26PM +0100, Michael Grzeschik wrote:
> Ping!
> 
> This series seems to be hanging around. It would be nice to get these
> patches upstream, as they help my uvc-gadget workflow. Without them it
> is likely that in the development cases my gadget won't start and then
> leave the whole xhci controller broken.
> 
> @Laurent, what do you think?

I think I've explained before how this should be fixed at the V4L2
level. The problem actually affects character devices globally, and Greg
KH said he would have a go at fixing it there, but I don't think much
happened. Starting with a V4L2-level fix is fine with me.

There are a few patches in the series that are specific to uvcvideo,
I'll have another look and merge those.

> On Wed, Sep 16, 2020 at 07:25:42PM -0700, Guenter Roeck wrote:
> > Something seems to have gone wrong with v3 of this patch series.
> > I am sure I sent it out, but I don't find it anywhere.
> > Resending. Sorry for any duplicates.
> > 
> > The uvcvideo code has no lock protection against USB disconnects
> > while video operations are ongoing. This has resulted in random
> > error reports, typically pointing to a crash in usb_ifnum_to_if(),
> > called from usb_hcd_alloc_bandwidth(). A typical traceback is as
> > follows.
> > 
> > usb 1-4: USB disconnect, device number 3
> > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> > PGD 0 P4D 0
> > Oops: 0000 [#1] PREEMPT SMP PTI
> > CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
> > Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
> > RIP: 0010:usb_ifnum_to_if+0x29/0x40
> > Code: <...>
> > RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
> > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
> > RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
> > RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
> > R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
> > R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
> > FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
> > Call Trace:
> >  usb_hcd_alloc_bandwidth+0x1ee/0x30f
> >  usb_set_interface+0x1a3/0x2b7
> >  uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
> >  uvc_video_start_streaming+0x91/0xdd [uvcvideo]
> >  uvc_start_streaming+0x28/0x5d [uvcvideo]
> >  vb2_start_streaming+0x61/0x143 [videobuf2_common]
> >  vb2_core_streamon+0xf7/0x10f [videobuf2_common]
> >  uvc_queue_streamon+0x2e/0x41 [uvcvideo]
> >  uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
> >  __video_do_ioctl+0x33d/0x42a
> >  video_usercopy+0x34e/0x5ff
> >  ? video_ioctl2+0x16/0x16
> >  v4l2_ioctl+0x46/0x53
> >  do_vfs_ioctl+0x50a/0x76f
> >  ksys_ioctl+0x58/0x83
> >  __x64_sys_ioctl+0x1a/0x1e
> >  do_syscall_64+0x54/0xde
> > 
> > While there are not many references to this problem on mailing lists, it is
> > reported on a regular basis on various Chromebooks (roughly 300 reports
> > per month). The problem is relatively easy to reproduce by adding msleep()
> > calls into the code.
> > 
> > I tried to reproduce the problem with non-uvcvideo webcams, but was
> > unsuccessful. I was unable to get Philips (pwc) webcams to work. gspca
> > based webcams don't experience the problem, or at least I was unable to
> > reproduce it (The gspa driver does not trigger sending USB messages in the
> > open function, and otherwise uses the locking mechanism provided by the
> > v4l2/vb2 core).
> > 
> > I don't presume to claim that I found every issue, but this patch series
> > should fix at least the major problems.
> > 
> > The patch series was tested exensively on a Chromebook running chromeos-4.19
> > and on a Linux system running a v5.8.y based kernel.
> > 
> > v3:
> > - In patch 5/5, add missing calls to usb_autopm_put_interface() and kfree()
> >   to failure code path
> > 
> > v2:
> > - Added details about problem frequency and testing with non-uvc webcams
> >   to summary
> > - In patch 4/5, return EPOLLERR instead of -ENODEV on poll errors
> > - Fix description in patch 5/5
> > 
> > ----------------------------------------------------------------
> > Guenter Roeck (5):
> >       media: uvcvideo: Cancel async worker earlier
> >       media: uvcvideo: Lock video streams and queues while unregistering
> >       media: uvcvideo: Release stream queue when unregistering video device
> >       media: uvcvideo: Protect uvc queue file operations against disconnect
> >       media: uvcvideo: Abort uvc_v4l2_open if video device is unregistered
> > 
> >  drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
> >  drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
> >  drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
> >  drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
> >  drivers/media/usb/uvc/uvcvideo.h   |  1 +
> >  5 files changed, 93 insertions(+), 8 deletions(-)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions
  2022-03-12 18:43   ` Laurent Pinchart
@ 2022-08-23  8:37     ` Lukasz Majczak
  0 siblings, 0 replies; 15+ messages in thread
From: Lukasz Majczak @ 2022-08-23  8:37 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Michael Grzeschik, Guenter Roeck, Mauro Carvalho Chehab,
	Sakari Ailus, linux-uvc-devel, linux-usb, linux-media, LKML

sob., 12 mar 2022 o 19:43 Laurent Pinchart
<laurent.pinchart@ideasonboard.com> napisał(a):
>
> Hi Michael,
>
> On Fri, Mar 11, 2022 at 09:24:26PM +0100, Michael Grzeschik wrote:
> > Ping!
> >
> > This series seems to be hanging around. It would be nice to get these
> > patches upstream, as they help my uvc-gadget workflow. Without them it
> > is likely that in the development cases my gadget won't start and then
> > leave the whole xhci controller broken.
> >
> > @Laurent, what do you think?
>
> I think I've explained before how this should be fixed at the V4L2
> level. The problem actually affects character devices globally, and Greg
> KH said he would have a go at fixing it there, but I don't think much
> happened. Starting with a V4L2-level fix is fine with me.
>
> There are a few patches in the series that are specific to uvcvideo,
> I'll have another look and merge those.
>
> > On Wed, Sep 16, 2020 at 07:25:42PM -0700, Guenter Roeck wrote:
> > > Something seems to have gone wrong with v3 of this patch series.
> > > I am sure I sent it out, but I don't find it anywhere.
> > > Resending. Sorry for any duplicates.
> > >
> > > The uvcvideo code has no lock protection against USB disconnects
> > > while video operations are ongoing. This has resulted in random
> > > error reports, typically pointing to a crash in usb_ifnum_to_if(),
> > > called from usb_hcd_alloc_bandwidth(). A typical traceback is as
> > > follows.
> > >
> > > usb 1-4: USB disconnect, device number 3
> > > BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
> > > PGD 0 P4D 0
> > > Oops: 0000 [#1] PREEMPT SMP PTI
> > > CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
> > > Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
> > > RIP: 0010:usb_ifnum_to_if+0x29/0x40
> > > Code: <...>
> > > RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
> > > RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
> > > RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
> > > RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
> > > R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
> > > R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
> > > FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
> > > Call Trace:
> > >  usb_hcd_alloc_bandwidth+0x1ee/0x30f
> > >  usb_set_interface+0x1a3/0x2b7
> > >  uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
> > >  uvc_video_start_streaming+0x91/0xdd [uvcvideo]
> > >  uvc_start_streaming+0x28/0x5d [uvcvideo]
> > >  vb2_start_streaming+0x61/0x143 [videobuf2_common]
> > >  vb2_core_streamon+0xf7/0x10f [videobuf2_common]
> > >  uvc_queue_streamon+0x2e/0x41 [uvcvideo]
> > >  uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
> > >  __video_do_ioctl+0x33d/0x42a
> > >  video_usercopy+0x34e/0x5ff
> > >  ? video_ioctl2+0x16/0x16
> > >  v4l2_ioctl+0x46/0x53
> > >  do_vfs_ioctl+0x50a/0x76f
> > >  ksys_ioctl+0x58/0x83
> > >  __x64_sys_ioctl+0x1a/0x1e
> > >  do_syscall_64+0x54/0xde
> > >
> > > While there are not many references to this problem on mailing lists, it is
> > > reported on a regular basis on various Chromebooks (roughly 300 reports
> > > per month). The problem is relatively easy to reproduce by adding msleep()
> > > calls into the code.
> > >
> > > I tried to reproduce the problem with non-uvcvideo webcams, but was
> > > unsuccessful. I was unable to get Philips (pwc) webcams to work. gspca
> > > based webcams don't experience the problem, or at least I was unable to
> > > reproduce it (The gspa driver does not trigger sending USB messages in the
> > > open function, and otherwise uses the locking mechanism provided by the
> > > v4l2/vb2 core).
> > >
> > > I don't presume to claim that I found every issue, but this patch series
> > > should fix at least the major problems.
> > >
> > > The patch series was tested exensively on a Chromebook running chromeos-4.19
> > > and on a Linux system running a v5.8.y based kernel.
> > >
> > > v3:
> > > - In patch 5/5, add missing calls to usb_autopm_put_interface() and kfree()
> > >   to failure code path
> > >
> > > v2:
> > > - Added details about problem frequency and testing with non-uvc webcams
> > >   to summary
> > > - In patch 4/5, return EPOLLERR instead of -ENODEV on poll errors
> > > - Fix description in patch 5/5
> > >
> > > ----------------------------------------------------------------
> > > Guenter Roeck (5):
> > >       media: uvcvideo: Cancel async worker earlier
> > >       media: uvcvideo: Lock video streams and queues while unregistering
> > >       media: uvcvideo: Release stream queue when unregistering video device
> > >       media: uvcvideo: Protect uvc queue file operations against disconnect
> > >       media: uvcvideo: Abort uvc_v4l2_open if video device is unregistered
> > >
> > >  drivers/media/usb/uvc/uvc_ctrl.c   | 11 ++++++----
> > >  drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++
> > >  drivers/media/usb/uvc/uvc_queue.c  | 32 +++++++++++++++++++++++++--
> > >  drivers/media/usb/uvc/uvc_v4l2.c   | 45 ++++++++++++++++++++++++++++++++++++--
> > >  drivers/media/usb/uvc/uvcvideo.h   |  1 +
> > >  5 files changed, 93 insertions(+), 8 deletions(-)
>
> --
> Regards,
>
> Laurent Pinchart

Hi Laurent,

Have you had time to take another look at those patches? Can we
somehow move at least with the uvcvideo patches?

Best regards,
Lukasz

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-08-23  9:34 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-17  2:25 [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Guenter Roeck
2020-09-17  2:25 ` [PATCH RESEND v3 1/5] media: uvcvideo: Cancel async worker earlier Guenter Roeck
2021-06-17 17:06   ` Ezequiel Garcia
2020-09-17  2:25 ` [PATCH RESEND v3 2/5] media: uvcvideo: Lock video streams and queues while unregistering Guenter Roeck
2020-09-17  2:25 ` [PATCH RESEND v3 3/5] media: uvcvideo: Release stream queue when unregistering video device Guenter Roeck
2020-09-17  2:25 ` [PATCH RESEND v3 4/5] media: uvcvideo: Protect uvc queue file operations against disconnect Guenter Roeck
2020-09-17  2:25 ` [PATCH RESEND v3 5/5] media: uvcvideo: Abort uvc_v4l2_open if video device is unregistered Guenter Roeck
2020-09-17 12:47 ` [PATCH RESEND v3 0/5] media: uvcvideo: Fix race conditions Laurent Pinchart
2020-09-18  2:16   ` Guenter Roeck
2021-03-12  7:36     ` Dominique MARTINET
2021-03-12 14:54       ` Guenter Roeck
2021-06-09  4:12         ` Dominique MARTINET
2022-03-11 20:24 ` Michael Grzeschik
2022-03-12 18:43   ` Laurent Pinchart
2022-08-23  8:37     ` Lukasz Majczak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).