All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] uvcvideo: Attempt N to land UVC race conditions fixes
@ 2023-03-09 14:44 Ricardo Ribalda
  2023-03-09 14:44 ` [PATCH v2 1/3] media: uvcvideo: Cancel async worker earlier Ricardo Ribalda
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Ricardo Ribalda @ 2023-03-09 14:44 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Ricardo Ribalda, Laurent Pinchart, Max Staudt, Alan Stern,
	Sakari Ailus, Guenter Roeck, linux-media, linux-kernel,
	Sean Paul, Tomasz Figa, Hans Verkuil

Back in 2020 Guenter published a set of patches to fix some race
conditions on UVC.
https://lore.kernel.org/all/20200917022547.198090-5-linux@roeck-us.net/

That kind of race conditions are not only seen on UVC, but are a common
sin on almost all the kernel, so this is what it was decided back then
that we should try to fix them at higher levels.

After that. A lot of video_is_registered() were added to the core:

```
ribalda@alco:~/work/linux$ git grep is_registered drivers/media/v4l2-core/
drivers/media/v4l2-core/v4l2-compat-ioctl32.c:  if (!video_is_registered(vdev))
drivers/media/v4l2-core/v4l2-dev.c:     if (video_is_registered(vdev))
drivers/media/v4l2-core/v4l2-dev.c:     if (video_is_registered(vdev))
drivers/media/v4l2-core/v4l2-dev.c:     if (video_is_registered(vdev)) {
drivers/media/v4l2-core/v4l2-dev.c:             if (video_is_registered(vdev))
drivers/media/v4l2-core/v4l2-dev.c:     if (!video_is_registered(vdev))
drivers/media/v4l2-core/v4l2-dev.c:     if (video_is_registered(vdev))
drivers/media/v4l2-core/v4l2-dev.c:     if (vdev == NULL || !video_is_registered(vdev)) {
drivers/media/v4l2-core/v4l2-dev.c:             if (video_is_registered(vdev))
drivers/media/v4l2-core/v4l2-dev.c:     if (!vdev || !video_is_registered(vdev))
drivers/media/v4l2-core/v4l2-ioctl.c:   if (!video_is_registered(vfd)) {
drivers/media/v4l2-core/v4l2-subdev.c:  if (video_is_registered(vdev)) {
```

And recently Sakari is trying to land:
https://lore.kernel.org/linux-media/20230201214535.347075-1-sakari.ailus@linux.intel.com/

Which will make obsolete a lof (all?) of the video_is_registered() checks on
Guenters patches.

Besides those checks, there were some other valid races fixed on his
patches.

This series is just a rebase of what I think is missing on UVC even
if we fixed v4l2/core with all the video_is_register() checks removed.

Thanks!

To: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Max Staudt <mstaudt@chromium.org>
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Sakari Ailus <sakari.ailus@iki.fi>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: linux-media@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
---
Changes in v2:
- Actually send the series to the ML an not only to individuals.
- Link to v1: https://lore.kernel.org/r/20230309-guenter-mini-v1-0-627d10cf6e96@chromium.org

---
Guenter Roeck (3):
      media: uvcvideo: Cancel async worker earlier
      media: uvcvideo: Lock video streams and queues while unregistering
      media: uvcvideo: Release stream queue when unregistering video device

 drivers/media/usb/uvc/uvc_ctrl.c   | 11 +++++++----
 drivers/media/usb/uvc/uvc_driver.c | 12 ++++++++++++
 drivers/media/usb/uvc/uvcvideo.h   |  1 +
 3 files changed, 20 insertions(+), 4 deletions(-)
---
base-commit: 63355b9884b3d1677de6bd1517cd2b8a9bf53978
change-id: 20230309-guenter-mini-89861b084ef1

Best regards,
-- 
Ricardo Ribalda <ribalda@chromium.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 1/3] media: uvcvideo: Cancel async worker earlier
  2023-03-09 14:44 [PATCH v2 0/3] uvcvideo: Attempt N to land UVC race conditions fixes Ricardo Ribalda
@ 2023-03-09 14:44 ` Ricardo Ribalda
  2023-03-09 14:57   ` Laurent Pinchart
  2023-03-09 14:44 ` [PATCH v2 2/3] media: uvcvideo: Lock video streams and queues while unregistering Ricardo Ribalda
  2023-03-09 14:44 ` [PATCH v2 3/3] media: uvcvideo: Release stream queue when unregistering video device Ricardo Ribalda
  2 siblings, 1 reply; 8+ messages in thread
From: Ricardo Ribalda @ 2023-03-09 14:44 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Ricardo Ribalda, Laurent Pinchart, Max Staudt, Alan Stern,
	Sakari Ailus, Guenter Roeck, linux-media, linux-kernel,
	Sean Paul, Tomasz Figa, Hans Verkuil

From: Guenter Roeck <linux@roeck-us.net>

So far the asynchronous control worker was canceled only in
uvc_ctrl_cleanup_device. This is much later than the call to
uvc_disconnect. However, after the call to uvc_disconnect returns,
there must be no more USB activity. This can result in all kinds
of problems in the USB code. One observed example:

URB ffff993e83d0bc00 submitted while active
WARNING: CPU: 0 PID: 4046 at drivers/usb/core/urb.c:364 usb_submit_urb+0x4ba/0x55e
Modules linked in: <...>
CPU: 0 PID: 4046 Comm: kworker/0:35 Not tainted 4.19.139 #18
Hardware name: Google Phaser/Phaser, BIOS Google_Phaser.10952.0.0 08/09/2018
Workqueue: events uvc_ctrl_status_event_work [uvcvideo]
RIP: 0010:usb_submit_urb+0x4ba/0x55e
Code: <...>
RSP: 0018:ffffb08d471ebde8 EFLAGS: 00010246
RAX: a6da85d923ea5d00 RBX: ffff993e71985928 RCX: 0000000000000000
RDX: ffff993f37a1de90 RSI: ffff993f37a153d0 RDI: ffff993f37a153d0
RBP: ffffb08d471ebe28 R08: 000000000000003b R09: 001424bf85822e96
R10: 0000001000000000 R11: ffffffff975a4398 R12: ffff993e83d0b000
R13: ffff993e83d0bc00 R14: 0000000000000000 R15: 00000000fffffff0
FS:  0000000000000000(0000) GS:ffff993f37a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000ec9c0000 CR3: 000000025b160000 CR4: 0000000000340ef0
Call Trace:
 uvc_ctrl_status_event_work+0xd6/0x107 [uvcvideo]
 process_one_work+0x19b/0x4c5
 worker_thread+0x10d/0x286
 kthread+0x138/0x140
 ? process_one_work+0x4c5/0x4c5
 ? kthread_associate_blkcg+0xc1/0xc1
 ret_from_fork+0x1f/0x40

Introduce new function uvc_ctrl_stop_device() to cancel the worker
and call it from uvc_unregister_video() to solve the problem.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
---
 drivers/media/usb/uvc/uvc_ctrl.c   | 11 +++++++----
 drivers/media/usb/uvc/uvc_driver.c |  1 +
 drivers/media/usb/uvc/uvcvideo.h   |  1 +
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
index 5e9d3da862dd..769c1d2a2f45 100644
--- a/drivers/media/usb/uvc/uvc_ctrl.c
+++ b/drivers/media/usb/uvc/uvc_ctrl.c
@@ -2757,14 +2757,17 @@ static void uvc_ctrl_cleanup_mappings(struct uvc_device *dev,
 	}
 }
 
-void uvc_ctrl_cleanup_device(struct uvc_device *dev)
+void uvc_ctrl_stop_device(struct uvc_device *dev)
 {
-	struct uvc_entity *entity;
-	unsigned int i;
-
 	/* Can be uninitialized if we are aborting on probe error. */
 	if (dev->async_ctrl.work.func)
 		cancel_work_sync(&dev->async_ctrl.work);
+}
+
+void uvc_ctrl_cleanup_device(struct uvc_device *dev)
+{
+	struct uvc_entity *entity;
+	unsigned int i;
 
 	/* Free controls and control mappings for all entities. */
 	list_for_each_entry(entity, &dev->entities, list) {
diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
index 7aefa76a42b3..4be6dfeaa295 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1893,6 +1893,7 @@ static void uvc_unregister_video(struct uvc_device *dev)
 	}
 
 	uvc_status_unregister(dev);
+	uvc_ctrl_stop_device(dev);
 
 	if (dev->vdev.dev)
 		v4l2_device_unregister(&dev->vdev);
diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
index 9a596c8d894a..50f171e7381b 100644
--- a/drivers/media/usb/uvc/uvcvideo.h
+++ b/drivers/media/usb/uvc/uvcvideo.h
@@ -760,6 +760,7 @@ int uvc_query_v4l2_menu(struct uvc_video_chain *chain,
 int uvc_ctrl_add_mapping(struct uvc_video_chain *chain,
 			 const struct uvc_control_mapping *mapping);
 int uvc_ctrl_init_device(struct uvc_device *dev);
+void uvc_ctrl_stop_device(struct uvc_device *dev);
 void uvc_ctrl_cleanup_device(struct uvc_device *dev);
 int uvc_ctrl_restore_values(struct uvc_device *dev);
 bool uvc_ctrl_status_event_async(struct urb *urb, struct uvc_video_chain *chain,

-- 
2.40.0.rc0.216.gc4246ad0f0-goog-b4-0.11.0-dev-696ae

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 2/3] media: uvcvideo: Lock video streams and queues while unregistering
  2023-03-09 14:44 [PATCH v2 0/3] uvcvideo: Attempt N to land UVC race conditions fixes Ricardo Ribalda
  2023-03-09 14:44 ` [PATCH v2 1/3] media: uvcvideo: Cancel async worker earlier Ricardo Ribalda
@ 2023-03-09 14:44 ` Ricardo Ribalda
  2023-03-09 14:44 ` [PATCH v2 3/3] media: uvcvideo: Release stream queue when unregistering video device Ricardo Ribalda
  2 siblings, 0 replies; 8+ messages in thread
From: Ricardo Ribalda @ 2023-03-09 14:44 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Ricardo Ribalda, Laurent Pinchart, Max Staudt, Alan Stern,
	Sakari Ailus, Guenter Roeck, linux-media, linux-kernel,
	Sean Paul, Tomasz Figa, Hans Verkuil

From: Guenter Roeck <linux@roeck-us.net>

The call to uvc_disconnect() is not protected by any mutex.
This means it can and will be called while other accesses to the video
device are in progress. This can cause all kinds of race conditions,
including crashes such as the following.

usb 1-4: USB disconnect, device number 3
BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 5633 Comm: V4L2CaptureThre Not tainted 4.19.113-08536-g5d29ca36db06 #1
Hardware name: GOOGLE Edgar, BIOS Google_Edgar.7287.167.156 03/25/2019
RIP: 0010:usb_ifnum_to_if+0x29/0x40
Code: <...>
RSP: 0018:ffffa46f42a47a80 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff904a396c9000
RDX: ffff904a39641320 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffa46f42a47a80 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000009975 R11: 0000000000000009 R12: 0000000000000000
R13: ffff904a396b3800 R14: ffff904a39e88000 R15: 0000000000000000
FS: 00007f396448e700(0000) GS:ffff904a3ba00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000016cb46000 CR4: 00000000001006f0
Call Trace:
 usb_hcd_alloc_bandwidth+0x1ee/0x30f
 usb_set_interface+0x1a3/0x2b7
 uvc_video_start_transfer+0x29b/0x4b8 [uvcvideo]
 uvc_video_start_streaming+0x91/0xdd [uvcvideo]
 uvc_start_streaming+0x28/0x5d [uvcvideo]
 vb2_start_streaming+0x61/0x143 [videobuf2_common]
 vb2_core_streamon+0xf7/0x10f [videobuf2_common]
 uvc_queue_streamon+0x2e/0x41 [uvcvideo]
 uvc_ioctl_streamon+0x42/0x5c [uvcvideo]
 __video_do_ioctl+0x33d/0x42a
 video_usercopy+0x34e/0x5ff
 ? video_ioctl2+0x16/0x16
 v4l2_ioctl+0x46/0x53
 do_vfs_ioctl+0x50a/0x76f
 ksys_ioctl+0x58/0x83
 __x64_sys_ioctl+0x1a/0x1e
 do_syscall_64+0x54/0xde

usb_set_interface() should not be called after the USB device has been
unregistered. However, in the above case the disconnect happened after
v4l2_ioctl() was called, but before the call to usb_ifnum_to_if().

Acquire various mutexes in uvc_unregister_video() to fix the majority
(maybe all) of the observed race conditions.

The uvc_device lock prevents races against suspend and resume calls
and the poll function.

The uvc_streaming lock prevents races against stream related functions;
for the most part, those are ioctls. This lock also requires other
functions using this lock to check if a video device is still registered
after acquiring it. For example, it was observed that the video device
was already unregistered by the time the stream lock was acquired in
uvc_ioctl_streamon().

The uvc_queue lock prevents races against queue functions, Most of
those are already protected by the uvc_streaming lock, but some
are called directly. This is done as added protection; an actual race
was not (yet) observed.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
---
 drivers/media/usb/uvc/uvc_driver.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
index 4be6dfeaa295..9fda863ec446 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1882,14 +1882,22 @@ static void uvc_unregister_video(struct uvc_device *dev)
 {
 	struct uvc_streaming *stream;
 
+	mutex_lock(&dev->lock);
+
 	list_for_each_entry(stream, &dev->streams, list) {
 		if (!video_is_registered(&stream->vdev))
 			continue;
 
+		mutex_lock(&stream->mutex);
+		mutex_lock(&stream->queue.mutex);
+
 		video_unregister_device(&stream->vdev);
 		video_unregister_device(&stream->meta.vdev);
 
 		uvc_debugfs_cleanup_stream(stream);
+
+		mutex_unlock(&stream->queue.mutex);
+		mutex_unlock(&stream->mutex);
 	}
 
 	uvc_status_unregister(dev);
@@ -1901,6 +1909,7 @@ static void uvc_unregister_video(struct uvc_device *dev)
 	if (media_devnode_is_registered(dev->mdev.devnode))
 		media_device_unregister(&dev->mdev);
 #endif
+	mutex_unlock(&dev->lock);
 }
 
 int uvc_register_video_device(struct uvc_device *dev,

-- 
2.40.0.rc0.216.gc4246ad0f0-goog-b4-0.11.0-dev-696ae

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 3/3] media: uvcvideo: Release stream queue when unregistering video device
  2023-03-09 14:44 [PATCH v2 0/3] uvcvideo: Attempt N to land UVC race conditions fixes Ricardo Ribalda
  2023-03-09 14:44 ` [PATCH v2 1/3] media: uvcvideo: Cancel async worker earlier Ricardo Ribalda
  2023-03-09 14:44 ` [PATCH v2 2/3] media: uvcvideo: Lock video streams and queues while unregistering Ricardo Ribalda
@ 2023-03-09 14:44 ` Ricardo Ribalda
  2 siblings, 0 replies; 8+ messages in thread
From: Ricardo Ribalda @ 2023-03-09 14:44 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Ricardo Ribalda, Laurent Pinchart, Max Staudt, Alan Stern,
	Sakari Ailus, Guenter Roeck, linux-media, linux-kernel,
	Sean Paul, Tomasz Figa, Hans Verkuil

From: Guenter Roeck <linux@roeck-us.net>

The following traceback was observed when stress testing the uvcvideo
driver.

config->interface[0] is NULL
WARNING: CPU: 0 PID: 2757 at drivers/usb/core/usb.c:285 usb_ifnum_to_if+0x81/0x85
^^^ added log message and WARN() to prevent crash
Modules linked in: <...>
CPU: 0 PID: 2757 Comm: inotify_reader Not tainted 4.19.139 #20
Hardware name: Google Phaser/Phaser, BIOS Google_Phaser.10952.0.0 08/09/2018
RIP: 0010:usb_ifnum_to_if+0x81/0x85
Code: <...>
RSP: 0018:ffff9ee20141fa58 EFLAGS: 00010246
RAX: 438a0e5a525f1800 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff975477a1de90 RSI: ffff975477a153d0 RDI: ffff975477a153d0
RBP: ffff9ee20141fa70 R08: 000000000000002c R09: 002daec189138d78
R10: 0000001000000000 R11: ffffffffa7da42e6 R12: ffff975459594800
R13: 0000000000000001 R14: 0000000000000001 R15: ffff975465376400
FS:  00007ddebffd6700(0000) GS:ffff975477a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000012c83000 CR3: 00000001bbaf8000 CR4: 0000000000340ef0
Call Trace:
 usb_set_interface+0x3b/0x2f3
 uvc_video_stop_streaming+0x38/0x81 [uvcvideo]
 uvc_stop_streaming+0x21/0x49 [uvcvideo]
 __vb2_queue_cancel+0x33/0x249 [videobuf2_common]
 vb2_core_queue_release+0x1c/0x45 [videobuf2_common]
 uvc_queue_release+0x26/0x32 [uvcvideo]
 uvc_v4l2_release+0x41/0xdd [uvcvideo]
 v4l2_release+0x99/0xed
 __fput+0xf0/0x1ea
 task_work_run+0x7f/0xa7
 do_exit+0x1d1/0x6eb
 do_group_exit+0x9d/0xac
 get_signal+0x135/0x482
 do_signal+0x4a/0x63c
 exit_to_usermode_loop+0x86/0xa5
 do_syscall_64+0x171/0x269
 ? __do_page_fault+0x272/0x474
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ddec156dc26
Code: Bad RIP value.
RSP: 002b:00007ddebffd5c10 EFLAGS: 00000293 ORIG_RAX: 0000000000000017
RAX: fffffffffffffdfe RBX: 000000000000000a RCX: 00007ddec156dc26
RDX: 0000000000000000 RSI: 00007ddebffd5d28 RDI: 000000000000000a
RBP: 00007ddebffd5c50 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000293 R12: 00007ddebffd5d28
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000

When this was observed, a USB disconnect was in progress, uvc_disconnect()
had returned, but usb_disable_device() was still running.
Drivers are not supposed to call any USB functions after the driver
disconnect function has been called. The uvcvideo driver does not follow
that convention and tries to write to the disconnected USB interface
anyway. This results in a variety of race conditions.

To solve this specific problem, release the uvc queue from
uvc_unregister_video() after the associated video devices have been
unregistered. Since the function already holds the uvc queue mutex,
bypass uvc_queue_release() and call vb2_queue_release() directly.

Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Reviewed-by: Tomasz Figa <tfiga@chromium.org>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
---
 drivers/media/usb/uvc/uvc_driver.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
index 9fda863ec446..23d839f74ca5 100644
--- a/drivers/media/usb/uvc/uvc_driver.c
+++ b/drivers/media/usb/uvc/uvc_driver.c
@@ -1896,6 +1896,8 @@ static void uvc_unregister_video(struct uvc_device *dev)
 
 		uvc_debugfs_cleanup_stream(stream);
 
+		vb2_queue_release(&stream->queue.queue);
+
 		mutex_unlock(&stream->queue.mutex);
 		mutex_unlock(&stream->mutex);
 	}

-- 
2.40.0.rc0.216.gc4246ad0f0-goog-b4-0.11.0-dev-696ae

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] media: uvcvideo: Cancel async worker earlier
  2023-03-09 14:44 ` [PATCH v2 1/3] media: uvcvideo: Cancel async worker earlier Ricardo Ribalda
@ 2023-03-09 14:57   ` Laurent Pinchart
  2023-03-28  7:52     ` Ricardo Ribalda
  0 siblings, 1 reply; 8+ messages in thread
From: Laurent Pinchart @ 2023-03-09 14:57 UTC (permalink / raw)
  To: Ricardo Ribalda
  Cc: Mauro Carvalho Chehab, Max Staudt, Alan Stern, Sakari Ailus,
	Guenter Roeck, linux-media, linux-kernel, Sean Paul, Tomasz Figa,
	Hans Verkuil

Hi Ricardo and Guenter,

Thank you for the patch.

On Thu, Mar 09, 2023 at 03:44:05PM +0100, Ricardo Ribalda wrote:
> From: Guenter Roeck <linux@roeck-us.net>
> 
> So far the asynchronous control worker was canceled only in
> uvc_ctrl_cleanup_device. This is much later than the call to
> uvc_disconnect. However, after the call to uvc_disconnect returns,
> there must be no more USB activity. This can result in all kinds
> of problems in the USB code. One observed example:
> 
> URB ffff993e83d0bc00 submitted while active
> WARNING: CPU: 0 PID: 4046 at drivers/usb/core/urb.c:364 usb_submit_urb+0x4ba/0x55e
> Modules linked in: <...>
> CPU: 0 PID: 4046 Comm: kworker/0:35 Not tainted 4.19.139 #18
> Hardware name: Google Phaser/Phaser, BIOS Google_Phaser.10952.0.0 08/09/2018
> Workqueue: events uvc_ctrl_status_event_work [uvcvideo]
> RIP: 0010:usb_submit_urb+0x4ba/0x55e
> Code: <...>
> RSP: 0018:ffffb08d471ebde8 EFLAGS: 00010246
> RAX: a6da85d923ea5d00 RBX: ffff993e71985928 RCX: 0000000000000000
> RDX: ffff993f37a1de90 RSI: ffff993f37a153d0 RDI: ffff993f37a153d0
> RBP: ffffb08d471ebe28 R08: 000000000000003b R09: 001424bf85822e96
> R10: 0000001000000000 R11: ffffffff975a4398 R12: ffff993e83d0b000
> R13: ffff993e83d0bc00 R14: 0000000000000000 R15: 00000000fffffff0
> FS:  0000000000000000(0000) GS:ffff993f37a00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000ec9c0000 CR3: 000000025b160000 CR4: 0000000000340ef0
> Call Trace:
>  uvc_ctrl_status_event_work+0xd6/0x107 [uvcvideo]
>  process_one_work+0x19b/0x4c5
>  worker_thread+0x10d/0x286
>  kthread+0x138/0x140
>  ? process_one_work+0x4c5/0x4c5
>  ? kthread_associate_blkcg+0xc1/0xc1
>  ret_from_fork+0x1f/0x40
> 
> Introduce new function uvc_ctrl_stop_device() to cancel the worker
> and call it from uvc_unregister_video() to solve the problem.
> 
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> Cc: Alan Stern <stern@rowland.harvard.edu>
> Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> Reviewed-by: Tomasz Figa <tfiga@chromium.org>
> Reviewed-by: Sean Paul <seanpaul@chromium.org>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> ---
>  drivers/media/usb/uvc/uvc_ctrl.c   | 11 +++++++----
>  drivers/media/usb/uvc/uvc_driver.c |  1 +
>  drivers/media/usb/uvc/uvcvideo.h   |  1 +
>  3 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
> index 5e9d3da862dd..769c1d2a2f45 100644
> --- a/drivers/media/usb/uvc/uvc_ctrl.c
> +++ b/drivers/media/usb/uvc/uvc_ctrl.c
> @@ -2757,14 +2757,17 @@ static void uvc_ctrl_cleanup_mappings(struct uvc_device *dev,
>  	}
>  }
>  
> -void uvc_ctrl_cleanup_device(struct uvc_device *dev)
> +void uvc_ctrl_stop_device(struct uvc_device *dev)
>  {
> -	struct uvc_entity *entity;
> -	unsigned int i;
> -
>  	/* Can be uninitialized if we are aborting on probe error. */
>  	if (dev->async_ctrl.work.func)
>  		cancel_work_sync(&dev->async_ctrl.work);
> +}

There may be an opportunity for refactoring, as we have
uvc_status_stop() that stops the work queue, but I think this is good
enough for now. I'm wondering, though, if there could be a race
condition here similar to the one that the recent changes to
uvc_status_stop() have fixed ? As uvc_ctrl_stop_device() is called at
release time I assume that URBs have been cancelled, so there should be
no race, but a second pair of eyeballs to confirm this would be
appreciated.

Other than that, the patch looks good to me, and fixes an issue
independent from the rest of the series, so

Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>

I will wait for a reply regarding the race condition before queuing this
up though.

> +
> +void uvc_ctrl_cleanup_device(struct uvc_device *dev)
> +{
> +	struct uvc_entity *entity;
> +	unsigned int i;
>  
>  	/* Free controls and control mappings for all entities. */
>  	list_for_each_entry(entity, &dev->entities, list) {
> diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
> index 7aefa76a42b3..4be6dfeaa295 100644
> --- a/drivers/media/usb/uvc/uvc_driver.c
> +++ b/drivers/media/usb/uvc/uvc_driver.c
> @@ -1893,6 +1893,7 @@ static void uvc_unregister_video(struct uvc_device *dev)
>  	}
>  
>  	uvc_status_unregister(dev);
> +	uvc_ctrl_stop_device(dev);
>  
>  	if (dev->vdev.dev)
>  		v4l2_device_unregister(&dev->vdev);
> diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
> index 9a596c8d894a..50f171e7381b 100644
> --- a/drivers/media/usb/uvc/uvcvideo.h
> +++ b/drivers/media/usb/uvc/uvcvideo.h
> @@ -760,6 +760,7 @@ int uvc_query_v4l2_menu(struct uvc_video_chain *chain,
>  int uvc_ctrl_add_mapping(struct uvc_video_chain *chain,
>  			 const struct uvc_control_mapping *mapping);
>  int uvc_ctrl_init_device(struct uvc_device *dev);
> +void uvc_ctrl_stop_device(struct uvc_device *dev);
>  void uvc_ctrl_cleanup_device(struct uvc_device *dev);
>  int uvc_ctrl_restore_values(struct uvc_device *dev);
>  bool uvc_ctrl_status_event_async(struct urb *urb, struct uvc_video_chain *chain,

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] media: uvcvideo: Cancel async worker earlier
  2023-03-09 14:57   ` Laurent Pinchart
@ 2023-03-28  7:52     ` Ricardo Ribalda
  2023-05-01 13:29       ` Ricardo Ribalda
  0 siblings, 1 reply; 8+ messages in thread
From: Ricardo Ribalda @ 2023-03-28  7:52 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Max Staudt, Alan Stern, Sakari Ailus,
	Guenter Roeck, linux-media, linux-kernel, Sean Paul, Tomasz Figa,
	Hans Verkuil

Hi Laurent

I have not tested it yet... but maybe something like this might be
slightly better?


diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
index 5e9d3da862dd..3944404b2de2 100644
--- a/drivers/media/usb/uvc/uvc_ctrl.c
+++ b/drivers/media/usb/uvc/uvc_ctrl.c
@@ -2762,10 +2762,6 @@ void uvc_ctrl_cleanup_device(struct uvc_device *dev)
        struct uvc_entity *entity;
        unsigned int i;

-       /* Can be uninitialized if we are aborting on probe error. */
-       if (dev->async_ctrl.work.func)
-               cancel_work_sync(&dev->async_ctrl.work);
-
        /* Free controls and control mappings for all entities. */
        list_for_each_entry(entity, &dev->entities, list) {
                for (i = 0; i < entity->ncontrols; ++i) {
diff --git a/drivers/media/usb/uvc/uvc_status.c
b/drivers/media/usb/uvc/uvc_status.c
index a78a88c710e2..0208612a9f12 100644
--- a/drivers/media/usb/uvc/uvc_status.c
+++ b/drivers/media/usb/uvc/uvc_status.c
@@ -292,7 +292,7 @@ int uvc_status_init(struct uvc_device *dev)

 void uvc_status_unregister(struct uvc_device *dev)
 {
-       usb_kill_urb(dev->int_urb);
+       uvc_status_stop(dev);
        uvc_input_unregister(dev);
 }


The benefit from Guenter patch is that it has been tested for years...

What do you think? Shall we try this approach instead?

Regards!

On Thu, 9 Mar 2023 at 15:57, Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
>
> Hi Ricardo and Guenter,
>
> Thank you for the patch.
>
> On Thu, Mar 09, 2023 at 03:44:05PM +0100, Ricardo Ribalda wrote:
> > From: Guenter Roeck <linux@roeck-us.net>
> >
> > So far the asynchronous control worker was canceled only in
> > uvc_ctrl_cleanup_device. This is much later than the call to
> > uvc_disconnect. However, after the call to uvc_disconnect returns,
> > there must be no more USB activity. This can result in all kinds
> > of problems in the USB code. One observed example:
> >
> > URB ffff993e83d0bc00 submitted while active
> > WARNING: CPU: 0 PID: 4046 at drivers/usb/core/urb.c:364 usb_submit_urb+0x4ba/0x55e
> > Modules linked in: <...>
> > CPU: 0 PID: 4046 Comm: kworker/0:35 Not tainted 4.19.139 #18
> > Hardware name: Google Phaser/Phaser, BIOS Google_Phaser.10952.0.0 08/09/2018
> > Workqueue: events uvc_ctrl_status_event_work [uvcvideo]
> > RIP: 0010:usb_submit_urb+0x4ba/0x55e
> > Code: <...>
> > RSP: 0018:ffffb08d471ebde8 EFLAGS: 00010246
> > RAX: a6da85d923ea5d00 RBX: ffff993e71985928 RCX: 0000000000000000
> > RDX: ffff993f37a1de90 RSI: ffff993f37a153d0 RDI: ffff993f37a153d0
> > RBP: ffffb08d471ebe28 R08: 000000000000003b R09: 001424bf85822e96
> > R10: 0000001000000000 R11: ffffffff975a4398 R12: ffff993e83d0b000
> > R13: ffff993e83d0bc00 R14: 0000000000000000 R15: 00000000fffffff0
> > FS:  0000000000000000(0000) GS:ffff993f37a00000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00000000ec9c0000 CR3: 000000025b160000 CR4: 0000000000340ef0
> > Call Trace:
> >  uvc_ctrl_status_event_work+0xd6/0x107 [uvcvideo]
> >  process_one_work+0x19b/0x4c5
> >  worker_thread+0x10d/0x286
> >  kthread+0x138/0x140
> >  ? process_one_work+0x4c5/0x4c5
> >  ? kthread_associate_blkcg+0xc1/0xc1
> >  ret_from_fork+0x1f/0x40
> >
> > Introduce new function uvc_ctrl_stop_device() to cancel the worker
> > and call it from uvc_unregister_video() to solve the problem.
> >
> > Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > Cc: Alan Stern <stern@rowland.harvard.edu>
> > Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> > Reviewed-by: Tomasz Figa <tfiga@chromium.org>
> > Reviewed-by: Sean Paul <seanpaul@chromium.org>
> > Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> > ---
> >  drivers/media/usb/uvc/uvc_ctrl.c   | 11 +++++++----
> >  drivers/media/usb/uvc/uvc_driver.c |  1 +
> >  drivers/media/usb/uvc/uvcvideo.h   |  1 +
> >  3 files changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
> > index 5e9d3da862dd..769c1d2a2f45 100644
> > --- a/drivers/media/usb/uvc/uvc_ctrl.c
> > +++ b/drivers/media/usb/uvc/uvc_ctrl.c
> > @@ -2757,14 +2757,17 @@ static void uvc_ctrl_cleanup_mappings(struct uvc_device *dev,
> >       }
> >  }
> >
> > -void uvc_ctrl_cleanup_device(struct uvc_device *dev)
> > +void uvc_ctrl_stop_device(struct uvc_device *dev)
> >  {
> > -     struct uvc_entity *entity;
> > -     unsigned int i;
> > -
> >       /* Can be uninitialized if we are aborting on probe error. */
> >       if (dev->async_ctrl.work.func)
> >               cancel_work_sync(&dev->async_ctrl.work);
> > +}
>
> There may be an opportunity for refactoring, as we have
> uvc_status_stop() that stops the work queue, but I think this is good
> enough for now. I'm wondering, though, if there could be a race
> condition here similar to the one that the recent changes to
> uvc_status_stop() have fixed ? As uvc_ctrl_stop_device() is called at
> release time I assume that URBs have been cancelled, so there should be
> no race, but a second pair of eyeballs to confirm this would be
> appreciated.
>
> Other than that, the patch looks good to me, and fixes an issue
> independent from the rest of the series, so
>
> Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>
> I will wait for a reply regarding the race condition before queuing this
> up though.
>
> > +
> > +void uvc_ctrl_cleanup_device(struct uvc_device *dev)
> > +{
> > +     struct uvc_entity *entity;
> > +     unsigned int i;
> >
> >       /* Free controls and control mappings for all entities. */
> >       list_for_each_entry(entity, &dev->entities, list) {
> > diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
> > index 7aefa76a42b3..4be6dfeaa295 100644
> > --- a/drivers/media/usb/uvc/uvc_driver.c
> > +++ b/drivers/media/usb/uvc/uvc_driver.c
> > @@ -1893,6 +1893,7 @@ static void uvc_unregister_video(struct uvc_device *dev)
> >       }
> >
> >       uvc_status_unregister(dev);
> > +     uvc_ctrl_stop_device(dev);
> >
> >       if (dev->vdev.dev)
> >               v4l2_device_unregister(&dev->vdev);
> > diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
> > index 9a596c8d894a..50f171e7381b 100644
> > --- a/drivers/media/usb/uvc/uvcvideo.h
> > +++ b/drivers/media/usb/uvc/uvcvideo.h
> > @@ -760,6 +760,7 @@ int uvc_query_v4l2_menu(struct uvc_video_chain *chain,
> >  int uvc_ctrl_add_mapping(struct uvc_video_chain *chain,
> >                        const struct uvc_control_mapping *mapping);
> >  int uvc_ctrl_init_device(struct uvc_device *dev);
> > +void uvc_ctrl_stop_device(struct uvc_device *dev);
> >  void uvc_ctrl_cleanup_device(struct uvc_device *dev);
> >  int uvc_ctrl_restore_values(struct uvc_device *dev);
> >  bool uvc_ctrl_status_event_async(struct urb *urb, struct uvc_video_chain *chain,
>
> --
> Regards,
>
> Laurent Pinchart



-- 
Ricardo Ribalda

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] media: uvcvideo: Cancel async worker earlier
  2023-03-28  7:52     ` Ricardo Ribalda
@ 2023-05-01 13:29       ` Ricardo Ribalda
  2023-05-17  4:05         ` Tomasz Figa
  0 siblings, 1 reply; 8+ messages in thread
From: Ricardo Ribalda @ 2023-05-01 13:29 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Mauro Carvalho Chehab, Max Staudt, Alan Stern, Sakari Ailus,
	Guenter Roeck, linux-media, linux-kernel, Sean Paul, Tomasz Figa,
	Hans Verkuil

Hi Laurent

Friendly ping!

On Tue, 28 Mar 2023 at 09:52, Ricardo Ribalda <ribalda@chromium.org> wrote:
>
> Hi Laurent
>
> I have not tested it yet... but maybe something like this might be
> slightly better?
>
>
> diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
> index 5e9d3da862dd..3944404b2de2 100644
> --- a/drivers/media/usb/uvc/uvc_ctrl.c
> +++ b/drivers/media/usb/uvc/uvc_ctrl.c
> @@ -2762,10 +2762,6 @@ void uvc_ctrl_cleanup_device(struct uvc_device *dev)
>         struct uvc_entity *entity;
>         unsigned int i;
>
> -       /* Can be uninitialized if we are aborting on probe error. */
> -       if (dev->async_ctrl.work.func)
> -               cancel_work_sync(&dev->async_ctrl.work);
> -
>         /* Free controls and control mappings for all entities. */
>         list_for_each_entry(entity, &dev->entities, list) {
>                 for (i = 0; i < entity->ncontrols; ++i) {
> diff --git a/drivers/media/usb/uvc/uvc_status.c
> b/drivers/media/usb/uvc/uvc_status.c
> index a78a88c710e2..0208612a9f12 100644
> --- a/drivers/media/usb/uvc/uvc_status.c
> +++ b/drivers/media/usb/uvc/uvc_status.c
> @@ -292,7 +292,7 @@ int uvc_status_init(struct uvc_device *dev)
>
>  void uvc_status_unregister(struct uvc_device *dev)
>  {
> -       usb_kill_urb(dev->int_urb);
> +       uvc_status_stop(dev);
>         uvc_input_unregister(dev);
>  }
>
>
> The benefit from Guenter patch is that it has been tested for years...
>
> What do you think? Shall we try this approach instead?
>
> Regards!
>
> On Thu, 9 Mar 2023 at 15:57, Laurent Pinchart
> <laurent.pinchart@ideasonboard.com> wrote:
> >
> > Hi Ricardo and Guenter,
> >
> > Thank you for the patch.
> >
> > On Thu, Mar 09, 2023 at 03:44:05PM +0100, Ricardo Ribalda wrote:
> > > From: Guenter Roeck <linux@roeck-us.net>
> > >
> > > So far the asynchronous control worker was canceled only in
> > > uvc_ctrl_cleanup_device. This is much later than the call to
> > > uvc_disconnect. However, after the call to uvc_disconnect returns,
> > > there must be no more USB activity. This can result in all kinds
> > > of problems in the USB code. One observed example:
> > >
> > > URB ffff993e83d0bc00 submitted while active
> > > WARNING: CPU: 0 PID: 4046 at drivers/usb/core/urb.c:364 usb_submit_urb+0x4ba/0x55e
> > > Modules linked in: <...>
> > > CPU: 0 PID: 4046 Comm: kworker/0:35 Not tainted 4.19.139 #18
> > > Hardware name: Google Phaser/Phaser, BIOS Google_Phaser.10952.0.0 08/09/2018
> > > Workqueue: events uvc_ctrl_status_event_work [uvcvideo]
> > > RIP: 0010:usb_submit_urb+0x4ba/0x55e
> > > Code: <...>
> > > RSP: 0018:ffffb08d471ebde8 EFLAGS: 00010246
> > > RAX: a6da85d923ea5d00 RBX: ffff993e71985928 RCX: 0000000000000000
> > > RDX: ffff993f37a1de90 RSI: ffff993f37a153d0 RDI: ffff993f37a153d0
> > > RBP: ffffb08d471ebe28 R08: 000000000000003b R09: 001424bf85822e96
> > > R10: 0000001000000000 R11: ffffffff975a4398 R12: ffff993e83d0b000
> > > R13: ffff993e83d0bc00 R14: 0000000000000000 R15: 00000000fffffff0
> > > FS:  0000000000000000(0000) GS:ffff993f37a00000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00000000ec9c0000 CR3: 000000025b160000 CR4: 0000000000340ef0
> > > Call Trace:
> > >  uvc_ctrl_status_event_work+0xd6/0x107 [uvcvideo]
> > >  process_one_work+0x19b/0x4c5
> > >  worker_thread+0x10d/0x286
> > >  kthread+0x138/0x140
> > >  ? process_one_work+0x4c5/0x4c5
> > >  ? kthread_associate_blkcg+0xc1/0xc1
> > >  ret_from_fork+0x1f/0x40
> > >
> > > Introduce new function uvc_ctrl_stop_device() to cancel the worker
> > > and call it from uvc_unregister_video() to solve the problem.
> > >
> > > Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > Cc: Alan Stern <stern@rowland.harvard.edu>
> > > Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> > > Reviewed-by: Tomasz Figa <tfiga@chromium.org>
> > > Reviewed-by: Sean Paul <seanpaul@chromium.org>
> > > Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> > > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> > > ---
> > >  drivers/media/usb/uvc/uvc_ctrl.c   | 11 +++++++----
> > >  drivers/media/usb/uvc/uvc_driver.c |  1 +
> > >  drivers/media/usb/uvc/uvcvideo.h   |  1 +
> > >  3 files changed, 9 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
> > > index 5e9d3da862dd..769c1d2a2f45 100644
> > > --- a/drivers/media/usb/uvc/uvc_ctrl.c
> > > +++ b/drivers/media/usb/uvc/uvc_ctrl.c
> > > @@ -2757,14 +2757,17 @@ static void uvc_ctrl_cleanup_mappings(struct uvc_device *dev,
> > >       }
> > >  }
> > >
> > > -void uvc_ctrl_cleanup_device(struct uvc_device *dev)
> > > +void uvc_ctrl_stop_device(struct uvc_device *dev)
> > >  {
> > > -     struct uvc_entity *entity;
> > > -     unsigned int i;
> > > -
> > >       /* Can be uninitialized if we are aborting on probe error. */
> > >       if (dev->async_ctrl.work.func)
> > >               cancel_work_sync(&dev->async_ctrl.work);
> > > +}
> >
> > There may be an opportunity for refactoring, as we have
> > uvc_status_stop() that stops the work queue, but I think this is good
> > enough for now. I'm wondering, though, if there could be a race
> > condition here similar to the one that the recent changes to
> > uvc_status_stop() have fixed ? As uvc_ctrl_stop_device() is called at
> > release time I assume that URBs have been cancelled, so there should be
> > no race, but a second pair of eyeballs to confirm this would be
> > appreciated.
> >
> > Other than that, the patch looks good to me, and fixes an issue
> > independent from the rest of the series, so
> >
> > Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> >
> > I will wait for a reply regarding the race condition before queuing this
> > up though.
> >
> > > +
> > > +void uvc_ctrl_cleanup_device(struct uvc_device *dev)
> > > +{
> > > +     struct uvc_entity *entity;
> > > +     unsigned int i;
> > >
> > >       /* Free controls and control mappings for all entities. */
> > >       list_for_each_entry(entity, &dev->entities, list) {
> > > diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
> > > index 7aefa76a42b3..4be6dfeaa295 100644
> > > --- a/drivers/media/usb/uvc/uvc_driver.c
> > > +++ b/drivers/media/usb/uvc/uvc_driver.c
> > > @@ -1893,6 +1893,7 @@ static void uvc_unregister_video(struct uvc_device *dev)
> > >       }
> > >
> > >       uvc_status_unregister(dev);
> > > +     uvc_ctrl_stop_device(dev);
> > >
> > >       if (dev->vdev.dev)
> > >               v4l2_device_unregister(&dev->vdev);
> > > diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
> > > index 9a596c8d894a..50f171e7381b 100644
> > > --- a/drivers/media/usb/uvc/uvcvideo.h
> > > +++ b/drivers/media/usb/uvc/uvcvideo.h
> > > @@ -760,6 +760,7 @@ int uvc_query_v4l2_menu(struct uvc_video_chain *chain,
> > >  int uvc_ctrl_add_mapping(struct uvc_video_chain *chain,
> > >                        const struct uvc_control_mapping *mapping);
> > >  int uvc_ctrl_init_device(struct uvc_device *dev);
> > > +void uvc_ctrl_stop_device(struct uvc_device *dev);
> > >  void uvc_ctrl_cleanup_device(struct uvc_device *dev);
> > >  int uvc_ctrl_restore_values(struct uvc_device *dev);
> > >  bool uvc_ctrl_status_event_async(struct urb *urb, struct uvc_video_chain *chain,
> >
> > --
> > Regards,
> >
> > Laurent Pinchart
>
>
>
> --
> Ricardo Ribalda



-- 
Ricardo Ribalda

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 1/3] media: uvcvideo: Cancel async worker earlier
  2023-05-01 13:29       ` Ricardo Ribalda
@ 2023-05-17  4:05         ` Tomasz Figa
  0 siblings, 0 replies; 8+ messages in thread
From: Tomasz Figa @ 2023-05-17  4:05 UTC (permalink / raw)
  To: Ricardo Ribalda, Mauro Carvalho Chehab, Kieran Bingham
  Cc: Laurent Pinchart, Max Staudt, Alan Stern, Sakari Ailus,
	Guenter Roeck, linux-media, linux-kernel, Sean Paul,
	Hans Verkuil

On Mon, May 1, 2023 at 10:30 PM Ricardo Ribalda <ribalda@chromium.org> wrote:
>
> Hi Laurent
>
> Friendly ping!

+Kieran and +Mauro, could you take a look at this please, since
Laurent has not responded to this for almost 2 months?

>
> On Tue, 28 Mar 2023 at 09:52, Ricardo Ribalda <ribalda@chromium.org> wrote:
> >
> > Hi Laurent
> >
> > I have not tested it yet... but maybe something like this might be
> > slightly better?
> >
> >
> > diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
> > index 5e9d3da862dd..3944404b2de2 100644
> > --- a/drivers/media/usb/uvc/uvc_ctrl.c
> > +++ b/drivers/media/usb/uvc/uvc_ctrl.c
> > @@ -2762,10 +2762,6 @@ void uvc_ctrl_cleanup_device(struct uvc_device *dev)
> >         struct uvc_entity *entity;
> >         unsigned int i;
> >
> > -       /* Can be uninitialized if we are aborting on probe error. */
> > -       if (dev->async_ctrl.work.func)
> > -               cancel_work_sync(&dev->async_ctrl.work);
> > -
> >         /* Free controls and control mappings for all entities. */
> >         list_for_each_entry(entity, &dev->entities, list) {
> >                 for (i = 0; i < entity->ncontrols; ++i) {
> > diff --git a/drivers/media/usb/uvc/uvc_status.c
> > b/drivers/media/usb/uvc/uvc_status.c
> > index a78a88c710e2..0208612a9f12 100644
> > --- a/drivers/media/usb/uvc/uvc_status.c
> > +++ b/drivers/media/usb/uvc/uvc_status.c
> > @@ -292,7 +292,7 @@ int uvc_status_init(struct uvc_device *dev)
> >
> >  void uvc_status_unregister(struct uvc_device *dev)
> >  {
> > -       usb_kill_urb(dev->int_urb);
> > +       uvc_status_stop(dev);
> >         uvc_input_unregister(dev);
> >  }
> >
> >
> > The benefit from Guenter patch is that it has been tested for years...
> >
> > What do you think? Shall we try this approach instead?
> >
> > Regards!
> >
> > On Thu, 9 Mar 2023 at 15:57, Laurent Pinchart
> > <laurent.pinchart@ideasonboard.com> wrote:
> > >
> > > Hi Ricardo and Guenter,
> > >
> > > Thank you for the patch.
> > >
> > > On Thu, Mar 09, 2023 at 03:44:05PM +0100, Ricardo Ribalda wrote:
> > > > From: Guenter Roeck <linux@roeck-us.net>
> > > >
> > > > So far the asynchronous control worker was canceled only in
> > > > uvc_ctrl_cleanup_device. This is much later than the call to
> > > > uvc_disconnect. However, after the call to uvc_disconnect returns,
> > > > there must be no more USB activity. This can result in all kinds
> > > > of problems in the USB code. One observed example:
> > > >
> > > > URB ffff993e83d0bc00 submitted while active
> > > > WARNING: CPU: 0 PID: 4046 at drivers/usb/core/urb.c:364 usb_submit_urb+0x4ba/0x55e
> > > > Modules linked in: <...>
> > > > CPU: 0 PID: 4046 Comm: kworker/0:35 Not tainted 4.19.139 #18
> > > > Hardware name: Google Phaser/Phaser, BIOS Google_Phaser.10952.0.0 08/09/2018
> > > > Workqueue: events uvc_ctrl_status_event_work [uvcvideo]
> > > > RIP: 0010:usb_submit_urb+0x4ba/0x55e
> > > > Code: <...>
> > > > RSP: 0018:ffffb08d471ebde8 EFLAGS: 00010246
> > > > RAX: a6da85d923ea5d00 RBX: ffff993e71985928 RCX: 0000000000000000
> > > > RDX: ffff993f37a1de90 RSI: ffff993f37a153d0 RDI: ffff993f37a153d0
> > > > RBP: ffffb08d471ebe28 R08: 000000000000003b R09: 001424bf85822e96
> > > > R10: 0000001000000000 R11: ffffffff975a4398 R12: ffff993e83d0b000
> > > > R13: ffff993e83d0bc00 R14: 0000000000000000 R15: 00000000fffffff0
> > > > FS:  0000000000000000(0000) GS:ffff993f37a00000(0000) knlGS:0000000000000000
> > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > CR2: 00000000ec9c0000 CR3: 000000025b160000 CR4: 0000000000340ef0
> > > > Call Trace:
> > > >  uvc_ctrl_status_event_work+0xd6/0x107 [uvcvideo]
> > > >  process_one_work+0x19b/0x4c5
> > > >  worker_thread+0x10d/0x286
> > > >  kthread+0x138/0x140
> > > >  ? process_one_work+0x4c5/0x4c5
> > > >  ? kthread_associate_blkcg+0xc1/0xc1
> > > >  ret_from_fork+0x1f/0x40
> > > >
> > > > Introduce new function uvc_ctrl_stop_device() to cancel the worker
> > > > and call it from uvc_unregister_video() to solve the problem.
> > > >
> > > > Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > > > Cc: Alan Stern <stern@rowland.harvard.edu>
> > > > Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
> > > > Reviewed-by: Tomasz Figa <tfiga@chromium.org>
> > > > Reviewed-by: Sean Paul <seanpaul@chromium.org>
> > > > Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> > > > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> > > > ---
> > > >  drivers/media/usb/uvc/uvc_ctrl.c   | 11 +++++++----
> > > >  drivers/media/usb/uvc/uvc_driver.c |  1 +
> > > >  drivers/media/usb/uvc/uvcvideo.h   |  1 +
> > > >  3 files changed, 9 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/media/usb/uvc/uvc_ctrl.c b/drivers/media/usb/uvc/uvc_ctrl.c
> > > > index 5e9d3da862dd..769c1d2a2f45 100644
> > > > --- a/drivers/media/usb/uvc/uvc_ctrl.c
> > > > +++ b/drivers/media/usb/uvc/uvc_ctrl.c
> > > > @@ -2757,14 +2757,17 @@ static void uvc_ctrl_cleanup_mappings(struct uvc_device *dev,
> > > >       }
> > > >  }
> > > >
> > > > -void uvc_ctrl_cleanup_device(struct uvc_device *dev)
> > > > +void uvc_ctrl_stop_device(struct uvc_device *dev)
> > > >  {
> > > > -     struct uvc_entity *entity;
> > > > -     unsigned int i;
> > > > -
> > > >       /* Can be uninitialized if we are aborting on probe error. */
> > > >       if (dev->async_ctrl.work.func)
> > > >               cancel_work_sync(&dev->async_ctrl.work);
> > > > +}
> > >
> > > There may be an opportunity for refactoring, as we have
> > > uvc_status_stop() that stops the work queue, but I think this is good
> > > enough for now. I'm wondering, though, if there could be a race
> > > condition here similar to the one that the recent changes to
> > > uvc_status_stop() have fixed ? As uvc_ctrl_stop_device() is called at
> > > release time I assume that URBs have been cancelled, so there should be
> > > no race, but a second pair of eyeballs to confirm this would be
> > > appreciated.
> > >
> > > Other than that, the patch looks good to me, and fixes an issue
> > > independent from the rest of the series, so
> > >
> > > Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> > >
> > > I will wait for a reply regarding the race condition before queuing this
> > > up though.
> > >
> > > > +
> > > > +void uvc_ctrl_cleanup_device(struct uvc_device *dev)
> > > > +{
> > > > +     struct uvc_entity *entity;
> > > > +     unsigned int i;
> > > >
> > > >       /* Free controls and control mappings for all entities. */
> > > >       list_for_each_entry(entity, &dev->entities, list) {
> > > > diff --git a/drivers/media/usb/uvc/uvc_driver.c b/drivers/media/usb/uvc/uvc_driver.c
> > > > index 7aefa76a42b3..4be6dfeaa295 100644
> > > > --- a/drivers/media/usb/uvc/uvc_driver.c
> > > > +++ b/drivers/media/usb/uvc/uvc_driver.c
> > > > @@ -1893,6 +1893,7 @@ static void uvc_unregister_video(struct uvc_device *dev)
> > > >       }
> > > >
> > > >       uvc_status_unregister(dev);
> > > > +     uvc_ctrl_stop_device(dev);
> > > >
> > > >       if (dev->vdev.dev)
> > > >               v4l2_device_unregister(&dev->vdev);
> > > > diff --git a/drivers/media/usb/uvc/uvcvideo.h b/drivers/media/usb/uvc/uvcvideo.h
> > > > index 9a596c8d894a..50f171e7381b 100644
> > > > --- a/drivers/media/usb/uvc/uvcvideo.h
> > > > +++ b/drivers/media/usb/uvc/uvcvideo.h
> > > > @@ -760,6 +760,7 @@ int uvc_query_v4l2_menu(struct uvc_video_chain *chain,
> > > >  int uvc_ctrl_add_mapping(struct uvc_video_chain *chain,
> > > >                        const struct uvc_control_mapping *mapping);
> > > >  int uvc_ctrl_init_device(struct uvc_device *dev);
> > > > +void uvc_ctrl_stop_device(struct uvc_device *dev);
> > > >  void uvc_ctrl_cleanup_device(struct uvc_device *dev);
> > > >  int uvc_ctrl_restore_values(struct uvc_device *dev);
> > > >  bool uvc_ctrl_status_event_async(struct urb *urb, struct uvc_video_chain *chain,
> > >
> > > --
> > > Regards,
> > >
> > > Laurent Pinchart
> >
> >
> >
> > --
> > Ricardo Ribalda
>
>
>
> --
> Ricardo Ribalda

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-05-17  4:05 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-09 14:44 [PATCH v2 0/3] uvcvideo: Attempt N to land UVC race conditions fixes Ricardo Ribalda
2023-03-09 14:44 ` [PATCH v2 1/3] media: uvcvideo: Cancel async worker earlier Ricardo Ribalda
2023-03-09 14:57   ` Laurent Pinchart
2023-03-28  7:52     ` Ricardo Ribalda
2023-05-01 13:29       ` Ricardo Ribalda
2023-05-17  4:05         ` Tomasz Figa
2023-03-09 14:44 ` [PATCH v2 2/3] media: uvcvideo: Lock video streams and queues while unregistering Ricardo Ribalda
2023-03-09 14:44 ` [PATCH v2 3/3] media: uvcvideo: Release stream queue when unregistering video device Ricardo Ribalda

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.