linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
@ 2019-05-24  3:31 Shuah Khan
  2019-05-24  3:31 ` [PATCH 1/2] media: add generic device allocate interface to media-dev-allocator Shuah Khan
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Shuah Khan @ 2019-05-24  3:31 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, helen.koike
  Cc: Shuah Khan, linux-media, linux-kernel

media_device is embedded in struct vimc_device and when vimc is removed
vimc_device and the embedded media_device goes with it, while the active
stream and vimc_capture continue to access it.

Fix the media_device lifetime problem by changing vimc to create shared
media_device using Media Device Allocator API and vimc_capture getting
a reference to vimc module. With this change, vimc module can be removed
only when the references are gone. vimc can be removed after vimc_capture
is removed.

Media Device Allocator API supports just USB devices. Enhance it
adding a genetic device allocate interface to support other media
drivers.

The new interface takes pointer to struct device instead and creates
media device. This interface allows a group of drivers that have a
common root device to share media device resource and ensure media
device doesn't get deleted as long as one of the drivers holds its
reference.

The new interface has been tested with vimc component driver to fix
panics when vimc module is removed while streaming is in progress.

Shuah Khan (2):
  media: add generic device allocate interface to media-dev-allocator
  vimc: fix BUG: unable to handle kernel NULL pointer dereference

 drivers/media/Makefile                     |  4 +-
 drivers/media/media-dev-allocator.c        | 39 ++++++++++++++
 drivers/media/platform/vimc/vimc-capture.c | 17 +++++-
 drivers/media/platform/vimc/vimc-core.c    | 60 ++++++++++++----------
 include/media/media-dev-allocator.h        | 46 ++++++++++++++---
 5 files changed, 130 insertions(+), 36 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] media: add generic device allocate interface to media-dev-allocator
  2019-05-24  3:31 [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs Shuah Khan
@ 2019-05-24  3:31 ` Shuah Khan
  2019-05-24  3:31 ` [PATCH 2/2] vimc: fix BUG: unable to handle kernel NULL pointer dereference Shuah Khan
  2019-06-13  5:44 ` [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs Hans Verkuil
  2 siblings, 0 replies; 14+ messages in thread
From: Shuah Khan @ 2019-05-24  3:31 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, helen.koike
  Cc: Shuah Khan, linux-media, linux-kernel

Media Device Allocator API supports just USB devices. Enhance it
adding a genetic device allocate interface to support other media
drivers.

The new interface takes pointer to struct device instead and creates
media device. This interface allows a group of drivers that have a
common root device to share media device resource and ensure media
device doesn't get deleted as long as one of the drivers holds its
reference.

The new interface has been tested with vimc component driver to fix
panics when vimc module is removed while streaming is in progress.

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
---
 drivers/media/Makefile              |  4 +--
 drivers/media/media-dev-allocator.c | 39 ++++++++++++++++++++++++
 include/media/media-dev-allocator.h | 46 +++++++++++++++++++++++++----
 3 files changed, 80 insertions(+), 9 deletions(-)

diff --git a/drivers/media/Makefile b/drivers/media/Makefile
index 4a330d0e5e40..3a4f7f74301d 100644
--- a/drivers/media/Makefile
+++ b/drivers/media/Makefile
@@ -7,9 +7,7 @@ media-objs	:= media-device.o media-devnode.o media-entity.o \
 		   media-request.o
 
 ifeq ($(CONFIG_MEDIA_CONTROLLER),y)
-	ifeq ($(CONFIG_USB),y)
-		media-objs += media-dev-allocator.o
-	endif
+	media-objs += media-dev-allocator.o
 endif
 
 #
diff --git a/drivers/media/media-dev-allocator.c b/drivers/media/media-dev-allocator.c
index ae17887dec59..4078e098cede 100644
--- a/drivers/media/media-dev-allocator.c
+++ b/drivers/media/media-dev-allocator.c
@@ -94,6 +94,7 @@ static struct media_device *__media_device_get(struct device *dev,
 	return &mdi->mdev;
 }
 
+#if IS_ENABLED(CONFIG_USB)
 struct media_device *media_device_usb_allocate(struct usb_device *udev,
 					       const char *module_name,
 					       struct module *owner)
@@ -115,6 +116,44 @@ struct media_device *media_device_usb_allocate(struct usb_device *udev,
 	return mdev;
 }
 EXPORT_SYMBOL_GPL(media_device_usb_allocate);
+#endif
+
+struct media_device *media_device_allocate(struct device *dev,
+					   const char *model,
+					   const char *bus_info,
+					   const char *module_name,
+					   struct module *owner)
+{
+	struct media_device *mdev;
+
+	mutex_lock(&media_device_lock);
+	mdev = __media_device_get(dev, module_name, owner);
+	if (!mdev) {
+		mutex_unlock(&media_device_lock);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	if (!mdev->dev) {
+		/* Initialize media device */
+		if (model)
+			strscpy(mdev->model, model, sizeof(mdev->model));
+		else
+			strscpy(mdev->model, "Unkonw model",
+				sizeof(mdev->model));
+		if (bus_info)
+			strscpy(mdev->bus_info, bus_info,
+				sizeof(mdev->bus_info));
+		else
+			strscpy(mdev->bus_info, "Unknown bus_info",
+				sizeof(mdev->bus_info));
+		mdev->dev = dev;
+		media_device_init(mdev);
+	}
+
+	mutex_unlock(&media_device_lock);
+	return mdev;
+}
+EXPORT_SYMBOL_GPL(media_device_allocate);
 
 void media_device_delete(struct media_device *mdev, const char *module_name,
 			 struct module *owner)
diff --git a/include/media/media-dev-allocator.h b/include/media/media-dev-allocator.h
index b35ea6062596..479a3c52cf89 100644
--- a/include/media/media-dev-allocator.h
+++ b/include/media/media-dev-allocator.h
@@ -19,7 +19,8 @@
 
 struct usb_device;
 
-#if defined(CONFIG_MEDIA_CONTROLLER) && defined(CONFIG_USB)
+#if defined(CONFIG_MEDIA_CONTROLLER)
+#if defined(CONFIG_USB)
 /**
  * media_device_usb_allocate() - Allocate and return struct &media device
  *
@@ -38,6 +39,36 @@ struct usb_device;
 struct media_device *media_device_usb_allocate(struct usb_device *udev,
 					       const char *module_name,
 					       struct module *owner);
+#else
+static inline struct media_device *media_device_usb_allocate(
+			struct usb_device *udev, const char *module_name,
+			struct module *owner)
+			{ return NULL; }
+#endif /* CONFIG_USB */
+/**
+ * media_device_allocate() - Allocate and return struct &media device
+ *
+ * @udev:		struct &device pointer
+ * @model:		should be filled with device model name
+ * @bus_info:		should be filled with device bus information:
+ *			Unique and stable device location identifier
+ *			as defined in struct media_device
+ * @module_name:	should be filled with %KBUILD_MODNAME
+ * @owner:		struct module pointer %THIS_MODULE for the driver.
+ *			%THIS_MODULE is null for a built-in driver.
+ *			It is safe even when %THIS_MODULE is null.
+ *
+ * This interface should be called to allocate a Media Device when multiple
+ * drivers/sub-drivers share device and the media device. This interface
+ * allocates &media_device structure and calls media_device_init() to
+ * initialize it.
+ *
+ */
+struct media_device *media_device_allocate(struct device *dev,
+					   const char *model,
+					   const char *bus_info,
+					   const char *module_name,
+					   struct module *owner);
 /**
  * media_device_delete() - Release media device. Calls kref_put().
  *
@@ -52,12 +83,15 @@ struct media_device *media_device_usb_allocate(struct usb_device *udev,
 void media_device_delete(struct media_device *mdev, const char *module_name,
 			 struct module *owner);
 #else
-static inline struct media_device *media_device_usb_allocate(
-			struct usb_device *udev, const char *module_name,
-			struct module *owner)
-			{ return NULL; }
+static inline struct media_device *media_device_allocate(
+					struct device *dev,
+					const char *model,
+					const char *bus_info,
+					const char *module_name,
+					struct module *owner)
+					{ return NULL; }
 static inline void media_device_delete(
 			struct media_device *mdev, const char *module_name,
 			struct module *owner) { }
-#endif /* CONFIG_MEDIA_CONTROLLER && CONFIG_USB */
+#endif /* CONFIG_MEDIA_CONTROLLER */
 #endif /* _MEDIA_DEV_ALLOCATOR_H */
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] vimc: fix BUG: unable to handle kernel NULL pointer dereference
  2019-05-24  3:31 [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs Shuah Khan
  2019-05-24  3:31 ` [PATCH 1/2] media: add generic device allocate interface to media-dev-allocator Shuah Khan
@ 2019-05-24  3:31 ` Shuah Khan
  2019-06-13  5:44 ` [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs Hans Verkuil
  2 siblings, 0 replies; 14+ messages in thread
From: Shuah Khan @ 2019-05-24  3:31 UTC (permalink / raw)
  To: mchehab, hverkuil-cisco, helen.koike
  Cc: Shuah Khan, linux-media, linux-kernel

If vimc module is removed while streaming is active, vimc_exit runs
into NULL pointer dereference error when streaming thread tries to
access and lock graph_mutex in the struct media_device.

media_device is embedded in struct vimc_device and when vimc is removed
vimc_device and the embedded media_device goes with it, while the active
stream and vimc_capture continue to access it.

Fix the media_device lifetime problem by changing vimc to create shared
media_device using Media Device Allocator API and vimc_capture getting
a reference to vimc module. With this change, vimc module can be removed
only when the references are gone. vimc can be removed after vimc_capture
is removed.

The following panic is fixed with this change.

 kernel: [ 1844.098372] Call Trace:
 kernel: [ 1844.098376]  lock_acquire+0xb4/0x160
 kernel: [ 1844.098379]  ? media_pipeline_stop+0x23/0x40
 kernel: [ 1844.098381]  ? media_pipeline_stop+0x23/0x40
 kernel: [ 1844.098385]  __mutex_lock+0x8b/0x950
 kernel: [ 1844.098386]  ? media_pipeline_stop+0x23/0x40
 kernel: [ 1844.098389]  ? wait_for_completion+0xac/0x150
 kernel: [ 1844.098391]  ? wait_for_completion+0x12a/0x150
 kernel: [ 1844.098395]  ? wake_up_q+0x80/0x80
 kernel: [ 1844.098397]  mutex_lock_nested+0x1b/0x20
 kernel: [ 1844.098398]  ? mutex_lock_nested+0x1b/0x20
 kernel: [ 1844.098400]  media_pipeline_stop+0x23/0x40
 kernel: [ 1844.098404]  vimc_cap_stop_streaming+0x28/0x40 [vimc_capture]
 kernel: [ 1844.098406]  __vb2_queue_cancel+0x30/0x2a0
 kernel: [ 1844.098408]  vb2_core_queue_release+0x23/0x50
 kernel: [ 1844.098410]  vb2_queue_release+0xe/0x10
 kernel: [ 1844.098412]  vimc_cap_comp_unbind+0x1d/0x40 [vimc_capture]
 kernel: [ 1844.098414]  component_unbind.isra.8+0x27/0x40
 kernel: [ 1844.098416]  component_unbind_all+0xaa/0xc0
 kernel: [ 1844.098418]  vimc_comp_unbind+0x2d/0x60 [vimc]
 kernel: [ 1844.098420]  take_down_master+0x24/0x40
 kernel: [ 1844.098422]  component_master_del+0x5e/0x80
 kernel: [ 1844.098424]  vimc_remove+0x27/0x90 [vimc]
 kernel: [ 1844.098426]  platform_drv_remove+0x28/0x50
 kernel: [ 1844.098428]  device_release_driver_internal+0xec/0x1c0
 kernel: [ 1844.098429]  driver_detach+0x49/0x90
 kernel: [ 1844.098432]  bus_remove_driver+0x5c/0xd0
 kernel: [ 1844.098433]  driver_unregister+0x2c/0x40
 kernel: [ 1844.098435]  platform_driver_unregister+0x12/0x20
 kernel: [ 1844.098437]  vimc_exit+0x10/0x9fa [vimc]
 kernel: [ 1844.098439]  __x64_sys_delete_module+0x148/0x280
 kernel: [ 1844.098443]  do_syscall_64+0x5a/0x120
 kernel: [ 1844.098446]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
---
 drivers/media/platform/vimc/vimc-capture.c | 17 +++++-
 drivers/media/platform/vimc/vimc-core.c    | 60 ++++++++++++----------
 2 files changed, 50 insertions(+), 27 deletions(-)

diff --git a/drivers/media/platform/vimc/vimc-capture.c b/drivers/media/platform/vimc/vimc-capture.c
index e7d0fc2228a6..2e5cf794f60f 100644
--- a/drivers/media/platform/vimc/vimc-capture.c
+++ b/drivers/media/platform/vimc/vimc-capture.c
@@ -22,6 +22,7 @@
 #include <media/v4l2-ioctl.h>
 #include <media/videobuf2-core.h>
 #include <media/videobuf2-vmalloc.h>
+#include <media/media-dev-allocator.h>
 
 #include "vimc-common.h"
 #include "vimc-streamer.h"
@@ -72,6 +73,8 @@ struct vimc_cap_device {
 	struct mutex lock;
 	u32 sequence;
 	struct vimc_stream stream;
+	/* Used for holding reference to media dev allocated by vimc-core */
+	struct media_device *mdev;
 };
 
 static const struct v4l2_pix_format fmt_default = {
@@ -371,6 +374,7 @@ static void vimc_cap_release(struct video_device *vdev)
 		container_of(vdev, struct vimc_cap_device, vdev);
 
 	vimc_pads_cleanup(vcap->ved.pads);
+	media_device_delete(vcap->mdev, VIMC_CAP_DRV_NAME, THIS_MODULE);
 	kfree(vcap);
 }
 
@@ -440,12 +444,21 @@ static int vimc_cap_comp_bind(struct device *comp, struct device *master,
 	if (!vcap)
 		return -ENOMEM;
 
+	/* first get reference to media device allocated by vimc */
+	vcap->mdev = media_device_allocate(master, NULL, NULL,
+					   VIMC_CAP_DRV_NAME,
+					   THIS_MODULE);
+	if (!vcap->mdev) {
+		ret = PTR_ERR(vcap->mdev);
+		goto err_free_vcap;
+	}
+
 	/* Allocate the pads */
 	vcap->ved.pads =
 		vimc_pads_init(1, (const unsigned long[1]) {MEDIA_PAD_FL_SINK});
 	if (IS_ERR(vcap->ved.pads)) {
 		ret = PTR_ERR(vcap->ved.pads);
-		goto err_free_vcap;
+		goto err_mdev_rls_ref;
 	}
 
 	/* Initialize the media entity */
@@ -524,6 +537,8 @@ static int vimc_cap_comp_bind(struct device *comp, struct device *master,
 	media_entity_cleanup(&vcap->vdev.entity);
 err_clean_pads:
 	vimc_pads_cleanup(vcap->ved.pads);
+err_mdev_rls_ref:
+	media_device_delete(vcap->mdev, VIMC_CAP_DRV_NAME, THIS_MODULE);
 err_free_vcap:
 	kfree(vcap);
 
diff --git a/drivers/media/platform/vimc/vimc-core.c b/drivers/media/platform/vimc/vimc-core.c
index 3aa62d7e3d0e..d381993f3d7e 100644
--- a/drivers/media/platform/vimc/vimc-core.c
+++ b/drivers/media/platform/vimc/vimc-core.c
@@ -20,6 +20,7 @@
 #include <linux/module.h>
 #include <linux/platform_device.h>
 #include <media/media-device.h>
+#include <media/media-dev-allocator.h>
 #include <media/v4l2-device.h>
 
 #include "vimc-common.h"
@@ -42,7 +43,7 @@ struct vimc_device {
 	const struct vimc_pipeline_config *pipe_cfg;
 
 	/* The Associated media_device parent */
-	struct media_device mdev;
+	struct media_device *mdev;
 
 	/* Internal v4l2 parent device*/
 	struct v4l2_device v4l2_dev;
@@ -182,9 +183,9 @@ static int vimc_comp_bind(struct device *master)
 	dev_dbg(master, "bind");
 
 	/* Register the v4l2 struct */
-	ret = v4l2_device_register(vimc->mdev.dev, &vimc->v4l2_dev);
+	ret = v4l2_device_register(vimc->mdev->dev, &vimc->v4l2_dev);
 	if (ret) {
-		dev_err(vimc->mdev.dev,
+		dev_err(vimc->mdev->dev,
 			"v4l2 device register failed (err=%d)\n", ret);
 		return ret;
 	}
@@ -200,9 +201,9 @@ static int vimc_comp_bind(struct device *master)
 		goto err_comp_unbind_all;
 
 	/* Register the media device */
-	ret = media_device_register(&vimc->mdev);
+	ret = media_device_register(vimc->mdev);
 	if (ret) {
-		dev_err(vimc->mdev.dev,
+		dev_err(vimc->mdev->dev,
 			"media device register failed (err=%d)\n", ret);
 		goto err_comp_unbind_all;
 	}
@@ -210,7 +211,7 @@ static int vimc_comp_bind(struct device *master)
 	/* Expose all subdev's nodes*/
 	ret = v4l2_device_register_subdev_nodes(&vimc->v4l2_dev);
 	if (ret) {
-		dev_err(vimc->mdev.dev,
+		dev_err(vimc->mdev->dev,
 			"vimc subdev nodes registration failed (err=%d)\n",
 			ret);
 		goto err_mdev_unregister;
@@ -219,8 +220,7 @@ static int vimc_comp_bind(struct device *master)
 	return 0;
 
 err_mdev_unregister:
-	media_device_unregister(&vimc->mdev);
-	media_device_cleanup(&vimc->mdev);
+	media_device_delete(vimc->mdev, VIMC_PDEV_NAME, THIS_MODULE);
 err_comp_unbind_all:
 	component_unbind_all(master, NULL);
 err_v4l2_unregister:
@@ -236,8 +236,7 @@ static void vimc_comp_unbind(struct device *master)
 
 	dev_dbg(master, "unbind");
 
-	media_device_unregister(&vimc->mdev);
-	media_device_cleanup(&vimc->mdev);
+	media_device_delete(vimc->mdev, VIMC_PDEV_NAME, THIS_MODULE);
 	component_unbind_all(master, NULL);
 	v4l2_device_unregister(&vimc->v4l2_dev);
 }
@@ -301,42 +300,51 @@ static int vimc_probe(struct platform_device *pdev)
 	struct vimc_device *vimc = container_of(pdev, struct vimc_device, pdev);
 	struct component_match *match = NULL;
 	int ret;
+	char bus_info[32]; /* same size as struct media_device bus_info */
 
 	dev_dbg(&pdev->dev, "probe");
 
-	memset(&vimc->mdev, 0, sizeof(vimc->mdev));
+	/* Initialize media device */
+	snprintf(bus_info, sizeof(bus_info), "platform:%s", VIMC_PDEV_NAME);
+	vimc->mdev = media_device_allocate(&pdev->dev,
+					   VIMC_MDEV_MODEL_NAME,
+					   bus_info,
+					   VIMC_PDEV_NAME,
+					   THIS_MODULE);
+	if (!vimc->mdev)
+		return -ENOMEM;
 
 	/* Create platform_device for each entity in the topology*/
 	vimc->subdevs = devm_kcalloc(&vimc->pdev.dev, vimc->pipe_cfg->num_ents,
 				     sizeof(*vimc->subdevs), GFP_KERNEL);
-	if (!vimc->subdevs)
-		return -ENOMEM;
+	if (!vimc->subdevs) {
+		ret = -ENOMEM;
+		goto err_delete_media_device;
+	}
 
 	match = vimc_add_subdevs(vimc);
-	if (IS_ERR(match))
-		return PTR_ERR(match);
+	if (IS_ERR(match)) {
+		ret = PTR_ERR(match);
+		goto err_delete_media_device;
+	}
 
 	/* Link the media device within the v4l2_device */
-	vimc->v4l2_dev.mdev = &vimc->mdev;
-
-	/* Initialize media device */
-	strscpy(vimc->mdev.model, VIMC_MDEV_MODEL_NAME,
-		sizeof(vimc->mdev.model));
-	snprintf(vimc->mdev.bus_info, sizeof(vimc->mdev.bus_info),
-		 "platform:%s", VIMC_PDEV_NAME);
-	vimc->mdev.dev = &pdev->dev;
-	media_device_init(&vimc->mdev);
+	vimc->v4l2_dev.mdev = vimc->mdev;
 
 	/* Add self to the component system */
 	ret = component_master_add_with_match(&pdev->dev, &vimc_comp_ops,
 					      match);
 	if (ret) {
-		media_device_cleanup(&vimc->mdev);
 		vimc_rm_subdevs(vimc);
-		return ret;
+		goto err_delete_media_device;
 	}
 
 	return 0;
+
+err_delete_media_device:
+	media_device_delete(vimc->mdev, VIMC_PDEV_NAME, THIS_MODULE);
+
+	return ret;
 }
 
 static int vimc_remove(struct platform_device *pdev)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
  2019-05-24  3:31 [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs Shuah Khan
  2019-05-24  3:31 ` [PATCH 1/2] media: add generic device allocate interface to media-dev-allocator Shuah Khan
  2019-05-24  3:31 ` [PATCH 2/2] vimc: fix BUG: unable to handle kernel NULL pointer dereference Shuah Khan
@ 2019-06-13  5:44 ` Hans Verkuil
  2019-06-13 13:24   ` Helen Koike
  2019-06-16 18:43   ` Laurent Pinchart
  2 siblings, 2 replies; 14+ messages in thread
From: Hans Verkuil @ 2019-06-13  5:44 UTC (permalink / raw)
  To: helen.koike; +Cc: Shuah Khan, linux-media, linux-kernel

On 5/24/19 5:31 AM, Shuah Khan wrote:
> media_device is embedded in struct vimc_device and when vimc is removed
> vimc_device and the embedded media_device goes with it, while the active
> stream and vimc_capture continue to access it.
> 
> Fix the media_device lifetime problem by changing vimc to create shared
> media_device using Media Device Allocator API and vimc_capture getting
> a reference to vimc module. With this change, vimc module can be removed
> only when the references are gone. vimc can be removed after vimc_capture
> is removed.
> 
> Media Device Allocator API supports just USB devices. Enhance it
> adding a genetic device allocate interface to support other media
> drivers.
> 
> The new interface takes pointer to struct device instead and creates
> media device. This interface allows a group of drivers that have a
> common root device to share media device resource and ensure media
> device doesn't get deleted as long as one of the drivers holds its
> reference.
> 
> The new interface has been tested with vimc component driver to fix
> panics when vimc module is removed while streaming is in progress.

Helen, can you review this series? I'm not sure this is the right approach
for a driver like vimc, and even if it is, then it is odd that vimc-capture
is the only vimc module that's handled here.

My gut feeling is that this should be handled inside vimc directly and not
using the media-dev-allocator.

Regards,

	Hans

> 
> Shuah Khan (2):
>   media: add generic device allocate interface to media-dev-allocator
>   vimc: fix BUG: unable to handle kernel NULL pointer dereference
> 
>  drivers/media/Makefile                     |  4 +-
>  drivers/media/media-dev-allocator.c        | 39 ++++++++++++++
>  drivers/media/platform/vimc/vimc-capture.c | 17 +++++-
>  drivers/media/platform/vimc/vimc-core.c    | 60 ++++++++++++----------
>  include/media/media-dev-allocator.h        | 46 ++++++++++++++---
>  5 files changed, 130 insertions(+), 36 deletions(-)
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
  2019-06-13  5:44 ` [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs Hans Verkuil
@ 2019-06-13 13:24   ` Helen Koike
  2019-06-14 23:26     ` Shuah Khan
  2019-06-16 18:43   ` Laurent Pinchart
  1 sibling, 1 reply; 14+ messages in thread
From: Helen Koike @ 2019-06-13 13:24 UTC (permalink / raw)
  To: Hans Verkuil; +Cc: Shuah Khan, linux-media, linux-kernel



On 6/13/19 2:44 AM, Hans Verkuil wrote:
> On 5/24/19 5:31 AM, Shuah Khan wrote:
>> media_device is embedded in struct vimc_device and when vimc is removed
>> vimc_device and the embedded media_device goes with it, while the active
>> stream and vimc_capture continue to access it.
>>
>> Fix the media_device lifetime problem by changing vimc to create shared
>> media_device using Media Device Allocator API and vimc_capture getting
>> a reference to vimc module. With this change, vimc module can be removed
>> only when the references are gone. vimc can be removed after vimc_capture
>> is removed.
>>
>> Media Device Allocator API supports just USB devices. Enhance it
>> adding a genetic device allocate interface to support other media
>> drivers.
>>
>> The new interface takes pointer to struct device instead and creates
>> media device. This interface allows a group of drivers that have a
>> common root device to share media device resource and ensure media
>> device doesn't get deleted as long as one of the drivers holds its
>> reference.
>>
>> The new interface has been tested with vimc component driver to fix
>> panics when vimc module is removed while streaming is in progress.
> 
> Helen, can you review this series? I'm not sure this is the right approach
> for a driver like vimc, and even if it is, then it is odd that vimc-capture
> is the only vimc module that's handled here.

Hi Hans,

Yes, I can take a look. Sorry, I've been a bit busy these days but I'll
try to take a look at this patch series (and the others) asap.

Helen

> 
> My gut feeling is that this should be handled inside vimc directly and not
> using the media-dev-allocator.
> 
> Regards,
> 
> 	Hans
> 
>>
>> Shuah Khan (2):
>>   media: add generic device allocate interface to media-dev-allocator
>>   vimc: fix BUG: unable to handle kernel NULL pointer dereference
>>
>>  drivers/media/Makefile                     |  4 +-
>>  drivers/media/media-dev-allocator.c        | 39 ++++++++++++++
>>  drivers/media/platform/vimc/vimc-capture.c | 17 +++++-
>>  drivers/media/platform/vimc/vimc-core.c    | 60 ++++++++++++----------
>>  include/media/media-dev-allocator.h        | 46 ++++++++++++++---
>>  5 files changed, 130 insertions(+), 36 deletions(-)
>>
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
  2019-06-13 13:24   ` Helen Koike
@ 2019-06-14 23:26     ` Shuah Khan
  2019-06-16 18:45       ` Laurent Pinchart
  0 siblings, 1 reply; 14+ messages in thread
From: Shuah Khan @ 2019-06-14 23:26 UTC (permalink / raw)
  To: Helen Koike, Hans Verkuil; +Cc: linux-media, linux-kernel, Shuah Khan

On 6/13/19 7:24 AM, Helen Koike wrote:
> 
> 
> On 6/13/19 2:44 AM, Hans Verkuil wrote:
>> On 5/24/19 5:31 AM, Shuah Khan wrote:
>>> media_device is embedded in struct vimc_device and when vimc is removed
>>> vimc_device and the embedded media_device goes with it, while the active
>>> stream and vimc_capture continue to access it.
>>>
>>> Fix the media_device lifetime problem by changing vimc to create shared
>>> media_device using Media Device Allocator API and vimc_capture getting
>>> a reference to vimc module. With this change, vimc module can be removed
>>> only when the references are gone. vimc can be removed after vimc_capture
>>> is removed.
>>>
>>> Media Device Allocator API supports just USB devices. Enhance it
>>> adding a genetic device allocate interface to support other media
>>> drivers.
>>>
>>> The new interface takes pointer to struct device instead and creates
>>> media device. This interface allows a group of drivers that have a
>>> common root device to share media device resource and ensure media
>>> device doesn't get deleted as long as one of the drivers holds its
>>> reference.
>>>
>>> The new interface has been tested with vimc component driver to fix
>>> panics when vimc module is removed while streaming is in progress.
>>
>> Helen, can you review this series? I'm not sure this is the right approach
>> for a driver like vimc, and even if it is, then it is odd that vimc-capture
>> is the only vimc module that's handled here.
> 
> Hi Hans,
> 
> Yes, I can take a look. Sorry, I've been a bit busy these days but I'll
> try to take a look at this patch series (and the others) asap.
> 
> Helen
> 
>>
>> My gut feeling is that this should be handled inside vimc directly and not
>> using the media-dev-allocator.
>>

Hi Hans and Helen,

I explored fixing the problem within vimc before I went down the path to
use Media Device Allocator API. I do think that it is cleaner to go this
way and easier to maintain.

vimc is a group pf component drivers that rely on the media device vimc
in vimc and falls into the use-case Media Device Allocator API is added
to address. The release and life-time management happens without vimc
component drivers being changed other than using the API to get and put
media device reference.

thanks,
-- Shuah



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
  2019-06-13  5:44 ` [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs Hans Verkuil
  2019-06-13 13:24   ` Helen Koike
@ 2019-06-16 18:43   ` Laurent Pinchart
  1 sibling, 0 replies; 14+ messages in thread
From: Laurent Pinchart @ 2019-06-16 18:43 UTC (permalink / raw)
  To: Hans Verkuil; +Cc: helen.koike, Shuah Khan, linux-media, linux-kernel

Hi Hans,

On Thu, Jun 13, 2019 at 07:44:15AM +0200, Hans Verkuil wrote:
> On 5/24/19 5:31 AM, Shuah Khan wrote:
> > media_device is embedded in struct vimc_device and when vimc is removed
> > vimc_device and the embedded media_device goes with it, while the active
> > stream and vimc_capture continue to access it.
> > 
> > Fix the media_device lifetime problem by changing vimc to create shared
> > media_device using Media Device Allocator API and vimc_capture getting
> > a reference to vimc module. With this change, vimc module can be removed
> > only when the references are gone. vimc can be removed after vimc_capture
> > is removed.
> > 
> > Media Device Allocator API supports just USB devices. Enhance it
> > adding a genetic device allocate interface to support other media
> > drivers.
> > 
> > The new interface takes pointer to struct device instead and creates
> > media device. This interface allows a group of drivers that have a
> > common root device to share media device resource and ensure media
> > device doesn't get deleted as long as one of the drivers holds its
> > reference.
> > 
> > The new interface has been tested with vimc component driver to fix
> > panics when vimc module is removed while streaming is in progress.
> 
> Helen, can you review this series? I'm not sure this is the right approach
> for a driver like vimc, and even if it is, then it is odd that vimc-capture
> is the only vimc module that's handled here.
> 
> My gut feeling is that this should be handled inside vimc directly and not
> using the media-dev-allocator.

That's my opinion too, I don't see why the media dev allocator would be
needed here. We know how to fix this kind of issues, it requires proper
lifetime management of objects with refcouting, using the media dev
allocator is a hack that would just hide the problem deeper. Hiding
problems deeper during the last MC rework brought us where we are today,
and I don't think we should repeat the same mistake.

> > Shuah Khan (2):
> >   media: add generic device allocate interface to media-dev-allocator
> >   vimc: fix BUG: unable to handle kernel NULL pointer dereference
> > 
> >  drivers/media/Makefile                     |  4 +-
> >  drivers/media/media-dev-allocator.c        | 39 ++++++++++++++
> >  drivers/media/platform/vimc/vimc-capture.c | 17 +++++-
> >  drivers/media/platform/vimc/vimc-core.c    | 60 ++++++++++++----------
> >  include/media/media-dev-allocator.h        | 46 ++++++++++++++---
> >  5 files changed, 130 insertions(+), 36 deletions(-)

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
  2019-06-14 23:26     ` Shuah Khan
@ 2019-06-16 18:45       ` Laurent Pinchart
  2019-06-28 16:41         ` Shuah Khan
  0 siblings, 1 reply; 14+ messages in thread
From: Laurent Pinchart @ 2019-06-16 18:45 UTC (permalink / raw)
  To: Shuah Khan; +Cc: Helen Koike, Hans Verkuil, linux-media, linux-kernel

Hi Shuah,

On Fri, Jun 14, 2019 at 05:26:46PM -0600, Shuah Khan wrote:
> On 6/13/19 7:24 AM, Helen Koike wrote:
> > On 6/13/19 2:44 AM, Hans Verkuil wrote:
> >> On 5/24/19 5:31 AM, Shuah Khan wrote:
> >>> media_device is embedded in struct vimc_device and when vimc is removed
> >>> vimc_device and the embedded media_device goes with it, while the active
> >>> stream and vimc_capture continue to access it.
> >>>
> >>> Fix the media_device lifetime problem by changing vimc to create shared
> >>> media_device using Media Device Allocator API and vimc_capture getting
> >>> a reference to vimc module. With this change, vimc module can be removed
> >>> only when the references are gone. vimc can be removed after vimc_capture
> >>> is removed.
> >>>
> >>> Media Device Allocator API supports just USB devices. Enhance it
> >>> adding a genetic device allocate interface to support other media
> >>> drivers.
> >>>
> >>> The new interface takes pointer to struct device instead and creates
> >>> media device. This interface allows a group of drivers that have a
> >>> common root device to share media device resource and ensure media
> >>> device doesn't get deleted as long as one of the drivers holds its
> >>> reference.
> >>>
> >>> The new interface has been tested with vimc component driver to fix
> >>> panics when vimc module is removed while streaming is in progress.
> >>
> >> Helen, can you review this series? I'm not sure this is the right approach
> >> for a driver like vimc, and even if it is, then it is odd that vimc-capture
> >> is the only vimc module that's handled here.
> > 
> > Hi Hans,
> > 
> > Yes, I can take a look. Sorry, I've been a bit busy these days but I'll
> > try to take a look at this patch series (and the others) asap.
> > 
> > Helen
> > 
> >> My gut feeling is that this should be handled inside vimc directly and not
> >> using the media-dev-allocator.
> 
> Hi Hans and Helen,
> 
> I explored fixing the problem within vimc before I went down the path to
> use Media Device Allocator API. I do think that it is cleaner to go this
> way and easier to maintain.
> 
> vimc is a group pf component drivers that rely on the media device vimc
> in vimc and falls into the use-case Media Device Allocator API is added
> to address. The release and life-time management happens without vimc
> component drivers being changed other than using the API to get and put
> media device reference.

Our replies crossed each other, please see my reply to Hans. I would
just like to comment here that if having multiple kernel modules causes
issue, they can all be merged together. There's no need for vimc to be
handled through multiple modules (I actually think it's quite
counterproductive, it only makes it more complex, for no added value).

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
  2019-06-16 18:45       ` Laurent Pinchart
@ 2019-06-28 16:41         ` Shuah Khan
  2019-06-30 11:41           ` Laurent Pinchart
  0 siblings, 1 reply; 14+ messages in thread
From: Shuah Khan @ 2019-06-28 16:41 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Helen Koike, Hans Verkuil, linux-media, linux-kernel, Shuah Khan

Hi Laurent,

On 6/16/19 12:45 PM, Laurent Pinchart wrote:
> Hi Shuah,
> 
> On Fri, Jun 14, 2019 at 05:26:46PM -0600, Shuah Khan wrote:
>> On 6/13/19 7:24 AM, Helen Koike wrote:
>>> On 6/13/19 2:44 AM, Hans Verkuil wrote:
>>>> On 5/24/19 5:31 AM, Shuah Khan wrote:
>>>>> media_device is embedded in struct vimc_device and when vimc is removed
>>>>> vimc_device and the embedded media_device goes with it, while the active
>>>>> stream and vimc_capture continue to access it.
>>>>>
>>>>> Fix the media_device lifetime problem by changing vimc to create shared
>>>>> media_device using Media Device Allocator API and vimc_capture getting
>>>>> a reference to vimc module. With this change, vimc module can be removed
>>>>> only when the references are gone. vimc can be removed after vimc_capture
>>>>> is removed.
>>>>>
>>>>> Media Device Allocator API supports just USB devices. Enhance it
>>>>> adding a genetic device allocate interface to support other media
>>>>> drivers.
>>>>>
>>>>> The new interface takes pointer to struct device instead and creates
>>>>> media device. This interface allows a group of drivers that have a
>>>>> common root device to share media device resource and ensure media
>>>>> device doesn't get deleted as long as one of the drivers holds its
>>>>> reference.
>>>>>
>>>>> The new interface has been tested with vimc component driver to fix
>>>>> panics when vimc module is removed while streaming is in progress.
>>>>
>>>> Helen, can you review this series? I'm not sure this is the right approach
>>>> for a driver like vimc, and even if it is, then it is odd that vimc-capture
>>>> is the only vimc module that's handled here.
>>>
>>> Hi Hans,
>>>
>>> Yes, I can take a look. Sorry, I've been a bit busy these days but I'll
>>> try to take a look at this patch series (and the others) asap.
>>>
>>> Helen
>>>
>>>> My gut feeling is that this should be handled inside vimc directly and not
>>>> using the media-dev-allocator.
>>
>> Hi Hans and Helen,
>>
>> I explored fixing the problem within vimc before I went down the path to
>> use Media Device Allocator API. I do think that it is cleaner to go this
>> way and easier to maintain.
>>
>> vimc is a group pf component drivers that rely on the media device vimc
>> in vimc and falls into the use-case Media Device Allocator API is added
>> to address. The release and life-time management happens without vimc
>> component drivers being changed other than using the API to get and put
>> media device reference.
> 
> Our replies crossed each other, please see my reply to Hans. I would
> just like to comment here that if having multiple kernel modules causes
> issue, they can all be merged together. There's no need for vimc to be
> handled through multiple modules (I actually think it's quite
> counterproductive, it only makes it more complex, for no added value).
> 

There are several problems in this group of drivers as far as lifetime
management is concerned. I explained some of it in the patch 2/2

If vimc module is removed while streaming is active, vimc_exit runs
into NULL pointer dereference error when streaming thread tries to
access and lock graph_mutex in the struct media_device.

The primary reason for this is that:

media_device is embedded in struct vimc_device and when vimc is removed
vimc_device and the embedded media_device goes with it, while the active
stream and vimc_capture continue to access it.

If we chose to keep these drivers as component drivers, media device
needs to stick around until all components stop using it. This is tricky
because there is no tie between these set of drivers. vimc module can
be deleted while others are still active. As vimc gets removed, other
component drivers start wanting to access the media device tree.

This is classic media device lifetime problem which could be solved
easily with the way I solved it with this series. I saw this as a
variation on the same use-case we had with sound and media drivers
sharing the media device.

I have a TODO request from you asking to extend Media Device Allocator
API to generic case and not restrict it to USB devices. My thinking is
that this gives a perfect test case to extend the API to be generic
and use to solve this problem.

Collapsing the drivers into one might be lot more difficult and complex
than solving this problem with Media Device Allocator API. This approach
has an added benefit of extending the API to be generic and not just for
USB.

I looked at this as a good way to add generic API and have a great test
case for it. This patch series fixes the problem for the current vimc
architecture.

thanks,
-- Shuah





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
  2019-06-28 16:41         ` Shuah Khan
@ 2019-06-30 11:41           ` Laurent Pinchart
  2019-07-03 16:17             ` Niklas Söderlund
  0 siblings, 1 reply; 14+ messages in thread
From: Laurent Pinchart @ 2019-06-30 11:41 UTC (permalink / raw)
  To: Shuah Khan; +Cc: Helen Koike, Hans Verkuil, linux-media, linux-kernel

Hi Shuah,

On Fri, Jun 28, 2019 at 10:41:07AM -0600, Shuah Khan wrote:
> On 6/16/19 12:45 PM, Laurent Pinchart wrote:
> > On Fri, Jun 14, 2019 at 05:26:46PM -0600, Shuah Khan wrote:
> >> On 6/13/19 7:24 AM, Helen Koike wrote:
> >>> On 6/13/19 2:44 AM, Hans Verkuil wrote:
> >>>> On 5/24/19 5:31 AM, Shuah Khan wrote:
> >>>>> media_device is embedded in struct vimc_device and when vimc is removed
> >>>>> vimc_device and the embedded media_device goes with it, while the active
> >>>>> stream and vimc_capture continue to access it.
> >>>>>
> >>>>> Fix the media_device lifetime problem by changing vimc to create shared
> >>>>> media_device using Media Device Allocator API and vimc_capture getting
> >>>>> a reference to vimc module. With this change, vimc module can be removed
> >>>>> only when the references are gone. vimc can be removed after vimc_capture
> >>>>> is removed.
> >>>>>
> >>>>> Media Device Allocator API supports just USB devices. Enhance it
> >>>>> adding a genetic device allocate interface to support other media
> >>>>> drivers.
> >>>>>
> >>>>> The new interface takes pointer to struct device instead and creates
> >>>>> media device. This interface allows a group of drivers that have a
> >>>>> common root device to share media device resource and ensure media
> >>>>> device doesn't get deleted as long as one of the drivers holds its
> >>>>> reference.
> >>>>>
> >>>>> The new interface has been tested with vimc component driver to fix
> >>>>> panics when vimc module is removed while streaming is in progress.
> >>>>
> >>>> Helen, can you review this series? I'm not sure this is the right approach
> >>>> for a driver like vimc, and even if it is, then it is odd that vimc-capture
> >>>> is the only vimc module that's handled here.
> >>>
> >>> Hi Hans,
> >>>
> >>> Yes, I can take a look. Sorry, I've been a bit busy these days but I'll
> >>> try to take a look at this patch series (and the others) asap.
> >>>
> >>> Helen
> >>>
> >>>> My gut feeling is that this should be handled inside vimc directly and not
> >>>> using the media-dev-allocator.
> >>
> >> Hi Hans and Helen,
> >>
> >> I explored fixing the problem within vimc before I went down the path to
> >> use Media Device Allocator API. I do think that it is cleaner to go this
> >> way and easier to maintain.
> >>
> >> vimc is a group pf component drivers that rely on the media device vimc
> >> in vimc and falls into the use-case Media Device Allocator API is added
> >> to address. The release and life-time management happens without vimc
> >> component drivers being changed other than using the API to get and put
> >> media device reference.
> > 
> > Our replies crossed each other, please see my reply to Hans. I would
> > just like to comment here that if having multiple kernel modules causes
> > issue, they can all be merged together. There's no need for vimc to be
> > handled through multiple modules (I actually think it's quite
> > counterproductive, it only makes it more complex, for no added value).
> 
> There are several problems in this group of drivers as far as lifetime
> management is concerned. I explained some of it in the patch 2/2
> 
> If vimc module is removed while streaming is active, vimc_exit runs
> into NULL pointer dereference error when streaming thread tries to
> access and lock graph_mutex in the struct media_device.
> 
> The primary reason for this is that:
> 
> media_device is embedded in struct vimc_device and when vimc is removed
> vimc_device and the embedded media_device goes with it, while the active
> stream and vimc_capture continue to access it.

The issue isn't so much that media_devic is embedded in vimc_device, but
that vimc_device is released too early. Not only does the thread need to
access the graph_mutex lock in the media_device structure, but it can
potentially access fields of the device-specific structures as well. The
proper solution is to propagate media_device_release() one level up, in
order to only release the top-level structure containing media_device
when the last reference to the media_device is dropped.

> If we chose to keep these drivers as component drivers, media device
> needs to stick around until all components stop using it. This is tricky
> because there is no tie between these set of drivers. vimc module can
> be deleted while others are still active. As vimc gets removed, other
> component drivers start wanting to access the media device tree.

Reference-counting is the key.

> This is classic media device lifetime problem which could be solved
> easily with the way I solved it with this series. I saw this as a
> variation on the same use-case we had with sound and media drivers
> sharing the media device.

This isn't about solving it easily, it's about solving it properly. The
media device allocator as used here is a hack and takes us in the
opposite direction of a proper fix.

> I have a TODO request from you asking to extend Media Device Allocator
> API to generic case and not restrict it to USB devices. My thinking is
> that this gives a perfect test case to extend the API to be generic
> and use to solve this problem.

The biggest issue at the moment with the media device allocator, which I
have pointed out numerous times and has never been addressed (and which
explains why I didn't think the code was ready to be merged) is that the
media_device contains operations that are based on having a single
driver controlling the media device. A proper shared media device
allocator needs to drop the concept of a single master for the media
device, and thus needs to refactor those operations to allow any user of
the media device to implement them (the .link_notify() operation is a
prime example, and the recently added request operations will make this
even more challenging - think of how this patch series would prevent
vimc from properly implementing the request API). As long as these issue
are not fixed I will be firmly opposed to spreading the usage of the
media device allocator beyond what exists today.

> Collapsing the drivers into one might be lot more difficult and complex
> than solving this problem with Media Device Allocator API. This approach
> has an added benefit of extending the API to be generic and not just for
> USB.

I've never disputed the fact that fixing a problem correctly is usually
more work than hacking around it :-)

> I looked at this as a good way to add generic API and have a great test
> case for it. This patch series fixes the problem for the current vimc
> architecture.

NAK, for the reasons above. Please drop this series and fix the problem
properly.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
  2019-06-30 11:41           ` Laurent Pinchart
@ 2019-07-03 16:17             ` Niklas Söderlund
  2019-07-03 16:52               ` Shuah Khan
  0 siblings, 1 reply; 14+ messages in thread
From: Niklas Söderlund @ 2019-07-03 16:17 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Shuah Khan, Helen Koike, Hans Verkuil, linux-media, linux-kernel

Hi Shauah, Laurent,

On 2019-06-30 14:41:02 +0300, Laurent Pinchart wrote:
> Hi Shuah,
> 
> On Fri, Jun 28, 2019 at 10:41:07AM -0600, Shuah Khan wrote:
> > On 6/16/19 12:45 PM, Laurent Pinchart wrote:
> > > On Fri, Jun 14, 2019 at 05:26:46PM -0600, Shuah Khan wrote:
> > >> On 6/13/19 7:24 AM, Helen Koike wrote:
> > >>> On 6/13/19 2:44 AM, Hans Verkuil wrote:
> > >>>> On 5/24/19 5:31 AM, Shuah Khan wrote:
> > >>>>> media_device is embedded in struct vimc_device and when vimc is removed
> > >>>>> vimc_device and the embedded media_device goes with it, while the active
> > >>>>> stream and vimc_capture continue to access it.
> > >>>>>
> > >>>>> Fix the media_device lifetime problem by changing vimc to create shared
> > >>>>> media_device using Media Device Allocator API and vimc_capture getting
> > >>>>> a reference to vimc module. With this change, vimc module can be removed
> > >>>>> only when the references are gone. vimc can be removed after vimc_capture
> > >>>>> is removed.
> > >>>>>
> > >>>>> Media Device Allocator API supports just USB devices. Enhance it
> > >>>>> adding a genetic device allocate interface to support other media
> > >>>>> drivers.
> > >>>>>
> > >>>>> The new interface takes pointer to struct device instead and creates
> > >>>>> media device. This interface allows a group of drivers that have a
> > >>>>> common root device to share media device resource and ensure media
> > >>>>> device doesn't get deleted as long as one of the drivers holds its
> > >>>>> reference.
> > >>>>>
> > >>>>> The new interface has been tested with vimc component driver to fix
> > >>>>> panics when vimc module is removed while streaming is in progress.
> > >>>>
> > >>>> Helen, can you review this series? I'm not sure this is the right approach
> > >>>> for a driver like vimc, and even if it is, then it is odd that vimc-capture
> > >>>> is the only vimc module that's handled here.
> > >>>
> > >>> Hi Hans,
> > >>>
> > >>> Yes, I can take a look. Sorry, I've been a bit busy these days but I'll
> > >>> try to take a look at this patch series (and the others) asap.
> > >>>
> > >>> Helen
> > >>>
> > >>>> My gut feeling is that this should be handled inside vimc directly and not
> > >>>> using the media-dev-allocator.
> > >>
> > >> Hi Hans and Helen,
> > >>
> > >> I explored fixing the problem within vimc before I went down the path to
> > >> use Media Device Allocator API. I do think that it is cleaner to go this
> > >> way and easier to maintain.
> > >>
> > >> vimc is a group pf component drivers that rely on the media device vimc
> > >> in vimc and falls into the use-case Media Device Allocator API is added
> > >> to address. The release and life-time management happens without vimc
> > >> component drivers being changed other than using the API to get and put
> > >> media device reference.
> > > 
> > > Our replies crossed each other, please see my reply to Hans. I would
> > > just like to comment here that if having multiple kernel modules causes
> > > issue, they can all be merged together. There's no need for vimc to be
> > > handled through multiple modules (I actually think it's quite
> > > counterproductive, it only makes it more complex, for no added value).
> > 
> > There are several problems in this group of drivers as far as lifetime
> > management is concerned. I explained some of it in the patch 2/2
> > 
> > If vimc module is removed while streaming is active, vimc_exit runs
> > into NULL pointer dereference error when streaming thread tries to
> > access and lock graph_mutex in the struct media_device.
> > 
> > The primary reason for this is that:
> > 
> > media_device is embedded in struct vimc_device and when vimc is removed
> > vimc_device and the embedded media_device goes with it, while the active
> > stream and vimc_capture continue to access it.
> 
> The issue isn't so much that media_devic is embedded in vimc_device, but
> that vimc_device is released too early. Not only does the thread need to
> access the graph_mutex lock in the media_device structure, but it can
> potentially access fields of the device-specific structures as well. The
> proper solution is to propagate media_device_release() one level up, in
> order to only release the top-level structure containing media_device
> when the last reference to the media_device is dropped.

I have seen similar problems with rcar-vin, the device specific data is 
released to early. In my case it was not triggered by the struct
media_device but with a struct v4l2_device embedded in the device 
specific data IIRC.

This was when I tried to address the lifetime issues of the video device 
when binding/unbinding the device to the driver and not when unloading 
the module. This was quiet a while ago so I don't recall specifics, 
sorry about that. One finding was that there are also unsolved problems 
when it comes async notifiers and unloading/unbinding and then 
loading/binding subdevices as well as the driver controlling the video 
device. It was such a mess I gave up.

I'm happy to see activity in this area but I fear it might need work on 
a higher level and not trying to work around the problem in drivers.

> 
> > If we chose to keep these drivers as component drivers, media device
> > needs to stick around until all components stop using it. This is tricky
> > because there is no tie between these set of drivers. vimc module can
> > be deleted while others are still active. As vimc gets removed, other
> > component drivers start wanting to access the media device tree.
> 
> Reference-counting is the key.
> 
> > This is classic media device lifetime problem which could be solved
> > easily with the way I solved it with this series. I saw this as a
> > variation on the same use-case we had with sound and media drivers
> > sharing the media device.
> 
> This isn't about solving it easily, it's about solving it properly. The
> media device allocator as used here is a hack and takes us in the
> opposite direction of a proper fix.
> 
> > I have a TODO request from you asking to extend Media Device Allocator
> > API to generic case and not restrict it to USB devices. My thinking is
> > that this gives a perfect test case to extend the API to be generic
> > and use to solve this problem.
> 
> The biggest issue at the moment with the media device allocator, which I
> have pointed out numerous times and has never been addressed (and which
> explains why I didn't think the code was ready to be merged) is that the
> media_device contains operations that are based on having a single
> driver controlling the media device. A proper shared media device
> allocator needs to drop the concept of a single master for the media
> device, and thus needs to refactor those operations to allow any user of
> the media device to implement them (the .link_notify() operation is a
> prime example, and the recently added request operations will make this
> even more challenging - think of how this patch series would prevent
> vimc from properly implementing the request API). As long as these issue
> are not fixed I will be firmly opposed to spreading the usage of the
> media device allocator beyond what exists today.
> 
> > Collapsing the drivers into one might be lot more difficult and complex
> > than solving this problem with Media Device Allocator API. This approach
> > has an added benefit of extending the API to be generic and not just for
> > USB.
> 
> I've never disputed the fact that fixing a problem correctly is usually
> more work than hacking around it :-)
> 
> > I looked at this as a good way to add generic API and have a great test
> > case for it. This patch series fixes the problem for the current vimc
> > architecture.
> 
> NAK, for the reasons above. Please drop this series and fix the problem
> properly.
> 
> -- 
> Regards,
> 
> Laurent Pinchart

-- 
Regards,
Niklas Söderlund

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
  2019-07-03 16:17             ` Niklas Söderlund
@ 2019-07-03 16:52               ` Shuah Khan
  2019-07-03 23:25                 ` Laurent Pinchart
  0 siblings, 1 reply; 14+ messages in thread
From: Shuah Khan @ 2019-07-03 16:52 UTC (permalink / raw)
  To: Niklas Söderlund, Laurent Pinchart
  Cc: Helen Koike, Hans Verkuil, linux-media, linux-kernel, Shuah Khan

Hi Niklas, Laurent,

On 7/3/19 10:17 AM, Niklas Söderlund wrote:
> Hi Shauah, Laurent,
> 
> On 2019-06-30 14:41:02 +0300, Laurent Pinchart wrote:
>> Hi Shuah,
>>
>> On Fri, Jun 28, 2019 at 10:41:07AM -0600, Shuah Khan wrote:
>>> On 6/16/19 12:45 PM, Laurent Pinchart wrote:
>>>> On Fri, Jun 14, 2019 at 05:26:46PM -0600, Shuah Khan wrote:
>>>>> On 6/13/19 7:24 AM, Helen Koike wrote:
>>>>>> On 6/13/19 2:44 AM, Hans Verkuil wrote:
>>>>>>> On 5/24/19 5:31 AM, Shuah Khan wrote:
>>>>>>>> media_device is embedded in struct vimc_device and when vimc is removed
>>>>>>>> vimc_device and the embedded media_device goes with it, while the active
>>>>>>>> stream and vimc_capture continue to access it.
>>>>>>>>
>>>>>>>> Fix the media_device lifetime problem by changing vimc to create shared
>>>>>>>> media_device using Media Device Allocator API and vimc_capture getting
>>>>>>>> a reference to vimc module. With this change, vimc module can be removed
>>>>>>>> only when the references are gone. vimc can be removed after vimc_capture
>>>>>>>> is removed.
>>>>>>>>
>>>>>>>> Media Device Allocator API supports just USB devices. Enhance it
>>>>>>>> adding a genetic device allocate interface to support other media
>>>>>>>> drivers.
>>>>>>>>
>>>>>>>> The new interface takes pointer to struct device instead and creates
>>>>>>>> media device. This interface allows a group of drivers that have a
>>>>>>>> common root device to share media device resource and ensure media
>>>>>>>> device doesn't get deleted as long as one of the drivers holds its
>>>>>>>> reference.
>>>>>>>>
>>>>>>>> The new interface has been tested with vimc component driver to fix
>>>>>>>> panics when vimc module is removed while streaming is in progress.
>>>>>>>
>>>>>>> Helen, can you review this series? I'm not sure this is the right approach
>>>>>>> for a driver like vimc, and even if it is, then it is odd that vimc-capture
>>>>>>> is the only vimc module that's handled here.
>>>>>>
>>>>>> Hi Hans,
>>>>>>
>>>>>> Yes, I can take a look. Sorry, I've been a bit busy these days but I'll
>>>>>> try to take a look at this patch series (and the others) asap.
>>>>>>
>>>>>> Helen
>>>>>>
>>>>>>> My gut feeling is that this should be handled inside vimc directly and not
>>>>>>> using the media-dev-allocator.
>>>>>
>>>>> Hi Hans and Helen,
>>>>>
>>>>> I explored fixing the problem within vimc before I went down the path to
>>>>> use Media Device Allocator API. I do think that it is cleaner to go this
>>>>> way and easier to maintain.
>>>>>
>>>>> vimc is a group pf component drivers that rely on the media device vimc
>>>>> in vimc and falls into the use-case Media Device Allocator API is added
>>>>> to address. The release and life-time management happens without vimc
>>>>> component drivers being changed other than using the API to get and put
>>>>> media device reference.
>>>>
>>>> Our replies crossed each other, please see my reply to Hans. I would
>>>> just like to comment here that if having multiple kernel modules causes
>>>> issue, they can all be merged together. There's no need for vimc to be
>>>> handled through multiple modules (I actually think it's quite
>>>> counterproductive, it only makes it more complex, for no added value).
>>>
>>> There are several problems in this group of drivers as far as lifetime
>>> management is concerned. I explained some of it in the patch 2/2
>>>
>>> If vimc module is removed while streaming is active, vimc_exit runs
>>> into NULL pointer dereference error when streaming thread tries to
>>> access and lock graph_mutex in the struct media_device.
>>>
>>> The primary reason for this is that:
>>>
>>> media_device is embedded in struct vimc_device and when vimc is removed
>>> vimc_device and the embedded media_device goes with it, while the active
>>> stream and vimc_capture continue to access it.
>>
>> The issue isn't so much that media_devic is embedded in vimc_device, but
>> that vimc_device is released too early. Not only does the thread need to
>> access the graph_mutex lock in the media_device structure, but it can
>> potentially access fields of the device-specific structures as well. The
>> proper solution is to propagate media_device_release() one level up, in
>> order to only release the top-level structure containing media_device
>> when the last reference to the media_device is dropped.
> 

Yes. vimc_device is the master device for all the component drivers and
it being released early definitely causes problems. I tried to solve
this by isolating the media_device embedded in it and taking that out
of contention for release later. This problem could be solved by making
sure vimc_device sticks around and I am on that solution now.

> I have seen similar problems with rcar-vin, the device specific data is
> released to early. In my case it was not triggered by the struct
> media_device but with a struct v4l2_device embedded in the device
> specific data IIRC.
> 
> This was when I tried to address the lifetime issues of the video device
> when binding/unbinding the device to the driver and not when unloading
> the module. This was quiet a while ago so I don't recall specifics,
> sorry about that. One finding was that there are also unsolved problems
> when it comes async notifiers and unloading/unbinding and then
> loading/binding subdevices as well as the driver controlling the video
> device. It was such a mess I gave up.
> 

Yes. You will find such problems with various media drivers. It could be
the v4l2 device or some other device that gets released while still in
use.

> I'm happy to see activity in this area but I fear it might need work on
> a higher level and not trying to work around the problem in drivers.
> 

Drivers still need to handle such issues anyway. Is there a reason why
you think it is a work-around?

>>
>>> If we chose to keep these drivers as component drivers, media device
>>> needs to stick around until all components stop using it. This is tricky
>>> because there is no tie between these set of drivers. vimc module can
>>> be deleted while others are still active. As vimc gets removed, other
>>> component drivers start wanting to access the media device tree.
>>
>> Reference-counting is the key.
>>
>>> This is classic media device lifetime problem which could be solved
>>> easily with the way I solved it with this series. I saw this as a
>>> variation on the same use-case we had with sound and media drivers
>>> sharing the media device.
>>
>> This isn't about solving it easily, it's about solving it properly. The
>> media device allocator as used here is a hack and takes us in the
>> opposite direction of a proper fix.
>>

Labeling this hack doesn't accurate. I agree though that this might be a
big hammer and there might be other solutions that can be limited to
just vimc scope. :)

>>> I have a TODO request from you asking to extend Media Device Allocator
>>> API to generic case and not restrict it to USB devices. My thinking is
>>> that this gives a perfect test case to extend the API to be generic
>>> and use to solve this problem.
>>
>> The biggest issue at the moment with the media device allocator, which I
>> have pointed out numerous times and has never been addressed (and which
>> explains why I didn't think the code was ready to be merged) is that the
>> media_device contains operations that are based on having a single
>> driver controlling the media device. A proper shared media device
>> allocator needs to drop the concept of a single master for the media
>> device, and thus needs to refactor those operations to allow any user of
>> the media device to implement them (the .link_notify() operation is a
>> prime example, and the recently added request operations will make this
>> even more challenging - think of how this patch series would prevent
>> vimc from properly implementing the request API). As long as these issue
>> are not fixed I will be firmly opposed to spreading the usage of the
>> media device allocator beyond what exists today.
>>

During the reviews, it was deemed necessary to make media driver as the
master for creating parts of the tree and provide hooks for other
drivers to add their own media components to the tree. The same is
extended to other interfaces. This feature was on ice for so long,
I don't recall all the details on how it evolved.

thanks,
-- Shuah


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
  2019-07-03 16:52               ` Shuah Khan
@ 2019-07-03 23:25                 ` Laurent Pinchart
  2019-07-03 23:42                   ` Shuah Khan
  0 siblings, 1 reply; 14+ messages in thread
From: Laurent Pinchart @ 2019-07-03 23:25 UTC (permalink / raw)
  To: Shuah Khan
  Cc: Niklas Söderlund, Helen Koike, Hans Verkuil, linux-media,
	linux-kernel

Hi Shuah,

On Wed, Jul 03, 2019 at 10:52:17AM -0600, Shuah Khan wrote:
> On 7/3/19 10:17 AM, Niklas Söderlund wrote:
> > On 2019-06-30 14:41:02 +0300, Laurent Pinchart wrote:
> >> On Fri, Jun 28, 2019 at 10:41:07AM -0600, Shuah Khan wrote:
> >>> On 6/16/19 12:45 PM, Laurent Pinchart wrote:
> >>>> On Fri, Jun 14, 2019 at 05:26:46PM -0600, Shuah Khan wrote:
> >>>>> On 6/13/19 7:24 AM, Helen Koike wrote:
> >>>>>> On 6/13/19 2:44 AM, Hans Verkuil wrote:
> >>>>>>> On 5/24/19 5:31 AM, Shuah Khan wrote:
> >>>>>>>> media_device is embedded in struct vimc_device and when vimc is removed
> >>>>>>>> vimc_device and the embedded media_device goes with it, while the active
> >>>>>>>> stream and vimc_capture continue to access it.
> >>>>>>>>
> >>>>>>>> Fix the media_device lifetime problem by changing vimc to create shared
> >>>>>>>> media_device using Media Device Allocator API and vimc_capture getting
> >>>>>>>> a reference to vimc module. With this change, vimc module can be removed
> >>>>>>>> only when the references are gone. vimc can be removed after vimc_capture
> >>>>>>>> is removed.
> >>>>>>>>
> >>>>>>>> Media Device Allocator API supports just USB devices. Enhance it
> >>>>>>>> adding a genetic device allocate interface to support other media
> >>>>>>>> drivers.
> >>>>>>>>
> >>>>>>>> The new interface takes pointer to struct device instead and creates
> >>>>>>>> media device. This interface allows a group of drivers that have a
> >>>>>>>> common root device to share media device resource and ensure media
> >>>>>>>> device doesn't get deleted as long as one of the drivers holds its
> >>>>>>>> reference.
> >>>>>>>>
> >>>>>>>> The new interface has been tested with vimc component driver to fix
> >>>>>>>> panics when vimc module is removed while streaming is in progress.
> >>>>>>>
> >>>>>>> Helen, can you review this series? I'm not sure this is the right approach
> >>>>>>> for a driver like vimc, and even if it is, then it is odd that vimc-capture
> >>>>>>> is the only vimc module that's handled here.
> >>>>>>
> >>>>>> Hi Hans,
> >>>>>>
> >>>>>> Yes, I can take a look. Sorry, I've been a bit busy these days but I'll
> >>>>>> try to take a look at this patch series (and the others) asap.
> >>>>>>
> >>>>>> Helen
> >>>>>>
> >>>>>>> My gut feeling is that this should be handled inside vimc directly and not
> >>>>>>> using the media-dev-allocator.
> >>>>>
> >>>>> Hi Hans and Helen,
> >>>>>
> >>>>> I explored fixing the problem within vimc before I went down the path to
> >>>>> use Media Device Allocator API. I do think that it is cleaner to go this
> >>>>> way and easier to maintain.
> >>>>>
> >>>>> vimc is a group pf component drivers that rely on the media device vimc
> >>>>> in vimc and falls into the use-case Media Device Allocator API is added
> >>>>> to address. The release and life-time management happens without vimc
> >>>>> component drivers being changed other than using the API to get and put
> >>>>> media device reference.
> >>>>
> >>>> Our replies crossed each other, please see my reply to Hans. I would
> >>>> just like to comment here that if having multiple kernel modules causes
> >>>> issue, they can all be merged together. There's no need for vimc to be
> >>>> handled through multiple modules (I actually think it's quite
> >>>> counterproductive, it only makes it more complex, for no added value).
> >>>
> >>> There are several problems in this group of drivers as far as lifetime
> >>> management is concerned. I explained some of it in the patch 2/2
> >>>
> >>> If vimc module is removed while streaming is active, vimc_exit runs
> >>> into NULL pointer dereference error when streaming thread tries to
> >>> access and lock graph_mutex in the struct media_device.
> >>>
> >>> The primary reason for this is that:
> >>>
> >>> media_device is embedded in struct vimc_device and when vimc is removed
> >>> vimc_device and the embedded media_device goes with it, while the active
> >>> stream and vimc_capture continue to access it.
> >>
> >> The issue isn't so much that media_devic is embedded in vimc_device, but
> >> that vimc_device is released too early. Not only does the thread need to
> >> access the graph_mutex lock in the media_device structure, but it can
> >> potentially access fields of the device-specific structures as well. The
> >> proper solution is to propagate media_device_release() one level up, in
> >> order to only release the top-level structure containing media_device
> >> when the last reference to the media_device is dropped.
> 
> Yes. vimc_device is the master device for all the component drivers and
> it being released early definitely causes problems. I tried to solve
> this by isolating the media_device embedded in it and taking that out
> of contention for release later. This problem could be solved by making
> sure vimc_device sticks around and I am on that solution now.

Thank you :-)

> > I have seen similar problems with rcar-vin, the device specific data is
> > released to early. In my case it was not triggered by the struct
> > media_device but with a struct v4l2_device embedded in the device
> > specific data IIRC.
> > 
> > This was when I tried to address the lifetime issues of the video device
> > when binding/unbinding the device to the driver and not when unloading
> > the module. This was quiet a while ago so I don't recall specifics,
> > sorry about that. One finding was that there are also unsolved problems
> > when it comes async notifiers and unloading/unbinding and then
> > loading/binding subdevices as well as the driver controlling the video
> > device. It was such a mess I gave up.
> 
> Yes. You will find such problems with various media drivers. It could be
> the v4l2 device or some other device that gets released while still in
> use.
> 
> > I'm happy to see activity in this area but I fear it might need work on
> > a higher level and not trying to work around the problem in drivers.
> 
> Drivers still need to handle such issues anyway. Is there a reason why
> you think it is a work-around?
> 
> >>> If we chose to keep these drivers as component drivers, media device
> >>> needs to stick around until all components stop using it. This is tricky
> >>> because there is no tie between these set of drivers. vimc module can
> >>> be deleted while others are still active. As vimc gets removed, other
> >>> component drivers start wanting to access the media device tree.
> >>
> >> Reference-counting is the key.
> >>
> >>> This is classic media device lifetime problem which could be solved
> >>> easily with the way I solved it with this series. I saw this as a
> >>> variation on the same use-case we had with sound and media drivers
> >>> sharing the media device.
> >>
> >> This isn't about solving it easily, it's about solving it properly. The
> >> media device allocator as used here is a hack and takes us in the
> >> opposite direction of a proper fix.
> 
> Labeling this hack doesn't accurate. I agree though that this might be a
> big hammer and there might be other solutions that can be limited to
> just vimc scope. :)

The reason I call this a hack is that it may hide the early free of the
media_device structure itself, but won't help at all with the vimc
device structure that may also need to be accessed by the other drivers.
In order to fix this problem - and all similar lifetime management
problems - correctly we need to look at every structure and track how
they are referenced. Only through proper reference counting can we be
safe.

The media device allocator was specifically designed to handle cases
where there is no single master driver that can own a media device. This
caused the problems explained below which to my infinite disappointment
have been ignored while being pointed out multiple times during review.
I can only blame myself for not having done a better job at explaining
those issues of course, as the patch adding the allocator is signed by
three V4L2 core developers, so I must have failed three times.
Nevertheless, this API shall not be used until those problems are fixed,
to avoid spreading them to more drivers. At least until then - and I
believe beyond that too - it shall not be used for drivers where a media
device master exists, such as vimc.

> >>> I have a TODO request from you asking to extend Media Device Allocator
> >>> API to generic case and not restrict it to USB devices. My thinking is
> >>> that this gives a perfect test case to extend the API to be generic
> >>> and use to solve this problem.
> >>
> >> The biggest issue at the moment with the media device allocator, which I
> >> have pointed out numerous times and has never been addressed (and which
> >> explains why I didn't think the code was ready to be merged) is that the
> >> media_device contains operations that are based on having a single
> >> driver controlling the media device. A proper shared media device
> >> allocator needs to drop the concept of a single master for the media
> >> device, and thus needs to refactor those operations to allow any user of
> >> the media device to implement them (the .link_notify() operation is a
> >> prime example, and the recently added request operations will make this
> >> even more challenging - think of how this patch series would prevent
> >> vimc from properly implementing the request API). As long as these issue
> >> are not fixed I will be firmly opposed to spreading the usage of the
> >> media device allocator beyond what exists today.
> 
> During the reviews, it was deemed necessary to make media driver as the
> master for creating parts of the tree and provide hooks for other
> drivers to add their own media components to the tree. The same is
> extended to other interfaces. This feature was on ice for so long,
> I don't recall all the details on how it evolved.

Do you mean during the review of the vimc driver or of the media
allocator API ?

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
  2019-07-03 23:25                 ` Laurent Pinchart
@ 2019-07-03 23:42                   ` Shuah Khan
  0 siblings, 0 replies; 14+ messages in thread
From: Shuah Khan @ 2019-07-03 23:42 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Niklas Söderlund, Helen Koike, Hans Verkuil, linux-media,
	linux-kernel, Shuah Khan

On 7/3/19 5:25 PM, Laurent Pinchart wrote:
> Hi Shuah,
> 
> On Wed, Jul 03, 2019 at 10:52:17AM -0600, Shuah Khan wrote:
>> On 7/3/19 10:17 AM, Niklas Söderlund wrote:
>>> On 2019-06-30 14:41:02 +0300, Laurent Pinchart wrote:
>>>> On Fri, Jun 28, 2019 at 10:41:07AM -0600, Shuah Khan wrote:
>>>>> On 6/16/19 12:45 PM, Laurent Pinchart wrote:
>>>>>> On Fri, Jun 14, 2019 at 05:26:46PM -0600, Shuah Khan wrote:
>>>>>>> On 6/13/19 7:24 AM, Helen Koike wrote:
>>>>>>>> On 6/13/19 2:44 AM, Hans Verkuil wrote:
>>>>>>>>> On 5/24/19 5:31 AM, Shuah Khan wrote:
>>>>>>>>>> media_device is embedded in struct vimc_device and when vimc is removed
>>>>>>>>>> vimc_device and the embedded media_device goes with it, while the active
>>>>>>>>>> stream and vimc_capture continue to access it.
>>>>>>>>>>
>>>>>>>>>> Fix the media_device lifetime problem by changing vimc to create shared
>>>>>>>>>> media_device using Media Device Allocator API and vimc_capture getting
>>>>>>>>>> a reference to vimc module. With this change, vimc module can be removed
>>>>>>>>>> only when the references are gone. vimc can be removed after vimc_capture
>>>>>>>>>> is removed.
>>>>>>>>>>
>>>>>>>>>> Media Device Allocator API supports just USB devices. Enhance it
>>>>>>>>>> adding a genetic device allocate interface to support other media
>>>>>>>>>> drivers.
>>>>>>>>>>
>>>>>>>>>> The new interface takes pointer to struct device instead and creates
>>>>>>>>>> media device. This interface allows a group of drivers that have a
>>>>>>>>>> common root device to share media device resource and ensure media
>>>>>>>>>> device doesn't get deleted as long as one of the drivers holds its
>>>>>>>>>> reference.
>>>>>>>>>>
>>>>>>>>>> The new interface has been tested with vimc component driver to fix
>>>>>>>>>> panics when vimc module is removed while streaming is in progress.
>>>>>>>>>
>>>>>>>>> Helen, can you review this series? I'm not sure this is the right approach
>>>>>>>>> for a driver like vimc, and even if it is, then it is odd that vimc-capture
>>>>>>>>> is the only vimc module that's handled here.
>>>>>>>>
>>>>>>>> Hi Hans,
>>>>>>>>
>>>>>>>> Yes, I can take a look. Sorry, I've been a bit busy these days but I'll
>>>>>>>> try to take a look at this patch series (and the others) asap.
>>>>>>>>
>>>>>>>> Helen
>>>>>>>>
>>>>>>>>> My gut feeling is that this should be handled inside vimc directly and not
>>>>>>>>> using the media-dev-allocator.
>>>>>>>
>>>>>>> Hi Hans and Helen,
>>>>>>>
>>>>>>> I explored fixing the problem within vimc before I went down the path to
>>>>>>> use Media Device Allocator API. I do think that it is cleaner to go this
>>>>>>> way and easier to maintain.
>>>>>>>
>>>>>>> vimc is a group pf component drivers that rely on the media device vimc
>>>>>>> in vimc and falls into the use-case Media Device Allocator API is added
>>>>>>> to address. The release and life-time management happens without vimc
>>>>>>> component drivers being changed other than using the API to get and put
>>>>>>> media device reference.
>>>>>>
>>>>>> Our replies crossed each other, please see my reply to Hans. I would
>>>>>> just like to comment here that if having multiple kernel modules causes
>>>>>> issue, they can all be merged together. There's no need for vimc to be
>>>>>> handled through multiple modules (I actually think it's quite
>>>>>> counterproductive, it only makes it more complex, for no added value).
>>>>>
>>>>> There are several problems in this group of drivers as far as lifetime
>>>>> management is concerned. I explained some of it in the patch 2/2
>>>>>
>>>>> If vimc module is removed while streaming is active, vimc_exit runs
>>>>> into NULL pointer dereference error when streaming thread tries to
>>>>> access and lock graph_mutex in the struct media_device.
>>>>>
>>>>> The primary reason for this is that:
>>>>>
>>>>> media_device is embedded in struct vimc_device and when vimc is removed
>>>>> vimc_device and the embedded media_device goes with it, while the active
>>>>> stream and vimc_capture continue to access it.
>>>>
>>>> The issue isn't so much that media_devic is embedded in vimc_device, but
>>>> that vimc_device is released too early. Not only does the thread need to
>>>> access the graph_mutex lock in the media_device structure, but it can
>>>> potentially access fields of the device-specific structures as well. The
>>>> proper solution is to propagate media_device_release() one level up, in
>>>> order to only release the top-level structure containing media_device
>>>> when the last reference to the media_device is dropped.
>>
>> Yes. vimc_device is the master device for all the component drivers and
>> it being released early definitely causes problems. I tried to solve
>> this by isolating the media_device embedded in it and taking that out
>> of contention for release later. This problem could be solved by making
>> sure vimc_device sticks around and I am on that solution now.
> 
> Thank you :-)
> 
>>> I have seen similar problems with rcar-vin, the device specific data is
>>> released to early. In my case it was not triggered by the struct
>>> media_device but with a struct v4l2_device embedded in the device
>>> specific data IIRC.
>>>
>>> This was when I tried to address the lifetime issues of the video device
>>> when binding/unbinding the device to the driver and not when unloading
>>> the module. This was quiet a while ago so I don't recall specifics,
>>> sorry about that. One finding was that there are also unsolved problems
>>> when it comes async notifiers and unloading/unbinding and then
>>> loading/binding subdevices as well as the driver controlling the video
>>> device. It was such a mess I gave up.
>>
>> Yes. You will find such problems with various media drivers. It could be
>> the v4l2 device or some other device that gets released while still in
>> use.
>>
>>> I'm happy to see activity in this area but I fear it might need work on
>>> a higher level and not trying to work around the problem in drivers.
>>
>> Drivers still need to handle such issues anyway. Is there a reason why
>> you think it is a work-around?
>>
>>>>> If we chose to keep these drivers as component drivers, media device
>>>>> needs to stick around until all components stop using it. This is tricky
>>>>> because there is no tie between these set of drivers. vimc module can
>>>>> be deleted while others are still active. As vimc gets removed, other
>>>>> component drivers start wanting to access the media device tree.
>>>>
>>>> Reference-counting is the key.
>>>>
>>>>> This is classic media device lifetime problem which could be solved
>>>>> easily with the way I solved it with this series. I saw this as a
>>>>> variation on the same use-case we had with sound and media drivers
>>>>> sharing the media device.
>>>>
>>>> This isn't about solving it easily, it's about solving it properly. The
>>>> media device allocator as used here is a hack and takes us in the
>>>> opposite direction of a proper fix.
>>
>> Labeling this hack doesn't accurate. I agree though that this might be a
>> big hammer and there might be other solutions that can be limited to
>> just vimc scope. :)
> 
> The reason I call this a hack is that it may hide the early free of the
> media_device structure itself, but won't help at all with the vimc
> device structure that may also need to be accessed by the other drivers.
> In order to fix this problem - and all similar lifetime management
> problems - correctly we need to look at every structure and track how
> they are referenced. Only through proper reference counting can we be
> safe.
> 
> The media device allocator was specifically designed to handle cases
> where there is no single master driver that can own a media device. This
> caused the problems explained below which to my infinite disappointment
> have been ignored while being pointed out multiple times during review.
> I can only blame myself for not having done a better job at explaining
> those issues of course, as the patch adding the allocator is signed by
> three V4L2 core developers, so I must have failed three times.
> Nevertheless, this API shall not be used until those problems are fixed,
> to avoid spreading them to more drivers. At least until then - and I
> believe beyond that too - it shall not be used for drivers where a media
> device master exists, such as vimc.
> 
>>>>> I have a TODO request from you asking to extend Media Device Allocator
>>>>> API to generic case and not restrict it to USB devices. My thinking is
>>>>> that this gives a perfect test case to extend the API to be generic
>>>>> and use to solve this problem.
>>>>
>>>> The biggest issue at the moment with the media device allocator, which I
>>>> have pointed out numerous times and has never been addressed (and which
>>>> explains why I didn't think the code was ready to be merged) is that the
>>>> media_device contains operations that are based on having a single
>>>> driver controlling the media device. A proper shared media device
>>>> allocator needs to drop the concept of a single master for the media
>>>> device, and thus needs to refactor those operations to allow any user of
>>>> the media device to implement them (the .link_notify() operation is a
>>>> prime example, and the recently added request operations will make this
>>>> even more challenging - think of how this patch series would prevent
>>>> vimc from properly implementing the request API). As long as these issue
>>>> are not fixed I will be firmly opposed to spreading the usage of the
>>>> media device allocator beyond what exists today.
>>
>> During the reviews, it was deemed necessary to make media driver as the
>> master for creating parts of the tree and provide hooks for other
>> drivers to add their own media components to the tree. The same is
>> extended to other interfaces. This feature was on ice for so long,
>> I don't recall all the details on how it evolved.
> 
> Do you mean during the review of the vimc driver or of the media
> allocator API ?
> 

I meant the media device allocator api. I did some looking after I sent
this email. link_notify itself isn't in the media device allocator api
scope. What is missing is someway to provide registering link_notify
similar to entity_notify that exists today.

Perhaps we can adrres the link_notify can be solved in the future. Let's
chat when we meet in person at one of the conferences.

thanks,
-- Shuah

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-07-03 23:42 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-24  3:31 [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs Shuah Khan
2019-05-24  3:31 ` [PATCH 1/2] media: add generic device allocate interface to media-dev-allocator Shuah Khan
2019-05-24  3:31 ` [PATCH 2/2] vimc: fix BUG: unable to handle kernel NULL pointer dereference Shuah Khan
2019-06-13  5:44 ` [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs Hans Verkuil
2019-06-13 13:24   ` Helen Koike
2019-06-14 23:26     ` Shuah Khan
2019-06-16 18:45       ` Laurent Pinchart
2019-06-28 16:41         ` Shuah Khan
2019-06-30 11:41           ` Laurent Pinchart
2019-07-03 16:17             ` Niklas Söderlund
2019-07-03 16:52               ` Shuah Khan
2019-07-03 23:25                 ` Laurent Pinchart
2019-07-03 23:42                   ` Shuah Khan
2019-06-16 18:43   ` Laurent Pinchart

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).