All of lore.kernel.org
 help / color / mirror / Atom feed
* [ANN] Media object lifetime management meeting report from Oslo
@ 2017-01-27 10:08 Sakari Ailus
  2017-01-27 10:52 ` Sakari Ailus
  2017-01-27 11:38 ` Mauro Carvalho Chehab
  0 siblings, 2 replies; 8+ messages in thread
From: Sakari Ailus @ 2017-01-27 10:08 UTC (permalink / raw)
  To: linux-media; +Cc: laurent.pinchart, hverkuil, mchehab

Hello everyone,

Please read below my report on resolving object lifetime issues in V4L2 and
Media controller frameworks. The document is rather long but worth reading if
you're interested in the topic.


Sakari Ailus, Laurent Pinchart and Hans Verkuil present
2016-01-13


Introduction to the problem domain
==================================

The Media controller framework was first used for complex devices with
internal data processing pipelines. The problem domain that got most of the
attention as how the pipeline configuration would work and how V4L2
sub-device format would be propagated or how the pipeline validation would
work. As none of the devices that were supported in the beginning were
hot-pluggable, little attention was paid in making it possible to remove
them.

The media graph is a complex, highly interconnected data structure with
object allocated by different drivers and with different lifetimes. The data
structure can be navigated by the Media controller framework and by drivers,
including parts that those drivers did not create themselves. While some
rules exist on how this could be done, for instance the graph traversal,
these rules are not enough to support new use cases such as hot-pluggable
devices (or safely unbinding devices in general).

In the original (and also the current) implementation of the Media
controller framework tearing down a Media device was centered in the ability
of removing devices without crashing the kernel while the said devices were
not in use by a user application. In order to prevent unbinding drivers,
module use counts are increased whilst the devices are in use. This
obviously does not guarantee that: hot-pluggable devices may be removed
without the user application even knowing it, as well as non-hotpluggable
devices may also be unbound even if the module stays loaded (of which the
latter is certainly a lesser concern).


Current and future issues
=========================

* Media device lifetime management. There is no serialisation between
  issuing system calls to the media device driver and removing the struct
  media_device from the system, typically through unbinding a device from
  the driver. This leaves a time window in which released kernel memory may
  be accessed.

* Media entity lifetime management (including driver allocated objects that
  can be reached through media graph objects). Media entities (and objects
  that embed media graph objects) may be looked up by drivers and
  referenced. Such references may be kept for an extended period of time.
  Sub-device drivers may call other sub-devices' operations as well as they
  may be accessed from the user space through file handles. Media entity
  lifetime management is required for safely removing them.

* Sharing a struct media_device between drivers. This is not a problem right
  now but will need to be addressed before a struct media_device may be
  shared between multiple drivers. Currently, the struct media_device has
  multiple fields that are specific to a single driver which inherently
  makes a media device non-shareable:

	- dev: struct device of the bridge/ISP driver
	- ops: ops structure that contains e.g. link_setup used for managing
	  power across the pipelines found in a media device. In the future,
	  this would include routing configuration as well.

  The functionality supported by these fields has to be implemented by means
  supporting multiple drivers before sharing the struct media_device is
  possible. In that solution, these fields may not continue to exist in the
  form they do now.


General rules for accessing data and running code
=================================================

These rules must be followed in accessing data and executing code in all
cases:

code) While there is a chance to execute code, the containing module
  reference count must be incremented.

data) Every memory allocation must be reference counted if a pointer to
  that memory is being shared. Each user obtains a reference and puts it
  using a related put function which will release the memory if it was the
  last reference.


Object reference counting golden rules
======================================

1. When a pointer to a reference-counted object is stored for later use, a
   reference to that object must exist.

2. A reference exists if it is taken or borrowed.

3. A borrowed reference is a reference taken elsewhere that can be proven to
   exist for the complete lifetime of the pointer.

4. When in doubt about whether a reference can be borrowed, it can be taken
   without any adverse impact.

5. A reference taken must be put at some point.

6. No access to a pointer can occur after the corresponding reference has
   been put.

	6.1. The point when the reference is put must be proven to occur
	     later than all other possible accesses to the reference.

	6.2. At the location where the reference is put, care must be taken
	     not to access the corresponding pointer after putting the
	     reference.

	     e.g.

	     object_put(obj);
	     obj->dead = true;

	     is invalid, the correct code would be

	     obj->dead = true;
	     object_put(obj);

7. Immediately set a reference to NULL before or right after putting a
   given reference. This helps catching use-after-release errors.


Media device lifetime
=====================


IOCTLs and removal
------------------

There is a choice to be made related to the approach used to ensure that the
media device itself will stay around as long as it is needed. This can be
done by referencing the media device, in which case the media device memory
will be released once all the users are done with the media device. An
alternative to this is to use locking, taking a lock during an IOCTL call
and during device removal. rw_semaphores could potentially be used for this
purpose, but they are hardly meant for such use.

The rest of the kernel generally uses reference counting to address such
issues. Besides this, sharing the media device later on will require using
reference, so it is certainly more future-proof as well as being a generally
known-good solution.

The media device thus needs to be reference counted.


Allocating struct media_devnode separately from struct media_device
-------------------------------------------------------------------

The current codebase allocates struct media_devnode dynamically. This was
done in order to reduce the time window during which unallocated kernel
memory could be accessed. However, there is no specific need for such a
separation as the entire data structure, including the media device which
used to contain struct media_devnode, will have the same lifetime. Thus the
struct media_devnode should be allocated as part of struct media_device.
Later on, struct media_devnode may be merged with struct media_device if so
decided.


Allocating struct media_device dynamically
------------------------------------------

Traditionally the media device has been allocated as part of drivers' own
structures. This is certainly fine as such, but many drivers have allocated
the driver private struct using the devm_() family of functions. This causes
such memory to be released at the time of device removal rather than at the
time when the memory is no longer accessible by e.g. user space processes or
interrupt handlers. Besides the media device, the driver's own data
structure is very likely to have the precisely same lifetime issue: it may
also be accessed through IOCTLs after the device has been removed from the
system.

Instead, the memory needs to be released as part of the release() callback
of the media device which is called when the last reference to the media
device is gone. Still, allocating the media device outside another
containing driver specific struct will be required in some cases: sharing
the media device mandates that. Implementing this can certainly be postponed
until sharing the media device is otherwise supported as well.


Lifetime of the media entities and related framework or driver objects
======================================================================

The media graph data structure is a complex data structure consisting of
media graph objects that themselves may be either entities, links or pads.
The media graph objects are allocated by various drivers and currently have
a lifetime related to binding and unbinding of the related hardware. The
current rules (i.e. acquiring the media graph lock for graph traversal) are
not enough to properly support removing entities.

Instead, reference counting needs to be introduced for media entities and
other objects that contain media entities. References must be acquired (or
in some cases, borrowed) whenever a reference is held to an object. A
release callback is used to release the memory allocation that contains an
object when the last reference to the object has been put. For instance:

	struct media_entity {
		...
		struct kref kref;
		void (*release)(struct media_entity *entity);
	};
	
	/*
	 * For refcounting, the caller has to set entity->release after
	 * initialising the entity.
	 */
	void media_entity_pads_init(struct media_entity *entity)
	{
		...
		kref_init(&entity->kref);
	}
	
	int media_device_register_entity(struct media_device *mdev,
					 struct media_entity *entity)
	{
		...
		entity->graph_obj.mdev = mdev;
	}
	
	void media_entity_release(struct kref *kref)
	{
		struct media_entity *entity =
			container_of(kref, struct media_entity, kref);
		struct media_device *mdev = entity->graph_obj.mdev;
		
		entity->release(entity);
		if (mdev)
			media_device_put(mdev);
	}
	
	void media_entity_put(struct media_entity *entity)
	{
		kref_put(&entity->kref, media_entity_release);
	}

	void media_entity_get(struct media_entity *entity)
	{
		kref_get(&entity->kref);
	}
	
The above takes into account the current behaviour which makes no use of
reference counting while still allowing reference counting for drivers that
use it.

Often media entities are contained in another struct such as struct
v4l2_subdev. In that case, the allocator of that struct must provide a
release callback which is used as a destructor of that object. Functions to
count references (get() and put()) are defined to obtain and released
references to the said objects. Wrappers are added for the types containing
the object type that does have the refcount, for instance in this case:

	void v4l2_subdev_get(struct v4l2_subdev *sd)
	{
		media_entity_get(&sd->entity);
	}

	void v4l2_subdev_put(struct v4l2_subdev *sd)
	{
		media_entity_put(&sd->entity);
	}

Memory allocations that contain one or more objects must be considered
indivisible. All the objects inside a struct are allocated and released at
one point of time. For instance, struct v4l2_subdev contains both struct
media_entity and struct video_device. struct video_device contains struct
device (related to the character device visible to the user) that is
reference counted already.

In the example below, foo is a sub-device driver. The struct foo_device
contains a struct v4l2_subdev field sd which in turn contains a media
entity. In addition to that, struct v4l2_subdev contains a struct
video_device (the devnode field) that currently is a pointer. In order to
simplify the reference counting, it is assumed to be directly a part of the
struct v4l2_subdev below. The example assumes that the driver requests a
device node for the V4L2 sub-device and does not address the case where the
device node is not created.


	struct media_entity {
		...
		struct kref kref;
		void (*release)(struct media_entity *entity);
	};
	
	struct v4l2_subdev {
		...
		struct video_device devnode;
	};

	void media_entity_release(struct kref *kref)
	{
		struct media_entity *entity =
			container_of(kref, struct media_entity, kref);
		struct media_device *mdev = entity->graph_obj.mdev;
		
		/* Release the media entity, possibly releasing its memory. */
		entity->release(entity);

		/*
		 * If it was bound to a media device, put the media device
		 * as well.
		 */
		if (mdev)
			media_device_put(mdev);
	}
	
	void media_entity_put(struct media_entity *entity)
	{
		kref_put(&entity->kref, media_entity_release);
	}

	/*
	 * Returns zero if getting entity succeeds, -ENODEV otherwise.
	 */
	int media_entity_try_get(struct media_entity *entity)
	{
		return kref_get_unless_zero(&entity->kref) ? 0 : -ENODEV;
	}

	void media_entity_get(struct media_entity *entity)
	{
		kref_get(&entity->kref);
	}
	
	/*
	 * Depending on the data structure containing the media entity, the
	 * caller has to set entity->release after initialising the entity.
	 */
	void media_entity_pads_init(struct media_entity *entity)
	{
		...
		kref_init(&entity->kref);
	}
	
	int media_device_register_entity(struct media_device *mdev,
					 struct media_entity *entity)
	{
		...
		/*
		 * Bind the media entity to the media device; thus increment
		 * media device refcount. The media device typically is not
		 * available during sub-device driver probe() time. This
		 * requires that a media entity may only be registered once.
		 */
		entity->graph_obj.mdev = mdev;
		media_device_get(mdev);
		/* And get the entity as well. */
		media_entity_get(entity);
	}
	
	
	void media_device_unregister_entity(struct media_entity *entity)
	{
		...
		media_entity_put(entity);
	}

	int v4l2_device_register_subdev(struct v4l2_subdev *sd)
	{
		...
		media_device_register_entity(&sd->entity);
		...
	}
	
	void v4l2_device_unregister_subdev(struct v4l2_subdev *sd)
	{
		...
		media_device_unregister_entity(&sd->entity);
	}

	/*
	 * Custom release callback function for V4L2 sub-devices that export
	 * a device node. (Could be used for others as well, with
	 * sd->devnode.release() callback.)
	 */
	void v4l2_device_subdev_devnode_release(struct media_entity *entity)
	{
		struct v4l2_subdev *sd = media_entity_to_v4l2_subdev(entity);

		/*
		 * If the devnode got registered we still hold a reference
		 * to it. Check this from vdev->prio (but any other field
		 * which gets set during device node registration and never
		 * changed again could be used). If the devnode was never
		 * registered, call its release function directly.
		 */
		if (sd->devnode.prio)
			video_put(sd->devnode);
		else
			sd->devnode.release(&sd->devnode);
	}
	
	int v4l2_device_register_subdev_nodes(struct v4l2_device *vdev)
	{
		struct video_device *vdev;
 	        struct v4l2_subdev *sd;
	        int err;

	        list_for_each_entry(sd, &v4l2_dev->subdevs, list) {
			__video_register_device(&sd->devnode, ...);
			video_get(&sd->devnode);
			sd->devnode.prio = something non-NULL;
		}
	}
	
	int v4l2_subdev_try_get(struct v4l2_subdev *sd)
	{
		return media_entity_try_get(&sd->entity);
	}

	void v4l2_subdev_get(struct v4l2_subdev *sd)
	{
		media_entity_get(&sd->entity);
	}

	void v4l2_subdev_put(struct v4l2_subdev *sd)
	{
		media_entity_put(&sd->entity);
	}

	/* V4L2 sub-device open file operation handler */	
	int subdev_open(struct file *file)
	{
		struct video_device *vdev = video_devdata(file);
		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);
		int rval;
		
		/*
		 * The v4l2_subdev depends on the video device node. It thus
		 * may be that the media entity refcount (which is also used
		 * to count references to the v4l2_subdev) has reached zero
		 * here. However its memory is still intact as it's part of
		 * the same struct v4l2_subdev.
		 */
		rval = v4l2_subdev_try_get(sd);
		if (rval)
			return rval;
		
		...
	}
	
	/* V4L2 sub-device release file operation handler */
	int subdev_close(struct file *file)
	{
		struct video_device *vdev = video_devdata(file);
		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);

		v4l2_subdev_put(sd);
	}

	struct foo_device {
		struct v4l2_subdev sd;
		struct media_pad pad;
	};

	void foo_device_release(struct video_device *vdev)
	{
		struct v4l2_subdev *sd =
			container_of(vdev, struct v4l2_subdev, devnode);
		struct foo_device *foo =
			container_of(sd, struct foo_device, sd);

		/*
		 * TODO: acquire the graph mutex here and remove the
		 * entities corresponding to the V4L2 sub-device and its
		 * device node from the graph.
		 */
		media_entity_cleanup(&foo->sd.entity);
		
		kfree(foo);
	}

	int foo_probe(struct device *dev)
	{
		struct foo_device *foo = kmalloc(sizeof(*foo), GFP_KERNEL));

		media_entity_pads_init(&foo->sd.entity, 1, &foo->pad);
		foo->sd.entity.release = v4l2_subdev_devnode_release;
		foo->sd.devnode.release = foo_device_release;
		v4l2_async_register_subdev(&foo->sd);
	}

	void foo_remove(struct device *dev)
	{
		struct foo_device *foo = dev_get_drvdata(dev);
		
		v4l2_async_unregister_subdev(&foo->sd);
		v4l2_device_unregister_subdev(&foo->sd);
		/*
		 * v4l2_subdev_put() will end up releasing foo immediately
		 * unless file handles are open currently. Thus further
		 * accesses to foo are not allowed.
		 */
		v4l2_subdev_put(&foo->sd);
	}
	
Practical considerations
------------------------

- Beyond media entities, other media graph objects could be made refcounted.
  This is not seen necessary as the related objects such as pads are in
  practice allocated within the same driver specific struct containing the
  media entity.

	- This does not apply to links which are allocated dynamically by
	  the framework, and released at unregistration time, meaning that
	  accessing a link will require holding the media graph mutex.

- A custom release callback is to be used for refcounted framework
  structs. This makes it easy for the drivers to embed the objects in their
  own structs.

- As long as an object has its own reference count, getting or putting an
  object does not affect the refcount of another object that it depends on.
  For instance, a media entity depends on an existence of the media
  device.

- Each media entity holds a reference to the media device. Being able to
  navigate from the media entity to the media device is used in drivers and
  an abrupt disconnection of the two is problematic to handle for drivers.

- At object initialisation time the kref refcount is initialised. The object
  is put at unregistration time.
  
  	- If the object is a media entity, it is removed from the media graph
 	  when the last reference to the media entity is gone. The release
 	  callback needs to acquire the media graph mutex in order to
 	  perform the removal.

	- In particular for struct video_device, media_entity release callback will
	  put the refcount of the video device (struct device embedded in
	  struct video_device)

	- For struct v4l2_subdev, struct media_entity contained in struct
	  v4l2_subdev is dependent on struct device within struct
	  video_device.

- When a driver creates multiple reference-counted objects (such as multiple
  media entities for instance) that are part of its own private struct, it
  will need to reference-count its own struct based on the release callbacks
  of all those objects. It might be possible to simplify this by providing a
  single release callback that would cover all objects. This is already
  feasible for drivers that implement a media_device instance, V4L2
  sub-devices and V4L2 video devices hold a reference on the media_device,
  whose release callback could then act as a single gatekeeper. For other
  drivers we haven't explored yet whether the core could meaningfully
  provide help, but it should be remembered that drivers that implement
  multiple graph objects and that do not implement a media_device are not
  legion. Most slave drivers (such as sensor drivers) only implement a
  single v4l2_subdev.

- No V4L2 sub-device driver currently supports unbinding the device safely
  when a media entity (or V4L2 sub-device) is registered. Prevent manual
  unbind for all sub-device drivers by setting the suppress_bind_attrs field
  in struct device_driver.


Rules for navigating the media graph and accessing graph objects
================================================================

- Graph traversal is safe as long as the media graph lock is held during
  traversal. (I.e. no changes here.)

	- During graph traversal entity refcount must be increased in order
	  to prevent them from being released. As the media entity's release()
	  callback may be already in process at this point, a variant
	  checking that the reference count has not reached zero must be
	  used (media_entity_get_try()).

- In order to keep a reference to an object after the graph traversal has
  finished, a reference to the object must be obtained and held as long as
  the object is in use.

	- The same applies to file handles to V4L2 sub-devices or drivers
	  keeping a reference to another driver's sub-device.

- Once the reference count of an object reaches zero it will not be
  increased again, even if that object was part of a memory allocation which
  still has referenced objects. Obtaining references to such an object must
  fail.

- Navigating within a memory allocation while holding a reference to an
  object in that allocation to obtain another object is allowed. The target
  object type must provide a function testing for zero references in order
  not to not to increment reference count that has reached zero.
  kref_get_unless_zero() can be used for this. Especially drivers may need
  this.

- Circular references are not allowed.

- It is safe to move between objects of the same lifetime (same kref),
  e.g. struct v4l2_subdev and struct media_entity it contains.

The rules must be documented as part of the Media framework ReST
documentation.


Stopping the hardware safely at device unbind
=============================================

What typically gets little attention in driver implementation is stopping
safely whenever the hardware is being removed. The kernel frameworks are not
very helpful in this respect --- after all how this is done somewhat depends
on the hardware. Only hot-pluggable devices are generally affected. While it
is possible to manually unbind e.g. platform devices, that certainly is a
lesser problem for such devices.

The rules are:

- No memory related to data structures that are needed during driver
  operation can be released as long as interrupt handlers may be still
  executing or may start executing or user file handles are open.

- All hardware access must cease after the remove() callback in driver ops
  struct has returned. (For USB devices this callback is called
  disconnect(), but its purpose is precisely the same.)

Handling this correctly will require taking care of a few common matters:

1. Checking for device presence in each system call issued on a device, and
   returning -ENODEV if the device is not found.

2. The driver's remove() function must ensure that no system calls that
   could result in hardware access are ongoing.
   
3. Some system calls may sleep, and while doing so also releasing and later
   reacquiring (when they continue) the mutexes their processes were
   holding. Such processes must be woken up.

This is not an easy task for a driver and the V4L2 and Media controller
frameworks need to help the drivers to achieve this so that drivers would
need to worry about this as little as possible. Drivers relying for their
locking needs on the videobuf2 and the V4L2 frameworks are likely to require
fewer changes.

The same approach must be taken in issuing V4L2 sub-device operations by
other kernel drivers.


Serialising unbind with hardware access through system calls
------------------------------------------------------------

What's below is concentrated in V4L2; Media controller will need similar
changes.

- Add a function called video_unregister_device_prepare() (or
  video_device_mark_unregistered() or a similarly named function) to amend
  the functionality of video_unregister_device(): mark the device
  unregistering so that new system calls issues on the device will return an
  error.

- Device node specific use count for active system calls.

- Waiting list for waiting for the use count to reach zero (for proceeding
  unregistration in v4l2_unregister_device()). Each finishing system call on
  a file descriptor will wake up the waiting list.

- Processes waiting in a waiting list of a system call (e.g. poll, DQBUF
  IOCTL) will be woken up. After waiting, the processes will check whether
  the video node is still registered, i.e. video_unregister_device_prepare()
  has not been called.


Things to remember
------------------

- Using the devm_request_irq() is _NOT_ allowed in general case, as the
  free_irq() results in being called _after_ the driver's release callback.
  This means that if new interrupts still arrive between driver's release
  callback returning and devres resources not being released, the driver's
  interrupt handler will still run. Use request_irq() and free_irq()
  instead.


What changes in practice?
=========================

Most of the changes needed in order to align the current implementation to
conform the above are to be made in the frameworks themselves. What changes
in drivers, though, is how objects are initialised and released. This mostly
applies to drivers' own data structures that often contain reference counted
object types from the frameworks.

Relations between various frameworks objects will need to be handled in a
standard way in the appropriate registeration, unregistration and reference
getting and putting functions. Drivers (as well as frameworks, such as the
V4L2 sub-device framework in its open and close file operations) will make
use of these functions to get an put references to the said objects.

Drivers that are not hot-pluggable could largely ignore the changes, as
unregistering an object does in fact take place right before releasing that
object, meaning drivers that did not perform any object reference counting
are no more broken with the changes described above without implementing
reference counting. Still care must be taken that they remain decently
unloadable when there are no users.

New drivers should be required to implement object reference counting
correctly, whether hot-pluggable or not. One platform driver (e.g. omap3isp)
should be fixed to function as a model for new platform drivers.

-- 
Kind regards,

Sakari Ailus
e-mail: sakari.ailus@iki.fi	XMPP: sailus@retiisi.org.uk

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [ANN] Media object lifetime management meeting report from Oslo
  2017-01-27 10:08 [ANN] Media object lifetime management meeting report from Oslo Sakari Ailus
@ 2017-01-27 10:52 ` Sakari Ailus
  2017-01-27 11:38 ` Mauro Carvalho Chehab
  1 sibling, 0 replies; 8+ messages in thread
From: Sakari Ailus @ 2017-01-27 10:52 UTC (permalink / raw)
  To: linux-media; +Cc: laurent.pinchart, hverkuil, mchehab

On Fri, Jan 27, 2017 at 12:08:22PM +0200, Sakari Ailus wrote:
> 2016-01-13

As the most attentive of you have noticed, this should have been 2017-01-13.
The New Year just doesn't always take effect immediately. :-)

-- 
Sakari Ailus
e-mail: sakari.ailus@iki.fi	XMPP: sailus@retiisi.org.uk

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [ANN] Media object lifetime management meeting report from Oslo
  2017-01-27 10:08 [ANN] Media object lifetime management meeting report from Oslo Sakari Ailus
  2017-01-27 10:52 ` Sakari Ailus
@ 2017-01-27 11:38 ` Mauro Carvalho Chehab
  2017-01-27 22:02   ` Sakari Ailus
  1 sibling, 1 reply; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2017-01-27 11:38 UTC (permalink / raw)
  To: Sakari Ailus
  Cc: linux-media, laurent.pinchart, hverkuil, Shuah Khan, Helen Koike

Hi Sakari/Hans/Laurent,

First of all, thanks for looking into those issues. Unfortunately, I was in
vacations, and were not able to be with you there for such discussions.

While I have a somewhat different view on some of the introductory points of 
this RFC, what really matters is the "proposal" part of it. So, based on the
experiments I did when I addressed the hotplug issues with the media
controller on USB drivers, I'm adding a few comments below.


Em Fri, 27 Jan 2017 12:08:22 +0200
Sakari Ailus <sakari.ailus@iki.fi> escreveu:

> Allocating struct media_devnode separately from struct media_device
> -------------------------------------------------------------------
> 
> The current codebase allocates struct media_devnode dynamically. This was
> done in order to reduce the time window during which unallocated kernel
> memory could be accessed. However, there is no specific need for such a
> separation as the entire data structure, including the media device which
> used to contain struct media_devnode, will have the same lifetime. Thus the
> struct media_devnode should be allocated as part of struct media_device.
> Later on, struct media_devnode may be merged with struct media_device if so
> decided.

There is one issue merging struct media_devnode at struct media_device.
The problem is that, if the same struct device is used for two different
APIs (like V4L2 and media_controller) , e. g. if the cdev parent points
to the same struct device, you may end by having a double free when the
device is unregistered on the second core. That's basically why 
currently struct cdev is at struct media_devnode: this way, it has its own
struct device.

IMHO, it also makes sense to embeed struct cdev at the V4L2 side, as I
detected some race issues at the V4L2 side when I ran the bind/unbind
race tests, when we tried to merge the snd-usb-audio MC support patch.
I remember Shuah reported something similar on that time.

> Allocating struct media_device dynamically
> ------------------------------------------
> 
> Traditionally the media device has been allocated as part of drivers' own
> structures. This is certainly fine as such, but many drivers have allocated
> the driver private struct using the devm_() family of functions. This causes
> such memory to be released at the time of device removal rather than at the
> time when the memory is no longer accessible by e.g. user space processes or
> interrupt handlers. Besides the media device, the driver's own data
> structure is very likely to have the precisely same lifetime issue: it may
> also be accessed through IOCTLs after the device has been removed from the
> system.
> 
> Instead, the memory needs to be released as part of the release() callback
> of the media device which is called when the last reference to the media
> device is gone. Still, allocating the media device outside another
> containing driver specific struct will be required in some cases: sharing
> the media device mandates that. Implementing this can certainly be postponed
> until sharing the media device is otherwise supported as well.

The patches adding MC support for snd-usb-audio, pending since Kernel
4.7 (I guess) require such functionatilty. Last year, on the audio
summit, they asked about its status, as they're needing MC suppor there.

So, whatever solution taken, this should be part of the solution.

(c/c Shuah, as she is the one working on it)

> Lifetime of the media entities and related framework or driver objects
> ======================================================================
> 
> The media graph data structure is a complex data structure consisting of
> media graph objects that themselves may be either entities, links or pads.
> The media graph objects are allocated by various drivers and currently have
> a lifetime related to binding and unbinding of the related hardware. The
> current rules (i.e. acquiring the media graph lock for graph traversal) are
> not enough to properly support removing entities.
> 
> Instead, reference counting needs to be introduced for media entities and
> other objects that contain media entities. References must be acquired (or
> in some cases, borrowed) whenever a reference is held to an object. A
> release callback is used to release the memory allocation that contains an
> object when the last reference to the object has been put. For instance:
> 
> 	struct media_entity {
> 		...
> 		struct kref kref;
> 		void (*release)(struct media_entity *entity);
> 	};

The similar rationale is valid also for media interfaces and, on a
lesser extent, to pads and links.

So, instead, IMHO, the kref should be implemented at struct media_gobj,
with is inherited by all graph elements. I guess that handling it
there will avoid us to duplicate the same logic on different
places (e. g. interface and entity logic handling; links handling).

Btw, as you want to use OMAP3 as a reference driver, you should
implement interface and interface links there, in order to have a
broader testcase coverage, or use some other driver that already
implements it.

Alternatively, we should work on merging the Helen's vimc driver, being
sure that it provides hot plug support for all kinds of media graph
objects. It should be easier/faster to detect race issues on such a driver
that doesn't depend on a hardware to complete their operations. So,
I'm a big fan on using it as primary testcase. Of course, we'll need to
test with real drivers too.
	
> 	/*
> 	 * For refcounting, the caller has to set entity->release after
> 	 * initialising the entity.
> 	 */
> 	void media_entity_pads_init(struct media_entity *entity)
> 	{
> 		...
> 		kref_init(&entity->kref);
> 	}
> 	
> 	int media_device_register_entity(struct media_device *mdev,
> 					 struct media_entity *entity)
> 	{
> 		...
> 		entity->graph_obj.mdev = mdev;
> 	}
> 	
> 	void media_entity_release(struct kref *kref)
> 	{
> 		struct media_entity *entity =
> 			container_of(kref, struct media_entity, kref);
> 		struct media_device *mdev = entity->graph_obj.mdev;
> 		
> 		entity->release(entity);
> 		if (mdev)
> 			media_device_put(mdev);
> 	}
> 	
> 	void media_entity_put(struct media_entity *entity)
> 	{
> 		kref_put(&entity->kref, media_entity_release);
> 	}
> 
> 	void media_entity_get(struct media_entity *entity)
> 	{
> 		kref_get(&entity->kref);
> 	}
> 	
> The above takes into account the current behaviour which makes no use of
> reference counting while still allowing reference counting for drivers that
> use it.
> 
> Often media entities are contained in another struct such as struct
> v4l2_subdev. In that case, the allocator of that struct must provide a
> release callback which is used as a destructor of that object. Functions to
> count references (get() and put()) are defined to obtain and released
> references to the said objects. Wrappers are added for the types containing
> the object type that does have the refcount, for instance in this case:
> 
> 	void v4l2_subdev_get(struct v4l2_subdev *sd)
> 	{
> 		media_entity_get(&sd->entity);
> 	}
> 
> 	void v4l2_subdev_put(struct v4l2_subdev *sd)
> 	{
> 		media_entity_put(&sd->entity);
> 	}
> 
> Memory allocations that contain one or more objects must be considered
> indivisible. All the objects inside a struct are allocated and released at
> one point of time. For instance, struct v4l2_subdev contains both struct
> media_entity and struct video_device. struct video_device contains struct
> device (related to the character device visible to the user) that is
> reference counted already.

Here's the problem: if we point the media_devnode's cdev parent to
the same object as struct video_device, both cores will try to
decrement its kref at release() time.

> 
> In the example below, foo is a sub-device driver. The struct foo_device
> contains a struct v4l2_subdev field sd which in turn contains a media
> entity. In addition to that, struct v4l2_subdev contains a struct
> video_device (the devnode field) that currently is a pointer. In order to
> simplify the reference counting, it is assumed to be directly a part of the
> struct v4l2_subdev below. The example assumes that the driver requests a
> device node for the V4L2 sub-device and does not address the case where the
> device node is not created.
> 
> 
> 	struct media_entity {
> 		...
> 		struct kref kref;
> 		void (*release)(struct media_entity *entity);
> 	};
> 	
> 	struct v4l2_subdev {
> 		...
> 		struct video_device devnode;
> 	};
> 
> 	void media_entity_release(struct kref *kref)
> 	{
> 		struct media_entity *entity =
> 			container_of(kref, struct media_entity, kref);
> 		struct media_device *mdev = entity->graph_obj.mdev;
> 		
> 		/* Release the media entity, possibly releasing its memory. */
> 		entity->release(entity);
> 
> 		/*
> 		 * If it was bound to a media device, put the media device
> 		 * as well.
> 		 */
> 		if (mdev)
> 			media_device_put(mdev);
> 	}
> 	
> 	void media_entity_put(struct media_entity *entity)
> 	{
> 		kref_put(&entity->kref, media_entity_release);
> 	}
> 
> 	/*
> 	 * Returns zero if getting entity succeeds, -ENODEV otherwise.
> 	 */
> 	int media_entity_try_get(struct media_entity *entity)
> 	{
> 		return kref_get_unless_zero(&entity->kref) ? 0 : -ENODEV;
> 	}
> 
> 	void media_entity_get(struct media_entity *entity)
> 	{
> 		kref_get(&entity->kref);
> 	}
> 	
> 	/*
> 	 * Depending on the data structure containing the media entity, the
> 	 * caller has to set entity->release after initialising the entity.
> 	 */
> 	void media_entity_pads_init(struct media_entity *entity)
> 	{
> 		...
> 		kref_init(&entity->kref);
> 	}
> 	
> 	int media_device_register_entity(struct media_device *mdev,
> 					 struct media_entity *entity)
> 	{
> 		...
> 		/*
> 		 * Bind the media entity to the media device; thus increment
> 		 * media device refcount. The media device typically is not
> 		 * available during sub-device driver probe() time. This
> 		 * requires that a media entity may only be registered once.
> 		 */
> 		entity->graph_obj.mdev = mdev;
> 		media_device_get(mdev);
> 		/* And get the entity as well. */
> 		media_entity_get(entity);
> 	}
> 	
> 	
> 	void media_device_unregister_entity(struct media_entity *entity)
> 	{
> 		...
> 		media_entity_put(entity);
> 	}
> 
> 	int v4l2_device_register_subdev(struct v4l2_subdev *sd)
> 	{
> 		...
> 		media_device_register_entity(&sd->entity);
> 		...
> 	}
> 	
> 	void v4l2_device_unregister_subdev(struct v4l2_subdev *sd)
> 	{
> 		...
> 		media_device_unregister_entity(&sd->entity);
> 	}
> 
> 	/*
> 	 * Custom release callback function for V4L2 sub-devices that export
> 	 * a device node. (Could be used for others as well, with
> 	 * sd->devnode.release() callback.)
> 	 */
> 	void v4l2_device_subdev_devnode_release(struct media_entity *entity)
> 	{
> 		struct v4l2_subdev *sd = media_entity_to_v4l2_subdev(entity);
> 
> 		/*
> 		 * If the devnode got registered we still hold a reference
> 		 * to it. Check this from vdev->prio (but any other field
> 		 * which gets set during device node registration and never
> 		 * changed again could be used). If the devnode was never
> 		 * registered, call its release function directly.
> 		 */
> 		if (sd->devnode.prio)
> 			video_put(sd->devnode);
> 		else
> 			sd->devnode.release(&sd->devnode);
> 	}
> 	
> 	int v4l2_device_register_subdev_nodes(struct v4l2_device *vdev)
> 	{
> 		struct video_device *vdev;
>  	        struct v4l2_subdev *sd;
> 	        int err;
> 
> 	        list_for_each_entry(sd, &v4l2_dev->subdevs, list) {
> 			__video_register_device(&sd->devnode, ...);
> 			video_get(&sd->devnode);
> 			sd->devnode.prio = something non-NULL;
> 		}
> 	}
> 	
> 	int v4l2_subdev_try_get(struct v4l2_subdev *sd)
> 	{
> 		return media_entity_try_get(&sd->entity);
> 	}
> 
> 	void v4l2_subdev_get(struct v4l2_subdev *sd)
> 	{
> 		media_entity_get(&sd->entity);
> 	}
> 
> 	void v4l2_subdev_put(struct v4l2_subdev *sd)
> 	{
> 		media_entity_put(&sd->entity);
> 	}
> 
> 	/* V4L2 sub-device open file operation handler */	
> 	int subdev_open(struct file *file)
> 	{
> 		struct video_device *vdev = video_devdata(file);
> 		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);
> 		int rval;
> 		
> 		/*
> 		 * The v4l2_subdev depends on the video device node. It thus
> 		 * may be that the media entity refcount (which is also used
> 		 * to count references to the v4l2_subdev) has reached zero
> 		 * here. However its memory is still intact as it's part of
> 		 * the same struct v4l2_subdev.
> 		 */
> 		rval = v4l2_subdev_try_get(sd);
> 		if (rval)
> 			return rval;
> 		
> 		...
> 	}
> 	
> 	/* V4L2 sub-device release file operation handler */
> 	int subdev_close(struct file *file)
> 	{
> 		struct video_device *vdev = video_devdata(file);
> 		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);
> 
> 		v4l2_subdev_put(sd);
> 	}
> 
> 	struct foo_device {
> 		struct v4l2_subdev sd;
> 		struct media_pad pad;
> 	};
> 
> 	void foo_device_release(struct video_device *vdev)
> 	{
> 		struct v4l2_subdev *sd =
> 			container_of(vdev, struct v4l2_subdev, devnode);
> 		struct foo_device *foo =
> 			container_of(sd, struct foo_device, sd);
> 
> 		/*
> 		 * TODO: acquire the graph mutex here and remove the
> 		 * entities corresponding to the V4L2 sub-device and its
> 		 * device node from the graph.
> 		 */
> 		media_entity_cleanup(&foo->sd.entity);
> 		
> 		kfree(foo);
> 	}
> 
> 	int foo_probe(struct device *dev)
> 	{
> 		struct foo_device *foo = kmalloc(sizeof(*foo), GFP_KERNEL));
> 
> 		media_entity_pads_init(&foo->sd.entity, 1, &foo->pad);
> 		foo->sd.entity.release = v4l2_subdev_devnode_release;
> 		foo->sd.devnode.release = foo_device_release;
> 		v4l2_async_register_subdev(&foo->sd);
> 	}
> 
> 	void foo_remove(struct device *dev)
> 	{
> 		struct foo_device *foo = dev_get_drvdata(dev);
> 		
> 		v4l2_async_unregister_subdev(&foo->sd);
> 		v4l2_device_unregister_subdev(&foo->sd);
> 		/*
> 		 * v4l2_subdev_put() will end up releasing foo immediately
> 		 * unless file handles are open currently. Thus further
> 		 * accesses to foo are not allowed.
> 		 */
> 		v4l2_subdev_put(&foo->sd);
> 	}
> 	
> Practical considerations
> ------------------------
> 
> - Beyond media entities, other media graph objects could be made refcounted.
>   This is not seen necessary as the related objects such as pads are in
>   practice allocated within the same driver specific struct containing the
>   media entity.

As I mentioned already, you're forgetting about the media interfaces, with 
are created/removed elsewhere as they're (somewhat) independent of the media
entities.

> 
> 	- This does not apply to links which are allocated dynamically by
> 	  the framework, and released at unregistration time, meaning that
> 	  accessing a link will require holding the media graph mutex.

If we're moving to refcounts, then perhaps we can get rid of the
graph mutex or reduce its usage.

> 
> - A custom release callback is to be used for refcounted framework
>   structs. This makes it easy for the drivers to embed the objects in their
>   own structs.
> 
> - As long as an object has its own reference count, getting or putting an
>   object does not affect the refcount of another object that it depends on.
>   For instance, a media entity depends on an existence of the media
>   device.
> 
> - Each media entity holds a reference to the media device. Being able to
>   navigate from the media entity to the media device is used in drivers and
>   an abrupt disconnection of the two is problematic to handle for drivers.
> 
> - At object initialisation time the kref refcount is initialised. The object
>   is put at unregistration time.
>   
>   	- If the object is a media entity, it is removed from the media graph
>  	  when the last reference to the media entity is gone. The release
>  	  callback needs to acquire the media graph mutex in order to
>  	  perform the removal.
> 
> 	- In particular for struct video_device, media_entity release callback will
> 	  put the refcount of the video device (struct device embedded in
> 	  struct video_device)
> 
> 	- For struct v4l2_subdev, struct media_entity contained in struct
> 	  v4l2_subdev is dependent on struct device within struct
> 	  video_device.
> 
> - When a driver creates multiple reference-counted objects (such as multiple
>   media entities for instance) that are part of its own private struct, it
>   will need to reference-count its own struct based on the release callbacks
>   of all those objects. It might be possible to simplify this by providing a
>   single release callback that would cover all objects. This is already
>   feasible for drivers that implement a media_device instance, V4L2
>   sub-devices and V4L2 video devices hold a reference on the media_device,
>   whose release callback could then act as a single gatekeeper. For other
>   drivers we haven't explored yet whether the core could meaningfully
>   provide help, but it should be remembered that drivers that implement
>   multiple graph objects and that do not implement a media_device are not
>   legion. Most slave drivers (such as sensor drivers) only implement a
>   single v4l2_subdev.
> 
> - No V4L2 sub-device driver currently supports unbinding the device safely
>   when a media entity (or V4L2 sub-device) is registered. Prevent manual
>   unbind for all sub-device drivers by setting the suppress_bind_attrs field
>   in struct device_driver.
> 
> 
> Rules for navigating the media graph and accessing graph objects
> ================================================================
> 
> - Graph traversal is safe as long as the media graph lock is held during
>   traversal. (I.e. no changes here.)
> 
> 	- During graph traversal entity refcount must be increased in order
> 	  to prevent them from being released. As the media entity's release()
> 	  callback may be already in process at this point, a variant
> 	  checking that the reference count has not reached zero must be
> 	  used (media_entity_get_try()).
> 
> - In order to keep a reference to an object after the graph traversal has
>   finished, a reference to the object must be obtained and held as long as
>   the object is in use.
> 
> 	- The same applies to file handles to V4L2 sub-devices or drivers
> 	  keeping a reference to another driver's sub-device.
> 
> - Once the reference count of an object reaches zero it will not be
>   increased again, even if that object was part of a memory allocation which
>   still has referenced objects. Obtaining references to such an object must
>   fail.
> 
> - Navigating within a memory allocation while holding a reference to an
>   object in that allocation to obtain another object is allowed. The target
>   object type must provide a function testing for zero references in order
>   not to not to increment reference count that has reached zero.
>   kref_get_unless_zero() can be used for this. Especially drivers may need
>   this.
> 
> - Circular references are not allowed.
> 
> - It is safe to move between objects of the same lifetime (same kref),
>   e.g. struct v4l2_subdev and struct media_entity it contains.
> 
> The rules must be documented as part of the Media framework ReST
> documentation.
> 
> 
> Stopping the hardware safely at device unbind
> =============================================
> 
> What typically gets little attention in driver implementation is stopping
> safely whenever the hardware is being removed. The kernel frameworks are not
> very helpful in this respect --- after all how this is done somewhat depends
> on the hardware. Only hot-pluggable devices are generally affected. While it
> is possible to manually unbind e.g. platform devices, that certainly is a
> lesser problem for such devices.
> 
> The rules are:
> 
> - No memory related to data structures that are needed during driver
>   operation can be released as long as interrupt handlers may be still
>   executing or may start executing or user file handles are open.
> 
> - All hardware access must cease after the remove() callback in driver ops
>   struct has returned. (For USB devices this callback is called
>   disconnect(), but its purpose is precisely the same.)
> 
> Handling this correctly will require taking care of a few common matters:
> 
> 1. Checking for device presence in each system call issued on a device, and
>    returning -ENODEV if the device is not found.
> 
> 2. The driver's remove() function must ensure that no system calls that
>    could result in hardware access are ongoing.
>    
> 3. Some system calls may sleep, and while doing so also releasing and later
>    reacquiring (when they continue) the mutexes their processes were
>    holding. Such processes must be woken up.
> 
> This is not an easy task for a driver and the V4L2 and Media controller
> frameworks need to help the drivers to achieve this so that drivers would
> need to worry about this as little as possible. Drivers relying for their
> locking needs on the videobuf2 and the V4L2 frameworks are likely to require
> fewer changes.
> 
> The same approach must be taken in issuing V4L2 sub-device operations by
> other kernel drivers.
> 
> 
> Serialising unbind with hardware access through system calls
> ------------------------------------------------------------
> 
> What's below is concentrated in V4L2; Media controller will need similar
> changes.
> 
> - Add a function called video_unregister_device_prepare() (or
>   video_device_mark_unregistered() or a similarly named function) to amend
>   the functionality of video_unregister_device(): mark the device
>   unregistering so that new system calls issues on the device will return an
>   error.
> 
> - Device node specific use count for active system calls.
> 
> - Waiting list for waiting for the use count to reach zero (for proceeding
>   unregistration in v4l2_unregister_device()). Each finishing system call on
>   a file descriptor will wake up the waiting list.
> 
> - Processes waiting in a waiting list of a system call (e.g. poll, DQBUF
>   IOCTL) will be woken up. After waiting, the processes will check whether
>   the video node is still registered, i.e. video_unregister_device_prepare()
>   has not been called.
> 
> 
> Things to remember
> ------------------
> 
> - Using the devm_request_irq() is _NOT_ allowed in general case, as the
>   free_irq() results in being called _after_ the driver's release callback.
>   This means that if new interrupts still arrive between driver's release
>   callback returning and devres resources not being released, the driver's
>   interrupt handler will still run. Use request_irq() and free_irq()
>   instead.
> 
> 
> What changes in practice?
> =========================
> 
> Most of the changes needed in order to align the current implementation to
> conform the above are to be made in the frameworks themselves. What changes
> in drivers, though, is how objects are initialised and released. This mostly
> applies to drivers' own data structures that often contain reference counted
> object types from the frameworks.
> 
> Relations between various frameworks objects will need to be handled in a
> standard way in the appropriate registeration, unregistration and reference
> getting and putting functions. Drivers (as well as frameworks, such as the
> V4L2 sub-device framework in its open and close file operations) will make
> use of these functions to get an put references to the said objects.
> 
> Drivers that are not hot-pluggable could largely ignore the changes, as
> unregistering an object does in fact take place right before releasing that
> object, meaning drivers that did not perform any object reference counting
> are no more broken with the changes described above without implementing
> reference counting. Still care must be taken that they remain decently
> unloadable when there are no users.
> 
> New drivers should be required to implement object reference counting
> correctly, whether hot-pluggable or not. One platform driver (e.g. omap3isp)
> should be fixed to function as a model for new platform drivers.
> 

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [ANN] Media object lifetime management meeting report from Oslo
  2017-01-27 11:38 ` Mauro Carvalho Chehab
@ 2017-01-27 22:02   ` Sakari Ailus
  2017-01-28 15:35     ` Laurent Pinchart
  0 siblings, 1 reply; 8+ messages in thread
From: Sakari Ailus @ 2017-01-27 22:02 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: linux-media, laurent.pinchart, hverkuil, Shuah Khan, Helen Koike

Hi Mauro,

Obrigado for the comments! Please see my replies below.

On Fri, Jan 27, 2017 at 09:38:31AM -0200, Mauro Carvalho Chehab wrote:
> Hi Sakari/Hans/Laurent,
> 
> First of all, thanks for looking into those issues. Unfortunately, I was in
> vacations, and were not able to be with you there for such discussions.
> 
> While I have a somewhat different view on some of the introductory points of 
> this RFC, what really matters is the "proposal" part of it. So, based on the
> experiments I did when I addressed the hotplug issues with the media
> controller on USB drivers, I'm adding a few comments below.
> 
> 
> Em Fri, 27 Jan 2017 12:08:22 +0200
> Sakari Ailus <sakari.ailus@iki.fi> escreveu:
> 
> > Allocating struct media_devnode separately from struct media_device
> > -------------------------------------------------------------------
> > 
> > The current codebase allocates struct media_devnode dynamically. This was
> > done in order to reduce the time window during which unallocated kernel
> > memory could be accessed. However, there is no specific need for such a
> > separation as the entire data structure, including the media device which
> > used to contain struct media_devnode, will have the same lifetime. Thus the
> > struct media_devnode should be allocated as part of struct media_device.
> > Later on, struct media_devnode may be merged with struct media_device if so
> > decided.
> 
> There is one issue merging struct media_devnode at struct media_device.
> The problem is that, if the same struct device is used for two different
> APIs (like V4L2 and media_controller) , e. g. if the cdev parent points
> to the same struct device, you may end by having a double free when the
> device is unregistered on the second core. That's basically why 
> currently struct cdev is at struct media_devnode: this way, it has its own
> struct device.

One of the conclusions of the memorandum I wrote is that the data structures
that are involved in executing a system call (including IOCTLs) must stay
intact during that system call. (Well, the alternative I can think of is
using an rw_semaphore but I certainly do not advocate that solution.)

Otherwise, we will continue to have serialisation issues: kernel memory that
may be released at any point of time independently of a system call in
progress:

<URL:https://www.mail-archive.com/linux-media@vger.kernel.org/msg106390.html>

That one is a NULL pointer dereference but released kernel memory can be
accessed as well.

> 
> IMHO, it also makes sense to embeed struct cdev at the V4L2 side, as I
> detected some race issues at the V4L2 side when I ran the bind/unbind
> race tests, when we tried to merge the snd-usb-audio MC support patch.
> I remember Shuah reported something similar on that time.
> 
> > Allocating struct media_device dynamically
> > ------------------------------------------
> > 
> > Traditionally the media device has been allocated as part of drivers' own
> > structures. This is certainly fine as such, but many drivers have allocated
> > the driver private struct using the devm_() family of functions. This causes
> > such memory to be released at the time of device removal rather than at the
> > time when the memory is no longer accessible by e.g. user space processes or
> > interrupt handlers. Besides the media device, the driver's own data
> > structure is very likely to have the precisely same lifetime issue: it may
> > also be accessed through IOCTLs after the device has been removed from the
> > system.
> > 
> > Instead, the memory needs to be released as part of the release() callback
> > of the media device which is called when the last reference to the media
> > device is gone. Still, allocating the media device outside another
> > containing driver specific struct will be required in some cases: sharing
> > the media device mandates that. Implementing this can certainly be postponed
> > until sharing the media device is otherwise supported as well.
> 
> The patches adding MC support for snd-usb-audio, pending since Kernel
> 4.7 (I guess) require such functionatilty. Last year, on the audio
> summit, they asked about its status, as they're needing MC suppor there.
> 
> So, whatever solution taken, this should be part of the solution.
> 
> (c/c Shuah, as she is the one working on it)

I think we should postpone that, or at least resolve that in a separate
patchset. This is already getting very big. The refcounting problems will be
harder to fix, should we extend the MC framework with media device sharing
without considering the object lifetime issues first. So the order should
be: first fix object lifetime issues, then add more features.

Matters unrelated to the object lifetime issues such as the device (driver)
specific fields in struct media_device need to be addressed before support
for sharing the media device can be implemented. That work can be done
independently of fixing the object lifetime issues.

Implemeting media device sharing correctly will also be easier after
the media device is refcounted: the object lifetime issues with a shared
media device are resolvable (and likely very easily so).

> 
> > Lifetime of the media entities and related framework or driver objects
> > ======================================================================
> > 
> > The media graph data structure is a complex data structure consisting of
> > media graph objects that themselves may be either entities, links or pads.
> > The media graph objects are allocated by various drivers and currently have
> > a lifetime related to binding and unbinding of the related hardware. The
> > current rules (i.e. acquiring the media graph lock for graph traversal) are
> > not enough to properly support removing entities.
> > 
> > Instead, reference counting needs to be introduced for media entities and
> > other objects that contain media entities. References must be acquired (or
> > in some cases, borrowed) whenever a reference is held to an object. A
> > release callback is used to release the memory allocation that contains an
> > object when the last reference to the object has been put. For instance:
> > 
> > 	struct media_entity {
> > 		...
> > 		struct kref kref;
> > 		void (*release)(struct media_entity *entity);
> > 	};
> 
> The similar rationale is valid also for media interfaces and, on a
> lesser extent, to pads and links.
> 
> So, instead, IMHO, the kref should be implemented at struct media_gobj,
> with is inherited by all graph elements. I guess that handling it
> there will avoid us to duplicate the same logic on different
> places (e. g. interface and entity logic handling; links handling).

I don't argue against doing that in principle, but my question is: is there
a need for that? Does a driver (or a framework) need to be able to hold a
link, a pad, for instance? In the proposal, acquiring the media entity would
be practically enough to hold the pads (albeit that would have to be
documented), or we could add functions for that that would be wrappers to
media_entity_get() and so on:

void media_pad_get(struct media_pad *pad)
{
	media_entity_get(pad->entity);
}

The lifetime of the graph objects are dependent of each other in any cases;
you can't have pads having significantly different lifetime than the entity
they belong to. Typically such a case is related to making sure structs stay
around as long as something may still access them in a case when graph
objects are being removed.

> 
> Btw, as you want to use OMAP3 as a reference driver, you should
> implement interface and interface links there, in order to have a
> broader testcase coverage, or use some other driver that already
> implements it.
> 
> Alternatively, we should work on merging the Helen's vimc driver, being
> sure that it provides hot plug support for all kinds of media graph
> objects. It should be easier/faster to detect race issues on such a driver
> that doesn't depend on a hardware to complete their operations. So,
> I'm a big fan on using it as primary testcase. Of course, we'll need to
> test with real drivers too.

The vimc driver looks pretty good in my opinion; it could be merged after
addressing the comments. I wrote quite a few comments to the patchset but
AFAIR all the matters are minor.

I see no reason why we shouldn't implement these fixes for both of the two
drivers. My understanding is that the driver changes due to refcounting will
be relatively minor after all. It's the framework changes that will be
harder.

> 	
> > 	/*
> > 	 * For refcounting, the caller has to set entity->release after
> > 	 * initialising the entity.
> > 	 */
> > 	void media_entity_pads_init(struct media_entity *entity)
> > 	{
> > 		...
> > 		kref_init(&entity->kref);
> > 	}
> > 	
> > 	int media_device_register_entity(struct media_device *mdev,
> > 					 struct media_entity *entity)
> > 	{
> > 		...
> > 		entity->graph_obj.mdev = mdev;
> > 	}
> > 	
> > 	void media_entity_release(struct kref *kref)
> > 	{
> > 		struct media_entity *entity =
> > 			container_of(kref, struct media_entity, kref);
> > 		struct media_device *mdev = entity->graph_obj.mdev;
> > 		
> > 		entity->release(entity);
> > 		if (mdev)
> > 			media_device_put(mdev);
> > 	}
> > 	
> > 	void media_entity_put(struct media_entity *entity)
> > 	{
> > 		kref_put(&entity->kref, media_entity_release);
> > 	}
> > 
> > 	void media_entity_get(struct media_entity *entity)
> > 	{
> > 		kref_get(&entity->kref);
> > 	}
> > 	
> > The above takes into account the current behaviour which makes no use of
> > reference counting while still allowing reference counting for drivers that
> > use it.
> > 
> > Often media entities are contained in another struct such as struct
> > v4l2_subdev. In that case, the allocator of that struct must provide a
> > release callback which is used as a destructor of that object. Functions to
> > count references (get() and put()) are defined to obtain and released
> > references to the said objects. Wrappers are added for the types containing
> > the object type that does have the refcount, for instance in this case:
> > 
> > 	void v4l2_subdev_get(struct v4l2_subdev *sd)
> > 	{
> > 		media_entity_get(&sd->entity);
> > 	}
> > 
> > 	void v4l2_subdev_put(struct v4l2_subdev *sd)
> > 	{
> > 		media_entity_put(&sd->entity);
> > 	}
> > 
> > Memory allocations that contain one or more objects must be considered
> > indivisible. All the objects inside a struct are allocated and released at
> > one point of time. For instance, struct v4l2_subdev contains both struct
> > media_entity and struct video_device. struct video_device contains struct
> > device (related to the character device visible to the user) that is
> > reference counted already.
> 
> Here's the problem: if we point the media_devnode's cdev parent to
> the same object as struct video_device, both cores will try to
> decrement its kref at release() time.

At the time of registering a struct video_device with the struct
media_device, the underlying refcount (struct
media_device.devnode.dev.kobj.kref) needs to be incremented. This needs to
done using media_device_get(), with (a) few layers of abstraction.

This guarantees that the struct media_device will stick around at least as
long as the struct video_device.

The struct video_device cdev arrangement differs from what's being done in
the CEC framework and elsewhere; that should be fixed I guess. Still, the
struct media_device reference should be put in v4l2_device_release() (the
release callback of the struct device corresponding to the device node,
struct video_device.dev.release).

> 
> > 
> > In the example below, foo is a sub-device driver. The struct foo_device
> > contains a struct v4l2_subdev field sd which in turn contains a media
> > entity. In addition to that, struct v4l2_subdev contains a struct
> > video_device (the devnode field) that currently is a pointer. In order to
> > simplify the reference counting, it is assumed to be directly a part of the
> > struct v4l2_subdev below. The example assumes that the driver requests a
> > device node for the V4L2 sub-device and does not address the case where the
> > device node is not created.
> > 
> > 
> > 	struct media_entity {
> > 		...
> > 		struct kref kref;
> > 		void (*release)(struct media_entity *entity);
> > 	};
> > 	
> > 	struct v4l2_subdev {
> > 		...
> > 		struct video_device devnode;
> > 	};
> > 
> > 	void media_entity_release(struct kref *kref)
> > 	{
> > 		struct media_entity *entity =
> > 			container_of(kref, struct media_entity, kref);
> > 		struct media_device *mdev = entity->graph_obj.mdev;
> > 		
> > 		/* Release the media entity, possibly releasing its memory. */
> > 		entity->release(entity);
> > 
> > 		/*
> > 		 * If it was bound to a media device, put the media device
> > 		 * as well.
> > 		 */
> > 		if (mdev)
> > 			media_device_put(mdev);
> > 	}
> > 	
> > 	void media_entity_put(struct media_entity *entity)
> > 	{
> > 		kref_put(&entity->kref, media_entity_release);
> > 	}
> > 
> > 	/*
> > 	 * Returns zero if getting entity succeeds, -ENODEV otherwise.
> > 	 */
> > 	int media_entity_try_get(struct media_entity *entity)
> > 	{
> > 		return kref_get_unless_zero(&entity->kref) ? 0 : -ENODEV;
> > 	}
> > 
> > 	void media_entity_get(struct media_entity *entity)
> > 	{
> > 		kref_get(&entity->kref);
> > 	}
> > 	
> > 	/*
> > 	 * Depending on the data structure containing the media entity, the
> > 	 * caller has to set entity->release after initialising the entity.
> > 	 */
> > 	void media_entity_pads_init(struct media_entity *entity)
> > 	{
> > 		...
> > 		kref_init(&entity->kref);
> > 	}
> > 	
> > 	int media_device_register_entity(struct media_device *mdev,
> > 					 struct media_entity *entity)
> > 	{
> > 		...
> > 		/*
> > 		 * Bind the media entity to the media device; thus increment
> > 		 * media device refcount. The media device typically is not
> > 		 * available during sub-device driver probe() time. This
> > 		 * requires that a media entity may only be registered once.
> > 		 */
> > 		entity->graph_obj.mdev = mdev;
> > 		media_device_get(mdev);
> > 		/* And get the entity as well. */
> > 		media_entity_get(entity);
> > 	}
> > 	
> > 	
> > 	void media_device_unregister_entity(struct media_entity *entity)
> > 	{
> > 		...
> > 		media_entity_put(entity);
> > 	}
> > 
> > 	int v4l2_device_register_subdev(struct v4l2_subdev *sd)
> > 	{
> > 		...
> > 		media_device_register_entity(&sd->entity);
> > 		...
> > 	}
> > 	
> > 	void v4l2_device_unregister_subdev(struct v4l2_subdev *sd)
> > 	{
> > 		...
> > 		media_device_unregister_entity(&sd->entity);
> > 	}
> > 
> > 	/*
> > 	 * Custom release callback function for V4L2 sub-devices that export
> > 	 * a device node. (Could be used for others as well, with
> > 	 * sd->devnode.release() callback.)
> > 	 */
> > 	void v4l2_device_subdev_devnode_release(struct media_entity *entity)
> > 	{
> > 		struct v4l2_subdev *sd = media_entity_to_v4l2_subdev(entity);
> > 
> > 		/*
> > 		 * If the devnode got registered we still hold a reference
> > 		 * to it. Check this from vdev->prio (but any other field
> > 		 * which gets set during device node registration and never
> > 		 * changed again could be used). If the devnode was never
> > 		 * registered, call its release function directly.
> > 		 */
> > 		if (sd->devnode.prio)
> > 			video_put(sd->devnode);
> > 		else
> > 			sd->devnode.release(&sd->devnode);
> > 	}
> > 	
> > 	int v4l2_device_register_subdev_nodes(struct v4l2_device *vdev)
> > 	{
> > 		struct video_device *vdev;
> >  	        struct v4l2_subdev *sd;
> > 	        int err;
> > 
> > 	        list_for_each_entry(sd, &v4l2_dev->subdevs, list) {
> > 			__video_register_device(&sd->devnode, ...);
> > 			video_get(&sd->devnode);
> > 			sd->devnode.prio = something non-NULL;
> > 		}
> > 	}
> > 	
> > 	int v4l2_subdev_try_get(struct v4l2_subdev *sd)
> > 	{
> > 		return media_entity_try_get(&sd->entity);
> > 	}
> > 
> > 	void v4l2_subdev_get(struct v4l2_subdev *sd)
> > 	{
> > 		media_entity_get(&sd->entity);
> > 	}
> > 
> > 	void v4l2_subdev_put(struct v4l2_subdev *sd)
> > 	{
> > 		media_entity_put(&sd->entity);
> > 	}
> > 
> > 	/* V4L2 sub-device open file operation handler */	
> > 	int subdev_open(struct file *file)
> > 	{
> > 		struct video_device *vdev = video_devdata(file);
> > 		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);
> > 		int rval;
> > 		
> > 		/*
> > 		 * The v4l2_subdev depends on the video device node. It thus
> > 		 * may be that the media entity refcount (which is also used
> > 		 * to count references to the v4l2_subdev) has reached zero
> > 		 * here. However its memory is still intact as it's part of
> > 		 * the same struct v4l2_subdev.
> > 		 */
> > 		rval = v4l2_subdev_try_get(sd);
> > 		if (rval)
> > 			return rval;
> > 		
> > 		...
> > 	}
> > 	
> > 	/* V4L2 sub-device release file operation handler */
> > 	int subdev_close(struct file *file)
> > 	{
> > 		struct video_device *vdev = video_devdata(file);
> > 		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);
> > 
> > 		v4l2_subdev_put(sd);
> > 	}
> > 
> > 	struct foo_device {
> > 		struct v4l2_subdev sd;
> > 		struct media_pad pad;
> > 	};
> > 
> > 	void foo_device_release(struct video_device *vdev)
> > 	{
> > 		struct v4l2_subdev *sd =
> > 			container_of(vdev, struct v4l2_subdev, devnode);
> > 		struct foo_device *foo =
> > 			container_of(sd, struct foo_device, sd);
> > 
> > 		/*
> > 		 * TODO: acquire the graph mutex here and remove the
> > 		 * entities corresponding to the V4L2 sub-device and its
> > 		 * device node from the graph.
> > 		 */
> > 		media_entity_cleanup(&foo->sd.entity);
> > 		
> > 		kfree(foo);
> > 	}
> > 
> > 	int foo_probe(struct device *dev)
> > 	{
> > 		struct foo_device *foo = kmalloc(sizeof(*foo), GFP_KERNEL));
> > 
> > 		media_entity_pads_init(&foo->sd.entity, 1, &foo->pad);
> > 		foo->sd.entity.release = v4l2_subdev_devnode_release;
> > 		foo->sd.devnode.release = foo_device_release;
> > 		v4l2_async_register_subdev(&foo->sd);
> > 	}
> > 
> > 	void foo_remove(struct device *dev)
> > 	{
> > 		struct foo_device *foo = dev_get_drvdata(dev);
> > 		
> > 		v4l2_async_unregister_subdev(&foo->sd);
> > 		v4l2_device_unregister_subdev(&foo->sd);
> > 		/*
> > 		 * v4l2_subdev_put() will end up releasing foo immediately
> > 		 * unless file handles are open currently. Thus further
> > 		 * accesses to foo are not allowed.
> > 		 */
> > 		v4l2_subdev_put(&foo->sd);
> > 	}
> > 	
> > Practical considerations
> > ------------------------
> > 
> > - Beyond media entities, other media graph objects could be made refcounted.
> >   This is not seen necessary as the related objects such as pads are in
> >   practice allocated within the same driver specific struct containing the
> >   media entity.
> 
> As I mentioned already, you're forgetting about the media interfaces, with 
> are created/removed elsewhere as they're (somewhat) independent of the media
> entities.

Right. Nothing prevents adding a kref to struct media_interface. It'd work
the same way than with media entities. Interface-wise, the refcounting
should use _get()/_put() functions of the native types independently of
where the kref refcount is located. (Obviously this requires that it has to
be in the same memory allocation. For instance, v4l2_subdev's depending on
kref in struct media_entity in the example.)

Having to do add kref to struct media_interface, though, does increase the
appeal of adding kref to each graph object instead. Still, my hunch is that
this would be vastly overkill: the other graph objects we have are very
tightly bound to entities (and media interfaces). I can't see a use case for
them right now other than than just walking the graph. That said, I'm
certainly open for new ideas and information that could probe me wrong.

> 
> > 
> > 	- This does not apply to links which are allocated dynamically by
> > 	  the framework, and released at unregistration time, meaning that
> > 	  accessing a link will require holding the media graph mutex.
> 
> If we're moving to refcounts, then perhaps we can get rid of the
> graph mutex or reduce its usage.

It'll be a lot easier to get rid of the graph mutex, I agree. Only the graph
walk will depend on holding the graph mutex then, and it's much easier to
address the serialisation needs in that use case only. The solution will
still be non-trivial as it's likely to require wait/wound mutexes: the graph
walk can well start at two locations of a graph simultanoeously and two
walkers may thus try to parse the same graph nodes. One of them needs to
back away in that case. This is something that needs to be managed in the
users of the graph walk API.

> 
> > 
> > - A custom release callback is to be used for refcounted framework
> >   structs. This makes it easy for the drivers to embed the objects in their
> >   own structs.
> > 
> > - As long as an object has its own reference count, getting or putting an
> >   object does not affect the refcount of another object that it depends on.
> >   For instance, a media entity depends on an existence of the media
> >   device.
> > 
> > - Each media entity holds a reference to the media device. Being able to
> >   navigate from the media entity to the media device is used in drivers and
> >   an abrupt disconnection of the two is problematic to handle for drivers.
> > 
> > - At object initialisation time the kref refcount is initialised. The object
> >   is put at unregistration time.
> >   
> >   	- If the object is a media entity, it is removed from the media graph
> >  	  when the last reference to the media entity is gone. The release
> >  	  callback needs to acquire the media graph mutex in order to
> >  	  perform the removal.
> > 
> > 	- In particular for struct video_device, media_entity release callback will
> > 	  put the refcount of the video device (struct device embedded in
> > 	  struct video_device)
> > 
> > 	- For struct v4l2_subdev, struct media_entity contained in struct
> > 	  v4l2_subdev is dependent on struct device within struct
> > 	  video_device.
> > 
> > - When a driver creates multiple reference-counted objects (such as multiple
> >   media entities for instance) that are part of its own private struct, it
> >   will need to reference-count its own struct based on the release callbacks
> >   of all those objects. It might be possible to simplify this by providing a
> >   single release callback that would cover all objects. This is already
> >   feasible for drivers that implement a media_device instance, V4L2
> >   sub-devices and V4L2 video devices hold a reference on the media_device,
> >   whose release callback could then act as a single gatekeeper. For other
> >   drivers we haven't explored yet whether the core could meaningfully
> >   provide help, but it should be remembered that drivers that implement
> >   multiple graph objects and that do not implement a media_device are not
> >   legion. Most slave drivers (such as sensor drivers) only implement a
> >   single v4l2_subdev.
> > 
> > - No V4L2 sub-device driver currently supports unbinding the device safely
> >   when a media entity (or V4L2 sub-device) is registered. Prevent manual
> >   unbind for all sub-device drivers by setting the suppress_bind_attrs field
> >   in struct device_driver.
> > 
> > 
> > Rules for navigating the media graph and accessing graph objects
> > ================================================================
> > 
> > - Graph traversal is safe as long as the media graph lock is held during
> >   traversal. (I.e. no changes here.)
> > 
> > 	- During graph traversal entity refcount must be increased in order
> > 	  to prevent them from being released. As the media entity's release()
> > 	  callback may be already in process at this point, a variant
> > 	  checking that the reference count has not reached zero must be
> > 	  used (media_entity_get_try()).
> > 
> > - In order to keep a reference to an object after the graph traversal has
> >   finished, a reference to the object must be obtained and held as long as
> >   the object is in use.
> > 
> > 	- The same applies to file handles to V4L2 sub-devices or drivers
> > 	  keeping a reference to another driver's sub-device.
> > 
> > - Once the reference count of an object reaches zero it will not be
> >   increased again, even if that object was part of a memory allocation which
> >   still has referenced objects. Obtaining references to such an object must
> >   fail.
> > 
> > - Navigating within a memory allocation while holding a reference to an
> >   object in that allocation to obtain another object is allowed. The target
> >   object type must provide a function testing for zero references in order
> >   not to not to increment reference count that has reached zero.
> >   kref_get_unless_zero() can be used for this. Especially drivers may need
> >   this.
> > 
> > - Circular references are not allowed.
> > 
> > - It is safe to move between objects of the same lifetime (same kref),
> >   e.g. struct v4l2_subdev and struct media_entity it contains.
> > 
> > The rules must be documented as part of the Media framework ReST
> > documentation.
> > 
> > 
> > Stopping the hardware safely at device unbind
> > =============================================
> > 
> > What typically gets little attention in driver implementation is stopping
> > safely whenever the hardware is being removed. The kernel frameworks are not
> > very helpful in this respect --- after all how this is done somewhat depends
> > on the hardware. Only hot-pluggable devices are generally affected. While it
> > is possible to manually unbind e.g. platform devices, that certainly is a
> > lesser problem for such devices.
> > 
> > The rules are:
> > 
> > - No memory related to data structures that are needed during driver
> >   operation can be released as long as interrupt handlers may be still
> >   executing or may start executing or user file handles are open.
> > 
> > - All hardware access must cease after the remove() callback in driver ops
> >   struct has returned. (For USB devices this callback is called
> >   disconnect(), but its purpose is precisely the same.)
> > 
> > Handling this correctly will require taking care of a few common matters:
> > 
> > 1. Checking for device presence in each system call issued on a device, and
> >    returning -ENODEV if the device is not found.
> > 
> > 2. The driver's remove() function must ensure that no system calls that
> >    could result in hardware access are ongoing.
> >    
> > 3. Some system calls may sleep, and while doing so also releasing and later
> >    reacquiring (when they continue) the mutexes their processes were
> >    holding. Such processes must be woken up.
> > 
> > This is not an easy task for a driver and the V4L2 and Media controller
> > frameworks need to help the drivers to achieve this so that drivers would
> > need to worry about this as little as possible. Drivers relying for their
> > locking needs on the videobuf2 and the V4L2 frameworks are likely to require
> > fewer changes.
> > 
> > The same approach must be taken in issuing V4L2 sub-device operations by
> > other kernel drivers.
> > 
> > 
> > Serialising unbind with hardware access through system calls
> > ------------------------------------------------------------
> > 
> > What's below is concentrated in V4L2; Media controller will need similar
> > changes.
> > 
> > - Add a function called video_unregister_device_prepare() (or
> >   video_device_mark_unregistered() or a similarly named function) to amend
> >   the functionality of video_unregister_device(): mark the device
> >   unregistering so that new system calls issues on the device will return an
> >   error.
> > 
> > - Device node specific use count for active system calls.
> > 
> > - Waiting list for waiting for the use count to reach zero (for proceeding
> >   unregistration in v4l2_unregister_device()). Each finishing system call on
> >   a file descriptor will wake up the waiting list.
> > 
> > - Processes waiting in a waiting list of a system call (e.g. poll, DQBUF
> >   IOCTL) will be woken up. After waiting, the processes will check whether
> >   the video node is still registered, i.e. video_unregister_device_prepare()
> >   has not been called.
> > 
> > 
> > Things to remember
> > ------------------
> > 
> > - Using the devm_request_irq() is _NOT_ allowed in general case, as the
> >   free_irq() results in being called _after_ the driver's release callback.
> >   This means that if new interrupts still arrive between driver's release
> >   callback returning and devres resources not being released, the driver's
> >   interrupt handler will still run. Use request_irq() and free_irq()
> >   instead.
> > 
> > 
> > What changes in practice?
> > =========================
> > 
> > Most of the changes needed in order to align the current implementation to
> > conform the above are to be made in the frameworks themselves. What changes
> > in drivers, though, is how objects are initialised and released. This mostly
> > applies to drivers' own data structures that often contain reference counted
> > object types from the frameworks.
> > 
> > Relations between various frameworks objects will need to be handled in a
> > standard way in the appropriate registeration, unregistration and reference
> > getting and putting functions. Drivers (as well as frameworks, such as the
> > V4L2 sub-device framework in its open and close file operations) will make
> > use of these functions to get an put references to the said objects.
> > 
> > Drivers that are not hot-pluggable could largely ignore the changes, as
> > unregistering an object does in fact take place right before releasing that
> > object, meaning drivers that did not perform any object reference counting
> > are no more broken with the changes described above without implementing
> > reference counting. Still care must be taken that they remain decently
> > unloadable when there are no users.
> > 
> > New drivers should be required to implement object reference counting
> > correctly, whether hot-pluggable or not. One platform driver (e.g. omap3isp)
> > should be fixed to function as a model for new platform drivers.
> > 
> 
> Thanks,
> Mauro

-- 
Kind regards,

Sakari Ailus
e-mail: sakari.ailus@iki.fi	XMPP: sailus@retiisi.org.uk

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [ANN] Media object lifetime management meeting report from Oslo
  2017-01-27 22:02   ` Sakari Ailus
@ 2017-01-28 15:35     ` Laurent Pinchart
  2017-01-30  8:39       ` Hans Verkuil
  0 siblings, 1 reply; 8+ messages in thread
From: Laurent Pinchart @ 2017-01-28 15:35 UTC (permalink / raw)
  To: Sakari Ailus
  Cc: Mauro Carvalho Chehab, linux-media, hverkuil, Shuah Khan, Helen Koike

Hi Sakari,

On Saturday 28 Jan 2017 00:02:53 Sakari Ailus wrote:
> On Fri, Jan 27, 2017 at 09:38:31AM -0200, Mauro Carvalho Chehab wrote:
> > Hi Sakari/Hans/Laurent,
> > 
> > First of all, thanks for looking into those issues. Unfortunately, I was
> > in vacations, and were not able to be with you there for such discussions.
> > 
> > While I have a somewhat different view on some of the introductory points
> > of this RFC, what really matters is the "proposal" part of it. So, based
> > on the experiments I did when I addressed the hotplug issues with the
> > media controller on USB drivers, I'm adding a few comments below.
> > 
> > Em Fri, 27 Jan 2017 12:08:22 +0200 Sakari Ailus escreveu:
> >> Allocating struct media_devnode separately from struct media_device
> >> -------------------------------------------------------------------
> >> 
> >> The current codebase allocates struct media_devnode dynamically. This
> >> was done in order to reduce the time window during which unallocated
> >> kernel memory could be accessed. However, there is no specific need for
> >> such a separation as the entire data structure, including the media
> >> device which used to contain struct media_devnode, will have the same
> >> lifetime. Thus the struct media_devnode should be allocated as part of
> >> struct media_device. Later on, struct media_devnode may be merged with
> >> struct media_device if so decided.
> > 
> > There is one issue merging struct media_devnode at struct media_device.
> > The problem is that, if the same struct device is used for two different
> > APIs (like V4L2 and media_controller) , e. g. if the cdev parent points
> > to the same struct device, you may end by having a double free when the
> > device is unregistered on the second core. That's basically why
> > currently struct cdev is at struct media_devnode: this way, it has its own
> > struct device.
> 
> One of the conclusions of the memorandum I wrote is that the data structures
> that are involved in executing a system call (including IOCTLs) must stay
> intact during that system call. (Well, the alternative I can think of is
> using an rw_semaphore but I certainly do not advocate that solution.)
> 
> Otherwise, we will continue to have serialisation issues: kernel memory that
> may be released at any point of time independently of a system call in
> progress:
> 
> <URL:https://www.mail-archive.com/linux-media@vger.kernel.org/msg106390.html
> >
> 
> That one is a NULL pointer dereference but released kernel memory can be
> accessed as well.
> 
> > IMHO, it also makes sense to embeed struct cdev at the V4L2 side, as I
> > detected some race issues at the V4L2 side when I ran the bind/unbind
> > race tests, when we tried to merge the snd-usb-audio MC support patch.
> > I remember Shuah reported something similar on that time.
> > 
> >> Allocating struct media_device dynamically
> >> ------------------------------------------
> >> 
> >> Traditionally the media device has been allocated as part of drivers'
> >> own structures. This is certainly fine as such, but many drivers have
> >> allocated the driver private struct using the devm_() family of
> >> functions. This causes such memory to be released at the time of device
> >> removal rather than at the time when the memory is no longer accessible
> >> by e.g. user space processes or interrupt handlers. Besides the media
> >> device, the driver's own data structure is very likely to have the
> >> precisely same lifetime issue: it may also be accessed through IOCTLs
> >> after the device has been removed from the system.
> >> 
> >> Instead, the memory needs to be released as part of the release()
> >> callback of the media device which is called when the last reference to
> >> the media device is gone. Still, allocating the media device outside
> >> another containing driver specific struct will be required in some cases:
> >> sharing the media device mandates that. Implementing this can certainly
> >> be postponed until sharing the media device is otherwise supported as
> >> well.
> > 
> > The patches adding MC support for snd-usb-audio, pending since Kernel
> > 4.7 (I guess) require such functionatilty. Last year, on the audio
> > summit, they asked about its status, as they're needing MC suppor there.
> > 
> > So, whatever solution taken, this should be part of the solution.
> > 
> > (c/c Shuah, as she is the one working on it)
> 
> I think we should postpone that, or at least resolve that in a separate
> patchset. This is already getting very big. The refcounting problems will be
> harder to fix, should we extend the MC framework with media device sharing
> without considering the object lifetime issues first. So the order should
> be: first fix object lifetime issues, then add more features.

I strongly second that, continuing to build on top of a totally broken base 
can only lead to disaster. Given that we have given the lifetime management 
problem lots of thoughts already, with a patch series out to fix it and work 
ongoing to rework that series, I don't see a reason to order development in a 
different way.

> Matters unrelated to the object lifetime issues such as the device (driver)
> specific fields in struct media_device need to be addressed before support
> for sharing the media device can be implemented. That work can be done
> independently of fixing the object lifetime issues.
> 
> Implemeting media device sharing correctly will also be easier after
> the media device is refcounted: the object lifetime issues with a shared
> media device are resolvable (and likely very easily so).
> 
> >> Lifetime of the media entities and related framework or driver objects
> >> ======================================================================
> >> 
> >> The media graph data structure is a complex data structure consisting of
> >> media graph objects that themselves may be either entities, links or
> >> pads. The media graph objects are allocated by various drivers and
> >> currently have a lifetime related to binding and unbinding of the related
> >> hardware. The current rules (i.e. acquiring the media graph lock for
> >> graph traversal) are not enough to properly support removing entities.
> >> 
> >> Instead, reference counting needs to be introduced for media entities
> >> and other objects that contain media entities. References must be
> >> acquired (or in some cases, borrowed) whenever a reference is held to an
> >> object. A release callback is used to release the memory allocation that
> >> contains an object when the last reference to the object has been put.
> >>
> >> For instance:
> >>
> >> 	struct media_entity {
> >> 		...
> >> 		struct kref kref;
> >> 		void (*release)(struct media_entity *entity);
> >> 	};
> > 
> > The similar rationale is valid also for media interfaces and, on a
> > lesser extent, to pads and links.
> > 
> > So, instead, IMHO, the kref should be implemented at struct media_gobj,
> > with is inherited by all graph elements. I guess that handling it
> > there will avoid us to duplicate the same logic on different
> > places (e. g. interface and entity logic handling; links handling).
> 
> I don't argue against doing that in principle, but my question is: is there
> a need for that? Does a driver (or a framework) need to be able to hold a
> link, a pad, for instance? In the proposal, acquiring the media entity would
> be practically enough to hold the pads (albeit that would have to be
> documented), or we could add functions for that that would be wrappers to
> media_entity_get() and so on:
> 
> void media_pad_get(struct media_pad *pad)
> {
> 	media_entity_get(pad->entity);
> }
> 
> The lifetime of the graph objects are dependent of each other in any cases;
> you can't have pads having significantly different lifetime than the entity
> they belong to. Typically such a case is related to making sure structs stay
> around as long as something may still access them in a case when graph
> objects are being removed.

I also believe that tying the lifetime of pads to the lifetime of entities is 
a good solution.

> > Btw, as you want to use OMAP3 as a reference driver, you should
> > implement interface and interface links there, in order to have a
> > broader testcase coverage, or use some other driver that already
> > implements it.
> > 
> > Alternatively, we should work on merging the Helen's vimc driver, being
> > sure that it provides hot plug support for all kinds of media graph
> > objects. It should be easier/faster to detect race issues on such a driver
> > that doesn't depend on a hardware to complete their operations. So,
> > I'm a big fan on using it as primary testcase. Of course, we'll need to
> > test with real drivers too.
> 
> The vimc driver looks pretty good in my opinion; it could be merged after
> addressing the comments. I wrote quite a few comments to the patchset but
> AFAIR all the matters are minor.
> 
> I see no reason why we shouldn't implement these fixes for both of the two
> drivers. My understanding is that the driver changes due to refcounting will
> be relatively minor after all. It's the framework changes that will be
> harder.

Agreed. On the topic of interfaces and interface links, I'd like to point out 
that we have never reached an agreement on them and that they have been merged 
despite a strong opposition.

> >> 	/*
> >> 	 * For refcounting, the caller has to set entity->release after
> >> 	 * initialising the entity.
> >> 	 */
> >> 	void media_entity_pads_init(struct media_entity *entity)
> >> 	{
> >> 		...
> >> 		kref_init(&entity->kref);
> >> 	}
> >> 	
> >> 	int media_device_register_entity(struct media_device *mdev,
> >> 					 struct media_entity *entity)
> >> 	{
> >> 		...
> >> 		entity->graph_obj.mdev = mdev;
> >> 	}
> >> 	
> >> 	void media_entity_release(struct kref *kref)
> >> 	{
> >> 		struct media_entity *entity =
> >> 			container_of(kref, struct media_entity, kref);
> >> 		struct media_device *mdev = entity->graph_obj.mdev;
> >> 		
> >> 		entity->release(entity);
> >> 		if (mdev)
> >> 			media_device_put(mdev);
> >> 	}
> >> 	
> >> 	void media_entity_put(struct media_entity *entity)
> >> 	{
> >> 		kref_put(&entity->kref, media_entity_release);
> >> 	}
> >> 	
> >> 	void media_entity_get(struct media_entity *entity)
> >> 	{
> >> 		kref_get(&entity->kref);
> >> 	}
> >> 
> >> The above takes into account the current behaviour which makes no use of
> >> reference counting while still allowing reference counting for drivers
> >> that use it.
> >> 
> >> Often media entities are contained in another struct such as struct
> >> v4l2_subdev. In that case, the allocator of that struct must provide a
> >> release callback which is used as a destructor of that object. Functions
> >> to count references (get() and put()) are defined to obtain and released
> >> references to the said objects. Wrappers are added for the types
> >> containing the object type that does have the refcount, for instance in
> >> this case:
> >>
> >> 	void v4l2_subdev_get(struct v4l2_subdev *sd)
> >> 	{
> >> 		media_entity_get(&sd->entity);
> >> 	}
> >> 	
> >> 	void v4l2_subdev_put(struct v4l2_subdev *sd)
> >> 	{
> >> 		media_entity_put(&sd->entity);
> >> 	}
> >> 
> >> Memory allocations that contain one or more objects must be considered
> >> indivisible. All the objects inside a struct are allocated and released
> >> at one point of time. For instance, struct v4l2_subdev contains both
> >> struct media_entity and struct video_device. struct video_device contains
> >> struct device (related to the character device visible to the user) that
> >> is reference counted already.
> > 
> > Here's the problem: if we point the media_devnode's cdev parent to
> > the same object as struct video_device, both cores will try to
> > decrement its kref at release() time.
> 
> At the time of registering a struct video_device with the struct
> media_device, the underlying refcount (struct
> media_device.devnode.dev.kobj.kref) needs to be incremented. This needs to
> done using media_device_get(), with (a) few layers of abstraction.
> 
> This guarantees that the struct media_device will stick around at least as
> long as the struct video_device.
> 
> The struct video_device cdev arrangement differs from what's being done in
> the CEC framework and elsewhere; that should be fixed I guess. Still, the
> struct media_device reference should be put in v4l2_device_release() (the
> release callback of the struct device corresponding to the device node,
> struct video_device.dev.release).

There are two static functions named v4l2_device_release(), one in v4l2-
device.c and one in v4l2-dev.c. I assume you mean the latter (which we could 
rename to avoid confusion).

On a related topic, video_device_release_empty() should be removed.

> >> In the example below, foo is a sub-device driver. The struct foo_device
> >> contains a struct v4l2_subdev field sd which in turn contains a media
> >> entity. In addition to that, struct v4l2_subdev contains a struct
> >> video_device (the devnode field) that currently is a pointer. In order
> >> to simplify the reference counting, it is assumed to be directly a part
> >> of the struct v4l2_subdev below. The example assumes that the driver
> >> requests a device node for the V4L2 sub-device and does not address the
> >> case where the device node is not created.
> >> 
> >> 	struct media_entity {
> >> 		...
> >> 		struct kref kref;
> >> 		void (*release)(struct media_entity *entity);
> >> 	};
> >> 	
> >> 	struct v4l2_subdev {
> >> 		...
> >> 		struct video_device devnode;
> >> 	};
> >> 	
> >> 	void media_entity_release(struct kref *kref)
> >> 	{
> >> 		struct media_entity *entity =
> >> 			container_of(kref, struct media_entity, kref);
> >> 		struct media_device *mdev = entity->graph_obj.mdev;
> >> 		
> >> 		/* Release the media entity, possibly releasing its memory. */
> >> 		entity->release(entity);
> >> 		
> >> 		/*
> >> 		 * If it was bound to a media device, put the media device
> >> 		 * as well.
> >> 		 */
> >> 		if (mdev)
> >> 			media_device_put(mdev);
> >> 	}
> >> 	
> >> 	void media_entity_put(struct media_entity *entity)
> >> 	{
> >> 		kref_put(&entity->kref, media_entity_release);
> >> 	}
> >> 	
> >> 	/*
> >> 	 * Returns zero if getting entity succeeds, -ENODEV otherwise.
> >> 	 */
> >> 	int media_entity_try_get(struct media_entity *entity)
> >> 	{
> >> 		return kref_get_unless_zero(&entity->kref) ? 0 : -ENODEV;
> >> 	}
> >> 	
> >> 	void media_entity_get(struct media_entity *entity)
> >> 	{
> >> 		kref_get(&entity->kref);
> >> 	}
> >> 	
> >> 	/*
> >> 	 * Depending on the data structure containing the media entity, the
> >> 	 * caller has to set entity->release after initialising the entity.
> >> 	 */
> >> 	void media_entity_pads_init(struct media_entity *entity)
> >> 	{
> >> 		...
> >> 		kref_init(&entity->kref);
> >> 	}
> >> 	
> >> 	int media_device_register_entity(struct media_device *mdev,
> >> 					 struct media_entity *entity)
> >> 	{
> >> 		...
> >> 		/*
> >> 		 * Bind the media entity to the media device; thus increment
> >> 		 * media device refcount. The media device typically is not
> >> 		 * available during sub-device driver probe() time. This
> >> 		 * requires that a media entity may only be registered once.
> >> 		 */
> >> 		entity->graph_obj.mdev = mdev;
> >> 		media_device_get(mdev);
> >> 		/* And get the entity as well. */
> >> 		media_entity_get(entity);
> >> 	}
> >> 	
> >> 	void media_device_unregister_entity(struct media_entity *entity)
> >> 	{
> >> 		...
> >> 		media_entity_put(entity);
> >> 	}
> >> 	
> >> 	int v4l2_device_register_subdev(struct v4l2_subdev *sd)
> >> 	{
> >> 		...
> >> 		media_device_register_entity(&sd->entity);
> >> 		...
> >> 	}
> >> 	
> >> 	void v4l2_device_unregister_subdev(struct v4l2_subdev *sd)
> >> 	{
> >> 		...
> >> 		media_device_unregister_entity(&sd->entity);
> >> 	}
> >> 	
> >> 	/*
> >> 	 * Custom release callback function for V4L2 sub-devices that export
> >> 	 * a device node. (Could be used for others as well, with
> >> 	 * sd->devnode.release() callback.)
> >> 	 */
> >> 	void v4l2_device_subdev_devnode_release(struct media_entity *entity)
> >> 	{
> >> 		struct v4l2_subdev *sd = media_entity_to_v4l2_subdev(entity);
> >> 		
> >> 		/*
> >> 		 * If the devnode got registered we still hold a reference
> >> 		 * to it. Check this from vdev->prio (but any other field
> >> 		 * which gets set during device node registration and never
> >> 		 * changed again could be used). If the devnode was never
> >> 		 * registered, call its release function directly.
> >> 		 */
> >> 		if (sd->devnode.prio)
> >> 			video_put(sd->devnode);
> >> 		else
> >> 			sd->devnode.release(&sd->devnode);
> >> 	}
> >> 	
> >> 	int v4l2_device_register_subdev_nodes(struct v4l2_device *vdev)
> >> 	{
> >> 		struct video_device *vdev;
> >>  	        struct v4l2_subdev *sd;
> >> 	        int err;
> >> 	        
> >> 	        list_for_each_entry(sd, &v4l2_dev->subdevs, list) {
> >> 			__video_register_device(&sd->devnode, ...);
> >> 			video_get(&sd->devnode);
> >> 			sd->devnode.prio = something non-NULL;
> >> 		}
> >> 	}
> >> 	
> >> 	int v4l2_subdev_try_get(struct v4l2_subdev *sd)
> >> 	{
> >> 		return media_entity_try_get(&sd->entity);
> >> 	}
> >> 	
> >> 	void v4l2_subdev_get(struct v4l2_subdev *sd)
> >> 	{
> >> 		media_entity_get(&sd->entity);
> >> 	}
> >> 	
> >> 	void v4l2_subdev_put(struct v4l2_subdev *sd)
> >> 	{
> >> 		media_entity_put(&sd->entity);
> >> 	}
> >> 	
> >> 	/* V4L2 sub-device open file operation handler */
> >> 	int subdev_open(struct file *file)
> >> 	{
> >> 		struct video_device *vdev = video_devdata(file);
> >> 		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);
> >> 		int rval;
> >> 		
> >> 		/*
> >> 		 * The v4l2_subdev depends on the video device node. It thus
> >> 		 * may be that the media entity refcount (which is also used
> >> 		 * to count references to the v4l2_subdev) has reached zero
> >> 		 * here. However its memory is still intact as it's part of
> >> 		 * the same struct v4l2_subdev.
> >> 		 */
> >> 		rval = v4l2_subdev_try_get(sd);
> >> 		if (rval)
> >> 			return rval;
> >> 		
> >> 		...
> >> 	}
> >> 	
> >> 	/* V4L2 sub-device release file operation handler */
> >> 	int subdev_close(struct file *file)
> >> 	{
> >> 		struct video_device *vdev = video_devdata(file);
> >> 		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);
> >> 		
> >> 		v4l2_subdev_put(sd);
> >> 	}
> >> 	
> >> 	struct foo_device {
> >> 		struct v4l2_subdev sd;
> >> 		struct media_pad pad;
> >> 	};
> >> 	
> >> 	void foo_device_release(struct video_device *vdev)
> >> 	{
> >> 		struct v4l2_subdev *sd =
> >> 			container_of(vdev, struct v4l2_subdev, devnode);
> >> 		struct foo_device *foo =
> >> 			container_of(sd, struct foo_device, sd);
> >> 		
> >> 		/*
> >> 		 * TODO: acquire the graph mutex here and remove the
> >> 		 * entities corresponding to the V4L2 sub-device and its
> >> 		 * device node from the graph.
> >> 		 */
> >> 		media_entity_cleanup(&foo->sd.entity);
> >> 		kfree(foo);
> >> 	}
> >> 	
> >> 	int foo_probe(struct device *dev)
> >> 	{
> >> 		struct foo_device *foo = kmalloc(sizeof(*foo), GFP_KERNEL));
> >> 		
> >> 		media_entity_pads_init(&foo->sd.entity, 1, &foo->pad);
> >> 		foo->sd.entity.release = v4l2_subdev_devnode_release;
> >> 		foo->sd.devnode.release = foo_device_release;
> >> 		v4l2_async_register_subdev(&foo->sd);
> >> 	}
> >> 	
> >> 	void foo_remove(struct device *dev)
> >> 	{
> >> 		struct foo_device *foo = dev_get_drvdata(dev);
> >> 		
> >> 		v4l2_async_unregister_subdev(&foo->sd);
> >> 		v4l2_device_unregister_subdev(&foo->sd);
> >> 		/*
> >> 		 * v4l2_subdev_put() will end up releasing foo immediately
> >> 		 * unless file handles are open currently. Thus further
> >> 		 * accesses to foo are not allowed.
> >> 		 */
> >> 		v4l2_subdev_put(&foo->sd);
> >> 	}
> >> 
> >> Practical considerations
> >> ------------------------
> >> 
> >> - Beyond media entities, other media graph objects could be made
> >>   refcounted. This is not seen necessary as the related objects such as
> >>   pads are in practice allocated within the same driver specific struct
> >>   containing the media entity.
> > 
> > As I mentioned already, you're forgetting about the media interfaces, with
> > are created/removed elsewhere as they're (somewhat) independent of the
> > media entities.
> 
> Right. Nothing prevents adding a kref to struct media_interface. It'd work
> the same way than with media entities. Interface-wise, the refcounting
> should use _get()/_put() functions of the native types independently of
> where the kref refcount is located. (Obviously this requires that it has to
> be in the same memory allocation. For instance, v4l2_subdev's depending on
> kref in struct media_entity in the example.)
> 
> Having to do add kref to struct media_interface, though, does increase the
> appeal of adding kref to each graph object instead. Still, my hunch is that
> this would be vastly overkill: the other graph objects we have are very
> tightly bound to entities (and media interfaces). I can't see a use case for
> them right now other than than just walking the graph. That said, I'm
> certainly open for new ideas and information that could probe me wrong.

I agree with you.

> >> 	- This does not apply to links which are allocated dynamically by
> >> 	  the framework, and released at unregistration time, meaning that
> >> 	  accessing a link will require holding the media graph mutex.
> > 
> > If we're moving to refcounts, then perhaps we can get rid of the
> > graph mutex or reduce its usage.
> 
> It'll be a lot easier to get rid of the graph mutex, I agree. Only the graph
> walk will depend on holding the graph mutex then, and it's much easier to
> address the serialisation needs in that use case only. The solution will
> still be non-trivial as it's likely to require wait/wound mutexes: the
> graph walk can well start at two locations of a graph simultanoeously and
> two walkers may thus try to parse the same graph nodes. One of them needs
> to back away in that case. This is something that needs to be managed in
> the users of the graph walk API.
> 
> >> - A custom release callback is to be used for refcounted framework
> >>   structs. This makes it easy for the drivers to embed the objects in
> >>   their own structs.
> >> 
> >> - As long as an object has its own reference count, getting or putting
> >>   an object does not affect the refcount of another object that it
> >>   depends on. For instance, a media entity depends on an existence of the
> >>   media device.
> >> 
> >> - Each media entity holds a reference to the media device. Being able to
> >>   navigate from the media entity to the media device is used in drivers
> >>   and an abrupt disconnection of the two is problematic to handle for
> >>   drivers.
> >> 
> >> - At object initialisation time the kref refcount is initialised. The
> >>   object is put at unregistration time.
> >>   
> >>   	- If the object is a media entity, it is removed from the media graph
> >>  	  when the last reference to the media entity is gone. The release
> >>  	  callback needs to acquire the media graph mutex in order to
> >>  	  perform the removal.
> >> 	
> >> 	- In particular for struct video_device, media_entity release callback
> >> 	  will put the refcount of the video device (struct device embedded in
> >> 	  struct video_device)
> >> 	
> >> 	- For struct v4l2_subdev, struct media_entity contained in struct
> >> 	  v4l2_subdev is dependent on struct device within struct
> >> 	  video_device.
> >> 
> >> - When a driver creates multiple reference-counted objects (such as
> >>   multiple media entities for instance) that are part of its own private
> >>   struct, it will need to reference-count its own struct based on the
> >>   release callbacks of all those objects. It might be possible to
> >>   simplify this by providing a single release callback that would cover
> >>   all objects. This is already feasible for drivers that implement a
> >>   media_device instance, V4L2 sub-devices and V4L2 video devices hold a
> >>   reference on the media_device, whose release callback could then act as
> >>   a single gatekeeper. For other drivers we haven't explored yet whether
> >>   the core could meaningfully provide help, but it should be remembered
> >>   that drivers that implement multiple graph objects and that do not
> >>   implement a media_device are not legion. Most slave drivers (such as
> >>   sensor drivers) only implement a single v4l2_subdev.
> >> 
> >> - No V4L2 sub-device driver currently supports unbinding the device
> >>   safely when a media entity (or V4L2 sub-device) is registered. Prevent
> >>   manual unbind for all sub-device drivers by setting the
> >>   suppress_bind_attrs field in struct device_driver.
> >> 
> >> Rules for navigating the media graph and accessing graph objects
> >> ================================================================
> >> 
> >> - Graph traversal is safe as long as the media graph lock is held during
> >>   traversal. (I.e. no changes here.)
> >> 	
> >> 	- During graph traversal entity refcount must be increased in order
> >> 	  to prevent them from being released. As the media entity's release()
> >> 	  callback may be already in process at this point, a variant
> >> 	  checking that the reference count has not reached zero must be
> >> 	  used (media_entity_get_try()).
> >> 
> >> - In order to keep a reference to an object after the graph traversal
> >>   has finished, a reference to the object must be obtained and held as
> >>   long as the object is in use.
> >> 	
> >> 	- The same applies to file handles to V4L2 sub-devices or drivers
> >> 	  keeping a reference to another driver's sub-device.
> >> 
> >> - Once the reference count of an object reaches zero it will not be
> >>   increased again, even if that object was part of a memory allocation
> >>   which still has referenced objects. Obtaining references to such an
> >>   object must fail.
> >> 
> >> - Navigating within a memory allocation while holding a reference to an
> >>   object in that allocation to obtain another object is allowed. The
> >>   target object type must provide a function testing for zero references
> >>   in order not to not to increment reference count that has reached zero.
> >>   kref_get_unless_zero() can be used for this. Especially drivers may
> >>   need this.
> >> 
> >> - Circular references are not allowed.
> >> 
> >> - It is safe to move between objects of the same lifetime (same kref),
> >>   e.g. struct v4l2_subdev and struct media_entity it contains.
> >> 
> >> The rules must be documented as part of the Media framework ReST
> >> documentation.
> >> 
> >> 
> >> Stopping the hardware safely at device unbind
> >> =============================================
> >> 
> >> What typically gets little attention in driver implementation is
> >> stopping safely whenever the hardware is being removed. The kernel
> >> frameworks are not very helpful in this respect --- after all how this is
> >> done somewhat depends on the hardware. Only hot-pluggable devices are
> >> generally affected. While it is possible to manually unbind e.g.
> >> platform devices, that certainly is a lesser problem for such devices.
> >> 
> >> The rules are:
> >> 
> >> - No memory related to data structures that are needed during driver
> >>   operation can be released as long as interrupt handlers may be still
> >>   executing or may start executing or user file handles are open.
> >> 
> >> - All hardware access must cease after the remove() callback in driver
> >>   ops struct has returned. (For USB devices this callback is called
> >>   disconnect(), but its purpose is precisely the same.)
> >> 
> >> Handling this correctly will require taking care of a few common
> >> matters:
> >> 
> >> 1. Checking for device presence in each system call issued on a device,
> >>    and returning -ENODEV if the device is not found.
> >> 
> >> 2. The driver's remove() function must ensure that no system calls that
> >>    could result in hardware access are ongoing.
> >> 
> >> 3. Some system calls may sleep, and while doing so also releasing and
> >>    later reacquiring (when they continue) the mutexes their processes
> >>    were holding. Such processes must be woken up.
> >> 
> >> This is not an easy task for a driver and the V4L2 and Media controller
> >> frameworks need to help the drivers to achieve this so that drivers
> >> would need to worry about this as little as possible. Drivers relying for
> >> their locking needs on the videobuf2 and the V4L2 frameworks are likely
> >> to require fewer changes.
> >> 
> >> The same approach must be taken in issuing V4L2 sub-device operations by
> >> other kernel drivers.
> >> 
> >> 
> >> Serialising unbind with hardware access through system calls
> >> ------------------------------------------------------------
> >> 
> >> What's below is concentrated in V4L2; Media controller will need similar
> >> changes.
> >> 
> >> - Add a function called video_unregister_device_prepare() (or
> >>   video_device_mark_unregistered() or a similarly named function) to
> >>   amend the functionality of video_unregister_device(): mark the device
> >>   unregistering so that new system calls issues on the device will
> >>   return an error.
> >> 
> >> - Device node specific use count for active system calls.
> >> 
> >> - Waiting list for waiting for the use count to reach zero (for
> >>   proceeding unregistration in v4l2_unregister_device()). Each finishing
> >>   system call on a file descriptor will wake up the waiting list.
> >> 
> >> - Processes waiting in a waiting list of a system call (e.g. poll, DQBUF
> >>   IOCTL) will be woken up. After waiting, the processes will check
> >>   whether the video node is still registered, i.e.
> >>   video_unregister_device_prepare()
> >>   has not been called.
> >> 
> >> Things to remember
> >> ------------------
> >> 
> >> - Using the devm_request_irq() is _NOT_ allowed in general case, as the
> >>   free_irq() results in being called _after_ the driver's release
> >>   callback. This means that if new interrupts still arrive between
> >>   driver's release callback returning and devres resources not being
> >>   released, the driver's interrupt handler will still run. Use
> >>   request_irq() and free_irq() instead.
> >> 
> >> What changes in practice?
> >> =========================
> >> 
> >> Most of the changes needed in order to align the current implementation
> >> to conform the above are to be made in the frameworks themselves. What
> >> changes in drivers, though, is how objects are initialised and released.
> >> This mostly applies to drivers' own data structures that often contain
> >> reference counted object types from the frameworks.
> >> 
> >> Relations between various frameworks objects will need to be handled in
> >> a standard way in the appropriate registeration, unregistration and
> >> reference getting and putting functions. Drivers (as well as frameworks,
> >> such as the V4L2 sub-device framework in its open and close file
> >> operations) will make use of these functions to get an put references to
> >> the said objects.
> >> 
> >> Drivers that are not hot-pluggable could largely ignore the changes, as
> >> unregistering an object does in fact take place right before releasing
> >> that object, meaning drivers that did not perform any object reference
> >> counting are no more broken with the changes described above without
> >> implementing reference counting. Still care must be taken that they
> >> remain decently unloadable when there are no users.
> >> 
> >> New drivers should be required to implement object reference counting
> >> correctly, whether hot-pluggable or not. One platform driver (e.g.
> >> omap3isp) should be fixed to function as a model for new platform
> >> drivers.

-- 
Regards,

Laurent Pinchart


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [ANN] Media object lifetime management meeting report from Oslo
  2017-01-28 15:35     ` Laurent Pinchart
@ 2017-01-30  8:39       ` Hans Verkuil
  2017-01-30 11:56         ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 8+ messages in thread
From: Hans Verkuil @ 2017-01-30  8:39 UTC (permalink / raw)
  To: Laurent Pinchart, Sakari Ailus
  Cc: Mauro Carvalho Chehab, linux-media, Shuah Khan, Helen Koike

On 28/01/17 16:35, Laurent Pinchart wrote:
> Hi Sakari,
>
> On Saturday 28 Jan 2017 00:02:53 Sakari Ailus wrote:
>> On Fri, Jan 27, 2017 at 09:38:31AM -0200, Mauro Carvalho Chehab wrote:
>>> Hi Sakari/Hans/Laurent,
>>>
>>> First of all, thanks for looking into those issues. Unfortunately, I was
>>> in vacations, and were not able to be with you there for such discussions.
>>>
>>> While I have a somewhat different view on some of the introductory points
>>> of this RFC, what really matters is the "proposal" part of it. So, based
>>> on the experiments I did when I addressed the hotplug issues with the
>>> media controller on USB drivers, I'm adding a few comments below.
>>>
>>> Em Fri, 27 Jan 2017 12:08:22 +0200 Sakari Ailus escreveu:
>>>> Allocating struct media_devnode separately from struct media_device
>>>> -------------------------------------------------------------------
>>>>
>>>> The current codebase allocates struct media_devnode dynamically. This
>>>> was done in order to reduce the time window during which unallocated
>>>> kernel memory could be accessed. However, there is no specific need for
>>>> such a separation as the entire data structure, including the media
>>>> device which used to contain struct media_devnode, will have the same
>>>> lifetime. Thus the struct media_devnode should be allocated as part of
>>>> struct media_device. Later on, struct media_devnode may be merged with
>>>> struct media_device if so decided.
>>>
>>> There is one issue merging struct media_devnode at struct media_device.
>>> The problem is that, if the same struct device is used for two different
>>> APIs (like V4L2 and media_controller) , e. g. if the cdev parent points
>>> to the same struct device, you may end by having a double free when the
>>> device is unregistered on the second core. That's basically why
>>> currently struct cdev is at struct media_devnode: this way, it has its own
>>> struct device.
>>
>> One of the conclusions of the memorandum I wrote is that the data structures
>> that are involved in executing a system call (including IOCTLs) must stay
>> intact during that system call. (Well, the alternative I can think of is
>> using an rw_semaphore but I certainly do not advocate that solution.)
>>
>> Otherwise, we will continue to have serialisation issues: kernel memory that
>> may be released at any point of time independently of a system call in
>> progress:
>>
>> <URL:https://www.mail-archive.com/linux-media@vger.kernel.org/msg106390.html
>>>
>>
>> That one is a NULL pointer dereference but released kernel memory can be
>> accessed as well.
>>
>>> IMHO, it also makes sense to embeed struct cdev at the V4L2 side, as I
>>> detected some race issues at the V4L2 side when I ran the bind/unbind
>>> race tests, when we tried to merge the snd-usb-audio MC support patch.
>>> I remember Shuah reported something similar on that time.
>>>
>>>> Allocating struct media_device dynamically
>>>> ------------------------------------------
>>>>
>>>> Traditionally the media device has been allocated as part of drivers'
>>>> own structures. This is certainly fine as such, but many drivers have
>>>> allocated the driver private struct using the devm_() family of
>>>> functions. This causes such memory to be released at the time of device
>>>> removal rather than at the time when the memory is no longer accessible
>>>> by e.g. user space processes or interrupt handlers. Besides the media
>>>> device, the driver's own data structure is very likely to have the
>>>> precisely same lifetime issue: it may also be accessed through IOCTLs
>>>> after the device has been removed from the system.
>>>>
>>>> Instead, the memory needs to be released as part of the release()
>>>> callback of the media device which is called when the last reference to
>>>> the media device is gone. Still, allocating the media device outside
>>>> another containing driver specific struct will be required in some cases:
>>>> sharing the media device mandates that. Implementing this can certainly
>>>> be postponed until sharing the media device is otherwise supported as
>>>> well.
>>>
>>> The patches adding MC support for snd-usb-audio, pending since Kernel
>>> 4.7 (I guess) require such functionatilty. Last year, on the audio
>>> summit, they asked about its status, as they're needing MC suppor there.
>>>
>>> So, whatever solution taken, this should be part of the solution.
>>>
>>> (c/c Shuah, as she is the one working on it)
>>
>> I think we should postpone that, or at least resolve that in a separate
>> patchset. This is already getting very big. The refcounting problems will be
>> harder to fix, should we extend the MC framework with media device sharing
>> without considering the object lifetime issues first. So the order should
>> be: first fix object lifetime issues, then add more features.
>
> I strongly second that, continuing to build on top of a totally broken base
> can only lead to disaster. Given that we have given the lifetime management
> problem lots of thoughts already, with a patch series out to fix it and work
> ongoing to rework that series, I don't see a reason to order development in a
> different way.

I agree with that as well.

>> Matters unrelated to the object lifetime issues such as the device (driver)
>> specific fields in struct media_device need to be addressed before support
>> for sharing the media device can be implemented. That work can be done
>> independently of fixing the object lifetime issues.
>>
>> Implemeting media device sharing correctly will also be easier after
>> the media device is refcounted: the object lifetime issues with a shared
>> media device are resolvable (and likely very easily so).
>>
>>>> Lifetime of the media entities and related framework or driver objects
>>>> ======================================================================
>>>>
>>>> The media graph data structure is a complex data structure consisting of
>>>> media graph objects that themselves may be either entities, links or
>>>> pads. The media graph objects are allocated by various drivers and
>>>> currently have a lifetime related to binding and unbinding of the related
>>>> hardware. The current rules (i.e. acquiring the media graph lock for
>>>> graph traversal) are not enough to properly support removing entities.
>>>>
>>>> Instead, reference counting needs to be introduced for media entities
>>>> and other objects that contain media entities. References must be
>>>> acquired (or in some cases, borrowed) whenever a reference is held to an
>>>> object. A release callback is used to release the memory allocation that
>>>> contains an object when the last reference to the object has been put.
>>>>
>>>> For instance:
>>>>
>>>> 	struct media_entity {
>>>> 		...
>>>> 		struct kref kref;
>>>> 		void (*release)(struct media_entity *entity);
>>>> 	};
>>>
>>> The similar rationale is valid also for media interfaces and, on a
>>> lesser extent, to pads and links.
>>>
>>> So, instead, IMHO, the kref should be implemented at struct media_gobj,
>>> with is inherited by all graph elements. I guess that handling it
>>> there will avoid us to duplicate the same logic on different
>>> places (e. g. interface and entity logic handling; links handling).
>>
>> I don't argue against doing that in principle, but my question is: is there
>> a need for that? Does a driver (or a framework) need to be able to hold a
>> link, a pad, for instance? In the proposal, acquiring the media entity would
>> be practically enough to hold the pads (albeit that would have to be
>> documented), or we could add functions for that that would be wrappers to
>> media_entity_get() and so on:
>>
>> void media_pad_get(struct media_pad *pad)
>> {
>> 	media_entity_get(pad->entity);
>> }
>>
>> The lifetime of the graph objects are dependent of each other in any cases;
>> you can't have pads having significantly different lifetime than the entity
>> they belong to. Typically such a case is related to making sure structs stay
>> around as long as something may still access them in a case when graph
>> objects are being removed.
>
> I also believe that tying the lifetime of pads to the lifetime of entities is
> a good solution.

Indeed.

>
>>> Btw, as you want to use OMAP3 as a reference driver, you should
>>> implement interface and interface links there, in order to have a
>>> broader testcase coverage, or use some other driver that already
>>> implements it.
>>>
>>> Alternatively, we should work on merging the Helen's vimc driver, being
>>> sure that it provides hot plug support for all kinds of media graph
>>> objects. It should be easier/faster to detect race issues on such a driver
>>> that doesn't depend on a hardware to complete their operations. So,
>>> I'm a big fan on using it as primary testcase. Of course, we'll need to
>>> test with real drivers too.
>>
>> The vimc driver looks pretty good in my opinion; it could be merged after
>> addressing the comments. I wrote quite a few comments to the patchset but
>> AFAIR all the matters are minor.
>>
>> I see no reason why we shouldn't implement these fixes for both of the two
>> drivers. My understanding is that the driver changes due to refcounting will
>> be relatively minor after all. It's the framework changes that will be
>> harder.
>
> Agreed. On the topic of interfaces and interface links, I'd like to point out
> that we have never reached an agreement on them and that they have been merged
> despite a strong opposition.

Two notes:

1) That's not how I remember things, but that could well be my memory.
2) It's in, so there is no turning back.

>
>>>> 	/*
>>>> 	 * For refcounting, the caller has to set entity->release after
>>>> 	 * initialising the entity.
>>>> 	 */
>>>> 	void media_entity_pads_init(struct media_entity *entity)
>>>> 	{
>>>> 		...
>>>> 		kref_init(&entity->kref);
>>>> 	}
>>>> 	
>>>> 	int media_device_register_entity(struct media_device *mdev,
>>>> 					 struct media_entity *entity)
>>>> 	{
>>>> 		...
>>>> 		entity->graph_obj.mdev = mdev;
>>>> 	}
>>>> 	
>>>> 	void media_entity_release(struct kref *kref)
>>>> 	{
>>>> 		struct media_entity *entity =
>>>> 			container_of(kref, struct media_entity, kref);
>>>> 		struct media_device *mdev = entity->graph_obj.mdev;
>>>> 		
>>>> 		entity->release(entity);
>>>> 		if (mdev)
>>>> 			media_device_put(mdev);
>>>> 	}
>>>> 	
>>>> 	void media_entity_put(struct media_entity *entity)
>>>> 	{
>>>> 		kref_put(&entity->kref, media_entity_release);
>>>> 	}
>>>> 	
>>>> 	void media_entity_get(struct media_entity *entity)
>>>> 	{
>>>> 		kref_get(&entity->kref);
>>>> 	}
>>>>
>>>> The above takes into account the current behaviour which makes no use of
>>>> reference counting while still allowing reference counting for drivers
>>>> that use it.
>>>>
>>>> Often media entities are contained in another struct such as struct
>>>> v4l2_subdev. In that case, the allocator of that struct must provide a
>>>> release callback which is used as a destructor of that object. Functions
>>>> to count references (get() and put()) are defined to obtain and released
>>>> references to the said objects. Wrappers are added for the types
>>>> containing the object type that does have the refcount, for instance in
>>>> this case:
>>>>
>>>> 	void v4l2_subdev_get(struct v4l2_subdev *sd)
>>>> 	{
>>>> 		media_entity_get(&sd->entity);
>>>> 	}
>>>> 	
>>>> 	void v4l2_subdev_put(struct v4l2_subdev *sd)
>>>> 	{
>>>> 		media_entity_put(&sd->entity);
>>>> 	}
>>>>
>>>> Memory allocations that contain one or more objects must be considered
>>>> indivisible. All the objects inside a struct are allocated and released
>>>> at one point of time. For instance, struct v4l2_subdev contains both
>>>> struct media_entity and struct video_device. struct video_device contains
>>>> struct device (related to the character device visible to the user) that
>>>> is reference counted already.
>>>
>>> Here's the problem: if we point the media_devnode's cdev parent to
>>> the same object as struct video_device, both cores will try to
>>> decrement its kref at release() time.
>>
>> At the time of registering a struct video_device with the struct
>> media_device, the underlying refcount (struct
>> media_device.devnode.dev.kobj.kref) needs to be incremented. This needs to
>> done using media_device_get(), with (a) few layers of abstraction.
>>
>> This guarantees that the struct media_device will stick around at least as
>> long as the struct video_device.
>>
>> The struct video_device cdev arrangement differs from what's being done in
>> the CEC framework and elsewhere; that should be fixed I guess. Still, the
>> struct media_device reference should be put in v4l2_device_release() (the
>> release callback of the struct device corresponding to the device node,
>> struct video_device.dev.release).
>
> There are two static functions named v4l2_device_release(), one in v4l2-
> device.c and one in v4l2-dev.c. I assume you mean the latter (which we could
> rename to avoid confusion).
>
> On a related topic, video_device_release_empty() should be removed.

It's there to satify a WARN_ON that checks if the release callback isn't NULL, i.e.
to make sure someone thought about this. I would keep it for now. Certainly we
shouldn't change this behavior until all the other life-time changes are in.
After that we can think about this some more.

>
>>>> In the example below, foo is a sub-device driver. The struct foo_device
>>>> contains a struct v4l2_subdev field sd which in turn contains a media
>>>> entity. In addition to that, struct v4l2_subdev contains a struct
>>>> video_device (the devnode field) that currently is a pointer. In order
>>>> to simplify the reference counting, it is assumed to be directly a part
>>>> of the struct v4l2_subdev below. The example assumes that the driver
>>>> requests a device node for the V4L2 sub-device and does not address the
>>>> case where the device node is not created.
>>>>
>>>> 	struct media_entity {
>>>> 		...
>>>> 		struct kref kref;
>>>> 		void (*release)(struct media_entity *entity);
>>>> 	};
>>>> 	
>>>> 	struct v4l2_subdev {
>>>> 		...
>>>> 		struct video_device devnode;
>>>> 	};
>>>> 	
>>>> 	void media_entity_release(struct kref *kref)
>>>> 	{
>>>> 		struct media_entity *entity =
>>>> 			container_of(kref, struct media_entity, kref);
>>>> 		struct media_device *mdev = entity->graph_obj.mdev;
>>>> 		
>>>> 		/* Release the media entity, possibly releasing its memory. */
>>>> 		entity->release(entity);
>>>> 		
>>>> 		/*
>>>> 		 * If it was bound to a media device, put the media device
>>>> 		 * as well.
>>>> 		 */
>>>> 		if (mdev)
>>>> 			media_device_put(mdev);
>>>> 	}
>>>> 	
>>>> 	void media_entity_put(struct media_entity *entity)
>>>> 	{
>>>> 		kref_put(&entity->kref, media_entity_release);
>>>> 	}
>>>> 	
>>>> 	/*
>>>> 	 * Returns zero if getting entity succeeds, -ENODEV otherwise.
>>>> 	 */
>>>> 	int media_entity_try_get(struct media_entity *entity)
>>>> 	{
>>>> 		return kref_get_unless_zero(&entity->kref) ? 0 : -ENODEV;
>>>> 	}
>>>> 	
>>>> 	void media_entity_get(struct media_entity *entity)
>>>> 	{
>>>> 		kref_get(&entity->kref);
>>>> 	}
>>>> 	
>>>> 	/*
>>>> 	 * Depending on the data structure containing the media entity, the
>>>> 	 * caller has to set entity->release after initialising the entity.
>>>> 	 */
>>>> 	void media_entity_pads_init(struct media_entity *entity)
>>>> 	{
>>>> 		...
>>>> 		kref_init(&entity->kref);
>>>> 	}
>>>> 	
>>>> 	int media_device_register_entity(struct media_device *mdev,
>>>> 					 struct media_entity *entity)
>>>> 	{
>>>> 		...
>>>> 		/*
>>>> 		 * Bind the media entity to the media device; thus increment
>>>> 		 * media device refcount. The media device typically is not
>>>> 		 * available during sub-device driver probe() time. This
>>>> 		 * requires that a media entity may only be registered once.
>>>> 		 */
>>>> 		entity->graph_obj.mdev = mdev;
>>>> 		media_device_get(mdev);
>>>> 		/* And get the entity as well. */
>>>> 		media_entity_get(entity);
>>>> 	}
>>>> 	
>>>> 	void media_device_unregister_entity(struct media_entity *entity)
>>>> 	{
>>>> 		...
>>>> 		media_entity_put(entity);
>>>> 	}
>>>> 	
>>>> 	int v4l2_device_register_subdev(struct v4l2_subdev *sd)
>>>> 	{
>>>> 		...
>>>> 		media_device_register_entity(&sd->entity);
>>>> 		...
>>>> 	}
>>>> 	
>>>> 	void v4l2_device_unregister_subdev(struct v4l2_subdev *sd)
>>>> 	{
>>>> 		...
>>>> 		media_device_unregister_entity(&sd->entity);
>>>> 	}
>>>> 	
>>>> 	/*
>>>> 	 * Custom release callback function for V4L2 sub-devices that export
>>>> 	 * a device node. (Could be used for others as well, with
>>>> 	 * sd->devnode.release() callback.)
>>>> 	 */
>>>> 	void v4l2_device_subdev_devnode_release(struct media_entity *entity)
>>>> 	{
>>>> 		struct v4l2_subdev *sd = media_entity_to_v4l2_subdev(entity);
>>>> 		
>>>> 		/*
>>>> 		 * If the devnode got registered we still hold a reference
>>>> 		 * to it. Check this from vdev->prio (but any other field
>>>> 		 * which gets set during device node registration and never
>>>> 		 * changed again could be used). If the devnode was never
>>>> 		 * registered, call its release function directly.
>>>> 		 */
>>>> 		if (sd->devnode.prio)
>>>> 			video_put(sd->devnode);
>>>> 		else
>>>> 			sd->devnode.release(&sd->devnode);
>>>> 	}
>>>> 	
>>>> 	int v4l2_device_register_subdev_nodes(struct v4l2_device *vdev)
>>>> 	{
>>>> 		struct video_device *vdev;
>>>>  	        struct v4l2_subdev *sd;
>>>> 	        int err;
>>>> 	
>>>> 	        list_for_each_entry(sd, &v4l2_dev->subdevs, list) {
>>>> 			__video_register_device(&sd->devnode, ...);
>>>> 			video_get(&sd->devnode);
>>>> 			sd->devnode.prio = something non-NULL;
>>>> 		}
>>>> 	}
>>>> 	
>>>> 	int v4l2_subdev_try_get(struct v4l2_subdev *sd)
>>>> 	{
>>>> 		return media_entity_try_get(&sd->entity);
>>>> 	}
>>>> 	
>>>> 	void v4l2_subdev_get(struct v4l2_subdev *sd)
>>>> 	{
>>>> 		media_entity_get(&sd->entity);
>>>> 	}
>>>> 	
>>>> 	void v4l2_subdev_put(struct v4l2_subdev *sd)
>>>> 	{
>>>> 		media_entity_put(&sd->entity);
>>>> 	}
>>>> 	
>>>> 	/* V4L2 sub-device open file operation handler */
>>>> 	int subdev_open(struct file *file)
>>>> 	{
>>>> 		struct video_device *vdev = video_devdata(file);
>>>> 		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);
>>>> 		int rval;
>>>> 		
>>>> 		/*
>>>> 		 * The v4l2_subdev depends on the video device node. It thus
>>>> 		 * may be that the media entity refcount (which is also used
>>>> 		 * to count references to the v4l2_subdev) has reached zero
>>>> 		 * here. However its memory is still intact as it's part of
>>>> 		 * the same struct v4l2_subdev.
>>>> 		 */
>>>> 		rval = v4l2_subdev_try_get(sd);
>>>> 		if (rval)
>>>> 			return rval;
>>>> 		
>>>> 		...
>>>> 	}
>>>> 	
>>>> 	/* V4L2 sub-device release file operation handler */
>>>> 	int subdev_close(struct file *file)
>>>> 	{
>>>> 		struct video_device *vdev = video_devdata(file);
>>>> 		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);
>>>> 		
>>>> 		v4l2_subdev_put(sd);
>>>> 	}
>>>> 	
>>>> 	struct foo_device {
>>>> 		struct v4l2_subdev sd;
>>>> 		struct media_pad pad;
>>>> 	};
>>>> 	
>>>> 	void foo_device_release(struct video_device *vdev)
>>>> 	{
>>>> 		struct v4l2_subdev *sd =
>>>> 			container_of(vdev, struct v4l2_subdev, devnode);
>>>> 		struct foo_device *foo =
>>>> 			container_of(sd, struct foo_device, sd);
>>>> 		
>>>> 		/*
>>>> 		 * TODO: acquire the graph mutex here and remove the
>>>> 		 * entities corresponding to the V4L2 sub-device and its
>>>> 		 * device node from the graph.
>>>> 		 */
>>>> 		media_entity_cleanup(&foo->sd.entity);
>>>> 		kfree(foo);
>>>> 	}
>>>> 	
>>>> 	int foo_probe(struct device *dev)
>>>> 	{
>>>> 		struct foo_device *foo = kmalloc(sizeof(*foo), GFP_KERNEL));
>>>> 		
>>>> 		media_entity_pads_init(&foo->sd.entity, 1, &foo->pad);
>>>> 		foo->sd.entity.release = v4l2_subdev_devnode_release;
>>>> 		foo->sd.devnode.release = foo_device_release;
>>>> 		v4l2_async_register_subdev(&foo->sd);
>>>> 	}
>>>> 	
>>>> 	void foo_remove(struct device *dev)
>>>> 	{
>>>> 		struct foo_device *foo = dev_get_drvdata(dev);
>>>> 		
>>>> 		v4l2_async_unregister_subdev(&foo->sd);
>>>> 		v4l2_device_unregister_subdev(&foo->sd);
>>>> 		/*
>>>> 		 * v4l2_subdev_put() will end up releasing foo immediately
>>>> 		 * unless file handles are open currently. Thus further
>>>> 		 * accesses to foo are not allowed.
>>>> 		 */
>>>> 		v4l2_subdev_put(&foo->sd);
>>>> 	}
>>>>
>>>> Practical considerations
>>>> ------------------------
>>>>
>>>> - Beyond media entities, other media graph objects could be made
>>>>   refcounted. This is not seen necessary as the related objects such as
>>>>   pads are in practice allocated within the same driver specific struct
>>>>   containing the media entity.
>>>
>>> As I mentioned already, you're forgetting about the media interfaces, with
>>> are created/removed elsewhere as they're (somewhat) independent of the
>>> media entities.
>>
>> Right. Nothing prevents adding a kref to struct media_interface. It'd work
>> the same way than with media entities. Interface-wise, the refcounting
>> should use _get()/_put() functions of the native types independently of
>> where the kref refcount is located. (Obviously this requires that it has to
>> be in the same memory allocation. For instance, v4l2_subdev's depending on
>> kref in struct media_entity in the example.)
>>
>> Having to do add kref to struct media_interface, though, does increase the
>> appeal of adding kref to each graph object instead. Still, my hunch is that
>> this would be vastly overkill: the other graph objects we have are very
>> tightly bound to entities (and media interfaces). I can't see a use case for
>> them right now other than than just walking the graph. That said, I'm
>> certainly open for new ideas and information that could probe me wrong.
>
> I agree with you.

Me too.

>
>>>> 	- This does not apply to links which are allocated dynamically by
>>>> 	  the framework, and released at unregistration time, meaning that
>>>> 	  accessing a link will require holding the media graph mutex.
>>>
>>> If we're moving to refcounts, then perhaps we can get rid of the
>>> graph mutex or reduce its usage.
>>
>> It'll be a lot easier to get rid of the graph mutex, I agree. Only the graph
>> walk will depend on holding the graph mutex then, and it's much easier to
>> address the serialisation needs in that use case only. The solution will
>> still be non-trivial as it's likely to require wait/wound mutexes: the
>> graph walk can well start at two locations of a graph simultanoeously and
>> two walkers may thus try to parse the same graph nodes. One of them needs
>> to back away in that case. This is something that needs to be managed in
>> the users of the graph walk API.
>>
>>>> - A custom release callback is to be used for refcounted framework
>>>>   structs. This makes it easy for the drivers to embed the objects in
>>>>   their own structs.
>>>>
>>>> - As long as an object has its own reference count, getting or putting
>>>>   an object does not affect the refcount of another object that it
>>>>   depends on. For instance, a media entity depends on an existence of the
>>>>   media device.
>>>>
>>>> - Each media entity holds a reference to the media device. Being able to
>>>>   navigate from the media entity to the media device is used in drivers
>>>>   and an abrupt disconnection of the two is problematic to handle for
>>>>   drivers.
>>>>
>>>> - At object initialisation time the kref refcount is initialised. The
>>>>   object is put at unregistration time.
>>>>
>>>>   	- If the object is a media entity, it is removed from the media graph
>>>>  	  when the last reference to the media entity is gone. The release
>>>>  	  callback needs to acquire the media graph mutex in order to
>>>>  	  perform the removal.
>>>> 	
>>>> 	- In particular for struct video_device, media_entity release callback
>>>> 	  will put the refcount of the video device (struct device embedded in
>>>> 	  struct video_device)
>>>> 	
>>>> 	- For struct v4l2_subdev, struct media_entity contained in struct
>>>> 	  v4l2_subdev is dependent on struct device within struct
>>>> 	  video_device.
>>>>
>>>> - When a driver creates multiple reference-counted objects (such as
>>>>   multiple media entities for instance) that are part of its own private
>>>>   struct, it will need to reference-count its own struct based on the
>>>>   release callbacks of all those objects. It might be possible to
>>>>   simplify this by providing a single release callback that would cover
>>>>   all objects. This is already feasible for drivers that implement a
>>>>   media_device instance, V4L2 sub-devices and V4L2 video devices hold a
>>>>   reference on the media_device, whose release callback could then act as
>>>>   a single gatekeeper. For other drivers we haven't explored yet whether
>>>>   the core could meaningfully provide help, but it should be remembered
>>>>   that drivers that implement multiple graph objects and that do not
>>>>   implement a media_device are not legion. Most slave drivers (such as
>>>>   sensor drivers) only implement a single v4l2_subdev.
>>>>
>>>> - No V4L2 sub-device driver currently supports unbinding the device
>>>>   safely when a media entity (or V4L2 sub-device) is registered. Prevent
>>>>   manual unbind for all sub-device drivers by setting the
>>>>   suppress_bind_attrs field in struct device_driver.
>>>>
>>>> Rules for navigating the media graph and accessing graph objects
>>>> ================================================================
>>>>
>>>> - Graph traversal is safe as long as the media graph lock is held during
>>>>   traversal. (I.e. no changes here.)
>>>> 	
>>>> 	- During graph traversal entity refcount must be increased in order
>>>> 	  to prevent them from being released. As the media entity's release()
>>>> 	  callback may be already in process at this point, a variant
>>>> 	  checking that the reference count has not reached zero must be
>>>> 	  used (media_entity_get_try()).
>>>>
>>>> - In order to keep a reference to an object after the graph traversal
>>>>   has finished, a reference to the object must be obtained and held as
>>>>   long as the object is in use.
>>>> 	
>>>> 	- The same applies to file handles to V4L2 sub-devices or drivers
>>>> 	  keeping a reference to another driver's sub-device.
>>>>
>>>> - Once the reference count of an object reaches zero it will not be
>>>>   increased again, even if that object was part of a memory allocation
>>>>   which still has referenced objects. Obtaining references to such an
>>>>   object must fail.
>>>>
>>>> - Navigating within a memory allocation while holding a reference to an
>>>>   object in that allocation to obtain another object is allowed. The
>>>>   target object type must provide a function testing for zero references
>>>>   in order not to not to increment reference count that has reached zero.
>>>>   kref_get_unless_zero() can be used for this. Especially drivers may
>>>>   need this.
>>>>
>>>> - Circular references are not allowed.
>>>>
>>>> - It is safe to move between objects of the same lifetime (same kref),
>>>>   e.g. struct v4l2_subdev and struct media_entity it contains.
>>>>
>>>> The rules must be documented as part of the Media framework ReST
>>>> documentation.
>>>>
>>>>
>>>> Stopping the hardware safely at device unbind
>>>> =============================================
>>>>
>>>> What typically gets little attention in driver implementation is
>>>> stopping safely whenever the hardware is being removed. The kernel
>>>> frameworks are not very helpful in this respect --- after all how this is
>>>> done somewhat depends on the hardware. Only hot-pluggable devices are
>>>> generally affected. While it is possible to manually unbind e.g.
>>>> platform devices, that certainly is a lesser problem for such devices.
>>>>
>>>> The rules are:
>>>>
>>>> - No memory related to data structures that are needed during driver
>>>>   operation can be released as long as interrupt handlers may be still
>>>>   executing or may start executing or user file handles are open.
>>>>
>>>> - All hardware access must cease after the remove() callback in driver
>>>>   ops struct has returned. (For USB devices this callback is called
>>>>   disconnect(), but its purpose is precisely the same.)
>>>>
>>>> Handling this correctly will require taking care of a few common
>>>> matters:
>>>>
>>>> 1. Checking for device presence in each system call issued on a device,
>>>>    and returning -ENODEV if the device is not found.
>>>>
>>>> 2. The driver's remove() function must ensure that no system calls that
>>>>    could result in hardware access are ongoing.
>>>>
>>>> 3. Some system calls may sleep, and while doing so also releasing and
>>>>    later reacquiring (when they continue) the mutexes their processes
>>>>    were holding. Such processes must be woken up.
>>>>
>>>> This is not an easy task for a driver and the V4L2 and Media controller
>>>> frameworks need to help the drivers to achieve this so that drivers
>>>> would need to worry about this as little as possible. Drivers relying for
>>>> their locking needs on the videobuf2 and the V4L2 frameworks are likely
>>>> to require fewer changes.
>>>>
>>>> The same approach must be taken in issuing V4L2 sub-device operations by
>>>> other kernel drivers.
>>>>
>>>>
>>>> Serialising unbind with hardware access through system calls
>>>> ------------------------------------------------------------
>>>>
>>>> What's below is concentrated in V4L2; Media controller will need similar
>>>> changes.
>>>>
>>>> - Add a function called video_unregister_device_prepare() (or
>>>>   video_device_mark_unregistered() or a similarly named function) to
>>>>   amend the functionality of video_unregister_device(): mark the device
>>>>   unregistering so that new system calls issues on the device will
>>>>   return an error.
>>>>
>>>> - Device node specific use count for active system calls.
>>>>
>>>> - Waiting list for waiting for the use count to reach zero (for
>>>>   proceeding unregistration in v4l2_unregister_device()). Each finishing
>>>>   system call on a file descriptor will wake up the waiting list.
>>>>
>>>> - Processes waiting in a waiting list of a system call (e.g. poll, DQBUF
>>>>   IOCTL) will be woken up. After waiting, the processes will check
>>>>   whether the video node is still registered, i.e.
>>>>   video_unregister_device_prepare()
>>>>   has not been called.
>>>>
>>>> Things to remember
>>>> ------------------
>>>>
>>>> - Using the devm_request_irq() is _NOT_ allowed in general case, as the
>>>>   free_irq() results in being called _after_ the driver's release
>>>>   callback. This means that if new interrupts still arrive between
>>>>   driver's release callback returning and devres resources not being
>>>>   released, the driver's interrupt handler will still run. Use
>>>>   request_irq() and free_irq() instead.
>>>>
>>>> What changes in practice?
>>>> =========================
>>>>
>>>> Most of the changes needed in order to align the current implementation
>>>> to conform the above are to be made in the frameworks themselves. What
>>>> changes in drivers, though, is how objects are initialised and released.
>>>> This mostly applies to drivers' own data structures that often contain
>>>> reference counted object types from the frameworks.
>>>>
>>>> Relations between various frameworks objects will need to be handled in
>>>> a standard way in the appropriate registeration, unregistration and
>>>> reference getting and putting functions. Drivers (as well as frameworks,
>>>> such as the V4L2 sub-device framework in its open and close file
>>>> operations) will make use of these functions to get an put references to
>>>> the said objects.
>>>>
>>>> Drivers that are not hot-pluggable could largely ignore the changes, as
>>>> unregistering an object does in fact take place right before releasing
>>>> that object, meaning drivers that did not perform any object reference
>>>> counting are no more broken with the changes described above without
>>>> implementing reference counting. Still care must be taken that they
>>>> remain decently unloadable when there are no users.
>>>>
>>>> New drivers should be required to implement object reference counting
>>>> correctly, whether hot-pluggable or not. One platform driver (e.g.
>>>> omap3isp) should be fixed to function as a model for new platform
>>>> drivers.
>

Regards,

	Hans

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [ANN] Media object lifetime management meeting report from Oslo
  2017-01-30  8:39       ` Hans Verkuil
@ 2017-01-30 11:56         ` Mauro Carvalho Chehab
  2017-02-07 21:38           ` Sakari Ailus
  0 siblings, 1 reply; 8+ messages in thread
From: Mauro Carvalho Chehab @ 2017-01-30 11:56 UTC (permalink / raw)
  To: Hans Verkuil
  Cc: Laurent Pinchart, Sakari Ailus, linux-media, Shuah Khan, Helen Koike

Em Mon, 30 Jan 2017 09:39:21 +0100
Hans Verkuil <hverkuil@xs4all.nl> escreveu:

> On 28/01/17 16:35, Laurent Pinchart wrote:
> > Hi Sakari,
> >
> > On Saturday 28 Jan 2017 00:02:53 Sakari Ailus wrote:  
> >> On Fri, Jan 27, 2017 at 09:38:31AM -0200, Mauro Carvalho Chehab wrote:  
> >>> Hi Sakari/Hans/Laurent,
> >>>
> >>> First of all, thanks for looking into those issues. Unfortunately, I was
> >>> in vacations, and were not able to be with you there for such discussions.
> >>>
> >>> While I have a somewhat different view on some of the introductory points
> >>> of this RFC, what really matters is the "proposal" part of it. So, based
> >>> on the experiments I did when I addressed the hotplug issues with the
> >>> media controller on USB drivers, I'm adding a few comments below.
> >>>
> >>> Em Fri, 27 Jan 2017 12:08:22 +0200 Sakari Ailus escreveu:  
> >>>> Allocating struct media_devnode separately from struct media_device
> >>>> -------------------------------------------------------------------
> >>>>
> >>>> The current codebase allocates struct media_devnode dynamically. This
> >>>> was done in order to reduce the time window during which unallocated
> >>>> kernel memory could be accessed. However, there is no specific need for
> >>>> such a separation as the entire data structure, including the media
> >>>> device which used to contain struct media_devnode, will have the same
> >>>> lifetime. Thus the struct media_devnode should be allocated as part of
> >>>> struct media_device. Later on, struct media_devnode may be merged with
> >>>> struct media_device if so decided.  
> >>>
> >>> There is one issue merging struct media_devnode at struct media_device.
> >>> The problem is that, if the same struct device is used for two different
> >>> APIs (like V4L2 and media_controller) , e. g. if the cdev parent points
> >>> to the same struct device, you may end by having a double free when the
> >>> device is unregistered on the second core. That's basically why
> >>> currently struct cdev is at struct media_devnode: this way, it has its own
> >>> struct device.  
> >>
> >> One of the conclusions of the memorandum I wrote is that the data structures
> >> that are involved in executing a system call (including IOCTLs) must stay
> >> intact during that system call. (Well, the alternative I can think of is
> >> using an rw_semaphore but I certainly do not advocate that solution.)
> >>
> >> Otherwise, we will continue to have serialisation issues: kernel memory that
> >> may be released at any point of time independently of a system call in
> >> progress:
> >>
> >> <URL:https://www.mail-archive.com/linux-media@vger.kernel.org/msg106390.html  
> >>>  
> >>
> >> That one is a NULL pointer dereference but released kernel memory can be
> >> accessed as well.
> >>  

There's an alternative approach: to have a flag to indicate that the
device got disconnected and denying access to the device-specific data
if the device was already hot-unplugged. The V4L USB drivers do that
already. See, for example, the em28xx driver:

int em28xx_read_reg_req_len(struct em28xx *dev, u8 req, u16 reg,
			    char *buf, int len)
{
...
	if (dev->disconnected)
		return -ENODEV;
...
}

/* Processes and copies the URB data content (video and VBI data) */
static inline int em28xx_urb_data_copy(struct em28xx *dev, struct urb *urb)
{
...
	if (dev->disconnected)
		return 0;

The same logic exists on multiple places along the driver. Some of
such checks could be moved to the core, but, if I remember well when
such code was written, several such checks should be inside the driver,
as the USB core doesn't allow any calls to their functions after the
disconnect callback is called. That's why it was opted to not move
it to the core.

In the case of DVB, the core has a similar logic to handle disconnected
devices, with prevents it to call driver-specific code when a device is
removed. That's why there are just 2 checks if dev->disconnected at
drivers/media/usb/em28xx/em28xx-dvb.c: one at the IRQ logic, and the
second one to avoid a race issue with the suspend code.

Due to that, on real hot-plug devices (USB drivers) that don't have
hot-unplug issues with the V4L2 API, it should be safe to free some of
their data structs at the .disconnect callback. Of course, the struct
that embeds the struct device used as the cdev's parent can't be freed
there.

> >>> IMHO, it also makes sense to embeed struct cdev at the V4L2 side, as I
> >>> detected some race issues at the V4L2 side when I ran the bind/unbind
> >>> race tests, when we tried to merge the snd-usb-audio MC support patch.
> >>> I remember Shuah reported something similar on that time.
> >>>  
> >>>> Allocating struct media_device dynamically
> >>>> ------------------------------------------
> >>>>
> >>>> Traditionally the media device has been allocated as part of drivers'
> >>>> own structures. This is certainly fine as such, but many drivers have
> >>>> allocated the driver private struct using the devm_() family of
> >>>> functions. This causes such memory to be released at the time of device
> >>>> removal rather than at the time when the memory is no longer accessible
> >>>> by e.g. user space processes or interrupt handlers. Besides the media
> >>>> device, the driver's own data structure is very likely to have the
> >>>> precisely same lifetime issue: it may also be accessed through IOCTLs
> >>>> after the device has been removed from the system.
> >>>>
> >>>> Instead, the memory needs to be released as part of the release()
> >>>> callback of the media device which is called when the last reference to
> >>>> the media device is gone. Still, allocating the media device outside
> >>>> another containing driver specific struct will be required in some cases:
> >>>> sharing the media device mandates that. Implementing this can certainly
> >>>> be postponed until sharing the media device is otherwise supported as
> >>>> well.  
> >>>
> >>> The patches adding MC support for snd-usb-audio, pending since Kernel
> >>> 4.7 (I guess) require such functionatilty. Last year, on the audio
> >>> summit, they asked about its status, as they're needing MC suppor there.
> >>>
> >>> So, whatever solution taken, this should be part of the solution.
> >>>
> >>> (c/c Shuah, as she is the one working on it)  
> >>
> >> I think we should postpone that, or at least resolve that in a separate
> >> patchset. This is already getting very big. The refcounting problems will be
> >> harder to fix, should we extend the MC framework with media device sharing
> >> without considering the object lifetime issues first. So the order should
> >> be: first fix object lifetime issues, then add more features.  
> >
> > I strongly second that, continuing to build on top of a totally broken base
> > can only lead to disaster. Given that we have given the lifetime management
> > problem lots of thoughts already, with a patch series out to fix it and work
> > ongoing to rework that series, I don't see a reason to order development in a
> > different way.  
> 
> I agree with that as well.

Ok. When do you expect to have a patchset ready for tests covering the
hot-plug issue without causing regressions to the USB drivers?

> >> Matters unrelated to the object lifetime issues such as the device (driver)
> >> specific fields in struct media_device need to be addressed before support
> >> for sharing the media device can be implemented. That work can be done
> >> independently of fixing the object lifetime issues.
> >>
> >> Implemeting media device sharing correctly will also be easier after
> >> the media device is refcounted: the object lifetime issues with a shared
> >> media device are resolvable (and likely very easily so).
> >>  
> >>>> Lifetime of the media entities and related framework or driver objects
> >>>> ======================================================================
> >>>>
> >>>> The media graph data structure is a complex data structure consisting of
> >>>> media graph objects that themselves may be either entities, links or
> >>>> pads. The media graph objects are allocated by various drivers and
> >>>> currently have a lifetime related to binding and unbinding of the related
> >>>> hardware. The current rules (i.e. acquiring the media graph lock for
> >>>> graph traversal) are not enough to properly support removing entities.
> >>>>
> >>>> Instead, reference counting needs to be introduced for media entities
> >>>> and other objects that contain media entities. References must be
> >>>> acquired (or in some cases, borrowed) whenever a reference is held to an
> >>>> object. A release callback is used to release the memory allocation that
> >>>> contains an object when the last reference to the object has been put.
> >>>>
> >>>> For instance:
> >>>>
> >>>> 	struct media_entity {
> >>>> 		...
> >>>> 		struct kref kref;
> >>>> 		void (*release)(struct media_entity *entity);
> >>>> 	};  
> >>>
> >>> The similar rationale is valid also for media interfaces and, on a
> >>> lesser extent, to pads and links.
> >>>
> >>> So, instead, IMHO, the kref should be implemented at struct media_gobj,
> >>> with is inherited by all graph elements. I guess that handling it
> >>> there will avoid us to duplicate the same logic on different
> >>> places (e. g. interface and entity logic handling; links handling).  
> >>
> >> I don't argue against doing that in principle, but my question is: is there
> >> a need for that? Does a driver (or a framework) need to be able to hold a
> >> link, a pad, for instance? In the proposal, acquiring the media entity would
> >> be practically enough to hold the pads (albeit that would have to be
> >> documented), or we could add functions for that that would be wrappers to
> >> media_entity_get() and so on:
> >>
> >> void media_pad_get(struct media_pad *pad)
> >> {
> >> 	media_entity_get(pad->entity);
> >> }
> >>
> >> The lifetime of the graph objects are dependent of each other in any cases;
> >> you can't have pads having significantly different lifetime than the entity
> >> they belong to. Typically such a case is related to making sure structs stay
> >> around as long as something may still access them in a case when graph
> >> objects are being removed.  
> >
> > I also believe that tying the lifetime of pads to the lifetime of entities is
> > a good solution.  
> 
> Indeed.

Works for me.

> >  
> >>> Btw, as you want to use OMAP3 as a reference driver, you should
> >>> implement interface and interface links there, in order to have a
> >>> broader testcase coverage, or use some other driver that already
> >>> implements it.
> >>>
> >>> Alternatively, we should work on merging the Helen's vimc driver, being
> >>> sure that it provides hot plug support for all kinds of media graph
> >>> objects. It should be easier/faster to detect race issues on such a driver
> >>> that doesn't depend on a hardware to complete their operations. So,
> >>> I'm a big fan on using it as primary testcase. Of course, we'll need to
> >>> test with real drivers too.  
> >>
> >> The vimc driver looks pretty good in my opinion; it could be merged after
> >> addressing the comments. I wrote quite a few comments to the patchset but
> >> AFAIR all the matters are minor.
> >>
> >> I see no reason why we shouldn't implement these fixes for both of the two
> >> drivers. My understanding is that the driver changes due to refcounting will
> >> be relatively minor after all. It's the framework changes that will be
> >> harder.  
> >
> > Agreed. On the topic of interfaces and interface links, I'd like to point out
> > that we have never reached an agreement on them and that they have been merged
> > despite a strong opposition.  
> 
> Two notes:
> 
> 1) That's not how I remember things, but that could well be my memory.
> 2) It's in, so there is no turning back.

Second with Hans on that matter.

> 
> >  
> >>>> 	/*
> >>>> 	 * For refcounting, the caller has to set entity->release after
> >>>> 	 * initialising the entity.
> >>>> 	 */
> >>>> 	void media_entity_pads_init(struct media_entity *entity)
> >>>> 	{
> >>>> 		...
> >>>> 		kref_init(&entity->kref);
> >>>> 	}
> >>>> 	
> >>>> 	int media_device_register_entity(struct media_device *mdev,
> >>>> 					 struct media_entity *entity)
> >>>> 	{
> >>>> 		...
> >>>> 		entity->graph_obj.mdev = mdev;
> >>>> 	}
> >>>> 	
> >>>> 	void media_entity_release(struct kref *kref)
> >>>> 	{
> >>>> 		struct media_entity *entity =
> >>>> 			container_of(kref, struct media_entity, kref);
> >>>> 		struct media_device *mdev = entity->graph_obj.mdev;
> >>>> 		
> >>>> 		entity->release(entity);
> >>>> 		if (mdev)
> >>>> 			media_device_put(mdev);
> >>>> 	}
> >>>> 	
> >>>> 	void media_entity_put(struct media_entity *entity)
> >>>> 	{
> >>>> 		kref_put(&entity->kref, media_entity_release);
> >>>> 	}
> >>>> 	
> >>>> 	void media_entity_get(struct media_entity *entity)
> >>>> 	{
> >>>> 		kref_get(&entity->kref);
> >>>> 	}
> >>>>
> >>>> The above takes into account the current behaviour which makes no use of
> >>>> reference counting while still allowing reference counting for drivers
> >>>> that use it.
> >>>>
> >>>> Often media entities are contained in another struct such as struct
> >>>> v4l2_subdev. In that case, the allocator of that struct must provide a
> >>>> release callback which is used as a destructor of that object. Functions
> >>>> to count references (get() and put()) are defined to obtain and released
> >>>> references to the said objects. Wrappers are added for the types
> >>>> containing the object type that does have the refcount, for instance in
> >>>> this case:
> >>>>
> >>>> 	void v4l2_subdev_get(struct v4l2_subdev *sd)
> >>>> 	{
> >>>> 		media_entity_get(&sd->entity);
> >>>> 	}
> >>>> 	
> >>>> 	void v4l2_subdev_put(struct v4l2_subdev *sd)
> >>>> 	{
> >>>> 		media_entity_put(&sd->entity);
> >>>> 	}
> >>>>
> >>>> Memory allocations that contain one or more objects must be considered
> >>>> indivisible. All the objects inside a struct are allocated and released
> >>>> at one point of time. For instance, struct v4l2_subdev contains both
> >>>> struct media_entity and struct video_device. struct video_device contains
> >>>> struct device (related to the character device visible to the user) that
> >>>> is reference counted already.  
> >>>
> >>> Here's the problem: if we point the media_devnode's cdev parent to
> >>> the same object as struct video_device, both cores will try to
> >>> decrement its kref at release() time.  
> >>
> >> At the time of registering a struct video_device with the struct
> >> media_device, the underlying refcount (struct
> >> media_device.devnode.dev.kobj.kref) needs to be incremented. This needs to
> >> done using media_device_get(), with (a) few layers of abstraction.
> >>
> >> This guarantees that the struct media_device will stick around at least as
> >> long as the struct video_device.
> >>
> >> The struct video_device cdev arrangement differs from what's being done in
> >> the CEC framework and elsewhere; that should be fixed I guess. Still, the
> >> struct media_device reference should be put in v4l2_device_release() (the
> >> release callback of the struct device corresponding to the device node,
> >> struct video_device.dev.release).  
> >
> > There are two static functions named v4l2_device_release(), one in v4l2-
> > device.c and one in v4l2-dev.c. I assume you mean the latter (which we could
> > rename to avoid confusion).
> >
> > On a related topic, video_device_release_empty() should be removed.  
> 
> It's there to satify a WARN_ON that checks if the release callback isn't NULL, i.e.
> to make sure someone thought about this. I would keep it for now. Certainly we
> shouldn't change this behavior until all the other life-time changes are in.
> After that we can think about this some more.
> 
> >  
> >>>> In the example below, foo is a sub-device driver. The struct foo_device
> >>>> contains a struct v4l2_subdev field sd which in turn contains a media
> >>>> entity. In addition to that, struct v4l2_subdev contains a struct
> >>>> video_device (the devnode field) that currently is a pointer. In order
> >>>> to simplify the reference counting, it is assumed to be directly a part
> >>>> of the struct v4l2_subdev below. The example assumes that the driver
> >>>> requests a device node for the V4L2 sub-device and does not address the
> >>>> case where the device node is not created.
> >>>>
> >>>> 	struct media_entity {
> >>>> 		...
> >>>> 		struct kref kref;
> >>>> 		void (*release)(struct media_entity *entity);
> >>>> 	};
> >>>> 	
> >>>> 	struct v4l2_subdev {
> >>>> 		...
> >>>> 		struct video_device devnode;
> >>>> 	};
> >>>> 	
> >>>> 	void media_entity_release(struct kref *kref)
> >>>> 	{
> >>>> 		struct media_entity *entity =
> >>>> 			container_of(kref, struct media_entity, kref);
> >>>> 		struct media_device *mdev = entity->graph_obj.mdev;
> >>>> 		
> >>>> 		/* Release the media entity, possibly releasing its memory. */
> >>>> 		entity->release(entity);
> >>>> 		
> >>>> 		/*
> >>>> 		 * If it was bound to a media device, put the media device
> >>>> 		 * as well.
> >>>> 		 */
> >>>> 		if (mdev)
> >>>> 			media_device_put(mdev);
> >>>> 	}
> >>>> 	
> >>>> 	void media_entity_put(struct media_entity *entity)
> >>>> 	{
> >>>> 		kref_put(&entity->kref, media_entity_release);
> >>>> 	}
> >>>> 	
> >>>> 	/*
> >>>> 	 * Returns zero if getting entity succeeds, -ENODEV otherwise.
> >>>> 	 */
> >>>> 	int media_entity_try_get(struct media_entity *entity)
> >>>> 	{
> >>>> 		return kref_get_unless_zero(&entity->kref) ? 0 : -ENODEV;
> >>>> 	}
> >>>> 	
> >>>> 	void media_entity_get(struct media_entity *entity)
> >>>> 	{
> >>>> 		kref_get(&entity->kref);
> >>>> 	}
> >>>> 	
> >>>> 	/*
> >>>> 	 * Depending on the data structure containing the media entity, the
> >>>> 	 * caller has to set entity->release after initialising the entity.
> >>>> 	 */
> >>>> 	void media_entity_pads_init(struct media_entity *entity)
> >>>> 	{
> >>>> 		...
> >>>> 		kref_init(&entity->kref);
> >>>> 	}
> >>>> 	
> >>>> 	int media_device_register_entity(struct media_device *mdev,
> >>>> 					 struct media_entity *entity)
> >>>> 	{
> >>>> 		...
> >>>> 		/*
> >>>> 		 * Bind the media entity to the media device; thus increment
> >>>> 		 * media device refcount. The media device typically is not
> >>>> 		 * available during sub-device driver probe() time. This
> >>>> 		 * requires that a media entity may only be registered once.
> >>>> 		 */
> >>>> 		entity->graph_obj.mdev = mdev;
> >>>> 		media_device_get(mdev);
> >>>> 		/* And get the entity as well. */
> >>>> 		media_entity_get(entity);
> >>>> 	}
> >>>> 	
> >>>> 	void media_device_unregister_entity(struct media_entity *entity)
> >>>> 	{
> >>>> 		...
> >>>> 		media_entity_put(entity);
> >>>> 	}
> >>>> 	
> >>>> 	int v4l2_device_register_subdev(struct v4l2_subdev *sd)
> >>>> 	{
> >>>> 		...
> >>>> 		media_device_register_entity(&sd->entity);
> >>>> 		...
> >>>> 	}
> >>>> 	
> >>>> 	void v4l2_device_unregister_subdev(struct v4l2_subdev *sd)
> >>>> 	{
> >>>> 		...
> >>>> 		media_device_unregister_entity(&sd->entity);
> >>>> 	}
> >>>> 	
> >>>> 	/*
> >>>> 	 * Custom release callback function for V4L2 sub-devices that export
> >>>> 	 * a device node. (Could be used for others as well, with
> >>>> 	 * sd->devnode.release() callback.)
> >>>> 	 */
> >>>> 	void v4l2_device_subdev_devnode_release(struct media_entity *entity)
> >>>> 	{
> >>>> 		struct v4l2_subdev *sd = media_entity_to_v4l2_subdev(entity);
> >>>> 		
> >>>> 		/*
> >>>> 		 * If the devnode got registered we still hold a reference
> >>>> 		 * to it. Check this from vdev->prio (but any other field
> >>>> 		 * which gets set during device node registration and never
> >>>> 		 * changed again could be used). If the devnode was never
> >>>> 		 * registered, call its release function directly.
> >>>> 		 */
> >>>> 		if (sd->devnode.prio)
> >>>> 			video_put(sd->devnode);
> >>>> 		else
> >>>> 			sd->devnode.release(&sd->devnode);
> >>>> 	}
> >>>> 	
> >>>> 	int v4l2_device_register_subdev_nodes(struct v4l2_device *vdev)
> >>>> 	{
> >>>> 		struct video_device *vdev;
> >>>>  	        struct v4l2_subdev *sd;
> >>>> 	        int err;
> >>>> 	
> >>>> 	        list_for_each_entry(sd, &v4l2_dev->subdevs, list) {
> >>>> 			__video_register_device(&sd->devnode, ...);
> >>>> 			video_get(&sd->devnode);
> >>>> 			sd->devnode.prio = something non-NULL;
> >>>> 		}
> >>>> 	}
> >>>> 	
> >>>> 	int v4l2_subdev_try_get(struct v4l2_subdev *sd)
> >>>> 	{
> >>>> 		return media_entity_try_get(&sd->entity);
> >>>> 	}
> >>>> 	
> >>>> 	void v4l2_subdev_get(struct v4l2_subdev *sd)
> >>>> 	{
> >>>> 		media_entity_get(&sd->entity);
> >>>> 	}
> >>>> 	
> >>>> 	void v4l2_subdev_put(struct v4l2_subdev *sd)
> >>>> 	{
> >>>> 		media_entity_put(&sd->entity);
> >>>> 	}
> >>>> 	
> >>>> 	/* V4L2 sub-device open file operation handler */
> >>>> 	int subdev_open(struct file *file)
> >>>> 	{
> >>>> 		struct video_device *vdev = video_devdata(file);
> >>>> 		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);
> >>>> 		int rval;
> >>>> 		
> >>>> 		/*
> >>>> 		 * The v4l2_subdev depends on the video device node. It thus
> >>>> 		 * may be that the media entity refcount (which is also used
> >>>> 		 * to count references to the v4l2_subdev) has reached zero
> >>>> 		 * here. However its memory is still intact as it's part of
> >>>> 		 * the same struct v4l2_subdev.
> >>>> 		 */
> >>>> 		rval = v4l2_subdev_try_get(sd);
> >>>> 		if (rval)
> >>>> 			return rval;
> >>>> 		
> >>>> 		...
> >>>> 	}
> >>>> 	
> >>>> 	/* V4L2 sub-device release file operation handler */
> >>>> 	int subdev_close(struct file *file)
> >>>> 	{
> >>>> 		struct video_device *vdev = video_devdata(file);
> >>>> 		struct v4l2_subdev *sd = vdev_to_v4l2_subdev(vdev);
> >>>> 		
> >>>> 		v4l2_subdev_put(sd);
> >>>> 	}
> >>>> 	
> >>>> 	struct foo_device {
> >>>> 		struct v4l2_subdev sd;
> >>>> 		struct media_pad pad;
> >>>> 	};
> >>>> 	
> >>>> 	void foo_device_release(struct video_device *vdev)
> >>>> 	{
> >>>> 		struct v4l2_subdev *sd =
> >>>> 			container_of(vdev, struct v4l2_subdev, devnode);
> >>>> 		struct foo_device *foo =
> >>>> 			container_of(sd, struct foo_device, sd);
> >>>> 		
> >>>> 		/*
> >>>> 		 * TODO: acquire the graph mutex here and remove the
> >>>> 		 * entities corresponding to the V4L2 sub-device and its
> >>>> 		 * device node from the graph.
> >>>> 		 */
> >>>> 		media_entity_cleanup(&foo->sd.entity);
> >>>> 		kfree(foo);
> >>>> 	}
> >>>> 	
> >>>> 	int foo_probe(struct device *dev)
> >>>> 	{
> >>>> 		struct foo_device *foo = kmalloc(sizeof(*foo), GFP_KERNEL));
> >>>> 		
> >>>> 		media_entity_pads_init(&foo->sd.entity, 1, &foo->pad);
> >>>> 		foo->sd.entity.release = v4l2_subdev_devnode_release;
> >>>> 		foo->sd.devnode.release = foo_device_release;
> >>>> 		v4l2_async_register_subdev(&foo->sd);
> >>>> 	}
> >>>> 	
> >>>> 	void foo_remove(struct device *dev)
> >>>> 	{
> >>>> 		struct foo_device *foo = dev_get_drvdata(dev);
> >>>> 		
> >>>> 		v4l2_async_unregister_subdev(&foo->sd);
> >>>> 		v4l2_device_unregister_subdev(&foo->sd);
> >>>> 		/*
> >>>> 		 * v4l2_subdev_put() will end up releasing foo immediately
> >>>> 		 * unless file handles are open currently. Thus further
> >>>> 		 * accesses to foo are not allowed.
> >>>> 		 */
> >>>> 		v4l2_subdev_put(&foo->sd);
> >>>> 	}
> >>>>
> >>>> Practical considerations
> >>>> ------------------------
> >>>>
> >>>> - Beyond media entities, other media graph objects could be made
> >>>>   refcounted. This is not seen necessary as the related objects such as
> >>>>   pads are in practice allocated within the same driver specific struct
> >>>>   containing the media entity.  
> >>>
> >>> As I mentioned already, you're forgetting about the media interfaces, with
> >>> are created/removed elsewhere as they're (somewhat) independent of the
> >>> media entities.  
> >>
> >> Right. Nothing prevents adding a kref to struct media_interface. It'd work
> >> the same way than with media entities. Interface-wise, the refcounting
> >> should use _get()/_put() functions of the native types independently of
> >> where the kref refcount is located. (Obviously this requires that it has to
> >> be in the same memory allocation. For instance, v4l2_subdev's depending on
> >> kref in struct media_entity in the example.)
> >>
> >> Having to do add kref to struct media_interface, though, does increase the
> >> appeal of adding kref to each graph object instead. Still, my hunch is that
> >> this would be vastly overkill: the other graph objects we have are very
> >> tightly bound to entities (and media interfaces). I can't see a use case for
> >> them right now other than than just walking the graph. That said, I'm
> >> certainly open for new ideas and information that could probe me wrong.  
> >
> > I agree with you.  
> 
> Me too.

Works for me.

> 
> >  
> >>>> 	- This does not apply to links which are allocated dynamically by
> >>>> 	  the framework, and released at unregistration time, meaning that
> >>>> 	  accessing a link will require holding the media graph mutex.  
> >>>
> >>> If we're moving to refcounts, then perhaps we can get rid of the
> >>> graph mutex or reduce its usage.  
> >>
> >> It'll be a lot easier to get rid of the graph mutex, I agree. Only the graph
> >> walk will depend on holding the graph mutex then, and it's much easier to
> >> address the serialisation needs in that use case only. The solution will
> >> still be non-trivial as it's likely to require wait/wound mutexes: the
> >> graph walk can well start at two locations of a graph simultanoeously and
> >> two walkers may thus try to parse the same graph nodes. One of them needs
> >> to back away in that case. This is something that needs to be managed in
> >> the users of the graph walk API.
> >>  
> >>>> - A custom release callback is to be used for refcounted framework
> >>>>   structs. This makes it easy for the drivers to embed the objects in
> >>>>   their own structs.
> >>>>
> >>>> - As long as an object has its own reference count, getting or putting
> >>>>   an object does not affect the refcount of another object that it
> >>>>   depends on. For instance, a media entity depends on an existence of the
> >>>>   media device.
> >>>>
> >>>> - Each media entity holds a reference to the media device. Being able to
> >>>>   navigate from the media entity to the media device is used in drivers
> >>>>   and an abrupt disconnection of the two is problematic to handle for
> >>>>   drivers.
> >>>>
> >>>> - At object initialisation time the kref refcount is initialised. The
> >>>>   object is put at unregistration time.
> >>>>
> >>>>   	- If the object is a media entity, it is removed from the media graph
> >>>>  	  when the last reference to the media entity is gone. The release
> >>>>  	  callback needs to acquire the media graph mutex in order to
> >>>>  	  perform the removal.
> >>>> 	
> >>>> 	- In particular for struct video_device, media_entity release callback
> >>>> 	  will put the refcount of the video device (struct device embedded in
> >>>> 	  struct video_device)
> >>>> 	
> >>>> 	- For struct v4l2_subdev, struct media_entity contained in struct
> >>>> 	  v4l2_subdev is dependent on struct device within struct
> >>>> 	  video_device.
> >>>>
> >>>> - When a driver creates multiple reference-counted objects (such as
> >>>>   multiple media entities for instance) that are part of its own private
> >>>>   struct, it will need to reference-count its own struct based on the
> >>>>   release callbacks of all those objects. It might be possible to
> >>>>   simplify this by providing a single release callback that would cover
> >>>>   all objects. This is already feasible for drivers that implement a
> >>>>   media_device instance, V4L2 sub-devices and V4L2 video devices hold a
> >>>>   reference on the media_device, whose release callback could then act as
> >>>>   a single gatekeeper. For other drivers we haven't explored yet whether
> >>>>   the core could meaningfully provide help, but it should be remembered
> >>>>   that drivers that implement multiple graph objects and that do not
> >>>>   implement a media_device are not legion. Most slave drivers (such as
> >>>>   sensor drivers) only implement a single v4l2_subdev.
> >>>>
> >>>> - No V4L2 sub-device driver currently supports unbinding the device
> >>>>   safely when a media entity (or V4L2 sub-device) is registered. Prevent
> >>>>   manual unbind for all sub-device drivers by setting the
> >>>>   suppress_bind_attrs field in struct device_driver.
> >>>>
> >>>> Rules for navigating the media graph and accessing graph objects
> >>>> ================================================================
> >>>>
> >>>> - Graph traversal is safe as long as the media graph lock is held during
> >>>>   traversal. (I.e. no changes here.)
> >>>> 	
> >>>> 	- During graph traversal entity refcount must be increased in order
> >>>> 	  to prevent them from being released. As the media entity's release()
> >>>> 	  callback may be already in process at this point, a variant
> >>>> 	  checking that the reference count has not reached zero must be
> >>>> 	  used (media_entity_get_try()).
> >>>>
> >>>> - In order to keep a reference to an object after the graph traversal
> >>>>   has finished, a reference to the object must be obtained and held as
> >>>>   long as the object is in use.
> >>>> 	
> >>>> 	- The same applies to file handles to V4L2 sub-devices or drivers
> >>>> 	  keeping a reference to another driver's sub-device.
> >>>>
> >>>> - Once the reference count of an object reaches zero it will not be
> >>>>   increased again, even if that object was part of a memory allocation
> >>>>   which still has referenced objects. Obtaining references to such an
> >>>>   object must fail.
> >>>>
> >>>> - Navigating within a memory allocation while holding a reference to an
> >>>>   object in that allocation to obtain another object is allowed. The
> >>>>   target object type must provide a function testing for zero references
> >>>>   in order not to not to increment reference count that has reached zero.
> >>>>   kref_get_unless_zero() can be used for this. Especially drivers may
> >>>>   need this.
> >>>>
> >>>> - Circular references are not allowed.
> >>>>
> >>>> - It is safe to move between objects of the same lifetime (same kref),
> >>>>   e.g. struct v4l2_subdev and struct media_entity it contains.
> >>>>
> >>>> The rules must be documented as part of the Media framework ReST
> >>>> documentation.
> >>>>
> >>>>
> >>>> Stopping the hardware safely at device unbind
> >>>> =============================================
> >>>>
> >>>> What typically gets little attention in driver implementation is
> >>>> stopping safely whenever the hardware is being removed. The kernel
> >>>> frameworks are not very helpful in this respect --- after all how this is
> >>>> done somewhat depends on the hardware. Only hot-pluggable devices are
> >>>> generally affected. While it is possible to manually unbind e.g.
> >>>> platform devices, that certainly is a lesser problem for such devices.
> >>>>
> >>>> The rules are:
> >>>>
> >>>> - No memory related to data structures that are needed during driver
> >>>>   operation can be released as long as interrupt handlers may be still
> >>>>   executing or may start executing or user file handles are open.
> >>>>
> >>>> - All hardware access must cease after the remove() callback in driver
> >>>>   ops struct has returned. (For USB devices this callback is called
> >>>>   disconnect(), but its purpose is precisely the same.)
> >>>>
> >>>> Handling this correctly will require taking care of a few common
> >>>> matters:
> >>>>
> >>>> 1. Checking for device presence in each system call issued on a device,
> >>>>    and returning -ENODEV if the device is not found.
> >>>>
> >>>> 2. The driver's remove() function must ensure that no system calls that
> >>>>    could result in hardware access are ongoing.
> >>>>
> >>>> 3. Some system calls may sleep, and while doing so also releasing and
> >>>>    later reacquiring (when they continue) the mutexes their processes
> >>>>    were holding. Such processes must be woken up.
> >>>>
> >>>> This is not an easy task for a driver and the V4L2 and Media controller
> >>>> frameworks need to help the drivers to achieve this so that drivers
> >>>> would need to worry about this as little as possible. Drivers relying for
> >>>> their locking needs on the videobuf2 and the V4L2 frameworks are likely
> >>>> to require fewer changes.
> >>>>
> >>>> The same approach must be taken in issuing V4L2 sub-device operations by
> >>>> other kernel drivers.
> >>>>
> >>>>
> >>>> Serialising unbind with hardware access through system calls
> >>>> ------------------------------------------------------------
> >>>>
> >>>> What's below is concentrated in V4L2; Media controller will need similar
> >>>> changes.
> >>>>
> >>>> - Add a function called video_unregister_device_prepare() (or
> >>>>   video_device_mark_unregistered() or a similarly named function) to
> >>>>   amend the functionality of video_unregister_device(): mark the device
> >>>>   unregistering so that new system calls issues on the device will
> >>>>   return an error.
> >>>>
> >>>> - Device node specific use count for active system calls.
> >>>>
> >>>> - Waiting list for waiting for the use count to reach zero (for
> >>>>   proceeding unregistration in v4l2_unregister_device()). Each finishing
> >>>>   system call on a file descriptor will wake up the waiting list.
> >>>>
> >>>> - Processes waiting in a waiting list of a system call (e.g. poll, DQBUF
> >>>>   IOCTL) will be woken up. After waiting, the processes will check
> >>>>   whether the video node is still registered, i.e.
> >>>>   video_unregister_device_prepare()
> >>>>   has not been called.
> >>>>
> >>>> Things to remember
> >>>> ------------------
> >>>>
> >>>> - Using the devm_request_irq() is _NOT_ allowed in general case, as the
> >>>>   free_irq() results in being called _after_ the driver's release
> >>>>   callback. This means that if new interrupts still arrive between
> >>>>   driver's release callback returning and devres resources not being
> >>>>   released, the driver's interrupt handler will still run. Use
> >>>>   request_irq() and free_irq() instead.
> >>>>
> >>>> What changes in practice?
> >>>> =========================
> >>>>
> >>>> Most of the changes needed in order to align the current implementation
> >>>> to conform the above are to be made in the frameworks themselves. What
> >>>> changes in drivers, though, is how objects are initialised and released.
> >>>> This mostly applies to drivers' own data structures that often contain
> >>>> reference counted object types from the frameworks.
> >>>>
> >>>> Relations between various frameworks objects will need to be handled in
> >>>> a standard way in the appropriate registeration, unregistration and
> >>>> reference getting and putting functions. Drivers (as well as frameworks,
> >>>> such as the V4L2 sub-device framework in its open and close file
> >>>> operations) will make use of these functions to get an put references to
> >>>> the said objects.
> >>>>
> >>>> Drivers that are not hot-pluggable could largely ignore the changes, as
> >>>> unregistering an object does in fact take place right before releasing
> >>>> that object, meaning drivers that did not perform any object reference
> >>>> counting are no more broken with the changes described above without
> >>>> implementing reference counting. Still care must be taken that they
> >>>> remain decently unloadable when there are no users.
> >>>>
> >>>> New drivers should be required to implement object reference counting
> >>>> correctly, whether hot-pluggable or not. One platform driver (e.g.
> >>>> omap3isp) should be fixed to function as a model for new platform
> >>>> drivers.  
> >  
> 
> Regards,
> 
> 	Hans

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [ANN] Media object lifetime management meeting report from Oslo
  2017-01-30 11:56         ` Mauro Carvalho Chehab
@ 2017-02-07 21:38           ` Sakari Ailus
  0 siblings, 0 replies; 8+ messages in thread
From: Sakari Ailus @ 2017-02-07 21:38 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Hans Verkuil, Laurent Pinchart, linux-media, Shuah Khan, Helen Koike

Hi Mauro,

On Mon, Jan 30, 2017 at 09:56:46AM -0200, Mauro Carvalho Chehab wrote:
> Em Mon, 30 Jan 2017 09:39:21 +0100
> Hans Verkuil <hverkuil@xs4all.nl> escreveu:
> 
> > On 28/01/17 16:35, Laurent Pinchart wrote:
> > > Hi Sakari,
> > >
> > > On Saturday 28 Jan 2017 00:02:53 Sakari Ailus wrote:  
> > >> On Fri, Jan 27, 2017 at 09:38:31AM -0200, Mauro Carvalho Chehab wrote:  
> > >>> Hi Sakari/Hans/Laurent,
> > >>>
> > >>> First of all, thanks for looking into those issues. Unfortunately, I was
> > >>> in vacations, and were not able to be with you there for such discussions.
> > >>>
> > >>> While I have a somewhat different view on some of the introductory points
> > >>> of this RFC, what really matters is the "proposal" part of it. So, based
> > >>> on the experiments I did when I addressed the hotplug issues with the
> > >>> media controller on USB drivers, I'm adding a few comments below.
> > >>>
> > >>> Em Fri, 27 Jan 2017 12:08:22 +0200 Sakari Ailus escreveu:  
> > >>>> Allocating struct media_devnode separately from struct media_device
> > >>>> -------------------------------------------------------------------
> > >>>>
> > >>>> The current codebase allocates struct media_devnode dynamically. This
> > >>>> was done in order to reduce the time window during which unallocated
> > >>>> kernel memory could be accessed. However, there is no specific need for
> > >>>> such a separation as the entire data structure, including the media
> > >>>> device which used to contain struct media_devnode, will have the same
> > >>>> lifetime. Thus the struct media_devnode should be allocated as part of
> > >>>> struct media_device. Later on, struct media_devnode may be merged with
> > >>>> struct media_device if so decided.  
> > >>>
> > >>> There is one issue merging struct media_devnode at struct media_device.
> > >>> The problem is that, if the same struct device is used for two different
> > >>> APIs (like V4L2 and media_controller) , e. g. if the cdev parent points
> > >>> to the same struct device, you may end by having a double free when the
> > >>> device is unregistered on the second core. That's basically why
> > >>> currently struct cdev is at struct media_devnode: this way, it has its own
> > >>> struct device.  
> > >>
> > >> One of the conclusions of the memorandum I wrote is that the data structures
> > >> that are involved in executing a system call (including IOCTLs) must stay
> > >> intact during that system call. (Well, the alternative I can think of is
> > >> using an rw_semaphore but I certainly do not advocate that solution.)
> > >>
> > >> Otherwise, we will continue to have serialisation issues: kernel memory that
> > >> may be released at any point of time independently of a system call in
> > >> progress:
> > >>
> > >> <URL:https://www.mail-archive.com/linux-media@vger.kernel.org/msg106390.html  
> > >>>  
> > >>
> > >> That one is a NULL pointer dereference but released kernel memory can be
> > >> accessed as well.
> > >>  
> 
> There's an alternative approach: to have a flag to indicate that the
> device got disconnected and denying access to the device-specific data
> if the device was already hot-unplugged. The V4L USB drivers do that
> already. See, for example, the em28xx driver:
> 
> int em28xx_read_reg_req_len(struct em28xx *dev, u8 req, u16 reg,
> 			    char *buf, int len)
> {
> ...
> 	if (dev->disconnected)
> 		return -ENODEV;
> ...
> }
> 
> /* Processes and copies the URB data content (video and VBI data) */
> static inline int em28xx_urb_data_copy(struct em28xx *dev, struct urb *urb)
> {
> ...
> 	if (dev->disconnected)
> 		return 0;
> 
> The same logic exists on multiple places along the driver. Some of
> such checks could be moved to the core, but, if I remember well when
> such code was written, several such checks should be inside the driver,
> as the USB core doesn't allow any calls to their functions after the
> disconnect callback is called. That's why it was opted to not move
> it to the core.
> 
> In the case of DVB, the core has a similar logic to handle disconnected
> devices, with prevents it to call driver-specific code when a device is
> removed. That's why there are just 2 checks if dev->disconnected at
> drivers/media/usb/em28xx/em28xx-dvb.c: one at the IRQ logic, and the
> second one to avoid a race issue with the suspend code.
> 
> Due to that, on real hot-plug devices (USB drivers) that don't have
> hot-unplug issues with the V4L2 API, it should be safe to free some of
> their data structs at the .disconnect callback. Of course, the struct
> that embeds the struct device used as the cdev's parent can't be freed
> there.

I agree there's a need to prevent devices from accessing the hardware ---
please also see "Stopping the hardware safely at device unbind" below ---
but that's unrelated to accessing the media device or the media graph from
the user space and from other users in the kernel space.

The USB devices are a bit special as the framework does ensure that
accessing a device does not lead to catastrophic results if it's no longer
there. For other busses, accessing the hardware is no longer allowed once
the driver's remove() function has completed.

The purpose here is just to ensure that the media device itself will stay
around while it's being accessed. For instance, issuing a system call on a
file handle open to a sub-device device node quite probably does require
accessing to the media device as well.

> 
> > >>> IMHO, it also makes sense to embeed struct cdev at the V4L2 side, as I
> > >>> detected some race issues at the V4L2 side when I ran the bind/unbind
> > >>> race tests, when we tried to merge the snd-usb-audio MC support patch.
> > >>> I remember Shuah reported something similar on that time.
> > >>>  
> > >>>> Allocating struct media_device dynamically
> > >>>> ------------------------------------------
> > >>>>
> > >>>> Traditionally the media device has been allocated as part of drivers'
> > >>>> own structures. This is certainly fine as such, but many drivers have
> > >>>> allocated the driver private struct using the devm_() family of
> > >>>> functions. This causes such memory to be released at the time of device
> > >>>> removal rather than at the time when the memory is no longer accessible
> > >>>> by e.g. user space processes or interrupt handlers. Besides the media
> > >>>> device, the driver's own data structure is very likely to have the
> > >>>> precisely same lifetime issue: it may also be accessed through IOCTLs
> > >>>> after the device has been removed from the system.
> > >>>>
> > >>>> Instead, the memory needs to be released as part of the release()
> > >>>> callback of the media device which is called when the last reference to
> > >>>> the media device is gone. Still, allocating the media device outside
> > >>>> another containing driver specific struct will be required in some cases:
> > >>>> sharing the media device mandates that. Implementing this can certainly
> > >>>> be postponed until sharing the media device is otherwise supported as
> > >>>> well.  
> > >>>
> > >>> The patches adding MC support for snd-usb-audio, pending since Kernel
> > >>> 4.7 (I guess) require such functionatilty. Last year, on the audio
> > >>> summit, they asked about its status, as they're needing MC suppor there.
> > >>>
> > >>> So, whatever solution taken, this should be part of the solution.
> > >>>
> > >>> (c/c Shuah, as she is the one working on it)  
> > >>
> > >> I think we should postpone that, or at least resolve that in a separate
> > >> patchset. This is already getting very big. The refcounting problems will be
> > >> harder to fix, should we extend the MC framework with media device sharing
> > >> without considering the object lifetime issues first. So the order should
> > >> be: first fix object lifetime issues, then add more features.  
> > >
> > > I strongly second that, continuing to build on top of a totally broken base
> > > can only lead to disaster. Given that we have given the lifetime management
> > > problem lots of thoughts already, with a patch series out to fix it and work
> > > ongoing to rework that series, I don't see a reason to order development in a
> > > different way.  
> > 
> > I agree with that as well.
> 
> Ok. When do you expect to have a patchset ready for tests covering the
> hot-plug issue without causing regressions to the USB drivers?

I've been adjusting my existing patchset to conform to the meeting notes
without making additional driver changes. The changes include removal of the
functions used to allocate the media device, i.e. embedding it into another
struct is the way to proceed. This means that fewer driver changes will be
needed. There's definitely some amount of work there left. Some review time
will need to be allocated as well.

I'd guesstimate that, if we will roughly agree on the proposed changes and
few iterations will be needed, we should be able to have that in the
mediatree.git master around the end of the month, depending a bit on easy it
is to make the necessary changes to the DVB USB drivers.

It'll likely take another month or two to add refcounting and help drivers
not to access hardware once they've been unbound. I presume there will be a
lot of details I can't foresee now that won't be trivial to handle. I might
be that my assumption on the complexity of the task at hand is too
pessimistic and it's actually easier that I foresee but I doubt it.

-- 
Kind regards,

Sakari Ailus
e-mail: sakari.ailus@iki.fi	XMPP: sailus@retiisi.org.uk

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-02-07 21:39 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-27 10:08 [ANN] Media object lifetime management meeting report from Oslo Sakari Ailus
2017-01-27 10:52 ` Sakari Ailus
2017-01-27 11:38 ` Mauro Carvalho Chehab
2017-01-27 22:02   ` Sakari Ailus
2017-01-28 15:35     ` Laurent Pinchart
2017-01-30  8:39       ` Hans Verkuil
2017-01-30 11:56         ` Mauro Carvalho Chehab
2017-02-07 21:38           ` Sakari Ailus

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.