linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Niklas Söderlund" <niklas.soderlund@ragnatech.se>
To: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>,
	Helen Koike <helen.koike@collabora.com>,
	Hans Verkuil <hverkuil-cisco@xs4all.nl>,
	linux-media@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs
Date: Wed, 3 Jul 2019 18:17:02 +0200	[thread overview]
Message-ID: <20190703161702.GA26461@bigcity.dyn.berto.se> (raw)
In-Reply-To: <20190630114102.GB7043@pendragon.ideasonboard.com>

Hi Shauah, Laurent,

On 2019-06-30 14:41:02 +0300, Laurent Pinchart wrote:
> Hi Shuah,
> 
> On Fri, Jun 28, 2019 at 10:41:07AM -0600, Shuah Khan wrote:
> > On 6/16/19 12:45 PM, Laurent Pinchart wrote:
> > > On Fri, Jun 14, 2019 at 05:26:46PM -0600, Shuah Khan wrote:
> > >> On 6/13/19 7:24 AM, Helen Koike wrote:
> > >>> On 6/13/19 2:44 AM, Hans Verkuil wrote:
> > >>>> On 5/24/19 5:31 AM, Shuah Khan wrote:
> > >>>>> media_device is embedded in struct vimc_device and when vimc is removed
> > >>>>> vimc_device and the embedded media_device goes with it, while the active
> > >>>>> stream and vimc_capture continue to access it.
> > >>>>>
> > >>>>> Fix the media_device lifetime problem by changing vimc to create shared
> > >>>>> media_device using Media Device Allocator API and vimc_capture getting
> > >>>>> a reference to vimc module. With this change, vimc module can be removed
> > >>>>> only when the references are gone. vimc can be removed after vimc_capture
> > >>>>> is removed.
> > >>>>>
> > >>>>> Media Device Allocator API supports just USB devices. Enhance it
> > >>>>> adding a genetic device allocate interface to support other media
> > >>>>> drivers.
> > >>>>>
> > >>>>> The new interface takes pointer to struct device instead and creates
> > >>>>> media device. This interface allows a group of drivers that have a
> > >>>>> common root device to share media device resource and ensure media
> > >>>>> device doesn't get deleted as long as one of the drivers holds its
> > >>>>> reference.
> > >>>>>
> > >>>>> The new interface has been tested with vimc component driver to fix
> > >>>>> panics when vimc module is removed while streaming is in progress.
> > >>>>
> > >>>> Helen, can you review this series? I'm not sure this is the right approach
> > >>>> for a driver like vimc, and even if it is, then it is odd that vimc-capture
> > >>>> is the only vimc module that's handled here.
> > >>>
> > >>> Hi Hans,
> > >>>
> > >>> Yes, I can take a look. Sorry, I've been a bit busy these days but I'll
> > >>> try to take a look at this patch series (and the others) asap.
> > >>>
> > >>> Helen
> > >>>
> > >>>> My gut feeling is that this should be handled inside vimc directly and not
> > >>>> using the media-dev-allocator.
> > >>
> > >> Hi Hans and Helen,
> > >>
> > >> I explored fixing the problem within vimc before I went down the path to
> > >> use Media Device Allocator API. I do think that it is cleaner to go this
> > >> way and easier to maintain.
> > >>
> > >> vimc is a group pf component drivers that rely on the media device vimc
> > >> in vimc and falls into the use-case Media Device Allocator API is added
> > >> to address. The release and life-time management happens without vimc
> > >> component drivers being changed other than using the API to get and put
> > >> media device reference.
> > > 
> > > Our replies crossed each other, please see my reply to Hans. I would
> > > just like to comment here that if having multiple kernel modules causes
> > > issue, they can all be merged together. There's no need for vimc to be
> > > handled through multiple modules (I actually think it's quite
> > > counterproductive, it only makes it more complex, for no added value).
> > 
> > There are several problems in this group of drivers as far as lifetime
> > management is concerned. I explained some of it in the patch 2/2
> > 
> > If vimc module is removed while streaming is active, vimc_exit runs
> > into NULL pointer dereference error when streaming thread tries to
> > access and lock graph_mutex in the struct media_device.
> > 
> > The primary reason for this is that:
> > 
> > media_device is embedded in struct vimc_device and when vimc is removed
> > vimc_device and the embedded media_device goes with it, while the active
> > stream and vimc_capture continue to access it.
> 
> The issue isn't so much that media_devic is embedded in vimc_device, but
> that vimc_device is released too early. Not only does the thread need to
> access the graph_mutex lock in the media_device structure, but it can
> potentially access fields of the device-specific structures as well. The
> proper solution is to propagate media_device_release() one level up, in
> order to only release the top-level structure containing media_device
> when the last reference to the media_device is dropped.

I have seen similar problems with rcar-vin, the device specific data is 
released to early. In my case it was not triggered by the struct
media_device but with a struct v4l2_device embedded in the device 
specific data IIRC.

This was when I tried to address the lifetime issues of the video device 
when binding/unbinding the device to the driver and not when unloading 
the module. This was quiet a while ago so I don't recall specifics, 
sorry about that. One finding was that there are also unsolved problems 
when it comes async notifiers and unloading/unbinding and then 
loading/binding subdevices as well as the driver controlling the video 
device. It was such a mess I gave up.

I'm happy to see activity in this area but I fear it might need work on 
a higher level and not trying to work around the problem in drivers.

> 
> > If we chose to keep these drivers as component drivers, media device
> > needs to stick around until all components stop using it. This is tricky
> > because there is no tie between these set of drivers. vimc module can
> > be deleted while others are still active. As vimc gets removed, other
> > component drivers start wanting to access the media device tree.
> 
> Reference-counting is the key.
> 
> > This is classic media device lifetime problem which could be solved
> > easily with the way I solved it with this series. I saw this as a
> > variation on the same use-case we had with sound and media drivers
> > sharing the media device.
> 
> This isn't about solving it easily, it's about solving it properly. The
> media device allocator as used here is a hack and takes us in the
> opposite direction of a proper fix.
> 
> > I have a TODO request from you asking to extend Media Device Allocator
> > API to generic case and not restrict it to USB devices. My thinking is
> > that this gives a perfect test case to extend the API to be generic
> > and use to solve this problem.
> 
> The biggest issue at the moment with the media device allocator, which I
> have pointed out numerous times and has never been addressed (and which
> explains why I didn't think the code was ready to be merged) is that the
> media_device contains operations that are based on having a single
> driver controlling the media device. A proper shared media device
> allocator needs to drop the concept of a single master for the media
> device, and thus needs to refactor those operations to allow any user of
> the media device to implement them (the .link_notify() operation is a
> prime example, and the recently added request operations will make this
> even more challenging - think of how this patch series would prevent
> vimc from properly implementing the request API). As long as these issue
> are not fixed I will be firmly opposed to spreading the usage of the
> media device allocator beyond what exists today.
> 
> > Collapsing the drivers into one might be lot more difficult and complex
> > than solving this problem with Media Device Allocator API. This approach
> > has an added benefit of extending the API to be generic and not just for
> > USB.
> 
> I've never disputed the fact that fixing a problem correctly is usually
> more work than hacking around it :-)
> 
> > I looked at this as a good way to add generic API and have a great test
> > case for it. This patch series fixes the problem for the current vimc
> > architecture.
> 
> NAK, for the reasons above. Please drop this series and fix the problem
> properly.
> 
> -- 
> Regards,
> 
> Laurent Pinchart

-- 
Regards,
Niklas Söderlund

  reply	other threads:[~2019-07-03 16:17 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-24  3:31 [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs Shuah Khan
2019-05-24  3:31 ` [PATCH 1/2] media: add generic device allocate interface to media-dev-allocator Shuah Khan
2019-05-24  3:31 ` [PATCH 2/2] vimc: fix BUG: unable to handle kernel NULL pointer dereference Shuah Khan
2019-06-13  5:44 ` [PATCH 0/2] Use Media Dev Allocator to fix vimc dev lifetime bugs Hans Verkuil
2019-06-13 13:24   ` Helen Koike
2019-06-14 23:26     ` Shuah Khan
2019-06-16 18:45       ` Laurent Pinchart
2019-06-28 16:41         ` Shuah Khan
2019-06-30 11:41           ` Laurent Pinchart
2019-07-03 16:17             ` Niklas Söderlund [this message]
2019-07-03 16:52               ` Shuah Khan
2019-07-03 23:25                 ` Laurent Pinchart
2019-07-03 23:42                   ` Shuah Khan
2019-06-16 18:43   ` Laurent Pinchart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190703161702.GA26461@bigcity.dyn.berto.se \
    --to=niklas.soderlund@ragnatech.se \
    --cc=helen.koike@collabora.com \
    --cc=hverkuil-cisco@xs4all.nl \
    --cc=laurent.pinchart@ideasonboard.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=skhan@linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).