linux-media.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sakari Ailus <sakari.ailus@iki.fi>
To: Kieran Bingham <kieran@ksquared.org.uk>
Cc: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>,
	laurent.pinchart@ideasonboard.com,
	linux-renesas-soc@vger.kernel.org, linux-media@vger.kernel.org
Subject: Re: [PATCHv3 RFC 4/4] media: Catch null pipes on pipeline stop
Date: Wed, 14 Dec 2016 14:43:25 +0200	[thread overview]
Message-ID: <20161214124324.GB16630@valkosipuli.retiisi.org.uk> (raw)
In-Reply-To: <9594fc25-c657-7326-0987-9a7bc1bc888f@bingham.xyz>

Hi Kieran,

On Wed, Dec 14, 2016 at 12:27:37PM +0000, Kieran Bingham wrote:
> Hi Sakari,
> 
> On 14/12/16 07:28, Sakari Ailus wrote:
> > Hi Kieran,
> > 
> > On Tue, Dec 13, 2016 at 05:59:44PM +0000, Kieran Bingham wrote:
> >> media_entity_pipeline_stop() can be called through error paths with a
> >> NULL entity pipe object. In this instance, stopping is a no-op, so
> >> simply return without any action
> > 
> > The approach of returning silently is wrong; the streaming count is indeed a
> > count: you have to decrement it the exactly same number of times it's been
> > increased. Code that attempts to call __media_entity_pipeline_stop() on an
> > entity that's not streaming is simply buggy.
> 
> Ok, Thanks for the heads up on where to look, as I suspected we are in
> section B) of my comments below :D
> 
> > I've got a patch here that adds a warning to graph traversal on streaming
> > count being zero. I sent a pull request including that some days ago:
> > 
> > <URL:http://www.spinics.net/lists/linux-media/msg108980.html>
> > <URL:http://www.spinics.net/lists/linux-media/msg108995.html>
> 
> Excellent, thanks, I've merged your branch into mine, and I'll
> investigate where the error is actually coming from.
> 
> --
> Ok - so I've merged your patches, (and dropped this patch) but they
> don't fire any warning before I hit my null-pointer dereference in
> __media_pipeline_stop(), on the WARN_ON(!pipe->streaming_count);
> 
> So I think this is a case of calling stop on an invalid entity rather
> than an unbalanced start/stop somehow:

Not necessarily. The pipe is set to NULL if the count goes to zero.

This check should be dropped, it's ill-placed anyway: if streaming count is
zero the pipe is expected to be NULL anyway in which case you get a NULL
pointer exception (that you saw there). With the check removed, you should
get the warning instead.

> 
> [  613.830334] vsp1 fea38000.vsp: failed to reset wpf.0
> [  613.838137] PM: resume of devices complete after 124.410 msecs
> [  613.847390] PM: Finishing wakeup.
> [  613.852247] Restarting tasks ... done.
> [  615.864801] ravb e6800000.ethernet eth0: Link is Up - 100Mbps/Full -
> flow control rx/tx
> [  617.011299] vsp1 fea38000.vsp: failed to reset wpf.0
> [  617.017777] vsp1 fea38000.vsp: DRM pipeline stop timeout
> [  617.024649] Unable to handle kernel NULL pointer dereference at
> virtual address 00000000
> [  617.034273] pgd = ffffff80098c5000
> [  617.039171] [00000000] *pgd=000000073fffe003[  617.043336] ,
> *pud=000000073fffe003
> , *pmd=0000000000000000[  617.050353]
> [  617.053282] Internal error: Oops: 96000005 [#1] PREEMPT SMP
> [  617.060309] Modules linked in:
> [  617.064793] CPU: 1 PID: 683 Comm: kworker/1:2 Not tainted
> 4.9.0-00506-g4d2a939ea654 #350
> [  617.074320] Hardware name: Renesas Salvator-X board based on r8a7795 (DT)
> [  617.082578] Workqueue: events drm_mode_rmfb_work_fn
> [  617.088884] task: ffffffc6fabd2b00 task.stack: ffffffc6f9d90000
> [  617.096246] PC is at __media_pipeline_stop+0x24/0xe8
> [  617.102644] LR is at media_pipeline_stop+0x34/0x48
> 
> <regs and stack snipped>
> 
> [  617.863386] [<ffffff800853e724>] __media_pipeline_stop+0x24/0xe8
> [  617.870417] [<ffffff800853e81c>] media_pipeline_stop+0x34/0x48
> [  617.877272] [<ffffff8008568000>] vsp1_du_setup_lif+0x68/0x5a8
> [  617.884047] [<ffffff80084de9d4>] rcar_du_vsp_disable+0x2c/0x38
> [  617.890898] [<ffffff80084da710>] rcar_du_crtc_stop+0x198/0x1e8
> [  617.897749] [<ffffff80084da780>] rcar_du_crtc_disable+0x20/0x38
> [  617.904683] [<ffffff80084ac9b0>]
> drm_atomic_helper_commit_modeset_disables+0x1b0/0x3d8
> [  617.913634] [<ffffff80084db9cc>] rcar_du_atomic_complete+0x34/0xc8
> [  617.920828] [<ffffff80084dbd20>] rcar_du_atomic_commit+0x2c0/0x2d0
> [  617.928012] [<ffffff80084ceffc>] drm_atomic_commit+0x5c/0x68
> [  617.934663] [<ffffff80084ad2e0>] drm_atomic_helper_set_config+0x90/0xd0
> [  617.942288] [<ffffff80084c06a0>] drm_mode_set_config_internal+0x70/0x100
> [  617.949996] [<ffffff80084c0760>] drm_crtc_force_disable+0x30/0x40
> [  617.957084] [<ffffff80084d0b10>] drm_framebuffer_remove+0x100/0x128
> [  617.964347] [<ffffff80084d0b80>] drm_mode_rmfb_work_fn+0x48/0x60
> [  617.971352] [<ffffff80080e9770>] process_one_work+0x1e0/0x738
> [  617.978084] [<ffffff80080e9f24>] worker_thread+0x25c/0x4a0
> [  617.984546] [<ffffff80080f11ac>] kthread+0xd4/0xe8
> [  617.990305] [<ffffff8008083680>] ret_from_fork+0x10/0x50
> 
> 
> So, I'll have to schedule some time to investigate this deeper and find
> out where we are calling stop on an invalid entity object.
> 
> I agree that this patch should simply be dropped though, it was more to
> promote a bit of discussion on the area to help me understand what was
> going on, which it has!.
> 
> Thankyou Sakari!
> 
> --
> Regards
> 
> Kieran
> 
> 
> > I think the check here could simply be removed as the check is done for
> > every entity in the pipeline with the above patch.
> > 
> > If there's still a wish to check for the pipe field which should not be
> > written by drivers, it should be done during pipeline traversal so that it
> > applies to all entities in the pipeline, not just where traversal starts.
> > 
> >>
> >> Signed-off-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
> >> ---
> >>
> >> I've marked this patch as RFC, although if deemed suitable, by all means
> >> integrate it as is.
> >>
> >> When testing suspend/resume operations on VSP1, I encountered a segfault on the
> >> WARN_ON(!pipe->streaming_count) line, where 'pipe == NULL'. The simple
> >> protection fix is to return early in this instance, as this patch does however:
> >>
> >> A) Does this early return path warrant a WARN() statement itself, to identify
> >> drivers which are incorrectly calling media_entity_pipeline_stop() with an
> >> invalid entity, or would this just be noise ...
> >>
> >> and therefore..
> >>
> >> B) I also partly assume this patch could simply get NAK'd with a request to go
> >> and dig out the root cause of calling media_entity_pipeline_stop() with an
> >> invalid entity. 
> >>
> >> My brief investigation so far here so far shows that it's almost a second order
> >> fault - where the first suspend resume cycle completes but leaves the entity in
> >> an invalid state having followed an error path - and then on a second
> >> suspend/resume - the stop fails with the affected segfault.
> >>
> >> If statement A) or B) apply here, please drop this patch from the series, and
> >> don't consider it a blocking issue for the other 3 patches.
> >>
> >> Kieran
> >>
> >>
> >>  drivers/media/media-entity.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/drivers/media/media-entity.c b/drivers/media/media-entity.c
> >> index c68239e60487..93c9cbf4bf46 100644
> >> --- a/drivers/media/media-entity.c
> >> +++ b/drivers/media/media-entity.c
> >> @@ -508,6 +508,8 @@ void __media_entity_pipeline_stop(struct media_entity *entity)
> >>  	struct media_entity_graph *graph = &entity->pipe->graph;
> >>  	struct media_pipeline *pipe = entity->pipe;
> >>  
> >> +	if (!pipe)
> >> +		return;
> >>  
> >>  	WARN_ON(!pipe->streaming_count);
> >>  	media_entity_graph_walk_start(graph, entity);
> > 

-- 
Regards,

Sakari Ailus
e-mail: sakari.ailus@iki.fi	XMPP: sailus@retiisi.org.uk

  reply	other threads:[~2016-12-14 12:44 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-13 17:59 [PATCHv3 0/4] v4l: vsp1: Fix suspend/resume and race on M2M pipelines Kieran Bingham
2016-12-13 17:59 ` [PATCHv3 1/4] v4l: vsp1: Move vsp1_video_setup_pipeline() Kieran Bingham
2016-12-14 20:21   ` Laurent Pinchart
2016-12-13 17:59 ` [PATCHv3 2/4] v4l: vsp1: Refactor video pipeline configuration Kieran Bingham
2016-12-14 16:30   ` Laurent Pinchart
2016-12-15 11:50     ` Kieran Bingham
2017-01-06 11:11       ` Kieran Bingham
2016-12-13 17:59 ` [PATCHv3 3/4] v4l: vsp1: Use local display lists and remove global pipe->dl Kieran Bingham
2016-12-13 17:59 ` [PATCHv3 RFC 4/4] media: Catch null pipes on pipeline stop Kieran Bingham
2016-12-14  7:28   ` Sakari Ailus
2016-12-14 12:27     ` Kieran Bingham
2016-12-14 12:43       ` Sakari Ailus [this message]
2016-12-14 17:53         ` Kieran Bingham
2016-12-15  7:14           ` Sakari Ailus
2016-12-14 19:56   ` Kieran Bingham
2016-12-14 21:16     ` Laurent Pinchart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161214124324.GB16630@valkosipuli.retiisi.org.uk \
    --to=sakari.ailus@iki.fi \
    --cc=kieran.bingham+renesas@ideasonboard.com \
    --cc=kieran@ksquared.org.uk \
    --cc=laurent.pinchart@ideasonboard.com \
    --cc=linux-media@vger.kernel.org \
    --cc=linux-renesas-soc@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).