All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Cc: <linuxarm@huawei.com>, <mauro.chehab@huawei.com>,
	Andy Gross <agross@kernel.org>,
	Bjorn Andersson <bjorn.andersson@linaro.org>,
	"Hans Verkuil" <hans.verkuil@cisco.com>,
	Mauro Carvalho Chehab <mchehab@kernel.org>,
	Stanimir Varbanov <stanimir.varbanov@linaro.org>,
	<linux-arm-msm@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-media@vger.kernel.org>
Subject: Re: [PATCH v4 01/79] media: venus: fix PM runtime logic at venus_sys_error_handler()
Date: Mon, 3 May 2021 11:00:22 +0200	[thread overview]
Message-ID: <20210503110022.22503bfd@coco.lan> (raw)
In-Reply-To: <20210430162110.000058e0@Huawei.com>

Em Fri, 30 Apr 2021 16:21:10 +0100
Jonathan Cameron <Jonathan.Cameron@Huawei.com> escreveu:

> On Wed, 28 Apr 2021 16:51:22 +0200
> Mauro Carvalho Chehab <mchehab+huawei@kernel.org> wrote:
> 
> > The venus_sys_error_handler() assumes that pm_runtime was
> > able to resume, as it does things like:
> > 
> > 	while (pm_runtime_active(core->dev_dec) || pm_runtime_active(core->dev_enc))
> > 		msleep(10);
> > 
> > Well, if, for whatever reason, this won't happen, the routine
> > won't do what's expected. So, check for the returned error
> > condition, warning if it returns an error.
> > 
> > Fixes: af2c3834c8ca ("[media] media: venus: adding core part and helper functions")
> > Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
> > ---
> >  drivers/media/platform/qcom/venus/core.c | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/media/platform/qcom/venus/core.c b/drivers/media/platform/qcom/venus/core.c
> > index 54bac7ec14c5..c80c27c87ccc 100644
> > --- a/drivers/media/platform/qcom/venus/core.c
> > +++ b/drivers/media/platform/qcom/venus/core.c
> > @@ -84,7 +84,11 @@ static void venus_sys_error_handler(struct work_struct *work)
> >  			container_of(work, struct venus_core, work.work);
> >  	int ret = 0;
> >  
> > -	pm_runtime_get_sync(core->dev);
> > +	ret = pm_runtime_get_sync(core->dev);
> > +	if (WARN_ON(ret < 0)) {
> > +		pm_runtime_put_noidle(core->dev);
> > +		return;
> > +	}
> >  
> >  	hfi_core_deinit(core, true);
> >  
> > @@ -106,9 +110,13 @@ static void venus_sys_error_handler(struct work_struct *work)
> >  
> >  	hfi_reinit(core);
> >  
> > -	pm_runtime_get_sync(core->dev);
> > +	ret = pm_runtime_get_sync(core->dev);
> > +	if (WARN_ON(ret < 0)) {
> > +		pm_runtime_put_noidle(core->dev);  

Thanks for review!

> mutex_unlock(&core->lock);
> (the unlock is currently just below the enable_irq() in 5.12)

Basically, this function assumes that the core->lock is not locked
and that IRQs are disabled, as can be seen at the function which
starts such work:

    static void venus_event_notify(struct venus_core *core, u32 event)
    {
	struct venus_inst *inst;

	switch (event) {
	case EVT_SYS_WATCHDOG_TIMEOUT:
	case EVT_SYS_ERROR:
		break;
	default:
		return;
	}

	mutex_lock(&core->lock);
	core->sys_error = true;
	list_for_each_entry(inst, &core->instances, list)
		inst->ops->event_notify(inst, EVT_SESSION_ERROR, NULL);
	mutex_unlock(&core->lock);

	disable_irq_nosync(core->irq);
	schedule_delayed_work(&core->work, msecs_to_jiffies(10));
    }

The code inside it actually locks/unlocks two times the core->lock. 
See, this is the original code:

    static void venus_sys_error_handler(struct work_struct *work)
    {
	<not locked>

        pm_runtime_get_sync(core->dev);

        hfi_core_deinit(core, true);

        dev_warn(core->dev, "system error has occurred, starting recovery!\n");

        mutex_lock(&core->lock);
	while (pm_runtime_active(core->dev_dec) || pm_runtime_active(core->dev_enc))
                msleep(10);
...
        enable_irq(core->irq);
        mutex_unlock(&core->lock);
...
	if (ret) {
                disable_irq_nosync(core->irq);
                dev_warn(core->dev, "recovery failed (%d)\n", ret);
                schedule_delayed_work(&core->work, msecs_to_jiffies(10));
                return;
        }

	mutex_lock(&core->lock);
        core->sys_error = false;
        mutex_unlock(&core->lock);
    }

It should be noticed that, once started, this delayed work re-starts
itself, with IRQs disabled, trying to reboot the Venus IP hardware,
until it stops failing [1].

[1] IMHO, it seems a very bad idea to keep running the work forever,
flooding syslog with error messages on every 10ms or so.

That's said, my patch doesn't seem to fix all potential issues that
could happen there.

I'll propose a separate fix, outside this patch series, as the issues
here are not only due to RPM, but the main issue is that both
while loops inside this code can run forever with the core->lock
hold.

Thanks,
Mauro

  reply	other threads:[~2021-05-03  9:00 UTC|newest]

Thread overview: 208+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-28 14:51 [PATCH v4 00/79] Address some issues with PM runtime at media subsystem Mauro Carvalho Chehab
2021-04-28 14:51 ` Mauro Carvalho Chehab
2021-04-28 14:51 ` Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 01/79] media: venus: fix PM runtime logic at venus_sys_error_handler() Mauro Carvalho Chehab
2021-04-30 15:21   ` Jonathan Cameron
2021-05-03  9:00     ` Mauro Carvalho Chehab [this message]
2021-04-28 14:51 ` [PATCH v4 02/79] media: s6p_cec: decrement usage count if disabled Mauro Carvalho Chehab
2021-04-28 15:31   ` Sylwester Nawrocki
2021-04-28 14:51 ` [PATCH v4 03/79] media: i2c: ccs-core: return the right error code at suspend Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 04/79] media: i2c: ov7740: don't resume at remove time Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 05/79] media: i2c: video-i2c: " Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 06/79] media: i2c: imx334: fix the pm runtime get logic Mauro Carvalho Chehab
2021-04-30 15:54   ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 07/79] media: exynos-gsc: don't resume at remove time Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-28 15:25   ` Sylwester Nawrocki
2021-04-28 15:25     ` Sylwester Nawrocki
2021-04-28 19:44   ` kernel test robot
2021-04-28 19:44     ` kernel test robot
2021-04-28 19:44     ` kernel test robot
2021-04-28 19:56   ` kernel test robot
2021-04-28 19:56     ` kernel test robot
2021-04-28 19:56     ` kernel test robot
2021-04-28 14:51 ` [PATCH v4 08/79] media: atmel: properly get pm_runtime Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-30 16:13   ` Jonathan Cameron
2021-04-30 16:13     ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 09/79] media: marvel-ccic: fix some issues when getting pm_runtime Mauro Carvalho Chehab
2021-04-30 16:29   ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 10/79] media: mdk-mdp: fix pm_runtime_get_sync() usage count Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-30 16:30   ` Jonathan Cameron
2021-04-30 16:30     ` Jonathan Cameron
2021-04-30 16:30     ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 11/79] media: rcar_fdp1: " Mauro Carvalho Chehab
2021-04-30 16:26   ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 12/79] media: renesas-ceu: Properly check for PM errors Mauro Carvalho Chehab
2021-04-30 16:28   ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 13/79] media: s5p: fix pm_runtime_get_sync() usage count Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 14/79] media: am437x: " Mauro Carvalho Chehab
2021-04-30 16:36   ` Jonathan Cameron
2021-05-04  7:19     ` Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 15/79] media: sh_vou: " Mauro Carvalho Chehab
2021-04-30 16:40   ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 16/79] media: mtk-vcodec: " Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-30 16:44   ` Jonathan Cameron
2021-04-30 16:44     ` Jonathan Cameron
2021-04-30 16:44     ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 17/79] media: s5p-jpeg: " Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 18/79] media: sti/delta: " Mauro Carvalho Chehab
2021-04-30 16:47   ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 19/79] media: sunxi: " Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-30 16:48   ` Jonathan Cameron
2021-04-30 16:48     ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 20/79] staging: media: rkvdec: " Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-28 15:09   ` Johan Hovold
2021-04-28 15:09     ` Johan Hovold
2021-04-28 15:09     ` Johan Hovold
2021-04-29  7:38     ` Mauro Carvalho Chehab
2021-04-29  7:38       ` Mauro Carvalho Chehab
2021-04-29  7:38       ` Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 21/79] staging: media: atomisp: use pm_runtime_resume_and_get() Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-30 16:59   ` Jonathan Cameron
2021-04-30 16:59     ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 22/79] staging: media: imx7-mipi-csis: " Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 23/79] staging: media: ipu3: " Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-30 17:03   ` Jonathan Cameron
2021-04-30 17:03     ` Jonathan Cameron
2021-05-03  9:57     ` Johan Hovold
2021-05-03  9:57       ` Johan Hovold
2021-04-28 14:51 ` [PATCH v4 24/79] staging: media: cedrus_video: " Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-30 17:05   ` Jonathan Cameron
2021-04-30 17:05     ` Jonathan Cameron
2021-04-30 17:05     ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 25/79] staging: media: tegra-vde: " Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-29 12:38   ` Dmitry Osipenko
2021-04-29 12:38     ` Dmitry Osipenko
2021-04-30 17:08   ` Jonathan Cameron
2021-04-30 17:08     ` Jonathan Cameron
2021-04-30 17:11     ` Jonathan Cameron
2021-04-30 17:11       ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 26/79] staging: media: tegra-video: " Mauro Carvalho Chehab
2021-04-28 14:51   ` Mauro Carvalho Chehab
2021-04-30 17:13   ` Jonathan Cameron
2021-04-30 17:13     ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 27/79] media: i2c: ak7375: " Mauro Carvalho Chehab
2021-04-30 17:14   ` Jonathan Cameron
2021-04-28 14:51 ` [PATCH v4 28/79] media: i2c: ccs-core: " Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 29/79] media: i2c: dw9714: " Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 30/79] media: i2c: dw9768: " Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 31/79] media: i2c: dw9807-vcm: " Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 32/79] media: i2c: hi556: " Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 33/79] media: i2c: imx214: " Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 34/79] media: i2c: imx219: " Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 35/79] media: i2c: imx258: " Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 36/79] media: i2c: imx274: " Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 37/79] media: i2c: imx290: " Mauro Carvalho Chehab
2021-04-28 14:51 ` [PATCH v4 38/79] media: i2c: imx319: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 39/79] media: i2c: imx355: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 40/79] media: i2c: mt9m001: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 41/79] media: i2c: ov02a10: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 42/79] media: i2c: ov13858: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 43/79] media: i2c: ov2659: " Mauro Carvalho Chehab
2021-05-03 12:30   ` Lad, Prabhakar
2021-04-28 14:52 ` [PATCH v4 44/79] media: i2c: ov2685: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 45/79] media: i2c: ov2740: " Mauro Carvalho Chehab
2021-04-30 17:20   ` Jonathan Cameron
2021-04-28 14:52 ` [PATCH v4 46/79] media: i2c: ov5647: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 47/79] media: i2c: ov5648: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 48/79] media: i2c: ov5670: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 49/79] media: i2c: ov5675: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 50/79] media: i2c: ov5695: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 51/79] media: i2c: ov7740: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 52/79] media: i2c: ov8856: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 53/79] media: i2c: ov8865: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 54/79] media: i2c: ov9734: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 55/79] media: i2c: tvp5150: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 56/79] media: i2c: video-i2c: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 57/79] media: rockchip/rga: " Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 17:11   ` Ezequiel Garcia
2021-04-28 17:11     ` Ezequiel Garcia
2021-04-28 17:11     ` Ezequiel Garcia
2021-04-28 14:52 ` [PATCH v4 58/79] media: sti/hva: " Mauro Carvalho Chehab
2021-04-30 17:35   ` Jonathan Cameron
2021-04-28 14:52 ` [PATCH v4 59/79] media: sti/bdisp: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 60/79] media: ipu3: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 61/79] media: coda: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 62/79] media: exynos4-is: " Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-30 17:49   ` Jonathan Cameron
2021-04-30 17:49     ` Jonathan Cameron
2021-04-28 14:52 ` [PATCH v4 63/79] media: exynos-gsc: " Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-30 17:50   ` Jonathan Cameron
2021-04-30 17:50     ` Jonathan Cameron
2021-04-28 14:52 ` [PATCH v4 64/79] media: mtk-jpeg: " Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 65/79] media: camss: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 66/79] media: venus: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 67/79] media: venus: vdec: " Mauro Carvalho Chehab
2021-04-30 17:53   ` Jonathan Cameron
2021-04-28 14:52 ` [PATCH v4 68/79] media: venus: venc: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 69/79] media: rcar-fcp: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 70/79] media: rkisp1: " Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 71/79] media: s3c-camif: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 72/79] media: s5p-mfc: " Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 73/79] media: stm32: " Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-30 17:58   ` Jonathan Cameron
2021-04-30 17:58     ` Jonathan Cameron
2021-04-28 14:52 ` [PATCH v4 74/79] media: sunxi: " Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 75/79] media: ti-vpe: " Mauro Carvalho Chehab
2021-04-30 18:03   ` Jonathan Cameron
2021-05-03 12:49     ` Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 76/79] media: vsp1: " Mauro Carvalho Chehab
2021-04-28 14:52 ` [PATCH v4 77/79] media: rcar-vin: " Mauro Carvalho Chehab
2021-04-30 18:05   ` Jonathan Cameron
2021-04-28 14:52 ` [PATCH v4 78/79] media: hantro: " Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 17:14   ` Ezequiel Garcia
2021-04-28 17:14     ` Ezequiel Garcia
2021-04-28 17:14     ` Ezequiel Garcia
2021-04-30 18:09   ` Jonathan Cameron
2021-04-30 18:09     ` Jonathan Cameron
2021-04-30 18:09     ` Jonathan Cameron
2021-04-28 14:52 ` [PATCH v4 79/79] media: hantro: do a PM resume earlier Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 14:52   ` Mauro Carvalho Chehab
2021-04-28 17:17   ` Ezequiel Garcia
2021-04-28 17:17     ` Ezequiel Garcia
2021-04-28 17:17     ` Ezequiel Garcia
2021-04-29  7:13     ` Mauro Carvalho Chehab
2021-04-29  7:13       ` Mauro Carvalho Chehab
2021-04-29  7:13       ` Mauro Carvalho Chehab
2021-04-28 15:50 ` [PATCH v4 00/79] Address some issues with PM runtime at media subsystem Johan Hovold
2021-04-28 15:50   ` Johan Hovold
2021-04-28 15:50   ` Johan Hovold
2021-04-29 10:18   ` Mauro Carvalho Chehab
2021-04-29 10:18     ` Mauro Carvalho Chehab
2021-04-29 10:18     ` Mauro Carvalho Chehab
2021-04-29 12:33     ` Dmitry Osipenko
2021-04-29 12:33       ` Dmitry Osipenko
2021-04-29 12:33       ` Dmitry Osipenko
2021-04-29 15:17     ` Johan Hovold
2021-04-29 15:17       ` Johan Hovold
2021-04-29 15:17       ` Johan Hovold

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210503110022.22503bfd@coco.lan \
    --to=mchehab+huawei@kernel.org \
    --cc=Jonathan.Cameron@Huawei.com \
    --cc=agross@kernel.org \
    --cc=bjorn.andersson@linaro.org \
    --cc=hans.verkuil@cisco.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=mauro.chehab@huawei.com \
    --cc=mchehab@kernel.org \
    --cc=stanimir.varbanov@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.