dri-devel.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: Boris Brezillon <boris.brezillon@collabora.com>
To: Liviu Dudau <Liviu.Dudau@arm.com>
Cc: "Nicolas Boichat" <drinkcat@chromium.org>,
	kernel@collabora.com, "Daniel Stone" <daniels@collabora.com>,
	"Neil Armstrong" <neil.armstrong@linaro.org>,
	dri-devel@lists.freedesktop.org,
	"Steven Price" <steven.price@arm.com>,
	"Clément Péron" <peron.clem@gmail.com>,
	"Grant Likely" <grant.likely@linaro.org>,
	"Marty E . Plummer" <hanetzer@startmail.com>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Faith Ekstrand" <faith.ekstrand@collabora.com>
Subject: Re: [PATCH v3 03/14] drm/panthor: Add the device logical block
Date: Fri, 22 Dec 2023 15:04:56 +0100	[thread overview]
Message-ID: <20231222150456.0fca9d73@collabora.com> (raw)
In-Reply-To: <ZYWOfhMa2ljntUJW@e110455-lin.cambridge.arm.com>

On Fri, 22 Dec 2023 13:26:22 +0000
Liviu Dudau <Liviu.Dudau@arm.com> wrote:

> Hi Boris,
> 
> On Mon, Dec 04, 2023 at 06:32:56PM +0100, Boris Brezillon wrote:
> > The panthor driver is designed in a modular way, where each logical
> > block is dealing with a specific HW-block or software feature. In order
> > for those blocks to communicate with each other, we need a central
> > panthor_device collecting all the blocks, and exposing some common
> > features, like interrupt handling, power management, reset, ...
> > 
> > This what this panthor_device logical block is about.
> > 
> > v3:
> > - Add acks for the MIT+GPL2 relicensing
> > - Fix 32-bit support
> > - Shorten the sections protected by panthor_device::pm::mmio_lock to fix
> >   lock ordering issues.
> > - Rename panthor_device::pm::lock into panthor_device::pm::mmio_lock to
> >   better reflect what this lock is protecting
> > - Use dev_err_probe()
> > - Make sure we call drm_dev_exit() when something fails half-way in
> >   panthor_device_reset_work()
> > - Replace CSF_GPU_LATEST_FLUSH_ID_DEFAULT with a constant '1' and a
> >   comment to explain. Also remove setting the dummy flush ID on suspend.
> > - Remove drm_WARN_ON() in panthor_exception_name()
> > - Check pirq->suspended in panthor_xxx_irq_raw_handler()
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> > Signed-off-by: Steven Price <steven.price@arm.com>
> > Acked-by: Steven Price <steven.price@arm.com> # MIT+GPL2 relicensing,Arm
> > Acked-by: Grant Likely <grant.likely@linaro.org> # MIT+GPL2 relicensing,Linaro
> > Acked-by: Boris Brezillon <boris.brezillon@collabora.com> # MIT+GPL2 relicensing,Collabora
> > ---
> >  drivers/gpu/drm/panthor/panthor_device.c | 497 +++++++++++++++++++++++
> >  drivers/gpu/drm/panthor/panthor_device.h | 381 +++++++++++++++++
> >  2 files changed, 878 insertions(+)
> >  create mode 100644 drivers/gpu/drm/panthor/panthor_device.c
> >  create mode 100644 drivers/gpu/drm/panthor/panthor_device.h
> > 
> > diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> > new file mode 100644
> > index 000000000000..40038bac147a
> > --- /dev/null
> > +++ b/drivers/gpu/drm/panthor/panthor_device.c
> > @@ -0,0 +1,497 @@
> > +// SPDX-License-Identifier: GPL-2.0 or MIT
> > +/* Copyright 2018 Marty E. Plummer <hanetzer@startmail.com> */
> > +/* Copyright 2019 Linaro, Ltd, Rob Herring <robh@kernel.org> */
> > +/* Copyright 2023 Collabora ltd. */
> > +
> > +#include <linux/clk.h>
> > +#include <linux/reset.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/pm_domain.h>
> > +#include <linux/pm_runtime.h>
> > +#include <linux/regulator/consumer.h>
> > +
> > +#include <drm/drm_drv.h>
> > +#include <drm/drm_managed.h>
> > +
> > +#include "panthor_sched.h"
> > +#include "panthor_device.h"
> > +#include "panthor_devfreq.h"
> > +#include "panthor_gpu.h"
> > +#include "panthor_fw.h"
> > +#include "panthor_mmu.h"
> > +#include "panthor_regs.h"
> > +
> > +static int panthor_clk_init(struct panthor_device *ptdev)
> > +{
> > +	ptdev->clks.core = devm_clk_get(ptdev->base.dev, NULL);
> > +	if (IS_ERR(ptdev->clks.core))
> > +		return dev_err_probe(ptdev->base.dev,
> > +				     PTR_ERR(ptdev->clks.core),
> > +				     "get 'core' clock failed");
> > +
> > +	ptdev->clks.stacks = devm_clk_get_optional(ptdev->base.dev, "stacks");
> > +	if (IS_ERR(ptdev->clks.stacks))
> > +		return dev_err_probe(ptdev->base.dev,
> > +				     PTR_ERR(ptdev->clks.stacks),
> > +				     "get 'stacks' clock failed");
> > +
> > +	ptdev->clks.coregroup = devm_clk_get_optional(ptdev->base.dev, "coregroup");
> > +	if (IS_ERR(ptdev->clks.coregroup))
> > +		return dev_err_probe(ptdev->base.dev,
> > +				     PTR_ERR(ptdev->clks.coregroup),
> > +				     "get 'coregroup' clock failed");
> > +
> > +	drm_info(&ptdev->base, "clock rate = %lu\n", clk_get_rate(ptdev->clks.core));
> > +	return 0;
> > +}
> > +
> > +void panthor_device_unplug(struct panthor_device *ptdev)
> > +{
> > +	/* FIXME: This is racy. */
> > +	if (drm_dev_is_unplugged(&ptdev->base))
> > +		return;
> > +
> > +	drm_WARN_ON(&ptdev->base, pm_runtime_get_sync(ptdev->base.dev) < 0);
> > +
> > +	/* Call drm_dev_unplug() so any access to HW block happening after
> > +	 * that point get rejected.
> > +	 */
> > +	drm_dev_unplug(&ptdev->base);
> > +
> > +	/* Now, try to cleanly shutdown the GPU before the device resources
> > +	 * get reclaimed.
> > +	 */
> > +	panthor_sched_unplug(ptdev);
> > +	panthor_fw_unplug(ptdev);
> > +	panthor_mmu_unplug(ptdev);
> > +	panthor_gpu_unplug(ptdev);
> > +
> > +	pm_runtime_dont_use_autosuspend(ptdev->base.dev);
> > +	pm_runtime_put_sync_suspend(ptdev->base.dev);
> > +}
> > +
> > +static void panthor_device_reset_cleanup(struct drm_device *ddev, void *data)
> > +{
> > +	struct panthor_device *ptdev = container_of(ddev, struct panthor_device, base);
> > +
> > +	cancel_work_sync(&ptdev->reset.work);
> > +	destroy_workqueue(ptdev->reset.wq);
> > +}
> > +
> > +static void panthor_device_reset_work(struct work_struct *work)
> > +{
> > +	struct panthor_device *ptdev = container_of(work, struct panthor_device, reset.work);
> > +	int ret = 0, cookie;
> > +
> > +	if (atomic_read(&ptdev->pm.state) != PANTHOR_DEVICE_PM_STATE_ACTIVE) {
> > +		/*
> > +		 * No need for a reset as the device has been (or will be)
> > +		 * powered down
> > +		 */
> > +		atomic_set(&ptdev->reset.pending, 0);
> > +		return;
> > +	}
> > +
> > +	if (!drm_dev_enter(&ptdev->base, &cookie))
> > +		return;
> > +
> > +	panthor_sched_pre_reset(ptdev);
> > +	panthor_fw_pre_reset(ptdev, true);
> > +	panthor_mmu_pre_reset(ptdev);
> > +	panthor_gpu_soft_reset(ptdev);
> > +	panthor_gpu_l2_power_on(ptdev);
> > +	panthor_mmu_post_reset(ptdev);
> > +	ret = panthor_fw_post_reset(ptdev);
> > +	if (ret)
> > +		goto out_dev_exit;
> > +
> > +	atomic_set(&ptdev->reset.pending, 0);
> > +	panthor_sched_post_reset(ptdev);
> > +
> > +	if (ret) {
> > +		panthor_device_unplug(ptdev);
> > +		drm_err(&ptdev->base, "Failed to boot MCU after reset, making device unusable.");
> > +	}
> > +
> > +out_dev_exit:
> > +	drm_dev_exit(cookie);
> > +}
> > +
> > +static bool panthor_device_is_initialized(struct panthor_device *ptdev)
> > +{
> > +	return !!ptdev->scheduler;
> > +}
> > +
> > +static void panthor_device_free_page(struct drm_device *ddev, void *data)
> > +{
> > +	free_page((unsigned long)data);
> > +}
> > +
> > +int panthor_device_init(struct panthor_device *ptdev)
> > +{
> > +	struct resource *res;
> > +	struct page *p;
> > +	int ret;
> > +
> > +	ptdev->coherent = device_get_dma_attr(ptdev->base.dev) == DEV_DMA_COHERENT;
> > +
> > +	drmm_mutex_init(&ptdev->base, &ptdev->pm.mmio_lock);
> > +	atomic_set(&ptdev->pm.state, PANTHOR_DEVICE_PM_STATE_SUSPENDED);
> > +	p = alloc_page(GFP_KERNEL | __GFP_ZERO);
> > +	if (!p)
> > +		return -ENOMEM;
> > +
> > +	ptdev->pm.dummy_latest_flush = page_address(p);
> > +	ret = drmm_add_action_or_reset(&ptdev->base, panthor_device_free_page,
> > +				       ptdev->pm.dummy_latest_flush);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/*
> > +	 * Set the dummy page holding the latest flush to 1. This will cause the
> > +	 * flush to avoided as we know it isn't necessary if the submission
> > +	 * happens while the dummy page is mapped. Zero cannot be used because
> > +	 * that means 'always flush'.
> > +	 */
> > +	*ptdev->pm.dummy_latest_flush = 1;
> > +
> > +	INIT_WORK(&ptdev->reset.work, panthor_device_reset_work);
> > +	ptdev->reset.wq = alloc_ordered_workqueue("panthor-reset-wq", 0);
> > +	if (!ptdev->reset.wq)
> > +		return -ENOMEM;
> > +
> > +	ret = drmm_add_action_or_reset(&ptdev->base, panthor_device_reset_cleanup, NULL);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = panthor_clk_init(ptdev);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = panthor_devfreq_init(ptdev);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ptdev->iomem = devm_platform_get_and_ioremap_resource(to_platform_device(ptdev->base.dev),
> > +							      0, &res);
> > +	if (IS_ERR(ptdev->iomem))
> > +		return PTR_ERR(ptdev->iomem);
> > +
> > +	ptdev->phys_addr = res->start;
> > +
> > +	ret = devm_pm_runtime_enable(ptdev->base.dev);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = pm_runtime_resume_and_get(ptdev->base.dev);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = panthor_gpu_init(ptdev);
> > +	if (ret)
> > +		goto err_rpm_put;
> > +
> > +	ret = panthor_mmu_init(ptdev);
> > +	if (ret)
> > +		goto err_rpm_put;
> > +
> > +	ret = panthor_fw_init(ptdev);
> > +	if (ret)
> > +		goto err_rpm_put;
> > +
> > +	ret = panthor_sched_init(ptdev);
> > +	if (ret)
> > +		goto err_rpm_put;
> > +
> > +	/* ~3 frames */
> > +	pm_runtime_set_autosuspend_delay(ptdev->base.dev, 50);
> > +	pm_runtime_use_autosuspend(ptdev->base.dev);
> > +	pm_runtime_put_autosuspend(ptdev->base.dev);
> > +	return 0;
> > +
> > +err_rpm_put:
> > +	pm_runtime_put_sync_suspend(ptdev->base.dev);  
> 
> I'm afraid this is not enough to cleanup the driver if firmware fails to load
> or if panthor_sched_init() fails. I was playing with another device that's
> based on RK3588 and forgot to move the firmware to the new location. I've got
> a kernel crash with an SError due to failure to load the firmware. It was in
> panthor_mmu_irq_raw_handler() which fired because the MMU was not unplugged.

Uh, right. I hate the fact we have to explicitly call the unplug()
functions in that case, but I also don't have a better solution...

> 
> My local fix was:
> 
> -- >8 --------------------------------------------------------------  
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index 834b828fc1b59..d8374f8b9333a 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -224,15 +224,15 @@ int panthor_device_init(struct panthor_device *ptdev)
>  
>  	ret = panthor_mmu_init(ptdev);
>  	if (ret)
> -		goto err_rpm_put;
> +		goto err_mmu_init;

All the label names encode the unwind operation they're taking care of,
so s/err_mmu_init/err_unplug_gpu/.

>  
>  	ret = panthor_fw_init(ptdev);
>  	if (ret)
> -		goto err_rpm_put;
> +		goto err_fw_init;
>  
>  	ret = panthor_sched_init(ptdev);
>  	if (ret)
> -		goto err_rpm_put;
> +		goto err_sched_init;
>  
>  	/* ~3 frames */
>  	pm_runtime_set_autosuspend_delay(ptdev->base.dev, 50);
> @@ -240,6 +240,12 @@ int panthor_device_init(struct panthor_device *ptdev)
>  	pm_runtime_put_autosuspend(ptdev->base.dev);
>  	return 0;
>  
> +err_sched_init:
> +	panthor_fw_unplug(ptdev);
> +err_fw_init:
> +	panthor_mmu_unplug(ptdev);
> +err_mmu_init:
> +	panthor_gpu_unplug(ptdev);
>  err_rpm_put:
>  	pm_runtime_put_sync_suspend(ptdev->base.dev);
>  	return ret;
> 
> -- >8 -------------------------------------------------------------  
> 
> I can send a fully formed patch if you agree with the fix.

I'll take care of send when preparing v4. Thanks for the offer.

  reply	other threads:[~2023-12-22 14:05 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-04 17:32 [PATCH v3 00/14] drm: Add a driver for CSF-based Mali GPUs Boris Brezillon
2023-12-04 17:32 ` [PATCH v3 01/14] drm/panthor: Add uAPI Boris Brezillon
2023-12-06 16:17   ` Steven Price
2023-12-18 13:20   ` Chris Diamand
2024-01-15 11:18     ` Boris Brezillon
2023-12-04 17:32 ` [PATCH v3 02/14] drm/panthor: Add GPU register definitions Boris Brezillon
2023-12-06 16:23   ` Steven Price
2023-12-04 17:32 ` [PATCH v3 03/14] drm/panthor: Add the device logical block Boris Brezillon
2023-12-06 16:55   ` Steven Price
2023-12-07  8:12     ` Boris Brezillon
2023-12-07  8:56       ` Boris Brezillon
2023-12-07 10:23         ` Steven Price
2023-12-07 10:49           ` Boris Brezillon
2023-12-07 11:11           ` [EXTERNAL] " Donald Robson
2023-12-22 13:26   ` Liviu Dudau
2023-12-22 14:04     ` Boris Brezillon [this message]
2023-12-04 17:32 ` [PATCH v3 04/14] drm/panthor: Add the GPU " Boris Brezillon
2023-12-07 16:05   ` Steven Price
2023-12-04 17:32 ` [PATCH v3 05/14] drm/panthor: Add GEM " Boris Brezillon
2023-12-07 16:38   ` Steven Price
2024-01-15 10:29     ` Boris Brezillon
2023-12-04 17:32 ` [PATCH v3 06/14] drm/panthor: Add the devfreq " Boris Brezillon
2023-12-05  9:42   ` Clément Péron
2023-12-04 17:33 ` [PATCH v3 07/14] drm/panthor: Add the MMU/VM " Boris Brezillon
2023-12-08 14:28   ` Steven Price
2024-01-15 11:04     ` Boris Brezillon
2024-01-15 17:31   ` Boris Brezillon
2024-01-15 17:38   ` Boris Brezillon
2024-01-15 17:41   ` Boris Brezillon
2024-01-15 18:09   ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 08/14] drm/panthor: Add the FW " Boris Brezillon
2023-12-08 15:39   ` Steven Price
2023-12-18 21:25   ` Chris Diamand
2024-01-15 11:37     ` Boris Brezillon
2024-01-22 16:34     ` Boris Brezillon
2024-01-22 21:14       ` Chris Diamand
2023-12-20 15:12   ` Liviu Dudau
2024-01-15 12:56     ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 09/14] drm/panthor: Add the heap " Boris Brezillon
2023-12-08 16:27   ` Steven Price
2024-01-15 11:15     ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 10/14] drm/panthor: Add the scheduler " Boris Brezillon
2023-12-11 16:27   ` Steven Price
2024-01-15 13:03     ` Boris Brezillon
2023-12-19 11:50   ` Ketil Johnsen
2024-01-15 13:05     ` Boris Brezillon
2023-12-20 19:59   ` Ketil Johnsen
2024-01-15 13:11     ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 11/14] drm/panthor: Add the driver frontend block Boris Brezillon
2023-12-13 11:47   ` Steven Price
2023-12-20 16:24   ` Liviu Dudau
2024-01-15 12:59     ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 12/14] drm/panthor: Allow driver compilation Boris Brezillon
2023-12-13 13:18   ` Steven Price
2023-12-04 17:33 ` [PATCH v3 13/14] dt-bindings: gpu: mali-valhall-csf: Add support for Arm Mali CSF GPUs Boris Brezillon
2023-12-04 19:29   ` Rob Herring
2023-12-05  8:46     ` Boris Brezillon
2023-12-05 20:48   ` Rob Herring
2023-12-06 10:59     ` Liviu Dudau
2024-01-22 16:37       ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 14/14] drm/panthor: Add an entry to MAINTAINERS Boris Brezillon
2023-12-13 13:51   ` Steven Price
2023-12-04 18:09 ` [PATCH v3 00/14] drm: Add a driver for CSF-based Mali GPUs Clément Péron
2023-12-05  8:04   ` Boris Brezillon
2023-12-05  8:48 ` Boris Brezillon
2023-12-06 15:47   ` Steven Price
2023-12-06 16:28     ` Boris Brezillon
2023-12-10  4:58 ` Tatsuyuki Ishi
2023-12-11  8:52   ` Boris Brezillon
2023-12-11 18:18     ` Faith Ekstrand
2024-01-15 14:18       ` Boris Brezillon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231222150456.0fca9d73@collabora.com \
    --to=boris.brezillon@collabora.com \
    --cc=Liviu.Dudau@arm.com \
    --cc=daniels@collabora.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=drinkcat@chromium.org \
    --cc=faith.ekstrand@collabora.com \
    --cc=grant.likely@linaro.org \
    --cc=hanetzer@startmail.com \
    --cc=kernel@collabora.com \
    --cc=neil.armstrong@linaro.org \
    --cc=peron.clem@gmail.com \
    --cc=robin.murphy@arm.com \
    --cc=steven.price@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).