From: Chris Diamand <chris.diamand@foss.arm.com>
To: Boris Brezillon <boris.brezillon@collabora.com>,
dri-devel@lists.freedesktop.org
Cc: "Neil Armstrong" <neil.armstrong@linaro.org>,
"Nicolas Boichat" <drinkcat@chromium.org>,
"Daniel Stone" <daniels@collabora.com>,
"Liviu Dudau" <Liviu.Dudau@arm.com>,
"Steven Price" <steven.price@arm.com>,
"Clément Péron" <peron.clem@gmail.com>,
kernel@collabora.com, "Robin Murphy" <robin.murphy@arm.com>,
"Faith Ekstrand" <faith.ekstrand@collabora.com>,
"Marty E . Plummer" <hanetzer@startmail.com>
Subject: Re: [PATCH v3 08/14] drm/panthor: Add the FW logical block
Date: Mon, 18 Dec 2023 21:25:16 +0000 [thread overview]
Message-ID: <62049a0e-1441-47a5-8369-8f05028d76ec@foss.arm.com> (raw)
In-Reply-To: <20231204173313.2098733-9-boris.brezillon@collabora.com>
On 04/12/2023 17:33, Boris Brezillon wrote:
> Contains everything that's FW related, that includes the code dealing
> with the microcontroller unit (MCU) that's running the FW, and anything
> related to allocating memory shared between the FW and the CPU.
>
> A few global FW events are processed in the IRQ handler, the rest is
> forwarded to the scheduler, since scheduling is the primary reason for
> the FW existence, and also the main source of FW <-> kernel
> interactions.
>
> v3:
> - Make the FW path more future-proof (Liviu)
> - Use one waitqueue for all FW events
> - Simplify propagation of FW events to the scheduler logic
> - Drop the panthor_fw_mem abstraction and use panthor_kernel_bo instead
> - Account for the panthor_vm changes
> - Replace magic number with 0x7fffffff with ~0 to better signify that
> it's the maximum permitted value.
> - More accurate rounding when computing the firmware timeout.
> - Add a 'sub iterator' helper function. This also adds a check that a
> firmware entry doesn't overflow the firmware image.
> - Drop __packed from FW structures, natural alignment is good enough.
> - Other minor code improvements.
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> Signed-off-by: Steven Price <steven.price@arm.com>
> ---
> drivers/gpu/drm/panthor/panthor_fw.c | 1332 ++++++++++++++++++++++++++
> drivers/gpu/drm/panthor/panthor_fw.h | 504 ++++++++++
> 2 files changed, 1836 insertions(+)
> create mode 100644 drivers/gpu/drm/panthor/panthor_fw.c
> create mode 100644 drivers/gpu/drm/panthor/panthor_fw.h
>
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> new file mode 100644
> index 000000000000..85afe769f567
> --- /dev/null
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
...
> +static int panthor_fw_load_section_entry(struct panthor_device *ptdev,
> + const struct firmware *fw,
> + struct panthor_fw_binary_iter *iter,
> + u32 ehdr)
> +{
> + struct panthor_fw_binary_section_entry_hdr hdr;
> + struct panthor_fw_section *section;
> + u32 section_size;
> + u32 name_len;
> + int ret;
> +
> + ret = panthor_fw_binary_iter_read(ptdev, iter, &hdr, sizeof(hdr));
> + if (ret)
> + return ret;
> +
> + if (hdr.data.end < hdr.data.start) {
> + drm_err(&ptdev->base, "Firmware corrupted, data.end < data.start (0x%x < 0x%x)\n",
> + hdr.data.end, hdr.data.start);
> + return -EINVAL;
> + }
> +
> + if (hdr.va.end < hdr.va.start) {
> + drm_err(&ptdev->base, "Firmware corrupted, hdr.va.end < hdr.va.start (0x%x < 0x%x)\n",
> + hdr.va.end, hdr.va.start);
> + return -EINVAL;
> + }
> +
> + if (hdr.data.end > fw->size) {
> + drm_err(&ptdev->base, "Firmware corrupted, file truncated? data_end=0x%x > fw size=0x%zx\n",
> + hdr.data.end, fw->size);
> + return -EINVAL;
> + }
> +
> + if ((hdr.va.start & ~PAGE_MASK) != 0 ||
> + (hdr.va.end & ~PAGE_MASK) != 0) {
> + drm_err(&ptdev->base, "Firmware corrupted, virtual addresses not page aligned: 0x%x-0x%x\n",
> + hdr.va.start, hdr.va.end);
> + return -EINVAL;
> + }
> +
> + if (hdr.flags & ~CSF_FW_BINARY_IFACE_ENTRY_RD_SUPPORTED_FLAGS) {
> + drm_err(&ptdev->base, "Firmware contains interface with unsupported flags (0x%x)\n",
> + hdr.flags);
> + return -EINVAL;
> + }
> +
> + if (hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_RD_PROT) {
> + drm_warn(&ptdev->base,
> + "Firmware protected mode entry not be supported, ignoring");
> + return 0;
> + }
> +
> + if (hdr.va.start == CSF_MCU_SHARED_REGION_START &&
> + !(hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_RD_SHARED)) {
> + drm_err(&ptdev->base,
> + "Interface at 0x%llx must be shared", CSF_MCU_SHARED_REGION_START);
> + return -EINVAL;
> + }
> +
> + name_len = iter->size - iter->offset;
> +
> + section = drmm_kzalloc(&ptdev->base, sizeof(*section), GFP_KERNEL);
> + if (!section)
> + return -ENOMEM;
> +
> + list_add_tail(§ion->node, &ptdev->fw->sections);
> + section->flags = hdr.flags;
> + section->data.size = hdr.data.end - hdr.data.start;
> +
> + if (section->data.size > 0) {
> + void *data = drmm_kmalloc(&ptdev->base, section->data.size, GFP_KERNEL);
> +
> + if (!data)
> + return -ENOMEM;
> +
> + memcpy(data, fw->data + hdr.data.start, section->data.size);
> + section->data.buf = data;
> + }
> +
> + if (name_len > 0) {
> + char *name = drmm_kmalloc(&ptdev->base, name_len + 1, GFP_KERNEL);
> +
> + if (!name)
> + return -ENOMEM;
> +
> + memcpy(name, iter->data + iter->offset, name_len);
> + name[name_len] = '\0';
> + section->name = name;
> + }
> +
> + section_size = hdr.va.end - hdr.va.start;
> + if (section_size) {
> + u32 cache_mode = hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_RD_CACHE_MODE_MASK;
> + struct panthor_gem_object *bo;
> + u32 vm_map_flags = 0;
> + struct sg_table *sgt;
> + u64 va = hdr.va.start;
> +
> + if (!(hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_RD_WR))
> + vm_map_flags |= DRM_PANTHOR_VM_BIND_OP_MAP_READONLY;
> +
> + if (!(hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_RD_EX))
> + vm_map_flags |= DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC;
> +
> + /* TODO: CSF_FW_BINARY_IFACE_ENTRY_RD_CACHE_MODE_*_COHERENT are mapped to
> + * non-cacheable for now. We might want to introduce a new
> + * IOMMU_xxx flag (or abuse IOMMU_MMIO, which maps to device
> + * memory and is currently not used by our driver) for
> + * AS_MEMATTR_AARCH64_SHARED memory, so we can take benefit
> + * of IO-coherent systems.
> + */
> + if (cache_mode != CSF_FW_BINARY_IFACE_ENTRY_RD_CACHE_MODE_CACHED)
> + vm_map_flags |= DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED;
> +
> + section->mem = panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev),
> + section_size,
> + DRM_PANTHOR_BO_NO_MMAP,
> + vm_map_flags, va);
> + if (IS_ERR(section->mem))
> + return PTR_ERR(section->mem);
There's a small bug here - if panthor_kernel_bo_create() fails, section->mem will be
left containing an error value, not NULL or a valid pointer.
This means that on rmmod, panthor_fw_unplug() will try and free the BO even though it
was never allocated, causing a kernel panic.
The solution is obviously to only assign section->mem if the allocation actually succeeded.
However I also wonder if the list_add_tail() call should be moved to later in the function,
after we know everything else has succeeded, or if this function needs some more error
handling in general.
> +
> + if (drm_WARN_ON(&ptdev->base, section->mem->va_node.start != hdr.va.start))
> + return -EINVAL;
> +
> + if (section->flags & CSF_FW_BINARY_IFACE_ENTRY_RD_SHARED) {
> + ret = panthor_kernel_bo_vmap(section->mem);
> + if (ret)
> + return ret;
> + }
> +
> + panthor_fw_init_section_mem(ptdev, section);
> +
> + bo = to_panthor_bo(section->mem->obj);
> + sgt = drm_gem_shmem_get_pages_sgt(&bo->base);
> + if (IS_ERR(sgt))
> + return PTR_ERR(section->mem);
> +
> + dma_sync_sgtable_for_device(ptdev->base.dev, sgt, DMA_TO_DEVICE);
> + }
> +
> + if (hdr.va.start == CSF_MCU_SHARED_REGION_START)
> + ptdev->fw->shared_section = section;
> +
> + return 0;
> +}
> +
...
> +/**
> + * panthor_fw_unplug() - Called when the device is unplugged.
> + * @ptdev: Device.
> + *
> + * This function must make sure all pending operations are flushed before
> + * will release device resources, thus preventing any interaction with
> + * the HW.
> + *
> + * If there is still FW-related work running after this function returns,
> + * they must use drm_dev_{enter,exit}() and skip any HW access when
> + * drm_dev_enter() returns false.
> + */
> +void panthor_fw_unplug(struct panthor_device *ptdev)
> +{
> + struct panthor_fw_section *section;
> +
> + cancel_delayed_work_sync(&ptdev->fw->watchdog.ping_work);
> +
> + /* Make sure the IRQ handler can be called after that point. */
> + if (ptdev->fw->irq.irq)
> + panthor_job_irq_suspend(&ptdev->fw->irq);
> +
> + panthor_fw_stop(ptdev);
> +
> + if (ptdev->fw->vm)
> + panthor_vm_idle(ptdev->fw->vm);
> +
> + list_for_each_entry(section, &ptdev->fw->sections, node)
> + panthor_kernel_bo_destroy(panthor_fw_vm(ptdev), section->mem);
Here's where we destroy the potentially invalid section->mem.
Note that the issues are twofold:
Firstly, section->mem could be an error pointer as mentioned earlier.
Secondly, panthor_kernel_bo_destroy() doesn't actually check for error values or even NULL.
It would probably be worth adding NULL checks to panthor_kernel_bo_destroy() and panthor_kernel_bo_vunmap() to protect against this.
> +
> + panthor_vm_put(ptdev->fw->vm);
> +
> + panthor_gpu_power_off(ptdev, L2, ptdev->gpu_info.l2_present, 20000);
> +}
> +
Cheers,
Chris
next prev parent reply other threads:[~2023-12-18 21:25 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-04 17:32 [PATCH v3 00/14] drm: Add a driver for CSF-based Mali GPUs Boris Brezillon
2023-12-04 17:32 ` [PATCH v3 01/14] drm/panthor: Add uAPI Boris Brezillon
2023-12-06 16:17 ` Steven Price
2023-12-18 13:20 ` Chris Diamand
2024-01-15 11:18 ` Boris Brezillon
2023-12-04 17:32 ` [PATCH v3 02/14] drm/panthor: Add GPU register definitions Boris Brezillon
2023-12-06 16:23 ` Steven Price
2023-12-04 17:32 ` [PATCH v3 03/14] drm/panthor: Add the device logical block Boris Brezillon
2023-12-06 16:55 ` Steven Price
2023-12-07 8:12 ` Boris Brezillon
2023-12-07 8:56 ` Boris Brezillon
2023-12-07 10:23 ` Steven Price
2023-12-07 10:49 ` Boris Brezillon
2023-12-07 11:11 ` [EXTERNAL] " Donald Robson
2023-12-22 13:26 ` Liviu Dudau
2023-12-22 14:04 ` Boris Brezillon
2023-12-04 17:32 ` [PATCH v3 04/14] drm/panthor: Add the GPU " Boris Brezillon
2023-12-07 16:05 ` Steven Price
2023-12-04 17:32 ` [PATCH v3 05/14] drm/panthor: Add GEM " Boris Brezillon
2023-12-07 16:38 ` Steven Price
2024-01-15 10:29 ` Boris Brezillon
2023-12-04 17:32 ` [PATCH v3 06/14] drm/panthor: Add the devfreq " Boris Brezillon
2023-12-05 9:42 ` Clément Péron
2023-12-04 17:33 ` [PATCH v3 07/14] drm/panthor: Add the MMU/VM " Boris Brezillon
2023-12-08 14:28 ` Steven Price
2024-01-15 11:04 ` Boris Brezillon
2024-01-15 17:31 ` Boris Brezillon
2024-01-15 17:38 ` Boris Brezillon
2024-01-15 17:41 ` Boris Brezillon
2024-01-15 18:09 ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 08/14] drm/panthor: Add the FW " Boris Brezillon
2023-12-08 15:39 ` Steven Price
2023-12-18 21:25 ` Chris Diamand [this message]
2024-01-15 11:37 ` Boris Brezillon
2024-01-22 16:34 ` Boris Brezillon
2024-01-22 21:14 ` Chris Diamand
2023-12-20 15:12 ` Liviu Dudau
2024-01-15 12:56 ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 09/14] drm/panthor: Add the heap " Boris Brezillon
2023-12-08 16:27 ` Steven Price
2024-01-15 11:15 ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 10/14] drm/panthor: Add the scheduler " Boris Brezillon
2023-12-11 16:27 ` Steven Price
2024-01-15 13:03 ` Boris Brezillon
2023-12-19 11:50 ` Ketil Johnsen
2024-01-15 13:05 ` Boris Brezillon
2023-12-20 19:59 ` Ketil Johnsen
2024-01-15 13:11 ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 11/14] drm/panthor: Add the driver frontend block Boris Brezillon
2023-12-13 11:47 ` Steven Price
2023-12-20 16:24 ` Liviu Dudau
2024-01-15 12:59 ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 12/14] drm/panthor: Allow driver compilation Boris Brezillon
2023-12-13 13:18 ` Steven Price
2023-12-04 17:33 ` [PATCH v3 13/14] dt-bindings: gpu: mali-valhall-csf: Add support for Arm Mali CSF GPUs Boris Brezillon
2023-12-04 19:29 ` Rob Herring
2023-12-05 8:46 ` Boris Brezillon
2023-12-05 20:48 ` Rob Herring
2023-12-06 10:59 ` Liviu Dudau
2024-01-22 16:37 ` Boris Brezillon
2023-12-04 17:33 ` [PATCH v3 14/14] drm/panthor: Add an entry to MAINTAINERS Boris Brezillon
2023-12-13 13:51 ` Steven Price
2023-12-04 18:09 ` [PATCH v3 00/14] drm: Add a driver for CSF-based Mali GPUs Clément Péron
2023-12-05 8:04 ` Boris Brezillon
2023-12-05 8:48 ` Boris Brezillon
2023-12-06 15:47 ` Steven Price
2023-12-06 16:28 ` Boris Brezillon
2023-12-10 4:58 ` Tatsuyuki Ishi
2023-12-11 8:52 ` Boris Brezillon
2023-12-11 18:18 ` Faith Ekstrand
2024-01-15 14:18 ` Boris Brezillon
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=62049a0e-1441-47a5-8369-8f05028d76ec@foss.arm.com \
--to=chris.diamand@foss.arm.com \
--cc=Liviu.Dudau@arm.com \
--cc=boris.brezillon@collabora.com \
--cc=daniels@collabora.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=drinkcat@chromium.org \
--cc=faith.ekstrand@collabora.com \
--cc=hanetzer@startmail.com \
--cc=kernel@collabora.com \
--cc=neil.armstrong@linaro.org \
--cc=peron.clem@gmail.com \
--cc=robin.murphy@arm.com \
--cc=steven.price@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).