* [bug report] drm/msm: devcoredump iommu fault support
@ 2022-05-09  6:28 Dan Carpenter
  2022-05-09 14:48 ` Rob Clark
  0 siblings, 1 reply; 3+ messages in thread
From: Dan Carpenter @ 2022-05-09  6:28 UTC (permalink / raw)
  To: robdclark; +Cc: dri-devel

Hello Rob Clark,

The patch e25e92e08e32: "drm/msm: devcoredump iommu fault support"
from Jun 10, 2021, leads to the following Smatch static checker
warning:

drivers/gpu/drm/msm/msm_gpu.c:418 recover_worker() error: dereferencing freed memory 'gpu'
drivers/gpu/drm/msm/msm_gpu.c:497 fault_worker() error: dereferencing freed memory 'gpu'

drivers/gpu/drm/msm/msm_gpu.c
    376 static void recover_worker(struct kthread_work *work)
    377 {
    378         struct msm_gpu *gpu = container_of(work, struct msm_gpu, recover_work);
    379         struct drm_device *dev = gpu->dev;
    380         struct msm_drm_private *priv = dev->dev_private;
    381         struct msm_gem_submit *submit;
    382         struct msm_ringbuffer *cur_ring = gpu->funcs->active_ring(gpu);
    383         char *comm = NULL, *cmd = NULL;
    384         int i;
    385 
    386         mutex_lock(&gpu->lock);
    387 
    388         DRM_DEV_ERROR(dev->dev, "%s: hangcheck recover!\n", gpu->name);
    389 
    390         submit = find_submit(cur_ring, cur_ring->memptrs->fence + 1);
    391         if (submit) {
    392                 /* Increment the fault counts */
    393                 submit->queue->faults++;
    394                 submit->aspace->faults++;
    395 
    396                 get_comm_cmdline(submit, &comm, &cmd);
    397 
    398                 if (comm && cmd) {
    399                         DRM_DEV_ERROR(dev->dev, "%s: offending task: %s (%s)\n",
    400                                 gpu->name, comm, cmd);
    401 
    402                         msm_rd_dump_submit(priv->hangrd, submit,
    403                                 "offending task: %s (%s)", comm, cmd);
    404                 } else {
    405                         msm_rd_dump_submit(priv->hangrd, submit, NULL);
    406                 }
    407         } else {
    408                 /*
    409                  * We couldn't attribute this fault to any particular context,
    410                  * so increment the global fault count instead.
    411                  */
    412                 gpu->global_faults++;
    413         }
    414 
    415         /* Record the crash state */
    416         pm_runtime_get_sync(&gpu->pdev->dev);
    417         msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
This function calls:

	dev_coredumpm(gpu->dev->dev, THIS_MODULE, gpu, 0, GFP_KERNEL,
                                                  ^^^
Which kfrees gpu.
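
For context, the prototype of dev_coredumpm() is roughly the following
(paraphrasing include/linux/devcoredump.h from memory, so treat it as a
sketch rather than the exact declaration):

	void dev_coredumpm(struct device *dev, struct module *owner,
			   void *data, size_t datalen, gfp_t gfp,
			   ssize_t (*read)(char *buffer, loff_t offset,
					   size_t count, void *data,
					   size_t datalen),
			   void (*free)(void *data));

The "gpu" pointer is passed as the "data" argument, which devcoredump
eventually hands to the "free" callback, which is why Smatch treats it
as freed after this call.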

--> 418         pm_runtime_put_sync(&gpu->pdev->dev);
                                     ^^^^^
The gpu wasn't supposed to be freed here, so a lot of things go wrong
from this point on.

    419 
    420         kfree(cmd);
    421         kfree(comm);
    422 
    423         /*
    424          * Update all the rings with the latest and greatest fence.. this
    425          * needs to happen after msm_rd_dump_submit() to ensure that the
    426          * bo's referenced by the offending submit are still around.
    427          */

regards,
dan carpenter


* Re: [bug report] drm/msm: devcoredump iommu fault support
  2022-05-09  6:28 [bug report] drm/msm: devcoredump iommu fault support Dan Carpenter
@ 2022-05-09 14:48 ` Rob Clark
  2022-05-12 11:00   ` Dan Carpenter
  0 siblings, 1 reply; 3+ messages in thread
From: Rob Clark @ 2022-05-09 14:48 UTC (permalink / raw)
  To: Dan Carpenter; +Cc: dri-devel

On Sun, May 8, 2022 at 11:28 PM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> Hello Rob Clark,
>
> The patch e25e92e08e32: "drm/msm: devcoredump iommu fault support"
> from Jun 10, 2021, leads to the following Smatch static checker
> warning:
>
> drivers/gpu/drm/msm/msm_gpu.c:418 recover_worker() error: dereferencing freed memory 'gpu'
> drivers/gpu/drm/msm/msm_gpu.c:497 fault_worker() error: dereferencing freed memory 'gpu'
>
> drivers/gpu/drm/msm/msm_gpu.c
>     376 static void recover_worker(struct kthread_work *work)
>     377 {
>     378         struct msm_gpu *gpu = container_of(work, struct msm_gpu, recover_work);
>     379         struct drm_device *dev = gpu->dev;
>     380         struct msm_drm_private *priv = dev->dev_private;
>     381         struct msm_gem_submit *submit;
>     382         struct msm_ringbuffer *cur_ring = gpu->funcs->active_ring(gpu);
>     383         char *comm = NULL, *cmd = NULL;
>     384         int i;
>     385
>     386         mutex_lock(&gpu->lock);
>     387
>     388         DRM_DEV_ERROR(dev->dev, "%s: hangcheck recover!\n", gpu->name);
>     389
>     390         submit = find_submit(cur_ring, cur_ring->memptrs->fence + 1);
>     391         if (submit) {
>     392                 /* Increment the fault counts */
>     393                 submit->queue->faults++;
>     394                 submit->aspace->faults++;
>     395
>     396                 get_comm_cmdline(submit, &comm, &cmd);
>     397
>     398                 if (comm && cmd) {
>     399                         DRM_DEV_ERROR(dev->dev, "%s: offending task: %s (%s)\n",
>     400                                 gpu->name, comm, cmd);
>     401
>     402                         msm_rd_dump_submit(priv->hangrd, submit,
>     403                                 "offending task: %s (%s)", comm, cmd);
>     404                 } else {
>     405                         msm_rd_dump_submit(priv->hangrd, submit, NULL);
>     406                 }
>     407         } else {
>     408                 /*
>     409                  * We couldn't attribute this fault to any particular context,
>     410                  * so increment the global fault count instead.
>     411                  */
>     412                 gpu->global_faults++;
>     413         }
>     414
>     415         /* Record the crash state */
>     416         pm_runtime_get_sync(&gpu->pdev->dev);
>     417         msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
>                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> This function calls:
>
>         dev_coredumpm(gpu->dev->dev, THIS_MODULE, gpu, 0, GFP_KERNEL,
>                                                   ^^^
> Which kfrees gpu.

How does the gpu object get kfree'd?  That is the root problem; it
shouldn't be freed until module unload.  I don't think e25e92e08e32:
"drm/msm: devcoredump iommu fault support" is actually related.

Is there a way to reproduce this?

BR,
-R

> --> 418         pm_runtime_put_sync(&gpu->pdev->dev);
>                                      ^^^^^
> The gpu wasn't supposed to be freed here, so a lot of things go wrong
> from this point on.
>
>     419
>     420         kfree(cmd);
>     421         kfree(comm);
>     422
>     423         /*
>     424          * Update all the rings with the latest and greatest fence.. this
>     425          * needs to happen after msm_rd_dump_submit() to ensure that the
>     426          * bo's referenced by the offending submit are still around.
>     427          */
>
> regards,
> dan carpenter


* Re: [bug report] drm/msm: devcoredump iommu fault support
  2022-05-09 14:48 ` Rob Clark
@ 2022-05-12 11:00   ` Dan Carpenter
  0 siblings, 0 replies; 3+ messages in thread
From: Dan Carpenter @ 2022-05-12 11:00 UTC (permalink / raw)
  To: Rob Clark; +Cc: dri-devel

On Mon, May 09, 2022 at 07:48:23AM -0700, Rob Clark wrote:
> On Sun, May 8, 2022 at 11:28 PM Dan Carpenter <dan.carpenter@oracle.com> wrote:
> >     407         } else {
> >     408                 /*
> >     409                  * We couldn't attribute this fault to any particular context,
> >     410                  * so increment the global fault count instead.
> >     411                  */
> >     412                 gpu->global_faults++;
> >     413         }
> >     414
> >     415         /* Record the crash state */
> >     416         pm_runtime_get_sync(&gpu->pdev->dev);
> >     417         msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
> >                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > This function calls:
> >
> >         dev_coredumpm(gpu->dev->dev, THIS_MODULE, gpu, 0, GFP_KERNEL,
> >                                                   ^^^
> > Which kfrees gpu.
> 
> How does the gpu object get kfree'd?  That is the root problem, it
> shouldn't be freed until module unload.  I don't think e25e92e08e32:
> "drm/msm: devcoredump iommu fault support" is actually related.
> 
> Is there a way to reproduce this?

Ah.  Thanks for your feedback.  I saw free(data) and misread it as
kfree(data).  It's actually a function pointer, msm_gpu_devcoredump_free(),
so it doesn't free "gpu".
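
For the record, that callback looks roughly like this (quoting
drivers/gpu/drm/msm/msm_gpu.c from memory, so the details may be
slightly off):

	static void msm_gpu_devcoredump_free(void *data)
	{
		struct msm_gpu *gpu = data;

		/* drops the reference on the captured crash state,
		 * not on the gpu object itself
		 */
		msm_gpu_crashstate_put(gpu);
	}

so the "data" pointer survives the devcoredump; only the crash state
gets released.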

My bad.

regards,
dan carpenter


