* [bug report] drm/msm: devcoredump iommu fault support
@ 2022-05-09 6:28 Dan Carpenter
2022-05-09 14:48 ` Rob Clark
0 siblings, 1 reply; 3+ messages in thread
From: Dan Carpenter @ 2022-05-09 6:28 UTC (permalink / raw)
To: robdclark; +Cc: dri-devel
Hello Rob Clark,
The patch e25e92e08e32: "drm/msm: devcoredump iommu fault support"
from Jun 10, 2021, leads to the following Smatch static checker
warning:
drivers/gpu/drm/msm/msm_gpu.c:418 recover_worker() error: dereferencing freed memory 'gpu'
drivers/gpu/drm/msm/msm_gpu.c:497 fault_worker() error: dereferencing freed memory 'gpu'
drivers/gpu/drm/msm/msm_gpu.c
376 static void recover_worker(struct kthread_work *work)
377 {
378 struct msm_gpu *gpu = container_of(work, struct msm_gpu, recover_work);
379 struct drm_device *dev = gpu->dev;
380 struct msm_drm_private *priv = dev->dev_private;
381 struct msm_gem_submit *submit;
382 struct msm_ringbuffer *cur_ring = gpu->funcs->active_ring(gpu);
383 char *comm = NULL, *cmd = NULL;
384 int i;
385
386 mutex_lock(&gpu->lock);
387
388 DRM_DEV_ERROR(dev->dev, "%s: hangcheck recover!\n", gpu->name);
389
390 submit = find_submit(cur_ring, cur_ring->memptrs->fence + 1);
391 if (submit) {
392 /* Increment the fault counts */
393 submit->queue->faults++;
394 submit->aspace->faults++;
395
396 get_comm_cmdline(submit, &comm, &cmd);
397
398 if (comm && cmd) {
399 DRM_DEV_ERROR(dev->dev, "%s: offending task: %s (%s)\n",
400 gpu->name, comm, cmd);
401
402 msm_rd_dump_submit(priv->hangrd, submit,
403 "offending task: %s (%s)", comm, cmd);
404 } else {
405 msm_rd_dump_submit(priv->hangrd, submit, NULL);
406 }
407 } else {
408 /*
409 * We couldn't attribute this fault to any particular context,
410 * so increment the global fault count instead.
411 */
412 gpu->global_faults++;
413 }
414
415 /* Record the crash state */
416 pm_runtime_get_sync(&gpu->pdev->dev);
417 msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
^^^^^^^^^^^^^^^^^^^^^^^^^^^
This function calls:
dev_coredumpm(gpu->dev->dev, THIS_MODULE, gpu, 0, GFP_KERNEL,
^^^
Which kfrees gpu.
--> 418 pm_runtime_put_sync(&gpu->pdev->dev);
^^^^^
The gpu wasn't supposed to be free so a lot of things go wrong from
this point.
419
420 kfree(cmd);
421 kfree(comm);
422
423 /*
424 * Update all the rings with the latest and greatest fence.. this
425 * needs to happen after msm_rd_dump_submit() to ensure that the
426 * bo's referenced by the offending submit are still around.
427 */
regards,
dan carpenter
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [bug report] drm/msm: devcoredump iommu fault support
2022-05-09 6:28 [bug report] drm/msm: devcoredump iommu fault support Dan Carpenter
@ 2022-05-09 14:48 ` Rob Clark
2022-05-12 11:00 ` Dan Carpenter
0 siblings, 1 reply; 3+ messages in thread
From: Rob Clark @ 2022-05-09 14:48 UTC (permalink / raw)
To: Dan Carpenter; +Cc: dri-devel
On Sun, May 8, 2022 at 11:28 PM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> Hello Rob Clark,
>
> The patch e25e92e08e32: "drm/msm: devcoredump iommu fault support"
> from Jun 10, 2021, leads to the following Smatch static checker
> warning:
>
> drivers/gpu/drm/msm/msm_gpu.c:418 recover_worker() error: dereferencing freed memory 'gpu'
> drivers/gpu/drm/msm/msm_gpu.c:497 fault_worker() error: dereferencing freed memory 'gpu'
>
> drivers/gpu/drm/msm/msm_gpu.c
> 376 static void recover_worker(struct kthread_work *work)
> 377 {
> 378 struct msm_gpu *gpu = container_of(work, struct msm_gpu, recover_work);
> 379 struct drm_device *dev = gpu->dev;
> 380 struct msm_drm_private *priv = dev->dev_private;
> 381 struct msm_gem_submit *submit;
> 382 struct msm_ringbuffer *cur_ring = gpu->funcs->active_ring(gpu);
> 383 char *comm = NULL, *cmd = NULL;
> 384 int i;
> 385
> 386 mutex_lock(&gpu->lock);
> 387
> 388 DRM_DEV_ERROR(dev->dev, "%s: hangcheck recover!\n", gpu->name);
> 389
> 390 submit = find_submit(cur_ring, cur_ring->memptrs->fence + 1);
> 391 if (submit) {
> 392 /* Increment the fault counts */
> 393 submit->queue->faults++;
> 394 submit->aspace->faults++;
> 395
> 396 get_comm_cmdline(submit, &comm, &cmd);
> 397
> 398 if (comm && cmd) {
> 399 DRM_DEV_ERROR(dev->dev, "%s: offending task: %s (%s)\n",
> 400 gpu->name, comm, cmd);
> 401
> 402 msm_rd_dump_submit(priv->hangrd, submit,
> 403 "offending task: %s (%s)", comm, cmd);
> 404 } else {
> 405 msm_rd_dump_submit(priv->hangrd, submit, NULL);
> 406 }
> 407 } else {
> 408 /*
> 409 * We couldn't attribute this fault to any particular context,
> 410 * so increment the global fault count instead.
> 411 */
> 412 gpu->global_faults++;
> 413 }
> 414
> 415 /* Record the crash state */
> 416 pm_runtime_get_sync(&gpu->pdev->dev);
> 417 msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> This function calls:
>
> dev_coredumpm(gpu->dev->dev, THIS_MODULE, gpu, 0, GFP_KERNEL,
> ^^^
> Which kfrees gpu.
How does the gpu object get kfree'd? That is the root problem, it
shouldn't be freed until module unload. I don't think e25e92e08e32:
"drm/msm: devcoredump iommu fault support" is actually related.
Is there a way to reproduce this?
BR,
-R
> --> 418 pm_runtime_put_sync(&gpu->pdev->dev);
> ^^^^^
> The gpu wasn't supposed to be free so a lot of things go wrong from
> this point.
>
> 419
> 420 kfree(cmd);
> 421 kfree(comm);
> 422
> 423 /*
> 424 * Update all the rings with the latest and greatest fence.. this
> 425 * needs to happen after msm_rd_dump_submit() to ensure that the
> 426 * bo's referenced by the offending submit are still around.
> 427 */
>
> regards,
> dan carpenter
^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: [bug report] drm/msm: devcoredump iommu fault support
2022-05-09 14:48 ` Rob Clark
@ 2022-05-12 11:00 ` Dan Carpenter
0 siblings, 0 replies; 3+ messages in thread
From: Dan Carpenter @ 2022-05-12 11:00 UTC (permalink / raw)
To: Rob Clark; +Cc: dri-devel
On Mon, May 09, 2022 at 07:48:23AM -0700, Rob Clark wrote:
> On Sun, May 8, 2022 at 11:28 PM Dan Carpenter <dan.carpenter@oracle.com> wrote:
> > 407 } else {
> > 408 /*
> > 409 * We couldn't attribute this fault to any particular context,
> > 410 * so increment the global fault count instead.
> > 411 */
> > 412 gpu->global_faults++;
> > 413 }
> > 414
> > 415 /* Record the crash state */
> > 416 pm_runtime_get_sync(&gpu->pdev->dev);
> > 417 msm_gpu_crashstate_capture(gpu, submit, comm, cmd);
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > This function calls:
> >
> > dev_coredumpm(gpu->dev->dev, THIS_MODULE, gpu, 0, GFP_KERNEL,
> > ^^^
> > Which kfrees gpu.
>
> How does the gpu object get kfree'd? That is the root problem, it
> shouldn't be freed until module unload. I don't think e25e92e08e32:
> "drm/msm: devcoredump iommu fault support" is actually related.
>
> Is there a way to reproduce this?
Ah. Thanks for your feedback. I saw free(data) and misread it as
kfree(data). It's actually a function pointer which is
msm_gpu_devcoredump_free() so it doesn't free "gpu".
My bad.
regards,
dan carpenter
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2022-05-12 11:01 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-09 6:28 [bug report] drm/msm: devcoredump iommu fault support Dan Carpenter
2022-05-09 14:48 ` Rob Clark
2022-05-12 11:00 ` Dan Carpenter
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).