All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jordan Crouse <jcrouse@codeaurora.org>
To: Sean Paul <sean@poorly.run>
Cc: dri-devel@lists.freedesktop.org, freedreno@lists.freedesktop.org,
	Sean Paul <seanpaul@chromium.org>,
	Rob Clark <robdclark@gmail.com>,
	linux-arm-msm@vger.kernel.org
Subject: Re: [PATCH v2 4/6] drm/msm/a6xx: Remove devm calls from gmu driver
Date: Thu, 23 May 2019 14:51:51 -0600	[thread overview]
Message-ID: <20190523205151.GE18360@jcrouse1-lnx.qualcomm.com> (raw)
In-Reply-To: <20190523171653.138678-4-sean@poorly.run>

On Thu, May 23, 2019 at 01:16:43PM -0400, Sean Paul wrote:
> From: Sean Paul <seanpaul@chromium.org>
> 
> The gmu driver is initialized and cleaned up with calls from the gpu driver. As
> such, the platform device stays valid after a6xx_gmu_remove is called and the
> device managed resources are not freed. In the case of gpu probe failures or
> unbind, these resources will remain managed.
> 
> If the gpu bind is run again (eg: if there's a probe defer somewhere in msm),
> these resources will be initialized again for the same device, creating multiple
> references. In the case of irqs, this causes failures since the irqs are
> not shared (nor should they be).
> 
> This patch removes all devm_* calls and manually cleans things up in
> gmu_remove.
> 
> Changes in v2:
> - Add iounmap and free_irq to gmu_probe error paths
> 
> Cc: Jordan Crouse <jcrouse@codeaurora.org>
> Signed-off-by: Sean Paul <seanpaul@chromium.org>

As we discussed in IRC, I feel like the way we are probing and dealing with
the GMU device is messing up the reference counting. I had hoped that a
put_device() would do the trick but testing showed it didn't so there is clearly
remaining fail that we should eventually find and fix.

That said; there is really no reason to be using managed resources for this
device so this is an entirely appropriate patch.

Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>

> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 33 ++++++++++++++++++---------
>  1 file changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 7465423e9b71..f7240c9e11fb 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -505,9 +505,9 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
>  
>  err:
>  	if (!IS_ERR_OR_NULL(pdcptr))
> -		devm_iounmap(gmu->dev, pdcptr);
> +		iounmap(pdcptr);
>  	if (!IS_ERR_OR_NULL(seqptr))
> -		devm_iounmap(gmu->dev, seqptr);
> +		iounmap(seqptr);
>  }
>  
>  /*
> @@ -1197,7 +1197,7 @@ static void __iomem *a6xx_gmu_get_mmio(struct platform_device *pdev,
>  		return ERR_PTR(-EINVAL);
>  	}
>  
> -	ret = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> +	ret = ioremap(res->start, resource_size(res));
>  	if (!ret) {
>  		DRM_DEV_ERROR(&pdev->dev, "Unable to map the %s registers\n", name);
>  		return ERR_PTR(-EINVAL);
> @@ -1213,10 +1213,10 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, struct platform_device *pdev,
>  
>  	irq = platform_get_irq_byname(pdev, name);
>  
> -	ret = devm_request_irq(&pdev->dev, irq, handler, IRQF_TRIGGER_HIGH,
> -		name, gmu);
> +	ret = request_irq(irq, handler, IRQF_TRIGGER_HIGH, name, gmu);
>  	if (ret) {
> -		DRM_DEV_ERROR(&pdev->dev, "Unable to get interrupt %s\n", name);
> +		DRM_DEV_ERROR(&pdev->dev, "Unable to get interrupt %s %d\n",
> +			      name, ret);
>  		return ret;
>  	}
>  
> @@ -1241,12 +1241,18 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>  		dev_pm_domain_detach(gmu->gxpd, false);
>  	}
>  
> +	iounmap(gmu->mmio);
> +	gmu->mmio = NULL;
> +
>  	a6xx_gmu_memory_free(gmu, gmu->hfi);
>  
>  	iommu_detach_device(gmu->domain, gmu->dev);
>  
>  	iommu_domain_free(gmu->domain);
>  
> +	free_irq(gmu->gmu_irq, gmu);
> +	free_irq(gmu->hfi_irq, gmu);
> +
>  	gmu->initialized = false;
>  }
>  
> @@ -1281,24 +1287,24 @@ int a6xx_gmu_probe(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>  	/* Allocate memory for for the HFI queues */
>  	gmu->hfi = a6xx_gmu_memory_alloc(gmu, SZ_16K);
>  	if (IS_ERR(gmu->hfi))
> -		goto err;
> +		goto err_memory;
>  
>  	/* Allocate memory for the GMU debug region */
>  	gmu->debug = a6xx_gmu_memory_alloc(gmu, SZ_16K);
>  	if (IS_ERR(gmu->debug))
> -		goto err;
> +		goto err_memory;
>  
>  	/* Map the GMU registers */
>  	gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
>  	if (IS_ERR(gmu->mmio))
> -		goto err;
> +		goto err_memory;
>  
>  	/* Get the HFI and GMU interrupts */
>  	gmu->hfi_irq = a6xx_gmu_get_irq(gmu, pdev, "hfi", a6xx_hfi_irq);
>  	gmu->gmu_irq = a6xx_gmu_get_irq(gmu, pdev, "gmu", a6xx_gmu_irq);
>  
>  	if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0)
> -		goto err;
> +		goto err_mmio;
>  
>  	/*
>  	 * Get a link to the GX power domain to reset the GPU in case of GMU
> @@ -1315,7 +1321,12 @@ int a6xx_gmu_probe(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>  	gmu->initialized = true;
>  
>  	return 0;
> -err:
> +
> +err_mmio:
> +	iounmap(gmu->mmio);
> +	free_irq(gmu->gmu_irq, gmu);
> +	free_irq(gmu->hfi_irq, gmu);
> +err_memory:
>  	a6xx_gmu_memory_free(gmu, gmu->hfi);
>  
>  	if (gmu->domain) {
> -- 
> Sean Paul, Software Engineer, Google / Chromium OS
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

WARNING: multiple messages have this Message-ID (diff)
From: Jordan Crouse <jcrouse-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
To: Sean Paul <sean-p7yTbzM4H96eqtR555YLDQ@public.gmane.org>
Cc: Sean Paul <seanpaul-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>,
	Rob Clark <robdclark-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	freedreno-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
	linux-arm-msm-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH v2 4/6] drm/msm/a6xx: Remove devm calls from gmu driver
Date: Thu, 23 May 2019 14:51:51 -0600	[thread overview]
Message-ID: <20190523205151.GE18360@jcrouse1-lnx.qualcomm.com> (raw)
In-Reply-To: <20190523171653.138678-4-sean-p7yTbzM4H96eqtR555YLDQ@public.gmane.org>

On Thu, May 23, 2019 at 01:16:43PM -0400, Sean Paul wrote:
> From: Sean Paul <seanpaul@chromium.org>
> 
> The gmu driver is initialized and cleaned up with calls from the gpu driver. As
> such, the platform device stays valid after a6xx_gmu_remove is called and the
> device managed resources are not freed. In the case of gpu probe failures or
> unbind, these resources will remain managed.
> 
> If the gpu bind is run again (eg: if there's a probe defer somewhere in msm),
> these resources will be initialized again for the same device, creating multiple
> references. In the case of irqs, this causes failures since the irqs are
> not shared (nor should they be).
> 
> This patch removes all devm_* calls and manually cleans things up in
> gmu_remove.
> 
> Changes in v2:
> - Add iounmap and free_irq to gmu_probe error paths
> 
> Cc: Jordan Crouse <jcrouse@codeaurora.org>
> Signed-off-by: Sean Paul <seanpaul@chromium.org>

As we discussed in IRC, I feel like the way we are probing and dealing with
the GMU device is messing up the reference counting. I had hoped that a
put_device() would do the trick but testing showed it didn't so there is clearly
remaining fail that we should eventually find and fix.

That said; there is really no reason to be using managed resources for this
device so this is an entirely appropriate patch.

Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>

> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 33 ++++++++++++++++++---------
>  1 file changed, 22 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> index 7465423e9b71..f7240c9e11fb 100644
> --- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> +++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
> @@ -505,9 +505,9 @@ static void a6xx_gmu_rpmh_init(struct a6xx_gmu *gmu)
>  
>  err:
>  	if (!IS_ERR_OR_NULL(pdcptr))
> -		devm_iounmap(gmu->dev, pdcptr);
> +		iounmap(pdcptr);
>  	if (!IS_ERR_OR_NULL(seqptr))
> -		devm_iounmap(gmu->dev, seqptr);
> +		iounmap(seqptr);
>  }
>  
>  /*
> @@ -1197,7 +1197,7 @@ static void __iomem *a6xx_gmu_get_mmio(struct platform_device *pdev,
>  		return ERR_PTR(-EINVAL);
>  	}
>  
> -	ret = devm_ioremap(&pdev->dev, res->start, resource_size(res));
> +	ret = ioremap(res->start, resource_size(res));
>  	if (!ret) {
>  		DRM_DEV_ERROR(&pdev->dev, "Unable to map the %s registers\n", name);
>  		return ERR_PTR(-EINVAL);
> @@ -1213,10 +1213,10 @@ static int a6xx_gmu_get_irq(struct a6xx_gmu *gmu, struct platform_device *pdev,
>  
>  	irq = platform_get_irq_byname(pdev, name);
>  
> -	ret = devm_request_irq(&pdev->dev, irq, handler, IRQF_TRIGGER_HIGH,
> -		name, gmu);
> +	ret = request_irq(irq, handler, IRQF_TRIGGER_HIGH, name, gmu);
>  	if (ret) {
> -		DRM_DEV_ERROR(&pdev->dev, "Unable to get interrupt %s\n", name);
> +		DRM_DEV_ERROR(&pdev->dev, "Unable to get interrupt %s %d\n",
> +			      name, ret);
>  		return ret;
>  	}
>  
> @@ -1241,12 +1241,18 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
>  		dev_pm_domain_detach(gmu->gxpd, false);
>  	}
>  
> +	iounmap(gmu->mmio);
> +	gmu->mmio = NULL;
> +
>  	a6xx_gmu_memory_free(gmu, gmu->hfi);
>  
>  	iommu_detach_device(gmu->domain, gmu->dev);
>  
>  	iommu_domain_free(gmu->domain);
>  
> +	free_irq(gmu->gmu_irq, gmu);
> +	free_irq(gmu->hfi_irq, gmu);
> +
>  	gmu->initialized = false;
>  }
>  
> @@ -1281,24 +1287,24 @@ int a6xx_gmu_probe(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>  	/* Allocate memory for for the HFI queues */
>  	gmu->hfi = a6xx_gmu_memory_alloc(gmu, SZ_16K);
>  	if (IS_ERR(gmu->hfi))
> -		goto err;
> +		goto err_memory;
>  
>  	/* Allocate memory for the GMU debug region */
>  	gmu->debug = a6xx_gmu_memory_alloc(gmu, SZ_16K);
>  	if (IS_ERR(gmu->debug))
> -		goto err;
> +		goto err_memory;
>  
>  	/* Map the GMU registers */
>  	gmu->mmio = a6xx_gmu_get_mmio(pdev, "gmu");
>  	if (IS_ERR(gmu->mmio))
> -		goto err;
> +		goto err_memory;
>  
>  	/* Get the HFI and GMU interrupts */
>  	gmu->hfi_irq = a6xx_gmu_get_irq(gmu, pdev, "hfi", a6xx_hfi_irq);
>  	gmu->gmu_irq = a6xx_gmu_get_irq(gmu, pdev, "gmu", a6xx_gmu_irq);
>  
>  	if (gmu->hfi_irq < 0 || gmu->gmu_irq < 0)
> -		goto err;
> +		goto err_mmio;
>  
>  	/*
>  	 * Get a link to the GX power domain to reset the GPU in case of GMU
> @@ -1315,7 +1321,12 @@ int a6xx_gmu_probe(struct a6xx_gpu *a6xx_gpu, struct device_node *node)
>  	gmu->initialized = true;
>  
>  	return 0;
> -err:
> +
> +err_mmio:
> +	iounmap(gmu->mmio);
> +	free_irq(gmu->gmu_irq, gmu);
> +	free_irq(gmu->hfi_irq, gmu);
> +err_memory:
>  	a6xx_gmu_memory_free(gmu, gmu->hfi);
>  
>  	if (gmu->domain) {
> -- 
> Sean Paul, Software Engineer, Google / Chromium OS
> 

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
_______________________________________________
Freedreno mailing list
Freedreno@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/freedreno

  reply	other threads:[~2019-05-23 20:51 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-23 17:16 [PATCH v2 1/6] drm/msm/a6xx: Avoid freeing gmu resources multiple times Sean Paul
2019-05-23 17:16 ` Sean Paul
2019-05-23 17:16 ` [PATCH v2 2/6] drm/msm/a6xx: Remove duplicate irq disable from remove Sean Paul
2019-05-23 17:16   ` Sean Paul
2019-05-23 20:47   ` Jordan Crouse
2019-05-23 20:47     ` Jordan Crouse
2019-05-23 17:16 ` [PATCH v2 3/6] drm/msm/a6xx: Check for ERR or NULL before iounmap Sean Paul
2019-05-23 17:16   ` Sean Paul
2019-05-23 20:48   ` Jordan Crouse
2019-05-23 20:48     ` Jordan Crouse
2019-05-23 17:16 ` [PATCH v2 4/6] drm/msm/a6xx: Remove devm calls from gmu driver Sean Paul
2019-05-23 17:16   ` Sean Paul
2019-05-23 20:51   ` Jordan Crouse [this message]
2019-05-23 20:51     ` Jordan Crouse
2019-05-23 17:16 ` [PATCH v2 5/6] drm/msm/a6xx: Drop the device reference in gmu Sean Paul
2019-05-23 17:16   ` Sean Paul
2019-05-23 20:52   ` Jordan Crouse
2019-05-23 20:52     ` Jordan Crouse
2019-05-23 17:16 ` [PATCH v2 6/6] drm/msm/a6xx: Rename a6xx_gmu_probe to a6xx_gmu_init Sean Paul
2019-05-23 17:16   ` Sean Paul
2019-05-23 20:53   ` Jordan Crouse
2019-05-23 20:53     ` Jordan Crouse
2019-05-23 20:46 ` [PATCH v2 1/6] drm/msm/a6xx: Avoid freeing gmu resources multiple times Jordan Crouse
2019-05-23 20:46   ` Jordan Crouse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190523205151.GE18360@jcrouse1-lnx.qualcomm.com \
    --to=jcrouse@codeaurora.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=freedreno@lists.freedesktop.org \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=robdclark@gmail.com \
    --cc=sean@poorly.run \
    --cc=seanpaul@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.