All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] drm/amdgpu: Fix resource leak on probe error path
@ 2021-06-24 11:20 ` Jiri Kosina
  0 siblings, 0 replies; 9+ messages in thread
From: Jiri Kosina @ 2021-06-24 11:20 UTC (permalink / raw)
  To: Alex Deucher, Christian König, David Airlie
  Cc: amd-gfx, dri-devel, linux-kernel, Vojtech Pavlik

From: Jiri Kosina <jkosina@suse.cz>

This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980.

It is not true (as stated in the reverted commit changelog) that we never 
unmap the BAR on failure; it actually does happen properly on 
amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() -> 
amdgpu_device_fini() error path.

What's worse, this commit actually completely breaks resource freeing on 
probe failure (like e.g. failure to load microcode), as 
amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too 
early, leaving all the resources that'd normally be freed in 
amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading 
to all sorts of oopses when someone tries to, for example, access the 
sysfs and procfs resources which are still around while the driver is 
gone.

Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 57ec108b5972..0f1c0e17a587 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	r = amdgpu_device_get_job_timeout_settings(adev);
 	if (r) {
 		dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
-		goto failed_unmap;
+		return r;
 	}
 
 	/* early init functions */
 	r = amdgpu_device_ip_early_init(adev);
 	if (r)
-		goto failed_unmap;
+		return r;
 
 	/* doorbell bar mapping and doorbell index init*/
 	amdgpu_device_doorbell_init(adev);
@@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 failed:
 	amdgpu_vf_error_trans_all(adev);
 
-failed_unmap:
-	iounmap(adev->rmmio);
-	adev->rmmio = NULL;
-
 	return r;
 }
 
-- 
2.12.3



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH] drm/amdgpu: Fix resource leak on probe error path
@ 2021-06-24 11:20 ` Jiri Kosina
  0 siblings, 0 replies; 9+ messages in thread
From: Jiri Kosina @ 2021-06-24 11:20 UTC (permalink / raw)
  To: Alex Deucher, Christian König, David Airlie
  Cc: Vojtech Pavlik, dri-devel, amd-gfx, linux-kernel

From: Jiri Kosina <jkosina@suse.cz>

This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980.

It is not true (as stated in the reverted commit changelog) that we never 
unmap the BAR on failure; it actually does happen properly on 
amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() -> 
amdgpu_device_fini() error path.

What's worse, this commit actually completely breaks resource freeing on 
probe failure (like e.g. failure to load microcode), as 
amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too 
early, leaving all the resources that'd normally be freed in 
amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading 
to all sorts of oopses when someone tries to, for example, access the 
sysfs and procfs resources which are still around while the driver is 
gone.

Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 57ec108b5972..0f1c0e17a587 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	r = amdgpu_device_get_job_timeout_settings(adev);
 	if (r) {
 		dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
-		goto failed_unmap;
+		return r;
 	}
 
 	/* early init functions */
 	r = amdgpu_device_ip_early_init(adev);
 	if (r)
-		goto failed_unmap;
+		return r;
 
 	/* doorbell bar mapping and doorbell index init*/
 	amdgpu_device_doorbell_init(adev);
@@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 failed:
 	amdgpu_vf_error_trans_all(adev);
 
-failed_unmap:
-	iounmap(adev->rmmio);
-	adev->rmmio = NULL;
-
 	return r;
 }
 
-- 
2.12.3



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH] drm/amdgpu: Fix resource leak on probe error path
@ 2021-06-24 11:20 ` Jiri Kosina
  0 siblings, 0 replies; 9+ messages in thread
From: Jiri Kosina @ 2021-06-24 11:20 UTC (permalink / raw)
  To: Alex Deucher, Christian König, David Airlie
  Cc: Vojtech Pavlik, dri-devel, amd-gfx, linux-kernel

From: Jiri Kosina <jkosina@suse.cz>

This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980.

It is not true (as stated in the reverted commit changelog) that we never 
unmap the BAR on failure; it actually does happen properly on 
amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() -> 
amdgpu_device_fini() error path.

What's worse, this commit actually completely breaks resource freeing on 
probe failure (like e.g. failure to load microcode), as 
amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too 
early, leaving all the resources that'd normally be freed in 
amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading 
to all sorts of oopses when someone tries to, for example, access the 
sysfs and procfs resources which are still around while the driver is 
gone.

Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 57ec108b5972..0f1c0e17a587 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	r = amdgpu_device_get_job_timeout_settings(adev);
 	if (r) {
 		dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
-		goto failed_unmap;
+		return r;
 	}
 
 	/* early init functions */
 	r = amdgpu_device_ip_early_init(adev);
 	if (r)
-		goto failed_unmap;
+		return r;
 
 	/* doorbell bar mapping and doorbell index init*/
 	amdgpu_device_doorbell_init(adev);
@@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 failed:
 	amdgpu_vf_error_trans_all(adev);
 
-failed_unmap:
-	iounmap(adev->rmmio);
-	adev->rmmio = NULL;
-
 	return r;
 }
 
-- 
2.12.3


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix resource leak on probe error path
  2021-06-24 11:20 ` Jiri Kosina
  (?)
@ 2021-07-01  8:32   ` Jiri Kosina
  -1 siblings, 0 replies; 9+ messages in thread
From: Jiri Kosina @ 2021-07-01  8:32 UTC (permalink / raw)
  To: Alex Deucher, Christian König, David Airlie
  Cc: amd-gfx, dri-devel, linux-kernel, Vojtech Pavlik

On Thu, 24 Jun 2021, Jiri Kosina wrote:

> From: Jiri Kosina <jkosina@suse.cz>
> 
> This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980.
> 
> It is not true (as stated in the reverted commit changelog) that we never 
> unmap the BAR on failure; it actually does happen properly on 
> amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() -> 
> amdgpu_device_fini() error path.
> 
> What's worse, this commit actually completely breaks resource freeing on 
> probe failure (like e.g. failure to load microcode), as 
> amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too 
> early, leaving all the resources that'd normally be freed in 
> amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading 
> to all sorts of oopses when someone tries to, for example, access the 
> sysfs and procfs resources which are still around while the driver is 
> gone.
> 
> Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
> Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>

Friendly ping on this one (as it's easily triggerable in the wild by just 
missing proper firwmare).

Thanks.

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 57ec108b5972..0f1c0e17a587 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>  	r = amdgpu_device_get_job_timeout_settings(adev);
>  	if (r) {
>  		dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
> -		goto failed_unmap;
> +		return r;
>  	}
>  
>  	/* early init functions */
>  	r = amdgpu_device_ip_early_init(adev);
>  	if (r)
> -		goto failed_unmap;
> +		return r;
>  
>  	/* doorbell bar mapping and doorbell index init*/
>  	amdgpu_device_doorbell_init(adev);
> @@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>  failed:
>  	amdgpu_vf_error_trans_all(adev);
>  
> -failed_unmap:
> -	iounmap(adev->rmmio);
> -	adev->rmmio = NULL;
> -
>  	return r;
>  }
>  
> -- 
> 2.12.3
> 
> 

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix resource leak on probe error path
@ 2021-07-01  8:32   ` Jiri Kosina
  0 siblings, 0 replies; 9+ messages in thread
From: Jiri Kosina @ 2021-07-01  8:32 UTC (permalink / raw)
  To: Alex Deucher, Christian König, David Airlie
  Cc: Vojtech Pavlik, dri-devel, amd-gfx, linux-kernel

On Thu, 24 Jun 2021, Jiri Kosina wrote:

> From: Jiri Kosina <jkosina@suse.cz>
> 
> This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980.
> 
> It is not true (as stated in the reverted commit changelog) that we never 
> unmap the BAR on failure; it actually does happen properly on 
> amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() -> 
> amdgpu_device_fini() error path.
> 
> What's worse, this commit actually completely breaks resource freeing on 
> probe failure (like e.g. failure to load microcode), as 
> amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too 
> early, leaving all the resources that'd normally be freed in 
> amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading 
> to all sorts of oopses when someone tries to, for example, access the 
> sysfs and procfs resources which are still around while the driver is 
> gone.
> 
> Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
> Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>

Friendly ping on this one (as it's easily triggerable in the wild by just 
missing proper firwmare).

Thanks.

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 57ec108b5972..0f1c0e17a587 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>  	r = amdgpu_device_get_job_timeout_settings(adev);
>  	if (r) {
>  		dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
> -		goto failed_unmap;
> +		return r;
>  	}
>  
>  	/* early init functions */
>  	r = amdgpu_device_ip_early_init(adev);
>  	if (r)
> -		goto failed_unmap;
> +		return r;
>  
>  	/* doorbell bar mapping and doorbell index init*/
>  	amdgpu_device_doorbell_init(adev);
> @@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>  failed:
>  	amdgpu_vf_error_trans_all(adev);
>  
> -failed_unmap:
> -	iounmap(adev->rmmio);
> -	adev->rmmio = NULL;
> -
>  	return r;
>  }
>  
> -- 
> 2.12.3
> 
> 

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix resource leak on probe error path
@ 2021-07-01  8:32   ` Jiri Kosina
  0 siblings, 0 replies; 9+ messages in thread
From: Jiri Kosina @ 2021-07-01  8:32 UTC (permalink / raw)
  To: Alex Deucher, Christian König, David Airlie
  Cc: Vojtech Pavlik, dri-devel, amd-gfx, linux-kernel

On Thu, 24 Jun 2021, Jiri Kosina wrote:

> From: Jiri Kosina <jkosina@suse.cz>
> 
> This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980.
> 
> It is not true (as stated in the reverted commit changelog) that we never 
> unmap the BAR on failure; it actually does happen properly on 
> amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() -> 
> amdgpu_device_fini() error path.
> 
> What's worse, this commit actually completely breaks resource freeing on 
> probe failure (like e.g. failure to load microcode), as 
> amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too 
> early, leaving all the resources that'd normally be freed in 
> amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading 
> to all sorts of oopses when someone tries to, for example, access the 
> sysfs and procfs resources which are still around while the driver is 
> gone.
> 
> Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
> Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>

Friendly ping on this one (as it's easily triggerable in the wild by just 
missing proper firwmare).

Thanks.

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 57ec108b5972..0f1c0e17a587 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>  	r = amdgpu_device_get_job_timeout_settings(adev);
>  	if (r) {
>  		dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
> -		goto failed_unmap;
> +		return r;
>  	}
>  
>  	/* early init functions */
>  	r = amdgpu_device_ip_early_init(adev);
>  	if (r)
> -		goto failed_unmap;
> +		return r;
>  
>  	/* doorbell bar mapping and doorbell index init*/
>  	amdgpu_device_doorbell_init(adev);
> @@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>  failed:
>  	amdgpu_vf_error_trans_all(adev);
>  
> -failed_unmap:
> -	iounmap(adev->rmmio);
> -	adev->rmmio = NULL;
> -
>  	return r;
>  }
>  
> -- 
> 2.12.3
> 
> 

-- 
Jiri Kosina
SUSE Labs

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix resource leak on probe error path
  2021-07-01  8:32   ` Jiri Kosina
  (?)
@ 2021-07-01 14:07     ` Alex Deucher
  -1 siblings, 0 replies; 9+ messages in thread
From: Alex Deucher @ 2021-07-01 14:07 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: Alex Deucher, Christian König, David Airlie, Vojtech Pavlik,
	Maling list - DRI developers, amd-gfx list, LKML

Applied.  Thanks!

Alex

On Thu, Jul 1, 2021 at 4:32 AM Jiri Kosina <jikos@kernel.org> wrote:
>
> On Thu, 24 Jun 2021, Jiri Kosina wrote:
>
> > From: Jiri Kosina <jkosina@suse.cz>
> >
> > This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980.
> >
> > It is not true (as stated in the reverted commit changelog) that we never
> > unmap the BAR on failure; it actually does happen properly on
> > amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() ->
> > amdgpu_device_fini() error path.
> >
> > What's worse, this commit actually completely breaks resource freeing on
> > probe failure (like e.g. failure to load microcode), as
> > amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too
> > early, leaving all the resources that'd normally be freed in
> > amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading
> > to all sorts of oopses when someone tries to, for example, access the
> > sysfs and procfs resources which are still around while the driver is
> > gone.
> >
> > Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
> > Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
> > Signed-off-by: Jiri Kosina <jkosina@suse.cz>
>
> Friendly ping on this one (as it's easily triggerable in the wild by just
> missing proper firwmare).
>
> Thanks.
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 57ec108b5972..0f1c0e17a587 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >       r = amdgpu_device_get_job_timeout_settings(adev);
> >       if (r) {
> >               dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
> > -             goto failed_unmap;
> > +             return r;
> >       }
> >
> >       /* early init functions */
> >       r = amdgpu_device_ip_early_init(adev);
> >       if (r)
> > -             goto failed_unmap;
> > +             return r;
> >
> >       /* doorbell bar mapping and doorbell index init*/
> >       amdgpu_device_doorbell_init(adev);
> > @@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >  failed:
> >       amdgpu_vf_error_trans_all(adev);
> >
> > -failed_unmap:
> > -     iounmap(adev->rmmio);
> > -     adev->rmmio = NULL;
> > -
> >       return r;
> >  }
> >
> > --
> > 2.12.3
> >
> >
>
> --
> Jiri Kosina
> SUSE Labs
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix resource leak on probe error path
@ 2021-07-01 14:07     ` Alex Deucher
  0 siblings, 0 replies; 9+ messages in thread
From: Alex Deucher @ 2021-07-01 14:07 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: David Airlie, Vojtech Pavlik, LKML, amd-gfx list,
	Maling list - DRI developers, Alex Deucher, Christian König

Applied.  Thanks!

Alex

On Thu, Jul 1, 2021 at 4:32 AM Jiri Kosina <jikos@kernel.org> wrote:
>
> On Thu, 24 Jun 2021, Jiri Kosina wrote:
>
> > From: Jiri Kosina <jkosina@suse.cz>
> >
> > This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980.
> >
> > It is not true (as stated in the reverted commit changelog) that we never
> > unmap the BAR on failure; it actually does happen properly on
> > amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() ->
> > amdgpu_device_fini() error path.
> >
> > What's worse, this commit actually completely breaks resource freeing on
> > probe failure (like e.g. failure to load microcode), as
> > amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too
> > early, leaving all the resources that'd normally be freed in
> > amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading
> > to all sorts of oopses when someone tries to, for example, access the
> > sysfs and procfs resources which are still around while the driver is
> > gone.
> >
> > Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
> > Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
> > Signed-off-by: Jiri Kosina <jkosina@suse.cz>
>
> Friendly ping on this one (as it's easily triggerable in the wild by just
> missing proper firwmare).
>
> Thanks.
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 57ec108b5972..0f1c0e17a587 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >       r = amdgpu_device_get_job_timeout_settings(adev);
> >       if (r) {
> >               dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
> > -             goto failed_unmap;
> > +             return r;
> >       }
> >
> >       /* early init functions */
> >       r = amdgpu_device_ip_early_init(adev);
> >       if (r)
> > -             goto failed_unmap;
> > +             return r;
> >
> >       /* doorbell bar mapping and doorbell index init*/
> >       amdgpu_device_doorbell_init(adev);
> > @@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >  failed:
> >       amdgpu_vf_error_trans_all(adev);
> >
> > -failed_unmap:
> > -     iounmap(adev->rmmio);
> > -     adev->rmmio = NULL;
> > -
> >       return r;
> >  }
> >
> > --
> > 2.12.3
> >
> >
>
> --
> Jiri Kosina
> SUSE Labs
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] drm/amdgpu: Fix resource leak on probe error path
@ 2021-07-01 14:07     ` Alex Deucher
  0 siblings, 0 replies; 9+ messages in thread
From: Alex Deucher @ 2021-07-01 14:07 UTC (permalink / raw)
  To: Jiri Kosina
  Cc: David Airlie, Vojtech Pavlik, LKML, amd-gfx list,
	Maling list - DRI developers, Alex Deucher, Christian König

Applied.  Thanks!

Alex

On Thu, Jul 1, 2021 at 4:32 AM Jiri Kosina <jikos@kernel.org> wrote:
>
> On Thu, 24 Jun 2021, Jiri Kosina wrote:
>
> > From: Jiri Kosina <jkosina@suse.cz>
> >
> > This reverts commit 4192f7b5768912ceda82be2f83c87ea7181f9980.
> >
> > It is not true (as stated in the reverted commit changelog) that we never
> > unmap the BAR on failure; it actually does happen properly on
> > amdgpu_driver_load_kms() -> amdgpu_driver_unload_kms() ->
> > amdgpu_device_fini() error path.
> >
> > What's worse, this commit actually completely breaks resource freeing on
> > probe failure (like e.g. failure to load microcode), as
> > amdgpu_driver_unload_kms() notices adev->rmmio being NULL and bails too
> > early, leaving all the resources that'd normally be freed in
> > amdgpu_acpi_fini() and amdgpu_device_fini() still hanging around, leading
> > to all sorts of oopses when someone tries to, for example, access the
> > sysfs and procfs resources which are still around while the driver is
> > gone.
> >
> > Fixes: 4192f7b57689 ("drm/amdgpu: unmap register bar on device init failure")
> > Reported-by: Vojtech Pavlik <vojtech@ucw.cz>
> > Signed-off-by: Jiri Kosina <jkosina@suse.cz>
>
> Friendly ping on this one (as it's easily triggerable in the wild by just
> missing proper firwmare).
>
> Thanks.
>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++------
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > index 57ec108b5972..0f1c0e17a587 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> > @@ -3414,13 +3414,13 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >       r = amdgpu_device_get_job_timeout_settings(adev);
> >       if (r) {
> >               dev_err(adev->dev, "invalid lockup_timeout parameter syntax\n");
> > -             goto failed_unmap;
> > +             return r;
> >       }
> >
> >       /* early init functions */
> >       r = amdgpu_device_ip_early_init(adev);
> >       if (r)
> > -             goto failed_unmap;
> > +             return r;
> >
> >       /* doorbell bar mapping and doorbell index init*/
> >       amdgpu_device_doorbell_init(adev);
> > @@ -3646,10 +3646,6 @@ int amdgpu_device_init(struct amdgpu_device *adev,
> >  failed:
> >       amdgpu_vf_error_trans_all(adev);
> >
> > -failed_unmap:
> > -     iounmap(adev->rmmio);
> > -     adev->rmmio = NULL;
> > -
> >       return r;
> >  }
> >
> > --
> > 2.12.3
> >
> >
>
> --
> Jiri Kosina
> SUSE Labs
>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-07-01 14:07 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-24 11:20 [PATCH] drm/amdgpu: Fix resource leak on probe error path Jiri Kosina
2021-06-24 11:20 ` Jiri Kosina
2021-06-24 11:20 ` Jiri Kosina
2021-07-01  8:32 ` Jiri Kosina
2021-07-01  8:32   ` Jiri Kosina
2021-07-01  8:32   ` Jiri Kosina
2021-07-01 14:07   ` Alex Deucher
2021-07-01 14:07     ` Alex Deucher
2021-07-01 14:07     ` Alex Deucher

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.