* [PATCH v3] drm/radeon: Fix EEH during kexec
@ 2019-10-25 20:40 ` KyleMahlkuch
  0 siblings, 0 replies; 16+ messages in thread
From: KyleMahlkuch @ 2019-10-25 20:40 UTC (permalink / raw)
  To: alexander.deucher; +Cc: stable, amd-gfx, Kyle Mahlkuch

From: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>

During kexec some adapters hit an EEH since they are not properly
shut down in the radeon_pci_shutdown() function. Adding
radeon_suspend_kms() fixes this issue.
Enabled only on PPC because this patch causes issues on some other
boards.

Signed-off-by: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>
---
 drivers/gpu/drm/radeon/radeon_drv.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 9e55076..4528f4d 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -379,11 +379,25 @@ static int radeon_pci_probe(struct pci_dev *pdev,
 static void
 radeon_pci_shutdown(struct pci_dev *pdev)
 {
+#ifdef CONFIG_PPC64
+	struct drm_device *ddev = pci_get_drvdata(pdev);
+#endif
+
 	/* if we are running in a VM, make sure the device
 	 * torn down properly on reboot/shutdown
 	 */
 	if (radeon_device_is_virtual())
 		radeon_pci_remove(pdev);
+
+#ifdef CONFIG_PPC64
+	/* Some adapters need to be suspended before a
+	 * shutdown occurs in order to prevent an error
+	 * during kexec.
+	 * Make this power specific because it breaks
+	 * some non-power boards.
+	 */
+	radeon_suspend_kms(ddev, true, true, false);
+#endif
 }
 
 static int radeon_pmops_suspend(struct device *dev)
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/radeon: Fix EEH during kexec
@ 2019-10-26  8:00   ` Greg KH
  0 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2019-10-26  8:00 UTC (permalink / raw)
  To: KyleMahlkuch; +Cc: alexander.deucher, stable, amd-gfx

On Fri, Oct 25, 2019 at 03:40:50PM -0500, KyleMahlkuch wrote:
> From: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>
> 
> During kexec some adapters hit an EEH since they are not properly
> shut down in the radeon_pci_shutdown() function. Adding
> radeon_suspend_kms() fixes this issue.
> Enabled only on PPC because this patch causes issues on some other
> boards.
> 
> Signed-off-by: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>
> ---
>  drivers/gpu/drm/radeon/radeon_drv.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/radeon: Fix EEH during kexec
@ 2019-10-28 14:07   ` Alex Deucher
  0 siblings, 0 replies; 16+ messages in thread
From: Alex Deucher @ 2019-10-28 14:07 UTC (permalink / raw)
  To: KyleMahlkuch; +Cc: Deucher, Alexander, amd-gfx list, for 3.8

On Fri, Oct 25, 2019 at 4:44 PM KyleMahlkuch
<kmahlkuc@linux.vnet.ibm.com> wrote:
>
> From: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>
>
> During kexec some adapters hit an EEH since they are not properly
> shut down in the radeon_pci_shutdown() function. Adding
> radeon_suspend_kms() fixes this issue.
> Enabled only on PPC because this patch causes issues on some other
> boards.
>
> Signed-off-by: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>

Applied.  Thanks!

Alex

> ---
>  drivers/gpu/drm/radeon/radeon_drv.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
> index 9e55076..4528f4d 100644
> --- a/drivers/gpu/drm/radeon/radeon_drv.c
> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
> @@ -379,11 +379,25 @@ static int radeon_pci_probe(struct pci_dev *pdev,
>  static void
>  radeon_pci_shutdown(struct pci_dev *pdev)
>  {
> +#ifdef CONFIG_PPC64
> +       struct drm_device *ddev = pci_get_drvdata(pdev);
> +#endif
> +
>         /* if we are running in a VM, make sure the device
>          * torn down properly on reboot/shutdown
>          */
>         if (radeon_device_is_virtual())
>                 radeon_pci_remove(pdev);
> +
> +#ifdef CONFIG_PPC64
> +       /* Some adapters need to be suspended before a
> +        * shutdown occurs in order to prevent an error
> +        * during kexec.
> +        * Make this power specific because it breaks
> +        * some non-power boards.
> +        */
> +       radeon_suspend_kms(ddev, true, true, false);
> +#endif
>  }
>
>  static int radeon_pmops_suspend(struct device *dev)
> --
> 1.8.3.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/radeon: Fix EEH during kexec
  2019-10-25 20:40 ` KyleMahlkuch
  (?)
@ 2019-10-30 10:35   ` Michael Ellerman
  -1 siblings, 0 replies; 16+ messages in thread
From: Michael Ellerman @ 2019-10-30 10:35 UTC (permalink / raw)
  To: KyleMahlkuch, alexander.deucher; +Cc: linuxppc-dev, Kyle Mahlkuch, amd-gfx

Hi Kyle,

KyleMahlkuch <kmahlkuc@linux.vnet.ibm.com> writes:
> From: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>
>
> During kexec some adapters hit an EEH since they are not properly
> shut down in the radeon_pci_shutdown() function. Adding
> radeon_suspend_kms() fixes this issue.
> Enabled only on PPC because this patch causes issues on some other
> boards.

Which adapters hit the issues?

And do we know why they're not shut down correctly in
radeon_pci_shutdown()? That seems like the root cause no?


> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
> index 9e55076..4528f4d 100644
> --- a/drivers/gpu/drm/radeon/radeon_drv.c
> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
> @@ -379,11 +379,25 @@ static int radeon_pci_probe(struct pci_dev *pdev,
>  static void
>  radeon_pci_shutdown(struct pci_dev *pdev)
>  {
> +#ifdef CONFIG_PPC64
> +	struct drm_device *ddev = pci_get_drvdata(pdev);
> +#endif

This local serves no real purpose and could be avoided, which would also
avoid this ifdef.

>  	/* if we are running in a VM, make sure the device
>  	 * torn down properly on reboot/shutdown
>  	 */
>  	if (radeon_device_is_virtual())
>  		radeon_pci_remove(pdev);
> +
> +#ifdef CONFIG_PPC64
> +	/* Some adapters need to be suspended before a

AFAIK drm uses normal kernel comment style, so this should be:

	/*
	 * Some adapters need to be suspended before a
> +	 * shutdown occurs in order to prevent an error
> +	 * during kexec.
> +	 * Make this power specific because it breaks
> +	 * some non-power boards.
> +	 */
> +	radeon_suspend_kms(ddev, true, true, false);

ie, instead do:

	radeon_suspend_kms(pci_get_drvdata(pdev), true, true, false);

> +#endif
>  }
>  
>  static int radeon_pmops_suspend(struct device *dev)
> -- 
> 1.8.3.1

cheers
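
Taken together, the two comments above could be sketched roughly as follows.
This is only an illustration under stated assumptions, not the revised patch:
the struct definitions, the stub function bodies and the local
"#define CONFIG_PPC64" are hypothetical stand-ins added so the fragment
compiles in userspace. Only radeon_pci_shutdown() mirrors the driver code
under discussion.

```c
#include <stdbool.h>

/*
 * Userspace stand-ins so the sketch compiles outside the kernel.
 * These are hypothetical mocks, not the real kernel definitions.
 */
struct drm_device { bool suspended; };
struct pci_dev { void *driver_data; bool removed; };

/* In the kernel this comes from Kconfig; defined here only to exercise the path. */
#define CONFIG_PPC64 1

static void *pci_get_drvdata(struct pci_dev *pdev)
{
	return pdev->driver_data;
}

static bool radeon_device_is_virtual(void)
{
	return false;	/* mock: pretend we are on bare metal */
}

static void radeon_pci_remove(struct pci_dev *pdev)
{
	pdev->removed = true;	/* mock: record the teardown */
}

static int radeon_suspend_kms(struct drm_device *ddev, bool suspend,
			      bool fbcon, bool freeze)
{
	(void)suspend;
	(void)fbcon;
	(void)freeze;
	ddev->suspended = true;	/* mock: record that suspend was requested */
	return 0;
}

static void
radeon_pci_shutdown(struct pci_dev *pdev)
{
	/* if we are running in a VM, make sure the device
	 * torn down properly on reboot/shutdown
	 */
	if (radeon_device_is_virtual())
		radeon_pci_remove(pdev);

#ifdef CONFIG_PPC64
	/*
	 * Some adapters need to be suspended before a
	 * shutdown occurs in order to prevent an error
	 * during kexec.
	 * Make this power specific because it breaks
	 * some non-power boards.
	 */
	radeon_suspend_kms(pci_get_drvdata(pdev), true, true, false);
#endif
}
```

With the local gone, the first #ifdef block disappears and only the one
around the radeon_suspend_kms() call remains.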

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/radeon: Fix EEH during kexec
@ 2019-10-31 15:24     ` Kyle Mahlkuch
  0 siblings, 0 replies; 16+ messages in thread
From: Kyle Mahlkuch @ 2019-10-31 15:24 UTC (permalink / raw)
  To: Michael Ellerman, alexander.deucher; +Cc: linuxppc-dev, amd-gfx

On 10/30/19 5:35 AM, Michael Ellerman wrote:

> Hi Kyle,
>
> KyleMahlkuch <kmahlkuc@linux.vnet.ibm.com> writes:
>> From: Kyle Mahlkuch <kmahlkuc@linux.vnet.ibm.com>
>>
>> During kexec some adapters hit an EEH since they are not properly
>> shut down in the radeon_pci_shutdown() function. Adding
>> radeon_suspend_kms() fixes this issue.
>> Enabled only on PPC because this patch causes issues on some other
>> boards.
> Which adapters hit the issues?
>
> And do we know why they're not shut down correctly in
> radeon_pci_shutdown()? That seems like the root cause no?

Hi Michael,
This is hit by the Caicos (edwards2) adapter that I have on ppc. It is not hit
on the Cedar (FirePro) adapter - though I haven't tested this one recently. I'm
not able to test any other adapters. As far as "why", I'm unsure. During
initialization after the kexec we hit an EEH. There could be another point in
the shutdown / start up process where something doesn't get reset correctly.
I'm open to other ideas if you have any.

>> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
>> index 9e55076..4528f4d 100644
>> --- a/drivers/gpu/drm/radeon/radeon_drv.c
>> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
>> @@ -379,11 +379,25 @@ static int radeon_pci_probe(struct pci_dev *pdev,
>>   static void
>>   radeon_pci_shutdown(struct pci_dev *pdev)
>>   {
>> +#ifdef CONFIG_PPC64
>> +	struct drm_device *ddev = pci_get_drvdata(pdev);
>> +#endif
> This local serves no real purpose and could be avoided, which would also
> avoid this ifdef.
>
>>   	/* if we are running in a VM, make sure the device
>>   	 * torn down properly on reboot/shutdown
>>   	 */
>>   	if (radeon_device_is_virtual())
>>   		radeon_pci_remove(pdev);
>> +
>> +#ifdef CONFIG_PPC64
>> +	/* Some adapters need to be suspended before a
> AFAIK drm uses normal kernel comment style, so this should be:
>
> 	/*
> 	 * Some adapters need to be suspended before a
>> +	 * shutdown occurs in order to prevent an error
>> +	 * during kexec.
>> +	 * Make this power specific because it breaks
>> +	 * some non-power boards.
>> +	 */
>> +	radeon_suspend_kms(ddev, true, true, false);
> ie, instead do:
>
> 	radeon_suspend_kms(pci_get_drvdata(pdev), true, true, false);

I agree, this is a cleaner way to write this patch. I'll update the comment as
well. Thanks for the help.

>> +#endif
>>   }
>>   
>>   static int radeon_pmops_suspend(struct device *dev)
>> -- 
>> 1.8.3.1
> cheers
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/radeon: Fix EEH during kexec
@ 2019-10-25 20:49       ` Kyle Mahlkuch
  0 siblings, 0 replies; 16+ messages in thread
From: Kyle Mahlkuch @ 2019-10-25 20:49 UTC (permalink / raw)
  To: Greg KH; +Cc: amd-gfx, stable


On 10/10/19 11:27 PM, Greg KH wrote:

> On Thu, Oct 10, 2019 at 02:44:29PM -0500, KyleMahlkuch wrote:
>> During kexec some adapters hit an EEH since they are not properly
>> shut down in the radeon_pci_shutdown() function. Adding
>> radeon_suspend_kms() fixes this issue.
>> Enabled only on PPC because this patch causes issues on some other
>> boards.
>>
>> Signed-off-by: Kyle Mahlkuch <Kyle.Mahlkuch at ibm.com>
> Real email address please, with a '@' sign.
>
> And your "From:" line did not match up with this :(

Greg, thanks for your help, I've resubmitted my patch.

>> ---
>>   drivers/gpu/drm/radeon/radeon_drv.c | 14 ++++++++++++++
>>   1 file changed, 14 insertions(+)
> <formletter>
>
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree.  Please read:
>      https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
>
> </formletter>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v3] drm/radeon: Fix EEH during kexec
  2019-10-10 19:44 KyleMahlkuch
@ 2019-10-11  4:27 ` Greg KH
       [not found]   ` <20191011042734.GA939089-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 16+ messages in thread
From: Greg KH @ 2019-10-11  4:27 UTC (permalink / raw)
  To: KyleMahlkuch; +Cc: alexander.deucher, stable, amd-gfx

On Thu, Oct 10, 2019 at 02:44:29PM -0500, KyleMahlkuch wrote:
> During kexec some adapters hit an EEH since they are not properly
> shut down in the radeon_pci_shutdown() function. Adding
> radeon_suspend_kms() fixes this issue.
> Enabled only on PPC because this patch causes issues on some other
> boards.
> 
> Signed-off-by: Kyle Mahlkuch <Kyle.Mahlkuch at ibm.com>

Real email address please, with a '@' sign.

And your "From:" line did not match up with this :(

> ---
>  drivers/gpu/drm/radeon/radeon_drv.c | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v3] drm/radeon: Fix EEH during kexec
@ 2019-10-10 19:44 KyleMahlkuch
  2019-10-11  4:27 ` Greg KH
  0 siblings, 1 reply; 16+ messages in thread
From: KyleMahlkuch @ 2019-10-10 19:44 UTC (permalink / raw)
  To: alexander.deucher; +Cc: stable, amd-gfx, KyleMahlkuch

During kexec some adapters hit an EEH since they are not properly
shut down in the radeon_pci_shutdown() function. Adding
radeon_suspend_kms() fixes this issue.
Enabled only on PPC because this patch causes issues on some other
boards.

Signed-off-by: Kyle Mahlkuch <Kyle.Mahlkuch at ibm.com>
---
 drivers/gpu/drm/radeon/radeon_drv.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 9e55076..4528f4d 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -379,11 +379,25 @@ static int radeon_pci_probe(struct pci_dev *pdev,
 static void
 radeon_pci_shutdown(struct pci_dev *pdev)
 {
+#ifdef CONFIG_PPC64
+	struct drm_device *ddev = pci_get_drvdata(pdev);
+#endif
+
 	/* if we are running in a VM, make sure the device
 	 * torn down properly on reboot/shutdown
 	 */
 	if (radeon_device_is_virtual())
 		radeon_pci_remove(pdev);
+
+#ifdef CONFIG_PPC64
+	/* Some adapters need to be suspended before a
+	 * shutdown occurs in order to prevent an error
+	 * during kexec.
+	 * Make this power specific because it breaks
+	 * some non-power boards.
+	 */
+	radeon_suspend_kms(ddev, true, true, false);
+#endif
 }
 
 static int radeon_pmops_suspend(struct device *dev)
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread
