* [PATCH 1/1] amdgpu/pm: Clarify documentation of error handling in send_smc_mesg
@ 2022-04-12 4:08 Darren Powell
2022-04-12 4:19 ` Lazar, Lijo
2022-04-12 21:09 ` Luben Tuikov
0 siblings, 2 replies; 10+ messages in thread
From: Darren Powell @ 2022-04-12 4:08 UTC (permalink / raw)
To: amd-gfx
Cc: luben.tuikov, evan.quan, Darren Powell, andrey.grodzovsky, pmenzel
Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
cases exist where messages are silently dropped with no error returned
to the caller. These cases occur in unusual situations where either:
1. the message target is a virtual GPU, or
2. a PCI recovery is underway and the HW is not yet in sync with the SW
For more details see
commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
(v2)
Reworked with suggestions from Luben & Paul
Signed-off-by: Darren Powell <darren.powell@amd.com>
---
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index b8d0c70ff668..8008ae5508e6 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -356,9 +356,11 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
* completion of the command, and return back a value from the SMU in
* @read_arg pointer.
*
- * Return 0 on success, -errno on error, if we weren't able to send
- * the message or if the message completed with some kind of
- * error. See __smu_cmn_reg2errno() for details of the -errno.
+ * Return 0 on success, -errno when a problem is encountered sending
+ * message or receiving reply. If there is a PCI bus recovery or
+ * the destination is a virtual GPU, the message is simply dropped and
+ * success is also returned.
+ * See __smu_cmn_reg2errno() for details of the -errno.
*
* If we weren't able to send the message to the SMU, we also print
* the error to the standard log.
base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7
--
2.35.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] amdgpu/pm: Clarify documentation of error handling in send_smc_mesg
2022-04-12 4:08 [PATCH 1/1] amdgpu/pm: Clarify documentation of error handling in send_smc_mesg Darren Powell
@ 2022-04-12 4:19 ` Lazar, Lijo
2022-04-13 2:48 ` Powell, Darren
2022-04-12 21:09 ` Luben Tuikov
1 sibling, 1 reply; 10+ messages in thread
From: Lazar, Lijo @ 2022-04-12 4:19 UTC (permalink / raw)
To: Darren Powell, amd-gfx
Cc: andrey.grodzovsky, luben.tuikov, evan.quan, pmenzel
On 4/12/2022 9:38 AM, Darren Powell wrote:
> Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
> cases exist where messages are silently dropped with no error returned
> to the caller. These cases occur in unusual situations where either:
> 1. the message target is a virtual GPU, or
This is not fully correct - only messages which are not valid for
virtual GPU are dropped, not all.
Thanks,
Lijo
> 2. a PCI recovery is underway and the HW is not yet in sync with the SW
>
> For more details see
> commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
> commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
>
> (v2)
> Reworked with suggestions from Luben & Paul
>
> Signed-off-by: Darren Powell <darren.powell@amd.com>
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index b8d0c70ff668..8008ae5508e6 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -356,9 +356,11 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
> * completion of the command, and return back a value from the SMU in
> * @read_arg pointer.
> *
> - * Return 0 on success, -errno on error, if we weren't able to send
> - * the message or if the message completed with some kind of
> - * error. See __smu_cmn_reg2errno() for details of the -errno.
> + * Return 0 on success, -errno when a problem is encountered sending
> + * message or receiving reply. If there is a PCI bus recovery or
> + * the destination is a virtual GPU, the message is simply dropped and
> + * success is also returned.
> + * See __smu_cmn_reg2errno() for details of the -errno.
> *
> * If we weren't able to send the message to the SMU, we also print
> * the error to the standard log.
>
> base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] amdgpu/pm: Clarify documentation of error handling in send_smc_mesg
2022-04-12 4:08 [PATCH 1/1] amdgpu/pm: Clarify documentation of error handling in send_smc_mesg Darren Powell
2022-04-12 4:19 ` Lazar, Lijo
@ 2022-04-12 21:09 ` Luben Tuikov
2022-04-13 2:50 ` Powell, Darren
1 sibling, 1 reply; 10+ messages in thread
From: Luben Tuikov @ 2022-04-12 21:09 UTC (permalink / raw)
To: Darren Powell, amd-gfx; +Cc: pmenzel, evan.quan, andrey.grodzovsky
I suppose I didn't quite register this on a first read:
On 2022-04-12 00:08, Darren Powell wrote:
> Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
I'd just say
Clarify the documentation to also mention that we drop
messages and return success in the following two cases:
1. ...
That "Contrary to the ..." sounds a bit harsh.
Regards,
Luben
> cases exist where messages are silently dropped with no error returned
> to the caller. These cases occur in unusual situations where either:
> 1. the message target is a virtual GPU, or
> 2. a PCI recovery is underway and the HW is not yet in sync with the SW
>
> For more details see
> commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
> commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
>
> (v2)
> Reworked with suggestions from Luben & Paul
>
> Signed-off-by: Darren Powell <darren.powell@amd.com>
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index b8d0c70ff668..8008ae5508e6 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -356,9 +356,11 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
> * completion of the command, and return back a value from the SMU in
> * @read_arg pointer.
> *
> - * Return 0 on success, -errno on error, if we weren't able to send
> - * the message or if the message completed with some kind of
> - * error. See __smu_cmn_reg2errno() for details of the -errno.
> + * Return 0 on success, -errno when a problem is encountered sending
> + * message or receiving reply. If there is a PCI bus recovery or
> + * the destination is a virtual GPU, the message is simply dropped and
> + * success is also returned.
> + * See __smu_cmn_reg2errno() for details of the -errno.
> *
> * If we weren't able to send the message to the SMU, we also print
> * the error to the standard log.
>
> base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7
Regards,
--
Luben
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] amdgpu/pm: Clarify documentation of error handling in send_smc_mesg
2022-04-12 4:19 ` Lazar, Lijo
@ 2022-04-13 2:48 ` Powell, Darren
0 siblings, 0 replies; 10+ messages in thread
From: Powell, Darren @ 2022-04-13 2:48 UTC (permalink / raw)
To: Lazar, Lijo, amd-gfx
Cc: Grodzovsky, Andrey, Tuikov, Luben, Quan, Evan, pmenzel
[-- Attachment #1: Type: text/plain, Size: 2749 bytes --]
[AMD Official Use Only]
I needed to dig further down to find the message map, I had been looking back in mailing list looking for clarification but hadn't found anything.
Will reword
Thanks
Darren
________________________________
From: Lazar, Lijo <Lijo.Lazar@amd.com>
Sent: Tuesday, April 12, 2022 12:19 AM
To: Powell, Darren <Darren.Powell@amd.com>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Cc: Tuikov, Luben <Luben.Tuikov@amd.com>; Quan, Evan <Evan.Quan@amd.com>; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; pmenzel@molgen.mpg.de <pmenzel@molgen.mpg.de>
Subject: Re: [PATCH 1/1] amdgpu/pm: Clarify documentation of error handling in send_smc_mesg
On 4/12/2022 9:38 AM, Darren Powell wrote:
> Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
> cases exist where messages are silently dropped with no error returned
> to the caller. These cases occur in unusual situations where either:
> 1. the message target is a virtual GPU, or
This is not fully correct - only messages which are not valid for
virtual GPU are dropped, not all.
Thanks,
Lijo
> 2. a PCI recovery is underway and the HW is not yet in sync with the SW
>
> For more details see
> commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
> commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
>
> (v2)
> Reworked with suggestions from Luben & Paul
>
> Signed-off-by: Darren Powell <darren.powell@amd.com>
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index b8d0c70ff668..8008ae5508e6 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -356,9 +356,11 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
> * completion of the command, and return back a value from the SMU in
> * @read_arg pointer.
> *
> - * Return 0 on success, -errno on error, if we weren't able to send
> - * the message or if the message completed with some kind of
> - * error. See __smu_cmn_reg2errno() for details of the -errno.
> + * Return 0 on success, -errno when a problem is encountered sending
> + * message or receiving reply. If there is a PCI bus recovery or
> + * the destination is a virtual GPU, the message is simply dropped and
> + * success is also returned.
> + * See __smu_cmn_reg2errno() for details of the -errno.
> *
> * If we weren't able to send the message to the SMU, we also print
> * the error to the standard log.
>
> base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7
>
[-- Attachment #2: Type: text/html, Size: 4465 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] amdgpu/pm: Clarify documentation of error handling in send_smc_mesg
2022-04-12 21:09 ` Luben Tuikov
@ 2022-04-13 2:50 ` Powell, Darren
0 siblings, 0 replies; 10+ messages in thread
From: Powell, Darren @ 2022-04-13 2:50 UTC (permalink / raw)
To: Tuikov, Luben, amd-gfx; +Cc: pmenzel, Quan, Evan, Grodzovsky, Andrey
[-- Attachment #1: Type: text/plain, Size: 2798 bytes --]
[AMD Official Use Only]
Yes, it looks like I was a little snippy writing that intro, will lighten the grammar.
Thanks
Darren
________________________________
From: Tuikov, Luben <Luben.Tuikov@amd.com>
Sent: Tuesday, April 12, 2022 5:09 PM
To: Powell, Darren <Darren.Powell@amd.com>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Cc: Quan, Evan <Evan.Quan@amd.com>; pmenzel@molgen.mpg.de <pmenzel@molgen.mpg.de>; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Subject: Re: [PATCH 1/1] amdgpu/pm: Clarify documentation of error handling in send_smc_mesg
I suppose I didn't quite register this on a first read:
On 2022-04-12 00:08, Darren Powell wrote:
> Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
I'd just say
Clarify the documentation to also mention that we drop
messages and return success in the following two cases:
1. ...
That "Contrary to the ..." sounds a bit harsh.
Regards,
Luben
> cases exist where messages are silently dropped with no error returned
> to the caller. These cases occur in unusual situations where either:
> 1. the message target is a virtual GPU, or
> 2. a PCI recovery is underway and the HW is not yet in sync with the SW
>
> For more details see
> commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
> commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
>
> (v2)
> Reworked with suggestions from Luben & Paul
>
> Signed-off-by: Darren Powell <darren.powell@amd.com>
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index b8d0c70ff668..8008ae5508e6 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -356,9 +356,11 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
> * completion of the command, and return back a value from the SMU in
> * @read_arg pointer.
> *
> - * Return 0 on success, -errno on error, if we weren't able to send
> - * the message or if the message completed with some kind of
> - * error. See __smu_cmn_reg2errno() for details of the -errno.
> + * Return 0 on success, -errno when a problem is encountered sending
> + * message or receiving reply. If there is a PCI bus recovery or
> + * the destination is a virtual GPU, the message is simply dropped and
> + * success is also returned.
> + * See __smu_cmn_reg2errno() for details of the -errno.
> *
> * If we weren't able to send the message to the SMU, we also print
> * the error to the standard log.
>
> base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7
Regards,
--
Luben
[-- Attachment #2: Type: text/html, Size: 4400 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] amdgpu/pm: Clarify Documentation of error handling in send_smc_mesg
2022-04-08 6:29 ` Paul Menzel
@ 2022-04-08 20:55 ` Powell, Darren
0 siblings, 0 replies; 10+ messages in thread
From: Powell, Darren @ 2022-04-08 20:55 UTC (permalink / raw)
To: Paul Menzel; +Cc: Grodzovsky, Andrey, Tuikov, Luben, Quan, Evan, amd-gfx
[-- Attachment #1: Type: text/plain, Size: 2861 bytes --]
[AMD Official Use Only]
will respin and incorporate your suggestions
Thanks,
Darren
________________________________
From: Paul Menzel <pmenzel@molgen.mpg.de>
Sent: Friday, April 8, 2022 2:29 AM
To: Powell, Darren <Darren.Powell@amd.com>
Cc: amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; Tuikov, Luben <Luben.Tuikov@amd.com>; Quan, Evan <Evan.Quan@amd.com>; wenhui.sheng@amd.com <wenhui.sheng@amd.com>; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Subject: Re: [PATCH 1/1] amdgpu/pm: Clarify Documentation of error handling in send_smc_mesg
Dear Darren,
Thank you for your patch.
Am 08.04.22 um 04:26 schrieb Darren Powell:
> Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
> cases exist where messages are silently dropped with no error returned
> to the caller. These cases occur in unusual situations where either:
> 1. the caller is a virtual GPU, or
> 2. a PCI recovery is underway and the HW is not yet in sync with the SW
>
> For more details see
> commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
> commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
Please remove the indentation. If you re-rolled the patch, you could
also spell *documentation* lowercase in the commit message summary.
> Signed-off-by: Darren Powell <darren.powell@amd.com>
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index b8d0c70ff668..b1bd1990c88b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -356,12 +356,15 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
> * completion of the command, and return back a value from the SMU in
> * @read_arg pointer.
> *
> - * Return 0 on success, -errno on error, if we weren't able to send
> + * Return 0 on success, or if the message is dropped.
> + * On error, -errno is returned if we weren't able to send
> * the message or if the message completed with some kind of
> * error. See __smu_cmn_reg2errno() for details of the -errno.
> *
> * If we weren't able to send the message to the SMU, we also print
> - * the error to the standard log.
> + * the error to the standard log. Dropped messages can be caused
> + * due to PCI slot recovery or attempting to send from a virtual GPU,
> + * and do not print an error.
> *
> * Command completion status is printed only if the -errno is
> * -EREMOTEIO, indicating that the SMU returned back an
>
> base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7
The diff looks good – despite Mozilla Thunderbird quoting it strangely.
Kind regards,
Paul
[-- Attachment #2: Type: text/html, Size: 4484 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] amdgpu/pm: Clarify Documentation of error handling in send_smc_mesg
2022-04-08 13:33 ` Luben Tuikov
@ 2022-04-08 20:54 ` Powell, Darren
0 siblings, 0 replies; 10+ messages in thread
From: Powell, Darren @ 2022-04-08 20:54 UTC (permalink / raw)
To: Tuikov, Luben, amd-gfx; +Cc: Grodzovsky, Andrey, Quan, Evan
[-- Attachment #1: Type: text/plain, Size: 3068 bytes --]
[AMD Official Use Only]
inline
________________________________
From: Tuikov, Luben <Luben.Tuikov@amd.com>
Sent: Friday, April 8, 2022 9:33 AM
To: Powell, Darren <Darren.Powell@amd.com>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Cc: Quan, Evan <Evan.Quan@amd.com>; Wenhui.Sheng@amd.com <Wenhui.Sheng@amd.com>; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Subject: Re: [PATCH 1/1] amdgpu/pm: Clarify Documentation of error handling in send_smc_mesg
I'd add who and how is the message dropped, and also mention that we're unable
to recognize a dropped message.
On 2022-04-07 22:26, Darren Powell wrote:
> Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
> cases exist where messages are silently dropped with no error returned
> to the caller. These cases occur in unusual situations where either:
> 1. the caller is a virtual GPU, or
The caller? Isn't this code executed on a CPU sending to the SMU (which lives on a GPU)?
[DP] Great point, will fix
> 2. a PCI recovery is underway and the HW is not yet in sync with the SW
>
> For more details see
> commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
> commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
>
> Signed-off-by: Darren Powell <darren.powell@amd.com>
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index b8d0c70ff668..b1bd1990c88b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -356,12 +356,15 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
> * completion of the command, and return back a value from the SMU in
> * @read_arg pointer.
> *
> - * Return 0 on success, -errno on error, if we weren't able to send
> + * Return 0 on success, or if the message is dropped.
> + * On error, -errno is returned if we weren't able to send
Something like this:
Return 0 on success, -errno on error. If the message was dropped
due to PCI bus recovery or sending to a virtual GPU, we're unable
to detect this and success is also returned.
> * the message or if the message completed with some kind of
> * error. See __smu_cmn_reg2errno() for details of the -errno.
> *
> * If we weren't able to send the message to the SMU, we also print
> - * the error to the standard log.
> + * the error to the standard log. Dropped messages can be caused
> + * due to PCI slot recovery or attempting to send from a virtual GPU,
> + * and do not print an error.
This is a moot point with the clarification I suggested above and I'd remove that.
[DP] sounds more succinct, will address in v2
> *
> * Command completion status is printed only if the -errno is
> * -EREMOTEIO, indicating that the SMU returned back an
>
> base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7
Regards,
--
Luben
[-- Attachment #2: Type: text/html, Size: 4664 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] amdgpu/pm: Clarify Documentation of error handling in send_smc_mesg
2022-04-08 2:26 [PATCH 1/1] amdgpu/pm: Clarify Documentation " Darren Powell
2022-04-08 6:29 ` Paul Menzel
@ 2022-04-08 13:33 ` Luben Tuikov
2022-04-08 20:54 ` Powell, Darren
1 sibling, 1 reply; 10+ messages in thread
From: Luben Tuikov @ 2022-04-08 13:33 UTC (permalink / raw)
To: Darren Powell, amd-gfx; +Cc: andrey.grodzovsky, evan.quan, Wenhui.Sheng
I'd add who and how is the message dropped, and also mention that we're unable
to recognize a dropped message.
On 2022-04-07 22:26, Darren Powell wrote:
> Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
> cases exist where messages are silently dropped with no error returned
> to the caller. These cases occur in unusual situations where either:
> 1. the caller is a virtual GPU, or
The caller? Isn't this code executed on a CPU sending to the SMU (which lives on a GPU)?
> 2. a PCI recovery is underway and the HW is not yet in sync with the SW
>
> For more details see
> commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
> commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
>
> Signed-off-by: Darren Powell <darren.powell@amd.com>
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index b8d0c70ff668..b1bd1990c88b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -356,12 +356,15 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
> * completion of the command, and return back a value from the SMU in
> * @read_arg pointer.
> *
> - * Return 0 on success, -errno on error, if we weren't able to send
> + * Return 0 on success, or if the message is dropped.
> + * On error, -errno is returned if we weren't able to send
Something like this:
Return 0 on success, -errno on error. If the message was dropped
due to PCI bus recovery or sending to a virtual GPU, we're unable
to detect this and success is also returned.
> * the message or if the message completed with some kind of
> * error. See __smu_cmn_reg2errno() for details of the -errno.
> *
> * If we weren't able to send the message to the SMU, we also print
> - * the error to the standard log.
> + * the error to the standard log. Dropped messages can be caused
> + * due to PCI slot recovery or attempting to send from a virtual GPU,
> + * and do not print an error.
This is a moot point with the clarification I suggested above and I'd remove that.
> *
> * Command completion status is printed only if the -errno is
> * -EREMOTEIO, indicating that the SMU returned back an
>
> base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7
Regards,
--
Luben
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/1] amdgpu/pm: Clarify Documentation of error handling in send_smc_mesg
2022-04-08 2:26 [PATCH 1/1] amdgpu/pm: Clarify Documentation " Darren Powell
@ 2022-04-08 6:29 ` Paul Menzel
2022-04-08 20:55 ` Powell, Darren
2022-04-08 13:33 ` Luben Tuikov
1 sibling, 1 reply; 10+ messages in thread
From: Paul Menzel @ 2022-04-08 6:29 UTC (permalink / raw)
To: Darren Powell
Cc: andrey.grodzovsky, luben.tuikov, evan.quan, amd-gfx, wenhui.sheng
Dear Darren,
Thank you for your patch.
Am 08.04.22 um 04:26 schrieb Darren Powell:
> Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
> cases exist where messages are silently dropped with no error returned
> to the caller. These cases occur in unusual situations where either:
> 1. the caller is a virtual GPU, or
> 2. a PCI recovery is underway and the HW is not yet in sync with the SW
>
> For more details see
> commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
> commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
Please remove the indentation. If you re-rolled the patch, you could
also spell *documentation* lowercase in the commit message summary.
> Signed-off-by: Darren Powell <darren.powell@amd.com>
> ---
> drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> index b8d0c70ff668..b1bd1990c88b 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
> @@ -356,12 +356,15 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
> * completion of the command, and return back a value from the SMU in
> * @read_arg pointer.
> *
> - * Return 0 on success, -errno on error, if we weren't able to send
> + * Return 0 on success, or if the message is dropped.
> + * On error, -errno is returned if we weren't able to send
> * the message or if the message completed with some kind of
> * error. See __smu_cmn_reg2errno() for details of the -errno.
> *
> * If we weren't able to send the message to the SMU, we also print
> - * the error to the standard log.
> + * the error to the standard log. Dropped messages can be caused
> + * due to PCI slot recovery or attempting to send from a virtual GPU,
> + * and do not print an error.
> *
> * Command completion status is printed only if the -errno is
> * -EREMOTEIO, indicating that the SMU returned back an
>
> base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7
The diff looks good – despite Mozilla Thunderbird quoting it strangely.
Kind regards,
Paul
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/1] amdgpu/pm: Clarify Documentation of error handling in send_smc_mesg
@ 2022-04-08 2:26 Darren Powell
2022-04-08 6:29 ` Paul Menzel
2022-04-08 13:33 ` Luben Tuikov
0 siblings, 2 replies; 10+ messages in thread
From: Darren Powell @ 2022-04-08 2:26 UTC (permalink / raw)
To: amd-gfx
Cc: luben.tuikov, evan.quan, Darren Powell, Wenhui.Sheng, andrey.grodzovsky
Contrary to the smu_cmn_send_smc_msg_with_param documentation, two
cases exist where messages are silently dropped with no error returned
to the caller. These cases occur in unusual situations where either:
1. the caller is a virtual GPU, or
2. a PCI recovery is underway and the HW is not yet in sync with the SW
For more details see
commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")
Signed-off-by: Darren Powell <darren.powell@amd.com>
---
drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index b8d0c70ff668..b1bd1990c88b 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -356,12 +356,15 @@ int smu_cmn_wait_for_response(struct smu_context *smu)
* completion of the command, and return back a value from the SMU in
* @read_arg pointer.
*
- * Return 0 on success, -errno on error, if we weren't able to send
+ * Return 0 on success, or if the message is dropped.
+ * On error, -errno is returned if we weren't able to send
* the message or if the message completed with some kind of
* error. See __smu_cmn_reg2errno() for details of the -errno.
*
* If we weren't able to send the message to the SMU, we also print
- * the error to the standard log.
+ * the error to the standard log. Dropped messages can be caused
+ * due to PCI slot recovery or attempting to send from a virtual GPU,
+ * and do not print an error.
*
* Command completion status is printed only if the -errno is
* -EREMOTEIO, indicating that the SMU returned back an
base-commit: 4585c45a6a66cb17cc97f4370457503746e540b7
--
2.35.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
end of thread, other threads:[~2022-04-13 2:50 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-12 4:08 [PATCH 1/1] amdgpu/pm: Clarify documentation of error handling in send_smc_mesg Darren Powell
2022-04-12 4:19 ` Lazar, Lijo
2022-04-13 2:48 ` Powell, Darren
2022-04-12 21:09 ` Luben Tuikov
2022-04-13 2:50 ` Powell, Darren
-- strict thread matches above, loose matches on Subject: below --
2022-04-08 2:26 [PATCH 1/1] amdgpu/pm: Clarify Documentation " Darren Powell
2022-04-08 6:29 ` Paul Menzel
2022-04-08 20:55 ` Powell, Darren
2022-04-08 13:33 ` Luben Tuikov
2022-04-08 20:54 ` Powell, Darren
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.