linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
@ 2020-03-27 13:28 Sai Prakash Ranjan
  2020-03-27 14:12 ` Robin Murphy
  0 siblings, 1 reply; 15+ messages in thread
From: Sai Prakash Ranjan @ 2020-03-27 13:28 UTC (permalink / raw)
  To: Robin Murphy, Will Deacon, Joerg Roedel, iommu, Douglas Anderson
  Cc: linux-arm-msm, linux-kernel, Sai Prakash Ranjan

Currently on reboot/shutdown, the following messages are
displayed on the console as error messages before the
system reboots/shutdown.

On SC7180:

  arm-smmu 15000000.iommu: removing device with active domains!
  arm-smmu 5040000.iommu: removing device with active domains!

Demote the log level to debug since it does not offer much
help in identifying/fixing any issue as the system is anyways
going down and reduce spamming the kernel log.

Reported-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
---
 drivers/iommu/arm-smmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 16c4b87af42b..0a865e32054a 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -2250,7 +2250,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
 		return -ENODEV;
 
 	if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
-		dev_err(&pdev->dev, "removing device with active domains!\n");
+		dev_dbg(&pdev->dev, "removing device with active domains!\n");
 
 	arm_smmu_bus_init(NULL);
 	iommu_device_unregister(&smmu->iommu);
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-03-27 13:28 [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback Sai Prakash Ranjan
@ 2020-03-27 14:12 ` Robin Murphy
  2020-03-27 15:09   ` Sai Prakash Ranjan
  0 siblings, 1 reply; 15+ messages in thread
From: Robin Murphy @ 2020-03-27 14:12 UTC (permalink / raw)
  To: Sai Prakash Ranjan, Will Deacon, Joerg Roedel, iommu, Douglas Anderson
  Cc: linux-arm-msm, linux-kernel

On 2020-03-27 1:28 pm, Sai Prakash Ranjan wrote:
> Currently on reboot/shutdown, the following messages are
> displayed on the console as error messages before the
> system reboots/shutdown.
> 
> On SC7180:
> 
>    arm-smmu 15000000.iommu: removing device with active domains!
>    arm-smmu 5040000.iommu: removing device with active domains!
> 
> Demote the log level to debug since it does not offer much
> help in identifying/fixing any issue as the system is anyways
> going down and reduce spamming the kernel log.

I've gone back and forth on this pretty much ever since we added the 
shutdown hook - on the other hand, if any devices *are* still running in 
those domains at this point, then once we turn off the SMMU and let 
those IOVAs go out on the bus as physical addresses, all manner of 
weirdness may ensue. Thus there is an argument for *some* indication 
that this may happen, although IMO it could be downgraded to at least 
dev_warn().

Robin.

> Reported-by: Douglas Anderson <dianders@chromium.org>
> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> ---
>   drivers/iommu/arm-smmu.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 16c4b87af42b..0a865e32054a 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -2250,7 +2250,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev)
>   		return -ENODEV;
>   
>   	if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
> -		dev_err(&pdev->dev, "removing device with active domains!\n");
> +		dev_dbg(&pdev->dev, "removing device with active domains!\n");
>   
>   	arm_smmu_bus_init(NULL);
>   	iommu_device_unregister(&smmu->iommu);
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-03-27 14:12 ` Robin Murphy
@ 2020-03-27 15:09   ` Sai Prakash Ranjan
  2020-03-27 16:17     ` Rob Clark
  2020-03-27 19:02     ` Robin Murphy
  0 siblings, 2 replies; 15+ messages in thread
From: Sai Prakash Ranjan @ 2020-03-27 15:09 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Will Deacon, Joerg Roedel, iommu, Douglas Anderson,
	linux-arm-msm, linux-kernel

Hi Robin,

Thanks for taking a look at this.

On 2020-03-27 19:42, Robin Murphy wrote:
> On 2020-03-27 1:28 pm, Sai Prakash Ranjan wrote:
>> Currently on reboot/shutdown, the following messages are
>> displayed on the console as error messages before the
>> system reboots/shutdown.
>> 
>> On SC7180:
>> 
>>    arm-smmu 15000000.iommu: removing device with active domains!
>>    arm-smmu 5040000.iommu: removing device with active domains!
>> 
>> Demote the log level to debug since it does not offer much
>> help in identifying/fixing any issue as the system is anyways
>> going down and reduce spamming the kernel log.
> 
> I've gone back and forth on this pretty much ever since we added the
> shutdown hook - on the other hand, if any devices *are* still running
> in those domains at this point, then once we turn off the SMMU and let
> those IOVAs go out on the bus as physical addresses, all manner of
> weirdness may ensue. Thus there is an argument for *some* indication
> that this may happen, although IMO it could be downgraded to at least
> dev_warn().
> 

Any pointers to the weirdness here after SMMU is turned off?
Because if we look at the call sites, device_shutdown is called
from kernel_restart_prepare or kernel_shutdown_prepare which would
mean system is going down anyways, so do we really care about these
error messages or warnings from SMMU?

  arm_smmu_device_shutdown
   platform_drv_shutdown
    device_shutdown
     kernel_restart_prepare
      kernel_restart


Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member
of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-03-27 15:09   ` Sai Prakash Ranjan
@ 2020-03-27 16:17     ` Rob Clark
  2020-03-27 18:18       ` Sai Prakash Ranjan
  2020-03-27 19:02     ` Robin Murphy
  1 sibling, 1 reply; 15+ messages in thread
From: Rob Clark @ 2020-03-27 16:17 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Robin Murphy, linux-arm-msm, Linux Kernel Mailing List,
	Douglas Anderson,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,,
	Will Deacon

On Fri, Mar 27, 2020 at 8:10 AM Sai Prakash Ranjan
<saiprakash.ranjan@codeaurora.org> wrote:
>
> Hi Robin,
>
> Thanks for taking a look at this.
>
> On 2020-03-27 19:42, Robin Murphy wrote:
> > On 2020-03-27 1:28 pm, Sai Prakash Ranjan wrote:
> >> Currently on reboot/shutdown, the following messages are
> >> displayed on the console as error messages before the
> >> system reboots/shutdown.
> >>
> >> On SC7180:
> >>
> >>    arm-smmu 15000000.iommu: removing device with active domains!
> >>    arm-smmu 5040000.iommu: removing device with active domains!
> >>
> >> Demote the log level to debug since it does not offer much
> >> help in identifying/fixing any issue as the system is anyways
> >> going down and reduce spamming the kernel log.
> >
> > I've gone back and forth on this pretty much ever since we added the
> > shutdown hook - on the other hand, if any devices *are* still running
> > in those domains at this point, then once we turn off the SMMU and let
> > those IOVAs go out on the bus as physical addresses, all manner of
> > weirdness may ensue. Thus there is an argument for *some* indication
> > that this may happen, although IMO it could be downgraded to at least
> > dev_warn().
> >
>
> Any pointers to the weirdness here after SMMU is turned off?
> Because if we look at the call sites, device_shutdown is called
> from kernel_restart_prepare or kernel_shutdown_prepare which would
> mean system is going down anyways, so do we really care about these
> error messages or warnings from SMMU?
>
>   arm_smmu_device_shutdown
>    platform_drv_shutdown
>     device_shutdown
>      kernel_restart_prepare
>       kernel_restart
>

I'd guess that drm/msm is not detaching all of it's UNMANAGED domains
in shutdown.  Although *presumably* the device_link stuff would
prevent the SMMU from shutting down while gpu/display is still active?
 If not I think we have bigger problems.

I hadn't really noticed the error msgs before, not sure if that is
just because the screen is off by the time they happen or if they are
a new warning..

BR,
-R

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-03-27 16:17     ` Rob Clark
@ 2020-03-27 18:18       ` Sai Prakash Ranjan
  0 siblings, 0 replies; 15+ messages in thread
From: Sai Prakash Ranjan @ 2020-03-27 18:18 UTC (permalink / raw)
  To: Rob Clark
  Cc: Robin Murphy, linux-arm-msm, Linux Kernel Mailing List,
	Douglas Anderson, list@263.net:IOMMU DRIVERS ,
	Joerg Roedel <joro@8bytes.org>,,
	Will Deacon

Hi Rob,

On 2020-03-27 21:47, Rob Clark wrote:
> On Fri, Mar 27, 2020 at 8:10 AM Sai Prakash Ranjan
> <saiprakash.ranjan@codeaurora.org> wrote:
>> 
>> Hi Robin,
>> 
>> Thanks for taking a look at this.
>> 
>> On 2020-03-27 19:42, Robin Murphy wrote:
>> > On 2020-03-27 1:28 pm, Sai Prakash Ranjan wrote:
>> >> Currently on reboot/shutdown, the following messages are
>> >> displayed on the console as error messages before the
>> >> system reboots/shutdown.
>> >>
>> >> On SC7180:
>> >>
>> >>    arm-smmu 15000000.iommu: removing device with active domains!
>> >>    arm-smmu 5040000.iommu: removing device with active domains!
>> >>
>> >> Demote the log level to debug since it does not offer much
>> >> help in identifying/fixing any issue as the system is anyways
>> >> going down and reduce spamming the kernel log.
>> >
>> > I've gone back and forth on this pretty much ever since we added the
>> > shutdown hook - on the other hand, if any devices *are* still running
>> > in those domains at this point, then once we turn off the SMMU and let
>> > those IOVAs go out on the bus as physical addresses, all manner of
>> > weirdness may ensue. Thus there is an argument for *some* indication
>> > that this may happen, although IMO it could be downgraded to at least
>> > dev_warn().
>> >
>> 
>> Any pointers to the weirdness here after SMMU is turned off?
>> Because if we look at the call sites, device_shutdown is called
>> from kernel_restart_prepare or kernel_shutdown_prepare which would
>> mean system is going down anyways, so do we really care about these
>> error messages or warnings from SMMU?
>> 
>>   arm_smmu_device_shutdown
>>    platform_drv_shutdown
>>     device_shutdown
>>      kernel_restart_prepare
>>       kernel_restart
>> 
> 
> I'd guess that drm/msm is not detaching all of it's UNMANAGED domains
> in shutdown.  Although *presumably* the device_link stuff would
> prevent the SMMU from shutting down while gpu/display is still active?
>  If not I think we have bigger problems.
> 

Where is the shutdown callback in drm/msm? I don't see any...

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member
of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-03-27 15:09   ` Sai Prakash Ranjan
  2020-03-27 16:17     ` Rob Clark
@ 2020-03-27 19:02     ` Robin Murphy
  2020-03-28  7:35       ` Sai Prakash Ranjan
  1 sibling, 1 reply; 15+ messages in thread
From: Robin Murphy @ 2020-03-27 19:02 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Will Deacon, Joerg Roedel, iommu, Douglas Anderson,
	linux-arm-msm, linux-kernel

On 2020-03-27 3:09 pm, Sai Prakash Ranjan wrote:
> Hi Robin,
> 
> Thanks for taking a look at this.
> 
> On 2020-03-27 19:42, Robin Murphy wrote:
>> On 2020-03-27 1:28 pm, Sai Prakash Ranjan wrote:
>>> Currently on reboot/shutdown, the following messages are
>>> displayed on the console as error messages before the
>>> system reboots/shutdown.
>>>
>>> On SC7180:
>>>
>>>    arm-smmu 15000000.iommu: removing device with active domains!
>>>    arm-smmu 5040000.iommu: removing device with active domains!
>>>
>>> Demote the log level to debug since it does not offer much
>>> help in identifying/fixing any issue as the system is anyways
>>> going down and reduce spamming the kernel log.
>>
>> I've gone back and forth on this pretty much ever since we added the
>> shutdown hook - on the other hand, if any devices *are* still running
>> in those domains at this point, then once we turn off the SMMU and let
>> those IOVAs go out on the bus as physical addresses, all manner of
>> weirdness may ensue. Thus there is an argument for *some* indication
>> that this may happen, although IMO it could be downgraded to at least
>> dev_warn().
>>
> 
> Any pointers to the weirdness here after SMMU is turned off?
> Because if we look at the call sites, device_shutdown is called
> from kernel_restart_prepare or kernel_shutdown_prepare which would
> mean system is going down anyways, so do we really care about these
> error messages or warnings from SMMU?
> 
>   arm_smmu_device_shutdown
>    platform_drv_shutdown
>     device_shutdown
>      kernel_restart_prepare
>       kernel_restart

Imagine your network driver doesn't implement a .shutdown method (so the 
hardware is still active regardless of device links), happens to have an 
Rx buffer or descriptor ring DMA-mapped at an IOVA that looks like the 
physical address of the memory containing some part of the kernel text 
lower down that call stack, and the MAC receives a broadcast IP packet 
at about the point arm_smmu_device_shutdown() is returning. Enjoy 
debugging that ;)

And if coincidental memory corruption seems too far-fetched for your 
liking, other fun alternatives might include "display tries to scan out 
from powered-off device, deadlocks interconnect and prevents anything 
else making progress", or "access to TZC-protected physical address 
triggers interrupt and over-eager Secure firmware resets system before 
orderly poweroff has a chance to finish".

Of course the fact that in practice we'll *always* see the warning 
because there's no way to tear down the default DMA domains, and even if 
all devices *have* been nicely quiesced there's no way to tell, is 
certainly less than ideal. Like I say, it's not entirely clear-cut 
either way...

Robin.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-03-27 19:02     ` Robin Murphy
@ 2020-03-28  7:35       ` Sai Prakash Ranjan
  2020-03-30 18:24         ` Doug Anderson
  0 siblings, 1 reply; 15+ messages in thread
From: Sai Prakash Ranjan @ 2020-03-28  7:35 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Will Deacon, Joerg Roedel, iommu, Douglas Anderson,
	linux-arm-msm, linux-kernel

Hi Robin,

On 2020-03-28 00:32, Robin Murphy wrote:
> On 2020-03-27 3:09 pm, Sai Prakash Ranjan wrote:
> 
> Imagine your network driver doesn't implement a .shutdown method (so
> the hardware is still active regardless of device links), happens to
> have an Rx buffer or descriptor ring DMA-mapped at an IOVA that looks
> like the physical address of the memory containing some part of the
> kernel text lower down that call stack, and the MAC receives a
> broadcast IP packet at about the point arm_smmu_device_shutdown() is
> returning. Enjoy debugging that ;)
> 
> And if coincidental memory corruption seems too far-fetched for your
> liking, other fun alternatives might include "display tries to scan
> out from powered-off device, deadlocks interconnect and prevents
> anything else making progress", or "access to TZC-protected physical
> address triggers interrupt and over-eager Secure firmware resets
> system before orderly poweroff has a chance to finish".
> 
> Of course the fact that in practice we'll *always* see the warning
> because there's no way to tear down the default DMA domains, and even
> if all devices *have* been nicely quiesced there's no way to tell, is
> certainly less than ideal. Like I say, it's not entirely clear-cut
> either way...
> 

Thanks for these examples, good to know these scenarios in case we come 
across these.
However, if we see these error/warning messages appear everytime then 
what will be
the credibility of these messages? We will just ignore these messages 
when
these issues you mention actually appears because we see them everytime 
on
reboot or shutdown. So doesn't it make sense to enable these only when
we are debugging? We could argue that how will we know the issue could 
be related
to SMMU, but that's the case even now.

The reason why this came up was that, we had a NOC(interconnect) error 
which does
have a logging atleast in QCOM platforms from the secure side(it prints 
these on the console)
after the SMMU err messages and there was a confusion if it was related 
to these messages.
However, NOC error messages did identify the issue with the USB and it 
was solved later.
So these SMMU err/warning messages could be misleading like the above 
case almost everytime.

The probability of the issues you mentioned occuring is very less than 
the actual reboot,
shutdown scenarios, for ex: we run reboot stress test for thousands of 
times and these messages
don't add anything special in those cases when any issue occurs because 
they are seen
everytime.

Thanks,
Sai
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member
of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-03-28  7:35       ` Sai Prakash Ranjan
@ 2020-03-30 18:24         ` Doug Anderson
  2020-03-31  7:36           ` Sai Prakash Ranjan
  0 siblings, 1 reply; 15+ messages in thread
From: Doug Anderson @ 2020-03-30 18:24 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Robin Murphy, Will Deacon, Joerg Roedel,
	list@263.net:IOMMU DRIVERS
	<iommu@lists.linux-foundation.org>,
	Joerg Roedel <joro@8bytes.org>,,
	linux-arm-msm, LKML

Hi,

On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan
<saiprakash.ranjan@codeaurora.org> wrote:
>
> > Of course the fact that in practice we'll *always* see the warning
> > because there's no way to tear down the default DMA domains, and even
> > if all devices *have* been nicely quiesced there's no way to tell, is
> > certainly less than ideal. Like I say, it's not entirely clear-cut
> > either way...
> >
>
> Thanks for these examples, good to know these scenarios in case we come
> across these.
> However, if we see these error/warning messages appear everytime then
> what will be
> the credibility of these messages? We will just ignore these messages
> when
> these issues you mention actually appears because we see them everytime
> on
> reboot or shutdown.

I would agree that if these messages are expected to be seen every
time, there's no way to fix them, and they're not indicative of any
problem then something should be done.  Seeing something printed at
"dev_error" level with an exclamation point (!) at the end makes me
feel like this is something that needs immediate action on my part.

If we really can't do better but feel that the messages need to be
there, at least make them dev_info and less scary like:

  arm-smmu 15000000.iommu: turning off; DMA should be quiesced before now

...that would still give you a hint in the logs that if you saw a DMA
transaction after the message that it was a bug but also wouldn't
sound scary to someone who wasn't seeing any other problems.

-Doug

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-03-30 18:24         ` Doug Anderson
@ 2020-03-31  7:36           ` Sai Prakash Ranjan
  2020-03-31  7:44             ` Will Deacon
  0 siblings, 1 reply; 15+ messages in thread
From: Sai Prakash Ranjan @ 2020-03-31  7:36 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Robin Murphy, Will Deacon, Joerg Roedel,
	list@263.net:IOMMU DRIVERS ,
	Joerg Roedel <joro@8bytes.org>,,
	linux-arm-msm, LKML

Hi,

On 2020-03-30 23:54, Doug Anderson wrote:
> Hi,
> 
> On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan
> <saiprakash.ranjan@codeaurora.org> wrote:
>> 
>> > Of course the fact that in practice we'll *always* see the warning
>> > because there's no way to tear down the default DMA domains, and even
>> > if all devices *have* been nicely quiesced there's no way to tell, is
>> > certainly less than ideal. Like I say, it's not entirely clear-cut
>> > either way...
>> >
>> 
>> Thanks for these examples, good to know these scenarios in case we 
>> come
>> across these.
>> However, if we see these error/warning messages appear everytime then
>> what will be
>> the credibility of these messages? We will just ignore these messages
>> when
>> these issues you mention actually appears because we see them 
>> everytime
>> on
>> reboot or shutdown.
> 
> I would agree that if these messages are expected to be seen every
> time, there's no way to fix them, and they're not indicative of any
> problem then something should be done.  Seeing something printed at
> "dev_error" level with an exclamation point (!) at the end makes me
> feel like this is something that needs immediate action on my part.
> 
> If we really can't do better but feel that the messages need to be
> there, at least make them dev_info and less scary like:
> 
>   arm-smmu 15000000.iommu: turning off; DMA should be quiesced before 
> now
> 
> ...that would still give you a hint in the logs that if you saw a DMA
> transaction after the message that it was a bug but also wouldn't
> sound scary to someone who wasn't seeing any other problems.
> 

We can do this if Robin is OK?

-Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member
of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-03-31  7:36           ` Sai Prakash Ranjan
@ 2020-03-31  7:44             ` Will Deacon
  2020-03-31  7:53               ` Sai Prakash Ranjan
  0 siblings, 1 reply; 15+ messages in thread
From: Will Deacon @ 2020-03-31  7:44 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Doug Anderson, Robin Murphy, Joerg Roedel,
	list@263.net:IOMMU DRIVERS ,
	Joerg Roedel <joro@8bytes.org>,,
	linux-arm-msm, LKML

On Tue, Mar 31, 2020 at 01:06:11PM +0530, Sai Prakash Ranjan wrote:
> On 2020-03-30 23:54, Doug Anderson wrote:
> > On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan
> > <saiprakash.ranjan@codeaurora.org> wrote:
> > > 
> > > > Of course the fact that in practice we'll *always* see the warning
> > > > because there's no way to tear down the default DMA domains, and even
> > > > if all devices *have* been nicely quiesced there's no way to tell, is
> > > > certainly less than ideal. Like I say, it's not entirely clear-cut
> > > > either way...
> > > >
> > > 
> > > Thanks for these examples, good to know these scenarios in case we
> > > come
> > > across these.
> > > However, if we see these error/warning messages appear everytime then
> > > what will be
> > > the credibility of these messages? We will just ignore these messages
> > > when
> > > these issues you mention actually appears because we see them
> > > everytime
> > > on
> > > reboot or shutdown.
> > 
> > I would agree that if these messages are expected to be seen every
> > time, there's no way to fix them, and they're not indicative of any
> > problem then something should be done.  Seeing something printed at
> > "dev_error" level with an exclamation point (!) at the end makes me
> > feel like this is something that needs immediate action on my part.
> > 
> > If we really can't do better but feel that the messages need to be
> > there, at least make them dev_info and less scary like:
> > 
> >   arm-smmu 15000000.iommu: turning off; DMA should be quiesced before
> > now
> > 
> > ...that would still give you a hint in the logs that if you saw a DMA
> > transaction after the message that it was a bug but also wouldn't
> > sound scary to someone who wasn't seeing any other problems.
> > 
> 
> We can do this if Robin is OK?

It would be nice if you could figure out which domains are still live when
the SMMU is being shut down in your case and verify that it *is* infact
benign before we start making the message more friendly. As Robin said
earlier, rogue DMA is a real nightmare to debug.

Will

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-03-31  7:44             ` Will Deacon
@ 2020-03-31  7:53               ` Sai Prakash Ranjan
  2020-04-22 19:49                 ` Doug Anderson
  0 siblings, 1 reply; 15+ messages in thread
From: Sai Prakash Ranjan @ 2020-03-31  7:53 UTC (permalink / raw)
  To: Will Deacon
  Cc: Doug Anderson, Robin Murphy, Joerg Roedel,
	list@263.net:IOMMU DRIVERS ,
	Joerg Roedel <joro@8bytes.org>,,
	linux-arm-msm, LKML

Hi Will,

On 2020-03-31 13:14, Will Deacon wrote:
> On Tue, Mar 31, 2020 at 01:06:11PM +0530, Sai Prakash Ranjan wrote:
>> On 2020-03-30 23:54, Doug Anderson wrote:
>> > On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan
>> > <saiprakash.ranjan@codeaurora.org> wrote:
>> > >
>> > > > Of course the fact that in practice we'll *always* see the warning
>> > > > because there's no way to tear down the default DMA domains, and even
>> > > > if all devices *have* been nicely quiesced there's no way to tell, is
>> > > > certainly less than ideal. Like I say, it's not entirely clear-cut
>> > > > either way...
>> > > >
>> > >
>> > > Thanks for these examples, good to know these scenarios in case we
>> > > come
>> > > across these.
>> > > However, if we see these error/warning messages appear everytime then
>> > > what will be
>> > > the credibility of these messages? We will just ignore these messages
>> > > when
>> > > these issues you mention actually appears because we see them
>> > > everytime
>> > > on
>> > > reboot or shutdown.
>> >
>> > I would agree that if these messages are expected to be seen every
>> > time, there's no way to fix them, and they're not indicative of any
>> > problem then something should be done.  Seeing something printed at
>> > "dev_error" level with an exclamation point (!) at the end makes me
>> > feel like this is something that needs immediate action on my part.
>> >
>> > If we really can't do better but feel that the messages need to be
>> > there, at least make them dev_info and less scary like:
>> >
>> >   arm-smmu 15000000.iommu: turning off; DMA should be quiesced before
>> > now
>> >
>> > ...that would still give you a hint in the logs that if you saw a DMA
>> > transaction after the message that it was a bug but also wouldn't
>> > sound scary to someone who wasn't seeing any other problems.
>> >
>> 
>> We can do this if Robin is OK?
> 
> It would be nice if you could figure out which domains are still live 
> when
> the SMMU is being shut down in your case and verify that it *is* infact
> benign before we start making the message more friendly. As Robin said
> earlier, rogue DMA is a real nightmare to debug.
> 

I could see this error message for all the clients of apps_smmu.
I checked manually enabling bypass and removing iommus dt property
for each client of apps_smmu.

Thanks,
Sai

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member
of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-03-31  7:53               ` Sai Prakash Ranjan
@ 2020-04-22 19:49                 ` Doug Anderson
  2020-04-23  8:17                   ` Sai Prakash Ranjan
  0 siblings, 1 reply; 15+ messages in thread
From: Doug Anderson @ 2020-04-22 19:49 UTC (permalink / raw)
  To: Sai Prakash Ranjan
  Cc: Will Deacon, Robin Murphy, Joerg Roedel,
	list@263.net:IOMMU DRIVERS ,
	Joerg Roedel <joro@8bytes.org>,,
	linux-arm-msm, LKML

Hi,

On Tue, Mar 31, 2020 at 12:53 AM Sai Prakash Ranjan
<saiprakash.ranjan@codeaurora.org> wrote:
>
> Hi Will,
>
> On 2020-03-31 13:14, Will Deacon wrote:
> > On Tue, Mar 31, 2020 at 01:06:11PM +0530, Sai Prakash Ranjan wrote:
> >> On 2020-03-30 23:54, Doug Anderson wrote:
> >> > On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan
> >> > <saiprakash.ranjan@codeaurora.org> wrote:
> >> > >
> >> > > > Of course the fact that in practice we'll *always* see the warning
> >> > > > because there's no way to tear down the default DMA domains, and even
> >> > > > if all devices *have* been nicely quiesced there's no way to tell, is
> >> > > > certainly less than ideal. Like I say, it's not entirely clear-cut
> >> > > > either way...
> >> > > >
> >> > >
> >> > > Thanks for these examples, good to know these scenarios in case we
> >> > > come
> >> > > across these.
> >> > > However, if we see these error/warning messages appear everytime then
> >> > > what will be
> >> > > the credibility of these messages? We will just ignore these messages
> >> > > when
> >> > > these issues you mention actually appears because we see them
> >> > > everytime
> >> > > on
> >> > > reboot or shutdown.
> >> >
> >> > I would agree that if these messages are expected to be seen every
> >> > time, there's no way to fix them, and they're not indicative of any
> >> > problem then something should be done.  Seeing something printed at
> >> > "dev_error" level with an exclamation point (!) at the end makes me
> >> > feel like this is something that needs immediate action on my part.
> >> >
> >> > If we really can't do better but feel that the messages need to be
> >> > there, at least make them dev_info and less scary like:
> >> >
> >> >   arm-smmu 15000000.iommu: turning off; DMA should be quiesced before
> >> > now
> >> >
> >> > ...that would still give you a hint in the logs that if you saw a DMA
> >> > transaction after the message that it was a bug but also wouldn't
> >> > sound scary to someone who wasn't seeing any other problems.
> >> >
> >>
> >> We can do this if Robin is OK?
> >
> > It would be nice if you could figure out which domains are still live
> > when
> > the SMMU is being shut down in your case and verify that it *is* infact
> > benign before we start making the message more friendly. As Robin said
> > earlier, rogue DMA is a real nightmare to debug.
> >
>
> I could see this error message for all the clients of apps_smmu.
> I checked manually enabling bypass and removing iommus dt property
> for each client of apps_smmu.

Any update on the status here?  If I'm reading the conversation above,
Robin said: "we'll *always* see the warning because there's no way to
tear down the default DMA domains, and even if all devices *have* been
nicely quiesced there's no way to tell".  Did I understand that
properly?  If so, it seems like it's fully expected to see this
message on every reboot and it doesn't necessarily signify anything
bad.

-Doug

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-04-22 19:49                 ` Doug Anderson
@ 2020-04-23  8:17                   ` Sai Prakash Ranjan
  2020-04-23  9:28                     ` Robin Murphy
  0 siblings, 1 reply; 15+ messages in thread
From: Sai Prakash Ranjan @ 2020-04-23  8:17 UTC (permalink / raw)
  To: Doug Anderson, Will Deacon, Robin Murphy
  Cc: Joerg Roedel, list@263.net:IOMMU DRIVERS ,
	Joerg Roedel <joro@8bytes.org>,,
	linux-arm-msm, LKML, linux-arm-msm-owner

Hi,

On 2020-04-23 01:19, Doug Anderson wrote:
> Hi,
> 
> On Tue, Mar 31, 2020 at 12:53 AM Sai Prakash Ranjan
> <saiprakash.ranjan@codeaurora.org> wrote:
>> 
>> Hi Will,
>> 
>> On 2020-03-31 13:14, Will Deacon wrote:
>> > On Tue, Mar 31, 2020 at 01:06:11PM +0530, Sai Prakash Ranjan wrote:
>> >> On 2020-03-30 23:54, Doug Anderson wrote:
>> >> > On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan
>> >> > <saiprakash.ranjan@codeaurora.org> wrote:
>> >> > >
>> >> > > > Of course the fact that in practice we'll *always* see the warning
>> >> > > > because there's no way to tear down the default DMA domains, and even
>> >> > > > if all devices *have* been nicely quiesced there's no way to tell, is
>> >> > > > certainly less than ideal. Like I say, it's not entirely clear-cut
>> >> > > > either way...
>> >> > > >
>> >> > >
>> >> > > Thanks for these examples, good to know these scenarios in case we
>> >> > > come
>> >> > > across these.
>> >> > > However, if we see these error/warning messages appear everytime then
>> >> > > what will be
>> >> > > the credibility of these messages? We will just ignore these messages
>> >> > > when
>> >> > > these issues you mention actually appears because we see them
>> >> > > everytime
>> >> > > on
>> >> > > reboot or shutdown.
>> >> >
>> >> > I would agree that if these messages are expected to be seen every
>> >> > time, there's no way to fix them, and they're not indicative of any
>> >> > problem then something should be done.  Seeing something printed at
>> >> > "dev_error" level with an exclamation point (!) at the end makes me
>> >> > feel like this is something that needs immediate action on my part.
>> >> >
>> >> > If we really can't do better but feel that the messages need to be
>> >> > there, at least make them dev_info and less scary like:
>> >> >
>> >> >   arm-smmu 15000000.iommu: turning off; DMA should be quiesced before
>> >> > now
>> >> >
>> >> > ...that would still give you a hint in the logs that if you saw a DMA
>> >> > transaction after the message that it was a bug but also wouldn't
>> >> > sound scary to someone who wasn't seeing any other problems.
>> >> >
>> >>
>> >> We can do this if Robin is OK?
>> >
>> > It would be nice if you could figure out which domains are still live
>> > when
>> > the SMMU is being shut down in your case and verify that it *is* infact
>> > benign before we start making the message more friendly. As Robin said
>> > earlier, rogue DMA is a real nightmare to debug.
>> >
>> 
>> I could see this error message for all the clients of apps_smmu.
>> I checked manually enabling bypass and removing iommus dt property
>> for each client of apps_smmu.
> 
> Any update on the status here?  If I'm reading the conversation above,
> Robin said: "we'll *always* see the warning because there's no way to
> tear down the default DMA domains, and even if all devices *have* been
> nicely quiesced there's no way to tell".  Did I understand that
> properly?  If so, it seems like it's fully expected to see this
> message on every reboot and it doesn't necessarily signify anything
> bad.
> 

Understanding is the same, waiting for Will and Robin to check if its OK
to make the message more friendly.

-Sai
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member
of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-04-23  8:17                   ` Sai Prakash Ranjan
@ 2020-04-23  9:28                     ` Robin Murphy
  2020-04-23  9:41                       ` Sai Prakash Ranjan
  0 siblings, 1 reply; 15+ messages in thread
From: Robin Murphy @ 2020-04-23  9:28 UTC (permalink / raw)
  To: Sai Prakash Ranjan, Doug Anderson, Will Deacon
  Cc: linux-arm-msm, linux-arm-msm-owner, Joerg Roedel, LKML, iommu

On 2020-04-23 9:17 am, Sai Prakash Ranjan wrote:
[...]
>> Any update on the status here?  If I'm reading the conversation above,
>> Robin said: "we'll *always* see the warning because there's no way to
>> tear down the default DMA domains, and even if all devices *have* been
>> nicely quiesced there's no way to tell".  Did I understand that
>> properly?  If so, it seems like it's fully expected to see this
>> message on every reboot and it doesn't necessarily signify anything
>> bad.
>>
> 
> Understanding is the same, waiting for Will and Robin to check if its OK
> to make the message more friendly.

The way I see it, we essentially just want *something* visible that will 
correlate with any misbehaviour that *might* result from turning off a 
possibly-live context. How about simply "disabling translation", at 
dev_warn or dev_info level?

Robin.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback
  2020-04-23  9:28                     ` Robin Murphy
@ 2020-04-23  9:41                       ` Sai Prakash Ranjan
  0 siblings, 0 replies; 15+ messages in thread
From: Sai Prakash Ranjan @ 2020-04-23  9:41 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Doug Anderson, Will Deacon, linux-arm-msm, linux-arm-msm-owner,
	Joerg Roedel, LKML, iommu

On 2020-04-23 14:58, Robin Murphy wrote:
> On 2020-04-23 9:17 am, Sai Prakash Ranjan wrote:
> [...]
>>> Any update on the status here?  If I'm reading the conversation 
>>> above,
>>> Robin said: "we'll *always* see the warning because there's no way to
>>> tear down the default DMA domains, and even if all devices *have* 
>>> been
>>> nicely quiesced there's no way to tell".  Did I understand that
>>> properly?  If so, it seems like it's fully expected to see this
>>> message on every reboot and it doesn't necessarily signify anything
>>> bad.
>>> 
>> 
>> Understanding is the same, waiting for Will and Robin to check if its 
>> OK
>> to make the message more friendly.
> 
> The way I see it, we essentially just want *something* visible that
> will correlate with any misbehaviour that *might* result from turning
> off a possibly-live context. How about simply "disabling translation",
> at dev_warn or dev_info level?
> 


Sounds good, I'll go with disabling translation with dev_info.

Thanks,
Sai
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member
of Code Aurora Forum, hosted by The Linux Foundation

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-04-23  9:41 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-27 13:28 [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback Sai Prakash Ranjan
2020-03-27 14:12 ` Robin Murphy
2020-03-27 15:09   ` Sai Prakash Ranjan
2020-03-27 16:17     ` Rob Clark
2020-03-27 18:18       ` Sai Prakash Ranjan
2020-03-27 19:02     ` Robin Murphy
2020-03-28  7:35       ` Sai Prakash Ranjan
2020-03-30 18:24         ` Doug Anderson
2020-03-31  7:36           ` Sai Prakash Ranjan
2020-03-31  7:44             ` Will Deacon
2020-03-31  7:53               ` Sai Prakash Ranjan
2020-04-22 19:49                 ` Doug Anderson
2020-04-23  8:17                   ` Sai Prakash Ranjan
2020-04-23  9:28                     ` Robin Murphy
2020-04-23  9:41                       ` Sai Prakash Ranjan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).