* [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback @ 2020-03-27 13:28 Sai Prakash Ranjan 2020-03-27 14:12 ` Robin Murphy 0 siblings, 1 reply; 15+ messages in thread From: Sai Prakash Ranjan @ 2020-03-27 13:28 UTC (permalink / raw) To: Robin Murphy, Will Deacon, Joerg Roedel, iommu, Douglas Anderson Cc: linux-arm-msm, linux-kernel, Sai Prakash Ranjan Currently on reboot/shutdown, the following messages are displayed on the console as error messages before the system reboots/shutdown. On SC7180: arm-smmu 15000000.iommu: removing device with active domains! arm-smmu 5040000.iommu: removing device with active domains! Demote the log level to debug since it does not offer much help in identifying/fixing any issue as the system is anyways going down and reduce spamming the kernel log. Reported-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> --- drivers/iommu/arm-smmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 16c4b87af42b..0a865e32054a 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -2250,7 +2250,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev) return -ENODEV; if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS)) - dev_err(&pdev->dev, "removing device with active domains!\n"); + dev_dbg(&pdev->dev, "removing device with active domains!\n"); arm_smmu_bus_init(NULL); iommu_device_unregister(&smmu->iommu); -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-03-27 13:28 [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback Sai Prakash Ranjan @ 2020-03-27 14:12 ` Robin Murphy 2020-03-27 15:09 ` Sai Prakash Ranjan 0 siblings, 1 reply; 15+ messages in thread From: Robin Murphy @ 2020-03-27 14:12 UTC (permalink / raw) To: Sai Prakash Ranjan, Will Deacon, Joerg Roedel, iommu, Douglas Anderson Cc: linux-arm-msm, linux-kernel On 2020-03-27 1:28 pm, Sai Prakash Ranjan wrote: > Currently on reboot/shutdown, the following messages are > displayed on the console as error messages before the > system reboots/shutdown. > > On SC7180: > > arm-smmu 15000000.iommu: removing device with active domains! > arm-smmu 5040000.iommu: removing device with active domains! > > Demote the log level to debug since it does not offer much > help in identifying/fixing any issue as the system is anyways > going down and reduce spamming the kernel log. I've gone back and forth on this pretty much ever since we added the shutdown hook - on the other hand, if any devices *are* still running in those domains at this point, then once we turn off the SMMU and let those IOVAs go out on the bus as physical addresses, all manner of weirdness may ensue. Thus there is an argument for *some* indication that this may happen, although IMO it could be downgraded to at least dev_warn(). Robin. > Reported-by: Douglas Anderson <dianders@chromium.org> > Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> > --- > drivers/iommu/arm-smmu.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 16c4b87af42b..0a865e32054a 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -2250,7 +2250,7 @@ static int arm_smmu_device_remove(struct platform_device *pdev) > return -ENODEV; > > if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS)) > - dev_err(&pdev->dev, "removing device with active domains!\n"); > + dev_dbg(&pdev->dev, "removing device with active domains!\n"); > > arm_smmu_bus_init(NULL); > iommu_device_unregister(&smmu->iommu); > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-03-27 14:12 ` Robin Murphy @ 2020-03-27 15:09 ` Sai Prakash Ranjan 2020-03-27 16:17 ` Rob Clark 2020-03-27 19:02 ` Robin Murphy 0 siblings, 2 replies; 15+ messages in thread From: Sai Prakash Ranjan @ 2020-03-27 15:09 UTC (permalink / raw) To: Robin Murphy Cc: Will Deacon, Joerg Roedel, iommu, Douglas Anderson, linux-arm-msm, linux-kernel Hi Robin, Thanks for taking a look at this. On 2020-03-27 19:42, Robin Murphy wrote: > On 2020-03-27 1:28 pm, Sai Prakash Ranjan wrote: >> Currently on reboot/shutdown, the following messages are >> displayed on the console as error messages before the >> system reboots/shutdown. >> >> On SC7180: >> >> arm-smmu 15000000.iommu: removing device with active domains! >> arm-smmu 5040000.iommu: removing device with active domains! >> >> Demote the log level to debug since it does not offer much >> help in identifying/fixing any issue as the system is anyways >> going down and reduce spamming the kernel log. > > I've gone back and forth on this pretty much ever since we added the > shutdown hook - on the other hand, if any devices *are* still running > in those domains at this point, then once we turn off the SMMU and let > those IOVAs go out on the bus as physical addresses, all manner of > weirdness may ensue. Thus there is an argument for *some* indication > that this may happen, although IMO it could be downgraded to at least > dev_warn(). > Any pointers to the weirdness here after SMMU is turned off? Because if we look at the call sites, device_shutdown is called from kernel_restart_prepare or kernel_shutdown_prepare which would mean system is going down anyways, so do we really care about these error messages or warnings from SMMU? arm_smmu_device_shutdown platform_drv_shutdown device_shutdown kernel_restart_prepare kernel_restart Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-03-27 15:09 ` Sai Prakash Ranjan @ 2020-03-27 16:17 ` Rob Clark 2020-03-27 18:18 ` Sai Prakash Ranjan 2020-03-27 19:02 ` Robin Murphy 1 sibling, 1 reply; 15+ messages in thread From: Rob Clark @ 2020-03-27 16:17 UTC (permalink / raw) To: Sai Prakash Ranjan Cc: Robin Murphy, linux-arm-msm, Linux Kernel Mailing List, Douglas Anderson, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, Will Deacon On Fri, Mar 27, 2020 at 8:10 AM Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> wrote: > > Hi Robin, > > Thanks for taking a look at this. > > On 2020-03-27 19:42, Robin Murphy wrote: > > On 2020-03-27 1:28 pm, Sai Prakash Ranjan wrote: > >> Currently on reboot/shutdown, the following messages are > >> displayed on the console as error messages before the > >> system reboots/shutdown. > >> > >> On SC7180: > >> > >> arm-smmu 15000000.iommu: removing device with active domains! > >> arm-smmu 5040000.iommu: removing device with active domains! > >> > >> Demote the log level to debug since it does not offer much > >> help in identifying/fixing any issue as the system is anyways > >> going down and reduce spamming the kernel log. > > > > I've gone back and forth on this pretty much ever since we added the > > shutdown hook - on the other hand, if any devices *are* still running > > in those domains at this point, then once we turn off the SMMU and let > > those IOVAs go out on the bus as physical addresses, all manner of > > weirdness may ensue. Thus there is an argument for *some* indication > > that this may happen, although IMO it could be downgraded to at least > > dev_warn(). > > > > Any pointers to the weirdness here after SMMU is turned off? > Because if we look at the call sites, device_shutdown is called > from kernel_restart_prepare or kernel_shutdown_prepare which would > mean system is going down anyways, so do we really care about these > error messages or warnings from SMMU? > > arm_smmu_device_shutdown > platform_drv_shutdown > device_shutdown > kernel_restart_prepare > kernel_restart > I'd guess that drm/msm is not detaching all of it's UNMANAGED domains in shutdown. Although *presumably* the device_link stuff would prevent the SMMU from shutting down while gpu/display is still active? If not I think we have bigger problems. I hadn't really noticed the error msgs before, not sure if that is just because the screen is off by the time they happen or if they are a new warning.. BR, -R ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-03-27 16:17 ` Rob Clark @ 2020-03-27 18:18 ` Sai Prakash Ranjan 0 siblings, 0 replies; 15+ messages in thread From: Sai Prakash Ranjan @ 2020-03-27 18:18 UTC (permalink / raw) To: Rob Clark Cc: Robin Murphy, linux-arm-msm, Linux Kernel Mailing List, Douglas Anderson, list@263.net:IOMMU DRIVERS , Joerg Roedel <joro@8bytes.org>,, Will Deacon Hi Rob, On 2020-03-27 21:47, Rob Clark wrote: > On Fri, Mar 27, 2020 at 8:10 AM Sai Prakash Ranjan > <saiprakash.ranjan@codeaurora.org> wrote: >> >> Hi Robin, >> >> Thanks for taking a look at this. >> >> On 2020-03-27 19:42, Robin Murphy wrote: >> > On 2020-03-27 1:28 pm, Sai Prakash Ranjan wrote: >> >> Currently on reboot/shutdown, the following messages are >> >> displayed on the console as error messages before the >> >> system reboots/shutdown. >> >> >> >> On SC7180: >> >> >> >> arm-smmu 15000000.iommu: removing device with active domains! >> >> arm-smmu 5040000.iommu: removing device with active domains! >> >> >> >> Demote the log level to debug since it does not offer much >> >> help in identifying/fixing any issue as the system is anyways >> >> going down and reduce spamming the kernel log. >> > >> > I've gone back and forth on this pretty much ever since we added the >> > shutdown hook - on the other hand, if any devices *are* still running >> > in those domains at this point, then once we turn off the SMMU and let >> > those IOVAs go out on the bus as physical addresses, all manner of >> > weirdness may ensue. Thus there is an argument for *some* indication >> > that this may happen, although IMO it could be downgraded to at least >> > dev_warn(). >> > >> >> Any pointers to the weirdness here after SMMU is turned off? >> Because if we look at the call sites, device_shutdown is called >> from kernel_restart_prepare or kernel_shutdown_prepare which would >> mean system is going down anyways, so do we really care about these >> error messages or warnings from SMMU? >> >> arm_smmu_device_shutdown >> platform_drv_shutdown >> device_shutdown >> kernel_restart_prepare >> kernel_restart >> > > I'd guess that drm/msm is not detaching all of it's UNMANAGED domains > in shutdown. Although *presumably* the device_link stuff would > prevent the SMMU from shutting down while gpu/display is still active? > If not I think we have bigger problems. > Where is the shutdown callback in drm/msm? I don't see any... Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-03-27 15:09 ` Sai Prakash Ranjan 2020-03-27 16:17 ` Rob Clark @ 2020-03-27 19:02 ` Robin Murphy 2020-03-28 7:35 ` Sai Prakash Ranjan 1 sibling, 1 reply; 15+ messages in thread From: Robin Murphy @ 2020-03-27 19:02 UTC (permalink / raw) To: Sai Prakash Ranjan Cc: Will Deacon, Joerg Roedel, iommu, Douglas Anderson, linux-arm-msm, linux-kernel On 2020-03-27 3:09 pm, Sai Prakash Ranjan wrote: > Hi Robin, > > Thanks for taking a look at this. > > On 2020-03-27 19:42, Robin Murphy wrote: >> On 2020-03-27 1:28 pm, Sai Prakash Ranjan wrote: >>> Currently on reboot/shutdown, the following messages are >>> displayed on the console as error messages before the >>> system reboots/shutdown. >>> >>> On SC7180: >>> >>> arm-smmu 15000000.iommu: removing device with active domains! >>> arm-smmu 5040000.iommu: removing device with active domains! >>> >>> Demote the log level to debug since it does not offer much >>> help in identifying/fixing any issue as the system is anyways >>> going down and reduce spamming the kernel log. >> >> I've gone back and forth on this pretty much ever since we added the >> shutdown hook - on the other hand, if any devices *are* still running >> in those domains at this point, then once we turn off the SMMU and let >> those IOVAs go out on the bus as physical addresses, all manner of >> weirdness may ensue. Thus there is an argument for *some* indication >> that this may happen, although IMO it could be downgraded to at least >> dev_warn(). >> > > Any pointers to the weirdness here after SMMU is turned off? > Because if we look at the call sites, device_shutdown is called > from kernel_restart_prepare or kernel_shutdown_prepare which would > mean system is going down anyways, so do we really care about these > error messages or warnings from SMMU? > > arm_smmu_device_shutdown > platform_drv_shutdown > device_shutdown > kernel_restart_prepare > kernel_restart Imagine your network driver doesn't implement a .shutdown method (so the hardware is still active regardless of device links), happens to have an Rx buffer or descriptor ring DMA-mapped at an IOVA that looks like the physical address of the memory containing some part of the kernel text lower down that call stack, and the MAC receives a broadcast IP packet at about the point arm_smmu_device_shutdown() is returning. Enjoy debugging that ;) And if coincidental memory corruption seems too far-fetched for your liking, other fun alternatives might include "display tries to scan out from powered-off device, deadlocks interconnect and prevents anything else making progress", or "access to TZC-protected physical address triggers interrupt and over-eager Secure firmware resets system before orderly poweroff has a chance to finish". Of course the fact that in practice we'll *always* see the warning because there's no way to tear down the default DMA domains, and even if all devices *have* been nicely quiesced there's no way to tell, is certainly less than ideal. Like I say, it's not entirely clear-cut either way... Robin. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-03-27 19:02 ` Robin Murphy @ 2020-03-28 7:35 ` Sai Prakash Ranjan 2020-03-30 18:24 ` Doug Anderson 0 siblings, 1 reply; 15+ messages in thread From: Sai Prakash Ranjan @ 2020-03-28 7:35 UTC (permalink / raw) To: Robin Murphy Cc: Will Deacon, Joerg Roedel, iommu, Douglas Anderson, linux-arm-msm, linux-kernel Hi Robin, On 2020-03-28 00:32, Robin Murphy wrote: > On 2020-03-27 3:09 pm, Sai Prakash Ranjan wrote: > > Imagine your network driver doesn't implement a .shutdown method (so > the hardware is still active regardless of device links), happens to > have an Rx buffer or descriptor ring DMA-mapped at an IOVA that looks > like the physical address of the memory containing some part of the > kernel text lower down that call stack, and the MAC receives a > broadcast IP packet at about the point arm_smmu_device_shutdown() is > returning. Enjoy debugging that ;) > > And if coincidental memory corruption seems too far-fetched for your > liking, other fun alternatives might include "display tries to scan > out from powered-off device, deadlocks interconnect and prevents > anything else making progress", or "access to TZC-protected physical > address triggers interrupt and over-eager Secure firmware resets > system before orderly poweroff has a chance to finish". > > Of course the fact that in practice we'll *always* see the warning > because there's no way to tear down the default DMA domains, and even > if all devices *have* been nicely quiesced there's no way to tell, is > certainly less than ideal. Like I say, it's not entirely clear-cut > either way... > Thanks for these examples, good to know these scenarios in case we come across these. However, if we see these error/warning messages appear everytime then what will be the credibility of these messages? We will just ignore these messages when these issues you mention actually appears because we see them everytime on reboot or shutdown. So doesn't it make sense to enable these only when we are debugging? We could argue that how will we know the issue could be related to SMMU, but that's the case even now. The reason why this came up was that, we had a NOC(interconnect) error which does have a logging atleast in QCOM platforms from the secure side(it prints these on the console) after the SMMU err messages and there was a confusion if it was related to these messages. However, NOC error messages did identify the issue with the USB and it was solved later. So these SMMU err/warning messages could be misleading like the above case almost everytime. The probability of the issues you mentioned occuring is very less than the actual reboot, shutdown scenarios, for ex: we run reboot stress test for thousands of times and these messages don't add anything special in those cases when any issue occurs because they are seen everytime. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-03-28 7:35 ` Sai Prakash Ranjan @ 2020-03-30 18:24 ` Doug Anderson 2020-03-31 7:36 ` Sai Prakash Ranjan 0 siblings, 1 reply; 15+ messages in thread From: Doug Anderson @ 2020-03-30 18:24 UTC (permalink / raw) To: Sai Prakash Ranjan Cc: Robin Murphy, Will Deacon, Joerg Roedel, list@263.net:IOMMU DRIVERS <iommu@lists.linux-foundation.org>, Joerg Roedel <joro@8bytes.org>,, linux-arm-msm, LKML Hi, On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> wrote: > > > Of course the fact that in practice we'll *always* see the warning > > because there's no way to tear down the default DMA domains, and even > > if all devices *have* been nicely quiesced there's no way to tell, is > > certainly less than ideal. Like I say, it's not entirely clear-cut > > either way... > > > > Thanks for these examples, good to know these scenarios in case we come > across these. > However, if we see these error/warning messages appear everytime then > what will be > the credibility of these messages? We will just ignore these messages > when > these issues you mention actually appears because we see them everytime > on > reboot or shutdown. I would agree that if these messages are expected to be seen every time, there's no way to fix them, and they're not indicative of any problem then something should be done. Seeing something printed at "dev_error" level with an exclamation point (!) at the end makes me feel like this is something that needs immediate action on my part. If we really can't do better but feel that the messages need to be there, at least make them dev_info and less scary like: arm-smmu 15000000.iommu: turning off; DMA should be quiesced before now ...that would still give you a hint in the logs that if you saw a DMA transaction after the message that it was a bug but also wouldn't sound scary to someone who wasn't seeing any other problems. -Doug ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-03-30 18:24 ` Doug Anderson @ 2020-03-31 7:36 ` Sai Prakash Ranjan 2020-03-31 7:44 ` Will Deacon 0 siblings, 1 reply; 15+ messages in thread From: Sai Prakash Ranjan @ 2020-03-31 7:36 UTC (permalink / raw) To: Doug Anderson Cc: Robin Murphy, Will Deacon, Joerg Roedel, list@263.net:IOMMU DRIVERS , Joerg Roedel <joro@8bytes.org>,, linux-arm-msm, LKML Hi, On 2020-03-30 23:54, Doug Anderson wrote: > Hi, > > On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan > <saiprakash.ranjan@codeaurora.org> wrote: >> >> > Of course the fact that in practice we'll *always* see the warning >> > because there's no way to tear down the default DMA domains, and even >> > if all devices *have* been nicely quiesced there's no way to tell, is >> > certainly less than ideal. Like I say, it's not entirely clear-cut >> > either way... >> > >> >> Thanks for these examples, good to know these scenarios in case we >> come >> across these. >> However, if we see these error/warning messages appear everytime then >> what will be >> the credibility of these messages? We will just ignore these messages >> when >> these issues you mention actually appears because we see them >> everytime >> on >> reboot or shutdown. > > I would agree that if these messages are expected to be seen every > time, there's no way to fix them, and they're not indicative of any > problem then something should be done. Seeing something printed at > "dev_error" level with an exclamation point (!) at the end makes me > feel like this is something that needs immediate action on my part. > > If we really can't do better but feel that the messages need to be > there, at least make them dev_info and less scary like: > > arm-smmu 15000000.iommu: turning off; DMA should be quiesced before > now > > ...that would still give you a hint in the logs that if you saw a DMA > transaction after the message that it was a bug but also wouldn't > sound scary to someone who wasn't seeing any other problems. > We can do this if Robin is OK? -Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-03-31 7:36 ` Sai Prakash Ranjan @ 2020-03-31 7:44 ` Will Deacon 2020-03-31 7:53 ` Sai Prakash Ranjan 0 siblings, 1 reply; 15+ messages in thread From: Will Deacon @ 2020-03-31 7:44 UTC (permalink / raw) To: Sai Prakash Ranjan Cc: Doug Anderson, Robin Murphy, Joerg Roedel, list@263.net:IOMMU DRIVERS , Joerg Roedel <joro@8bytes.org>,, linux-arm-msm, LKML On Tue, Mar 31, 2020 at 01:06:11PM +0530, Sai Prakash Ranjan wrote: > On 2020-03-30 23:54, Doug Anderson wrote: > > On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan > > <saiprakash.ranjan@codeaurora.org> wrote: > > > > > > > Of course the fact that in practice we'll *always* see the warning > > > > because there's no way to tear down the default DMA domains, and even > > > > if all devices *have* been nicely quiesced there's no way to tell, is > > > > certainly less than ideal. Like I say, it's not entirely clear-cut > > > > either way... > > > > > > > > > > Thanks for these examples, good to know these scenarios in case we > > > come > > > across these. > > > However, if we see these error/warning messages appear everytime then > > > what will be > > > the credibility of these messages? We will just ignore these messages > > > when > > > these issues you mention actually appears because we see them > > > everytime > > > on > > > reboot or shutdown. > > > > I would agree that if these messages are expected to be seen every > > time, there's no way to fix them, and they're not indicative of any > > problem then something should be done. Seeing something printed at > > "dev_error" level with an exclamation point (!) at the end makes me > > feel like this is something that needs immediate action on my part. > > > > If we really can't do better but feel that the messages need to be > > there, at least make them dev_info and less scary like: > > > > arm-smmu 15000000.iommu: turning off; DMA should be quiesced before > > now > > > > ...that would still give you a hint in the logs that if you saw a DMA > > transaction after the message that it was a bug but also wouldn't > > sound scary to someone who wasn't seeing any other problems. > > > > We can do this if Robin is OK? It would be nice if you could figure out which domains are still live when the SMMU is being shut down in your case and verify that it *is* infact benign before we start making the message more friendly. As Robin said earlier, rogue DMA is a real nightmare to debug. Will ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-03-31 7:44 ` Will Deacon @ 2020-03-31 7:53 ` Sai Prakash Ranjan 2020-04-22 19:49 ` Doug Anderson 0 siblings, 1 reply; 15+ messages in thread From: Sai Prakash Ranjan @ 2020-03-31 7:53 UTC (permalink / raw) To: Will Deacon Cc: Doug Anderson, Robin Murphy, Joerg Roedel, list@263.net:IOMMU DRIVERS , Joerg Roedel <joro@8bytes.org>,, linux-arm-msm, LKML Hi Will, On 2020-03-31 13:14, Will Deacon wrote: > On Tue, Mar 31, 2020 at 01:06:11PM +0530, Sai Prakash Ranjan wrote: >> On 2020-03-30 23:54, Doug Anderson wrote: >> > On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan >> > <saiprakash.ranjan@codeaurora.org> wrote: >> > > >> > > > Of course the fact that in practice we'll *always* see the warning >> > > > because there's no way to tear down the default DMA domains, and even >> > > > if all devices *have* been nicely quiesced there's no way to tell, is >> > > > certainly less than ideal. Like I say, it's not entirely clear-cut >> > > > either way... >> > > > >> > > >> > > Thanks for these examples, good to know these scenarios in case we >> > > come >> > > across these. >> > > However, if we see these error/warning messages appear everytime then >> > > what will be >> > > the credibility of these messages? We will just ignore these messages >> > > when >> > > these issues you mention actually appears because we see them >> > > everytime >> > > on >> > > reboot or shutdown. >> > >> > I would agree that if these messages are expected to be seen every >> > time, there's no way to fix them, and they're not indicative of any >> > problem then something should be done. Seeing something printed at >> > "dev_error" level with an exclamation point (!) at the end makes me >> > feel like this is something that needs immediate action on my part. >> > >> > If we really can't do better but feel that the messages need to be >> > there, at least make them dev_info and less scary like: >> > >> > arm-smmu 15000000.iommu: turning off; DMA should be quiesced before >> > now >> > >> > ...that would still give you a hint in the logs that if you saw a DMA >> > transaction after the message that it was a bug but also wouldn't >> > sound scary to someone who wasn't seeing any other problems. >> > >> >> We can do this if Robin is OK? > > It would be nice if you could figure out which domains are still live > when > the SMMU is being shut down in your case and verify that it *is* infact > benign before we start making the message more friendly. As Robin said > earlier, rogue DMA is a real nightmare to debug. > I could see this error message for all the clients of apps_smmu. I checked manually enabling bypass and removing iommus dt property for each client of apps_smmu. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-03-31 7:53 ` Sai Prakash Ranjan @ 2020-04-22 19:49 ` Doug Anderson 2020-04-23 8:17 ` Sai Prakash Ranjan 0 siblings, 1 reply; 15+ messages in thread From: Doug Anderson @ 2020-04-22 19:49 UTC (permalink / raw) To: Sai Prakash Ranjan Cc: Will Deacon, Robin Murphy, Joerg Roedel, list@263.net:IOMMU DRIVERS , Joerg Roedel <joro@8bytes.org>,, linux-arm-msm, LKML Hi, On Tue, Mar 31, 2020 at 12:53 AM Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> wrote: > > Hi Will, > > On 2020-03-31 13:14, Will Deacon wrote: > > On Tue, Mar 31, 2020 at 01:06:11PM +0530, Sai Prakash Ranjan wrote: > >> On 2020-03-30 23:54, Doug Anderson wrote: > >> > On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan > >> > <saiprakash.ranjan@codeaurora.org> wrote: > >> > > > >> > > > Of course the fact that in practice we'll *always* see the warning > >> > > > because there's no way to tear down the default DMA domains, and even > >> > > > if all devices *have* been nicely quiesced there's no way to tell, is > >> > > > certainly less than ideal. Like I say, it's not entirely clear-cut > >> > > > either way... > >> > > > > >> > > > >> > > Thanks for these examples, good to know these scenarios in case we > >> > > come > >> > > across these. > >> > > However, if we see these error/warning messages appear everytime then > >> > > what will be > >> > > the credibility of these messages? We will just ignore these messages > >> > > when > >> > > these issues you mention actually appears because we see them > >> > > everytime > >> > > on > >> > > reboot or shutdown. > >> > > >> > I would agree that if these messages are expected to be seen every > >> > time, there's no way to fix them, and they're not indicative of any > >> > problem then something should be done. Seeing something printed at > >> > "dev_error" level with an exclamation point (!) at the end makes me > >> > feel like this is something that needs immediate action on my part. > >> > > >> > If we really can't do better but feel that the messages need to be > >> > there, at least make them dev_info and less scary like: > >> > > >> > arm-smmu 15000000.iommu: turning off; DMA should be quiesced before > >> > now > >> > > >> > ...that would still give you a hint in the logs that if you saw a DMA > >> > transaction after the message that it was a bug but also wouldn't > >> > sound scary to someone who wasn't seeing any other problems. > >> > > >> > >> We can do this if Robin is OK? > > > > It would be nice if you could figure out which domains are still live > > when > > the SMMU is being shut down in your case and verify that it *is* infact > > benign before we start making the message more friendly. As Robin said > > earlier, rogue DMA is a real nightmare to debug. > > > > I could see this error message for all the clients of apps_smmu. > I checked manually enabling bypass and removing iommus dt property > for each client of apps_smmu. Any update on the status here? If I'm reading the conversation above, Robin said: "we'll *always* see the warning because there's no way to tear down the default DMA domains, and even if all devices *have* been nicely quiesced there's no way to tell". Did I understand that properly? If so, it seems like it's fully expected to see this message on every reboot and it doesn't necessarily signify anything bad. -Doug ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-04-22 19:49 ` Doug Anderson @ 2020-04-23 8:17 ` Sai Prakash Ranjan 2020-04-23 9:28 ` Robin Murphy 0 siblings, 1 reply; 15+ messages in thread From: Sai Prakash Ranjan @ 2020-04-23 8:17 UTC (permalink / raw) To: Doug Anderson, Will Deacon, Robin Murphy Cc: Joerg Roedel, list@263.net:IOMMU DRIVERS , Joerg Roedel <joro@8bytes.org>,, linux-arm-msm, LKML, linux-arm-msm-owner Hi, On 2020-04-23 01:19, Doug Anderson wrote: > Hi, > > On Tue, Mar 31, 2020 at 12:53 AM Sai Prakash Ranjan > <saiprakash.ranjan@codeaurora.org> wrote: >> >> Hi Will, >> >> On 2020-03-31 13:14, Will Deacon wrote: >> > On Tue, Mar 31, 2020 at 01:06:11PM +0530, Sai Prakash Ranjan wrote: >> >> On 2020-03-30 23:54, Doug Anderson wrote: >> >> > On Sat, Mar 28, 2020 at 12:35 AM Sai Prakash Ranjan >> >> > <saiprakash.ranjan@codeaurora.org> wrote: >> >> > > >> >> > > > Of course the fact that in practice we'll *always* see the warning >> >> > > > because there's no way to tear down the default DMA domains, and even >> >> > > > if all devices *have* been nicely quiesced there's no way to tell, is >> >> > > > certainly less than ideal. Like I say, it's not entirely clear-cut >> >> > > > either way... >> >> > > > >> >> > > >> >> > > Thanks for these examples, good to know these scenarios in case we >> >> > > come >> >> > > across these. >> >> > > However, if we see these error/warning messages appear everytime then >> >> > > what will be >> >> > > the credibility of these messages? We will just ignore these messages >> >> > > when >> >> > > these issues you mention actually appears because we see them >> >> > > everytime >> >> > > on >> >> > > reboot or shutdown. >> >> > >> >> > I would agree that if these messages are expected to be seen every >> >> > time, there's no way to fix them, and they're not indicative of any >> >> > problem then something should be done. Seeing something printed at >> >> > "dev_error" level with an exclamation point (!) at the end makes me >> >> > feel like this is something that needs immediate action on my part. >> >> > >> >> > If we really can't do better but feel that the messages need to be >> >> > there, at least make them dev_info and less scary like: >> >> > >> >> > arm-smmu 15000000.iommu: turning off; DMA should be quiesced before >> >> > now >> >> > >> >> > ...that would still give you a hint in the logs that if you saw a DMA >> >> > transaction after the message that it was a bug but also wouldn't >> >> > sound scary to someone who wasn't seeing any other problems. >> >> > >> >> >> >> We can do this if Robin is OK? >> > >> > It would be nice if you could figure out which domains are still live >> > when >> > the SMMU is being shut down in your case and verify that it *is* infact >> > benign before we start making the message more friendly. As Robin said >> > earlier, rogue DMA is a real nightmare to debug. >> > >> >> I could see this error message for all the clients of apps_smmu. >> I checked manually enabling bypass and removing iommus dt property >> for each client of apps_smmu. > > Any update on the status here? If I'm reading the conversation above, > Robin said: "we'll *always* see the warning because there's no way to > tear down the default DMA domains, and even if all devices *have* been > nicely quiesced there's no way to tell". Did I understand that > properly? If so, it seems like it's fully expected to see this > message on every reboot and it doesn't necessarily signify anything > bad. > Understanding is the same, waiting for Will and Robin to check if its OK to make the message more friendly. -Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-04-23 8:17 ` Sai Prakash Ranjan @ 2020-04-23 9:28 ` Robin Murphy 2020-04-23 9:41 ` Sai Prakash Ranjan 0 siblings, 1 reply; 15+ messages in thread From: Robin Murphy @ 2020-04-23 9:28 UTC (permalink / raw) To: Sai Prakash Ranjan, Doug Anderson, Will Deacon Cc: linux-arm-msm, linux-arm-msm-owner, Joerg Roedel, LKML, iommu On 2020-04-23 9:17 am, Sai Prakash Ranjan wrote: [...] >> Any update on the status here? If I'm reading the conversation above, >> Robin said: "we'll *always* see the warning because there's no way to >> tear down the default DMA domains, and even if all devices *have* been >> nicely quiesced there's no way to tell". Did I understand that >> properly? If so, it seems like it's fully expected to see this >> message on every reboot and it doesn't necessarily signify anything >> bad. >> > > Understanding is the same, waiting for Will and Robin to check if its OK > to make the message more friendly. The way I see it, we essentially just want *something* visible that will correlate with any misbehaviour that *might* result from turning off a possibly-live context. How about simply "disabling translation", at dev_warn or dev_info level? Robin. ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback 2020-04-23 9:28 ` Robin Murphy @ 2020-04-23 9:41 ` Sai Prakash Ranjan 0 siblings, 0 replies; 15+ messages in thread From: Sai Prakash Ranjan @ 2020-04-23 9:41 UTC (permalink / raw) To: Robin Murphy Cc: Doug Anderson, Will Deacon, linux-arm-msm, linux-arm-msm-owner, Joerg Roedel, LKML, iommu On 2020-04-23 14:58, Robin Murphy wrote: > On 2020-04-23 9:17 am, Sai Prakash Ranjan wrote: > [...] >>> Any update on the status here? If I'm reading the conversation >>> above, >>> Robin said: "we'll *always* see the warning because there's no way to >>> tear down the default DMA domains, and even if all devices *have* >>> been >>> nicely quiesced there's no way to tell". Did I understand that >>> properly? If so, it seems like it's fully expected to see this >>> message on every reboot and it doesn't necessarily signify anything >>> bad. >>> >> >> Understanding is the same, waiting for Will and Robin to check if its >> OK >> to make the message more friendly. > > The way I see it, we essentially just want *something* visible that > will correlate with any misbehaviour that *might* result from turning > off a possibly-live context. How about simply "disabling translation", > at dev_warn or dev_info level? > Sounds good, I'll go with disabling translation with dev_info. Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2020-04-23 9:41 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-03-27 13:28 [PATCH] iommu/arm-smmu: Demote error messages to debug in shutdown callback Sai Prakash Ranjan 2020-03-27 14:12 ` Robin Murphy 2020-03-27 15:09 ` Sai Prakash Ranjan 2020-03-27 16:17 ` Rob Clark 2020-03-27 18:18 ` Sai Prakash Ranjan 2020-03-27 19:02 ` Robin Murphy 2020-03-28 7:35 ` Sai Prakash Ranjan 2020-03-30 18:24 ` Doug Anderson 2020-03-31 7:36 ` Sai Prakash Ranjan 2020-03-31 7:44 ` Will Deacon 2020-03-31 7:53 ` Sai Prakash Ranjan 2020-04-22 19:49 ` Doug Anderson 2020-04-23 8:17 ` Sai Prakash Ranjan 2020-04-23 9:28 ` Robin Murphy 2020-04-23 9:41 ` Sai Prakash Ranjan
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).