On Mon, Nov 16, 2020 at 11:48:39AM -0800, John Stultz wrote:
> On Mon, Nov 16, 2020 at 8:36 AM Will Deacon wrote:
> > On Mon, Nov 16, 2020 at 04:59:36PM +0100, Thierry Reding wrote:
> > > On Fri, Nov 06, 2020 at 04:27:10AM +0000, John Stultz wrote:
> > > Unfortunately, the ARM SMMU module will eventually end up being loaded
> > > once the root filesystem has been mounted (for example via SDHCI or
> > > Ethernet, both with using just plain, non-IOMMU-backed DMA API) and then
> > > initialize, configuring as "fault by default", which then results from a
> > > slew of SMMU faults from all the devices that have previously configured
> > > themselves without IOMMU support.
> >
> > I wonder if fw_devlink=on would help here?
> >
> > But either way, I'd be more inclined to revert this change if it's causing
> > problems for !QCOM devices.
> >
> > Linus -- please can you drop this one (patch 3/3) for now, given that it's
> > causing problems?
>
> Agreed. Apologies again for the trouble.
>
> I do feel like the probe timeout to handle optional links is causing a
> lot of the trouble here. I expect fw_devlink would solve this, but it
> may be awhile before it can be always enabled. I may see about
> pushing the default probe timeout value to be a little further out
> than init (I backed away from my last attempt as I didn't want to
> cause long (30 second) delays for cases like NFS root, but maybe 2-5
> seconds would be enough to make things work better for everyone).

I think there are two problems here: 1) the deferred probe timeout can
cause a mismatch between what SMMU masters and the SMMU think is going
on and 2) a logistical problem of dealing with the SMMU driver being a
loadable module.

The second problem can be dealt with by shipping the module in the
initial ramdisk. That's a bit annoying, but perhaps the right thing to
do.
At least on Tegra we need this because all the devices that carry
the root filesystem (Ethernet for NFS and SDHCI/USB/SATA/PCI for disk
boot) are SMMU masters and will start to fault once the SMMU driver is
loaded.

The first problem is trickier, but if the ARM SMMU driver is built as a
module and shipped in the initial ramdisk it should work. Like I said,
this is annoying because it makes development a bit more complicated
than just rebuilding a kernel image and flashing it (or booting it
straight from TFTP): now every time the ARM SMMU module is built, the
initial ramdisk needs to be updated (and potentially flashed) as well.

Thierry

P.S.: Interestingly this is very similar to the problem that I've been
trying to address for display hardware that's left on by the bootloader.
Given that, one potential solution would be to somehow retrieve memory
allocations done by these devices and create identity mappings in the
ARM SMMU address spaces for such devices, much like we plan to do for
devices left on by the bootloader (like the display controller for
showing a boot splash).

I suspect that it's not really worth doing this for devices that are
only initialized by the kernel because we have a bit of control over
when we enable them, so I'd prefer if we just kept things reasonably
simple and made sure the SMMU was either always used by a device from
the start or not at all. Dynamically switching between SMMU and no-SMMU
seems a bit eccentric.