On Wed, Sep 30, 2020 at 01:36:18PM -0700, Nicolin Chen wrote:
> On Wed, Sep 30, 2020 at 05:31:31PM +0200, Thierry Reding wrote:
> > On Wed, Sep 30, 2020 at 01:42:57AM -0700, Nicolin Chen wrote:
> > > Previously the driver relies on bus_set_iommu() in .probe() to call
> > > in .probe_device() function so each client can poll iommus property
> > > in DTB to configure fwspec via tegra_smmu_configure(). According to
> > > the comments in .probe(), this is a bit of a hack. And this doesn't
> > > work for a client that doesn't exist in DTB, PCI device for example.
> > >
> > > Actually when a device/client gets probed, the of_iommu_configure()
> > > will call in .probe_device() function again, with a prepared fwspec
> > > from of_iommu_configure() that reads the SWGROUP id in DTB as we do
> > > in tegra-smmu driver.
> > >
> > > Additionally, as a new helper devm_tegra_get_memory_controller() is
> > > introduced, there's no need to poll the iommus property in order to
> > > get mc->smmu pointers or SWGROUP id.
> > >
> > > This patch reworks .probe_device() and .attach_dev() by doing:
> > > 1) Using fwspec to get swgroup id in .attach_dev/.dettach_dev()
> > > 2) Removing DT polling code, tegra_smmu_find/tegra_smmu_configure()
> > > 3) Calling devm_tegra_get_memory_controller() in .probe_device()
> > > 4) Also dropping the hack in .probe() that's no longer needed.
> > >
> > > Signed-off-by: Nicolin Chen
> [...]
> > >  static struct iommu_device *tegra_smmu_probe_device(struct device *dev)
> > >  {
> > > -	struct device_node *np = dev->of_node;
> > > -	struct tegra_smmu *smmu = NULL;
> > > -	struct of_phandle_args args;
> > > -	unsigned int index = 0;
> > > -	int err;
> > > -
> > > -	while (of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> > > -					  &args) == 0) {
> > > -		smmu = tegra_smmu_find(args.np);
> > > -		if (smmu) {
> > > -			err = tegra_smmu_configure(smmu, dev, &args);
> > > -			of_node_put(args.np);
> > > -
> > > -			if (err < 0)
> > > -				return ERR_PTR(err);
> > > -
> > > -			/*
> > > -			 * Only a single IOMMU master interface is currently
> > > -			 * supported by the Linux kernel, so abort after the
> > > -			 * first match.
> > > -			 */
> > > -			dev_iommu_priv_set(dev, smmu);
> > > -
> > > -			break;
> > > -		}
> > > +	struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> > > +	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> >
> > It looks to me like the only reason why you need this new global API is
> > because PCI devices may not have a device tree node with a phandle to
> > the IOMMU. However, SMMU support for PCI will only be enabled if the
> > root complex has an iommus property, right? In that case, can't we
> > simply do something like this:
> >
> > 	if (dev_is_pci(dev))
> > 		np = find_host_bridge(dev)->of_node;
> > 	else
> > 		np = dev->of_node;
> >
> > ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> > sure that exists.
> >
> > Once we have that we can still iterate over the iommus property and do
> > not need to rely on this global variable.
>
> I agree that it'd work. But I was hoping to simplify the code
> here if it's possible. Looks like we have an argument on this
> so I will choose to go with your suggestion above for now.
>
> > > -		of_node_put(args.np);
> > > -		index++;
> > > -	}
> > > +	/* An invalid mc pointer means mc and smmu drivers are not ready */
> > > +	if (IS_ERR(mc))
> > > +		return ERR_PTR(-EPROBE_DEFER);
> > >
> > > -	if (!smmu)
> > > +	/*
> > > +	 * IOMMU core allows -ENODEV return to carry on. So bypass any call
> > > +	 * from bus_set_iommu() during tegra_smmu_probe(), as a device will
> > > +	 * call in again via of_iommu_configure when fwspec is prepared.
> > > +	 */
> > > +	if (!mc->smmu || !fwspec || fwspec->ops != &tegra_smmu_ops)
> > >  		return ERR_PTR(-ENODEV);
> > >
> > > -	return &smmu->iommu;
> > > +	dev_iommu_priv_set(dev, mc->smmu);
> > > +
> > > +	return &mc->smmu->iommu;
> > >  }
> > >
> > >  static void tegra_smmu_release_device(struct device *dev)
> > > @@ -1089,16 +1027,6 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
> > >  	if (!smmu)
> > >  		return ERR_PTR(-ENOMEM);
> > >
> > > -	/*
> > > -	 * This is a bit of a hack. Ideally we'd want to simply return this
> > > -	 * value. However the IOMMU registration process will attempt to add
> > > -	 * all devices to the IOMMU when bus_set_iommu() is called. In order
> > > -	 * not to rely on global variables to track the IOMMU instance, we
> > > -	 * set it here so that it can be looked up from the .probe_device()
> > > -	 * callback via the IOMMU device's .drvdata field.
> > > -	 */
> > > -	mc->smmu = smmu;
> >
> > I don't think this is going to work. I distinctly remember putting this
> > here because we needed access to this before ->probe_device() had been
> > called for any of the devices.
>
> Do you remember which exact part of code needs to access mc->smmu
> before ->probe_device() is called?
>
> What I understood is that IOMMU core didn't allow ERR_PTR(-ENODEV)
> return value from ->probe_device(), previously ->add_device(), to
> carry on when you added this code/driver:
>     commit 8918465163171322c77a19d5258a95f56d89d2e4
>     Author: Thierry Reding
>     Date: Wed Apr 16 09:24:44 2014 +0200
>         memory: Add NVIDIA Tegra memory controller support
>
> ..until the core had a change one year later:
>     commit 38667f18900afe172a4fe44279b132b4140f920f
>     Author: Joerg Roedel
>     Date: Mon Jun 29 10:16:08 2015 +0200
>         iommu: Ignore -ENODEV errors from add_device call-back
>
> As my commit message of this change states, ->probe_device() will
> be called in from both bus_set_iommu() and really_probe() of each
> device through of_iommu_configure() -- the later one initializes
> an fwspec by polling the iommus property in the IOMMU core, same
> as what we do here in tegra-smmu. If this works, we can probably
> drop the hack here and get rid of tegra_smmu_configure().

Looking at this a bit more, I notice that tegra_smmu_configure() does a
lot of what's already done during of_iommu_configure(), so it'd indeed
be nice if we could somehow get rid of that. However, like I said, I do
recall that for DMA/IOMMU we need this prior to ->probe_device(), so it
isn't clear to me if we can do that.

So I think in order to make progress we need to check that dropping this
does indeed still work when we enable DMA/IOMMU (and the preliminary
patches to pass 1:1 mappings via reserved-memory regions). If so, I
think it should be safe to remove this.

Thierry