IOMMU Archive on lore.kernel.org
 help / color / Atom feed
From: Robin Murphy <robin.murphy@arm.com>
To: John Garry <john.garry@huawei.com>, Marc Zyngier <maz@kernel.org>,
	Will Deacon <will@kernel.org>,
	Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>,
	Sudeep Holla <sudeep.holla@arm.com>,
	"Guohanjun (Hanjun Guo)" <guohanjun@huawei.com>
Cc: Saravana Kannan <saravanak@google.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Alex Williamson <alex.williamson@redhat.com>,
	Linuxarm <linuxarm@huawei.com>,
	iommu <iommu@lists.linux-foundation.org>,
	Bjorn Helgaas <bhelgaas@google.com>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: arm64 iommu groups issue
Date: Fri, 14 Feb 2020 18:35:05 +0000
Message-ID: <68643b18-c920-f997-a6d4-a5d9177c0f4e@arm.com> (raw)
In-Reply-To: <35fc8d13-b1c1-6a9e-4242-284da7f00764@huawei.com>

On 14/02/2020 2:09 pm, John Garry wrote:
>>>
>>> @@ -2420,6 +2421,10 @@ void pci_device_add(struct pci_dev *dev, struct
>>> pci_bus *bus)
>>>        /* Set up MSI IRQ domain */
>>>        pci_set_msi_domain(dev);
>>>
>>> +    parent = dev->dev.parent;
>>> +    if (parent && parent->bus == &pci_bus_type)
>>> +        device_link_add(&dev->dev, parent, DL_FLAG_AUTOPROBE_CONSUMER);
>>> +
>>>        /* Notifier could use PCI capabilities */
>>>        dev->match_driver = false;
>>>        ret = device_add(&dev->dev);
>>> -- 
>>>
>>> This would work, but the problem is that if the port driver fails in
>>> probing - and not just for -EPROBE_DEFER - then the child device will
>>> never probe. This very thing happens on my dev board. However we could
>>> expand the device links API to cover this sort of scenario.
>>
>> Yes, that's an undesirable issue, but in fact I think it's mostly
>> indicative that involving drivers in something which is designed to
>> happen at a level below drivers is still fundamentally wrong and doomed
>> to be fragile at best.
> 
> Right, and even worse is that it relies on the port driver even existing 
> at all.
> 
> All this iommu group assignment should be taken outside device driver 
> probe paths.
> 
> However we could still consider device links for sync'ing the SMMU and 
> each device probing.

Yes, we should get that for DT now thanks to the of_devlink stuff, but 
cooking up some equivalent for IORT might be worthwhile.

>> Another thought that crosses my mind is that when pci_device_group()
>> walks up to the point of ACS isolation and doesn't find an existing
>> group, it can still infer that everything it walked past *should* be put
>> in the same group it's then eventually going to return. Unfortunately I
>> can't see an obvious way for it to act on that knowledge, though, since
>> recursive iommu_probe_device() is unlikely to end well.
> 
> I'd be inclined not to change that code.
> 
>>
>>> As for alternatives, it looks pretty difficult to me to disassociate the
>>> group allocation from the dma_configure path.
>>
>> Indeed it's non-trivial, but it really does need cleaning up at some 
>> point.
>>
>> Having just had yet another spark, does something like the untested
>> super-hack below work at all? 
> 
> I tried it and it doesn't (yet) work.

Bleh - further reinforcement of the "ideas after 6PM are bad ideas" rule...

> So when we try 
> iommu_bus_replay()->add_iommu_group()->iommu_probe_device()->arm_smmu_add_device(), 
> 
> the iommu_fwspec is still NULL for that device - this is not set until 
> later when the device driver is going to finally probe in 
> iort_iommu_xlate()->iommu_fwspec_init(), and it's too late...
> 
> And this looks to be the reason for which current 
> iommu_bus_init()->bus_for_each_device(..., add_iommu_group) fails also.

Of course, just adding a 'correct' add_device replay without the 
of_xlate process doesn't help at all. No wonder this looked suspiciously 
simpler than where the first idea left off...

(on reflection, the core of this idea seems to be recycling the existing 
iommu_bus_init walk rather than building up a separate "waiting list", 
while forgetting that that wasn't the difficult part of the original 
idea anyway)

> On this current code mentioned, the principle of this seems wrong to me 
> - we call bus_for_each_device(..., add_iommu_group) for the first SMMU 
> in the system which probes, but we attempt to add_iommu_group() for all 
> devices on the bus, even though the SMMU for that device may yet to have 
> probed.

Yes, iommu_bus_init() is one of the places still holding a 
deeply-ingrained assumption that the ops go live for all IOMMU instances 
at once, which is what warranted the further replay in 
of_iommu_configure() originally. Moving that out of 
of_platform_device_create() to support probe deferral is where the 
trouble really started.

Robin.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

  reply index

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-19  8:43 John Garry
2019-09-19 13:25 ` Robin Murphy
2019-09-19 14:35   ` John Garry
2019-11-04 12:18     ` John Garry
2020-02-13 15:49     ` John Garry
2020-02-13 19:40       ` Robin Murphy
2020-02-14 14:09         ` John Garry
2020-02-14 18:35           ` Robin Murphy [this message]
2020-02-17 12:08             ` John Garry

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=68643b18-c920-f997-a6d4-a5d9177c0f4e@arm.com \
    --to=robin.murphy@arm.com \
    --cc=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=guohanjun@huawei.com \
    --cc=iommu@lists.linux-foundation.org \
    --cc=john.garry@huawei.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxarm@huawei.com \
    --cc=lorenzo.pieralisi@arm.com \
    --cc=maz@kernel.org \
    --cc=saravanak@google.com \
    --cc=sudeep.holla@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

IOMMU Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-iommu/0 linux-iommu/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-iommu linux-iommu/ https://lore.kernel.org/linux-iommu \
		iommu@lists.linux-foundation.org
	public-inbox-index linux-iommu

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.linux-foundation.lists.iommu


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git