On Wed, 2016-03-09 at 07:31 +0000, Xu, Quan wrote: > On March 09, 2016 1:19pm, wrote: > >  > > > When iommu_setup() is called in __start_xen(), interrupts have > > > already > > > been enabled, and nothing disables (without reenabling) them > > > again in > > > the path that leads to calling set_iommu_interrupt_handler(). > > > There's > > > therefore no risk of them being disabled and needing remaining > > > so, and > > > hence no need for > > no risk of them being 'enabled' since no one disables them again? > > > Yes, > Reason why one use _irqsave() when locking is because he doesn't know whether interrupt are disabled or not, and wants that, whatever the status is (enabled or disabled), it remains unchanged upon unlock (which, therefore, does the _irqrestore()). If we are absolutely sure that they are enabled, i.e., there is no _risk_ that they are disabled, it would be ok to just use _irq() (if disabling interrupt is necessary, which is not in this case, but that's another thing) and avoid saving the flags. That's how I read the original description (which, of course, does not mean it can't be simplified). > > > saving and restoring the flags. This means that, even if > > > interrupt > > > would need to be disabled when taking pcidevs_lock, > > > spin_lock_irq() > > > and spin_unlock_irq() could be used. > > I didn't see how this explanation relates to below change. You are > > changing > > from spin_lock_irqsave to spin_lock. But here you explains the > > reason to > > spin_lock_irq... > > > Exactly. We have spin_lock_irqsave() that does two things:  - disables interrupts,  - saves and restores the flags. We think that:  - there's no need to save flags, even if disabling interrupts were     necessary,  - there is no need to disable interrupts. And therefore, we are changing to spin_lock() that:  - doesn't disable interrupts,  - doesn't save and restore the flags. So, basically, switching spinlock variant is basically _2_ changes. I do think that it is worth touching in the changelog why _both_ are ok. > Yes, you are right. I think I'd better remove or add a '()' for: > >    "This means that, even if interrupt >     would need to be disabled when taking pcidevs_lock, > spin_lock_irq() >     and spin_unlock_irq() could be used." > Yeah, that makes it more accurate, but even longer, while I think the changelog could use some shortening and simplification. > > > Inside set_iommu_interrupt_handler(), pcidevs_lock serializes > > > calls to > > > pci_get_pdev(), which does not require interrupts to be disabled > > > during its execution. In fact, pcidevs_lock is always (except for > > > this > > > case) taken without disabling interrupts, and disabling them in > > > this > > > case would make the BUG_ON() in check_lock() trigger, if it > > > wasn't for > > > the fact that spinlock debugging checks are still disabled, > > > during > > > initialization, when iommu_setup() (which then end up calling > > > set_iommu_interrupt_handler()) is called. > > The key problem is that we need consistent lock usage in all places > > (either all in > > IRQ-safe or all in IRQ-unsafe), regardless of whether check_lock is > > activated or > > not (which is just a debug method to help identify such > > inconsistency problem). > It is, and, during boot, is disabled for a reason, which is that we allow mixed usage, albeit only in super special circumstances (like, in fact, early boot). So I agree that check_lock() is just a sanity checking mechanism, but let's not state anything incorrect (as Jan requested in the first place). > IMO, this is not the key problem,  __Wait for Dario's / Jan's > opinions__. > Well, it is a way to describe at least some of the aspects of the key problem, I guess, just not the one that I think I would stress the most. > > I think it should be clear enough to state that pci_get_pdev > > doesn't require > > holding lock with interrupt disabled (so we should use spin_lock in > > this AMD > > case), and add the fact that when this function is invoked the > > interrupt is indeed > > enabled. > > I totally agree with this description! (Can we use that as changelog? :-) ) > > > In order to fix this, we can use just plain spin_lock() and > > > spin_unlock(), instead of spin_lock_irqsave() and > > > spin_unlock_irqrestore(). > > What about revise like below? > > -- > > > > pcidevs_lock should be held with interrupt enabled. > I am not sure for this point. > Well, it is true that it should. Altough, I think it's more accurate to say that, as Kevin says above, it "doesn't require being held with interrupt disabled". > > However there remains an > > exception in AMD IOMMU code, where the lock is acquired with > > interrupt > > disabled. This inconsistency can lead to deadlock. > > Can it? In the case of the occurrence being changed by this patch, I don't think it can. > > The fix is straightforward to use spin_lock instead. Also interrupt > > has been > > enabled when this function is invoked, so we're sure consistency > > around > > pcidevs_lock can be guaranteed after this fix. > I really appreciate you send out a revised one, but I think it is not > only for consistency. >  __Wait for Dario's / Jan's opinions__. > > To be honest, to me, __my_changlog_ is complicated. > It is complicated. I think it is a detailed, and IMO correct, description of the reason why the patch is ok. That is indeed the purpose of a changelog (especially for these kind of patches), but it certainly could be summarized/simplified a bit. I guess I'm also giving it a try, using what Kevin wrote in the middle of his email as a basis (with a small addition about consistency at the end). "pci_get_pdev() doesn't require interrupts to be disabled when taking the pcidevs_lock, which protects its execution. So, spin_lock() can be used for that, and that is what is done almost everywhere. Currenlty, there is an exception, in early boot IOMMU initialization on AMD systems, where spin_lock_irqsave() and restore are used. However, since, in that case too:  - we don't need to disable interrupts (for same reasons stated above),  - interrupts are enabled already, so there is no need to save and    restore flags, just change it into using spin_lock(), as everywhere. Note that, mixing interrupt disabled and enabled spinlocks is something we disallow, except for very special situations. And since this one does not qualify as such, using IRQ disabling variants can be considered a bug (which manages to not trigger the safety checking we have in place only because they're not yet enabled)." Regards, Dario -- <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)