From mboxrd@z Thu Jan 1 00:00:00 1970 From: Dario Faggioli Subject: Re: [PATCH v2 1/2] IOMMU/spinlock: Fix a bug found in AMD IOMMU initialization. Date: Wed, 9 Mar 2016 14:19:59 +0100 Message-ID: <1457529599.3102.354.camel@citrix.com> References: <1457492889-36625-1-git-send-email-quan.xu@intel.com> <1457492889-36625-2-git-send-email-quan.xu@intel.com> <945CA011AD5F084CBEA3E851C0AB28894B85CA85@SHSMSX101.ccr.corp.intel.com> <1457519096.3102.327.camel@citrix.com> <945CA011AD5F084CBEA3E851C0AB28894B85E23E@SHSMSX101.ccr.corp.intel.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="===============0758813347207378762==" Return-path: In-Reply-To: <945CA011AD5F084CBEA3E851C0AB28894B85E23E@SHSMSX101.ccr.corp.intel.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" To: "Xu, Quan" Cc: "Tian, Kevin" , Jan Beulich , Suravee Suthikulpanit , "xen-devel@lists.xen.org" List-Id: xen-devel@lists.xenproject.org --===============0758813347207378762== Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-lUGpRmodSNumMV5HmM6b" --=-lUGpRmodSNumMV5HmM6b Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Wed, 2016-03-09 at 12:52 +0000, Xu, Quan wrote: > On March 09, 2016 6:25pm, wrote: > > On Wed, 2016-03-09 at 07:31 +0000, Xu, Quan wrote: > > > On March 09, 2016 1:19pm, wrote: > > > >=C2=A0 > > If we are absolutely sure that they are enabled, i.e., there is no > > _risk_ that they > > are disabled, it would be ok to just use _irq() (if disabling > > interrupt is necessary, > > which is not in this case, but that's another thing) and avoid > > saving the flags. > >=C2=A0 > Dario, thanks for your share. > I appreciate you always share some knowledge. :):) > btw, the "Linux Device Drivers, 3rd Edition" book also describe it > clearly, http://www.makelinux.net/ldd3/chp-5-sect-5=C2=A0 >=20 Yes, with respect to that, us and Linux are similar, and the concerns on whether interrupts should be disabled or not when taking a spinlock are generic, and can be applied to any OS/hypervisor. AFAICR, Linux does not have any check in place similar to our check_lock(), but that does not mean much. > > > > However there remains an > > > > exception in AMD IOMMU code, where the lock is acquired with > > > > interrupt disabled. This inconsistency can lead to deadlock. > > > >=20 > > Can it? In the case of the occurrence being changed by this patch, > > I don't think it > > can. > Before this patch, it might. As Jan mentioned, that's just a > theoretical concern:=C2=A0 > =C2=A0-spin_lock_irqsave() disables interrupts (on the local processor > only) before taking the spinlock.=C2=A0 > =C2=A0 Supposed in MP system (pCPU1, pCPU2, ...), when the pCPU1 call the > spin_lock_irqsave(), > =C2=A0 It does only disable interrupts for pCPU1, and _not_ for other > pCPUn. > =C2=A0-Then, it is mixing interrupt disabled and enabled spinlocks. >=20 I was commenting on the "can lead to deadlock" part, which is something that, in general, we risk if we mix, but that can't possibly occur in this specific case. This is also what Jan is saying, and the reason why is also asking to mitigate this exact claim... So, I'm not sure what your point is here... > > Note that, mixing interrupt disabled and enabled spinlocks is > > something we > > disallow, > > Dario, could you share something in detail? > Sorry, I again don't understand... share something about what? I was proposing myself a text to be used as changelog, which I'm pasting again here below, for completeness/clarity. This sentence you seem to be asking about, is what we've been "debating", in this thread about the changelog, since almost the beginning: it's bad to mix, because it leads to inconsistencies can are bad because in general it would trigger BUG_ON, in debug builds, and cause deadlock, in non-debug builds, even if in this case neither happen. What is that you need more insights about? Here it is (again) what I would use as commit message: --- pci_get_pdev() doesn't require interrupts to be disabled when taking the pcidevs_lock, which protects its execution. So, spin_lock() can be used for that, and that is what is done almost everywhere. Currenlty, there is an exception, in early boot IOMMU initialization on AMD systems, where spin_lock_irqsave() and restore are used. However, since, in that case too: =C2=A0- we don't need to disable interrupts (for same reasons stated above)= , =C2=A0- interrupts are enabled already, so there is no need to save and =C2=A0=C2=A0=C2=A0restore flags, just change it into using spin_lock(), as everywhere. Note that, mixing interrupt disabled and enabled spinlocks is something we disallow, except for very special situations. And since this one does not qualify as such, using IRQ disabling variants can be considered a bug (which manages to not trigger the safety checking we have in place only because they're not yet enabled). --- Regards, Dario --=20 <> (Raistlin Majere) ----------------------------------------------------------------- Dario Faggioli, Ph.D, http://about.me/dario.faggioli Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK) --=-lUGpRmodSNumMV5HmM6b Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlbgIv8ACgkQk4XaBE3IOsTpqgCfVLF2wg0hGho5m7U4PBz25iQ3 HqoAn3jxXP2pvxNAjoZ0Nqd6//D9NMHb =nefM -----END PGP SIGNATURE----- --=-lUGpRmodSNumMV5HmM6b-- --===============0758813347207378762== Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: base64 Content-Disposition: inline X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KWGVuLWRldmVs IG1haWxpbmcgbGlzdApYZW4tZGV2ZWxAbGlzdHMueGVuLm9yZwpodHRwOi8vbGlzdHMueGVuLm9y Zy94ZW4tZGV2ZWwK --===============0758813347207378762==--