xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: "Xu, Quan" <quan.xu@intel.com>
To: Meng Xu <mengxu@cis.upenn.edu>
Cc: Dario Faggioli <dario.faggioli@citrix.com>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Jan Beulich <jbeulich@suse.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH v3 1/2] IOMMU/spinlock: Fix a bug found in AMD IOMMU initialization
Date: Fri, 11 Mar 2016 06:54:13 +0000	[thread overview]
Message-ID: <945CA011AD5F084CBEA3E851C0AB28894B861913@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <CAENZ-+=zh-fEC7kx+7S4P86izSJ+ivqnEgkaFeP1oVdajm36CQ@mail.gmail.com>

On March 11, 2016 11:25am, <mengxu@cis.upenn.edu> wrote:
> On Wed, Mar 9, 2016 at 8:17 AM, Quan Xu <quan.xu@intel.com> wrote:
> > pcidevs_lock should be held with interrupt enabled. However there
> > remains an exception in AMD IOMMU code, where the lock is acquired
> > with interrupt disabled. This inconsistency might lead to deadlock.
> 
> Why will this inconsistency lead to deadlock?
> I understand the difference between spin_lock_irqsave(), which disable interrupt,
> and spin_lock(), which allows interrupt, however, I didn't get why misuse the
> spin_lock_irqsave() at the place of spin_lock() could potentially lead to a
> deadlock?

For above 2 questions:
Mix is just one of _possible_ scenarios(IMO, not 100% leading to a deadlock):

-where an interrupt tries to lock an already locked variable. This is ok if
the other interrupt happens on another CPU, but it is _not_ ok if the
interrupt happens on the same CPU that already holds the lock

I got it from the bellow 2 points:
 1).As Jan mentioned, The implication from disabling interrupts while acquiring a lock is that the lock is also being acquired by some interrupt handler.
  If you mix acquire types, the one not disabling interrupts is prone to be interrupted, and the interrupt trying to get hold of the lock the same CPU already owns.

 2). Comment inside check_lock(), 
we partition locks into IRQ-safe (always held with IRQs disabled) and
IRQ-unsafe (always held with IRQs enabled) types. The convention for
every lock must be consistently observed else we can deadlock in
IRQ-context rendezvous functions (__a rendezvous which gets every CPU
into IRQ context before any CPU is released from the rendezvous__).
If we can mix IRQ-disabled and IRQ-enabled callers, the following can
happen:
 * Lock is held by CPU A, with IRQs enabled
 * CPU B is spinning on same lock, with IRQs disabled
 * Rendezvous starts -- CPU A takes interrupt and enters rendezbous spin
 * DEADLOCK -- CPU B will never enter rendezvous, CPU A will never exit
               the rendezvous, and will hence never release the lock.

To guard against this subtle bug we latch the IRQ safety of every
spinlock in the system, on first use.

> Would you minding pointing me to somewhere I can find the reason or
> enlighten me?
> 

Some reference material:

   * (look at "Lesson 3: spinlocks revisited.")
     https://www.kernel.org/doc/Documentation/locking/spinlocks.txt
   * comment inside check_lock(), in xen/common/spinlock.c, in Xen's codebase.
   * the "Linux Device Drivers, 3rd Edition" book , http://www.makelinux.net/ldd3/chp-5-sect-5

Quan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

  reply	other threads:[~2016-03-11  6:54 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-09 13:17 [PATCH v3 0/2] Make the pcidevs_lock a recursive one Quan Xu
2016-03-09 13:17 ` [PATCH v3 1/2] IOMMU/spinlock: Fix a bug found in AMD IOMMU initialization Quan Xu
2016-03-09 14:59   ` Dario Faggioli
2016-03-10  6:12     ` Xu, Quan
2016-03-11  3:24   ` Meng Xu
2016-03-11  6:54     ` Xu, Quan [this message]
2016-03-11 10:35       ` Dario Faggioli
2016-03-11 12:36         ` Xu, Quan
2016-03-11 13:58           ` Dario Faggioli
2016-03-11 14:49           ` Meng Xu
2016-03-11 15:55             ` Dario Faggioli
2016-03-11 17:17               ` Meng Xu
2016-03-11 14:41         ` Meng Xu
2016-03-11 16:12           ` Dario Faggioli
2016-03-09 13:17 ` [PATCH v3 2/2] IOMMU/spinlock: Make the pcidevs_lock a recursive one Quan Xu
2016-03-09 17:45   ` Dario Faggioli
2016-03-10  1:21     ` Xu, Quan
2016-03-10  9:52   ` Jan Beulich
2016-03-10 11:27     ` Xu, Quan
2016-03-10 13:06       ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=945CA011AD5F084CBEA3E851C0AB28894B861913@SHSMSX101.ccr.corp.intel.com \
    --to=quan.xu@intel.com \
    --cc=dario.faggioli@citrix.com \
    --cc=jbeulich@suse.com \
    --cc=mengxu@cis.upenn.edu \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).