From: Boqun Feng <boqun.feng@gmail.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Damien Le Moal <damien.lemoal@opensource.wdc.com>,
	Wei Chen <harperchen1110@gmail.com>,
	linux-ide@vger.kernel.org, linux-kernel@vger.kernel.org,
	syzkaller-bugs@googlegroups.com,
	syzbot <syzkaller@googlegroups.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>,
	Jeff Layton <jlayton@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: possible deadlock in __ata_sff_interrupt
Date: Fri, 16 Dec 2022 19:25:59 -0800
Message-ID: <Y502x/oubigQGIrr@Boquns-Mac-mini.local>
In-Reply-To: <Y50ihHKFbderCqH1@ZenIV>

On Sat, Dec 17, 2022 at 01:59:32AM +0000, Al Viro wrote:
> On Fri, Dec 16, 2022 at 03:54:09PM -0800, Boqun Feng wrote:
> > On Fri, Dec 16, 2022 at 11:39:21PM +0000, Al Viro wrote:
> > > [Boqun Feng Cc'd]
> > > 
> > > On Fri, Dec 16, 2022 at 03:26:21AM -0800, Linus Torvalds wrote:
> > > > On Thu, Dec 15, 2022 at 7:41 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > > > >
> > > > > CPU1: ptrace(2)
> > > > >         ptrace_check_attach()
> > > > >                 read_lock(&tasklist_lock);
> > > > >
> > > > > CPU2: setpgid(2)
> > > > >         write_lock_irq(&tasklist_lock);
> > > > >         spins
> > > > >
> > > > > CPU1: takes an interrupt that would call kill_fasync().  grep and the
> > > > > first instance of kill_fasync() is in hpet_interrupt() - it's not
> > > > > something exotic.  IRQs disabled on CPU2 won't stop it.
> > > > >         kill_fasync(..., SIGIO, ...)
> > > > >                 kill_fasync_rcu()
> > > > >                         read_lock_irqsave(&fa->fa_lock, flags);
> > > > >                         send_sigio()
> > > > >                                 read_lock_irqsave(&fown->lock, flags);
> > > > >                                 read_lock(&tasklist_lock);
> > > > >
> > > > > ... and CPU1 spins as well.
> > > > 
> > > > Nope. See kernel/locking/qrwlock.c:
> > > 
> > > [snip rwlocks are inherently unfair, queued ones are somewhat milder, but
> > > all implementations have writers-starving behaviour for read_lock() at least
> > > when in_interrupt()]
> > > 
> > > D'oh...  Consider requested "Al, you are a moron" duly delivered...  I plead
> > > having been on way too low caffeine and too little sleep ;-/
> > > 
> > > Looking at the original report, looks like the scenario there is meant to be
> > > the following:
> > > 
> > > CPU1: read_lock(&tasklist_lock)
> > > 	tasklist_lock grabbed
> > > 
> > > CPU2: get an sg write(2) feeding request to libata; host->lock is taken,
> > > 	request is immediately completed and scsi_done() is about to be called.
> > > 	host->lock grabbed
> > > 
> > > CPU3: write_lock_irq(&tasklist_lock)
> > > 	spins on tasklist_lock until CPU1 gets through.
> > > 
> > > CPU2: get around to kill_fasync() called by sg_rq_end_io() and to grabbing
> > > 	tasklist_lock inside send_sigio()
> > > 	spins, since it's not in an interrupt and there's a pending writer
> > > 	host->lock is held, spin until CPU3 gets through.
> > 
> > Right, for a reader not in_interrupt(), it may be blocked by a random
> > waiting writer because of the fairness, even the lock is currently held
> > by a reader:
> > 
> > 	CPU 1			CPU 2		CPU 3
> > 	read_lock(&tasklist_lock); // get the lock
> > 
> > 						write_lock_irq(&tasklist_lock); // wait for the lock
> > 
> > 				read_lock(&tasklist_lock); // cannot get the lock because of the fairness
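
A minimal sketch of the admission rule in the diagram above, as a toy
userspace model rather than the real kernel/locking/qrwlock.c code.
The struct and function names are hypothetical; only the rule itself
(a process-context reader yields to a waiting writer, an
interrupt-context reader does not) is taken from this thread.

```c
#include <stdbool.h>

/* Toy snapshot of an rwlock's state (hypothetical, not the kernel layout). */
struct rw_model {
	int  readers;         /* readers currently holding the lock */
	bool writer_holding;  /* a writer currently owns the lock */
	bool writer_waiting;  /* a writer is spinning, waiting for readers */
};

/*
 * Would a new reader get the lock right away?
 *
 * - Nobody gets in while a writer holds the lock.
 * - A reader in interrupt context ignores a merely waiting writer.
 * - A reader in process context queues behind a waiting writer,
 *   even though the lock is currently held only by readers.
 */
static bool reader_can_enter(const struct rw_model *l, bool in_irq)
{
	if (l->writer_holding)
		return false;
	if (in_irq)
		return true;
	return !l->writer_waiting;
}
```

With CPU 1 holding the lock for read and CPU 3 waiting to write,
reader_can_enter() is false for CPU 2 in process context and true
for a reader running in interrupt context.
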
> 
> IOW, any caller of scsi_done() from non-interrupt context while
> holding a spinlock that is also taken in an interrupt...
> 
> And we have drivers/scsi/scsi_error.c:scsi_send_eh_cmnd(), which calls
> ->queuecommand() under a mutex, with
> #define DEF_SCSI_QCMD(func_name) \
>         int func_name(struct Scsi_Host *shost, struct scsi_cmnd *cmd)   \
>         {                                                               \
>                 unsigned long irq_flags;                                \
>                 int rc;                                                 \
>                 spin_lock_irqsave(shost->host_lock, irq_flags);         \
>                 rc = func_name##_lck(cmd);                              \
>                 spin_unlock_irqrestore(shost->host_lock, irq_flags);    \
>                 return rc;                                              \
>         }
> 
> being commonly used for ->queuecommand() instances.  So any scsi_done()
> in foo_lck() (quite a few of such) + use of ->host_lock in interrupt
> for the same driver (also common)...
> 
> I wonder why that hadn't triggered the same warning a long time
> ago - these warnings had been around for at least two years.
> 

FWIW, the complete dependency chain is:

	&host->lock --> &new->fa_lock --> &f->f_owner.lock --> tasklist_lock

For the "&f->f_owner.lock" part to get onto lockdep's radar, the
following call trace needs to appear at least once:

	kill_fasync():
	  kill_fasync_rcu():
	    send_sigio()

I'm not sure whether that is rare, though. And ->fa_lock also had its own
issue:

	https://lore.kernel.org/lkml/20210702091831.615042-1-desmondcheongzx@gmail.com/

which may have masked the &host->lock report for a while ;-)
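
A rough sketch of why the full chain can stay invisible until
send_sigio() has run under ->fa_lock at least once. This is a toy
reachability check over recorded "held --> acquired" pairs, not
lockdep's actual data structures; the enum, struct and function names
are all hypothetical.

```c
#include <stdbool.h>

enum lock_class {
	HOST_LOCK,      /* &host->lock */
	FA_LOCK,        /* &new->fa_lock */
	F_OWNER_LOCK,   /* &f->f_owner.lock */
	TASKLIST_LOCK,  /* tasklist_lock */
	NR_CLASSES
};

/* edge[a][b] is true once "b acquired while a is held" has been seen. */
struct dep_graph {
	bool edge[NR_CLASSES][NR_CLASSES];
};

static void record_dep(struct dep_graph *g, enum lock_class held,
		       enum lock_class acquired)
{
	g->edge[held][acquired] = true;
}

/* Depth-first search: is there a recorded chain from 'from' to 'to'? */
static bool reaches(const struct dep_graph *g, enum lock_class from,
		    enum lock_class to)
{
	bool seen[NR_CLASSES] = { false };
	enum lock_class stack[NR_CLASSES];
	int top = 0;

	stack[top++] = from;
	seen[from] = true;
	while (top > 0) {
		enum lock_class cur = stack[--top];

		if (cur == to)
			return true;
		for (int next = 0; next < NR_CLASSES; next++) {
			if (g->edge[cur][next] && !seen[next]) {
				seen[next] = true;
				stack[top++] = (enum lock_class)next;
			}
		}
	}
	return false;
}
```

In this model, recording only HOST_LOCK --> FA_LOCK leaves
reaches(&g, HOST_LOCK, TASKLIST_LOCK) false; it turns true only after
the two dependencies contributed by the send_sigio() path,
FA_LOCK --> F_OWNER_LOCK and F_OWNER_LOCK --> TASKLIST_LOCK, have been
recorded as well.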

Regards,
Boqun

> Am I missing something here?
