From: Jeff Layton <jlayton@kernel.org>
To: Kirill Tkhai <ktkhai@virtuozzo.com>,
	Matthew Wilcox <willy@infradead.org>
Cc: bfields@fieldses.org, viro@zeniv.linux.org.uk,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	boqun.feng@gmail.com, longman@redhat.com, peterz@infradead.org,
	mingo@redhat.com
Subject: Re: [PATCH] fasync: Fix deadlock between task-context and interrupt-context kill_fasync()
Date: Wed, 18 Apr 2018 16:00:07 -0400
Message-ID: <1524081607.8502.18.camel@kernel.org>
In-Reply-To: <3518b26a-2a81-1fb6-e01d-8d0b06eb0ad3@virtuozzo.com>

On Tue, 2018-04-17 at 17:15 +0300, Kirill Tkhai wrote:
> On 17.04.2018 17:01, Matthew Wilcox wrote:
> > On Thu, Apr 05, 2018 at 02:58:06PM +0300, Kirill Tkhai wrote:
> > > I observed the following deadlock between them:
> > > 
> > > [task 1]                          [task 2]                         [task 3]
> > > kill_fasync()                     mm_update_next_owner()           copy_process()
> > >  spin_lock_irqsave(&fa->fa_lock)   read_lock(&tasklist_lock)        write_lock_irq(&tasklist_lock)
> > >   send_sigio()                    <IRQ>                             ...
> > >    read_lock(&fown->lock)         kill_fasync()                     ...
> > >     read_lock(&tasklist_lock)      spin_lock_irqsave(&fa->fa_lock)  ...
> > > 
> > > Task 1 can't take tasklist_lock for read, because task 3 is already
> > > queued waiting to take it for write (and queued writers block new
> > > readers). Task 2 holds tasklist_lock for read, but it can't take the
> > > spinlock, which is held by task 1.
> > 
> > I think the important question is to Peter ... why didn't lockdep catch this?
> > 
> > > -		spin_lock_irq(&fa->fa_lock);
> > > +		write_lock_irq(&fa->fa_lock);
> > >  		fa->fa_file = NULL;
> > > -		spin_unlock_irq(&fa->fa_lock);
> > > +		write_unlock_irq(&fa->fa_lock);
> > 
> > ...
> > > -		spin_lock_irq(&fa->fa_lock);
> > > +		write_lock_irq(&fa->fa_lock);
> > >  		fa->fa_fd = fd;
> > > -		spin_unlock_irq(&fa->fa_lock);
> > > +		write_unlock_irq(&fa->fa_lock);
> > 
> > Do we really need a lock here?  If we convert each of these into WRITE_ONCE,
> 
> We want to pass a specific fd to send_sigio(), not a random one. Also, we want
> to dereference a specific file in kill_fasync_rcu() without the danger of it being
> freed in parallel. So, since readers of this data have no rcu_read_lock() or other
> protection, we *can't* drop the lock.
> 

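Right -- a bare READ_ONCE() only gets you a stable pointer value, it
doesn't keep the file alive. Just to illustrate the window (this isn't
code from the patch, only a sketch of what the lockless read would look
like):

        struct file *file = READ_ONCE(fa->fa_file);

        if (file) {
                /*
                 * Nothing stops fasync_remove_entry() from clearing
                 * fa_file and the final fput() from freeing the file
                 * right here, so the dereferences below can be
                 * use-after-free.
                 */
                if (!(sig == SIGURG && file->f_owner.signum == 0))
                        send_sigio(&file->f_owner, fa->fa_fd, band);
        }
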
I think it's probably possible to do this with RCU.

We'd need to put the whole fasync structure under RCU, stop resetting
fa_fd in fasync_insert_entry, and instead insert a new entry and delete
the old one when a matching entry already exists. That'd be no big deal
though, since we always allocate a new entry and pass it in anyway.

We could also turn the nasty singly-linked list into a standard hlist,
which would be nice.
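
Roughly the shape I have in mind -- completely untested, and the hlist
head and the reworked field names below are made up rather than taken
from any tree:

        /*
         * Entries become immutable after insertion, so readers never
         * need fa_lock for fa_fd/fa_file.
         */
        struct fasync_struct {
                int                     magic;
                int                     fa_fd;
                struct hlist_node       fa_node;        /* replaces fa_next */
                struct file             *fa_file;
                struct rcu_head         fa_rcu;
        };

        /*
         * "Update" turns into replace-and-retire, still serialized by
         * fasync_lock. "head" stands in for whatever replaces the
         * current fapp pointer chain.
         */
        hlist_add_head_rcu(&new->fa_node, head);
        if (old) {
                hlist_del_rcu(&old->fa_node);
                kfree_rcu(old, fa_rcu);
        }

        /*
         * Readers then walk the list under rcu_read_lock() alone
         * (the file-lifetime question you raise still needs an answer).
         */
        struct fasync_struct *fa;

        rcu_read_lock();
        hlist_for_each_entry_rcu(fa, head, fa_node) {
                struct fown_struct *fown = &fa->fa_file->f_owner;

                if (!(sig == SIGURG && fown->signum == 0))
                        send_sigio(fown, fa->fa_fd, band);
        }
        rcu_read_unlock();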

Regardless, for now we should probably take this patch and do any
further work on the code from that point. I'll plan to pick it up
unless Al or Bruce want it.

> > then 
> > 
> > ...
> > > -		spin_lock_irqsave(&fa->fa_lock, flags);
> > > +		read_lock(&fa->fa_lock);
> > >  		if (fa->fa_file) {
> > 
> > file = READ_ONCE(fa->fa_file)
> > 
> > then we're not opening any new races, are we?
> > 
> > >  			fown = &fa->fa_file->f_owner;
> > >  			/* Don't send SIGURG to processes which have not set a
> > > @@ -997,7 +996,7 @@ static void kill_fasync_rcu(struct fasync_struct *fa, int sig, int band)
> > >  			if (!(sig == SIGURG && fown->signum == 0))
> > >  				send_sigio(fown, fa->fa_fd, band);
> > >  		}
> > > -		spin_unlock_irqrestore(&fa->fa_lock, flags);
> > > +		read_unlock(&fa->fa_lock);
> > >  		fa = rcu_dereference(fa->fa_next);
> > >  	}
> > >  }
> > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > index c6baf767619e..297e2dcd9dd2 100644
> > > --- a/include/linux/fs.h
> > > +++ b/include/linux/fs.h
> > > @@ -1250,7 +1250,7 @@ static inline int locks_lock_file_wait(struct file *filp, struct file_lock *fl)
> > >  }
> > >  
> > >  struct fasync_struct {
> > > -	spinlock_t		fa_lock;
> > > +	rwlock_t		fa_lock;
> > >  	int			magic;
> > >  	int			fa_fd;
> > >  	struct fasync_struct	*fa_next; /* singly linked list */
> 
> Kirill
