linux-kernel.vger.kernel.org archive mirror
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Hugh Dickins <hugh@veritas.com>
Cc: Oliver Neukum <oliver@neukum.name>,
	Maneesh Soni <maneesh@in.ibm.com>,
	Greg Kroah-Hartman <gregkh@suse.de>, Adrian Bunk <bunk@stusta.de>,
	linux-kernel@vger.kernel.org
Subject: Re: 2.6.21-rc suspend regression: sysfs deadlock
Date: Tue, 6 Mar 2007 17:56:57 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0703061735350.5963@woody.linux-foundation.org>
In-Reply-To: <Pine.LNX.4.64.0703061914030.18144@blonde.wat.veritas.com>



On Tue, 6 Mar 2007, Hugh Dickins wrote:
> 
> This comes from Oliver's commit 94bebf4d1b8e7719f0f3944c037a21cfd99a4af7
> Driver core: fix race in sysfs between sysfs_remove_file() and read()/write()
> in 2.6.21-rc1.  It looks to me like sysfs_write_file downs buffer->sem
> while calling flush_write_buffer, and flushing that particular write
> buffer entails downing buffer->sem in orphan_all_buffers.
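
The self-deadlock described in the quoted text can be reproduced in userspace as a rough analogy. The sketch below is not the sysfs code: `struct buf`, `buf_init()`, `orphan()` and `write_file()` are hypothetical stand-ins, and a POSIX error-checking mutex replaces the kernel semaphore so that the second acquisition reports `EDEADLK` instead of hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a sysfs buffer with its ->sem. */
struct buf {
	pthread_mutex_t sem;
};

/* Use an error-checking mutex so a relock by the owner is reported. */
static int buf_init(struct buf *b)
{
	pthread_mutexattr_t attr;
	int err;

	err = pthread_mutexattr_init(&attr);
	if (err)
		return err;
	err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	if (!err)
		err = pthread_mutex_init(&b->sem, &attr);
	pthread_mutexattr_destroy(&attr);
	return err;
}

/* Plays the role of orphan_all_buffers(): it takes the lock itself. */
static int orphan(struct buf *b)
{
	int err = pthread_mutex_lock(&b->sem);

	if (err)
		return err;	/* EDEADLK when the caller already holds sem */
	pthread_mutex_unlock(&b->sem);
	return 0;
}

/* Plays the role of sysfs_write_file(): holds the lock across the flush. */
static int write_file(struct buf *b)
{
	int err;

	pthread_mutex_lock(&b->sem);
	err = orphan(b);	/* tries to take the same lock again */
	pthread_mutex_unlock(&b->sem);
	return err;
}
```

Called on a freshly initialised buffer, `write_file()` returns `EDEADLK`, while `orphan()` on its own succeeds. With an ordinary non-recursive lock, as in the kernel case, the nested acquisition would simply never return.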

Gaah. What a crock.

I really don't see any alternative to just reverting the whole change. 
Hugh's patch is simple, but rather pointless.

The fact is, the whole change is *bogus*.

We don't "lock" datastructures. We *reference count* them!

This is so fundamental that it's even mentioned in the file 
Documentation/CodingStyle in "Chapter 11: Data structures".

The whole "orphaned" kind of locking is broken. It's stupid. The way we do 
races between removal and use is that initial setup sets a reference count 
of 1, and something really simple like:

	static inline struct sysfs_buffer *get_sysfs_buffer(struct inode *inode)
	{
		struct sysfs_buffer *buffer = inode->i_private;

		BUG_ON(!mutex_is_locked(&inode->i_mutex));
		if (buffer)
			atomic_inc(&buffer->count);
		return buffer;
	}

	static inline void put_sysfs_buffer(struct sysfs_buffer *buffer)
	{
		if (atomic_dec_and_test(&buffer->count))
			kfree(buffer);
	}

and then the rule is:

 - everybody uses "get_sysfs_buffer()" to follow the reference (and yes, 
   you obviously have to hold "inode->i_mutex" for this to be safe! I 
   added the BUG_ON() as an example)

 - everybody uses "put_sysfs_buffer()" to release it (and we simply don't 
   *care* whether somebody else released it too, since everybody has a 
   reference count)

 - removing the buffer is now just

	mutex_lock(&inode->i_mutex);
	buffer = inode->i_private;
	inode->i_private = NULL;
	mutex_unlock(&inode->i_mutex);

	put_sysfs_buffer(buffer);

 - everybody is happy!
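
For anyone who wants to try the rules above outside the kernel, here is a minimal userspace sketch, assuming C11 atomics and a pthread mutex in place of `atomic_t` and `inode->i_mutex`. The names `struct node`, `node_init()`, `node_open_buffer()`, `node_remove_buffer()` and the `buffers_freed` counter are made up for illustration and are not part of sysfs.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures. */
struct buffer {
	atomic_int count;	/* reference count; setup starts it at 1 */
};

struct node {
	pthread_mutex_t lock;	/* plays the role of inode->i_mutex */
	struct buffer *private;	/* plays the role of inode->i_private */
};

static atomic_int buffers_freed;	/* illustration only: counts frees */

/* Caller must hold node->lock. */
static struct buffer *get_buffer(struct node *node)
{
	struct buffer *buf = node->private;

	if (buf)
		atomic_fetch_add(&buf->count, 1);
	return buf;
}

/* Drop one reference; whoever drops the last one frees the buffer. */
static void put_buffer(struct buffer *buf)
{
	int old;

	if (!buf)
		return;
	old = atomic_fetch_sub(&buf->count, 1);
	assert(old > 0);	/* more puts than gets would be a bug */
	if (old == 1) {
		free(buf);
		atomic_fetch_add(&buffers_freed, 1);
	}
}

/* Initial setup: the node itself owns the first reference. */
static int node_init(struct node *node)
{
	struct buffer *buf = malloc(sizeof(*buf));

	if (!buf)
		return -1;
	atomic_init(&buf->count, 1);
	pthread_mutex_init(&node->lock, NULL);
	node->private = buf;
	return 0;
}

/* A user holds the lock only long enough to take a reference. */
static struct buffer *node_open_buffer(struct node *node)
{
	struct buffer *buf;

	pthread_mutex_lock(&node->lock);
	buf = get_buffer(node);
	pthread_mutex_unlock(&node->lock);
	return buf;
}

/* Removal: unpublish under the lock, then drop the node's reference. */
static void node_remove_buffer(struct node *node)
{
	struct buffer *buf;

	pthread_mutex_lock(&node->lock);
	buf = node->private;
	node->private = NULL;
	pthread_mutex_unlock(&node->lock);

	put_buffer(buf);
}
```

If a user calls `node_open_buffer()` before `node_remove_buffer()` runs, the buffer stays alive until that user calls `put_buffer()`. Anyone arriving after the removal gets NULL and never touches freed memory. No lock is held while the buffer is actually being used, which is the point of the scheme.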

Anyway, I'm unable to revert the broken commit, since there are now other 
changes that depend on it, but can somebody *please* do that? I'll apply 
Hugh's silly patch in the meantime, just to avoid the lockup.

			Linus

