* [patch] inotify: locking
@ 2004-09-22 19:37 Robert Love
  2004-09-22 19:38 ` Robert Love
  2004-09-22 23:22 ` John McCutchan
  0 siblings, 2 replies; 7+ messages in thread
From: Robert Love @ 2004-09-22 19:37 UTC (permalink / raw)
  To: ttb; +Cc: linux-kernel

Hey, John.

I went over the locking in drivers/char/inotify.c.  It looks right.

I made two major changes:

	- In a couple places you used irq-safe locks, but in most places
	  you did not.  It has to be all or nothing.  We do not currently
	  need protection from interrupts, so I changed the few
	  irq-safe locks on dev->lock to normal spin_lock/spin_unlock
	  calls.

	- dev->event_count was an atomic_t, but it was never accessed
	  outside of dev->lock.  I also did not see why ->event_count
	  was atomic but not ->nr_watches.  So I made event_count an
	  unsigned int and removed the atomic operations.

The rest of the (admittedly a bit large) patch is documenting the
locking rules.  I tried to put the locking assumptions in comments at
the top of each function.  I made some coding style cleanups as I went
along, too, but not too many (those come next).

I do have one remaining concern: create_watcher() is called without the
lock on dev, but it later obtains the lock, before it touches dev.  So
it is safe in that regard, but what if dev is deallocated before it
grabs the lock?  dev is passed in, so, for example, dev could be freed
(or otherwise manipulated) and then the dereference of dev->lock would
oops.  A couple other functions do this.  We probably need proper ref
counting on dev. BUT, are all of these call chains off of VFS functions
on the device?  Perhaps so long as the device is open it is pinned?

Attached patch is against your latest, plus the previous postings.

Thanks,

	Robert Love



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [patch] inotify: locking
  2004-09-22 19:37 [patch] inotify: locking Robert Love
@ 2004-09-22 19:38 ` Robert Love
  2004-09-23  9:09   ` Paul Jackson
  2004-09-22 23:22 ` John McCutchan
  1 sibling, 1 reply; 7+ messages in thread
From: Robert Love @ 2004-09-22 19:38 UTC (permalink / raw)
  To: ttb; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 316 bytes --]

On Wed, 2004-09-22 at 15:37 -0400, Robert Love wrote:

> Attached patch is against your latest, plus the previous postings.

Evolution needs a heuristic to detect when I say that I attached
something but in reality did not.  I do this all the time -
sorry.

Patch attached for real this time.

	Robert Love


[-- Attachment #2: inotify-locking-1.patch --]
[-- Type: text/x-patch, Size: 10579 bytes --]

 drivers/char/inotify.c |  189 +++++++++++++++++++++++++++++--------------------
 1 files changed, 114 insertions(+), 75 deletions(-)

--- linux-inotify/drivers/char/inotify.c	2004-09-22 15:30:58.774848096 -0400
+++ linux/drivers/char/inotify.c	2004-09-22 15:30:46.207758584 -0400
@@ -73,7 +73,7 @@
 	struct list_head 	events;
 	struct list_head 	watchers;
 	spinlock_t		lock;
-	atomic_t		event_count;
+	unsigned int		event_count;
 	int			nr_watches;
 };
 
@@ -182,22 +182,33 @@
 						(event_mask == IN_IGNORED) || \
 						(event_mask & watchers_mask))
 
-
-/* dev->lock == locked before calling */
-static void inotify_dev_queue_event (struct inotify_device *dev, struct inotify_watcher *watcher, int mask, const char *filename) {
+/*
+ * inotify_dev_queue_event - add a new event to the given device
+ *
+ * Caller must hold dev->lock.
+ */
+static void inotify_dev_queue_event (struct inotify_device *dev,
+				     struct inotify_watcher *watcher, int mask,
+				     const char *filename)
+{
 	struct inotify_kernel_event *kevent;
 
-	/* The queue has already overflowed and we have already sent the Q_OVERFLOW event */
-	if (atomic_read(&dev->event_count) > MAX_INOTIFY_QUEUED_EVENTS) {
-		iprintk(INOTIFY_DEBUG_EVENTS, "event queue for %p overflowed\n", dev);
+	/*
+	 * the queue has already overflowed and we have already sent the
+	 * Q_OVERFLOW event
+	 */
+	if (dev->event_count > MAX_INOTIFY_QUEUED_EVENTS) {
+		iprintk(INOTIFY_DEBUG_EVENTS,
+			"event queue for %p overflowed\n", dev);
 		return;
 	}
 
-	/* The queue has just overflowed and we need to notify user space that it has */
-	if (atomic_read(&dev->event_count) == MAX_INOTIFY_QUEUED_EVENTS) {
-		atomic_inc(&dev->event_count);
+	/* the queue has just overflowed and we need to notify user space */
+	if (dev->event_count == MAX_INOTIFY_QUEUED_EVENTS) {
+		dev->event_count++;
 		kevent = kernel_event(-1, IN_Q_OVERFLOW, NULL);
-		iprintk(INOTIFY_DEBUG_EVENTS, "sending IN_Q_OVERFLOW to %p\n", dev);
+		iprintk(INOTIFY_DEBUG_EVENTS, "sending IN_Q_OVERFLOW to %p\n",
+			dev);
 		goto add_event_to_queue;
 	}
 
@@ -205,13 +216,13 @@
 		return;
 	}
 
-	atomic_inc(&dev->event_count);
+	dev->event_count++;
 	kevent = kernel_event (watcher->wd, mask, filename);
 
 add_event_to_queue:
 	if (!kevent) {
 		iprintk(INOTIFY_DEBUG_EVENTS, "failed to queue event for %p -- could not allocate kevent\n", dev);
-		atomic_dec(&dev->event_count);
+		dev->event_count--;
 		return;
 	}
 
@@ -222,66 +233,77 @@
 
 
 
-
-static void inotify_dev_event_dequeue (struct inotify_device *dev) {
+/*
+ * inotify_dev_event_dequeue - destroy an event on the given device
+ *
+ * Caller must hold dev->lock.
+ */
+static void inotify_dev_event_dequeue(struct inotify_device *dev)
+{
 	struct inotify_kernel_event *kevent;
 
-	if (!inotify_dev_has_events (dev)) {
+	if (!inotify_dev_has_events(dev))
 		return;
-	}
 
 	kevent = inotify_dev_get_event(dev);
 
 	list_del(&kevent->list);
-	atomic_dec(&dev->event_count);
+	dev->event_count--;
 
 	delete_kernel_event (kevent);
 
 	iprintk(INOTIFY_DEBUG_EVENTS, "dequeued event on %p\n", dev);
 }
 
+/*
+ * inotify_dev_get_wd - returns the next WD for use by the given dev
+ *
+ * Caller must hold dev->lock before calling.
+ */
 static int inotify_dev_get_wd (struct inotify_device *dev)
 {
 	int wd;
 
-	wd = -1;
-
-	if (!dev)
-		return -1;
-
-	if (dev->nr_watches == MAX_INOTIFY_DEV_WATCHERS) {
+	if (!dev || dev->nr_watches == MAX_INOTIFY_DEV_WATCHERS)
 		return -1;
-	}
 
 	dev->nr_watches++;
-
 	wd = find_first_zero_bit (dev->bitmask, MAX_INOTIFY_DEV_WATCHERS);
-
 	set_bit (wd, dev->bitmask);
 
 	return wd;
 }
 
-static int inotify_dev_put_wd (struct inotify_device *dev, int wd)
+/*
+ * inotify_dev_put_wd - release the given WD on the given device
+ *
+ * Caller must hold dev->lock.
+ */
+static int inotify_dev_put_wd(struct inotify_device *dev, int wd)
 {
-	if (!dev||wd < 0)
+	if (!dev || wd < 0)
 		return -1;
 
 	dev->nr_watches--;
-
-	clear_bit (wd, dev->bitmask);
+	clear_bit(wd, dev->bitmask);
 
 	return 0;
 }
 
-
-static struct inotify_watcher *create_watcher (struct inotify_device *dev, int mask, struct inode *inode) {
+/*
+ * create_watcher - creates a watcher on the given device.
+ *
+ * Grabs dev->lock, so the caller must not hold it.
+ */
+static struct inotify_watcher *create_watcher (struct inotify_device *dev,
+					       int mask, struct inode *inode)
+{
 	struct inotify_watcher *watcher;
 
 	watcher = kmem_cache_alloc (watcher_cache, GFP_KERNEL);
-
 	if (!watcher) {
-		iprintk(INOTIFY_DEBUG_ALLOC, "failed to allocate watcher (%p,%d)\n", inode, mask);
+		iprintk(INOTIFY_DEBUG_ALLOC,
+			"failed to allocate watcher (%p,%d)\n", inode, mask);
 		return NULL;
 	}
 
@@ -294,7 +316,7 @@
 	INIT_LIST_HEAD(&watcher->u_list);
 
 	spin_lock(&dev->lock);
-		watcher->wd = inotify_dev_get_wd (dev);
+	watcher->wd = inotify_dev_get_wd (dev);
 	spin_unlock(&dev->lock);
 
 	if (watcher->wd < 0) {
@@ -309,29 +331,35 @@
 	return watcher;
 }
 
-/* Must be called with dev->lock held */
-static void delete_watcher (struct inotify_device *dev, struct inotify_watcher *watcher) {
+/*
+ * delete_watcher - removes the given 'watcher' from the given 'dev'
+ *
+ * Caller must hold dev->lock.
+ */
+static void delete_watcher (struct inotify_device *dev,
+			    struct inotify_watcher *watcher)
+{
 	inotify_dev_put_wd (dev, watcher->wd);
-
 	iprintk(INOTIFY_DEBUG_ALLOC, "free'd watcher %p\n", watcher);
-
 	kmem_cache_free (watcher_cache, watcher);
-
 	watcher_object_count--;
 }
 
-
-static struct inotify_watcher *inode_find_dev (struct inode *inode, struct inotify_device *dev) {
+/*
+ * inotify_find_dev - find the watcher associated with the given inode and dev
+ *
+ * Caller must hold dev->lock.
+ */
+static struct inotify_watcher *inode_find_dev (struct inode *inode,
+					       struct inotify_device *dev)
+{
 	struct inotify_watcher *watcher;
 
-	watcher = NULL;
-
 	list_for_each_entry (watcher, &inode->watchers, i_list) {
-		if (watcher->dev == dev) {
+		if (watcher->dev == dev)
 			return watcher;
-		}
-
 	}
+
 	return NULL;
 }
 
@@ -385,16 +413,20 @@
 	return error;
 }
 
-static int inotify_dev_rm_watcher (struct inotify_device *dev, struct inotify_watcher *watcher) {
+/*
+ * inotify_dev_rm_watcher - remove the given watch from the given device
+ *
+ * Caller must hold dev->lock because we call inotify_dev_queue_event().
+ */
+static int inotify_dev_rm_watcher (struct inotify_device *dev,
+				   struct inotify_watcher *watcher)
+{
 	int error;
 
 	error = -EINVAL;
-
 	if (watcher) {
 		inotify_dev_queue_event (dev, watcher, IN_IGNORED, NULL);
-
 		list_del(&watcher->d_list);
-
 		error = 0;
 	} 
 
@@ -413,16 +445,21 @@
 	inode->watchers_mask = new_mask;
 }
 
-static int inode_add_watcher (struct inode *inode, struct inotify_watcher *watcher) {
-	if (!inode||!watcher||inode_find_dev (inode, watcher->dev))
+/*
+ * inode_add_watcher - add a watcher to the given inode
+ *
+ * Callers must hold dev->lock, because we call inode_find_dev().
+ */
+static int inode_add_watcher (struct inode *inode,
+			      struct inotify_watcher *watcher)
+{
+	if (!inode || !watcher || inode_find_dev (inode, watcher->dev))
 		return -EINVAL;
 
 	list_add(&watcher->i_list, &inode->watchers);
 	inode->watcher_count++;
-
 	inode_update_watchers_mask (inode);
 
-
 	return 0;
 }
 
@@ -440,16 +477,18 @@
 
 /* Kernel API */
 
-void inotify_inode_queue_event (struct inode *inode, unsigned long mask, const char *filename) {
+void inotify_inode_queue_event (struct inode *inode, unsigned long mask,
+				const char *filename)
+{
 	struct inotify_watcher *watcher;
 
 	spin_lock(&inode->i_lock);
 
-		list_for_each_entry (watcher, &inode->watchers, i_list) {
-			spin_lock(&watcher->dev->lock);
-				inotify_dev_queue_event (watcher->dev, watcher, mask, filename);
-			spin_unlock(&watcher->dev->lock);
-		}
+	list_for_each_entry(watcher, &inode->watchers, i_list) {
+		spin_lock(&watcher->dev->lock);
+		inotify_dev_queue_event(watcher->dev, watcher, mask, filename);
+		spin_unlock(&watcher->dev->lock);
+	}
 
 	spin_unlock(&inode->i_lock);
 }
@@ -479,7 +518,7 @@
 	dev = watcher->dev;
 
 	if (event)
-		inotify_dev_queue_event (dev, watcher, event, NULL);
+		inotify_dev_queue_event(dev, watcher, event, NULL);
 
 	inode_rm_watcher (inode, watcher);
 	inotify_dev_rm_watcher (watcher->dev, watcher);
@@ -529,7 +568,7 @@
 	INIT_LIST_HEAD(&umount);
 
 	spin_lock(&inode_lock);
-		build_umount_list (&inode_in_use, sb, &umount);
+	build_umount_list (&inode_in_use, sb, &umount);
 	spin_unlock(&inode_lock);
 
 	process_umount_list (&umount);
@@ -584,7 +623,6 @@
 		goto out;
 
 	events = 0;
-	event_count = 0;
 	out = 0;
 	err = 0;
 
@@ -609,12 +647,12 @@
 		}
 	}
 
-	spin_lock_irq(&dev->lock);
+	spin_lock(&dev->lock);
 
 	add_wait_queue(&dev->wait, &wait);
 repeat:
 	if (signal_pending(current)) {
-		spin_unlock_irq (&dev->lock);
+		spin_unlock(&dev->lock);
 		out = -ERESTARTSYS;
 		set_current_state (TASK_RUNNING);
 		remove_wait_queue(&dev->wait, &wait);
@@ -622,9 +660,9 @@
 	}
 	set_current_state(TASK_INTERRUPTIBLE);
 	if (!inotify_dev_has_events (dev)) {
-		spin_unlock_irq (&dev->lock);
+		spin_unlock(&dev->lock);
 		schedule ();
-		spin_lock_irq (&dev->lock);
+		spin_lock(&dev->lock);
 		goto repeat;
 	}
 
@@ -646,7 +684,7 @@
 		inotify_dev_event_dequeue (dev);
 	}
 
-	spin_unlock_irq (&dev->lock);
+	spin_unlock(&dev->lock);
 
 	/* Send the event buffer to user space */
 	err = copy_to_user (buf, eventbuf, events * sizeof(struct inotify_event));
@@ -693,7 +731,7 @@
 	init_timer(&dev->timer);
 	init_waitqueue_head(&dev->wait);
 
-	atomic_set(&dev->event_count, 0);
+	dev->event_count = 0;
 	dev->nr_watches = 0;
 	dev->lock = SPIN_LOCK_UNLOCKED;
 
@@ -719,12 +757,14 @@
 	}
 }
 
+/*
+ * inotify_release_all_events - destroy all of the events on a given device
+ */
 static void inotify_release_all_events (struct inotify_device *dev)
 {
 	spin_lock(&dev->lock);
-	while (inotify_dev_has_events(dev)) {
+	while (inotify_dev_has_events(dev))
 		inotify_dev_event_dequeue(dev);
-	}
 	spin_unlock(&dev->lock);
 }
 
@@ -797,7 +837,6 @@
 	spin_unlock (&inode->i_lock);
 	spin_unlock (&dev->lock);
 
-
 	watcher = create_watcher (dev, request->mask, inode);
 
 	if (!watcher) {
@@ -870,7 +909,7 @@
 	spin_lock(&dev->lock);
 
 	printk (KERN_ALERT "inotify device: %p\n", dev);
-	printk (KERN_ALERT "inotify event_count: %d\n", atomic_read(&dev->event_count));
+	printk (KERN_ALERT "inotify event_count: %u\n", dev->event_count);
 	printk (KERN_ALERT "inotify watch_count: %d\n", dev->nr_watches);
 
 	spin_unlock(&dev->lock);


* Re: [patch] inotify: locking
  2004-09-22 23:22 ` John McCutchan
@ 2004-09-22 23:22   ` Robert Love
  0 siblings, 0 replies; 7+ messages in thread
From: Robert Love @ 2004-09-22 23:22 UTC (permalink / raw)
  To: John McCutchan; +Cc: linux-kernel

On Wed, 2004-09-22 at 19:22 -0400, John McCutchan wrote:

> Okay, this is my first kernel project so I didn't know/follow all of the
> rules, I admit it is a bit of a mishmash.

Heh, none of these issues were big, and everything else was fine.

> Yes, AFAIK the only places where we rely on the dev not going away are
> when we are handling a request from user space. As long as VFS
> operations are serialized I don't think we have to worry about that.

You can see what locks and serialization the VFS uses in
Documentation/filesystems/Locking.

	Robert Love




* Re: [patch] inotify: locking
  2004-09-22 19:37 [patch] inotify: locking Robert Love
  2004-09-22 19:38 ` Robert Love
@ 2004-09-22 23:22 ` John McCutchan
  2004-09-22 23:22   ` Robert Love
  1 sibling, 1 reply; 7+ messages in thread
From: John McCutchan @ 2004-09-22 23:22 UTC (permalink / raw)
  To: Robert Love; +Cc: linux-kernel

On Wed, 2004-09-22 at 15:37, Robert Love wrote:
> Hey, John.
> 
> I went over the locking in drivers/char/inotify.c.  It looks right.
> 
> I made two major changes:
> 
> 	- In a couple places you used irq-safe locks, but in most places
> 	  you did not.  It has to be all or nothing.  We do not currently
> 	  need protection from interrupts, so I changed the few
> 	  irq-safe locks on dev->lock to normal spin_lock/spin_unlock
> 	  calls.
> 
> 	- dev->event_count was an atomic_t, but it was never accessed
> 	  outside of dev->lock.  I also did not see why ->event_count
> 	  was atomic but not ->nr_watches.  So I made event_count an
> 	  unsigned int and removed the atomic operations.
> 

Okay, this is my first kernel project so I didn't know/follow all of the
rules, I admit it is a bit of a mishmash.

> The rest of the (admittedly a bit large) patch is documenting the
> locking rules.  I tried to put the locking assumptions in comments at
> the top of each function.  I made some coding style cleanups as I went
> along, too, but not too many (those come next).
> 

The patch and your previous patches look excellent. I have applied them
to my tree and I will be making a new release this evening.

> I do have one remaining concern: create_watcher() is called without the
> lock on dev, but it later obtains the lock, before it touches dev.  So
> it is safe in that regard, but what if dev is deallocated before it
> grabs the lock?  dev is passed in, so, for example, dev could be freed
> (or otherwise manipulated) and then the dereference of dev->lock would
> oops.  A couple other functions do this.  We probably need proper ref
> counting on dev. BUT, are all of these call chains off of VFS functions
> on the device?  Perhaps so long as the device is open it is pinned?
> 

Yes, AFAIK the only places where we rely on the dev not going away are
when we are handling a request from user space. As long as VFS
operations are serialized I don't think we have to worry about that.

John


* Re: [patch] inotify: locking
  2004-09-22 19:38 ` Robert Love
@ 2004-09-23  9:09   ` Paul Jackson
  0 siblings, 0 replies; 7+ messages in thread
From: Paul Jackson @ 2004-09-23  9:09 UTC (permalink / raw)
  To: Robert Love; +Cc: ttb, linux-kernel

> Evolution needs a heuristic to detect when I say that I attached
> something but in reality did not.

I've mostly stopped submitting patches using my email client, and
instead submit them using a variant of a 'patch bomb' script.  This has
significantly improved the clerical accuracy of my patch submissions.

The variant I'm using is in pretty good shape - you're welcome to
give it a try.  See the embedded Usage string for documentation.

  http://www.speakeasy.org/~pj99/sgi/sendpatchset

It handles sending one or several related patches, to a list of email
addresses.  You prepare a text directive file with the addresses,
subjects and pathnames to the files containing the message contents.
Then you send it all off with a single invocation of this 'sendpatchset'
script.

-- 
                          I won't rest till it's the best ...
                          Programmer, Linux Scalability
                          Paul Jackson <pj@sgi.com> 1.650.933.1373


* Re: [patch] inotify: locking
  2004-09-30 19:01 ` [patch] inotify: locking Robert Love
@ 2004-09-30 19:17   ` Robert Love
  0 siblings, 0 replies; 7+ messages in thread
From: Robert Love @ 2004-09-30 19:17 UTC (permalink / raw)
  To: John McCutchan; +Cc: linux-kernel, gamin-list, viro, akpm, iggy

[-- Attachment #1: Type: text/plain, Size: 387 bytes --]

On Thu, 2004-09-30 at 15:01 -0400, Robert Love wrote:

> I finally got around to reviewing the locking in inotify, again, in
> response to Andrew's points.

As I was saying.  I need to walk down the hall and have the Evolution
hackers add the "Robert intended to add a patch but did not, so let's
automatically add it for him" feature.

Here it is, this time for serious.

	Robert Love


[-- Attachment #2: inotify-0.11.0-rml-locking-redux-1.patch --]
[-- Type: text/x-patch, Size: 6149 bytes --]

locking cleanup for inotify, the code equivalent of fine cheese.

Signed-Off-By: Robert Love <rml@novell.com>

 drivers/char/inotify.c |   86 ++++++++++++++++++++++++++++++-------------------
 1 files changed, 54 insertions(+), 32 deletions(-)

diff -urN linux-inotify/drivers/char/inotify.c linux/drivers/char/inotify.c
--- linux-inotify/drivers/char/inotify.c	2004-09-30 12:09:41.317737160 -0400
+++ linux/drivers/char/inotify.c	2004-09-30 14:18:33.369286656 -0400
@@ -65,9 +65,12 @@
  * implies that the given WD is valid, unset implies it is not.
  *
  * This structure is protected by 'lock'.  Lock ordering:
+ *
+ * inode->i_lock
  *	dev->lock
  *		dev->wait->lock
- * FIXME: Define lock ordering wrt inode and dentry locking!
+ *
+ * FIXME: Look at replacing i_lock with i_sem.
  */
 struct inotify_device {
 	DECLARE_BITMAP(bitmask, MAX_INOTIFY_DEV_WATCHERS);
@@ -522,8 +525,13 @@
 {
 	struct inotify_watcher *watcher;
 
-	spin_lock(&inode1->i_lock);
-	spin_lock(&inode2->i_lock);
+	if (inode1 < inode2) {
+		spin_lock(&inode1->i_lock);
+		spin_lock(&inode2->i_lock);
+	} else {
+		spin_lock(&inode2->i_lock);
+		spin_lock(&inode1->i_lock);
+	}
 
 	list_for_each_entry(watcher, &inode1->watchers, i_list) {
 		spin_lock(&watcher->dev->lock);
@@ -550,12 +558,9 @@
 {
 	struct dentry *parent;
 
-	spin_lock(&dentry->d_lock);
-	dget(dentry->d_parent);
-	parent = dentry->d_parent;
+	parent = dget_parent(dentry);
 	inotify_inode_queue_event(parent->d_inode, mask, filename);
 	dput(parent);
-	spin_unlock(&dentry->d_lock);
 }
 EXPORT_SYMBOL_GPL(inotify_dentry_parent_queue_event);
 
@@ -564,8 +569,8 @@
 	struct inotify_device *dev;
 	struct inode *inode;
 
-	spin_lock(&watcher->dev->lock);
 	spin_lock(&watcher->inode->i_lock);
+	spin_lock(&watcher->dev->lock);
 
 	inode = watcher->inode;
 	dev = watcher->dev;
@@ -577,24 +582,30 @@
 	inotify_dev_rm_watcher(watcher->dev, watcher);
 	list_del(&watcher->u_list);
 
-	spin_unlock(&inode->i_lock);
 	delete_watcher(dev, watcher);
 	spin_unlock(&dev->lock);
+	spin_unlock(&inode->i_lock);
 
 	unref_inode(inode);
 }
 
-static void process_umount_list(struct list_head *umount) {
+static void process_umount_list(struct list_head *umount)
+{
 	struct inotify_watcher *watcher, *next;
 
 	list_for_each_entry_safe(watcher, next, umount, u_list)
 		ignore_helper(watcher, IN_UNMOUNT);
 }
 
+/*
+ * build_umount_list - build a list of watchers affected by an unmount.
+ *
+ * Caller must hold inode_lock.
+ */
 static void build_umount_list(struct list_head *head, struct super_block *sb,
 			       struct list_head *umount)
 {
-	struct inode *	inode;
+	struct inode *inode;
 
 	list_for_each_entry(inode, head, i_list) {
 		struct inotify_watcher *watcher;
@@ -625,6 +636,11 @@
 }
 EXPORT_SYMBOL_GPL(inotify_super_block_umount);
 
+/*
+ * inotify_inode_is_dead - an inode has been deleted, cleanup any watches
+ *
+ * FIXME: Callers need to always hold inode->i_lock.
+ */
 void inotify_inode_is_dead(struct inode *inode)
 {
 	struct inotify_watcher *watcher, *next;
@@ -752,6 +768,11 @@
 	return 0;
 }
 
+/*
+ * inotify_release_all_watchers - destroy all watchers on a given device
+ *
+ * Caller must hold dev->lock.
+ */
 static void inotify_release_all_watchers(struct inotify_device *dev)
 {
 	struct inotify_watcher *watcher,*next;
@@ -762,13 +783,13 @@
 
 /*
  * inotify_release_all_events - destroy all of the events on a given device
+ *
+ * Caller must hold dev->lock.
  */
 static void inotify_release_all_events(struct inotify_device *dev)
 {
-	spin_lock(&dev->lock);
 	while (inotify_dev_has_events(dev))
 		inotify_dev_event_dequeue(dev);
-	spin_unlock(&dev->lock);
 }
 
 
@@ -777,8 +798,10 @@
 	struct inotify_device *dev;
 
 	dev = file->private_data;
+	spin_lock(&dev->lock);
 	inotify_release_all_watchers(dev);
 	inotify_release_all_events(dev);
+	spin_unlock(&dev->lock);
 	kfree(dev);
 
 	printk(KERN_ALERT "inotify device released\n");
@@ -790,22 +813,18 @@
 static int inotify_watch(struct inotify_device *dev,
 			 struct inotify_watch_request *request)
 {
-	int err;
 	struct inode *inode;
 	struct inotify_watcher *watcher;
-	err = 0;
 
 	inode = find_inode(request->dirname);
-	if (IS_ERR(inode)) {
-		err = PTR_ERR(inode);
-		goto exit;
-	}
+	if (IS_ERR(inode))
+		return PTR_ERR(inode);
 
 	if (!S_ISDIR(inode->i_mode))
 		iprintk(INOTIFY_DEBUG_ERRORS, "watching file\n");
 
-	spin_lock(&dev->lock);
 	spin_lock(&inode->i_lock);
+	spin_lock(&dev->lock);
 
 	/*
 	 * This handles the case of re-adding a directory we are already
@@ -820,15 +839,15 @@
 		owatcher = inode_find_dev(inode, dev);
 		owatcher->mask = request->mask;
 		inode_update_watchers_mask(inode);
+		spin_unlock(&dev->lock);		
 		spin_unlock(&inode->i_lock);
-		spin_unlock(&dev->lock);
 		unref_inode(inode);
 
 		return 0;
 	}
 
+	spin_unlock(&dev->lock);	
 	spin_unlock(&inode->i_lock);
-	spin_unlock(&dev->lock);
 
 	watcher = create_watcher(dev, request->mask, inode);
 	if (!watcher) {
@@ -836,31 +855,27 @@
 		return -ENOSPC;
 	}
 
+	spin_lock(&inode->i_lock);	
 	spin_lock(&dev->lock);
-	spin_lock(&inode->i_lock);
 
 	/* We can't add anymore watchers to this device */
 	if (inotify_dev_add_watcher(dev, watcher) == -ENOSPC) {
 		iprintk(INOTIFY_DEBUG_ERRORS,
 			"can't add watcher dev is full\n");
-		spin_unlock(&inode->i_lock);
 		delete_watcher(dev, watcher);
 		spin_unlock(&dev->lock);
-
+		spin_unlock(&inode->i_lock);
 		unref_inode(inode);
+
 		return -ENOSPC;
 	}
 
 	inode_add_watcher(inode, watcher);
 
-	/* we keep a reference on the inode */
-	if (!err)
-		err = watcher->wd;
-
-	spin_unlock(&inode->i_lock);
 	spin_unlock(&dev->lock);
-exit:
-	return err;
+	spin_unlock(&inode->i_lock);
+
+	return watcher->wd;
 }
 
 static int inotify_ignore(struct inotify_device *dev, int wd)
@@ -904,6 +919,13 @@
 	spin_unlock(&dev->lock);
 }
 
+/*
+ * inotify_ioctl() - our device file's ioctl method
+ *
+ * The VFS serializes all of our calls via the BKL and we rely on that.  We
+ * could, alternatively, grab dev->lock.  Right now lower levels grab that
+ * where needed.
+ */
 static int inotify_ioctl(struct inode *ip, struct file *fp,
 			 unsigned int cmd, unsigned long arg)
 {


* [patch] inotify: locking
  2004-09-28 22:33 [RFC][PATCH] inotify 0.11.0 [WITH PATCH!] John McCutchan
@ 2004-09-30 19:01 ` Robert Love
  2004-09-30 19:17   ` Robert Love
  0 siblings, 1 reply; 7+ messages in thread
From: Robert Love @ 2004-09-30 19:01 UTC (permalink / raw)
  To: John McCutchan; +Cc: linux-kernel, gamin-list, viro, akpm, iggy

I finally got around to reviewing the locking in inotify, again, in
response to Andrew's points.

There are still three TODO items:

	- Look into replacing i_lock with i_sem.
	- dev->lock nests inode->i_lock.  Preferably i_lock should
	  remain an outermost lock.  Maybe i_sem can fix this.
	- inotify_inode_is_dead() should always have the inode's
	  i_lock locked when called.  I have not yet reviewed the
	  VFS paths that call it to ensure this is the case.

Anyhow, this patch does the following:

	- More locking documentation/comments
	- Respect lock ranking when locking two different
	  inodes' i_lock
	- Don't grab dentry->d_lock and just use dget_parent(). [1]
	- Respect lock ranking with dev->lock vs. inode->i_lock.
	- inotify_release_all_watchers() needed dev->lock.
	- misc. cleanup

Patch is on top of 0.11.0.

	Robert Love




end of thread, other threads:[~2004-09-30 19:24 UTC | newest]

Thread overview: 7+ messages
2004-09-22 19:37 [patch] inotify: locking Robert Love
2004-09-22 19:38 ` Robert Love
2004-09-23  9:09   ` Paul Jackson
2004-09-22 23:22 ` John McCutchan
2004-09-22 23:22   ` Robert Love
2004-09-28 22:33 [RFC][PATCH] inotify 0.11.0 [WITH PATCH!] John McCutchan
2004-09-30 19:01 ` [patch] inotify: locking Robert Love
2004-09-30 19:17   ` Robert Love
