* [PATCH v6 0/2] improve sync efficiency with sb inode wb list
@ 2016-01-19 17:59 Brian Foster
  2016-01-19 17:59 ` [PATCH v6 1/2] sb: add a new writeback list for sync Brian Foster
  2016-01-19 17:59 ` [PATCH v6 2/2] wb: inode writeback list tracking tracepoints Brian Foster
  0 siblings, 2 replies; 10+ messages in thread
From: Brian Foster @ 2016-01-19 17:59 UTC
  To: linux-fsdevel; +Cc: dchinner, jbacik, jack

Hi all,

This is version 6 of the sync efficiency fix. Changes from v5 are noted
below. The primary changes are to use RCU locking in wait_sb_inodes()
rather than the s_inode_list_lock hack and to clean up the i_wb_list
maintenance a bit, based on Jan Kara's feedback to v5. Note that I
folded the RCU locking change into the original patch rather than
posting it separately, since it seemed like the more correct approach
and most of my testing since v5 is based on that code.

This survives xfstests testing on XFS and ext4, as well as extended
(several-day) fsstress testing on XFS, without any explosions.
Thoughts, reviews, flames appreciated.

Brian

v6:
- Use RCU locking instead of the s_inode_list_lock spinlock in
  wait_sb_inodes().
- Refactor wait_sb_inodes() to keep inode on wb list.
- Drop remaining, unnecessary lazy list removal bits and relocate inode
  list check to clear_inode().
- Fix up some comments, etc.
v5: http://marc.info/?l=linux-fsdevel&m=145262374402798&w=2
- Converted from per-bdi list to per-sb list. Also marked as RFC and
  dropped testing/review tags.
- Updated to use new irq-safe lock for wb list.
- Dropped lazy list removal. Inodes are removed when the mapping is
  cleared of the writeback tag.
- Tweaked wait_sb_inodes() to remove deferred iput(), other cleanups.
- Added wb list tracepoint patch.
v4: http://marc.info/?l=linux-fsdevel&m=143511628828000&w=2

Brian Foster (1):
  wb: inode writeback list tracking tracepoints

Dave Chinner (1):
  sb: add a new writeback list for sync

 fs/fs-writeback.c                | 111 ++++++++++++++++++++++++++++++---------
 fs/inode.c                       |   2 +
 fs/super.c                       |   2 +
 include/linux/fs.h               |   4 ++
 include/linux/writeback.h        |   3 ++
 include/trace/events/writeback.h |  22 ++++++--
 mm/page-writeback.c              |  18 +++++++
 7 files changed, 133 insertions(+), 29 deletions(-)

-- 
2.4.3



* [PATCH v6 1/2] sb: add a new writeback list for sync
  2016-01-19 17:59 [PATCH v6 0/2] improve sync efficiency with sb inode wb list Brian Foster
@ 2016-01-19 17:59 ` Brian Foster
  2016-01-20 13:26   ` Jan Kara
  2016-01-19 17:59 ` [PATCH v6 2/2] wb: inode writeback list tracking tracepoints Brian Foster
  1 sibling, 1 reply; 10+ messages in thread
From: Brian Foster @ 2016-01-19 17:59 UTC
  To: linux-fsdevel; +Cc: dchinner, jbacik, jack

From: Dave Chinner <dchinner@redhat.com>

wait_sb_inodes() currently does a walk of all inodes in the
filesystem to find the dirty ones to wait on during sync. This is highly
inefficient and wastes a lot of CPU when there are lots of clean
cached inodes that we don't need to wait on.

To avoid this "all inode" walk, we need to track inodes that are
currently under writeback that we need to wait for. We do this by
adding inodes to a writeback list on the sb when the mapping is
first tagged as having pages under writeback. wait_sb_inodes() can
then walk this list of "inodes under IO" and wait specifically just
for the inodes that the current sync(2) needs to wait for.

Define a couple helpers to add/remove an inode from the writeback
list and call them when the overall mapping is tagged for or cleared
from writeback. Update wait_sb_inodes() to walk only the inodes
under writeback due to the sync.
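
As background for the locking in the helpers: sb_mark_inode_writeback()
tests list_empty() without the lock and only rechecks and adds under
s_inode_wblist_lock, so re-marking an inode already on the list stays
lock-free. Below is a minimal userspace sketch of that idiom (not the
patch code itself), assuming a plain pthread mutex in place of the
kernel's irq-safe spinlock and simplified stand-ins for the kernel list
helpers; the unlocked read is treated as a benign race, as the kernel
does.

#include <pthread.h>
#include <stdbool.h>

/* Stand-in for the kernel's intrusive list_head: a node is "empty"
 * (off-list) when it points to itself, as after list_del_init(). Nodes
 * must be initialized self-pointing before first use. */
struct list_head {
	struct list_head *next, *prev;
};

static struct list_head inodes_wb = { &inodes_wb, &inodes_wb };
static pthread_mutex_t wblist_lock = PTHREAD_MUTEX_INITIALIZER;

static bool list_empty(const struct list_head *n)
{
	return n->next == n;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Analogue of sb_mark_inode_writeback(): two paths can race past the
 * unlocked check, so the add only happens after rechecking under the
 * lock; an inode already on the list never takes the lock at all. */
static void mark_writeback(struct list_head *i_wb_list)
{
	if (list_empty(i_wb_list)) {
		pthread_mutex_lock(&wblist_lock);
		if (list_empty(i_wb_list))
			list_add_tail(i_wb_list, &inodes_wb);
		pthread_mutex_unlock(&wblist_lock);
	}
}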

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/fs-writeback.c         | 106 +++++++++++++++++++++++++++++++++++-----------
 fs/inode.c                |   2 +
 fs/super.c                |   2 +
 include/linux/fs.h        |   4 ++
 include/linux/writeback.h |   3 ++
 mm/page-writeback.c       |  18 ++++++++
 6 files changed, 110 insertions(+), 25 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 6915c95..63b878b 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -943,6 +943,37 @@ void inode_io_list_del(struct inode *inode)
 }
 
 /*
+ * mark an inode as under writeback on the sb
+ */
+void sb_mark_inode_writeback(struct inode *inode)
+{
+	struct super_block *sb = inode->i_sb;
+	unsigned long flags;
+
+	if (list_empty(&inode->i_wb_list)) {
+		spin_lock_irqsave(&sb->s_inode_wblist_lock, flags);
+		if (list_empty(&inode->i_wb_list))
+			list_add_tail(&inode->i_wb_list, &sb->s_inodes_wb);
+		spin_unlock_irqrestore(&sb->s_inode_wblist_lock, flags);
+	}
+}
+
+/*
+ * clear an inode as under writeback on the sb
+ */
+void sb_clear_inode_writeback(struct inode *inode)
+{
+	struct super_block *sb = inode->i_sb;
+	unsigned long flags;
+
+	if (!list_empty(&inode->i_wb_list)) {
+		spin_lock_irqsave(&sb->s_inode_wblist_lock, flags);
+		list_del_init(&inode->i_wb_list);
+		spin_unlock_irqrestore(&sb->s_inode_wblist_lock, flags);
+	}
+}
+
+/*
  * Redirty an inode: set its when-it-was dirtied timestamp and move it to the
  * furthest end of its superblock's dirty-inode list.
  *
@@ -2106,7 +2137,7 @@ EXPORT_SYMBOL(__mark_inode_dirty);
  */
 static void wait_sb_inodes(struct super_block *sb)
 {
-	struct inode *inode, *old_inode = NULL;
+	LIST_HEAD(sync_list);
 
 	/*
 	 * We need to be protected against the filesystem going from
@@ -2115,38 +2146,60 @@ static void wait_sb_inodes(struct super_block *sb)
 	WARN_ON(!rwsem_is_locked(&sb->s_umount));
 
 	mutex_lock(&sb->s_sync_lock);
-	spin_lock(&sb->s_inode_list_lock);
 
 	/*
-	 * Data integrity sync. Must wait for all pages under writeback,
-	 * because there may have been pages dirtied before our sync
-	 * call, but which had writeout started before we write it out.
-	 * In which case, the inode may not be on the dirty list, but
-	 * we still have to wait for that writeout.
+	 * Splice the writeback list onto a temporary list to avoid waiting on
+	 * inodes that have started writeback after this point.
+	 *
+	 * Use rcu_read_lock() to keep the inodes around until we have a
+	 * reference. s_inode_wblist_lock protects sb->s_inodes_wb as well as
+	 * the local list because inodes can be dropped from either by writeback
+	 * completion.
+	 */
+	rcu_read_lock();
+	spin_lock_irq(&sb->s_inode_wblist_lock);
+	list_splice_init(&sb->s_inodes_wb, &sync_list);
+
+	/*
+	 * Data integrity sync. Must wait for all pages under writeback, because
+	 * there may have been pages dirtied before our sync call, but which had
+	 * writeout started before we write it out.  In which case, the inode
+	 * may not be on the dirty list, but we still have to wait for that
+	 * writeout.
 	 */
-	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
+	while (!list_empty(&sync_list)) {
+		struct inode *inode = list_first_entry(&sync_list, struct inode,
+						       i_wb_list);
 		struct address_space *mapping = inode->i_mapping;
 
+		/*
+		 * Move each inode back to the wb list before we drop the lock
+		 * to preserve consistency between i_wb_list and the mapping
+		 * writeback tag. Writeback completion is responsible for
+		 * removing the inode from either list once the tag is cleared.
+		 */
+		list_move_tail(&inode->i_wb_list, &sb->s_inodes_wb);
+
+		/*
+		 * The mapping can appear untagged while still on-list since we
+		 * do not have the mapping lock. Skip it here, wb completion
+		 * will remove it.
+		 */
+		if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
+			continue;
+
+		spin_unlock_irq(&sb->s_inode_wblist_lock);
+
 		spin_lock(&inode->i_lock);
-		if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
-		    (mapping->nrpages == 0)) {
+		if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) {
 			spin_unlock(&inode->i_lock);
+
+			spin_lock_irq(&sb->s_inode_wblist_lock);
 			continue;
 		}
 		__iget(inode);
 		spin_unlock(&inode->i_lock);
-		spin_unlock(&sb->s_inode_list_lock);
-
-		/*
-		 * We hold a reference to 'inode' so it couldn't have been
-		 * removed from s_inodes list while we dropped the
-		 * s_inode_list_lock.  We cannot iput the inode now as we can
-		 * be holding the last reference and we cannot iput it under
-		 * s_inode_list_lock. So we keep the reference and iput it
-		 * later.
-		 */
-		iput(old_inode);
-		old_inode = inode;
+		rcu_read_unlock();
 
 		/*
 		 * We keep the error status of individual mapping so that
@@ -2157,10 +2210,13 @@ static void wait_sb_inodes(struct super_block *sb)
 
 		cond_resched();
 
-		spin_lock(&sb->s_inode_list_lock);
+		iput(inode);
+
+		rcu_read_lock();
+		spin_lock_irq(&sb->s_inode_wblist_lock);
 	}
-	spin_unlock(&sb->s_inode_list_lock);
-	iput(old_inode);
+	spin_unlock_irq(&sb->s_inode_wblist_lock);
+	rcu_read_unlock();
 	mutex_unlock(&sb->s_sync_lock);
 }
 
diff --git a/fs/inode.c b/fs/inode.c
index e491e54..f5a7eb9 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -358,6 +358,7 @@ void inode_init_once(struct inode *inode)
 	INIT_HLIST_NODE(&inode->i_hash);
 	INIT_LIST_HEAD(&inode->i_devices);
 	INIT_LIST_HEAD(&inode->i_io_list);
+	INIT_LIST_HEAD(&inode->i_wb_list);
 	INIT_LIST_HEAD(&inode->i_lru);
 	address_space_init_once(&inode->i_data);
 	i_size_ordered_init(inode);
@@ -500,6 +501,7 @@ void clear_inode(struct inode *inode)
 	BUG_ON(!list_empty(&inode->i_data.private_list));
 	BUG_ON(!(inode->i_state & I_FREEING));
 	BUG_ON(inode->i_state & I_CLEAR);
+	BUG_ON(!list_empty(&inode->i_wb_list));
 	/* don't need i_lock here, no concurrent mods to i_state */
 	inode->i_state = I_FREEING | I_CLEAR;
 }
diff --git a/fs/super.c b/fs/super.c
index 1182af8..60dd44a 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -206,6 +206,8 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
 	mutex_init(&s->s_sync_lock);
 	INIT_LIST_HEAD(&s->s_inodes);
 	spin_lock_init(&s->s_inode_list_lock);
+	INIT_LIST_HEAD(&s->s_inodes_wb);
+	spin_lock_init(&s->s_inode_wblist_lock);
 
 	if (list_lru_init_memcg(&s->s_dentry_lru))
 		goto fail;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index eb73d74..ac2797d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -651,6 +651,7 @@ struct inode {
 #endif
 	struct list_head	i_lru;		/* inode LRU list */
 	struct list_head	i_sb_list;
+	struct list_head	i_wb_list;	/* backing dev writeback list */
 	union {
 		struct hlist_head	i_dentry;
 		struct rcu_head		i_rcu;
@@ -1377,6 +1378,9 @@ struct super_block {
 	/* s_inode_list_lock protects s_inodes */
 	spinlock_t		s_inode_list_lock ____cacheline_aligned_in_smp;
 	struct list_head	s_inodes;	/* all inodes */
+
+	spinlock_t		s_inode_wblist_lock;
+	struct list_head	s_inodes_wb;	/* writeback inodes */
 };
 
 extern struct timespec current_fs_time(struct super_block *sb);
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index b333c94..90a380c 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -379,4 +379,7 @@ void tag_pages_for_writeback(struct address_space *mapping,
 
 void account_page_redirty(struct page *page);
 
+void sb_mark_inode_writeback(struct inode *inode);
+void sb_clear_inode_writeback(struct inode *inode);
+
 #endif		/* WRITEBACK_H */
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 6fe7d15..a8b718137 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -2745,6 +2745,11 @@ int test_clear_page_writeback(struct page *page)
 				__wb_writeout_inc(wb);
 			}
 		}
+
+		if (mapping->host && !mapping_tagged(mapping,
+						     PAGECACHE_TAG_WRITEBACK))
+			sb_clear_inode_writeback(mapping->host);
+
 		spin_unlock_irqrestore(&mapping->tree_lock, flags);
 	} else {
 		ret = TestClearPageWriteback(page);
@@ -2773,11 +2778,24 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
 		spin_lock_irqsave(&mapping->tree_lock, flags);
 		ret = TestSetPageWriteback(page);
 		if (!ret) {
+			bool on_wblist;
+
+			on_wblist = mapping_tagged(mapping,
+						   PAGECACHE_TAG_WRITEBACK);
+
 			radix_tree_tag_set(&mapping->page_tree,
 						page_index(page),
 						PAGECACHE_TAG_WRITEBACK);
 			if (bdi_cap_account_writeback(bdi))
 				__inc_wb_stat(inode_to_wb(inode), WB_WRITEBACK);
+
+			/*
+			 * We can come through here when swapping anonymous
+			 * pages, so we don't necessarily have an inode to track
+			 * for sync.
+			 */
+			if (mapping->host && !on_wblist)
+				sb_mark_inode_writeback(mapping->host);
 		}
 		if (!PageDirty(page))
 			radix_tree_tag_clear(&mapping->page_tree,
-- 
2.4.3



* [PATCH v6 2/2] wb: inode writeback list tracking tracepoints
  2016-01-19 17:59 [PATCH v6 0/2] improve sync efficiency with sb inode wb list Brian Foster
  2016-01-19 17:59 ` [PATCH v6 1/2] sb: add a new writeback list for sync Brian Foster
@ 2016-01-19 17:59 ` Brian Foster
  2016-01-20 13:14   ` Jan Kara
  1 sibling, 1 reply; 10+ messages in thread
From: Brian Foster @ 2016-01-19 17:59 UTC
  To: linux-fsdevel; +Cc: dchinner, jbacik, jack

The per-sb inode writeback list tracks inodes currently under writeback
to facilitate efficient sync processing. In particular, it ensures that
sync only needs to walk through a list of inodes that were cleaned by
the sync.

Add a couple of tracepoints to help identify when inodes are added to
and removed from the writeback lists. Piggyback off of the writeback
lazytime tracepoint template, as it already tracks the relevant inode
information.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/fs-writeback.c                |  9 +++++++--
 include/trace/events/writeback.h | 22 ++++++++++++++++++----
 2 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 63b878b..3a62013 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -952,8 +952,10 @@ void sb_mark_inode_writeback(struct inode *inode)
 
 	if (list_empty(&inode->i_wb_list)) {
 		spin_lock_irqsave(&sb->s_inode_wblist_lock, flags);
-		if (list_empty(&inode->i_wb_list))
+		if (list_empty(&inode->i_wb_list)) {
 			list_add_tail(&inode->i_wb_list, &sb->s_inodes_wb);
+			trace_sb_mark_inode_writeback(inode);
+		}
 		spin_unlock_irqrestore(&sb->s_inode_wblist_lock, flags);
 	}
 }
@@ -968,7 +970,10 @@ void sb_clear_inode_writeback(struct inode *inode)
 
 	if (!list_empty(&inode->i_wb_list)) {
 		spin_lock_irqsave(&sb->s_inode_wblist_lock, flags);
-		list_del_init(&inode->i_wb_list);
+		if (!list_empty(&inode->i_wb_list)) {
+			list_del_init(&inode->i_wb_list);
+			trace_sb_clear_inode_writeback(inode);
+		}
 		spin_unlock_irqrestore(&sb->s_inode_wblist_lock, flags);
 	}
 }
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index fff846b..962a7f1 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -727,7 +727,7 @@ DEFINE_EVENT(writeback_single_inode_template, writeback_single_inode,
 	TP_ARGS(inode, wbc, nr_to_write)
 );
 
-DECLARE_EVENT_CLASS(writeback_lazytime_template,
+DECLARE_EVENT_CLASS(writeback_inode_template,
 	TP_PROTO(struct inode *inode),
 
 	TP_ARGS(inode),
@@ -754,25 +754,39 @@ DECLARE_EVENT_CLASS(writeback_lazytime_template,
 		  show_inode_state(__entry->state), __entry->mode)
 );
 
-DEFINE_EVENT(writeback_lazytime_template, writeback_lazytime,
+DEFINE_EVENT(writeback_inode_template, writeback_lazytime,
 	TP_PROTO(struct inode *inode),
 
 	TP_ARGS(inode)
 );
 
-DEFINE_EVENT(writeback_lazytime_template, writeback_lazytime_iput,
+DEFINE_EVENT(writeback_inode_template, writeback_lazytime_iput,
 	TP_PROTO(struct inode *inode),
 
 	TP_ARGS(inode)
 );
 
-DEFINE_EVENT(writeback_lazytime_template, writeback_dirty_inode_enqueue,
+DEFINE_EVENT(writeback_inode_template, writeback_dirty_inode_enqueue,
 
 	TP_PROTO(struct inode *inode),
 
 	TP_ARGS(inode)
 );
 
+/*
+ * Inode writeback list tracking.
+ */
+
+DEFINE_EVENT(writeback_inode_template, sb_mark_inode_writeback,
+	TP_PROTO(struct inode *inode),
+	TP_ARGS(inode)
+);
+
+DEFINE_EVENT(writeback_inode_template, sb_clear_inode_writeback,
+	TP_PROTO(struct inode *inode),
+	TP_ARGS(inode)
+);
+
 #endif /* _TRACE_WRITEBACK_H */
 
 /* This part must be outside protection */
-- 
2.4.3



* Re: [PATCH v6 2/2] wb: inode writeback list tracking tracepoints
  2016-01-19 17:59 ` [PATCH v6 2/2] wb: inode writeback list tracking tracepoints Brian Foster
@ 2016-01-20 13:14   ` Jan Kara
  0 siblings, 0 replies; 10+ messages in thread
From: Jan Kara @ 2016-01-20 13:14 UTC
  To: Brian Foster; +Cc: linux-fsdevel, dchinner, jbacik, jack

On Tue 19-01-16 12:59:13, Brian Foster wrote:
> The per-sb inode writeback list tracks inodes currently under writeback
> to facilitate efficient sync processing. In particular, it ensures that
> sync only needs to walk through a list of inodes that were cleaned by
> the sync.
> 
> Add a couple of tracepoints to help identify when inodes are added to
> and removed from the writeback lists. Piggyback off of the writeback
> lazytime tracepoint template, as it already tracks the relevant inode
> information.

The patch looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/fs-writeback.c                |  9 +++++++--
>  include/trace/events/writeback.h | 22 ++++++++++++++++++----
>  2 files changed, 25 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 63b878b..3a62013 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -952,8 +952,10 @@ void sb_mark_inode_writeback(struct inode *inode)
>  
>  	if (list_empty(&inode->i_wb_list)) {
>  		spin_lock_irqsave(&sb->s_inode_wblist_lock, flags);
> -		if (list_empty(&inode->i_wb_list))
> +		if (list_empty(&inode->i_wb_list)) {
>  			list_add_tail(&inode->i_wb_list, &sb->s_inodes_wb);
> +			trace_sb_mark_inode_writeback(inode);
> +		}
>  		spin_unlock_irqrestore(&sb->s_inode_wblist_lock, flags);
>  	}
>  }
> @@ -968,7 +970,10 @@ void sb_clear_inode_writeback(struct inode *inode)
>  
>  	if (!list_empty(&inode->i_wb_list)) {
>  		spin_lock_irqsave(&sb->s_inode_wblist_lock, flags);
> -		list_del_init(&inode->i_wb_list);
> +		if (!list_empty(&inode->i_wb_list)) {
> +			list_del_init(&inode->i_wb_list);
> +			trace_sb_clear_inode_writeback(inode);
> +		}
>  		spin_unlock_irqrestore(&sb->s_inode_wblist_lock, flags);
>  	}
>  }
> diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
> index fff846b..962a7f1 100644
> --- a/include/trace/events/writeback.h
> +++ b/include/trace/events/writeback.h
> @@ -727,7 +727,7 @@ DEFINE_EVENT(writeback_single_inode_template, writeback_single_inode,
>  	TP_ARGS(inode, wbc, nr_to_write)
>  );
>  
> -DECLARE_EVENT_CLASS(writeback_lazytime_template,
> +DECLARE_EVENT_CLASS(writeback_inode_template,
>  	TP_PROTO(struct inode *inode),
>  
>  	TP_ARGS(inode),
> @@ -754,25 +754,39 @@ DECLARE_EVENT_CLASS(writeback_lazytime_template,
>  		  show_inode_state(__entry->state), __entry->mode)
>  );
>  
> -DEFINE_EVENT(writeback_lazytime_template, writeback_lazytime,
> +DEFINE_EVENT(writeback_inode_template, writeback_lazytime,
>  	TP_PROTO(struct inode *inode),
>  
>  	TP_ARGS(inode)
>  );
>  
> -DEFINE_EVENT(writeback_lazytime_template, writeback_lazytime_iput,
> +DEFINE_EVENT(writeback_inode_template, writeback_lazytime_iput,
>  	TP_PROTO(struct inode *inode),
>  
>  	TP_ARGS(inode)
>  );
>  
> -DEFINE_EVENT(writeback_lazytime_template, writeback_dirty_inode_enqueue,
> +DEFINE_EVENT(writeback_inode_template, writeback_dirty_inode_enqueue,
>  
>  	TP_PROTO(struct inode *inode),
>  
>  	TP_ARGS(inode)
>  );
>  
> +/*
> + * Inode writeback list tracking.
> + */
> +
> +DEFINE_EVENT(writeback_inode_template, sb_mark_inode_writeback,
> +	TP_PROTO(struct inode *inode),
> +	TP_ARGS(inode)
> +);
> +
> +DEFINE_EVENT(writeback_inode_template, sb_clear_inode_writeback,
> +	TP_PROTO(struct inode *inode),
> +	TP_ARGS(inode)
> +);
> +
>  #endif /* _TRACE_WRITEBACK_H */
>  
>  /* This part must be outside protection */
> -- 
> 2.4.3
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v6 1/2] sb: add a new writeback list for sync
  2016-01-19 17:59 ` [PATCH v6 1/2] sb: add a new writeback list for sync Brian Foster
@ 2016-01-20 13:26   ` Jan Kara
  2016-01-20 20:11     ` Dave Chinner
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kara @ 2016-01-20 13:26 UTC
  To: Brian Foster; +Cc: linux-fsdevel, dchinner, jbacik, jack

On Tue 19-01-16 12:59:12, Brian Foster wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> wait_sb_inodes() currently does a walk of all inodes in the
> filesystem to find the dirty ones to wait on during sync. This is highly
> inefficient and wastes a lot of CPU when there are lots of clean
> cached inodes that we don't need to wait on.
> 
> To avoid this "all inode" walk, we need to track inodes that are
> currently under writeback that we need to wait for. We do this by
> adding inodes to a writeback list on the sb when the mapping is
> first tagged as having pages under writeback. wait_sb_inodes() can
> then walk this list of "inodes under IO" and wait specifically just
> for the inodes that the current sync(2) needs to wait for.
> 
> Define a couple helpers to add/remove an inode from the writeback
> list and call them when the overall mapping is tagged for or cleared
> from writeback. Update wait_sb_inodes() to walk only the inodes
> under writeback due to the sync.

The patch looks good.  Just one comment: This grows struct inode by two
longs. Such growth should be justified by measuring the improvements, so
can you get some numbers showing how much the patch helped? I think it
would be interesting to see:

a) How much sync(2) speed has improved if there's not much to wait for.

b) See whether parallel heavy stat(2) load which is rotating lots of inodes
in inode cache sees some improvement when it doesn't have to contend with
sync(2) on s_inode_list_lock. I believe Dave Chinner had some loads where
the contention on s_inode_list_lock due to sync and rotation of inodes was
pretty heavy.
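
(To put the "two longs" above in concrete terms, a minimal sketch, not
from the thread, assuming an LP64 build: i_wb_list is one struct
list_head per inode, i.e. two pointers, and the super_block side adds
one list head plus one spinlock.)

#include <stdio.h>

/* Userspace stand-in with the same layout as the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

int main(void)
{
	/* 2 x 8-byte pointers = 16 bytes per inode on LP64 */
	printf("i_wb_list cost per inode: %zu bytes\n",
	       sizeof(struct list_head));
	return 0;
}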

Thanks.

								Honza

> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Josef Bacik <jbacik@fb.com>
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/fs-writeback.c         | 106 +++++++++++++++++++++++++++++++++++-----------
>  fs/inode.c                |   2 +
>  fs/super.c                |   2 +
>  include/linux/fs.h        |   4 ++
>  include/linux/writeback.h |   3 ++
>  mm/page-writeback.c       |  18 ++++++++
>  6 files changed, 110 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 6915c95..63b878b 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -943,6 +943,37 @@ void inode_io_list_del(struct inode *inode)
>  }
>  
>  /*
> + * mark an inode as under writeback on the sb
> + */
> +void sb_mark_inode_writeback(struct inode *inode)
> +{
> +	struct super_block *sb = inode->i_sb;
> +	unsigned long flags;
> +
> +	if (list_empty(&inode->i_wb_list)) {
> +		spin_lock_irqsave(&sb->s_inode_wblist_lock, flags);
> +		if (list_empty(&inode->i_wb_list))
> +			list_add_tail(&inode->i_wb_list, &sb->s_inodes_wb);
> +		spin_unlock_irqrestore(&sb->s_inode_wblist_lock, flags);
> +	}
> +}
> +
> +/*
> + * clear an inode as under writeback on the sb
> + */
> +void sb_clear_inode_writeback(struct inode *inode)
> +{
> +	struct super_block *sb = inode->i_sb;
> +	unsigned long flags;
> +
> +	if (!list_empty(&inode->i_wb_list)) {
> +		spin_lock_irqsave(&sb->s_inode_wblist_lock, flags);
> +		list_del_init(&inode->i_wb_list);
> +		spin_unlock_irqrestore(&sb->s_inode_wblist_lock, flags);
> +	}
> +}
> +
> +/*
>   * Redirty an inode: set its when-it-was dirtied timestamp and move it to the
>   * furthest end of its superblock's dirty-inode list.
>   *
> @@ -2106,7 +2137,7 @@ EXPORT_SYMBOL(__mark_inode_dirty);
>   */
>  static void wait_sb_inodes(struct super_block *sb)
>  {
> -	struct inode *inode, *old_inode = NULL;
> +	LIST_HEAD(sync_list);
>  
>  	/*
>  	 * We need to be protected against the filesystem going from
> @@ -2115,38 +2146,60 @@ static void wait_sb_inodes(struct super_block *sb)
>  	WARN_ON(!rwsem_is_locked(&sb->s_umount));
>  
>  	mutex_lock(&sb->s_sync_lock);
> -	spin_lock(&sb->s_inode_list_lock);
>  
>  	/*
> -	 * Data integrity sync. Must wait for all pages under writeback,
> -	 * because there may have been pages dirtied before our sync
> -	 * call, but which had writeout started before we write it out.
> -	 * In which case, the inode may not be on the dirty list, but
> -	 * we still have to wait for that writeout.
> +	 * Splice the writeback list onto a temporary list to avoid waiting on
> +	 * inodes that have started writeback after this point.
> +	 *
> +	 * Use rcu_read_lock() to keep the inodes around until we have a
> +	 * reference. s_inode_wblist_lock protects sb->s_inodes_wb as well as
> +	 * the local list because inodes can be dropped from either by writeback
> +	 * completion.
> +	 */
> +	rcu_read_lock();
> +	spin_lock_irq(&sb->s_inode_wblist_lock);
> +	list_splice_init(&sb->s_inodes_wb, &sync_list);
> +
> +	/*
> +	 * Data integrity sync. Must wait for all pages under writeback, because
> +	 * there may have been pages dirtied before our sync call, but which had
> +	 * writeout started before we write it out.  In which case, the inode
> +	 * may not be on the dirty list, but we still have to wait for that
> +	 * writeout.
>  	 */
> -	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
> +	while (!list_empty(&sync_list)) {
> +		struct inode *inode = list_first_entry(&sync_list, struct inode,
> +						       i_wb_list);
>  		struct address_space *mapping = inode->i_mapping;
>  
> +		/*
> +		 * Move each inode back to the wb list before we drop the lock
> +		 * to preserve consistency between i_wb_list and the mapping
> +		 * writeback tag. Writeback completion is responsible for
> +		 * removing the inode from either list once the tag is cleared.
> +		 */
> +		list_move_tail(&inode->i_wb_list, &sb->s_inodes_wb);
> +
> +		/*
> +		 * The mapping can appear untagged while still on-list since we
> +		 * do not have the mapping lock. Skip it here, wb completion
> +		 * will remove it.
> +		 */
> +		if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
> +			continue;
> +
> +		spin_unlock_irq(&sb->s_inode_wblist_lock);
> +
>  		spin_lock(&inode->i_lock);
> -		if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
> -		    (mapping->nrpages == 0)) {
> +		if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) {
>  			spin_unlock(&inode->i_lock);
> +
> +			spin_lock_irq(&sb->s_inode_wblist_lock);
>  			continue;
>  		}
>  		__iget(inode);
>  		spin_unlock(&inode->i_lock);
> -		spin_unlock(&sb->s_inode_list_lock);
> -
> -		/*
> -		 * We hold a reference to 'inode' so it couldn't have been
> -		 * removed from s_inodes list while we dropped the
> -		 * s_inode_list_lock.  We cannot iput the inode now as we can
> -		 * be holding the last reference and we cannot iput it under
> -		 * s_inode_list_lock. So we keep the reference and iput it
> -		 * later.
> -		 */
> -		iput(old_inode);
> -		old_inode = inode;
> +		rcu_read_unlock();
>  
>  		/*
>  		 * We keep the error status of individual mapping so that
> @@ -2157,10 +2210,13 @@ static void wait_sb_inodes(struct super_block *sb)
>  
>  		cond_resched();
>  
> -		spin_lock(&sb->s_inode_list_lock);
> +		iput(inode);
> +
> +		rcu_read_lock();
> +		spin_lock_irq(&sb->s_inode_wblist_lock);
>  	}
> -	spin_unlock(&sb->s_inode_list_lock);
> -	iput(old_inode);
> +	spin_unlock_irq(&sb->s_inode_wblist_lock);
> +	rcu_read_unlock();
>  	mutex_unlock(&sb->s_sync_lock);
>  }
>  
> diff --git a/fs/inode.c b/fs/inode.c
> index e491e54..f5a7eb9 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -358,6 +358,7 @@ void inode_init_once(struct inode *inode)
>  	INIT_HLIST_NODE(&inode->i_hash);
>  	INIT_LIST_HEAD(&inode->i_devices);
>  	INIT_LIST_HEAD(&inode->i_io_list);
> +	INIT_LIST_HEAD(&inode->i_wb_list);
>  	INIT_LIST_HEAD(&inode->i_lru);
>  	address_space_init_once(&inode->i_data);
>  	i_size_ordered_init(inode);
> @@ -500,6 +501,7 @@ void clear_inode(struct inode *inode)
>  	BUG_ON(!list_empty(&inode->i_data.private_list));
>  	BUG_ON(!(inode->i_state & I_FREEING));
>  	BUG_ON(inode->i_state & I_CLEAR);
> +	BUG_ON(!list_empty(&inode->i_wb_list));
>  	/* don't need i_lock here, no concurrent mods to i_state */
>  	inode->i_state = I_FREEING | I_CLEAR;
>  }
> diff --git a/fs/super.c b/fs/super.c
> index 1182af8..60dd44a 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -206,6 +206,8 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags)
>  	mutex_init(&s->s_sync_lock);
>  	INIT_LIST_HEAD(&s->s_inodes);
>  	spin_lock_init(&s->s_inode_list_lock);
> +	INIT_LIST_HEAD(&s->s_inodes_wb);
> +	spin_lock_init(&s->s_inode_wblist_lock);
>  
>  	if (list_lru_init_memcg(&s->s_dentry_lru))
>  		goto fail;
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index eb73d74..ac2797d 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -651,6 +651,7 @@ struct inode {
>  #endif
>  	struct list_head	i_lru;		/* inode LRU list */
>  	struct list_head	i_sb_list;
> +	struct list_head	i_wb_list;	/* backing dev writeback list */
>  	union {
>  		struct hlist_head	i_dentry;
>  		struct rcu_head		i_rcu;
> @@ -1377,6 +1378,9 @@ struct super_block {
>  	/* s_inode_list_lock protects s_inodes */
>  	spinlock_t		s_inode_list_lock ____cacheline_aligned_in_smp;
>  	struct list_head	s_inodes;	/* all inodes */
> +
> +	spinlock_t		s_inode_wblist_lock;
> +	struct list_head	s_inodes_wb;	/* writeback inodes */
>  };
>  
>  extern struct timespec current_fs_time(struct super_block *sb);
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index b333c94..90a380c 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -379,4 +379,7 @@ void tag_pages_for_writeback(struct address_space *mapping,
>  
>  void account_page_redirty(struct page *page);
>  
> +void sb_mark_inode_writeback(struct inode *inode);
> +void sb_clear_inode_writeback(struct inode *inode);
> +
>  #endif		/* WRITEBACK_H */
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 6fe7d15..a8b718137 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2745,6 +2745,11 @@ int test_clear_page_writeback(struct page *page)
>  				__wb_writeout_inc(wb);
>  			}
>  		}
> +
> +		if (mapping->host && !mapping_tagged(mapping,
> +						     PAGECACHE_TAG_WRITEBACK))
> +			sb_clear_inode_writeback(mapping->host);
> +
>  		spin_unlock_irqrestore(&mapping->tree_lock, flags);
>  	} else {
>  		ret = TestClearPageWriteback(page);
> @@ -2773,11 +2778,24 @@ int __test_set_page_writeback(struct page *page, bool keep_write)
>  		spin_lock_irqsave(&mapping->tree_lock, flags);
>  		ret = TestSetPageWriteback(page);
>  		if (!ret) {
> +			bool on_wblist;
> +
> +			on_wblist = mapping_tagged(mapping,
> +						   PAGECACHE_TAG_WRITEBACK);
> +
>  			radix_tree_tag_set(&mapping->page_tree,
>  						page_index(page),
>  						PAGECACHE_TAG_WRITEBACK);
>  			if (bdi_cap_account_writeback(bdi))
>  				__inc_wb_stat(inode_to_wb(inode), WB_WRITEBACK);
> +
> +			/*
> +			 * We can come through here when swapping anonymous
> +			 * pages, so we don't necessarily have an inode to track
> +			 * for sync.
> +			 */
> +			if (mapping->host && !on_wblist)
> +				sb_mark_inode_writeback(mapping->host);
>  		}
>  		if (!PageDirty(page))
>  			radix_tree_tag_clear(&mapping->page_tree,
> -- 
> 2.4.3
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v6 1/2] sb: add a new writeback list for sync
  2016-01-20 13:26   ` Jan Kara
@ 2016-01-20 20:11     ` Dave Chinner
  2016-01-21 15:22       ` Brian Foster
  0 siblings, 1 reply; 10+ messages in thread
From: Dave Chinner @ 2016-01-20 20:11 UTC
  To: Jan Kara; +Cc: Brian Foster, linux-fsdevel, dchinner, jbacik

On Wed, Jan 20, 2016 at 02:26:26PM +0100, Jan Kara wrote:
> On Tue 19-01-16 12:59:12, Brian Foster wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > wait_sb_inodes() currently does a walk of all inodes in the
> > filesystem to find the dirty ones to wait on during sync. This is highly
> > inefficient and wastes a lot of CPU when there are lots of clean
> > cached inodes that we don't need to wait on.
> > 
> > To avoid this "all inode" walk, we need to track inodes that are
> > currently under writeback that we need to wait for. We do this by
> > adding inodes to a writeback list on the sb when the mapping is
> > first tagged as having pages under writeback. wait_sb_inodes() can
> > then walk this list of "inodes under IO" and wait specifically just
> > for the inodes that the current sync(2) needs to wait for.
> > 
> > Define a couple helpers to add/remove an inode from the writeback
> > list and call them when the overall mapping is tagged for or cleared
> > from writeback. Update wait_sb_inodes() to walk only the inodes
> > under writeback due to the sync.
> 
> The patch looks good.  Just one comment: This grows struct inode by two
> longs. Such growth should be justified by measuring the improvements, so
> can you get some numbers showing how much the patch helped? I think it
> would be interesting to see:
> 
> a) How much sync(2) speed has improved if there's not much to wait for.

Depends on the size of the inode cache when sync is run.  If it's
empty it's not noticeable. When you have tens of millions of cached,
clean inodes, the inode list traversal can take tens of seconds.
This is the sort of problem Josef reported that FB were having...

> b) See whether parallel heavy stat(2) load which is rotating lots of inodes
> in inode cache sees some improvement when it doesn't have to contend with
> sync(2) on s_inode_list_lock. I believe Dave Chinner had some loads where
> the contention on s_inode_list_lock due to sync and rotation of inodes was
> pretty heavy.

Just my usual fsmark workloads - they have parallel find and
parallel ls -lR traversals over the created fileset. Even just
running sync during creation (because there are millions of cached
inodes, and ~250,000 inodes being instantiated and reclaimed every
second) causes lock contention problems....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH v6 1/2] sb: add a new writeback list for sync
  2016-01-20 20:11     ` Dave Chinner
@ 2016-01-21 15:22       ` Brian Foster
  2016-01-21 16:34         ` Jan Kara
  0 siblings, 1 reply; 10+ messages in thread
From: Brian Foster @ 2016-01-21 15:22 UTC
  To: Dave Chinner; +Cc: Jan Kara, linux-fsdevel, dchinner, jbacik

On Thu, Jan 21, 2016 at 07:11:59AM +1100, Dave Chinner wrote:
> On Wed, Jan 20, 2016 at 02:26:26PM +0100, Jan Kara wrote:
> > On Tue 19-01-16 12:59:12, Brian Foster wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > > 
> > > wait_sb_inodes() currently does a walk of all inodes in the
> > > filesystem to find the dirty ones to wait on during sync. This is highly
> > > inefficient and wastes a lot of CPU when there are lots of clean
> > > cached inodes that we don't need to wait on.
> > > 
> > > To avoid this "all inode" walk, we need to track inodes that are
> > > currently under writeback that we need to wait for. We do this by
> > > adding inodes to a writeback list on the sb when the mapping is
> > > first tagged as having pages under writeback. wait_sb_inodes() can
> > > then walk this list of "inodes under IO" and wait specifically just
> > > for the inodes that the current sync(2) needs to wait for.
> > > 
> > > Define a couple helpers to add/remove an inode from the writeback
> > > list and call them when the overall mapping is tagged for or cleared
> > > from writeback. Update wait_sb_inodes() to walk only the inodes
> > > under writeback due to the sync.
> > 

Hi Jan, Dave,

> > The patch looks good.  Just one comment: This grows struct inode by two
> > longs. Such growth should be justified by measuring the improvements, so
> > can you get some numbers showing how much the patch helped? I think it
> > would be interesting to see:
> > 

Thanks.. indeed, I had run some simple tests that demonstrate the
effectiveness of the change. I reran them recently against the latest
version. Some results are appended to this mail.

Note that I don't have anything at the moment that demonstrates a
notable improvement with RCU over the original spinlock approach. I can
play with that a bit more, but that's not really the crux of the patch.

> > a) How much sync(2) speed has improved if there's not much to wait for.
> 
> Depends on the size of the inode cache when sync is run.  If it's
> empty it's not noticeable. When you have tens of millions of cached,
> clean inodes, the inode list traversal can take tens of seconds.
> This is the sort of problem Josef reported that FB were having...
> 

FWIW, Ceph has indicated this is a pain point for them as well. The
results at [0] below show the difference in sync time with a largely
populated inode cache before and after this patch.

> > b) See whether parallel heavy stat(2) load which is rotating lots of inodes
> > in inode cache sees some improvement when it doesn't have to contend with
> > sync(2) on s_inode_list_lock. I believe Dave Chinner had some loads where
> > the contention on s_inode_list_lock due to sync and rotation of inodes was
> > pretty heavy.
> 
> Just my usual fsmark workloads - they have parallel find and
> parallel ls -lR traversals over the created fileset. Even just
> running sync during creation (because there are millions of cached
> inodes, and ~250,000 inodes being instantiated and reclaimed every
> second) causes lock contention problems....
> 

I ran a similar parallel (16x) fs_mark workload using '-S 4', which
incorporates a sync() per pass. Without this patch, this demonstrates a
slow degradation as the inode cache grows. Results at [1].

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

16xcpu, 32GB RAM x86-64 server
Storage is LVM volumes on hw raid0.

[0] -- sync test w/ ~10m clean inode cache
- 10TB pre-populated XFS fs, cache populated via parallel find/stat
  workload

--- 4.4.0+

# cat /proc/slabinfo | grep xfs
xfs_dqtrx              0      0    528   62    8 : tunables    0    0    0 : slabdata      0      0      0
xfs_dquot              0      0    656   49    8 : tunables    0    0    0 : slabdata      0      0      0
xfs_buf           496293 496893    640   51    8 : tunables    0    0    0 : slabdata   9743   9743      0
xfs_icr                0      0    144   56    2 : tunables    0    0    0 : slabdata      0      0      0
xfs_inode         10528071 10529150   1728   18    8 : tunables    0    0    0 : slabdata 584999 584999      0
xfs_efd_item           0      0    400   40    4 : tunables    0    0    0 : slabdata      0      0      0
xfs_da_state         544    544    480   34    4 : tunables    0    0    0 : slabdata     16     16      0
xfs_btree_cur          0      0    208   39    2 : tunables    0    0    0 : slabdata      0      0      0

# time sync

real    0m7.322s
user    0m0.000s
sys     0m7.314s
# time sync

real    0m7.299s
user    0m0.000s
sys     0m7.296s

--- 4.4.0+ w/ sync patch

# cat /proc/slabinfo | grep xfs
xfs_dqtrx              0      0    528   62    8 : tunables    0    0    0 : slabdata      0      0      0
xfs_dquot              0      0    656   49    8 : tunables    0    0    0 : slabdata      0      0      0
xfs_buf           428214 428514    640   51    8 : tunables    0    0    0 : slabdata   8719   8719      0
xfs_icr                0      0    144   56    2 : tunables    0    0    0 : slabdata      0      0      0
xfs_inode         11054375 11054438   1728   18    8 : tunables    0    0    0 : slabdata 721323 721323      0
xfs_efd_item           0      0    400   40    4 : tunables    0    0    0 : slabdata      0      0      0
xfs_da_state         544    544    480   34    4 : tunables    0    0    0 : slabdata     16     16      0
xfs_btree_cur          0      0    208   39    2 : tunables    0    0    0 : slabdata      0      0      0

# time sync

real    0m0.040s
user    0m0.001s
sys     0m0.003s
# time sync

real    0m0.002s
user    0m0.001s
sys     0m0.002s

[1] -- fs_mark -D 1000 -S4 -n 1000 -d /mnt/0 ... -d /mnt/15 -L 32
- 1TB XFS fs

--- 4.4.0+

FSUse%        Count         Size    Files/sec     App Overhead
     2        16000        51200       3313.3           822514
     2        32000        51200       3353.6           310268
     2        48000        51200       3475.2           289941
     2        64000        51200       3104.6           289993
     2        80000        51200       2944.9           292124
     2        96000        51200       3010.4           288042
     3       112000        51200       2756.4           289761
     3       128000        51200       2753.2           288096
     3       144000        51200       2474.4           290797
     3       160000        51200       2657.9           290898
     3       176000        51200       2498.0           288247
     3       192000        51200       2415.5           287329
     3       208000        51200       2336.1           291113
     3       224000        51200       2352.9           290103
     3       240000        51200       2309.6           289580
     3       256000        51200       2344.3           289828
     3       272000        51200       2293.0           291282
     3       288000        51200       2295.5           286538
     4       304000        51200       2119.0           288906
     4       320000        51200       2059.6           293605
     4       336000        51200       2129.1           289825
     4       352000        51200       1929.8           288186
     4       368000        51200       1987.5           294596
     4       384000        51200       1929.1           293528
     4       400000        51200       1934.8           288138
     4       416000        51200       1823.6           292318
     4       432000        51200       1838.7           290890
     4       448000        51200       1797.5           288816
     4       464000        51200       1823.2           287190
     4       480000        51200       1738.7           295745
     4       496000        51200       1716.4           293821
     5       512000        51200       1726.7           290445

--- 4.4.0+ w/ sync patch

FSUse%        Count         Size    Files/sec     App Overhead
     2        16000        51200       3409.7           999579
     2        32000        51200       3481.3           286877
     2        48000        51200       3447.3           282743
     2        64000        51200       3522.3           283400
     2        80000        51200       3427.0           286360
     2        96000        51200       3360.2           307219
     3       112000        51200       3377.7           286625
     3       128000        51200       3363.7           285929
     3       144000        51200       3345.7           283138
     3       160000        51200       3384.9           291081
     3       176000        51200       3084.1           285265
     3       192000        51200       3388.4           291439
     3       208000        51200       3242.8           286332
     3       224000        51200       3337.9           285006
     3       240000        51200       3442.8           292109
     3       256000        51200       3230.3           283432
     3       272000        51200       3358.3           286996
     3       288000        51200       3309.0           288058
     4       304000        51200       3293.4           284309
     4       320000        51200       3221.4           284476
     4       336000        51200       3241.5           283968
     4       352000        51200       3228.3           284354
     4       368000        51200       3255.7           286072
     4       384000        51200       3094.6           290240
     4       400000        51200       3385.6           288158
     4       416000        51200       3265.2           284387
     4       432000        51200       3315.2           289656
     4       448000        51200       3275.1           284562
     4       464000        51200       3238.4           294976
     4       480000        51200       3060.0           290088
     4       496000        51200       3359.5           286949
     5       512000        51200       3156.2           288126



* Re: [PATCH v6 1/2] sb: add a new writeback list for sync
  2016-01-21 15:22       ` Brian Foster
@ 2016-01-21 16:34         ` Jan Kara
  2016-01-21 17:13           ` Brian Foster
  0 siblings, 1 reply; 10+ messages in thread
From: Jan Kara @ 2016-01-21 16:34 UTC
  To: Brian Foster; +Cc: Dave Chinner, Jan Kara, linux-fsdevel, dchinner, jbacik

On Thu 21-01-16 10:22:57, Brian Foster wrote:
> On Thu, Jan 21, 2016 at 07:11:59AM +1100, Dave Chinner wrote:
> > On Wed, Jan 20, 2016 at 02:26:26PM +0100, Jan Kara wrote:
> > > On Tue 19-01-16 12:59:12, Brian Foster wrote:
> > > > From: Dave Chinner <dchinner@redhat.com>
> > > > 
> > > > wait_sb_inodes() currently does a walk of all inodes in the
> > > > filesystem to find the dirty ones to wait on during sync. This is highly
> > > > inefficient and wastes a lot of CPU when there are lots of clean
> > > > cached inodes that we don't need to wait on.
> > > > 
> > > > To avoid this "all inode" walk, we need to track inodes that are
> > > > currently under writeback that we need to wait for. We do this by
> > > > adding inodes to a writeback list on the sb when the mapping is
> > > > first tagged as having pages under writeback. wait_sb_inodes() can
> > > > then walk this list of "inodes under IO" and wait specifically just
> > > > for the inodes that the current sync(2) needs to wait for.
> > > > 
> > > > Define a couple helpers to add/remove an inode from the writeback
> > > > list and call them when the overall mapping is tagged for or cleared
> > > > from writeback. Update wait_sb_inodes() to walk only the inodes
> > > > under writeback due to the sync.
> > > 
> 
> Hi Jan, Dave,
> 
> > > The patch looks good.  Just one comment: This grows struct inode by two
> > > longs. Such growth should be justified by measuring the improvements, so
> > > can you get some numbers showing how much the patch helped? I think it
> > > would be interesting to see:
> > > 
> 
> Thanks.. indeed, I had run some simple tests that demonstrate the
> effectiveness of the change. I reran them recently against the latest
> version. Some results are appended to this mail.
> 
> Note that I don't have anything at the moment that demonstrates a
> > notable improvement with RCU over the original spinlock approach. I can
> play with that a bit more, but that's not really the crux of the patch.
> 
> > > a) How much sync(2) speed has improved if there's not much to wait for.
> > 
> > Depends on the size of the inode cache when sync is run.  If it's
> > empty it's not noticeable. When you have tens of millions of cached,
> > clean inodes, the inode list traversal can take tens of seconds.
> > This is the sort of problem Josef reported that FB were having...
> > 
> 
> FWIW, Ceph has indicated this is a pain point for them as well. The
> results at [0] below show the difference in sync time with a largely
> populated inode cache before and after this patch.
> 
> > > b) See whether parallel heavy stat(2) load which is rotating lots of inodes
> > > in inode cache sees some improvement when it doesn't have to contend with
> > > sync(2) on s_inode_list_lock. I believe Dave Chinner had some loads where
> > > the contention on s_inode_list_lock due to sync and rotation of inodes was
> > > pretty heavy.
> > 
> > Just my usual fsmark workloads - they have parallel find and
> > parallel ls -lR traversals over the created fileset. Even just
> > running sync during creation (because there are millions of cached
> > inodes, and ~250,000 inodes being instantiated and reclaimed every
> > second) causes lock contention problems....
> > 
> 
> I ran a similar parallel (16x) fs_mark workload using '-S 4,' which
> incorporates a sync() per pass. Without this patch, this demonstrates a
> slow degradation as the inode cache grows. Results at [1].

Thanks for the results. I think it would be good if you incorporated them
in the changelog, since other people will likely be asking similar
questions when they see that struct inode is growing. Other than that,
feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza
> 16xcpu, 32GB RAM x86-64 server
> Storage is LVM volumes on hw raid0.
> 
> [0] -- sync test w/ ~10m clean inode cache
> - 10TB pre-populated XFS fs, cache populated via parallel find/stat
>   workload
> 
> --- 4.4.0+
> 
> # cat /proc/slabinfo | grep xfs
> xfs_dqtrx              0      0    528   62    8 : tunables    0    0    0 : slabdata      0      0      0
> xfs_dquot              0      0    656   49    8 : tunables    0    0    0 : slabdata      0      0      0
> xfs_buf           496293 496893    640   51    8 : tunables    0    0    0 : slabdata   9743   9743      0
> xfs_icr                0      0    144   56    2 : tunables    0    0    0 : slabdata      0      0      0
> xfs_inode         10528071 10529150   1728   18    8 : tunables    0    0    0 : slabdata 584999 584999      0
> xfs_efd_item           0      0    400   40    4 : tunables    0    0    0 : slabdata      0      0      0
> xfs_da_state         544    544    480   34    4 : tunables    0    0    0 : slabdata     16     16      0
> xfs_btree_cur          0      0    208   39    2 : tunables    0    0    0 : slabdata      0      0      0
> 
> # time sync
> 
> real    0m7.322s
> user    0m0.000s
> sys     0m7.314s
> # time sync
> 
> real    0m7.299s
> user    0m0.000s
> sys     0m7.296s
> 
> --- 4.4.0+ w/ sync patch
> 
> # cat /proc/slabinfo | grep xfs
> xfs_dqtrx              0      0    528   62    8 : tunables    0    0    0 : slabdata      0      0      0
> xfs_dquot              0      0    656   49    8 : tunables    0    0    0 : slabdata      0      0      0
> xfs_buf           428214 428514    640   51    8 : tunables    0    0    0 : slabdata   8719   8719      0
> xfs_icr                0      0    144   56    2 : tunables    0    0    0 : slabdata      0      0      0
> xfs_inode         11054375 11054438   1728   18    8 : tunables    0    0    0 : slabdata 721323 721323      0
> xfs_efd_item           0      0    400   40    4 : tunables    0    0    0 : slabdata      0      0      0
> xfs_da_state         544    544    480   34    4 : tunables    0    0    0 : slabdata     16     16      0
> xfs_btree_cur          0      0    208   39    2 : tunables    0    0    0 : slabdata      0      0      0
> 
> # time sync
> 
> real    0m0.040s
> user    0m0.001s
> sys     0m0.003s
> # time sync
> 
> real    0m0.002s
> user    0m0.001s
> sys     0m0.002s
> 
> [1] -- fs_mark -D 1000 -S4 -n 1000 -d /mnt/0 ... -d /mnt/15 -L 32
> - 1TB XFS fs
> 
> --- 4.4.0+
> 
> FSUse%        Count         Size    Files/sec     App Overhead
>      2        16000        51200       3313.3           822514
>      2        32000        51200       3353.6           310268
>      2        48000        51200       3475.2           289941
>      2        64000        51200       3104.6           289993
>      2        80000        51200       2944.9           292124
>      2        96000        51200       3010.4           288042
>      3       112000        51200       2756.4           289761
>      3       128000        51200       2753.2           288096
>      3       144000        51200       2474.4           290797
>      3       160000        51200       2657.9           290898
>      3       176000        51200       2498.0           288247
>      3       192000        51200       2415.5           287329
>      3       208000        51200       2336.1           291113
>      3       224000        51200       2352.9           290103
>      3       240000        51200       2309.6           289580
>      3       256000        51200       2344.3           289828
>      3       272000        51200       2293.0           291282
>      3       288000        51200       2295.5           286538
>      4       304000        51200       2119.0           288906
>      4       320000        51200       2059.6           293605
>      4       336000        51200       2129.1           289825
>      4       352000        51200       1929.8           288186
>      4       368000        51200       1987.5           294596
>      4       384000        51200       1929.1           293528
>      4       400000        51200       1934.8           288138
>      4       416000        51200       1823.6           292318
>      4       432000        51200       1838.7           290890
>      4       448000        51200       1797.5           288816
>      4       464000        51200       1823.2           287190
>      4       480000        51200       1738.7           295745
>      4       496000        51200       1716.4           293821
>      5       512000        51200       1726.7           290445
> 
> --- 4.4.0+ w/ sync patch
> 
> FSUse%        Count         Size    Files/sec     App Overhead
>      2        16000        51200       3409.7           999579
>      2        32000        51200       3481.3           286877
>      2        48000        51200       3447.3           282743
>      2        64000        51200       3522.3           283400
>      2        80000        51200       3427.0           286360
>      2        96000        51200       3360.2           307219
>      3       112000        51200       3377.7           286625
>      3       128000        51200       3363.7           285929
>      3       144000        51200       3345.7           283138
>      3       160000        51200       3384.9           291081
>      3       176000        51200       3084.1           285265
>      3       192000        51200       3388.4           291439
>      3       208000        51200       3242.8           286332
>      3       224000        51200       3337.9           285006
>      3       240000        51200       3442.8           292109
>      3       256000        51200       3230.3           283432
>      3       272000        51200       3358.3           286996
>      3       288000        51200       3309.0           288058
>      4       304000        51200       3293.4           284309
>      4       320000        51200       3221.4           284476
>      4       336000        51200       3241.5           283968
>      4       352000        51200       3228.3           284354
>      4       368000        51200       3255.7           286072
>      4       384000        51200       3094.6           290240
>      4       400000        51200       3385.6           288158
>      4       416000        51200       3265.2           284387
>      4       432000        51200       3315.2           289656
>      4       448000        51200       3275.1           284562
>      4       464000        51200       3238.4           294976
>      4       480000        51200       3060.0           290088
>      4       496000        51200       3359.5           286949
>      5       512000        51200       3156.2           288126
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR


* Re: [PATCH v6 1/2] sb: add a new writeback list for sync
  2016-01-21 16:34         ` Jan Kara
@ 2016-01-21 17:13           ` Brian Foster
  2016-01-21 18:08             ` Josef Bacik
  0 siblings, 1 reply; 10+ messages in thread
From: Brian Foster @ 2016-01-21 17:13 UTC
  To: Jan Kara; +Cc: Dave Chinner, linux-fsdevel, dchinner, jbacik

On Thu, Jan 21, 2016 at 05:34:11PM +0100, Jan Kara wrote:
> On Thu 21-01-16 10:22:57, Brian Foster wrote:
> > On Thu, Jan 21, 2016 at 07:11:59AM +1100, Dave Chinner wrote:
> > > On Wed, Jan 20, 2016 at 02:26:26PM +0100, Jan Kara wrote:
> > > > On Tue 19-01-16 12:59:12, Brian Foster wrote:
> > > > > From: Dave Chinner <dchinner@redhat.com>
> > > > > 
...
> > > > 
> > 
> > Hi Jan, Dave,
> > 
...
> > > > a) See how much sync(2) speed has improved if there's not much to wait for.
> > > 
> > > Depends on the size of the inode cache when sync is run.  If it's
> > > empty it's not noticeable. When you have tens of millions of cached,
> > > clean inodes the inode list traversal can take tens of seconds.
> > > This is the sort of problem Josef reported that FB were having...
> > > 
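
To make the cost concrete: the walk this patch replaces looks roughly
like the following (condensed from current mainline, WARN_ONs and
comments trimmed, so treat it as a sketch rather than a verbatim copy).
Every cached inode on the sb is visited just to find the few under IO:

static void wait_sb_inodes(struct super_block *sb)
{
	struct inode *inode, *old_inode = NULL;

	mutex_lock(&sb->s_sync_lock);
	spin_lock(&sb->s_inode_list_lock);

	/* walks every cached inode on the sb, clean or dirty */
	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
		struct address_space *mapping = inode->i_mapping;

		spin_lock(&inode->i_lock);
		if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
		    (mapping->nrpages == 0)) {
			/* the overwhelmingly common case on a clean fs */
			spin_unlock(&inode->i_lock);
			continue;
		}
		__iget(inode);
		spin_unlock(&inode->i_lock);
		spin_unlock(&sb->s_inode_list_lock);

		iput(old_inode);
		old_inode = inode;

		filemap_fdatawait(mapping);
		cond_resched();

		spin_lock(&sb->s_inode_list_lock);
	}
	spin_unlock(&sb->s_inode_list_lock);
	iput(old_inode);
	mutex_unlock(&sb->s_sync_lock);
}

With ~10m clean cached inodes, that full-list pass under
s_inode_list_lock is where the ~7.3s of sys time in the results below
goes.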
> > 
> > FWIW, Ceph has indicated this is a pain point for them as well. The
> > results at [0] below show the difference in sync time with a largely
> > populated inode cache before and after this patch.
> > 
> > > > b) See whether a parallel heavy stat(2) load that rotates lots of
> > > > inodes through the inode cache sees some improvement when it doesn't
> > > > have to contend with sync(2) on s_inode_list_lock. I believe Dave
> > > > Chinner had some loads where the contention on s_inode_list_lock due
> > > > to sync and rotation of inodes was pretty heavy.
> > > 
> > > Just my usual fsmark workloads - they have parallel find and
> > > parallel ls -lR traversals over the created fileset. Even just
> > > running sync during creation (because there are millions of cached
> > > inodes, and ~250,000 inodes being instantiated and reclaimed every
> > > second) causes lock contention problems....
> > > 
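
And to be clear about where the contention comes from: every one of
those instantiations and reclaims bounces the same per-sb lock that the
sync walk above keeps taking and retaking. From fs/inode.c (lightly
trimmed):

void inode_sb_list_add(struct inode *inode)
{
	spin_lock(&inode->i_sb->s_inode_list_lock);
	list_add(&inode->i_sb_list, &inode->i_sb->s_inodes);
	spin_unlock(&inode->i_sb->s_inode_list_lock);
}

static inline void inode_sb_list_del(struct inode *inode)
{
	if (!list_empty(&inode->i_sb_list)) {
		spin_lock(&inode->i_sb->s_inode_list_lock);
		list_del_init(&inode->i_sb_list);
		spin_unlock(&inode->i_sb->s_inode_list_lock);
	}
}

At ~250,000 adds and removals per second, a concurrent sync(2) holding
s_inode_list_lock across a multi-million entry list walk is exactly the
sort of thing that falls over.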
> > 
> > I ran a similar parallel (16x) fs_mark workload using '-S 4', which
> > incorporates a sync() per pass. Without this patch, this demonstrates a
> > slow degradation as the inode cache grows. Results at [1].
> 
> Thanks for the results. I think it would be good if you incorporated them
> in the changelog since other people will likely be asking similar
> questions when seeing that struct inode is growing. Other than that feel
> free to add:
> 
> Reviewed-by: Jan Kara <jack@suse.cz>
> 

No problem, thanks! Sure, I don't want to dump the raw data into the
commit log description and make it too long, but I can reference the
core sync time impact. I've appended the following for now:

    "With this change, filesystem sync times are significantly reduced for
    fs' with largely populated inode caches and otherwise no other work to
    do. For example, on a 16xcpu 2GHz x86-64 server, 10TB XFS filesystem
    with a ~10m entry inode cache, sync times are reduced from ~7.3s to less
    than 0.1s when the filesystem is fully clean."
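
For reference, the guts of the change look roughly like this (condensed
from the patch; comments, tracepoints and WARN_ONs trimmed, so it's a
sketch rather than the literal diff). An inode goes on the per-sb list
when its mapping first gains the writeback tag, writeback completion
takes it back off, and wait_sb_inodes() only ever walks that list:

void sb_mark_inode_writeback(struct inode *inode)
{
	struct super_block *sb = inode->i_sb;
	unsigned long flags;

	if (list_empty(&inode->i_wb_list)) {
		spin_lock_irqsave(&sb->s_inode_wblist_lock, flags);
		if (list_empty(&inode->i_wb_list))
			list_add_tail(&inode->i_wb_list, &sb->s_inodes_wb);
		spin_unlock_irqrestore(&sb->s_inode_wblist_lock, flags);
	}
}

static void wait_sb_inodes(struct super_block *sb)
{
	LIST_HEAD(sync_list);

	mutex_lock(&sb->s_sync_lock);
	rcu_read_lock();
	spin_lock_irq(&sb->s_inode_wblist_lock);

	/* splice to a private list so we don't wait on new writeback */
	list_splice_init(&sb->s_inodes_wb, &sync_list);

	while (!list_empty(&sync_list)) {
		struct inode *inode = list_first_entry(&sync_list,
						struct inode, i_wb_list);
		struct address_space *mapping = inode->i_mapping;

		/* keep it on the wb list; completion removes it */
		list_move_tail(&inode->i_wb_list, &sb->s_inodes_wb);
		if (!mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
			continue;

		spin_unlock_irq(&sb->s_inode_wblist_lock);

		spin_lock(&inode->i_lock);
		if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) {
			spin_unlock(&inode->i_lock);
			spin_lock_irq(&sb->s_inode_wblist_lock);
			continue;
		}
		__iget(inode);
		spin_unlock(&inode->i_lock);
		rcu_read_unlock();

		filemap_fdatawait_keep_errors(mapping);
		cond_resched();
		iput(inode);

		rcu_read_lock();
		spin_lock_irq(&sb->s_inode_wblist_lock);
	}
	spin_unlock_irq(&sb->s_inode_wblist_lock);
	rcu_read_unlock();
	mutex_unlock(&sb->s_sync_lock);
}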

I'll repost in a day or so if I don't receive any other feedback.

Brian

> 								Honza
> > 16xcpu, 32GB RAM x86-64 server
> > Storage is LVM volumes on hw raid0.
> > 
> > [0] -- sync test w/ ~10m clean inode cache
> > - 10TB pre-populated XFS fs, cache populated via parallel find/stat
> >   workload
> > 
> > --- 4.4.0+
> > 
> > # cat /proc/slabinfo | grep xfs
> > xfs_dqtrx              0      0    528   62    8 : tunables    0    0    0 : slabdata      0      0      0
> > xfs_dquot              0      0    656   49    8 : tunables    0    0    0 : slabdata      0      0      0
> > xfs_buf           496293 496893    640   51    8 : tunables    0    0    0 : slabdata   9743   9743      0
> > xfs_icr                0      0    144   56    2 : tunables    0    0    0 : slabdata      0      0      0
> > xfs_inode         10528071 10529150   1728   18    8 : tunables    0    0    0 : slabdata 584999 584999      0
> > xfs_efd_item           0      0    400   40    4 : tunables    0    0    0 : slabdata      0      0      0
> > xfs_da_state         544    544    480   34    4 : tunables    0    0    0 : slabdata     16     16      0
> > xfs_btree_cur          0      0    208   39    2 : tunables    0    0    0 : slabdata      0      0      0
> > 
> > # time sync
> > 
> > real    0m7.322s
> > user    0m0.000s
> > sys     0m7.314s
> > # time sync
> > 
> > real    0m7.299s
> > user    0m0.000s
> > sys     0m7.296s
> > 
> > --- 4.4.0+ w/ sync patch
> > 
> > # cat /proc/slabinfo | grep xfs
> > xfs_dqtrx              0      0    528   62    8 : tunables    0    0    0 : slabdata      0      0      0
> > xfs_dquot              0      0    656   49    8 : tunables    0    0    0 : slabdata      0      0      0
> > xfs_buf           428214 428514    640   51    8 : tunables    0    0    0 : slabdata   8719   8719      0
> > xfs_icr                0      0    144   56    2 : tunables    0    0    0 : slabdata      0      0      0
> > xfs_inode         11054375 11054438   1728   18    8 : tunables    0    0    0 : slabdata 721323 721323      0
> > xfs_efd_item           0      0    400   40    4 : tunables    0    0    0 : slabdata      0      0      0
> > xfs_da_state         544    544    480   34    4 : tunables    0    0    0 : slabdata     16     16      0
> > xfs_btree_cur          0      0    208   39    2 : tunables    0    0    0 : slabdata      0      0      0
> > 
> > # time sync
> > 
> > real    0m0.040s
> > user    0m0.001s
> > sys     0m0.003s
> > # time sync
> > 
> > real    0m0.002s
> > user    0m0.001s
> > sys     0m0.002s
> > 
> > [1] -- fs_mark -D 1000 -S4 -n 1000 -d /mnt/0 ... -d /mnt/15 -L 32
> > - 1TB XFS fs
> > 
> > --- 4.4.0+
> > 
> > FSUse%        Count         Size    Files/sec     App Overhead
> >      2        16000        51200       3313.3           822514
> >      2        32000        51200       3353.6           310268
> >      2        48000        51200       3475.2           289941
> >      2        64000        51200       3104.6           289993
> >      2        80000        51200       2944.9           292124
> >      2        96000        51200       3010.4           288042
> >      3       112000        51200       2756.4           289761
> >      3       128000        51200       2753.2           288096
> >      3       144000        51200       2474.4           290797
> >      3       160000        51200       2657.9           290898
> >      3       176000        51200       2498.0           288247
> >      3       192000        51200       2415.5           287329
> >      3       208000        51200       2336.1           291113
> >      3       224000        51200       2352.9           290103
> >      3       240000        51200       2309.6           289580
> >      3       256000        51200       2344.3           289828
> >      3       272000        51200       2293.0           291282
> >      3       288000        51200       2295.5           286538
> >      4       304000        51200       2119.0           288906
> >      4       320000        51200       2059.6           293605
> >      4       336000        51200       2129.1           289825
> >      4       352000        51200       1929.8           288186
> >      4       368000        51200       1987.5           294596
> >      4       384000        51200       1929.1           293528
> >      4       400000        51200       1934.8           288138
> >      4       416000        51200       1823.6           292318
> >      4       432000        51200       1838.7           290890
> >      4       448000        51200       1797.5           288816
> >      4       464000        51200       1823.2           287190
> >      4       480000        51200       1738.7           295745
> >      4       496000        51200       1716.4           293821
> >      5       512000        51200       1726.7           290445
> > 
> > --- 4.4.0+ w/ sync patch
> > 
> > FSUse%        Count         Size    Files/sec     App Overhead
> >      2        16000        51200       3409.7           999579
> >      2        32000        51200       3481.3           286877
> >      2        48000        51200       3447.3           282743
> >      2        64000        51200       3522.3           283400
> >      2        80000        51200       3427.0           286360
> >      2        96000        51200       3360.2           307219
> >      3       112000        51200       3377.7           286625
> >      3       128000        51200       3363.7           285929
> >      3       144000        51200       3345.7           283138
> >      3       160000        51200       3384.9           291081
> >      3       176000        51200       3084.1           285265
> >      3       192000        51200       3388.4           291439
> >      3       208000        51200       3242.8           286332
> >      3       224000        51200       3337.9           285006
> >      3       240000        51200       3442.8           292109
> >      3       256000        51200       3230.3           283432
> >      3       272000        51200       3358.3           286996
> >      3       288000        51200       3309.0           288058
> >      4       304000        51200       3293.4           284309
> >      4       320000        51200       3221.4           284476
> >      4       336000        51200       3241.5           283968
> >      4       352000        51200       3228.3           284354
> >      4       368000        51200       3255.7           286072
> >      4       384000        51200       3094.6           290240
> >      4       400000        51200       3385.6           288158
> >      4       416000        51200       3265.2           284387
> >      4       432000        51200       3315.2           289656
> >      4       448000        51200       3275.1           284562
> >      4       464000        51200       3238.4           294976
> >      4       480000        51200       3060.0           290088
> >      4       496000        51200       3359.5           286949
> >      5       512000        51200       3156.2           288126
> > 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v6 1/2] sb: add a new writeback list for sync
  2016-01-21 17:13           ` Brian Foster
@ 2016-01-21 18:08             ` Josef Bacik
  0 siblings, 0 replies; 10+ messages in thread
From: Josef Bacik @ 2016-01-21 18:08 UTC (permalink / raw)
  To: Brian Foster, Jan Kara; +Cc: Dave Chinner, linux-fsdevel, dchinner

On 01/21/2016 12:13 PM, Brian Foster wrote:
> On Thu, Jan 21, 2016 at 05:34:11PM +0100, Jan Kara wrote:
>> On Thu 21-01-16 10:22:57, Brian Foster wrote:
>>> On Thu, Jan 21, 2016 at 07:11:59AM +1100, Dave Chinner wrote:
>>>> On Wed, Jan 20, 2016 at 02:26:26PM +0100, Jan Kara wrote:
>>>>> On Tue 19-01-16 12:59:12, Brian Foster wrote:
>>>>>> From: Dave Chinner <dchinner@redhat.com>
>>>>>>
> ...
>>>>>
>>>
>>> Hi Jan, Dave,
>>>
> ...
>>>>> a) See how much sync(2) speed has improved if there's not much to wait for.
>>>>
>>>> Depends on the size of the inode cache when sync is run.  If it's
>>>> empty it's not noticeable. When you have tens of millions of cached,
>>>> clean inodes the inode list traversal can take tens of seconds.
>>>> This is the sort of problem Josef reported that FB were having...
>>>>
>>>
>>> FWIW, Ceph has indicated this is a pain point for them as well. The
>>> results at [0] below show the difference in sync time with a largely
>>> populated inode cache before and after this patch.
>>>
>>>>> b) See whether a parallel heavy stat(2) load that rotates lots of
>>>>> inodes through the inode cache sees some improvement when it doesn't
>>>>> have to contend with sync(2) on s_inode_list_lock. I believe Dave
>>>>> Chinner had some loads where the contention on s_inode_list_lock due
>>>>> to sync and rotation of inodes was pretty heavy.
>>>>
>>>> Just my usual fsmark workloads - they have parallel find and
>>>> parallel ls -lR traversals over the created fileset. Even just
>>>> running sync during creation (because there are millions of cached
>>>> inodes, and ~250,000 inodes being instantiated and reclaimed every
>>>> second) causes lock contention problems....
>>>>
>>>
>>> I ran a similar parallel (16x) fs_mark workload using '-S 4', which
>>> incorporates a sync() per pass. Without this patch, this demonstrates a
>>> slow degradation as the inode cache grows. Results at [1].
>>
>> Thanks for the results. I think it would be good if you incorporated them
>> in the changelog since other people will likely be asking similar
>> questions when seeing that struct inode is growing. Other than that feel
>> free to add:
>>
>> Reviewed-by: Jan Kara <jack@suse.cz>
>>
>
> No problem, thanks! Sure, I don't want to dump the raw data into the
> commit log description and make it too long, but I can reference the
> core sync time impact. I've appended the following for now:
>
>      "With this change, filesystem sync times are significantly reduced for
>      filesystems with heavily populated inode caches and otherwise no work
>      to do. For example, on a 16xcpu 2GHz x86-64 server, a 10TB XFS
>      filesystem with a ~10m entry inode cache sees sync times reduced from
>      ~7.3s to less than 0.1s when the filesystem is fully clean."
>
> I'll repost in a day or so if I don't receive any other feedback.
>

Sorry I dropped the ball on this, guys; thanks for picking it up, Brian!
I think that changelog is acceptable. Thanks,

Josef


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2016-01-21 18:08 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-19 17:59 [PATCH v6 0/2] improve sync efficiency with sb inode wb list Brian Foster
2016-01-19 17:59 ` [PATCH v6 1/2] sb: add a new writeback list for sync Brian Foster
2016-01-20 13:26   ` Jan Kara
2016-01-20 20:11     ` Dave Chinner
2016-01-21 15:22       ` Brian Foster
2016-01-21 16:34         ` Jan Kara
2016-01-21 17:13           ` Brian Foster
2016-01-21 18:08             ` Josef Bacik
2016-01-19 17:59 ` [PATCH v6 2/2] wb: inode writeback list tracking tracepoints Brian Foster
2016-01-20 13:14   ` Jan Kara
