* [PATCH 0/3] ext4: Improve parallel I/O performance on NVDIMM
From: Waiman Long @ 2016-04-02  3:09 UTC
  To: Theodore Ts'o, Andreas Dilger, Tejun Heo, Christoph Lameter
  Cc: linux-ext4, linux-kernel, Scott J Norton, Douglas Hatch,
	Toshimitsu Kani, Waiman Long

This patchset aims to improve parallel I/O performance of the ext4
filesystem on fast storage devices like NVDIMM.

Patch 1 eliminates duplicated inode_dio_begin()/inode_dio_end() calls.

Patch 2 provides a set of simple percpu statistics count helper functions.

Patch 3 converts some ext4 statistics counts into percpu counts using
the helper functions.

Waiman Long (3):
  ext4: Pass in DIO_SKIP_DIO_COUNT flag if inode_dio_begin() called
  percpu_stats: Simple per-cpu statistics count helper functions
  ext4: Make cache hits/misses per-cpu counts

 fs/ext4/extents_status.c     |   20 +++++---
 fs/ext4/extents_status.h     |   11 ++++-
 fs/ext4/indirect.c           |   10 +++-
 fs/ext4/inode.c              |   12 ++++-
 include/linux/percpu_stats.h |  103 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 141 insertions(+), 15 deletions(-)
 create mode 100644 include/linux/percpu_stats.h

* [PATCH 1/3] ext4: Pass in DIO_SKIP_DIO_COUNT flag if inode_dio_begin() called
From: Waiman Long @ 2016-04-02  3:09 UTC
  To: Theodore Ts'o, Andreas Dilger, Tejun Heo, Christoph Lameter
  Cc: linux-ext4, linux-kernel, Scott J Norton, Douglas Hatch,
	Toshimitsu Kani, Waiman Long

When performing direct I/O, the current ext4 code does
not pass in the DIO_SKIP_DIO_COUNT flag to dax_do_io() or
__blockdev_direct_IO() when inode_dio_begin() has, in fact, been
called. This causes dax_do_io()/__blockdev_direct_IO() to invoke
inode_dio_begin()/inode_dio_end() internally.  This doubling of
inode_dio_begin()/inode_dio_end() calls is wasteful.

This patch removes the extra internal inode_dio_begin()/inode_dio_end()
calls when those calls are being issued by the caller directly. For
really fast storage systems like NVDIMM, the removal of the extra
inode_dio_begin()/inode_dio_end() can give a meaningful boost to
I/O performance.
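
The resulting call pattern is roughly the following (an illustrative
sketch only; the actual changes are in the diff below):

  inode_dio_begin(inode);	/* caller takes the DIO reference */
  ret = dax_do_io(iocb, inode, iter, offset, ext4_dio_get_block,
		  NULL, DIO_SKIP_DIO_COUNT); /* don't take it again */
  inode_dio_end(inode);		/* caller drops the reference */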

On a 4-socket Haswell-EX system (72 cores) running a 4.6-rc1 kernel,
fio with 38 threads doing parallel I/O on two shared files on an
NVDIMM with DAX gave the following aggregate bandwidth with and
without the patch:

  Test          W/O patch       With patch      % change
  ----          ---------       ----------      --------
  Read-only      8688MB/s       10173MB/s        +17.1%
  Read-write     2687MB/s        2830MB/s         +5.3%

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 fs/ext4/indirect.c |   10 ++++++++--
 fs/ext4/inode.c    |   12 +++++++++---
 2 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index 3027fa6..4304be6 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -706,14 +706,20 @@ retry:
 			inode_dio_end(inode);
 			goto locked;
 		}
+		/*
+		 * Need to pass in DIO_SKIP_DIO_COUNT to prevent
+		 * duplicated inode_dio_begin/inode_dio_end sequence.
+		 */
 		if (IS_DAX(inode))
 			ret = dax_do_io(iocb, inode, iter, offset,
-					ext4_dio_get_block, NULL, 0);
+					ext4_dio_get_block, NULL,
+					DIO_SKIP_DIO_COUNT);
 		else
 			ret = __blockdev_direct_IO(iocb, inode,
 						   inode->i_sb->s_bdev, iter,
 						   offset, ext4_dio_get_block,
-						   NULL, NULL, 0);
+						   NULL, NULL,
+						   DIO_SKIP_DIO_COUNT);
 		inode_dio_end(inode);
 	} else {
 locked:
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index dab84a2..779aa33 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3358,9 +3358,15 @@ static ssize_t ext4_ext_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 	 * Make all waiters for direct IO properly wait also for extent
 	 * conversion. This also disallows race between truncate() and
 	 * overwrite DIO as i_dio_count needs to be incremented under i_mutex.
+	 *
+	 * Both dax_do_io() and __blockdev_direct_IO() will unnecessarily
+	 * call inode_dio_begin()/inode_dio_end() again if the
+	 * DIO_SKIP_DIO_COUNT flag is not set.
 	 */
-	if (iov_iter_rw(iter) == WRITE)
+	if (iov_iter_rw(iter) == WRITE) {
+		dio_flags = DIO_SKIP_DIO_COUNT;
 		inode_dio_begin(inode);
+	}
 
 	/* If we do a overwrite dio, i_mutex locking can be released */
 	overwrite = *((int *)iocb->private);
@@ -3393,10 +3399,10 @@ static ssize_t ext4_ext_direct_IO(struct kiocb *iocb, struct iov_iter *iter,
 		get_block_func = ext4_dio_get_block_overwrite;
 	else if (is_sync_kiocb(iocb)) {
 		get_block_func = ext4_dio_get_block_unwritten_sync;
-		dio_flags = DIO_LOCKING;
+		dio_flags |= DIO_LOCKING;
 	} else {
 		get_block_func = ext4_dio_get_block_unwritten_async;
-		dio_flags = DIO_LOCKING;
+		dio_flags |= DIO_LOCKING;
 	}
 #ifdef CONFIG_EXT4_FS_ENCRYPTION
 	BUG_ON(ext4_encrypted_inode(inode) && S_ISREG(inode->i_mode));
-- 
1.7.1

* [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Waiman Long @ 2016-04-02  3:09 UTC
  To: Theodore Ts'o, Andreas Dilger, Tejun Heo, Christoph Lameter
  Cc: linux-ext4, linux-kernel, Scott J Norton, Douglas Hatch,
	Toshimitsu Kani, Waiman Long

This patch introduces a set of simple per-cpu statistics count helper
functions that can be used by other kernel subsystems for keeping
track of the number of events that happen. It is per-cpu based to
reduce overhead and improve the accuracy of the counts. Using a full
percpu_counter is usually overkill for such a purpose.

The following APIs are provided:

 - int percpu_stats_init(struct percpu_stats *pcs, int num)
   Initialize the per-cpu statistics counts structure, which should have
   the given number of statistics counts. Return -ENOMEM on error.

 - void percpu_stats_destroy(struct percpu_stats *pcs)
   Free the percpu memory allocated.

 - void percpu_stats_inc(struct percpu_stats *pcs, int stat)
   void percpu_stats_dec(struct percpu_stats *pcs, int stat)
   Increment and decrement the given per-cpu statistics count.

 - unsigned long percpu_stats_sum(struct percpu_stats *pcs, int stat)
   Return the current aggregated sum of the given statistics count.

 - void percpu_stats_reset(struct percpu_stats *pcs)
   Clear all the statistics counts defined in the given percpu_stats
   structure.
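
A minimal usage sketch (the enum, variable, and count names below are
made up for illustration; error handling is elided):

  #include <linux/percpu_stats.h>

  enum foo_stat_types {
	foo_stat_hits,
	foo_stat_misses,
	foo_stat_cnt,	/* number of statistics counts */
  };

  struct percpu_stats foo_stats;

  /* setup */
  err = percpu_stats_init(&foo_stats, foo_stat_cnt);

  /* hot path */
  percpu_stats_inc(&foo_stats, foo_stat_hits);

  /* reporting */
  hits = percpu_stats_sum(&foo_stats, foo_stat_hits);

  /* teardown */
  percpu_stats_destroy(&foo_stats);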

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 include/linux/percpu_stats.h |  103 ++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 103 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/percpu_stats.h

diff --git a/include/linux/percpu_stats.h b/include/linux/percpu_stats.h
new file mode 100644
index 0000000..a4f715e
--- /dev/null
+++ b/include/linux/percpu_stats.h
@@ -0,0 +1,103 @@
+#ifndef _LINUX_PERCPU_STATS_H
+#define _LINUX_PERCPU_STATS_H
+/*
+ * Simple per-cpu statistics counts that have less overhead than the
+ * per-cpu counters.
+ */
+#include <linux/percpu.h>
+#include <linux/types.h>
+
+struct percpu_stats {
+	unsigned long __percpu *stats;
+	int nstats;	/* Number of statistics counts in stats array */
+};
+
+/*
+ * Reset all the statistics counts to 0 in the percpu_stats structure
+ */
+static inline void percpu_stats_reset(struct percpu_stats *pcs)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		unsigned long *pstats =  per_cpu_ptr(pcs->stats, cpu);
+		int stat;
+
+		for (stat = 0; stat < pcs->nstats; stat++, pstats++)
+			*pstats = 0;
+	}
+
+	/*
+	 * If a statistics count is in the middle of being updated, it
+	 * is possible that the above clearing may not work. So we need
+	 * to double check again to make sure that the counters are really
+	 * cleared. Still there is a very small chance that the
+	 * second clearing does not work.
+	 */
+	for_each_possible_cpu(cpu) {
+		unsigned long *pstats =  per_cpu_ptr(pcs->stats, cpu);
+		int stat;
+
+		for (stat = 0; stat < pcs->nstats; stat++, pstats++)
+			if (*pstats)
+				*pstats = 0;
+	}
+}
+
+static inline int percpu_stats_init(struct percpu_stats *pcs, int num)
+{
+	pcs->nstats = num;
+	pcs->stats  = __alloc_percpu(sizeof(unsigned long) * num,
+				     __alignof__(unsigned long));
+	if (!pcs->stats)
+		return -ENOMEM;
+
+	percpu_stats_reset(pcs);
+	return 0;
+}
+
+static inline void percpu_stats_destroy(struct percpu_stats *pcs)
+{
+	free_percpu(pcs->stats);
+	pcs->stats  = NULL;
+	pcs->nstats = 0;
+}
+
+static inline void
+__percpu_stats_add(struct percpu_stats *pcs, int stat, int cnt)
+{
+	unsigned long *pstat;
+
+	if ((unsigned int)stat >= pcs->nstats)
+		return;
+	preempt_disable();
+	pstat = this_cpu_ptr(&pcs->stats[stat]);
+	*pstat += cnt;
+	preempt_enable();
+}
+
+static inline void percpu_stats_inc(struct percpu_stats *pcs, int stat)
+{
+	__percpu_stats_add(pcs, stat, 1);
+}
+
+static inline void percpu_stats_dec(struct percpu_stats *pcs, int stat)
+{
+	__percpu_stats_add(pcs, stat, -1);
+}
+
+static inline unsigned long
+percpu_stats_sum(struct percpu_stats *pcs, int stat)
+{
+	int cpu;
+	unsigned long sum = 0;
+
+	if ((unsigned int)stat >= pcs->nstats)
+		return sum;
+
+	for_each_possible_cpu(cpu)
+		sum += per_cpu(pcs->stats[stat], cpu);
+	return sum;
+}
+
+#endif /* _LINUX_PERCPU_STATS_H */
-- 
1.7.1

* [PATCH 3/3] ext4: Make cache hits/misses per-cpu counts
From: Waiman Long @ 2016-04-02  3:09 UTC
  To: Theodore Ts'o, Andreas Dilger, Tejun Heo, Christoph Lameter
  Cc: linux-ext4, linux-kernel, Scott J Norton, Douglas Hatch,
	Toshimitsu Kani, Waiman Long

This patch changes the es_stats_cache_hits and es_stats_cache_misses
statistics counts to per-cpu variables to reduce cacheline contention
issues when multiple threads are trying to update those counts
simultaneously. It uses the new per-cpu stats APIs provided by the
percpu_stats.h header file.

With a 38-thread fio I/O test with 2 shared files (on a DAX-mounted
NVDIMM) running on a 4-socket Haswell-EX server with a 4.6-rc1 kernel,
the aggregated bandwidths before and after the patch were:

  Test          W/O patch       With patch      % change
  ----          ---------       ----------      --------
  Read-only     10173MB/s       16141MB/s        +58.7%
  Read-write     2830MB/s        4315MB/s        +52.5%

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 fs/ext4/extents_status.c |   20 ++++++++++++--------
 fs/ext4/extents_status.h |   11 +++++++++--
 2 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index e38b987..01f8436 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -825,9 +825,9 @@ out:
 		es->es_pblk = es1->es_pblk;
 		if (!ext4_es_is_referenced(es1))
 			ext4_es_set_referenced(es1);
-		stats->es_stats_cache_hits++;
+		percpu_stats_inc(&stats->es_stats, es_stats_cache_hits);
 	} else {
-		stats->es_stats_cache_misses++;
+		percpu_stats_inc(&stats->es_stats, es_stats_cache_misses);
 	}
 
 	read_unlock(&EXT4_I(inode)->i_es_lock);
@@ -1114,8 +1114,8 @@ int ext4_seq_es_shrinker_info_show(struct seq_file *seq, void *v)
 		   percpu_counter_sum_positive(&es_stats->es_stats_all_cnt),
 		   percpu_counter_sum_positive(&es_stats->es_stats_shk_cnt));
 	seq_printf(seq, "  %lu/%lu cache hits/misses\n",
-		   es_stats->es_stats_cache_hits,
-		   es_stats->es_stats_cache_misses);
+		   percpu_stats_sum(&es_stats->es_stats, es_stats_cache_hits),
+		   percpu_stats_sum(&es_stats->es_stats, es_stats_cache_misses));
 	if (inode_cnt)
 		seq_printf(seq, "  %d inodes on list\n", inode_cnt);
 
@@ -1142,8 +1142,6 @@ int ext4_es_register_shrinker(struct ext4_sb_info *sbi)
 	sbi->s_es_nr_inode = 0;
 	spin_lock_init(&sbi->s_es_lock);
 	sbi->s_es_stats.es_stats_shrunk = 0;
-	sbi->s_es_stats.es_stats_cache_hits = 0;
-	sbi->s_es_stats.es_stats_cache_misses = 0;
 	sbi->s_es_stats.es_stats_scan_time = 0;
 	sbi->s_es_stats.es_stats_max_scan_time = 0;
 	err = percpu_counter_init(&sbi->s_es_stats.es_stats_all_cnt, 0, GFP_KERNEL);
@@ -1153,15 +1151,20 @@ int ext4_es_register_shrinker(struct ext4_sb_info *sbi)
 	if (err)
 		goto err1;
 
+	err = percpu_stats_init(&sbi->s_es_stats.es_stats, es_stats_cnt);
+	if (err)
+		goto err2;
+
 	sbi->s_es_shrinker.scan_objects = ext4_es_scan;
 	sbi->s_es_shrinker.count_objects = ext4_es_count;
 	sbi->s_es_shrinker.seeks = DEFAULT_SEEKS;
 	err = register_shrinker(&sbi->s_es_shrinker);
 	if (err)
-		goto err2;
+		goto err3;
 
 	return 0;
-
+err3:
+	percpu_stats_destroy(&sbi->s_es_stats.es_stats);
 err2:
 	percpu_counter_destroy(&sbi->s_es_stats.es_stats_shk_cnt);
 err1:
@@ -1173,6 +1176,7 @@ void ext4_es_unregister_shrinker(struct ext4_sb_info *sbi)
 {
 	percpu_counter_destroy(&sbi->s_es_stats.es_stats_all_cnt);
 	percpu_counter_destroy(&sbi->s_es_stats.es_stats_shk_cnt);
+	percpu_stats_destroy(&sbi->s_es_stats.es_stats);
 	unregister_shrinker(&sbi->s_es_shrinker);
 }
 
diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index f7aa24f..c163189 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -11,6 +11,8 @@
 #ifndef _EXT4_EXTENTS_STATUS_H
 #define _EXT4_EXTENTS_STATUS_H
 
+#include <linux/percpu_stats.h>
+
 /*
  * Turn on ES_DEBUG__ to get lots of info about extent status operations.
  */
@@ -67,10 +69,15 @@ struct ext4_es_tree {
 	struct extent_status *cache_es;	/* recently accessed extent */
 };
 
+enum ext4_es_stat_type {
+	es_stats_cache_hits,
+	es_stats_cache_misses,
+	es_stats_cnt,
+};
+
 struct ext4_es_stats {
 	unsigned long es_stats_shrunk;
-	unsigned long es_stats_cache_hits;
-	unsigned long es_stats_cache_misses;
+	struct percpu_stats es_stats;
 	u64 es_stats_scan_time;
 	u64 es_stats_max_scan_time;
 	struct percpu_counter es_stats_all_cnt;
-- 
1.7.1

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Nikolay Borisov @ 2016-04-04  7:36 UTC
  To: Waiman Long, Theodore Ts'o, Andreas Dilger, Tejun Heo,
	Christoph Lameter
  Cc: linux-ext4, linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani



On 04/02/2016 06:09 AM, Waiman Long wrote:
> This patch introduces a set of simple per-cpu statistics count helper
> functions that can be used by other kernel subsystems for keeping
> track of the number of events that happen. It is per-cpu based to
> reduce overhead and improve the accuracy of the counts. Using a full
> percpu_counter is usually overkill for such a purpose.
> 
> The following APIs are provided:
> 
>  - int percpu_stats_init(struct percpu_stats *pcs, int num)
>    Initialize the per-cpu statistics counts structure, which should have
>    the given number of statistics counts. Return -ENOMEM on error.
> 
>  - void percpu_stats_destroy(struct percpu_stats *pcs)
>    Free the percpu memory allocated.
> 
>  - void percpu_stats_inc(struct percpu_stats *pcs, int stat)
>    void percpu_stats_dec(struct percpu_stats *pcs, int stat)
>    Increment and decrement the given per-cpu statistics count.
> 
>  - unsigned long percpu_stats_sum(struct percpu_stats *pcs, int stat)
>    Return the current aggregated sum of the given statistics count.
> 
>  - void percpu_stats_reset(struct percpu_stats *pcs)
>    Clear all the statistics counts defined in the given percpu_stats
>    structure.
> 
> Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
> ---
>  include/linux/percpu_stats.h |  103 ++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 103 insertions(+), 0 deletions(-)
>  create mode 100644 include/linux/percpu_stats.h

Just one minor nit below.
[..]
> +static inline void
> +__percpu_stats_add(struct percpu_stats *pcs, int stat, int cnt)
> +{
> +	unsigned long *pstat;
> +
> +	if ((unsigned int)stat >= pcs->nstats)
> +		return;
> +	preempt_disable();
> +	pstat = this_cpu_ptr(&pcs->stats[stat]);
> +	*pstat += cnt;
> +	preempt_enable();
> +}

pstat = get_cpu_ptr(&pcs->stats[stat]);
*pstat += cnt;
put_cpu_ptr(&pcs->stats[stat]);

It will generate identical code but this one uses APIs, making the
intention clearer. But as I said this is just a minor nit.

you can add my Reviewed-by: Nikolay Borisov <kernel@kyup.com> for this
particular patch.

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Tejun Heo @ 2016-04-04 16:02 UTC
  To: Waiman Long
  Cc: Theodore Ts'o, Andreas Dilger, Christoph Lameter, linux-ext4,
	linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani

Hello,

On Fri, Apr 01, 2016 at 11:09:37PM -0400, Waiman Long wrote:
...
> +struct percpu_stats {
> +	unsigned long __percpu *stats;

I'm not sure ulong is the best choice here.  Atomic reads on 32bit are
nice but people often need 64bit counters for stats.  It probably is a
better idea to use u64_stats_sync.
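
For reference, the usual u64_stats_sync pattern looks something like
the following (a rough sketch with made-up names):

  struct foo_stats {
	u64 count;
	struct u64_stats_sync syncp;
  };

  /* update side, per-cpu */
  u64_stats_update_begin(&stats->syncp);
  stats->count++;
  u64_stats_update_end(&stats->syncp);

  /* read side */
  unsigned int start;
  u64 count;

  do {
	start = u64_stats_fetch_begin(&stats->syncp);
	count = stats->count;
  } while (u64_stats_fetch_retry(&stats->syncp, start));

The seqcount is compiled out on 64bit where u64 reads are atomic
anyway.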

> +/*
> + * Reset all the statistics counts to 0 in the percpu_stats structure

Proper function description please.

> + */
> +static inline void percpu_stats_reset(struct percpu_stats *pcs)

Why is this function inline?

> +{
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		unsigned long *pstats =  per_cpu_ptr(pcs->stats, cpu);
                                       ^^
> +		int stat;
> +
> +		for (stat = 0; stat < pcs->nstats; stat++, pstats++)
> +			*pstats = 0;
> +	}
> +
> +	/*
> +	 * If a statistics count is in the middle of being updated, it
> +	 * is possible that the above clearing may not work. So we need
> +	 * to double check again to make sure that the counters are really
> +	 * cleared. Still there is a very small chance that the
> +	 * second clearing does not work.
> +	 */
> +	for_each_possible_cpu(cpu) {
> +		unsigned long *pstats =  per_cpu_ptr(pcs->stats, cpu);
> +		int stat;
> +
> +		for (stat = 0; stat < pcs->nstats; stat++, pstats++)
> +			if (*pstats)
> +				*pstats = 0;
> +	}

I don't think this is acceptable.

> +}
> +
> +static inline int percpu_stats_init(struct percpu_stats *pcs, int num)
> +{
> +	pcs->nstats = num;
> +	pcs->stats  = __alloc_percpu(sizeof(unsigned long) * num,
> +				     __alignof__(unsigned long));
> +	if (!pcs->stats)
> +		return -ENOMEM;
> +
> +	percpu_stats_reset(pcs);
> +	return 0;
> +}
> +
> +static inline void percpu_stats_destroy(struct percpu_stats *pcs)
> +{
> +	free_percpu(pcs->stats);
> +	pcs->stats  = NULL;
> +	pcs->nstats = 0;
> +}

Why inline the above functions?

> +static inline void
> +__percpu_stats_add(struct percpu_stats *pcs, int stat, int cnt)
> +{
> +	unsigned long *pstat;
> +
> +	if ((unsigned int)stat >= pcs->nstats)
> +		return;

This is a critical bug.  Please don't fail silently.  BUG_ON(),
please.

> +	preempt_disable();
> +	pstat = this_cpu_ptr(&pcs->stats[stat]);
> +	*pstat += cnt;
> +	preempt_enable();
> +}

this_cpu_add() is atomic w.r.t. local operations.

> +static inline unsigned long
> +percpu_stats_sum(struct percpu_stats *pcs, int stat)
> +{
> +	int cpu;
> +	unsigned long sum = 0;
> +
> +	if ((unsigned int)stat >= pcs->nstats)
> +		return sum;

Ditto.

> +	for_each_possible_cpu(cpu)
> +		sum += per_cpu(pcs->stats[stat], cpu);
> +	return sum;
> +}

Thanks.

-- 
tejun

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Waiman Long @ 2016-04-04 17:11 UTC
  To: Nikolay Borisov
  Cc: Theodore Ts'o, Andreas Dilger, Tejun Heo, Christoph Lameter,
	linux-ext4, linux-kernel, Scott J Norton, Douglas Hatch,
	Toshimitsu Kani

On 04/04/2016 03:36 AM, Nikolay Borisov wrote:
>
> On 04/02/2016 06:09 AM, Waiman Long wrote:
>> This patch introduces a set of simple per-cpu statistics count helper
>> functions that can be used by other kernel subsystems for keeping
>> track of the number of events that happen. It is per-cpu based to
>> reduce overhead and improve the accuracy of the counts. Using a full
>> percpu_counter is usually overkill for such a purpose.
>>
>> The following APIs are provided:
>>
>>   - int percpu_stats_init(struct percpu_stats *pcs, int num)
>>     Initialize the per-cpu statistics counts structure, which should have
>>     the given number of statistics counts. Return -ENOMEM on error.
>>
>>   - void percpu_stats_destroy(struct percpu_stats *pcs)
>>     Free the percpu memory allocated.
>>
>>   - void percpu_stats_inc(struct percpu_stats *pcs, int stat)
>>     void percpu_stats_dec(struct percpu_stats *pcs, int stat)
>>     Increment and decrement the given per-cpu statistics count.
>>
>>   - unsigned long percpu_stats_sum(struct percpu_stats *pcs, int stat)
>>     Return the current aggregated sum of the given statistics count.
>>
>>   - void percpu_stats_reset(struct percpu_stats *pcs)
>>     Clear all the statistics counts defined in the given percpu_stats
>>     structure.
>>
>> Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
>> ---
>>   include/linux/percpu_stats.h |  103 ++++++++++++++++++++++++++++++++++++++++++
>>   1 files changed, 103 insertions(+), 0 deletions(-)
>>   create mode 100644 include/linux/percpu_stats.h
> Just one minor nit below.
> [..]
>> +static inline void
>> +__percpu_stats_add(struct percpu_stats *pcs, int stat, int cnt)
>> +{
>> +	unsigned long *pstat;
>> +
>> +	if ((unsigned int)stat >= pcs->nstats)
>> +		return;
>> +	preempt_disable();
>> +	pstat = this_cpu_ptr(&pcs->stats[stat]);
>> +	*pstat += cnt;
>> +	preempt_enable();
>> +}
> pstat = get_cpu_ptr(&pcs->stats[stat]);
> *pstat += cnt;
> put_cpu_ptr(&pcs->stats[stat]);
>
> It will generate identical code but this one uses APIs, making the
> intention clearer. But as I said this is just a minor nit.
>
> you can add my Reviewed-by: Nikolay Borisov <kernel@kyup.com> for this
> particular patch.

Yes, that will certainly make it look nicer. I will update the patch
once I get feedback on my other ext4 patches.

Cheers,
Longman

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Christoph Lameter @ 2016-04-04 19:09 UTC
  To: Waiman Long
  Cc: Nikolay Borisov, Theodore Ts'o, Andreas Dilger, Tejun Heo,
	linux-ext4, linux-kernel, Scott J Norton, Douglas Hatch,
	Toshimitsu Kani

On Mon, 4 Apr 2016, Waiman Long wrote:

> > > +	if ((unsigned int)stat >= pcs->nstats)
> > > +		return;
> > > +	preempt_disable();
> > > +	pstat = this_cpu_ptr(&pcs->stats[stat]);
> > > +	*pstat += cnt;
> > > +	preempt_enable();
> > > +}
> > pstat = get_cpu_ptr(&pcs->stats[stat]);
> > *pstat += cnt;
> > put_cpu_ptr(&pcs->stats[stat]);
> >
> > It will generate identical code but this one uses APIs, making the
> > intention clearer. But as I said this is just a minor nit.
> >
> > you can add my Reviewed-by: Nikolay Borisov <kernel@kyup.com> for this
> > particular patch.
>
> Yes, that will certainly make it look nicer. I will update the patch once I
> get feedback from my other ext4 patches.

Why not

   this_cpu_add(pcs->stats[stat], cnt)

This is a single instruction on x86.

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Waiman Long @ 2016-04-06 19:52 UTC
  To: Tejun Heo
  Cc: Theodore Ts'o, Andreas Dilger, Christoph Lameter, linux-ext4,
	linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani

On 04/04/2016 12:02 PM, Tejun Heo wrote:
> Hello,
>
> On Fri, Apr 01, 2016 at 11:09:37PM -0400, Waiman Long wrote:
> ...
>> +struct percpu_stats {
>> +	unsigned long __percpu *stats;
> I'm not sure ulong is the best choice here.  Atomic reads on 32bit are
> nice but people often need 64bit counters for stats.  It probably is a
> better idea to use u64_stats_sync.
>
>> +/*
>> + * Reset all the statistics counts to 0 in the percpu_stats structure
> Proper function description please.
>
>> + */
>> +static inline void percpu_stats_reset(struct percpu_stats *pcs)
> Why is this function inline?
>
>> +{
>> +	int cpu;
>> +
>> +	for_each_possible_cpu(cpu) {
>> +		unsigned long *pstats =  per_cpu_ptr(pcs->stats, cpu);
>                                         ^^
>> +		int stat;
>> +
>> +		for (stat = 0; stat < pcs->nstats; stat++, pstats++)
>> +			*pstats = 0;
>> +	}
>> +
>> +	/*
>> +	 * If a statistics count is in the middle of being updated, it
>> +	 * is possible that the above clearing may not work. So we need
>> +	 * to double check again to make sure that the counters are really
>> +	 * cleared. Still there is a very small chance that the
>> +	 * second clearing does not work.
>> +	 */
>> +	for_each_possible_cpu(cpu) {
>> +		unsigned long *pstats =  per_cpu_ptr(pcs->stats, cpu);
>> +		int stat;
>> +
>> +		for (stat = 0; stat < pcs->nstats; stat++, pstats++)
>> +			if (*pstats)
>> +				*pstats = 0;
>> +	}
> I don't think this is acceptable.
>
>> +}
>> +
>> +static inline int percpu_stats_init(struct percpu_stats *pcs, int num)
>> +{
>> +	pcs->nstats = num;
>> +	pcs->stats  = __alloc_percpu(sizeof(unsigned long) * num,
>> +				     __alignof__(unsigned long));
>> +	if (!pcs->stats)
>> +		return -ENOMEM;
>> +
>> +	percpu_stats_reset(pcs);
>> +	return 0;
>> +}
>> +
>> +static inline void percpu_stats_destroy(struct percpu_stats *pcs)
>> +{
>> +	free_percpu(pcs->stats);
>> +	pcs->stats  = NULL;
>> +	pcs->nstats = 0;
>> +}
> Why inline the above functions?
>
>> +static inline void
>> +__percpu_stats_add(struct percpu_stats *pcs, int stat, int cnt)
>> +{
>> +	unsigned long *pstat;
>> +
>> +	if ((unsigned int)stat >= pcs->nstats)
>> +		return;
> This is a critical bug.  Please don't fail silently.  BUG_ON(),
> please.

Sure.

>
>> +	preempt_disable();
>> +	pstat = this_cpu_ptr(&pcs->stats[stat]);
>> +	*pstat += cnt;
>> +	preempt_enable();
>> +}
> this_cpu_add() is atomic w.r.t. local operations.

Will use this_cpu_add().

>> +static inline unsigned long
>> +percpu_stats_sum(struct percpu_stats *pcs, int stat)
>> +{
>> +	int cpu;
>> +	unsigned long sum = 0;
>> +
>> +	if ((unsigned int)stat >= pcs->nstats)
>> +		return sum;
> Ditto.
>
>> +	for_each_possible_cpu(cpu)
>> +		sum += per_cpu(pcs->stats[stat], cpu);
>> +	return sum;
>> +}
> Thanks.
>

Cheers,
Longman

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Waiman Long @ 2016-04-06 21:51 UTC
  To: Tejun Heo
  Cc: Theodore Ts'o, Andreas Dilger, Christoph Lameter, linux-ext4,
	linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani

On 04/04/2016 12:02 PM, Tejun Heo wrote:
> Hello,
>
> On Fri, Apr 01, 2016 at 11:09:37PM -0400, Waiman Long wrote:
> ...
>> +struct percpu_stats {
>> +	unsigned long __percpu *stats;
> I'm not sure ulong is the best choice here.  Atomic reads on 32bit are
> nice but people often need 64bit counters for stats.  It probably is a
> better idea to use u64_stats_sync.

Got that, will incorporate 64-bit counter support for 32-bit architecture.

>> +/*
>> + * Reset all the statistics counts to 0 in the percpu_stats structure
> Proper function description please.

Sure. Will do that for all the functions.

>> + */
>> +static inline void percpu_stats_reset(struct percpu_stats *pcs)
> Why is this function inline?

It doesn't need to be inlined, but I need to add a lib/percpu_stats.c
file to hold the function, which I will do in my v2 patch.

>
>> +{
>> +	int cpu;
>> +
>> +	for_each_possible_cpu(cpu) {
>> +		unsigned long *pstats =  per_cpu_ptr(pcs->stats, cpu);
>                                         ^^
>> +		int stat;
>> +
>> +		for (stat = 0; stat < pcs->nstats; stat++, pstats++)
>> +			*pstats = 0;
>> +	}
>> +
>> +	/*
>> +	 * If a statistics count is in the middle of being updated, it
>> +	 * is possible that the above clearing may not work. So we need
>> +	 * to double check again to make sure that the counters are really
>> +	 * cleared. Still there is a very small chance that the
>> +	 * second clearing does not work.
>> +	 */
>> +	for_each_possible_cpu(cpu) {
>> +		unsigned long *pstats =  per_cpu_ptr(pcs->stats, cpu);
>> +		int stat;
>> +
>> +		for (stat = 0; stat < pcs->nstats; stat++, pstats++)
>> +			if (*pstats)
>> +				*pstats = 0;
>> +	}
> I don't think this is acceptable.

I am not sure what you mean here by "not acceptable". Please enlighten
me on that.

>> +}
>> +
>> +static inline int percpu_stats_init(struct percpu_stats *pcs, int num)
>> +{
>> +	pcs->nstats = num;
>> +	pcs->stats  = __alloc_percpu(sizeof(unsigned long) * num,
>> +				     __alignof__(unsigned long));
>> +	if (!pcs->stats)
>> +		return -ENOMEM;
>> +
>> +	percpu_stats_reset(pcs);
>> +	return 0;
>> +}
>> +
>> +static inline void percpu_stats_destroy(struct percpu_stats *pcs)
>> +{
>> +	free_percpu(pcs->stats);
>> +	pcs->stats  = NULL;
>> +	pcs->nstats = 0;
>> +}
> Why inline the above functions?

Will move this function to lib/percpu_stats.c.

>> +static inline void
>> +__percpu_stats_add(struct percpu_stats *pcs, int stat, int cnt)
>> +{
>> +	unsigned long *pstat;
>> +
>> +	if ((unsigned int)stat >= pcs->nstats)
>> +		return;
> This is a critical bug.  Please don't fail silently.  BUG_ON(),
> please.

Sure.

>
>> +	preempt_disable();
>> +	pstat = this_cpu_ptr(&pcs->stats[stat]);
>> +	*pstat += cnt;
>> +	preempt_enable();
>> +}
> this_cpu_add() is atomic w.r.t. local operations.

Will use this_cpu_add().

>> +static inline unsigned long
>> +percpu_stats_sum(struct percpu_stats *pcs, int stat)
>> +{
>> +	int cpu;
>> +	unsigned long sum = 0;
>> +
>> +	if ((unsigned int)stat >= pcs->nstats)
>> +		return sum;
> Ditto.
>
>> +	for_each_possible_cpu(cpu)
>> +		sum += per_cpu(pcs->stats[stat], cpu);
>> +	return sum;
>> +}
> Thanks.
>

Cheers,
Longman

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Waiman Long @ 2016-04-06 21:53 UTC
  To: Christoph Lameter
  Cc: Nikolay Borisov, Theodore Ts'o, Andreas Dilger, Tejun Heo,
	linux-ext4, linux-kernel, Scott J Norton, Douglas Hatch,
	Toshimitsu Kani

On 04/04/2016 03:09 PM, Christoph Lameter wrote:
> On Mon, 4 Apr 2016, Waiman Long wrote:
>
>>>> +	if ((unsigned int)stat >= pcs->nstats)
>>>> +		return;
>>>> +	preempt_disable();
>>>> +	pstat = this_cpu_ptr(&pcs->stats[stat]);
>>>> +	*pstat += cnt;
>>>> +	preempt_enable();
>>>> +}
>>> pstat = get_cpu_ptr(&pcs->stats[stat]);
>>> *pstat += cnt;
>>> put_cpu_ptr(&pcs->stats[stat]);
>>>
>>> It will generate identical code but this one uses APIs, making the
>>> intention clearer. But as I said this is just a minor nit.
>>>
>>> you can add my Reviewed-by: Nikolay Borisov <kernel@kyup.com> for this
>>> particular patch.
>> Yes, that will certainly make it look nicer. I will update the patch once I
>> get feedback from my other ext4 patches.
> Why not
>
>     this_cpu_add(pcs->stats[stat], cnt)
>
> This is a single instruction on x86.
>

Yes, using this_cpu_add() will be even simpler.

Cheers,
Longman

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Tejun Heo @ 2016-04-06 22:54 UTC
  To: Waiman Long
  Cc: Theodore Ts'o, Andreas Dilger, Christoph Lameter, linux-ext4,
	linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani

Hello,

On Wed, Apr 06, 2016 at 05:51:45PM -0400, Waiman Long wrote:
> >>+	/*
> >>+	 * If a statistics count is in the middle of being updated, it
> >>+	 * is possible that the above clearing may not work. So we need
> >>+	 * to double check again to make sure that the counters are really
> >>+	 * cleared. Still there is a still a very small chance that the
> >>+	 * second clearing does not work.
> >>+	 */
> >>+	for_each_possible_cpu(cpu) {
> >>+		unsigned long *pstats =  per_cpu_ptr(pcs->stats, cpu);
> >>+		int stat;
> >>+
> >>+		for (stat = 0; stat < pcs->nstats; stat++, pstats++)
> >>+			if (*pstats)
> >>+				*pstats = 0;
> >>+	}
> >I don't think this is acceptable.
> 
> I am not sure what you mean here by not acceptable. Please enlighten me on
> that.

Hmmm... I thought that was pretty clear.  Try-twice-and-we-are-probably-okay
is simply not acceptable.  Please make it watertight.

Thanks.

-- 
tejun

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Waiman Long @ 2016-04-07 15:58 UTC
  To: Tejun Heo
  Cc: Theodore Ts'o, Andreas Dilger, Christoph Lameter, linux-ext4,
	linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani

On 04/06/2016 06:54 PM, Tejun Heo wrote:
> Hello,
>
> On Wed, Apr 06, 2016 at 05:51:45PM -0400, Waiman Long wrote:
>>>> +	/*
>>>> +	 * If a statistics count is in the middle of being updated, it
>>>> +	 * is possible that the above clearing may not work. So we need
>>>> +	 * to double check again to make sure that the counters are really
>>>> +	 * cleared. Still there is a still a very small chance that the
>>>> +	 * second clearing does not work.
>>>> +	 */
>>>> +	for_each_possible_cpu(cpu) {
>>>> +		unsigned long *pstats =  per_cpu_ptr(pcs->stats, cpu);
>>>> +		int stat;
>>>> +
>>>> +		for (stat = 0; stat < pcs->nstats; stat++, pstats++)
>>>> +			if (*pstats)
>>>> +				*pstats = 0;
>>>> +	}
>>> I don't think this is acceptable.
>> I am not sure what you mean here by not acceptable. Please enlighten me on
>> that.
> Hmmm... I thought that was pretty clear.  Try-twice-and-we-are-probably-okay
> is simply not acceptable.  Please make it watertight.
>
> Thanks.

OK, I got it now.

We can certainly make it watertight. However, that will require adding
performance overhead to the percpu stats update fast path, which I am
not willing to pay.

The purpose of this stat counter reset functionality is to allow
developers to reset the stat counters, run a certain workload, and see
how things are going in the kernel when the workload completes, assuming
that those stat counters are exposed via sysfs, debugfs, etc. The
developers can certainly check the stat counters after the reset to make
sure that they are properly reset. So I don't think we need an airtight
way of doing it. If you have scenarios in mind that require an airtight
reset of the stat counters, please let me know and I will see what I can
do about it.

Cheers,
Longman

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Tejun Heo @ 2016-04-07 16:06 UTC
  To: Waiman Long
  Cc: Theodore Ts'o, Andreas Dilger, Christoph Lameter, linux-ext4,
	linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani

Hello, Waiman.

On Thu, Apr 07, 2016 at 11:58:13AM -0400, Waiman Long wrote:
> We can certainly make it watertight. However, that will certainly require
> adding performance overhead in the percpu stats update fast path which I am
> not willing to pay.

There are multiple options depending on the specific balance you want
to hit.  Reset can be made very heavy (involving an RCU sync operation)
to make the hot path overhead minimal; local locking or atomic ops can
also be used, which, while more expensive than this_cpu_*() ops, still
avoid cacheline bouncing.
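
Very roughly, the heavy-reset variant could look like the following (a
sketch only, with made-up field names, and ignoring how the slow
counter gets folded back in):

  /* update side: one easily-predicted branch */
  rcu_read_lock();
  if (unlikely(READ_ONCE(pcs->in_reset)))
	atomic_long_add(cnt, &pcs->slow_counts[stat]);
  else
	this_cpu_add(pcs->stats[stat], cnt);
  rcu_read_unlock();

  /* reset side */
  WRITE_ONCE(pcs->in_reset, true);
  synchronize_rcu();	/* wait out in-flight fast-path updates */
  /* ... clear the per-cpu counters; no fast-path writers now ... */
  WRITE_ONCE(pcs->in_reset, false);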

> The purpose of this stat counter reset functionality is to allow developers
> to reset the stat counters, run a certain workload, and see how things are
> going in the kernel when the workload completes, assuming that those stat
> counters are exposed via sysfs, debugfs, etc. The developers can certainly
> check the stat counters after the reset to make sure that they are properly
> reset. So I don't think we need an airtight way of doing it. If you have
> scenarios in mind that require an airtight reset of the stat counters,
> please let me know and I will see what I can do about it.

No matter what, don't create something which can yield a completely
surprising result once in a blue moon.  You might think it's okay
because the likelihood is low but that just means that the resulting
malfunctions will be that much more obscure and difficult to
reproduce.

Thanks.

-- 
tejun

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Waiman Long @ 2016-04-07 18:52 UTC
  To: Tejun Heo
  Cc: Theodore Ts'o, Andreas Dilger, Christoph Lameter, linux-ext4,
	linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani

On 04/07/2016 12:06 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> On Thu, Apr 07, 2016 at 11:58:13AM -0400, Waiman Long wrote:
>> We can certainly make it watertight. However, that will require adding
>> performance overhead to the percpu stats update fast path, which I am
>> not willing to pay.
> There are multiple options depending on the specific balance you want
> to hit.  Reset can be made very heavy (involving an RCU sync operation)
> to make the hot path overhead minimal; local locking or atomic ops can
> also be used, which, while more expensive than this_cpu_*() ops, still
> avoid cacheline bouncing.
>
>> The purpose of this stat counter reset functionality is to allow
>> developers to reset the stat counters, run a certain workload, and see
>> how things are going in the kernel when the workload completes,
>> assuming that those stat counters are exposed via sysfs, debugfs, etc.
>> The developers can certainly check the stat counters after the reset
>> to make sure that they are properly reset. So I don't think we need an
>> airtight way of doing it. If you have scenarios in mind that require
>> an airtight reset of the stat counters, please let me know and I will
>> see what I can do about it.
> No matter what, don't create something which can yield a completely
> surprising result once in a blue moon.  You might think it's okay
> because the likelihood is low but that just means that the resulting
> malfunctions will be that much more obscure and difficult to
> reproduce.
>
> Thanks.
>

As long as atomic reset is an optional feature that the caller can
choose at init time, I am OK with providing this functionality. I just
don't want it to be the default because of the performance overhead.

Cheers,
Longman

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Tejun Heo @ 2016-04-07 18:58 UTC
  To: Waiman Long
  Cc: Theodore Ts'o, Andreas Dilger, Christoph Lameter, linux-ext4,
	linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani

Hello, Waiman.

On Thu, Apr 07, 2016 at 02:52:33PM -0400, Waiman Long wrote:
> As long as atomic reset is an optional feature that the caller can
> choose at init time, I am OK with providing this functionality. I just
> don't want it to be the default because of the performance overhead.

Please take a look at how percpu-ref coordinates global
synchronization.  The hot path overhead is one branch, which is
extremely easy to predict and shouldn't show up anywhere.  If you're
gonna provide reset at all (which btw always kinda baffles me, what's
wrong with taking a snapshot value and taking delta from there?), you
need to make it actually work reliably.

Thanks.

-- 
tejun

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Waiman Long @ 2016-04-07 20:37 UTC
  To: Tejun Heo
  Cc: Theodore Ts'o, Andreas Dilger, Christoph Lameter, linux-ext4,
	linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani

On 04/07/2016 02:58 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> On Thu, Apr 07, 2016 at 02:52:33PM -0400, Waiman Long wrote:
>> As long as atomic reset is an optional feature that caller can choose at
>> init time, I am OK to provide this functionality. I just don't want it to be
>> the default because of the performance overhead.
> Please take a look at how percpu-ref coordinates global
> synchronization.  The hot path overhead is one branch which is
> extremely easy to predict and shouldn't show up anywhere.  If you're
> gonna provide reset at all (which btw always kinda baffles me, what's
> wrong with taking a snapshot value and taking delta from there?), you
> need to make it actually work reliably.
>
> Thanks.
>

I would say that because I am lazy, I don't want to compute the deltas
every time I want to see the effect of running a certain type of
workload on the statistics counts. I have a use case where I need to
track 10 or so statistics counts and monitor their changes after
running a job. It is much more convenient to do a reset and see what
you get than to do manual subtractions to find out.

I had taken a look at percpu-refcount.[ch]. I think the synchronization
code is a bit overkill for this purpose, as no one really needs very
precise statistics counts or a precise atomic reset. I would prefer
providing an optional atomic reset feature with a slower statistics
count update path for the time being. If we come across a use case
where we need atomic reset with negligible slowdown, we could then
refactor the code to use something similar to what the percpu-refcount
code is doing.

Cheers,
Longman

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Tejun Heo @ 2016-04-07 20:41 UTC
  To: Waiman Long
  Cc: Theodore Ts'o, Andreas Dilger, Christoph Lameter, linux-ext4,
	linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani

Hello, Waiman.

On Thu, Apr 07, 2016 at 04:37:06PM -0400, Waiman Long wrote:
> I would say that because I am lazy, I don't want to compute the deltas
> every time I want to see the effect of running a certain type of
> workload on the statistics counts. I have a use case where I need to
> track 10 or so statistics counts and monitor their changes after
> running a job. It is much more convenient to do a reset and see what
> you get than to do manual subtractions to find out.

I don't know.  Write a simple script?  Even if you wanna keep it in
the kernel, you can just have a base counter which offsets the summed-up
value on read.
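
Something like the following (rough sketch, names made up):

  /* on "reset", take a snapshot */
  base[stat] = percpu_stats_sum(pcs, stat);

  /* on read, report the count relative to the last snapshot */
  cur = percpu_stats_sum(pcs, stat) - base[stat];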

> I had taken a look at percpu-refcount.[ch]. I think the synchronization
> code is a bit overkill for this purpose, as no one really needs very
> precise statistics counts or a precise atomic reset. I would prefer
> providing an optional atomic reset feature with a slower statistics
> count update path for the time being. If we come across a use case
> where we need atomic reset with negligible slowdown, we could then
> refactor the code to use something similar to what the percpu-refcount
> code is doing.

Please either drop reset or make it actually work; otherwise, I don't
think this should go in.

Thanks.

-- 
tejun

* Re: [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions
From: Waiman Long @ 2016-04-07 21:38 UTC
  To: Tejun Heo
  Cc: Theodore Ts'o, Andreas Dilger, Christoph Lameter, linux-ext4,
	linux-kernel, Scott J Norton, Douglas Hatch, Toshimitsu Kani

On 04/07/2016 04:41 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> On Thu, Apr 07, 2016 at 04:37:06PM -0400, Waiman Long wrote:
>> I would say that because I am lazy, I don't want to compute the deltas
>> every time I want to see the effect of running a certain type of
>> workload on the statistics counts. I have a use case where I need to
>> track 10 or so statistics counts and monitor their changes after
>> running a job. It is much more convenient to do a reset and see what
>> you get than to do manual subtractions to find out.
> I don't know.  Write a simple script?  Even if you wanna keep it in
> the kernel, you can just have a base counter which offsets the summed-up
> value on read.
>
>> I had taken a look at percpu-refcount.[ch]. I think the synchronization
>> code is a bit overkill for this purpose, as no one really needs very
>> precise statistics counts or a precise atomic reset. I would prefer
>> providing an optional atomic reset feature with a slower statistics
>> count update path for the time being. If we come across a use case
>> where we need atomic reset with negligible slowdown, we could then
>> refactor the code to use something similar to what the percpu-refcount
>> code is doing.
> Please either drop reset or make it actually work; otherwise, I don't
> think this should go in.
>
> Thanks.
>

In this case, I think I will drop this reset functionality. It is not 
really needed for this patchset.

Thanks for the feedback!

Cheers,
Longman

Thread overview: 19 messages
2016-04-02  3:09 [PATCH 0/3] ext4: Improve parallel I/O performance on NVDIMM Waiman Long
2016-04-02  3:09 ` [PATCH 1/3] ext4: Pass in DIO_SKIP_DIO_COUNT flag if inode_dio_begin() called Waiman Long
2016-04-02  3:09 ` [PATCH 2/3] percpu_stats: Simple per-cpu statistics count helper functions Waiman Long
2016-04-04  7:36   ` Nikolay Borisov
2016-04-04 17:11     ` Waiman Long
2016-04-04 19:09       ` Christoph Lameter
2016-04-06 21:53         ` Waiman Long
2016-04-04 16:02   ` Tejun Heo
2016-04-06 19:52     ` Waiman Long
2016-04-06 21:51     ` Waiman Long
2016-04-06 22:54       ` Tejun Heo
2016-04-07 15:58         ` Waiman Long
2016-04-07 16:06           ` Tejun Heo
2016-04-07 18:52             ` Waiman Long
2016-04-07 18:58               ` Tejun Heo
2016-04-07 20:37                 ` Waiman Long
2016-04-07 20:41                   ` Tejun Heo
2016-04-07 21:38                     ` Waiman Long
2016-04-02  3:09 ` [PATCH 3/3] ext4: Make cache hits/misses per-cpu counts Waiman Long
