From: Waiman Long <Waiman.Long@hpe.com>
To: Tejun Heo <tj@kernel.org>, Christoph Lameter <cl@linux-foundation.org>, Dave Chinner <dchinner@redhat.com>
Cc: xfs@oss.sgi.com, linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>, Peter Zijlstra <peterz@infradead.org>, Scott J Norton <scott.norton@hp.com>, Douglas Hatch <doug.hatch@hp.com>, Waiman Long <Waiman.Long@hpe.com>
Subject: [RFC PATCH 2/2] xfs: Allow degeneration of m_fdblocks/m_ifree to global counters
Date: Fri, 4 Mar 2016 21:51:39 -0500
Message-ID: <1457146299-1601-3-git-send-email-Waiman.Long@hpe.com>
In-Reply-To: <1457146299-1601-1-git-send-email-Waiman.Long@hpe.com>

Small XFS filesystems on systems with a large number of CPUs can incur
significant overhead due to excessive calls to the percpu_counter_sum()
function, which needs to walk through a large number of different
cachelines.

This patch uses the newly added percpu_counter_set_limit() API to
potentially switch the m_fdblocks and m_ifree per-cpu counters to a
global counter with locks at filesystem mount time if the filesystem
is small relative to the number of CPUs available.

A possible use case is the use of NVDIMM as an application scratch
storage area for log files and other small files. Current battery-backed
NVDIMMs are pretty small in size, e.g. 8G per DIMM, so we cannot
create a large filesystem on top of them.

On a 4-socket 80-thread system running a 4.5-rc6 kernel, this patch can
improve the throughput of the AIM7 XFS disk workload by 25%. Before
the patch, the perf profile was:

  18.68%     0.08%  reaim  [k] __percpu_counter_compare
  18.05%     9.11%  reaim  [k] __percpu_counter_sum
   0.37%     0.36%  reaim  [k] __percpu_counter_add

After the patch, the perf profile was:

   0.73%     0.36%  reaim  [k] __percpu_counter_add
   0.27%     0.27%  reaim  [k] __percpu_counter_compare

Signed-off-by: Waiman Long <Waiman.Long@hpe.com>
---
 fs/xfs/xfs_mount.c |    1 -
 fs/xfs/xfs_mount.h |    5 +++++
 fs/xfs/xfs_super.c |    6 ++++++
 3 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index bb753b3..fe74b91 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1163,7 +1163,6 @@ xfs_mod_ifree(
  * a large batch count (1024) to minimise global counter updates except when
  * we get near to ENOSPC and we have to be very accurate with our updates.
  */
-#define XFS_FDBLOCKS_BATCH	1024
 int
 xfs_mod_fdblocks(
 	struct xfs_mount	*mp,
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index b570984..d9520f4 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -206,6 +206,11 @@ typedef struct xfs_mount {
 #define XFS_WSYNC_WRITEIO_LOG	14	/* 16k */
 
 /*
+ * FD blocks batch size for per-cpu compare
+ */
+#define XFS_FDBLOCKS_BATCH	1024
+
+/*
  * Allow large block sizes to be reported to userspace programs if the
  * "largeio" mount option is used.
  *
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 59c9b7b..c0b4f79 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1412,6 +1412,12 @@ xfs_reinit_percpu_counters(
 	percpu_counter_set(&mp->m_icount, mp->m_sb.sb_icount);
 	percpu_counter_set(&mp->m_ifree, mp->m_sb.sb_ifree);
 	percpu_counter_set(&mp->m_fdblocks, mp->m_sb.sb_fdblocks);
+
+	/*
+	 * Use default batch size for m_ifree
+	 */
+	percpu_counter_set_limit(&mp->m_ifree, 0);
+	percpu_counter_set_limit(&mp->m_fdblocks, 4 * XFS_FDBLOCKS_BATCH);
 }
 
 static void
-- 
1.7.1
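To make the cost described in the changelog concrete, here is a minimal single-threaded user-space sketch of the trade-off. It is not the kernel implementation: every model_* name, the struct layout and the slots_walked instrumentation are hypothetical, locking is omitted, and the rule used in model_set_limit() (degenerate when the count is below limit times the CPU count) is only an assumption, since the real semantics of percpu_counter_set_limit() are defined in patch 1/2, which is not shown here. The compare path follows the general shape of __percpu_counter_compare(): use the cheap shared count when it is far from the comparison value, otherwise walk every per-CPU slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_NCPUS 80	/* the 80-thread box from the changelog */

/* Toy stand-in for struct percpu_counter; NOT the kernel layout. */
struct model_counter {
	int64_t count;			/* shared (global) part */
	int32_t pcpu[MODEL_NCPUS];	/* per-CPU deltas not yet folded in */
	int32_t batch;			/* fold a delta once it reaches this */
	bool global_only;		/* hypothetical "degenerated" mode */
	unsigned long slots_walked;	/* per-CPU slots read by exact sums */
};

static void model_init(struct model_counter *c, int64_t value, int32_t batch)
{
	*c = (struct model_counter){ .count = value, .batch = batch };
}

/* ASSUMPTION: go global when the count is below limit * ncpus. */
static void model_set_limit(struct model_counter *c, int64_t limit)
{
	c->global_only = c->count < limit * MODEL_NCPUS;
}

static void model_add(struct model_counter *c, int cpu, int32_t amount)
{
	assert(cpu >= 0 && cpu < MODEL_NCPUS);
	if (c->global_only) {
		c->count += amount;	/* the kernel would take a lock here */
		return;
	}
	int32_t delta = c->pcpu[cpu] + amount;
	if (delta >= c->batch || delta <= -c->batch) {
		c->count += delta;	/* fold the local delta into the global */
		c->pcpu[cpu] = 0;
	} else {
		c->pcpu[cpu] = delta;
	}
}

/* Exact sum: touches one slot (one cacheline in the kernel) per CPU. */
static int64_t model_sum(struct model_counter *c)
{
	int64_t sum = c->count;
	for (int cpu = 0; cpu < MODEL_NCPUS; cpu++) {
		sum += c->pcpu[cpu];
		c->slots_walked++;
	}
	return sum;
}

/* Returns 1, 0 or -1 as the counter is above, equal to or below rhs. */
static int model_compare(struct model_counter *c, int64_t rhs, int32_t batch)
{
	int64_t count = c->count;
	/* Near rhs the shared count is too coarse, so pay for a full sum. */
	if (!c->global_only &&
	    llabs(count - rhs) <= (int64_t)batch * MODEL_NCPUS)
		count = model_sum(c);
	return (count > rhs) - (count < rhs);
}
```

In this model, a counter whose value stays within batch * MODEL_NCPUS of the comparison value pays for a full walk on every compare, which corresponds to the __percpu_counter_sum time in the "before" profile. Once global_only is set, the shared count is always exact and the walk never happens.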